Polemic

Hamish Campbell – Aotearoa New Zealand

Of Suburbs and Open Data Barriers

10 Sep 2015

This is a follow-up to my first Auckland Suburbs map. As I outlined earlier, the only authoritative suburbs dataset for New Zealand is the NZ Localities dataset. The data is maintained by the New Zealand Fire Service Commission (NZFSC) and has been the subject of one of the longest-running OIA disputes in NZ.

In this post I expand a bit more on the background of the data and demonstrate why the NZFSC's licensing approach has proven counter-productive to their aims.

Localities and Suburbs

What is the NZ Localities dataset? The Ombudsman's decision describes it as follows [1]:

The data stored in the dataset describes the boundaries between different ‘localities’, or parts of the entire country. It might be understood as describing the boundaries between suburbs, but this would not be accurate as many ‘localities’ are not suburbs of a town or city, and public perceptions of where one suburb starts and another ends may differ from the boundaries defined in the dataset.

While a locality is not technically a suburb, the dataset certainly comprises suburbs as we would commonly understand them. And indeed, this is how the data is commonly used.

The NZFSC has stated [2] that:

The NZ Localities dataset is available for distribution, and NZ Fire Service promote the use of this dataset as all emergency services are using the data for reference and to support emergency response. The dataset is managed and maintained by a consortium including NZ Fire Service, NZ Post and Quotable Value NZ.

On the face of it, this is superb. It is a localities dataset, produced in partnership with our postal service and local authorities, available for free and ostensibly promoted for further use as an authoritative reference.

Putting up Barriers

Unfortunately, there is a big catch. While free, the data is distributed under a license agreement with the NZFSC as licensor. The license cannot be transferred and can be terminated at any time.

It also prevents the following uses:

  • You may not incorporate the data into a project in such a way that someone can extract the data,
  • You may not modify the data, aside from text formatting and geographic re-projections,
  • You may not sell, lease or otherwise provide the data for commercial gain.

Finally, the license requires that you update your copy of the data (and any copies you've supplied to others who have signed the license) within 6 months of release of an update.

Unfortunately, that makes the data unsuitable for a wide range of projects. Just a few reasons include:

  • Incompatibility with licenses associated with existing projects or products,
  • Unclear who the license is granted to (does every contributor to an open source project have to sign it before they can see the data?),
  • Termination clauses that might be difficult to adhere to,
  • Difficulty in preventing 'reverse engineering' of the data.

Perhaps most significantly, it cannot be incorporated into OpenStreetMap, the same geographic data that is used in Auckland Transport's Journey Planner site (and transport apps) and that also pops up in Apple Maps and elsewhere.

Why are the license terms so restrictive?

As the NZFSC puts it [3]:

In our view there is a very clear risk to public safety where, the caller provides the call taker with address information based on a repackaged, on-sold dataset that contains unauthorised changes or does not include authorised updates.

The argument has some merit. They've spent a lot of time creating an accurate locality dataset to ensure they get where they need to be; lives and property are at risk and it's important to avoid mistakes by maintaining control. This logic ultimately convinced the Ombudsman that the NZFSC had a case to retain the license.

The Ombudsman acknowledged a key objection, that [3]:

..the constraints placed on the use or manipulation of the dataset by the NZFSC licence has the effect of inhibiting more widespread adoption of the locality boundaries. This, in turn, increases the risk that someone may give a call centre an account of their location that differs from the call centre software — increasing the risk of delay in dispatching the correct equipment and people to the right location.

But ultimately decided that:

NZFSC’s willingness to provide the NZ Localities dataset free of charge, to anyone who seeks it and is willing to become a licensee, largely satisfies the public interest in disclosure of this information. In light of the potential for harm arising from out-of-date datasets in the public domain, I do not consider the licence requirement for updating the dataset when requested, to be onerous.

The reality of the situation is that other datasets already exist in the public domain.

Put up barriers and the community will route around you

The argument for retaining the license is predicated on the axiom that the NZFSC has control.

In the absence of facts about the world that can be used freely, the community can and does create its own. The data they produce is a mix of community consensus, guesswork, approximations and bad assumptions. The result is a lot of conflicting locality information. The Nominatim OpenStreetMap address finder includes non-existent [4] localities such as Rosebank and Balmoral in Auckland. Mapzen's world gazetteer project will use open sources like GeoNames.

Quattroshapes Neighbourhoods of Auckland — Data CC-BY Foursquare / LINZ, Map CC-BY Hamish Campbell

Another example (displayed above) is the Quattroshapes gazetteer dataset created by Foursquare, which its authors describe as follows:

To improve recommendations, we have created an authoritative source of polygons around a curated list of places. This gazetteer of non-overlapping polygons provides more relevant results than simple point geometries.

A quick inspection reveals omissions (Kelston), additions (Balmoral) and significant errors (no Rosebank, but the area is split between 4 incorrect locations). In the absence of authoritative open data, this is what is often used to create real consumer products.
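
The name-level part of that inspection can be sketched in a few lines of Python. This is only an illustration, not the method used for the map above: the function name compare_localities and the sample lists are hypothetical, with "Placeholder Locality" standing in for any name both sources share. Kelston and Balmoral are taken from the findings above.

```python
def _index(names):
    """Map a case-insensitive key to the trimmed original name."""
    return {name.strip().casefold(): name.strip() for name in names}


def compare_localities(authoritative, candidate):
    """Compare two lists of locality names.

    Returns (omissions, additions):
      omissions - names in the authoritative list missing from the candidate
      additions - names in the candidate that the authoritative list lacks
    Only names are compared; boundary geometry is not checked.
    """
    auth = _index(authoritative)
    cand = _index(candidate)
    omissions = sorted(auth[key] for key in auth.keys() - cand.keys())
    additions = sorted(cand[key] for key in cand.keys() - auth.keys())
    return omissions, additions


# Hypothetical sample data, for illustration only.
authoritative_names = ["Kelston", "Placeholder Locality"]
candidate_names = ["Balmoral", " placeholder locality "]

omissions, additions = compare_localities(authoritative_names, candidate_names)
print("Omissions:", omissions)
print("Additions:", additions)
```

A check like this catches missing and extra names only. Errors such as the Rosebank area being split between other localities need a comparison of the boundary polygons themselves.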

The net result of restrictive licensing is to promote the use of inaccurate, non-authoritative and poor quality alternatives.

I believe the NZFSC is genuinely concerned with ensuring that up-to-date information is used in critical applications. The only way to achieve that aim is to release the data under an open license.

The Ombudsman's Decision

The original OIA request went all the way to the Ombudsman. More than seven years after that request, the Ombudsman ruled in favour of the NZFSC's position.

This decision is a reminder that the Official Information Act is not a replacement for Open Data or the NZGOAL framework in New Zealand.

There was a glimmer of hope in the decision. Part 98 states:

In 2011 the NZFSC gave a public undertaking to review the licence agreement against the criteria of the NZGOAL framework. NZFS has now confirmed, by letter of 24 July 2014, that:

“Since 2013, NZFS has been working with the NZ Geospatial Office (NZGO) and LINZ to establish and mandate the NZ Localities dataset as a fundamental dataset for NZ under the NZ Geospatial Strategy and NZGOAL framework. NZFS is also working with NZGO and LINZ to establish the NZFS Commission as the authoritative custodian of the NZ Localities dataset. ... Prior to the end of 2014, and subsequent to establishing NZFS as custodian, NZFS will release the NZ Localities dataset under a new Creative Commons licence (CC variation yet to be defined).”

It's disappointing that nearing the end of 2015 we still haven't seen the data released.

Moving forward

The final part of this series will:

  • Provide a methodology for generating the authoritative boundaries from open data sources, and
  • Outline some of the things that are happening to make an official localities dataset a reality.

Footnotes

1. Request for the New Zealand Localities dataset - Ombudsman's Decision - Page 2
2. ibid. Page 3
3. ibid. Page 8
4. Non-existent in that they don't exist in the NZFSC data, but are commonly known.