Local authority transport: how to publish your data
Explains the standards and practices you should use when publishing your data.
Applying the right data standards
Data standards
A data standard is a common way to format, define and present data. Data standards add value to your data by making sure the way it is formatted is consistent. Standards allow compatibility between data sets, ensuring users can understand your data sets and aggregate data as required.
In most data standards there is a defined format, structure and relationship between different data elements. Typically, this means how specific data elements should be coded, for example:
- one of a permitted set of values - for example, North, South
- a data value with a particular range, with a defined unit of measurement - for example, days or tonnes
- a template for data entry or required resolution - for example, monetary amounts in pounds to 2 decimal places, like £23,500.50
In many cases data standards define:
- which data elements are always required (mandatory)
- which are conditional and what the nature of the condition is - for example, if another data type is used then they are needed
- data that is optional
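As an illustration, the kinds of rule described above can be expressed as simple checks on a single record. This is a minimal sketch only: the field names (`direction`, `duration_days`, `funded`, `cost`), the permitted values and the 0 to 365 range are hypothetical and do not come from any real data standard.

```python
import re

# Hypothetical rules for an illustrative record; not taken from any real standard.
PERMITTED_DIRECTIONS = {"North", "South", "East", "West"}
AMOUNT_PATTERN = re.compile(r"^£\d{1,3}(,\d{3})*\.\d{2}$")  # for example £23,500.50


def validate_record(record: dict) -> list:
    """Return a list of problems found in one record (empty if none)."""
    problems = []

    # Mandatory element: one of a permitted set of values.
    if record.get("direction") not in PERMITTED_DIRECTIONS:
        problems.append("direction must be one of the permitted values")

    # Mandatory element: a value within a range, in a defined unit (days).
    duration = record.get("duration_days")
    if not isinstance(duration, int) or not 0 <= duration <= 365:
        problems.append("duration_days must be a whole number from 0 to 365")

    # Conditional element: cost is required only when funded is true.
    if record.get("funded") and "cost" not in record:
        problems.append("cost is required when funded is true")

    # Template check: monetary amounts in pounds to 2 decimal places.
    if "cost" in record and not AMOUNT_PATTERN.match(record["cost"]):
        problems.append("cost must look like £23,500.50")

    # Optional elements, such as a free-text note, are not checked.
    return problems
```

A record that passes every check returns an empty list, so the same function can be run over a whole data set before publication.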
Most data standards are focused on data that is being exchanged between one application and another. Many system developers may choose to import and export data in a standardised format. Some systems may support several standardised formats.
When exchanging data in a standardised format, the standard normally dictates the specific form to code the structured data in. Some well-known examples include comma-separated values (CSV) format, extensible markup language (XML) format and JavaScript object notation (JSON) format. Many formal data standards use these generic data formats and then further constrain them to create a unique, clearly-defined structure for the standardised data.
For internet-oriented formats such as XML and JSON, there are ways to define ‘schemas’ that define these more constrained data structures. You can use software tools to check that your data meets some or all of the requirements of the data standard.
Choosing and applying a data standard
If an appropriate data standard exists, you should use it to format your data set. You can check this table of types of transport data, which indicates the recommended data standards where applicable. In some cases, there may be no dominant standard.
The Local Government Association (LGA) has produced a guide to making standards work for you. The Open Data Institute (ODI) has produced guidance on open standards for data. This includes:
- when to use open standards
- when not to create new standards
- how to find and choose an open standard
To apply a standard you need to ensure your data contains the necessary fields in the correct format. To exchange data in a standardised format you need to check that your data is in the correct format. Where a schema is defined, you may be able to use tools provided to test your data against the schema.
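Where no schema tooling is available, a basic check of fields and formats can still be scripted. The sketch below assumes a CSV export with 3 hypothetical required columns (`site_id`, `date`, `vehicle_count`). A real standard would define its own columns and formats, and a schema validator should be preferred where one exists.

```python
import csv
import io
from datetime import date

# Hypothetical required columns; a real standard would define its own.
REQUIRED_COLUMNS = ["site_id", "date", "vehicle_count"]


def check_csv(text: str) -> list:
    """Check that CSV text has the required columns and well-formed values."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [column for column in REQUIRED_COLUMNS if column not in header]
    if missing:
        return [f"missing column: {column}" for column in missing]

    problems = []
    # Data rows start on line 2 because line 1 is the header.
    for line_number, row in enumerate(reader, start=2):
        try:
            date.fromisoformat(row["date"] or "")
        except ValueError:
            problems.append(f"line {line_number}: date is not a valid ISO 8601 date")
        if not (row["vehicle_count"] or "").isdigit():
            problems.append(f"line {line_number}: vehicle_count is not a whole number")
    return problems
```

For example, `check_csv(open("counts.csv", encoding="utf-8").read())` would return an empty list for a conforming file, where `counts.csv` is a placeholder file name.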
Opening your data and making it usable
This section covers the main things you need to do to prepare your data for publication.
Open by default
Ideally you should make your data open by publishing it under an open licence. The licence provides a framework for how the data should be used.
In some cases, you will not want or be able to open your data. Alternatively you could:
- wait for people to contact you to request data and share it only with restrictions
- keep your data closed with no sharing at all if you are not able or ready to open it
Open data policy
Many authorities have an open data policy. This demonstrates their ongoing commitment to open data. An open data policy will help to inform decision making and internal strategy. For example, it will help to prioritise data sets for publication. It will allow external stakeholders to understand your motivation to publish open data.
If your authority does not have an open data policy, you could reference open data in your local transport plan (LTP). This could involve a commitment to publishing open data.
Deciding on your licensing terms
If possible, publish open data under an open licence. This places few restrictions on what data users can do with the data available to them.
The Open Knowledge Foundation defines open data as data that anyone can freely access, use, modify and share for any purpose. Open data is subject, at most, to requirements to:
- attribute - data users must acknowledge the source of the content or data
- share-alike - the licence may require publication of derived content or data under the same or similar licence
The licence for your open data should reflect, at most, these requirements.
When in doubt use the Open Government Licence
When publishing open data, your default licensing option should be the Open Government Licence (OGL). Local authority data is often published under the OGL. The OGL is a simple set of terms and conditions that allow the re-use of public sector information free of charge. The latest version is OGL v3.0, released in October 2014. The National Archives has produced a guide to the OGL.
The OGL allows anyone to copy, publish, distribute, transmit and adapt published data. Users can exploit the data for commercial and non-commercial uses as long as they acknowledge the source. The OGL does not cover some types of information, including personal data. This exemption rarely affects transport data, but it does apply to number plate data.
The OGL is compatible with the Creative Commons Attribution License and the Open Data Commons Attribution License. This means that data licensed under either of these licences meets the terms of the OGL.
You apply the OGL by including an attribution statement with your data. The OGL provides a standard attribution statement. This is for data shared under this licence with no attribution statement:
Contains public sector information licensed under the Open Government Licence v3.0.
The OGL also gives local authorities the freedom to write their own attribution statement.
Example
For example, TfL uses the statement ‘Powered by TfL Open Data’. Wyre Borough Council uses the statement ‘Open data, Wyre Borough Council, licensed under the Open Government Licence’.
Technical considerations
There are several technical licensing matters to consider when you publish transport data.
The ODI has produced a publisher's guide and a reuser's guide to open data licensing.
Personal data
If your users have to register, state how you will use and process their personal data.
Re-application for access
Some local authorities require users to re-apply annually for access to their data. This will allow you to monitor which feeds have the most users. It will mean you can close open data feeds that users are not accessing or using. It will also mean you can ask users for feedback so you can understand how your data is being used.
Limit traffic requests
Consider implementing the ability to limit traffic requests or block access. A limit to traffic requests will limit the frequency and amount of calls for your data. It will ensure that your website can handle the simultaneous users and requests.
Example
For example, Transport for London (TfL) limits traffic requests to 300 calls per minute, per data feed. TfL reserves the right to throttle or limit access to feeds, when the service is degraded by excessive usage.
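A request limit of this kind can be sketched as a sliding-window counter kept separately for each feed. This is an illustrative sketch only and does not describe how TfL or any supplier implements its limits. The class name `FeedRateLimiter` and its parameters are hypothetical, and the defaults simply mirror the 300 calls per minute figure above.

```python
import time
from collections import defaultdict, deque


class FeedRateLimiter:
    """Allow at most `limit` calls per `window` seconds for each data feed."""

    def __init__(self, limit=300, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested
        self.calls = defaultdict(deque)  # feed name -> timestamps of recent calls

    def allow(self, feed: str) -> bool:
        """Record a call to `feed` and return True if it is within the limit."""
        now = self.clock()
        recent = self.calls[feed]
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False  # over the limit: the caller would reject or delay the request
        recent.append(now)
        return True
```

A web service would call `allow("feed-name")` before serving each request and return an error, such as HTTP 429, when it returns `False`. In practice this is often handled by an API gateway rather than by application code.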
Sharing geospatial data
Geospatial data is data about the locations of objects. If you are opening your transport data, you need to include the geospatial data so that users know where the data applies to. This section covers the main things you should consider when preparing your geospatial data for publication.
Sharing location data from Ordnance Survey base maps
If your data set includes Ordnance Survey (OS) data you must check the OS criteria before you publish it.
Depending upon the data set you are using, you can establish a route to publication of your data set. Your publication must provide an attribution statement for using OS data.
The Public Sector Geospatial Agreement (PSGA) is an agreement between the Geospatial Commission and OS. Beginning on 1 April 2020, the PSGA allows public sector access to OS data for a duration of 10 years. It allows local authorities to share OS data when it supports core business activity. This includes responses to Freedom of Information (FOI) requests. PSGA members can also share OS licensed data with each other with no restrictions.
If your data uses OS Open Data supplied under the PSGA then the OGL covers your publication.
If your data uses other OS data you must follow OS guidelines:
- If your data meets the OS presumption to publish criteria you must inform the OS of the publication of your data set. You can do this by completing the online presumption to publish form. In this case, the public sector member licence allows users to publish and share under the OGL.
- If your data set does not meet the presumption to publish criteria you still may be able to publish it. You must believe that you can only achieve your objectives by releasing your data. If this is the case you must request a publishing exemption to publish your derived data set. The OS will review your request and aim to respond within 15 working days.
You can find out more about publishing data under the PSGA.
You must attribute data sets that use OS data. OS provides suggested attribution statements.
Using OpenStreetMap as an alternative to Ordnance Survey base maps
An alternative to OS maps is OpenStreetMap (OSM). OSM is a community project, meaning anyone can update and edit the map. Changes to the map occur every minute. This means that local authorities can correct and update the map in near real time. In comparison, updates to the OS MasterMap Topography Layer occur every 6 months.
OSM is open data, meaning that anyone can freely use and share OSM. If you share derived data from OSM, you must attribute OSM and its contributors.
The ODI has produced guidance for local authorities when providing data to OSM.
Other alternatives
Check the terms and conditions before sharing data using other mapping services as a base map. Typically the choice of OSM as a base map is preferable because it is non-proprietary.
Example
If you want to use a Google product as a base map, for example Google Earth or Google Maps, you should follow Google’s guidelines on what you can share and details of the attribution they require.
Example
If you want to use Bing Maps as a base map, you will need a licence. For non-commercial uses the licence depends upon the number of transactions in a 24-hour period. You should also check Bing’s details of the attribution they require.
Creating metadata
Good metadata will make sure that users can find your data set and understand the data. You should maintain your metadata as your data sets grow and evolve.
Metadata is data that provides information about a data file. Metadata helps a machine to understand the data file. For example, every time you take a photo with your smartphone it saves relevant metadata. This includes the date, time, location and file size.
To make sure your metadata is easy to navigate and easy for data users to understand, the entries in these metadata fields should be as consistent as possible across your data sets. Avoid free text descriptions if possible as they can cause confusion and are not machine-readable.
The table below suggests the minimum metadata fields you should include with example responses. You should publish the metadata in the same place as your open data. For example, see Bristol’s metadata for its cycle network.
Metadata field | Some suggested response options |
---|---|
Title | Title of the data set |
Unique identifier | Include a unique identifier of the data set if applicable. |
Website URL of the data set | Website URL to the published data. |
What data is included | Air quality Electric vehicle charging points Cycle hire docking locations Cycle journey times Location of traffic signals Network performance Parking occupancy Street information |
Location | Latitude and longitude in a specified coordinate system. Typically this is WGS84 which is the system used by the Global Positioning System. The number of decimal places is important. |
Unit | Specify the units of measurement where applicable. |
Origin date | Date that the data set was generated. |
Last modified date | Date of the most recent update to the data set. |
Date | Dates that the data set covers. |
Frequency of update | Live Daily Weekly Monthly Quarterly Biannually Yearly No longer updated |
Contact details | Contact details for queries about the data set and for error-reporting purposes. Ideally this should be a team name and team email address. |
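To show why the number of decimal places in the location field matters, the sketch below estimates the ground distance represented by the last decimal place of a coordinate. It assumes roughly 111 km per degree of latitude, which is an approximation, and the function name `resolution_metres` is hypothetical.

```python
import math

METRES_PER_DEGREE_LATITUDE = 111_000  # approximate; varies slightly with latitude


def resolution_metres(decimal_places: int, latitude_degrees: float = 0.0):
    """Approximate ground distance of one unit in the last decimal place.

    Returns a (north_south, east_west) pair in metres.
    """
    step = 10 ** -decimal_places
    north_south = step * METRES_PER_DEGREE_LATITUDE
    # Lines of longitude converge towards the poles, so east-west distance shrinks.
    east_west = north_south * math.cos(math.radians(latitude_degrees))
    return north_south, east_west
```

Under these assumptions, 5 decimal places correspond to about 1.1 metres north to south, which is usually enough to locate roadside assets. 3 decimal places correspond to about 111 metres, which may place an asset on the wrong street.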
The table below includes some optional metadata fields.
Metadata field | Some suggested response options |
---|---|
Summary | A brief summary of the data set. |
Transport mode | Bus Cycle Pedestrian Train Taxi |
Road network | Motorways Dual carriageways Single carriageways Junctions |
Extents | If the data set has spatial extents these can be included as an optional metadata field. |
Format | File formats of the open data. |
Licensing | Free Commercial Open Government Licence (OGL) |
Version | Version 1.0. |
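One way to keep metadata entries consistent and machine-readable is to hold them as a structured record and check it before publication. The sketch below is illustrative only. The key names, the placeholder URL, the coordinates, the dates and the email address are all hypothetical. The unique identifier and unit fields are left out of the required list because the table marks them as applying only where relevant.

```python
import json

# Key names are hypothetical; they mirror the minimum fields suggested above.
MINIMUM_FIELDS = [
    "title", "url", "data_included", "location", "origin_date",
    "last_modified_date", "date_range", "update_frequency", "contact",
]
UPDATE_FREQUENCIES = {
    "Live", "Daily", "Weekly", "Monthly", "Quarterly",
    "Biannually", "Yearly", "No longer updated",
}


def check_metadata(metadata: dict) -> list:
    """Return a list of problems with a metadata record (empty if none)."""
    problems = [f"missing field: {name}" for name in MINIMUM_FIELDS if not metadata.get(name)]
    frequency = metadata.get("update_frequency")
    if frequency and frequency not in UPDATE_FREQUENCIES:
        problems.append(f"unrecognised update frequency: {frequency}")
    return problems


# Placeholder values for illustration only.
example = {
    "title": "Cycle hire docking locations",
    "url": "https://example.org/data/cycle-docks",
    "data_included": "Cycle hire docking locations",
    "location": {"coordinate_system": "WGS84", "latitude": 51.45000, "longitude": -2.59000},
    "origin_date": "2023-01-01",
    "last_modified_date": "2024-03-05",
    "date_range": "2023-01-01 to 2024-03-05",
    "update_frequency": "Monthly",
    "contact": "opentransportdata@example.gov.uk",
}

# Serialise the record so it can be published alongside the data set.
metadata_json = json.dumps(example, indent=2)
```

Restricting fields such as update frequency to a fixed set of values avoids the free text descriptions that the guidance above advises against.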
Publishing your data on a data platform
This section outlines the main steps for you to publish your data.
Decide on your open data platform
To publish your data you need a platform to publish from. Your options include urban traffic management and control (UTMC) systems or smart city or sub-national transport body (STB) data platforms.
Using UTMC as your data platform
You should use the systems that you already have as the first option for publishing your data. If you have a UTMC system consider using it as your data platform. There are official UTMC specifications, guidance and good practice advice. This documentation covers:
- establishing open data services
- dealing with APIs
- dealing with emerging technology - for example, connected and automated vehicles
UTMC systems support many devices and data types including:
- traffic signals via urban traffic control (UTC)
- fault and asset management - integration with existing remote monitoring systems (RMS)
- variable message signs
- car parks and parking guidance systems
- CCTV
- National Highways’ National Traffic Information Service (NTIS) data
- journey time data
- traffic counters
- journey time systems such as automatic number plate recognition (ANPR), Bluetooth and wifi
- incident and event management
Check with your UTMC system supplier whether you are able to:
- publish data on an open data platform - this may come with additional cost
- publish data via your existing public facing website
- publish data through other mechanisms - for example, Twitter, SMS or email
- bring new data sets into your UTMC to publish these data sets through it
Example
Reading Borough Council uses its UTMC to provide traffic and travel information through 2 externally accessible systems:
- the Reading open data server (ODS) provides raw data services to public app developers, using data from the real time passenger information (RTPI) and Stratos UTMC systems
- the Reading web server provides live traffic and travel data to Reading’s main council website, including real-time CCTV feeds and journey planner functionality, using data from the RTPI system, the ODS and external systems such as national rail data
There are 4 main UTMC suppliers offering different services to their users.
Mott MacDonald can provide:
- a DATEX II feed
- open data - for example, netraveldata
- a website - for example, Transport North East
- a Twitter feed - for example, NELiveTraffic
Siemens can provide:
- a DATEX II feed
- open data
- a website
- a Twitter feed
Dynniq can provide:
- a DATEX II feed
- an XML publisher application programming interface (API)
Idox can provide:
- a DATEX II feed
- an XML publisher API
How to publish UTMC data
When you publish UTMC data in DATEX II you can provide it in various formats including XML, JSON and ASN.1 (which is experimental). You should allow your users to choose which format they prefer. Ask your UTMC supplier what formats they can support.
Publishing via other data platforms
If you do not want to use UTMC you could procure a data platform. You might want to do this because:
- you do not have a UTMC system
- your UTMC does not support all data types and you prefer a platform that supports your UTMC data feed along with your other data types
Before you do this, check whether other teams in your authority have a platform you could use. Also consider using your authority’s smart city platform or your STB’s platform if applicable.
If you use an alternative platform, adopt existing industry-standard data schemas and specifications where they apply to you, for example DATEX II, ETSI, JSON, MAP and SPaT.
Communication with users
You should provide a way for users to provide feedback. This could be to:
- report missing or incorrect data
- ask for help if they do not understand something
- ask for a more accessible file format
Use an email address in the format opentransportdata@[local authority].gov.uk.
When a user reports an error, ask them to include at least:
- the URL that the error or feedback relates to
- the name and description of the data set
- their name and email address
When a user requests an accessible file format, ask them to include at least:
- the URL or name of the data requested
- the name and description of the data set
- the file format they want
- their name and email address
You could also have a form for users to request that you publish data sets. This can be used to decide which data sets to prioritise for publication.