Guidance

Publishing your tabular data

Add metadata to data you are publishing so search engines can find the data you’ve published and display structured results for users.

If you are planning to publish your data, you can talk to your organisation’s publishing team about how to add the right metadata to your data collection. It’s likely this will involve following the guidance on ‘Record information about data sets you share with others’.

To make sure your data is easily found through search tools, you can also add metadata from schema.org that’s more specific to your data collection. The publishing team may handle this for you. For example, during the Coronavirus pandemic, the GOV.UK team added enhancements to schema.org to assist the government response.

To help users find the most recent version of your data, make sure you use persistent resolvable identifiers as part of your version control.

Hosting metadata about your published data

If you’re publishing your data, you may want to host your metadata:

  • within the web page itself
  • in a Metadata Catalogue if your government organisation has one
  • in a separate JSON file, if your data is in CSV format, so that it follows the CSV on the Web standards - name this file by adding ‘-metadata.json’ to your CSV filename
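
As a minimal sketch of the naming convention in the last option, the helper below appends ‘-metadata.json’ to the full CSV filename, including its ‘.csv’ extension. The function name and the example filename are hypothetical and only illustrate the idea.

```python
from pathlib import Path


def metadata_path_for(csv_path: str) -> Path:
    """Return the path of the JSON metadata file that sits beside a CSV file.

    The suffix is appended to the whole filename, so 'example.csv'
    becomes 'example.csv-metadata.json'.
    """
    path = Path(csv_path)
    return path.with_name(path.name + "-metadata.json")


# Hypothetical filename, used for illustration only.
print(metadata_path_for("road-traffic.csv").name)
```

Keeping the metadata file next to the CSV file, with a name derived from it, means tools can locate the metadata without being told where it is.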

Working with a CSV file

If you are using a spreadsheet, whether it is in the open standard OpenDocument Spreadsheet (ODS) format or a proprietary format such as MS Excel or Google Sheets, you may want to export your dataset to a CSV file. This makes it easier for users to import, open and manipulate the data.

When sharing a CSV file with others, you should format it so it conforms to best practice standards. These are listed in the Tabular data standard.

At a minimum, following best practice means:

  • the first row of your CSV file is a header row, containing a title for each column in that CSV
  • each following row in your CSV file contains cell values that correspond to the column titles
  • the metadata associated with any published CSV file is stored as a JSON file
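
The first two points can be checked before you share a file. The sketch below is one possible approach using Python's standard csv module. The function name and the wording of the messages are hypothetical, and the checks are deliberately basic: it does not cover everything in the Tabular data standard.

```python
import csv
import io


def check_csv_shape(text: str) -> list[str]:
    """Return a list of problems found in the CSV text (empty if none)."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]

    header = rows[0]
    problems = []
    # Every column needs a non-empty, unique title in the header row.
    if any(not title.strip() for title in header):
        problems.append("header row has an empty column title")
    if len(set(header)) != len(header):
        problems.append("header row has duplicate column titles")

    # Every data row should have one cell per column title.
    for number, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"row {number} has {len(row)} cells, expected {len(header)}"
            )
    return problems


# Hypothetical data: the second data row is missing a cell.
sample = "Region,Vehicle count\nNorth,12\nSouth\n"
print(check_csv_shape(sample))
```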

Recording metadata about your CSV file

The metadata you supply about your CSV file should include:

  • the same information you would collect for any data file, for example provenance information and its validity
  • an additional annotated table so that users can understand, validate and use your CSV file as intended when it was published

Annotated tables will help users manage further processing of your CSV file, such as validating, converting, or displaying the tables. Core annotations you may want to include are listed by W3C, and include:

  • listing the columns and rows in the table and their order
  • the titles of your columns
  • the data type of each column, such as whether its values are strings, numbers or booleans
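
To show how these core annotations might look in practice, the sketch below builds a small metadata document using property names from the W3C CSV on the Web vocabulary (‘url’, ‘tableSchema’, ‘columns’, ‘name’, ‘titles’, ‘datatype’). The file name, dataset title and columns are hypothetical, and a real metadata file would usually carry more information, such as provenance and licence details.

```python
import json

# Hypothetical dataset: the file name, title and columns are examples only.
metadata = {
    "@context": "http://www.w3.org/ns/csvw",
    "url": "road-traffic.csv",
    "dc:title": "Road traffic counts (hypothetical example)",
    "tableSchema": {
        # Columns are listed in the order they appear in the CSV file.
        "columns": [
            {"name": "region", "titles": "Region", "datatype": "string"},
            {"name": "vehicle_count", "titles": "Vehicle count", "datatype": "integer"},
            {"name": "provisional", "titles": "Provisional", "datatype": "boolean"},
        ]
    },
}

# This text could be saved as 'road-traffic.csv-metadata.json'.
metadata_json = json.dumps(metadata, indent=2)
print(metadata_json)
```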

There are tools to help you annotate your CSV file and provide an annotated table, as well as convert your schema into JSON files. The categories of tools are listed by W3C and include:

  • viewers that can use the metadata to provide a more user-friendly or human-readable view of the CSV file, which might include displaying it in a table, or as graphs, or charts
  • data entry tools that can use the metadata to prompt people to supply information for a CSV file
  • validators that can check that the column labels in the metadata file match those in the CSV file, and that the values in the columns are of the right type and in the right format
  • converters that can use the metadata to map the CSV data into other formats such as JSON
  • data aggregators that can use the indicated metadata, such as descriptions, titles, relevant dates, and licences, to help users find your data
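
As an illustration of what a validator does, the sketch below compares a CSV header with the column titles in a metadata document and checks each value against its declared data type. It is a simplified, hypothetical example and not a full implementation of the W3C specifications: it assumes each column has a single ‘titles’ string and only handles the ‘string’, ‘integer’ and ‘boolean’ data types.

```python
import csv
import io
import re

# Simplified type checks; any datatype not listed here is not checked.
CHECKS = {
    "string": lambda value: True,
    "integer": lambda value: re.fullmatch(r"[+-]?[0-9]+", value) is not None,
    "boolean": lambda value: value in ("true", "false", "1", "0"),
}


def validate(csv_text: str, metadata: dict) -> list[str]:
    """Return a list of mismatches between the CSV text and its metadata."""
    columns = metadata["tableSchema"]["columns"]
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]

    # The header row must match the titles declared in the metadata.
    expected = [column["titles"] for column in columns]
    if rows[0] != expected:
        return [f"header {rows[0]} does not match metadata titles {expected}"]

    problems = []
    for row_number, row in enumerate(rows[1:], start=2):
        for column, value in zip(columns, row):
            datatype = column.get("datatype", "string")
            check = CHECKS.get(datatype, lambda value: True)
            if not check(value):
                problems.append(
                    f"row {row_number}, column '{column['titles']}': "
                    f"'{value}' is not a valid {datatype}"
                )
    return problems


# Hypothetical metadata and data, for illustration only.
example_metadata = {
    "tableSchema": {
        "columns": [
            {"titles": "Region", "datatype": "string"},
            {"titles": "Vehicle count", "datatype": "integer"},
        ]
    }
}
example_csv = "Region,Vehicle count\nNorth,12\nSouth,lots\n"
print(validate(example_csv, example_metadata))
```

In this example, the value ‘lots’ in the second data row is reported because the metadata declares the ‘Vehicle count’ column as an integer.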

Updates to this page

Published 7 August 2020
