GOV.UK Taxonomy principles
Published 13 June 2019
GOV.UK’s topic taxonomy is maintained using these principles. The principles explain:
- what the topic taxonomy is for
- how the topic taxonomy is designed and structured
- how to tag content to the topic taxonomy
1. What the topic taxonomy is for
The topic taxonomy:
- helps users and machines explore topics on GOV.UK
- consolidates and structures content on GOV.UK
- only categorises content that already exists
- is displayed through topic pages and finders
1.1 Helping users and machines explore topics on GOV.UK
The taxonomy is designed to help users and machines understand and explore the subject areas covered by GOV.UK. It also helps to describe the meaning of the content.
Users can browse topics through topic pages and finders.
Publishers can analyse the content they publish by topic. This can be more powerful when combined with traffic metrics and user journey data.
1.2 Consolidating and structuring content on GOV.UK
The topic taxonomy is the primary way to categorise GOV.UK content by topic. This makes content on GOV.UK easier to manage and find. As well as using new topics, it includes terms from these previous taxonomies:
- policies (legacy)
- policy areas (legacy)
- specialist topics (legacy)
- mainstream browse
1.3 The topic taxonomy only categorises content that is already on GOV.UK, or is soon to be published there
The topic taxonomy only describes the GOV.UK domain. This is all content on GOV.UK.
It does not aim to model the world. For example, only transport topics that already exist and that cover GOV.UK content can be found in the transport area of the taxonomy. If there’s no content about electric skateboards on GOV.UK, ‘electric skateboards’ will not be a topic in the taxonomy.
This makes the taxonomy easier to manage and a truer reflection of what’s on GOV.UK.
1.4 How the topic taxonomy is displayed (topic pages and finders)
Topics are represented on GOV.UK by topic pages and their finders. Users can navigate to topic pages through sidebar links and breadcrumbs on content pages. Topic pages currently display content tagged to a topic and all its subtopics.
Topic pages and their finders are the first taxonomy-driven GOV.UK navigation product. There are likely to be alternative ways for users to interact with the topic taxonomy in the future, such as filtering long lists of document types or a department’s publications by topics.
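As an illustration of how a topic page gathers content tagged to a topic and all its subtopics, here is a minimal sketch. The topic names, content paths and the `topic_page_content` function are hypothetical and are not taken from GOV.UK's own code.

```python
# Hypothetical data: child topics keyed by parent, and content paths keyed by topic.
CHILDREN = {
    "transport": ["driving", "rail"],
    "driving": ["mot"],
}
TAGGED = {
    "transport": ["/speech-on-transport"],
    "driving": ["/highway-code"],
    "mot": ["/check-mot-status"],
    "rail": ["/rail-franchising"],
}


def topic_page_content(topic, children=CHILDREN, tagged=TAGGED):
    """Return content tagged to `topic`, followed by content tagged to its subtopics."""
    items = list(tagged.get(topic, []))
    for child in children.get(topic, []):
        # Recurse so that grandchildren and deeper levels are included too.
        items.extend(topic_page_content(child, children, tagged))
    return items
```

In this sketch, calling `topic_page_content("driving")` would return the content tagged to 'driving' and to its child topic 'mot', but nothing from 'rail'.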
2. The design and structure of the topic taxonomy
2.1 The taxonomy is topic-based
The taxonomy categorises content by topic. Topics are subject areas.
They are not:
- government departments or agencies
- content formats
- services or tasks
- user groups or professions
Topics describe what the content is about - they’re not intended to describe who has published the content, or who the content is for.
2.2 The taxonomy is hierarchical
The taxonomy is made up of parent-child relationships. Child topics are subcategories of their broader parent topics. This hierarchical tree structure makes the taxonomy easy to browse. The strength of hierarchical relationships between topics can be tested through tree tests. If a user can easily find a topic by starting at the top of the taxonomy ‘tree’, then the hierarchical relationships are strong.
2.3 A topic should have between 2 and 12 child topics (or none)
Showing users (or publishers) too many topics at any level of the taxonomy can be overwhelming. The maximum number of topics at any level should ideally be around 12.
Topics with only one child topic can create unnecessary granularity, so the minimum number of children a topic can have is 2 (assuming it has children).
A single child topic can be merged with its parent topic or pulled up to the same level as its parent topic. Alternatively, sibling topics and a new parent topic can be added.
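The guideline on child topics could be checked automatically. The sketch below is one possible approach, assuming the taxonomy is available as a mapping from each topic to its list of child topics. The function name and the data are hypothetical, and the 'around 12' guideline is treated as a hard limit only to keep the example simple.

```python
MIN_CHILDREN = 2
MAX_CHILDREN = 12  # The guidance says "ideally around 12"; a hard limit is an assumption.


def child_count_problems(children):
    """Map each topic that breaks the 2 to 12 guideline to its number of children."""
    problems = {}
    for topic, kids in children.items():
        count = len(kids)
        if count == 0:
            continue  # Topics with no children are allowed.
        if count < MIN_CHILDREN or count > MAX_CHILDREN:
            problems[topic] = count
    return problems
```

A topic flagged with a count of 1 is a candidate for merging, pulling up or gaining siblings. A topic flagged with a high count may need an extra level of hierarchy.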
2.4 Naming topics
It’s important that topic names can be distinguished from each other so that users (publishers or end users) know which topic to pick.
We should avoid ambiguity between similar categories in separate parts of the taxonomy. One way is to use qualifiers, for example orange (fruit) and orange (colour). The GOV.UK style guide should be used when naming topics.
The taxonomy also uses these official standards for controlled vocabularies:
- ANSI/NISO Z39.19 Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies
- ISO 25964: Information and documentation - Thesauri and interoperability with other vocabularies (parts 1 and 2)
- BS 8723 (this was superseded by ISO25964-1 on 20 September 2011)
Principles for naming topics
Topics must:
- be subject areas (not tasks, audience types, names of departments, government initiatives, or content types)
- accurately describe the content tagged to them
- make sense on their own without context
- be disambiguated if there is more than one topic with the same name - use qualifiers in brackets, for example pipes (musical instruments) and pipes (smoking implements)
- be written in plain English, avoiding jargon, overly technical terms, punctuation, symbols and numbers where possible
- reflect user language where possible - refer to search terms in Google Analytics or Google Trends
- not use acronyms, unless the acronym is better known than the non-abbreviated term, for example MOTs
- be in sentence case, unless the topic is an official name
- not be phrases or sentences
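Some of these naming principles lend themselves to simple automated checks. The sketch below covers only two of them: duplicate names that need a qualifier, and names containing punctuation, symbols or numbers. The `name_warnings` function is hypothetical, and its character check is a rough heuristic that returns warnings rather than errors, because the principles apply 'where possible'.

```python
import re
from collections import Counter


def name_warnings(names):
    """Return (name, reason) pairs for topic names that may break the naming principles."""
    warnings = []
    counts = Counter(name.lower() for name in names)
    for name in names:
        if counts[name.lower()] > 1:
            warnings.append((name, "duplicate name, add a qualifier in brackets"))
        # Ignore a trailing qualifier such as " (musical instruments)" before
        # checking the characters used in the name itself.
        base = re.sub(r"\s*\([^)]*\)$", "", name)
        if re.search(r"[^A-Za-z' -]", base):
            warnings.append((name, "contains punctuation, symbols or numbers"))
    return warnings
```

Checks such as plain English, user language and whether a name is a subject area still need human judgement.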
2.5 The topic taxonomy is always evolving and publishers are encouraged to provide feedback
The taxonomy is flexible. Topics can be:
- added (if a new subject area emerges)
- removed (if they become redundant)
- renamed
- merged
- split into child topics
Publishers can suggest new topics, or changes to existing topics. Use the naming principles as a guide when making suggestions.
3. Tagging to the topic taxonomy
All content published using Whitehall publisher must be tagged to at least one topic.
3.1 Content should be tagged based only on what it’s about
Tag content based on what it’s about. This helps users understand the meaning of content on GOV.UK. Using the topic taxonomy makes sure this is done consistently.
3.2 Publishers are responsible for tagging accurately
Publishing teams are responsible for making sure the content they publish is tagged accurately. Tagging to the taxonomy is part of the publishing flow in Whitehall publisher.
Publishers can tag accurately by:
- spot checking tagging on samples of content
- correcting tagging whenever they notice something inaccurate
- reviewing topic pages to check that they’re surfacing expected content (and if they are not, tagging missing content)
3.3 Publishers should check the whole taxonomy when tagging
It’s important for publishers to consider the whole taxonomy when tagging content, not just topics they’d usually associate with their organisation. Users should not need to understand the structure of government to interact with government.
3.4 Content should be tagged to the lowest possible level of the taxonomy
Publishers can tag to any topic, but they should try to tag at the most granular level. Occasionally it’s appropriate to tag content to broader topics, higher up in the taxonomy. Speeches, for example, are often about generic topics like transport.
Content should not be tagged to both a topic and its parent.
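The rule against tagging content to both a topic and its parent can be expressed as a small check. This is a sketch only: the `PARENT` lookup, the topic names and the `redundant_parent_tags` function are hypothetical.

```python
# Hypothetical lookup from each topic to its parent. Top-level topics have no entry.
PARENT = {"driving": "transport", "rail": "transport", "mot": "driving"}


def redundant_parent_tags(tags, parent=PARENT):
    """Return the tags that are the parent of another tag on the same content item."""
    tag_set = set(tags)
    return sorted({parent[t] for t in tag_set if parent.get(t) in tag_set})
```

For content tagged to both 'mot' and 'driving', the check returns 'driving' as the tag to remove, leaving the content at the most granular level. The sketch only compares a topic with its direct parent, as the rule is worded.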
3.5 Distribution of content tagged to the taxonomy should be monitored but cannot be balanced
Topics with too many or too few content items should be reviewed to see if the taxonomy needs to be more or less granular - for example, whether topics should be split or merged.
However, it’s difficult to standardise the volume of content that should be tagged to a topic. Some topics will have significantly more content than others, and there might not be any legitimate subtopics within the collection of content. One example is the collection of Air Accidents Investigation Branch reports.
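Monitoring the distribution could start with a simple count of content items per topic. In the sketch below, the `topics_to_review` function and the `low` and `high` thresholds are assumptions chosen for illustration, since the volume of content per topic cannot be standardised. The output is a list of topics for a person to review, not a list of topics to change.

```python
from collections import Counter


def topics_to_review(content_tags, low=5, high=500):
    """Return (too_few, too_many): topics whose content counts fall outside the thresholds.

    content_tags maps each content path to the list of topics it is tagged to.
    Topics with no tagged content at all do not appear in the result.
    """
    counts = Counter(topic for tags in content_tags.values() for topic in tags)
    too_few = sorted(t for t, n in counts.items() if n < low)
    too_many = sorted(t for t, n in counts.items() if n > high)
    return too_few, too_many
```

A topic in the second list may be a candidate for splitting, or may be a legitimately large collection with no sensible subtopics.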
3.6 GDS manages the taxonomy
GDS manages the topic taxonomy based on feedback from a variety of sources, including:
- publishers and subject matter experts across departments
- performance data and user behaviour
- machine learning algorithms
- controlled user research