Tag content with international language codes
Updated 9 August 2022
Use the ISO 639-1:2002 language codes standard to add consistent, internationally recognised language codes to your data.
1. Summary of the standard’s use for government
The ISO 639-1 standard uses 2 letter codes to represent the names of more than 180 internationally recognised languages. It does not represent languages that are exclusively for machines. Use this standard to make sure you reference languages in a consistent way across your datasets.
The government chooses standards using the open standards approval process and the Open Standards Board has final approval. Read more about the process for language codes.
2. How this standard meets user needs
Use this standard for consistent language tagging. For example, use it alongside cross-platform character encoding to make sure different systems correctly identify languages.
When systems can consistently identify which languages you’ve used, it can help users who:
- want to trade with UK businesses
- plan to travel to the UK from abroad
- live in the UK but do not speak English
The government also has a legal requirement to translate information into Welsh. This standard provides a consistent way of tagging this information so it’s easier to find.
Using this standard means:
- users can find information in the language they need
- services and content have consistent language tags
- screen readers can identify which language the content is in
3. How to use the standard
When you’re publishing content in multiple languages you must use this standard’s 2 letter codes in the tags or metadata.
You must use language tags in the relevant HTML and XML document metadata.
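As a minimal sketch of checking a code before it is written into tags or metadata, the Python below normalises a value and rejects anything that is not a known 2 letter code. The function name `normalise_language_code` and the `SAMPLE_CODES` table are illustrative assumptions, and the table is only a small subset of ISO 639-1, so a real service would load the full list.

```python
import re

# Illustrative subset only (an assumption for this sketch).
# The full ISO 639-1 list is much longer.
SAMPLE_CODES = {
    "en": "English",
    "cy": "Welsh",
    "fr": "French",
    "pl": "Polish",
    "ur": "Urdu",
}

_TWO_LETTERS = re.compile(r"[a-z]{2}")


def normalise_language_code(raw: str) -> str:
    """Return a lower-case 2 letter code, or raise ValueError."""
    code = raw.strip().lower()
    if not _TWO_LETTERS.fullmatch(code):
        raise ValueError(f"not a 2 letter code: {raw!r}")
    if code not in SAMPLE_CODES:
        raise ValueError(f"code not in the sample list: {raw!r}")
    return code


if __name__ == "__main__":
    print(normalise_language_code(" CY "))
```

Normalising to lower case before comparison means values such as `CY` and `cy` end up as the same tag in your data.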
This standard does not cover:
- standard methods for attaching language tags to other formats, such as JSON
- methods of presenting a user with text, in particular, HTTP language negotiation or the URL suffix scheme currently used by GOV.UK
Use the World Wide Web Consortium (W3C) guidance to:
- show you how to annotate language on the web, for both HTML and XML formats
- declare the language of a web page or a portion of a web page using the HTML ‘lang’ attribute
- declare the language of a body of text using the ‘xml:lang’ attribute
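To illustrate the two attributes, the sketch below declares a page language and a Welsh portion in sample HTML and XML, then reads the declarations back using the Python standard library. The sample markup, the `LangCollector` class and the helper functions are assumptions made for illustration and are not taken from the W3C guidance.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# ElementTree exposes 'xml:lang' under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample page: English overall, one paragraph in Welsh.
HTML_SAMPLE = """<!DOCTYPE html>
<html lang="en">
  <body>
    <p>Welcome</p>
    <p lang="cy">Croeso</p>
  </body>
</html>"""

# Hypothetical sample XML document using 'xml:lang'.
XML_SAMPLE = """<notice xml:lang="en">
  <title>Welcome</title>
  <title xml:lang="cy">Croeso</title>
</notice>"""


class LangCollector(HTMLParser):
    """Collect (element name, lang value) pairs from start tags."""

    def __init__(self):
        super().__init__()
        self.langs = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "lang" and value:
                self.langs.append((tag, value))


def html_langs(markup: str):
    collector = LangCollector()
    collector.feed(markup)
    return collector.langs


def xml_langs(markup: str):
    root = ET.fromstring(markup)
    return [(el.tag, el.get(XML_LANG)) for el in root.iter() if el.get(XML_LANG)]


if __name__ == "__main__":
    print(html_langs(HTML_SAMPLE))
    print(xml_langs(XML_SAMPLE))
```

A check like this could run before publishing to confirm that every page declares a language on its root element.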
You can also get a list of:
- 2 and 3 letter language tags on the Internet Assigned Numbers Authority (IANA) website
- extended language tags with script subtags in RFC 5646, if you need to add information to the language tag
- tags and their language names on the US Library of Congress website
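For extended tags, the sketch below splits a tag such as `zh-Hant` into its language subtag and an optional 4 letter script subtag. It is a deliberate simplification of RFC 5646: it only recognises a script subtag that directly follows the language subtag, it ignores other subtag types, and it does not check values against the IANA registry. The function name `split_language_and_script` is an assumption for illustration.

```python
from typing import Optional, Tuple


def split_language_and_script(tag: str) -> Tuple[str, Optional[str]]:
    """Return (language, script) for a simple language tag.

    Simplified sketch: only handles 'language' or 'language-Script'
    at the start of the tag, with no validation against a registry.
    """
    parts = tag.strip().split("-")
    language = parts[0].lower()
    script = None
    if len(parts) > 1:
        candidate = parts[1]
        # Script subtags are 4 letters, conventionally written in title case.
        if len(candidate) == 4 and candidate.isascii() and candidate.isalpha():
            script = candidate.title()
    return language, script


if __name__ == "__main__":
    for example in ("zh-Hant", "sr-latn", "en-GB", "cy"):
        print(example, split_language_and_script(example))
```

The language subtag stays the primary key for your data, and the script subtag is extra information kept alongside it.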