DCMS Sectors Economic Estimates: Trade - Quality assurance report
Updated 18 March 2022
1. Introduction
The DCMS Sectors Economic Estimates 2019: Trade statistics published on 29 October 2020 provide an estimate of the value of imports and exports of goods and services in each DCMS sector (excluding the Civil Society sector). This document summarises the quality assurance processes applied during the production of these statistics by our data providers, the Office for National Statistics (ONS) and Her Majesty’s Revenue and Customs (HMRC), as well as those applied by DCMS.
2. Data Sources
Estimates of trade in services are produced using the ONS International Trade in Services (ITIS) dataset, which uses Standard Industrial Classification (SIC) codes.
Estimates of trade in goods have been constructed from HMRC Overseas Trade Statistics[footnote 1], which also use an international classification: Commodity (or CN08) codes. CN08 codes are based on the Harmonised System (HS) of tariff nomenclature, an internationally standardised system for classifying traded goods. They are 8-digit codes that identify categories of goods: the first 6 digits correspond to the HS code, and the 7th and 8th digits add further detail. DCMS Sectors are defined at the 4-digit SIC code level, so a conversion from SIC to CN codes was used to find the best match.
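As a minimal illustration of the code structure described above, the sketch below splits an 8-digit commodity code into its HS part and its 2-digit suffix. The function name and the example code value are hypothetical and are not taken from HMRC systems.

```python
def split_cn8(code: str) -> dict:
    """Split an 8-digit commodity code into its HS-6 prefix and 2-digit suffix."""
    digits = code.replace(" ", "")
    if len(digits) != 8 or not digits.isdigit():
        raise ValueError(f"expected an 8-digit commodity code, got {code!r}")
    return {
        "hs6": digits[:6],        # first 6 digits: the HS code
        "cn_suffix": digits[6:],  # 7th and 8th digits: further detail
    }


# Made-up code value, used only to show the split.
print(split_cn8("12345678"))
```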
The definitions of ‘exports’ and ‘imports’ for the services and goods data are different. The services data (ONS) are calculated on a ‘Balance of Payments’ basis, i.e. services entering and leaving an economic territory are not recorded as imports or exports unless they change ownership (between UK residents and non-residents). So, for example, gold moving from one owner to another in the same vault would be classified as trade under this principle.
The goods data (HMRC) are constructed on a ‘Cross-border’ basis, i.e. goods entering and leaving an economic territory are recorded as imports and exports. In this case, the gold above would have to cross a border from one country to the next to be classified as trade, but it would be classified as trade even if the owner remained the same. As the definitions are different, the estimates for goods and services cannot be combined and are presented as two separate figures.
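The difference between the two bases can be sketched as a simple rule. This is an illustration of the definitions above only; the class, field and function names are hypothetical and do not reflect how ONS or HMRC record transactions.

```python
from dataclasses import dataclass


@dataclass
class Movement:
    crosses_border: bool     # the item physically enters or leaves the territory
    changes_ownership: bool  # ownership passes between a UK resident and a non-resident


def counts_as_trade(movement: Movement, basis: str) -> bool:
    """Apply the recording rule for the given basis."""
    if basis == "balance_of_payments":  # basis used for the services data (ONS)
        return movement.changes_ownership
    if basis == "cross_border":         # basis used for the goods data (HMRC)
        return movement.crosses_border
    raise ValueError(f"unknown basis: {basis!r}")


# The gold example: sold within the same vault, or shipped abroad by the same owner.
gold_sold_in_vault = Movement(crosses_border=False, changes_ownership=True)
gold_shipped_same_owner = Movement(crosses_border=True, changes_ownership=False)
```

Under this sketch the gold sold within the vault counts as trade only on the Balance of Payments basis, and the gold shipped by the same owner counts only on the Cross-border basis, which is why the two sets of estimates cannot be added together.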
Please note that Tourism is defined by the characteristics of the consumer in terms of whether they are a tourist or resident, rather than by the goods and services produced themselves. Exports and imports for tourism are taken from estimates of spend by overseas residents in the UK and spend by UK residents abroad, respectively. This data is sourced from the ONS International Passenger Survey (IPS) as well as Visit Britain, and does not distinguish between spend on goods and services. Unlike estimates for other DCMS sectors, we therefore provide estimates of trade in both goods and services combined for Tourism.
Data are available for each DCMS sector (excluding Civil Society), and also for sub-sectors within the Creative Industries, Digital Sector, and the Cultural Sector. However, for Trade in Goods, there are currently no goods associated with the Gambling and Telecoms sectors.
3. Quality assurance processes at ONS – Trade in Services
Quality assurance at ONS takes place at a number of stages, outlined below. Note that the information presented here on the data sources is taken from the International trade in services Quality Methodology Information (QMI) and should be credited to the ONS.
The International Trade In Services (ITIS) data shows the import and export activity of UK companies overseas and is the main source of information for UK trade in services. The data is based solely on survey data.
3.1 Sampling frame and data collection
ITIS data are based on a quarterly sample of approximately 2,200 businesses and an annual sample of approximately 15,500 businesses. Response rate targets are 85% for both annual and quarterly ITIS data.
The survey data from both the quarterly and annual results are combined to produce the annual ITIS estimates and are used as a main data source to compile total trade in services estimates. It is worth noting that the surveys do not provide full coverage of the UK economy, and excluded sectors include: travel and transport; banking and other financial institutions; higher education; and most activities in the legal professions.
The ITIS survey is also supplemented by information collected via the Annual Business Survey (ABS). Historically, ITIS’ product-level estimates have been derived from the ABS. Since 2018, product- and industry-level data have been improved by directly surveying companies operating in some industries from ITIS itself. However, the ABS is still used as part of the process, for example to help inform the sample for these industries.
ITIS data are collected by both industry and service on a geographical basis, by collecting data for the countries to which services are exported and from which they are imported. These data are primarily used in the compilation of the services account for the UK’s Balance of Payments (BoP), which in turn contributes towards the measure of UK gross domestic product (GDP). The ITIS estimates are published annually.
Data relating to the import or export of goods are excluded from this survey. However, merchanting (earnings from arranging the sale of goods between two countries outside the UK, where the goods never physically enter the UK) is included, along with earnings from commodity trading. For commodity trading, as with merchanting, the services element is calculated as the businesses’ profit minus the loss.
3.2 Validation and quality assurance
There is no simple way of measuring the accuracy of ITIS statistics, that is, the extent to which they measure the underlying “true” value for a particular period. Non-sampling errors are not easy to quantify and include errors of coverage, measurement, processing and non-response. Various procedures and checks are made to ensure these errors are minimised. As ITIS is based on survey responses, ONS systems validate these entries and prompt confirmation of suspect data is sought.
Every effort is made to ensure that the series are comparable over time. International standards (BPM6[footnote 2] and MSITS 2010[footnote 3]) are used in the production of ITIS data; therefore, figures published by the UK should be comparable with other countries. UK representation in working groups will ensure that the UK is synchronised with any changes in international standards.
Survey returns are run through a series of checks to identify errors. These checks ensure that:
- responses to individual questions are consistent within the questionnaire as a whole, that is, totals equate to the sum of the parts
- the return is consistent with historical data from the business
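The two checks listed above can be sketched as follows. This is an illustrative example only, not the ONS validation system: the function name, the input layout and the 50% change tolerance are all assumptions made for the sketch.

```python
def check_return(parts, reported_total, previous_total, tolerance=0.5):
    """Return a list of warnings for one survey return.

    parts          -- component values reported on the questionnaire
    reported_total -- the total reported on the same questionnaire
    previous_total -- the total from the business's previous return
    tolerance      -- hypothetical maximum relative change before flagging
    """
    warnings = []
    # Internal consistency: the total should equal the sum of the parts.
    if sum(parts) != reported_total:
        warnings.append("total does not equal sum of parts")
    # Historical consistency: flag large movements for confirmation.
    if previous_total > 0:
        change = abs(reported_total - previous_total) / previous_total
        if change > tolerance:
            warnings.append("large change from previous return")
    return warnings
```

A flagged return is not necessarily wrong; in this sketch a warning simply marks the return as one to confirm with the business.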
Further quality assurance applied includes the detection and treatment of outliers; application of imputation (ratio and means of ratio); and the application of disclosure control. Statistical disclosure control is applied to the ITIS survey data before release of the publication. This means that some published tables have been altered to ensure that information attributable to an individual or individual organisation is not identifiable in any published outputs. The Code of Practice for Statistics[footnote 4] describes the data protection procedures applied.
4. Quality assurance processes at HMRC – Trade in Goods
Data for goods trade with EU countries are currently collected through the Intrastat system - a monthly business survey used to determine the level of trade conducted within the EU.
UK businesses that trade to or from other EU Member States, and which meet the reporting threshold (described in the next section) for intra-EU Arrivals and Dispatches are legally obliged to submit supplementary trade declarations using the Intrastat system. A step-by-step guide or alternatively a demo is available to help businesses understand and comply with the Intrastat system.
Data for non-EU trade in goods are collected from customs declarations made to HMRC when goods leave or enter the UK (Customs Handling of Import and Export Freight (CHIEF) system). These data are combined to produce overall import and export estimates. HMRC apply disclosure control to the data before it is released.
On 1 May 2016, the UK moved from a General Trade system to a Special Trade system. This means, for example, that goods moving from Iceland to Norway via a British port will be counted as Iceland-Norway trade. Under the general system, if the goods had moved into a warehouse on the UK mainland after arriving from Iceland, and before moving to Norway, they would have been counted as Iceland-UK and UK-Norway trade respectively. This does not affect the comparability of data before and after May 2016; however, it is still important to know that the change has been made.
4.1 Sampling frame and data collection
Businesses whose annual value of arrivals and/or dispatches exceeds a given exemption threshold are required to provide an Intrastat declaration each month, showing full details of their arrivals (imports) and dispatches (exports) during that month. The thresholds are reviewed annually to minimise the burden on business of the Intrastat system whilst maintaining the coverage by value of UK trade required by European legislation. For example, for the calendar years 2010 to 2013 these thresholds were set at £600,000 for arrivals and £250,000 for dispatches; the arrivals threshold changed to £1.5m after 2013. These detailed Intrastat declarations are required to cover at least 93 per cent of the value of trade for arrivals, and at least 97 per cent of the value of trade for dispatches.
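As a simple illustration of the threshold rule, the sketch below applies the 2010 to 2013 thresholds quoted above to a business's annual trade values. The function and variable names are hypothetical, and the sketch ignores any other conditions that may apply in practice.

```python
# Thresholds quoted in the text for calendar years 2010 to 2013 (GBP).
THRESHOLDS_2010_2013 = {"arrivals": 600_000, "dispatches": 250_000}


def declarations_required(annual_arrivals, annual_dispatches,
                          thresholds=THRESHOLDS_2010_2013):
    """Return, for each flow, whether annual trade exceeds the exemption threshold."""
    return {
        "arrivals": annual_arrivals > thresholds["arrivals"],
        "dispatches": annual_dispatches > thresholds["dispatches"],
    }
```

For example, a business with £700,000 of arrivals and £100,000 of dispatches would, under these thresholds, be required to declare arrivals but not dispatches.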
Businesses are expected to submit their data by the 21st day of the following month; for instance, January data must be submitted by the 21st of February.
The fields that are collected are as follows:
- Commodity Code
- Invoice Value
- Net mass/Supplementary Unit (where appropriate as determined by Commodity Code)
- Country of Destination or Dispatch (COD)
- Delivery Terms (if the business reaches the Delivery terms threshold)
- Nature of Transaction
There are two main ways of submitting Intrastat data electronically, either via the Internet or using Electronic Data Interchange (EDI):
- The secure system for submitting via the Internet is accessed from the HMRC website and businesses can choose to either key directly onto an online form or submit offline using a Comma Separated Variable (CSV) file.
- The EDI facility allows HMRC to receive data in the Electronic Data Interchange for Commerce and Transport (EDIFACT) Standard
4.2 Validation and quality assurance
HMRC carries out extensive validation procedures as part of its data processing. A validity error is where a field has been submitted in an incorrect format or is missing where required. Validity checks are done electronically by HMRC computer systems. Suspect fields are verified by reference to the original source document or by contacting the business or agent. Special attention is paid to high value traders to ensure that all significant value transactions are included when the trade statistics are first produced.
Auto corrections are built into HMRC computer systems to cope with certain common types of error. Examples include obsolete commodity codes, partially invalid commodity codes (e.g. only the first six of the eight digits are valid), invalid/obsolete country/port codes etc.
Other checks on the trade data focus on value and quantity data. For example, HMRC carry out credibility checks on the relationships between the fields:
- ‘Value’ and ‘Quantity 1’ (i.e. net mass)
- ‘Value’ and ‘Quantity 2’ (i.e. supplementary units such as number of items)
- ‘Quantity 1’ and ‘Quantity 2’
Credibility checking is a tool for ensuring that the detailed data obtained is realistic and viable. These checks are not meant to indicate that a particular item is incorrect but that it is different from the norm. If this highlights a potential error, then the data will be investigated, often by contacting the business, and corrected where necessary (subject to risk profiling and resources available).
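The idea behind a credibility check on ‘Value’ and ‘Quantity 1’ can be sketched as below. HMRC’s actual rules are not described in this report, so the approach shown (comparing each record’s value per unit of net mass with the median for the same commodity, and flagging those more than a chosen factor away) is an assumption made purely for illustration, as are the function name, field names and the factor of 5.

```python
from statistics import median


def flag_unusual_unit_values(records, factor=5.0):
    """Return the positions of records whose unit value differs from the norm.

    records -- list of dicts with 'value' and 'net_mass' (assumed positive),
               all for the same commodity code
    factor  -- hypothetical tolerance: flag unit values more than `factor`
               times above or below the median
    """
    unit_values = [r["value"] / r["net_mass"] for r in records]
    norm = median(unit_values)
    flagged = []
    for position, unit_value in enumerate(unit_values):
        if unit_value > norm * factor or unit_value < norm / factor:
            flagged.append(position)
    return flagged
```

As in the text, a flag here means only that a record is different from the norm and may merit investigation, not that it is incorrect.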
Administrative data from CHIEF is supplied to HMRC in the form of daily file transmissions of cleared/departed customs declarations to their data processing system, called TS93. This system allows for data validation and quality checking so that they can make data-led amendments to both CHIEF and Intrastat data. CHIEF and TS93 systems are both used to carry out credibility checking of trade data within HMRC.
5. Trade in the Tourism Sector
Tourism data are based on a different methodology to other DCMS sectors. Estimates for the Tourism sector are taken from the International Passenger Survey[footnote 5]. This survey, run by the ONS, collects information about passengers entering and leaving the UK, and has been running continuously since 1961.
These estimates are based on the assumption that imports of tourism are equal to spend by UK residents on trips abroad and exports of tourism are equal to spend by overseas residents during visits to the UK.
These figures represent trade in goods and services combined and therefore are not directly comparable with the trade in services or trade in goods estimates presented for all other sectors (excluding Civil Society). Therefore, estimates of imports and exports of Tourism are not presented in the DCMS sector totals.
Ahead of this year’s updates, methodological improvements were made to the International Passenger Survey (IPS). These relate mainly to the survey’s weighting process and respond to evidence that the previous estimation method was not providing accurate results for certain groups (e.g. underestimating visits by Chinese residents). As a result, a revised back-series from 2014 to 2019 has been produced this year. This means that data published this year are not directly comparable with previously published estimates, including those for data from before 2014.
5.1 Sampling frame and data collection
The IPS conducts between 700,000 and 800,000 interviews a year, of which over 250,000 are used to produce estimates of overseas travel and tourism. Published estimates are based on face-to-face interviews with a random sample of passengers as they enter or leave the UK by the principal air, sea and tunnel routes.
The IPS uses a multi-stage sample design, where the sampling for air, sea and tunnel travel is carried out separately, although the underlying principle for each mode of travel is broadly similar. In the absence of a sampling frame of travellers, time periods (or sea or shuttle crossings) at selected ports and routes are chosen at the first stage and travellers are then systematically selected at fixed intervals from a random start within these interviewing shifts or crossings at the second stage.
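The second-stage selection described above (fixed intervals from a random start) is a form of systematic sampling, which can be sketched as follows. The function name, the interval of 10 and the use of a simple list to stand for passengers passing an interviewer are assumptions for illustration; the sketch does not cover the first-stage selection of shifts or crossings.

```python
import random


def systematic_sample(passengers, interval, rng=None):
    """Select every `interval`-th passenger, starting from a random position."""
    rng = rng or random.Random()
    start = rng.randrange(interval)  # random start within the first interval
    return passengers[start::interval]


# Example: 100 passengers passing during one shift, sampling 1 in 10.
shift = list(range(100))
sample = systematic_sample(shift, interval=10, rng=random.Random(1))
```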
5.2 Validation and quality assurance
Numerous checks are built into the Computer Assisted Data Input (CADI) program, which acts as the first form of data validation and thus reduces the number of errors. On return of survey data to the office, a comprehensive suite of edits and validation checks is carried out to clarify (and correct where necessary) any outstanding issues with the data. These include:
- checking zero spend, for example, from the completed questionnaire
- checking high spend to ensure this has been correctly coded
- coding towns, countries, airlines, not included on the interviewers’ coding frames
- checking missing information to determine whether anything was written on the paper questionnaire that would enable the information to be input
- reviewing internal inconsistencies that have been identified, or flagged by the interviewer
Where the responses for the main items of interest are missing, the values are imputed on a topic by topic basis where the method is broadly similar for each topic. The IPS implements a mean-value within class imputation procedure as detailed in the IPS methodology[footnote 6].
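A mean-value within class imputation can be sketched as below. This is a minimal illustration of the general technique rather than the IPS procedure itself: the function name, the record layout and the choice of class variable are hypothetical.

```python
from collections import defaultdict


def impute_class_mean(records, class_key, value_key):
    """Fill missing values (None) with the mean of observed values in the same class."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for record in records:
        if record[value_key] is not None:
            totals[record[class_key]] += record[value_key]
            counts[record[class_key]] += 1

    imputed = []
    for record in records:
        filled = dict(record)  # copy, so the original responses are left unchanged
        cls = filled[class_key]
        if filled[value_key] is None and counts[cls] > 0:
            filled[value_key] = totals[cls] / counts[cls]
        imputed.append(filled)
    return imputed
```

In this sketch a missing value is left as missing when no respondent in the same class reported a value.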
6. Quality assurance processes at DCMS
The majority of quality assurance of the data underpinning the DCMS Sectors Economic Estimates Trade release takes place at ONS and HMRC, through the processes described above. However, further quality assurance checks are carried out within DCMS at various stages.
Production of the report is typically carried out by one member of staff, whilst quality assurance is completed by at least one other, to ensure an independent evaluation of the work.
6.1 Data requirements
For the Trade in Services data, DCMS discusses its data requirements with ONS and these are formalised as a Data Access Agreement (DAA). The DAA covers which data are required, the purpose of the data, and the conditions under which ONS provide the data. Discussions of requirements and purpose with ONS improve the understanding of the data at DCMS, helping us to ensure we receive the correct data and use it appropriately.
For Trade in Goods, data for the UK totals are downloadable from the HMRC website via the “build your own tables” platform. However, for ease of use, HMRC provide bespoke analysis for DCMS sectors, sent to the department as an Excel spreadsheet which includes breakdowns by country. HMRC apply disclosure control to the data before sending it to DCMS, and therefore DCMS does not apply any further disclosure control. More information can be found on the UK Trade Info website.
For Tourism estimates, data are taken directly from the ONS International Passenger Survey (IPS).
6.2 Checking of the data delivery
For these Trade in Services estimates, one csv file is provided for exports, and another for imports. We check we have received data for all sectors and for all relevant 4-digit SIC codes (for the overlap charts).
Later in February, or in March, one raw data file will be received from the ONS, sent in csv format. This may then be converted and imported into SPSS or R for immediate checks. For 2015 data and older, we were sent two files in text file format: one for Exports (‘Receipts’) and one for Imports (‘Payments’).
For this particular data we check that:
- We have received all data at the 4 digit SIC code level, which is required for us to aggregate up to produce estimates for our sectors and sub-sectors.
- There is no repetition of totals in order to avoid double counting.
- Data at the 4 digit SIC code has not been rounded unexpectedly. This would cause rounding errors when aggregating up to produce estimates for our sectors and sub-sectors.
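The first two delivery checks above could be sketched as follows. This is an illustration only, not the DCMS production code (which is described elsewhere in this report as using SPSS or R): the function name and the row layout are assumptions, and the rounding check is not covered.

```python
def check_delivery(rows, required_sic_codes):
    """Check a delivery of (sic_code, value) rows at the 4-digit SIC level.

    Returns a list of issues; an empty list means the checks passed.
    """
    issues = []
    codes = [code for code, _ in rows]

    # All required 4-digit SIC codes should be present.
    missing = sorted(set(required_sic_codes) - set(codes))
    if missing:
        issues.append(f"missing SIC codes: {missing}")

    # Rows that are not 4-digit codes may be totals, which risk double counting.
    not_four_digit = sorted({c for c in codes if not (len(c) == 4 and c.isdigit())})
    if not_four_digit:
        issues.append(f"rows that are not 4-digit codes: {not_four_digit}")

    # Repeated codes would also be double counted when aggregating.
    repeated = sorted({c for c in codes if codes.count(c) > 1})
    if repeated:
        issues.append(f"repeated codes: {repeated}")

    return issues
```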
For the Trade in Goods data, once the data is sent to DCMS, the following initial steps are taken:
- Check for changes to CN codes early in the process using correlation tables
- Check whether any of the codes affected are (or have been) DCMS Sector codes
- Decide how and where (in which DCMS Sector, if applicable) to classify the new Commodity Codes (resulting from the changes). This is done by matching the CN08 codes to the European Classification of Economic Activities (NACE) 2.1 at the 4-digit level, which is equivalent to 4 digit SIC codes.
- As an extension of these exercises, for these Trade estimates a refresh of commodity codes was carried out, going back to 2015. This uncovered around 14 new commodity codes to be classified in our sectors (8 from 2015, and 6 from 2017). All of these were Digital Sector goods codes, but two also spanned the Creative Industries and one also spanned the Cultural Sector. Some analysis of adding in the 2015 codes found an impact of around 1-2% on DCMS Sector goods exports and around 5-6% on DCMS Sector goods imports, based on 2015 data.
- Lookup tables can then be created for DCMS sectors by identifying SIC codes for each sector. These tables are published alongside the Trade in Goods statistics for the DCMS Sectors.
For the Tourism estimates, DCMS take the latest figures from the IPS that are published by the ONS. Additionally, some specific, detailed country breakdown data are received from Visit Britain (for example, breakdowns for smaller countries not published by the ONS). The relevant data from both these datasets are then included in the published tables. DCMS do not update the back series as the IPS does not get updated annually. However, countries may be added or removed from DCMS’s published tables based on user demand.
For the Tourism data, we check that:
- The data has been copied correctly from the files received from the data provider.
- The correct data year is copied over.
6.3 Data analysis
At the analysis stage, data is aggregated up to produce information about DCMS sectors and sub-sectors. For the 2018 Trade in Services estimates we published in February 2020, the outputs were provided to us by ONS in aggregated form based on R code supplied by DCMS. However, these still require secondary and tertiary disclosure control to minimise risk of disclosive data being published. These are applied by DCMS in Excel. DCMS also builds in the following checks at this stage:
- Checks that summing up breakdowns gives the same figure as the total they contribute to. E.g.:
  - Do sub-sectors within the Creative Industries sum to the Creative Industries total?
  - Do the individual geographic figures sum up to the wider geographic total (e.g. do the individual continents sum up to the World total?)?
- “Sense checks” of the data, which can then be queried with ONS colleagues. E.g.:
  - Are the proportions of each sector and subsectors similar to last year? If not, why?
  - Looking at any large differences between the data, when compared like-for-like with the previous year.
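The summation and sense checks listed above can be sketched as two small functions. These are illustrative only: the function names, the rounding tolerance and the 5 percentage point threshold are assumptions, not the values used by DCMS.

```python
def breakdown_matches_total(parts, total, tolerance=0.5):
    """Check that a breakdown sums to its total, allowing for rounding."""
    return abs(sum(parts.values()) - total) <= tolerance


def large_share_changes(current, previous, threshold=0.05):
    """Return sub-sectors whose share of the total moved by more than `threshold`.

    current, previous -- dicts of sub-sector name -> value for each year
    threshold         -- hypothetical limit, expressed as a proportion
    """
    current_total = sum(current.values())
    previous_total = sum(previous.values())
    flagged = {}
    for name in current:
        if name in previous:
            change = current[name] / current_total - previous[name] / previous_total
            if abs(change) > threshold:
                flagged[name] = round(change, 3)
    return flagged
```

Anything returned by the second function is a prompt for a query rather than evidence of an error, since a real change in the sector or a change in methodology could both produce a large movement.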
The Trade in Goods table production was carried out in the programming language R, as part of the automation work being undertaken in DCMS. This year, it was found that the data supplied was on a different basis to previous years, as it no longer covered “Below Threshold Trade Allocation” (BTTA) estimates for EU goods trade.
BTTA arises out of the fact that exports and imports below set thresholds are not reported to Intrastat, the database for recording Trade in Goods between EU countries. Previously, HMRC estimated BTTA at 8-digit commodity code level by using trade just above the threshold to estimate trade just below, and to then allocate this for each EU country. These estimates were stopped when they were found to be less robust at this level of granularity. They are instead only available at a less granular level (2-digit HS commodity code).
Analysis carried out by DCMS showed that removing BTTA trade had an impact of around 3.6% on the estimate of exports of DCMS Sector goods; and 5.3% on the estimate of imports of DCMS Sector goods. However, DCMS had a clear user need of comparability of trade in goods between EU and non-EU countries. Therefore, a revised BTTA metric developed by DCMS has been used for this publication.
This uses 2-digit BTTA (a more robust estimate) to “allocate” BTTA to different 8-digit commodity codes. Analysis for this found results broadly similar to the original approach (the new estimate was 0.2-0.3% higher for exports; and 0.6-0.7% higher for imports, for DCMS Sector goods).
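One way such an allocation could work is sketched below. The report does not set out the allocation rule, so the rule used here (sharing each 2-digit BTTA figure across 8-digit codes in proportion to their reported trade within that 2-digit chapter) is an assumption made for illustration, as are the function name and the made-up codes and values.

```python
def allocate_btta(reported_by_cn8, btta_by_hs2):
    """Share 2-digit BTTA across 8-digit codes in proportion to reported trade.

    reported_by_cn8 -- dict of 8-digit commodity code -> reported trade value
    btta_by_hs2     -- dict of 2-digit chapter -> BTTA estimate for that chapter
    """
    # Total reported trade in each 2-digit chapter.
    chapter_totals = {}
    for code, value in reported_by_cn8.items():
        chapter_totals[code[:2]] = chapter_totals.get(code[:2], 0.0) + value

    allocated = {}
    for code, value in reported_by_cn8.items():
        chapter = code[:2]
        total = chapter_totals[chapter]
        share = value / total if total else 0.0
        allocated[code] = share * btta_by_hs2.get(chapter, 0.0)
    return allocated


# Made-up codes and values, for illustration only.
reported = {"85010000": 300.0, "85020000": 100.0, "49010000": 50.0}
btta = {"85": 40.0, "49": 5.0}
allocation = allocate_btta(reported, btta)
```

Under this assumed rule the allocated amounts within each chapter sum back to the 2-digit BTTA figure, so the more robust chapter-level estimate is preserved.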
Once tables have been run in R and exported to Excel for the latest year, DCMS builds in the following checks at this stage:
- Checks that summing up breakdowns gives the same figure as the total they contribute to. E.g.:
  - Do sub-sectors within the Creative Industries sum to the Creative Industries total?
  - Do the individual geographic figures sum up to the wider geographic total (e.g. do the individual continents sum up to the World total?)?
- “Sense checks” of the data. E.g.:
  - Are the proportions of each sector and subsectors similar to last year? If not, could this be because of changes to the methodology?
  - Looking at any large differences between the data and possible causes of these.
6.4 Quality assurance of data analysis
Once analysis is complete, DCMS documents the checks that quality assurers need to carry out.
The checks for this release include:
- Introductory checks (correct files, years etc. used).
- Checking that the various stages of data processing have been correctly calculated. This includes checking that:
  - The syntax is accurate.
  - The correct codes (SIC or Commodity) have been aggregated together to form DCMS sector (and sub-sector) estimates.
  - All codes we require are included, and no non-DCMS codes have been included by accident.
- Checking the data to make sure it is not possible to derive disclosive data from the figures that are published. (Only applicable for Services data).
- Making sure the correct data has been pasted to the final tables for publication and is formatted correctly.
- Making sure all charts are linking (correctly) to the right data and all maps produced are using the correct data.
6.5 Dissemination
Finalised figures are disseminated within Excel tables and a written report (which includes written text, graphs, tables and infographics) published on GOV.UK. These are produced by the Trade statistics lead. Before publishing, a quality assurer checks the figures match between the working-level analysis, the tables and the written report. The quality assurer also makes sure any statements made about the figures (e.g. regarding trends) are correct according to the analysis and checks for spelling or grammar errors.
7. Next steps
We encourage our users to engage with us so that we can improve our statistics and the documentation surrounding them. If you would like to comment on this quality assurance report, or have any enquiries, please get in touch at evidence@dcms.gov.uk.