National Travel Survey 2022: Technical report chapter 3
Published 30 August 2023
Applies to England
Chapter 3: Sample Selection
3.1 Sample size and structure
The NTS 2022 was designed to provide a representative sample of households in England and was based on a stratified two-stage random probability sample of private households. The sampling frame was the ‘small user’ Postcode Address File (PAF) – a list of all addresses (delivery points) in the country.
The sample for the 2022 survey was drawn by first selecting the Primary Sampling Units (PSUs), and then selecting addresses within PSUs. The sample design employs postcode sectors as PSUs. Consistent with previous years of the survey, 756 PSUs and 12,852 addresses were selected for the NTS 2022.
3.2 Quasi-panel design
Following a review of the NTS methodology in 2000, it was decided that the NTS should introduce a quasi-panel design from 2002 onwards. According to this design, half the PSUs in a given year’s sample are retained for the next year’s sample and the other half are replaced. This has the effect of reducing the variance of estimates of year-on-year change.
Therefore 378 of the PSUs selected for the 2021 sample were retained for the 2022 core sample, supplemented with 378 new PSUs. The PSUs carried over from the 2021 sample for inclusion in 2022 were excluded from the 2022 sample frame so that they could not appear twice in the sample. However, the PSUs dropped from the 2021 sample were included in the frame.
Whilst the same PSU postcode sectors might appear in different survey years, no single address was allowed to be included in three consecutive years, to minimise the chances of the same address being selected again. Each year, NatCen provides the sampling company with a list of the addresses selected for the previous three survey years. These addresses were excluded from the sampling frame before the addresses for 2022 were selected. This means respondents to the three previous years’ surveys in the carried-over PSUs could not be contacted again.
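The two exclusion rules described above can be illustrated with a minimal sketch. The function names and the representation of sectors and addresses as plain strings are assumptions made for illustration; they are not taken from the NTS sampling system.

```python
def build_psu_frame(all_sectors, carried_over):
    """Frame for drawing new PSUs: every sector except those carried over.

    Sectors dropped from the previous year stay in the frame, so they may
    be selected again.
    """
    carried = set(carried_over)
    return [s for s in all_sectors if s not in carried]


def build_address_frame(paf_addresses, previously_selected):
    """Frame for drawing addresses: remove addresses selected in the
    previous three survey years (passed in as one combined list)."""
    excluded = set(previously_selected)
    return [a for a in paf_addresses if a not in excluded]


# Hypothetical sector and address identifiers, for illustration only.
sectors = ["AB1 1", "AB1 2", "AB1 3", "AB1 4"]
psu_frame = build_psu_frame(sectors, carried_over=["AB1 2"])
address_frame = build_address_frame(
    ["1 High St", "2 High St", "3 High St"], previously_selected=["2 High St"]
)
```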
For further information about the methodological review, see Elliott, D. (2000) ONS Quality Review of the National Travel Survey: Some Aspects of Design and Estimation Methods.
3.3 Selection of sample points
A list of all postcode sectors in England was generated (excluding those in the Isles of Scilly due to cost of interviewing). Sectors carried over from the previous year were also excluded. Sectors with fewer than 500 delivery points were grouped with an adjacent sector. Grouped sectors were then treated as one PSU. On average each PSU contained about 3,250 delivery points.
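As a rough illustration of the grouping step, the sketch below merges any sector with fewer than 500 delivery points into a neighbouring sector. It assumes the input list is already ordered so that neighbouring entries are geographically adjacent, and it always merges a small sector with the one that follows it. Both are simplifying assumptions; the report does not say how the adjacent sector was chosen.

```python
def group_small_sectors(sectors, min_points=500):
    """Group sectors so that every PSU has at least `min_points` delivery points.

    sectors: list of (sector_id, delivery_points), ordered so that
             neighbouring entries are adjacent (an assumption of this sketch).
    Returns a list of (tuple_of_sector_ids, total_delivery_points).
    """
    groups = []
    current_ids, current_points = [], 0
    for sector_id, points in sectors:
        current_ids.append(sector_id)
        current_points += points
        if current_points >= min_points:
            groups.append((tuple(current_ids), current_points))
            current_ids, current_points = [], 0
    if current_ids:
        # Trailing small sectors: attach them to the last completed group.
        if groups:
            last_ids, last_points = groups.pop()
            groups.append((last_ids + tuple(current_ids), last_points + current_points))
        else:
            groups.append((tuple(current_ids), current_points))
    return groups


# Hypothetical delivery point counts.
example = [("AB1 1", 300), ("AB1 2", 900), ("AB1 3", 700), ("AB1 4", 200)]
psus = group_small_sectors(example)
```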
This list of grouped postcode sectors in England was stratified using a regional variable, an urban or rural indicator, car ownership and a working from home indicator (note that this stratification approach was first implemented in NTS 2015 following a stratification review that NatCen carried out in 2014). This was done to increase the precision of the sample and to ensure that the different strata in the population are correctly represented. Random samples of PSUs were then selected within each stratum.
The regional strata for England are based on the NUTS2 areas, grouped in a few cases where single areas are too small. NUTS, or Nomenclature of Territorial Units for Statistics, is a Europe-wide geographical classification developed by Eurostat, the statistical office of the European Union. NUTS2 roughly relates to counties or groups of counties in England. The 30 regional strata for the survey are shown in Table 3.1.
Within each region, postcode sectors were allocated to “urban” or “rural” based on the urban or rural indicator, creating 51 “expanded” regions. The urban or rural indicator itself was based on the 2011 Census and derived from the ten-category Rural Urban Classification. Within each “expanded” region, postcode sectors were listed in increasing order of the proportion of households with no car (according to the 2011 Census). Cut-off points were then drawn approximately one third and two thirds (in terms of delivery points) down the ordered list, to create three roughly equal-sized bands. Within each of the 153 bands thus created (51 × 3), sectors were listed in order of the percentage of people working from home (based on the 2011 Census).
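The banding and ordering within one “expanded” region can be sketched as follows. The dictionary keys and the rule used to place a sector that straddles a cut-off (its midpoint on the cumulative delivery point scale decides the band) are assumptions for illustration; the report only states that the cut-offs fall approximately one third and two thirds of the way down the list.

```python
def band_and_order(sectors):
    """Split one expanded region into three car-ownership bands of roughly
    equal size (in delivery points), then order each band by working from home.

    sectors: list of dicts with keys 'id', 'delivery_points',
             'pct_no_car' and 'pct_wfh' (hypothetical field names).
    Returns a list of three lists of sector dicts.
    """
    ordered = sorted(sectors, key=lambda s: s["pct_no_car"])
    total = sum(s["delivery_points"] for s in ordered)
    bands = [[], [], []]
    cumulative = 0
    for s in ordered:
        # Assumption: a sector belongs to the band containing its midpoint.
        midpoint = cumulative + s["delivery_points"] / 2
        band = min(int(3 * midpoint / total), 2)
        bands[band].append(s)
        cumulative += s["delivery_points"]
    return [sorted(b, key=lambda s: s["pct_wfh"]) for b in bands]


# Six hypothetical sectors of equal size.
region = [
    {"id": "a", "delivery_points": 1000, "pct_no_car": 10, "pct_wfh": 5},
    {"id": "b", "delivery_points": 1000, "pct_no_car": 20, "pct_wfh": 3},
    {"id": "c", "delivery_points": 1000, "pct_no_car": 30, "pct_wfh": 8},
    {"id": "d", "delivery_points": 1000, "pct_no_car": 40, "pct_wfh": 2},
    {"id": "e", "delivery_points": 1000, "pct_no_car": 50, "pct_wfh": 1},
    {"id": "f", "delivery_points": 1000, "pct_no_car": 60, "pct_wfh": 9},
]
banded = band_and_order(region)
band_ids = [[s["id"] for s in band] for band in banded]
```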
378 postcode sectors were then systematically selected for the core sample with probability proportional to delivery point count. Differential sampling fractions were used in Inner London, Outer London and the rest of England in order to oversample London (see section 3.4 for further details). These sectors were then added to the 378 sectors carried over from the previous year’s survey to make the final core sample of 756 sectors.
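Systematic selection with probability proportional to size is a standard technique, and a minimal sketch of it is given here. It assumes the sectors are already in stratification order and uses a single sampling fraction, so it does not reproduce the differential fractions applied to Inner London, Outer London and the rest of England. The function name and arguments are illustrative.

```python
import random


def systematic_pps(units, n, rng=None):
    """Select n units systematically with probability proportional to size.

    units: ordered list of (unit_id, size), for example
           (postcode sector, delivery point count).
    rng:   object with a random() method returning a float in [0, 1).
    A unit whose size exceeds the sampling interval can be hit more than once.
    """
    rng = rng or random.Random()
    total = sum(size for _, size in units)
    interval = total / n
    start = rng.random() * interval          # random start within the first interval
    points = [start + k * interval for k in range(n)]
    selected = []
    cumulative = 0
    i = 0
    for unit_id, size in units:
        cumulative += size
        # Take the unit once for every selection point that falls inside it.
        while i < n and points[i] < cumulative:
            selected.append(unit_id)
            i += 1
    return selected


# Hypothetical sectors and delivery point counts.
frame = [("A", 100), ("B", 300), ("C", 200), ("D", 400)]
sample = systematic_pps(frame, n=2, rng=random.Random(2022))
```

Because the list is sorted by the stratifiers before selection, a systematic draw of this kind spreads the sample across the strata implicitly.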
Table 3.1: NTS regional stratification variable
Stratification number | Area (England) | REGION code
---|---|---
1 | Inner London – East | 7 Greater London
2 | Inner London – West | 7 Greater London
3 | Outer London – East and North East | 7 Greater London
4 | Outer London – South | 7 Greater London
5 | Outer London – West and North West | 7 Greater London
6 | Devon and Cornwall | 9 South West
7 | North Somerset, North East Somerset, Bath, Somerset and Dorset | 9 South West
8 | Bristol, South Gloucestershire, Gloucestershire and Wiltshire | 9 South West
9 | Oxfordshire, Buckinghamshire and Berkshire | 8 South East
10 | Hampshire and Isle of Wight | 8 South East
11 | Kent | 8 South East
12 | West Sussex and East Sussex | 8 South East
13 | Surrey | 8 South East
14 | Essex | 6 Eastern
15 | Cambridgeshire, Suffolk and Norfolk | 6 Eastern
16 | Hertfordshire and Bedfordshire | 6 Eastern
17 | Leicestershire, Lincolnshire and Northamptonshire | 4 East Midlands
18 | Warwickshire and Hereford and Worcester | 5 West Midlands
19 | West Midlands | 5 West Midlands
20 | Shropshire and Staffordshire | 5 West Midlands
21 | Nottinghamshire and Derbyshire | 4 East Midlands
22 | Cheshire | 2 North West and Merseyside
23 | Merseyside | 2 North West and Merseyside
24 | Greater Manchester | 2 North West and Merseyside
25 | Lancashire and Cumbria | 2 North West and Merseyside
26 | South Yorkshire | 3 Yorkshire and Humberside
27 | West Yorkshire | 3 Yorkshire and Humberside
28 | North Yorkshire and Humberside | 3 Yorkshire and Humberside
29 | Cleveland, County Durham and Northumberland | 1 North East
30 | Tyne and Wear | 1 North East
3.4 Oversampling of London
Each year, London PSUs are oversampled. Response rates tend to be much lower in London than in the rest of England, with rates being lowest in Inner London. The NTS oversamples Inner and Outer London with the aim of achieving responding sample sizes in London and elsewhere which are proportional to their population. To set the level of oversampling, response rates were estimated at 49% for Inner London, 58% for Outer London and 67% for the rest of England. These estimates were based on NTS response rates from 2016 to 2020 plus our own experience of achieving full household co-operation in these areas. Of the 756 sectors in the core sample, 73 were in Outer London and 54 in Inner London.
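The report does not give the formula used to set the sampling fractions. One way to read the stated aim is that the issued sample in each area is inflated by the inverse of its expected response rate, so that the responding sample ends up proportional to population. The sketch below works through that reasoning with the response rates quoted above; the function name is illustrative and the inverse-rate rule is an assumption.

```python
# Estimated response rates quoted in the text.
RESPONSE_RATES = {
    "Inner London": 0.49,
    "Outer London": 0.58,
    "Rest of England": 0.67,
}


def relative_sampling_fractions(rates, reference="Rest of England"):
    """Sampling fraction in each area relative to the reference area,
    assuming issued sample is scaled by 1 / expected response rate."""
    return {area: rates[reference] / rate for area, rate in rates.items()}


fractions = relative_sampling_fractions(RESPONSE_RATES)
```

Under this assumption Inner London would be sampled at about 1.37 times, and Outer London at about 1.16 times, the rate used in the rest of England.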
3.5 Selection of addresses
In the core sample for the NTS 2022, 17 addresses were systematically selected from each of the 756 PSUs, a total of 12,852 selected addresses.
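A minimal sketch of an equal-probability systematic selection of 17 addresses from one PSU is shown here, together with a check of the total. The use of a fractional interval and the function name are assumptions; the report states only that the selection was systematic.

```python
import random


def systematic_sample(addresses, n=17, rng=None):
    """Select n addresses at a fixed (fractional) interval from a random start."""
    if len(addresses) < n:
        raise ValueError("PSU has fewer addresses than the number required")
    rng = rng or random.Random()
    interval = len(addresses) / n
    start = rng.random() * interval
    last = len(addresses) - 1
    # min() guards against floating point rounding at the very end of the list.
    return [addresses[min(int(start + k * interval), last)] for k in range(n)]


# A hypothetical PSU with 3,250 delivery points, the average size quoted in section 3.3.
psu_addresses = list(range(3250))
selected = systematic_sample(psu_addresses, n=17, rng=random.Random(1))

total_addresses = 756 * 17  # 12,852 addresses across the core sample
```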
For historical context, note that in 2013 a split sample design was trialled whereby some PSUs had 17 addresses selected from them and others had 22. This was to test the effect of clustering on survey estimates. As a result of this trial, from 2014 onwards the number of addresses in an interviewer assignment was reduced to 17.
3.6 Self-completion section
Starting in NTS 2017, a Computer Assisted Self Interviewing (CASI) module for transport satisfaction questions was added, where one adult from those present during the household interview is asked to complete the satisfaction questions.
The introduction of the CASI module added a new element to the sample design. The satisfaction questions are, by nature, individual as opposed to household questions (different members of the same household may hold different opinions). Previously, satisfaction questions had been asked of the main household respondent, who is disproportionately likely to be older and female. Furthermore, responses to satisfaction questions tend to vary by these same demographic characteristics. It was therefore important to transfer these questions so that they were asked of a randomly selected individual within the household. The methodology for incorporating the CASI module into the NTS sample was based on the methodological development work that NatCen carried out in 2016. This methodology is detailed in Appendix Q1 of the NTS 2017 Technical Report.
This development work showed that inclusion of the satisfaction questions in this way requires the selection of one adult per household among those present during the interview. Selecting only from those present, however, introduces a non-random element in the sampling process, as some individuals (those who are absent) would have a zero probability of selection, thus introducing bias to the selected sample.
One way to overcome the zero probability of selection for the absent individuals is to treat them as non-respondents to the satisfaction questions and weight the satisfaction sample accordingly to make it representative of the total NTS interview sample (and by extension representative of the adult population in England).
The development work also showed that younger men and women are under-represented in the sub-sample of NTS household members who are present during the interview. Given that younger people are less likely to live alone, this under-representation is likely to increase if one person per household is selected at random amongst those who are present.
This imbalance by age could be reduced by varying the probabilities of selection so that the number of young men and women selected is increased. Following the recommendation from the development work, the satisfaction sample for NTS 2022 was selected with equal probability, except in households where both people aged 16 to 29 and people aged 30 or over were present. In such households, those aged 16 to 29 were selected with an 80% probability.
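The selection rule can be sketched as a set of per-person selection probabilities. The report does not say how the 80% is shared when more than one person aged 16 to 29 is present, so the sketch assumes it is split equally within each age group. That split, the function name and the treatment of anyone under 16 as ineligible are all assumptions.

```python
def selection_probabilities(ages_present, young_share=0.8):
    """Probability of selection for each adult present at the interview.

    ages_present: ages of the people present, indexed by position.
    Returns {position: probability}; people under 16 are left out.
    """
    young = [i for i, age in enumerate(ages_present) if 16 <= age <= 29]
    older = [i for i, age in enumerate(ages_present) if age >= 30]
    probs = {}
    if young and older:
        # Mixed household: the younger group shares 80%, the older group 20%.
        for i in young:
            probs[i] = young_share / len(young)
        for i in older:
            probs[i] = (1 - young_share) / len(older)
    else:
        # Only one age group present: equal probability of selection.
        adults = young + older
        for i in adults:
            probs[i] = 1 / len(adults)
    return probs


mixed = selection_probabilities([22, 45, 50])
older_only = selection_probabilities([40, 60])
```

Unequal probabilities of this kind are normally offset at the analysis stage by a weight equal to the inverse of each person's selection probability.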
3.7 Allocation of PSUs to months
The survey year is divided into 12 quota (fieldwork) months and equal numbers of PSUs (189) are assigned to each quarter, resulting in an average of 63 assignments being issued each month. Allocating PSUs evenly across a quarter (rather than a month) results in a more even spread of the average number of assignments and hence interviews and travel diaries per day across months. This allows us to control for variation across seasons. Furthermore, PSUs were allocated to quota months such that a nationally representative sample would be obtained for each quarter. For historical context, note that until 2016, an equal number of assignments (63) were issued each month which meant that shorter months (particularly February) were slightly overrepresented in the data.
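The report does not state how the 189 PSUs in a quarter are divided between its three months. One approach consistent with the aim of an even number of assignments per day is to allocate in proportion to the number of days in each month, which is what the sketch below assumes. It also treats quota months as calendar months, which may not match fieldwork practice exactly.

```python
import calendar


def allocate_quarter(year, months, psus_per_quarter=189):
    """Split a quarter's PSUs across its months in proportion to days in the
    month, using largest remainders so the total is exact."""
    days = {m: calendar.monthrange(year, m)[1] for m in months}
    total_days = sum(days.values())
    allocation, remainders = {}, {}
    for m in months:
        allocation[m], remainders[m] = divmod(psus_per_quarter * days[m], total_days)
    leftover = psus_per_quarter - sum(allocation.values())
    # Give any PSUs left over to the months with the largest remainders.
    for m in sorted(months, key=lambda m: remainders[m], reverse=True)[:leftover]:
        allocation[m] += 1
    return allocation


first_quarter = allocate_quarter(2022, [1, 2, 3])
```

For January to March 2022 this gives 65, 59 and 65 PSUs, compared with the 63 per month issued until 2016.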
3.8 Selection of households at sampled addresses
At some addresses, interviewers may find that there is more than one dwelling unit, such as a house (for example, number 15) which has been split into two flats (say, 15a and 15b). (A dwelling unit is a living space with its own front door – this can be either a street door or a door within a house or block of flats.) They may also encounter dwelling units with multiple resident households, for example there could be two families living as two separate households in one house. A household is defined as one person or a group of people living in a dwelling unit, who share cooking facilities and share a living room, sitting room or a dining area (the living room may also serve as a kitchen or bedroom, but is still counted as a living room).
In England such addresses are not reliably identified on the PAF and will not be identified until the interviewer has visited the address. As a result, households residing at addresses with multiple dwelling units or households, or both, will have had a lower chance of selection than others. While there are relatively few such addresses (1%), they account for a larger proportion of households, and these households tend to be rather different to others (poorer, younger, and smaller), so consequent biases may not be entirely trivial.
Interviewers must select one household to approach to take part at each sampled address. Interviewers are instructed to first establish the number of dwelling units at each sampled address. If there is more than one, interviewers use a selection grid on the Address Record Form to select one. They then establish the number of households residing within the selected dwelling unit. Once again, if there is more than one, interviewers use a selection grid to make a random selection.
Corrective weighting is then used to remove any bias arising from the lower chance of selection among dwelling units or households residing at multi-household addresses.
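The two-step selection and the weight it implies can be sketched as follows. Random draws stand in for the selection grid on the Address Record Form, and the weight shown (number of dwelling units multiplied by number of households in the selected unit) is the usual inverse-probability correction. The report does not give the exact weighting formula, so this is an assumption.

```python
import random


def select_household(dwelling_units, rng=None):
    """Select one household at an address and return it with its selection weight.

    dwelling_units: list of dwelling units, each a list of household ids.
    """
    rng = rng or random.Random()
    unit = rng.choice(dwelling_units)      # step 1: one dwelling unit
    household = rng.choice(unit)           # step 2: one household within it
    # Inverse of the chance of selection within the address.
    weight = len(dwelling_units) * len(unit)
    return household, weight


# Hypothetical address: a house split into flats 15a (one household)
# and 15b (two households).
address = [["15a-h1"], ["15b-h1", "15b-h2"]]
household, weight = select_household(address, rng=random.Random(0))
```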
Prior to 2009, the selection process at multi-household addresses was to list all households at the address and randomly select up to three in England and Wales, and only one in Scotland. This limitation on the number of extra households left some residual bias that was similarly removed using corrective weighting.
During NTS 2022, 13.0% of PSUs were issued as push-to-telephone and were therefore not contacted by an interviewer. For these the household selection process could not be completed as normal. However, as noted above this change in process will have only affected a small number of addresses.
3.9 Ineligible (deadwood) addresses
The following types of address were classified as ineligible in 2022:
- houses not yet built or under construction
- demolished or derelict buildings or buildings where the address has “disappeared” when 2 addresses were combined into one
- vacant or empty housing unit: housing units known not to contain any resident household on the date of the first contact attempt
- a non-residential address: an address occupied solely by a business, school, government office or other organisation with no resident persons
- residential accommodation not used as the main residence of any of the residents; this is likely to apply to second homes, seasonal, vacation or temporary residences, and these were excluded to avoid double counting; the households occupying the address had a chance of selection at their permanent address
- a communal establishment or institution: that is, an address at which four or more unrelated people sleep; while they may or may not eat communally, the establishment must be run or managed by the owner or a person (or persons) employed for this purpose
- an address that is residential and occupied by a private household(s), but does not contain any household eligible for the survey; it is very rare for a residential household not to be eligible for the NTS interview, exceptions include ‘Household of foreign diplomat or foreign serviceman living on a base’, addresses which are not the ‘Main residence’ of any of the residents and addresses where there are no residents aged 16 or over
- an address out of sample: that is, cases where interviewers were directed not to approach a particular address; this is very rare and usually only occurs where an address should not have been listed on the original sampling frame
For further information about outcome coding, see section 4.12.
3.10 PSU level variables
In addition to the information provided by members of the sampled households, the NTS also collects information measured at the PSU-level. The value of a PSU-level variable applies to all households living within that PSU. The PSU-level is therefore the highest level at which the data may be analysed, coming just above the Household level in the analysis hierarchy.
3.11 Fieldwork start dates
Since 2014, an additional process has followed the selection of sample points (also known as assignments). Start dates are spread evenly across each month and then assigned at random to the points issued in that month. See section 2.2.2 for further information.