National Travel Survey 2024 weighting review
Updated 16 April 2025
Chapter 1: Introduction
1.1 Background and aims
The National Travel Survey (NTS) provides up-to-date and regular information about personal travel within Great Britain by people living in England and monitors trends in travel behaviour. The survey collects detailed information on the key characteristics of each participating household and any vehicle to which they have access. In addition, each individual within the household is interviewed and then asked to complete a seven-day travel diary. The survey produces a rich dataset for analysis with information recorded at a number of different levels (household, individual, vehicle, long distance journey, trip and stage).
The NTS responses are weighted in order to control for known sources of bias, including household-level non-response and drop-off in trip reporting in the travel diary. The six sets of weights produced for each full calendar year of NTS data are:
- interview weights
- fully responding weights
- diary weights
- short walks weights
- long distance journey weights
- self-completion weights (also known as CASI weights, referring to the ‘computer assisted self-interview’ element of the interview)
The interview weights consist of three parts (w1, w2, and w3) that are combined and then calibrated to adjust their profile to match national population estimates. The w1 component is a household selection weight, controlling for random within-address selection by interviewers when a sampled address is found to include multiple delivery points and/or multiple households. The w2 component is a non-response weight, controlling for differences in the profile of eligible sampled addresses that did and did not participate in the NTS. The w3 component controls for differences in household size between households where not all members completed an interview and households where all members completed an interview.
The fully responding weights consist of four parts (w1, w2, w3, and w4) that are combined and then calibrated to adjust their profile to match national population estimates. The first three components of the weight are the same as the interview weights. The w4 component is an additional non-response weight, controlling for differences between households in which all members completed a travel diary and households in which they did not. The only differences between the interview and fully responding weights are the responding sample size and the w4 component. Both are calibrated to the same population estimates.
The fully responding weights always have a slightly smaller responding sample size than the interview weights, due to the additional requirement of travel diary completion. If NTS interviews and travel diaries are completed for all members of the household, they will receive both interview and fully responding weights. If interviews but not travel diaries are completed for all members of the household, they will receive interview weights only. If travel diaries but not interviews are completed for all household members, or neither are completed for all, the household will not receive either weight.
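The eligibility rules and the way the components combine can be sketched as follows. This is a minimal illustration rather than the NTS production code: the function name, field names, and component values are hypothetical, the components are assumed to combine by multiplication, and the final calibration to population estimates is omitted.

```python
from math import prod
from typing import Dict, Optional


def pre_calibration_weights(
    all_interviewed: bool,
    all_diaries_completed: bool,
    components: Dict[str, float],
) -> Dict[str, Optional[float]]:
    """Return pre-calibration interview and fully responding weights.

    `components` holds hypothetical values of w1, w2, w3 and w4 for one
    household. A weight is None when the household does not qualify for it.
    """
    interview = None
    fully_responding = None
    if all_interviewed:
        # Interview weight: w1, w2 and w3 combined (assumed to be a product).
        interview = prod(components[k] for k in ("w1", "w2", "w3"))
        if all_diaries_completed:
            # Fully responding weight: the same three parts plus w4.
            fully_responding = interview * components["w4"]
    return {"interview": interview, "fully_responding": fully_responding}


# Made-up component values for a single household.
example = {"w1": 1.0, "w2": 1.25, "w3": 1.1, "w4": 1.05}

print(pre_calibration_weights(True, True, example))    # both weights
print(pre_calibration_weights(True, False, example))   # interview weight only
print(pre_calibration_weights(False, True, example))   # neither weight
```

The nesting of the two conditions mirrors the rule that a household cannot receive a fully responding weight without also receiving an interview weight.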
Each of the six sets of weights, except the final CASI set, is produced for mid-year data (consisting of the latter six months of one calendar year and the first six months of the next) as well as for full-year data. For mid-year, these were first produced using NTS 2022 July to December data and NTS 2023 January to June data. In 2024, a digital diary parallel run was conducted in addition to NTS 2024 quarter 1, and all listed weights except the final CASI set were also produced using the main NTS 2024 quarter 1 data and digital diary parallel run data.
The NTS weighting methodology has been established for several decades and has proved robust, even during fieldwork disruption caused by the COVID-19 pandemic, so it was deemed that a review of the entire methodology was not necessary. Instead, in 2022 it was agreed that NatCen would conduct a targeted review of the weighting in order to optimise and update the approach while retaining quality and continuity.
The overarching aims of this Weighting Review are therefore twofold, namely:
- to update the methodology where necessary to ensure it continues to deliver the best results
- to streamline the methodology where possible
1.2 Structure of the Weighting Review
This Weighting Review has been split into four parts, namely:
- part one: a review of covariates in the model for household-level non-participation (w2)
- part two: a review of the need for a model of household-level full response (w4)
- part three: a review of covariates in the self-completion (CASI) weighting model
- part four: a review of the diary drop-off weighting method’s suitability for trip data collected via digital diary
The analysis for the four parts of the Weighting Review took place at different points in time, due to data availability. The first two parts of this Weighting Review were initially run using the combined NTS 2022 and 2023 mid-year data. However, while conducting the initial analysis for these two elements, it became clear that the mid-year data for this particular time-period was not directly comparable to a full year of NTS data collected using the traditional face-to-face (F2F) mode. This was because fieldwork in 2022 was impacted by the COVID-19 pandemic, so a non-random portion of sampled points were either not worked or were issued using a push-to-telephone (P2T) mode (that is, not all fieldwork in 2022 was carried out using either the F2F mode or the knock-to-nudge (K2N) mode that was also in force at certain times during the pandemic survey years).
Note: For the purpose of this Weighting Review, K2N is treated as F2F since both of these modes used the same recruitment approach with interviewers visiting sampled addresses to encourage participation after having sent an advance letter, whereas for P2T only an advance letter was sent with an invitation to opt in. Although both the K2N and P2T modes used telephone interviewing exclusively, K2N was deemed to be equivalent to F2F due to the similarity in the achieved samples.
The first two parts of this Weighting Review were therefore rerun using the full year data for NTS 2023, because NTS 2023 had returned fully to the traditional F2F mode and no longer used the P2T mode, which is also the case for subsequent survey years. As a result, the NTS 2023 data was quite different to the data collected in the survey years during the pandemic when P2T was used for some of the fieldwork (namely NTS 2022, 2021, and 2020). The NTS years with mixed-mode data collection required specific adjustments to the weighting to try to control for mode effects, which were not required for NTS 2023 or subsequent years. The Weighting Review’s final recommendations will apply to NTS for years to come, so they must be robust and based on data that does not include P2T cases, for which NTS 2023’s full year data is suitable.
The third part of this Weighting Review used NTS 2023 data (this time, from the outset), and the fourth part used NTS 2024 digital diary parallel run data, as this was the first opportunity to compare NTS travel data collected in paper diaries against data collected in digital diaries.
This report summarises the analysis undertaken to review the four different parts of the NTS weighting methodology. It concludes with recommendations for future NTS weighting.
Chapter 2: Reviewing the household-level participation model (w2)
2.1 Comparison of past models
The covariates within the household-level participation model were last reviewed during NTS 2013, using newly released Census 2011 data. The same procedure was followed, firstly using the combined NTS 2022 and 2023 mid-year data, and then rerun using the NTS 2023 data with the newly released Census 2021 variables (the results of the latter are presented in section 2.2).
The first step was to compare which of the covariates in the model used to generate the w2 weighting component had been significant (that is, with a p-value less than 0.05) over the past decade. The results of this comparison are presented below.
In contrast to previous reviews of weighting which showed stability over time, a larger degree of variation was found in the models from 2020 onwards due to the disruption of NTS fieldwork caused by the COVID-19 pandemic. This necessitated adjustments to the participation modelling stage of the interview weighting. Accordingly, two tables are presented below to show the variables included in the household participation models for survey years before and after the pandemic.
Table 1 covers the pre-pandemic years (2013 to 2019), when the variables tested in household participation models were:
- Government Office Region (GOR) in ten categories
- Census 2011 rural-urban classification in six categories
- Acorn socio-economic status codes in six categories
- the month that the address was issued
- distance of postcode from the nearest railway station in continuous and 6-category versions
This table shows which of these variables were included in each year’s final household participation model, and for those variables that were included whether they were significant (p-value less than 0.05) or not.
Consistently across all seven years analysed (2013 to 2019), GOR and Acorn categories were found to be significantly associated with household participation. The rural-urban classification was also significant in the first five years. Month of issue was only significant in one year (2014). Finally, measures of distance from the nearest railway station were not significant in any year, and in 2019 these rail distance variables were not available.
Table 1: Variables tested in household-level participation models in NTS 2013 to 2019
Variable | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 |
---|---|---|---|---|---|---|---|
GOR | Significant | Significant | Significant | Significant | Significant | Significant | Significant |
Rural-urban classification | Significant | Significant | Significant | Significant | Significant | Not significant | Not significant |
ACORN | Significant | Significant | Significant | Significant | Significant | Significant | Significant |
Month of issue | Not significant | Significant | Not significant | Not significant | Not significant | Not significant | Not significant |
Rail distance (continuous version) | Not significant | Not significant | Not significant | Not significant | Not significant | Not significant | Not available |
Rail distance (6-category version) | Not included | Not included | Not included | Not included | Not included | Not included | Not available |
Table 2 below covers the household participation models for NTS 2020 to 2023, and, in the same manner as Table 1, it shows which of the variables tested were included in each year’s final household participation model and whether they were significant (p-value less than 0.05).
Of the variables tested, the rail distance variables were only available for NTS 2020, and the remaining variables all required adjustments in NTS 2020, 2021, and 2022 to avoid small categories causing extreme weights. The 6-category rural-urban variable was also compressed into a dichotomous (2-category) version for NTS 2021 and 2022, to avoid the smaller categories producing extreme weights. Survey mode was also incorporated in some of the models for these years because of the mixed-mode fieldwork.
For the weighting of the NTS 2020, 2021, and 2022 data, the household participation models also had to be split to control for mode effects (that is, for the F2F mode compared to the P2T mode) and fluctuating response rates across the year resulting from the COVID-19 pandemic. Consequently, there was more variation in which variables were significant predictors of household participation from NTS 2020 to 2023 than there was for the pre-pandemic survey years. Table 2 therefore shows NTS 2020 and 2022 as separate columns split by F2F and P2T fieldwork, and shows NTS 2021 as a single column but with additional information for results split per quarter.
The most frequently significant variable during this period was Acorn. Unfortunately, the use of Acorn classifications for non-response weighting is no longer practical from NTS 2024 onwards, due to the much stricter licensing conditions and higher cost associated with purchasing Acorn codes for all addresses in NTS samples.
As P2T was used for data collection to some extent throughout this period, these models cannot be considered directly comparable to those used in fully F2F survey years. Nonetheless, the model covariates continued to be effective predictors even when models had to be split by survey mode or quarter of issue.
Table 2: Variables tested in household-level participation models in NTS 2020 to 2023
Variable | 2020 (F2F fieldwork) | 2020 (P2T fieldwork) | 2021, split by quarter | 2022 (F2F fieldwork) | 2022 (P2T fieldwork) | 2023 |
---|---|---|---|---|---|---|
GOR | Not significant | Significant | Significant for all 4 quarters | Significant | Not significant | Significant |
Rural-urban classification (6-category version) | Not significant | Not significant | Not applicable | Not applicable | Not applicable | Significant |
Rural-urban classification (2-category version) | Not applicable | Not applicable | Significant for 3 quarters | Significant | Not significant | Not applicable |
ACORN | Significant | Significant | Significant for all 4 quarters | Significant | Significant | Significant |
Month of issue | Significant | Significant | Not significant for any quarter | Significant | Not significant | Significant |
Rail distance (continuous version) | Not in model | Not in model | Not available | Not available | Not available | Not available |
Rail distance (6-category) | Not significant | Not significant | Not available | Not available | Not available | Not available |
Survey mode | Not in model | Not in model | Significant for all 4 quarters | Not in model | Not in model | Not in model |
Based on the findings above, six recommendations for the existing household participation model covariates are put forward.
Firstly, the region variable (GOR) should be retained, as this performs well.
Secondly, the 6-category Census 2011 rural-urban classification should be retained, as this performs well even when simplified into a dichotomous (2-category) rural-urban measure. This dichotomous version should only be used again if the 6-category version produces extreme weights (as was the case in NTS 2021 and 2022) and there is no significant residual bias in sub-categories. When the equivalent Census 2021 rural-urban classification is available, this should be used instead.
Thirdly, a socio-economic classification variable should be retained to replace the Acorn measure, as Acorn has been a consistently significant covariate in the household participation models.
Fourthly, month of issue should be retained, as even when it is not significant in models its inclusion is important to adjust for seasonality.
Fifthly, distance to the nearest railway station should be dropped, as this has never been a significant predictor when available.
Finally, survey mode should be dropped, as this is no longer needed now that P2T is not being used.
2.2 Testing potential additional covariates
Using NTS 2023 data, a range of other potential covariates from Census 2021 LSOA-level (Lower Layer Super Output Area) data were tested for association with household participation. These consisted of quintiles derived at national level for various Census measures, namely:
- the proportion of households with car access
- the proportion of non-car commuters
- the proportion of individuals with a disability
- the proportion of owner-occupiers
- the proportion of individuals in NS-SEC categories 1 and 2 (managerial, administrative, and professional occupations)
- the proportion of ethnic minority residents
- the proportion of adults qualified to degree level
- the proportion of adults in work
- the proportion of households with dependent children
- the proportion of households living in a house rather than a flat, converted building, or temporary structure
In addition, quintiles of population density (at Local Authority and postcode sector level), quintiles of English Index of Multiple Deprivation (EIMD) 2019, and the 2021 output area classification (OAC21) groups and supergroups were tested. Existing model covariates were also included, namely: region, 6-category rural-urban classification, Acorn categories, and month of issue.
Forward and backward stepwise logistic regression were used to fit models of household participation. As expected, the final model containing covariates that were significant (p-value less than 0.05) in either forward or backward stepwise models included the four main variables routinely included in NTS household participation models: GOR, rural-urban classification, Acorn, and month of issue. Also included were quintiles of households with car access, quintiles of ethnic minority residents, quintiles of adults educated to degree level, quintiles of adults in work, OAC21 supergroups, postcode sector-level population density quintiles, and Local Authority-level population density quintiles. The full model is shown in Table 3 below.
Table 3: Logistic regression model of household participation for NTS 2023, fitted using additional covariates
Variable | Category | Beta coefficient (B) | Standard error (S.E) | Degrees of freedom (df) | Significance (p) |
---|---|---|---|---|---|
Car access (quintiles) | Quintile 1 | Not applicable | Not applicable | 4 | 0.005 |
Car access (quintiles) | Quintile 2 | 0.056 | 0.091 | 1 | 0.538 |
Car access (quintiles) | Quintile 3 | 0.114 | 0.074 | 1 | 0.125 |
Car access (quintiles) | Quintile 4 | 0.127 | 0.063 | 1 | 0.045 |
Car access (quintiles) | Quintile 5 | 0.201 | 0.057 | 1 | less than 0.001 |
Ethnicity (quintiles) | Quintile 1 | Not applicable | Not applicable | 4 | 0.038 |
Ethnicity (quintiles) | Quintile 2 | 0.050 | 0.095 | 1 | 0.601 |
Ethnicity (quintiles) | Quintile 3 | 0.118 | 0.088 | 1 | 0.181 |
Ethnicity (quintiles) | Quintile 4 | 0.031 | 0.079 | 1 | 0.698 |
Ethnicity (quintiles) | Quintile 5 | 0.143 | 0.063 | 1 | 0.024 |
Degree-level education (quintiles) | Quintile 1 | Not applicable | Not applicable | 4 | 0.004 |
Degree-level education (quintiles) | Quintile 2 | -0.187 | 0.069 | 1 | 0.007 |
Degree-level education (quintiles) | Quintile 3 | -0.229 | 0.061 | 1 | less than 0.001 |
Degree-level education (quintiles) | Quintile 4 | -0.135 | 0.057 | 1 | 0.017 |
Degree-level education (quintiles) | Quintile 5 | -0.077 | 0.054 | 1 | 0.154 |
Adults in work (quintiles) | Quintile 1 | Not applicable | Not applicable | 4 | 0.008 |
Adults in work (quintiles) | Quintile 2 | 0.183 | 0.057 | 1 | 0.001 |
Adults in work (quintiles) | Quintile 3 | 0.137 | 0.053 | 1 | 0.009 |
Adults in work (quintiles) | Quintile 4 | 0.064 | 0.050 | 1 | 0.201 |
Adults in work (quintiles) | Quintile 5 | 0.028 | 0.048 | 1 | 0.559 |
OAC21 Supergroup | Retired professionals | Not applicable | Not applicable | 7 | 0.024 |
OAC21 Supergroup | Suburbanites and Peri-Urbanites | 0.065 | 0.119 | 1 | 0.586 |
OAC21 Supergroup | Multicultural and Educated Urbanites | 0.114 | 0.114 | 1 | 0.318 |
OAC21 Supergroup | Low-Skilled Migrant and Student Communities | -0.221 | 0.133 | 1 | 0.096 |
OAC21 Supergroup | Ethnically Diverse Suburban Professionals | 0.007 | 0.127 | 1 | 0.959 |
OAC21 Supergroup | Baseline UK | 0.079 | 0.125 | 1 | 0.524 |
OAC21 Supergroup | Semi-Skilled and Un-Skilled Workforce | -0.039 | 0.112 | 1 | 0.730 |
OAC21 Supergroup | Legacy Communities | 0.073 | 0.113 | 1 | 0.520 |
Postcode sector population density (quintiles) | Quintile 1 | Not applicable | Not applicable | 4 | less than 0.001 |
Postcode sector population density (quintiles) | Quintile 2 | 0.201 | 0.088 | 1 | 0.022 |
Postcode sector population density (quintiles) | Quintile 3 | 0.241 | 0.070 | 1 | 0.001 |
Postcode sector population density (quintiles) | Quintile 4 | 0.254 | 0.061 | 1 | less than 0.001 |
Postcode sector population density (quintiles) | Quintile 5 | 0.275 | 0.059 | 1 | less than 0.001 |
Local authority population density (quintiles) | Quintile 1 | Not applicable | Not applicable | 4 | 0.001 |
Local authority population density (quintiles) | Quintile 2 | 0.273 | 0.079 | 1 | 0.001 |
Local authority population density (quintiles) | Quintile 3 | 0.301 | 0.074 | 1 | less than 0.001 |
Local authority population density (quintiles) | Quintile 4 | 0.216 | 0.068 | 1 | 0.001 |
Local authority population density (quintiles) | Quintile 5 | 0.123 | 0.063 | 1 | 0.050 |
Government office region (GOR) | North East | Not applicable | Not applicable | 9 | less than 0.001 |
Government office region (GOR) | North West | -0.057 | 0.088 | 1 | 0.521 |
Government office region (GOR) | Yorkshire and Humberside | -0.940 | 0.071 | 1 | less than 0.001 |
Government office region (GOR) | East Midlands | -0.457 | 0.072 | 1 | less than 0.001 |
Government office region (GOR) | West Midlands | -0.839 | 0.076 | 1 | less than 0.001 |
Government office region (GOR) | Eastern | -0.347 | 0.071 | 1 | less than 0.001 |
Government office region (GOR) | Inner London | -0.244 | 0.067 | 1 | less than 0.001 |
Government office region (GOR) | Outer London | -0.084 | 0.112 | 1 | 0.452 |
Government office region (GOR) | South East | 0.003 | 0.092 | 1 | 0.974 |
Government office region (GOR) | South West | -0.081 | 0.063 | 1 | 0.201 |
Rural-urban classification | Urban: Major Conurbation | Not applicable | Not applicable | 5 | 0.016 |
Rural-urban classification | Urban: Minor Conurbation | -0.157 | 0.112 | 1 | 0.161 |
Rural-urban classification | Urban: City and Town | -0.409 | 0.139 | 1 | 0.003 |
Rural-urban classification | Rural: Town and Fringe | -0.232 | 0.100 | 1 | 0.020 |
Rural-urban classification | Rural: Village | -0.277 | 0.100 | 1 | 0.005 |
Rural-urban classification | Rural: Hamlets and Isolated Dwellings | -0.235 | 0.105 | 1 | 0.025 |
ACORN | Affluent Achievers | Not applicable | Not applicable | 4 | less than 0.001 |
ACORN | Rising Prosperity | 0.361 | 0.067 | 1 | less than 0.001 |
ACORN | Comfortable Communities | 0.195 | 0.068 | 1 | 0.004 |
ACORN | Financially Stretched | 0.244 | 0.057 | 1 | less than 0.001 |
ACORN | Urban Adversity | 0.129 | 0.053 | 1 | 0.015 |
Month of issue | January | Not applicable | Not applicable | 11 | less than 0.001 |
Month of issue | February | 0.469 | 0.075 | 1 | less than 0.001 |
Month of issue | March | 0.381 | 0.077 | 1 | less than 0.001 |
Month of issue | April | 0.314 | 0.075 | 1 | less than 0.001 |
Month of issue | May | 0.434 | 0.076 | 1 | less than 0.001 |
Month of issue | June | 0.350 | 0.076 | 1 | less than 0.001 |
Month of issue | July | 0.189 | 0.076 | 1 | 0.013 |
Month of issue | August | 0.242 | 0.069 | 1 | less than 0.001 |
Month of issue | September | 0.246 | 0.069 | 1 | less than 0.001 |
Month of issue | October | 0.090 | 0.071 | 1 | 0.205 |
Month of issue | November | 0.211 | 0.069 | 1 | 0.002 |
Month of issue | December | 0.081 | 0.070 | 1 | 0.250 |
Constant | Not applicable | -0.879 | 0.197 | 1 | less than 0.001 |
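As a reading aid for Table 3, the sketch below converts a few of the coefficients into odds ratios and predicted probabilities. It assumes a standard logit link with participation coded as the modelled outcome, and that the rows shown as "Not applicable" are the reference categories (contributing zero to the linear predictor). The variable and function names are illustrative only.

```python
from math import exp


def inverse_logit(x: float) -> float:
    """Convert a log-odds value into a probability."""
    return 1.0 / (1.0 + exp(-x))


# Coefficients copied from Table 3.
CONSTANT = -0.879
CAR_ACCESS_Q5 = 0.201
MONTH_FEBRUARY = 0.469

# Odds ratio for car access quintile 5 relative to quintile 1.
odds_ratio_car_q5 = exp(CAR_ACCESS_Q5)

# Predicted participation probability when every covariate is at its
# reference category, so only the constant contributes.
p_reference = inverse_logit(CONSTANT)

# The same address profile, but in car access quintile 5 and issued in February.
p_shifted = inverse_logit(CONSTANT + CAR_ACCESS_Q5 + MONTH_FEBRUARY)

print(f"Odds ratio, car access Q5 vs Q1: {odds_ratio_car_q5:.2f}")
print(f"P(participation), reference categories: {p_reference:.3f}")
print(f"P(participation), car access Q5 and February: {p_shifted:.3f}")
```

A positive coefficient therefore indicates higher odds of participation than in the reference category, and a negative coefficient indicates lower odds.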
An alternative w2 was generated from this model and compared with the original NTS w2 model (derived from the four variables of GOR, rural-urban classification, Acorn, and month of issue). Despite the additional seven variables in the alternative model, the alternative w2 was found to have very similar efficiency and residual bias compared to the original. The untrimmed original w2 had a DEFF of 1.08 and efficiency of 93%, while the untrimmed alternative w2 had a DEFF of 1.10 and efficiency of 91%. The original w2 had only a slightly higher residual bias for the potential covariates tested, with a mean of 0.22 percentage points and maximum of 2.0 percentage points. The alternative w2 had a mean residual bias of 0.12 percentage points and maximum of 1.2 percentage points. When the alternative weight was trimmed at the 1st and 99th percentiles, the efficiency was closer still to the original, with a DEFF of 1.09 and efficiency of 92%. Essentially, both the original and alternative versions were found to have similarly high efficiency and low residual bias.
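The relationship between DEFF, efficiency, and trimming can be illustrated with a short sketch. It assumes that DEFF here is Kish's approximate design effect due to unequal weighting and that efficiency is its reciprocal, which is consistent with the figures reported above; the report does not state the formula, so this is an assumption. The toy weights and helper names are hypothetical.

```python
from math import ceil, floor
from typing import List, Sequence


def kish_deff(weights: Sequence[float]) -> float:
    """Kish's approximate design effect due to unequal weighting."""
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)


def efficiency(weights: Sequence[float]) -> float:
    """Weighting efficiency, taken here as the reciprocal of DEFF."""
    return 1.0 / kish_deff(weights)


def percentile(values: Sequence[float], q: float) -> float:
    """q-th percentile (0 to 100) using linear interpolation."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower, upper = floor(position), ceil(position)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def trim(weights: Sequence[float], lower_q: float = 1.0,
         upper_q: float = 99.0) -> List[float]:
    """Cap weights at the lower and upper percentiles."""
    low = percentile(weights, lower_q)
    high = percentile(weights, upper_q)
    return [min(max(w, low), high) for w in weights]


# The reported DEFF and efficiency pairs are consistent with 1 / DEFF.
for deff in (1.08, 1.10, 1.09):
    print(f"DEFF {deff:.2f} -> efficiency {100 / deff:.0f}%")

# Toy weights only: trimming the extremes lowers DEFF slightly.
toy_weights = [float(w) for w in range(1, 102)]
trimmed = trim(toy_weights)
print(f"Untrimmed DEFF: {kish_deff(toy_weights):.3f}")
print(f"Trimmed DEFF:   {kish_deff(trimmed):.3f}")
```

Trimming pulls in the most extreme weights, which reduces their variability and so raises efficiency, at the possible cost of some additional residual bias.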
A detailed comparison of the residual bias between the original and alternative w2 is shown in Table 4 below. All figures are shown in percentage points. The maximum initial bias column represents the difference between eligible and unweighted responding cases. The main w2 maximum residual bias column represents the difference between eligible cases and the original w2. The alternative w2 maximum residual bias column represents the difference between eligible cases and the alternative w2.
Table 4: Residual bias figures for the original NTS 2023 w2 and alternative w2, in percentage points
Variable | Maximum initial bias | Main w2 maximum residual bias | Alternative w2 maximum residual bias |
---|---|---|---|
Car access (quintiles) | 4.37 | 1.27 | 0.52 |
Non-car commuting (quintiles) | 2.81 | 0.81 | 0.90 |
Disability rate (quintiles) | 1.02 | 0.75 | 0.65 |
Owner occupation (quintiles) | 4.59 | 1.71 | 1.20 |
Professional NS-SEC occupations (quintiles) | 3.64 | 0.36 | 0.53 |
Ethnicity (quintiles) | 3.72 | 1.26 | 0.38 |
Degree-level education (quintiles) | 2.62 | 0.50 | 0.27 |
Adults in work (quintiles) | 0.42 | 0.58 | 0.31 |
Households with dependent children (quintiles) | 1.39 | 0.16 | 0.32 |
Households living in a house (quintiles) | 2.43 | 1.06 | 0.77 |
OAC21 supergroup | 2.64 | 0.95 | 0.32 |
Postcode sector population density (quintiles) | 4.16 | 2.04 | 0.32 |
Local authority population density (quintiles) | 3.42 | 1.53 | 0.16 |
EIMD (quintiles) | 3.66 | 0.88 | 0.68 |
Government office region (GOR) | 4.27 | 0.29 | 0.41 |
Rural-urban indicator | 3.58 | 0.15 | 0.19 |
ACORN | 4.72 | 0.24 | 0.29 |
Month of issue | 1.39 | 0.28 | 0.39 |
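The bias measures in Table 4 can be sketched as follows, assuming that bias for a variable is the largest absolute difference, in percentage points, between the category profile of eligible cases and that of responding cases (unweighted for initial bias, weighted by w2 for residual bias). The data and function names below are made up for illustration, and the weights are simple inverse response rates rather than output from the NTS model.

```python
from collections import defaultdict
from typing import Dict, Optional, Sequence


def profile(categories: Sequence[str],
            weights: Optional[Sequence[float]] = None) -> Dict[str, float]:
    """Percentage of (weighted) cases falling in each category."""
    if weights is None:
        weights = [1.0] * len(categories)
    totals: Dict[str, float] = defaultdict(float)
    for category, weight in zip(categories, weights):
        totals[category] += weight
    grand_total = sum(totals.values())
    return {c: 100.0 * t / grand_total for c, t in totals.items()}


def max_bias(target: Dict[str, float], sample: Dict[str, float]) -> float:
    """Largest absolute gap between two profiles, in percentage points."""
    categories = set(target) | set(sample)
    return max(abs(sample.get(c, 0.0) - target.get(c, 0.0)) for c in categories)


# Toy data: 100 eligible addresses, 50 of which responded.
eligible = ["urban"] * 50 + ["rural"] * 50
responding = ["urban"] * 20 + ["rural"] * 30

# Hypothetical non-response weights: inverse of each group's response rate.
response_rate = {"urban": 20 / 50, "rural": 30 / 50}
w2 = [1.0 / response_rate[c] for c in responding]

initial_bias = max_bias(profile(eligible), profile(responding))
residual_bias = max_bias(profile(eligible), profile(responding, w2))

print(f"Initial bias:  {initial_bias:.2f} percentage points")
print(f"Residual bias: {residual_bias:.2f} percentage points")
```

In this toy case the weights remove the bias entirely because they are built from the same variable being checked. For variables not in the weighting model, some residual bias would normally remain, as Table 4 shows.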
Based on the findings of tests using NTS 2023 data, two recommendations regarding potential additional covariates for the household participation model used to generate w2 are put forward.
Firstly, the minor differences between the original and alternative test versions do not suggest that additional variables would significantly improve the performance of the w2 weighting component.
Secondly, given that Acorn classifications are no longer accessible for NTS 2024 onwards, OAC21 supergroups are the most suitable substitute socio-economic measure. The 8-category supergroup was significant (p-value less than 0.05) in the alternative model. The more detailed 21-category OAC21 group variable was not significant, so the supergroup measure is the appropriate choice.
2.3 Recommendations regarding w2
Based on the analysis above, it is recommended that GOR, rural-urban categories, and month of issue are retained in the NTS household participation model. OAC21 supergroup is recommended as a socio-economic variable to replace Acorn classifications in the model, while the addition of further variables is not considered justified by available evidence.
These recommendations were trialled in the combined NTS 2023 and 2024 mid-year weighting, because Acorn classifications were not available for NTS 2024 cases. OAC21 supergroup was found to be significant (p-value less than 0.05) in the mid-year household participation model. The mid-year w2 had 93% efficiency, very similar to NTS 2023, which suggests that this change has only had a minor impact.
Chapter 3: Reviewing the need for a model of household-level full response (w4)
3.1 Background and approach
As part of this Weighting Review exercise, consideration was given to ways in which the complex weighting process could be streamlined. Consequently, it was suggested that weighting could potentially be simplified if the model for fully responding households (which is used to produce weighting component w4) is dropped.
The w4 model, prior to this Weighting Review, has used the interview sample as a base for modelling household propensity to provide travel diaries for all household members (which is the NTS definition of a fully responding household) and included ten demographic and geographic variables. The difference between the pre-calibration interview and fully responding NTS weights has been that the latter included w4 and the former did not.
In order to test whether this additional w4 model was necessary, a similar approach to the review of w2 covariates was used. Firstly, the performance of the w4 model was compared over the years 2013 to 2023. Secondly, NTS 2019 and 2023 data was used to produce alternative fully responding weights without w4 for comparison against the original fully responding weights.
3.2 Comparison of past w4
Rates of full response and levels of initial bias in the fully responding samples as well as the performance of w4 were checked for the years 2013 to 2023. Initial bias has been defined as the difference between the fully responding sample profile and interview sample profile when weighted by the interview weights (that is, without the use of w4).
Table 5 below shows the percentage of households in the interview sample that were fully responding and summarises the scale of initial bias that w4 adjusted for in each year. The bias summary is represented, firstly, by the mean initial bias of the ten model variables and, secondly, by the maximum initial bias (that is, the bias of the variable with the highest bias across all ten model variables). These bias figures are shown in percentage points, and variables with an initial bias of 1 percentage point or more are then listed in the final column. Note that bias information was not available for NTS 2013.
Table 5: Rates of full response and initial bias in fully responding samples from NTS 2013 to 2023
Year | Rate of full response | Mean initial bias in w4 variables | Maximum initial bias in w4 variables | Variables with initial bias of 1 percentage point or more |
---|---|---|---|---|
2013 | 93.2% | Not available | Not available | Not available |
2014 | 92.8% | 0.2 | 0.7 | None |
2015 | 92.6% | 0.2 | 0.8 | None |
2016 | 90.8% | 0.2 | 0.6 | None |
2017 | 89.6% | 0.3 | 1.0 | Tenure |
2018 | 90.7% | 0.3 | 1.2 | Tenure and Ethnicity |
2019 | 90.8% | 0.2 | 0.5 | None |
2020 (F2F fieldwork) | 83.7% | 0.6 | 1.6 | Region and Ethnicity |
2020 (P2T fieldwork) | 97.2% | 0.1 | 0.5 | None |
2021 | 92.3% | 0.4 | 0.8 | None |
2022 | 84.9% | 0.4 | 1.2 | Ethnicity |
2023 | 85.0% | 0.3 | 1.2 | Tenure and Ethnicity |
Note: Maximum initial bias for 2020 with F2F fieldwork was 5.5 percentage points for the month variable, due to the sudden cessation of F2F fieldwork as a result of the March 2020 pandemic lockdown; however, this is not reported in Table 5. Instead, for 2020 with F2F fieldwork, the table shows the maximum bias for the other nine model variables only (1.6 percentage points). This figure was nonetheless still higher than other years, because the 2020 F2F responding sample included less than three months of response data.
The proportion of households in the NTS interview sample that fully complete travel diaries has remained consistently high throughout the past decade of the survey. The lowest rate was 83.7% for the F2F fieldwork in 2020, which was impacted by the sudden interruption of data collection. For the last two survey years of NTS, the fully responding rate has been 85%.
Partially as a consequence of this high and stable rate of full response, the differences between the interview and fully responding sample profiles have remained small. The mean level of initial bias (that is, the difference between the two sample profiles for the ten variables used to create w4) has been consistently very low. As shown in Table 5 above, all of the variables have had a level of maximum bias of approximately one percentage point or lower across the eleven survey years analysed. This suggests that weighting the fully responding sample to match the profile of the interview sample may not be necessary for bias reduction. It is also notable that the use of P2T for data collection did not significantly impact the rate of full response, suggesting that this rate is a stable feature of NTS.
Nonetheless, the introduction of the digital diary has the potential to impact on the rate of full response, hopefully in a positive direction. Analysis of the digital diary parallel run, which took place in quarter 1 of NTS 2024, included comparison of the rates of full response for households offered digital diaries and households offered only paper diaries. No significant difference in the rate of full response was found between the two groups: 84.4% of the main NTS 2024 quarter 1 interview sample were fully responding, while the figure for parallel run cases was 83.7%. However, further development and interviewer familiarisation with the digital diary could help to improve upon that rate.
The following two tables look in more detail at the ten variables included in the models used to create w4 and the design effect of w4. Table 6 covers the years 2013 to 2019 and shows which of the ten variables included in the w4 regression models were significant (p-value less than 0.05), as well as summarising the number and percentage of significant variables. The design effect of w4 is shown in the final row.
Table 6: Significance (p-value less than 0.05) of w4 model covariates and design effect from NTS 2013 to 2019
Variable | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 |
---|---|---|---|---|---|---|---|
Age of youngest adult | Significant | Significant | Significant | Significant | Significant | Significant | Significant |
GOR | Significant | Significant | Significant | Not significant | Significant | Significant | Significant |
Number of married partners | Not significant | Not significant | Significant | Not significant | Significant | Not significant | Not significant |
Month of issue | Not significant | Not significant | Significant | Significant | Not significant | Not significant | Not significant |
Tenure | Significant | Not significant | Significant | Not significant | Significant | Significant | Not significant |
Number of adults | Significant | Significant | Significant | Significant | Significant | Significant | Not significant |
Number of cohabiting partners | Not significant | Not significant | Not significant | Significant | Significant | Not significant | Not significant |
Ethnicity | Not significant | Significant | Significant | Not significant | Not significant | Significant | Not significant |
Regular use of a vehicle | Not significant | Not significant | Not significant | Not significant | Not significant | Not significant | Not significant |
Rural-urban classification (6 categories) | Not significant | Not significant | Not significant | Not significant | Not significant | Not significant | Not significant |
Number of significant variables | 4 | 4 | 7 | 4 | 6 | 5 | 2 |
Percentage of significant variables | 40% | 40% | 70% | 40% | 60% | 50% | 20% |
w4 design effect | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
Over the period 2013 to 2019 shown in Table 6 above, w4 has not had a design effect greater than 1. This indicates that w4 has had minimal impact on the fully responding weights. It is also notable that for five of these seven years analysed, only half of the variables in the model or fewer were significantly associated with full response. Furthermore, two variables were not significant in any of the seven years, notably rural-urban classification, which is also included in the w2 model of household participation.
Table 7 below shows the results for the survey years NTS 2020 to 2023. In the same manner as Table 6, it shows which of the ten variables included in the w4 regression models were significant (p-value less than 0.05), as well as summarising the number and percentage of significant variables. The design effect of w4 is shown in the final row.
Table 7: Significance (p-value less than 0.05) of w4 model covariates and design effect from NTS 2020 to 2023
Variable | 2020 (F2F fieldwork) | 2020 (P2T fieldwork) | 2021 | 2022 | 2023 |
---|---|---|---|---|---|
Age of youngest adult | Not significant | Not significant | Not significant | Not significant | Significant |
GOR | Significant | Not significant | Significant | Significant | Significant |
Number of married partners | Not significant | Not significant | Not significant | Not significant | Not significant |
Month of issue | Significant | Significant | Significant | Significant | Not significant |
Tenure | Not significant | Not significant | Not significant | Significant | Significant |
Number of adults | Not significant | Not significant | Not significant | Not significant | Not significant |
Number of cohabiting partners | Not significant | Not significant | Not significant | Not significant | Not significant |
Ethnicity | Significant | Not significant | Not significant | Significant | Significant |
Regular use of a vehicle | Not significant | Not significant | Significant | Not significant | Not significant |
Rural-urban classification (6 categories) | Not significant | Significant | Significant | Significant | Significant |
Number of significant variables | 3 | 2 | 4 | 5 | 5 |
Percentage of significant variables | 30% | 20% | 40% | 50% | 50% |
w4 design effect | 1.05 | 1.00 | 1.00 | 1.01 | 1.00 |
Unlike w2, which required certain modelling adjustments (as discussed in Chapter 2 above), the modelling to create w4 required little modification as a result of mixed-mode data collection during the years 2020 to 2022. The model used to create w4 was split by survey mode in 2020 due to the six-week pause in data collection, but no modifications were required from 2021 onwards. During the entire period 2020 to 2023, only half of the ten variables or fewer were significantly associated with full response. The design effect of w4 remained very close to 1, even for the small subsample of 2020 with F2F fieldwork.
Across the whole period analysed, from 2013 to 2023, a consistent pattern of minimal design effects and limited numbers of significant covariates in w4 models has been found. It should also be noted that for the weighting of the digital diary parallel run data (which covered only quarter 1 of 2024), w4 still had a design effect of 1.01 and mean bias of 0.5 percentage points.
3.3 Testing alternatives to w4
To test alternative fully responding weighting methods, NTS 2019 and 2023 data were used to run alternative versions of the fully responding weights using w1, w2, and w3 but not w4. These weights were then compared with the original fully responding weights. The first of the two alternative approaches involved rescaling the interview weight to the fully responding sample size prior to calibration and then calibrating the interview and fully responding weights separately to population estimates (to create test weight 1). The second alternative approach involved rescaling the interview weight to the fully responding sample size after calibration of the interview weights (to create test weight 2).
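The difference between the two test weights is the order of rescaling and calibration. The following is a minimal sketch of that ordering only, not the NTS production method. It assumes a single-variable post-stratification as a stand-in for calibration, and all data, group labels, population totals, and function names (`rescale`, `poststratify`) are hypothetical.

```python
from collections import defaultdict


def rescale(weights, target_total):
    """Scale weights proportionally so that they sum to target_total."""
    total = sum(weights)
    return [w * target_total / total for w in weights]


def poststratify(weights, groups, population_totals):
    """Toy stand-in for calibration: within each group, scale the
    weights so that they sum to that group's population total."""
    group_sums = defaultdict(float)
    for w, g in zip(weights, groups):
        group_sums[g] += w
    return [w * population_totals[g] / group_sums[g]
            for w, g in zip(weights, groups)]


# Hypothetical interview sample of six households.
pre_cal = [1.0, 1.2, 0.8, 1.5, 0.9, 1.1]      # combined w1 * w2 * w3
groups = ["A", "A", "A", "B", "B", "B"]       # illustrative calibration group
fully_responding = [True, True, False, True, False, True]
population = {"A": 300.0, "B": 200.0}         # illustrative population totals

fr_idx = [i for i, flag in enumerate(fully_responding) if flag]
n_fr = len(fr_idx)
fr_groups = [groups[i] for i in fr_idx]

# Test weight 1: rescale the pre-calibration weight to the fully responding
# sample size, then calibrate the fully responding sample on its own.
tw1 = poststratify(rescale([pre_cal[i] for i in fr_idx], n_fr),
                   fr_groups, population)

# Test weight 2: calibrate the interview sample first, then rescale the
# calibrated weights of fully responding cases to their sample size.
interview_cal = poststratify(pre_cal, groups, population)
tw2 = rescale([interview_cal[i] for i in fr_idx], n_fr)
```

In this sketch test weight 1 reproduces the population totals exactly within the fully responding sample, whereas test weight 2 inherits the calibration of the larger interview sample and so may drift from those totals. Any final scaling applied to the production weights is not described in this section and is not modelled here.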
These two test weights were compared with the original NTS 2019 and 2023 fully responding weights that incorporate w4. To check that dropping w4 would not affect the profile of individuals in fully responding households, the analyses compared efficiency, residual bias at the individual respondent level, and the effect on key NTS estimates. This provided evidence as to whether it would be appropriate to drop w4 from the fully responding weights, and if so, which alternative method would be preferable.
Using NTS 2019 data, test weights 1 and 2 performed very similarly to the original fully responding weight. The original weight had a DEFF of 1.11, while test weight 1’s DEFF was also 1.11, and test weight 2’s DEFF was 1.09.
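This report does not state how the DEFF figures are calculated. A common choice for the design effect due to unequal weighting is Kish's approximation, n Σw² / (Σw)², with weighting efficiency taken as its reciprocal. The sketch below assumes that definition; the function names and example weights are illustrative.

```python
def kish_deff(weights):
    """Kish's approximate design effect due to unequal weights:
    n * sum(w^2) / (sum(w))^2. Equals 1 when all weights are equal."""
    n = len(weights)
    total = sum(weights)
    sum_sq = sum(w * w for w in weights)
    return n * sum_sq / (total * total)


def efficiency(deff):
    """Weighting efficiency as the reciprocal of the design effect."""
    return 1.0 / deff


equal_weights = [1.0, 1.0, 1.0, 1.0]
unequal_weights = [0.5, 1.0, 1.5, 1.0]

print(kish_deff(equal_weights))      # 1.0
print(kish_deff(unequal_weights))    # 1.125
print(round(efficiency(1.22), 2))    # 0.82
print(round(efficiency(1.20), 2))    # 0.83
```

Under this assumed definition, a DEFF of 1.22 corresponds to an efficiency of about 82% and a DEFF of 1.20 to about 83%.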
Weighted profiles were checked for the w4 model variables and a range of NTS individual measures, namely: commuting mode, holding a driving licence, frequency of train use, frequency of bus use, frequency of cycling, frequency of walking for 20 minutes or more, and frequency of taxi use. The profiles for the original weight and test weight 1 were so similar that the mean difference was 0.0 percentage points, with a maximum difference of 0.4 percentage points. The difference between the original weight and test weight 2 was slightly larger but still low, with a mean difference of 0.1 percentage points and a maximum difference of 1.1 percentage points.
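A profile comparison of this kind can be sketched as follows: compute the weighted percentage in each category under two sets of weights, then summarise the absolute differences by their mean and maximum. This is an illustrative sketch for a single variable with hypothetical data and function names; the review pools differences across several variables, and its exact procedure is not given here.

```python
def weighted_profile(categories, weights):
    """Weighted percentage of cases falling in each category."""
    total = sum(weights)
    profile = {}
    for c, w in zip(categories, weights):
        profile[c] = profile.get(c, 0.0) + 100.0 * w / total
    return profile


def profile_differences(categories, weights_a, weights_b):
    """Mean and maximum absolute difference, in percentage points,
    between the weighted profiles under two sets of weights."""
    pa = weighted_profile(categories, weights_a)
    pb = weighted_profile(categories, weights_b)
    diffs = [abs(pa[c] - pb[c]) for c in pa]
    return sum(diffs) / len(diffs), max(diffs)


# Hypothetical respondents: one categorical measure and two weights.
commute_mode = ["car", "car", "bus", "walk", "car", "bus"]
original_weight = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
test_weight = [1.2, 1.0, 0.8, 1.0, 1.0, 1.0]

mean_diff, max_diff = profile_differences(
    commute_mode, original_weight, test_weight
)
print(round(mean_diff, 1), round(max_diff, 1))
```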
The same process was repeated with NTS 2023 data, with very similar results. Overall design effects for 2023 weights were higher than in 2019, due to lower overall response rates. The original NTS 2023 fully responding weights had a DEFF of 1.22, while test weight 1’s DEFF was 1.21, and test weight 2’s DEFF was 1.17. The original weight and test weight 1 had a mean difference of 0.1 percentage points and a maximum difference of 0.8 percentage points for the same w4 model variables and NTS measures used for the 2019 data. Again, similarly to NTS 2019, test weight 2 showed slightly greater differences from the original weight, with a mean of 0.1 percentage points and a maximum of 1.3 percentage points.
3.4 Recommendations regarding w4
The alternative weights produced with two full years of NTS data, both entirely collected using F2F fieldwork, have demonstrated that w4 has had a minimal impact on efficiency, residual bias in w4 variables, or weighted NTS estimates in the fully responding sample. Differences in the profiles of interview and fully responding households were found to be minor, and the model used to create w4 has had a very limited impact on the fully responding weights for individuals.
The recommendation from this analysis is therefore that w4 is dropped from the fully responding weighting method. As test weight 1 produced lower residual bias than test weight 2 for both NTS 2019 and 2023 data, the method for test weight 1 is the recommended replacement approach for producing NTS fully responding weights in future.
This recommended method was trialled successfully during the combined NTS 2023 and 2024 mid-year weighting. The dropping of the w4 model streamlined the weighting process and the results were found to be very similar to previous years. The final mid-year fully responding weights had a DEFF of 1.22 and 82% efficiency, compared with 1.20 and 83% for the interview weights. The rates of full response and profile of fully responding cases will continue to be checked, in case a future divergence in the interview and fully responding samples requires adjustment. Likewise, the impact of digital diary adoption will be monitored in case this has implications for the rate of household full response.
Chapter 4: Reviewing the covariates in the self-completion (CASI) weighting model
4.1 Background and approach
Since NTS 2017, a computer assisted self-interview (CASI) module has formed part of NTS data collection. One adult aged 16 or over is selected from each participating household to complete this. The selection is random, but depends upon presence during the household interview and age. Adults not present cannot be selected and are treated as non-respondents. The youngest eligible age group (ages 16 to 29) is oversampled by being selected with an 80% probability. The CASI weighting controls for these differential selection probabilities and for differences in the profile of adults present and not present during the household interview using a modelling approach. The CASI model uses the interview sample as a base for modelling individual household members’ propensity to be present at the time of the household interview and the model covariates include demographic, geographic and NTS survey variables. The final CASI weights adjust the profile of CASI respondents back to the profile of adults aged 16 or over in England.
The CASI module is a relatively recent addition to the NTS compared with the interview and diary elements, and there is no indication that the weighting method requires changes. It has remained robust during the years when the pandemic disrupted data collection. Consequently, the CASI model covariates were analysed for this Weighting Review by comparing the models since the introduction of the CASI module. The aim of this comparison was to determine whether the current set of covariates is appropriate. There is already flexibility in the method, as the final CASI models are fitted using forward and backward stepwise regression. This differs from the models for household participation (w2) and fully responding households (w4), which use the same covariates each time they are run. Additionally, as the CASI models are run weighted by the interview weights, any adjustment to the household participation model will also be incorporated into the CASI weights.
4.2 Comparison of past CASI models
Table 8 below summarises the variables included in the CASI models from NTS 2017 to 2023, and additionally which of the included variables were significant covariates (p-value less than 0.05) and which were not. Due to disrupted fieldwork and lower response in NTS 2020, 2021, and 2022, adjustments to the potential covariates were required. In NTS 2020 the model was split by survey mode. In all of the three survey years disrupted by the pandemic, small categories for some of the variables needed to be combined in order to prevent extreme weights (for example, age-sex was collapsed from 16 categories to 12 categories).
Stepwise regression was used to select covariates in the CASI models, and it has been usual practice for only significant variables (p-value less than 0.05) to be included in the final model. This approach was relaxed during NTS 2020, 2021, and 2022 in order to address higher levels of bias.
Table 8: Significance (p-value less than 0.05) of variables included in the CASI models from NTS 2017 to 2023
Variable | 2017 | 2018 | 2019 | 2020 (F2F fieldwork) | 2020 (P2T fieldwork) | 2021 | 2022 | 2023 |
---|---|---|---|---|---|---|---|---|
Age-sex (12 or 16 categories) | Significant | Significant | Significant | Significant | Significant | Significant | Significant | Significant |
GOR | Significant | Significant | Significant | Significant | Not significant | Significant | Not significant | Significant |
Household size (adults) | Significant | Significant | Significant | Significant | Not significant | Significant | Significant | Significant |
Household size (adults and children) | Significant | Significant | Significant | Not significant | Not significant | Not applicable | Not applicable | Significant |
Rural-urban classification (6 categories) | Significant | Not applicable | Significant | Not applicable | Not applicable | Not applicable | Not significant | Not applicable |
Tenure | Significant | Significant | Significant | Significant | Significant | Significant | Not applicable | Significant |
Income | Significant | Significant | Significant | Significant | Not significant | Significant | Significant | Significant |
Marital status (6 or 4 categories) | Significant | Significant | Significant | Significant | Significant | Significant | Significant | Significant |
Economic status (3 or 4 categories) | Significant | Not applicable | Significant | Significant | Not significant | Significant | Significant | Significant |
Disability | Not applicable | Significant | Not applicable | Not significant | Significant | Not applicable | Not applicable | Significant |
Car use | Significant | Significant | Not applicable | Not significant | Significant | Not significant | Not applicable | Not applicable |
Ethnicity | Not applicable | Not applicable | Not applicable | Not significant | Not significant | Significant | Not applicable | Significant |
Quarter of issue | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not significant | Significant | Not applicable |
Number of variables in model | 10 | 9 | 9 | 11 | 11 | 10 | 8 | 10 |
Number of significant variables | 10 | 9 | 9 | 7 | 5 | 8 | 6 | 10 |
Percentage of significant variables | 77% | 69% | 69% | 54% | 38% | 62% | 46% | 77% |
The comparison of the models across the seven survey years (including the two models for 2020, split by F2F and P2T fieldwork) for the CASI element of NTS, as shown in Table 8 above, demonstrates that a high proportion of the 13 potential covariates are included in the final models. The number of significant covariates per model varies from 5 to 10 (that is, 38% to 77%) of the total 13 variables tested. Moreover, there is variation between years as to which variables are significant and/or included in the final model. Although some are more frequently included than others, all of the 13 potential covariates have been used in at least two of the eight models. The variables age-sex, GOR, household size (adults), income, and marital status were included in all eight models in some form.
4.3 Recommendations regarding CASI non-presence model covariates
Comparison of the models over the seven years since the CASI module was added to NTS suggests that the current set of variables has been performing well to control for differences between adults who are present during the NTS household interview and those who are not. The only potential covariate of limited future relevance is quarter of issue, as this was included in 2021 and 2022 to control for quarterly variations in response rates. These variations were related to the use of the P2T mode. From NTS 2023 onwards, fieldwork has returned to F2F interviewing and therefore such variations should no longer be an issue. The other twelve variables used in the stepwise regressions to fit the CASI models remain appropriate.
Given that the vast majority of these twelve covariates were found to be significant and included in the CASI model across the seven survey years analysed, and that all twelve covariates have been significant at least once, the recommendation from this review is to fix the CASI model to include all twelve variables without going through the process of variable selection via a stepwise regression. Such an approach will streamline the CASI weighting and is consistent with the method used for the household participation model (w2). Fixing the CASI model should not have a noticeable impact on the CASI sample efficiency, and it is proposed that this should be confirmed during weighting of NTS 2024.
Chapter 5: Reviewing the diary drop-off weighting method for digital diary adoption
5.1 Background and approach
For decades the NTS has invited completion of travel diaries, covering a full week (that is, seven consecutive days) of journeys, for all members of a participating household. Households with complete diaries for all members form the fully responding sample and diary weights are produced for the analysis of the diaries. Since their introduction, travel diaries have been completed in paper format, and are intended to be completed by respondents themselves or, as was the case during P2T fieldwork, by respondents relaying their journey information to interviewers over the telephone so that the interviewers could fill in the diaries for them.
In the first quarter of NTS 2024, a parallel run to the main NTS was carried out wherein an additional sample of 2,178 addresses was issued, with participants offered the option of completing an online version of the diary, known as the ‘digital diary’. The paper diary remained as an alternative for respondents who were unwilling or unable to use the online version. The parallel run broadly demonstrated that the digital diary did not significantly impact the collection of travel diary data in comparison to using paper travel diaries. From NTS 2025, the digital diary will be offered to all households, with the paper diary as a backup. Full details about the results of the digital diary parallel run will be published alongside the 2025 data during 2026.
The diary weights adjust for higher reporting of trips (also referred to as journeys) on the first day (Day 1) of the travel diary. Towards the end of the week, the number of reported trips tends to drop off and the weighting adjusts for this across eight categories of journey purpose. This part of the Weighting Review compares the drop-off in the reporting of trips by journey purpose over the travel diary week and the resulting weights by diary mode (digital or paper diary) using the data from the parallel run.
5.2 Comparison of digital and paper diary weighting
It would only be necessary to consider modifying the weighting methodology for digital diary data if there was evidence that the drop-off pattern was different: that is, if fewer trips were reported by digital diary cases on the first day (Day 1) of the diary compared to the latter six days of the diary (Days 2 to 7). To identify whether such a difference was evident, the established method of diary drop-off weighting was therefore applied to the main NTS 2024 quarter 1 data and to the digital diary parallel run data. In the first instance, the overall mean number of trips reported per day, for any purpose, was found to be very similar for both modes: 1.85 trips per day for paper diaries and 1.88 for digital diaries.
To provide further insight, Table 9 shows the mean number of trips per day for each category of journey purpose. The table is divided into figures for each mode of diary completion (namely digital diary completion and paper diary completion), and is further divided into figures for Day 1 of the travel diary and the combined figures for Days 2 to 7. The digital diary figures only represent respondents that completed the diary online, and respondents who were offered the digital diary option but completed a paper diary instead are included in the paper diary columns.
Table 9: Mean number of trips per day reported by purpose for digital and paper travel diaries in quarter 1 of 2024, comparing Day 1 of the travel diaries against Days 2 to 7
Journey purpose | Mean number of trips per day, Day 1 of digital diaries | Mean number of trips per day, Days 2 to 7 of digital diaries | Mean number of trips per day, Day 1 of paper diaries | Mean number of trips per day, Days 2 to 7 of paper diaries |
---|---|---|---|---|
Commuting | 0.3126 | 0.3035 | 0.2429 | 0.2381 |
Business | 0.0517 | 0.0395 | 0.0561 | 0.0566 |
Education | 0.1113 | 0.0985 | 0.0986 | 0.0959 |
Escort education | 0.1179 | 0.0788 | 0.0912 | 0.0809 |
Shopping | 0.2954 | 0.2762 | 0.3823 | 0.3267 |
Other personal business and escort | 0.4066 | 0.3810 | 0.3620 | 0.3556 |
Social and entertainment | 0.4742 | 0.4757 | 0.4846 | 0.4473 |
Holiday and other | 0.2636 | 0.2064 | 0.2550 | 0.2279 |
When the averages are broken down by journey purpose, there is a similar pattern in both diary modes of fewer journeys being reported on average on Days 2 to 7 of the travel diary than on Day 1. The only exceptions to the drop-off pattern are social and entertainment journeys in the digital diary data and business journeys in the paper data. In both cases, the average for Days 2 to 7 of the diary is very close to Day 1, differing by less than 0.002. This suggests that the current method of weighting diary data to adjust for drop-off in reporting remains appropriate. Differences in mean number of trips per day reported by purpose between digital and paper diaries do not affect the drop-off adjustment used in the weighting. These differences are likely related to the demographic differences between groups who completed paper and digital diaries, which have been analysed separately for the parallel run reporting.
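The comparison above can be reproduced from the Table 9 figures. The sketch below computes, for each purpose, the ratio of the Day 1 mean to the Days 2 to 7 mean and lists purposes where the ratio falls below 1. This ratio is an illustrative summary only and is not stated to be the factor used in the NTS diary weights. The overall daily mean assumes that each of the seven diary days contributes equally, which is an assumption made here; function and variable names are hypothetical.

```python
# (Day 1 mean, Days 2 to 7 mean) trips per day by purpose, from Table 9.
DIGITAL = {
    "Commuting": (0.3126, 0.3035),
    "Business": (0.0517, 0.0395),
    "Education": (0.1113, 0.0985),
    "Escort education": (0.1179, 0.0788),
    "Shopping": (0.2954, 0.2762),
    "Other personal business and escort": (0.4066, 0.3810),
    "Social and entertainment": (0.4742, 0.4757),
    "Holiday and other": (0.2636, 0.2064),
}
PAPER = {
    "Commuting": (0.2429, 0.2381),
    "Business": (0.0561, 0.0566),
    "Education": (0.0986, 0.0959),
    "Escort education": (0.0912, 0.0809),
    "Shopping": (0.3823, 0.3267),
    "Other personal business and escort": (0.3620, 0.3556),
    "Social and entertainment": (0.4846, 0.4473),
    "Holiday and other": (0.2550, 0.2279),
}


def dropoff_ratios(table):
    """Ratio of Day 1 mean to Days 2 to 7 mean for each purpose."""
    return {p: day1 / later for p, (day1, later) in table.items()}


def exceptions(table):
    """Purposes with no drop-off (Day 1 mean below Days 2 to 7 mean)."""
    return [p for p, r in dropoff_ratios(table).items() if r < 1.0]


def overall_daily_mean(table):
    """All-purpose mean trips per day, assuming equal day weights."""
    day1 = sum(d for d, _ in table.values())
    later = sum(l for _, l in table.values())
    return (day1 + 6 * later) / 7


print(exceptions(DIGITAL))                    # ['Social and entertainment']
print(exceptions(PAPER))                      # ['Business']
print(round(overall_daily_mean(DIGITAL), 2))  # 1.88
print(round(overall_daily_mean(PAPER), 2))    # 1.85
```

Under the equal-day assumption, the purpose-level figures are consistent with the overall means of 1.88 (digital) and 1.85 (paper) trips per day quoted earlier in this section.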
5.3 Recommendations regarding diary drop-off weighting for digital diary data
Comparison of diary data collected during the digital diary parallel run does not indicate that the diary drop-off weighting method requires adjustment for differences in mode of diary completion. However, the parallel run was limited in scale and the digital diary remains under development.
The recommendations resulting from the digital diary parallel run include the following for weighting, which also form part of the recommendations of this Weighting Review:
“Monitor whether the current weighting approach remains fit for design once the digital-first approach is fully implemented at scale. More specifically: Assess whether any measurement-based mode effects should be considered for correction by the weights (for example, mode differences in short walk distributions). Monitor whether any selection-based mode effects (meaning ways in which digital (that is, compliant) and paper (that is, non-compliant) respondents differ from each other) should be considered for correction by the weights.”
The short walk weights were not specifically included in this Weighting Review. As the parallel run analysis identified short walks as potentially displaying differences by diary mode, the rates of short walk reporting will be compared between digital and paper diaries during the weighting of NTS 2025 data. If large differences are found, alterations to the short walks weight will be considered.
Although no changes to the drop-off weighting method were found to be necessary when examining the parallel run data, this will be rechecked when the digital diary has been more widely adopted. During the weighting of NTS 2025 diary data, the comparison reported in Table 9 will be repeated to check that drop-off patterns remain similar across both modes of diary completion.
Chapter 6: Summary of final recommendations
The four parts of this NTS Weighting Review have covered a targeted series of investigations to ensure that the NTS weighting method remains up-to-date, proportionate, and adaptable to the future of the survey. This report shows that the NTS weighting methodology is performing well overall, and recommends only limited changes to the weighting methods. Additionally, as NTS 2023 and 2024 have returned to entirely F2F data collection, the pandemic-related modifications to adjust for mixed-mode data collection and fluctuations in response rates in NTS 2020, 2021, and 2022 are no longer required.
The four main recommendations of this Weighting Review are summarised below.
Firstly, covariates in the model for household-level non-participation (w2) can remain largely unchanged. The measure of area-level socio-economic status Acorn is no longer available so will be replaced by Output Area Classification supergroups 2021.
Secondly, the model of household-level full response (w4) used to generate the fully responding weight has minimal impact and can be dropped. Instead, interview weights for the fully responding sample will be calibrated to population estimates.
Thirdly, covariates in the model used in the weights for the self-completion (CASI) module do not need to be changed, as the main twelve covariates remain appropriate, and in future can all be included rather than fitting a model using stepwise regression.
Finally, the diary drop-off weighting method appears suitable for digital as well as paper diary data, so no modifications are currently proposed. Monitoring for differences in journey reporting by diary mode will be used to identify the need for any alterations to the NTS 2025 diary or short walk weights.