Participation Survey 2022 to 2023 Annual Technical Report
Updated 30 November 2023
Applies to England
DCMS Participation Survey 2022/23
Annual Technical Note
April 2022 - March 2023
© Kantar Public 2023
1. Introduction
1.1 Background to the survey
In 2021, the Department for Digital, Culture, Media and Sport (DCMS) commissioned Kantar Public to design and deliver a new, nationally representative ‘push-to-web’ survey to assess adult participation in DCMS sectors across England. The new survey serves as a successor to the Taking Part Survey, which ran for 16 years as a continuous face-to-face survey [footnote 1].
The scope of the survey is to deliver a nationally representative sample of adults (aged 16 years and over) in England. The data collection model for the Participation Survey is based on Address-Based Online Surveying (ABOS), a type of ‘push-to-web’ survey method. Respondents take part either online or by completing a paper questionnaire. In 2022/23 the sample consists of approximately 33,000 interviews across four quarters of fieldwork (April-June 2022, July-September 2022, October-December 2022 and January-March 2023).
The fieldwork period for the annual 2022/23 survey was divided into four quarters:
- Quarter one: Fieldwork conducted between 1st April 2022 and 30th June 2022.
- Quarter two: Fieldwork conducted between 1st July 2022 and 30th September 2022.
- Quarter three: Fieldwork conducted between 1st October 2022 and 1st January 2023.
- Quarter four: Fieldwork conducted between 11th January 2023 and 31st March 2023.
1.2 Survey objectives
The key objectives of the 2022/23 Participation Survey were:
- To inform and monitor government policy and programmes in DCMS and other governmental departments on adult engagement with the DCMS sectors. The survey also gathers information on demographics (for example, age, gender, education).
- To assess the variation in engagement with cultural and digital activities across DCMS [footnote 2] sectors in England, and the differences across socio-demographic factors such as location, age, education, and income.
- To monitor the impact of previous and current restrictions due to the COVID-19 pandemic on cultural events/sites within its sectors, as well as feeding directly into the Spending Review Metrics, agreed centrally with the Treasury, to measure key departmental outcomes.
In preparation for the main survey launching in October 2021, Kantar Public undertook questionnaire development work and a pilot study to test various elements of the new design [footnote 3].
1.3 Survey design
The 2022/23 Participation Survey was conducted via an online and paper methodology using Address Based Online Surveying (ABOS), an affordable method of surveying the general population that still employs random sampling techniques. ABOS is also sometimes referred to as “push to web” methodology.
The basic ABOS design is simple: a stratified random sample of addresses is drawn from the Royal Mail’s postcode address file (PAF) and an invitation letter is sent to each one, containing username(s) and password(s) plus the URL of the survey website. Sampled individuals can log on using this information and complete the survey as they might any other web survey. Once the questionnaire is complete, the specific username and password cannot be used again, ensuring data confidentiality from others with access to this information.
It is usual for at least one reminder to be sent to each sampled address and it is also usual for an alternative mode (usually a paper questionnaire) to be offered to those who need it or would prefer it. It is typical for this alternative mode to be available only on request at first. However, after nonresponse to one or more web survey reminders, this alternative mode may be given more prominence.
Paper questionnaires ensure coverage of the offline population and are especially effective with sub-populations that respond to online surveys at lower-than-average levels. However, paper questionnaires have measurement limitations that constrain the design of the online questionnaire and also add considerably to overall cost. For the Participation Survey, paper questionnaires are used in a limited and targeted way, to optimise rather than maximise response.
1.4 Coronavirus (COVID-19)
It should be noted that some questions in the survey ask about engagement with cultural activities in the last 12 months. It is unclear what effect the COVID-19 pandemic, associated lockdown measures and associated media coverage may have had on public behaviours, attitudes and perceptions across the UK towards the topics in the survey.
The factors described above should be taken into consideration when interpreting the results.
1.5 Death of HM The Queen
Between 8th and 19th September 2022, the online survey was paused as a sign of respect during the period of national mourning following the announcement of the death of Her Majesty The Queen. At 8am on 20th September, the survey was resumed as usual.
2. Questionnaire
2.1 Questionnaire development
Although the Participation Survey serves as a successor to the Taking Part Survey, given the change in methodology and the extent of questionnaire changes it was important to implement a comprehensive development and testing phase. This was made up of four key stages:
- Questionnaire review
- Cognitive testing
- Usability testing
- Fieldwork pilot
Further details about the development work can be found in the Participation Survey 2021/22 Pilot Report [footnote 3].
2.2 2022/23 Participation Questionnaire
The online questionnaire was designed to take an average of 30 minutes to complete. A modular design was used with around half of the questionnaire made up of a core set of questions asked of the full sample. The remaining questions were split into three separate modules, randomly allocated to a subset of the sample.
The postal version of the questionnaire included the same set of core questions asked online, but the modular questions were omitted to avoid overly burdening respondents who complete the survey on paper, and to encourage response. Copies of the online and paper questionnaires are available online.
2.3 Questionnaire changes
The following question was removed in quarter two:
- SSMONROL – A single code question that asks participants what they think the main role of the Monarchy in the UK should be (see section 4.3 for details).
The following question was added in quarter two:
- SEXORIENTATION – A single code question that asks participants to describe their sexual orientation.
The options of the following questions were updated in quarter two:
- CMAJE12 – The question asks participants which major events they have already participated in. The option ‘Birmingham Commonwealth Games 2022’ was added.
- SSPART22 – The question asks participants in what ways they have already participated in the major event they have selected in question CMAJE12. The option ‘Birmingham Commonwealth Games 2022’ was added.
- CEVEENG – The question asks participants which major events they would be interested in participating in. The option ‘Birmingham Commonwealth Games 2022’ was removed.
The following question was removed in quarter four:
- CEVEENG – The question asks participants in what ways they would be interested in participating in specific major events.
3. Sampling
3.1 Sample design: addresses
The address sample design is intrinsically linked to the data collection design (see ‘Details of the data collection model’ below) and was designed to yield a respondent sample that is representative with respect to neighbourhood deprivation level, and age group within each of the 33 ITL2 regions in England [footnote 4]. This approach limits the role of weights in the production of unbiased survey estimates, narrowing confidence intervals compared with other designs.
The design also sought a minimum four-quarter respondent sample size of 900 for each ITL2 region. Although there were no specific targets per quarter, the sample selection process was designed to ensure that the respondent sample size per ITL2 region was approximately the same per quarter.
As a first step, a stratified master sample of 187,000 addresses in England was drawn from the Postcode Address File (PAF) ‘small user’ subframe. Before sampling, the PAF was disproportionately stratified by ITL2 region (33 strata) and, within region, proportionately stratified by neighbourhood deprivation level (5 strata). A total of 165 strata were constructed in this way. Furthermore, within each of the 165 strata, the PAF was sorted by (i) local authority, (ii) super output area, and finally (iii) by postcode. This ensured that the master sample of addresses was geographically representative within each stratum.
This master sample of addresses was then augmented by data supplier CACI. For each address in the master sample, CACI added the expected number of resident adults in each ten-year age band. Although this auxiliary data will have been imperfect, Kantar Public’s investigations have shown that it is highly effective at identifying households that are mostly young or mostly old. Once this data was attached, the master sample was additionally stratified by expected household age structure based on the CACI data: (i) all aged 35 or younger (16% of the total); (ii) all aged 65 or older (18% of the total); (iii) all other addresses (66% of the total).
The conditional sampling probability in each stratum was varied to compensate for (expected) residual variation in response rate that could not be ‘designed out’, given the constraints of budget and timescale. The underlying assumptions for this procedure were derived from empirical evidence obtained from the 2021/22 Participation Survey.
Kantar Public drew a stratified random sample of 83,706 addresses from the master sample of approximately 187,000 and systematically allocated them with equal probability to quarters 1, 2, 3 and 4 (that is, approximately 20,927 addresses per quarter). Kantar Public then systematically distributed the quarter-specific samples to two equal-sized ‘replicates’, each with the same profile. The first replicate was expected to be issued six weeks before the second replicate, to ensure that data collection was spread throughout the three-month period allocated to each quarter.
These replicates were further subdivided into five differently sized ‘batches’, the first comprising two thirds of the addresses allocated to the replicate, and the second, third, fourth and fifth batches comprising a twelfth each. This process of sample subdivision into differently sized batches was intended to help manage fieldwork. The expectation was that only the first three batches within each replicate would be issued (that is, approximately 8,720 addresses), with the fourth and fifth batches kept back in reserve.
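The allocation logic described above can be illustrated with a short sketch. This is a minimal illustration, assuming the master-sample addresses arrive as a single stratified, sorted list; the function name and address identifiers are hypothetical and do not represent Kantar Public's production tooling.

```python
from itertools import cycle

def allocate_sample(addresses, n_quarters=4, n_replicates=2,
                    batch_shares=(8/12, 1/12, 1/12, 1/12, 1/12)):
    """Illustrative allocation of a stratified, sorted address sample to
    quarters, replicates and batches, using the proportions described above."""
    # Systematic allocation: cycling through the quarters preserves the
    # stratified sort order, so each quarter receives the same profile.
    quarters = [[] for _ in range(n_quarters)]
    for address, q in zip(addresses, cycle(range(n_quarters))):
        quarters[q].append(address)

    plan = {}
    for q, q_sample in enumerate(quarters, start=1):
        # The same principle gives two equal-sized replicates per quarter.
        replicates = [q_sample[r::n_replicates] for r in range(n_replicates)]
        for r, rep in enumerate(replicates, start=1):
            batches, start = [], 0
            for share in batch_shares:  # two thirds, then four twelfths
                end = start + round(len(rep) * share)
                batches.append(rep[start:end])
                start = end
            plan[(q, r)] = batches
    return plan

# Dummy identifiers standing in for PAF records.
plan = allocate_sample([f"ADDR{i:06d}" for i in range(83706)])
# First three batches of a replicate total roughly 8,720 addresses, as in the text.
print(sum(len(b) for b in plan[(1, 1)][:3]))
```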
For quarters one, two, three and four, only the first three batches of each replicate were issued (that is, as planned). In total, 69,755 addresses were issued across all quarters. For quarter one, 17,439 addresses were issued; 17,440 for quarter two; 17,438 for quarter three; and 17,438 for quarter four.
Table 1 shows the combined quarters one, two, three and four (issued) sample structure with respect to the major strata.
Table 1: Address issue by expected household age structure and area deprivation quintile group.
Expected household age structure | Most deprived | 2nd | 3rd | 4th | Least deprived |
---|---|---|---|---|---|
All <=35 | 2,561 | 3,047 | 2,401 | 1,865 | 1,233 |
Other | 9,601 | 10,344 | 9,470 | 9,065 | 7,702 |
All >=65 | 1,997 | 2,483 | 2,747 | 2,673 | 2,576 |
3.2 Sample design: individuals within sampled addresses
All resident adults aged 16+ were invited to complete the survey. In this way, the Participation Survey avoided the complexity and risk of selection error associated with remote random sampling within households.
However, for practical reasons, the number of logins provided in the invitation letter was limited. The number of logins was varied between two and four, with this total adjusted in reminder letters to reflect household data provided by prior respondent(s). Addresses that CACI data predicted contained only one adult were allocated two logins; addresses predicted to contain two adults were allocated three logins; and other addresses were allocated four logins. The mean number of logins per address was 2.8. Paper questionnaires were available to those who are offline, not confident online, or unwilling to complete the survey this way.
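As a simple restatement of the login rule above, the allocation can be written as a small lookup. This is an illustrative sketch only; the function name is hypothetical.

```python
def logins_for_address(predicted_adults):
    """Number of web logins printed on the invitation letter, based on the
    CACI-predicted number of resident adults (rule described above)."""
    if predicted_adults <= 1:
        return 2
    if predicted_adults == 2:
        return 3
    return 4

# Quick check against the rule stated in the text.
assert logins_for_address(1) == 2
assert logins_for_address(2) == 3
assert logins_for_address(5) == 4
```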
3.3 Details of the data collection model
Table 2 summarises the data collection design within each stratum, showing the number of mailings and type of each mailing: push-to-web (W) or mailing with paper questionnaires (P). For example, ‘WWP’ means two push-to-web mailings and a third mailing with paper questionnaires included alongside the web survey login information. In general, there was a two-week gap between mailings.
Table 2: Data collection design by stratum.
Expected household age structure | Most deprived | 2nd | 3rd | 4th | Least deprived |
---|---|---|---|---|---|
All <=35 | WWPW | WWWW | WWWW | WWW | WWW |
Other | WWPW | WWW | WWW | WWW | WWW |
All >=65 | WWPW | WWPW | WWP | WWP | WWP |
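For illustration, the mailing plan in Table 2 can be represented as a simple lookup keyed by stratum. The key names below are hypothetical and chosen only for readability; quintile 1 denotes the most deprived group.

```python
# Each character is one mailing: 'W' = push-to-web letter,
# 'P' = letter with paper questionnaires enclosed (as in Table 2).
MAILING_PLAN = {
    ("all_35_or_under", 1): "WWPW", ("all_35_or_under", 2): "WWWW",
    ("all_35_or_under", 3): "WWWW", ("all_35_or_under", 4): "WWW",
    ("all_35_or_under", 5): "WWW",
    ("other", 1): "WWPW", ("other", 2): "WWW", ("other", 3): "WWW",
    ("other", 4): "WWW", ("other", 5): "WWW",
    ("all_65_or_over", 1): "WWPW", ("all_65_or_over", 2): "WWPW",
    ("all_65_or_over", 3): "WWP", ("all_65_or_over", 4): "WWP",
    ("all_65_or_over", 5): "WWP",
}

def mailing_includes_paper(stratum):
    """True if the stratum's plan includes a mailing with paper questionnaires."""
    return "P" in MAILING_PLAN[stratum]

print(mailing_includes_paper(("other", 1)))  # True
print(mailing_includes_paper(("other", 5)))  # False
```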
4. Fieldwork
Fieldwork for the Participation Survey 2022/23 was conducted between April 2022 and March 2023, with samples issued on a quarterly basis. Each quarter’s sample was split into two batches, the first of which began at the start of the quarter, and the second began midway through the quarter. The specific fieldwork dates for each quarter are shown below in Table 3.
Table 3: Fieldwork dates.
Quarter | Batch | Fieldwork start | Fieldwork end |
---|---|---|---|
Quarter one | 1 | 1st April 2022 | 29th May 2022 |
Quarter one | 2 | 12th May 2022 | 30th June 2022 |
Quarter two | 1 | 1st July 2022 | 4th September 2022 |
Quarter two | 2 | 6th August 2022 | 30th September 2022 |
Quarter three | 1 | 1st October 2022 | 27th November 2022 |
Quarter three | 2 | 5th November 2022 | 1st January 2023 |
Quarter four | 1 | 11th January 2023 | 12th March 2023 |
Quarter four | 2 | 16th February 2023 | 31st March 2023 |
The paper questionnaire was made available to around 35% of respondents at the second reminder stage based on the response probability strata as described in section 3.3. The paper questionnaire was also available on request to all respondents who preferred to complete the survey on paper or who were unable to complete online.
4.1 Contact procedures
All sampled addresses were sent an invitation letter in a white envelope with an On Her Majesty’s Service logo; this was replaced with the On His Majesty’s Service logo from 31 October 2022. The letter contained the following information:
- A brief description of the survey
- The URL of the survey website (used to access the online script)
- A QR code that can be scanned to access the online survey
- Log-in details for the required number of household members
- An explanation that participants will receive a £10 shopping voucher
- Information about how to contact Kantar Public in case of any queries
The reverse of the letter featured responses to a series of Frequently Asked Questions.
All non-responding households were sent up to two reminder letters, at the end of the second and fourth weeks of fieldwork for each batch. A targeted third reminder letter was sent to households for which, based on Kantar Public’s ABOS field data from previous studies, this was deemed likely to have the most significant impact (mainly deprived areas and addresses with a younger household structure). The information contained in the reminder letters was similar to the invitation letters, with slightly modified messaging to reflect each reminder stage.
As well as the online survey, respondents were given the option to complete a paper questionnaire, which consisted of an abridged version of the online survey. Each letter informed respondents that they could request a paper questionnaire by contacting Kantar Public using the email address or freephone telephone number provided.
In addition, some addresses received up to two paper questionnaires with the second reminder letter. This targeted approach was, again, based on historical data Kantar Public has collected through other studies, which suggests that providing paper questionnaires to all addresses can displace online responses in some areas. Paper questionnaires were proactively provided to (i) sampled addresses in the most deprived quintile group, and (ii) sampled addresses where it was expected that every resident would be aged 65 or older (based on CACI data).
4.2 Confidentiality
Each of the letters assured the respondent of confidentiality, by answering the question “Is this survey confidential?” with the following:
Yes, the information that is collected will only be used for research and statistical purposes. Your contact details will be kept separate from your answers and will not be passed on to any organisation outside of Kantar Public or supplier organisations who assist in running the survey.
Data from the survey will be shared with DCMS for the purpose of producing and publishing statistics. The data shared with DCMS won’t contain your name or contact details, and no individual or household will be identifiable from the results. For more information about how we keep your data safe, you can visit https://www.participationsurvey.co.uk/privacypolicy.html
4.3 Pause of survey during national mourning
In quarter two, between 8th and 19th September, the online survey was paused. The question SSMONROL, which asks participants what they think the main role of the Monarchy in the UK should be, was removed on 15 September while it was being reviewed, and was reinstated on 25 October 2023. In the interim, envelopes with the ‘On Her Majesty’s Service’ franking were replaced with plain envelopes. From 31 October, envelopes with the ‘On His Majesty’s Service’ franking were used.
4.4 Royal Mail postal strikes
During the quarter three fieldwork period, there were 20 days of postal strikes, which took place on 1, 13, 20, 25 October; 2, 3, 4, 8, 9, 10, 24, 25, 30 November; and 1, 9, 11, 14, 15, 23, 24 December. The postal strikes had an impact on the delivery of invitation letters, reminder letters and ad hoc paper questionnaires, as well as on paper questionnaire returns. Paper questionnaire returns were therefore accepted until 11th January to give respondents sufficient time to complete and post their questionnaires, taking into account the postal delays and backlog. However, fieldwork for the web survey (CAWI mode) was closed on 1st January.
4.5 QR code experiment
Given the dramatic rise in QR code usage during the COVID-19 pandemic, the survey invitation letter and reminder mailings include a QR code that respondents can scan to access the survey website. However, the impact of including a QR code had not been tested on the Participation Survey. With that in mind, a QR code experiment was run in quarter three to explore the impact of inclusion on response rates, the device used to complete the survey and the sample profile. The full report of the experiment is included in the appendix (section 7.1).
4.6 Fieldwork performance
When discussing fieldwork figures in this section, response rates are referred to in two different ways:
- Household response rate – This is the percentage of households contacted as part of the survey in which at least one questionnaire was completed.
- Individual response rate – This is the estimated response rate amongst all adults that were eligible to complete the survey.
Overall, the target number of interviews was 33,000 post validation checks, equating to 8,250 per quarter.
In total 69,755 addresses were sampled, from which 33,524 interviews were achieved after validation checks. The majority (29,006) of participants took part online, while 4,518 completed a paper questionnaire.
At least one interview was completed in 22,308 households, which represented a household response rate of 32%.
In a survey of this nature, no information is known about the reason for non-response in each individual household. However, it can be assumed that 8% of the addresses in the sample were not residential and were therefore ineligible to complete the survey. Once deadwood addresses are accounted for, the final household response rate was 35% [footnote 5].
The expected number of eligible individuals per residential address was 1.89 on average, so the total number of eligible adults sampled was approximately 121,290. The survey was completed by 33,524 people, indicating an individual response rate of 28%.
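These headline figures can be reproduced from the standard ABOS formulae described in footnote 5, as the short sketch below shows. The 8% deadwood assumption and the 1.89 adults-per-household figure are taken from the text; the variable names are illustrative.

```python
# Reproducing the published 2022/23 response rates from the figures above.
issued_addresses = 69_755
responding_households = 22_308
responses = 33_524

deadwood_rate = 0.08          # assumed share of non-residential addresses
adults_per_household = 1.89   # LFS-based average number of adults aged 16+

eligible_addresses = issued_addresses * (1 - deadwood_rate)
household_rr = responding_households / eligible_addresses
individual_rr = responses / (eligible_addresses * adults_per_household)

print(f"Household response rate: {household_rr:.0%}")    # approximately 35%
print(f"Individual response rate: {individual_rr:.0%}")  # approximately 28%
```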
The full breakdown of the fieldwork figures and response rates by quarter is available in Table 4.
Table 4: Combined online and paper fieldwork figures by quarter.
Quarter | No. of sampled addresses | Interviews achieved – online and paper | No. households completed | Household response rate | Individual response rate |
---|---|---|---|---|---|
Quarter one | 17,439 | 8,500 | 5,741 | 36% | 28% |
Quarter two | 17,440 | 8,369 | 5,630 | 35% | 28% |
Quarter three | 17,438 | 7,997 | 5,352 | 33% | 26% |
Quarter four | 17,438 | 8,658 | 5,585 | 35% | 29% |
Total | 69,755 | 33,524 | 22,308 | 35% | 28% |
4.7 Incentive system
All respondents that completed the Participation Survey were given a £10 shopping voucher as a thank you for taking part.
Online incentives
Participants completing the survey online were provided with details of how to claim their voucher at the end of the survey and were directed to the voucher website, where they could select from a range of different vouchers, including electronic vouchers sent via email and gift cards sent in the post.
Paper incentives
Respondents who returned the paper questionnaire were also provided with a £10 shopping voucher. This voucher was sent in the post and could be used at a variety of high street stores.
4.8 Survey length
The median completion length of the online survey, with outliers excluded, was 25 minutes, and the mean was 27 minutes [footnote 6]. This is based on full surveys and does not include partial completions.
5. Data processing
5.1 Data management
Due to the different structures of the online and paper questionnaires, data management was handled separately for each mode. Online questionnaire data was collected via the web script and, as such, was much more easily accessible. By contrast, paper questionnaires were scanned and converted into an accessible format.
For the final outputs, both sets of interview data were converted into IBM SPSS Statistics, with the online questionnaire structure as a base. The paper questionnaire data was converted to the same structure as the online data so that data from both sources could be combined into a single SPSS file.
5.2 Partial completes
Online respondents can exit the survey at any time and, while they can return to complete the survey at a later date, some chose not to do so.
Equally, respondents completing the paper questionnaire occasionally leave part of the questionnaire blank, for example if they do not wish to answer a particular question or section of the questionnaire.
Partial data can still be useful, providing respondents have answered the substantive questions in the survey. These cases are referred to as usable partial interviews.
Survey responses were checked at several stages to ensure that only usable partial interviews were included. Upon receipt of returned paper questionnaires, the booking-in team removed obviously blank paper questionnaires. Following this, during data processing, rules were set for the paper and online surveys to ensure that respondents had provided sufficient data. For the online survey, respondents had to reach a certain point in the questionnaire for their data to count as valid (just before the wellbeing questions). Paper data was judged complete if the respondent answered at least 50% of the questions and reached at least as far as Q59 in the questionnaire in quarters one, two and three, and Q58 in quarter four.
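As an illustration of the paper-return rule above, the check could be expressed as follows. The argument names are hypothetical; the actual rules were implemented within Kantar Public's data processing systems.

```python
def is_usable_paper_return(answers, quarter, last_question_reached):
    """Illustrative check of the paper-return rule described above: at least
    50% of questions answered and reached Q59 (Q58 in quarter four).
    `answers` maps question identifiers to responses (None if left blank)."""
    if not answers:
        return False
    threshold_question = 58 if quarter == 4 else 59
    answered_share = sum(v is not None for v in answers.values()) / len(answers)
    return answered_share >= 0.5 and last_question_reached >= threshold_question
```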
5.3 Validation
Initial checks were carried out to ensure that paper questionnaire data had been correctly scanned and converted to the online questionnaire data structure. For questions common to both questionnaires, the SPSS output was compared to check for any notable differences in distribution and data setup.
Once any structural issues had been corrected, further quality checks were carried out to identify and remove any invalid interviews. The specific checks were as follows:
- Selecting complete interviews: Any test serials in the dataset (used by researchers prior to survey launch) were removed. Cases were also removed if the respondent did not answer the fraud declaration statement (online: QFraud; paper: Q88).
- Duplicate serials check: If any individual serial had been returned in the data multiple times, responses were examined to determine whether this was due to the same person completing multiple times or due to a processing error. If they were found to be valid interviews, a new unique serial number was created, and the data was included in the data file. If the interview was deemed to be a ‘true’ duplicate, the more complete or earlier interview was retained.
- Duplicate emails check: If multiple interviews used the same contact email address, responses were examined to determine if they were the same person or multiple people using the same email. If the interviews were found to be from the same person, only the most recent interview was retained. In these cases, online completes were prioritised over paper completes due to the higher data quality.
- Interview quality checks: A set of checks on the data were undertaken to check that the questionnaire was completed in good faith and to a reasonable quality. Several parameters were used:
a. Interview length (online check only).
b. Number of people in household reported in interview(s) vs number of total interviews from household.
c. Whether key questions have valid answers.
d. Whether respondents have habitually selected the same response to all items in a grid question (commonly known as ‘flatlining’).
e. How many multi-response questions were answered with only one option ticked.
This approach led to 5% of cases being removed, a rate that is low enough for us to be largely confident of the data’s veracity.
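As an example of how one of these parameters might be operationalised, the sketch below flags ‘flatlining’ (check d). This is a simplified, illustrative version rather than the actual check used, and the threshold is an assumption.

```python
def is_flatlining(grid_responses, min_items=5):
    """Flag a respondent who gave the same answer to every item of a grid
    question ('flatlining'). In practice several parameters were combined
    before any case was removed; this sketch shows the principle only."""
    items = [r for r in grid_responses if r is not None]
    return len(items) >= min_items and len(set(items)) == 1

print(is_flatlining([3, 3, 3, 3, 3, 3]))  # True: identical answers throughout
print(is_flatlining([3, 2, 3, 4, 3, 3]))  # False: varied answers
```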
5.4 Standard paper questionnaire edits
Upon completion of the general quality checks described above, more detailed data checks were carried out to ensure that the right questions had been answered according to the questionnaire routing. This is automatically correct for online completes, as routing is programmed into the scripting software, but for paper completes data edits were required.
There were two main types of data edit, both affecting the paper questionnaire data:
- Single-response questions edits: If a paper questionnaire respondent had mistakenly answered a question that they weren’t supposed to, their response in the data was changed to “-3: Not Applicable”. If a paper questionnaire respondent had neglected to answer a question that they should have, they were assigned a response in the data of “-4: Not answered but should have (paper)”.
- Multiple response question edits: If a paper questionnaire respondent had mistakenly answered a question that they weren’t supposed to, their response was set to “-3: Not Applicable”. If a paper questionnaire respondent had neglected to answer a question that they should have, they were assigned a response in the data of “-4: Not answered but should have (paper)”. Where the respondent had selected both valid answers and an exclusive code such as “None of these”, any valid codes were retained, and the exclusive code response was set to “0”.
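For illustration, the single-response routing edits can be expressed as a small function. The numeric codes follow the conventions described above, but the function itself is a hypothetical sketch rather than the actual editing specification.

```python
NOT_APPLICABLE = -3        # answered a question they were routed past
MISSING_BUT_ROUTED = -4    # skipped a question they should have answered

def apply_routing_edit(value, was_routed_to):
    """Illustrative single-response routing edit for paper returns.
    `value` is the scanned response, or None if the item was left blank."""
    if not was_routed_to and value is not None:
        return NOT_APPLICABLE       # answered despite routing past the question
    if was_routed_to and value is None:
        return MISSING_BUT_ROUTED   # should have answered but left it blank
    return value
```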
5.5 Questionnaire-specific paper questionnaire edits
Other, more specific data edits were also made, as described below:
- Additional edits to library question: The question CLIBRARY1 was formatted differently in the online script and paper questionnaire. In the online script it was set up as one multiple-response question, while in the paper questionnaire it consisted of two separate questions (Q15 and Q21). During data checking, it was found that many paper questionnaire respondents followed the instructions to move on from Q15 and Q21 without ticking the “No” response. To account for this, the following data edits were made:
a. If CFRELIB12 was not answered and CNLIWHYA was answered, CLIBRARY1_001 was set to 0.
b. If CFRELIDIG was not answered and CNLIWHYAD was answered, CLIBRARY1_002 was set to 0.
c. CLIBRARY1_003 was set to 0 for all paper questionnaire respondents.
- Additional edits to grid questions: Due to the way the paper questionnaire was set up, additional edits were needed for the following linked grid questions: CARTS1/CARTS1A/CARTS1B, CARTS2/CARTS2A/CARTS2B, CARTS3/CARTS3A/CARTS3B, CARTS4/CARTS4A/CARTS4B, CHERVIS12/CFREHER12/CVOLHER, CDIGHER12/CFREHERDIG/CREPAY5.
Figure 1 shows an example for the CARTS1 section in the paper questionnaire.
Figure 1: Example - CARTS1 section in the paper questionnaire.
Marking the option “Not in the last 12 months” on the paper questionnaire was equivalent to the code “0: Have not done this” at CARTS1 in the online script. As such, leaving this option blank in the questionnaire would result in CARTS1 being given a default value of “1” in the final dataset. In cases where a paper questionnaire respondent had neglected to select any of the options in a given row, CARTS1 was recoded from “1” to “0”.
5.6 Coding
Post-interview coding was undertaken by members of the Kantar Public coding department. The coding department coded verbatim responses recorded for ‘other specify’ questions.
For example, if a respondent selected “Other” at CARTS1 and wrote text indicating that they went to some type of live music event, they would be back-coded in the data as having attended “a live music event” at CARTS1_006.
For the sets CARTS1/CARTS1A/CARTS1B, CARTS2/CARTS2A/CARTS2B and CHERVIS12/CFREHER12/CVOLHER, data edits were made to move responses coded to “Other” to the correct response code, if the answer could be back-coded to an existing response code.
5.7 Data outputs
Once the checks were complete, a final SPSS data file was created that only contained valid interviews and cleaned, edited data. Five data sets were made available:
- Quarter one data
- Quarter two data
- Quarter three data
- Quarter four data
- A combined annual dataset
A set of Excel data tables containing headline measures was produced alongside each data set.
The data tables also display confidence intervals. Confidence intervals should be considered when analysing the Participation Survey data set, especially when conducting sub-group analysis. A figure with a wide confidence interval may not be as robust as one with a narrow confidence interval. Confidence intervals vary for each measure and each demographic breakdown and will vary from year to year. Confidence intervals should be calculated using the complex survey package in SPSS or some other statistical package which takes account of design effects.
5.8 Standard errors
The standard error is useful as a means to calculate confidence intervals.
Survey results are subject to various sources of error, which can be divided into two types: systematic error and random error.
Systematic error
Systematic error or bias covers those sources of error that will not average to zero over repeats of the survey. Bias may occur, for example, if a part of the population is excluded from the sampling frame or because respondents to the survey are different from non-respondents with respect to the survey variables. It may also occur if the instrument used to measure a population characteristic is imperfect. Substantial efforts have been made to avoid such systematic errors. For example, the sample has been drawn at random from a comprehensive frame, two modes and multiple reminders have been used to encourage response, and all elements of the questionnaire were thoroughly tested before being used.
Random error
Random error is always present to some extent in survey measurement. If a survey is repeated multiple times minor differences will be present each time due to chance. Over multiple repeats of the same survey these errors will average to zero. The most important component of random error is sampling error, which is the error that arises because the estimate is based on a random sample rather than a full census of the population. The results obtained for a single sample may by chance vary from the true values for the population, but the error would be expected to average to zero over a large number of samples. The amount of between-sample variation depends on both the size of the sample and the sample design. The impact of this random variation is reflected in the confidence intervals presented in the data tables for headline measures. Random error may also follow from other sources such as variations in respondents’ interpretation of the questions, or variations in the way different interviewers ask questions.
Standard errors for complex sample designs
The Participation Survey employs a systematic sample design, and the data is both clustered by address and weighted to compensate for non-response bias. These features will impact upon the standard errors for each survey estimate in a unique way. Generally speaking, systematic sampling will reduce standard errors while data clustering and weighting will increase them. If the complex sample design is ignored, the standard errors will be wrong and usually too narrow. The confidence intervals published in the quarter two and annual data tables (which also includes quarter one data) have been estimated using the SPSS Complex Samples module, which employs a Taylor Series Expansion method to do this.
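As a simplified illustration of why the complex design matters, the sketch below widens a simple-random-sampling confidence interval for a proportion by a design effect. The published intervals were produced with the SPSS Complex Samples module (Taylor series linearisation) rather than this approximation, and the design effect value shown is hypothetical.

```python
import math

def proportion_ci(p, n_respondents, deff=1.0, z=1.96):
    """Approximate 95% confidence interval for an estimated proportion,
    inflating the simple-random-sampling standard error by a design effect."""
    se_srs = math.sqrt(p * (1 - p) / n_respondents)
    se = se_srs * math.sqrt(deff)  # clustering/weighting widen, stratification narrows
    return p - z * se, p + z * se

# Hypothetical example: a 50% estimate from 8,250 respondents, design effect 1.4.
print(proportion_ci(0.5, 8_250, deff=1.4))
```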
5.9 Missing data
In the Major Events section of the web questionnaire, respondents were asked which major events they had heard of (CEVEAW) and which of the events selected at CEVEAW they had participated in (CMAJE12). A dummy variable holding the current date (DateEvent: Placeholder) was created because the options for a few questions in the Major Events section are date dependent, such as the options in CMAJE12.
A scripting error affected DateEvent: Placeholder, and therefore any filtering condition based on it. The option “Her Majesty The Queen’s Platinum Jubilee” in CMAJE12 was affected: it was not shown to respondents between 22nd October and 1st November, and again between 22nd and 23rd November. Over that period, 211 respondents were not able to answer CMAJE12 because “Her Majesty The Queen’s Platinum Jubilee” was the only option chosen at CEVEAW. Furthermore, 590 respondents who selected “Her Majesty The Queen’s Platinum Jubilee” as well as other events at CEVEAW saw the CMAJE12 question but were not given “Her Majesty The Queen’s Platinum Jubilee” as a participation option. The script was updated on 23rd November and the filtering logic based on DateEvent: Placeholder was removed from the script as it was no longer required.
Due to the fieldwork design, the respondents who were affected by this error may not truly be a random subset of quarter three respondents and could potentially skew the results. In the quarter three data, all responses for “Her Majesty The Queen’s Platinum Jubilee” participation at CMAJE12_001 have been set to -3 “Not applicable” and the variable is not used in analysis.
5.10 Missing paradata
In quarter three, a change to how the script recorded timing points was discovered. This meant that only respondents who reached the very last screen of the survey had the “MultiSession” flag recorded correctly. As the flag was largely incomplete, the “MultiSession” variable was removed from the data file. This also affected the quarter one and quarter two data files; hence, the “MultiSession” variable should not be used for those quarters either.
6. Weighting
A three-step weighting process was used to compensate for differences in both sampling probability and response probability:
- An address design weight was created equal to one divided by the sampling probability; this also served as the individual-level design weight because all resident adults could respond.
- The expected number of responses per address was modelled as a function of data available at the neighbourhood and address levels. The step two weight was equal to one divided by the predicted number of responses.
- The product of the first two steps was used as the input for the final step to calibrate the sample. The responding sample was calibrated to the January–March 2022 Labour Force Survey (LFS) with respect to (i) gender by age, (ii) educational level by age, (iii) ethnic group, (iv) housing tenure, (v) region, (vi) employment status by age, (vii) household size, and (viii) internet use by age.
The sum of these ‘grossing’ weights equals the population of England aged 16+. An additional standardised weight was produced that was the same but scaled so the weights sum to the respondent sample size.
Equivalent weights were also produced for the (majority) subset of respondents who completed the survey by web. This weight was needed because a few items were included in the web questionnaire but not the paper questionnaire.
For the annual dataset (quarters 1, 2, 3 and 4), the ‘grossing’ weights were divided by 4 and new standardised weights produced to ensure that each quarter would contribute equally to estimates based on the annual dataset.
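A minimal sketch of this three-step construction is shown below, using invented selection and response probabilities and only two calibration dimensions. The production weighting calibrated to the eight LFS dimensions listed above and used more robust convergence checks; everything here is illustrative.

```python
import numpy as np

def rake(base_weights, categories, targets, n_iter=50):
    """Minimal raking (iterative proportional fitting) sketch: adjust the
    step-1 x step-2 weights so weighted margins match population targets.
    `categories` maps dimension -> array of category codes per respondent;
    `targets` maps dimension -> {category code: population total}."""
    w = base_weights.astype(float)
    for _ in range(n_iter):
        for dim, codes in categories.items():
            for cat, target in targets[dim].items():
                mask = codes == cat
                total = w[mask].sum()
                if total > 0:
                    w[mask] *= target / total
    return w

# Tiny hypothetical example with two calibration dimensions.
rng = np.random.default_rng(0)
n = 1000
design_weight = 1.0 / rng.uniform(0.001, 0.003, n)    # step 1: 1 / sampling probability
nonresponse_weight = 1.0 / rng.uniform(0.2, 0.6, n)   # step 2: 1 / predicted responses
base = design_weight * nonresponse_weight

cats = {"sex": rng.integers(0, 2, n), "age_band": rng.integers(0, 3, n)}
targets = {"sex": {0: 22_000_000, 1: 23_000_000},
           "age_band": {0: 10_000_000, 1: 20_000_000, 2: 15_000_000}}

grossing = rake(base, cats, targets)                  # step 3: calibration ('grossing') weights
standardised = grossing * n / grossing.sum()          # rescaled to the respondent sample size
```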
The final weight variables in the quarters one, two, three and four datasets are:
- ‘Finalweight’ – to be used when analysing data available from both the web and paper questionnaires.
- ‘Finalweightweb’ – to be used when analysing data available only from the web questionnaire.
The final weight variables in the annual dataset are:
- ‘Y2SampleSizeWeight’ – to be used when making population estimates based on online and paper data.
- ‘Y2SampleSizeWeight_WebOnly’ – to be used when making population estimates based on online data only.
It should be noted that the weighting only corrects for observed bias (for the set of variables included in the weighting matrix) and there is a risk of unobserved bias. Furthermore, the raking algorithm used for the weighting only ensures that the sample margins match the population margins. There is no guarantee that the weights will correct for bias in the relationships between the variables.
7. Appendix
7.1 QR code experiment
Introduction
The Participation Survey is used by the Department for Culture, Media & Sport to understand adult engagement with the sectors of culture, media and sport. It is an Address-Based Online Survey in which adults aged 16 and over in England are invited by letter to go online and respond to the survey. A paper questionnaire is also available on request and is sent, as part of the second reminder mailing, to households that typically respond to web surveys at lower-than-average levels.
An experiment was conducted in the third quarter of the 2022/23 Participation Survey to explore the impact of including a QR code in the survey invitation and reminder mailings. Half of sampled addresses were sent invitation and reminder letters with a QR code embedded, while the other half did not receive the QR code.
The analysis in this report focuses on three aspects of the experiment:
- Whether the inclusion of a QR code increases the response rate
- Whether the inclusion of a QR code changes the device used to answer the web survey
- Whether the inclusion of a QR code leads to different respondent profiles (that is, whether respondents with certain characteristics are more/less likely to respond to the survey when a QR code is presented)
Descriptive summary
Overall, 17,438 addresses were issued in quarter three and 7,997 respondents completed the survey from 5,352 households. This constitutes a 46% conversion rate, a 33% household-level response rate, and an individual-level response rate of 26%.
Response rates were calculated via the standard ABOS method. An estimated 8% of ‘small user’ PAF addresses in England are assumed to be non-residential (derived from interviewer-administered surveys). The average number of adults aged 16 or over per residential household, based on the Labour Force Survey, is 1.89. The response rate formulae are:
- Household response rate = number of responding households / (number of issued addresses × 0.92)
- Individual response rate = number of responses / (number of issued addresses × 0.92 × 1.89)
The conversion rate is the ratio of the number of responses to the number of issued addresses.
Although the Participation Survey is predominately web-based, a paper questionnaire is also available for those not digitally engaged. In quarter three, a QR code experiment was implemented in which a random half of the sampled addresses were sent a survey invitation letter with a QR code. The key aim of the experiment was to test whether including a QR code improves the survey response rate. Of the 4,099 respondents in the QR code group, 28% self-reported that they accessed the survey by scanning this code.
Response rate and QR code
Table 5 shows the number of survey completions per sampled address for each arm of the experiment. The computation uses the design weight, which compensates for unequal selection probabilities of the sampled addresses. Overall, the QR code group shows a statistically significantly higher average number of survey completions per sampled address (0.49) compared to the control group (0.46); the 95% confidence interval for this difference is between 0.005 and 0.05. Thus, the inclusion of a QR code in the invitation letter leads to a higher response rate. The same conclusion holds when restricting the analysis to those who completed the survey via the web.
Table 5: Survey completions per sampled address by experiment arm.
Completion mode | Survey completions per sampled address with QR code | Survey completions per sampled address without QR code | t-test [footnote 7] |
---|---|---|---|
Overall (web + paper) | 0.49 | 0.46 | t = 2.32, df = 17422, p-value = 0.020 |
Web only | 0.42 | 0.40 | t = 2.00, df = 17422, p-value = 0.045 |
Paper only | 0.06 | 0.06 | t = 1.03, df = 17422, p-value = 0.30 |
Meanwhile, the QR code does not make any difference in terms of the response rate among those who completed the paper questionnaire. This is not surprising as the QR code is designed to push people to answer the survey online.
Given that the use of QR codes improves the overall as well as web response rate but does not change that of the paper mode, we can conclude that there is no displacement effect. That is, the additional respondents answering the web survey are likely due to the use of QR codes, not because of paper respondents switching to the web mode. This makes the use of QR codes promising.
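For readers who want to replicate this kind of comparison, the sketch below runs a simplified, unweighted two-sample t-test on per-address completion counts. The published test additionally accounted for sample stratification and design weights (footnote 7), and the input data here are simulated purely for illustration.

```python
import numpy as np
from scipy import stats

def completions_test(completions_qr, completions_no_qr):
    """Unweighted two-sample t-test on completions per sampled address,
    as a simplified stand-in for the design-weighted test reported above."""
    result = stats.ttest_ind(completions_qr, completions_no_qr, equal_var=True)
    diff = np.mean(completions_qr) - np.mean(completions_no_qr)
    return diff, result.statistic, result.pvalue

# Hypothetical inputs: one completion count per sampled address in each arm.
rng = np.random.default_rng(1)
qr = rng.poisson(0.49, 8_719)
no_qr = rng.poisson(0.46, 8_719)
print(completions_test(qr, no_qr))
```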
Mobile device and QR code
We also explored whether the QR code might change the device used to answer the survey, given that using the QR code requires scanning it with a camera, which is readily available on almost all mobile devices. To investigate this, we compared the responding devices across the two arms of the experiment.
As shown in Figure 2, when presented with a QR code (as opposed to without one), more people responded to the survey using their phones. In contrast, when no QR code was provided, more people used their PCs.
Figure 2. The distribution of responding devices
In a separate analysis (not shown), where the comparison base is the responding device rather than the experiment arm as in Figure 2, we found that fewer PCs/laptops were used to answer the web survey when the QR code was available. In other words, the inclusion of a QR code leads to a displacement effect: the smartphone displaces the large-screen device.
Past literature suggests that using small-screen devices, particularly smartphones, can lead to slightly lower data quality [footnote 8]. This is mainly due to (1) the difficulty in optimising the survey design to be mobile-device-friendly, especially for certain question types (for example, grid question, sliders) and (2) the potential to multitask and engage in cognitive shortcuts when answering surveys using smartphones. Nonetheless, past research does not show that the use of smartphones consistently leads to lower data quality across different indicators (for example, non-substantive answers, straightlining) [footnote 9]. Also, it should be stressed that the survey template used on the Participation Survey automatically renders to all device types so that each question presents in the right format for desktops, laptops, tablets and smartphones.
Overall, the positive effect of QR codes (that is, an increase in the response rate) might come with a downside: the collected data might be of slightly lower quality than that from large-screen device users. However, unless there is strong evidence from the survey suggesting the presence of this effect, QR codes should be included in the survey invitation letters.
Respondent profile and QR code
The profile of respondents can also give insights into the effect of the QR code experiment. Eleven demographic characteristics were selected to see if there were any differences between the two arms of the experiment. They were chosen mainly because (1) they are recorded in the survey and (2) they are commonly found to be associated with the key outcomes of the survey. The characteristics were: Age, Sex, Housing tenure, Having a Degree or not, Number of children in the household, Ethnicity, Work status, Internet use, Any long-term health issue, Marital status, and Occupation.
We conducted two different comparisons here. The first compared the respondent profile between the experiment and control groups (that is, QR code vs. no QR code). This analysis used the design weight to account for potential confounders in the result (for example, unequal selection probabilities in the sampling).
In the second comparison, both groups in the experiment were compared to a benchmark. The estimates of the two groups were still computed by applying the design weight. However, the benchmark was computed by combining data from both groups and then applying the final survey weights. The final survey weights are the product of the design, nonresponse, and post-stratification weights. The post-stratification target is the Labour Force Survey. As the benchmark is supposed to represent the target population of the Participation Survey, any dramatic difference between the two groups and the benchmark is indicative of the presence of potential biases in the data.
Comparison 1: QR code vs. no QR code
None of the 11 characteristics investigated showed a dramatic difference between the experiment and control groups. In fact, the difference between them was typically within 3 percentage points. This means the use of a QR code neither worsened nor improved the respondent profile.
Most notably, the use of QR codes did not lead to more respondents from younger age groups (16-24) completing the survey (see Figure 3). One could hypothesise that younger age groups are more tech-savvy and that the inclusion of a QR code in the invitation letter should therefore increase their participation. However, as can be seen in Figure 3, people in the experiment and control groups were equally likely to respond to the survey, regardless of age.
Figure 3. The distribution of age groups
One possible reason is that people of all age groups have become familiar with the feature of QR codes, especially given the COVID-19 pandemic pushed many people to use QR codes (for example, ordering food via QR code to minimise contacts). As a result, the QR code is perhaps no longer a good proxy for how tech-savvy people are.
Comparison 2: experiment vs. benchmark
When comparing the respondent profiles from both arms of the experiment to the benchmark, the overall finding is: there are some biases in the respondent profile regardless of whether the QR code is used or not. More work needs to be done to make the respondent profile resemble the benchmark, but the inclusion of a QR code does not appear to help improve the respondent profile.
The discrepancies between the achieved sample and the benchmark are presented below. Compared to the benchmark, the achieved sample contains:
- more people aged 65+ and fewer people aged 16-24 (Figure 3)
- more degree-holders (Figure 4)
- more households that own their living place outright and fewer renters (Figure 5)
- more people who are married (Figure 6)
- more households that have no children under 16 (Figure 7)
- more females (Figure 8)
- more people who are currently not working (Figure 9)
Figure 4. The distribution of degree holders
Figure 5. The distribution of housing tenure
Figure 6. The distribution of marital status
Figure 7. The distribution of the number of under-16 children in households.
Figure 8. The distribution of sex
Figure 9. The distribution of work status
Conclusion
To summarise, a reasonable proportion (28%) of web respondents who received the QR code in the invitation and reminder mailings reported scanning the code to access the survey. This suggests respondents find the QR code practical to use. Additionally, the use of QR codes increases the response rate, mainly because it increases the number responding by web without decreasing the number responding on paper. This is good news, as including the QR code in the letter is likely to incur a negligible cost, which translates into a reduced cost per survey completion. However, the availability of the QR code displaces some web respondents from PC response to small-screen (phone) response, which may have some (probably negative) effects on data quality, because data from small-screen users tends to be of slightly lower quality according to past literature.
In terms of the respondent profile, seven out of the 11 characteristics investigated were found to differ from the benchmark. However, biases emerged in both arms of the experiment, suggesting that this is a wider concern for general fieldwork management and non-response bias adjustment. Overall, although plenty of respondents used the QR code, the overall demographic profile barely changed, and the use of QR codes does not bring the profile closer to that of the target population. On balance, the findings suggest that QR codes should be included in the survey invitation letters unless it is demonstrated that the data quality of smartphone responses is substantially lower than that of large-screen device responses.
7.2 Invitation letter
7.3 Reminder letter 1
7.3.1 Partial response
7.3.2 No response
7.4 Reminder letter 2
7.4.1 Partial response with paper questionnaires included
7.4.2 Partial response with no paper questionnaires included
7.4.3 No response with paper questionnaires included
7.4.4 No response with no paper questionnaires included
7.5 Reminder letter 3
7.5.1 Partial response
7.5.2 No response
7.6 Ad hoc paper questionnaire request letter
7.7 Postal incentive letter
- In February 2023, there was a Machinery of Government (MoG) change and responsibility for digital policy now sits within the Department for Science, Innovation and Technology (DSIT). This MoG change did not affect the contents of the Participation Survey for 2022/23 - digital questions were still part of the survey. ↩
- https://www.gov.uk/government/publications/participation-survey-methodology ↩ ↩2
- International Territorial Level (ITL) is a geocode standard for referencing the subdivisions of the United Kingdom for statistical purposes, used by the Office for National Statistics (ONS). Since 1 January 2021, the ONS has encouraged the use of ITL as a replacement for the Nomenclature of Territorial Units for Statistics (NUTS), with lookups between NUTS and ITL maintained and published until 2023. ↩
- Response rates (RR) were calculated via the standard ABOS method. An estimated 8% of ‘small user’ PAF addresses in England are assumed to be non-residential (derived from interviewer-administered surveys). The average number of adults aged 16 or over per residential household, based on the Labour Force Survey, is 1.89. Thus, the response rate formulae are: Household RR = number of responding households / (number of issued addresses × 0.92); Individual RR = number of responses / (number of issued addresses × 0.92 × 1.89). The conversion rate is the ratio of the number of responses to the number of issued addresses. ↩
- This figure is calculated by removing outliers, which were any interviews shorter than 5 minutes or longer than 60 minutes. ↩
- Sample stratification and design weights were accounted for in this statistical test. ↩
- Chen, Z. et al. (2022) ‘Impact of question topics and filter question formats on web survey breakoffs’, International Journal of Market Research, 64(6), pp. 710–726. Available at: https://doi.org/10.1177/14707853211068008. ↩
- Clement, S.L., Severin-Nielsen, M.K. and Shamshiri-Petersen, D. (2020) ‘Device effects on survey response quality. A comparison of smartphone, tablet and PC responses on a cross sectional probability sample’, Survey Methods: Insights from the Field [Preprint]. Available at: https://doi.org/10.13094/SMIF-2020-00020. ↩