Participation Survey May to June 2023: technical report
Updated 24 July 2024
Applies to England
September 2023 © Kantar Public 2023
1. Introduction
1.1 Background to the survey
In 2021, the Department for Culture, Media and Sport (DCMS) commissioned Kantar Public to design and deliver a new, nationally representative ‘push-to-web’ survey to assess adult participation in DCMS sectors across England. The survey served as a successor to the Taking Part Survey, which ran for 16 years as a continuous face-to-face survey [footnote 1].
This technical note relates to the 2023/24 Participation Survey Quarter 1 fieldwork, conducted between 9th May and 3rd July 2023.
The 2023/24 Participation Survey was commissioned by DCMS in partnership with Arts Council England (ACE). The scope of the survey is to deliver a nationally representative sample of adults (aged 16 years and over) and to assess adult participation in DCMS sectors across England, targeting enough households to allow for Local Authority representation of the data. The data collection model for the Participation Survey is based on ABOS (Address-Based Online Surveying), a type of ‘push-to-web’ survey method. Respondents take part either online or by completing a paper questionnaire. In 2023/24 the target respondent sample size increased to 175,000, up from 33,000 per survey year in the interim survey that ran from 2021 to 2023. Fieldwork will run across four quarters (May-June 2023, July-September 2023, October-December 2023 and January-March 2024).
1.2 Survey objectives
- To inform and monitor government policy and programmes in DCMS, ACE and other government departments (OGDs) on adult engagement with the DCMS and digital sectors [footnote 2]. The survey will also gather information on demographics (for example, age, gender, education).
- To assess the variation in engagement with cultural activities across DCMS sectors in England, and how engagement differs by socio-demographic characteristics such as location, age, education, and income.
- To monitor and report on progress in achieving the Outcomes set out in Let’s Create [footnote 3] – Creative People, Cultural Communities, and A Creative and Cultural Country (as set out in the Arts Council England Impact Framework).
In preparation for the 2023/24 survey, Kantar Public undertook questionnaire development work to test any new or amended questions. The 2023/24 survey launched in May 2023.
1.3 Survey design
The basic ABOS design is simple: a stratified random sample of addresses is drawn from the Royal Mail’s postcode address file and an invitation letter is sent to each one, containing username(s) and password(s) plus the URL of the survey website. Sampled individuals can log on using this information and complete the survey as they might any other web survey. Once the questionnaire is complete, the specific username and password cannot be used again, ensuring data confidentiality from others with access to this information.
It is usual for at least one reminder to be sent to each sampled address and it is also usual for an alternative mode (usually a paper questionnaire) to be offered to those who need it or would prefer it. It is typical for this alternative mode to be available only on request at first. However, after non-response to one or more web survey reminders, this alternative mode may be given more prominence.
Paper questionnaires ensure coverage of the offline population and are especially effective with sub-populations that respond to online surveys at lower-than-average levels. However, paper questionnaires have measurement limitations that constrain the design of the online questionnaire and also add considerably to overall cost. For the Participation Survey, paper questionnaires are used in a limited and targeted way, to optimise rather than maximise response.
2. Sampling
2.1 Sample design: addresses
The address sample design is intrinsically linked to the data collection design (see ‘Details of the data collection model’ below) and was designed to yield a respondent sample that is representative with respect to neighbourhood deprivation level, and age group within each of the 33 ITL2 regions and 309 lower-tier local authorities [footnote 4] in England. This approach limits the role of weights in the production of unbiased survey estimates, narrowing confidence intervals compared with other designs.
The design also sought a minimum four-quarter respondent sample size of 500 for each local authority and a minimum four-quarter effective sample size of 2,700 for each ITL2 region [footnote 5]. Although there were no specific targets per quarter, the sample selection process was designed to ensure that the respondent sample size per local authority was approximately the same per quarter.
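The effective sample size referred to here is conventionally approximated by the Kish formula; the expression below is a standard illustration rather than a reproduction of the exact calculation used to set the design targets:

$$ n_{\text{eff}} = \frac{\left(\sum_{i} w_i\right)^2}{\sum_{i} w_i^2} $$

where \(w_i\) is the weight attached to respondent \(i\). Equal weights give \(n_{\text{eff}} = n\), while greater variation in the weights reduces the effective sample size.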
As a first step, a stratified master sample of 726,790 addresses in England was drawn from the Postcode Address File (PAF) ‘small user’ subframe. Before sampling, the PAF was disproportionately stratified by lower-tier local authority (309 strata). Furthermore, within each of the 309 strata, the PAF was sorted by (i) neighbourhood deprivation level (5 groups, each of a similar scale at the national level), (ii) super output area, and finally (iii) by postcode. This ensured that the master sample of addresses was geodemographically representative within each stratum.
This master sample of addresses was then augmented by data supplier CACI. For each address in the master sample, CACI added the expected number of resident adults in each ten-year age band. Although this auxiliary data will have been imperfect, Kantar Public’s investigations have shown that it is highly effective at identifying households that are mostly young or mostly old. Once this data was attached, the master sample was additionally stratified by expected household age structure based on the CACI data: (i) all aged 35 or younger (17% of the total); (ii) all aged 65 or older (21% of the total); (iii) all other addresses (62% of the total).
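As an illustration only, the allocation of an address to one of the three expected household age structure strata can be written as a simple rule over the CACI-based counts. The function and parameter names below are hypothetical; the actual CACI variables are not reproduced in this report.

```python
def age_structure_stratum(n_adults_under_35: int, n_adults_65_plus: int, n_adults_total: int) -> str:
    """Classify an address by its expected household age structure (CACI-based counts;
    parameter names are illustrative)."""
    if n_adults_total > 0 and n_adults_under_35 == n_adults_total:
        return "all_35_or_younger"   # circa 17% of the master sample
    if n_adults_total > 0 and n_adults_65_plus == n_adults_total:
        return "all_65_or_older"     # circa 21% of the master sample
    return "other"                   # circa 62% of the master sample
```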
The conditional sampling probability in each stratum was varied to compensate for (expected) residual variation in response rate that could not be ‘designed out’, given the constraints of budget and timescale. The underlying assumptions for this procedure were derived from empirical evidence obtained from the 2021/22 and 2022/23 Participation Surveys.
Kantar Public drew a stratified random sample of 455,546 addresses from the master sample of 726,790 and systematically allocated them with equal probability to quarters 1, 2, 3 and 4 (that is, circa 113,886 addresses per quarter). Kantar Public then systematically distributed the quarter-specific samples to three equal-sized ‘replicates’, each with the same profile. The second replicate was expected to be issued two weeks after the first replicate, and the third replicate was expected to be issued two weeks after the second replicate to ensure that data collection was maximally spread throughout the three-month period allocated to each quarter [footnote 6].
These replicates were further subdivided into twenty-five equal-sized ‘batches’ to help manage fieldwork. The expectation was that only the first twenty batches within each replicate would be issued (that is, circa 30,370 addresses), with the twenty-first to twenty-fifth batches kept back in reserve.
However, as fieldwork for quarter 1 was only two months long (instead of the usual three), all three replicates were issued at the same time, at the beginning of fieldwork. Only the first twenty batches of each replicate were issued (that is, as planned). In total, 91,110 addresses were issued for quarter 1. Fieldwork for Q1 was delayed until May 2023 to allow additional time for cognitive and pilot testing. However, the full sample expected over a typical three-month quarter was issued over the two-month quarter, so there was no loss of sample or data.
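A minimal sketch of the planned allocation logic is shown below, assuming the sampled addresses are supplied as a list already sorted in the stratified order described above; the function and variable names are illustrative, not production sampling code.

```python
def allocate(addresses, n_quarters=4, n_replicates=3, n_batches=25):
    """Systematically spread a stratified, ordered address list across quarters,
    replicates and batches so that each subdivision shares a similar profile."""
    allocation = []
    for i, address in enumerate(addresses):
        quarter = i % n_quarters + 1                                 # 1-4
        replicate = (i // n_quarters) % n_replicates + 1             # 1-3 within each quarter
        batch = (i // (n_quarters * n_replicates)) % n_batches + 1   # 1-25 within each replicate
        allocation.append((address, quarter, replicate, batch))
    return allocation
```

Because the list is worked through systematically, every quarter, replicate and batch ends up with a similar geodemographic profile to the master sample.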
Figure 1 shows the quarter 1 (issued) sample structure with respect to the major ‘design’ strata: neighbourhood deprivation level and expected household age structure.
Figure 1: Initial address issue by area deprivation quintile group
Expected household age structure | Most deprived | 2nd | 3rd | 4th | Least deprived |
All <=35 | 3,920 | 4,213 | 3,457 | 2,845 | 2,158 |
Other | 10,392 | 12,941 | 12,399 | 11,514 | 10,167 |
All >=65 | 2,728 | 3,010 | 3,894 | 3,960 | 3,512 |
2.2 Sample design: individuals within sampled addresses
All resident adults aged 16+ were invited to complete the survey. In this way, the Participation Survey avoided the complexity and risk of selection error associated with remote random sampling within households.
However, for practical reasons, the number of logins provided in the invitation letter was limited. The number of logins was varied between two and four, with this total adjusted in reminder letters to reflect household data provided by prior respondent(s). Addresses that CACI data predicted contained only one adult were allocated two logins; addresses predicted to contain two adults were allocated three logins; and other addresses were allocated four logins. The mean number of logins per address was 2.7. Paper questionnaires were available to those who were offline, not confident online, or unwilling to complete the survey online.
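The login allocation rule can be summarised as a simple function of the predicted number of resident adults; this is a sketch for clarity and the function name is illustrative.

```python
def logins_for_address(expected_adults: int) -> int:
    """Number of web survey logins printed on the invitation letter, based on the
    CACI-predicted number of resident adults."""
    if expected_adults <= 1:
        return 2   # predicted single-adult households
    if expected_adults == 2:
        return 3   # predicted two-adult households
    return 4       # all other addresses
```

Across the issued sample, this rule produced a mean of 2.7 logins per address.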
2.3 Details of the data collection model
Figure 2 summarises the data collection design within each stratum, showing the number of mailings and type of each mailing: push-to-web (W) or mailing with paper questionnaires (P). For example, ‘WWP’ means two push-to-web mailings and a third mailing with paper questionnaires included alongside the web survey login information. In general, there was a two-week gap between mailings.
Figure 2: Data collection design by stratum
Expected household age structure | Most deprived | 2nd | 3rd | 4th | Least deprived |
All <=35 | WWPW | WWWW | WWWW | WWW | WWW |
Other | WWPW | WWW | WWW | WWW | WWW |
All >=65 | WWPW | WWPW | WWP | WWP | WWP |
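For illustration, Figure 2 can be represented as a lookup from (expected household age structure, deprivation quintile) to a mailing plan string, where each character is one mailing: ‘W’ a push-to-web mailing and ‘P’ a mailing with paper questionnaires enclosed alongside the web login information. The stratum labels and function name below are illustrative; the plan strings are transcribed from Figure 2.

```python
# Mailing plans by stratum, transcribed from Figure 2.
# Quintile order: most deprived, 2nd, 3rd, 4th, least deprived.
MAILING_PLANS = {
    "all_35_or_younger": ["WWPW", "WWWW", "WWWW", "WWW", "WWW"],
    "other":             ["WWPW", "WWW",  "WWW",  "WWW", "WWW"],
    "all_65_or_older":   ["WWPW", "WWPW", "WWP",  "WWP", "WWP"],
}

def mailing_plan(age_structure: str, deprivation_quintile: int) -> str:
    """Return the planned mailing sequence for a stratum (quintile 1 = most deprived)."""
    return MAILING_PLANS[age_structure][deprivation_quintile - 1]
```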
3. Questionnaire
3.1 Questionnaire development
The online questionnaire was designed to take an average of 30 minutes to complete. A modular design was used with around half of the questionnaire made up of a core set of questions asked of the full sample. The remaining questions were split into three separate modules, randomly allocated to a subset of the sample.
The postal version of the questionnaire included the same set of core questions asked online, but the modular questions were omitted to avoid overly burdening respondents who complete the survey on paper, and to encourage response. Copies of the online and paper questionnaires are available online [footnote 7].
Given the extent of questionnaire changes in the 2023/24 Participation Survey, it was important to implement a comprehensive development and testing phase. This was made up of three key stages:
- Questionnaire review
- Cognitive testing
- Usability testing
3.2 Questionnaire changes
Questions on the following topics of interest were added to the Participation Survey 2023/24, as requested by ACE and/or DCMS:
- Environment, which included questions on mode of transport taken while travelling to an arts and cultural event, distance travelled, and reason(s) for transportation choice.
- Social prescribing, which included questions on the respondent’s experience with social prescribing, and the types of activities they were referred to.
- Further questions on arts and culture engagement, which included questions on the types of classes and clubs respondents have taken part in, the frequency of and reason(s) for their involvement, the impact/benefits of participating, and, for non-participants, the reason for not participating.
- Pride in Place, which included questions on respondents’ sense of belonging and pride in their local area, the role culture plays in choosing where to live, and the current arts and culture scene in their local area.
3.3 Cognitive testing
Cognitive testing explores how participants understand, process, and respond to survey questions. Two rounds of online cognitive testing were conducted: the first took place in January 2023 and the second in February 2023. A total of 18 participants, with a diverse mix of genders, ages, highest educational qualifications and ethnicities, took part in the cognitive testing.
In the first round of testing, participants enjoyed the (more interesting) questions about arts and culture activities, historic places visited and their local area. The questions towards the end of the interview about digital skills and data were not as salient, and there was a concern that this could lead to participants satisficing [footnote 8]. Participants regularly reported there was too much to read on the screen, especially where examples and bracketed text were used. Another finding was that some questions were perceived as repetitive where a similar list of options was used. Also, where there were many statements in agree/disagree batteries, there was a risk of participants straight-lining without engaging with the text properly due to the onerous nature of the task.
The aim of the second round of cognitive testing was to test the changes made following the first round. Where wording had been simplified and the amount of text reduced (for example, by removing brackets and complex language), questions were easier to answer and participants were more willing to read through all the answer codes. Moreover, where examples were provided, the question was more meaningful than the more conceptual/theoretical version tested in round one.
Further details about the development work can be found in the Participation Survey methodology reports (the 2023/24 pilot report will be published shortly).
3.4 Usability testing
The primary focus of these interviews was to explore the usability of the paper questionnaire, presented as an A4 booklet, which is sent out with the second reminder letter. How participants approached the booklet and answered its questions was explored in detail. On the 22nd and 23rd March 2023, usability testing was carried out face-to-face with five participants with a diverse mix of genders, social grades, highest educational qualifications and ethnicities.
Generally, the paper questionnaire was viewed as relatively straightforward to complete and contained questions that were engaging. Following feedback from the usability testing, some changes were made to the paper questionnaire. These included: ensuring all essential information is included within the questions, as some respondents skimmed through the front page or skipped it entirely; making instructions clearer by underlining the number of responses respondents should cross and using darker shading; and emphasising filters by changing their colour, moving them closer to the question wording and placing them at the top of the page rather than mid-page.
Further details about the development work can be found in the Participation Survey methodology reports (the 2023/24 pilot report will be published shortly). A copy of the 2023/24 paper questionnaire is available online.
4. Fieldwork
4.1 Contact procedures
All selected addresses were sent an initial invitation letter containing the following information:
- A brief description of the survey
- The URL of the survey website (used to access the online script)
- A QR code that can be scanned to access the online survey
- Log-in details for the required number of household members
- An explanation that participants will receive a £10 shopping voucher
- Information about how to contact Kantar Public in case of any queries
The reverse of the letter featured responses to a series of Frequently Asked Questions.
All non-responding addresses were sent two reminder letters, at the end of the second and fourth weeks of fieldwork respectively. A pre-selected subset of non-responding addresses (see Figure 2) was sent a third reminder letter at the end of the sixth week of fieldwork. The information contained in the reminder letters was similar to the invitation letters, with slightly modified messaging to reflect each reminder stage.
As well as the online survey, respondents were given the option to complete a paper questionnaire, which consisted of an abridged version of the online survey. Each letter informed respondents that they could request a paper questionnaire by contacting Kantar Public using the email address or freephone telephone number provided, and a cut-off date for paper questionnaire requests was also included on the letters.
In addition, some addresses received up to two paper questionnaires with the second reminder letter. This targeted approach was developed based on historical data Kantar Public has collected through other studies, which suggests that proactive provision of paper questionnaires to all addresses can actually displace online responses in some strata. Paper questionnaires were pro-actively provided to (i) sampled addresses in the most deprived quintile group, and (ii) sampled addresses where it was expected that every resident would be aged 65 or older (based on CACI data).
4.2 Fieldwork performance
In total, 43,154 respondents completed the survey during quarter 1 – 37,470 via the online survey and 5,684 by returning a paper questionnaire. Following data quality checks (see Chapter 5 for details), 2,649 respondents were removed, leaving 40,505 respondents in the final dataset.
This constitutes a 44% conversion rate, a 32% household-level response rate, and an individual-level response rate of 26% [footnote 9].
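These figures can be reproduced from the counts above and the assumptions set out in footnote 9 (8% of issued addresses assumed non-residential; an average of 1.89 adults aged 16+ per residential household). The sketch below simply re-applies those formulas to the quarter 1 figures; the responding-household count needed for the household-level rate is not published in this report.

```python
# Quarter 1 figures from this report; assumptions from footnote 9.
issued_addresses = 91_110
valid_responses = 40_505

residential_addresses = issued_addresses * 0.92    # 8% assumed non-residential
eligible_adults = residential_addresses * 1.89     # average adults (16+) per residential household

conversion_rate = valid_responses / issued_addresses   # ~0.44
individual_rr = valid_responses / eligible_adults      # ~0.26

# The household-level rate (32%) uses the number of responding households,
# which is not published here:
#   household_rr = responding_households / residential_addresses
```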
For the online survey, the median completion time was 25 minutes and 9 seconds, and the average completion time was 27 minutes and 28 seconds [footnote 10].
5. Data processing
5.1 Data management
Due to the different structures of the online and paper questionnaires, data management was handled separately for each mode. Online questionnaire data was collected via the web script and, as such, was much more easily accessible. By contrast, paper questionnaires were scanned and converted into an accessible format.
For the final outputs, both sets of interview data were converted into IBM SPSS Statistics, with the online questionnaire structure as a base. The paper questionnaire data was converted to the same structure as the online data so that data from both sources could be combined into a single SPSS file.
5.2 Quality checking
Initial checks were carried out to ensure that paper questionnaire data had been correctly scanned and converted to the online questionnaire data structure. For questions common to both questionnaires, the SPSS output was compared to check for any notable differences in distribution and data setup.
Once any structural issues had been corrected, further quality checks were carried out to identify and remove any invalid interviews. The specific checks were as follows:
- Selecting complete interviews: Any test serials in the dataset (used by researchers prior to survey launch) were removed. Cases were also removed if the respondent reached, but did not answer the fraud declaration statement (online: QFraud; paper: Q73).
- Duplicate serials check: If any individual serial had been returned in the data multiple times, responses were examined to determine whether this was due to the same person completing multiple times or due to a processing error. If they were found to be valid interviews, a new unique serial number was created, and the data was included in the data file. If the interview was deemed to be a ‘true’ duplicate, the more complete or earlier interview was retained.
- Duplicate emails check: If multiple interviews used the same contact email address, responses were examined to determine if they were the same person or multiple people using the same email. If the interviews were found to be from the same person, only the most recent interview was retained. In these cases, online completes were prioritised over paper completes due to the higher data quality.
- Interview quality checks: A set of checks was undertaken to confirm that the questionnaire was completed in good faith and to a reasonable quality (a simplified illustration of one such check is given after the list below). Several parameters were used:
a. Interview length (online check only)
b. Number of people in household reported in interview(s) vs number of total interviews from household.
c. Whether key questions have valid answers.
d. Whether respondents have habitually selected the same response to all items in a grid question (commonly known as ‘flatlining’) where selecting the same responses would not make sense.
e. How many multi-response questions were answered with only one option ticked.
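As an illustration of check (d), a flatlining flag can be computed by testing whether a respondent gave the identical answer to every item in a grid. The thresholds, column names and function name below are hypothetical; the production checks may combine several such indicators.

```python
import pandas as pd

def flag_flatlining(responses: pd.DataFrame, grid_columns: list[str]) -> pd.Series:
    """Flag respondents who gave the identical answer to every item of a grid question.

    `responses` has one row per respondent; `grid_columns` lists the items of a single
    grid (hypothetical column names). Only fully answered grids are judged.
    """
    grid = responses[grid_columns]
    answered = grid.notna().all(axis=1)
    identical = grid.nunique(axis=1, dropna=True) == 1
    return answered & identical
```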
Following the removal of invalid cases, 40,505 valid cases were left in the final dataset.
5.3 Data checks and edits
Upon completion of the general quality checks described above, more detailed data checks were carried out to ensure that the right questions had been answered according to questionnaire routing. This is generally correct for all online completes, as routing is programmed into the scripting software, but for paper completes, data edits were required.
There were two main types of data edit, both affecting the paper questionnaire data:
- Single-response question edits: If a paper questionnaire respondent had mistakenly answered a question that they weren’t supposed to, their response in the data was changed to “-3: Not Applicable”. If a paper questionnaire respondent had neglected to answer a question that they should have, they were assigned a response in the data of “-4: Not answered but should have (paper)”. If a paper questionnaire respondent had ticked more than one box for a single-response question, they were assigned a response in the data of “-5: Multi-selected for single response (paper)”.
- Multiple response question edits: If a paper questionnaire respondent had mistakenly answered a question that they weren’t supposed to, their response was set to “-3: Not Applicable”. If a paper questionnaire respondent had neglected to answer a question that they should have, they were assigned a response in the data of “-4: Not answered but should have (paper)”. Where the respondent had selected both valid answers and an exclusive code such as “None of these”, any valid codes were retained and the exclusive code response was set to “0”.
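A simplified sketch of the single-response edit logic is shown below. The code values match those described above; the routing test and function arguments are schematic, as the real edits follow the questionnaire routing specification.

```python
NOT_APPLICABLE = -3   # "-3: Not Applicable"
NOT_ANSWERED = -4     # "-4: Not answered but should have (paper)"
MULTI_SELECTED = -5   # "-5: Multi-selected for single response (paper)"

def edit_single_response(value, routed_to_question: bool, n_boxes_ticked: int):
    """Apply the paper questionnaire edits for one single-response question."""
    if not routed_to_question and n_boxes_ticked > 0:
        return NOT_APPLICABLE        # answered a question they were routed past
    if routed_to_question and n_boxes_ticked == 0:
        return NOT_ANSWERED          # skipped a question they should have answered
    if routed_to_question and n_boxes_ticked > 1:
        return MULTI_SELECTED        # ticked more than one box
    return value
```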
Other, more specific data edits were also made, as described below:
- Additional edits to library question: The question CLIBRARY1 was formatted differently in the online script and paper questionnaire. In the online script it was set up as one multiple-response question, while in the paper questionnaire it consisted of two separate questions (Q21 and Q25). During data checking, it was found that many paper questionnaire respondents followed the instructions to move on from Q21 and Q25 without ticking the “No” response. To account for this, the following data edits were made:
a. If CFRELIB12 and CPARLI12B were not answered and CNLIWHYA was answered, CLIBRARY1_001 was set to 0 if it was left blank.
b. If CFRELIDIG and CDIGLI12 were not answered and CNLIWHYAD was answered, CLIBRARY1_002 was set to 0 if it was left blank.
c. CLIBRARY1_003 and CLIBRARY1_004 were set to 0 for all paper questionnaire respondents.
- Additional edits to grid questions: Due to the way the paper questionnaire was set up, additional edits were needed for the following linked grid questions: CARTS1/CARTS1A, CARTS2/CARTS2A, CARTS3/CARTS3A, CARTS4/CARTS4A, ARTPART12/ARTPART12A.
Figure 3 shows an example of the CARTS1 section in the paper questionnaire.
Figure 3: Example - CARTS1 section in the paper questionnaire
Marking the option “Not in the last 12 months” on the paper questionnaire was equivalent to the code “0: Have not done this” at CARTS1 in the online script. As such, leaving this option blank in the questionnaire would result in CARTS1 being given a default value of “1” in the final dataset. In cases where a paper questionnaire respondent had neglected to select any of the options in a given row, CARTS1 was recoded from “1” to “0”.
If the paper questionnaire respondent did not tick any of the boxes on the page, they were recoded to “-4: Not answered but should have (paper)”.
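For transparency, the library and grid edits described above can be expressed as simple conditional recodes. The sketch below uses pandas and the variable names given in the text, and assumes unanswered questions are stored as missing values; the indicator arguments in the second function are illustrative summaries of what was scanned from each row and page, not actual dataset variables.

```python
import pandas as pd

def edit_library_questions(df: pd.DataFrame, paper: pd.Series) -> pd.DataFrame:
    """Apply edits (a)-(c) above to paper questionnaire cases.

    `df` holds one row per respondent; `paper` is a boolean Series flagging
    paper questionnaire respondents.
    """
    df = df.copy()
    # (a) Physical library follow-ups blank but 'why not' answered: blank code becomes 'No' (0).
    a = paper & df["CFRELIB12"].isna() & df["CPARLI12B"].isna() & df["CNLIWHYA"].notna()
    df.loc[a & df["CLIBRARY1_001"].isna(), "CLIBRARY1_001"] = 0
    # (b) Equivalent rule for the digital library follow-ups.
    b = paper & df["CFRELIDIG"].isna() & df["CDIGLI12"].isna() & df["CNLIWHYAD"].notna()
    df.loc[b & df["CLIBRARY1_002"].isna(), "CLIBRARY1_002"] = 0
    # (c) Codes 3 and 4 set to 0 for all paper questionnaire respondents.
    df.loc[paper, ["CLIBRARY1_003", "CLIBRARY1_004"]] = 0
    return df

def edit_carts1_row(carts1_value, row_ticked: bool, page_ticked: bool):
    """Recode one CARTS1 grid row from a paper questionnaire."""
    if not page_ticked:
        return -4            # whole page blank: 'Not answered but should have (paper)'
    if not row_ticked and carts1_value == 1:
        return 0             # blank row defaulted to 1; recode to 'Have not done this'
    return carts1_value
```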
5.4 Coding
Post-interview coding was undertaken by members of the Kantar Public coding department. The coding department coded verbatim responses recorded for ‘other specify’ questions.
For example, if a respondent selected “Other” at CARTS1 and wrote text saying that they went to some type of live music event, in the data they would be back-coded as having attended “a live music event” at CARTS1_006.
For the sets CARTS1/CARTS1A/CARTS1B, CARTS2/CARTS2A/CARTS2B and CHERVIS12/CFREHER12/CVOLHER, data edits were made to move responses coded to “Other” to the correct response code, if the answer could be back-coded to an existing response code.
5.5 Data outputs
Once the checks were complete, a final SPSS data file was created that contained only valid interviews and edited data. For 2023-24 the data file has a prefix of “Y3_” added to variable names to indicate that this is the 2023-24 survey and that there have been substantial changes to the questionnaire compared to last year.
From this dataset, a set of data tables were produced. Due to the changes to the questionnaire structure the tables have also been updated accordingly. Notably the measures for “Engaged with heritage physically or digitally” and “Engaged with heritage physically and digitally” from table 3a can no longer be derived in this quarter [footnote 11].
5.6 Weighting
A three-step weighting process was used to compensate for differences in both sampling probability and response probability:
- An address design weight was created equal to one divided by the sampling probability; this also served as the individual-level design weight because all resident adults could respond.
- The expected number of responses per address was modelled as a function of data available at the neighbourhood and address levels. The step two weight was equal to one divided by the predicted number of responses.
- The product of the first two steps was used as the input for the final step to calibrate the sample. The responding sample was calibrated to the January-March 2023 Labour Force Survey (LFS) with respect to (i) sex by age, (ii) educational level by age, (iii) ethnic group, (iv) housing tenure, (v) ITL2 region, (vi) employment status by age, (vii) household size, (viii) presence of children in the household, and (ix) internet use by age.
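A compact sketch of the three steps is shown below, using a simple iterative raking routine to illustrate the calibration stage. The margin variables and object names (for example lfs_sex_by_age) are hypothetical, the modelling of expected responses in step two is reduced to a placeholder, and this is not the production weighting code.

```python
import pandas as pd

def rake(df: pd.DataFrame, start_weight: pd.Series, margins: dict, n_iter: int = 50) -> pd.Series:
    """Iteratively adjust weights so weighted sample totals match each population margin.

    `margins` maps a column name to {category: population total}; every target
    category is assumed to be present in the sample.
    """
    w = start_weight.astype(float).copy()
    for _ in range(n_iter):
        for col, targets in margins.items():
            current = w.groupby(df[col]).sum()                       # weighted totals per category
            factors = {cat: targets[cat] / current[cat] for cat in targets}
            w = w * df[col].map(factors)
    return w

# Step 1: design weight = 1 / address sampling probability (also the person-level design weight).
# Step 2: divide by the modelled expected number of responses per address.
# Step 3: calibrate the product of steps 1 and 2 to LFS population margins, for example:
# final_weight = rake(sample, design_weight / expected_responses,
#                     margins={"sex_by_age": lfs_sex_by_age, "itl2_region": lfs_region_totals})
```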
An equivalent weight was also produced for the (majority) subset of respondents who completed the survey by web. This weight was needed because a few items were included in the web questionnaire but not the paper questionnaire.
It should be noted that the weighting only corrects for observed bias (for the set of variables included in the weighting matrix) and there is a risk of unobserved bias. Furthermore, the raking algorithm used for the weighting only ensures that the sample margins match the population margins. There is no guarantee that the weights will correct for bias in the relationships between the variables.
The final weight variables in the dataset are:
- ‘Finalweight’ – to be used when analysing data available from both the web and paper questionnaires.
- ‘Finalweightweb’ – to be used when analysing data available only from the web questionnaire.
Footnotes

2. In February 2023, there was a Machinery of Government (MoG) change and responsibility for digital policy now sits within the Department for Science, Innovation and Technology (DSIT). This MoG change did not affect the contents of the Participation Survey for 2023/24 - digital questions are still part of the survey.

3. Let’s Create, a strategic vision by ACE, sets out that by 2030 they want England to be a country in which the creativity of each of us is valued and given the chance to flourish, and where everyone has access to a remarkable range of high quality cultural experiences. They invest public money from the government and The National Lottery to help support the sector and to deliver this vision.

4. International Territorial Level (ITL) is a geocode standard for referencing the subdivisions of the United Kingdom for statistical purposes, used by the Office for National Statistics (ONS). Since 1 January 2021, the ONS has encouraged the use of ITL as a replacement for the Nomenclature of Territorial Units for Statistics (NUTS), with lookups between NUTS and ITL maintained and published until 2023.

5. The effective sample size represents the statistical value of the sample after applying weights to compensate for the variation in address sampling probabilities within each ITL2 region.

6. In the event, the interval between the first and second replicates was three weeks, and the interval between the second and third replicates was one and a half weeks.

7. https://www.gov.uk/government/publications/participation-survey-questionnaires

8. Satisficing happens when respondents provide quick, “good enough” answers to complete a survey faster rather than carefully considering the answers.

9. Response rates were calculated via the standard ABOS method. An estimated 8% of ‘small user’ PAF addresses in England are assumed to be non-residential (derived from interviewer-administered surveys). The average number of adults aged 16+ per residential household, based on the Labour Force Survey, is 1.89. Thus, the response rate formulas are: Household RR = number of responding households / (number of issued addresses × 0.92); Individual RR = number of responses / (number of issued addresses × 0.92 × 1.89). The conversion rate is the simple ratio of the number of responses to the number of issued addresses.

10. Interviews with lengths under 2 minutes are removed, and lengths are capped at the 97th percentile. If interviews are under 10 minutes, they are flagged in the system for the research team to evaluate; if they are also flagged by other fraud checks, those interviews are removed.

11. Due to an oversight when allocating questions to different split sample modules, the physical heritage questions were asked of one subset of respondents, whilst the digital heritage questions were asked of a different subset of respondents. This means we cannot produce a figure for total heritage engagement (physical or digital) or a figure for engaging both physically and digitally in Q1 and Q2. We are in the process of rectifying this for Q3 and Q4.