Official Statistics

Participation Survey April to June 2024 Technical Report

Published 26 September 2024

Applies to England

September 2024

© Verian 2024

1. Introduction

1.1 Background to the survey

In 2021, the Department for Culture, Media and Sport (DCMS) commissioned Verian to design and deliver a new, nationally representative ‘push-to-web’ survey to assess adult participation in DCMS sectors across England. The survey served as a successor to the Taking Part Survey, which ran for 16 years as a continuous face-to-face survey.

This technical note relates to the 2024/25 Participation Survey Quarter 1 fieldwork, conducted between 10th April and 1st July 2024.

The scope of the survey is to deliver a nationally representative sample of adults (aged 16 years and over) and to assess adult participation in DCMS sectors across England. The data collection model for the Participation Survey is based on ABOS (Address-Based Online Surveying), a type of ‘push-to-web’ survey method. Respondents take part either online or by completing a paper questionnaire. In 2023/24, the target respondent sample size was boosted to 175,000. However, in 2024/25, the survey was not boosted, and therefore the target respondent sample consists of approximately 33,000, aligning more closely with the target sample sizes from the interim surveys conducted between 2021 and 2023. Fieldwork will run across four quarters (April to June 2024, July to September 2024, October to December 2024 and January to March 2025).

1.2 Survey objectives

  • To inform and monitor government policy and programmes in DCMS and other government departments (OGDs) on adult engagement with the DCMS and digital sectors [footnote 1]. The survey will also gather information on demographics (for example, age, sex, education).

  • To assess the variation in engagement with cultural activities across DCMS sectors in England, and the differences by socio-demographic characteristics such as location, age, education, and income.

1.3 Survey design

The basic ABOS design is simple: a stratified random sample of addresses is drawn from the Royal Mail’s postcode address file and an invitation letter is sent to each one, containing username(s) and password(s) plus the URL of the survey website. Sampled individuals can log on using this information and complete the survey as they might any other web survey. Once the questionnaire is complete, the specific username and password cannot be used again, ensuring data confidentiality from others with access to this information.

It is usual for at least one reminder to be sent to each sampled address and it is also usual for an alternative mode (usually a paper questionnaire) to be offered to those who need it or would prefer it. It is typical for this alternative mode to be available only on request at first. However, after nonresponse to one or more web survey reminders, this alternative mode may be given more prominence.

Paper questionnaires ensure coverage of the offline population and are especially effective with sub-populations that respond to online surveys at lower-than-average levels. However, paper questionnaires have measurement limitations that constrain the design of the online questionnaire and also add considerably to overall cost. For the Participation Survey, paper questionnaires are used in a limited and targeted way, to optimise rather than maximise response.

2. Sampling

2.1 Sample design: addresses

The address sample design is intrinsically linked to the data collection design (see ‘Details of the data collection model’ below) and was designed to yield a respondent sample that is representative with respect to neighbourhood deprivation level and age group within each of the 33 ITL2 regions [footnote 2] in England. This approach limits the role of weights in the production of unbiased survey estimates, narrowing confidence intervals compared with other designs.

The design also sought a minimum four-quarter respondent sample size of 900 for each ITL2 region. Although there were no specific targets per quarter, the sample selection process was designed to ensure that the respondent sample size per ITL2 region was approximately the same per quarter.

As a first step, a stratified master sample of 231,000 addresses in England was drawn from the Postcode Address File (PAF) ‘small user’ subframe. Before sampling, the PAF was disproportionately stratified by ITL2 region (33 strata) and, within region, sorted by (i) lower tier local authority, (ii) neighbourhood deprivation level (five groups, each of a similar scale at the national level), (iii) super output area, and finally (iv) postcode. This ensured that the master sample of addresses was geodemographically representative within each stratum.

This master sample of addresses was then augmented by data supplier CACI. For each address in the master sample, CACI added the expected number of resident adults in each ten-year age band. Although this auxiliary data will have been imperfect, investigations by Verian have shown that it is highly effective at identifying households that are mostly young or mostly old. Once this data was attached, the master sample was additionally stratified by expected household age structure based on the CACI data:

(i) all aged 35 or younger (19% of the total)

(ii) all aged 65 or older (21% of the total)

(iii) all other addresses (60% of the total).

The conditional sampling probability in each stratum was varied to compensate for (expected) residual variation in response rate that could not be ‘designed out’, given the constraints of budget and timescale. The underlying assumptions for this procedure were derived from empirical evidence obtained from the 2023/24 Participation Survey.

Verian drew a stratified random sample of 116,894 addresses from the master sample of 231,000 and systematically allocated them with equal probability to 48 equal-sized ‘replicates’, each with the same profile and scale (2,435-2,436 addresses). The expectation was that only the first 32 replicates would be issued (that is, 77,934 addresses), with the remaining 16 kept back in reserve.

There are eight issue points across the full 2024/25 survey period (two per quarter), so the expectation was that the next four available replicates in each stratum (ITL2 region) would be activated at each issue point (that is, eight per quarter and 32 over the whole survey period). The intention was to carry out a stratum-level review towards the end of each quarter to inform the selection of replicates for the following quarter.

For quarter 1, the first eight replicates were issued in every stratum (that is, as planned). In total, 19,488 addresses were issued for quarter 1.
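As a rough illustration of the replicate arithmetic above, the sketch below deals 116,894 addresses round-robin into 48 replicates and sums the first 32 (the planned issue) and the first eight (quarter 1). The round-robin rule and the use of list positions as address identifiers are assumptions made for illustration; the report does not describe the allocation routine itself, which operated within strata.

```python
# Illustrative sketch only: positions stand in for sorted sampled addresses.
N_ADDRESSES = 116_894   # addresses drawn from the master sample
N_REPLICATES = 48       # replicates created
N_PLANNED = 32          # replicates expected to be issued over the year
N_ISSUE_POINTS = 8      # two issue points per quarter


def allocate_replicates(n_addresses: int, n_replicates: int) -> list[list[int]]:
    """Deal addresses into replicates in turn (assumed systematic rule)."""
    replicates: list[list[int]] = [[] for _ in range(n_replicates)]
    for position in range(n_addresses):
        replicates[position % n_replicates].append(position)
    return replicates


replicates = allocate_replicates(N_ADDRESSES, N_REPLICATES)
sizes = [len(r) for r in replicates]

per_issue_point = N_PLANNED // N_ISSUE_POINTS          # replicates per issue point
planned_issue = sum(sizes[:N_PLANNED])                 # first 32 replicates
quarter_1_issue = sum(sizes[:2 * per_issue_point])     # two issue points

print(per_issue_point)   # 4
print(planned_issue)     # 77934
print(quarter_1_issue)   # 19488
```

Under this assumed rule the totals agree with the 77,934 planned addresses and the 19,488 addresses issued in quarter 1.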

Table 1 shows the quarter 1 (issued) sample structure with respect to the major ‘design’ strata: neighbourhood deprivation level and expected household age structure.

Table 1: Initial address issue by area deprivation quintile group and expected household age structure

| Expected household age structure | Most deprived | 2nd | 3rd | 4th | Least deprived |
| --- | --- | --- | --- | --- | --- |
| All <=35 | 1,095 | 1,020 | 764 | 729 | 393 |
| Other | 2,634 | 2,824 | 2,460 | 2,166 | 1,931 |
| All >=65 | 628 | 681 | 796 | 730 | 637 |

2.2 Sample design: individuals within sampled addresses

All resident adults aged 16+ were invited to complete the survey. In this way, the Participation Survey avoided the complexity and risk of selection error associated with remote random sampling within households.

However, for practical reasons, the number of logins provided in the invitation letter was limited. The number of logins was varied between two and four, with this total adjusted in reminder letters to reflect household data provided by prior respondent(s). Addresses that CACI data predicted contained only one adult were allocated two logins; addresses predicted to contain two adults were allocated three logins; and other addresses were allocated four logins. The mean number of logins per address was 2.8. Paper questionnaires were available to those who were offline, not confident online, or unwilling to complete the survey this way.
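The login allocation rule can be written as a small function. This is a minimal sketch: the function name and the example household sizes are hypothetical, and the adjustment made in reminder letters is not modelled.

```python
def logins_for_address(predicted_adults: int) -> int:
    """Number of logins in the invitation letter, given the predicted adult count."""
    if predicted_adults == 1:
        return 2
    if predicted_adults == 2:
        return 3
    return 4  # all other addresses


# Hypothetical predictions for six addresses (not survey data).
predicted = [1, 2, 2, 3, 1, 4]
logins = [logins_for_address(n) for n in predicted]

print(logins)                     # [2, 3, 3, 4, 2, 4]
print(sum(logins) / len(logins))  # 3.0
```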

2.3 Details of the data collection model

Table 2 summarises the data collection design within each stratum, showing the number of mailings and type of each mailing: push-to-web (W) or mailing with paper questionnaires (P). For example, ‘WWP’ means two push-to-web mailings and a third mailing with paper questionnaires included alongside the web survey login information. In general, there was a two-week gap between mailings.

Table 2: Data collection design by stratum

| Expected household age structure | Most deprived | 2nd | 3rd | 4th | Least deprived |
| --- | --- | --- | --- | --- | --- |
| All <=35 | WWPW | WWWW | WWWW | WWW | WWW |
| Other | WWPW | WWW | WWW | WWW | WWW |
| All >=65 | WWPW | WWPW | WWP | WWP | WWP |
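The mailing codes in Table 2 can be decoded mechanically. The sketch below stores the table as a lookup and expands each code into a mailing plan. The dictionary layout, the function name, and the fixed two-week spacing are illustrative assumptions (the report says only that the gap was two weeks "in general").

```python
QUINTILES = ["Most deprived", "2nd", "3rd", "4th", "Least deprived"]

# Table 2, keyed by expected household age structure.
DESIGN = {
    "All <=35": ["WWPW", "WWWW", "WWWW", "WWW", "WWW"],
    "Other":    ["WWPW", "WWW", "WWW", "WWW", "WWW"],
    "All >=65": ["WWPW", "WWPW", "WWP", "WWP", "WWP"],
}


def mailing_plan(age_structure: str, quintile: str) -> list[dict]:
    """Expand a code such as 'WWP' into one entry per mailing."""
    code = DESIGN[age_structure][QUINTILES.index(quintile)]
    return [
        {
            "mailing": number,
            "approx_week": 2 * (number - 1),   # assumed two-week spacing
            "paper_included": letter == "P",   # P = paper questionnaires enclosed
        }
        for number, letter in enumerate(code, start=1)
    ]


for step in mailing_plan("All >=65", "3rd"):
    print(step)
```

Read this way, the table encodes a simple rule: a paper mailing appears only for addresses in the most deprived quintile group or where all residents are expected to be aged 65 or older.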

3. Questionnaire

3.1 Questionnaire development

The online questionnaire was designed to take an average of 30 minutes to complete. A modular design was used with around half of the questionnaire made up of a core set of questions asked of the full sample. The remaining questions were split into three separate modules, randomly allocated to a subset of the sample.

The postal version of the questionnaire included the same set of core questions asked online, but the modular questions were omitted to avoid overly burdening respondents who complete the survey on paper, and to encourage response. Copies of the online and paper questionnaires are available online.

3.2 Questionnaire changes

As a result of a partnership between Arts Council England (ACE) and DCMS, the Participation Survey was boosted in the 2023/24 survey year to produce meaningful estimates at the Local Authority (LA) level. In 2024/25, the survey is not boosted by ACE; however, looking ahead to 2026/27, ACE plans to boost the Participation Survey to LA level every 3 years. As a result, the questions on the following topics have been removed from the 2024/25 Participation Survey:

  • Environment, which included questions on mode of transport taken while travelling to an arts and cultural event, distance travelled, and reason(s) for transportation choice.

  • Social prescribing, which included questions on the respondent’s experience with social prescribing, and the types of activities they were referred to.

  • Further questions on arts and culture engagement, which included questions on the types of classes and clubs respondents have taken part in, the frequency and reason(s) for their involvement, the impact/benefits of participating, and for non-participants, the reason for not participating.

  • Pride in local area, which included questions on respondents’ sense of belonging and pride in their local area, the role culture plays in choosing where to live, and the current arts and culture scene in their local area.

Questions on the following topics of interest were added to the 2024/25 Participation Survey, as requested by DCMS:

  • Archives, which assessed respondents’ use of archives or record offices in England, including the frequency and type of use (in-person or online), activities undertaken, and reasons for not using these services.

  • A question was added to the Heritage section to determine whether the respondent made a voluntary donation during their last visit to a heritage site.
  • A question was added to the Museums and Cultural Property section on reported knowledge and familiarity with the codes of practice related to metal detecting and mudlarking activities.
  • A question was added to the Digital Skills and Infrastructure section to capture where respondents found information about smart device security features before buying.

  • Further questions were added to the Mobile Internet section, on current access to mobile connectivity at home and, for those without access, how much they would be willing to pay for it per month.

  • A question was added to the Data section to measure comfort with public sector organisations using data for patterns, trends, and decision-making.

  • A question was added on awareness of government plans to commemorate and remember the impacts of the Covid-19 pandemic through various initiatives.

  • A question was added measuring satisfaction with the quality of cultural activities near the respondent’s home.

Cognitive testing was not required for this survey year, as there were no substantial changes to the questions. Additionally, many of the questions related to archives were previously included in DCMS’ Taking Part survey, the predecessor to the Participation Survey.

4. Fieldwork

4.1 Contact procedures

All selected addresses were sent an initial invitation letter containing the following information:

  • A brief description of the survey
  • The URL of survey website (used to access the online script)
  • A QR code that can be scanned to access the online survey
  • Log-in details for the required number of household members
  • An explanation that participants will receive a £10 shopping voucher
  • Information about how to contact Verian in case of any queries

The reverse of the letter featured responses to a series of Frequently Asked Questions.

All non-responding addresses were sent two reminder letters, at the end of the second and fourth weeks of fieldwork respectively. A pre-selected subset of non-responding addresses (see Table 2) was sent a third reminder letter at the end of the sixth week of fieldwork. The information contained in the reminder letters was similar to the invitation letters, with slightly modified messaging to reflect each reminder stage.

As well as the online survey, respondents were given the option to complete a paper questionnaire, which consisted of an abridged version of the online survey. Each letter informed respondents that they could request a paper questionnaire by contacting Verian using the email address or freephone telephone number provided, and a cut-off date for paper questionnaire requests was also included on the letters.

In addition, some addresses received up to two paper questionnaires with the second reminder letter. This targeted approach was developed based on historical data Verian has collected through other studies, which suggests that proactive provision of paper questionnaires to all addresses can actually displace online responses in some strata. Paper questionnaires were proactively provided to (i) sampled addresses in the most deprived quintile group, and (ii) sampled addresses where it was expected that every resident would be aged 65 or older (based on CACI data).

4.2 Fieldwork performance

In total, 8,753 respondents completed the survey during quarter 1 – 7,661 via the online survey and 1,092 by returning a paper questionnaire. Following data quality checks (see Chapter 5 for details), 352 respondents were removed (345 web and 7 paper), leaving 8,401 respondents in the final dataset.

This constitutes a 43% conversion rate, a 30% household-level response rate, and an individual-level response rate of 25% [footnote 3].
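These rates can be reproduced from the formulas in footnote 3. The sketch below is a worked check rather than Verian's own calculation; it assumes the numerator is the 8,401 responses in the final dataset, which is the figure that reproduces the published percentages. The number of responding households is not stated in this report, so the household-level function is defined but not evaluated.

```python
ISSUED_ADDRESSES = 19_488      # quarter 1 issue
RESPONSES = 8_401              # final dataset after quality checks (assumed numerator)
RESIDENTIAL_SHARE = 0.92       # 8% of 'small user' PAF addresses assumed non-residential
ADULTS_PER_HOUSEHOLD = 1.89    # Labour Force Survey average, adults aged 16+


def conversion_rate(responses: int, issued: int) -> float:
    """Responses per issued address."""
    return responses / issued


def household_response_rate(responding_households: int, issued: int) -> float:
    """Responding households per (estimated) residential address."""
    return responding_households / (issued * RESIDENTIAL_SHARE)


def individual_response_rate(responses: int, issued: int) -> float:
    """Responses per (estimated) eligible adult."""
    return responses / (issued * RESIDENTIAL_SHARE * ADULTS_PER_HOUSEHOLD)


print(f"{conversion_rate(RESPONSES, ISSUED_ADDRESSES):.0%}")           # 43%
print(f"{individual_response_rate(RESPONSES, ISSUED_ADDRESSES):.0%}")  # 25%
```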

For the online survey, the median completion time was 26 minutes and 44 seconds, and the average completion time was 28 minutes and 29 seconds [footnote 4].

5. Data processing

5.1 Data management

Due to the different structures of the online and paper questionnaires, data management was handled separately for each mode. Online questionnaire data was collected via the web script and, as such, was much more easily accessible. By contrast, paper questionnaires were scanned and converted into an accessible format.

For the final outputs, both sets of interview data were converted into IBM SPSS Statistics, with the online questionnaire structure as a base. The paper questionnaire data was converted to the same structure as the online data so that data from both sources could be combined into a single SPSS file.

5.2 Quality checking

Initial checks were carried out to ensure that paper questionnaire data had been correctly scanned and converted to the online questionnaire data structure. For questions common to both questionnaires, the SPSS output was compared to check for any notable differences in distribution and data setup.

Once any structural issues had been corrected, further quality checks were carried out to identify and remove any invalid interviews. The specific checks were as follows:

  1. Selecting complete interviews: Any test serials in the dataset (used by researchers prior to survey launch) were removed. Cases were also removed if the respondent reached, but did not answer, the fraud declaration statement (online: QFraud; paper: Q88).

  2. Duplicate serials check: If any individual serial had been returned in the data multiple times, responses were examined to determine whether this was due to the same person completing multiple times or due to a processing error. If they were found to be valid interviews, a new unique serial number was created, and the data was included in the data file. If the interview was deemed to be a ‘true’ duplicate, the more complete or earlier interview was retained.

  3. Duplicate emails check: If multiple interviews used the same contact email address, responses were examined to determine if they were the same person or multiple people using the same email. If the interviews were found to be from the same person, only the most recent interview was retained. In these cases, online completes were prioritised over paper completes due to the higher data quality.

  4. Interview quality checks: A set of checks was carried out on the data to confirm that the questionnaire was completed in good faith and to a reasonable quality. Several parameters were used:

    a. Interview length (online check only)

    b. Number of people in household reported in interview(s) vs number of total interviews from household.

    c. Whether key questions have valid answers.

    d. Whether respondents have habitually selected the same response to all items in a grid question (commonly known as ‘flatlining’) where selecting the same responses would not make sense.

    e. How many multi-response questions were answered with only one option ticked.

Following the removal of invalid cases, 8,401 valid cases were left in the final dataset.
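Three of the interview quality parameters listed above (b, d and e) lend themselves to simple flags. The functions below are a hedged sketch: the names, the data layout, and the exact conditions are assumptions, and the report does not state the thresholds or how flags were combined before a case was removed.

```python
from typing import Optional, Sequence


def is_flatlined(grid_answers: Sequence[Optional[int]]) -> bool:
    """True when two or more answered grid items all share one response."""
    answered = [a for a in grid_answers if a is not None]
    return len(answered) >= 2 and len(set(answered)) == 1


def single_tick_count(multi_response_answers: Sequence[Sequence[int]]) -> int:
    """Count multi-response questions answered with exactly one option ticked."""
    return sum(1 for ticks in multi_response_answers if len(ticks) == 1)


def more_interviews_than_people(reported_household_size: int,
                                interviews_from_household: int) -> bool:
    """Assumed reading of check (b): more interviews than reported residents."""
    return interviews_from_household > reported_household_size


print(is_flatlined([3, 3, 3, 3]))                    # True
print(is_flatlined([3, 2, 3, None]))                 # False
print(single_tick_count([[1], [2, 5], [4], []]))     # 2
print(more_interviews_than_people(2, 3))             # True
```

In practice a flatlining flag would apply only to grids where identical answers are implausible, as parameter (d) notes.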

5.3 Data checks and edits

Upon completion of the general quality checks described above, more detailed data checks were carried out to ensure that the right questions had been answered according to questionnaire routing. This is generally correct for all online completes, as routing is programmed into the scripting software, but for paper completes, data edits were required.

There were two main types of data edits, both affecting the paper questionnaire data:

  1. Single-response question edits: If a paper questionnaire respondent had mistakenly answered a question that they weren’t supposed to, their response in the data was changed to “-3: Not Applicable”. If a paper questionnaire respondent had neglected to answer a question that they should have, they were assigned a response in the data of “-4: Not answered but should have (paper)”. If a paper questionnaire respondent had ticked more than one box for a single-response question, they were assigned a response in the data of “-5: Multi-selected for single response (paper)”.

  2. Multiple response question edits: If a paper questionnaire respondent had mistakenly answered a question that they weren’t supposed to, their response was set to “-3: Not Applicable”. If a paper questionnaire respondent had neglected to answer a question that they should have, they were assigned a response in the data of “-4: Not answered but should have (paper)”. Where the respondent had selected both valid answers and an exclusive code such as “None of these”, any valid codes were retained and the exclusive code response was set to “0”.
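The two edit rules above can be expressed as functions over a respondent's ticked boxes. This is a sketch under stated assumptions: the function names, the `routed` flag (whether routing meant the respondent should have answered), and the 0/1 storage of multiple-response items are illustrative, and unrouted questions left blank are assumed to receive -3 as well.

```python
NOT_APPLICABLE = -3   # "-3: Not Applicable"
NOT_ANSWERED = -4     # "-4: Not answered but should have (paper)"
MULTI_SELECTED = -5   # "-5: Multi-selected for single response (paper)"


def edit_single_response(ticked: list[int], routed: bool) -> int:
    """Apply the single-response edits to the boxes ticked on paper."""
    if not routed:
        return NOT_APPLICABLE
    if not ticked:
        return NOT_ANSWERED
    if len(ticked) > 1:
        return MULTI_SELECTED
    return ticked[0]


def edit_multiple_response(ticked: set[int], routed: bool,
                           all_codes: list[int],
                           exclusive_codes: set[int]) -> dict[int, int]:
    """Apply the multiple-response edits; returns one value per response code."""
    if not routed:
        return {code: NOT_APPLICABLE for code in all_codes}
    if not ticked:
        return {code: NOT_ANSWERED for code in all_codes}
    has_valid = any(code not in exclusive_codes for code in ticked)
    result = {}
    for code in all_codes:
        is_ticked = code in ticked
        # An exclusive code (e.g. "None of these") is dropped if valid codes exist.
        if is_ticked and code in exclusive_codes and has_valid:
            is_ticked = False
        result[code] = 1 if is_ticked else 0
    return result


print(edit_single_response([2, 3], routed=True))   # -5
print(edit_multiple_response({1, 9}, True, [1, 2, 3, 9], {9}))
# {1: 1, 2: 0, 3: 0, 9: 0}
```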

Other, more specific data edits were also made, as described below:

  1. Additional edits to library questions: The question CLIBRARY1 was formatted differently in the online script and paper questionnaire. In the online script it was set up as one multiple-response question, while in the paper questionnaire it consisted of two separate questions (Q15 and Q20). During data checking, it was found that many paper questionnaire respondents followed the instructions to move on from Q15 and Q20 without ticking the “No” response. To account for this, the following data edits were made:

    a. If CFRELIB12 and CPARLI12B were not answered and CNLIWHYA was answered, CLIBRARY1_001 was set to 0 if it was left blank.

    b. If CFRELIDIG and CDIGLI12 were not answered and CNLIWHYAD was answered, CLIBRARY1_002 was set to 0 if it was left blank.

    c. CLIBRARY1_003 and CLIBRARY1_004 were set to 0 for all paper questionnaire respondents.

  2. Additional edits to grid questions: Due to the way the paper questionnaire was set up, additional edits were needed for the following linked grid questions: CARTS1/CARTS1A, CARTS2/CARTS2A, CARTS3/CARTS3A, CARTS4/CARTS4A, ARTPART12/ARTPART12A

Figure 1 shows an example of a section in the paper questionnaire asking about attendance at arts events.

Figure 1: Example of the CARTS1 and CARTS1A section in the paper questionnaire

Marking the option “Not in the last 12 months” on the paper questionnaire was equivalent to the code “0: Have not done this” at CARTS1 in the online script. As such, leaving this option blank in the questionnaire would result in CARTS1 being given a default value of “1” in the final dataset. In cases where a paper questionnaire respondent had neglected to select any of the options in a given row, CARTS1 was recoded from “1” to “0”.

If the paper questionnaire respondent did not tick any of the boxes on the page, they were recoded to “-4: Not answered but should have (paper)”.
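The library and grid edits described in this section can be sketched as follows. The variable names are those given in the text; representing a blank answer as `None`, holding a paper record as a dictionary, and the function names are assumptions made for illustration. The edits are assumed to apply to paper questionnaire records only.

```python
def edit_library(record: dict) -> dict:
    """Apply library edits (a) to (c) to one paper questionnaire record."""
    edited = dict(record)

    def blank(name: str) -> bool:
        return edited.get(name) is None

    # (a) No physical-library follow-ups answered, but a reason for non-use given.
    if (blank("CFRELIB12") and blank("CPARLI12B")
            and not blank("CNLIWHYA") and blank("CLIBRARY1_001")):
        edited["CLIBRARY1_001"] = 0
    # (b) Same logic for the digital-library items.
    if (blank("CFRELIDIG") and blank("CDIGLI12")
            and not blank("CNLIWHYAD") and blank("CLIBRARY1_002")):
        edited["CLIBRARY1_002"] = 0
    # (c) Always set to 0 for paper respondents.
    edited["CLIBRARY1_003"] = 0
    edited["CLIBRARY1_004"] = 0
    return edited


def recode_carts1(current_value: int, ticks_in_row: int, ticks_on_page: int) -> int:
    """Recode one CARTS1 item from the paper grid."""
    if ticks_on_page == 0:
        return -4          # nothing ticked on the page: not answered (paper)
    if ticks_in_row == 0 and current_value == 1:
        return 0           # empty row: default "1" becomes "have not done this"
    return current_value


print(edit_library({"CNLIWHYA": 3}))
# {'CNLIWHYA': 3, 'CLIBRARY1_001': 0, 'CLIBRARY1_003': 0, 'CLIBRARY1_004': 0}
print(recode_carts1(1, ticks_in_row=0, ticks_on_page=5))   # 0
```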

5.4 Coding

Post-interview coding was undertaken by members of the Verian coding department. The coding department coded verbatim responses recorded for ‘other specify’ questions.

For example, if a respondent selected “Other” at CARTS1 and wrote text that said they went to some type of live music event, in the data they would be back-coded as having attended “a live music event” at CARTS1_006.

For the sets CARTS1/CARTS1A/CARTS1B, CARTS2/CARTS2A/CARTS2B and CHERVIS12/CFREHER12/CVOLHER, data edits were made to move responses coded to “Other” to the correct response code, if the answer could be back-coded to an existing response code.

5.5 Data outputs

Once the checks were complete, a final SPSS data file was created that contained only valid interviews and edited data.

From this dataset, a set of data tables was produced. Because of the changes to the questionnaire structure, the tables have been updated accordingly.

5.6 Weighting

A three-step weighting process was used to compensate for differences in both sampling probability and response probability:

  1. An address design weight was created equal to one divided by the sampling probability; this also served as the individual-level design weight because all resident adults could respond.

  2. The expected number of responses per address was modelled as a function of data available at the neighbourhood and address levels. The step two weight was equal to one divided by the predicted number of responses.

  3. The product of the first two steps was used as the input for the final step to calibrate the sample. The responding sample was calibrated to the January-March 2024 Labour Force Survey (LFS) with respect to (i) sex by age, (ii) educational level by age, (iii) ethnic group, (iv) housing tenure, (v) ITL2 region, (vi) employment status by age, (vii) household size, (viii) presence of children in the household, and (ix) internet use by age.

An equivalent weight was also produced for the (majority) subset of respondents who completed the survey by web. This weight was needed because a few items were included in the web questionnaire but not the paper questionnaire.

It should be noted that the weighting only corrects for observed bias (for the set of variables included in the weighting matrix) and there is a risk of unobserved bias. Furthermore, the raking algorithm used for the weighting only ensures that the sample margins match the population margins. There is no guarantee that the weights will correct for bias in the relationships between the variables.
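The three steps can be sketched end to end on toy data. Everything numeric below is hypothetical (the records, sampling probabilities, predicted response counts, and population shares are not LFS or survey figures), and only two of the nine calibration dimensions are shown. The raking loop is a generic iterative proportional fitting routine, not Verian's implementation.

```python
def rake(records, base_weights, targets, max_iter=500, tol=1e-10):
    """Adjust weights until each variable's weighted shares match its targets."""
    weights = list(base_weights)
    for _ in range(max_iter):
        largest_shift = 0.0
        for variable, shares in targets.items():
            total = sum(weights)
            current = {}
            for rec, w in zip(records, weights):
                current[rec[variable]] = current.get(rec[variable], 0.0) + w
            # Scale each category so its weighted share equals the target share.
            factors = {cat: shares[cat] * total / current[cat] for cat in shares}
            weights = [w * factors[rec[variable]] for rec, w in zip(records, weights)]
            largest_shift = max(largest_shift,
                                max(abs(f - 1.0) for f in factors.values()))
        if largest_shift < tol:
            break
    return weights


def weighted_share(records, weights, variable, category):
    total = sum(weights)
    return sum(w for r, w in zip(records, weights) if r[variable] == category) / total


# Hypothetical respondents.
records = [
    {"sex": "F", "age": "16-34"}, {"sex": "F", "age": "35+"}, {"sex": "F", "age": "35+"},
    {"sex": "M", "age": "16-34"}, {"sex": "M", "age": "35+"}, {"sex": "M", "age": "35+"},
]
sampling_prob = [0.002, 0.001, 0.001, 0.002, 0.001, 0.001]   # step 1 input
predicted_responses = [0.4, 0.6, 0.6, 0.3, 0.5, 0.5]         # step 2 model output

design_weight = [1 / p for p in sampling_prob]                # step 1
step_two_weight = [1 / r for r in predicted_responses]        # step 2
base = [d * s for d, s in zip(design_weight, step_two_weight)]

targets = {                                                   # hypothetical margins
    "sex": {"F": 0.51, "M": 0.49},
    "age": {"16-34": 0.30, "35+": 0.70},
}
final = rake(records, base, targets)                          # step 3

print(round(weighted_share(records, final, "sex", "F"), 4))      # 0.51
print(round(weighted_share(records, final, "age", "16-34"), 4))  # 0.3
```

The sketch also shows the limitation noted above: only the two margins are forced to match, so the weighted share of, say, women aged 16-34 is whatever the algorithm happens to produce.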

The final weight variables in the dataset are:

  • ‘Finalweight’ – to be used when analysing data available from both the web and paper questionnaires.
  • ‘Finalweightweb’ – to be used when analysing data available only from the web questionnaire.

Footnotes

  1. In February 2023, there was a Machinery of Government (MoG) change and responsibility for digital policy now sits within the Department for Science, Innovation and Technology (DSIT). This MoG change did not affect the contents of the Participation Survey for 2023/24; digital questions are still part of the survey.

  2. International Territorial Level (ITL) is a geocode standard for referencing the subdivisions of the United Kingdom for statistical purposes, used by the Office for National Statistics (ONS). Since 1 January 2021, the ONS has encouraged the use of ITL as a replacement to Nomenclature of Territorial Units for Statistics (NUTS), with lookups between NUTS and ITL maintained and published until 2023. 

  3. Response rates were calculated via the standard ABOS method. An estimated 8% of ‘small user’ PAF addresses in England are assumed to be non-residential (derived from interviewer administered surveys). The average number of adults aged 16+ per residential household, based on the Labour Force Survey, is 1.89. Thus, the response rate formula: Household RR = number of responding households / (number of issued addresses * 0.92); Individual RR = number of responses / (number of issued addresses * 0.92 * 1.89). The conversion rate is the simple ratio of the number of responses to the number of issued addresses.

  4. Interview lengths under 2 minutes are removed, and the remaining lengths are capped at the 97th percentile. Interviews under 10 minutes are flagged in the system for the research team to evaluate; if they are also flagged by other fraud checks, those interviews are removed.
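The treatment of interview lengths in footnote 4 can be sketched as follows. The order of operations (drop, then compute the cap on what remains), the "inclusive" percentile definition, and the function name are assumptions; the report does not specify them. The example lengths are hypothetical.

```python
import statistics

MIN_VALID_MINUTES = 2       # shorter interviews are removed
REVIEW_BELOW_MINUTES = 10   # shorter interviews are flagged for review


def summarise_lengths(lengths_minutes: list[float]) -> dict:
    """Drop very short interviews, cap long ones, and summarise."""
    kept = [x for x in lengths_minutes if x >= MIN_VALID_MINUTES]
    # 97th percentile: index 96 of the 99 cut points returned for n=100.
    cap = statistics.quantiles(kept, n=100, method="inclusive")[96]
    capped = [min(x, cap) for x in kept]
    return {
        "cap": cap,
        "median": statistics.median(capped),
        "mean": statistics.fmean(capped),
        "flagged_for_review": [x for x in kept if x < REVIEW_BELOW_MINUTES],
    }


# Hypothetical completion times in minutes, including one left open for hours.
lengths = [1.5, 5, 12, 20, 25, 26, 27, 28, 30, 35, 40, 600]
summary = summarise_lengths(lengths)
print(summary["median"])              # 27
print(summary["flagged_for_review"])  # [5]
```

Capping limits the influence of sessions left open for long periods on the mean, while leaving the median unchanged in this example.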