Technical Report
Updated 19 September 2024
Applies to England
1. Introduction
The People and Nature Surveys (PaNS) for England are national surveys of how people in England engage with, perceive the quality of, benefit from, and take action to protect the natural environment. PaNS has been delivered in its current form since 2020. Prior to 2020, the project was delivered as the Monitor of Engagement with the Natural Environment survey (MENE).
PaNS comprises two nationally representative surveys which are carried out annually:
-
a survey of c. 25,000 adults known as the Adults’ People and Nature Survey (A-PaNS)
-
a survey of 4,000 children known as the Children’s People and Nature Survey (C-PaNS).
This report describes the key technical features of PaNS. It replaces an earlier technical report for the project published in 2020. The report covers the following elements of technical survey delivery:
-
‘Introduction’ covers the background to the project and its development.
-
‘Questionnaires and datasets’ details the structure of the questionnaires and how they were developed.
-
‘Sample design’ describes the samples used for the adults’ and children’s surveys and summarises fieldwork performance to date.
-
‘Weighting’ provides details of how weightings have been applied to survey data.
-
‘Data production and publication’ outlines the data processing and quality assurance processes survey data undergoes prior to publication.
For more details about PaNS, please visit the main PaNS Gov.uk pages. If you have any questions not answered online, please contact the team directly at people_and_nature@naturalengland.org.uk.
1.1 Background to the project
Natural England has been working in partnership with the Department for Environment, Food and Rural Affairs (Defra) to deliver PaNS since 2020. The purpose of the surveys is to gather data and carry out analysis that helps government to:
-
Understand how people engage with the natural environment through visits and other forms of engagement.
-
Measure the benefits of engagement with the natural environment, including its effect on health and wellbeing.
-
Develop insight into perceptions of the quality of the natural environment in England.
-
Understand environmental attitudes and the actions people take at home, in the garden and in the wider community to protect the environment.
-
Monitor changes in these things over time, at a range of different spatial scales and for key groups within the population.
The full list of research questions underpinning PaNS is included in an annex to this technical report.
Evidence generated from PaNS supports the development of evidence-led policy, enabling Natural England and Defra to develop better policy and deliver more effectively. Data from the surveys is also used to measure progress against the Government’s Environmental Improvement Plan.
PaNS replaced MENE, which ran between 2009 and 2019. The main differences between PaNS and MENE are:
-
PaNS data is collected via an online survey, whereas MENE data was collected via face-to-face interviews.
-
MENE did not have a survey of children, whereas PaNS does.
-
The questionnaire and sample frame have been further developed since MENE was delivered.
Due to these differences, it is not generally recommended that data from MENE be compared to PaNS data. A longer guidance note is available providing greater detail about the comparability of the two surveys.
1.2 Delivery of the survey
Since 2020, Verian (known as Kantar Public until November 2023) has been commissioned to deliver the surveys. Data for both surveys is collected via an online panel. The adults’ survey collects data from c. 25,000 adults (aged 16+) throughout the year, with roughly 2,000 respondents surveyed every month since April 2020. The children’s survey collects data from c. 4,000 children and young people (aged 8-15) each year. This data is collected in two waves, one during the school summer holidays in August and one in term time in September.
Once data has been collected by Verian, it undergoes quality assurance checks from Verian and Natural England and then final quality control checks from Natural England prior to publication.
1.3 Accredited Official Statistics
Since 29 November 2023, statistics from PaNS have been published as accredited official statistics. The PaNS team has worked with the Office for Statistics Regulation (OSR) to ensure that the pillars of trustworthiness, quality and value in the UK Statistics Authority’s Code of Practice for Statistics have been adhered to throughout the project.
During earlier stages of the project, statistics were published as:
-
Official statistics in development (between April 2020 and March 2021).
-
Official statistics (between April 2021 and November 2023).
2. Questionnaires and datasets
2.1 Adults’ questionnaire
Structure of the adults’ questionnaire
The adults’ questionnaire has a modular design, which maximises the number of questions asked in the survey while keeping the survey length from becoming overly burdensome for respondents. After a short screening section, there are six modules (M1-M6). Only M1 (General Experiences) and M6 (Demographics and Wellbeing) are asked of all respondents. M2A (Visits taken), M2B (No visits taken), M3 (Children), M4 (Environmental Attitudes), and M5 (Gardens) are asked of different subsets of respondents. The latest questionnaire document provides an overview of how participants are filtered through the modules (on page three of the latest pdf document).
Questionnaire development
In November 2019, Verian hosted a questionnaire design workshop, attended by stakeholders from within Natural England, Defra, other government and NGO users and several academics. The workshop provided an opportunity to understand the priority areas for users and what topic areas needed to be added ahead of questionnaire testing.
Following the workshop and development of initial questions, Verian conducted two stages of cognitive and usability testing in January 2020 in several locations (London, Leeds and Birmingham). The aim of cognitive testing is to examine if respondents understand the questions correctly and if they can provide accurate and consistent answers. The interviews included usability testing – checking that respondents were able to easily complete the survey online, including use of the survey map to accurately select the location of the most recent trip to a green and natural space.
In total, 49 interviews were completed. All interviews were conducted in person.
A ‘live trial’ pilot survey was conducted on the Kantar Profiles online panel among c. 1,040 adults in February 2020 to test the questionnaire length and to check that the modular design was working correctly and that questions were being answered sensibly.
The questionnaire content and structure have evolved over time based on periodic reviews, the requirement to monitor the impact of the COVID-19 pandemic on engagement with green and natural spaces, and the requirement to capture new areas of policy interest. The latest copy of the questionnaire can be found in Natural England’s Access to Evidence catalogue.
Geocoding and spatially derived data
The adults’ survey captures both the start point and the destination of the main visit for respondents who report having made any visits to green and natural spaces in the previous 14 days. Geocoding of visits allows Natural England and Defra to better understand geographic patterns of visiting. The data also feeds into the analysis for the Outdoor Recreation Valuation tool.
For the majority of visits taken, the start point is the respondent’s home. Geocoding is added for respondents who provide a valid home postcode in England. 81% of all survey respondents have provided a valid home postcode. Data for a respondent’s local authority is taken from a combination of postcode-derived data and (for those who did not provide a postcode) self-reported local authority. The main visit location coordinates are captured using the interactive map in module 2. The map uses the Google Maps API to return latitude/longitude coordinates, which are then matched against a number of external sources to return further geographic variables.
The following geographic variables are appended to the data for home and visit locations (where applicable):
-
Lower layer super output area
-
Indices of multiple deprivation
-
Local authority
-
Upper tier local authority
-
[Home location only] Urban/Rural status
-
National Character Areas
-
Local Nature Recovery Strategy
-
Sub Agricultural Landscape Types
Each quarter Natural England and Verian check that the sources used are the most up to date or whether they have been superseded. If location source data has been updated the entire dataset is updated to take this into account. The updated list of source data used to produce PaNS data can be found in the Annex.
Natural England and Verian check that visit and home postcode locations are within UK territory. Visit locations outside UK territory, and invalid postcodes, are removed from the dataset.
2.2 Children’s questionnaire
Structure of the children’s questionnaire
The children’s questionnaire has been designed to be easy to understand and fill out, with screening questions for parents prior to the child or young person completing the survey. There is no modular structure to C-PaNS and therefore all participants are eligible to answer all questions. There are two waves of the children’s survey each year, one in term time and one in the school holidays. Questions relating to experiences of nature at school are only asked in the term time wave of the survey.
Questionnaire development
In early 2020, Verian conducted informal scoping interviews with 5 key stakeholders to understand what users were looking to capture from a children’s survey. These interviews were conducted with stakeholders representing Natural England, Defra, Groundwork, UK Youth and Step Up to Serve.
This was followed by 12 face-to-face qualitative ‘triads’ (interviews with 3 children at a time). The aim of the qualitative research was to better understand how children think and talk about nature and environmental issues.
The initial survey incorporated feedback from the stakeholder interviews and qualitative triad interviews with children.
The C-PaNS survey was designed to be clear and easy to understand for children using simplified language, icons and sliding bars for easy to use response lists. Screening questions were written for parents before handing over the survey to their child. The survey received ethical approval from Natural England’s Ethics Committee on the basis that the questionnaire made sure that Verian received explicit permission from both the parent and child to conduct the interview.
Cognitive testing has been conducted each successive year for any changes to the questionnaire content. The latest copy of the questionnaire is saved on the Natural England website.
C-PaNS was first completed as a pilot in 2020. Following this pilot the survey has been adapted for use on an on-going basis from 2021 onwards, both in school holidays and term-time, to allow trends in attitudes and behaviours over time to be examined.
Geocoding and spatially derived data
From Year 2 (2022) onwards, during the initial screening section of the questionnaire, parents are asked to provide their postcode to allow further analysis by geographic area. The survey confirms that data is processed and stored securely in accordance with Natural England’s Privacy Policy and that the data will only be used for research purposes related to C-PaNS. There is a clear option to not provide postcode.
Verian match postcode to data from the ONS National Statistics Postcode Lookup file. This data is used to derive the local authority, upper tier local authority and urban/rural status of each respondent.
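The matching step can be pictured as a dictionary lookup on a normalised postcode. The sketch below is illustrative only: the lookup rows, the field names and the `ZZ1 1ZZ` postcode are invented, and the real National Statistics Postcode Lookup file has its own column names and many more fields.

```python
def normalise_postcode(postcode):
    """Upper-case a postcode and remove all whitespace so formats compare equal."""
    return "".join(postcode.split()).upper()


# Invented stand-in for rows read from a postcode lookup file
LOOKUP = {
    "ZZ11ZZ": {
        "local_authority": "Example District",
        "upper_tier_local_authority": "Example County",
        "urban_rural": "Urban",
    },
}


def derive_geography(postcode):
    """Return derived geography for a postcode, or None if missing or unmatched."""
    if not postcode:
        return None  # parents may choose not to provide a postcode
    return LOOKUP.get(normalise_postcode(postcode))


matched = derive_geography("zz1 1zz")
unmatched = derive_geography("YY9 9YY")
declined = derive_geography("")
```

Returning `None` for unmatched or missing postcodes keeps those respondents in the data while leaving their geographic variables blank.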
3. Sample design
3.1 Overall approach to sample design
A-PaNS data is collected through the Kantar Profiles online panel. Specifically, the England subset of Kantar’s global online Profiles panel is the main sample source. The target sample size of 25,000 surveys per year allows for robust stand-alone analysis to be conducted by region and key sub-groups as well as reliable time series data, offering a monthly sample of around 2,080 interviews.
Survey quota targets are set on a monthly basis, with the aim of achieving an even spread of surveys throughout each month.
Quotas are set to achieve a representative sample (of English adults) and compensate for known biases in online panels. Quotas are set by age and gender (interlocked), ethnicity, region (collapsed into three categories – North, Midlands, South) and highest educational qualification achieved (using the standardised European classification – ISCED11). The population statistics used for the quotas are sourced from the latest available ONS Mid-Year Population Estimates and Labour Force Survey, and are reviewed annually to see if they need to be updated.
3.2 Sample quotas for the adults’ survey
The target proportion of interviews each month for the different quota categories is below:
Table 1 A-PaNS quota targets per survey year
Category | April 20 - March 21 | April 21 - March 22 | April 22 - March 23 | April 23 - March 24 |
---|---|---|---|---|
Male 16-24 | 6.8% | 6.7% | 6.7% | 7.0% |
Male 25-39 | 12.5% | 12.5% | 12.5% | 12.5% |
Male 40-54 | 12.2% | 12.0% | 12.0% | 11.8% |
Male 55-64 | 7.2% | 7.3% | 7.3% | 7.3% |
Male 65+ | 10.3% | 10.4% | 10.4% | 9.8% |
Female 16-24 | 6.5% | 6.4% | 6.4% | 6.9% |
Female 25-39 | 12.5% | 12.4% | 12.4% | 13.2% |
Female 40-54 | 12.4% | 12.3% | 12.3% | 12.2% |
Female 55-64 | 7.4% | 7.6% | 7.6% | 7.6% |
Female 65+ | 12.2% | 12.4% | 12.4% | 11.7% |
Region - North | 27.7% | 27.7% | 27.7% | 27.5% |
Region - Mid | 30.2% | 30.2% | 30.2% | 30.3% |
Region - South | 42.2% | 42.1% | 42.1% | 42.2% |
Educational status - Degree + (level 6 or above) | 28.5% | 29.2% | 29.2% | 30.3% |
Educational status - No degree (level 5 or below) | 71.5% | 70.8% | 70.8% | 69.7% |
Ethnicity - White | 86.5% | 86.1% | 86.1% | 84.7% |
Ethnicity – Ethnic minority groups | 13.5% | 13.9% | 13.9% | 15.3% |
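
As a rough illustration of how the percentages in Table 1 relate to monthly interview counts, the sketch below divides the annual target by 12 and applies three of the April 23 - March 24 proportions. How Verian converts quota proportions into working targets is not described in this report, so the rounding used here is an assumption.

```python
ANNUAL_TARGET = 25_000

# Monthly target implied by the annual target
monthly_target = round(ANNUAL_TARGET / 12)

# Three quota proportions taken from the April 23 - March 24 column of Table 1
quota_share = {
    "Male 16-24": 0.070,
    "Female 25-39": 0.132,
    "Region - North": 0.275,
}

# Approximate number of monthly interviews implied by each proportion
monthly_quota = {
    category: round(monthly_target * share)
    for category, share in quota_share.items()
}
```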
3.3 Sample quotas for the children’s survey
Each wave of the survey aims to gather responses from 2000 children and young people aged 8 to 15 through Kantar’s Profiles panel. Each year there are two waves, one covering school holidays and one covering term time. In each wave, the target is a representative sample of 1500 children and a ‘boost’ of 500 children from ethnic minority groups. This was a pragmatic sample size, chosen based on the resources available and the representative sample size achievable within the time frame of the survey.
Sampling quotas are based on age, region and ethnicity. The population statistics used for the quotas are sourced from the latest available ONS Mid-Year Population Estimates (or Census) and Labour Force Survey and are subject to change on an annual basis as updated population statistics are released. These population statistics are reviewed annually and if necessary, quota targets are updated in accordance with these. Quotas for the nationally representative sample of 1500 children (excluding the boost of 500 children from ethnic minority groups) are in the table below.
Table 2 C-PaNS quota targets per wave
Category | Wave 1-4 target % | Wave 5-6 target % |
---|---|---|
Age – 8 to 11 | 51.5% | 50.8% |
Age – 12 to 15 | 48.5% | 49.2% |
Region – North | 27.2% | 27.5% |
Region - Mid | 30.5% | 30.8% |
Region - South | 42.3% | 41.7% |
Ethnicity - White | 78.2% | 77.5% |
Ethnicity – Ethnic minority groups | 21.8% | 22.5% |
3.4 Fieldwork performance
The tables linked [the tables will be available from 2nd October 2024] show the target proportion of surveys to be achieved for each quota category and the actual proportion of surveys completed per category per year, for the adults’ and children’s surveys.
The boost of 500 children from ethnic minority groups means the overall sample does not match the quota targets in Table 2, since those targets apply to the nationally representative sample of 1500 children.
4. Weighting
4.1 Overall approach to weighting
Weighting is used to ensure that each sample is representative of the population it is drawn from (adults, or children and young people, in England). Each weight variable in the data gives every respondent a value which represents the weight their response should carry in the overall analysis. A weighted count for any answer is produced by summing the weights of the respondents who gave that answer. This section includes information on the development of the different weights used in the adults’ and children’s surveys.
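As a minimal sketch of how a weight variable is applied, the example below sums respondent weights for each answer to give weighted counts, then divides by the total weight to give weighted percentages. The records and the `visited` question are invented for illustration; only the variable name `Weight_Percent` comes from this report.

```python
from collections import defaultdict


def weighted_counts(records, question, weight):
    """Sum respondent weights for each answer to `question`."""
    totals = defaultdict(float)
    for row in records:
        totals[row[question]] += row[weight]
    return dict(totals)


def weighted_percentages(records, question, weight):
    """Express each weighted count as a percentage of the total weight."""
    counts = weighted_counts(records, question, weight)
    total = sum(counts.values())
    return {answer: 100.0 * value / total for answer, value in counts.items()}


# Invented example data
records = [
    {"visited": "Yes", "Weight_Percent": 1.5},
    {"visited": "Yes", "Weight_Percent": 0.5},
    {"visited": "No", "Weight_Percent": 1.0},
    {"visited": "No", "Weight_Percent": 1.0},
]
counts = weighted_counts(records, "visited", "Weight_Percent")
percentages = weighted_percentages(records, "visited", "Weight_Percent")
```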
4.2 Weighting of the adults’ survey
Interim weight development
For the first year of the survey (April 2020 – March 2021), monthly indicators were generated and published using an interim weight whilst a bespoke People and Nature Survey weight was being developed. The interim weight was based on the weighting scheme developed for the MENE survey, using a similar approach to that taken in Year 10 of MENE. It made the sample representative of the English adult population according to the latest population estimates available from the Office for National Statistics.
Data was weighted to minimise non-response bias across observable demographic characteristics. This demographic non-response “rim” weight was created using a raking calibration. Rim weighting is an iterative process, ending with a respondent profile that matches the population profile on several dimensions simultaneously.
The demographic categories used in the interim weight were:
-
Age by gender (interlocked)
-
Region
-
Urban/Rural status
-
Presence of children aged under 16 in the household
-
Gender by working status (interlocked)
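The raking calibration described above can be sketched in a few lines. This is a generic illustration of rim weighting, not Verian's implementation: the two variables (`sex`, `region`), the target shares and the five respondents are invented, and the sketch assumes every target category has at least one respondent.

```python
def rake(records, targets, start_weights=None, max_iter=100, tol=1e-8):
    """Iteratively adjust weights so weighted margins match `targets`.

    records: list of dicts, one per respondent
    targets: {variable: {category: target share}}, shares sum to 1 per variable
    start_weights: optional initial weights (defaults to 1.0 for everyone)
    """
    weights = list(start_weights) if start_weights is not None else [1.0] * len(records)
    for _ in range(max_iter):
        largest_change = 0.0
        for variable, shares in targets.items():
            total = sum(weights)
            # Current weighted total in each category of this variable
            current = {category: 0.0 for category in shares}
            for row, w in zip(records, weights):
                current[row[variable]] += w
            # Scale each respondent so this margin matches its target
            for i, row in enumerate(records):
                category = row[variable]
                factor = shares[category] * total / current[category]
                largest_change = max(largest_change, abs(factor - 1.0))
                weights[i] *= factor
        if largest_change < tol:
            break  # all margins already match, so the weights have converged
    return weights


# Invented respondents and targets
records = [
    {"sex": "M", "region": "North"},
    {"sex": "M", "region": "South"},
    {"sex": "F", "region": "North"},
    {"sex": "F", "region": "South"},
    {"sex": "F", "region": "South"},
]
targets = {
    "sex": {"M": 0.5, "F": 0.5},
    "region": {"North": 0.4, "South": 0.6},
}
weights = rake(records, targets)
male_share = sum(w for r, w in zip(records, weights) if r["sex"] == "M") / sum(weights)
north_share = sum(w for r, w in zip(records, weights) if r["region"] == "North") / sum(weights)
```

Passing a list as `start_weights` lets the same routine begin from a design weight rather than from 1.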
People and Nature Survey weighting scheme development
With the launch of the Adults’ People and Nature Survey (A-PaNS) and the shift to an online methodology, Verian conducted work to develop a weighting scheme for the survey.
The purpose of weighting is to reduce the net error of survey estimates. Weighting aims to reduce bias, but this is usually at the cost of a reduction in precision (which is related to the variance of the weights as well as the sample size). To reduce bias, the variables included in the weighting need to be correlated with the key survey outcome – the number of visits made to green and natural spaces in the last 14 days.
The approach taken to develop the new weighting scheme was to:
-
Identify the demographic variables which could potentially be included in the weighting matrix and to source appropriate benchmark population statistics.
-
Conduct regression modelling to identify which of these variables are significantly associated with the number of visits which people have made in the last 14 days.
-
Assess the scheme for precision of survey estimates.
Verian initially reviewed data from the first quarter of A-PaNS fieldwork (April to June 2020). This process was reviewed after three quarters (April to December 2020). The main purpose of this review was to check whether the associations identified between the number of visits and the demographic variables still held outside of the strict Coronavirus lockdown period that coincided with Q1 of the survey.
The variables captured in the People and Nature Survey which were considered for inclusion in the weighting matrix were:
-
Urban / Rural
-
Region
-
Age by Gender (combined into a single variable)
-
Marital status
-
Working status by Gender
-
Long lasting health condition
-
Number cars / vans
-
Age by Highest qualification
-
Ethnicity
-
Dog ownership
-
Children under 16
These are variables for which robust population benchmarks exist.
Regression modelling was used to identify variables which were significantly associated with the number of visits which people have made in the last 14 days and should be included in the PaNS weighting matrix.
Nearly all proposed variables were significantly associated (p<0.05) with number of visits. Only marital status was found not to be associated and so was removed from the weighting matrix.
This process produced the standard demographic weight ‘Weight_Percent’.
4.3 Survey Weights
Weight_Percent
This weight was created by scaling the demographic weight for each month to the monthly target sample size (2,083). This weight should be used when conducting analysis of most questions within modules 1, 2, 4 and 6.
It applies to most questions within modules 2 and 4 even though these were asked of random subsets of the overall sample.
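One reading of "scaling to the monthly target sample size" is that each month's weights are rescaled to sum to 2,083 while keeping their ratios. The sketch below shows that interpretation; the four input weights are invented and the exact procedure used for PaNS is not documented here.

```python
MONTHLY_TARGET = 2083


def scale_to_target(weights, target=MONTHLY_TARGET):
    """Rescale weights so they sum to `target` while keeping their ratios."""
    factor = target / sum(weights)
    return [w * factor for w in weights]


demographic_weights = [0.8, 1.2, 1.0, 2.0]  # invented values for one month
scaled = scale_to_target(demographic_weights)
```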
The majority of questions use ‘Weight_Percent’. The questions that use a different weight are listed below.
4.4 Visit weights
Weight_Percent_M2A
This weight should be used when conducting analysis of the detailed visit information collected in module 2A.
Questions within module 2A relate to a visit which a respondent has been on. Detailed information was only collected for one visit, regardless of the number of visits which respondents reported having made in the last 14 days.
This weight should be used to calculate proportions in: M2A, M2A_Q2, M2A_Q3, M2A_Q5, M2A_Q6, M2A_Q7, M2A_Q8A, M2A_Q8B, M2A_Q8C, M2A_Q9.
Weight_Percent_M2A_SUB
This weight should be used when conducting analysis of the detailed visit information collected in module 2A_SUB. Questions within module 2A_SUB were asked to c.30% of those that responded to Module 2A.
This weight should be used to calculate proportions in: M2A_SUB, M2A_SUB_Q1, M2A_SUB_Q2, M2A_SUB_Q3, M2A_SUB_Q4A, M2A_SUB_Q5, M2A_SUB_Q6, M2A_SUB_Q7, M2A_SUB_Q8.
Module weights created due to changes in module allocation
There are different weights to use for questions in Module 5 and some questions within Module 6. These were created due to the change in module allocation selection probabilities between April 2020 and May 2020.
Weight_Percent_M5
A separate weight is required for module 5, because the randomisation for this module changed within Q1. In April 2020, this module was asked of c. 20% of respondents. From May 2020 onwards, it was asked of c. 40%. As such, with the overall weight applied (Weight_Percent), April would be under-represented in the weighted sample.
This weight should be used to calculate proportions in: M5, M5_Q1A, M5_Q1B_Old, M5_Q1B, M5_Q1C, M5_Q1D, M5_Q1E, M5_Q1F, M5_Q2, M5_Q3.
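The report does not set out how the module 5 weight was calculated. One standard way to correct for a change in allocation probability, shown here purely as an assumption, is to divide each respondent's base weight by the probability of being allocated the module in their month. The respondents below are invented; the 20% and 40% figures are the approximate shares quoted above.

```python
# Approximate module 5 allocation probabilities quoted in this report
ALLOCATION_PROBABILITY = {"2020-04": 0.20, "2020-05": 0.40}


def module_weight(base_weight, month):
    """Inflate a base weight by the inverse of the module allocation probability."""
    return base_weight / ALLOCATION_PROBABILITY[month]


# Invented module 5 respondents: one from April, two from May
respondents = [
    {"month": "2020-04", "Weight_Percent": 1.0},
    {"month": "2020-05", "Weight_Percent": 1.0},
    {"month": "2020-05", "Weight_Percent": 1.0},
]
adjusted = [module_weight(r["Weight_Percent"], r["month"]) for r in respondents]
april_total = adjusted[0]
may_total = adjusted[1] + adjusted[2]
```

In this toy example the single April respondent ends up carrying the same total weight as the two May respondents, so April is no longer under-represented.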
Weight_Percent_M6B
A separate weight is required for module 6B, because the randomisation for this module changed within Q1. In April 2020, this module was asked of c. 25% of respondents. From May 2020 onwards, it was asked of everyone. As such, with the overall weight applied (Weight_Percent), April would be under-represented in the weighted sample.
This weight should be used to calculate proportions in: M6B, Wellbeing_lonely, Wellbeing_satisfied, Wellbeing_worthwhile, Wellbeing_happy, Wellbeing_worried, Wellbeing_anxious.
4.5 Weights for grossing estimates to the adult population in England
There are some questions within Module 1 and Module 2 where weights are produced to gross up the number of respondents to match the adult population (16+) in England and provide monthly totals for number of visitors to different types of green and natural spaces, number of visits and total expenditure. The same weights can also be used to calculate weighted percentages for the respective questions.
‘Weight_Grossed_M1_Q2’ produces an estimate of the total number of adults aged 16+ who have visited each type of green and natural space in the past month (in 000s). It applies to question M1_Q2.
‘Weight_Grossed_No_Of_Visits’ produces an estimate of the total number of visits to green and natural spaces in the past month (in 000s). It applies to question No_Of_Visits.
‘Weight_Grossed_M2A_SUB_Q4B’ produces an estimate of the total amount in £s (000s) spent on visits to green and natural spaces in the past month. It applies to question M2A_SUB_Q4B.
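Grossing can be illustrated as rescaling weights so that they sum to a population total, after which the weights of respondents in a category sum to a population estimate. In the sketch below the population figure of 45,000 (thousands) is a placeholder rather than an official estimate, the three respondents are invented, and the real grossing weights may be built differently.

```python
PLACEHOLDER_POPULATION_THOUSANDS = 45_000  # illustrative only, not an ONS figure


def grossing_weights(weights, population_thousands):
    """Rescale weights so they sum to the population total (in thousands)."""
    factor = population_thousands / sum(weights)
    return [w * factor for w in weights]


# Invented respondents: base weight and whether they visited a type of space
weights = [1.0, 1.0, 2.0]
visited = [True, False, True]

grossed = grossing_weights(weights, PLACEHOLDER_POPULATION_THOUSANDS)

# Estimated visitors (in thousands) is the grossed weight of those who visited
visitors_thousands = sum(g for g, v in zip(grossed, visited) if v)
```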
Cross tabulation
To cross tabulate between two fields in the dataset, please use the weight that is associated with the survey question. For example, to look at M2A_Q2 versus Age the appropriate weight is ‘Weight_Percent_M2A’ as this is associated with the question of interest (M2A_Q2).
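A weighted cross tabulation along these lines might look like the sketch below. The answer labels and values are invented, and treating respondents without a module weight as missing (`None`) is an assumption about how the data is coded.

```python
from collections import defaultdict


def weighted_crosstab(records, row_var, col_var, weight_var):
    """Sum weights into a two-way table, skipping respondents with no weight."""
    table = defaultdict(lambda: defaultdict(float))
    for record in records:
        weight = record.get(weight_var)
        if weight is None:
            continue  # assumed: respondents not asked the module have no module weight
        table[record[row_var]][record[col_var]] += weight
    return {row: dict(cols) for row, cols in table.items()}


# Invented records; the question of interest (M2A_Q2) determines the weight used
records = [
    {"M2A_Q2": "Answer A", "Age": "16-24", "Weight_Percent_M2A": 1.2},
    {"M2A_Q2": "Answer A", "Age": "25-39", "Weight_Percent_M2A": 0.8},
    {"M2A_Q2": "Answer B", "Age": "16-24", "Weight_Percent_M2A": 1.0},
    {"M2A_Q2": None, "Age": "65+", "Weight_Percent_M2A": None},
]
table = weighted_crosstab(records, "M2A_Q2", "Age", "Weight_Percent_M2A")
```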
4.6 Weighting of the children’s survey
The weight in C-PaNS ‘Weight_percent’ is derived from a design weight and a non-response “rim” weight.
Design weight
A design weight was calculated to compensate for just one child per household being surveyed. Children and young people from households with only one eligible child were given a design weight of 1, children and young people from households with more than one eligible child were given a design weight of 2. A design weight is not used on its own for analysis, but rather it forms the starting point of the non-response weight.
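The rule above can be written as a small function. The function name and the validation are illustrative; the values 1 and 2 are those stated in this report.

```python
def design_weight(eligible_children):
    """Return the C-PaNS design weight for a household's surveyed child."""
    if eligible_children < 1:
        raise ValueError("a surveyed household must have at least one eligible child")
    # One eligible child gets a weight of 1; any larger household gets 2
    return 1 if eligible_children == 1 else 2


example_weights = [design_weight(n) for n in (1, 2, 3)]
```

Note that a household with three eligible children still receives a weight of 2 under the stated rule, so the weight behaves like an inverse selection probability capped at 2.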
Non-response “rim” weight
The design weight was input into a raking algorithm that ensured the sample margins matched the population margins for the following variables:
-
Age and Gender
-
Region
-
Ethnicity
The benchmark population statistics (for children and young people aged 8 to 15 years old in England) used were as follows:
Table 3 C-PaNS weighting targets per wave
Category | Wave 1-4 target % | Wave 5-6 target % |
---|---|---|
Male 8-9 | 13.4% | 13.0% |
Male 10-11 | 13.0% | 13.0% |
Male 12-13 | 12.8% | 12.9% |
Male 14-15 | 12.1% | 12.3% |
Female 8-9 | 12.8% | 12.4% |
Female 10-11 | 12.3% | 12.4% |
Female 12-13 | 12.1% | 12.3% |
Female 14-15 | 11.5% | 11.7% |
Region - North East | 4.5% | 4.5% |
Region - North West | 13.0% | 13.3% |
Region - Yorkshire and the Humber | 9.7% | 9.7% |
Region - East Midlands | 8.4% | 8.5% |
Region - West Midlands | 10.8% | 11.0% |
Region - East | 11.2% | 11.3% |
Region - London | 16.2% | 15.7% |
Region - South East | 16.7% | 16.6% |
Region - South West | 9.4% | 9.4% |
Ethnicity - White | 78.2% | 73.5% |
Ethnicity - Mixed or multiple ethnic groups | 5.4% | 6.2% |
Ethnicity - Black/Black British | 5.8% | 12.2% |
Ethnicity - Asian/Asian British | 8.5% | 6.0% |
Ethnicity - All other | 2.0% | 2.1% |
5. Data production and publication
5.1 Data processing and quality assurance
We strive to ensure that any data provided to our users is of the highest quality. To achieve this, all survey data goes through several stages of quality assurance (QA) with automated scripts used where appropriate. These are reviewed and updated on a quarterly basis.
Each publication has its respective data dictionary document, which is updated quarterly to ensure that the data structure is recorded in a clear and easy to explore way. This includes the naming and labelling conventions for questions and response codes.
Verian produces the data using SPSS, after implementing data pre-processing and cleaning steps. The files are delivered using a secure transfer system. Natural England then undertakes its own detailed automated QA workflow using the R programming language. The R code is version controlled with GitHub. Issues discovered during these checks are raised with the contractor, who then provides an updated dataset. Natural England’s QA process then begins again.
Pre-processing checks
These are implemented by the contractor before delivering the data to Natural England, using SPSS syntax developed by the Verian research team. The checks performed include verifying data structure, questionnaire routing, dates, and back-coding ‘other specify’ answers.
High level checks
Firstly, Natural England perform data structure checks by comparing all variables, values, and labels against the data dictionary and the previous quarter publication. This way we can confirm the Internal and derived datasets (Controlled, Safeguarded, Open) have the intended configuration. We also perform other checks, including ID uniqueness, .sav metadata, dates, spatial variables, target demographic quotas and weighted values.
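Two of these checks, ID uniqueness and comparison of variables against the data dictionary, are simple to sketch. Natural England's own workflow is written in R and is not reproduced here; the Python below is an illustration with invented IDs and variable lists.

```python
from collections import Counter


def duplicate_ids(ids):
    """Return the IDs that appear more than once, in sorted order."""
    return sorted(value for value, count in Counter(ids).items() if count > 1)


def compare_variables(dataset_vars, dictionary_vars):
    """List variables present on one side but not the other."""
    in_data, in_dictionary = set(dataset_vars), set(dictionary_vars)
    return {
        "missing_from_data": sorted(in_dictionary - in_data),
        "not_in_dictionary": sorted(in_data - in_dictionary),
    }


# Invented inputs
duplicates = duplicate_ids([101, 102, 103, 102])
differences = compare_variables(
    dataset_vars=["ID", "M1_Q2", "Extra_Var"],
    dictionary_vars=["ID", "M1_Q2", "No_Of_Visits"],
)
```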
Module level checks
Next, data is checked across each module, mostly focusing on checks at question level. Routing and rule checks are undertaken for each module. These ensure that appropriate questions have been asked based on which modules have been answered, as well as appropriate responses within each question based on previous responses. For example, if a respondent has answered M2A then they should have answered M2A_Q1. The survey questionnaire provides a breakdown of survey modules and sample sizes.
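The routing example given above (anyone who answered M2A should have answered M2A_Q1) can be expressed as a simple rule check. This is a sketch in Python rather than the R code used in practice; the `ID` field name and the use of `None` for unanswered items are assumptions.

```python
def routing_violations(records, module_var, question_var, id_var="ID"):
    """Return IDs of respondents routed into a module but missing its question."""
    return [
        record[id_var]
        for record in records
        if record.get(module_var) is not None and record.get(question_var) is None
    ]


# Invented records
records = [
    {"ID": 1, "M2A": 1, "M2A_Q1": 3},
    {"ID": 2, "M2A": 1, "M2A_Q1": None},     # routed into M2A but M2A_Q1 is blank
    {"ID": 3, "M2A": None, "M2A_Q1": None},  # not routed into M2A
]
violations = routing_violations(records, "M2A", "M2A_Q1")
```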
Sense checks
Once the data has been quality assured, it is then interpreted and contextualised. This is when we assess whether time trends and responses are within expected ranges. We also compare the results with relevant external outputs (e.g., in the case of standardised questions), where applicable. Systematic bias checks are also undertaken to assess whether systematic bias exists in the data in regard to non-completion of the survey, individual modules or individual questions as well as lack of comprehension (indicated by latency) among survey participants.
Publication checks
Finally, pre-publication checks provide the final confirmation that the correct variables are being published in each dataset. At this stage, we also check the layout and formatting of the data in MS Excel and add the homepage and data dictionary tabs to those files.
Please get in contact if you want more detailed information on the QA checks we undertake.
5.2 Data publication
Summary statistics from the People and Nature Surveys for England are published on GOV.UK.
The complete datasets are published via the UK Data Service (UKDS) to increase the robustness in how we manage disclosure of the data collected within the survey. By using the UKDS, we can provide varying levels of potentially sensitive data and adhere to the highest standards of data management, in line with official advice from the Office for Statistics Regulation.
Statistics from the adults’ survey are published quarterly. Statistics from the children’s survey are published annually. Provisional publication dates are initially pre-announced on GOV.UK. The date is then confirmed at least four weeks in advance.
For the first two years of data collection (April 2020 to March 2022), Natural England published monthly data outputs from the adults’ survey with a focus on the impact of COVID-19 on adults’ (and parents’ reports on children’s) engagement with the natural environment. That data is accessible via the PaNS GOV.UK webpage.
Natural England publish three datasets, via UKDS, with varying access levels to better meet the needs of our users: Open, Safeguarded, and Controlled. The data is currently published in three file formats (.sav, .xlsx, .ods).
Across the different datasets, specific variables can be banded, truncated or excluded due to statistical disclosure control aiming to eliminate the ability for someone to identify a respondent based on a combination of their responses to demographic and geographical questions.
Open Access
The majority of our data is freely accessible to all users without any registration. It excludes any potentially sensitive variables. M2A_SUB_Q4B is edited (top coded to £100).
Safeguarded Access
To access this dataset, users need an account with the UKDS and must adhere to their End User License agreement. It includes all open access variables, plus selected variables with residual disclosure risk. The following variables are truncated: M2A_SUB_Q4B (top coded to £100), No_Of_Children (top coded to ‘6 and over’), M3_Q1 (top coded to ‘6 and over’), and Income (top coded to £50,000+).
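Top coding replaces values above a threshold with the threshold itself, or with a band label. The sketch below illustrates the two forms mentioned above, using the £100 cap and the ‘6 and over’ band from this report; the function names and input values are invented.

```python
def top_code_amount(value, cap=100):
    """Cap a numeric value, e.g. spend in pounds, at `cap`."""
    return min(value, cap)


def top_code_count(value, cap=6):
    """Band counts at or above `cap` into a single 'N and over' label."""
    return f"{cap} and over" if value >= cap else str(value)


spend = [top_code_amount(v) for v in (12.5, 100, 250)]
children = [top_code_count(v) for v in (2, 6, 9)]
```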
Controlled Access
This dataset is designed for users who are likely to carry out advanced modelling or statistical analysis. It includes all safeguarded variables, plus potentially sensitive variables, such as Orientation, Ethnicity_Detailed and home geography variables. Statistical disclosure controls have not been undertaken on the dataset, and only the respondent postcode has been removed. As such, users will have to be accredited with the UKDS and do a training course before they can access the data through SecureLab. Their data usage must be approved by Natural England.
For more information about the three access levels, please check the guidance available on the UKDS website and the data dictionary published alongside the data.
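The top coding described for the Open and Safeguarded datasets can be illustrated with a minimal sketch. It assumes top coding simply replaces any value above the cap with the cap itself. The function name `top_code` and the example spend values are hypothetical and are not taken from the PaNS processing code.

```python
def top_code(value, cap):
    """Return the value capped at `cap`; missing values (None) pass through."""
    if value is None:
        return None
    return min(value, cap)


# Hypothetical spend values in pounds for a variable top coded to 100.
spend = [12.5, 100.0, 250.0, None]
capped = [top_code(v, 100.0) for v in spend]
print(capped)
```

Categorical caps such as ‘6 and over’ follow the same idea, with the capped values relabelled as a single top band.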
6. Glossary
Design Weight
Design weights account for different probabilities of selection and, if used, are the starting weights in any rim weighting calculation.
In C-PaNS, Verian (known as Kantar Public until November 2023) calculated a design weight to compensate for just one child per household being surveyed. Children from households with only one eligible child were given a design weight of 1, and children from households with more than one eligible child were given a design weight of 2.
The design weights are derived as 1 divided by the probability of selection.
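As a minimal sketch of the two rules above, the general formula and the C-PaNS rule can be written as follows. The function names are hypothetical. Note that the C-PaNS rule, as described in this glossary, assigns a weight of 2 to every household with more than one eligible child, whatever the exact number of children.

```python
def design_weight(selection_probability):
    """General rule: 1 divided by the probability of selection."""
    if not 0 < selection_probability <= 1:
        raise ValueError("selection probability must be in (0, 1]")
    return 1.0 / selection_probability


def cpans_design_weight(eligible_children):
    """C-PaNS rule as described: 1 for one eligible child, 2 for more than one."""
    if eligible_children < 1:
        raise ValueError("household must contain at least one eligible child")
    return 1 if eligible_children == 1 else 2


print(design_weight(0.5))        # one of two equally likely units selected
print(cpans_design_weight(1))
print(cpans_design_weight(3))
```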
Non-response/Rim weighting
The non-response weights are intended to account for different probabilities of completing the survey.
The non-response weights are derived through iterative proportional fitting, also known as calibration raking or rim weighting. This method follows an algorithm which iteratively weights the sample to match known marginal distributions until the weighting converges. In other words, the algorithm starts with the design weight (or with a value of 1 where no design weight applies) and then adjusts this initial weight to match each of the target distributions in turn, repeating until it converges on a weighting solution which matches all the target distributions at once.
The target distributions are based on demographic targets described within the technical report.
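The iterative procedure described above can be sketched in a few lines. This is an illustration of iterative proportional fitting in general, not the code used to weight PaNS. The function names, the six-person sample and the target proportions are all hypothetical.

```python
def rake(rows, targets, start_weights=None, max_iter=100, tol=1e-8):
    """Iterative proportional fitting over marginal targets.

    rows: list of dicts mapping variable name -> category, one per respondent.
    targets: dict mapping variable name -> {category: target proportion}.
    start_weights: design weights, or None to start every case at 1.
    """
    weights = list(start_weights) if start_weights is not None else [1.0] * len(rows)
    for _ in range(max_iter):
        largest_adjustment = 0.0
        for variable, target in targets.items():
            total = sum(weights)
            # Weighted total currently held by each category of this variable.
            current = {}
            for row, w in zip(rows, weights):
                current[row[variable]] = current.get(row[variable], 0.0) + w
            # Scale each case so the category shares match the target shares.
            for i, row in enumerate(rows):
                factor = target[row[variable]] * total / current[row[variable]]
                weights[i] *= factor
                largest_adjustment = max(largest_adjustment, abs(factor - 1.0))
        if largest_adjustment < tol:  # every margin already matches its target
            break
    return weights


def weighted_share(rows, weights, variable, category):
    """Weighted proportion of cases falling in one category."""
    matched = sum(w for row, w in zip(rows, weights) if row[variable] == category)
    return matched / sum(weights)


# Hypothetical sample and targets, for illustration only.
sample = [
    {"sex": "F", "region": "North"},
    {"sex": "F", "region": "North"},
    {"sex": "F", "region": "South"},
    {"sex": "M", "region": "North"},
    {"sex": "M", "region": "South"},
    {"sex": "M", "region": "South"},
]
targets = {
    "sex": {"F": 0.5, "M": 0.5},
    "region": {"North": 0.4, "South": 0.6},
}
weights = rake(sample, targets)
print(round(weighted_share(sample, weights, "region", "North"), 4))
```

The sketch assumes every target category is present in the sample. A production implementation would also need to handle empty categories and would usually cap extreme weights.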
Design effects
Weighting reduces the effective sample size of a dataset; because of the differences in the probabilities of selection and the probabilities of response, the achieved sample provides less information than a notional simple random sample of the same size. The design effect quantifies the extent to which the expected sampling error in a survey departs from the sampling error that can be expected under simple random sampling.
Weighting efficiency is the inverse of the design effect (1/Deff). This indicates how much statistical power is lost by weighting: the lower the efficiency, the more power is lost.
Effective Sample Size
The effective sample size (ESS) is an estimate of the sample size required to achieve the same level of precision if that sample were a simple random sample.
It is calculated by dividing the unweighted sample size by the design effect. It is the sample size used when calculating confidence intervals or in any tests of statistical significance.
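The relationship between design effect, weighting efficiency and effective sample size can be sketched as follows. The glossary does not state which formula is used for the design effect, so this example assumes Kish's approximation for the design effect due to weighting, Deff = n × Σw² / (Σw)². That choice, the function names and the example weights are assumptions for illustration.

```python
def kish_design_effect(weights):
    """Kish's approximate design effect due to unequal weights (an assumption)."""
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)


def weighting_efficiency(weights):
    """Inverse of the design effect (1/Deff)."""
    return 1.0 / kish_design_effect(weights)


def effective_sample_size(weights):
    """Unweighted sample size divided by the design effect."""
    return len(weights) / kish_design_effect(weights)


example_weights = [1.0, 1.0, 2.0, 2.0]  # hypothetical weights
print(round(kish_design_effect(example_weights), 4))
print(round(weighting_efficiency(example_weights), 4))
print(round(effective_sample_size(example_weights), 4))
```

With equal weights the design effect is 1 and the effective sample size equals the unweighted sample size. The more the weights vary, the smaller the effective sample size becomes.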
Confidence Intervals
When a survey is carried out, the respondents who take part are only a subset of those in the population and as such may not give an exact representation of the ‘true’ average in the population. The reporting uses confidence intervals to account for the fact that the survey is based on a subset of the population. A 95% confidence interval is a margin of error around an estimate, which gives a range of values within which you can be 95% confident that the true mean will lie.
For instance, if 1,000 people are interviewed and 500 (50%) of them say that they agree with a statement, then you can be 95% confident that the true proportion of people who agree with the statement is 50% +/- 3% (between 47% and 53%). The analysis of confidence intervals within PaNS uses the Complex Samples Module within the analytical software package SPSS to correct for design effects.
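The worked example can be reproduced with a short sketch. It uses the standard normal approximation for a proportion, with an optional design effect that shrinks the sample to its effective size. This is not the SPSS Complex Samples procedure used for PaNS. The function name and the `design_effect` parameter are hypothetical.

```python
import math


def proportion_ci(successes, n, z=1.96, design_effect=1.0):
    """Normal-approximation confidence interval for a proportion.

    z=1.96 gives an approximate 95% interval. design_effect > 1 reduces
    the sample to its effective size and so widens the interval.
    """
    p = successes / n
    n_effective = n / design_effect
    margin = z * math.sqrt(p * (1 - p) / n_effective)
    return p - margin, p + margin


low, high = proportion_ci(500, 1000)
print(f"{low:.3f} to {high:.3f}")  # roughly 50% +/- 3%
```

Passing a design effect greater than 1 gives a wider interval for the same 1,000 interviews, which is why weighted surveys report intervals based on the effective sample size.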
Statistically significant differences
Statistically significant differences are differences that are very unlikely to occur by chance. Statistical tests are used to determine the probability of these differences occurring. A difference reported as significant at the 95% confidence level would be expected to occur by chance only 5% of the time. At the 99% confidence level, this would be only 1% of the time. PaNS data releases report on significant differences at the 95% confidence level.
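As an illustration of what such a test involves, the sketch below compares two proportions with a two-sided z-test. The glossary does not say which tests are used for PaNS releases, so the choice of test, the function name and the example figures are assumptions. For weighted data, the sample sizes passed in would be effective sample sizes.

```python
import math


def two_proportion_z_test(p1, n1, p2, n2):
    """Two-sided z-test for the difference between two proportions.

    Returns the z statistic and the two-sided p-value.
    """
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return z, p_value


# Hypothetical comparison: 50% of 1,000 people versus 56% of 1,000 people.
z, p_value = two_proportion_z_test(0.50, 1000, 0.56, 1000)
significant_at_95 = p_value < 0.05
print(round(z, 2), round(p_value, 4), significant_at_95)
```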
Weighting matrix
The weighting matrix refers to the variables by which PaNS data is weighted and the population statistics used as weighting targets. These include:
-
Age*Gender
-
Region
-
Age*Highest qualification
-
Children aged <16 in the household
-
Ethnicity
-
Long-lasting health condition
-
Number of cars / vans available for use by the household
-
Urban/Rural
-
Dog ownership
Weighted base
The weighted base is the base size of the data once the weights have been applied. In this survey, each month is weighted so that the weighted base size is 2,083.
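A minimal sketch of fixing the weighted base, assuming it is done by rescaling a month's weights so that they sum to the target base. The function name and the example weights are hypothetical.

```python
def scale_to_base(weights, target_base=2083):
    """Rescale weights so that they sum to the target weighted base."""
    total = sum(weights)
    return [w * target_base / total for w in weights]


month_weights = [0.8, 1.1, 1.4, 0.7]  # hypothetical weights for one month
scaled = scale_to_base(month_weights)
print(round(sum(scaled), 6))
```

Rescaling changes the total but not the relative size of the weights, so the weighted profile is unaffected.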
Weighted profile
The weighted profile is the profile of the data once the weights have been applied. The profile is usually expressed in percentages and should closely match the profile of the weighting targets.
Calendar Month Factor
The Calendar Month Factor is used to adjust the number of visits within a month based on the time frame of the question and the number of days in each month. The time frame for the frequency of visits question is 14 days. The Calendar Month Factor is applied to account for the number of times this period occurs within each calendar month, based on the number of days in the month.
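The glossary does not give a formula for the Calendar Month Factor. One reading consistent with the description is the number of days in the month divided by the 14-day reference period, and the sketch below assumes that reading. The function name is hypothetical.

```python
import calendar


def calendar_month_factor(year, month, reference_days=14):
    """Number of reference periods in a calendar month (assumed formula)."""
    days_in_month = calendar.monthrange(year, month)[1]
    return days_in_month / reference_days


print(calendar_month_factor(2023, 2))            # 28 days / 14
print(round(calendar_month_factor(2024, 1), 4))  # 31 days / 14
```

Under this assumption, a respondent reporting 3 visits in the last 14 days in a 28-day month would contribute 3 × 2 = 6 visits to the monthly estimate before weighting.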
Trip factor
The Trip Factor is the number of trips the respondent has made in the last 14 days.
Weights for grossed estimates
These weights gross up the number of respondents to match an overall figure. In this survey, they are grossed to match the adult population (16+) in England. These weights also take into account the overall population size, modularisation, and the time frame of the questions. These should be used when grossed estimates are required.
Weights for proportions or percentages
These weights produce adjusted percentages for Modules 2A and 2ASub to account for the fact that detailed information is only collected for one visit, regardless of the number of visits which respondents reported having made in the last 14 days. To ensure this sample is representative of all visits, the number of visits needs to be accounted for, so the final grossed weight is multiplied by the number of visits.
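The effect of multiplying by the number of visits can be shown with a small sketch. The function names, place types and figures are hypothetical. The example assumes two respondents with the same grossed weight, one reporting a single visit and one reporting three.

```python
def visit_weight(grossed_weight, visits_last_14_days):
    """Weight for visit-level analysis: grossed weight times number of visits."""
    return grossed_weight * visits_last_14_days


def weighted_percentages(records):
    """Percentage of the total weight in each category.

    records: list of (category, weight) pairs.
    """
    totals = {}
    for category, weight in records:
        totals[category] = totals.get(category, 0.0) + weight
    grand_total = sum(totals.values())
    return {category: 100.0 * t / grand_total for category, t in totals.items()}


# Hypothetical respondents: (place type of the sampled visit, grossed weight, visits).
respondents = [("park", 100.0, 1), ("coast", 100.0, 3)]
by_respondent = weighted_percentages([(p, w) for p, w, _ in respondents])
by_visit = weighted_percentages([(p, visit_weight(w, v)) for p, w, v in respondents])
print(by_respondent)
print(by_visit)
```

Without the adjustment each place type accounts for half of the sample. With it, the coast accounts for three quarters of visits, reflecting that the second respondent made three times as many visits.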
7. Annex – List of source data
Survey fieldwork quotas
-
Age by gender: ONS Mid-Year Population Estimates/Census 2021
-
Highest qualification: ONS Labour Force Survey
-
Ethnicity: ONS Labour Force Survey
7.1 Weighting population targets
-
Age by gender: ONS Mid-Year Population Estimates/Census 2021
-
Age by highest qualification: ONS Labour Force Survey
-
Working status by gender: ONS Labour Force Survey
-
Children aged <16 in the household: ONS Labour Force Survey
-
Ethnicity: ONS Labour Force Survey
-
Long-lasting health condition: ONS Labour Force Survey
-
Number of cars / vans available for use by the household: National Survey for Wales
-
Urban/Rural: ONS Small Area Population Estimates with urban-rural appended
-
Dog ownership: Verian Public Voice 2021
Geo-demographic data sources
-
Home Local authority (LAD23): Lower_Layer_Super_Output_Area_(2021)to_LAD(April_2023)_Lookup_in_England_and_Wales
-
Home Upper tier local authority: Lower_Tier_Local_Authority_to_Upper_Tier_Local_Authority_(April_2023)_Lookup_in_England_and_Wales
-
Home Lower layer super output area 2021 (LSOA21): NSPL (Census 2021 geographies)
-
Home Indices of multiple deprivation (IMD): English indices of deprivation 2019
-
Home Urban/Rural status: NSPL (Census 2021 geographies)
-
Visit Local authority (LAD23) - England and Wales: Lower_Layer_Super_Output_Area_(2021)to_LAD(April_2023)_Lookup_in_England_and_Wales
-
Visit Local authority (LAD23) - Scotland / Northern Ireland: NSPL (Census 2021 geographies)
-
Visit Upper tier local authority - England and Wales: Lower_Tier_Local_Authority_to_Upper_Tier_Local_Authority_(April_2023)_Lookup_in_England_and_Wales
-
Visit Upper tier local authority - Scotland / Northern Ireland: NSPL (Census 2021 geographies)
-
Visit Lower layer super output area 2021 (LSOA21) - England and Wales: Lower-layer-super-output-areas-2021-boundaries-ew-bfc
-
Visit Lower layer super output area 2021 (LSOA21) - Northern Ireland: data-zones-census-2021
-
Visit Lower layer super output area 2021 (LSOA21) - Scotland: data-zone-boundaries-2011
-
Visit Indices of multiple deprivation (IMD) - England: English indices of deprivation 2019
-
Visit Indices of multiple deprivation (IMD) - Wales: welsh-index-multiple-deprivation-2019-index-and-domain-ranks-by-small-area
-
Visit Indices of multiple deprivation (IMD) - Scotland: SIMD+2020v2+-+datazone+lookup
-
Visit Indices of multiple deprivation (IMD) - Northern Ireland: NIMDM17_SOA results overall only
8. Annex – PaNS Research Questions
The original research questions for the People and Nature Surveys were written in 2019 and reviewed in 2024 to reflect the developing priorities of Natural England and the Environmental Improvement Plan.
As of 2024, the research questions have been laid out to be consistent with the natural capital framework. A natural capital approach to policy and decision making looks at the underlying assets underpinning activity, how these assets are used, and the benefits for people derived from these assets.
1. People’s perceptions of the quality of ecosystem assets
The places where people engage with the natural environment (i.e. assets)
- What proportion of the population perceive that they have green and blue spaces within 15 minutes’ walk of their home?
a. How does this differ between areas of England and groups of people?
b. Is this changing over time?
- How do people perceive the quality of the green and blue spaces close to where they live?
a. Is this changing over time?
b. How does this differ between areas of England and groups of people?
c. How do perceptions of environmental quality relate to ‘dose’ (time spent) and benefits?
d. Does biodiversity play a role in perceptions of quality?
- How do people perceive the quality of the landscape close to where they live?
a. How does this differ by landscape type?
b. Is this changing over time?
-
How do people perceive the quality of the green and blue space they visit?
-
What factors determine perceptions of quality?
-
What proportion of the public have access to gardens?
2. How and why people engage (or don’t engage) with nature and their experiences of it
The things people do while they are there (i.e. services)
-
How frequently do people spend their free time in green and natural places?
-
How does this differ between population groups? E.g. Age? Gender? Ethnic group? Household income?
-
What factors affect likelihood of people spending time in green and natural places?
-
How is this changing over time?
-
How does children’s engagement with nature differ between term time and school holidays?
-
How and why do people spend their free time in green and natural places?
-
What motivates people to spend time in green and natural places?
-
What green and natural places do people visit?
-
What do people do in green and natural places?
-
How does this vary in space and time and between key population groups?
-
How does spending time in nature compare with other activities competing for people’s leisure time?
-
How is this changing over time?
-
How do children and young people spend time in green and natural places?
a. How is this changing over time?
b. How does children’s engagement with the natural environment change at different ages? How does this change as they grow up?
-
What are people’s experiences of engaging with nature?
-
What are the barriers to accessing nature for different communities?
3. How people benefit from the natural environment, attitudes of environmental care, support for pro-environmental policy and pro-environmental behaviours.
The benefits people derive from the natural environment (i.e. benefits)
-
What are the benefits people report from spending time in nature?
-
How are these distributed spatially and across the population?
-
What’s significant in determining whether different groups benefit?
-
Do people derive different benefits from different types of visits (activity, duration, ‘specialness’) or place (urban, different habitats, gardens)?
-
How are benefits reported from spending time in nature changing over time?
-
How does being in nature (and the amount of time spent in nature) affect health and wellbeing?
a. How connected to nature and place do people feel?
b. How does this correlate with the use of the natural environment?
-
What environmental issues do people care about?
-
How is this changing over time?
-
What are the determinants of care and concern?
-
How does environmental care and concern relate to use of the natural environment?
-
How do people engage in action for the natural environment?
a. How is this changing over time?
b. How does this differ across the population?
c. How does action for the natural environment relate to use of the natural environment?
- What economic benefits are derived from cultural ecosystem services provided by engaging with the natural environment?