Research and analysis

National Reference Test annual statement 2024

Published 22 August 2024

Applies to England

Ofqual has today, Thursday 22 August, published the results of the National Reference Test (NRT) in 2024. The National Foundation for Educational Research (NFER) Results Digest shows the 2024 results alongside results from previous years. 

Background 

In February and March 2024, more than 13,000 year 11 students from over 340 schools in England took the National Reference Test, which is administered by NFER, in English or maths. The tests are designed to provide evidence of the performance of 16-year-old students in English language and maths and were introduced to provide additional evidence to support the awarding of GCSEs in these subjects. The first live NRT, taken in 2017, was benchmarked against the first awards of the reformed GCSEs in English language and maths, and subsequent tests compare the performance of students with those in previous years.

Results are reported at 3 grade boundaries – grade 7, grade 5 and grade 4. Results are reported as expected percentages of students achieving those grades (and above) based on changes in performance on the NRT. This overview focuses only on grades 7 and 4, since the grade 5 boundary is set arithmetically by exam boards and would not normally be adjusted.

Results for 2024 

The results are shown below. Because this test uses a sample of students, we report ‘confidence intervals’ around the results. These confidence intervals reflect the fact that, had we taken a different sample of students, we would have got a slightly different result. The results show the changes in the expected percentage of students at the grade 7 and grade 4 boundaries, compared with 2017.
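As a rough illustration of why a sample-based estimate carries a confidence interval, the sketch below computes a textbook normal-approximation interval for a proportion. It assumes a simple random sample and a 95% level; neither assumption comes from this statement, and NFER's published intervals are derived by its own method, so the numbers here will not match the table. The function name `wald_interval`, the observed proportion and the sample sizes are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(p_hat, n, level=0.95):
    """Approximate confidence interval for a proportion.

    Assumes a simple random sample of size n (an illustrative
    assumption, not the NRT sampling design).
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Hypothetical inputs: 65% of sampled students at or above a grade,
# with two made-up sample sizes to show how the interval narrows.
for n in (1_000, 5_000):
    low, high = wald_interval(0.65, n)
    print(f"n={n}: {100 * low:.1f}% to {100 * high:.1f}%")
```

The larger the sample, the narrower the interval, which is the sense in which a different sample would give only a slightly different result.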

The NRT results are compared with 2017 because this is the baseline year of the NRT, and, with the exception of 2021 and 2022, the year with which we have previously compared results. Reformed GCSEs in English language and maths were first awarded in 2017, and we know that when new assessments are available, performance typically dips in the first year and then subsequently improves. This is known as the sawtooth effect. When considering any changes in performance compared with 2017, we take into account any changes in performance that are typically observed when new qualifications are introduced, as we did when making decisions about using the NRT in 2019 and 2020. For example, were we to see a significant increase in NRT performance compared with 2017, we would need to consider whether this reflected a genuine change in attainment. 

The results show that, in English, there is a statistically significant downward change at grade 4 when compared with 2017 (at the 0.01 level of significance).[footnote 1] There is no statistically significant difference at grade 7 when compared with 2017.

In maths, there is a statistically significant upward change at grade 7 (at the 0.05 level of significance). There is no statistically significant change at grade 4.  

Expected percentage of students at each grade (with associated confidence intervals)

Subject           Year  Grade 4 and above   Grade 7 and above
English language  2017  69.9 (68.4-71.4)    16.8 (15.6-18.0)
English language  2024  65.1 (63.4-66.8)    16.4 (15.3-17.6)
Maths             2017  70.7 (69.3-72.1)    19.9 (18.6-21.2)
Maths             2024  70.9 (69.4-72.4)    22.4 (21.0-23.7)
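The changes since 2017 can be read off the table directly. The sketch below does that arithmetic and adds a simple check of whether the two years' confidence intervals overlap. The overlap check is an illustrative shortcut only: it is not the significance test NFER uses, and overlapping intervals do not rule out a statistically significant change. The helper name `change_and_overlap` is hypothetical.

```python
# Figures copied from the table above: (estimate, lower bound, upper bound), in percent.
results = {
    ("English language", "grade 4"): {2017: (69.9, 68.4, 71.4), 2024: (65.1, 63.4, 66.8)},
    ("English language", "grade 7"): {2017: (16.8, 15.6, 18.0), 2024: (16.4, 15.3, 17.6)},
    ("Maths", "grade 4"): {2017: (70.7, 69.3, 72.1), 2024: (70.9, 69.4, 72.4)},
    ("Maths", "grade 7"): {2017: (19.9, 18.6, 21.2), 2024: (22.4, 21.0, 23.7)},
}


def change_and_overlap(base, latest):
    """Return the change in percentage points and whether the intervals overlap."""
    change = round(latest[0] - base[0], 1)
    overlap = latest[1] <= base[2] and base[1] <= latest[2]
    return change, overlap


summary = {
    key: change_and_overlap(years[2017], years[2024])
    for key, years in results.items()
}

for (subject, grade), (change, overlap) in summary.items():
    note = "intervals overlap" if overlap else "intervals do not overlap"
    print(f"{subject}, {grade}: {change:+.1f} percentage points ({note})")
```

On these figures, only English at grade 4 has intervals that do not overlap. The maths grade 7 intervals overlap narrowly even though the change is reported as significant at the 0.05 level, which is why overlap is only a rough guide.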

Using NRT evidence in awarding 

The NRT provides an additional source of evidence to support the awarding of GCSEs in English language and maths. Where there is a statistically significant difference in performance, Ofqual can require exam boards to adjust the grade standards when setting GCSE grade boundaries.  

In considering the evidence from the NRT, we aim to make sure that: 

  • our decisions are consistent over time and between subjects, regardless of the direction of any change 

  • we take account of contextual evidence from the student survey and other sources, and that we act cautiously in making any adjustments to grade standards 

  • we document and publish the reasons for our decisions 

In order to help us interpret the NRT results, we carry out additional analysis to consider the prior attainment profile of the sample of students who take the test. We also consider the findings from the student survey in relation to student motivation and students’ views of the importance of the NRT and GCSEs in English language or maths.

Student sample 

In both English and maths, the achieved sample – that is, those students who took the test as opposed to all those who were selected to take part – has an upward bias in terms of prior attainment, demonstrated by the difference in the Key Stage 2 profile of the drawn and achieved samples.  

This is not, in itself, problematic. In both NRT subjects, the upward bias has remained relatively stable across the 8 administrations of the NRT. While there are small differences in the achieved sample’s KS2 profile between 2024 and the comparator year of 2017, our modelling suggests that those differences are unlikely to fully account for the changes in performance on the test. Indeed, in English, the difference in prior-attainment profile runs counter to the statistically significant differences in performance – that is, our analyses suggest that the sample may be very slightly stronger in 2024.

Student survey 

Immediately after taking the NRT, students also take a short survey to capture, among other things, their NRT-specific test motivation, preparation for GCSEs, and motivation, feelings and attitudes about learning the relevant GCSE subject. The aim of the survey is to provide context for any changes in NRT results. The survey was introduced in 2017 and 2024 saw the eighth administration of the survey. 

Compared to their counterparts in 2017, students taking the NRT in both subjects in 2024 reported lower perceived importance of the NRT, greater indifference to their own NRT performance and less preparation for the NRT. Those in maths, but not those in English, also reported less test-taking effort.  

Modelling that draws on the historical relationship between self-reported test motivation and test performance suggests that the decrease in test motivation was not a major contributor to the statistically significant decline in performance in English at grade 4.

Interpreting the results  

NFER report results for the NRT at both the 0.01 and 0.05 levels of significance. In interpreting the results, we have, as in previous years, focused on the 0.01 level of significance, due to the high-stakes nature of the test and GCSE results.
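To show how the two significance levels differ in practice, the sketch below uses a standard two-sided normal (z) test. This is an assumption made for illustration: the statement does not describe the test NFER applies. The function names and the example statistic of 2.2 are hypothetical.

```python
from statistics import NormalDist


def critical_value(alpha):
    """Two-sided critical value of a standard normal test at level alpha."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def significant_at(z_statistic, alpha):
    """True if the statistic is more extreme than the critical value."""
    return abs(z_statistic) > critical_value(alpha)


# Hypothetical test statistic, chosen to fall between the two thresholds.
z_example = 2.2
for alpha in (0.05, 0.01):
    verdict = "significant" if significant_at(z_example, alpha) else "not significant"
    print(f"alpha={alpha}: critical value {critical_value(alpha):.3f}, {verdict}")
```

The 0.01 level sets a higher bar (a critical value of about 2.576 rather than about 1.960), so a change can be significant at the 0.05 level without being significant at the 0.01 level.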

English 

The statistically significant change at the 0.01 level of significance at grade 4 compared with 2017 could be interpreted as suggesting that a small downwards adjustment to grade standards at grade 4 this summer would have been appropriate. Further, our analysis of the student sample suggests that the small differences in the prior attainment profile of the student sample do not account for the lower results in 2024, nor does the slightly lower test motivation among those taking the test this year.

We have always been clear, however, that we would be cautious in using evidence from the NRT to inform awarding. For us to make an adjustment this summer we would need to be confident that the decline in performance indicated by the results of the NRT reflected a genuine change in the attainment of the GCSE cohort. Given the context in recent years following the pandemic, we do not consider that there is sufficient evidence of a genuine decline in performance such that we should make a downwards adjustment this summer. In subsequent years, we will continue to make decisions informed by the principles outlined above and any trends over time as normal grading arrangements continue.

Maths  

The statistically significant increase at grade 7 compared with 2017 could be interpreted as suggesting that a small upwards adjustment to grade standards would have been appropriate at this grade this summer. Further, our contextual analyses suggest that the small differences in the NRT sample, and the student survey results, are unlikely to fully account for the change in performance.

We have considered, however, that the change compared with 2017 is only significant at the 0.05 level of significance, and that the 2024 NRT outcomes in maths are similar to those seen in 2019. At that time, we decided not to make an adjustment because we considered that the increase relative to 2017 was likely due to the sawtooth effect and the improvements that we might expect in the first few years that a qualification is available. To make an adjustment now would be to reward a level of performance that we previously decided not to reward (because we were not confident that it reflected a genuine improvement in attainment).

Taking into account our principles of consistency (that our decisions should be consistent between years) and acting judiciously (that we will be cautious in applying any adjustment), we have therefore decided not to make an adjustment in maths at grade 7 this summer. In subsequent years, we will consider any trends and apply the principles outlined above, when deciding whether any change in the NRT reflects a genuine improvement in attainment.

  1. There are different levels of statistical significance. A 0.05 significance level means that, if there were no real change, a difference at least this large would arise by chance about 1 time in 20; at the 0.01 level of significance, that reduces to about 1 time in 100.