Correspondence

Ofqual’s response to consultations on primary assessment in England and the Rochford review recommendations

Published 22 June 2017

Applies to England

In regulating national assessments, Ofqual’s objectives are to promote standards and confidence in statutory early years assessments and primary assessments at key stage 1 and key stage 2 (more information on Ofqual’s specific powers and duties in relation to national assessments can be found in our national assessments regulatory framework). Our key concern is that assessments should be valid; this response provides a view on aspects of the consultation that relate to assessment validity. Our response does not consider proposals that fall outside our remit, such as those relating to whether or not there should be particular statutory assessments, or those relating to curriculum policy, accountability policy or internal school assessment.

The Department for Education has made some important proposals in relation to the future of primary assessment. We welcome the consultation, which gives stakeholders, including teachers, schools, parents, and assessment and subject experts, the opportunity to give their views. This is particularly important given the pace and scale of changes in recent years, with new tests and teacher assessments having been introduced at both key stages last year. Our response to this consultation is intended to support the future development and delivery of high quality, valid and reliable assessment.

We welcome a number of the premises underlying the proposals.

First, the government’s recognition that improvements can be made to primary assessment. It will, however, be important that any changes are carefully considered and cautiously introduced. To support valid assessment, teachers and schools will require time, training and support prior to significant changes being introduced. Where fundamental changes are made, such as the introduction of entirely new assessments, these should be trialled and/or piloted. The scale of any support, trials or pilots should be proportionate to the level and nature of changes being made.

Second, the longer-term commitment to stability. Regular changes to assessments make maintaining standards more difficult and can undermine confidence.

Third, the recognition of the limitations of assessment data. Standardised assessments can provide useful information about school performance, but as with any other data, assessment data will never give a complete picture; it forms part of a wider range of evidence and information.

Fourth, the recognition that assessments used for accountability purposes must be designed to withstand the pressures this creates: using assessment outcomes in this way can affect the reliability and validity of assessment. To secure a level playing field for schools, pressure points must be understood and negative impacts minimised. We recognise that government has taken steps designed to lower the stakes of key assessments.

There are 6 principles underpinning our response to this consultation, namely the need for:

  1. Coherence: assessment arrangements should be compatible across subjects, stages and ability ranges.
  2. Clarity of purpose for each assessment and, where multiple purposes are identified, for these to be compatible or reconcilable.
  3. Assessment arrangements to have sufficient validity, including minimal bias.
  4. Assessment arrangements to be fair, equitable and manageable, and to minimise incentives for poor assessment practices.
  5. A clear rationale for any change, based on high quality research, evaluation and deep expertise in assessment techniques and practices.
  6. Care in the timing and manner of the introduction of new assessment arrangements, including an effective approach to communications and guidance to schools.

The remainder of this response provides a view on key aspects of the consultation that fall within Ofqual’s remit.

Baseline assessment

The consultation proposes a new reception baseline assessment to inform a school progress measure. We welcome the recognition that any new baseline assessment would need careful consideration and cautious introduction. If government decides to proceed with this proposal, we would recommend trialling, coupled with a full pilot and evaluation. It is likely that this process would provide valuable insights, including about the best time to take the assessment; administration arrangements including the use of technology; and how pupils with English as an Additional Language (EAL) and pupils with special educational needs and/or disabilities (SEND) engage with the assessment.

It would be necessary to take sufficient time to develop high quality assessment, both to ensure that it worked effectively for the age group during piloting, and so that those involved in supporting any assessment could be given appropriate training, guidance and support. In this context, introduction of a statutory assessment within the next two to three years could be challenging.

The potentially high-stakes nature of a baseline assessment presents challenges, particularly when coupled with the age of the children. Any assessment of 4-year-olds would need to be mediated, but the assessment would also need to be able to withstand incentives to depress results. Teachers must have sufficient assurance that results will be comparable between different schools across the country.

If a new baseline assessment is developed, we would strongly recommend that all reasonable steps are taken to make sure that a fair and equitable approach is achieved. We would encourage government to consider what could be done to reduce pressures on the assessment when considering how a new progress measure could work, and we appreciate the commitment to using a range of data, such as information about school context.

The approach to pupils with EAL would also need to be carefully considered. The relationship between baseline and key stage 2 assessments may be different for pupils whose first language is not English compared to native English speakers. This should be given due consideration during assessment design. The use of additional information about schools and pupils alongside assessment data would also be important to ensure that the test’s purpose as a consistent baseline for all schools can be met and that teachers and schools can have confidence in assessment outcomes.

Key stage 1 assessment

In the event that a reception baseline is introduced, the consultation proposes to make key stage 1 assessment non-statutory. It discusses alternative uses for key stage 1 assessments:

  1. As optional assessments to help schools monitor pupil performance and/or support transition to key stage 2.
  2. To monitor national standards and allow schools to benchmark against this if they choose to.
  3. As part of a progress measure for infant, junior and middle schools.

Ofqual’s role is not to decide whether there should or should not be assessment at key stage 1, or what the purpose of the assessment should be. However, our view is that the primary purpose of the assessment must be clear. Each of the above purposes suggests different approaches, for example to the design, administration and marking of the assessment. If key stage 1 assessment were required to achieve all of those purposes, it would be unlikely to meet all of them successfully.

We agree with the government’s analysis that current key stage 1 assessments are not suitable as a long-term baseline measure; their design is not capable of withstanding the pressures of such a high-stakes use. We welcome moves to reduce pressure on the assessments while they continue to be used. Should key stage 1 assessments be retained to support accountability measures for infant, junior and middle schools, they would need to be redesigned to meet that specific purpose.

Multiplication tables check

Again, we would advise a careful and considered approach to the introduction of a new multiplication assessment. We welcome the fact that trialling is in train. Whilst piloting may not need to be as extensive as for a reception baseline, it is nonetheless likely that valuable lessons will be learned through this process. This could include insights into the best time to take the assessments; administration arrangements, including online/offline delivery; the most appropriate way of reporting results (including where any potential cut score(s) may be placed); how pupils with EAL and SEND can engage; and training and support requirements for schools.

The purpose of the check must be clear. The consultation suggests that the check ‘is designed to support teachers to identify pupils who have not yet learnt all their times tables’. If this is the primary purpose, it would be helpful to consider and set out the value that statutory assessment could add and how potential burdens weigh up against projected benefits. Clarity over the purpose of the assessments and the intended use of results will be necessary to inform effective assessment design.

Teacher assessment and moderation

Teacher assessment can be a highly valid and effective approach to assessment. However, where it informs school accountability measures, this can place pressure both on the assessment and on the teachers being asked to make the judgements. Where it is not possible to reduce the stakes of assessments, we would recommend that the potential impacts of pressures be minimised through careful consideration of the design, delivery and controls put in place around assessments. Again, consideration should be given to the purpose of the assessment: what is the best way to design an assessment that can meet that purpose? Alternatives to the current teacher assessment model may provide greater fairness, reliability and validity.

We welcome government’s commitment to exploring the long-term approach to the assessment of writing and to moderation. The current teacher assessment model was designed to be short-term and there is increasing evidence that the interim teacher assessment frameworks, particularly for writing, can be difficult to implement and moderate consistently. Where assessment is unduly burdensome, this can lead to negative impacts, not only on teaching and learning, but also on the validity of the assessments. We would make several recommendations:

  1. That reviews of moderation and of writing assessment be considered in tandem, to make sure there is a coherent and defensible approach taken, for example, in relation to assessment controls.
  2. That a writing review be extended to cover statutory teacher assessments across subjects and key stages, including a review of the ‘secure-fit’ approach of interim frameworks and the impacts of this on pupils with SEND and those working below the level of the curriculum.
  3. That a review of moderation similarly consider all subjects, stages and assessment types to secure a coherent approach.
  4. That a review of the future approach to both teacher assessment and moderation across all subjects be based on an evaluation of the current model, led by assessment expertise and informed by the research literature.

In relation to the ‘secure-fit’ model used in current interim frameworks, there is a need to consider carefully the rationale for taking a hurdle-based, or ‘mastery’, approach. For the overall result to be accurate, the secure-fit model requires numerous accurate judgements for every pupil about whether or not each of the hurdles has been met in each subject. This can present risks to reliability and can make consistent implementation and moderation more difficult. The current interim frameworks also sample a particular set of learning outcomes, which can risk ‘teaching to the teacher assessment’ and consequent negative impacts on learning and validity. Key stage tests, by contrast, use a compensatory approach to assessment: a particular outcome can be achieved through a wide variety of different performances exhibiting different strengths and weaknesses. As a result, the outcomes of an effectively implemented mastery assessment are not likely to align with those of a compensatory assessment.
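The difference between the two approaches can be illustrated with a minimal sketch. It is illustrative only and does not represent the actual frameworks or test mark schemes: the function names, the marks, the cut score and the accuracy figure are all hypothetical, and the final function assumes, for simplicity, that each judgement is made independently of the others.

```python
from typing import Sequence


def secure_fit(statements_met: Sequence[bool]) -> bool:
    """Hurdle-based ('mastery') rule: every statement must be judged as met."""
    return all(statements_met)


def compensatory(marks: Sequence[int], cut_score: int) -> bool:
    """Compensatory rule: only the total matters, so strengths offset weaknesses."""
    return sum(marks) >= cut_score


def chance_all_judgements_correct(accuracy: float, n_judgements: int) -> float:
    """Probability that n judgements are all correct.

    Assumes each judgement is independent and equally accurate,
    which is a simplification made for illustration.
    """
    return accuracy ** n_judgements


# Hypothetical pupil: strong in three areas, weak in one.
statements_met = [True, True, True, False]
marks = [5, 5, 5, 1]

print(secure_fit(statements_met))            # one missed hurdle decides the outcome
print(compensatory(marks, cut_score=12))     # total of 16 is compared with the cut score
print(round(chance_all_judgements_correct(0.95, 10), 2))
```

Under these assumptions the same pupil does not meet the standard on the hurdle-based rule but does on the compensatory rule, and the chance that every judgement for a pupil is correct falls as the number of judgements grows.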

The proposed move to a ‘best-fit’ model for writing and the removal of other statutory teacher assessments have the potential to improve coherence and address some of these concerns. However, in the longer term, a full evaluation and review of any frameworks that remain would be helpful to inform future development. This is also discussed below in relation to the Rochford Review.

There is helpful discussion of a number of alternative approaches in the consultation document, including comparative judgement and peer-moderation. Trialling, followed by a focused pilot to identify and test a small number of potentially effective alternatives, is likely to provide for a more reliable and valid approach to both moderation of teacher assessments and the assessment of writing. A pilot could usefully test how approaches and controls – such as cluster moderation, comparative judgement, plausibility checks or statistical moderation – may be used in combination to provide for an approach that:

  • minimises undue burden on teaching professionals
  • minimises potential conflicts of interest
  • provides for the most reliable outcomes

As any pilot would not be able to recreate the conditions of a high-stakes assessment, potential approaches could be subject to a ‘stress-test’ to analyse the likelihood of being able to control any identified negative incentives.

In relation to writing, any review of alternative approaches, including comparative judgement, should identify threats to validity related to holistic judgement and the extent to which any threats can be minimised. To create an equitable approach, consideration must be given to controls around three key areas:

  1. The task(s) that is/are set.
  2. The environment in which work is produced.
  3. How the work is judged.

All three of these elements must be consistently managed in order to provide for fairness across different schools nationally.

Early years assessment

In light of changes to the national curriculum, it would be helpful to review and restate the purpose of statutory early years foundation stage profile (EYFSP) assessment. This would be particularly helpful if a new reception baseline assessment is introduced and could clarify whether, and to what extent, moderation of end-of-reception EYFSP assessments is necessary. This could also support consideration of whether the current grade descriptors (emerging, expected, exceeded) are appropriate for meeting the assessment’s purpose and are meaningful to those interpreting them. As mentioned above, it will be important to secure a coherent approach to teacher assessment across the early years and key stages, particularly for pupils with SEND and those who may go on to be working below the level of the primary curriculum.

Rochford Review and pupils with SEND

We appreciate the government’s focus on reviewing the approach to the assessment of pupils working below the level of the national curriculum assessments, many of whom may have special educational needs and/or disabilities. We would encourage government to continue to keep the experience and outcomes for pupils with SEND – working at all levels – under review.

It is important that the approach to assessing pupils working below the level of the national curriculum assessments is compatible with the approach to those working at the level of the assessments. The Rochford Review proposed that additional ‘secure-fit’ grades be developed below the current grade descriptors. We would encourage a review of this approach, not only in relation to the proposed change to a ‘best-fit’ model for writing, but also as part of a wider review of teacher assessment, to secure a coherent and defensible approach.