Guidance

Apprenticeship training provider accountability framework and specification

Updated 28 November 2024

Applies to England

This apprenticeship accountability framework and its specification inform our assessment of the quality of provision delivered by apprenticeship training providers.

We expect training providers to routinely review and assess the quality and success of their individual programmes. We expect them to take proactive action to improve their provision and shape their apprenticeship offer. This framework supports training providers to proactively:

  • review their own performance
  • identify potential quality issues
  • support their continuous improvement

Accountability policy for apprenticeship training providers

It’s important to protect the experience of all apprentices, increase achievements and drive up quality. We also want to ensure that pockets of poor provision do not undermine the performance of apprenticeship provision more broadly, or the brand of the programme. Apprenticeships rely on commitment from, and effective collaboration between, an employer, an apprentice and a training provider. While all parties have a key role to play in the success of an apprenticeship, we know that the quality of training is a major factor in whether apprentices successfully complete their apprenticeship.

This framework provides a holistic, timely and data-led approach to training provider accountability. It provides a wide range of quality indicators with specified thresholds that providers can use to support self-improvement. These will reduce the likelihood that we’ll need to intervene. We’ll use this information to determine when and where intervention may be needed.

Now that the framework is established, we’ll be taking a more robust approach in its application. We’re retaining the same indicators, but we’re introducing higher thresholds for 3 of the 10 indicators (withdrawals, apprentices past planned end date, and breaks in learning) and removing the 250-apprentice threshold that applied to 3 of the measures.

The thresholds in this framework reflect our minimum expectations. You should not use them as targets or aspirations. Providers with large volumes of learners should expect close monitoring and performance management designed to effect continuous improvement.

Our stronger approach will take immediate effect. We’ll use the new thresholds in performance conversations from June 2024, when the supporting data dashboard is published with the updated thresholds and applied to current and historic data. We’ll use the framework to assess performance throughout the academic year to ensure that, where we need to take intervention action, this is timely and proportionate.

At our discretion, we’ll make full use of the range of existing contractual measures available that are detailed in your provider funding agreement. We’ll evaluate each case according to its own circumstances. This may include acting in cases where performance across the different indicators compares poorly with other providers delivering similar provision, or where there have been ongoing performance conversations but agreed improvements have not been realised for reasons within the provider’s control.

We’ll continue to consider relevant contextual factors that may be impacting a provider’s performance, where these can be appropriately evidenced.

Indicators we use to review provider performance

Reviews will assess the latest available data for each indicator for the current academic year and the previous academic year (where applicable).

We’ll continue to keep the indicators and their thresholds under review to ensure they’re set at the right level.

Quality indicators and thresholds

Outcomes from Ofsted inspection

This refers to the outcome from your most recent Ofsted inspection. We’ll consider organisations to be ‘at risk’ if Ofsted grades them:

  • inadequate for ‘apprenticeships’
  • inadequate for ‘overall effectiveness’ under its further education (FE) and skills remit, where there is no separate apprenticeship grade

The contractual action we may take in these circumstances is set out in your funding agreement.

We’ll prioritise providers under the accountability framework if they’ve not yet been inspected by Ofsted, or if they’ve been inspected by Ofsted and received:

  • a ‘requires improvement’ grade at full inspection
  • an ‘insufficient progress’ assessment at a new-provider monitoring visit

Apprenticeship achievement rate

Apprenticeship achievement rates are calculated as part of qualification achievement rates (QARs).

We’ll assess organisations with an all-age apprenticeship achievement rate:

  • of less than 50% as ‘at risk’
  • greater than or equal to 50% and less than 60% as ‘needs improvement’
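
To illustrate the banding logic only, it can be written as a short sketch. This is a minimal illustration, not an official tool: the function name is our own, and ‘on track’ follows the framework’s wording for providers meeting all thresholds. The retention rate indicator below follows the same pattern with thresholds of 52% and 62%.

  def assess_achievement_rate(rate: float) -> str:
      """Band an all-age apprenticeship achievement rate (a percentage)
      against the thresholds above."""
      if rate < 50:
          return "at risk"
      if rate < 60:  # greater than or equal to 50% and less than 60%
          return "needs improvement"
      return "on track"

  assess_achievement_rate(57.0)  # returns 'needs improvement'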

We update your QAR data within the accountability framework dashboard at the same time as the main QAR dashboard updates. These updates happen during the year at R10 in June and R12 in August, and again in January and March for R14.

Apprenticeship retention rate

Apprenticeship retention rates are calculated as part of qualification achievement rates.

We’ll assess organisations with an all-age apprenticeship retention rate:

  • of less than 52% as ‘at risk’
  • greater than or equal to 52% and less than 62% as ‘needs improvement’

We update your QAR data within the accountability framework dashboard at the same time as the main QAR dashboard updates. These updates happen during the year at R10 in June and R12 in August, and again in January and March for R14.

Apprentice feedback

This is collected through an apprentice’s ‘My apprenticeship’ account on the Apprenticeship Service.

Information on how the apprentice feedback score is calculated is available.

In their Apprenticeship Service provider accounts, providers can see single academic year scores for the past 5 years, an average score based on the past 5 years’ scores and the detail behind their scores.

To ensure we only review recent performance, the framework will use the most recent academic year rating for assessment against the threshold. This will be displayed on the accountability framework dashboard, to which providers should refer.

Organisations with an apprentice feedback score of less than 2.5 will be assessed as ‘needs improvement’.

Employer feedback

This is collected through a questionnaire. Information on how the employer feedback score is calculated is available.

In their Apprenticeship Service provider accounts, providers can see single academic year scores for the past 5 years, an average score based on the past 5 years’ scores and the detail behind their scores.

To ensure we only review recent performance, the framework will use the most recent academic year rating for assessment against the threshold. This will be displayed on the accountability framework dashboard, to which providers should refer.

Organisations with an average employer feedback score of less than 2.5 will be assessed as ‘needs improvement’.

Supplementary indicators and thresholds

Further technical information related to the supplementary indicators can be found in Annex 1.

Apprentices past planned end date

This indicator refers to apprentices who are continuing training past their planned learning end date or were past it when they completed their apprenticeship (it does not include the end-point assessment period).

We’ll assess organisations with more than 15% of the total number of apprentices past their planned end date by 180 days or more as ‘at risk’.

We’ll assess organisations as ‘needs improvement’ if they have more than 15% of the total number of apprentices past their planned end date by 90 days or more but less than 180 days.

Break in learning

This indicator refers to apprentices who have gone on an agreed break in learning.

We’ll assess organisations with more than 10% of the total number of apprentices on a break in learning by 365 days or more as ‘at risk’.

We’ll assess organisations as ‘needs improvement’ if they have more than 10% of the total number of apprentices on a break in learning by 180 days or more but less than 365 days.

End-point assessment organisation (EPAO) data

This indicator assesses organisations with apprentices identified as having no EPAO in the ILR:

  • within 3 months of the planned end date as ‘at risk’
  • within 3 to 6 months of the planned end date as ‘needs improvement’

The EPAO in the ILR must be valid for the standard being delivered. It is the provider’s responsibility to ensure they select an appropriate EPAO for the apprenticeship standards they’re delivering. Where our validation checks highlight instances of invalid EPAOs being selected, we may take this into account as part of your performance review.

Off-the-job training

This is collected through the FRM37 report of financial assurance: monitoring post-16 funding.

We’ll assess organisations as ‘at risk’ if they have any of the following:

  • more than 20 records with errors
  • one or more apprentices reported with zero planned hours
  • one or more apprentices with zero actual hours on apprenticeship completion

We’ll assess organisations as ‘needs improvement’ if they have either:

  • more than 15 records with errors for planned hours
  • one or more records with errors for actual hours

Withdrawals

This indicator refers to apprentices who have withdrawn from their learning activities.

We’ll assess organisations as ‘at risk’ if their withdrawals are greater than 20% of the total number of apprentices.

We’ll assess organisations as ‘needs improvement’ if their withdrawals are less than or equal to 20% and greater than 15% of the total number of apprentices.

Review process

We’ll continually monitor provider performance against the indicators. We may contact providers at any point in the academic year to discuss where their performance falls below the thresholds.

Before a review

Case managers will identify any provider that is ‘at risk’ or ‘needs improvement’. They will contact you to set out where data indicates provision has fallen below any thresholds.

The case manager will invite you to have a management conversation. We expect you to respond promptly.

Consider whether there are mitigating factors and share any supporting evidence with your case manager before the conversation.

During the management conversation

The case manager will review provider data against the framework’s quality and supplementary indicators.

The discussion or correspondence is focused on where your data indicates performance is below required thresholds. We’ll discuss:

  • evidenced reasons why performance is below our specified thresholds
  • your track record
  • capacity to improve, including evidence of action you may have already taken to improve and when you expect to see the impact, and progress in relation to improvement plans or targets we have previously agreed with you
  • how your performance compares with sector or standard benchmarks (where appropriate)
  • any other relevant contextual evidence

The case manager may ask for further information. You should provide this within the agreed time period.

After the review

Following the meeting, the case manager will inform you in writing about whether any follow-up action is required.

Range of interventions

After the management conversation, if we believe that poor-quality provision is due to provider failure, we’ll intervene and take further action.

Our interventions will prioritise protecting apprentices’ interests. We may support you to improve your provision, where you demonstrate this is possible in a timely manner.

Your funding agreement sets out the full range of intervention actions we may take. These include but are not limited to:

  • requiring you to produce an improvement plan with associated targets
  • restricting your ability to recruit new apprentices for one or multiple standards
  • restricting your sub-contracting arrangements
  • withholding or suspending funding for a fixed or indefinite period
  • capping funding for delivery of new standards, for a fixed or indefinite period
  • terminating your contract, where necessary
  • putting in place other contractual conditions as appropriate

No intervention

If you are ‘on track’ against all indicators, we’ll not contact you to arrange a review.

There may be instances where providers do not meet one or more of the quality indicators, but we decide that an immediate management conversation is not needed. For example, a provider may have an ‘outstanding’ Ofsted grade and we expect them to have the capacity to improve.

Contextual factors

We’ll take relevant contextual factors and a holistic view into account when we review your performance.

Benchmarked QARs

We have introduced benchmarked QAR data to the apprenticeship accountability framework (AAF) dashboard. This will be reviewed as part of the management conversation. It will support the evaluation of each case through more transparent comparison of performance with providers delivering similar provision and therefore facing similar sectoral challenges.

Benchmarked data will not lead to automatic removal of provision. Recognising that providers delivering similar provision may be impacted by different provider-level challenges, we’ll consider wider contextual information – for example, delivery location, learner and employer profiles, and cohort or provider size – when reviewing benchmarked data with you.

Further information on benchmarked QAR data can be found in Annex 2.

Apprentices with protected characteristics

Every apprentice deserves excellence in their training provision, and all those considering an apprenticeship need to know that apprenticeships are accessible and that they will be supported to achieve.

We expect providers to offer opportunities for training and progression that meet the needs of a range of apprentices and businesses. This is in line with your duties under the Equality Act 2010 not to discriminate against apprentices with protected characteristics. Some apprentices will need support to achieve full occupational competence on their chosen standard.

We’ll consider the profile of a provider’s cohort when we review their performance.

The provider guide to delivering high-quality apprenticeships provides guidance on the additional funding and support available to support different cohorts to achieve.

Small or new apprenticeship provision

When deciding on intervention action for underperforming providers, we’ll consider whether they:

  • have small cohorts
  • offer new or immature provision

We expect these providers to set realistic improvement targets as a priority. We’ll challenge them on reasonable progress and evidence of impact.

Data timeliness and accuracy

It’s important that your data accurately reflects your apprentice population and performance at any point in the year. Your funding agreement sets out your obligations with regard to the accurate and timely reporting of data.

You can use available data validation tools to test the credibility of your data.

Complaints and feedback

You can complain about an intervention through our customer help portal, if you’re unable to resolve this with your case manager directly.

Framework policy principles

The framework is underpinned by 5 principles.

Data-driven

We’ll use a wide range of quality indicators to give us a rounded overview of a provider’s delivery.

Risk-based

We’ll take a risk-based approach, using a range of quality indicators. These will focus on providers where there may be quality issues so we can intervene to protect apprentices’ interests.

Encourage self-improvement

We aim to:

  • identify risks to quality early
  • protect apprentices’ interests
  • support self-improvement

Timeliness

We’ll monitor your performance data throughout the academic year. This ensures that management conversations and intervention happen earlier, where necessary.

We expect you to monitor and review your performance data throughout the academic year.

Proportionality

We’ll only take interventions as a result of a management conversation. Interventions are not automatic and we’ll consider previous or ongoing management conversations and providers’ track record.

We’ll take proportionate action to support providers to address quality issues, where they demonstrate they have the capacity to improve in a timely manner or that some performance issues are evidently beyond their control.

Annex 1: Definitions used for the supplementary indicators

Data that makes up the supplementary indicators is collected through the Individualised Learner Record (ILR) data collection on Submit Learner Data, a service for DfE-funded providers to validate and submit their data.

We use only the latest ILR return to generate the Apprenticeship Accountability Framework, the outputs of which can be found on View Your Education Data.

All the supplementary indicators use the following definition in the denominator. ‘Total number of apprentices’ means all your apprenticeship programme aims within the academic year, regardless of their completion status.

It includes:

  • new starts
  • existing apprentices
  • apprentices on both apprenticeship standards and frameworks

We exclude apprentices who do not meet the qualifying period of a minimum of 42 days. This is set out in the apprenticeship funding rules. We may monitor the volume of leavers in the first 42 days of their apprenticeship in future updates.

Technically, this covers records that have:

  • an Aim Type of 1 (‘programme aim’)
  • a Programme Type of:
    • 2 (‘advanced level apprenticeship’)
    • 3 (‘intermediate level apprenticeship’)
    • 20 (‘higher apprenticeship – level 4’)
    • 21 (‘higher apprenticeship – level 5’)
    • 22 (‘higher apprenticeship – level 6’)
    • 23 (‘higher apprenticeship – level 7+’)
    • 25 (‘apprenticeship standard’)
  • a Funding Model of:
    • 35 (‘adult skills’)
    • 36 (‘apprenticeships’)
    • 81 (‘other adult’)
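
As an illustration only, a record-level filter for this denominator might look like the following Python sketch. The field and function names are simplified stand-ins of our own for the ILR fields above, not part of any official tool, and the 42-day qualifying period is applied as described:

  from dataclasses import dataclass

  @dataclass
  class ProgrammeAim:
      """Hypothetical, simplified view of an ILR programme-aim record."""
      aim_type: int
      programme_type: int
      funding_model: int
      days_in_learning: int  # used for the 42-day qualifying period

  APPRENTICESHIP_PROGRAMME_TYPES = {2, 3, 20, 21, 22, 23, 25}
  APPRENTICESHIP_FUNDING_MODELS = {35, 36, 81}

  def in_denominator(aim: ProgrammeAim) -> bool:
      """True if the record counts towards the 'total number of apprentices'."""
      return (
          aim.aim_type == 1  # programme aim
          and aim.programme_type in APPRENTICESHIP_PROGRAMME_TYPES
          and aim.funding_model in APPRENTICESHIP_FUNDING_MODELS
          and aim.days_in_learning >= 42  # qualifying period
      )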

Apprentices past planned end date

This definition uses the following fields from the ILR:

  • Completion status
    • 1 (‘the learner is continuing or intending to continue the learning activities leading to the learning aim’)
    • 2 (‘the learner has completed the learning activities leading to the learning aim’)
  • Learning actual end date
  • Learning planned end date

The definition also uses a field called ILR Freeze Date. From R01 in September to R11 in July, this is the date that the ILR data collection for that period closed, as defined in column B of the ILR freeze schedule.

For R12 in August through to R14 in October, this date is coded to 31 July, as the data collection closes following the end of the academic year.

Between 90 and 180 days

Set to 1 if any of the following conditions are true.

If the Completion Status is 1 and the Learning Actual End Date is not populated and the difference between the Learning Planned End Date and the ILR Freeze Date is greater than or equal to 90 days and less than 180 days.

If the Completion Status is 1 and the Learning Actual End Date is populated and the difference between the Learning Planned End Date and the Learning Actual End Date is greater than or equal to 90 days and less than 180 days.

If the Completion Status is 2 and the difference between the Learning Planned End Date and the Learning Actual End Date is greater than or equal to 90 days and less than 180 days.

Greater than or equal to 180 days

Set to 1 if any of the following conditions are true.

If the Completion Status is 1 and the Learning Actual End Date is not populated and the difference between the Learning Planned End Date and the ILR Freeze Date is greater than or equal to 180 days.

If the Completion Status is 1 and the Learning Actual End Date is populated and the difference between the Learning Planned End Date and the Learning Actual End Date is greater than or equal to 180 days.

If the Completion Status is 2 and the difference between the Learning Planned End Date and the Learning Actual End Date is greater than or equal to 180 days.

Total

Set to 1 if any of the following conditions are true.

If the Completion Status is 1 and the Learning Actual End Date is not populated and the difference between the Learning Planned End Date and the ILR Freeze Date is greater than or equal to 1 day.

If the Completion Status is 1 and the Learning Actual End Date is populated and the difference between the Learning Planned End Date and the Learning Actual End Date is greater than or equal to 1 day.

If the Completion Status is 2 and the difference between the Learning Planned End Date and the Learning Actual End Date is greater than or equal to 1 day.
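
The 3 conditions in each band follow one pattern: pick a comparison date (the Learning Actual End Date where it is populated or the record is complete, otherwise the ILR Freeze Date), then measure how far it falls after the Learning Planned End Date. A minimal Python sketch of this, using simplified field names of our own:

  from datetime import date
  from typing import Optional

  def days_past_planned_end(completion_status: int,
                            planned_end: date,
                            actual_end: Optional[date],
                            ilr_freeze_date: date) -> Optional[int]:
      """Days past the planned end date, per the definition above.
      Returns None for completion statuses this indicator does not cover."""
      if completion_status not in (1, 2):
          return None
      if completion_status == 1 and actual_end is None:
          # Continuing, no actual end date: compare with the ILR Freeze Date
          return (ilr_freeze_date - planned_end).days
      if actual_end is None:
          return None  # defensive: completed records should have an end date
      return (actual_end - planned_end).days

  def past_end_band(days: Optional[int]) -> Optional[str]:
      """Band used by this indicator; None means not past the planned end."""
      if days is None or days < 1:
          return None
      if days >= 180:
          return "greater than or equal to 180 days"
      if days >= 90:
          return "between 90 and 180 days"
      return "counted in the total only"  # past by fewer than 90 days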

Break in learning

This definition uses the following fields from the ILR:

  • Completion status of 6 (‘learner has temporarily withdrawn from the aim due to an agreed break in learning’)
  • Learning actual end date
  • Learning planned end date

We exclude records from this definition where a learner has returned from a break in learning, and the details sent for the returning instance match those of the original aim.

We determine this by checking that the Learning Start Date of the new aim is greater than the Learning Actual End Date of the original aim. We match records using UKPRN, Learner Reference Number, Framework Code, Standard Code, and Programme Type. For ease of use, this is referred to as ‘Planned Breaks Restarted’.

The definition also uses a field called ILR Freeze Date. From R01 to R11, this is the date that the ILR data collection for that period closed. For R12 through to R14, this date is coded to 31 July, as the data collection closes following the end of the academic year.

Between 180 and 365 days

Set to 1 if the Completion Status is 6 and the difference between the Learning Actual End Date and the ILR Freeze Date is greater than or equal to 180 days and less than 365 days and the ‘Planned Breaks Restarted’ flag is set to 0.

Greater than or equal to 365 days

Set to 1 if the Completion Status is 6 and the difference between the Learning Actual End Date and the ILR Freeze Date is greater than or equal to 365 days and the ‘Planned Breaks Restarted’ flag is set to 0.

Total

Set to 1 if the Completion Status is 6 and the difference between the Learning Actual End Date and the ILR Freeze Date is greater than or equal to 1 day and the ‘Planned Breaks Restarted’ flag is set to 0.
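
A minimal sketch of this logic, with simplified field names of our own and the ‘Planned Breaks Restarted’ exclusion passed in as a pre-computed flag:

  from datetime import date
  from typing import Optional

  def break_in_learning_band(completion_status: int,
                             actual_end: date,
                             ilr_freeze_date: date,
                             planned_break_restarted: bool) -> Optional[str]:
      """Band a break-in-learning record per the definition above."""
      if completion_status != 6 or planned_break_restarted:
          return None
      days = (ilr_freeze_date - actual_end).days  # length of the break so far
      if days >= 365:
          return "greater than or equal to 365 days"
      if days >= 180:
          return "between 180 and 365 days"
      if days >= 1:
          return "counted in the total only"  # a break of under 180 days
      return None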

End-point assessment organisation (EPAO) data

This definition uses the following fields from the ILR:

  • Completion status of 1 (‘the learner is continuing or intending to continue the learning activities leading to the learning aim’)
  • End point assessment organisation
  • Learning actual end date
  • Learning planned end date

For this definition, in both the numerator and the denominator, we only look at records for apprenticeship standards (Programme Type 25) with a Funding Model of 36.

The definition also uses a field called ILR Freeze Date. From R01 to R11, this is the date that the ILR data collection for that period closed. For R12 through to R14, this date is coded to 31 July, as the data collection closes following the end of the academic year. It includes apprentices who are past their planned end date.

Within 3 months of the planned end date

Set to 1 if the Completion Status is 1 and the EPA Organisation ID is NA or Unknown and the difference between the Learning Planned End Date and the ILR Freeze Date is less than or equal to 90 days.

Within 3 to 6 months of the planned end date

Set to 1 if the Completion Status is 1 and the EPA Organisation ID is NA or Unknown and the difference between the Learning Planned End Date and the ILR Freeze Date is greater than 90 days and less than or equal to 180 days.

Total

Set to 1 if the Completion Status is 1 and the EPA Organisation ID is NA or Unknown.
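
A sketch of this banding with simplified names of our own. We assume the difference is taken as the planned end date minus the freeze date, so apprentices already past their planned end date (a negative difference) fall into the ‘within 3 months’ band, consistent with the note above:

  from datetime import date
  from typing import Optional

  def epao_band(completion_status: int,
                epao_id: Optional[str],
                planned_end: date,
                ilr_freeze_date: date) -> Optional[str]:
      """Band a record with no valid EPAO recorded, per the definition above."""
      if completion_status != 1 or epao_id not in (None, "NA", "Unknown"):
          return None
      days_to_planned_end = (planned_end - ilr_freeze_date).days
      if days_to_planned_end <= 90:
          # includes apprentices already past their planned end date
          return "within 3 months of the planned end date"
      if days_to_planned_end <= 180:
          return "within 3 to 6 months of the planned end date"
      return "counted in the total only"  # planned end over 6 months away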

Withdrawals

This definition uses the following fields from the ILR:

  • Completion status of 3 (‘the learner has withdrawn from the learning activities leading to the learning aim’)
  • Learning actual end date
  • Learning start date

Total

Set to 1 if the Completion Status is 3 and the difference between the Learning Start Date and Learning Actual End Date is greater than 42 days.
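
Illustratively, with simplified field names of our own:

  from datetime import date

  def counts_as_withdrawal(completion_status: int,
                           start: date,
                           actual_end: date) -> bool:
      """True if the record counts in the withdrawals numerator:
      withdrawn (status 3) after the 42-day qualifying period."""
      return completion_status == 3 and (actual_end - start).days > 42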

Annex 2: Guidance and technical notes for benchmarked QAR data

Benchmarked data can be found by selecting ‘check here to see how your stats compare to national comparison’ on the QAR section of the AAF dashboard.

It includes comparative benchmarking data, setting out providers’ overall, sector subject area and standard-level QARs against national equivalents.

It also includes detailed benchmarked standard-level QAR data, based on quartiles – this shows providers in more detail how their delivery compares to providers delivering the same standard, by setting out whether they are performing better than 25%, 50% or 75% of other providers delivering the same standard, or are in the lowest 25%. As this is a new way of presenting data, we have provided technical notes to support providers’ understanding.

AAF indicator thresholds will remain the route for identifying which providers will be contacted for an AAF management conversation. The indicator thresholds for QARs are based on providers’ overall QAR. Benchmarked QAR data will not be used as an indicator threshold.

Benchmarked data will be updated once a year with full-year data, once national achievement rates tables (NARTs) have been published. The front page of AAF achievements data will continue to be updated in-year alongside in-year QAR updates.

Benchmarked data will remain within each provider’s private AAF dashboard and will not be shared publicly or published on Explore Education Statistics (EES). The quartile benchmarked data is confidential information belonging to DfE that is not in the public domain and therefore should not be used in public-facing materials.

We welcome feedback on the first iteration of this data, especially on how we can improve its presentation in the AAF dashboard and the explanation in the technical notes. Email provider.strategy@education.gov.uk.

Quartile benchmarking: technical notes

How quartile benchmarking is calculated

Quartiles divide an ordered list of providers’ qualification achievement rates for a given standard (that is, ordered from lowest to highest) into 4 equal-sized groups to reflect the relative performance of providers.

A provider is achieving better than 25%, 50% or 75% of other providers for a given standard if their QAR falls into the second, third or fourth quartile, respectively, of this ordered list of achievement rates for that standard.

The QAR value is calculated at the 25th, 50th (median) and 75th percentile of an ordered list of achievement rates for a given standard. Each provider’s QAR is then compared to these values to calculate the quartile they fall into and the proportion of providers they are performing better than for that standard, as follows:

  • providers with a QAR less than or equal to the value at the 25th percentile: their QAR for this standard is within the lowest 25% of all QARs for this standard (the lowest quartile)

  • providers with a QAR less than or equal to the median, and above the 25th percentile: their QAR for this standard is better than that of at least 25% of other providers delivering this standard

  • providers with a QAR less than or equal to the 75th percentile, and above the median: their QAR for this standard is better than that of at least 50% of other providers delivering this standard

  • providers with a QAR above the 75th percentile: their QAR for this standard is better than that of at least 75% of other providers delivering this standard (the top quartile)

The minimum and maximum quartile boundaries for the quartile a provider falls into for a standard are set out in the dashboard under the columns ‘quartile boundary (minimum)’ and ‘quartile boundary (maximum)’.

The quartile a provider falls into is determined only by the value of their QAR. This calculation is based on the exact QAR value, calculated as achievers ÷ leavers (× 100), without rounding to the single decimal place at which QAR is usually displayed.
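
As an illustrative sketch only, this placement can be reproduced with linearly interpolated percentiles, which are one way of producing the calculated boundary values described in scenario 3 below. The function is our own, not the official calculation:

  import numpy as np

  def quartile_statement(provider_qar, all_qars):
      """Place an unrounded QAR against the quartile boundaries for a standard.
      A QAR exactly equal to a boundary falls into the quartile below."""
      q25, q50, q75 = np.percentile(all_qars, [25, 50, 75])
      if provider_qar <= q25:
          return "within the lowest 25% of QARs for this standard"
      if provider_qar <= q50:
          return "better than at least 25% of other providers"
      if provider_qar <= q75:
          return "better than at least 50% of other providers"
      return "better than at least 75% of other providers (top quartile)"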

Scenarios

Scenario 1

The quartiled approach reflects the relative performance of providers. For standards where most providers have high QARs, the boundaries between quartiles will also be high. For standards where most providers have low QARs, the boundaries will be low. This means that:

  • providers securing high QARs in high-performing standards may not fall into the higher quartiles, because their performance, while good, is relatively lower than that of other providers delivering the same standard
  • providers delivering standards that usually achieve low QARs may fall into the higher quartiles despite having a low QAR, because their performance is relatively better than that of other providers delivering the same standard

Scenario 2

In any instance where a provider has exactly the same QAR for a standard as another provider (a tie), they will both fall in the same quartile.

Scenario 3

Where the 25th, 50th (median) and 75th percentile value does not fall exactly on an existing provider, a calculated value is used. This means the boundary value may not be a QAR achieved by a specific provider.

For instance, in a standard offered by 12 providers, each quartile would ideally contain 3 providers if no ties existed. Calculated values would be used to create boundaries falling between the third and fourth, sixth and seventh, and ninth and 10th provider to split the providers into 4 groups of 3.
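
For example, linearly interpolated percentiles (one possible way of producing these calculated values) place the boundaries exactly as described for 12 hypothetical QARs:

  import numpy as np

  # 12 hypothetical QARs for a standard, ordered lowest to highest
  qars = [40, 45, 48, 52, 55, 58, 60, 63, 66, 70, 74, 80]
  print(np.percentile(qars, [25, 50, 75]))
  # [51. 59. 67.] - between the 3rd/4th, 6th/7th and 9th/10th providers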

Scenario 4

Where the QAR of a provider is exactly equal to a boundary, the provider falls into the quartile below to ensure the statement ‘For this standard, your QAR is within the lowest 25% of other providers or better than at least 25%, 50% or 75% of other providers’ holds true.

Scenario 5

For standards where many providers achieve a 100% or 0% QAR, the calculated quartile boundaries may be 100% or 0%, so, in some standards, a 100% QAR may not place a provider into the highest quartile. In this instance, enough providers have 100% QAR for this standard that 100% is not an achievement better than a full 75% of other providers, as ties are not counted.
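
A hypothetical illustration, again using linearly interpolated percentiles:

  import numpy as np

  qars = [90.0, 95.0] + [100.0] * 8  # 8 of 10 providers achieve 100%
  print(np.percentile(qars, [25, 50, 75]))  # [100. 100. 100.]
  # A 100% QAR equals (rather than exceeds) the 75th percentile boundary,
  # so under the boundary rule it does not reach the top quartile.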

Scenario 6

At present, there must be at least 2 providers delivering a standard for quartile benchmarking data to be generated for that standard. We’ll explore setting a further minimum in future iterations of this data if user testing suggests this is appropriate. Quartile benchmarked data has therefore not been generated for standards delivered by only one provider. This means providers that are the sole deliverers of standards will see comparative benchmarked data but not quartile benchmarked data for these standards.

Scenario 7

The use of calculated quartile boundaries, as explained in scenario 3, will have a greater effect on standards offered by fewer providers, as follows.

Where a standard is delivered by only 2 providers, the calculated boundaries would fall 25%, 50% and 75% of the way between the 2 providers’ QAR values. This means one provider will fall in the lowest and one in the highest quartile.
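
For example, with 2 hypothetical providers on 50% and 70%:

  import numpy as np

  print(np.percentile([50.0, 70.0], [25, 50, 75]))  # [55. 60. 65.]
  # The provider on 50% sits below every boundary (lowest quartile);
  # the provider on 70% sits above every boundary (top quartile).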

Having more than 2 providers on a standard gives more data for the boundary calculations, increasing the accuracy of the calculated boundaries.

Where numbers are too small for even quartile splits, the quartile given represents the proportion of providers you would be achieving better than if the standard were more widely offered and achievement followed the current pattern.

As quartiles are a measure of the spread of current data, quartile benchmarking is a more accurate indicator where there is more data for comparison, as this allows for more even quartile splits and more accurate projections.