DWP: Employment and Support Allowance Online Medical Matching

A tool which helps Employment and Support Allowance (ESA) agents match medical conditions for ESA claims.

Tier 1 Information

1 - Name

Employment and Support Allowance Online Medical Matching

2 - Description

Employment and Support Allowance (ESA) is a benefit you can apply for if you have a disability or health condition that affects how much you can work. https://www.gov.uk/employment-support-allowance

If someone thinks they are eligible, they can apply for the benefit online. The claimant must fill in an online form, which an agent uses to help decide whether the claimant is eligible for ESA. One of the pieces of information included on the online form is the claimant's medical condition(s), entered in a free text box. An agent uses what is entered in the free text box to map the condition(s) to those held and recognised by the Department for Work and Pensions (DWP). This solution helps ESA agents by automatically mapping the medical condition(s) entered in the free text box on the ESA claim against the list of conditions recognised by DWP. The match is still validated by an agent, but the solution helps by doing the upfront matching.

3 - Website URL

N/A

4 - Contact email

IAG.LIVESERVICE@DWP.GOV.UK

Tier 2 - Owner and Responsibility

1.1 - Organisation or department

Department for Work & Pensions

1.2 - Team

The Garage, Cross Boundary Team, DWP Digital

1.3 - Senior responsible owner

Deputy Director - Cross Boundary Team, Strategic Delivery Unit

1.4 - External supplier involvement

Yes

1.4.1 - External supplier

Accenture

1.4.2 - Companies House Number

4757301

1.4.3 - External supplier role

The Garage is a partnership between DWP and Accenture resources to create innovative solutions to resolve DWP challenges. On this piece of work, the developers were mainly Accenture resources, led by an Accenture Delivery Lead, working under DWP Project Managers.

1.4.4 - Procurement procedure type

Open competition against a framework call off

1.4.5 - Data access terms

The Accenture team who worked on this did not require SC clearance.

Tier 2 - Description and Rationale

2.1 - Detailed description

Employment and Support Allowance (ESA) is a benefit you can apply for if you have a disability or health condition that affects how much you can work (https://www.gov.uk/employment-support-allowance). If someone thinks they are eligible, they can apply for the benefit online. The claimant must fill in an online form, which an agent uses to help decide whether the claimant is eligible for ESA. One of the pieces of information included on the online form is the claimant's medical condition(s), entered in a free text box. An agent uses what is entered in the free text box to map the condition(s) to those held and recognised by the Department for Work and Pensions.

This solution helps ESA agents map the medical condition(s) entered into a free text box on the ESA claim against the list of conditions maintained by DWP, to save agent time.

In July 2020, The Garage was asked to create a solution that would reduce the time it took agents to work through claims. We introduced a solution using fuzzy matching, a cutting-edge technology at that point in time. However, fuzzy matching matches words and phrases based on spelling rather than the context in which they are used. For example, Chronic Fatigue was translated into Chronic Renal Failure, and Partially Amputation of Foot was translated into Partially Sighted. As a result, only 35% of conditions were correctly matched, which meant 65% still had to be manually updated and mapped across by an agent. Understandably, this was frustrating for agents.

In 2024, we felt we could improve upon the solution. The Garage determined that it could be made more effective through the use of AI, specifically a Large Language Model (LLM), which better understands the context of the medical conditions written in the free text field. Whereas fuzzy matching matched medical conditions through spelling, the LLM has fundamentally changed the way that the solution functions. For example, Chronic Fatigue is now translated into Fatigue, as the LLM understands that fatigue, not chronic, is the key to that condition.
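The spelling-based behaviour described above can be illustrated with a minimal sketch. The record does not name the fuzzy matching algorithm used in 2020, so Python's standard-library `difflib.SequenceMatcher` is used here as a stand-in, and the three-item condition list is a hypothetical extract rather than the real DWP list.

```python
from difflib import SequenceMatcher

# Hypothetical extract of a condition list; the real DWP list is not reproduced here.
CONDITIONS = ["Chronic Renal Failure", "Fatigue", "Partially Sighted"]


def spelling_similarity(a: str, b: str) -> float:
    """Score two strings by shared character runs only (no notion of meaning)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_match(text: str, conditions: list[str]) -> str:
    """Return the condition whose spelling is closest to the entered text."""
    return max(conditions, key=lambda c: spelling_similarity(text, c))


best = fuzzy_match("Chronic Fatigue", CONDITIONS)
print(best)
```

With this stand-in, the shared prefix "Chronic " outweighs the shared word "Fatigue", so the spelling-based score ranks Chronic Renal Failure above Fatigue, mirroring the kind of mismatch described above.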

2.2 - Scope

In July 2020, we implemented a solution using fuzzy matching, a cutting-edge technology at that point in time, to help ESA agents map the medical condition(s) entered into a free text box on the ESA claim against the list maintained by DWP, to save agent time. However, fuzzy matching matches words and phrases based on spelling rather than the context in which they are used. For example, Chronic Fatigue was translated into Chronic Renal Failure, and Partially Amputation of Foot was translated into Partially Sighted. As a result, only 35% of conditions were correctly matched, which meant 65% still had to be handled manually. Understandably, this was frustrating for agents, and in 2024 we felt it could be improved upon. The Garage determined that the solution could be made much more effective through the use of AI, specifically a Large Language Model (LLM), which better understands the context of the medical conditions written in the free text field. Whereas fuzzy matching matched medical conditions through spelling, the LLM has fundamentally changed the way that the solution functions. For example, Chronic Fatigue is now translated into Fatigue, as the LLM understands that fatigue, not chronic, is the key to that condition. Medical condition matching is only being used for ESA.

2.3 - Benefit

This solution has processed over 780,000 cases and saved 42,500 operational hours since its delivery into production in July 2020.
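As a rough worked example, the two figures above imply an average saving per case. This is a back-of-the-envelope average over the whole period since July 2020, not a figure reported by DWP, and it does not separate the fuzzy matching and LLM versions of the solution.

```python
cases_processed = 780_000   # "over 780,000 cases", so treated here as a lower bound
hours_saved = 42_500        # operational hours saved since July 2020

# Convert hours to minutes and spread them evenly across all cases.
minutes_saved_per_case = hours_saved * 60 / cases_processed
print(f"{minutes_saved_per_case:.1f} minutes saved per case on average")
```

Because the case count is stated as "over 780,000", the result of about 3.3 minutes per case is best read as an approximate upper estimate.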

2.4 - Previous process

Before the 2020 solution, this was all done manually by having agents read all the free text in the ESA Online applications. In 2020, when we put the fuzzy matching in place, it picked up 35% of the medical conditions correctly. This new solution increases the accuracy to 87%.

2.5 - Alternatives considered

Consideration was given to using a drop-down menu for claimants to select their condition from. This was discarded because claimants may panic if they didn’t see their condition on the list, and it could cause worry that they weren’t entitled.

Tier 2 - Decision making Process

3.1 - Process integration

Although this solution does not make a decision, it reduces the time agents spend reading the documentation to obtain the information they need. The agent decides whether the solution has captured the condition correctly or whether it needs to be amended. The decision made at this point by the system, with the agent overseeing, is whether the claimant is eligible to receive an award of ESA. By mapping the condition quickly and correctly, the solution helps the agent make a quicker decision.

3.2 - Provided information

The Medical Condition matching algorithmic tool matches the medical condition entered by the customer on the digital claim to the medical conditions listed in the DWP’s Incapacity Reference Guide (IRG). The closest match is used to register the claim on the ESA Benefit System by an automated registration solution. Once the registration is completed the agent performs a case review and a decision is made on the claim and whether ESA should be awarded.

3.3 - Frequency and scale of usage

This solution is used on a daily basis. It runs 08:00-18:00, Monday to Friday.

3.4 - Human decisions and review

The Medical Condition matching algorithmic tool matches the medical condition entered by the customer on the digital claim to the medical conditions listed in the DWP’s Incapacity Reference Guide (IRG). The closest match is used to register the claim on the ESA Benefit System by an automated registration solution. Once the automated registration of a digital claim is completed, the agent performs a case review and decides on the claim and whether ESA should be awarded.

3.5 - Required training

The operational instructions for handling the Digital ESA new claims include the steps to perform a full case review when the registration is completed by the automated solution and handed back to the DWP agents.

In addition to this, staff have received training to ensure they check the result, and do not simply accept the result provided by the solution. When the claim comes through, it gives the agent some tasks on the bottom of the screen. One of these is to check the IRG code, which refers to the health condition and medical evidence.

3.6 - Appeals and review

The decision is made by an agent, not by the technology, so the normal review and appeal routes apply. If the claimant doesn’t agree with the decision, they would follow the usual appeals process (https://www.gov.uk/appeal-benefit-decision).

Tier 2 - Tool Specification

4.1.1 - System architecture

The solution receives an API (Application Programming Interface) call containing the condition text. From that text, the solution calculates an embedding vector and finds the closest conditions on the list provided by the business. The code of the highest-matching condition is then returned through the API.
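The request flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the production implementation: the record does not say which similarity measure is used, so cosine similarity is assumed, and the condition codes and three-dimensional `TOY_VECTORS` are made up for illustration, standing in for the 384-dimensional embeddings produced by the model.

```python
import math
from typing import Callable, Dict, Sequence

Vector = Sequence[float]


def cosine_similarity(u: Vector, v: Vector) -> float:
    """Cosine of the angle between two vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms if norms else 0.0


def match_condition(
    text: str,
    reference: Dict[str, str],
    embed: Callable[[str], Vector],
) -> str:
    """Embed the entered text and return the code of the closest reference condition."""
    query = embed(text)
    return max(
        reference,
        key=lambda code: cosine_similarity(query, embed(reference[code])),
    )


# Hypothetical codes and hand-written vectors, for illustration only.
REFERENCE = {"X01": "Fatigue", "X02": "Chronic Renal Failure", "X03": "Partially Sighted"}
TOY_VECTORS = {
    "Chronic Fatigue": [0.9, 0.2, 0.0],
    "Fatigue": [1.0, 0.0, 0.0],
    "Chronic Renal Failure": [0.1, 1.0, 0.0],
    "Partially Sighted": [0.0, 0.0, 1.0],
}

code = match_condition("Chronic Fatigue", REFERENCE, TOY_VECTORS.__getitem__)
print(code, REFERENCE[code])
```

In a real deployment, `embed` would wrap the sentence transformer model, and the reference embeddings would more likely be computed once and cached rather than recalculated on every call.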

4.1.2 - Phase

Production

4.1.3 - Maintenance

Maintenance follows the normal approach that The Garage takes for Continuous Service Improvement (CSI) and enhancement, and the same live service maintenance and patching schedules as the rest of The Garage. A full-time live service team monitors the solution for any outages. Once a week, a patching scan runs automatically on all instances to check for compliance and the availability of patches. Patches are released once a month.

4.1.4 - Models

This solution uses a Large Language Model (LLM) to understand what has been written in the free text box. The model is a sentence transformer named all-MiniLM-L6-v2.

Tier 2 - Model Specification

4.2.1 - Model name

all-MiniLM-L6-v2

4.2.2 - Model version

2

4.2.3 - Model task

Sentence-Transformer

4.2.4 - Model input

The input is a string, in this case a medical condition.

4.2.5 - Model output

The output is a 384-dimensional vector.

4.2.6 - Model architecture

Transformer.

4.2.7 - Model performance

On the dataset provided, 360 samples drawn from client cases, performance was 87% correct predictions and 13% incorrect predictions.
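To put these percentages in context, a short worked calculation follows. The counts are derived from the rounded percentages, and the interval uses a normal approximation chosen here for illustration; neither is reported in the record.

```python
import math

samples = 360
accuracy = 0.87  # reported share of correct predictions

# Approximate counts implied by the rounded percentage.
approx_correct = round(accuracy * samples)
approx_incorrect = samples - approx_correct

# Normal-approximation 95% interval for a proportion (an illustrative assumption).
std_error = math.sqrt(accuracy * (1 - accuracy) / samples)
low, high = accuracy - 1.96 * std_error, accuracy + 1.96 * std_error

print(approx_correct, approx_incorrect)
print(f"{low:.1%} to {high:.1%}")
```

Under these assumptions, the figures correspond to roughly 313 correct and 47 incorrect matches, and a sample of this size gives an interval of about 83.5% to 90.5% around the reported 87%.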

4.2.8 - Datasets

These are the open source, publicly available datasets that were used: Reddit comments (2015-2018); S2ORC (three entries); WikiAnswers; PAQ; Stack Exchange (six entries); MS MARCO; GOOAQ: Open Question Answering with Diverse Answer Types; Yahoo Answers (three entries); Code Search; COCO Image captions; SPECTER citation triplets; SearchQA; Eli5; Flickr 30k; SNLI; MultiNLI; Sentence Compression; Wikihow; Altlex; Quora Question Triplets; Simple Wikipedia; Natural Questions (NQ); SQuAD2.0; TriviaQA.

4.2.9 - Dataset purposes

Training was performed through self-supervised learning using all of the datasets.

Tier 2 - Data Specification

4.3.1 - Source data name

Medical conditions for ESA customers

4.3.2 - Data modality

Tabular

4.3.3 - Data description

Medical conditions for ESA customers applying for new claims via the ESA Online Digital Service

4.3.4 - Data quantities

360 medical conditions entered by the customers on the ESA Online Digital Service

4.3.5 - Sensitive attributes

The name, address and gender of claimants are not provided to the AI model. It only receives the condition, which it matches against a list of medical conditions.

4.3.6 - Data completeness and representativeness

The data set used to assess the accuracy of the existing solution and to evaluate the accuracy of the LLM is small. The field in which the medical condition is entered on the customer-facing service is free text, so continuing to log this information poses a risk of recording sensitive customer information in the logs. The approach agreed with the business was therefore that, after the LLM solution was deployed to live, the business stakeholders would confirm when they want The Garage to increase the logging levels (for a couple of days) to capture each medical condition and the match found by the solution. They will provide operational support to verify whether the match found by the solution is appropriate for a sample of medical conditions. This will enable us to assess the improvement made by the new solution and whether it needs further tuning. This activity is still outstanding, as operations seem to be content with the current accuracy of the solution.

4.3.8 - Data collection

The data is entered by the customer using an online form and, once submitted, is stored within a managed secure database. The data within this database is then accessed by authorised DWP staff in order to process the claim for New Style ESA. The original purpose of the data collection is to decide whether to award ESA to an applicant.

The logging levels were increased for a few hours in production to capture the list of medical conditions and were then returned to their original levels. The only data collected was medical conditions, which were used to understand the accuracy of the existing solution and to assess the accuracy of the new solution.

4.3.9 - Data cleaning

No pre-processing required

4.3.10 - Data sharing agreements

N/A as no data is shared outside of DWP

4.3.11 - Data access and storage

The data used by this solution only has the medical conditions entered by the customers but no other customer details.

The data within the database can be accessed by authorised DWP staff. Claimants can request their data online, via letter or email. Once the request has been received there is a clerical process in place to action this.

DWP’s standard retention of 24 months will apply. Supporting records are retained for 24 months after DWP’s live interest in the claim has ended.

Tier 2 - Risks, Mitigations and Impact Assessments

5.1 - Impact assessment

We delivered this piece of work as a Change Request for a live automation, the ESA Online automation, as a Continuous Service Improvement (CSI).

We updated the DPIA to reflect technical changes made to the AI tool. Additionally, we conducted an Equality Impact Assessment. The CSI only involves a different technical approach to meet the same business requirement, with higher accuracy in matching medical conditions and without any design changes. The attached deck has the information on the background and change description for this CSI.

The Equality Assessment was completed on 2 September 2024.

The DPIA was completed on 30 August 2024.

5.2 - Risks and mitigations

RISK: There is a risk that this solution may incorrectly match the customer-provided medical descriptions with the DWP-approved medical terms.

MITIGATION: All agents review the solution output against the application, correct it where it is incorrectly classified, and must approve the suggested output. This reduces the risk of an incorrect classification being recorded, although both solution and human error could still occur and result in an incorrect classification. However, because the use of the LLM has increased the matching accuracy, it has helped to bring this overall risk down.

RISK: There is a risk that the use of AI is not a necessary and proportionate way to achieve the initiative’s goals.

MITIGATION: Less intrusive ways of meeting the aims of the initiative were considered such as having a drop-down menu which would allow claimants to select their medical condition. This was discarded due to the possibility that claimants may worry that they are not entitled to ESA if they cannot see their condition on the list.

RISK: Bias and/or discrimination: There is a risk that the processing could favour or penalise certain groups of people.

MITIGATION: The claimant’s name, address and gender are not provided to the AI model, to prevent potential bias or discrimination. The model only receives the condition, which it matches against a list of medical conditions. All decisions are reviewed by a human to ensure accurate and fair outcomes. An Equality Assessment was also completed.

RISK: There was a risk of bias in the automated decision-making process because the AI model was trained on publicly available data rather than historic ESA claims data.

MITIGATION: The LLM was trained through self-supervised learning using a variety of datasets, including flax sentence embeddings and conversational datasets. Performance was checked using 360 sample client cases: 87% of predictions were correct and 13% were incorrect. The accuracy is now 91.4% as a monthly average. This is constantly analysed and reviewed.

RISK: There is a risk that the human review during the decision-making process is not meaningful.

MITIGATION: Staff have received training to ensure they check the result, and do not simply accept the result provided by the solution. When the claim comes through, it gives the agent some tasks on the bottom of the screen. One of these is to check the IRG code, which refers to the health condition and medical evidence.

Staff are also trained to check the result and avoid any unintentional automated decision making (ADM) in this process.

RISK: There is a risk that DWP are not meeting their transparency obligations, as individuals may not be aware that their data has been processed by AI or that they have been subject to automated decision-making.

MITIGATION: DWP’s Personal Information Charter (PIC) refers to the use of AI.

A privacy notice is also shown on the initial ‘Start now’ screen which is presented to the user when applying for ESA. The notice directs them to the PIC.

This Algorithmic Transparency Recording Standard (ATRS) document has also been published.

Updates to this page

Published 10 February 2025