Housing Benefit caseload and flows statistics: methodology statement
Updated 12 November 2024
The main purpose of this document is to provide users with information about the methods used to produce the Housing Benefit caseload and flows statistics, in accordance with practices set out in the Code of Practice for Statistics.
Housing Benefit statistics form part of the Department for Work and Pensions (DWP) benefits statistics collection, which brings together summary National and Official statistics for a range of different benefits.
Housing Benefit (HB) provides people with help in paying their rents if they are unemployed, on a low income or in receipt of another qualifying income related benefit. HB is being replaced by Universal Credit (UC) for Working Age claimants, and the number of people claiming HB has been gradually decreasing and will continue to fall. HB is different to other DWP benefits because it is administered by Local Authorities on behalf of DWP.
Overview
Responsibility for producing HB statistics sits with two teams who work closely together within DWP Digital Data and Analytics.
The Data Delivery Team collate Housing Benefit data returns from Local Authorities to produce a single data file called the Single Housing Benefit Extract (SHBE). This extract contains administrative data derived from Local Authority computer systems. Local Authorities return data to DWP once a month. Because they return extracts at different times during the month, there is no common time stamp on the extract.
The Client Statistics Team combines data from two consecutive SHBE extracts to produce a HB caseload dataset representing the number of live HB claims on a given date (the second Thursday of the month). This dataset is then used to create HB caseload statistics for a particular month.
By further combining two HB caseload datasets, we can also provide information about any claims ending and new claims starting in the intervening time period between two HB scans. We call these statistics HB Flows.
Revision of HB statistics
Please note that HB statistics were revised in August 2022 in order to correct issues with the following fields:
- Passported Benefit Status
- Employment Status
- Removal of Spare Room Subsidy – Spare Room Subsidy Indicator
- Removal of Spare Room Subsidy – Number of Spare Rooms
- Removal of Spare Room Subsidy – Weekly Spare Room Reduction Amount bands
- Age
- Client Type
- Gender
Following an earlier policy change, we recently discovered that the passporting indicator on Housing Benefit statistics had been incorrectly recording outcomes for a section of claimants who receive Housing Benefit and Universal Credit at the same time. The correct outcome should have been “Passported: In receipt of Universal Credit.”
We have now fixed this issue, but in doing so, other variables were affected where the visible presence of a Universal Credit claim changes some outcomes. A methodological change was introduced to identify a caseload of UC claimants who were also in receipt of HB. From there we were able to apply coding rules to ensure that data matching was of sufficient quality. We have applied this change to datasets from April 2018, which reflects the date of the policy change.
An earlier methodological change to make better use of evidence from the Customer Information System has also been applied to datasets between April 2018 and May 2020. This has corrected the number of unknowns or missing values for age and gender and has also helped to improve the quality of other fields such as Client Type.
Housing Benefit caseload statistics
Data sources
Single Housing Benefit Extract (SHBE)
The SHBE dataset is the primary source used to create HB statistics. It contains data about all HB and Council Tax Benefit claims, and is collated from returns from administering Local Authorities. Local Authorities use a range of software suppliers to provide the systems to administer HB and to provide data to DWP.
Data are collected using a well-defined set of specifications which helps to ensure both consistency and quality. The data returns are monthly and cover a range of different characteristics about the status of each claim, the personal characteristics of claimants, payable amounts and any deductions that may have been made. The data also provides information about which Local Authority is administering each claim.
Customer Information System (CIS)
The Customer Information System (CIS) is a system used by DWP to collect information about customers. It provides the latest customer information including:
- personal details such as date of birth, gender and geographic information
- benefit awards
- preferred method of contact
We use two different data feeds from CIS in the production of HB caseload statistics. Firstly, we use an address history file to provide residential based geographic information. Secondly, we use date of birth and gender fields from a weekly view of CIS data to verify the age and gender data supplied from the SHBE extract.
Universal Credit Full System data (UCFS)
In order to identify HB claimants who are passported onto HB from UC, a scan of UC claimants is derived and then carefully matched on to SHBE data using a series of rules. The rules are designed to ensure we capture households who are on both benefits concurrently, rather than incorrectly including claims that have begun the process of moving from HB to UC.
How we process data
All of the data sources we access are held securely in a UNIX environment, and only analysts with specific permissions may gain access to it. The data we access have a lot of personal details removed, and are provided in a ‘Masked’ format to protect claimant identities. Personal identifiers such as National Insurance Numbers are encrypted as a security measure.
We use a statistical software package called SAS Enterprise Guide to combine and manipulate data following a process of well-defined rules. SAS coding programs are set up as a process flow to take raw data and turn them into publishable statistics.
Creating a liveload
We use 2 consecutive SHBE extracts in combination to create a caseload snapshot of live cases at the second Thursday of the month (the count date). We use a combination of the claim status field with claim start dates and claim end dates to determine which claims are live at that time point. We use data from two extracts so that we can gain information from the second scan on any cases that may have closed or recently opened on or before the count date.
The key points are:
- we can only use records with a valid National Insurance number and identifiable start dates (missing values are filtered out at the beginning)
- processing takes account of the fact that some Local Authorities return data before the count date and some data are returned after the count date
- we include data on closed claims to give us visibility of any short term claims which were live on the count date
- when removing duplication from the data to arrive at 1 row per claimant, live records are retained in preference to closed records, and we keep the observation closest to the count date
- only claims with an award greater than 50p are included in our statistics
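As an illustration only, the key points above could be sketched in Python as follows. This is not the SAS code used in production. The record layout, the field names, the treatment of the end date as inclusive and the order in which the filters are applied are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, Optional

MIN_AWARD = 0.50  # pounds; only awards greater than 50p are kept


@dataclass
class ClaimRecord:
    # Hypothetical layout; the real SHBE specification is much wider.
    nino: Optional[str]         # masked National Insurance number
    status: str                 # "live" or "closed", as reported in the extract
    start_date: Optional[date]
    end_date: Optional[date]    # None while the claim remains open
    extract_date: date          # when the Local Authority returned the record
    award: float                # award amount in pounds


def covers_count_date(rec: ClaimRecord, count_date: date) -> bool:
    """True if the claim had started and had not ended on the count date.

    Assumption: the end date is treated as the last day of the claim.
    """
    started = rec.start_date <= count_date
    not_ended = rec.end_date is None or rec.end_date >= count_date
    return started and not_ended


def preference(rec: ClaimRecord, count_date: date):
    """Sort key: live before closed, then the extract nearest the count date."""
    return (rec.status != "live", abs((rec.extract_date - count_date).days))


def build_liveload(records: Iterable[ClaimRecord],
                   count_date: date) -> Dict[str, ClaimRecord]:
    """Return one record per claimant for claims live on the count date."""
    chosen: Dict[str, ClaimRecord] = {}
    for rec in records:
        # Records without a valid NINO or start date are filtered out first.
        if not rec.nino or rec.start_date is None:
            continue
        if rec.award <= MIN_AWARD:
            continue
        # Closed records are kept if they were still live on the count date.
        if not covers_count_date(rec, count_date):
            continue
        current = chosen.get(rec.nino)
        if current is None or preference(rec, count_date) < preference(current, count_date):
            chosen[rec.nino] = rec
    return chosen
```

Because records from both extracts are passed in together, a short term claim that opened and closed around the count date is still counted, even if it only appears as a closed record in the second extract.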
Dealing with data quality issues
SHBE data is sometimes prone to known data quality issues where information supplied by Local Authorities may be incomplete or missing for a given month. After we create an initial liveload, we apply threshold based quality checks to determine where issues in Local Authority level data may be present. If data returned by a Local Authority fall outside acceptable thresholds, we make a determination about whether they should be replaced. If we need to replace data from a Local Authority, we will bring forward the caseload data from the previous month to represent the best available evidence for the current month being reported. This is a process known as “substitution”. Where substitutions occur we maintain a count of how often data are substituted for each Local Authority so that any emerging or persistent quality issues can be fed back and kept in view.
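The substitution process can be pictured with a minimal Python sketch. The actual thresholds are not stated in this document, so the 20% month-on-month tolerance below is purely hypothetical, as are the function names. The sketch separates flagging from replacement because the decision to substitute is a judgement made after the checks, not an automatic outcome.

```python
from collections import Counter
from typing import Dict, List, Optional

MAX_RELATIVE_CHANGE = 0.2  # hypothetical threshold, for illustration only


def outside_threshold(current: Optional[int], previous: int,
                      max_change: float = MAX_RELATIVE_CHANGE) -> bool:
    """True if this month's count is missing or moved too far from last month's."""
    if current is None:
        return True
    if previous == 0:
        return current != 0
    return abs(current - previous) / previous > max_change


def flag_authorities(current: Dict[str, List[str]],
                     previous: Dict[str, List[str]],
                     max_change: float = MAX_RELATIVE_CHANGE) -> List[str]:
    """List Local Authorities whose returns need a substitution decision."""
    flagged = []
    for la, prev_claims in previous.items():
        cur_claims = current.get(la)
        cur_count = None if cur_claims is None else len(cur_claims)
        if outside_threshold(cur_count, len(prev_claims), max_change):
            flagged.append(la)
    return flagged


def substitute(current: Dict[str, List[str]],
               previous: Dict[str, List[str]],
               to_replace: List[str],
               tally: Counter) -> Dict[str, List[str]]:
    """Bring forward last month's data for the chosen Local Authorities."""
    result = dict(current)
    for la in to_replace:
        result[la] = list(previous[la])
        tally[la] += 1  # running count of substitutions per Local Authority
    return result
```

Keeping the tally in a `Counter` that persists across months is one simple way to see which Local Authorities are substituted repeatedly.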
Adding data from Customer Information System and correcting timing issues
Once a liveload has been determined and quality issues dealt with, further programs are used to verify dates of birth and gender information from SHBE against the latest data held on the CIS. We can then apply additional variables such as a State Pension Age for each individual associated with a claim. From there, we can determine whether claims are “Working Age” or “Pension Age” using rules agreed with Policy colleagues.
Coding rules will address other timing issues in the data, such as correcting any HB reduction amounts where we find a claim may no longer meet the criteria on the count date. SAS codes will also apply appropriate data formats to the data in preparation for its publication.
A final SAS program will add on geographic information to the HB caseload dataset. This consists of 2 different types:
- Administration Based Geographies
- Residential Based Geographies
Administration Based Geographies
This describes which Local Authority is administering each claim.
Residential Based Geographies
This describes where claimants are living. This information is derived from the CIS Address History File, which allows us to add on the Census Output Area associated with the residential address. From that, we can determine a full geographic hierarchy using the National Statistics Postcode Lookup file (NSPL).
Once the HB caseload dataset has been produced, we perform a series of quality checks before creating a summarised file for uploading onto the Stat-xplore tool. Please see below.
Housing Benefit flows statistics
To supplement HB caseload statistics, HB flows datasets provide counts of new Housing Benefit claims starting, and the number of claims ending.
As described above, the count date for published HB caseload data is always the second Thursday of any given month.
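For readers who want to reproduce the count date, the second Thursday of a month can be derived with a few lines of Python. The function name is an assumption made for this example.

```python
from datetime import date

THURSDAY = 3  # date.weekday() numbers Monday as 0, so Thursday is 3


def second_thursday(year: int, month: int) -> date:
    """Return the second Thursday of the given month."""
    first_weekday = date(year, month, 1).weekday()
    # Days from the 1st of the month to the first Thursday (0 to 6).
    days_to_first_thursday = (THURSDAY - first_weekday) % 7
    return date(year, month, 1 + days_to_first_thursday + 7)


print(second_thursday(2020, 5))  # 2020-05-14
```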
Housing Benefit flows are created by comparing 2 HB caseload datasets which are set one month apart. We determine which claims are “live” only at the count date for Month 1 and which claims are “live” only at the count date for Month 2.
An off-flow is defined as the end of an existing Housing Benefit claim. An off-flow is identified where a claim appears on the previous month’s Housing Benefit data and is not present on the subsequent month’s data. For example, the May 2020 off-flow figure is calculated by counting claimants that are shown as live on the HB caseload data in April 2020 and who do not appear on the corresponding data for May.
An on-flow is defined as the start of a new Housing Benefit claim. An on-flow is identified where a claim appears on the data for the later month and is not present on the first month’s data. For example, the May 2020 on-flow figure is calculated by counting claimants that are shown as live on the HB caseload data in May 2020 and who do not appear on the corresponding data for April.
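The presence-based definitions of on-flows and off-flows amount to two set differences. A minimal Python sketch, using made-up claimant identifiers in place of masked National Insurance numbers, is:

```python
from typing import Dict, Iterable, Set


def compute_flows(month1_live: Iterable[str],
                  month2_live: Iterable[str]) -> Dict[str, Set[str]]:
    """Compare two caseload snapshots taken one month apart."""
    month1, month2 = set(month1_live), set(month2_live)
    return {
        "on_flows": month2 - month1,   # live in Month 2 only
        "off_flows": month1 - month2,  # live in Month 1 only
    }


# Illustrative identifiers only.
april = ["AA1", "BB2", "CC3"]
may = ["BB2", "CC3", "DD4"]
flows = compute_flows(april, may)
print(sorted(flows["on_flows"]))   # ['DD4']
print(sorted(flows["off_flows"]))  # ['AA1']
```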
How are the HB caseload and flows data different?
The presence-based methodology for creating flows uses two separate HB caseload datasets. In the initial stage, only the National Insurance Number for the ‘lead’ claimant is used to create a dataset showing the on-flowing and off-flowing cases.
The caseload characteristics are then merged on from the relevant month’s HB caseload dataset.
- off-flows take caseload characteristics from Month 1, as they do not appear as a live case in Month 2
- on-flows take caseload characteristics from Month 2, as the claims were not live in Month 1
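The merge step can be sketched as follows. This is an illustration under assumed data structures (each month's caseload held as a dictionary keyed by the lead claimant's identifier), not the production process.

```python
from typing import Dict, Iterable, List

Characteristics = Dict[str, str]


def attach_characteristics(on_flows: Iterable[str],
                           off_flows: Iterable[str],
                           month1: Dict[str, Characteristics],
                           month2: Dict[str, Characteristics]) -> List[dict]:
    """Build one row per flow, taking characteristics from the relevant month."""
    rows = []
    for nino in sorted(off_flows):
        # Off-flows were last seen live in Month 1.
        rows.append({"nino": nino, "flow": "off", **month1[nino]})
    for nino in sorted(on_flows):
        # On-flows were first seen live in Month 2.
        rows.append({"nino": nino, "flow": "on", **month2[nino]})
    return rows
```

Because off-flow rows describe the claim as it stood in Month 1 and on-flow rows describe it in Month 2, the two sets of characteristics refer to count dates one month apart.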
Because of this feature, there are natural timing differences that appear between caseload characteristics for on-flows and off-flows.
Additionally, quality checks have shown that off-flowing claims from Month 1 are more likely to have missing payment amounts than cases who remain on the HB caseload into Month 2.
At the highest level, changes in HB caseload can be reconciled using the number of on-flows and off-flows. However, when looking at specific characteristics, this is no longer possible as variations in caseload characteristics between two months may be driven by other events that are not captured by our flows methodology. For instance, if a claimant reaches State Pension Age, and moves from Working Age to Pension Age, then in practice their claim would not close and would never present as a flow. Meanwhile, the client type volumes will have changed slightly as a result.
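A small worked example, written in Python with invented data, shows why reconciliation holds for the total caseload but can fail within a characteristic. Here claimant "A" moves from Working Age to Pension Age without their claim closing.

```python
from typing import Dict, Optional, Tuple


def reconcile(month1: Dict[str, str], month2: Dict[str, str],
              group: Optional[str] = None) -> Tuple[int, int]:
    """Return (expected, actual) Month 2 caseload, optionally for one client type.

    expected = Month 1 caseload + on-flows - off-flows
    """
    def in_scope(client_type: str) -> bool:
        return group is None or client_type == group

    start = sum(1 for v in month1.values() if in_scope(v))
    on_flows = sum(1 for k, v in month2.items() if k not in month1 and in_scope(v))
    off_flows = sum(1 for k, v in month1.items() if k not in month2 and in_scope(v))
    actual = sum(1 for v in month2.values() if in_scope(v))
    return start + on_flows - off_flows, actual


# Invented data: "A" reaches State Pension Age, "C" leaves, "D" joins.
month1 = {"A": "Working Age", "B": "Working Age", "C": "Pension Age"}
month2 = {"A": "Pension Age", "B": "Working Age", "D": "Working Age"}

print(reconcile(month1, month2))                 # (3, 3) totals reconcile
print(reconcile(month1, month2, "Working Age"))  # (3, 2) does not reconcile
print(reconcile(month1, month2, "Pension Age"))  # (0, 1) does not reconcile
```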
Designation of statistics
HB Caseload statistics have been designated National Statistics status, which means they have been prepared in accordance with the UK Statistics Authority Code of Practice for Official Statistics.
All HB Flows statistics are badged as Official Statistics. Official Statistics are produced in accordance with the Statistics and Registration Service Act 2007 and the UK Statistics Authority Code of Practice for Statistics and meet high standards of trustworthiness, quality and public value.
Quality statement
The Department for Work and Pensions (DWP) is committed to producing accurate, timely, high quality official statistics publications that take into account user needs and which are produced and disseminated in accordance with the UK Statistics Authority’s (UKSA) Code of Practice. The 3 pillars of the Code of Practice are trustworthiness, quality and value. The quality pillar is further split into these areas:
- suitable data sources
- sound methods
- assured quality
Read about our wider approach to quality for the DWP benefits statistics release.
HB Caseload and HB Flows statistics methodologies cover a range of quality assurance activities to provide assurance across the 3 areas. These are:
- when we receive data from suppliers, we routinely check the quality statements and assurances that are supplied with the monthly SHBE data. We gain an appreciation of any data gaps or where a Local Authority has not returned any data
- we run quality checks on the data supplied to identify any potential cases that may be incorrectly removed by our methodology
- as we move through the caseload creation, we create counts to check assumptions and methods are working correctly
- we use data from other sources to improve the quality of our breakdowns for age, gender and geography
- before data are moved onto Stat-xplore, we perform a thorough check of the statistics to check for anomalies, outliers and the impacts of any missing data
- issues are recorded onto a quality log and rated according to their impact. Any issues that are detrimental to the quality of our published statistics are communicated to users
Where possible we use a standardised framework for the dissemination of statistics on Stat-xplore so that the data we provide can be compared across multiple time periods and across different benefits. For instance, residential geographies across all of DWP benefit statistics are underpinned by the same geographical reference file (National Statistics Postcode Lookup).
We check and review our methodologies regularly against any policy changes that may impact the meaning and interpretation of data. The rules we follow to create the variable groupings such as Client Type are continually reviewed. We receive HB Policy circulars and engage purposefully with wider analytical and policy stakeholder groups to consider changes to either Housing Benefit policy or the source data we receive, and to pre-empt changing user needs.
On the Stat-xplore dissemination tool, against each available variable, users can find background information and a quality statement to make them aware of any issues that may impact its quality.
Known issues
Data may sometimes be supplied with missing start dates for a section of newer claims. We monitor how many cases are impacted, and where a start date is missing we use a proxy variable called the “Treat as Made” date, which represents the date at which we can treat a decision to award HB as having been taken.
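The fallback to the “Treat as Made” date, together with the monitoring count, can be illustrated with a short Python sketch. The function names and the tuple layout are assumptions made for the example.

```python
from datetime import date
from typing import Iterable, List, Optional, Tuple


def effective_start_date(start_date: Optional[date],
                         treat_as_made_date: Optional[date]) -> Optional[date]:
    """Use the claim start date, falling back to the Treat as Made date."""
    return start_date if start_date is not None else treat_as_made_date


def fill_start_dates(
    claims: Iterable[Tuple[Optional[date], Optional[date]]]
) -> Tuple[List[Optional[date]], int]:
    """Return the effective start dates and how many relied on the proxy."""
    dates: List[Optional[date]] = []
    proxied = 0
    for start_date, treat_as_made_date in claims:
        if start_date is None and treat_as_made_date is not None:
            proxied += 1  # monitored so the scale of the issue stays visible
        dates.append(effective_start_date(start_date, treat_as_made_date))
    return dates, proxied
```

A claim with neither date still has no usable start date after this step, so it would remain subject to the missing-value filter applied when the liveload is created.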
On rare occasions, a Local Authority may fail to supply any data for a given month. Where this happens, we have a process known as substitution that effectively uses the best available evidence in order to allow us to create reliable counts for the whole of Great Britain.
Some timing issues can occur when there is a time lag between the data return date and the count date used for publication. As noted elsewhere, some recoding of information takes place where this timing difference may account for changes to the status of a claim, its categorisation or in consideration of any award reductions.
As highlighted earlier, cases identified as off-flows are more likely to have missing payment information.
Pre-publication and release
As noted earlier, HB statistics are released as part of a wider publication called DWP Benefit Statistics. Commentary about HB statistics can be found in the Bi-Annual Statistical Summary that forms part of this collection.
We pre-announce the release date of this statistical publication at least 28 calendar days in advance, in accordance with release practices set out in the Code of Practice for Statistics. Find dates of future DWP publications.
In addition to DWP staff who are responsible for the production and quality assurance of the statistics, a limited number of individuals are granted 24 hours pre-release access.
Under the Pre-release Access to Official Statistics Order 2008, government departments are required to maintain an up-to-date list of the job titles and the organisations of everyone who has pre-release access to statistical releases. For transparency, we publish the pre-release access list.
Commentary is added to the DWP website at 9:30am on Release Day. The publication has its own landing page which contains the publication itself as well as key information about the contents of the release. Old versions of the Statistical Summary are archived and still available. See the DWP benefits statistics landing page.
Also at 9:30am on Release Day, Stat-xplore will “go-live” so that data for the new quarter are available.
Further information
Read our background information note for more details about changes and revisions to the release.
Data characteristics available on Stat-Xplore
Administrative Geographies
These are:
- Great Britain
- Country
- Region
- Local Authority
Residential Geographies
These are:
- Great Britain
- Country
- Region
- Local Authority
- Middle Layer Super Output Area (MSOA)
- Lower Layer Super Output Area (LSOA)
- Census Output Area (COA)
- Travel to Work Area (TTWA)
- Eurostat NUTS Areas
- International Territorial Level (ITLs)
- Westminster Parliamentary Constituency
- Scottish Parliamentary Region and Constituency
- Ward
Claimant Characteristics
These are:
- Gender (for single claimants only)
- Client Type (Working Age or Pension Age)
- Age (in bands or single year of age)
- Family Type
- Number of Child Dependants
- Number of Non-Dependants
- Employment Status
Housing Characteristics
These are:
- Housing Sector
- Housing Tenure
- Entitled Number of Bedrooms (for Local Housing Allowance tenants only)
- Removal of Spare Room Subsidy Indicator
- Number of Spare Rooms
Housing Benefit Award
These are:
- HB Award Amount (in bands)
- HB Average Weekly Amount (not available for HB Flows statistics)
- Payment Destination
- Passported Status
- Passported Benefit
- Weekly Spare Room Reduction Amount (in bands)
- Average Weekly Spare Room Reduction Amount