COVID-19 transport use statistics: ORR’s review of rail journeys methodology
Updated 11 December 2024
Introduction
In June 2020, the Department for Transport (DfT) began publishing official statistics on Transport use during the coronavirus (COVID-19) pandemic. These statistics cover transport use by mode during the pandemic and subsequent recovery.
The rail passenger journeys element of these statistics is sourced from the Latest Earnings Networked Nationally Over Night (LENNON) ticketing and revenue database. LENNON holds information on the vast majority of national rail tickets purchased in Great Britain and is used to allocate the revenue from ticket sales between train operating companies (TOCs). LENNON contains two datasets: pre-allocation (sales) and post-allocation (earnings).
DfT currently uses pre-allocation (sales) data but is proposing to switch to using the post-allocation (earnings) data as recovery from the pandemic continues. This note provides an independent view on this change in methodology.
Pre-allocation and post-allocation data
The pre-allocation (sales) data records each product sold and assigns journeys associated with that product to the date of purchase. Each product sold within LENNON has an associated factor based on the expected number of journeys undertaken on that product. For example, it is assumed 480 journeys are undertaken on an annual season ticket.
The post-allocation (earnings) data assigns the journeys associated with the sold product between the TOCs that operate on all, or part of, the given flow. Unlike the pre-allocation (sales) data, journeys are apportioned across the duration of the product’s validity. For example, an annual season ticket would have its associated 480 journeys drip-fed into the post-allocation (earnings) data from the date of purchase to its expiry date.
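The difference in timing between the two datasets can be illustrated with a minimal Python sketch. It is not LENNON's actual logic: the function names and dates are hypothetical, the even daily spread is an assumption (the note does not state the profile used to drip-feed journeys), and the split of journeys between TOCs is not modelled.

```python
from datetime import date, timedelta

def pre_allocation(purchase_date, journeys):
    """Sales view: every journey is assigned to the date of purchase."""
    return {purchase_date: float(journeys)}

def post_allocation(purchase_date, expiry_date, journeys):
    """Earnings view: journeys are spread over the validity period.

    Assumption: an even spread across every day from purchase to expiry.
    """
    days = (expiry_date - purchase_date).days + 1  # inclusive of both ends
    per_day = journeys / days
    return {purchase_date + timedelta(days=i): per_day for i in range(days)}

# An annual season ticket carrying the assumed factor of 480 journeys.
sales = pre_allocation(date(2021, 1, 4), 480)
earnings = post_allocation(date(2021, 1, 4), date(2022, 1, 3), 480)
```

In this sketch the sales view records all 480 journeys on 4 January 2021, while the earnings view records roughly 1.3 journeys on each of the 365 days of validity. Both views sum to the same total.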
Use of pre-allocation data during the pandemic
DfT opted to use the pre-allocation (sales) data to capture what proportion of journeys were being undertaken on the mainline rail network compared to the same period a year earlier. This was consistent with the broad methodological change adopted by ORR to measure passenger journeys during the pandemic.
Under normal circumstances, ORR uses the post-allocation (earnings) data to produce its official statistics on passenger rail usage on a quarterly basis. However, ORR adopted a different approach based on pre-allocation (sales) data during the peak pandemic period to produce more accurate estimates of usage.
Due to the restrictions in place during the pandemic, there were a significant number of refund requests. When a refund request is made, the original record of purchase is not removed from the system; instead, a negative record of usage is created to offset the original. Moreover, the refund is put into the system on the day the refund is issued rather than the date of the original purchase. The effect of this on the ORR statistics was that journeys being undertaken within that quarter were being offset by refunds being issued for historic journeys.
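The offsetting effect can be sketched in Python as follows. The records and journey figures are invented for illustration, and calendar quarters are used for simplicity rather than ORR's reporting quarters.

```python
from collections import defaultdict
from datetime import date

# (date the record enters the system, journeys) -- illustrative figures only
records = [
    (date(2020, 2, 10), 480),   # annual season ticket bought before restrictions
    (date(2020, 4, 20), 10),    # journeys on tickets bought during restrictions
    (date(2020, 4, 22), -400),  # refund record, dated on the day of issue
]

def quarter(d):
    """Calendar quarter as (year, quarter number)."""
    return (d.year, (d.month - 1) // 3 + 1)

def quarterly_totals(recs, include_refunds=True):
    """Sum journeys per quarter, optionally dropping negative refund records."""
    totals = defaultdict(int)
    for entry_date, journeys in recs:
        if journeys < 0 and not include_refunds:
            continue
        totals[quarter(entry_date)] += journeys
    return dict(totals)

with_refunds = quarterly_totals(records)
without_refunds = quarterly_totals(records, include_refunds=False)
```

With refunds included, the second quarter of 2020 shows -390 journeys, because the refund for the earlier purchase swamps the 10 journeys actually made. With refunds excluded, the same quarter shows 10.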
Consequently, ORR adopted a new methodology that excluded pandemic-related refunds because of their impact on aggregate journey numbers. ORR did, however, still include an estimate for refunds claimed due to poor train service performance.
In addition to excluding refunds, ORR also based its adjusted methodology on pre-allocation (sales) data. The rationale for using the pre-allocation (sales) data was that this was based on actual ticket sales. Therefore, it would not include journeys from a historic season ticket purchase where either a refund had not yet been issued or where a refund had not been claimed.
Furthermore, ORR’s default assumption was that passengers who were purchasing tickets during this period were doing so with the full knowledge of the restrictions, and therefore intending to travel. As a result, the pre-allocation (sales) data was assessed to be a more reliable indicator of actual journeys undertaken.
The pre-allocation (sales) data was not without its limitations, most notably assigning journeys to the point of purchase rather than throughout a ticket’s validity. However, due to the restrictions in place and subsequent uncertainty over travel, purchasing patterns changed, with passengers opting for shorter-term weekly season tickets or ‘on the day’ travel. Therefore, assigning journeys to the point of purchase did not overestimate journeys to the extent it would have done had the volume of annual season ticket sales held up.
One difference identified between the two series during the pandemic was DfT’s exclusion of products sold by Transport for London (TfL). Sales and refunds for these products are entered into the LENNON database erratically (0 to 2 times per week) and, during the early stages of the pandemic, this caused large variations in the reported data. As a result, it was decided that these sales should be omitted from DfT’s weekly data series. ORR continued to include these products as the sporadic nature of the entries did not have the same impact on the quarterly series.
Refunds were largely responsible for the variations in journeys on TfL-sold products during the initial phase of the pandemic. Now that refunds have returned to expected levels, the large week-to-week variations are no longer apparent. Therefore, it is right for DfT to reintroduce journeys made on TfL-sold products into the weekly statistics.
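A small Python sketch shows why irregular entry timing matters more for a weekly series than a quarterly one. It covers the timing effect only, not refunds. The steady daily usage, the posting days, and the assumption that each entry carries all journeys accrued since the previous entry are hypothetical and chosen purely for illustration.

```python
DAILY_JOURNEYS = 10   # hypothetical steady usage
WEEKS = 13            # one quarter of 91 days, numbered 0 to 90

# Hypothetical days on which a batch is entered (0 to 2 entries in any week).
POSTING_DAYS = [2, 5, 16, 23, 25, 33, 45, 48, 55, 66, 72, 75, 83, 90]

weekly = [0] * WEEKS
previous = -1
for day in POSTING_DAYS:
    # Assumption: each batch carries every journey accrued since the last batch.
    weekly[day // 7] += (day - previous) * DAILY_JOURNEYS
    previous = day

quarter_total = sum(weekly)
```

Underlying usage here is a constant 70 journeys a week, yet the recorded weekly figures range from 0 to 150. The quarterly total of 910 matches the underlying usage exactly, because the final batch lands on the last day of the quarter.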
Use of post-allocation data to measure recovery
As restrictions have lifted and some passengers revert to buying longer-term season tickets, drip-feeding the associated journeys through the system gives a better indication of journeys being undertaken.
In addition, the volume of refund requests has reduced over time, so refunds now have a much smaller impact on the post-allocation (earnings) data. Consequently, ORR reverted to using the post-allocation (earnings) data from April 2021 onwards.
Differences between DfT weekly statistics and ORR quarterly journey statistics
Whilst reviewing the respective DfT and ORR methodologies, we have identified two additional inputs that are included within ORR’s quarterly official statistics but are not in DfT’s weekly statistics. These differences are:
Non-LENNON data
ORR statistics include estimates of tickets sold outside of LENNON. These are collected quarterly, direct from TOCs. Inclusion of these would not be practical for a weekly series, so it is appropriate for DfT to exclude these. These account for around 2.4% of all journeys.
Bulk payments
ORR quarterly statistics contain bulk inputs to LENNON. These are contractual revenue settlements with accompanying journeys covering free or concessionary travel (for example, TfL Freedom Pass) and are entered into the LENNON system on a more ad-hoc basis. Consequently, they can be incorporated into a quarterly or annual data series but would have a much greater impact on a weekly data series, particularly if the settlements were added early or late. These account for around 2.8% of all journeys.
If we remove the Non-LENNON and bulk payments data from ORR’s published quarterly statistics and compare them with a quarterly aggregation of DfT’s proposed methodology, ORR’s series is, on average, 0.2% higher than DfT’s.
From a proportionality and practical perspective, it is not appropriate for DfT to include these inputs within its weekly series. The base LENNON data that is extracted and used for DfT’s weekly series accounts for around 95% of all rail travel.
Rates of change across the three inputs are likely to be the same. Therefore, including Non-LENNON and bulk payments data is unlikely to have a material impact, particularly considering that DfT presents its data as a percentage of patronage compared to an equivalent week rather than as actual journey volumes.
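The reasoning can be checked with a short Python sketch. The shares are the figures quoted in this note; the recovery rates passed to the function are hypothetical, as is the function itself.

```python
# Shares of all journeys (per cent), taken from the figures quoted in this note.
NON_LENNON = 2.4
BULK = 2.8
BASE_LENNON = 100 - NON_LENNON - BULK  # about 94.8, i.e. "around 95%"

def index(base_rate, non_lennon_rate, bulk_rate, include_extras):
    """Current journeys as a percentage of the equivalent baseline week.

    Each rate is the hypothetical ratio of current to baseline journeys
    for that input.
    """
    now = BASE_LENNON * base_rate
    then = BASE_LENNON
    if include_extras:
        now += NON_LENNON * non_lennon_rate + BULK * bulk_rate
        then += NON_LENNON + BULK
    return 100 * now / then

base_only = index(0.70, 0.70, 0.70, include_extras=False)
same_rates = index(0.70, 0.70, 0.70, include_extras=True)
different_rates = index(0.70, 0.60, 0.60, include_extras=True)
```

When all three inputs recover at the same rate, the percentage is 70 whether or not the extra inputs are included. Even if the two excluded inputs recovered 10 percentage points more slowly than the base data, the result would move only from 70 to about 69.5, because together they make up around 5% of journeys.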
Therefore, we conclude that it is appropriate for DfT to switch to using the post-allocation (earnings) data to monitor the recovery of rail journeys.