Appendix 1: technical notes and calculating rates
Published 4 September 2019
Background
Collecting data on rare diseases is challenging, as data is scarce and often fragmented. Therefore, NCARDRS is currently looking at the value of various data sources to support rare disease registration and to estimate prevalence rates of these conditions. We aim to both check retrospective cases and agree methods to implement sustainable prospective reporting systems.
The work described in this report supports ongoing work to assess the use of Hospital Episode Statistics (HES) and ONS mortality data in both prospective and retrospective reporting of non-structural congenital conditions that present in infancy. We acknowledge the limitations of these data sources and consider the data presented here a first step towards understanding how they can be used as part of a multi-source approach to case building and ascertainment, and hence to estimate incidence and prevalence in the English population. [footnote 1]
Technical notes
Data sources and definitions
The different types of spinal muscular atrophy (SMA) mostly have different ICD-10 codes, though in some cases multiple sub-types fall under the same code. For this work, we included cases of SMA type 1 (SMA1) ascertained from HES, ONS mortality data and the NCARDRS congenital anomaly data management system. SMA1 is coded in ICD-10 as G12.0 (also written G120) Infantile spinal muscular atrophy, type I [Werdnig-Hoffman]. We did not include cases resident outside England or pregnancies that did not result in a live birth.
Hospital episode statistics
NCARDRS has access to HES data that contains identifiable patient data. There are known issues with the accuracy of HES data, including the accuracy of diagnoses, which are coded using ICD-10. In previous work we found, for example, that diagnostic codes are included in records where patients were tested for, rather than diagnosed with, a certain condition. Besides accuracy, few rare diseases have their own ICD-10 code, meaning that a code alone cannot identify cases of a rare disease without additional information, even if the record is coded correctly. For example, Menkes disease is coded under E83.0 Disorders of copper metabolism, which also covers Wilson’s disease. SMA1, by contrast, has a relatively precise ICD-10 code.
Because outpatient episodes are not coded, patients need to have been admitted and their diagnosis correctly coded for them to be identified. This means that HES data is not suitable for identifying conditions with a low hospitalisation rate. It also introduces bias for those conditions that might involve hospitalisation, as only more severe cases will be found in HES data. Despite the limitations of HES, it remains a valuable data asset as currently there are few patient data sources with national coverage.
ONS mortality data
ONS mortality data contains data found on death certificates. Besides demographics, it contains cause of death information. This information is captured in 2 ways:
- As free text written by the certifier.
- As ICD-10 codes used to capture that free text.
For rare diseases, the free text fields are invaluable as they allow for a granularity not available through ICD-10 codes alone. For example, if a death certificate is coded E83.0 Disorders of copper metabolism, the free text should name the actual disorder, that is, Menkes disease or Wilson’s disease.
CARA
CARA is the NCARDRS data management system. Currently it is used to collect data on congenital anomalies for the whole of England. SMA is not routinely collected on CARA because it falls outside the EUROCAT definition of a congenital anomaly, but it is on the inclusion list for passive reporting (we record it when it is reported to us).
Methods
To estimate the population prevalence and incidence (birth prevalence) of SMA1 in England, we used the following data:
- HES data - finished consultant episodes from 1 April 2008 to 31 March 2018 with an ICD-10 diagnosis of G12.0 Infantile spinal muscular atrophy, type I (Werdnig-Hoffman)
- ONS mortality data - all deaths from 2008 to 2018 with a code of G12 Spinal muscular atrophy and related syndromes appearing anywhere on the death certificate and with free text in the cause of death field indicating SMA1, SMA0 or SMA
- CARA - cases of G12.0 Infantile spinal muscular atrophy, type I (Werdnig-Hoffman) held on the NCARDRS data management system that were not ascertained from either HES or ONS mortality data
All individual patients ascertained from HES and CARA were traced using the Summary Care Record (SCR) to check vital status and, where appropriate, to record date of death. Cases that were identified on HES as having SMA1, but where the cause of death free text in the ONS mortality data indicated another type of SMA, were excluded from analysis. We did not include cases of spinal muscular atrophy with respiratory distress type 1 (SMARD1) because these cases, if correctly coded, would be coded differently to SMA1.
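The selection and exclusion rules above can be sketched in code. This is an illustrative sketch only, not the procedure NCARDRS ran: the record layout, the field names (`nhs_number`, `icd10`, `sma_type`) and the sample rows are all hypothetical, and it assumes the cause of death free text has already been reviewed and summarised into an `sma_type` label.

```python
# Hypothetical, simplified records. Real HES, ONS and CARA extracts differ.
hes = [
    {"nhs_number": "A1", "icd10": "G120"},
    {"nhs_number": "A2", "icd10": "G120"},
    {"nhs_number": "A3", "icd10": "G121"},  # not G12.0, so not selected
]
ons = [
    {"nhs_number": "A1", "icd10": "G120", "sma_type": "SMA1"},
    {"nhs_number": "A2", "icd10": "G129", "sma_type": "SMA2"},
    {"nhs_number": "A4", "icd10": "G129", "sma_type": "SMA"},
]
cara = [{"nhs_number": "A5", "icd10": "G120"}]

# Free-text labels accepted from death certificates (SMA1, SMA0 or SMA).
INCLUDED_TEXT_TYPES = {"SMA0", "SMA1", "SMA"}


def build_cases(hes, ons, cara):
    """Return {nhs_number: set of sources} for cases kept in the analysis."""
    cases = {}
    # HES: any episode with a G12.0 diagnosis.
    for rec in hes:
        if rec["icd10"] == "G120":
            cases.setdefault(rec["nhs_number"], set()).add("HES")
    # ONS: a G12 code anywhere, plus free text indicating SMA1, SMA0 or SMA.
    for rec in ons:
        if rec["icd10"].startswith("G12") and rec["sma_type"] in INCLUDED_TEXT_TYPES:
            cases.setdefault(rec["nhs_number"], set()).add("ONS")
    # CARA: G12.0 cases not already ascertained from HES or ONS.
    for rec in cara:
        if rec["icd10"] == "G120" and rec["nhs_number"] not in cases:
            cases[rec["nhs_number"]] = {"CARA"}
    # Exclude cases whose death certificate text indicates another SMA type.
    for rec in ons:
        if rec["sma_type"] not in INCLUDED_TEXT_TYPES:
            cases.pop(rec["nhs_number"], None)
    return cases


cases = build_cases(hes, ons, cara)
```

In this made-up example, patient A2 has a G12.0 episode in HES but is dropped because the death certificate text indicates SMA2, while A4 is kept on the strength of the death certificate alone.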
A sample of cases identified through HES as having SMA1 were validated against hospital records, where NCARDRS has remote access to patient information.
Of the 305 deaths in those born in 2008 to 2016, 216 (70.8%) were ascertained from both HES and ONS mortality data, 15 (4.9%) from SCR-traced HES alone and 74 (24.3%) from ONS mortality data only. The level of agreement between HES and ONS did not appear to change over the study period. Because ascertainment relies on mortality data, and there was an apparent drop in the mortality rate in 2017, we did not estimate prevalence after 2016. For SMA1 deaths in those born in 2011 to 2016 reported in the mortality data, 167 (88%) of 189 deaths included a G120 ICD-10 cause of death code on the death certificate.
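The ascertainment breakdown above can be reproduced from the reported counts. A minimal sketch follows; only the counts come from the text, and the variable and function names are illustrative.

```python
# Deaths in those born in 2008 to 2016, by source of ascertainment
# (counts as reported in the text).
deaths_by_source = {
    "HES and ONS mortality data": 216,
    "SCR-traced HES only": 15,
    "ONS mortality data only": 74,
}

total_deaths = sum(deaths_by_source.values())


def percentage(part, whole):
    """Percentage of `whole` made up by `part`, rounded to 1 decimal place."""
    return round(100 * part / whole, 1)


shares = {
    source: percentage(count, total_deaths)
    for source, count in deaths_by_source.items()
}

for source, share in shares.items():
    print(f"{source}: {deaths_by_source[source]} ({share}%)")
```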
We used NHS number from HES to trace the death records for the 15 deaths that we identified in HES that did not link to the original mortality data. Of these, 7 were indicated as having SMARD, 4 as having undefined neuromuscular disorders, one as having SMA, 2 had other defined disorders and in one case the diagnostic description was non-specific for the underlying cause of death.
We used remote access into hospital systems to check a sample of records identified through HES as having SMA1. Of the 23 records that we checked, 16 cases were confirmed to have SMA1, 4 cases had a different type of SMA, one did not have SMA and 2 did not have specific diagnoses mentioned in the record. A summary of the type and magnitude of misclassification is given in Table 3.
Table 3. Possibility of misclassification of cases in the routinely collected data.
Data source | Underestimate | Overestimate |
---|---|---|
HES | 24.3% of SMA1 deaths did not have a HES record with a G120 diagnostic code. | 21.7% of cases with a G120 diagnostic code did not have SMA1 when validated against the patient record. |
ONS mortality | 12% of death records that state SMA or SMA1 in cause of death free text do not have a G120 diagnostic code. | Death records that state SMA in cause of death free text may not refer to SMA1. |
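The HES overestimate and ONS underestimate figures in Table 3 appear to follow from the validation counts given in the text. The sketch below is a reconstruction, not a documented method: it assumes that the 2 hospital records without a specific diagnosis are not counted as misclassified, which is the reading that reproduces the 21.7% figure. All names are illustrative.

```python
# HES validation sample (counts as reported in the text).
records_checked = 23
different_sma_type = 4
no_sma = 1
# Assumption: the 2 records with no specific diagnosis are not counted
# as misclassified, which reproduces the reported 21.7%.
hes_overestimate_pct = round(
    100 * (different_sma_type + no_sma) / records_checked, 1
)

# ONS mortality data, deaths in those born in 2011 to 2016.
sma1_deaths = 189
deaths_with_g120_code = 167
ons_underestimate_pct = round(
    100 * (sma1_deaths - deaths_with_g120_code) / sma1_deaths, 1
)

print(hes_overestimate_pct)   # share of G120-coded HES cases without SMA1
print(ons_underestimate_pct)  # share of SMA or SMA1 deaths lacking a G120 code
```

On these counts the ONS figure comes to 11.6%, which is consistent with the 12% shown in the table after rounding.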
Calculating rates
For the population prevalence:
- numerator – all SMA1 cases born in 2000 onwards, alive and resident in England on 1 July 2016
- denominator – the total population of England from the ONS mid-year estimates for 2016
For the birth prevalence (incidence):
- numerator – all SMA1 cases born in the year
- denominator – the number of ONS live births in the year
For the infant mortality rate:
- numerator – all deaths in the year at under 1 year of age with SMA1 mentioned on the death certificate, confirmed in ONS mortality data
- denominator – the number of ONS live births in the year
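The three rates defined above share the same form: a numerator divided by a denominator, scaled to a reporting unit. The sketch below is illustrative only. The scaling to "per 100,000" is an assumption (this appendix does not state the unit), the function name is hypothetical, and the counts are made up rather than taken from this report.

```python
def rate(numerator, denominator, per=100_000):
    """Return numerator / denominator scaled to `per` (an assumed reporting unit)."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator * per


# Hypothetical counts, for illustration only (not figures from this report).
cases_alive_on_1_july = 120       # SMA1 cases alive and resident in England
mid_year_population = 55_000_000  # ONS mid-year population estimate
cases_born_in_year = 60           # SMA1 cases born in the year
infant_deaths_in_year = 30        # SMA1 deaths at under 1 year of age
live_births_in_year = 660_000     # ONS live births in the year

population_prevalence = rate(cases_alive_on_1_july, mid_year_population)
birth_prevalence = rate(cases_born_in_year, live_births_in_year)
infant_mortality_rate = rate(infant_deaths_in_year, live_births_in_year)
```

Birth prevalence and the infant mortality rate use the same denominator (live births in the year), so they can be compared directly; population prevalence uses the whole mid-year population and is not comparable in the same way.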
References
- Burns EM, Rigby E, Mamidanna R. Systematic review of discharge coding accuracy. J Pub Health 2011; 34(1): 138 to 148.