Research and analysis

SARS-CoV-2 genome sequence prevalence and growth rate update: 19 July 2023

Updated 8 August 2024

Applies to England

Variant prevalence

Testing policy and sequencing should be considered when interpreting variant data and are described in full in the variant technical briefing series. The prevalence of lineages amongst UK sequences by Phylogenetic Assignment of Named Global Outbreak Lineages (Pangolin) designation is presented in Figure 1. Lineages are shown if they have at least 5,000 sequences since 23 January 2023 or if they represent at least 1% of sequences within a single week over the last 6 weeks. Lineages that do not meet these criteria are combined with their parent lineage (for example, BA.2.4 is combined with BA.2).
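The display rule above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code used to produce Figure 1: the function names, the input layout (one dictionary of lineage counts per week, oldest first, starting from the reference date) and the example counts are all assumptions. In particular, `naive_parent` finds a parent by dropping the last dot-separated component of the name, which ignores Pango aliases (for example, EG.1 is an alias of XBB.1.9.2.1), so a real implementation would need the alias table.

```python
from collections import defaultdict

def total_counts(weekly_counts):
    """Sum sequence counts per lineage across all weeks."""
    totals = defaultdict(int)
    for week in weekly_counts:
        for lineage, n in week.items():
            totals[lineage] += n
    return dict(totals)

def lineages_to_show(weekly_counts, min_total=5000, min_weekly_share=0.01, recent_weeks=6):
    """Lineages with enough sequences overall, or a large enough share in any recent week."""
    totals = total_counts(weekly_counts)
    shown = {lineage for lineage, n in totals.items() if n >= min_total}
    for week in weekly_counts[-recent_weeks:]:
        week_total = sum(week.values())
        for lineage, n in week.items():
            if week_total and n / week_total >= min_weekly_share:
                shown.add(lineage)
    return shown

def naive_parent(lineage):
    """Hypothetical helper: 'BA.2.4' -> 'BA.2'. Does not resolve Pango aliases."""
    head, sep, _ = lineage.rpartition(".")
    return head if sep else None

def combine_with_parents(totals, shown):
    """Fold each lineage that is not shown into its nearest shown ancestor, else 'Other'."""
    combined = defaultdict(int)
    for lineage, n in totals.items():
        target = lineage
        while target is not None and target not in shown:
            target = naive_parent(target)
        combined[target if target is not None else "Other"] += n
    return dict(combined)

# Made-up counts for two weeks, purely for illustration.
example_weeks = [
    {"BA.2": 6000, "BA.2.4": 10, "XBB.1.5": 20},
    {"BA.2": 100, "BA.2.4": 0, "XBB.1.5": 5},
]
shown = lineages_to_show(example_weeks)
combined = combine_with_parents(total_counts(example_weeks), shown)
print(sorted(shown))  # ['BA.2', 'XBB.1.5']
print(combined)       # {'BA.2': 6110, 'XBB.1.5': 25}
```

In this made-up example BA.2 qualifies on total count, XBB.1.5 qualifies on its share in the second week, and BA.2.4 is folded into BA.2.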

In the week beginning 12 June 2023, there were 307 sequences with a lineage call, and in the week beginning 19 June 2023, there were 68. Due to an issue with data import, the number of sequences from the week beginning 19 June 2023 is artificially low. Lineage proportions can be unreliable when the number of available sequences is low.
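To illustrate why small weekly totals make proportions unreliable, the sketch below compares 95% Wilson score intervals for a lineage at roughly 10% of sequences in a week with 307 sequences and in a week with 68. The 10% share and the resulting counts (31 and 7) are hypothetical values chosen for illustration; only the weekly totals come from this report, and the Wilson interval is one common choice rather than a method stated in the source.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives roughly 95%)."""
    p_hat = successes / n
    denom = 1 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width

# Hypothetical lineage counts at about 10% of each week's sequences.
low_307, high_307 = wilson_interval(31, 307)
low_68, high_68 = wilson_interval(7, 68)

print(f"n=307: {low_307:.1%} to {high_307:.1%}")
print(f"n=68:  {low_68:.1%} to {high_68:.1%}")
```

The interval for the week with 68 sequences is markedly wider, so the same observed share is compatible with a much larger range of true prevalence.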

The lineages have been assigned using the accurate Ultrafast Sample placement on Existing tRee (UShER) mode and version 1.21 of the Pangolin data.

Figure 1. Prevalence of Pangolin lineages in the UK sequence data with a specimen date from week beginning 23 January 2023 to week beginning 19 June 2023, as of 13 July 2023

The total number of valid sequence results per week is shown by the black line. The ‘Other’ category in this plot contains all lineages that do not meet the relevant criteria after combining smaller sub-lineages. ‘Unassigned’ are sequences that could not be assigned a lineage by Pangolin. Lineages present in at least 2% of sequences in the most recent week are labelled to the right of the plot.

Variant modelling

Two models are currently used to estimate the growth advantage of emerging lineages: a logistic regression generalised linear model (GLM) and a generalised additive model (GAM). Models are fit to a geographically stratified sample of Pillar 1 cases to ensure that relative growth rates are estimated in relation to a local set of co-circulating lineages. Tests associated with travel are excluded. A full description of methods can be found in the variant technical briefing series.
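The core idea behind the logistic regression approach can be sketched as follows. This is a minimal, unstratified illustration under stated assumptions, not the model used for this report (which is geographically stratified and described in the technical briefings). It fits logit(p_t) = a + b·t to weekly counts of a focal lineage against all sequences. If the focal lineage and the remaining lineages each grow exponentially, the slope b is the difference between their weekly growth rates on the log scale. The function name, the pure-Python Newton-Raphson fitting and the synthetic counts are all assumptions; how b maps onto the percentages reported in Table 1 is not specified here.

```python
import math

def fit_logistic_growth(weeks, focal, total, iterations=50):
    """Fit logit(p_t) = a + b * t to binomial counts by Newton-Raphson.

    Returns (a, b, se_b), where se_b is the standard error of the slope.
    """
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for t, k, n in zip(weeks, focal, total):
            p = 1.0 / (1.0 + math.exp(-(a + b * t)))
            w = n * p * (1.0 - p)
            g_a += k - n * p            # score for the intercept
            g_b += (k - n * p) * t      # score for the slope
            h_aa += w                   # information matrix entries
            h_ab += w * t
            h_bb += w * t * t
        det = h_aa * h_bb - h_ab * h_ab
        step_a = (h_bb * g_a - h_ab * g_b) / det
        step_b = (h_aa * g_b - h_ab * g_a) / det
        a += step_a
        b += step_b
        if abs(step_a) + abs(step_b) < 1e-10:
            break
    se_b = math.sqrt(h_aa / det)  # slope variance from the inverse information matrix
    return a, b, se_b

# Synthetic data: expected counts from a known curve, so the fit should recover the slope.
weeks = list(range(10))
total = [1000] * 10
true_a, true_b = -3.0, 0.3
focal = [n / (1 + math.exp(-(true_a + true_b * t))) for t, n in zip(weeks, total)]

a, b, se_b = fit_logistic_growth(weeks, focal, total)
ci_low, ci_high = b - 1.96 * se_b, b + 1.96 * se_b
print(f"slope per week: {b:.3f} (95% CI: {ci_low:.3f} to {ci_high:.3f})")
```

With fewer sequences per week the standard error of the slope grows, which is one way to see why small samples lead to wide confidence intervals on growth rate estimates.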

In recent months there have been fewer tests to model than earlier in the year (see Figure 1). This is due to reduced sampling effort and lower prevalence. Moreover, the proliferation of lineages to monitor means that sample sizes for specific lineages can be small. Uncertainty in our modelled relative growth rates is therefore increased, which is reflected in wider confidence intervals on the estimates.

We aim to select lineages and/or groups of lineages that are specific enough to pick up on emerging signals yet broad enough to maximise statistical power, and have now updated our approach, which will be used for this and future briefings. Any lineage that has made up more than 1.5% of the total samples and has at least 50 sequenced cases within 6 weeks of the most recent specimen date is modelled separately. Lineages that do not meet these criteria will be iteratively added to a parent lineage until this condition is met, forming new aggregations of parent and child lineages. If the condition cannot be met for a particular lineage (for example, there is no parent lineage at high enough prevalence) it will not be modelled.

Lineages with a different high-level parent will never be aggregated together (for example, we will not aggregate BA.2 and BA.5 to B.1.1.529). Unassigned lineages are excluded from this analysis. Note that these aggregations will often be coarser than in the prevalence plot presented in Figure 1. The relative growth rate of broad lineage classes (for example, parent lineages that include many distant child lineages) will be less informative than explicit modelling of specific sub-lineages. Methods of lineage collapsing for the growth rate analysis are therefore still being refined.
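One way to read the aggregation rule is sketched below. It is an interpretation under stated assumptions, not the published method, which the report notes is still being refined. The sketch assumes a caller-supplied `parent` map that stops at the high-level parents (so BA.2 and BA.5 have no parent and can never be pooled together), assumes that unassigned sequences have already been removed from `counts`, and processes lineages from deepest to shallowest so that children are pooled before their parent is assessed. All names and counts are hypothetical.

```python
def depth(lineage, parent):
    """Number of steps from a lineage up to its high-level parent."""
    d = 0
    while parent.get(lineage) is not None:
        lineage = parent[lineage]
        d += 1
    return d

def aggregate_for_modelling(counts, parent, min_share=0.015, min_cases=50):
    """Return {group label: pooled count} for groups that meet both thresholds."""
    total = sum(counts.values())
    pooled = dict(counts)
    # Make sure every ancestor has an entry, even if it has no sequences of its own.
    for lineage in list(counts):
        ancestor = parent.get(lineage)
        while ancestor is not None:
            pooled.setdefault(ancestor, 0)
            ancestor = parent.get(ancestor)
    modelled = {}
    for lineage in sorted(pooled, key=lambda name: depth(name, parent), reverse=True):
        n = pooled[lineage]
        if n >= min_cases and n / total > min_share:
            modelled[lineage] = n                 # modelled separately
        elif parent.get(lineage) is not None:
            pooled[parent[lineage]] += n          # added to its parent lineage
        # otherwise there is no parent to join, so the lineage is not modelled
    return modelled

# Hypothetical 6-week counts and parent map, for illustration only.
parent = {
    "XBB.1.16.11": "XBB.1.16", "XBB.1.16.1": "XBB.1.16",
    "XBB.1.16": "XBB.1", "XBB.1.5": "XBB.1", "XBB.1": "XBB", "XBB": None,
    "BA.2.75": "BA.2", "BA.2": None, "BA.5.2": "BA.5", "BA.5": None,
}
counts = {"XBB.1.5": 800, "XBB.1.16": 400, "XBB.1.16.1": 100,
          "XBB.1.16.11": 10, "BA.2.75": 30, "BA.5.2": 25}

groups = aggregate_for_modelling(counts, parent)
print(groups)  # {'XBB.1.16.1': 100, 'XBB.1.16': 410, 'XBB.1.5': 800}
```

In this example the BA.2 and BA.5 descendants would jointly reach 55 sequences, but because they have different high-level parents they are never combined and neither group is modelled.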

Growth rates were based on sequences sampled through Pillar 1 testing (primarily positive tests conducted in hospital) in England (Table 1). The sampling range for both the logistic regression GLM and GAM is from 5 January 2023 to 22 June 2023. The model fits for any lineage with a positive growth rate advantage (with 95% confidence intervals (CIs) that do not cross zero) are shown in Figure 2. The lineages with a positive growth rate advantage at this level of confidence are EG.1 (39.76%, GAM), XBB.2.3 (21.81%, GAM), XBB.1.9.2 (18.03%, GAM), XBB.1 (17.92%, GAM) and XBB.1.16.1 (15.89%, GAM).

Table 1. Growth rate (GR) of English sequence lineages as of 22 June 2023†

| Lineage* | Lineage Group Composition** | Pillar 1 Sample Size*** | Weekly growth rate advantage (GAM) | Estimated prevalence (GAM) | Weekly growth rate advantage (GLM) |
| --- | --- | --- | --- | --- | --- |
| EG.1 (XBB.1.9.2.1) | EG.1 (65.38%); EG.1.3 (25.96%); EG.1.4 (5.77%); EG.1.2 (2.88%) | 104 | 39.76% (95% CI: 24.43 to 55.1) | 9.75% (95% CI: 5.13 to 17.75) | 41.26% (95% CI: 5.76 to 76.77) |
| XBB.2.3 | XBB.2.3 (31.93%); GE.1 (19.88%); XBB.2.3.2 (15.66%); XBB.2.3.1 (9.04%); XBB.2.3.11 (8.43%)… | 166 | 21.81% (95% CI: 20.72 to 22.89) | 12.84% (95% CI: 10.74 to 15.28) | 15.32% (95% CI: -13.09 to 43.74) |
| XBB.1.9.2 | XBB.1.9.2 (59.79%); EG.5.2 (11.11%); EG.7 (6.35%); EG.5.1.1 (4.76%); EG.6.1 (4.76%)… | 189 | 18.03% (95% CI: 7.46 to 28.59) | 14.74% (95% CI: 9.46 to 22.25) | 11.88% (95% CI: -13.42 to 37.19) |
| XBB.1 | FY.1.2 (18.24%); XBB.1.17.1 (12.16%); XBB.1.22.1 (10.14%); XBB.1.11.1 (8.78%); FY.4.1 (7.43%)… | 148 | 17.92% (95% CI: 3.55 to 32.3) | 10.24% (95% CI: 5.91 to 17.18) | 4.65% (95% CI: -22.64 to 31.93) |
| XBB.1.16.1 | XBB.1.16.1 (63.27%); XBB.1.16.11 (15.31%); FU.1 (13.27%); FU.2 (7.14%); XBB.1.16.10 (1.02%) | 98 | 15.89% (95% CI: 5.56 to 26.21) | 6.87% (95% CI: 4.08 to 11.36) | -13.06% (95% CI: -49.84 to 23.73) |
| XBB.1.9.1 | XBB.1.9.1 (39.04%); FL.3 (12.59%); FL.4 (10.33%); FL.2 (5.04%); FL.9 (4.79%)… | 397 | 3.29% (95% CI: -6.64 to 13.22) | 16.76% (95% CI: 11.57 to 23.64) | 4.03% (95% CI: -15.91 to 23.97) |
| XBB.1.16 | XBB.1.16 (85.92%); XBB.1.16.2 (7.88%); XBB.1.16.3 (2.39%); XBB.1.16.5 (1.91%); XBB.1.16.4 (0.48%)… | 419 | -8.26% (95% CI: -18.73 to 2.2) | 17.28% (95% CI: 11.61 to 24.95) | 17.48% (95% CI: -1.42 to 36.39) |
| BA.2 | DV.7 (31.4%); DV.6 (16.28%); CH.1.1.1 (10.47%); CH.1.1 (5.81%); FS.1 (4.65%)… | 86 | -21.98% (95% CI: -44.31 to 0.35) | 1.53% (95% CI: 0.67 to 3.48) | -28.23% (95% CI: -77.21 to 20.75) |
| XBB.1.5 | XBB.1.5 (49.87%); XBB.1.5.7 (3.89%); XBB.1.5.13 (3.77%); XBB.1.5.18 (3.64%); XBB.1.5.12 (3.14%)… | 796 | -29.02% (95% CI: -40.27 to -17.77) | 14.89% (95% CI: 10.48 to 20.72) | -25.57% (95% CI: -42.29 to -8.84) |
| FL.3.1 (XBB.1.9.1.3.1) | FL.3.1 (100%) | 59 | -30.28% (95% CI: -55.51 to -5.05) | 0.8% (95% CI: 0.28 to 2.2) | -20.71% (95% CI: -80.11 to 38.7) |

*Listed parent lineages include all sub-lineages, other than those explicitly modelled.

**The top 5 contributing lineages to the modelled group in the most recent 6 weeks (11 May 2023 to 22 June 2023). More than 5 sub-lineages are indicated by “…”.

***Sample size is for Pillar 1 samples in England in the most recent 6 weeks (11 May 2023 to 22 June 2023).

† Sampling range for both the logistic regression GLM and the GAM is from 5 January 2023 to 22 June 2023.

CI = confidence interval
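The selection criterion used in this report (a positive growth rate advantage with a 95% CI that does not cross zero) can be checked directly against the Table 1 estimates. The short sketch below transcribes the GAM and GLM estimates and intervals from Table 1 and applies that filter; the variable and function names are illustrative only.

```python
# (lineage, GAM estimate, GAM CI low, GAM CI high, GLM estimate, GLM CI low, GLM CI high)
# Values transcribed from Table 1, in percent per week.
table_1 = [
    ("EG.1",       39.76,  24.43,  55.10,  41.26,   5.76, 76.77),
    ("XBB.2.3",    21.81,  20.72,  22.89,  15.32, -13.09, 43.74),
    ("XBB.1.9.2",  18.03,   7.46,  28.59,  11.88, -13.42, 37.19),
    ("XBB.1",      17.92,   3.55,  32.30,   4.65, -22.64, 31.93),
    ("XBB.1.16.1", 15.89,   5.56,  26.21, -13.06, -49.84, 23.73),
    ("XBB.1.9.1",   3.29,  -6.64,  13.22,   4.03, -15.91, 23.97),
    ("XBB.1.16",   -8.26, -18.73,   2.20,  17.48,  -1.42, 36.39),
    ("BA.2",      -21.98, -44.31,   0.35, -28.23, -77.21, 20.75),
    ("XBB.1.5",   -29.02, -40.27, -17.77, -25.57, -42.29, -8.84),
    ("FL.3.1",    -30.28, -55.51,  -5.05, -20.71, -80.11, 38.70),
]

def positive_with_confidence(estimate, ci_low, ci_high):
    """True when the estimate is positive and the whole interval lies above zero."""
    return estimate > 0 and ci_low > 0

gam_positive = [row[0] for row in table_1 if positive_with_confidence(*row[1:4])]
glm_positive = [row[0] for row in table_1 if positive_with_confidence(*row[4:7])]

print(gam_positive)  # ['EG.1', 'XBB.2.3', 'XBB.1.9.2', 'XBB.1', 'XBB.1.16.1']
print(glm_positive)  # ['EG.1']
```

The GAM filter reproduces the five lineages named in the text, while under the GLM only EG.1 has an interval that excludes zero, reflecting the wider GLM intervals in Table 1.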

Figure 2. Modelled prevalence of lineage groups with a growth rate advantage over other circulating lineages

The black line shows the central estimate and blue shaded regions the 95% confidence intervals. Points show the national level proportions, with the size indicative of sample numbers for that particular lineage. The grey portion of the ribbon denotes that this period of time is likely to be backfilled with more sequenced cases, making proportions unreliable.

Sources and acknowledgments

Data sources

Data used in this investigation is derived from the Cloud Infrastructure for Microbial Bioinformatics (CLIMB) data set, the UK Health Security Agency (UKHSA) genomic programme data set and the UKHSA Second Generation Surveillance System.

Authors of this report

UKHSA Genomics Public Health Analysis Team

UKHSA Infectious Disease Modelling Team