NDNS: Assessment of salt intake in adults in England, 2018/19 – independent review by Dr Darren Greenwood
Published 30 March 2020
Independent review of the methodology by Dr Darren Greenwood, Biostatistician, University of Leeds and response from the NDNS consortium
Public Health England (PHE) asked Dr Darren Greenwood to provide a peer review of the survey methodology. Dr Greenwood is a member of the Scientific Advisory Committee on Nutrition (SACN) but provided comments in a personal capacity. In June 2019, Dr Greenwood was provided with 4 working papers setting out the analytical methodology for urinary sodium measurement and the statistical analysis plan for the report:
- Urinary sodium measurement in UK sodium surveys, including validation of the current analytical method (paper 1).
- Statistical analysis plan for the 2018/19 survey (paper 2).
- Note of meeting between PHE and the NDNS consortium to discuss the statistical analysis plan (paper 3).
- Proposals for graphical representation of the data (paper 4).
1. Urinary sodium measurement in UK sodium surveys (including validation of the current analytical method) (Paper 1)
1.1 Peer reviewer’s comments
The concept of applying a correction factor that would allow direct comparison of data collected using different methods and platforms is appropriate.
The sample sizes used to derive the correction factors are adequate, ranging from 120 to 268.
I suggest consideration is given to clarifying whether these samples are randomly sampled from NDNS or sampled using some other strategy to achieve representative samples. This clarification is important, as the correction factor could depend on the characteristics of the individuals sampled, and on the range of their intakes.
The scatter plot in figure 1 of paper 1 is not the clearest way of showing the pattern of agreement but shows there is a very strong association between the Cobas C111 and the adjusted 2014 survey results.
I suggest consideration is given to providing more detail on the % PABA recovery considered to indicate a complete collection. Was this, for example, any within the range of 85% to 110% recovery? Was any further attempt made to correct results to, say 93% recovery or 100% recovery, either in those considered complete, or those considered partially complete (say in the range of 50% to 85%)?
The correction factors appear to be based on sodium concentrations that have not undergone any log-transformation. I suspect in this case that it makes very little difference to the regression line because the skewness is not great and both axes would be transformed, but it is not consistent with the analysis and presentation proposed in the statistical analysis plan for the 2018/19 survey report (papers 2 to 4).
There is some non-linearity evident in the graphs presented, particularly graphs 3 and 4 in appendix 2. For example, the plot suggests a slightly shallower gradient to the slope for lower values and slightly steeper gradient to the slope for the higher values. I suggest consideration is given to the extent to which the slope of the overall line of best fit, and the single correction factor derived from it, is equally applicable along the whole of the range.
Any variation in the correction factor along the range may not matter if the data is only used to derive population means or geometric means and trends in these across surveys. However, it would make a small difference to individuals’ values and to any subgroups where the range of intake differed from that of the sample, for example, those with particularly low or high intakes. Similarly, it is important to remember that the correction factor is only valid within the range of the existing data.
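One way to explore whether a single correction factor applies equally along the range is to compare the slope fitted to all paired samples with slopes fitted separately to the lower and upper parts of the range. The following is a minimal sketch of that check, not the method used by the consortium. The paired values, the cut point of 100 mmol/L and the function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for paired values."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def slopes_by_range(x, y, cut):
    """Fit separate lines below, and at or above, `cut` on the x axis."""
    low = [(xi, yi) for xi, yi in zip(x, y) if xi < cut]
    high = [(xi, yi) for xi, yi in zip(x, y) if xi >= cut]
    low_slope, _ = fit_line(*zip(*low))
    high_slope, _ = fit_line(*zip(*high))
    return low_slope, high_slope


# Hypothetical paired sodium concentrations (mmol/L), not survey data.
old_method = [20, 40, 60, 80, 100, 140, 180, 220]
new_method = [21, 39, 58, 78, 100, 145, 190, 236]

overall_slope, overall_intercept = fit_line(old_method, new_method)
low_slope, high_slope = slopes_by_range(old_method, new_method, cut=100)
print(f"overall slope {overall_slope:.3f}")
print(f"slope below 100 mmol/L {low_slope:.3f}, at or above {high_slope:.3f}")
```

A marked difference between the two partial slopes would suggest that the single factor is an approximation, which matters more for individuals and subgroups than for population means.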
The dependence of the correction factor on the values sampled reinforces the importance of noting whether this is a randomly selected sample of individuals from NDNS.
The accuracy and quality control, both within and between-batch precision, of the Cobas C111 is excellent.
The Bland-Altman plots in figure 1 of appendix 3, cross-validating the Cobas C111 and adjusted 2014 survey results, reveal some non-linearity in both the absolute and percentage agreement. The Cobas C111 tool appears to read lower than the adjusted 2014 survey by approximately 0% to 20% for sodium concentrations <50 mmol/l, higher than the adjusted 2014 survey by approximately 0% to 10% for sodium concentrations of 50 to 100 mmol/l, and appears to agree well for sodium concentrations >100 mmol/l.
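The quantities behind a Bland-Altman comparison can be sketched as follows: the mean bias, the 95% limits of agreement, and the mean percentage difference within concentration bands. This is an illustrative sketch only. The paired values are hypothetical, the bands mirror those described above, and the use of 1.96 standard deviations assumes approximately normally distributed differences.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return mean bias and 95% limits of agreement for paired measurements."""
    diffs = [ai - bi for ai, bi in zip(a, b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


def percent_difference_by_band(a, b, bands):
    """Mean percentage difference (relative to the pair mean) within each band."""
    result = {}
    for label, lower, upper in bands:
        pct = [
            100 * (ai - bi) / ((ai + bi) / 2)
            for ai, bi in zip(a, b)
            if lower <= (ai + bi) / 2 < upper
        ]
        result[label] = mean(pct) if pct else None
    return result


# Hypothetical paired concentrations (mmol/L), not survey data.
cobas = [18, 30, 45, 62, 80, 95, 120, 160, 200]
adjusted_2014 = [20, 33, 47, 59, 76, 92, 120, 161, 199]

bands = [("<50", 0, 50), ("50-100", 50, 100), (">=100", 100, float("inf"))]
bias, lower_loa, upper_loa = bland_altman(cobas, adjusted_2014)
by_band = percent_difference_by_band(cobas, adjusted_2014, bands)
print(f"bias {bias:.2f} mmol/L, limits of agreement {lower_loa:.2f} to {upper_loa:.2f}")
print(by_band)
```

A small overall bias can coexist with band-specific differences of opposite sign, which is the pattern the banded summary is intended to reveal.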
The implication is that the researchers’ conclusion that differences are spread evenly around the mean bias is an approximation that misses some potentially useful features of the data. The correction would appear adequate for the overall mean or geometric mean but may not be appropriate for individuals or subgroups of the population. I suggest that consideration is given to investigating the extent to which this matters. A simple correction factor may be a close enough approximation, but it would be informative to have some further exploration of the data to provide the evidence required to support that decision.
Consortium response to the peer reviewer’s comments
The peer review was positive and supported previous analytical approaches as well as highlighting the excellent analytical performance of the current assay. The peer review considered:
- Historical correction factors used in previous surveys.
- The use of PABA recovery to assess the completeness of urine collection.
- Analytical performance for the 2018 survey.
1.2 Historical correction factors used in previous surveys
The reviewer had some queries around the methodology used to determine the correction factors that were produced by the former survey contractor (MRC Elsie Widdowson Laboratory (MRC EWL)) in 2014. These correction factors were determined at MRC EWL and were applied to 2014 data and prior surveys in order to facilitate trend analysis of changes in salt intake over time.
The 268 samples used at that time for determination of the correction factors for urinary sodium measurement were randomly selected from a previous sodium survey. As the NDNS and sodium surveys are designed to be representative of the population it is expected that the 268 random samples provide a range of ages, income groups, region and approximately equal distribution of sexes. Within these subgroups there is an overlapping range of salt intakes. It is also important to note that salt intake is calculated from both urinary sodium concentration and urine volume together with their inherent variability. The correction factors were determined from the 268 samples in the concentration range 10 to 230 mmol/L. This is representative of the normal range and covers the range of urinary sodium concentrations in the 2018 sodium survey.
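The dependence of salt intake on both concentration and volume can be illustrated with a short sketch. The conversion below is an assumption based on the molar mass of sodium chloride (58.44 g/mol) and on treating all excreted sodium as dietary salt. It is not taken from the survey documentation, and the function name is hypothetical.

```python
# Assumption: 1 mmol of sodium corresponds to 58.44 mg of sodium chloride.
NACL_MOLAR_MASS_G_PER_MOL = 58.44


def salt_g_per_day(sodium_mmol_per_l, urine_volume_l):
    """Estimated salt intake (g/day) from a complete 24 h urine collection."""
    sodium_mmol = sodium_mmol_per_l * urine_volume_l  # 24 h sodium excretion
    return sodium_mmol * NACL_MOLAR_MASS_G_PER_MOL / 1000


# Hypothetical participant: 100 mmol/L sodium in 1.5 L of urine.
print(f"{salt_g_per_day(100, 1.5):.2f} g/day")
```

Because the two inputs are multiplied, a given error in concentration carries through proportionally to the estimated intake, as does any error in the measured volume.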
The reviewer specifically notes some non-linearity in graphs 3 and 4 in Appendix 2. Graph 4 was not used to determine correction factors and Graph 3 presents the relationship between the Cobas and flame photometer that was used only in a single survey included in the trend analysis (2008). The correction factor determined for the Cobas - Dimension relationship (presented in Graph 2 of Appendix 2) was applied post hoc on the data since 2008 and up to, and including, the 2014 survey.
We are confident that the single correction factor, derived by MRC EWL from random samples representing the observed range of urinary concentrations, is the most appropriate way to derive population geometric means that are used in trend analyses of the sodium survey reports.
1.3 The use of PABA recovery to assess the completeness of urine collection
In the present 2018 survey, PABA recovery of 70% to 103% was used to determine completeness of the 24 h urine collections. This range was derived from a study of PABA recovery in 50 participants and includes consideration of both analytical and biological variation [Cox et al., 2018. Validation of the use of p-aminobenzoic acid to determine completeness of 24 h urine collections in surveys of diet and nutrition. Eur J Clin Nutr. 72(8): 1180-2]. An important aspect of this study was that the recovery was based on measurement of PABA using an HPLC method that is more specific than the older colorimetric method, which was susceptible to interference from other compounds. No PABA-derived corrections to urine volume or sodium concentration have been applied.
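The completeness rule described above can be expressed as a simple classification. This is a sketch only: the function name and the example recoveries are hypothetical, and treating both bounds as inclusive is an assumption.

```python
PABA_COMPLETE_MIN_PCT = 70.0
PABA_COMPLETE_MAX_PCT = 103.0


def is_complete_collection(paba_recovery_pct):
    """True if PABA recovery is within 70% to 103% (bounds assumed inclusive)."""
    return PABA_COMPLETE_MIN_PCT <= paba_recovery_pct <= PABA_COMPLETE_MAX_PCT


# Hypothetical PABA recoveries (%), not survey data.
recoveries = [65.0, 70.0, 88.5, 103.0, 110.2]
complete = [r for r in recoveries if is_complete_collection(r)]
print(complete)
```

In line with the text, the sketch only classifies collections and applies no recovery-based correction to volume or concentration.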
1.4 Analytical performance for the 2018 survey
The peer review highlighted both the excellent accuracy and precision of the current analytical setup using the Cobas C111 for measurement of sodium in the 2018 survey and supports the views of the External Quality Advisor to the Nutritional Biomarker Laboratory.
The reviewer observed that the relationship of corrected values of the 2014 survey with values obtained from re-analysis of the same samples with the Cobas instrument in 2019 demonstrates a very strong association (Figure 1 Appendix 3). The reviewer comments that there may be some non-linearity in the Bland-Altman plots of the same data. However, the actual difference in mmol/L is small (mean bias is 0.2 mmol/L (95% CI: -3.7 to 4.0 mmol/L)) across the concentration range and, given the accuracy of the Cobas instrument, is more likely related to the performance of the Dimension analyser and the use of the universal correction factor. As expected, the percentage difference increases as the concentration nears the measurement limits of the instrument (manufacturer lower limit is 20 mmol/L), although there are few such samples in the 2018 dataset (1.5% of participants have urinary sodium concentration less than 20 mmol/L). We therefore conclude that any non-linearity is minor and, in the context of the information above, is unlikely to have any material impact on calculated salt intakes for the population.
2. Statistical analysis plan for the 2018/19 survey report (papers 2 to 4)
2.1 Peer reviewer’s comments
The statistical analysis plan (paper 2) is detailed and comprehensive.
The suggestion that sodium should be log-transformed for regression analysis is appropriate, though skewness in the untransformed sodium concentrations is slight and regression models are quite robust to moderate departures from normal distributions.
Presentation using geometric means (rather than arithmetic means) is the more appropriate summary measure to use. The 6g population target not being directly derived from one or other mean is unfortunate when changing the form of presentation. Is it the case that this is a target for individuals in the population to aspire to, and so all population means, whether arithmetic or geometric, should be less than this value?
The revised sample sizes and power estimations presented in the statistical analysis plan are appropriate and informative. I agree with the need to update these to at least 80% power. The 0.5g reduction in salt intake between the 2 surveys is a large, possibly unrealistic improvement. The comparison will be underpowered to detect smaller, more achievable improvements. However, power is about p-values and the purpose of comparing the surveys is more about quantifying the changes over time rather than testing the statistical significance of them.
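To illustrate why a comparison powered for a 7% relative reduction is underpowered for smaller changes, the following sketch approximates power for a two-sided comparison of two independent surveys on the log scale. It is a simplified normal approximation that ignores survey weighting and design effects, and it is not the calculation in the statistical analysis plan. The standard deviation of log salt intake and the sample sizes are hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist


def power_two_surveys(relative_reduction, sd_log, n1, n2, alpha=0.05):
    """Approximate power of a two-sided z-test on the difference in log means."""
    effect = abs(log(1 - relative_reduction))  # difference on the log scale
    se = sd_log * sqrt(1 / n1 + 1 / n2)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    z = effect / se
    return NormalDist().cdf(z - z_crit) + NormalDist().cdf(-z - z_crit)


# Hypothetical inputs: sd_log and the sample sizes are illustrative only.
for reduction in (0.03, 0.05, 0.07):
    p = power_two_surveys(reduction, sd_log=0.45, n1=600, n2=600)
    print(f"{reduction:.0%} relative reduction: power {p:.2f}")
```

With the sample sizes held fixed, power falls as the assumed reduction shrinks, which is the trade-off noted above.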
Where absolute changes in geometric means are presented for illustration, I suggest that consideration is given to clarifying that this is “on average” or “for an average person”, or words to that effect. This is because where data are log-transformed, the relative change is fixed (5%, 6%, 7% or 8% say) and the absolute change varies according to the magnitude of the reference mean (that is, 7% of a small number is less in absolute terms than 7% of a large number).
So the 0.5g reduction the survey is adequately powered to detect is ‘on average’, derived from the 7% relative reduction used in the calculations.
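The point that a fixed relative change implies a varying absolute change can be shown with a few lines of arithmetic. The reference means below are hypothetical and chosen only to span a plausible range. The 7% figure is the relative reduction quoted above.

```python
def absolute_reduction(reference_mean_g, relative_reduction):
    """Absolute change (g/day) implied by a fixed relative reduction."""
    return reference_mean_g * relative_reduction


# Hypothetical reference means (g/day), not survey estimates.
for reference in (5.0, 7.0, 9.0):
    change = absolute_reduction(reference, 0.07)
    print(f"7% of {reference:.1f} g/day is {change:.2f} g/day")
```

A reduction of about 0.5g therefore corresponds to a 7% change only for a reference mean of roughly 7g, and to a smaller or larger absolute change for lower or higher reference values.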
Consideration should be given to clarifying that the ‘mean’ in the 2014 report refers to the ‘arithmetic mean’, for example, in table 2.
I was unclear what the standard error of a geometric mean would look like. Would this be the standard error of the log-transformed sodium? If so, this would be on a different scale to the back-transformed values in the geometric mean and percentiles. And if so, though neither are particularly helpful, why could this not be done for standard deviation too? I am unsure what to recommend here, and this is one disadvantage of using the geometric mean. Consideration could be given to clarifying if this is standard error on the log-scale. Consideration could be given to quoting lower and upper confidence limits for the geometric mean as being easier to interpret than the standard error on the log scale, but still derived from it. Failing that, consideration could be given to presenting other percentiles. I understand the desire to avoid presenting more than one average, but the median (50th percentile) would not be inappropriate if you wanted to keep this in. Having said that, the geometric mean is still the best measure.
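As a sketch of how confidence limits for a geometric mean can be derived from the standard error on the log scale, the following computes the mean and standard error of the log-transformed values and back-transforms the limits. It assumes simple random sampling (no survey weights or design effects) and uses a normal multiplier of 1.96, both of which are simplifying assumptions. The function name and example values are hypothetical.

```python
from math import exp, log, sqrt
from statistics import mean, stdev


def geometric_mean_ci(values, z=1.96):
    """Geometric mean with back-transformed confidence limits.

    Returns (geometric mean, lower limit, upper limit, log-scale standard error).
    """
    logs = [log(v) for v in values]
    log_mean = mean(logs)
    log_se = stdev(logs) / sqrt(len(logs))  # standard error on the log scale
    return (
        exp(log_mean),
        exp(log_mean - z * log_se),
        exp(log_mean + z * log_se),
        log_se,
    )


# Hypothetical salt intakes (g/day), not survey data.
intakes = [4.2, 5.8, 6.5, 7.1, 8.4, 9.9, 12.3]
gm, lower, upper, log_se = geometric_mean_ci(intakes)
print(f"geometric mean {gm:.2f} g/day (95% CI {lower:.2f} to {upper:.2f})")
```

The limits are on the same scale as the geometric mean and are asymmetric around it, whereas the log-scale standard error is unitless and cannot be read directly in grams.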
I agree that the scatter plot figure 2 is useful for the detailed statistical analysis plan, but not so helpful to present in the final report.
The bullet point lists of suggested improvements to the presentation of the England 2018/19 trend analysis are all appropriate and helpful.
I agree that figure 4 is the better choice for the report. I agree with later suggestions that this could be improved by indicating the broken vertical axis. Ideally, we would not break this at all, but it is somewhat inevitable when using the log-scale, which is the appropriate scale to use.
Consideration could be given to whether the size of the square could be weighted by √n or 1/SE. If not, maybe use a circle rather than a square, so there is no confusion with a forest plot. Neither of these is a particularly important point though.
For figure 4, the reference is the 6g population target. For figure 5, where the trend is separated into males and females, the separate 7g and 5g targets should be used instead, as suggested as an option in the statistical analysis plan.
The statistical analysis plan mentions that the data for 2008 has been revised compared with the data used for the England 2014 report to correct an error in the equation used to adjust sodium excretion data for collections deemed marginally complete. The plan would benefit from some more detail here, or a reference to elsewhere in the NDNS methodology where this is described. For example, what percentage of PABA recovery indicates marginal completeness here? Is this referring to 50% to 85% recovery? Or correction to 93% or 100% recovery?
The many suggestions of the SACN representatives (see meeting note – paper 3) are constructive and I agree would lead to improvements in presentation and ease of interpretation. In particular, I agree with the points on the health impact of a 0.5g/day reduction in salt intake (subject to the understanding that the survey is about more than just hypothesis testing), the comments on the scatterplots, the comments on interpretation of figure 4, the careful use of the broken scale, not extrapolating results beyond the range of the data, using separate target intakes for males and females if separate trends are fitted, and presenting the distribution of sodium intakes.
The suggestions to improve the graphical presentation made (in paper 4) following the meeting on 2 May 2019 are all appropriate and useful. Consideration could be given to whether the break in the vertical axis could be made even clearer by inserting the break symbol into the axis itself rather than alongside, subject to software limitations.
The vertical line inserted into the graphs showing shifts in distribution between 2006, 2011 and 2014 presumably indicates the modal average. Consideration should be given to the appropriateness of using a fourth type of average here. Instead, the geometric mean could be indicated as a vertical line on the graph, or possibly the recommended maximum daily intake for individuals, because the distribution is of individuals’ intakes.
2.2 Consortium response to the peer reviewer’s comments (papers 2 to 4)
The peer review of the statistical analysis plan was very positive with only a few points for consideration. It confirmed our proposal to present findings using geometric means rather than arithmetic means and to perform the regression analysis on the log-transformed scale. There was reassurance that the 0.5g reduction in salt intake was adequately powered (at 80%) but concern that 0.5g difference is a large, possibly unrealistic reduction. The use of a standard error of a geometric mean was unclear to the reviewer and instead the use of a 95% confidence interval for the geometric mean was recommended – this will be investigated and proposed to PHE. Confirmation was also obtained for our suggested improvements to the graphical presentation of the trend analysis.