A Simple Method to Allow for Guanine-Cytosine Amplification Error
Clinica Chimica Acta 496 (2019) 13–17
Journal homepage: www.elsevier.com/locate/cca

A simple method to allow for guanine-cytosine amplification error in prenatal DNA screening for trisomy 18

Nicholas J. Wald⁎,1, Jonathan P. Bestwick1, King Wai Lau, Wayne J. Huttly, Weilin Ke, Ray Cheng, Robert W. Old

Wolfson Institute of Preventive Medicine, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, Charterhouse Square, London EC1M 6BQ, UK

Keywords: Prenatal screening; Trisomy 18; Edwards syndrome; Cell-free DNA; GC content

ABSTRACT

Background: A source of error in prenatal screening for trisomies is PCR amplification error associated with guanine-cytosine (GC) content of DNA fragments in maternal plasma. We describe a simple method of allowing for this.

Methods: Data from a Reflex DNA screening programme (67 trisomy 18 and 83 unaffected pregnancies) were used to compare the ratio of chromosome 18 DNA fragment counts to chromosome 8 DNA fragment counts (because chromosome 8 has a similar GC content to chromosome 18) with the percentage of chromosome 18 DNA counts using counts from all autosomes in the denominator, with and without an all autosome correction for the GC content of the DNA fragments.

Results: A chromosome 18 to 8 ratio of DNA fragment counts was more discriminatory than the percentage of all autosome counts arising from chromosome 18 without or with an all autosome correction for GC content bias. It achieves a high screening performance, e.g. for a 0.25% false-positive rate, a 97% detection rate instead of 49% without a correction for GC content, and 91% with an all autosome correction for GC content.
Conclusion: Consideration can be given to using the ratio of chromosome 18 DNA fragment counts to chromosome 8 DNA fragment counts in cell-free DNA prenatal screening for trisomy 18, avoiding the need for the more complex methods of correcting for GC content currently used.

⁎ Corresponding author. E-mail address: [email protected] (N.J. Wald).
1 Joint first authors.
https://doi.org/10.1016/j.cca.2019.06.015
Received 25 March 2019; Received in revised form 5 June 2019; Available online 15 June 2019
0009-8981/© 2019 Published by Elsevier B.V.

1. Introduction

Analysis of maternal plasma DNA (also known as cell-free DNA) is an accurate method for prenatal screening for fetal trisomies 21, 18, and 13 [1]. However, the screening performance for trisomy 18 is less than for trisomy 21 [2], and the reasons for this are unknown. This prompted us to examine possible sources of analytical error that might affect DNA screening for trisomy 18.

The DNA analysis most widely used in prenatal screening for trisomy 18 is massively parallel sequencing. This involves sequencing several million DNA fragments in maternal plasma and then calculating the proportion of sequences that map to chromosome 18. The denominator of the proportion is usually the number of DNA fragments that map to all autosomes. A correction for GC (guanine-cytosine) content of the DNA fragments is usually applied [3] to allow for GC-associated error in the PCR copying number of DNA fragments. The usual method for allowing for GC error relies on a plot of the number of sequenced DNA fragments from all chromosomes against the GC content of the fragments [4]. Ideally, there should be no association between the GC content of a fragment and the fragment counts sequenced, so that the plot is horizontal. In practice, however, the plot is bell-shaped, indicating underestimation with DNA fragments with high and low GC content and overestimation in between. Deviations from the overall average (i.e. expectation) can be used to standardize (i.e. correct) the error. The method has the advantage of generalizability (e.g. applicable to DNA fragments from all chromosomes) but it has several disadvantages. The method is prone to variation from analytical run to run, and corrections vary according to the pre-sequencing steps (e.g. how the PCR is performed) and according to the sequencing methods used, all of which impair analytical precision. This all autosome GC correction method is complex, not transparent, and requires a large dataset, preferably linked to a particular sequencing method and laboratory.

Sehnert and colleagues [5] indicate that it may be better to use a single or a small number of chromosomes in the denominator, instead of all autosomes, when calculating the proportion of DNA fragments aligning to chromosome 18. Empirical testing of different chromosome denominators indicated that chromosome 8 was the most discriminatory for trisomy 18. We explored this strategy as a way of improving DNA screening performance for trisomy 18, using a larger data set obtained from the Wolfson Institute (London) prenatal screening programme for trisomy 21, 18, and 13 from 2015 to 2018.

2. Methods

Maternal plasma DNA from 67 trisomy 18 (affected) pregnancies and 83 unaffected pregnancies was sequenced using a semiconductor sequencing platform and software [6]. Typically about 10 million DNA fragments were analysed in each plasma sample. Data from the BAM (Binary Alignment Map) files that plasma DNA analysis generated for each pregnancy were aligned to the human reference genome (hg19). DNA fragments that mapped to individual chromosomes were counted. The fetal fraction of individual samples was estimated by proprietary (Premaitha) software within the test platform.

DNA fragments from chromosome 18 were expressed as (i) a percentage using counts from all autosomes as the denominator, without a correction for GC content, (ii) a percentage using counts from all autosomes as the denominator, with an all autosome correction for GC content and (iii) a ratio using counts from chromosome 8 as the denominator. Whether a percentage or ratio is used with all autosomes as the denominator will make little difference because the proportion of DNA fragments from chromosome 18 is small compared to fragments from all autosomes. However, when only considering chromosomes 18 and 8 it does matter because the random error arising from combining the DNA fragment counts from chromosomes 18 and 8 in the denominator is greater than with fragments from chromosome 8 alone. Dot plots, which categorise observations in a manner that avoids overlapping dots, were used to visually compare the distributions in affected and unaffected pregnancies with DNA fragments from chromosome 18 expressed as (i) a percentage using counts from all autosomes as the denominator, without a correction for GC content, (ii) a percentage using counts from all autosomes as the denominator, with an all autosome correction for GC content and (iii) a ratio using counts from chromosome 8 as the denominator.

The percentages and ratio were converted into multiple of the median (MoM) values by dividing by the respective median percentages […] illustrate the fit of chromosome 18 MoM values (using all autosomes as the denominator and using chromosome 8 as the denominator) to Gaussian distributions, probability plots were generated separately for affected and unaffected pregnancies, with affected pregnancies standardised to a fetal fraction of 6% by adjusting the MoM values according to the slopes of the regression lines. Points on the probability plots lying on a straight line indicate a good fit to a Gaussian distribution. The standard deviations of MoM values in affected and unaffected pregnancies were taken as the slopes of regression lines of the points on each probability plot between the 10th and 90th centiles, a standard method of estimating the standard deviation that avoids the undue influence of outliers [7]. The risk of each of the 67 affected and 83 unaffected pregnancies being affected with trisomy 18 was estimated as the maternal age-specific odds of an affected livebirth [8], adjusted to the first trimester by the fetal loss rate from this time in pregnancy until term [9], multiplied by the likelihood ratio (the height of the fetal-fraction-specific Gaussian distribution in affected pregnancies divided by the height of the Gaussian distribution in unaffected pregnancies). Screening performance was estimated as the detection rate (DR; the proportion of affected pregnancies with a positive result) for a specified false-positive rate (FPR; the proportion of unaffected pregnancies with a positive result), the FPR for a specified DR, and the DR and FPR for a specified risk cut-off.

Modelling based on multivariate Gaussian analyses provides a more robust estimate of screening performance, provided the underlying distributions of the markers are approximately Gaussian, a method that is routinely used in prenatal screening and has been empirically validated [10–12]. It avoids random error in estimation that arises from using directly observed results when the number of observations is not large; for example, if there were no false-positives among 100 unaffected pregnancies in a study sample, it should not be taken to mean that there will be no false-positives in the population at large. We provide observed and modelled results so that a comparison can be made between the two methods. Screening performance using modelled DNA counts was estimated by simulating data on 100,000 affected and unaffected pregnancies based on (i) the distribution of live births in England and Wales 2014–16 [13], (ii) the distributions of fetal fraction in affected and unaffected pregnancies and (iii) the fetal-fraction-specific Gaussian distributions of chromosome 18 MoM values using (a) all autosomes as the denominator, with and without an all autosome correction for GC content, and (b) chromosome 8 as the denominator. The likelihood ratio and risk of being affected was then calculated as with the observed data from the 67 affected and 83 unaffected pregnancies i.e.