Comprehensive Viral Enrichment Enables Sensitive Respiratory Virus Genomic Identification and Analysis by Next Generation Sequencing
Downloaded from genome.cshlp.org on October 10, 2021 - Published by Cold Spring Harbor Laboratory Press

Method

Comprehensive viral enrichment enables sensitive respiratory virus genomic identification and analysis by next generation sequencing

Brigid M. O’Flaherty,1,2,6 Yan Li,1,6 Ying Tao,1 Clinton R. Paden,1,2 Krista Queen,1,2 Jing Zhang,1,3 Darrell L. Dinwiddie,4 Stephen M. Gross,5 Gary P. Schroth,5 and Suxiang Tong1

1Centers for Disease Control and Prevention, NCIRD, DVD, Atlanta, Georgia 30329, USA; 2Oak Ridge Institute for Science Education, Oak Ridge, Tennessee 37830, USA; 3IHRC Incorporated, Atlanta, Georgia 30346, USA; 4Department of Pediatrics, Clinical Translational Science Center, University of New Mexico, Albuquerque, New Mexico 87131, USA; 5Illumina, Incorporated, San Diego, California 92122, USA

Next generation sequencing (NGS) technologies have revolutionized the genomics field and are becoming more commonplace for identification of human infectious diseases. However, due to the low abundance of viral nucleic acids (NAs) in relation to host, viral identification using direct NGS technologies often lacks sufficient sensitivity. Here, we describe an approach based on two complementary enrichment strategies that significantly improves the sensitivity of NGS-based virus identification. To start, we developed two sets of DNA probes to enrich virus NAs associated with respiratory diseases. The first set of probes spans the genomes, allowing for identification of known viruses and full genome sequencing, while the second set targets regions conserved among viral families or genera, providing the ability to detect both known and potentially novel members of those virus groups. Efficiency of enrichment was assessed by NGS testing reference virus and clinical samples with known infection. We show significant improvement in viral identification using enriched NGS compared to unenriched NGS.
Without enrichment, we observed an average of 0.3% targeted viral reads per sample. However, after enrichment, 50%–99% of the reads per sample were the targeted viral reads for both the reference isolates and clinical specimens using both probe sets. Importantly, dramatic improvements on genome coverage were also observed following virus-specific probe enrichment. The methods described here provide improved sensitivity for virus identification by NGS, allowing for a more comprehensive analysis of disease etiology.

[Supplemental material is available for this article.]

6These authors contributed equally to this work.
Corresponding author: [email protected]
Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.226316.117.

© 2018 O’Flaherty et al. This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.

Genome Research 28:869–877. Published by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/18; www.genome.org

Numerous infectious diseases, including 25%–50% of acute lower respiratory tract illnesses, go undiagnosed (Garau and Calbo 2008). Failures in detection may result from the limited repertoire of routine diagnostics or the inability of commonly used PCR methods to pick up variants of known pathogens or unknown and previously unrecognized pathogens. Using next generation sequencing (NGS) for viral detection and discovery provides significant advantages to traditional viral diagnostic assays: full genome sequences may be obtained and targets are not limited by pre-existing knowledge of the pathogen. However, one of the greatest challenges to NGS as a routine, affordable, diagnostic tool is the relative paucity of viral nucleic acids (NAs) present in clinical samples.

In an effort to increase the proportion of viral sequences obtained by NGS, several techniques for non-sequence-specific viral enrichment have been implemented. Ultracentrifugation, sample filtration, and viral culture have been used to enrich viral particles prior to nucleic acid extraction (Breitbart and Rohwer 2005; Duhaime and Sullivan 2012). Additionally, host genomic DNA depletion by DNase treatment or ribosomal RNA depletion have been used to reduce levels of host NAs in a sample (Allander et al. 2001). Although these methods show modest levels of improvement in detection of viral target sequence reads, they do not achieve the desired sensitivity.

An alternative sequence-specific approach involves hybridization-based sequence capture of specific NAs using 80- to 120-mer DNA or RNA probes (Lovett et al. 1991; Albert et al. 2007; Hodges et al. 2007; Okou et al. 2007). Several groups have employed this method for virus-specific capture and genome enrichment prior to NGS (Depledge et al. 2011; Mate et al. 2015; Olp et al. 2015). Using the SureSelectXT Target Enrichment System, Depledge et al. (2011) enriched human herpesviruses from clinical samples, finding significant improvement in sequencing depth. Two studies have described the use of enrichment for characterization of Ebola virus and Kaposi’s sarcoma-associated herpesvirus using Illumina TruSeq RNA Access and Agilent SureSelectXT Target Enrichment, respectively (Mate et al. 2015; Olp et al. 2015). Collectively, these strategies result in increased target sequence reads and improved depth of coverage.
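The read fractions quoted in the abstract (about 0.3% targeted viral reads without enrichment, 50%–99% with enrichment) can be related to a fold-enrichment figure with simple arithmetic. The following is a minimal Python sketch, assuming that the percentage of targeted reads is the number of reads assigned to the target virus divided by the total reads in the library, and that fold enrichment is the ratio of the enriched to the unenriched percentage. The function names and the read counts are hypothetical illustrations, not the authors' pipeline or data.

```python
def percent_targeted_reads(targeted_reads: int, total_reads: int) -> float:
    """Percentage of reads in one library assigned to the targeted virus."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= targeted_reads <= total_reads:
        raise ValueError("targeted_reads must lie between 0 and total_reads")
    return 100.0 * targeted_reads / total_reads


def fold_enrichment(ptr_enriched: float, ptr_unenriched: float) -> float:
    """Ratio of enriched to unenriched percentages for paired libraries."""
    if ptr_unenriched <= 0:
        # With zero targeted reads before enrichment the ratio is undefined.
        raise ValueError("ptr_unenriched must be positive to form a ratio")
    return ptr_enriched / ptr_unenriched


# Hypothetical paired libraries made from the same template (assumed counts).
unenriched = percent_targeted_reads(3_000, 1_000_000)
enriched = percent_targeted_reads(750_000, 1_000_000)
print(f"unenriched: {unenriched:.2f}%  enriched: {enriched:.2f}%")
print(f"fold enrichment: {fold_enrichment(enriched, unenriched):.0f}")
```

Because the ratio is undefined when the unenriched library contains no targeted reads, the sketch raises an error in that case rather than reporting an infinite fold change.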
Sequence-specific enrichment probes have also been used in the study of the human virome (Briese et al. 2015; Wylie et al. 2015). Briese and colleagues developed the Virome Capture Sequencing Platform for vertebrate viruses, where the authors designed ∼2 million viral enrichment probes to target all virus taxa where at least one virus was known to infect vertebrates (Briese et al. 2015). This method resulted in a 100- to 1000-fold increase in the number of targeted viral reads and reduction in host background. Similarly, Wylie et al. (2015) reported the ViroCap enrichment system, where a custom panel of 2.1 million genus- and virus-specific probes targeted against vertebrate viruses from 34 families allowed for viral enrichment in clinical specimen libraries. In both studies, the probe design targeting known sequences was limited in identification of highly divergent viruses.

Here, we report an approach for target-based enrichment for sensitive detection of a broad spectrum of respiratory viruses by NGS. We used two complementary panels of oligonucleotide probes targeted against representative common respiratory viruses as a proof of concept. They are (1) virus-specific probes that span the full genome of common respiratory viruses, and (2) conserved viral group probes that target against conserved regions from each of nine viral families or subfamilies that are associated with respiratory diseases. The virus-specific probes allow for full genome sequencing, which contributes to a more complete and confident identification and characterization of target viruses. The conserved viral group probes enable detection of divergent viruses with the potential to recognize novel viruses within these known viral families. The combination of these two enrichment approaches is expected to increase the number of usable reads per sample and per sequencing run, significantly improving the sensitivity and value of NGS for viral detection, discovery, and sequence characterization.

Results

Enrichment analysis of reference viruses with two hybridization probe sets

Both sets of enriching probes provide significant improvement, increasing both the overall rate of virus detection and the percent targeted viral reads (PTRs) per sample by NGS. We first evaluated the enrichment by virus-specific probes using representative reference viruses from viral families associated with human respiratory diseases (Supplemental Table S1). Prior to library preparation, we determined the virus load (Ct) where real-time PCR assays were available. Viral Ct values ranged from about 21 to 33, depending on the available reference virus NA stock, with most Ct values at or around 25. For each sample, we generated two parallel libraries from the same viral NA template using different sequencing indices. This allowed us to assess samples with and without enrichment on the same sequencing run.

The virus-specific probe pool improved detection of target virus for all of the reference virus NAs tested when compared to their unenriched matches, except the HBoV1 virus (Fig. 1). Overall enrichment with virus-specific probes results in a 7285-fold median increase in PTRs (increase ranges from 33- to 188,019-fold) (Supplemental Table S2). Notably, 18 of the 26 (69%) samples improved to 50%–99% PTRs following enrichment: 229E, NL63, OC43, AdV E4, HPeV6, HRV B14, HPIV1, HPIV2, HPIV3, RSV A2, RSV B1, HMPV83, all five representative influenza A viruses, and one influenza B (B/Yamagata) virus (Fig. 1; Supplemental Table S2). Three samples (AdV B11, HMPV75, and HPeV1) were enriched to 20%–45% PTRs. The overall linear genome coverage and depth of coverage was improved dramatically, as 73% of the samples generated a linear genome coverage of more than 85% (Supplemental Table S2), which provided