Nanopore Sequencing the Advantages of Long Reads for Genome Assembly

Total Page:16

File Type:pdf, Size:1020Kb

Nanopore Sequencing the Advantages of Long Reads for Genome Assembly WHITE PAPER Nanopore sequencing The advantages of long reads for genome assembly SEPTEMBER 2017 nanoporetech.com nanoporetech.com/publications OXFORD NANOPORE TECHNOLOGIES | THE ADVANTAGESAPPLICATION ANDOF LONG ADVANTAGES READS FOR OF GENOMELONG-READ ASSEMBLY NANOPORE SEQUENCING TO STRUCTURAL VARIATION ANALYSIS Contents 1 De novo assembly 2 Advantages of long-read sequencing for genome assembly 3 Genome assembly tools 4 Case studies 5 Summary 6 About Oxford Nanopore Technologies 7 References OXFORD NANOPORE TECHNOLOGIES | THE ADVANTAGES OF LONG READS FOR GENOME ASSEMBLY Introduction Over the last decade, improvements in Whole genome assembly – next generation DNA sequencing solving the puzzle technology have transformed the field Traditional technologies have required of genomics, making it an essential tool users to sequence short lengths of DNA, in modern genetic and clinical research which must then be reassembled back laboratories. The facility to sequence into their original order as accurately as whole genomes or specific genomic possible. Such short-read sequencing regions of interest is delivering new insights technologies, however, present a number into a variety of applications such as of challenges, particularly the difficulty of human health and disease, metagenomics, accurately analysing repetitive regions and antimicrobial resistance, evolutionary large structural variations.1 biology and crop breeding. This means that many reference genomes that were created using short-read For applications such as the analysis of sequencing are highly fragmented, which larger structural variation, or de novo in turn introduces bias into any alignments assembly, whole genome sequencing made against that reference2. This review shows how these challenges are now (WGS) is typically the technique of choice. being met, by the emergence of long-read nanopore sequencing3. One of the key steps of WGS is the accurate assembly of the vast amount of data generated into a contiguous stretch of DNA sequence. This review provides a background to the DNA assembly process and the associated advantages of long or ultra-long DNA reads, as provided by nanopore sequencing technology. 1 OXFORD NANOPORE TECHNOLOGIES | THE ADVANTAGES OF LONG READS FOR GENOME ASSEMBLY 1 De novo assembly De novo assembly aims to reconstruct Several approaches have been the original genome sequence from the used for this purpose, including Greedy set of reads, for example when there is no extension, de Bruijn graphs and Overlap reference genome available or a scientist Layout Consensus (OLC) (Figure 1). The prefers to avoid potential bias that could graph is then simplified and a consensus arise from using an imperfect reference. is extracted. A key step in this process is Generally, the first step is to find overlaps the assembly of contiguous, uninterrupted between the reads and build a graph to stretches of overlapping DNA, so-called describe their relationship. contigs. As detailed in the next section, long-read sequencing technology offers many advantages for both de novo and Long-read sequencing technology alignment-based genome assembly. offers simplified and less ambiguous genome assembly. Figure 1 Long reads A schematic showing how long- read sequencing can deliver simplified, less ambiguous genome assembly. Long reads Short reads (solid arrows) have greater overlap with other reads than is provided by short reads (dashed arrows), allowing more accurate assemblies, especially in repeat regions (R). Image adapted from Schatz (2014)4. 2 OXFORD NANOPORE TECHNOLOGIES | THE ADVANTAGES OF LONG READS FOR GENOME ASSEMBLY Advantages of long-read sequencing 2 for genome assembly Until recently, cost-effective generation Facility to span repetitive of large amounts of sequence data could genomic regions only be performed using short-read (<300 Most genomes contain significant amounts bp) sequencing technologies; however, of repetitive DNA (e.g. transposons, nanopore-based sequencing can process satellites, gene duplications). As the very long DNA fragments (currently up to short reads produced by traditional next 950 kb)5 to create long reads, which deliver generation sequencing (NGS) technology a number of significant benefits: may not span each given repetitive region, the resulting genome assemblies can be highly fragmented6. Long-read sequencing Long-read nanopore sequencing technologies have a significant advantage offers easier assembly and here as the reads generated are more likely to span the full repetitive region, the ability to span repetitive allowing the creation of accurate genome genomic regions. assemblies with minimal gaps (see Figure Ease of assembly 3 and Case Study 1, page 7). The longer a sequencing read, the more overlap it will have with other reads. As such, it is much easier to assemble the DNA fragments back into the correct order. This can be visualised much like a jigsaw; the larger the pieces, the easier the puzzle (Figure 2). Figure 2 Like a jigsaw puzzle with large pieces, long-read DNA is much easier to assemble than short- read DNA. The Escherichia coli genome comprises 4.6 million bases, which would equate to 92,000 fragments of 50 bp or just 9 fragments of 500 kb in length. ~ 50 base read ~ 500,000 base read ~ 92,000 “pieces” ~ 9 “pieces” Figure 3 A schematic highlighting the Short reads advantages of long reads in de novo assembly of repetitive regions. Long read lengths Consensus are more likely to incorporate the whole repetitive region Long reads (shown in red) allowing more accurate assembly with fewer gaps. Image adapted from Sam Consensus Demharter7. 3 OXFORD NANOPORE TECHNOLOGIES | THE ADVANTAGES OF LONG READS FOR GENOME ASSEMBLY Identification of large Most existing genome assemblies have structural variation been created using short-read sequencing technology, which is limited in its ability The term structural variation (SV) covers to capture large structural variation. Even a range of genetic alterations, including the well- studied human genome has large copy number variation (CNV), duplications, gaps in its assembly6.As CNV alone is translocations and inversions (Table 1). estimated to comprise 4.8–9.5% of the human genome10, it is clear that accurate Traditionally, structural variants were analysis of such regions is of significant defined as spanning over 1,000 base importance. pairs; however, with the advent of higher resolution genome-analysis techniques, Unlike short-read technology, longer some researchers have now revised nanopore-sequencing reads can cover this definition to spanning over 50 base the whole structure variant in one read. pairs8. Such structural variants have been This results in more accurate genome associated with a number of diseases assemblies facilitating better understanding (including autism, schizophrenia and of genome architecture in genomic cancer) and, for this reason, have become diseases (see Figure 4 and Case Studies 2 increasingly important areas for research9. and 3, pages 8, 9 and 10)11,12,13. A Short reads Consensus Long reads Consensus Structural variants Definition 1 3 Short reads 1 2 4 Copy number variation (includes A segment of DNA which has 1 3 4 insertions and deletions) greater (insertion) or fewer Consensus 1 3 2 4 (deletion) copies than expected Duplications A segment of DNA that occurs 1 2 3 4 Long reads 1 2 3 4 in two or more copies per 1 2 3 4 haploid genome Consensus 1 2 3 4 Translocation A change in position of a region of DNA which involves no Figure 4 change to the total DNA content A schematic highlighting the advantages of long reads in de novo Inversions A segment of DNA that is assembly of copy number variations (CNVs). Long read lengths are reversed end to end more likely to contain the whole CNV, providing less potential for error in the consensus sequence. Image adapted from Sam Demharter7. Table 1 Definition of common large structural variants. 4 OXFORD NANOPORE TECHNOLOGIES | THE ADVANTAGES OF LONG READS FOR GENOME ASSEMBLY Assembly quality The much higher N50 values generated using long-read nanopore sequencing, The quality of genome assemblies is underlines the superior quality and often assessed by the number of contigs completeness of genome assembly required to represent the genome, the when compared with short-read length of these contigs relative to the size sequencing technology (Table 2 and of the genome and the proportion of reads Case Study 1). that can be assigned to the contigs. One of the most commonly used assembly metrics is N50. The N50 value is calculated by sorting all contigs by length, calculating the cumulative assembly length (i.e. adding the length of all contigs together) and identifying the length of the contig which is situated in the middle of the cumulative length (Figure 5). The higher the N50 value, the more contiguous the assembly. Figure 5 The N50 metric is one measure of the quality of a genome assembly. Other metrics such as N75 and N25 are also sometimes used to further assess assembly quality. Image adapted from Jansen (2016)14. Table 2 Comparison between genome Isolate Technology Number of contigs N50 length assemblies of Rhizoctonia solani using long- and short- AG1-IA Short read 6,452 20 kb read sequencing platforms. The higher N50 value coupled with the fewer number of contigs, AG3 Short read 6,040 26 kb obtained using long-read length nanopore technology is AG8 Short read 7,6 0 6 7 kb indicative of superior genome assembly quality. Table adapted 15 AG1-IB Short read 3,793 35 kb from Datema et al (2016) . AGX Nanopore long read 606 199 kb 5 OXFORD NANOPORE TECHNOLOGIES | THE ADVANTAGES OF LONG READS FOR GENOME ASSEMBLY 3 Genome assembly tools Many software tools are now available A recent study by Cherukuri and Janga for genome assembly utilising long-read (2016) comparing a number of assembly sequencing data. Tools differ in what they algorithms found the OLC approach to be offer and the extent of their use and should most favourable for long-read nanopore be assessed according to the specific sequencing – delivering higher genome project requirements (Table 3).
Recommended publications
  • Recovery of Small Plasmid Sequences Via Oxford Nanopore Sequencing
    bioRxiv preprint doi: https://doi.org/10.1101/2021.02.21.432182; this version posted February 22, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license. Recovery of small plasmid sequences via Oxford Nanopore sequencing Ryan R. Wick1*, Louise M. Judd1 , Kelly L. Wyres1 and Kathryn E. Holt1,2 1. Department of Infectious Diseases, Central Clinical School, Monash University, Melbourne, VIC, 3004, Australia 2. Department of Infection Biology, London School of Hygiene & Tropical Medicine, London, WC1E 7HT, UK * [email protected] Abstract Oxford Nanopore Technologies (ONT) sequencing platforms currently offer two approaches to whole-genome native-DNA library preparation: ligation and rapid. In this study, we compared these two approaches for bacterial whole-genome sequencing, with a specific aim of assessing their ability to recover small plasmid sequences. To do so, we sequenced DNA from seven plasmid-rich bacterial isolates in three different ways: ONT ligation, ONT rapid and Illumina. Using the Illumina read depths to approximate true plasmid abundance, we found that small plasmids (<20 kbp) were underrepresented in ONT ligation read sets (by a mean factor of ~4) but were not underrepresented in ONT rapid read sets. This effect correlated with plasmid size, with the smallest plasmids being the most underrepresented in ONT ligation read sets. We also found lower rates of chimeric reads in the rapid read sets relative to ligation read sets. These results show that when small plasmid recovery is important, ONT rapid library preparations are preferable to ligation-based protocols.
    [Show full text]
  • Nanopore Sequencing of Long Ribosomal DNA Amplicons Enables
    bioRxiv preprint first posted online Jun. 29, 2018; doi: http://dx.doi.org/10.1101/358572. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license. Nanopore sequencing of long ribosomal DNA amplicons enables portable and simple biodiversity assessments with high phylogenetic resolution across broad taxonomic scale Henrik Krehenwinkel1,4, Aaron Pomerantz2, James B. Henderson3,4, Susan R. Kennedy1, Jun Ying Lim1,2, Varun Swamy5, Juan Diego Shoobridge6, Nipam H. Patel2,7, Rosemary G. Gillespie1, Stefan Prost2,8 1 Department of Environmental Science, Policy and Management, University of California, Berkeley, USA 2 Department of Integrative Biology, University of California, Berkeley, USA 3 Institute for Biodiversity Science and Sustainability, California Academy of Sciences, San Francisco, USA 4 Center for Comparative Genomics, California Academy of Sciences, San Francisco, USA 5 San Diego Zoo Institute for Conservation Research, Escondido, USA 6 Applied Botany Laboratory, Research and development Laboratories, Cayetano Heredia University, Lima, Perú 7 Department of Molecular and Cell Biology, University of California, Berkeley, USA 8 Research Institute of Wildlife Ecology, Department of Integrative Biology and Evolution, University of Veterinary Medicine, Vienna, Austria Corresponding authors: Henrik Krehenwinkel ([email protected]) and Stefan Prost ([email protected]) Keywords Biodiversity, ribosomal, eukaryotes, long DNA barcodes, Oxford Nanopore Technologies, MinION Abstract Background In light of the current biodiversity crisis, DNA barcoding is developing into an essential tool to quantify state shifts in global ecosystems.
    [Show full text]
  • Local Solid-State Modification of Nanopore Surface Charges
    Local solid-state modification of nanopore surface charges Local solid-state modification of nanopore surface charges Ronald Kox1,2,5, Stella Deheryan1, Chang Chen1,3, Nima Arjmandi1, Liesbet Lagae1,4, and Gustaaf Borghs1,4 Imec, Kapeldreef 75, 3001, Leuven, Belgium Department of Electrical Engineering, Katholieke Universiteit Leuven, Kasteelpark Arenberg 10, 3001, Leuven, Belgium Department of Chemistry, Katholieke Universiteit Leuven, Celestijnenlaan 200 F, Leuven, 3001, Leuven, Belgium Department of Physics, Katholieke Universiteit Leuven, Celestijnenlaan 200 D, Leuven, 3001, Leuven, Belgium Abstract: The last decade, nanopores have emerged as a new and interesting tool for the study of biological macromolecules like proteins and DNA. While biological pores, especially alpha-hemolysin, have been promising for the detection of DNA, their poor chemical stability limits their use. For this reason, researchers are trying to mimic their behaviour using more stable, solid-state nanopores. The most successful tools to fabricate such nanopores use high energy electron or ions beams to drill or reshape holes in very thin membranes. While the resolution of these methods can be very good, they require tools that are not commonly available and tend to damage and charge the nanopore surface. In this work, we show nanopores that have been fabricated using standard micromachning techniques together with EBID, and present a simple model that is used to estimate the surface charge. The results show that EBID with a silicon oxide precursor can be used to tune the nanopore surface and that the surface charge is stable over a wide range of concentrations. PACS: 81.07.-b Nanoscale materials and structures: fabrication and characterization, 81.15.Ef Vacuum deposition 1 Introduction Examples of nanopores are abundant in biological systems, usually in the form of transmembrane protein channels in lipid bilayer membranes.
    [Show full text]
  • Genomic Sequencing of SARS-Cov-2: a Guide to Implementation for Maximum Impact on Public Health
    Genomic sequencing of SARS-CoV-2 A guide to implementation for maximum impact on public health 8 January 2021 Genomic sequencing of SARS-CoV-2 A guide to implementation for maximum impact on public health 8 January 2021 Genomic sequencing of SARS-CoV-2: a guide to implementation for maximum impact on public health ISBN 978-92-4-001844-0 (electronic version) ISBN 978-92-4-001845-7 (print version) © World Health Organization 2021 Some rights reserved. This work is available under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 IGO licence (CC BY-NC-SA 3.0 IGO; https://creativecommons.org/licenses/by-nc-sa/3.0/igo). Under the terms of this licence, you may copy, redistribute and adapt the work for non-commercial purposes, provided the work is appropriately cited, as indicated below. In any use of this work, there should be no suggestion that WHO endorses any specific organization, products or services. The use of the WHO logo is not permitted. If you adapt the work, then you must license your work under the same or equivalent Creative Commons licence. If you create a translation of this work, you should add the following disclaimer along with the suggested citation: “This translation was not created by the World Health Organization (WHO). WHO is not responsible for the content or accuracy of this translation. The original English edition shall be the binding and authentic edition”. Any mediation relating to disputes arising under the licence shall be conducted in accordance with the mediation rules of the World Intellectual Property Organization (http://www.wipo.int/amc/en/mediation/rules/).
    [Show full text]
  • Using Long Nanopore Reads to Delineate Structural Variants (Svs)
    Using long nanopore reads to delineate structural variants (SVs) in the human genome SVs, including large deletions, duplications, inversions, translocations and copy-number changes are abundant in large genomes, and require long reads for precise characterisation Contact: [email protected] More information at: www.nanoporetech.com and publications.nanoporetech.com Unique Repeat Unique Repeat Unique a) b) a) b) sequence 1 1 sequence 2 2 sequence 3 1,000 60 Short reads Insertions Long A B C D E Reference chromosome 1 40 reads 800 Deletions > 50 bp Short-read assembly 20 Collapsed repeat consensus Unique contig 1 Long-read Bases sequenced (Mb) assembly 600 0 Unique contig 1 Unique contig 3 0 10 20 30 40 V W X Y Z Reference chromosome 2 Single, fully-resolved contig Count Read length (kb) c) > 50 bp 400 chr7 (q33) 7p21.3 15.321.1 15.3 7p14.3 7p14.1 13 11.2 11.21 11.22 11.23 7q21.11 q21.3 7q22.1 7q31.1 7q33 7q34 7q35 36.1 36.3 Scale 50 kb hg38 chr7: 134,550,000 134,600,000 134,650,000 134,700,000 Inversion A D C B E GENCODE v24 comprehensive transcript set (only Basic displayed by default) 200 AKR1B10 AKR1B15 BGPM CALD1 AKR1B15 BGPM Deletion BGPM A B C E AC009276.4 Duplication A B C C C D E 0 1,000 10,000 20,000 30,000 Translocation V W C D E + A B X Y Z Event size (bp) Adapted from Huddleston, J. et al. Discovery and genotyping of structural variation from long-read haploid genome sequence data.
    [Show full text]
  • Detection and Mapping of 5-Methylcytosine and 5-Hydroxymethylcytosine with Nanopore Mspa
    Detection and mapping of 5-methylcytosine and 5-hydroxymethylcytosine with nanopore MspA Andrew H. Laszloa, Ian M. Derringtona, Henry Brinkerhoffa, Kyle W. Langforda, Ian C. Novaa, Jenny Mae Samsona, Joshua J. Bartletta, Mikhail Pavlenokb, and Jens H. Gundlacha,1 aDepartment of Physics, University of Washington, Seattle, WA 98195-1560; and bDepartment of Microbiology, University of Alabama at Birmingham, Birmingham, AL 35294 Edited* by Daniel Branton, Harvard University, Cambridge, MA, and approved September 26, 2013 (received for review June 5, 2013) Precise and efficient mapping of epigenetic markers on DNA may remain unchanged. Conditions required to bring this conversion become an important clinical tool for prediction and identification close to 100% completion cause DNA damage by fragmentation of ailments. Methylated CpG sites are involved in gene expression (14). Conventional bisulfite sequencing cannot differentiate and are biomarkers for diseases such as cancer. Here, we use the between mC and hC (15). Oxidative bisulfite sequencing (16, 17) engineered biological protein pore Mycobacterium smegmatis porin can distinguish between mCandhC; however, this assay has sig- A (MspA) to detect and map 5-methylcytosine and 5-hydroxymethyl- nificant sample losses with only 0.5% of the original DNA frag- cytosine within single strands of DNA. In this unique single-molecule ments remaining intact (16). In methylation-specificenzyme tool, a phi29 DNA polymerase draws ssDNA through the pore in restriction, proteins recognize and cut DNA strands at mCs, and single-nucleotide steps, and the ion current through the pore is subsequent sequencing and alignment of the strands to the known recorded. Comparing current levels generated with DNA containing genomic sequence reveal the locations of the mCs (18, 19).
    [Show full text]
  • High-Fidelity Nanopore Sequencing of Ultra-Short DNA Sequences
    bioRxiv preprint doi: https://doi.org/10.1101/552224; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. Title: High-Fidelity Nanopore Sequencing of Ultra-Short DNA Sequences Authors: Brandon D. Wilson1, Michael Eisenstein2,3, H. Tom Soh2,3,4* Affiliations: 1Department of Chemical Engineering, Stanford University, Stanford, CA 94305, USA. 2Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA. 3Department of Radiology, Stanford University, Stanford, CA 94305, USA. 4Chan Zuckerberg Biohub, San Francisco, CA 94158, USA. * Correspondence to [email protected] One Sentence Summary: We introduce a simple method of accurately sequencing ultra-short (<100bp) target DNA on a nanopore sequencing platform. Abstract Nanopore sequencing offers a portable and affordable alternative to sequencing-by-synthesis methods but suffers from lower accuracy and cannot sequence ultra-short DNA. This puts applications such as molecular diagnostics based on the analysis of cell-free DNA or single- nucleotide variants (SNV) out of reach. To overcome these limitations, We report a nanopore-based sequencing strategy in Which short target sequences are first circularized and then amplified via rolling-circle amplification to produce long stretches of concatemeric repeats. These can be sequenced on the Oxford Nanopore Technology’s (ONT) MinION platform, and the resulting repeat sequences aligned to produce a highly-accurate consensus that reduces the high error-rate present in the individual repeats.
    [Show full text]
  • Nanopore Sequencing Is a Credible Alternative to Recover Complete Genomes of Geminiviruses
    microorganisms Article Nanopore Sequencing Is a Credible Alternative to Recover Complete Genomes of Geminiviruses Selim Ben Chehida 1 , Denis Filloux 2,3, Emmanuel Fernandez 2,3, Oumaima Moubset 2,3, Murielle Hoareau 1, Charlotte Julian 2,3, Laurence Blondin 2,3, Jean-Michel Lett 1, Philippe Roumagnac 2,3 and Pierre Lefeuvre 1,* 1 CIRAD, UMR PVBMT, F-97410 St Pierre, La Réunion, France; [email protected] (S.B.C.); [email protected] (M.H.); [email protected] (J.-M.L.) 2 CIRAD, PHIM, F-34398 Montpellier, France; [email protected] (D.F.); [email protected] (E.F.); [email protected] (O.M.); [email protected] (C.J.); [email protected] (L.B.); [email protected] (P.R.) 3 PHIM Plant Health Institute, University Montpellier, CIRAD, INRAE, Institut Agro, IRD, F-34398 Montpellier, France * Correspondence: [email protected] Abstract: Next-generation sequencing (NGS), through the implementation of metagenomic protocols, has led to the discovery of thousands of new viruses in the last decade. Nevertheless, these protocols are still laborious and costly to implement, and the technique has not yet become routine for everyday virus characterization. Within the context of CRESS DNA virus studies, we implemented two alternative long-read NGS protocols, one that is agnostic to the sequence (without a priori knowledge of the viral genome) and the other that use specific primers to target a virus (with a priori). Agnostic Citation: Ben Chehida, S.; Filloux, D.; and specific long read NGS-based assembled genomes of two capulavirus strains were compared to Fernandez, E.; Moubset, O.; Hoareau, those obtained using the gold standard technique of Sanger sequencing.
    [Show full text]
  • Nanopore Fabrication and Characterization by Helium Ion Microscopy
    Nanopore fabrication and characterization by helium ion microscopy D. Emmrich1, A. Beyer1, A. Nadzeyka2, S. Bauerdick2, J. C. Meyer3, J. Kotakoski3, A. Gölzhäuser1 1Physics of Supramolecular Systems and Surfaces, Bielefeld University, 33615 Bielefeld, Germany 2Raith GmbH, Konrad-Adenauer-Allee 8, 44263 Dortmund, Germany 3Faculty of Physics, University of Vienna, 1090 Vienna, Austria Abstract: The Helium Ion Microscope (HIM) has the capability to image small features with a resolution down to 0.35 nm due to its highly focused gas field ionization source and its small beam-sample interaction volume. In this work, the focused helium ion beam of a HIM is utilized to create nanopores with diameters down to 1.3 nm. It will be demonstrated that nanopores can be milled into silicon nitride, carbon nanomembranes (CNMs) and graphene with well-defined aspect ratio. To image and characterize the produced nanopores, helium ion microscopy and high resolution scanning transmission electron microscopy were used. The analysis of the nanopores’ growth behavior, allows inferring on the profile of the helium ion beam. Nanopores in atomically thin membranes can be used for biomolecule analysis,1 electrochemical storage,2 as well as for the separation of gases and liquids.3 All of these applications require a precise control of the size and shape of the nanopores. It was shown that the focused beam of a transmission electron microscope (TEM) is able to create nanopores in membranes of silicon nitride and graphene with diameters down to 2 nm.4,5 Pores can be further shrunk in a TEM by areal electron impact.6 However, the preparation of such nanopores in a TEM is time-consuming and is limited to small samples (~3 mm diameter) that fit into the microscope.
    [Show full text]
  • Managing Infectious Disease Outbreaks Through Rapid Pathogen Genome Sequencing
    BRIEFING PAPER Managing infectious disease outbreaks through rapid pathogen genome sequencing February 2021 oxfordnanoporetech.com 1 Introduction With over two million attributed deaths to Global health security depends on the rapid potential association of novel pathogen variants recognition and containment of infectious diseases (such as the COVID-19 B1.1.7 and B1.351 variants date and a projected economic cost of and no government can afford to be complacent originally identified in the UK and South Africa) with $28 trillion1, the COVID-19 pandemic has about the risks posed to population health, changes to disease severity, transmission, and economic, political, and social stability and diagnostic and therapeutic efficacy. refocused global attention on the acute, wellbeing. It is possible to be prepared to prevent ever-present threat of infectious disease. and control such threats by investing in intelligent This briefing paper describes when, where, and how and agile public health tools to monitor for potential genomic epidemiology can offer critical and timely risks, enabling responses at appropriate speed and insights for infectious disease experts, public health scale to problems as they appear. professionals, and policy-makers to stay a step ahead of infectious disease threats, responding with Executive summary Genomic epidemiology is a crucial weapon in the maximal effect. public health fight against infectious diseases, • The threat of infectious disease is ever present — providing rapid identification and complete This briefing
    [Show full text]
  • Direct RNA Nanopore Sequencing of Full-Length Coronavirus Genomes Provides Novel Insights Into Structural Variants and Enables Modification Analysis
    Downloaded from genome.cshlp.org on September 28, 2021 - Published by Cold Spring Harbor Laboratory Press Method Direct RNA nanopore sequencing of full-length coronavirus genomes provides novel insights into structural variants and enables modification analysis Adrian Viehweger,1,2,5 Sebastian Krautwurst,1,2,5 Kevin Lamkiewicz,1,2 Ramakanth Madhugiri,3 John Ziebuhr,2,3 Martin Hölzer,1,2 and Manja Marz1,2,4 1RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, 07743 Jena, Germany; 2European Virus Bioinformatics Center, Friedrich Schiller University Jena, 07743 Jena, Germany; 3Institute of Medical Virology, Justus Liebig University Gießen, 35390 Gießen, Germany; 4Leibniz Institute on Aging–Fritz Lipmann Institute, 07743 Jena, Germany Sequence analyses of RNA virus genomes remain challenging owing to the exceptional genetic plasticity of these viruses. Because of high mutation and recombination rates, genome replication by viral RNA-dependent RNA polymerases leads to populations of closely related viruses, so-called “quasispecies.” Standard (short-read) sequencing technologies are ill-suit- ed to reconstruct large numbers of full-length haplotypes of (1) RNA virus genomes and (2) subgenome-length (sg) RNAs composed of noncontiguous genome regions. Here, we used a full-length, direct RNA sequencing (DRS) approach based on nanopores to characterize viral RNAs produced in cells infected with a human coronavirus. By using DRS, we were able to map the longest (∼26-kb) contiguous read to the viral reference genome. By combining Illumina and Oxford Nanopore sequencing, we reconstructed a highly accurate consensus sequence of the human coronavirus (HCoV)-229E genome (27.3 kb). Furthermore, by using long reads that did not require an assembly step, we were able to identify, in infected cells, diverse and novel HCoV-229E sg RNAs that remain to be characterized.
    [Show full text]
  • De Novo Sequencing and Variant Calling with Nanopores Using Poreseq
    De novo Sequencing and Variant Calling with Nanopores using PoreSeq Tamas Szalay1 & Jene A. Golovchenko1,2* 1School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138 USA 2Department of Physics, Harvard University, Cambridge, Massachusetts 02138 USA *Corresponding author, email: [email protected] 1 1 The single-molecule accuracy of nanopore sequencing has been an area of rapid academic 2 and commercial advancement, but remains challenging for the de novo analysis of genomes. 3 We introduce here a novel algorithm for the error correction of nanopore data, utilizing 4 statistical models of the physical system in order to obtain high accuracy de novo sequences 5 at a range of coverage depths. We demonstrate the technique by sequencing M13 6 bacteriophage DNA to 99% accuracy at moderate coverage as well as its use in an assembly 7 pipeline by sequencing E. coli and ࣅ DNA at a range of coverages. We also show the 8 algorithm’s ability to accurately classify sequence variants at far lower coverage than 9 existing methods. 10 DNA sequencing has proven to be an indispensable technique in biology and medicine, 11 greatly accelerated by the technological developments that led to multiple generations of low 12 cost and high throughput tools1,2. Despite these advances, however, most existing sequencing-by- 13 synthesis techniques remain limited to short reads using expensive devices with complex sample 14 preparation procedures3. 15 Initially proposed two decades ago by Branton, Deamer, and Church4, nanopore 16 sequencing has recently emerged as a serious contender in the crowded field of DNA 17 sequencing.
    [Show full text]