Using Long Nanopore Reads to Delineate Structural Variants (SVs)


Using long nanopore reads to delineate structural variants (SVs) in the human genome

SVs, including large deletions, duplications, inversions, translocations and copy-number changes, are abundant in large genomes and require long reads for precise characterisation.

Contact: [email protected]
More information at: www.nanoporetech.com and publications.nanoporetech.com

Fig. 1 Structural variation: a) classes (inversion, deletion, duplication, translocation); b) variant size and frequency in the human genome (count of insertions and deletions > 50 bp against event size in bp). Adapted from Huddleston, J. et al. Discovery and genotyping of structural variation from long-read haploid genome sequence data. Genome Research 27 (5) 677-685 (2016).

Fig. 2 Read length: a) typical distribution (bases sequenced in Mb against read length in kb); b) assembly (a short-read assembly collapses the repeat into a consensus between unique contigs, whereas a long-read assembly gives a single, fully-resolved contig); c) mapped long human MinION reads (chr7 q33, hg38).

Structural variation: large inversions, deletions, duplications and translocations

Structural variation (SV) refers to inversions, insertions, deletions and translocations > 50 bp in length (Fig. 1a). SV encompasses millions of bases of DNA per human genome, can span tens of kilobases containing entire genes and their regulatory regions (Fig. 1b) and contributes substantially to genome variation. SV can alter the copy number of dosage-sensitive genes, can unmask recessive alleles and can disrupt the integrity or regulation of a gene, all of which can cause genetic disease. The study of SVs is challenging because they frequently arise in repetitive regions of the genome, and can have highly complex structures. Short-read sequencing technologies cannot span long SVs, leading to incomplete reference assemblies.

Nanopore sequencing can give extremely long reads without size selection

The read length that can be obtained from nanopore sequencing is limited only by the integrity of the DNA extracted from the sample and the care taken during library preparation. The read-length distribution corresponds closely to the fragment-length distribution of the sample DNA. When starting with high-molecular weight genomic DNA, it is straightforward to obtain reads that are tens of kilobases in length (Fig. 2a). The longer the sequence read, the longer the repetitive region or SV that can be resolved, allowing the correct structure of the variant to be elucidated (Fig. 2b). Recent increases in throughput make it realistic to sequence whole human genomes on a MinION (Fig. 2c).
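
The link between read length and the size of repeat or SV that can be resolved can be illustrated with a small sketch. It is not part of the analysis reported here: the function name `spans_region`, the 500 bp anchor requirement and the example coordinates are all assumptions chosen for illustration. A read is counted as resolving a region only if it covers the whole region plus some unique flanking sequence on both sides.

```python
def spans_region(read_start, read_end, region_start, region_end, min_anchor=500):
    """Return True if an aligned read covers the region plus min_anchor bp
    of flanking sequence on each side (0-based, half-open coordinates)."""
    return (read_start <= region_start - min_anchor
            and read_end >= region_end + min_anchor)


# Hypothetical 12 kb repeat on a reference contig.
repeat = (100_000, 112_000)

# Hypothetical alignments given as (read_start, read_end).
short_read = (105_000, 105_150)   # 150 bp read lying inside the repeat
long_read = (95_000, 125_000)     # 30 kb read anchored in unique sequence

print(spans_region(*short_read, *repeat))  # False
print(spans_region(*long_read, *repeat))   # True
```

Under these assumptions only the long read places the repeat unambiguously between its two unique flanks, which is the situation sketched in Fig. 2b.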
Fig. 3 Confirmation of LEO1 breakpoints and parental origin with nanopore reads (15q21.2, hg19; wild-type and deletion amplicons from three families). Adapted from Brandler, W. et al. Paternally inherited cis-regulatory structural variants are associated with autism. Science 360 (6386) 327-331 (2018).

Fig. 4 Detection of SVs by whole-genome sequencing: a) mapped reads at three breakpoints on chromosome 10 (hg19); b) SV resolution (complex rearrangement or cut-and-paste transposition). Adapted from Brandler, W. et al. Paternally inherited cis-regulatory structural variants are associated with autism. Science 360 (6386) 327-331 (2018).

Deletion of a regulatory element in autistic patients validated by long nanopore reads

To demonstrate the utility of long nanopore reads in resolving structural variants, we amplified and barcoded patient and wild-type alleles from three families with known deletions in the LEO1 locus on chromosome 15, and sequenced them on a flowcell. Deletion amplicons were approximately 10 kb in length, and the amplifiable wild-type amplicons spanned up to 20 kb. LEO1 encodes an RNA polymerase-associated protein which is expressed during foetal brain development. For the deletions, we created consensus reference haplotypes using Nanopolish and realigned reads to these references for SNP-calling with MUMmer. All three deletions, as well as the parental origin, were successfully validated by the nanopore reads (Fig. 3).

Using long-read whole-genome sequencing to resolve SVs in the human genome

One individual who participated in the autism spectrum disorder study described in Fig. 3 had been diagnosed with depression/anxiety. She appeared to have an SV in chromosome 10 which had been identified as a complex break-end by Lumpy analysis of paired-end Illumina data. The SV was not found in the individual’s parents, so was taken to be de novo, but the precise structure was unclear. We performed whole-genome library prep using an LSK-108 kit, and sequenced the library on a FLO-MIN106 flowcell, generating approximately 24 Gb of sequence data. The long reads allowed us to fully resolve the variant, and nanopore data was phased using WhatsHap, revealing the individual’s mother to be the parent of origin of the SV.

P17009 - Version 5.0 © 2018 Oxford Nanopore Technologies. All rights reserved.
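
As a rough check on what the approximately 24 Gb reported for the whole-genome run means for a human sample, mean depth of coverage can be estimated as total bases divided by genome size. The genome size of 3.1 Gb used below is an assumed round figure for the human genome, not a value taken from the text above, and `mean_coverage` is a hypothetical helper written for this sketch.

```python
def mean_coverage(total_bases, genome_size):
    """Mean depth of coverage = total sequenced bases / genome size."""
    return total_bases / genome_size


yield_bases = 24e9    # approximately 24 Gb, as reported for the run
genome_size = 3.1e9   # assumption: approximate size of the human genome

print(f"{mean_coverage(yield_bases, genome_size):.1f}x")  # 7.7x
```

Under these assumptions the run corresponds to roughly 8-fold mean coverage of the genome.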