Unravelling the Complex Structure of Mobile Genetic Elements Through Short- and Long-Read Sequencing

O0567

Natacha Couto*2, Monika Chlebowicz1, Ana Carolina Da Cruz Campos1, Erwin G. Raangs1, Alexander W. Friedrich1, John W. A. Rossen2

1University of Groningen, University Medical Center Groningen, Medical Microbiology, Groningen, Netherlands
2University of Groningen, University Medical Center Groningen, Department of Medical Microbiology, Groningen, The Netherlands

Background: Mobile genetic elements (MGEs) are DNA segments that move within genomes and carry antimicrobial resistance and virulence genes. One way to study MGEs is through sequencing. However, assembling short reads is difficult because MGEs contain many repetitive insertion sequences. With long-read sequencing, the full structure of a bacterial genome can be obtained, but often with higher sequencing error rates. Here we aim to combine MinION and Illumina data using hybrid assemblers to unravel the structure of MGEs.

Materials/methods: The DNeasy UltraClean Microbial Kit (Qiagen) was used to extract DNA from colonies grown overnight on blood agar. Libraries were prepared with the Nextera XT Library Preparation kit (Illumina) and sequenced on a MiSeq (500 cycles, paired-end). For long-read sequencing, samples were barcoded with the Native Barcoding Kit 1D (EXP-NBD103) and libraries were prepared using the Ligation Sequencing Kit 1D (SQK-LSK108). The library was loaded onto an FLO-MIN106 R9.4 flow cell and run on a MinION device (48 hours). Base calling was performed using Albacore v1.2.2. Data quality was analyzed with Poretools v0.6.0. Hybrid assemblies were performed using Unicycler v0.4.1. Bandage v0.8.1 was used to visualize the assembly graphs. Genes of interest were detected using ABRicate.

Results: One Escherichia coli and one Staphylococcus haemolyticus strain were sequenced. For each strain, the hybrid assembly produced a complete circular chromosome and several circular plasmids (7 plasmids in E. coli and 2 plasmids in S. haemolyticus). An antimicrobial resistance gene cluster (blaTEM-1B, sul2, strA, strB, tetA) was identified in one IncFII plasmid in E. coli, which also contained a virulence gene cluster. Several resistance genes (including mecA) were found in the S. haemolyticus chromosome. The SCCmec structure containing mecA had 99% homology with that of S. haemolyticus SH32, but the region flanking mecA was different.

Conclusions: Illumina sequencing alone is not enough to resolve the complex structure of MGEs, and Nanopore sequencing alone contains too many sequencing errors to be reliable. The combination of both methods provides the best solution for understanding the intricate sequences surrounding antimicrobial resistance and virulence genes, which is highly desirable for studying the dissemination of these important elements.
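
The hybrid-assembly and gene-detection steps named in Materials/methods could be scripted roughly as follows. This is a minimal sketch, not the authors' actual pipeline: the file names, the output directory, the contig names and the choice of the `resfinder` database are all assumptions made for illustration, and the toy table at the end is invented input, not data from the study. The helpers only build the command lines (nothing is executed) and then group ABRicate-style tab-separated hits by contig, which is one way to see which resistance genes sit together on a plasmid.

```python
import csv
import io
from collections import defaultdict


def unicycler_command(short_r1, short_r2, long_reads, out_dir):
    """Build a Unicycler hybrid-assembly command line (not executed here).

    -1/-2 take the paired Illumina reads, -l the long (MinION) reads,
    -o the output directory. All paths are hypothetical.
    """
    return ["unicycler", "-1", short_r1, "-2", short_r2,
            "-l", long_reads, "-o", out_dir]


def abricate_command(assembly_fasta, database="resfinder"):
    """Build an ABRicate command line; the database choice is an assumption."""
    return ["abricate", "--db", database, assembly_fasta]


def genes_by_contig(abricate_tsv_text):
    """Group GENE values by SEQUENCE (contig) from ABRicate-style TSV text."""
    reader = csv.DictReader(io.StringIO(abricate_tsv_text), delimiter="\t")
    grouped = defaultdict(list)
    for row in reader:
        grouped[row["SEQUENCE"]].append(row["GENE"])
    return dict(grouped)


if __name__ == "__main__":
    print(" ".join(unicycler_command("illumina_R1.fastq.gz",
                                     "illumina_R2.fastq.gz",
                                     "minion.fastq.gz",
                                     "hybrid_assembly")))
    print(" ".join(abricate_command("hybrid_assembly/assembly.fasta")))

    # Toy input with only the columns used here; a real ABRicate table has more.
    toy_hits = (
        "#FILE\tSEQUENCE\tGENE\n"
        "assembly.fasta\tcontig_2\tblaTEM-1B\n"
        "assembly.fasta\tcontig_2\tsul2\n"
        "assembly.fasta\tcontig_5\tgeneX\n"
    )
    print(genes_by_contig(toy_hits))
```

Building the commands as argument lists keeps the sketch runnable without the tools installed; in practice each list could be passed to `subprocess.run` once Unicycler and ABRicate are available.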