Rapid Phylogenetic and Functional Classification of Short Genomic

Total Page:16

File Type:pdf, Size:1020Kb

Rapid Phylogenetic and Functional Classification of Short Genomic Berendzen et al. BMC Research Notes 2012, 5:460 http://www.biomedcentral.com/1756-0500/5/460 RESEARCH ARTICLE Open Access Rapid phylogenetic and functional classification of short genomic fragments with signature peptides Joel Berendzen1, William J Bruno2, Judith D Cohn3, Nicolas W Hengartner3, Cheryl R Kuske4, Benjamin H McMahon2*, Murray A Wolinsky4 and Gary Xie4 Abstract Background: Classification is difficult for shotgun metagenomics data from environments such as soils, where the diversity of sequences is high and where reference sequences from close relatives may not exist. Approaches based on sequence-similarity scores must deal with the confounding effects that inheritance and functional pressures exert on the relation between scores and phylogenetic distance, while approaches based on sequence alignment and tree-building are typically limited to a small fraction of gene families. We describe an approach based on finding one or more exact matches between a read and a precomputed set of peptide 10-mers. Results: At even the largest phylogenetic distances, thousands of 10-mer peptide exact matches can be found between pairs of bacterial genomes. Genes that share one or more peptide 10-mers typically have high reciprocal BLAST scores. Among a set of 403 representative bacterial genomes, some 20 million 10-mer peptides were found to be shared. We assign each of these peptides as a signature of a particular node in a phylogenetic reference tree based on the RNA polymerase genes. We classify the phylogeny of a genomic fragment (e.g., read) at the most specific node on the reference tree that is consistent with the phylogeny of observed signature peptides it contains. Using both synthetic data from four newly-sequenced soil-bacterium genomes and ten real soil metagenomics data sets, we demonstrate a sensitivity and specificity comparable to that of the MEGAN metagenomics analysis package using BLASTX against the NR database. Phylogenetic and functional similarity metrics applied to real metagenomics data indicates a signal-to-noise ratio of approximately 400 for distinguishing among environments. Our method assigns ~6.6 Gbp/hr on a single CPU, compared with 25 kbp/hr for methods based on BLASTX against the NR database. Conclusions: Classification by exact matching against a precomputed list of signature peptides provides comparable results to existing techniques for reads longer than about 300 bp and does not degrade severely with shorter reads. Orders of magnitude faster than existing methods, the approach is suitable now for inclusion in analysis pipelines and appears to be extensible in several different directions. Background of sequences is high [1,2] and where sequences from As of this writing, DNA sequencers routinely produce close relatives are not to be found in reference databases more than 2 Gbp of data per hour, with the high-quality (e.g. [3]). Insight into microbial communities and their region of reads as short as 75 bp. Analytical methods dynamics would be desirable for a number of import- that can keep up with this flow rate are urgently needed. ant applications in medicine, agriculture, ecology, and Analysis is especially difficult for shotgun metagenomics industry [4]. data from environments such as soil where the diversity The first step in most sequence analyses is finding a suitable answer to the question, “How close is this se- * Correspondence: [email protected] quence to something seen before?”. The notion of close- 2Theoretical Division, MS K710, Los Alamos National Laboratory, Los Alamos, ness implied in the question is a phylogenetic distance, NM 87545, USA Full list of author information is available at the end of the article which is most properly answered by a phylogenetic © 2012 Berendzen et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Berendzen et al. BMC Research Notes 2012, 5:460 Page 2 of 21 http://www.biomedcentral.com/1756-0500/5/460 algorithm. Unfortunately the computational expense of homologous regions of genes for the purposes of con- such algorithms, coupled with the intractability of mak- structing a multiple sequence alignment [23,24]. In this ing the relevant alignments and trees for genes that may work, we are considering higher values of k, in order to have large numbers of paralogs, make this approach in- identify homologous genes by comparing entire bacterial feasible at present except for a small fraction of gene genomes. We begin by comparing the genomes of two families. The most common alternative is to find a proxy divergent bacteria and observing that random matches for phylogenetic distance in a more-readily-computed dominate for k < 8, while for k = 10, an average of only sequence similarity score as produced by the program about one random match is expected between the gen- BLAST [5] and its relatives. Yet the relationship between omes. Such 10-mer matches are evidently long enough sequence similarity and phylogenetic distance is skewed to specifically discriminate a portion of a conserved gene by rates of acceptance of mutations that can range over from other genes or organisms, while sufficiently short many orders of magnitude over a length scale of tens of as to be present in both reference genomes and soil bases due to differences in functional constraints experi- metagenomics data sets. enced by different parts of the gene [6]. Proteins from families of broadly-conserved genes and those parts of Results enzymes near an active site have significantly higher se- We begin by justifying our choice of k = 10 as the match quence identity than average [7]. The nature of current length long enough to be specific, yet short enough to shotgun metagenomics data, with short reads from be prevalent in environmental samples. Building on this randomly-selected regions of genes, tends to accentuate observation, we identify all 20 million strings of amino the problem of transforming similarity scores to some- acids of length 10 which are shared by at least two refer- thing resembling phylogenetic distances through ence genomes from distinct genera of bacteria. We de- injecting a noise term that can be difficult to remove by note these as orthogenomic signature peptides, and they post-processing (e.g., [8-10]). This problem exists even serve as the foundation for the rest of our analysis. We for close matches, but is exacerbated as similarity then develop one algorithm to establish the correct declines since the underlying sequence alignment may phylogenetic placement of these signature peptides, an- also be called into question. other to classify metagenomic reads matched by signa- A variety of methods to classify shotgun metagenomic ture peptides, and a final algorithm to functionally reads have been proposed, primarily based on protein profile reads through use of an externally defined data- families or gene clusters. These include partial assembly base. The use of fixed-length strings allows us to exploit and hidden-Markov-model searches [11] of protein fam- standard index-based information retrieval techniques ilies [12,13]; finding the closest neighbors in either nu- developed for web search engines. cleotide or protein space using a variety of similarity scores [8,14]; and finding shared sub-strings of variable Choice of k =10 length via suffix trees [15]. Other alternatives to similar- Figure 1 shows the run-length distribution of shared ity scores include short-seed [16] and sub-HMM [17] amino acid k-mers between Escherichia coli and various methods. Phylogenetic analysis is typically the next step sets of bacterial genomes, for k in the range of 3 to 500. after classification, using Least Common Ancestor [8], It is dominated by random matches for k < 8, and domi- nearest-neighbor [14,15], or hierarchical scoring [9] to nated by gene matches for k > 8. The solid red line is an assign phylogeny to the sequences identified in the clas- exponential fit to the run-length distribution of amino sification step. Because of the numerous pitfalls in acid matches between E. coli and Bacillus subtilis, designing a computer algorithm to define functionally reflecting a 16-fold reduction in the number of matches meaningful protein families [18], many classification for each increase by one in the run-length, k. The num- pipelines require continual curation of protein families, ber of random matches quickly drops with increasing k, which involves multiple-sequence alignment and the reaching 1.8 per pair of bacterial genomes for k=10, computation of a phylogenetic tree for each family, in with a ratio of observed matches of length 10 to the the hope of identifying orthologous genes [12,19]. Such expected number of random matches of 250. Since the efforts are labor-intensive and limited by the paucity of rest of our analysis will treat matches longer than 10 as biochemical validation of gene function. Another solu- multiple overlapping 10-mers, and the histogram in tion is to restrict analysis to a small number of well- Figure 1 counts matches only once, at the full extent of behaved ‘housekeeping’ genes [20-22]. However, using their match length, the appropriate ratio of non-random this approach results in discarding the vast majority of to random matches is not 250, but 1000. sequence reads. In addition to the small-k behavior of random Exact amino acid k-mer matches with k in the range matches, Figure 1 shows the large-k behavior for E. coli 3–6 have been employed to speed identification of scanned against three sets of bacteria. As the Berendzen et al. BMC Research Notes 2012, 5:460 Page 3 of 21 http://www.biomedcentral.com/1756-0500/5/460 A vertical slice through Figure 2, then, will approxi- mate the presence / absence phylogenetic profile of each gene across enteric bacteria (bottom portion) and repre- sentatives of the bacterial kingdom (top portion).
Recommended publications
  • Developing a Genetic Manipulation System for the Antarctic Archaeon, Halorubrum Lacusprofundi: Investigating Acetamidase Gene Function
    www.nature.com/scientificreports OPEN Developing a genetic manipulation system for the Antarctic archaeon, Halorubrum lacusprofundi: Received: 27 May 2016 Accepted: 16 September 2016 investigating acetamidase gene Published: 06 October 2016 function Y. Liao1, T. J. Williams1, J. C. Walsh2,3, M. Ji1, A. Poljak4, P. M. G. Curmi2, I. G. Duggin3 & R. Cavicchioli1 No systems have been reported for genetic manipulation of cold-adapted Archaea. Halorubrum lacusprofundi is an important member of Deep Lake, Antarctica (~10% of the population), and is amendable to laboratory cultivation. Here we report the development of a shuttle-vector and targeted gene-knockout system for this species. To investigate the function of acetamidase/formamidase genes, a class of genes not experimentally studied in Archaea, the acetamidase gene, amd3, was disrupted. The wild-type grew on acetamide as a sole source of carbon and nitrogen, but the mutant did not. Acetamidase/formamidase genes were found to form three distinct clades within a broad distribution of Archaea and Bacteria. Genes were present within lineages characterized by aerobic growth in low nutrient environments (e.g. haloarchaea, Starkeya) but absent from lineages containing anaerobes or facultative anaerobes (e.g. methanogens, Epsilonproteobacteria) or parasites of animals and plants (e.g. Chlamydiae). While acetamide is not a well characterized natural substrate, the build-up of plastic pollutants in the environment provides a potential source of introduced acetamide. In view of the extent and pattern of distribution of acetamidase/formamidase sequences within Archaea and Bacteria, we speculate that acetamide from plastics may promote the selection of amd/fmd genes in an increasing number of environmental microorganisms.
    [Show full text]
  • Articles, Onomic Units (Otus) of Which Half Were Related to Bacteria Their Numbers Are Highly Variable in Both Time and Space
    Biogeosciences, 4, 115–124, 2007 www.biogeosciences.net/4/115/2007/ Biogeosciences © Author(s) 2007. This work is licensed under a Creative Commons License. Fluorescent pseudomonads isolated from Hebridean cloud and rain water produce biosurfactants but do not cause ice nucleation H. E. Ahern1, K. A. Walsh2, T. C. J. Hill2, and B. F. Moffett2 1University of East London, Romford Road, Stratford, London, UK 2Environment Agency, Wallingford, UK Received: 1 September 2006 – Published in Biogeosciences Discuss.: 4 October 2006 Revised: 11 January 2007 – Accepted: 9 February 2007 – Published: 12 February 2007 Abstract. Microorganisms were discovered in clouds over 1 Introduction 100 years ago but information on bacterial community struc- ture and function is limited. Clouds may not only be a niche There has been a resurgence of interest in microorganisms within which bacteria could thrive but they might also in- in the atmosphere due, in part, to heightened awareness of fluence dynamic processes using ice nucleating and cloud disease epidemiology. Health issues however may be less condensing abilities. Cloud and rain samples were collected important than their role in cloud and rainfall processes and from two mountains in the Outer Hebrides, NW Scotland, link to climate change. Recently it has been reported that UK. Community composition was determined using a com- there are between 1500 and 355 000 bacteria per millilitre of bination of amplified 16S ribosomal DNA restriction analy- cloud water (Sattler et al., 2001, Bauer et al., 2002 and Am- sis and sequencing. 256 clones yielded 100 operational tax- ato et al., 2005). Therefore, as with other aerosol particles, onomic units (OTUs) of which half were related to bacteria their numbers are highly variable in both time and space.
    [Show full text]
  • Doctoral Dissertation Template
    UNIVERSITY OF OKLAHOMA GRADUATE COLLEGE ISOLATION AND CHARACTERIZATION OF NITRATE REDUCING/DENITRIFYING BACTERIA FROM TALLGRASS PRAIRIE SOILS A DISSERTATION SUBMITTED TO THE GRADUATE FACULTY in partial fulfillment of the requirements for the Degree of DOCTOR OF PHILOSOPHY By SAMER MUHAMMAD SALIM ABUBAKR Norman, Oklahoma 2012 ISOLATION AND CHARACTERIZATION OF NITRATE REDUCING/DENITRIFYING BACTERIA FROM TALLGRASS PRAIRIE SOILS A DISSERTATION APPROVED FOR THE DEPARTMENT OF MICROBIOLOGY AND PLANT BIOLOGY BY ______________________________ Dr. Joseph Suflita, Chair ______________________________ Dr. Kathleen Duncan ______________________________ Dr. Lee Krumholz ______________________________ Dr. Michael McInerney ______________________________ Dr. Mark Nanny © Copyright by SAMER MUHAMMAD SALIM ABUBAKR 2012 All Rights Reserved. Acknowledgements There are several people I would like to thank for making the completion of my degree possible. I would like to thank my graduate advisor, Dr. Kathleen Duncan for her constant guidance and direction to keep my research on track which helped me to develop as a scientist. I am thankful to my other advisory committee members, Dr. Joseph Suflita, Dr. Lee Krumholz, Dr. Michael McInerney, and Dr. Mark Nanny for their constant help and support. Also, I would like to thank Dr. Victoria Parisi and Dr. Housna Mouttaki for teaching me how to use DIONEX and HPLC, Dr. Athenia Oldham for teaching me the basics of Mothur software, and to my friends Dr. Mostafa Elshahed and Dr. Fares Najar for their help and support throughout my Ph.D. program. I would like to thank the department of Microbiology and Plant Biology for giving me the opportunity to teach Introduction to Microbiology Laboratory and Fundamentals of Microbiology Laboratory, a teaching experience that I enjoyed a lot.
    [Show full text]
  • Seleksi Dan Identifikasi Secara Molekuler Bakteri Pendegradasi Insektisida Piretroid Dari Tanah
    SELEKSI DAN IDENTIFIKASI SECARA MOLEKULER BAKTERI PENDEGRADASI INSEKTISIDA PIRETROID DARI TANAH Salindri Prawitasari1, Siti Nur Jannah1*, dan Alina Akhdiya2 1Departemen Biologi, Fakultas Sains dan Matematika, Universitas Diponegoro Jl. Prof. Soedarto, SH, Tembalang, Semarang 2Balai Besar Penelitian dan Pengembangan Bioteknologi dan Sumberdaya Genetik Pertanian (BB Biogen) Bogor, Jawa Barat *Penulis korespondensi: [email protected] Abstrak Akumulasi residu insektisida pada lahan pertanian berdampak negatif bagi lingkungan dan organisme di sekitarnya. Salah satu teknologi alternatif untuk merehabilitasi lahan pertanian yang tercemar adalah dengan teknologi bioremediasi. Bioremediasi adalah teknologi untuk memecah atau menguraikan zat pencemar menjadi bahan yang kurang beracun atau tidak beracun (karbondioksida dan air) dengan memanfaatkan organisme atau produk organisme. Penelitian ini bertujuan untuk menyeleksi dan mengidentifikasi secara molekuler bakteri pendegradasi insektisida piretroid sintetik asal sampel tanah Pangalengan. Seleksi terhadap sembilan isolat bakteri tanah asal Pangalengan menggunakan medium NMS cair yang mengandung 100 ppm piretroid menghasilkan dua isolat bakteri yang memiliki kemampuan terbaik dalam mendegradasi piretroid. Isolat S-9 merupakan isolat bakteri yang memiliki kemampuan mendegradasi residu piretroid paling tinggi dibandingkan dengan 8 isolat lainnya, yaitu sebesar 87,38%. Hasil degradasi insektisida tersebut dimanfaatkan oleh bakteri sebagai sumber C, N, dan P untuk mendukung pertumbuhannya. Berdasarkan
    [Show full text]
  • A Report of 42 Unrecorded Bacterial Species Belonging to the Alphaproteobacteria in Korea
    Journal206 of Species Research 5(2):206-219, 2016JOURNAL OF SPECIES RESEARCH Vol. 5, No. 2 A report of 42 unrecorded bacterial species belonging to the Alphaproteobacteria in Korea Hyun Mi Jin1,2, Jung-Hoon Yoon3, Seung-Bum Kim4, Kwang-Yeop Jahng5, Jang-Cheon Cho6, Ki-seong Joh7, Chang-Jun Cha8, Chi-Nam Seong9, Jin-Woo Bae10, Wan-Taek Im11 and Che-Ok Jeon1,* 1Department of Life Science, Chung-Ang University, Seoul 06974, Korea 2Freshwater Bioresources Utilization Division, Nakdonggang National Institute of Biological Resources, Gyeongsangbuk-do 37242, Korea 3Department of Food Science and Biotechnology, Sungkyunkwan University, Suwon 16419, Korea 4Department of Microbiology, Chungnam National University, Daejeon 34134, Korea 5Department of Life Sciences, Chonbuk National University, Jeonju-si 54899, Korea 6Department of Biological Sciences, Inha University, Incheon 22212, Korea 7Department of Bioscience and Biotechnology, Hankuk University of Foreign Studies, Geonggi 02450, Korea 8Department of Systems Biotechnology, Chung-Ang University, Anseong 17546, Korea 9Department of Biology, Sunchon National University, Suncheon 57922, Korea 10Department of Biology, Kyung Hee University, Seoul 02453, Korea 11Department of Biotechnology, Hankyong National University, Anseong 17546, Korea *Correspondent: [email protected] As a subset study to discover indigenous prokaryotic species in Korea, a total of 42 bacterial strains assigned to the class Alphaproteobacteria were isolated from diverse environmental habitats including plant roots, ginseng soil,
    [Show full text]
  • University of Thessaly Isolation and Study of Bacteria Able to Degrade
    University of Thessaly Department of Biochemistry and Biotechnology Isolation and study of bacteria able to degrade pesticides contained in the wastewaters produced by the fruit-packaging industry A thesis submitted by Chiara Perruchon For the degree of Doctor of Philosophy 2014 Institutional Repository - Library & Information Centre - University of Thessaly 09/12/2017 02:48:33 EET - 137.108.70.7 Members of the Examination Committee Assistant Prof. Dimitrios Karpouzas (Environmental Microbiology and Biotechnology), University of Thessaly, Department of Biochemistry and Biotechnology, Supervisor and Member of the Advisory Committee Associate Prof. Konstantinos Ehaliotis (Soil Fertility and Biology), Agricultural University of Athens, Department of Natural Resources Management and Agricultural Engineering, Member of the Advisory Committee Associate Prof. Kalliope Papadopoulou (Plant Biotechnology), University of Thessaly, Department of Biochemistry and Biotechnology, Member of the Advisory Committee Associate Prof. Konstantinos Kormas (Ecology of Aquatic Microorganisms), University of Thessaly, School of Agriculture Prof. Urania Menkissoglou-Spiroudi (Pesticide-Chemical Synthesis and Activity), Aristotle University of Thessaloniki, School of Agriculture Assistant Prof. Dimitrios Mossialos (Microbial Biotechnology), University of Thessaly, Department of Biochemistry and Biotechnology Assistant Prof. Spyridon Ntougias (Environmental Microbiology), Democritus University of Thrace, Department of Environmental Engineering Institutional Repository - Library & Information Centre - University of Thessaly 09/12/2017 02:48:33 EET - 137.108.70.7 Abstract Fruits are particularly vulnerable to fungal infestations or other physiological disorders during storage thus diminishing their market value. Post-harvest fruit treatments with fungicides like thiabendazole (TBZ), imazalil (IMZ), and ortho-phenyl-phenol (OPP), and antioxidants like diphenylamine (DPA) are the most effective means to minimize fruit spoilage in storage.
    [Show full text]
  • Potentially Pathogenic Bacteria in Shower Water and Air of a Stem Cell Transplant Unit Sarah D
    Washington University School of Medicine Digital Commons@Becker Open Access Publications 2009 Potentially pathogenic bacteria in shower water and air of a stem cell transplant unit Sarah D. Perkins Washington University in St Louis Jennie Mayfield Barnes-Jewish Hospital Victoria Fraser Washington University School of Medicine in St. Louis Largus T. Angenent Cornell University Follow this and additional works at: https://digitalcommons.wustl.edu/open_access_pubs Recommended Citation Perkins, Sarah D.; Mayfield, Jennie; Fraser, Victoria; and Angenent, Largus T., ,"Potentially pathogenic bacteria in shower water and air of a stem cell transplant unit." Applied and Environmental Microbiology.75,16. 5363-5372. (2009). https://digitalcommons.wustl.edu/open_access_pubs/1999 This Open Access Publication is brought to you for free and open access by Digital Commons@Becker. It has been accepted for inclusion in Open Access Publications by an authorized administrator of Digital Commons@Becker. For more information, please contact [email protected]. Potentially Pathogenic Bacteria in Shower Water and Air of a Stem Cell Transplant Unit Sarah D. Perkins, Jennie Mayfield, Victoria Fraser and Largus T. Angenent Downloaded from Appl. Environ. Microbiol. 2009, 75(16):5363. DOI: 10.1128/AEM.00658-09. Published Ahead of Print 6 July 2009. Updated information and services can be found at: http://aem.asm.org/ http://aem.asm.org/content/75/16/5363 These include: REFERENCES This article cites 70 articles, 25 of which can be accessed free at: http://aem.asm.org/content/75/16/5363#ref-list-1 on December 30, 2013 by Washington University in St. Louis CONTENT ALERTS Receive: RSS Feeds, eTOCs, free email alerts (when new articles cite this article), more» Information about commercial reprint orders: http://journals.asm.org/site/misc/reprints.xhtml To subscribe to to another ASM Journal go to: http://journals.asm.org/site/subscriptions/ APPLIED AND ENVIRONMENTAL MICROBIOLOGY, Aug.
    [Show full text]
  • Draft Genome Sequence of Bosea Sp. WAO an Arsenite and Sulfide Oxidizer Isolated from a Pyrite Rock Outcrop in New Jersey Alexandra B
    Walczak et al. Standards in Genomic Sciences (2018) 13:6 https://doi.org/10.1186/s40793-018-0312-4 EXTENDED GENOME REPORT Open Access Draft genome sequence of Bosea sp. WAO an arsenite and sulfide oxidizer isolated from a pyrite rock outcrop in New Jersey Alexandra B. Walczak1*, Nathan Yee2 and Lily Y. Young2 Abstract This genome report describes the draft genome and physiological characteristics of Bosea sp. WAO (=DSM 102914), a novel strain of the genus Bosea in the family Bradyrhizobiaceae. Bosea sp. WAO was isolated from pulverized pyritic shale containing elevated levels of arsenic. This aerobic, gram negative microorganism is capable of facultative chemolithoautotrophic growth under aerobic conditions by oxidizing the electron donors arsenite, elemental sulfur, thiosulfate, polysulfide, and amorphous sulfur. The draft genome is of a single circular chromosome 6,125,776 bp long consisting of 21 scaffolds with a G + C content of 66.84%. A total 5727 genes were predicted of which 5665 or 98.92% are protein-coding genes and 62 RNA genes. We identified the genes aioA and aioB, which encode the large and small subunits of the arsenic oxidase respectively. We also identified the genes for the complete sulfur oxidation pathway sox which is used to oxidize thiosulfate to sulfate. Keywords: Neutrophilic sulfur oxidizer, Sox, Arsenite oxidase gene, Aio,Geomicrobiology,Microbe-mineral interactions, carbon fixation RuBisCO Introduction the Bosea genus is most closely related to the genus Bosea sp. WAO (white arsenic oxidizer) was enriched Salinarimonas which currently consists of two from a pulverized sample of weathered black shale species, Salinarimonas rosea and Salinarmonas ramus obtained from an outcropping near Trenton, NJ that [2].
    [Show full text]
  • Evaluating Efficacy of Plant Growth Promoting Rhizobacteria for Promoting Growth and Preventing Disease in Both Fish and Plants
    Evaluating Efficacy of Plant Growth Promoting Rhizobacteria for Promoting Growth and Preventing Disease in both Fish and Plants by Malachi Astor Williams A dissertation submitted to the Graduate Faculty of Auburn University in partial fulfillment of the requirements for the Degree of Doctor of Philosophy Auburn, Alabama May 2, 2020 Keywords: PGPR, disease control, Probiotic, Bacillus spp., Nile Tilapia, Channel Catfish Copyright 2020 by Malachi Astor Williams Approved by Mark R. Liles, Chair, Professor of Biological Sciences Joseph W. Kloepper, Professor of Entomology and Plant Pathology Jeffery S. Terhune, Associate Professor of Fisheries, Aquaculture and Aquatic Sciences Scott R. Santos, Professor of Biological Sciences Abstract Plant growth promoting rhizobacteria (PGPR), are bacteria residing within the rhizosphere of a plant, that elicit health benefits to the plant (Kloepper and Schroth, 1978; Kloepper et al., 2004). To understand the growth promoting and disease-inhibiting activities of PGPR strains, the genomes of 12 different PGPR strains affiliated with the B. subtilis group were sequenced. These B. subtilis strains exhibited high genomic diversity, whereas the genomes of Bacillus amyloliquefaciens strains (a member of the B. subtilis group) and B. velezensis strains (formerly B. amyloliquefaciens subsp. plantarum, now a part of the B. amyloliquefaciens clade (Fan et. al., 2017)) are highly conserved. A total of 2,839 genes were consistently present within the core genome of B. velezensis. Comparative genomic analyses of B. amyloliquefaciens and B. velezensis strains identified conserved genes that have been linked with biological control and colonization of roots or leaves. There were 73 genes uniquely associated with B. velezensis strains with predicted functions related to signaling, transportation, secondary metabolite production, and carbon source utilization.
    [Show full text]
  • Entwicklung Eines Datenbank-Gestützten Computerprogramms Zur Taxonomischen Identifizierung Von Mikrobiellen Populationen Auf Molekularbiologischer Basis
    Entwicklung eines Datenbank-gestützten Computerprogramms zur taxonomischen Identifizierung von mikrobiellen Populationen auf molekularbiologischer Basis Anwendung dieses Programms auf die Charakterisierung der Diversität stickstofffixierender und denitrifzierender Mikroorganismen in Abhängigkeit einer Stickstoffdüngung Inaugural-Dissertation zur Erlangung des Doktorgrades der Mathematisch-Naturwissenschaftlichen Fakultät der Universität zu Köln vorgelegt von Christopher Rösch aus München Hundt Druck Köln, 2005 Berichterstatter: Prof. Dr. H. Bothe Prof. Dr. D. Schomburg Tag der mündlichen Prüfung: 11. Juli 2005 Meinen Eltern Abstract The present work aimed at developing a novel method which allows to characterize microbial communities from environmental samples in a comprehensive and rapid way. Approaches tried so far either supplied information about the identity of a small fraction of organisms within the total community (e. g. by sequencing of clone libraries), or demonstrated the diversity of organisms without identifying them (e. g. community profiling). In the approach presented here, a method for determining such profiles (by tRFLP analysis) was combined with an automatic analysis by a computer program (TReFID) newly developed for the current study. For the characterization of microorganisms, three different data bases have been constructed: (1) for denitrifying bacteria: a nosZ data base with 607 entries (2) for dinitrogen fixing bacteria: a nifH data base with 1,318 entries (3) for bacteria in general: a 16S rDNA data base with 22,145 entries Thus a comprehensive data set has been developed particularly for the 16S rRNA gene. The use of the TReFID program now allows investigators to characterize bacterial communities from any environmental sample in a rather comprehensive way. Several control analyses showed that the TReFID program is suited for the analysis of environmental samples.
    [Show full text]
  • Oxidation of Thiosulfate by a New Bacterium, Bosea Thiooxidans (Strain BI-42) Gen
    INTERNATIONAL JOURNALOF SYSTEMATIC BACTERIOLOGY,OCt. 1996, p. 981-987 Vol. 46, No. 4 0020-7713/96/$04.00+0 Copyright 0 1996, International Union of Microbiological Societies Oxidation of Thiosulfate by a New Bacterium, Bosea thiooxidans (strain BI-42) gen. nov., sp. nov.: Analysis of Phylogeny Based on Chemotaxonomy and 16s Ribosomal DNA Sequencing SUBIZATA K. DAS,' AJIT K. MISHRA,'" BRIAN J. TINDALL,2 FRED A. RAINEY,2 AND ERKO STACKEBRANDT2 Department of Microbiology, Bose Institute, CIT Scheme VII-M, Calcutta 700 054, India, and DSM-Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH, 0-38124 Braunschweig, Germany2 A gram-negative bacterium which was capable of oxidizing reduced inorganic sulfur compounds was isolated from agricultural soil and designated BI-42. This new isolate grew on a wide range of organic substrates but was not able to grow autotrophically and lacked ribulose 1,s-bisphosphate carboxylase, a key enzyme of carbon dioxide fixation. These results suggested that strain BI-42 was a chemolithoheterotroph.Ammonia and nitrate were not used as sole nitrogen sources for growth, and strain BI-42 lacked glutamate synthase activity, which resulted in glutamate auxotrophy. The glutamate dehydrogenase activity of this organism was apparently insufficient for ammonia assimilation. On the basis of the results of additional biochemical tests, the G+C content of the DNA, the results of a respiratory ubiquinone analysis, the results of a 16s ribosomal DNA sequence analysis, the fatty acid composition, and the results of a membrane lipid analysis, strain BI-42 was identified as a phylogenetically and physiologically distinct taxon belonging to the alpha subclass of the Proteobucteria.
    [Show full text]
  • Sequedex Documentation Release 1.0-Rc1
    Sequedex Documentation Release 1.0-rc1 Joel Berendzen, Judith Cohn, Nicolas Hengartner, Mira Dimitrijevic, Benjamin McMahon January 06, 2016 Contents 1 Copyright notice 1 2 Introduction to Sequedex 3 2.1 What is Sequedex?............................................3 2.2 What does Sequedex do?.........................................3 2.3 How is Sequedex different from other sequence analysis packages?..................4 2.4 Who uses Sequedex?...........................................4 2.5 How is Sequedex used with other software?...............................5 2.6 How does Sequedex work?........................................5 2.7 Sequedex’s outputs............................................8 3 Installation instructions 15 3.1 System requirements........................................... 16 3.2 Downloading and unpacking for Mac.................................. 16 3.3 Downloading and unpacking for Linux................................. 17 3.4 Downloading and unpacking for Windows 7 or 8............................ 17 3.5 Using Sequedex with Cygwin installed under Windows......................... 19 3.6 Installation and updates without network access............................. 20 3.7 Testing your installation......................................... 20 3.8 Running Sequedex on an example data file............................... 20 3.9 Obtaining a node-locked license file................................... 21 3.10 Installing new data modules and upgrading Sequedex - User-installs.................. 21 3.11 Installing new data modules
    [Show full text]