Improving the Usage of Ribosomal RNA Gene in Microbiology and Microbial Ecology Importance of Standardization and Biocuration

Improving the usage of ribosomal RNA gene in microbiology and microbial ecology
Importance of standardization and biocuration

by Pelin Yilmaz, M.Sc.

A thesis submitted in partial fulfillment of requirements for the degree of DOCTOR OF PHILOSOPHY in Bioinformatics

Approved, Thesis Committee:
Prof. Dr. Frank Oliver Glöckner (chair), Max Planck Institute for Marine Microbiology, Jacobs University
Prof. Dr. Matthias Ullrich (member), Jacobs University
Dr. Wolfgang Ludwig (member), Technical University of Munich

Date of Defense: December 9, 2011
School of Engineering and Science

Statement of Sources

DECLARATION

I declare that this thesis is my own work and has not been submitted in any form for another degree or diploma at any university or other institution of tertiary education. Information derived from published or unpublished scientific work has been cited in the text and listed in the references.

Signature Date

First classification of Bacteria:
1. Bacterium, inflexible rod forms;
2. Vibrio, flexible rod forms;
3. Spirillum, spiral, inflexible;
4. Spirochaeta, spiral, flexible;
5. Spirodiscus, flattened spirals.
* Adapted from: Die Infusionsthierchen als vollkommene Organismen: ein Blick in das tiefere organische Leben der Natur, Christian Gottfried Ehrenberg (1838).

Thesis Abstract

Environmental surveys of ribosomal RNA genes have become the standard approach to microbial ecology in the last twenty-five years. Although the ribosomal RNA approach has been competing with metagenomics in recent years, it is once again in the spotlight, because high-throughput sequencing technologies make it possible to document the diversity and richness of the microbial biome at an astonishing rate.
High-throughput sequencing technologies bring with them a sequence data deluge, and present challenges to the traditional data management, analysis and retrieval practices. Biocuration, which leads to specialized ribosomal RNA datasets, has become an important part of microbial ecology. More recently, the development and use of metadata standards has emerged as an important aid for biocuration in microbiology. With the help of such standards, it will be possible to curate specialized datasets, which integrate data from even more diverse resources, and bring the organisms, their functions, and the environment together.

The work accomplished during this Ph.D. thesis aimed at improving the usage of the ribosomal RNA gene in microbiology and microbial ecology by curating high-quality specialized datasets for ribosomal RNA gene sequences, and by performing research that demonstrates the value of such datasets. More importantly, a metadata standard for marker gene sequences, such as ribosomal RNA, was developed and disseminated to a variety of bioinformatics service providers for implementation. This standard will greatly aid biocuration work in microbiology, and will improve the quantity and quality of integrative resources for microbiology, ultimately providing researchers the ability to understand the diversity and functioning of the microbial biosphere.

Table of Contents

Introduction
  Early Microbiology and Microbial Ecology
  The Molecular Dawn for Bacteria
    Understanding the organism
    Revisiting evolution
    Accessing the microbial biosphere
  Cataloging Bacterial and Archaeal Diversity
    Primary databases
    Secondary databases
  rRNA Approach in the Age of “Omics”
    Is the rRNA approach still relevant?
    High-throughput microbial diversity
    Missing ingredient; contextual data
  Motivation and Research Aims
Results and Discussion
  Overview
  Contextual Data Standards for Biocuration
    I. Minimum information about a marker gene sequence (MIMARKS) and minimum information about any (x) sequence (MIxS) specifications
    II. The genomic standards consortium: bringing standards to life for microbial ecology
    III. MetaBar - a tool for consistent contextual data acquisition and standards compliant submission
    IV. CDinFusion – Submission-ready, on-line Integration of Sequence and Contextual Data
  Curation of rRNA Datasets
    V. SILVA: Comprehensive databases for quality checked and aligned ribosomal RNA sequence data compatible with ARB
    VI. Megx.net: integrated database resource for marine ecological genomics
  Usage of Curated Datasets in Microbial Ecology
    VII. Analysis of 23S rRNA genes in metagenomes – A case study from the Global Ocean Sampling Expedition
    VIII. Ecological structuring of bacterial and archaeal taxa in ocean surface waters
Summary
  Contextual Data Standards for Biocuration
    Minimum information about any (x) sequence (MIxS)
    Implementation of MIxS
  Curation of rRNA Datasets
    SILVA taxonomy
    Georeferenced diversity in megx.net
  Usage of Curated Datasets in Microbial Ecology
    Investigating the potential of 23S rRNA genes in metagenomes
    Ecological structuring of bacterial and archaeal taxa in the marine environment
Outlook
Appendix
Acknowledgements
Bibliography

INTRODUCTION

Early Microbiology and Microbial Ecology

The story of Bacteria started as a curiosity of nature with Leeuwenhoek’s microscopy observations in the 1670s. Until the mid-1800s, Bacteria were largely ignored; in fact, it was not even certain that they were living organisms. The actual realization of the significance of Bacteria, without doubt, came with Robert Koch’s postulation of the “Germ Theory of Disease”. Having found an “applied” perspective, microbiology (specifically the study of Bacteria and Archaea) got a kick-start, and numerous findings ranging from biogenesis to the development of vaccines dotted the early timeline of this field.

Because of the implications of Bacteria for human health, other aspects of microbiology lagged behind, but they did gain pace with the seminal works of Sergei Winogradsky and Martinus Beijerinck. Perhaps the achievements of medical microbiology unfairly overshadowed theirs. In fact, microbial ecology would not exist were it not for their discoveries of nitrogen fixation and chemoautotrophy, which are cornerstones in today’s understanding of the biogeochemical cycling of elements in the environment. Medical microbiology not only overshadowed Winogradsky’s and Beijerinck’s achievements, but also made it difficult for microbiology to be recognized as a “basic science”. However, Beijerinck had another vision for microbiology, rather than confining it to the applications of medicine.
In the statement he made during his speech at the Dutch Royal Academy of Sciences in 1905, he outlined the best practices for studying microorganisms [1]: “[My] approach can