
PHYLOGENOMICS: A GENOME-LEVEL APPROACH TO ASSEMBLING THE BACTERIAL BRANCHES OF THE TREE OF LIFE

Jonathan A. Eisen¹, Naomi Ward¹, Karen E. Nelson¹, Jonathan H. Badger¹, James Sakwa¹, Dongying Wu¹, Martin Wu¹, Kevin Penn¹, Grace Pai¹, Shannon Smith¹, Elizabeth M. O’Connor², Julie Enticknap², Tim Steppe², Frank T. Robb²

¹The Institute for Genomic Research (TIGR), Rockville, MD.
²Center of Marine Biotechnology (COMB), Baltimore, MD.

Project Summary and Web Page

RESULTS

RESULTS SUMMARY

1. Shotgun sequencing is completed for 5 phyla and in progress for the other three (Table 1).
2. Annotation of the genomes helps predict physiology and may aid in experimental studies (Table 2).
3. Whole genome phylogenetic analysis suggests that one group (Thermomicrobium) may not be a novel phylum, and helps resolve relationships among phyla (Figure 3).
4. Using the genomic data, members of these phyla have been identified in environmental samples (Figure 4).
5. We have developed multiple “phylogenomic” tools in conjunction with this project.

Table 1. Sequencing Status for Selected Phyla

Phylum                 Species selected                            Growth, DNA  Libraries    Shotgun   Estimated Genome  # of     Auto-
                                                                   isolation                 Coverage  Size (Mb)         Contigs  Annotated
Chrysiogenes           Chrysiogenes arsenatis                      +            +            4x        2.5               155      +
Coprothermobacter      Coprothermobacter proteolyticus (CP)        +            +            8x        1.38              3        +
Dictyoglomi            Dictyoglomus thermophilum (DT)              +            +            8x        2.0               9        +
Thermodesulfobacteria  Thermodesulfobacterium commune (TC)         +            +            8x        1.78              26       +
                       Thermodesulfovibrio yellowstonii (TY)       +            +            8x        1.98              27       +
                       Thermomicrobium roseum                      +            +            8x        3.4               82       +
Deferribacteres        Selecting from Deferribacter thermophilus,  +            In progress
                       Geovibrio thiophilus, Flexistipes sinusarabici
Synergistes            Selecting from Synergistes jonesii,         +            In progress
                       Aminobacter colombiense,
                       Thermanaerovibrio acidaminovorans,
                       Aminomonas paucivorans,
                       Dethiosulfovibrio peptidovorans

Figure 3. Whole genome phylogeny

Figure 4. APIS Output for Yellowstone Mat shotgun sequences

Table 2. Predicted Metabolic Pathways. Predictions made through a combination of mining the automated lists and using the APIS-based ECFinder algorithm. See Table 1 for full species names.

Pathway                       TC  CP  TY  DT
Acetoin                       +   +   +   +
Aspartate from fumarate       +   -   +   -
Aspartate to alanine          +   +   +   -
Aspartate to oxaloacetate     +   -   -   -
C1 metabolism evidence        -   +   +   -
Cellulose to cellobiose       -   -   -   +
Cellobiose metabolism         -   -   +   +
Chitobiose metabolism         +   -   +   -
Dextrin to [...]              -   +   -   +
Formate metabolism            +   -   +   +
Fumarate to glutamate         -   -   -   +
Galactose metabolism          -   -   -   +
Glycolysis partial            +   -   -   -
Glycolysis intact             -   +   +   +
Glucosamine metabolism        +   +   +   +
Glutamate to glutamine        +   +   +   -
Glutamate to citrulline       -   +   -   -
Glutamate to [...]            -   +   -   -
Glycine cleavage              -   +   +   +
Glycine to proline            -   -   -   +
Glycine to sarcosine          -   +   -   -
Glycerol metabolism           +   +   +   +
Glycogen                      -   -   +   +
Histidine biosynthesis        -   -   +   -
[...] metabolism              -   -   -   +
Malate to pyruvate            +   -   -   -
Mannose metabolism            -   +   -   -
[...] to [...]                +   -   +   -
Pentose Phosphate Pathway     +   +   +   +
Pyruvate to isoleucine        +   -   +   -
Pyruvate to leucine           +   -   +   -
Pyruvate to valine            +   -   +   -
Pyruvate to acetylCoA         +   +   +   -
Pyruvate to acetate           +   +   -   -
Pyruvate to cysteine          -   +   -   -
Pyruvate to formate           -   +   -   -
Pyruvate to lactate           +   -   +   -
Pyruvate to malate            +   +   +   -
Pyruvate to PEP               +   +   +   -
Pyruvate to OAA               +   +   +   -
Proline from glutamate        +   -   -   -
Putrescine to spermidine      +   +   +   +
Raffinose metabolism          -   -   -   +
Ribose metabolism             -   +   +   +
Ribulose metabolism           -   -   +   +
Serine to cysteine            -   +   -   -
Serine to glycine             +   +   -   -
Sorbose metabolism            -   -   -   +
Sorbitol metabolism           -   -   +   +
Sucrose metabolism            -   -   -   +
Sulfate to sulfite            +   -   +   -
Tryptophan biosynthesis       -   -   +   -
TCA partial                   +   -   +   +
[...] to CO2                  -   +   -   -
Xylan metabolism              -   -   -   +
Xylose metabolism             -   -   -   +
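The presence/absence calls in Table 2 lend themselves to simple set comparisons across the four genomes. The following is a minimal Python sketch, not part of APIS or ECFinder: it encodes six rows copied from Table 2, and the names `TABLE2_SUBSET`, `pathways_by_genome`, `unique_to` and `shared_by_all` are hypothetical helpers introduced only for illustration.

```python
# Presence (+) / absence (-) calls copied from six rows of Table 2.
# Each string lists the calls in the table's column order: TC, CP, TY, DT.
GENOMES = ("TC", "CP", "TY", "DT")

TABLE2_SUBSET = {
    "Acetoin": "++++",
    "Cellulose to cellobiose": "---+",
    "Glycolysis intact": "-+++",
    "Glycolysis partial": "+---",
    "Pyruvate to acetate": "++--",
    "Xylan metabolism": "---+",
}


def pathways_by_genome(table):
    """Map each genome abbreviation to the set of pathways marked '+'."""
    result = {genome: set() for genome in GENOMES}
    for pathway, calls in table.items():
        for genome, call in zip(GENOMES, calls):
            if call == "+":
                result[genome].add(pathway)
    return result


def unique_to(genome, table):
    """Pathways predicted in `genome` and in none of the other genomes."""
    sets = pathways_by_genome(table)
    others = set().union(*(s for g, s in sets.items() if g != genome))
    return sets[genome] - others


def shared_by_all(table):
    """Pathways predicted in every genome."""
    return set.intersection(*pathways_by_genome(table).values())


if __name__ == "__main__":
    print("Unique to DT:", sorted(unique_to("DT", TABLE2_SUBSET)))
    print("Shared by all:", sorted(shared_by_all(TABLE2_SUBSET)))
```

On this six-row subset, the pathways unique to DT are cellulose to cellobiose and xylan metabolism, and acetoin is the only pathway shared by all four genomes, in agreement with the corresponding rows of Table 2.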

BACKGROUND

Figure 1. The tree is a schematic diagram showing the major recognized bacterial phyla (based in part on Hugenholtz (2002) and Boone et al. (2001)). Phyla are colored by genome project and culturing status. In red are the phyla selected for this project.
Legend: This Project; Published; In progress; Uncultured lineage.
Lineages labeled on the tree: TM6, OS-K, Termite Group, OP8, Chlorobi, Marine GroupA, WS3, Gemmimonas, OP9, Synergistes, Deferribacteres, Chrysiogenetes, NKB19, OP3, Spirochaetes, Coprothermobacter, OP10, Thermomicrobia, Chloroflexi, TM7, Dictyoglomus, Aquificae, OP1, OP11.

Figure 2. Project “Pipeline”
• Identify phyla with a cultured representative but no genomes
• Obtain live cultures
• Determine physiology
• Genomic DNA, PFGE
• Make shotgun library and sequence 2-300 clones
• Phylogenetic analysis (if conflicts with accepted, further investigation)
• Shotgun sequence
• Assemble genome
• Close physical and sequencing gaps
• Annotate genomes
• Phylogenomic Analysis
• Data Release

REFERENCES - ACKNOWLEDGEMENTS

• Hugenholtz P (2002). Exploring prokaryotic diversity in the genomic era. Genome Biology 3(2):reviews0003.1-0003.8.
• Boone DR, Castenholz RW, Garrity GM (2001). Bergey’s Manual of Systematic Bacteriology, 2nd edition. Springer, New York.

Thanks to Connie Shao for building the project web page and John Heidelberg for providing access to unpublished sequence data (supported by NSF FIBR Grant GC054-14-Z3423). This project is supported by an award to [...] from the National Science Foundation’s “Assembling the Tree of Life” program (Grant DEB-0228651).

APIS TOOLS - EXAMPLES

Automated Whole Genome Trees
• Select “universal” phylogenetic markers
• Build alignments for known species
• Build Hidden Markov Models (HMMs)
• Search HMMs against selected genomes
• Align genes, concatenate alignments

Automated Phylogenetic Inference System
• Genomic Data Set ([...], Proteins, DNA sequences)
• BLASTP vs. ComboDB (all proteins from complete genomes)
• Extract full length homologs from ComboDB
• Multiple Sequence Alignment (MUSCLE)
• Phylogenetic Inference (currently bootstrapped NJ using QuickTree)
• Web Display (see Figure 4)

rRNA Automated Phylogeny/ Pipeline
• Small Subunit rRNA PCR
• Sequence and Assemble
• Chimera Check
• (ABE) Assignment by Blastn, Align, NJ Tree with Taxa Reps
• Search Domain-Specific DB
• Determination of most related groups
• “Big Picture Tree” by Blastn, Align, NJ Tree with Taxa Reps
• Extract Taxon Seqs Plus Outgroups, and Top Hits of RDP II
• Align
• Zoom In Tree (NJ or ML)
• Compare Zoomed and Big Picture Trees
• Taxon Assignment
• Web Display
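The “Align genes, concatenate alignments” step of the Automated Whole Genome Trees workflow can be illustrated with a minimal Python sketch. This is not the project's code: the function name `concatenate_alignments` and the toy sequences are invented for illustration, and padding a missing gene with gap characters is an assumption, since the poster does not say how genomes lacking a marker are handled.

```python
def concatenate_alignments(gene_alignments, gap="-"):
    """Join per-gene alignments into one concatenated row per taxon.

    gene_alignments: list of dicts, one per marker gene, each mapping a
    taxon name to its aligned sequence. All sequences within one gene must
    have the same length. A taxon absent from a gene is padded with gap
    characters across that gene's columns (an assumption of this sketch).
    """
    taxa = sorted({taxon for aln in gene_alignments for taxon in aln})
    parts = {taxon: [] for taxon in taxa}
    for index, aln in enumerate(gene_alignments):
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"gene {index}: aligned sequences differ in length")
        width = lengths.pop()
        for taxon in taxa:
            parts[taxon].append(aln.get(taxon, gap * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


if __name__ == "__main__":
    # Toy, made-up aligned protein fragments keyed by the Table 1 abbreviations.
    marker_1 = {"TC": "MK-L", "CP": "MKAL", "TY": "MR-L"}
    marker_2 = {"TC": "GG", "CP": "GA", "DT": "G-"}
    for taxon, row in concatenate_alignments([marker_1, marker_2]).items():
        print(taxon, row)
```

The concatenated rows all have the same length, so they can be passed to a distance-based tree builder such as a neighbor-joining program. Checking per-gene lengths before joining catches the common mistake of concatenating sequences that were never aligned.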