Genome Sequence of the Deep-Sea -Proteobacterium Idiomarina
Total Page:16
File Type:pdf, Size:1020Kb
Genome sequence of the deep-sea ␥-proteobacterium Idiomarina loihiensis reveals amino acid fermentation as a source of carbon and energy Shaobin Hou*†, Jimmy H. Saw*, Kit Shan Lee*, Tracey A. Freitas*†‡, Claude Belisle*, Yutaka Kawarabayasi§, Stuart P. Donachie*, Alla Pikina*, Michael Y. Galperin¶, Eugene V. Koonin¶, Kira S. Makarova¶, Marina V. Omelchenko¶, Alexander Sorokin¶, Yuri I. Wolf¶, Qing X. Liʈ, Young Soo Keumʈ, Sonia Campbellʈ, Judith Deneryʈ, Shin-Ichi Aizawa**, Satoshi Shibata**††, Alexander Malahoff‡‡, and Maqsudul Alam*†‡§§ *Department of Microbiology, University of Hawaii, Snyder Hall 111, 2538 The Mall, Honolulu, HI 96822; †Center for Genomics, Proteomics, and Bioinformatics Research Initiative, University of Hawaii, Keller Hall 319, Honolulu, HI 96822; ‡Maui High Performance Computing Center, 550 Lipoa Parkway, Kihei, Maui, HI 96753; §Center for Glycoscience, National Institute of Advanced Industrial Science and Technology, Higashi 1-1-1, Tsukuba-shi, Ibaraki 305-8566, Japan; ¶National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894; ʈDepartment of Molecular Biosciences and Bioengineering, University of Hawaii, Honolulu, HI 96822; **‘‘Soft Nano-Machine’’ Project, Japan Science and Technology Corporation, 1064-18 Hirata, Takanezawa, Shioya-gun, Tochigi 329-1206, Japan; ††Department of Agriculture, Shizuoka University, Yata, Shizuoka 422-8529, Japan; and ‡‡Institute of Geological and Nuclear Sciences, 69 Gracefield Road, P.O. Box 30368, Lower Hutt, New Zealand Edited by Rita R. Colwell, University of Maryland, College Park, MD, and approved November 12, 2004 (received for review October 14, 2004) We report the complete genome sequence of the deep-sea ␥- Hawaii (6). In contrast to obligate anaerobic vent hyperthermo- proteobacterium, Idiomarina loihiensis, isolated recently from a philes Thermotoga spp. and Pyrococcus spp. (7–9), I. loihiensis hydrothermal vent at 1,300-m depth on the Lo ihi submarine inhabits partially oxygenated cold waters at the periphery of the volcano, Hawaii. The I. loihiensis genome comprises a single chro- vent, surviving a wide range of growth temperatures (from 4°C mosome of 2,839,318 base pairs, encoding 2,640 proteins, four to 46°C) and salinities (from 0.5% to 20% NaCl). We report here rRNA operons, and 56 tRNA genes. A comparison of I. loihiensis to the complete genome sequence of I. loihiensis, deep-sea ␥- the genomes of other ␥-proteobacteria reveals abundance of proteobacterium. The 2,839,318-bp genome of I. loihiensis en- amino acid transport and degradation enzymes, but a loss of sugar codes 2,640 predicted proteins, which is larger than the typical transport systems and certain enzymes of sugar metabolism. This genome size of obligate parasites of ␥-proteobacterial lineage, finding suggests that I. loihiensis relies primarily on amino acid such as Haemophilus influenzae or Coxiella burnetii, but smaller catabolism, rather than on sugar fermentation, for carbon and than the genomes of facultative parasites like Escherichia coli or energy. Enzymes for biosynthesis of purines, pyrimidines, the Vibrio cholerae or environmental organisms like Shewanella majority of amino acids, and coenzymes are encoded in the oneidensis (10). Distinguishing properties of I. loihiensis include genome, but biosynthetic pathways for Leu, Ile, Val, Thr, and Met a versatile signal transduction system, an integrated mechanism are incomplete. Auxotrophy for Val and Thr was confirmed by in of exopolysaccharide biosynthesis, and a variety of secreted vivo experiments. The I. loihiensis genome contains a cluster of 32 metallopeptidases. I. loihiensis appears to derive carbon and genes encoding enzymes for exopolysaccharide and capsular poly- energy primarily from amino acid metabolism, rather than from saccharide synthesis. It also encodes diverse peptidases, a variety sugar fermentation. of peptide and amino acid uptake systems, and versatile signal transduction machinery. We propose that the source of amino Materials and Methods acids for I. loihiensis growth are the proteinaceous particles The I. loihiensis genome was sequenced by using both bacterial present in the deep sea hydrothermal vent waters. I. loihiensis articifial chromosome (BAC) clone and whole-genome shotgun would colonize these particles by using the secreted exopolysac- libraries (WGSL). Both ends of the BAC and WGSL clones of charide, digest these proteins, and metabolize the resulting pep- sizes 1, 2, and 5 kb were sequenced by Dye Terminator Cycle tides and amino acids. In summary, the I. loihiensis genome reveals Sequencing using Beckman CEQ2000 sequencers (Beckman an integrated mechanism of metabolic adaptation to the con- Coulter). Approximately 69,000 valid sequences were assembled stantly changing deep-sea hydrothermal ecosystem. into 148 contigs by using the Paracel Genome Assembler, giving Ϸ10ϫ coverage of the genome. Gap-spanning forward and hydrothermal vent reverse clone pairs were used to build contig scaffolds and scaffolds were further linked to each other by using BAC end teep thermal and chemical gradients at active, submarine pairs. The AUTOFINISH module of CONSED (www.phrap.org) was Shydrothermal vents affect the phyletic composition and used to automatically pick gap-closing primers between contigs metabolic activities of microbial communities at these sites (1, 2). for primer-walking experiments. Assembled contigs, along with Loihi is an active submarine volcano Ϸ30 km south of the island the primer-walking sequences, were manually assembled by of Hawaii. This youngest manifestation of the hotspot respon- using Seqman II (DNAStar). Direct sequencing of genomic sible for the Emperor–Hawaiian Seamount chain and Hawaiian DNA was then performed to close remaining gaps (11). The Islands has been geophysically and geochemically characterized entire assembly was verified by 187 long PCRs (Takara Bio, (1, 3). Vent waters are enriched in Fe, Li, As, Ba, Pb, Po, Mo, Tokyo) of 15- to 20-kb fragments with 1-kb overlapping regions. Sb, Te, and Tl (1). Deep-sea hydrothermal vents and their environments host diverse communities across three domains of life. Microbial communities in such habitats are adapted to high This paper was submitted directly (Track II) to the PNAS office. temperatures, pressure, and salinity. Complete genomes of Abbreviation: COG, cluster of orthologous groups. several bacteria and archaea from submarine vents have been Data deposition: The sequence reported in this paper has been deposited in the GenBank sequenced and offered insights into microbial evolution (4, 5). database (accession no. AE017340). Idiomarina loihiensis sp. nov. is a ␥-proteobacterium, isolated §§To whom correspondence should be addressed. E-mail: [email protected]. recently from the hydrothermal vents on the Lo‘ihi Seamount, © 2004 by The National Academy of Sciences of the USA 18036–18041 ͉ PNAS ͉ December 28, 2004 ͉ vol. 101 ͉ no. 52 www.pnas.org͞cgi͞doi͞10.1073͞pnas.0407638102 Downloaded by guest on September 29, 2021 Table 1. General features of the I. Ioihiensis genome Feature I. Ioihiensis Genome size, bp 2,839,318 GϩC content, % 47.04 Number of predicted CDSs 2,711 Average size of CDSs, bp 962 Percentage coding, % 92.1 Number of protein-coding genes 2,640 tRNAs 56 rRNA operons (16S–23S–5S) 4 Structural RNAs 3 Number of CDSs with known function 2,230 Uncharacterized conserved proteins 410 (15.1%) ORFs were identified by using GENEMARK (12), followed by BLASTX (13) searches of the remaining intergenic regions. Trans- fer RNAs were predicted by using TRNASCAN-SE (14). Genome annotation was performed by comparing protein sequences with those in the NCBI protein database (www.ncbi.nlm.nih.gov) and the clusters of orthologous groups (COG) database (www. ncbi.nlm.nih.gov͞COG) (15), by using BLAST and PSI-BLAST (13) with manual verification, as described (10). A maximum likeli- hood phylogenetic tree of ␥-Proteobacteria was constructed by using concatenated alignments of ribosomal proteins (16). Results and Discussion Genome Organization and Analysis. The I. loihiensis genome is a single circular chromosome of 2,839,318 bp with an average GϩC content of 47% (Table 1). The genome contains 2,640 predicted ORFs, four rRNA operons (16S-23S-5S), 56 tRNA genes, and three other RNA genes, accounting for 92.1% of the genome (Table 1 and Fig. 1; see also Fig. 6, which is published BIOLOGY as supporting information on the PNAS web site). Phylogenetic Fig. 1. Circular representation of the I. loihiensis genome. From the outside DEVELOPMENTAL analysis of the concatenated alignment of ribosomal proteins inward: The first and second circles show predicted gene-coding region in plus (Fig. 2) shows that I. loihiensis represents a distinct lineage and minus strands, respectively. By COG functional categories, red indicates among the ␥-Proteobacteria, which branched from the ‘‘trunk’’ of translation, ribosomal structure, and biogenesis; blue indicates transcription; the ␥-proteobacterial tree after the Pseudomonas lineage, but pink indicates DNA replication, recombination, and repair; green indicates cell before the Vibrio cluster. Pairwise genome alignments show division and chromosome partitioning; dark red indicates posttranslational limited conservation of gene order between Idiomarina, Vibrio, modification; orange indicates cell envelope biogenesis, outer membrane; Pseudomonas, and Shewanella (Fig. 7, which is published as navy blue indicates cell motility and secretion; purple