Genome sequence of the deep-sea γ-proteobacterium Idiomarina loihiensis reveals amino acid fermentation as a source of carbon and energy

Shaobin Hou*†, Jimmy H. Saw*, Kit Shan Lee*, Tracey A. Freitas*†‡, Claude Belisle*, Yutaka Kawarabayasi§, Stuart P. Donachie*, Alla Pikina*, Michael Y. Galperin¶, Eugene V. Koonin¶, Kira S. Makarova¶, Marina V. Omelchenko¶, Alexander Sorokin¶, Yuri I. Wolf¶, Qing X. Li‖, Young Soo Keum‖, Sonia Campbell‖, Judith Denery‖, Shin-Ichi Aizawa**, Satoshi Shibata**††, Alexander Malahoff‡‡, and Maqsudul Alam*†‡§§

*Department of Microbiology, University of Hawaii, Snyder Hall 111, 2538 The Mall, Honolulu, HI 96822; †Center for Genomics, Proteomics, and Bioinformatics Research Initiative, University of Hawaii, Keller Hall 319, Honolulu, HI 96822; ‡Maui High Performance Computing Center, 550 Lipoa Parkway, Kihei, Maui, HI 96753; §Center for Glycoscience, National Institute of Advanced Industrial Science and Technology, Higashi 1-1-1, Tsukuba-shi, Ibaraki 305-8566, Japan; ¶National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894; ‖Department of Molecular Biosciences and Bioengineering, University of Hawaii, Honolulu, HI 96822; **“Soft Nano-Machine” Project, Japan Science and Technology Corporation, 1064-18 Hirata, Takanezawa, Shioya-gun, Tochigi 329-1206, Japan; ††Department of Agriculture, Shizuoka University, Yata, Shizuoka 422-8529, Japan; and ‡‡Institute of Geological and Nuclear Sciences, 69 Gracefield Road, P.O. Box 30368, Lower Hutt, New Zealand

Edited by Rita R. Colwell, University of Maryland, College Park, MD, and approved November 12, 2004 (received for review October 14, 2004)

We report the complete genome sequence of the deep-sea γ-proteobacterium, Idiomarina loihiensis, isolated recently from a hydrothermal vent at 1,300-m depth on the Lō‘ihi submarine volcano, Hawaii. The I. loihiensis genome comprises a single chromosome of 2,839,318 base pairs, encoding 2,640 proteins, four rRNA operons, and 56 tRNA genes. A comparison of I. loihiensis to the genomes of other γ-proteobacteria reveals an abundance of amino acid transport and degradation enzymes, but a loss of sugar transport systems and certain enzymes of sugar metabolism. This finding suggests that I. loihiensis relies primarily on amino acid catabolism, rather than on sugar fermentation, for carbon and energy. Enzymes for biosynthesis of purines, pyrimidines, the majority of amino acids, and coenzymes are encoded in the genome, but biosynthetic pathways for Leu, Ile, Val, Thr, and Met are incomplete. Auxotrophy for Val and Thr was confirmed by in vivo experiments. The I. loihiensis genome contains a cluster of 32 genes encoding enzymes for exopolysaccharide and capsular polysaccharide synthesis. It also encodes diverse peptidases, a variety of peptide and amino acid uptake systems, and versatile signal transduction machinery. We propose that the source of amino acids for I. loihiensis growth is the proteinaceous particles present in the deep-sea hydrothermal vent waters. I. loihiensis would colonize these particles by using the secreted exopolysaccharide, digest these proteins, and metabolize the resulting peptides and amino acids. In summary, the I. loihiensis genome reveals an integrated mechanism of metabolic adaptation to the constantly changing deep-sea hydrothermal ecosystem.

hydrothermal vent

This paper was submitted directly (Track II) to the PNAS office.
Abbreviation: COG, cluster of orthologous groups.
Data deposition: The sequence reported in this paper has been deposited in the GenBank database (accession no. AE017340).
§§To whom correspondence should be addressed. E-mail: [email protected].
© 2004 by The National Academy of Sciences of the USA

Steep thermal and chemical gradients at active, submarine hydrothermal vents affect the phyletic composition and metabolic activities of microbial communities at these sites (1, 2). Lō‘ihi is an active submarine volcano ≈30 km south of the island of Hawaii. This youngest manifestation of the hotspot responsible for the Emperor–Hawaiian Seamount chain and Hawaiian Islands has been geophysically and geochemically characterized (1, 3). Vent waters are enriched in Fe, Li, As, Ba, Pb, Po, Mo, Sb, Te, and Tl (1). Deep-sea hydrothermal vents and their environments host diverse communities across three domains of life. Microbial communities in such habitats are adapted to high temperatures, pressure, and salinity. Complete genomes of several bacteria and archaea from submarine vents have been sequenced and have offered insights into microbial evolution (4, 5).

Idiomarina loihiensis sp. nov. is a γ-proteobacterium, isolated recently from the hydrothermal vents on the Lō‘ihi Seamount, Hawaii (6). In contrast to obligate anaerobic vent hyperthermophiles Thermotoga spp. and Pyrococcus spp. (7–9), I. loihiensis inhabits partially oxygenated cold waters at the periphery of the vent, surviving a wide range of growth temperatures (from 4°C to 46°C) and salinities (from 0.5% to 20% NaCl). We report here the complete genome sequence of I. loihiensis, a deep-sea γ-proteobacterium. The 2,839,318-bp genome of I. loihiensis encodes 2,640 predicted proteins, which is larger than the typical genome size of obligate parasites of γ-proteobacterial lineage, such as Haemophilus influenzae or Coxiella burnetii, but smaller than the genomes of facultative parasites like Escherichia coli or Vibrio cholerae or environmental organisms like Shewanella oneidensis (10). Distinguishing properties of I. loihiensis include a versatile signal transduction system, an integrated mechanism of exopolysaccharide biosynthesis, and a variety of secreted metallopeptidases. I. loihiensis appears to derive carbon and energy primarily from amino acid metabolism, rather than from sugar fermentation.

Materials and Methods
The I. loihiensis genome was sequenced by using both bacterial artificial chromosome (BAC) clone and whole-genome shotgun libraries (WGSL). Both ends of the BAC and WGSL clones of sizes 1, 2, and 5 kb were sequenced by Dye Terminator Cycle Sequencing using Beckman CEQ2000 sequencers (Beckman Coulter). Approximately 69,000 valid sequences were assembled into 148 contigs by using the Paracel Genome Assembler, giving ≈10× coverage of the genome. Gap-spanning forward and reverse clone pairs were used to build contig scaffolds, and scaffolds were further linked to each other by using BAC end pairs. The AUTOFINISH module of CONSED (www.phrap.org) was used to automatically pick gap-closing primers between contigs for primer-walking experiments. Assembled contigs, along with the primer-walking sequences, were manually assembled by using Seqman II (DNAStar). Direct sequencing of genomic DNA was then performed to close remaining gaps (11). The entire assembly was verified by 187 long PCRs (Takara Bio, Tokyo) of 15- to 20-kb fragments with 1-kb overlapping regions.
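The sequencing figures in Materials and Methods can be cross-checked with simple arithmetic. The Python sketch below derives the mean read length implied by ≈69,000 reads at ≈10× coverage and checks that 187 verification amplicons of 15 to 20 kb could span the chromosome. The function names are illustrative, the read length is inferred rather than reported in the text, and the end-to-end tiling with a fixed 1-kb overlap is a simplifying assumption.

```python
GENOME_BP = 2_839_318   # genome size reported in the text
N_READS = 69_000        # "approximately 69,000 valid sequences"
COVERAGE = 10           # "≈10x coverage of the genome"

def implied_read_length(genome_bp, n_reads, coverage):
    """Mean read length (bp) implied by coverage = n_reads * read_len / genome_bp."""
    return coverage * genome_bp / n_reads

def tiling_span(n_amplicons, amplicon_bp, overlap_bp):
    """Length covered by amplicons tiled end to end with a fixed overlap (assumed layout)."""
    return n_amplicons * amplicon_bp - (n_amplicons - 1) * overlap_bp

read_len = implied_read_length(GENOME_BP, N_READS, COVERAGE)
span_low = tiling_span(187, 15_000, 1_000)    # all amplicons at the 15-kb end
span_high = tiling_span(187, 20_000, 1_000)   # all amplicons at the 20-kb end

print(f"implied mean read length: {read_len:.0f} bp")
print(f"PCR tiling span: {span_low:,} to {span_high:,} bp")
print("genome size falls inside the span:", span_low <= GENOME_BP <= span_high)
```

Under these assumptions the implied read length is a few hundred base pairs, and the genome size lies between the two tiling extremes, which is consistent with the amplicons covering the whole chromosome.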

18036–18041 | PNAS | December 28, 2004 | vol. 101 | no. 52    www.pnas.org/cgi/doi/10.1073/pnas.0407638102

Table 1. General features of the I. loihiensis genome

Feature                                I. loihiensis
Genome size, bp                        2,839,318
G+C content, %                         47.04
Number of predicted CDSs               2,711
Average size of CDSs, bp               962
Percentage coding, %                   92.1
Number of protein-coding genes         2,640
tRNAs                                  56
rRNA operons (16S–23S–5S)              4
Structural RNAs                        3
Number of CDSs with known function     2,230
Uncharacterized conserved proteins     410 (15.1%)
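The counts in Table 1 can be checked for internal consistency. The Python sketch below is one possible reading of the table, not something the source states: it assumes that the 2,711 predicted CDSs are the protein-coding genes plus all RNA genes, and that each rRNA operon contributes one 16S, one 23S, and one 5S gene. The dictionary keys and function names are illustrative.

```python
# Values transcribed from Table 1.
TABLE1 = {
    "predicted_cds": 2_711,
    "protein_coding": 2_640,
    "trna": 56,
    "rrna_operons": 4,
    "structural_rna": 3,
    "known_function": 2_230,
    "uncharacterized": 410,
}

# Assumption: one 16S, one 23S and one 5S gene per rRNA operon.
RRNA_GENES_PER_OPERON = 3

def rna_gene_count(t):
    """tRNA genes + rRNA genes + other structural RNA genes."""
    return t["trna"] + t["rrna_operons"] * RRNA_GENES_PER_OPERON + t["structural_rna"]

def total_features(t):
    """Protein-coding genes plus RNA genes (assumed meaning of 'predicted CDSs')."""
    return t["protein_coding"] + rna_gene_count(t)

def uncharacterized_pct(t):
    """Uncharacterized conserved proteins as a percentage of predicted CDSs."""
    return 100 * t["uncharacterized"] / t["predicted_cds"]

print("protein-coding + RNA genes:", total_features(TABLE1))
print("known + uncharacterized:", TABLE1["known_function"] + TABLE1["uncharacterized"])
print(f"uncharacterized share: {uncharacterized_pct(TABLE1):.1f}%")
```

Under this reading, the known-function and uncharacterized rows sum to the protein-coding gene count, and the 15.1% figure corresponds to 410 of the 2,711 predicted CDSs.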

ORFs were identified by using GENEMARK (12), followed by BLASTX (13) searches of the remaining intergenic regions. Transfer RNAs were predicted by using TRNASCAN-SE (14). Genome annotation was performed by comparing protein sequences with those in the NCBI protein database (www.ncbi.nlm.nih.gov) and the clusters of orthologous groups (COG) database (www.ncbi.nlm.nih.gov/COG) (15), by using BLAST and PSI-BLAST (13) with manual verification, as described (10). A maximum likelihood phylogenetic tree of γ-Proteobacteria was constructed by using concatenated alignments of ribosomal proteins (16).

Results and Discussion
Genome Organization and Analysis. The I. loihiensis genome is a single circular chromosome of 2,839,318 bp with an average G+C content of 47% (Table 1). The genome contains 2,640 predicted ORFs, four rRNA operons (16S-23S-5S), 56 tRNA genes, and three other RNA genes, accounting for 92.1% of the genome (Table 1 and Fig. 1; see also Fig. 6, which is published as supporting information on the PNAS web site). Phylogenetic analysis of the concatenated alignment of ribosomal proteins (Fig. 2) shows that I. loihiensis represents a distinct lineage among the γ-Proteobacteria, which branched from the “trunk” of the γ-proteobacterial tree after the Pseudomonas lineage, but before the Vibrio cluster. Pairwise genome alignments show limited conservation of gene order between Idiomarina, Vibrio, Pseudomonas, and Shewanella (Fig. 7, which is published as supporting information on the PNAS web site). Comparative genome analysis of I. loihiensis proteins showed that I. loihiensis has a typical γ-proteobacterial proteome. Indeed, most predicted proteins in I. loihiensis have closest homologs in γ-proteobacteria (77%) or representatives of other proteobacterial subphyla (9%) (Fig. 3A). Biological roles were assigned to 1,662 (63%) ORFs; only a general functional prediction was made for 303 more (11.5%), and the rest remained functionally uncharacterized. Only 116 (4.4%) ORFs had no detectable homologs in the public protein databases. For 18 of them, homologous ORFs were found among environmental sequences from the Sargasso Sea (17), leaving 98 unique predicted proteins. Among the relatively small number of closest homologs outside the Proteobacteria, most were from cyanobacteria and Gram-positive bacteria. Most I. loihiensis proteins (83%) were assigned to COGs (15) that contained proteins from at least one other γ-proteobacterium (Fig. 8, which is published as supporting information on the PNAS web site). Predicted I. loihiensis proteins were assigned to only 23 COGs (≈1% of the total) that did not include representatives of other γ-proteobacteria. Conversely, the I. loihiensis genome encodes 96% of the most conserved set of protein families (COGs) that are found in all γ-proteobacteria. A comparison of COGs found in I. loihiensis and other free-living γ-proteobacteria shows conservation of genes involved in translation, DNA replication and repair, and lipid metabolism, in contrast to a substantial loss of sugar metabolism genes (Fig. 3B).

Fig. 1. Circular representation of the I. loihiensis genome. From the outside inward: The first and second circles show predicted gene-coding regions in plus and minus strands, respectively. By COG functional categories, red indicates translation, ribosomal structure, and biogenesis; blue indicates transcription; pink indicates DNA replication, recombination, and repair; green indicates cell division and chromosome partitioning; dark red indicates posttranslational modification; orange indicates cell envelope biogenesis, outer membrane; navy blue indicates cell motility and secretion; purple indicates inorganic ion transport and metabolism; wheat indicates signal transduction mechanisms; aquamarine indicates energy production and conversion; beige indicates carbohydrate transport and metabolism; blue-violet indicates amino acid transport and metabolism; light sky blue indicates nucleotide transport and metabolism; gold indicates coenzyme metabolism; dark khaki indicates lipid metabolism; magenta indicates secondary metabolites biosynthesis; light cyan indicates general function prediction only; gray indicates function unknown; black indicates hypothetical. The third circle shows transposons and insertion sequence elements in red. The fourth circle shows pseudogenes in brown. The fifth and sixth circles show tRNAs and rRNAs in olive and dark green, respectively.

Amino Acid Metabolism. The I. loihiensis genome encodes the complete set of enzymes for biosynthesis of most amino acids, except for Leu, Ile, Val, Thr, and Met (Fig. 4). The common Leu, Ile, Val pathway misses a single enzyme, the dihydroxyacid dehydratase (IlvD). The Thr and Met biosynthetic pathways miss threonine synthase (ThrC) and methionine synthase (MetE or MetH), the enzymes that catalyze the last steps in the respective pathways. These observations suggested that I. loihiensis should be auxotrophic for all or at least some of these amino acids. To test these genome-based predictions, we measured cell growth in the absence of each suspected essential amino acid and

found that only Thr and Val were required (Table 2). Leucine biosynthesis might be carried out with low efficiency by enzymes of the tricarboxylic acid cycle and fatty acid biosynthesis. Leucine and isoleucine also could be produced by transamination of the corresponding oxo-acids derived from fatty acid metabolism (Fig. 4).

I. loihiensis is likely to obtain Thr and Val and supplement other amino acids through hydrolysis of exogenous proteins. Indeed, the genome encodes a diverse set of predicted peptidases. These include four genes for Xaa–Pro (PepQ homologs), seven genes for aminopeptidases of the S9C family, and 10 for peptidases of the M38 family (HutI homologs). Several of these are not typical for γ-proteobacteria, such as serine C (IL1750), two predicted aspartyl proteases (IL0498 and IL2346), two subtilisin-like proteases (IL0161 and IL0162), several diverged secreted peptidases of different families, namely, metallopeptidases (IL0589 and IL1025) and zinc-dependent (IL0695 and IL0199). I. loihiensis also has an extracellular (IL2428) similar to the recently described cyanophycinase, an enzyme that hydrolyses cyanophycin, a nonpolar reserve polymer of cyanobacteria, into aspartate-arginine dipeptides (18). The abundance of predicted metallopeptidases might be linked to the high concentrations of heavy metals, including zinc, in the hydrothermal vent environment (1). I. loihiensis also encodes an extensive set of enzymes for amino acid degradation. These include the Hut system of histidine degradation (IL2450–IL2454), which is rare in γ-proteobacteria.

Fig. 2. Phylogenetic position of I. loihiensis. Maximum likelihood phylogenetic tree of γ-Proteobacteria constructed from concatenated alignments of ribosomal proteins. Circles indicate internal nodes with resampling of estimated log-likelihoods (RELL) bootstrap support >95%. Distances are indicated in substitutions per site.

Fig. 3. Comparative genomic analysis of I. loihiensis proteins. (A) Phylogenetic distribution of the closest homologs of I. loihiensis proteins. The numbers indicate (in the clockwise order) best hits among γ-proteobacteria (77%), other proteobacteria, firmicutes, cyanobacteria, other organisms, and proteins with no homologs. (B) Functional distribution of I. loihiensis proteins. The columns indicate the number of protein families (COGs) encoded in the genomes of seven free-living γ-proteobacteria (E. coli K12, E. coli O157:H7, Salmonella typhimurium, Yersinia pestis, V. cholerae, and Pseudomonas aeruginosa), but not in I. loihiensis (Bottom); those shared by I. loihiensis and seven other γ-proteobacteria (Middle); and those found in I. loihiensis but not in any of the others (Top). Protein functional groups represent translation (J), replication, recombination and repair (L), energy production and conversion (C), and transport and metabolism of amino acids (E), carbohydrates (G), nucleotides (F), coenzymes (H), lipids (I), and inorganic ions (P).

Sugar Metabolism. Deep-sea bacteria show limited ability to use carbohydrates as their sole carbon and energy source (19). Indeed, I. loihiensis does not seem to have a functional PEP-dependent sugar/phosphotransferase system or any ABC-type sugar transporters, or the sugar–phosphate permease. Many enzymes of carbohydrate degradation typical of other γ-proteobacteria are also absent. Examples include such key enzymes of the pentose phosphate shunt and Entner–Doudoroff pathway as transaldolase, glucose-6-phosphate dehydrogenase, and 2-keto-3-deoxy-6-phosphogluconate aldolase. Still, I. loihiensis encodes the enzymes for glycolysis/gluconeogenesis and the citric acid cycle with glyoxylate bypass (Fig. 4). Several central enzymes of sugar metabolism, including transketolase (IL2214), ribose 5-phosphate isomerase (IL2103), and ribulose-5-phosphate-3-epimerase (IL2324), which are encoded by I. loihiensis, appear to be sufficient to provide intermediates for biosynthesis of nucleotides, several cofactors, and amino acids. Accordingly, I. loihiensis is capable of using glycerol and maltose, but not glucose 6-phosphate, as the sole carbon source (6).

Coenzymes. In oligotrophic seawater, marine bacteria are usually self-sufficient with respect to coenzymes (20). I. loihiensis encodes all enzymes for NAD, FAD, biotin, folate, and ubiquinone biosynthesis, consistent with this organism’s ability to grow in a minimal medium without vitamins (Table 2). Biosynthesis pathways for cobalamin (B12) and molybdopterin guanine dinucleotide are missing, which correlates with the lack of B12- and molybdopterin-dependent enzymes. In CoA biosynthesis, the two missing enzymes, aspartate 1-decarboxylase (PanD) and pantothenate kinase (CoaA), are the same as in the recently sequenced genome of the marine cyanobacterium Prochlorococcus marinus (20), suggesting that the two organisms might encode the same alternative enzymes. I. loihiensis also encodes a homolog of the E. coli Na+/pantothenate symporter PanF (IL1528), which might be used for pantothenate uptake.

Fatty Acid Metabolism. The genus Idiomarina has a unique fatty acid composition with a high percentage of iso-branched fatty acids (21). Compared to other Idiomarina spp., I. loihiensis has twice the percentage of saturated fatty acids (6). Indeed, the I. loihiensis genome revealed a complete set of enzymes for fatty acid biosynthesis. However, genome sequence alone does not allow prediction of the final composition of the membrane. Therefore, we have complemented genome analysis with the characterization of the I. loihiensis membrane fatty acids. Gas chromatography/MS analysis of membrane fractions detected 11 fatty acids, dominated by 3-hydroxyoctadecanoic acid (24.7%), 3-hydroxyundecanoic acid (23.2%), 9-octadecenoic acid (19.3%), and hexadecanoic acid (11.2%) (Table 3, which is

Fig. 4. A scheme of the principal functional systems of the I. loihiensis cell. I. loihiensis gene identifiers are shown by numbers in green. Pathways involving multiple reactions are shown by double arrows. Final biosynthetic products are indicated as follows: light blue for amino acids, dark yellow for nucleotides, brown for sugars, and red for cofactors. Reactions for which no candidate enzyme was confidently identified are indicated by question marks. Missing reactions of Thr and Val biosynthesis are crossed out; the uptake of these two essential amino acids is indicated with exclamation points.
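The reasoning that links missing enzymes to auxotrophy, and then to the growth experiments, can be written out as set operations. The Python sketch below uses the missing enzymes named under Amino Acid Metabolism and the single-omission rows of the experimental column in Table 2. The data structures and function names are illustrative, and the mapping of each enzyme to the amino acids it affects follows the text rather than an independent pathway database.

```python
# Missing enzymes named in the text, mapped to the amino acids whose
# biosynthetic pathway they interrupt.
MISSING_ENZYMES = {
    "IlvD": ["Leu", "Ile", "Val"],
    "ThrC": ["Thr"],
    "MetE/MetH": ["Met"],
}

# Table 2, experimental column: growth on minimal medium plus 12 amino
# acids with the named one omitted. True means growth was observed.
GROWTH_WITHOUT = {
    "Asn": True, "Cys": True, "Gln": True, "His": True, "Lys": True,
    "Pro": True, "Ser": True, "Arg": True, "Met": True, "Thr": False,
    "Leu": True, "Ile": True, "Val": False,
}

def predicted_auxotrophies(missing):
    """Amino acids expected to be required, given the missing enzymes."""
    return {aa for affected in missing.values() for aa in affected}

def confirmed_auxotrophies(growth):
    """Amino acids whose omission abolished growth."""
    return {aa for aa, grew in growth.items() if not grew}

predicted = predicted_auxotrophies(MISSING_ENZYMES)
confirmed = confirmed_auxotrophies(GROWTH_WITHOUT)
bypassed = predicted - confirmed   # predicted, but the cells still grew

print("predicted:", sorted(predicted))
print("confirmed:", sorted(confirmed))
print("predicted but not required:", sorted(bypassed))
```

The third set corresponds to the amino acids for which the text proposes alternative routes, such as transamination of oxo-acids derived from fatty acid metabolism.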

published as supporting information on the PNAS web site). These acids were identified as esters of phosphatidylglycerol (PG), diphosphatidylglycerol (DPG), phosphatidylethanolamine (PE), phosphatidylserine (PS), and phosphatidylcholine (PC). PG and DPG were the most abundant phospholipids (45.6% and 32.2%, respectively). I. loihiensis maintains several diverged copies of genes for fatty acid synthesis. Given that the composition and modification of fatty acids play a crucial role in maintaining cell membrane integrity in deep-sea microorganisms (22), further analysis of these genes could help in understanding the regulation of membrane fluidity under constantly changing pressure and temperature.

Exopolysaccharide and Biofilm. Like other deep-sea vent microorganisms, I. loihiensis in vitro produces a highly viscous exopolysaccharide (19, 23). It has been suggested that vent microorganisms develop biofilms by using exopolysaccharides to ensure the formation of robust and stable communities (4, 23). Indeed, the I. loihiensis genome contains a cluster of 32 genes (IL0538–IL0569), which encode enzymes involved in the synthesis, export, modification, and polymerization of exopolysaccharides. Another gene cluster (IL0518–IL0521) encodes enzymes for sialic acid biosynthesis and sialylation of the surface polysaccharides. Sialic acid synthetase of I. loihiensis contains a C-terminal antifreeze domain, which could facilitate functioning of this enzyme at low temperatures. In addition, I. loihiensis has a mannose-sensitive hemagglutinin gene cluster (IL0363–IL0379), homologous to the one that is involved in biofilm formation in V. cholerae (24).

Motility, Chemotaxis, and Signal Transduction. Although biofilm formation is essential in forming a stable population of deep-sea microorganisms, motility and chemotaxis provide a mechanism to locate new surfaces in response to geochemical and thermal gradients (4). I. loihiensis has a single polar flagellum and encodes the full set of proteins involved in flagellar assembly and function. Most of the 45 flagella-associated genes are clustered in two large predicted operons (IL1147–IL1133 and IL1204–IL1187). Another large operon (IL1120–IL1110) combines flagellar biosynthesis (flhAFGfliA) and chemotaxis (cheYZAB, cheW) genes. The I. loihiensis genome encodes 15 methyl-accepting chemotaxis proteins (MCPs), compared to only five MCPs in E. coli. One of these MCPs (IL0176) contains a hemerythrin domain and could be involved in O2 sensing in a low-O2 environment. Indeed, the purified recombinant hemerythrin domain of IL0176 showed characteristic absorption spectra of oxygen binding (Fig. 9, which is published as supporting information on the PNAS web site).

Hydrothermal vent systems are dynamic submarine environments, which experience constant fluctuations in temperature, pressure, and O2 availability. Therefore, in contrast to marine cyanobacteria that inhabit a stable environment and encode few

signaling proteins (20), the vent microorganism I. loihiensis needs a versatile sensory transduction system. Indeed, it encodes 26 histidine kinases (including four hybrid kinases) and 36 response regulators. In addition, there are 34 signaling proteins containing the GGDEF domain, which have diguanylate cyclase activity and produce c-diGMP, a recently recognized secondary messenger in bacteria (25). Many of these proteins also contain the EAL domain, a likely c-diGMP phosphodiesterase (26), which is encoded in I. loihiensis in 22 copies. Several proteins containing GGDEF and EAL domains also contain periplasmic ligand-binding sensory domains, indicating their role in sensing extracellular cues. GGDEF and EAL domains are often linked to the production of extracellular polysaccharide (reviewed in ref. 26) and are likely to play a similar role in I. loihiensis.

Table 2. Growth requirements of I. loihiensis strain L2-TRT

Media                                        Expected from genome analysis   Experimental results
MM*                                          −                               −
MM + Casamino acids†                         +                               +
MM + 13 amino acids‡                         +                               +
MM + Asn, Cys, Gln, His, Lys, Pro, Ser       −                               −
MM + Arg, Met, Thr, Leu, Ile, Val            +                               −/+
MM + 12 aa − Asn                             +                               +
MM + 12 aa − Cys                             +                               +
MM + 12 aa − Gln                             +                               +
MM + 12 aa − His                             +                               +
MM + 12 aa − Lys                             +                               +
MM + 12 aa − Pro                             +                               +
MM + 12 aa − Ser                             +                               +
MM + 12 aa − Arg                             +                               +
MM + 12 aa − Met                             −                               +
MM + 12 aa − Thr                             −                               −
MM + 12 aa − Leu                             −                               +
MM + 12 aa − Ile                             −                               +
MM + 12 aa − Val                             −                               −

*The basal minimal medium was the CHB/E medium supplemented with SL-8 trace elements solution and 10% NaCl (6). The + sign indicates vigorous growth of I. loihiensis cells, − indicates no growth, and −/+ indicates slow growth.
†Casamino acids (Difco) is a mix of 18 amino acids.
‡The 13 amino acids are listed in the two following lines.

Fig. 5. A schematic diagram of lifestyle of I. loihiensis inferred from genome analysis.

Adaptation to the Vent Environment. Survival of a bacterium in the hydrothermal vent fluids depends on effective uptake of such elements as P, N, S, and Fe and detoxification of heavy metals. Phosphate uptake appears to be mediated by a typical ABC-type Pst system (IL2106–IL2109). I. loihiensis is capable of assimilation of nitrogen from both ammonia (IL2165) and nitrate (IL0183). Although the classical ABC-type sulfate transport system is missing in I. loihiensis, it encodes three proteins that could each serve as a sulfate transporter: an MFS superfamily permease (IL2129), a homolog of CysZ, another putative permease (IL1703), and a homolog of the sulfur deprivation response regulator YfbS (IL0057). Iron uptake mechanisms include siderophore-mediated transport (IL0795, -1581, and -2511) and an ABC-type transport system (IL1059–IL1061). Systems for detoxification of heavy metals include enzymes for reduction and extrusion of arsenate (IL0699, IL1454, IL1465, and IL2124–IL2125), export of cadmium and other divalent cations (IL0463, IL0464, IL0632, IL0774, IL0777–IL0779, IL1222, IL1242, IL1632, IL1636, IL1640, IL1641, and IL2587), and a mercury detoxification system (IL1643–IL1646).

Like other marine bacteria, I. loihiensis encodes a primary Na+ pump, the Na+-translocating NADH/ubiquinone oxidoreductase (IL1050–IL1045), and probably uses a sodium ion gradient as the source of energy for nutrient uptake. I. loihiensis also encodes other primary sodium pumps, namely, Na+-transporting oxaloacetate decarboxylase (IL1736–IL1734), ATP-dependent Na+ exporter NatAB (IL0274–IL0275 and IL0481–IL0482, which is unusual for γ-proteobacteria), and an RNF system (IL1794–IL1799) (27). However, I. loihiensis membrane ATP synthetase appears to be specific for H+ (28), indicating that membrane energetics of I. loihiensis does not depend on the sodium ion cycle. Indeed, I. loihiensis encodes several primary H+ pumps, namely, succinate dehydrogenase (IL1507–IL1504), bc1 complex (IL0416–IL0418, IL0097, and IL0205), cytochrome d-type quinol oxidase (IL0041–IL0042), and cytochrome c oxidase (IL0260–IL0257). Export of Na+ ions at the expense of the proton gradient is performed by a variety of Na+/H+ antiporters, including NhaC (IL0584, IL1276, IL2378), NhaD (IL0189), NhaP (IL1058 and IL1978), and the multisubunit Na+/H+ antiporter MnhABCDEFG (IL1907–IL1912). I. loihiensis encodes a Na+/proline symporter (IL0738), Na+/alanine symporter (IL1268), Na+/glutamate symporter (IL0983 and IL1037), Na+/uridine symporter, and several other predicted Na+-dependent permeases (IL0272, IL1178, IL1275, IL1331, IL1996, IL2299, IL2396, and IL2505). Finally, the I. loihiensis genome contains three copies of the gene encoding a Na+-dependent multidrug efflux pump (IL0250, IL1790, and IL1842).

I. loihiensis does not seem to synthesize organic compatible solutes such as ectoine (ectoine synthase EctC and aminotransferase EctP are missing), trehalose (trehalose-6P synthetase OtsA and trehalose-6P phosphatase OtsB are missing, as is trehalose synthase TreS), di-inositol phosphate (myo-inositol-1-P synthase INO1 is missing), or mannosylglycerate (mannosyl-3-phosphoglycerate synthase MpgS is missing). However, in line with its proteolytic metabolic life, the I. loihiensis genome encodes uptake systems for small peptides, as well as for proline, glutamate, and glycine betaine. Examples include two paralogous PutP-like Na+/proline symporters (IL0738 and IL1528), three copies of the GltP-like glutamate transporters (IL0983, IL1037, and IL2299), and two paralogous BetT-type choline/betaine transporters (IL1418 and IL2388).

Conclusions
The genome sequence of the deep-sea γ-proteobacterium I. loihiensis suggests that I. loihiensis relies primarily on amino acid fermentation, rather than on saccharolytic pathways, for carbon and energy. We propose that I. loihiensis is an opportunistic colonizer of proteinaceous particles in the deep-sea hydrothermal vent waters

(summarized in Fig. 5). Its versatile signal transduction system apparently senses dynamic changes in dissolved oxygen, temperature, and pressure, and initiates synthesis of highly viscous exopolysaccharide(s) for attachment to these proteinaceous particles. Once settled, it would secrete diverse proteases to break down proteins and peptides and employ sodium-dependent transport systems to import the degradation products. Complete genome sequencing showed that survival and growth of I. loihiensis in the constantly changing hydrothermal vent environment require multiple, flexible adaptation mechanisms.

This work was supported by the Engineering Research Center program of the National Science Foundation under Award EEC 9731725 and a University of Hawaii intramural bioinformatics grant.

1. Karl, D. M., McMurtry, G. M., Malahoff, A. & Garcia, M. O. (1988) Nature 335, 532–535.
2. Karl, D. M. (1995) in The Microbiology of Deep-Sea Hydrothermal Vents, ed. Karl, D. M. (CRC Press, Boca Raton, FL), pp. 35–124.
3. Hilton, D. R., McMurtry, G. M. & Goff, F. (1998) Nature 396, 359–362.
4. Reysenbach, A. L. & Shock, E. (2002) Science 296, 1077–1082.
5. Nelson, K. E. (2003) Environ. Microbiol. 5, 1223–1225.
6. Donachie, S. P., Hou, S., Gregory, T. S., Malahoff, A. & Alam, M. (2003) Int. J. Syst. Evol. Microbiol. 53, 1873–1879.
7. Kawarabayasi, Y., Sawada, M., Horikawa, H., Haikawa, Y., Hino, Y., Yamamoto, S., Sekine, M., Baba, S., Kosugi, H., Hosoyama, A., et al. (1998) DNA Res. 5, 55–76.
8. Nelson, K. E., Clayton, R. A., Gill, S. R., Gwinn, M. L., Dodson, R. J., Haft, D. H., Hickey, E. K., Peterson, J. D., Nelson, W. C., Ketchum, K. A., et al. (1999) Nature 399, 323–329.
9. Cohen, G. N., Barbe, V., Flament, D., Galperin, M., Heilig, R., Lecompte, O., Poch, O., Prieur, D., Querellou, J., Ripp, R., et al. (2003) Mol. Microbiol. 47, 1495–1512.
10. Koonin, E. V. & Galperin, M. Y. (2002) Sequence–Evolution–Function: Computational Approaches in Comparative Genomics (Kluwer Academic, Boston).
11. Slesarev, A. I., Mezhevaya, K. V., Makarova, K. S., Polushin, N. N., Shcherbinina, O. V., Shakhova, V. V., Belova, G. I., Aravind, L., Natale, D. A., Rogozin, I. B., et al. (2002) Proc. Natl. Acad. Sci. USA 99, 4644–4649.
12. Besemer, J., Lomsadze, A. & Borodovsky, M. (2001) Nucleic Acids Res. 29, 2607–2618.
13. Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J., Zheng, Z., Miller, W. & Lipman, D. J. (1997) Nucleic Acids Res. 25, 3389–3402.
14. Lowe, T. M. & Eddy, S. R. (1997) Nucleic Acids Res. 25, 955–964.
15. Tatusov, R. L., Fedorova, N. D., Jackson, J. D., Jacobs, A. R., Kiryutin, B., Koonin, E. V., Krylov, D. M., Mazumder, R., Mekhedov, S. L., Nikolskaya, A. N., et al. (2003) BMC Bioinformatics 4, 41.
16. Wolf, Y. I., Rogozin, I. B., Grishin, N. V., Tatusov, R. L. & Koonin, E. V. (2001) BMC Evol. Biol. 1, 8.
17. Venter, J. C., Remington, K., Heidelberg, J. F., Halpern, A. L., Rusch, D., Eisen, J. A., Wu, D., Paulsen, I., Nelson, K. E., Nelson, W., et al. (2004) Science 304, 66–74.
18. Obst, M., Oppermann-Sanio, F. B., Luftmann, H. & Steinbuchel, A. (2002) J. Biol. Chem. 277, 25096–25105.
19. Ivanova, E. P., Romanenko, L. A., Chun, J., Matte, M. H., Matte, G. R., Mikhailov, V. V., Svetashev, V. I., Huq, A., Maugel, T. & Colwell, R. R. (2000) Int. J. Syst. Evol. Microbiol. 50, 901–907.
20. Dufresne, A., Salanoubat, M., Partensky, F., Artiguenave, F., Axmann, I. M., Barbe, V., Duprat, S., Galperin, M. Y., Koonin, E. V., Legall, F., et al. (2003) Proc. Natl. Acad. Sci. USA 100, 10020–10025.
21. Brettar, I., Christen, R. & Hofle, M. G. (2003) Int. J. Syst. Evol. Microbiol. 53, 407–413.
22. Jannasch, H. W. & Taylor, C. D. (1984) Annu. Rev. Microbiol. 38, 487–514.
23. Raguenes, G., Cambon-Bonavita, M. A., Lohier, J. F., Boisset, C. & Guezennec, J. (2003) Curr. Microbiol. 46, 448–452.
24. Watnick, P. I., Fullner, K. J. & Kolter, R. (1999) J. Bacteriol. 181, 3606–3609.
25. Paul, R., Weiser, S., Amiot, N. C., Chan, C., Schirmer, T., Giese, B. & Jenal, U. (2004) Genes Dev. 18, 715–727.
26. Galperin, M. Y. (2004) Environ. Microbiol. 6, 552–567.
27. Hase, C. C., Fedorova, N. D., Galperin, M. Y. & Dibrov, P. A. (2001) Microbiol. Mol. Biol. Rev. 65, 353–370.
28. Dzioba, J., Hase, C. C., Gosink, K., Galperin, M. Y. & Dibrov, P. (2003) J. Bacteriol. 185, 674–678.
