The genome of Salinibacter ruber: Convergence and gene exchange among hyperhalophilic and

E. F. Mongodin*, K. E. Nelson*, S. Daugherty*, R. T. DeBoy*, J. Wister*, H. Khouri*, J. Weidman*, D. A. Walsh†, R. T. Papke†, G. Sanchez Perez†, A. K. Sharma†, C. L. Nesbø†, D. MacLeod†, E. Bapteste†, W. F. Doolittle†‡, R. L. Charlebois§, B. Legault¶, and F. Rodriguez-Valera¶

*The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, MD 20850; †Department of Biochemistry and Molecular Biology, 5850 College Street, Dalhousie University, Halifax, NS, Canada, B3H 1X5; §NeuroGadgets, Inc., Ottawa, ON, Canada K1G 4B5; and ¶Department of Microbiology, Universidad Miguel Hernandez, Apartado 18, 03550 San Juan de Alicante, Alicante, Spain

Contributed by W. F. Doolittle, October 19, 2005 Saturated thalassic brines are among the most physically demand- noids in its membrane, producing red colonies of similar ap- ing habitats on Earth: few microbes survive in them. Salinibacter pearance (2). However, the pigment found in Salinibacter (sali- ruber is among these organisms and has been found repeatedly in nixanthin) is a C-40 acyl glycoside carotenoid chemically related significant numbers in climax saltern crystallizer communities. The to the carotenoids found in R. marinus rather than the C-50 phenotype of this bacterium is remarkably similar to that of the bacterioruberins known from haloarchaea (6). hyperhalophilic Archaea (Haloarchaea). The genome sequence sug- The common features of Salinibacter and haloarchaea could gests that this resemblance has arisen through convergence at the have arisen through convergence at the physiological level physiological level (different genes producing similar overall phe- (different genes producing similar overall phenotype, as in the notype) and the molecular level (independent mutations yielding above-mentioned case of membrane carotenoids) or the molec- similar sequences or structures). Several genes and gene clusters ular level (independent mutations yielding similar sequences or also derive by lateral transfer from (or may have been laterally structures). Alternatively, genes may have been shared by lateral MICROBIOLOGY transferred to) haloarchaea. S. ruber encodes four rhodopsins. One gene transfer (LGT) between ancestors of these hyperhalophiles, resembles bacterial proteorhodopsins and three are of the halo- which we here define as organisms for whom saturated brines, archaeal type, previously uncharacterized in a bacterial genome. containing Ͼ5 M NaCl, are a natural habitat. The impact of these modular adaptive elements on the cell biology and ecology of S. ruber is substantial, affecting salt adaptation, Materials and Methods bioenergetics, and photobiology. The genome of S. ruber strain M31T DSM13855 (2) was se- quenced by the random shotgun method, with cloning, sequenc- ͉ lateral gene transfer ͉ convergence ͉ prokaryotic evolution ͉ ing, and assembly as described in ref. 7. Briefly, one small insert rhodopsins (2–3 kb) and one medium insert plasmid library (8–10 kb) were constructed by random nebulization and cloning of genomic ntil recently, halophilic archaea (haloarchaea) were thought to DNA. Each library was sequenced to an estimated 4-fold cov- Ube the only cells capable of thriving in saltern crystallizers. erage, and the sequences were assembled by using TIGR ASSEM- These impoundments contain Ϸ37% NaCl, at the limits of toler- BLER (8) or CELERA ASSEMBLER (9). All sequence and physical ance for this environmental factor. Further concentration of thalas- gaps were closed as described in ref. 7. ORFs were identified by sic (seawater-derived) hypersaline water leads to precipitation of using GLIMMER (10), and those shorter than 90 bp, as well as magnesium salts and sterility. Fluorescent in situ hybridization some of those with overlaps, were eliminated. Membrane pro- indicates that one crystallizer morphotype, well defined large rods, teins were detected by a hidden Markov method (11). Auto- corresponds to a bacterium of the Cytophaga cluster (1), within the mated gene annotation was performed as described in ref. 7. The Bacteroides͞Chlorobi group. This organism represents 10–20% of sequences have been submitted to GenBank [accession nos. the cells in climax crystallizer communities (spring and summer in CP000159 (chromosome) and CP000160 (plasmid)]. The Salini- temperate latitudes). Representative strains (as defined by 16S bacter automated genome annotation is available at http:͞͞ rRNA sequences) have been isolated from the same environment cmr.tigr.org͞tigr-scripts͞CMR͞CmrHomePage.cgi. and described as the previously uncharacterized genus and species Salinibacter ruber (2). Results The closest cultivated relative of S. ruber (henceforth Salini- Genome and Proteome Characteristics. The genome of strain M31T bacter)isRhodothermus marinus (89% 16S rRNA sequence DSM 13855, the type strain of S. ruber, comprises a 3,551,823-bp similarity), a slightly halophilic isolated from ma- chromosome of high GϩC content (66.29%) and a 35,505-bp rine hot springs (2). Salinibacter displays many remarkable plasmid (57.9% GϩC), containing 2,934 and 33 ORFs, respec- similarities to haloarchaea, one being a very high concentration tively. Table 1, which is published as supporting information on of potassium in the cytoplasm (3). This property is associated, as the PNAS web site, shows a comparison of general genomic in haloarchaea, with a high content of acidic amino acids and a features with those of Chlorobium tepidum TLS (Salinibacter’s low content of hydrophobic residues in bulk protein, necessary closest sequenced relative), Halobacterium sp. NRC-1 (an ar- for protein solubility at such high ionic strength (4). Cell integrity requires high salt concentrations in both cases, and growth only occurs at Ͼ2 M NaCl. Both Salinibacter and the haloarchaea are Conflict of interest statement: No conflicts declared. aerobic heterotrophs that exploit the large stock of organic Abbreviations: IS, insertion sequences; LGT, lateral gene transfer; SR, sensory rhodopsin. nutrients produced in previous stages of seawater concentration, Data deposition: The sequence reported in this paper has been deposited in the GenBank mostly by the green alga Dunaliella, and they use a similar range database [accession nos. CP000159 (chromosome) and CP000160 (plasmid)]. of organic compounds as carbon and energy sources (5). Like ‡To whom correspondence should be addressed. E-mail: [email protected]. haloarchaea, Salinibacter contains a high proportion of carote- © 2005 by The National Academy of Sciences of the USA

www.pnas.org͞cgi͞doi͞10.1073͞pnas.0509073102 PNAS ͉ December 13, 2005 ͉ vol. 102 ͉ no. 50 ͉ 18147–18152 Downloaded by guest on October 2, 2021 NRC-1, 4.6; Haloarcula marismortui, 4.6) than those of B. fragilis (7.0) or C. tepidum (7.0). In some cases (see below), this increase in pI may reflect orthologous replacement of genes encoding the less acidic proteins of the Bacteroides͞Chlorobi group by LGT from haloarchaea. But because the majority of Salinibacter’s genes have their closest relative outside the haloarchaea (Fig. 2), residue-by-residue selection during adaptation of its ancestors to hypersaline conditions must be more often the explanation for such extensive convergence at the molecular level.

Phylogeny and Lateral Gene Transfer. Phylogenetic analysis of its 16S rRNA sequence places Salinibacter within the Bacteroides͞ Chlorobi group, with its closest relative R. marinus (2). The whole genome methods described by Gophna and colleagues (16) show C. tepidum and B. fragilis to be its closest relatives among bacteria with sequenced genomes. To assess the nature and number of potential LGTs into this genome, we used a ‘‘competitive matching’’ query (17). In this analysis, any ORF that has a match only to genomes outside the Bacteroides͞ Chlorobi group or a match at least 0.05 normalized BLAST score units better outside this group was counted as a potential LGT; 1,470 ORFs showed such discordant similarities by this BLAST- Fig. 1. Normalized distribution of pI values at 0.2 intervals for predicted ORFs based method. The 10 most frequently matched genomes outside in Haloarcula marismortui (purple), Halobacterium sp. NRC-1 (red), Salini- Bacteroides͞Chlorobi are displayed in Fig. 2 (black bars). bacter (blue), C. tepidum (green), and B. fragilis (cyan). Predicted pI values of BLAST analyses can mislead us for many reasons, and surely the the proteins were calculated by using NeuroGadgets Bioinformatics Web number of within-group best matches will rise (apparent trans- Service. fers fall) when more genome sequences appear from members of Bacteroides͞Chlorobi. Phylogenetic reconstruction may be a chaeal halophile), and Bacillus halodurans (a halotolerant bac- more reliable method for identifying LGTs, although applicable terium). Previous analysis of halophilic archaeal genomes (such only when adequate taxa are available and a rooting is possible. For each of these 10 genomes, we examined maximum likelihood as Halobacterium sp. NRC-1, Haloarcula marismortui and ͞ Haloferax volcanii) has suggested a bipartite organization, with a trees generated by PHYML (18) (with an assumed Bacteria high overall GϩC content but smaller replicons and chromosmal Archaea rooting) in which Salinibacter was the deepest branching islands of lower GϩC (12, 13), the low GϩC regions bearing member of a clade containing that genome. The total number of such candidate trees examined for specific support of transfer to most of the insertion sequences (IS) and phage-related elements or from Salinibacter and the indicated genome is shown by the (12). In Salinibacter, chromosomal low GϩC islands are also gray bars in Fig. 2. In the pie charts, the fraction of these trees enriched in such elements (Fig. 5, which is published as sup- actually supporting LGT from the indicated genome to Salini- porting information on the PNAS web site). The first of the low bacter is shown in blue, those supporting LGT from Salinibacter GϩC islands is located 250 kb from the inferred origin of to that genome in chartreuse, those supporting LGT of uncertain replication and contains 15 transposases and five prophage (either) direction in green, and those not supporting an LGT components (including three glycosyl-transferases). The second, with this candidate genome in red. Phylogeny, like BLAST, spanning over 55 kb, contains 12 transposases, several prophage- identified haloarchaea as the most frequent donors or recipients related ORFs and, unexpectedly, a number of ORFs involved in in LGTs involving Salinibacter, although the total number of O chain polysaccharide (or capsular polysaccharide) synthesis. ϩ apparent transfers between Salinibacter and haloarchaea ap- The third low G C island (39 kb) contains the only restriction- pears to be modest. (And aside from the haloarchaea and modification system found in this organism. The Salinibacter Rhodopirellula baltica, many LGTs indicated by BLAST are not plasmid codes for 19 hypothetical and conserved hypothetical supported by trees.) Where appropriate in the following sections genes (57.6% of the plasmid ORFs) but also contains several of this report, we highlight individual instances in which LGT ORFs involved in DNA metabolism, replication, and recombi- between these groups nevertheless has likely been a key factor ࿝ nation, and a gene involved in UV protection (SRU p0002). It in adaptation to the harsh saline environment. also encodes an IS5 family transposon element not found in the chromosome. Physiology. Salinibacter has a full complement of fermentative One striking feature of Salinibacter biology is its high intra- components and genes involved in transport and degradation of cellular potassium concentration, an adaptation to living in organic compounds. A chitinase, two cellulases, one amylase, a hypersaline conditions previously observed only in haloarchaea. pectinase, and several proteases and lipases were among the Consistent with this feature, Salinibacter proteins, like those of polymer-degrading enzymes encoded on the chromosome. Ad- haloarchaea, have a high content of acidic residues, mainly ditionally, the plasmid encodes a 1,4-␤-cellobiosidase homolog. aspartate, and a low content of basic residues, particularly lysine Genes related to the transport and metabolism of glycerol and (Fig. 1). This observation agrees with the previous estimation of glycine betaine (the most common compatible solutes found in the amino acid composition from the bulk protein (4). The more moderate and halotolerant species) were also genome sequence allows a global analysis of the predicted present. The broad degradative abilities fit with the expected isoelectric points (pI) of the proteome. A biomodal distribution complexity of the organic pool present in the crystallizer as a of pI values differentiating cytoplasmic and integral membrane result of the lysis of biomass produced at lower salinities. proteins, characteristic of most proteomes (14, 15), is shown by Contrary to a previous suggestion (19), glycolysis appears to take Bacteroides fragilis and C. tepidum but not by the haloarchaea place through an Embden-Meyerhoff pathway and not a mod- (Fig. 1). Salinibacter is intermediate in character, with a median ified Entner-Doudoroff pathway. pI value (5.2) nearer those of haloarchaea (Halobacterium sp. Like the haloarchaea, Salinibacter has all of the genes for a

18148 ͉ www.pnas.org͞cgi͞doi͞10.1073͞pnas.0509073102 Mongodin et al. Downloaded by guest on October 2, 2021 Fig. 2. The nature and number of potential lateral gene transfers involving Salinibacter. Black bars correspond to the number of Salinibacter genes for which this genome is the best BLAST hit. Gray bars correspond to the number of genes for which maximum likelihood phylogenetic analyses (18) indicated a sister relationship with this genome and, thus, a potential LGT between the two species. The pie charts show whether LGT between Salinibacter and the indicated

genome was supported and, if so, its direction. Of those trees, the blue fraction support transfer from the indicated genome to Salinibacter, the chartreuse MICROBIOLOGY fraction support transfer from Salinibacter to this genome, the green fraction support transfer between Salinibacter and this genome for which no decision on direction can be made, and the red fraction do not support transfer between Salinibacter and this genome. When several genomes made up the sister group to or from which LGT might be inferred, no simple LGT event was assumed. Thus, complex or ancient events were not scored here.

complete tricarboxylic acid cycle and a cytochrome c-containing terium. Strangely, Salinibacter does not possess a ccoP gene, respiratory chain. It possesses a succinate dehydrogenase gene thought to be an essential subunit of cbb3-type cytochrome cluster (SRU࿝0484–0487) very similar to Actinobacteria and a oxidase (25). However, next to ccoON is a gene encoding a second succinate dehydrogenase flavoprotein subunit, sdhA cytochrome c protein (SRU࿝0324), which also appears to have (SRU࿝2444), most similar to sdhA2 from H. marismortui. Sim- been transferred from a ␤-proteobacterium and may substitute ilarly, comparison of respiratory chain proteins in Salinibacter for CcoP function in Salinibacter. indicates there are two clusters of cytochrome c oxidase subunit I and II genes, the first with coxA1 and coxB1 (SRU࿝2099 and Haloadaptation. A cluster of 19 genes (Fig. 3) includes Kϩ SRU࿝2100) and the second with coxA2 and coxB2 (SRU࿝0314 uptake͞efflux systems and cationic amino acid transporters of and 0313). Phylogenetic analyses demonstrate coxA1 and coxB1 crucial importance to a hyperhalophilic lifestyle. This ‘‘hyper- genes, which are also next to the genes for subunits III salinity island’’ is mosaic in nature, apparently pieced together (SRU࿝2098) and IV (SRU࿝2097), are related to those of R. from a variety of bacterial and archaeal sources. More than marinus, whereas coxB2 and coxA2 are clearly most closely one-half of the genes are most similar to their haloarchaeal related to their haloarchaeal homologs (Fig. 6, which is published homologs, including the cationic amino acid transporters, one as supporting information on the PNAS web site). This second trkH gene, and three trkA homologs. The Trk system is respon- gene cluster may have been imported from the haloarchaea, a sible for the uptake of Kϩ, where TrkH is the membrane bound shared adaptation to the microoxic conditions often associated translocating subunit and TrkA is a cytoplasmic membrane with hypersalinity, perhaps functioning in a novel respiratory surface protein that binds NADϩ (26). The existence of multiple electron transfer pathway. In support of this notion, a nosZDF trkA genes in Salinibacter suggests complex regulation of the Trk gene cluster and an oxygen-sensitive Cpr͞Fnr-type transcription system, a feature likely shared with haloarchaea. This complexity regulator (20) are found near to coxA2 and coxB2 in Salinibacter. is increased by the presence of yet an additional TrkAH system The nos cluster encodes a nitrous oxide reductase similar to that in the hypersalinity island that is most similar to that found in found in a variety of proteobacteria (21, 22) and in H. maris- Firmicutes (Fig. 3). mortui. Phylogenetic analyses of NosZ (SRU࿝0308) and NosD The recruitment of several nonhaloarchaeal hypersalinity genes (SRU࿝0310) demonstrate high similarity between Salinibacter may have involved an island-encoded IS1 transposase. In Esche- and H. marismortui; however, the exact relationship remains richia coli, IS1 has two overlapping ORFs, insA and insB, and unresolved (Fig. 7, which is published as supporting information translational frameshift results in the production of InsAB, a on the PNAS web site). functional IS1 transposase (27). There are four insAB-like genes in Salinibacter may have other unique characteristics with respect Salinibacter and a similar expression mechanism may exist in this to microoxic adaptation. Unlike haloarchaea but like R. marinus bacterium. Genes with significant similarity to the Salinibacter (23), it possesses components of a cbb3-type cytochrome oxidase insAB have been found only in cyanobacteria, Parachlamydia, and (ccoO and ccoN) (SRU࿝0323 and SRU࿝0322). Cytochrome c members of two archaeal orders, Methanosarcinales and Sul- oxidases of the cbb3 type have a very high affinity for O2 and folobales. This patchy phylogenomic distribution of the IS1 trans- ϩ allow respiration to continue under low-O2 levels (24). Both posase is mirrored in the distribution of the KefB K efflux system ccoO and ccoN were acquired likely by LGT from a ␤-proteobac- proteins (28) and the Na-K-Cl cotransport proteins (29) present in

Mongodin et al. PNAS ͉ December 13, 2005 ͉ vol. 102 ͉ no. 50 ͉ 18149 Downloaded by guest on October 2, 2021 Fig. 3. A schematic representation of the hypersalinity island identified in the genome of Salinibacter. Genes are color-coded with respect to their closest matches in BLAST sequence similarity searches: haloarchaea, red; cyanobacteria, green; methanogenic archaea, yellow; firmicutes, blue.

the island, supporting the notion that these genes have passed as a about Salinibacter photobiology and the function of the two single unit through an archaeal or cyanobacterial ‘‘host’’ before genes homologous to SRs still can be inferred only from the settling into the Salinibacter genome. The most striking examples of genomic context. Nevertheless, there seems little doubt that this observation are the Na-K-Cl cotransporters, which are found SRU࿝2579 is a photosensor, because the next ORF, 86 nucleo- throughout the eukaryotes and function as integral membrane tides downstream, is an Htr1 transducer, as in Halobacterium sp. transport proteins (29). Outside the eukaryotes, we could identify orthologs in the genomes of only five prokaryotes, all of which also possessed a Salinibacter-like IS1 transposase. The origin of the prokaryotic Na-K-Cl cotransporters remains enigmatic, but it is possible that they have been exchanged among very distantly related organisms, including Salinibacter. In addition, these findings suggest the convergence on an aerobic hyperhalophilic lifestyle between haloarchaea and Salinibacter was mediated, in part, by interdomain lateral gene transfer.

Rhodopsins and Retinal. The discovery of the proton pump bacte- riorhodopsin in the haloarchaea Halobacterium salinarum in the early 1970s launched a new fruitful field of bioenergetics and biophysics, and this protein and its relatives were seen as a defining invention of the archaea (30). More recently, rhodopsin-based photobiology has been found in other groups of prokaryotes (31, 32) and in unicellular eukaryotes (33, 34). The possibility that Salinibacter might also have rhodopsin genes, derived from the haloarchaea with which it lives, was one of the initial motivations behind the present work. Indeed, like Halobacterium sp. NRC-1 and Haloarcula marismortui, and unlike any other characterized bacte- rium, Salinibacter contains four rhodopsin genes (Fig. 4A). Three of the Salinibacter rhodopsins group with haloarchaeal rhodopsins in phylogenetic reconstructions and might be in- ferred, from the nature of neighboring genes in the Salinibacter genome, to have similar functions. Salinibacter SRU࿝2780 forms a basal member of the haloarchaeal halorhodopsin clade and, like them, is immediately upstream of an oxidoreductase gene (possibly a regulator of pump activity) (Fig. 4E).ANaϩ͞Hϩ antiporter is located only three genes away in Salinibacter and a Naϩ͞substrate cotransporter is adjacent. In both archaeal ge- nomes, trkA (a potassium symporter gene) is found nearby. This observation, together with relatively high sequence similarity, supports the same function, as an inward-directed chloride pump, for the Salinibacter protein. Salinibacter has two presumptive sensory rhodopsin (SR) genes, 83 kb apart on the genome and distantly related to each other and to the haloarchaeal SRs, specifically SRI. In Halobac- terium, two SRs (SRI and SRII) interact to produce a color- sensitive phototactic behavior. SRI is only synthesized under low oxygen tension (as are the proton pump BR and the chloride Fig. 4. The phylogeny and genomic context of Salinibacter rhodopsin genes. pump HR), mediating the attraction to orange light that drives (A) Maximum likelihood phylogeny of rhodopsin genes inferred from 212 the ion pumps (35). SRII is produced at high oxygen tensions conserved amino acid positions by PHYML (18). The evolutionary model and ϩ ϩ ␥ ϩ when the bioenergetics of the cell are driven by the respiratory parameters used in the analyses were WAG F I. Support for nodes correspond to 100 bootstrap replicates. Scale bar indicates the number of chain and mediates a photophobic response to blue light (35). In positions per site. (B–E) Gene context of Salinibacter rhodopsin genes. Darker haloarchaea, the signaling function of both SRs depends on their boxes suggest possible cotranscribed and͞or coregulated gene sets (same tight association with transducers that block their potential ion transcription direction and Ͻ250 bp separating any two adjacent genes within transport activity and transform them into photosensors (36). the box) Genes colored in yellow indicate those with homologs in halophilic Ϫ Until very recently (see below), there has been no information archaea with BLAST expect value Ͻ1 10.

18150 ͉ www.pnas.org͞cgi͞doi͞10.1073͞pnas.0509073102 Mongodin et al. Downloaded by guest on October 2, 2021 NRC-1 (Fig. 4D). The second putative SR in Salinibacter Discussion ࿝ (SRU 2511) is also tightly linked to nearby signal transduction Salinibacter has adapted to hypersalinity in three general ways. genes (Fig. 4C), pointing to a photosensory function and poten- First, Salinibacter must have modified the sequences of many tially complex photobehavior in Salinibacter. Notably, although of its proteins. Whole and partial proteome studies of halo- there is no doubt of the relatively close relationships among the archaea show that these hyperhalophilic organisms have ac- sensory rhodopsins of Salinibacter and the haloarchaea, the commodated high internal ionic strength by replacing neutral transducers that are tightly linked at the genetic (and presumably amino acids with acidic ones (44). This situation must also be functional) level are not closely related to those of haloarchaea the case for Salinibacter because an acidic proteome is not a (Fig. 8, which is published as supporting information on the ͞ PNAS web site). Their closest relatives outside the Salinibacter property of its relatives among the Bacteroides Chlorobi genome are transducers of other bacteria. Thus, physiological group (Fig. 2). Although some genes imported from haloar- convergence between Salinibacter and haloarchaea likely has chaea may have produced proteins already adapted in this way, been effected through independent gene recruitment processes. these comprise only a fraction of Salinibacter’s genes. A Downstream and tightly linked to the first Salinibacter SR gene comparative study of orthologous proteins in Salinibacter and discussed above (Fig. 4D) (SRU࿝2579), there is a typically haloarchaea and their closest nonhalophilic relatives should bacterial flagellar cluster that might be coregulated with it: SR tell us much about convergent evolution at the level of protein function in phototaxis would be useful only when cells are structure. actively motile; members of the Bacteroides͞Chlorobi group Second, Salinibacter exhibits many cases of convergence at the typically lack flagella. However, a polar flagellum has been level of physiology, in which proteins from different sources or observed in Salinibacter’s closest relative R. marinus (37). Phy- with different original functions have been recruited to create a logenetic analyses of the genes within this flagellar gene cluster complex adaptation to hypersaline life that parallels a structure, indicate a mosaic structure, with many genes showing a close pathway or behavior already known from haloarchaea. The affinity to the ␦͞␧-proteobacteria. Perhaps the flagellum was a mix-and-match of genes of haloarchaeal and bacterial origin in recent acquisition by Salinibacter͞Rhodothermus. With respect to the postulated phototaxis system will likely provide a good motility, a general, although not universal, character of the example of analogous complex collective functions served by Bacteroides group is gliding motility. Studies in Flavobacterium nonhomologous component parts. The recent discovery that johnsoniae have identified a suite of genes required for the xanthorhodopsin (Salinibacter’s proteorhodopsin-like gliding motility phenotype (gldABDFGH) (38). Salinibacter pos- SRU࿝1500) functions in a light-harvesting complex with salin- MICROBIOLOGY sesses homologs of the gldA, gldF, and gldG genes (SRU࿝1249– ࿝ ixanthin similarly provides a striking analogy with chlorophyll- SRU 1251), which are thought to form an ATP-binding cassette based light-harvesting systems (39). transporter in F. johnsoniae, but it lacks the critical lipoprotein Third, some adaptations common among halophilic organ- components (gldB, gldD, and glDH) that are required for gliding. isms have been passed between them by LGT. The rhodopsins The fourth Salinibacter rhodopsin gene (SRU࿝1500) appears to be most closely related to those found in cyanobacteria and forms shared by Salinibacter and the haloarchaea may be the best case a clade with organisms that encode diverse rhodopsin genes, which in point. Overall, the specific functionally important residues include the proteorhodopsins found among uncultured marine in the Salinibacter rhodopsins match those of the haloarchaeal bacteria and known to act (at least when expressed heterologously proteins to which the rhodopsins are homologous, and the two in E. coli) as a proton pump (31). A recent study by Balashov et al. SR-transducer homologs both appear to be sensory rhodopsin (39) confirmed the proton pumping activity of the Salinibacter photoreceptor transducers, like their haloarchaeal Htrl ho- protein (presumably encoded by SRU࿝1500) and identified a novel mologs, rather than chemoreceptor taxis proteins (John Spu- light-harvesting complex (xanthorhodopsin) that not only includes dich, personal communication). Unless we imagine that the the chromophore retinal but utilizes a carotenoid antenna (salin- last universal common ancestor had a full complement of ixanthin) and enables the rhodopsin to absorb light across a greater rhodopsins, the presence of several identifiable classes of this absorbance spectrum than is possible with retinal alone. Salinix- protein in Salinibacter and haloarchaea is best explained by anthin, as noted in the Introduction, is responsible for the red color multiple events of LGT. Although proteorhodopsin-like genes of Salinibacter, comprises nearly 100% of its carotenoid, and is have been described in several bacterial phyla (45), we note structurally related to the major carotenoid pigment found in that Salinibacter is the only bacterium in which genes that R. marinus (6). The gene encoding xanthorhodopsin is linked to two cluster specifically with haloarchaeal rhodopsin genes are ORFs (crtY and crtO) (SRU࿝1501 and SRU࿝1502) that code for known. It might seem natural to assume that Salinibacter has proteins responsible for important steps in both carotenoid and derived these genes from archaea, by LGT, but we cannot be ␤ retinal synthesis (Fig. 4B). Lycopene -cyclase (CrtY) is required certain. Salinibacter’s two sensory rhodopsins appear to have for the formation of ␤-carotene, a precursor to both retinal and ␤ derived from one of the two haloarchaeal sensory rhodopsin subsequent carotenoid production, whereas -carotene ketolase paralogs (as if after the duplication in that lineage), but its (CrtO) is likely responsible for an important step in salinixanthin halorhodopsin diverged before the diversification of haloar- biosynthesis (40, 41). chaeal forms. Many of the other ‘‘haloarchaeal genes’’ found Until recently, the manner by which retinal was produced in in Salinibacter could just as easily be ‘‘Salinibacterial genes’’ bacteria was unknown. A metagenomic study of proteorhodopsin- containing BAC clones from marine waters (42) indicate that these introduced by LGT into haloarchaeal genomes. Indeed, the organisms may be synthesizing retinal by a mechanism first iden- respiratory chain and associated functions in haloarchaea are tified in haloarchaea, which have been shown to use bacteriorho- widely believed to have been of bacterial origin (46, 47), and dopsin-related protein (Brp) and its paralog, bacteriorhodopsin- Salinibacter’s ancestors could have been one source. Bacterial related protein like homolog (Blh) to produce retinal from and archaeal cells have likely shared the saltern habitat for symmetrical cleavage of ␤-carotene (43). A divergent homolog of millennia, with ample chance to exchange genes. The notion of Blh (20% amino acid identity with the archaeal version) was found a ‘‘habitat genome’’ (or a pool of genes useful for adaptation linked to proteorhodopsin on BAC clones (42), whereas in Salini- under a specific set of environmental constraints) is appealing. bacter, homologs of both brp and blh are present in the genome Such a gene pool could be analogous (in an evolutionary time (unlinked to rhodopsin genes) and exhibit close relation to those scale) to the gene pool in a metazoan (or plant) zygote from found in haloarchaea rather than marine bacteria. which different tissues extract the required components (48).

Mongodin et al. PNAS ͉ December 13, 2005 ͉ vol. 102 ͉ no. 50 ͉ 18151 Downloaded by guest on October 2, 2021 We thank Tom Cavalier-Smith for alerting us to the significance of the supported by Canadian Institutes for Health Research Grant MOP4467, presence of genes for flagellar components. Sequencing and annotation Genome Atlantic, and the Generic Architecture for Customised IP- were supported by National Science Foundation͞U.S. Department of Based IN Services over Hybrid VoIP and SS7 Project QLK3-CT-2002- Agriculture Grant EF0333190. Additional bionformatic analyses were 02056 of the European Commission.

1. Anton, J., Rossello-Mora, R., Rodriguez-Valera, F. & Amann, R. (2000) Appl. 23. Pereira, M. M., Carita, J. N., Anglin, R., Saraste, M. & Teixeira, M. (2000) Environ. Microbiol. 66, 3052–3057. J. Bioenerg. Biomembr. 32, 143–152. 2. Anton, J., Oren, A., Benlloch, S., Rodriguez-Valera, F., Amann, R. & 24. Pitcher, R. S. & Watmough, N. J. (2004) Biochim. Biophys. Acta 1655, 388–399. Rossello-Mora, R. (2002) Int. J. Syst. Evol. Microbiol. 52, 485–491. 25. Cosseau, C. & Batut, J. (2004) Arch. Microbiol. 181, 89–96. 3. Oren, A., Heldal, M., Norland, S. & Galinski, E. A. (2002) 6, 26. Kraegeloh, A., Amendt, B. & Kunte, H. J. (2005) J. Bacteriol. 187, 1036–1043. 491–498. 27. Escoubas, J. M., Lane, D. & Chandler, M. (1994) J. Bacteriol. 176, 5864–5867. 4. Oren, A. & Mana, L. (2002) Extremophiles 6, 217–223. 28. Epstein, W. (2003) Prog. Nucleic Acid Res. Mol. Biol. 75, 293–320. 5. Sher, J., Elevi, R., Mana, L. & Oren, A. (2004) FEMS Microbiol. Lett. 232, 29. Park, J. H. & Saier, M. H., Jr. (1996) J. Membr. Biol. 153, 171–180. 211–215. 30. Oesterhelt, D. & Stoeckenius, W. (1973) Proc. Natl. Acad. Sci. USA 70, 6. Lutnaes, B. F., Strand, A., Petursdottir, S. K. & Liaaen-Jensen, S. (2004) 2853–2857. Biochem. Syst. Ecol. 32, 455–468. 31. Beja, O., Aravind, L., Koonin, E. V., Suzuki, M. T., Hadd, A., Nguyen, L. P., 7. Fouts, D. E., Mongodin, E. F., Mandrell, R. E., Miller, W. G., Rasko, D. A., Jovanovich, S. B., Gates, C. M., Feldman, R. A., Spudich, J. L., et al. (2000) Ravel, J., Brinkac, L. M., DeBoy, R. T., Parker, C. T., Daugherty, S. C., et al. Science 289, 1902–1906. (2005) PLoS Biol 3, e15. 32. Jung, K. H., Trivedi, V. D. & Spudich, J. L. (2003) Mol. Microbiol. 47, 8. Sutton, G. G., White, O., Adams, M. D. & Kerlavage, A. R. (1995) Genome Sci. 1513–1522. Technol. 1, 9–19. 33. Bieszke, J. A., Spudich, E. N., Scott, K. L., Borkovich, K. A. & Spudich, J. L. 9. Myers, E. W., Sutton, G. G., Delcher, A. L., Dew, I. M., Fasulo, D. P., Flanigan, (1999) Biochemistry 38, 14138–14145. M. J., Kravitz, S. A., Mobarry, C. M., Reinert, K. H., Remington, K. A., et al. 34. Sineshchekov, O. A., Jung, K.-H. & Spudich, J. L. (2002) Proc. Natl. Acad. Sci. (2000) Science 287, 2196–2204. USA 99, 8689–8694. 10. Delcher, A. L., Harmon, D., Kasif, S., White, O. & Salzberg, S. L. (1999) 35. Spudich, J. L. (1998) Mol. Microbiol. 28, 1051–1058. Nucleic Acids Res. 27, 4636–4641. 36. Spudich, J. L. (1994) Cell 79, 747–750. 11. Krogh, A., Larsson, B., von Heijne, G. & Sonnhammer, E. L. (2001) J. Mol. 37. Bjornsdottir, S. H., Blondal, T., Hreggvidsson, G. O., Eggertsson, G., Peturs- Biol. 305, 567–580. dottir, S., Hjorleifsdottir, S., Thorbjarnardottir, S. H. & Kristjansson, J. K. 12. Baliga, N. S., Bonneau, R., Facciotti, M. T., Pan, M., Glusman, G., Deutsch, (August 2, 2005) Extremophiles, 10.1007͞s00792-005-0466-z. E. W., Shannon, P., Chiu, Y., Weng, R. S., Gan, R. R., et al. (2004) Genome 38. McBride, M. J. (2004) J. Mol. Microbiol. Biotechnol. 7, 63–71. Res. 14, 2221–2234. 39. Balashov, S. P., Imasheva, E. S., Boichenko, V. A., Anton, J., Wang, J. M. & 13. Ng, W. V., Kennedy, S. P., Mahairas, G. G., Berquist, B., Pan, M., Shukla, H. D., Lanyi, J. K. (2005) Science 309, 2061–2064. Lasky, S. R., Baliga, N. S., Thorsson, V., Sbrogna, J., et al. (2000) Proc. Natl. 40. Sieiro, C., Poza, M., de Miguel, T. & Villa, T. G. (2003) Int. Microbiol. 6, 11–16. Acad. Sci. USA 97, 12176–12181. 41. Tao, L. & Cheng, Q. (2004) Mol. Genet. Genomics 272, 530–537. 14. Knight, C. G., Kassen, R., Hebestreit, H. & Rainey, P. B. (2004) Proc. Natl. 42. Sabehi, G., Loy, A., Jung, K. H., Partha, R., Spudich, J. L., Isaacson, T., Acad. Sci. USA 101, 8390–8395. Hirschberg, J., Wagner, M. & Beja, O. (2005) PLoS Biol. 3, e273. 15. Schwartz, R., Ting, C. S. & King, J. (2001) Genome Res. 11, 703–709. 43. Peck, R. F., Echavarri-Erasun, C., Johnson, E. A., Ng, W. V., Kennedy, S. P., 16. Gophna, U., Doolittle, W. F. & Charlebois, R. L. (2005) J. Bacteriol. 187, Hood, L., DasSarma, S. & Krebs, M. P. (2001) J. Biol. Chem. 276, 5739–5744. 1305–1316. 44. Oren, A. (2002) Halophilic Microorganisms and Their Environments (Kluwer, 17. Gophna, U., Charlebois, R. L. & Doolittle, W. F. (2004) Trends Microbiol. 12, Dordrecht, The Netherlands). 213–219. 45. de la Torre, J. R., Christianson, L. M., Be´ja`, O., Suzuki, M. T., Karl, D. M., 18. Guindon, S. & Gascuel, O. (2003) Syst. Biol. 52, 696–704. Heidelberg, J. & DeLong, E. F. (2003) Proc. Natl. Acad. Sci. USA 100, 19. Oren, A. & Mana, L. (2003) FEMS Microbiol. Lett. 223, 83–87. 12830–12835. 20. Korner, H., Sofia, H. J. & Zumft, W. G. (2003) FEMS Microbiol. Rev. 27, 46. Boucher, Y., Douady, C. J., Papke, R. T., Walsh, D. A., Boudreau, M. E., 559–592. Nesbo, C. L., Case, R. J. & Doolittle, W. F. (2003) Annu. Rev. Genet. 37, 21. Simon, J., Einsle, O., Kroneck, P. M. & Zumft, W. G. (2004) FEBS Lett. 569, 283–328. 7–12. 47. Kennedy, S. P., Ng, W. V., Salzberg, S. L., Hood, L. & DasSarma, S. (2001) 22. Velasco, L., Mesa, S., Xu, C. A., Delgado, M. J. & Bedmar, E. J. (2004) Antonie Genome Res. 11, 1641–1650. Van Leeuwenhoek 85, 229–235. 48. Rodriguez-Valera, F. (2002) Environ. Microbiol. 4, 628–633.

18152 ͉ www.pnas.org͞cgi͞doi͞10.1073͞pnas.0509073102 Mongodin et al. Downloaded by guest on October 2, 2021