
The loss of the H2S-binding function in annelids from sulfide-free habitats reveals molecular adaptation driven by Darwinian positive selection

Xavier Bailly*†‡, Riwanon Leroy†, Susan Carney§, Olivier Collin¶, Franck Zal†, André Toulmond†, and Didier Jollivet*

*Equipe Evolution et Génétique des Populations Marines and †Equipe Ecophysiologie, Station Biologique de Roscoff, Unité Mixte de Recherche 7127, Centre National de la Recherche Scientifique, Université Pierre et Marie Curie, 29680 Roscoff, France; §Department of , Pennsylvania State University, 208 Mueller Lab, University Park, PA 16802; and ¶Station Biologique de Roscoff, Service Informatique, Centre National de la Recherche Scientifique FR2424, 29680 Roscoff, France

EVOLUTION

Edited by Tomoko Ohta, National Institute of Genetics, Mishima, Japan, and approved March 14, 2003 (received for review December 18, 2002)

The hemoglobin of the deep-sea vestimentiferan Riftia pachyptila is able to bind toxic hydrogen sulfide (H2S) to free cysteine residues and to transport it to fuel endosymbiotic sulfide-oxidising bacteria. The cysteine residues are conserved key amino acids in annelid globins living in sulfide-rich environments, but are absent in annelid globins from sulfide-free environments. Synonymous and nonsynonymous substitution analyses from two different sets of orthologous annelid globin genes from sulfide-rich and sulfide-free environments have been performed to understand how the sulfide-binding function of hemoglobin appeared and has been maintained during the course of evolution. This study reveals that the sites occupied by free cysteine residues in annelids living in sulfide-rich environments, and occupied by other amino acids in annelids from sulfide-free environments, have undergone positive selection in annelids from sulfide-free environments. We assumed that the high reactivity of cysteine residues became a disadvantage when H2S disappeared because free cysteines without their natural ligand had the capacity to interact with other components, disturb homeostasis, reduce fitness and thus could have been counterselected. To our knowledge, we point out for the first time a case of function loss driven by molecular adaptation rather than genetic drift. If constraint relaxation (H2S disappearance) led to the loss of the sulfide-binding function in modern annelids from sulfide-free environments, our work suggests that adaptation to sulfide-rich environments is a plesiomorphic feature, and thus that the annelid ancestor could have emerged in a sulfide-rich environment.

sulfide binding function | free cysteine | annelid evolution | loss of function

This paper was submitted directly (Track II) to the PNAS office.

Abbreviations: SBD, sulfide-binding domain; HCA, hydrophobic cluster analysis.

Data deposition: The sequences reported in this paper have been deposited in the GenBank database (accession nos. AY250083–AY250087 and AY273262–AY273265).

‡To whom correspondence should be addressed. E-mail: [email protected].

Emergence of new functions in proteins as a result of a high evolutionary rate after gene duplication has been long debated in the molecular evolution field. From the neutralist standpoint, molecular evolution occurs by random drift of mutations that are nearly equivalent selectively. In this context, “it is much more likely, if high rates occur, that they are caused by the removal of a preexisting functional constraint, allowing previously harmful mutants to become selectively neutral” (1). For “selectionists,” high evolutionary rates are considered rather as the result of an acceleration of mutations called positive Darwinian selection, the likely evolutionary force for the acquisition of new functions after a duplication event (2). According to Ohta’s consensual theory (3), positive Darwinian selection is needed for the accumulation of favorable mutations that provide a new function or a modified function to a (duplicated) gene, whereas a gene whose function has been fixed for a long time evolves mostly through random genetic drift. However, molecular adaptation and emergence of a new function driven by positive Darwinian selection are not always associated with duplication events. Transition from homogeneous to heterogeneous habitats could also play a role in the evolution of an original specific function (its disappearance or its maintenance). This could occur via diversifying selection when the ancestral polymorphism linked to this function is subdivided between habitats (4). This can lead to the observation of highly divergent variants regarding specific amino acid signatures and can be viewed as a positive Darwinian selection event acting on species that have emerged from this habitat speciation. The estimation of the fixation rate of nonsynonymous and synonymous substitutions along orthologous coding sequences from a cluster of evolutionarily related taxa appears to be one of the most powerful tools to detect molecular adaptation (5–9). However, the signature of molecular adaptation can be cryptic and difficult to extract because an adaptive change (advantageous mutation) may only affect a small number of lineages and only a subset of sites according to their phylogenetic history (10). The accumulation of ancient adaptive mutations is typically the situation encountered in globins, a widespread molecule that is conserved in members of all living kingdoms, including annelids.

The spatial and environmental distribution of annelids, from deep-sea hydrothermal vents to terrestrial habitats, is the consequence of a long history of adaptive strategies since their radiation (11). One of these adaptations concerns the way by which annelids living in sulfide-rich environments protect themselves against or use hydrogen sulfide (H2S). Such a process mainly relies on the occurrence of extracellular hemoglobins that bind and transport this toxic compound. H2S is toxic to aerobic metabolism, particularly to proteins such as cytochrome c oxidase and hemoglobin (12). This unusual sulfide-binding function of some annelid hemoglobins was first discovered in the vestimentiferan Riftia pachyptila, a mouthless and gutless organism harboring intracellular chemolithoautotrophic sulfide-oxidizing bacterial symbionts. Riftia is found living close to the deep-sea eastern Pacific hydrothermal vents (13). Sulfide binding is enabled by the presence of two highly reactive free cysteine residues that covalently bind H2S (14, 15), each one localized on two different globin subunits included in extracellular hemoglobin complexes found in annelids (see review in ref. 16). The annelid hemoglobin multigenic family is subdivided into two main gene families, A and B, and four subfamilies, A1, A2, B1, and B2, that emerged via at least three duplication events (17, 18). These latter authors found that the free cysteine residues involved in H2S binding are located at the same positions, Cys + 1 and Cys + 11 (1 and 11 aa after the well conserved distal histidine), on globin chains within the B2 and A2 subfamilies, respectively, for a set of various annelids living in sulfide-rich habitats. Moreover, other nonsymbiotic annelid species

www.pnas.org/cgi/doi/10.1073/pnas.1037686100 PNAS | May 13, 2003 | vol. 100 | no. 10 | 5885–5890

living in sulfide-rich habitats such as […] and […] marina also possess hemoglobins that display a H2S-binding capability via free cysteine residues (19, 20, ‖) whose positions are unknown (no sequence available).

Many habitats that display high sulfide concentrations are known to occur on the Earth’s surface. Environmental sulfide may have either geothermal (hydrothermal vents, sulfurous springs) or biogenic (cold seeps, marine sediments, mangroves) origins, including anthropogenic deposits in polluted marine or brackish areas. We postulated that species from sulfide-rich environments exhibiting free cysteine residues at positions Cys + 1 and Cys + 11 are able to bind sulfide by analogy to the mechanism used by both of the vestimentiferans Lamellibrachia sp. and R. pachyptila. Such H2S-binding function appears to be absent in annelids from sulfide-free environments, such as the oligochaete Lumbricus terrestris and the polychaete Tylorrhynchus heterochaetus, which lack these residues. The inability of […] hemoglobin to bind sulfide was confirmed by Zal et al. (15) using specific cysteine inhibitors.

It was found that H2S-binding A2 and B2 globins exhibit a lower evolutionary rate than the O2-binding A1 and B1 globins, which do not possess free cysteines (18). Such evolutionary rates suggest that A2 and B2 globins and their H2S-binding function are strongly selected. As a consequence, the authors proposed an evolutionary scenario regarding the evolution of the hemoglobin H2S-binding function in symbiotic and nonsymbiotic annelids living in sulfide-rich habitats and suggested that the H2S-binding function via a free cysteine residue was (i) an innovation in Annelida and (ii) lost by the relaxation of selective constraint (neutral evolution) in the annelid ancestors that colonized the newly emerging sulfide-free habitats.
Starting with these assumptions, we focused our attention here on the A2 and B2 homologous free cysteine sites that are located in a well-conserved secondary structure region called the sulfide-binding domain (SBD) (18). Recent maximum likelihood models of synonymous and nonsynonymous substitution estimation called the branch site-specific models (22) were used to investigate the functional evolution of free cysteines in annelids from sulfide-rich and sulfide-free habitats.

We present here a case of molecular adaptation (replacement of the free cysteines) due to the relaxation of selective constraints (decline in H2S concentrations). In other words, the loss of a function (H2S-binding function in sulfide-free habitats) can also be driven by positive Darwinian selection. The dramatic changes of environmental conditions during the course of evolution and the associated physiological modifications in annelids from well oxygenated emerging habitats are pointed out to explain such a loss of function by molecular adaptation.

‖Zal, F., Gotoh, T. & Toulmond, A. (1999) Comp. Biochem. Physiol. A 124, S16 (abstr.).

Fig. 1. Neighbor-joining consensus tree of globin sequences from annelids living in sulfide-rich habitats with percentage bootstrap support (1,000 replicates). Rooting is performed according to the “duplicate rooting procedure” (21) using clade A versus clade B. The first dichotomy separates the A and B families and the second distinguishes the A1, A2, B1, and B2 subfamilies. Typical subfamily HCA plots (secondary structure) were performed from each new globin of Lamellibrachia nov sp. (LaM) and other related vestimentiferans (encircled names). HCA plots exhibit a high degree of similarity within each subfamily and are also similar to those previously obtained for other annelid species (figure 4 in ref. 18). Free cysteines (Cys + 11) in A2 and (Cys + 1) in B2 globins are shown by an arrow (for GenBank accession numbers and additional abbreviations, please see Appendix, which is published as supporting information on the PNAS web site).

Materials and Methods

Biological Materials. Juvenile specimens of Lamellibrachia nov sp. were collected around cold-seeps from mud volcanoes in the Mediterranean Sea during the French oceanographic cruise Medinaut (Kazan site: 35°25.88′ N, 24°33.56′ E) at a depth of ≈1,705 m. Juvenile specimens of the hydrothermal-vent tubeworms Riftia pachyptila, Oasisia alvinae, and Tevnia jerichonana (up to 3–5 cm length) were collected at one single vent site from the ridge segment 9°50′ N on the East Pacific Rise (Riftia field: 9°50.75′ N, 104°17.57′ W) at a depth of 2,500 m during the French oceanographic cruise HOT 96 and the American cruise LARVE98. Hydrothermal vent tubeworms Ridgeia piscesae were collected at the Endeavour Segment of the Juan de Fuca Ridge (Clam Bed, 47°57′ N, 129°05′ W) at a depth of 2,197 m during the Canadian–American Hi-Rise cruise 2001. Worms were sampled by using the telemanipulated arms of the submersibles “Nautile,” “Alvin,” and “ROPOS,” brought back alive to the surface inside an insulated box, and immediately frozen and stored in liquid nitrogen after their recovery on board.

Identification and Characterization of Novel Extracellular Globins. In the present paper, globin primer design, total RNA extraction, cDNA synthesis, RT-PCR amplification, PCR-product cloning, and sequencing were performed as described in Bailly et al. (18). This protocol was applied to the three hydrothermal vent annelid species (encircled in Fig. 1) Ridgeia piscesae (Ridg), Oasisia alvinae (Oas), and Tevnia jerichonana (Tevnia), and to the Mediterranean cold seep annelid species Lamellibrachia nov sp. (LaM) (see Fig. 1).

Multiple Alignments and Tree Reconstructions. Unrooted tree topologies from multiple alignments of the A2 and B2 amino acid globin sequences were obtained by using the neighbor-joining method computed by using the PHYLIP program (23) with 1,000 bootstrap resamplings of the data (Fig. 2). The orthologous A2 globin set (Fig. 3, which is published as supporting information on the PNAS web site, www.pnas.org)

was composed of 213 bp of DNA sequence (71 codons) from the terrestrial (sulfide-free) oligochaete Lumbricus rubellus (GenBank accession no. BF422675), the littoral (transitory sulfide-rich) polychaete Sabella spallanzanii (GenBank accession no. AJ131285), the Mediterranean cold-seep (sulfide-rich) vestimentiferan Lamellibrachia nov sp. (GenBank accession no. AY250084) and the three East Pacific Rise hydrothermal vent (sulfide-rich) vestimentiferans Oasisia alvinae (GenBank accession no. AY250087), Tevnia jerichonana (GenBank accession no. AY250086) and Riftia pachyptila (GenBank accession no. AJ439733). The orthologous B2 globin set (Fig. 4, which is published as supporting information on the PNAS web site) was composed of 213 bp of DNA sequence (71 codons) from Lumbricus rubellus (GenBank accession no. BF422540), the two littoral polychaetes Sabella spallanzanii (GenBank accession no. AJ131283) and Sabellastarte indica (GenBank accession no. D58418), the Mediterranean cold-seep vestimentiferan Lamellibrachia nov sp. (GenBank accession no. AY250085), the East Pacific Rise hydrothermal vent vestimentiferan Riftia pachyptila (GenBank accession no. AJ439737) and the Juan de Fuca Ridge vestimentiferan Ridgeia piscesae (GenBank accession no. AY250083).

The secondary structure of the molecular domain (SBD) surrounding the free cysteine residues involved in the sulfide-binding function was predicted by using a hydrophobic cluster analysis (HCA) from amino acid globin sequences and plotted according to the DRAWHCA software (24). These plots were of prime importance to deduce the level of conservation of the sulfide-binding domain between the globin subfamilies (Fig. 1).

Search for Darwinian Positive Selection from dN/dS Ratios. To detect Darwinian positive selection acting on extracellular globins, an approach based on the maximum likelihood estimation of the nonsynonymous/synonymous substitution rate ratio (dN/dS = ω; ref. 25) was applied by using our sets of coding sequences. The ratio ω provides a sensitive measure of selective pressures acting at the protein level, with ω values of <1, = 1, and >1 indicating negative selection, neutral evolution, and positive Darwinian selection, respectively.

The branch site-specific models (22) were considered to conduct selection pressure analysis for both the A2 and B2 globin subfamilies. These models use maximum likelihood methods for estimating the parameters of a transition matrix describing the substitution rate between pairs of codons, including dN/dS ratios (ω), transition/transversion ratios and branch lengths. We compared nested models (a null and an alternative hypothesis) with the likelihood ratio test (LRT) following the formula: 2δL = 2(L1 − L0), where L1 is the log likelihood of the alternative hypothesis and L0 is the log likelihood of the null hypothesis. Twice the log likelihood difference between the two models is expected to be χ2 distributed with the number of degrees of freedom equal to the difference in the number of parameters between the models. The branch site-specific model is the combination of a lineage-specific model (5) and a site-specific model (26). This program provides an interesting tool to test for positive selection at each amino acid site within a prespecified lineage of the phylogeny (foreground branch) as opposed to the rest of the lineages (background branches). In other words, these models allow testing the assumption that some orthologous amino acid sites have undergone positive selection only in some evolutionary strains of a given phylogeny. These models, called Bm2 and Bm3, can be respectively compared for LRT with the site-specific models M1 and M3 (26). The M1 model only assumes two site classes in which ω0 = 0 (any mutation is deleterious at a given site) and ω1 = 1 (any mutation is neutral at a given site) and for which the proportions p0 and p1 could be estimated over the whole protein. The model M3 uses a general discrete distribution of the ω ratios among sites with two site classes for which the proportions p0 and p1 are estimated. For branch site-specific models the Bayes theorem is (automatically) used to calculate posterior probabilities of site classes for each site. Sites displaying ω > 1 and a high posterior probability can be suspected to be under positive selection. All analyses were performed by using the CODEML program of the PAML package (27).

Results

New Globin Sequences from Hydrothermal Vent and Cold Seep Species. Five partial sequences of both the A2 and B2 globins have been obtained for Tevnia jerichonana (A2: 88 aa), Oasisia alvinae (A2: 76 aa and B2: 53 aa) and Ridgeia piscesae (B2: 71 aa). Globin assignment was based on sequence homology, specific amino acid patterns and the presence of free cysteine residues Cys + 1 and Cys + 11 (18) without ambiguity. A complete set of the globin subfamilies A1, A2, B1, and B2 was also sequenced from Lamellibrachia nov sp. (see Appendix). As above, globin sequences were unambiguously assigned to the right paralogous subfamily by the reconstruction of molecular phylogenies in which the well-defined globin subfamilies of R. pachyptila and Lamellibrachia sp. from Japan (14, 18, 28) were inserted (Fig. 1). All of the A2 and B2 globins of vestimentiferan tubeworms displayed free cysteine residues at positions Cys + 11 and Cys + 1, respectively, and similar SBD amino acid patterns.

A2 and B2 Globin Subfamilies dN/dS Ratio Analyses. The dN/dS ratio analyses were performed solely on the two globin subfamilies A2 and B2 because they are both involved in the sulfide-binding function via the free cysteine residues in position Cys + 11 and Cys + 1 in annelids living in sulfide-rich habitats. All model parameters (fixed or estimated), likelihood ratio tests and the putative positively selected sites are reported in Table 1.

Results obtained from the Bm2 and Bm3 models indicate that both the A2 and B2 globin subfamilies of the Lumbricus rubellus (foreground) lineage, representing annelids from terrestrial sulfide-free habitats, had undergone selection in a different way than lineages representing annelids living in various sulfide-rich habitats (background; see Table 1 and Fig. 2). The models Bm2 and Bm3 provide significantly better likelihood values than the site-specific models M1 and M3, and they have detected positively selected sites in the foreground lineage. It is worth noting that a part of the positively selected sites fall into the SBD for the Lumbricus B2 and A2 globin subfamilies, respectively (Table 1). The Bm3 model detected more positively selected sites than the Bm2 model for both A2 and B2 globins, suggesting that the former model is more appropriate to detect positive selection. In both models and both subfamilies, the key sites for H2S binding (i.e., the free cysteines 32C in B2 and 42C in A2 from Table 1, respectively positions Cys + 1 and Cys + 11) are subjected to positive selection with a posterior probability of 99% in A2 (with the Bm3 model) and B2 (with the Bm2 and Bm3 models) and 88% in A2 (with the Bm2 model). Other analyses using Bm2 and Bm3 with different fixed foreground lineages were also performed, but they yielded low and insignificant likelihood scores and no positively selected amino acid residues within the SBD (data not shown).

Discussion

Sulfide-Binding Function: A Widespread Function in Annelids Living in Sulfide-Rich Habitats That May Have Been Lost in Annelids from Sulfide-Free Habitats. Bailly et al. (18) suspected that the A2 and B2 globin subfamilies of the hemoglobin of annelids have undergone strong directional selective constraints driven by high levels of H2S concentrations in taxa living in sulfide-rich habitats. This would explain the maintenance of the free cysteine residues at the conserved positions Cys + 1 and Cys + 11, and the maintenance of a conserved SBD secondary structure in two sets of highly divergent paralogous strains (A2 and B2). The presence of homologous free cysteine residues at positions Cys + 1 and

Table 1. Likelihood values (L), parameter estimates, and LRT obtained for the branch-site models

Model  L  p  Estimates of parameters  LRT (df)  Foreground species  Background species

Site-specific

M1 (A2 globins)  −970.45  1  p0 = 0.290, ω0 = 0; p1 = 0.709, ω1 = 1
M3 (A2 globins)  −959.34  3  p0 = 0.588, ω0 = 0.411; p1 = 0.089, ω1 = 0.788
M1 (B2 globins)  −1146.86  1  p0 = 0.289, ω0 = 0; p1 = 0.710, ω1 = 1
M3 (B2 globins)  −1119.24  3  p0 = 0.312, ω0 = 0.004; p1 = 0.687, ω1 = 0.213

Branch-site

Bm2 (A2 globins)  −967.78  3  p0 = 0.286, ω0 = 0; p1 = 0.512, ω1 = 1; p2 + p3 = 0.201, ω2 = 2.405  Bm2 vs. M1 (2)*  51V 67G 41M° 23I 42C° 48D°  None
Bm3 (A2 globins)  −952.79  5  p0 = 0.493, ω0 = 0.089; p1 = 0.512, ω1 = 1.328; p2 + p3 = 0.374, ω2 = ∞  Bm3 vs. M3 (2)*  6M 17R 29Q° 45M° 49S 53A 60H 69S 70A R2 S3 K8 E10 D50 4I 41M° 24S 54S 55Q A57 61A 64V 23I 25S 44S° 51V 65E 42C 48D° 67G

Bm2 (B2 globins)  −1143.17  3  p0 = 0.290, ω0 = 0; p1 = 0.594, ω1 = 1; p2 + p3 = 0.114, ω2 = 85.75  Bm2 vs. M1 (2)*  28F° 30A° 32C°  None
Bm3 (B2 globins)  −1117.11  5  p0 = 0.349, ω0 = 0.013; p1 = 0.459, ω1 = 0.252; p2 + p3 = 0.191, ω2 = ∞  Bm3 vs. M3 (2), NS  N7 A17 S41° S47° A50 V54 28F° 30A° 32C°  None

Amino acids in the foreground (Lumbricus terrestris) and the background species columns correspond to positively selected sites for A2 and B2 globin coding sequences. p is the number of free parameters for the ω distribution. Sites potentially under positive selection (with an additional ° when included in the SBD) are identified using the Lamellibrachia nov sp. sequence as the reference. See Fig. 2. Bold underlined sites have a posterior probability (PP) of 99%, bold sites have a PP of >95%, underlined sites have a PP of >90%, italic sites have a PP between 80% and 90%, and sites in normal text have a PP of <80%. NS, not significant. *, P < 0.05.
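The likelihood ratio test described in Materials and Methods can be retraced by hand from the log likelihoods in Table 1. The following Python sketch is not part of the original analysis, which was run in CODEML; it only recomputes 2(L1 − L0) and the corresponding χ2 tail probability for the two Bm3 versus M3 comparisons, using the 2 degrees of freedom given in the LRT column. The function names are illustrative assumptions, and the closed-form χ2 tail used here holds only for an even number of degrees of freedom.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with even df.

    For df = 2k the tail has the closed form
    exp(-x/2) * sum_{i<k} (x/2)**i / i!, so no external library is needed.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(lnl_null: float, lnl_alt: float, df: int):
    """Return (2 * (L1 - L0), P value) for two nested models."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf_even_df(stat, df)


# Log likelihoods copied from Table 1 (null = M3, alternative = Bm3, df = 2).
comparisons = {
    "A2 globins": (-959.34, -952.79),
    "B2 globins": (-1119.24, -1117.11),
}

for name, (lnl_m3, lnl_bm3) in comparisons.items():
    stat, p = likelihood_ratio_test(lnl_m3, lnl_bm3, df=2)
    verdict = "P < 0.05" if p < 0.05 else "not significant"
    print(f"Bm3 vs. M3, {name}: 2dL = {stat:.2f}, P = {p:.4f} ({verdict})")
```

With these inputs the statistic is 13.10 for the A2 globins (P below 0.05) and 4.26 for the B2 globins (P above 0.05), in line with the asterisk and the NS label that Table 1 attaches to those two comparisons.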

Cys + 11 exclusively found in A2 and B2 globins from the Mediterranean cold seep Lamellibrachia nov sp., and the eastern Pacific hydrothermal vent Oasisia alvinae, Tevnia jerichonana, and Ridgeia piscesae demonstrates the widespread occurrence and the conservation of these residues in vestimentiferans. Moreover, some globins of nonsymbiotic polychaetes, such as Sabellastarte indica (29) and Sabella spallanzanii (30), and of Branchipolynoe sp. (31) also exhibit such free cysteines in the same position. Bailly et al. (18) proposed that the occurrence of free cysteines at positions Cys + 1 and Cys + 11 is a plesiomorphic state already present in the annelid ancestor, rather than an apomorphic one, occasionally acquired before annelid radiation from sulfur-rich environment. The absence of free cysteine residues in globins from polychaetes living in sulfur-free environments such as Tylorrhynchus heterochaetus (32) or oligochaetes such as Lumbricus terrestris (33) has been interpreted as a loss. Despite the absence of the free cysteine residues in the latter species, the nonfunctional SBD is still more or less conserved in all globin subfamilies with an obvious degenerated signature (18). Such a conserved structure in common paralogous globins, which emerged before radiation of annelids, reinforces the assumption of a loss of the free cysteines as well as the inference that the SBD, and therefore its original function, represents the primitive condition in annelid hemoglobins.

These molecular data form a set of insights that allow us to assume that annelid ancestors initially lived in sulfide-rich conditions. Another argument in favor of the loss of free cysteine residues, and which sustains the previous postulate, is that adaptation to sulfide did not only require cysteine acquisition in respiratory pigments but implied the establishment of a long biochemical pathway by which organisms also acquired various adaptive physiological detoxification mechanisms, at the levels of molecules, cells, tissues, and up to the whole organism. It is unlikely that this complicated metabolic process was the result of a preadapted pathway that abruptly shifted to sulfide utilization from using a molecule that possessed similar detoxifying and energy carrier potentials for the purpose of oxidizing a different metabolic compound. One must also keep in mind that the symbiotic worms from sulfidic environments use H2S to fuel their endosymbionts, a situation not encountered in nonsymbiotic worms from similar habitats. It is likely that a detoxification function in nonsymbiotic worms (a plesiomorphic trait) could have evolved in symbiotic worms into a H2S transport function (an apomorphic trait). This means that the H2S-binding capacity is an intricate mechanism that acts differentially according to the location of the cysteines and that the heterotrophic H2S-binding pathway, which has preceded the symbiotic H2S-binding pathway, was probably present early in annelid evolution, even before annelid radiation.

It is thus more reasonable to support the hypothesis of the loss of sulfide-binding function in nonsymbiotic worms in sulfide-free environments rather than a repeated gain of this function in different sulfur-rich environments in various annelid lineages.

Fig. 2. Orthologous globin A2 and B2 topologies from nucleotide sequences of Riftia pachyptila (Riftia), Sabella spallanzanii (Sabspal), Sabellastarte indica (Sabindica), Lamellibrachia nov sp. (LaM), Tevnia jerichonana (Tevnia), Oasisia alvinae (Oasisia), Ridgeia piscesae (Ridgeia) (the background species), and Lumbricus rubellus (Lrubelllus) (the foreground species). The dotted line represents the oligochaete lineage (sulfide-free habitats) where Cys + 11 and Cys + 1 sites are under positive selection (stars in SBD HCA representation).

Seeking for Positively Selected Sites in Well Conserved Orthologous Sequences. Hemoglobin is a ubiquitous molecule found “from bacteria to man” (34, 35) for which the three-dimensional structure (globin fold) (36) is well conserved among highly divergent evolutionary lineages. This structural universality demonstrates that globin is a strongly selected molecule, due in part to its ability to bind and transport oxygen. Despite their great evolutionary distance in terms of common ancestry (two duplication events), the A2 and B2 subunits share an obvious secondary structure conservation of the SBD and the maintenance of functional free cysteines in annelids living in sulfide-rich conditions, two noteworthy insights of a functional coevolution between these duplicated genes. Thus, it is more appropriate to search for selected sites in such a well conserved structure than to try to detect cryptic adaptive evolution over the whole set of globin sequences. Historically two programs were used to detect positive selection: the lineage-specific program, which averages the dN/dS ratio over all sites of a given coding sequence (5), and site-specific models, which average the dN/dS ratio for each homologous site over all coding sequences (26). But because they average either all sites or all lineages they are not sensitive enough to detect selective pressures at a specific amino acid of a given lineage. This lack of power was already pointed out in ref. 10, when episodic positive selection has occurred only on a few amino acids of a strongly negatively selected molecule. The absence of apparent positive selection in the A2 and B2 globin strains by using these two programs (data not shown) may reflect either that these globins are highly negatively selected even in sulfide-free lineages or that the accumulation of positively selected amino acid sites occurred in ancient times before the annelid radiation, leading to this unusual, complex, extracellular hemoglobin. This may explain why most examples of positive selection come from xenobiotic recognition molecules or genes associated with male reproduction that have undergone recent adaptive changes such as primate lysozymes (5, 37), HIV membrane proteins (6, 7) or the salmon gamete recognition system (9).

Positively Selected Sites in Sulfide-Free Lineages: Why the Loss of a Function Could Also Be Considered as a Molecular Adaptation Rather than a Relaxation of Constraints (Neutral Evolution). Bailly et al. (18) proposed a possible relaxation of selective constraints on the annelid globins from sulfide-free environments to explain the absence of free cysteine residues on the A2 and B2 subunits. H2S was one of the most abundant molecules on the Earth’s surface during the prebiotic era and its concentration greatly decreased during the course of evolution, remaining only at some specific places (e.g., hydrothermal vents) as a relic of the primitive conditions (38, 39). Thus, the acquisition and the maintenance of H2S-binding functions were of prime importance during the beginning of life and were probably subjected to purifying selection in the ancient annelid lineages living in sulfide-rich habitats. To test whether the SBD and especially the free cysteine residues in globins from annelids in sulfide-free environments were subjected to ancient diversifying selection, we used the recent branch-site set of models implemented in PAML (22). These more intricate and realistic models simultaneously allow a combination of different ω ratios among sites and lineages. These models rely on a priori evolutionary hypotheses within a prespecified lineage that is suspected of a specific history: in our case, the hypothetical loss (versus the gain) of free cysteine residues in sulfide-free lineages was combined with the way that they were lost during evolution (neutral relaxation or Darwinian positive selection). For both the A2 and B2 globins, the significantly lowest likelihood values were obtained when we assumed a different dN/dS ratio in the sulfide-free lineages as opposed to lineages from the other habitats (hydrothermal vents, cold-seeps, coastal or intertidal hypoxic sediments). In only this case (all branch combinations have been tested; results not shown), the sites Cys + 1 and Cys + 11 were clearly subjected to positive selection with a strong posterior probability together with other residues of the SBD. These positively selected sites, found only within the SBD, are in agreement with the loss or the degeneracy of the hydrophobic secondary structure surrounding the free cysteine residues in annelids from terrestrial sulfide-free habitats (18).

The high posterior probabilities obtained for both the Cys + 1 and Cys + 11 sites are associated with a dN/dS > 1. This result rules out the assumption that the loss of the free cysteine residues has followed a neutral pattern of evolution. Free cysteine residues are key amino acids in biochemical reactions because of their well-known highly reactive lateral chain. To our knowledge, all free cysteines reported from protein analysis display an active and diversified functional role. These residues are involved in the binding of some catalytic cofactors (40), pH-dependent oligomerization (41), macromolecular complexation, and many other molecular detoxifying functions. However, in the case of specific genetic diseases or metabolic disorders, free cysteines can bind atypical ligands. Examples of such ligands include benzoquinone, a known carcinogenic metabolite, in rodent globin (42), mercury ions inhibiting enzymatic reactions (43), or oligomeric complexation by abnormal aggregation with subsequent physiological modification in mutant mouse hemoglobin (44). These three examples among others illustrate the deleterious role of unexpected free cysteines in mutant organisms and indicate that the disappearance of a specific

Bailly et al. PNAS ͉ May 13, 2003 ͉ vol. 100 ͉ no. 10 ͉ 5889 Downloaded by guest on September 26, 2021 ligand (e.g., H2S) can drastically reduce organismal fitness when molecular adaptation) in annelid lineages from sulfide-free conditions have changed. environments. In other words, we propose that the loss of We assumed that when sulfide concentrations decreased or sulfide-binding function is a disadaptation process (50) that was when annelid ancestors moved from sulfide-rich to sulfide-free a prerequisite to avoid the detrimental physiological effects habitats, the presence of free cysteine residues could have associated with the collateral activity of free cysteine residues induced irreversible deleterious effects and homeostatic disor- regarding other blood compounds. ders. Thus, it is not surprising to find such sites to have undergone molecular adaptation. In addition, the loss of free We gratefully acknowledge the captains and crews of the NO L’Atalante cysteines on both the A2 and B2 globin subunits in annelids from and the RV Atlantis II, the pilots and teams of the submersibles Nautile sulfur free environments confirms their implication in sulfide and Alvin, and F. Gaill (Universite´ Pierre et Marie Curie, Paris), L. S. binding in annelids from sulfur rich environments. This event has Mullineaux (Woods Hole Oceanographic Institute, Woods Hole, MA), and H. Felbeck (Scripps Research Institute, La Jolla, CA), chief scien- probably occurred during a short period as, because of their tists of the HOT96 and LARVE98 cruises. We thank C. 
Fisher (Penn- biochemical reactivity, free cysteines would have been rapidly sylvania State University, State College) for friendly collaboration, and recruited for alternative biochemical uses via a cooperative Myriam Sibuet (Institut Franc¸ais de Recherche pour L’Exploitation de process in a manner similar to the occurrence of San Marco la Mer) who provided us with Lamellibrachia specimens from Mediter- spandrels in Venice (45). The loss of globin genes in human and ranean Sea. We are particularly indebted to Z. Yang (University College apes (46) and the loss of oxygen carrier function in Antarctic London, London) and M. J. Ford (Northwest Fisheries Science Center, fishes (47–49) have already been reported, but our study is the Seattle) who have helped us in the use of the PAML software. We thank first to demonstrate that the loss of a molecular function two anonymous referees for their precious comments and advice. This work was supported by the Conseil Re´gional de Bretagne, the Ministe`re (capacity to bind H2S) could be driven by positive Darwinian de l’Education Nationale et de la Recherche (ACC-SV3), and the selection. Institut National des Sciences de l’Univers and National Oceanic and The use of dN͞dS ratio analyses shows that the sulfide-binding Atmospheric Administration͞National Undersea Research Program function has been secondarily lost by positive selection (i.e., Grant UAF01-0042.
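As a rough numerical illustration of the averaging argument made in the discussion (a dN/dS ratio pooled over all sites can stay well below 1 even when a few sites have dN/dS > 1), the following Python sketch may help. All counts, the number of sites, and the names `SiteCounts`, `pooled_omega`, and `sites_with_omega_above_one` are hypothetical and chosen only for illustration; this is plain arithmetic on made-up per-site rates, not the likelihood models implemented in PAML and not data from this study.

```python
from dataclasses import dataclass


@dataclass
class SiteCounts:
    """Hypothetical per-site substitution rates along one lineage."""
    dn: float  # nonsynonymous substitutions per nonsynonymous site
    ds: float  # synonymous substitutions per synonymous site


def pooled_omega(sites):
    """dN/dS pooled over all sites, as a single-ratio summary would see it."""
    total_dn = sum(s.dn for s in sites)
    total_ds = sum(s.ds for s in sites)
    return total_dn / total_ds


def sites_with_omega_above_one(sites):
    """Indices of sites whose own dN/dS ratio exceeds 1."""
    return [i for i, s in enumerate(sites) if s.ds > 0 and s.dn / s.ds > 1]


# Assumption: 140 strongly conserved sites (dN/dS = 0.05)
# plus 2 sites with dN/dS = 3, placed at indices 70 and 141.
conserved = [SiteCounts(dn=0.01, ds=0.2) for _ in range(140)]
selected = [SiteCounts(dn=0.6, ds=0.2) for _ in range(2)]
sites = conserved[:70] + [selected[0]] + conserved[70:] + [selected[1]]

print("pooled dN/dS:", round(pooled_omega(sites), 3))
print("sites with dN/dS > 1:", sites_with_omega_above_one(sites))
```

With these made-up numbers the pooled ratio is roughly 0.09, far below 1, although two individual sites have a ratio of about 3. This is the kind of signal that an average over all sites would hide.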

1. Kimura, M. (1981) J. Mol. Evol. 17, 110–113.
2. Goodman, M., Moore, G. W. & Matsuda, G. (1975) Nature 253, 603–608.
3. Ohta, T. (1988) Evolution (Lawrence, Kans.) 42, 375–386.
4. Hedrick, P. W., Ginevan, M. E. & Ewing, E. P. (1976) Annu. Rev. Ecol. Syst. 7, 1–32.
5. Yang, Z. (1998) Mol. Biol. Evol. 15, 568–573.
6. Nielsen, R. & Yang, Z. (1998) Genetics 148, 929–936.
7. Zanotto, P. M., Kallas, E. G., de Souza, R. F. & Holmes, E. C. (1999) Genetics 153, 1077–1089.
8. Yang, Z., Swanson, W. J. & Vacquier, V. D. (2000) Mol. Biol. Evol. 17, 1446–1455.
9. Ford, M. J. (2001) Mol. Biol. Evol. 18, 639–647.
10. Bielawski, J. P. & Yang, Z. (2001) Mol. Biol. Evol. 18, 523–529.
11. McHugh, D. (2000) Can. J. Zool. 78, 1873–1884.
12. Nicholls, P. (1975) Biochim. Biophys. Acta 396, 24–35.
13. Arp, A. J., Childress, J. J. & Vetter, R. D. (1987) J. Exp. Biol. 128, 139–158.
14. Suzuki, T., Takagi, T. & Ohta, S. (1990) Biochem. J. 266, 221–225.
15. Zal, F., Leize, E., Lallier, F. H., Toulmond, A., Van Dorsselaer, A. & Childress, J. J. (1998) Proc. Natl. Acad. Sci. USA 95, 8997–9002.
16. Weber, R. E. & Vinogradov, S. N. (2001) Physiol. Rev. 81, 569–628.
17. Gotoh, T., Shishikura, F., Snow, J. W., Ereifej, K. I., Vinogradov, S. N. & Walz, D. A. (1987) Biochem. J. 241, 441–445.
18. Bailly, X., Jollivet, D., Vanin, S., Deutsch, J., Zal, F., Lallier, F. H. & Toulmond, A. (2002) Mol. Biol. Evol. 19, 1421–1433.
19. Zal, F., Green, B. N., Lallier, F. H., Vinogradov, S. N. & Toulmond, A. (1997) Eur. J. Biochem. 243, 85–92.
20. Zal, F., Green, B. N., Lallier, F. H. & Toulmond, A. (1997) Biochemistry 36, 11777–11786.
21. Donoghue, M. J. & Matthews, S. (1998) Mol. Phylogenet. Evol. 9, 489–500.
22. Yang, Z. & Nielsen, R. (2002) Mol. Biol. Evol. 19, 908–917.
23. Felsenstein, J. (1989) 5, 164–166.
24. Callebaut, I., Labesse, G., Durand, P., Poupon, A., Canard, L., Chomilier, J., Henrissat, B. & Mornon, J. P. (1997) . Mol. Life Sci. 53, 621–645.
25. Yang, Z. & Bielawski, J. P. (2000) Trends Evol. Ecol. 15, 496–502.
26. Yang, Z., Nielsen, R., Goldman, N. & Pedersen, A. M. (2000) Genetics 155, 431–449.
27. Yang, Z. (1997) Comput. Appl. Biosci. 13, 555–556.
28. Takagi, T., Iwaasa, H., Ohta, S. & Suzuki, T. (1991) in Structure and Function of Oxygen Carriers, eds. Vinogradov, S. N. & Kapp, O. H. (Springer, New York), pp. 245–249.
29. Suzuki, T., Hirao, Y. & Vinogradov, S. N. (1995) Biochim. Biophys. Acta 1252, 189–193.
30. Pallavicini, A., Negrisolo, E., Barbato, R., Dewilde, S., Ghiretti-Magaldi, A., Moens, L. & Lanfranchi, G. (2001) J. Biol. Chem. 276, 26384–26390.
31. Hourdez, S. (2000) Ph.D. thesis (Université Pierre et Marie Curie, Paris).
32. Suzuki, T. & Gotoh, T. (1986) J. Biol. Chem. 261, 9257–9267.
33. Fushitani, K., Matsuura, M. S. & Riggs, A. F. (1988) J. Biol. Chem. 263, 6502–6517.
34. Hardison, R. C. (1996) Proc. Natl. Acad. Sci. USA 93, 5675–5679.
35. Hardison, R. (1998) J. Exp. Biol. 201, 1099–1117.
36. Perutz, M. F. (1979) Annu. Rev. Biochem. 48, 327–386.
37. Messier, S. & Stewart, C. (1997) Nature 385, 151–154.
38. Clark, P. D., Dowling, N. I. & Huang, M. (1998) J. Mol. Evol. 47, 127–132.
39. Tunnicliffe, V. (1992) Palaios 7, 338–350.
40. Branden, R., Malmstrom, B. G. & Vanngard, T. (1973) Eur. J. Biochem. 36, 195–200.
41. Sakai, K., Sakurai, K., Sakai, M., Hoshino, M. & Goto, Y. (2000) Protein Sci. 9, 1719–1729.
42. Miranda, J. J. (2000) Biochem. Biophys. Res. Commun. 275, 517–523.
43. Concha, N. O., Rasmussen, B. A., Bush, K. & Herzberg, O. (1997) Protein Sci. 6, 2671–2676.
44. Leder, A., Wiener, E., Lee, M. J., Wickramasinghe, S. N. & Leder, P. (1999) Proc. Natl. Acad. Sci. USA 96, 6291–6295.
45. Gould, S. J. & Lewontin, R. C. (1979) Proc. R Soc. London B Biol. Sci. 205, 581–598.
46. Zimmer, E. A., Martin, S. L., Beverley, S. M., Kan, Y. W. & Wilson, A. C. (1980) Proc. Natl. Acad. Sci. USA 77, 2158–2162.
47. Cocca, E., Ratnayake-Lecamwasam, M., Parker, S. K., Camardella, L., Ciaramella, M., di Prisco, G. & Detrich, H. W., III (1995) Proc. Natl. Acad. Sci. USA 92, 1817–1821.
48. Zhao, Y., Ratnayake-Lecamwasam, M., Parker, S. K., Cocca, E., Camardella, L., di Prisco, G. & Detrich, H. W., III (1998) J. Biol. Chem. 273, 14745–14752.
49. Bargelloni, L., Marcato, S. & Patarnello, T. (1998) Proc. Natl. Acad. Sci. USA 95, 8670–8675.
50. Baum, A. D. & Larson. (1991) Syst. Zool. 40, 1–18.

5890 | www.pnas.org/cgi/doi/10.1073/pnas.1037686100 Bailly et al.