<<

ORIGINAL RESEARCH ARTICLE published: 07 January 2014 doi: 10.3389/fpls.2013.00544 Origin and evolution of metal P-type ATPases in Plantae ()

Marc Hanikenne 1,2* and Denis Baurain 2,3

1 Functional Genomics and Molecular Imaging, Department of Sciences, Center for Protein Engineering (CIP), University of Liège, Liège, Belgium 2 PhytoSYSTEMS, University of Liège, Liège, Belgium 3 Eukaryotic Phylogenomics, Department of Life Sciences, University of Liège, Liège, Belgium

Edited by: Metal ATPases are a subfamily of P-type ATPases involved in the transport of metal Mark G. M. Aarts, Wageningen cations across biological membranes. They all share an architecture featuring eight University, Netherlands transmembrane domains in pairs of two and are found in prokaryotes as well as in a Reviewed by: variety of . In Arabidopsis thaliana, eight metal P-type ATPases have been Mee-Len Chye, The University of Hong Kong, Hong Kong described, four being specific to copper transport and four displaying a broader metal Ute Krämer, Ruhr University specificity, including zinc, cadmium, and possibly copper and calcium. So far, few efforts Bochum, Germany have been devoted to elucidating the origin and evolution of these proteins in Eukaryotes. *Correspondence: In this work, we use large-scale phylogenetics to show that metal P-type ATPases form Marc Hanikenne, Functional a homogenous group among P-type ATPases and that their specialization into either Genomics and Plant Molecular ... Imaging, Department of Life monovalent (Cu) or divalent (Zn, Cd ) metal transport stems from a duplication that Sciences, Center for Protein took place early in the evolution of Life. Then, we demonstrate that the four subgroups Engineering (CIP), University of of plant metal ATPases all have a different evolutionary origin and a specific taxonomic Liège, Bld du Rectorat, distribution, only one tracing back to the cyanobacterial progenitor of the . 27 B-4000 Liège, Belgium e-mail: [email protected] Finally, we examine the subsequent evolution of these proteins in green and conclude that the thoroughly characterized in model organisms are often the result of lineage-specific gene duplications, which calls for caution when attempting to infer function from sequence similarity alone in non-model organisms.

Keywords: P-type ATPases, paralogy, endosymbiosis, phylogenetics, evolution, metal transport, orthology

INTRODUCTION 1998; Argüello, 2003; Williams and Mills, 2005). PIB-type metal P-type ATPases constitute a superfamily of pumps using the ATPases can be further divided into two groups: monovalent energy of ATP to transport cations, and possibly phospho- (Cu+/Ag+) and divalent (Zn2+/Cd2+/Pb2+/Hg2+/Cu2+) metal lipids, across biological membranes (for detailed reviews, see ATPases. Monovalent metal (Me+) ATPases correspond to sub- Kuhlbrandt, 2004; Palmgren and Nissen, 2011). In phylogenetic group IB-1, whereas divalent metal (Me2+) ATPases include the analyses, P-type ATPases group according to protein architec- IB-2 and IB-4 subgroups, respectively, as described previously ture and substrate specificity, in both prokaryotes and Eukaryotes (Axelsen and Palmgren, 1998; Argüello, 2003). (Axelsen and Palmgren, 1998; Argüello, 2003; Hanikenne et al., Functionally, almost all metal ATPases pump metal out of the 2005; Thever and Saier, 2009; Chan et al., 2010; Blaby-Haas cytoplasm, i.e., out to the periplasm in prokaryotes and out of and Merchant, 2012; Pedersen et al., 2012). Two (partially over- the cell or into an organelle (e.g., , secretory vesicles, or lapping) classifications have been proposed for P-type ATPases. ) in Eukaryotes. This activity confers tolerance against The first one is based on three architecture subtypes (AI–AIII). toxic concentrations of metal and/or provides metals for metallo- Proteins of architecture subtype AI have 8 transmembrane (TM) protein maturation. Eight metal ATPases are found in the genome domains in pairs of two, whereas subtype AII proteins have 10 of the model plant Arabidopsis thaliana (Axelsen and Palmgren, TM domains, and subtype AIII proteins display an unusual num- 2001; Cobbett et al., 2003; Williams and Mills, 2005). Four of berof7TMdomains(Thever and Saier, 2009; Chan et al., 2010). them are copper ATPases. (i) AtRAN1 is involved in copper deliv- In the second classification system, the P-type ATPase superfam- ery to the secretory pathway, where copper is required for the ily is divided into five major classes, I–V, based on ion transport maturation of copper-dependent proteins (e.g., ethylene recep- specificities and clustering in phylogenetic (Axelsen and tor, laccases) (Hirayama et al., 1999; Woeste and Kieber, 2000; Palmgren, 1998; Palmgren and Nissen, 2011). Binder et al., 2010). (ii) AtHMA5 contributes to copper toler- PIB-type metal ATPases (or CPx metal ATPases) specifically ance by controlling its accumulation in (Andrés-Colás et al., transport metal cations. They share several structural features, 2006; Kobayashi et al., 2008). Polymorphisms in HMA5 coding including the presence of eight membrane-spanning domains sequences have been linked to natural variation in shoot cop- (architecture subtype AI) and a conserved CPx motif in the per accumulation among A. thaliana ecotypes (Kobayashi et al., sixth predicted TM , which is involved in metal cation 2008). AtRAN1 and AtHMA5 are related to ATP7A and ATP7B, translocation across the membrane (Axelsen and Palmgren, the two human copper P-type ATPases that participate in the

www.frontiersin.org January 2014 | Volume 4 | Article 544 | 1 Hanikenne and Baurain Evolution of plant metal ATPases

biosynthesis of copper-dependent proteins in the secretory path- If monovalent metal-transporting P-type ATPases are found way (e.g., caeruloplasmin). Mutations in these transporters are in most organisms from the three domains of Life (Archaea, responsible for Menkes’ and Wilson’s diseases, respectively, both Bacteria, and Eukaryotes), the taxonomic distribution of diva- associated with cellular copper overload (Wang et al., 2011). (iii) lent cation-transporting P-type ATPases is apparently limited AtPAA1 and AtPAA2 are chloroplastic proteins ensuring copper to prokaryotes, a few and (land) plants (Axelsen and transport across the inner membrane and thylakoid membranes, Palmgren, 1998; Argüello, 2003; Hanikenne et al., 2005; Thever respectively. Consistently, a paa1 mutant is defective both in stro- and Saier, 2009; Chan et al., 2010; Blaby-Haas and Merchant, mal Cu/ZnSOD and in plastocyanin, whereas the paa2 mutant 2012; Pedersen et al., 2012). As metal ATPases play several only lacks plastocyanin (Shikanai et al., 2003; Abdel-Ghany et al., key roles in metal homeostasis in plants, including the hyper- 2005). The CtaA and PacS P-type ATPases found in the plasma accumulation syndrome (see above), we conducted a robust membrane and thylakoid membrane of are respec- phylogenetic analysis to determine their actual distribution tively functional homologs of PAA1 and PAA2 proteins found in across prokaryotes and Eukaryotes. We were keen to exam- chloroplasts (Tottey et al., 2001). ine the origin of divalent metal P-type ATPases in Eukaryotes, The four additional metal ATPases found in A. thaliana especially how and when they appeared in the green lineage. transport divalent cations. Similarly to AtPAA1, AtHMA1 is Finally, we aimed to clarify orthology relationships among plant localized in the inner membrane of the chloroplast and pub- metal P-type ATPases, which is highly relevant when con- lished data on both AtHMA1 and a barley homolog suggest sidering functional homology of proteins within and outside that HMA1 is an exporter of copper (Cu2+), zinc, calcium, Brassicaceae. and cadmium out of the chloroplast (Seigneurin-Berny et al., 2006; Moreno et al., 2008; Kim et al., 2009; Mikkelsen et al., METHODS 2012). Finally, the AtHMA2–4 proteins are zinc and cadmium DATASETS transporters. AtHMA2 and AtHMA4 are plasma membrane zinc A taxonomically representative set of prokaryotic genomes was and cadmium pumps and are involved in metal -to-shoot selected using the phylogenomic clustering method recently pro- translocation and redistribution (Hussain et al., 2004; Wong and posed by Moreno-Hagelsieb et al. (2013) (http://microbiome. Cobbett, 2009). AtHMA3 is localized in the vacuolar membrane wlu.ca/research/redundancy/). Using the GSSb model and a GSS and plays a role in zinc and cadmium tolerance (Morel et al., threshold of 0.5, we retrieved 352 redundant clusters sorted 2009). Polymorphism in AtHMA3 sequence explains most of according to the least overannotation criterion. A non-redundant the natural variation in cadmium accumulation across A. prokaryotic dataset of 352 genomes corresponding to the best- thaliana accessions (Chao et al., 2012). Similarly, a functional annotated genome in each cluster was then assembled. With these homolog of AtHMA3 controls differential cadmium accumula- clustering settings, cyanobacterial genomes grouped into three tion in grains of rice (Ueno et al., 2010; Miyadate et al., clusters and Cyanobacteria were thus represented by only three 2011). genomes in our non-redundant dataset. As we were particu- Orthologs of both AtHMA3 and AtHMA4 are constitu- larly interested in analysing the ancestry of Plantae metal P-type tively more highly expressed in Arabidopsis halleri and Noccaea ATPases, the 41 additional cyanobacterial genomes belonging caerulescens, two zinc and cadmium hyperaccumulators related to to the three clusters were added to our non-redundant dataset. A. thaliana in the Brassicaceae (for recent reviews, see Verbruggen The complete genomes of a total of 393 prokaryotes were thus et al., 2009; Krämer, 2010; Hanikenne and Nouet, 2011)(see included in our study and downloaded from the NCBI FTP server other contributions in this issue). Quantitative genetic analyses in February 2013 (ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/) (Courbot et al., 2007; Willems et al., 2007; Frérot et al., 2010; in the form of protein sequences (Table S1)(Benson et al., 2013). Willems et al., 2010) and functional characterization established For Eukaryotes, a total of 219 complete proteomes were that high expression of HMA4 is required for both hyperaccu- downloaded from three different sources in February 2013: mulation and hypertolerance in A. halleri (Talke et al., 2006; JGI (24 proteomes, http://genome.jgi.doe.gov/) (Grigoriev Courbot et al., 2007; Hanikenne et al., 2008). Increased expres- et al., 2012), Phytozome (41 proteomes, ftp://ftp.jgi-psf.org/ sion of HMA4 in A. halleri occurred through a combination of pub/compgen/phytozome/v9.0/) (Goodstein et al., 2012), copy number expansion (tandem triplication) and cis-regulatory and Ensembl/Ensembl Genomes (154 proteomes, ftp://ftp.en changes activating the promoters of all three HMA4 copies sembl.org/pub/release-70/and ftp://ftp.ensemblgenomes.org/pub/ (Hanikenne et al., 2008). The A. halleri HMA4 locus was shaped release-17/) (Kersey et al., 2012; Flicek et al., 2013)(Table S2). by positive selection, resulting in a selective sweep, and ectopic EST sequences from 42 additional Plantae (, gene conversion, together substantiating selection for increased Rhodophyta, and Glaucophyta) were downloaded in June 2013 gene dosage (Hanikenne et al., 2013). In a fine example of par- from the NCBI EST database (http://www.ncbi.nlm.nih.gov/ allel evolution, high expression of HMA4 in Noccaea caerulescens nucest/). Four additional, recently published, complete CDS was also found to result from copy number expansion and reg- datasets were also downloaded at that time from the respective ulatory changes (O’Lochlainn et al., 2011; Craciun et al., 2012). lab websites (Table S3). In addition, differences in expression levels of both HMA3 and To annotate phylogenetic trees and to summarize the taxo- HMA4 were linked to variations in gene copy number between nomic distribution of the genes of interest, the NCBI N. caerulescens populations exhibiting contrasted metal accumu- database (Federhen, 2012) was weekly downloaded from the lation and tolerance (Ueno et al., 2011; Craciun et al., 2012). NCBI FTP server.

Frontiers in Plant Science | January 2014 | Volume 4 | Article 544 | 2 Hanikenne and Baurain Evolution of plant metal ATPases

HMM SEARCHES least reliably aligned positions. Finally, for Figure 4 and File S3, HMM profiles were computed with hmmbuild and then used to the sequences having more than 50% missing characters with query our complete proteomes with hmmsearch. hmmbuild and respect to the longest sequence were discarded. hmmsearch are components of the HMMER software package The filtered alignments were analyzed using PhyloBayes 3.3e (version 3.0) (http://hmmer.org/) (see also Durbin et al., 1998). (Lartillot et al., 2009) under the C20 empirical profile mixture Specifically, the 826-match-state HMM profile representing all model, which leads to more accurate trees than ordinary empir- three architectures of prokaryotic P-type ATPases was computed ical matrices (e.g., WAG or JTT) by accounting for site-specific from a 1656-AA alignment of the 82 Firmicutes proteins (e.g., amino-acid replacement patterns (Le et al., 2008). Each time, two Bacillus, Staphylococcus) described in Table 5 of Chan et al. (2010). independent chains were run for 1800–5500 cycles (depending The 1265-match-state HMM profile used to detect eukaryotic on the dataset) until convergence of the various chain param- metal ATPases was computed from a 2978-AA alignment of all eters (about 30–40 CPU days by chain). Statistical support is 873 metal ATPases identified in the non-redundant set of 393 a built-in feature of Bayesian inference and is provided in the prokaryotic proteomes. form of posterior probabilities (PP). Inferred topologies were largely congruent with those obtained in a maximum likelihood ALIGNMENTS AND PHYLOGENETIC ANALYSES (ML) framework using RAxML 7.2.8 (Stamatakis, 2006) under Large trees (Figures 1, 4, Files S1–S3) the LG + F + 4 model (data not shown, Yang, 1993; Le and Protein sequences were aligned through 10 iterations (–iter Gascuel, 2008). The LG model is an empirical matrix similar to option) of Clustal Omega 1.1.0 using default settings (Sievers WAG and JTT, but was estimated by incorporating the variability et al., 2011). The resulting alignments were filtered to elimi- of evolutionary rates across sites. nate poorly aligned positions and partial sequences using the Bio-MUST-Core software package (Denis Baurain, unpublished). Small trees (Figures 7, 8, Files S4–S7) Briefly, positions due to insertions in less than 50% of the The protein sequences corresponding to the four eukaryotic sequences were discarded. Gblocks 0.91b (Castresana, 2000)was metal ATPase subtrees were pruned from the alignment used then used with either loose (Figure 1A and File S1)ormedium in Figure 4. After automatic iterative re-alignment with Clustal (Figures 1B, 4 and Files S2, S3) parameters to further filter the Omega, each alignment was hand-curated using the editor of the MUST software package (Philippe, 1993). EST sequences were then added using the “forty” software package (Denis A Baurain, unpublished) that completely automates EST database architecture AI mining, contig assembly, and introduction of translated sequence (transition metal) [PIB] fragments into existing protein alignments, while controlling for orthology relationships based on databases of paralogous architecture AII sequences. The augmented alignments were once again hand- (Ca, Na/K, H, Mg) [PII/PIII] curated in the MUST editor (i) to selectively merge the multiple protein fragments belonging to a single gene from a single organ- architecture AIII ism, (ii) to shorten (or even discard) the protein fragments with (Kdp K) too many anomalous amino acids resulting from the translation [PIA] of low-quality EST sequences, (iii) to locally improve the auto- 0.4 mated alignments carried out by forty, and (iv) to iteratively B [PIB] remove protein fragments that were too partial to be accurately positioned in the tree [as estimated from preliminary parsimony + AI.Me trees built using PAUP∗ 4.0b10 (Swofford, 2002)]. Finally, the

0.82 resulting alignments were filtered as above using Bio-MUST-Core to eliminate poorly aligned positions (Gblocks medium), except AI.Me2+ that no further filter for partial sequences was applied to avoid discarding informative protein fragments predicted from EST contigs. FIGURE 1 | Phylogeny of P-type ATPases (A) and metal ATPases in ML trees were computed with PhyML 3.0 (Guindon et al., prokaryotes (B). The two trees were obtained with PhyloBayes (C20 2010) using the LG + F + 4 model. The starting tree for model) from the analysis of protein alignments of 1495 sequences × 295 the heuristic search was computed by parsimony and the search unambiguously aligned amino acid (AA) positions (A) and 873 sequences × included both NNI (nearest-neighbor interchange) and SPR 361 AA (B), respectively. In the schemes, tree branches were collapsed according to protein architecture (A) or metal specificity (B).In(A), the tree (subtree pruning and regrafting) topological moves. Statistical was rooted using the Kdp K family as outgroup, whereas the tree is support was estimated through the analysis of 100 bootstrap unrooted in (B). Statistical support is provided as posterior probabilities pseudo-replicates (Felsenstein, 1985). (PP). The scale bar at the bottom gives the number of substitutions per site. AI-AIII, architecture subtypes; Px, P-type ATPase subgroups (see text for SEQUENCE AND TREE ANNOTATIONS details). The corresponding fully annotated trees are provided as NEXUS Files S1, S2, respectively. HMMER reports and phylogenetic trees were automatically annotated for substrate specificity based on the best BLAST

www.frontiersin.org January 2014 | Volume 4 | Article 544 | 3 Hanikenne and Baurain Evolution of plant metal ATPases

hit (Altschul et al., 1997) of each sequence against two small (PIIC), proton (PIIIA), and magnesium (PIIIB) ATPases. Note databases of annotated reference P-type ATPase sequences assem- that phospholipid flippases (PIV) are also part of subtype AII, bled from prokaryotes (Chan et al., 2010)orA. thaliana (Axelsen but are only found in Eukaryotes (Axelsen and Palmgren, 1998; and Palmgren, 2001), respectively. Chan et al., 2010). Finally, Kdp-potassium ATPases (PIA) belong Tree rooting was done manually in Seaview (Gouy et al., 2010). to architecture subtype AIII (Chan et al., 2010). We elected to use Tree ladderization and annotation (e.g., substrate specificity, tax- HMM profiles rather than specific BLAST searches both to max- onomy, SPC/CPC motif), as well as coloring and collapsing imize detection sensitivity and to minimize sampling biases that at predetermined taxonomic levels, were automatically conducted might be caused by non-random selection of a limited number of using Bio-MUST-Core and further arranged in FigTree (http:// query sequences. tree.bio.ed.ac.uk/software/figtree/) running under Bio-Linux 7 In the HMM search report, we observed a breakdown in (http://nebc.nerc.ac.uk/tools/bio-linux/) (Field et al., 2006). All E-values around 1e-20, which allowed distinguishing P-type annotated trees are provided as Files S1–S7 in NEXUS format, ATPases from other ATP-binding protein families with high while the corresponding sequence alignments (before the Gblocks specificity. At this E-value threshold, we retrieved a total of step) are given as a ZIP archive in Data Sheet S1. 1495 proteins from 336 Bacteria and 23 Archaea, respectively (Table S1). The 1495 proteins were then aligned and subjected SEQUENCE LOGOS to Bayesian phylogenetic inference, using an evolutionary model Aligned sequences from Files S4–S7 were collected and the puta- that allows for different functional constraints across protein sites tive metal-binding regions regions corresponding to TM6 and (see Methods). For easy interpretation, the resulting phylogenetic TM7/8 were extracted. Then each of the eight sub-alignments tree (Figure 1A, File S1) was annotated with both substrate speci- was split into two parts, one with the green plants sequences ficity and taxonomic information (see Methods). Prokaryotic (Viridiplantae) and the other with the remaining sequences of P-type ATPases robustly clustered into three groups, correspond- the corresponding tree (outgroup). Depending on the origin of ing to the three protein architectures (Figure 1A). Among these, the green plant genes, outgroup corresponded to other Plantae or metal ATPases are monophyletic (i.e., formed a discrete and and/or to Cyanobacteria or Chlamydiae. Sequence homogeneous group) and represent the most common P-type logoswerethencomputedusingWebLogo3.3(Schneider and ATPases found in Bacteria and Archaea (Figures 1A, 2A, 3), in Stephens, 1990; Crooks et al., 2004), without compositional agreement with previous findings (Argüello, 2003; Chan et al., adjustment. 2010; Palmgren and Nissen, 2011). To dissect relationships between Me+ and Me2+ ATPases in OTHER ANALYSES our prokaryotic dataset, the 873 metal ATPases were extracted All other figures (e.g., bubble and bar charts, 2-set Venn dia- from the tree, realigned and once more subjected to Bayesian grams) were produced in R (R-Development-Core-Team, 2008) phylogenetic inference (Figure 1B, File S2). Notably, the tree fea- and further annotated using Inkscape (http://inkscape.org/). tured a fairly supported branch (posterior probability or PP = eulerAPE was used to generate the 3-set proportional Venn dia- 0.82) separating Me+ and Me2+ATPases, which suggests that gram (http://www.eulerdiagrams.org/eulerAPE/). cation selectivity is a conserved property among these proteins. Although, we cannot rule out that one of these two groups might RESULTS be paraphyletic (i.e., ancestral to the other, more derived, one), METAL ATPases ARE A MONOPHYLETIC GROUP IN PROKARYOTES we favor the hypothesis of two monophyletic groups based on The full set of completely sequenced prokaryotic genomes their wide taxonomic distribution (Figure 3B), in agreement with + 2+ (∼2300 Bacteria and Archaea genomes on the NCBI server in Chan et al. (2010). This means that Me and Me ATPases much likely stem from the duplication of an ancestral metal ATPase February 2013) is too large to be readily amenable to phylogenetic + analysis. Therefore, a clustering based on phylogenomic distance early in the evolution of Life. Note however that Me ATPases 2+ measures (Moreno-Hagelsieb et al., 2013) was used to assem- aremorecommonthanMe ATPases in extant prokaryotes ble a non-redundant set of 352 complete genomes encompassing (Figures 2B, 3A). the whole taxonomic diversity of prokaryotes (see Methods). The set included 326 Bacteria (including 3 Cyanobacteria) and ORIGIN OF EUKARYOTIC METAL ATPases 26 Archaea (Table S1). As we were particularly interested in As prokaryotic metal ATPases are monophyletic, a single HMM examining the ancestry of plant, algal, and cyanobacterial metal profile was computed from the 873 metal ATPases described ATPases, 41 additional cyanobacterial genomes were added to this above and used to search the complete proteomes of 219 non-redundant genome set (Table S1). Eukaryotes, which covered a good share of eukaryotic diversity To ascertain the relationships of metal ATPases within the (including e.g., , Plantae, and Stramenopiles; for a P-type ATPase family among prokaryotes, we searched the pro- full list see Table S2). teins of our genome set with a hidden Markov model (HMM) Each protein in the HMM search report was further anno- profile computed from 82 Firmicutes P-type ATPases represent- tated for substrate specificity based on similarity with A. thaliana ing all three architecture subtypes found in prokaryotes (see P-type ATPases (described in Axelsen and Palmgren, 2001). At an Introduction, Chan et al., 2010). Proteins of architecture subtype E-value threshold of 1e-78, we retrieved a total of 942 proteins AI correspond to metal ATPases (PIB). Proteins of architecture from 216 species corresponding mostly to proteins annotated as subtype AII correspond to calcium (PIIA), sodium/potassium Me+ and Me2+ ATPases (Table S2). Metal ATPases were thus

Frontiers in Plant Science | Plant Physiology January 2014 | Volume 4 | Article 544 | 4 Hanikenne and Baurain Evolution of plant metal ATPases

A (Kdp K) AIII A AI 20 873 AI.Me+ 1 (transition metal) 550

99 77 3

AII 16 514 AI.Me2+

# genomes 323 (Ca, Na/K, H, Mg) 142 AII (Ca, Na/K, H, Mg) AIII AI 100 150108 200 250 300 350 (Kdp K) (transition metal) 16 18 20 22 24 26 28 B #phyla 2+ AI.Me B 338 238 101 330 203 Dictyoglomi Caldiserica Chrysiogenetes Deferribacteres Elusimicrobia Gemmatimonadetes 8135195 Korarchaeota [Archaea] Thermodesulfobacteria Aquificae Fusobacteria Synergistetes Deinococcus−Thermus Nitrospirae unclassified Bacteria + Fibrobacteres/Acidobacteria AI.Me Tenericutes Thermotogae Planctomycetes Chlamydiae/Verrucomicrobia Chloroflexi FIGURE 2 | Co-occurrence statistics for the three architectures (AI–AIII) Crenarchaeota [Archaea] + + of P-type ATPases (A) and for Me (Cu, Ag) and Me2 (Zn, Cd...) Euryarchaeota [Archaea] ATPases (B) in prokaryotic genomes, respectively. Areas are proportional Spirochaetes Bacteroidetes/Chlorobi to the number of individual genomes containing a given protein subtype. In Actinobacteria (B), architecture AI proteins were further divided into two groups (AI.Me+ Cyanobacteria 2+ Firmicutes and AI.Me ) based on metal specificity. Proteobacteria 0.0 0.2 0.4 0.6 0.8 1.0 AI AII AIII AI.Me+ AI.Me2+ found in all eukaryotic proteomes but three (one , FIGURE 3 | Taxonomic distribution of the three architectures (AI–AIII) Entamoeba histolytica, and two Metazoa, Choloepus hoffmanni, of P-type ATPases and Me+ (Cu, Ag) and Me2+ (Zn, Cd...) ATPases in Nasonia vitripennis, respectively), but these exceptions might be prokaryotes. (A) Each bubble is located in the graph according to the number of genomes (y-axis) and phyla (x-axis) containing each architecture due to extreme sequence divergence or genome mis-annotation. subtype. The bubble areas are proportional to the total number of proteins The 942 eukaryotic proteins were aligned with the 873 prokary- of a given architecture type identified in the prokaryotic dataset. The AI otic metal ATPases described above. To minimize artifacts dur- bubble in yellow corresponds to the sum of the AI.Me+ (green) and + ing phylogenetic reconstruction, 54 conspicuously short protein AI.Me2 (blue) bubbles, since these are two subgroups of the former. (B) sequences (50 from Eukaryotes and 4 from Bacteria), mostly Relative distribution of each architecture type among prokaryotic phyla (according to NCBI Taxonomy). The total numbers of individual genomes corresponding to incomplete splice variants or mis-modeled containing each architecture subtype are given on top of the bars. proteins, were discarded from the alignment (see Methods). A Architecture AI proteins were further divided into two groups (AI.Me+ and large phylogenetic tree was then obtained by Bayesian inference AI.Me2+) based on metal specificity. (Figure 4, File S3). After addition of eukaryotic metal ATPases, Me+ and Me2+ ATPases again clustered into two distinct groups, thus further supporting the conclusions drawn from Figure 1B. Altogether, 665 Me+ ATPases and 218 Me2+ ATPases were iden- We next examined the relationships among eukaryotic and tified in Eukaryotes (Table S2). A group of nine proteins cor- prokaryotic metal ATPases in the tree and observed that four responding to calcium ATPases were also present in the set subgroups of eukaryotic proteins had distinct evolutionary ori- (designed as outgroup in Figure 4). They were not included in gins (Figure 4). First, eukaryotic metal ATPases clustering with further analyses. prokaryotic Me+ ATPases resolved into two subgroups, both

www.frontiersin.org January 2014 | Volume 4 | Article 544 | 5 Hanikenne and Baurain Evolution of plant metal ATPases

0.92 Eukaryota IB-1 ATPases Embryophyta 0.98 1 AtPAA2-like 1 1 0.63 1 Embryophyta 0.98 Mamiellales AtPAA1-like 0.57 1 1 IB-1 ATPases (chloroplast) Chlorophyta 0.83 0.99 1 1 Cyanobacteria 0.97 CtaA Cyanobacteria 1 + Bacteria + Archaea AI.Me

1 Cyanobacteria PacS

Embryophyta 0.69 1 Chlorophyta 0.87 Bacillariophyta 0.71 1 Bigelowiella_ natans_ 753081@18724 IB-2 ATPases (CPC) 0.99 Thraustochytriaceae 1 Emiliania huxleyi CCMP1516 1 0.95 1 1 Aureococcus_anophagefferens_44056@72032 Nematostella_ vectensis_ 45351@NEMVEDRAFT_ v1g70040-P A [HM|ref] Carboxydothermus hydrogenoformans Z-2901 [78044655] [CP C] Bacteria 0.96 Bacteria + Archaea 1 Bacteria + Archaea 2+ 0.56 AI.Me Bacteria + Archaea 0.97 Embryophyta 0.78 1 0.5 Chlorophyta 0.76 0.99 1 Chlamydiales 1 IB-4 ATPases (SPC) Cyanidioschyzon_ merolae_ 45157@CMS330CT 1 Bacteria 1 0.8 Bacteria 0.97 Bacteria 0.5 Bacteria + Archaea 1 Bacteria Bacteria+Archaea 0.52 0.5 outgroup (architecture AII) 1 0.4

FIGURE 4 | Phylogeny of IB metal ATPases in prokaryotes and eukaryotic proteins. The tree was rooted using a small group of Eukaryotes. The tree was obtained with PhyloBayes (C20 model) from architecture AII proteins as outgroup. Statistical support is provided as the analysis of a protein alignment of 1761 sequences × 340 AA. Tree PP. The scale bar at the bottom gives the number of substitutions branches were colored based on homogeneous taxonomic composition per site. The corresponding fully annotated tree is provided as NEXUS and collapsed to highlight relationships between prokaryotic and File S3. beingpartofIB-1P-typeATPasesdefinedbyAxelsen and (ii) Eukaryotic chloroplast Me+ ATPases involved in copper Palmgren (1998): transport into chloroplasts were found in Viridiplantae only (i.e., land plants and ) (Figure 4, Table 1)and + (i) Eukaryotic Me ATPases involved in copper pumping into clustered separately from the first subgroup described above. the secretory pathway or in copper exclusion from cells all Even though our Bayesian tree suggests that these two clustered together. Hence, ATP7A and ATP7B fell within a eukaryotic subgroups might be quite close in the prokaryotic large clade of proteins, as expected (Wang et al., diversity, this result is not compelling because the statistical 2011). Similarly, Plantae proteins involved in related cellular support is weak (PP = 0.63) and because this association was functions (e.g., HMA5 and RAN1 in A. thaliana)fellwithin not recovered when analysing the same alignment in a ML + this subgroup as well (Table 1 and File S4). In fact, these Me framework under another model (not shown). Chloroplast ATPases were recovered in all but four eukaryotic genomes Me+ ATPases provide copper for incorporation in the elec- included in our dataset (Figure 5, Table S2) and accounted tron carrier plastocyanin and/or copper/zinc superoxide dis- + for the vast majority of eukaryotic Me ATPases. Since mutases (Nouet et al., 2011). The presence of two copper this subgroup is moreover not associated to any particular ATPases in chloroplasts, one in the inner membrane (e.g., prokaryotic phyla, it corresponds to the eukaryotic counter- PAA1 in A. thaliana)(Shikanai et al., 2003)andonein + part of prokaryotic Me ATPases, which likely retained the the thylakoid membrane (e.g., PAA2 in A. thaliana)(Abdel- ancestral function tracing back to the common ancestor of Ghany et al., 2005), is an ancestral feature of Viridiplantae prokaryotes and Eukaryotes. (see also Merchant et al., 2006; Blaby-Haas and Merchant,

Frontiers in Plant Science | Plant Physiology January 2014 | Volume 4 | Article 544 | 6 Hanikenne and Baurain Evolution of plant metal ATPases

Table 1 | Overview of origin and taxonomic distribution of IB metal P-type ATPases in Eukaryotes.

Monovalent metal P-type ATPases Divalent metal P-type ATPases

P-type ATPase subgroupa IB-1 IB-1 (chloroplast) IB-2 IB-4 Architecture subtypeb II I I A. thaliana gene names HMA5, RAN1 PAA1-2 HMA2-4 HMA1 Conserved motif in TM6 CPC CPC CPC SPC

Origin of gene Eukaryotic ancestor Cyanobacteria Eukaryotic or ancestor Chlamydiae Eukaryotic distribution Widespread Viridiplantae-specific Patchy Plantae-specific Figures Figures 4, 5 Figures 4, 5 Figures 4, 5 and 7 Figures 4, 5 and 8 Suppl. NEXUS files Files S3, S4 Files S3, S5 Files S3, S6 Files S3, S7

Glaucophytes Y N N Y Y N N Y Plantae Green algae (Chlorophyta) Y Y Y Y Green algae Y Y Y Y (charophytes)

Viridiplantae Land plants Y Y Y Y

Streptophyta (Embryophyta) aAccording to Axelsen and Palmgren (1998), Argüello (2003). bAccording to Thever and Saier (2009), Chan et al. (2010).

2012). These two paralogous proteins are orthologous to Fornicata the lone cyanobacterial CtaA copper ATPase (Figure 4), thus Heterolobosea indicating acquisition through primary endosymbiosis, fol- Haptophyceae lowed by a duplication of the original gene. In Cyanobacteria, Cryptophyta CtaA is localized in the plasma membrane, carrying out cop- Alveolata SAR clade per import into the cell, and is required for plastocyanin Stramenopiles

150 200 Rhodophyta Plantae assembly (Phung et al., 1994; Tottey et al., 2001). In contrast, Viridiplantae Opisthokonta copper transport into cyanobacterial thylakoids is ensured unikonts by the PacS copper ATPase (Kanamaru et al., 1994; Tottey et al., 2001). Our tree shows that PacS proteins are not + 100 related to eukaryotic chloroplast Me ATPases, but are part # genomes of the diversity of prokaryotic copper ATPases (Figure 4), in agreement with Williams and Mills (2005). The chloroplast

copper uptake system thus represents a nice example of gene 50 duplication followed by neofunctionalization to maintain function: the CtaA and PacS proteins found in the plasma membrane and thylakoid membrane of Cyanobacteria are 0 functional homologs of PAA1-like and PAA2-like proteins IB-1 IB-1 (cp) IB-2 (CPC) IB-4 (SPC) found in chloroplasts, respectively, but both eukaryotic pro- monovalent metals (Cu) divalent metals (Zn, Cd...) teins are co-orthologs of CtaA only (Figure 4). Finally, note that chloroplast Me+ ATPases are missing in FIGURE 5 | Taxonomic distribution of the four subgroups of metal merolae (an early-branching red alga), which is also lack- ATPases in Eukaryotes. The y-axis gives the number of genomes ing plastocyanin, as previously described (Hanikenne et al., containing at least one protein of a given subgroup. 2005).

Our phylogenetic analysis thus suggests that IB-1 P-type ATPases defined by Axelsen and Palmgren (1998) could be further divided Table S2). These Me2+ ATPases displayed a narrower taxonomic into two subgroups, as described above, to take into account our distribution than Me+ ATPases and were found in only 58 species observation that the additional eukaryotic IB-1 ATPases found (Figure 5, Table S2). They resolved into two subgroups corre- in chloroplasts have a distinct origin with respect to the bulk of sponding to IB-2 and IB-4 P-type ATPases described by Axelsen eukaryotic IB-1 ATPases. and Palmgren (1998).AsforMe+ ATPases, we observed that Second, a more limited number of 218 eukaryotic P-type each subgroup of Me2+ ATPases had a distinct evolutionary ATPases clustered with prokaryotic Me2+ ATPases (Figure 4, origin:

www.frontiersin.org January 2014 | Volume 4 | Article 544 | 7 Hanikenne and Baurain Evolution of plant metal ATPases

(i) Subgroup IB-2 of Me2+ATPases, which corresponds to the A Dictyoglomi — — HMA2, HMA3, and HMA4 proteins of A. thaliana (Figure 4), Caldiserica — —

200 Chrysiogenetes CPC — had a patchy taxonomic distribution among mostly pho- Deferribacteres CPC — Elusimicrobia — — tosynthetic organisms: land plants (), green Gemmatimonadetes — SPC algae (chlorophytes), Stramenopiles [diatoms, pelagophytes Korarchaeota [Archaea] — — Thermodesulfobacteria — — (Aureococcus anophagefferens) and non-photosynthetic Aquificae CPC — Fusobacteria CPC — thraustochytrids], (Bigelowiella natans) Synergistetes CPC — and coccolithophorid (Emiliania huxleyi) Deinococcus−Thermus CPC — 2+ Nitrospirae — — (Figures 4, 5, Table 1). Surprisingly, one Me ATPase was unclassified Bacteria CPC — Fibrobacteres/Acidobacteria CPC — also found in the starlet sea anemone (Nematostella vectensis). Tenericutes CPC SPC

Considering our broad sampling of animal proteomes, the # genomes Thermotogae CPC — Planctomycetes CPC SPC latter is probably the result of a Chlamydiae/Verrucomicrobia CPC SPC Chloroflexi CPC SPC (HGT) or of a contamination during genome sequencing. Crenarchaeota [Archaea] CPC — Proteins of this subgroup have the canonical CPx (mostly Euryarchaeota [Archaea] CPC — Spirochaetes CPC SPC CPC) motif of metal ATPases in the sixth predicted trans- Bacteroidetes/Chlorobi CPC SPC Actinobacteria CPC SPC membrane domain (Figures 4, 6;seealsoFigure 9). Cyanobacteria CPC SPC Based on a partial view of the taxonomic distribution Firmicutes CPC SPC Proteobacteria CPC SPC 0 50 100 150 of this subgroup (mostly in primary photosynthetic species, IB-2 (CPC) IB-4 (SPC)

Bacteria and Archaea but excluding most Eukaryotes), it divalent metals (Zn, Cd...) has been so far assumed that the IB-2 Me2+ pumps found in plants evolved from chloroplast proteins acquired from B the cyanobacterial , and were later recruited to the plasma membrane or to the (Cobbett et al., SPC 2003; Pedersen et al., 2012). Our phylogenetic analysis sug- gests otherwise, though. Since eukaryotic IB-2 Me2+ ATPases are no more related to extant Cyanobacteria than to the whole prokaryotic diversity of Me2+ ATPases (Figure 4), this 17 40 146 means that these proteins were not introduced into photo- synthetic Eukaryotes by the primary endosymbiosis. Instead, we propose the following hypothesis: IB-2 Me2+ ATPases fulfil an ancient function, tracing back to the common ances- tor of prokaryotes and Eukaryotes, and their genes expe- rienced multiple parallel losses among eukaryotic lineages. Such a scenario would explain their patchy distribution, CPC which is not restricted to Plantae or even to photosyn- thetic organisms (thraustochytrids are non-photosynthetic FIGURE 6 | Taxonomic distribution of IB-2 (CPC) and IB-4 (SPC) Me2+ Stramenopiles). Recent HGT between Eukaryotes cannot be ATPases in prokaryotes. (A) The y-axis gives the number of genomes fully ruled out but appears unlikely in this case because containing at least one protein of a given subgroup. The two columns on the right list a qualitative taxonomic breakdown of the co-occurrence green plant sequences are monophyletic in our tree. Similarly, statistics shown in (B), in which areas are proportional to the number of sodium/potassium (Na/K) P-type ATPases are present in individual genomes containing a given protein type. prokaryotes and animals but not in plants (Palmgren and Nissen, 2011). (ii) Subgroup IB-4 of eukaryotic Me2+ ATPases includes proteins + related to the A. thaliana HMA1 protein (Figure 4), which IB-2 subgroup of eukaryotic Me2 ATPases (e.g., AtHMA2, is localized in the chloroplast (Seigneurin-Berny et al., 2006; −3, and −4). Kim et al., 2009; Mikkelsen et al., 2012). It is specific to Moreover, proteins of the IB-4 subgroup share an unchar- Plantae (or Archaeplastida, i.e., green plants, red algae, and acteristic Ser/Pro/Cys (SPC) motif in the sixth predicted ) (Figures 4, 5, Table 1). However, this subgroup transmembrane domain instead of the common Cys-Pro- clustered with proteins belonging to the bacterial Cys/His/Ser (CPx) motif characteristic of all other metal Chlamydiae, and not with Cyanobacteria, as usually observed P-type ATPases (see also Argüello, 2003; Pedersen et al., for Plantae-specific proteins. HMA1-like proteins are in fact 2012). An exception is represented by a paraphyletic subset part of a small group of about 50 genes, essentially linked of chlorophytes (e.g., Ostreococcus, Micromonas, Coccomyxa) to chloroplastic functions that are thought to result from that possess an even more unusual APC motif, whereas direct HGT from Chlamydiae into the common ancestor of Volvocales green algae (e.g., Chlamydomonas)andthered Plantae (Huang and Gogarten, 2007; Moustafa et al., 2008; alga C. merolae display the SPC motif (Figure 4,seealso Baum, 2013). Thus, IB-4 Me2+ ATPases have an origin that Hanikenne et al., 2005). A switch from a CPx motif to a SPC is both very clear and completely distinct from that of the or APC motif may alter metal specificity of the transporters

Frontiers in Plant Science | Plant Physiology January 2014 | Volume 4 | Article 544 | 8 Hanikenne and Baurain Evolution of plant metal ATPases

within this subgroup (Cobbett et al., 2003; Hanikenne et al., Taxonomic distribution of the four subgroups within Plantae 2005; Williams and Mills, 2005) and may be related to the was as follows: broader metal specificity reported for AtHMA1 compared to + AtHMA2–4 (see Introduction). Me2 ATPases of 57 Bacteria (i) Me+ ATPases were found in all Plantae phyla, thus con- (including Chlamydiae) are strongly associated (PP = 1) firming their broad taxonomic distribution in Eukaryotes + with this IB-4 subgroup of eukaryotic Me2 ATPases. As (Table 1, File S4). The observation of the tree revealed a expected, they also all present the SPC motif (Figures 4, 6; complex evolutive history in Plantae. From a single ances- see also Figure 9), except three species that show the APC tral protein, as found in extant chlorophytes, two gene + motif instead. The CPx motif, which is also present in Me duplication events occurred during the evolution of strep- ATPases, corresponds to the ancestral state (Figure 6). IB-4 tophytes: the first before the colonization of land and the + Me2 ATPases (SPC) are absent in Archaea and in (sup- second later, probably in the (angiosperms posedly) early-branching Bacteria (Thermotogae, Aquificae, and ) ancestor (File S4). A gene copy was and Fusobacteria). Therefore, following the terminology pro- later lost in Brassicaceae, resulting in the presence of only + posed by Battistuzzi and Hedges (2009),IB-4Me2 ATPases two proteins (e.g., AtHMA5 and AtRAN1) in this fam- + (SPC) possibly derived from IB-2 Me2 ATPases (CPx) after ily. Note that gene duplication events also occurred in a gene duplication having affected the common ancestor of chlorophytes after the split from streptophytes (e.g., in C. Hydrobacteria and Terrabacteria. Secondary loss of IB-2 or reinhardtii,seeBlaby-Haas and Merchant, 2012). + IB-4 Me2 ATPases then independently occurred in several (ii) Chloroplast Me+ ATPases are present in Viridiplantae lineages (Figure 6). only, which suggests secondary loss in both glaucophytes andrhodophytes.Suchalossmaybelinkedtotheabsence EVOLUTION OF METAL ATPases IN PLANTAE of the major copper-requiring protein plastocyanin since, Plantae are a monophyletic group of Eukaryotes stemming from in glaucophytes and rhodophytes, transfer of photosyn- the primary endosymbiosis. It includes glaucophytes, red algae thetic electrons from cytochrome b6/f to PSI is carried out (rhodophytes), and green plants (Viridiplantae) (Cavalier-Smith, by cytochrome c6 (Matsuzaki et al., 2004; Price et al., 2012). 1981; Rodriguez-Ezpeleta et al., 2005). Green plants are fur- However, the presence of two chloroplast copper ATPases ther split into two major lineages, chlorophytes and strepto- is a shared feature of all Viridiplantae (File S5). + phytes. While chlorophytes contain only green algae, strepto- (iii) Within Plantae, IB-2 Me2 ATPases are specific to phytes include six lineages of green algae (commonly referred Viridiplantae and were found in chlorophytes, to as charophytes) and a seventh lineage corresponding to land charophytes, and land plants (Figure 7, File S6). In plants (embryophytes) (Bremer, 1985; Lewis and McCourt, 2004; angiosperms, several independent gene duplication events Laurin-Lemay et al., 2012). Depending on the availability of occurred in different families. For example, a triplication complete genomes, our initial eukaryotic set only represented a specific to Brassicaceae resulted in the presence of three limited diversity of Plantae. Hence, it included no , terminal paralogs (e.g., AtHMA2–4), whereas other gene a single red alga, only micro-chlorophytes (e.g., Ostreococcus, duplications or triplications took place in Papilionoideae Micromonas, Chlamydomonas) and no charophyte. Similarly, and monocots, respectively (Figure 7). + completely sequenced genomes of land plants do not yet cover (iv) Finally, IB-4 Me2 ATPases (SPC) are present in all their whole diversity, and many important groups (e.g., gym- Plantae, where a single protein of this type was found in nosperms) were missing from our initial analyses. most species (Figure 8). These proteins are thus orthol- To better study the evolutionary trajectory of the four metal ogous across Bacteria (including the Chlamydiae donor) ATPase subgroups described in the previous section, we mined and Plantae. Note that a gene duplication occurred in EST databases and newly available complete proteomes from Brassicaceae, followed by a gene loss in A. thaliana only, 44 additional Plantae (Table S3). EST inclusion into the general resulting in the presence of a single HMA1 in this species. metal ATPase alignment of Figure 4 followed a careful procedure aimed at ensuring that only true orthologs were retained for fur- FUNCTIONAL CONSERVATION ther analyses (see Methods). This resulted in the addition of 71 Conserved amino acids, mostly with polar side chains, were iden- proteins from 24 Plantae, whereas the remaining 20 EST datasets tified in TM domains 6, 7, and 8 of PIB-type ATPases. Forming did not yield any metal ATPase, generally because of their limited putative TM metal binding domains, these invariant amino acids amount of sequences. may play a role in determining metal specificity of the pumps For each of the four metal ATPase subgroups, an enriched (Argüello, 2003; Argüello et al., 2007). To examine the conserva- alignment was then built and thoroughly hand-curated before tion of these motifs in the four IB subgroups in extant green plant being subjected to ML inference (see Methods; Files S4–S7). Note sequences (Viridiplantae) (Figures 7, 8, Files S4–S7), we com- that the four trees obtained were only used to examine pres- puted sequence logos for the TM6 and TM7–8 regions (Figure 9). ence/absence of the various subgroups at the phylum level. We The remaining sequences in each corresponding tree were used as did not attempt to assemble an inventory of those proteins in the outgroups, which are the closest available proxies to the ancestral different species because EST mining often yields partial and non- sequences that were at the base of each PIB-type ATPase sub- overlapping protein sequences, which do not represent the whole group. IB-1 and IB-1 chloroplast copper ATPases all share a CPC gene complement of the corresponding organism. motif in TM6, and the presence of invariant YN (TM7) and M,

www.frontiersin.org January 2014 | Volume 4 | Article 544 | 9 Hanikenne and Baurain Evolution of plant metal ATPases

Arabidopsis 100 Capsella_rubella_81985@20897962 Brassicacae I Brassica rapa (AtHMA4) Eutrema_ halophilum_ 98038@20202304 0.2 Capsella rubella Arabidopsis Eutrema_ halophilum_ 98038@20194930 Brassicacae II 99 Brassica_ rapa_ 3711@22703739 (AtHMA2) 100 Brassica_ rapa_ 3711@22710111 Capsella rubella 51 100 Arabidopsis Brassicacae III Brassica rapa 20 Eutrema_ halophilum_ 98038@20196026 (AtHMA3) Linum usitatissimum 15 Carica_ papaya_ 3649@16428863 Gossypium raimondii Theobroma cacao 12 34 Gossypium_ raimondii_ 29730@26773321 Citrus Manihot_ esculenta_ 3983@17986735 10 11 Ricinus_ communis_ 3988@16801483 P opulus_ trichocarpa_ 3694@27007613 Glycine max 100 11 Phaseolus_vulgaris_3885@27143309 Papilionoideae I 33 Medicago_ truncatula_ 3880@23048587 Cucumis_sativus_3659@16957079 24 Eucalyptus_ grandis_ 71139@23569622 Malus domestica 98 P runus_ persica_ 3760@17660986 46 Fragaria_ vesca_ 57918@27251768 20 Fragaria_ vesca_ 57918@27252124 Cucumis_sativus_3659@16956316 Vitis_ vinifera_ 29760@17822919 Phaseolus vulgaris 37 100 Glycine_ max_ 3847@26331536 Papilionoideae II Glycine_ max_ 3847@26340842 97 Medicago truncatula 39 Citrus 29 Eucalyptus_ grandis_ 71139@23571742 59 Mimulus guttatus 100 Solanum_ lycopersicum_ 4081@27293190 [email protected] Aquilegia_coerulea_218851@22040791 Zea mays Sorghum_ bicolor_ 4558@1956148 Panicumvirgatum 81 Setaria_ italica_ 4555@19713448 monocots I 99 Hordeum vulgare Brachypodium_ distachyon_ 15368@21815269 100 Oryza_ sativa_ 4530@24111357 Sorghum_ bicolor_ 4558@1956149 Zea_ mays_ 4577@20833593 100 Zea_ mays_ 4577@20841827 monocots II P anicum_ virgatum_ 38727@23814755 Setaria_ italica_ 4555@19713923 100 Zea mays 96 84 Sorghum_ bicolor_ 4558@1984937 P anicum_ virgatum_ 38727@23761139 Setaria_ italica_ 4555@19705476 monocots III 96 Brachypodium_ distachyon_ 15368@21821976 Hordeum vulgare Hordeum vulgare moellendorffii Physcomitrella_patens_3218@18038000 97 44 70 76 Syntrichia_ ruralis_ [email protected] P hyscomitrella_ patens_ 3218@18046267 21 [email protected] 97 77 P inus_ taeda_ [email protected] 44 Cryptomeria_ japonica_ [email protected] 80 Welwitschia_ mirabilis_ [email protected] Klebsormidium_ flaccidum_ [email protected] Micromonas 99 100 Ostreococcus_ sp._ 385169@21022 32 Ostreococcus Aureococcus_ anophagefferens_ 44056@72032 Chlorophyta 70 100 Chlorella_vulgaris_3077@60160 51 Coccomyxa_ subellipsoidea_ 574566@27387189 84 Chlorokybus_ atmophyticus_ [email protected] Chlorella_variabilis_554065@140903 P haeodactylum tricornutum 56 97 Fragilariopsis_ cylindrus_ 186039@206868 Fragilariopsis cylindrus 100 Fragilariopsis_cylindrus_186039@188713 64 100 P haeodactylum_ tricornutum_ 2850@P hatr676 Stramenopiles 80 Thalassiosira_pseudonana_35128@Thaps261657 Aurantiochytrium_ limacinum_ 717989@142191 69 99 Aplanochytrium_ kerguelense_ 702273@93495 Schizochytrium_ aggregatum_ 876976@84170

FIGURE 7 | Phylogeny of IB-2 Me2+ ATPases in Eukaryotes. The tree was or above) are shown. The scale bar at the bottom gives the number of obtained with PhyML (LG + F + 4 model) from the analysis of a protein substitutions per site. Note that Aureococcus anophagefferens alignment of 138 sequences × 496 AA. The tree was rooted using (Stramenopiles, Pelagophyceae) is located among chlorophytes, possibly due Stramenopiles as outgroup and, for clarity, branches were collapsed at the to a long-branch artifact (Felsenstein, 1978). The corresponding NEXUS file is genus level. Bootstrap proportions for selected nodes (mostly at family level provided as File S6.

SS (TM8) amino acids (Figures 9A,B). These are characteristic of In our sequence logos, additional conserved positions (shared copper transporters, although Y → F substitutions were observed between Viridiplantae and their outgroups) can be specifically in the outgroup (Figure 9A). Mutations in those motifs inactivate observed for each of the four subgroups; these residues may also the enzyme (Mandal et al., 2004) or have been linked to Wilson’s play a role in metal specificity (Figure 9). disease (Argüello et al., 2007). K (TM7) and D (TM8) residues are conserved in IB-2 Me2+ DISCUSSION ATPases (Figure 9C). Mutations in the D residue of ZntA of E. Our phylogenetic analyses based on a taxonomically representa- coli inactivate the pump, suggesting that this residue could be tive set of prokaryotic genomes and on all annotated eukaryotic involved in Zn coordination (Dutta et al., 2006). Finally, a HEGxT genomes confirmed that metal ATPases are monophyletic and motif in TM8 is shared by IB-4 Me2+ ATPases, in addition to the can be further divided into two groups, Me+ and Me2+,based APC/SPC motif in TM6, although the function of these residues on substrate specificities (Figure 1)(seeAxelsen and Palmgren, has not been investigated yet (Figure 9D). 1998; Argüello, 2003; Chan et al., 2010). In Eukaryotes, metal

Frontiers in Plant Science | Plant Physiology January 2014 | Volume 4 | Article 544 | 10 Hanikenne and Baurain Evolution of plant metal ATPases

Populus trichocarpa Manihot_ esculenta_ 3983@17990231 Ricinus_ communis_ 3988@16814679 0.2 20 Manihot_ esculenta_ 3983@17980552 Linum usitatissimum Gossypium raimondii Theobroma_ cacao_ 3641@27436809 Arabidopsis_ thaliana_ 3702@19648322 32 Capsella_ rubella_ 81985@20894021 97 Arabidopsis_ lyrata_ 59689@16045505 Brassicacae I Eutrema_halophilum_98038@20195769 100 Brassica_ rapa_ 3711@22703654 76 Arabidopsis_ lyrata_ 59689@16049188 Capsella_ rubella_ 81985@20904082 31 100 Brassica_ rapa_ 3711@22717060 Brassicacae II Eutrema_halophilum_98038@20208651 Carica_ papaya_ 3649@16423693 46 Citrus P haseolus_ vulgaris_ 3885@27144267 Glycine_ max_ 3847@26312878 Glycine_ max_ 3847@26320385 100 79 Streptophyta Cucumis_sativus_3659@16951729 43 Malus_ x_ 3750@22674726 64 P runus_ persica_ 3760@17653204 100 Fragaria_ vesca_ 57918@27259443 86 Eucalyptus grandis Vitis_ vinifera_ 29760@17816801 74 Solanum 100 Mimulus_ guttatus_ 4155@17671846 Aquilegia coerulea Zea mays 86 Sorghum bicolor Panicumvirgatum Setaria italica 90 98 Hordeum vulgare Brachypodium_ distachyon_ 15368@21820101 47 Oryza sativa [email protected] 100 P hyscomitrella patens 54 Marchantia_ polymorpha_ [email protected] Chaetosphaeridium_ globosum_ [email protected] 64 56 Selaginella_moellendorffii_88036@15418477 Ostreococcus [APC] 100 Micromonas [APC] 100 Chlorella_ vulgaris_ 3077@83263 [APC] 100 45 Coccomyxa_ subellipsoidea_ 574566@27391688 [APC] Chlorophyta 75 Chlorella_ variabilis_ 554065@136900 Chlamydomonas_reinhardtii_3055@27571089 100 41 Volvox_ carteri_ 3067@23135823 Chondrus_ crispus_ [email protected] 100 51 Eucheuma_ denticulatum_ [email protected] 51 P yropia_ yezoensis_ 2788@contig_ 15565_ g3723.H1.1 Rhodophyta Cyanidioschyzon_ merolae_ 45157@CMS330CT 69 65 Galdieria_ sulphuraria_ [email protected] 28 P orphyridium_ purpureum_ [email protected] [email protected] Glaucophyta [HM] Simkania negevensis Z [338732433] [SP C] 74 71 [HM] Waddlia chondrophila WSU 86-1044 [297621848] [SP C] Chlamydiae 41 Chlamydophila psittaci RD1 [392377044] [SP C] [HM] Coraliomargarita akajimensis DSM 45221 [294055086] [SP C] 100 [HM] Opitutus terrae P B90-1 [182415296] [S P C ]

FIGURE 8 | Phylogeny of IB-4 Me2+ ATPases in Eukaryotes. The tree was collapsed at the genus taxonomic level. Bootstrap proportions for selected obtained with PhyML (LG + F + 4 model) from the analysis of an alignment nodes (mostly at family level or above) are shown. The scale bar at the of 91 sequences × 563 AA. The tree was rooted using the bottom gives the number of substitutions per site. The corresponding NEXUS Chlamydiae/Verrumicrobia as outgroup and, for clarity, branches were file is provided as File S7.

ATPases clustered into four subgroups scattered in the prokary- distribution in Eukaryotes would then result from multiple inde- otic diversity. Observation of the relationships among the four pendent losses. subgroups of eukaryotic metal ATPases and with their prokary- Chloroplast IB-1 Me+ ATPases and IB-4 Me2+ ATPases are otic counterparts suggested that they had diverse evolutionary restricted to Viridiplantae and Plantae, respectively (Figure 5, origins, an interpretation that was strengthened by the analysis Table 1). Chloroplast IB-1 Me+ ATPases were acquired from of the taxonomic composition of these four subgroups (Figure 5, a cyanobacterial ancestor in the course of primary endosym- Table 1). biosis (Figure 4, File S5), whereas IB-4 Me2+ ATPases arose IB-1 Me+ ATPases are very common (Figures 3, 5) and pos- from a HGT event from Chlamydiae into the Plantae ancestor sibly represent an ancestral function tracing back to LUCA (Archaeplastida) (Figures 4, 8). (Last Universal Common Ancestor, the most recent ances- Looking at detailed phylogenetic analyses allow disentangling tor of extant living organisms) (e.g., Koonin, 2010). Common the complex orthology and co-orthology relationships between to all Viridiplantae, IB-2 Me2+ ATPases displayed a patchy Plantae metal ATPases, which cannot be achieved through mere distribution in other Eukaryotes, appearing to be limited to similarity searches (e.g., BLAST). In addition to the example some bikont lineages (Figure 5)(Stechmann and Cavalier-Smith, of the two Viridiplantae paralogs of chloroplast copper ATPases 2002). Eukaryotic IB-2 Me2+ ATPases are clearly monophyletic, originating from a single cyanobacterial protein ancestor (see which rules out multiple recruitments from prokaryotes dur- above), our analyses revealed that several gene duplication events ing eukaryotic evolution (from Cyanobacteria in Viridiplantae occurred within IB-1 Me+ and IB-2 Me2+ ATPases in Plantae. for example), and are equally related to the whole diversity of These gene duplications, some of them associated to whole their prokaryotic homologues, thus suggesting that they also trace genome duplication events (e.g., Van De Peer et al., 2009), back to LUCA (or at least to the bikont ancestor). Their patchy were often restricted to specific lineages (e.g., a triplication

www.frontiersin.org January 2014 | Volume 4 | Article 544 | 11 Hanikenne and Baurain Evolution of plant metal ATPases

A monovalent metal P-type ATPases (IB-1) I.B-1 Viridiplantae I.B-1 Viridiplantae 1.0 L 1.0 V **VLG V S GI L C**A 0.5 0.5 L T R V V L W G LF F F L II I A G Q V S W L L I WLR AM P F LK V AF I M T M A M V V F H V M M I L S I S A I G S V G Q L A V Y G I C G G T I C F A EV SS M A V V K L K M S M YL YP L I V AA A A M A L A F S P W A A Y Y P I SL V L T AIM M V VT P LA PT N F N TL A S P LM T 0.0 0.0 I I610 620 630 922 930 940 950 960 966 I.B-1 outgroup I.B-1 outgroup 1.0 1.0 G **LL A VIRR G **L S A L F I F V A L C 0.5 VL 0.5 V L A Y A I L V A L V I Y G VA I L LL LGQ V L A T S R VL A A Y L G A F L A T F L I I Q IA F V A A T M W A L MVGF VMF F W L L W F T V TMI C C G I I G S S S P VF LAAPSVYM VNFF W AMVFNSMFVPVAG PPPRREV Q MPP Y A S VL M GS T V IEAVAV I I I 0.0 I 0.0 610 620 630 922 930 940 950 960 966

B monovalent metal P-type ATPases (IB-1 chloroplast) I.B-1 (cp) Viridiplantae I.B-1 (cp) Viridiplantae 1.0 V 1.0 W F ** QY F L **G D A V G V V G AL A G A 0.5 0.5 A A I AT L V V S T I M F S V L L G IL IV F L C V IF G A H I T V A A S S L I Q IL G A L A I V E A N V S G M C I L MLDL T T L M M M F N C C G F G T SS S TF P I L T TM S T M S G V A C S C L T L F V LF N M A N L W L LRI I I I A VM P L LAAPTAM N A YN VP AA P V A A M V V 0.0 0.0 I A 547 550 560 567 864 870 880 890 900 908 I.B-1 (cp) outgroup I.B-1 (cp) outgroup 1.0 1.0 F ** L I SF L A **F T L L K I S I A L 0.5 A 0.5 L F IA V Y A V G LV PL S IS I AW G G G T L A A L T V T LL N A A L G PL F S L C V A RQA I A I G T M L T Q L VM I I S T A SS L MA I C C G I G T SS G G M LSW VA V LF LHD FN GF C V M VS T G VMV A P AL LATP VLS N VLA YNVFSLPVAM TGFPQTNVS GPVVS FGMAM LCV 0.0 IL I I I 0.0 I II I 547 550 560 567 864 870 880 890 900 908

C divalent metal P-type ATPases (IB-2 CPC) I.B-2 Viridiplantae I.B-2 Viridiplantae 1.0 1.0 T A * E VL GT F H V L S V A 0.5 V 0.5 V F I A L I T A I Q A A A A I G L A A VL A L G G L Y N L S I V A Y F G G F V P A F G L S L L I LS V A L ML F C S L S T V G V H V VM M V I AV I I V A M T V S S E V M V V V F A T SF I I M T V R VL M V G V VV TG MA R F S L S V S L G A T C C S I A M I I G I A S VA G T M D L M II I G V CV C M C V QA AG C L T M M I I I T Y QV LL T M LTA W T W TDT M T N AL K CQ VY I P SLL TP L FP TL F S SVAV A I I I I 0.0 0.0 I 350 360 370 658 660 670 680 690 700 702 I.B-2 outgroup I.B-2 outgroup 1.0 1.0 A V GL KMH A * A L I I V V VS Q IVV L L LAG TS A G L G 0.5 0.5 VV A A V T Y S K L F F MN VT WL G L V L S L S LI T V F LTL G C G V V S T S A IL TV TPLP Y V L AA T AYA L VGFKA VAA FGVDLD V LA A M TA MTL I I C C I I I S VLG P AMTLAAPMVVS ENL AFLSNS A S AMLYRSPLLWAVSVDLVVLVAVVTNASRP L FV I I II I I I I 0.0 I 0.0 I 350 360 370 658 660 670 680 690 700 702

D divalent metal P-type ATPases (IB-4 SPC) I.B-4 Viridiplantae I.B-4 Viridiplantae 1.0 1.0 TS VL S I 0.5 0.5 F T N A S F L F L A L V A V S L G C MV L A V L G T S A I G A I L V V LG A V A VM S C A VF L A P F Q V L A M S A L L L F A L MV T A I S A CL F C V W F T V I S C G G L L G I G A A A V A A S L M L V A M G A T T C L F M S MF M T S I S I C G V C C C S G G T S MAWLMV AP A V PLVY LL A AV VLVV TL YVPLW AVT HE T V V N R L M A A A 0.0 I 0.0 I I I 403 410 420 423 739 740 I 750 760 770 780 783 I.B-4 outgroup I.B-4 outgroup 1.0 1.0 L G A G A V A IL IA L S A I L I GCIL ASS A A IV I G V 0.5 0.5 I L T L F V Y S T T M A L W A I A C G L L T L I F MVV ALP VF A A G F I I L V CFV MGGAG R W A C V I Q I G I A V L S L L V N G A AVFMSTTLS S SL AMC FA F S C I I V II I GS G M A V M T T P ALLL TP T AF T A ENFLLSTV TMSTVWTTSVFTMPLSVGVLVHE TL VCLS LRL 0.0 I 0.0 I I 403 410 420 423 739 740 750 760 770 780 783 TM6 TM7 TM8

FIGURE 9 | Sequence logos of TM domains 6–8 of the four subgroups of background, while an asterisk designates the amino-acid residues whose metal P-type ATPases (A–D). The height of each letter gives the occurrence mutations inactivate the enzymes. The number of sequences used for frequency of the corresponding amino-acid residue across the aligned computing each logo was as follows: 224 (A-Viridiplantae), 13 (A-outgroup: sequences (y-axes), without compositional adjustment for simplicity. x-axes Glaucophyta and Rhodophyta), 133 (B-Viridiplantae), 43 (B-outgroup: are numbered according to the corresponding A. thaliana proteins: AtHMA5 Cyanobacteria), 124 (C-Viridiplantae), 13 (C-outgroup: Stramenopiles), 79 (A), AtPAA1 (B), AtHMA4 (C), AtHMA1 (D). The CPx/SPx motif is on gray (D-Viridiplantae), 12 (D-outgroup: Chlamydiae, Glaucophyta and Rhodophyta).

Frontiers in Plant Science | Plant Physiology January 2014 | Volume 4 | Article 544 | 12 Hanikenne and Baurain Evolution of plant metal ATPases

resulting in the presence of HMA2, HMA3 and HMA4 paralogs Argüello, J., Eren, E., and González-Guerrero, M. (2007). The structure and in Brassicaceae). This highlights that the generalization of func- function of heavy metal transport P1B-ATPases. Biometals 20, 233–248. doi: tional data/hypotheses obtained in A. thaliana to plants outside 10.1007/s10534-006-9055-6 Argüello, J. M. (2003). Identification of ion-selectivity determinants in heavy-metal Brassicaceae has to be done with care. transport P1B-type ATPases. J. Membr. Biol. 195, 93–108. doi: 10.1007/s00232- Altogether, our phylogenetic analyses shed light on the evolu- 003-2048-2 tion of metal ATPases in Eukaryotes, with a particular focus on Axelsen, K. B., and Palmgren, M. G. (1998). Evolution of substrate speci- Plantae. It also provides a solid phylogenetic framework for their ficities in the P-type ATPase superfamily. J. Mol. Evol. 46, 84–101. doi: functional analyses outside model plant species. 10.1007/PL00006286 Axelsen, K. B., and Palmgren, M. G. (2001). Inventory of the superfam- ily of P-type ion pumps in Arabidopsis. Plant Physiol. 126, 696–706. doi: ACKNOWLEDGMENTS 10.1104/pp.126.2.696 Funding for this study was provided by the “Fonds de la Battistuzzi, F. U., and Hedges, S. B. (2009). A major clade of prokaryotes Recherche Scientifique – FNRS” (FRFC-2.4583.08 and PDR- with ancient adaptations to life on land. Mol. Biol. Evol. 26, 335–343. doi: T.0206.13 to Marc Hanikenne), the “Fonds Spéciaux du Conseil 10.1093/molbev/msn247 Baum, D. (2013). The origin of primary : a pas de deux or a ménage à trois? de la Recherche”, University of Liège [to Marc Hanikenne 25, 4–6. doi: 10.1105/tpc.113.109496 (SFRD-12/03) and to Denis Baurain (SFRD-12/04)]. High- Benson, D. A., Cavanaugh, M., Clark, K., Karsch-Mizrachi, I., Lipman, D. J., performance computing (HPC) resources were provided by the Ostell, J., et al. (2013). GenBank. Nucleic Acids Res. 41, D36–D42. doi: FNRS (NIC3). Marc Hanikenne is a Research Associate of the 10.1093/nar/gkt1030 Binder, B. M., Rodríguez, F. I., and Bleecker, A. B. (2010). The copper transporter FNRS. RAN1 is essential for biogenesis of ethylene receptors in Arabidopsis. J. Biol. Chem. 285, 37263–37270. doi: 10.1074/jbc.M110.170027 SUPPLEMENTARY MATERIAL Blaby-Haas, C. E., and Merchant, S. S. (2012). The ins and outs of The Supplementary Material for this article can be found algal metal transport. Biochim. Biophys. Acta 1823, 1531–1552. doi: online at: http://www.frontiersin.org/journal/10.3389/fpls. 10.1016/j.bbamcr.2012.04.010 Bremer, K. (1985). Summary of green plant phylogeny and classification. Cladistics 2013.00544/abstract 1, 369–385. doi: 10.1111/j.1096-0031.1985.tb00434.x Castresana, J. (2000). Selection of conserved blocks from multiple alignments Table S1 | Overview of prokaryotic genomes included in the study. The for their use in phylogenetic analysis. Mol. Biol. Evol. 17, 540–552. doi: number of P-type ATPases identified in each genome is provided for each 10.1093/oxfordjournals.molbev.a026334 architecture subtype (AI–AIII) and, for architecture AI, for each of Cavalier-Smith, T. (1981). kingdoms: seven or nine? Biosystems 14, monovalent metal (AI.Me+) and divalent metal (AI.Me2+) groups. 461–481. doi: 10.1016/0303-2647(81)90050-2 Chan, H., Babayan, V., Blyumin, E., Gandhi, C., Hak, K., Harake, D., et al. (2010). Table S2 | Overview of eukaryotic genomes included in the study. The The p-type ATPase superfamily. J. Mol. Microbiol. Biotechnol. 19, 5–104. doi: + 2+ number of monovalent metal (AI.Me ) and divalent metal (AI.Me ) 10.1159/000319588 ATPases identified in each genome is provided. Chao, D.-Y., Silva, A., Baxter, I., Huang, Y. S., Nordborg, M., Danku, J., et al. (2012). Genome-wide association studies identify heavy metal ATPase3 as the primary Table S3 | Overview of additional ESTs and genomes included in the study. determinant of natural variation in leaf cadmium in Arabidopsis thaliana. PLoS Genet. 8:e1002923. doi: 10.1371/journal.pgen.1002923 File S1 | Phylogenetic tree of P-type ATPases in prokaryotes (NEXUS file). Cobbett, C. S., Hussain, D., and Haydon, M. J. (2003). Structural and func- tional relationships between type 1 heavy metal-transporting P-type ATPases File S2 | Phylogenetic tree of IB metal ATPases in prokaryotes (NEXUS file). B in Arabidopsis. New Phytol. 159, 315–321. doi: 10.1046/j.1469-8137.2003. File S3 | Phylogenetic tree of IB metal ATPases in prokaryotes and 00785.x Eukaryotes (NEXUS file). Courbot, M., Willems, G., Motte, P., Arvidsson, S., Roosens, N., Saumitou-Laprade, P., et al. (2007). A major QTL for Cd tolerance in Arabidopsis halleri co- + File S4 | Phylogenetic tree of IB-1 Me ATPases in Plantae (NEXUS file). localizes with HMA4, a gene encoding a heavy metal ATPase. Plant Physiol. 144, File S5 | Phylogenetic tree of IB-1 chloroplast Me+ ATPases in 1052–1065. doi: 10.1104/pp.106.095133 Cyanobacteria and Viridiplantae (NEXUS file). Craciun, A. R., Meyer, C.-L., Chen, J., Roosens, N., De Groodt, R., Hilson, P., et al. (2012). Variation in HMA4 gene copy number and expression among 2+ File S6 | Phylogenetic tree of IB-2 Me ATPases in Stramenopiles and Noccaea caerulescens populations presenting different levels of Cd tolerance and Viridiplantae (NEXUS file). accumulation. J. Exp. Bot. 63, 4179–4189. doi: 10.1093/jxb/ers104 File S7 | Phylogenetic tree of IB-4 Me2+ ATPases in Chlamydiae and Crooks, G. E., Hon, G., Chandonia, J.-M., and Brenner, S. E. (2004). WebLogo: a Plantae (NEXUS file). sequence logo generator. Genome Res. 14, 1188–1190. doi: 10.1101/gr.849004 Durbin, R., Eddy, S., Krogh, A., and Mitchinson, G. (1998). Biological Sequence Data Sheet 1 | Alignments in FASTA format for Files S1–S7. Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge, UK: Cambridge University Press. doi: 10.1017/CBO9780511790492 Dutta, S. J., Liu, J., Hou, Z., and Mitra, B. (2006). Conserved aspartic acid 714 REFERENCES in transmembrane segment 8 of the ZntA subgroup of P1B-type ATPases is a Abdel-Ghany, S. E., Müller-Moulé, P., Niyogi, K. K., Pilon, M., and Shikanai, T. metal-binding residue. Biochemistry 45, 5923–5931. doi: 10.1021/bi0523456 (2005). Two P-type ATPases are required for copper delivery in Arabidopsis Federhen, S. (2012). The NCBI taxonomy database. Nucleic Acids Res. 40, thaliana chloroplasts. Plant Cell 17, 1233–1251. doi: 10.1105/tpc.104.030452 D136–D143. doi: 10.1093/nar/gkr1178 Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J., Zhang, Z., Miller, W., et al. Felsenstein, J. (1978). Cases in which parsimony or compatibility methods will be (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database positively misleading. Syst. Zool. 27, 401–410. doi: 10.2307/2412923 search programs. Nucleic Acids Res. 25, 3389–3402. doi: 10.1093/nar/25.17.3389 Felsenstein, J. (1985). Confidence limits on phylogenies: an approach using the Andrés-Colás, N., Sancenón, V., Rodríguez-Navarro, S., Mayo, S., Thiele, D. J., bootstrap. Evolution 39, 783–791. doi: 10.2307/2408678 Ecker, J. R., et al. (2006). The Arabidopsis heavy metal P-type ATPase HMA5 Field, D., Tiwari, B., Booth, T., Houten, S., Swan, D., Bertrand, N., et al. (2006). interacts with metallochaperones and functions in copper detoxification of Open software for biologists: from famine to feast. Nat. Biotechnol. 24, 801–803. . Plant J. 45, 225–236. doi: 10.1111/j.1365-313X.2005.02601.x doi: 10.1038/nbt0706-801

www.frontiersin.org January 2014 | Volume 4 | Article 544 | 13 Hanikenne and Baurain Evolution of plant metal ATPases

Flicek, P., Ahmed, I., Amode, M. R., Barrell, D., Beal, K., Brent, S., et al. (2013). Laurin-Lemay, S., Brinkmann, H., and Philippe, H. (2012). Origin of land plants Ensembl 2013. Nucleic Acids Res. 41, D48–D55. doi: 10.1093/nar/gks1236 revisited in the light of sequence contamination and missing data. Curr. Biol. 22, Frérot, H., Faucon, M. P., Willems, G., Gode, C., Courseaux, A., Darracq, A., et al. R593–R594. doi: 10.1016/j.cub.2012.06.013 (2010). Genetic architecture of zinc hyperaccumulation in Arabidopsis halleri: Le, Q. S., Gascuel, O., and Lartillot, N. (2008). Empirical profile mixture models for the essential role of QTL x environment interactions. New Phytol. 187, 355–367. phylogenetic reconstruction. Bioinformatics 24, 2317–2323. doi: 10.1093/bioin- doi: 10.1111/j.1469-8137.2010.03295.x formatics/btn445 Goodstein, D. M., Shu, S., Howson, R., Neupane, R., Hayes, R. D., Fazo, J., et al. Le, S. Q., and Gascuel, O. (2008). An improved general amino acid replacement (2012). Phytozome: a comparative platform for green plant genomics. Nucleic matrix. Mol. Biol. Evol. 25, 1307–1320. doi: 10.1093/molbev/msn067 Acids Res. 40, D1178–D1186. doi: 10.1093/nar/gkr944 Lewis, L. A., and McCourt, R. M. (2004). Green algae and the origin of land plants. Gouy, M., Guindon, S., and Gascuel, O. (2010). SeaView version 4: a multiplatform Am. J. Bot. 91, 1535–1556. doi: 10.3732/ajb.91.10.1535 graphical user interface for sequence alignment and phylogenetic tree building. Mandal, A. K., Yang, Y., Kertesz, T. M., and Arguello, J. M. (2004). Identification of Mol. Biol. Evol. 27, 221–224. doi: 10.1093/molbev/msp259 the transmembrane metal binding site in Cu+-transporting PIB-type ATPases. Grigoriev, I. V., Nordberg, H., Shabalov, I., Aerts, A., Cantor, M., Goodstein, J. Biol. Chem. 279, 54802–54807. doi: 10.1074/jbc.M410854200 D., et al. (2012). The genome portal of the Department of Energy Matsuzaki, M., Misumi, O., Shin, I. T., Maruyama, S., Takahara, M., Miyagishima, Joint Genome Institute. Nucleic Acids Res. 40, D26–D32. doi: 10.1093/nar/ S. Y., et al. (2004). Genome sequence of the ultrasmall unicellular red alga gkr947 Cyanidioschyzon merolae 10D. Nature 428, 653–657. doi: 10.1038/nature02398 Guindon, S., Dufayard, J. F., Lefort, V., Anisimova, M., Hordijk, W., and Gascuel, Merchant, S., Allen, M. D., Kropat, J., Moseley, J. L., Long, J. C., Tottey, O. (2010). New algorithms and methods to estimate maximum-likelihood phy- S., et al. (2006). Between a rock and a hard place: trace element logenies: assessing the performance of PhyML 3.0. Syst. Biol. 59, 307–321. doi: nutrition in Chlamydomonas. Biochim. Biophys. Acta 1763, 578–594. doi: 10.1093/sysbio/syq010 10.1016/j.bbamcr.2006.04.007 Hanikenne, M., Krämer, U., Demoulin, V., and Baurain, D. (2005). A comparative Mikkelsen, M. D., Pedas, P., Schiller, M., Vincze, E., Mills, R. F., Borg, S., et al. inventory of metal transporters in the green alga Chlamydomonas reinhardtii (2012). Barley HvHMA1 is a heavy metal pump involved in mobilizing organel- and the red alga Cyanidioschizon merolae. Plant Physiol. 137, 428–446. doi: lar Zn and Cu and plays a role in metal loading into grains. PLoS ONE 7:e49027. 10.1104/pp.104.054189 doi: 10.1371/journal.pone.0049027 Hanikenne, M., Kroymann, J., Trampczynska, A., Bernal, M., Motte, P., Clemens, Miyadate, H., Adachi, S., Hiraizumi, A., Tezuka, K., Nakazawa, N., Kawamoto, S., et al. (2013). Hard selective sweep and ectopic gene conversion in a T., et al. (2011). OsHMA3, a P1B-type of ATPase affects root-to-shoot cad- gene cluster affording environmental adaptation. PLoS Genet. 9:e1003707. doi: mium translocation in rice by mediating efflux into vacuoles. New Phytol. 189, 10.1371/journal.pgen.1003707 190–199. doi: 10.1111/j.1469-8137.2010.03459.x Hanikenne, M., and Nouet, C. (2011). Metal hyperaccumulation and hypertol- Morel, M., Crouzet, J., Gravot, A., Auroy, P., Leonhardt, N., Vavasseur, A., erance: a model for plant evolutionary genomics. Curr. Opin. Plant Biol. 14, et al. (2009). AtHMA3, a P1B-ATPase allowing Cd/Zn/Co/Pb vacuolar 252–259. doi: 10.1016/j.pbi.2011.04.003 storageinArabidopsis.Plant Physiol. 149, 894–904. doi: 10.1104/pp.108. Hanikenne, M., Talke, I. N., Haydon, M. J., Lanz, C., Nolte, A., Motte, P., et al. 130294 (2008). Evolution of metal hyperaccumulation required cis-regulatory changes Moreno, I., Norambuena, L., Maturana, D., Toro, M., Vergara, C., Orellana, A., et al. and triplication of HMA4. Nature 453, 391–395. doi: 10.1038/nature06877 (2008). AtHMA1 is a thapsigargin sensitive Ca2+/heavy metal pump. J. Biol. Hirayama, T., Kieber, J. J., Hirayama, N., Kogan, M., Guzman, P., Nourizadeh, Chem. 283, 9633–9641. doi: 10.1074/jbc.M800736200 S., et al. (1999). RESPONSIVE-TO-ANTAGONIST1, a Menkes/Wilson disease Moreno-Hagelsieb, G., Wang, Z., Walsh, S., and Elsherbiny, A. (2013). related copper transporter, is required for ethylene signaling in Arabidopsis. Cell Phylogenomic clustering for selecting non-redundant genomes for comparative 97, 383–393. doi: 10.1016/S0092-8674(00)80747-3 genomics. Bioinformatics 29, 947–949. doi: 10.1093/bioinformatics/btt064 Huang, J., and Gogarten, J. (2007). Did an ancient chlamydial endosymbio- Moustafa, A., Reyes-Prieto, A., and Bhattacharya, D. (2008). Chlamydiae has con- sis facilitate the establishment of primary plastids? Genome Biol. 8:R99. doi: tributed at least 55 genes to plantae with predominantly functions. PLoS 10.1186/gb-2007-8-6-r99 ONE 3:e2205. doi: 10.1371/journal.pone.0002205 Hussain, D., Haydon, M. J., Wang, Y., Wong, E., Sherson, S. M., Young, J., et al. Nouet, C., Motte, P., and Hanikenne, M. (2011). Chloroplastic and mito- (2004). P-type ATPase heavy metal transporters with roles in essential zinc chondrial metal homeostasis. Trends Plant Sci. 16, 395–404. doi: homeostasis in Arabidopsis. Plant Cell 16, 1327–1339. doi: 10.1105/tpc.020487 10.1016/j.tplants.2011.03.005 Kanamaru, K., Kashiwagi, S., and Mizuno, T. (1994). A copper-transporting O’Lochlainn, S., Bowen, H. C., Fray, R. G., Hammond, J. P., King, G. J., White, P. P-type ATPase found in the thylakoid membrane of the cyanobac- J., et al. (2011). Tandem quadruplication of HMA4 in the zinc (Zn) and cad- terium Synechococcus species PCC7942. Mol. Microbiol. 13, 369–377. doi: mium (Cd) hyperaccumulator Noccaea caerulescens. PLoS ONE 6:e17814. doi: 10.1111/j.1365-2958.1994.tb00430.x 10.1371/journal.pone.0017814 Kersey, P. J., Staines, D. M., Lawson, D., Kulesha, E., Derwent, P., Humphrey, Palmgren, M. G., and Nissen, P. (2011). P-type ATPases. Annu. Rev. Biophys. 40, J. C., et al. (2012). Ensembl genomes: an integrative resource for genome- 243–266. doi: 10.1146/annurev.biophys.093008.131331 scale data from non-vertebrate species. Nucleic Acids Res. 40, D91–D97. doi: Pedersen, C. N. S., Axelsen, K. B., Harper, J. F., and Palmgren, M. G. 10.1093/nar/gkr895 (2012). Evolution of plant P-type ATPases. Front Plant Sci. 3:31. doi: Kim, Y. Y., Choi, H., Segami, S., Cho, H. T., Martinoia, E., Maeshima, M., 10.3389/fpls.2012.00031 et al. (2009). AtHMA1 contributes to the detoxification of excess Zn(II) in Philippe, H. (1993). MUST, a computer package of management utili- Arabidopsis. Plant J. 58, 737–753. doi: 10.1111/j.1365-313X.2009.03818.x ties for sequences and trees. Nucleic Acids Res. 21, 5264–5272. doi: Kobayashi, Y., Kuroda, K., Kimura, K., Southron-Francis, J. L., Furuzawa, A., 10.1093/nar/21.22.5264 Kimura, K., et al. (2008). Amino acid polymorphisms in strictly conserved Phung, L. T., Ajlani, G., and Haselkorn, R. (1994). P-type ATPase from the domains of a P-type ATPase HMA5 are involved in the mechanism of cop- cyanobacterium Synechococcus 7942 related to the human Menkes and Wilson per tolerance variation in Arabidopsis. Plant Physiol. 148, 969–980. doi: disease gene products. Proc. Natl. Acad. Sci. U.S.A. 91, 9651–9654. doi: 10.1104/pp.108.119933 10.1073/pnas.91.20.9651 Koonin, E. V. (2010). The two empires and three domains of life in the postgenomic Price, D. C., Chan, C. X., Yoon, H. S., Yang, E. C., Qiu, H., Weber, A. P. M., et al. age. Nat. Educ. 3, 27. (2012). genome elucidates origin of in Krämer, U. (2010). Metal hyperaccumulation in plants. Annu. Rev. Plant Biol. 61, algae and plants. Science 335, 843–847. doi: 10.1126/science.1213561 517–534. doi: 10.1146/annurev-arplant-042809-112156 R-Development-Core-Team. (2008). R: A Language and Environment for Statistical Kuhlbrandt, W. (2004). , structure and mechanism of P-type ATPases. Nat. Computing. Vienna: R Foundation for Statistical Computing. Rev. Mol. Cell Biol. 5, 282–295. doi: 10.1038/nrm1354 Rodriguez-Ezpeleta, N., Brinkmann, H., Burey, S. C., Roure, B., Burger, G., Lartillot, N., Lepage, T., and Blanquart, S. (2009). PhyloBayes 3: a Bayesian software Loffelhardt, W., et al. (2005). Monophyly of primary photosynthetic eukary- package for phylogenetic reconstruction and molecular dating. Bioinformatics otes: green plants, red algae, and glaucophytes. Curr. Biol. 15, 1325–1330. doi: 25, 2286–2288. doi: 10.1093/bioinformatics/btp368 10.1016/j.cub.2005.06.040

Frontiers in Plant Science | Plant Physiology January 2014 | Volume 4 | Article 544 | 14 Hanikenne and Baurain Evolution of plant metal ATPases

Schneider, T. D., and Stephens, R. M. (1990). Sequence logos: a new way Wang, Y., Hodgkinson, V., Zhu, S., Weisman, G. A., and Petris, M. J. (2011). to display consensus sequences. Nucleic Acids Res. 18, 6097–6100. doi: Advances in the understanding of mammalian copper transporters. Adv. Nutr. 10.1093/nar/18.20.6097 2, 129–137. doi: 10.3945/an.110.000273 Seigneurin-Berny, D., Gravot, A., Auroy, P., Mazard, C., Kraut, A., Finazzi, G., Willems, G., Dräger, D. B., Courbot, M., Gode, C., Verbruggen, N., and et al. (2006). HMA1, a new Cu-ATPase of the chloroplast envelope, is essen- Saumitou-Laprade, P. (2007). The genetic basis of zinc tolerance in the tial for growth under adverse light conditions. J. Biol. Chem. 281, 2882–2892. metallophyte Arabidopsis halleri ssp. halleri (Brassicaceae): an analysis of doi: 10.1074/jbc.M508333200 quantitative trait loci. Genetics 176, 659–674. doi: 10.1534/genetics.106. Shikanai, T., Muller-Moule, P., Munekage, Y., Niyogi, K. K., and Pilon, M. (2003). 064485 PAA1, a P-type ATPase of Arabidopsis, functions in copper transport in chloro- Willems, G., Frérot, H., Gennen, J., Salis, P.,Saumitou-Laprade, P.,and Verbruggen, plasts. Plant Cell 15, 1333–1346. doi: 10.1105/tpc.011817 N. (2010). Quantitative trait loci analysis of mineral element concentrations Sievers, F., Wilm, A., Dineen, D., Gibson, T. J., Karplus, K., Li, W., et al. (2011). Fast, in an Arabidopsis halleri x Arabidopsis lyrata petraea F2 progeny grown on scalable generation of high-quality protein multiple sequence alignments using cadmium-contaminated soil. New Phytol. 187, 368–379. doi: 10.1111/j.1469- Clustal Omega. Mol Syst Biol 7, 539. doi: 10.1038/msb.2011.75 8137.2010.03294.x Stamatakis, A. (2006). RAxML-VI-HPC: maximum likelihood-based phyloge- Williams, L. E., and Mills, R. F. (2005). P(1B)-ATPases–an ancient family of transi- netic analyses with thousands of taxa and mixed models. Bioinformatics 22, tion metal pumps with diverse functions in plants. Trends Plant Sci. 10, 491–502. 2688–2690. doi: 10.1093/bioinformatics/btl446 doi: 10.1016/j.tplants.2005.08.008 Stechmann, A., and Cavalier-Smith, T. (2002). Rooting the eukaryote tree by using Woeste, K. E., and Kieber, J. J. (2000). A strong loss-of-function mutation in RAN1 a derived gene fusion. Science 297, 89–91. doi: 10.1126/science.1071196 results in constitutive activation of the ethylene response pathway as well as a Swofford, D. L. (2002). PAUP∗. Phylogenetic Analysis Using Parsimony (∗and Other -lethal phenotype. Plant Cell 12, 443–455. doi: 10.1105/tpc.12.3.443 Methods). Version 4. Sunderland, MA: Sinauer Associates, Inc. Wong, C. K. E., and Cobbett, C. S. (2009). HMA P-type ATPases are the major Talke, I. N., Hanikenne, M., and Krämer, U. (2006). Zinc-dependent global tran- mechanism for root-to-shoot Cd translocation in Arabidopsis thaliana. New scriptional control, transcriptional deregulation, and higher gene copy number Phytol. 181, 71–78. doi: 10.1111/j.1469-8137.2008.02638.x for genes in metal homeostasis of the hyperaccumulator Arabidopsis halleri. Yang, Z. (1993). Maximum-likelihood estimation of phylogeny from DNA Plant Physiol. 142, 148–167. doi: 10.1104/pp.105.076232 sequences when substitution rates differ over sites. Mol. Biol. Evol. 10, Thever, M. D., and Saier, M. H. J. (2009). Bioinformatic characterization of P- 1396–1401. Type ATPases encoded within the fully sequenced genomes of 26 eukaryotes. J. Membr. Biol. 229, 115–130. doi: 10.1007/s00232-009-9176-2 Conflict of Interest Statement: The authors declare that the research was con- Tottey, S., Rich, P. R., Rondet, S. A. M., and Robinson, N. J. (2001). Two menkes- ducted in the absence of any commercial or financial relationships that could be type ATPases supply copper for photosynthesis in Synechocystis sp. 6803. J. Biol. construed as a potential conflict of interest. Chem. 276, 19999–20004. doi: 10.1074/jbc.M011243200 Ueno, D., Milner, M. J., Yamaji, N., Yokosho, K., Koyama, E., Clemencia Zambrano, Received: 15 July 2013; paper pending published: 12 August 2013; accepted: 12 M., et al. (2011). Elevated expression of TcHMA3 plays a key role in the extreme December 2013; published online: 07 January 2014. Cd tolerance in a Cd-hyperaccumulating ecotype of Thlaspi caerulescens. Plant Citation: Hanikenne M and Baurain D (2014) Origin and evolution of metal P- J. 66, 852–862. doi: 10.1111/j.1365-313X.2011.04548.x type ATPases in Plantae (Archaeplastida). Front. Plant Sci. 4:544. doi: 10.3389/fpls. Ueno, D., Yamaji, N., Kono, I., Huang, C. F., Ando, T., Yano, M., et al. (2010). 2013.00544 Gene limiting cadmium accumulation in rice. Proc. Natl. Acad. Sci. U.S.A. 107, This article was submitted to Plant Physiology, a section of the journal Frontiers in 16500–16505. doi: 10.1073/pnas.1005396107 Plant Science. Van De Peer, Y., Maere, S., and Meyer, A. (2009). The evolutionary signifi- Copyright © 2014 Hanikenne and Baurain. This is an open-access article distributed cance of ancient genome duplications. Nat. Rev. Genet. 10, 725–732. doi: under the terms of the Creative Commons Attribution License (CC BY). The use, dis- 10.1038/nrg2600 tribution or reproduction in other forums is permitted, provided the original author(s) Verbruggen, N., Hermans, C., and Schat, H. (2009). Molecular mechanisms or licensor are credited and that the original publication in this journal is cited, in of metal hyperaccumulation in plants. New Phytol. 181, 759–776. doi: accordance with accepted academic practice. No use, distribution or reproduction is 10.1111/j.1469-8137.2008.02748.x permitted which does not comply with these terms.

www.frontiersin.org January 2014 | Volume 4 | Article 544 | 15 Copyright of Frontiers in Plant Science is the property of Frontiers Media S.A. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use.