<<

View metadata, citation and similar papers at core.ac.uk brought to you by CORE

provided by Elsevier - Publisher Connector Current Biology, Vol. 12, 115–119, January 22, 2002, 2002 Elsevier Science Ltd. All rights reserved. PII S0960-9822(01)00649-2 A Cyanobacterial Gene in Nonphotosynthetic —An Early Acquisition in ?

Jan O. Andersson1 and Andrew J. Roger phytes, , , chlorarachnio- The Canadian Institute for Advanced Research phytes, and , by secondary or tertiary endo- Program in Evolutionary Biology symbioses, whereby photosynthetic eukaryotes were Department of Biochemistry engulfed by heterotrophic hosts [11–13]. However, gain and Molecular Biology of photosynthetic does not necessarily Dalhousie University mean that they will be permanently retained in a lineage. Halifax, Nova Scotia B3H 4H7 For example, loss of has occurred many Canada times in parasitic [14], dinoflagellates, [15] and apicomplexan protists, such as the malaria parasite Plasmodium falciparum, which harbor nonphotosyn- Summary thetic plastids [16]. The possibility of widespread sec- ondary loss of plastid functions makes it difficult to pin- Since the incorporation of mitochondria and chloro- point the timing of the primary endosymbiotic origin plasts (plastids) into the eukaryotic cell by endosym- of chloroplasts within eukaryotes. Like mitochondria, biosis [1], genes have been transferred from the orga- plastids could have entered eukaryotes relatively early nellar genomes to the nucleus of the host [2–4], via an on but been secondarily lost multiple times in various ongoing process known as endosymbiotic gene transfer nonphotosynthetic eukaryotic lineages. [5]. Accordingly, in photosynthetic eukaryotes, nuclear To address the timing of the chloroplast symbiosis in genes with cyanobacterial affinity are believed to have lineages, we selected one gene that has been originated from endosymbiotic gene transfer from chlo- suggested to represent an endosymbiotic gene transfer, roplasts. Analysis of the Arabidopsis thaliana genome gnd [6], for extensive phylogenetic sampling. We fo- has shown that a significant fraction (2%–9%) of the cused on nonphotosynthetic eukaryotic lineages that nuclear genes have such an endosymbiotic origin [3]. are closely related to photosynthetic eukaryotic lineages Recently, it was argued that 6-phosphogluconate dehy- (see Figure 1 in [22] for a current view of the phylogeny drogenase (gnd)—the second enzyme in the oxidative of eukaryotes), since they are likely to contain genes of pentose phosphate pathway—was one such example chloroplast origin if the chloroplast acquisition hap- [6]. Here we show that gnd genes with cyanobacterial pened earlier than previously thought. We PCR amplified affinity also are present in several nonphotosynthetic and sequenced gnd genes from the heteroloboseid protistan lineages, such as Heterolobosea, Apicom- amoeboflagellates gruberi, Naegleria ander- plexa, and parasitic Heterokonta. Current data cannot soni, and ; the diatom Pseudonitzschia definitively resolve whether these groups acquired the pungens; and the cellular (mycetozoan) Dic- gnd gene by primary and/or secondary endosymbiosis tyostelium discoideum, using degenerate primers against or via an independent lateral gene transfer event. Nev- conserved regions. The euglenozoans gracilis ertheless, our data suggest that chloroplasts were in- and sp., several species, the dino- troduced into eukaryotes much earlier than previously Prorocentrum lima, and the Spiro- thought and that several major groups of heterotro- nucleus barkhanus failed to yield positive gnd PCR prod- phic eukaryotes have secondarily lost photosynthetic ucts. cDNA clones from EST projects were retrieved and plastids. sequenced from the vaginalis, the oomycete Phytophthora infestans [17], the green alga Chlamydomonas reinhardtii [18], and the red alga Por- Results and Discussion phyra yezoensis [19]. The GenBank accession numbers for the reported sequences are AF394507–AF394516. By tracing genes of mitochondrial origin in the nucleus A data set of 73 eubacterial and eukaryotic gnd homo- of amitochondrial protists, it has been shown that most logs were retrieved from the databases, aligned, and, if not all amitochondrial groups are probably secondarily after exclusion of ambiguously aligned regions, 443 amitochondrial, suggesting that the origin of mitochon- amino acid positions (and the corresponding 1329 nu- dria may have been concurrent with the origin of the cleotides) were included in further analyses. One se- eukaryotic cell itself [7]. In contrast to mitochondria, it quence from each sequence pair with Ͼ85% amino acid is generally accepted that only a subset of eukaryotes sequence similarity was excluded. Five sequences (Buch- experienced the endosymbiosis that gave rise to pres- nera sp., Borellia burgdorferi, Treponema pallidum, ent-day chloroplasts [8]. Three lineages, Glaucophyta, Thermotoga maritima, and Plasmodium falciparum) Rhodophyta, and , possess primary plas- failed the ␹2 tests for deviation of amino acid frequencies tids that are thought to be derived from a single cyano- implemented in TREE-PUZZLE, version 4.02 [20] and bacterial endosymbiosis in their common ancestor after were excluded since the currently available phyloge- their divergence from other eukaryotic lineages [8–10]. netic methods cannot deal with strong amino acid com- Photosynthetic plastids then spread from these groups positional biases in the data [21], leaving 46 taxa for to other algal eukaryotes, including , hapto- extensive phylogenetic analysis. The protein maximum likelihood (ML) tree of the gnd gene data set is shown 1 Correspondence: [email protected] in Figure 1. With the exception of five eukaryotic se- Current Biology 116

Figure 1. Maximum Likelihood Tree of 6-Phosphogluconate Dehydrogenase Protein Sequences ML protein phylogenies were inferred using PROML within the PHYLIP package, version 3.6 [26], with a mixed four-category discrete-␥ model of among-site rate variation (Dayhoff ϩ⌫). Ten random additions with global rearrangements were used to find the optimal tree for the 46 taxon data set (shown). The ⌫ shape parameter ␣ was estimated to 0.69 and 0.65 for the 46 taxon and the 43 taxon short-branch data sets

(taxa labeled § excluded—see text), respectively, with no invariable sites detected (Pinv ϭ 0) using TREE-PUZZLE, version 4.02 [20]. Protein ML bootstrap values for bipartitions were calculated by analysis of 250 resampled data sets with one random addition with global re- arrangements. Support values Ͼ50% for bipartitions in the protein analysis of the larger data set are shown above the branches, followed by the support values of the smaller data set if they differed by more than 10% from the larger analysis (indicated by *). Nucleotide ML analysis based on the first and second codon positions (886 positions) were performed using PAUP*, version 4.0b8 [27]. Using MODELTEST, version 3.06 [28], it was found that a general time-reversible model incorporating a correction for among-site rate variation (using eight discrete rate categories) and invariable sites (GTR ϩ⌫ϩInv) best described all nucleotide data sets. The parameters were estimated from the data set

(␣ϭ1.37, Pinv ϭ 0.14). Nucleotide ML bipartition support values were obtained using 250 resampled data sets analyzed with one round of random stepwise addition and TBR branch swapping and are shown below the branches. Eukaryotes are labeled red, green, and other eubacteria black. Taxa containing primary and secondary plastids are highlighted with green and blue boxes, respectively. Vertical bar indicates outgroup used in branch length calculations (see text). quences, the eukaryotes group into two distinct clades: proteobacterial sequences indicate that the gnd gene sequences from heterokonts, Viridiplantae, Rhodo- has undergone lateral gene transfer within eubacteria phyta, and Heterolobosea form a “ ϩ clade,” [6] (Figure 1). On the other hand, the plant ϩ protist clade and metazoan, fungal, and a second rhodophyte homo- shows a specific affinity to an exclusively cyanobacterial log form a second grouping. The absence of known cluster, to the exclusion of all other eubacteria, indicat- archaebacterial gnd homologs [6] indicates that both ing an origin of this eukaryotic gnd version by gene eukaryotic gnd versions most likely have eubacterial transfer from cyanobacteria (Figure 1). Interestingly, the origins. It has been suggested that the /fungal taxa containing the cyanobacterial-like gnd gene neatly homologs may have originated from an endosymbiotic comprise a subset of organisms from a plant ϩ protist gene replacement from mitochondria [6]. However, this supercluster of eukaryotes recovered in recent com- interpretation is problematic, since the polyphyly of the bined protein phylogenies of eukaryotes (Figures 1 and Brief Communication 117

Figure 2. Evidence for the Monophyly of the Plant ϩ Protist Clade (A) maximum likelihood tree of all available 6-phosphogluconate dehydrogenase protein sequences from cyanobacteria and eukaryotes in the plant ϩ protist clade. Methods and labeling are the same as in Figure 1 (␣ϭ0.61, Pinv ϭ 0 for the protein data set, and ␣ϭ1.62, Pinv ϭ 0.24 for the nucleotide data set). The putative origin of a sequence signature (EW) is indicated (see below). The Spinacia oleracea homolog targeted to the chloroplast [6] is indicated by an arrow. No N-terminal extensions indicative of chloroplast localization were detected in any of the other sequences (data not shown). (B) A region of the amino acid alignment containing a sequence signature (EW, shaded) unique to all members of the plant ϩ protist clade to the exclusion of all other taxa in the data set (except the cyanobacterium Nostoc punctiforme, which encodesaWinthesecond position). Representative sequences of the major clades in Figure 1 are shown.

2) [22], making a single transfer from a cyanobacterium To determine the phylogenetic affinity of the P. falci- or plastid to the ancestor of this clade biologically rea- parum sequence, a smaller data set with cyanobacteria sonable. The presence of homologs from both clades and all available eukaryotic sequences putatively be- in the red alga Porphyra yezoensis and the absence of longing to the plant ϩ protist clade was constructed cyanobacterial gnd homologs in the animal and fungal and analyzed. In this analysis, P. falciparum is included lineages indicate that the homolog found in Metazoa, in the plant ϩ protist clade with 100% bootstrap support Fungi, and Rhodophyta is probably ancestral to eukary- (Figure 2A). A sequence signature (EW) shared by all otes and that the plant ϩ protist clade subsequently taxa in the plant ϩ protist clade to the exclusion of all acquired a second gnd homolog from cyanobacteria. other sequences is also found in P. falciparum (Figure Although the monophyly of the “cyanobacterial/plant ϩ 2B), providing further evidence for its placement within protist clade” is consistently recovered, it is only moder- the plant ϩ protist group. ately supported by bootstrap analysis (Figure 1). This is Curiously, five eukaryotic protistan sequences fall partly due to the existence of long branches in the gnd outside the two large clades in the gnd tree. The se- tree—long branches are well known to destabilize other- quence from the mycetozoan D. discoideum branches wise robust groupings in molecular phylogenies [23]. To as an immediate outgroup to an animal ϩ fungal/proteo- investigate the strength of the cyanobacterial/plant ϩ pro- bacterial clade, whereas the kinetoplastid T. tist clade in the absence of long branches, we identified brucei and L. major and the amitochondriate protists the latter by calculating the mean patristic distances Trichomonas vaginalis and G. lamblia are all included to the outgroup (Figure 1), using the TreeDis program in a weakly supported eubacterial clade with several (George Weiller, Australian National University). Three long branching taxa outside of the cyanobacterial/ branches ( brucei, major, and plant ϩ protist group (Figure 1). We tested if these five lamblia) were found to be significantly longer eukaryotic sequences are strongly excluded from the (p Ͻ 0.05) than the others and were excluded from the major eukaryotic groups. Unfortunately, different statis- data set to yield a 43 taxon “short-branch” data set. In tical tests gave contradictory results (see Supplemen- the absence of long branches, the bootstrap support for tary Material included with this article online). Conse- the cyanobacterial/plant ϩ protist bipartition increased quently, current data cannot definitively resolve whether significantly for this data set, from 65 to 87 for the ML some or all of these sequences truly belong to one or the protein analysis and from 62 to 72 for the ML nucleotide other major eukaryotic clades or if they were acquired analysis, in support of a true sister group relationship through separate eubacteria-to-eukaryote lateral gene between the plant ϩ protist and cyanobacterial clades. transfer events. Due to a strong amino acid composition bias, the Plas- Our phylogenetic analysis strongly supports the hy- modium falciparum gnd sequence was excluded from pothesis that the plant ϩ protist gnd homologs origi- the extensive phylogenetic analyses described above. nated through a gene transfer event from cyanobacteria. Current Biology 118

the common ancestor of and their amoe- boflagellate relatives, the Heterolobosea [22] (Figure 3B). Both these scenarios imply widespread loss of sec- ondary plastids from multiple lineages of nonphotosyn- thetic eukaryotes. The limited phylogenetic resolution within the plant ϩ protist clade in the gnd tree (Figures 1 and 2) makes it difficult to infer from the data which of the hypotheses presented in Figure 3 is most likely. However, we would expect the gnd sequences that originated via secondary endosymbiosis to branch within Viridiplantae or Rhodo- Figure 3. Different Hypotheses Explaining the Presence of a Cyano- phyta (Figure 3B). Since this is not observed (Figures 1 bacterial-like gnd Gene in Both Photosynthetic and Nonphotosyn- thetic Eukaryotes and 2) and the lack of resolution between these eukaryo- The relationships between eukaryote groups depicted are derived tic groups is also a feature of the combined protein from a synthesis of recent combined protein phylogenies [10, 22]. phylogenies of the organismal lineages [22], an early Red arrows indicate transfer of the gnd gene between organisms introduction of the gene into their common ancestor by mediated by the endosymbiosis. Branches are colored red to indi- a single primary chloroplast endosymbiosis followed by cate the presence of the gnd gene in a member of the group, whereas diversification of the host lineages (Figure 3A) seems to blue branches indicate phylogenetically related groups for which be the most likely scenario. In any case, the presence of no gnd homolog is currently known. The question marks indicate taxa for which a gnd homolog is known but do not branch within a gene of presumed plastid origin in nonphotosynthetic the cyanobacterial/plant ϩ protist clade in optimal gnd trees (Figure lineages, such as heteroloboseid amoebae and the oo- 1). (A) Early primary chloroplast endosymbiosis mediating the gnd mycete P. infestans, suggests that these lineages may gene transfer in the common ancestor of many eukaryote lineages. ancestrally have harbored chloroplasts. Furthermore, (B) primary endosymbiosis in the common ancestor of Glaucophyta, the phylogenetic relationships of lineages harboring cy- Rhodophyta, and Viridiplantae [9, 10], followed by early secondary anobacterial-related gnd homologs allow plastid loss to endosymbiosis mediating the gnd gene transfer. be inferred for other related nonphotosynthetic eukary- ote protistan lineages, including the and the ki- Several different evolutionary scenarios are possible: netoplastid Euglenozoa (Figure 3). An even more radical either the gene was laterally transferred independently possibility—neither clearly supported nor excluded by of the that gave rise to plastids, or the our data (see Supplementary Material)—that most parsi- gene originated from endosymbiotic gene transfer from moniously explains the distribution of gnd homologs the ancestral chloroplast genome. We favor the “chloro- among eukaryotes is that the ancestor of the amitochon- plast origin” scenario for two reasons. First, it is well driate and may have once established that the plastid has a cyanobacterial origin, also contained a chloroplast. and it is likely that hundreds of its genes have been In any case, these possibilities should be tested— transferred to the nucleus of plants [2, 3]. Thus, the most more genomic data from a variety of nonphotosynthetic likely explanation for a cyanobacterial-like gene in a and photosynthetic protists should reveal whether they eukaryotic genome is surely a chloroplast origin. Sec- possess a set of cyanobacterial genes that would betray ond, the placement of eukaryotes at the base of the an ancestral chloroplast endosymbiosis. If a large frac- two cyanobacterial groups (Figure 1) conforms to the tion of eukaryotic diversity evolved from plastid-bearing observation that plastids form a monophyletic group ancestors, then the chloroplast endosymbiosis oc- near the root of the cyanobacterial line of descent in curred much earlier in eukaryotic evolution than cur- ssuRNA trees [24]. If a plastid origin is correct, there are rently thought [8–11]. several plausible explanations for the propagation of the gene throughout the plant ϩ protist clade (Figure 3). Supplementary Material One explanation is that the gene was inherited in all A supplementary table showing the results of the statistical tests these lineages from a primary chloroplast endosym- for alternative phylogenies for the five eukaryotic sequences outside the two major eukaryotic clades and some explanatory text are biosis in their common ancestor (Figure 3A), implying available at http://images.cellpress.com/supmat/supmatin.htm. that a wide range of eukaryotes were primarily photosyn- thetic, many of which subsequently secondarily lost pri- Acknowledgments mary plastids. Alternatively, the gnd gene may have been transferred between eukaryotes via secondary en- We thank Alastair Simpson, Michael Charette, and Tony Russell dosymbioses (Figure 3B) or independent lateral gene (Dalhousie University), T. Martin Embley and Robert P. Hirt (Natural transfer events. A single early secondary endosymbiosis History Museum, London, UK), Susan Douglas (National Research Council, Halifax, NS, Canada), Laura Katz (Smith College, Northamp- of a rhodophyte alga in the common ancestor of hetero- ton, MA), Ronald Pearlman (York University, Toronto, ON, Canada), konts, , ciliates, and dinoflagellates, as Nobumi Kusuhara (Kazusa DNA Research Institute, Japan), and suggested by recent GAPDH phylogenies [25], may ex- Francine Govers (Wageningen University, the Netherlands) for gen- plain the presence of cyanobacterial gnd genes in non- erous gifts of genomic DNA and cDNA clones. We thank Lesley photosynthetic protists, such as the apicomplexan P. Davis for help with sequencing. We also thank the members of falciparum and the oomycete P. infestans Roger and Doolittle labs for helpful discussions and critical reading of the manuscript; and Brian Hoyt and Milan Horacek for help with (Figure 3B). Similarly, the secondary symbiosis that gave computer facilities. Preliminary sequence data were obtained from rise to plastids of green algal origin in photosynthetic the Institute for Genomic Research, the Joint Genome Institute, the euglenids (Euglenozoa) [8, 11] might have occurred in Sanger Centre, and the Josephine Bay Paul Center for Comparative Brief Communication 119

Molecular Biology and Evolution. J.O.A. is supported by a Post and Tabata, S. (2000). Generation of 10,154 expressed se- Doctoral Fellowship from the Wenner-Gren Foundations. A.J.R. is quence tags from a leafy gametophyte of a marine red alga, supported by the CIAR Program in Evolutionary Biology. This work Porphyra yezoensis. DNA Res. 7, 223–227. was supported by a Natural Sciences and Engineering Research 20. Strimmer, K., and von Haeseler, A. (1996). Quartet puzzling: Council (NSERC) Genomics Project Grant 228263-99 awarded to a quartet maximum-likelihood method for reconstructing tree A.J.R. topologies. Mol. Biol. Evol. 13, 964–969. 21. Foster, P.G., and Hickey, D.A. (1999). Compositional bias may Received: September 24, 2001 affect both DNA-based and protein-based phylogenetic recon- Revised: November 9, 2001 structions. J. Mol. Evol. 48, 284–290. Accepted: November 9, 2001 22. Baldauf, S.L., Roger, A.J., Wenk-Siefert, I., and Doolittle, W.F. Published: January 22, 2002 (2000). A -level phylogeny of eukaryotes based on com- bined protein data. Science 290, 972–977. References 23. Swofford, D.L., Olsen, G.L., Waddell, P.J., and Hillis, D.M. (1996). Phylogenetic inference. In Molecular systematics, D.M. Hillis, 1. Gray, M.W. (1992). The endosymbiont hypothesis revisited. Int. C. Moritz, and B.K. Mable, eds. (Sunderland, MA: Sinauer), pp. Rev. Cytol. 141, 233–357. 407–515. 2. Martin, W., Stoebe, B., Goremykin, V., Hansmann, S., Hasegawa, 24. Turner, S., Pryer, K.M., Miao, V.P.W., and Palmer, J.D. (1999). M., and Kowallik, K.V. (1998). Gene transfer to the nucleus and Investigation deep phylogenetic relationships among Cyano- the evolution of chloroplasts. Nature 393, 162–165. and plastids by small subunit rRNA sequence analysis. 3. Rujan, T., and Martin, W. (2001). How many genes in Arabidopsis J. Eukaryot. Microbiol. 46, 327–338. come from cyanobacteria? An estimate from 386 protein phy- 25. Fast, N.M., Kissinger, J.C., Roos, D.S., and Keeling, P.J. (2001). logenies. Trends Genet. 17, 113–120. Nuclear-encoded, plastid-targeted genes suggest a single com- 4. Karlberg, O., Canba¨ ck, B., Kurland, C.G., and Andersson, S.G.E. mon origin for apicomplexan and plastids. Mol. (2000). The dual origin of the yeast mitochondrial proteome. Biol. Evol. 18, 418–426. Yeast 17, 170–187. 26. Felsenstein, J. (1989). Phylogeny Inference Package (Version 5. Adams, K.L., Daley, D.O., Qiu, Y.L., Whelan, J., and Palmer, J.D. 3.2). Cladistics 5, 166. (2000). Repeated, recent and diverse transfers of a mitochon- 27. Swofford, D.L. (2000). PAUP*. Phylogenetic Analysis Using Par- drial gene to the nucleus in flowering plants. Nature 408, simony (*and Other Methods). Version 4 (Sunderland, MA: 354–357. Sinauer). 6. Krepinsky, K., Plaumann, M., Martin, W., and Schnarrenberger, 28. Posada, D., and Crandall, K.A. (1998). MODELTEST: testing the C. (2001). Purification and cloning of chloroplast 6-phosphoglu- model of DNA substitution. Bioinformatics 14, 817–818. conate dehydrogenase from spinach: Cyanobacterial genes for chloroplast and cytosolic isoenzymes encoded in eukaryotic Accession Numbers chromosomes. Eur. J. Biochem. 268, 2678–2686. 7. Roger, A.J. (1999). Reconstructing early events in eukaryotic The GenBank accession numbers for the gnd sequences reported evolution. Am. Nat. 154, S146–S163. in this paper are AF394507–AF394516. 8. Cavalier-Smith, T. (2000). Membrane heredity and early chloro- plast evolution. Trends Plant Sci. 5, 174–182. 9. Douglas, S.E. (1998). Plastid evolution: origins, diversity, trends. Curr. Opin. Genet. Dev. 8, 655–661. 10. Moreira, D., Le Guyader, H., and Phillippe, H. (2000). The origin of red and the evolution of chloroplasts. Nature 405, 69–72. 11. Delwiche, C.F. (1999). Tracing the thread of plastid diversity through the tapestry of . Am. Nat. 154, S164–S177. 12. Tengs, T., Dahlberg, O.J., Shalchian-Tabrizi, K., Klaveness, D., Rudi, K., Delwiche, C.F., and Jakobsen, K.S. (2000). Phyloge- netic analyses indicate that the 19’Hexanoyloxy-fucoxanthin- containing dinoflagellates have tertiary plastids of origin. Mol. Biol. Evol. 17, 718–729. 13. Douglas, S., Zauner, S., Fraunholz, M., Beaton, M., Penny, S., Deng, L.T., Wu, X., Reith, M., Cavalier-Smith, T., and Maier, U.G. (2001). The highly reduced genome of an enslaved algal nucleus. Nature 410, 1091–1096. 14. dePamphilis, C.W., Young, N.D., and Wolfe, A.D. (1997). Evolu- tion of plastid gene rps2 in a lineage of hemiparasitic and holo- parasitic plants: many losses of photosynthesis and complex patterns of rate variation. Proc. Natl. Acad. Sci. USA 94, 7367– 7372. 15. Saldarriaga, J., Taylor, F.J.R., Keeling, P.J., and Cavalier-Smith, T. (2001). Dinoflagellate ribosomal RNA phylogeny suggests multiple losses and replacements of chloroplasts. J. Mol. Evol. 53, 204–213. 16. McFadden, G.I., Reith, M.E., Munholland, J., and Lang-Unnasch, N. (1996). Plastid in human parasites. Nature 381, 482. 17. Kamoun, S., Hraber, P., Sobral, B., Nuss, D., and Govers, F. (1999). Initial assessment of gene diversity for the oomycete pathogen Phytophthora infestans based on expressed se- quences. Fungal Genet. Biol. 28, 94–106. 18. Asamizu, E., Miura, K., Kucho, K., Inoue, Y., Fukuzawa, H., Oh- yama, K., Nakamura, Y., and Tabata, S. (2000). Generation of

expressed sequence tags from low-CO2 and high-CO2 adapted cells of Chlamydomonas reinhardtii. DNA Res. 7, 305–307. 19. Nikaido, I., Asamizu, E., Nakajima, M., Nakamura, Y., Saga, N.,