MBE Advance Access published March 2, 2005

Yoon et al. 1

Research Article: Tertiary Endosymbiosis Driven Genome Evolution in Dinoflagellate Algae

Hwan Su Yoon,1 Jeremiah D. Hackett,1 Frances M. Van Dolah,2 Tetyana Nosenko,1 Kristy L. Lidie,2 and Debashish Bhattacharya1*

1 Department of Biological Sciences and Roy J. Carver Center for Comparative Genomics, University of Iowa, 312 Biology Building, Iowa City, Iowa 52242, USA

2 Biotoxins Program, NOAA National Ocean Service, Center for Coastal Environmental Health and Biomolecular Research, Charleston, South Carolina 29412, USA

* Corresponding author: Debashish Bhattacharya, Department of Biological Sciences and Roy J. Carver Center for Comparative Genomics, University of Iowa, 312 Biology Building, Iowa City, Iowa 52242-1324, USA, Tel: (319) 335-1977, fax: (319) 335-1069, E-mail: [email protected]

Key words: chromalveolates, , EST, Karenia brevis, minicircle genes, tertiary endosymbiosis

Running title: tertiary endosymbiosis

Abbreviations: GAPDH, glyceraldehyde-3-phosphate dehydrogenase; PsaA, photosystem I P700 chlorophyll a apoprotein A1; PsaB, photosystem I P700 chlorophyll a apoprotein A2; PsbA, photosystem II reaction center protein D1; PsbC, photosystem II 44 KD apoprotein; PsbD, photosystem II D2 reaction center protein; PsbO, oxygen-evolving enhancer protein 1

ABSTRACT

Dinoflagellates are important aquatic primary producers and cause “red tides”. The most widespread plastid (photosynthetic organelle) in these algae contains the unique accessory pigment peridinin. This plastid putatively originated via a red algal secondary endosymbiosis and has some remarkable features, the most notable being a genome that is reduced to 1-3 gene minicircles with about 14 genes (out of an original 130-200) remaining in the organelle and a nuclear-encoded proteobacterial Form II Rubisco. The “missing” plastid genes are relocated to the nucleus via a massive transfer unequaled in other photosynthetic eukaryotes. The fate of these characters is unknown in a number of dinoflagellates that have replaced the peridinin plastid through tertiary endosymbiosis. We addressed this issue in the fucoxanthin dinoflagellates (e.g., Karenia brevis) that contain a captured haptophyte plastid. Our multi-protein phylogenetic analyses provide robust support for the haptophyte plastid replacement and are consistent with a red algal origin of the chromalveolate plastid. We then generated an expressed sequence tag (EST) database of 5,138 unique genes from K. brevis and searched for nuclear genes of plastid function. The EST data indicate the loss of the ancestral peridinin plastid characters in K. brevis including the transferred plastid genes and Form II Rubisco. These results underline the remarkable ability of dinoflagellates to remodel their genomes through endosymbiosis and the considerable impact of this process on cell evolution.

INTRODUCTION

It is now generally accepted that the photosynthetic organelle of eukaryotes (plastid) originated through endosymbiosis (Gray 1992; Bhattacharya and Medlin 1995; Palmer 2003; Bhattacharya, Yoon, and Hackett 2004), whereby a single-celled protist engulfed and retained a foreign photosynthetic cell. Over time, the foreign cell was reduced to a plastid and was vertically transmitted to subsequent generations.

Most plastids have originated either through primary or secondary endosymbiosis. The first results from the engulfment of a photosynthetic prokaryote (cyanobacterium) and gave rise to a plastid bound by two membranes, whereas the second results from the engulfment of a eukaryotic alga and resulted in a plastid bound by 3 or 4 membranes. Only the dinoflagellates have undergone tertiary endosymbiosis, which is the engulfment of an alga containing a secondary plastid (Bhattacharya, Yoon, and Hackett 2004). The primary endosymbiosis is believed to have given rise to the plastid in the common ancestor of the red, green, and glaucophyte algae (Moreira, Le Guyader, and Philippe 2000; Stibitz, Keeling, and Bhattacharya 2000; Matsuzaki et al. 2004; Sanchez Puerta, Bachvaroff, and Delwiche 2004).

After their split from the , a red algal cell was engulfed by a non-photosynthetic protist and reduced to a plastid. This cell evolved chlorophyll c and putatively was the common ancestor of the protist super-assemblage chromalveolates (Cavalier-Smith 1999), comprising the Chromista (cryptophytes, haptophytes, and stramenopiles, Cavalier-Smith 1986) and the Alveolata (dinoflagellates, ciliates, and apicomplexans, Van de Peer and De Wachter 1997). Although not yet recovered in host gene trees, studies of plastid genes and plastid-targeted glyceraldehyde-3-phosphate dehydrogenase (GAPDH) are consistent with chromalveolate monophyly (e.g., Durnford et al. 1999; Fast et al. 2001; Yoon et al. 2002; Harper and Keeling 2003; Yoon et al. 2004).

Within dinoflagellates, the most common plastid type is bound by three membranes and contains the unique accessory pigment, peridinin. Other significant characteristics of peridinin plastids are a highly reduced genome encoding about 14 proteins on 1-3 gene minicircles, in addition to the large and small subunits of the plastid ribosomal RNA and minicircles encoding pseudogenes (Zhang, Green, and Cavalier-Smith 1999; Barbrook and Howe 2000; Zhang, Cavalier-Smith, and Green 2002; Howe et al. 2003). The distribution of minicircle genes in different dinoflagellates remains uncertain. Although apparently plastid-encoded in taxa such as Prorocentrum spp. and Amphidinium spp. (Zhang, Green, and Cavalier-Smith 1999; Koumandou et al. 2004), these genes have been localized to the nucleus in Ceratium horridum (Laatsch et al. 2004) and provisionally also in Pyrocystis lunula (rpl28 and rpl33, unpublished data; see GenBank Accession AF490367). Plant and algal plastids generally contain a circular genome that is about 150 kilobases in size and encodes between 130 and 200 genes. In contrast, the minicircles encode the core subunits of the photosystem (atpA, atpB, petB, petD, psaA, psaB, psbA-E, psbI) and ribosomal RNA (16S, 23S rRNA), and the remaining genes required for plastid function have been transferred to the nucleus (e.g., in Alexandrium tamarense, Amphidinium carterae, and Lingulodinium polyedrum; Bachvaroff et al. 2004; Hackett et al. 2004) and encode a targeting sequence for plastid import (Nassoury, Cappadocia, and Morse 2003; Bachvaroff et al. 2004; Hackett et al. 2004).

In addition to this remarkable development, the normal plastid-encoded Form I Rubisco was replaced in peridinin dinoflagellates with a nuclear Form II enzyme of alpha-proteobacterial origin.

As noteworthy as this set of evolutionary developments may seem, in some dinoflagellates, the peridinin plastid was replaced by one from a cryptophyte, a haptophyte, a diatom, or a green alga (Bhattacharya, Yoon, and Hackett 2004). The genomic consequences for dinoflagellates of these tertiary endosymbioses (except for the green algal replacement which was a successive secondary event) remain unknown. For example, were the nuclear-encoded plastid genes completely lost or is there now a set of genes, potentially of non-overlapping functions, of both chromalveolate and tertiary endosymbiotic origin? And what of the Form II Rubisco? Was this gene lost or are there genes encoding both Form I and II enzymes in these taxa?

To answer these questions, we studied in detail one particular dinoflagellate tertiary endosymbiosis, the replacement of the peridinin plastid with one of haptophyte origin in the Gymnodiniales. The plastids in taxa such as Karenia spp., Karlodinium micrum, and Takayama spp. contain chlorophylls c1+c2 and 19’-hexanoyloxy-fucoxanthin and/or 19’-butanoyloxy-fucoxanthin but lack peridinin (e.g., De Salas et al. 2003; Daugbjerg, Hansen, and Moestrup 2000), similar to the haptophyte algae. Phylogenetic analyses support a haptophyte origin of fucoxanthin plastids (Tengs et al. 2000; Ishida and Green 2002) but their relationship to peridinin plastids has never been convincingly demonstrated in the context of a taxonomically broadly sampled multi-gene phylogeny. A previous DNA-based attempt at resolving this question using plastid- and minicircle-encoded psbA (Yoon, Hackett, and Bhattacharya 2002) led to the artifactual clustering of fucoxanthin and peridinin dinoflagellates. This misleading result was most likely caused by codon usage heterogeneity in the DNA sequences (Inagaki et al. 2004). Use of protein data appears, therefore, to be of critical importance in resolving dinoflagellate plastid relationships.

Given these existing data, our study had two major aims. The first was to establish, using robust multi-protein (PsaA, PsaB, PsbA, PsbC, PsbD) analyses, the position of fucoxanthin plastids in a tree that included peridinin, red, chromist, green algal, glaucophyte, and cyanobacterial sequences. The second aim was to determine the impact of tertiary plastid endosymbiosis on nuclear genome evolution. To do this, we analyzed a unigene set of 5,138 expressed sequence tags (ESTs) that we generated from a light and a dark harvested culture of the toxic fucoxanthin dinoflagellate, Karenia brevis.

MATERIALS AND METHODS

Taxon Sampling and Sequencing

We determined the sequence of five minicircle-encoded genes (psaA, psaB, psbA, psbC, and psbD) in peridinin dinoflagellates and their putative plastid-encoded homologs from fucoxanthin-containing taxa as well as from chromist, red, and glaucophyte algae. A total of 60 new sequences were determined in this study (see Table 1 in the Supplementary Material).

The algal cultures were frozen in liquid nitrogen and ground with glass beads using a glass rod and/or Mini-BeadBeater™ (Biospec Products, Inc., Bartlesville, OK, USA). Total genomic DNA was extracted using the DNeasy Plant Mini Kit (Qiagen, Santa Clarita, CA, USA). Polymerase chain reactions (PCR) were done using specific primers for each of the genes (Yoon, Hackett, and Bhattacharya 2002; Yoon et al. 2002; Yoon et al. 2004). Degenerate primers were designed for the psbC gene (psbC60F: TGC YTG GTG GWC WGG TAA TGC; psbC100F: GGT AAR TTM YTM GGT GCT CAT; psbC1160R: TCT TGC CAS GTY TGS ATA TCA; psbC1200R: CCT ASS GGT GCA TGR TCA TA) and for psbD (psbD50F: GAT GAT TGG YAA AAC GAG A; psbD90F: TTG TWT TTA TCG GYT GGT CYG G; psbD1010R: CAA GAT CAR CCW CAT GAA AAC; psbD1040R: GAG GAG GTW TTA CCA CGT GGA AA). PCR products were purified using the QIAquick PCR Purification kit (Qiagen) and were used for direct sequencing using the BigDye™ Terminator Cycle Sequencing Kit (PE-Applied Biosystems, Norwalk, CT, USA), and an ABI-3100 at the Roy J. Carver Center for Comparative Genomics at the University of Iowa. Some PCR products were cloned into pGEM-T vector (Promega, Madison, WI, USA) prior to sequencing.
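The degenerate primers listed above use standard IUPAC ambiguity codes (for example, Y = C or T, W = A or T). As an illustration only, and not part of the original protocol, the following minimal Python sketch counts and enumerates the concrete oligonucleotides that a degenerate primer represents. The function names `degeneracy` and `expand` are hypothetical helpers introduced here.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes that occur in the primers listed above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
}

def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.replace(" ", ""):
        count *= len(IUPAC[base])
    return count

def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.replace(" ", "")]
    return ["".join(combo) for combo in product(*pools)]

psbC60F = "TGC YTG GTG GWC WGG TAA TGC"   # primer sequence from the text
print(degeneracy(psbC60F))                # 8 (one Y and two W positions)
print(expand(psbC60F)[0])                 # TGCCTGGTGGACAGGTAATGC
```

Under this reading, psbC60F is a pool of 8 oligonucleotides and psbC100F (R, M, Y, M) a pool of 16.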

Phylogenetic Analyses

The DNA sequence for each photosystem gene was translated and the amino acid data for each protein were manually aligned with the available data from GenBank using SeqPup (Gilbert 1995). Ambiguous positions were excluded from the phylogenetic analysis (the alignment is available upon request from D. B.). We analyzed each amino acid data set individually as well as a concatenated data set of PsaA (264 aa), PsaB (277 aa), PsbA (319 aa), PsbC (334 aa), and PsbD (296 aa; total of 1490 aa). Because we were unable, in spite of repeated attempts, to PCR amplify the psaB gene from K. brevis, K. mikimotoi, and Heterocapsa triquetra and the psbC gene from K. mikimotoi, these sequences were coded as missing data in the alignment. The cyanobacterium Nostoc sp. PCC7120 was used to root the tree.
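The concatenation step, with unamplified genes coded as missing data, can be sketched as follows. This is a minimal illustration under stated assumptions: the alignment lengths come from the text, while the `concatenate` helper, the use of `?` as the missing-data symbol, and the placeholder residues are hypothetical and are not the authors' actual pipeline.

```python
# Alignment lengths (amino acids) as reported in the text.
LENGTHS = {"PsaA": 264, "PsaB": 277, "PsbA": 319, "PsbC": 334, "PsbD": 296}

def concatenate(seqs, missing="?"):
    """Join per-protein aligned sequences in a fixed order.

    Proteins absent from `seqs` are filled with the missing-data symbol
    so that every taxon ends up with a row of the same total length.
    """
    parts = []
    for protein, length in LENGTHS.items():
        seq = seqs.get(protein)
        if seq is None:
            seq = missing * length
        if len(seq) != length:
            raise ValueError(f"{protein}: expected {length} aa, got {len(seq)}")
        parts.append(seq)
    return "".join(parts)

# Placeholder residues ("X") stand in for real aligned sequences.
# K. mikimotoi lacks psaB and psbC in this study, so both become '?' runs.
k_mikimotoi = {"PsaA": "X" * 264, "PsbA": "X" * 319, "PsbD": "X" * 296}
row = concatenate(k_mikimotoi)
print(len(row), row.count("?"))   # 1490 611
```

The total of 1,490 positions matches the sum of the five reported alignment lengths; for K. mikimotoi, 611 of these (277 + 334) would be missing data.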

The maximum likelihood (ML) method was used to infer the phylogenetic relationships of the different plastids using the concatenated data. We did separate analyses for data sets that included all dinoflagellates or excluded the peridinin-containing taxa. The ML analyses were done with proml in PHYLIP V3.6b (Felsenstein 2004) using the JTT + Γ evolutionary model (Jones, Taylor, and Thornton 1992). The alpha values for the gamma distribution were calculated using Bayesian inference with MrBayes V3.0b4 (Huelsenbeck and Ronquist 2001). The mean value for the gamma parameter was calculated from the post “burn-in” trees [see below] in the posterior distribution. Global rearrangements and randomized sequence input order with 5 replicates were used to find the best ML tree. To test the stability of monophyletic groups in the ML trees, 100 bootstrap replicates were analyzed with proml as described above (Felsenstein 1985) except that one round of random taxon addition was used in the analysis. In the Bayesian inference of the amino acid data we used the WAG + Γ model (Whelan and Goldman 2001). Metropolis-coupled Markov chain Monte Carlo from a random starting tree was run for 1,000,000 generations with trees sampled every 500 cycles. Four chains were run simultaneously, of which 3 were heated and one was cold, with the initial 250,000 cycles (500 trees) discarded as the burn-in. A consensus tree was made with the remaining 1,500 trees to determine the posterior probabilities at the different nodes. Minimum evolution (ME) bootstrap (500 replicates) analyses were done using PUZZLEBOOT V1.03 (http://hades.biochem.dal.ca/Rogerlab/Software/software.html) and TREE-PUZZLE V5.1 (Schmidt et al. 2002) to generate the WAG + Γ distance matrices. For the individual plastid proteins, a ML tree was inferred using proml and the JTT + Γ model as described above. One hundred bootstrap replicates were analyzed with phyML V2.4.3 (Guindon and Gascuel 2003) and the JTT + Γ model to infer support for nodes in these trees. Bayesian posterior probabilities for nodes were calculated with MrBayes using the WAG + Γ model. We also analyzed a data set of publicly available chromalveolate cytosolic and plastid-targeted GAPDH sequences that included the novel K. brevis data using ML, ME, and Bayesian methods as described above for the 5-protein alignment.
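The tree counts quoted for the Bayesian run follow directly from the chain length, the sampling interval, and the burn-in. A minimal sketch of that arithmetic is given below; the function name `mcmc_samples` is hypothetical, and the count ignores the starting tree, which some programs also write to the sample file.

```python
def mcmc_samples(generations, sample_every, burnin_generations):
    """Return (total sampled trees, trees discarded as burn-in, trees kept)."""
    total = generations // sample_every
    discarded = burnin_generations // sample_every
    return total, discarded, total - discarded

# Settings reported in the text: 1,000,000 generations, one tree every
# 500 cycles, and the first 250,000 cycles discarded as burn-in.
print(mcmc_samples(1_000_000, 500, 250_000))   # (2000, 500, 1500)
```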

In addition to the protein data analyses, we also inferred a tree from the concatenated DNA sequences of the five minicircle genes. This ML tree was inferred using PAUP* (Swofford 2002) and the site-specific GTR model (Rodriguez et al. 1990) with different evolutionary rates for each codon position. Bootstrap analyses were done using phyML (100 replicates) and the minimum evolution method (GTR + Γ + I model, using PAUP*). The Bayesian posterior probability was calculated using MrBayes and the site-specific GTR model.

Tree Topology Tests

To assess the positions of the dinoflagellate plastids in our tree, we used MacClade V4.05 (Maddison and Maddison 2002) to generate 38 alternative topologies from that shown in the best ML tree (Fig. 1). In these trees, the fucoxanthin clade was moved to 16 alternative positions, and Peridinium foliaceum and the peridinin clade to 8 and 14 alternative positions, respectively. The one-sided Kishino-Hasegawa (KH) test (Goldman, Anderson, and Rodrigo 2000) was implemented using TREE-PUZZLE V5.1 to assign probabilities to the best ML and the alternative trees.

We also implemented the approximately unbiased (AU-) test. Because of the high computing time required to calculate the site-by-site likelihoods, we generated a ML backbone tree (as described above, see also Inagaki et al. 2004) of 14 taxa that included all of the major plastid groups excluding the dinoflagellates. The fucoxanthin (K. brevis and K. mikimotoi) and peridinin (Akashiwo sanguinea and Heterocapsa triquetra) dinoflagellate clades were then added separately (using MacClade V4.05, Maddison and Maddison 2002) to each of the 25 available branches in the 14-taxon tree to generate two data sets of 25 topologies that addressed all of the possible divergence points for these two dinoflagellate clades. We also generated a 16-taxon ML tree that included the Karenia spp. The peridinin clade was then added to each of the 29 available branches in this tree. The site-by-site likelihoods for these three pools of trees were calculated using the respective data sets and Codeml implemented in PAML V3.13 (Yang 1997) with the JTT + Γ model and the default settings. The AU test was implemented using CONSEL V0.1f (Shimodaira and Hasegawa 2001) to assign probabilities to the different trees in each pool.
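The numbers of candidate attachment points quoted above (25 and 29) agree with the standard count of branches in an unrooted, fully bifurcating tree, 2n − 3 for n taxa. The short sketch below checks this; the function name `unrooted_branches` is hypothetical, and the assumption that the backbone trees were treated as unrooted and fully resolved is ours, although it is consistent with the reported counts.

```python
def unrooted_branches(n_taxa):
    """Branches in an unrooted, fully bifurcating tree: n terminal branches
    plus n - 3 internal branches, i.e. 2n - 3 in total (valid for n >= 2)."""
    if n_taxa < 2:
        raise ValueError("need at least two taxa")
    return 2 * n_taxa - 3

for n in (14, 16):
    print(n, "taxa:", unrooted_branches(n), "attachment points")
# 14 taxa: 25 attachment points
# 16 taxa: 29 attachment points
```

Grafting a clade onto every branch in turn therefore enumerates all possible divergence points for that clade on a fixed backbone.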

EST Data Generation

K. brevis vegetative cells are haploid, with 121 chromosomes (Walker 1982) containing 50-100 pg DNA/cell, or a haploid genome of approximately 5 × 10^10 bp (Kim and Martin 1974; Rizzo 1982; Sigee 1986; Kamykowski, Milligan, and Reed 1998). This made the generation of a complete genome sequence for this species unfeasible. For this reason, we generated expressed sequence tags (ESTs) from two cDNA libraries from K. brevis and these were compared to obtain genes expressed during both the light and dark phases of the diel cycle (for a detailed description, see Lidie et al. 2005). For the light phase library (the “Wilson” strain deposited at the NOAA [= CCMP718]), total RNA was extracted with Trizol (GibcoBRL), mRNA was purified with the Oligotex mRNA Midi Kit (Qiagen), and a non-normalized library was constructed according to Bonaldo, Lennon, and Soares (1996). For the dark phase library, total RNA was extracted with the Qiagen mRNeasy Maxi Kit and a non-normalized library was generated in λZapII from size-selected cDNA (>0.4 kb, Stratagene). The cDNA clones from the light library were sequenced from the 3’ end for a total of 1,320 ESTs (542 unique genes). The dark library was sequenced from the 5’ end to yield 6,986 ESTs (5,053 unique genes, cf. 5,138 unique genes from the combined libraries). Putative plastid-encoded genes were identified through amino acid sequence similarity searches (as in Hackett et al. 2004). The K. brevis EST data from the dark library have been released to GenBank (http://www.ncbi.nlm.nih.gov/dbEST/index.html).
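The genome size estimate can be checked with a short calculation. The sketch below assumes the commonly used conversion of roughly 0.978 × 10^9 bp per picogram of double-stranded DNA; this constant and the helper name `pg_to_bp` are not taken from the paper.

```python
# Assumed conversion: 1 pg of double-stranded DNA is roughly 0.978e9 bp.
BP_PER_PG = 0.978e9

def pg_to_bp(picograms):
    """Convert a DNA mass in picograms to an approximate number of base pairs."""
    return picograms * BP_PER_PG

# DNA content per cell reported in the text: 50-100 pg.
for pg in (50, 100):
    print(f"{pg} pg is about {pg_to_bp(pg):.2e} bp")
# 50 pg is about 4.89e+10 bp
# 100 pg is about 9.78e+10 bp
```

Under this assumption, the lower bound of the reported range corresponds to the quoted figure of approximately 5 × 10^10 bp.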

RESULTS AND DISCUSSION

Chromalveolate Phylogeny and Plastid Evolution

Under the chromalveolate hypothesis, all taxa containing chlorophyll c (i.e., a chromophytic plastid) share a single common ancestor and comprise two monophyletic clades, the chromists and the alveolates (Cavalier-Smith 2000). Phylogenetic analyses provide mixed support for this plastid-based view of eukaryotic relationships. Analyses of nuclear genes indicate a single origin of alveolates (e.g., Gajadhar et al. 1991; Baldauf et al. 2000; Stechmann and Cavalier-Smith 2003) but chromist monophyly is only indirectly supported by plastid trees (Durnford et al. 1999; Yoon et al. 2002; Hagopian et al. 2004; Yoon et al. 2004) and the haptophytes and cryptophytes have unresolved positions in nuclear and mitochondrial gene analyses (e.g., Bhattacharya and Weber 1997; Sanchez Puerta, Bachvaroff, and Delwiche 2004). Nuclear gene trees do, however, suggest a specific relationship between the stramenopiles and alveolates (Van de Peer and De Wachter 1997; Baldauf et al. 2000; Nozaki et al. 2003). Despite this uncertainty regarding both the monophyly and the internal branching order, the union of chromalveolate taxa is potentially confirmed by the existence of a gene replacement in which the cytosolic GAPDH gene was duplicated and retargeted to the plastid, uniquely in these taxa (Fast et al. 2001; Harper and Keeling 2003).

Furthermore, in contrast to the convincing evidence for a red algal secondary endosymbiotic origin of the chromist plastid (most parsimoniously, via a single event, Yoon et al. 2002; Yoon et al. 2004), verifying the source of the alveolate plastid has proven more challenging. The ciliates have apparently lost this organelle, whereas the parasitic apicomplexans retain a remnant plastid (apicoplast) that contains a reduced genome of about 35 kb in size (Williamson et al. 1994). The apicoplast genes are highly enriched in As and Ts (e.g., 86.9% AT-content in Plasmodium falciparum, Wilson et al. 1996) and therefore have extreme divergence rates, rendering them of limited utility in phylogenetic analyses (e.g., Kohler et al. 1997; Hackett et al. 2004). The apicoplast is hypothesized to be of chromalveolate descent (e.g., McFadden and Waller 1999) or to have originated through a replacement of the ancestral plastid with another of eukaryotic provenance, resulting in the 4-membrane bound organelle in these taxa (Bodyl 1999). The dinoflagellate minicircle genes are also highly divergent and pose problems for tree reconstruction, in particular, when using the DNA sequences (e.g., due to codon usage heterogeneity), single proteins, or with limited taxon sampling (e.g., Zhang, Green, and Cavalier-Smith 1999; Yoon, Hackett, and Bhattacharya 2002; Inagaki et al. 2004). For these reasons, we chose to address dinoflagellate plastid origin using a taxonomically broadly sampled data set of 5 proteins that are encoded on minicircles in peridinin dinoflagellates and are putatively borne on the haptophyte-derived plastid genome in the fucoxanthin-containing taxa, K. brevis and K. mikimotoi. We assume here that the minicircle genes are ultimately of plastid origin whether they are now localized in this organelle or in the nucleus.

Dinoflagellate Plastid Origin

The ML plastid tree inferred from the concatenated protein (1,490 aa) data set is shown in Fig. 1. This tree has significant Bayesian support for all of the nodes and the chromalveolate plastids (excluding ) form a monophyletic group within the red algae as sister to the non-Cyanidiales clade. The fucoxanthin dinoflagellate plastids are positioned within the haptophytes as sister to and sp. (, ; Edvardsen et al. 2000), whereas the monophyletic peridinin plastids diverge within the stramenopiles as sister to the diatoms Skeletonema costatum and Odontella sinensis. Peridinium foliaceum is robustly positioned within this clade, confirming the origin of this dinoflagellate plastid through tertiary endosymbiosis (Chesnick, Morden, and Schmieg 1996; Chesnick et al. 1997; Schnepf and Elbraechter 1999). The bootstrap analyses provide, however, only weak to moderate support for many of the nodes in the tree. To determine whether the highly divergent dinoflagellate clades are the cause of this phenomenon (i.e., due to their long branches), we removed in separate analyses either the peridinin or the fucoxanthin plastid sequences from the alignment and recalculated the bootstrap values. In these trees, the peridinin and fucoxanthin clades were positioned in the identical place as shown in Fig. 1 (i.e., within the stramenopile plastids for the peridinin clade and within the haptophyte plastids for the fucoxanthin clade [e.g., see Fig. S1 in the Supplementary Material]). These data suggest that the divergent dinoflagellate sequences do not exhibit significant long-branch attraction in our protein trees. However, removal of the peridinin plastid sequences results in a marked increase in bootstrap support for most of the nodes of interest in the tree, now with moderate support for chromalveolate monophyly (ML: 49 to 62%, ME: 86 to 86%) and strong support for haptophyte + fucoxanthin dinoflagellate (ML: 68 to 93%, ME: <50 to 73%) and stramenopile plastid monophyly (ML: 100%, ME: 96%) as shown in the gray boxes in Fig. 1 (see Fig. S1). Removal of the fucoxanthin clade did not, however, significantly change the bootstrap values in the resulting tree (unpublished data). Our multi-protein data, therefore, provide convincing evidence for the independent evolutionary origins of fucoxanthin (putatively plastid-encoded) and peridinin (minicircle-encoded) genes in dinoflagellates, consistent with previous studies (Tengs et al. 2000; Ishida and Green 2002; Takishita et al. in press). Analysis of the individual protein data sets (see Fig. S2 in the Supplementary Material) shows significant topological instability, as would be expected for trees that are derived from single proteins (often) with variable divergence rates among taxa.

To test the positions of the fucoxanthin and peridinin dinoflagellate plastids in Fig. 1, we generated 38 alternative topologies that tested the divergence point of these taxa (including Peridinium) and assessed their probabilities using the one-sided KH test. All 16 trees in which the fucoxanthin-dinoflagellate clade was moved to alternative positions in Fig. 1 (e.g., to the base of the chromalveolates [P < 0.001] or to the base of the peridinin clade [P = 0.01]) were significantly rejected (P < 0.05). Of the 8 alternative positions for Peridinium foliaceum, 7 were also significantly rejected (P < 0.000); the exception was the monophyly of this taxon with the Odontella + Skeletonema clade (P = 0.122). In contrast, of 14 alternative divergence points for the peridinin-dinoflagellate clade, the non-rejected positions included at the base of the stramenopiles, stramenopiles + haptophytes, and at the base of the chromalveolates (P = 0.33, 0.07, 0.05, respectively [see Fig. 1]).

We also generated two sets of 25 topologies of a 14-taxon backbone ML tree that excluded the dinoflagellates, in which the fucoxanthin and peridinin clades were added individually to each branch. These pools of topologies were analyzed with the AU-test. The results of this analysis are shown in Fig. 2 and lead to some clear conclusions about dinoflagellate plastid phylogeny. First, the fucoxanthin dinoflagellate plastids receive overwhelming support for their origin from a prymnesiophyte tertiary endosymbiont (P = 0.995) and all alternative positions have significantly lower probabilities (Fig. 2A). Second, the peridinin dinoflagellate plastids clearly are not related to this organelle in haptophytes (as suggested in Yoon, Hackett, and Bhattacharya 2002) but rather, their most likely position is sister to the diatom Odontella sinensis (P = 0.799). However, as is apparent in Fig. 2B, many alternative positions are also included in the confidence set (i.e., P > 0.05) of trees, in particular in the region defining the radiation of the Cyanidiales and non-Cyanidiales red algae and the chromists. The divergence point predicted from the chromalveolate hypothesis (as sister to the chromists, see filled circle in Fig. 2B) has a probability of 0.153.

Analysis of the 16-taxon ML backbone tree with the AU-test provides further support for the conclusions described above (Fig. 2C). Here, the Karenia sp. plastids are positioned as sister to the prymnesiophyte alga as in Fig. 1A and as strongly suggested in Fig. 2A. The tree of highest probability in this pool specified a sister relationship between the peridinin plastids and O. sinensis (P = 0.805); the position as sister to the chromists had P = 0.145. There were again many alternative positions possible for the peridinin plastids within the confidence set. Their divergence within the haptophytes was rejected except as sister to the Karenia spp. sequences (P = 0.059). However, this likely results from long-branch attraction between the relatively highly divergent dinoflagellate sequences because, in the absence of the fucoxanthin plastids, there is no support for a specific relationship between Akashiwo + Heterocapsa and the haptophytes (see Fig. 2B). Our results are therefore consistent with (but do not prove) the chromalveolate hypothesis that posits a red algal origin of the plastid in this group. Intriguingly, the weak support that we find for a specific relationship between peridinin and stramenopile plastids, which is also supported by the analyses of nuclear genes (e.g., Van de Peer and De Wachter 1997; Baldauf et al. 2000; Nozaki et al. 2003; Hackett et al. 2004), suggests a paraphyletic Chromista. Under this (speculative) scenario, the stramenopiles and alveolates share a specific relationship independent of the cryptophytes and haptophytes.

We also inferred a tree using the DNA sequences of the plastid genes (see Fig. S3 in the Supplementary Material) and found that these more extensive data still recover the artifactual clustering of peridinin and fucoxanthin plastids within the haptophytes that was reported in Yoon, Hackett, and Bhattacharya (2002). A recent paper by Inagaki et al. (2004) suggests that this misplacement of the peridinin clade (using the DNA data) is explained by similar codon usage patterns for constant leucine, serine, and arginine residues in psbA among fucoxanthin dinoflagellates and some haptophytes. Usage of the protein data corrects for this codon usage heterogeneity and clearly supports the independent origins of these two types of dinoflagellate plastids.

Endosymbiotic Gene Replacement

Additional support for the haptophyte plastid replacement in K. brevis comes from analysis of GAPDH sequences. A previous study (Ishida and Green 2002) provided phylogenetic evidence for the haptophyte origin of the nuclear-encoded photosystem gene, oxygen-evolving enhancer protein 1 (psbO) in K. brevis, putatively through tertiary endosymbiotic gene replacement. Inspection of our EST data set turned up both cytosolic and plastid-targeted GAPDH genes in K. brevis. Phylogenetic analysis of these data shows that the K. brevis plastid-targeted GAPDH gene is only distantly related to this sequence in peridinin dinoflagellates and is instead sister to the prymnesiophyte, , within the haptophyte clade (Fig. 3 [see also Takishita, Ishida, and Maruyama 2004]). This indicates that the ancestral plastid-targeted gene of chromalveolate origin in K. brevis was presumably replaced by the haptophyte homolog (Fig. 3). In contrast, the cytosolic GAPDH in K. brevis is nested within a monophyletic clade of dinoflagellate sequences that were likely vertically inherited in these taxa. These results are consistent with the multi-protein plastid tree shown in Fig. 1 and support a prymnesiophyte source for the fucoxanthin dinoflagellate plastid.

Nuclear Genome Transformation in Fucoxanthin Dinoflagellates

Given the relatively robust view of dinoflagellate plastid evolution that has resulted from our study, we then asked what happened to the nuclear genome of fucoxanthin dinoflagellates after the haptophyte plastid replacement. In particular, what was the fate of the previously transferred nuclear genes of plastid function demonstrated in peridinin dinoflagellates and what of the proteobacterial Rubisco? To address these issues, we generated a genomic data set of 5,138 unique ESTs from K. brevis. We first searched the K. brevis data set for homologs of each of the 48 nuclear-encoded plastid-targeted genes that were uncovered in the EST unigene set from A. tamarense (Hackett et al. 2004) and from other dinoflagellate EST projects (e.g., Bachvaroff et al. 2004). None of these genes were present in the K. brevis ESTs. In particular, peridinin-containing A. tamarense encodes 15 photosynthetic genes in its nucleus (atpE, F, H, petA, psaC, J, psbF, H, J, K, L, N, T, rpl2, rps19) that are restricted, in all known cases, to the plastid genome of other algae and plants (Hackett et al. 2004). These landmark nuclear markers of peridinin dinoflagellates were absent from our K. brevis EST set. In addition, there was no evidence of a nuclear-encoded Form II Rubisco gene, although the typical Form I sequence of haptophyte origin has previously been isolated (Yoon, Hackett, and Bhattacharya 2002). These results imply that the haptophyte tertiary endosymbiosis resulted in a genome transformation in K. brevis whereby it lost the unique plastid characters typical of most peridinin dinoflagellates.

Although we have no direct evidence for the plastid localization of the K. brevis photosystem genes used in this study because we did not determine the 5’-terminus of the genes (i.e., to detect a potential plastid-targeting sequence), analysis of the G + C-content of these sequences shows them to have a nucleotide content that is typical of plastid genes (see Table 2 in the Supplementary Material). In a comparison of genes known to be encoded in the plastid genome of the red alga Porphyra purpurea (39.8% G + C) and in the red-algal-derived plastid in Emiliania huxleyi (41.6% G + C), the Karenia spp. photosystem genes had a similar G + C-content of 41.3% (K. brevis) and 41.8% (K. mikimotoi). In contrast, the nuclear-encoded genes in K. brevis had a markedly higher G + C-content of 52.8%, a trait shared with these genes in A. tamarense (i.e., 60.0%). These data are consistent with the idea that the K. brevis genes are located in the plastid but do not allow us to determine whether they are encoded on a typical genome or on minicircles. The most parsimonious explanation is that the haptophyte plastid replacement resulted in a typical genome in this species but this hypothesis awaits verification through the direct sequence analysis of this organelle-encoded DNA (T. N. and D. B. work in progress).
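The G + C comparison above rests on a simple calculation, sketched here for illustration. The text does not state whether the reported percentages were pooled over all bases or averaged per gene; the sketch below assumes pooling, and the function names `gc_content` and `pooled_gc` are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Percent G + C among the unambiguous bases (A, C, G, T) of one sequence."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def pooled_gc(seqs: list[str]) -> float:
    """Percent G + C over all sequences combined (length-weighted).

    Assumption: pooling across genes; a per-gene mean would weight
    short and long genes equally and can give a different value.
    """
    return gc_content("".join(seqs))


if __name__ == "__main__":
    # Toy sequences only; these are not data from the study.
    plastid_like = ["ATGATTAAAGCT", "ATGAAATTTGCA"]
    nuclear_like = ["ATGGCCGCGCTG", "ATGCGCGGCACC"]
    print(round(pooled_gc(plastid_like), 1))
    print(round(pooled_gc(nuclear_like), 1))
```

Ambiguous characters such as N are excluded from both numerator and denominator, so partially resolved EST reads do not bias the percentage.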

It should also be noted that although the majority of the ESTs were derived from the dark-grown K. brevis library, we were able to identify many plastid-targeted genes in these libraries. Comparison of our ESTs to sequence databases using a BLAST e-value cutoff of 1e-10 resulted in the identification of a number of transcripts that encode plastid-targeted proteins usually encoded in the nucleus, including numerous light-harvesting proteins, flavodoxin, ferredoxin, plastocyanin, GAPDH, and fructose-1,6-bisphosphate aldolase I and II (J. D. H., T. N., and D. B. unpublished data). We also recognize that our EST data potentially represent only a fraction of the K. brevis nuclear gene complement; therefore, additional cDNA sequencing (work in progress with the Joint Genome Institute, D. B. and F. V. D.) may yet result in the identification of some coding regions of plastid function in this species that are typical of peridinin dinoflagellates. Normally, however, photosynthetic genes are found in the initial 200-300 EST sequences determined from a light-harvested dinoflagellate cDNA library (e.g., in A. tamarense, Hackett et al. 2004).
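The e-value cutoff mentioned above can be illustrated with a short filter over tabular BLAST output. This is a sketch under stated assumptions, not the pipeline used in the study: it assumes hits are available in the 12-column tab-separated layout written by BLAST+ with `-outfmt 6` (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, e-value, bit score), and the function name `filter_hits` and the example identifiers are hypothetical.

```python
import csv
import io


def filter_hits(tabular_text: str, max_evalue: float = 1e-10) -> dict:
    """Keep the best hit per query whose e-value is at or below the cutoff.

    Returns a dict mapping query id -> (subject id, e-value).
    Assumes 12-column tab-separated rows with the e-value in column 11.
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject, evalue = row[0], row[1], float(row[10])
        if evalue > max_evalue:
            continue  # hit is weaker than the cutoff
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


if __name__ == "__main__":
    # Invented example rows; identifiers and scores are not from the study.
    example = "\n".join([
        "est001\tLHC_protein\t78.5\t200\t43\t0\t1\t200\t5\t204\t1e-50\t190",
        "est001\tother_protein\t60.0\t150\t60\t0\t1\t150\t1\t150\t3e-12\t70",
        "est002\tferredoxin\t55.0\t90\t40\t1\t1\t90\t3\t92\t2e-05\t45",
    ])
    for query, (subject, evalue) in filter_hits(example).items():
        print(query, subject, evalue)
```

In the example, est002 is discarded because its only hit (2e-05) is weaker than the 1e-10 cutoff, while est001 is retained with its strongest hit.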

CONCLUSIONS

A model of dinoflagellate plastid evolution that summarizes the current state of knowledge is presented in Fig. 4. Presently, the most parsimonious explanation is that the alveolate ancestor contained a plastid of red algal secondary endosymbiotic origin (Fast et al. 2001; Bhattacharya, Yoon, and Hackett 2004). This organelle was lost in the ciliates, and its genome was significantly reduced in the parasitic apicomplexans and underwent a major transformation (PT1) at the base of the dinoflagellates (Bachvaroff et al. 2004; Hackett et al. 2004). Major evolutionary innovations in the dinoflagellates were the origin of minicircle genes and large-scale plastid gene transfer to the nucleus. Our EST data show that a second plastid genome transformation (PT2) occurred following the haptophyte tertiary endosymbiosis that resulted in the loss of PT1 characters and reversion to a state typical of “normal” algae (e.g., origin of Form I Rubisco, absence of peridinin). The presence of an intact, typical plastid genome in fucoxanthin dinoflagellates remains to be verified. The well supported sister group relationship of these taxa with peridinin-containing species in the Gymnodiniales (e.g., Akashiwo spp.; Daugbjerg et al. 2000; Zhang, Bhattacharya, and Lin in press) strongly suggests, however, that the fucoxanthin plastid is derived from one which contained minicircles (e.g., as in Akashiwo sanguinea; see Fig. 1). Loss of the existing nuclear-encoded plastid genes in the K. brevis ancestor, which presumably were shielded from Muller’s ratchet in the nucleus, may indicate that selection favors co-evolved proteins (i.e., in the captured haptophyte plastid genome) rather than a mixture of proteins of haptophyte and of ancient chromalveolate origin. Alternatively, the regulation of plastid function may be more efficient when the 15 landmark nuclear-encoded plastid genes of peridinin dinoflagellates are transcribed and translated in the organelle (see Hackett et al. 2004; Koumandou et al. 2004).

We also uncovered a second case of endosymbiotic gene transfer in which the existing plastid-targeted GAPDH in K. brevis, presumably of chromalveolate origin, was replaced by the homolog from the haptophyte tertiary endosymbiont (as for psbO, see Ishida and Green 2002). An intriguing question that remains is whether PT2-type transformations have also occurred in other dinoflagellates that have undergone tertiary endosymbiosis such as Peridinium foliaceum (diatom plastid, see Fig. 1) and Lepidodinium viride (green algal plastid; e.g., Watanabe et al. 1991) or whether these taxa have adopted different strategies for incorporating the genomic information encoded in the captured organelle. In conclusion, our data underline the remarkable ability of dinoflagellates to transform their genomes through endosymbiosis and identify these as ideal models for understanding this critical process in eukaryotic evolution.

ACKNOWLEDGEMENTS

This work was supported by grants from the National Science Foundation awarded to D. B. (DEB 01-07754, MCB 02-36631) and from NOAA/ECOHAB to F. V. D.

LITERATURE CITED

Bachvaroff, T. R., G. T. Concepcion, C. R. Rogers, E. M. Herman, and C. F. Delwiche. 2004.

Dinoflagellate expressed sequence tag data indicate massive transfer of genes to the

nuclear genome. Protist 155:65-78.

Baldauf, S. L., A. J. Roger, I. Wenk-Siefert, and W. F. Doolittle. 2000. A kingdom-level phylogeny of

eukaryotes based on combined protein data. Science 290:972-977.

Barbrook, A. C., and C. J. Howe. 2000. Minicircular plastid DNA in the dinoflagellate Amphidinium

operculatum. Mol. Gen. Genet. 263:152-158.

Bhattacharya, D., and L. Medlin. 1995. The phylogeny of plastids: A review based on comparisons of

small-subunit ribosomal RNA coding regions. J. Phycol. 31:489-498.

Bhattacharya, D., and K. Weber. 1997. The actin gene of the glaucocystophyte Cyanophora paradoxa:

analysis of the coding region and introns, and an actin phylogeny of eukaryotes. Curr. Genet.

31:439-446.

Bhattacharya, D., H. S. Yoon, and J. D. Hackett. 2004. Photosynthetic eukaryotes unite: endosymbiosis

connects the dots. BioEssays 26:50-60.

Bodyl, A. 1999. Evolutionary pathway of the apicomplexan plastids and its implications. Trends

Microbiol. 7:266-268.

Bonaldo, M. F., G. Lennon, and M. B. Soares. 1996. Normalization and subtraction: two approaches to

facilitate gene discovery. Genome Res. 6:791-806.

Cavalier-Smith, T. 2000. Membrane heredity and early chloroplast evolution. Trends Plant Sci. 5:174-

182.

Cavalier-Smith, T. 1999. Principles of protein and lipid targeting in secondary symbiogenesis: Euglenoid,

Dinoflagellate, and Sporozoan plastid origins and the eukaryote family tree. J. Eukaryot.

Microbiol. 46:347-366.

Cavalier-Smith, T. 1986. The kingdom Chromista: origin and systematics. Pp. 309-347 in F. E. Round,

and D. J. Chapman, eds., Progress in Phycological Research. Biopress, Bristol. U.K.

Chesnick, J. M., W. H. Kooistra, U. Wellbrock, and L. K. Medlin. 1997. Ribosomal RNA analysis

indicates a benthic pennate diatom ancestry for the endosymbionts of the dinoflagellates

Peridinium foliaceum and Peridinium balticum (Pyrrhophyta). J. Eukaryot. Microbiol. 44:314-

320.

Chesnick, J. M., C. W. Morden, and A. M. Schmieg. 1996. Identity of the endosymbiont of Peridinium

foliaceum (Pyrrophyta): analysis of the rbcLS operon. J. Phycol. 32:850-857.

Daugbjerg, N., G. Hansen, J. Larsen, and O. Moestrup. 2000. Phylogeny of some of the major genera of

dinoflagellates based on ultrastructure and partial LSU rDNA sequence data, including the

erection of three new genera of unarmoured dinoflagellates. Phycologia 39:302-317.

De Salas, M. F., C. J. S. Bolch, L. Botes, G. Nash, S. W. Wright, and G. M. Hallegraeff. 2003. Takayama

Gen. Nov. (Gymnodiniales, Dinophyceae), a new genus of unarmored dinoflagellates with

sigmoid apical grooves, including the description of two new species. J. Phycol. 39:1233-1246.

Durnford, D. G., J. A. Deane, S. Tan, G. I. McFadden, E. Gantt, and B. R. Green. 1999. A phylogenetic

assessment of the eukaryotic light-harvesting antenna proteins, with implications for plastid

evolution. J. Mol. Evol. 48:59-68.

Edvardsen, B., W. Eikrem, J. C. Green, R. A. Andersen, S. Y. Moon-van der Staay, and L. K. Medlin.

2000. Phylogenetic reconstructions of the Haptophyta inferred from 18S ribosomal DNA

sequences and available morphological data. Phycologia 39:19-35.

Fast, N. M., J. C. Kissinger, D. S. Roos, and P. J. Keeling. 2001. Nuclear-encoded, plastid-targeted genes

suggest a single common origin for apicomplexan and dinoflagellate plastids. Mol. Biol. Evol.

18:418-426.

Felsenstein, J. 1985. Confidence limits on phylogenies: An approach using the bootstrap. Evolution

39:783-791.

Felsenstein, J. 2004. PHYLIP (Phylogenetic Inference Package) 3.6. Department of Genetics, University

of Washington, Seattle. Wash.

Gajadhar, A. A., W. C. Marquardt, R. Hall, J. Gunderson, E. V. Ariztia-Carmona, and M. L. Sogin. 1991.

Ribosomal RNA sequences of Sarcocystis muris, Theileria annulata and Crypthecodinium cohnii

reveal evolutionary relationships among apicomplexans, dinoflagellates, and ciliates. Mol.

Biochem. Parasitol. 45:147-154.

Gilbert, D. G. 1995. SeqPup, a biological sequence editor and analysis program for Macintosh computer.

Indiana University, Bloomington, IN.

Goldman, N., J. P. Anderson, and A. G. Rodrigo. 2000. Likelihood-based tests of topologies in

phylogenetics. Syst. Biol. 49:652-670.

Gray, M. W. 1992. The endosymbiont hypothesis revisited. Int. Rev. Cytol. 141:233-357.

Guindon, S., and O. Gascuel. 2003. A simple, fast, and accurate algorithm to estimate large phylogenies by

maximum likelihood. Syst. Biol. 52:696-704.

Hackett, J. D., H. S. Yoon, M. B. Soares, M. F. Bonaldo, T. L. Casavant, T. E. Scheetz, T. Nosenko, and

D. Bhattacharya. 2004. Migration of the plastid genome to the nucleus in a peridinin

dinoflagellate. Curr. Biol. 14:213-218.

Hagopian, J. C., M. Reis, J. P. Kitajima, D. Bhattacharya, and M. C. Oliveira. 2004. Comparative analysis

of the complete plastid genome sequence of the red alga Gracilaria tenuistipitata var. liui: insight

on the evolution of rhodoplasts and their relationship to other plastids. J. Mol. Evol. 59:464-477.

Harper, J. T., and P. J. Keeling. 2003. Nucleus-encoded, plastid-targeted glyceraldehyde-3-phosphate

dehydrogenase (GAPDH) indicates a single origin for chromalveolate plastids. Mol. Biol. Evol.

20:1730-1735.

Howe, C. J., A. C. Barbrook, V. L. Koumandou, R. E. Nisbet, H. A. Symington, and T. F. Wightman.

2003. Evolution of the chloroplast genome. Philos. Trans. R. Soc. Lond. B. Biol. Sci. 358:99-106.

Huelsenbeck, J. P., and F. Ronquist. 2001. MRBAYES: Bayesian inference of phylogenetic trees.

Bioinformatics 17:754-755.

Inagaki, Y., A. G. Simpson, J. B. Dacks, and A. J. Roger. 2004. Phylogenetic artifacts can be caused by

leucine, serine, and arginine codon usage heterogeneity: Dinoflagellate plastid origins as a case

study. Syst. Biol. 53:582-593.

Ishida, K., and B. R. Green. 2002. Second- and third-hand chloroplasts in dinoflagellates: phylogeny of

oxygen-evolving enhancer 1 (psbO) protein reveals replacement of a nuclear-encoded plastid

gene by that of a haptophyte tertiary endosymbiont. Proc. Natl. Acad. Sci. USA 99:9294-9299.

Jones, D. T., W. R. Taylor, and J. M. Thornton. 1992. A new approach to protein fold recognition. Nature

358:86-98.

Kamykowski, D., E. J. Milligan, and R. E. Reed. 1998. Biochemical relationships in the orientation of the

autotrophic dinoflagellate Gymnodinium breve under nutrient replete conditions. Marine Ecol.

Prog. Ser. 167:105-117.

Kim, Y. S., and D. F. Martin. 1974. Effects of salinity on synthesis of DNA, acidic polysaccharide and

ichthyotoxin in Gymnodinium breve. Phytochemistry 13:533-538.

Kohler, S., C. F. Delwiche, P. W. Denny, L. G. Tilney, P. Webster, R. J. Wilson, J. D. Palmer, and D. S.

Roos. 1997. A plastid of probable green algal origin in apicomplexan parasites. Science

275:1485-1489.

Koumandou, V. L., R. E. Nisbet, A. C. Barbrook, and C. J. Howe. 2004. Dinoflagellate chloroplasts-

where have all the genes gone? Trends Genet. 20:261-267.

Laatsch, T., S. Zauner, B. Stoebe-Maier, K. V. Kowallik, and U. G. Maier. 2004. Plastid-derived single

gene minicircles of the dinoflagellate Ceratium horridum are localized in the nucleus. Mol. Biol.

Evol. 21:1318-1322.

Lidie, K.L., J. C. Ryan, M. Barbier, and F. M. Van Dolah. 2005. Gene expression in the Florida red tide

Dinoflagellate Karenia brevis: analysis of an expressed sequence tag (EST) library and

development of a DNA microarray. Mar. Biotechnol. (in press).

Maddison, D. R., and W. P. Maddison. 2002. MacClade. Sinauer, Sunderland.

Matsuzaki, M., O. Misumi, I. T. Shin et al. (40 co-authors). 2004. Genome sequence of the ultrasmall

unicellular red alga Cyanidioschyzon merolae 10D. Nature 428:653-657.

McFadden, G., and R. Waller. 1999. Response from McFadden and Waller. Trends Microbiol. 7:267-268.

Moreira, D., H. Le Guyader, and H. Philippe. 2000. The origin of red algae and the evolution of

chloroplasts. Nature 405:69-72.

Nassoury, N., M. Cappadocia, and D. Morse. 2003. Plastid ultrastructure defines the protein import

pathway in dinoflagellates. J. Cell Sci. 116:2867-2874.

Nozaki, H., M. Matsuzaki, M. Takahara, O. Misumi, H. Kuroiwa, M. Hasegawa, I. T. Shin, Y. Kohara, N.

Ogasawara, and T. Kuroiwa. 2003. The phylogenetic position of red algae revealed by multiple

nuclear genes from mitochondria-containing eukaryotes and an alternative hypothesis on the

origin of plastids. J. Mol. Evol. 56:485-497.

Palmer, J. D. 2003. The symbiotic birth and spread of plastids: how many times and whodunit? J. Phycol.

39:4-12.

Rizzo, P. D. 1982. Isolation and properties of isolated nuclei from the Florida red tide dinoflagellate

Gymnodinium breve. J. Protozool. 29:217-222.

Rodriguez, F., J. F. Oliver, A. Marin, and J. R. Medina. 1990. The general stochastic model of nucleotide

substitutions. J. Theor. Biol. 142:485-501.

Sanchez Puerta, M. V., T. R. Bachvaroff, and C. F. Delwiche. 2004. The complete mitochondrial genome

sequence of the haptophyte Emiliania huxleyi and its relation to heterokonts. DNA Res. 11:1-10.

Schmidt, H. A., K. Strimmer, M. Vingron, and A. von Haeseler. 2002. TREE-PUZZLE: maximum

likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics 18:502-504.

Schnepf, E., and M. Elbraechter. 1999. Dinophyte chloroplasts and phylogeny - a review. Grana 38:81-97.

Shimodaira, H., and M. Hasegawa. 2001. CONSEL: for assessing the confidence of phylogenetic tree

selection. Bioinformatics 17:1246-1247.

Sigee, D. C. 1986. The dinoflagellate chromosome. Adv. Bot. Res. 12:205-265.

Stechmann, A., and T. Cavalier-Smith. 2003. Phylogenetic analysis of eukaryotes using heat-shock

protein Hsp90. J. Mol. Evol. 57:408-419.

Stibitz, T. B., P. J. Keeling, and D. Bhattacharya. 2000. Symbiotic origin of a novel actin gene in the

cryptophyte Pyrenomonas helgolandii. Mol. Biol. Evol. 17:1731-1738.

Swofford, D. L. 2002. PAUP*: Phylogenetic Analysis Using Parsimony (* and other methods) 4.0b8.

Sinauer, Sunderland, MA.

Takishita, K., K. Ishida, and T. Maruyama. 2004. Phylogeny of nuclear-encoded plastid-targeted GAPDH

gene supports separate origins for the peridinin- and the fucoxanthin derivative-containing

plastids of dinoflagellates. Protist 155:447-458.

Takishita, K., K. Ishida, M. Ishikura, and T. Maruyama. In press. Phylogeny of the psbC gene, coding a

photosystem II component CP43, suggests separate origins for the peridinin- and fucoxanthin

derivative-containing plastids of dinoflagellates. Phycologia.

Tengs, T., O. J. Dahlberg, K. Shalchian-Tabrizi, D. Klaveness, K. Rudi, C. F. Delwiche, and K. S.

Jakobsen. 2000. Phylogenetic analyses indicate that the 19' hexanoyloxy-fucoxanthin-containing

dinoflagellates have tertiary plastids of haptophyte origin. Mol. Biol. Evol. 17:718-729.

Van de Peer, Y., and R. De Wachter. 1997. Evolutionary relationships among the eukaryotic crown taxa

taking into account site-to-site rate variation in 18S rRNA. J. Mol. Evol. 45:619-630.

Walker, L. 1982. Evidence for a sexual cycle in the Florida red tide dinoflagellate, Ptychodiscus brevis

(=Gymnodinium breve). Trans. Am. Microbiol. Soc. 101:287-293.

Watanabe, M. M., T. Sasa, S. Suda, I. Inouye, and S. Takichi. 1991. Major carotenoid composition of an

endosymbiont in a green dinoflagellate, Lepidodinium viride. J. Phycol. 27:75.

Whelan, S., and N. Goldman. 2001. A general empirical model of protein evolution derived from multiple

protein families using a maximum-likelihood approach. Mol. Biol. Evol. 18:691-699.

Williamson, D. H., M. J. Gardner, P. Preiser, D. J. Moore, K. Rangachari, and R. J. Wilson. 1994. The

evolutionary origin of the 35 kb circular DNA of Plasmodium falciparum: new evidence supports a

possible rhodophyte ancestry. Mol. Gen. Genet. 243:249-252.

Wilson, R. J., P. W. Denny, P. R. Preiser, K. Rangachari, K. Roberts, A. Roy, A. Whyte, M. Strath, D. J.

Moore, P. W. Moore, and D. H. Williamson. 1996. Complete gene map of the plastid-like DNA of the

malaria parasite Plasmodium falciparum. J. Mol. Biol. 261:155-172.

Yang, Z. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput.

Appl. Biosci. 13:555-556.

Yoon, H. S., J. D. Hackett, and D. Bhattacharya. 2002. A single origin of the peridinin- and fucoxanthin-

containing plastids in dinoflagellates through tertiary endosymbiosis. Proc. Natl. Acad. Sci. USA

99:11724-11729.

Yoon, H. S., J. D. Hackett, C. Ciniglia, G. Pinto, and D. Bhattacharya. 2004. A molecular timeline for the

origin of photosynthetic eukaryotes. Mol. Biol. Evol. 21:809-818.

Yoon, H. S., J. D. Hackett, G. Pinto, and D. Bhattacharya. 2002. The single, ancient origin of chromist

plastids. Proc. Natl. Acad. Sci. USA 99:15507-15512.

Zhang, Z., T. Cavalier-Smith, and B. R. Green. 2002. Evolution of dinoflagellate unigenic minicircles and

the partially concerted divergence of their putative replicon origins. Mol. Biol. Evol. 19:489-500.

Zhang, Z., B. R. Green, and T. Cavalier-Smith. 1999. Single gene circles in dinoflagellate chloroplast

genomes. Nature 400:155-159.

Zhang, H., D. Bhattacharya, and S. Lin. In press. Phylogeny of dinoflagellates based on mitochondrial

cytochrome b and nuclear small subunit rDNA sequence comparisons. J. Phycol.

Figure Legends

Figure 1. Phylogenetic analysis of algal and plant plastids. This ML tree was inferred from the combined plastid protein sequences of PsaA, PsaB, PsbA, PsbC, and PsbD. The results of a ML bootstrap analysis are shown above the branches, whereas the values below the branches result from a ME bootstrap analysis. The thick branches represent >95% Bayesian posterior probability. The bootstrap values shown in the gray boxes were calculated after the removal of the highly divergent minicircle sequences in the peridinin dinoflagellates. The dinoflagellate plastid sequences are shown in boldface. Only bootstrap values > 60% are shown. The branch lengths are proportional to the number of substitutions per site (see scale in figure). The probabilities (using the 1-sided K-H test) for placing the peridinin plastid clade in 3 alternate positions are shown in the smaller gray boxes with the arrows.

Figure 2. Results of the AU-test for assessing the phylogenetic position of the fucoxanthin and peridinin dinoflagellate plastids in broadly sampled ML trees of this organelle. The results of a phyML bootstrap analysis are shown above the branches. Only bootstrap values > 60% are shown. A. Placement of the fucoxanthin plastid clade in the 14-taxon ML tree. B. Placement of the peridinin plastid clade in the 14-taxon ML tree. C. Placement of the peridinin plastid clade in the 16-taxon ML tree. The position of highest probability in each tree is shown with the vertical arrow and the probability for each plastid placement is indicated with the branch thickness (see legend). The filled circle in Figs. 2B, C marks the predicted position of dinoflagellate peridinin plastids under the chromalveolate hypothesis. The branch lengths are proportional to the number of substitutions per site (see scales in figure).

Figure 3. ML tree of chromalveolate cytosolic and plastid-targeted GAPDH sequences. The results of a ML bootstrap analysis are shown above the branches, whereas the values below the branches result from a ME bootstrap analysis. The thick branches represent >95% Bayesian posterior probability. The filled circle marks the strongly supported monophyletic clade of plastid-targeted GAPDH from K. brevis and from the haptophytes, indicating the origin of the K. brevis sequence via tertiary endosymbiotic gene replacement. Only bootstrap values > 60% are shown. The branch lengths are proportional to the number of substitutions per site (see scale in figure).

Figure 4. Schematic representation of alveolate plastid evolution with a focus on the dinoflagellates. The plastid was presumably lost in the ciliates (gray circle) and reduced to a remnant plastid (apicoplast) in parasitic apicomplexans. Within dinoflagellates, PT1 represents the first plastid genome transformation at the base of this clade when, for example, the chromalveolate plastid genome was reduced to 1-3 gene minicircles or transferred to the nucleus. PT2 is the second plastid genome transformation following the haptophyte tertiary endosymbiosis that resulted in the loss of the PT1 characters and, essentially, reversion to a state typical of “normal” algae (e.g., Form I Rubisco, absence of peridinin, putatively intact plastid genome). The diatom plastid replacement in Peridinium foliaceum is also shown.

[Yoon et al., Figure 1: tree image not reproduced. Group labels: Peridinin Dinos., STRAMENOPILES, Fucoxanthin Dinos., HAPTOPHYTES, CRYPTOPHYTES (together CHROMALVEOLATES), RHODOPHYTES, Cyanidiales, CHLOROPHYTES & LAND PLANTS, CYANOBACTERIUM. Scale bar: 0.05 substitutions/site.]

[Yoon et al., Figure 2: tree images not reproduced. Panel A: Fucoxanthin dinos. (Karenia spp.). Panels B and C: Peridinin dinos. (Heterocapsa tri. + Akashiwo). Branch thickness legend: P > 0.05, P < 0.05, P < 0.01. Scale bars: 0.05 substitutions/site.]

[Yoon et al., Figure 3: tree image not reproduced. Clade labels: CYTOSOLIC (including Dinoflagellates) and PLASTID-TARGETED (Karenia brevis - Fucoxanthin dino., Haptophytes, Peridinin dinos.). Scale bar: 0.05 substitutions/site.]

[Yoon et al., Figure 4: schematic not reproduced. Lineage labels: Fucoxanthin-Dinos., P. foliaceum, Peridinin-Dinos., Apicomplexans, Ciliates, Photosynthetic ancestor of Alveolata. Event labels: Haptophyte Plastid Replacement (PT2), Diatom Plastid Replacement, and PT1 (peridinin, loss of chl. c1, single gene circles, form-II rubisco, large-scale nuclear gene transfer).]