The Eukaryotic Tree of Life from a Global Phylogenomic Perspective

Fabien Burki

Canadian Institute for Advanced Research, Department of Botany, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada

Molecular phylogenetics has revolutionized our knowledge of the eukaryotic tree of life. With the advent of genomics, a new discipline of phylogenetics has emerged: phylogenomics. This method uses large alignments of tens to hundreds of genes to reconstruct evolutionary histories. This approach has led to the resolution of ancient and contentious relationships, notably between the building blocks of the tree (the supergroups), and has allowed enigmatic yet important lineages for understanding evolution to be placed in the tree. Here, I discuss the pros and cons of phylogenomics and review the eukaryotic supergroups in light of earlier work that laid the foundation for the current view of the tree, including the position of the root. I conclude by presenting a picture of eukaryote evolution, summarizing the most recent progress in assembling the global tree.

It is redundant to say that eukaryotes are diverse. Animals, plants, and fungi are the charismatic representatives of the eukaryotic domain of life, but this narrow view does not do justice to the eukaryotic diversity. Microscopic eukaryotes, often unicellular and known as the protists, represent the bulk of most major groups, whereas multicellular lineages are confined to small corners on the global tree of eukaryotes. Although all eukaryotes possess structures enclosed within intracellular membranes (the organelles), an almost infinite variation of forms and feeding strategies has evolved since their origin. Eukaryotic cells can wander on their own, sometimes forming hordes of free-living pico-sized organisms that flourish in oceans. They can be parasites or symbionts, or come together by the billions in tightly packed, highly regulated multicellular organisms. Eukaryotes have occupied just about every ecological niche on Earth. Some actively gather food from the environment, others use plastids (chloroplasts) to derive energy from light; many can adapt to variable conditions by switching between autotrophy and the predatory consumption of prey by phagotrophy. Eukaryotes also show a great deal of genomic variation (Lynch and Conery 2003). Some amoebozoan protists, for instance, have the largest known genomes, more than 200 times larger than that of humans (Keeling and Slamovits 2005). Conversely, microbial parasites can have highly compact, bacterial-size genomes (Corradi et al. 2010). Even smaller are the remnant nuclear genomes (nucleomorphs) of what were once free-living microbial algae. At around 500,000 base pairs and

Cold Spring Harb Perspect Biol 2014;6:a016147

F. Burki

barely encoding a few hundred genes, nucleomorphs are the smallest nuclear genomes of all (Douglas et al. 2001; Gilson et al. 2006; Lane et al. 2007).

Recognizing this great diversity and pushed by a desire to establish order, biologists have long attempted to assemble a global eukaryotic tree of life. A fully resolved tree including all organisms is not only the ultimate goal of , it would also provide the foundation to infer the acquisition and evolution of countless characters through the history of long-dead . But early attempts to resolve the eukaryotic tree, most of which were based on comparisons of morphology and nutrition modes, faced the impossible challenge of describing in an evolutionarily sensitive way a world in which most of the diversity occurs among tiny microbes. For decades, textbooks assigned the eukaryotes to evolutionary entities called "kingdoms" in which the lords were the animals, plants, and fungi (Copeland 1938; Whittaker 1969; Margulis 1971). This is not to say that biologists ignored protists, which have in fact been recognized as a kingdom for more than a century (Haeckel 1866), but protists were considered to be "simple" organisms from which more elaborate, multicellular species emerged. Although these early proposals succeeded in recognizing several major assemblages, such as animals and plants, they were less successful in resolving the relationships between the groups and, with the benefit of hindsight, failed to account for the fundamental paraphyletic and complex nature of the protist lines.

A MOLECULAR (R)EVOLUTION

The backbone of the eukaryotic tree has gone through some profound rearrangements in the past 20 years. Comparing nucleotide or amino acid sequences is now the tool of choice for reconstructing evolutionary histories. This is particularly true for protists because the interpretation of their morphological characters alone is problematic. For years, the go-to molecular marker for phylogenetics has been the small subunit ribosomal RNA (SSU rRNA). It is easy to amplify and contains both hypervariable and conserved regions, allowing researchers to investigate different depths of phylogenetic resolution. As a result, the SSU rRNA dominates molecular databases, and the majority of the known eukaryotic diversity, when characterized molecularly, is still defined solely by this marker.

The pioneering molecular phylogenies consistently recovered a handful of deeply diverging protist lineages (e.g., , parabasalids, microsporidians, ), progressively emerging from the distant prokaryotic root, and followed by a densely branched "crown," nesting the more familiar eukaryotic diversity (Sogin et al. 1986, 1989; Friedman et al. 1987; Woese et al. 1990; Sogin 1991). This was an appealing picture of evolution because these early diverging species were seemingly morphologically simple single-celled organisms that lacked mitochondria and other typical eukaryotic structures, such as peroxisomes (Keeling 1998). These phylogenies were also consistent with the Archezoa hypothesis, which postulated that amitochondriate eukaryote lineages diverged before the endosymbiotic event that gave rise to mitochondria (Cavalier-Smith 1983, 1987, 1989). Even more convincing was that other molecular markers, including various elongation factors and RNA polymerase subunits, corroborated the deep-branching position of archezoan taxa, altogether supporting the prediction that they should branch earlier than the mitochondrion-containing eukaryotes if they predate the origin of this organelle (Brown and Doolittle 1995; Klenk et al. 1995; Kamaishi et al. 1996; Yamamoto et al. 1997).

At the other end of the tree, the so-called crown contained the major clades of eukaryotes, appearing tightly bunched together as if they diverged almost simultaneously (Sogin 1991; Knoll 1992). These clades included animals, fungi, and plants as well as diverse protist lineages such as and (see below). The branching order among the SSU rRNA crown taxa, however, could not be resolved even with the help of several markers (Baldauf 1999; Hirt et al. 1999; Roger

The Tree of Eukaryotes

et al. 1999; Moreira et al. 2000). This lack of molecular resolution was interpreted as evidence that not enough phylogenetic signal could accumulate in the sequences because most phyla emerged in a very short period of time, as in a "big-bang" explosion of species diversification (Philippe et al. 2000a,b).

TRIMMING THE TREE

The early molecular-based interpretation of the eukaryotic tree showing the archezoan-crown dichotomy did not last very long. First, as more and more lineages were being sequenced, mitochondriate protist groups such as and squeezed in between the archezoan taxa and the crown (Sogin et al. 1986; Clark and Cross 1988; Pawlowski et al. 1996). Moreover, the archezoans were characterized by elevated rates of molecular evolution, which translate into long branches in phylogenetic trees (Philippe et al. 2000b). This meant that for these "basal" taxa, most standard molecular markers such as the SSU rRNA were mutationally saturated, a feature that can lead to the notorious long-branch attraction (LBA) artefact in which distantly related species with fast-evolving sequences are erroneously clustered together (Felsenstein 1981). Ultimately, new genes, more taxa, and better phylogenetic methods showed that if the archezoan taxa appeared to diverge early, it was not because they were "primitive" eukaryotes but rather because of their artificial attraction to the base of the tree by distant outgroups (Embley and Hirt 1998; Roger 1999; Baldauf et al. 2000; Philippe et al. 2000a,b). At the same time, mitochondrial-derived genes were progressively discovered in archezoan genomes, the products of which were shown to be targeted to reduced double-membrane-bounded organelles of mitochondrial ancestry (hydrogenosomes and mitosomes), suggesting that all "amitochondriate" eukaryotes once possessed mitochondria (Clark and Roger 1995; Bui et al. 1996; Keeling 1998; Tovar et al. 1999, 2003; Williams et al. 2002; Embley and Martin 2006; Goldberg et al. 2008; Hjort et al. 2010). This marked the end of both the Archezoa hypothesis and the concept of a eukaryotic crown, and no longer supported the idea that the prokaryote-to-eukaryote transition was a progressive transformation involving intermediate amitochondriate forms.

More generally, the whole eukaryotic tree was shaken up by important discrepancies between SSU rRNA-based phylogenies and those inferred from a growing number of protein-coding genes, as well as discrete molecular characters such as shared indels (insertions/deletions) or gene fusions, or the systematic analysis of light and electron microscopy data (e.g., Baldauf and Palmer 1993; Keeling and Doolittle 1996; Fast et al. 1999; Baldauf et al. 2000; Moreira et al. 2000; Cavalier-Smith 2002; Simpson 2003; Nikolaev et al. 2004; Harper et al. 2005). The integration of these various kinds of data led to the conception that most, if not all, eukaryotic diversity can be assigned to one of several major assemblages, called "supergroups" (Baldauf 2003; Keeling 2004; Simpson and Roger 2004; Adl et al. 2005; Keeling et al. 2005; Parfrey et al. 2006). In this framework, animals occupy just one branch among hundreds, and are far outnumbered at the level of major lineages by unicellular eukaryotes. In its original form, this new tree of eukaryotes was an unrooted polytomy with six main stems, each representing a supergroup: Opisthokonta, Amoebozoa, Excavata, Archaeplastida, Rhizaria, and Chromalveolata. All six of these supergroups emerged from a common point, and their order of divergence was mostly unknown.

PHYLOGENETICS + GENOMICS = PHYLOGENOMICS

Fast-forward to the present day: we are well into the genomic era, and reconstructing the tree of eukaryotes is no longer the job of a few genes. Instead, huge phylogenomic data sets containing hundreds of genes for an ever-increasing number of taxa can now be used. The tedious and expensive Sanger sequencing of the early 2000s (the human genome cost in the range of a billion U.S. dollars) has been replaced by fast and cheap next-generation sequencing, which can produce genome-scale data at unprecedented depths for a taxonomically broad sampling.
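The mutational saturation invoked above for fast-evolving taxa can be illustrated numerically. The following is a minimal sketch, not a method taken from the studies cited here: it assumes the Jukes-Cantor (JC69) model, the simplest nucleotide substitution model, under which the expected fraction p of differing sites after d substitutions per site is p = 3/4 (1 - exp(-4d/3)). The function names are hypothetical and chosen for illustration only.

```python
import math


def jc69_observed_fraction(d):
    """Expected fraction of differing sites after d substitutions per site (JC69 assumed)."""
    if d < 0:
        raise ValueError("d must be non-negative")
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def jc69_corrected_distance(p):
    """Invert the JC69 formula: estimate d from an observed fraction p of differing sites."""
    if not 0.0 <= p < 0.75:
        # At p = 0.75 two nucleotide sequences look random with respect to each other.
        raise ValueError("p must lie in [0, 0.75); at 0.75 the sequences are saturated")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    # Observed differences level off as the true distance keeps growing.
    for d in (0.05, 0.5, 1.0, 2.0, 5.0):
        p = jc69_observed_fraction(d)
        print(f"true d = {d:4.2f}  expected observed fraction p = {p:.3f}")
```

Under this assumed model, the observed fraction of differences approaches a plateau of 0.75 as the true distance increases, so very long branches differ from one another by almost the same observed amount regardless of their real divergence. This is one simple way to see why saturated markers carry little signal for placing fast-evolving lineages.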


Once strongly biased toward organisms relevant to human well-being (of economic or medical importance), the taxonomic distribution of species for which extensive genomic data are available has exploded. As of 2012, at least one species from each major of eukaryotes has had its genome fully sequenced, not to mention the numerous smaller scale genomic surveys and transcriptomic data sets that have been generated. With this wealth of sequence data at hand, phylogenomics started off as a way to predict gene functions by evolutionary analysis in which the functions of uncharacterized genes were predicted from their phylogenetic position relative to genes with known functions (Eisen 1998). Although this area of phylogenomics is still very active, it also rapidly emerged as a new discipline of phylogenetics and became an essential tool for addressing controversial evolutionary questions, such as the transitions that led to the current diversity of eukaryotes.

In its most popular application to phylogeny, phylogenomics relies on multiple-sequence alignments, much like the single-gene approach, but here the genes are often concatenated into large supermatrices (Delsuc et al. 2005). With this approach, too, it is important to ensure the homology of the characters (orthology), that is, that the genes in related species are inherited from a common ancestor. This is not an easy task because the genes available from genomic-scale projects are still poorly sampled, substantially more so than for widely used phylogenetic protein markers such as , α- and β-tubulin, or elongation factor 2. Consequently, differentiating between orthology and, for example, the undesirable horizontally transferred genes (HGTs), independent gene losses, or partially sampled genes can be difficult to achieve with a limited selection. Moreover, many of the general biases that apply to single-gene phylogenetics are exacerbated in a phylogenomic context (Jeffroy et al. 2006; Rodriguez-Ezpeleta et al. 2007b; Philippe et al. 2011). Phylogenetic reconstruction involves two opposing forces: the true phylogenetic signal carrying the evolutionary history, and the nonphylogenetic signals (noise) resulting from a combination of one or more causes such as saturated positions in the data set (stochastic errors) or model misspecifications (systematic errors). Stochastic errors arise when the number of positions in an alignment is small, meaning that the random background noise, which inevitably accumulates through time because of homoplasy, will have a neutralizing effect on the positions that contain the genuine phylogenetic signal. Model misspecifications leading to systematic errors include, for example, the heterogeneity of nucleotide or amino acid composition, which tends to incorrectly cluster together species sharing the same composition, or the heterogeneity of the evolutionary rates, which can result in the LBA artefact (Felsenstein 1981; Philippe 2000; Lopez et al. 2002; Foster 2004; Ho and Jermiin 2004; Jermiin et al. 2004).

Logically, the goal of a phylogenetic inference is to minimize the noise and maximize the true phylogenetic signal (Philippe et al. 2005, 2011). Single-gene phylogenies, because of the limited information they contain, are especially susceptible to stochastic errors; to counter this, the obvious solution is to gather more data in the hope that enough phylogenetic signal is recovered (i.e., synapomorphy will dominate homoplasy). Systematic errors, on the other hand, tend not to vanish with the addition of more data as their causes do not average out over longer alignments (Rodriguez-Ezpeleta et al. 2007b). Yet, the key advantage of phylogenomics is precisely that more data are available to start with, making it possible to apply strategies to diminish the known sources of systematic errors while still maintaining most of the phylogenetic signal (see Delsuc et al. 2005; Philippe et al. 2005, 2011 for reviews). Thus, when combined with other options developed and tested on smaller data sets to increase the phylogenetic accuracy, such as the use of more accurate phylogenetic methods or better taxon sampling, phylogenomics becomes a very powerful tool.

Phylogenomics has confirmed the existence of most supergroups, although various degrees of controversy remain for each of them. The increased power in resolution has also led to some important shuffling among the supergroups and allowed the placement in the tree of several "orphan" lineages. In the next section, I briefly introduce each supergroup, and discuss the eukaryote phylogeny in light of the most recent advances (see Fig. 1).

THE EUKARYOTIC SUPERGROUPS

Opisthokonta

This supergroup contains animals (Metazoa) and fungi, as well as several lines of heterotrophic protists. Animals emerged from within a paraphyletic assemblage of protists, including the choanoflagellates (e.g., Monosiga), Filasterea (e.g., ), and Ichthyosporea (e.g., Sphaeroforma; Steenkamp et al. 2006; Ruiz-Trillo et al. 2008; Shalchian-Tabrizi et al. 2008; del Campo and Ruiz-Trillo 2013). Fungi, including the monophyletic and early assemblage comprising Cryptomycota (e.g., ), aphelids, and microsporidians (Lara et al. 2010; Jones et al. 2011a,b; James et al. 2013; Karpov et al. 2013), are closely related to nucleariid amoebae (Nuclearia simplex) and the aggregative (Brown et al. 2009; Liu et al. 2009, 2012). Opisthokonts are putatively united by the presence of a single posterior flagellum in several representatives (Cavalier-Smith and Chao 1995), and many lines of molecular evidence (single-gene phylogenies, phylogenomics, and indels) have consistently supported the existence of this group (e.g., Baldauf and Palmer 1993; Wainright et al. 1993; Baldauf et al. 2000; Brown et al. 2009; Liu et al. 2009). It is one of the most reliable supergroups.

Amoebozoa

This supergroup includes mostly amoeboid, heterotrophic protists such as the classical naked and testate lobose amoebae with broad pseudopods (e.g., Amoeba), but also contains some amitochondriate parasitic lineages of critical medical importance (e.g., Entamoeba), flagellated cells (e.g., , ), or the mycetozoan slime molds capable of aggregative multicellularity (e.g., Dictyostelium; see Pawlowski and Burki 2009 for a recent review). In addition to morphological similarities of the amoeboid cells, evidence that they form a monophyletic group is based on single-gene phylogenies (e.g., Fahrni et al. 2003; Smirnov et al. 2005), phylogenomics (Bapteste et al. 2002; Minge et al. 2009; Brown et al. 2012), and similarities in mitochondrial genome architecture between Acanthamoeba and Dictyostelium (Lonergan and Gray 1996; Iwamoto et al. 1998).

Amoebozoa and Opisthokonta are often united in a larger supergroup called the unikonts (Cavalier-Smith 2002) or, more recently, Amorphea (Adl et al. 2012), which is supported by a /fusion (Stechmann and Cavalier-Smith 2002, 2003) as well as single-gene phylogenies (e.g., Baldauf and Palmer 1993) and phylogenomics (Rodriguez-Ezpeleta et al. 2007a; Hampl et al. 2009; Brown et al. 2012; Derelle and Lang 2012). Amorphea also contains protist lineages of unclear placement, such as the apusomonads, ancyromonads, and breviate amoebae. Based on phylogenomic analysis, opisthokonts, apusomonads, and breviates were recently grouped into a larger entity named Obazoa (Brown et al. 2013). The validity of this assemblage, however, relies on the position of the eukaryotic root, which remains hypothetical (see below).

Excavata

This supergroup is composed of diverse and mainly heterotrophic protists, many of which are anaerobes and/or parasites and possess hydrogenosomes or mitosomes instead of mitochondria (e.g., , ). A group including lineages with plastids of green algal origin, the euglenids (e.g., Euglena), also belongs to this assemblage. Excavates were originally proposed based on a distinctive feeding groove supported by a particular set of cytoskeletal features that are found in some, but not all, of these organisms (Simpson 2003). A strong molecular confirmation of this grouping is currently lacking, which is, at least in part, attributable to the high rates of sequence evolution of most of its putative constituent lineages. Single-gene phylogenies (e.g., Kolisko et al. 2008; Takishita et al. 2012) and phylogenomics (Rodriguez-Ezpeleta


[Figure 1: tree diagram. Labeled assemblages: Archaeplastida (red algae, green plants), Excavates, SAR (stramenopiles, alveolates, Rhizaria), Opisthokonts (fungi, animals), Amoebozoa, Amorphea, and Diaphoretickes; individual taxon labels not reproduced.]

Figure 1. Global tree of eukaryotes from a consensus of phylogenetic evidence (in particular, phylogenomics), rare genomic signatures, and morphological characteristics. Numerous eukaryotic groups are shown (not exhaustively), regardless of their taxonomic rank. Cartoons illustrate the diversity constituting the largest assemblages (colored boxes). The branching pattern does not necessarily represent the inferred relationships between the lineages. Dotted lines denote uncertain relationships, including conflicting positions. Note the solid branch leading to haptophytes and rappemonads: This illustrates the strong support for placing haptophytes as sister to SAR (stramenopiles, alveolates, and Rhizaria) in a recent study (Burki et al. 2012b), but this lineage is not included in a colored assemblage because confirmation is needed. The arrows point to possible positions for the eukaryotic root; the solid arrow corresponds to the most popular hypothesis (Amorphea-rooting), and the broken arrows represent the alternative hypotheses discussed in the text. (This figure was inspired by a template provided by Y. Eglit.)

et al. 2007a; Hampl et al. 2009) have generally recovered the monophyly of the group when the fast-evolving taxa are excluded, but some lineages, such as Malawimonas, have eluded robust placement (Hampl et al. 2009; Zhao et al. 2012).

Archaeplastida (Plantae)

This supergroup is composed of the three main lineages of primary photosynthetic taxa: organisms that harbor plastids directly derived from the cyanobacterial endosymbiosis. (1) The glaucophytes (e.g., Cyanophora) are a small group of enigmatic freshwater microscopic algae with uniquely cyanobacterial-like plastids (they have retained the prokaryotic peptidoglycan layer between the two membranes). (2) The red algae (rhodophytes) form a diverse group of unicellular algae and large seaweeds, for example, the Porphyra commonly used to wrap sushi (nori). (3) The "green" organisms (Viridiplantae) include the green algae, with mostly free-living unicellular (e.g., Chlamydomonas) and colonial or multicellular taxa (e.g., , Caulerpa), although nonphotosynthetic parasitic taxa (e.g., Prototheca, Helicosporidium) are also known, as well as the land plants (, , angiosperms, etc.). Single-gene phylogenies and phylogenomics of plastid genes have strongly supported a common origin of the primary plastids in the ancestor of Archaeplastida (Chu et al. 2004; Hagopian et al. 2004; Rodriguez-Ezpeleta et al. 2005). A shared plastid protein import machinery also supports a unique endosymbiosis (Palmer 2003; McFadden and van Dooren 2004; Price et al. 2012). Nuclear phylogenies are less supportive (Moreira et al. 2000; Rodriguez-Ezpeleta et al. 2005; Nozaki et al. 2007, 2009), but this supergroup is generally widely recognized.

SAR

This vast assemblage is the most recently recognized supergroup and, contrary to the other supergroups, its existence is exclusively supported by molecular data, that is, phylogenomic analyses (Burki et al. 2007, 2008; Hackett et al. 2007; Rodriguez-Ezpeleta et al. 2007a) and a derived RAB1 paralog (Elias et al. 2009). It was originally named SAR as an acronym of its constituents: stramenopiles, alveolates, and Rhizaria (Burki et al. 2007). Stramenopiles (also known as heterokonts) embrace a very large diversity of protists, including, for example, ecologically important algal groups such as diatoms or large multicellular seaweeds (e.g., ), as well as heterotrophic, often parasitic species such as oomycetes and Blastocystis. Stramenopiles are typically characterized by two opposing flagella with tripartite flagellar hairs (mastigonemes) on the forward flagellum, but also include many lineages that lost one or both flagella (Cavalier-Smith 1986). They are well supported by molecular data (Ben Ali et al. 2002; Cavalier-Smith and Chao 2006; Riisberg et al. 2009). Alveolates are also extremely diverse, notably including the dinoflagellate algae, but also apicomplexan parasites such as the malaria agent Plasmodium or ciliate protozoans (e.g., ). The cortical alveoli, a system of vesicles supporting the plasma membrane, constitute a morphological synapomorphy for the alveolates (Cavalier-Smith 1991). Molecular phylogenies strongly support the monophyly of alveolates (e.g., Fast et al. 2002; Harper et al. 2005). Rhizaria is based on molecular characters only (e.g., Keeling 2001; Archibald et al. 2003; Nikolaev et al. 2004; Bass et al. 2005; Burki and Pawlowski 2006; Burki et al. 2010; Brown et al. 2012; Sierra et al. 2013). It includes naked and testate amoeboid, heterotrophic protists with filose or reticulose pseudopods such as foraminiferans and radiolarians. Some rhizarians are photosynthetic, including lineages in the euglyphids (e.g., Paulinella) and chlorarachniophytes (e.g., Bigelowiella). A large diversity of free-living flagellates and amoeboflagellates, sorocarpic (e.g., Guttulinopsis), and parasitic protists also belong to this group (see Pawlowski and Burki 2009 for a review). One enigmatic parasite, Mikrocytos mackini, which infects and kills oysters, was recently shown to belong to Rhizaria and attracted attention because it likely possesses highly reduced mitochondrion-derived organelles (Burki et al. 2013).

THE RISE OF SAR

Before being members of SAR, stramenopiles and alveolates were part of another supergroup, chromalveolates, which has played a central role in shaping our understanding of eukaryotic evolution, particularly the origin and spread of secondary plastids of red algal origin (Keeling 2009). In addition to stramenopiles and alveolates, the four original constituent chromalveolate lineages also included two important


protist groups: haptophytes and cryptomonads (Keeling 2004; Reyes-Prieto et al. 2007). Although many of these lineages are nonphotosynthetic or lack plastids altogether, the rationale for grouping them into a monophyletic entity was based on the idea that the plastids of the chromalveolate taxa, which all share chlorophyll c, can be traced back to a single endosymbiotic event with a red alga (Cavalier-Smith 1999). Under this hypothesis, the secondary red plastid origin is unique and took place in the chromalveolate ancestor, and the numerous nonphotosynthetic lineages scattered among the chromalveolate tree were inferred to have lost their plastid and/or photosynthesis (Cavalier-Smith 1999). However, because of the absence of integrative molecular evidence supporting it, the chromalveolates have long been a controversial supergroup, and it was recently remodeled in such a way that it has disappeared from the current consensus of the eukaryotic tree (Fig. 1) (Adl et al. 2012; Keeling 2013; Pawlowski 2013).

In phylogenetic terms, one condition of the chromalveolate hypothesis is that both the plastid and host (i.e., nuclear) trees must be consistent in showing the monophyly of alveolates, stramenopiles, haptophytes, and cryptomonads. From the plastid side, the monophyly has usually been recovered for three of these groups (Yoon et al. 2002; Khan et al. 2007). Alveolates, however, have proven nearly impossible to fit into this molecular framework because their plastid genomes are generally highly reduced, providing only a few genes for comparative analyses (Köhler et al. 1997; Green 2004). But this was before the recent unexpected discovery of deep-branching relatives of apicomplexans (e.g., Chromera, Vitrella) (Moore et al. 2008), members of alveolates, which possess more gene-rich plastid genomes (Janouskovec et al. 2010). When the genomes of these species were compared to those of other chromalveolates, the tree that emerged showed a robust union between the alveolate and stramenopile plastids, and recovered the global monophyly of the red plastids also including haptophytes and cryptomonads (which were most closely related to each other), albeit with lower support (Janouskovec et al. 2010). Early nuclear phylogenies based on single genes also recovered a close association between alveolates and stramenopiles (Van de Peer et al. 1996; Harper et al. 2005). Phylogenomics largely confirmed this observation and, at the same time, produced trees in which haptophytes and cryptomonads shared a common ancestor, congruent with the plastid topology (Hackett et al. 2007; Patron et al. 2007; Burki et al. 2009). This latter association was also supported by a unique shared insertion of a laterally transferred gene into the plastids of haptophytes and cryptomonads (Rice and Palmer 2006), which led some to propose the name Hacrobia to accommodate the body of evidence in favor of a common origin between these two groups (Okamoto et al. 2009).

However, what appeared to be a consistent scenario was rapidly challenged by the accumulating genetic data from diverse species as well as evidence against the monophyly of its original constituents. First, several phylogenomic studies showed that Rhizaria branched together with alveolates and stramenopiles (the SAR group) to the exclusion of haptophytes and cryptomonads, whose exact positions remained unresolved (Burki et al. 2007, 2008, 2009; Hackett et al. 2007; Rodriguez-Ezpeleta et al. 2007a). This was significant because Rhizaria include mostly heterotrophic groups and only two known photosynthetic lineages (chlorarachniophytes and Paulinella), neither of which possesses plastids of red algal origin. Thus, under the chromalveolate hypothesis, the SAR relationships imply that the ancestor of Rhizaria had a red-algal-derived plastid, which was lost before their diversification. At first glance, this might not substantially alter the chromalveolate hypothesis: Regardless of how many species belong to Rhizaria, only one more loss of an ancestral red plastid (out of the many already assumed) is required to explain the current plastid distribution, provided that SAR is closely related to Hacrobia.

In addition to Rhizaria, however, phylogenetic analyses showed that several other heterotrophic lineages were linked to the chromalveolates, further challenging the hypothesis.


Specifically, telonemids, centrohelids, katablepharids, and picobiliphytes (now Picozoa) (Seenivasan et al. 2013) have all been inferred as sister to either haptophytes or cryptomonads (Shalchian-Tabrizi et al. 2006; Not et al. 2007; Burki et al. 2009), but these associations are generally poorly resolved, with the exception of katablepharids, which are robustly related to cryptomonads (Okamoto and Inouye 2005; Burki et al. 2012b). Nevertheless, the addition of new plastid-lacking lineages compromised both the hacrobian and chromalveolate hypotheses because both are based on a common plastid origin (Cavalier-Smith 1999; Okamoto et al. 2009). Proponents of the chromalveolate hypothesis, however, can always put forward the argument of plastid loss to explain these relationships, an argument that is difficult to refute until the prevalence of plastid loss among eukaryotes is better understood.

More problematic is evidence calling into question the existence of Hacrobia; in contrast to earlier reports, a recent phylogenomic study did not recover the association between haptophytes and cryptomonads, but instead showed haptophytes branching closer to SAR, whereas cryptomonads were sister to Archaeplastida (Burki et al. 2012b). These relationships were generally weakly supported, in particular the position of cryptomonads, and thus require further testing before one can safely dismantle the Hacrobia hypothesis. But, by recovering cryptomonads distantly related to SAR and haptophytes, this study shifted attention to the position of this group as key to inferring red plastid evolution (Burki et al. 2012b). Indeed, if the cryptomonad-Archaeplastida grouping is confirmed, it would not only invalidate the condition of a shared origin between all chromalveolate lineages in both nuclear and plastid phylogenies, it would also conflict with plastid phylogenies that strongly group haptophytes and cryptomonads (Janouskovec et al. 2010). Furthermore, a study evaluating the phylogenomic signal across the three genomic compartments (nuclear, plastid, and mitochondrial) in chromalveolate taxa reported discrepancies too high to be explained by a common origin of both the plastid and host lineages (Baurain et al. 2010). Altogether, these observations have forged the basis for alternative scenarios to the chromalveolate hypothesis: scenarios in which red plastids spread across the tree not by means of vertical inheritance, but through more complex serial eukaryote-to-eukaryote endosymbioses (Lane and Archibald 2008; Sanchez Puerta and Delwiche 2008; Archibald 2009; Bodyl et al. 2009; Baurain et al. 2010; Dorrell and Smith 2011).

DIGGING FOR THE ROOT

In addition to improving the resolution of the eukaryotic tree, knowing precisely where the root lies is essential to give directionality to evolution and go beyond the classical star-like representation of the supergroups. This fundamental issue has proven extremely difficult to tackle because it refers to a prohibitively ancient event: the last eukaryotic common ancestor. The position of the root is generally considered unresolved, but a few hypotheses have emerged.

The most straightforward approach to root a phylogenetic tree is to use an outgroup, which often consists of one or several lineages known to stem from the base of the group of interest. In the case of eukaryotes, one possible outgroup is to be found within prokaryotes. This is problematic because even the closest prokaryotic outgroup (Archaebacteria) represents a considerable evolutionary distance from eukaryotes, which could potentially lead to the LBA artefact (Felsenstein 1981). In fact, analyses including prokaryotes usually produced suspicious phylogenies in which the fastest evolving eukaryotes branched off close to the base of the tree, attracted to the distant outgroup (Philippe et al. 2000b; Ciccarelli et al. 2006; Williams et al. 2012).

To reduce the outgroup-to-ingroup distance, some researchers have used genes derived from the α-proteobacterial endosymbiotic progenitor of mitochondria (Derelle and Lang 2012). This analysis placed the eukaryotic root between unikonts and everything else (often referred to as bikonts) (Cavalier-Smith 2002), supporting what is perhaps the prevailing view for the original bifurcation in the eukaryotic tree. Two rare genomic changes have also supported the basal unikont–bikont split: (1) unikonts have ancestral, bacterial-like dihydrofolate reductase (DHFR) and thymidylate synthase (TS) genes, whereas bikonts have a derived, fused version of these two genes (Philippe et al. 2000b; Stechmann and Cavalier-Smith 2002, 2003); and (2) unikonts have a unique insertion to myosin class II, whereas this paralog is absent from bikonts (Richards and Cavalier-Smith 2005). Following a parsimony argument, these two signatures were taken as evidence for the monophyly of unikonts and bikonts, and a root position outside of both groups. The unikont–bikont bifurcation was proposed to coincide with a fundamental difference of the flagellar apparatus: unikonts have retained the ancestral organization with one basal body anchoring one flagellum, whereas bikonts are ancestrally biflagellated with two basal bodies (Cavalier-Smith 2002; Stechmann and Cavalier-Smith 2002, 2003).

However, the sequencing of new enigmatic lineages has repeatedly challenged the unikont–bikont bifurcation. For example, although the protistan apusomonads (e.g., Thecamonas) are in essence bikonts (with two basal bodies and two flagella) and possess the fused TS-DHFR, they were shown to be related to unikonts (Kim et al. 2006; Brown et al. 2012; Derelle and Lang 2012). Similarly, the breviate amoeba (e.g., Breviata), which harbors two basal bodies (but only one flagellum), turned out to be a deep-branching unikont (Minge et al. 2009; Roger and Simpson 2009; Brown et al. 2013). These data suggest that the true unikonts, in addition to some protists not fitting with the original description of a monoflagellated ancestry and separate TS and DHFR genes, may form a clade, but it seems increasingly untenable that this group was ancestrally unikont with one basal body. To formalize this taxon without reflecting the controversial ancestral state, the name Amorphea was recently introduced (Adl et al. 2012).

Other analyses have further deepened the uncertainty over the position of the eukaryotic root. Recently, a study looking at the taxonomic distribution of the endoplasmic reticulum (ER)-mitochondria encounter structure, which tethers mitochondria to the ER membrane, showed that the root is most consistently placed between Amorphea + excavates and all other eukaryotes (Wideman et al. 2013). Researchers investigating a new class of rare genomic changes involving multiple, conserved amino acid residues inferred the initial split in eukaryote evolution between Archaeplastida and everything else (Rogozin et al. 2009). A different approach showed that by minimizing the number of gene duplications and losses across 20 genes, the most parsimonious root was found to lie between opisthokonts and the rest of eukaryotes (Katz et al. 2012). In yet another scenario, the root was proposed to be deep within excavates (thus invalidating the monophyletic origin of this supergroup), possibly between Euglenozoa and all other eukaryotes, based on the absence of the mitochondrial outer-membrane channel Tom40 and the DNA replication origin-recognition complexes, two absences interpreted as ancestral, bacterial-like features (Cavalier-Smith 2010b). Most recently, the use of 37 nuclear-encoded proteins of close bacterial ancestry, most of which are of mitochondrial function, positioned the root between excavates and the rest of eukaryotes (He et al. 2014). The list of hypotheses goes on, and it may be some time before the root is dug out, but when its position is revealed with more consistency, it will have a fundamental impact on our understanding of how the eukaryotic supergroups relate to one another.

DEEP RELATIONSHIPS AMONG EUKARYOTES

In an attempt to best reflect the current view of eukaryotic relationships, I present a rooted tree with the origin of eukaryotes between Amorphea and everything else, but also indicate five alternative rooting positions (Fig. 1). Amorphea includes opisthokonts, Amoebozoa, and several protist lineages of uncertain phylogenetic placement, such as apusomonads, ancyromonads, and the breviate amoebae (Heiss et al. 2011; Katz et al. 2011; Brown et al. 2013). The rest of eukaryotes comprises excavates and the recently erected mega-clade Diaphoretickes
(Adl et al. 2012), which recognizes the SAR and Archaeplastida clade (Burki et al. 2008) and corresponds to the corticates of Cavalier-Smith (2010a). Other lineages, such as haptophytes, cryptomonads, katablepharids, telonemids, centrohelids, rappemonads, and Palpitomonas, may also belong to Diaphoretickes, as compiled from several phylogenetic results (Rodriguez-Ezpeleta et al. 2007a; Burki et al. 2009; Hampl et al. 2009; Yabuki et al. 2010; Kim et al. 2011; Brown et al. 2012). The branching pattern within Diaphoretickes is generally poorly resolved, but haptophytes may be sister to SAR (Burki et al. 2012b). The position of cryptomonads, although ambiguous, has recently shown affinities to Archaeplastida (Burki et al. 2012b). The other lineages have failed to show reliable phylogenetic placement and remain enigmatic.

With this evolutionary framework in mind, one group, Collodictyon, stands out as particularly remarkable. Collodictyon is an omnivorous amoeboflagellated cell with a mysterious position in the tree of eukaryotes. It possesses an eclectic mix of cellular features, such as a ventral feeding groove typical of excavates and amoebozoan-like pseudopods, which make pinpointing its phylogenetic position difficult but desirable (Klaveness 1995; Brugerolle et al. 2002). When its position was investigated using a large phylogenomic data set, Collodictyon branched off early in eukaryote evolution, close to the Amorphea-bikont bifurcation (Fig. 1), suggesting that its morphological characteristics may represent some of the ancestral conditions of the eukaryotic cell (Zhao et al. 2012). With the potential exception of the contentious excavate Malawimonas, which has proven impossible to unambiguously assign to any large group of eukaryotes and showed affinities to Collodictyon, this branch of the eukaryotic tree is known to include only one other genus, a view supported by observations of very low genetic diversity of Collodictyon-like sequences in environmental surveys (Zhao et al. 2012). Given a root as in Figure 1, this group may represent the first confirmed case of a lineage that diverged after the Amorphea-bikont split, but before the subsequent diversification of these two groups.

CONCLUDING REMARKS

Over the last 10 years, phylogenomics has led to important refinements of the global tree of eukaryotes. Most of the large building blocks of the tree (the supergroups) predating the genomic era have been reinforced by the analyses of larger data sets, but some were also shuffled into different arrangements. The increased phylogenetic power of phylogenomics helped resolve the relationships between the supergroups, providing new hypotheses for the deep backbone of the tree. It also allowed the placement of orphan taxa that had eluded proper classification until more data became available. At the same time, new challenges have appeared and several key lineages remain of unknown evolutionary origin. It is clear that the dawn of eukaryote evolution and subsequent diversification will not be fully understood until these enigmatic lineages find a home and the root of the tree is characterized. What’s more, as we explore incertae sedis taxa, classical microscopy as well as next-generation environmental surveys and single-cell genomics will undoubtedly reveal new essential protist taxa. But the exciting news is that technological advances such as sequencing preparation from nano-quantities of material now allow one to tackle these taxa from a genomic perspective, even when cell cultures cannot be established (Yoon et al. 2011).

Resolving the tree of eukaryotes will necessitate continuing the integrative approach blending morphology, single-gene phylogeny, and phylogenomics including all diversity. It will also require more reliable ways of assembling genomic data and the development of new methods to extract the phylogenetic information contained in these genomes. Importantly, the pervasiveness of HGT in eukaryotes, in particular the genes transferred from plastids (endosymbiotic gene transfer), will need to be systematically evaluated. Presently, the fierce debate over the global impact of HGT on eukaryote evolution and its harmful consequences on phylogenomics is far from reaching a consensus (Moustafa et al. 2009; Stiller 2011; Burki et al. 2012a; Chan et al. 2012; Deschamps and Moreira 2012). Eukaryote evolution started off more

F. Burki

