<<

REVIEW published: 21 July 2015 doi: 10.3389/fmicb.2015.00717

The universal tree of life: an update

Patrick Forterre 1,2*

1 Unité de Biologie Moléculaire du Gène chez les Extrêmophiles, Département de Microbiologie, Institut Pasteur, Paris, France, 2 Institut de Biologie Intégrative de la cellule, Université Paris-Saclay, Paris, France

Biologists used to draw schematic “universal” trees of life as metaphors illustrating the history of life. It is indeed a priori possible to construct an organismal tree connecting the three major domains of ribosome encoding organisms: , and Eukarya, since they originated by cell division from LUCA. Several universal trees based on ribosomal RNA sequence comparisons proposed at the end of the last century are still widely used, although some of their main features have been challenged by subsequent analyses. Several authors have proposed to replace the traditional universal tree with a ring of life, whereas others have proposed more recently to include viruses as new domains. These proposals are misleading, suggesting that endosymbiosis can modify the shape of a tree or that viruses originated from the last universal common ancestor (LUCA). Edited by: I propose here an updated version of Woese’s universal tree that includes several rootings Mechthild Pohlschroder, University of Pennsylvania, USA for each and internal branching within domains that are supported by recent Reviewed by: phylogenomic analyses of domain specific . The tree is rooted between Bacteria Dong-Woo Lee, and Arkarya, a new name proposed for the grouping Archaea and Eukarya. A Kyungpook National University, South Korea consensus version, in which each of the three domains is unrooted, and a version in which Reinhard Rachel, emerged within archaea are also presented. This last scenario assumes the University of Regensburg, transformation of a modern domain into another, a controversial evolutionary pathway. Germany David Penny, Viruses are not indicated in these trees but are intrinsically present because they infect Massey University, New Zealand the tree from its roots to its leaves. Finally, I present a detailed tree of the domain *Correspondence: Archaea, proposing the sub- neo- for the monophyletic group of Patrick Forterre, Unité de Biologie Moléculaire du euryarchaeota containing DNA gyrase. These trees, that will be easily updated as new Gène chez les Extrêmophiles, data become available, could be useful to discuss controversial scenarios regarding early Département de Microbiologie, life evolution. Institut Pasteur, 25 Rue du Docteur Roux, Paris 75015, France Keywords: archaea, bacteria, eukarya, LUCA, universal tree, evolution [email protected] Introduction Specialty section: This article was submitted to The editors of research topic on “archaeal cell envelopes and surface structures” gave me the Microbial Physiology and Metabolism, challenging task of drawing an updated version of the universal tree of life. This is a daunting a section of the journal Frontiers in Microbiology task indeed, given that the concept of a universal “tree” is disputed by some scientists, who have suggested replacing trees with networks, and that major features of the tree are still controversial Received: 05 January 2015 Accepted: 29 June 2015 (Gribaldo et al., 2010; Forterre, 2012). Thus, it will be difficult to draw a consensus tree welcomed Published: 21 July 2015 by all scientists in the field. In this paper, I thus try to propose updated versions of the universal tree that include as many features as possible validated by robust phylogenetic analyses and/or Citation: Forterre P (2015) The universal tree comparative molecular biology and biochemistry. I will draw a “universal tree” limited to ribosome- of life: an update. encoding organisms (Raoult and Forterre, 2008) that diverged from the last universal common Front. Microbiol. 6:717. ancestor (LUCA). Viruses (capsid encoding organisms) are polyphyletic, therefore their evolution doi: 10.3389/fmicb.2015.00717 can be neither illustrated by a single tree nor included in the universal tree as additional domains

Frontiers in Microbiology | www.frontiersin.org 1 July 2015 | Volume 6 | Article 717 Forterre The universal tree of life

(Forterre et al., 2014a). However, this should not be viewed as some major lineages identified in the past decade, such as the neglecting the role of viruses in biological evolution because “the , which are now recognized as one of the three tree of life is infected by viruses from the root to the leaves”(Forterre major archaeal phyla (Brochier-Armanet et al., 2008a; Spang et al., et al., 2014a). The universal tree of ribosome-encoding organisms 2010, 2015). contains cellular organisms that, unlike viruses, reproduce via cell cycles that imply the formation of new cells from the division of Problems with Textbook Trees mother cells. This implies a fascinating continuity in the heredity of the from LUCA to modern members of the three The universal trees of the 1990s based on rDNA that are still widely domains. A robust universal tree is critical to make sense of the used in textbooks, reviews and seminars provide a misleading evolution of the several types of , cell envelopes and surface view of the history of organisms. For instance, they all depict the structures that originated and evolved on top of this continuity. division of Eukarya between a crown including Plants, Metazoa, Fungi, and several lineages of protists, and several basal long The Textbook Trees branches leading to various other unicellular eukaryotes, of which the most basal are protists lacking mitochondria (formerly called Several popular drawings of the universal tree of life are Archaezoa). This topology of the eukaryotic tree was very popular widespread in the scientific literature and textbooks. These in the 1990s but is the result of a long branch attraction artifact. include the radial rRNA tree of Pace (1997), the tree of Stetter At the beginning of this century, it was acknowledged that all (1996) rooted in a hyperthermophilic LUCA and the most famous major eukaryotic divisions should be somewhere in the crown tree, which was published by Woese et al. (1990) to support their (Embley and Hirt, 1998; Keeling and McFadden, 1998; Philippe proposal to change the name “Archaebacteria” to Archaea and to and Adoutte, 1998; Gribaldo and Philippe, 2002). define the Domain as the highest taxonomic level (Woese et al., Another problem still present in many textbook trees is the 1990; Sapp, 2009). All these trees are based on ribosomal DNA position of . All rDNA trees of the 1990s (rDNA) sequence comparisons and are rooted between Bacteria were rooted within hyperthermophilic archaea and bacteria and a common ancestor of Archaea and Eukarya, a rooting (Woese et al., 1990; Stetter, 1996; Pace, 1997). In particular, that was first suggested by phylogenetic analyses of paralogous the hyperthermophilic bacteria of the genera and proteins (Iwabe et al., 1989; Gogarten et al., 1989). were the two most basal bacterial lineages in all these The rDNA trees are still widely used despite their age (from 17 trees. This explains why these bacteria are still often labeled to 25 years old) because scientists need metaphors to represent as “deep-branching bacteria” (Braakman and Smith, 2014). the history of living organisms (despite all criticisms to the tree However, the analysis of ribosomal RNA sequences at slowly concept itself) and because few new trees have been proposed in evolving nucleotide positions (Brochier and Philippe, 2002) and the past two decades that are accepted by most biologists. Doolittle phylogenetic analyses based on sequences do not support (2000) published a widely popularized tree in which the bases of the deep branching of Thermotoga and Aquifex in the bacterial the three domains are mixed by widespread lateral transfer tree (Boussau et al., 2008b; Zhaxybayeva et al., 2009). The exact (LGT). However, studies on archaeal phylogeny and more recently position of these hyperthermophilic bacteria currently remains in Bacteria and Eukarya, as well as comparative genomic analyses, controversial, because of the unusual extent of LGT that occurred have shown that the history of organisms (not to be confused between these bacteria and some other bacterial groups (Boussau with the history of ) can be inferred from a core of highly et al., 2008b; Zhaxybayeva et al., 2009; Eveleigh et al., 2013). conserved genes (Brochier et al., 2005a; Gribaldo and Brochier, The clustering of hyperthermophiles around the root and their 2009; Puigbò et al., 2013; He et al., 2014; Ramulu et al., 2014; “short branches” have been widely cited as support for the idea Raymann et al., 2014; Petitjean et al., 2015). of a hot LUCA and a hot origin of life (Stetter, 1996), although Comparative has confirmed the existence of three the phenotype of an organism at the tip of a branch does not versions (sensu Woese) of all universally conserved proteins, necessarily reflect that of its ancestor at the base. However, early validating the three domains concept at the genomic level reports noted that both features could be explained by the very and opening the way to protein based universal trees (Olsen high GC content of the ribosomal RNA of hyperthermophiles and Woese, 1997). A fairly popular radial tree based on a that limits the sequence space available for the evolution of these set of universal proteins was published in 2006 by Bork and molecules (Forterre, 1996; Galtier et al., 1999; Boussau et al., colleagues (Ciccarelli et al., 2006). Unfortunately, this tree is 2008a). Indeed, the reconstruction of ribosomal RNA and protein biased by the over-representation of bacteria, because of the sequences in LUCA shed serious doubt on its hyperthermophilic method used to create it, which requires complete , and suggests instead that it was either a mesophilic or a sequences. Furthermore, several detailed internal branches within moderate thermophilic organism (Galtier et al., 1999; Boussau each domain are either controversial, such as the presence of et al., 2008a). This result is in agreement with the facts that specific and in different bacterial superphyla thermoadaptation features of lipids in Archaea and Bacteria are (Kamke et al., 2014). A well-thought-out universal tree based on not homologous and that reverse gyrase, a protein required rRNA was published in by López-García and Moreira (2008). This for life at very high temperature, was probably not present in unrooted tree depicts each domain as a radial form with many the common ancestor of Archaea and Bacteria (Forterre, 1996; phyla, without resolving the relationships between phyla within Brochier-Armanet and Forterre, 2007; Glansdorff et al., 2008). domains. However, like all previous trees, this tree does not show These observations suggest that thermal adaptation from LUCA

Frontiers in Microbiology | www.frontiersin.org 2 July 2015 | Volume 6 | Article 717 Forterre The universal tree of life to the ancestors of Archaea and Bacteria took place from cold to not present in Bacteria, but none with the bacterial ribosome that hot and not the other way around. are not present in Archaea (Lecompte et al., 2002; Figure 1).

A Tree or a Ring? The “eocyte” Question Several authors during the past three decades have proposed to In the 1980s, James Lake proposed a universal tree in which replace the universal tree with a “ring of life” (sensu Rivera and Eukarya are sister group of a subgroup of Archaea (later Lake, 2004) because they think that Eukarya originated from the recognized as ) that he called Eocytes (Lake association of a bacterium with an archaeon (for recent reviews, et al., 1984; Sapp, 2009). Most phylogeneticists now support see Forterre, 2011; Martijn and Ettema, 2013). The most recent a new version of the “Eocyte tree,” in which Eukarya emerge version of ring of life scenario is that eukaryogenesis was triggered from within Archaea, and are a sister group of a superphylum by the engulfment of an alpha-proteobacterium by a wall-less encompassing Thaumarchaeota, , Crenarchaeota, giant archaeon capable of (Martijn and Ettema, and (the TACK superphylum; Guy and Ettema, 2013). Proponents of fusion (association) scenarios argue that 2011; Williams et al., 2013; McInerney et al., 2014; Raymann et al., such fusion is required to explain why eukaryotic contain 2015), a sister group of Thaumarchaeota (Katz and Grant, 2014) both archaeal and bacterial-like genes. However, this is not the or a sister group of Korarchaeota (Raymann et al., 2015). In a case, since the presence of archaeal-like genes in Eukarya is a recent study, Eukarya are even proposed to have emerge from logical consequence of the sisterhood of Archaea and Eukarya, within a new phylum of the TACK superphylum (tentative phylum whereas the presence of bacterial-like genes is the expected result ; Spang et al., 2015). of mitochondrial endosymbiosis. Additional bacterial genes might The “eocyte” scenario is supported by phylogenetic analyses have been introduced in proto-eukaryotes by LGT (Doolittle, of universal proteins that use sophisticated methods for tree 1998), which may have been partly mediated by large DNA viruses reconstruction, which are thought to be very efficient at (Forterre, 2013a). Finally, ring of life scenarios do not easily identifying weak phylogenetic signals. However, these data are explain the presence of many core eukaryotic genes (around 40%) controversial, because most universal proteins are small (e.g., that were already present in the last eukaryotic common ancestor ribosomal proteins) and very divergent between Bacteria and (LECA) but have no detectable bacterial or archaeal homologs Archaea/Eukarya, which makes archaeal/eukaryal relationships (Fritz-Laylin et al., 2010). difficult to resolve (Gribaldo et al., 2010). For instance, the Ring of life scenarios, as well as scenarios in which elongation factor datasets are saturated and unable to identify Eukarya emerged from within Archaea (see below) assume deep phylogenetic relationships between eukaryal phyla the transformation of one and/or two of the modern domain (Philippe and Forterre, 1999), and it is therefore challenging into a third one. These scenarios have been criticized by several to use them as phylogenetic markers to resolve even deeper authors, as being biologically unsound (Woese, 2000; Kurland evolutionary relationships. In fact, in the single trees obtained et al., 2006; De Duve, 2007; Cavalier-Smith, 2010; Forterre, for universal proteins by Cox et al. (2008), Eukarya branch 2011, 2013a). In particular, Woese (2000) argued that: “modern within Euryarchaeota in about half of the trees and within cells are sufficiently integrated and “individualized” that further Crenarchaeota in the other half, and they are characterized change in their designs does not appear possible.” However, even if by poor node resolution and many aberrant groupings within eukaryogenesis was actually triggered by the endosymbiotic event Archaea (Cox et al., 2008, supporting information online). that produced mitochondria, this would not be a good reason to Similar unresolved and contradictory single-gene trees were replace the tree of life with a ring. The universal tree should depict again obtained by Williams and Embley in their more recent evolutionary relationships between domains defined according universal phylogeny (see supplementary Figure S1 in Williams to the translation apparatus reflecting the history of cells (and and Embley, 2014). their envelope; Woese et al., 1990) and not according to the global One should be cautious in the interpretation of trees obtained genomic composition that is influenced by LGT, virus integration from the concatenation of protein sequences that produce and endosymbiosis, the history of which is incredibly complex. such contradictory individual trees. Indeed, the Microsporidia This is well illustrated by the case of Plantae. The Encephalitozoon, a derived fungus, appears at the base of the endosymbiosis of a Cyanobacterium that created this eukaryotic eukaryotic tree published by Cox et al. (2008). This is reminiscent megagroup don’t prevent evolutionists to draw a tree of Eukarya of the phylogenies of the 1990s that misplaced Microsporidia in which Plantae are represented as one branch of the tree, and and other amitochondriate eukaryotes at the base of the tree of not as the product of a ring (He et al., 2014). The tree of any Eukarya (Hashimoto et al., 1995; Kamaishi et al., 1996). Similarly, particular taxonomic unit is indeed not affected by the presence several long basal eukaryotic branches (Fornicata, Archamoeba) (or absence) of endosymbionts in some of its branches! Thus, a emerged between Thaumarchaeota and other Eukarya in the tree universal tree of life depicting the three domains as three separate of Katz and Grant (2014). In the tree supporting the emergence entities does not contradict the fusion/endosymbiotic hypotheses of Eukaryotes from Lokiarchaeota, the archaeal tree is rooted in at the origin of Eukarya, as long as this event had no effect on the branch leading to Methanopyrus kandleri (Spang et al., 2015) the eukaryotic ribosome itself. This is not the case, because the a fast evolving archaeon (Brochier et al., 2004). Finally, in the eukaryotic ribosome is not a mixture of archaeal and bacterial analysis of Gribaldo and coworkers, the archaeal tree is rooted ribosomes; it shares 33 proteins with archaeal ribosomes that are between Euryarchaeota and the putative TACK superphylum

Frontiers in Microbiology | www.frontiersin.org 3 July 2015 | Volume 6 | Article 717 Forterre The universal tree of life

FIGURE 1 | Evolution of the ribosome proteins content. The proteins present at each evolutionary steps are deduced from the universal tree is rooted in the bacterial branch (left) or in the eukaryotic distribution of homologous ribosomal proteins in the three domains of life, branch (right). In each case, the most parsimonious scenarios for the Archaea (A), Bacteria (B), and Eukarya (E) (adapted from the data of evolution of ribosomal proteins content are presented. The numbers of Lecompte et al., 2002). using eukaryotic proteins as outgroup, but it is rooted in the Eukarya are a sister group of a clade containing crenarchaea branch leading to Korarchaeota when universal bacterial proteins and the euryarchaeon Methanopyrus kandleri, whereas in the are added to the dataset (Raymann et al., 2015). tree of Williams and Embley, eukaryotic Kae1 emerges from a The universal tree published by Gribaldo and co-workers clade grouping and Methanoccoccales. This reflects our best present knowledge of the internal branching illustrates the importance to present in supplementary material order within Archaea and recovers the monophyly of most phyla the individual trees of universal proteins beside those obtained in the three domains (Raymann et al., 2015). Notably, Archaea with concatenation of protein sequences. are rooted in this tree within Euryarchaeota when bacterial The various sets of universal proteins used by different groups proteins are used as an outgroup, suggesting a new root for to investigate the relationships between Archaea and Eukarya Archaea. However, Moreira and colleagues found instead that the show substantial overlap and it is probable that most protein root of the archaeal tree is located between Euryarchaeota and data sets lack valid phylogenetic signal (Gribaldo et al., 2010). Crenarchaeota when using a bacterial outgroup (Petitjean et al., Two groups that analyzed similar sets of proteins with various 2014). methods came to a similar conclusion (Lasek-Nesselquist and I previously reported an observation that questions the validity Gogarten, 2013; Rochette et al., 2014). Lasek-Nesselquist and of the methods used for tree reconstruction in some of these Gogarten (2013) noticed that “the methods used” to recover the analyses. The three domains are each monophyletic, with well eocyte tree “generate trees with known defects,...revealing that it resolved evolutionary relationships within domains, in a tree is still error prone,” whereas Rochette et al. (2014) concluded that of the universal protein Kae1/YgjD published in 2007 (see “the high frequency of paraphyletic-Archaea topologies for near- Figure S1 in Hecker et al., 2007). In contrast, Archaea are universal genes may be the consequence of stochastic effects.” paraphyletic for the same protein in the analysis of Cox et al. Generally speaking, it is very difficult to resolve ancient (2008) (see Figure 2 in Forterre, 2013a) or in a more recent relationships by molecular phylogenetic methods for both analysis by the same research group (Williams and Embley, 2014, practical and theoretical reasons, essentially because the infor- supporting information online). In the tree of Cox et al. (2008), mative signal is completely erased at long evolutionary distances

Frontiers in Microbiology | www.frontiersin.org 4 July 2015 | Volume 6 | Article 717 Forterre The universal tree of life

(Forterre and Philippe, 1999; Mossel, 2003; Penny et al., 2014). fact that about 20% of their genes originated from One possibility to bypass this phylogenetic impasse is to focus on (Martin et al., 2002). biological plausibility. Trees in which the three domains are each An argument often used for scenarios in which Eukarya monophyletic are more plausible than trees in which Archaea are descended from modern lineages of is that Eukarya monophyletic because they explain more easily the existence of only appeared recently in the fossil record (McInerney et al., three versions of the ribosome discovered by Woese et al. (1990). 2014). Most authors supporting this scenario systematically In the classical Woese tree, these versions emerged from ancestors ignore the discovery 5 years ago of possible multicellular that differed from modern cells, at a time when the tempo of eukaryotes in sediments dating 2.1 billions years old (El Albani protein evolution was faster than today. By contrast, scenarios et al., 2010). This suggests that the last common ancestor of in which Eukarya emerged from within Archaea assume that modern eukaryotes could have originated much earlier than members from a modern domain (Archaea) were transformed previously thought (well before 2.1 Gyr ago) and that proto- recently (after the diversification of this domain) into completely eukaryotes could have been already present even earlier. In different organisms (Eukarya), something difficult to imagine any case, this also confirms that the tempo of evolution of the from a biological point of view (Woese, 2000; Kurland et al., 2006; central components of the molecular cell fabric has decreased De Duve, 2007; Cavalier-Smith, 2010; Forterre, 2011, 2013a). around three Gyr ago, possibly after the emergence of the three It has been proposed that this dramatic transformation domains, as first suggested by (Woese, 2000; Forterre, 2006). This (implying among others the replacement of archaeal type lipids also explains why it is so challenging to determine the precise by bacterial type lipids) was initiated in a particular archaeal topology of the universal tree of life, considering that we are lineage that, in contrast to all other lineages, already evolved dealing with six Gyr of evolution, encompassing periods with toward more complex forms before eukaryogenesis (Martijn and very different evolutionary tempo, when we are comparing two Ettema, 2013). The recently proposed novel archaeal phylum, modern sequences of universal proteins. Lokiarchaeota, appears to be a good candidate in that case because its genome apparently encodes many genes potentially involved in the manipulation of membranes or in the formation of a The Elusive Root of the Tree , including several homologs of eukaryal proteins that are observed for the first time in Archaea (Spang et al., A major problem in drawing the universal tree of life is the 2015). Eukaryotic-like features present in Archaea (the archaeal position of the root. The tree is rooted between Bacteria and eukaryome, sensu Koonin and Yutin, 2014) are indeed widely Archaea/Eukarya in the classical Woese tree (Woese et al., 1990). dispersed among the various archaeal phyla (Koonin and Yutin, This rooting was initially supported by phylogenetic analyses 2014), suggesting that all these features were present in the last of protein paralogs (elongation factors and V/F types ATPase archaeal common ancestor (LACA), which was more complex subunits) that originated by duplication before LUCA (Gogarten than modern archaea (Forterre, 2013a). This observation is easily et al., 1989; Iwabe et al., 1989). This rooting was criticized in the explained in the framework of the Woese tree by the selective loss 1990s because the bacterial branches are much longer than the of these features (present in the last common ancestor of Archaea other two branches in the trees of these protein paralogs (Forterre and Eukarya) in different archaeal phyla during the streamlining and Philippe, 1999; Philippe and Forterre, 1999). Furthermore, the process that led to the emergence of modern archaea (Forterre, elongation factors and V/F type ATPase subunits data sets, as well 2013a). Comparative genomic analyses have indeed previously as other groups of paralogs (e.g., signal recognition particles, SRP) revealed a tendency toward reduction in the evolution of Archaea that also used to place the root between Bacteria and a common (Csurös and Miklos, 2009; Makarova et al., 2010; Koonin and ancestor of Archaea and Eukarya were saturated with mutations Yutin, 2014). (Philippe and Forterre, 1999). Therefore, it was unclear if this In contrast, in the eocyte scenario, most eukaryotic features rooting reflects the real history of life on our planet or if it is due to present in Archaea originated during the transition between a long branch attraction artifact, e.g., the “bacterial branch” being Euryarchaeota and the TACK superphylum. This scenario further attracted by the long branch of the outgroup sequences of the implies that all archaea “stopped evolving,” remaining archaea, paralogs. Statistical analysis of slowly evolving positions in the two whilst progressively and randomly losing some of these eukaryotic paralogous subunits of SRP confirmed that the bacterial rooting features, except for one particular lineage of Lokiarchaeota that obtained by more classical phylogenetic analyses with SRP was due experienced a dramatic burst of accelerated evolution and was to a long branch attraction artifact and suggested that the root is transformed into eukaryotes. located between Archaea/Bacteria and Eukarya (Brinkmann and It is traditionally suggested that the process that led to this Philippe, 1999); however, this analysis was not followed up. transformation was triggered by the endosymbiosis event that As is the case for archaeal/eukaryal relationships, there is created mitochondria (Lane and Martin, 2010). This seems probably no valid phylogenetic signal left in the universal protein to be a leap of faith, because there is no example of such data set to resolve the rooting of the universal tree by molecular a drastic transformation of the host molecular fabric at the phylogeny. This was confirmed in the case of the elongation more basic and fundamental levels (translation, transcription, factors data set by a cladistic analysis of individual amino-acid replication) triggered by an endosymbiotic event (Forterre, alignments that discriminate between primitive and share derived 2013a). For instance, Plantae remain bona fide Eukarya (with characters (Forterre et al., 1992). Only 23 positions could be typical eukaryotic version of all universal proteins) despite the subjected to this analysis in the elongation factor data set, of

Frontiers in Microbiology | www.frontiersin.org 5 July 2015 | Volume 6 | Article 717 Forterre The universal tree of life which 22 gave ambiguous results and only one supported bacterial Forterre, 2002, 2013b). For instance, our preliminary analyses rooting! of universal proteins sequence alignments indicate that the These past 20 years, the rooting problem has been Lokiarchaeon is probably neither an early branching archaeon neglected—with a few exceptions (see for instance Harish nor a missing link between Archaea and Eukarya (see also Nasir et al., 2013) that I have no space to discuss here. Indeed, “ring et al., 2015). Similarly, other analogous, but non-homologous, of life” scenarios or those in which Eukaryotes originated from systems, such as the two distinct rotary proteins involved in Archaea automatically root the tree between Archaea and Bacteria ATP synthesis by F°/F1 and A/V ATPases, might have originated (rejuvenating the pre-Woesien / paradigm). independently in the bacterial and in the archaeal/eukaryal However, comparative molecular biology has now revealed lineages (Mulkidjanian et al., 2007). several situations that can help us to root the universal tree and Another explanation for the existence of non-homologous decide between alternative scenarios. systems between Archaeal/Eukaryal and Bacteria is that LUCA contained two redundant systems and that one of them was later Rooting From Comparative Molecular on lost at random in each domain (Edgell and Doolittle, 1997; Biology Glansdorff et al., 2008). However, it is unlikely that both versions of all non-homologous systems between Archaeal/Eukaryal and Comparative genomic analyses have shown that most proteins Bacteria were present in LUCA. For instance, no modern central for cellular function (both informational and operational) cells have two non-homologous versions of DNA replication show higher sequence similarity between Archaea and Eukarya machineries or two versions of RNA polymerases (the bacterial than between Eukarya and Bacteria. Furthermore, Archaea and and the archaeal ones). Some systems could have been randomly Eukarya also share many proteins that are either absent in distributed between LUCA and other contemporary cellular (or Bacteria or replaced with non-homologous proteins with the viral) lineages, and redistributed thereafter by LGT, but this seems same function. Surprisingly, comparative genomic analyses have very unlikely in the case of the ribosome. also shown that critical components of the DNA replication Recent biochemical work in our laboratory exemplifies why machinery (replicase, primase, helicase) are non-homologous comparative biochemistry data support a universal tree in between Archaea/Eukarya and Bacteria (Leipe et al., 1999; Archaea and Eukarya are indeed sister domains. Several research Forterre, 2006). This is also the case for the proteins that allow the groups, as well as our team in Orsay, succeeded in reconstituting bacterial F-type ATPase and archaeal A-type ATPase to work as in vitro the protein complexes involved in the biosynthesis of the ATP synthases (Mulkidjanian et al., 2007). All these observations universal threonylcarbamoyl adenosine (t6A) tRNA modification are difficult to interpret if the universal tree is rooted in the in position 37 of tRNA in the three domains of life (Deutsch archaeal or eukaryotic branches and/or if the archaeal/eukaryal et al., 2012; Perrochia et al., 2013a,b) and in mitochondria specific proteins were present in LUCA. Indeed, this would (Wan et al., 2013; Thiaville et al., 2014). In Bacteria, Archaea have required many non-orthologous replacement events that and Eukarya, the reactions require the combination of two occurred specifically in the bacterial branch. Figure 1 illustrates universal proteins and essential accessory proteins that exist in the case of the ribosome evolution. The eukaryotic (or archaeal) two versions, one present in Bacteria, the other present in Archaea rooting is clearly less parsimonious than the bacterial ones since it and Eukarya. Interestingly, the same reaction can be performed implies the loss of 33 ancestral ribosomal proteins in the bacterial in mitochondria by the two universal proteins alone, one (Qri7) branch concomitant with the gain of 23 new proteins. Such non- that came from Bacteria via the endosymbiotic route and the orthologous replacement scenario cannot be completely ruled out other (Kae1) corresponding to the eukaryotic version (Wan et al., since a similar gain and loss event occurred during the evolution 2013; Thiaville et al., 2014). These results suggest that LUCA of the mitochondrial ribosome from the bacterial one (Desmond was able to perform this universally conserved reaction with the et al., 2011). However, there is no evidence that the emergence ancestors of the two universal proteins and that accessory proteins of bacteria involved a dramatic evolutionary event similar to the (now essential) were added independently in the bacterial and in drastic reductive evolution that occurred during the emergence of the archaeal/eukaryal lineages. The most parsimonious scenario, mitochondria. illustrated in Figure 2, supports the rooting between Bacteria and It was previously suggested that non-orthologous replacement Archaea/Eukarya, because other roots would require the presence had indeed occurred for the DNA replication machinery, with the of the archaeal/eukaryal set of accessory proteins in LUCA, and ancestral DNA replication machinery in LUCA being replaced by its replacement by the non-homologous bacterial set in Bacteria. non-homologous DNA replication proteins of viral origin either This seems unlikely because biochemical analyses have shown in Bacteria or in Archaea/Eukarya (Forterre, 1999). However, this that the bacterial and archael/eukaryal accessory proteins are not type of explanation cannot be easily generalized. It seems unlikely functionally equivalent (Deutsch et al., 2012; Perrochia et al., that multiple non-orthologous replacements can explain all 2013b). It is therefore difficult to imagine intermediate steps in other major differences between archaeal/eukaryal and bacterial the replacement process. Furthermore, such replacement, even analogous but non-homologous systems! In the case of the partial, never occurred during the diversification of the three DNA replication machineries, it is simpler to imagine that two domains. versions present in modern cells were independently transferred Woese and Fox (1977) were thus possibly right when they from viruses to cells, once in the bacterial lineage and once proposed that the molecular fabric of LUCA was simpler than that in the archaeal/eukaryal lineage (Mushegian and Koonin, 1996; of modern organisms, and that this organism still had an RNA

Frontiers in Microbiology | www.frontiersin.org 6 July 2015 | Volume 6 | Article 717 Forterre The universal tree of life

FIGURE 2 | Evolution of the biosynthesis pathway of threonylcarbamoyl synthesis in modern organisms (Deutsch et al., 2012; Perrochia et al., 2013a,b). adenosine (t6A) from LUCA to the three domains of life. Homologous The bacterial proteins (YjeE, YeaZ) are non-homologous to the archaeal/eukaryal proteins have the same shapes and colors, except for YgjE and YeaZ, which are ones (Bud32, Pcc1, Cgi121). The most parsimonious scenario supports the homologous but not orthologous. In this scenario, this universal tRNA rooting between Bacteria and Archaea/Eukarya, because other roots would modification was performed in LUCA by two universal proteins (the ancestors of require the presence of the Bud32, Pcc1, and Cgi121 proteins in LUCA, and Kae1/YgjD and of Sua5/YrdC, respectively), as in mitochondria (Wan et al., their replacement by the YeaZ, YjeE proteins in Bacteria, an unlikely evolutionary 2013; Thiaville et al., 2014). Additional proteins are now essential for t6A pathway (see text). genome. In this scenario, major molecular machineries, such as name, adopting a “gradist” view of life evolution, with the three the DNA replication machineries or the ATP synthases, emerged Domains emerging independently from a “communal LUCA” and/or became sophisticated independently in the branches before the “Darwinian threshold” (Woese, 2000, 2002). In such leading to Bacteria on one side and to the common ancestor of view, the notion of clade itself cannot be used to group organisms Archaea and Eukarya on the other. that diverged at the time of LUCA when no real speciation The rooting of the universal tree in the so-called “bacterial occurred. I have criticized the Darwinian threshold concept, branch” (Figures 3–5) has been often interpreted as suggesting assuming—with many others—that Darwinian evolution started a “prokaryotic phenotype” for LUCA. This is a misleading as soon as biological evolution take off (see Forterre, 2012, and interpretation that again confuses the phenotypes at the tip and references therein). In particular, extensive genes exchanges that base of a branch. The rooting between a lineage leading to Bacteria possibly take place at the time of LUCA (but see Poole, 2009) and a lineage leading to Archaea and Eukarya is compatible with cannot be opposed to Darwinian evolution occurring through diverse types of LUCA, including a LUCA with some “eukaryotic- variation and selection, since gene transfer only corresponds to like features” that were lost in Archaea and Bacteria (Forterre, a specific type of variation (Forterre, 2012). 2013a). I think that it’s time now to look back at the universal tree Importantly, rooting of the universal tree in the “bacterial with a cladistics perspective and to propose a name for the clade branch” formally requires giving a name to the clade grouping grouping Archaea and Eukarya. It is challenging to find a common Archaea and Eukarya. Woese (2000) never proposed such a synapomorphy to Archaea and Eukarya that could provide a

Frontiers in Microbiology | www.frontiersin.org 7 July 2015 | Volume 6 | Article 717 Forterre The universal tree of life

FIGURE 3 | Schematic universal tree updated from (Woese et al., 1990; Common Ancestor, red circle, hyperthermophilic LACA. LARCA: Last Arkarya see text for explanations). PG: peptidoglycan; DNA (blue arrows) introduction Common Ancestor; FME: First Mitochondriate Eukarya; LECA: Last Eukaryotic of DNA; T (pink and red arrows) thermoreduction. LBCA: Last Bacterial Common Ancestor; blue circles, mesophilic ancestors. SARP: Stramenopila, Common Ancestor, pink circle: thermophilic LBCA; LACA: Last Archaeal Alveolata, Rhizobia, Plantae. good name for the clade corresponding to these two domains. major transformation events” (sensu Forterre and Philippe, 1999) David Prangishvili suggests Arkarya (personal communication), that occurred between LUCA and the formation of each domain combining the names of the two domains belonging to this clade (Woese, 1998; Forterre and Philippe, 1999). (Figure 3). Notably, universal trees in which Eukarya emerge from Evidently, it is not possible to draw a tree including all within Archaea are often viewed as “two domains trees” versus presently recognized phyla, especially in the bacterial domain, so the “three domains tree” of (Gribaldo et al., 2010). I made arbitrary choices, and tried to include most well studied However, the new nomenclature proposed here emphasizes that bacterial and archaeal phyla, as well as major eukaryotic divisions the classical Woese tree is also stricto sensu, a two domains tree and/or supergroups. In the case of Archaea, I only indicate (Bacteria and Arkarya)! the phyla Euryarchaeota, Crenarchaeota, Thaumarchaeota and the candidate phylum “Lokiarchaeota” because other proposed Updated Trees for Everybody archaeal phyla are represented by a single species and/or their phylum status is controversial or has been refuted by robust The backbone of the updated universal trees proposed here phylogenetic and phylogenomic analyses (see below). Although (Figures 3–5) was selected from the 1990 tree of Woese et al. still preliminary, the study of three partial lokiarchaeal genomes (1990) as a tribute to Carl Woese and the historical work of has shown that these archaea encode many eukaryotic-like genes the Urbana school (Sapp, 2009; Albers et al., 2013). The relative absent in Thaumarchaea and are clearly separated from both lengths of the branches linking the three domains together Thaumarchaeota and Crenarchaeota in phylogenetic analysis combine features of the rDNA and protein trees. It is indeed (Spang et al., 2015). Furthermore, Lokiarchaeota correspond to puzzling that Archaea and Eukaryotes are very close in trees a large clade of abundant and diversified uncultivated archaea, based on universal protein sequence comparison (Rochette et al., previously named deep-sea archaeal group (DSAG), that are 2014), but are more divergent in those based on rDNA (Pace widely distributed in both marine and fresh water environments et al., 1986). The reason for this discrepancy remains unclear (Jørgensen et al., 2013). and should be worth exploring further. The rather long branches I divide the phylum Euryarchaeota in sub-phylum I (I) between domains in the trees of Figures 3–5 also reflect the “three and sub-phylum II (II), according to the presence/absence of

Frontiers in Microbiology | www.frontiersin.org 8 July 2015 | Volume 6 | Article 717 Forterre The universal tree of life

FIGURE 4 | Schematic simplified universal tree updated from Woese et al. (1990). Abbreviations are the same as in Figure 3.

DNA gyrase (see below; Forterre et al., 2014b). Dotted lines Groussin and Gouy, 2011). Some proposed events are more indicate the endosymbiosis events that had a major impact speculative but supported by theoretical arguments, such as the on the history of life by triggering the emergence of both independent introduction of DNA (blue arrows) from viruses, modern eukaryotes (mitochondria) and Plantae (). In into the lineages leading to Bacteria and to Archaea/Eukarya particular, this reminds us that the first mitochondriate eukaryote (Forterre, 2002) and the thermoreduction (red arrows) at the (FME) emerged after the diversification of alpha , origin of the modern “akaryotic” phenotype (Forterre, 1995). indicating that “modern eukarya” are indeed much more recent Rooting of the domain Eukarya and internal branching in this than Archaea and Bacteria. domain have been adapted from the recent tree of Baldauf and In the tree of Figure 3, I use the terms “synkaryote” and colleagues which is based on a concatenation of mitochondrial “akaryote” (with and without a nucleus, respectively) instead of proteins rooted with their bacterial homologs (a rather close eukaryotes and prokaryotes (Forterre, 1992; Harish et al., 2013; outgroup compared to Archaea; He et al., 2014). This tree Penny et al., 2014). This is because the latter terms are the hallmark is rooted between Discoba (Jakobida plus Discritata) and all of the traditional (pre-Woesian) view of the evolution of life from other eukaryotic megagroups. This rooting has been criticized primitive bacteria (“pro” karyotypes) to lower and finally higher by Derelle et al. (2015) who found distant paralogs in the data eukaryotes (Forterre, 1992; Pace, 2006; Penny et al., 2014). set used by He et al. (2014). These authors located the root Some major events that shaped modern domains are indicated, of the eukaryotic tree between Amorpha and other eukaryotic such as the introduction of peptidoglycan (PG) in the lineage groups in mitochondrial proteins trees. This rooting corresponds leading to Bacteria. The last bacterial and archaeal common to the previous division between Unikonta and Bikonta originally ancestors (LBCA and LACA) are colored in pink and red, proposed by (Stechmann and Cavalier-Smith, 2003; Cavalier- respectively, to indicate their probable thermophilic and Smith, 2010). Derelle et al. (2015) have now suggested naming hyperthermophilic nature based on ancestral protein and these two assemblages Optimoda (Amorpha in Figures 3–5) and rRNA sequence reconstruction (Boussau et al., 2008a; Groussin Diphoda. and Gouy, 2011; Groussin et al., 2013). The grouping of The position of the root of the domain Eukarya has been hyperthermophiles at the base of the archaeal tree also suggests constantly changing with further phylogenetic analyses, raising that LACA was a (Brochier-Armanet et al., doubt about the possibility of settling this issue using molecular 2011; Petitjean et al., 2015), whereas LUCA was probably a phylogenetic methods based on protein sequences. I thus decided or a moderate (Boussau et al., 2008a; to root this domain between Jakobida and all other eukaryotes in

Frontiers in Microbiology | www.frontiersin.org 9 July 2015 | Volume 6 | Article 717 Forterre The universal tree of life

FIGURE 5 | Schematic universal tree updated from Woese et al. (1990) since “Archaea” are paraphyletic (LACA is also an ancestor of Eukarya) and modified according to the “eocyte” hypothesis. Abbreviations are the suggesting using the suffix karyota for the various “archaeal” phyla. Together same as in Figure 3. In this configuration, Archaea is no more a valid taxon with Eukarya, these phyla became various phyla of Arkarya. the universal tree of Figure 3, as recently suggested by Kannan (Ramulu et al., 2014). The tree is tentatively rooted between the et al. (2014), because Jakobida contain large mitochondrial PVC superphylum and all other Bacteria, according to the basal genomes that still encode the bacterial RNA polymerase genes rooting of Planctomycetes obtained by Brochier and Philippe (Burger et al., 2013; Kamikawa et al., 2014; Kannan et al., using slowly evolving positions in ribosomal RNA sequences 2014). The mitochondria of the LECA probably still had this (Brochier and Philippe, 2002). are indicated in the RNA polymerase, which was subsequently replaced with a viral second branching because these Bacteria are grouped with PVC RNA polymerase in all Eukaryotes, except Jakobida (Kannan bacteria in the phylogenetic analysis based on ribosomal proteins et al., 2014). These viral RNA polymerases have been recruited (Yutin et al., 2012). from a provirus integrated into the genome of the alpha- The basal position of PVC in the bacterial tree of Figure 3, proteobacterium, which gave rise to mitochondria (Filée and which remains to be confirmed, is appealing because the Forterre, 2005). It seems unlikely that this non-orthologous ancestor of PVC bacteria contained several genes encoding replacement occurred twice independently in the history of proteins structurally analogous to various eukaryotic coat proteins mitochondria. Accordingly, the rooting between Jakobida and involved in vesicle and nuclear pore formation (Santarella- other eukaryotes is reasonable (more parsimonious) because Mellwig et al., 2010). These proteins are probably involved in it requires a single NOR of RNA polymerase in mitochondrial the invagination of the cytoplasmic membrane that led to the evolution, whereas the rooting between Amorpha and other formation of the intracellular cytoplasmic membrane (ICM) in eukaryotes would require several independent non-orthologous most PVC bacteria. This mimics the role of coat proteins in replacements. eukaryotes that are involved in the formation of the endoplasmic Rooting of the domain Bacteria and internal branching in reticulum and nuclear membranes. The basal position of PVC this domain have been adapted from the ribosomal protein bacteria suggests a parsimonious scenario in which these proteins trees of Koonin and colleagues (Yutin et al., 2012). These were present in LUCA, and later on lost in Archaea and most authors have suggested several superphyla beside the previously Bacteria (Forterre and Gribaldo, 2010). However, this scenario recognized PVC superphylum, which includes Planctomycetes, is still controversial since it is presently unclear whether the and Chlamydiae (Kamke et al., 2014). These structurally analogous proteins of PVC Bacteria and Eukarya are putative superphyla are indicated by circles in the tree of also homologous (McInerney et al., 2011; Devos, 2012). These Figure 3. Branchings within Proteobacteria are drawn according proteins are formed by the fusion of two domains (one rich in to the ribosomal protein tree of Brochier-Armanet and colleagues alpha helices, the other in beta strands) that are each present in

Frontiers in Microbiology | www.frontiersin.org 10 July 2015 | Volume 6 | Article 717 Forterre The universal tree of life the three domains. Accordingly, they can also have originated , , and class II independently by the fusion of these domains in the branches (Forterre et al., 2014b). leading to Eukarya and PVC bacteria. Phylogenetic analyses have shown that DNA gyrase has been Archaea have been tentatively rooted in the branch leading transferred from Bacteria to Archaea (Raymann et al., 2014). to Lokiarchaeota in the tree of Figure 3, because this candidate This transfer was an important and unique event that had a phylum contains most eukaryotic features present in Archaea and critical impact on chromosome structure and patterns of gene branches closer to Eukarya than to other archaea in phylogenetic expression. Indeed, plasmids from all archaea are relaxed or analyses of universal proteins, even when bacterial proteins are slightly positively supercoiled, whereas plasmids from member removed from the analysis (Spang et al., 2015, Figure S13D). of sub-phylum II Euryarchaeota containing gyrase are negatively Previously, the archaeal ribosomal tree was rooted in the branch supercoiled (Forterre et al., 2014b). Once transferred, DNA leading to Thaumarchaeota when eukaryotic proteins were used gyrase became most likely essential, as demonstrated in the as outgroup (Brochier-Armanet et al., 2008a). This rooting was case of Halobacteriales and , because such also observed in a phylogenetic analysis of the archaeal replicative drastic modification in DNA topology modifies all protein DNA helicase MCM, which is a good phylogenetic marker for the interactions involved in replication and transcription (for review archaeal domain (Krupovic et al., 2010), and in a phylogeny and discussion, see Forterre and Gadelle, 2009). Indeed, to date, of five informational proteins present in deeply branching the loss of DNA gyrase has not been reported in any organism. Thaumarchaeota from Kamchatkan thermal springs (Eme et al., Importantly, the phylogeny of archaeal DNA gyrase is fully 2013). I thus place Thaumarchaeota as the second branch in the congruent with the phylogeny of sub-phylum II Euryarchaeota, archaeal subtree. suggesting that, once transferred to the ancestor of this group, Moreira and colleagues, using bacterial proteins (including DNA gyrase has co-evolved with sub-phylum II Euryarchaeota ribosomal proteins) as outgroup, have recently proposed to (Raymann et al., 2014). Accordingly, considering the importance root the archaeal tree in the branch leading to Euryarchaeota of DNA gyrase in cell physiology (DNA topology controlling (Petitjean et al., 2014). As a consequence, they propose to create all gene expression patterns) I suggest calling sub-phylum II a new phylum, , grouping Crenarchaeota and Euryarchaeota, the neo-euryarchaeota. This name emphasizes the Thaumarchaeota, together with the putative phyla Aigarchaeota fact that the ancestor of this sub-phylum lived after the formation and Korarchaeota. Proteoarchaea thus corresponds to the of the major , since archaeal DNA gyrases branch previously so-called “TACK superphylum” (Thaumarchaeota, within bacterial ones (Raymann et al., 2014). “Aigarchaeota,” Crenarchaeota, Korarchaeota). However, as Since all rooting indicated in the tree of Figure 3, as well as previously mentioned, using the same strategy, Gribaldo and most internal nodes within domains, are controversial, I present a colleagues obtained a root located within Euryarchaeota, second tree (Figure 4), in which the information is limited to only more precisely in between subphyla I and II (Raymann et al., that accepted by consensus. Accordingly, each one of the three 2015). In contrast, their archaeal tree is rooted between domains is shown in a radial form without roots and only a few “TACK/Proteoarchaeota” and Euryarchaeota when they used nodes within domains that seem supported by strong phylogenetic eukaryotic proteins as an outgroup. It will be interesting to see if analyses are indicated (Brochier-Armanet et al., 2008a; He et al., they recover the root in the branch leading to Lokiarchaeota in 2014; Kamke et al., 2014; Ramulu et al., 2014; Spang et al., 2015). future analyses using eukaryotic sequences as an outgroup. Finally, I also present a third tree in which Eukarya emerged Moreira and colleagues argue that eukaryotic proteins cannot from within Archaea (Figure 5). This tree includes the new root be used to root the archaeal tree if Eukarya emerged from within proposed by Gribaldo and co-workers for Archaea (Raymann Archaea. However, in the framework of the classical Woese tree, et al., 2015) and shows Lokiarchaeota as sister group of Eukarya it makes more sense to root the archaeal tree using eukaryotic (Spang et al., 2015). Notably, if future analyses demonstrate that proteins as outgroup, because these proteins are much more such a tree is the more likely tree, Archaea will not be a valid closely related than bacterial proteins to their archaeal orthologs. taxon anymore, except if one accepts to consider eukaryotes as a Notably, the rooting between Lokiarchaeota/Thaumarchaeota particular archaeal phylum (much like Homo is a particular lineage and other Archaea, obtained in that case is more parsimonious of Apes)! In that case, the name Arkarya could be substituted than the rooting between Euryarchaeota and other Archaea to Archaea. Eukarya will become a particular phylum of in explaining the presence in Lokiarchaeota/Thaumarchaeota Arkarya, beside Euryarkaryota, Crenarkaryota, Thaumarkaryota, (including “Aigarchaeota,” see below) of many eukaryotic features and Lokiarkaryota (Figure 5). lacking in other Archaea (Brochier-Armanet et al., 2008b; Spang As can be seen, Figures 4 and 5 can be easily deduced from et al., 2010, 2015; Koonin and Yutin, 2014). Figure 3. This indicates that it will be easy to update and Euryarchaeota are divided in two sub-phyla I and II, according modify these trees following the accumulation of new data from to the presence/absence of DNA gyrase, a bacterial DNA comparative genomics and phylogenetic analyses. topoisomerase that was transferred once in the phylum Euryarchaeota (Raymann et al., 2014). The sub-phylum I The Archaeal Tree Euryarchaeota corresponds to those lacking DNA gyrase and encompasses , Nanoarchaeum, and class I Figure 6 illustrates a rather detailed, but schematic, archaeal methanogens, whereas sub-phylum I corresponds to those tree as a tribute to this issue devoted to Archaea. This containing DNA gyrase and encompasses Archaeoglobales, tree has been adapted from the ribosomal protein tree of

Frontiers in Microbiology | www.frontiersin.org 11 July 2015 | Volume 6 | Article 717 Forterre The universal tree of life

Brochier-Armanet et al. (2011) and from a recent phylogeny Neo-euryarchaeota (sub-phylum II) is a monophyletic group in all based on the concatenation of 273 proteins conserved in at least phylogenetic analyses. In contrast, sub-phylum I is paraphyletic in 119 archaeal species out of 129 (Petitjean et al., 2015; thereafter the ribosomal and archaeal protein trees (Brochier-Armanet et al., called the archaeal protein tree). I also include the recently 2011; Petitjean et al., 2015). However, they form a monophyletic described candidate phylum “Lokiarchaeota” (corresponding to assemblage in a tree based on replication proteins (Raymann the DSAG clade) considering its importance for the discussions et al., 2014) and in a recent phylogenomic analysis performed by about the origin of Eukarya. The various roots that have been Makarova et al. (2015) that involved both comparison of multiple proposed for the domain Archaea are indicated by orange circles. phylogenetic trees and a search for putative synapomorphies. I Aigarchaeota are included within Thaumarchaeota, because thus decided to favor the monophyly of sub-phylum I in the tree the latter were originally defined as a major archaeal phylum of Figure 6. encompassing all archaeal lineages that are sister groups of The close relationship between plasmids of Thermococcales Crenarchaeota in rDNA analyses (Brochier-Armanet et al., 2011). and Methanococcales also suggests that these two orders could be In the original paper in which we propose this new phylum, closely related (Soler et al., 2010). It is possible that the emergence we noticed that: “The diversity of mesophilic crenarchaeota— of pseudomurein in Methanobacteriales and that we proposed to rename Thaumarchaeota—based on SSU allowed these archaea to get rid of mobile elements that used to rRNA sequence is comparable to that of hyperthermophilic infect the ancestors of Thermococcales and Methanococcales. Crenarchaeota and Euryarchaeota, which suggests that they Notably, Methanopyrales and Methanobacteriales are represent a major lineage that has equal status to Euryarchaeota monophyletic in the archaeal protein tree, suggesting that and Crenarchaeota”(Brochier-Armanet et al., 2008a). We also the presence of pseudomurein is a synapomorphy for these predicted that the mesophily of Thaumarchaeota “could be sister groups (Petitjean et al., 2015). Petitjean and coworkers challenged by the future identification of non-mesophilic organisms recently proposed the name Methanomada (superclass) for that belong to this phylum.” This suggests considering candidatus class I methanogens, that are monophyletic in their protein tree Caldarchaeum subterraneum as a thermophilic member of the (Petitjean et al., 2015) and in the DNA replication tree (Raymann Thaumarchaeota and not as the prototype of a new phylum et al., 2014). (Aigarchaeota). In agreement with this proposal, candidatus C. In contrast to class I, the four orders of class II methanogens that subterraneum emerges as sister group of other thaumarchaea in are included within neo-euryarchaeota are always paraphyletic most phylogenies (Brochier-Armanet et al., 2011; Nunoura et al., in phylogenetic analyses. Methanogens of the recently described 2011; Eme et al., 2013; Guy et al., 2014; Petitjean et al., 2014, order Methanomassiliicoccus form a monophyletic assemblage 2015; Raymann et al., 2015; Spang et al., 2015). Furthermore, with Thermoplasmatales, the moderate thermoacidophilic strain candidatus C. subterraneum exhibits all molecular features first Aciduliprofundum boonei and several lineages of uncultivated used to define the phylum Thaumarchaeota (Brochier-Armanet archaea in a ribosomal protein tree (Borrel et al., 2013). The et al., 2008a; Spang et al., 2010), such as a eukaryotic-like Topo IB, name Diafoarchaea has been proposed for this major subgroup which is absent from all other Archaea (Brochier-Armanet et al., (superclass) of neo-Euryarchaeota (Petitjean et al., 2015). 2008b). Topo IB is absent in Lokiarchaeota, but one must bear in Altiarchaeales correspond to a recently described mesophilic mind that the reconstituted genome is only 92% complete (Spang archaeum, Candidatus Altiarchaeum hamiconexum, et al., 2015). The phylum Thaumarchaeota should also include characterized by fascinating appendages (Hami) that groups uncultivated archaea from the clade MCG (the Miscellaneous with Methanococcales in a ribosomal protein tree, but between Crenarchaeal Group), since these organisms systematically form Euryarchaeota of sub-phyla I and II in a tree based on several monophyletic groups with Thaumarchaea and “Aigarchaeota” in other universal proteins (Probst et al., 2014). Since Candidatus A. phylogenetic analyses (Spang et al., 2015). hamiconexum contain the two DNA gyrase genes, it is located at The recently proposed phylum “Geoarchaeota” is included the base of the neo-euryarchaeota in the tree of Figure 6. as a sister group of (without phylum status) Rinke et al. (2013) have recently proposed promoting the as suggested by the analysis of Ettema and co-workers (Guy nanosized archaea Nanoarchaeum, Parvarchaeum (ARMAN 4 and et al., 2014; Spang et al., 2015). Thermofilum always branch very 5) and to phylum level and grouping them with early as a sister group of Thermoproteales. This suggests that two other putative new phyla of Archaea (“Aenigmarchaeota” and Thermofilum, as well as Geoarchaeota, could have an order status. “Diapherotrites”) into a new superphylum, called DPANN (Rinke Korarchaeota branch in-between Euryarchaeota and other et al., 2013). In their phylogenetic analysis based on 38 “universal archaea (Crenarchaeota, Thaumarchaeota) in the ribosomal proteins,” the root of the archaeal tree is located between this proteins tree. Unfortunately, this phylum is presently represented putative DPANN superphylum and all other Archaea (see Figure by a single species whose genome has been sequenced, candidatus S11 in Rinke et al., 2013). However, their universal protein data Korarchaeum cryptofilum. The genome of candidatus K. set is confusing because it contains eukaryotic proteins of bacterial cryptofilum harbors a mixture of features characteristic of the origin (Williams and Embley, 2014). In the recent tree of Williams three other archaeal phyla. This can justify maintaining a phylum and Embley supporting the archaeal origin of eukaryotes, the status for this group for the moment. More genome sequences of archaeal tree is rooted between DPANN and Euryarchaeota (see Korarchaeota are nevertheless required to confirm this point. Figure 3 in Williams and Embley, 2014). The basal position of As previously discussed, Euryarchaeota are divided into the nanosized archaea in these trees confirms the difficulty of two groups depending of the presence of DNA gyrase. using universal proteins to resolve ancient phylogenies, because

Frontiers in Microbiology | www.frontiersin.org 12 July 2015 | Volume 6 | Article 717 Forterre The universal tree of life

FIGURE 6 | Schematic tree of the archaeal domain. The alternative pseudomurein; large green arrow, introduction of DNA gyrase from Bacteria proposed roots are from : I (Brochier-Armanet et al., 2008a), II (Petitjean (Raymann et al., 2014), pale blue ovals emphasize poorly resolved, et al., 2014), III (Raymann et al., 2015). Thin blue arrow: emergence of controversial nodes. this position is clearly misleading (see below and Petitjean et al., et al., 2011) or in the archaeal tree of Moreira and colleagues 2014). In the case of nanosized archaea, phylogenetic analyses (Petitjean et al., 2014). are even more challenging because their proteins tend to evolve In the tree of Figure 6, Thermococcales, Nanoarchaeum, rapidly, producing very long branches in phylogenetic trees. Parvarchaeum, and Nanohaloarchaea tentatively form a This suggests that these nanosized archaea evolve mainly by monophyletic group. This is supported by several lines of streamlining (Brochier-Armanet et al., 2011; Petitjean et al., 2014). converging (but weak and controversial) evidences. The grouping They have not been included in the protein tree of Petitjean and of Thermococcales and Nanoarchaeum with Parvarchaeum coworkers (Petitjean et al., 2015). (ARMAN 4 and 5) is supported by the phylogeny based on Previous phylogenetic and comparative genomic analyses ribosomal proteins (Brochier-Armanet et al., 2011), whereas focusing on Nanoarchaeum equitans have suggested that this the grouping of Nanoarchaeum and Parvarchaeum with fascinating archaeal symbiont belongs to the Euryarchaeota and Nanohaloarchaea is supported by the phylogeny of DNA could be distant relatives of Thermococcales (Brochier et al., replication proteins (Raymann et al., 2014). Interestingly, the 2005b). This sister relationship was later on supported by the grouping of Parvarchaeum, Nanoarchaeum, and Nanohaloarchaea discovery of the shared presence of a tRNA modification protein is supported by the shared presence of an atypical small primase that was recently transferred from Bacteria to both N. equitans and corresponding to the fusion of the two subunits of the bona Thermococcales (Urbonavicius et al., 2008). The sisterhood of N. fide archaeal/eukaryal primases, PriS, and PriL (Raymann et al., equitans and Thermococcales has been observed again in more 2014). It has been suggested that this fusion corresponds to a recent analyses based on ribosomal proteins (Brochier-Armanet convergent evolution associated to streamlining (Petitjean et al.,

Frontiers in Microbiology | www.frontiersin.org 13 July 2015 | Volume 6 | Article 717 Forterre The universal tree of life

2014). However, this is unlikely because these unusual monomeric “Proteoarchaea” if the root of the archaeal tree is located primase primases are also highly divergent in terms of amino-acid within Euryarchaeota (Raymann et al., 2015) or between sequence from the classical archaeal/eukaryal primases and very Thaumarchaeota/Lokiarchaeota and other Archaea. As we similar in all nanosized archaea. Alternatively, it has been previously discussed, defining the root of the archaeal tree suggested that this primase has been distributed between strongly depends on choosing between different scenarios for nanosized archaea by horizontal gene transfers (Raymann et al., the universal tree. Since the rooting of the archaeal tree is still 2014). This seems also unlikely because nanosized archaea in debate, I would not recommend at the moment to use names live in very different types of environments (high temperature such as “Proteoarchaea” or “TACK superphylum” in archaeal for known Nanoarchaeum, high salt for Nanohaloarchaea, phylogeny. high acidity for Parvarchaeum). It seems more likely that this This section on archaeal phylogeny has illustrated the fact that, primase was acquired by a common ancestor to Nanoarchaeum, in addition to the root of the tree itself, several nodes in the Parvarchaeum, and Nanohaloarchaea from a mobile genetic archaeal tree are still controversial and require more data and element (Raymann et al., 2014). It has indeed been shown that more work to be carried out. These nodes have been marked by some archaeal plasmids encode unusual primases from the circles in blue in the tree of Figure 6. Future progress will probably PriS/PriL superfamily that are only distantly related to bona fide come from the sequencing of more genomes, especially in poorly archaeal and eucaryal primases (Lipps et al., 2004; Krupovic et al., represented groups and in the many groups that are presently only 2013; Gill et al., 2014). known from environmental rDNA sequences. The grouping of Nanohaloarchaea with other nanosized archaea, as in Figure 6, is especially controversial because they Conclusion emerge as sister group of Halobacteriales in a ribosomal protein tree (Narasingarao et al., 2012). However, if Nanohaloarchaea are I hope that the universal and archaeal trees proposed here will sister group of Halobacteriales, they would be the only members be useful as new metaphors illustrating the history of life on our of neo-euryarchaeota lacking DNA gyrase. Nanohaloarchaea planet. From the above review, it should be clear that there is no might have lost this enzyme during the streamlining process protein or groups of proteins that can give the real species tree, i.e., related to their small size. However, the loss of DNA gyrase allow us to recapitulate safely the exact path of life evolution. In has not been reported until now in any other free-living particular, one should be cautious with composite trees based on organisms. This is why I finally choose to group Nanohaloarchaea the concatenation of protein sequences or addition of individual with other nanosized archaea in the tree of Figure 6. It is trees. The results obtained should always be compared to the possible that the atypical amino acid composition of the result of individual trees. Martin and colleagues recently reported nanohaloarchaeal proteome (Narasingarao et al., 2012), linked a lack of correspondence between individual protein trees and the to salt adaptation, introduces a bias favoring the artificial concatenation tree in several datasets of archaeal and bacterial grouping of Nanohaloarchaea and . This would proteins (Thiergart et al., 2014). This can reflect either LGT also explain why Nanohaloarchaea attracts Nanoarchaeum and/or the absence of real phylogenetic signal. Importantly, and Parvarchaeum away from Thermococcales and closer to this lack of correspondence between individual protein trees Haloarchaea in the DNA replication tree (Raymann et al., and concatenation trees is also observed in the analyses that 2014). place Eukarya within Archaea (Cox et al., 2008; Williams and The nanosized archaeon Candidatus Micrarchaeum Embley, 2014), raising doubts on results supporting the tree acidophilum (ARMAN 2) branches together with Parvarchaeum of Figure 5. Unfortunately, individual trees are not always (ARMAN 4, 5) in a rDNA tree (Baker et al., 2006) and in the available in the supplementary data of published studies (Katz tree of Moreira and coworkers (Petitjean et al., 2014). However, and Grant, 2014; Raymann et al., 2015). In our previous work it branches away from Parvarchaeum in the ribosomal tree, on archaeal phylogeny, careful analysis of individual trees in as an early branching neo-euryarchaeon. The latter position parallel to their concatenation was critical to obtain a rather is supported by the presence of DNA gyrase in Candidatus confident tree for Archaea based on ribosomal proteins (Matte- Micrarchaeum acidophilum and the absence of the single subunit Tailliez et al., 2002) and to find the (probably) correct position primase characteristic of other nanosized archaea (Raymann of , as sister group of Thermococcales (Brochier et al., 2014). I thus included Micrarchaeum among the superclass et al., 2005b). Notably, to obtain this result in Brochier et al. Diafoarchaea in the tree of Figure 6. Finally, the phylogenetic (2005a), it was necessary to remove the proteins from the large position and status of “Aenigmarchaeota” and “Diapherotrites” ribosome subunit because several of them exhibit a surprising cannot presently be determined because these groups have only affinity with their crenarchaeal homologs in individual trees, been defined from single cell genomic analyses and their genomes possibly indicating LGT between N. equitans and its host, the are probably incomplete (Petitjean et al., 2014). crenarchaeon Ignicoccus. It is also very important to compare trees Further analyses and (hopefully) many more isolates are obtained with different datasets, such as translation, transcription clearly required to determine the correct position of the and DNA replication trees, to pint-point discrepancies and various groups of nanosized archaea in the archaeal tree. identify their causes (Brochier et al., 2004, 2005a; Raymann et al., From all these considerations, it is clear that the “DPNN 2014). superphylum” is an artificial construction. The same is true In summary, it is (and it will be) only possible to draw for the “TACK superphylum” and the candidate phylum schematic (theoretical) trees by combining data from multiple

Frontiers in Microbiology | www.frontiersin.org 14 July 2015 | Volume 6 | Article 717 Forterre The universal tree of life phylogenetic protein trees with information deduced from accumulate in the future and use them in discussions of various probable synapomorphies. It is the exercise that I have tried to controversial scenarios regarding the evolution of ancient life. do here in drawing the trees of Figures 3–6. These organismal trees, which are supposed to recapitulate the history of ribosome Acknowledgments encoding organisms, will probably evolve themselves, with the availability of new genomes (especially from poorly sampled I thank Mechthild Pohlschroeder and Sonja-Verena Albers for groups), better phylogenetic analyses, and the identification inviting me to draw an updated version of the universal tree of new synapomorphies defining specific domains, sub-phyla of life and David Prangishvili for suggesting the name Arkarya. groups and superphyla. For instance, my preliminary analyses I am grateful to Sukhvinder Gill for English corrections and a of universal proteins sequence alignments indicate that the referee for extensive critical analysis. I am supported by an ERC Lokiarchaeon is probably neither an early braching archaeon nor grant from the European Union’s Seventh Framework Program a missing link between Archaea and Eukarya (see also Nasir et al., (FP/2007-2013)/Project EVOMOBIL—ERC Grant Agreement 2015). It should be relatively easy to update these trees as new data no.340440.

References Brochier, C., and Philippe, H. (2002). Phylogeny: a non-hyperthermophilic ancestor for bacteria. Nature 417, 244. doi: 10.1038/417244a Albers, S. V., Forterre, P., Prangishvili, D., and Schleper, C. (2013). The legacy of Burger, G., Gray, M. W., Forget, L., and Lang, B. F. (2013). Strikingly bacteria- Carl Woese and Wolfram Zillig: from phylogeny to landmark discoveries. Nat. like and gene-rich mitochondrial genomes throughout Jakobid protists. Genome Rev. Microbiol. 11, 713–719. doi: 10.1038/nrmicro3124 Biol. Evol. 5, 418–438. doi: 10.1093/gbe/evt008 Baker, B. J., Tyson, G. W., Webb, R. I., Flanagan, J., Hugenholtz, P., Cavalier-Smith, T. (2010). Origin of the , mitosis and sex: roles of Allen, E. E., et al. (2006). Lineages of acidophilic archaea revealed by intracellular coevolution. Biol. Direct. 5, 7. doi: 10.1186/1745-6150-5-7 community genomic analysis. Science 314, 1933–1935. doi: 10.1126/science. Ciccarelli, F. D., Doerks, T., von Mering, C., Creevey, C. J., Snel, B., and Bork, P. 1132690 (2006). Toward automatic reconstruction of a highly resolved tree of life. Science Borrel, G., O’Toole, P. W., Harris, H. M., Peyret, P., Brugère, J. F., and Gribaldo, 311, 1283–1287. doi: 10.1126/science.1123061 S. (2013). Phylogenomic data support a seventh order of methylotrophic Cox, C. J., Foster, P. G., Hirt, R. P., Harris, S. R., and Embley, T. M. (2008). methanogens and provide insights into the evolution of methanogenesis. The archaebacterial origin of eukaryotes. Proc. Natl. Acad. Sci. U.S.A. 105, Genome Biol. Evol. 5, 1769–1780. doi: 10.1093/gbe/evt128 20356–20361. doi: 10.1073/pnas.0810647105 Boussau, B., Blanquart, S., Necsulea, A., Lartillot, N., and Gouy, M. (2008a). Parallel Csurös, M., and Miklos, I. (2009). Streamlining and large ancestral genomes in adaptations to high temperatures in the Archaean eon. Nature 456, 942–945. doi: Archaea inferred with a phylogenetic birth-and-death model. Mol. Biol. Evol. 10.1038/nature07393 26, 2087–2095. doi: 10.1093/molbev/msp123 Boussau, B., Guéguen, L., and Gouy, M. (2008b). Accounting for horizontal gene De Duve C. (2007). The origin of eukaryotes: a reappraisal. Nat. Rev. Genet. 8, transfers explains conflicting hypotheses regarding the position of Aquificales in 395–403. doi: 10.1038/nrg2071 the phylogeny of Bacteria. BMC Evol. Biol. 8:272. doi: 10.1186/1471-2148-8-272 Derelle, R., Torruella, G., Klimeš, V.,Brinkmann, H., Kim, E., Vlček, Č., et al. (2015). Braakman, R., and Smith, E. (2014). Metabolic evolution of a deep-branching Bacterial proteins pinpoint a single eukaryotic root. Proc. Natl. Acad. Sci. U.S.A. hyperthermophilic chemoautotrophic bacterium. PLoS ONE 9:e87950. doi: 112, E693–E699. doi: 10.1073/pnas.1420657112 10.1371/journal.pone.0087950 Desmond, E., Brochier-Armanet, C., Forterre, P., and Gribaldo, S. (2011). On Brinkmann, H., and Philippe, H. (1999). Archaea sister group of Bacteria? the last common ancestor and early evolution of eukaryotes: reconstructing Indications from tree reconstruction artefacts in ancient phylogenies. Mol. Biol. the history of mitochondrial ribosomes. Res. Microbiol. 162, 53–70. doi: Evol. 16, 817–825. doi: 10.1093/oxfordjournals.molbev.a026166 10.1016/j.resmic.2010.10.004 Brochier-Armanet, C., Boussau, B., Gribaldo, S., and Forterre, P. (2008a). Deutsch, C., El Yacoubi, B., de Crécy-Lagard, V., and Iwata-Reuyl, D. (2012). Mesophilic crenarchaeota: proposal for a third archaeal phylum, the Biosynthesis of threonylcarbamoyl adenosine (t6A), a universal tRNA Thaumarchaeota. Nat. Rev. Microbiol. 6, 245–252. doi: 10.1038/nrmicro1852 nucleoside. J. Biol. Chem. 287, 13666–13673. doi: 10.1074/jbc.M112. Brochier-Armanet, C., Gribaldo. S., and Forterre, P.(2008b). A DNA topoisomerase 344028 IB in Thaumarchaeota testifies for the presence of this enzyme in the last Devos, D. P. (2012). Regarding the presence of membrane coat proteins in bacteria: common ancestor of Archaea and Eucarya. Biol. Direct. 3, 54. doi: 10.1186/1745- confusion? What confusion? Bioessays 34, 38–39. doi: 10.1002/bies.201100147 6150-3-54 Doolittle, W. F. (1998). You are what you eat: a gene transfer ratchet could account Brochier-Armanet, C., and Forterre, P. (2007). Widespread distribution of for bacterial genes in eukaryotic nuclear genomes. Trends Genet. 14, 307–311. archaeal reverse gyrase in thermophilic bacteria suggests a complex history doi: 10.1016/S0168-9525(98)01494-2 of vertical inheritance and lateral gene transfers. Archaea 2, 83–93. doi: Doolittle, W. F. (2000). Uprooting the tree of life. Sci. Am. 282, 90–95. doi: 10.1155/2006/582916 10.1038/scientificamerican0200-90 Brochier-Armanet, C., Forterre, P., and Gribaldo, S. (2011). Phylogeny and Edgell, D. R., and Doolittle, W. F. (1997). Archaea and the origin(s) of DNA evolution of the Archaea: one hundred genomes later. Curr. Opin. Microbiol. 14, replication proteins. Cell 89, 995–998. doi: 10.1016/S0092-8674(00)80285-8 274–281. doi: 10.1016/j.mib.2011.04.015 El Albani, A., Bengtson, S., Canfield, D. E., Bekker, A., Macchiarelli, R., Brochier, C., Forterre, P., and Gribaldo, S. (2004). Archaeal phylogeny based Mazurier, A., et al. (2010). Large colonial organisms with coordinated on proteins of the transcription and translation machineries: tackling the growth in oxygenated environments 2.1 Gyr ago. Nature 466, 100–104. doi: Methanopyrus kandleri paradox. Genome Biol. 5:R17. doi: 10.1186/gb-2004-5- 10.1038/nature09166 3-r17 Embley, T. M., and Hirt, R. P. (1998). Early branching eukaryotes? Curr. Opin. Brochier, C., Forterre, P., and Gribaldo, S. (2005a). An emerging phylogenetic Genet. Dev. 8, 624–662. doi: 10.1016/S0959-437X(98)80029-4 core of Archaea: phylogenies of transcription and translation machineries Eme, L., Reigstad, L. J., Spang, A., Lanzén, A., Weinmaier, T., Rattei, T., et al. converge following addition of new genome sequences. BMC Evol. Biol. 5:36. (2013). of Kamchatkan hot spring filaments reveal two new major doi: 10.1186/1471-2148-5-36 (hyper)thermophilic lineages related to Thaumarchaeota. Res. Microbiol. 164, Brochier, C., Gribaldo, S., Zivanovic, Y., Confalonieri, F., and Forterre, P. (2005b). 425–438. doi: 10.1016/j.resmic.2013.02.006 Nanoarchaea: representatives of a novel archaeal phylum or a fast-evolving Eveleigh, R. J., Meehan, C. J., Archibald, J. M., and Beiko, R. G. (2013). Being euryarchaeal lineage related to Thermococcales? Genome Biol. 6:R42. doi: Aquifex aeolicus: untangling a hyperthermophile’s checkered past. Genome Biol. 10.1186/gb-2005-6-5-r42 Evol. 5, 2478–2497. doi: 10.1093/gbe/evt195

Frontiers in Microbiology | www.frontiersin.org 15 July 2015 | Volume 6 | Article 717 Forterre The universal tree of life

Filée, J., and Forterre, P. (2005). Viral proteins functioning in : a cryptic Groussin, M., Boussau, B., Charles, S., Blanquart, S., and Gouy, M. (2013). The origin? Trends Microbiol. 13, 510–513. doi: 10.1016/j.tim.2005.08.012 molecular signal for the adaptation to cold temperature during early life on Forterre, P. (1992). Neutral terms. Nature 335, 305. doi: 10.1038/355305c0 Earth. Biol. Lett. 9, 20130608. doi: 10.1098/rsbl.2013.0608 Forterre, P. (1995). Thermoreduction, a hypothesis for the origin of prokaryotes. C. Groussin, M., and Gouy, M. (2011). Adaptation to environmental temperature is a R. Acad. Sci. III. 318, 415–422. major determinant of molecular evolutionary rates in archaea. Mol. Biol. Evol. Forterre, P. (1996). A hot topic: the origin of hyperthermophiles. Cell 85, 789–792. 28, 2661–2674. doi: 10.1093/molbev/msr098 doi: 10.1016/S0092-8674(00)81262-3 Guy, L., and Ettema, T. J. (2011). The archaeal ‘TACK’ superphylum and the origin Forterre, P. (1999). Displacement of cellular proteins by functional analogues of eukaryotes. Trends Microbiol. 19, 580–587. doi: 10.1016/j.tim.2011.09.002 from plasmids or viruses could explain puzzling phylogenies of many DNA Guy, L., Spang, A., Saw, J. H., and Ettema, T. J. (2014). ‘Geoarchaeote NAG1’ is a informational proteins. Mol. Microbiol. 33, 457–465. doi: 10.1046/j.1365- deeply rooting lineage of the archaeal order Thermoproteales rather than a new 2958.1999.01497.x phylum. ISME J. 8, 1353–1357. doi: 10.1038/ismej.2014.6 Forterre, P. (2002). The origin of DNA genomes and DNA replication proteins. Harish, A., Tunlid, A., and Kurland, C. G. (2013). Rooted phylogeny of the three Curr. Opin. Microbiol. 5, 525–532. doi: 10.1016/S1369-5274(02)00360-0 superkingdoms. Biochimie. 95, 1593–1604. doi: 10.1016/j.biochi.2013.04.016 Forterre, P. (2006). Three RNA cells for ribosomal lineages and three DNA viruses Hashimoto, T., Nakamura, Y., Kamaishi, T., Nakamura, F., Adachi, J., Okamoto, K., to replicate their genomes: a hypothesis for the origin of cellular domain. Proc. et al. (1995). Phylogenetic place of -lacking protozoan, Giardia Natl. Acad. Sci. U.S.A. 103, 3669–3674. doi: 10.1073/pnas.0510333103 lamblia, inferred from amino acid sequences of elongation factor 2. Mol. Biol. Forterre, P. (2011). A new fusion hypothesis for the origin of Eukarya: better Evol. 12, 782–793. than previous ones, but probably also wrong. Res. Microbiol. 162, 77–91. doi: Hecker, A., Leulliot, N., Gadelle, D., Graille, M., Justome, A., Dorlet, P.,et al. (2007). 10.1016/j.resmic.2010.10.005 An archaeal orthologue of the universal protein Kae1 is an iron metalloprotein Forterre, P. (2012). Darwin’s goldmine is still open: variation and selection run the which exhibits atypical DNA-binding properties and apurinic-endonuclease world. Front. Cell Infect. Microbiol. 2:106. doi: 10.3389/fcimb.2012.00106 activity in vitro. Nucleic Acids Res. 35, 6042–6051. doi: 10.1093/nar/ Forterre, P. (2013a). The common ancestor of archaea and eukarya was not an gkm554 archaeon. Archaea 2013, 18. doi: 10.1155/2013/372396 He D., Fiz-Palacios, O., Fu, C. J., Fehling, J., Tsai, C. C., and Baldauf, S. L. (2014). Forterre, P.(2013b). Why are there so many diverse replication machineries? J. Mol. An alternative root for the eukaryote tree of life. Curr. Biol. 24, 465–470. doi: Biol. 425, 4714–4726. doi: 10.1016/j.jmb.2013.09.032 10.1016/j.cub.2014.01.036 Forterre, P., Benachenhou-Lahfa, N., Confalonieri, F., Duguet, M., Elie, C., and Iwabe, N., Kuma, K., Hasegawa, M., Osawa, S., and Miyata, T. (1989). Evolutionary Labedan, B. (1992). The nature of the last universal ancestor and the root of relationship of archaebacteria, eubacteria, and eukaryotes inferred from the tree of life, still open questions. Biosystems 28, 15–32. doi: 10.1016/0303- phylogenetic trees of duplicated genes. Proc. Natl. Acad. Sci. U.S.A. 86, 2647(92)90004-I 9355–8359. doi: 10.1073/pnas.86.23.9355 Forterre, P., and Gadelle, D. (2009). Phylogenomics of DNA topoisomerases: their Jørgensen, S. L., Thorseth, I. H., Pedersen, R. B., Baumberger, T., and Schleper C. origin and putative roles in the emergence of modern organisms. Nucleic Acids (2013). Quantitative and phylogenetic study of the Deep Sea Archaeal Group in Res. 37, 679–692. doi: 10.1093/nar/gkp032 sediments of the Arctic mid-ocean spreading ridge. Front. Microbiol. 4:299. doi: Forterre, P., and Gribaldo, S. (2010). Bacteria with a eukaryotic touch: a glimpse 10.3389/fmicb.2013.00299 of ancient evolution? Proc. Natl. Acad. Sci. U.S. A. 107, 12739–12740. doi: Kamaishi, T., Hashimoto, T., Nakamura, Y., Nakamura, F., Murata, S., Okada, N., et 10.1073/pnas.1007720107 al. (1996). Protein phylogeny of translation elongation factor EF-1 alpha suggests Forterre, P., Krupovic, M., and Prangishvili, D. (2014a). Cellular domains and viral microsporidians are extremely ancient eukaryotes. J. Mol. Evol. 42, 257–263. doi: lineages. Trends Microbiol. 22, 554–558. doi: 10.1016/j.tim.2014.07.004 10.1007/BF02198852 Forterre, P., Krupovic, M., Raymann, K., and Soler, N. (2014b). Plasmids from Kamke, J., Rinke, C., Schwientek, P., Mavromatis, K., Ivanova, N., Sczyrba, A., et al. euryarchaeota. Microbiol. Spectr. 2, 1–27. doi: 10.1128/microbiolspec.PLAS- (2014). The candidate phylum Poribacteria by single-cell genomics: new insights 0027-2014 into phylogeny, cell-compartmentation, eukaryote-like repeat proteins, and Forterre, P., and Philippe, H. (1999). Where is the root of the universal tree of life? other genomic features. PLoS ONE 9:e87353. doi: 10.1371/journal.pone.0087353 Bioessays 21, 871–879. doi: 10.1002/(SICI)1521-1878(199910)21:10<871::AID- Kamikawa, R., Kolisko, M., Nishimura, Y., Yabuki, A., Brown, M. W., Ishikawa, BIES10>3.0.CO;2-Q S. A., et al. (2014). Gene content evolution in Discobid mitochondria deduced Fritz-Laylin, L. K., Prochnik, S. E., Ginger, M. L., Dacks, J. B., Carpenter, M. L., Field, from the phylogenetic position and complete mitochondrial genome of M. C., et al. (2010). The genome of Naegleria gruberi illuminates early eukaryotic Tsukubamonas globosa. Genome Biol. Evol. 6, 306–315. doi: 10.1093/gbe/evu015 versatility. Cell 140, 631–642. doi: 10.1016/j.cell.2010.01.032 Kannan, S., Rogozin, I. B., and Koonin, E. V. (2014). MitoCOGs: clusters of Galtier, N., Tourasse N., and Gouy, M. (1999). A non-hyperthermophilic orthologous genes from mitochondria and implications for the evolution of common ancestor to extant life forms. Science 283, 220–221. doi: eukaryotes. BMC Evol. Biol. 14:237. doi: 10.1186/s12862-014-0237-5 10.1126/science.283.5399.220 Katz, L. A., and Grant, J. R. (2014). Taxon-rich phylogenomic analyses resolve the Gill, S., Krupovic, M., Desnoues, N., Béguin, P., Sezonov, G., and Forterre, P. eukaryotic tree of life and reveal the power of subsampling by sites. Syst. Biol. (2014). A highly divergent archaeo-eukaryotic primase from the Thermococcus [Epub ahead of print]. nautilus plasmid, pTN2. Nucleic Acids Res. 42, 3707–3719. doi: 10.1093/nar/ Keeling, P. J., and McFadden, G. I. (1998). Origins of microsporidia. Trends gkt1385 Microbiol. 6, 19–23. doi: 10.1016/S0966-842X(97)01185-2 Gogarten, J. P., Kibak, H., Dittrich, P., Taiz, L., Bowman, E. J., Bowman, Koonin, E. V., and Yutin, N. (2014). The dispersed archaeal eukaryome and the B. J., et al. (1989). Evolution of the vacuolar H+-ATPase: implications for complex archaeal ancestor of eukaryotes. Cold Spring Harb. Perspect. Biol. the origin of eukaryotes. Proc. Natl. Acad. Sci. U.S.A. 86, 6661–6665. doi: 6:a016188. doi: 10.1101/cshperspect.a016188 10.1073/pnas.86.17.6661 Krupovic, M., Gribaldo, S., Bamford, D. H., and Forterre, P. (2010). The Glansdorff, N., Xu, Y., and Labedan, B. (2008). The last universal common ancestor: evolutionary history of archaeal MCM helicases: a case study of vertical emergence, constitution and genetic legacy of an elusive forerunner. Biol. Direct. evolution combined with hitchhiking of mobile genetic elements. Mol. Biol. Evol. 3:29. doi: 10.1186/1745-6150-3-29 27, 2716–2732. doi: 10.1093/molbev/msq161 Gribaldo, S., and Brochier, C. (2009). Phylogeny of prokaryotes: does Krupovic, M., Gonnet, M., Hania, W. B., Forterre, P., and Erauso, G. (2013). it exist and why should we care? Res. Microbiol. 160, 513–521. doi: Insights into dynamics of mobile genetic elements in hyperthermophilic 10.1016/j.resmic.2009.07.006 environments from five new Thermococcus plasmids. PLoS ONE 8:e49044. doi: Gribaldo, S., and Philippe, H. (2002). Ancient phylogenetic relationships. Theor. 10.1371/journal.pone.0049044 Popul. Biol. 61, 391–408. doi: 10.1006/tpbi.2002.1593 Kurland, C. G., Collins, L. J., and Penny, D. (2006). Genomics and the irreducible Gribaldo, S., Poole, A. M., Daubi, V., Forterre, P., and Brochier-Armanet, C. nature of eukaryote cells. Science 312, 1011–1014. doi: 10.1126/science.1121674 (2010). The origin of eukaryotes and their relationship with the Archaea: Lake, J. A., Henderson, E., Oakes, M., and Clark, M. W. (1984). Eocytes: a new are we at a phylogenomic impasse? Nat. Rev. Microbiol. 8, 743–752. doi: ribosome structure indicates a with a close relationship to eukaryotes. 10.1038/nrmicro2426 Proc. Natl. Acad. Sci. U.S.A. 81, 3786–3790. doi: 10.1073/pnas.81.12.3786

Frontiers in Microbiology | www.frontiersin.org 16 July 2015 | Volume 6 | Article 717 Forterre The universal tree of life

Lane, N., and Martin, W. (2010). The energetics of genome complexity. Nature 467, Penny, D., Collins, L. J., Daly, T. K., and Cox, S. J. (2014). The relative ages of 929–934. doi: 10.1038/nature09486 eukaryotes and akaryotes. J. Mol. Evol. 79, 228–239. doi: 10.1007/s00239-014- Lasek-Nesselquist, E., and Gogarten, J. P. (2013). The effects of model choice and 9643-y mitigating bias on the ribosomal tree of life. Mol. Phylogenet. Evol. 69, 17–38. Perrochia, L., Guetta, D., Hecker, A., Forterre, P., and Basta, T. (2013a). Functional doi: 10.1016/j.ympev.2013.05.006 assignment of KEOPS/EKC complex subunits in the biosynthesis of the Lecompte, O., Ripp, R., Thierry, J. C., Moras, D., and Poch, O. (2002). universal t6A tRNA modification. Nucleic Acids Res. 41, 9484–9499. doi: Comparative analysis of ribosomal proteins in complete genomes: an example 10.1093/nar/gkt720 of reductive evolution at the domain scale. Nucleic Acids Res. 30, 5382–5390. doi: Perrochia, L., Crozat, E., Hecker, A., Zhang, W., Bareille, J., Collinet, B., et 10.1093/nar/gkf693 al. (2013b). In vitro biosynthesis of a universal t6A tRNA modification in Leipe, D. D., Aravind, L., and Koonin, E. V. (1999). Did DNA replication Archaea and Eukarya. Nucleic Acids Res. 41, 1953–1964. doi: 10.1093/nar/ evolve twice independently? Nucleic Acids Res. 27, 3389–3401. doi: gks1287 10.1093/nar/27.17.3389 Petitjean, C., Deschamps, P., López-García, P., and Moreira, D. (2014). Rooting Lipps, G., Weinzierl, A. O., von Scheven, G., Buchen, C., and Cramer, P. (2004). the Domain archaea by phylogenomic analysis supports the foundation of Structure of a bifunctional DNA primase-polymerase. Nat. Struct. Mol. Biol. 11, the new kingdom proteoarchaeota. Genome Biol. Evol. 7, 191–204. doi: 157–162. doi: 10.1038/nsmb723 10.1093/gbe/evu274 López-García, P., and Moreira, D. (2008). Tracking microbial biodiversity Petitjean, C., Deschamps, P., López-García, P., Moreira, D., and Brochier- through molecular and genomic ecology. Res. Microbiol. 159, 67–73. doi: Armanet, C. (2015). Extending the conserved phylogenetic core of archaea 10.1016/j.resmic.2007.11.019 disentangles the evolution of the third domain of life. Mol. Biol. Evol. doi: Makarova, K. S., Yutin, N., Bell, S. D., and Koonin, E. V.(2010). Evolution of diverse 10.1093/molbev/msv015 [Epub ahead of print]. cell division and vesicle formation systems in Archaea. Nat. Rev. Microbiol. 8, Philippe, H., and Adoutte, A. (1998). “The molecular phylogeny of : solid 731–741. doi: 10.1038/nrmicro2406 facts and uncertainties,” in Evolutionary Relationships Among Protozoa, eds G. Makarova, K. S., Wolf, Y. I., and Koonin, E. V. (2015). Archaeal clusters of H. Coombs, K. Vickerman, M. A. Sleigh, and A. Warren (Dordrecht: Kluwer orthologous genes (arCOGs): an update and application for analysis of shared Academic Publishers), 25–56. features between thermococcales, methanococcales, and methanobacteriales. Philippe, H., and Forterre, P. (1999). The rooting of the universal tree of life is not Life 5, 818–840. doi: 10.3390/life5010818 reliable. J. Mol. Evol. 49, 509–523. doi: 10.1007/PL00006573 Martijn, J., and Ettema, T. J. (2013). From archaeon to eukaryote: the evolutionary Poole, A. M. (2009). and the earliest stages of the evolution dark ages of the eukaryotic cell. Biochem. Soc. Trans. 41, 451–457. doi: of life. Res. Microbiol. 160, 473–480. doi: 10.1016/j.resmic.2009.07.009 10.1042/BST20120292 Probst, A. J., Weinmaier, T., Raymann, K., Perras, A., Emerson, J. B., Rattei, T., et al. Martin, W., Rujan, T., Richly, E., Hansen, A., Cornelsen, S., Lins, T., et al. (2014). Biology of a widespread uncultivated archaeon that contributes to carbon (2002). Evolutionary analysis of Arabidopsis, cyanobacterial, and fixation in the subsurface. Nat. Commun. 5, 5497. doi: 10.1038/ncomms6497 genomes reveals plastid phylogeny and thousands of cyanobacterial genes in Puigbò, P., Wolf, Y. I., and Koonin, E. V. (2013). Seeing the Tree of Life behind the the nucleus. Proc. Natl. Acad. Sci. U.S.A. 99, 12246–12251. doi: 10.1073/pnas. phylogenetic forest. BMC Biol. 11:46. doi: 10.1186/1741-7007-11-46 182432999 Ramulu, H. G., Groussin, M., Talla, E., Planel, R., Daubin, V., and Brochier- Matte-Tailliez, O., Brochier, C., Forterre, P., and Philippe, H. (2002). Archaeal Armanet, C. (2014). Ribosomal proteins: toward a next generation standard phylogeny based on ribosomal proteins. Mol. Biol. Evol. 19, 631–639. doi: for prokaryotic systematics? Mol. Phylogenet. Evol. 75, 103–117. doi: 10.1093/oxfordjournals.molbev.a004122 10.1016/j.ympev.2014.02.013 McInerney, J. O., Martin, W. F., Koonin, E. V., Allen, J. F., Galperin, M. Y., Lane, N., Raoult, D., and Forterre, P.(2008). Redefining viruses: lessons from Mimivirus. Nat. et al. (2011). Planctomycetes and eukaryotes: a case of analogy not homology. Rev. Microbiol. 6, 315–319. doi: 10.1038/nrmicro1858 Bioessays 33, 810–817. doi: 10.1002/bies.201100045 Raymann, K., Brochier-Armanet, C., and Gribaldo, S. (2015). The two-domains McInerney, J. O., O’Connell, M. J., and Pisani, D. (2014). The hybrid nature of the tree of life is linked to a new root for the Archaea. Proc. Natl. Acad. Sci. 112, Eukaryota and a consilient view of life on Earth. Nat. Rev. Microbiol. 12, 449–455. 6670–6675. doi: 10.1073/pnas.1420858112 doi: 10.1038/nrmicro3271 Raymann, K., Forterre, P., Brochier-Armanet, C., and Gribaldo, S. (2014). Global Mossel, E. (2003). On the impossibility of reconstructing ancestral data and phylogenomic analysis disentangles the complex evolutionary history of DNA phylogenies. J. Comput. Biol. 10, 669–676. doi: 10.1089/106652703322539015 replication in archaea. Genome Biol. Evol. 6, 192–212. doi: 10.1093/gbe/evu004 Mulkidjanian, A. Y., Makarova, K. S., Galperin, M. Y., and Koonin, E. V. (2007). Rinke, C., Schwientek, P., Sczyrba, A., Ivanova, N. N., Anderson, I. J., Cheng, J. F., Inventing the dynamo machine: the evolution of the F-type and V-type ATPases. et al. (2013). Insights into the phylogeny and coding potential of microbial dark Nat. Rev. Microbiol. 5, 892–899. doi: 10.1038/nrmicro1767 matter. Nature 499, 431–437. doi: 10.1038/nature12352 Mushegian, A. R., and Koonin, E. V. (1996). A minimal gene set for cellular life Rivera, M. C., and Lake, J. A. (2004). The ring of life provides evidence for a genome derived by comparison of complete bacterial genomes. Proc. Natl. Acad. Sci. fusion origin of eukaryotes. Nature 431, 152–155. doi: 10.1038/nature02848 U.S.A. 93, 10268–10273. doi: 10.1073/pnas.93.19.10268 Rochette, N. C., Brochier-Armanet, C., and Gouy, M. (2014). Phylogenomic test Nasir, A., Kim, K. M., and Caetano-Anollés, G. (2015). Lokiarchaeota: eukaryote- of the hypotheses for the evolutionary origin of eukaryotes. Mol. Biol. Evol. 31, like missing links from microbial dark matter? Trends Microbiol doi: 832–845. doi: 10.1093/molbev/mst272 10.1016/j.tim.2015.06.001 [Epub ahead of print]. Santarella-Mellwig, R., Franke, J., Jaedicke, A., Gorjanacz, M., Bauer, U., Budd, Narasingarao, P., Podell, S., Ugalde, J. A., Brochier-Armanet, C., Emerson, J. B., A., et al. (2010). The compartmentalized bacteria of the planctomycetes- Brocks, J. J., et al. (2012). De novo metagenomic assembly reveals abundant verrucomicrobia-chlamydiae superphylum have membrane coat-like proteins. novel major lineage of Archaea in hypersaline microbial communities. ISME J. PLoS Biol. 8:e1000281. doi: 10.1371/journal.pbio.1000281 6, 81–93. doi: 10.1038/ismej.2011.78 Sapp, J. (2009). The New Foundation of Evolution. New York: Oxford University Nunoura, T., Takaki, Y., Kakuta, J., Nishi, S., Sugahara, J., Kazama, H., et al. Press. (2011). Insights into the evolution of Archaea and eukaryotic protein modifier Soler, N., Marguet, E., Cortez, D., Desnoues, N., Keller, J., van Tilbeurgh, H., et systems revealed by the genome of a novel archaeal group. Nucleic Acids Res. 39, al. (2010). Two novel families of plasmids from hyperthermophilic archaea 3204–3223. doi: 10.1093/nar/gkq1228 encoding new families of replication proteins. Nucleic Acids Res. 238, 5088–5104. Olsen, G. J., and Woese, C. R. (1997). Archaeal genomics: an overview. Cell 89, doi: 10.1093/nar/gkq236 991–994. doi: 10.1016/S0092-8674(00)80284-6 Spang, A., Hatzenpichler, R., Brochier-Armanet, C., Rattei, T., Tischler, P., Spieck Pace, N. R. (1997). A molecular view of microbial diversity and the biosphere. E., et al. (2010). Distinct gene set in two different lineages of -oxidizing Science 276, 734–740. doi: 10.1126/science.276.5313.734 archaea supports the phylum Thaumarchaeota. Trends Microbiol. 18, 331–340. Pace, N. R. (2006). Time for a change. Nature 441, 289. doi: 10.1038/441289a doi: 10.1016/j.tim.2010.06.003 Pace, N. R., Olsen, G. J., and Woese, C. R. (1986). Ribosomal RNA phylogeny and Spang, A., Saw, J. H., Jorgensen, S. L., Zaremba-Niedzwiedzka, K., Martijn, J., Lind, the primary lines of evolutionary descent. Cell 45, 325–326. doi: 10.1016/0092- A. E., et al. (2015). Complex archaea that bridge the gap between prokaryotes 8674(86)90315-6 and eukaryotes. Nature 521, 173–179. doi: 10.1038/nature14447

Frontiers in Microbiology | www.frontiersin.org 17 July 2015 | Volume 6 | Article 717 Forterre The universal tree of life

Stechmann, A., and Cavalier-Smith, T. (2003). The root of the eukaryote tree Woese, C. R. (2000). Interpreting the universal . Proc. Natl. Acad. pinpointed. Curr Biol. 13:R665–R666. doi: 10.1016/S0960-9822(03)00602-X Sci. U.S.A. 97, 8392–8396. doi: 10.1073/pnas.97.15.8392 Stetter, K. O. (1996). Hyperthermophiles in the history of life. Ciba Found. Symp. Woese C. R. (2002). On the evolution of cells. Proc. Natl. Acad. Sci. U.S.A. 99, 202, 1–10. 8742–8747. doi: 10.1073/pnas.132266999 Thiaville, P. C., El Yacoubi, B., Perrochia, L., Hecker, A., Prigent, M., Thiaville, J. Woese, C. R., and Fox, G. E. (1977). The concept of cellular evolution. J. Mol. Evol. J., et al. (2014). Cross kingdom functional conservation of the core universally 10, 1–6. doi: 10.1007/BF01796132 conserved threonylcarbamoyl adenosine tRNA synthesis enzymes. Eukaryot. Woese, C. R., Kandler, O., and Wheelis, M. L. (1990). Towards a natural system of Cell 13, 1222–1231. doi: 10.1128/EC.00147-14 organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proc. Natl. Thiergart, T., Landan, G., and Martin, W. F. (2014). Concatenated alignments and Acad. Sci. U.S.A. 87, 4576–4579. doi: 10.1073/pnas.87.12.4576 the case of the disappearing tree. BMC Evol. Biol. 14:266. doi: 10.1186/s12862- Yutin, N., Puigbò, P., Koonin, E. V., and Wolf, Y. I. (2012). Phylogenomics 014-0266-0 of prokaryotic ribosomal proteins. PLoS ONE 7:e36972. doi: Urbonavicius, J., Auxilien, S., Walbott, H., Trachana, K., Golinelli-Pimpaneau, 10.1371/journal.pone.0036972 B., Brochier-Armanet, C., et al. (2008). Acquisition of a bacterial RumA- Zhaxybayeva, O., Swithers, K. S., Lapierre, P., Fournier, G. P., Bickhart, D. M., type tRNA(uracil-54, C5)-methyltransferase by Archaea through an ancient DeBoy, R. T., et al. (2009). On the chimeric nature, thermophilic origin, and horizontal gene transfer. Mol. Microbiol. 67, 323–335. doi: 10.1111/j.1365- phylogenetic placement of the Thermotogales. Proc. Natl. Acad. Sci. U.S.A. 106, 2958.2007.06047.x 5865–5870. doi: 10.1073/pnas.0901260106 Wan, L. C., Mao, D. Y., Neculai, D., Strecker, J., Chiovitti, D., Kurinov, I., et al. (2013). Reconstitution and characterization of eukaryotic N6- Conflict of Interest Statement: The author declares that the research was threonylcarbamoylation of tRNA using a minimal enzyme system. Nucleic Acids conducted in the absence of any commercial or financial relationships that could Res. 41, 6332–6346. doi: 10.1093/nar/gkt322 be construed as a potential conflict of interest. Williams, T. A., Foster, P. G., Cox, C. J., and Embley, T. M. (2013). An archaeal origin of eukaryotes supports only two primary domains of life. Nature 504, Copyright © 2015 Forterre. This is an open-access article distributed under the terms 231–236. doi: 10.1038/nature12779 of the Creative Commons Attribution License (CC BY). The use, distribution or Williams, T. A., and Embley, T. M. (2014). Archaeal “dark matter” and the origin of reproduction in other forums is permitted, provided the original author(s) or licensor eukaryotes. Genome Biol. Evol. 6, 474–481. doi: 10.1093/gbe/evu031 are credited and that the original publication in this journal is cited, in accordance with Woese, C. (1998). The universal ancestor. Proc. Natl. Acad. Sci. U.S.A. 95, accepted academic practice. No use, distribution or reproduction is permitted which 6854–6859. doi: 10.1073/pnas.95.12.6854 does not comply with these terms.

Frontiers in Microbiology | www.frontiersin.org 18 July 2015 | Volume 6 | Article 717