
Universal Tree of Life

Introductory article

James R Brown, SmithKline Beecham Pharmaceuticals, Collegeville, Pennsylvania, USA

Article Contents: Comparison of Classification Schemes; Archaea, Bacteria and Eukarya; Evidence of Early Basal Split; The Origin of Eukaryotes

The universal tree of life represents the proposed evolutionary relationships among all cellular life forms (thereby excluding viruses). Gene sequence data have moved the focus of the universal tree away from emphasizing metazoan diversity to encompassing the greater genetic diversity found in prokaryotes and single-cell eukaryotes. Presently, the canonical universal tree includes three main urkingdoms or domains called the Archaea (archaebacteria), Bacteria (eubacteria) and Eukarya (eukaryotes) with the rooting in the Bacteria such that Archaea and Eukarya are sister groups. However, the evolutionary relationships among these three groups, and even the status of the Archaea, are still hotly debated among evolutionary biologists.

ENCYCLOPEDIA OF LIFE SCIENCES / © 2001 Nature Publishing Group / www.els.net

Comparison of Classification Schemes

Naturalists have striven perpetually to build a meaningful classification scheme for living things. Long before Darwin, plants and animals were believed to be the primary divisions of life. In 1866, Haeckel was the first to challenge this dichotomy by suggesting that the Protista should be considered to be a third kingdom equal in stature to the Plantae and Animalia. The Bacteria or Monera were designated as a fourth kingdom by Copeland, in 1938. Whittaker added the fungi in 1959, and his five kingdom (Plantae, Animalia, Protista, Fungi and Monera) universal tree is still taught as part of basic biology curricula.

Over 50 years ago, Chatton, and Stanier and van Niel suggested that life could be subdivided into two even more fundamental cellular categories, prokaryotes and eukaryotes. The distinction between the two groups was subsequently refined as studies of cellular biology and genetics progressed such that prokaryotes became universally distinguishable from eukaryotes on the basis of missing internal membranes (such as the nuclear membrane and endoplasmic reticulum), nuclear division by fission rather than mitosis and the presence of a cell wall. The definition of eukaryotes was broadened to include Margulis’ endosymbiont hypothesis, which describes how eukaryotes improved their metabolic capacity by engulfing certain prokaryotes and converting them into intracellular organelles, principally mitochondria and chloroplasts.

In the late 1970s the fundamental belief in the prokaryote–eukaryote dichotomy was shattered by the work of Carl Woese and George Fox. By digesting in vivo labelled 16S rRNA using T1 ribonuclease then accumulating and comparing catalogues of the resultant oligonucleotide ‘words’, Woese and Fox were able to derive dendrograms showing the relationships between different bacterial species. Analyses involving some unusual methanogenic ‘bacteria’ revealed surprising and unique species clusterings among prokaryotes. So deep was the split in the prokaryotes that Woese and Fox proposed in 1977 to call the methanogens and their relatives ‘archaebacteria’, a name which reflected their distinctness from the true bacteria or ‘eubacteria’ as well as contemporary preconceptions that these organisms might have thrived in the environmental conditions of a younger Earth.

In 1990, Woese, Kandler and Wheelis formally proposed the replacement of the bipartite view of life with a new tripartite scheme based on three urkingdoms or domains: the Bacteria (formerly eubacteria), Archaea (formerly archaebacteria) and Eukarya (formerly eukaryotes, although this term is still more often used) (Figure 1). The rationale behind this revision came from a growing body of biochemical, genomic and phylogenetic evidence which, when viewed collectively, suggested that the archaebacteria were worthy of a taxonomic status equal to that of eukaryotes and eubacteria. While there was wide acceptance of this reclassification by most archaebacteriologists, several evolutionary biologists expressed serious dissent over the elevation of archaebacteria (and hence the eubacteria) to a taxonomic rank comparable to eukaryotes.

[Figure 1 tree labels. Eukarya: Animals, Fungi, Plants, Entamoeba, Euglena, Kinetoplasta (e.g. Trypanosoma), Parabasalia (e.g. Trichomonas), and the ‘Archezoa’ Microsporidia [?] (e.g. Nosema) and Metamonada (e.g. Giardia). Archaea: Crenarchaeota, Korarchaeota. Bacteria: High G + C Gram positives, Low G + C Gram positives, δ/ε-Purples, α-Purples and mitochondria, γ/β-Purples, Spirochaetes, Fusobacteria, Thermotogales, Flexibacter/bacteroides, Cyanobacteria and chloroplasts, Thermus, Aquifex. Root: ‘Cenancestor’.]

Figure 1 Schematic drawing of a universal rRNA tree showing the relative positions of evolutionarily pivotal groups in the domains Bacteria, Archaea and Eukarya. The location of the root (the cenancestor) corresponds with that proposed by reciprocally rooted gene phylogenies. The question mark beside the Archezoa group Microsporidia denotes recent suggestions that it might branch higher in the eukaryotic portion of the tree. (Branch lengths have no meaning in this tree.)

Archaea, Bacteria and Eukarya

At the centre of the controversy surrounding the concept of the three domains are the Archaea and their degree of uniqueness from the Bacteria. Although discovered much more recently than either the Bacteria or Eukarya, the biochemistry, genetics and evolutionary relationships of the Archaea have been intensively studied. In addition, the complete genomic DNA sequences are now known for several archaeal species. Therefore, it is appropriate to compare briefly the biology of these interesting organisms with that of bacteria and eukaryotes.

According to rRNA trees, there are two groups within the Archaea: the kingdoms Crenarchaeota and Euryarchaeota. The Crenarchaeota are generally hyperthermophiles or thermoacidophiles (some genera are Desulfurococcus, Pyrodictium, Sulfolobus, Thermofilum and Thermoproteus). The Euryarchaeota span a broader ecological range and include hyperthermophiles (e.g. Pyrococcus and Thermococcus), methanogens (e.g. Methanosarcina), halophiles (e.g. Halobacterium and Haloferax), and even thermophilic methanogens (e.g. Methanobacterium, Methanococcus and Methanothermus). However, it is important to note that microbial species assemblages in extreme environments are not exclusively archaeal, as bacteria-specific rRNA signatures can also be amplified from such sites. In addition, through polymerase chain reaction (PCR) amplification of rRNA sequences from water and sediment samples, a plethora of new archaeal species belonging to both kingdoms have been found in mesophilic environments such as temperate marine coastal waters, the Antarctic Ocean, and freshwater lakes, even as marine sponge symbionts. PCR-based surveys of hot springs’ microbiota have also detected novel archaeal rRNA sequences that possibly branch deeper than the Crenarchaeota–Euryarchaeota divergence. These ‘organisms’ have been tentatively assigned to a third archaeal kingdom, the Korarchaeota.

The Archaea have several unique biochemical characteristics as well as unusual combinations of characteristics once thought to be exclusive to either the Bacteria or the Eukarya. Some solely archaeal characteristics include isopranyl ether lipids, the absence of acyl ester lipids and fatty acid synthetase, specially modified tRNA molecules, a split in one of the RNA polymerase subunits, and a specific range of antibiotic sensitivities. Among the species of Archaea there are a variety of metabolic regimes which often differ greatly from the better known metabolic pathways of Bacteria and eukaryotes.

Archaea and Bacteria are united in the ‘realm of prokaryotes’ by generally similar cell sizes, a lack of a nuclear membrane and organelles, and the presence of a large circular chromosome occasionally accompanied by one or more smaller circular DNA plasmids. Like the Bacteria, the Archaea have multiple genes organized into operon transcriptional units and several of these operons have the same gene order as their counterparts in the Bacteria. Archaeal and bacterial mRNAs lack 5’ end caps and often have Shine–Dalgarno ribosome binding sites. However, the locations of putative Shine–Dalgarno sequences relative to the translational initiation codon are more variable in Archaea, and in fact, the upstream sequences of several highly expressed genes bear little resemblance to Shine–Dalgarno motifs. Other features shared between the Archaea and Bacteria include type II restriction enzyme systems, the absence of spliceosomal introns found in eukaryotes, and the presence of homing endonucleases typical of group I introns found in mitochondria and bacteriophages.
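The variable spacing of putative Shine–Dalgarno sequences noted above is the kind of observation that can be checked with a simple scan of the region upstream of an initiation codon. The following is only a minimal sketch: the function name is hypothetical, and the default motif AGGAGG (the commonly cited bacterial consensus) is an assumption, since the article does not specify a motif.

```python
def shine_dalgarno_spacings(upstream, motif="AGGAGG"):
    """Return the spacer length for every exact motif match.

    `upstream` is the sequence immediately 5' of the initiation codon,
    written 5'->3', so its last base sits directly before the codon.
    The spacer is the number of bases between the end of the motif and
    the codon. The default motif is an assumed consensus, not a value
    taken from the article.
    """
    # Accept RNA or DNA input, in either case.
    seq = upstream.upper().replace("U", "T")
    spacings = []
    start = seq.find(motif)
    while start != -1:
        spacings.append(len(seq) - (start + len(motif)))
        start = seq.find(motif, start + 1)
    return spacings


# Hypothetical upstream region, for illustration only.
example_upstream = "TTAGGAGGTAACAT"
example_spacings = shine_dalgarno_spacings(example_upstream)
```

An empty result for a highly expressed gene would correspond to the case described in the text, where the upstream sequence bears little resemblance to a Shine–Dalgarno motif. A real survey would allow mismatches and partial motifs rather than exact matching.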


While cell division in the Archaea might function as in Bacteria, many components of DNA replication, transcription and translation are definitely more eukaryote-like. Hints of genetic homology among Archaea and eukaryotes were first found in studies which revealed that archaeal species showed patterns of antibiotic sensitivities more similar to those observed in eukaryotes than in bacteria. Subsequent biochemical and molecular biological studies provided proof of significant similarities between archaeal and eukaryotic DNA replication, transcriptional and translational components which are the targets of many antibiotics. As an example, all known archaeal DNA polymerases belong to the eukaryotic family B type DNA polymerases which have no true bacterial counterpart. Many other DNA replication/repair proteins which occur throughout the Archaea and eukaryotes are absent in Bacteria.

Transcriptional components are also strikingly similar between eukaryotes and the Archaea. Archaeal RNA polymerases are evolutionarily closer to those of eukaryotes and share similar numbers and kinds of subunits. The typical eukaryotic transcriptional promoter, consisting of a TATA box sequence located about 30 bp upstream of the first mRNA nucleotide, is also found in Archaea and serves in transcription. The eukaryotic core RNA polymerase does not contact the DNA template strand directly at the TATA-box site; rather it requires for activation the binding of several specific transcription factors. Archaeal counterparts to nearly all eukaryotic transcription factors have been identified in many different species of the Archaea. Subsequent laboratory experiments proved that yeast and human transcription factors can act as substitutes for native transcription factors in cell-free archaeal transcription systems. However, 5’-end capping, persistent poly-A tailing, monocistronic messages and spliceosomal introns, features unique to eukaryotic mRNA, have yet to be found in species of either the Bacteria or Archaea. Therefore, these features of RNA transcripts probably evolved within the evolutionary lineage that led to eukaryotes.

Protein synthesis or translation is also strikingly similar between the Archaea and eukaryotes. Archaea and eukaryotes share several translation factors, not found in Bacteria, although the Archaea have several unique types as well. The close relationship between Archaea and eukaryotes is supported by the phylogenetic analyses of several other translation proteins including elongation factors, ribosomal proteins, aminoacyl-tRNA synthetases and methionine aminopeptidase.

Archaeal-specific DNA-binding proteins, called HMf, have a strong resemblance to eukaryotic histones, in terms of both primary sequence and three-dimensional structure. However, Archaea also have proteins similar to bacterial DNA-binding proteins, known as HU, which are not evolutionarily linked to histones, although they perform similar functions.

Although Archaea lack eukaryotic spliceosomal introns, they do have introns in tRNA genes which are of similar size and often inserted between the same residues as those found in eukaryotes. Excision of tRNA introns in eukaryotes involves a site-specific endonuclease comprising two duplicated subunits which are similar to archaeal tRNA endonucleases. Inteins, unusual introns found originally in yeast that splice at the protein rather than the mRNA level, also occur in the Archaea and some Bacteria, particularly in proteins involved in DNA replication such as DNA polymerases.

In summary, there are some features that distinguish the Archaea from the Bacteria and eukaryotes, most notably the structure and composition of their membranes. Primarily, the members of the Archaea are unique in having a combination of traits which, until now, were believed to be exclusive to either Bacteria or eukaryotes. However, this picture may change rapidly in light of new information from the wide variety of microorganism genome sequencing projects now in progress.

Evidence of Early Basal Split

Fossil evidence suggests that Bacteria have existed for at least 3.5 billion years. However, it is nearly impossible to distinguish between the microfossils of early prokaryotes and single-celled eukaryotes. Furthermore, archaeal and bacterial cells are nearly indistinguishable from each other in terms of general morphology. Therefore, molecular sequence information is the only means available for determining the evolutionary relationships between Archaea, Bacteria and eukaryotes.

There are three possible topologies for any universal tree: (i) Bacteria diverged first from a lineage producing Archaea and eukaryotes; (ii) a proto-eukaryotic lineage diverged from a fully prokaryotic (Bacteria and Archaea) lineage; or (iii) Archaea diverged from a lineage leading to eukaryotes and Bacteria. In order to decide objectively which of these three scenarios is correct, an outgroup rooting of the universal tree must be derived. As an example, a phylogenetic tree constructed to resolve the evolution of all vertebrate species could be rooted using invertebrates as an outgroup. However, outgroup species are not available for a gene tree consisting of all living organisms unless specific assumptions are made about the progression of life from a prokaryotic to a eukaryotic cell. Therefore, the branching order of the three domains emerging from the last common ancestor (cenancestor) can only be established by some method unrelated to either outgroup organisms or theories about primitive and advanced states.

A solution to this problem is to derive parallel phylogenetic trees using ancient duplicated genes or gene paralogues.
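The three candidate topologies, and the way a duplicated gene can choose among them, can be illustrated with a small sketch. It is not the method of any study cited here: the nested-tuple tree encoding, the function names and the example tree are all assumptions made for illustration. Each copy of the duplicated gene serves as the outgroup for the other, so the domain on the lone branch of each half of the tree is read as the first to diverge.

```python
from itertools import combinations

DOMAINS = ("Archaea", "Bacteria", "Eukarya")


def rooted_topologies(taxa=DOMAINS):
    """List every rooted tree for three taxa as (outgroup, sister_pair)."""
    trees = []
    for pair in combinations(taxa, 2):
        outgroup = next(t for t in taxa if t not in pair)
        trees.append((outgroup, pair))
    return trees


def first_diverging(subtree):
    """Return the domain on the lone branch of a three-leaf subtree.

    A subtree is written (leaf, (leaf, leaf)) or ((leaf, leaf), leaf);
    the leaf outside the inner pair is the one that diverged first.
    """
    left, right = subtree
    return left if isinstance(left, str) else right


def reciprocal_root(gene_tree):
    """Read the rooting implied by a tree of two paralogues.

    `gene_tree` is (copy_1_subtree, copy_2_subtree), joined at the
    duplication node. A root is reported only when both halves agree
    on the first-diverging domain; otherwise None is returned.
    """
    copy_1, copy_2 = gene_tree
    first_1 = first_diverging(copy_1)
    first_2 = first_diverging(copy_2)
    return first_1 if first_1 == first_2 else None


# Hypothetical tree in which both gene copies place Bacteria outside
# an Archaea + Eukarya pair (topology (i) in the text).
EXAMPLE_GENE_TREE = (
    ("Bacteria", ("Archaea", "Eukarya")),
    ("Bacteria", ("Archaea", "Eukarya")),
)
EXAMPLE_ROOT = reciprocal_root(EXAMPLE_GENE_TREE)
```

Real analyses work with aligned sequences and statistical support for each branch. This sketch only shows the logic of reading a root from two reciprocal subtrees, and why disagreement between the two halves leaves the rooting unresolved.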
Genes that are similar across different species

are said to be homologous (long before molecular data, the term homology was also applied to morphological characters). Homologous genes can be further classified into orthologous and paralogous genes. Orthologous genes are directly related by descent, that is, one copy exists per organism being compared. Paralogous genes are also related but at a more distant level in the evolutionary hierarchy (Figure 2). As an example, gene A might have duplicated in the cenancestor to give rise to two genes called A1 and A2. If there was never an occurrence of gene loss or substitution, every descendant species of Archaea, Bacteria and eukaryote should have a copy of both genes A1 and A2. All gene A1s would be orthologues to one another as would all gene A2s to their own kind. However, A1 and A2 would also be related and this distant relationship is called paralogy. Paralogous genes exist as multiple copies within the same organism or genome. Not all incidents of paralogy are ancient. For example, metazoans have multiple copies of genes that are homologous to genes occurring as single copies in single-celled eukaryotes. Therefore, this instance of paralogy arose when multicellular eukaryotes evolved.

[Figure 2 diagram labels: gene A, duplicated in the cenancestor or before, gives rise to gene A1 and gene A2; each copy has branch tips for Archaea, Eukarya and Bacteria; the axis shows time (past to present).]

Figure 2 Conceptual rooting of the universal tree using paralogous genes. Suppose that gene A was duplicated in the cenancestor such that all extant organisms have both genes, A1 and A2. Provided that some sequence similarity still exists between genes A1 and A2, then reciprocally rooted gene trees could be constructed. The positioning of Archaea and Eukarya as sister groups, with the Bacteria as the outgroup, has been consistently supported by such rootings.

In 1989, two research groups simultaneously published rootings of the universal tree based on ancient gene duplications. Iwabe and co-workers derived reciprocally rooted trees for paralogous elongation factors (EF), a family of GTP-binding proteins that facilitate the binding of aminoacylated tRNA molecules to the ribosome (EF-Tu in Bacteria and EF-1α in eukaryotes and Archaea) and the translocation of peptidyl-tRNA (EF-G in Bacteria and EF-2 in eukaryotes and Archaea). When five conserved regions were aligned between EF-Tu/1α and EF-G/2 sequences for a single species of the Archaea and several species of Bacteria and eukaryotes, both reciprocal trees showed Archaea and eukaryotes as sister groups. A similar rooting was obtained by Gogarten and co-workers based on a second gene duplication, that of the V-type (found in Archaea and eukaryotes) and F-type (found in Bacteria) ATPase subunits. Subsequent analyses using another family of ancient duplicated proteins, the aminoacyl-tRNA synthetases, confirmed the elongation factor and ATPase subunit rooting.

Woese, Kandler and Wheelis incorporated the elongation factor and ATPase subunit rooting in their 1990 formulation of the three domains, Archaea, Bacteria and Eukarya. Although archaeal and bacterial rRNA sequences are slightly more similar, Woese and colleagues placed the root of the ribosomal tree such that Archaea and Eukarya were sister groups. Thus the ‘archaeal’ version of the universal tree is a synthesis of three different data analyses which collectively show Archaea, Bacteria and Eukarya as separately evolved groups with the Archaea and Eukarya as sister groups and the root of the universal tree in the Bacteria.

However, there have been serious challenges to the archaeal tree. Based on different reanalyses of rRNA and elongation factor trees, Lake and co-workers have long advocated that the Archaea are a paraphyletic rather than a monophyletic group where the kingdom Crenarchaeota is more closely related to eukaryotes than is the kingdom Euryarchaeota. Lake suggests that the Crenarchaeota be renamed ‘eocytes’. Focusing on the heat-shock protein 70 kilodalton (kDa) subunit (HSP70), Gupta and co-workers have also argued that the Archaea are not a true clade. A number of proteins for which only unrooted trees can be derived suggest that either Archaea and Bacteria (some examples are glutamine synthetase, glutamate dehydrogenase and HSP70) or Bacteria and eukaryotes (some examples are valyl-tRNA synthetase, enolase and glyceraldehyde 3-phosphate dehydrogenase) are interrelated groups, thus challenging the notion of monophyletic domains (i.e. that Archaea, Bacteria and Eukarya are separate groups joined at the base of the tree of life), the basic tenet of the archaeal tree.

In addition, rapidly growing DNA sequence databases have revealed many exceptions to the archaeal tree topology. Several instances were found where either a bacterium had an Archaea-like ATPase or an archaeon had a Bacteria-like ATPase. Several Bacteria have aminoacyl-tRNA synthetases that are most similar to the types found in either the Archaea or eukaryotes.

An underlying problem with all phylogenetic analyses is just how accurately a particular gene tree reflects the actual evolution of the organisms. Horizontal gene transfer and lineage-specific differences in evolutionary rates could result in a gene tree radically different from the true species phylogeny. How widespread these phylogenetic distortion effects are, and to what extent they have affected particular gene trees, are important issues in the field of molecular

evolution. Developing evolutionary theories solely on the basis of any single gene phylogeny is, at best, highly speculative. However, if the majority of gene trees do correctly reflect the evolutionary origins of different bits of the genome, then one might conclude that the descent of Bacteria, Archaea and eukaryotes from the cenancestor involved a more complex series of genetic events. In this vein, several new theories have emerged which suggest that the eukaryotic cell did not directly evolve from an Archaea-like ancestor but rather that the eukaryotic nucleus arose from the cellular fusion between a bacterium and an Archaea-like ancestor.

This terminology is somewhat confusing since the chimaeric nature of the eukaryotic cell has been long recognized with respect to the endosymbiotic origin of organelles. In addition, it has been established that the eukaryotic genome is a chimaera where genes of ancient eukaryotic ancestry coexist with genes more recently acquired from bacterial endosymbionts. In the context of debates surrounding the universal tree of life, the term ‘chimaera hypothesis’ is usually applied to theories that explain the origin of the eukaryotic genome by way of some cellular or genome fusion involving two independent, noneukaryotic lineages, while the ‘archaeal hypothesis’ refers to the more conventional view that eukaryotes and Archaea recently diverged from a common ancestor. The debate over whether the chimaera or archaeal hypothesis is the most likely explanation for the evolution of eukaryotes still rages among evolutionary biologists. However, large-scale gene transfers between bacteria and eukaryotes might also be explained, in part, by expanding upon the well-accepted theory of organelle evolution through endosymbiosis.

The Origin of Eukaryotes

In many protein phylogenies, eukaryotes will branch among those contemporary bacterial lineages most closely related to the endosymbiotic ancestors of mitochondria (α-proteobacteria or α-purple bacteria) or chloroplasts (cyanobacteria). Phylogenies for glyceraldehyde 3-phosphate dehydrogenase, phosphoenolpyruvate synthetase (enolase) and triosephosphate isomerase suggest that these proteins descended directly from an α-proteobacterial ancestor, although the genes encoding these proteins always reside in the main chromosome found in the cell nucleus and not in the much smaller chromosomes of mitochondria and chloroplasts.

The bacterial endosymbiosis which led to the development of the mitochondria probably occurred very early in eukaryotic evolution. Phylogenetic analyses using different molecular markers place at the base of eukaryotes various protist lineages (the archamoebae, metamonads, microsporidia and parabasalia) known as the Archezoa (Figure 1). The Archezoa are not a true clade of organisms but rather a collection of species united by their lack of mitochondria, anaerobic metabolism, simple cell morphology and bacteria-like ribosomes. The phylogenetic position and primitive features of the Archezoa led Cavalier-Smith to suggest in 1983 that these organisms were relic lineages, and that it was another, later evolved, single-celled eukaryote that actually engulfed the bacterial endosymbiont ancestor of the mitochondria.

However, recent molecular phylogenetic studies are now showing that all major groups of the Archezoa either secondarily lost their organelles or underwent some kind of endosymbiosis which resulted in the successful fixation of several bacterial genes in the Archezoan nuclear genome but not the retention of an intracellular organelle. The main evidence emerges from a series of phylogenetic analyses of proteins such as the heat-shock 60 kDa subunit protein (HSP60, also called GroEL, Cpn60 or bacterial common antigen) and valyl-tRNA synthetase. In all these studies, the eukaryotic copies of the protein, even those of amitochondriate protists, appear most closely related to α-proteobacterial homologues. This would suggest that endosymbiosis occurred very early in eukaryotic evolution, prior to the emergence of the Archezoa or any known eukaryotic species. Indeed, the harnessing of a bacterial endosymbiont for energy production, and possibly other uses, might have been the defining event in the formation of the eukaryotic cell.

If large-scale gene transfers between either eukaryotes and the Bacteria or the Bacteria and Archaea did occur in early cellular evolution, the result would be that seen thus far: different proteins give different topologies of the universal tree. Taking the phylogenetic trees at face value, many genes, mainly those coding for metabolic enzymes, were transferred between either Bacteria and Archaea or Bacteria and eukaryotes. However, a large number of genes, in particular those coding for central ‘information’ processes such as DNA replication, transcription and translation, support the sisterhood of the Archaea and eukaryotes. As more DNA sequence data emerge from widely diverse organisms, a major challenge for the field of evolutionary biology will be the development of an integral theory for the evolution of cellular organisms.

Further Reading

Brown JR and Doolittle WF (1997) Archaea and the prokaryote-to-eukaryote transition. Microbiology and Molecular Biology Reviews 61: 456–502.

Doolittle WF (1999) Phylogenetic classification and the universal tree. Science 284: 2124–2128.

Keeling PJ (1998) A kingdom’s progress – Archezoa and the origin of eukaryotes. Bioessays 20: 87–95.

Woese CR (1987) Bacterial evolution. Microbiological Reviews 51: 221–271.

Woese CR, Kandler O and Wheelis ML (1990) Towards a natural system of organisms: Proposal for the domains Archaea, Bacteria and Eucarya. Proceedings of the National Academy of Sciences of the USA 87: 4576–4579.
