
Fossil record and molecular clocks

Sarah Slotznick, Susan Liao

Butterfield 2005

FIGURE 2. Tappania sp. from the Wynniatt Formation. Specimens with Sedgwick Museum (CAMSM) acquisition numbers (plus field-sample ID, slide number, and England Finder coordinates). A, X.41240 (KL2-40m-Q43); see Figures 4E,F and 5 for details. B, X.41241 (KL2-56m-L60), with two pronounced longitudinal outgrowths from the vesicle (now flattened), plus a smaller circular ridge (lower left). C, X.41242 (VI23-4m-S60); an isolated secondary vesicle of Tappania showing the characteristic irregular extensions and branches (cf. Fig. 2D). D, X.41243 (VI21-4m-U59); a large irregular specimen of Tappania bearing a contiguous Germinosphaera-like outgrowth; the two penetrative foramen, indicated by the box (see Fig. 3E) and arrow, are a product of secondary fusion. The loop-forming marginal process at the center left identifies this specimen as Tappania.

population, direct evidence of a relatively more plastic "histology" (Butterfield 2003). The pseudo-folds document a highly variable range of vegetative outgrowths, from nascent processes (Fig. 1L), to linear and arcuate ridges (Figs. 1K,N,O, 2B), to major outpocketings that rival the parent vesicle in size (Figs. 1A, 2D).

As with the "normal" processes, these irregular outgrowths were capable of secondary growth and fusion (e.g., Figs. 2D, 3E, 4H), and in some instances it is possible to reconstruct the course of their development. In CAMSM X.41243, for example, the large elongated outgrowth (Figs. 2D, 3E; recognized here by its doubled opacity, but also observed

Tappania; the distal portions of the processes have been secondarily lost. I, X.41232 (KL2-49m-K50); see Figures 3B, 4C,D for details. J, X.41233 (KL2-15m-H47). K, X.41234 (KL2-4m-O48); see Figure 4B for detail. L, X.41235 (KL2-50m-P57), with processes merging gradually with the vesicle wall; distal portions of the processes have been secondarily lost. M, X.41236 (KL2-49m-L60), with a smaller vesicle suspended within the anastomosing processes; see Figure 4H for detail. N, X.41237 (VI21-5m-J60); see Figure 4A for detail. O, X.41238 (VI21-5m-O64), with marked similarities to the type material of Tappania. P, X.41239 (KL2-9m-Q47).

Fossils in Rock Record

• Body Fossils
  – Casts/Molds
  – Mineralization
  – Compression
  – Acritarchs
• Trace Fossils
• Molecular Fossils (Biomarkers)

NATURE | Vol 463 | 18 February 2010 LETTERS

Oldest Putative Fossils (Moodies Group, South Africa, 3.2 Ga)

[Figure: micrograph panels a–n with scale bars. Javaux et al. 2010]

Figure 1 | Carbonaceous microstructures in situ in thin sections and extracted from the rock by acid maceration. Images were produced with a transmitted light microscope (a–f), a backscattered environmental SEM (g–j) and a TEM (k–n). Arrows point to spheroidal microstructures in section subparallel to the bedding (a, b), compressed microstructures in section across the bedding (c), microstructures extracted from the rock by acid maceration (d–n), disseminated organic particles (short arrows in b), and concentric folds (d, g, h), wrinkling (i), lanceolate fold (e) and collapsing over (f), which are all typical taphonomic features of soft wall deformation. SEM images show the highly folded, wrinkled and degraded texture of the wall (g–j). TEM images show the compressed vesicle walls surrounding the lumen (arrowed) in semi-thin (k) and ultra-thin (n) sections and the homogeneous ultrastructure (l, m) of the roughly 160-nm-thick wall, torn and wrinkled in places (l).

than an ornamentation. SEM–energy-dispersive X-ray analyses show occasional disseminated arsenopyrite and other sulphide crystals on the walls of the microstructures. Transmission electron microscope (TEM) analyses of the wall ultrastructure show unambiguously that they represent flattened hollow organic-walled vesicles with the cell lumen visible between the compressed walls (Fig. 1k, n) rather than large kerogen particles. The organic wall shows folding along its length (Fig. 1k, n) and seems disrupted in places because the 60-nm-thick ultra-thin sectioning cut through highly wrinkled and degraded walls, as observed in SEM images. Moreover, some small mineral grains were ripped off during sectioning, as demonstrated by the presence of holes and, occasionally, pyrite cubes in the resin. The roughly 160-nm-thick wall appears torn and wrinkled in places, and has a homogeneous ultrastructure (Fig. 1l, m).

The taphonomic features of soft wall deformation, commonly observed in Proterozoic and Phanerozoic organic-walled microfossils with well-accepted biogenicity, are due to their loss of turgor pressure and degradational collapse during decay¹⁷ before flattening of the hosting shales and siltstones during compaction, and show flexibility of the original organic wall. Another common feature with Proterozoic fossiliferous siliciclastic rocks is the low total organic carbon content ranging from 0.07 to 0.37 wt%, with an average of 0.17 wt% (n = 22). Generally, Proterozoic shales with a high total organic carbon content contain only particulate organic matter without structurally preserved walls, whereas shales with a low total organic carbon content (‘grey shales’) may preserve, sometimes exquisitely, organic structures with cell walls¹³,¹⁷. Other important controls on the preservation potential of microorganisms are their

©2010 Macmillan Publishers Limited. All rights reserved

Archean Biomarkers

Pilbara Craton, 2.7 Ga

“The presence of steranes, particularly cholestane and its 28- to 30-carbon analogs, provides persuasive evidence for the existence of eukaryotes…”

“Whatever their origin, the biomarkers must have entered the rock after peak metamorphism 2.2 Gyr ago, and thus do not provide evidence for the existence of eukaryotes and cyanobacteria in the Archaean eon.”

Proterozoic Eukaryotic Fossils

1028 A. H. Knoll and others Proterozoic eukaryotes

Knoll et al. 2006

Figure 4. (Caption opposite.)

Bangiomorpha only to the interval 1267 ± 2 to 723 ± 3 Myr, but an unpublished Pb–Pb date of 1198 ± 24 Myr and physical stratigraphic relationships strongly suggest that the fossils’ age lies close to the lower radiometric boundary (Butterfield 2000). Latest Mesoproterozoic (more than 1005 ± 4 Myr; Rainbird et al. 1998) microfossils from the Lakhanda Group, Siberia, contain several additional populations of coenocytic to multicellular filaments whose morphologies and dimensions suggest eukaryotic origin (Herman 1990; figure 3f). Principal among these are fossils assigned to Palaeovaucheria clavata and other form taxa that appear to preserve vegetative and reproductive phases of a life cycle comparable to the extant xanthophyte alga Vaucheria (Jankauskas 1989; Herman 1990). Vaucheria-like populations preserving several life-cycle stages also occur in the 750–800 Myr Svanbergfjellet Formation, Spitsbergen (Butterfield 2004). Latest Mesoproterozoic and Early Neoproterozoic acritarchs (figure 3h) continue the record of moderate diversity established earlier, although some taxa characteristic of these younger assemblages have not, to date, been found in older rocks (Knoll 1996).

Phil. Trans. R. Soc. B (2006)

Changzhougou Fm., Changcheng Group, China, ~1.8 Ga

Lamb et al., 2009

Roper Group, Australia, 1.5-1.4 Ga

Valeria lophostriata

Tappania plana

Satka favosa

Javaux et al., 2004

Negaunee Iron Formation, Michigan, 1.874 Ga

Grypania species (scale bar: 5 mm)

Han and Runnegar, 1992

Belt Supergroup, Montana, 1.47 Ga to 1.4 Ga

Horodyskia moniliformis

Grypania spiralis

Scale bar: 1 cm

Horodyski 1993, Fedonkin and Yochelson 2002

Extant Eukaryotic Fossils

Bangiomorpha

Hunting Fm, Arctic Canada, 1.27 to 0.723 Ga

MESOPROTEROZOIC SEX AND MULTICELLULARITY 391

the case with Bangiomorpha pubescens n. gen., n. sp., and the large populations in the Hunting Formation allow a nearly complete reconstruction of its ontogeny (Figs. 3–5). Although diagnosed on its multicellular habit, the initial single-celled (Fig. 4A) and double-celled (Fig. 4B) stages of Bangiomorpha can be identified by the specific character of their cell walls, in particular the relatively dark, pointillistically textured inner cell wall surrounded by a relatively translucent outer wall. Filament growth was initiated by the first cell division, oriented parallel to the substrate. By the four-celled stage (Fig. 4C), the characteristic pairing of cells reveals the transverse intercalary nature of cell division in uniseriate filaments; centripetal cytokinesis is documented by the common occurrence of prominent circumferential furrows (e.g., Fig. 3B). The basal holdfast is first seen to differentiate at the ca. 12–16 cell stage (Fig. 4F,G) and typically develops as a multilobed (usually two, but sometimes four or more) multicellular structure connected to the rest of the filament via a single cell (Fig. 6).

At some, presumably relatively mature, stage the cells of some Bangiomorpha filaments underwent longitudinal (with respect to the filament) intercalary division giving rise to multiseriate filaments. There are, however, at least three variations to the general pattern:

Type 1: In most instances the intercalary division was oriented radially resulting in four or eight wedge-shaped cells arranged around a central space (Fig. 5E). The mean diameter of all such filaments is 46.2 ± 7.4 µm (n = 23). In specimens with both uniseriate and multiseriate portions (n = 9), the mean multiseriate diameter is 42.0 ± 5.5 µm and the adjacent uniseriate diameter 30.6 ± 6.4 µm; the ratio of uniseriate to multiseriate diameter ranges from unity to 1.8 (x̄ = 1.4 ± 0.3).

Type 2: In a few instances, longitudinal intercalary division gave rise to relatively few spheroidal cells separated from one another by translucent outer wall material (Fig. 5A,D). Mean filament diameter is 40.0 ± 9.1 µm (n = 4). In specimens with both uniseriate and multiseriate portions (n = 3), the mean multiseriate diameter is 36.7 ± 8.0 µm and the adjacent uniseriate diameter 24.0 ± 7.0 µm; the ratio of

FIGURE 3. Bangiomorpha pubescens n. gen. n. sp. Thin-section identification and England Finder coordinates appear in parentheses. A, HUPC 63000 (HUST-1P, M-32). B, HUPC 62995 (HUST-1Q, O-45), paratype; note the hierarchically paired cells reflecting diffuse transverse intercalary cell division. C, HUPC 63001 (HUST-1Q, P-25); note the multiseriate portions of the filament, unaccompanied by filament expansion; scale as for A.

Butterfield 2000

Vase-Shaped Microfossil/Testate

Chuar Group, Grand Canyon, 742 Ma

Bonniea pytinaia, Melanocyrillium hexodiadema, Trigonocyrillium horodyskii

Chuar

Modern: Cyphoderia ampulla, Trigonopyxis arcula, Arcella conica

Porter et al. 2003

Proterozoic Biomarkers

C27-C29 Steranes, Gammacerane, Dinosterane

McArthur Basin, 1.69 Ga to 1.429 Ga

Walcott Fm., Chuar Group, 742 Ma

Summons et al., 1988; Summons et al., 1988; Porter, 2006

Molecular Clock Hypothesis (MCH)

• “Most amino acid substitutions in a protein… occur between functionally equivalent residues, so that their replacement along evolving lineages would be determined by mutation rate and time elapsed rather than by natural selection”

Ayala, 1999

Calculating Molecular Rates/Time

• Substitutions (Amino Acid, DNA) accumulate at a constant and comparable rate
• D = RT, so Time = Differences (Substitutions) / Rate
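The relation D = RT on this slide can be turned into a small calculation. The sketch below is illustrative only: the function name and the example numbers are hypothetical and not taken from any of the cited studies, and it assumes a single constant rate, which is exactly the assumption the later slides question.

```python
def divergence_time(differences: float, rate: float) -> float:
    """Solve D = R * T for T under a strict (constant-rate) clock.

    differences: substitutions per site separating the sequences (D)
    rate: substitutions per site per million years (R)
    Returns the elapsed time T in million years.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if differences < 0:
        raise ValueError("differences cannot be negative")
    return differences / rate


# Hypothetical example values, chosen only to show the arithmetic.
d = 0.30      # substitutions per site
r = 0.0002    # substitutions per site per Myr
print(f"T = {divergence_time(d, r):.0f} Myr")
```

When D is measured between two living lineages, both branches accumulate changes, so many treatments write D = 2RT instead; the slide's simpler form is kept here.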

Berkeley, Understanding Evolution 2006

Questions in Eukaryote Evolution

• Rooting eukaryotic tree
• Phylogenetic artifacts
• Dating ancient divergences
• Endosymbiotic and lateral gene transfer

Long-Branch Attraction (LBA)

Establishing a Minimum Constraint

• Fossil evidence

Connecting Fossil and Molecular Evidence

Bromham and , 2011

Establishing a Maximum Constraint

• (1) employ mathematical functions of probability densities to describe degree to which a minimum constraint approximates to a divergence date
• (2) establish explicitly justified fossil-based “soft maximum” constraints
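One way to picture the two points above is an offset exponential density: zero probability below the fossil minimum, and a tail that places a chosen fraction of its mass beyond a soft maximum. This is a minimal sketch under stated assumptions, not the method of Warnock et al.; the function name, the 5% tail, and the soft maximum of 1800 Myr are hypothetical, while the 1174 Myr minimum is the Bangiomorpha value listed in the calibration table.

```python
import math


def offset_exponential(t_min: float, soft_max: float, tail: float = 0.05):
    """Build pdf and cdf functions for an exponential prior offset to t_min.

    The rate is chosen so that a fraction `tail` of the probability mass
    lies beyond soft_max (a "soft" rather than hard upper bound).
    """
    if soft_max <= t_min:
        raise ValueError("soft_max must exceed t_min")
    if not 0.0 < tail < 1.0:
        raise ValueError("tail must be between 0 and 1")
    lam = -math.log(tail) / (soft_max - t_min)

    def pdf(t: float) -> float:
        return 0.0 if t < t_min else lam * math.exp(-lam * (t - t_min))

    def cdf(t: float) -> float:
        return 0.0 if t < t_min else 1.0 - math.exp(-lam * (t - t_min))

    return pdf, cdf


# Minimum from the calibration table (Bangiomorpha, 1174 Myr);
# the soft maximum below is a made-up value for illustration.
pdf, cdf = offset_exponential(t_min=1174.0, soft_max=1800.0)
print(f"P(age <= soft max) = {cdf(1800.0):.2f}")
```

Other density shapes (lognormal, gamma) are also used for the same purpose; the exponential is chosen here only because it needs nothing beyond the standard library.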

Warnock et al, 2011

MCH – 50 Years

• “The molecular clock cannot be accurate in its details, in other words, it would be illusory to think that mutations would actually happen at nearly identical intervals…

Bernardi, 2012

MCH – 50 Years

• … the question then is, are the mean values of these intervals meaningful? I made the assumption that, yes, they characterize these molecules and that one could reach reasonable estimates of the actual age of various important stages in the development of this type of analysis.”
• Emile Zuckerkandl

Bernardi, 2012

Current Directions in MCH

• Invoking biological variables (Ayala, 1999)
  – Generation time
  – Population size
  – Species-characteristic biological processes
  – Change in protein function
  – Natural selection

Current Directions in MCH

Edwards, 2009

Current Directions in MCH

• Shorter timescales (Peterson et al, 2009)
• Matching specific lineages to the fossil record
• Using mtDNA in addition to nDNA (Galtier et al, 2009)
• New sampling methods (Baele et al, 2012; Tamura et al, 2011)

New Frontiers for MCH

Koonin, 2012

BMC Evolutionary Biology 2001, 1:4 http://www.biomedcentral.com/1471-2148/1/4

eral signature of the symbiotic origin of eukaryotes [2,3] and horizontal gene transfer (HGT) of symbiont genes to the nucleus [4–9]. On the one hand, this complexity resulting from HGT can obscure some aspects of evolutionary history [8]. However, HGT also can provide the means to investigate otherwise difficult questions, such as inferring the number of symbiotic events and estimating the time of those events. This is the approach that we take in this study.

The goal of this study is to estimate the timing of evolutionary events involved in the origin of eukaryotes (Fig. 1), including the related origin of oxygenic photosynthesis. The latter is believed to have occurred only in cyanobacteria [10] and preceded the symbiotic event leading to the origin of eukaryotes. The earliest biomarker evidence of eukaryotes is at 2.7 Ga [11] and the earliest fossils appear 2.1 Ga [12]. The fossil record of cyanobacteria has been argued to extend to 3.5 Ga [13] but the biomarker evidence at 2.7–2.8 Ga [14,15] usually is considered to be the earliest record of cyanobacteria [10]. However, the 2-methylhopane biomarker of cyanobacteria has been detected in lower abundance in other prokaryotes, and many taxa (especially anaerobic species) have not been examined for the biomarker [15–17]. Also, the origin of oxygenic photosynthesis may have occurred at some time later than the origin of cyanobacteria. Geologic evidence bearing on the origin and rise in oxygen likewise has been debated [18,19]. Although the existence of banded iron formations prior to 3 Ga sometimes has been used as evidence for the early evolution of oxygenic photosynthesis, oxygen-independent mechanisms of iron deposition are known [20].

The use of sequence changes to estimate the time of these early events also has its assumptions and limitations [21–23]. Nonetheless, many proteins contain conserved regions of amino acid sequence throughout prokaryotes and eukaryotes that permit alignment and analysis. The most extensive of these analyses have found that all major events related to the origin of eukaryotes occurred about 2.0–2.2 Ga [5,21]. This includes the divergence of archaebacteria and archaebacterial proteins in eukaryotes, the origin of cyanobacteria, and the divergence of eubacteria and eubacterial proteins in eukaryotes (the latter presumably reflecting symbiosis). However, these times were not adjusted for lineage-specific rate differences that have been discovered subsequently [23]. Here, we estimate the time of these events with protein sequences from complete genomes and consideration of lineage-specific rate variation.

Figure 1. Working model of gene relationships used in this study. Eukaryotic proteins trace back to four different locations in the evolutionary tree of prokaryotes. The divergence between archaebacteria and eubacteria (last common ancestor, LCA), archaebacteria and eukaryotes (AK), and between cyanobacteria and other eubacteria (BC) are believed to represent speciation events between populations of prokaryotes. The remaining three divergence events are considered to reflect horizontal gene transfer following symbiosis: (1) between an archaebacterium and a eubacterium leading to the origin of eukaryotes (BK-o), (2) between an α-proteobacterium and a eukaryote leading to the origin of mitochondria (BK-m), and (3) between a cyanobacterium and a eukaryote leading to the origin of plastids (BK-p). In this study, divergence times are estimated for AK, BC, BK-o, and BK-m. The divergence time of a fifth event (not shown), the speciation event between a eukaryote (Giardia) and other eukaryotes (GK), also is estimated. Branch lengths are not proportional to time.

Results

Rate differences

The shape parameter (α) of the gamma distribution used to account for rate variation among sites was found to differ consistently between calibration taxa and the overall data set for each gene (Fig. 2), requiring a dual-gamma approach (see Methods). Also, eukaryotic protein sequences were found to have an increased rate of evolution compared with prokaryotic sequences regardless of their archaebacterial or eubacterial origin (Fig. 3A). Average eukaryote rates were 1.37 (AK), 1.18 (BK-o), and 1.38 (BK-m) times the rate of the most closely related prokaryote in constant rate proteins (1.55, 1.24, and 1.56 in all proteins, respectively). Besides this general pattern, which may reflect fundamental differences between prokaryotes and eukaryotes (e.g., recombination), there are further differences among eukaryotes. In comparing rates of evolution in eukaryotic sequences derived independently from eubacteria and archaebacteria in the same protein, those derived from eubacteria (in all cases, BK-o) were found to be evolving at roughly twice the rate as their archaebacteria-derived counterparts (Fig. 3B). The slope was 2.01 and the correlation coefficient was 0.54 (n = 14 comparisons in seven proteins).

Figure 2. Differences in rate variation among sites (gamma parameter). Fraction of gamma parameters (64 proteins) measured from entire data sets for each protein (blue, prokaryotes and eukaryotes) and from subsets containing only calibration taxa (red, eukaryotes).

Two other rate comparisons were limited by a small number of proteins: eubacteria versus eukaryotes (KA) and eubacteria versus archaebacteria. Only three proteins were available in the first comparison and all three showed a faster rate in eukaryotes (1.43, 1.12, 1.23; x̄ = 1.26). This result differs from that reported elsewhere [23], in which the two rates were not significantly different. In the second case, we found that archaebacteria are evolving at a slower rate than eubacteria, as was noted elsewhere [23]. In our case, regression of archaebacterial branch length versus eubacterial branch length, fixed through the origin, resulted in a slope of 0.93 and correlation coefficient of 0.65 (n = 9 proteins). However, in both of these comparisons, rate tests did not yield significant rate differences probably because of the short length of most proteins. Sample size (eight protein sets) also was limited in the Kollman and Doolittle study [23]. Taken together these data suggest the following relative order of rate differences: archaebacteria < eubacteria < eukaryotes (archaebacterial origin) < eukaryotes (eubacterial origin). As additional genomic data become available, more proteins will be useful and greater precision in these rates and rate differences will be possible.

Phylogeny and time estimation

It has been suggested that eukaryotic genes and proteins of archaebacterial origin are more closely related to one lineage of archaebacteria (Crenarchaeota; "eocytes") than the other major lineage (Euryarchaeota) [24]. If true, this would bear on our time estimate for the divergence of archaebacteria and eukaryotes. Thus, we conducted a phylogenetic analysis of 72 proteins containing representatives of the two major groups of archaebacteria, eukaryotes, and eubacteria. At the 95% bootstrap significance level, 19 proteins supported archaebacterial monophyly whereas none supported the eocyte hypothesis (Crenarchaeota + Eukaryota). This indicates that the lineage of archaebacteria leading to the eukaryote nuclear genome diverged prior to the split between the Crenarchaeota and Euryarchaeota. As noted previously [1], most (in this case, 21 out of 36) eukaryotic proteins with archaebacterial affinity are informational (involved in transcription, translation, and related processes).

Among 41 eukaryotic proteins with eubacterial affinities, Rickettsia is most closely related to eukaryotes in phylogenetic analyses of nine individual proteins. This agrees with the genetic and cell biological evidence implicating an α-proteobacterium as progenitor of the mitochondrion [25] and supports the hypothesis that these nine eukaryotic proteins owe their origin to that symbiotic event [2]. However, the remaining 32 proteins do not show this pattern but instead identify other species or groups of eubacteria as closest relative. Unlike Rickettsia, no other single species appears as closest relative in more than three proteins, but rather most (19/32 proteins) identify groups of species as closest relative (e.g., Fig. 4A). To further explore this question we combined sequences of all 11 proteins with a full representation of eubacterial taxa (11 species). In the combined analysis, eukaryotes fall significantly outside of the well-defined clade containing α- and γ-proteobacteria (Fig. 4B). The relatively basal and unresolved position of eukaryotes is consistent with the preponderance of single proteins showing different groups of species as closest relative. Three individual proteins showed significant bootstrap support for a Rickettsia-eukaryote cluster in four-taxon analyses (rooted with an archaebacterium) whereas four proteins significantly supported a Rickettsia-Escherichia cluster that excluded the eukaryote.

Divergence time estimates from the multigene (MG) and average distance (AD) approaches are similar, but rate-adjusted times are older than unadjusted times (Table 1). The time estimate for the AK divergence averages 4.0 Ga and the remaining times range from 1.8 to 2.7 Ga. The time estimate for BK-o (2.7 ± 0.20 Ga) was older than the estimate for BK-m (1.8 ± 0.20 Ga) whereas the time estimate for the origin of Giardia (2.2 ± 0.12 Ga) was intermediate. The BC time estimate was 2.6 ± 0.26 Ga.

Discussion

The purpose of this study was to examine the temporal relationship between the origin of eukaryotes and events

Figure 3. Differences in rates of protein evolution. (A) Prokaryotes versus eukaryotes. Histogram of ratios of eukaryote to prokaryote evolutionary rates. Eukaryotes are derived from three prokaryote lineages: BK-o (31 proteins, blue), BK-m (8 proteins, black), and AK (36 proteins, red). (B) Eukaryotes versus eukaryotes. Protein distances between two species of eukaryotes (KA1 and KA2 in inset) of archaebacterial origin compared with distances between the same two species of eubacterial origin (KB1 and KB2); slope (m) = 2.01. In each case, all sequences being compared are from the same protein. The mirrorlike phylogeny (inset) is the result of horizontal gene transfer and speciation rather than gene duplication.
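The slope quoted for panel B compares two sets of pairwise protein distances. The text states that one of the paper's regressions was fixed through the origin; the sketch below assumes the same form for illustration and uses made-up distances, so neither the function name nor the data come from the study. For a line forced through the origin, the least-squares slope is the sum of x·y divided by the sum of x².

```python
def slope_through_origin(x: list[float], y: list[float]) -> float:
    """Least-squares slope of y = m * x with the intercept fixed at zero."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    sum_xx = sum(a * a for a in x)
    if sum_xx == 0:
        raise ValueError("x must contain a non-zero value")
    return sum(a * b for a, b in zip(x, y)) / sum_xx


# Hypothetical pairwise distances for the same two species:
# x = proteins of archaebacterial origin, y = proteins of eubacterial origin.
archaeal_origin = [0.10, 0.20, 0.15, 0.30]
eubacterial_origin = [0.22, 0.38, 0.33, 0.57]
m = slope_through_origin(archaeal_origin, eubacterial_origin)
print(f"slope m = {m:.2f}")
```

A slope near 2 on real data would correspond to the roughly twofold rate difference described in the text.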

Figure 4. Phylogenetic relationships of eubacteria and eukaryotes rooted with archaebacteria. Neighbor-joining bootstrap consensus trees showing significant (≥ 95%) bootstrap values; maximum-likelihood and maximum parsimony produced identical topologies for significant nodes. (A) Cytoplasmic alanyl tRNA synthetase, showing BK-o pattern: eukaryotes most closely related to eubacteria but not closely related to the α-proteobacterium (Rickettsia). (B) Combined analysis of all proteins with full complement (11 species) of eubacterial taxa and showing eubacterial-eukaryote relationship (11 proteins, 1596 amino acids); significant groups remain after removal of cytoplasmic alanyl tRNA synthetase.

Table 1: Divergence time estimates (billion years ago)

                                   Multigene         Average-distance
Comparison                         All    Constant   All    Constant    Mean ± SE*

Archaebacteria-eukaryotes (AK)     3.18   3.42       3.05   3.58        3.50 ± 0.25
  Rate adjusted                    4.11   3.86       3.69   4.09        3.97 ± 0.32
Eubacteria-eukaryotes (BK-o)       2.31   2.45       2.27   2.48        2.46 ± 0.14
  Rate adjusted                    2.54   2.76       2.51   2.70        2.73 ± 0.20
Eubacteria-cyanobacteria (BC)      1.68   1.73       1.92   1.85        1.79 ± 0.29
  Rate adjusted                    2.56   2.52       2.66   2.60        2.56 ± 0.26
Giardia-eukaryotes (GK)            2.82   2.54       3.32   2.46        2.50 ± 0.22
  Rate adjusted                    2.72   2.31       2.04   2.16        2.23 ± 0.12
Eubacteria-eukaryotes (BK-m)       1.70   1.72       1.47   1.39        1.56 ± 0.29
  Rate adjusted                    2.02   2.07       1.72   1.61        1.84 ± 0.20

*Mean (Multigene, constant rate + Average-distance, constant rate) ± standard error (Multigene, constant rate).
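The footnote says the Mean column averages the two constant-rate estimates. As a quick consistency check, the sketch below recomputes that average from the table's own numbers and compares it with the reported mean; the variable and function names are illustrative, and the tolerance simply allows for two-decimal rounding.

```python
# (multigene constant, average-distance constant, reported mean), in Ga,
# copied from Table 1.
ROWS = {
    "AK": (3.42, 3.58, 3.50),
    "AK rate adjusted": (3.86, 4.09, 3.97),
    "BK-o": (2.45, 2.48, 2.46),
    "BK-o rate adjusted": (2.76, 2.70, 2.73),
    "BC": (1.73, 1.85, 1.79),
    "BC rate adjusted": (2.52, 2.60, 2.56),
    "GK": (2.54, 2.46, 2.50),
    "GK rate adjusted": (2.31, 2.16, 2.23),
    "BK-m": (1.72, 1.39, 1.56),
    "BK-m rate adjusted": (2.07, 1.61, 1.84),
}


def mismatched_rows(rows: dict, tol: float = 0.006) -> list[str]:
    """Return row names whose reported mean differs from the recomputed one."""
    bad = []
    for name, (mg_const, ad_const, reported) in rows.items():
        recomputed = (mg_const + ad_const) / 2
        if abs(recomputed - reported) > tol:
            bad.append(name)
    return bad


print("rows inconsistent with the footnote:", mismatched_rows(ROWS))
```

Every row agrees to within rounding, which supports reading the flattened table with the column order Multigene (all, constant), Average-distance (all, constant), Mean ± SE.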

in Earth history. However, some unexpected results required refinement in methodology. These included finding greater among-site rate variation in the calibration group and different rates of sequence change between prokaryotes and eukaryotes, and between eukaryotes derived from different groups of prokaryotes. By taking into account these variables, the resulting time estimates are more robust and have fewer assumptions. For example, the time estimate for the origin of eukaryotes (BK-o) is not based on a general assumption of rate constancy between prokaryotes (or even eubacteria) and eukaryotes because rates are adjusted for each protein and each comparison. Also, the calibration used for BK-o is not a general eukaryotic calibration but one based exclusively on eukaryote sequences derived from eubacteria. A tradeoff in these improved methods was a reduction in the number of proteins that could be used, which increased the variance of the time estimates. Nonetheless, the phylogenies and time estimates obtained in this study have a bearing on current models for the evolution of eukaryotes.

Until about five years ago, it was generally accepted that there was a prior period (before mitochondria) in the history of eukaryotes [2,26]. The basal position of eukaryotes lacking mitochondria (amitochondriate) in phylogenetic trees [27] was consistent with this supposition as was evidence from sequence signatures [6]. However, molecular phylogenetic studies of several proteins in recent years have suggested that some or all amitochondriate eukaryotes once possessed mitochondria in the past [9]. Based on this new evidence, most current models for the origin of eukaryotes assume only a single symbiotic or fusion event between an archaebacterium and an α-proteobacterium [8,28,29].

Under the single-symbiosis model, eukaryotes should cluster exclusively with an α-proteobacterium (e.g., Rickettsia), among eubacteria. However, our phylogenetic analyses (Fig. 4) instead indicate, significantly, that many eukaryotic proteins originated from one (or more) eubacterial lineages other than α-proteobacteria. The reduced genome of Rickettsia [25] would not explain this result because Rickettsia possesses all of the proteins used in the combined analysis (Fig. 4B). Protein function and location also are consistent with a premitochondrial origin. Only one of the 32 BK-o proteins is restricted to the mitochondrion whereas eight of the nine BK-m proteins are restricted to that organelle. Also, all six of the proteins involved in cellular respiration are in the BK-m group. Based on the serial endosymbiosis theory, the first symbiotic event involved a spirochete [3]. On the other hand, sequence signatures of the heat shock molecular chaperone protein HSP-70 and other evidence have indicated that the first symbiotic event involved a gram-negative eubacterium [6]. Our data are unable to distinguish between these two alternatives but agree with both in implicating an earlier, premitochondrial event. Predation by prokaryotes on early eukaryotes also may have led to HGT.

If two or more symbiotic events were involved, this does not necessarily confirm that any of the living lineages of amitochondriate eukaryotes arose prior to the second (mitochondrial) event. All may have once possessed mitochondria. However, because Giardia arose at an early time (Table 1) and branches near the base of the eukaryote phylogeny, the simplest explanation is that it never possessed mitochondria and is a primary (not secondary) amitochondriate. Although the position of Giardia in some protein phylogenies [30] has been proposed as evidence that it is a secondary amitochondriate, others have urged caution until additional, more conclusive, data become available [6].

Figure 5. Summary diagram showing relationship between timing of evolutionary events (Table 2) and that of Earth and atmospheric histories. Time estimates are shown with ± 1 standard error (thick line) and 95% confidence interval (narrow line). The phylogenetic tree illustrates the radiation of extant eubacterial lineages (blue), and dashed lines with arrows indicate the origin of eukaryotes (BK-o) and origin of mitochondria (BK-m). The earliest divergence (last common ancestor) was not estimated but is placed (arbitrarily) just prior to the AK divergence. The increasing thickness of the eukaryote lineage represents eubacterial genes added to the eukaryote genome through two major episodes of horizontal gene transfer. The rise in oxygen represents a change from <1% to >15% present atmospheric level [34,52], although the time of the transition period and levels have been disputed [19,53].

The number of symbiotic events was important for our primary concern of estimating a timescale for the early evolution of eukaryotes. We find that the divergence between archaebacteria and the lineage leading to eukaryotes (KA) was quite early (~4 Ga), which is about the time of the earliest biomarker evidence of life (3.9–3.8 Ga) [31]. We interpret that divergence to be a speciation event between two lineages of archaebacteria, with KA not becoming "eukaryotic" until the first symbiotic event at 2.7 Ga. The remaining time estimates cluster around the mid-life of Earth (1.8–2.7 Ga). The order of those events falls in a logical sequence: BK-o, BC, and BK-m. For example, the origin of mitochondria appears as the second (not first) symbiotic event, and the origin of cyanobacteria comes before the oxygen-utilizing organelles, mitochondria. Moreover, the timing of these biological events is consistent with the timing of events in geologic and atmospheric history (Fig. 5). Cyanobacteria appear before the major (undisputed) evidence of the rise in oxygen (2.4–2.2 Ga) and mitochondria appear after the rise in oxygen. Also, the estimates for the origin of cyanobacteria and eukaryotes are consistent (within one SE) with the earliest biomarker evidence for those two groups (~2.7 Ga) [11,15]. Phylogenetic analyses of photosynthetic genes and sequence signatures also support a relatively late order of appearance of cyanobacteria among photosynthetic prokaryotes [32,33].

Extensive glaciations occurred in the Paleoproterozoic (~2.4 Ga), and may have been global in extent [34]. It has been proposed that a major rise in oxygen at this time lowered global temperatures and may have triggered the glaciations [35]. If this is true, and given the time estimates here, the evolutionary innovation of oxygenic photosynthesis may have had a relatively rapid impact on the environment. Moreover, this innovation may have caused a mass extinction of prokaryotes at that time, as a result of the toxic effects of oxygen, as suggested by the

Table 1. Calibration constraints for dating the eukaryotic tree of life

Calibration†

| Taxon | Fossil | Eon* | Min | Dist | Ref(s). |
|---|---|---|---|---|---|
| Amniota | Westlonthania | Phan | 328.3 | 4, 3 | (54) |
| Angiosperms | Oldest angiopollen | Phan | 133.9 | 2, 10 | (55) |
| Ascomycetes | Paleopyrenomycites | Phan | 400 | 4, 50 | (56) |
| Coccolithophores | Earliest Heterococcolith | Phan | 203.6 | 2, 8 | (57) |
| Diatoms | Earliest diatoms | Phan | 133.9 | 2, 100 | (58) |
| Dinoflagellates | Earliest gonyaulacales | Phan | 240 | 2, 10 | (59) |
| Embryophytes | Land plant spores | Phan | 471 | 2, 20 | (60) |
| Endopterygota | Mecoptera | Phan | 284.4 | 5, 5 | (61) |
| Eudicots | Eudicot pollen | Phan | 125 | 2, 1.5 | (62, 63) |
|  | Moyeria | Phan | 450 | 2, 40 | (64) |
|  | Oldest forams | Phan | 542 | 2, 200 | (65) |
| Gonyaulacales | Gonyaulacaceae split | Phan | 196 | 2, 10 | (59) |
| Pennate diatoms | Oldest pennate | Phan | 80 | 3, 5 | (66) |
| Spirotrichs | Oldest tintinnids | Phan | 444 | 2.5, 100 | (67) |
| Trachaeophytes | Earliest trachaeophytes | Phan | 425 | 4, 2.5 | (68) |
| Vertebrates | Haikouichthys | Phan | 520 | 3, 5 | (69) |
|  | LOEMs, sponge biomarkers | Protero | 632 | 2, 300 | (70, 71) |
| Arcellinida | Paleoarcella | Protero | 736 | 2, 300 | (12) |
|  | Kimberella | Protero | 555 | 2, 30 | (72) |
| Chlorophytes | Palaeastrum | Protero | 700 | 2.5, 300 | (73) |
|  | Gammacerane | Protero | 736 | 2.5, 300 | (74) |
| Florideophyceae | Doushantuo red algae | Protero | 550 | 2.5, 100 | (75) |
| ‡ | Bangiomorpha | Protero | 1174 | 3, 250 | (11) |

*Eon: Phan, Phanerozoic; Protero, Proterozoic. Proterozoic calibrations are excluded from Phan analyses.
†Calibration constraints are specified for BEAST using a gamma distribution with a minimum date in Ma based on the fossil record parameters as indicated: min, minimum divergence date; dist, gamma prior distribution (shape, scale). See Table S3 for details of PhyloBayes calibrations.
‡In the All 720 analysis (c), the minimum age constraint for the red algae node is set to 720 Ma.

dates, especially for the estimated date of the root itself, which generally changed by <100 million years (myr; Fig. 1A). PhyloBayes estimates generally showed more uncertainty than those from BEAST analyses, but around similar means. Similarly, estimates were robust to changing models (uncorrelated or autocorrelated) and to the inclusion of only Phanerozoic (Phan) or all calibrations (All) with one exception: under the autocorrelated Cox–Ingersoll–Ross (CIR) model, estimates are much more recent in Phan analyses (1038 Ma and 1180 Ma; Fig. 1A).

Impact of Calibration Constraints on Estimates of the Origin of Extant Eukaryotes. We assessed the impact of including Proterozoic fossils, which are considered controversial by some (6, 7), by analyzing datasets without these seven calibration constraints (Phan analyses). In BEAST analyses, the exclusion of Proterozoic fossils shifted estimated divergence times toward the present, but not dramatically so: estimates for the mean age of root of extant eukaryotes fall between 1506–1471 Ma in Phan analyses [95% highest-probability density (HPD) range 1643–1347 Ma; Fig. 1A, Figs. S1, S5, and S7, analyses b, f, and h] compared with 1837–1717 Ma (95% HPD range 1954–1601 Ma; Figs. 1A and 2 and Figs. S4 and S6; analyses a, e, and g) when Proterozoic fossils were included (All analyses). Similar dates were recovered in Phan and All PhyloBayes analyses when the uncorrelated gamma (UGAM) model of the molecular clock was assumed (Fig. 1A, analyses i–l).

Of the seven Proterozoic calibration points used in our analyses, only the Bangiomorpha point is controversial in terms of either systematic attribution or age. The Bangiomorpha calibration constraint is more than 400 myr older than our other Proterozoic constraints (Table 1). To determine whether this calibration point drives results in analyses with All calibrations, we assessed the age of the root with a much more conservative estimate for the age of this red alga (All 720; Fig. 1, analysis c). A number of factors place the age of Bangiomorpha ∼1200 Ma (SI Text); however, given the importance of the fossil we assigned an age of 720 Ma to this constraint, representing the absolute younger bound of the Hunting Formation, Canada, in which it is found (SI Text) (11). In BEAST, placing the Bangiomorpha constraint at 720 Ma shifted the estimated age of the root by only 95 myr toward the present (Fig. 1A and Fig. S3, analysis c).

The autocorrelated CIR model combined with the low number of substitutions on deep branches of the eukaryotic tree appears more sensitive to the distribution of calibration dates included in these analyses. Under the CIR autocorrelated model, a consistent age was estimated with All calibrations included (1798–1691 Ma; Fig. 1A, analyses m and o), although confidence intervals are greater in PhyloBayes analyses in general (Fig. 1A, analyses i–p). However, excluding Proterozoic calibration points did cause estimated ages to shift more than 600 myr younger under the CIR model (1180–1038 Ma; Fig. 1A, analyses n and p), pushing the estimated age for the root of extant eukaryotes younger than the widely accepted date for the Bangiomorpha fossils. Similarly, the CIR analyses in PhyloBayes were sensitive to the age of the Bangiomorpha constraint, shifting more than 500 myr younger to 1296 Ma and 1167 Ma in analyses with All calibration points rooted with Opisthokonta and “Unikonta,” respectively (Dataset S1). The necessity of using PhyloBayes to explore the differences between autocorrelated and uncorrelated models introduces confounding factors, as PhyloBayes requires both uniform distributions around calibration points and a fixed tree topology. Given that calibration points are likely best represented by more informative distributions, and that the topology of the tree is not fully known, we focus the rest of our discussions on the results from BEAST, although data from all PhyloBayes analyses are available in Fig. 1A and Dataset S1.

Origin of Major Clades. In most analyses, the major clades of extant eukaryotes diverged before 1200 Ma, with SAR, , and arising within a similar time frame, as evidenced by overlapping 95% HPD ranges (Figs. 1 and 2, Figs. S1–S7, and Dataset S1). The 95% HPD intervals are wider for clades with few

calibration points, such as Excavata and Amoebozoa (Fig. 1B). Estimates for the last common ancestor of extant Opisthokonta are younger than the other clades, at 1389–1240 Ma in analyses with All calibration constraints.

Exclusion of Proterozoic calibration constraints (Phan analyses) shifted age estimates for the origins of major extant eukaryotic clades younger by 200–300 myr (Fig. 1B). Differences in divergence times are relatively small for nested clades—e.g., the 95% HPD for Alveolata shifts from 1445 to 1236 Ma in analysis a (Fig. 2) to 1206–1020 Ma with only Phanerozoic calibration points (analysis b; Fig. S1). Not surprisingly, the differing calibration schemes had their most dramatic impact on the estimated age of the red algae, which changes from 1285 to 1180 Ma 95% HPD (Fig. 2) to 959–625 Ma 95% HPD when Proterozoic calibration points, including the constraint on red algae at 1174 Ma in accordance with the widely cited age for Bangiomorpha, are excluded (Fig. S1). Estimated ages of major clades were also much younger in analyses using the CIR model with Phan calibrations (analyses n and p; Dataset S1).

The topology of the eukaryotic tree produced through coestimation of phylogeny and divergence times in BEAST is broadly consistent with other analyses (SI Text) (25, 26). Hence, the BEAST topology was also used for the PhyloBayes analyses, which require a fixed topology. Though the relationships among the photosynthetic eukaryotes remain uncertain (25), our analyses suggest that many photosynthetic clades, such as red and green algae, diverged within a similar time frame (Fig. 2). These results imply an early acquisition of photosynthesis in eukaryotes, in accordance with both previous molecular clock estimates (30) and the ∼1200 Ma age assigned to the red algal fossil Bangiomorpha (11).

Parfrey et al. PNAS | August 16, 2011 | vol. 108 | no. 33 | www.pnas.org/cgi/doi/10.1073/pnas.1110633108

Fig. 1. Summary of mean divergence dates for the most recent common ancestor of major clades of extant eukaryotes. Letters are at the mean divergence time and denote analyses, as detailed in Table S1. Error bars represent 95% HPD for BEAST analyses (a–h) and the 95% confidence interval for PhyloBayes (analysis i–p). (A) Estimated age of the root of extant eukaryotes across analyses. Root position: Opis, root constrained to Opisthokonta; Uni, root constrained to “Unikonta”; Estim, root estimated by BEAST. Calibration: All, all Phanerozoic and Proterozoic CCs; Phan, Phanerozoic CCs only; 720, All CCs with the minimum age of red algae set to 720 Ma. d = 91 taxa. (B) Estimated ages of major clades from BEAST analyses.

Fig. 2. Time-calibrated tree of extant eukaryotes using All calibration points, 109 taxa, and root constrained to Opisthokonta. Nodes are at mean divergence times and gray bars represent 95% HPD of node age. (Upper) Geological time scale; (Lower) Absolute time scale in Ma. Thick vertical bars demarcate eras and thin vertical lines denote periods, with dates derived from the 2009 International Stratigraphic Chart. Node calibrated with Phanerozoic fossils (•); node calibrated with Proterozoic fossils (◯). Estimated ages of calibrated nodes differ from calibration constraints (Table 1) because they have been modified by relaxed clock analysis of sequence data.

Discussion

The molecular clock analyses presented here suggest that the last common ancestor of extant eukaryotes lived between 1866 and 1679 Ma when both Phanerozoic and Proterozoic fossils are considered. We favor these more-inclusive analyses as they should reveal a more accurate picture of eukaryotic diversification, especially because the chosen fossils are widely accepted by paleontologists, and calibration constraints were assigned in a conservative manner that accounts for age uncertainties. Estimated ages are younger when we remove Proterozoic calibration constraints, though not dramatically so, with the notable exception of the autocorrelated model CIR as implemented in PhyloBayes with only Phanerozoic calibrations. Thus, our results tend to place the last common ancestor of extant eukaryotes deep within the Proterozoic Eon.

Our estimates for the timing of the origin of extant eukaryotes are in line with fossil evidence (2, 13), but reject the hypothesis that eukaryotes originated only 850 Ma (6, 7). Fossils provide minimum dates, leaving open the possibility that clades evolved much earlier than their first fossil appearance (2, 31). Thus, it is not surprising that divergence times for many eukaryotic clades are older than their first unambiguous fossil occurrence (Table 2).

The paleontological literature contains some references to eukaryotic fossils older than our estimate of the last common ancestor. In some cases, these paleontological reports are incorrect or ambiguous. For example, large carbonaceous fossils assigned to the genus Grypania were originally reported to be older than our molecular clock estimate (32), but more recent radiometric dates indicate an age of 1874 ± 9 Ma (33), consistent with the clock analyses presented here. Older still are the 50- to 300-μm spheroidal microfossils described from ∼3200 Ma rocks by Javaux et al. (34), and proposed as possible eukaryotes by Buick (35), and sterane biomarkers from 2700 Ma shales (3). Whether these materials record Archean eukaryotes remains a subject of debate (34, 36). Our molecular clock estimates suggest that if these fossils do represent eukaryotes, they record stem lineages—early representatives of eukaryotic groups that went extinct—that were present before the emergence of extant eukaryotic clades.

The major lineages of extant eukaryotes (Opisthokonta, SAR, Excavata, and Amoebozoa) are projected to have diverged from one another by the Mesoproterozoic era (1600–1000 Ma), relatively early in the history of the (Fig. 1 and Table 2). This, in turn, suggests that these lineages were present for hundreds of millions of years before the observed increase in the abundance and diversity of eukaryotic microfossils beginning ∼800 Ma (2, 37–40). Our molecular clock estimates indicate that stem groups were present well before recognizable members of crown lineages—monophyletic groups consisting of living representatives and their ancestors—diversified. A similar pattern of long stems preceding diversification is seen in and and may be a consistent pattern in evolution (38).

Fossils and our molecular clock analyses agree that eukaryotes originated and diversified during a time when oceans differed substantially from the modern seas. Increasingly, geochemical data indicate that for much of the Proterozoic eon, mildly oxic surface waters lay above an oxygen-minimum zone that was persistently anoxic and commonly sulfidic (41, 42). Such conditions are compatible with scenarios for eukaryogenesis that rely on anaerobic methanogens in symbiotic partnership with facultatively aerobic proteobacteria or sulfate reducers (see references in ref. 43), because facultatively anaerobic mitochondria may have enabled early eukaryotes to live in the sulfidic Proterozoic oceans (44). Because sulfide interferes with the function of mitochondria in aerobically respiring eukaryotes, the radiation of diverse species within eukaryotic clades may have become possible only when sulfidic subsurface waters began to wane about 800 Ma (45). Alternatively, early eukaryotic evolution may have occurred in coastal environments sheltered from the impact of sulfidic waters or in freshwater systems, which are both poorly sampled by the geologic record and not impacted by sulfidic oceanic water masses (46). Consistent with this view, moderately diverse assemblages of fossil eukaryotes occur in well-ventilated lake deposits of the 1200 to 900 Ma Torridonian succession, Scotland (47, 48), and in coastal marine deposits of the ∼1500 to 1400-Ma Roper Group, Australia (49).

Within Proterozoic oceans, low concentrations of biologically available nitrogen may also have inhibited the diversification of photosynthetic eukaryotes (50). Many cyanobacteria and other photosynthetic bacteria are capable of nitrogen fixation, ameliorating the impact of nitrate and ammonia limitation on primary production. Eukaryotes, however, have no such capacity; thus, it may not be a coincidence that biomarkers indicating an expanding importance of algae in marine primary production occur in conjunction with geochemical data recording the spread of oxygen through later Neoproterozoic oceans (51). In our analyses, the clade that contains extant photosynthetic taxa, including green algae plus land and red algae, arose between 1670 and 1428 Ma, but diversification within these lineages occurred later in the Neoproterozoic and may correspond to a changing redox profile in the oceans (Fig. 2).
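How far the estimated clade ages precede first fossil appearances can be made concrete with simple arithmetic on the values reported in Table 2. The following is a minimal sketch and not part of the original analyses: the names `TABLE_2` and `minimum_gap` are hypothetical, and measuring the gap from the youngest mean estimate (the most conservative reading of each range) is an assumption made here for illustration.

```python
# Mean age ranges (All analyses) and oldest fossils from Table 2, in Ma.
# Tuple layout: (clade, youngest mean estimate, oldest mean estimate, oldest fossil).
TABLE_2 = [
    ("Extant eukaryotes", 1679, 1866, 1200),
    ("Amoebozoa", 1384, 1624, 800),
    ("Excavata", 1510, 1699, 450),
    ("Opisthokonta", 1240, 1481, 632),
    ("Rhizaria", 1017, 1256, 550),
    ("SAR", 1365, 1577, 736),
]


def minimum_gap(youngest_estimate, oldest_fossil):
    """Smallest gap (myr) between a clade's estimated age and its oldest fossil."""
    return youngest_estimate - oldest_fossil


# Illustrative assumption: use the youngest mean estimate so the gap is a lower bound.
gaps = {clade: minimum_gap(young, fossil) for clade, young, _old, fossil in TABLE_2}

for clade, gap in gaps.items():
    print(f"{clade}: estimate precedes oldest fossil by at least {gap} myr")
```

Under these assumptions every clade listed in Table 2 has a positive gap, which is the pattern expected if fossils provide only minimum dates.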

Fig. S1. Time-calibrated tree of eukaryotes using Phanerozoic calibration points, 109 taxa, rooted on Opisthokonta, and constructed in BEAST (analysis b). Nodes are at mean divergence times, and gray bars represent 95% HPD of node age. (Upper) Geological time scale. (Lower) Absolute time scale (in Ma). Thick vertical bars demarcate eras, and thin vertical lines denote periods, with dates derived from the 2009 International Stratigraphic Chart.

Parfrey et al. www.pnas.org/cgi/content/short/1110633108

Table 2. Comparison of major node ages to fossil dates

| Major clade | Estimated age, Ma | Oldest fossil, Ma | Ref. |
|---|---|---|---|
| Eukaryotes | * | 1800 | (2) |
| Extant eukaryotes | 1679–1866 | 1200 | (11) |
| Amoebozoa | 1384–1624 | 800 | (12) |
| Excavata | 1510–1699 | 450 | (64) |
| Opisthokonta | 1240–1481 | 632 | (71) |
| Rhizaria | 1017–1256 | 550 | (65) |
| SAR | 1365–1577 | 736 | (74) |

Estimated age is range of mean dates from All analyses.
*The age of the root of all eukaryotes is not estimated because molecular clock studies can only inform the timing of extant clades.

Discrepancy Between These and Previous Molecular Clock Studies. Previous molecular clock studies yielded vastly different dates for the root of extant eukaryotes, ranging from 3970 to 1100 Ma (1). In a recent analysis of small subunit ribosomal DNA (SSU-rDNA) from 83 broadly sampled eukaryotes, Berney and Pawlowski (4) placed the origin of eukaryotes at 1100 Ma, a conclusion that was robust to changing the position of the root. They had numerous Phanerozoic calibration constraints specified as either minimum or maximum divergence dates (4), but they found that including Proterozoic calibration points, such as Bangiomorpha at 1200 Ma, shifted their estimates of the origin and diversification of eukaryotes by 1000–2500 Ma. The age discrepancy observed by Berney and Pawlowski (4), when Proterozoic calibration constraints are included, contrasts sharply with the relative stability of dates seen in our analyses (Fig. 1A). We hypothesize that the increased gene and taxon sampling, as well as the use of flexible prior distributions of calibration points as implemented in BEAST, are major factors contributing to the stability of molecular clock estimation in our analyses.

Conclusion

Our molecular clock analyses yield a timeline of eukaryotic evolution that is congruent with the paleontological record and robust to varying analytical conditions. According to our analyses, crown (extant) groups of eukaryotes arose in the Paleoproterozoic era (2500–1600 Ma) and began to diversify soon thereafter, suggesting that early eukaryotic evolution was influenced by anoxic and sulfidic water masses in contemporaneous oceans. The stability in our analysis across a range of variables is a welcome departure from the large age discrepancies reported in earlier molecular analyses, reflecting improved paleontological interpretation, advancements in molecular methods, and the rapidly growing body of molecular data from diverse eukaryotes.

Materials and Methods

Alignments. Alignments are derived from the 15 protein-coding genes analyzed in Parfrey et al. (dataset 15:10 of ref. 25). Using this 88-taxon dataset as a starting point, taxa were added to capture additional lineages, particularly those with fossil data available (Table S2). Rapidly evolving taxa (e.g., Encephalitozoon cuniculi) and orphans (e.g., anathema) were removed to minimize rate heterogeneity for the clock analysis. The resulting 109-taxon data matrix includes 5,696 characters, with each taxon having between three and 15 of the target genes (36% missing character data; Table S2; analyses a–c and e–p). A 91-taxon alignment was created by removing additional taxa with either long branches or high levels of missing data to ensure that our results were not driven by these potential sources of artifact (analysis d).

Molecular Dating Analyses. Dating analyses were predominantly performed in BEAST v1.5.4 (52), and we also assessed results obtained in PhyloBayes 3.2f (53) (see SI Text for analysis details). BEAST offers a number of desirable features, including flexible specification of prior distributions that enable the uncertainty of the fossil record to be realistically modeled, as well as the ability to coestimate divergence times with topology (15). We compared divergence dates for eukaryotes obtained from different models to assess whether our conclusions were driven by the choice of a particular model (SI Text, Fig. 1, and Table S1).

Calibration Constraints. Calibration constraints were specified with prior distributions to incorporate errors arising from age dating, stratigraphy, and clade assignment (Table 1). The impact of Proterozoic fossils was assessed by analyzing the data with only the 16 Phanerozoic calibration constraints (Phan analyses b, f, h, j, l, n, and p) or with Phanerozoic and Proterozoic calibration constraints (All analyses a, c–e, g, i, k, m, and o). Calibration constraints were specified with prior distributions in BEAST using BEAUTi v1.5.4 (52) and were derived from a conservative reading of the fossil record (i.e., we err toward younger rather than older ages; SI Text). Distributions were specified with long tails unless the fossil record provided minimum-divergence information. Calibration constraints used for PhyloBayes had to be specified as a uniform distribution (Table S3).

Assessing Impact of the Root on the Inferred Age of Eukaryotes. Molecular clock analyses require a rooted tree. However, the position of the eukaryotic root remains an open question; therefore, we compared age estimates from molecular clock analyses with multiple positions for the root of extant eukaryotes. First, the root was constrained to the branch leading to the Opisthokonta or to Opisthokonta + Amoebozoa (“Unikonta”) in accordance with current hypotheses (see SI Text for discussion of the position of the eukaryotic root). In BEAST, the root was specified by constraining a monophyletic ingroup. PhyloBayes requires the tree topology to be fixed, and we used the tree in Fig. 2 rooted on either Opisthokonta or “Unikonta”. Finally, for the third condition, the root was estimated by the molecular clock criterion, as implemented in BEAST (SI Text), which yielded variable estimates of the location of the root.

ACKNOWLEDGMENTS. We thank Ben Normark, Rob Dorit, and Sam Bowser for useful discussions, and Jeff Thorne and Bengt Sennblad for helpful discussions about molecular clock models. This manuscript has been improved following the comments of Emmanuelle Javaux, Andrew Roger, and Heroen Verbruggen. We thank Jessica Grant and Tony Caldanaro for technical help. This research was supported by the National Aeronautics and Space Administration Astrobiology Institute (A.H.K.) and by National Science Foundation Assembling the Tree of Life Grant 043115 and National Science Foundation Systematics Grant 0919152 (to L.A.K). D.J.G.L. is supported by Conselho Nacional de Desenvolvimento Científico e Tecnológico-Brazil Doutorado no Exterior Fellowship 200853/2007-4.
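The calibration constraints described under Calibration Constraints (a gamma prior with a minimum date, Table 1) can be illustrated with a short sketch. It assumes that each constraint behaves as a gamma distribution with the listed shape and scale, shifted so that its lower bound sits at the fossil minimum; under that assumption the prior mean is min + shape × scale and the standard deviation is sqrt(shape) × scale. The function names are hypothetical and this is not BEAST code or its API, only plain arithmetic and standard-library sampling on a few rows of Table 1.

```python
import math
import random

# Selected rows of Table 1: fossil -> (minimum age in Ma, gamma shape, gamma scale).
CALIBRATIONS = {
    "Westlonthania": (328.3, 4.0, 3.0),
    "Paleopyrenomycites": (400.0, 4.0, 50.0),
    "Paleoarcella": (736.0, 2.0, 300.0),
    "Bangiomorpha": (1174.0, 3.0, 250.0),
}


def offset_gamma_summary(min_age, shape, scale):
    """Return (mean, sd) in Ma of a gamma(shape, scale) prior shifted by min_age.

    Assumption for illustration: the fossil minimum acts as a hard offset.
    """
    mean = min_age + shape * scale
    sd = math.sqrt(shape) * scale
    return mean, sd


def sample_prior(min_age, shape, scale, n, seed=0):
    """Draw n node ages (Ma); no draw is younger than the fossil minimum."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale) as (alpha, beta).
    return [min_age + rng.gammavariate(shape, scale) for _ in range(n)]


for fossil, (min_age, shape, scale) in CALIBRATIONS.items():
    mean, sd = offset_gamma_summary(min_age, shape, scale)
    print(f"{fossil}: min {min_age} Ma, prior mean {mean:.1f} Ma, sd {sd:.1f} myr")
```

Under these assumptions the Bangiomorpha constraint (minimum 1174 Ma, shape 3, scale 250) has a prior mean of 1924 Ma, which shows how a long-tailed prior leaves room for node ages well above the fossil minimum.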

1. Roger AJ, Hug LA (2006) The origin and diversification of eukaryotes: Problems with molecular phylogenetics and molecular clock estimation. Philos Trans R Soc Lond B Biol Sci 361:1039–1054.
2. Knoll AH, Javaux EJ, Hewitt D, Cohen P (2006) Eukaryotic organisms in Proterozoic oceans. Philos Trans R Soc Lond B Biol Sci 361:1023–1038.
3. Brocks JJ, Logan GA, Buick R, Summons RE (1999) Archean molecular fossils and the early rise of eukaryotes. Science 285:1033–1036.
4. Berney C, Pawlowski J (2006) A molecular time-scale for eukaryote evolution recalibrated with the continuous microfossil record. Proc Roy Soc Lond B 273:1867–1872.
5. Douzery EJP, Snell EA, Bapteste E, Delsuc F, Philippe H (2004) The timing of eukaryotic evolution: Does a relaxed molecular clock reconcile proteins and fossils? Proc Natl Acad Sci USA 101:15386–15391.
6. Cavalier-Smith T (2002) The phagotrophic origin of eukaryotes and phylogenetic classification of . Int J Syst Evol Microbiol 52:297–354.
7. Cavalier-Smith T (2010) Deep phylogeny, ancestral groups and the four ages of life. Philos Trans R Soc Lond B Biol Sci 365:111–132.
8. Porter SM (2004) The fossil record of early eukaryotic diversification. Paleontol Soc Papers 10:35–50.
9. Javaux EJ, Knoll AH, Walter M (2003) Recognizing and interpreting the fossils of early eukaryotes. Orig Life Evol Biosph 33:75–94.
10. Javaux EJ, Knoll AH, Walter MR (2004) TEM evidence for eukaryotic diversity in mid-Proterozoic oceans. Geobiology 2:121–132.
11. Butterfield NJ (2000) Bangiomorpha pubescens n. gen., n. sp.: Implications for the evolution of sex, multicellularity, and the Mesoproterozoic/Neoproterozoic radiation of eukaryotes. Paleobiol 26:386–404.
12. Porter SM, Meisterfeld R, Knoll AH (2003) Vase-shaped microfossils from the Neoproterozoic Chuar Group, Grand Canyon: A classification guided by modern testate amoebae. J Paleontol 77:409–429.
13. Javaux EJ (2007) The early eukaryotic fossil record. Adv Exp Med Biol 607:1–19.
14. Welch JJ, Bromham L (2005) Molecular dating when rates vary. Trends Ecol Evol 20:320–327.
15. Drummond AJ, Ho SYW, Phillips MJ, Rambaut A (2006) Relaxed phylogenetics and dating with confidence. PLoS Biol 4:e88.
16. Ho SYW, Phillips MJ (2009) Accounting for calibration uncertainty in phylogenetic estimation of evolutionary divergence times. Syst Biol 58:367–380.
