<<

Cladogenesis, Coalescence and the of the Three Domains of

Olga Zhaxybayeva

University of Connecticut Department of Molecular and

ABGradCon 2004, Tucson, AZ, January 7-10, 2004 Acknowledgements

Gogarten Lab: Elsewhere: J.Peter Gogarten Andrew Martin Lorraine Olendzenski (University of Colorado at Boulder) Pascal Lapierre Yuri (NCBI) Eugene Koonin (NCBI) Joe Felsenstein (University of Washington) Hyman Hartman

NASA Exobiology Program NASA Institute NSF Microbial Trees as a Visualization of Evolution

Genealogy (Church Ceiling, Santo Domingo, Oaxaca)

Lebensbaum (German for “”) from , 1874 ACTERIA B Fig. modified from Fig. modified Norman Pace Cytophaga chloroplast Epulopiscium Synechococcus Chlorobium Bacillus CPS V/A-ATPase Prolyl RS Lysyl RS Mitochondria Treponema mitochondria Agrobacterium Thermus Deinococcus Aquifex E.coli Thermotoga EM 17 Paramecium Riftia Homo Zea Coprinus Dictyostelium Porphyra Chromatium 0.1 changes per nt Entamoeba

Naegleria N I

Methanococcus G

UCARYA Methanosarcina Methanobacterium I

Euglena R

Methanospirillum Methanopyrus O E Thermococcus Haloferax Trypanosoma Physarum Sulfolobus pJP 78 pJP 27 pSL 50 pSL 22 Thermoproteus pSL 4 Thermofilum pSL 12 Giardia

Marine group 1 Tritrichomonas Encephalitozoon RCHAEA Vairimorpha

Hexamita A

e f i L

f o

e r e T

A N R r - U S S All life forms on Earth share common ancestry

Cenancestor – (from the Greek “kainos” meaning recent and “koinos” meaning common) – the most recent common ancestor of all the that are alive today. The term was proposed by Fitch in 1987. Where is the root of the tree of life?

• On the branch leading to (Gogarten, J.P. et al.,1989; Iwabe, N. et al., 1989)

• On the branch leading to (Woese, C.R.,1987) • On the branch leading to (Forterre, P. and Philippe, H., 1999; Lopez, P. et al., 1999)

• Under aboriginal trifurcation (Woese, C.R., 1978)

• Inconclusive results (Caetano-Anolles, G., 2002) ACTERIA B Fig. modified from Fig. modified Norman Pace Cytophaga chloroplast Epulopiscium Synechococcus Chlorobium Bacillus CPS V/A-ATPase Prolyl RS Lysyl RS Mitochondria Plastids Treponema mitochondria Agrobacterium Thermus Deinococcus Aquifex E.coli Thermotoga EM 17 Paramecium Riftia Homo Zea Coprinus Dictyostelium Porphyra Chromatium 0.1 changes per nt Entamoeba

Naegleria N I

Methanococcus G

UCARYA Methanosarcina Methanobacterium I

Euglena R

Methanospirillum Methanopyrus O E Thermococcus Haloferax Trypanosoma Physarum Sulfolobus pJP 78 pJP 27 pSL 50 pSL 22 Thermoproteus pSL 4 Thermofilum pSL 12 Giardia

Marine group 1 Tritrichomonas Encephalitozoon RCHAEA Vairimorpha

Hexamita A

e f i L

f o

e r e T

A N R r - U S S BfTreponema (v-) Vacuolar & (A-) Archaeal BArabidops 206 1000 o aayi uuisCatal BHomo CatalyticNon Subunits 812 744 BSaccharomyces Another 373 BCandida BNeurospora 1000 BEnterococcos example of BBMethanosarcina BSulfolobu BDesulfurococcus 838 957 BDeinococcus BThermus

topology with alpE.coli Bacterial 1000 1000 alpVibrio alpPropionigenium alpSynechococcus long empty 541 949 alpNicotiana alpBos 610 alpSchizos 927 alpSaccharom. 388 830 (

alpNeurospora F- branches alpMycobacterium

901 ) 593 441 alpThermotoga alpPS3 FfBorrelia connecting 1000 FlSalmonella 951 Flagellar Assembly ATPases 901 FTreponema FlBacillus

betAquifex Bacterial three domains betMycoplasma 1000 betThermotoga 564 betE.coli 997 442 betVibrio betEnterococcus 971 of life – 174 betPS3 betWolinella 88 ( betChlorobium F- betPectinatus betPropionigenium ) betBos 616 betNicotiana 901 962 betSaccharomyces betNeurospora proton 315 388 BetSchizossacch. ASulfolobus A 378 rchaeal 417 AfTreponema AEnterococcus pumping ABMethanosarcina 1000 AMMethanosarcina y

268 984 Subunits tic 448 1000 AHaloferax 370 AHalobacterium ( ATPases 989 AjMethanoccus A- 1000 ADesulfurococcus 1000 AThermus ) ADeinococcus AGiardia APlasmodium Vacuolar ATrypanoso 973 ACyanidium AArabidops 954 914 1000 827 AVigna ADaucus 513 AGossypium 999 1000 (

A1Acetabularia V- A2Acetabularia

1000 AGallus A1 )

AMus A 299 1000 A1Homo1 SU TPase ABos1 1000 ADrosophila 262 727 AAedes 0.1 AManduca ANeurospora 1000 1000ASaccharomyces ACandida 282 ADictyostelium Olendzenski et al, 2000 AEntamoeba Hypotheses to explain the long empty branches connecting the three domains of life

Early evolution prior to the split of the three domains was following different mechanisms.

• Wolfram Zillig, 1992 • Otto Kandler, 1994 • Carl Woese, 1998 • Arthur Koch, 1994 Hypotheses to explain the long empty branches connecting the three domains of life (cont.)

There were major catastrophic events in the past that led to a bottleneck with only few survivors

For example:

•Tail of early heavy bombardment

•Rise of oxygen Hypotheses to explain the long empty branches connecting the three domains of life (cont.) There were major catastrophic events in the past that led to a bottleneck with only few survivors Hypotheses to explain the long empty branches connecting the three domains of life (cont.) Coalescence – the process of tracing lineages backwards in time to their common ancestors. Every two extant lineages coalesce to their most recent common ancestor. Eventually, all t/2 lineages coalesce (Kingman, 1982) to the cenancestor.

Illustration is from J. Felsenstein, “Inferring Phylogenies”, Sinauer, 2003 Coalescence as an Approach to Study

Clade – (from the Greek “klados” meaning branch or twig) – a group of organisms that includes all of the descendants of an ancestral . In a rooted phylogeny every node defines a as the lineages originating from node, including those that arise in successive furcations.

clade

Cladogenesis – the process of clade formation Simulations of cladogenesis by coalescence

•One and one event per generation

•Horizontal transfer event once in 10 generations

RED: organismal lineages (no HGT) BLUE: molecular lineages (with HGT) GRAY: extinct lineages

RESULTS: •Most recent common ancestors are different for organismal and molecular phylogenies

•Different coalescence times

•Long coalescence of the last two lineages EXTANT LINEAGES FOR THE SIMULATIONS OF 50 LINEAGES Number of extant lineages over time

green: organismal lineages ; red: molecular lineages (with gene transfer) Y Mitochondrial Adam Eve

Lived Lived approximately 166,000-249,000 50,000 years ago years ago

Thomson, R. et al. (2000) Cann, R.L. et al. (1987) Proc Natl Acad Sci U S A 97, Nature 325, 31-6 7360-5 Vigilant, L. et al. (1991) Underhill, P.A. et al. (2000) Science 253, 1503-7 Nat Genet 26, 358-61

Albrecht Durer, The Fall of Man, 1504

Adam and Eve never met / CONCLUSIONS •Coalescence leads to long branches at the root. •Using single genes as phylogenetic markers makes it difficult to trace organismal phylogeny in the presence of . •Each contemporary molecule has its own history and traces back to an individual molecular cenancestor, but these molecular ancestors were likely to be present in different organisms at different times. •The model alone already explains some features of the observed topology of the tree of life. Therefore, it does not appear warranted to invoke more complex hypotheses involving bottlenecks and extinction events to explain these features of the tree of life. OUTLOOK

• Incorporate non-random Horizontal Gene Transfer into the model.

• Consider the model as a null hypothesis and look at deviations from it. E.g., the bacterial clade does not conform to the model’s prediction. This deviation could be due to sudden radiation.

• Explore if the model can be used to infer parameters that describe the early evolution of life. E.g., number of .