The Universal SSU rRNA Tree Probing early eukaryotic Wheelis et al. 1992 PNAS 89: 2930 evolution using phylogenetic methods

The SSU Ribosomal RNA Tree for The Hypothesis - primitively amitochondriate eukaryotes Fungi (Tom Cavalier-Smith 1983) Choanozoa / green Mitochondria? + Stramenopiles Dictyostelium Entamoebae Physarum Prokaryotic outgroup

Trichomonas Archezoa Microsporidia ‘Tree’ courtesy of W. Giardia Entamoeba

The Archezoa Hypothesis T. Cavalier-Smith (1983)

• The Archezoa hypothesis would fall if: – Find mitochondrial genes on archezoan genomes – Find that archezoans branch among aerobic species with mitochondria – Find -derived organelles in archezoans

1 Mitochondrial HSP 70 is encoded by the host nucleus but is of endosymbiotic origin

Targeting ATPase PEPTIDE BINDING Retention sequences DOMAIN sequences N C

Cytosolic HSP70 mitochondrial HSP70 Endosymbiotic origin RER HSP70 alpha- other proteobacteria Gram +ve &

Chaperonin 60 Protein Maximum Likelihood Tree Mitochondrial genes in Archezoa (PROTML, Roger et al. 1998, PNAS 95: 229)

Archezoa Proteins of mitochondrial origin*

Giardia / Heat shock 70, Chaperonin 60 Note 100% Bootstrap support

Trichomonas Heat shock 70, Chaperonin 60

Microsporidia Heat shock 70

Entamoeba Heat shock 70, chaperonin 60 A case of Eukaryote HGT? *defined as forming a monophyletic group with mitochondrial homologues in a non-controversial species phylogeny

Long branches may cause problems for Chaperonin 60 Protein Maximum Likelihood Tree phylogenetic analysis (PROTML, Roger et al. 1998, PNAS 95: 229)

• Felsenstein (1978) made a simple model phylogeny including four taxa and a mixture of short and long branches

A B A

p p D q Longest TRUE TREE WRONG TREE q q p > q branches C C D B • Methods which assume all sites change at the same rate (e.g. PROTML) may be particularly sensitive to this problem

2 Cpn-60 Protein ML tree (PROTML) from variable A simple experiment: sites with outgroups removed Giardia

• Does the Cpn60 tree topology change:

– If we remove long-branch outgroups Trichomonas

– If we remove sites where every species & 30 has the same amino acid 31 Animals & Fungi Apicomplexa

Plants Dictyostelium

Entamoeba

The Archezoa Hypothesis T. Cavalier-Smith (1983)

• The Archezoa hypothesis would fall if: – Find mitochondrial genes on archezoan genomes – Find that archezoans branch among aerobic species with mitochondria – Find mitochondrion-derived organelles in archezoans

a-tubulin trees place Microsporidia with Fungi Microsporidia are primitive and early Edlind TD, et al. (1996). Mol Phylog Evol 5 :359-67 branching eukaryotes or relatives of fungi? Keeling PJ, Doolittle WF. (1996). Mol Biol Evol 13 :1297-305.

• Microsporidia branch deep in some trees but not in others Microsporidia – Deep: SSU rRNA, EF-2 – With fungi: tubulin and HSP70 65 Fungi • Microsporidia contain a mitochondrial HSP70 – They once contained the mitochondrial and thus are not Archezoa sensu Cavalier-Smith – Hirt et al., 1997 – Or they obtained this gene (and the one for tubulin) from fungi via HGT Other eukaryotes – Sogin, 1997 Curr. Op. Genet. Dev. 7: 792-799

3 Maximum likelihood and phylogenetic inference

The tree shown • The maximum likelihood tree is the one with is the maximum likelihood tree the highest probability of giving rise to the obtained using a data under a particular model program called 88 PROTML 83 • It is not the probability that it is the “true” 75 tree

Assumptions underpinning the LogDet/Paralinear distances for EF-2 DNA variable sites codon positions 1+2 PROTML model Animals • Assumes all sites in the molecule can change Chlorella • Assumes that all sequences are evolving in 70 Trypanosoma

the same way Trichomonas 60 • EF-2 violates both of these assumptions: Giardia Dictyostelium Human CAGGAATCACCTACGATCCTCGGCGGCGTTACGAACCGAA 53%G+C 25 Saccharomyces CAGGAATCACTTACGATCTTAAGCGGCGTTACGAACAGAA 46%G+C Entamoeba Chlorella CAGGAATCACCTACGATCCTCAGCGGCGCTACGAACCGGC Dictyostelium CAGGAAACACTTTCGATCTTGTTCGGAGTAATTCTGCGGC Saccharomyces Microsporidia Trypanosoma CAGGAATCACTTTCGATCCTGAGCGGCGATATGAACCGGC 76 + fungi! Entamoeba CAGGAATCACTTACGATCCTAAGCGGAGTAACGAACAGCC Glugea CAGGAATCACTTACGATCCTGCGCGGCGTTACGAACCGGC Cryptosporidium Tritrichomonas CAGGAATCACCTACGATCCTAAGCGGCGTTACGAACCGCC 54%G+C Tritrichomonas CAGGAATCACCTACGATCCTAAGCGGCGTTACGAACCGCC 54%G+C Sulfolobus Giardia CAGGAATCACCTACGATCCTTCGCGGCGTTACGAACCGCA 53%G+C Methanococcus Archaebacteria Glugea CAGGAAAGACCTACGATGCTGATCGGAGTAATGAACCGGA 45%G+C outgroups Sulfolobus CAGGAAACACACAGGAACCTGTGCGGCTGCTTGATACTTC 43%G+C Halobacterium Methanococcus CAGGAAACACTTTCGAAATTTTGCGGCTGCCTGATTGAGA 44%G+C Halobacterium CAGGAAACACCTACGAAACTACGCGGCTGCATGAACGAGA 58%G+C

A combination of factors (outgroup GC content and A combination of factors (outgroup GC content & site site rate heterogeneity) influence the EF-2 DNA tree rate heterogeneity) influence the EF-2 DNA tree Methanococcus outgroup Halobacterium outgroup Methanococcus outgroup Halobacterium outgroup (low G+C) Higher G+C (low G+C) Higher G+C 100 100 100 100 80 80 80 80 Bootstrap values 60 60 LogDet ML 60 60 Bootstrap estimate 40 40 values 40 40 20 20

20 20 0 0 0 20 40 60 80 100 0 20 40 60 80 100 0 0 0 20 40 60 80 100 0 20 40 60 80 100 Fraction of constant sites removed Fraction of constant sites removed (Microsporidia, outgroup) (Microsporidia, outgroup) (Microsporidia, Fungi)

4 A combination of factors (outgroup GC content & site Are the former Archezoa really rate heterogeneity) influence the EF-2 DNA tree ancient offshoots? Methanococcus outgroup Halobacterium outgroup • Probably not: (low G+C) Higher G+C 100 100 • Microsporidia are related to fungi (Tubulin, RNA polymerase,LSU rRNA, HSP70, TATA binding protein, 80 80 Bootstrap EF-2, EF-1 alpha, SSU rRNA) values 60 60 • Evidence for Giardia and Trichomonas branching deeper

40 40 than other eukaryotes is based on trees made using unrealistic assumptions (often PROTML) 20 20 • There is plenty of room for new hypotheses 0 0 0 20 40 60 80 100 0 20 40 60 80 100

Fraction of constant sites removed (Giardia, Trichomonas, outgroup)

Microsporidia are obligate If former archezoa contain intracellular parasites genes from the mitochondrial endosymbiont what happened to the organelle?

Life cycle of microsporidia

mtHsp70 has diverse roles in mitochondria Pfanner and Geissler 2001 Nature Reviews 2: 339-344

Tree of Mitochondrial Hsp70 mtHsp70

mtHsp70 also plays a role in assembly of Fe-S clusters - an essential function for yeast mitochondria (Lill & Kispal, 2000)

5 Microsporidial HSP70 have no obvious Localisation of a mtHsp70 in Trachipleistophora

N-terminal targeting signal Confocal microscopy using an antibody to Th-mtHsp70 Rabbit Infected cells Spores

Western blot using an antibody to Th-mtHsp70

The Hsp70 protein is Microsporidian localised to membrane bound or electron dense structures.

Microsporidian ‘mitochondrial remnant’

Traditional fixation Host cell mitochondrion methods show a double membrane

Williams et al. 2002 Nature 418: 865-869

Trees are a good way of exploring the Phylogenetic analysis requires history of genes but care is needed! careful thought

• Making trees is not easy: • Phylogenetic analysis is frequently treated as a – Among-site rate heterogeneity, “fast black box into which data are fed (often gathered clock” species, shared nucleotide or amino at considerable cost) and out of which “The Tree” acid composition biases springs – Different data sets may be affected by • (Hillis, Moritz & Mable 1996, Molecular Systematics) individual phenomena to different degrees – Biases need not be large if phylogenetic signal is weak

6