The Universal SSU rRNA Tree Probing early eukaryotic Wheelis et al. 1992 PNAS 89: 2930 evolution using phylogenetic methods
The SSU Ribosomal RNA Tree for Eukaryotes The Archezoa Hypothesis - primitively Animals amitochondriate eukaryotes Fungi (Tom Cavalier-Smith 1983) Choanozoa Trichomonas Microsporidia Plants / green algae Mitochondria? Ciliates + Apicomplexa Stramenopiles Red algae Dictyostelium Entamoebae Euglenozoa Physarum Prokaryotic Percolozoa outgroup
Trichomonas Giardia Archezoa Microsporidia ‘Tree’ courtesy of W. Ford Doolittle Giardia Entamoeba
The Archezoa Hypothesis T. Cavalier-Smith (1983)
• The Archezoa hypothesis would fall if: – Find mitochondrial genes on archezoan genomes – Find that archezoans branch among aerobic species with mitochondria – Find mitochondrion-derived organelles in archezoans
1 Mitochondrial HSP 70 is encoded by the host nucleus but is of endosymbiotic origin
Targeting ATPase DOMAIN PEPTIDE BINDING Retention sequences DOMAIN sequences N C
Cytosolic HSP70 mitochondrial HSP70 Endosymbiotic origin RER HSP70 alpha-proteobacteria other proteobacteria chloroplasts Gram +ve & Archaea cyanobacteria
Chaperonin 60 Protein Maximum Likelihood Tree Mitochondrial genes in Archezoa (PROTML, Roger et al. 1998, PNAS 95: 229)
Archezoa Proteins of mitochondrial origin*
Giardia / Heat shock 70, Chaperonin 60 Spironucleus Note 100% Bootstrap support
Trichomonas Heat shock 70, Chaperonin 60
Microsporidia Heat shock 70
Entamoeba Heat shock 70, chaperonin 60 A case of Eukaryote Eukaryote HGT? *defined as forming a monophyletic group with mitochondrial homologues in a non-controversial species phylogeny
Long branches may cause problems for Chaperonin 60 Protein Maximum Likelihood Tree phylogenetic analysis (PROTML, Roger et al. 1998, PNAS 95: 229)
• Felsenstein (1978) made a simple model phylogeny including four taxa and a mixture of short and long branches
A B A
p p D q Longest TRUE TREE WRONG TREE q q p > q branches C C D B • Methods which assume all sites change at the same rate (e.g. PROTML) may be particularly sensitive to this problem
2 Cpn-60 Protein ML tree (PROTML) from variable A simple experiment: sites with outgroups removed Giardia
• Does the Cpn60 tree topology change:
– If we remove long-branch outgroups Trichomonas
– If we remove sites where every species Euglena & 30 Trypanosoma has the same amino acid 31 Animals & Fungi Apicomplexa
Plants Dictyostelium
Entamoeba
The Archezoa Hypothesis T. Cavalier-Smith (1983)
• The Archezoa hypothesis would fall if: – Find mitochondrial genes on archezoan genomes – Find that archezoans branch among aerobic species with mitochondria – Find mitochondrion-derived organelles in archezoans
a-tubulin trees place Microsporidia with Fungi Microsporidia are primitive and early Edlind TD, et al. (1996). Mol Phylog Evol 5 :359-67 branching eukaryotes or relatives of fungi? Keeling PJ, Doolittle WF. (1996). Mol Biol Evol 13 :1297-305.
• Microsporidia branch deep in some trees but not in others Microsporidia – Deep: SSU rRNA, EF-2 – With fungi: tubulin and HSP70 65 Fungi • Microsporidia contain a mitochondrial HSP70 – They once contained the mitochondrial endosymbiont and thus are not Archezoa sensu Cavalier-Smith – Hirt et al., 1997 – Or they obtained this gene (and the one for tubulin) from fungi via HGT Other eukaryotes – Sogin, 1997 Curr. Op. Genet. Dev. 7: 792-799
3 Maximum likelihood and phylogenetic inference
The tree shown • The maximum likelihood tree is the one with is the maximum likelihood tree the highest probability of giving rise to the obtained using a data under a particular model program called 88 PROTML 83 • It is not the probability that it is the “true” 75 tree
Assumptions underpinning the LogDet/Paralinear distances for EF-2 DNA variable sites codon positions 1+2 PROTML model Animals • Assumes all sites in the molecule can change Chlorella • Assumes that all sequences are evolving in 70 Trypanosoma
the same way Trichomonas 60 • EF-2 violates both of these assumptions: Giardia Dictyostelium Human CAGGAATCACCTACGATCCTCGGCGGCGTTACGAACCGAA 53%G+C 25 Saccharomyces CAGGAATCACTTACGATCTTAAGCGGCGTTACGAACAGAA 46%G+C Entamoeba Chlorella CAGGAATCACCTACGATCCTCAGCGGCGCTACGAACCGGC Dictyostelium CAGGAAACACTTTCGATCTTGTTCGGAGTAATTCTGCGGC Saccharomyces Microsporidia Trypanosoma CAGGAATCACTTTCGATCCTGAGCGGCGATATGAACCGGC 76 + fungi! Entamoeba CAGGAATCACTTACGATCCTAAGCGGAGTAACGAACAGCC Glugea Cryptosporidium CAGGAATCACTTACGATCCTGCGCGGCGTTACGAACCGGC Cryptosporidium Tritrichomonas CAGGAATCACCTACGATCCTAAGCGGCGTTACGAACCGCC 54%G+C Tritrichomonas CAGGAATCACCTACGATCCTAAGCGGCGTTACGAACCGCC 54%G+C Sulfolobus Giardia CAGGAATCACCTACGATCCTTCGCGGCGTTACGAACCGCA 53%G+C Methanococcus Archaebacteria Glugea CAGGAAAGACCTACGATGCTGATCGGAGTAATGAACCGGA 45%G+C outgroups Sulfolobus CAGGAAACACACAGGAACCTGTGCGGCTGCTTGATACTTC 43%G+C Halobacterium Methanococcus CAGGAAACACTTTCGAAATTTTGCGGCTGCCTGATTGAGA 44%G+C Halobacterium CAGGAAACACCTACGAAACTACGCGGCTGCATGAACGAGA 58%G+C
A combination of factors (outgroup GC content and A combination of factors (outgroup GC content & site site rate heterogeneity) influence the EF-2 DNA tree rate heterogeneity) influence the EF-2 DNA tree Methanococcus outgroup Halobacterium outgroup Methanococcus outgroup Halobacterium outgroup (low G+C) Higher G+C (low G+C) Higher G+C 100 100 100 100 80 80 80 80 Bootstrap values 60 60 LogDet ML 60 60 Bootstrap estimate 40 40 values 40 40 20 20
20 20 0 0 0 20 40 60 80 100 0 20 40 60 80 100 0 0 0 20 40 60 80 100 0 20 40 60 80 100 Fraction of constant sites removed Fraction of constant sites removed (Microsporidia, outgroup) (Microsporidia, outgroup) (Microsporidia, Fungi)
4 A combination of factors (outgroup GC content & site Are the former Archezoa really rate heterogeneity) influence the EF-2 DNA tree ancient offshoots? Methanococcus outgroup Halobacterium outgroup • Probably not: (low G+C) Higher G+C 100 100 • Microsporidia are related to fungi (Tubulin, RNA polymerase,LSU rRNA, HSP70, TATA binding protein, 80 80 Bootstrap EF-2, EF-1 alpha, SSU rRNA) values 60 60 • Evidence for Giardia and Trichomonas branching deeper
40 40 than other eukaryotes is based on trees made using unrealistic assumptions (often PROTML) 20 20 • There is plenty of room for new hypotheses 0 0 0 20 40 60 80 100 0 20 40 60 80 100
Fraction of constant sites removed (Giardia, Trichomonas, outgroup)
Microsporidia are obligate If former archezoa contain intracellular parasites genes from the mitochondrial endosymbiont what happened to the organelle?
Life cycle of microsporidia
mtHsp70 has diverse roles in mitochondria Pfanner and Geissler 2001 Nature Reviews 2: 339-344
Tree of Mitochondrial Hsp70 mtHsp70
mtHsp70 also plays a role in assembly of Fe-S clusters - an essential function for yeast mitochondria (Lill & Kispal, 2000)
5 Microsporidial HSP70 have no obvious Localisation of a mtHsp70 in Trachipleistophora
N-terminal targeting signal Confocal microscopy using an antibody to Th-mtHsp70 Rabbit Infected cells Spores
Western blot using an antibody to Th-mtHsp70
The Hsp70 protein is Microsporidian localised to membrane bound or electron dense structures.
Microsporidian ‘mitochondrial remnant’
Traditional fixation Host cell mitochondrion methods show a double membrane
Williams et al. 2002 Nature 418: 865-869
Trees are a good way of exploring the Phylogenetic analysis requires history of genes but care is needed! careful thought
• Making trees is not easy: • Phylogenetic analysis is frequently treated as a – Among-site rate heterogeneity, “fast black box into which data are fed (often gathered clock” species, shared nucleotide or amino at considerable cost) and out of which “The Tree” acid composition biases springs – Different data sets may be affected by • (Hillis, Moritz & Mable 1996, Molecular Systematics) individual phenomena to different degrees – Biases need not be large if phylogenetic signal is weak
6