Inferring Ancestry
Total Page:16
File Type:pdf, Size:1020Kb
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1176 Inferring Ancestry Mitochondrial Origins and Other Deep Branches in the Eukaryote Tree of Life DING HE ACTA UNIVERSITATIS UPSALIENSIS ISSN 1651-6214 ISBN 978-91-554-9031-7 UPPSALA urn:nbn:se:uu:diva-231670 2014 Dissertation presented at Uppsala University to be publicly examined in Fries salen, Evolutionsbiologiskt centrum, Norbyvägen 18, 752 36, Uppsala, Friday, 24 October 2014 at 10:30 for the degree of Doctor of Philosophy. The examination will be conducted in English. Faculty examiner: Professor Andrew Roger (Dalhousie University). Abstract He, D. 2014. Inferring Ancestry. Mitochondrial Origins and Other Deep Branches in the Eukaryote Tree of Life. Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1176. 48 pp. Uppsala: Acta Universitatis Upsaliensis. ISBN 978-91-554-9031-7. There are ~12 supergroups of complex-celled organisms (eukaryotes), but relationships among them (including the root) remain elusive. For Paper I, I developed a dataset of 37 eukaryotic proteins of bacterial origin (euBac), representing the conservative protein core of the proto- mitochondrion. This gives a relatively short distance between ingroup (eukaryotes) and outgroup (mitochondrial progenitor), which is important for accurate rooting. The resulting phylogeny reconstructs three eukaryote megagroups and places one, Discoba (Excavata), as sister group to the other two (neozoa). This rejects the reigning “Unikont-Bikont” root and highlights the evolutionary importance of Excavata. For Paper II, I developed a 150-gene dataset to test relationships in supergroup SAR (Stramenopila, Alveolata, Rhizaria). Analyses of all 150-genes give different trees with different methods, but also reveal artifactual signal due to extremely long rhizarian branches and illegitimate sequences due to horizontal gene transfer (HGT) or contamination. Removing these artifacts leads to strong consistent support for Rhizaria+Alveolata. This breaks up the core of the chromalveolate hypothesis (Stramenopila+Alveolata), adding support to theories of multiple secondary endosymbiosis of chloroplasts. For Paper III, I studied the evolution of cox15, which encodes the essential mitochondrial protein Heme A synthase (HAS). HAS is nuclear encoded (nc-cox15) in all aerobic eukaryotes except Andalucia godoyi (Jakobida, Excavata), which encodes it in mitochondrial DNA (mtDNA) (mt-cox15). Thus the jakobid gene was postulated to represent the ancestral gene, which gave rise to nc-cox15 by endosymbiotic gene transfer. However, our phylogenetic and structure analyses demonstrate an independent origin of mt-cox15, providing the first strong evidence of bacteria to mtDNA HGT. Rickettsiales or SAR11 often appear as sister group to modern mitochondria. However these bacteria and mitochondria also have independently evolved AT-rich genomes. For Paper IV, I assembled a dataset of 55 mitochondrial proteins of clear α-proteobacterial origin (including 30 euBacs). Phylogenies from these data support mitochondria+Rickettsiales but disagree on the placement of SAR11. Reducing amino-acid compositional heterogeneity (resulting from AT-bias) stabilizes SAR11 but moves mitochondria to the base of α-proteobacteria. Signal heterogeneity supporting other alternative hypotheses is also detected using real and simulated data. This suggests a complex scenario for the origin of mitochondria. Keywords: Molecular Phylogeny; Phylogenomics; Mitochondria; Eukaryote tree of life Ding He, Department of Organismal Biology, Systematic Biology, Norbyv. 18 D, Uppsala University, SE-75236 Uppsala, Sweden. © Ding He 2014 ISSN 1651-6214 ISBN 978-91-554-9031-7 urn:nbn:se:uu:diva-231670 (http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-231670) “If I have seen further, it is by standing on the shoulders of giants.” - Isaac Newton, 1676 For Family & Friends List of Papers This thesis is based on the following papers, which are referred to in the text by their Roman numerals. I He, D.*, Fiz-Palacios, O.*, Fu, C.-J., Fehling, J., Tsai, C.-C., & Bal- dauf, S. L. (2014) An Alternative Root for the Eukaryote Tree of Life. Current Biology, 24(4):465–470. II He, D., Sierra R., Pawlowski J., & Baldauf, S. L. Reducing long- branch effects in a multi-protein data set identifies a new Rhiza- ria+Alveolata clade. [Manuscript] III He, D., Fu, C.-J. & Baldauf, S. L. Horizontal transfer of a heme A synthase gene from bactera to jakobid mitochondrial DNA. Submit- ted. IV He, D., Whelan S., & Baldauf, S. L. Mining the phylogenetic signal for inferring the origin of mitochondria. [Manuscript] The reprint of Paper I was made with permission from Elsevier Limited (license number: 3464701032242). Ding He made primary contribution to the experimental design, data assem- bly and analyses in all four papers. He jointly conceived the ideas, wrote the first drafts and made extensive editing for the final versions of Paper II, III and IV. He contributed minor editing in Paper I. *Equal contribution. The cover mosaic image “Tree of Life” was built with Ernst Haeckel’s illustrations (Kunstformen der Natur, 1899-1904) using AndreaMosaic www.andreaplanet.com/ andreamosaic/ based on the drawing by Hwei-yen Chen. Contents Introduction ..................................................................................................... 9 Molecular Phylogenetics .......................................................................... 10 Phylogenomics – Uniting Phylogeny and Genomics ............................... 10 Part I: Eukaryote Tree of Life ................................................................... 12 Overview of current eukaryote supergroups ........................................ 12 The root of the eukaryote tree of life ................................................... 19 Phylogeny of supergroup SAR (Stramenopila, Alveolata and Rhizaria) .............................................................................................................. 20 Part II: Mitochondrial Evolution .............................................................. 22 The endosymbiotic theory .................................................................... 22 Movement of the genetic material ....................................................... 22 The bacterial origin(s) of mitochondria ............................................... 23 Challenges for Molecular Phylogeny ....................................................... 24 Quality control for phylogenomic data set ........................................... 24 Long-branch attraction ......................................................................... 25 Compositional bias ............................................................................... 25 Objectives ...................................................................................................... 27 Materials and Methods .................................................................................. 28 Results ........................................................................................................... 30 Summary of Paper I .................................................................................. 30 Universal eukaryotic proteins of bacterial origin (euBac) ................... 31 A Rooted Phylogeny of Eukaryotes ..................................................... 32 Analyzing the contradictory data ......................................................... 32 The nature of the proto-mitochondrial proteome ................................. 32 Evolutionary implications of the “neozoan-excavate” root ................. 33 Summary of Paper II ................................................................................. 34 The phylogenetic position of Rhizaria ................................................. 34 Summary of Paper III ............................................................................... 35 The jakobid mtDNA cox15 gene does not reflect a primitive state. .... 35 Summary of Paper IV ............................................................................... 36 A phylogenomic data set of 55 universal mitochondrial proteins ....... 36 Mitochondria-Rickettsiales signal and compositional heterogeneity .. 36 Testing phylogenetic signal heterogeneity ........................................... 37 Svensk Sammanfattning ................................................................................ 38 Acknowledgments ......................................................................................... 40 Abbreviations DNA Deoxyribonucleic acid RNA Ribonucleic acid SSU-rDNA Small subunit ribosomal deoxyribonucleic acid rRNA Ribosomal ribonucleic acid ATP Adenosine triphosphate DHFR Dihidrofolate reductase TS Thymidylate ToL Tree of life eToL Eukaryote tree of life RGC Rare genomic change MtMt Mitochondrial genome encoded mitochondrial NcMt Nuclear genome encoded mitochondrial euBac Eukaryotic proteins of bacterial origin LMCA Last mitochondrial common ancestor MROs Mitochondrial-related organelles FECA First eukaryote common ancestor LECA Last eukaryote common ancestor LMCA Last mitochondrial common ancestor EGT Endosymbiotic gene transfer HGT Horizontal gene transfer RPS-BLAST Reversed position specific basic local alignment search tool BLAST Basic local alignment search tool BLASTp Basic local alignment search tool for protein mlBP Maximum likelihood bootstrap proportion biPP Bayesian inference posterior probabilities EST