Comparative Genomics in Diplomonads
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1261

Comparative Genomics in Diplomonads
Lifestyle Variations Revealed at Genetic Level

FEIFEI XU

ACTA UNIVERSITATIS UPSALIENSIS, UPPSALA 2015
ISSN 1651-6214
ISBN 978-91-554-9262-5
urn:nbn:se:uu:diva-251650

Dissertation presented at Uppsala University to be publicly examined in BMC, B41, Husargatan 3, Uppsala, Friday, 12 June 2015 at 13:00 for the degree of Doctor of Philosophy. The examination will be conducted in English. Faculty examiner: Doctor Matt Berriman (Wellcome Trust Sanger Institute).

Abstract

Xu, F. 2015. Comparative Genomics in Diplomonads. Lifestyle Variations Revealed at Genetic Level. Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1261. 64 pp. Uppsala: Acta Universitatis Upsaliensis. ISBN 978-91-554-9262-5.

As sequencing technologies advance, genome studies are becoming a basic tool for studying an organism, and with more genomes available, comparative genomics is maturing into a powerful tool for biological research. This thesis demonstrates the strength of a comparative genomics approach on a group of understudied eukaryotes, the diplomonads. Diplomonads are a group of single-celled eukaryotic flagellates living in oxygen-poor environments. Most diplomonads are intestinal parasites, like the well-studied human parasite Giardia intestinalis. There are seven different G. intestinalis assemblages (genotypes) affecting different hosts, and it is under debate whether these are one species. A genome-wide study of three G. intestinalis genomes from different assemblages reveals little inter-assemblage sexual recombination, supporting the view that the different G. intestinalis assemblages are genetically isolated and thus different species. A genomic comparison between the fish parasite S. salmonicida and G. intestinalis reveals genetic differences reflecting differences in their parasitic lifestyles. There is a tighter transcriptional regulation and a larger metabolic reservoir in S. salmonicida, likely adaptations to the fluctuating environments it encounters during its systemic infection, compared to G. intestinalis, which is a strict intestinal parasite. The S. salmonicida genome analysis also uncovers genes involved in energy metabolism. Some of these are experimentally shown to localize to mitochondrion-related organelles in S. salmonicida, indicating that S. salmonicida possesses energy-producing organelles that should be classified as hydrogenosomes, as opposed to the mitosomes in G. intestinalis. A transcriptome analysis of the free-living Trepomonas is compared with genomic data from the two parasitic diplomonads. The majority of the genes associated with a free-living lifestyle, like phagocytosis and a larger metabolic capacity, are of prokaryotic origin. This suggests that the ancestor of the free-living diplomonad was likely host-associated and that the free-living lifestyle is a secondary adaptation acquired through horizontal gene transfers. In conclusion, this thesis uses different comparative genomics approaches to broaden the knowledge on diplomonad diversity and to provide more insight into how the lifestyle differences are reflected on the genetic level. The bioinformatics pipelines and expertise gained in these studies will be useful in other projects in diplomonads and other organismal groups.

Keywords: comparative genomics, Giardia intestinalis, Spironucleus salmonicida, Trepomonas, diplomonad, intestinal parasite, free-living, sexual recombination, hydrogenosome, horizontal gene transfer

Feifei Xu, Department of Cell and Molecular Biology, Box 596, Uppsala University, SE-75124 Uppsala, Sweden.
© Feifei Xu 2015
ISSN 1651-6214
ISBN 978-91-554-9262-5
urn:nbn:se:uu:diva-251650 (http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-251650)

Dedicated to my family

List of papers

This thesis is based on the following papers, which are referred to in the text by their Roman numerals.

I Xu, F., Jerlström-Hultqvist, J. & Andersson, J. O. (2012). Genome-wide analyses of recombination suggest that Giardia intestinalis assemblages represent different species. Molecular Biology and Evolution, 29(10), 2895-2898.

II Jerlström-Hultqvist, J., Einarsson, E., Xu, F., Hjort, K., Ek, B., Steinhauf, D., Bergquist, J., Andersson, J. O. & Svärd, S. G. (2013). Hydrogenosomes in the diplomonad Spironucleus salmonicida. Nature Communications, 4, 2493.

III Xu, F., Jerlström-Hultqvist, J., Einarsson, E., Astvaldsson, A., Svärd, S. G. & Andersson, J. O. (2014). The genome of Spironucleus salmonicida highlights a fish pathogen adapted to fluctuating environments. PLoS Genetics, 10(2), e1004053.

IV Xu, F., Jerlström-Hultqvist, J., Kolisko, M., Simpson, A. G. B., Roger, A. J., Svärd, S. G. & Andersson, J. O. Adaptation to a free-living lifestyle via gene acquisitions in the diplomonad Trepomonas sp. PC1. Manuscript.

Reprints were made with permission from the publishers.

Publications not included in the thesis.

1. Jerlström-Hultqvist, J., Franzén, O., Ankarklev, J., Xu, F., Nohýnková, E., Andersson, J. O., Svärd, S. G. & Andersson, B. (2010). Genome analysis and comparative genomics of a Giardia intestinalis assemblage E isolate. BMC Genomics, 11, 543.

Contents

1 Introduction
2 Diplomonads
  2.1 Giardia
  2.2 Spironucleus
  2.3 Trepomonas
3 Research aims
4 Bioinformatic methods
  4.1 Genomics
    4.1.1 Sequencing
    4.1.2 Assembly
    4.1.3 Annotation
    4.1.4 Transcriptomics
    4.1.5 Analysis
  4.2 Comparative genomics
5 Results & Discussion
  5.1 Genome-wide recombination study in G. intestinalis (paper I)
  5.2 Hydrogenosomes in S. salmonicida (paper II)
  5.3 S. salmonicida genome (paper III)
  5.4 Trepomonas transcriptome (paper IV)
  5.5 Discussion
6 Conclusions and future work
7 Svensk sammanfattning
8 中文摘要
9 Acknowledgements
References

Abbreviations

ACT  Artemis Comparison Tool
ASH  Allelic Sequence Heterozygosity
BLAST  Basic Local Alignment Search Tool
bp  Basepair
CA  Celera Assembler
CLO  Carpediemonas-Like Organism
Cpn60  Chaperonin 60
CRMP  Cysteine-Rich Membrane Protein
CRP  Cysteine-Rich Protein
CWP  Cyst Wall Protein
DNA  Deoxyribonucleic Acid
EST  Expressed Sequence Tag
EVM  EVidenceModeler
GalNAc  N-acetylgalactosamine
Gb  Gigabase
HCMP  High Cysteine Membrane Protein
HCP  High Cysteine Protein
HGT  Horizontal Gene Transfer
HMM  Hidden Markov Model
Inr  Initiator element
JCVI  J. Craig Venter Institute
JGI  Joint Genome Institute
KAAS  KEGG Automatic Annotation Server
kb  Kilobase
KEGG  Kyoto Encyclopedia of Genes and Genomes
KO  KEGG Orthology
LRR  Leucine-Rich Repeat
Mb  Megabase
MCL  Markov Cluster Algorithm
ML  Maximum Likelihood
MLO  Mitochondrion-Like Organelle
mRNA  Messenger RNA
MRO  Mitochondrion-Related Organelle
NCBI  National Center for Biotechnology Information
NGS  Next-Generation Sequencing
ORF  Open Reading Frame
PacBio  Pacific Biosciences
PCR  Polymerase Chain Reaction
PFOR  Pyruvate Ferredoxin Oxidoreductase
PHAT  Phylome Analysis Tool
RNA  Ribonucleic Acid
RNA-Seq  RNA Sequencing
RNR  Ribonucleotide Reductase
ROS  Reactive Oxygen Species
rRNA  Ribosomal RNA
SGS  Second-Generation Sequencing
SHMT  Serine Hydroxymethyltransferase
SMRT  Single-Molecule Real-Time
STC  Squalene-Tetrahymanol Cyclase
Tb  Terabase
TGS  Third-Generation Sequencing
tRNA  Transfer RNA
VSP  Variant Surface Protein

1. Introduction

I still remember how I ended up in the bioinformatics program at university. It was a new program that my home university was starting, and it was advertised as a promising interdisciplinary field where biology, computer science, statistics and mathematics meet. Naive as I was, not knowing what exactly I wanted to do, I thought it was at