YU-DISSERTATION-2017.Pdf
Total Page:16
File Type:pdf, Size:1020Kb
Copyright by Mengjie Yu 2017 The Dissertation Committee for Mengjie Yu Certifies that this is the approved version of the following dissertation: Tempo and Mode of Diatom Plastid Genome Evolution Committee: Edward C. Theriot, Supervisor Robert K. Jansen, Co-Supervisor John W. La Claire David L. Herrin Robin R. Gutell Tempo and Mode of Diatom Plastid Genome Evolution by Mengjie Yu Dissertation Presented to the Faculty of the Graduate School of The University of Texas at Austin in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy The University of Texas at Austin December 2017 Dedication This dissertation is dedicated to my parents, Xinyou Yu and Huiping Zhang, my husband Forest Pfeiffer, and my extended family for their endless support and love during this seven year journey. Acknowledgements I am indebted to my supervisor, Edward Theriot, for all his guidance and support during my doctoral training. His hardworking, dedication to science, and willingness to share made a huge impact in my life! I am thankful to my co-supervisor, Robert Jansen, for providing guidance, enlightening discussion and funding support for my project. I am also thankful to my committee members John La Claire and David Herrin for sharing their knowledge through phycology and plant molecular biology class, which helped me lay the foundation for my research. I am grateful for the discussion with my committee member Robin Gutell on bioinformatics and alternative career choices in industry. I would like to thank all the current and former Theriot lab members, especially Matt Ashworth for training me the basic lab skills and providing diatom strain samples for my project, Teofil Narkov for hands on instruction on phylogenetic analysis, Mariska Brady for helpful discussion. I would also like to thank all the wonderful people in Jansen lab, Tracey Ruhlman for her assistance in plastid DNA extraction, Jin Zhang and Mao-Lun Weng for their help and training in genome assembly and rate analysis, Seongjun Park and Erika Schwarz for their technical help and discussions. Finally, I would like to acknowledge the President of King Abdulaziz University, Prof. Abdulrahman O.Alyoubi, for funding support, the Genome Sequencing and Analysis Facility (GSAF) at the University of Texas at Austin for performing Illumina sequencing, the Texas Advanced Computing Center (TACC) at the University of Texas at Austin for access to supercomputers. And all the collaborators I had during graduate school, their attitude towards scientific research kept inspiring me along this journey. v Abstract Tempo and Mode of Diatom Plastid Genome Evolution Mengjie Yu, PhD The University of Texas at Austin, 2017 Supervisors: Edward C. Theriot & Robert K. Jansen Diatoms are mostly photosynthetic eukaryotes within the heterokont lineage. Their plastids were derived through secondary endosymbiosis of a red alga. Despite years of phylogenetic research, relationships among major groups of diatoms still remain uncertain. Additional plastid genome (plastome) sequences can not only provide more insight into diatom plastid evolution, but also assess phylogenetic relationships among the major lineages of diatoms. In my dissertation, I have more than doubled the available plastome sequences. My work in the plastome evolution in Thalassiosirales, one of the more comprehensively studied orders in terms of both genetics and morphology, showed highly conserved gene content and gene order within this order. I also documented the first instance of the loss of photosynthetic genes psaE, psaI and psaM in Rhizosolenia imbricata. By extensively sampling the diatoms with critical phylogenetic positions, I presented the largest genome scale phylogeny yet published for diatoms based on 103 shared plastid-coding genes from 40 diatoms and Triparma laevis as the outgroup. The most recent diatom classification posits that there are three major clades of diatoms: Coscinodiscophyceae (informally radial centrics), Mediophyceae (bi- or multipolar centrics), and Bacillariophyceae (pennates). Phylogenetic analysis of plastome data recovered the radial centric Leptocylindrus as the sister group to the remaining diatoms and recovered the polar diatoms Attheya plus vi Biddulphia in a clade sister to pennate diatoms. Statistical analysis comparing this optimal tree to trees constraining diatoms to the existing classification strongly rejected monophyly for the Coscinodiscophyceae and Mediophyceae. Extensive plastome rearrangements and variable gene content were observed among the 40 diatom species. Astrosyne radiata, recovered on the longest terminal branch, experienced extensive gene loss. The nucleotide substitution rates of plastid protein coding genes were estimated, and their patterns were compared across different gene categories. Relationships between substitution rates and plastome characteristics, such as indels, genome size, genome rearrangement, were examined. The analyses also revealed a strong positive correlation between sequence divergence and gene order change in diatom plastomes. vii Table of Contents List of Tables (if any, Heading 2,h2 style: TOC 2) ............................................... xi List of Figures (if any, Heading 2,h2 style: TOC 2) ............................................ xiii Chapter 1 Introduction ..........................................................................................1 Chapter 2 Conserved gene order and expanded invert repeats characterize plastid genome of Thalassiosirales .............................................................................4 Introduction .....................................................................................................4 Material and Method .....................................................................................6 Diatom strains and culture conditions....................................................6 DNA extraction ......................................................................................6 DNA sequencing and genome assembly ................................................7 Genome annotation and analyses ...........................................................8 Identification of genes transferred to the nucleus and signal peptide ....8 Phylogenetics analysis ...........................................................................9 Result ..............................................................................................................9 General features of plastid genomes ......................................................9 Gene loss ..............................................................................................10 Functional gene transfer from palstid to nucleus .................................11 Genome size and repetitive DNA ........................................................12 Ancestral plastid genome organization of Thalassiosirales .................13 Genome rearrangements between Thalassiosirales and others ............14 Discussion .....................................................................................................14 Conserved gene content within Thalassiosirales .................................14 Variation of gene content in non-Thalassiosirales species ..................16 Genome size .........................................................................................18 Genome rearrangements ......................................................................19 Chapter 3 Analysis of forty plastid genomes resolves relationships in diatoms and identifies genome-scale evolutionary patterns ..............................................25 Introduction ...................................................................................................25 viii Material and Method ...................................................................................29 Diatom strains and DNA extration ......................................................29 DNA sequencing and genome assembly ..............................................30 Genome annotations and analyses .......................................................30 Phylogenetic analysis ...........................................................................31 Gene order analysis ..............................................................................32 Gene content analysis ..........................................................................32 Result ............................................................................................................33 Phylogenomic analysis.........................................................................33 Genome size .........................................................................................35 Gene content ........................................................................................36 Gene order ............................................................................................37 Discussion .....................................................................................................38 Phylogeny of diatoms ..........................................................................38 Plastome evolution ...............................................................................41 Chapter 4 Correlation between plastome nucleotide substitution rates and genome organization across diatoms ..........................................................................51 Introduction ...................................................................................................51