Phylogeny and Chloroplast Evolution in Brassicaceae” Supervisor: Claudio Varotto
Total Page:16
File Type:pdf, Size:1020Kb
University of Trento PhD Thesis (2014 -2015) Hu Shiliang “Phylogeny and chloroplast evolution in Brassicaceae” Supervisor: Claudio Varotto International PhD Program in Biomolecular Sciences Centre for Integrative Biology XXVIII Cycle Phylogeny and chloroplast evolution in Brassicaceae Tutor Dr. Claudio Varotto FEM- Agricultural Institute of San Michele all’Adige Ecogenomics research group Ph.D. Thesis of Hu Shiliang FEM- IASMA Academic Year 2014-2015 Declaration I (Hu Shiliang) confirm that this is my own work and the use of all material from other sources has been properly and fully acknowledged. Student's signature: Date of submission: 14/03/2016 Abstract Brassicaceae is a large family of flowering plants, characterized by cruciform corolla, tetradynamous stamen and capsular fruit. In light of the important economic and scientific values of Brassicaceae, many phylogenetic and systematic studies were carried out. One recent and important phylogenetic analysis revealed three major lineages (I, II and III), however, classification at different taxonomic levels (tribe, genus, and species) remained problematic and evolutionary relationships among and within these lineages were still largely unclear. This is partly due to the fact that the past studies lacked information, as they mainly utilized the morphological data, nuclear DNA, partial chloroplast (cp) genes and so on. Nowadays, next generation sequencing (NGS) technology provides the possibility to make use of big data in phylogeny and evolutionary studies. Thus, we sequenced the chloroplast genomes of 80 representative species, using additional 15 reference chloroplast genomes from the NCBI database, and carried out both the phylogenetic reconstruction and the study of protein coding genes evolution in this novel dataset with different methods. Several novel results were obtained. 1 Successful application of NGS technology in chloroplast genome sequencing. During the final assembly, I could reconstruct full chloroplast genomes and the structure maps for 14 out of 80 sampled species, while the remaining were assembled nearly completely with only few gaps remaining. 2 Characterization of chloroplast genome structure. Gene number and order, single sequence repeat (SSR) as well as variety and distribution of large repeat sequence were characterized. 3 The difference of codon usage frequency was calculated between Cardamine resedifolia and Cardamine impatiens. Twelve genes with signatures of positive selection were identified at a family-wide level. 4 Three major lineages (I – III) were confirmed with high support values. Besides, the positions of various tribes were reclassified. Relationships among and within these lineages were highly resolved and supported in the final tree. Most of the tribes in the analyses were inferred to be monophyletic, only Thlaspideae was paraphyletic. Anastaticeae was for the first time classified into position of expanded lineage II, and position of tribe Lepidieae was delimited with relatively low support values in the final phylogenetic tree. This study was a new and successful application of NGS in large-scale Brassicaceae phylogeny and evolution, which offered the chance to look in details of the structural and functional features of the chloroplast genome. These results provided a paradigm on how to proceed towards the full elucidation of the evolutionary relationships among various biological species in the tree of life. Index Chapter 1: Introduction .......................................................................................................................... 1 1.1 Brief introduction of Brassicaceae: Species, Distribution, and Characteristics ........................... 1 1.2 History of phylogeny and evolutionary study in Brassicaceae .................................................... 3 1.2.1 Brief history of systematics, phylogenetics, and evolutionary research in Brassicaceae.. 3 1.3 Brief introduction of data collection and approaches for phylogenetic analysis ......................... 8 1.3.1 Traditional data resource and chloroplast genetics ........................................................... 8 1.3.2 Principle of phylogenetic inference and common phylogeny programs ......................... 11 1.4 New opportunities for in-depth phylogeny and evolution studies ............................................. 13 1.4.1 Development of high-throughput sequencing technology and its application in phylogeny and Molecular evolution study ............................................................................... 13 1.4.2 Established bioinformatic techniques promoted the phylogeny and molecular evolution studies ...................................................................................................................................... 14 1.4.3 New strategy for the phylogeny and molecular evolution studies .................................. 15 1.5 Aim of the study ........................................................................................................................ 18 Chapter 2: Plastome organization and evolution of chloroplast genes in Cardamine species adapted to contrasting habitats ............................................................................................................ 21 2.1 Introduction ............................................................................................................................... 21 2.2 Authors’ specific contributions for the appended paper ............................................................ 22 Chapter 3: Comparative analysis and structural feature exploration with 12 newly assembled chloroplast genomes in tribe Cardamineae ......................................................................................... 37 3.1 A brief introduction of tribe Cardamineae ................................................................................. 37 3.2 Sampling and bioinformatic pipeline ......................................................................................... 39 3.3 Feature exploration and a further phylogenetic analysis within tribe Cardamineae .................. 41 3.4 Conclusions ............................................................................................................................... 49 Chapter 4: Molecular Phylogeny in Brassicaceae based on 71 chloroplast protein-coding genes: Species relocation and taxonomic implications ................................................................................... 51 4.1 Short introduction of phylogeny in Brassicaceae ...................................................................... 51 4.2 Sampling and bioinformatic pipeline ......................................................................................... 52 4.3 Result of cp genome assembly and new phylogenetic reconstruction in Brassicaceae family .. 56 4.4 Discussion .................................................................................................................................. 66 Conclusion .............................................................................................................................................. 74 Acknowledgements ................................................................................................................................ 76 Bibliography ........................................................................................................................................... 77 Appendix ................................................................................................................................................ 92 1 Chapter 1: Introduction 1.1 Brief introduction of Brassicaceae: Species, Distribution, and Characteristics Brassicaceae, also known as the mustards, the crucifers or the cabbage family, is included in the order of Brassicales according to the Angiosperm Phylogeny Group system (APG system). It contains over 372 genera and about 4,060 species [1], constituting a large and economically important family in flowering plants, showing a wide diversity of phenotypes. The most common and large genera that identified include Draba with 440 species, Erysimum with 261 species, Lepidium with 234 species, Cardamine with 233 species and Alyssum with 207 species. However, the taxonomic circumscriptions of many taxa are still provisional, as many of the genera having fewer number of species sampled to a relatively low depth [2]. Fig. 1.1 Distribution of the Brassicaceae in the world [3] Previous research revealed a worldwide distribution of species in this family, as all continents except Antarctica (Fig. 1.1) [2], [3] are potential habitats. Most of the species are found concentrated in the temperate regions of the Northern Hemisphere. However, many genera are more common in the southern hemisphere, such as Draba, Lepidium, and Cardamine. Some species, which were subsumed under a genus Heliophila roughly defined by Al-Shehbaz and Mummenhoff, are widely distributed in the southern 1 (especially South African) regions, such as Brachycarpea, Chamira, Schlechteria, and Silicularia [4]. Tropics and subtropical regions, mountainous, and alpine regions are also habitats where Brassicaceae could often be found. The species Arabis alpina, for instance, is a representative that is widespread worldwide in the northern hemisphere, with a marked preference for mountainous, alpine and arctic habitats, including some high mountain chains in Kenya, Tanzania, Ethiopia and East Africa [5]. This worldwide distribution