Syntenic Gene and Genome Duplication Drives Diversification of Plant Secondary Metabolism and Innate Immunity in Flowering Plants
Total Page:16
File Type:pdf, Size:1020Kb
Genomics 4.0 - Syntenic Gene and Genome Duplication Drives Diversification of Plant Secondary Metabolism and Innate Immunity in Flowering Plants - Advanced Pattern Analytics in Duplicate Genomes - Johannes A. Hofberger Thesis committee Promotor Prof. Dr M. Eric Schranz Professor of Experimental Biosystematics Wageningen University Other members Prof. Dr Bart P.H.J. Thomma, Wageningen University Prof. Dr Berend Snel, Utrecht University Dr Klaas Vrieling, Leiden University Dr Gabino F. Sanchez, Wageningen University This research was conducted under the auspices of the Graduate School of Experimental Plant Sciences. Genomics 4.0 - Syntenic Gene and Genome Duplication Drives Diversification of Plant Secondary Metabolism and Innate Immunity in Flowering Plants - Advanced Pattern Analytics in Duplicate Genomes - Johannes A. Hofberger Thesis submitted in fulfilment of the requirements for the degree of doctor at Wageningen University by the authority of the Rector Magnificus Prof. Dr M.J. Kropff, in the presence of the Thesis Committee appointed by the Academic Board to be defended in public on Monday 18 May 2015 at 4 p.m. in the Aula. Johannes A. Hofberger Genomics 4.0 - Syntenic Gene and Genome Duplication Drives Diversification of Plant Secondary Metabolism and Innate Immunity in Flowering Plants 83 pages. PhD thesis, Wageningen University, Wageningen, NL (2015) With references, with summaries in Dutch and English ISBN: 978-94-6257-314-7 PROPOSITIONS 1. Ohnolog over-retention following ancient polyploidy facilitated diversification of the glucosinolate biosynthetic inventory in the mustard family. (this thesis) 2. Resistance protein conserved in structurally stable parts of plant genomes confer pleiotropic effects and expanded functions in plant innate immunity. (this thesis) 3. Integrating the science of management with the science of life leads to more and better results. 4. In every personality, obvious attributes are linked to hidden attributes like genes are linked when sharing one chromosome. 5. If you recognize what is in your sight, that which is hidden from you will become plain to you for there is nothing hidden which cannot become manifest 6. It doesn’t matter if it is genes, thoughts, people or stocks: the whole is more than the sum of its parts. Propositions belonging to the thesis, entitled “Genomics 4.0 - Syntenic Gene and Genome Duplication Drives Diversification of Plant Secondary Metabolism and Innate Immunity in Flowering Plants” Johannes A. Hofberger, Mai 18th, 2015 Table of Contents TABLE OF CONTENTS Summary ................................................................................................................................................. 7 General Introduction............................................................................................................................... 8 Chapter 1 - Whole Genome and Tandem Duplicate Retention Facilitated Glucosinolate Pathway Diversification in the Mustard Family ................................................................................................... 16 Chapter 2 – Large-scale evolutionary analysis of genes and supergene clusters from terpenoid modular pathways provides insights into metabolic diversification in flowering plants ..................... 37 Chapter 3 – A novel approach for multi-domain and multi-gene family identification provides insights into evolutionary dynamics of disease resistance genes in core eudicot plants..................... 72 Chapter 4 – A Complex Interplay of Tandem- and Whole Genome Duplication Drives Expansion of the L-type Lectin Receptor Kinase Gene Family in the Brassicaceae ............................................... 98 General Conclusion ............................................................................................................................. 117 References .......................................................................................................................................... 119 Curriculum Vitae ................................................................................................................................. 140 6 Summary Genomics 4.0 - Syntenic Gene and Genome Duplication Drives Diversification of Plant Secondary Metabolism and Innate Immunity in Flowering Plants Johannes A. Hofberger1, 2, 3 1 Biosystematics Group, Wageningen University & Research Center, Droevendaalsesteeg 1, 6708 PB Wageningen, The Netherlands (August 2012 – December 2013) 2 Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Science Park 904, 1098 XH Amsterdam, The Netherlands (December 2010 – July 2012) 3 Chinese Academy of Sciences/Max Planck Partner Institute for Computational Biology, 320 Yueyang Road, Shanghai 200031, PR China (January 2014 – December 2014) TWO-SENTENCE SUMMARY Large-scale comparative analysis of Big Data from next generation sequencing provides powerful means to exploit the potential of nature in context of plant breeding and biotechnology. In this thesis, we combine various computational methods for genome-wide identification of gene families involved in (a) plant innate immunity and (a) biosynthesis of defense-related plant secondary metabolites across 21 species, assess dynamics that affected evolution of underlying traits during 250 Million Years of flowering plant radiation and provide data on more than 4500 loci that can underpin crop improvement for future food and live quality. GENERAL ABSTRACT As sessile organisms, plants are permanently exposed to a plethora of potentially harmful microbes and other pests. The surprising resilience to infections observed in successful lineages is due to a complex defense network fighting off invading pathogens. Within this network, a sophisticated plant innate immune system is accompanied by a multitude of specialized biosynthetic pathways that generate more than 200,000 secondary metabolites with ecological, agricultural, energy and medicinal importance. The rapid diversification of associated genes was accompanied by a series of duplication events in virtually all plant species, including local duplication of short sequences as well as multiplication of all chromosomes due to meiotic errors (plant polyploidy). In a comparative genomics approach, we combined several bioinformatics techniques for large-scale identification of multi-domain and multi-gene families that are involved in plant innate immunity or defense-related secondary metabolite pathways across 21 representative flowering plant genomes. We introduced a framework to trace back duplicate gene copies to distinct ancient duplication events, thereby unravelling a differential impact of gene and genome duplication to molecular evolution of target genes. Comparing the genomic context among homologs within and between species in a phylogenomics perspective, we discovered orthologs conserved within genomic regions that remained structurally immobile during flowering plant radiation. In summary, we described a complex interplay of gene and genome duplication that increased genetic versatility of disease resistance and secondary metabolite pathways, thereby expanding the playground for functional diversification and thus plant trait innovation and success. Our findings give fascinating insights to evolution across lineages and can underpin crop improvement for food, fiber and biofuels production. 7 General Introduction GENERAL INTRODUCTION From the Neolithic to the Genomics revolution: 12,000 years of plant biology The understanding of plant biology has facilitated the origin of modern civilization and greatly contributed to human life quality ever since. The history of modern civilization began with a first INTRODUCTION agricultural (“Neolithic”) evolution, starting around 12,000 years ago in the Holocene with domestication of various types of plants [1]. In this context, specialized food-crop cultivation led to increased yields and triggered a gradual transformation of small and mobile groups of hunter- gatherers to larger non-nomadic societies based in build-up settlements [2]. A second agricultural revolution rationalized crop production and coincided with the onset of industrialization 200 years ago [3]. Yields rose beyond subsistence and facilitated a near-exponential growth of the human population [4]. However, the increasing population led to land conversion and a decrease of the limited area of arable land [5]. Likewise, the potential of many available crop varieties reached its limits and yield overages were shrinking gradually. At the peak of this development, many countries came to the brink of food shortage and related famine in the mid of the 20th century [6]. In a third agricultural (“Green”) revolution, a series of research, development and technology transfer initiatives were initiated between the late 1940s and the late 1960s with the aim to create higher-yield crops [7]. Due to these efforts, the total production of cereals doubled in developing nations between the years 1961–1985 [8]. Once again, a detailed understanding of plant biology ensured the availability of food and hence better life quality for an estimated billon of people [9]. Uncovering the double helix structure of the DNA macromolecule and subsequent development of polymerase chain reaction (PCR)-technology in the 1980s facilitated high-throughput genotyping of plants [10]. Before that, crop improvement was still limited to classical forward genetics approaches that rely on time-consuming cross-breeding of individuals that carry unusual traits