Understanding Brassicaceae evolution through ancestral genome reconstruction

Florent Murat, Alexandra Louis, Florian Maumus, Alix Armero Villanueva, Richard Cooke, Hadi Quesneville, Hugues Roest Crollius, Jerome Salse

To cite this version: Florent Murat, Alexandra Louis, Florian Maumus, Alix Armero Villanueva, Richard Cooke, et al. Understanding Brassicaceae evolution through ancestral genome reconstruction. Genome Biology, BioMed Central, 2015, 16 (262), 10.1186/s13059-015-0814-y. hal-01281832

HAL Id: hal-01281832
https://hal.archives-ouvertes.fr/hal-01281832
Submitted on 2 Mar 2016

Murat et al. Genome Biology (2015) 16:262
DOI 10.1186/s13059-015-0814-y

RESEARCH (Open Access)

Florent Murat1†, Alexandra Louis2,3,4†, Florian Maumus5†, Alix Armero1, Richard Cooke6, Hadi Quesneville5, Hugues Roest Crollius2,3,4 and Jerome Salse1*

Abstract

Background: Brassicaceae is a family of green plants of high scientific and economic interest, including thale cress (Arabidopsis thaliana), cruciferous vegetables (cabbages) and rapeseed.

Results: We reconstruct an evolutionary framework of Brassicaceae composed of high-resolution ancestral karyotypes using the genomes of modern A. thaliana, Arabidopsis lyrata, Capsella rubella, Brassica rapa and Thellungiella parvula.
The ancestral Brassicaceae karyotype (Brassicaceae lineages I and II) is composed of eight protochromosomes and 20,037 ordered and oriented protogenes. After speciation, it evolved into the ancestral Camelineae karyotype (eight protochromosomes and 22,085 ordered protogenes) and the proto-Calepineae karyotype (seven protochromosomes and 21,035 ordered protogenes) genomes.

Conclusions: The three inferred ancestral karyotype genomes are shown here to be powerful tools to unravel the reticulated evolutionary history of extant Brassicaceae genomes regarding the fate of ancestral genes and genomic compartments, particularly centromeres and evolutionary breakpoints. This new resource should accelerate research in comparative genomics and translational research by facilitating the transfer of genomic information from model systems to species of agronomic interest.

Keywords: Evolution, Genome, Ancestor, Synteny, Repeats, Polyploidization

* Correspondence: [email protected]
† Equal contributors
1 INRA/UBP UMR 1095 GDEC ‘Génétique, Diversité et Ecophysiologie des Céréales’, 5 Chemin de Beaulieu, 63100 Clermont Ferrand, France
Full list of author information is available at the end of the article

© 2015 Murat et al. Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Background

The Brassicaceae family is one of the major groups of the plant kingdom, being composed of more than 3600 species grouped in 321 genera assigned into 49 tribes [1–3]. Understanding the evolutionary history of extant genomes is the primary objective of comparative genomics research, fueled by the rapid increase in the availability of new genome assemblies. Ancestral genome reconstruction, the field of paleogenomics, delivers a common reference to study the evolution of descendants and is a natural and logical standpoint to obtain a chronological view of evolutionary processes. The Brassicaceae constitute a model clade for paleogenomics, given the availability of sequenced genomes such as Arabidopsis thaliana, Arabidopsis lyrata, Capsella rubella, Brassica rapa and Thellungiella parvula. These species represent the Brassicaceae lineages I and II, derived from a common ancestor ≈ 20 million years ago (hereafter mya), and include two closely related tribes: the Camelineae (A. thaliana, A. lyrata, C. rubella) and the Calepineae (B. rapa, representative of the Brassiceae, and T. parvula, representative of the Eutremae).

A. thaliana was the first plant genome to be sequenced [4], chosen partly for its phylogenetic relationship to agriculturally important Brassica species, thus greatly stimulating comparative genomics and evolutionary research in this family. Parkin et al. [5] first defined 21 genomic blocks (GBs) common to the B. rapa and A. thaliana genomes based on restriction fragment length polymorphism mapping data. Using genetic maps and comparative chromosomal painting (CCP) with pools of tens to hundreds of Arabidopsis bacterial artificial chromosome (BAC) clones arranged according to the genetic maps of A. lyrata and C. rubella, Schranz et al. [6] defined 24 ancestral GBs (A–X in Schranz et al. [6]; and re-evaluated in Mandáková and Lysak [7] and Cheng et al. [8]). They proposed that extant genomes evolved from an ancestral Brassicaceae karyotype (ABK) made of eight chromosomes. This scenario has since been refined, still using the 24 GBs but including the recently published A. lyrata [9], C. rubella [10], T. parvula [11], and B. rapa [12] draft genome sequences. The ABK (n = 8) was proposed to have evolved through pericentromeric inversions followed by reciprocal translocation mechanisms [13] into a proto-Calepineae karyotype (PCK) of seven chromosomes, which was proposed to represent the diploid ancestral genome of hexaploid B. rapa [8]. Schranz et al. [6] detected 61 of the predicted 72 (3 × 24) GBs in B. rapa, whereas Cheng et al. [8] more recently described 105 blocks or sub-blocks in this genome. Therefore, although ancestral Brassicaceae karyotypes (i.e., study of chromosome number variation) have been proposed, largely based on low resolution pairwise genome comparisons using the 24 GBs, a precise gene-based paleogenomic inference of ancestral genomes through robust protochromosome reconstruction and definition of ancestral gene order is still lacking. The inferred ancestral Brassicaceae genomes are necessary to unveil evolutionary changes (large-scale chromosome fusion, fission and translocation, as well as smaller-scale analysis of gains and losses of genes and functions, evolution of repeats, etc.) and assign them to specific tribes (Brassiceae, Camelineae, Eutremae) and species (A. thaliana, C. rubella, A. lyrata, B. rapa and T. parvula).

The impact of the lineage-specific paleohexaploidization event on the extant Brassiceae genome architecture has been the subject of intense debate recently [8, 14–16]. The triplicated Calepineae chromosomes reported in the extant B. rapa genome [17] experienced differential fractionation in gene content leading to the identification of three genomic compartments referred to as least fractionated (LF), medium fractionated (MF1) and most fractionated blocks (MF2), covering the ten modern B. rapa [...] (retained ancestral genes in the D or LF compartment) and functions (dominant expression levels in this compartment [8]) of the considered genes. Single nucleotide polymorphism investigation at the population level between B. rapa accessions also tends to indicate a bias in mutation rates between D (or LF) and S (or MF) compartments, with genes located in LF compartments showing less nonsynonymous or frameshift mutations than genes in MF compartments [8]. Such a phenomenon following WGD complicates the identification of orthologous genes in B. rapa based on the A. thaliana genes. However, reconstruction of the Calepineae (or PCK) ancestral progenitors (i.e., with precise protochromosome structure and protogene content as well as orientation) would allow accurate elucidation of the fate of the paleohexaploidization event in Brassiceae in terms of extant genome architectures, largely unknown to date.

Here we report the gene-based reconstruction of a high resolution ABK and its two descendants, the Camelineae (ACK) and Calepineae (PCK) ancestral karyotypes. This is a prerequisite to investigating the evolutionary forces that have shaped extant Brassicaceae genomes at the chromosome (especially at synteny break points, paleo-centromeric regions, highly shuffled loci) and gene (especially between post-polyploidization LF and MF blocks) levels. Finally, the inferred ancestral Brassicaceae genomes allow accurate translational research of candidate genes for key applied agronomic traits from models to Brassicaceae crops based on a high-resolution evolutionary framework.

Results

Reconstruction and evolutionary fate of the Brassicaceae ancestral genomes

Although an increasing number of genome sequences for Brassiceae species are available, the quality of assembly of only a few of them is sufficient to allow accurate evolutionary