This is the pre-peer reviewed version of the following article: Sancho R, Cantalapiedra CP, López-Alvarez D, Gordon SP, Vogel JP, Catalán P, Contreras-Moreira B (2017) Pan-plastome and phylogenomics of Brachypodium: flowering time signatures, introgression and recombination in recently diverged ecotypes. New Phytologist, doi: 10.1111/nph.14926, which has been published in final form at http://onlinelibrary.wiley.com/doi/10.1111/nph.14926/abstract. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Self-Archiving.

Comparative genomics and phylogenomics of Brachypodium plastomes: flowering time signatures, introgression and recombination in recently diverged ecotypes

Journal: New Phytologist
Manuscript ID: Draft
Manuscript Type: MS - Regular Manuscript
Date Submitted by the Author: n/a
Complete List of Authors:
Sancho, Ruben; University of Zaragoza, Department of Agricultural and Environmental Sciences
Cantalapiedra, Carlos; Estacion Experimental de Aula Dei, Departamento de Genética y Mejora
López-Álvarez, Diana; Universidad de Zaragoza Escuela Politecnica Superior de Huesca, Department of Agricultural and Environmental Sciences
Gordon, Sean; DOE Joint Genome Institute, Plant Functional Genomics
Vogel, John; USDA, Western Regional Research Center
Catalán, Pilar; University of Zaragoza, Department of Agriculture
Contreras-Moreira, Bruno; Estacion Experimental de Aula Dei, Genetics
Key Words: Brachypodium distachyon – B. stacei – B. hybridum, comparative cpDNA genomics, grass phylogenomics, intraspecific genealogy, nested dating analysis, plastid introgression and recombination

Manuscript submitted to New Phytologist for review

Title:
Comparative genomics and phylogenomics of Brachypodium plastomes: flowering time signatures, introgression and recombination in recently diverged ecotypes

Authors: Rubén Sancho 1,2, Carlos P. Cantalapiedra 3, Diana López-Alvarez 1, Sean P. Gordon 4, John P. Vogel 4,7, Pilar Catalán 1,2,5*, Bruno Contreras-Moreira 2,3,6*

Affiliations:
1 Department of Agricultural and Environmental Sciences, High Polytechnic School of Huesca, University of Zaragoza, Huesca, Spain
2 Grupo de Bioquímica, Biofísica y Biología Computacional (BIFI, UNIZAR), Unidad Asociada al CSIC
3 Department of Genetics and Plant Breeding, Estación Experimental de Aula Dei-Consejo Superior de Investigaciones Científicas, Zaragoza, Spain
4 DOE Joint Genome Institute, Walnut Creek, CA, USA
5 Department of Biology, Tomsk State University, Tomsk, Russia
6 Fundación ARAID, Zaragoza, Spain
7 Department of Plant and Microbial Biology, University of California, Berkeley, CA

* Corresponding authors (both authors contributed equally as corresponding authors):
Pilar Catalán. High Polytechnic School of Huesca, University of Zaragoza. Ctra. Cuarte km 1, 22071 Huesca (Spain). Phone: +34974232465; fax: +34974239302. Email: [email protected]
Bruno Contreras-Moreira. Estación Experimental de Aula Dei / CSIC, Av. Montañana 1.005, 50059 Zaragoza (Spain). Phone: +34976716089. Email: [email protected]

Total word count (excluding summary, references and legends): 5896
Summary: 200
Introduction: 579
Materials and Methods: 1268
Results: 1844
Discussion: 2205
Acknowledgements: 116
No. of figures: 5 (Figs. 1, 2, 3a, b, 4 in colour)
No. of Tables: 1 (Table 1a, b)
No. of Supporting Information files: 14 (Tables S1, S2, S3, S4, S5, S6a, b, c, d, e, S7, S8, S9, S10; Figs. S1, S2a, b, c, S3a, b, S4a, b, S5a, b, S6a, b (Figs. S1, S2a, b, S3a, b, S4a, b, S6a, b in colour), Methods S1)

Abbreviation  Definition
BEP           Bambusoideae, Ehrhartoideae and Pooideae clade
BI            Bayesian inference
bp            base pairs
BS            bootstrap support
CDS           coding sequence
cpDNA         chloroplast DNA
d             average number of nucleotide differences
DeltaK (∆K)   second order rate of change of the log probability of data between successive K values for a particular K
DF            delayed flowering
EDF           extremely delayed flowering
EDF+          extremely delayed flowering (plus other flowering class types) clade
ENA           European Nucleotide Archive
ERF           extremely rapid flowering
ESS           effective sample size
Fis           inbreeding coefficient
G             gamma
GTR           generalised time-reversible model
h             haplotype
Hd            haplotype diversity index
HPD           highest posterior density (interval)
I             proportion of invariant sites
IBD           isolation by distance
IDF           intermediate delayed flowering
IR            inverted repeat
IRF           intermediate rapid flowering
K             number of potential genomic groups
kbp           kilo base pairs
LSC           long single copy
Ma            million years ago
Mbp           mega base pairs
MCC           maximum clade credibility
MCMC          Markov chain Monte Carlo
ML            maximum likelihood
MP            maximum parsimony
MP            mate-pair
N or Ns       missing data
ncDNA         non-coding DNA
NSyn          non-synonymous mutations
NV            no vernalization
PACMAD        Panicoideae, Arundinoideae, Chloridoideae, Micrairoideae, Aristidoideae and Danthonioideae clade
PCR           polymerase chain reaction
PE            paired-end
PPS           posterior probability support
RADseq        restriction site associated DNA markers
RF            rapid flowering
S             number of segregating sites
Syn           synonymous mutations
shm           number of shared mutations
SNP           single nucleotide polymorphism
SSC           short single copy
S+            Spanish (plus other geographically close ecotypes) group
S+T+          Spanish (plus other geographically close ecotypes) and Turkish (plus other geographically close ecotypes) clade
T+            Turkish (plus other geographically close ecotypes) group
ucld          uncorrelated lognormal distribution
WGS           whole genome sequencing
wV            weeks of vernalization
x             chromosome base number

SUMMARY
● Few pan-genomic studies have been conducted in plants, and none of them have focused on the intra-specific diversity and evolution of their chloroplast genomes.
● We address this issue in Brachypodium distachyon, a model system for monocots, and its close relatives B. stacei and B. hybridum, for which a large genomic data set has been compiled. We analyze inter- and intra-specific cpDNA comparative genomics and phylogenomic relationships within a family-wide framework.
● Major structural rearrangements were detected between the B. distachyon and B. stacei/B. hybridum plastomes. Two main lineages, an Extremely Delayed Flowering (EDF+) clade and a Spanish-Turkish (S+T+) clade, plus nine chloroplast capture and two cpDNA introgression and micro-recombination events, were detected within B. distachyon. Early Oligocene (30.9 Mya) and Late Miocene (10.1 Mya) divergence times were inferred for the respective stem and crown nodes of Brachypodium, and a very recent Mid-Pleistocene (0.9 Mya) time for the B. distachyon split.
● Flowering time is a main factor driving rapid intra-specific divergence in B. distachyon, though it is counterbalanced by repeated introgression between previously isolated lineages. Swapping of plastomes among its three different genomic groups (EDF+, S+, T+) likely resulted from random backcrossing followed by stabilization through selection pressure.

Key words: Brachypodium distachyon – B. stacei – B. hybridum, comparative cpDNA genomics, grass phylogenomics, intraspecific genealogy, nested dating analysis, plastid introgression and recombination.
INTRODUCTION
Chloroplast DNA (cpDNA) has been widely used in inter- and intra-specific phylogenetic analyses in multiple species and populations of plants (Waters et al., 2012; Ma et al., 2014; Middleton et al., 2014; Wysocki et al., 2015). Phylogenetic dating of monocots and eudicots has also been based on cpDNA (Chaw et al., 2004). Comparative genomics of whole chloroplast genomes (plastomes) has provided a way to detect and investigate genetic variation across the seed plants (Jansen & Ruhlman, 2012). The proliferation of Whole Genome Sequencing (WGS), which typically includes a substantial amount of chloroplast sequence, has provided large data sets that can be used to assemble and analyze plastomes (Nock et al., 2011).
Brachypodium is a small genus in the family Poaceae that contains approximately 20 species (17 perennial and 3 annual) distributed worldwide (Schippmann, 1991; Catalán & Olmstead, 2000; Catalán et al., 2012, 2016a). The three annuals include two diploids [B. distachyon (2n=2x=10; x=5) and B. stacei (2n=2x=20; x=10)] and their derived allotetraploid [B. hybridum (2n=4x=30; x=5+10)]. Until recently, all three species were described as variants of B. distachyon (Catalán et al., 2012). All three species have a large, overlapping distribution in their native circum-Mediterranean region (Catalán et al., 2012, 2016b; López-Alvarez et al., 2012; López-Alvarez et al., 2015) and B. hybridum has naturalized extensively around the world. The evolutionary relationship between Brachypodium and other grasses has been thoroughly studied (Catalán et al., 1997; Catalán & Olmstead, 2000; Döring et al., 2007). Most recent phylogenetic analyses place Brachypodium in an intermediate position within the Pooideae clade (Minaya et al., 2015; Soreng et al., 2015; Catalán et al., 2016a). By contrast, only