Genomic Landscape of the Global Oak Phylogeny
Research

Andrew L. Hipp1,2, Paul S. Manos3, Marlene Hahn1, Michael Avishai4†, Catherine Bodénès5, Jeannine Cavender-Bares6, Andrew A. Crowl3, Min Deng7, Thomas Denk8, Sorel Fitz-Gibbon9, Oliver Gailing10, M. Socorro González-Elizondo11, Antonio González-Rodríguez12, Guido W. Grimm13, Xiao-Long Jiang7, Antoine Kremer5, Isabelle Lesur5, John D. McVay3, Christophe Plomion5, Hernando Rodríguez-Correa12, Ernst-Detlef Schulze14, Marco C. Simeone15, Victoria L. Sork9 and Susana Valencia-Avalos16

1The Morton Arboretum, Lisle, IL 60532-1293, USA; 2The Field Museum, Chicago, IL 60605, USA; 3Duke University, Durham, NC 27708, USA; 4Previously of The Hebrew University of Jerusalem, Botanical Garden, Zalman Shne’ur St. 1, Jerusalem, Israel; 5INRA, UMR1202 BIOGECO, Cestas F-33610, France; 6University of Minnesota, Minneapolis, MN 55455, USA; 7Shanghai Chenshan Plant Science Research Center, Chinese Academy of Sciences, Shanghai 201602, China; 8Swedish Museum of Natural History, Stockholm 10405, Sweden; 9University of California, Los Angeles, Los Angeles, CA 90095, USA; 10Büsgen-Institute, Georg-August-University Göttingen, Göttingen D-37077, Germany; 11CIIDIR Unidad Durango, Instituto Politécnico Nacional, Durango 34220, Mexico; 12Escuela Nacional de Estudios Superiores Unidad Morelia, Universidad Nacional Autónoma de México, Morelia 58190, México; 13Orléans 45100, France; 14Max Planck Institute for Biogeochemistry, Hans-Knoell-Str. 10, Jena 07745, Germany; 15Università degli Studi della Tuscia, Viterbo 01100, Italy; 16Herbario de la Facultad de Ciencias, Departamento de Biología Comparada, Universidad Nacional Autónoma de México, Circuito Exterior, s.n., Ciudad Universitaria, Coyoacán, México City CP 04510, México

†This paper is dedicated to the memory of Michael Avishai (1935–2018), founder of the Jerusalem Botanical Gardens and cherished colleague.

Author for correspondence: Andrew L. Hipp
Tel: +1 630 725 2094
Email: [email protected]

Received: 21 March 2019
Accepted: 5 June 2019

New Phytologist (2019)
doi: 10.1111/nph.16162

© 2019 The Authors. New Phytologist © 2019 New Phytologist Trust. www.newphytologist.com

Key words: diversification rates, genomic mosaicism, introgression, oaks, phylogenomics, Quercus, restriction-site associated DNA sequencing (RAD-seq), tree diversity.

Summary

The tree of life is highly reticulate, with the history of population divergence emerging from populations of gene phylogenies that reflect histories of introgression, lineage sorting and divergence. In this study, we investigate global patterns of oak diversity and test the hypothesis that there are regions of the oak genome that are broadly informative about phylogeny.

We utilize fossil data and restriction-site associated DNA sequencing (RAD-seq) for 632 individuals representing nearly 250 Quercus species to infer a time-calibrated phylogeny of the world’s oaks. We use a reversible-jump Markov chain Monte Carlo method to reconstruct shifts in lineage diversification rates, accounting for among-clade sampling biases. We then map the > 20 000 RAD-seq loci back to an annotated oak genome and investigate genomic distribution of introgression and phylogenetic support across the phylogeny.

Oak lineages have diversified among geographic regions, followed by ecological divergence within regions, in the Americas and Eurasia. Roughly 60% of oak diversity traces back to four clades that experienced increases in net diversification, probably in response to climatic transitions or ecological opportunity.

The strong support for the phylogeny contrasts with high genomic heterogeneity in phylogenetic signal and introgression. Oaks are phylogenomic mosaics, and their diversity may in fact depend on the gene flow that shapes the oak genome.

Introduction

The tree of life exhibits reticulation from its base to its tips (Folk et al., 2018; Quammen, 2018). Oaks (Quercus L., Fagaceae) are no exception (Hipp, 2018). The botanical and evolutionary literature is rife with case studies of localized gene flow (Hardin, 1975; Whittemore & Schaal, 1991; McVay et al., 2017a; Kim et al., 2018) and ancient introgression (McVay et al., 2017b; Kim et al., 2018; Crowl et al., 2019) in oaks. Oaks have in fact been held up as a paradigmatic syngameon (Hardin, 1975; Van Valen, 1976; Dodd & Afzal-Rafii, 2004; Cannon & Scher, 2017; Boecklen, 2017; Cannon & Petit, 2019), a system of interbreeding species in which incomplete reproductive isolation may facilitate adaptive gene flow and species migration (Petit et al., 2003; Dodd & Afzal-Rafii, 2004; Leroy et al., 2019). The oak genome (Plomion et al., 2018) consequently tracks numerous unique species-level phylogenetic histories that result from lineage sorting and differential rates of introgression (Anderson, 1953; Eaton et al., 2015; McVay et al., 2017b; Edelman et al., 2018). Oak genomes are mosaics of disparate phylogenetic histories (cf. Pääbo, 2003). Given the prevalence of hybridization in trees globally (Petit & Hampe, 2006; Cannon & Lerdau, 2015), understanding how these histories align with one another and whether there are regions of the genome that track a common evolutionary history is essential to understanding the prevalence of adaptive gene flow and the phylogenetic history of forest trees.

Restriction-site associated DNA sequencing (RAD-seq; Miller et al., 2007a,b; Lewis et al., 2007; Baird et al., 2008; Ree & Hipp, 2015) has revolutionized our understanding of oak phylogeny in the past 5 yr (Hipp et al., 2014, 2018; Cavender-Bares et al., 2015; Eaton et al., 2015; Fitz-Gibbon et al., 2017; Hipp, 2017; Pham et al., 2017; Deng et al., 2018; Kim et al., 2018; Ortego et al., 2018; Jiang et al., 2019). Its ties to the genome, however, have not been fully exploited because of the lack of an assembled genome. While earlier studies explored the effects of gene identity on phylogenetic informativeness (Hipp et al., 2014) and genomic heterogeneity in phylogenetic vs introgressive signals (McVay et al., 2017a,b), they did not have access to a completed oak genome. As a consequence, we do not understand the distribution of genomic breakpoints between histories of introgression and histories of population divergence. Moreover, no studies to date have brought together a comprehensive sampling of taxa to investigate the history of diversification across the genus.

In this study, we integrate a phylogenomic sampling of c. 60% of the world’s oaks with the annotated, chromosome-level Quercus robur genome (Plomion et al., 2016, 2018) to test the hypothesis that there are regions of the genome that are globally informative about Quercus phylogeny, i.e. regions that define oak lineages across the phylogeny. We analyze previously published RAD-seq data for 427 sequenced individuals sampled from across the oak phylogeny and new RAD-seq data for an additional 205 individuals to investigate the global oak phylogenomic mosaic. Using a time-calibrated one-tip-per-species tree novel to this study, we also test the hypothesis that the high diversity of oaks in Mexico and eastern China is a consequence of high diversification rates. Finally, we show that the consensus of the evolutionary histories of > 20 000 RAD-seq loci matches our understanding of oak evolution based on morphological information from extant and fossil species in spite of broadly conflicting individual locus genealogies.

… individual for each locus were then clustered across individuals, retaining loci present in at least four individuals and possessing a maximum of 20 single nucleotide polymorphisms (SNPs) and eight indels across individuals. The dataset was filtered to loci with a minimum of 15 individuals each, for a total of 49 991 loci. Data were imported into R using the RADami package (Hipp et al., 2014) for downstream analysis.

RAD-seq loci were mapped back to the latest version of the Q. robur haploid genome (HAPLOME 2.3; https://urgi.versailles.inra.fr/Data/Genome/Genome-data-access) (Plomion et al., 2018). The Q. robur genome has been assembled into 12 pseudochromosomes corresponding to the 12 observed Quercus chromosomes, plus a set of 538 unassigned scaffolds. Mapping was performed using BLAST+ 2.8.1 (Camacho et al., 2009). We filtered alignments based on expected (E) values (E-value ≤ 10⁻⁵), alignment length (≥ 80% of the length of the loci) and percentage identity (≥ 80%). For each locus, the best alignment was kept.

All sequence data analyzed in this paper are available as FASTQ files from NCBI’s Short Read Archive (Table S1), and aligned loci and additional data and scripts for all analysis are available from https://github.com/andrew-hipp/global-oaks-2019. Analysis details are given in Methods S1.

Phylogenetic analysis

Maximum likelihood (ML) phylogenetic analyses were conducted in RAxML v8.2.4 (Stamatakis, 2014) using the GTRCAT implementation of the general time reversible model of nucleotide evolution, an approximation of the GTR+Γ model that affords substantial savings in computational time for large phylogenetic datasets, such as the current one (Stamatakis, 2006), with branch support assessed using RELL bootstrapping (Minh et al., 2013). For the phylogeny including all tips (Fig. S1), analysis was unconstrained, and we used the taxonomic disparity index (TDI) of Pham et al. (2016) to quantify nonmonophyly by species. Topology within the white oaks of sections Ponticae, Virentes and Quercus (hereafter