Arabidopsis Thaliana
Total Page:16
File Type:pdf, Size:1020Kb
African genomes illuminate the early history and transition to selfing in Arabidopsis thaliana Arun Durvasulaa,b,c,1, Andrea Fulgionea,b,c,1, Rafal M. Gutakerd, Selen Irez Alacakaptanb,c, Pádraic J. Flooda, Célia Netoa, Takashi Tsuchimatsue, Hernán A. Burbanod, F. Xavier Picóf, Carlos Alonso-Blancog, and Angela M. Hancocka,b,c,2 aDepartment of Plant Developmental Biology, Max Planck Institute for Plant Breeding Research, 50829 Cologne, Germany; bDepartment of Structural and Computational Biology, University of Vienna, 1010 Vienna, Austria; cVienna Biocenter, 1030 Vienna, Austria; dResearch Group for Ancient Genomics and Evolution, Department of Molecular Biology, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany; eDepartment of Biology, Chiba University, Chiba 263-8522 Japan; fDepartamento de Ecología Integrativa, Estación Biológica de Doñana, Consejo Superior de Investigaciones Científicas, 41092 Seville, Spain; and gCentro Nacional de Biotecnología, Consejo Superior de Investigaciones Científicas, 28049 Madrid, Spain Edited by Johanna Schmitt, University of California Davis, CA, and approved April 11, 2017 (received for review October 13, 2016) Over the past 20 y, many studies have examined the history of the Although the genetic basis of self-compatibility in A. thaliana plant ecological and molecular model, Arabidopsis thaliana, in Europe is known, the specific events that occurred during the transition and North America. Although these studies informed us about the to predominant selfing are still unclear. In obligate out-crossing recent history of the species, the early history has remained elusive. Arabidopsis species, many highly divergent S-locus haplogroups In a large-scale genomic analysis of African A. thaliana, we sequenced (S-haplogroups) are maintained by balancing selection, pro- the genomes of 78 modern and herbarium samples from Africa and viding a mechanism for inbreeding avoidance. In A. thaliana, analyzed these together with over 1,000 previously sequenced Eur- three S-haplogroups are found, and each contains mutations that asian samples. In striking contrast to expectations, we find that all obliterate function of the S-locus genes (15–17). Loss-of-function African individuals sampled are native to this continent, including occurred independently in each S-haplogroup (18–21), but be- those from sub-Saharan Africa. Moreover, we show that Africa har- cause these three S-haplogroups were never found together in bors the greatest variation and represents the deepest history in the A. thaliana the same geographic region, self-compatibility is inferred to have lineage. Our results also reveal evidence that selfing, a evolved separately in multiple locations (16, 21, 22). However, major defining characteristic of the species, evolved in a single geo- the hypothesis of geographically distinct origins is difficult to graphic region, best represented today within Africa. Demographic in- A. thaliana reconcile with the major genomic and phenotypic changes that ference supports a model in which the ancestral population A. thaliana – render incompatible with its out-crossing congeners begantosplitby12090 kya, during the last interglacial and Abbassia – pluvial, and Eurasian populations subsequently separated from one (9 13). Shifts from out-crossing to predominant selfing are another at around 40 kya. This bears striking similarities to the patterns common and have been considered the most prevalent evolu- observed for diverse species, including humans, implying a key role for tionary transitions in flowering plants (23). Reconstructing the A. thaliana climatic events during interglacial and pluvial periods in shaping the evolutionary history of the transition to selfing in histories and current distributions of a wide range of species. could provide general insights into this common evolutionary evolution | population history | self-compatibility | climate | migration Significance he plant Arabidopsis thaliana is the principal plant model spe- The principal plant model species, Arabidopsis thaliana,is Tcies, and as such has been useful not only to examine basic central to our understanding of how molecular variants lead biological mechanisms but also to elucidate evolutionary processes. to phenotypic change. In this genome-sequencing effort fo- The exceptional resources available in this species, including seed cused on accessions from Africa, we show that African pop- stocks collected from throughout Eurasia for over 75 y, have been a ulations represent the most ancient lineages and provide new valuable tool for learning about the natural history of A. thaliana clues about the origin of selfing and the species itself. Pop- on this continent (1, 2). Previous studies have shown that current ulation history in Africa contrasts sharply with the pattern in variation in Eurasia is mainly a result of expansions and mixing Eurasia, where the vast majority of samples result from the from refugia in Iberia, Central Asia, and Italy/Balkans after the recent expansion of a single clade. This previously unexplored reservoir of variation is remarkable given the large number of end of the last glacial period ∼10 kya (3–8). The main finding genomic studies conducted previously in this well-studied of the recent analysis of 1,135 sequenced genomes was that a few species and implies that assaying variation in Africa may of- Eurasian samples represent divergent relict lineages, whereas the ten be necessary for understanding population history in vast majority derived from the recent expansion of a single clade diverse species. (4). Given the large number of studies that examine the natural A. thaliana history of , one would expect that this history would by Author contributions: A.D., A.F., and A.M.H. designed research; A.D., A.F., R.M.G., S.I.A., now be described rather completely and there would be no major P.J.F., C.N., T.T., H.A.B., and A.M.H. performed research; A.D., A.F., F.X.P., C.A.-B., and surprises left to uncover. However, there are still many open A.M.H. contributed new reagents/analytic tools; A.D., A.F., and A.M.H. analyzed data; and A.D., A.F., and A.M.H. wrote the paper. questions about the ancient history of the species. The authors declare no conflict of interest. Several features differentiate A. thaliana from its closest relatives. EVOLUTION Although most members of the Arabidopsis genus are obligate out- This article is a PNAS Direct Submission. crossing perennials with large flowers and genome sizes of over Freely available online through the PNAS open access option. 230 Mb and 8 chromosomes, A. thaliana is a predominantly selfing Data deposition: The sequences reported in this paper has been deposited in the European Nucleotide Archive/Sequence Read Archive database, study PRJEB19780 (accession nos. annual with reduced floral morphology and a reduced genome size ERS1575066–ERS1575147). Analysis scripts are available at https://github.com/HancockLab/ of ∼150 Mb and 5 chromosomes. The transition to predominant African_A.thaliana. selfing in A. thaliana was likely the catalyst for these derived mor- 1A.D. and A.F. contributed equally to this work. – phological and genomic features (9 13). These changes, in particular 2To whom correspondence should be addressed. Email: [email protected]. the rearranged and shrunken genome, created a strong reproductive This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. barrier between A. thaliana and its closest relatives (14). 1073/pnas.1616736114/-/DCSupplemental. www.pnas.org/cgi/doi/10.1073/pnas.1616736114 PNAS | May 16, 2017 | vol. 114 | no. 20 | 5213–5218 Downloaded by guest on September 25, 2021 transition. However, because substantial time has passed since – A Eurasia B North this transition (24 26) and no intermediate forms have been Africa found between A. thaliana and its obligate out-crossing relatives, this reconstruction is challenging. We sequenced the genomes of 78 African samples and analyze PC 2 these in combination with 1,135 previously sequenced samples IR sub-Saharan SI Appendix ER (4) (Fig. 1 and , Table S1). We find that African 0.0 0.1 0.2 0.3 Africa variation reveals the ancient history of the species and clarifies ENR details concerning the transition to selfing. Congruence of A. thaliana 0.0 0.1 0.2 0.3 C PC 1 population history with major climatic events and paleontological observations illustrates the relevance of population genetic studies for understanding climate-mediated demography more 0.5 Africa Ancestry generally. 01 IR Rif DE HA IBC INR SMA NMA S. SE C. AS N. SE C. EU Results W. EU To examine the relationship between African individuals and Fig. 2. Global population structure. (A) Unrooted neighbor-joining tree, other worldwide samples, we used three complementary clus- (B) PCA, (C) ADMIXTURE results for K = 4. tering approaches. Distance-based clustering by neighbor-joining reveals a clear split between Eurasian and African samples, in- A. thaliana dicating deep divergence between the continents (Fig. 2A and SI striking because populations outside of Eurasia and Appendix, Fig. S1). The majority of Eurasian samples form a North Africa were previously thought to be recently introduced nearly star-shaped phylogeny, consistent with recent expansion by humans (27). of these lineages. Conversely, longer more bifurcated branches Similarly, principal component analysis (PCA) separates Afri- separate African subclusters and previously identified Eurasian can populations from each other