Linkage Disequilibrium Patterns, Population Structure and Diversity
Total Page:16
File Type:pdf, Size:1020Kb
Roncallo et al. BMC Genomics (2021) 22:233 https://doi.org/10.1186/s12864-021-07519-z RESEARCH ARTICLE Open Access Linkage disequilibrium patterns, population structure and diversity analysis in a worldwide durum wheat collection including Argentinian genotypes Pablo Federico Roncallo1, Adelina Olga Larsen2, Ana Laura Achilli1, Carolina Saint Pierre3, Cristian Andrés Gallo1, Susanne Dreisigacker3 and Viviana Echenique1* Abstract Background: Durum wheat (Triticum turgidum L. ssp. durum Desf. Husn) is the main staple crop used to make pasta products worldwide. Under the current climate change scenarios, genetic variability within a crop plays a crucial role in the successful release of new varieties with high yields and wide crop adaptation. In this study we evaluated a durum wheat collection consisting of 197 genotypes that mainly comprised a historical set of Argentinian germplasm but also included worldwide accessions. Results: We assessed the genetic diversity, population structure and linkage disequilibrium (LD) patterns in this collection using a 35 K SNP array. The level of polymorphism was considered, taking account of the frequent and rare allelic variants. A total of 1547 polymorphic SNPs was located within annotated genes. Genetic diversity in the germplasm collection increased slightly from 1915 to 2010. However, a reduction in genetic diversity using SNPs with rare allelic variants was observed after 1979. However, larger numbers of rare private alleles were observed in the 2000–2009 period, indicating that a high reservoir of rare alleles is still present among the recent germplasm in a very low frequency. The percentage of pairwise loci in LD in the durum genome was low (13.4%) in our collection. Overall LD and the high (r2 > 0.7) or complete (r2 = 1) LD presented different patterns in the chromosomes. The LD increased over three main breeding periods (1915–1979, 1980–1999 and 2000–2020). Conclusions: Our results suggest that breeding and selection have impacted differently on the A and B genomes, particularly on chromosome 6A and 2A. The collection was structured in five sub-populations and modern Argentinian accessions (cluster Q4) which were clearly differentiated. Our study contributes to the understanding of the complexity of Argentinian durum wheat germplasm and to derive future breeding strategies enhancing the use of genetic diversity in a more efficient and targeted way. Keywords: Durum, Linkage disequilibrium, Population structure, SNP, Diversity, Rare alleles * Correspondence: [email protected] 1Centro de Recursos Naturales Renovables de la Zona Semiárida (CERZOS), Departamento de Agronomía, Universidad Nacional del Sur (UNS)-CONICET, Bahía Blanca, Argentina Full list of author information is available at the end of the article © The Author(s). 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. Roncallo et al. BMC Genomics (2021) 22:233 Page 2 of 17 Background frequency. Efforts in recovering genetic diversity and to Durum wheat (Triticum turgidum L. ssp. durum Desf. capture beneficial alleles for specific traits have been Husn) is one of the most important food crops in the made by exploring the genetic variability available in world [1] with a worldwide production of about 36 landraces [20–24] and wild relatives [25–27]. millon tons [2]. It was derived from wild Emmer wheat Single Nucleotide Polymorphisms (SNPs) are the most (T. turgidum ssp. dicoccoides, 2n = 4X = 28, AABB) in a common type of polymorphism in genomes [28]. The use two-step domestication process that took place in the of array technologies developed to capture variants in Fertile Crescent (10,000 BP) and now it is cultivated glo- SNP markers in wheat has become a cost-effective and bally [3]. The main producing areas of durum wheat are more efficient way to assess diverse genetic resources [29]. in the Mediterranean Basin, North America and India, Several wheat SNP arrays, such as the 9 K or 15 K Infi- Canada and Turkey being the main producer countries nium BeadChip [30] and the 90 K iSelect SNP Array [31] of this cereal, followed by Argelia, Italy and India [4]. from Illumina (https://www.illumina.com), or the 820 K Historically it has been used as the main source for mak- Wheat HD genotyping Array [32], the 35 K Axiom Wheat ing different products, mainly flat and leavened bread, Breeder’sArray[33] and the Wheat 660 K Array [34]from couscous, burghul and frekeh in the West Asia, and the Affymetrix (www.affymetrix.com), are available and have North and East Africa region and to elaborate pasta in been widely used during recent years. Furthermore, Next- Western Europe, as well as in North America and world- generation sequencing (NGS) based approaches, such as wide [5]. It has been suggested that durum wheat was Genotyping by Sequencing (GBS) [35], or DArtSeq [36], the first type of wheat sown in the Spanish colonies in and other emerging technologies are powerful tools for South America in 1527 [6]. In Argentina, the widespread SNP discovery. The sequencing of hexaploid (bread) and cultivation of durum started with the introduction of tetraploid (wild emmer and durum) wheat genomes [37– European or Asian landraces, followed by the beginning 39] has anchored the molecular markers to their physical of wheat breeding during the first two decades of the positions. XXth century. The incorporation of the semi-dwarf The study of LD can be defined as the nonrandom as- genes (Rht) during the green revolution occurred during sociation of alleles at different loci due to genetic link- the 70’s. The older cultivars, typically conformed by high age, as well as artificial selection, drift, bottlenecks and and less productive plants, were progressively replaced other genetic forces [40]. Previous studies have ad- before the beginning of the 80’s and all the durum wheat dressed this issue in durum wheat [10, 41–43]. However, varieties cultivated in Argentina today are semi-dwarf the analysis of LD patterns in a germplasm collection in- [7]. Argentina annually cultivates the largest durum cluding Argentinian durum wheat by using an SNP array wheat area in South America (53,480 ha in 2019/20) has not yet been performed. The study of LD could help (http://datosestimaciones.magyp.gob.ar/) mainly in the to understand the effect of selection pressure exerted southeast of Buenos Aires province, but also in the over the national germplasm that occurred during the north-center of the country in Tucumán province and breeding processes. An initial genetic characterization of minor areas in San Luis and Córdoba. Durum wheat a subset of the durum wheat collection used in this grains are mainly used for dry pasta production, one of study was performed with AFLP and a low number of the main staple foods in Argentina, with a consumption KASP™ SNPs markers [14]. For the present study our of 8.54 kg per capita p.a. and occupying the 7th world- goals were to i) assess the genetic diversity in a collec- wide position of production and consumption [8]. tion of 197 durum wheat accessions ii) study the popula- The understanding of genetic diversity available in this tion structure in our germplasm collection to establish crop provides breeders with important knowledge to 1) the main genetic relationships between the Argentinian properly design future strategies in plant breeding, 2) as- durum wheat and other foreign germplasm, iii) estimate sist in germplasm collection management, and 3) con- LD patterns considering the variation in the genome, serve diversity in the national genebanks. To evaluate population structure and the time of release of evaluated the genetic diversity in durum wheat, different wheat genotypes. germplasm collections have been established and genet- ically characterized using DNA markers by several re- Results search institutions [9–17]. Genetic diversity in modern Distribution and physical location of polymorphic SNPs cultivars is usually decreased due to bottleneck events From all the SNP results, 7431 SNPs were high-quality during domestication [18] and strong selection in breed- polymorphic in the 197 durum wheat accessions (Add- ing [13, 19]. However, some authors [17] have found a itional file 1: Table S1a, b), of which 4854 (65.3%) SNPs low or null decay in diversity from landraces to modern showed and minor allele frequency (MAF) > 0.05, here- cultivars, although they observed an effect of breeding after called high frequency (HF) SNPs and 2577 (34.7%) on the linkage disequilibrium (LD) patterns and allele’s corresponded to ˋrare alleles´ SNP with an MAF < Roncallo et al. BMC Genomics (2021) 22:233 Page 3 of 17 0.05, subsequently called low frequency (LF) SNPs. A genetically related to WANA region accessions. On the total of 7222 out of 7431 polymorphic SNPs could be other hand, Chilean accessions were closely related to aligned to the Svevo whole genome sequence assembly CIMMYT germplasm (Additional file 5: Table S4a).