Development of a Small Panel of Snps to Infer Ancestry in Chileans That Distinguishes Aymara and Mapuche Components Ricardo A
Total Page:16
File Type:pdf, Size:1020Kb
Verdugo et al. Biol Res (2020) 53:15 https://doi.org/10.1186/s40659-020-00284-5 Biological Research RESEARCH ARTICLE Open Access Development of a small panel of SNPs to infer ancestry in Chileans that distinguishes Aymara and Mapuche components Ricardo A. Verdugo1,2, Alex Di Genova3, Luisa Herrera1, Mauricio Moraga1, Mónica Acuña1, Soledad Berríos1, Elena Llop1, Carlos Y. Valenzuela1, M. Leonor Bustamante1,4, Dayhana Digman1, Adriana Symon1, Soledad Asenjo1, Pamela López1, Alejandro Blanco1, José Suazo5, Emmanuelle Barozet6, Fresia Caba7, Marcelo Villalón8, Sergio Alvarado8, Dante Cáceres8, Katherine Salgado7, Pilar Portales9, Andrés Moreno‑Estrada10, Christopher R. Gignoux11, Karla Sandoval10, Carlos D. Bustamante11, Celeste Eng12, Scott Huntsman12, Esteban G. Burchard13, Nicolás Loira3, Alejandro Maass3,14 and Lucía Cifuentes1* Abstract Background: Current South American populations trace their origins mainly to three continental ancestries, i.e. European, Amerindian and African. Individual variation in relative proportions of each of these ancestries may be con‑ founded with socio‑economic factors due to population stratifcation. Therefore, ancestry is a potential confounder variable that should be considered in epidemiologic studies and in public health plans. However, there are few stud‑ ies that have assessed the ancestry of the current admixed Chilean population. This is partly due to the high cost of genome‑scale technologies commonly used to estimate ancestry. In this study we have designed a small panel of SNPs to accurately assess ancestry in the largest sampling to date of the Chilean mestizo population (n 3349) from eight cities. Our panel is also able to distinguish between the two main Amerindian components of Chileans:= Aymara from the north and Mapuche from the south. Results: A panel of 150 ancestry‑informative markers (AIMs) of SNP type was selected to maximize ancestry informa‑ tiveness and genome coverage. Of these, 147 were successfully genotyped by KASPar assays in 2843 samples, with an average missing rate of 0.012, and a 0.95 concordance with microarray data. The ancestries estimated with the panel of AIMs had relative high correlations (0.88 for European, 0.91 for Amerindian, 0.70 for Aymara, and 0.68 for Mapuche components) with those obtained with AXIOM LAT1 array. The country’s average ancestry was 0.53 0.14 European, 0.04 0.04 African, and 0.42 0.14 Amerindian, disaggregated into 0.18 0.15 Aymara and 0.25 0.13± Mapuche. However,± Mapuche ancestry± was highest in the south (40.03%) and Aymara± in the north (35.61%)± as expected from the historical location of these ethnic groups. We make our results available through an online app and demonstrate how it can be used to adjust for ancestry when testing association between incidence of a disease and nongenetic risk factors. *Correspondence: [email protected] 1 Programa de Genética Humana del ICBM, Facultad de Medicina, Universidad de Chile, Independencia 1027, Santiago, Chile Full list of author information is available at the end of the article © The Author(s) 2020. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creat iveco mmons .org/licen ses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creat iveco mmons .org/publi cdoma in/ zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. Verdugo et al. Biol Res (2020) 53:15 Page 2 of 11 Conclusions: We have conducted the most extensive sampling, across many diferent cities, of current Chilean population. Ancestry varied signifcantly by latitude and human development. The panel of AIMs is available to the community for estimating ancestry at low cost in Chileans and other populations with similar ancestry. Keywords: Chile, Admixture, Ancestry, Aymara, Mapuche, SNPs panel Background in the current Chilean population [13, 14]. Tis asym- Te estimation of ancestral components of current metrical admixture has also been demonstrated for mestizo populations has been the objective of numer- other contemporary Latin American populations [15, ous studies in recent years. Knowledge of population 16]. In recent years there have been two studies of con- ancestry is relevant as a variable to be controlled in tinental ancestry using SNPs, one of them based on 30 genotype-illness association studies [1–3], and is also ancestry-informative SNPs [17] and the other, geno- relevant in forensic genetics [4], prediction of popu- typed with the Afymetrix GeneChip Array [18]. Both lation risks to diseases [5], evolutionary history [6, 7] studies reported that Chileans, in average, have about among other areas. Latin American populations are 43% Amerindian, 55% European and 2% African ances- essentially mixtures from three continents, Europe, try [17, 18]. Tese values are in good agreement with a America and Africa; the latter being less prevalent in previous study based on a few autosomal markers [11]. the southernmost countries of South America. Te Te best estimations of ancestry are obtained by genetic variation in Latin American populations of the genotyping thousands of SNPs in the whole genome; southern extreme of South America, including Chile, however, this strategy is costly if many samples are is underrepresented in publicly accessible world data- analyzed, while studying too few ancestry-informative bases. Te Chilean population was formed mainly by markers (AIMs) may produce estimations with little the mixture of the Amerindian people resident in the confdence. Tus, small panels of AIMs (from a few territory with the Spanish conquistadors in the mid- dozen to a few hundred SNPs) have been designed, dle of the XVI century [8]. Te Spanish later brought which provide enough information to make ancestry African slaves; however, their input to the gene pool of inferences in diferent human populations [1, 17, 19, the current Chilean population is very small. Te native 20]. It has been shown that there is a strong correlation populations that inhabited America before the arrival between ancestry genetic estimates obtained with sets of European colonists were quite heterogeneous [9]. of AIMs and those obtained with high-density data in Tere were many Amerindian groups present in the samples of Latin America [17]. It has been proposed territory today known as Chile, with diferent cultures, that panels of 100 SNPs provide reliable estimations of languages, social organizations and inhabiting widely continental ancestry [21]. Tis technique has been used diferent climates and topographies. History indicates for some populations in southern South America; for that the greatest miscegenation occurred with Aymara instance, in Argentina using 99, 446 and 46 AIMs [16, in the north and Mapuche in central and southern terri- 22, 23] and one study in Chile using a panel of 30 AIMs tories; however, only one recent study has explored the [17]. Te aim of the present study was to design and relative contribution of Aymara and Mapuche peoples test a panel of AIMs especially designed for the Chilean in the admixture process using genomic approaches mestizo population, which is small enough to main- [10]. Te magnitude of the Amerindian component in tain genotyping costs low, but with enough number of the current Chilean population, estimated using blood SNPs to allow diferentiating the Amerindian ances- groups, varies according to socioeconomic strata; rang- tries from the north (Aymara) and south (Mapuche) ing from 0% in the highest class to 40% in the lowest of the country. Tis panel was designed as part of the class [11]. Te estimates on the Chilean population ChileGenomico Project, an initiative fnanced by pub- were based mostly on studies of a few urban centers lic funds (Fund for the Promotion of Scientifc and and very few regions of the autosomal genome. Stud- Technological Development, FONDEF) and sponsored ies using genetic markers with uniparental inheritance by the Ministry of Health [24, 25], which involved the have shown an asymmetrical admixture of the ances- largest sample to date of the average Chilean Continen- tral European and Amerindian genomes by sex, as a tal population (n = 3349), including individuals from consequence of Spanish men mating with Amerindian eight cities spanning 2600 km from north to south. Tis women during the frst century of colonization, which allowed estimating the magnitude of the ancestral com- has produced 84% Amerindian mitochondrial DNA ponents in the genome of the contemporaneous Chil- [12] and less than 20% Amerindian Y chromosomes ean population. Verdugo et al. Biol Res (2020) 53:15 Page 3 of 11 Methods Collection kit tubes (DNAGenotek Inc.,Ontario Canada), Recruitment of participants and answering a questionnaire about their demographic Participants of the ChileGenomico Project were recruited and social background. Te protocols and the informed from the general Chilean population at regular or tem- consent questionnaire were reviewed and approved by porary blood donation centers in eight cities (Fig. 1). We the corresponding ethics committees of healthcare cent- also included 40 father-mother-ofspring trios, who were ers and the ethics committee of the Faculty of Medicine contacted directly by project researchers. In three trios, of the Universidad de Chile. DNA extraction failed for one member and only parents were used in this work.