Genome-Wide Detection of Natural Selection in African Americans Pre- and Post-Admixture
Total Page:16
File Type:pdf, Size:1020Kb
Downloaded from genome.cshlp.org on September 29, 2021 - Published by Cold Spring Harbor Laboratory Press Method Genome-wide detection of natural selection in African Americans pre- and post-admixture Wenfei Jin,1 Shuhua Xu,1,6 Haifeng Wang,2 Yongguo Yu,3 Yiping Shen,4,5 Bailin Wu,4,5 and Li Jin1,4,6 1Chinese Academy of Sciences Key Laboratory of Computational Biology, Chinese Academy of Sciences and Max Planck Society (CAS-MPG) Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China; 2Chinese National Human Genome Center, Shanghai 201203, China; 3Shanghai Children’s Medical Center, Shanghai Jiaotong University School of Medicine, Shanghai 200127, China; 4Ministry of Education (MOE) Key Laboratory of Contemporary Anthropology, School of Life Sciences and Institutes of Biomedical Sciences, Fudan University, Shanghai 200433, China; 5Children’s Hospital Boston, Harvard Medical School, Boston, Massachusetts 02115, USA It is particularly meaningful to investigate natural selection in African Americans (AfA) due to the high mortality their African ancestry has experienced in history. In this study, we examined 491,526 autosomal single nucleotide poly- morphisms (SNPs) genotyped in 5210 individuals and conducted a genome-wide search for selection signals in 1890 AfA. Several genomic regions showing an excess of African or European ancestry, which were considered the footprints of selection since population admixture, were detected based on a commonly used approach. However, we also developed a new strategy to detect natural selection both pre- and post-admixture by reconstructing an ancestral African population (AAF) from inferred African components of ancestry in AfA and comparing it with indigenous African populations (IAF). Interestingly, many selection-candidate genes identified by the new approach were associated with AfA-specific high- risk diseases such as prostate cancer and hypertension, suggesting an important role these disease-related genes might have played in adapting to a new environment. CD36 and HBB, whose mutations confer a degree of protection against malaria, were also located in the highly differentiated regions between AAF and IAF. Further analysis showed that the frequencies of alleles protecting against malaria in AAF were lower than those in IAF, which is consistent with the relaxed selection pressure of malaria in the New World. There is no overlap between the top candidate genes detected by the two approaches, indicating the different environmental pressures AfA experienced pre- and post-population admixture. We suggest that the new approach is reasonably powerful and can also be applied to other admixed populations such as Latinos and Uyghurs. [Supplemental material is available for this article.] Although the vast majority of human genetic variations evolve might reflect the natural selection since the two split. In this case, neutrally, some parts, however, have been shaped by natural se- we used African Americans, a well-studied admixed population, as lection (Kimura 2003; Balaresque et al. 2007). Studies on recent the object of our study. natural selection have led to the discovery of genes showing African Americans (AfA) are residents of the United States population differences in adapting to pathogens, diet, climate, and with partial recent Sub-Saharan African ancestry. The majority of other environmental challenges. These discoveries have greatly them are descendents of the Africans, probably 500,000 to 650,000 enriched our understanding about the origins and also the evolu- in number, who were forcibly brought to North America during the tionary history of the human species, identified many genes with Middle Passage (Thomas 1999; Zakharia et al. 2009). However, important biological functions, and will lead to further elucidation many of those captive Africans died either during the Atlantic of the genetic basis of some human diseases (Balaresque et al. 2007; shipment to America due to the severe conditions, or soon upon Nielsen et al. 2007; Sabeti et al. 2007; Hancock et al. 2008; Akey their arrival in the New World as a result of exposure to foreign 2009). The recent availability of high-density SNPs has provided pathogens and/or the poor living conditions. Although the exact essential resources for genome-wide detection of natural selection, amount of life lost in this process remains a mystery, it may equal especially in ethnically well-defined populations with little ad- or exceed the actual amount enslaved (Stannard 1993). Therefore, mixture (Sabeti et al. 2007; Barreiro et al. 2008; Akey 2009). Al- the high mortality of AfA in the whole slavery era could be attributed though there have been several studies on recently admixed pop- to the overwhelming environmental challenges. These persistent ulations (Tang et al. 2007; Basu et al. 2008; Bryc et al. 2010), no selection pressures might make the frequencies of ‘‘beneficial’’ alleles study so far has particularly investigated the locus-specific pop- increase continually, which should lead to higher population differ- ulation differentiation between the ancestral components of an entiation between African components of ancestry in AfA and in- admixed population and its ancestral parental population, which digenous Africans at these loci, in contrast to those evolved neutrally. Meanwhile, before Africans and Europeans migrated to the New World, their ancestral parental populations had evolved in- 6Corresponding authors. dependently in distinct environments for tens of thousands of E-mail [email protected]. years (Basu et al. 2008). The completely new environment in the E-mail [email protected]. Article published online before print. Article, supplemental material, and publi- New World constituted a challenge for both populations and the cation date are at http://www.genome.org/cgi/doi/10.1101/gr.124784.111. populations of their admixture. Therefore, it is very likely that 22:519–527 Ó 2012 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/12; www.genome.org Genome Research 519 www.genome.org Downloaded from genome.cshlp.org on September 29, 2021 - Published by Cold Spring Harbor Laboratory Press Jin et al. some genomic regions in AfA show an excess of a particular an- of 88 CEU and 88 YRI (the phased parents of trios, presumably cestry as a result of selection pressures after the population admixture unrelated, from HapMap3 data) to represent their ancestral pa- (Tang et al. 2007; Basu et al. 2008). For example, there are various rental populations. The identical monomorphic SNPs in CEU and studies in detecting signatures of selection in AfA by examining ad- YRI samples were removed for they could not provide valuable mixture proportion using a small number of available loci (Workman information in the local ancestry inference. HAPMIX (Price et al. et al. 1963; Reed 1969; Blumberg and Hesser 1971; Adams and Ward 2009) was employed to estimate the locus-specific genetic contri- 1973; Long 1991). Recently, Bryc et al. (2010) identified three auto- butions of the ancestral parental populations to AfA (see Methods) somal regions showing excessive or reduced African ancestry (by at by taking 21.65% as the European contribution, as estimated by least three standard deviations [SD] of the mean) as natural selection FRAPPE. Based on the likelihood given by HAPMIX, generation since candidates based on ;500K SNPs genotyped in 365 AfA. admixture (l) was estimated to be l = 7 (an essentially hybrid iso- Here, we analyzed the genotype data from ;500,000 auto- lation model), which was similar to those based on other AfA data somal SNPs shared by 1890 AfA and 3320 non-AfA for detecting sets (Smith et al. 2004; Price et al. 2009). Detailed analysis showed the signatures of selection in AfA that were classified into pre- that l values for most individuals ranged from 1 to 12, and could be admixture and post-admixture according to the different envi- explained by a continuous-gene-flow (CGF) model (see Supple- ronments they experienced (Supplemental Fig. S1). Firstly, we de- mental Text and Supplemental Fig. S6). The locus-specific European tected the natural selections that were more likely to occur after ancestry proportion across the genome of AfA was estimated to be admixture by examining the genome-wide distribution of ancestry 21.68% 6 0.75% (mean 6 SD). The SD of locus-specific ancestral in AfA. Then we developed a new strategy by reconstructing an genetic contributions in this study is lower than those in any pre- ancestral African population (AAF) from inferred African compo- vious studies (Workman et al. 1963; Reed 1969; Blumberg and Hesser nents of ancestry in AfA and comparing it with indigenous African 1971; Tang et al. 2007; Basu et al. 2008; Bryc et al. 2010), which was populations (IAF) (Supplemental Fig. S2), which reflect natural not beyond our expectation since we used a much larger sample size. selection since the African ancestors of AfA left Africa (including The genomic regions showing excessive or reduced ancestry both pre- and post-admixture). Many candidate genes identified in the admixed population are likely to be signatures of natural se- by our new approach could explain the challenges that AfA and lection (Tang et al. 2007; Basu et al. 2008; Bryc et al. 2010; Oleksyk their ancestry had experienced. Thus we suggest that our new et al. 2010). The loci showing strong deviation of European an- approach is reasonably powerful and can also be applied to the cestry (3 SDs above or below the genome-wide average) were studies of other admixed populations. therefore identified as candidates of natural selection in this study. Four regions (2p22, 3q13, 6q26, 16q21) with excessive European Results influence and two regions (1p36, 2q37) with excessive African influence were observed in AfA genomes (Fig. 1). Each of the six African and European ancestries in AfA regions was significantly different from the genome-wide average À16 The populations from West Africa and Europe, who had undoubtedly of ancestral contributions (P < 2.2 3 10 , t-test).