Analysis of Polygenic Risk Score Usage and Performance in Diverse Human Populations
Total Page:16
File Type:pdf, Size:1020Kb
ARTICLE https://doi.org/10.1038/s41467-019-11112-0 OPEN Analysis of polygenic risk score usage and performance in diverse human populations L. Duncan 1, H. Shen1, B. Gelaye2, J. Meijsen 1, K. Ressler3, M. Feldman4, R. Peterson5 & B. Domingue 6 A historical tendency to use European ancestry samples hinders medical genetics research, including the use of polygenic scores, which are individual-level metrics of genetic risk. We analyze the first decade of polygenic scoring studies (2008–2017, inclusive), and find that 1234567890():,; 67% of studies included exclusively European ancestry participants and another 19% inclu- ded only East Asian ancestry participants. Only 3.8% of studies were among cohorts of African, Hispanic, or Indigenous peoples. We find that predictive performance of European ancestry-derived polygenic scores is lower in non-European ancestry samples (e.g. African ancestry samples: t = −5.97, df = 24, p = 3.7 × 10−6), and we demonstrate the effects of methodological choices in polygenic score distributions for worldwide populations. These findings highlight the need for improved treatment of linkage disequilibrium and variant frequencies when applying polygenic scoring to cohorts of non-European ancestry, and bolster the rationale for large-scale GWAS in diverse human populations. 1 Department of Psychiatry and Behavioral Sciences, Stanford University, 401 Quarry Road, Stanford, CA 94305, USA. 2 Department of Epidemiology, Harvard T.H. Chan School of Public Health, 667 Huntington Ave, Kresge 505, Boston, MA 02115, USA. 3 Mailman Research Center, Harvard Medical School, McLean Hospital, 115 Mill St, Belmont, MA 02478, USA. 4 Department of Biology, Stanford University, Herrin 478A, Stanford, CA 94305, USA. 5 Department of Psychiatry, Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, P.O. Box 980003 , Richmond, VA 23298, USA. 6 Graduate School of Education, CERAS 510, Stanford University, Stanford, CA 94305, USA. Correspondence and requests for materials should be addressed to L.D. (email: [email protected]) NATURE COMMUNICATIONS | (2019) 10:3328 | https://doi.org/10.1038/s41467-019-11112-0 | www.nature.com/naturecommunications 1 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-11112-0 he over-representation of participants of European ances- prediction using straightforward scoring methods11,12 for many try in human genetics research has been broadly complex genetic phenotypes (e.g., blood pressure13,17, height18, T 1–5 9,19 7,20 14 acknowledged , and increasing the representation of diabetes , depression , and schizophrenia ). Polygenic risk diverse populations has recently become a higher priority for the scores are calculated by summing risk alleles, which are weighted research community5–10. This has led funding agencies such as by effect sizes derived from GWAS results11,12,21. Commonly the National Institutes of Mental Health to prioritize genetic used methods account for ancestry using principal components studies of diverse populations. Accordingly, representation of (calculated on pruned genetic data). In the parlance of polygenic non-European ancestry participants in genome-wide association scoring studies, the training GWAS is referred to as the discovery studies (GWAS) increased, from 4% in 20091 to 19% in 20163. sample, and the testing dataset is referred to as the target sample. Most of the increase in non-European ancestry research is attri- No overlap between training and testing datasets is essential to butable to expansion of genetic studies of East Asian populations, maintain independence of predictions, and the removal of related as reported previously3 and as observed in our data (see below). individuals is also needed, as demonstrated by Wray et al.21. Thus, most populations are still severely under-represented. This Methods of prediction that offer modest improvements on this lack of representation, if not mitigated, will limit our under- basic framework are also available22–25. standing of etiological factors predisposing to disease risk, and Polygenic scores can be constructed for any complex genetic will hinder efforts to develop precision medicine. It is also phenotype for which appropriate GWAS (or other robust asso- important to understand the implications of the European-centric ciation) results are available. The challenges inherent in using bias of earlier genetic studies for work that builds upon existing polygenic scores—including modest capacity for prediction and research. For example, researchers need to know how the limited necessary considerations regarding statistical power—have been diversity in earlier medical genetic studies impacts the use of reviewed previously21,26. Recent research has focused on the polygenic risk scores in non-European ancestry populations. generalizability of polygenic scores to non-European ancestry The use of polygenic risk scores11,12 (PRS, also known as risk populations27. There is good reason to anticipate reduced pre- profile scoring, genetic scoring, and genetic risk scoring) has dictive power in non-European ancestry samples because of dif- become widespread in biomedical and social science dis- ferences in variant frequencies and linkage disequilibrium ciplines13–15. Businesses have commercialized this technology, patterns between populations12,28. However, the magnitude of including direct-to-consumer testing from 23andMe and other performance decrease is largely unknown. Few systematic studies companies. Perhaps most importantly, there is hope that poly- of polygenic score performance across different ancestry groups genic risk scores can improve health outcomes by accelerating are available, though see Hoffman et al.13 for an investigation of diagnosis and matching patients to tailored treatments16. Poly- blood pressure metrics and a recent UK Biobank analysis27. genic scoring studies have demonstrated reliable, though modest, Further, some previous findings may need to be re-evaluated in a 700 Middle eastern (4 studies) 600 Latino & native american (9 studies) 500 African ancestry (15 studies) Combined (140 studies) 400 Asian (140 studies) European ancestry (459 studies) 300 Studies 200 100 0 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 Year by year Proportion bc Representation of each group European: 460% Asian: 40% Latino: 19% African: 17% Middle eastern: 10% Oceanic: 0% World population Polygenic scoring estimates studies Fig. 1 Ancestry representation in the first decade of polygenic scoring studies (2008–2017; N = 733 studies). a Cumulative numbers of studies by year are denoted by color. The stacked bar graph below the cumulative distribution plot shows proportional ancestry by year. b Stacked bar charts depict world ancestry representation (left) and polygenic scoring study representation (right). c The percentage representation for each ancestry group is given, such that 100% would indicate equal representation in the world and in polygenic scoring studies. For example, European ancestry samples are over- represented (460%) whereas African ancestry samples are under-represented (17%) 2 NATURE COMMUNICATIONS | (2019) 10:3328 | https://doi.org/10.1038/s41467-019-11112-0 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-11112-0 ARTICLE light of newer findings about relationships between ancestry and Oceanic populations have the lowest representation in polygenic GWAS results29–31. Thus, there is a need for systematic evalua- scoring studies (10% and 0%, respectively). tion of polygenic score performance across multiple populations, Having analyzed the use of polygenic scores in different phenotypes, and samples. ancestry groups (above), we next assessed the performance of In this manuscript we answer questions about the predictive polygenic scores in multiple ancestry groups. Since most large- performance of polygenic scores. Based on ancestry, we find large scale GWAS have been conducted in primarily (or exclusively) statistically significant differences in performance between European ancestry individuals42, our a priori hypothesis was that populations. A second major area of inquiry concerns differences polygenic scores would perform best among European ancestry in the distributions of polygenic scores for worldwide popula- individuals, and less well for other populations. Figure 2 provides tions, as currently calculated30–39. (The term “worldwide popu- an overview of polygenic score performance across ancestry lations” is used in this manuscript. It is important to note that groups. Results from all complex genetic phenotypes are analyzed there are many meanings and operationalizations of the term together in order to increase the amount of data available for “population”, and that only some of the world’s populations are analysis. In Fig. 2, each point represents one within-study represented in genetic analyses). Multiple potential causes of comparison between a non-European ancestry sample and the observed distribution differences of polygenic scores have been matched (within-study) European ancestry sample. The vertical reported, including drift32, selection33,36–39, artifactual differences black line represents equal performance in the non-European due to uncorrected population stratification30,31, and different ancestry sample, as compared to the matched European ancestry environmental effects40,41. We show that investigator-driven sample from the same study. choices in the construction of polygenic scores also