Identification of Pleiotropic Cancer Susceptibility Variants from Genome-Wide Association Studies Reveals Functional Characteristics
Published OnlineFirst November 17, 2017; DOI: 10.1158/1055-9965.EPI-17-0516
Research Article
Cancer Epidemiology, Biomarkers & Prevention

Yi-Hsuan Wu1, Rebecca E. Graff1, Michael N. Passarelli2, Joshua D. Hoffman1, Elad Ziv3,4,5, Thomas J. Hoffmann1,3, and John S. Witte1,3,5,6

Abstract

Background: There exists compelling evidence that some genetic variants are associated with the risk of multiple cancer sites (i.e., pleiotropy). However, the biological mechanisms through which the pleiotropic variants operate are unclear.

Methods: We obtained all cancer risk associations from the National Human Genome Research Institute-European Bioinformatics Institute GWAS Catalog, and correlated cancer risk variants were clustered into groups. Pleiotropic variant groups and genes were functionally annotated. Associations of pleiotropic cancer risk variants with noncancer traits were also obtained.

Results: We identified 1,431 associations between variants and cancer risk, comprised of 989 unique variants associated with 27 unique cancer sites. We found 20 pleiotropic variant groups (2.1%) composed of 33 variants (3.3%), including novel pleiotropic variants rs3777204 and rs56219066 located in the ELL2 gene. Relative to single-cancer risk variants, pleiotropic variants were more likely to be in genes (89.0% vs. 65.3%, P = 2.2 × 10⁻¹⁶), and to have somewhat larger risk allele frequencies (median RAF = 0.49 versus 0.39, P = 0.046). The 27 genes to which the pleiotropic variants mapped were suggestive for enrichment in response to radiation and hypoxia, alpha-linolenic acid metabolism, cell cycle, and extension of telomeres. In addition, we observed that 8 of 33 pleiotropic cancer risk variants were associated with 16 traits other than cancer.

Conclusions: This study identified and functionally characterized genetic variants showing pleiotropy for cancer risk.

Impact: Our findings suggest biological pathways common to different cancers and other diseases, and provide a basis for the study of genetic testing for multiple cancers and repurposing cancer treatments.

Cancer Epidemiol Biomarkers Prev; 27(1); 1–11. ©2017 AACR.

1Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California. 2Department of Epidemiology, Geisel School of Medicine, Dartmouth College, Hanover, New Hampshire. 3Institute for Human Genetics, University of California San Francisco, San Francisco, California. 4Division of General Internal Medicine, Department of Medicine, University of California San Francisco, San Francisco, California. 5Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, California. 6Department of Urology, University of California San Francisco, San Francisco, California.

Note: Supplementary data for this article are available at Cancer Epidemiology, Biomarkers & Prevention Online (http://cebp.aacrjournals.org/).

Corresponding Author: John S. Witte, University of California San Francisco, 1450 3rd St, San Francisco, CA 94158. Phone: 415-502-6882; Fax: 415-476-1356; E-mail: [email protected]

©2017 American Association for Cancer Research.

Introduction

An emerging focus in cancer research is the discovery and understanding of the shared genetic basis underlying the development of different cancer types. In the past 10 years, genome-wide association studies (GWAS) have identified hundreds of genetic variants associated with cancer risk (1–3), and several loci have been associated with multiple cancer sites. For example, variants at the 8q24 locus have been associated with prostate (4, 5), colorectal (6–8), bladder (9), breast (10), and ovarian cancers (11), glioma (12), and chronic lymphocytic leukemia (13, 14). The genes closest to this locus are FAM84B and MYC, both known cancer-related genes. As another example, the 5p15 locus containing TERT and CLPTM1L is associated with multiple cancer types, including lung (15, 16), testicular (17), prostate (18–20), breast (21), colorectal (22) cancers, and glioma (12).

Pleiotropy refers to the phenomenon of a gene or genetic variant affecting more than one phenotypic trait. Identifying and characterizing pleiotropic genes and variants may have important clinical and pharmacologic implications (23, 24). For example, a drug used for one cancer type may be repurposed to treat another cancer type if the therapeutic target is common to both cancers. In addition, genetic tests for pleiotropic variants may provide an efficient way to identify patients at high risk of multiple cancers. Understanding the functional mechanisms by which variants exhibit pleiotropy is important toward prioritizing potential drug or genetic testing targets.

Recent studies have looked at whether genetic variants previously associated with one cancer are associated with other cancers. Cancers studied in this way include endometrial (25), colorectal (26, 27), pancreatic (28), esophageal (29, 30), prostate (31), lung (32), ovarian (33), gastric (34), and estrogen receptor negative (ER-) breast cancers (35), and non-Hodgkin lymphoma (36). Cross-cancer GWAS analyses for two to five cancers have also been conducted to identify pleiotropic variants (37–40). Previous work has also estimated the genetic correlation between pairs of cancers using data from GWAS for multiple cancer sites (41, 42).

We build on this previous work by investigating pleiotropy across all cancer results presented in the National Human
Genome Research Institute-European Bioinformatics Institute (NHGRI-EBI) GWAS Catalog (1–3). The GWAS Catalog provides publicly available, manually curated, and literature-derived SNP-trait associations with P values <10⁻⁵ from GWAS assessing at least 100,000 SNPs.

Previous analyses of the GWAS Catalog data found substantial evidence of pleiotropy across various traits (43). However, this work did not fully investigate potential pleiotropy arising from variants in linkage disequilibrium (LD) with the associated variants, the functional implications of pleiotropic variants, or the ancestral populations in which the variants were detected. In this study, we addressed these limitations by evaluating variants in LD with the reported variants, functionally characterizing the GWAS reported variants, and incorporating ancestry information. Furthermore, we investigated the associations of the pleiotropic cancer risk variants with other diseases and traits.

Materials and Methods

Determining associations with cancer risk in the GWAS catalog

We accessed all associations reported in the GWAS Catalog as of September 27, 2016. These were mapped to Ensembl release version 85 and contained associations published from March 10, 2005 through October 30, 2015. For associations with any given trait, the data contained the most statistically significant variant from each independent locus for each study. To perform an initial screening for associations with cancer risk, we utilized Experimental Factor Ontology (EFO) terms (release 2016-03-15; refs. 44, 45). The curated traits in the GWAS Catalog are mapped to EFO terms to facilitate cross-study comparisons. The initial set of associations we evaluated included traits mapped to the term "neoplasm", defined as benign or malignant tissue growth resulting from uncontrolled cell proliferation (44). "Neoplasm" and

Asian (EAS), Ad Mixed American (AMR), African (AFR), and South Asian (SAS)]. The number of cancer risk associations for each super population was calculated. For associations without ancestry data reported in the Catalog, we reviewed the original publications to obtain the ancestry information.

Our goal was to identify the following two types of cancer risk variants: (i) pleiotropic within ethnic group, and (ii) pleiotropic across ethnic groups. To identify the first type, we estimated pairwise LD among variants discovered in the same super population using reference genotype data from the corresponding super population. To identify the second type, we estimated pairwise LD among all variants, regardless of the discovery population, using reference genotype data from each of the five 1000 Genomes Project's (46) super populations individually.

We ensured that all rs numbers were updated to build 142 of the Single Nucleotide Polymorphism Database (dbSNP; http://www.ncbi.nlm.nih.gov/projects/SNP/; ref. 47). For variants lacking rs numbers in the original publications, we used chromosomal positions and the UCSC Genome Browser (48) to obtain rs numbers. LD was estimated with LDlink (http://analysistools.nci.nih.gov/LDlink/) (49), which uses genotype data from Phase 3 of the 1000 Genomes Project and variant rs numbers indexed based on dbSNP build 142. HaploReg v4.1 (http://www.broadinstitute.org/mammals/haploreg; refs. 50, 51) was used to evaluate LD for variants that could not be assessed by LDlink. We were unable to calculate LD for variants that were monoallelic in a given population and/or not in the 1000 Genomes data.

Identifying variants associated with the risk of multiple cancer sites

First, to identify variants pleiotropic within the same ethnic group, we grouped variants based on LD estimated in each super