Published OnlineFirst November 4, 2014; DOI: 10.1158/1055-9965.EPI-14-0694-T
Research Article Cancer Epidemiology, Biomarkers Association of Prostate Cancer Risk Variants with & Prevention Gene Expression in Normal and Tumor Tissue Kathryn L. Penney1,2, Jennifer A. Sinnott1,2,3, Svitlana Tyekucheva3,4, Travis Gerke1, Irene M. Shui1, Peter Kraft1,3, Howard D. Sesso1,5, Matthew L. Freedman6,7, Massimo Loda6,7,8, Lorelei A. Mucci1,2, and Meir J. Stampfer1,2,9
Abstract
Background: Numerous germline genetic variants are associ- of variants with the expression of nearby genes including C2orf43, ated with prostate cancer risk, but their biologic role is not well ITGA6, MLPH, CHMP2B, BMPR1B, and MTL5. Genome-wide, five understood. One possibility is that these variants influence gene genes (MSMB, NUDT11, RBPMS2, NEFM, and KLHL33) were expression in prostate tissue. We therefore examined the associ- significantly associated after accounting for multiple comparisons ation of prostate cancer risk variants with the expression of genes for each SNP (P < 2.5 10 6). Many more genes had an FDR nearby and genome-wide. <10%, including SRD5A1 and PSCA, and we observed significant Methods: We generated mRNA expression data for 20,254 associations with pathways in tumor tissue. genes with the Affymetrix GeneChip Human Gene 1.0 ST micro- Conclusions: The risk variants were associated with several array from normal prostate (N ¼ 160) and prostate tumor (N ¼ genes, including promising prostate cancer candidates and lipid 264) tissue from participants of the Physicians' Health Study and metabolism pathways, suggesting mechanisms for their impact Health Professionals Follow-up Study. With linear models, we on disease. These genes should be further explored in biologic and tested the association of 39 risk variants with nearby genes and all epidemiologic studies. genes, and the association of each variant with canonical path- Impact: Determining the biologic role of these variants can lead ways using a global test. to improved understanding of prostate cancer etiology and iden- Results: In addition to confirming previously reported associa- tify new targets for chemoprevention. Cancer Epidemiol Biomarkers tions, we detected several new significant (P < 0.05) associations Prev; 24(1); 255–60. 2014 AACR.
Introduction cancer. Family and twin studies demonstrate that prostate cancer is highly heritable (16); these SNPs explain an ever increasing Numerous germline genetic risk variants have been linked to portion of this underlying heritability, currently about 30% (15). prostate cancer risk from genome-wide association studies (1– However, the biologic function of these risk SNPs remains largely 14). With the report from the PRACTICAL consortium in 2013, unknown given that the majority is located outside of protein the number of prostate cancer risk variants is now >70 (15), a coding regions. A critical next step in translating knowledge of major step toward uncovering the genetic etiology of prostate identified SNPs to the prevention or treatment of prostate cancer is determining their biologic mechanisms. One possibility is that the risk variants are expression quanti- 1Department of Epidemiology, Harvard School of Public Health, Bos- tative trait loci (eQTL), genetic loci that are associated with mRNA ton, Massachusetts. 2Channing Division of Network Medicine, Depart- ment of Medicine, Brigham and Women's Hospital and Harvard Med- transcript levels. Few large studies have both SNP data and ical School, Boston, Massachusetts. 3Department of Biostatistics, prostate tissue for gene expression studies. We recently showed Harvard School of Public Health, Boston, Massachusetts. 4Department that a prostate cancer risk SNP on chromosome 10q11, of Biostatistics and Computational Biology, Dana-Farber Cancer Insti- rs10993994, was significantly associated with mRNA expression tute, Boston, Massachusetts. 5Division of Preventive Medicine, Depart- ment of Medicine, Brigham and Women's Hospital and Harvard Med- of two nearby genes (17) in prostate tissue. Men with the risk allele ical School, Boston, Massachusetts. 6Department of Medical Oncolo- had decreased expression of MSMB in both normal prostate and gy, Dana-Farber Cancer Institute, Boston, Massachusetts. 7The Broad NCOA4 8 tumor tissue, and increased expression of in normal Institute, Cambridge, Massachusetts. Department of Pathology, Brig- MSMB ham and Women's Hospital and Harvard Medical School, Boston, prostate tissue only. A similar result for was observed by Massachusetts. 9Department of Nutrition, Harvard School of Public Lou and colleagues (18). A study of 12 prostate cancer risk loci Health, Boston, Massachusetts. found that four acted as eQTLs. In addition to confirming the Note: Supplementary data for this article are available at Cancer Epidemiology, MSMB and NCOA4 results, NUDT11 was associated with Biomarkers & Prevention Online (http://cebp.aacrjournals.org/). rs5945619 and SLC22A3 was borderline significantly associated Corresponding Author: Kathryn L. Penney, Channing Division of Network with rs9364554 in both tumor and normal tissue, and HNF1B was Medicine, Brigham and Women's Hospital, 181 Longwood Ave., Boston, MA associated with rs4430796 in normal tissue only (19). Harries and 02115. Phone: 617-525-0860; Fax: 617-525-2008; E-mail: colleagues observed an association between rs6465657 and [email protected] expression of nearby LMTK2 (20), whereas Xu and colleagues doi: 10.1158/1055-9965.EPI-14-0694-T found a proxy for rs12653946 to be strongly associated with the 2014 American Association for Cancer Research. expression of nearby IRX4 (21).
www.aacrjournals.org 255
Downloaded from cebp.aacrjournals.org on September 30, 2021. © 2015 American Association for Cancer Research. Published OnlineFirst November 4, 2014; DOI: 10.1158/1055-9965.EPI-14-0694-T
Penney et al.
Published work has primarily focused on the expression of transcriptome amplification system that allows for complete gene genes near the risk variants. Although variants may have larger expression analysis from archives of FFPE samples known to effects on nearby genes, and there is therefore more power to harbor small and degraded RNA. Using a combination 50 and identify these effects, genetic polymorphisms can influence the random primer, reverse transcription created a cDNA/mRNA expression of genes anywhere in the genome. This can happen hybrid. The mRNA was subsequently fragmented, creating bind- either directly or downstream in a pathway, such as through a ing sites for DNA polymerase. Isothermal strand-displacement, transcription factor (reviewed in ref. 22). We therefore examined using a proprietary DNA/RNA chimeric SPIA primer, amplified the association of the risk variants with transcriptome-wide the cDNA. The cDNA was then fragmented and labeled with a expression data in tumor and normal prostate tissue, performing terminal deoxynucleotidyl transferase covalently linked to biotin a cis analysis (examining the association of the variants with to prepare for microarray hybridization. The labeled cDNA was nearby genes), a trans-analysis (determining the association of then hybridized to a GeneChip Human Gene 1.0 ST microarray the variants with all genes), and a pathway analysis. (Affymetrix). For the expression profiles generated, we regressed out technical Materials and Methods variables including mRNA concentration, age of the block, batch (96-well plate), percentage of the probes on the array detectable Study participants above the background, log-transformed average background sig- Physicians' Health Study and Health Professionals Follow-up Study. nal, and the median of the perfect match probes for each probe The men in the study are participants in two prospective studies intensity of the raw data. The residuals were then shifted to have ongoing for more than 25 years: the Physicians' Health Study the original mean expression values and normalized using the (PHS) and Health Professionals Follow-up Study (HPFS). PHS RMA method (25, 26). We mapped gene names to Affymetrix began in 1986 as a randomized, double-blind trial of aspirin and transcript cluster IDs using the NetAffx annotations as implemen- b -carotene in the prevention of cardiovascular disease and cancer ted in Bioconductor annotation package pd.hugene.1.0.st.v1; this among 22,071 initially healthy U.S. physicians (23). The HPFS, an resulted in 20,254 unique named genes. Gene expression data are ongoing prospective cohort study on the causes of cancer and available through Gene Expression Omnibus accession number heart disease in men, consists of 51,529 U.S. health professionals GSE62872. who were ages 40 to 75 years in 1986 (24). In both studies, men were excluded if they had any serious medical conditions at Risk variant genotypes baseline including all cancers (except nonmelanoma skin cancer). The SNPs were genotyped on DNA extracted from whole blood The men in this study were diagnosed with incident, histolog- as part of the National Cancer Institute funded Breast and Prostate ically confirmed prostate cancer between 1982 and 2004. Parti- Cancer Cohort Consortium (BPC3) using the TaqMan assay cipants are followed through regular questionnaires to collect self- (Applied Biosystems) at the Harvard School of Public Health reported data on diet, lifestyle behaviors, medical history, and (Boston, MA); details on the SNP selection and genotyping are health outcomes, including prostate cancer. All prostate cancer provided in (ref. 27). Prostate cancer participants included in the cases in this study were verified through medical record and current study are all of European ancestry. To reduce missing data, pathology review. Through this systematic medical record review, we combined data for SNPs in very high linkage disequilibrium. we also abstract data on clinical information, including clinical For rs12418451, we used genotypes from either rs12418451 or stage and PSA at diagnosis. rs10896438 (r2 ¼ 0.96 in HapMap CEU population); for The Human Subjects Committee at Partners Healthcare and the rs2928679, we used genotypes from either rs2928679 or Harvard School of Public Health (Boston, MA) approved these rs13264338 (r2 ¼ 0.97); for rs1983891, we used genotypes from studies. either rs1983891 or rs9381080 (r2 ¼ 1.00); and for rs11672691, we used genotypes from either rs11672691 or rs11673591 (r2 ¼ mRNA expression profiling 1.00). In addition, eight SNPs from (ref. 27) were not genotyped In both cohorts, we sought to retrieve archival formalin-fixed in PHS, and were therefore excluded from this study due to the paraffin embedded (FFPE) specimens. The PHS and HPFS Tumor reduction in sample size. The average SNP and individual call rate Cohort includes 2,200 men with prostate cancer for whom we were 95.2% and 95.3%, respectively. have collected archival radical prostatectomy (RP; 95%) and trans-urethral resection of the prostate (TURP; 5%) specimens. Statistical analysis For a subset of the tumor cohort, we undertook whole-genome There were 264 participants with SNP and tumor tissue expres- gene expression profiling as part of a study designed to identify sion data; 160 of these cases also had normal prostate tissue expression signatures that can differentiate lethal from indolent expression data. Each SNP (three genotype categories: 0, 1, or 2 prostate cancer. We sampled men from the Tumor Cohort using copies of the risk allele) was compared with each gene (contin- an extreme case design, which includes 116 men who died of their uous expression) with a linear model test for trend. This analysis cancer or developed bony or distant metastases and 292 men who was performed separately for gene expression from prostate lived at least 8 years after prostate cancer diagnosis and were not tumor and normal prostate tissue. First, a cis analysis was per- diagnosed with metastases through 2012. For a subset of these formed, examining the association between each SNP and "near- men, we also profiled adjacent normal tissue. To conduct this by" (500 kb up- and downstream) genes. For this more focused profiling in FFPE tissues, whole transcriptome amplification was analysis, P < 0.05 was considered statistically significant; despite paired with microarray technologies. Briefly, RNA was extracted the possibility for false positives using this liberal approach, these using the Biomek FxP automated platform with the Agencourt are reported to provide candidates for future studies. Next, a FormaPure FFPE kit (Beckman Coulter). The mRNA was ampli- genome-wide analysis was performed, examining the association fied using the WT-Ovation FFPE System V2 (NuGEN), a whole of each SNP with all genes. For this analysis, a Bonferroni
256 Cancer Epidemiol Biomarkers Prev; 24(1) January 2015 Cancer Epidemiology, Biomarkers & Prevention
Downloaded from cebp.aacrjournals.org on September 30, 2021. © 2015 American Association for Cancer Research. Published OnlineFirst November 4, 2014; DOI: 10.1158/1055-9965.EPI-14-0694-T
Prostate Cancer Risk Variants and Tissue Gene Expression