Carcinogenesis, 2015, Vol. 36, No. 11, 1341–1353

doi:10.1093/carcin/bgv138 Advance Access publication September 29, 2015 Original Manuscript

Common variants at the CHEK2 gene locus and risk of epithelial ovarian cancer

Kate Lawrenson109,†, Edwin S.Iversen1,†, Jonathan Tyrer2,†, Rachel Palmieri Weber3, Patrick Concannon4, Dennis J.Hazelett5, Qiyuan Li6, Jeffrey R.Marks7, Andrew Berchuck8, Janet M.Lee, Katja K.H.Aben9,10, Hoda Anton-Culver11, Natalia Antonenkova12, Australian Cancer Study (Ovarian Cancer)13, Australian Ovarian Cancer Study Group13,14, Elisa V.Bandera15, Yukie Bean16,17, Matthias W.Beckmann18, Maria Bisogna19, Line Bjorge20,21, Natalia Bogdanova22, Louise A.Brinton23, Angela Brooks-Wilson24,25, Fiona Bruinsma26, Ralf Butzow27,28, Ian G.Campbell29,30,31, Karen Carty32, Jenny Chang-Claude33, Georgia Chenevix-Trench34, Ann Chen35, Zhihua Chen35, Linda S.Cook36, Daniel W.Cramer37,38, Julie M.Cunningham39, Cezary Cybulski40, Joanna Plisiecka-Halasa41, Joe Dennis42, Ed Dicks42, Jennifer A.Doherty43, Thilo Dörk22, Andreas du Bois4,45, Diana Eccles46, Douglas T.Easton42, Robert P.Edwards46,47, Ursula Eilber33, Arif B.Ekici48, Peter A.Fasching18,49, Brooke L.Fridley50, Yu-Tang Gao51, Aleksandra Gentry-Maharaj52, Graham G.Giles26,53, Rosalind Glasspool32, Ellen L.Goode54, Marc T.Goodman55,56, Jacek Gronwald40, Philipp Harter44,45, Hanis Nazihah Hasmad57, Alexander Hein18, Florian Heitz44,45, Michelle A.T.Hildebrandt58, Peter Hillemanns59, Estrid Hogdall60,61, Claus Hogdall62, Satoyo Hosono63, Anna Jakubowska40, James Paul32, Allan Jensen64, Beth Y.Karlan65, Susanne Kruger Kjaer64,66, Linda E.Kelemen67, Melissa Kellar16,17, Joseph L.Kelley68, Lambertus A.Kiemeney69, Camilla Krakstad20,21, Diether Lambrechts70,71, Sandrina Lambrechts72, Nhu D.Le73, Alice W.Lee, Rikki Cannioto74, Arto Leminen27, Jenny Lester65, Douglas A.Levine19, Dong Liang75, Jolanta Lissowska41, Karen Lu76, Jan Lubinski40, Lene Lundvall62, Leon F.A.G.Massuger77, Keitaro Matsuo78, Valerie McGuire79, John R.McLaughlin80, Heli Nevanlinna27, Iain McNeish81, Usha Menon52, Francesmary Modugno47,82,83,84, Kirsten B.Moysich74, Steven A.Narod85, Lotte Nedergaard86, Roberta B.Ness87, Mat Adenan Noor Azmi88, Kunle Odunsi89, Sara H.Olson90, Irene Orlow90, Sandra Orsulic65, Celeste L.Pearce91, Tanja Pejovic16,17, Liisa M.Pelttari27, Jennifer Permuth-Wey92, Catherine M.Phelan92, Malcolm C.Pike1,90, Elizabeth M.Poole93,94, Susan J.Ramus, Harvey A.Risch95, Barry Rosen96, Mary Anne Rossing97,98, Joseph H.Rothstein79, Anja Rudolph33, Ingo B.Runnebaum99, Iwona K.Rzepecka40, Helga B.Salvesen20,21,

Received: May 28, 2015; Revised: September 14, 2015; Accepted: September 16, 2015

© The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: [email protected].


Agnieszka Budzilowska41, Thomas A.Sellers92, Xiao-Ou Shu100, Yurii B.Shvetsov101, Nadeem Siddiqui102, Weiva Sieh79, Honglin Song2, Melissa C.Southey30, Lara Sucheston75, Ingvild L.Tangen20,21, Soo-Hwang Teo58,103, Kathryn L.Terry37,38, Pamela J.Thompson55,56, Agnieszka Timorek104, Shelley S.Tworoger93,94, Els Van Nieuwenhuysen72, Ignace Vergote72, Robert A.Vierkant54, Shan Wang-Gohrke105, Christine Walsh65, Nicolas Wentzensen23, Alice S.Whittemore79, Kristine G.Wicklund98, Lynne R.Wilkens101, Yin-Ling Woo88,103, Xifeng Wu58, Anna H.Wu, Hannah Yang23, Wei Zheng100, Argyrios Ziogas10, Gerhard A.Coetzee5, Matthew L.Freedman106, Alvaro N.A.Monteiro107, Joanna Moes-Sosnowska41, Jolanta Kupryjanczyk41, Paul D.Pharoah2, Simon A.Gayther110 and Joellen M.Schildkraut3,108,*

Department of Preventive Medicine, Keck School of Medicine, University of Southern California Norris Comprehensive Cancer Center, Los Angeles, CA 90033, USA, 1Department of Statistical Science, Duke University, Durham, NC 27708, USA, 2Department of Oncology, Department of Public Health and Primary Care, University of Cambridge, Strangeways Research Laboratory, Cambridge CB2 1TN, UK, 3Department of Community and Family Medicine, Duke University Medical Center, Durham, NC 27710, USA, 4Genetics Institute and Department of Pathology, Immunology and Laboratory Medicine, University of Florida, Gainesville, FL 32611, USA, 5Departments of Urology and Preventive Medicine, Norris Cancer Center, University of Southern California Keck School of Medicine, Los Angeles, CA 90089, USA, 6Medical School, Xiamen University, Xiamen 361000, China, 7Department of Surgery, Duke University Medical Center, Durham, NC 27710, USA, 8Department of Obstetrics and Gynecology, Duke University Medical Center, Durham, NC 27710, USA, 9Department for Health Evidence, Nijmegen 6500 HB, The Netherlands, 10Comprehensive Cancer Center, Utrecht 3542 EG, The Netherlands, 11Department of Epidemiology, Genetic Epidemiology Research Institute, School of Medicine, University of California Irvine, Irvine, CA 92697, USA, 12Byelorussian Institute for Oncology and Medical Radiology Aleksandrov N.N., Minsk 223052, Belarus, 13Cancer Division, QIMR Berghofer Medical Research Institute, Brisbane Queensland 4006, Australia, 14Peter MacCallum Cancer Institute, East Melbourne, Victoria 3002, Australia, 15Cancer Prevention and Control, Rutgers Cancer Institute of New Jersey, New Brunswick, NJ 08903, USA, 16Department of Obstetrics and Gynecology, Oregon Health and Science University, Portland, OR 97239, USA, 17Knight Cancer Institute, Oregon Health and Science University, Portland, OR 97239, USA, 18Department of Gynecology and Obstetrics, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nuremberg, Comprehensive Cancer Center Erlangen-EMN, Erlangen 91054, Germany, 19Gynecology Service, Department of Surgery, Memorial Sloan-Kettering Cancer Center, New York, NY 10065, USA, 20Department of Gynecology and Obstetrics, Haukeland University Hospital, Bergen 5021, Norway, 21Department of Clinical Science, Centre for Cancer Biomarkers, University of Bergen, Bergen
5020, Norway, 22Gynaecology Research Unit, Hannover Medical School, Hannover 30625, Germany, 23 Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20852, USA, 24Canada’s Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia V5Z 1L3, Canada, 25Department of Biomedical Physiology and Kinesiology, Simon Fraser University, Burnaby, British Columbia V5A 1S6, Canada, 26Cancer Epidemiology Centre, Cancer Council of Victoria, Melbourne Victoria 3004, Australia, 27Department of Obstetrics and Gynecology, University of Helsinki and Helsinki University Central Hospital, Helsinki 00029, Finland, 28Department of Pathology, Helsinki University Central Hospital, Helsinki 00029, Finland, 29Cancer Genetics Laboratory, Research Division, Peter MacCallum Cancer Centre, Melbourne, Victoria 8006, Australia, 30Department of Pathology, University of Melbourne, Parkville, Victoria 3002, Australia, 31Sir Peter MacCallum Department of Oncology, University of Melbourne, Parkville, Victoria 3002, Australia, 32Cancer Research UK Clinical Trials Unit, The Beatson West of Scotland Cancer Centre, Glasgow G12 0YN, UK, 33Division of Cancer Epidemiology, German Cancer Research Center (DKFZ), Heidelberg 69009, Germany, 34Cancer Division, QIMR Berghofer Medical Research Institute, Brisbane Queensland 4029, Australia, 35Department of Biostatistics, Moffitt Cancer Center, Tampa, FL 33612, USA, 36Division of Epidemiology and Biostatistics, Department of Internal Medicine, University of New Mexico, Albuquerque, NM 87131, USA, 37Harvard School of Public Health, Boston, MA 02115, USA, 38Obstetrics and Gynecology Epidemiology Center, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA, 39Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN 55905, USA, 40Department of Genetics and Pathology, Pomeranian Medical University, Szczecin 70-115, Poland, 41Department of Cancer Epidemiology and 
Prevention, Maria Sklodowska-Curie Memorial Cancer Center and Institute of Oncology, Warsaw 02-781, Poland, 42Department of Public Health and Primary Care, Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge CB1 8RN, UK, 43Department of Community and Family Medicine, Section of Biostatistics and Epidemiology, The Geisel School of Medicine at Dartmouth, Lebanon, NH 03756, USA, 44Department of Gynecology and Gynecologic Oncology, Kliniken Essen-Mitte, Essen 45136, Germany, 45Department of Gynecology and Gynecologic Oncology, Dr. Horst Schmidt Kliniken Wiesbaden, Wiesbaden 65199, Germany, 46Wessex Clinical Genetics Service, Princess Anne Hospital, Southampton SO16 5YA, UK, 47Ovarian Cancer Center of Excellence, University of Pittsburgh, Pittsburgh, PA 15222, USA, 48Institute of Human Genetics, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nuremberg, Erlangen 91054, Germany, 49Department of Medicine, Division of Hematology and Oncology, University of California at Los Angeles, David Geffen School of Medicine, Los Angeles CA 90095, USA, 50Biostatistics and Informatics Shared Resource, University of Kansas Medical Center, Kansas City, KS 66160, USA, 51Shanghai Cancer Institute, Shanghai 200032, China, 52Department of Women’s Cancer, Institute for Women’s Health, University College London, London W1T 7DN, UK, 53Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Victoria 3010, Australia, 54Department of Health Science Research, Mayo Clinic, Rochester, MN 55905, USA, 55Cancer Prevention and Control, Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA, 56Department of Biomedical Sciences, Community and Population Health Research Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA, 57Cancer Research Initiatives Foundation, Sime Darby Medical Centre, Subang Jaya 47500, Malaysia, 58Department of Epidemiology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA, 59Departments of Obstetrics and Gynaecology, Hannover Medical School, Hannover 30625, Germany, 60Institute of Cancer Epidemiology, Danish Cancer Society, Copenhagen DK-2100, Denmark, 61Molecular Unit, Department of Pathology, Herlev Hospital, University of Copenhagen, Copenhagen DK-2100, Denmark, 62Gyn Clinic, Rigshospitalet, University of Copenhagen DK-2100, Denmark, 63Division of Epidemiology and Prevention, Aichi Cancer Center Research Institute, Nagoya, Aichi 464-8681, Japan, 64Department of Virus, Lifestyle and Genes, Danish Cancer Society Research Center, Copenhagen DK-2100, Denmark, 65Women’s Cancer Program at the Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA, 66Department of Gynecology, Rigshospitalet, University of Copenhagen, Copenhagen DK-2100, Denmark, 67Department of Public Health Sciences, Medical University of South Carolina, Charleston, SC, 29425, USA, 68Department of Obstetrics, Gynecology and Reproductive Sciences, University of Pittsburgh School of Medicine, Pittsburgh, PA 15222, USA, 69Department for Health Evidence and Department of Urology, Radboud University Medical Centre, Nijmegen 6500 HB, The Netherlands, 70Vesalius Research Center, VIB, Leuven 3000, Belgium, 71Laboratory for Translational Genetics, Department of Oncology, University of Leuven 3000, Belgium, 72Division of Gynecological Oncology, Department of Oncology, University Hospitals Leuven, 3000, Belgium, 73Cancer Control Research, BC Cancer Agency, Vancouver, British Columbia V5Z 1L3, Canada, 74Department of Cancer Prevention and Control, Roswell Park Cancer Institute, Buffalo, NY 14263, USA, 75College of Pharmacy and Health Sciences, Texas Southern University, Houston, TX 77004, USA, 76Department of Gynecologic Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA, 77Department of Gynaecology, Radboud University Medical Centre, Nijmegen 6500 HB, The Netherlands, 78Department of Preventive Medicine, Kyushu University Faculty of Medical Sciences, Fukuoka 812-8582, Japan, 79Department of Health Research and Policy - Epidemiology, Stanford University School of Medicine, Stanford, CA 94305, USA, 80Prosserman Centre for Health Research, Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital,
Toronto, Ontario M5G 1X5, Canada, 81Institute of Cancer Sciences, University of Glasgow, Wolfson Wohl Cancer Research Centre, Beatson Institute for Cancer Research, Glasgow G61 1QH, UK, 82Women’s Cancer Research Program, Magee-Women’s Research Institute and University of Pittsburgh Cancer Institute, Pittsburgh, PA 15222, USA, 83Department of Obstetrics, Gynecology and Reproductive Sciences, University of Pittsburgh School of Medicine, Pittsburgh, PA 15222, USA, 84Department of Epidemiology, University of Pittsburgh Graduate School of Public Health, Pittsburgh, PA 15222, USA, 85Women’s College Research Institute, Toronto, Ontario M5G 1N8, Canada, 86Department of Pathology, Rigshospitalet, University of Copenhagen DK-2100, Denmark, 87The University of Texas School of Public Health, Houston, TX 77225, USA, 88Department of Obstetrics and Gynaecology, University Malaya Medical Centre, University Malaya, Kuala Lumpur 50603, Malaysia, 89Department of Gynecological Oncology, Roswell Park Cancer Institute, Buffalo, NY 14263, USA, 90Department of Epidemiology and Biostatistics, Memorial Sloan-Kettering Cancer Center, New York, NY 10065, USA, 91Department of Epidemiology, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA, 92Department of Cancer Epidemiology, H. 
Lee Moffitt Cancer Center, Tampa, FL 33612, USA,93 Channing Division of Network Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA, 94Department of Epidemiology, Harvard School of Public Health, Boston, MA 02115, USA, 95Department of Chronic Disease Epidemiology, Yale School of Public Health, New Haven, CT 06520, USA, 96Department of Gynecologic-Oncology, Princess Margaret Hospital, and Department of Obstetrics and Gynecology, Faculty of Medicine, University of Toronto, Toronto, Ontario M5G 1X6, Canada, 97Program in Epidemiology, Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA, 98Department of Epidemiology, University of Washington, Seattle, WA 98109, USA, 99Department of Gynecology, Jena University Hospital - Friedrich Schiller University, Jena D-07743, Germany, 100Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center and Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN 37203, USA, 101Cancer Epidemiology Program, University of Hawaii Cancer Center, Hawaii 96826, USA, 102Department of Gynaecological Oncology, Glasgow Royal Infirmary, Glasgow G4 0SF, UK, 103University Malaya Cancer Research Institute, Faculty of Medicine, University Malaya Medical Centre, University Malaya, Kuala Lumpur 50603, Malaysia, 104Department of Obstetrics, Gynecology and Oncology, IInd Faculty of Medicine, Warsaw Medical University and Brodnowski Hospital, Warsaw 03-242, Poland, 105Department of Obstetrics and Gynecology, University of Ulm, Ulm 89075, Germany, 106Department of Medical Oncology, 02115, Dana-Farber Cancer Institute, Boston, MA, USA, 107Cancer Epidemiology Program, Division of Population Sciences, H. 
Lee Moffitt Cancer Center and Research Institute, Tampa, FL 33612, USA and 108Cancer Control and Population Sciences, Duke Cancer Institute, Durham, NC 27710, USA, 109Present address: Women’s Cancer Program at the Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA and 110Present address: Department of Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA

* To whom correspondence should be addressed. Tel: +1 4349 248569; Fax: +1 4349 248437; Email: [email protected]

† These authors contributed equally to this work.

Correspondence may also be addressed to Kate Lawrenson. Tel: +1 323 423 7935; Fax: +1 310-423-9537; Email: [email protected]

Abstract

Genome-wide association studies have identified 20 genomic regions associated with risk of epithelial ovarian cancer (EOC), but many additional risk variants may exist. Here, we evaluated associations between common genetic variants [single nucleotide polymorphisms (SNPs) and indels] in DNA repair genes and EOC risk. We genotyped 2896 common variants at 143 gene loci in DNA samples from 15 397 patients with invasive EOC and 30 816 controls. We found evidence of associations with EOC risk for variants at the FANCA, EXO1, E2F4, E2F2, CREB5 and CHEK2 genes (P ≤ 0.001). The strongest risk association was for CHEK2 SNP rs17507066 with serous EOC (P = 4.74 × 10−7). Additional genotyping and imputation of genotypes from the 1000 Genomes Project identified a slightly more significant association for CHEK2 SNP rs6005807 (r2 with rs17507066 = 0.84, odds ratio (OR) 1.17, 95% CI 1.11–1.24, P = 1.1 × 10−7). We identified 293 variants in the region with likelihood ratios of less than 1:100 for representing the causal variant. Functional annotation identified 25 candidate SNPs that alter transcription factor binding sites within regulatory elements active in EOC precursor tissues. In The Cancer Genome Atlas dataset, CHEK2 gene expression was significantly higher in primary EOCs compared to normal fallopian tube tissues (P = 3.72 × 10−8). We also identified an association between genotypes of the candidate causal SNP rs12166475 (r2 = 0.99 with rs6005807) and CHEK2 expression (P = 2.70 × 10−8). These data suggest that common variants at 22q12.1 are associated with risk of serous EOC and point to CHEK2 as a plausible target susceptibility gene.
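The summary statistics quoted for rs6005807 can be sanity-checked with a little arithmetic. The sketch below is not the authors' analysis (the study reports a likelihood ratio test on individual-level data); it assumes a Wald approximation, in which the standard error of the log odds ratio is recovered from the width of the 95% CI. The function name `wald_from_or_ci` and its arguments are hypothetical. Because the published OR and CI are rounded, the recomputed P value should only be expected to agree with the reported one to within roughly an order of magnitude.

```python
import math


def wald_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate the log-OR, its standard error, the Wald z statistic and
    the two-sided P value from a reported odds ratio and its 95% CI.

    Assumption: the CI was built as exp(beta +/- z_crit * se), i.e. it is
    symmetric on the log scale.
    """
    beta = math.log(odds_ratio)
    # The CI spans 2 * z_crit standard errors on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    # Two-sided tail probability of a standard normal variate.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p_value


# Values quoted in the abstract for rs6005807 (rounded as published).
beta, se, z, p_value = wald_from_or_ci(1.17, 1.11, 1.24)
print(f"log(OR) = {beta:.4f}, SE = {se:.4f}, z = {z:.2f}, P = {p_value:.1e}")
```

Working on the log scale matters here: confidence limits for an odds ratio are symmetric around log(OR), not around the OR itself.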

Introduction

Epithelial ovarian cancer (EOC) is the leading cause of death from cancers of the female reproductive tract (1). Established protective factors for the disease include pregnancy, oral contraceptives and breast-feeding, factors which reduce the number of lifetime ovulatory cycles. The biological mechanisms underlying these associations are not well understood but may involve proliferation, inflammation, oxidative stress, mutation and DNA repair in the ovarian and/or fallopian tube epithelia (2–4). Several studies have highlighted the critical role in ovarian tumorigenesis of maintaining genomic integrity, either through DNA repair or apoptotic pathways (5).

The importance of highly penetrant genetic risk factors for EOC has become evident from studies of familial clustering of breast and ovarian cancers, largely caused by mutations in the BRCA1 and BRCA2 genes (6–8). Several additional susceptibility genes exist that confer more moderate risks of EOC (e.g. MSH6, MSH2, MLH1, RAD51C, RAD51D and BRIP1 (9–12)). Finally, several common genetic variants conferring relatively mild risks of EOC have been identified in genome-wide association studies (GWAS) (13–21).

Whereas GWAS are highly successful at identifying common variant susceptibility alleles for multiple complex traits and diseases using case–control study designs, only a fraction of the total number of risk variants for any one disease has been identified to date. For example, Michailidou et al. (22) estimated that many common risk variants for breast cancer await discovery; and the same is likely true of other cancers. One approach to identify additional susceptibility alleles is to integrate knowledge of disease biology with germline genetic datasets (i.e. candidate gene or pathway studies). This approach has recently been successful in identifying the 5p15 (TERT-CLPTM1L) susceptibility locus for breast, prostate and ovarian cancers (18).

CHEK2 is a putative tumor suppressor gene that encodes a kinase activated in response to DNA damage and has also been shown to interact with and phosphorylate BRCA1, promoting cellular survival after DNA damage (23). A deletion variant in CHEK2 (CHEK2*1100delC) is associated with a 2–3-fold increased risk of breast cancer but is not associated with ovarian cancer risk (24). However, other CHEK2 coding variants may confer susceptibility to ovarian cancer, and in one study, five CHEK2 missense variants were identified in an analysis of 360 EOC cases, two of which were predicted to be damaging in functional assays (25). Common low penetrance variants may also exist in and around the CHEK2 gene. We previously evaluated associations for germline genetic variants spanning 53 DNA damage response and repair genes and risk of invasive serous EOC in 364 EOC cases and 761 controls from the North Carolina Ovarian Cancer Study (NCOCS), and identified borderline evidence of an association for two variants (rs5762746 and rs6005835) in CHEK2 (26).

In this study, we expanded the examination of risks associated with DNA repair genes by genotyping 2896 single nucleotide polymorphisms (SNPs) and indel variants spanning 143 gene loci in DNA samples of 15 397 EOC cases and 30 816 controls as part of the Collaborative Oncological Gene-environment Study (COGS). Imputation of additional SNPs in COGS further defined risk associations at the CHEK2 locus. Finally, we integrated germline genetic data with functional annotation of the region to identify the most likely disease-causing susceptibility variants and target genes at this locus.

Materials and methods

Genetic association analyses

Study datasets

Genetic association analyses were carried out using data from several Ovarian Cancer Association Consortium (OCAC) genotyping projects. Study subjects were of European ancestry (determined using principal components analysis of genotype data): 2162 cases and 2564 controls from a GWAS from North America (‘US GWAS’), 1763 cases and 6118 controls from a UK-based GWAS (‘UK GWAS’) and 441 cases and 442 controls from a second GWAS from North America (‘Mayo GWAS’) (13–19,21). In total, 11 030 cases and 21 693 controls from 41 OCAC studies were genotyped using the iCOGS array (‘OCAC-iCOGS’ stage 1 data). The US and UK GWAS comprised several independent case–control studies, and samples from

some of these studies were also subsequently genotyped using the iCOGS array. Combined, these studies comprised 15 396 independent cases and 30 817 controls. All duplicates were removed. Further details of the component studies are in Supplementary Table 1, available at Carcinogenesis Online. Details of genotyping platform are shown in Supplementary Table 2, available at Carcinogenesis Online.

Abbreviations
COGS    Collaborative Oncological Gene-environment Study
EOC     epithelial ovarian cancer
eQTL    expression quantitative trait locus
GWAS    genome-wide association studies
HGSOC   high-grade serous ovarian cancer
MAF     minor allele frequency
SNP     single nucleotide polymorphism
TFBS    transcription factor binding site

Variant selection

To select genes, we expanded upon the 53 genes included in our earlier investigation (26) using the gene sets described in Wood et al. (27) plus literature and gene ontology database searches, ultimately identifying 143 genes whose functions relate to DNA damage recognition and repair processes for inclusion in this study. For each gene, we identified all SNPs within genome windows ranging from 10 kb upstream of the transcription start sites to 10 kb downstream of the transcription end sites of the 143 DNA repair genes. These variants were included in the US ovarian cancer GWAS database of SNPs imputed by the MACH software package against Hapmap Phase II genotypes (Release 22, NCBI build 36) for 60 CEU founders. We used these data to conduct a preliminary association analysis and to identify tag sets for each region, tagging polymorphisms with minor allele frequencies (MAF) of at least 0.025 to an r2 of 0.80 or above. We ranked the genes on the basis of the most significant variants within each gene. We tagged SNPs at TP53, CHEK2 and the top five ranked genes to an r2 of 0.975, the genes ranked 6 through 124 to an r2 of 0.90 and the 20 lowest ranked genes to an r2 of 0.80. We then chose the SNP with the highest Illumina design score as the tag in each r2 bin, choosing the most highly significant SNP when there were ties. This yielded 3651 variants that were included on the COGS Illumina custom iSelect chip (iCOGS). After quality control analysis, 3252 variants passed QC, of which 53 had MAF < 0.02 (Supplementary Table 3 and Figure 1, available at Carcinogenesis Online).

Imputation

OCAC-iCOGS samples and each of the GWAS sets were imputed separately. Variants were imputed from the 1000 Genomes Project data using the v3 April 2012 release as the reference panel. We used a two-step procedure, which involved prephasing in the first step (using the SHAPEIT software) and imputation of the phased data in the second step using the IMPUTE version 2 software (28). To perform the imputation, we divided the data into segments of ~5 Mb and excluded variants from the association analysis if their imputation accuracy was r2 < 0.25 or their MAF was <0.005. The number of successfully imputed SNPs by MAF is shown in Supplementary Table 4, available at Carcinogenesis Online.

Data analysis

Analyses were restricted to women of European intercontinental ancestry. We performed principal components analysis using a set of ~37 000 unlinked markers to control for population substructure; an in-house program was utilized (available at http://ccge.medschl.cam.ac.uk/software/). Unconditional logistic regression treating the number of alternate alleles carried as an ordinal variable (log-additive, codominant model) was used to evaluate the association between each variant and EOC risk. A likelihood ratio statistic was used to examine significance of association, and per-allele log odds ratios (OR) and 95% confidence intervals (CI) were estimated. The logistic regression model was adjusted for study and population substructure by including study-specific indicators and a variable number of eigenvalues from the principal components analyses. The number of principal components was chosen based on the position of the inflexion of the principal components scree plot. Two principal components were included in the analysis of the UK and US GWAS data sets, one was used for the Mayo GWAS and five were used for the COGS-OCAC dataset. Results from the three GWAS and COGS were combined using fixed-effects, inverse variance weighted meta-analysis.

Functional analyses

Public databases

To perform functional annotation of 293 candidate variants and expression quantitative trait locus (eQTL) analysis at the CHEK2 locus, we mined the following databases: ENCODE (http://genome.ucsc.edu); Haploreg (http://www.broadinstitute.org/mammals/haploreg/haploreg.php); and The Blood eQTL Browser (http://genenetwork.nl/bloodeqtlbrowser/) (29).

Profiling of epigenetic marks in ovarian cancer and ovarian cancer precursor cells

FAIREseq and ChIPseq profiles were generated for two immortalized normal ovarian surface epithelial cell lines (IOE4 and IOE11, generated in-house) and two fallopian secretory epithelial cell lines (FT33 and FT246, from Dr R Drapkin) as described previously (30). The CaOV3 and UWB1.289 (31) ovarian cancer cell lines (from CRUK and ATCC, respectively) were also profiled. Prior to performing ChIPseq/FAIREseq, cell lines were authenticated using the Promega PowerPlex16HS Assay (performed at the University of Arizona Genetic Core), and mycoplasma-specific PCR was performed to ensure cell lines were not contaminated with mycoplasma infections. ChIPseq was performed using antibodies that recognized histone 3 lysine 27 acetylation (H3K27ac) and histone 3 lysine 4 monomethylation (H3K4me) (Abcam). Identification of variants predicted significantly to alter transcription factor motifs was performed as described by Hazelett et al. (32).

Collection of normal epithelial samples

Early passage primary normal ovarian surface epithelial cells and fallopian tube secretory epithelial cells were obtained from disease-free ovaries and fallopian tubes collected during gynecological surgical procedures taking place at the University of Southern California, Los Angeles (the Gynecological Tissue and Fluid Repository), University College Hospital, London and Oregon Health and Science University. All samples were collected with informed patient consent. Methods for the collection have been described previously (33,34). RNA extraction was performed from cells at 80% confluence using standard protocols. Cell lines were confirmed to be free of mycoplasma, but were not authenticated as they were novel cell lines used at an early passage.

eQTL data analysis

For each sample, 500 ng RNA were reverse transcribed using Superscript III (Life Technologies). TaqMan® was used to quantify CHEK2 gene expression using a TaqMan® gene expression probe (Hs00200485_m1, Life Technologies). Four control genes were also included: ACTB, Hs00357333_g1; GAPDH, Hs02758991_g1; HMBS, Hs00609293_g1; and HPRT1, Hs0280069_m1 (all Life Technologies). Relative expression levels were calculated using the ΔΔCt method. Correlations between genotype and gene expression were calculated in R using a Jonckheere–Terpstra trend test (for three groups) or the Wilcoxon rank sum statistic (for two groups).

eQTL analysis using TCGA data

Publicly available microarray and germline genotyping data for high-grade serous EOCs were downloaded from TCGA (Lawrenson et al, Nat Comm, accepted). For each case, germline genotyping data were used to determine ancestry using principal components through EIGENSTRAT software (HapMap profiles were used as a control set). Only cases with complete Northern or Western European ancestry were included. Cis-eQTL analyses were performed for all genes in a 1 Mb region spanning the top variant, using a method we have described previously (35). Associations between risk variant genotypes and mRNA expression for 339 cases were evaluated using a linear regression model adjusting for the effects of copy number and methylation. The Benjamini–Hochberg method was used to adjust for

multiple testing. A significant association was defined by a false discovery rate of less than 0.1.

Results

Genetic association analyses of DNA repair gene loci

For 143 DNA repair genes, we identified 2896 tagging SNPs (minor allele frequencies (MAFs) > 2%) that lie within 10 kb upstream and downstream of the transcription start and end sites of each gene; these variants were genotyped as part of iCOGS. Of these, 2621 were successfully genotyped in 46 213 subjects from 43 studies. This sample included 15 397 women diagnosed with invasive EOC, of whom 9608 cases had serous ovarian cancer, and 30 816 controls. Details of the study populations, genotyping platforms used for each data set and quality control analysis are given in Supplementary Tables 1–3 and Figure 1, available at Carcinogenesis Online.

Supplementary Table 5, available at Carcinogenesis Online, lists the DNA repair genes evaluated in this study, including the number of tag SNPs at each locus, and the significance of association (P value) with serous ovarian cancer for the most significant risk-associated variant for each gene. The data for all SNPs and genes are illustrated in Figure 1A. SNPs at six different genes were associated with serous ovarian cancer risk at a P value threshold of 0.001: FANCA on chromosome 16q24.3 (P = 0.001); EXO1 on chromosome 1q43 (P = 0.0005); E2F4 on chromosome 16q22.1 (P = 0.0005); E2F2 on chromosome 1p36.12 (P = 0.0004); CREB5 on chromosome 7p15.1 (P = 0.0002) and CHEK2 on chromosome 22q12.1 (P = 4.7 × 10−7).

The genomic inflation factor λ for the combined meta-analysis was 1.15 (adjusted to 1000 cases and controls, λ1000 = 1.01). This may be due to cryptic population structure not accounted for by adjusting for principal components. However, there was no residual inflation observed for association with clear cell and mucinous ovarian cancers (λ = 1.01 and 0.93, respectively) and minimal inflation for the larger set of SNPs on the iCOGS array that had not been selected as candidates for EOC susceptibility (λ = 1.07, λ1000 = 1.004).

Genetic association analyses of the CHEK2 gene locus

CHEK2 showed the strongest evidence of association with serous ovarian cancer risk, for SNP rs17507066 (odds ratio (OR) 0.86; 95% confidence interval (95% CI) 0.81–0.91). Because CHEK2 is a known moderately penetrant susceptibility gene for breast cancer, additional common variants in the region spanning this gene had been included on the iCOGS array, providing a greater density within the region than for other DNA repair genes (16).

with high-grade serous ovarian cancer (HGSOC) risk. This analysis identified multiple additional variants highly correlated with rs9625477, several of which were more significantly associated with disease risk. All imputed risk-associated variants with P values below a threshold of 10−6 are given in Supplementary Table 7, available at Carcinogenesis Online. The most significant risk association was for SNP rs6005807 (OR 1.17, 95% CI 1.11–1.24, P = 1.1 × 10−7), which is correlated with rs9625477 (r2 = 0.99). Risk associations for genotyped and imputed SNPs in HGSOC across the regions are illustrated in Figure 1B.

We stratified risk associations by histological subtype. The most significant SNP for serous ovarian cancer (rs6005807) was more weakly associated with all invasive subtypes of EOC combined (P = 2.9 × 10−6) and showed no evidence of association for clear cell (P = 0.65), endometrioid (P = 0.55) or mucinous (P = 0.33) subtypes. However, other variants in the region (all imputed) showed subtype-specific associations: rs78371015 (r2 = 0.97 with rs6005807) was the strongest risk allele for the clear cell subtype (P = 0.0002); rs34051361 (r2 = 0.56 with rs6005807) was associated with the endometrioid subtype (P = 0.0001); and the variant 22:29126347:D (r2 = 0.92 with rs6005807) was associated with the mucinous subtype (P = 0.0001). Summary results are given in Table 1.

Functional annotation of risk-associated variants

Two hundred and ninety-three variants (genotyped or imputed), representing the most likely candidate causal variants at the locus, had likelihood ratios greater than 1:100 compared with the most significant SNP in serous ovarian cancer. The majority of candidate causal variants were SNPs (267/293, 91.1%); the remaining 26/293 (8.9%) were indel polymorphisms. We annotated these SNPs with respect to protein coding genes, predicted functional motifs and regulatory elements cataloged in ENCODE and Haploreg. Two SNPs were located in protein coding regions of the TTC28 gene; both were synonymous and therefore unlikely to be of functional importance. The remaining variants were located in non-coding DNA regions: 274 (93.5%) were located within introns of the TTC28 gene, and 12 (4.1%) were located within introns of the CHEK2 gene (Figure 2). Fifty-four SNPs (18.4%) coincide with enhancer or promoter elements annotated in ENCODE, 51 SNPs (17.4%) are located in DNase hypersensitivity domains and 241 SNPs (82.3%) are predicted to alter transcription factor binding motifs in Haploreg (Supplementary Table 8, available at Carcinogenesis Online). To define further the overlaps between risk variants and putative functional features, we identified those variants predicted significantly to alter transcription factor binding sites (TFBSs) identified using data from FactorBook (32,36). We only considered TFBS variants that lie within regulatory DNA regions active in EOC precursor tissues. Active regulatory elements in normal ovarian and fallopian
Genotype data were available for a further 176 vari- epithelial cells were profiled using formaldehyde assisted isolation ants in this sample set in addition to the 24 tagging SNPs origi- of regulatory element sequencing (FAIRE-seq) to identify regions of nally evaluated. Further genotyping identified rs9625477 with open chromatin, and chromatin immunoprecipitation sequencing a marginally more significant association with serous ovarian (ChIP-seq) for histone modification marks H3K4me1 and H3K27ac cancer risk (P = 2.4 × 10−7). The data for the association analysis of (37). We identified 25 instances where candidate SNPs altered TFBSs all 200 genotyped variants in the region in serous ovarian cancer within active regulatory sequences (Supplementary Table 9, availa- are given in Supplementary Table 6, available at Carcinogenesis ble at Carcinogenesis Online), suggesting that these SNPs may be the Online. most likely candidate causal variants at this locus. HOCOMOCO We further evaluated this region after imputing genotypes for (38) was also used to identify transcription factors that may bind variants identified through the 1000 Genomes Project for all par- to the risk associated SNPs. We found that rs12166475 is predicted ticipants of European ancestry (Supplementary Table 7, available to affect binding of WT1, and that rs9620817 and rs16986509 are at Carcinogenesis Online). After excluding poorly imputed SNPs, a predicted to alter TFBSs for BRCA1; both transcription factors are total of 4785 SNPs with an imputation r2 > 0.3 and an estimated known to be important in risk of development of HGSOC. These MAF >0.02% spanning a 2 Mb region on 22q12.1 (nucleotide posi- data are summarized in Supplementary Table 10, available at tion 28 000 000 to 30 000 000) were analyzed for their associations Carcinogenesis Online and Figure 2C. K.Lawrenson et al. | 1347
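The rescaling of the genomic inflation factor to 1000 cases and 1000 controls (λ1000), reported in the Results for the DNA repair SNP set, can be illustrated with a short sketch. It assumes the commonly used formula λ1000 = 1 + (λ − 1) × (1/n_cases + 1/n_controls) / (1/1000 + 1/1000). The text does not state which formula or which sample sizes were used, so the function name `lambda_1000` and the choice of 9608 serous cases with 30 816 controls are assumptions made only for illustration.

```python
def lambda_1000(lam: float, n_cases: int, n_controls: int) -> float:
    """Rescale a genomic inflation factor to 1000 cases and 1000 controls.

    Assumed formula (not stated in the text):
        lambda_1000 = 1 + (lam - 1) * (1/n_cases + 1/n_controls) / (2/1000)
    """
    # Ratio of the sampling-variance term at the observed sample size
    # to the same term for a study of 1000 cases and 1000 controls.
    scale = (1.0 / n_cases + 1.0 / n_controls) / (1.0 / 1000 + 1.0 / 1000)
    return 1.0 + (lam - 1.0) * scale


# Hypothetical inputs: lambda = 1.15 with the serous case and control counts
# quoted in the Results (an assumption about which counts apply).
serous = lambda_1000(1.15, n_cases=9608, n_controls=30816)
print(round(serous, 2))  # prints 1.01
```

Under these assumptions the sketch returns a value that rounds to the λ1000 of 1.01 quoted for the combined analysis, which suggests, but does not confirm, that a rescaling of this form was used.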

Functional analyses of candidate genes

We used somatic data to evaluate the role in EOC development of all protein-coding genes within a 1 Mb region spanning the most significant risk-associated SNP (rs6005807) to identify the most likely susceptibility target gene. Six genes lie in the region: rs6005807 is located in an intron of TTC28 (tetratricopeptide repeat domain 28); 5′ of TTC28 are CHEK2, HSCB (HscB mitochondrial iron-sulfur cluster co-chaperone), CCDC117 (coiled-coil domain containing 117), XBP1 (X-box binding protein 1), and ZNRF3 (zinc and ring finger 3). We evaluated somatic genetic alterations in primary ovarian tumors for these six genes using The Cancer Genome Atlas (TCGA) data and other public databases (summarized in Table 2). These analyses revealed that 11% of high-grade serous ovarian cancer (HGSOC) cases showed copy number gain or amplification in the region spanning the six candidate genes, whereas homozygous deletions were rare (<1% cases). We identified a somatic coding sequence mutation in both the CHEK2 and ZNRF3 genes out of 316 sequenced HGSOC cases. The mutation in CHEK2 is a missense (R346H) predicted to be of ‘high impact’ (mutationassesor.org). The ZNRF3 mutation is also missense (P805H), but predicted to have little functional impact. Using another database of somatic mutation frequencies (COSMIC), which includes data for over 8000 tumors, we observed that CHEK2 was the most frequently mutated (in 2.5% of cases) of all the genes in the region (42).

We examined differences in gene expression between normal and cancer tissues (Figure 2D). CHEK2 gene expression was significantly higher in HGSOCs (n = 489) compared to normal fallopian tube tissues (P = 3.7 × 10^−8). TTC28 was the only other

Figure 1. (A) Manhattan plot illustrating associations between 2621 SNPs spanning 143 DNA repair genes and risk of high-grade serous ovarian cancer. SNPs colored red are the top ranked risk associations at the CHEK2 gene locus. (B) Regional association plot for the 22q12.1 locus showing the distribution of genotyped (black dots) and imputed (red dots) SNPs and the genetic architecture with respect to the 19 genes in the region.

Table 1. Risk associations for the top SNPs at the CHEK2 locus, by histological subtype

Top SNP           EOC subtype   Position  Number of cases  SNP location (nearest gene)  r² COGS  Odds ratio  95% CI     P value
rs6005807         Serous        28934313  9608             Intronic (TTC28)             1        1.17        1.11–1.24  1.1 × 10^−7
rs6005807         All invasive  28934313  15 397           Intronic (TTC28)             1        1.12        1.07–1.18  2.9 × 10^−6
rs6005807         Clear cell    28934313  1172             Intronic (TTC28)             1        1.04        0.89–1.20  0.65
rs6005807         Endometrioid  28934313  2385             Intronic (TTC28)             1        1.03        0.93–1.15  0.55
rs6005807         Mucinous      28934313  1112             Intronic (TTC28)             1        1.07        0.92–1.24  0.33
rs5752754         All invasive  28925542  15 397           Intronic (TTC28)             0.94     1.11        1.06–1.16  2.4 × 10^−6
rs78371015        Clear cell    29245611  1172             Intergenic (ZNRF3)           0.97     1.40        1.17–1.67  0.0002
rs34051361        Endometrioid  29582557  2385             Intergenic (KREMEN1)         0.56     1.57        1.24–1.98  0.0001
chr22:29126347:D  Mucinous      29126347  1112             Intergenic (ZNRF3)           0.92     1.56        1.25–1.98  0.0001
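The relationship between the odds ratios, confidence intervals and P values in Table 1 can be illustrated with a minimal sketch. It assumes a Wald test on the log odds ratio with a normal approximation, which the text does not confirm as the method used; the function name `wald_from_ci` is hypothetical. Because the table reports rounded values, the recomputed P values are only a rough consistency check and will not reproduce the published ones exactly.

```python
import math


def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover log(OR), its standard error, z and a two-sided P value
    from an odds ratio and its 95% confidence interval.

    Assumes the interval is symmetric on the log scale (Wald interval).
    """
    log_or = math.log(odds_ratio)
    # Width of the interval on the log scale equals 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = log_or / se
    # Two-sided tail probability of a standard normal variable.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return log_or, se, z, p


# Rows for rs6005807 taken from Table 1 (rounded published values).
rows = [
    ("Serous", 1.17, 1.11, 1.24),
    ("Clear cell", 1.04, 0.89, 1.20),
]
for subtype, or_, lo, hi in rows:
    _, se, z, p = wald_from_ci(or_, lo, hi)
    print(f"{subtype}: SE = {se:.4f}, z = {z:.2f}, P = {p:.2g}")
```

For strongly associated SNPs the recomputed P value is sensitive to rounding of the confidence limits, so a difference of several fold from the tabulated value is expected rather than a sign of error.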

Figure 2. Functional annotation of candidate genes and SNPs at the 22q12 locus. We analysed the genes and ovarian cancer risk associated variants in a 1 Mb region spanning the top ranked risk SNP at this locus: (A) The location of all protein coding genes in the 1 Mb region spanning the top ranked risk SNP, and the 298 SNPs in the region with a 100:1 likelihood of being causal; (B) Illustration of the regulatory elements identified by RNAseq and ChIPseq analysis (H3K4me1, H3K4me3 and H3K27ac) catalogued by ENCODE, illustrating the extent of overlap between candidate SNPs and epigenetic marks; (C) The three most significant candidate SNPs identified through a combination of functional annotation, eQTL analyses and transcription factor binding site prediction. Position weight matrices for each factor are shown with SNP position indicated. Ref, reference allele; Alt, alternative allele; PCT, percent of maximum position weight matrix score for each allele; (D) Gene expression in 489 HGSOCs and 8 normal fallopian tube tissue specimens performed by TCGA indicates CHEK2 is the most significantly differentially expressed gene in the region; (E) mRNA expression of each gene in an in vitro early stage transformation model of ovarian cancer. Expression in CMYC and KRAS transformed cells compared to untransformed immortalized ovarian epithelial (IOE) cells (*P < 0.05).

Table 2. Candidate gene and eQTL analyses

Gene     Gene function (a)                                       TCGA somatic mutation rate, HGSOC N = 316 (%)  COSMIC somatic mutations, N unique mutations/N unique samples tested (%)  Lymphocyte eQTL SNP (b)  Lymphocyte eQTL P value  TCGA eQTL
TTC28    Involved in mitosis and cytokinesis (39)                0.0                                            67/8204 (0.8%)                                                            NS                       NS                       NS
HSCB     Involved in iron-sulfur cluster biogenesis (40)         0.0                                            16/8533 (0.2%)                                                            rs9620817                1.21 × 10^−6             N/A
CCDC117  Not characterized                                       0.0                                            25/8326 (0.3%)                                                            rs16986509               1.39 × 10^−6             N/A
XBP1     Transcription factor, involved in the immune system     0.0                                            21/8204 (0.3%)                                                            rs12165715               1.16 × 10^−127           NS
ZNRF3    Ubiquitin ligase, involved in Wnt signaling (41)        0.3                                            67/8305 (0.8%)                                                            NS                       NS                       N/A
CHEK2    Involved in DNA damage response                         0.3                                            249/9830 (2.5%)                                                           rs12166475               2.70 × 10^−8             NS

[Table 2 also contains columns for TCGA gene expression, tumor (n = 489) versus normal FT (n = 8): mean normal FT, mean HGSOC and P value; and for eQTL analyses in normal OSEC (n = 65): SNP, P value and FDR. The per-gene values in these columns are not recoverable from this copy.]

NS, no significant eQTLs detected; ND, not done; OSEC, ovarian surface epithelial cell; N/A, not available, in TCGA level 3 data there were no data available for HSCB, CCDC117 and ZNRF3. Blood eQTL analyses were performed using data from Ref (29), the top disease-associated eQTL SNP is shown for each gene. For CHEK2, HSCB, CCDC117 and XBP1, the top eQTL SNP was also disease associated.
a Gene functions were identified using the NCBI ‘Gene’ pages unless alternative references are provided.
b Minor alleles were associated with increased expression of CHEK2, HSCB, CCDC117 and XBP1.

gene in the region that was differentially expressed in ovarian cancers with lower expression in tumors compared to normal control tissues (P = 0.01). We also evaluated the expression of these genes in a stepwise model of early-stage ovarian epithelial cell transformation driven by overexpression of the CMYC gene and mutant KRAS (Figure 2E) (43). In this model, CHEK2, HSCB,

CCDC117, XBP1 and ZNRF3 all showed significantly increased expression in ovarian epithelial cells (P < 0.05) that had undergone early stage neoplastic transformation, suggesting that the upregulation of these genes may be an early event in EOC development.

eQTL analyses

We used eQTL analysis to evaluate associations between risk genotypes and the levels of mRNA expression for candidate genes in the region (29). We looked for cis-eQTL associations in both normal and tumor tissues. We did not detect an eQTL for CHEK2 in normal ovarian/fallopian epithelial cultures. In peripheral blood samples (N > 5300) we observed a particularly strong association between rs12165715 and XBP1 expression (P = 1.16 × 10^−127) (29). We also detected associations between rs12166475 and CHEK2 expression (P = 2.70 × 10^−8), rs9620817 and HSCB expression (P = 1.21 × 10^−6), and rs16986509 and CCDC117 expression (P = 1.29 × 10^−6). The variant most significantly associated with ovarian cancer risk at this locus (rs6005807) showed significant eQTL associations for CHEK2 (P = 9.22 × 10^−6) and HSCB (P = 1.85 × 10^−5). Importantly, for CHEK2, HSCB, CCDC117 and XBP1 the most significant eQTL SNPs were also disease associated. These data are summarized in Table 2. Three of the top 25 candidate variants from in silico functional annotation were amongst the most significant eQTL associations: rs12166475, associated with expression of CHEK2, coincides with a TFBS for EGR1, a transcription factor previously implicated in EOC development; rs9620817, associated with HSCB expression, is predicted to alter CEBPB and ETS1 motifs; and rs16986509, associated with CCDC117 expression, is predicted to alter TFBSs for TAL1 and UA2 (Figure 2C).

Finally, we evaluated eQTL associations for all six protein-coding genes in the region in 339 primary HGSOC tissues using publicly available data from TCGA. Variations in gene expression were adjusted for changes in DNA copy number and methylation variation in each tumor. We observed no significant associations between rs6005807 and CHEK2, TTC28 or XBP1 at a P value threshold of 0.05, and false discovery rate threshold of 0.1. Details of the eQTL analyses are provided in Table 2.

Discussion

DNA repair mechanisms are important in the initiation and development of EOC and the current study represents the most comprehensive analysis of common genetic variation at DNA repair genes and EOC risk to date. We evaluated 2621 candidate variants spanning 143 gene regions in several different DNA repair pathways and found strong evidence of risk associations for SNPs at the CHEK2 gene locus that were just below the threshold for genome-wide significance. This is consistent with a smaller previous study in which we showed borderline evidence of risk associations for SNPs spanning this locus (26). Even though we did not find strong statistical evidence of risk associations for SNPs at other DNA repair gene loci, we cannot rule out that germline genetic variation at these genes is associated with EOC risk. We only performed detailed genotyping and imputation analysis at the CHEK2 locus because of the strength of its association from our initial screen, but additional analyses of the other gene loci in the future may identify other associations. COGS represents the largest genetic association study reported for EOC, but is still substantially smaller in sample size compared to GWAS for more common diseases such as breast cancer and coronary artery disease (22,44,45). Sample size has a substantial impact on the ability to identify risk associations, which partly explains why more common diseases have identified the most risk associations using GWAS. Disease heterogeneity may also have restricted our ability to identify risk variants for EOC as some ovarian cancer risk loci are subtype-specific (16,19). It is likely that common variants in various DNA repair genes confer susceptibility to subtype-specific EOC as observed for more highly-penetrant genes. Finally, rarer variants (MAF < 0.02) in these genes may confer susceptibility to EOC, but we did not have adequate power to detect such rarer associations in this study.

The genetic data suggest that variants at the CHEK2 locus are associated with risk of invasive EOC, but that the association for the top ranked SNP (rs6005807) is stronger with serous histology. We found no evidence that rs6005807 was associated with other EOC histologies but different imputed variants within the region showed evidence of association with clear cell, endometrioid and mucinous EOC at P values of 0.0002 or lower. Because these three EOC subtypes are less common than serous EOC, the weakness of these associations may simply reflect the smaller sample sizes available for their genotyping. One caveat to the association analyses lies in the minor residual inflation observed in the test statistics. This may be due to population structure but given that this inflation was greater than for other sets of SNPs this seems an unlikely explanation. An alternative explanation is an overall burden of weak susceptibility signals within this set of SNPs.

The most significant associations were identified using imputed genotypes, based on an estimated imputation r² that indicates a very high correlation between imputed genotypes and actual genotypes. It is therefore unlikely that there are other common variants within the region that may represent more highly associated SNPs than those already identified, and so these alleles represent the candidate causal variants for this locus. We identified 293 candidate causal polymorphisms that are virtually indistinguishable from each other with respect to their risk associations and any one (or even several) could be the causal SNP(s) influencing expression of the target susceptibility gene. Only two of the variants were in protein-coding regions and both were synonymous changes, suggesting that the causal SNP(s) likely reside in non-coding DNA. As such we neither know the functional basis for the genetic susceptibility, nor the target susceptibility gene(s). Our in silico analysis of these variants with respect to non-coding regulatory biofeatures profiled in multiple different cell lines by ENCODE, and in EOC precursor tissues, identified 25 risk SNPs that intersect with annotated functional elements. While this represents a relatively small number of candidate causal polymorphisms and functional targets at this locus, several other candidate causal SNPs may exist. None of the cell lines we evaluated have been comprehensively analysed for the full catalogue of non-coding regulatory elements and it is possible that additional variants overlap regulatory marks that were not profiled; for example CTCF repressor marks and non-coding RNAs.

We identified three risk SNPs located within regulatory elements active in ovarian cells. These SNPs were also the most significant SNPs from eQTL analysis in the region: rs12166475 associated with CHEK2; rs9620817 associated with HSCB; and rs16986509 associated with CCDC117. In each case the minor allele was associated with increased gene expression; consequently higher CHEK2, HSCB and CCDC117 expression was associated with reduced cancer risk, whereas higher XBP1 expression was associated with higher cancer risk. The strongest candidate SNP is rs12166475, which alters the binding site for EGR1, a transcription factor involved in epithelial-to-mesenchymal

transition in EOC (46). The alternative allele of this variant is also predicted to increase the binding affinity of WT1, a biomarker commonly expressed in HGSOCs. WT1 can have both repressive and activating effects on gene expression (47). Additional functional analysis of the possible interaction between rs12166475 and CHEK2 will be required to validate these findings and to elucidate the transcriptional consequences of allele-dependent EGR1/WT1 binding at the site of this SNP. The SNP rs9620817, which is most significantly associated with HSCB expression, is also a strong candidate. This SNP is predicted to alter CEBPB and ETS1 transcription factor binding sites (TFBS), although the difference in predicted binding affinity between the two SNP alleles is much greater for ETS1. Both rs16986509 and rs9620817 also alter binding sites for BRCA1, which may be of significance given that BRCA1-associated pathways are deregulated in approximately half of all HGSOCs (48). BRCA1 can function as a co-repressor or co-activator, and regulates gene expression by interacting with a myriad of different transcription factors, including TP53 and CMYC (49).

Although we conditioned our analysis at 22q12.1 on tagging variants spanning the CHEK2 gene, five other genes lie within a 500-kb region at either side of the most risk associated SNP that could be the target of risk-associated variants at this locus. However, somatic analysis of ovarian tumors from TCGA suggests that CHEK2 is the most likely target. It is the only gene in the region that is differentially expressed in ovarian tumors compared to normal fallopian tube tissues, suggesting that it may play a role in EOC development. While CHEK2 was overexpressed in ovarian tumors compared to normal fallopian tubes, in our eQTL analyses reduced CHEK2 expression was associated with increased cancer risk, which may suggest that overexpression occurs at later stages of tumorigenesis but lower CHEK2 expression is involved in early cancer development. This hypothesis is consistent with the moderate risk of breast cancer conferred by CHEK2 loss-of-function variants, where large population-based studies report estimated odds ratios for rare protein-truncating and splice-junction variants on the order of 6.18 (95% CI: 1.76–21.8) and 8.75 (95% CI: 1.06–72.2) for missense substitutions (50). Breast and EOC have shared genetic etiology for both high and low penetrance susceptibility genes, providing a rationale for why germline genetic variants in or around CHEK2 may be associated with EOC risk. No similar rationale applies for other genes in the region but eQTL analyses identified several genotype-gene expression associations that indicate alternative candidate target genes. The strongest association was with the XBP1 gene in peripheral lymphocytes, although the most significant eQTL SNP for this gene was not predicted to alter a TFBS within an active regulatory element in EOC precursor cells. We also identified highly statistically significant cis-eQTL associations between risk SNPs and expression of both CHEK2 and HSCB. However, the true importance of these eQTL associations is unclear given that they were identified in tissues that are not associated with EOC development. Several previous studies have highlighted the importance of tissue-specific gene expression when evaluating eQTLs, and stressed the need to perform eQTL analysis in tissues relevant to disease development (51). We did not identify eQTL associations for any of these genes in primary EOCs, although these studies were underpowered.

In summary, we provide evidence that common genetic variants in a region on chromosome 22q12.1 are associated with risk of serous ovarian cancer. The most likely target susceptibility gene at this locus is CHEK2 based on a combination of its known role in DNA damage response pathways, somatic variation in gene expression suggesting a role in EOC development, and a significant eQTL association with risk-associated variants. Future studies will be needed to increase the power of our genetic association studies by increasing sample size to confirm that this region is a true susceptibility locus for ovarian cancer. Finally, detailed functional characterization of this locus will be needed to confirm the functional impact of these candidate SNPs on their regulatory elements and establish their interactions and influence on the target susceptibility gene, as has been shown for other common variant susceptibility genes and alleles.

Supplementary material

Supplementary Tables 1–10 and Figure 1 can be found at http://carcin.oxfordjournals.org/

Funding

The scientific development and funding for this project were supported by the following: the Genetic Associations and Mechanisms in Oncology (GAME-ON): a NCI Cancer Post-GWAS Initiative (U19-CA148112); the COGS project is funded through a European Commission’s Seventh Framework Programme grant (agreement number 223175-HEALTH-F2-2009-223175); the Ovarian Cancer Association Consortium is supported by a grant from the Ovarian Cancer Research Fund thanks to donations by the family and friends of Kathryn Sladek Smith (PPD/RPCI.07); and Department of Defense Award (W81XWH-12-1-0561, W81XWH-07-0449). F.M. was supported by National Institutes of Health K07-CA080668; S.S.T. and E.M.P. were supported in part by Department of Defense Award W81XWH-12-1-0561. Funding of the constituent studies was provided by: California Cancer Research Program (00-01389V-20170, 2II0200); Cancer Prevention Institute of California; Department of Defense (DAMD17-02-1-0666, DAMD17-02-1-0669, W81XWH-10-1-02802); Fred C. and Katherine B. Andersen Foundation; Lon V Smith Foundation grant LVS-39420; Mayo Foundation; Minnesota Ovarian Cancer Alliance; National Institutes of Health (K07-CA095666, K07-CA143047, K22-CA138563); National Center for Research Resources/General Clinical Research Center grant M01-RR000056, N01-CN025403, N01-CN55424, N01-PC67001, N01-PC67010, P01-CA17054, P30-CA14089, P30-CA15083, P30-CA072720, P50-CA105009, P50-CA136393, P50-CA159981, R01-CA058860, R01-CA074850, R01-CA080742, R01-CA092044, R01-CA112523, R01-CA122443, R01-CA126841, R01-CA16056, R01-CA54419, R01-CA58598, R01-CA61132, R01-CA76016, R01-CA83918, R01-CA87538, R01-CA95023, R01-CA063678, R01 CA114343, R03-CA113148, R03-CA115195, U01-CA69417, U01-CA71966), UM1-CA182910; Rutgers Cancer Institute of New Jersey; and the US Public Health Service (PSA-042205). Personal support: K.L. is supported by a K99/R00 grant from the National Cancer Institute (Grant number 1K99CA184415-01). D.F.E. is a Principal Research Fellow of Cancer Research UK. G.C.-T. and P.M.W. are supported by the National Health and Medical Research Council. B.K. holds an American Cancer Society Early Detection Professorship (SIOP-06-258-01-COUN) and the National Center for Advancing Translational Sciences (NCATS), Grant UL1TR000124.

Acknowledgements

This study would not have been possible without the contributions of the following: A. M. Dunning, P. Hall (COGS); D. C. Tessier, F. Bacot, D. Vincent, S. LaBoissière and F. Robidoux and the staff of the genotyping unit (Genome Quebec); D. C. Whiteman, P. M.

Webb, A. C. Green, N. K. Hayward, P. G. Parsons, D. M. Purdie, B. M. Smithers, D. Gotley, A. Clouston, I. Brown, S. Moore, K. Harrap, T. Sadkowski, S. O’Brien, E. Minehan, D. Roffe, S. O’Keefe, S. Lipshut, G. Connor, H. Berry, F. Walker, T. Barnes, J. Thomas, L. Terry, M. Connard, L. Bowes, M-R. Malt, J. White, C. Mosse, N. Tait, C. Bambach, A. Biankan, R. Brancatisano, M. Coleman, M. Cox, S. Deane, G. L. Falk, J. Gallagher, M. Hollands, T. Hugh, D. Hunt, J. Jorgensen, C. Martin, M. Richardson, G. Smith, R. Smith, D. Storey, J. Avramovic, J. Croese, J. D’Arcy, S. Fairley, J. Hansen, J. Masson, L. Nathanson, B. O’Loughlin, L. Rutherford, R. Turner, M. Windsor, J. Bessell, P. Devitt, G. Jamieson, D. Watson, S. Blamey, A. Boussioutas, R. Cade, G. Crosthwaite, I. Faragher, J. Gribbin, G. Hebbard, G. Kiroff, B. Mann, R. Millar, P. O’Brien, R. Thomas, S. Wood, S. Archer, K. Faulkner, J. Hamdorf (ACS); R. Stuart-Harris, F. Kirsten, J. Rutovitz, P. Clingan, A. Glasgow, A. Proietto, S. Braye, G. Otton, J. Shannon, T. Bonaventura, J. Stewart, S. Begbie, M. Friedlander, D. Bell, S. Baron-Hay, G. Gard, D. Nevell, N. Pavlakis, S. Valmadre, B. Young, C. Camaris, R. Crouch, L. Edwards, N. Hacker, D. Marsden, G. Robertson, P. Beale, J. Beith, J. Carter, C. Dalrymple, R. Houghton, P. Russell, L. Anderson, M. Links, J. Grygiel, J. Hill, A. Brand, K. Byth, R. Jaworski, P. Harnett, R. Sharma, G. Wain, D. Purdie, D. Whiteman, B. Ward, D. Papadimos, A. Crandon, M. Cummings, K. Horwood, A. Obermair, L. Perrin, D. Wyld, J. Nicklin, M. Davy, M. K. Oehler, C. Hall, T. Dodd, T. Healy, K. Pittman, D. Henderson, J. Miller, J. Pierdes, A. Achan, P. Blomfield, D. Challis, R. McIntosh, A. Parker, B. Brown, R. Rome, D. Allen, P. Grant, S. Hyde, R. Laurie, M. Robbie, D. Healy, T. Jobling, T. Manolitsas, J. McNealage, P. Rogers, B. Susil, E. Sumithran, I. Simpson, I. Haviv, K. Phillips, D. Rischin, S. Fox, D. Johnson, S. Lade, P. Waring, M. Loughrey, N. O’Callaghan, B. Murray, L. Mileshkin, P. Allan, V. Billson, J. Pyman, D. Neesham, M. Quinn, A. Hamilton, C. Underhill, R. Bell, L. F. Ng, R. Blum, V. Ganju, I. Hammond, C. Stewart, Y. Leung, M. Buck, N. Zeps (ACS); G. Peuteman, T. Van Brussel and D. Smeets (BEL); U. Eilber (GER); L. Gacucova (HMO); P. Schürmann, F. Kramer, W. Zheng, T.-W. Park-Simon, K. Beer-Grondke and D. Schmidt (HJO); G.S. Keeney, S. Windebank, C. Hilker and J. Vollenweider (MAY); the state cancer registries of AL, AZ, AR, CA, CO, CT, DE, FL, GA, HI, ID, IL, IN, IA, KY, LA, ME, MD, MA, MI, NE, NH, NJ, NY, NC, ND, OH, OK, OR, PA, RI, SC, TN, TX, VA, WA, and WYL (NHS); L. Paddock, M. King, U. Chandran, A. Samoila, and Y. Bensman (NJO); L. Brinton, M. Sherman, A. Hutchinson, N. Szeszenia-Dabrowska, B. Peplonska, W. Zatonski, A. Soni, P. Chao and M. Stagner (POL); C. Luccarini, P. Harrington, the SEARCH team and ECRIC (SEA); the Scottish Gynaecological Clinical Trials group and SCOTROC1 investigators (SRO); W-H. Chow, Y-T. Gao (SWH); Information about TCGA and the investigators and institutions who constitute the TCGA research network can be found at http://cancergenome.nih.gov/ (TCGA); I. Jacobs, M. Widschwendter, E. Wozniak, N. Balogun, A. Ryan and J. Ford (UKO); Carole Pye (UKR); a full list of the investigators who contributed to the generation of the WTCCC data is available from http://www.wtccc.org.uk/ (WTCCC).

Conflict of Interest Statement: None declared.

References

1. American Cancer Society. (2012) Cancer Facts & Figures 2012. American Cancer Society, Atlanta.
2. Murdoch, W.J. et al. (2001) Ovulation-induced DNA damage in ovarian surface epithelial cells of ewes: prospective regulatory mechanisms of repair/survival and apoptosis. Biol. Reprod., 65, 1417–1424.
3. Bahar-Shany, K. et al. (2014) Exposure of fallopian tube epithelium to follicular fluid mimics carcinogenic changes in precursor lesions of serous papillary carcinoma. Gynecol. Oncol., 132, 322–327.
4. Schildkraut, J.M. et al. (1997) Relationship between lifetime ovulatory cycles and overexpression of mutant p53 in epithelial ovarian cancer. J. Natl. Cancer Inst., 89, 932–938.
5. Moynahan, M.E. et al. (1999) Brca1 controls homology-directed DNA repair. Mol. Cell, 4, 511–518.
6. Wooster, R. et al. (1995) Identification of the breast cancer susceptibility gene BRCA2. Nature, 378, 789–792.
7. Futreal, P.A. et al. (1994) BRCA1 mutations in primary breast and ovarian carcinomas. Science, 266, 120–122.
8. Miki, Y. et al. (1994) A strong candidate for the breast and ovarian cancer susceptibility gene BRCA1. Science, 266, 66–71.
9. Song, H. et al. (2014) The contribution of deleterious germline mutations in BRCA1, BRCA2 and the mismatch repair genes to ovarian cancer in the population. Hum. Mol. Genet., 23, 4703–4709.
10. Meindl, A. et al. (2010) Germline mutations in breast and ovarian cancer pedigrees establish RAD51C as a human cancer susceptibility gene. Nat. Genet., 42, 410–414.
11. Loveday, C. et al. (2011) Germline mutations in RAD51D confer susceptibility to ovarian cancer. Nat. Genet., 43, 879–882.
12. Rafnar, T. et al. (2011) Mutations in BRIP1 confer high risk of ovarian cancer. Nat. Genet., 43, 1104–1107.
13. Song, H. et al. (2009) A genome-wide association study identifies a new ovarian cancer susceptibility locus on 9p22.2. Nat. Genet., 41, 996–1000.
14. Goode, E.L. et al. (2010) A genome-wide association study identifies susceptibility loci for ovarian cancer at 2q31 and 8q24. Nat. Genet., 42, 874–879.
15. Bolton, K.L. et al. (2010) Common variants at 19p13 are associated with susceptibility to ovarian cancer. Nat. Genet., 42, 880–884.
16. Pharoah, P.D. et al. (2013) GWAS meta-analysis and replication identifies three new susceptibility loci for ovarian cancer. Nat. Genet., 45, 362–370.
17. Permuth-Wey, J. et al. (2013) Identification and molecular characterization of a new ovarian cancer susceptibility locus at 17q21.31. Nat. Commun., 4, 1627.
18. Bojesen, S.E. et al. (2013) Multiple independent variants at the TERT locus are associated with telomere length and risks of breast and ovarian cancer. Nat. Genet., 45, 371–384.
19. Shen, H. et al. (2013) Epigenetic analysis leads to identification of HNF1B as a subtype-specific susceptibility gene for ovarian cancer. Nat. Commun., 4, 1628.
20. Chen, K. et al. (2014) Genome-wide association study identifies new susceptibility loci for epithelial ovarian cancer in Han Chinese women. Nat. Commun., 5, 4682.
21. Kuchenbaecker, K.B. et al. (2015) Identification of six new susceptibility loci for invasive epithelial ovarian cancer. Nat. Genet., 47, 164–171.
22. Michailidou, K. et al. (2013) Large-scale genotyping identifies 41 new loci associated with breast cancer risk. Nat. Genet., 45, 353–361.
23. Lee, J.S. et al. (2000) hCds1-mediated phosphorylation of BRCA1 regulates the DNA damage response. Nature, 404, 201–204.
24. CHEK2 Breast Cancer Case-Control Consortium. (2004) CHEK2*1100delC and susceptibility to breast cancer: a collaborative analysis involving 10,860 breast cancer cases and 9,065 controls from 10 studies. Am. J. Hum. Genet., 74, 1175–1182.
25. Walsh, T. et al. (2011) Mutations in 12 genes for inherited ovarian, fallopian tube, and peritoneal carcinoma identified by massively parallel sequencing. Proc. Natl. Acad. Sci. USA, 108, 18032–18037.
26. Schildkraut, J.M. et al. (2010) Association between DNA damage response and repair genes and risk of invasive serous ovarian cancer. PLoS One, 5, e10061.
27. Wood, R.D. et al. (2005) Human DNA repair genes, 2005. Mutat. Res., 577, 275–283.
28. Howie, B.N. et al. (2009) A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet., 5, e1000529.
29. Westra, H.J. et al. (2013) Systematic identification of trans eQTLs as putative drivers of known disease associations. Nat. Genet., 45, 1238–1243.
30. Karst, A.M. et al. (2012) Primary culture and immortalization of human fallopian tube secretory epithelial cells. Nat. Protoc., 7, 1755–1764.
31. DelloRusso, C. et al. (2007) Functional characterization of a novel BRCA1-null ovarian cancer cell line in response to ionizing radiation. Mol. Cancer Res., 5, 35–45.

32. Hazelett, D.J. et al. (2014) Comprehensive functional annotation of 77 prostate cancer risk loci. PLoS Genet., 10, e1004102.
33. Lawrenson, K. et al. (2009) In vitro three-dimensional modelling of human ovarian surface epithelial cells. Cell Prolif., 42, 385–393.
34. Li, N.F. et al. (2004) A modified medium that significantly improves the growth of human normal ovarian surface epithelial (OSE) cells in vitro. Lab. Invest., 84, 923–931.
35. Li, Q. et al. (2013) Integrative eQTL-based analyses reveal the biology of breast cancer risk loci. Cell, 152, 633–641.
36. Wang, J. et al. (2013) Factorbook.org: a Wiki-based database for transcription factor-binding data generated by the ENCODE consortium. Nucleic Acids Res., 41, D171–D176.
37. Coetzee, S.G. et al. (2015) Cell type specific enrichment of risk associated regulatory elements at ovarian cancer susceptibility loci. Hum. Mol. Genet., 24, 3595–3607.
38. Kulakovskiy, I.V. et al. (2013) HOCOMOCO: a comprehensive collection of human transcription factor binding sites models. Nucleic Acids Res., 41, D195–D202.
39. Izumiyama, T. et al. (2012) A novel big protein TPRBK possessing 25 units of TPR motif is essential for the progress of mitosis and cytokinesis. Gene, 511, 202–217.
40. Uhrigshardt, H. et al. (2010) Characterization of the human HSC20, an unusual DnaJ type III protein, involved in iron-sulfur cluster biogenesis. Hum. Mol. Genet., 19, 3816–3834.
41. Hao, H.X. et al. (2012) ZNRF3 promotes Wnt receptor turnover in an R-spondin-sensitive manner. Nature, 485, 195–200.
42. Forbes, S.A. et al. (2014) COSMIC: exploring the world’s knowledge of somatic mutations in human cancer. Nucleic Acids Res., 43, D805–D811.
43. Lawrenson, K. et al. (2011) Modelling genetic and clinical heterogeneity in epithelial ovarian cancers. Carcinogenesis, 32, 1540–1549.
44. Eeles, R.A. et al. (2013) Identification of 23 new prostate cancer susceptibility loci using the iCOGS custom genotyping array. Nat. Genet., 45, 385–391.
45. Deloukas, P. et al. (2013) Large-scale association analysis identifies new risk loci for coronary artery disease. Nat. Genet., 45, 25–33.
46. Cheng, J.C. et al. (2013) Egr-1 mediates epidermal growth factor-induced downregulation of E-cadherin expression via Slug in human ovarian cancer cells. Oncogene, 32, 1041–1049.
47. Hohenstein, P. et al. (2006) The many facets of the Wilms' tumour gene, WT1. Hum. Mol. Genet., 15, R196–R201.
48. Network, C.G.A.R. (2011) Integrated genomic analyses of ovarian carcinoma. Nature, 474, 609–615.
49. Mullan, P.B. et al. (2006) The role of BRCA1 in transcriptional regulation and cell cycle control. Oncogene, 25, 5854–5863.
50. Le Calvez-Kelm, F. et al. (2011) Rare, evolutionarily unlikely missense substitutions in CHEK2 contribute to breast cancer susceptibility: results from a breast cancer family registry case-control mutation-screening study. Breast Cancer Res., 13, R6.
51. Pomerantz, M.M. et al. (2010) Analysis of the 10q11 cancer risk locus implicates MSMB and NCOA4 in human prostate tumorigenesis. PLoS Genet., 6, e1001204.