Doctoral Programme in Biomedicine (DPBM)

Genetic modifiers of CHEK2-associated and familial breast cancer

Taru A. Muranen

Department of Obstetrics and Gynecology Helsinki University Hospital

Faculty of Medicine University of Helsinki Helsinki, Finland

Academic Dissertation To be discussed, with permission of the Faculty of Medicine, University of Helsinki, in Biomedicum 1, Lecture Hall 2, Haartmaninkatu 8, Helsinki on 2 November 2018, at 12 noon.

Helsinki 2018

Supervised by:
Adjunct Professor Heli Nevanlinna, PhD
Department of Obstetrics and Gynecology
Helsinki University Hospital and University of Helsinki, Finland

Associate Professor Dario Greco, PhD Faculty of Medicine and Life Sciences Institute of Biosciences and Medical Technology University of Tampere, Finland

Reviewed by: Adjunct Professor Minna Tanner, MD, PhD Faculty of Medicine and Life Sciences University of Tampere, Finland

Professor Matti Nykter, PhD Faculty of Medicine and Life Sciences University of Tampere, Finland

Official Opponent: Associate Professor Ingrid Hedenfalk, PhD Division of Oncology and Pathology Department of Clinical Sciences Lund University, Sweden

Cover image: Three versions of the same pedigree overlaid: one colored by disease status (on the bottom), one colored by genotype of a moderate penetrance mutation (middle), and one colored by polygenic risk score (on the top).

Dissertationes Scholae Doctoralis Ad Sanitatem Investigandam Universitatis Helsinkiensis

ISBN 978-951-51-4503-1 (Paperback) ISBN 978-951-51-4504-8 (PDF) ISSN 2342-3161 (print) ISSN 2342-317X (online)

Unigrafia Helsinki 2018

The self-taught is the only one who has learned. The others have been taught. Erno Paasilinna

To Ursa, Else, Eero, and Urho

Table of Contents

Abstract
List of Original Publications
Abbreviations
Gene and protein names
1 Introduction
2 Review of the Literature
  2.1 General cancer characteristics
    2.1.1 Cancer progression
    2.1.2 Cancer genes
  2.2 Breast cancer
    2.2.1 Mammary gland
    2.2.2 Breast cancer risk factors
    2.2.3 Breast cancer subtypes
    2.2.4 Origin of breast cancer
  2.3 Breast cancer treatment
    2.3.1 Adjuvant endocrine therapy
    2.3.2 Other targeted biological therapies
    2.3.3 Adjuvant chemotherapy
  2.4 Genetic predisposition to breast cancer
    2.4.1 Breast cancer heritability
    2.4.2 High-risk genes
    2.4.3 Moderate-risk genes
    2.4.4 The Breast cancer pathway
    2.4.5 Common predisposing variants
  2.5 CHEK2
    2.5.1 CHEK2 protein function
    2.5.2 CHEK2 mutations
    2.5.3 CHEK2 and breast cancer risk
    2.5.4 CHEK2 in breast tumors
3 Aims of the Study
4 Materials and Methods
  4.1 Study subjects and data sources
    4.1.1 Breast tumors (I, II)
    4.1.2 Study subjects from the Breast Cancer Association Consortium (II, III)
    4.1.3 Study subjects of the Helsinki breast cancer study (IV)
  4.2 Methods
    4.2.1 Microarray data processing and analyses (I, II)
    4.2.2 Permutation analysis (I: unpublished data)
    4.2.3 Survival analyses (II)
    4.2.4 Tumor pathology analyses (II)
    4.2.5 The Polygenic risk score (III, IV)
    4.2.6 Risk association analyses (III, IV)
    4.2.7 Feature selection (III: unpublished data)
    4.2.8 In silico functional analysis (III: unpublished data)
  4.3 Ethics statement
5 Results
  5.1 c.1100delC and p.(I157T) carrier tumors (I, II)
    5.1.1 c.1100delC-associated copy number aberrations (I)
    5.1.2 c.1100delC-associated differences in gene expression (I and unpublished data)
    5.1.3 Combined analysis of aCGH and GEX data (I and unpublished data)
    5.1.4 p.(I157T)-associated gene expression (II)
    5.1.5 Clinico-pathological characteristics (II)
  5.2 p.(I157T) or c.1100delC carrier survival (II)
  5.3 Genetic modifiers of c.1100delC-associated breast cancer risk (III)
    5.3.1 Synergistic risk effect of common variants for c.1100delC carriers (III)
    5.3.2 The sparse model (III: unpublished data)
    5.3.3 In silico functional characterization (III: unpublished data)
  5.4 Risk modifiers in breast cancer families (IV)
6 Discussion
  6.1 CHEK2-associated breast cancer (I, II, III)
    6.1.1 Germline CHEK2 mutations are associated with ER-positive breast cancer (II)
    6.1.2 Genomic profiling elucidates the steps of CHEK2-related tumorigenesis (I, II)
    6.1.3 1p22 loss might complement CHEK2 deficiency in breast cancer progression (I)
    6.1.4 Elevated expression of olfactory receptors in c.1100delC carrier tumors (I)
    6.1.5 WNT pathway deregulation – typical for c.1100delC breast cancers (I, III)?
    6.1.6 Hypothetical model for c.1100delC-associated breast cancer progression (I, III)
    6.1.7 Is p.(I157T) ‘the first hit’ for germline mutation carriers (II)?
  6.2 Survival of breast cancer patients carrying germline CHEK2 mutations (II)
    6.2.1 Increased mortality associated with c.1100delC
    6.2.2 CHEK2 mutations and increased risk of local recurrence or new primary tumors
    6.2.3 p.(I157T), lobular carcinoma, and patient survival warrant further research
  6.3 Common genetic variants in breast cancer risk prediction (III, IV)
    6.3.1 PRS could be used in risk stratification of c.1100delC carriers (III)
    6.3.2 No epistatic interaction exists between c.1100delC and the common variants (III)
    6.3.3 PRS explains part of the increased familial risk (IV)
7 Summary and Conclusions
8 Acknowledgments
References
Appendix

Abstract

Aims

CHEK2 (checkpoint kinase 2) is a moderate-risk breast cancer susceptibility gene. By definition, the CHEK2 susceptibility mutations do not segregate consistently with breast cancer within pedigrees, and other genetic factors have been proposed to modify the penetrance of the CHEK2 mutations. The primary purpose of this study was to identify the risk-modifying genetic factors using risk association analyses. Furthermore, genomic profiling of mutation carrier tumors could suggest candidate loci for further risk analyses, but more importantly shed light on the events that have led to tumor development, complementing the CHEK2 deficiency.

CHEK2 c.1100delC has been suggested to be associated with poor prognosis after breast cancer diagnosis. We tested whether the same effect would be shared by p.(I157T), another recurrent CHEK2 mutation in the Finnish population. Additionally, breast cancer phenotypic features associated with the two mutations were examined in terms of pathological characteristics and differential gene expression.

By now, collaborative international studies have identified a vast number of common variants, each associated with a modest increase in the risk of breast cancer. However, combining multiple variants into a polygenic risk score (PRS) has been assumed to have potential in breast cancer risk stratification. We assessed the applicability of the PRS in risk prediction for women at elevated baseline risk, namely carriers of c.1100delC and women with a positive family history of breast cancer.

Essential methods

Genomic copy number aberrations (CNA) associated with c.1100delC were analyzed using data from 26 c.1100delC carrier and 76 non-carrier tumors. Analyses were performed in the R environment for statistical computing using the Bioconductor packages CGHcall, CGHregions, and WECCA. Associations between CNA regions and c.1100delC were tested with the Wilcoxon rank-sum test.

Differential gene expression associated with c.1100delC was examined using data from 13 c.1100delC carrier and 65 non-carrier tumors, and p.(I157T)-associated gene expression using data from 10 p.(I157T) carrier and 162 non-carrier tumors. Analyses were performed with the Bioconductor package limma. Functional enrichment of the differentially expressed genes was analyzed with the DAVID functional annotation tool and Gene Set Enrichment Analysis (GSEA) using gene libraries available at MSigDB.

Survival and tumor pathologic characteristics of breast cancer patients carrying CHEK2 mutations were studied in collaboration with the Breast Cancer Association Consortium (BCAC) in a dataset consisting of 25 940 non-carriers, 590 p.(I157T) carriers, and 271 c.1100delC carriers. Survival analyses were performed with Cox regression and pathology analyses with the Cochran-Mantel-Haenszel test.

The risk effect associated with about 75 common variants was studied in a BCAC dataset of 78 354 non-carriers and 848 c.1100delC carriers as well as in a Finnish dataset consisting of 1 303 unselected cases, 378 additional familial index cases, 1 272 population controls, and 429 women from 52 breast cancer families. A polygenic risk was calculated as a product of per-variant log odds ratios, and standardized according to healthy population controls. Risk association analyses were performed with logistic regression. Nested regression models were compared with the likelihood-ratio test.

Results

We identified seven chromosomal locations whose copy number aberrations were more frequent in c.1100delC carrier tumors than in non-carrier tumors. Functional in silico analysis of CNA regions and differentially expressed genes suggested that loss of GBP genes, elevated activity of olfactory receptors, and deregulation of the WNT pathway could be recurrent driver events in c.1100delC carrier cancers. Gene expression analysis suggested that CDH1 inactivation is a frequent event in p.(I157T) carrier breast cancers, possibly accounting for most of the observed differences between p.(I157T) carrier and non-carrier breast cancers.

Germline CHEK2 mutations c.1100delC and p.(I157T) differ in terms of their association with patient prognosis, c.1100delC being a marker for poor prognosis. Both mutations predispose to estrogen receptor (ER)-positive breast cancer. The p.(I157T) mutation is associated with lobular breast cancer, whereas c.1100delC is not associated with any specific breast cancer histological subtype.

The breast cancer risk associated with the PRS was similar for c.1100delC carriers, women from breast cancer families, and unselected women. When accounting for the elevated background risk associated with c.1100delC, about 20% of mutation carriers with the highest PRS values were estimated to have higher than 30% lifetime risk. Furthermore, even though PRS explained part of the excess familial risk, the high PRS values retained predictive value in risk stratification of women with a positive family history of breast cancer.

Conclusions

The genomic analyses of CHEK2 mutation carrier tumors could lay a foundation for a model of c.1100delC-associated tumorigenesis. Furthermore, c.1100delC and p.(I157T) might have different roles in the origin and development of breast cancer. The findings from these hypothesis-generating studies could benefit future functional in vitro and in vivo studies on breast cancer etiology.

The poor survival of breast cancer patients carrying c.1100delC warrants further examination. The data presented in this work indicate that the survival association is not shared by p.(I157T), emphasizing that findings based on a certain mutation cannot always be generalized to other mutations of the same gene.

On a general population level, the usability of the current PRS is limited due to the very low proportion of unselected women stratified into the high-risk category by PRS alone. However, for women at elevated background risk as a consequence of an inherited moderate penetrance mutation, like c.1100delC, or positive family history, the PRS of about 75 variants could provide significant clinical benefit in identifying women at high lifetime risk.
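The polygenic risk score described under Essential methods can be illustrated with a small sketch. It assumes the conventional construction, in which the raw score is the sum over variants of the risk-allele count multiplied by the per-allele log odds ratio, followed by standardization against the mean and standard deviation of the controls. This is one reading of the description above, not the code used in the studies. The odds ratios, genotypes, and function names below are hypothetical and chosen only for illustration.

```python
import math
from statistics import mean, stdev


def raw_prs(dosages, log_odds_ratios):
    """Sum of risk-allele counts (0, 1, or 2) weighted by per-allele log odds ratios."""
    if len(dosages) != len(log_odds_ratios):
        raise ValueError("one dosage per variant is required")
    return sum(d * b for d, b in zip(dosages, log_odds_ratios))


def standardize_to_controls(scores, control_scores):
    """Express scores in units of the control-group standard deviation."""
    mu = mean(control_scores)
    sd = stdev(control_scores)
    return [(s - mu) / sd for s in scores]


# Hypothetical per-allele odds ratios for three variants (not taken from the thesis).
log_ors = [math.log(1.10), math.log(1.05), math.log(0.95)]

# Hypothetical genotypes: risk-allele counts per variant for each woman.
controls = [[0, 1, 2], [1, 1, 1], [2, 0, 1], [0, 0, 2]]
cases = [[2, 2, 0], [1, 2, 1]]

control_scores = [raw_prs(g, log_ors) for g in controls]
case_scores = [raw_prs(g, log_ors) for g in cases]

# After standardization the controls have mean 0 and standard deviation 1,
# so a case score of 1.0 means one control standard deviation above average.
control_z = standardize_to_controls(control_scores, control_scores)
case_z = standardize_to_controls(case_scores, control_scores)
```

Standardizing against controls rather than against the pooled sample keeps the scale tied to the healthy population, so an odds ratio estimated per unit of the score reads as the effect of one population standard deviation.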

List of Original Publications

This thesis is based on the following original publications, referred to in the text by their Roman numerals. In addition, unpublished data exploring Studies I and III further are included.

I. Muranen TA, Greco D, Fagerholm R, Kilpivaara O, Kämpjärvi K, Aittomäki K, Blomqvist C, Heikkilä P, Borg Å, Nevanlinna H. Breast tumors from CHEK2 1100delC-mutation carriers: genomic landscape and clinical implications. Breast Cancer Res. 2011;13:R90.

II. Muranen TA, Blomqvist C, Dörk T, Jakubowska A, Bojesen SE, Fagerholm R, Greco D, Aittomäki K, Shah M, Dunning AM, Rhenius V, Hall P, Czene K, Brand JS, Darabi H, Chang-Claude J, Rudolph A, Nordestgaard BG, Couch FJ, Hallberg E, Figueroa J, García-Closas M, Fasching PA, Beckmann MW, Li J, Liu J, Andrulis IL, Knight JA, Winqvist R, Pylkäs K, Mannermaa A, Kataja V, Lindblom A, Margolin S, Lubinski J, Dubrowinskaja N, Bolla MK, Dennis J, Michailidou K, Wang Q, Easton DF, Pharoah PDP, Schmidt MK, Nevanlinna H. Patient survival and tumor characteristics associated with CHEK2 I157T: findings from the Breast Cancer Association Consortium. Breast Cancer Res. 2016 Oct 3;18(1):98.

III. Muranen TA, Greco D, Blomqvist C, Aittomäki K, Khan S, Hogervorst F, Verhoef S, Pharoah PDP, Dunning AM, Shah M, Luben R, Bojesen SE, Nordestgaard BG, Schoemaker M, Swerdlow A, García-Closas M, Figueroa J, Dörk T, Bogdanova NV, Hall P, Li J, Khusnutdinova E, Bermisheva M, Kristensen V, Borresen-Dale AL, Investigators N, Peto J, Dos Santos Silva I, Couch FJ, Olson JE, Hillemans P, Park-Simon TW, Brauch H, Hamann U, Burwinkel B, Marme F, Meindl A, Schmutzler RK, Cox A, Cross SS, Sawyer EJ, Tomlinson I, Lambrechts D, Moisse M, Lindblom A, Margolin S, Hollestelle A, Martens JWM, Fasching PA, Beckmann MW, Andrulis IL, Knight JA, Investigators K, Anton-Culver H, Ziogas A, Giles GG, Milne RL, Brenner H, Arndt V, Mannermaa A, Kosma VM, Chang-Claude J, Rudolph A, Devilee P, Seynaeve C, Hopper JL, Southey MC, John EM, Whittemore AS, Bolla MK, Wang Q, Michailidou K, Dennis J, Easton DF, Schmidt MK, Nevanlinna H. Genetic modifiers of CHEK2 c.1100delC-associated breast cancer risk. Genet Med. 2017 May;19(5):599-603.

IV. Muranen TA, Mavaddat N, Khan S, Fagerholm R, Pelttari L, Blomqvist C, Aittomäki K, Easton DF, Nevanlinna H. Polygenic risk score is associated with increased disease risk in 52 Finnish breast cancer families. Breast Cancer Res Treat. 2016 Aug;158(3):463-9.

These publications are reprinted with the permission of their copyright holders.

Abbreviations

aCGH     Array comparative genomic hybridization
BAC      Bacterial artificial chromosome
BCAC     Breast Cancer Association Consortium
CI       Confidence interval
CMF      Cyclophosphamide – methotrexate – 5-fluorouracil
CNA      Copy number aberration
COGS     Collaborative Oncological Gene-Environment Study
ER       Estrogen receptor
FFPE     Formalin-fixed paraffin-embedded
FHA      Forkhead-associated
FWER     Family-wise error rate
G1/G2    Gap 1/2
GEO      Gene Expression Omnibus
GEX      Gene expression
GSEA     Gene set enrichment analysis
GWAS     Genome-wide association study
HR       Hazard ratio
IC-NST   Invasive carcinoma of no special type
ILC      Invasive lobular carcinoma
KD       Kinase domain
M        Metastasis
M        Mitosis
N        Status of adjacent lymph nodes
OR       Odds ratio
PgR      Progesterone receptor
PRS      Polygenic risk score
Q        Glutamine
S; Ser   Serine
S        Synthesis
SCD      SQ/TQ cluster domain
SNP      Single-nucleotide polymorphism
T; Thr   Threonine
T        Tumor size
TCGA     The Cancer Genome Atlas
TEB      Terminal end bud
UTR      Untranslated region

Gene and protein names

ACIII       Adenylate cyclase III
AKT         AKT serine/threonine kinase 1
ALDH        Aldehyde dehydrogenase
ALG14       ALG14, UDP-N-acetylglucosaminyltransferase subunit
ANKLE1      Ankyrin repeat and LEM domain containing 1
APC         APC, WNT signaling pathway regulator
ATE1        Arginyltransferase 1
ATM         ATM serine/threonine kinase
AURKA       Aurora kinase A
BARD1       BRCA1 associated RING domain 1
BRCA1/2     BRCA1/2, DNA repair associated
CALCOCO1    Calcium binding and coiled-coil domain 1
CDC25A      Cell division cycle 25 A
CDH1        Cadherin 1
CDK1/2/4/6  Cyclin dependent kinase 1/2/4/6
CHEK2       Checkpoint kinase 2
CLCA1       Chloride channel accessory 1
CSAD        Cysteine sulfinic acid decarboxylase
EGFR        Epidermal growth factor receptor
ELL         Elongation factor for RNA polymerase II
FANCM       Fanconi anemia complementation group M
FGF         Fibroblast growth factor
FGFR2       Fibroblast growth factor receptor 2
FTO         FTO, alpha-ketoglutarate dependent dioxygenase
FZD1        Frizzled class receptor 1
GBP1-7      Guanylate binding protein 1-7
GNAL        G protein subunit alpha L
HER2        Human epidermal growth factor receptor 2
IFN-γ       Interferon gamma
IL-1β       Interleukin 1 beta
JAK         Janus kinase
KRT5        Keratin 5
LEF1        Lymphoid enhancer binding factor 1
LHRH        Luteinizing hormone releasing hormone
LRP1        LDL receptor related protein 1
LRRC8D      Leucine rich repeat containing 8 VRAC subunit D
MLH1        mutL homolog 1
MMP         Matrix metalloproteinase
MRE11       MRE11 homolog, double strand break repair
MSH2        mutS homolog 2
MTOR        Mechanistic target of rapamycin kinase
MYC         MYC proto-oncogene, bHLH transcription factor
NBN         Nibrin
NF1         Neurofibromin 1
NQO1        NAD(P)H:quinone oxidoreductase
OR          Olfactory receptor
OR6C3       Olfactory receptor family 6 subfamily C member 3
PALB2       Partner and localizer of BRCA2
PARP        Poly(ADP-ribose) polymerase
PIK3CB      Phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit beta
PRKDC       Protein kinase, DNA-activated, catalytic polypeptide
PTEN        Phosphatase and tensin homolog
PVT1        Pvt1 oncogene
RAD50       RAD50 double strand break repair protein
RAD51       RAD51 recombinase
RasGEF      Ras-type guanine nucleotide exchange factors
RB1         RB transcriptional corepressor 1
STAT        Signal transducer and activator of transcription
STK11       Serine/threonine kinase 11
TMED5       Transmembrane p24 trafficking protein 5
TOP2A       Topoisomerase II alpha
TP53        Tumor protein p53

1 Introduction

CHEK2 has been established as a moderate-penetrance breast cancer susceptibility gene. Two CHEK2 mutations, protein truncating c.1100delC and missense p.(I157T), are relatively common in the Finnish population, having carrier frequencies of 1.4% and 5.3%, respectively.1, 2 The relative risk associated with c.1100delC and other truncating CHEK2 mutations is two- to threefold, whereas the risk effect of p.(I157T) is considerably lower.3-5 The c.1100delC mutation predisposes to familial breast cancer. However, it does not segregate consistently with the disease within breast cancer families. Furthermore, the moderate risk level associated with c.1100delC (odds ratio (OR) 2.26; 95% confidence interval (CI) 1.90-2.69) has limited its applicability in genetic counseling.1, 3 About 30% of breast cancer incidence has been estimated to be caused by genetic factors.6 High- and moderate-risk mutations account for about one-fifth of the heritability.7 However, they do not operate alone. The best genetic model explaining both familial clustering and population-level incidence of breast cancer consists of rare high-penetrance mutations and common variants with low effect sizes contributing together in a multiplicative fashion to increase the risk.8 The discovery of multiple risk-modifying variants in genome-wide association studies (GWAS) has paved the way for investigation of genetic variants modifying the risk associated with CHEK2 c.1100delC.9 In addition to risk-modifying effects, common genetic variation has been predicted to contribute to familial clustering of breast cancer.8 Furthermore, the multiplicative model suggests that the nominal risk effects associated with single variants could be combined in order to estimate the risk of individual women. Recently, a polygenic risk score (PRS) was introduced for risk prediction on a population level,10 and we have investigated its applicability in breast cancer families.
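The multiplicative model mentioned above can be made concrete with a short sketch: under the assumption of no interaction, the odds ratio of a rare mutation and the odds ratio contributed by a standardized PRS are simply multiplied. The carrier odds ratio of 2.26 is the value quoted in the text; the per-standard-deviation odds ratio for the PRS and the function name are hypothetical placeholders and are not estimates from this work.

```python
# Odds ratio for c.1100delC carriers, as quoted in the introduction.
CHEK2_1100DELC_OR = 2.26

# Hypothetical PRS effect per standard deviation; illustrative only.
HYPOTHETICAL_OR_PER_SD = 1.5


def combined_odds_ratio(carrier, prs_z,
                        mutation_or=CHEK2_1100DELC_OR,
                        or_per_sd=HYPOTHETICAL_OR_PER_SD):
    """Odds ratio relative to a non-carrier with an average PRS.

    The two effects are multiplied, so no interaction term is included.
    """
    mutation_factor = mutation_or if carrier else 1.0
    return mutation_factor * or_per_sd ** prs_z


# A non-carrier with an average score is the reference (odds ratio 1.0).
reference = combined_odds_ratio(carrier=False, prs_z=0.0)

# A carrier one standard deviation above the mean combines both effects.
carrier_high_prs = combined_odds_ratio(carrier=True, prs_z=1.0)
```

Under this model the ratio between carriers and non-carriers is the same at every PRS value, which is what "multiplicative" means here. The outputs are odds ratios, not absolute lifetime risks; converting them would additionally require population incidence data.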
Hereditary cancer is typically caused by an inherited loss-of-function mutation in a tumor suppressor gene.11 The loss of the intact allele is assumed to be often the initiating event of a multistep path to cancer.12, 13 The later steps of tumorigenesis arise as a result of random somatic events. However, only those changes that endow a growth advantage in combination with the earlier events are selected during the course of tumor evolution. The final cancer phenotype reflects the accumulation of novel features associated with the driver events.14 Since the driver changes are specific for the cancer relative to adjacent healthy tissue, and since tumor growth is dependent on them, they represent an appealing target for cancer therapy. Genomic analyses of copy number aberrations and gene expression changes in tumor tissue can be used for characterization of the driver events leading to cancer in specific cancer subgroups.15, 16

2 Review of the Literature

2.1 General cancer characteristics

2.1.1 Cancer progression

Cancer is a progressive disease in which the cancerous cells gradually lose their tissue-typical morphology, proliferate in an uncontrolled manner, invade the surrounding tissue, and eventually spread via lymphatic and blood vasculature to give rise to metastatic growths.14 Cancer progression is driven and accompanied by mutations, which can be considered as stochastic events whose probability is increased by two types of factors: those increasing the number of cell divisions and those causing DNA damage (Figure 1).17 Most of the neoplastic events, i.e. emergence of driver mutations, take place in progenitor/transit-amplifying cells, which have the capacity to dedifferentiate into stem cell state, but which also proliferate at a frequency high enough for accumulation of a sufficient number of malignant mutations.18 However, the actual origin of a cancer could be any cell that retains proliferative capacity, ranging from stem cells to their more differentiated descendants, depending on tissue hierarchy and cell half-lives.19 After the initiating event, the progeny of any pre-neoplastic cell may normally take their place and function in the tissue hierarchy or, alternatively, form a benign growth or even be erased by innate mechanisms controlling tissue homeostasis until further mutations endowing a growth advantage emerge. Thus, tumor progression is a cellular-level combination of stochastic events and Darwinian evolution.14, 18, 19

[Figure 1: diagram leading from healthy tissue to cancer. Labeled factors: inherited mutations; age; DNA replication errors; increasing number of stem cell divisions; carcinogenic exposure; intrinsic oxidative stress; inflammation, hormonal exposure, etc. Together these lead to an increasing probability of tumor driver mutations.]

Figure 1. Summary of factors increasing the probability of cancer progression.

The branching evolution characteristic for cancer development can be seen in genomic profiles of excised tumors and metastatic growths. The benign clonal cell populations co-exist with their more aggressive descendants. The different subpopulations can be distinguished by their mutation and gene expression profiles, highlighting the intrinsic heterogeneity of any single tumor.20

[Figure 2: circular diagram of the cancer hallmarks. Recoverable labels: sustaining proliferative signaling; resisting cell death; genome instability and mutation; deregulating cellular energetics; enabling replicative immortality; activating invasion and metastasis.]

Figure 2. Cancer Hallmarks introduced by Hanahan and Weinberg can be categorized as changes taking place primarily in the neoplastic cell lineage (inner circle) and changes affecting the interactions between the cancer cells and their tissue environment (outer circle).14, 21

Owing to the progressive nature of the disease, each cancer is unique. Hanahan and Weinberg summarized features shared by most cancers into six ‘Cancer Hallmarks’ and later refined the model by addition of two novel hallmarks and two enabling characteristics (Figure 2).14, 21 First of all, alterations in cellular and tissue level programs regulating cell proliferation and survival are divided into three hallmarks: sustaining proliferative signaling, evading the growth suppressors, and escaping the intrinsic apoptotic programs. The number of replicative cycles is restricted by telomere erosion in human cells. The rampantly dividing cells face a telomere crisis and need to reactivate the telomerase enzyme to gain replicative immortality, the fourth hallmark. Another restrictive mechanism for tumor growth is the shortage of oxygen and nutrients. Two further hallmarks enable cancer to overcome this challenge: inducing neo-vasculature and reshaping energy metabolism. The ultimate hallmark transforming cancer into a systemic disease is tissue invasion and metastasis, requiring changes in cell phenotype and adaptation to a foreign cellular environment. Finally, all the way through the tumor progression, the cancer cells must escape the surveillance of the immune system in order to triumph. In addition to these eight hallmarks, Hanahan and Weinberg named two features as specific cancer-enabling characteristics: genomic instability and chronic inflammation (Figure 2).14

The concept of cancer hallmarks is a simplified framework; all hallmarks cannot be considered to apply to all cancer cells at all times, not to all cancer stem cells, and not even to all cancers. Floor et al.22 suggested a four-layered hierarchical model for understanding carcinogenesis etiology (Figure 3). In this model, the ‘Cancer Hallmarks’ form the highest hierarchical level, the general cancer characteristics. The hallmarks result from changes in cellular pathways arising from genetic, epigenetic, or lysogenic oncogenic events, which on the bottom level are caused by various intrinsic and environmental factors (Figure 1). Importantly, the hierarchical levels of the model are connected by complex one-to-many and many-to-one relations.22 This model emphasizes the need for a molecular-level understanding of cancer. A good start for this effort would be harvesting the oncogenic events, i.e. the acquired somatic and predisposing germline mutations, in genetic studies. Elucidating how these changes affect the complex network of cellular pathways and differentiation programs giving rise to the cancer hallmarks will be the future goal of cancer research.

Figure 3. Hierarchical model of carcinogenesis etiology with potential simple and complex interactions at all levels of hierarchy. Adapted from Floor et al. 2012.22

2.1.2 Cancer genes

Cancer-associated genes are categorized into two groups based on how their aberrations accelerate tumorigenesis: activated oncogenes and silenced tumor suppressor genes; both promote cancer progression. By definition, oncogenes are genes encoding proteins involved in cellular growth and differentiation programs that have lost important gene- or protein-level regulatory constraints. Cellular oncogenes have often been identified at characteristic chromosomal translocations due to sequence similarity to viral oncogenes.23 Oncogene activation is usually a somatic event, and mutations in proto-oncogenes are rarely associated with hereditary cancer.11, 21, 24, 25 The majority of tumor suppressor genes have been discovered in studies of cancer families.11, 26 Typically, inactivation of both alleles is required for cancer initiation, as suggested in Knudson’s ‘two-hit hypothesis’.12, 13 If one hit, i.e. a loss-of-function mutation of a tumor suppressor gene, is inherited, cancer probability is higher than if both hits were to be acquired as somatic changes. In addition to this simplistic model, haplo-insufficiency has been recognized as another mechanism for tumor suppressor-related cancer initiation; under certain environmental or intrinsic stress, expression of only one intact allele is not sufficient to protect the cell from additional neoplastic changes.17, 26

Tumor suppressor genes are further divided into three functional categories: gatekeepers, caretakers, and landscapers. Gatekeepers refer to the original idea of tumor suppressors as anti-oncogenes. Those are genes encoding proteins that limit cell cycle progression and entry into such as APC (APC, WNT signaling pathway regulator) in colorectal cancer and RB1 (RB transcriptional corepressor 1) in retinoblastoma. Caretakers include DNA repair genes, like BRCA1 and BRCA2 (BRCA1/2, DNA repair associated), whose mutations predispose to breast and ovarian cancer, or MSH2 (mutS homolog 2) and MLH1 (mutL homolog 1), which are associated with colorectal cancer.26 Landscaper genes are defined to encode proteins involved in regulating the tumor microenvironment. The tumor-initiating mutation of a landscaper gene might even take place in a stromal cell instead of the cancer cell lineage. Juvenile polyposis is a characteristic syndrome for a germline landscaper gene deficiency. It is manifested by multiple hamartomatous polyps of the colon at a young age. The abnormal growth of the epithelium has been concluded to be induced by a mutation in the surrounding stromal cells, causing sustained proliferative signaling. The elevated number of cell divisions in epithelial cells raises the probability of somatic neoplastic events, thus increasing the risk of carcinoma development.26, 27 Functional classification of tumor suppressor genes identifies the key features of cancer-associated genes. However, it is a rough simplification. One gene or protein may serve in different roles depending on the context. For example, TP53 (tumor protein p53) serves as both a gatekeeper and caretaker, connecting the surveillance of genomic integrity to apoptosis,26 and BRCA1 is involved in DNA repair, control of cell division, and regulation of differentiation via gene expression.28, 29 Furthermore, since even oncogene activation induces apoptotic programs, it is justified to shift the focus from single genes to the pathway level and consider cancer as a consequence of disturbed cellular programs.26

Molecular and cellular cancer research has focused in a reductionist fashion on characterization of cancer cell lineages like the transformed epithelial cells in carcinomas. However, cancer is not a cell-autonomous disease. Instead, the interactions with the microenvironment, extracellular matrix, stromal cells, and immune system contribute to cancer development. There is some experimental evidence implying that aneuploid cancerous cells could be normalized under regulation of a healthy cellular environment.30, 31 Furthermore, the risk of tumor spread has been suggested to depend more on immune response than on the features of the cancer cells themselves.32, 33

2.2 Breast cancer

Breast cancer is the most common cancer and the leading cause of cancer-related death in women in developed countries.34 In Finland, the cumulative disease risk by the age of 75 years has increased steadily since the 1950s, reaching 9.9% in 2014.35, 36 The high rank of breast cancer as a cause of mortality is partially misleading; it is mostly due to the high frequency of the disease itself. In general, breast cancer is a manageable disease, with a five-year survival rate of about 88%.35, 36 Male breast cancer is a rare disease, with an average of 25 diagnosed cases per year in Finland.35, 36

Breast cancer covers a range of phenotypically different neoplastic diseases. However, here, the term ‘breast cancer’ is used to strictly refer to carcinomas originating from the epithelial cells of the mammary gland, distinct from connective tissue, lymphoid, or skin neoplasias occurring in the chest area.37

2.2.1 Mammary gland

The human mammary gland is a tree-like structure, with primary, secondary, and tertiary ducts forming the stem and branches, and lobules forming the leaves. The hormone-independent early development of the mammary gland starts during embryogenesis, with formation of a bilateral mammary ridge, followed by development of placodes and rudimentary ductal trees. After birth, the gland remains in a quiescent stage until puberty, when the branching morphogenesis continues in response to estrogen stimulus.38, 39 At birth, the rudimentary ductal tree consists of short primitive ducts ending in terminal end buds (TEBs). Thereafter, the primary ducts are formed by bifurcation of the TEBs. While the primary ducts elongate, the secondary and tertiary ducts emerge as lateral appendages, with novel TEBs at each branch end. Eventually, the TEBs give rise to small ductules, which first develop into virginal lobules, i.e. terminal ductal lobular units, and later differentiate into milk-secreting alveoli.38, 40

The tertiary ducts and the lobules are sensitive to oscillating exposure to ovarian and pituitary hormones, which induce a growth and differentiation pulse followed by a period of regression during each menstrual cycle. However, until the age of 35 years the regression never returns to the starting point, the net effect being cumulative growth and differentiation.38, 39 While the lobules develop, the number of ductules per lobule increases and the size of the ductules decreases. The full maturation of the ductal tree takes place during the first half of the first pregnancy, when the epithelial luminal cells differentiate in anticipation of lactation. Pregnancy, lactation, and weaning are followed by partial gland involution.

Russo et al.38 classify the lobules into four types based on the number of ductules per lobule. The predominant lobule type for nulliparous women is ‘Lob1’, with about 11 ductules, whereas for parous women the most abundant type is ‘Lob3’, with about 81 ductules. ‘Lob2’ is an intermediate structure, and ‘Lob4’ refers to the fully matured structure during the latter half of pregnancy and lactation.38 In addition to structural changes, pregnancy induces epithelial cell differentiation by altering the balance of the activated cellular pathways, and by reducing the proliferative activity of the epithelial cells.41-43 Menopause is associated with major structural involution of the ductal tree, irrespective of parity. Thus, the mammary gland of menopausal parous and nulliparous women is structurally similar and occupied mainly by ‘Lob1’ lobules. However, on a cellular level there are significant differences in chromatin condensation and proliferative activity.42

The ducts are lined with two layers of epithelial cells: the luminal layer and the basal myoepithelial layer. Alveolar, milk-producing cells differentiate from dedicated alveolar precursors scattered around in the luminal layer.39 All of these epithelial cell lineages originate from the same precursors, the mammary stem cells, which dwell in the basal layer.40 The stem cell activity is restricted mainly to the period of embryonic development, and in the adult tissue cell regeneration is maintained by multiplication of luminal and basal lineage-specific precursors.44, 45 Basal lamina separates the epithelial cell layers from fibroblast-rich stroma, which surrounds the ductal tree amid the adipocytes of the mammary fat pad.39

Table 1. Regulators of mammary gland development.39, 46

Columns: Place of expression/action; Gene/Protein; Agonists and ligands; Antagonists; Targets; Effects

Systemic regulators
Ovaries E1/E2 ESR1 pd: epithelial cell proliferation (Estrogens) preg: maintenance of alveolar cells
Ovaries P PGR preg: tertiary branching (Progesterone) and alveologenesis
Pituitary gland GH IGF1, ESR1 pd: epithelial cell proliferation (Growth hormone)
Pituitary gland PRL PRLR preg: progesterone expression, (Prolactin) lobulo-alveolar development
Liver IGF1 pd: epithelial cell proliferation (Insulin-like growth factor)
Liver PLG KLK1 ECM iv: ECM breakdown, disruption of (Plasminogen) cell-cell contacts

Local regulators
Epithelium WNT3, WNT6, WNT10B TBX3, LEF1/TCF ed: epithelial cell proliferation Wnt family members FGFR2B
Mesenchyme WNT5A, WNT11 TGFB1 LEF1/TCF ed: epithelial cell proliferation Wnt family members pd: inhibition of ductal elongation
Epithelium WNT4 P LEF1/TCF preg: tertiary branching Wnt family member
Epithelium and LEF1/TCF WNT- Multiple mammary gland development mesenchyme lymphoid enhancer binding factor 1 / ligands transcriptional TCF family transcription factors level targets
Mesenchyme TBX3 BMP4 WNT10B, ed: localization of developing gland T-box 3 BMP4
Epithelium BMP4 TBX3 BMPR1A, ed: localization of developing gland bone morphogenetic protein 4 TBX3
Mesenchyme BMPR1A PTH1R WNT ed: localization of developing gland, bone morphogenetic protein receptor signaling nipple formation type 1A
Epithelium PTHLH PTH1R ed: branching and nipple formation parathyroid hormone like hormone
Mesenchyme PTH1R PTHLH BMPR1A ed: branching and nipple formation parathyroid hormone 1 receptor
Mammary line FGFR2 FGFs SPRY2 ed: placode placement Epithelium fibroblast growth factor receptor 2 pd: epithelial cell proliferation
Somites FGF10 (and other FGF ligands) GLI3 FGFRs ed: placode placement Stroma fibroblast growth factor 10 pd: epithelial cell proliferation
Somites GLI3 FGF10 ed: placode placement GLI family zinc finger 3
Mesenchyme NRG3 Integrins, ed: regulation of cell-cell interactions neuregulin 3 ECM, ERBB4
Epithelium ERBB4 NRG3 ed: regulation of cell-cell interactions erb-b2 receptor tyrosine kinase 4
Epithelium ERBB2 EGFR- pd: ductal morphogenesis erb-b2 receptor tyrosine kinase 2 coreceptor
Stroma STX2 Metallo- CEBPB, pd: GH dependent branching syntaxin 2 / epimorphin enzymes MMP2, MMP3
Epithelium CEBPB STX2 Multiple pd: GH dependent branching CCAAT/enhancer binding protein beta transcriptional level targets
Stroma IGF1 GH IGFBP5 IGF1R pd: epithelial cell proliferation insulin like growth factor 1
Stroma ESR1 E1/E2 AREG pd: epithelial cell proliferation and Luminal cells estrogen receptor 1 ductal elongation
Luminal cells AREG ESR1, HSPGs EGFR pd: proliferation of ER-negative cells amphiregulin ADAM17
Luminal cells ADAM17 PPCs TIMP3 AREG pd: epithelial cell proliferation a disintegrin and metalloproteinase 17
Stroma EGFR AREG, NRG3 pd: ductal elongation epidermal growth factor receptor and other EGF ligands
Stroma GHR GH IGF pd: epithelial cell proliferation growth hormone receptor
Epithelium TGFB1 TGFBR2 WNT pd: negative regulator of branching and Stroma transforming growth factor beta 1 signaling duct elongation
Epithelium CSF1 CSF1R pd: macrophage recruitment colony stimulating factor 1
Stroma CCL11 CCR3 pd: eosinophil recruitment C-C motif chemokine ligand 11
Luminal cells PGR P WNT4, preg: tertiary branching progesterone receptor TNFSF11 and alveologenesis
Luminal cells TNFSF11/RANKL PGR, JAK2 / TNFRSF11A preg: tertiary branching TNF superfamily member 11 / STAT5 and alveologenesis RANK ligand
Luminal cells TNFRSF11A/RANK TNFSF11 CCND1, preg: tertiary branching TNF receptor superfamily member 11a NFKB1 and alveologenesis
Luminal cells PRLR PRL TNFSF11, preg: tertiary branching prolactin receptor JAK2/STAT5 and alveologenesis
Luminal cells Integrins ECM JAK2/STAT5 preg: tertiary branching and alveologenesis
Luminal cells SIRPA ECM JAK2/STAT5 preg: tertiary branching signal regulatory protein α and alveologenesis
Luminal cells JAK2 / STAT5 PRLR, SOCS SOCS, preg: tertiary branching janus kinase 2 / Integrins, TNFSF11 and alveologenesis signal transducer and activator of SIRPA transcription 5
Luminal cells SOCS family JAK2 / JAK2 / preg: tertiary branching suppressors of cytokine signaling STAT5 STAT5 and alveologenesis
Epithelium LIF Milk stasis STAT3 iv: involution inducing signal leukemia inhibitory factor
Luminal cells STAT3 LIF PI3 kinase, iv: inhibition of proliferative signals, signal transducer and activator of IGFBP5 apoptosis transcription 3 apoptotic programs
Luminal cells IGFBP5 STAT3 IGF iv: inhibition of proliferative signals insulin-like growth factor binding protein-5
Epithelium KLK1 PLG iv: ECM breakdown, disruption of kallikrein 1 cell-cell contacts
Stroma MMP2, MMP3 and MMP14 EGFR, TGFB, TIMPs 1-4, ECM, basal pd: duct elongation, secondary matrix metalloproteinases ESR1, STX2 TGFB lamina branching iv: disruption of basal lamina
Stroma TIMPs 1-4 MMPs pd and iv: MMP regulation tissue inhibitor of metalloproteinases

Abbreviations: ed – embryonic development, pd – pubertal development, preg – pregnancy, iv - involution

The embryonic development of the mammary gland is independent of sex hormones and does not differ between women and men. It is regulated by reciprocal interactions between the epithelium and the underlying mesenchyme. Pubertal development is orchestrated by growth hormone secreted from the pituitary gland and ovarian estrogen, which together induce global and local downstream effects leading to mammary gland expansion (Table 1). Growth cycles associated with the menstrual period rely mainly on ovarian progesterone stimulus, and the full maturation of the mammary gland taking place during pregnancy is driven by progesterone together with pituitary gland prolactin. The key regulators in different developmental phases are listed in Table 1. In brief, WNT and FGF (fibroblast growth factor) pathways have an essential role in ductal growth and branching from embryogenesis until full maturation. The JAK/STAT (janus kinase/signal transducer and activator of transcription) pathway together with matrix metalloproteinases (MMPs) are involved in pubertal and antenatal development as well as in involution. Overall, the mammary gland development is a complex interplay between the epithelial cell layers, extracellular matrix, stromal fibroblasts, and recruited white blood cells and requires a strict control of cell proliferation and differentiation as well as matrix remodeling.39, 46

2.2.2 Breast cancer risk factors

Breast cancer risk is increased by multiple factors related to life-style or to reproductive and medical history (Table 2).47-51 Pregnancy is associated with a transient increase in the risk of estrogen receptor (ER)-negative breast cancer. In the long run, pregnancy and breast feeding have a protective effect. However, if the time between the first menstrual period and the first full-term pregnancy exceeds 15 years, the protective effect fades away. The pregnancy-associated fluctuation in breast cancer risk has been proposed to reflect the intrinsic mammary gland biology. The periodically proliferating epithelial cells of a virgin gland are vulnerable to carcinogenic attack, whereas the full maturation of the gland entails protective changes in the form of increased chromatin condensation and lowered proliferative activity. However, gland expansion, accompanied by epithelial cell division and matrix remodeling, transiently raises the risk of malignant development during pregnancy.38, 42, 43, 46

Table 2. Breast cancer risk factors.

Life-style                   Reproductive history                      Medical history
Tobacco smoking              Nulliparity*                              Oral contraceptives*
Alcohol consumption*         High age at first full term pregnancy*    Hormonal replacement therapy*
Overweight*                  Early menarche*                           Chest area X-rays
Lack of physical activity    Late menopause*                           Mammographic density*
                                                                       Benign breast disease

* Factors associated with increased exposure to estrogen.

Many of the risk factors are directly related to increased exposure to estrogen (Table 2).52 Estrogen contributes to breast cancer risk both by increasing the number of cell divisions and by inducing genotoxic stress; estrogen metabolite estradiol generates free oxygen radicals and forms DNA adducts, leading to depurination and increased risk of error-prone DNA repair.53 Oophorectomy or several years’ administration of antiestrogens halves the breast cancer risk of women at high familial risk.37, 48, 54

2.2.3 Breast cancer subtypes

Breast cancers are categorized in many ways in order to assist in prognosis and in the choice of treatment. Grading is based on tumor cell nuclei morphology, proliferation, and the extent to which the tumor cells form tubular structures.55 Disease stage (TNM classification) is determined according to tumor size (T), spread to adjacent lymph nodes (N), and distant metastases (M).56 Histopathologic classification relies on multicellular structures and proportion of infiltrating cells. The most common histological type is ‘Invasive carcinoma of no special type’ (IC-NST), previously called ‘Invasive ductal carcinoma’, and the second most common is ‘Invasive lobular carcinoma’ (ILC). Additionally, expression of four marker proteins, ER, PgR (progesterone receptor), HER2 (human epidermal growth factor receptor 2), and Ki-67, is often assessed in clinics. ER/PgR or HER2 positivity is a direct indicator for endocrine or HER2-targeted therapies, respectively, whereas Ki-67 expression indicates high proliferation and suggests that the patient may benefit from adjuvant chemotherapy.57

The diversity in external breast cancer features has prompted attempts to classify tumors by their inherent cellular biology, leading to development of the intrinsic tumor subtypes and corresponding gene expression signatures.15, 58-60 The current consensus signature consists of 50 genes, whose expression levels divide breast tumors into luminal A, luminal B, basal-like, and HER2+ enriched subtypes.15, 61 The proportions of different subtypes have varied in the published literature, depending on cohort composition. However, in unselected patient series, luminal A is the most common subtype, assigned typically to about half of all cases. Luminal B is more frequent than basal-like and HER2+ enriched subtypes, which are equally common.62 The classification has gained popularity because of the added value it gives to prognostic estimation and treatment choices. In brief, patients with good-prognosis luminal A tumors could be spared from chemotherapy, whereas for the other three subgroups chemotherapy would be justified.63, 64

If gene expression data are unavailable, the intrinsic breast cancer subtypes can also be estimated using surrogate histopathological markers; ER expression makes the major division between luminal and basal branches, and HER2 expression divides the ER-positive luminal and ER-negative basal branches further. PgR, Ki-67, and grade can be used for subdividing the luminal branch (Table 3).63-65 However, it is not uncommon that the surrogate marker-based prediction deviates from the gene expression-based classification. In particular, distinguishing luminal A from luminal B tumors, or HER2+ enriched from basal-like tumors, remains challenging.66

Table 3. Surrogate intrinsic subtypes defined on the basis of immunohistochemical measurement of marker proteins as suggested in St Gallen 2013.63

Marker    Luminal A-like    Luminal B-like         HER2 positive    Triple negative / Basal like
ER        positive          positive               negative         negative
PR        positive          Either PR-negative,    negative         negative
Ki-67     low               Ki-67 high, or         any              any
HER2      negative          HER2-positive          positive         negative

For Luminal B-like, the PR, Ki-67, and HER2 rows form a single criterion: either PR-negative, Ki-67 high, or HER2-positive.
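The decision logic of Table 3 can be written out as a short function. The following Python sketch is only an illustration of how the table reads; the function name, the boolean inputs, and the 'Unclassified' fallback for marker combinations that Table 3 does not list are assumptions made here, and the cut-offs used to call a marker positive or Ki-67 high are outside its scope.

```python
def surrogate_subtype(er_positive, pr_positive, ki67_high, her2_positive):
    """Assign a surrogate intrinsic subtype following the rows of Table 3.

    All four inputs are booleans; how positivity is determined is not modeled.
    """
    if er_positive:
        # Luminal A-like requires PR-positive, low Ki-67, and HER2-negative.
        if pr_positive and not ki67_high and not her2_positive:
            return "Luminal A-like"
        # Any one of PR-negative, Ki-67 high, or HER2-positive gives Luminal B-like.
        return "Luminal B-like"
    if pr_positive:
        # Assumption: ER-negative / PR-positive tumors are not covered by Table 3.
        return "Unclassified"
    # ER-negative and PR-negative: HER2 status decides, Ki-67 may be anything.
    return "HER2 positive" if her2_positive else "Triple negative / Basal like"


print(surrogate_subtype(True, True, False, False))
print(surrogate_subtype(False, False, True, False))
```

Writing the rule as nested conditions makes explicit that a single unfavorable marker is enough to move an ER-positive tumor from the Luminal A-like to the Luminal B-like category.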

Alongside tumor subtypes, gene expression signatures have been developed either for classifying breast cancer patients into good and poor prognosis groups,67-72 for predicting benefit from a certain treatment regimen,73, 74 or for estimating specific tumor characteristics.75, 76 Even though there is little overlap in the individual genes included in these signatures, they have not been shown to differ in their ability to predict patient survival.77, 78 A critical meta-analysis showed that the common feature for these signatures is their ability to detect tumor proliferation, which has a direct association with patient survival.79 In fact, gene expression in general is confounded by cell proliferation status and the breast cancer-associated signatures have not been able to outperform random signatures in predicting patient survival.80 Furthermore, different classification methods give opposite predictions on an individual patient level, and the traditional approach based on ER, HER2, grade, and TNM is still widely used for prognostic estimation.81, 82

Recently, IntClust classification has challenged the intrinsic subtypes as the best method for molecular taxonomy of breast cancer. IntClust is based on identification of recurrent chromosomal aberrations driving the tumorigenesis and affecting gene expression in cis.83 The method was developed using copy number and gene expression data in parallel, but later a surrogate signature using only gene expression was developed.84 The IntClust classification has 10 categories. Intrinsic basal tumors cluster almost exclusively to a single IntClust category, HER2+ enriched cancers to another category, but luminal tumors are dispersed across multiple IntClust categories. In the original study, IntClust outperformed the intrinsic subtyping especially in identifying a subgroup of chemo-insensitive ER-positive/luminal A breast cancers with poor prognosis, as well as in defining a signature indicative of a high number of infiltrating T-cells associated with good prognosis irrespective of the intrinsic subtype.57, 83-85 However, perhaps the most important contribution of the IntClust classification to breast cancer taxonomy is the shift in focus from primarily prognostic estimation to identification of the molecular events leading to breast cancer and characterization of specific druggable aberrations.85

2.2.4 Origin of breast cancer

Development of the intrinsic molecular subtypes encouraged scientists to hypothesize that luminal tumors originate from luminal layer cells and basal tumors from myoepithelial cells.86 However, multiple ensuing studies have disputed this and indicated that luminal and basal breast cancers share a common origin, the luminal progenitor cells. Only later steps in tumor progression determine the resulting tumor phenotype.87-89 BRCA1-deficient cancer has served as an archetypic model for basal breast cancer, and the luminal origin of basal tumors was first discovered in a BRCA1 knock-down experiment. Loss of BRCA1 function in organoid and mouse models altered gene expression in luminal progenitor cells and diverted them from their predestined differentiation program.29, 90-93 Furthermore, it has been shown that even though the basal branch tumors typically do not express estrogen receptor, estrogen exposure plays an important role in the initiation of basal tumors.94 Interestingly, transformation of the basal progenitors leads to development of a metaplastic carcinoma, a rare breast cancer subtype, characterized by low expression of the Claudin genes.87, 95

Another model for breast cancer etiology has been proposed by mathematical modeling of age-related breast cancer incidence. The key observations were that the age distribution of breast cancer risk had a bimodal shape and that the risk of ER-negative breast cancer decreased with age. According to the suggested model, the two etiologic breast cancer subtypes would include ER-negative early-onset breast cancer and ER-positive breast cancer with a linearly increasing cumulative lifetime risk. The division was suggested to be caused by differences in the steps of tumor progression arising from biological differences between pre- and postmenopausal mammary glands.96

2.3 Breast cancer treatment

The primary treatment for breast cancer is surgical removal of the tumor mass with sufficient margin or excision of the entire mammary gland. Radiation is commonly recommended to lower the risk of local recurrence. Furthermore, the treatment regimen generally includes a combination of hormonal, cytostatic, and biological adjuvant therapies to reduce the risk of death due to metastatic disease. Neoadjuvant therapy, i.e. therapy preceding the surgery, can be used for increasing the operability of an inflammatory or locally advanced breast cancer, or for reducing the tumor size for better success of a breast-conserving operation.97

2.3.1 Adjuvant endocrine therapy

Estrogen receptor expression in tumor cells is an indication of benefit from adjuvant endocrine therapy.97 The rationale behind endocrine therapy is that the proliferation of tumor cells depends on uninterrupted supply of ovarian hormones.98, 99 Tamoxifen, the first antiestrogen drug used in an adjuvant setting, is still routinely used in treatment of premenopausal women with ER-positive breast cancer.97, 99, 100 Tamoxifen competes with endogenous estrogens in binding to estrogen receptor. The tamoxifen-receptor complex is able to dimerize and bind to estrogen-responsive elements in DNA, but does not induce transcription of estrogen target genes in breast tissue.98 Tamoxifen therapy in premenopausal women may be combined with drugs suppressing ovarian function, e.g. luteinizing hormone releasing hormone (LHRH) agonists, which overstimulate LHRH receptors in the pituitary gland, thus reducing luteinizing hormone levels and estrogen production in the ovaries.97, 99

In postmenopausal women, the primary estrogen source is androgen metabolism in peripheral tissues. Aromatase inhibitors (AIs) bind either covalently or reversibly to aromatase enzyme, blocking androgen conversion, and outperform antiestrogens in efficacy in treating postmenopausal women with ER-positive breast cancer.97, 99, 100

The most common mechanism for acquisition of resistance to endocrine therapy involves activation of the PIK3CB-AKT-MTOR pathway (phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit beta; AKT serine/threonine kinase 1; mechanistic target of rapamycin kinase).101 A specific MTOR inhibitor, everolimus, has recently been approved for treatment of advanced ER-positive breast cancer.97, 102 Furthermore, a specific PIK3CB inhibitor has recently been reported to be effective in clinical trials, but it has also been associated with severe adverse effects, preventing its wider use.103

Research efforts to further improve endocrine regimens are ongoing. The most recent advances include cyclin dependent kinase (CDK4/6) inhibitors and histone deacetylase inhibitors as well as refinement of AI therapy by addition of adjuvant bisphosphonates to protect against fractures and bone metastasis.100, 101, 104-108

2.3.2 Other targeted biological therapies

Biological cancer therapy is based on agents that specifically target the drivers of tumorigenesis. Since the driver events often arise as a consequence of somatic mutations or by re-activation of embryonic pathways, they should not be present in the healthy adult system,14 and the systemic adverse effects of the treatment should be minimal. On the other hand, the treatment benefit is restricted to the subgroup of patients whose tumors carry these specific aberrations.101

Besides the estrogen receptor, the most important target in breast cancer therapy is the HER2 receptor, which is overexpressed or amplified in about 15% of breast cancers. Trastuzumab, a monoclonal antibody against HER2, was first introduced in a clinical trial in 1998. After the release of impressive and consistent results of two large trials combining trastuzumab with adjuvant chemotherapy in 2005, trastuzumab was widely adopted for adjuvant treatment of HER2-positive breast cancer.109 Subsequently, other anti-HER2 agents and regimen modifications have become a focus of intensive research.110 Dual inhibition of the HER2 pathway with a combination of drugs targeting different components of the pathway has been the most promising approach.111

Poly(ADP-ribose) polymerase (PARP) inhibitors represent the most promising emerging therapy for breast cancer. PARP silencing is lethal for cells devoid of BRCA1 or BRCA2 function. Thus, the PARP inhibitors are an ideal therapy with minimal side-effects for carriers of germline BRCA1 or BRCA2 mutations.112 Furthermore, since most of the moderate-risk breast cancer susceptibility genes, including ATM, PALB2, FANCM, and CHEK2, are involved in the same pathway controlling DNA repair via , PARP inhibitors may have potential also in treatment of breast cancer patients with germline mutations in any of these other risk genes.113

The potential of other biological therapies for breast cancer, including antiangiogenic agents and inhibitors of epidermal growth factor receptor (EGFR), has been studied intensively. To date, however, the success has been limited.101

2.3.3 Adjuvant chemotherapy

Chemotherapy refers to a wide range of cytotoxic agents targeting actively proliferating cells on a systemic level. The rationale is that since active proliferation is what distinguishes cancer from normal tissue, the agents would have selective toxicity for cancer cells. However, typical side-effects include immunosuppression and hair loss due to killing of actively dividing precursor cells.114-116 Ovarian suppression or premature menopause resulting from death of germ cell precursors may contribute to chemotherapy efficacy, but is an unwanted side-effect for younger women.114, 117

Cytotoxic compounds have been used in combinations to treat breast cancer since the 1970s.118 CMF (cyclophosphamide; methotrexate; 5-fluorouracil) was introduced in a clinical trial in 1973115, 119, 120 and is still included in the recommended adjuvant regimens.97 CMF combines one alkylating agent (cyclophosphamide) and two antimetabolites (methotrexate and 5-fluorouracil/capecitabine) administered at regular intervals separated by periods of recovery.115

Cyclophosphamide and its metabolites (4-hydroxycyclophosphamide and aldophosphamide) pass through the circulation in chemically inactive forms. Cytotoxic phosphoramide mustard is generated from aldophosphamide only in target cells with a low concentration of aldehyde dehydrogenase (ALDH), which is able to metabolize aldophosphamide into inactive carboxyphosphamide. Many normal tissues have sufficiently high expression of ALDH to protect them from the toxic side-effects.121 Phosphoramide mustard causes DNA cross-strand links at guanine nucleotides, leading to DNA damage and cell death. Additionally, cyclophosphamide has antiangiogenic and immunostimulatory effects, which may contribute to its efficacy as a chemotherapeutic agent.122

Methotrexate and 5-fluorouracil specifically block two enzymes required for thymine metabolism, dihydrofolate reductase and thymidylate synthase, respectively, halting DNA synthesis and leading to DNA degradation and cell death.123, 124

Anthracyclines (epirubicin, doxorubicin/adriamycin) stabilize topoisomerase II alpha (TOP2A) complex bound on cleaved DNA. This results in a mitotic catastrophe when the cell cycle proceeds from G2- (gap 2) to M-phase (mitosis).125 TOP2A is expressed from late S-phase (synthesis) to M-phase and regulates DNA topology during DNA replication.126 TOP2A copy number aberrations, HER2 gene amplification, high proliferation rate, and Ki-67 expression have been suggested as individual markers for benefit from anthracycline therapy.127-133 On the other hand, NQO1 (NAD(P)H:quinone oxidoreductase) germline variant rs1800566 has been suggested as a contraindication for anthracycline use.134 However, none of these markers have currently been included in the treatment guidelines.97, 116 Anthracyclines have proven superior to the older-generation CMF in prolonging overall and relapse-free survival.115, 135 The treatment efficacy has, however, come at the cost of more adverse effects. Anthracyclines cause neutropenia in up to 25% of patients, secondary leukemia in about 1% of patients, and increased short- and long-term risk of heart failure, depending on the cumulative dose.135, 136

Taxanes (paclitaxel and docetaxel) represent a newer generation of chemotherapeutic agents. Their administration alternately with anthracyclines improves patient prognosis relative to anthracycline-based therapies alone.136 Taxanes promote rapid assembly of overly stable microtubules, stalling cells in the M-phase and leading to apoptosis or immune-related cell lysis.137, 138 Adverse effects include fatigue, neutropenia, peripheral neuropathy, and decreased cognitive performance as a result of axon demyelination and direct neural cell toxicity.139-141 Other microtubule-targeting drugs with similar toxicity profiles include eribulin and vinca alkaloids, which are primarily used in treatment of metastatic breast cancer.97, 142-144

Chemotherapy is recommended as adjuvant treatment for patients at intermediate or high risk of recurrence or progression.97 Few biological markers indicating sensitivity or resistance to any specific agent have been identified thus far.132, 145, 146 The choice of an optimal treatment combination for an individual patient is made following general guidelines, and adverse effects are monitored and the agent or dose is changed if necessary.97 As the most efficient agents tend to be increasingly toxic, further research is required to identify the patient groups for whom the benefit will outweigh the harm, and to develop better-tolerated means for drug administration.57, 114, 116

2.4 Genetic predisposition to breast cancer

2.4.1 Breast cancer heritability

Genetic factors account for slightly less than one-third of breast cancer incidence; twin studies have estimated the heritability to be about 31%.6 Having a first-degree relative with breast cancer is associated with a twofold increase in breast cancer risk, and the risk has been suggested to increase along with the number of affected relatives, so that the cumulative lifetime risk for a woman with three affected first-degree relatives would reach 30-40%.147, 148 The familial aggregation of breast cancer is attributable to shared environmental factors as well as genetic variants acting multiplicatively to increase the risk, with common low-penetrance variants modifying the penetrance of higher-risk mutations.6, 149-152 As is typical of a polygenic disorder, the currently known genetic risk variants are distributed on a diagonal, reaching from rare high-risk mutations to moderate-risk mutations and further to common low-penetrance variants, covering an area in which the balance between effect size and variant frequency is sufficient to make them biologically or clinically interesting (Figure 4).153-156
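What multiplicative action of risk variants means in practice can be shown with a short calculation. The Python sketch below is a minimal illustration, assuming that the variants act independently and that their relative risks simply multiply; the function names and the numeric values are hypothetical, chosen only for illustration, and are not estimates taken from the cited studies.

```python
import math


def combined_relative_risk(relative_risks):
    """Multiply per-variant relative risks (assumes independent, multiplicative effects)."""
    return math.prod(relative_risks)


def log_risk_score(relative_risks):
    """Equivalent additive form: the sum of log relative risks."""
    return sum(math.log(rr) for rr in relative_risks)


# Hypothetical carrier: one moderate-risk mutation and two common low-penetrance variants.
example = [2.5, 1.2, 1.1]
print(round(combined_relative_risk(example), 2))
print(round(math.exp(log_risk_score(example)), 2))
```

Both lines print the same combined relative risk, since multiplying relative risks is equivalent to adding their logarithms. Under this assumption, common variants with individually small effects can noticeably shift the risk of a carrier of a higher-risk mutation.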

The breast cancer-predisposing variants have been discovered mainly using three different methods, each best-suited to discovery of certain classes of variants on the risk-frequency axis (Figure 4, Table 4). Historically, the golden age of linkage studies preceded the burst of resequencing of candidate genes, which has given way to the currently dominating genome-wide studies.4, 11

Figure 4. Breast cancer-predisposing genes and variants. High- and moderate-risk genes are named in the figure, the rest being low-penetrance variants with a relative risk below 1.5.

The most important single risk factors are truncating mutations of BRCA1 or BRCA2 genes, explaining about 16% of breast cancer genetic background (Figure 5). These together with mutations in other high- or moderate-risk genes add up to about 20%.7 Common predisposing variants with low effect sizes account for an additional 18%. Interestingly, a recent genome-wide association study identifying multiple novel predisposing variants indicated that an equally large fraction of breast cancer predisposition could be explained by as yet unidentified variants, which were included on or imputable from the genotyping array, but which alone did not exceed the genome-wide significance threshold (‘Unknown common variants’ in Figure 5).153 Still, a large proportion of breast cancer genetic background remains unexplained. The missing heritability has been suggested to be attributable to rare or private variants, structural variants, imprinting, or epistasis resulting from genetic interactions.157, 158
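The proportions quoted above can be tallied to see how much of the genetic background they leave unexplained. The following Python sketch is a rough bookkeeping exercise only; it assumes that the quoted percentages refer to the same total and do not overlap, and that the ‘equally large fraction’ attributed to unknown common variants equals the 18% of the known common variants. The variable names are introduced here for illustration.

```python
# Approximate percentages of breast cancer genetic background quoted in the text.
shares = {
    "BRCA1/BRCA2 truncating mutations": 16,
    # All high- and moderate-risk genes together add up to about 20%.
    "other high- or moderate-risk genes": 20 - 16,
    "known common variants": 18,
    # Assumption: 'equally large fraction' is read as equal to the known common variants.
    "unknown common variants": 18,
}

explained = sum(shares.values())
unexplained = 100 - explained
print(explained, unexplained)
```

Under these assumptions the listed categories sum to about 56%, leaving roughly 44% of the genetic background unexplained.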

Table 4. Methods used for discovery of genetic variants predisposing to breast cancer.

Linkage study
  Method: Co-segregation of marker variants and disease defines the region of interest. Sequencing is used to detect coding mutations.
  Study subjects: Pedigree data: multiple closely related genotyped individuals from each family.
  Usability: Well-suited to study Mendelian-like traits with high penetrance.
  Discoveries: High-risk protein truncating mutations.
  Advantages: Low number of study subjects required. Brings new information on biological processes causing the trait.
  Limitations: Unsuitable for variants with lower effect sizes.

Resequencing of candidate genes
  Method: Functional information defines the genes of interest. Sequencing is used to detect coding mutations.
  Study subjects: Case-control data.
  Usability: Suitable for cohorts enriched with familial index cases.
  Discoveries: Moderate-penetrance coding mutations.
  Advantages: Functional approach in target selection. Can detect mutations causing familial breast cancer in situations where linkage fails.
  Limitations: Limited to predetermined pathways. No novel biological information.

Genome-wide association analysis
  Method: Comparison of allele frequencies in cases and in controls defines risk variants. Fine-mapping is used to identify the causal variants.
  Study subjects: Large case-control datasets, usually combined from multiple studies.
  Usability: Suitable for detection of common predisposing variants.
  Discoveries: Low-penetrance variants, typically from non-coding regulatory regions.
  Advantages: No family data required. Brings new information on biological processes causing the trait.
  Limitations: Very large number of samples required.

2.4.2 High-risk genes

Loss-of-function mutations in BRCA1 or BRCA2 cause hereditary breast and ovarian cancer syndrome, an autosomal dominant syndrome with high penetrance. Female BRCA1 mutation carriers have about a 72% lifetime risk of breast cancer and a 44% risk of ovarian cancer. For female carriers of BRCA2 mutations, the risks are somewhat lower: 69% for breast cancer and 17% for ovarian cancer.156 Other cancers of the syndrome spectrum include melanoma, pancreatic cancer, prostate cancer, and male breast cancer. However, lifetime risks of these other cancers are considerably lower.159

Figure 5. Relation of environmental and genetic factors predisposing to breast cancer.

BRCA1 and BRCA2 mutations follow Knudson’s hypothesis for inherited cancer susceptibility; when one impaired allele is inherited, complete loss of the tumor suppressor’s function on a cellular level is more probable than if both alleles were inherited intact. Therefore, the disease onset occurs earlier for mutation carriers than for non-carriers, and the risk of multiple primary cancers is elevated. Furthermore, mutations are manifested in familial clustering of cancer cases.12, 13, 156, 159
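The reasoning behind Knudson's hypothesis can be made concrete with a small calculation. The sketch below is only an illustration: the per-allele inactivation probability and the number of cells at risk are hypothetical values chosen for readability, not estimates from the cited studies, and somatic hits on the two alleles are assumed to be independent.

```python
def p_biallelic_loss(p_hit: float, inherited_hits: int) -> float:
    """Probability that a single cell has lost both alleles of a tumor suppressor.

    p_hit is an assumed (illustrative) probability that one intact allele is
    inactivated somatically. inherited_hits is 1 for a germline mutation
    carrier and 0 for a non-carrier.
    """
    if inherited_hits not in (0, 1):
        raise ValueError("inherited_hits must be 0 (non-carrier) or 1 (carrier)")
    remaining_hits = 2 - inherited_hits
    return p_hit ** remaining_hits


def p_any_cell(p_cell: float, n_cells: int) -> float:
    """Probability that at least one of n_cells independent cells is affected."""
    return 1.0 - (1.0 - p_cell) ** n_cells


P_HIT = 1e-6          # hypothetical per-allele inactivation probability
N_CELLS = 1_000_000   # hypothetical number of cells at risk

carrier_risk = p_any_cell(p_biallelic_loss(P_HIT, 1), N_CELLS)
non_carrier_risk = p_any_cell(p_biallelic_loss(P_HIT, 0), N_CELLS)

print(f"carrier: {carrier_risk:.3g}, non-carrier: {non_carrier_risk:.3g}")
```

Under these assumptions a carrier needs only one somatic hit per cell, whereas a non-carrier needs two, which is why complete loss of function, and hence earlier onset and multiple primary cancers, is far more probable for carriers.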

Over 70% of breast cancers of BRCA1 or BRCA2 mutation carriers are diagnosed as high grade. However, there are clear phenotypic differences between breast cancers of carriers of BRCA1 and BRCA2 mutations. About 80% of BRCA1 carrier cancers are ER-negative and about 70% triple-negative. For BRCA2 carrier cancers, the proportions are reversed; almost 80% are ER-positive. Furthermore, BRCA1 mutations are associated with a medullary histopathological type.160

Mutations in other high-risk susceptibility genes (Figure 4) cause not only breast cancer but also hereditary cancer syndromes with characteristic cancer spectra: TP53 mutations cause Li-Fraumeni syndrome characterized by sarcomas, breast cancer, brain tumors, leukemias, adrenocortical tumors, and multiple primary cancers;161 PTEN (phosphatase and tensin homolog) mutations cause Cowden syndrome with multiple hamartomas in different tissues as well as breast, thyroid, and endometrial cancers;162 STK11 (serine/threonine kinase 11) mutations cause Peutz-Jeghers syndrome with hamartomatous gastrointestinal polyps and cancer, mucocutaneous pigmentation, and pancreatic, breast, ovarian, and gallbladder cancers;163, 164 CDH1 (cadherin 1) mutations cause Hereditary Diffuse Gastric and Lobular Breast Cancer Syndrome;165 and NF1 (neurofibromin 1) mutations cause neurofibromatosis and confer a moderately elevated risk of breast cancer.4 However, the contribution of mutations in these genes to breast cancer incidence overall is limited (Figure 5).7

2.4.3 Moderate-risk genes

Moderate-penetrance genes have been identified by resequencing candidate genes, which have typically included direct binding partners of BRCA1, BRCA2, or TP53. By definition, mutations in moderate-penetrance genes are associated with only a two- to threefold increase in risk and do not segregate consistently with breast cancer within pedigrees.7 In most cases, their role in breast cancer predisposition has been validated by the discovery of founder mutations enriched in certain populations. The evidence has been based both on mutation clustering in breast cancer families and on the elevated risk of mutation carriers in unselected cohorts.1, 7, 155, 166, 167 For example, CHEK2 (checkpoint kinase 2) c.1100delC was established as a susceptibility mutation simultaneously in Dutch and Finnish populations, and NBN (nibrin) c.657del5 in the Polish population, all with relatively high carrier frequencies of about 1%.1, 166, 168 Recently, a recurrent FANCM (Fanconi anemia complementation group M) c.5101C>T mutation was discovered in the Finnish population, and subsequently, another FANCM variant was indicated as a risk mutation for triple-negative breast cancer.155, 169 PALB2 (partner and localizer of BRCA2) c.1592delT was also found in Finnish breast cancer families using the candidate gene approach.167 Due to the rarity of truncating PALB2 mutations on a population level, it has been difficult to determine their exact effect size.
Estimates made in unselected series have been less than fourfold.154, 167, 170 However, a collaborative international study with the largest number of PALB2 mutation carriers so far and exhaustive family data estimated the lifetime risk of PALB2 carriers to reach 35% and the relative risk to be over ninefold, suggesting that PALB2 could be considered a high-risk gene instead.171 ATM (ATM serine/threonine kinase) truncating, splice junction, and certain rare missense mutations have been validated as breast cancer risk factors in collaborative studies,154, 172, 173 whereas for many other candidate genes the reports have been inconclusive and their contribution to breast cancer remains to be determined.4, 174

2.4.4 The Breast cancer pathway

The majority of the high- and moderate-risk genes, as well as candidate genes with some evidence of breast cancer predisposition, act in a single cellular pathway, namely the BRCA/Fanconi anemia pathway, which repairs DNA double-strand breaks via homologous recombination.175 Fanconi anemia is a rare recessively inherited disease characterized by bone marrow failure, developmental defects, and a predisposition to acute myeloid leukemia and a range of soft tissue sarcomas. It is caused by biallelic mutations in genes encoding central components of the BRCA/Fanconi anemia pathway.
Currently, about twenty genes have been linked to Fanconi anemia, including several breast cancer risk and candidate genes (Figure 6).176 The Fanconi anemia protein complexes (core, anchor, and ID2) recognize DNA lesions, especially the locations of stalled replication forks, and recruit DNA repair proteins, including BRCA1, BRCA2, and RAD51 (RAD51 recombinase).176 On the other hand, the MRN complex (MRE11-RAD50-NBN; MRE11 homolog, double strand break repair nuclease; RAD50 double strand break repair protein; nibrin) senses double-strand breaks induced by intrinsic reactive oxygen species or extrinsic genotoxic agents, resulting in ATM activation by auto-phosphorylation. This leads to two parallel processes: local signal spread along chromatin via binding of additional MRN complexes and diffuse signal amplification via a kinase cascade leading to cell cycle delay or apoptosis.177, 178

BRCA1 is a key node in making the choice between different mechanisms for DNA repair and recruiting the effector proteins. BARD1 (BRCA1 associated RING domain 1) is an obligate N-terminal BRCA1 binding partner, but the C-terminal BRCT repeats of BRCA1 can be occupied by one of three alternative proteins forming bridges to different functional or regulatory units. The BRCA1-PALB2-BRCA2 bridge is obligatory for recruiting RAD51 to sites of DNA double-strand breaks in order to repair the lesions via homologous recombination (Figure 6).179, 180

Figure 6. BRCA/Fanconi anemia pathway.176, 178-184 FA: Fanconi anemia; HR: homologous recombination.

In summary, the high- and moderate-penetrance breast cancer risk and candidate genes are involved in deciding whether to fix DNA double-strand breaks with the fast but error-prone non-homologous end joining or with the time-consuming but accurate homologous recombination. Apart from this, there is the decision of whether to delay the cell cycle, giving time for repair, whether to enter into senescence, or whether to drive the cell into apoptosis. In these processes, the balance between multiple pathways makes the ultimate choice. It is noteworthy that these pathways never rest; intrinsic factors alone have been estimated to cause up to 200 000 DNA lesions per day.177, 179

2.4.5 Common predisposing variants

The common risk variants have been discovered in genome-wide association studies (GWAS). Since GWAS genotyping arrays have been designed to be non-redundant and to cover most of the total genomic variation, the best hits have been assumed to be only markers of risk loci or tag-SNPs (single-nucleotide polymorphisms), not causative variants per se.9, 153, 185 Causal inference requires fine-mapping the region of interest to find the strongest signal or the relations of multiple independent signals, and testing those in functional in silico and in vitro models.186 The common risk variants are enriched in regulatory sites active in breast cancer cell lines, and their predicted target genes are enriched in pathways essential for breast tissue development, but also in cancer-related pathways.153 The relative risk associated with any single common variant is so low that it has no clinical applicability on its own.
However, when the effects of multiple variants are combined into a polygenic risk score, it could be used in patient risk stratification, especially in combination with other measurable risk factors, such as mammographic density or family history-based prediction models.10, 187, 188

2.5 CHEK2

2.5.1 CHEK2 protein function

Checkpoint kinase 2 (CHEK2) is a serine/threonine kinase involved in regulation of cell cycle delay in response to DNA double-strand breaks.189 It is expressed in a wide range of actively proliferating, quiescent, and terminally differentiated cells in different tissues and contributes to regulation of both G1/S and G2/M checkpoints as well as to timely and proper assembly of the mitotic spindle.28, 190-192

The CHEK2 protein consists of three distinct domains: the N-terminal SQ/TQ cluster domain (SCD; residues 19-69; S: serine, Q: glutamine, T: threonine), the central forkhead-associated domain (FHA; residues 112-175), and the C-terminal kinase domain (KD; residues 220-486).193 In the basal inactive state, CHEK2 monomers localize to the nucleus.194-196 Phosphorylation of threonine 68 (Thr68) in the SCD leads to CHEK2 homodimerization via reciprocal phosphoThr68-FHA, FHA-FHA, and FHA-KD interactions, which enable intermolecular phosphorylation of multiple amino acids, including Thr383 and Thr387 of the activation loop as well as Ser516 of the kinase domain.193, 197, 198 The activation process ends with dissociation of the two CHEK2 monomers.193, 198 In summary, SCD is primarily a regulatory domain, FHA participates in target binding via protein-protein interactions, whereas KD is the functional domain.

CHEK2 has a messenger role in response to DNA double-strand breaks. It is phosphorylated by ATM at sites of DNA damage, thereafter rapidly spreading the signal throughout the nucleus owing to its diffuse mobility.194 At G1, CHEK2 phosphorylates CDC25A (cell division cycle 25 A) phosphatase, preventing activation of the CDC25A targets CDK1 and CDK2, entailing a rapid but short-term delay in entering the S-phase of the cell cycle.190, 191 Similarly, at G2 CHEK2 induces only a transient arrest, allowing time for double-strand break repair.192 In parallel to regulating cell cycle progression, CHEK2 stimulates double-strand break repair via homologous recombination by phosphorylating BRCA1.189 CHEK2 may also contribute to TP53 stabilization. However, additional cues are required for a full, long-term, TP53-dependent G1/S or G2/M arrest, and CHEK2 possibly has only a redundant role in the regulation of TP53 activity.191-193

During normal mitosis, CHEK2 Thr68 is phosphorylated by PRKDC (protein kinase, DNA-activated, catalytic polypeptide).197 Activated CHEK2 localizes to centrosomes and, by phosphorylating BRCA1, contributes to regulation of mitotic spindle assembly.28, 195, 196 Compromised CHEK2 or BRCA1 function has been shown to cause abnormal spindle morphology and irregular chromosomal alignment, leading to increased chromosomal instability and aneuploidy, but not to evoke the spindle assembly checkpoint or to decrease cell viability.28 CHEK2-dependent phosphorylation of BRCA1 protects it from the inhibitory effects of AURKA (aurora kinase A), thereby preventing acceleration of microtubule plus-end assembly, which has been suggested to be one potential cause of chromosome missegregation.199, 200 Altogether, the cellular phenotype associated with CHEK2 deficiency is mild: the error correction for kinetochore attachment functions normally, chromosomes are aligned with modest delay, chromatids segregate with minor loss of fidelity, and cells proceed through mitosis as usual. However, the slight weaknesses in maintaining chromosomal stability may be enough to explain the increased cancer risk associated with CHEK2 loss-of-function mutations.28, 199

Consistent with the two roles of CHEK2 in response to DNA double-strand breaks and in mitosis, CHEK2-depleted mice are characterized by radioresistance, increased chromosomal instability, and an increased rate of spontaneous tumors.201-203 Notably, the tumorigenesis rate is increased especially in female mice, raising the possibility that female hormones contribute to CHEK2-related cancer progression.203

2.5.2 CHEK2 mutations

CHEK2 was first considered as a candidate gene for Li-Fraumeni and Li-Fraumeni-like syndrome due to its role as an upstream activator of TP53.204 The initial screen in Li-Fraumeni families revealed three mutations, of which two founder mutations (c.1100delC and p.(I157T)) were later connected to breast cancer predisposition.1, 2, 9, 166, 204 However, the third mutation (c.1422delT) was found to be a false discovery: a variant located on a pseudogene with sequence homology to CHEK2. Furthermore, subsequent screens of Li-Fraumeni families indicated that CHEK2 was unlikely to be associated with Li-Fraumeni or Li-Fraumeni-like syndromes.205, 206

The c.1100delC mutation is the most widely studied CHEK2 mutation owing to its relatively high carrier frequency in Finland (1.4%) and in the Netherlands (1.1%).1, 166 A single-nucleotide (C) deletion at codon 366 induces a premature stop at codon 381, truncating the kinase domain.204, 207 Other recurrent truncating CHEK2 mutations include c.IVS2+1G>A and del5395, which are founder mutations in Slavic populations and have their highest carrier frequency, about 0.4%, in Poland.208 The former is a splice variant, resulting in a four-base insertion in mRNA and a premature stop codon in exon three, truncating the FHA domain.209 The latter deletion abolishes exons nine and ten, leading to a premature stop at codon 381, similarly to c.1100delC.210 The c.1100delC mutation has been shown to evoke nonsense-mediated mRNA decay, leading to rapid degradation of the mutated transcript in living cells.211 Furthermore, both c.1100delC and c.IVS2+1G>A are associated with drastically reduced cellular levels of the CHEK2 protein.209

The overall European allele frequency of p.(I157T) is below 0.1%, but it is enriched in certain populations, such as Finland and Poland with carrier frequencies of 5.3% and 4.8%, respectively.2, 208, 212 The mutated protein is expressed in normal cells and breast tumors at the same level as the wild-type protein.2, 209 Furthermore, the intact kinase domain is able to phosphorylate downstream target proteins.207 However, CHEK2 conformation modeling has indicated that isoleucine-157 is located at a crucial point on the hydrophobic surface between the dimerizing CHEK2 proteins. Its mutation to threonine severely disturbs the reciprocal interactions between the two CHEK2 monomers and reduces the rate of autophosphorylation required for CHEK2 activation.198, 213

2.5.3 CHEK2 and breast cancer risk

Protein-truncating CHEK2 mutations are associated with a two- to threefold increase in breast cancer risk. Estimates vary between studies such that cohorts enriched with familial patients give slightly higher estimates than unselected cohorts.
A meta-analysis of 42 studies suggested ORs of 2.7 [2.1-3.4] for c.1100delC for unselected cases and 4.8 [3.3-7.2] for cases with positive family history.214 A recent large-scale study aiming at accurate risk prediction reported the c.1100delC-associated OR for invasive breast cancer to be 2.26 [1.90-2.69].3 A Polish study that also included the two other truncating founder mutations, c.IVS2+1G>A and del5395, suggested an OR of 3.3 [2.3-4.7] for any truncating mutation for sporadic cases and 5.0 [3.3-7.6] for familial cases.215 Despite the variation in the estimates of the relative risk, the population-level cumulative lifetime risk for carriers of truncating CHEK2 mutations has been estimated at about 20%, and at over 30% for women with a positive family history of breast cancer.3, 215 The c.1100delC mutation is also associated with an increased risk of bilateral breast cancer, but the effect size complies with a model where the two cancers arise independently, and the association could be explained by the baseline risk associated with c.1100delC.1, 216-219

Breast cancer risk associated with p.(I157T) is considerably lower than the risk associated with the truncating mutations. Therefore, instead of being considered a moderate-penetrance mutation, p.(I157T) is more comparable to the common low-penetrance variants. In the discovery study, the p.(I157T) OR was 1.43 [1.06-1.95] in an unselected cohort, and no association with familial breast cancer was found.2 A meta-analysis of 18 studies suggested a slightly higher risk estimate, with OR 1.58 [1.42-1.75].5

Recently, resequencing CHEK2 in large cohorts has enabled the discovery of rare coding variants. Calvez-Kelm et al. reported the discovery of six novel and unique truncating variants in breast cancer patients and 34 missense variants, which they categorized according to evolutionary conservation.
A crude risk analysis suggested that the tolerable variants would be associated with a risk comparable to p.(I157T) and deleterious variants with similar risk effects as the protein-truncating founder mutations.220 Decker et al. reported 14 rare truncating variants in a cohort of early-onset breast cancer and suggested a combined OR of 3.11 [2.15-4.69] for all truncating variants.221 Converging results were also obtained from genotyping of six CHEK2 variants in a large cohort of invasive breast cancer cases: the OR point estimate was higher than 2.0 for deleterious variants and lower than 1.5 for variants predicted to be benign.154

2.5.4 CHEK2 in breast tumors

CHEK2 protein expression is reduced or absent in c.1100delC carrier breast tumors, corroborating the causal role of the mutation in tumorigenesis.1, 216 The low expression is likely to be caused by the instability of the mutated transcript accompanied by haplo-insufficiency.211 Loss of the intact allele may be a contributory factor in some cancer cases, but loss of heterozygosity cannot be considered a general mechanism associated with c.1100delC-related tumorigenesis.205, 222 In contrast, CHEK2 protein expression appears to be normal in breast tumors of p.(I157T) carriers.2 However, the mutation may reduce the overall level of CHEK2 activity since the mutated protein can bind intact CHEK2 monomers and compete with the wild-type protein in the formation of homodimers in the CHEK2 activation process.2, 198, 223 Thus, presumably, the consequence of both c.1100delC and p.(I157T) would be reduced CHEK2 kinase activity, and the mutations would differ only in the extent of CHEK2 silencing. It is noteworthy that CHEK2 expression has been reported to be reduced in 21% of unselected non-carrier breast tumors, suggesting that the role of CHEK2 in breast cancer tumorigenesis extends beyond the carriers of germline mutations.1

Over 90% of CHEK2 mutation carrier breast tumors are ER-positive, compared with about 70% of non-carrier tumors, and the strong association is shared by the truncating and missense mutations.
The c.1100delC mutation has been suggested to be associated with poor patient survival,218 and also with pathologic markers of poor prognosis such as higher grade and larger tumor size.216 However, not all studies have replicated these findings.217, 218, 224, 225 Interestingly, lobular histological type is enriched among p.(I157T) breast tumors, but not in tumors of carriers of truncating CHEK2 mutations, implying that there may be subtle differences in breast cancer tumorigenesis associated with these mutations.224
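The risk estimates in Section 2.5.3 are quoted as odds ratios with 95% confidence intervals in brackets. As a reminder of how such figures arise from a two-by-two table, the following minimal sketch computes a crude odds ratio with a Wald (log-scale) interval. The carrier counts are hypothetical and serve only as an illustration; the cited studies typically used adjusted or study-stratified models rather than this crude calculation.

```python
import math


def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Crude odds ratio with a Wald confidence interval on the log scale.

    z = 1.96 corresponds to a two-sided 95% interval.
    """
    a, b = case_carriers, case_noncarriers
    c, d = control_carriers, control_noncarriers
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 30 carriers among 1000 cases, 12 among 1000 controls
or_est, ci_low, ci_high = odds_ratio_ci(30, 970, 12, 988)
print(f"OR {or_est:.2f} [{ci_low:.2f}-{ci_high:.2f}]")
```

An interval whose lower bound exceeds 1 indicates that the carrier frequency is higher in cases than in controls at the chosen confidence level.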

3 Aims of the Study

This study was designed to gain a deeper understanding of breast cancer pathogenesis associated with germline CHEK2 mutations, to examine tumor phenotype and survival of mutation carrier patients, and to explore possibilities for enhanced risk stratification for CHEK2 mutation carriers and for women from non-BRCA1/2 families. The work was divided into four sub-studies with explicit aims as follows:

I To identify recurrent copy number aberrations in breast cancers of c.1100delC carriers and to use gene expression analysis to identify cellular level pathways driving the c.1100delC-associated breast cancer.

II To examine survival of p.(I157T) carriers as well as tumor phenotype associated with germline p.(I157T) mutation in a comprehensive setting including pathology analysis of a large international dataset of the Breast Cancer Association Consortium and gene expression analysis of 180 breast cancers.

III To investigate the contribution of common genomic variation to the breast cancer risk of c.1100delC mutation carriers as well as to assess the applicability of a polygenic risk score in clinical risk stratification of the mutation carriers.

IV To examine the risk associated with the polygenic risk score in Finnish breast cancer families in order to assist in outlining the principles for using the polygenic risk score in genetic counseling.

4 Materials and Methods

4.1 Study subjects and data sources

4.1.1 Breast tumors (I, II)

For genomic profiling of CHEK2 c.1100delC mutation carrier breast cancers (I), archival tumor samples from 121 breast cancer patients, including 30 c.1100delC carriers, were examined. For 37 (17) patients, only formalin-fixed paraffin-embedded (FFPE) samples, and for 79 (11) patients, only fresh frozen tissue samples were available (number of c.1100delC carriers in parentheses). Additionally, both FFPE and fresh frozen tissue samples were available for five (two) patients. Both FFPE and fresh frozen tumors were used as a source of genomic DNA for the array comparative genomic hybridization (aCGH) using a custom-made genomic array of BAC (bacterial artificial chromosome) clones (SCIBLU genomics, Lund, Sweden),226 whereas total RNA samples for gene expression (GEX) analysis using a custom-made genomic oligonucleotide array (SCIBLU genomics, Lund, Sweden)226 were extracted only from the fresh frozen tumors. DNA was successfully extracted from all FFPE and 59 (9) fresh frozen samples, and RNA from 78 (13) fresh frozen samples.

A separate gene expression dataset of 183 fresh frozen tumor samples from female breast cancer patients hybridized on Illumina HumanHT-12 v3 Expression BeadChips (Illumina Inc, San Diego, CA, USA) was used for clinical validation of the CHEK2 c.1100delC-associated gene expression signature (I) as well as in the analysis of p.(I157T)-associated gene expression (II). Details on sample preparation and data preprocessing have been published elsewhere.227 These data included six carriers of c.1100delC and ten carriers of p.(I157T). The overlap between this Illumina dataset and the above-described older dataset was 49 tumor samples, which included two samples from p.(I157T) carriers and five samples from c.1100delC carriers.
The clinical relevance of the CHEK2 c.1100delC-associated gene expression signature was validated using three publicly available gene expression datasets: cohorts of 315 and 249 breast cancers from Uppsala75, 228 (GEO, Gene Expression Omnibus: GSE3494; GSE4922), and a cohort of 159 breast cancers from Stockholm229 (GEO: GSE1456).

4.1.2 Study subjects from the Breast Cancer Association Consortium (II, III)

The Breast Cancer Association Consortium (BCAC) is an international forum for breast cancer research. Currently, it consists of 108 studies that participate in collaborative projects, providing genotype data on breast cancer patients and controls with the same ethnicity. Due to the multi-ethnic nature of the consortium, in analyses concentrating on founder variants, such as the CHEK2 mutations, it is meaningful to include only those studies that provide adequate numbers of variant carriers. Furthermore, since the BCAC studies were originally established to serve different scientific purposes, the patient follow-up and clinical data availability differ between studies, affecting the selection of eligible BCAC studies in the collaborative projects.

In the analyses of p.(I157T)-associated patient survival and tumor characteristics (II), we included female breast cancer patients from 15 BCAC studies. The study selection based on the number of informative p.(I157T) carriers (≥9) yielded a dataset of 26 801 study subjects, including 590 p.(I157T) carriers and 271 c.1100delC carriers.

In the analyses of synergistic risk effects of CHEK2 c.1100delC and common low-penetrance variants (III), we included data from 32 BCAC studies. The total of 39 139 female invasive breast cancer patients and 40 063 healthy population controls with European ethnic background included 624 cases and 224 controls carrying the c.1100delC variant. However, complete data on all 77 common variants and CHEK2 c.1100delC were available only for 17 640 cases and 15 984 controls, including 285 and 84 c.1100delC carrier cases and controls, respectively. All available data were utilized in pairwise interaction analyses between CHEK2 c.1100delC and the common variants, and the complete data were used in analyses of the polygenic risk score combining the risk effects of all common variants.

The study subjects for Studies II and III were genotyped centrally as a part of the Collaborative Oncological Gene-Environment Study9 (COGS) or by individual BCAC studies following the BCAC genotyping standards as described previously.185, 230

4.1.3 Study subjects of the Helsinki breast cancer study (IV)

The predictive potential of a polygenic risk score was investigated in two Finnish datasets, a case-control dataset and a breast cancer family dataset. The case-control dataset of 1 681 breast cancer cases and 1 272 population controls consisted of three series of unselected breast cancer patients and additional familial index cases described in detail elsewhere.1, 134, 216, 231, 232 The breast cancer family dataset consisted of 52 systematically collected breast cancer families,232 including 493 genotyped family members (183 affected women, 246 healthy women, and 64 men) and registry data for a further 3 992 relatives. All study subjects were genotyped for 75 common breast cancer risk variants using the array designed for the COGS consortium studies described above.
Three moderate-penetrance mutations (CHEK2:c.1100delC, PALB2:c.1592delT, and FANCM:c.5101C>T) were genotyped locally as described previously.155, 170, 218

4.2 Methods

4.2.1 Microarray data processing and analyses (I, II)

Array comparative genomic hybridization and gene expression data were background-corrected, log-transformed, and then processed and analyzed according to the flowchart in Figure 7 using standard methods suitable for each technology. Analysis results included mutation-associated regions of copy number aberrations, differentially expressed genes, potential tumor driver events, and cellular pathways characteristic of the mutation carrier breast cancers.

In the two-color aCGH and GEX arrays used in Study I (Figure 7), normal human male genomic DNA and Universal Human RNA were used as reference samples, respectively, because patient reference samples from adjacent healthy tissue were not available. The use of a universal reference sample may increase noise in the data and, especially in the aCGH analysis, blur the effect of personal variation in germline copy number. However, the BAC array resolution was approximately one signal per 100 000 base pairs. Furthermore, the data preprocessing aimed at a drastic dimensionality reduction, so that the analysis of c.1100delC-associated regions captured differences at the level of chromosome bands, where differences in germline copy-number variation are not relevant. In the GEX analysis, the reference sample was used for data normalization, and, being the common denominator of the intensity ratios, it cancelled out at the analysis stage.

The normalization methods applied to different datasets were chosen to best suit the technology used for hybridization and to make the samples comparable to each other. The popLowess method used for the aCGH data (Figure 7) is based on the assumption that in tumor data there should be three main clusters of copy number, i.e. loss, normal, and gain, and that these could be defined by analysis of the distribution of the intensity ratios.233 Thus, the normalization with popLowess transforms continuous data to categorical data. The following steps, segmentation and region calling, were done to smooth the data so that the consecutive probes covering large genomic regions would have the same call. Of note, we used soft calls, i.e. probabilities of a specific call (loss, normal, or gain), in copy number calling. Therefore, the preprocessed data were continuous data with values in the interval [-1, 1], instead of categorical calls (-1, 0, or 1).

The non-parametric Wilcoxon rank-sum test was chosen for identifying the c.1100delC-associated CNAs (Figure 7) because we were unwilling to make assumptions about the distribution of the soft calls associated with the defined genomic regions. We suspected that in many regions relevant for tumor progression, the soft call distribution would have a skewed or bimodal shape, rendering e.g. Student's t-test a suboptimal choice.

The intensity ratios of the two-color gene expression array used in Study I were normalized within arrays with print-tip loess to remove the effect of uneven sample concentration across the array (Figure 7). In Study II, the within-array normalization of the gene expression data was done as a part of the service by SCIBLU genomics. In both studies, the data were normalized between arrays with the quantile method. The quantile normalization is based on the assumption that the log-transformed gene expression intensity values of any multi-cell sample should be normally distributed. Forcing the samples to have identical distributions makes the between-sample comparison of individual mRNAs reliable.

Differences in gene expression between c.1100delC carrier and non-carrier tumors were analyzed with a moderated t-test with Bayesian probability estimation. The nominal p-values were used to prioritize gene selection for functional annotation. Multiple testing correction was not used at this stage, because the differences between c.1100delC carriers and non-carriers were likely to be modest on a single-gene level. However, the functional analyses with the DAVID annotation tool and Gene Set Enrichment Analysis (GSEA) were corrected for multiple testing to identify the significant differences on a pathway level.

The clinical relevance of the CHEK2 c.1100delC-associated 182-gene signature was assessed in four independent gene expression datasets by clustering the samples according to the expression of the signature genes and comparing the breast cancer-specific and distant relapse-free survival between the two main clusters using Kaplan-Meier curves234 and the log-rank test.
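The between-array quantile normalization described above, which forces all samples to share one intensity distribution, can be illustrated with a few lines of code. This is a minimal sketch in plain Python, not the R library limma implementation used in the studies; the function name and the toy data are hypothetical, and ties are resolved simply by their original order, which is an assumption of this sketch.

```python
from statistics import mean


def quantile_normalize(samples):
    """Give every sample the same distribution of values.

    samples is a list of equally long lists, one list of log-intensities per
    array. The value at each rank is replaced by the mean of the values that
    hold that rank across all samples.
    """
    n_probes = len(samples[0])
    if any(len(sample) != n_probes for sample in samples):
        raise ValueError("all samples must contain the same number of probes")

    # Reference distribution: mean of the r-th smallest value over all samples
    sorted_samples = [sorted(sample) for sample in samples]
    reference = [mean(s[rank] for s in sorted_samples) for rank in range(n_probes)]

    normalized = []
    for sample in samples:
        # Probe indices ordered from the lowest to the highest intensity
        order = sorted(range(n_probes), key=lambda i: sample[i])
        out = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            out[probe_index] = reference[rank]
        normalized.append(out)
    return normalized


# Hypothetical log-intensities of three probes on two arrays
toy_arrays = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
toy_normalized = quantile_normalize(toy_arrays)
print(toy_normalized)
```

After normalization, each array contains exactly the same set of values, and only the assignment of those values to probes differs, so a probe-wise comparison between arrays is no longer affected by array-wide intensity differences.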

Raw intensity data:
aCGH (I): two-color BAC clone array, intensity ratios
GEX (custom, I): two-color oligonucleotide array, intensity ratios
GEX (Illumina, II): one-color beadchip array, intensity values

aCGH data:
Normalization: popLowess, base R
Segmentation: circular binary segmentation, R library DNAcopy
Copy number calling: soft calls, R library CGHcall
Data dimensionality reduction: regions of constant calls, R library CGHregions
Testing for mutation-specific copy number aberrations: Wilcoxon rank-sum test

GEX data:
Normalization (within arrays): print-tip loess, R library limma
Normalization (between arrays): quantile normalization, R library limma
Differential expression analysis: moderated t-test with Bayesian probability estimation, R library limma
Functional enrichment analysis of the differentially expressed genes: DAVID functional annotation tool
Gene set enrichment analysis: GSEA java application

Identification of potential tumor driver genes: overlap between c.1100delC-associated regions and differentially expressed genes

Results: gained and lost regions, candidate driver genes, enriched pathways, differentially expressed genes

Figure 7. Microarray data processing and analysis flowchart (Processing or analysis step: method, environment).233, 235-244
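The Wilcoxon rank-sum step of the flowchart compares the soft calls of one genomic region between carrier and non-carrier tumors without assuming any distribution. The following minimal sketch shows the idea in plain Python with a normal approximation and a tie correction; it is an illustration, not the R code used in Study I, and the function names and soft-call values are hypothetical. No continuity correction is applied, which is a simplification of this sketch.

```python
import math
from statistics import NormalDist


def midranks(values):
    """Return 1-based ranks (ties get their average rank) and tie group sizes."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def wilcoxon_rank_sum(x, y):
    """Two-sided rank-sum test; returns the U statistic of x and a p-value."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks, tie_sizes = midranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    tie_term = sum(t ** 3 - t for t in tie_sizes) / (n * (n - 1))
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term)
    if var_u == 0:
        return u, 1.0  # all observations identical
    z = (u - mean_u) / math.sqrt(var_u)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u, p_value


# Hypothetical soft calls in [-1, 1] for one region
carrier_calls = [0.9, 0.8, 0.7, 0.95, 0.85]
non_carrier_calls = [0.1, 0.0, -0.2, 0.2, 0.05, 0.3]
u_stat, p_value = wilcoxon_rank_sum(carrier_calls, non_carrier_calls)
print(u_stat, p_value)
```

Because the test uses only ranks, a skewed or bimodal soft-call distribution does not invalidate it, which is the property motivating the choice over Student's t-test.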

4.2.2 Permutation analysis (I: unpublished data)

The non-randomness of the enrichment of the olfactory pathway among the genes with higher expression in c.1100delC carrier tumors was assessed by randomizing the mutation carrier status and rerunning the differential expression analysis 500 times with the same parameters as in the original work. In short, analyses were performed using the Bioconductor package limma, adjusting for ER status and other potentially confounding covariates as described in I: Materials and methods. The p-value threshold for differential expression was set to 0.05, as in the extraction of the 862 c.1100delC-associated genes for DAVID functional enrichment analysis. The number and proportion of olfactory receptor genes on each of the 500 generated gene lists were calculated and compared with the number and proportion of olfactory genes on the c.1100delC-associated gene list.

4.2.3 Survival analyses (II)

Patient survival in a group of interest, e.g. carriers of a specific mutation, was compared with patient survival in a reference group using the Cox proportional hazards model.245 Study subjects were considered to become at risk at the time of their first invasive breast cancer diagnosis. The data were left truncated to account for late enrollment and right censored at the event of interest or at the end of follow-up, whichever occurred first. The events of interest in the parallel analyses included death by any cause, breast cancer-associated death, distant metastasis, locoregional relapse, and second breast cancer.

4.2.4 Tumor pathology analyses (II)

Tumor pathology analyses of the BCAC study subjects were based on the data that had been collected by individual studies, as described earlier.246 Associations between the CHEK2 mutations and pathologic characteristics were tested with the study-stratified Cochran-Mantel-Haenszel test, as implemented in R library vcdExtra.247 Mutation-associated differences in age at diagnosis were tested with meta-analysis of age distribution using R library meta.248 The breast tumors of the BCAC study subjects were categorized into molecular subtypes using histopathological markers following the St Gallen 2013 criteria.136 The subtypes of the 183 breast tumors in the gene expression dataset were defined according to the PAM50 signature as implemented in the R library genefu.15, 249

4.2.5 The Polygenic risk score (III, IV)

The polygenic risk score (PRS) summarizes the risk effects associated with single low-penetrance variants.
In Studies III and IV, the PRS was calculated as a sum of per-variant log odds ratios, weighted by the number of risk alleles carried.10 The raw PRS was normally distributed in the population and individual PRS values were standardized by subtracting the population mean and dividing by the population standard deviation. The PRS used in Study III was based on 74 common variants, and the PRS used in Study IV was based on 75 variants. The difference was due to CHEK2:p.(I157T) (rs17879961), which was not included in the analyses of Study III, because the number of study subjects carrying c.1100delC and p.(I157T) was very low.

4.2.6 Risk association analyses (III, IV)

The association between genetic risk factors and breast cancer was tested primarily with logistic regression. Pairwise interaction between variants was assessed by including in the model an interaction factor coded as a product of the two. Nested models were compared with the likelihood-ratio test.

4.2.7 Feature selection (III: unpublished data)

To build a sparse model of common breast cancer susceptibility variants modifying the risk of CHEK2 c.1100delC carriers, we used stepwise logistic regression as implemented in StataSE 10 (StataCorp, College Station, TX, USA) on 77 common variants (linked variants of the same region were both included) with the following p-value thresholds: entry into the model 0.05; removal from the model 0.1. Feature selection was also performed with R libraries glmnet250, 251 and Boruta252, which enabled significance estimation by cross-validation or by introduction of random covariates, respectively.

4.2.8 In silico functional analysis (III: unpublished data)

We retrieved tagging variants (r²>0.8) for the six putative CHEK2 c.1100delC risk-modifying variants using SNAP proxy search (version 2.2)253 on European haplotype data and utilized Genevar,254-256 RegulomeDB,257 and HaploReg v2258, 259 database tools for identifying the target genes of these variants. For annotation of the 71 remaining risk-modifying variants, we performed a HaploReg analysis and collected data from previously published investigations. To investigate more thoroughly the pathway enrichments of the six putative modifiers and the other 71 breast cancer susceptibility variants, we performed a literature search for functional connections between the potential culprit genes and WNT or FGF signaling as well as DNA repair pathways or CHEK2 itself. Furthermore, we used QIAGEN’s Ingenuity® Pathway Analysis (IPA®, QIAGEN, Redwood City, CA, USA, www.qiagen.com/ingenuity) for a more systematic approach to functional enrichment analysis. Pathway enrichment was tested by comparing the loci of the six putative modifiers against 63 loci covered by the 77 common variants with Fisher’s exact test. Here, all variants connected to the same culprit gene were considered to belong to the same locus.

4.3 Ethics statement

This study was carried out with permission of the Helsinki University Central Hospital Ethics Committee (Dnro207/E9/07) and with written informed consent from all patients. Each individual BCAC study followed national guidelines for participant inclusion and for informed consent procedures and was approved by the appropriate local institutional review committee.
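The Fisher's exact test comparison described in Section 4.2.8 can be sketched in Python as follows. This is an illustration, not the software used in the study. The counts are those reported in Section 5.3.3 (six of six candidate loci and 28 of 63 loci with a WNT/FGF connection); treating the two groups as separate rows of a 2x2 table and using a two-sided test are assumptions, and the function name fisher_exact_two_sided is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def table_probability(x):
        # Hypergeometric probability of x first-column counts in the first row.
        return comb(col1, x) * comb(total - col1, row1 - x) / comb(total, row1)

    observed = table_probability(a)
    lowest = max(0, row1 - (total - col1))
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(
        p
        for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# Rows: candidate modifier loci, loci of the 77 common variants.
# Columns: linked to WNT/FGF signaling, not linked.
# Laying the counts out as two separate rows is an assumption.
p_value = fisher_exact_two_sided(6, 0, 28, 35)
print(f"p = {p_value:.3f}")
```

Under these assumptions the sketch gives a p-value of about 0.011, in line with the value reported in Section 5.3.3.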

5 Results

5.1 c.1100delC and p.(I157T) carrier tumors (I, II)

Genomic characterization of c.1100delC and p.(I157T) carrier tumors was included in Studies I and II, respectively. The latter study also covered pathological comparison of tumors from the carriers of the two mutations. The results are summarized below.

5.1.1 c.1100delC-associated copy number aberrations (I)

Array comparative genomic hybridization (aCGH) of 26 CHEK2 c.1100delC carrier and 76 non-carrier tumors identified seven genomic locations whose copy number aberrations (CNAs) were more common in mutation carrier tumors than in non-carrier tumors (I: Figure 1). These included a wide deletion of 1p13.3-31.3 and amplification of 12q13.11-3. Narrow focal copy number aberrations included deletions at 8p21.1-2, 8p23.1-2, and 17p12-13.1 as well as amplifications at 16p13.3 and 19p13.3. The genomic locus of the CHEK2 gene was diploid in the majority (16) of c.1100delC carrier tumors, deleted in six, and amplified in four.

5.1.2 c.1100delC-associated differences in gene expression (I and unpublished data)

Differential gene expression analysis of 12 CHEK2 c.1100delC carrier and 61 non-carrier tumors with a 0.01 p-value threshold revealed a 188-gene c.1100delC-associated gene signature (I: Additional file 6). There was little overlap between this signature and previously published breast cancer signatures. However, when breast cancer patients from two independent datasets were split into two categories based on expression of the 188 genes of the c.1100delC signature, there was a significant difference in patient survival between these categories (I: Figure 2). We applied a looser, 0.05 p-value threshold to retrieve the top-ranking differentially expressed genes (I: Additional file 7) for functional enrichment analysis using the DAVID functional annotation tool and database.
The top-ranking 522-gene list with higher expression in c.1100delC carrier tumors was enriched for genes involved in olfactory signaling, transcription regulation, adherens junction, and WNT signaling pathway. Furthermore, the number of RAS protein family genes was 6-7 times higher than expected and 16q22.1 appeared as a genomic hot spot for elevated expression (I: Additional file 8). The 340 genes with lower expression in c.1100delC tumors included multiple genes involved in RNA processing and translation, mitochondria, cytoskeleton and centrosome organization as well as in response to DNA damage (I: Additional file 10).

According to permutation analysis (unpublished data), the enrichment of the olfactory pathway among the genes with higher expression in c.1100delC carrier tumors was unlikely to be a random association. When the mutation carrier status was shuffled for the samples on the array and a similar analysis of differential gene expression was run 500 times, the probability of finding equally strong or stronger enrichment of olfactory receptor genes was lower than 0.006.

GSEA is a method for recognizing subtle but consistent patterns in the results of differential gene expression analysis. Instead of focusing only on the top-ranking differentially expressed genes, it takes as input the complete genomic list of genes ranked from elevated to suppressed expression. Here (unpublished data), the input gene list was ranked according to the product of log2(fold change) and -log10(p-value). The GSEA results concurred with the DAVID analysis; genes of olfactory signaling had consistently higher expression in c.1100delC carrier than non-carrier tumors (Table 5). Another significant enrichment among the genes with elevated expression was the 16q22 genes. Other gene sets were not significant after p-value correction, but the top-ranking gene sets with elevated expression in c.1100delC included genes involved in cell junction and KRAS signaling (Table 5). Top-ranking gene sets with suppressed expression in c.1100delC carriers included MYC and E2F target genes and genes involved in G2/M checkpoint and cellular respiration. Table 5 includes enriched annotations with FWER (family-wise error rate) below 0.5. Of note, the top-ranking hallmark gene sets beyond the threshold included DNA repair, apoptosis, and TP53 pathway.

5.1.3 Combined analysis of aCGH and GEX data (I and unpublished data)

Combined copy number and gene expression analysis identified seven candidate tumor drivers (I: Table 2): TMED5 (transmembrane p24 trafficking protein 5), ALG14 (ALG14, UDP-N-acetylglucosaminyltransferase subunit), and LRRC8D (leucine rich repeat containing 8 VRAC subunit D) from 1p21-22 and CALCOCO1 (calcium binding and coiled-coil domain 1), OR6C3 (olfactory receptor family 6 subfamily C member 3), CSAD (cysteine sulfinic acid decarboxylase), and KRT5 (keratin 5) from 12q13 (Figure 8). The top-ranking differentially expressed gene CLCA1 (chloride channel accessory 1) was also located in a c.1100delC-associated region (1p22.3). However, the higher expression in c.1100delC carrier tumors (fold change: 1.82, p-value: 6.9E-6) was inconsistent with the c.1100delC-associated copy number (deletion, I: Table 2).
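The GSEA input ranking described in Section 5.1.2, the product of log2(fold change) and -log10(p-value), can be sketched in Python using the CLCA1 values quoted above. The function name gsea_rank_score and the two extra genes are hypothetical, and the fold change is assumed to be reported on a linear scale.

```python
from math import log2, log10


def gsea_rank_score(fold_change, p_value):
    """Ranking metric: log2(fold change) multiplied by -log10(p-value)."""
    return log2(fold_change) * -log10(p_value)


# CLCA1 values quoted in the text; fold change assumed to be on a linear scale.
clca1_score = gsea_rank_score(1.82, 6.9e-6)

# Hypothetical genes (GENE_A, GENE_B) added only to show the ranking step.
genes = {
    "CLCA1": (1.82, 6.9e-6),
    "GENE_A": (0.70, 0.002),
    "GENE_B": (1.10, 0.40),
}
# Rank from elevated to suppressed expression.
ranked = sorted(genes, key=lambda g: gsea_rank_score(*genes[g]), reverse=True)
```

With this metric, a gene scores high only when it combines a large positive fold change with a small p-value, and genes with lower expression in carriers receive negative scores and fall to the bottom of the list. Under the stated assumption, CLCA1 scores about 4.46.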

Figure 8. Expression values of potential tumor driver genes from 1p21-22 and 12q13 based on combined analysis of the aCGH and GEX data (nc: non-carrier).

Table 5. Results from the Gene Set Enrichment Analysis (GSEA) of 12 CHEK2 c.1100delC carrier and 61 non-carrier tumors with FWER (family-wise error rate) cut-off of 0.5 (unpublished data).

Gene set | Gene set size | Genes in data | Rank at max | Enrichment score | Normalized enrichment score | p-value for enrichment | Corrected p-value (FWER) | Gene set description

Hallmark gene sets representing well-defined biological states or processes
HALLMARK_APICAL_JUNCTION | 200 | 151 | 1050 | 0.64 | 1.65 | 0.00 | 0.14 | Genes encoding components of apical junction complex
HALLMARK_OXIDATIVE_PHOSPHORYLATION | 200 | 103 | 1805 | -0.69 | -1.84 | 0.03 | 0.334 | Genes encoding proteins involved in oxidative phosphorylation
HALLMARK_MYC_TARGETS_V1 | 200 | 92 | 1662 | -0.65 | -1.72 | 0.01 | 0.34 | A subgroup of genes regulated by MYC - version 1 (v1)
HALLMARK_E2F_TARGETS | 200 | 129 | 1996 | -0.6 | -1.62 | 0.03 | 0.36 | Genes encoding cell cycle related targets of E2F transcription factors
HALLMARK_MYC_TARGETS_V2 | 58 | 34 | 2036 | -0.72 | -1.61 | 0.02 | 0.36 | A subgroup of genes regulated by MYC - version 2 (v2)
HALLMARK_FATTY_ACID_METABOLISM | 158 | 106 | 1505 | -0.57 | -1.52 | 0.03 | 0.43 | Genes encoding proteins involved in metabolism of fatty acids
HALLMARK_COAGULATION | 138 | 106 | 461 | -0.57 | -1.52 | 0.02 | 0.43 | Genes encoding components of blood coagulation system; also up-regulated in platelets
HALLMARK_G2M_CHECKPOINT | 200 | 133 | 1220 | -0.53 | -1.5 | 0.02 | 0.45 | Genes involved in the G2/M checkpoint, as in progression through the cell division cycle

Oncogenic signatures
KRAS.AMP.LUNG_UP.V1_UP | 144 | 94 | 1058 | 0.66 | 1.63 | 0.01 | 0.41 | Genes up-regulated in epithelial lung cancer cell lines over-expressing KRAS [ID:3845] gene

Curated gene sets of canonical pathways
KEGG_OLFACTORY_TRANSDUCTION | 389 | 113 | 976 | 0.81 | 2.07 | 0.00 | 0.000 | Olfactory transduction
REACTOME_OLFACTORY_SIGNALING_PATHWAY | 328 | 85 | 976 | 0.82 | 1.98 | 0.00 | 0.001 | Genes involved in Olfactory Signaling Pathway
KEGG_CYSTEINE_AND_METHIONINE_METABOLISM | 34 | 21 | 4 | 0.93 | 1.85 | 0.03 | 0.09 | Cysteine and methionine metabolism
REACTOME_CELL_JUNCTION_ORGANIZATION | 78 | 58 | 1139 | 0.78 | 1.8 | 0.07 | 0.28 | Genes involved in Cell junction organization

Gene ontology
NITROGEN_COMPOUND_METABOLIC_PROCESS | 155 | 106 | 133 | -0.86 | -2.29 | 0.00 | 0.11 | GO:0006807
AMINO_ACID_AND_DERIVATIVE_METABOLIC_PROCESS | 101 | 65 | 133 | -0.91 | -2.28 | 0.00 | 0.134 | GO:0006519
AMINE_METABOLIC_PROCESS | 141 | 99 | 133 | -0.87 | -2.27 | 0.00 | 0.21 | GO:0009308
ACTIVE_TRANSMEMBRANE_TRANSPORTER_ACTIVITY | 122 | 82 | 198 | -0.87 | -2.25 | 0.00 | 0.27 | GO:0022804

Positional gene sets
CHR16Q22 | 168 | 95 | 974 | 0.76 | 1.86 | 0.00 | 0.02 | Genes in cytogenetic band chr16q22
CHR2Q12 | 61 | 31 | 294 | -0.94 | -2.12 | 0.00 | 0.35 | Genes in cytogenetic band chr2q12

DAVID functional enrichment analysis of the c.1100delC-associated CNAs (unpublished data) highlighted clusters of homologous genes. The region with the highest signal on chromosome 1, 1p21.3-22.2 encompassing 81 contiguous BAC clones and over 10 Mb (I: Table 1), covered a cluster of genes encoding all human guanylate binding proteins (GBP1-7). The c.1100delC-associated region on 12q13 was enriched for type II keratin (n: 27) and olfactory (n: 17) genes, 8p23.1-2 covered 14 defensin genes, 16p13.3 four hemoglobin alpha and eight serine protease genes, and 17p12-13.1 a cluster of six myosin genes.

5.1.4 p.(I157T)-associated gene expression (II)

The expression levels of 21 genes were associated with p.(I157T) carriership in an analysis of data from 183 breast tumors (10 p.(I157T) carriers). All of these genes had higher expression in p.(I157T) carrier tumors (II: Table 4). One-third of them were collagen genes, forming the most prominent functional group. Gene set enrichment analysis suggested that CDH1 and RB1 would be important regulators of the overall gene expression differences between p.(I157T) carrier and non-carrier tumors (II: Table S6). Furthermore, gene signatures associated with cell adhesion, interaction with stroma, and epithelial-to-mesenchymal transition were enriched at the top of differentially expressed genes in p.(I157T) carrier tumors.

5.1.5 Clinico-pathological characteristics (II)

Analysis of tumor clinico-pathological characteristics in a BCAC dataset of 26801 breast cancer patients showed that the p.(I157T) and c.1100delC carrier tumors shared some features that set them apart from non-carrier tumors, but there were also significant differences between tumors of p.(I157T) and c.1100delC carriers (II: Table 1).
CHEK2 mutation carrier tumors were significantly more often ER- or PR-positive than non-carrier tumors. The p.(I157T) variant was associated with low-grade and lobular tumors, whereas c.1100delC carrier tumors did not differ from non-carrier tumors in this regard.

5.2 p.(I157T) or c.1100delC carrier survival (II)

Survival of carriers of p.(I157T) and c.1100delC mutations was investigated in collaboration with the BCAC for Study II. The p.(I157T) variant was not associated with increased risk of early death, breast cancer-specific death, or disease recurrence; there was no significant difference in p.(I157T) carrier and non-carrier survival (II: Table 3). However, between carriers of p.(I157T) and c.1100delC, a significant difference was seen in the risk of early death and a marginally significant difference in the risk of breast cancer-associated death such that c.1100delC was associated with poorer prognosis.

5.3 Genetic modifiers of c.1100delC-associated breast cancer risk (III)

5.3.1 Synergistic risk effect of common variants for c.1100delC carriers (III)

The combined risk effect of 74 common low-penetrance variants summarized in a polygenic risk score was very similar for CHEK2 c.1100delC carriers and non-carriers (OR 1.59 [1.21 - 2.09] and 1.58 [1.55 - 1.62] per unit standard deviation, respectively; III: Table 1). We estimated that the 20% of c.1100delC carriers with the highest PRS values would have over 32% lifetime risk of breast cancer, and for the lowest 20% the risk would be comparable to the average population risk. Thus, the PRS could be used for personal risk stratification of c.1100delC carriers.
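As a reminder of how the score behind these odds ratios is constructed (Section 4.2.5), the following Python sketch computes a raw PRS as a weighted sum of log odds ratios and standardizes it against a reference population. All variant names, effect sizes, and genotypes are made up for illustration, the helper names raw_prs and standardize are hypothetical, and the use of the population (rather than sample) standard deviation is an assumption.

```python
from statistics import mean, pstdev


def raw_prs(genotype, log_odds_ratios):
    """Sum of per-variant log odds ratios weighted by risk-allele counts (0, 1, 2)."""
    return sum(log_odds_ratios[v] * n for v, n in genotype.items())


def standardize(score, reference_scores):
    """Subtract the reference mean and divide by the reference SD."""
    return (score - mean(reference_scores)) / pstdev(reference_scores)


# Hypothetical per-allele log odds ratios for three made-up variants.
log_or = {"variant_1": 0.10, "variant_2": -0.05, "variant_3": 0.20}

# Hypothetical reference population genotypes (risk-allele counts).
population = [
    {"variant_1": 0, "variant_2": 1, "variant_3": 0},
    {"variant_1": 1, "variant_2": 0, "variant_3": 1},
    {"variant_1": 2, "variant_2": 2, "variant_3": 0},
    {"variant_1": 1, "variant_2": 1, "variant_3": 2},
]
population_scores = [raw_prs(g, log_or) for g in population]
standardized = [standardize(s, population_scores) for s in population_scores]
```

Under the log-additive model, an OR of 1.59 per unit standard deviation would correspond to roughly 1.59² ≈ 2.5 times the odds at a standardized PRS of +2 relative to the population mean.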

In pairwise interaction analyses between c.1100delC and 77 previously reported common risk variants, we did not find evidence for deviation from the multiplicative (log-additive) risk model for breast cancer (III: Table S4).

5.3.2 The sparse model (III: unpublished data)

We further performed an exploratory analysis for biological characterization of the CHEK2 c.1100delC-associated risk variants using forward stepwise logistic regression and pathway enrichment analyses. In the stepwise analysis, six variants (rs11249433, rs11780156, rs2981582, rs11075995, rs2363956, and rs4808801) appeared as independent and nominally significant (p≤0.05) risk factors for the c.1100delC carriers (Table 6). The three first-mentioned also had nominally significant interaction with c.1100delC in the pairwise analyses. In the following, we refer to these six variants as the candidate CHEK2 c.1100delC modifiers. The six candidate modifier variants were not verified by the more stringent feature selection methods, which used randomization to estimate the model significance: only rs11780156 appeared as a relevant risk factor for c.1100delC carriers in the Boruta analysis,252 whereas none of the variants was considered significant in the glmnet analysis.250, 251

Table 6. Regression model of the candidate modifier variants explaining breast cancer risk in CHEK2 c.1100delC carriers (unpublished data).

Variant | OR [95% CI] | p-value
rs11249433 | 0.68 [0.47 - 1.00] | 0.050
rs11780156 | 2.16 [1.17 - 4.00] | 0.014
rs2981582 | 1.52 [1.01 - 2.30] | 0.045
rs11075995 | 1.85 [1.12 - 3.06] | 0.017
rs2363956 | 1.60 [1.07 - 2.38] | 0.021
rs4808801 | 0.63 [0.42 - 0.93] | 0.020
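The relation between the odds ratios, confidence intervals, and p-values in Table 6 can be checked approximately with a Wald calculation. The Python sketch below is illustrative only: it assumes that the intervals are symmetric on the log-odds scale and that the p-values are Wald-type, neither of which is stated in the text, and the function name wald_p_from_ci is hypothetical. Because the tabulated values are rounded, the result reproduces the reported p-values only approximately.

```python
from math import erfc, log, sqrt


def wald_p_from_ci(odds_ratio, ci_low, ci_high):
    """Approximate two-sided Wald p-value from an OR and its 95% CI."""
    # On the log scale, a 95% CI spans about 2 * 1.96 standard errors.
    standard_error = (log(ci_high) - log(ci_low)) / (2 * 1.959964)
    z = log(odds_ratio) / standard_error
    # Two-sided tail probability of the standard normal distribution.
    return erfc(abs(z) / sqrt(2))


# Values for rs11780156 as listed in Table 6.
p_rs11780156 = wald_p_from_ci(2.16, 1.17, 4.00)
print(f"rs11780156: p = {p_rs11780156:.3f}")
```

For rs11780156 the sketch gives approximately 0.014, matching the tabulated p-value.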

5.3.3 In silico functional characterization (III: unpublished data)

The genetic loci tagged by the 77 common risk variants as well as the subgroup of the six candidate modifiers were enriched for predicted enhancer elements in human mammary epithelial cells (HMEC) (p=3.0E-6 and p=0.004 for enrichment for all 77 variants and the six candidates, respectively, when tested against the whole genome by HaploReg258). Three of the six candidate modifiers were located on and two others tagged HMEC enhancers, in comparison with 14/21 of the remaining 71 variants (p=0.15 for difference) (Appendix: Supplementary Table 1, Supplementary Table 2: first column).

Assuming that the six candidate modifier variants have a regulatory role, their target genes could highlight pathways that have an essential role in CHEK2 c.1100delC-related breast cancer pathogenesis. Previously, rs2981582 has been linked to expression of FGFR2 (fibroblast growth factor receptor 2).260-262 eQTL analysis indicated ATE1 (arginyltransferase 1) to be another potential target gene for rs2981582 (Appendix: Supplementary Table 1). rs2363956, an ANKLE1 (ankyrin repeat and LEM domain containing 1) missense variant, was associated with expression of ANKLE1 and three other nearby genes (Appendix: Supplementary Table 1). rs4808801 was strongly associated with the expression of ELL (elongation factor for RNA polymerase II) in different cell types (Appendix: Supplementary Table 1). MYC (MYC proto-oncogene, bHLH transcription factor) and its downstream target PVT1 (Pvt1 oncogene) have been suggested as probable target genes of rs11780156, NOTCH2 as the target gene of rs11249433, and FTO (FTO, alpha-ketoglutarate dependent dioxygenase) as the target gene of rs11075995.9, 263, 264

Altogether, the target genes of five of these variants (rs11075995, rs11249433, rs11780156, rs2981582, rs4808801) were linked to the interconnected WNT and FGF (fibroblast growth factor) signaling routes,265-273 which have been implicated in CHEK2 c.1100delC-associated breast cancer tumorigenesis by us (I) and others.274 Furthermore, the locus tagged by rs2363956 is possibly under the regulation of WNT signaling because rs2363956 is in linkage disequilibrium with another variant (r²=1.0) on a MYC transcription factor binding site (Appendix: Supplementary Table 1). In contrast, only 28 of the 63 genomic loci of the 77 common variants had a similar connection to WNT/FGF signaling (p=0.011 for enrichment, Appendix: Supplementary Table 2). Interestingly, none of the six candidate modifiers were associated with DNA repair pathway genes (Appendix: Supplementary Table 2).

5.4 Risk modifiers in breast cancer families (IV)

Our analyses of the polygenic risk score in 52 breast cancer families and a case-control dataset including index cases from breast cancer families indicated that the PRS is positively associated with family history of breast cancer and that it could add to pedigree-based individual risk prediction. The PRS was associated with slightly higher relative risk of breast cancer in women with a positive family history (IV: Table 1, OR 1.81 [1.59-2.06] when comparing cases with positive family history with population controls) than in women without this history (OR 1.41 [1.30-1.54] when comparing sporadic cases with population controls).
Furthermore, the PRS was on average higher for unaffected women of breast cancer families than for population controls (OR 1.29 [1.12-1.48]). Within the breast cancer families, the OR associated with the PRS was 1.55 [1.26-1.91] when the model was adjusted for the magnitude of family history as measured by the BOADICEA risk estimator. In these 52 families, especially the higher PRS values were strongly associated with increased breast cancer risk, whereas the association between lower PRS values and protection from breast cancer was not as evident.

6 Discussion

6.1 CHEK2-associated breast cancer (I, II, III)

One ambitious goal of this thesis was to build a hypothesis of the events leading to breast cancer for CHEK2 mutation carriers. Clues were sought from genomic profiling of the mutation carrier tumors as well as from functional characterization of the genetic modifiers of CHEK2-associated breast cancer risk. Since CHEK2 mutation carrier breast cancers are not very different from the non-carrier breast cancers, a CHEK2-associated model was not expected to depart much from any general model for breast cancer. Even so, we reasoned that when the tumor-initiating event is shared, there could also be similarities in later steps of the tumorigenesis and these could be seen as enrichment of genomic features in the mutation carrier breast cancers. As breast cancer is a heterogeneous disease and its etiology is still imperfectly known, a shared origin served as an interesting starting point for a study on breast cancer tumorigenesis.

6.1.1 Germline CHEK2 mutations are associated with ER-positive breast cancer (II)

CHEK2 c.1100delC and p.(I157T) carrier breast cancers are primarily ER-positive. Our analyses on the BCAC dataset (II) confirmed this association, which has earlier been reported by multiple studies.3, 216-218, 224, 275-277 The strong association suggests that endocrine exposure would be a major driver of breast cancer tumorigenesis for CHEK2 mutation carriers, an inference with little specificity, as breast cancer in general and even ER-negative breast cancer are driven by sex hormones.52, 94 However, the question of which factors make almost all CHEK2 carrier breast cancers ER-positive has seldom been addressed in the literature.
Common downstream targets of CHEK2 and ER have been suggested to explain the association, but the mechanism has not been elucidated further.203 An interesting point of comparison is BRCA1, which has been studied more widely than CHEK2 in cell-line, organoid, and mouse models;29, 90-92 CHEK2 and BRCA1 are functionally linked in two cellular processes, DNA double-strand break repair and centrosome organization.28, 189, 200 Truncating mutations of both genes are associated with earlier age of onset and premenopausal breast cancer,3, 278-280 which has been suggested to be enriched for ER-negative tumors.96 BRCA1 mutations cause mainly ER-negative breast cancer, while CHEK2 mutations cause mainly ER-positive breast cancer. Loss of BRCA1 function has been inferred to have a direct causal role in the oncogenic transformation of luminal epithelial cells, with concomitant loss of ER expression.29 The phenotypic switch is transmitted via BRCA1-dependent gene expression, and this functional role apparently is not shared with CHEK2.90

Suppose that breast cancer originates from luminal progenitor cells expressing ER and that abrogation of the BRCA1/CHEK2 functional node is required for breast cancer to develop. Bearing in mind that cancer is a stepwise-progressive disease driven by randomly arising events and that the final phenotype arises from an evolutionary process characterized by survival of the fittest, the association between CHEK2 mutations and ER-positive breast cancer could then be seen as a result of lack of selection for somatic BRCA1 silencing. In other words, if the oncogenic transformation of luminal progenitors involved the loss of BRCA1 function, the consequences would be chromosomal instability with the loss of ER expression and eventually development of ER-negative breast cancer; if the luminal progenitor transformation took place via loss of CHEK2 function, there would not be any further need for loss of BRCA1, the consequences being chromosomal instability without loss of ER expression and eventually ER-positive breast cancer. Naturally, BRCA1 could be lost as a random event, and accordingly, a small proportion of CHEK2 carrier breast cancers are ER-negative. Furthermore, there could be mechanisms other than BRCA1 silencing leading to a switch in gene expression and the loss of ER expression. An organoid model of CHEK2-deficient breast cancer in comparison with a BRCA1-driven model could serve as an interesting research setup to test this hypothesis.

6.1.2 Genomic profiling elucidates the steps of CHEK2-related tumorigenesis (I, II)

In order to study CHEK2-associated tumorigenesis in depth, we performed parallel aCGH and GEX analyses on c.1100delC carrier tumors (I) as well as a separate GEX analysis on p.(I157T) carrier tumors (II). In silico functional analyses revealed unexpected cellular pathways, and no overarching principles shared by the two mutations could be found. Altogether, the analyses produced building blocks for multiple parallel hypotheses, the biggest challenge being how to separate true driver events from their possibly oncogenic consequences and further from mere passengers. Only in vitro and in vivo experimental models could make the final distinction, but in building a testable hypothesis, one must rely on recurrent and converging evidence from in silico work, together with experimental knowledge on molecular-level relations of biological pathways.

Another challenge was the striking differences between tumors from carriers of different CHEK2 mutations. Admittedly, c.1100delC and p.(I157T) carrier tumors were not analyzed together in the same dataset, but instead both were compared with non-carriers in separate datasets. However, on both occasions the number of non-carriers far exceeded the number of mutation carriers, and the numbers were sufficient to represent breast cancers in general.
Had the driver events in these c.1100delC and p.(I157T) carrier tumors been related to the CHEK2 function, there should have been similarities in the genomic features associated with the two mutations. Here, however, the enriched features of c.1100delC and p.(I157T) carrier tumors had nothing in common. Furthermore, regarding CDH1 expression and related pathways, the c.1100delC carrier tumors appeared totally different from p.(I157T) carrier tumors. In the aCGH analysis (I), we identified two large (loss of 1p13.3-31.3 and gain of 12q13.11-3) and five focal copy number aberrations, which were more common in c.1100delC carrier tumors than in non-carrier tumors. Additionally, the GEX analysis (I) indicated that 16q22.1 amplification could be a common event in c.1100delC carrier breast cancers since the region covered a focal enrichment of genes with higher expression in c.1100delC carrier than non-carrier tumors. In the GEX analysis, the significant biological enrichments in c.1100delC tumors included elevated expression of the olfactory pathway and lowered expression of mitochondrial genes (I: Additional file 8, Additional file 10). Furthermore, genes of an oncogenic KRAS signature (Table 5 (unpublished data)) and WNT pathway genes (I: Additional file 8, Additional file 9) had higher expression, whereas genes involved in cell cycle regulation and genes responding to DNA damage had lower expression (I: Additional file 10) in c.1100delC tumors than in non-carrier tumors. On the other hand, p.(I157T)-associated differential expression was dominated by genes involved in cell-cell and cell-matrix contacts. Collagen genes had significantly elevated expression in p.(I157T) tumors and epithelial-to-mesenchymal transition and focal adhesion were enriched annotations at elevated expression, whereas E cadherin (CDH1) and its target genes had lower expression in p.(I157T) tumors than in non-carrier tumors (II: Table S5, Table S6). 
Intriguingly, in c.1100delC tumors CDH1 ranked among the top 20 genes with elevated expression, the genes of the CDH1 locus 16q22.1 formed a focal point of elevated expression, and genes encoding the apical junction proteins were enriched at elevated expression (Table 5 (unpublished data); I: Additional file 8). In the following, the enriched pathways and biological processes are first discussed separately and then drawn together for a comprehensive hypothesis of CHEK2-associated breast cancer.

6.1.3 1p22 loss might complement CHEK2 deficiency in breast cancer progression (I)

The Cancer Genome Atlas (TCGA), a cross-cancer genomic study aiming to build a molecular taxonomy for different types of cancers, suggested that breast cancer arises primarily from chromosomal instability leading to copy number aberrations and that the oncogenic drivers of breast cancer could best be identified by studying recurrent copy number patterns.281 In concert with this, an IntClust classification method based on characteristic CNA patterns associated with gene expression in cis was suggested for subtyping breast cancers.83 The c.1100delC-associated CNAs were not included in the characteristic signatures of any IntClust subgroups. Instead, a wide loss of 1p was present at low but detectable frequency in at least four IntClust subgroups dominated by luminal tumors (clusters 1, 2, 6, and 9), and in one of these (cluster 1) there was also evidence of a low-frequency 12q amplification.83

Loss of 1p21.3-22.2 appears to be the best candidate for a tumor driver event in c.1100delC-associated breast cancer. The region had the strongest association with c.1100delC in our analyses, and the association between c.1100delC and loss of 1p has been recently replicated in an independent study.282 Massink et al. did not report detailed statistics, so it was not possible to compare the signal distribution along the 1p chromosome arm. Neither we nor they were able to link the 1p loss with expression of any gene in cis.
In our analyses, the copy number and gene expression data of the c.1100delC carriers came essentially from different tumors due to the small number of samples from which good-quality specimens of both DNA and RNA were retrievable. Therefore, we had to rely on converging evidence instead of direct inference in functional annotation of the driver events. The highest signal came from a region covering a cluster of GBP genes. GBP function was the only significantly enriched annotation on 1p21.3-22.2, as all seven GBP genes and one pseudogene of the family are located in this region. GBPs are large GTPases whose transcription is induced by pro-inflammatory interferons, especially IFN-γ, and by interleukin IL-1β or tumor necrosis factor TNF-α in phagocytic cells, but also in a range of other cell types, including epithelial cells. The best-characterized function of GBPs is cell-autonomous defense against cytosolic pathogens, including bacteria, protozoans, and viruses.283-285 The GBP proteins form homo- and heterotetramers upon GTP hydrolysis, and the oligomerization accounts for specificity of pathogen recognition. In cancer, high GBP1 or GBP2 expression is indicative of infiltrating T helper type I cells and associated with increased patient survival.285-287 Furthermore, GBP1 functions as a tumor suppressor that mediates anti-proliferative, anti-angiogenic, anti-migratory, and pro-apoptotic effects induced by IFN-γ in colorectal cancer cell lines.288, 289 The anti-migratory effects of GBP1 are exerted via direct interactions with β-actin and the anti-proliferative effects via β-catenin/TCF-dependent transcription.285, 289 Furthermore, GBP1 knock-down in macrophages has been reported to cause mitochondrial dysfunction: decreased oxygen consumption and ATP production, lowered expression of mitochondrial genes, and impaired mitophagy.290

In summary, loss of the GBP gene cluster could be a novel candidate for a key driver of c.1100delC-associated tumorigenesis. The hypothesis is supported by validated evidence of the association between c.1100delC and loss of the GBP locus. The GBP genes have a proven and versatile role as tumor suppressors in different carcinomas, including breast cancer.287 Additionally, the genes with lower expression in c.1100delC tumors were enriched with mitochondrial genes, which could be a consequence of GBP silencing. We did not see any significant differences in the expression of the GBP genes between c.1100delC carrier and non-carrier tumors. However, because the GBP genes are not constitutively expressed, but induced in response to interferons and cytokines, it may be that we failed to detect the differential expression in this dataset, but were able to capture the longer-lasting downstream effects in lowered expression of mitochondrial genes, possibly indicating mitochondrial dysfunction (Table 5 (unpublished data), I: Additional file 8).

6.1.4 Elevated expression of olfactory receptors in c.1100delC carrier tumors (I)

Another strong candidate for a driver of c.1100delC-associated breast cancer was the olfactory pathway. The peculiar association raised doubts about the array specificity. Previously, a sequence analysis of tumor/normal tissue pairs indicated olfactory receptor (OR) genes as hot spots for somatic mutations in cancer. However, this was shown to be an artefact stemming from a low expression rate of the olfactory receptor genes in epithelial cells and their uniformly late replication in relation to the cell cycle.291 In our data, by contrast, there was no evidence questioning the association between c.1100delC and elevated expression of the OR genes.
When the mutation carrier status was randomly sampled, finding an equally strong enrichment of ORs in genes with either elevated or lowered expression was highly unlikely. In addition, the 34 OR genes with higher expression in c.1100delC carriers came from 17 distinct OR clusters located on nine different chromosomes, and thus, coincidental hyperactivity of a single promoter or amplification of a particular locus did not appear to be a plausible explanation. Of note, 12q13.11-3, whose amplification was more common in c.1100delC tumors than in non-carrier tumors, also covered an OR gene cluster including three of the 34 c.1100delC-associated OR genes. Furthermore, GSEA indicated that not only did the 34 OR genes have higher expression, but the expression of OR genes in general was higher in c.1100delC carrier than in non-carrier tumors. Finally, probe cross-reactivity had been ruled out by blasting the probe sequences and including only perfectly specific probes.

Olfactory receptors are the largest gene family in humans, including about 400 protein-coding genes and at least as many pseudogenes.292 ORs were first characterized in olfactory sensory neurons, where their generally high expression level follows a one-cell-one-receptor model contributing to specificity in odorant detection.293 Later, low-level ectopic expression of multiple ORs has been detected in all studied tissues, but their functional roles are only beginning to be unraveled.294, 295 Specific ORs have been assigned roles in sperm chemotaxis, cytokinesis regulation, myocyte migration, cell adhesion, and proliferation of prostate cancer cells.295 Ectopic expression of certain ORs has been reported in different types of cancer where the ligand-dependent OR activation has led to reduced proliferation and migration, making the respective ORs potential targets for therapeutic intervention.296-301 However, Sanz et al. reported that exposing PSGR/OR51E2-expressing prostate cancer cells to the PSGR ligand promoted cell invasion on collagen gels and increased the number of metastases in a mouse model, rendering the earlier findings based on 2D culture questionable, and suggesting that if the apparent lower proliferation in vitro is translated to enhanced invasiveness in vivo, the ORs are poor targets for cancer therapy.296, 302

Quite recently, Weber et al. described a set of OR genes whose expression was higher in breast cancer (number of samples: 45) or in breast cancer cell lines (21) than in healthy breast tissue (7). They suggested that OR2B6 could be used as a marker for breast cancer because its expression was present in a great majority of breast cancers and cell lines, but absent in all healthy tissue samples.303 In addition to OR2B6, two other olfactory receptor genes, OR10AD1 and OR13H1, which Weber et al. suggested as tumor markers in breast cancer, were included in the 34 OR genes with higher expression in c.1100delC carrier tumors than in non-carrier tumors, supporting the relevance of at least part of the 34-gene olfactory signature in breast cancer.
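
The label-permutation reasoning used earlier in this section to rule out chance enrichment of OR genes can be illustrated with a minimal sketch. It is not the analysis pipeline of Study I: the test statistic (number of gene-family members among the top-ranked genes by absolute mean difference), the function names, and the synthetic data are all assumptions made for illustration.

```python
import random


def family_hits(expr, carrier_flags, family_genes, top_n):
    """Count gene-family members among the top_n genes ranked by
    absolute mean expression difference (carriers minus non-carriers)."""
    diffs = {}
    for gene, values in expr.items():
        carriers = [v for v, c in zip(values, carrier_flags) if c]
        others = [v for v, c in zip(values, carrier_flags) if not c]
        diffs[gene] = abs(sum(carriers) / len(carriers) - sum(others) / len(others))
    top = sorted(diffs, key=diffs.get, reverse=True)[:top_n]
    return sum(gene in family_genes for gene in top)


def permutation_p_value(expr, carrier_flags, family_genes,
                        top_n=20, n_perm=200, seed=0):
    """Empirical p-value: share of label shufflings with at least as many hits."""
    rng = random.Random(seed)
    observed = family_hits(expr, carrier_flags, family_genes, top_n)
    as_extreme = 0
    for _ in range(n_perm):
        shuffled = carrier_flags[:]
        rng.shuffle(shuffled)  # keeps the number of carriers fixed
        if family_hits(expr, shuffled, family_genes, top_n) >= observed:
            as_extreme += 1
    return observed, (as_extreme + 1) / (n_perm + 1)


# Synthetic toy data (hypothetical): 200 genes x 40 tumors, 10 "carriers".
data_rng = random.Random(1)
carrier_flags = [j < 10 for j in range(40)]
family_genes = {f"gene{i}" for i in range(20)}
expr = {
    f"gene{i}": [
        # Family genes are raised in carriers by construction.
        data_rng.gauss(0.0, 1.0) + (2.0 if i < 20 and j < 10 else 0.0)
        for j in range(40)
    ]
    for i in range(200)
}

observed, p_value = permutation_p_value(expr, carrier_flags, family_genes)
print(observed, p_value)
```

Shuffling the carrier labels keeps the group sizes and the gene-gene correlation structure intact, so the null distribution reflects only the loss of the link between genotype and expression.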

Upon ligand binding, OR conformational change activates GNAL (G protein subunit alpha L), an olfactory-specific G-protein, and ACIII (adenylate cyclase III), leading to increased cellular cAMP levels and Ca2+ impulse through cyclic nucleotide-gated channels and in turn inducing calcium-activated chloride channels.295 Ras and MAPK pathways are also involved in the sensory function of olfactory receptors,304, 305 and it is plausible that the same downstream pathways operate in OR-mediated signaling in non-chemosensory tissues as well.295, 303, 306 Ras-type guanine nucleotide exchange factors (RasGEF) and KRAS oncogenic signature were enriched among the genes with elevated expression in c.1100delC tumors, possibly contributing to c.1100delC-related tumor progression in interaction with the elevated OR activity.

6.1.5 WNT pathway deregulation – typical for c.1100delC breast cancers (I, III)?

DAVID analysis suggested that elevated expression of WNT pathway genes is associated with c.1100delC (I: Additional file 8), with the exception of LRP1 (LDL receptor related protein 1, a regulator of WNT-ligand receptor FZD1 (frizzled class receptor 1), Figure 9)307 and LEF1 (lymphoid enhancer binding factor 1, a key transcription factor of the WNT pathway, Figure 9),308 which had lower expression in c.1100delC carrier tumors (I: Additional file 9). On the other hand, GSEA indicated that MYC target genes in general have lower expression in c.1100delC carriers than in non-carriers (Table 5 (unpublished data)). Of note, the gene expression array included two distinct probes for MYC, specific for two alternative protein-coding transcripts.
One of the probes, mapping to the 5’ UTR (untranslated region) of a transcript variant encoding a protein of 257 residues, was among the top 60 probes with elevated expression in c.1100delC tumors, whereas the expression level of the other probe, mapping to exon three of the main splice variant, did not differ between c.1100delC tumors and non-carrier tumors. Altogether, it seemed that the canonical WNT pathway through beta-catenin, TCF/LEF1, and MYC (Figure 9) had lower activity in c.1100delC tumors than in non-carrier tumors, and that the pathway genes with higher expression in c.1100delC were in regulatory and inhibitory roles. Instead, the non-canonical branch of the WNT pathway, mediated via calcium/calmodulin-dependent protein kinases and NFAT-dependent transcription (Figure 9), appeared as a potential candidate driving the c.1100delC-associated tumorigenesis (I: Additional file 9).308

Figure 9. WNT pathway (KEGG: hsa04310). Reproduced with the permission of Kanehisa Laboratories.

The WNT pathway (Figure 9) regulates both embryonic and pubertal development of the mammary gland, influencing cell proliferation, fate, migration, and differentiation (Table 1). The canonical WNT pathway has a pronounced oncogenic role in breast carcinomas; WNT ligands have elevated expression in the majority of breast cancer cell lines, β-catenin overexpression or mutations have been detected in about 50% of breast cancers, and MYC amplification has been reported in 30-50% of high-grade breast cancers.309, 310 It is noteworthy that MYC amplification and β-catenin-independent expression are particularly common in triple-negative/basal breast cancer, whereas in ER-positive breast cancer MYC activity is sustained in an estrogen-dependent manner, and estrogen-independent MYC expression is associated with resistance to endocrine treatment.310, 311 While the canonical WNT pathway contributes to neoplastic transformation, increased activity of the non-canonical WNT pathway via NFAT and JNK endows breast cancer cells with invasive and migratory properties.309, 312 Therefore, assuming that the balance of the WNT pathway in c.1100delC carrier tumors had shifted towards the non-canonical branch, there should also have been other indications of invasiveness on the gene expression level. This, however, was not the case. Instead, CDH1 and genes of the apical junction had higher expression in c.1100delC tumors than in non-carrier tumors. In summary, it is difficult to draw conclusions about the role of the WNT pathway based on the gene expression data (I). The signals are contradictory and, in addition to true driver events, may also reflect compensatory mechanisms required to secure the tumor cell integrity. WNT pathway and MYC-related signaling are essential drivers of breast carcinogenesis,309, 310 but how they complement CHEK2 deficiency remains a topic for further investigations.
Another perspective on c.1100delC-associated breast cancer development was provided by the literature-based functional characterization of the common variants modifying breast cancer risk of c.1100delC carriers (III: unpublished data). The six nominally significant and independent risk variants (Table 6 (unpublished data)) were associated with genes that could be linked to the WNT pathway (Appendix: Supplementary Table 1, Supplementary Table 2). Loss of the candidate culprit gene of rs11075995, FTO, has been reported to cause downregulation of the canonical WNT pathway and upregulation of the non-canonical Ca2+ signaling branch (Figure 9).265 The risk allele of rs11249433 is associated with elevated expression of NOTCH2, which is a direct downstream target of the canonical WNT pathway and TCF/LEF1.263, 266 rs11780156 is located in a gene desert downstream of MYC and its regulator PVT1. The locus of rs2981582 has been shown to regulate FGFR2 expression in breast fibroblasts,262 and FGFR2 regulates WNT pathway target gene expression via direct interaction with beta-catenin.269, 270 rs4808801 is associated with expression of ELL (Appendix: Supplementary Table 1), which regulates the WNT pathway in interaction with DVL, upstream of beta-catenin.313 Lastly, rs2363956 tags another variant located in a MYC binding site. Taken together, these associations do not themselves provide any mechanistic information, and the direction of the effects caused by these common variants on the culprit genes potentially contributing to a single pathway can at best only be speculated. It is noteworthy that the associations between the candidate variants and c.1100delC were only nominally significant without correction for multiple testing. Furthermore, no significant interaction suggesting deviation from the multiplicative model including all 74 variants was found (III: Table S4). However, the enrichment of associations with the WNT pathway among the top-ranking candidate risk modifier variants could indicate that WNT pathway deregulation is an early event and possibly a rate-limiting step in the development of breast cancer in c.1100delC carriers.

6.1.6 Hypothetical model for c.1100delC-associated breast cancer progression (I, III)

To summarize the model of c.1100delC-associated carcinogenesis with reference to the Cancer Hallmarks (Figure 2),14 c.1100delC itself accounts for both ‘Genome instability and mutation’ and ‘Resisting cell death’, possibly mostly via decreased CHEK2 protein expression.1, 28, 189, 202, 209, 216 According to the genomic analysis of c.1100delC carrier tumors, the loss of GBP genes on 1p21.3-22.2 could be a novel driver candidate of c.1100delC-associated breast cancer, contributing possibly to three cancer hallmarks and one enabling characteristic, namely ‘Avoiding immune destruction’, ‘Inducing angiogenesis’, ‘Deregulating cellular energetics’, and ‘Tumor-promoting inflammation’.283-290 Furthermore, increased activity of olfactory receptor signaling and the non-canonical WNT pathway are presented here as potential candidates promoting the fifth hallmark of ‘Activating invasion and metastasis’.302, 309, 312 In terms of the remaining hallmarks related to proliferation and ‘Enabling replicative immortality’, c.1100delC carrier tumors do not seem to markedly differ from non-carrier tumors. Estrogen is the most prominent growth factor in mammary tissue and deregulated signaling via estrogen receptor downstream pathways probably is the most important event contributing to ‘Sustaining proliferative signaling’ and ‘Evading the growth suppressors’ in both c.1100delC and non-carrier breast cancers.53, 94

6.1.7 Is p.(I157T) ‘the first hit’ for germline mutation carriers (II)?

The results from differential gene expression analysis of p.(I157T) carrier tumors could be interpreted to be related to the invasive growth pattern typical for lobular breast cancer, where the normal contacts between epithelial cells sustaining the tissue integrity have been lost. CDH1 expression was lower in p.(I157T) carrier tumors than in non-carrier tumors (II: Figure 3). Similarly, the expression of gene sets previously reported to respond to CDH1 knock-down (II: Table S6, ONDER_CDH1_TARGETS_2_UP, ONDER_CDH1_TARGETS_2_DN) was consistent with this in p.(I157T) carrier tumors, confirming that the CDH1 activity in these tumors was reduced. Other gene sets enriched at high or low expression in p.(I157T) vs. non-carrier tumors suggested that the p.(I157T) tumors would have gone through epithelial-to-mesenchymal transition. However, this could have been a reflection of the discohesive growth pattern caused by the loss of CDH1 activity, because if p.(I157T) were associated with more aggressive and invasive breast cancer, this should also have been seen in associations with higher grade and worse patient survival, which was not the case here (II: Table 1, Table 3). Lastly, the top genes with higher expression in p.(I157T) included a significantly elevated number of collagen genes, suggesting a higher degree of stromal contamination in the p.(I157T) tumors than in non-carrier tumors, which is another feature associated with lobular breast cancer and, on the other hand, speaks against increased invasiveness of the p.(I157T) tumors.314, 315 In conclusion, p.(I157T) is associated with diagnosis of lobular breast cancer, and molecular features characteristic of lobular breast cancer are also present in many p.(I157T) carrier tumors with other histologic diagnoses (II). Lobular breast cancer, the most common ‘special’ histological breast cancer subtype, is characterized by small, round, unattached cells invading the stroma alone or in single files.
Loss of CDH1 function has been recognized as an early event in lobular breast cancer and suggested as a characteristic feature aiding in diagnosis of borderline cases.315-317 Lobular breast cancer is largely driven by the lifelong cumulative exposure to female sex hormones. The reproductive risk factors have a stronger association with lobular breast cancer than with ductal cancer. For example, changes in the use of post-menopausal hormone replacement therapy induced respective fluctuations in the incidence of lobular breast cancer at the same time as the incidence of ductal breast cancer remained essentially constant. Furthermore, lobular breast cancer patients are on average three years older at diagnosis than patients diagnosed with ductal cancer and have larger tumors and more advanced stage.318 This is possibly caused by the fact that lobular tumors are hard to detect by palpation or mammography,315, 318 giving the tumors more time to develop before detection. Alternatively, the difference in diagnosis age could be explained if the tumor initiating and driving events took place at an older age, whereby the nature of the events would be influenced by the intrinsic biology of the mammary gland at involution favoring the lobular phenotype.

It is tempting to speculate that the differences between c.1100delC and p.(I157T) are mainly related to the age at which the tumor driver events take place. Carriers of p.(I157T) are diagnosed at an older age (57.9 years) than c.1100delC carriers (54.3 years; II: Table 1), and possibly certain types of driver events are more likely to take place in premenopausal than postmenopausal mammary glands. Furthermore, because the effect of p.(I157T) on CHEK2 function is milder, it is possible that the accumulation of somatic mutations is slower in p.(I157T) carrier cells and the total loss of CHEK2 function is likely to take place later than in c.1100delC carrier cells. However, the data included in this work do not support this hypothesis. First, the association between p.(I157T) and lobular breast cancer or molecular-level lobular features cannot be explained by age because the p.(I157T) carriers did not differ from non-carriers either in the BCAC dataset or in the gene expression dataset by age at diagnosis.
On the other hand, the c.1100delC carriers in Study I were on average older than the non-carriers. Furthermore, the six risk variants with nominally significant associations with c.1100delC-associated breast cancer were not associated with earlier diagnosis age or premenopausal breast cancer.9

The evidence presented in this work suggests different models for c.1100delC- and p.(I157T)-associated breast cancer and different roles for CHEK2 in the development of breast cancer for c.1100delC and p.(I157T) carriers. The c.1100delC mutation is likely to be a true initiating factor in the mutation carrier breast cancers. Compromised CHEK2 activity as a result of lowered expression causes increased risk of somatic mutations and occurrence of further neoplastic events. Whether the preneoplastic CHEK2-deficient cells ever develop into a full-blown tumor may depend primarily on interactions between the preneoplastic cells and the innate and adaptive immune system. By contrast, the loss of CHEK2 might not be an early or even a late event in p.(I157T) carrier tumors, the cancer being driven by other factors in most cases.
As a matter of fact, a normal level of CHEK2 protein expression was previously detected in four out of five examined invasive breast cancers from p.(I157T) carriers.2 Loss of CDH1 function as a result of large-scale deletion or point mutation has been suggested as an early event in lobular breast cancer.315-317 Interestingly, loss of 22q12.1 covering the CHEK2 gene has been reported by four studies to be a recurrent event in lobular breast cancer, suggesting that loss of CHEK2 could bring a growth advantage to lobular neoplasia.319-322 This could explain the unexpectedly high number of p.(I157T) carriers among patients diagnosed with lobular cancer, but raises the question of why lobular cancer is rare among carriers of c.1100delC and other truncating mutations.224 One possible answer could be the order of events: if CHEK2 function were lost first, subsequent loss of CDH1 could be disadvantageous in clonal evolution and selected against, in agreement with our observation that CDH1 and genes involved in apical junction had higher expression in c.1100delC carrier than in non-carrier tumors. By contrast, if CDH1 is lost first, there could be an increased advantage in slightly compromised fidelity in control of cell cycle checkpoints and spindle assembly as a result of p.(I157T) or loss of CHEK2, so that these would be targeted by positive selection.

6.2 Survival of breast cancer patients carrying germline CHEK2 mutations (II)

Our analyses of the large international BCAC dataset indicated that the overall or breast cancer-specific survival of p.(I157T) carriers did not differ from survival of non-carriers. However, the risk of locoregional relapse and risk of second breast cancer were marginally increased, especially in multivariate models adjusted for conventional clinico-pathological prognostic factors (II: Table 3). Compared with c.1100delC carriers, p.(I157T) carriers had better prognosis irrespective of the analysis endpoint, in agreement with a previous study analyzing the survival of c.1100delC carriers and non-carriers in essentially the same dataset.218

6.2.1 Increased mortality associated with c.1100delC

The analysis endpoints used in Study II are not necessarily connected (II: Table 3), although all refer to adverse events occurring after and presumably as a consequence of breast cancer. Death from any cause is an imperfect estimator of breast cancer-associated mortality since primary breast cancer is rarely a life-threatening disease. Mortality is increased only after metastasis to vital organs. However, in many BCAC studies, death from any cause is the only unbiased estimate available. Breast cancer-associated death or occurrence of distant metastasis would be the best measures of disease severity, but these records were comprehensively provided only by a subgroup of BCAC studies; thus, the number of patients included in these analyses was substantially lower than in the overall analysis, hindering the search for significant associations. There are at least three important mechanisms mediating survival associations of germline mutations, the most important being intrinsic tumor aggressiveness.
Subtyping, grading, and staging all aim to measure the aggressiveness in terms of proliferation and invasiveness.37, 61, 83, 97 If a mutation predisposes to a particularly aggressive type of cancer, the mutation would also be associated with poor patient prognosis, as in the case of BRCA1, PALB2 and FANCM, whose risk mutations increase the risk of triple-negative breast cancer.169, 170, 323 Apparent survival association could also be an outcome of either good or poor response to adjuvant therapy. For example, breast cancer patients carrying germline BRCA1 mutations have been reported to have good response to treatment with PARP inhibitors, but poor response to taxane therapy.324-326 In order to avoid bias caused by the availability of different adjuvant regimens at the time of cohort recruitment, the treatment choice should be included in retrospective studies of BRCA1-associated patient survival. A third important factor influencing patient survival via tumor invasiveness and metastatic potential is the control of local and systemic microenvironments, where the immune response plays a crucial role.32, 33, 327

The survival analyses suggesting that c.1100delC would be associated with increased mortality of breast cancer patients were adjusted for phenotypic features measuring tumor aggressiveness.218 Furthermore, the proportion of poor prognosis subtypes (Luminal B, Basal, and Her2) was lower in c.1100delC carriers than in non-carriers (II: Table 1). Therefore it seems unlikely that an especially aggressive tumor phenotype would be the cause of reduced survival of breast cancer patients carrying c.1100delC.

The effect associated with treatment choice was not addressed in our study. However, this should be taken into careful consideration when proceeding with analyses on CHEK2-associated breast cancer patient survival in future studies now that the evidence is accumulating for sensitivity and resistance associated with BRCA1.324-326, 328 Carriers of predisposing germline CHEK2 mutations would be expected to have a similar response to treatment because of the related functions of CHEK2 and BRCA1 in DNA double-strand break repair and in regulation of spindle assembly.28, 189, 200 A verified association between CHEK2 and treatment outcome would have implications for a wider spectrum of breast cancer patients than just the mutation carriers since CHEK2 has been reported to be lost in about 20% of unselected breast cancers.216 It is noteworthy that the relatively high frequency of loss of CHEK2 expression in non-carrier tumors could be a confounding factor in treatment outcome analyses, diluting possible effects associated with the germline mutations. In an ideal situation, the analyses would be adjusted for an immunohistochemical measure of CHEK2 expression in patients’ tumors. Previously, CHEK2 mutation carrier tumors have been reported to have an unfavorable response to anthracycline-based neoadjuvant chemotherapy in a small study including three patients.329 On the other hand, a recent study found no difference in survival of c.1100delC carriers treated with anthracycline-based and non-anthracycline-based chemotherapy.330 In this study, however, the impact of taxanes was not taken into account. All in all, retrospective studies on treatment outcome should factor in the treatment choice and carry out parallel functional experiments to clarify the mechanisms causing enhanced or reduced patient survival.
No reports on the frequency of infiltrating lymphocytes or the tumor immunogenicity related to germline CHEK2 mutations have been published to date. The findings presented in this thesis (I) suggest that the loss of the GBP gene cluster on 1p21.3-22.2 would complement CHEK2 deficiency associated with germline c.1100delC in development of breast cancer. The loss could possibly regulate the tumor microenvironment, affecting tumor-promoting inflammation and immune surveillance, and thus contributing to reduced survival of c.1100delC carriers.

6.2.2 CHEK2 mutations and increased risk of local recurrence or new primary tumors

Compared with distant metastasis, the causal relation of locoregional relapse or second breast cancer to breast cancer mortality is less clear. These two analysis endpoints could have been intertwined in our analyses, because ‘second breast cancer’ included ipsilateral cases, some of which may have been only local recurrences. Furthermore, some of the events classified as ‘locoregional relapse’ could have actually been new primaries since genomic analyses confirming or excluding clonality had not been performed. Both c.1100delC and p.(I157T) were associated with increased risk of second breast cancer. The hazard ratios were consistent with the primary risks associated with these mutations (II: Table 3),3-5 and it is unlikely that the mutations would cause any surplus risk of second breast cancer.219 The marginally significant finding that carriers of either of the two CHEK2 mutations had increased hazard of locoregional relapse was in agreement with a previous report studying c.1100delC in patients treated with breast-conserving surgery and radiotherapy.331 The adverse effect of ionizing radiation on CHEK2 mutation carriers had also been suggested earlier,332 and poor treatment outcome could at least partly explain the observed increase in risk of local relapse. However, this topic would need to be addressed in future studies with more complete records on treatment history.

6.2.3 p.(I157T), lobular carcinoma, and patient survival warrant further research

Lobular breast cancer has been associated with reduced long-term survival,333, 334 and loss of CDH1 expression has been suggested as an independent adverse prognostic factor in breast cancer.335-338 With this background, it was surprising that p.(I157T) was not associated with reduced survival, even though it was associated with lobular breast cancer in the BCAC dataset and lowered CDH1 activity in the gene expression analysis. However, in the referenced studies, the poor long-term prognosis of lobular cancer patients was visible only after ten years following the diagnosis, and our analyses may have failed to capture this effect due to an overly short follow-up or a high proportion of prevalent cases among subjects at risk at the later time-points. On the other hand, in the studies of non-lobular tumors, CDH1 downregulation possibly represented a marker for epithelial-to-mesenchymal transition,316 whereas in Study II low CDH1 activity was more likely to be an indication of infiltrating lobular-type growth pattern, as discussed above.

6.3 Common genetic variants in breast cancer risk prediction (III, IV)

6.3.1 PRS could be used in risk stratification of c.1100delC carriers (III)

Higher values of a polygenic risk score (PRS) based on 74 common predisposing variants were associated with increased risk of breast cancer for CHEK2 c.1100delC carriers. The effect size was comparable to previous more stable estimates made in a much larger group of breast cancer cases.10 When PRS was used to stratify c.1100delC carriers into categories of high and low lifetime risk, the 20% of carriers at highest risk were estimated to have upwards of 30% lifetime risk.
Correspondingly, for the 20% of carriers at lowest risk the lifetime risk would be comparable to the population average, about 10%. Further extrapolation of the model suggested that for the 10% of carriers at the high end of the PRS distribution the lifetime risk would exceed 40%, which has been considered as the threshold of the high-risk category in Finland. In Finland, the relevance of c.1100delC in genetic counseling is attributable to its high carrier frequency, 1.4%.1 If ten percent of carriers fall in the high-risk group, this corresponds to 0.14% of all women, or about 40 women in each annual birth cohort.339 As a comparison, BRCA1 mutations have about 0.2% carrier frequency in many Western populations and BRCA2 mutations slightly higher, about 0.4-0.5% frequency.340 Thus, using CHEK2 mutation analysis as a part of risk estimation for women with a positive family history would be well founded, especially in combination with the PRS.

6.3.2 No epistatic interaction exists between c.1100delC and the common variants (III)

In pairwise interaction analyses of c.1100delC and common predisposing variants, we did not detect any deviation from the assumed multiplicative model (III: Table S4), suggesting that the risk effect associated with the common variants would be roughly the same for c.1100delC carriers and non-carriers. The common variants included in Study III were estimated to explain about 14% of the disease heritability and the most recent novel loci an additional 4%, which together with the 20% associated with mutations in high- and moderate-risk genes summed up to about 38%.9, 153 The ‘missing heritability’ has been suggested to be partly attributable to an epistatic interaction between loci.157, 158, 341 According to a hypothetical ‘limiting pathway model’, genetic variation affecting any single pathway would increase the risk in a linear fashion, whereas parallel variation in another pathway would confer a rapid increase in risk.158 Despite a systematic search, no significant epistatic effect associated with breast cancer predisposition has been discovered thus far.342, 343 Furthermore, fine mapping and functional studies of the established risk loci have indicated that the common variants per se contribute to breast cancer predisposition by regulating the activity of nearby genes.261-263, 344-346 However, even though epistasis appears to be rare in breast cancer genetics, this does not mean that there is none. Studies on c.1100delC and p.(I157T) carrier tumor genomes presented in this work could assist in building models for tumorigenesis associated with these mutations. Furthermore, the discovered driver regions and genes could serve as candidates in further analyses of genetic risk modifiers.

6.3.3 PRS explains part of the increased familial risk (IV)

Our results confirm the hypothesis that some part of familial clustering of breast cancer can be explained by aggregation of common variants with low individual effect sizes.8 The PRS values of both healthy and affected members of breast cancer families were elevated relative to the general population level (IV: Table 1). Furthermore, the magnitude of risk (OR 1.55 [1.26–1.91] per unit standard deviation) associated with the PRS within breast cancer families, when comparing healthy and affected individuals, was very similar to estimates made between unselected cases and population controls (ORs 1.47 [1.38-1.62] in IV and 1.55 [1.52-1.58] in Mavaddat et al.8). Earlier, the risk associated with a PRS of 22 common variants was studied in Australian breast cancer families.
That study reported an OR of 1.88 [0.99-3.25] when comparing the highest quartile against the lowest quartile.347 The 22 variants known then, in 2011, were estimated to explain about 8% of excess familial risk of breast cancer,348, 349 whereas with the 75 variants the proportion of explained heritability rose to 14%.9 Addition of novel predisposing variants enhanced the discriminatory potential of the PRS, since we reported a comparable risk effect (OR 1.88 [0.93-3.78]) for women at the 80th-90th percentile of the PRS distribution compared with the average PRS. Now that an additional 65 loci have been confirmed in Oncoarray analyses to be significantly associated with breast cancer predisposition, their incorporation into the PRS is expected to improve it further.153 However, although all predisposing variants, even those below the significance threshold, were included, a large proportion of breast cancer heritability remains unexplained and the PRS incomplete.

In Study IV, the BOADICEA score, which measures the family history of breast cancer, was weakly correlated with the PRS in breast cancer cases, but not in healthy women from the same families. Similarly, the effect of family history in predicting breast cancer was previously reported to be attenuated when the PRS was included in the same model.10 In other words, the PRS explains part of the familial risk. However, when considering individual consultands, the PRS gives information beyond the family history. For example, sisters of the same family have the same familial risk, but each has a unique genome and therefore a different PRS, which could be used in analyzing which sisters have inherited the risk variants segregating in that particular family.

In addition to individual-level variation in PRS, there is also variation between families. Even though the PRS on average is elevated in breast cancer families, some families have lower values than other families, suggesting that the number of currently known common risk variants is lower in some families than in others.
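As a rough illustration of how a per-standard-deviation odds ratio relates to the risk of women at a given PRS percentile, the following sketch can be used. It assumes, purely for illustration and not as a result of Study IV, that the standardized PRS is normally distributed and that the log odds ratio is linear in the PRS. The function name `or_at_percentile` is hypothetical. The categorical estimates quoted above were derived differently and need not coincide with this idealized calculation.

```python
from statistics import NormalDist


def or_at_percentile(or_per_sd: float, percentile: float) -> float:
    """Odds ratio, relative to the mean PRS, for a woman at a given percentile.

    Assumption (illustrative only): the PRS is standard normal and the
    log odds ratio increases linearly with the PRS.
    """
    if not 0.0 < percentile < 100.0:
        raise ValueError("percentile must lie strictly between 0 and 100")
    # Convert the percentile into a PRS value expressed in SD units.
    z = NormalDist().inv_cdf(percentile / 100.0)
    return or_per_sd ** z


# Per-SD odds ratio reported within breast cancer families in the text.
OR_PER_SD = 1.55

for pct in (10, 50, 85, 90):
    print(f"percentile {pct}: OR vs. mean = {or_at_percentile(OR_PER_SD, pct):.2f}")
```

Under these assumptions the odds ratio at the median equals one by construction, and the values at symmetric percentiles (for example the 10th and the 90th) are reciprocals of each other.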

In the 52 families, the number of individuals with low PRS values was small, but in this group the PRS did not appear to have much predictive potential. Most of the variants included in the 75-variant PRS have stronger associations with ER-positive breast cancer than with ER-negative breast cancer.10 Had ER-negative breast cancer been more common in women with low PRS values, the poor predictive potential of the PRS at the low end would have been explained. However, this was not the case. Instead, it seemed likely that in the families with low PRS, the effect of unidentified risk variants exceeded the effect of identified variants, and because the unidentified variants do not segregate together with the known variants, the PRS based on known variants lost its discriminatory power.

All in all, our analyses suggest that the high end of PRS could be used in risk stratification of women from breast cancer families, preferably using a model with both the family history and the PRS incorporated into it. However, caution should be exercised when interpreting the low PRS values in a familial context, as long as the PRS is far from complete.
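The arithmetic behind the figures quoted in Sections 6.3.1 and 6.3.2 can be retraced with a short sketch. All input values are taken from the text. The size of an annual birth cohort is not stated in the text and is only back-calculated here from the quoted "about 40 women", so it should be read as an implied, approximate figure. The function names are hypothetical.

```python
# Values quoted in the text
CARRIER_FREQUENCY = 0.014   # c.1100delC carrier frequency in Finland
HIGH_RISK_SHARE = 0.10      # share of carriers with lifetime risk above 40%
WOMEN_PER_COHORT = 40       # "about 40 women of each annual birth cohort"


def high_risk_fraction(carrier_frequency: float, share: float) -> float:
    """Fraction of all women who are carriers in the high-risk PRS group."""
    return carrier_frequency * share


def implied_cohort_size(women: float, fraction: float) -> float:
    """Cohort size implied by a head count and a population fraction."""
    return women / fraction


def explained_heritability(*components: float) -> float:
    """Sum of the heritability components attributed to known factors."""
    return sum(components)


fraction = high_risk_fraction(CARRIER_FREQUENCY, HIGH_RISK_SHARE)
print(f"High-risk carriers among all women: {fraction:.2%}")
print(f"Implied women per birth cohort: {implied_cohort_size(WOMEN_PER_COHORT, fraction):.0f}")

# Common variants (14%), most recent novel loci (4%), high- and moderate-risk genes (20%)
total = explained_heritability(0.14, 0.04, 0.20)
print(f"Heritability accounted for: {total:.0%}")
```

The sketch only reproduces the multiplication and addition stated in the text; it does not model uncertainty in any of the underlying estimates.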

7 Summary and Conclusions

The work presented in this thesis validated the usability of a polygenic risk score for c.1100delC mutation carriers and Finnish women with a positive family history of breast cancer. Among consultands with a positive family history not fulfilling the Lund criteria350 or consultands carrying c.1100delC with about 20% lifetime risk, the PRS could be used to identify women at high (30-40%) lifetime risk for intensive follow-up programs. Therefore, as soon as proper and standardized methods combining the PRS with pedigree information are developed, the PRS could be adopted in clinical practice.

It is noteworthy that for c.1100delC carriers the lower PRS values indicate reduced absolute risk such that for about 20% of the carriers the risk is comparable to that in the general population. However, a similar association was not seen within breast cancer families. Instead, low PRS values were relatively rare within the families and lacked predictive potential. Therefore, as long as a notable proportion of breast cancer heritability remains unexplained, care should be taken in interpreting the low PRS values so that women at elevated risk due to their family history of breast cancer are not deprived of regular surveillance and counseling.

The analyses of c.1100delC- and p.(I157T)-associated tumor phenotype suggested that the two mutations could be associated with different tumor etiologies. This was unexpected since both mutations cause reduced CHEK2 kinase activity, c.1100delC via abrogation of the mutated allele accompanied by generally lowered expression, and p.(I157T) via reduced autophosphorylation required for kinase domain activation. Thus, the difference between the effects of the two mutations was assumed to be more quantitative than qualitative in nature. We were able to nominate putative driver events for breast cancer of c.1100delC carriers, but these were not replicated in p.(I157T) carrier tumors.
Furthermore, CDH1 expression was elevated in c.1100delC carrier tumors, but reduced in p.(I157T) carrier tumors relative to non-carriers, in agreement with the association between p.(I157T) and lobular cancer, which is not shared with the truncating CHEK2 mutations. The two mutations could have different roles in promoting cancer development, c.1100delC typically being a true causal mutation initiating tumorigenesis and p.(I157T) being more often only an accelerating or risk-modifying factor for a cancer with CHEK2-independent origin.

The genomic analysis of c.1100delC carrier tumors highlighted two protein families that could complement CHEK2 deficiency in the development of breast cancer. Recurrent loss of 1p21.3-22.2 suggested that silencing of the GBP gene/protein family could confer a growth advantage for breast cancer driven by loss of CHEK2 activity, possibly via promotion of cancer-driving inflammation or via escape from immune surveillance. Increased expression of olfactory receptors was another feature significantly associated with germline c.1100delC. Further study of these novel hypotheses would require a CHEK2 knock-down organoid or mouse model. Examination of the epithelial cell phenotype after targeted silencing of GBP genes or stimulation of olfactory receptor genes, with or without concurrent exposure to estrogen, could ascertain the role of these two gene families in c.1100delC-associated breast cancer. The study of c.1100delC carrier tumors could also identify future directions in the search for germline risk modifiers in the putative driver regions.

The causes of c.1100delC-associated poor survival are yet to be explained, and the effect of treatment choice cannot be excluded. On the other hand, if GBP silencing was complementary to c.1100delC in breast cancer development, as hypothesized in this work, it could explain the reduced survival as a result of compromised immune surveillance. Unlike c.1100delC, p.(I157T) was not associated with increased breast cancer mortality. However, as discussed above, c.1100delC tumors are phenotypically different from p.(I157T) carrier tumors and in view of this background the difference in survival association is not unexpected.

Unsolved issues include the marginal association of both CHEK2 mutations with increased risk of locoregional relapse. Further studies are also warranted for investigating the response to DNA-damaging chemotherapy in CHEK2-deficient cancer to determine whether the CHEK2 mutations could be used in treatment stratification. Optimally, these studies would need to take into consideration the somatic loss of CHEK2 function in non-carrier tumors as well as potential differences in the effects of the germline mutations.

8 Acknowledgments

This study was carried out at the Department of Obstetrics and Gynecology, Helsinki University Hospital, during 2009-2018. I thank the foundations that have financially supported this work: the Finnish Cultural Foundation, the Ida Montin Foundation, the Cancer Society of Finland, the Orion Research Foundation, the Sigrid Juselius Foundation, the Helsinki Hospital Research Fund, and the Academy of Finland.

I wish to express my sincere gratitude to all people who have supported my work, especially:

The former and present head of the Department of Obstetrics and Gynecology, Professor Jorma Paavonen and Professor Juha Tapanainen, respectively, for excellent research facilities and a fruitful working environment, and Professor Tapanainen also for accepting the post of custos at the defense of my dissertation.

My supervisor, Adjunct Professor Heli Nevanlinna, for the opportunity to work in her group and for believing in me. I greatly appreciate the challenges in various research projects that she entrusted to me, her advice and encouragement, and the chance to work on international collaborative projects. Her optimism and enthusiasm for research have been a wonderful example for my own career as a scientist.

My supervisor, Associate Professor Dario Greco, for his advice and support of my work. With his vast experience in ‘omics’, he was always able to help find a way around the cul-de-sacs and to advise on how to proceed with analyses and where to retrieve more information, without giving any easy answers. While he was working in our group, his passion for science motivated us all on a daily basis, and his teaching made complicated analysis processes simple and understandable.

Adjunct Professor Carl Blomqvist for intensive collaboration on the projects included in this work. His clinical expertise and extensive knowledge of published literature about breast cancer helped to focus and motivate my work.
I am grateful for the time he generously gave for my projects in our regular meetings, where his clinical point of view was always valuable.

The members of my thesis committee, Adjunct Professor Minna Pöyhönen and Adjunct Professor Outi Monni, for their support and encouragement; the official reviewers of my thesis, Professor Matti Nykter and Adjunct Professor Minna Tanner, for insightful comments that made this thesis more accurate and profound; and my author-editor Carol Ann Pelli for careful language revision.

I warmly thank my co-authors and collaborators:

All of the great scientists of the Breast Cancer Association Consortium with whom I have had the privilege to work. I owe my deepest gratitude to Professor Douglas Easton for his advice and guidance as well as for his hospitality during my visit to the Strangeways Research Laboratory, and Dr. Marjanka Schmidt for her friendly advice on so many aspects of this work, starting with detailed instructions on data handling and analyses and finishing with fine-tuning of publications. Furthermore, I would like to express my gratitude especially to Dr. Nasim Mavaddat, Professor Thilo Dörk and Professor Anna Jakubowska for sharing their expertise and for collaborating on the projects included in this thesis.

Professor Åke Borg, Dr. Markus Ringnér, Dr. Johan Vallon-Christersson, and Dr. Göran Jönsson for their collaboration and the introduction to microarray data analysis and the BASE database as well as for their hospitality during my visits to Lund.

Professor Kristiina Aittomäki and Adjunct Professor Päivi Heikkilä for their clinical expertise and interest in my research.

Research nurses Irja Erkkilä, Outi Malkavaara, and Hanna Jäntti for their diligence in patient data management and their patience with my recurrent data requests.

My present and former coworkers, especially Dr. Tuomas Heikkinen, my first instructor in the lab, whose dedication to the task and good spirit made the very first working days bright, and who continued to be a trusted senior fellow in the lab for many years; Liisa Pelttari and Johanna Kiiski, my fellow scientists with whom I have pushed through the doctoral education – I greatly value their peer support and good humor during the ups and downs of the process; Dr. Sofia Khan for her friendship, encouragement and devotion to science; all former and present personnel of the Nevanlinna lab – Anitta Tamminen, Jenny Forsström, Dr. Kati Kämpjärvi, Dr. Outi Kilpivaara, Dr. Johanna Tommiska, Dr. Kirsimari Aaltonen, Dr. Reetta Vainionpää, Marja-Liisa Nuotio, Gynel Arifsdhan, Rainer Fagerholm, Dr. Hanni Kärkkäinen, Dr. Netta Mäkinen, Maral Jamshidi, Ali Oghabian, Salla Ranta, Anna Nurmi, Dr. Maija Suvanto, Erja Nynäs and Himanshu Chheda – for their companionship and support, for sharing good and bad times, for the intelligent and not-always-so-intelligent-but-amusing discussion on science, politics, and everyday life. Good coworkers have always been one of the best parts of my work.
I am grateful to my parents Helinä and Hannu for their love, for all material and practical support, and for always letting me pursue my dreams; my siblings Turo, Sara and Mira for love and friendship; and most of all Paavo, my beloved husband and best friend, for his love and companionship, for his support and patience during these years, for his keen interest in my work, and for his readiness to challenge and discuss everything from methods to results and conclusions. Finally, I thank the breast cancer patients and their family members for participating in the study. This work would not have been possible without their contribution.

Helsinki, September 2018

References

1. Vahteristo P, Bartkova J, Eerola H, Syrjakoski K, Ojala S, Kilpivaara O, Tamminen A, Kononen J, Aittomaki K, Heikkila P, Holli K, Blomqvist C, Bartek J, Kallioniemi OP, Nevanlinna H. A CHEK2 genetic variant contributing to a substantial fraction of familial breast cancer. Am J Hum Genet 71:432-438, 2002. 2. Kilpivaara O, Vahteristo P, Falck J, Syrjakoski K, Eerola H, Easton D, Bartkova J, Lukas J, Heikkila P, Aittomaki K, Holli K, Blomqvist C, Kallioniemi OP, Bartek J, Nevanlinna H. CHEK2 variant I157T may be associated with increased breast cancer risk. Int J Cancer 111:543-547, 2004. 3. Schmidt MK, Hogervorst F, van Hien R, Cornelissen S, Broeks A, Adank MA, Meijers H, Waisfisz Q, Hollestelle A, Schutte M, van den Ouweland A, Hooning M, Andrulis IL, Anton-Culver H, Antonenkova NN, Antoniou AC, Arndt V, Bermisheva M, Bogdanova NV, Bolla MK, Brauch H, Brenner H, Bruning T, Burwinkel B, Chang-Claude J, Chenevix- Trench G, Couch FJ, Cox A, Cross SS, Czene K, Dunning AM, Fasching PA, Figueroa J, Fletcher O, Flyger H, Galle E, Garcia-Closas M, Giles GG, Haeberle L, Hall P, Hillemanns P, Hopper JL, Jakubowska A, John EM, Jones M, Khusnutdinova E, Knight JA, Kosma VM, Kristensen V, Lee A, Lindblom A, Lubinski J, Mannermaa A, Margolin S, Meindl A, Milne RL, Muranen TA, Newcomb PA, Offit K, Park-Simon TW, Peto J, Pharoah PD, Robson M, Rudolph A, Sawyer EJ, Schmutzler RK, Seynaeve C, Soens J, Southey MC, Spurdle AB, Surowy H, Swerdlow A, Tollenaar RA, Tomlinson I, Trentham-Dietz A, Vachon C, Wang Q, Whittemore AS, Ziogas A, van der Kolk L, Nevanlinna H, Dork T, Bojesen S, Easton DF. Age- and tumor subtype-specific breast cancer risk estimates for CHEK2*1100delC carriers. J Clin Oncol, 2016. 4. Easton DF, Pharoah PD, Antoniou AC, Tischkowitz M, Tavtigian SV, Nathanson KL, Devilee P, Meindl A, Couch FJ, Southey M, Goldgar DE, Evans DG, Chenevix-Trench G, Rahman N, Robson M, Domchek SM, Foulkes WD. Gene-panel sequencing and the prediction of breast-cancer risk. 
N Engl J Med 372:2243-2257, 2015. 5. Han FF, Guo CL, Liu LH. The effect of CHEK2 variant I157T on cancer susceptibility: Evidence from a meta-analysis. DNA Cell Biol 32:329-335, 2013. 6. Moller S, Mucci LA, Harris JR, Scheike T, Holst K, Halekoh U, Adami HO, Czene K, Christensen K, Holm NV, Pukkala E, Skytthe A, Kaprio J, Hjelmborg JB. The heritability of breast cancer among women in the nordic twin study of cancer. Cancer Epidemiol Biomarkers Prev, 2015. 7. Stratton MR, Rahman N. The emerging landscape of breast cancer susceptibility. Nat Genet 40:17-22, 2008. 8. Ponder BA, Antoniou A, Dunning A, Easton DF, Pharoah PD. Polygenic inherited predisposition to breast cancer. Cold Spring Harb Symp Quant Biol 70:35-41, 2005. 9. Michailidou K, Hall P, Gonzalez-Neira A, Ghoussaini M, Dennis J, Milne RL, Schmidt MK, Chang-Claude J, Bojesen SE, Bolla MK, Wang Q, Dicks E, Lee A, Turnbull C, Rahman N, Breast and Ovarian Cancer Susceptibility Collaboration, Fletcher O, Peto J, Gibson L, Dos Santos Silva I, Nevanlinna H, Muranen TA, Aittomaki K, Blomqvist C, Czene K, Irwanto A, Liu J, Waisfisz Q, Meijers-Heijboer H, Adank M, Hereditary Breast and Ovarian Cancer Research Group Netherlands (HEBON), van der Luijt RB, Hein R, Dahmen N, Beckman L, Meindl A, Schmutzler RK, Muller-Myhsok B, Lichtner P, Hopper JL, Southey MC, Makalic E, Schmidt DF, Uitterlinden AG, Hofman A, Hunter DJ, Chanock SJ, Vincent D, Bacot F, Tessier DC, Canisius S, Wessels LF, Haiman CA, Shah M, Luben R, Brown J, Luccarini C, Schoof N, Humphreys K, Li J, Nordestgaard BG, Nielsen SF, Flyger H, Couch FJ, Wang X, Vachon C, Stevens KN, Lambrechts D, Moisse M, Paridaens R, Christiaens MR, Rudolph A, Nickels S, Flesch-Janys D, Johnson N, Aitken Z, Aaltonen K, Heikkinen T, Broeks A, Veer LJ, van der Schoot CE, Guenel P, Truong T, Laurent-Puig P, Menegaux F, Marme F, Schneeweiss A, Sohn C, Burwinkel B, Zamora MP, Perez JI, Pita G, Alonso MR, Cox A, Brock IW, Cross SS, Reed MW, Sawyer EJ, Tomlinson I, Kerin MJ, Miller 
N, Henderson BE, Schumacher F, Le Marchand L, Andrulis IL, Knight JA, Glendon G, Mulligan AM, kConFab Investigators, stralian Ovarian Cancer Study Group, Lindblom A, Margolin S, Hooning MJ, Hollestelle A, van den Ouweland AM, Jager A, Bui QM, Stone J, Dite GS, Apicella C, Tsimiklis H, Giles GG, Severi G, Baglietto L, Fasching PA, Haeberle L, Ekici AB, Beckmann MW, Brenner H, Muller H, Arndt V, Stegmaier C, Swerdlow A, Ashworth A, Orr N, Jones M, Figueroa J, Lissowska J, Brinton L, Goldberg MS, Labreche F, Dumont M, Winqvist R, Pylkas K, Jukkola-Vuorinen A, Grip M, Brauch H, Hamann U, Bruning T, GENICA (Gene Environment Interaction and Breast Cancer in Germany) Network, Radice P, Peterlongo P, Manoukian S, Bonanni B, Devilee P, Tollenaar RA, Seynaeve C, van Asperen CJ, Jakubowska A, Lubinski J, Jaworska K, Durda K, Mannermaa A, Kataja V, Kosma VM, Hartikainen JM, Bogdanova NV, Antonenkova NN, Dork T, Kristensen VN, Anton-Culver H, Slager S, Toland AE, Edge S, Fostira F, Kang D, Yoo KY, Noh DY, Matsuo K, Ito H, Iwata H, Sueta A, Wu AH, Tseng CC, Van Den Berg D, Stram DO, Shu XO, Lu W, Gao YT, Cai H, Teo SH, Yip CH, Phuah SY, Cornes BK, Hartman M, Miao H, Lim WY, Sng JH, Muir K, Lophatananon A, Stewart-Brown S, Siriwanarangsan P, Shen CY, Hsiung CN, Wu PE, Ding SL, Sangrajrang S, Gaborieau V, Brennan P, McKay J, Blot WJ, Signorello LB, Cai Q, Zheng W, Deming-Halverson S, Shrubsole M, Long J, Simard J, Garcia-Closas M, Pharoah PD, Chenevix-Trench G, Dunning AM, Benitez J, Easton DF. Large-scale genotyping identifies 41 new loci associated with breast cancer risk. Nat Genet 45:353-361, 2013. 10. Mavaddat N, Pharoah PD, Michailidou K, Tyrer J, Brook MN, Bolla MK, Wang Q, Dennis J, Dunning AM, Shah M, Luben R, Brown J, Bojesen SE, Nordestgaard BG, Nielsen SF, Flyger H, Czene K, Darabi H, Eriksson M, Peto J, Dos-

66 Santos-Silva I, Dudbridge F, Johnson N, Schmidt MK, Broeks A, Verhoef S, Rutgers EJ, Swerdlow A, Ashworth A, Orr N, Schoemaker MJ, Figueroa J, Chanock SJ, Brinton L, Lissowska J, Couch FJ, Olson JE, Vachon C, Pankratz VS, Lambrechts D, Wildiers H, Van Ongeval C, van Limbergen E, Kristensen V, Grenaker Alnaes G, Nord S, Borresen-Dale AL, Nevanlinna H, Muranen TA, Aittomaki K, Blomqvist C, Chang-Claude J, Rudolph A, Seibold P, Flesch-Janys D, Fasching PA, Haeberle L, Ekici AB, Beckmann MW, Burwinkel B, Marme F, Schneeweiss A, Sohn C, Trentham-Dietz A, Newcomb P, Titus L, Egan KM, Hunter DJ, Lindstrom S, Tamimi RM, Kraft P, Rahman N, Turnbull C, Renwick A, Seal S, Li J, Liu J, Humphreys K, Benitez J, Pilar Zamora M, Arias Perez JI, Menendez P, Jakubowska A, Lubinski J, Jaworska-Bieniek K, Durda K, Bogdanova NV, Antonenkova NN, Dork T, Anton-Culver H, Neuhausen SL, Ziogas A, Bernstein L, Devilee P, Tollenaar RA, Seynaeve C, van Asperen CJ, Cox A, Cross SS, Reed MW, Khusnutdinova E, Bermisheva M, Prokofyeva D, Takhirova Z, Meindl A, Schmutzler RK, Sutter C, Yang R, Schurmann P, Bremer M, Christiansen H, Park-Simon TW, Hillemanns P, Guenel P, Truong T, Menegaux F, Sanchez M, Radice P, Peterlongo P, Manoukian S, Pensotti V, Hopper JL, Tsimiklis H, Apicella C, Southey MC, Brauch H, Bruning T, Ko YD, Sigurdson AJ, Doody MM, Hamann U, Torres D, Ulmer HU, Forsti A, Sawyer EJ, Tomlinson I, Kerin MJ, Miller N, Andrulis IL, Knight JA, Glendon G, Marie Mulligan A, Chenevix-Trench G, Balleine R, Giles GG, Milne RL, McLean C, Lindblom A, Margolin S, Haiman CA, Henderson BE, Schumacher F, Le Marchand L, Eilber U, Wang-Gohrke S, Hooning MJ, Hollestelle A, van den Ouweland AM, Koppert LB, Carpenter J, Clarke C, Scott R, Mannermaa A, Kataja V, Kosma VM, Hartikainen JM, Brenner H, Arndt V, Stegmaier C, Karina Dieffenbach A, Winqvist R, Pylkas K, Jukkola-Vuorinen A, Grip M, Offit K, Vijai J, Robson M, Rau-Murthy R, Dwek M, Swann R, Annie Perkins K, Goldberg MS, Labreche F, Dumont M, 
Eccles DM, Tapper WJ, Rafiq S, John EM, Whittemore AS, Slager S, Yannoukakos D, Toland AE, Yao S, Zheng W, Halverson SL, Gonzalez-Neira A, Pita G, Rosario Alonso M, Alvarez N, Herrero D, Tessier DC, Vincent D, Bacot F, Luccarini C, Baynes C, Ahmed S, Maranian M, Healey CS, Simard J, Hall P, Easton DF, Garcia-Closas M. Prediction of breast cancer risk based on profiling with common genetic variants. J Natl Cancer Inst 107:10.1093/jnci/djv036. Print 2015 May, 2015. 11. Foulkes WD. Inherited susceptibility to common cancers. N Engl J Med 359:2143-2153, 2008. 12. Knudson AG,Jr. Mutation and cancer: Statistical study of retinoblastoma. Proc Natl Acad Sci U S A 68:820-823, 1971. 13. Knudson AG. Two genetic hits (more or less) to cancer. Nat Rev Cancer 1:157-162, 2001. 14. Hanahan D, Weinberg RA. Hallmarks of cancer: The next generation. Cell 144:646-674, 2011. 15. Parker JS, Mullins M, Cheang MC, Leung S, Voduc D, Vickery T, Davies S, Fauron C, He X, Hu Z, Quackenbush JF, Stijleman IJ, Palazzo J, Marron JS, Nobel AB, Mardis E, Nielsen TO, Ellis MJ, Perou CM, Bernard PS. Supervised risk predictor of breast cancer based on intrinsic subtypes. J Clin Oncol 27:1160-1167, 2009. 16. Jonsson G, Staaf J, Vallon-Christersson J, Ringner M, Holm K, Hegardt C, Gunnarsson H, Fagerholm R, Strand C, Agnarsson BA, Kilpivaara O, Luts L, Heikkila P, Aittomaki K, Blomqvist C, Loman N, Malmstrom P, Olsson H, Johannsson OT, Arason A, Nevanlinna H, Barkardottir RB, Borg A. Genomic subtypes of breast cancer identified by array-comparative genomic hybridization display distinct molecular and clinical characteristics. Breast Cancer Res 12:R42, 2010. 17. Tomasetti C, Vogelstein B. Cancer etiology. variation in cancer risk among tissues can be explained by the number of stem cell divisions. Science 347:78-81, 2015. 18. Chaffer CL, Weinberg RA. How does multistep tumorigenesis really proceed? Cancer Discov 5:22-24, 2015. 19. Visvader JE. Cells of origin in cancer. Nature 469:314-322, 2011. 20. 
Gerlinger M, Rowan AJ, Horswell S, Larkin J, Endesfelder D, Gronroos E, Martinez P, Matthews N, Stewart A, Tarpey P, Varela I, Phillimore B, Begum S, McDonald NQ, Butler A, Jones D, Raine K, Latimer C, Santos CR, Nohadani M, Eklund AC, Spencer-Dene B, Clark G, Pickering L, Stamp G, Gore M, Szallasi Z, Downward J, Futreal PA, Swanton C. Intratumor heterogeneity and branched evolution revealed by multiregion sequencing. N Engl J Med 366:883-892, 2012. 21. Hanahan D, Weinberg RA. The hallmarks of cancer. Cell 100:57-70, 2000. 22. Floor SL, Dumont JE, Maenhaut C, Raspe E. Hallmarks of cancer: Of all cancer cells, all the time? Trends Mol Med 18:509-515, 2012. 23. McKenzie SJ. Diagnostic utility of oncogenes and their products in human cancer. Biochim Biophys Acta 1072:193- 214, 1991. 24. Wells SA,Jr, Santoro M. Targeting the RET pathway in thyroid cancer. Clin Cancer Res 15:7119-7123, 2009. 25. Ekvall S, Wilbe M, Dahlgren J, Legius E, van Haeringen A, Westphal O, Anneren G, Bondeson ML. Mutation in NRAS in familial noonan syndrome--case report and review of the literature. BMC Med Genet 16:95-015-0239-1, 2015. 26. Macleod K. Tumor suppressor genes. Curr Opin Genet Dev 10:81-93, 2000. 27. Kinzler KW, Vogelstein B. Landscaping the cancer terrain. Science 280:1036-1037, 1998.

67 28. Stolz A, Ertych N, Kienitz A, Vogel C, Schneider V, Fritz B, Jacob R, Dittmar G, Weichert W, Petersen I, Bastians H. The CHK2-BRCA1 tumour suppressor pathway ensures chromosomal stability in human somatic cells. Nat Cell Biol 12:492-499, 2010. 29. Bai F, Smith MD, Chan HL, Pei XH. Germline mutation of Brca1 alters the fate of mammary luminal cells and causes luminal-to-basal mammary tumor transformation. Oncogene 32:2715-2725, 2013. 30. Bussard KM, Boulanger CA, Booth BW, Bruno RD, Smith GH. Reprogramming human cancer cells in the mouse mammary gland. Cancer Res 70:6336-6343, 2010. 31. Maffini MV, Calabro JM, Soto AM, Sonnenschein C. Stromal regulation of neoplastic development: Age-dependent normalization of neoplastic mammary cells by mammary stroma. Am J Pathol 167:1405-1410, 2005. 32. Malladi S, Macalinao DG, Jin X, He L, Basnet H, Zou Y, de Stanchina E, Massague J. Metastatic latency and immune evasion through autocrine inhibition of WNT. Cell 165:45-60, 2016. 33. Mlecnik B, Bindea G, Kirilovsky A, Angell HK, Obenauf AC, Tosolini M, Church SE, Maby P, Vasaturo A, Angelova M, Fredriksen T, Mauger S, Waldner M, Berger A, Speicher MR, Pages F, Valge-Archer V, Galon J. The tumor microenvironment and immunoscore are critical determinants of dissemination to distant metastasis. Sci Transl Med 8:327ra26, 2016. 34. Torre LA, Bray F, Siegel RL, Ferlay J, Lortet-Tieulent J, Jemal A. Global cancer statistics, 2012. CA Cancer J Clin 65:87-108, 2015. 35. Engholm G, Ferlay J, Christensen N, et al: NORDCAN: Cancer incidence, mortality, prevalence and survival in the nordic countries, version 7.3, 08.07.2016 update, accessed 09.08.2017, http://www.ancr.nu 36. Engholm G, Ferlay J, Christensen N, Bray F, Gjerstorff ML, Klint A, Kotlum JE, Olafsdottir E, Pukkala E, Storm HH. NORDCAN--a nordic tool for cancer information, planning, quality control and research. Acta Oncol 49:725-736, 2010. 37. Mazen Sudah. Rintadiagnostiikan opas, 3. painos. Rintasyöpäryhmä, 2014. 38. 
Russo J, Russo IH. Development of the human breast. Maturitas 49:2-15, 2004. 39. Sternlicht MD. Key stages in mammary gland development: The cues that regulate ductal branching morphogenesis. Breast Cancer Res 8:201, 2006. 40. Tiede B, Kang Y. From milk to malignancy: The role of mammary stem cells in development, pregnancy and breast cancer. Cell Res 21:245-257, 2011. 41. Blakely CM, Stoddard AJ, Belka GK, Dugan KD, Notarfrancesco KL, Moody SE, D'Cruz CM, Chodosh LA. Hormone-induced protection against mammary tumorigenesis is conserved in multiple rat strains and identifies a core gene expression signature induced by pregnancy. Cancer Res 66:6421-6431, 2006. 42. Russo J, Santucci-Pereira J, de Cicco RL, Sheriff F, Russo PA, Peri S, Slifker M, Ross E, Mello ML, Vidal BC, Belitskaya-Levy I, Arslan A, Zeleniuch-Jacquotte A, Bordas P, Lenner P, Ahman J, Afanasyeva Y, Hallmans G, Toniolo P, Russo IH. Pregnancy-induced chromatin remodeling in the breast of postmenopausal women. Int J Cancer 131:1059- 1070, 2012. 43. Barton M, Santucci-Pereira J, Russo J. Molecular pathways involved in pregnancy-induced prevention against breast cancer. Front Endocrinol (Lausanne) 5:213, 2014. 44. Van Keymeulen A, Rocha AS, Ousset M, Beck B, Bouvencourt G, Rock J, Sharma N, Dekoninck S, Blanpain C. Distinct stem cells contribute to mammary gland development and maintenance. Nature 479:189-193, 2011. 45. Van Keymeulen A, Fioramonti M, Centonze A, Bouvencourt G, Achouri Y, Blanpain C. Lineage-restricted mammary stem cells sustain the development, homeostasis, and regeneration of the estrogen receptor positive lineage. Cell Rep 20:1525-1532, 2017. 46. Macias H, Hinck L. Mammary gland development. Wiley Interdiscip Rev Dev Biol 1:533-557, 2012. 47. Harris JR, Lippman ME, Morrow M, et al (eds): Diseases of the Breast (ed 5). Philadelphia, Lippincott Williams & Wilkins, 2014 48. Olver IN. Prevention of breast cancer. Med J Aust 205:475-479, 2016. 49. Rudolph A, Chang-Claude J, Schmidt MK. 
Gene-environment interaction and risk of breast cancer. Br J Cancer 114:125-133, 2016. 50. Boyd NF, Rommens JM, Vogt K, Lee V, Hopper JL, Yaffe MJ, Paterson AD. Mammographic breast density as an intermediate phenotype for breast cancer. Lancet Oncol 6:798-808, 2005. 51. Orr B, Kelley JL,3rd. Benign breast diseases: Evaluation and management. Clin Obstet Gynecol 59:710-726, 2016. 52. Travis RC, Key TJ. Oestrogen exposure and breast cancer risk. Breast Cancer Res 5:239-247, 2003.

AH, Pearce CL, Van Den Berg D, Pike MC, Dansonka-Mieszkowska A, Plisiecka-Halasa J, Moes-Sosnowska J, Kupryjanczyk J, Pharoah PD, Song H, Winship I, Chenevix-Trench G, Giles GG, Tavtigian SV, Easton DF, Milne RL. PALB2, CHEK2 and ATM rare variants and cancer risk: Data from COGS. J Med Genet 53:800-811, 2016. 155. Kiiski JI, Pelttari LM, Khan S, Freysteinsdottir ES, Reynisdottir I, Hart SN, Shimelis H, Vilske S, Kallioniemi A, Schleutker J, Leminen A, Butzow R, Blomqvist C, Barkardottir RB, Couch FJ, Aittomaki K, Nevanlinna H. Exome sequencing identifies FANCM as a susceptibility gene for triple-negative breast cancer. Proc Natl Acad Sci U S A 111:15172-15177, 2014.

75 156. Kuchenbaecker KB, Hopper JL, Barnes DR, Phillips KA, Mooij TM, Roos-Blom MJ, Jervis S, van Leeuwen FE, Milne RL, Andrieu N, Goldgar DE, Terry MB, Rookus MA, Easton DF, Antoniou AC, BRCA1 and BRCA2 Cohort C, McGuffog L, Evans DG, Barrowdale D, Frost D, Adlard J, Ong KR, Izatt L, Tischkowitz M, Eeles R, Davidson R, Hodgson S, Ellis S, Nogues C, Lasset C, Stoppa-Lyonnet D, Fricker JP, Faivre L, Berthet P, Hooning MJ, van der Kolk LE, Kets CM, Adank MA, John EM, Chung WK, Andrulis IL, Southey M, Daly MB, Buys SS, Osorio A, Engel C, Kast K, Schmutzler RK, Caldes T, Jakubowska A, Simard J, Friedlander ML, McLachlan SA, Machackova E, Foretova L, Tan YY, Singer CF, Olah E, Gerdes AM, Arver B, Olsson H. Risks of breast, ovarian, and contralateral breast cancer for BRCA1 and BRCA2 mutation carriers. JAMA 317:2402-2416, 2017. 157. Eichler EE, Flint J, Gibson G, Kong A, Leal SM, Moore JH, Nadeau JH. Missing heritability and strategies for finding the underlying causes of complex disease. Nat Rev Genet 11:446-450, 2010. 158. Zuk O, Hechter E, Sunyaev SR, Lander ES. The mystery of missing heritability: Genetic interactions create phantom heritability. Proc Natl Acad Sci U S A 109:1193-1198, 2012. 159. Smith EC. An overview of hereditary breast and ovarian cancer syndrome. J Midwifery Womens Health 57:577-584, 2012. 160. 
Mavaddat N, Barrowdale D, Andrulis IL, Domchek SM, Eccles D, Nevanlinna H, Ramus SJ, Spurdle A, Robson M, Sherman M, Mulligan AM, Couch FJ, Engel C, McGuffog L, Healey S, Sinilnikova OM, Southey MC, Terry MB, Goldgar D, O'Malley F, John EM, Janavicius R, Tihomirova L, Hansen TV, Nielsen FC, Osorio A, Stavropoulou A, Benitez J, Manoukian S, Peissel B, Barile M, Volorio S, Pasini B, Dolcetti R, Putignano AL, Ottini L, Radice P, Hamann U, Rashid MU, Hogervorst FB, Kriege M, van der Luijt RB, HEBON, Peock S, Frost D, Evans DG, Brewer C, Walker L, Rogers MT, Side LE, Houghton C, EMBRACE, Weaver J, Godwin AK, Schmutzler RK, Wappenschmidt B, Meindl A, Kast K, Arnold N, Niederacher D, Sutter C, Deissler H, Gadzicki D, Preisler-Adams S, Varon-Mateeva R, Schonbuchner I, Gevensleben H, Stoppa-Lyonnet D, Belotti M, Barjhoux L, GEMO Study Collaborators, Isaacs C, Peshkin BN, Caldes T, de la Hoya M, Canadas C, Heikkinen T, Heikkila P, Aittomaki K, Blanco I, Lazaro C, Brunet J, Agnarsson BA, Arason A, Barkardottir RB, Dumont M, Simard J, Montagna M, Agata S, D'Andrea E, Yan M, Fox S, kConFab Investigators, Rebbeck TR, Rubinstein W, Tung N, Garber JE, Wang X, Fredericksen Z, Pankratz VS, Lindor NM, Szabo C, Offit K, Sakr R, Gaudet MM, Singer CF, Tea MK, Rappaport C, Mai PL, Greene MH, Sokolenko A, Imyanitov E, Toland AE, Senter L, Sweet K, Thomassen M, Gerdes AM, Kruse T, Caligo M, Aretini P, Rantala J, von Wachenfeld A, Henriksson K, SWE-BRCA Collaborators, Steele L, Neuhausen SL, Nussbaum R, Beattie M, Odunsi K, Sucheston L, Gayther SA, Nathanson K, Gross J, Walsh C, Karlan B, Chenevix-Trench G, Easton DF, Antoniou AC, Consortium of Investigators of Modifiers of BRCA1/2. Pathology of breast and ovarian cancers among BRCA1 and BRCA2 mutation carriers: Results from the consortium of investigators of modifiers of BRCA1/2 (CIMBA). Cancer Epidemiol Biomarkers Prev 21:134-147, 2012. 161. Hisada M, Garber JE, Fung CY, Fraumeni JF,Jr, Li FP. 
Multiple primary cancers in families with li-fraumeni syndrome. J Natl Cancer Inst 90:606-611, 1998. 162. Farooq A, Walker LJ, Bowling J, Audisio RA. Cowden syndrome. Cancer Treat Rev 36:577-583, 2010. 163. Hemminki A, Markie D, Tomlinson I, Avizienyte E, Roth S, Loukola A, Bignell G, Warren W, Aminoff M, Hoglund P, Jarvinen H, Kristo P, Pelin K, Ridanpaa M, Salovaara R, Toro T, Bodmer W, Olschwang S, Olsen AS, Stratton MR, de la Chapelle A, Aaltonen LA. A serine/threonine kinase gene defective in peutz-jeghers syndrome. Nature 391:184-187, 1998. 164. Tomlinson IP, Houlston RS. Peutz-jeghers syndrome. J Med Genet 34:1007-1011, 1997. 165. Pharoah PD, Guilford P, Caldas C, International Gastric Cancer Linkage Consortium. Incidence of gastric cancer and breast cancer in CDH1 (E-cadherin) mutation carriers from hereditary diffuse gastric cancer families. Gastroenterology 121:1348-1353, 2001. 166. Meijers-Heijboer H, van den Ouweland A, Klijn J, Wasielewski M, de Snoo A, Oldenburg R, Hollestelle A, Houben M, Crepin E, van Veghel-Plandsoen M, Elstrodt F, van Duijn C, Bartels C, Meijers C, Schutte M, McGuffog L, Thompson D, Easton D, Sodha N, Seal S, Barfoot R, Mangion J, Chang-Claude J, Eccles D, Eeles R, Evans DG, Houlston R, Murday V, Narod S, Peretz T, Peto J, Phelan C, Zhang HX, Szabo C, Devilee P, Goldgar D, Futreal PA, Nathanson KL, Weber B, Rahman N, Stratton MR, CHEK2-Breast Cancer Consortium. Low-penetrance susceptibility to breast cancer due to CHEK2(*)1100delC in noncarriers of BRCA1 or BRCA2 mutations. Nat Genet 31:55-59, 2002. 167. Erkko H, Xia B, Nikkila J, Schleutker J, Syrjakoski K, Mannermaa A, Kallioniemi A, Pylkas K, Karppinen SM, Rapakko K, Miron A, Sheng Q, Li G, Mattila H, Bell DW, Haber DA, Grip M, Reiman M, Jukkola-Vuorinen A, Mustonen A, Kere J, Aaltonen LA, Kosma VM, Kataja V, Soini Y, Drapkin RI, Livingston DM, Winqvist R. A recurrent mutation in PALB2 in finnish cancer families. Nature 446:316-319, 2007. 168. 
Gorski B, Debniak T, Masojc B, Mierzejewski M, Medrek K, Cybulski C, Jakubowska A, Kurzawski G, Chosia M, Scott R, Lubinski J. Germline 657del5 mutation in the NBS1 gene in breast cancer patients. Int J Cancer 106:379-381, 2003.

76 169. Kiiski JI, Fagerholm R, Tervasmaki A, Pelttari LM, Khan S, Jamshidi M, Mantere T, Pylkas K, Bartek J, Bartkova J, Mannermaa A, Tengstrom M, Kosma VM, Winqvist R, Kallioniemi A, Aittomaki K, Blomqvist C, Nevanlinna H. FANCM c.5101C>T mutation associates with breast cancer survival and treatment outcome. Int J Cancer 139:2760-2770, 2016. 170. Heikkinen T, Karkkainen H, Aaltonen K, Milne RL, Heikkila P, Aittomaki K, Blomqvist C, Nevanlinna H. The breast cancer susceptibility mutation PALB2 1592delT is associated with an aggressive tumor phenotype. Clin Cancer Res 15:3214-3222, 2009. 171. Antoniou AC, Foulkes WD, Tischkowitz M. Breast-cancer risk in families with mutations in PALB2. N Engl J Med 371:1651-1652, 2014. 172. Goldgar DE, Healey S, Dowty JG, Da Silva L, Chen X, Spurdle AB, Terry MB, Daly MJ, Buys SM, Southey MC, Andrulis I, John EM, BCFR, kConFab, Khanna KK, Hopper JL, Oefner PJ, Lakhani S, Chenevix-Trench G. Rare variants in the ATM gene and risk of breast cancer. Breast Cancer Res 13:R73, 2011. 173. Tavtigian SV, Oefner PJ, Babikyan D, Hartmann A, Healey S, Le Calvez-Kelm F, Lesueur F, Byrnes GB, Chuang SC, Forey N, Feuchtinger C, Gioia L, Hall J, Hashibe M, Herte B, McKay-Chopin S, Thomas A, Vallee MP, Voegele C, Webb PM, Whiteman DC, Australian Cancer Study, Breast Cancer Family Registries (BCFR), Kathleen Cuningham Foundation Consortium for Research into Familial Aspects of Breast Cancer (kConFab), Sangrajrang S, Hopper JL, Southey MC, Andrulis IL, John EM, Chenevix-Trench G. Rare, evolutionarily unlikely missense substitutions in ATM confer increased risk of breast cancer. Am J Hum Genet 85:427-446, 2009. 174. 
Young EL, Feng BJ, Stark AW, Damiola F, Durand G, Forey N, Francy TC, Gammon A, Kohlmann WK, Kaphingst KA, McKay-Chopin S, Nguyen-Dumont T, Oliver J, Paquette AM, Pertesi M, Robinot N, Rosenthal JS, Vallee M, Voegele C, Hopper JL, Southey MC, Andrulis IL, John EM, Hashibe M, Gertz J, Breast Cancer Family Registry, Le Calvez-Kelm F, Lesueur F, Goldgar DE, Tavtigian SV. Multigene testing of moderate-risk genes: Be mindful of the missense. J Med Genet 53:366-376, 2016. 175. D'Andrea AD, Grompe M. The fanconi anaemia/BRCA pathway. Nat Rev Cancer 3:23-34, 2003. 176. Tan W, Deans AJ. A defined role for multiple fanconi anemia gene products in DNA-damage-associated ubiquitination. Exp Hematol 50:27-32, 2017. 177. Matt S, Hofmann TG. The DNA damage-induced cell death response: A roadmap to kill cancer cells. Cell Mol Life Sci 73:2829-2850, 2016. 178. Panier S, Durocher D. Push back to respond better: Regulatory inhibition of the DNA double-strand break response. Nat Rev Mol Cell Biol 14:661-672, 2013. 179. Wu W, Togashi Y, Johmura Y, Miyoshi Y, Nobuoka S, Nakanishi M, Ohta T. HP1 regulates the localization of FANCJ at sites of DNA double-strand breaks. Cancer Sci 107:1406-1415, 2016. 180. Prakash R, Zhang Y, Feng W, Jasin M. Homologous recombination and human health: The roles of BRCA1, BRCA2, and associated proteins. Cold Spring Harb Perspect Biol 7:a016600, 2015. 181. Chun J, Buechelmaier ES, Powell SN. Rad51 paralog complexes BCDX2 and CX3 act at different stages in the BRCA1-BRCA2-dependent homologous recombination pathway. Mol Cell Biol 33:387-395, 2013. 182. Xiao J, Liu CC, Chen PL, Lee WH. RINT-1, a novel Rad50-interacting protein, participates in radiation-induced G(2)/M checkpoint control. J Biol Chem 276:6105-6111, 2001. 183. Bogliolo M, Bluteau D, Lespinasse J, Pujol R, Vasquez N, d'Enghien CD, Stoppa-Lyonnet D, Leblanc T, Soulier J, Surralles J. Biallelic truncating FANCM mutations cause early-onset cancer but not fanconi anemia. Genet Med 20:458- 463, 2018. 
184. Ceccaldi R, Sarangi P, D'Andrea AD. The fanconi anaemia pathway: New players and new functions. Nat Rev Mol Cell Biol 17:337-349, 2016. 185. Easton DF, Pooley KA, Dunning AM, Pharoah PD, Thompson D, Ballinger DG, Struewing JP, Morrison J, Field H, Luben R, Wareham N, Ahmed S, Healey CS, Bowman R, SEARCH collaborators, Meyer KB, Haiman CA, Kolonel LK, Henderson BE, Le Marchand L, Brennan P, Sangrajrang S, Gaborieau V, Odefrey F, Shen CY, Wu PE, Wang HC, Eccles D, Evans DG, Peto J, Fletcher O, Johnson N, Seal S, Stratton MR, Rahman N, Chenevix-Trench G, Bojesen SE, Nordestgaard BG, Axelsson CK, Garcia-Closas M, Brinton L, Chanock S, Lissowska J, Peplonska B, Nevanlinna H, Fagerholm R, Eerola H, Kang D, Yoo KY, Noh DY, Ahn SH, Hunter DJ, Hankinson SE, Cox DG, Hall P, Wedren S, Liu J, Low YL, Bogdanova N, Schurmann P, Dork T, Tollenaar RA, Jacobi CE, Devilee P, Klijn JG, Sigurdson AJ, Doody MM, Alexander BH, Zhang J, Cox A, Brock IW, MacPherson G, Reed MW, Couch FJ, Goode EL, Olson JE, Meijers- Heijboer H, van den Ouweland A, Uitterlinden A, Rivadeneira F, Milne RL, Ribas G, Gonzalez-Neira A, Benitez J, Hopper JL, McCredie M, Southey M, Giles GG, Schroen C, Justenhoven C, Brauch H, Hamann U, Ko YD, Spurdle AB, Beesley J, Chen X, kConFab, AOCS Management Group, Mannermaa A, Kosma VM, Kataja V, Hartikainen J, Day NE, Cox DR, Ponder BA. Genome-wide association study identifies novel breast cancer susceptibility loci. Nature 447:1087-1093, 2007.

77 186. Fachal L, Dunning AM. From candidate gene studies to GWAS and post-GWAS analyses in breast cancer. Curr Opin Genet Dev 30:32-41, 2015. 187. Vachon CM, Pankratz VS, Scott CG, Haeberle L, Ziv E, Jensen MR, Brandt KR, Whaley DH, Olson JE, Heusinger K, Hack CC, Jud SM, Beckmann MW, Schulz-Wendtland R, Tice JA, Norman AD, Cunningham JM, Purrington KS, Easton DF, Sellers TA, Kerlikowske K, Fasching PA, Couch FJ. The contributions of breast density and common genetic variation to breast cancer risk. J Natl Cancer Inst 107:10.1093/jnci/dju397. Print 2015 May, 2015. 188. Dite GS, MacInnis RJ, Bickerstaffe A, Dowty JG, Allman R, Apicella C, Milne RL, Tsimiklis H, Phillips KA, Giles GG, Terry MB, Southey MC, Hopper JL. Breast cancer risk prediction using clinical models and 77 independent risk- associated SNPs for women aged under 50 years: Australian breast cancer family registry. Cancer Epidemiol Biomarkers Prev, 2015. 189. Nevanlinna H, Bartek J. The CHEK2 gene and inherited breast cancer susceptibility. Oncogene 25:5912-5919, 2006. 190. Lukas J, Lukas C, Bartek J. Mammalian cell cycle checkpoints: Signalling pathways and their organization in space and time. DNA Repair (Amst) 3:997-1007, 2004. 191. Deckbar D, Stiff T, Koch B, Reis C, Lobrich M, Jeggo PA. The limitations of the G1-S checkpoint. Cancer Res 70:4412-4421, 2010. 192. Landsverk KS, Patzke S, Rein ID, Stokke C, Lyng H, De Angelis PM, Stokke T. Three independent mechanisms for arrest in G2 after ionizing radiation. Cell Cycle 10:819-829, 2011. 193. Ahn J, Urist M, Prives C. The Chk2 protein kinase. DNA Repair (Amst) 3:1039-1047, 2004. 194. Lukas C, Falck J, Bartkova J, Bartek J, Lukas J. Distinct spatiotemporal dynamics of mammalian checkpoint regulators induced by DNA damage. Nat Cell Biol 5:255-260, 2003. 195. Tsvetkov L, Xu X, Li J, Stern DF. Polo-like kinase 1 and Chk2 interact and co-localize to centrosomes and the midbody. J Biol Chem 278:8468-8475, 2003. 196. 
Golan A, Pick E, Tsvetkov L, Nadler Y, Kluger H, Stern DF. Centrosomal Chk2 in DNA damage responses and cell cycle progression. Cell Cycle 9:2647-2656, 2010. 197. Shang Z, Yu L, Lin YF, Matsunaga S, Shen CY, Chen BP. DNA-PKcs activates the Chk2-Brca1 pathway during mitosis to ensure chromosomal stability. Oncogenesis 3:e85, 2014. 198. Cai Z, Chehab NH, Pavletich NP. Structure and activation mechanism of the CHK2 DNA damage checkpoint kinase. Mol Cell 35:818-829, 2009. 199. Ertych N, Stolz A, Stenzinger A, Weichert W, Kaulfuss S, Burfeind P, Aigner A, Wordeman L, Bastians H. Increased microtubule assembly rates influence chromosomal instability in colorectal cancer cells. Nat Cell Biol 16:779-791, 2014. 200. Ertych N, Stolz A, Valerius O, Braus GH, Bastians H. CHK2-BRCA1 tumor-suppressor axis restrains oncogenic aurora-A kinase to ensure proper mitotic microtubule assembly. Proc Natl Acad Sci U S A 113:1817-1822, 2016. 201. Takai H, Naka K, Okada Y, Watanabe M, Harada N, Saito S, Anderson CW, Appella E, Nakanishi M, Suzuki H, Nagashima K, Sawa H, Ikeda K, Motoyama N. Chk2-deficient mice exhibit radioresistance and defective p53-mediated transcription. EMBO J 21:5195-5205, 2002. 202. Bahassi el M, Penner CG, Robbins SB, Tichy E, Feliciano E, Yin M, Liang L, Deng L, Tischfield JA, Stambrook PJ. The breast cancer susceptibility allele CHEK2*1100delC promotes genomic instability in a knock-in mouse model. Mutat Res 616:201-209, 2007. 203. Bahassi el M, Robbins SB, Yin M, Boivin GP, Kuiper R, van Steeg H, Stambrook PJ. Mice with the CHEK2*1100delC SNP are predisposed to cancer with a strong gender bias. Proc Natl Acad Sci U S A 106:17111-17116, 2009. 204. Bell DW, Varley JM, Szydlo TE, Kang DH, Wahrer DC, Shannon KE, Lubratovich M, Verselis SJ, Isselbacher KJ, Fraumeni JF, Birch JM, Li FP, Garber JE, Haber DA. Heterozygous germ line hCHK2 mutations in li-fraumeni syndrome. Science 286:2528-2531, 1999. 205. 
Sodha N, Bullock S, Taylor R, Mitchell G, Guertl-Lackner B, Williams RD, Bevan S, Bishop K, McGuire S, Houlston RS, Eeles RA. CHEK2 variants in susceptibility to breast cancer and evidence of retention of the wild type allele in tumours. Br J Cancer 87:1445-1448, 2002. 206. Siddiqui R, Onel K, Facio F, Nafa K, Diaz LR, Kauff N, Huang H, Robson M, Ellis N, Offit K. The TP53 mutational spectrum and frequency of CHEK2*1100delC in li-fraumeni-like kindreds. Fam Cancer 4:177-181, 2005. 207. Wu X, Webster SR, Chen J. Characterization of tumor-associated Chk2 mutations. J Biol Chem 276:2971-2974, 2001.

78 208. Cybulski C, Wokolorczyk D, Huzarski T, Byrski T, Gronwald J, Gorski B, Debniak T, Masojc B, Jakubowska A, van de Wetering T, Narod SA, Lubinski J. A deletion in CHEK2 of 5,395 bp predisposes to breast cancer in poland. Breast Cancer Res Treat 102:119-122, 2007. 209. Dong X, Wang L, Taniguchi K, Wang X, Cunningham JM, McDonnell SK, Qian C, Marks AF, Slager SL, Peterson BJ, Smith DI, Cheville JC, Blute ML, Jacobsen SJ, Schaid DJ, Tindall DJ, Thibodeau SN, Liu W. Mutations in CHEK2 associated with prostate cancer risk. Am J Hum Genet 72:270-280, 2003. 210. Walsh T, Casadei S, Coats KH, Swisher E, Stray SM, Higgins J, Roach KC, Mandell J, Lee MK, Ciernikova S, Foretova L, Soucek P, King MC. Spectrum of mutations in BRCA1, BRCA2, CHEK2, and TP53 in families at high risk of breast cancer. JAMA 295:1379-1388, 2006. 211. Anczukow O, Ware MD, Buisson M, Zetoune AB, Stoppa-Lyonnet D, Sinilnikova OM, Mazoyer S. Does the nonsense-mediated mRNA decay mechanism prevent the synthesis of truncated BRCA1, CHK2, and p53 proteins? Hum Mutat 29:65-73, 2008. 212. 1000 Genomes Project Consortium, Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, Korbel JO, Marchini JL, McCarthy S, McVean GA, Abecasis GR. A global reference for human genetic variation. Nature 526:68-74, 2015. 213. Schwarz JK, Lovly CM, Piwnica-Worms H. Regulation of the Chk2 protein kinase by oligomerization-mediated cis- and trans-phosphorylation. Mol Cancer Res 1:598-609, 2003. 214. Weischer M, Bojesen SE, Ellervik C, Tybjaerg-Hansen A, Nordestgaard BG. CHEK2*1100delC genotyping for clinical assessment of breast cancer risk: Meta-analyses of 26,000 patient cases and 27,000 controls. J Clin Oncol 26:542- 548, 2008. 215. Cybulski C, Wokolorczyk D, Jakubowska A, Huzarski T, Byrski T, Gronwald J, Masojc B, Deebniak T, Gorski B, Blecharz P, Narod SA, Lubinski J. Risk of breast cancer in women with a CHEK2 mutation with and without a family history of breast cancer. J Clin Oncol 29:3747-3752, 2011. 216. 
Kilpivaara O, Bartkova J, Eerola H, Syrjakoski K, Vahteristo P, Lukas J, Blomqvist C, Holli K, Heikkila P, Sauter G, Kallioniemi OP, Bartek J, Nevanlinna H. Correlation of CHEK2 protein expression and c.1100delC mutation status with tumor characteristics among unselected breast cancer patients. Int J Cancer 113:575-580, 2005. 217. de Bock GH, Schutte M, Krol-Warmerdam EM, Seynaeve C, Blom J, Brekelmans CT, Meijers-Heijboer H, van Asperen CJ, Cornelisse CJ, Devilee P, Tollenaar RA, Klijn JG. Tumour characteristics and prognosis of breast cancer patients carrying the germline CHEK2*1100delC variant. J Med Genet 41:731-735, 2004. 218. Weischer M, Nordestgaard BG, Pharoah P, Bolla MK, Nevanlinna H, Van't Veer LJ, Garcia-Closas M, Hopper JL, Hall P, Andrulis IL, Devilee P, Fasching PA, Anton-Culver H, Lambrechts D, Hooning M, Cox A, Giles GG, Burwinkel B, Lindblom A, Couch FJ, Mannermaa A, Grenaker Alnaes G, John EM, Dork T, Flyger H, Dunning AM, Wang Q, Muranen TA, van Hien R, Figueroa J, Southey MC, Czene K, Knight JA, Tollenaar RA, Beckmann MW, Ziogas A, Christiaens MR, Collee JM, Reed MW, Severi G, Marme F, Margolin S, Olson JE, Kosma VM, Kristensen VN, Miron A, Bogdanova N, Shah M, Blomqvist C, Broeks A, Sherman M, Phillips KA, Li J, Liu J, Glendon G, Seynaeve C, Ekici AB, Leunen K, Kriege M, Cross SS, Baglietto L, Sohn C, Wang X, Kataja V, Borresen-Dale AL, Meyer A, Easton DF, Schmidt MK, Bojesen SE. CHEK2*1100delC heterozygosity in women with breast cancer associated with early death, breast cancer-specific death, and increased risk of a second breast cancer. J Clin Oncol 30:4308-4316, 2012. 219. Fletcher O, Johnson N, Dos Santos Silva I, Kilpivaara O, Aittomaki K, Blomqvist C, Nevanlinna H, Wasielewski M, Meijers-Heijerboer H, Broeks A, Schmidt MK, Van't Veer LJ, Bremer M, Dork T, Chekmariova EV, Sokolenko AP, Imyanitov EN, Hamann U, Rashid MU, Brauch H, Justenhoven C, Ashworth A, Peto J. 
Family history, genetic testing, and clinical risk prediction: Pooled analysis of CHEK2 1100delC in 1,828 bilateral breast cancers and 7,030 controls. Cancer Epidemiol Biomarkers Prev 18:230-234, 2009. 220. Le Calvez-Kelm F, Lesueur F, Damiola F, Vallee M, Voegele C, Babikyan D, Durand G, Forey N, McKay-Chopin S, Robinot N, Nguyen-Dumont T, Thomas A, Byrnes GB, Breast Cancer Family Registry, Hopper JL, Southey MC, Andrulis IL, John EM, Tavtigian SV. Rare, evolutionarily unlikely missense substitutions in CHEK2 contribute to breast cancer susceptibility: Results from a breast cancer family registry case-control mutation-screening study. Breast Cancer Res 13:R6, 2011. 221. Decker B, Allen J, Luccarini C, Pooley KA, Shah M, Bolla MK, Wang Q, Ahmed S, Baynes C, Conroy DM, Brown J, Luben R, Ostrander EA, Pharoah PD, Dunning AM, Easton DF. Rare, protein-truncating variants in ATM, CHEK2 and PALB2, but not XRCC2, are associated with increased breast cancer risks. J Med Genet 54:732-741, 2017. 222. Lee SB, Kim SH, Bell DW, Wahrer DC, Schiripo TA, Jorczak MM, Sgroi DC, Garber JE, Li FP, Nichols KE, Varley JM, Godwin AK, Shannon KM, Harlow E, Haber DA. Destabilization of CHK2 by a missense mutation associated with li-fraumeni syndrome. Cancer Res 61:8062-8067, 2001. 223. Falck J, Mailand N, Syljuasen RG, Bartek J, Lukas J. The ATM-Chk2-Cdc25A checkpoint pathway guards against radioresistant DNA synthesis. Nature 410:842-847, 2001.

79 224. Domagala P, Wokolorczyk D, Cybulski C, Huzarski T, Lubinski J, Domagala W. Different CHEK2 germline mutations are associated with distinct immunophenotypic molecular subtypes of breast cancer. Breast Cancer Res Treat 132:937-945, 2012. 225. Huzarski T, Cybulski C, Wokolorczyk D, Jakubowska A, Byrski T, Gronwald J, Domagala P, Szwiec M, Godlewski D, Kilar E, Marczyk E, Siolek M, Wisniowski R, Janiszewska H, Surdyka D, Sibilski R, Sun P, Lubinski J, Narod SA. Survival from breast cancer in patients with CHEK2 mutations. Breast Cancer Res Treat 144:397-403, 2014. 226. Jonsson G, Staaf J, Olsson E, Heidenblad M, Vallon-Christersson J, Osoegawa K, de Jong P, Oredsson S, Ringner M, Hoglund M, Borg A. High-resolution genomic profiles of breast cancer cell lines assessed by tiling BAC array comparative genomic hybridization. Genes Chromosomes Cancer 46:543-558, 2007. 227. Heikkinen T, Greco D, Pelttari LM, Tommiska J, Vahteristo P, Heikkila P, Blomqvist C, Aittomaki K, Nevanlinna H. Variants on the promoter region of PTEN affect breast cancer progression and patient survival. Breast Cancer Res 13:R130, 2011. 228. Ivshina AV, George J, Senko O, Mow B, Putti TC, Smeds J, Lindahl T, Pawitan Y, Hall P, Nordgren H, Wong JE, Liu ET, Bergh J, Kuznetsov VA, Miller LD. Genetic reclassification of histologic grade delineates new clinical subtypes of breast cancer. Cancer Res 66:10292-10301, 2006. 229. Pawitan Y, Bjohle J, Amler L, Borg AL, Egyhazi S, Hall P, Han X, Holmberg L, Huang F, Klaar S, Liu ET, Miller L, Nordgren H, Ploner A, Sandelin K, Shaw PM, Smeds J, Skoog L, Wedren S, Bergh J. Gene expression profiling spares early breast cancer patients from adjuvant therapy: Derived and validated in two population-based cohorts. Breast Cancer Res 7:R953-64, 2005. 230. 
Cox A, Dunning AM, Garcia-Closas M, Balasubramanian S, Reed MW, Pooley KA, Scollen S, Baynes C, Ponder BA, Chanock S, Lissowska J, Brinton L, Peplonska B, Southey MC, Hopper JL, McCredie MR, Giles GG, Fletcher O, Johnson N, dos Santos Silva I, Gibson L, Bojesen SE, Nordestgaard BG, Axelsson CK, Torres D, Hamann U, Justenhoven C, Brauch H, Chang-Claude J, Kropp S, Risch A, Wang-Gohrke S, Schurmann P, Bogdanova N, Dork T, Fagerholm R, Aaltonen K, Blomqvist C, Nevanlinna H, Seal S, Renwick A, Stratton MR, Rahman N, Sangrajrang S, Hughes D, Odefrey F, Brennan P, Spurdle AB, Chenevix-Trench G, Kathleen Cunningham Foundation Consortium for Research into Familial Breast Cancer, Beesley J, Mannermaa A, Hartikainen J, Kataja V, Kosma VM, Couch FJ, Olson JE, Goode EL, Broeks A, Schmidt MK, Hogervorst FB, Van't Veer LJ, Kang D, Yoo KY, Noh DY, Ahn SH, Wedren S, Hall P, Low YL, Liu J, Milne RL, Ribas G, Gonzalez-Neira A, Benitez J, Sigurdson AJ, Stredrick DL, Alexander BH, Struewing JP, Pharoah PD, Easton DF, Breast Cancer Association Consortium. A common coding variant in CASP8 is associated with breast cancer risk. Nat Genet 39:352-358, 2007. 231. Syrjakoski K, Vahteristo P, Eerola H, Tamminen A, Kivinummi K, Sarantaus L, Holli K, Blomqvist C, Kallioniemi OP, Kainu T, Nevanlinna H. Population-based study of BRCA1 and BRCA2 mutations in 1035 unselected finnish breast cancer patients. J Natl Cancer Inst 92:1529-1531, 2000. 232. Eerola H, Blomqvist C, Pukkala E, Pyrhonen S, Nevanlinna H. Familial breast cancer in southern finland: How prevalent are breast cancer families and can we trust the family history reported by patients? Eur J Cancer 36:1143-1148, 2000. 233. Staaf J, Jonsson G, Ringner M, Vallon-Christersson J. Normalization of array-CGH data: Influence of copy number imbalances. BMC Genomics 8:382, 2007. 234. Kaplan EL, Meier P. Nonparametric estimation from incomplete observations. J Am Statist Assoc 53:457--481, 1958. 235. 
Vallon-Christersson J, Nordborg N, Svensson M, Hakkinen J. BASE--2nd generation software for microarray data management and analysis. BMC Bioinformatics 10:330, 2009. 236. Olshen AB, Venkatraman ES, Lucito R, Wigler M. Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics 5:557-572, 2004. 237. Venkatraman ES, Olshen AB. A faster circular binary segmentation algorithm for the analysis of array CGH data. Bioinformatics 23:657-663, 2007. 238. van de Wiel MA, Kim KI, Vosse SJ, van Wieringen WN, Wilting SM, Ylstra B. CGHcall: Calling aberrations for array CGH tumor profiles. Bioinformatics 23:892-894, 2007. 239. van de Wiel MA, van Wieringen WN. CGHregions: Dimension reduction for array CGH data with minimal information loss. Cancer Inform 3:55-63, 2007. 240. Smyth GK, Speed T. Normalization of cDNA microarray data. Methods 31:265-273, 2003. 241. Smyth GK: Limma: Linear models for microarray data, in Gentleman R, Carey V, Dudoit S, et al (eds): Bioinformatics and Computational Biology Solutions using R and Bioconductor. New York, Springer, 2005 242. Smyth GK. Linear models and empirical bayes methods for assessing differential expression in microarray experiments. Stat Appl Genet Mol Biol 3:Article3, 2004.

80 243. Dennis G,Jr, Sherman BT, Hosack DA, Yang J, Gao W, Lane HC, Lempicki RA. DAVID: Database for annotation, visualization, and integrated discovery. Genome Biol 4:P3, 2003. 244. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP. Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 102:15545-15550, 2005. 245. Cox DR. Regression models and life-tables. J Roy Statist Soc Ser B 34:187--220, 1972. 246. Broeks A, Schmidt MK, Sherman ME, Couch FJ, Hopper JL, Dite GS, Apicella C, Smith LD, Hammet F, Southey MC, Van 't Veer LJ, de Groot R, Smit VT, Fasching PA, Beckmann MW, Jud S, Ekici AB, Hartmann A, Hein A, Schulz- Wendtland R, Burwinkel B, Marme F, Schneeweiss A, Sinn HP, Sohn C, Tchatchou S, Bojesen SE, Nordestgaard BG, Flyger H, Orsted DD, Kaur-Knudsen D, Milne RL, Perez JI, Zamora P, Rodriguez PM, Benitez J, Brauch H, Justenhoven C, Ko YD, Genica Network, Hamann U, Fischer HP, Bruning T, Pesch B, Chang-Claude J, Wang-Gohrke S, Bremer M, Karstens JH, Hillemanns P, Dork T, Nevanlinna HA, Heikkinen T, Heikkila P, Blomqvist C, Aittomaki K, Aaltonen K, Lindblom A, Margolin S, Mannermaa A, Kosma VM, Kauppinen JM, Kataja V, Auvinen P, Eskelinen M, Soini Y, Chenevix-Trench G, Spurdle AB, Beesley J, Chen X, Holland H, kConFab, AOCS, Lambrechts D, Claes B, Vandorpe T, Neven P, Wildiers H, Flesch-Janys D, Hein R, Loning T, Kosel M, Fredericksen ZS, Wang X, Giles GG, Baglietto L, Severi G, McLean C, Haiman CA, Henderson BE, Le Marchand L, Kolonel LN, Alnaes GG, Kristensen V, Borresen-Dale AL, Hunter DJ, Hankinson SE, Andrulis IL, Mulligan AM, O'Malley FP, Devilee P, Huijts PE, Tollenaar RA, Van Asperen CJ, Seynaeve CS, Chanock SJ, Lissowska J, Brinton L, Peplonska B, Figueroa J, Yang XR, Hooning MJ, Hollestelle A, Oldenburg RA, Jager A, Kriege M, Ozturk B, van Leenders GJ, Hall P, Czene K, Humphreys K, Liu J, Cox A, 
Jones M, Schoemaker M, Ashworth A, Swerdlow A, Beesley J, Chen X, kConFab Investigators, Australian Ovarian Cancer Study Group, Muir KR, Lophatananon A, Rattanamongkongul S, Chaiwerawattana A, Kang D, Yoo KY, Noh DY, Shen CY, Yu JC, Wu PE, Hsiung CN, Perkins A, Swann R, Velentzis L, Eccles DM, Tapper WJ, Gerty SM, Graham NJ, Ponder BA, Chenevix-Trench G, Pharoah PD, Lathrop M, Dunning AM, Rahman N, Peto J, Easton DF. Genome-wide association analysis identifies three new breast cancer susceptibility loci. Nat Genet 44:312-318, 2012. 350. Henriksson K, Olsson H, Kristoffersson U. The need for oncogenetic counselling. ten years' experience of a regional oncogenetic clinic. Acta Oncol 43:637-649, 2004. 351. 1000 Genomes Project Consortium, Abecasis GR, Auton A, Brooks LD, DePristo MA, Durbin RM, Handsaker RE, Kang HM, Marth GT, McVean GA. An integrated map of genetic variation from 1,092 human genomes. Nature 491:56- 65, 2012. 352. International HapMap 3 Consortium, Altshuler DM, Gibbs RA, Peltonen L, Altshuler DM, Gibbs RA, Peltonen L, Dermitzakis E, Schaffner SF, Yu F, Peltonen L, Dermitzakis E, Bonnen PE, Altshuler DM, Gibbs RA, de Bakker PI, Deloukas P, Gabriel SB, Gwilliam R, Hunt S, Inouye M, Jia X, Palotie A, Parkin M, Whittaker P, Yu F, Chang K, Hawes A, Lewis LR, Ren Y, Wheeler D, Gibbs RA, Muzny DM, Barnes C, Darvishi K, Hurles M, Korn JM, Kristiansson K, Lee C, McCarrol SA, Nemesh J, Dermitzakis E, Keinan A, Montgomery SB, Pollack S, Price AL, Soranzo N, Bonnen PE, Gibbs RA, Gonzaga-Jauregui C, Keinan A, Price AL, Yu F, Anttila V, Brodeur W, Daly MJ, Leslie S, McVean G, Moutsianas L, Nguyen H, Schaffner SF, Zhang Q, Ghori MJ, McGinnis R, McLaren W, Pollack S, Price AL, Schaffner SF, Takeuchi F, Grossman SR, Shlyakhter I, Hostetter EB, Sabeti PC, Adebamowo CA, Foster MW, Gordon DR, Licinio J, Manca MC, Marshall PA, Matsuda I, Ngare D, Wang VO, Reddy D, Rotimi CN, Royal CD, Sharp RR, Zeng C, Brooks LD, McEwen JE. 
Integrating common and rare genetic variation in diverse human populations. Nature 467:52-58, 2010. 353. International HapMap Consortium, Frazer KA, Ballinger DG, Cox DR, Hinds DA, Stuve LL, Gibbs RA, Belmont JW, Boudreau A, Hardenbol P, Leal SM, Pasternak S, Wheeler DA, Willis TD, Yu F, Yang H, Zeng C, Gao Y, Hu H, Hu W, Li C, Lin W, Liu S, Pan H, Tang X, Wang J, Wang W, Yu J, Zhang B, Zhang Q, Zhao H, Zhao H, Zhou J, Gabriel SB,

88 Barry R, Blumenstiel B, Camargo A, Defelice M, Faggart M, Goyette M, Gupta S, Moore J, Nguyen H, Onofrio RC, Parkin M, Roy J, Stahl E, Winchester E, Ziaugra L, Altshuler D, Shen Y, Yao Z, Huang W, Chu X, He Y, Jin L, Liu Y, Shen Y, Sun W, Wang H, Wang Y, Wang Y, Xiong X, Xu L, Waye MM, Tsui SK, Xue H, Wong JT, Galver LM, Fan JB, Gunderson K, Murray SS, Oliphant AR, Chee MS, Montpetit A, Chagnon F, Ferretti V, Leboeuf M, Olivier JF, Phillips MS, Roumy S, Sallee C, Verner A, Hudson TJ, Kwok PY, Cai D, Koboldt DC, Miller RD, Pawlikowska L, Taillon-Miller P, Xiao M, Tsui LC, Mak W, Song YQ, Tam PK, Nakamura Y, Kawaguchi T, Kitamoto T, Morizono T, Nagashima A, Ohnishi Y, Sekine A, Tanaka T, Tsunoda T, Deloukas P, Bird CP, Delgado M, Dermitzakis ET, Gwilliam R, Hunt S, Morrison J, Powell D, Stranger BE, Whittaker P, Bentley DR, Daly MJ, de Bakker PI, Barrett J, Chretien YR, Maller J, McCarroll S, Patterson N, Pe'er I, Price A, Purcell S, Richter DJ, Sabeti P, Saxena R, Schaffner SF, Sham PC, Varilly P, Altshuler D, Stein LD, Krishnan L, Smith AV, Tello-Ruiz MK, Thorisson GA, Chakravarti A, Chen PE, Cutler DJ, Kashuk CS, Lin S, Abecasis GR, Guan W, Li Y, Munro HM, Qin ZS, Thomas DJ, McVean G, Auton A, Bottolo L, Cardin N, Eyheramendy S, Freeman C, Marchini J, Myers S, Spencer C, Stephens M, Donnelly P, Cardon LR, Clarke G, Evans DM, Morris AP, Weir BS, Tsunoda T, Mullikin JC, Sherry ST, Feolo M, Skol A, Zhang H, Zeng C, Zhao H, Matsuda I, Fukushima Y, Macer DR, Suda E, Rotimi CN, Adebamowo CA, Ajayi I, Aniagwu T, Marshall PA, Nkwodimmah C, Royal CD, Leppert MF, Dixon M, Peiffer A, Qiu R, Kent A, Kato K, Niikawa N, Adewole IF, Knoppers BM, Foster MW, Clayton EW, Watkin J, Gibbs RA, Belmont JW, Muzny D, Nazareth L, Sodergren E, Weinstock GM, Wheeler DA, Yakub I, Gabriel SB, Onofrio RC, Richter DJ, Ziaugra L, Birren BW, Daly MJ, Altshuler D, Wilson RK, Fulton LL, Rogers J, Burton J, Carter NP, Clee CM, Griffiths M, Jones MC, McLay K, Plumb RW, Ross MT, Sims SK, Willey DL, 
Chen Z, Han H, Kang L, Godbout M, Wallenburg JC, L'Archeveque P, Bellemare G, Saeki K, Wang H, An D, Fu H, Li Q, Wang Z, Wang R, Holden AL, Brooks LD, McEwen JE, Guyer MS, Wang VO, Peterson JL, Shi M, Spiegel J, Sung LM, Zacharia LF, Collins FS, Kennedy K, Jamieson R, Stewart J. A second generation human haplotype map of over 3.1 million SNPs. Nature 449:851-861, 2007. 354. Senthivinayagam S, Mishra P, Paramasivam SK, Yallapragada S, Chatterjee M, Wong L, Rana A, Rana B. Caspase- mediated cleavage of beta-catenin precedes drug-induced apoptosis in resistant cancer cells. J Biol Chem 284:13577- 13588, 2009. 355. Han J, Sridevi P, Ramirez M, Ludwig KJ, Wang JY. Beta-catenin-dependent lysosomal targeting of internalized tumor necrosis factor-alpha suppresses caspase-8 activation in apoptosis-resistant colon cancer cells. Mol Biol Cell 24:465- 473, 2013. 356. Abdul-Ghani M, Dufort D, Stiles R, De Repentigny Y, Kothary R, Megeney LA. Wnt11 promotes cardiomyocyte development by caspase-mediated suppression of canonical wnt signals. Mol Cell Biol 31:163-178, 2011. 357. Messmer UK, Briner VA, Pfeilschifter J. Basic fibroblast growth factor selectively enhances TNF-alpha-induced apoptotic cell death in glomerular endothelial cells: Effects on apoptotic signaling pathways. J Am Soc Nephrol 11:2199- 2211, 2000. 358. Ma EL, Zhao DM, Li YC, Cao H, Zhao QY, Li JC, Sun LX. Activation of ATM-Chk2 by 16-dehydropregnenolone induces G1 phase arrest and apoptosis in HeLa cells. J Asian Nat Prod Res 14:817-825, 2012. 359. 
Thomas G, Jacobs KB, Kraft P, Yeager M, Wacholder S, Cox DG, Hankinson SE, Hutchinson A, Wang Z, Yu K, Chatterjee N, Garcia-Closas M, Gonzalez-Bosquet J, Prokunina-Olsson L, Orr N, Willett WC, Colditz GA, Ziegler RG, Berg CD, Buys SS, McCarty CA, Feigelson HS, Calle EE, Thun MJ, Diver R, Prentice R, Jackson R, Kooperberg C, Chlebowski R, Lissowska J, Peplonska B, Brinton LA, Sigurdson A, Doody M, Bhatti P, Alexander BH, Buring J, Lee IM, Vatten LJ, Hveem K, Kumle M, Hayes RB, Tucker M, Gerhard DS, Fraumeni JF,Jr, Hoover RN, Chanock SJ, Hunter DJ. A multistage genome-wide association study in breast cancer identifies two new risk alleles at 1p11.2 and 14q24.1 (RAD51L1). Nat Genet 41:579-584, 2009. 360. Suwaki N, Klare K, Tarsounas M. RAD51 paralogs: Roles in DNA damage signalling, recombinational repair and tumorigenesis. Semin Cell Dev Biol 22:898-905, 2011. 361. Hoffmeyer K, Raggioli A, Rudloff S, Anton R, Hierholzer A, Del Valle I, Hein K, Vogt R, Kemler R. Wnt/beta- catenin signaling regulates telomerase in stem cells and cancer cells. Science 336:1549-1554, 2012. 362. Park JI, Venteicher AS, Hong JY, Choi J, Jun S, Shkreli M, Chang W, Meng Z, Cheung P, Ji H, McLaughlin M, Veenstra TD, Nusse R, McCrea PD, Artandi SE. Telomerase modulates wnt signalling by association with target gene chromatin. Nature 460:66-72, 2009. 363. Donnini S, Terzuoli E, Ziche M, Morbidelli L. Sulfhydryl angiotensin-converting enzyme inhibitor promotes endothelial cell survival through nitric-oxide synthase, fibroblast growth factor-2, and telomerase cross-talk. J Pharmacol Exp Ther 332:776-784, 2010. 364. Listerman I, Gazzaniga FS, Blackburn EH. An investigation of the effects of the core protein telomerase reverse transcriptase on wnt signaling in breast cancer cells. Mol Cell Biol 34:280-289, 2014. 365. Delmas V, Beermann F, Martinozzi S, Carreira S, Ackermann J, Kumasaka M, Denat L, Goodall J, Luciani F, Viros A, Demirkan N, Bastian BC, Goding CR, Larue L. 
Beta-catenin induces immortalization of melanocytes by suppressing p16INK4a expression and cooperates with N-ras in melanoma development. Genes Dev 21:2923-2935, 2007.

89 366. Cowan CE, Kohler EE, Dugan TA, Mirza MK, Malik AB, Wary KK. Kruppel-like factor-4 transcriptionally regulates VE-cadherin expression and endothelial barrier function. Circ Res 107:959-966, 2010. 367. Evans PM, Chen X, Zhang W, Liu C. KLF4 interacts with beta-catenin/TCF4 and blocks p300/CBP recruitment by beta-catenin. Mol Cell Biol 30:372-381, 2010. 368. Zhang W, Chen X, Kato Y, Evans PM, Yuan S, Yang J, Rychahou PG, Yang VW, He X, Evers BM, Liu C. Novel cross talk of kruppel-like factor 4 and beta-catenin regulates normal intestinal homeostasis and tumor repression. Mol Cell Biol 26:2055-2064, 2006. 369. Zhang P, Chang WH, Fong B, Gao F, Liu C, Al Alam D, Bellusci S, Lu W. Regulation of induced pluripotent stem (iPS) cell induction by wnt/beta-catenin signaling. J Biol Chem 289:9221-9232, 2014. 370. Lanner F, Lee KL, Sohl M, Holmborn K, Yang H, Wilbertz J, Poellinger L, Rossant J, Farnebo F. Heparan sulfation- dependent fibroblast growth factor signaling maintains embryonic stem cells primed for differentiation in a heterogeneous state. Stem Cells 28:191-200, 2010. 371. Brandt T, Townsley FM, Teufel DP, Freund SM, Veprintsev DB. Molecular basis for modulation of the p53 target selectivity by KLF4. PLoS One 7:e48252, 2012. 372. Yang Y, Goldstein BG, Chao HH, Katz JP. KLF4 and KLF5 regulate proliferation, apoptosis and invasion in esophageal cancer cells. Cancer Biol Ther 4:1216-1221, 2005. 373. Hiremath M, Dann P, Fischer J, Butterworth D, Boras-Granic K, Hens J, Van Houten J, Shi W, Wysolmerski J. Parathyroid hormone-related protein activates wnt signaling to specify the embryonic mammary mesenchyme. Development 139:4239-4249, 2012. 374. Minina E, Kreschel C, Naski MC, Ornitz DM, Vortkamp A. Interaction of FGF, ihh/pthlh, and BMP signaling integrates chondrocyte proliferation and hypertrophic differentiation. Dev Cell 3:439-449, 2002. 375. Zhang Y, Park E, Kim CS, Paik JH. ZNF365 promotes stalled replication forks recovery to maintain genome stability. 
Cell Cycle 12:2817-2828, 2013. 376. Garcia-Closas M, Couch FJ, Lindstrom S, Michailidou K, Schmidt MK, Brook MN, Orr N, Rhie SK, Riboli E, Feigelson HS, Le Marchand L, Buring JE, Eccles D, Miron P, Fasching PA, Brauch H, Chang-Claude J, Carpenter J, Godwin AK, Nevanlinna H, Giles GG, Cox A, Hopper JL, Bolla MK, Wang Q, Dennis J, Dicks E, Howat WJ, Schoof N, Bojesen SE, Lambrechts D, Broeks A, Andrulis IL, Guenel P, Burwinkel B, Sawyer EJ, Hollestelle A, Fletcher O, Winqvist R, Brenner H, Mannermaa A, Hamann U, Meindl A, Lindblom A, Zheng W, Devillee P, Goldberg MS, Lubinski J, Kristensen V, Swerdlow A, Anton-Culver H, Dork T, Muir K, Matsuo K, Wu AH, Radice P, Teo SH, Shu XO, Blot W, Kang D, Hartman M, Sangrajrang S, Shen CY, Southey MC, Park DJ, Hammet F, Stone J, Veer LJ, Rutgers EJ, Lophatananon A, Stewart-Brown S, Siriwanarangsan P, Peto J, Schrauder MG, Ekici AB, Beckmann MW, Dos Santos Silva I, Johnson N, Warren H, Tomlinson I, Kerin MJ, Miller N, Marme F, Schneeweiss A, Sohn C, Truong T, Laurent- Puig P, Kerbrat P, Nordestgaard BG, Nielsen SF, Flyger H, Milne RL, Perez JI, Menendez P, Muller H, Arndt V, Stegmaier C, Lichtner P, Lochmann M, Justenhoven C, Ko YD, Gene ENvironmental Interaction and breast CAncer (GENICA) Network, Muranen TA, Aittomaki K, Blomqvist C, Greco D, Heikkinen T, Ito H, Iwata H, Yatabe Y, Antonenkova NN, Margolin S, Kataja V, Kosma VM, Hartikainen JM, Balleine R, kConFab Investigators, Tseng CC, Berg DV, Stram DO, Neven P, Dieudonne AS, Leunen K, Rudolph A, Nickels S, Flesch-Janys D, Peterlongo P, Peissel B, Bernard L, Olson JE, Wang X, Stevens K, Severi G, Baglietto L, McLean C, Coetzee GA, Feng Y, Henderson BE, Schumacher F, Bogdanova NV, Labreche F, Dumont M, Yip CH, Taib NA, Cheng CY, Shrubsole M, Long J, Pylkas K, Jukkola-Vuorinen A, Kauppila S, Knight JA, Glendon G, Mulligan AM, Tollenaar RA, Seynaeve CM, Kriege M, Hooning MJ, van den Ouweland AM, van Deurzen CH, Lu W, Gao YT, Cai H, Balasubramanian SP, Cross SS, Reed MW, 
Signorello L, Cai Q, Shah M, Miao H, Chan CW, Chia KS, Jakubowska A, Jaworska K, Durda K, Hsiung CN, Wu PE, Yu JC, Ashworth A, Jones M, Tessier DC, Gonzalez-Neira A, Pita G, Alonso MR, Vincent D, Bacot F, Ambrosone CB, Bandera EV, John EM, Chen GK, Hu JJ, Rodriguez-Gil JL, Bernstein L, Press MF, Ziegler RG, Millikan RM, Deming-Halverson SL, Nyante S, Ingles SA, Waisfisz Q, Tsimiklis H, Makalic E, Schmidt D, Bui M, Gibson L, Muller-Myhsok B, Schmutzler RK, Hein R, Dahmen N, Beckmann L, Aaltonen K, Czene K, Irwanto A, Liu J, Turnbull C, Familial Breast Cancer Study (FBCS), Rahman N, Meijers-Heijboer H, Uitterlinden AG, Rivadeneira F, stralian Breast Cancer Tissue Bank (ABCTB) Investigators, Olswold C, Slager S, Pilarski R, Ademuyiwa F, Konstantopoulou I, Martin NG, Montgomery GW, Slamon DJ, Rauh C, Lux MP, Jud SM, Bruning T, Weaver J, Sharma P, Pathak H, Tapper W, Gerty S, Durcan L, Trichopoulos D, Tumino R, Peeters PH, Kaaks R, Campa D, Canzian F, Weiderpass E, Johansson M, Khaw KT, Travis R, Clavel-Chapelon F, Kolonel LN, Chen C, Beck A, Hankinson SE, Berg CD, Hoover RN, Lissowska J, Figueroa JD, Chasman DI, Gaudet MM, Diver WR, Willett WC, Hunter DJ, Simard J, Benitez J, Dunning AM, Sherman ME, Chenevix-Trench G, Chanock SJ, Hall P, Pharoah PD, Vachon C, Easton DF, Haiman CA, Kraft P. Genome-wide association studies identify four ER negative- specific breast cancer risk loci. Nat Genet 45:392-398, 2013. 377. Jain VK, Turner NC. Challenges and opportunities in the targeting of fibroblast growth factor receptors in breast cancer. Breast Cancer Res 14:208, 2012.

90 378. Christensen J, Bentz S, Sengstag T, Shastri VP, Anderle P. FOXQ1, a novel target of the wnt pathway and a new marker for activation of wnt signaling in solid tumors. PLoS One 8:e60051, 2013. 379. Sengerova B, Allerston CK, Abu M, Lee SY, Hartley J, Kiakos K, Schofield CJ, Hartley JA, Gileadi O, McHugh PJ. Characterization of the human SNM1A and SNM1B/apollo DNA repair exonucleases. J Biol Chem 287:26254-26267, 2012. 380. Mason JM, Sekiguchi JM. Snm1B/apollo functions in the fanconi anemia pathway in response to DNA interstrand crosslinks. Hum Mol Genet 20:2549-2559, 2011. 381. Ye J, Lenain C, Bauwens S, Rizzo A, Saint-Leger A, Poulet A, Benarroch D, Magdinier F, Morere J, Amiard S, Verhoeyen E, Britton S, Calsou P, Salles B, Bizard A, Nadal M, Salvati E, Sabatier L, Wu Y, Biroccio A, Londono-Vallejo A, Giraud-Panis MJ, Gilson E. TRF2 and apollo cooperate with topoisomerase 2alpha to protect human telomeres from replicative damage. Cell 142:230-242, 2010. 382. Roy R, Chun J, Powell SN. BRCA1 and BRCA2: Different roles in a common pathway of genome protection. Nat Rev Cancer 12:68-78, 2011. 383. Smith J, Tho LM, Xu N, Gillespie DA. The ATM-Chk2 and ATR-Chk1 pathways in DNA damage signaling and cancer. Adv Cancer Res 108:73-112, 2010. 384. Carramusa L, Contino F, Ferro A, Minafra L, Perconti G, Giallongo A, Feo S. The PVT-1 oncogene is a myc protein target that is overexpressed in transformed cells. J Cell Physiol 213:511-518, 2007. 385. Katoh M. Network of WNT and other regulatory signaling cascades in pluripotent stem cells and cancer stem cells. Curr Pharm Biotechnol 12:160-170, 2011. 386. Menssen A, Hermeking H. Characterization of the c-MYC-regulated transcriptome by SAGE: Identification and analysis of c-MYC target genes. Proc Natl Acad Sci U S A 99:6274-6279, 2002. 387. Wang WJ, Wu SP, Liu JB, Shi YS, Huang X, Zhang QB, Yao KT. MYC regulation of CHK1 and CHK2 promotes radioresistance in a stem cell-like population of nasopharyngeal carcinoma cells. 
Cancer Res 73:1219-1231, 2013. 388. Zhuang L, Hulin JA, Gromova A, Tran Nguyen TD, Yu RT, Liddle C, Downes M, Evans RM, Makarenkova HP, Meech R. Barx2 and pax7 have antagonistic functions in regulation of wnt signaling and satellite cell differentiation. Stem Cells 32:1661-1673, 2014. 389. Iwata J, Suzuki A, Yokota T, Ho TV, Pelikan R, Urata M, Sanchez-Lara PA, Chai Y. TGFbeta regulates epithelial- mesenchymal interactions through WNT signaling activity to control muscle development in the soft palate. Development 141:909-917, 2014. 390. Sasaki T, Ito Y, Bringas P,Jr, Chou S, Urata MM, Slavkin H, Chai Y. TGFbeta-mediated FGF signaling is crucial for regulating cranial neural crest cell proliferation during frontal bone development. Development 133:371-381, 2006. 391. Falk S, Wurdak H, Ittner LM, Ille F, Sumara G, Schmid MT, Draganova K, Lang KS, Paratore C, Leveen P, Suter U, Karlsson S, Born W, Ricci R, Gotz M, Sommer L. Brain area-specific effect of TGF-beta signaling on wnt-dependent neural stem cell expansion. Cell Stem Cell 2:472-483, 2008. 392. Cleveland AG, Oikarinen SI, Bynote KK, Marttinen M, Rafter JJ, Gustafsson JA, Roy SK, Pitot HC, Korach KS, Lubahn DB, Mutanen M, Gould KA. Disruption of estrogen receptor signaling enhances intestinal neoplasia in apc(min/+) mice. Carcinogenesis 30:1581-1590, 2009. 393. Neto A, Mercader N, Gomez-Skarmeta JL. The Osr1 and Osr2 genes act in the pronephric anlage downstream of retinoic acid signaling and upstream of Wnt2b to maintain pectoral fin development. Development 139:301-311, 2012. 394. Rankin SA, Gallas AL, Neto A, Gomez-Skarmeta JL, Zorn AM. Suppression of Bmp4 signaling by the zinc-finger repressors Osr1 and Osr2 is required for wnt/beta-catenin-mediated lung specification in xenopus. Development 139:3010- 3020, 2012. 395. Renard CA, Labalette C, Armengol C, Cougot D, Wei Y, Cairo S, Pineau P, Neuveut C, de Reynies A, Dejean A, Perret C, Buendia MA. 
Tbx3 is a downstream target of the wnt/beta-catenin pathway and a critical mediator of beta-catenin survival functions in liver cancer. Cancer Res 67:901-910, 2007. 396. Eblaghie MC, Song SJ, Kim JY, Akita K, Tickle C, Jung HS. Interactions between FGF and wnt signals and Tbx3 gene expression in mammary gland initiation in mouse embryos. J Anat 205:1-13, 2004. 397. Carrera I, Janody F, Leeds N, Duveau F, Treisman JE. Pygopus activates wingless target gene transcription through the mediator complex subunits Med12 and Med13. Proc Natl Acad Sci U S A 105:6644-6649, 2008. 398. Fillmore CM, Gupta PB, Rudnick JA, Caballero S, Keller PJ, Lander ES, Kuperwasser C. Estrogen expands breast cancer stem-like cells through paracrine FGF/Tbx3 signaling. Proc Natl Acad Sci U S A 107:21737-21742, 2010.

91 399. Mosbech A, Lukas C, Bekker-Jensen S, Mailand N. The deubiquitylating enzyme USP44 counteracts the DNA double-strand break response mediated by the RNF8 and RNF168 ubiquitin ligases. J Biol Chem 288:16579-16587, 2013. 400. Backman M, Machon O, Mygland L, van den Bout CJ, Zhong W, Taketo MM, Krauss S. Effects of canonical wnt signaling on dorso-ventral specification of the mouse telencephalon. Dev Biol 279:155-168, 2005. 401. Yamagishi C, Yamagishi H, Maeda J, Tsuchihashi T, Ivey K, Hu T, Srivastava D. Sonic hedgehog is essential for first pharyngeal arch development. Pediatr Res 59:349-354, 2006. 402. Bei M, Maas R. FGFs and BMP4 induce both Msx1-independent and Msx1-dependent signaling pathways in early tooth development. Development 125:4325-4333, 1998. 403. Stacey SN, Manolescu A, Sulem P, Thorlacius S, Gudjonsson SA, Jonsson GF, Jakobsdottir M, Bergthorsson JT, Gudmundsson J, Aben KK, Strobbe LJ, Swinkels DW, van Engelenburg KC, Henderson BE, Kolonel LN, Le Marchand L, Millastre E, Andres R, Saez B, Lambea J, Godino J, Polo E, Tres A, Picelli S, Rantala J, Margolin S, Jonsson T, Sigurdsson H, Jonsdottir T, Hrafnkelsson J, Johannsson J, Sveinsson T, Myrdal G, Grimsson HN, Sveinsdottir SG, Alexiusdottir K, Saemundsdottir J, Sigurdsson A, Kostic J, Gudmundsson L, Kristjansson K, Masson G, Fackenthal JD, Adebamowo C, Ogundiran T, Olopade OI, Haiman CA, Lindblom A, Mayordomo JI, Kiemeney LA, Gulcher JR, Rafnar T, Thorsteinsdottir U, Johannsson OT, Kong A, Stefansson K. Common variants on chromosome 5p12 confer susceptibility to estrogen receptor-positive breast cancer. Nat Genet 40:703-706, 2008. 404. Neubuser A, Peters H, Balling R, Martin GR. Antagonistic interactions between FGF and BMP signaling pathways: A mechanism for positioning the sites of tooth formation. Cell 90:247-255, 1997. 405. Mandler M, Neubuser A. FGF signaling is necessary for the specification of the odontogenic mesenchyme. Dev Biol 240:548-559, 2001. 406. 
Fugger K, Mistrik M, Danielsen JR, Dinant C, Falck J, Bartek J, Lukas J, Mailand N. Human Fbh1 helicase contributes to genome maintenance via pro- and anti-recombinase activities. J Cell Biol 186:655-663, 2009. 407. Simandlova J, Zagelbaum J, Payne MJ, Chu WK, Shevelev I, Hanada K, Chatterjee S, Reid DA, Liu Y, Janscak P, Rothenberg E, Hickson ID. FBH1 helicase disrupts RAD51 filaments in vitro and modulates homologous recombination in mammalian cells. J Biol Chem 288:34168-34180, 2013. 408. Abe K, Takeichi M. NMDA-receptor activation induces calpain-mediated beta-catenin cleavages for triggering gene expression. Neuron 53:387-397, 2007. 409. Mahaffey JP, Grego-Bessa J, Liem KF,Jr, Anderson KV. Cofilin and Vangl2 cooperate in the initiation of planar cell polarity in the mouse embryo. Development 140:1262-1271, 2013. 410. Nair M, Bilanchone V, Ortt K, Sinha S, Dai X. Ovol1 represses its own transcription by competing with transcription activator c-myb and by recruiting histone deacetylase activity. Nucleic Acids Res 35:1687-1697, 2007. 411. Jiang Z, Guerrero-Netro HM, Juengel JL, Price CA. Divergence of intracellular signaling pathways and early response genes of two closely related fibroblast growth factors, FGF8 and FGF18, in bovine ovarian granulosa cells. Mol Cell Endocrinol 375:97-105, 2013. 412. Sarbajna S, Davies D, West SC. Roles of SLX1-SLX4, MUS81-EME1, and GEN1 in avoiding genome instability and mitotic catastrophe. Genes Dev 28:1124-1136, 2014. 413. McGowan CH. Checking in on Cds1 (Chk2): A checkpoint kinase and tumor suppressor. Bioessays 24:502-511, 2002. 414. Chandler DS, Singh RK, Caldwell LC, Bitler JL, Lozano G. Genotoxic stress induces coordinately regulated alternative splicing of the p53 modulators MDM2 and MDM4. Cancer Res 66:9502-9508, 2006. 415. Wang YV, Wade M, Wong E, Li YC, Rodewald LW, Wahl GM. Quantitative analyses reveal the importance of regulated hdmx degradation for p53 activation. Proc Natl Acad Sci U S A 104:12365-12370, 2007. 
416. Buscemi G, Carlessi L, Zannini L, Lisanti S, Fontanella E, Canevari S, Delia D. DNA damage-induced cell cycle regulation and function of novel Chk2 phosphoresidues. Mol Cell Biol 26:7832-7845, 2006. 417. Mourgues S, Gautier V, Lagarou A, Bordier C, Mourcet A, Slingerland J, Kaddoum L, Coin F, Vermeulen W, Gonzales de Peredo A, Monsarrat B, Mari PO, Giglia-Mari G. ELL, a novel TFIIH partner, is involved in transcription restart after DNA repair. Proc Natl Acad Sci U S A 110:17927-17932, 2013. 418. French JD, Ghoussaini M, Edwards SL, Meyer KB, Michailidou K, Ahmed S, Khan S, Maranian MJ, O'Reilly M, Hillman KM, Betts JA, Carroll T, Bailey PJ, Dicks E, Beesley J, Tyrer J, Maia AT, Beck A, Knoblauch NW, Chen C, Kraft P, Barnes D, Gonzalez-Neira A, Alonso MR, Herrero D, Tessier DC, Vincent D, Bacot F, Luccarini C, Baynes C, Conroy D, Dennis J, Bolla MK, Wang Q, Hopper JL, Southey MC, Schmidt MK, Broeks A, Verhoef S, Cornelissen S, Muir K, Lophatananon A, Stewart-Brown S, Siriwanarangsan P, Fasching PA, Loehberg CR, Ekici AB, Beckmann MW, Peto J, Dos Santos Silva I, Johnson N, Aitken Z, Sawyer EJ, Tomlinson I, Kerin MJ, Miller N, Marme F, Schneeweiss A, Sohn C, Burwinkel B, Guenel P, Truong T, Laurent-Puig P, Menegaux F, Bojesen SE, Nordestgaard BG, Nielsen SF,

92 Flyger H, Milne RL, Zamora MP, Arias Perez JI, Benitez J, Anton-Culver H, Brenner H, Muller H, Arndt V, Stegmaier C, Meindl A, Lichtner P, Schmutzler RK, Engel C, Brauch H, Hamann U, Justenhoven C, The GENICA Network, Aaltonen K, Heikkila P, Aittomaki K, Blomqvist C, Matsuo K, Ito H, Iwata H, Sueta A, Bogdanova NV, Antonenkova NN, Dork T, Lindblom A, Margolin S, Mannermaa A, Kataja V, Kosma VM, Hartikainen JM, kConFab Investigators, Wu AH, Tseng CC, Van Den Berg D, Stram DO, Lambrechts D, Peeters S, Smeets A, Floris G, Chang-Claude J, Rudolph A, Nickels S, Flesch-Janys D, Radice P, Peterlongo P, Bonanni B, Sardella D, Couch FJ, Wang X, Pankratz VS, Lee A, Giles GG, Severi G, Baglietto L, Haiman CA, Henderson BE, Schumacher F, Le Marchand L, Simard J, Goldberg MS, Labreche F, Dumont M, Teo SH, Yip CH, Ng CH, Vithana EN, Kristensen V, Zheng W, Deming-Halverson S, Shrubsole M, Long J, Winqvist R, Pylkas K, Jukkola-Vuorinen A, Grip M, Andrulis IL, Knight JA, Glendon G, Mulligan AM, Devilee P, Seynaeve C, Garcia-Closas M, Figueroa J, Chanock SJ, Lissowska J, Czene K, Klevebring D, Schoof N, Hooning MJ, Martens JW, Collee JM, Tilanus-Linthorst M, Hall P, Li J, Liu J, Humphreys K, Shu XO, Lu W, Gao YT, Cai H, Cox A, Balasubramanian SP, Blot W, Signorello LB, Cai Q, Pharoah PD, Healey CS, Shah M, Pooley KA, Kang D, Yoo KY, Noh DY, Hartman M, Miao H, Sng JH, Sim X, Jakubowska A, Lubinski J, Jaworska-Bieniek K, Durda K, Sangrajrang S, Gaborieau V, McKay J, Toland AE, Ambrosone CB, Yannoukakos D, Godwin AK, Shen CY, Hsiung CN, Wu PE, Chen ST, Swerdlow A, Ashworth A, Orr N, Schoemaker MJ, Ponder BA, Nevanlinna H, Brown MA, Chenevix-Trench G, Easton DF, Dunning AM. Functional variants at the 11q13 risk locus for breast cancer regulate cyclin D1 expression through long-range enhancers. Am J Hum Genet, 2013. 419. Prasad CP, Rath G, Mathur S, Bhatnagar D, Ralhan R. 
Potent growth suppressive activity of curcumin in human breast cancer cells: Modulation of wnt/beta-catenin signaling. Chem Biol Interact 181:263-271, 2009. 420. Li Z, Chen K, Jiao X, Wang C, Willmarth NE, Casimiro MC, Li W, Ju X, Kim SH, Lisanti MP, Katzenellenbogen JA, Pestell RG. Cyclin D1 integrates estrogen-mediated dna damage repair signaling. Cancer Res, 2014. 421. Dok R, Kalev P, Van Limbergen EJ, Asbagh LA, Vazquez I, Hauben E, Sablina A, Nuyts S. p16INK4a impairs homologous recombination-mediated DNA repair in human papillomavirus-positive head and neck tumors. Cancer Res 74:1739-1751, 2014. 422. Pestell RG. New roles of cyclin D1. Am J Pathol 183:3-9, 2013. 423. Ferreira AC, Robaina MC, Rezende LM, Severino P, Klumb CE. Histone deacetylase inhibitor prevents cell growth in burkitt's lymphoma by regulating PI3K/akt pathways and leads to upregulation of miR-143, miR-145, and miR-101. Ann Hematol 93:983-993, 2014. 424. Pungartnik C, Picada J, Brendel M, Henriques JA. Further phenotypic characterization of pso mutants of with respect to DNA repair and response to oxidative stress. Genet Mol Res 1:79-89, 2002. 425. Brendel M, Bonatto D, Strauss M, Revers LF, Pungartnik C, Saffi J, Henriques JA. Role of PSO genes in repair of DNA damage of saccharomyces cerevisiae. Mutat Res 544:179-193, 2003. 426. Gong X, Carmon KS, Lin Q, Thomas A, Yi J, Liu Q. LGR6 is a high affinity receptor of R-spondins and potentially functions as a tumor suppressor. PLoS One 7:e37137, 2012. 427. de Lau W, Barker N, Low TY, Koo BK, Li VS, Teunissen H, Kujala P, Haegebarth A, Peters PJ, van de Wetering M, Stange DE, van Es JE, Guardavaccaro D, Schasfoort RB, Mohri Y, Nishimori K, Mohammed S, Heck AJ, Clevers H. Lgr5 homologues associate with wnt receptors and mediate R-spondin signalling. Nature 476:293-297, 2011. 428. Lee J, Beliakoff J, Sun Z. The novel PIAS-like protein hZimp10 is a transcriptional co-activator of the p53 tumor suppressor. Nucleic Acids Res 35:4523-4534, 2007. 429. 
Mahmoudi T, Boj SF, Hatzis P, Li VS, Taouatas N, Vries RG, Teunissen H, Begthel H, Korving J, Mohammed S, Heck AJ, Clevers H. The leukemia-associated Mllt10/Af10-Dot1l are Tcf4/beta-catenin coactivators essential for intestinal homeostasis. PLoS Biol 8:e1000539, 2010. 430. Mohan M, Herz HM, Takahashi YH, Lin C, Lai KC, Zhang Y, Washburn MP, Florens L, Shilatifard A. Linking H3K79 trimethylation to wnt signaling through a novel Dot1-containing complex (DotCom). Genes Dev 24:574-589, 2010. 431. Liu Z, Habener JF. Wnt signaling in pancreatic islets. Adv Exp Med Biol 654:391-419, 2010. 432. Kanda S, Miyata Y, Kanetake H. T-cell factor-4-dependent up-regulation of fibronectin is involved in fibroblast growth factor-2-induced tube formation by endothelial cells. J Cell Biochem 94:835-847, 2005. 433. Shao G, Patterson-Fortin J, Messick TE, Feng D, Shanbhag N, Wang Y, Greenberg RA. MERIT40 controls BRCA1- Rap80 complex integrity and recruitment to DNA double-strand breaks. Genes Dev 23:740-754, 2009. 434. Sue Ng S, Mahmoudi T, Li VS, Hatzis P, Boersema PJ, Mohammed S, Heck AJ, Clevers H. MAP3K1 functionally interacts with Axin1 in the canonical wnt signalling pathway. Biol Chem 391:171-180, 2010. 435. Pardo OE, Arcaro A, Salerno G, Raguz S, Downward J, Seckl MJ. Fibroblast growth factor-2 induces translational regulation of bcl-XL and bcl-2 via a MEK-dependent pathway: Correlation with resistance to etoposide-induced apoptosis. J Biol Chem 277:12040-12046, 2002.

436. Cross MJ, Lu L, Magnusson P, Nyqvist D, Holmqvist K, Welsh M, Claesson-Welsh L. The shb adaptor protein binds to tyrosine 766 in the FGFR-1 and regulates the ras/MEK/MAPK pathway via FRS2 phosphorylation in endothelial cells. Mol Biol Cell 13:2881-2893, 2002.
437. Ekici AB, Hilfinger D, Jatzwauk M, Thiel CT, Wenzel D, Lorenz I, Boltshauser E, Goecke TW, Staatz G, Morris-Rosendahl DJ, Sticht H, Hehr U, Reis A, Rauch A. Disturbed wnt signalling due to a mutation in CCDC88C causes an autosomal recessive non-syndromic hydrocephalus with medial diverticulum. Mol Syndromol 1:99-112, 2010.
438. Niwa Y, Masamizu Y, Liu T, Nakayama R, Deng CX, Kageyama R. The initiation and propagation of Hes7 oscillation are cooperatively regulated by fgf and notch signaling in the somite segmentation clock. Dev Cell 13:298-304, 2007.

Appendix

Supplementary Table 1. (III: unpublished data)
Supplementary Table 2. (III: unpublished data)

Supplementary Table 1. In silico functional characterization of the six candidate modifier loci of CHEK2:c.1100delC-associated breast cancer risk. The table includes the functional summary* of each risk variant (bold) as well as those tagging variants (r2 > 0.8) that have been predicted to have a functional consequence (RegulomeDB score < 5).

Variant | Distance (bp) | r2 (1000gen) | r2 (HM3) | r2 (HM2.2) | RegulomeDB score | Closest gene | Variant consequence | eQTL | Cell lines with active promoter | Cell lines with active enhancer | Cell lines with open chromatin | Bound proteins | Transcription factor binding motifs
5: minimal rs11249433 0 binding EMBP1 intronic K562 HSMM 4 cell lines Pit-1, Mef2, Pax-2, Pou1f1 evidence 5: minimal HMEC and BCL, Pax-4, Sin3Ak-20, rs11780156 0 binding MIR1208 intergenic 4 other cell Nhek, Htr8 Zfp281 evidence lines HMEC and Foxp1, GATA, HDAC2, Irf, 2b: likely to rs72722756 8531 1 MIR1208 intergenic 6 other cell 19 cell lines EP-300, FOS Nanog, Pax-5, RXRA, STAT, affect binding lines p300

2b: likely to HMEC and HCF1, SPI-B, Irf4, Ets, HNF4, rs12542202 10086 0.882 0.936 1 MIR1208 intergenic GM12878 5 cell lines RFX3, EP300, NFKB1, FOS affect binding 5 other cell Hsf, STAT lines 5: minimal ATE1 rs2981582 0 binding FGFR2 intronic NHEK 8 cell lines NF-kappaB, ZEB1 (skin) evidence HMEC and 2b: likely to NHEK, rs2981578 12006 0.844 FGFR2 intronic HepG2 13 other FOXA1, E2F1 Oct-1, Foxa, Pou2f2, Pou3f2 affect binding HMEC, H1 cell lines

HMEC and rs11075995 0 7: no data FTO intronic 3 other cell TP53 lines 4: minimal Melano, rs9923295 6631 0.957 binding FTO intronic HMEC NT2-D1, NANOG Egr-1, Irf, VDR evidence H7es 1f: likely to ELL affect binding (monocytes, HMEC and HMEC and rs4808801 0 and linked to ELL intronic LCL); 4 other cell 27 other Nrf-1 expression of ARRDC2 lines cell lines a target gene (skin) ELL CTCF, EBF1, PAX5C20, CTCFL, CTCF, E2A, HEN1, HMEC and 2a: likely to (LCL) RAD21, SMC3, YY1, ZNF143, Lmo2-complex, Myf, RXRA, rs8103622 1693 1 1 ELL intronic 118 other affect binding GABP, CTCFL, MAX, HSF1, Rad21, SMC3, ZBTB7A, cell lines GABPA, PAX5 MyoD, E12, E47
HMEC and HDAC2, Irf, Sox, TATA, 2b: likely to ELL rs2385089 20707 1 0.981 1 SSBP4 intergenic HepG2 4 other cell 25 cell lines CTCF Zfp105, p300, Srf, SRY, affect binding (LCL) lines Tcfap2e, Sox4, Sox11 1f: likely to SSBP4; HMEC and rs10442 25800 1 0.981 1 affect binding SSBP4 intronic (LCL) GM12878 H1, Huvec 33 other EGR1 Foxo, Hoxa9, Pbx-1, Pou3f2 and linked to ELL; cell lines HMEC and IKZF1, CREBBP, CTCFL, CCNT2, ERalpha-a, Egr-1, 2b: likely to K562 rs34746918 31397 1 SSBP4 intronic 4 cell lines 46 other POLR2A, CTCF, HMGN3, INSM1, PU.1, RXRA, SP1, affect binding (leukemia) cell lines RFX3, POL2 ZNF219, KROX 1f: likely to ELL HMEC and GM12878, rs7258465 37499 1 0.981 1 affect binding SSBP4 intronic (monocytes, 17 other ETS1, CREBBP, ELF1 Huvec and linked to LCL); cell lines 1f: likely to ELL rs7252848 41210 1 0.943 affect binding ELL intronic 4 cell lines (monocytes) and linked to 1d: likely to ELL rs4808136 47726 1 0.943 0.964 affect binding ELL intronic (monocytes, HepG2 47 cell lines HNF4A, RAD21, USF1, RXRA CTCF and linked to LCL) ELL HMEC and E2F, GATA, Pbx-1, Pbx3, 2b:
likely to H1, K562, rs10408290 28440 0.965 1 SSBP4 intronic (LCL); 24 other ZNF263 UF1H3BETA, ZBTB33, Tcf7l2, affect binding GM12878 ARRDC2 cell lines Pbx ANKLE1, MRPL34 5: minimal (LCL), rs2363956 0 binding ANKLE1 missense K562 8 cell lines NF-I, Pbx3, Smad4 GTPBP3, evidence OCEL1 (adipose) POLR2A, TFAP2C, TFAP2A, MAX, HMGN3, E2F6, ZNF263, ETS1, CCNT2, PAX5, AP-2, AP-4, Ascl2, BHLHE40, HMEC and HMEC and TAF1, CTCF, ZEB1, ELF1, 2b: likely to E2A, HEN1, LBP-1, Myf, rs8108174 594 1 ANKLE1 missense 8 other cell 100 other NRF1, TBP, GABPA, GATA1, affect binding NRSF, Rad21, Sin3Ak-20, lines cell lines YY1, ATF3, SMC3, EBF1, TCF12, p300 RAD21, RFX5, MYC, POL2, PAX5C20, POL24H8, GABP, AP2ALPHA, AP2GAMMA
1000gen = 1000 Genomes; HM = HapMap; rel = release; eQTL = expression quantitative trait locus.
Distance to the index variant is given for all variants in base pairs (bp). Tagging variants were searched from 1000 Genomes (1000gen) (351) and from HapMap releases 3 (HM3) (352) and 2.2 (HM2.2) (353).
* Functional data are a summary of SNAP (253), RegulomeDB (257), HaploReg (258) and GeneVar (254) database searches.

Supplementary Table 2. Functional summary of the 77 common low-penetrance breast cancer risk variants based on published literature and Ingenuity® Pathway Analysis (IPA®).

Locus / variant | Potential culprit genes | IPA® term* (IPA® enrichment p-value)†: Repair of DNA (1.4E-04) | Cell death (4.1E-04) | Cell cycle progression (2.1E-06) | Proliferation of cells (2.8E-05) | Differentiation of cells (2.4E-06) | PubMed (search key-words bolded): WNT signaling | FGF signaling | DNA repair | Link to CHEK2 | References
Interaction Interaction rs1045485 Downstream of CASP8 CASP8 CASP8 CASP8 between between 230, 354-358 rs17468277 CHEK2 pathways pathways rs999737 Homologous 359, 360 RAD51L1 rs10483813E recombination Downstream target of beta- Downstream of rs10069690 TERT TERT TERT TERT TERT TERT catenin, FGF signaling modulator of 264, 361-364 beta-catenin CDKN2A: CDKN2B, CDKN2A, CDKN2A, CDKN2A, downstream rs1011970 CDKN2A CDKN2A CDKN2A CDKN2B CDKN2B CDKN2B target of beta- 264, 365 catenin 9 rs10472076 RAB3C, PDE4D PDE4D PDE4D PDE4D

Interaction with Downstream of TP53 target 264, 361, 366-372 rs10759243 KLF4 KLF4 KLF4 KLF4 KLF4 beta-catenin FGF signaling selectivity Upstream of beta- Downstream of 264, 373, 374 rs10771399E PTHLH PTHLH PTHLH PTHLH PTHLH catenin and LEF1 FGF signaling

rs10941679

Double-strand break repair via rs10995190 ZNF365 ZNF365 homologous 264, 375 recombination Regulation of canonical and rs11075995E FTO 264, 265, 376 non-canonical WNT pathways Interaction rs11199914E FGFR2 FGFR2 FGFR2 FGFR2 FGFR2 between Core FGF pathway 264, 269, 270, 377 pathways Downstream 9, 264, 378 rs11242675 FOXQ1 FOXQ1 target of WNT h
NOTCH2: NOTCH2: interaction rs11249433 NOTCH2, EMBP1 NOTCH2 NOTCH2 NOTCH2 NOTCH2 downstream of between 263, 264, 266-268 beta-catenin pathways DCLRE1B: PTPN22, interstrand cross- DCLRE1B, PHTF1, rs11552449PE DCLRE1B PTPN22 link repair; BCL2L15, RSBN1, telomere 9, 264, 379-381 AP4B1 protection Double-strand break repair Double-strand rs11571833 BRCA2 BRCA2 BRCA2 BRCA2 BRCA2 through break repair homologous pathway 9, 264, 382, 383 recombination MYC: canonical MYC: coupling of pathway MYC: interaction DNA replication MYC, MIR1208, CHEK2: direct MYC rs11780156E MYC MYC MYC MYC PVT1: MYC between and DNA repair PVT1 target downstream pathways (target genes of 9, 264, 272, 384-387 target MYC) 9, 264 rs11814448E DNAJC1

WNT effector 264, 388 rs11820646E BARX2 complex 264 rs12422552E ATF7IP

Interaction Upstream rs12493607 TGFBR2 TGFBR2 TGFBR2 between regulation of FGF 9, 264, 389-391 pathways signaling Upstream of rs12662670 ESR1 ESR1 ESR1 ESR1 ESR1 WNT/beta-catenin 348, 392 pathway Upstream OSR1: rs12710696E OSR1, MIR4757 regulation of WNT 264, 376, 393, 394 downstream signaling
TBX3: downstream of beta-catenin TBX3: paracrine rs1292011 TBX3, MED13 TBX3 TBX3 TBX3 TBX3 MED13: upstream signaling regulation of WNT 264, 395-398 signaling 264 rs132390 EMID1

MYC: coupling of MYC: interaction DNA replication MYC: canonical CHEK2: direct MYC rs13281615E MYC, POU5F1B MYC MYC MYC MYC between and DNA repair pathway target pathways (target genes of 185, 264, 272, 385-387 MYC) 9, 264 rs13329835 CDYL2

264 rs13387042 TNP1 TNP1

264 rs1353747E RAB3C, PDE4D PDE4D PDE4D PDE4D

9, 264 rs1432679 EBF1 EBF1

9, 264 rs1436904 CHST9, AQP4

9, 264 rs1550623P RPS2P18,CDCA7 CDCA7 rs16857609E DIRC3

USP44: counteracts double-strand rs17356907E NTN4, USP44 NTN4 NTN4 break repair repair through RNF8 and 9, 264, 399 RNF168 264 rs17529111 FAM46A
Regulation of canonical and rs17817449E FTO non-canonical 9, 264, 265 WNT pathways Downstream of Downstream of rs2016394PE DLX2 beta-catenin, MYC 264, 400-402 FGF8 binding site 9, 264 rs204247PE RANBP9 RANBP9

ESR1: Upstream of ESR1, CCDC170, rs2046210PE ESR1 ESR1 ESR1 ESR1 WNT/beta-catenin 264, 392, 403 C6orf97 pathway Downstream of 9, 264, 404, 405 rs2236007P PAX9,SLC25A21 PAX9 FGF signaling ANKLE1, ABHD8, 264 rs2363956PE PDE4C FBXO18: ANKRD16, GDI2, rs2380205E homologous 264, 406, 407 FBXO18 recombination Downstream of Homologous 9, 264, 360 rs2588809E RAD51L1 FGF signaling recombination Downstream target of beta- Downstream of rs2736108 TERT TERT TERT TERT TERT TERT catenin, FGF signaling modulator of 9, 361-364 beta-catenin 264 rs2823093 NRIP1 NRIP1

9, 264 rs2943559 HNF4G

Interaction rs2981579E FGFR2 FGFR2 FGFR2 FGFR2 FGFR2 between Core FGF pathway 9, 269, 270, 377 pathways Interaction rs2981582E FGFR2 FGFR2 FGFR2 FGFR2 FGFR2 between Core FGF pathway 185, 264, 269, 270, 377 pathways
Upstream of 348, 392 rs3757318 ESR1 ESR1 ESR1 ESR1 ESR1 WNT/beta-catenin pathway KCNN4, ZNF283, 9, 264 E rs3760982 SMG9, LYPD5, KCNN4 264 rs3803662 TOX3

264 rs3817198 TNNT3, LSP1 LSP1

FOSL1: downstream of MUS81: Holliday CFL1, OVOL1, beta-catenin and junction release, SNX32, MUS81, lef1 FOSL1: CFL1, homologous MUS81: direct rs3903072E AP5B1, FOSL1, MUS81 MUS81 FOSL1 CFL1: planar cell downstream of MUS81 recombination, CHEK2 target C11orf68, BANF1, polarity pathway FGF8 and FGF18 interstrand cross- CTSW, EFEMP2 (non-canonical WNT pathway) link repair 9, 264, 408-413 OVOL1: MYC MDM4, MDM4: TP53 MDM4: direct 264, 376, 414-416 rs4245739PE MDM4, PIK3C2B MDM4 MYC binding site PIK3C2B regulator CHEK2 target ELL: upstream ELL: transcription ELL, ISYNA1, ELL: oncoprotein rs4808801E ELL ELL regulation of WNT restart after DNA 9, 264, 273, 313, 417 SSBP4, UBA52 upstream of FGF2 signaling repair LOC84931, 9, 264 rs4849887 INHBB 264 rs4973768 NEK10

264 rs527616E LOC728606

Downstream rs554219E CCND1 CCND1 CCND1 CCND1 CCND1 CCND1 target of beta- 418-423 catenin 9, 264 rs6001930E MKL1, SGSM3 MKL1

Downstream Homology- TP53 signaling rs614367 CCND1 CCND1 CCND1 CCND1 CCND1 CCND1 target of beta- directed DNA 264, 419-423 pathway catenin repair
9, 264 rs616488 PEX14

264 rs6472903 HNF4G

COX11: metabolic STXBP4, HLF, 264, 424, 425 rs6504950E COX11 protection against COX11 oxidative stress Interaction rs6678914E LGR6 between 264, 376, 426, 427 pathways 9, 264 rs6762644E ITPR1, EGOT ITPR1 ITPR1

9, 264 rs6828523 ADAM29

Co-activator of 264, 428 rs704010E ZMIZ1 ZMIZ1 TP53 MLLT10: DNAJC1,MLLT10, rs7072776 TCF4/beta-catenin 9, 264, 429, 430 C10orf114 coeffector ARHGEF5, 9, 264 rs720475 MYC binding site NOBOX Downstream Homology- TP53 signaling rs75915166 CCND1 CCND1 CCND1 CCND1 CCND1 CCND1 target of beta- directed DNA 418-423 pathway catenin repair Interaction Canonical WNT rs7904519E TCF7L2 TCF7L2 TCF7L2 TCF7L2 TCF7L2 between 9, 264, 431, 432 pathway pathways C19orf62: Double- strand break rs8170PE C19orf62 C19orf62 NR2F6 repair through homologous 9, 433 recombination Interaction with Downstream of TP53 target 264, 361, 366-372 rs865686 KLF4 KLF4 KLF4 KLF4 KLF4 beta-catenin FGF signaling selectivity MAP3K1, MIER3, MAP3K1: MAP3K1: 264, 434-436 rs889312 SETD9, MAP3K1 MAP3K1 interaction interaction MGC33648 b b
DVL suppression, rs941764 CCDC88C upstream of beta- 9, 264, 437 catenin RPL17P33, DUSP4: FGF 9, 264, 438 rs9693444E DUSP4 DUSP4 MYC binding site C8orf75, DUSP4 signaling inhibitor 9, 264 rs9790517P TET2 TET2

P Variant tags a region of promoter histone marks in human mammary epithelial cells according to HaploReg results.
E Variant tags a region of enhancer histone marks in human mammary epithelial cells according to HaploReg results.
* Biological functions with which the 77 common breast cancer risk variants are associated. From the analyses, a general term covering the highest number of variants was selected for this table from each group of multiple related functions (e.g. “cell death” instead of “cell death of ovarian cancer cell lines”).
† P-value for enrichment of the biological function in the 77 common breast cancer risk variants.
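The inclusion rule stated in the caption of Supplementary Table 1 (tagging variants with r2 > 0.8 and a RegulomeDB score < 5) can be illustrated with a minimal Python sketch. The function names `regulome_rank` and `is_reported_tag` are hypothetical and are not part of the original analysis. Two points are assumptions: RegulomeDB scores with a letter suffix (such as 2b or 1f) are compared on their leading number only, and a variant is taken to pass the r2 threshold if it exceeds 0.8 in at least one reference panel (1000 Genomes, HapMap 3 or HapMap 2.2), which the caption does not specify.

```python
import re


def regulome_rank(score: str) -> int:
    """Return the leading numeric category of a RegulomeDB score ('2b' -> 2)."""
    match = re.match(r"\s*(\d+)", score)
    if match is None:
        raise ValueError(f"unrecognised RegulomeDB score: {score!r}")
    return int(match.group(1))


def is_reported_tag(r2_values, score, r2_min=0.8, rank_max=5):
    """Check the caption's rule: r2 > 0.8 and RegulomeDB score < 5.

    Assumption: the r2 threshold has to be met in at least one panel;
    panels without a value are passed as None and ignored.
    """
    known = [r2 for r2 in r2_values if r2 is not None]
    return bool(known) and max(known) > r2_min and regulome_rank(score) < rank_max


# Values as listed in Supplementary Table 1.
print(is_reported_tag([0.882, 0.936, 1.0], "2b: likely to affect binding"))  # rs12542202
print(is_reported_tag([0.957, None, None], "4: minimal binding evidence"))   # rs9923295
# Hypothetical variant, not from the table: well tagged but score 5.
print(is_reported_tag([0.95, None, None], "5: minimal binding evidence"))
```

Index variants are listed in the table regardless of their score, so the check applies only to the tagging variants.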