Published OnlineFirst August 8, 2011; DOI: 10.1158/0008-5472.CAN-11-1630

Cancer Research: Molecular and Cellular Pathobiology

Genome-wide Methylation Analysis Identifies Genes Specific to Breast Cancer Hormone Receptor Status and Risk of Recurrence

Mary Jo Fackler1, Christopher B. Umbricht1,2, Danielle Williams1, Pedram Argani3, Leigh-Ann Cruz1, Vanessa F. Merino1, Wei Wen Teo1, Zhe Zhang1, Peng Huang1, Kala Visvananthan1,4, Jeffrey Marks5, Stephen Ethier6, Joe W. Gray7, Antonio C. Wolff1, Leslie M. Cope1, and Saraswati Sukumar1

Abstract

To better understand the biology of hormone receptor–positive and –negative breast cancer and to identify methylated gene markers of disease progression, we carried out a genome-wide methylation array analysis on 103 primary invasive breast cancers and 21 normal breast samples, using the Illumina Infinium HumanMethylation27 array that queried 27,578 CpG loci. Estrogen and/or progesterone receptor–positive tumors displayed more hypermethylated loci than estrogen receptor (ER)-negative tumors. However, the hypermethylated loci in ER-negative tumors were clustered closer to the transcriptional start site compared with ER-positive tumors. An ER-classifier set of CpG loci was identified, which independently partitioned primary tumors into ER subtypes. A total of 40 (32 novel and 8 previously known) CpG loci showed differential methylation specific to either ER-positive or ER-negative tumors. Each of the 40 ER subtype–specific loci was validated in silico, using an independent, publicly available methylome dataset from the Cancer Genome Atlas. In addition, we identified 100 methylated CpG loci that were significantly associated with disease progression; the majority of these loci were informative particularly in ER-negative breast cancer. Overall, the set was highly enriched in homeobox-containing genes. This pilot study shows the robustness of the breast cancer methylome and illustrates its potential to stratify and reveal biological differences between ER subtypes of breast cancer. Furthermore, it defines candidate ER-specific markers and identifies potential markers predictive of outcome within ER subgroups. Cancer Res; 71(19); 6195–207. ©2011 AACR.

Authors' Affiliations: Departments of 1Oncology, 2Surgery, and 3Pathology, Johns Hopkins University School of Medicine; 4Department of Epidemiology, Bloomberg School of Public Health, Baltimore, Maryland; 5Department of Surgery, Duke University School of Medicine, Durham, North Carolina; 6Department of Oncology, Wayne State University School of Medicine, Detroit, Michigan; and 7Biomedical Engineering, Oregon Health Sciences University, Portland, Oregon

Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).

Corresponding Author: Saraswati Sukumar or Leslie M. Cope, Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins, 1650 Orleans Street, CRB 1-Rm 143, Baltimore, MD 21231. Phone: 410-614-2479; Fax: 410-614-4073; E-mail: [email protected] or [email protected]

doi: 10.1158/0008-5472.CAN-11-1630

©2011 American Association for Cancer Research.

Introduction

Approximately 200,000 women are diagnosed each year in the United States with breast cancer, and nearly 50,000 die of their metastatic disease. Advances in both early detection and local/systemic therapy in the past few decades have significantly improved patient outcomes, especially survival. Breast cancers are characterized by their estrogen receptor (ER) and progesterone receptor (PR) status, and it is established that ER expression (ER-positive) identifies a tumor phenotype with improved near/midterm prognosis and likely benefits from adjuvant endocrine therapy when compared with ER-negative tumors. Yet, little is known about the genomic features within each ER subtype of breast cancer that could explain why some patients with the same ER status have a good outcome whereas others do poorly regardless of treatment.

Current decision algorithms based on standard clinicopathologic factors (1) stratify ER-negative disease as having a high risk for recurrence (2–4). Although patients are now routinely offered adjuvant chemotherapy, most patients with node-negative, ER-negative disease remain disease free after local therapy alone, including approximately 80% of ER-negative patients with tumors of 1 cm or less (5) and up to 60% of all with stage 1 disease (6). Consequently, there are patients with ER-negative disease who might do well without adjuvant chemotherapy and could avoid its potential toxicities, whereas others with a high residual risk despite it might be offered trials of novel therapies. Unfortunately, existing markers routinely used in clinical practice are of limited or no use in ER-negative patients (7). For example, commonly used tests by reverse

www.aacrjournals.org 6195

Downloaded from cancerres.aacrjournals.org on October 2, 2021. © 2011 American Association for Cancer Research. Published OnlineFirst August 8, 2011; DOI: 10.1158/0008-5472.CAN-11-1630

Fackler et al.

transcriptase PCR have no clear prognostic/predictive utility in ER-negative disease (8, 9), and microarray assays developed so far seem to identify essentially all such patients as high risk (10, 11), whereas other markers are still in development. Consequently, there is a critical need to develop better prognostic factors to improve assessment of residual risk and better predictive markers to optimize patient selection for standard and investigational systemic therapies.

Established clinically annotated tissue banks from prospective randomized clinical trials and extensive databases on expression profiling in breast cancer in the past decade have allowed the prospective–retrospective development of several prognostic and predictive tests. For instance, clinicians have come to accept forgoing adjuvant chemotherapy in patients with ER-positive, node-negative disease with a low risk of distant recurrence of less than 10% at 10 years according to the 21-gene expression profile Oncotype DX assay, while strongly recommending it in those with a residual risk of more than 20% despite 5 years of adjuvant tamoxifen (9). Similarly useful tests are urgently needed for ER-negative breast cancer.

Multiple published studies using candidate gene approaches have suggested the utility of analyzing genes that undergo tumor-specific and promoter-specific hypermethylation as biomarkers for early detection and for prediction of outcome in multiple types of cancer (reviewed in ref. 12). Methylated genes are particularly robust as biomarkers. In past studies, we developed a cancer detection panel using a quantitative cumulative methylation assay (quantitative multiplex methylation-specific PCR; QM-MSP) wherein the methylation status of multiple genes could be determined individually and cumulatively from picograms of input DNA, such as is retrieved from ductal lavage or ductoscopy (13, 14) and pathologic nipple discharge fluid (15). We and others have found that methylated genes are frequently detected in the preinvasive stage of ductal carcinoma in situ (DCIS; refs. 16–18). Furthermore, histopathologically normal ducts in the vicinity of tumor tissue display detectable hypermethylation of genes that are present in the adjacent DCIS or invasive cancer, whereas normal ducts present farther away do not (18–21). However, using the candidate marker approach, it has been difficult to identify markers informative of the biology specifically of ER-positive or ER-negative breast cancer or those that predict response to therapy, disease progression, and survival. Therefore, we tested whether a genome-wide discovery platform would identify gene loci in tumors that better predict clinical outcomes (22–26).

As the first step toward studies with clinical trial samples, we carried out methylation array analyses on a discovery set of 103 primary invasive tumors and 21 normal samples. We found that distinctly different gene CpG loci typify the methylome of ER-positive and ER-negative breast cancers. Forty gene loci were identified that stratified tumors according to ER status. We also identified a putative "prognostic signature" of 100 CpG loci that are individually and collectively associated with outcome in patients with breast cancer. This feasibility study shows that CpG locus methylation levels could reveal important biological differences in the epigenome between breast cancer subtypes and provide ancillary clinical diagnostic, prognostic, and predictive tools.

Materials and Methods

Tissues

Frozen breast cancer tissues that were excised from patients with stages 1 to 3 disease prior to treatment (n = 103) were retrieved from Surgical Pathology at Johns Hopkins Hospital (Baltimore, Maryland) and confirmed to contain more than 50% epithelial cells. Normal breast organoids were prepared by enzymatic digestion of reduction mammoplasty specimens (n = 15; median patient age = 52 years, range = 47–71). Normal ducts from breast tissue more than 2 cm away from the tumor (n = 6) were isolated from cryosections, using laser capture microdissection (PALM MicroBeam; Carl Zeiss Microimaging). The studies were done with Institutional Review Board approval. Tumor characteristics are provided in Table 1 and Supplementary Table S1.

Table 1. Characteristics of the ER+ and ER– primary breast cancer patients in the study

Characteristics | ER+ (N = 44) | ER– (N = 38)
Recurrences | 7 | 11
DFS at 5 y (estimated by Kaplan–Meier) | 87% | 71%
HER2+/no. of cases annotated | 4/20 | 10/25
Median % Ki67 | 20 | 50
AJCC stage I | 7 | 4
AJCC stage II | 22 | 19
AJCC stage III | 15 | 15
Median tumor size, mm | 28 | 59
Having <1 mm margin | 10 | 7
Therapy (a): Locoregion therapy | 3 | 11
Therapy (a): Endocrine | 0 | 2
Therapy (a): Hormone | 34 | 0
Therapy (a): Chemotherapy | 21 | 27

NOTE: A total of 21 additional samples were arrayed and used for ER classification, but not for outcome analyses. These 21 cases were excluded from the outcome analysis for the following reasons: neoadjuvant treatment (n = 8), samples obtained 6 months after the initial diagnosis (n = 3), and progression within 6 months after diagnosis (n = 10).
Abbreviations: DFS, disease-free survival or total follow-up in No Progression cases; AJCC, American Joint Committee on Cancer.
(a) Therapies add up to more than totals because of cases with both endocrine and chemotherapy; locoregional therapy only (surgery ± radiation).

6196 Cancer Res; 71(19) October 1, 2011 Cancer Research


Methylome Classifiers of Cancer Subtype and Prognosis

Genomic DNA extraction, sodium bisulfite conversion, and quality assurance

DNA extraction and quality assurance were carried out as described previously (27, 28) and in Supplementary Materials and Methods (Supplementary Fig. S1A).

Methylome analysis

Bisulfite-converted DNA was analyzed by using the Illumina Infinium Human Methylation27 BeadChip Kit (WG-311-1202) in the JHU DNA Microarray Core. Locus methylation was calculated as a β-value within GenomeStudio software, low to high ranging from 0 to 1, respectively.

Data analysis

Data were analyzed by GenomeStudio software (Illumina, Inc.) and Bioconductor in R (http://www.bioconductor.org). Unsupervised cluster analysis was used to visualize and characterize broad methylation patterns in the data. All tests were 2-tailed and P < 0.05 was considered significant. Cox regression and Kaplan–Meier plots were used to model associations between methylation levels and time to recurrence, with and without adjustment for relevant clinical covariates, and to identify potential predictive markers. Covariates used were patients' age at diagnosis, tumor grade, pathologic T stage, lymph node status, ER, progesterone receptor, type of primary surgery (with or without radiotherapy), and adjuvant therapy (chemotherapy and/or endocrine therapy).

To identify methylated genes associated with ER status and their biology, a different approach was taken, emphasizing genes in which methylation changed dramatically between ER-positive and ER-negative samples. To achieve this, the initial selection was based on large fold changes. To evaluate the predictive capability of a panel of loci associated with ER status, we used independent samples to carry out receiver operating characteristic (ROC) analysis of a summary score of methylation derived as follows: (i) methylation at each locus was standardized to have a common scale by subtracting the mean methylation level and dividing by the SD for that locus so that low methylation resulted in negative values, whereas high methylation gave positive values; (ii) high methylation was associated with ER-positive status at some loci, and with ER-negative status at others, so standardized methylation scores for these latter loci were multiplied by –1, such that a high score uniformly indicated ER-positive samples; and (iii) genes were combined by averaging the standardized methylation scores for each patient, and the average score used in ROC analyses. The same procedure was used to summarize multilocus homeobox panels associated with recurrence. Data can be accessed at the Gene Expression Omnibus (accession number GSE31979).

Validation in the Cancer Genome Atlas samples

To verify that patterns of methylation observed in association with ER status and risk of recurrence within the JHU cohort were characteristic of breast cancer, we downloaded and analyzed data publicly available from the Cancer Genome Atlas Project (TCGA; http://tcga-data.nci.nih.gov/). We selected TCGA to carry out this analysis because Illumina Meth27K was used, enabling direct comparisons for the same 50 bp CpG locus probes. In total, 185 samples were available on the Illumina 27k Human Methylation platform, and 465 samples were available on the Agilent G4502A Expression Array. Time to recurrence was not available at the time of download, but time to death was obtained for 342 of the samples queried on expression array and 182 samples queried on methylation array. Probe level data (TCGA level 2) were obtained for the methylation platform, whereas gene-level summaries (TCGA level 3) were used for RNA expression. Rank-based Spearman correlations were calculated between methylation and expression, using the 182 samples. Each methylation probe was mapped to the nearest gene by using the open source Illumina methylation platform annotation package available from Bioconductor (http://www.bioconductor.org/packages/2.6/data/annotation/html/IlluminaHumanMethylation27k.db.html), and correlations were calculated for probes mapping to genes found on the expression array. Benjamini–Hochberg adjusted P values are reported for each probe, alongside the correlation coefficient. Association between overall survival and methylation or expression was evaluated by Cox regression. The ability of molecular markers to predict ER status was measured by carrying out an ROC analysis by using the methylation and expression levels of individual genes as predictors and reporting the area under the ROC curve. For expression, the ROC analysis was based on the expectation of an inverse relationship between methylation and expression, so that in some cases, where a significant, positive association is observed between the 2 platforms, the area under the ROC may be substantially less than 0.5.

Quantitative multiplex methylation-specific PCR

The 2-step multiplexed methylation-specific PCR method was previously described (15, 27, 28). For AKR1B1 primers/probes, see Supplementary Materials and Methods.

Results

Methylation profiling of primary invasive breast cancer tumors

Whole-genome methylation array analysis was carried out by using the Illumina Infinium HumanMethylation27 BeadChip with primary invasive carcinoma samples (n = 103), samples from microdissected normal breast tissue distant from the primary tumor (n = 6), and epithelium-enriched organoids isolated from normal breast (n = 15). The array quantifies the proportion of methylated cytosines (5mC) to total cytosines at each of the 27,578 different CpG dinucleotides. The steps followed for our analysis are shown as a flowchart in Figure 1.

To characterize the overall methylation profile of primary invasive breast tumors, unsupervised hierarchical cluster analysis using the Manhattan distance was carried out on the most varied probes across tumors (1,378 gene loci, SD > 0.160; Fig. 2A). Two distinct clusters of tumors were


[Figure 1 flowchart, reconstructed from the panel text]

Panel A (ER status):
Discovery set of n = 103 sporadic invasive primary breast cancer samples, 15 normal breast organoids, and 6 normal breast tissue samples adjacent to tumor.
Inclusion: between-tumor β-value variation SD > 0.1; probe detection P < 0.0001; 8,376 probes.
Calculated fold change in hypermethylation between ER+ and ER– tumors for each probe (ranked highest to lowest).
Selected 100 probes with highest methylation in the ER+ sample group and 100 probes with highest methylation in the ER– sample group.
Reviewing individual samples: removed probes with hypermethylation in any of 21 individual normal breast samples; removed probes with hypermethylation among both ER+ and ER– breast tumor samples.
Result: ER+, 27 probes; ER–, 13 probes; 40 candidate CpG probes.
Validation set: performed external validation of association between methylation and ER status using TCGA breast cancer tumor set (n = 50).

Panel B (disease outcome):
Discovery set of n = 82 annotated sporadic invasive primary breast cancer samples, 15 normal breast organoids, and 6 normal breast tissue samples adjacent to tumor.
ER+, n = 44 (recurrent, n = 7; non-recurrent, n = 37); ER–, n = 38 (recurrent, n = 11; non-recurrent, n = 27).
Differential methylation analysis of recurrent (R) versus nonrecurrent (NR) cases, carried out separately for ER+ and ER–. Criteria: high in recurrent breast cancer; low in normal; significant differential methylation, FDR < 0.05 between R and NR.
Selected 50 probes with highest methylation in ER+ breast cancers and 50 probes with highest methylation in ER– breast cancers.
100 CpG loci evaluated in Kaplan–Meier plots and Cox coefficient model fit analysis (AIC), adjusted for ER status, tumor size, age, etc.

Figure 1. Schema outlining study design for analysis of association of methylation with ER status (A) and disease outcome (B). FDR, false discovery rate.
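The probe filtering and ranking steps in Figure 1A can be illustrated with a minimal Python sketch. This is not the authors' code (the study used GenomeStudio and Bioconductor in R). The function name `select_er_probes`, the input layout, the probe identifiers, the use of the sample SD, the rule that every sample must pass the detection threshold, and the use of group medians for the fold change are all assumptions made for illustration.

```python
from statistics import median, stdev


def select_er_probes(beta, detection_p, er_positive,
                     sd_min=0.1, p_max=1e-4, top_n=100):
    """Filter probes, rank by ER+/ER- fold change, return the top of each end.

    beta:         dict probe -> list of beta values, one per tumor
    detection_p:  dict probe -> list of detection P values, same order
    er_positive:  list of bool, True where the tumor is ER-positive
    All three structures are hypothetical stand-ins for array output.
    """
    fold = {}
    for probe, values in beta.items():
        # Inclusion 1: between-tumor variation (assumed: sample SD)
        if stdev(values) <= sd_min:
            continue
        # Inclusion 2: detection P below threshold (assumed: in every sample)
        if max(detection_p[probe]) >= p_max:
            continue
        pos = [v for v, is_pos in zip(values, er_positive) if is_pos]
        neg = [v for v, is_pos in zip(values, er_positive) if not is_pos]
        neg_median = median(neg)
        if neg_median == 0:
            continue  # ratio undefined; skipped in this sketch
        fold[probe] = median(pos) / neg_median

    # Rank highest to lowest ratio, then take each end of the ranking
    ranked = sorted(fold, key=fold.get, reverse=True)
    er_pos_top = [p for p in ranked if fold[p] > 1][:top_n]
    er_neg_top = [p for p in reversed(ranked) if fold[p] < 1][:top_n]
    return er_pos_top, er_neg_top, fold


# Toy input with made-up probe names and values
beta = {
    "probe_a": [0.9, 0.8, 0.1, 0.2],   # higher in ER+
    "probe_b": [0.1, 0.2, 0.8, 0.9],   # higher in ER-
    "probe_c": [0.5, 0.5, 0.5, 0.5],   # no variation, filtered out
    "probe_d": [0.9, 0.8, 0.1, 0.2],   # poor detection, filtered out
}
detection_p = {
    "probe_a": [1e-6] * 4,
    "probe_b": [1e-6] * 4,
    "probe_c": [1e-6] * 4,
    "probe_d": [1e-6, 1e-2, 1e-6, 1e-6],
}
er_positive = [True, True, False, False]

er_pos_top, er_neg_top, fold = select_er_probes(
    beta, detection_p, er_positive, top_n=1)
print(er_pos_top, er_neg_top)
```

The later steps in the schema (review against individual normal samples and removal of probes hypermethylated in both subtypes) are not included, because the thresholds used for that review are not fully specified in the flowchart.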

observed. Cluster 1 was enriched for ER-positive breast cancer (21 of 28, 75%), whereas cluster 2 contained 85% of the ER-negative tumors (41 of 75, 55% of total). Given the importance of ER in breast cancer, it is not surprising to observe a strong association between predominant methylation patterns and ER status (OR = 3.57; 95% CI, 1.27–11.20; P = 0.0082), but the result also highlights the importance of gene methylation in the disease process. The data also suggested additional subgroups within clusters 1 and 2 with distinct methylation profiles such as cluster 2B, which contains all ER–PR+ tumor samples.

Distinct groups of genes are specifically and reciprocally hypermethylated in ER-positive versus ER-negative breast cancer

Very little is known about the genomic features within each ER subtype of breast cancer that could explain why some patients have a good outcome, whereas others will do poorly regardless of treatment. To determine the differences in breast cancer biology/behavior between ER subtypes, we characterized methylation patterns at 8,376 selected CpG loci according to ER status. These loci met 2 criteria: (i) showed the most variation across primary tumors (SD > 0.100) and (ii) had


[Figure 2 panels. A, heatmap with clusters 1, 2A, and 2B; sample groups ER–, ER+, and normal; labeled loci DAB2IP, PDXK, HSD17B8, PER1, VIM, FAM78A, FAM89A, and RUNX3. B, histogram for log10 ER+/ER–: no. of probes (y-axis) versus fold difference in median loci methylation (log10) of ER+/ER– breast cancer (x-axis), with the 100 highest in ER– and the 100 highest in ER+ marked. C, distances to TSS for the 200 ER genes: density (y-axis) versus distance to TSS, –2,000 to 2,000 (x-axis), for ER–, ER+, and all loci. D, classification of ER subtype in TCGA methylation data: sensitivity versus 1 – specificity, AUC = 0.961.]

Figure 2. A, unsupervised hierarchical cluster analysis of the most varied 5% of CpG loci probes among tumors (1,378 loci, SD > 0.160). 2D hierarchical cluster analysis was carried out by using the Manhattan distance on 103 tumors and 1,378 loci and showed 2 distinct clusters. Cluster 1 is enriched for ER-positive (ER+) tumors (pink bars); regions of cluster 2 appear enriched for ER-negative (ER–) tumors (purple bars). Normal organoids cluster together (yellow bars). Examples of breast cancer-related gene loci are indicated at left. B, histogram plot showing the frequency of loci differentially methylated in ER-positive and ER-negative tumors. Plotted on the x-axis is the fold difference in the mean locus methylation of ER-positive/ER-negative tumors, among 8,376 loci with SD > 0.1 across all tumors. The 200 CpG loci (left and right shaded boxes) are most differentially methylated between ER-positive and ER-negative tumors. C, distance to the TSS of the top CpG loci identified in ER-positive versus ER-negative tumors. Location of 100 ER-positive (pink line), 100 ER-negative (purple line), and all 8,376 (black line) CpG loci was plotted relative to the TSS indicated as 0 on the x-axis. D, validation of ER subtype markers. An ROC plot shows an AUC of 0.961 for detection of ER subtype in 50 independent tumors within TCGA, using the ER-specific 40 loci marker set (data in Fig. 3, Supplementary Table S2, and Supplementary Fig. S2).
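The summary score behind the ROC analysis in Figure 2D (standardize each locus, flip the sign for ER-negative loci, average per sample, as described in Materials and Methods) can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the function names `summary_scores` and `roc_auc`, the locus names, and the toy values are hypothetical, standardization is assumed to use the mean and sample SD of the samples being scored, and the AUC is computed with the rank-based (Mann–Whitney) formulation rather than any particular R package.

```python
from statistics import mean, stdev


def summary_scores(beta, er_negative_loci):
    """Average of standardized methylation per sample.

    beta:             dict locus -> list of beta values, one per sample
    er_negative_loci: set of loci where high methylation marks ER-negative
                      tumors; their z-scores are multiplied by -1 so that a
                      high score uniformly points to ER-positive samples.
    Assumes every locus varies across samples (SD > 0).
    """
    n_samples = len(next(iter(beta.values())))
    totals = [0.0] * n_samples
    for locus, values in beta.items():
        mu, sd = mean(values), stdev(values)      # step (i): standardize
        sign = -1.0 if locus in er_negative_loci else 1.0  # step (ii): flip
        for i, v in enumerate(values):
            totals[i] += sign * (v - mu) / sd
    return [t / len(beta) for t in totals]        # step (iii): average


def roc_auc(scores, is_positive):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative sample count as half a pair.
    """
    pos = [s for s, y in zip(scores, is_positive) if y]
    neg = [s for s, y in zip(scores, is_positive) if not y]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy example: two hypothetical loci, four samples (two ER+, two ER-)
beta = {
    "locus_er_pos": [0.9, 0.8, 0.2, 0.1],
    "locus_er_neg": [0.1, 0.2, 0.7, 0.9],
}
er_status = [True, True, False, False]
scores = summary_scores(beta, er_negative_loci={"locus_er_neg"})
auc = roc_auc(scores, er_status)
print(scores, auc)
```

Flipping the sign before averaging is what lets loci with opposite directions of association reinforce rather than cancel each other in a single score.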

probe detection P < 0.0001 (indicating that DNA from that locus was present above background levels and that probe intensities were consistently measured across replicate beads; the distribution of methylation among these loci is shown in Supplementary Fig. S1B). A substantial number of loci were observed with median methylation levels of 0.15 or more in both groups of tumor and normal breast organoids. However, the majority of loci were more highly methylated in tumor than in normal organoid samples; 1,744 loci in tumors had median methylation more than 2-fold higher than normal organoids (Supplementary Fig. S1C).

ER-positive tumors were found to have a higher frequency of hypermethylated gene loci than ER-negative tumors (Fig. 2B). Methylation was higher in ER-positive tumor samples at 5,264 loci (ratio > 1) and higher in ER-negative tumors at 3,112 loci (ratio < 1). The top 100 hypermethylated CpG loci in each group of ER-positive and ER-negative tumors were selected (Fig. 2B; ER-negative loci, ratio: 0.52–0.15 and ER-positive loci, ratio: 3.98–2.23; Supplementary Table S2). Interestingly, ER-negative tumors had a higher number of hypermethylated loci located closer to the transcriptional start site (TSS) than ER-positive tumors or the 8,376 array loci as a whole (Fig. 2C). This finding suggested a more rigorous suppression of gene expression by methylation in the ER-negative subtype, because methylated regions overlapping the TSS have been shown to most tightly negatively regulate transcription.

To further refine this set to identify ER subtype–specific biological/molecular functions most driven by the epigenome in breast cancer, we selected a subgroup of 40 hypermethylated loci of the 200 CpG locus set that individually showed the highest subtype specificity in individual tumor samples. Each locus was selected because its methylation profile showed (i) robust reciprocal methylation between the 2 ER subtypes, (ii) an incidence of more than 20% of methylation within the breast cancer subtype, and (iii) low methylation in normal breast epithelium/stroma and leukocytes (β < 0.15; Fig. 3). Using these selection criteria, in the discovery set, we identified 27 loci/probes aberrantly and reciprocally hypermethylated in ER-positive tumors and 13 loci/probes aberrantly hypermethylated in ER-negative tumors. The majority of these were at loci newly identified as hypermethylated in


[Figure 3 panels. Each plot shows β-value methylation (y-axis, 0.0 to 1.0) across breast cancer samples (x-axis), grouped as ER Pos, ER Neg, and N, for the Discovery Set (JH) and the Validation Set (TCGA). A, ADAMTSL1, TNFSF9, FLJ34922 (SLFN11), HEY2, PDXK, and HSD17B8. B, EVI1 and DAB2IP.]

Figure 3. CpG methylation biomarkers of ER-positive versus ER-negative subtypes in primary tumors. Left (Discovery set, JH), CpG loci were evaluated by using 103 primary tumors. y-axis, β-methylation; x-axis, breast cancer samples; and N, normal breast organoids. The ER status of the tumors is denoted on top of each box. Right (Validation set, TCGA), validation on an independent dataset (n = 50) from TCGA is shown. Representatives of novel (A) and previously known (B) CpG loci specifically hypermethylated in ER+ and ER– breast cancers are shown.


breast cancer, and some never observed before as hypermethylated in cancer (Tables 2 and 3). As shown in Figure 3A, Supplementary Figure S2, and Supplementary Table S3, ACADL, ADAMTSL1, ARFGAP3, B3GAT1, CDCA7, FAM78A, FAM89A, FLJ31951 (RNF145), FLJ34922 (SLFN11), GAS6, HAAO, HEY2, HOXB9, ITGA11, NETO, PROX1, PSAT1, RECK, SMOC1,

Table 2. Hypermethylated loci newly identified in breast cancers

Gene symbol | Identified in this study | Hypermethylated in other cancers | Known aberrant expression | Gene | Location
ACADL | ER+ | No | No | Acyl-CoA dehydrogenase, long chain | Cytoplasm
ADAMTSL1 | ER+ | No | No | ADAMTS-like 1 | Extracellular
ARFGAP3 | ER+ | No | No | ADP-ribosylation factor GTPase activating 3 | Cytoplasm
B3GAT1 | ER+ | No | No | β-1,3-Glucuronyltransferase 1 | Cytoplasm
CDCA7 | ER+ | No | ER– (37) | Cell division cycle associated 7 | Nucleus
FAM78A | ER+ | No | No | Family with sequence similarity 78, member A | Unknown
FAM89A | ER+ | No | No | Family with sequence similarity 89, member A | Unknown
FLJ31951 (RNF145) | ER+ | No | Basal (38) | Unknown | Unknown
FLJ34922 (SLFN11) | ER+ | No | No | Schlafen family member 11 | Nucleus
GAS6 | ER+ | No | No | Growth arrest–specific 6 | Extracellular
HAAO | ER+ | No | No | 3-Hydroxyanthranilate 3,4-dioxygenase | Cytoplasm
HEY2 | ER+ | No | No | Hairy/enhancer-of-split related with YRPW motif 2 | Nucleus
HOXB9 | ER+ | No | No | Homeobox B9 | Nucleus
ITGA11 | ER+ | No | No | Integrin, alpha 11 | Plasma membrane
NETO | ER+ | No | No | Neuropilin (NRP) and tolloid (TLL)-like 2 | Unknown
PROX1 | ER+ | No | No | Prospero homeobox 1 | Nucleus
PSAT1 | ER+ | No | Basal (38) | Phosphoserine aminotransferase 1 | Cytoplasm
RECK | ER+ | Yes | No | Reversion-inducing cysteine-rich protein with kazal motifs | Plasma membrane
SMOC1 | ER+ | No | No | SPARC related modular calcium binding 1 | Extracellular
SND1 | ER+ | No | No | Staphylococcal nuclease and tudor domain containing 1 | Nucleus
TNFSF9 | ER+ | Yes | No | TNF (ligand) superfamily, member 9 | Extracellular
ADHFE1 | ER– | Yes | No | Alcohol dehydrogenase, iron containing, 1 | Unknown
DYNLRB2 | ER– | No | No | Dynein, light chain, roadblock-type 2 | Cytoplasm
HSD17B8 | ER– | No | No | Hydroxysteroid (17-beta) dehydrogenase 8 | Cytoplasm
PISD | ER– | No | No | Phosphatidylserine decarboxylase | Cytoplasm
PDXK (C21orf124) | ER– | No | No | Pyridoxal kinase (vitamin B6 kinase) | Cytoplasm
WNK4 | ER– | No | No | WNK lysine deficient protein kinase 4 | Plasma membrane


Table 3. Hypermethylated loci previously identified in breast cancers

Gene symbol | Published | Hypermethylated in other cancers | Known aberrant expression | Gene | Location
EVI1 (MECOM) | ER+ (25, 39) | No | Basal (40) | MDS1 and EVI1 complex locus | Nucleus
ETS1 | ER+ (25) | | Basal (38) | v-ets erythroblastosis virus E26 oncogene homologue 1 | Nucleus
IRF7 | ER+ (25) | Yes | | Interferon regulatory factor 7 | Nucleus
LYN | ER+ (25) | Yes | Basal (38, 41, 42) | v-yes-1 Yamaguchi sarcoma viral related oncogene homolog | Cytoplasm
PDXK | ER+ (29) | No | | Pyridoxal kinase (vitamin B6 kinase) | Cytoplasm
PTGS2 (COX2) | ER+ (25) | Yes | Basal (38) | Prostaglandin-endoperoxide synthase 2 | Cytoplasm
RUNX3 | ER+ (43) | Yes | Basal (38) | Runt-related transcription factor 3 | Nucleus
VIM | ER+ | Yes | Basal (41) | Vimentin | Cytoplasm
DAB2IP (4 loci) | ER– (25) | Yes | | DAB2 interacting protein | Plasma membrane
HSD17B4 | ER– (44) | Yes | ER+ (44) | Hydroxysteroid (17-β) dehydrogenase 4 | Cytoplasm
PER1 | ER– (23) | | | Period homologue 1 (Drosophila) | Nucleus

SND1, and TNFSF9 were found hypermethylated almost exclusively in ER-positive tumors, whereas ADHFE1, DYNLRB2, HSD17B8, PDXK, PISD, and WNK4 were found hypermethylated in ER-negative tumors. A number of genes previously reported as having subtype-specific methylation were also identified. EVI1, ETS1, IRF7, LYN, PTGS2 (COX2), RUNX3, and VIM were found to be hypermethylated in ER-positive tumors, whereas DAB2IP, HSD17B4, and PER1 were reported to be hypermethylated in ER-negative breast cancers (detailed information and references in Table 3). A second distinct CpG locus of PDXK was previously found hypermethylated in ER-positive breast cancers (29). We did not find any gene that was preferentially methylated in ER-positive or ER-negative tumors where the literature conflicted with our data. The concordance between current and published data is shown for 2 of these gene CpG loci, EVI1 (23–25) and DAB2IP (23), in Figure 3B. Thus, many novel and some published gene loci were discovered that showed tumor-specific and ER subtype–specific hypermethylation. Existing literature provided further validity to our current observations.

External validation of methylation array findings in an independent test set of primary tumors

Next, we validated these findings in publicly available data on the breast cancer samples in TCGA (http://tcga-data.nci.nih.gov/), using an ROC analysis to evaluate predictive ability (Supplementary Table S2A). The median area under the ROC curve for the 200 loci was 0.7; 1 gene, SERPINA12, had an area under the curve (AUC) of 0.95. In all, 156 of 200 ER probes yielded AUCs higher than 0.563, a range in which we expect only 5% of CpG loci by chance alone. Interestingly, expression of most of these same genes is also a very strong predictor of ER status. Here, 121 of the 175 unique genes from our ER panel and available on the expression array had areas under the curve exceeding the same 5% threshold. This is consistent with the high degree of correlation observed between expression and methylation measurements of these genes in the TCGA data. At a false discovery rate of 0.05, 142 of 200 CpG loci are significantly inversely correlated with expression. And, as seen in Figure 3, the TCGA data provided support for the existence of ER subtype–specific methylation in breast cancer. To evaluate the predictive performance of the 40 locus panel (Fig. 2D), we derived an average methylation score for the entire set as described in Materials and Methods. Using this score, ROC analysis showed a high classification accuracy for the ER subtype in TCGA data with an area under the ROC curve of 0.961, with a specificity of 89% at a sensitivity of 90% (Fig. 2D; details in Supplementary Materials and Methods). A similar composite score derived from expression probes for the same genes showed some discriminatory ability in the TCGA data, albeit reduced, with an area under the ROC of 0.667 (data not shown).

CpG loci associated with disease progression in patients with newly diagnosed invasive breast cancer

To develop an epigenomic signature that predicts outcome in patients with breast cancer, we conducted differential methylation analysis on primary tumors from recurrent versus nonrecurrent breast cancers. We used a subgroup of 82 well-annotated, invasive breast tumors derived from the discovery set of 103 tumors that included 44 ER-positive (7 recurrences) and 38 ER-negative (11 recurrences) breast cancers and


independently queried the ER-positive and ER-negative tumor groups (Table 1; Supplementary Table S1) as follows. Differential methylation analysis was carried out in GenomeStudio, using the DiffScore algorithm to compare tumors which later recurred with those that did not. The analysis was carried out separately on the ER-positive and ER-negative tumor groups. Candidate loci (50 per ER subtype) that met the following 3 criteria were selected: (i) more highly methylated in recurrent tumors than in nonrecurrent tumors, (ii) relatively unmethylated in normal samples (β < 0.15), and (iii) significantly differentially methylated above the false discovery rate cutoff (5%). Next, we carried out a multivariate Cox regression analysis for each of these candidate loci and generated Kaplan–Meier plots showing the interrelationships between ER status and methylation, with methylation depicted in these plots as high or low relative to the median methylation level for each CpG locus. From these 100 candidate CpG loci, a set of 32, selected for high Cox coefficients (Supplementary Table S4A) and visually striking Kaplan–Meier plots (Fig. 4; Supplementary Fig. S3), was followed up most closely, including with an extensive literature search to identify previous associations with outcome in breast cancer (Supplementary Table S4C). Novel associations with poor outcome were identified for (i) TMEM179, CRMP1, and SCNN1B in ER-positive breast cancer; (ii) ALX1, COL14A1, EPHA5, EYA4, FLRT2, GPX7, KCNB2, LAMA1, LHX1, NEUROG1, POU3F2, and STMN3 in ER-negative breast cancer; and (iii) AKR1B1, COL6A2, EYA4, GPX7, HOXA13, HOXB13, NKX6-2, NRP2, POU4F2, REM1, and SLITRK2 in both ER-positive and ER-negative tumors. Cox regression P values for each member of the 100 CpG loci set are presented in Supplementary Table S4A. Because the differential methylation analysis was designed to find loci most highly methylated in recurrent tumors, we did not observe hypomethylated loci associating with recurrence.

To verify array data using an independent assay, and to ensure future technical translation of the HumanMethylation27 array data to laboratory assays, we tested several methylated genes, such as EVI1, DAB2IP, and AKR1B1, by carrying out QM-MSP. In each case, we observed an excellent correlation between the levels of gene methylation assessed by both assays. A comparison of the level of methylation in AKR1B1 assessed by the array and by QM-MSP in individual primary tumors, with both datasets also plotted as Kaplan–Meier plots, is shown in Figure 4C.

A striking observation was that nearly 20% of the recurrence loci (18 of 100 loci, 15 of 91 unique genes) were from homeobox-containing genes including the HOX, LHX, POU, ALX, and NK6 gene families (Fig. 5; Supplementary Table S4A). With only 375 homeobox loci (189 genes) present in the 27,578 loci (14,495 genes) array, this represented a dramatic enrichment of homeobox genes in our 100 loci recurrence related set (OR = 16.17, P = 6.515e-13). These data clearly implicate methylated homeobox genes as key factors in tumor progression. To determine if the other homeobox loci on the array exhibited similar methylation patterns, we extended our analysis to 60 homeobox loci which showed high variance (SD above the 95th percentile for the array) among the tumors, excluding the 18 loci represented in the recurrence sets. Two-dimensional (2D) hierarchical cluster analysis (using the Manhattan distance) was carried out to characterize these loci. As shown in the heatmap in Figure 5A, the 18 homeobox gene loci derived from the 100 recurrence locus set have distinctive methylation patterns, showing significant comethylation within the first cluster, with highly methylated samples tending to be methylated for all the loci. Interestingly, a similar clustering profile was observed with the 60 homeobox loci (Fig. 5B), suggesting that the homeobox genes as a group have a common methylation signature. To evaluate correlation with recurrence, we derived an average methylation score for the panel as described in Materials and Methods. In a multivariate analysis that included age, stage, treatment, and ER status, there was clear evidence of a significant additional and independent contribution to the model where the Cox coefficient was 1.74, with a P value of 0.0042. Kaplan–Meier plots for the 18 and the 60 homeobox loci (but not for all 1,378 CpG loci that showed differential methylation across all the tumors) illustrate their predictive value. These results support the notion that highly methylated homeobox loci and loss of their expression may contribute to poor outcome in breast cancer.

External validation of associations with outcome in an independent test set of primary tumors

Next, we sought to validate these findings in publicly available TCGA breast cancer samples (http://tcga-data.nci.nih.gov/), using Cox regression to evaluate association between methylation and overall survival; progression-free survival was not available for these samples at the time of download. In total, survival information was available for 342 of the samples available on expression array, of which 182 were also available on methylation array. Despite the change of outcome variable and moderate sample sizes, results in TCGA data as a set confirmed the findings that these genes are significantly associated with outcome. An overwhelming majority of our recurrence marker loci (78 of 100) have positive Cox regression coefficients, indicating that hypermethylation of these loci is associated with a worse outcome in these samples as well. In comparison, we would expect only half of these loci to have positive Cox coefficients by chance alone, giving a composite P value of 2.2e-09 in support of the association. Additional confirmation for the panel is provided by the fact that for more than two thirds of these genes, Cox regression analysis of TCGA expression data shows that low expression correlates with worse outcome. This result is wholly consistent with the observed methylation results, and statistically significant in its own right, with a P value of 0.00022. This is consistent with the high degree of correlation observed between expression and methylation measurements of these genes in the TCGA data. At a false discovery rate of 0.05, 43 of 100 CpG loci are significantly inversely correlated with expression. We also carried out a multivariate Cox regression analysis for each of these candidate loci and generated Kaplan–Meier plots for the sets of 18 loci (log-rank test, P = 0.00027) and 60 loci (log-rank test, P = 0.00036), compared with the top 5% of varied probes (1,378 probes, P = 0.112), showing significant interrelationships between homeobox gene methylation and survival (data not shown).
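The Kaplan–Meier plots described in this section group tumors as high or low relative to the median methylation level of a CpG locus. A minimal Python sketch of that idea follows. It is an illustration under stated assumptions, not the authors' pipeline: the function names `median_split` and `kaplan_meier` and the toy values are hypothetical, and ties at the median are assigned to the low group by choice.

```python
from statistics import median


def median_split(beta_values):
    """Label each sample 'high' or 'low' relative to the median beta value.

    Samples exactly at the median are labeled 'low' (an arbitrary choice
    made for this sketch).
    """
    cutoff = median(beta_values)
    return ["high" if b > cutoff else "low" for b in beta_values]


def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate.

    times  : follow-up time per sample (e.g., months)
    events : True if the event (recurrence) was observed, False if censored
    Returns a list of (time, survival_probability) at each event time.
    """
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk = len(times)
    survival = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = times[order[i]]
        n_events = 0
        n_leaving = 0
        # Gather every sample that shares this follow-up time.
        while i < len(order) and times[order[i]] == t:
            n_events += int(events[order[i]])
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Hypothetical toy data, not taken from the study.
betas = [0.05, 0.10, 0.60, 0.80]
months = [60, 48, 12, 30]
recurred = [False, False, True, True]

groups = median_split(betas)
for label in ("high", "low"):
    idx = [i for i, g in enumerate(groups) if g == label]
    curve = kaplan_meier([months[i] for i in idx], [recurred[i] for i in idx])
    print(label, curve)
```

Comparing the two curves formally would need a log-rank test, which this sketch leaves out.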


Figure 4. Methylated CpG loci associated with disease progression. Kaplan–Meier plots show CpG loci with strong associations between hypermethylation and high rate of disease progression among 82 tumors. Recurrence is shown for ER+ (dashed line) and ER- tumors (solid line). High (red line) and low (blue line) methylation was defined relative to the median methylation level for a given CpG locus within ER subtype. CpG loci associated with a high rate of recurrence in A, ER+ and ER- breast tumors; B, ER-negative breast tumors. Cox regression P values for each of the 100 CpG loci are presented in Supplementary Table S4A. C, verification of HumanMethylation27 array data (top, blue bars) by QM-MSP analysis (bottom) for AKR1B1; y-axis, methylation levels (β-value for array, % methylation for QM-MSP). Kaplan–Meier curves using array and QM-MSP values.

[Figure 4 image not reproduced. Panel titles: A, "Methylated genes in ER+ and ER- tumors with higher rates of recurrence"; B, "Methylated genes in ER- tumors with higher rates of recurrence"; C, "Array verification" and "Survival analysis" for AKR1B1-cg18416881. Axis labels: overall survival (fraction) versus follow-up time (months) for the Kaplan–Meier plots; β value methylation (array) and % methylation (QM-MSP) versus primary tumors for the verification plot. The survival analysis inset for AKR1B1-cg18416881 lists χ2 values of 8.501 and 8.450 with P values of 0.0035 and 0.0037 for the two assays.]


Figure 5. 2-D hierarchical cluster analysis of CpG loci in homeobox genes. Methylation levels of homeobox 18 loci from the 100 recurrence markers (A) and 60 additional unique loci in 82 primary invasive tumors (B). Kaplan–Meier plots of homeobox 18 loci set, log-rank P = 0.033 (C), 60 loci set, P = 0.025 (D), and unsupervised top 5% most varied array probes across tumor samples, P = 0.438 (E). Strong enrichment of homeobox genes was observed within the 100 recurrence loci set (table). The bars under the case dendrogram indicate recurrence (black, recurred < 60 months; salmon, did not recur by 60 months; gray, censored by 60 months) and ER (pink, ER-positive; purple, ER-negative) status.

[Figure 5 image not reproduced. Heatmap row labels: Recur, ER, POU4F1, HOXA13, NKX6-2, ALX1, POU4F2, POU4F2, LHX1, HOXB4, HOXB4, HOXB13, HOXB13, MIXL1, POU3F2, LHX2, VAX2, SIX6, SIX1, HHEX. Kaplan–Meier panel annotations: C, 18 homeobox CpG loci in 32 marker set, methylated ER-negative has higher recurrence, log-rank P = 0.0328; D, 60 homeobox CpG loci, methylated ER-negative has higher recurrence, log-rank P = 0.0247; E, all 1,378 CpG loci (most varied across tumors), log-rank P = 0.438. x-axis, follow-up time (months).]

Enrichment of Homeobox Genes Among 100 Recurrence Loci

                                            Odds ratio*   Confidence interval   P value
18/100 recurrence CpG loci (15/91 genes)    16.62         8.46–29.09            6.52E-13

* Total in array = 375 homeobox loci (189 genes) among 27,578 loci (14,495 genes)
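The enrichment figures in the table can be approximated from the stated counts with a 2×2 contingency table. The Python sketch below is hedged: the paper does not state which estimator or test produced the reported odds ratio, confidence interval, and P value, so the sample odds ratio and one-sided hypergeometric tail used here are assumptions, and the function names are hypothetical. With the locus counts the sample odds ratio comes out near 16.7 and with the gene counts near 16.1, close to but not identical with the reported 16.62 and 16.17. Exact-test implementations that report a conditional maximum-likelihood estimate give slightly different values from the sample odds ratio.

```python
from fractions import Fraction
from math import comb


def odds_ratio(hits_in_set, set_size, hits_on_array, array_size):
    """Sample odds ratio for a 2x2 table built from set/array counts."""
    a = hits_in_set                      # homeobox, in recurrence set
    b = set_size - hits_in_set           # non-homeobox, in recurrence set
    c = hits_on_array - hits_in_set      # homeobox, not in set
    d = array_size - set_size - c        # non-homeobox, not in set
    return (a * d) / (b * c)


def hypergeom_upper_tail(hits_in_set, set_size, hits_on_array, array_size):
    """P(X >= hits_in_set) when set_size items are drawn without
    replacement from an array containing hits_on_array 'hits'."""
    total = comb(array_size, set_size)
    upper = min(set_size, hits_on_array)
    favorable = sum(
        comb(hits_on_array, k) * comb(array_size - hits_on_array, set_size - k)
        for k in range(hits_in_set, upper + 1)
    )
    # Exact rational arithmetic avoids overflow in the huge binomials.
    return float(Fraction(favorable, total))


# Counts as stated in the text and table footnote.
or_loci = odds_ratio(18, 100, 375, 27578)   # locus level
or_genes = odds_ratio(15, 91, 189, 14495)   # gene level
p_loci = hypergeom_upper_tail(18, 100, 375, 27578)

print(f"sample OR (loci):  {or_loci:.2f}")
print(f"sample OR (genes): {or_genes:.2f}")
print(f"one-sided hypergeometric P (loci): {p_loci:.3g}")
```

The one-sided tail computed here is not expected to reproduce the reported P value exactly, because the sidedness and test used in the paper are not specified.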

Discussion

In this study, we report the results of a genome-wide array analysis of primary invasive breast cancers of 27,578 CpG loci. This screen identified hypermethylated genes that specifically segregate with ER-positive or ER-negative tumor subtypes, which were then validated in silico by using the newly populated TCGA breast cancer database. The array analysis also identified 100 gene loci that were enriched for homeobox-containing genes and predicted recurrence in breast cancers. Many novel hypermethylated loci were identified. In summary, we show that the methylome is a rich source of genes whose hypermethylation has the potential to significantly contribute to the understanding of ER subgroups of breast cancer and predict recurrence in ER-positive and in ER-negative breast cancers.

We observed a significantly higher frequency of hypermethylation in ER-positive than ER-negative tumors (P < 0.0001). The reason for the hypermethylated phenotype of ER-positive tumors is not yet clear. The simplest explanation could be that the ER-positive markers are drivers that enhance the expression of the DNA methyltransferases or inhibit the repair processes that remove methyl groups from DNA. On the contrary, reduced methylation in ER-negative tumors might offer an explanation for their relative aggressiveness because the uncontrolled expression of growth factors and their receptors may be facilitated by removing the protective imposition of methylation-mediated silencing. The observation of a higher frequency of hypermethylated genes in ER-positive tumors is substantiated by 5 recent studies describing the breast cancer methylome (22, 24–26, 29). In a study examining 44 primary tumors, Hill and colleagues (22) confirmed a highly significant difference (ANOVA, P = 0.001) in hypermethylation depending on hormone receptor status. Similarly, in the Fang study (26), ER/PR-positive tumors displayed high level of methylation across the top 5% variant loci in the 27K Illumina array. Our study of 103 primary breast cancers identified many novel loci that have the ability to impressively segregate ER-positive and ER-negative tumors (Tables 2 and 3; Fig. 3A and B), shedding light on many novel


pathways and constituent genes that may be involved in the genesis of these subgroups. We tested the strength of these observations by using an independent dataset from TCGA for both methylation and expression. All 40 CpG loci showed reproducible associations with ER subtype and these markers classified essentially all tumors into the correct ER subtype (AUC, 0.961). Interestingly, expression of the same genes was also found to be a very strong predictor of ER status. Thus, independent TCGA data strongly validated our findings. With the caveat that both the discovery and validation cohorts represent small sample sets, the reproducibility of the findings supports the strength of this platform to reveal differences that can now be studied in detail.

A major goal of our work is to find markers that can prognosticate recurrence and predict benefit from therapy. Expression array-based analyses have proven to be useful for ER-positive breast cancers (30, 31). Their utility, however, has been limited in ER-negative breast cancer. Also, DNA mutation and copy number studies have been found less useful in breast cancer compared with other cancers (e.g., lung and colon), probably reflecting the greater diversity of breast cancer subtypes (32–34). Epigenetically mediated gene silencing through DNA methylation occurs extremely frequently and has now been accepted as a major driver in neoplastic transformation, especially in the breast (12). Genome-wide methylation analysis allowed us to identify a tumor recurrence marker set of 100 gene loci; a few specific to ER-positive tumors, many loci specific to ER-negative tumors, and many common to both (Fig. 4; Supplementary Table S4C). The emergence of a homeobox gene methylation signature predictive of recurrence among the 100 recurrence loci and also among all homeobox loci on the array is notable. Substantial inverse correlation was seen between methylation and expression of the genes, both of the ER-stratifying set and the recurrence set of CpG loci, suggesting functional relevance to the effects of methylated genes observed in our study. These genes play critical roles in differentiation and development, growth factor receptor signaling, angiogenesis, and more recently an unequivocal role in stem cell function (reviewed in ref. 35). At the same time, particularly within the recurrence panel, expression alone could not duplicate the level of performance achieved with methylation probes.

The tissues used for the current analysis were from an institutional cohort of frozen specimens and are therefore samples of convenience with their inherent drawbacks. Additional studies will need to address the question of the precise role of methylation signatures in prognosticating outcome and predicting response to therapy. More discovery and validation will need to be carried out with annotated samples from controlled studies, with more uniform standards of sample collection, such as in the context of large mature randomized clinical trials. To allow investigation on archival specimens, the rapid development of methods to retrieve high-quality DNA from paraffin-embedded tissues is imperative. Our recent success in standardizing restoration of DNA retrieved from formalin-fixed, paraffin-embedded tissues, in collaboration with Illumina (our unpublished data; ref. 36), bodes well for the future of these investigations.

In summary, this study has shown the feasibility of distinguishing ER subtype in breast cancers and possibly predicting outcome based on CpG DNA methylation. The study suggests pathways that may explain distinctive behaviors among ER-positive and ER-negative tumors. In conclusion, the data strongly support upcoming planned studies that will use existing clinically annotated tissues from previously conducted prospective randomized trials to examine the prognostic outcome and predictive therapeutic information offered by methylation markers in a prospective–retrospective fashion.

Disclosure of Potential Conflicts of Interest

No potential conflicts of interest were disclosed.

Acknowledgments

We thank Dr. Wayne Yu for carrying out the methylation microarray, Drs. Gedge Rosen and Michelle Manahan for providing the reduction mammoplasty tissues, and Ms. Areli Lopez for excellent technical assistance.

Grant Support

This work was supported by grants from the Rubenstein and Cohen families and the Breast Cancer SPORE: P50-CA-88843.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received May 11, 2011; revised August 3, 2011; accepted August 4, 2011; published OnlineFirst August 8, 2011.

References

1. Carlson RW, Allred DC, Anderson BO, Burstein HJ, Carter WB, Edge SB, et al. Breast cancer. Clinical practice guidelines in oncology. J Natl Compr Canc Netw 2009;7:122–92.
2. Hammond ME, Hayes DF, Dowsett M, Allred DC, Hagerty KL, Badve S, et al. American Society of Clinical Oncology/College Of American Pathologists guideline recommendations for immunohistochemical testing of estrogen and progesterone receptors in breast cancer. J Clin Oncol 2010;28:2784–95.
3. Harris L, Fritsche H, Mennel R, Norton L, Ravdin P, Taube S, et al. American Society of Clinical Oncology 2007 update of recommendations for the use of tumor markers in breast cancer. J Clin Oncol 2007;25:5287–312.
4. Wolff AC, Hammond ME, Schwartz JN, Hagerty KL, Allred DC, Cote RJ, et al. American Society of Clinical Oncology/College of American Pathologists guideline recommendations for human epidermal growth factor receptor 2 testing in breast cancer. J Clin Oncol 2007;25:118–45.
5. Fisher B, Dignam J, Tan-Chiu E, Anderson S, Fisher ER, Wittliff JL, et al. Prognosis and treatment of patients with breast tumors of one centimeter or less and negative axillary lymph nodes. J Natl Cancer Inst 2001;93:112–20.
6. Fisher B, Jeong JH, Dignam J, Anderson S, Mamounas E, Wickerham DL, et al. Findings from recent National Surgical Adjuvant Breast and Bowel Project adjuvant studies in stage I breast cancer. J Natl Cancer Inst Monogr 2001:62–6.


7. Marchionni L, Wilson RF, Wolff AC, Marinopoulos S, Parmigiani G, Bass EB, et al. Systematic review: gene expression profiling assays in early-stage breast cancer. Ann Intern Med 2008;148:358–69.
8. Paik S, Shak S, Tang G, Kim C, Baker J, Cronin M, et al. A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N Engl J Med 2004;351:2817–26.
9. Paik S, Tang G, Shak S, Kim C, Baker J, Kim W, et al. Gene expression and benefit of chemotherapy in women with node-negative, estrogen receptor-positive breast cancer. J Clin Oncol 2006;24:3726–34.
10. Buyse M, Loi S, van't Veer L, Viale G, Delorenzi M, Glas AM, et al. Validation and clinical utility of a 70-gene prognostic signature for women with node-negative breast cancer. J Natl Cancer Inst 2006;98:1183–92.
11. Simon R. Development and evaluation of therapeutically relevant predictive classifiers using gene expression profiling. J Natl Cancer Inst 2006;98:1169–71.
12. Sharma S, Kelly TK, Jones PA. Epigenetics in cancer. Carcinogenesis 2010;31:27–36.
13. Fackler MJ, Malone K, Zhang Z, Schilling E, Garrett-Mayer E, Swift-Scanlan T, et al. Quantitative multiplex methylation-specific PCR analysis doubles detection of tumor cells in breast ductal fluid. Clin Cancer Res 2006;12:3306–10.
14. Locke I, Kote-Jarai Z, Fackler MJ, Bancroft E, Osin P, Nerurkar A, et al. Gene promoter hypermethylation in ductal lavage fluid from healthy BRCA gene mutation carriers and mutation-negative controls. Breast Cancer Res 2007;9:R20.
15. Fackler MJ, Rivers A, Teo WW, Mangat A, Taylor E, Zhang Z, et al. Hypermethylated genes as biomarkers of cancer in women with pathologic nipple discharge. Clin Cancer Res 2009;15:3802–11.
16. Fackler MJ, McVeigh M, Evron E, Garrett E, Mehrotra J, Polyak K, et al. DNA methylation of RASSF1A, HIN-1, RAR-beta, cyclin D2 and Twist in in situ and invasive lobular breast carcinoma. Int J Cancer 2003;107:970–5.
17. Lee JS, Fackler MJ, Teo WW, Lee JH, Choi C, Park MH, et al. Quantitative promoter hypermethylation profiles of ductal carcinoma in situ in North American and Korean women: potential applications for diagnosis. Cancer Biol Ther 2008;7:1398–406.
18. Van der Auwera I, Bovie C, Svensson C, Trinh XB, Limame R, van Dam P, et al. Quantitative methylation profiling in tumor and matched morphologically normal tissues from breast cancer patients. BMC Cancer 2010;10:97.
19. Fu DY, Wang ZM, Wang BL, Chen L, Yang WT, Shen ZZ, et al. Frequent epigenetic inactivation of the receptor tyrosine kinase EphA5 by promoter methylation in human breast cancer. Hum Pathol 2010;41:48–58.
20. Lehmann U, Langer F, Feist H, Glockner S, Hasemeier B, Kreipe H. Quantitative assessment of promoter hypermethylation during breast cancer development. Am J Pathol 2002;160:605–12.
21. Umbricht CB, Evron E, Gabrielson E, Ferguson A, Marks J, Sukumar S. Hypermethylation of 14-3-3 sigma (stratifin) is an early event in breast cancer. Oncogene 2001;20:3348–53.
22. Hill VK, Ricketts C, Bieche I, Vacher S, Gentle D, Lewis C, et al. Genome-wide DNA methylation profiling of CpG islands in breast cancer identifies novel genes associated with tumorigenicity. Cancer Res 2011;71:2988–99.
23. Li L, Lee KM, Han W, Choi JY, Lee JY, Kang GH, et al. Estrogen and progesterone receptor status affect genome-wide DNA methylation profile in breast cancer. Hum Mol Genet 2010;19:4273–7.
24. Van der Auwera I, Yu W, Suo L, Van Neste L, van Dam P, Van Marck EA, et al. Array-based DNA methylation profiling for breast cancer subtype discrimination. PLoS One 2010;5:e12616.
25. Holm K, Hegardt C, Staaf J, Vallon-Christersson J, Jonsson G, Olsson H, et al. Molecular subtypes of breast cancer are associated with characteristic DNA methylation patterns. Breast Cancer Res 2010;12:R36.
26. Fang F, Turcan S, Rimner A, Kaufman A, Giri D, Morris LG, et al. Breast cancer methylomes establish an epigenomic foundation for metastasis. Sci Transl Med 2010;3:75ra25.
27. Fackler MJ, McVeigh M, Mehrotra J, Blum MA, Lange J, Lapides A, et al. Quantitative multiplex methylation-specific PCR assay for the detection of promoter hypermethylation in multiple genes in breast cancer. Cancer Res 2004;64:4442–52.
28. Swift-Scanlan T, Blackford A, Argani P, Sukumar S, Fackler MJ. Two-color quantitative multiplex methylation-specific PCR. Biotechniques 2006;40:210–9.
29. Kamalakaran S, Varadan V, Giercksky Russnes HE, Levy D, Kendall J, Janevski A, et al. DNA methylation patterns in luminal breast cancers differ from non-luminal subtypes and can identify relapse risk independent of other clinical variables. Mol Oncol 2011;5:77–92.
30. Albain KS, Paik S, van't Veer L. Prediction of adjuvant chemotherapy benefit in endocrine responsive, early breast cancer using multigene assays. Breast 2009;18 Suppl 3:S141–5.
31. Kim C, Paik S. Gene-expression-based prognostic assays for breast cancer. Nat Rev Clin Oncol 2010;7:340–7.
32. Sjoblom T, Jones S, Wood LD, Parsons DW, Lin J, Barber TD, et al. The consensus coding sequences of human breast and colorectal cancers. Science 2006;314:268–74.
33. Leary RJ, Lin JC, Cummins J, Boca S, Wood LD, Parsons DW, et al. Integrated analysis of homozygous deletions, focal amplifications, and sequence alterations in breast and colorectal cancers. Proc Natl Acad Sci U S A 2008;105:16224–9.
34. Wood LD, Parsons DW, Jones S, Lin J, Sjoblom T, Leary RJ, et al. The genomic landscapes of human breast and colorectal cancers. Science 2007;318:1108–13.
35. Shah N, Sukumar S. The Hox genes and their roles in oncogenesis. Nat Rev Cancer 2010;10:361–71.
36. Le J, Pokholok D, Teo WW, Barnes B, Steemers F, Hansen M, et al. Analysis of restored FFPE samples on Infinium methylation arrays [abstract]. In: Proceedings of the 102nd Annual Meeting of the American Association for Cancer Research; 2011 Apr 2–6; Orlando, Florida. Philadelphia (PA): AACR; 2011. Abstract nr LB-178.
37. Choschzick M, Lebeau A, Marx AH, Tharun L, Terracciano L, Heilenkotter U, et al. Overexpression of cell division cycle 7 homolog is associated with gene amplification frequency in breast cancer. Hum Pathol 2010;41:358–65.
38. Charafe-Jauffret E, Ginestier C, Monville F, Finetti P, Adelaide J, Cervera N, et al. Gene expression profiling of breast cell lines identifies potential new basal markers. Oncogene 2006;25:2273–84.
39. Dejeux E, Ronneberg JA, Solvang H, Bukholm I, Geisler S, Aas T, et al. DNA methylation profiling in doxorubicin treated primary locally advanced breast tumours identifies novel genes associated with survival and treatment response. Mol Cancer 2010;9:68.
40. Patel JB, Appaiah HN, Burnett RM, Bhat-Nakshatri P, Wang G, Mehta R, et al. Control of EVI-1 oncogene expression in metastatic breast cancer cells through microRNA miR-22. Oncogene 2011;30:1290–301.
41. Neve RM, Chin K, Fridlyand J, Yeh J, Baehner FL, Fevr T, et al. A collection of breast cancer cell lines for the study of functionally distinct cancer subtypes. Cancer Cell 2006;10:515–27.
42. Choi YL, Bocanegra M, Kwon MJ, Shin YK, Nam SJ, Yang JH, et al. LYN is a mediator of epithelial-mesenchymal transition and a target of dasatinib in breast cancer. Cancer Res 2010;70:2296–306.
43. Lau QC, Raja E, Salto-Tellez M, Liu Q, Ito K, Inoue M, et al. RUNX3 is frequently inactivated by dual mechanisms of protein mislocalization and promoter hypermethylation in breast cancer. Cancer Res 2006;66:6512–20.
44. Fiegl H, Millinger S, Goebel G, Muller-Holzner E, Marth C, Laird PW, et al. Breast cancer DNA methylation profiles in cancer cells and tumor stroma: association with HER-2/neu status in primary breast cancer. Cancer Res 2006;66:29–33.


Genome-wide Methylation Analysis Identifies Genes Specific to Breast Cancer Hormone Receptor Status and Risk of Recurrence

Mary Jo Fackler, Christopher B. Umbricht, Danielle Williams, et al.

Cancer Res 2011;71:6195-6207. Published OnlineFirst August 8, 2011.

Updated version Access the most recent version of this article at: doi:10.1158/0008-5472.CAN-11-1630

Supplementary Material Access the most recent supplemental material at: http://cancerres.aacrjournals.org/content/suppl/2011/08/08/0008-5472.CAN-11-1630.DC1
