Leukemia (2008) 22, 1035–1043 © 2008 Nature Publishing Group All rights reserved 0887-6924/08 $30.00 www.nature.com/leu ORIGINAL ARTICLE

DNA methylation profiles in diffuse large B-cell lymphoma and their relationship to expression status

BL Pike1, TC Greiner2, X Wang1, DD Weisenburger2, Y-H Hsu1, G Renaud3, TG Wolfsberg3, M Kim1,4, DJ Weisenberger1,4, KD Siegmund5, W Ye5, S Groshen5, R Mehrian-Shai1, J Delabie6, WC Chan2, PW Laird1,4 and JG Hacia1

1Department of Biochemistry and Molecular Biology, University of Southern California, Los Angeles, CA, USA; 2Department of Pathology and Microbiology, University of Nebraska Medical Center, Omaha, NE, USA; 3Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA; 4Department of Surgery, University of Southern California, Los Angeles, CA, USA; 5Department of Preventive Medicine, University of Southern California, Los Angeles, CA, USA and 6Department of Pathology, Norwegian Radium Hospital, University of Oslo, Oslo, Norway

In an initial epigenetic characterization of diffuse large B-cell lymphoma (DLBCL), we evaluated the DNA methylation levels of over 500 CpG islands. Twelve CpG islands (AR, CDKN1C, DLC1, DRD2, GATA4, GDNF, GRIN2B, MTHFR, MYOD1, NEUROD1, ONECUT2 and TFAP2A) showed significant methylation in over 85% of tumors. Interestingly, the methylation levels of a CpG island proximal to FLJ21062 differed between the activated B-cell-like (ABC-DLBCL) and germinal center B-cell-like (GCB-DLBCL) subtypes. In addition, we compared the methylation and expression status of 67 genes proximal (within 500 bp) to the methylation assays. We frequently observed that hypermethylated CpG islands are proximal to genes that are expressed at low or undetectable levels in tumors. However, many of these same genes were also poorly expressed in DLBCL tumors where their cognate CpG islands were hypomethylated. Nevertheless, the proportional reductions in BNIP3, MGMT, RBP1, GATA4, IGSF4, CRABP1 and FLJ21062 expression with increasing methylation suggest that epigenetic processes strongly influence these genes. Lastly, the moderate expression of several genes proximal to hypermethylated CpG tracts suggests that DNA methylation assays are not always accurate predictors of gene silencing. Overall, further investigation of the highlighted CpG islands as potential clinical biomarkers is warranted.

Leukemia (2008) 22, 1035–1043; doi:10.1038/leu.2008.18; published online 21 February 2008

Keywords: epigenetics; genomics; CpG island; microarray; gene expression

Correspondence: Professor JG Hacia, Department of Biochemistry and Molecular Biology, University of Southern California, 2250 Alcazar Street, IGM 240, Los Angeles, CA 90089, USA. E-mail: [email protected]
Received 26 May 2007; revised 11 January 2008; accepted 18 January 2008; published online 21 February 2008

Introduction

Diffuse large B-cell lymphoma (DLBCL) is an aggressive malignancy of the mature B-lymphocyte that accounts for approximately one-third of non-Hodgkin lymphoma.1,2 Extensive analyses of gene expression profiles3–5 and genomic copy number6–9 in DLBCL have provided valuable insights into their cellular origins and the molecular bases for their variable clinical behaviors. For example, gene expression analyses have identified two major DLBCL subtypes, germinal center B-cell-like (GCB-DLBCL) and activated B-cell-like (ABC-DLBCL), that originate from different stages of normal B-cell development.4 Patients with GCB-DLBCL have a substantially longer median overall survival than patients with ABC-DLBCL.3,4,10

In contrast to these genetic characterizations, much less is known about epigenetic changes present in DLBCL. Although widespread epigenetic analyses could yield insights into disease origins and provide diagnostic and prognostic biomarkers, larger scale studies have focused on NHL cell lines.11 Here, we have taken the initial steps in the widespread epigenetic characterization of DLBCL by comparing the DNA methylation levels of over 500 gene-associated CpG islands in tumors. Using a two-phase methylation screening strategy, we identified genes that are frequently methylated in all DLBCL tumors and uncovered evidence of epigenetic differences between ABC-DLBCL and GCB-DLBCL tumor subtypes. Furthermore, we identified genes that show proportional reductions in expression in response to increased methylation levels in nearby CpG islands. Overall, we highlight candidate DNA methylation changes associated with DLBCL that also could warrant further investigation as potential clinical biomarkers.

Materials and methods

Samples and patients
Frozen tissue sample DNA from diagnostic tumor biopsies from 21 ABC-DLBCL and 24 GCB-DLBCL patients acquired prior to anthracycline-based chemotherapy was obtained from the University of Nebraska Medical Center and the Norwegian Radium Hospital. There was consensus central pathology re-review of the specimens to confirm the diagnosis of DLBCL and that the samples had >75% tumor cells. Clinical information has been obtained from all patients according to a protocol approved by the University of Nebraska Medical Center Institutional Review Board.

CpG island microarray assays
CpG island microarrays, consisting of 4395 PCR products, and nucleic acid targets were prepared as previously described and used for two-color hybridization analyses (Supplementary Figure 1).12,13 Briefly, tumor DNA was digested with MseI and ligated to double-stranded linkers prior to division into equal test and reference fractions. The test fraction was digested with the methylation-dependent McrBC endonuclease (5′…Pu mC [N40–3000] Pu mC…3′), while the reference fraction was untreated. Later, both fractions were subjected to amplification with linker-specific primers. In theory, only those fragments in the test fraction lacking a methylated McrBC recognition sequence remain intact after digestion and serve as PCR templates. The amplified test and reference fractions were random prime labeled with Cy3- or Cy5-dUTP and combined in a cocktail prior to hybridization with CpG island microarrays.12,13 Later, the microarrays were subjected to two washes, dried by centrifugation and imaged using the ScanArray 5000 (GSI Lumonics Inc., Boston, MA, USA) with ScanArray Express software (Perkin Elmer, Waltham, MA, USA). Signal intensities and quality metrics for each clone were calculated using ImaGene microarray software (BioDiscovery Inc., El Segundo, CA, USA).

Microarray data processing
Log2-transformed background-subtracted hybridization signals were obtained for both test and reference targets and imported into Microsoft Excel. We employed multiple filters to ensure that our final analysis was limited to non-repetitive clones yielding the most robust data. First, we eliminated clones that either failed to amplify or gave multiple PCR products. We also disregarded data from clones with poor spot morphology or whose reference signal intensity was less than twice that of the local background. Lastly, we only analyzed clones with a reference signal intensity <30 000 units to reduce the effects of excessive cross-hybridization.

After data filtration, test and reference fraction hybridization signals were normalized using an interactive linear regression approach based on the signal intensities of mitochondrial clones.12 Since mitochondrial DNA is unmethylated,12,14 the signal intensities for both Cy3 and Cy5 are expected to be equal. Following normalization, the ratios of test and reference signals were calculated. To minimize experimental noise, ratios were truncated to a maximum value of one. These values reflect the fraction of unmethylated alleles in a given tumor sample. Overall, we obtained DNA methylation measurements for 592 CpG islands adjacent to 442 unique annotated genes in a minimum of four GCB-DLBCL and four ABC-DLBCL (Supplementary Table 1 and Supplementary Figure 2).

MethyLight
Using published protocols, tumor DNA was subjected to sodium bisulfite conversion and individual loci were amplified with methylation-specific primers that flank a methylation-specific reporter oligonucleotide (Supplementary Table 2).15 Samples were analyzed on an Opticon DNA Engine Continuous Fluorescence Detector (MJ Research/Bio-Rad, Hercules, CA, USA). Relative measurements of DNA methylation (reported as PMR, percentage of methylated reference) values were calculated based upon the performance of a normalizing control reaction (Alu: HB-313) in a 1:25 dilution series of in vitro methylated human reference sample.15

Bisulfite sequencing
Bisulfite PCR products representing specific CpG islands from selected DNA samples were subcloned and individual colonies sequenced as described in Supplementary Figure 3.

Gene expression profiling
Total RNA from frozen tumor biopsies was isolated and subjected to analysis on U133 Plus 2.0 Arrays (Affymetrix, Santa Clara, CA, USA) according to the manufacturer’s recommended protocols. We report normalized log2-transformed gene expression data for probe tilings with minimal cross-hybridization potential (that is, designated by Affymetrix as ‘_at’ tilings) that interrogate National Center for Biotechnology Information (NCBI)-designated Reference Sequence (RefSeq) transcripts located within 500 bp of MethyLight reactions (Supplementary Table 3).

[Figure 1 flowchart. Phase I analyses: 442 unique CpG islands evaluated using microarrays (screen panel of 7 ABC-DLBCL and 7 GCB-DLBCL); 15 candidates for MethyLight validation (screen panel of 7 ABC-DLBCL and 7 GCB-DLBCL); 3 candidate CpG islands with subtype-specific methylation. Phase II analyses: 217 MethyLight marker panel (>5% CV in panel of 2 ABC-DLBCL and 3 GCB-DLBCL); 89 MethyLight markers (screen panel of 7 ABC-DLBCL and 7 GCB-DLBCL); 6 candidate CpG islands with subtype-specific methylation. Confirmatory analysis: 8 unique CpG islands for MethyLight analysis in an additional 14 ABC-DLBCL, 17 GCB-DLBCL and 6 normal PBL.]

Figure 1 Flowchart of study design. Phase I and II analyses were conducted independent of one another. The identities and relevant methylation data from all candidate CpG islands with subtype-specific methylation levels are provided in Table 1. CV refers to the coefficient of variation. Note that only seven gene-associated CpG islands are interrogated by both the phase I CpG island microarray assays and phase II MethyLight assays.

Results and discussion

In phase I of our two-phase strategy to characterize epigenetic phenomena in DLBCL, we measured DNA methylation levels in seven ABC-DLBCL and seven GCB-DLBCL tumors using CpG island microarray assays12,16,17 (Figure 1). This provided a rapid means of evaluating the methylation-dependent cleavage of 442 unique gene-associated CpG islands by McrBC endonuclease (Supplementary Table 1 and Supplementary Figure 2). Because of the semi-quantitative nature of CpG island microarray-based analyses,18 we validated selected results using MethyLight, a quantitative bisulfite PCR platform15,19 (Table 1).

On the basis of our phase I CpG island microarray analyses, 15 candidate CpG islands revealed differences in methylation between subtypes of DLBCL (uncorrected Wilcoxon P<0.05 and greater than 5% difference in median methylation levels) (Table 1 and Supplementary Table 1). We used modest criteria to identify differential methylation since cross-hybridization of CpG island sequences can artificially compress methylation estimates. In our phase I confirmatory analyses, we developed novel MethyLight assays for each of these 15 candidate CpG islands and interrogated their methylation levels in the same group of 14 DLBCL (Table 1 and Supplementary Table 4). The methylation levels of 6 (CPVL, FLJ21062, GNMT, HOXC9, ONECUT2 and PRIMA1) of these 15 CpG islands showed PMR values greater than 10 in more than one tumor. Interestingly, based on an uncorrected Wilcoxon signed-rank test, MethyLight assays for FLJ21062 (HB-442, P = 0.009), GNMT (HB-426, P = 0.005) and ONECUT2 (HB-242, P = 0.018; HB-446, P = 0.025) showed differences between ABC-DLBCL and GCB-DLBCL (Table 1). The results from all replicate experiments were in excellent agreement (Supplementary Table 4).
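The microarray data processing steps described under Materials and methods (signal filters, normalization against mitochondrial clones, and truncation of test/reference ratios at one) can be illustrated with a minimal sketch. This is not the authors' spreadsheet workflow. It assumes linear-scale, background-subtracted intensities, replaces the regression approach cited in the text with a single least-squares fit through the origin, and uses hypothetical field names (`test`, `reference`, `background`, `mito`).

```python
def unmethylated_fractions(clones, max_reference=30000.0, min_fold=2.0):
    """Return {clone id: estimated fraction of unmethylated alleles}.

    clones: list of dicts with hypothetical keys 'id', 'test', 'reference',
    'background' (linear-scale intensities) and 'mito' (True for
    mitochondrial clones, which are assumed to be unmethylated).
    """
    # Filters from the text: reference signal at least twice the local
    # background and below 30 000 units.
    kept = [c for c in clones
            if c["reference"] >= min_fold * c["background"]
            and c["reference"] < max_reference]

    mito = [c for c in kept if c["mito"]]
    if not mito:
        raise ValueError("no mitochondrial clones passed the filters")

    # Assumption: one least-squares slope through the origin, fitted on
    # mitochondrial clones, rescales the reference channel so that
    # unmethylated clones give a test/reference ratio of one.
    slope = (sum(c["test"] * c["reference"] for c in mito)
             / sum(c["reference"] ** 2 for c in mito))

    # Ratios are truncated to a maximum of one, as described in the text.
    return {c["id"]: min(1.0, c["test"] / (slope * c["reference"]))
            for c in kept}
```

Under these assumptions, a clone whose normalized test signal is half of its reference signal is reported as 0.5, that is, roughly half of the alleles escaped McrBC cleavage.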

Table 1 Summary of phase I and II analyses to identify DLBCL subtype-specific methylation

Phase I

            CpG island microarray screen     MethyLight confirmation
Gene        ABC(a)    GCB(a)    P(b)         ABC(c)    GCB(c)    P(b)      ML ID
CEBPG       0.991     0.947     0.0099       0.0       0.0       1.000     HB-480
CENPH       0.944     0.913     0.0104       0.0       0.0       0.320     HB-428
CPVL        0.948     0.843     0.0163       61.3      60.5      0.340     HB-427(d)
FLJ21062    0.967     0.897     0.0446       0.0       23.4      0.009     HB-442(d)
GNMT        0.934     0.794     0.0062       0.0       55.6      0.005     HB-426(d)
HOXC9       0.953     0.892     0.0105       69.1      80.4      0.650     HB-440(d)
HTRA4       0.892     0.787     0.0472       0.1       0.30      0.064     HB-484
KLHL14      0.973     0.938     0.0209       2.2       3.6       0.220     HB-481
NOS1AP      0.900     0.985     0.0472       0.0       0.0       0.940     HB-485
ONECUT2     0.826     0.974     0.0143       97.4      50.5      0.018     HB-242(d)
ONECUT2     0.826     0.974     0.0143       82.0      36.7      0.025     HB-446(d)
PFDN5       0.919     0.812     0.0074       0.0       0.0       0.940     HB-424(d)
PFDN5       0.919     0.812     0.0074       0.0       0.0       0.320     HB-425
PHC2        0.900     0.841     0.0424       0.0       0.0       0.320     HB-483
PRIMA1      0.875     0.802     0.0321       31.5      64.6      0.140     HB-482
TP53I11     0.863     0.797     0.0285       0.0       0.0       1.000     HB-443
ZNF615      0.939     0.795     0.0223       0.0       0.0       0.950     HB-431(d)

Phase II

            Independent MethyLight screen
Gene        ABC(a)    GCB(a)    P(b)         ML ID
CYP27B1     4.5       56.9      0.0127       HB-233
ONECUT2     55.9      36.6      0.0127       HB-243
NEUROG1     29.3      11.7      0.0348       HB-261
KL          6.3       25.8      0.0467       HB-175
MINT2       24.7      0.0       0.0467       HB-187
DRD1        43.6      28.5      0.0476       HB-252

Abbreviations: ABC, activated B-cell-like; DLBCL, diffuse large B-cell lymphoma; GCB, germinal center B-cell-like.
(a) Median fraction of unmethylated alleles in a subtype, as provided by CpG island microarray analyses.
(b) Uncorrected Wilcoxon t-test. Data are in bold if P<0.05 in MethyLight analyses.
(c) Median MethyLight PMR (percent of methylated reference) value that reflects the fraction of fully methylated alleles in a subtype.
(d) On the basis of one or more replicate reactions.
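The subtype comparisons summarized in Table 1 combine an uncorrected Wilcoxon P value with a minimum difference between subtype medians. A minimal sketch of such a screen follows. It assumes a two-sided exact rank-sum comparison of two independent groups, computed by enumerating all group assignments; the source does not specify the test variant beyond "Wilcoxon", and the function names and the default of 10 PMR units (the phase II criterion) are illustrative.

```python
from itertools import combinations
from statistics import median


def midranks(values):
    """Rank values from 1, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def exact_rank_sum_p(group_a, group_b):
    """Two-sided exact P value for the rank sum of group_a."""
    pooled = list(group_a) + list(group_b)
    ranks = midranks(pooled)
    n_a = len(group_a)
    expected = n_a * (len(pooled) + 1) / 2
    observed = abs(sum(ranks[:n_a]) - expected)
    total = extreme = 0
    # Enumerate every way of assigning n_a of the pooled samples to group A.
    for idx in combinations(range(len(pooled)), n_a):
        total += 1
        if abs(sum(ranks[i] for i in idx) - expected) >= observed - 1e-9:
            extreme += 1
    return extreme / total


def is_subtype_specific(abc, gcb, alpha=0.05, min_median_diff=10.0):
    """Apply both criteria: P below alpha and a sufficient median difference."""
    p_value = exact_rank_sum_p(abc, gcb)
    return p_value < alpha and abs(median(abc) - median(gcb)) > min_median_diff
```

With seven tumors per subtype the enumeration covers 3432 assignments, so an exact P value is inexpensive at this sample size; larger panels would call for a normal approximation or a statistics library.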

The fact that 9 (CEBPG, CENPH, HTRA4, KLHL14, NOS1AP, PFDN5, PHC2, TP53I11 and ZNF615) of 15 CpG islands demonstrate methylation in DLBCL by CpG island microarray assays, but not by MethyLight, partially reflects fundamental differences in these platforms. On the basis of McrBC endonuclease recognition sequence, our microarray assays can score a CpG island as being methylated if it contains at least two 5-methylcytosines that are between 40 bases and 3 kb apart. In contrast, our MethyLight assays are designed to only score CpG islands as being methylated if they contain an average of eight closely spaced 5-methylcytosines. This highlights the complex nature of methylation within individual CpG islands and the advantages of using complementary technologies to evaluate their status. Importantly, the CpG island microarrays succeeded in rapidly identifying viable candidates for confirmation using more quantitative methods.

In our phase II studies, we analyzed the DNA methylation status of a focused group of CpG islands proximal to genes whose methylation status is either known or suspected to be associated with the development and progression of cancer (Figure 1). We first surveyed the methylation levels of 217 unique CpG island sequences in two ABC-DLBCL and three GCB-DLBCL using pre-existing MethyLight assays. We identified 80 unique CpG islands with at least moderate differences in methylation levels among these five randomly selected DLBCL (that is, coefficient of variation greater than 5%) (Supplementary Table 5). Next, we analyzed the DNA methylation levels of these 80 CpG islands in the 14 DLBCL from the phase I study. Twelve CpG islands (AR, CDKN1C, DLC1, DRD2, GATA4, GDNF, GRIN2B, MTHFR, MYOD1, NEUROD1, ONECUT2 and TFAP2A) showed substantial methylation (PMR>20) in over 85% of the samples (Supplementary Table 6). Furthermore, seven CpG islands previously reported to be methylated in DLBCL (AR,20 CDKN2B (aka TP15),21 CDKN2A (aka p16INK4),21,22 CYP27B1,11 DLC1,11 MGMT23–26 and RARB (aka RARb2)11) showed substantial methylation (PMR>20) in at least one tumor. However, note that the methylation status of AR could be influenced by gender since it is located on the X chromosome and subject to inactivation in women.

The ABC-DLBCL and GCB-DLBCL subtypes could not be discerned based on hierarchical clustering analysis of our phase II MethyLight data (Figure 2). However, this was not unexpected given that these subtypes also could not be discerned based on hierarchical clustering analysis of gene expression data for only these same CpG islands (Figure 2). Nevertheless, we identified six (CYP27B1, DRD1, KL, MINT2, NEUROG1 and ONECUT2) CpG islands in these phase II analyses that showed subtype-specific differences in methylation levels (uncorrected Wilcoxon signed-rank test P<0.05 and >10 unit difference in median PMR) (Table 1).

In preliminary confirmatory studies, we conducted MethyLight analyses of eight candidate CpG islands (FLJ21062,

[Figure 2 heat maps: columns are individual tumors labeled by subtype (ABC or GCB) and sample number; color scales show log2 expression value (0 to 12) and log2 PMR value (0 to 7.07).]

Figure 2 Hierarchical clustering of gene expression and DNA methylation analyses for CpG islands in diffuse large B-cell lymphoma (DLBCL). We performed hierarchical clustering analysis on 89 MethyLight markers (a) generated in our phase II analyses of seven activated B-cell-like (ABC)- DLBCL and seven germinal center B-cell-like (GCB)-DLBCL and gene expression data from proximal genes (b) listed in Supplementary Table 2. Clustering analyses were performed on log-transformed data using Euclidean distance and average linkage. The ABC-DLBCL and GCB-DLBCL subtypes could not be discriminated from one another by either the expression (a) or the methylation analyses (b) conducted on this group of CpG islands.
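The clustering settings named in the Figure 2 caption (Euclidean distance and average linkage on log-transformed data) can be sketched in a few lines. This is a plain agglomerative implementation for illustration, not the software used in the study; it assumes the profiles are already log-transformed, and the sample labels in the usage note are hypothetical.

```python
from math import dist


def cluster_distance(a, b, profiles):
    """Average linkage: mean Euclidean distance over all cross-cluster pairs."""
    return (sum(dist(profiles[i], profiles[j]) for i in a for j in b)
            / (len(a) * len(b)))


def average_linkage(profiles):
    """Agglomerate samples; return merges as (cluster_a, cluster_b, distance).

    profiles: dict mapping a sample name to a list of log-transformed values
    (one value per CpG island or gene, in the same order for every sample).
    """
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        # Find the closest pair of clusters under average linkage.
        d, i, j = min(
            (cluster_distance(clusters[i], clusters[j], profiles), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges
```

For example, `average_linkage({"ABC 1": [0, 0], "ABC 2": [0, 1], "GCB 1": [5, 5], "GCB 2": [5, 6]})` joins the two ABC samples and the two GCB samples before the final merge. In Figure 2 the subtypes did not separate in this way.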

GNMT, ONECUT2, CYP27B1, DRD1, KL, MINT2 and NEUROG1) identified by our phase I and/or phase II analyses as having subtype-specific methylation levels on a new test group of 14 ABC-DLBCL, 17 GCB-DLBCL and 6 normal peripheral blood lymphocytes (PBLs) (Figure 1 and Supplementary Table 7). When we pooled all MethyLight data conducted on these eight CpG islands (that is, data from 21 ABC-DLBCL and 24 GCB-DLBCL cases), only the FLJ21062 (HB-442, ABC-DLBCL median PMR = 0; GCB-DLBCL median PMR = 27.6, P = 0.001) and ONECUT2 (HB-446, median ABC-DLBCL PMR = 67.7; median GCB-DLBCL PMR = 46.8, P = 0.012) MethyLight reactions showed differences between the DLBCL subtypes (Supplementary Table 7). When considering MethyLight data only from the new test group of 14 ABC-DLBCL and 17 GCB-DLBCL, only FLJ21062 (HB-442, ABC-DLBCL median PMR = 4.6; GCB-DLBCL median PMR = 28.0, P = 0.025) showed a difference between the DLBCL subtypes (Supplementary Table 7). Thus, the CpG island proximal to FLJ21062 is a more promising candidate for having subtype-specific methylation levels than the CpG island proximal to ONECUT2. Larger-scale validation studies of ONECUT2 and FLJ21062 CpG island methylation levels in DLBCL are warranted to rigorously address questions concerning their subtype specificity. Lastly, the MethyLight assays for ONECUT2 and FLJ21062 displayed little DNA methylation in the six normal PBLs (that is, PMR<5 in all cases).

To further elucidate the nature of DNA methylation in the ONECUT2 and FLJ21062 CpG islands, we performed bisulfite sequencing analysis of these CpG islands in four ABC-DLBCL, four GCB-DLBCL and two normal PBL (Table 2, Figure 3 and Supplementary Figure 3). Overall, bisulfite sequencing of ONECUT2 and FLJ21062 CpG island subclones (average 25 per case) yielded results that are in excellent agreement with the MethyLight PMR values (Table 2). While the methylation status of CpG dinucleotides centered within these two islands reflected

an all-or-none phenomenon, the status of CpG dinucleotides on their edges was less reflective of the status of the island as a whole (Figure 3 and Supplementary Figure 3).

Table 2 Confirmatory bisulfite sequencing analyses

Tumor(a)    CpG island    No. of clones sequenced    %(b)    PMR(c)
ABC 6       FLJ21062      23                         1       0
ABC 7       FLJ21062      19                         2       0
ABC 8       FLJ21062      22                         19      0
ABC 4       FLJ21062      14                         16      0
GCB 3       FLJ21062      19                         4       0
GCB 15      FLJ21062      26                         0       8
GCB 7       FLJ21062      19                         45      72
GCB 10      FLJ21062      37                         73      104
PBL 1       FLJ21062      27                         0       0
PBL 4       FLJ21062      22                         2       0
ABC 2       ONECUT2       27                         71      64
ABC 6       ONECUT2       24                         78      69
ABC 9       ONECUT2       41                         70      111
ABC 7       ONECUT2       24                         65      137
GCB 5       ONECUT2       17                         4       0
GCB 4       ONECUT2       28                         53      51
GCB 7       ONECUT2       31                         69      82
GCB 10      ONECUT2       33                         93      135
PBL 1       ONECUT2       20                         2       0
PBL 4       ONECUT2       17                         7       1

Abbreviations: ABC, activated B-cell-like; GCB, germinal center B-cell-like; PBL, peripheral blood lymphocyte; PMR, percentage of methylated reference.
(a) The type (ABC-DLBCL or GCB-DLBCL) and sample number for each tumor are provided.
(b) Percentage of CpG dinucleotides methylated across all clones.
(c) PMR value obtained by MethyLight analysis.

Next, we investigated the relationships between DNA methylation and expression levels of ONECUT2, FLJ21062 and other genes in DLBCL. We were able to compare MethyLight PMR values and oligonucleotide microarray-based gene expression values for 67 genes in 13 DLBCL (see Figure 4, Supplementary Table 3 and Supplementary Figure 4, where 134 plots are provided that reflect multiple gene expression probe tilings and/or MethyLight reactions for some genes). A total of 39 CpG islands proximal (that is, within 500 bp in either direction) to the transcription start site of genes showed sufficient variation in PMR values among our DLBCL to justify comment (that is, greater than 20 unit difference in the second lowest and second highest PMR). For 32 of 39 (82%) of these CpG islands (including ONECUT2), increasing levels of DNA methylation did not result in proportional decreases in gene expression. This was influenced by the fact that genes proximal to CpG islands often showed weak or modest expression (that is, every log2 expression score was below seven units) regardless of methylation level (for example, CALCA, CDX1, DRD1, GABRA2, GATA3, GNMT, KL, LDLR, MTHFR, NEUROD1, NOS1AP, ONECUT2, TFPI2 and TWIST1) (Supplementary Figure 4). It is possible that such genes are strongly expressed in normal precursor cells, but are silenced in all DLBCL via genetic and/or epigenetic mechanisms. Alternatively, such genes could be weakly or modestly expressed in normal precursor cells prior to methylation incurred during the development of cancer. The latter possibility would be consistent with a study showing that 69% (118/170) of genes that are methylated in colon tumor samples are expressed at low levels in normal colon as well as in colorectal adenocarcinomas.27 Overall, we favor the interpretation that DNA methylation is not frequently involved in initiating the silencing of highly expressed genes.

It should also be noted that specific probe tilings for DLC1, GATA4, NKD2 and RARRES1 indicated at least modest expression levels (that is, log2 expression score above eight units) even when CpG islands located within 500 bp (in either direction) of their transcription start sites had PMR values greater than 80 units. There are many possible explanations for these observations. For example, we may not be interrogating CpG islands or CpG dinucleotides relevant to the transcriptional regulation of the transcripts pertaining to these probe tilings. Alternatively, if copies of these genes proximal to the residual unmethylated CpG islands were highly expressed, the gene silencing signature associated with methylated CpG islands could be masked. More intriguingly, it is formally possible that these methylated CpG islands are not attracting the appropriate cadre of factors responsible for methylation-associated gene silencing. This could be meaningful given that the relationships among the various nucleic acid and protein components (for example, histone modifications28) involved in epigenetic gene silencing have still not been fully defined.

Nevertheless, BNIP3, MGMT, RBP1, GATA4, IGSF4, CRABP1 and FLJ21062 showed significant (Benjamini–Hochberg corrected P<0.05; see Supplementary Table 8) trends for decreasing gene expression with increasing levels of DNA methylation (Figure 4). These represent candidate genes for which the DNA methylation level of a proximal CpG island is associated with gene silencing in DLBCL. However, some observations (for example, FLJ21062) could be influenced by experimental noise associated with measuring the abundance of rare transcripts. Nevertheless, studies involving the demethylating agent 5-aza-2′-deoxycytidine demonstrate that BNIP3,29 MGMT,30 RBP1 (aka CRBP1),31 GATA4,32 IGSF4 (aka TSLC1)33 and CRABP134 expression are dependent upon CpG island methylation status in various cancer cell culture models.

Next, we examined the expression of the 7 (BNIP3, MGMT, RBP1, GATA4, IGSF4, CRABP1 and FLJ21062) candidates as well as the 12 frequently methylated CpG islands (AR, CDKN1C, DLC1, DRD2, GATA4, GDNF, GRIN2B, MTHFR, MYOD1, NEUROD1, ONECUT2 and TFAP2A) in two tonsils and two peripheral blood CD19+ B-cell preparations (Supplementary Table 9). These data derive from published gene expression analyses (http://wombat.gnf.org/index.html).35 Four candidate genes (that is, BNIP3, MGMT, RBP1 and IGSF4) showing decreased expression with increasing methylation in tumors were expressed at ≥0.5% of tonsillar β-actin transcript levels. In addition, two of the seven candidate genes (MGMT and IGSF4) were expressed at ≥0.5% of β-actin transcript levels in CD19+ B cells. Meanwhile, five of the frequently methylated genes (that is, AR, CDKN1C, DRD2, GRIN2B and TFAP2A) were expressed at ≥0.5% of tonsillar β-actin transcript levels. However, only one (DRD2) of the frequently methylated genes met that same criterion in CD19+ B cells. None of the 18 unique genes discussed above were expressed at >1.2% of β-actin levels in tonsils or CD19+ B cells. Although microarray-based comparisons of transcript levels within a single sample should be viewed with caution, we conclude that both the candidate genes showing decreased expression with increasing methylation and the frequently methylated genes in DLBCL are already expressed at low to modest levels in normal B cells.

Lastly, we compared ONECUT2 expression levels in GCB-DLBCL and ABC-DLBCL with those from normal PBLs and liver using quantitative PCR (Supplementary Table 10). In agreement with our microarray-based gene expression analyses (Supplementary Table 3), ONECUT2 was expressed at low levels in four ABC-DLBCL and two GCB-DLBCL samples. However, ONECUT2 expression was not detected in the four normal PBLs. This

[Figure 3 panels: (a) FLJ21062, PMR = 72, 45% methylated CpGs in 19 clones; (b) FLJ21062, PMR = 104, 73% methylated CpGs in 37 clones; (c) ONECUT2, PMR = 82, 69% methylated CpGs in 31 clones; (d) ONECUT2, PMR = 64, 71% methylated CpGs in 27 clones. Each clone is drawn as a row of circles with its percentage of methylated CpGs at the right.]

Figure 3 Bisulfite sequencing of CpG islands proximal to FLJ21062 and ONECUT2. The regions encompassing MethyLight reactions HB-442 (FLJ21062) and HB-446 (ONECUT2) were subject to bisulfite sequencing analysis in four activated B-cell-like (ABC)-diffuse large B-cell lymphoma (DLBCL), four germinal center B-cell-like (GCB)-DLBCL and two peripheral blood lymphocyte (PBL) samples, as summarized in Table 2 and provided in Supplementary Table 3. Here, we depict representative bisulfite sequencing analyses of FLJ21062 in samples (a) GCB-DLBCL 7 and (b) GCB-DLBCL 10. Likewise, representative bisulfite sequencing analyses of ONECUT2 in samples (c) GCB-DLBCL 7 and (d) ABC-DLBCL 2 are shown. Light gray and blackened circles denote methylated and unmethylated CpG dinucleotides, respectively. The percentage of methylated CpGs in all clones and total number of clones sequenced are provided above each panel along with the corresponding MethyLight PMR value. In addition, the percentage of methylated CpGs in a given clone is provided to the right of each clone. The ONECUT2 amplicon spans nucleotide positions 53256301– 53256419 of and the FLJ21062 amplicon spans nucleotide positions 89519154–89519280 of , based on the May 2004 human genome assembly provided at http://genome.ucsc.edu/.
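The two summary statistics shown in Figure 3 and Table 2 (the percentage of methylated CpGs within each clone and across all clones of a sample) are simple proportions. The sketch below illustrates the arithmetic. The encoding of each sequenced clone as a string of `M` (methylated) and `U` (unmethylated) calls is a hypothetical convention chosen for this example, not a format taken from the study.

```python
def clone_methylation(clones):
    """Summarize bisulfite sequencing calls for one sample.

    clones: list of strings, one per sequenced clone, with one character per
    CpG dinucleotide ('M' = methylated, 'U' = unmethylated).
    Returns (per-clone percentages, percentage across all clones).
    """
    if not clones or any(len(c) == 0 for c in clones):
        raise ValueError("every clone needs at least one CpG call")
    per_clone = [100.0 * c.count("M") / len(c) for c in clones]
    total_sites = sum(len(c) for c in clones)
    overall = 100.0 * sum(c.count("M") for c in clones) / total_sites
    return per_clone, overall
```

For instance, three clones with nine CpG calls each, `["MMMMMMMMU", "UUUUUUUUU", "MMMMMMMUU"]`, give per-clone values of about 89%, 0% and 78%, and 15 of 27 calls (about 56%) methylated overall.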

suggests that hypermethylation of this CpG island would not affect ONECUT2 expression in a normal lymphocyte sample. Overall, our epigenetic and genetic analyses have uncovered candidate genes that could warrant further investigation into their functional roles in the development of DLBCL and potential as biomarkers for early detection of disease recurrence.

Interestingly, the frequently methylated CpG islands we uncovered in DLBCL such as CDKN1C, DLC1, DRD2, GATA4, GDNF, GRIN2B, MTHFR, MYOD1, NEUROD1, ONECUT2 and TFAP2A have been reported to be hypermethylated in cancers outside of DLBCL. This could reflect their status as known or suspected tumor suppressor genes that affect pathways common

[Figure 4 panels: scatter plots of log2 expression value (y axis, 2 to 14) against PMR value (x axis, 0 to 100) with linear fits.
BNIP3, MethyLight HB-363 vs Affymetrix 201849_at: y = -0.046x + 9.47, R2 = 0.6202
MGMT, MethyLight HB-160 vs Affymetrix 204880_at: y = -0.039x + 8.98, R2 = 0.517
RBP1, MethyLight HB-185 vs Affymetrix 239782_at: y = -0.023x + 6.74, R2 = 0.4236
GATA4, MethyLight HB-323 vs Affymetrix 243692_at: y = -0.028x + 7.75, R2 = 0.367
IGSF4, MethyLight HB-069 vs Affymetrix 209031_at: y = -0.043x + 11.45, R2 = 0.358
CRABP1, MethyLight HB-197 vs Affymetrix 205350_at: y = -0.042x + 0.706, R2 = 0.3288
FLJ21062, MethyLight HB-442 vs Affymetrix 219455_at: y = -0.027x + 4.57, R2 = 0.265]

Figure 4 Relationships among DNA methylation and gene expression status. MethyLight (ML) percentage of methylated reference (PMR) values (x axis) were plotted against expression value (y axis) for all genes showing statistically significant trends (Benjamini–Hochberg corrected P<0.05) for decreasing expression with increasing levels of methylation. The location of each ML reaction relative to the start of the corresponding RefSeq is provided in parentheses. (a) BNIP3 (32 bp upstream of RefSeq NM_004052), (b) MGMT (exon 48 bp downstream of RefSeq NM_002412), (c) RBP1 (exon 43 bp downstream of RefSeq NM_002899), (d) GATA4 (intron 394 bp downstream of RefSeq NM_002052), (e) IGSF4 (exon 37 bp downstream of RefSeq NM_014333), (f) CRABP1 (exon 37 bp downstream of RefSeq NM_004378) and (g) FLJ21062 (21 bp upstream of RefSeq NM_001039706). The ML reaction ID, Affymetrix probe tiling ID, equation for a linear fit of the data and R2 value are provided.

to multiple cancers. However, it is also possible these genes are not functionally relevant to DLBCL, but are simply located in hypermethylated chromosomal blocks that could contain one or more tumor suppressor genes directly relevant to DLBCL. Recently, hypermethylated chromosomal blocks have been detected in colorectal cancer36 and acute lymphoblastic leukemia.37 The continued development of technologies for genome-wide DNA methylation analyses is needed to address fundamental questions concerning the nature and relevance of hypermethylated chromosome blocks in the development and progression of cancer.

Lastly, the DNA methylation levels observed for specific CpG islands suggest there is considerable epigenetic heterogeneity within the tumors analyzed. This could be related to histological heterogeneity (for example, levels of tumor-infiltrating immune cells38) or the heterogeneity of cancer cell populations comprising these tumors. Regardless of its origin, epigenetic heterogeneity could confound comparisons of gene expression

Leukemia DNA methylation in diffuse large B-cell lymphoma BL Pike et al 1042 and methylation profiles. This highlights the value of focusing on 12 Nouzova M, Holtan N, Oshiro MM, Isett RB, Munoz-Rodriguez JL, individual or limited numbers of cells in cancer genome and List AF et al. Epigenomic changes during leukemia cell epigenome projects. The development of high-throughput DNA differentiation: analysis of histone acetylation and cytosine methylation profiling technologies that require limited starting methylation using CpG island microarrays. J Pharmacol Exp Ther 2004; 311: 968–981. materials would facilitate the identification of clinical biomar- 13 Pike BL, Groshen S, Hsu YH, Shai RM, Wang X, Holtan N et al. kers and accelerate studies aimed at defining the roles Comparisons of PCR-based genome amplification systems using epigenetic phenomena play in the etiology of different cancers. CpG island microarrays. Hum Mutat 2006; 27: 589–596. 14 Maekawa M, Taniguchi T, Higashi H, Sugimura H, Sugano K, Kanno T. Methylation of mitochondrial DNA is not a useful marker Acknowledgements for cancer detection. Clin Chem 2004; 50: 1480–1481. 15 Weisenberger DJ, Campan M, Long TI, Kim M, Woods C, Fiala E We thank Drs Larry Brody (NIH), Darren Magda (Pharmacyclics et al. Analysis of repetitive element DNA methylation by Methy- Inc.) for valuable discussion, and Lou Staudt and Sandeep Dave Light. Nucleic Acids Res 2005; 33: 6823–6836. 16 Novak P, Jensen T, Oshiro MM, Wozniak RJ, Nouzova M, Watts (NCI) for access to the Affymetrix gene expression data. GS et al. Epigenetic inactivation of the HOXA gene cluster in Acknowledgement of Funding Support: This study was funded breast cancer. Cancer Res 2006; 66: 10664–10670. by NIH Grants P50-HG002790 and P30-CA014089, and United 17 Ibrahim AE, Thorne NP, Baird K, Barbosa-Morais NL, Tavare S, States Public Health Service Grants CA36727 and CA84967 Collins VP et al. 
MMASS: an optimized array-based method for awarded by the Department of Health and Human Services, assessing CpG island methylation. Nucleic Acids Res 2006; 34: National Cancer Institute. TCG is a grantee of the Mantle Cell e136. 18 Yan PS, Chen CM, Shi H, Rahmatpanah F, Wei SH, Huang TH. Lymphoma Research Program of the Lymphoma Research Applications of CpG island microarrays for high-throughput Foundation. This research was supported in part by the Intramural analysis of DNA methylation. J Nutr 2002; 132: 2430S–2434S. Research Program of the National Human Genome Research 19 Eads CA, Danenberg KD, Kawakami K, Saltz LB, Blake C, Institute, National Institutes of Health. This investigation was Shibata D et al. MethyLight: a high-throughput assay to measure conducted in a facility constructed with support from Research DNA methylation. Nucleic Acids Res 2000; 28: E32. Facilities Improvement Program Grant Number C06 (RR10600- 20 Yang H, Chen CM, Yan P, Huang TH, Shi H, Burger M et al. The androgen receptor gene is preferentially hypermethylated in 01, CA62528-01 and RR14514-01) from the National Center for follicular non-Hodgkin’s lymphomas. Clin Cancer Res 2003; 9: Research Resources, National Institutes of Health. 4034–4042. 21 Garcia MJ, Martinez-Delgado B, Cebrian A, Martinez A, Benitez J, Rivas C. Different incidence and pattern of p15INK4b and p16INK4a promoter region hypermethylation in Hodgkin’s and References CD30-Positive non-Hodgkin’s lymphomas. Am J Pathol 2002; 161: 1007–1013. 1 Lossos IS, Morgensztern D. Prognostic biomarkers in diffuse large 22 Shiozawa E, Takimoto M, Makino R, Adachi D, Saito B, Yamochi- B-cell lymphoma. J Clin Oncol 2006; 24: 995–1007. Onizuka T et al. Hypermethylation of CpG islands in p16 as a 2 Lossos IS. Molecular pathogenesis of diffuse large B-cell lympho- prognostic factor for diffuse large B-cell lymphoma in a high-risk ma. J Clin Oncol 2005; 23: 6351–6357. group. Leuk Res 2006; 30: 859–867. 
3 Alizadeh AA, Eisen MB, Davis RE, Ma C, Lossos IS, Rosenwald A 23 Esteller M, Gaidano G, Goodman SN, Zagonel V, Capello D, et al. Distinct types of diffuse large B-cell lymphoma identified by Botto B et al. Hypermethylation of the DNA repair gene O(6)- gene expression profiling. Nature 2000; 403: 503–511. methylguanine DNA methyltransferase and survival of patients 4 Rosenwald A, Wright G, Chan WC, Connors JM, Campo E, Fisher with diffuse large B-cell lymphoma. J Natl Cancer Inst 2002; 94: RI et al. The use of molecular profiling to predict survival after 26–32. chemotherapy for diffuse large-B-cell lymphoma. N Engl J Med 24 Rossi D, Capello D, Gloghini A, Franceschetti S, Paulli M, Bhatia K 2002; 346: 1937–1947. et al. Aberrant promoter methylation of multiple genes throughout 5 Wright G, Tan B, Rosenwald A, Hurt EH, Wiestner A, Staudt LM. A the clinico-pathologic spectrum of B-cell neoplasia. Haemato- gene expression-based method to diagnose clinically distinct logica 2004; 89: 154–164. subgroups of diffuse large B cell lymphoma. Proc Natl Acad Sci 25 Al-Kuraya KS, Siraj AK, Al-Dayel FA, Ezzat AA, Al-Jommah NA, USA 2003; 100: 9991–9996. Atizado VL et al. Epigenetic changes and their clinical relevance 6 Bea S, Colomo L, Lopez-Guillermo A, Salaverria I, Puig X, in Saudi diffuse large B-cell lymphoma. A molecular and Pinyol M et al. Clinicopathologic significance and prognostic tissue microarray analysis of 100 cases. Saudi Med J 2005; 26: value of chromosomal imbalances in diffuse large B-cell lympho- 1099–1103. mas. J Clin Oncol 2004; 22: 3498–3506. 26 Hiraga J, Kinoshita T, Ohno T, Mori N, Ohashi H, Fukami S et al. 7 Bea S, Zettl A, Wright G, Salaverria I, Jehn P, Moreno V et al. Promoter hypermethylation of the DNA-repair gene O6-methyl- Diffuse large B-cell lymphoma subgroups have distinct genetic guanine-DNA methyltransferase and p53 mutation in diffuse large profiles that influence tumor biology and improve gene-expres- B-cell lymphoma. 
Int J Hematol 2006; 84: 248–255. sion-based survival prediction. Blood 2005; 106: 3183–3190. 27 Keshet I, Schlesinger Y, Farkash S, Rand E, Hecht M, Segal E et al. 8 Tagawa H, Suguro M, Tsuzuki S, Matsuo K, Karnan S, Ohshima K Evidence for an instructive mechanism of de novo methylation in et al. Comparison of genome profiles for identification of distinct cancer cells. Nat Genet 2006; 38: 149–153. subgroups of diffuse large B-cell lymphoma. Blood 2005; 106: 28 Irvine RA, Lin IG, Hsieh CL. DNA methylation has a local effect on 1770–1777. transcription and histone acetylation. Mol Cell Biol 2002; 22: 9 Chen W, Houldsworth J, Olshen AB, Nanjangud G, Chaganti S, 6689–6696. Venkatraman ES et al. Array comparative genomic hybridization 29 Murai M, Toyota M, Suzuki H, Satoh A, Sasaki Y, Akino K et al. reveals genomic copy number changes associated with Aberrant methylation and silencing of the BNIP3 gene in color- outcome in diffuse large B-cell lymphomas. Blood 2006; 107: ectal and gastric cancer. Clin Cancer Res 2005; 11: 1021–1027. 2477–2485. 30 Danam RP, Howell SR, Brent TP, Harris LC. Epigenetic regulation 10 Hans CP, Weisenburger DD, Greiner TC, Gascoyne RD, Delabie J, of O6-methylguanine-DNA methyltransferase gene expression by Ott G et al. Confirmation of the molecular classification of diffuse histone acetylation and methyl-CpG binding . Mol Cancer large B-cell lymphoma by immunohistochemistry using a tissue Ther 2005; 4: 61–69. microarray. Blood 2004; 103: 275–282. 31 Esteller M, Guo M, Moreno V, Peinado MA, Capella G, Galm O 11 Shi H, Guo J, Duff DJ, Rahmatpanah F, Chitima-Matsiga R, et al. Hypermethylation-associated inactivation of the cellular Al-Kuhlani M et al. Discovery of novel epigenetic markers in retinol-binding-protein 1 gene in human cancer. Cancer Res 2002; non-Hodgkin’s lymphoma. Carcinogenesis 2007; 28: 60–70. 62: 5902–5905.

32 Guo M, House MG, Akiyama Y, Qi Y, Capagna D, Harmon J et al. Hypermethylation of the GATA gene family in esophageal cancer. Int J Cancer 2006; 119: 2078–2083.
33 Heller G, Fong KM, Girard L, Seidl S, End-Pfutzenreuter A, Lang G et al. Expression and methylation pattern of TSLC1 cascade genes in lung carcinomas. Oncogene 2006; 25: 959–968.
34 Lind GE, Kleivi K, Meling GI, Teixeira MR, Thiis-Evensen E, Rognum TO et al. ADAMTS1, CRABP1, and NR3C1 identified as epigenetically deregulated genes in colorectal tumorigenesis. Cell Oncol 2006; 28: 259–272.
35 Su AI, Wiltshire T, Batalov S, Lapp H, Ching KA, Block D et al. A gene atlas of the mouse and human protein-encoding transcriptomes. Proc Natl Acad Sci USA 2004; 101: 6062–6067.
36 Frigola J, Song J, Stirzaker C, Hinshelwood RA, Peinado MA, Clark SJ. Epigenetic remodeling in colorectal cancer results in coordinate gene suppression across an entire chromosome band. Nat Genet 2006; 38: 540–549.
37 Taylor KH, Pena-Hernandez KE, Davis JW, Arthur GL, Duff DJ, Shi H et al. Large-scale CpG methylation analysis identifies novel candidate genes and reveals methylation hotspots in acute lymphoblastic leukemia. Cancer Res 2007; 67: 2617–2625.
38 Dave SS, Wright G, Tan B, Rosenwald A, Gascoyne RD, Chan WC et al. Prediction of survival in follicular lymphoma based on molecular features of tumor-infiltrating immune cells. N Engl J Med 2004; 351: 2159–2169.

Supplementary Information accompanies the paper on the Leukemia website (http://www.nature.com/leu)
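
Illustrative note on the Figure 4 statistics: the caption reports, for each gene, a linear fit of expression against PMR with its R² value, and a Benjamini–Hochberg correction applied across genes. The sketch below shows one way these quantities could be computed. It is not the authors' code. The function names, the input values and the raw P values are hypothetical, and the source does not state which trend test produced the P values, so they are simply taken as given here.

```python
from typing import List, Sequence, Tuple


def linear_fit(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Ordinary least-squares fit y = slope * x + intercept; also returns R^2."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x has zero variance")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a simple linear fit, R^2 equals the squared Pearson correlation.
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Synthetic values for illustration only (not data from the study).
pmr = [0.0, 10.0, 20.0, 40.0, 80.0]        # methylation, x axis
expression = [10.0, 9.0, 8.0, 6.0, 2.0]    # expression, y axis
slope, intercept, r2 = linear_fit(pmr, expression)

# Hypothetical raw P values for four genes, assumed to come from a trend test.
raw_p = [0.01, 0.04, 0.03, 0.20]
adj_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adj_p]
```

A negative slope corresponds to decreasing expression with increasing methylation. In this toy input only the first gene remains below 0.05 after correction, which illustrates why the correction matters when many CpG islands are screened at once.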
