Oncogene (2006) 25, 3059–3064 & 2006 Nature Publishing Group All rights reserved 0950-9232/06 $30.00 www.nature.com/onc TECHNICAL REPORT -wide profiling of methylation in human

I Hatada1,2, M Fukasawa1,2, M Kimura1,2, S Morita1,2, K Yamada3, T Yoshikawa3, S Yamanaka4,5, C Endo4, A Sakurada4, M Sato4, T Kondo4, A Horii5, T Ushijima6 and H Sasaki7,8

1Laboratory of Genome Science, Biosignal Genome Resource Center, Department of Molecular and Cellular Biology, Gunma University, Maebashi, Japan; 2PRESTO, Japan Science and Technology Corporation (JST), Kawaguchi, Japan; 3Laboratory for Molecular Psychiatry, RIKEN Brain Science Institute, Wako, Japan, 4Department of Thoracic Surgery, Institute of Development, Aging and , Tohoku University, Sendai, Japan; 5Department of Molecular Pathology, Tohoku University School of Medicine, Sendai, Japan; 6Carcinogenesis Division, National Cancer Center Research Institute, Chuo-ku, Tokyo, Japan; 7Division of Human Genetics, Department of Integrated Genetics, National Institute of Genetics, Research Organization of Information and Systems, Mishima, Japan and 8Department of Genetics, School of Life Science, Graduate University for Advanced Studies, Mishima, Japan

DNA methylation in the promoter region of a is Microarray-based methods of comparing differences in associated with a loss of that gene’s expression and plays DNA methylation in the genome of two samples using an important role in gene silencing. The inactivation of methylation-sensitive restriction enzymes (Yan et al., tumor-suppressor by aberrant methylation in the 2001; Hatada et al., 2002) have two problems. The first promoter region is well recognized in . is that the microarrays contain clones from libraries of However, there has been little study in this area when it CpG islands. CpG islands are CpG-rich regions of the comes to genome-wide profiling of the promoter methyla- genome originally thought to be associated with the 50 tion. Here, we developed a genome-wide profiling method region of genes. There were several approaches using called Microarray-based Integrated Analysis of Methyla- CpG islands libraries for microarrays (Yan et al., 2001; tion by Isoschizomers to analyse the DNA methylation of Hatada et al., 2002; Heisler et al. 2005; Weber et al., promoter regions of 8091 human genes. With this method, 2005). Although 60% of human genes have CpG islands resistance to both the methylation-sensitive restriction in the promoter or first exon, more than 80% of all CpG enzyme HpaII and the methylation-insensitive isoschizo- islands have no relation to genes and are unlikely to mer MspI was compared between samples by using a regulate gene expression (Takai and Jones, 2002). To microarray with promoter regions of the 8091 genes. The solve this problem, we used a microarray with 60-mer reliability of the difference in HpaII resistance was judged oligonucleotides derived from promoter regions of 8091 using the difference in MspI resistance. We demonstrated human genes. DNA methylation in promoter regions is the utility of this method by finding epigenetic most important for the regulation of gene expression. in cancer. Aberrant hypermethylation is known to The second problem is the risk of false positives inactivate tumour suppressor genes. Using this method, resulting from restriction site polymorphisms and/or we found that frequency of the aberrant promoter incomplete digestion of DNA. To resolve this issue, hypermethylation in cancer is higher than previously we developed a new method called Microarray-based hypothesized. Aberrant hypomethylation is known to Integrated Analysis of Methylation by Isoschizomers induce activation of oncogenes in cancer. Genome-wide (MIAMI). We utilized resistance to a methylation- analysis of hypomethylated promoter sequences in cancer insensitive restriction enzyme, MspI, to judge the demonstrated low CG/GC ratio of these sequences, false-positive results for resistance to the methylation- suggesting that CpG-poor genes are sensitive to demethy- sensitive isoschizomer HpaII (Figure 1a). If two samples lation activity in cancer. have a restriction site polymorphism at an HpaII site Oncogene (2006) 25, 3059–3064. doi:10.1038/sj.onc.1209331; and/or one of the samples has incomplete digestion at a published online 9 January 2006 HpaII site, they will differ in resistance to HpaII. However, in this case the resistance to methylation- Keywords: DNA methylation; genome-wide profiling; insensitive MspI at this site will also differ between ; microarray samples because both enzymes recognize the same recognition site, CCGG. Therefore, we can treat such changes as false positives based on MspI resistance (MR). We constructed a 60-mer-oligonucleoide microarray Correspondence: Dr I Hatada, Department of Molecular and Cellular containing portions of HpaII fragments located in Biology, Gunma University, 3-39-15, Showa-machi, Maebashi, promoter regions of 8091 genes. We targeted the region Gunma 371-8512, Japan. E-mail: [email protected] from 600 base pairs upstream to 200 base pairs Received 25 August 2005; revised 31 October 2005; accepted 17 downstream of the transcriptional start sites for genes November 2005; published online 9 January 2006 whose start sites were characterized on the basis of the Genome-wide profiling of promoter methylation I Hatada et al 3060 a I. Amplification of unmethylated DNA II. Amplification of all DNA (For Hpa II Resistance) (For Msp I Resistance)

Sample A Sample B Sample A Sample B Me Me

Hpa II digestion Me Msp I digestion

Adaptor ligation Adaptor ligation & 1st PCR & 1st PCR

Msp I digestion Msp I digestion

2nd PCR & Labeling 2nd PCR & Labeling

Cy5 Cy3 Cy3 Cy5

Me: Methylation Resitance=1/ (normalized Intensity) : HpaII (MspI) recognition sites

Scannning & Normalization Scannning & Normalization

(Hpa II Resistance)B/ (HpaII Resistance )A= (Msp I Resistance)B/ (MspI Resistance) A= (normalized HpaII Intensity)A/ (normalized HpaII Intensity)B (normalized MspI Intensity)A/ (normalized MspI Intensity)B

b Regression Line log5 log5 Msp I 2 resistant log5 log5 1.5

1

0.5 ) C

/MR

LC -3 -2.5 -2 -1.5 -1 0 0 0.5 1 1.5 2 2.5 3

-0.5 log(MR -1

Msp I -1.5

sensitive -2 Regression Line

Hpa II log(HRLC/HRC) Hpa II sensitive resistant

abs[log(HRLC/HRC)] Hor. Dis. Reg. Line COBRA Results . >log5 >log5 Positive Hor. Dis. >log5 log5 Positive log5 False Positive

Figure 1 MIAMI. (a) Schematic flowchart for the MIAMI method for comparison of sample A and sample B. Details were described in the text. (b) Application of the MIAMI method to a lung cancer cell line (1–87, abbreviated as LC) and a normal lung (abbreviated as C). Values for log(HRLC/HRC) are plotted on the x-axis and log(MRLC/MRC) are plotted on the y-axis. Green lines are located log5 of the horizontal distance from the y-axis. The regression line is in yellow and red lines are located log5 of the horizontal distance from this line. Points located more than this distance right of the regression line are judged as hypermethylated. Points located more than log5 of the horizontal distance left of the regression line are judged as hypomethylated. Genes indicated by red and green circles are located more than this distance from the y-axis. Hypermethylation was confirmed for genes indicated by red circles that were found to meet the criteria. Hypermethylation was not confirmed for genes indicated by green circles that did not meet the criteria, although they were located log5 of the horizontal distance from the y-axis. Orange and blue circles meet the criteria but are less than log5 of the distance from the y-axis. Hypermethylation was confirmed for genes indicated by orange circles but not blue circles. (c) Summary of methylation changes in a lung cancer cell line (1–87, abbreviated as LC) compared to a normal lung (abbreviated as C). Methylation change (horizontal distance from the regression line) for each gene is plotted on the y-axis. Red broken lines indicate the threshold we used (log5). Genes are placed in order of position along x-axis. (d) Average ratio of CpG contents to GC contents (CG/GC) and average ratio of AT contents to TA contents (AT/TA) were calculated for hypermethylated, unchanged, and hypomethylated genes.

National Center for Biotechnology Information annota- the condition that it does not have self-complementarity tion and/or Database of Transcriptional Start Sites. The (Primer 3 program; Rozen et al., 2000) and to probe nearest to a transcriptional start was selected on the (megaBlast program; Altschul et al.,

Oncogene Genome-wide profiling of promoter methylation I Hatada et al 3061 c 2.5 1 2 2 3 4 1.5 5 Hypermethylated 6 1 7 8 Threshold 0.5 9 (log5) 10 Ch13 Ch15 Ch18 Ch21ChX 0 Ch1 Ch2 Ch3 Ch4 Ch5Ch6 Ch7Ch8Ch9Ch10Ch11Ch12 Ch14 Ch16 Ch17 Ch19 Ch20Ch22 11 Threshold 12 (log5) -0.5 13 14

(Hor. Dis. Reg. Line ) 15 -1 16 Hypomethylated Methylation Change (LC/C) 17 -1.5 18 19 Regression Line -2 20 21 -2.5 22 16995X Y . Order of position on chromosome Hor. Dis. Reg. Line

d 0.9 1.8 0.8 1.6 0.7 1.4 0.6 1.2 0.5 1

0.4 AT/TA 0.8 CG/GC 0.3 0.6 0.2 0.4 0.1 0.2 0 0 Hypermethylated Unchanged Hypomethylated Hypermethylated Unchanged Hypomethylated

Hypermethylated (400 genes)

Unchanged (6556 genes)

Hypomethylated (39 genes)

Figure 1 Continued.

1990). Microarrays were made using an ink-jet oligonu- had methylated internal HpaII sites before the PCR cleotide synthesizer as described (Hughes et al., 2001). retained HpaII (MspI) sites. Therefore, MspI digestion Average position of the 8091 probes was 36 base pairs made it impossible to amplify these methylated frag- upstream of the transcription start sites. Average GC ments. In the second main PCR, only unmethylated content of the probes was 65%. All probes were DNA fragments were amplified. Amplified unmethy- included in the HpaII fragments less than 600 base lated HpaII-cleaved DNA fragments from two samples pairs. Average fragment length of the probe containing were labeled with Cy3 and Cy5, respectively, and co-hy- HpaII fragments was 194 base pairs. bridized to the microarray with 60-mer oligonucleotides We defined resistance as reciprocal of sensitivity. from promoter regions of 8091 genes. After hybridiza- Therefore, HpaII-sensitive (cleavable) DNA and MspI- tion, the microarray was scanned and fluorescence sensitive (cleavable) DNA were amplified and used for intensities on a scanned image were quantified, corrected calculating the HpaII resistance (HR) and MR, respec- for background noise, and normalized with the software tively. For HR, HpaII-cleavable unmethylated DNA DNASIS Array (Hitachi Software Engineering). Spots was amplified (I). HpaII-cleaved DNA fragments were with both Cy3 and Cy5 signals less than 0.001% of total ligated to an adaptor and subjected to first PCR signals were removed before analysis. HR was defined as (Figure 1a). At this stage, only DNA fragments that 1/(normalized HpaII intensity). Therefore, the ratio of

Oncogene Genome-wide profiling of promoter methylation I Hatada et al 3062 HR of two samples (HRB/HRA) can be represented by ized MspI intensity)B. Details for all procedures are (normalized HpaII intensity)A/(normalized HpaII in- described in Supplementary information 1. tensity)B. For MR, all MspI -cleavable DNA (unmethy- We applied the MIAMI method to a lung cancer lated plus methylated) was amplified (II). MspI-cleaved cell line (1–87, abbreviated as LC) and a normal DNA fragments were amplified and labeled similarly as lung (abbreviated as C). Values for log(HRLC/HRC) HpaII-cleaved DNA fragments then co-hybridized to and log(MRLC/MRC) are plotted on the x and the y- another microarray with the same 8091 genes. MR was axis, respectively, of Figure 1b. Various genes whose defined as 1/(normalized MspI intensity). Therefore, the HR changed more than fivefold (abs[log(HRLC/ ratio of MR of two samples (MRB/MRA) can be HRC)]>log5, areas more than log5 of the horizontal represented by (normalized MspI intensity)A/(normal- distance from the y-axis) were selected as candidates (indicated by red and green circles in Figure 1b and

a b Msp I (Msp I) 1-87 1-87 Control1-87 Control1-87 Control1-87 Control Control1-87 Control CARD9 IGFBP4 COPZ2 YPEL3 U NID2 U TWISTNB U U U U ontrol Control1-87 C 1-87 Control1-87 Control1-87 Control1-87 SARM1 RASGRP2 TREML2 U SUFU ELAVL2 U MLH3 U U U U

CIDEB BTG3 U SNX6 U RNPC2 GATA3 U TWISTNB TREML2 RNPC2 SEMA6A ADSSL1 U U KL U ID4 SEMA6A U UHRF1 TNFRSF10C U U U

ADSSL1 U

c COBRA RT-PCR

S S P C -L C R M Normal1-87 lung RERF-LCMEBC-1 LK-2 VMRC-LCPLK79 Normal lung1-87 RERF-LCMEBC-1 LK-2 V LK79

U CIDEB CIDEB MLH3

U G3PDH MLH3

d Squamous Small cell Normal Adenocarcinoma cell carcinoma lung carcinoma

Figure 2 Characterization of genes detected by MIAMI in a lung cancer. The PCR primers used are indicated in Supplementary information 3. (a) COBRA analysis of indicated genes. Genes indicated by red, orange, and blue circles met our criteria whereas indicated by green circles did not. COBRA analysis confirmed hypermethylation for genes indicated in red and orange and not genes indicated by green and blue. U indicates bands originating from unmethylated DNA. Other bands originated from methylated DNA. (b) Characterization of genes not meeting our criteria. MspI(HpaII) polymorphisms were detected by PCR followed by digestion with MspI. All the genes not meeting the criteria have MspI(HpaII) polymorphisms. (c) COBRA and RT–PCR analysis of CIDEB and MLH3 genes in six lung cancer cell lines (1–87, RERF-LCMS, EBC-1, LK-2, VMRC-LCP and LK79). U indicates bands originating from unmethylated DNA. Hypermethylation was observed for five of six lung cancer cell lines for CIDEB and two of six for MLH3. RT–PCR analysis showed that expression was reduced in all these hypermethylated cell lines. G3PDH was used for a control. (d) COBRA analysis of CIDEB genes in primary tumours. Eight adenocarcinomas, eight squamous cell carcinomas and five small cell carcinomas were used for analysis. Seventy-one per cent (15/21) of primary tumours were hypermethylated.

Oncogene Genome-wide profiling of promoter methylation I Hatada et al 3063 Supplementary Figure 1). These genes were confirmed to genes (Figure 1d). We found hypomethylated genes differ in methylation between the cancer and the normal had a low CG/GC ratio compared to genes without lung by combined bisulfite restriction (COBRA) analy- methylation change (P ¼ 4.0 Â 10À15, Figure 1d). How- sis, with the genes indicated by red circles hypermethy- ever, the AT/TA ratio showed no such tendency lated in the cancer cells and the genes indicated by (Figure 1d). This suggests that CpG-poor genes are green circles having no methylation-based changes easily demethylated compared to CpG-rich genes. In (Figure 2a). To characterize these false positives without other words, CpG-poor genes are more sensitive to changes in methylation (green circles), PCR was demethylation activity than to CpG-rich genes. This conducted followed by digestion with HpaII to test for could be explained by protection of demethylation site polymorphisms. We found these false positives have activity by a methyl-CpG binding . Promoters site polymorphisms between the cancer and the normal with a low density of methyl-CpGs bind MeCP-1 less lung (Figure 2b). All these false positives were close to strongly than those with a high density of methyl-CpGs the regression line (yellow line in Figure 1b and (Boyes and Bird, 1992). Therefore, it is intriguing to Supplementary Figure 1) where ideal changes in HR speculate that CpG-poor genes are less protected by and MR are postulated to be equal. Therefore, we made MeCP-1 from demethylation activity. Aberrant hypo- threshold criteria with which to judge points located methylation is related to the activation of oncogenes. more than log5 of the horizontal distance from the Therefore, our finding of the unique character of regression line as altered genes. (Figure 1b and hypomethylated genes will help us to understand the Supplementary Figure 1). Points located more than this mechanism of carcinogenesis. distance right of the regression line were judged as Aberrant hypermethylation is known to inactivate hypermethylated and points located more than this tumour suppressor genes (Baylin et al., 1997; Ushijima, distance left of the regression line were judged as 2005). Among the hypermethylated genes we identified, hypomethylated. Using our criteria, we could neglect further analysis of CIDEB and MLH3 were performed. all false positives (green circles) and all genes meeting CIDEB (Cell death-inducing DFFA-like effector b) the criteria (red circles) had methylation changes, activates in mammalian cells (Inohara et al., indicating our threshold is quite reasonable for selecting 1998) and is located at 14q11 where LOH frequently methylation-changed genes. Next we chose six genes occurs in lung (Abujiang et al., 1998). MLH3, that were located more than log5 of the horizontal MutL Homolog 3, is a DNA mismatch repair gene distance from the regression line but less than log5 from associated with mammalian microsatellite instability the y-axis (indicated by orange and blue circles in (Lipkin et al., 2000). MLH1, from the same family, Figure 1b and Supplementary Figure 1). These genes was frequently mutated in hereditary nonpolyposis can be judged as hypermethylated using our criteria but colon cancer (Papadopoulos et al., 1994) and was their changes in HR are less than fivefold. COBRA involved in microsatellite instability DNA in colon analysis indicated that five of the six had actually cancers (Jager et al., 1997). A methylation-based methylation-based changes (Figure 2a), indicating again analysis of an additional five lung cancer cell lines using our threshold criteria is useful for selecting methylation- COBRA revealed hypermethylation in five of six for changed genes (orange circles indicate positives and blue CIDEB and two of six for MLH3 (Figure 2c). RT–PCR circles indicates false positive). Conventional, indepen- analysis showed that expression was reduced in all dent COBRA experiments using gene-specific primers hypermethylated cancers (Figure 2c), indicating that the confirmed 17 of 18 hypermethylations that were expression profile of the genes completely correlated identified by integrated analysis of HR and MR at a with the methylation profile of the genes. Further threshold of log5. This suggests that our empirical rate methylation analysis was performed for CIDEB in of false positives is 6%. We used our threshold criteria primary tumours using COBRA. We found 71% (15/ to calculate the ratio of changes and found that 5.7% of 21) of primary lung cancers were hypermethylated in the the promoters of the genes were hypermethylated and promoter of CIDEB (Figure 2d). The present study was 0.6% were hypomethylated in lung cancer (Figure1c and approved by the Ethics Committees of Tohoku Uni- Supplementary information 2). This frequency is much versity School of Medicine and Gunma University. higher than a previous result in lung cancers (Yan et al., Following a complete description of the research 2001), suggesting high sensitivity. Further improvement protocol, written informed consent was obtained from such as using linear amplification could make this each participant. method more efficient because it is expected that a In conclusion, we conclude MIAMI is a powerful proportion of fragments will not amplify and give no method for genome-wide profiling of promoter methyla- signal by PCR. Actually we removed 14% of spots with tion in the human genome. This method is useful for both Cy3 and Cy5 signals less than 0.001% of total epigenetic studies of cancers. signals for analysis to get reproducible results. 0 Next we analysed the character of 5 sequences (from Acknowledgements 1000 base pairs upstream to 200 base pairs downstream of the transcriptional start sites) for these hypermethy- This work was supported in part by grants from the Japanese lated and hypomethylated genes. Average ratio of CpG Science and Technology Agency (IH), the Ministry of contents to GC contents (CG/GC) was calculated for Education, Culture, Sports, Science and Technology of Japan hypermethylated, unchanged, and hypomethylated (IH), and the Ministry of Health, Labour and Welfare of

Oncogene Genome-wide profiling of promoter methylation I Hatada et al 3064 Japan (IH). We thank the Cancer Cell Repository (Institute of providing cancer cell lines and Miss Asano for technical Development, Aging and Cancer, Tohoku University) for assistance.

References

Abujiang P, Mori TJ, Takahashi T, Tanaka F, Kasyu I, Jager AC, Bisgaard ML, MyrhojT, Bernstein I, Rehfeld JF, Hitomi S et al. (1998). Oncogene 17: 3029–3033. Nielsen FC. (1997). Am J Hum Genet 61: 129–138. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Lipkin SM, Wang V, Jacoby R, Banerjee-Basu S, Baxevanis (1990). J Mol Biol 215: 403–410. AD, Ly nch HT et al. (2000). Nat Genet 24: 27–35. Baylin SB, Herman JG, Graff JR, Vertino PM, Issa J-P. (1997) Papadopoulos N, Nicolaides NC, Wei YF, Ruben SM, Carter In: Vande W and Klein G (eds). Advances in Cancer KC, Rosen CA et al. (1994). Science 263: 1625–1629. Research. Academic Press: San Diego. pp 141–196. Rozen S, Skaletsky HJ. (2000) In: Krawetz S and Misener S Boyes J, Bird A. (1992). EMBO J 11: 327–333. (eds). Bioinformatics Methods and Protocols: Methods in Hatada I, Kato A, Morita S, Obata Y, Nagaoka K, Sakurada Molecular Biology. Humana Press: Totowa. pp 365–386. A et al. (2002). J Hum Genet 47: 448–451. Takai D, Jones PA. (2002). Proc Natl Acad Sci USA 99: Heisler LE, Torti D, Boutrous PC, Watson J, Chan C, 3740–3745. Winegarden N et al. (2005). Nucleic Acids Res 33: Ushijima T. (2005). Nat Rev Cancer 5: 223–231. 2952–2961. Weber M, Davies JJ, Wittig D, Oakeley EJ, Haase M, Lam Hughes TR, Mao M, Jones AR, Burchard J, Marton MJ, WL et al. (2005). Nat Genet 37: 853–862. Shannon KW et al. (2001). Nat Biotechnol 19: 342–347. Yan PS, Chen CM, Shi H, Rahmatpanah F, Wei SH, Caldwell Inohara N, Koseki T, Chen S, Wu X, Nunez G. (1998). EMBO CW et al. (2001). Cancer Res 61: 8375–8380. J 17: 2526–2533.

Supplementary Information accompanies the paper on Oncogene website (http://www.nature.com/onc)

Oncogene