Published OnlineFirst June 10, 2016; DOI: 10.1158/1541-7786.MCR-15-0395

Genomics
Molecular Cancer Research

A Minimal DNA Methylation Signature in Oral Tongue Squamous Cell Carcinoma Links Altered Methylation with Tumor Attributes

Neeraja M. Krishnan1, Kunal Dhas1, Jayalakshmi Nair1, Vinayak Palve1, Jamir Bagwan1, Gangotri Siddappa2, Amritha Suresh2, Vikram D. Kekatpure3, Moni Abraham Kuriakose2,3, and Binay Panda1,4

Abstract

Oral tongue squamous cell carcinomas (OTSCC) are a homogenous group of aggressive tumors in the head and neck region that spread early to lymph nodes and have a higher incidence of regional failure. In addition, there is a rising incidence of oral tongue cancer in younger populations. Studies on functional DNA methylation changes linked with altered gene expression are critical for understanding the mechanisms underlying tumor development and metastasis. Such studies also provide important insight into biomarkers linked with viral infection, tumor metastasis, and patient survival in OTSCC. Therefore, we performed genome-wide methylation analysis of tumors (N = 52) and correlated altered methylation with differential gene expression. The minimal tumor-specific DNA 5-methylcytosine signature identified genes near 16 different differentially methylated regions, which were validated using genomic data from The Cancer Genome Atlas cohort. In our cohort, hypermethylation of MIR10B was significantly associated with the differential expression of its target genes NR4A3 and BCL2L11 (P = 0.0125 and P = 0.014, respectively), which was inversely correlated with disease-free survival (P = 9E−15 and P = 2E−15, respectively) in patients. Finally, differential methylation in FUT3, TRIM5, TSPAN7, MAP3K8, RPS6KA2, SLC9A9, and NPAS3 genes was found to be predictive of certain clinical and epidemiologic parameters.

Implications: This study reveals a functional minimal methylation profile in oral tongue tumors with associated risk habits, clinical, and epidemiologic outcomes. In addition, NR4A3 downregulation and correlation with patient survival suggests a potential target for therapeutic intervention in oral tongue tumors. Data from the current study are deposited in the NCBI Geo database (accession number GSE75540). Mol Cancer Res; 14(9); 805–19. ©2016 AACR.

1Ganit Labs, Bio-IT Centre, Institute of Bioinformatics and Applied Biotechnology, Bangalore, India. 2Integrated Head and Neck Oncology Program, Mazumdar Shaw Centre for Translational Research, Bangalore, India. 3Head and Neck Oncology, Mazumdar Shaw Medical Centre, Bangalore, India. 4Strand Life Sciences, Bangalore, India.

Note: Supplementary data for this article are available at Molecular Cancer Research Online (http://mcr.aacrjournals.org/).

Corresponding Author: Binay Panda, Ganit Labs, Institute of Bioinformatics and Applied Biotechnology, Biotech Park, Electronic City Phase I, Bangalore 560100, India. Phone: +91-80-28528900; Fax: +90-80-28528904; E-mail: [email protected]

doi: 10.1158/1541-7786.MCR-15-0395

©2016 American Association for Cancer Research.

Introduction

Head and neck squamous cell carcinomas (HNSCC) are a heterogeneous group of tumors with different incidences, mortalities, and prognosis for different subsites. HNSCCs are the sixth leading cause of cancer worldwide (1) and account for almost 30% of all cancer cases in India (2). Oral cancer is the most common subtype of head and neck cancers with a worldwide incidence of greater than 300,000 cases. The disease is an important cause of morbidity and mortality with a 5-year survival of less than 50% (1, 2). Unlike other subsites of oral cavity like gingivo-buccal, tumors originating in the anterior part of tongue or oral tongue have an increased association with younger patients (3, 4), spread early to lymph nodes (3), and have a higher regional failure (5). Tobacco and alcohol are common risk factors for this group of cancer. Although the role of human papilloma virus (HPV) is clearly understood in the prognosis of oropharyngeal tumors (6), the role of HPV as an etiologic agent and/or a prognostic marker in oral tongue squamous cell carcinoma (OTSCC) is not yet established.

In the last 5 years, studies have identified somatic mutations, insertions, deletions and amplifications, copy number alterations, loss of heterozygosity, gene expression, and DNA methylation changes for different tumor subsites in HNSCC (7–12). In addition, a few studies (13, 14) have demonstrated the role of epigenetic modifications and associated pathways in HPV-positive HNSCC patients.

Epigenetic changes are known biomarkers for cancer initiation and progression (15). DNA methylation at cytosine residues [5-methylcytosine (5mC)] is one of the best studied epigenetic changes in cancer (16). Cytosine methylation alters gene expression and thus plays an important role in other biologic processes, such as embryonic development, imprinting, and X-chromosome inactivation (17, 18). Cancer development is characterized by two major DNA methylation changes, hypermethylation of CpG islands located in the promoter regions of tumor suppressor genes (making them inactive) and

www.aacrjournals.org 805

Downloaded from mcr.aacrjournals.org on October 1, 2021. © 2016 American Association for Cancer Research. Published OnlineFirst June 10, 2016; DOI: 10.1158/1541-7786.MCR-15-0395

Krishnan et al.

global hypomethylation leading to activation of oncogenes and transposons (19). By conducting genome-wide methylation studies, one can identify potential DNA methylation markers for early cancer detection. Previous studies profiling DNA methylation changes in HNSCC primarily used low-throughput assays and identified changes in DCC, DAPK1, SPEN, E-cadherin, CDKN2A, RASSF1, and MGMT genes (20–22). Pickering and colleagues (2013) studied methylation in oral squamous cell carcinomas using low-density HumanMethylation27K arrays and found 3,694 loci to be differentially methylated in tumor tissues (9). Oral cancer studies by Shaw and colleagues (2013) used the Illumina GoldenGate assay that comprises 1,505 CpG island loci and found 48 loci to be differentially methylated in tumors (23).

In this study, we profiled tumor-specific 5mC changes using high-density arrays comprising 485,512 probes in 52 pairs of OTSCCs. We identified differentially methylated loci or probes (DMP) between tumors and their matched normals and studied their functional impact by comparing the levels of differential methylation with differential gene expression in tumors. We found varying extents of differential methylation in various subregions. These differentially methylated regions (DMRs) often encompass multiple genes and were associated with differential gene expression, thus representing functional DMRs. We found aberrant methylation in a total of 27,276 hypermethylated and 21,231 hypomethylated loci with an average Δβ (level of differential methylation) of at least 0.2. We discovered a minimal methylation signature comprising 16 different DMRs that harbored GPER1, TTLL8, RHPN1, OR2T6, MIR10B, ENAH, EMX2OS, BTK, C11orf53, CENPVL1, COL9A1, DEFA1, ODF3, RARRES2, SHF, and SP6 genes with significant differential methylation (quantitative and categorical, see Materials and Methods) using a machine learning–based random forest approach (24), which were validated using methylation data (450K) from the head and neck cohort in the Cancer Genome Atlas (TCGA) project. In our study, hypermethylation in a DMR-harboring miRNA, MIR10B, was linked to downregulation of two of its validated gene targets, NR4A3 and BCL2L11, which correlated well with disease-free survival. Furthermore, the machine-learning algorithm identified DMPs within TSPAN7, FUT3, TRIM5, SLC9A9, NPAS3, RPS6KA2, MAP3K8, TIMM8A, and RNF113A genes to be predictive of risk habits, nodal status, tumor staging, prognosis, and HPV infection.

Materials and Methods

Informed consent, ethics approval, and patient samples used in the study
Informed consent was obtained voluntarily from each patient, and ethics approval was obtained from the Institutional Ethics Committees of the Mazumdar Shaw Medical Centre (Bangalore, India). Tumor and matched control (blood and/or adjacent normal tissue) specimens were collected and used in the study. Only those patients whose histologic sections confirmed the presence of squamous cell carcinoma with at least 70% tumor cells in the specimen were used in this study. At the time of admission, patients were asked about their habits (chewing, smoking, and/or alcohol consumption). Fifty-two treatment-naïve patients who underwent staging according to AJCC criteria and curative intent treatment as per NCCN guidelines involving surgery with or without postoperative adjuvant radiation or chemoradiation at the Mazumdar Shaw Medical Centre were accrued for the study (Supplementary Table S1).

Whole-genome methylation assay
Genomic DNA was extracted from tissues using DNeasy Blood and Tissue Kit (Qiagen) as per the manufacturer's specifications. Genomic DNA quality was assessed by agarose gel electrophoresis, and samples with intact, high molecular weight DNA bands were used for bisulfite conversion. Five hundred nanograms of genomic DNA was used for bisulfite conversion using the Zymo EZ DNA Methylation Kit (Zymo Research Corp.) following the manufacturer's instructions and eluted in 12 µL of elution buffer. Four microliters of the bisulfite-converted DNA was used as template for target preparation to hybridize on the BeadChip and was processed as per the manufacturer's instructions following Infinium HD and Infinium protocol for HumanMethylation450K BeadChip (Illumina), respectively. Data were collected using the HiScan (Illumina) reader and analyzed with GenomeStudio V2011.1 methylation module 1.1.1 (Illumina).

Methylation450 BeadChip probe alignment and removal of ambiguous probes from analyses
The probe sequences were aligned using bowtie2 (25), while allowing up to 2 mismatches to the human reference hg19 sequence to detect and exclude probes mapping ambiguously to multiple locations in the reference genome (Supplementary Text S1). We found 17,649 probes (3.6% of the total number of probes) to be ambiguously mapping in this manner. These probes mapped to multiple locations, ranging from 2 to 6265 (Supplementary Table S2).

Whole-genome gene expression assay
The whole-genome gene expression profiling was carried out with Illumina HumanHT-12 v4 expression BeadChip (Illumina) with 21 tumors and their matched adjacent normal tissues. Total RNA was extracted using PureLink RNA Mini Kit (Invitrogen), and RNA quality was checked on the bioanalyzer using RNA Nano6000 chip (Agilent). RNA samples with poor RNA integrity numbers (RIN; ≤7) on the bioanalyzer chip, indicating partial degradation of RNA, were processed using Illumina WGDASL assay and the ones with good RIN numbers (>7) were labeled using Illumina TotalPrep RNA Amplification Kit (Ambion) as per the manufacturer's recommendations. Targets were used to hybridize arrays, and arrays were processed according to the manufacturer's recommendations. Arrays were scanned using HiScan (Illumina), and the data collected were analyzed with GenomeStudio V2011.1 Gene Expression module 1.9.0 (Illumina) to check for the assay quality control probes. Raw signal intensities were exported from GenomeStudio for transformation, normalization, and differential expression analyses using R.

Data preprocessing and differential methylation analyses
Raw data for 52 tumor:matched control sample pairs obtained from the Illumina Infinium Methylation450 BeadChip scanned using the HiScan system were analyzed in GenomeStudio Methylation Module v1.9.0 (Illumina) for assay controls, before exporting the β values for analysis using R. For each CpG site, ratio of fluorescent signal was measured by that of a methylated

806 Mol Cancer Res; 14(9) September 2016 Molecular Cancer Research


DNA Methylation in Oral Tongue Tumors

probe relative to the sum of the methylated and unmethylated probes, termed as a β value. β values, ranging from 0 (no methylation) to 1.0 (100% methylation of both alleles), were recorded for each probe on the array. The β value is calculated using the formula mentioned below:

$$\beta = \frac{P_m}{P_m + P_{um}}$$

where $P_m$ is the signal intensity of the methylation detection probe and $P_{um}$ is the signal intensity of the nonmethylation detection probe. Infinium methylation assay has built-in sample-dependent and -independent controls, which were used to test the overall efficiency of the assay for each sample. Raw signal intensities were exported as .idat files, preprocessed (transformed and normalized), and analyzed using R Bioconductor packages, wateRmelon, minfi, and ComBat (26–28), to estimate differential methylation. The Illumina 450K data were preprocessed using a functional normalization procedure implemented by the "preprocessFunnorm" function within the R Bioconductor package minfi (Supplementary Text S2). This algorithm was recently developed and found to outperform other 450K data normalization and batch correction methods (29). After functional normalization, we identified DMRs using "bumphunter" function in minfi, retaining significant ones with P ≤ 0.05 and fwer ≤ 0.05. Dasen normalization was also performed using wateRmelon to obtain differentially methylated loci using minfi (Supplementary Text S3). A clear difference before and after normalization was seen in the density plots (Supplementary Fig. S1). The normalized data were further batch corrected using ComBat (28) using empirical Bayes methods. Dasen-normalized DMPs corresponding to these DMRs were extracted. Postprocessing of the differentially methylated data for visualization was performed as per the method provided in the Supplementary Material (Supplementary Text S4–S6).

Data preprocessing and differential expression analyses
Whole-genome gene expression data were analyzed using final reports exported from Genome Studio, using the R Bioconductor package lumi (30). The VST transformation and LOESS normalization methods in lumi performed best in preprocessing data from the DASL and WG assays. They were chosen based on their highest mean IACs (interarray correlation coefficients), among various combinations of three types of transformations (VST, log2, and cubicRoot) and six types of normalizations (SSN, quantile, RSN, VSN, RankInvariant, and LOESS). The preprocessed data were further batch corrected using ComBat, as the experiments had been performed in multiple batches. The batch-corrected gene expression data were then analyzed for differential expression using limma (Supplementary Text S7). This method of transformation, normalization, and batch correction was performed as described previously (31).

Correlation of methylation and expression data
All probes were annotated using the Ensembl v75 database and classified on the basis of their locations to genic subregions, such as 5′ and 3′ untranslated regions (UTR), north and south shelves and shores, TSS1500, promoters, CpG islands, gene bodies, and first exons. The relationship between differential expression of genes and differential methylation of genic subregions was visualized (Supplementary Text S8).

Minimal differentially methylated loci using machine-learning algorithm
Sample-wise dasen-normalized and batch-corrected intensities of probes corresponding to DMRs, identified postfunctional normalization using minfi, along with the tumor:matched control tissue type for each sample were used as a training set (training set #1) for the random forest analyses.

Variable selection
The method employed here follows the method described before (32). Briefly, the algorithm performs both backward elimination of variables and selection based on the importance spectrum. The Bioconductor package varSelRF (33) was used to perform this function. About 20% of the least important variables are eliminated iteratively until the current out-of-bag (OOB) error rate becomes larger than the initial OOB error rate or the OOB error rate in the previous iteration. We computed a metric by multiplying the prediction accuracy (sensitivity and specificity) and repeatability, and dividing the product by the number of parameters contained within the predicted set, and reported the set with the highest value for this metric. Five hundred of such iterations were performed, and each of the iterations was permuted across 3,000 random forest trees (Supplementary Text S9). A score was computed for predictive parameter sets across all iterations, for each sample, which factored in the prediction accuracy of the tissue type of that sample, the repeatability (number of instances out of 500) of the predictive parameter set, presence of other predictive parameter sets, and the parameter set size. The first two weighed the score positively, and the latter two, negatively. The R commands used in our method are provided below:

library(randomForest)
library(varSelRF)
DS=read.table("/home/neeraja/RF/OSCC/TminN.NoFlanks.forRF",header=TRUE,na.strings="NA")
DS <- na.omit(DS)
for (i in 1:500) {
  set.seed(i)
  DS.rf.vsf=varSelRF(DS[,-1],DS[,1],ntree=3000,ntreeIterat=2000,vars.drop.frac=0.2,whole.range = FALSE,keep.forest = TRUE)
  print(DS.rf.vsf$selected.vars)
  print(predict(DS.rf.vsf$rf.model,subset(DS[,-1],select=DS.rf.vsf$selected.vars)))
}

For all iterations of all RF analyses, we confirmed that the rankings of variable importances remained highly correlated (Supplementary Table S3) before and after correcting for multiple hypotheses comparisons using pre- and post-Benjamini–Hochberg FDR-corrected P values (Supplementary Text S10). The R commands used to recompute importance values after multiple comparisons testing for the six somatic mutation category RF analyses are provided below:

# Selected variables from varSelRF analyses
DS.rf.vsf$selected.vars
ds.rf <- randomForest(DS[,1] ~ ., DS, ntree = 3000, norm.votes = FALSE, importance = TRUE)
#Importance values before FDR correction
ds.rf$importance[-1,3]


#Importance values after FDR correction
p.adjust(ds.rf$importance[-1,3], method = "BH", n = 6)

The random forest algorithm was also implemented to predict tumor-specific minimal methylation signature, this time training with a categorical input for the probes, instead of the actual intensity values (training set #2). This training set was created on the basis of the normalized Δβ (tumor − matched control β) values falling into predefined categories. Following the Δβ density distributions across probes (Supplementary Fig. S2), for 0.2

Validation of minimal methylation signature in TCGA HNSCC data
The best-scoring signatures identified from the random forest analyses on the OTSCC data, using three training sets, were examined in the 50 tumor:matched control pairs (various subsites in head and neck region) from the TCGA cohort. TCGA data were preprocessed and normalized exactly the same way as described above for the OTSCC data.

Prediction of minimal differential methylation signature in

Visualization of variants
We visualized all the variants and the corresponding genes using Circos v0.66. We integrated information on LOHs, somatic SNPs and indels, and copy number insertions and deletions from 48 patients described previously (13) with 10% frequency of patients bearing the variants, genes undergoing significant expression and methylation changes, and visualized the changes using the circular genomic representation. The DNAse1 hypersensitivity tracks, methylation, histone and chromatin modifications, and TFBS tracks were loaded and visualized using the UCSC Genome Browser. Genome-wide profiles of differential expression and differential methylation of DMRs housing island/TSS1500/promoter probes were visualized using GenomeGraphs, a tool in the UCSC Genome Browser.

Validation of miR-10B methylation using quantitative methylation-specific PCR
The genomic DNA was isolated from paired oral tongue tumor and the adjacent normal tissue samples using DNeasy Blood & Tissue Kit (Qiagen) following the manufacturer's instructions. Genomic DNA (total 500 ng) was bisulfite converted using EZ DNA Methylation Kit (Zymo Research) following the manufacturer's guidelines. For the specific amplification of CpG islands of MIR10B promoter, methylation-specific primers (forward, 5′-GTGAGTAGTTTTTCGGAGGTAGC-3′ and reverse, 5′-CGAACTAACTATAAACCCGAACG-3′) were designed using MethPrimer software (http://www.urogene.org/methprimer/), and quantitative methylation-specific PCR (qMSP) was carried out using KAPA SYBR Fast qPCR Master Mix (Kapa Biosystems). Quantitative


methylation in each sample was normalized using the methylated primers for ACTB (forward, 5′-AACCAATAAAACCTACTCCTCCCTTAA-3′ and reverse, 5′-TGGTGATGGAGGAGGTTTAGTAAGT-3′). The reaction mix was denatured initially at 95°C for 5 minutes and amplified for 40 cycles at 95°C for 10 seconds, 53°C for 30 seconds (59°C for ACTB primer), followed by extension at 72°C for 1 minute. The amplification was followed by dissociation curve analysis. PCR reactions were performed using Stratagene MX300P (Agilent Technologies) and were analyzed by MxPro Mx-3000p software (Agilent Technologies). Human HCT116 DKO methylated and nonmethylated DNA set standards (Zymo Research) were used as positive and negative assay controls, respectively, for qMSP. The relative level of methylated DNA was calculated for each tumor versus the adjacent normal using the method described by Schmittgen and Livak, 2008 (38).
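The relative quantification step can be illustrated with a minimal R sketch. It assumes the 2^(−ΔΔCt) form of the comparative Ct method and roughly 100% amplification efficiency; the function name `relative_methylation` and all Ct values are hypothetical and are not taken from the study.

```r
# Minimal sketch of comparative Ct (2^-ddCt) quantification for qMSP.
# Assumption: ~100% amplification efficiency for both primer sets.
# All Ct values below are hypothetical, for illustration only.

relative_methylation <- function(ct_target_tumor, ct_ref_tumor,
                                 ct_target_normal, ct_ref_normal) {
  # Delta Ct: target (e.g., MIR10B) normalized to reference (e.g., ACTB)
  delta_ct_tumor  <- ct_target_tumor  - ct_ref_tumor
  delta_ct_normal <- ct_target_normal - ct_ref_normal
  # Delta-delta Ct: tumor relative to its matched adjacent normal
  delta_delta_ct <- delta_ct_tumor - delta_ct_normal
  2^(-delta_delta_ct)
}

# One hypothetical tumor:normal pair
fold <- relative_methylation(ct_target_tumor = 26.0, ct_ref_tumor = 22.0,
                             ct_target_normal = 29.0, ct_ref_normal = 22.5)
print(fold)
```

A value above 1 would indicate more methylated template in the tumor than in its matched normal under these assumptions.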

Differential expression analysis of NR4A3 in TCGA cohort
Level 3 RNA-seq expression data on TCGA HNSCC (N = 42 for all subsites and N = 12 for oral tongue, where expression data were available for tumor:normal pairs), along with survival and living status, were downloaded from the UCSC Cancer Browser (https://genome-cancer.ucsc.edu/proj/site/hgHeatmap/#). Tumor-specific ratios of the normalized expression values for NR4A3 gene were log transformed before further analyses.

Survival analysis
Event-free survival probability distributions were estimated using the Kaplan–Meier method, and significance was assessed using the log-rank test. We calculated event-free survival for patients as the time from diagnosis to the event (recurrence, disease progression, or death) or last examination if the patient had no event. Significance was calculated as two-tailed P values, on association with event-free survival (months), of habits alone, NR4A3 expression status, and habits and NR4A3 expression, together. The R commands used for Kaplan–Meier analyses are provided in Supplementary Text S12.

Figure 1. Analysis workflow schema. The figure illustrates the tools, Bioconductor packages, and functions in R used for data preprocessing, differential methylation and expression analyses, correlation between them, identifying a tumor-specific minimal methylation signature using an ensemble machine-learning algorithm and its association with clinical and epidemiologic factors, and validation.

Results

Analysis workflow and differentially methylated loci in OTSCC
The analytic pipeline for genome-wide methylation and gene expression data analyses along with linking methylation with gene expression, machine-learning approach to discover methylation signature, and follow-up validation is provided in Fig. 1. The methylation data were normalized by multiple methods (see Materials and Methods), batch corrected, and then used for the identification of DMPs and regions encompassing multiple probes (DMRs) in the tumors. Scanning of the preprocessed data in tumor samples before normalization, batch correction, and pairing with corresponding normal tissues revealed DMPs in seven genes, UBE4B, CCDC13, LRP5L, BCL3, MIR4260, FOXK2, and COL18A1, to be hypermethylated, and one in gene GSTM2, to be hypomethylated in nearly 95% of the tumors. We found more DMRs to be hypomethylated than hypermethylated (Fig. 2A), which is consistent with the data from the TCGA cohort, including when data from the oral tongue subsite alone was used (Fig. 2A). The fraction of hypermethylated probes located within the CpG islands, TSS1500, and promoter subregions, combined, were higher, whereas hypomethylated probes were more frequent in the gene bodies for OTSCC (Fig. 2B). The average Δβ of the island DMRs had a narrow distribution for the OTSCC and HNSCC (all and oral tongue subsites from the TCGA cohort), with median values around 0.30 and 0.40, respectively, indicative of greater hypermethylation events (Fig. 2C). The interquartile ranges for average Δβ distributions of TSS1500 DMRs spanned both hyper- and hypomethylation ranges, for the OTSCC and HNSCC cancer types, respectively. The median, however, was more indicative of hypermethylation in the OTSCC and hypomethylation in the HNSCC. The average Δβ distributions were identical for island DMRs between the OTSCC and the TCGA HNSCC cohort for early- and late-stage tumors, with their medians suggestive of hypermethylation (Fig. 2D). The Δβ distribution for TSS1500 DMRs in OTSCC was much wider in the late-stage patient samples, as compared with those in early stages of the disease and their medians (both 0.3–0.4) were representative of hypermethylation. Overall, the


Figure 2. DMR statistics. A, DMR (q ≤ 0.05) numbers compared between in-house oral tongue tumors (OTSCC) and data from the TCGA cohort (TCGA_OralTongue: data for oral tongue subsites of the TCGA cohort and TCGA_HNSCC: data for all tumor subsites in the TCGA cohort). B, region-wise characterization thereof. C, distribution of magnitude of differential methylation in CpG islands and TSS1500. D, distribution of DMRs in early- and late-stage tumors.
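The β and Δβ quantities summarized in Fig. 2 can be illustrated with a small R sketch. It uses the β formula given in Materials and Methods and takes Δβ as tumor minus matched control. Applying the 0.2 magnitude as a symmetric per-probe cutoff is an assumption made here for illustration; the function names and intensity values are hypothetical.

```r
# Minimal sketch: beta values from probe intensities, then delta-beta calls.
# Assumptions: the per-probe +/-0.2 cutoff and all intensities are illustrative.

beta_value <- function(meth, unmeth) {
  # beta = P_m / (P_m + P_um)
  meth / (meth + unmeth)
}

classify_delta_beta <- function(delta_beta, cutoff = 0.2) {
  ifelse(delta_beta >= cutoff, "hyper",
         ifelse(delta_beta <= -cutoff, "hypo", "unchanged"))
}

# Three hypothetical probes in one tumor:matched control pair
tumor_beta  <- beta_value(meth = c(8000, 1500, 4000),
                          unmeth = c(2000, 8500, 4000))
normal_beta <- beta_value(meth = c(3000, 6000, 4400),
                          unmeth = c(7000, 4000, 3600))

delta_beta <- tumor_beta - normal_beta   # tumor minus matched control
status <- classify_delta_beta(delta_beta)
print(data.frame(tumor_beta, normal_beta, delta_beta, status))
```

In practice the normalized and batch-corrected β values described in Materials and Methods would replace the raw ratios used in this sketch.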

intraregion differences (within a dataset and across tumor stages) for the Δβ distributions were significant (P < 0.0001; unpaired t test) over the interregion differences across datasets for the same subregions (TSS1500, P = 0.0751 or CpG island, P = 0.2758). A comprehensive representation of all 5mC changes, along with previously identified somatic variants in OTSCC (37), is presented as a Circos graph (Supplementary Fig. S3; Supplementary Table S4).

Correlation between expression and methylation
We wanted to identify the functional relevance of 5mC changes by correlating methylation changes with changes in gene expression. To do this, we extracted gene level information using probes spanning hyper- or hypomethylated regions that were up- or downregulated, respectively, for the same patient. First, we separated probes in gene bodies, CpG islands, 5′ and 3′ UTRs, north and south shelves and shores, promoters, 1.5 kb flanking the transcription start sites (TSS1500), and first exons. For gene expression analysis, whole-genome expression data were processed after transformation, normalization, and batch correction (see Materials and Methods), and tumor-specific up- and downregulated genes were considered for comparison. We found a maximum of 27 genes satisfying these criteria in the gene bodies, closely followed by 26 genes each, in the CpG islands and 5′ UTR and 22 genes in the first exon (Supplementary Fig. S4). Finally, we obtained 12 genes across all the regions where the differential methylation across all tumors was significantly linked (2-tailed t test, P < 0.05) with gene expression changes in the same tumor:normal paired samples (Fig. 3). The magnitude of correlation between expression and methylation varied widely across genes for various subregions (Fig. 3 and Supplementary Fig. S4), with a significant negative correlation (P < 0.05, two-tailed t test) observed for most subregions: 5′ UTR, R = −0.51; promoter, R = −0.5; TSS1500, R = −0.56; body, R = −0.35; S shelf, R = −0.39; first exon, R = −0.46; S shore, R = −0.66; and N shelf, R = −0.36. The island region harbored the largest fraction of hypermethylated and downregulated genes, whereas the promoter and TSS1500


Figure 3. Region-wise correlation between expression and methylation. The relationship between preprocessed differential methylation (tumor MINUS normal) values for probes from specific regions, such as gene bodies, CpG islands, TSS1500, N shore, S shore, N shelf, S shelf, 5′ UTR, and first exon, and the log2 fold change (tumor/normal fold change) of the corresponding gene expression for the same patient. Blue and orange colors, hypo- and hypermethylation, respectively. Filled circles indicate a correlation in the expected direction, that is, hypermethylation and downregulation or hypomethylation and upregulation. Empty circles indicate a counterintuitive correlation, namely, hypermethylation and upregulation or hypomethylation and downregulation. The sizes of the circles indicate the magnitude of gene expression. Only genes with significant associations (P < 0.05) across all tumors are shown.
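The relationship shown in Fig. 3 can be sketched in R as a correlation between per-patient Δβ and log2 fold change for a single gene. The sketch assumes a Pearson correlation with a two-sided t-based test via `cor.test` from base R; the article does not state which correlation coefficient was used, and the values below are hypothetical.

```r
# Minimal sketch: correlate differential methylation with differential
# expression for one gene across patients.
# Assumptions: Pearson correlation; all values are hypothetical.

delta_beta <- c(0.35, 0.28, 0.41, 0.10, 0.22, 0.30)  # tumor minus normal beta
log2_fc    <- c(-1.8, -1.1, -2.3, -0.2, -0.9, -1.5)  # log2(tumor / normal)

res <- cor.test(delta_beta, log2_fc,
                method = "pearson", alternative = "two.sided")

r <- unname(res$estimate)  # correlation coefficient
p <- res$p.value           # two-sided P value

# "Expected direction" in the sense of the caption: hypermethylation with
# downregulation (or hypomethylation with upregulation) gives a negative r.
expected_direction <- r < 0
print(c(r = r, p = p))
```

Repeating this per gene and per subregion, then keeping genes with P < 0.05, would mirror the filtering described in the caption under these assumptions.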


subregions displayed a strong inverse relationship between differential expression and differential methylation (Supplementary Fig. S4). This observation formed a basis for considering DMRs housed in these three subregions as one of our training sets to identify a minimal methylation signature.

Tumor-specific DMPs and DMRs
We used three different training sets (Materials and Methods section) to discover minimal methylation signature using variable selection by elimination analyses implemented by the random forest approach. The scores for the discovered signatures are presented in Supplementary Table S5. We found two sets of two probes in four different DMRs (encompassing GPER1 and OR2T6, TTLL8 and RHPN1 genes), each to be the best diagnostic predictor for tumors (Fig. 4A, top) with training set #1 (see Materials and Methods). These probes along with other loci housed within the same DMRs showed a hypomethylation signature for most tumor samples when compared with the normal samples (Fig. 4A). The signature was strongest for the predicted probe (DMP) within each DMR. Using the categorical training set based on binning Δβ of paired tumor and matched control samples (training set #2; see Materials and Methods), we further identified three more DMRs that encompassed MIR10B, ENAH, and EMX2OS genes (Fig. 4A). For MIR10B and EMX2OS, probes were distinctly hypermethylated for the majority of tumors (94% and 88%, respectively), while ENAH exhibited a mixed trend. Analyzing homogeneous DMRs in CpG islands/TSS1500 and promoters, alone (training set #3; see Materials and Methods), we obtained DMRs encompassing 10 genes (BTK, C11orf53, CENPVL1, COL9A1, DEFA1, ODF3, RARRES2, RHPN1, SHF, and SP6) with the highest sensitivity and specificity (Fig. 4A). We found that the RHPN1 gene showed up in multiple training sets. The DMRs corresponding to BTK, C11orf53, COL9A1, DEFA1, ODF3 along with RHPN1 were hypomethylated, while the remaining were relatively hypermethylated. The 0.632+ bootstrapping errors were low, with a median error of approximately 0.025, for predictions of normal, tumor, or either tissues (Supplementary Fig. S5), indicative of a high overall prediction accuracy.

We further validated the tumor-specific DMRs discovered in our RF analysis in the TCGA cohort. As presented in Fig. 4B (bottom), the differential methylation trends observed in our OTSCC cohort were also observed in both oral tongue and other subsites in the TCGA HNSCC cohort, except in the case of the ENAH gene, where the majority of tumors were hypomethylated for the particular probe in the TCGA cohort (Fig. 4B; training set #2). In addition, we also looked at the cell line data from ENCODE consortium that revealed DNAse I hypersensitivity and/or CpG methylation in these genes, including specific locations within the DMRs housing the tissue type predictive

Figure 4. Discovery of a minimal DNA methylation signature. A, minimal differential methylation profiles of methylation loci from random forest analyses using three different training sets (see Materials and Methods). B, validation of specific DMPs and DMRs using the same training sets in the TCGA HNSCC data.
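The 0.632+ bootstrap error reported for the random forest signatures can be illustrated with a short worked sketch. This is not the authors' pipeline; it is a minimal Python illustration of a simplified form of the .632+ estimator of Efron and Tibshirani, assuming the resubstitution error, the out-of-bag (leave-one-out bootstrap) error, and the no-information rate are already available. The function names, the capping of the out-of-bag error, and the example error values are assumptions made for illustration.

```python
def no_information_rate(p_observed: float, q_predicted: float) -> float:
    """No-information error rate for a two-class problem.

    p_observed: fraction of samples whose true label is "tumor".
    q_predicted: fraction of samples the classifier labels "tumor".
    """
    return p_observed * (1.0 - q_predicted) + (1.0 - p_observed) * q_predicted


def bootstrap_632_plus(resub_err: float, oob_err: float, gamma: float) -> float:
    """Simplified .632+ error estimate from three error rates in [0, 1]."""
    # Assumption: cap the out-of-bag error at the no-information rate.
    oob = min(oob_err, gamma)
    # Relative overfitting rate R; zero when there is no sign of overfitting.
    if oob > resub_err and gamma > resub_err:
        r = (oob - resub_err) / (gamma - resub_err)
    else:
        r = 0.0
    # The weight moves from 0.632 (no overfitting) toward 1 (heavy overfitting).
    weight = 0.632 / (1.0 - 0.368 * r)
    return (1.0 - weight) * resub_err + weight * oob


if __name__ == "__main__":
    # Hypothetical values: balanced classes, perfect fit on the training data,
    # 5% error on the out-of-bag samples.
    gamma = no_information_rate(0.5, 0.5)
    print(bootstrap_632_plus(0.0, 0.05, gamma))
```

With a resubstitution error of zero, the estimate sits between the optimistic training error and the pessimistic out-of-bag error, which is the behavior that makes the estimator attractive for small cohorts.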


probes (Supplementary Fig. S6), possibly indicating the role of these CpG sites in regulation of gene expression.

Hypermethylation of MIR10B and downregulation of its target gene expression

We scanned upstream and downstream of the DMR that housed the MIR10B gene in an attempt to understand its differential methylation relative to its neighborhood. We found a DMR of nearly 1.31 Mb to be differentially methylated in all the tumor samples (Fig. 5A). We found the sizes of such functional DMRs to be larger than the stringent DMR boundaries determined by minfi, both for MIR10B and for ENAH and EMX2OS, which were codiscovered with MIR10B as part of the minimal set (Supplementary Fig. S7).

We observed negative correlations between the differential methylation of the MIR10B gene and the differential expression of two of its validated gene targets, BCL2L11 and NR4A3 (P = 0.0125 and P = 0.014, respectively; Fig. 5B and C). NR4A3 expression was downregulated in all except one tumor studied (>95% of the tumors). In addition to miR-10B hypermethylation in 60% of the tumors, and very frequent downregulation of NR4A3, we found that 50% to 60% of the tumors have hypermethylation in the NR4A3 gene itself, near the first exon and first intron, spanning a CpG island (Supplementary Fig. S8). Similarly, about 60% of the tumors had hypermethylation in the promoter region of the BCL2L11 gene, also spanning a CpG island (Supplementary Fig. S8). We found that the extent of NR4A3 downregulation was greater, although only weakly so, in patients with no risk habits than in patients with at least one of the risk habits (smoking, alcohol consumption, and chewing tobacco). The tumor-specific differential expression of BCL2L11 and NR4A3 genes was inversely correlated with disease-free survival (P = 9E-15 and P = 2E-15, respectively; Fig. 5D and E), with absence of risk habits associating more with improved disease-free survival.

We validated the inverse association between differential methylation of MIR10B and differential expression in its downstream targets, NR4A3 (R = -0.66; P = 0.04) and BCL2L11 (R = -0.78; P = 0.01), experimentally by performing qMSP for MIR10B and qPCR for BCL2L11 and NR4A3 (see Materials and Methods), in 12 additional tumor:matched control pairs (Supplementary Fig. S9). We found that MIR10B is hypermethylated in nearly 60% of the samples, with 83% and 42% of the tumors showing downregulation for NR4A3 and BCL2L11, respectively (Supplementary Fig. S9), validating the findings from the arrays.

NR4A3 expression and patient survival

We studied the effect of MIR10B target gene expression on patient survival and found that the extent of NR4A3 downregulation was greater, although only weakly so, in patients with no risk habits than in patients with at least one of the risk habits (smoking, alcohol consumption, and chewing tobacco). The tumor-specific differential expression of BCL2L11 and NR4A3 genes was inversely correlated with disease-free survival (P = 9E-15 and P = 2E-15, respectively; Fig. 5D and E), with absence of risk habits associating more with improved disease-free survival. In our discovery set, we found NR4A3 downregulation in 95% of the tumors, which we validated in 83% and 42% of the tumors in an independent validation set using qPCR and by using data from the TCGA oral tongue cohort (Supplementary Fig. S10). We extended our earlier observations by performing Kaplan–Meier survival analyses on the effects of habits alone, NR4A3 downregulation alone, or their combined effects on survival. We observed that the habits alone do not have a significant impact on patient survival (Fig. 6A, P = 0.494; Fig. 6D, P = 0.283), whereas NR4A3 expression does (Fig. 6B, P = 0.034; Fig. 6E, P = 0.019). We further observed that the combination of no risk habits and NR4A3 expression has a significant effect on survival (Fig. 6C, P = 0.046 and Fig. 6F, P = 0.021). Overall, patients without habits (habits negative) expressed NR4A3 at a lower level compared with those with habits (Fig. 5C).

DMPs predicting various epidemiologic and clinical parameters

Differential methylation (Δβ) for all tumors used as training sets along with five parameters, namely HPV infection, nodal status, risk habits, tumor stage, and prognosis, resulted in best-scoring DMPs corresponding to TSPAN7, FUT3, TRIM5, SLC9A9, NPAS3, RPS6KA2, MAP3K8, TIMM8A, and RNF113A genes. The set of predictive DMPs displayed contrasting patterns of differential methylation among the various categories of each parameter (Fig. 7).

Discussion

Epigenetic changes like DNA methylation play important roles in cancer initiation and progression. One of the first steps in studying DNA methylation is to catalog the loci where methylation changes take place and link those to the disease. Previous studies in head and neck cancer were primarily aimed at identifying 5mC changes in a handful of loci in the genome or using low-throughput methylation arrays. In the current study, we have used >482,000 probes to profile genome-wide changes in 5mC patterns in oral tongue tumors (OTSCC) and defined functional regions of importance by comparing DNA methylation changes with those in gene expression. We term these regions of the genome functional DMRs. We also linked global DNA methylation changes with various clinical and epidemiologic parameters in OTSCC. Our observation of the presence of genome-wide hypomethylated regions alongside promoter hypermethylation (Fig. 2) corroborates a previous study in other cancers (19). Although the role of global hypomethylation is unclear, several animal experiments using DNA methylation inhibitors (39, 40) are indicative of its involvement in oncogenesis and tumor progression. The hypermethylation of CpG islands observed in our study (Fig. 2B–D) was supportive of the theory that CpGs located within islands in normal cells are relatively unmethylated, whereas those located outside of the islands are more methylated, with the reverse pattern occurring in tumor cells (41). We observed a shift of the global hypermethylation in TSS1500 toward a relative hypomethylation, from early to advanced stages of the disease (Fig. 2D), a trend also found in hepatocellular carcinoma (42). We found some uniquely differentially methylated loci in our study, different from other studies so far, a possible indication of specific epigenetic marks linked with our cohort and patient geography, as the epigenome is known to be dynamic and highly dependent on environmental and dietary factors (43).

In our study, we define the importance of functional and differentially methylated regions or DMRs in tumors based on


two factors. First, we identified genes that were encompassed in DMRs by correlating differential methylation in a number of probes within specific genomic loci, especially in the CpG islands, promoter, and TSS1500, with the differential expression in these genes, between the same tumor:matched normal pairs, and estimating the significance of the association (2-tailed t test, P < 0.05) between those two events (Fig. 3). Second, we identified the entire stretch of DMPs and assigned importance to all the genes in that region. It is vital to assign function to the entire DMR, rather than a single probe or a gene housed within it, as the probe hybridization patterns vary widely, even within a given subregion, and methylation changes are not necessarily localized single-gene events. Both these criteria led us to identify functional DMRs. A genome-wide representation of such functional DMRs is shown in Supplementary Fig. S11. We found that methylation changes at certain regions do not go hand-in-hand with gene expression changes in the expected direction (i.e., some hyper- and hypomethylated loci were associated with up- and downregulation of gene expression, respectively; Supplementary Fig. S4). Even though such phenomena were previously reported (44), their significance is currently not known.

Our analyses using an ensemble machine-learning algorithm followed by estimation of errors picked several genes predictive of clinical and epidemiologic factors (Fig. 7). Several of those have been shown in the past to have a role in cancer. For example, TSPAN7, a tetraspanin family member, is upregulated in multiple myeloma patients (45) and in human epithelial malignancies (46); FUT3 has a role in gastric cancer (47); SLC9A9 is overexpressed in most cancers (http://www.proteinatlas.org/ENSG00000181804-SLC9A9/cancer); NPAS3, a neuronal PAS transcription factor family member, acts as a tumor suppressor in astrocytoma progression (48); RPS6KA2, a ribosomal kinase, is a putative tumor suppressor in sporadic epithelial ovarian cancer (49) and a modifier of the EGFR signaling pathway in pancreatic cancer (50); and MAP3K8 is a proto-oncogene that plays a role in lung (51), prostate (52), and colorectal (53) cancers.

We identified a signature containing differentially methylated loci encompassing 16 genes using a machine-learning algorithm with three different training sets (Fig. 4A) that was further validated using data from the TCGA cohort (Fig. 4B). Although the machine-learning algorithm identified DMPs, we could identify the functional DMRs encompassing these DMPs (Fig. 5A and Supplementary Fig. S12). Many of the genes within those DMRs are cancer-associated genes. Among those, GPER1 (G protein–coupled estrogen receptor 1) is known to act as a tumor suppressor in ovarian cancer (54) and has low expression in breast cancer (55), RHPN1 (rhophilin, Rho GTPase-binding protein 1) is reported to be highly expressed in most cancers (http://www.proteinatlas.org/ENSG00000158106-RHPN1/cancer), and CpG islands adjacent to the RHPN1 gene are reported to be linked with survival of colorectal cancer patients (56). In addition, the TTLL8 (tubulin tyrosine ligase-like family, member 8) gene is differentially methylated in diseases such as myelodysplastic syndrome (57), and OR2T6 shows copy number gains in many cancers, including HNSCC (http://ow.ly/RniAf).

MIR10B has been implicated in the past in multiple cancers, including in head and neck (58–62). Unlike in some cancers, its role in HNSCC is unclear (63). We found a large functional hypermethylated DMR around miR-10B spanning several megabases (Fig. 5A and Supplementary Fig. S7). A similar effect in MIR10B has been reported previously in gastric cancers (64). We found downregulation in one of the MIR10B target genes, NR4A3, in a significant number of oral tongue tumors both in our array data and also in a smaller independent validation set of tumors (Fig. 5C and Supplementary Fig. S8). NR4A3 (NOR1) is an experimentally validated target of MIR10B (http://mirtarbase.mbc.nctu.edu.tw/), is an orphan nuclear receptor, a member of the nuclear receptor superfamily, and is one of the primary classes of therapeutic drug targets (65, 66). The gene is epigenetically silenced in gastric cancers and promotes migration and invasion in gastric cancer (67). Although data from the TCGA HNSCC cohort displayed a hypermethylation trend for the DMP in 95% of the tumors and associated functional DMR in all the tumors (Fig. 4B), we could not test the downregulation of its target genes, NR4A3 and BCL2L11, as gene expression data from the same sets of tumors were not available from the TCGA cohort. BCL2L11, another MIR10B target gene, is a potential downstream target of NR4A. We found that hypermethylation in miR-10B is moderately associated with the downregulation of BCL2L11 (Fig. 5B). It is possible that there are multiple mechanisms through which the expression of NR4A3 and BCL2L11 is modulated in OTSCC, and hypermethylation of miR-10B is one such path. Low expression of NR4A1, another member of the nuclear receptor NR4A family, is linked with poor overall survival in aggressive lymphomas (65), and NR4A1 and NR4A3 double-knockout mice rapidly develop acute myeloid leukemia (65). Interestingly, NR4A3 was found to act as an oncogene in neuroblastoma, where its overexpression lowered the survival rate in patients (68). Like in neuroblastoma, it is possible that in solid tumors like HNSCC, NR4A3 acts as an oncogene. However, our data do not provide mechanistic details, and future functional biology experiments are needed to precisely define the role(s) and mechanism of action of NR4A3 in OTSCC. We found tumor-specific differential expression of BCL2L11 and NR4A3 genes to be inversely correlated with disease-free survival, especially in habits negative patients, from our array data. Downregulation of NR4A3, and hence BCL2L11, may explain such an association with improved disease-free survival. Despite the fact that the hypermethylation of MIR10B and downregulation of two of its target genes were validated in an independent validation set, due to the small sample size (N = 7 for habits positive, N = 4 for habits negative, N = 1 habits not known) and therefore lack of statistical power, it was not possible to

Figure 5. Differential methylation of MIR10B, linking it to the expression of its downstream target genes and disease-free survival. A, median differential methylation of chromosomal regions flanking MIR10B gene. The actual scatter is in gray, exponential smoothing of the scatter in red, and MIR10B is highlighted in blue. B and C, correlation between methylation (MIR10B) and expression (NR4A3 and BCL2L11). FC, fold change. D and E, correlation of NR4A3 and BCL2L11 expression with disease-free survival. Green filled circles, patients without any risk habits; red filled circles, patients with at least one risk habit (smoking, chewing tobacco, or alcohol).
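The methylation versus expression relationships shown in panels B and C rest on a correlation coefficient and an associated t test. As a minimal sketch, assuming paired per-tumor values of differential methylation and log fold change, the Pearson coefficient and its t statistic can be computed as below. The function names and the data values are hypothetical and are not taken from the study; converting the t statistic to a two-tailed P value would additionally need a t distribution with n - 2 degrees of freedom, which is left out here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length with at least 3 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


if __name__ == "__main__":
    # Hypothetical paired values: methylation change and expression change.
    methylation_change = [1, 2, 3, 4, 5]
    expression_change = [5, 3, 4, 2, 1]
    r = pearson_r(methylation_change, expression_change)
    print(r, t_statistic(r, len(methylation_change)))
```

A negative coefficient, as in this toy input, corresponds to the inverse relationship described in the text: more methylation, less expression.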


Figure 6. Effect of NR4A3 expression on survival probability distributions. A and D, Kaplan–Meier survival probability curves for habits negative (habits -ve) versus habits positive (habits +ve). B and E, NR4A3 downregulated (NR4A3-Down) versus NR4A3 upregulated (NR4A3-Up). C and F, habits negative or NR4A3 downregulated versus habits positive and NR4A3 upregulated. A–C, discovery and validation sets; D–F, discovery, validation, and TCGA oral tongue datasets combined.
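For readers unfamiliar with the curves in this figure, the Kaplan–Meier estimate multiplies, at each observed event time, the fraction of at-risk patients who did not have the event. The sketch below is a minimal plain-Python illustration of that product-limit formula, not the software used in the study; the function name and the follow-up data are hypothetical, and group comparison (for example a log-rank test) is not covered.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up time for each patient.
    events: 1 if the event was observed at that time, 0 if censored.
    Returns a list of (event_time, survival_probability) pairs.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    survival = 1.0
    curve = []
    # Only times at which at least one event occurred change the estimate.
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        observed = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1.0 - observed / at_risk
        curve.append((t, survival))
    return curve


if __name__ == "__main__":
    # Hypothetical follow-up (months) and event indicators for five patients.
    for time, prob in kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]):
        print(time, round(prob, 3))
```

Censored patients stay in the at-risk count until their last follow-up, which is why the curve steps down only at event times.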

associate survival with the expression of these genes linked with MIR10B hypermethylation. Therefore, we combined the discovery and validation sets to accurately ascertain such a relationship.

We found that NR4A3 was downregulated in a large fraction of tumors, both in the discovery set as well as an independent validation cohort assayed using qPCR and in TCGA sample cohort (Supplementary Fig. S10) and that the downregulation of this gene was significantly linked with better survival (Fig. 6). Kaplan–Meier survival analyses revealed that the downregulation of NR4A3 expression alone (Fig. 6B and E), but not the absence of habits (Fig. 6A and D), is sufficient for providing


Figure 7. Discovery of differentially methylated loci linked with clinical and epidemiologic parameters. A–E, differential methylation profiles of HPV, habits, node, prognosis, and stage, respectively, predicting DMPs and DMRs from random forest analyses. Blue circles indicate hypomethylation and orange circles indicate hypermethylation, with the sizes indicative of the magnitude of methylation changes (Δβ).
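The hypo- and hypermethylation calls encoded by the circle colors can be thought of as a sign and magnitude rule applied to Δβ. The following sketch assumes Δβ is the tumor β value minus the matched control β value for the same probe and uses a cutoff of 0.2; both the cutoff and the probe names are hypothetical choices for illustration, since the thresholds actually used are described in the Materials and Methods rather than here.

```python
def delta_beta(tumor_beta: float, control_beta: float) -> float:
    """Differential methylation for one probe in one tumor:control pair."""
    return tumor_beta - control_beta


def classify(db: float, cutoff: float = 0.2) -> str:
    """Label a delta-beta value; the 0.2 cutoff is an illustrative assumption."""
    if db >= cutoff:
        return "hypermethylated"
    if db <= -cutoff:
        return "hypomethylated"
    return "unchanged"


if __name__ == "__main__":
    # Hypothetical beta values (tumor, matched control) for three probes.
    pairs = {"probe_A": (0.80, 0.30), "probe_B": (0.10, 0.60), "probe_C": (0.45, 0.40)}
    for probe, (tumor, control) in pairs.items():
        db = delta_beta(tumor, control)
        print(probe, round(db, 2), classify(db))
```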

survival advantage to the patients. It is possible that habits upregulate NR4A3 expression (as most habits positive patients have lowered extent of NR4A3 downregulation; Fig. 5C), thereby working as a negative factor for survival (Fig. 6C and F). However, future work is needed to define mechanistic details of the mode of action and the implications of the key genes like NR4A3 in OTSCC biology.

In summary, we have identified a set of differential methylation signatures with functional relevance in oral tongue tumors. The association between the differentially


methylated signatures with clinical and epidemiologic parameters points to their potential application in diagnosis and disease stratification. In addition, our data on NR4A3 present potential avenues of exploiting the molecule as a target for therapeutic intervention in head and neck cancer.

Disclosure of Potential Conflicts of Interest

No potential conflicts of interest were disclosed.

Authors' Contributions

Conception and design: N.M. Krishnan, B. Panda
Development of methodology: N.M. Krishnan, B. Panda
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): K. Dhas, J. Nair, V. Palve, J. Bagwan, G. Siddappa, V.D. Kekatpure, M.A. Kuriakose
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): N.M. Krishnan, B. Panda
Writing, review, and/or revision of the manuscript: N.M. Krishnan, A. Suresh, M.A. Kuriakose, B. Panda
Study supervision: B. Panda
Other (data visualization and presentation): N.M. Krishnan
Other (patient cohort selection, follow up): A. Suresh

Grant Support

Research presented in this article is funded by the Department of Electronics and Information Technology, Government of India [ref nr.: 18(4)/2010-E-Infra., 31-03-2010] and the Department of IT, BT and ST, Government of Karnataka, India (ref nr.: 3451-00-090-2-22).

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received September 18, 2015; revised April 7, 2016; accepted April 24, 2016; published OnlineFirst June 10, 2016.

References

1. Ferlay J, Shin HR, Bray F, Forman D, Mathers C, Parkin DM. Estimates of worldwide burden of cancer in 2008: GLOBOCAN 2008. Int J Cancer 2010;127:2893–917.
2. Mishra A, Meherotra R. Head and neck cancer: global burden and regional trends in India. Asian Pac J Cancer Prev 2014;15:537–50.
3. Llewellyn CD, Johnson NW, Warnakulasuriya KA. Risk factors for squamous cell carcinoma of the oral cavity in young people–a comprehensive literature review. Oral Oncol 2001;37:401–18.
4. Kuriakose M, Sankaranarayanan M, Nair MK, Cherian T, Sugar AW, Scully C, et al. Comparison of oral squamous cell carcinoma in younger and older patients in India. Eur J Cancer B Oral Oncol 1992;28B:113–20.
5. Pathak KA, Das AK, Agarwal R, Talole S, Deshpande MS, Chaturvedi P, et al. Selective neck dissection (I-III) for node negative and node positive necks. Oral Oncol 2006;42:837–41.
6. Fakhry C, Zhang Q, Nguyen-Tan PF, Rosenthal D, El-Naggar A, Garden AS, et al. Human papillomavirus and overall survival after progression of oropharyngeal squamous cell carcinoma. J Clin Oncol 2014;32:3365–73.
7. Agrawal N, Frederick MJ, Pickering CR, Bettegowda C, Chang K, Li RJ, et al. Exome sequencing of head and neck squamous cell carcinoma reveals inactivating mutations in NOTCH1. Science 2011;333:1154–7.
8. Stransky N, Egloff AM, Tward AD, Kostic AD, Cibulskis K, Sivachenko A, et al. The mutational landscape of head and neck squamous cell carcinoma. Science 2011;333:1157–60.
9. Pickering CR, Zhang J, Yoo SY, Bengtsson L, Moorthy S, Neskey DM, et al. Integrative genomic characterization of oral squamous cell carcinoma identifies frequent somatic drivers. Cancer Discov 2013;3:770–81.
10. India Project Team of the International Cancer Genome Consortium. Mutational landscape of gingivo-buccal oral squamous cell carcinoma reveals new recurrently-mutated genes and molecular subgroups. Nat Commun 2013;4:2873.
11. The Cancer Genome Atlas Network. Comprehensive genomic characterization of head and neck squamous cell carcinomas. Nature 2015;517:576–82.
12. Krishnan NM, Gupta S, Palve V, Varghese L, Pattnaik S, Jain P, et al. Integrated analysis of oral tongue squamous cell carcinoma identifies key variants and pathways linked to risk habits, HPV, clinical parameters and tumor recurrence. F1000Res 2015;4:1215.
13. Lechner M, Fenton T, West J, Wilson G, Feber A, Henderson S, et al. Identification and functional validation of HPV-mediated hypermethylation in head and neck squamous cell carcinoma. Genome Med 2013;5:15.
14. Wilson GA, Lechner M, Koferle A, Caren H, Butcher LM, Feber A, et al. Integrated virus-host methylome analysis in head and neck squamous cell carcinoma. Epigenetics 2013;8:953–61.
15. Herceg Z, Hainaut P. Genetic and epigenetic alterations as biomarkers for cancer detection, diagnosis and prognosis. Mol Oncol 2007;1:26–41.
16. Dawson MA, Kouzarides T. Cancer epigenetics: from mechanism to therapy. Cell 2012;150:12–27.
17. Goldberg AD, Allis CD, Bernstein E. Epigenetics: a landscape takes shape. Cell 2007;128:635–8.
18. Bird A. Perceptions of epigenetics. Nature 2007;447:396–8.
19. Ehrlich M. DNA hypomethylation in cancer cells. Epigenomics 2009;1:239–59.
20. Yeh KT, Chang JG, Lin TH, Wang YF, Tien N, Chang JY, et al. Epigenetic changes of tumor suppressor genes, P15, P16, VHL and P53 in oral cancer. Oncol Rep 2003;10:659–63.
21. Gasche JA, Goel A. Epigenetic mechanisms in oral carcinogenesis. Future Oncol 2012;8:1407–25.
22. Mascolo M, Siano M, Ilardi G, Russo D, Merolla F, De Rosa G, et al. Epigenetic disregulation in oral cancer. Int J Mol Med Sci 2012;13:2331–53.
23. Shaw RJ, Hobkirk AJ, Nikolaidis G, Woolgar JA, Triantafyllou A, Brown JS, et al. Molecular staging of surgical margins in oral squamous cell carcinoma using promoter methylation of p16(INK4A), cytoglobin, E-cadherin, and TMEFF2. Ann Surg Oncol 2013;20:2796–802.
24. Breiman L. Random Forests. Dordrecht, The Netherlands: Kluwer Academic Publishers; 2001.
25. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods 2012;9:357–9.
26. Pidsley R, CC YW, Volta M, Lunnon K, Mill J, Schalkwyk LC. A data-driven approach to preprocessing Illumina 450K methylation array data. BMC Genomics 2013;14:293.
27. Aryee MJ, Jaffe AE, Corrada-Bravo H, Ladd-Acosta C, Feinberg AP, Hansen KD, et al. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics 2014;30:1363–9.
28. Johnson WE, Li C, Rabinovic A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 2007;8:118–27.
29. Fortin JP, Labbe A, Lemire M, Zanke BW, Hudson TJ, Fertig EJ, et al. Functional normalization of 450k methylation array data improves replication in large cancer studies. Genome Biol 2014;15:503.
30. Du P, Kibbe WA, Lin SM. lumi: a pipeline for processing Illumina microarray. Bioinformatics 2008;24:1547–8.
31. Chow ML, Winn ME, Li HR, April C, Wynshaw-Boris A, Fan JB, et al. Preprocessing and quality control strategies for illumina DASL assay-based brain gene expression studies with semi-degraded samples. Front Genet 2012;3:11.
32. Krishnan NM, Gupta S, Khyriem C, Palve V, Suresh A, Siddappa G, et al. Integrated analysis links TP53, NOTCH, SLC38A and 11p with survival in patients with oral tongue squamous cell carcinoma. bioRxiv 2015;doi: http://dx.doi.org/10.1101/033829.
33. Diaz-Uriarte R. GeneSrF and varSelRF: a web-based tool and R package for gene selection and classification using random forest. BMC Bioinformatics 2007;8:328.


34. Tibshirani R, Efron B. Improvements on cross-validation: the .632+ bootstrap method. J Am Stat Assoc 1997;92:13.
35. Boulesteix A-L, Janitza S, Kruppa J, König IR. Overview of random forest methodology and practical guidance with emphasis on computational biology and bioinformatics. WIREs Data Mining Knowl Discov 2012;2:493–507.
36. Boulesteix AL, Janitza S, Hapfelmeier A, Van Steen K, Strobl C. Letter to the editor: on the term `interaction' 647 and related phrases in the literature on Random Forests. Brief Bioinform 2015;16:338–45.
37. Efron B. Estimating the error rate of a prediction rule. J Am Stat Assoc 1983;78:316–331.
38. Schmittgen TD, Livak KJ. Analyzing real-time PCR data by the comparative CT method. Nat Protocols 2008;3:1101–08.
39. Carr BI, Reilly JG, Smith SS, Winberg C, Riggs A. The tumorigenicity of 5-azacytidine in the male Fischer rat. Carcinogenesis 1984;5:1583–90.
40. Denda A, Rao PM, Rajalakshmi S, Sarma DS. 5-azacytidine potentiates initiation induced by carcinogens in rat liver. Carcinogenesis 1985;6:145–6.
41. Baylin SB, Herman JG. DNA hypermethylation in tumorigenesis: epigenetics joins genetics. Trends Genet 2000;16:168–74.
42. Lin CH, Hsieh SY, Sheen IS, Lee WC, Chen TC, Shyu WC, et al. Genome-wide hypomethylation in hepatocellular carcinogenesis. Cancer Res 2001;61:4238–43.
43. Hardy TM, Tollefsbol TO. Epigenetic diet: impact on the epigenome and cancer. Epigenomics 2011;3:503–18.
44. Pogribny IP, Rusyn I. Role of epigenetic aberrations in the development and progression of human hepatocellular carcinoma. Cancer Lett 2014;342:223–30.
45. Cheong CM, Chow AW, Fitter S, Hewett DR, Martin SK, Williams SA, et al. Tetraspanin 7 (TSPAN7) expression is upregulated in multiple myeloma patients and inhibits myeloma tumour development in vivo. Exp Cell Res 2015;332:24–38.
46. Romanska HM, Berditchevski F. Tetraspanins in human epithelial malignancies. J Pathol 2011;223:4–14.
47. Serpa J, Mesquita P, Mendes N, Oliveira C, Almeida R, Santos-Silva F, et al. Expression of Lea in gastric cancer cell lines depends on FUT3 expression regulated by promoter methylation. Cancer Lett 2006;242:191–7.
48. Moreira F, Kiehl TR, So K, Ajeawung NF, Honculada C, Gould P, et al. NPAS3 demonstrates features of a tumor suppressive role in driving the progression of astrocytomas. Am J Pathol 2011;179:462–76.
49. Bignone PA, Lee KY, Liu Y, Emilion G, Finch J, Soosay AE, et al. RPS6KA2, a putative tumour suppressor gene at 6q27 in sporadic epithelial ovarian cancer. Oncogene 2007;26:683–700.
50. Milosevic N, Kuhnemuth B, Muhlberg L, Ripka S, Griesmann H, Lolkes C, et al. Synthetic lethality screen identifies RPS6KA2 as modifier of epidermal growth factor receptor activity in pancreatic cancer. Neoplasia 2013;15:1354–62.
51. Clark AM, Reynolds SH, Anderson M, Wiest JS. Mutational activation of the MAP3K8 protooncogene in lung cancer. Genes Chromosomes Cancer 2004;41:99–108.
52. Jeong JH, Bhatia A, Toth Z, Oh S, Inn KS, Liao CP, et al. TPL2/COT/MAP3K8 (TPL2) activation promotes androgen depletion-independent (ADI) prostate cancer growth. PLoS One 2011;6:e16205.
53. Tunca B, Tezcan G, Cecener G, Egeli U, Zorluoglu A, Yilmazlar T, et al. Overexpression of CK20, MAP3K8 and EIF5A correlates with poor prognosis in early-onset colorectal cancer patients. J Cancer Res Clin Oncol 2013;139:691–702.
54. Ignatov T, Modl S, Thulig M, Weissenborn C, Treeck O, Ortmann O, et al. GPER-1 acts as a tumor suppressor in ovarian cancer. J Ovarian Res 2013;6:51.
55. Ignatov T, Weissenborn C, Poehlmann A, Lemke A, Semczuk A, Roessner A, et al. GPER-1 expression decreases during breast cancer tumorigenesis. Cancer Invest 2013;31:309–15.
56. Beggs AD, Dilworth MP, Domingo E, Midgley R, Kerr D, Tomlinson IP, et al. Methylation changes in the TFAP2E promoter region are associated with BRAF mutation and poorer overall & disease free survival in colorectal cancer. Oncoscience 2015;2:508–16.
57. Maegawa S, Gough SM, Watanabe-Okochi N, Lu Y, Zhang N, Castoro RJ, et al. Age-related epigenetic drift in the pathogenesis of MDS and AML. Genome Res 2014;24:580–91.
58. Lu YC, Chen YJ, Wang HM, Tsai CY, Chen WH, Huang YC, et al. Oncogenic function and early detection potential of miRNA-10b in oral cancer as identified by microRNA profiling. Cancer Prev Res 2012;5:665–74.
59. Tian Y, Luo A, Cai Y, Su Q, Ding F, Chen H, et al. MicroRNA-10b promotes migration and invasion through KLF4 in human esophageal cancer cell lines. J Biol Chem 2010;285:7986–94.
60. Ma L, Teruya-Feldstein J, Weinberg RA. Tumour invasion and metastasis initiated by microRNA-10b in breast cancer. Nature 2007;449:682–8.
61. Bourguignon LY, Wong G, Earle C, Krueger K, Spevak CC. Hyaluronan-CD44 interaction promotes c-Src-mediated twist signaling, microRNA-10b expression, and RhoA/RhoC up-regulation, leading to Rho-kinase-associated cytoskeleton activation and breast tumor cell invasion. J Biol Chem 2010;285:36721–35.
62. Jiang J, Gusev Y, Aderca I, Mettler TA, Nagorney DM, Brackett DJ, et al. Association of MicroRNA expression in hepatocellular carcinomas with hepatitis infection, cirrhosis, and patient survival. Clin Cancer Res 2008;14:419–27.
63. Severino P, Bruggemann H, Andreghetto FM, Camps C, Klingbeil Mde F, de Pereira WO, et al. MicroRNA expression profile in head and neck cancer: HOX-cluster embedded microRNA-196a and microRNA-10b dysregulation implicated in cell proliferation. BMC Cancer 2013;13:533.
64. Li Z, Lei H, Luo M, Wang Y, Dong L, Ma Y, et al. DNA methylation downregulated mir-10b acts as a tumor suppressor in gastric cancer. Gastric Cancer 2015;18:43–54.
65. Deutsch AJ, Angerer H, Fuchs TE, Neumeister P. The nuclear orphan receptors NR4A as therapeutic target in cancer therapy. Anticancer Agents Med Chem 2012;12:1001–14.
66. Kojetin DJ, Burris TP. REV-ERB and ROR nuclear receptors as drug targets. Nat Rev Drug Discov 2014;13:197–216.
67. Wang YY, Li L, Ye ZY, Zhao ZS, Yan ZL. MicroRNA-10b promotes migration and invasion through Hoxd10 in human gastric cancer. World J Surg Oncol 2015;13:259.
68. Uekusa S, Kawashima H, Sugito K, Yoshizawa S, Shinojima Y, Igarashi J, et al. Nr4a3, a possibile oncogenic factor for neuroblastoma associated with CpGi methylation within the third exon. Int J Oncol 2014;44:1669–77.


A Minimal DNA Methylation Signature in Oral Tongue Squamous Cell Carcinoma Links Altered Methylation with Tumor Attributes

Neeraja M. Krishnan, Kunal Dhas, Jayalakshmi Nair, et al.

Mol Cancer Res 2016;14:805-819. Published OnlineFirst June 10, 2016.

Updated version Access the most recent version of this article at: doi:10.1158/1541-7786.MCR-15-0395

Supplementary Access the most recent supplemental material at: Material http://mcr.aacrjournals.org/content/suppl/2016/06/25/1541-7786.MCR-15-0395.DC1

Cited articles This article cites 66 articles, 10 of which you can access for free at: http://mcr.aacrjournals.org/content/14/9/805.full#ref-list-1
