Downloaded for the Cancer Genome Atlas (TCGA)
Total Page:16
File Type:pdf, Size:1020Kb
Xu et al. J Transl Med (2019) 17:311 https://doi.org/10.1186/s12967-019-2065-2 Journal of Translational Medicine RESEARCH Open Access Identifcation of key DNA methylation-driven genes in prostate adenocarcinoma: an integrative analysis of TCGA methylation data Ning Xu1†, Yu‑Peng Wu1†, Zhi‑Bin Ke1†, Ying‑Chun Liang1†, Hai Cai1, Wen‑Ting Su2, Xuan Tao3, Shao‑Hao Chen1, Qing‑Shui Zheng1, Yong Wei1* and Xue‑Yi Xue1* Abstract Background: Prostate cancer (PCa) remains the second leading cause of deaths due to cancer in the United States in men. The aim of this study was to perform an integrative epigenetic analysis of prostate adenocarcinoma to explore the epigenetic abnormalities involved in the development and progression of prostate adenocarcinoma. The key DNA methylation‑driven genes were also identifed. Methods: Methylation and RNA‑seq data were downloaded for The Cancer Genome Atlas (TCGA). Methylation and gene expression data from TCGA were incorporated and analyzed using MethylMix package. Methylation data from the Gene Expression Omnibus (GEO) were assessed by R package limma to obtain diferentially methylated genes. Pathway analysis was performed on genes identifed by MethylMix criteria using ConsensusPathDB. Gene Ontology (GO) term enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis were also applied for the identifcation of pathways in which DNA methylation‑driven genes signifcantly enriched. The protein– protein interaction (PPI) network and module analysis in Cytoscape software were used to fnd the hub genes. Two methylation profle (GSE112047 and GSE76938) datasets were utilized to validate screened hub genes. Immunohisto‑ chemistry of these hub genes were evaluated by the Human Protein Atlas. Results: A total of 553 samples in TCGA database, 32 samples in GSE112047 and 136 samples in GSE76938 were included in this study. There were a total of 266 diferentially methylated genes were identifed by MethylMix. Plus, a total of 369 diferentially methylated genes and 594 diferentially methylated genes were identifed by the R package limma in GSE112047 and GSE76938, respectively. GO term enrichment analysis suggested that DNA methylation‑ driven genes signifcantly enriched in oxidation–reduction process, extracellular exosome, electron carrier activity, response to reactive oxygen species, and aldehyde dehydrogenase [NAD(P) ] activity. KEGG pathway analysis found DNA methylation‑driven genes signifcantly enriched in fve pathways including+ drug metabolism—cytochrome P450, phenylalanine metabolism, histidine metabolism, glutathione metabolism, and tyrosine metabolism. The vali‑ dated hub genes were MAOB and RTP4. Conclusions: Methylated hub genes, including MAOB and RTP4, can be regarded as novel biomarkers for accurate PCa diagnosis and treatment. Further studies are needed to draw more attention to the roles of these hub genes in the occurrence and development of PCa. Keywords: Prostate adenocarcinoma, Methylation, TCGA , GEO, Integrative epigenetic analysis *Correspondence: [email protected]; [email protected] †Ning Xu Yu‑Peng Wu Zhi‑Bin Ke and Ying‑Chun Liang contributed equally to this work 1 Department of Urology, The First Afliated Hospital of Fujian Medical University, 20 Chazhong Road, Fuzhou 350005, China Full list of author information is available at the end of the article © The Author(s) 2019. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creat iveco mmons .org/licen ses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/ publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. Xu et al. J Transl Med (2019) 17:311 Page 2 of 15 Background Table 1 Clinical dada of prostate cancer patients Prostate cancer (PCa) remains the second leading cause from TCGA of deaths due to cancer in the United States in men [1]. Characteristic Total Siegel et al. [2] demonstrated that there will have 164,690 Cohort sizea 500 newly diagnosed PCa patients and 29,430 deaths in Mean age, years 61.01 6.823 2018 in the United States. Tus, the diagnosis of PCa in ± T stage early stage is vitally important [3]. Currently, the serum pT2a 13 (2.6%) prostate-specifc antigen (PSA) screening remains the pT2b 10 (2%) primary way for early diagnosis of PCa. However, the pT2c 165 (33%) sensitivity and specifcity of PSA test remains low [4]. pT3a 159 (31.8%) Terefore, it is important to fnd notable biomarkers for pT3b 136 (27.2%) the diagnosis of PCa. pT4 10 (2%) Previous studies [5–7] have shown that DNA meth- Unknown 7 (1.4%) ylation plays a crucial role in the development and pro- N stage gression of prostate cancer. DNA methylation is treated N0 348 (69.6%) as a promising investigative tool for the study of pro- N1 79 (15.8%) gressive prostate cancer because that DNA methylation is a reversible progress [8]. High-throughput screening Unknown 73 (14.6%) has been widely used to identify the DNA methylation a Three clinical data of prostate cancer patients are not available involved in the initiation and progression of the pros- tate cancer [9–11]. Epigenetic modifcations are crucial for diagnosis and prognosis of prostate cancer and prov- the R package limma in GSE112047 and GSE76938, ing additional options for prostate cancer diagnosis and respectively. treatment strategies [12, 13]. In this study, prostate cancer associated DNA methyl- Gene ontology terms analysis of DNA methylation‑driven ation-driven genes between cancerous and normal sam- genes obtained from TCGA database ples were identifed and GO term enrichment analysis, GO term enrichment analysis results varied from GO KEGG pathway analysis, PPI network analysis were also classifcation. As to biological process, the DNA meth- performed respectively. ylation-driven genes enriched in oxidation–reduction process, response to reactive oxygen species, xenobiotic Results catabolic process, bone morphogenesis, hydrogen per- Identifcation of DNA methylation‑driven genes oxide biosynthetic process, response to interferon-beta, Clinical data of prostate cancer patients extracted from negative regulation of ATPase activity, and limb mor- TCGA were demonstrated in Table 1. A total of 266 phogenesis. For cellular component, the DNA methyl- genes were diferentially methylated when comparing ation-driven genes enriched in extracellular exosome. tumor to normal by MethylMix criteria for all 553 sam- For molecular function, the DNA methylation-driven ples. Representative diferential methylation of tumor genes enriched in electron carrier activity, aldehyde samples compared with normal samples was demon- dehydrogenase [NAD(P)+] activity, glutathione per- strated in Fig. 1 and Table 2. Of these genes, 209 genes oxidase activity, structural molecule activity, and glu- (78.57%) were hypermethylated and the remainder of the tathione binding (Fig. 6 and Table 3). 57 genes (21.43%) were hypomethylated. Te entire matrix of methylation values was evaluated. KEGG pathway analysis of DNA methylation‑driven genes In this study, the correlation between gene expression obtained from TCGA and DNA methylation data was calculated and 266 gene KEGG pathway analysis found fve signifcantly expression were found to be negatively correlated with enriched pathways. Seven DNA methylation-driven DNA methylation (Fig. 2). Correlation between genes genes enriched in drug metabolism—cytochrome P450. expression and DNA methylation of top 10 hypermethyl- Four DNA methylation-driven genes enriched in Phe- ated genes (Fig. 3) and top hypomethylated genes (Fig. 4) nylalanine metabolism. Four DNA methylation-driven was also demonstrated. A Heatmap of the methylation genes enriched in Histidine metabolism. Five DNA values of all patients was demonstrated in Fig. 5. methylation-driven genes enriched in Glutathione A total of 369 diferentially methylated genes and metabolism. Four DNA methylation-driven genes 594 diferentially methylated genes were identifed by enriched in Tyrosine metabolism (Fig. 7 and Table 4). Xu et al. J Transl Med (2019) 17:311 Page 3 of 15 Fig. 1 Summary of top hypermethylated and top hypomethylated genes. The horizontal black bar demonstrates the distribution of methylation values in the normal samples (denoted as beta values where higher beta values demonstrate greater methylation). The histogram represents the distribution of methylation in tumor samples Table 2 Top 10 hypomethylated genes and hypermethylated genes in patients with prostate cancer Gene Normal mean Tumor mean Log FC p value Adjust p Cor Cor p value Top 10 hypomethylated genes TAF1D 0.298851771 0.17046481 0.809956148 5.12E 22 1.61E 19 0.515133904 3.62E 35 − − − − − TMEM87A 0.36898334 0.219989007 0.746124244 2.40E 22 7.57E 20 0.415783216 2.80E 22 − − − − − KLK2 0.42230197 0.252663444 0.741058027 2.34E 25 7.36E 23 0.424469057 3.03E 23 − − − − − MARS 0.661117252 0.407190703 0.69920154 2.88E 18 9.08E 16 0.402813221 6.87E 21 − − − − − SLC10A5 0.682378547 0.421046571 ‑0.696592477 1.19E 16 3.76E 14 0.320114016 2.36E 13 − − − − KLK3 0.528659515 0.348234132 0.602281232 2.38E 22 7.50E 20 0.352378214 4.92E 16 − − − − − MPC2 0.847716806 0.573604393 0.563526315 3.24E 22 1.02E 19 0.509873152 2.25E 34 − − − − − ALDH1A3 0.390659331 0.266302408