DNA Methylation Biomarkers for Hepatocellular Carcinoma Guorun Fan1†, Yaqin Tu1†, Cai Chen3, Haiying Sun1, Chidan Wan2* and Xiong Cai2*
Total Page:16
File Type:pdf, Size:1020Kb
Fan et al. Cancer Cell Int (2018) 18:140 https://doi.org/10.1186/s12935-018-0629-5 Cancer Cell International PRIMARY RESEARCH Open Access DNA methylation biomarkers for hepatocellular carcinoma Guorun Fan1†, Yaqin Tu1†, Cai Chen3, Haiying Sun1, Chidan Wan2* and Xiong Cai2* Abstract Background: Aberrant methylation of DNA is a key driver of hepatocellular carcinoma (HCC). In this study, we sought to integrate four cohorts profle datasets to identify such abnormally methylated genes and pathways associated with HCC. Methods: To this end, we downloaded microarray datasets examining gene expression (GSE84402, GSE46408) and gene methylation (GSE73003, GSE57956) from the GEO database. Abnormally methylated diferentially expressed genes (DEGs) were sorted and pathways were analyzed. The String database was then used to perform enrichment and functional analysis of identifed pathways and genes. Cytoscape software was used to create a protein–protein interaction network, and MCODE was used for module analysis. Finally, overall survival analysis of hub genes was performed by the OncoLnc online tool. Results: In total, we identifed 19 hypomethylated highly expressed genes and 14 hypermethylated lowly expressed genes at the screening step, and fnally found six mostly changed hub genes including MAD2L1, CDC20, CCNB1, CCND1, AR and ESR1. Pathway analysis showed that aberrantly methylated-DEGs mainly associated with the cell cycle process, p53 signaling, and MAPK signaling in HCC. After validation in TCGA database, the methylation and expres- sion status of hub genes was signifcantly altered and same with our results. Patients with high expression of MAD2L1, CDC20 and CCNB1 and low expression of CCND1, AR, and ESR1 was associated with shorter overall survival. Conclusions: Taken together, we have identifed novel aberrantly methylated genes and pathways linked to HCC, potentially ofering novel insights into the molecular mechanisms governing HCC progression and serving as novel biomarkers for precision diagnosis and disease treatment. Keywords: Hepatocellular carcinoma, Methylation, Hub genes Background with hepatitis C virus (HCV) [3, 4]. Other risk factors Hepatocellular carcinoma (HCC), an infammation- for developing HCC include exposure to afatoxin, alco- driven disease, is the third deadliest cancer worldwide, hol intake, smoking, and diabetes [5]. Te best curative and HCC prevalence is predicted to continue to rise in treatments to date in early stage HCC patients involve coming years, serving as a major economic burden [1, surgical resection, tumor ablation, and potentially liver 2]. Most cases of HCC occur in developing countries, transplantation [6, 7]. However, the prognosis after cura- such as China, and the leading cause of HCC is infection tive therapy for HCC remains unsatisfactory because with hepatitis B virus (HBV); in contrast, the main cause of a high postoperative recurrence rate. An improved in developed countries, such as the USA, is infection understanding of the basic biology of HCC is needed to enhance prognostic predictions and to enhance thera- peutic efcacy against this deadly disease. *Correspondence: [email protected]; [email protected] Te term epigenetics refers to heritable gene expression †Guorun Fan and Yaqin Tu contributed equally to this work 2 Department of Hepatobiliary Surgery, Union Hospital, Tongji Medical alterations no associated with DNA sequence changes College, Huazhong University of Science and Technology, Wuhan 430022, [8]. Te DNA methylation is closely related to embry- China onic development [9], regulation of gene expression [10], Full list of author information is available at the end of the article © The Author(s) 2018. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/ publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. Fan et al. Cancer Cell Int (2018) 18:140 Page 2 of 13 X-chromosome inactivation [11], genomic imprinting threshold set for up and down-regulated genes was a | [12], and genomic stability [13]. Altered DNA methyla- log FC | ≥ 2 and P ≤ 0.05. For gene methylation profling tion such as tumor suppressor gene hypermethylation dataset (GSE73003 and GSE57956), the GEO2R software or oncogene hypomethylation is thought to promote was used to analyze the raw data and to identify difer- tumorigenesis. Genes including P15, P16, Ras asso- entially methylated genes (DMGs). GEO2R is an inter- ciation domain family 1 isoform A (RASSF1A), and Ret- active web tool which allows users to compare diferent inoblastoma 1 are inactivated in HCC due to promoter groups of samples in a GEO series to screen genes that hypermethylation of these genes [14–17]. Given that are diferentially expressed in experimental conditions. methylation is potentially reversible, detection of such Te adjusted P-values (adj. P) and Benjamini and Hoch- aberrant DNA methylation of tumor suppressors and berg false discovery rate were applied to provide a bal- oncogenes in HCC could be useful as a therapeutic ance between discovery of statistically signifcant genes target. and limitations of false-positives. Te DMGs cutof crite- While altered methylation of many genes has been ria were P < 0.05 and |t| > 2. Additionally, one-sided tests demonstrated to date in the context of HCC, a com- were used to categorize the upregulated or downregu- plete interaction network documenting the relation- lated DMGs. Finally, a Venn diagram was used to identify ship between said genes remains to be produced. Te hypomethylated highly-expressed genes and hypermeth- comprehensive analysis of multiple datasets ofers the ylated lowly-expressed genes. power needed to properly identify and assess pertinent pathways and genes mediating the biological processes Functional and pathway enrichment analysis associated with HCC. To this end, we used datasets from microarrays examining gene expression (GSE84402, Gene ontology (GO) analyses and Kyoto Encyclopedia GSE46408) and gene methylation (GSE73003, GSE57956) of Genes and Genomes (KEGG) pathway enrichment to assess HCC gene and epigenetic signatures, allowing analyses were conducted on identifed genes with altered for identifcation of genes and pathways that were both methylation/expression using the Search Tool for the abnormally methylated and diferentially expressed. Retrieval of Interacting Genes (STRING, https ://strin Using a protein–protein interaction network we were g-db.org/). Tis allowed for protein–protein interaction also able to identify key so-called “hub” genes central to (PPI) network generation and functional annotation of these signaling events. Trough this analysis, we believe genes of interest. P < 0.05 was the threshold for statistical it is possible to identify novel diferentially methylated signifcance. genes associated with HCC, ofering key insights into the molecular mechanisms governing HCC development and Generation and analysis of a protein–protein interaction progression. (PPI) network In order to interpret the drivers of carcinogenesis in a Methods meaningful manner, a functional PPI analysis was neces- Microarray data sary. Te STRING database allowed us to generate PPI Gene expression profling datasets GSE84402 and networks for both hypomethylated/and highly-expressed GSE46408, and gene methylation profling datasets genes as well as for hypermethylated and lowly-expressed GSE73003 and GSE57956, were downloaded from the genes. A 0.4 interaction score served as the cutof prior to gene expression omnibus (GEO, https ://www.ncbi.nlm. visualization. Modules were then screened using Molec- nih.gov/geo/) [18, 19]. 14 HCC and 14 normal specimens ular Complex Detection (MCODE) in the Cytoscape were obtained in GSE84402, while 6 HCC and 6 normal software package, with MCODE score > 4 and num- samples were obtained in GSE46408. GSE57956 included ber of nodes > 5. Te top 3 Hub genes were selected by a total of 59 primary HCC tumor samples and 59 adja- CytoHubba app in Cytoscape software. To determine the cent normal tissue samples, while GSE73003 included 40 expression pattern of six hub genes in HCC, we used the paired normal and HCC samples from 20 patients. datasets in the Oncomine (https ://www.oncom ine.org) and UALCAN (http://ualca n.path.uab.edu/) database. Data acquisition and processing Oncomine and UALCAN are an online database consist- Raw gene expression profling datasets GSE84402 and ing of previously published and open-access microarray GSE46408 were downloaded from GEO public reposito- data. Te analysis enables multiple comparisons of gene ries. Data processing was performed using robust multi- expression between diferent studies; the signifcance of array average (RMA) in GeneSpring GX 11.5 (Agilent the gene expression across the available studies was also Technologies Pty Ltd), including background adjustment, taken into account [20, 21]. Te results were fltered by normalization and log transformation of the values. Te selecting hepatocellular carcinoma vs. normal tissue. Fan et al. Cancer Cell Int (2018) 18:140 Page 3 of 13 Hub gene validation two sets of genes, a total of 19 hypomethylated,