A Robust Twelve-Gene Signature for Prognosis Prediction Of
Total Page:16
File Type:pdf, Size:1020Kb
Ouyang et al. Cancer Cell Int (2020) 20:207 https://doi.org/10.1186/s12935-020-01294-9 Cancer Cell International PRIMARY RESEARCH Open Access A robust twelve-gene signature for prognosis prediction of hepatocellular carcinoma Guoqing Ouyang1, Bin Yi2, Guangdong Pan1 and Xiang Chen3* Abstract Background: The prognosis of hepatocellular carcinoma (HCC) patients remains poor. Identifying prognostic mark- ers to stratify HCC patients might help to improve their outcomes. Methods: Six gene expression profles (GSE121248, GSE84402, GSE65372, GSE51401, GSE45267 and GSE14520) were obtained for diferentially expressed genes (DEGs) analysis between HCC tissues and non-tumor tissues. To identify the prognostic genes and establish risk score model, univariable Cox regression survival analysis and Lasso-penalized Cox regression analysis were performed based on the integrated DEGs by robust rank aggregation method. Then Kaplan–Meier and time-dependent receiver operating characteristic (ROC) curves were generated to validate the prognostic performance of risk score in training datasets and validation datasets. Multivariable Cox regression analysis was used to identify independent prognostic factors in liver cancer. A prognostic nomogram was constructed based on The Cancer Genome Atlas (TCGA) dataset. Finally, the correlation between DNA methylation and prognosis-related genes was analyzed. Results: A twelve-gene signature including SPP1, KIF20A, HMMR, TPX2, LAPTM4B, TTK, MAGEA6, ANX10, LECT2, CYP2C9, RDH16 and LCAT was identifed, and risk score was calculated by corresponding coefcients. The risk score model showed a strong diagnosis performance to distinguish HCC from normal samples. The HCC patients were stratifed into high-risk and low-risk group based on the cutof value of risk score. The Kaplan–Meier survival curves revealed signifcantly favorable overall survival in groups with lower risk score (P < 0.0001). Time-dependent ROC anal- ysis showed well prognostic performance of the twelve-gene signature, which was comparable or superior to AJCC stage at predicting 1-, 3-, and 5-year overall survival. In addition, the twelve-gene signature was independent with other clinical factors and performed better in predicting overall survival after combining with age and AJCC stage by nomogram. Moreover, most of the prognostic twelve genes were negatively correlated with DNA methylation in HCC tissues, which SPP1 and LCAT were identifed as the DNA methylation-driven genes. Conclusions: We identifed a twelve-gene signature as a robust marker with great potential for clinical application in risk stratifcation and overall survival prediction in HCC patients. Keywords: Hepatocellular carcinoma, Overall survival, The Cancer Genome Atlas, Gene signature, Prognosis, DNA methylation Background *Correspondence: [email protected] Hepatocellular carcinoma (HCC) is one of the most 3 Department of General Surgery, The Second Xiangya Hospital, Central common malignant solid tumors and the fourth leading South University, Changsha, China cause of cancer-related deaths worldwide [1]. Tere were Full list of author information is available at the end of the article © The Author(s) 2020. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creat iveco mmons .org/licen ses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creat iveco mmons .org/publi cdoma in/ zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. Ouyang et al. Cancer Cell Int (2020) 20:207 Page 2 of 18 approximately 841,000 new cases of HCC and 782,000 patients. However, additional gene signatures are needed deaths in 2018. Te incidence and mortality continue to for accurate prognosis of HCC because of complexity and increase for both sexes. Surgical resection is the most heterogeneity of this disease. efective treatment to cure early stage HCC, but most In the current study, six sets of diferentially expressed patients are frst diagnosed at an advanced stage which genes (DEGs) from diferent Gene Expression Omni- missed the best time for surgical treatment. Although bus (GEO) datasets were integrated to identify overlap- chemotherapy, radiotherapy, liver transplantation and ping DEGs. Functional annotation assessment with Gene other potentially curative treatment have achieved a vari- Ontology (GO) annotation and Kyoto Encyclopedia of ety of therapeutic efects and prolonged survival period, Genes and Genomes (KEGG) pathway enrichment analy- the prognosis of HCC remains poor due to the high rate ses were conducted with the overlapping DEGs. Uni- of recurrence and intrahepatic spread [2]. Tose high- variable and Lasso-Cox regression analysis were applied risk HCC patients with potentially poor outcomes must to identify overall survival-related DEGs and propose be monitored and adopt timely and efective treatments a prognostic risk score model to stratify HCC patients. to prolong survival and improve quality of life [3]. Tere- Besides, independent prognostic factors of OS were iden- fore, it is an urgent need for efective prediction markers tifed by multivariable Cox survival analysis. We fnally to accurately assess the prognosis of HCC patients. identifed twelve-gene signature as a robust marker with Prognostic models based on parameters including great potential in risk stratifcation and OS prediction in clinical baseline characteristics to molecular biomark- HCC patients. ers for HCC have been constructed in several previous studies [4]. A prognostic score by using positive tumor Methods markers (alpha-fetoprotein (AFP), fucosylated AFP and Study population des-gamma-carboxy prothrombin) showed a useful pre- In present study, we searched and downloaded mRNA dictive prognostic value in HCC patients treated with expression chip data of HCC tissues from the GEO transcatheter arterial chemoembolization [5]. But the database by using the keywords of “hepatocellular car- characteristic of tumor markers, depended on tumor cinoma” and “Homo sapiens”. Six microarray data- burden, limits their value in diagnosing early stage sets (GSE121248, GSE84402, GSE65372, GSE51401, tumors. With the developments in gene chips and high- GSE45267 and GSE14520 (based on the GPL571 plat- throughput sequencing, gene signature based on mRNA form) were obtained for DEGs analysis. Details of the expression levels have shown great potential in predict- GEO datasets used in this study are shown in Table 1. ing HCC prognosis. Te abnormal expression levels of RNA-sequencing data of 371 HCC tissues and 50 normal single gene such as SEC62 [6], SHP-1 [7], RING1 [8], tissues normalized by log2 transformation were acquired AGBL2 [9] have been reported to be independent prog- from Te Cancer Genome Atlas (TCGA) for analyz- nostic factors for HCC patients. Moreover, a risk‐coef- ing the integrated DEGs from the six GEO datasets and fcient model based on a multigene mRNA expression building gene prognostic models. GSE14520 datasets signature has been identifed to be an independent prog- (based on the GPL3921 platform) included 216 HCC nostic factor for overall survival (OS) and could stratify tissues with complete clinical information and mRNA patients into high- and low-risk group with signifcantly expression data for external validation of the prognostic diferent OS [10–12]. Tese gene signatures could be gene signature. After excluding TCGA cases with incom- used for the preclinical and clinical treatment for HCC plete clinical information, 233 HCC patients along with Table 1 Details of the GEO datasets included in this study Datasets References Platform Sample size Application (tumor/control) GSE121248 Hui et al. GPL570 [HG-U133_Plus_2] Afymetrix Human Genome U133 Plus 2.0 Array 107 (70/37) Identifcation of DEGs GSE84402 Cheng et al. GPL570 [HG-U133_Plus_2] Afymetrix Human Genome U133 Plus 2.0 Array 28 (14/14) Identifcation of DEGs GSE65372 Hoshida et al. GPL14951 Illumina HumanHT-12 WG-DASL V4.0 R2 expression beadchip 54 (39/15) Identifcation of DEGs GSE51401 Kong et al. GPL570 [HG-U133_Plus_2] Afymetrix Human Genome U133 Plus 2.0 Array 64 (30/34) Identifcation of DEGs GSE45267 Hsieh et al. GPL570 [HG-U133_Plus_2] Afymetrix Human Genome U133 Plus 2.0 Array 87 (46/41) Identifcation of DEGs GSE14520 Wang et al. GPL571 [HG-U133A_2] Afymetrix Human Genome U133A 2.0 Array 41 (22/19) Identifcation of DEGs GSE14520 Wang et al. GPL3921 [HT_HG-U133A] Afymetrix HT Human Genome U133A Array 445 (225/220) Validation of DEGs Ouyang et al. Cancer Cell Int (2020) 20:207 Page 3 of 18 their complete age, gender, sex, tumor grade, American Independence of the prognostic gene signature from other Joint Committee on Cancer (AJCC) pathologic tumor clinical parameters in TCGA stage, vascular invasion, OS status and time informa- Te risk