He et al. Cancer Cell Int (2021) 21:325
https://doi.org/10.1186/s12935-021-02031-6

Cancer Cell International
PRIMARY RESEARCH, Open Access

The underlying molecular mechanisms and prognostic factors of RNA binding protein in colorectal cancer: a study based on multiple online databases

Qinglian He1, Ziqi Li1, Xue Lei1, Qian Zou1, Haibing Yu2, Yuanlin Ding2, Guangxian Xu3 and Wei Zhu1*

Abstract

Background: RNA binding protein (RBP) is an active factor involved in the occurrence and development of colorectal cancer (CRC). Therefore, the potential mechanism of RBP in CRC needs to be clarified by dry-lab analyses or wet-lab experiments.

Methods: Differentially expressed RBP genes obtained from GEPIA 2 (Gene Expression Profiling Interactive Analysis 2) were subjected to functional enrichment analysis. Then, the alternative splicing (AS) events related to survival were acquired by univariate regression analysis, and the correlation between RBP and AS was analyzed with R software. Online databases were used to analyze the mutation and methylation of RBPs in CRC. Moreover, 5 key RBP signatures were obtained through univariate and multivariate Cox regression analysis and used to establish an RBP prognosis model. Subsequently, the model was verified in another randomized group of TCGA CRC cohorts. Finally, multiple online databases and qRT-PCR analysis were used to further confirm the expression of the 5 RBP signatures in CRC.

Results: Through a comprehensive bioinformatics analysis, it was revealed that RBPs had genetic and epigenetic changes in CRC. We obtained 300 differentially expressed RBPs in CRC samples. The functional analysis suggested that they mainly participated in the spliceosome. Then, a regulatory network of RBPs participating in AS was established, and DDX39B was detected as a potentially essential factor in the regulation of AS in CRC. Our analysis identified 11 differentially expressed RBPs with a mutation frequency higher than 5%.
Furthermore, we found that 10 differentially expressed RBPs had methylation sites related to the prognosis of CRC, and a prognostic model was constructed from the 5 RBP signatures. In another randomized group of TCGA CRC cohorts, the prognostic performance of the 5 RBP signatures was verified.

Conclusion: The potential mechanisms that regulate the aberrant expression of RBPs in the development of CRC were explored, a network that regulates AS was established, and the RBP-related prognosis model was constructed and verified, which could improve the individualized prognosis prediction of CRC.

Keywords: RNA binding protein, Alternative splicing, Colorectal cancer

*Correspondence: [email protected]
1 Department of Pathology, Guangdong Medical University, No.1 Xincheng Road, Dongguan 523808, Guangdong Province, China
Full list of author information is available at the end of the article

© The Author(s) 2021. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Background

Colorectal cancer (CRC) is one of the most fatal primary digestive tract tumors [1, 2]. Despite some improvements in diagnosis and treatment, global mortality remains high [1]. At present, CRC research is focused on the development of tools for early detection, reliable prognosis and predictive biomarkers, as well as new treatments that can overcome drug resistance [3–5]. Gene regulation in eukaryotes is a multi-step process, and new RNA formed after transcription is usually modified, transported, localized and translated. With the emergence of high-throughput technology in genomics and the new viewpoint of genetic and epigenetic mechanisms, research has concentrated on changes at the transcriptional level [6]. However, many studies have shown a lack of significant correlation between transcript and protein levels in cells [7]. These observations suggest that other processes may also play an important role in the cell pool that affects the translation of proteins from their respective transcripts. This paradox can be further explained by identifying post-transcriptional regulatory points, which make a great contribution to the regulation of protein levels. These checkpoints are mainly composed of regulation mediated by non-coding RNAs (microRNAs, circular RNAs and long non-coding RNAs) and RNA-binding proteins (RBPs) [8, 9].

RBPs bind to newly transcribed RNAs throughout cellular processes [10]. The versatility and wide range of RBP targets make them critical post-transcriptional regulatory factors [10]. Therefore, it is necessary to understand the structure and function of these molecules in order to comprehend the many processes that change when these proteins are dysregulated. Many of these dysregulated RBPs have also been shown to contribute to the pathogenesis of cancer [11].

In this study, besides exploring the potential mechanisms regulating the abnormal expression of RBPs, we also found, using an integrated multi-RBP model, that RBPs may affect the prognosis of CRC, which could improve the accuracy of prognosis prediction. We obtained the results of functional analysis of differentially expressed RBPs in CRC from online databases, which prompted the construction of an alternative splicing (AS) network in CRC after acquiring differentially expressed AS events related to prognosis. We hypothesized that gene mutation and DNA methylation were the potential mechanisms regulating the aberrant expression of RBPs. Hence, a number of online databases were used to analyze and verify that mutation and DNA methylation were involved in the regulation of aberrant expression of RBPs. Moreover, univariate and multivariate proportional hazard regression analyses were applied to further screen prognostic RBP genes from The Cancer Genome Atlas (TCGA) CRC cohorts and to establish the optimal risk model, which was verified in a randomized test group. Finally, through the analysis of various publicly available data sets, the expression of the model's RBPs in CRC was further analyzed.

Materials and methods

Acquisition of RBPs

Based on the data reported by Gerstberger et al. in 2014 [12], a complete list of 1542 RBPs was obtained (Additional file 1). This list was used for all the analyses in this study.

Differential expression analysis and functional enrichment analysis

The abnormally expressed genes in CRC samples were collected from the Gene Expression Profiling Interactive Analysis 2 (GEPIA 2) database (http://gepia2.cancer-pku.cn/#index) (conditions set to Dataset: COAD or READ; |Log2FC| Cutoff: 1; q-value Cutoff: 0.05; Differential Methods: LIMMA), of which 300 RBP genes were differentially expressed in CRC (Additional file 2). Using R software, gene names were converted to Entrez IDs with the "org.Hs.eg.db" package. Then the R packages "clusterProfiler", "org.Hs.eg.db", "enrichplot" and "ggplot2" were used to perform gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses and to visualize the results.

AS data analysis

The transcriptome and clinical information of CRC were obtained from the TCGA GDC platform (https://portal.gdc.cancer.gov/; released before October 27, 2019), and the AS data of TCGA CRC samples were taken from the TCGA SpliceSeq platform (https://bioinformatics.mdanderson.org/TCGASpliceSeq/PSIdownload.jsp; released before July 10, 2020). Univariate Cox analysis with R software was used to find AS events related to survival (Additional file 3). The correlation (correlation coefficient R > 0.55, P value < 0.001) between RBP expression and the percent-spliced-in (PSI) value of survival-related AS events was analyzed using the function cor.test() in R software. The network diagram was generated with Cytoscape (version 3.7.1).

Mutation analysis and methylation data analysis of RBP gene

The gene mutation data of 8930 CRC samples were downloaded from the Catalogue of Somatic Mutations in Cancer (COSMIC) website (https://cancer.sanger.ac.uk/cosmic) (Additional file 4), and the gene mutation frequencies of the 1542 RBPs were investigated. The mutations of RBPs with high mutation frequency in CRC were further studied graphically in cBioPortal (http://www.cbioportal.org/). The methylation data of 8930 CRC samples were obtained from the COSMIC website (Additional file 5) to investigate the methylation of abnormally expressed RBP genes in CRC. In addition, we used the TCGA 450 k array data downloaded from UCSC (https://xenabrowser.net/datapages/; released before July 20, 2020) (Additional file 5) to analyze the independent prognostic value of TCGA CRC methylation sites and found that there were prognostic methylation sites in RBPs.

five RBP signatures at the CRC tissue level was searched on The Human Protein Atlas (https://www.proteinatlas.org/).

Cell lines, RNA isolation and qRT-PCR

CRC cells (HCT 116, SW480, SW620, Caco2, RKO) and