Identification of Seven Hub Genes as the Novel Biomarkers in Triple-negative Breast Cancer and Breast Cancer Metastasis

Huanxian Wu, Southern Medical University
Huining Lian, Southern Medical University Nanfang Hospital
Qianqing Chen, Southern Medical University
Jinlamao Yang, Southern Medical University Nanfang Hospital
Baofang Ou, Southern Medical University
Dongling Quan, Southern Medical University
Lei Zhou, Southern Medical University
Lin Lv, Southern Medical University
Minfeng Liu, Southern Medical University Nanfang Hospital
Shaoyu Wu ( [email protected] ), Guangdong Provincial Key Laboratory of New Drug Screening, School of Pharmaceutical Science, Southern Medical University, Guangzhou, Guangdong, 510515, PR China. https://orcid.org/0000-0002-1247-5295

Research article

Keywords: Triple-negative breast cancer, Metastasis, Biomarkers, Prognostic signature

Posted Date: October 5th, 2020

DOI: https://doi.org/10.21203/rs.3.rs-73076/v1

License: This work is licensed under a Creative Commons Attribution 4.0 International License.

Abstract

Background: Breast cancer is one of the most common malignant tumors and has the highest morbidity and mortality among women. Compared with the other breast cancer subtypes, triple-negative breast cancer (TNBC) has a higher probability of recurrence and is prone to distant metastasis. This study aimed to reveal the underlying disease mechanisms and to identify more effective biomarkers for TNBC and breast cancer metastasis.

Methods: Gene Ontology and KEGG pathway analyses were used to investigate the roles of overlapping differentially expressed genes (DEGs). Hub genes among these DEGs were determined by protein-protein interaction network analysis and CytoHubba. Oncomine databases were used to verify the clinical relevance of the hub genes. Furthermore, the differences in the expression of these genes between cancer and normal tissues were validated in cell lines, animal models and human tissue.
Results: Seven hub genes, TTK, KIF11, SPAG5, RRM2, BUB1, CDCA8 and CDC25C, were identified that might be associated with TNBC and breast cancer metastasis. These genes were verified to be highly expressed in tumor cells and tumor tissues, and patients with higher expression of these genes had a poorer prognosis.

Conclusions: The seven hub genes are potential biomarkers for the diagnosis and therapy of TNBC and breast cancer metastasis.

Background

Breast cancer, a highly heterogeneous progressive disease, was estimated to account for about a quarter of all female cancers.[1] The survival rate of breast cancer patients is closely correlated with the clinical stage at initial diagnosis.[2] In the past decades, great progress has been made in the study of the molecular mechanisms of breast cancer, making its treatment more personalized;[3] nevertheless, breast cancer is still the major cause of cancer death among females.[1]

Triple-negative breast cancer (TNBC) is a complex and invasive subtype of breast cancer lacking estrogen receptors, progesterone receptors and the aberrant expression of HER2,[4] accounting for 15–20% of breast cancer cases but 25% of deaths.[5] Compared with the other breast cancer subtypes, TNBC has a higher probability of recurrence and a poor 5-year prognosis on account of a higher degree of vascular lymphatic infiltration, strong metastatic ability, a propensity for distant metastasis and a lack of therapeutic targets.[6], [7–9] Statistically, 20–30% of breast cancer patients experience distant metastasis,[10] and ~90% of cancer-related deaths are caused by metastasis.[11] Overall, the 5-year survival rate of patients with breast cancer is about 90%.[12] However, if the tumor has distant metastases, the 5-year survival rate drops to about 25%.[13] TNBC is about 2.5 times more likely than non-TNBC to develop metastasis within 5 years after diagnosis.[14] Worse still, TNBC preferentially metastasizes to the
viscera, whereas other breast cancer subtypes mainly metastasize to bone, and visceral metastasis tends to carry a worse prognosis.[15]

Currently, chemotherapy remains the standard treatment for TNBC at all stages, owing to the lack of approved cellular targets.[15, 16] For recurrent and metastatic TNBC, systemic chemotherapy is the only currently available treatment strategy.[17] However, poor response, toxicity and the occurrence of multidrug resistance limit the application of this approach.[18] Recently, some targeted drugs have been used in clinical studies of TNBC, bringing hope for its treatment, but they also have significant deficiencies. Because of the heterogeneity of TNBC, some patients who do not meet the eligibility requirements cannot benefit from targeted therapy (PARP inhibitors); some targeted drugs yield unsatisfactory response rates in TNBC (TKIs, cetuximab, rapamycin and their combinations with conventional chemotherapies); and compensatory signaling pathway activation leads to the rapid development of resistance to targeted drugs in patients (mTOR inhibitors and PI3K inhibitors).[19–23] Therefore, novel biomarkers are urgently needed to explore targeted therapy for TNBC patients.

To find potential biomarkers and therapeutic targets for TNBC and breast cancer metastasis, we screened and analyzed high-throughput data in public databases. Seven potential biomarkers related to TNBC and breast cancer metastasis were found, which also correlated with the prognosis of breast cancer patients.

Methods

Raw data

The original gene expression profiles were downloaded from the GSE52604 and GSE53752 datasets in the Gene Expression Omnibus (GEO; https://www.ncbi.nlm.nih.gov/geo/) database. Gene expression profiles of 35 breast brain metastasis samples, 10 non-neoplastic brain samples and 10 non-neoplastic breast samples were obtained from GSE52604.
The other dataset, GSE53752, provided information on 51 TNBCs and 25 normal breast tissues.

Identification of differentially expressed genes

Differentially expressed genes (DEGs) were identified in each dataset separately with the R limma package, using adj.p < 0.05 and |logFC| > 1.5 as the selection criteria. The heatmaps and volcano plots of DEGs were built with the R heatmap and ggplot2 packages, respectively. Afterward, the online tool Venny 2.1.0 (https://bioinfogp.cnb.csic.es/tools/venny/index.html) was used to determine the overlapping DEGs in the two datasets. The up-regulated and down-regulated genes were identified separately.

Gene Ontology and KEGG pathway analysis

Gene Ontology (GO) functional enrichment analysis covers cellular component (CC), molecular function (MF), and biological process (BP). The Database for Annotation, Visualization and Integrated Discovery (DAVID) is a biological information database that integrates biological data and analysis tools to provide systematic and comprehensive functional annotation for large sets of genes or proteins.[24] The overlapping DEGs were imported into DAVID for GO and KEGG pathway enrichment analysis. P < 0.05 was used as the selection criterion.

Protein-protein interaction network construction and analysis

Protein interactions play a pivotal role in cancer-related signaling, cell localization, and expression regulation. Therefore, studying the interaction network between proteins helps to detect the core regulatory genes. The Search Tool for the Retrieval of Interacting Genes (STRING),[25] an online tool (https://string-db.org/), collects, evaluates and integrates all common protein-protein interaction resources and supplements them with computational predictions of protein-protein interactions.
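The overlapping DEGs that feed this network come from the threshold-and-intersect step described under the identification of differentially expressed genes. The following is a minimal Python sketch of that step, not the authors' R/limma and Venny workflow: the `DegStat` type, the function names and the placeholder genes with their values are all invented for illustration, and only the cutoffs (adj.p < 0.05, |logFC| > 1.5) come from the text.

```python
from typing import Dict, NamedTuple, Set, Tuple


class DegStat(NamedTuple):
    """Per-gene statistics as a limma-style table would report them (hypothetical type)."""
    log_fc: float  # log fold change, tumor vs. control
    adj_p: float   # multiple-testing-adjusted p-value


def select_degs(
    stats: Dict[str, DegStat],
    adj_p_cutoff: float = 0.05,
    log_fc_cutoff: float = 1.5,
) -> Tuple[Set[str], Set[str]]:
    """Return (up-regulated, down-regulated) genes passing both cutoffs."""
    up: Set[str] = set()
    down: Set[str] = set()
    for gene, s in stats.items():
        # Keep a gene only if adj.p < cutoff AND |logFC| > cutoff.
        if s.adj_p >= adj_p_cutoff or abs(s.log_fc) <= log_fc_cutoff:
            continue
        (up if s.log_fc > 0 else down).add(gene)
    return up, down


def overlapping_degs(
    first: Dict[str, DegStat], second: Dict[str, DegStat]
) -> Tuple[Set[str], Set[str]]:
    """Intersect the two datasets direction by direction, as a Venn tool would."""
    up_1, down_1 = select_degs(first)
    up_2, down_2 = select_degs(second)
    return up_1 & up_2, down_1 & down_2


# Placeholder genes and values, invented purely for illustration.
dataset_a = {
    "GENE_A": DegStat(2.3, 0.001),
    "GENE_B": DegStat(-1.9, 0.010),
    "GENE_C": DegStat(1.2, 0.0005),  # fails the |logFC| cutoff
    "GENE_D": DegStat(2.8, 0.200),   # fails the adj.p cutoff
}
dataset_b = {
    "GENE_A": DegStat(1.8, 0.004),
    "GENE_B": DegStat(-2.1, 0.030),
    "GENE_C": DegStat(2.0, 0.001),
    "GENE_D": DegStat(3.0, 0.001),
    "GENE_E": DegStat(-1.7, 0.020),  # absent from dataset_a
}

shared_up, shared_down = overlapping_degs(dataset_a, dataset_b)
print("shared up-regulated:", sorted(shared_up))
print("shared down-regulated:", sorted(shared_down))
```

Intersecting up-regulated and down-regulated sets separately keeps a gene out of the overlap when it changes in opposite directions in the two datasets, which matches the statement that the two directions were identified separately.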
The proteins encoded by the overlapping DEGs were imported into STRING to construct a protein-protein interaction (PPI) network, which was visualized with Cytoscape (v3.7.1).[26] An interaction score of 0.4 was used as the selection criterion. Subsequently, Molecular Complex Detection (MCODE)[27] and CytoHubba[28] in Cytoscape were used to analyze the PPI network and identify its hub genes and top modules.

Clinical outcomes of hub genes

The Kaplan-Meier curve is a commonly used tool for studying the correlation between gene expression and disease prognosis. Patients were divided into a high-expression group and a low-expression group on the basis of the median gene expression level. Subsequently, the Kaplan-Meier curves of the hub genes were obtained from the Kaplan-Meier plotter (http://www.kmplot.com/). The expression of the hub genes was measured with Gene Expression Profiling Interactive Analysis (GEPIA), which includes RNA sequencing expression data from the TCGA and GTEx projects.[29] Gene expression data by breast cancer grade,[30] metastasis,[31] TNBC[30] and recurrence[32] status were downloaded from the Oncomine database.

Cell culture and mouse breast cancer xenograft tumor

The following cell lines were obtained from Cobioer (Cobioer Biosciences Co., Ltd., Nanjing, China): MDA-MB-231, T47D, MDA-MB-468, MDA-MB-157, MCF-7 and MCF-10A. The human breast cancer cell lines MDA-MB-231, T47D, MDA-MB-468, MDA-MB-157 and MCF-7 were maintained in RPMI 1640 supplemented with 10% FBS and were grown in a 37 ℃ humidified chamber with 5% CO2. MCF-10A cells were cultured in Dulbecco's modified Eagle's medium/F12 with 10% horse serum, 20 ng/ml epidermal growth factor, 0.5 µg/ml hydrocortisone, 10 µg/ml insulin and 100 ng/ml cholera toxin. The passage number for each cell line was less than 15 when the experiments were performed. Female BALB/c nude mice, 6 to