Identification of Seven Hub Genes as Novel Biomarkers in Triple-negative Breast Cancer and Breast Cancer Metastasis

Huanxian Wu, Southern Medical University; Huining Lian, Southern Medical University Nanfang Hospital; Qianqing Chen, Southern Medical University; Jinlamao Yang, Southern Medical University Nanfang Hospital; Baofang Ou, Southern Medical University; Dongling Quan, Southern Medical University; Lei Zhou, Southern Medical University; Lin Lv, Southern Medical University; Minfeng Liu, Southern Medical University Nanfang Hospital; Shaoyu Wu ( [email protected] ), Guangdong Provincial Key Laboratory of New Drug Screening, School of Pharmaceutical Science, Southern Medical University, Guangzhou, Guangdong, 510515, PR China. https://orcid.org/0000-0002-1247-5295

Research article

Keywords: Triple-negative breast cancer, Metastasis, Biomarkers, Prognostic signature

Posted Date: October 5th, 2020

DOI: https://doi.org/10.21203/rs.3.rs-73076/v1

License: This work is licensed under a Creative Commons Attribution 4.0 International License.

Abstract

Background: Breast cancer is one of the most common malignant tumors and has the highest morbidity and mortality among cancers in women. Compared with the other breast cancer subtypes, triple-negative breast cancer (TNBC) has a higher probability of recurrence and is prone to distant metastasis. This study aimed to reveal the underlying disease mechanisms and to identify more effective biomarkers for TNBC and breast cancer metastasis.

Methods: Gene Ontology (GO) and KEGG pathway analyses were used to investigate the roles of overlapping differentially expressed genes (DEGs). Hub genes among these DEGs were determined by protein-protein interaction network analysis and CytoHubba. The Oncomine database was used to verify the clinical relevance of the hub genes. Furthermore, the differences in expression of these genes between cancer and normal tissues were validated in cell lines, animal models and human tissues.

Results: Seven hub genes, TTK, KIF11, SPAG5, RRM2, BUB1, CDCA8 and CDC25C, were identified that might be associated with TNBC and breast cancer metastasis. These genes were verified to be highly expressed in tumor cells and tumor tissues, and patients with higher expression of these genes had a poorer prognosis.

Conclusions: The seven hub genes are potential biomarkers for the diagnosis and therapy of TNBC and breast cancer metastasis.

Background

Breast cancer, a highly heterogeneous progressive disease, is estimated to account for about a quarter of all female cancers.[1] The survival rate of breast cancer patients is closely correlated with the clinical stage at initial diagnosis.[2] In the past decades, great progress has been made in the study of the molecular mechanisms of breast cancer, making its treatment more personalized;[3] nevertheless, breast cancer is still the major cause of cancer death among females.[1]

Triple-negative breast cancer (TNBC) is a complex and invasive subtype of breast cancer that lacks estrogen receptors, progesterone receptors and aberrant expression of HER2,[4] accounting for 15–20% of breast cancer cases but 25% of deaths.[5] Compared with the other breast cancer subtypes, TNBC has a higher probability of recurrence and a poorer 5-year prognosis on account of a higher degree of vascular and lymphatic infiltration, strong metastatic ability with a tendency toward distant metastasis, and a lack of therapeutic targets.[6], [7–9] Statistically, 20–30% of breast cancer patients experience distant metastasis,[10] and ~90% of cancer-related deaths are caused by metastasis.[11] Overall, the 5-year survival rate of patients with breast cancer is about 90%.[12] However, if the tumor has distant metastases, the 5-year survival rate drops to about 25%.[13] TNBC is about 2.5 times more likely than non-TNBC to metastasize within 5 years after diagnosis.[14] Moreover, TNBC preferentially metastasizes to the viscera, whereas other breast cancer subtypes mainly metastasize to bone, and visceral metastasis tends to carry a worse prognosis.[15]

Currently, chemotherapy remains the standard treatment for TNBC at all stages, owing to the lack of approved cellular targets.[15, 16] For recurrent and metastatic TNBC, systemic chemotherapy is the only currently available treatment strategy.[17] However, poor response, toxicity and the occurrence of multidrug resistance limit the application of this method.[18]

Recently, some targeted drugs have been used in clinical studies of TNBC. They bring hope to the treatment of TNBC but also have significant deficiencies. Owing to the heterogeneity of TNBC, some patients who do not meet the eligibility requirements cannot benefit from targeted therapy (PARP inhibitors); some targeted drugs achieve unsatisfactory response rates in TNBC (TKIs, cetuximab, rapamycin and their combinations with conventional chemotherapies); and compensatory signaling pathway activation leads to the rapid development of resistance to targeted drugs (mTOR inhibitors and PI3K inhibitors).[19–23] Therefore, novel biomarkers are urgently needed to explore targeted therapy for TNBC patients. To find potential biomarkers and therapeutic targets for TNBC and breast cancer metastasis, we screened and analyzed high-throughput data in public databases. Seven potential biomarkers related to TNBC and breast cancer metastasis were found, which are also correlated with the prognosis of breast cancer patients.

Methods

Raw data

The original profiles were downloaded from the GSE52604 and GSE53752 datasets in the Gene Expression Omnibus (GEO; https://www.ncbi.nlm.nih.gov/geo/) database. Gene expression profiles of 35 breast brain metastasis samples, 10 non-neoplastic brain samples and 10 non-neoplastic breast samples were obtained from GSE52604. The other dataset, GSE53752, provided expression data for 51 TNBCs and 25 normal breast tissues.

Identification of differentially expressed genes

Differentially expressed genes (DEGs) were identified in each dataset separately with the R limma package, using adj.p < 0.05 and |LogFC| > 1.5 as selection criteria. The heatmaps and volcano plots of DEGs were built with the R heatmap and ggplot2 packages, respectively. Afterward, venny2.1.0 (https://bioinfogp.cnb.csic.es/tools/venny/index.html), an online tool, was used to determine the overlapping DEGs in the two datasets. Up-regulated and down-regulated genes were identified separately.

GO and KEGG pathway analysis

Gene Ontology (GO) function enrichment analysis includes cellular component (CC), molecular function (MF), and biological process (BP). The Database for Annotation, Visualization and Integrated Discovery (DAVID) is a biological information database that integrates biological data and analysis tools to provide systematic and comprehensive biological function annotation for large-scale gene or protein lists.[24] The overlapping DEGs were imported into DAVID for GO and KEGG pathway enrichment analysis. P < 0.05 was used as the selection criterion.

Protein-protein interaction network construction and analysis

Protein interactions play a pivotal role in cancer-related signaling, cell localization, and expression regulation. Therefore, studying the interaction network between proteins helps to detect the core regulatory genes. The Search Tool for the Retrieval of Interacting Genes (STRING),[25] an online tool (https://string-db.org/), collects, evaluates and integrates all common protein-protein interaction resources and supplements them with computational predictions. The proteins encoded by the overlapping DEGs were imported into STRING to build a protein-protein interaction (PPI) network, which was visualized with Cytoscape software (Cytoscape_v3.7.1).[26] An interaction score of 0.4 was the selection criterion. Subsequently, Molecular Complex Detection (MCODE)[27] and CytoHubba[28] in Cytoscape were used to analyze the PPI network and identify its hub genes and top modules.

Clinical outcomes of hub genes

The Kaplan-Meier curve is a commonly used tool to study the correlation between gene expression and disease prognosis. Patients were divided into a high-expression group and a low-expression group on the basis of the median gene expression level. The Kaplan-Meier curves of the hub genes were then obtained from the Kaplan-Meier plotter (http://www.kmplot.com/). The expression of the hub genes was measured with Gene Expression Profiling Interactive Analysis (GEPIA), which includes RNA sequencing expression data from the TCGA and GTEx projects.[29] Gene expression data by breast cancer grade,[30] metastasis,[31] TNBC[30] and recurrence[32] status were downloaded from the Oncomine database.

Cell culture and mouse breast cancer xenograft tumors

The following cell lines were obtained from Cobioer (Cobioer Biosciences Co., Ltd., Nanjing, China): MDA-MB-231, T47D, MDA-MB-468, MDA-MB-157, MCF-7 and MCF-10A. The human breast cancer cell lines MDA-MB-231, T47D, MDA-MB-468, MDA-MB-157 and MCF-7 were maintained in RPMI 1640 supplemented with 10% FBS and were grown in a 37 °C humidified chamber with 5% CO2. MCF10A cells were cultured in Dulbecco's modified Eagle's medium/F12 with 10% horse serum, 20 ng/ml epidermal growth factor, 0.5 µg/ml hydrocortisone, 10 µg/ml insulin and 100 ng/ml cholera toxin. The passage number for each cell line was less than 15 when the experiments were performed. Female BALB/c nude mice, 6 to 8 weeks old, were purchased from Guangdong Medical Animal Experimental Center (Guangdong, China). Animal experimental procedures complied with the Guide for the Care and Use of Laboratory Animals and were approved by the Animal Experimental Ethics Committee of Southern Medical University. MDA-MB-468, T47D and MCF-7 cells (1.0 × 10⁷ in 0.2 ml PBS) were each subcutaneously injected into the right posterior flank of nude mice (n = 5). When tumor volumes reached 1000 mm³, mice were sacrificed by exposure to carbon dioxide and the tumors were excised.

Patients and specimens

A total of 23 human breast cancer tissues and paired adjacent tissues were collected from patients who underwent breast surgery in Nanfang Hospital (Guangzhou, China). The use of human samples obtained ethical approval from the Research Ethics Committee of Southern Hospital. No patient received any adjuvant treatment before surgery, and all were diagnosed by two independent pathologists. The clinical characteristics of the patients are shown in Table S1.

Reverse transcription-quantitative PCR (RT-qPCR)

Cells and tissues were lysed with TRIzol reagent (Takara). RNA was isolated using chloroform and extracted with isopropyl alcohol and 75% ethanol. RNA was reverse-transcribed into cDNA with Takara reverse transcriptase according to the manufacturer's instructions (Dalian, China). Primers (Table S2; Sangon Biotech, Shanghai, China) were dissolved and diluted to a concentration of 10 µM. cDNA was quantified with SYBR Green Master Mix from Takara on a LightCycler 480 (Roche, Switzerland).

Statistical analysis

All data analysis and plotting were performed with IBM SPSS Statistics 20.0 (IBM Corp, Armonk, NY) and GraphPad Prism 7 (GraphPad Software, San Diego, California, USA).

Results

Identification of DEGs

In GSE52604, we compared breast brain metastasis samples with non-neoplastic brain samples and non-neoplastic breast samples, respectively. We obtained 227 upregulated and 271 downregulated DEGs that matched the screening criteria of adj.p-value < 0.05 and |LogFC| > 1.5. In GSE53752, TNBC tissues were compared with normal breast tissues, and 158 upregulated and 337 downregulated DEGs met the selection criteria (Fig. 1a). Subsequently, the DEGs were used to construct a heatmap (Fig. 1b). To identify genes relevant to tumor metastasis in TNBC, the DEGs of the two microarray datasets were compared to obtain the overlapping DEGs, comprising 53 upregulated genes and 64 downregulated genes (Fig. 1c).

GO and KEGG pathway analysis

To further analyze the biological function of the DEGs, GO and KEGG pathway analyses were performed using DAVID. In the cellular component (CC) category, the DEGs were mainly enriched in the nucleus, cytoplasm, cytosol and extracellular region, including the kinetochore and midbody. In the biological process (BP) category, the DEGs were mostly enriched in cell division, mitotic nuclear division, positive regulation of cell proliferation, and chromosome segregation. In the molecular function (MF) category, the DEGs were enriched in protein binding, ATP binding, zinc ion binding and calcium ion binding (Fig. 2). All information is available in Supplementary Tables S3 and S4. In the KEGG pathway analysis, these DEGs were enriched in cell cycle, pathways in cancer, the p53 signaling pathway and melanoma.

PPI network analysis and screening for hub genes

To elucidate the molecular mechanisms of key cellular activities in tumorigenesis and development, we built a PPI network of the overlapping DEGs using STRING (Fig. 3a). To further analyze the hub genes inside the network, the network was imported into Cytoscape, and CytoHubba was used to rank genes by seven algorithms: MCC, DMNC, MNC, Degree, EPC, Closeness and Radiality. The top 20 genes of each algorithm are shown in Table 1. The genes shared by all seven lists were considered the hub genes: RRM2, BUB1, CDCA8, SPAG5, TTK, KIF11 and CDC25C. To further confirm the significance of the hub genes in the PPI network, the overlapping DEGs were input into MCODE to extract the top interactive module. Subsequently, the hub genes and the genes in the top interactive module were input into STRING to verify the accuracy of the prediction. The hub genes interact with each other and are among the key nodes of the top interaction module (Fig. 3b and 3c).
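The hub-gene selection described above amounts to a set intersection over the seven top-20 lists. The following is a minimal Python sketch of that step, not the authors' workflow (the rankings themselves came from CytoHubba in Cytoscape). The gene lists are transcribed from Table 1; the names `TOP20` and `shared_genes` are introduced here for illustration only.

```python
# Top-20 gene lists per cytoHubba algorithm, transcribed from Table 1.
TOP20 = {
    "DMNC": ("CENPF DEPDC1 KIF18A UBE2T CKS2 FAM83D RRM2 CEP55 BUB1 CDCA8 "
             "SPAG5 TTK KIF11 ASPM NCAPG CDCA5 NUF2 NDC80 SPC25 CDC25C").split(),
    "MCC": ("RRM2 CEP55 BUB1 CDCA8 SPAG5 TTK KIF11 TPX2 ASPM NCAPG "
            "CDCA5 PBK MKI67 BIRC5 CENPF TOP2A OIP5 CDKN3 CDC25C NDC80").split(),
    "MNC": ("MKI67 BIRC5 TPX2 PBK CDKN3 OIP5 RRM2 CEP55 BUB1 CDCA8 "
            "SPAG5 TTK KIF11 TOP2A ASPM NCAPG CDCA5 NUF2 NDC80 CDC25C").split(),
    "Degree": ("MKI67 RRM2 BIRC5 TPX2 PBK CDKN3 OIP5 CEP55 BUB1 CDCA8 "
               "SPAG5 TTK KIF11 TOP2A ASPM NCAPG CDCA5 NUF2 NDC80 CDC25C").split(),
    "EPC": ("RRM2 MKI67 OIP5 PBK TPX2 TTK TOP2A NUF2 CDC25C NDC80 "
            "BIRC5 CDCA5 CDCA8 KIF11 ASPM CDKN3 BUB1 CENPF NCAPG SPAG5").split(),
    "Closeness": ("MKI67 BIRC5 PBK TPX2 RRM2 CDKN3 OIP5 TOP2A NEK2 CEP55 "
                  "BUB1 CDCA8 SPAG5 TTK KIF11 ASPM NCAPG CDCA5 CDC25C NUF2").split(),
    "Radiality": ("MKI67 BIRC5 FN1 PBK FGF2 CDKN2A NEK2 RAD51 TPX2 CDKN3 "
                  "OIP5 RRM2 TOP2A CEP55 BUB1 CDCA8 CDC25C SPAG5 TTK KIF11").split(),
}


def shared_genes(rankings):
    """Return the genes present in every ranking, sorted alphabetically."""
    gene_sets = [set(genes) for genes in rankings.values()]
    return sorted(set.intersection(*gene_sets))


hub_genes = shared_genes(TOP20)
print(hub_genes)
# ['BUB1', 'CDC25C', 'CDCA8', 'KIF11', 'RRM2', 'SPAG5', 'TTK']
```

Requiring membership in all seven lists is a strict criterion: CEP55, for example, appears in six lists but is excluded because it is absent from the EPC top 20.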

Table 1 Differentially expressed genes ranked in cytoHubba

Rank method    Top 20 genes

DMNC CENPF, DEPDC1, KIF18A, UBE2T, CKS2, FAM83D, RRM2, CEP55, BUB1, CDCA8, SPAG5, TTK, KIF11, ASPM, NCAPG, CDCA5, NUF2, NDC80, SPC25, CDC25C

MCC RRM2, CEP55, BUB1, CDCA8, SPAG5, TTK, KIF11, TPX2, ASPM, NCAPG, CDCA5, PBK, MKI67, BIRC5, CENPF, TOP2A, OIP5, CDKN3, CDC25C, NDC80

MNC MKI67, BIRC5, TPX2, PBK, CDKN3, OIP5, RRM2, CEP55, BUB1, CDCA8, SPAG5, TTK, KIF11, TOP2A, ASPM, NCAPG, CDCA5, NUF2, NDC80, CDC25C

Degree MKI67, RRM2, BIRC5, TPX2, PBK, CDKN3, OIP5, CEP55, BUB1, CDCA8, SPAG5, TTK, KIF11, TOP2A, ASPM, NCAPG, CDCA5, NUF2, NDC80, CDC25C

EPC RRM2, MKI67, OIP5, PBK, TPX2, TTK, TOP2A, NUF2, CDC25C, NDC80, BIRC5, CDCA5, CDCA8, KIF11, ASPM, CDKN3, BUB1, CENPF, NCAPG, SPAG5

Closeness MKI67, BIRC5, PBK, TPX2, RRM2, CDKN3, OIP5, TOP2A, NEK2, CEP55, BUB1, CDCA8, SPAG5, TTK, KIF11, ASPM, NCAPG, CDCA5, CDC25C, NUF2

Radiality MKI67, BIRC5, FN1, PBK, FGF2, CDKN2A, NEK2, RAD51, TPX2, CDKN3, OIP5, RRM2, TOP2A, CEP55, BUB1, CDCA8, CDC25C, SPAG5, TTK, KIF11

Expression levels of hub genes are associated with prognosis of breast cancer patients

To validate the clinical significance of these hub genes, their expression was compared between breast cancer and normal tissues based on data downloaded from GEPIA. The results indicated that the hub genes were more highly expressed in the tissues of breast cancer patients (Fig. 4a). In addition, the clinical outcomes and the corresponding gene expression of breast cancer patients were obtained from the Oncomine database. The Kruskal-Wallis test suggested that the expression of the hub genes was associated with the grade of breast cancer (Table 2), and higher grades confer increased aggressiveness and an increased propensity to spread. All seven hub genes were more highly expressed in patients with metastatic breast cancer than in those without metastasis (Fig. 5a). In patients who had a recurrence of breast cancer, TTK, KIF11, SPAG5, RRM2, BUB1 and CDCA8 showed higher expression (Fig. S1). Moreover, the seven hub genes were highly expressed in TNBC compared with the other biomarker statuses (Fig. 5b). The survival curves of RRM2, BUB1, CDCA8, SPAG5, TTK, KIF11 and CDC25C show that patients with higher expression of these genes had worse overall survival (Fig. 4b).

Table 2 Univariate analysis of hub gene mRNA expression and final tumor grade

Gene          Total               Grade 1             Grade 2             Grade 3             χ2       p-Value
Patient (n)   1902                170                 775                 957                 -        -
BUB1          1.08 (0.59–1.60)    0.48 (0.17–0.83)    0.81 (0.44–1.17)    1.45 (0.98–1.90)    476.18   < 0.001
CDC25C        0.57 (0.28–0.91)    0.24 (-0.04–0.55)   0.45 (0.19–0.73)    0.73 (0.46–1.06)    275.56   < 0.001
CDCA8         1.20 (0.79–1.69)    0.70 (0.45–0.99)    0.98 (0.66–1.34)    1.53 (1.10–2.10)    447.59   < 0.001
KIF11         1.31 (0.94–1.74)    0.87 (0.56–1.20)    1.15 (0.82–1.49)    1.58 (1.18–1.98)    334.79   < 0.001
RRM2          0.33 (0.06–0.70)    0.10 (-0.05–0.29)   0.24 (0.02–0.49)    0.52 (0.19–0.92)    209.83   < 0.001
SPAG5         0.33 (0.14–0.59)    0.18 (0.04–0.34)    0.26 (0.09–0.47)    0.45 (0.22–0.74)    190.32   < 0.001
TTK           1.14 (0.68–1.78)    0.32 (0.16–0.92)    0.91 (0.57–1.30)    1.59 (1.03–2.18)    407.52   < 0.001
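For readers who want to see how a Kruskal-Wallis statistic of the kind reported in Table 2 is formed, the sketch below computes the tie-corrected H statistic with the Python standard library only. It is an illustration under stated assumptions, not the authors' procedure (the Methods name SPSS and GraphPad Prism). The function names and the toy values are hypothetical, and the closed-form p-value is valid only for three groups, where H is referred to a chi-square distribution with 2 degrees of freedom.

```python
import math
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of samples."""
    pooled = list(chain.from_iterable(groups))
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N).
    tie_counts = {}
    for value in pooled:
        tie_counts[value] = tie_counts.get(value, 0) + 1
    tie_term = sum(t ** 3 - t for t in tie_counts.values())
    correction = 1 - tie_term / (n_total ** 3 - n_total)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction


def p_value_three_groups(h):
    """Chi-square upper tail for 2 degrees of freedom (three groups only)."""
    return math.exp(-h / 2)


# Toy expression values for three grades (illustrative, not patient data).
toy = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h_toy = kruskal_wallis_h(toy)
print(round(h_toy, 3), round(p_value_three_groups(h_toy), 4))
```

Under the same three-group assumption, even the smallest statistic in Table 2 (190.32 for SPAG5) gives exp(-190.32 / 2), which is far below 0.001 and therefore consistent with the reported p-values.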

Expression level of hub genes in breast cancer cell lines and mouse breast cancer xenograft tumor tissue

To confirm the credibility of the bioinformatics analysis, the expression of the hub genes in five breast cancer cell lines (MDA-MB-231, MDA-MB-468, MDA-MB-157, MCF-7 and T47D) and a normal mammary epithelial cell line (MCF10A) was compared to determine whether the hub genes were highly expressed in breast cancer. Compared with MCF10A, expression of the hub genes was elevated in the breast cancer cell lines (Fig. 6a). To further verify the expression level of the hub genes in breast cancer, xenograft models were constructed in nude mice, and the levels of the hub genes in the tumor tissues were measured. Expression of the hub genes in the xenograft models was consistent with that in the breast cancer cell lines (Fig. 6b).

Hub genes are highly expressed in breast cancer tissues

To verify the expression level of the hub genes in breast cancer tissues, patient tissues were collected from the Department of Breast Surgery, Nanfang Hospital (Guangzhou, China). We then compared the expression of the hub genes between breast cancer tissues and paired para-carcinoma tissues. As shown in Fig. 7, the hub genes had higher expression in the tumor tissue than in the para-carcinoma tissue.

Discussion

TNBC is a heterogeneous and aggressive subtype of breast cancer that is prone to recurrence and metastasis and is associated with poor prognosis.[17] Unfortunately, it cannot be treated with hormone therapy, and there is no recognized therapeutic target. To date, chemotherapy is still the only available strategy for TNBC. However, patients who fail to achieve a pathological complete response have a high risk of recurrence. Meanwhile, this approach may be accompanied by toxicity and multidrug resistance. Hence, novel biomarkers for the diagnosis and therapy of breast cancer metastasis and TNBC are urgently needed.

In this study, two datasets, one from TNBC and one from breast cancer brain metastases, were analyzed, and 117 overlapping DEGs were obtained. GO and KEGG pathway analysis demonstrated that the DEGs are mainly related to mitosis, the cell cycle, cell proliferation, cell division and cancer. These enriched functions and pathways provide a reference for studying the molecular mechanisms of TNBC and breast cancer metastasis. Further analysis showed that TTK, KIF11, SPAG5, RRM2, BUB1, CDCA8 and CDC25C might play a key role in the progression of TNBC and metastasis. To confirm these results, gene expression values and clinical outcomes were downloaded from the TCGA and Oncomine databases. The seven hub genes are highly expressed in tumor tissues, especially in high-grade and triple-negative breast cancer. Moreover, patients with high expression of these genes have a worse outcome. In addition, the seven hub genes may be risk factors for tumor metastasis, while TTK, KIF11, SPAG5, RRM2, BUB1 and CDCA8 might be related to breast cancer recurrence. To further verify the role of the hub genes, their expression was measured in breast cancer cell lines, mouse breast cancer xenograft tumor tissue, and breast cancer patients' tumor tissue and paired para-carcinoma tissue. All the experiments showed that expression of the seven hub genes was increased in tumors. In summary, the seven hub genes are highly expressed in breast cancer and related to TNBC and breast cancer metastasis, which provides more insight into the mechanisms of breast cancer metastasis and TNBC and may support the development of new diagnostic and therapeutic targets.

The progression of a tumor is a complex process driven by specific genetic and epigenetic alterations. TTK is reported to be abnormally expressed in a wide variety of tumors. It is overexpressed in TNBC, where it is related to mesenchymal and proliferative features.[33] In addition, TTK has been reported as a biomarker of metastasis in various cancers.[34, 35] Overexpression of KIF11[36] or SPAG5[37] can promote tumor growth in TNBC and is associated with a worse prognosis.[38, 39] RRM2 is considered an early molecular marker for breast cancer[40] and was found to be more highly expressed in tumoral metastasis and recurrence in colorectal cancer.[41] BUB1 was identified as necessary for maintaining cancer stem cells in breast cancer.[42] Phosphorylated CDC25C plays a key role in breast cancer and may be a potential treatment target.[43] TTK, CDCA8 and BUB1 have been reported to be related to breast cancer metastasis. Here, we found that TTK, KIF11, SPAG5, RRM2, BUB1, CDCA8 and CDC25C were associated with TNBC and breast cancer metastasis. RRM2, BUB1 and CDCA8 were found, for the first time, to potentially play a key role in TNBC. Moreover, we showed that KIF11, SPAG5, RRM2 and CDC25C might be associated with breast cancer metastasis.

Conclusion

In summary, our study identified seven potential biomarkers of TNBC and breast cancer metastasis: TTK, KIF11, SPAG5, RRM2, BUB1, CDCA8 and CDC25C. In addition, RRM2, BUB1 and CDCA8 may play a crucial role in TNBC, and KIF11, SPAG5, RRM2 and CDC25C were shown for the first time to be related to breast cancer metastasis. These genes might be new diagnostic and therapeutic targets for TNBC and breast cancer metastasis. We verified the importance of these genes in breast cancer through bioinformatics methods and in vivo and in vitro experiments, but the mechanisms remain unclear. Therefore, we will conduct further research to clarify these mechanisms.

Abbreviations

TNBC: Triple-negative breast cancer; DEGs: Differentially expressed genes; GO: Gene Ontology; KEGG: Kyoto Encyclopedia of Genes and Genomes; TCGA: The Cancer Genome Atlas; GEO: Gene Expression Omnibus; STRING: Search Tool for the Retrieval of Interacting Genes; CC: Cellular component; MF: Molecular function; BP: Biological process; DAVID: The Database for Annotation, Visualization and Integrated Discovery; PPI: Protein-protein interaction; MCODE: Molecular complex detection; GEPIA: Gene Expression Profiling Interactive Analysis; TTK: TTK protein kinase; KIF11: Kinesin family member 11; SPAG5: Sperm associated antigen 5; RRM2: Ribonucleotide reductase regulatory subunit M2; BUB1: BUB1 mitotic checkpoint serine/threonine kinase; CDCA8: Cell division cycle associated 8; CDC25C: Cell division cycle 25C; TKI: Tyrosine kinase inhibitor.

Declarations

Acknowledgments

Not Applicable.

Authors’ contributions

SYW, MFL, HXW conceived and designed the experiments; HXW, HNL, QQC, JLMY, BFO, DLQ performed the experiments; LZ, LL and HXW analyzed the data; HXW and SYW wrote the paper. All authors have read and approved the manuscript.

Funding

This work was supported by the Natural Science Foundation of Guangdong Province, China (No. 2018A030313405), the Science and Technology Planning Project of Guangdong Province, China (No. 2017A050501020) and the Medical Scientific Research Foundation of Guangdong Province, China (No. A2018189).

Availability of data and materials

All analyzed data are included in this published article and its supplementary information file. The original data are available upon reasonable request to the corresponding author.

Ethics approval and consent to participate

This study was conducted with approval from the Animal Experimental Ethics Committee of Southern Medical University, Guangzhou, China (Approval No. L2018021).

Competing interests

The authors declare that they have no competing interests.

Data Availability Statement

The data that support the findings of this study are openly available in GEO DataSets at https://www.ncbi.nlm.nih.gov/gds/?term= and the Oncomine DataSet at https://www.oncomine.org/resource/login.html, reference numbers GSE52604 and GSE53752.

References

1. Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A: Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. Ca-Cancer J Clin 2018, 68(6):394-424.
2. Xiao W, Zheng S, Yang A, Zhang X, Zou Y, Tang H, Xie X: Breast cancer subtypes and the risk of distant metastasis at initial diagnosis: a population-based study. Cancer management and research 2018, 10:5329-5338.
3. Liu M, Li ZY, Yang JJ, Jiang YY, Chen ZS, Ali ZS, He NY, Wang ZF: Cell-specific biomarkers and targeted biopharmaceuticals for breast cancer treatment. Cell Proliferat 2016, 49(4):409-420.
4. Bianchini G, Balko JM, Mayer IA, Sanders ME, Gianni L: Triple-negative breast cancer: challenges and opportunities of a heterogeneous disease. Nat Rev Clin Oncol 2016, 13(11):674-690.
5. Rastelli F, Biancanelli S, Falzetta A, Martignetti A, Casi C, Bascioni R, Giustini L, Crispino S: Triple-negative breast cancer: current state of the art. Tumori J 2010, 96(6):875-888.
6. Geenen JJJ, Linn SC, Beijnen JH, Schellens JHM: PARP Inhibitors in the Treatment of Triple-Negative Breast Cancer. Clin Pharmacokinet 2018, 57(4):427-437.
7. Stovgaard ES, Nielsen D, Hogdall E, Balslev E: Triple negative breast cancer - prognostic role of immune-related factors: a systematic review. Acta Oncol 2018, 57(1):74-82.
8. Mathe A, Wong-Brown M, Morten B, Forbes JF, Braye SG, Avery-Kiejda KA, Scott RJ: Novel genes associated with lymph node metastasis in triple negative breast cancer. Scientific reports 2015, 5.
9. Malorni L, Shetty PB, De Angelis C, Hilsenbeck S, Rimawi MF, Elledge R, Osborne CK, De Placido S, Arpino G: Clinical and biologic features of triple-negative breast cancers in a large cohort of patients with long-term follow-up. Breast Cancer Res Tr 2012, 136(3):795-804.
10. Abe O, Abe R, Enomoto K, Kikuchi K, Koyama H, Masuda H, Nomura Y, Sakai K, Sugimachi K, Tominaga T et al: Effects of chemotherapy and hormonal therapy for early breast cancer on recurrence and 15-year survival: an overview of the randomised trials. Lancet 2005, 365(9472):1687-1717.
11. Cancer Genome Atlas N: Comprehensive molecular portraits of human breast tumours. Nature 2012, 490(7418):61-70.
12. Siegel RL, Miller KD, Jemal A: Cancer statistics, 2020. CA: a cancer journal for clinicians 2020, 70(1):7-30.
13. Perez EA: Treatment strategies for advanced hormone receptor-positive and human epidermal growth factor 2-negative breast cancer: the role of treatment order. Drug Resist Update 2016, 24:13-22.
14. Dent R, Trudeau M, Pritchard KI, Hanna WM, Kahn HK, Sawka CA, Lickley LA, Rawlinson E, Sun P, Narod SA: Triple-negative breast cancer: Clinical features and patterns of recurrence. Clinical Cancer Research 2007, 13(15):4429-4434.
15. Liedtke C, Mazouni C, Hess KR, Andre F, Tordai A, Mejia JA, Symmans WF, Gonzalez-Angulo AM, Hennessy B, Green M et al: Response to neoadjuvant therapy and long-term survival in patients with triple-negative breast cancer. J Clin Oncol 2008, 26(8):1275-1281.
16. Carey LA, Dees EC, Sawyer L, Gatti L, Moore DT, Collichio F, Ollila DW, Sartor CI, Graham ML, Perou CM: The triple negative paradox: Primary tumor chemosensitivity of breast cancer subtypes. Clinical Cancer Research 2007, 13(8):2329-2334.
17. Bianchini G, Balko JM, Mayer IA, Sanders ME, Gianni L: Triple-negative breast cancer: challenges and opportunities of a heterogeneous disease. Nat Rev Clin Oncol 2016, 13(11):674-690.
18. Lee A, Djamgoz MBA: Triple negative breast cancer: Emerging therapeutic modalities and novel combination therapies. Cancer Treat Rev 2018, 62:110-122.
19. Makvandi M, Xu KY, Lieberman BP, Anderson RC, Effron SS, Winters HD, Zeng CB, McDonald ES, Pryma DA, Greenberg RA et al: A Radiotracer Strategy to Quantify PARP-1 Expression In Vivo Provides a Biomarker That Can Enable Patient Selection for PARP Inhibitor Therapy. Cancer research 2016, 76(15):4516-4524.
20. Finn RS, Press MF, Dering J, Arbushites M, Koehler M, Oliva C, Williams LS, Di Leo A: Estrogen Receptor, Progesterone Receptor, Human Epidermal Growth Factor Receptor 2 (HER2), and Epidermal Growth Factor Receptor Expression and Benefit From Lapatinib in a Randomized Trial of Paclitaxel With Lapatinib or Placebo As First-Line Treatment in HER2-Negative or Unknown Metastatic Breast Cancer. J Clin Oncol 2009, 27(24):3908-3915.
21. Harbeck N, Schmidt M, Harter P, Possinger K, Jonat W, Luck HJ, Beckmann M, Fasching P, Schutte J, Solca F et al: BIBW 2992, a Novel Irreversible EGFR/HER1 and HER2 Tyrosine Kinase Inhibitor for the Treatment of Patients with HER2-Negative Metastatic Breast Cancer after Failure of No More Than Two Prior Chemotherapies. Cancer research 2009, 69(24):785s-786s.
22. Baselga J, Gomez P, Greil R, Braga S, Climent MA, Wardley AM, Kaufman B, Stemmer SM, Pego A, Chan A et al: Randomized Phase II Study of the Anti-Epidermal Growth Factor Receptor Monoclonal Antibody Cetuximab With Cisplatin Versus Cisplatin Alone in Patients With Metastatic Triple-Negative Breast Cancer. J Clin Oncol 2013, 31(20):2586-+.
23. Carey LA, Rugo HS, Marcom PK, Mayer EL, Esteva FJ, Ma CX, Liu MC, Storniolo AM, Rimawi MF, Forero-Torres A et al: TBCRC 001: Randomized Phase II Study of Cetuximab in Combination With Carboplatin in Stage IV Triple-Negative Breast Cancer. J Clin Oncol 2012, 30(21):2615-2623.
24. Huang DW, Sherman BT, Lempicki RA: Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nature Protocols 2009, 4(1):44-57.
25. Szklarczyk D, Gable AL, Lyon D, Junge A, Wyder S, Huerta-Cepas J, Simonovic M, Doncheva NT, Morris JH, Bork P et al: STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res 2019, 47(D1):D607-D613.
26. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T: Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome research 2003, 13(11):2498-2504.
27. Bader GD, Hogue CW: An automated method for finding molecular complexes in large protein interaction networks. BMC bioinformatics 2003, 4:2.
28. Chin CH, Chen SH, Wu HH, Ho CW, Ko MT, Lin CY: cytoHubba: identifying hub objects and sub-networks from complex interactome. BMC systems biology 2014, 8 Suppl 4:S11.
29. Tang ZF, Li CW, Kang BX, Gao G, Li C, Zhang ZM: GEPIA: a web server for cancer and normal gene expression profiling and interactive analyses. Nucleic Acids Research 2017, 45(W1):W98-W102.
30. Curtis C, Shah SP, Chin SF, Turashvili G, Rueda OM, Dunning MJ, Speed D, Lynch AG, Samarajiwa S, Yuan Y et al: The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012, 486(7403):346-352.
31. Schmidt M, Böhm D, von Törne C, Steiner E, Puhl A, Pilch H, Lehr H-A, Hengstler JG, Kölbl H, Gehrmann M: The Humoral Immune System Has a Key Prognostic Impact in Node-Negative Breast Cancer. Cancer research 2008, 68(13):5405-5413.
32. Wang Y, Klijn JGM, Zhang Y, Sieuwerts AM, Look MP, Yang F, Talantov D, Timmermans M, Meijer-van Gelder ME, Yu J et al: Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer. The Lancet 2005, 365(9460):671-679.
33. King JL, Zhang B, Li Y, Li KP, Ni JJ, Saavedra HI, Dong JT: TTK promotes mesenchymal signaling via multiple mechanisms in triple negative breast cancer. Oncogenesis 2018, 7(9):69.
34. Chen S, Wang Y, Ni C, Meng G, Sheng X: HLF/miR-132/TTK axis regulates cell proliferation, metastasis and radiosensitivity of glioma cells. Biomedicine & pharmacotherapy = Biomedecine & pharmacotherapie 2016, 83:898-904.
35. Liu XD, Yao DW, Xin F: TTK contributes to tumor growth and metastasis of clear cell renal cell carcinoma by inducing cell proliferation and invasion. Neoplasma 2019, 66(6):946-953.
36. Jiang M, Zhuang HR, Xia R, Gan L, Wu YT, Ma JZ, Sun YH, Zhuang ZX: KIF11 is required for proliferation and self-renewal of docetaxel resistant triple negative breast cancer cells. Oncotarget 2017, 8(54):92106-92118.
37. Li M, Li AQ, Zhou SL, Lv H, Yang WT: SPAG5 upregulation contributes to enhanced c-MYC transcriptional activity via interaction with c-MYC binding protein in triple-negative breast cancer. J Hematol Oncol 2019, 12.
38. Zhou J, Chen WR, Yang LC, Wang J, Sun JY, Zhang WW, He ZY, Wu SG: KIF11 Functions as an Oncogene and Is Associated with Poor Outcomes from Breast Cancer. Cancer Res Treat 2019, 51(3):1207-1221.
39. Bertucci F, Viens P, Birnbaum D: SPAG5: the ultimate marker of proliferation in early breast cancer? Lancet Oncol 2016, 17(7):863-865.
40. Kretschmer C, Sterner-Kock A, Siedentopf F, Schoenegg W, Schlag PM, Kemmner W: Identification of early molecular markers for breast cancer. Mol Cancer 2011, 10.
41. Chang CC, Lin CC, Wang CH, Huang CC, Ke TW, Wei PL, Yeh KT, Hsu KC, Hsu NY, Cheng YW: miR-211 regulates the expression of RRM2 in tumoral metastasis and recurrence in colorectal cancer patients with a k-ras gene mutation. Oncol Lett 2018, 15(5):8107-8117.
42. Han JY, Han YK, Park GY, Kim SD, Kim JS, Jo WS, Lee CG: Bub1 is required for maintaining cancer stem cells in breast cancer cell lines (vol 5, 15993, 2015). Scientific reports 2016, 6.
43. Jiang HY, Wang B, Zhang FL, Qian YY, Chia, Ying MZ, Wang YJ, Zuo L: The Expression and Clinical Outcome of pCHK2-Thr68 and pCDC25C-Ser216 in Breast Cancer. International Journal Of Molecular Sciences 2016, 17(11).

Figures

Figure 1

Identification of overlapping DEGs from GEO datasets (GSE52604, GSE53756). (a) Volcano plot of DEGs in two datasets. (b) Heat map of the 50 overlapping DEGs. Venn diagram of 53 overlapping up-regulated DEGs (c) and 64 overlapping down-regulated DEGs (d).
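The overlap summarized by the Venn diagrams in panels (c) and (d) amounts to a set intersection of the per-dataset DEG lists. A minimal Python sketch of that step follows; the helper name `overlapping_degs` and the gene symbols are hypothetical placeholders for illustration, not the study's data or the authors' actual pipeline.

```python
def overlapping_degs(degs_a, degs_b):
    """Return the sorted gene symbols present in both DEG lists."""
    return sorted(set(degs_a) & set(degs_b))


# Hypothetical up-regulated DEG lists for two datasets (placeholder symbols,
# not the genes reported in the paper).
up_dataset_1 = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
up_dataset_2 = ["GENE_B", "GENE_D", "GENE_E"]

overlap_up = overlapping_degs(up_dataset_1, up_dataset_2)
print(len(overlap_up), overlap_up)
```

The same call applied to the down-regulated lists would give the second Venn diagram; keeping the two directions separate avoids counting a gene that is up-regulated in one dataset and down-regulated in the other as overlapping.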

Figure 2

GO analysis and KEGG pathway analysis of overlapping DEGs. (a) GO analysis; (b) KEGG pathway analysis; (c) Molecular Function, (d) Cellular Component, (e) Biological Process and (f) KEGG analysis of overlapping DEGs. A P value < 0.05 was considered significantly enriched.

Figure 3

PPI (protein-protein interaction) network analysis and screening for hub genes. (a) The PPI network of DEGs; (b) the PPI network of the 7 hub genes; (c) the top module within the PPI network, identified with MCODE.
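Hub screening of the kind shown in panel (b) ranks the nodes of the PPI network by a topological score. The sketch below uses plain node degree purely for illustration; cytoHubba offers several ranking methods, and which one the authors applied is not stated in this caption. The function name `rank_by_degree` and the toy edge list are hypothetical.

```python
from collections import defaultdict


def rank_by_degree(edges, top_n):
    """Rank nodes of an undirected network by number of distinct neighbours."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbours[a].add(b)
        neighbours[b].add(a)
    # Sort by descending degree, then alphabetically for a stable order.
    ranked = sorted(neighbours, key=lambda n: (-len(neighbours[n]), n))
    return [(n, len(neighbours[n])) for n in ranked[:top_n]]


# Toy interaction list with placeholder protein names (not the study's network).
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P4", "P5")]

hubs = rank_by_degree(toy_edges, top_n=2)
print(hubs)
```

Degree is the simplest such score; ties are broken alphabetically here only so that the output is deterministic.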

Figure 4 breast cancer patients

Figure 5

The clinical outcomes of seven hub genes. (a) Expression levels of TTK, KIF11, SPAG5, RRM2, BUB1, CDCA8 and CDC25C in metastasis (M, n=46) and no metastasis (NM, n=154); ***, P < 0.001; **, P < 0.01; *, P < 0.05, metastasis vs no metastasis. (b) Expression levels of TTK, KIF11, SPAG5, RRM2, BUB1, CDCA8 and CDC25C in TNBC (3, n=250), no value (1, n=161) and other biomarker status (2, n=1725), based on the Oncomine database. ****, P < 0.0001, TNBC vs no value; ####, P < 0.0001, TNBC vs other biomarker status.

Figure 6

The expression levels of the seven hub genes in vitro and in vivo. (a) mRNA levels of the 7 hub genes in several breast cancer cell lines and in MCF-10A, a normal breast cell line; GAPDH is the internal control. *p < 0.05, **p < 0.01, ***p < 0.0001, ****p < 0.00001 vs. MCF-10A. (b) mRNA levels of the 7 hub genes in xenograft breast cancer tissue (n=5); GAPDH is the internal control. *p < 0.05, **p < 0.01, ***p < 0.0001, ****p < 0.00001 vs. MDA-MB-468; ##p < 0.001 vs. T47D.

Figure 7

mRNA levels of the seven hub genes in human breast cancer tissue. ***p < 0.001; ****p < 0.0001 vs. paired adjacent tissues.

Supplementary Files

This is a list of supplementary files associated with this preprint.

AuthorChecklistFull.pdf SupportingInformation.docx
