Transcriptomic analysis of breast cancer patients sensitive and resistant to chemotherapy: Looking for overall survival and drug resistance biomarkers

Carlos Barrón-Gallardo University of Guadalajara Mariel García-Chagollán University of Guadalajara Andrés Morán-Mendoza Mexican Social Security Institute Raúl Delgadillo-Cristerna Mexican Social Security Institute María Martínez-Silva Mexican Social Security Institute Adriana Aguilar-Lemarroy Mexican Social Security Institute Luis Jave-Suárez (  [email protected] ) Mexican Social Security Institute

Research Article

Keywords: breast cancer patients, neoadjuvant chemotherapy, resistance development, differentially expressed (DEGs), chemotherapy response, overall survival

Posted Date: December 15th, 2020

DOI: https://doi.org/10.21203/rs.3.rs-121509/v1

License:   This work is licensed under a Creative Commons Attribution 4.0 International License. Read Full License

Page 1/26 Abstract

Neoadjuvant chemotherapy is one important therapeutic strategy for breast cancer with the drawback of resistance development. Chemotherapy has adverse effects that combined with resistance could contribute to lower overall survival. This work aimed to evaluate the molecular profle of patients who received neoadjuvant chemotherapy to discover differentially expressed genes (DEGs) that could be used as biomarkers of chemotherapy response and overall survival. Breast cancer patients who received neoadjuvant chemotherapy were enrolled in this study and according to their pathological response were assigned as sensitive or resistant. To evaluate DEGs, GO, KEGG, and -protein interactions, RNAseq information from all patients was obtained by next-generation sequencing. A total of 1985 DEGs were found and KEGG analysis indicated a great number of DEGs in metabolic pathways, pathways in cancer, cytokine-cytokine receptor interactions, and neuroactive ligand-receptor interactions. A selection of 73 DEGs was used further for an analysis of overall survival using the METABRIC study. Seven of those DEGs correlated with overall survival, of them the sub-expression of C1QTNF3, CTF1, OLFML3, PLA2R1, PODN and the over expression of TUBB and TCP1 were found in resistant patients and related to patients with lower overall survival.

Introduction

Breast cancer ranks frst in mortality and incidence of neoplasia in women over 20 years old worldwide1. One of the frst therapeutic approaches to treat this disease is the use of neoadjuvant chemotherapy, which has the objective of decreasing tumoral size, increase the possibilities of conservative surgery, remove micrometastases, and improve overall survival. However, despite the benefts of this treatment in breast cancer, some patients develop resistance to it, making their future therapeutic approach more difcult. The molecular factors involved in chemoresistance are not totally elucidated so far. Nevertheless, there is a relationship between chemoresistance and lower overall survival and disease-free survival2,3. In this sense, there is a growing interest and urgency for fnding molecular biomarkers useful for the prognosis of neoadjuvant chemotherapy response.

The use of predictive biomarkers, molecules that can be easily measured and gives clues of the behavior of the disease, is gaining importance in the clinic. Biomarkers have been used primarily for targeted therapy, classifying subjects, and ultimately predict chemotherapy response and overall survival. The study of biomarkers in breast cancer has allowed the development of panels such as Mammaprint, Rotterdam, and OncotypeDx, which can predict the risk of metastasis, disease recurrence, and response to tamoxifen, respectively, for hormone receptors positive patients4. Most of the biomarker panels developed to date for breast cancer have focused primarily on specifc molecular subtypes, and there is currently no panel able to predict the response to neoadjuvant chemotherapy. In this work, our main goal was to evaluate the transcriptome of breast cancer patients who were resistant or sensitive to neoadjuvant chemotherapy, to fnd differentially expressed genes (DEGs) for their potential use as chemotherapy response and overall survival biomarker.

Page 2/26 Results

A total of 41 patients were recruited for this study, nevertheless, 18 samples had poor quality or insufcient quantity of RNA for the analysis, and 1 patient dead during treatment which were excluded from the study. Therefore, only 22 samples covered the criteria for RNAsEq. All clinical characteristics are summarized in the supplementary table 1, briefy, the age mean was 51.2 ± 10.18 years old. All patients were diagnosed with breast invasive ductal carcinoma. The molecular subtype distribution among the patients was 3 luminal A, 6 luminal B, 8 Her2, 4 triple-negative, and 1 patient with non-available information. The histological grade was 1 SBRI, 10 SBRII, 9 SBRIII and 2 patients with non-available information. After patients completed the 6 months of neoadjuvant chemotherapy 9 were sensitive, that is, had a pathological complete response (pCR) and 13 of them were resistant (non-pCR)(Supplementary table 1).

Differentially Expressed Genes (DEGs) in Resistant Patients to Neoadjuvant Chemotherapy

Analysis of DEGs was performed between resistant and sensitive patients. To accomplish the comparisons, the sensitive group was set as the reference, so the results indicate DEGs in the resistant group. The data for performing the analysis was taken from the ENSEMBL project. From the 39,723 genes reported in the ENSEMBL database, a total of 29,739 genes had reads that aligned with its sequence, and there were 1985 DEGs with a p-value < 0.05 (1005 DEGs overexpressed and 980 subexpressed) (Fig. 1a). When fltering the data by the log2FoldChange > |1|, the number of DEGs diminishes to 1178 (556 overexpressed and 662 subexpressed). Besides, 73 genes (53 DEGs genes with subexpression, and 20 with overexpression) showed a padj value < 0.05 (Fig. 1b).

The most downregulated genes with a p-value < 0.05 and a log2FoldChange lower than − 5 were FLG2, LCE2B, UGT1A6, KRT1, UGT2B7, PSAPL1, KRTDAP, C1orf68, LOR, SPRR2E, KPRP, LST1, AC116533.1, and AC244489.2. Meanwhile, highly expressed genes with a log2FoldChange > 5 and a p-value < 0.05 were PVALB, CRIM1, IGHV3-43, CHRNA4, HLA-DQA2, ARHGAP11A, MUC2.

Pathways Differentially Modulated in Resistance to Neoadjuvant Chemotherapy

To highlight the physiological processes that could be related to neoadjuvant chemotherapy resistance, we performed a KEGG PATHWAY and Ontology (GO) enrichment analysis from DEGs with a p-value < 0.05. The GO analysis results showed that cellular components modulated in resistant patients belong to the extracellular region, including extracellular matrix, cell periphery, and intrinsic/extrinsic components of the plasma membrane (Fig. 2). The molecular functions of those DEGs were related to extracellular matrix structural constituent, binding function, receptor activity, protein binding (collagen, integrin, growth factor, calcium ion, and signal receptor binding), transmembrane receptor protein kinase activity, and metallopeptidase activity (Fig. 2).

Besides, the biological processes with more DEGs were cell and tissue development, extracellular matrix assembly, positive regulation of epithelial to mesenchymal transition, regulation of cellular response to

Page 3/26 transforming growth factor-beta receptor signaling pathway, homophilic cell adhesion via plasma membrane adhesion molecules, epithelium development, regulation of angiogenesis and cell migration (Fig. 2).

The KEGG pathways analysis revealed that pathways with more DEGs were related to the regulation of transcription, cell proliferation, and signaling transduction, furthermore, there were signaling pathways differentially modulated among the frst 15 KEGG, among those, PI3K-Akt, MAPK, Calcium, TGF-beta, Ras, cAMP and JAK-STAT signaling pathways (Fig. 3)(Supplementary table 2). Protein-Protein Interaction Networks Functional Enrichment Analysis of DEGs

DEGs with a value for the p-adj lower than 0.05 were further analyzed by STRING to know the possible interactions among their protein products. A total of 73 genes were included in this last analysis, of them, only 68 were incorporated and used to build the interactions matrix. We focus our analysis on those protein interactions with a minimum confdence score of 0.7. The confdence score is defned as the approximate probability that a predicted link exists between two proteins in the same metabolic map of the KEGG database (its range is from 0 to 1), with a higher score the number of interactions diminishes among the query proteins, but those interactions are more probable to be real. A lower score results in a great number of interactions but also more false positives. After this analysis, 18 edges (protein-protein interaction) were reported to have a high confdence score (0.7), with 5 clusters identifed. The proteins of the largest cluster included COL21A1, COL3A1, MMP2, DCN, VCAN, and FBLN2, and all belong to the category of extracellular matrix organization of the Reactome pathways. The next cluster grouped the proteins LCE2B, LOR, FLG, KRT1, KRT15, and KRT2, all of them related to keratinocyte differentiation, epithelial cell differentiation, the formation of the cornifed envelope, and anatomical structure development. The third cluster was made up of CLDN1 and CLDN8, which are part of the plasma membrane. HLA-A and HLA-B form a cluster that contributes to cellular response to chemical stimulus. Finally, the last cluster was composed of UGT1A7 and UGT1A6 that are part of drug metabolism (cytochrome P450). Furthermore, there are proteins that, although do not interact with each other, belong to the plasmatic membrane and anatomical structure development (Fig. 4). DEGs in resistant patients were associated with better Overall Survival

To see if there was a relationship between the DEGs observed in resistant and sensitive patients, and the overall survival of patients with breast cancer, the previously selected DEGs were used to perform a comparative analysis against data stored in the database of cBioPortal. We selected the METABRIC, Nature 2012 & Nat Commun 2016 database that has information about the gene expression and overall survival data from 2509 invasive breast carcinoma patients. Nevertheless, we selected 1904 patients for the analysis that also had Illumina sequencing data. The data about patient status (alive or dead) was

Page 4/26 taken with a cut-off of 60 months and over- or sub-expression for each analyzed gene was set up at a z- score threshold >|1|. The results showed that the group of patients with overexpression (z-score > 1 and LogRank < 0.05) of TCP1 and TUBB had lower overall survival (Fig. 5a). Additionally, subexpression (z- score < -1 and LogRank < 0.05) of C1QTNF3, CTF1, OLFML3, PLA2R1, and PODN was also related to lower overall survival (Fig. 5b). In none of the analyzed genes the percentage at risk of 50% was reached.

Discussion

Neoadjuvant chemotherapy is the standard treatment for patients with locally advanced breast cancer, its application has the advantage of diminishing the volume of the principal tumor making it easier to be removed by surgery, besides, axillary disease, micrometastases, and circulating tumor cells are expected to be also eliminated5. The major drawback of this therapeutic approach is that a large percentage of patients develop chemoresistance and this complicates their treatment6. Also, there are controversial reports about the negative effect of this therapy, indicating that an increase in metastasis was related to its application7. Given the heterogeneity of the breast cancer disease, new therapeutic strategies suggest the individualization of the treatment8; however, this has been hampered by the high cost of current diagnostic and prognostic panels, most of which have been developed for specifc molecular subtypes. In this sense, it is important to fnd new prognostic tools that can be applied to all subtypes of breast cancer. Genomic approaches have been proven to highlight differences among patients with distinct conditions. The application of genomic strategies in breast cancer allowed to discriminate patients in the early stages of the disease, to prognosticate metastasis to lymph nodes in triple-negative patients, and to predict the response to adjuvant chemotherapy in endocrine responsive patients9,10. Analysis of gene expression has allowed classifying breast cancer in different molecular subtypes, in the same way, among the molecular subtypes, the expression of hormonal receptors have been of special utility for the prognosis and prediction of response to therapy8,11. In this work, the analysis of RNAseq data of resistance and sensitive patients to neoadjuvant chemotherapy, regardless of their molecular classifcation, indicated a great number of differentially expressed genes. This highlights that there are notable differences at the level of gene expression between these two conditions and there could be indicators of chemotherapy resistance and overall survival.

After performing a Enrichment Analysis, we identify that most DEGs belong to cellular components related principally to the membrane and the extracellular region, which are processes that involve many receptors responsible for sending signals of proliferation, survival or apoptosis, and signals to metastasis12,13. Likewise, the KEGG analysis indicated some pathways that could be involved in resistance. One of these was the PI3K/AKT pathway which can induce mTOR (mammalian Target of Rapamycin) stimulation leading to sustained proliferative signals. Activation of PI3K/AKT has been related to breast cancer and currently, there is a lot of interest in the development of inhibitors of this pathway to target breast cancer14. Besides, PI3K/AKT/mTOR pathway upregulation has been linked to chemotherapy and radiotherapy resistance15 and its inhibition overcome drug resistance in breast cancer cells16,17. Another pathway suggested by our analysis was the MAPK pathway. The activation of the

Page 5/26 MAPK pathway in breast cancer is promoted by the stimulation of EGFR and HER2 receptors and contributes to drug resistance, cancer cell survival, and invasion18,19, in addition, the MAPK pathway plays a role in the maintenance of the breast cancer stem cells and the promotion of epithelial-to- mesenchymal transition (EMT) a process necessary for tumor migration and the development of distant metastases20. Breast cancer stem cells have been linked as responsible for the development of chemoresistance21.

An additional pathway suggested by the KEGG analysis was the TGF-β beta pathway, which is well known to act as a double edge sword during cancer development. TGF-β is a potent tumor suppressor but also enhances invasiveness and metastasis by inducing epithelial-to-mesenchymal transition (EMT)22. The loss of MED12, a regulator of the TGF-β signaling pathway, has been related to chemoresistance in lung cancer and colon cancer23, besides, activation of TGF-β compensate the action of tyrosine kinase inhibitors (anticancer agents) by inducing the MAPK pathway24. In hepatocellular carcinoma, TGF-β signaling contributes to drug resistance by inducing the expression of PXR25 and in squamous cell carcinoma, the inhibition of TGF-β results in a chemosensitive phenotype26. In breast cancer, the determination of protein expression levels of TGF-β pathway components has been suggested to be of utility for the prognosis and to identify patients at increased risk for disease recurrence27.

The interactome analysis showed 3 principal clusters related to extracellular matrix organization, keratinocyte differentiation, and drug metabolism. The importance of those pathways in cancer has been documented. Proteins of the extracellular matrix have a role in cancer activating pathways related to growth, proliferation, and metastasis28,29. The proteins involved in keratinocyte differentiation as keratins have been documented to play a role in invasion, and endothelial-mesenchymal transition30,31. And the third cluster contains genes related to drug metabolism, specifcally, overexpression of CYP450 components that have been related to drug resistance in breast cancer32.

Genes downregulated in the resistance group were also observed to have low expression in the group of patients with lower overall survival. Among the genes related to overall survival, the CTF1 gene that codifes for a cytokine of the gp130 group and that is mainly expressed in heart, skeletal muscle, kidneys, lung, and liver33, was observed with low expression in the resistant group and the lower overall survival group. The functions of CTF1 have been related to activation of the JAK-STAT pathway, promotion of angiogenesis, and activation of cellular proliferation34. Its role in cancer was related to cell growth and IL- 6 positive regulation35, however, a relationship with drug response or chemotherapy resistance mechanisms has not been reported so far.

The OLFML3 (Olfactomedin-like 3) gene also had a low expression in the group of resistance as well as in the lower overall survival group. The functions and mechanisms in which this gene is involved remain unclear, although it has been related to embryonic development36. Regarding its role in cancer, it has been proposed that the expression of this gene increases in the tumor stroma, and in the epithelial- mesenchymal transition process37,38, furthermore, it is the target of miR-155 and BRMS1. The frst has

Page 6/26 been associated with cell cycle, proliferation and myogenic differentiation39, and the latter with the inhibition of breast cancer cell metastasis through the recruitment of LSD2/CoREST complex and the inhibition of OLFML3 expression38. In this sense, low expression levels of OLFML3 could be associated with less metastasis potential, we could not analyze this hypothesis since the samples included in this study were not from breast cancer patients with metastasis. In addition, changes in the expression of BRMS1 were not observed between the study groups.

The expression of PODN (Podocan) was also observed diminished in both resistant patients and the lowest overall survival group. This gene blocks proliferation in human bladder smooth muscle cells40 and its overexpression has been associated with an increase of p21 expression and a decrease of cdk2 leading to the arrest of cell cycle and the inhibition of the G1 to S phase transition40,41.

An interesting gene that regulates the TGF-β pathway and that was observed with low expression in resistant patients and patients with poor overall survival was the C1QTNF3 (C1q and tumor necrosis factor-related protein 3). This gene has been related to stimulation of cell proliferation and induction of antiapoptotic molecules expression in prostate cancer cells by mediating the activation of the PKC signaling pathway42. Hofmann et al. found that C1QTNF3 has antifbrotic effects by inhibiting TGF-β beta production43, In this work, KEGG analysis points out the possible involvement of TGF-β beta pathway in the mechanisms of chemotherapy resistance. Low expression of C1QTNF3 in the resistant group could lead to the activation of the TFG-beta pathway, which in turn could increase proliferative signals resulting in tumor progression44.

Likewise, the PLA2R1 (Phospholipase A2 receptor 1) gene was observed with low expression in the resistant group and the group with less overall survival. This gene has tumor-suppressor activity by promoting apoptosis and blocking transformation45, and its down-modulation has been observed in most cancer types46. The knockdown of this gene in prostate cancer cells increases cell proliferation but does not affect their sensitivity to docetaxel47. In breast cancer, negative regulation of this gene was reported in all molecular subtypes and promoter hypermethylation of this gene has been associated with aggressive subtypes48. Regarding its role in chemotherapy, PLA2R1 regulates JAK/STAT signaling and the targeting of PLA2R1 overcomes senescence49. All those reports indicate that this gene could play an important role in the resistance and overall survival.

On the other hand, two genes were highly expressed in resistance and low overall survival patients, TUBB (tubulin-β), and TCP1 (T-complex protein 1). Tubulin-β beta together with Tubulin-α forms a heterodimer that is the component of microtubules50,51. TUBB is necessary for microtubules polymerization and depolymerization to spindle formation during mitosis52. It has been proposed that resistance to taxanes, microtubule-stabilizing agents, depends on tubulins mutations and the content of the tubulin isoforms53. Nokimura et al. evaluated the expression of tubulin beta by histochemistry in samples of patients that received anthracycline and taxane as chemotherapy, they report that more than 10% of positivity for this protein was associated with taxane responders and less than 10% of positivity was a characteristic of

Page 7/26 taxane non-responders54, also, in their work, they showed evidence about the probable use of tubulin beta as a prognostic marker for overall survival. In our work, we found high expression of TUBB in resistant patients, this contradictory result could be explained because we measured RNA levels instead of protein, high levels of RNA could be an indication of a compensatory mechanism after protein diminution, however, this hypothesis needs to be addressed. On the other hand, TCP1 is a protein involved in the folding of cytoskeletal proteins, among those, tubulin. The interruption of the Tubulin-β/TCP-1 axis leads to cellular apoptosis through caspase-dependent signaling55. TCP1 has been reported to play a role in the maturation of Cycline E, which in turn activates the Cdk2 to positively control the G1/S phase transition56. Overexpression of TCP-1 has been observed in multidrug-resistance uterine cancer cells and colorectal cancer cells57,58. In breast cancer patients has been reported that amplifcation and/or overexpression of TCP1 correlates with reduced overall survival measured at 300 months. In our analysis, it was observed a similar pattern but with a cut-off of 60 months, a time that is more useful for clinicians 59.

In conclusion, this work highlights differences in the level of gene expression in patients resistant and sensitive to neoadjuvant chemotherapy. These differences further indicated that cellular components related to the extracellular region and plasma membrane were mainly involved. Furthermore, 73 DEGs were able to discriminate against patients resistant and sensitive to neoadjuvant chemotherapy. Those DEGS could be used as possible biomarkers of response to chemotherapy regardless of the molecular subtype, and 7 of them were able to predict overall survival. More studies are needed to test these putative biomarkers in a large number of breast cancer patients.

Methodology

Patient Eligibility and Selection

Patients aged 18 years and older with a diagnosis of breast cancer, candidates to receive neoadjuvant chemotherapy with tumor size > 2 cm and/or positive nodes, without previous therapy against cancer. All molecular subtypes were included in the analysis. Patients with metastatic cancer, insufcient breast cancer biopsy tissue for pathological analysis, or RNA extraction were excluded. All participants provided written informed consent before enrollment. The study was approved by the Ethical and Research Committee of the Instituto Mexicano del Seguro Social (IMSS) (number R-2013-785-061). All procedures performed in this study were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

Treatment Plan and Study Design

Patients with breast cancer candidates to receive neoadjuvant chemotherapy were recruited at the Servicio de Oncología of Centro Médico Nacional de Occidente of IMSS. Patients were asked to participate and signed informed consent; thereafter a breast biopsy was taken for RNA extraction and sequencing. Neoadjuvant chemotherapy consisted of 8 cycles of taxanes plus anthracyclines for a period

Page 8/26 of 6 months. The chemotherapy treatment started with 4 cycles of Doxorubicin (60 mg/m2) or epirubicin plus cyclophosphamide (600 mg/m2) every 3 weeks followed by either paclitaxel (90 mg/m2) or docetaxel (100 mg/m2) every 3 weeks for 12 weeks. After chemotherapy conclusion, another breast biopsy was taken only when was needed as part of their treatment, this last biopsy was used by the pathology department to evaluate the response to treatment. Patients with residual tumor were assigned to the resistant group while patients with pathologic complete response (pCR), were assigned to the sensitive group. The pCR was set as the (absence of residual invasive and in situ cancer of the complete resected breast specimen and all sampled regional lymph nodes following completion of neoadjuvant systemic therapy)

RNA extraction, QC, library preparation and sequencing.

Once the biopsy was taken, it was immediately submerged into 1 ml of RNAlater® RNA Stabilization Reagent (Qiagen, Cat No. 76104) and incubated overnight in the reagent at 2–8 °C, then was transferred to − 80 °C for storage until processing. For RNA extraction the RNeasy Plus Mini Kit (Qiagen, Cat No. 74136) was used according to the indications of the manufacturer. Total RNA integrity was determined by using the Agilent 2100 Bioanalyzer Instrument and the Agilent RNA 6000 Nano Kit (Agilent, Cat No. 5067 − 1511).

Extracted RNAs were sequenced by the Beijing Genomics Institute (BGI Genomics), briefy, total RNA (200 ng) was purifed with oligo-dT beads for the obtention of mRNA, then it was fragmented into small pieces with Fragment Buffer. mRNA was converted into cDNA using SuperScript™ III First-Strand Synthesis SuperMix (No. Cat, 18080400) (Reaction condition: 25 °C for 10 min;42 °C for 50 min༛70 °C for 15 min) and Second Strand Master Mix. End Repair Mix was added (30 °C for 30 min) and cDNA was purifed with Ampure XP Beads (AGENCOURT). Poly-A tail addition was performed by adding an A-Tailing Mix (37 °C for 30 min). Adapters were added to cDNA by combining Adenylate 3'Ends DNA with RNA Index Adapter and Ligation Mix (30 °C for 10 min). PCR amplifcation was performed with PCR Primer Cocktail and PCR Master Mix to enrich the cDNA fragments. Then the PCR products were purifed with Ampure XP Beads (AGENCOURT). Validation of the library was performed in two steps, frst, the average molecule length was determined using the Agilent 2100 Bioanalyzer Instrument and the Agilent DNA 1000 Reagents (Agilent, Cat No. 5067 − 1504), and second, a quantifcation of the library using TaqMan Probes (Applied Bioscience) and real-time quantitative PCR (q-PCR) was done. For sequencing, the qualifed and quantifed libraries were used, frstly an amplifcation was performed within the fow cell on the cBot instrument for cluster generation (HiSeq® 4000 PE Cluster Ki, Illumina). Then, the clustered fow cell was loaded onto the HiSeq4000-Sequencer for paired-end sequencing (HiSeq® 4000 SBS Kit, Illumina) with recommended read lengths 150 bp.

Analysis of Differentially Expressed Genes.

For fltration and remotion of Illumina adapters from the reads, we used Flexbar. The alignment of the clean short reads was mapped using Kallisto and the GRCh38 version of the as a reference index for the generation of abundance tables. The differentially expressed genes were identifed Page 9/26 using the software package DESeq2 in R 3.5.1 “Feather Spray” (R Core Team, Vienna, Austria. URL https://www.R-project.org/). A false discovery rate (FDR) or p-adjust < 0.05 with the Benjamini-Hochberg method was used for further analysis. The DEG analysis was performed based on the chemotherapy response setting up the sensitive group as reference. The raw and processed data were submitted to Gene Expression Omnibus (GEO) repository with the number GSE162187.

Functional Gene Analysis

Gene Ontology analysis of DEGs was performed to identify characteristic biological attributes stratifed into categories. Kyoto Encyclopedia Gene and Genome (KEGG) analysis of the DEGs was carried out online with the bioinformatic tool KEGG Mapper. A cut-off p < 0.05 was set up as a criterion for enrichment analysis.

Functional Protein Association Analysis

Interactome analysis of p-adj < 0.5 DEGs was performed by using String version 11.0. A high confdence score (the approximate probability that a predicted link exists between two enzymes in the same metabolic map in the KEGG database) > 0.7 was set up. The clusters were identifed by interconnected nodes. Finally, we selected the pathways among GO, KEGG, and Reactome Pathways as well as UniProt Keywords in which the evaluated proteins participate.

Overall Survival Analysis

The research was performed in cBioPortal for Cancer Genomics (https://www.cbioportal.org/) which provides visualization, analysis, and download of large-scale cancer genomics data sets. In this sense, we selected the dataset composed by METABRIC, Nature 2012 & Nat Commun 201660 for performing overall survival analysis. Samples with an mRNA expression z-score threshold of >|1| were selected to be analyzed. A cut-off of 60 months was setting up for the overall survival analysis. DEGs with an adjusted p-value < 0.05 were analyzed. A LogRank < 0.05 was taken as signifcant.

Declarations

Acknowledgements

This work was fnancial supported by Fondo de Investigación en Salud - IMSS (FIS/IMSS/PROT/PRIO/14/030 to LFJ-S). CAB-G is grateful for a scholarship from Consejo Nacional de Ciencia y Tecnología (CONACyT)- Mexico

Author contributions

CAB-G contributed to the caption of patients, sample processing, bioinformatic analysis, and writing of the paper, MG-C Contributed with sample collection, and processing, AJM-M, RD-C and MGM-S contributed with patient recruiting, biopsies collection, biopsies analysis, and the follow up of the

Page 10/26 patients. AA-L and LFJ-S conceived the study, advised, analyzed the results and wrote and revised the manuscript.

Competing Interests Statement

The authors declare no competing interests.

References

1 Bray, F. et al. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 68, 394-424, doi:10.3322/caac.21492 (2018).

2 Guiu, S. et al. Pathological response and survival after neoadjuvant therapy for breast cancer: a 30-year study. Breast 22, 301-308, doi:10.1016/j.breast.2012.07.012 (2013).

3 Krishnan, Y., Al Awadi, S., Sreedharan, P. S., Sujith Nair, S. & Thuruthel, S. Analysis of neoadjuvant therapies in breast cancer with respect to pathological complete response, disease-free survival and overall survival: 15 years follow-up data from Kuwait. Asia Pac J Clin Oncol 12, e30-37, doi:10.1111/ajco.12118 (2016).

4 Xin, L., Liu, Y. H., Martin, T. A. & Jiang, W. G. The Era of Multigene Panels Comes? The Clinical Utility of Oncotype DX and MammaPrint. World J Oncol 8, 34-40, doi:10.14740/wjon1019w (2017).

5 Early Breast Cancer Trialists' Collaborative, G. Long-term outcomes for neoadjuvant versus adjuvant chemotherapy in early breast cancer: meta-analysis of individual patient data from ten randomised trials. Lancet Oncol 19, 27-39, doi:10.1016/S1470-2045(17)30777-5 (2018).

6 Ji, X. et al. Chemoresistance mechanisms of breast cancer and their countermeasures. Biomed Pharmacother 114, 108800, doi:10.1016/j.biopha.2019.108800 (2019).

7 Karagiannis, G. S. et al. Neoadjuvant chemotherapy induces breast cancer metastasis through a TMEM-mediated mechanism. Sci Transl Med 9, doi:10.1126/scitranslmed.aan0026 (2017).

8 Foulon, A., Theret, P., Rodat-Despoix, L. & Kischel, P. Beyond Chemotherapies: Recent Strategies in Breast Cancer Treatment. Cancers (Basel) 12, doi:10.3390/cancers12092634 (2020).

9 Tan, W. et al. Construction of an immune-related genes nomogram for the preoperative prediction of axillary lymph node metastasis in triple-negative breast cancer. Artif Cells Nanomed Biotechnol 48, 288-297, doi:10.1080/21691401.2019.1703731 (2020).

10 Albain, K. S., Paik, S. & van't Veer, L. Prediction of adjuvant chemotherapy beneft in endocrine responsive, early breast cancer using multigene assays. Breast 18 Suppl 3, S141-145, doi:10.1016/S0960-9776(09)70290-5 (2009).

Page 11/26 11 Brufsky, A. M. & Dickler, M. N. Estrogen Receptor-Positive Breast Cancer: Exploiting Signaling Pathways Implicated in Endocrine Resistance. Oncologist 23, 528-539, doi:10.1634/theoncologist.2017- 0423 (2018).

12 Lugo-Cintron, K. M. et al. Breast Fibroblasts and ECM Components Modulate Breast Cancer Cell Migration Through the Secretion of MMPs in a 3D Microfuidic Co-Culture Model. Cancers (Basel) 12, doi:10.3390/cancers12051173 (2020).

13 R, M. B.-C. et al. Extracellular Matrix Derived from High Metastatic Human Breast Cancer Triggers Epithelial-Mesenchymal Transition in Epithelial Breast Cancer Cells through alphavbeta3 Integrin. Int J Mol Sci 21, doi:10.3390/ijms21082995 (2020).

14 Costa, B., Amorim, I., Gartner, F. & Vale, N. Understanding Breast cancer: from conventional therapies to repurposed drugs. Eur J Pharm Sci 151, 105401, doi:10.1016/j.ejps.2020.105401 (2020).

15 Wanigasooriya, K. et al. Radiosensitising Cancer Using Phosphatidylinositol-3-Kinase (PI3K), Protein Kinase B (AKT) or Mammalian Target of Rapamycin (mTOR) Inhibitors. Cancers (Basel) 12, doi:10.3390/cancers12051278 (2020).

16 Mei, Y., Liao, X., Zhu, L. & Yang, H. Overexpression of RSK4 reverses doxorubicin resistance in human breast cancer cells via PI3K/AKT signalling pathway. J Biochem 167, 603-611, doi:10.1093/jb/mvaa009 (2020).

17 Hu, Y. et al. Effects of PI3K inhibitor NVP-BKM120 on overcoming drug resistance and eliminating cancer stem cells in human breast cancer cells. Cell Death Dis 6, e2020, doi:10.1038/cddis.2015.363 (2015).

18 Burotto, M., Chiou, V. L., Lee, J. M. & Kohn, E. C. The MAPK pathway across different malignancies: a new perspective. Cancer 120, 3446-3456, doi:10.1002/cncr.28864 (2014).

19 Yousefnia, S. et al. Mechanistic Pathways of Malignancy in Breast Cancer Stem Cells. Front Oncol 10, 452, doi:10.3389/fonc.2020.00452 (2020).

20 Leontovich, A. A. et al. Raf-1 oncogenic signaling is linked to activation of mesenchymal to epithelial transition pathway in metastatic breast cancer cells. Int J Oncol 40, 1858-1864, doi:10.3892/ijo.2012.1407 (2012).

21 De Angelis, M. L., Francescangeli, F. & Zeuner, A. Breast Cancer Stem Cells as Drivers of Tumor Chemoresistance, Dormancy and Relapse: New Challenges and Therapeutic Opportunities. Cancers (Basel) 11, doi:10.3390/cancers11101569 (2019).

22 Katsuno, Y., Lamouille, S. & Derynck, R. TGF-beta signaling and epithelial-mesenchymal transition in cancer progression. Curr Opin Oncol 25, 76-84, doi:10.1097/CCO.0b013e32835b6371 (2013).

Page 12/26 23 Huang, S. et al. MED12 controls the response to multiple cancer drugs through regulation of TGF- beta receptor signaling. Cell 151, 937-950, doi:10.1016/j.cell.2012.10.035 (2012).

24 Brunen, D. et al. TGF-beta: an emerging player in drug resistance. Cell Cycle 12, 2960-2968, doi:10.4161/cc.26034 (2013).

25 Bhagyaraj, E. et al. TGF-beta induced chemoresistance in liver cancer is modulated by xenobiotic nuclear receptor PXR. Cell Cycle 18, 3589-3602, doi:10.1080/15384101.2019.1693120 (2019).

26 Brown, J. A. et al. TGF-beta-Induced Quiescence Mediates Chemoresistance of Tumor-Propagating Cells in Squamous Cell Carcinoma. Cell Stem Cell 21, 650-664 e658, doi:10.1016/j.stem.2017.10.001 (2017).

27 de Kruijf, E. M. et al. The prognostic role of TGF-beta signaling pathway in breast cancer patients. Ann Oncol 24, 384-390, doi:10.1093/annonc/mds333 (2013).

28 Insua-Rodriguez, J. & Oskarsson, T. The extracellular matrix in breast cancer. Adv Drug Deliv Rev 97, 41-55, doi:10.1016/j.addr.2015.12.017 (2016).

29 Pickup, M. W., Mouw, J. K. & Weaver, V. M. The extracellular matrix modulates the hallmarks of cancer. EMBO Rep 15, 1243-1253, doi:10.15252/embr.201439246 (2014).

30 Kim, H. J., Choi, W. J. & Lee, C. H. Phosphorylation and Reorganization of Keratin Networks: Implications for Carcinogenesis and Epithelial Mesenchymal Transition. Biomol Ther (Seoul) 23, 301-312, doi:10.4062/biomolther.2015.032 (2015).

31 Karantza, V. Keratins in health and cancer: more than mere epithelial cell markers. Oncogene 30, 127-138, doi:10.1038/onc.2010.456 (2011).

32 Azzariti, A., Porcelli, L., Quatrale, A. E., Silvestris, N. & Paradiso, A. The coordinated role of CYP450 enzymes and P-gp in determining cancer resistance to chemotherapy. Curr Drug Metab 12, 713-721, doi:10.2174/138920011798357042 (2011).

33 Lopez-Yoldi, M., Moreno-Aliaga, M. J. & Bustos, M. Cardiotrophin-1: A multifaceted cytokine. Cytokine Growth Factor Rev 26, 523-532, doi:10.1016/j.cytogfr.2015.07.009 (2015).

34 Yang, Z. F. et al. Cardiotrophin-1 enhances regeneration of cirrhotic liver remnant after hepatectomy through promotion of angiogenesis and cell proliferation. Liver Int 28, 622-631, doi:10.1111/j.1478-3231.2008.01687.x (2008).

35 Robledo, O. et al. Regulation of interleukin 6 expression by cardiotrophin 1. Cytokine 9, 666-671, doi:10.1006/cyto.1997.0220 (1997).

Page 13/26 36 Zeng, L. C., Han, Z. G. & Ma, W. J. Elucidation of subfamily segregation and intramolecular coevolution of the olfactomedin-like proteins by comprehensive phylogenetic analysis and gene expression pattern assessment. FEBS Lett 579, 5443-5453, doi:10.1016/j.febslet.2005.08.064 (2005).

37 Torres, S. et al. Proteome profling of cancer-associated fbroblasts identifes novel proinfammatory signatures and prognostic markers for colorectal cancer. Clin Cancer Res 19, 6006-6019, doi:10.1158/1078-0432.CCR-13-1130 (2013).

38 Qiu, R. et al. BRMS1 coordinates with LSD1 and suppresses breast cancer cell metastasis. Am J Cancer Res 8, 2030-2045 (2018).

39 Zhao, S. et al. OLFML3 expression is decreased during prenatal muscle development and regulated by microRNA-155 in pigs. Int J Biol Sci 8, 459-469, doi:10.7150/ijbs.3821 (2012).

40 Sun, Y. et al. MiR 3180-5p promotes proliferation in human bladder smooth muscle cell by targeting PODN under hydrodynamic pressure. Sci Rep 6, 33042, doi:10.1038/srep33042 (2016).

41 Shimizu-Hirota, R., Sasamura, H., Kuroda, M., Kobayashi, E. & Saruta, T. Functional characterization of podocan, a member of a new class in the small leucine-rich repeat protein family. FEBS Lett 563, 69-74, doi:10.1016/S0014-5793(04)00250-9 (2004).

42 Hou, Q. et al. RankProd Combined with Genetic Algorithm Optimized Artifcial Neural Network Establishes a Diagnostic and Prognostic Prediction Model that Revealed C1QTNF3 as a Biomarker for Prostate Cancer. EBioMedicine 32, 234-244, doi:10.1016/j.ebiom.2018.05.010 (2018).

43 Hofmann, C. et al. C1q/TNF-related protein-3 (CTRP-3) is secreted by visceral adipose tissue and exerts antiinfammatory and antifbrotic effects in primary human colonic fbroblasts. Infamm Bowel Dis 17, 2462-2471, doi:10.1002/ibd.21647 (2011).

44 Katz, L. H. et al. Targeting TGF-beta signaling in cancer. Expert Opin Ther Targets 17, 743-760, doi:10.1517/14728222.2013.782287 (2013).

45 Bernard, D. & Vindrieux, D. PLA2R1: expression and function in cancer. Biochim Biophys Acta 1846, 40-44, doi:10.1016/j.bbcan.2014.03.003 (2014).

46 Vindrieux, D. et al. PLA2R1 mediates tumor suppression by activating JAK2. Cancer Res 73, 6334- 6345, doi:10.1158/0008-5472.CAN-13-0318 (2013).

47 Quach, N. D. et al. Role of the phospholipase A2 receptor in liposome drug delivery in prostate cancer cells. Mol Pharm 11, 3443-3451, doi:10.1021/mp500174p (2014).

48 Mitwally, N., Yousef, E., Abd Al Aziz, A. & Taha, M. Clinical Signifcance of Expression Changes and Promoter Methylation of PLA2R1 in Tissues of Breast Cancer Patients. Int J Mol Sci 21, doi:10.3390/ijms21155453 (2020).

Page 14/26 49 Griveau, A., Wiel, C., Ziegler, D. V., Bergo, M. O. & Bernard, D. The JAK1/2 inhibitor ruxolitinib delays premature aging phenotypes. Aging Cell 19, e13122, doi:10.1111/acel.13122 (2020).

50 Parker, A. L., Teo, W. S., McCarroll, J. A. & Kavallaris, M. An Emerging Role for Tubulin Isotypes in Modulating Cancer Biology and Chemotherapy Resistance. Int J Mol Sci 18, doi:10.3390/ijms18071434 (2017).

51 Downing, K. H. & Nogales, E. Crystallographic structure of tubulin: implications for dynamics and drug binding. Cell Struct Funct 24, 269-275, doi:10.1247/csf.24.269 (1999).

52 Gadadhar, S., Bodakuntla, S., Natarajan, K. & Janke, C. The tubulin code at a glance. J Cell Sci 130, 1347-1353, doi:10.1242/jcs.199471 (2017).

53 McGrogan, B. T., Gilmartin, B., Carney, D. N. & McCann, A. Taxanes, microtubules and chemoresistant breast cancer. Biochim Biophys Acta 1785, 96-132, doi:10.1016/j.bbcan.2007.10.004 (2008).

54 Norimura, S. et al. Candidate biomarkers predictive of anthracycline and taxane efcacy against breast cancer. J Cancer Res Ther 14, 409-415, doi:10.4103/jcrt.JCRT_1053_16 (2018).

55 Liu, Y. J., Chang, Y. J., Kuo, Y. T. & Liang, P. H. Targeting beta-tubulin/CCT-beta complex induces apoptosis and suppresses migration and invasion of highly metastatic lung adenocarcinoma. Carcinogenesis 41, 699-710, doi:10.1093/carcin/bgz137 (2020).

56 Won, K. A., Schumacher, R. J., Farr, G. W., Horwich, A. L. & Reed, S. I. Maturation of human cyclin E requires the function of eukaryotic chaperonin CCT. Mol Cell Biol 18, 7584-7589, doi:10.1128/mcb.18.12.7584 (1998).

57 Lin, Y. F., Tsai, W. P., Liu, H. G. & Liang, P. H. Intracellular beta-tubulin/chaperonin containing TCP1- beta complex serves as a novel chemotherapeutic target against drug-resistant tumors. Cancer Res 69, 6879-6888, doi:10.1158/0008-5472.CAN-08-4700 (2009).

58 Coghlin, C. et al. Characterization and over-expression of chaperonin t-complex proteins in colorectal cancer. J Pathol 210, 351-357, doi:10.1002/path.2056 (2006).

59 Guest, S. T., Kratche, Z. R., Bollig-Fischer, A., Haddad, R. & Ethier, S. P. Two members of the TRiC chaperonin complex, CCT2 and TCP1 are essential for survival of breast cancer cells and are linked to driving oncogenes. Exp Cell Res 332, 223-235, doi:10.1016/j.yexcr.2015.02.005 (2015).

60 Pereira, B. et al. The somatic mutation profles of 2,433 breast cancers refnes their genomic and transcriptomic landscapes. Nat Commun 7, 11479, doi:10.1038/ncomms11479 (2016).

Figures

Page 15/26 Figure 1

Distribution of DEGs by p-value and Log2-FoldChange. The DEGs were plotted in the graph in which the x- axis represents the Log2 Fold Change and the y axis the Log10 of the p-value. The sensitive group was set up as a reference for comparison. a) DEGs with a p<0.5. b) DEGs fltered by a padj<0.05. Red color indicates overexpressed genes, blue color indicates subexpressed genes. C1orf68 and LCE2B genes are not showed in the plot since its Log2FoldChange is lower than the limit of the x-axis.

Page 16/26 Figure 1

Distribution of DEGs by p-value and Log2-FoldChange. The DEGs were plotted in the graph in which the x- axis represents the Log2 Fold Change and the y axis the Log10 of the p-value. The sensitive group was set up as a reference for comparison. a) DEGs with a p<0.5. b) DEGs fltered by a padj<0.05. Red color indicates overexpressed genes, blue color indicates subexpressed genes. C1orf68 and LCE2B genes are not showed in the plot since its Log2FoldChange is lower than the limit of the x-axis.

Page 17/26 Figure 2

GO analysis of DEGs. A total of 1985 DEGs with a p<0.05 were analyzed in the Pantherdb tool. The light- gray bars show the total genes reported for each of the GO terms, dark-gray bars are the number of DEGs, and the colored bars indicate the expected number of genes for each GO term in normal conditions. The GO categories of biological process, cellular component, and molecular functions are colored by green,

Page 18/26 blue, and red respectively. The y-left axis lists the enrichment terms (FDR<0.05) for the 3 GO categories, the x-axis represents the Log2 of the DEG numbers.

Figure 2

GO analysis of DEGs. A total of 1985 DEGs with a p<0.05 were analyzed in the Pantherdb tool. The light- gray bars show the total genes reported for each of the GO terms, dark-gray bars are the number of DEGs, and the colored bars indicate the expected number of genes for each GO term in normal conditions. The

Page 19/26 GO categories of biological process, cellular component, and molecular functions are colored by green, blue, and red respectively. The y-left axis lists the enrichment terms (FDR<0.05) for the 3 GO categories, the x-axis represents the Log2 of the DEG numbers.

Figure 3

KEGG analysis of DEGs. A total of 1985 DEGs with a p<0.05 were analyzed in the KEGG mapper tool. On the x-axis is represented the number of DEGs, on the y-axis the name of the pathways. Red color means overexpressed genes and blue color means subexpressed genes in the chemotherapy resistance group.

Page 20/26 Figure 3

KEGG analysis of DEGs. A total of 1985 DEGs with a p<0.05 were analyzed in the KEGG mapper tool. On the x-axis is represented the number of DEGs, on the y-axis the name of the pathways. Red color means overexpressed genes and blue color means subexpressed genes in the chemotherapy resistance group.

Page 21/26 Figure 4

Interactome analysis. A total of 73 DEGs with a padj<0.05 were analyzed in the STRING tool. Every node (circle) represents the protein encoded by every gene. The colors indicate the functional enrichments of the node in GO, KEGG, or Reactome pathways. The edges (lines) represent the type of evidence reported in the literature for de interaction of proteins. A confdence value of 0.7 was used to establish the interactions.

Page 22/26 Figure 4

Interactome analysis. A total of 73 DEGs with a padj<0.05 were analyzed in the STRING tool. Every node (circle) represents the protein encoded by every gene. The colors indicate the functional enrichments of the node in GO, KEGG, or Reactome pathways. The edges (lines) represent the type of evidence reported in the literature for de interaction of proteins. A confdence value of 0.7 was used to establish the interactions.

Page 23/26 Figure 5

Overall Survival (OS) Analysis of DEGs. A total of 73 DEGs with a padj<0.05 were analyzed in the database of (METABRIC, Nature 2012 & Nat Commun 2016) with a cut off time of 60 months and a z- score expression >|1|. From the query genes, only 7 genes had a log-rank<0.05. a) OS of TCP1 and TUBB; b) OS of C1QTNF3, CTF1, OLFML3, PLA2R1, and PODN. The blue color indicates the group of patients

Page 24/26 that have the gene downregulated and the red color indicates the group of patients with the gene upregulated.

Figure 5

Overall Survival (OS) Analysis of DEGs. A total of 73 DEGs with a padj<0.05 were analyzed in the database of (METABRIC, Nature 2012 & Nat Commun 2016) with a cut off time of 60 months and a z- score expression >|1|. From the query genes, only 7 genes had a log-rank<0.05. a) OS of TCP1 and TUBB;

Page 25/26 b) OS of C1QTNF3, CTF1, OLFML3, PLA2R1, and PODN. The blue color indicates the group of patients that have the gene downregulated and the red color indicates the group of patients with the gene upregulated.

Supplementary Files

This is a list of supplementary fles associated with this preprint. Click to download.

Supplementarytable1.pdf Supplementarytable1.pdf Supplementarytable2.pdf Supplementarytable2.pdf

Page 26/26