www.nature.com/npjbcancer

ARTICLE OPEN

The temporal mutational and immune tumour microenvironment remodelling of HER2-negative primary breast cancers

Leticia De Mattos-Arruda1,2,3 ✉, Javier Cortes4,5,6,7,8, Juan Blanco-Heredia1,2, Daniel G. Tiezzi3,9, Guillermo Villacampa10, Samuel Gonçalves-Ribeiro10, Laia Paré11,12,13, Carla Anjos Souza1,2, Vanesa Ortega7, Stephen-John Sammut3,14, Pol Cusco10, Roberta Fasani10, Suet-Feung Chin3, Jose Perez-Garcia4,5,6, Rodrigo Dienstmann10, Paolo Nuciforo10, Patricia Villagrasa12, Isabel T. Rubio10, Aleix Prat11,12,13 and Carlos Caldas3,14

The biology of breast cancer response to neoadjuvant therapy is underrepresented in the literature and provides a window-of-opportunity to explore the genomic and microenvironment modulation of tumours exposed to therapy. Here, we characterised the mutational, expression, pathway enrichment and tumour-infiltrating lymphocytes (TILs) dynamics across different timepoints of 35 HER2-negative primary breast cancer patients receiving neoadjuvant eribulin therapy (SOLTI-1007 NEOERIBULIN; NCT01669252). Whole-exome data (N = 88 samples) generated mutational profiles and candidate neoantigens and were analysed along with RNA-Nanostring 545-gene expression (N = 96 samples) and stromal TILs (N = 105 samples). Tumour mutation burden varied across patients at baseline but not across the sampling timepoints for each patient. Mutational signatures were not always conserved across tumours. There was a trend towards higher odds of response and a lower hazard of relapse when the percentage of subclonal mutations was low, suggesting that more homogeneous tumours might have better responses to neoadjuvant therapy. Few driver mutations (5.1%) generated putative neoantigens. Mutation and neoantigen load were positively correlated (R2 = 0.94, p < 0.001); neoantigen load was weakly correlated with stromal TILs (R2 = 0.16, p = 0.02). An enrichment in pathways linked to immune infiltration and reduced programmed death expression were seen after 12 weeks of eribulin in good responders. VEGF was downregulated over time in the good responder group and FABP5, an inductor of epithelial mesenchymal transition (EMT), was upregulated in cases that recurred (p < 0.05). Mutational heterogeneity, subclonal architecture and the improvement of immune microenvironment along with remodelling of hypoxia and EMT may influence the response to neoadjuvant treatment.

npj Breast Cancer (2021) 7:73 ; https://doi.org/10.1038/s41523-021-00282-0

INTRODUCTION

Breast cancer is the most commonly diagnosed cancer and the leading cause of female cancer death worldwide1. It represents a heterogeneous group of tumours with characteristic molecular features, prognosis and responses to available therapy2,3.

In the early stage breast cancer setting, treatment decisions are guided by clinical subtypes, namely hormone receptor (HR) positive (HR+/HER2−), human epidermal growth factor receptor 2 amplified (HER2+) and triple-negative breast cancer (TNBC). This general classification does not take into account the complex genomic landscape and breast cancer evolution during therapy administration and disease recurrence or progression4,5.

Currently, the biology of the neoadjuvant response to therapy is underrepresented in the literature. Large-scale genomic studies have mostly focused on the analysis of single primary breast cancers2,3,6–9, which do not provide information on cancers over time. The analysis of the gene expression landscape of tumours has been shown to correlate with response to cytotoxic therapies10–12, though very limited work has been done to characterise the genomic and transcriptomic changes across breast cancer patients receiving neoadjuvant therapy13. Therefore, the neoadjuvant setting in breast cancer provides a window-of-opportunity to explore the genomic and microenvironment modulation of tumours exposed to therapy over time5,14.

In this study, we temporally characterised the mutational, gene expression, pathway enrichment and tumour-infiltrating lymphocytes (TILs) dynamics across different timepoints over a 12-week period in HER2-negative primary breast cancers enrolled in the single-arm SOLTI-1007 NEOERIBULIN phase II clinical trial (NCT01669252). We show that the mutational and immune tumour microenvironment remodelling of HER2-negative primary breast cancers provides a path forward for gathering biological insights from primary breast cancers.

RESULTS

A clinical cohort of HER2-negative breast cancer patients

Primary breast cancer tumour specimens were obtained from the open-label, single-arm SOLTI-1007 NEOERIBULIN phase II clinical

1IrsiCaixa, Germans Trias i Pujol University Hospital, Badalona, Spain. 2Germans Trias i Pujol Research Institute (IGTP), Badalona, Spain. 3Cancer Research UK Cambridge Institute, Robinson Way, Cambridge, UK. 4Oncology Department International Breast Cancer Center (IBCC), Quiron Group, Barcelona, Spain. 5Medica Scientia Innovation Research (MedSIR), Barcelona, Spain. 6Medica Scientia Innovation Research (MedSIR), Ridgewood, NJ, USA. 7Breast Cancer Research program, Vall d´Hebron Institute of Oncology (VHIO), Barcelona, Spain. 8Universidad Europea de Madrid, Faculty of Biomedical and Health Sciences, Department of Medicine, Madrid, Spain. 9Breast Disease Division, Ribeirão Preto School of Medicine, University of São Paulo, São Paulo, Brazil. 10Vall d’Hebron Institute of Oncology (VHIO), Vall d’Hebron University Hospital, Barcelona, Spain. 11Department of Medical Oncology, Hospital Clinic of Barcelona, Barcelona, Spain. 12SOLTI Breast Cancer Research Group, Barcelona, Spain. 13Translational Genomics and Targeted Therapeutics in Solid Tumors, August Pi i Sunyer Biomedical Research Institute, Barcelona, Spain. 14Department of Oncology, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK. ✉ email: [email protected]

Published in partnership with the Breast Cancer Research Foundation

[Figure 1 graphic.
Panel a, sample flow. Whole-exome sequencing: 88 samples, 35 patients; excluded for purity < 20%, reads < 15 or low sequencing quality; 65 primary tumours and one metastasis across 28 patients retained. NanoString gene expression profiling: 96 samples, 35 patients; PAM50 subtyping not available for 1 sample; 96 primary tumours across 35 patients retained. Tumour-infiltrating lymphocytes (TILs): 105 samples, 35 patients; excluded when not evaluated or not available; 91 primary tumours across 34 patients retained. Read-outs compared by ORR at surgery (good responders vs poor responders): TMB, clonality, neoantigen prediction, gene expression, TILs counts.
Panel b, trial schema. Stage I/II HER2-negative breast cancer; eribulin 1.4 mg/m2 on Days 1 and 8, every 21 days, Cycles 1 to 4; surgery; adjuvant therapy as per investigator choice (anthracycline recommended); recurrence. Sampling: core or incisional biopsy (V1), core biopsy (V2), surgery (V3), recurrence (VR). Assessments: physical exam, imaging (mammogram/US), ORR, pCR.

Samples per analysis and timepoint:

| Analysis | V1 | V2 | V3 | VR | Total |
|---|---|---|---|---|---|
| WES | 28 | 24 | 13 | 1 | 66 |
| Gene expression | 35 | 31 | 29 | - | 96 |
| TILs | 34 | 29 | 28 | - | 91 |

PAM50 (V1): Luminal A 37.1%, Basal-like 34.3%, Luminal B 22.8%, HER2-enriched 2.9%, NA 2.9%.
ORR at surgery (V3): PR 40%, SD 34.3%, CR 14.3%, PD 11.4%.]

Fig. 1 The study schematics. a Tumour tissue samples underwent (i) whole-exome sequencing (WES) for mutation and clonality detection followed by neoantigen prediction; (ii) Nanostring gene expression profiling; and (iii) stromal TILs counting. Our goal was to select samples that passed quality control and perform the temporal characterisation of the mutational, gene expression and TILs profiles in serial primary HER2-negative breast cancers that were good responders or poor responders to eribulin. DNA sequencing (WES) was performed on 88 primary invasive breast cancers and on matched normal DNA from each patient. Of these, 66 tumour samples were used for mutational and clonality analyses. RNA-Nanostring gene expression profiling was performed in 96 primary invasive breast cancers. From the DNA sequencing data, candidate neoantigens were predicted. Stromal TILs were counted from the H&E slides in 91 out of 105 tumour specimens. Clinical features and the PAM50 intrinsic molecular subtypes of each of the sequential primary tumour biopsies were examined. TMB tumour mutation burden, ORR overall response rate. b Schematics of the clinical trial. Temporal tumour sampling and the number of samples included in each analysis and time point are depicted. Distribution of PAM50 molecular intrinsic breast cancer subtypes at V1 (diagnostic biopsy), and ORR at V3 (surgery). E eribulin administration, V1 visit one, V2 visit two, V3 visit three, VR visit recurrence, CR complete response, PR partial response, SD stable disease, PD progressive disease.

trial (NCT01669252). We included sequential primary tumour biopsies from 35 HER2-negative (22 HR-positive and 13 HR-negative) breast cancer patients (1–3 tumour tissue samples per patient) during eribulin administration. Whole-exome data (N = 88 samples) generated mutational profiles and candidate neoantigens and were analysed along with RNA-Nanostring 545-gene expression (N = 96 samples) and stromal TILs (N = 105 samples) from 35 patients with HER2-negative breast cancer (Fig. 1a). Eighty per cent of cases had Ki67 greater than 14% at diagnosis. Disease recurrence after neoadjuvant therapy with eribulin was observed in six patients, despite the use of anthracyclines as adjuvant therapy. Although six patients presented clinic-radiologic recurrence, tumour material was available at the time of recurrence in one patient. The samples that passed quality control (i.e., tumour cellularity, sequencing quality, see Fig. 1a and “Methods”), were further processed and analysed.

Clinical features of the cohort are summarised in Table 1 and the schematics of the study design are shown in Fig. 1b. The 5-year relapse-free survival was 85.6% (95% CI: 74.7–98.1%) after breast cancer was diagnosed and the overall survival rate at 5 years was 91.3% (95% CI: 82.4–100%) for the patients analysed here.
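Survival proportions such as the 5-year relapse-free survival quoted above are commonly estimated with the Kaplan-Meier product-limit method. This section does not state which estimator was used, so the following is only a minimal sketch under that assumption. The function name `kaplan_meier` and the follow-up values are hypothetical illustrations, not trial data.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate (assumed estimator, not confirmed by the text).

    times  : follow-up time per patient (e.g. years)
    events : 1 if relapse was observed at that time, 0 if the patient was censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    pairs = sorted(zip(times, events))
    n_at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(pairs) and pairs[i][0] == t:
            n_events += pairs[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        # Both relapsed and censored patients leave the risk set after t.
        n_at_risk -= n_leaving
    return curve


# Hypothetical follow-up (years) for five patients; 0 marks censoring.
example_curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 0, 1])
```

Censored patients lower the number at risk without lowering the survival estimate, which is why the curve only steps down at observed relapses.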

Table 1. Clinicopathological characteristics of the study cohort.

Number of patients: 35
Age (mean): 52.8 years

| Characteristic | Diagnosis | Surgery | p value (Fisher’s exact test) |
|---|---|---|---|
| Histologic grade | | | 0.5 |
| Grade 1 | 4 | 1 | |
| Grade 2 | 16 | 18 | |
| Grade 3 | 14 | 14 | |
| Missing or not available | 1 | 2 | |
| ER status | | | 0.8 |
| ER+ | 21 | 18 | |
| ER− | 14 | 15 | |
| Missing or not available | 0 | 2 | |
| PR status | | | 1 |
| PR+ | 17 | 17 | |
| PR− | 18 | 16 | |
| Missing or not available | 0 | 2 | |
| KI67 at diagnosis | | | 0.3 |
| ≥14% | 28 | 30 | |
| <14% | 7 | 3 | |
| Missing or not available | 0 | 2 | |
| Clinical tumour stage at diagnosis | | | <0.001 |
| T0 | 0 | 2 | |
| T1 | 1 | 21 | |
| T2 | 31 | 10 | |
| T3 | 2 | 1 | |
| Node | | | 0.1 |
| N0 | 24 | 19 | |
| N1 | 11 | 12 | |
| N2 | 0 | 4 | |
| Histology | | | 0.9 |
| IDC | 30 | 27 | |
| ILC | 2 | 2 | |
| Other | 3 | 4 | |
| Missing or not available | – | 2 | |

Treatment response
pCR: 2
Good responders (CR or PR): 19 (5 CR, 14 PR)
Poor responders (PD or SD): 16 (12 SD, 4 PD)
Missing or not available: 3
Residual cancer burden (RCB) after neoadjuvant treatment: RCB1 = 1, RCB2 = 18, RCB3 = 13 patients, NA = 3
Recurrence: 6

The mutational and TILs profiling across temporal primary breast cancers

We characterised the mutational landscape of the tumours through an analysis of tumour mutational burden (TMB), breast cancer driver identification15, mutational signatures16 and mutational clonality17 at baseline (V1) (28 cases) and correlated these at visit 2 (V2) (24 cases) and with the response after 12 weeks of therapy at the time of surgery (V3) (13 cases).

TMB varied among patients at baseline (V1) (median 1.75 mutations per megabase (Mb) [range: 0.59–8.19], with a standard deviation of 1.48) but there was no evidence of variation across the three sampling time points (V2, median 1.62 mutations/Mb (range: 0.13–8.19), p = 0.62; V3, median 1.73 mutations/Mb (range: 0.38–7.11), p = 0.81) (mean coverage was ~40×) (Fig. 2a). At least one breast cancer driver gene18,19 was detected in 28 (80%) patients further analysed. TP53 (n = 15) and PIK3CA (n = 11) were the most prevalent mutated driver genes in the cohort (Supplementary Fig. 1, Supplementary Data). Mutations within these breast cancer drivers were detected at all time points as well as in the recurrent sample (case N021). In V1, TP53 mutation was more prevalent in basal cancers (N = 9), as compared to Luminal-A (N = 4) and Luminal-B (N = 2). Relapse-free survival did not differ significantly when samples bearing TP53 mutations in V1 (N = 15) were compared to those without the mutation (N = 20) [HR = 2.07, 95% CI: 0.35–12.4, p value = 0.43]. Likewise, no difference was found in patients with PIK3CA mutation (n = 11) [HR = 1.46, 95% CI: 0.24–8.75, p value = 0.68] (Supplementary Fig. 1). However, this cohort was not adequately powered to detect an association between mutations in driver genes and clinical outcomes.

We next investigated signatures of mutagenic biological processes to determine whether there were any detectable shifts during neoadjuvant therapy. A non-negative matrix factorisation technique was used to identify mutagenic processes in breast cancer, including ageing, APOBEC cytidine deaminases, defective DNA repair and BRCA1/BRCA2 deficiency16. Mutational signatures were studied in each tumour sample across neoadjuvant therapy (Fig. 2a bottom panel). Signature 1 (C > T transitions at CpG dinucleotides) contributed to a relatively higher proportion of mutations, but others were predominant in some patients: APOBEC (signatures 2 and 13) was more prevalent in luminal B patients, and BRCA-related (signature 3) in basal-like tumours. Signatures were not always conserved across tumour samples (defective DNA repair signature was conserved in 62.5% of patients (n = 8) while APOBEC and BRCA-related in 40% (n = 7 and n = 10, respectively)). No statistical differences were observed between mutation signatures and overall response rate (ORR) (all p values > 0.05, Fisher’s test) (Supplementary Fig. 1), although the study was not initially powered to detect this association.

We computed tumour clonal composition and cancer cell fractions (CCFs) by applying the ABSOLUTE algorithm17 across temporal primary tumours from each patient to determine how clonal composition relates to outcomes. Mutations predicted to be clonal (found in every cancer cell) or subclonal (found in a subset of cancer cells) were identified: the proportion of clonal mutations was 44.7% on the first visit, 21.0% on the second visit, 28.9% at surgery and 5% at recurrence.

We studied the shape of the association of subclonal mutations at baseline (V1) and the patientsʼ odds to obtain a complete or partial response (PR) and also the hazard to relapse (Fig. 2b, see “Methods”). Overall, there was a trend towards higher odds of response and less hazard to relapse when the percentage of subclonal mutations was low (i.e., from 0 to 40%), suggesting that less heterogeneous tumours might have better responses to neoadjuvant therapy. Further increases above 40% in the percentage of subclonal mutations did not seem to impact outcomes. The same analysis was carried out for HR-positive and HR-negative patients (Fig. 2b, bottom panels). Overall, HR-positive patients had a trend to have better outcomes.

We generated HLA-class I binding predictions for somatic mutations that may result in changes. In total, 492 out of 2009 (24.50%) unique mutations were predicted to encode for at least one strong binder (binding rank < 0.5%) of HLA-class I. We analysed candidate neoantigens generated from nonsynonymous mutations. A total of 1002 putative neoantigens (average of 36 per patient, [range: 2–153]) were present in 27 patients (Supplementary Data).

We inferred candidate neoantigens derived from breast cancer driver mutations (5.19% of total neoantigens) and whether they were clonal or subclonal. Overall, 52 neoantigens (average of 3.11


[Figure 2 graphic.
Panel a: per-sample bar plot of mutations per Mb (y-axis, 0 to 7) above tile plots of driver genes (TP53, PIK3CA, ATM, SMAD4, PTEN, NOTCH1, NF1, MAP3K1, KMT2C, HRAS, FOXA1, CDH1, STK11, GATA3, CUX1, ARID1A, APC, TBX3, SMARCA4, RUNX1, PIK3R1, MAP2K4, GPS2, CBFB, BAP1, ZFP36L1, USP9X, TET2, PBRM1, NCOR1, MEN1, EGFR, BRCA1, ASXL1, AKT1), PAM50 subtype, ORR and purity, for samples of cases N003 to N036 at V1, V2, V3 and VR. Legend: timepoints (V1, V2, V3, VR); mutation type (missense, truncating, inframe); PAM50 molecular subtypes (basal-like, luminal A, luminal B, HER2-enriched); ORR (CR, PR, SD, PD); purity (0 to 1); mutation signatures (clock-like, age related (1, 5); APOBEC (2, 13); BRCA-related (3); defective DNA repair (6, 15); polymerase η (9); POLE mutations (10); unknown breast (30); others).
Panel b: x-axis, percentage of subclonality (0% to 60%); y-axes, log odds for complete or partial response and log relative hazard for RFS; shown for all patients and for HR-positive and HR-negative patients. Pie charts show ORR for majority clonal (n = 16) and majority subclonal (n = 11) tumours.
Panel c: peptides per driver mutation generating neoantigens.
Panel d: TILs (%) for Luminal A, Luminal B and Basal-like tumours, coloured by ORR.
Panel e: mutations generating neoantigens against nonsynonymous mutation load (R2 = 0.94) and against mean TILs per patient (R2 = 0.1605); (+) marks a double PIK3CA mutation p.H1047L, p.L239Q.]

Fig. 2 Mutational landscape of HER2-negative primary breast cancers under neoadjuvant chemotherapy. a Landscape of mutational alterations over time. Stacked plots display mutational burden (top), breast cancer drivers (tile plots, middle), PAM50 molecular intrinsic subtypes, clinical–pathological responses per patient and purity (tile plots, middle), mutation signatures (filled histogram). b Mutation clonality and subclonal distribution across different responses to eribulin. Left panels: odds for complete or partial response; right panels: a relative hazard for relapse-free survival (RFS). The analysis was performed for all comers (top panels) and for HR-positive and HR-negative patients (bottom panels). c Distribution of selected driver mutations generating neoantigens. Driver gene mutations are coloured whether the mutation is clonal or subclonal. d TILs across PAM50 intrinsic subtypes. (*) refer to p value < 0.05; ns nonsignificant. For each box plot, the centre line, the boundaries of the box, the ends of the whiskers and points beyond the whiskers represent the median value, the interquartile range, the minimum and maximum values, and the outliers, respectively. e Relationship between predicted neoantigen load (y-axis) and nonsynonymous mutational load (x-axis) and between predicted neoantigen load (y-axis) and mean stromal TILs per patient (x-axis).
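The R2 values reported for panel e summarise how much of the variation in neoantigen load is explained by a straight-line fit to mutational load or to TILs. For a simple linear regression with one predictor, R2 equals the squared Pearson correlation coefficient. The sketch below computes it that way. Whether the authors used exactly this definition is an assumption, and the function name `r_squared` and the toy numbers are hypothetical, not study data.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences.

    For a one-predictor linear regression this equals the coefficient of
    determination (R2). Assumed definition; not confirmed by the text.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = mean(x), mean(y)
    # Sums of cross-products and squared deviations from the means.
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("R2 is undefined when one variable is constant")
    return (s_xy * s_xy) / (s_xx * s_yy)


# Hypothetical per-patient counts: nonsynonymous mutations and neoantigens.
mutation_load = [10, 20, 30, 40]
neoantigen_load = [5, 10, 15, 20]
example_r2 = r_squared(mutation_load, neoantigen_load)  # exactly linear toy data
```

An R2 near 0.94 means the points lie close to a line, whereas an R2 near 0.16 means most of the variation is left unexplained, which matches the "strong" and "weak" wording used for the two panels.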

per patient [range: 1–11]) were generated by 22 (unique) driver mutations in 17 patients. Driver mutations including TP53 (p.R110P, p.R209Kfs*6), PIK3CA (p.H1047L, p.K111E), MAP3K1 (p.R961Sfs*44, p.E504Vfs*36) predicted neoantigens in more than one patient, and in general, they were clonal and present across all samples of each patient (Fig. 2c). Each candidate neoantigen was predicted to be bound to one or more HLA class-I molecule.

Finally, stromal TILs were counted in 91 samples of 34 patients and had a median of 9.3% [range: 0–40] among patients. Among the PAM50 intrinsic subtypes, basal-like tumours had significantly more TILs than luminal A (p value = 0.02) but not luminal B (p value = 0.11) (Fig. 2d, Supplementary Fig. 1).

We observed that the higher nonsynonymous mutation load was strongly correlated with neoantigen load and the latter was weakly correlated with stromal TILs. A significant positive correlation was observed in both analyses (R2 = 0.94, p < 0.001 and R2 = 0.16, p = 0.02, respectively) (Fig. 2e).

Therefore, although TMB and driver mutations were temporally conserved across neoadjuvant therapy, cases with a lower percentage of subclonal mutations, thus more homogeneous breast cancers, had a trend for better responses. TILs were weakly correlated with neoantigen load, and a small fraction of neoantigens were derived from driver gene mutations.

Multiple expressed genes across temporal primary breast cancers

To identify differentially expressed gene candidates during the administration of neoadjuvant chemotherapy, we performed targeted gene expression profiling using a custom 545-gene Nanostring panel composed of breast cancer-related genes among 35 patients (Supplementary Data).

We observed a total of 155 genes differentially expressed in patients responding to eribulin at surgery (good responders, CR or PR [N = 19]) vs. those with poor responses (PD or SD [N = 16]). After false discovery rate (FDR) adjustment (p value < 0.05), ALDH1A1, EVI2A, ADRA2A, LHFP, LOC400043 and PTGER4 were upregulated while VEGFA, FANCA, ORC6, KIFC1 and ANGPTL4 were downregulated in the good responder group (Fig. 3a, Supplementary Data).

To assess variability within and between individuals, gene expression levels were analysed over a 12-week period for each patient (Fig. 3a bottom panels, Supplementary Fig. 2). Consistent with the general trend in the previous analysis, increased levels of ALDH1A1, EVI2A, ADRA2A, LHFP, LOC400043 and PTGER4 were modulated over time in patients responding to eribulin.

Overall and across all time points, we analysed patients that developed disease recurrence after curative therapy (n = 6) vs. those that did not recur (n = 29). We observed 58 upregulated genes in the group of patients with clinical recurrence as compared to those who did not have a recurrence. Among those were FABP5, YBX1, TUBB6, PLOD1 and CXCL8 genes that were upregulated and analysed over 12 weeks during neoadjuvant therapy.

Our analyses showed that fatty acid-binding protein 5 (FABP5) overexpression at surgery (V3) was the only statistically significant change after FDR adjustment (p < 0.05) in cases that recurred, as compared to those that did not recur (FDR = 0.04) (Fig. 3b, Supplementary Fig. 2). FABP5 is an inductor of the epithelial-mesenchymal transition (EMT) process20. EMT has been reported to contribute to tumour aggressiveness and eribulin resistance21,22.

Finally, we used pathway enrichment analysis to identify the activity of the immune system and common cancer signalling pathways before and after neoadjuvant therapy (see “Methods”, Table 2). At the baseline V1 timepoint, an increase was observed in pathways associated with angiogenesis, hypoxia and epithelial features in patients that had poor responses to therapy (SD/PD). In the good responder group, eribulin treatment was associated with statistically significant enrichment of pathways associated with lymphocyte and T cell activation at the end of therapy (V3 or surgery), and with reduction in expression of the immunosuppressive programmed cell death pathway. In contrast, there was an increase in cell proliferation and reduction in cytotoxic T cell pathways for the poor responder group at the V3 or surgery timepoint. This suggests that the apparent reversal of EMT and restoration of hypoxia by eribulin might lead to improvement of immunosuppressive features in the primary tumours of good responders, and an increase of proliferation and reduction of immune cytotoxic activity in poor responders.

DISCUSSION

Our study describes a step forward in gathering insights related to response to neoadjuvant therapy for HER2-negative primary breast cancers. Whole-exome sequencing, gene expression, pathway enrichment and TILs analyses of temporal primary breast tumours of 35 HER2-negative breast cancer patients show cancer remodelling during neoadjuvant chemotherapy.

The strengths of our analysis include the prospective design of the study, and the orthogonal genomic and immune infiltration data for each primary breast cancer systematically obtained from the SOLTI-1007 NEOERIBULIN phase II clinical trial (NCT01669252). Clonal driver mutations were maintained over time, but intratumoral genomic heterogeneity, measured as a fraction of subclonal mutations, may affect response to therapy. Heterogeneity in mutational signatures across patients was considerably greater than at the intrapatient level, which was, in general, not conserved across temporal samples of each patient in 12 weeks of therapy administration. It should be noted, however, that eribulin is a weak chemotherapeutic agent which might explain the minimal change within the genomic landscape21. Its efficacy might be related to the fact that it might change tumour phenotype (based on the EMT hypothesis)21 rather than changing tumour genotypes.

Neoantigens have been predicted in a few primary breast cancer datasets23,24. There is usually a discrepancy between neoantigens predicted computationally and those actually shown to leverage robust T-cell responses25,26. An immunosuppressive tumour microenvironment might be associated with this fact. A recent analysis of The Cancer Genome Atlas (TCGA) dataset of invasive breast cancers predicted HLA class I-binding neoepitopes for 870 breast cancer samples24. About 40% of the nonsynonymous mutations led to the generation of candidate neoepitopes and the neoepitope load was also highly correlated with the mutational burden. Here, 21% of the nonsynonymous mutations led to the generation of putative neoantigens. Candidate neoantigens and stromal TILs were positively correlated and were more frequent in basal-like primary tumours. Driver mutations such as frameshift mutations in TP53 and MAP3K1 genes or missense mutations in the PIK3CA gene can generate tumour neoantigens, suggesting that a T cell-mediated immune response, if present and properly validated, would target all cancer cells15,27,28. The relevance of neoantigens derived from clonal driver mutations, particularly those arising from truncating mutations, is the possibility of being incorporated as targets for adoptive T cell therapies and cancer vaccines28,29.

The evaluation of TILs has been shown to represent a reliable surrogate of the immune anti-tumour activity and a robust independent prognostic biomarker in breast cancer patients, especially in the TNBC and HER2-positive breast cancer subtypes30,31. Retrospective reports analysed TILs as prognostic and predictive markers in metastatic TNBC patients that received eribulin32. In a previous analysis, the high TILs group vs low TILs within the metastatic TNBC cases had better outcomes and suggested TILs to predict treatment response to eribulin in the TNBC metastatic setting32. In our early stage breast cancer cohort, stromal TILs detected in H&E slides were more frequent in basal-

[Figure 3 graphic.
Panel a, ORR at surgery: volcano plots (x-axis, log2 fold change; y-axis, −log10 p value; points coloured by timepoint V1, V2, V3; down- and upregulated genes shown separately), followed by box plots of log10-transformed gene expression at V1, V2 and V3 for ALDH1A1, EVI2A, PTGER4, LHFP and ADRA2A (top row) and VEGFA, FANCA, ORC6, KIFC1 and ANGPTL4 (bottom row), split by poor responders and good responders.
Panel b, recurrence vs non-recurrence: the same layout, with box plots for FABP5, YBX1, TUBB6, PLOD1 and CXCL8 (top row) and STC2, TSPAN13, SCUBE2, MAGEA1 and ABCC8 (bottom row), split by non-recurrence and recurrence.]

Fig. 3 Gene expression profiling in longitudinal primary breast tumour biopsies. a ORR at surgery (V3) (patients with good response to eribulin vs. poor response) (Top). Genes identified have a change corresponding to eribulin treatment. Volcano plots with the strength of the association on the y-axis (−log10 p values) and the effect size on the x-axis (log 2-fold change (FC)). Differentially expressed genes were highlighted during different timepoints across neoadjuvant therapy (V1–V3). Genes above the red dotted line represent those whose expression levels were significantly different (p value < 0.01). A full list of the most up and downregulated genes can be found on Supplementary Data. Boxplots of good and poor responders on eribulin over time, colour-coded by their corresponding poor response (red) or good response (blue) in eribulin from baseline to surgery (Bottom). b Same as in (a). Patients with later clinical recurrence vs. no recurrence (Top). Boxplots of individual patients on eribulin over time, colour-coded by their corresponding recurrence (purple) or nonrecurrence (grey) in eribulin from baseline to surgery (Bottom).
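The differential expression results behind this figure are reported after a false discovery rate (FDR) adjustment. The adjustment procedure is not named in this section, so the sketch below assumes the widely used Benjamini-Hochberg step-up method purely for illustration. The function name `benjamini_hochberg` and the p values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p values (assumed FDR method, for illustration).

    Each raw p value is scaled by m / rank (rank 1 = smallest p), then a
    running minimum from the largest rank downwards enforces monotonicity.
    Results are returned in the original input order.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p values for four genes.
raw = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(raw)
# Genes whose adjusted value falls below 0.05 would be called significant.
significant = [i for i, q in enumerate(fdr) if q < 0.05]
```

With a 545-gene panel, testing each gene at p < 0.05 without adjustment would be expected to yield roughly 27 false positives under the null, which is why an FDR-adjusted threshold is applied before naming individual genes.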

like primary tumours. A statistically significant higher percentage of stromal TILs after the end of neoadjuvant therapy was not observed, possibly due to the small sample size and other biological factors including immunosuppressive TME. However, we observed an enrichment in pathways associated with immune infiltration and reduced programmed cell death after 12 weeks of eribulin administration, corroborating findings observed in metastatic breast cancer. In a recent study33, lymphocyte infiltration increase and PDL1 expression turning to negative values were suggested to be improvements in the immunosuppressed TME of patients receiving eribulin for advanced breast cancer. Our findings add to current evidence for potentially combining checkpoint inhibitors with eribulin in the early breast cancer setting34.

There is an emerging rationale for eribulin to activate the immune system in pre-clinical and retrospective studies, through EMT suppression, vascular remodelling and improvement of the tumour immune microenvironment21. Our analyses indicate that temporal gene expression responses to eribulin might be associated with the process of restoration of hypoxia, and the EMT process, which should be contextualised with the immune TME. Eribulin treatment has been shown to suppress genes that are known to be involved in hypoxic signalling cascades, including VEGF, contributing to TME remodelling and restoring the scenario of normoxia35,36. We observed that VEGF was downregulated over time during eribulin therapy in the good responder group. At diagnosis, poor responders presented an overexpression of angiopoietin-like 4 (ANGPTL4), a pro-angiogenic factor that is modulated by hypoxia and associated with poor prognosis, metastasis, cell differentiation and vascular permeability37. Previously, aldehyde dehydrogenase 1A1 (ALDH1A1) was reported as a cancer stem cell marker and suggested to have a favourable prognostic role for cervical cancer38,39. Although we observed ALDH1A1 to be upregulated in good responders, its role in breast cancer is not yet defined. Furthermore, FABP5 promotes tumour cell growth in numerous cell types and is a negative prognostic marker in renal cell carcinoma20,40. FABP5 was upregulated in cases that presented further clinic-radiological recurrence. However, identification of FABP5 as a biomarker in HER2-negative breast cancer requires substantially increased cohort size and mechanistic validation for robust interpretation.

Our work has limitations which are mostly secondary to the small sample size and therefore lack of power to detect specific associations. Nonetheless, this HER2-negative breast cancer cohort gives a glimpse of the changes in cancer biology during neoadjuvant therapy administration. Another limitation is that the targeted 545-gene expression panel included only a few immune markers, including IDO1, LAG3 and STAT1. Furthermore, the primary endpoint of the present clinical trial is pathologic complete response (pCR) in the breast. We used ORR for analyses of responses after the neoadjuvant therapy since there are only a few pCR cases; residual cancer burden was used but was not informative in the dataset; thus, the association between biomarkers and response is challenging.

Taken together, these results suggest that mutational heterogeneity, subclonal architecture and the immune microenvironment along with remodelling of hypoxia and EMT may influence the response to neoadjuvant treatment in early stage disease, with possible implications for clinical decision-making and monitoring of treatment efficacy.

Table 2. Gene ontology association with important biological pathways.

| Visit | Effect in tested group (poor responders) | Pathway Database | Pathway | Genes | p value |
|---|---|---|---|---|---|
| V1 | Increased | GO:0045766 | Positive regulation of angiogenesis | PTGS2, VEGFA, AKT3, ADM, ANGPTL4 and CXCL8 | 0.00697 |
| V1 | Increased | KEGG:04066 | HIF-1 signalling pathway | VEGFA, AKT3, ERBB2 and EGFR | |
| V1 | Increased | GO:0050678 | Regulation of epithelial cell proliferation | STRAP, VEGFA, AKT3, MYC, ERBB2, EGFR and NFIB | 0.0217 |
| V1 | Increased | GO:1901342 | Regulation of vasculature development | PTGS2, VEGFA, AKT3, ERBB2, ADM, ANGPTL4 and CXCL8 | 0.0462 |
| V3 | Decreased | GO:0046649 | Lymphocyte activation | ERCC1, IGF1, IGBP1, CD86, IGFBP2, CDKN1A, VAV3, IL6ST, PIK3R1, ZEB1, TGFBR2, AXL, PTGER4 and KIF13B | |
| V3 | Decreased | GO:0042110 | T cell activation | IGF1, CD86, IGFBP2, IL6ST, PIK3R1, ZEB1, TGFBR2, PTGER4 and KIF13B | 0.0227 |
| V3 | Decreased | GO:0043069 | Negative regulation of programmed cell death | IGF1, IGBP1, NR4A3, TWIST1, CDKN1A, IL6ST, KLF4, SLC40A1, CYR61, PIK3R1, BTG2, AXL and TWIST2 | 0.00716 |
| V3 | Decreased | GO:0043066 | Negative regulation of the apoptotic process | IGF1, IGBP1, NR4A3, TWIST1, CDKN1A, IL6ST, KLF4, SLC40A1, CYR61, PIK3R1, BTG2, AXL and TWIST2 | 0.00595 |
| V3 | Increased | GO:0012501 | Programmed cell death | YBX3, GAL, MCM2, BIRC5, AARS, ITGA6, MYBL2, NDRG1, VEGFA, SNAI1, KRT17, TOP2A, CDK4, MYC, BRCA2, TP53BP2, BTG3, CDC25C, CDCA7, HSPD1, ADM, CHEK1, MAD2L1, ANGPTL4, DDIT4, BUB1, CDK1, KRT19, KRT5, KRT14, S100A14 and PGAM5 | |
| V3 | Increased | GO:0008283 | Cell proliferation | TACC3, STRAP, LAMC2, CDH3, GAL, GRHL2, BIRC5, CDC6, CDKN3, NDRG1, EZH2, FOXM1, BYSL, VEGFA, TTK, CDC20, CENPF, CKS2, DLGAP5, CCNB1, FZD6, CDK4, DDIT4, MYC, BRCA2, CXCL8, KIF2C, BUB1, CDCA7, CDK1, HSPD1, CTPS1, CCNA2, FANCA, MKI67, PRC1, ADM, CHEK1 and BOP1 | |

p values of 0.0000413, 0.00385, 0.00000000266 and 0.0276 are also reported for this table; their row assignment is not recoverable here.

METHODS

Patients and tissues

Primary breast cancer tumour specimens were obtained from the open-label, single-arm phase II study SOLTI-1007 NEOERIBULIN (NCT01669252). The patients reported here were treated at the Vall Hebron Institute of Oncology, Barcelona. In brief, eribulin was administered as neoadjuvant

Published in partnership with the Breast Cancer Research Foundation npj Breast Cancer (2021) 73 L. De Mattos-Arruda et al. 8 treatment for stage I-II HER2-negative breast cancer with a dose of 1.4 mg/ all libraries were quantified using quantitative polymerase chain reaction m2 intravenously on Days 1 and 8 every 21-day cycle, for 4 cycles, and the (qPCR). KAPA Library Quantification Kit (cat. KK4873, KAPA Biosystems) as pathological CR was defined as the absence of residual invasive tumour in used as per manufacturer’s recommendations. A subset of libraries was breast tumour specimens. All patients were negative for HER2 over- analysed using DNA 1000 Kit (cat. 5067-1504, Agilent). expression on clinical assays. The Ethics Committee of the Vall Hebron Whole-genome libraries and exome libraries were normalised and University Hospital, Barcelona, Spain approved the study. All patients gave pooled in equal volumes to create balanced pools. Each pool was informed consent for DNA and RNA sequencing. normalised to molarity of 4 nM and used for sequencing with clustering concentration 20 pM with 1% spike-in of PhiX control. Sequencing was Clinic-pathological response analyses performed on an Illumina HiSeq2500 using v4 chemistry and 50 cycles fi single-end for s-WGS and 75 cycles paired-end for WES. Demultiplexing We adopted a conservative method of de ning ORR after neoadjuvant was performed using Illumina’s bcl2fastq2 v.2.17 software using default therapy as CR or PR by response evaluation criteria in solid tumours options. FASTQ files were used for subsequent data analysis. (RECIST) (major decrease in tumour burden following treatment) and poor response as progressive disease (PD) or stable disease (SD) by RECIST (major increase or stability in tumour burden following treatment) after WES analyses eribulin therapy. 
For WES analysis, reads were mapped to the (GRCh37) and Patients were classified into two groups depending on their clinical base quality recalibration were performed using Novoalign v 3.02 response to therapy at surgery: good responders—patients that had CR or (Novocraft). Coordinate sorting of reads and PCR-duplicate marking was PR; poor responders—patients that had PD or SD. performed using Novosort (v 3.02). The resulting bam files for all samples for the same patient were locally realigned using the Genome Analysis 43 Tissue processing and DNA and RNA extraction Toolkit (GATK, v 3.4.46) . MuTect (version 2) was run using default parameters44. Strelka (version 1.0.14) was run with recommended starting Frozen biopsies were embedded in a frozen tissue matrix (OCT; Sakura parameters for BWA and default parameters45. The mean coverage per Finetek, Torrance, CA) and cut at the cryostat for tumour cellularity sample was calculated with CollectWgsMetrics (Picard). A joint calling was assessment by a pathologist. Genomic DNA was isolated from 10 × 10um fi sections using the DNeasy Kit (Qiagen, Hilden, Germany). Samples with less done and ltered allelic fraction >10% in at least one sample of each than 10% tumour material or that produced low DNA yield were excluded patient. Somatic mutations were annotated using Variant Effect Predictor from the analysis. (VEP, http://grch37.ensembl.org/) and visualised using IGV. Output with A section of the FFPE breast tissue was first examined with haematoxylin reads > 15 and purity > 20% were included in further analyses. and eosin staining to confirm the presence of invasive tumour cells (≥10%). 
For RNA purification (High Pure Formalin-Fixed Paraffin-embedded RNA ABSOLUTE Isolation Kit, Roche Diagnostics Limited, West Sussex, UK), one to five μ ABSOLUTE (v1.0.6) was used to infer the cancer cell fraction (CCF) of 10 m FFPE slides were used for each tumour specimen (at diagnosis, cycle mutations and the mutations were classified as clonal or subclonal as 2 and surgery). previously described17,46. A mutation was classified as clonal if its probability of being clonal was >50% or if the lower bound of the 95% Gene expression and Intrinsic subtype analyses confidence interval of its CCF was >90%. Mutations that did not meet the A minimum of ∼125 ng of total RNA was used to measure the expression above criteria were considered subclonal. of 545 genes involved in breast cancer, including 5 housekeeping genes ABSOLUTE software was used to calculate tumour clonality, purity and (ACTB, MRPL19, PSMC4, RPLP0 and SF3A1), using the Prosigna assay ploidy. For running the ABSOLUTE we obtained the mutation annotation (NanoString Technologies, Seattle, USA). Samples with 20 or fewer counts file by running vcf2maf script with VCF files for the corresponding tumour in at least 70% of the genes were removed. Data were log base 2 and normal control samples, annotation was performed by VEP. To find the transformed and normalised using the housekeeping genes. segmentation we ran the CNVkit batch command with the.bam files from 47 The same RNA was used to measure the expression of 50 genes of the tumour and normal samples . For running ABSOLUTE we subset the.cns PAM50 intrinsic subtype predictor assay. For each sample, we calculated files from the CNVkit output file for variant coordinates as well as probes the PAM50 signature scores (Basal-like, HER2-E, Luminal A and B) and the and log2 values. We used the options min.ploidy = 0.95, max.ploidy = 10. proliferation signature score41. 
Differential gene expression analysis fold Probability of a mutation to be clonal was defined essentially as described change of each gene was calculated as the ratio of average gene in ref. 48. expression intensity of (i) the good responder group (n = 19 [5 CR, 14 PR]) = to that of the poor responder group (n 16 [12 SD, 4 PD]) or (ii) the Mutation signature patients that recurred (n = 6) to that of the non-recurrent group (n = 29). A Decomposition of the mutational signature was performed using two-sample t test was used to compare gene expression intensities 16 between groups. deconstructSigs , based on the set of 30 mutational signatures. These fi A gene was claimed to be differentially expressed if it showed a fold signatures are largely de ned by the relative frequency of the six possible change of >1 (increased in good responders, or non-recurrent) or ≤ −1 base substitutions (C > A, C > G, C > T, T > A, T > C, and T > G) in the ′ ′ (increased in poor responders, or in the later recurred) and further sequence context of their adjacent 5 and 3 base. Of these, COSMIC adjustment FDR ≤ 0.05 was applied. Volcano plots were used to visualise signatures 1, 2, 3, 6, 13, 9, 10, 15, have been associated with breast cancer. log 2-fold change on the x-axis and −log10 p values on the y-axis. For clarity, we explored the following mutational signatures in this cohort: 49,50 Pathway enrichment analysis was performed using the gprofiler toolkit42 ageing/clock-like signatures (signatures 1 and 5), APOBEC (signatures 2 comparing good responder vs. poor responder groups using two ontology and 13), BRCA-related (signature 3), defective DNA repair (signatures 6 and databases as reference: (i) and (ii) Kyoto Encyclopaedia of 15), polymerase n (signature 9), POLE mutations (signature 10), unknown Genes and Genomes. breast (signature 30) and other signatures.

DNA sequencing Neoantigen prediction WES was performed to the breast cancer tumour and matched normal The 4-digit HLA type was obtained from matched normal WES data and 51 DNA obtained from the buffy coat of each HER2-negative breast cancer the OptiType 1.3.1 analysis software package was used . WES and HLA 52 patient. typing was integrated, and NeoPredPipe pipeline was applied with minor Libraries for Illumina sequencing were prepared using Illumina Nextera modifications for the neoantigen prediction. First, non-synonymous Rapid Capture Exome kit (cat. FC-140-1003, Illumina) as we reported cancer-specific mutations, i.e., present in all tumour cells and absent in previously. Prior to library preparation DNA concentrations for each sample all normal cells, were used to generate a comprehensive list of peptides were quantified using a fluorescence-based method (Quant-IT dsDNA BR, (9–11 amino acids in length) with the mutated amino acid represented at cat. Q33130, Thermo Fisher Scientific) and 50 ng of genomic DNA was used each peptide position and used as input was previous described. for library preparation. Prioritisation of neoantigens: the selection of candidate neoantigens were Samples were processed following the manufacturer’s instructions based on a %Rank < 0.5; epitopes already existing in the reference (part# 15037436 Rev. J, Illumina) for WES. Prior to the first hybridisation, proteome indicated by the Novelty parameter were excluded.
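The clonality rule stated in the ABSOLUTE subsection (clonal if the probability of being clonal is >50% or if the lower bound of the 95% confidence interval of the CCF is >90%; subclonal otherwise) can be illustrated with a minimal Python sketch. This is not the authors' implementation, which is custom R code; the class name, field names, gene labels and numeric values below are hypothetical and serve only to show how the two thresholds combine and how a per-sample percentage of subclonal mutations could be derived from them.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class MutationCall:
    """Hypothetical per-mutation summary; field names are illustrative only."""
    gene: str
    prob_clonal: float      # probability that the mutation is clonal (0 to 1)
    ccf_ci95_lower: float   # lower bound of the 95% CI of the CCF (0 to 1)


def classify(call: MutationCall) -> str:
    """Apply the rule from the text: clonal if P(clonal) > 0.5
    or if the lower bound of the CCF 95% CI is > 0.9."""
    if call.prob_clonal > 0.5 or call.ccf_ci95_lower > 0.9:
        return "clonal"
    return "subclonal"


def percent_subclonal(calls: Iterable[MutationCall]) -> float:
    """Percentage of mutations in a sample classified as subclonal."""
    calls = list(calls)
    if not calls:
        raise ValueError("no mutation calls supplied")
    n_subclonal = sum(classify(c) == "subclonal" for c in calls)
    return 100.0 * n_subclonal / len(calls)


# Made-up values chosen to exercise both thresholds and their boundaries.
example_calls = [
    MutationCall("GENE_A", prob_clonal=0.92, ccf_ci95_lower=0.95),  # both criteria met
    MutationCall("GENE_B", prob_clonal=0.40, ccf_ci95_lower=0.93),  # CCF criterion only
    MutationCall("GENE_C", prob_clonal=0.10, ccf_ci95_lower=0.20),  # neither criterion
    MutationCall("GENE_D", prob_clonal=0.50, ccf_ci95_lower=0.90),  # exactly at thresholds
]

labels = [classify(c) for c in example_calls]
share = percent_subclonal(example_calls)
```

Both comparisons are strict, so a mutation sitting exactly at a threshold (GENE_D above) is treated as subclonal, which matches the ">50%" and ">90%" wording of the text.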

Tumour infiltrating lymphocytes
Evaluations of stromal TILs were performed on haematoxylin and eosin-stained sections by an experienced board-certified pathologist (R.F.) according to the 2014 recommendations of the international TILs working group53.

Statistical analysis
Continuous variables were expressed as median and range, while categorical variables were expressed as absolute values or percentages. For statistical comparison, we used the Mann–Whitney test for independent continuous variables, the Wilcoxon signed-rank test for paired continuous data and Fisher's exact test for categorical data. To study whether there was a significant variation of TMB across sampling timepoints, the Wilcoxon signed-rank test was used to compare pairwise timepoints. In addition, the range and standard deviation of TMB at each time point were calculated to estimate intra-period TMB heterogeneity. To study the association between mutations in driver genes and relapse-free survival, the Cox proportional hazard model was fitted. Furthermore, to analyse the association between the percentage of subclonal mutations at baseline and clinical outcomes, we used logistic and Cox models after relaxing the linearity assumption using restricted cubic splines by means of the rms R package. Data analyses were carried out using R statistical software version 3.6.3.

Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.

DATA AVAILABILITY
The data generated and analysed during this study are described in the following data record: https://doi.org/10.6084/m9.figshare.14454261 (ref. 54). The whole-exome sequencing bam files have been deposited at the European Genome-phenome Archive (EGA), which is hosted by the EBI and the CRG, under the study accession number https://identifiers.org/ega.study:EGAS00001004953 and dataset accession number https://identifiers.org/ega.dataset:EGAD00001006980 (ref. 55). The decreased and increased pathway enrichment analyses are available via GitHub at https://github.com/NeoVaCan/NPJBCANCER_DeMattos_2021/tree/main/Data_Supp_Table2. The supplementary tables are also available in Excel format as part of the figshare data record. The patient metadata and patient tumour-infiltrating lymphocytes data are not publicly available for the following reason: data contain information that could compromise research participant privacy. However, the data can be made available upon reasonable request to the corresponding author.

CODE AVAILABILITY
Custom R code used to analyse tumour whole-exome sequencing data and Nanostring gene expression data is available at https://github.com/NeoVaCan/NPJBCANCER_DeMattos_2021.

Received: 24 September 2020; Accepted: 3 May 2021

REFERENCES
1. Bray, F. et al. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA. Cancer J. Clin. https://doi.org/10.3322/caac.21492 (2018).
2. Banerji, S. et al. Sequence analysis of mutations and translocations across breast cancer subtypes. Nature 486, 405–409 (2012).
3. Curtis, C. et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature https://doi.org/10.1038/nature10983 (2012).
4. Rivenbark, A. G., O'Connor, S. M. & Coleman, W. B. Molecular and cellular heterogeneity in breast cancer: challenges for personalized medicine. Am. J. Pathol. https://doi.org/10.1016/j.ajpath.2013.08.002 (2013).
5. De Mattos-Arruda, L., Shen, R., Reis-Filho, J. S. & Cortés, J. Translating neoadjuvant therapy into survival benefits: one size does not fit all. Nat. Rev. Clin. Oncol. 13, 566–579 (2016).
6. Koboldt, D. C. et al. Comprehensive molecular portraits of human breast tumours. Nature 490, 61–70 (2012).
7. Nik-Zainal, S. et al. Landscape of somatic mutations in 560 breast cancer whole-genome sequences. Nature https://doi.org/10.1038/nature17676 (2016).
8. Stephens, P. J. et al. The landscape of cancer genes and mutational processes in breast cancer. Nature 486, 400–404 (2012).
9. Yates, L. R. et al. Subclonal diversification of primary breast cancer revealed by multiregion sequencing. Nat. Med. 21, 751–759 (2015).
10. Fox, N. S., Haider, S., Harris, A. L. & Boutros, P. C. Landscape of transcriptomic interactions between breast cancer and its microenvironment. Nat. Commun. 10, 3116 (2019).
11. Paik, S. et al. A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N. Engl. J. Med. 351, 2817–2826 (2004).
12. Wallden, B. et al. Development and verification of the PAM50-based Prosigna breast cancer gene signature assay. BMC Med. Genomics 8, 54 (2015).
13. Höglander, E. K. et al. Time series analysis of neoadjuvant chemotherapy and bevacizumab-treated breast carcinomas reveals a systemic shift in genomic aberrations. Genome Med. 10, 92 (2018).
14. Angelova, M. et al. Evolution of metastases in space and time under immune selection. Cell 175, 751–765.e16 (2018).
15. De Mattos-Arruda, L. et al. The genomic and immune landscapes of lethal metastatic breast cancer. Cell Rep. 27, 2690–2708.e10 (2019).
16. Alexandrov, L. B. et al. Signatures of mutational processes in human cancer. Nature 500, 415–421 (2013).
17. Carter, S. L. et al. Absolute quantification of somatic DNA alterations in human cancer. Nat. Biotechnol. 30, 413–421 (2012).
18. De Mattos-Arruda, L. et al. The genomic and immune landscapes of lethal metastatic breast cancer. Cell Rep. 27, 2690–2708.e10 (2019).
19. Reiter, J. G. et al. Minimal functional driver gene heterogeneity among untreated metastases. Science 361, 1033–1037 (2018).
20. Ohata, T. et al. Fatty acid-binding protein 5 function in hepatocellular carcinoma through induction of epithelial-mesenchymal transition. Cancer Med. 6, 1049–1061 (2017).
21. Cortes, J., Schöffski, P. & Littlefield, B. A. Multiple modes of action of eribulin mesylate: emerging data and clinical implications. Cancer Treat. Rev. 70, 190–198 (2018).
22. Yoshida, T. et al. Eribulin mesilate suppresses experimental metastasis of breast cancer cells by reversing phenotype from epithelial–mesenchymal transition (EMT) to mesenchymal–epithelial transition (MET) states. Br. J. Cancer 110, 1497–1505 (2014).
23. Ren, Y. et al. HLA class-I and class-II restricted neoantigen loads predict overall survival in breast cancer. Oncoimmunology https://doi.org/10.1080/2162402X.2020.1744947 (2020).
24. Narang, P., Chen, M., Sharma, A. A., Anderson, K. S. & Wilson, M. A. The neoepitope landscape of breast cancer: implications for immunotherapy. BMC Cancer 19, 200 (2019).
25. Saini, S. K., Rekers, N. & Hadrup, S. R. Novel tools to assist neoepitope targeting in personalized cancer immunotherapy. Ann. Oncol. https://doi.org/10.1093/annonc/mdx544 (2017).
26. De Mattos-Arruda, L., Blanco-Heredia, J., Aguilar-Gurrieri, C., Carrillo, J. & Blanco, J. New emerging targets in cancer immunotherapy: the role of neoantigens. ESMO Open 4, e000684 (2020).
27. Turajlic, S. et al. Insertion-and-deletion-derived tumour-specific neoantigens and the immunogenic phenotype: a pan-cancer analysis. Lancet Oncol. 18, 1009–1021 (2017).
28. McGranahan, N. et al. Clonal neoantigens elicit T cell immunoreactivity and sensitivity to immune checkpoint blockade. Science 351, 1463–1469 (2016).
29. Robertson, J., Salm, M. & Dangl, M. Adoptive cell therapy with tumour-infiltrating lymphocytes: the emerging importance of clonal neoantigen targets for next-generation products in non-small cell lung cancer. Immuno-Oncol. Technol. 3, 1–7 (2019).
30. Pruneri, G., Vingiani, A. & Denkert, C. Tumor infiltrating lymphocytes in early breast cancer. Breast 37, 207–214 (2018).
31. Dushyanthen, S. et al. Relevance of tumor-infiltrating lymphocytes in breast cancer. BMC Med. 13, 202 (2015).
32. Kashiwagi, S. et al. Use of Tumor-infiltrating lymphocytes (TILs) to predict the treatment response to eribulin chemotherapy in breast cancer. PLoS ONE https://doi.org/10.1371/journal.pone.0170634 (2017).
33. Goto, W. et al. Eribulin promotes antitumor immune responses in patients with locally advanced or metastatic breast cancer. Anticancer Res. 38, 2929–2938 (2018).
34. Pizzuti, L. et al. Eribulin in triple negative metastatic breast cancer: critic interpretation of current evidence and projection for future scenarios. J. Cancer 10, 5903–5914 (2019).
35. Funahashi, Y. et al. Eribulin mesylate reduces tumor microenvironment abnormality by vascular remodeling in preclinical human breast cancer models. Cancer Sci. 105, 1334–1342 (2014).

36. Zhao, S. et al. Elimination of tumor hypoxia by eribulin demonstrated by 18F-FMISO hypoxia imaging in human tumor xenograft models. EJNMMI Res. https://doi.org/10.1186/s13550-019-0521-x (2019).
37. Carbone, C. et al. Angiopoietin-like proteins in angiogenesis, inflammation and cancer. Int. J. Mol. Sci. 19, 431 (2018).
38. Charafe-Jauffret, E. et al. Aldehyde dehydrogenase 1-positive cancer stem cells mediate metastasis and poor clinical outcome in inflammatory breast cancer. Clin. Cancer Res. 16, 45–55 (2010).
39. Vassalli, G. Aldehyde dehydrogenases: not just markers, but functional regulators of stem cells. Stem Cells Int. 2019, 1–15 (2019).
40. Lv, Q. et al. FABP5 regulates the proliferation of clear cell renal cell carcinoma cells via the PI3K/AKT signaling pathway. Int. J. Oncol. https://doi.org/10.3892/ijo.2019.4721 (2019).
41. Parker, J. S. et al. Supervised risk predictor of breast cancer based on intrinsic subtypes. J. Clin. Oncol. 27, 1160–1167 (2009).
42. Raudvere, U. et al. g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update). Nucleic Acids Res. 47, W191–W198 (2019).
43. Depristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. https://doi.org/10.1038/ng.806 (2011).
44. Cibulskis, K. et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat. Biotechnol. https://doi.org/10.1038/nbt.2514 (2013).
45. Saunders, C. T. et al. Strelka: accurate somatic small-variant calling from sequenced tumor–normal sample pairs. Bioinformatics 28, 1811–1817 (2012).
46. De Mattos-Arruda, L. et al. Genetic heterogeneity and actionable mutations in HER2-positive primary breast cancers and their brain metastases. Oncotarget 9, 20617–20630 (2018).
47. Talevich, E., Shain, A. H., Botton, T. & Bastian, B. C. CNVkit: genome-wide copy number detection and visualization from targeted DNA sequencing. PLOS Comput. Biol. 12, e1004873 (2016).
48. Miao, D. et al. Genomic correlates of response to immune checkpoint blockade in microsatellite-stable solid tumors. Nat. Genet. 50, 1271–1281 (2018).
49. Alexandrov, L. B. et al. Clock-like mutational processes in human somatic cells. Nat. Genet. 47, 1402–1407 (2015).
50. Kim, Y. A. et al. Network-based approaches elucidate differences within APOBEC and clock-like signatures in breast cancer. Genome Med. https://doi.org/10.1186/s13073-020-00745-2 (2020).
51. Szolek, A. et al. OptiType: precision HLA typing from next-generation sequencing data. Bioinformatics 30, 3310–3316 (2014).
52. Schenck, R. O., Lakatos, E., Gatenbee, C., Graham, T. A. & Anderson, A. R. A. NeoPredPipe: high-throughput neoantigen prediction and recognition potential pipeline. BMC Bioinform. 20, 264 (2019).
53. Salgado, R. et al. The evaluation of tumor-infiltrating lymphocytes (TILs) in breast cancer: recommendations by an International TILs Working Group 2014. Ann. Oncol. 26, 259–271 (2015).
54. De Mattos-Arruda, L. Metadata record for the article: The temporal mutational and immune tumour microenvironment remodelling of HER2-negative primary breast cancers. figshare https://doi.org/10.6084/m9.figshare.14454261 (2021).
55. European Genome-phenome Archive https://identifiers.org/ega.dataset:EGAD00001006980 (2021).

ACKNOWLEDGEMENTS
The authors acknowledge the patients who participated in this study, the Vall d'Hebron Institute of Oncology and the Cellex foundation. This work was supported by Cancer Research UK. L.D.M.A. was partly funded by the Spanish Association against cancer.

AUTHOR CONTRIBUTIONS
L.D.M.A. wrote the paper with the assistance of co-authors. L.D.M.A., J.C., D.T. and C.C. contributed to the overall design of this study. Data were collected by V.O. and J.P.G. Data were analysed and interpreted by J.B.H., D.T., G.V., S.G.R., L.P., C.A.S., S.J.S., P.C., R.F. and S.F.C. All authors have read and approved the final version of the paper.

COMPETING INTERESTS
L.D.M.A. has received honoraria for participation in a speaker's bureau/consultancy from Roche, and reports research collaboration and support from NanoString Technologies. J.C. reports consulting from Roche, Celgene, Cellestia, AstraZeneca, Biothera Pharmaceutical, Merus, Seattle Genetics, Daiichi Sankyo, Erytech, Athenex, Polyphor, Lilly, Servier, Merck Sharp&Dohme, GSK, Leuko, Bioasis, Clovis Oncology and Boehringer Ingelheim. Honoraria: Roche, Novartis, Celgene, Eisai, Pfizer, Samsung Bioepis, Lilly, Merck Sharp&Dohme, Daiichi Sankyo. Research funding to the Institution: Roche, Ariad pharmaceuticals, AstraZeneca, Baxalta GMBH/Servier Affaires, Bayer healthcare, Eisai, F.Hoffman-La Roche, Guardanth health, Merck Sharp&Dohme, Pfizer, Piqur Therapeutics, Puma C., Queen Mary University of London. Stock, patents and intellectual property: MedSIR. Travel, accommodation, expenses: Roche, Novartis, Eisai, Pfizer, Daiichi Sankyo. A.P. reports consulting fees from Nanostring Technologies, Roche, Pfizer, Novartis and Daiichi Sankyo outside the submitted work. C.C. is a member of the AstraZeneca External Science Panel and has research grants from Roche, Genentech, AstraZeneca, and Servier that are administered by the University of Cambridge. The remaining authors declare no competing interests.

ADDITIONAL INFORMATION
Supplementary information The online version contains supplementary material available at https://doi.org/10.1038/s41523-021-00282-0.
Correspondence and requests for materials should be addressed to L.D.M-A.
Reprints and permission information is available at http://www.nature.com/reprints
Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

© The Author(s) 2021
