(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)
(19) World Intellectual Property Organization, International Bureau
(10) International Publication Number: WO 2014/135655 A1
(43) International Publication Date: 12 September 2014 (12.09.2014)

(51) International Patent Classification: C12Q 1/68 (2006.01)
(21) International Application Number: PCT/EP2014/054384
(22) International Filing Date: 6 March 2014 (06.03.2014)
(25) Filing Language: English
(26) Publication Language: English
(30) Priority Data: 13305253.0, 6 March 2013 (06.03.2013), EP
(71) Applicants: INSTITUT CURIE [FR/FR]; 26 rue d'Ulm, F-75248 Paris cedex 05 (FR). CENTRE NATIONAL DE LA RECHERCHE SCIENTIFIQUE [FR/FR]; 3 rue Michel Ange, F-75016 Paris (FR).
(72) Inventors: RADVANYI, Francois; 36 rue des Potiers, F-92260 Fontenay Aux Roses (FR). REBOUISSOU, Sandra; 42 rue des Cordelieres, F-75013 Paris (FR). KAMOUN, Aurelie; 41 avenue de Saint Mande, F-75012 Paris (FR). ALLORY, Yves; 40 rue Sedaine, F-75011 Paris (FR). DE REYNIES, Aurelien; 69 boulevard Saint-Michel, F-75005 Paris (FR). BERNARD-PIERROT, Isabelle; 8 rue Francois Rolland, F-94130 Nogent sur Marne (FR). LEBRET, Thierry; 10 chemin des Biloises, F-78300 Bougival (FR).
(74) Agents: PIERRU, Benedicte et al.; Becker & Associes, 25, rue Louis le Grand, 75002 Paris (FR).
(81) Designated States (unless otherwise indicated, for every kind of national protection available): AE, AG, AL, AM, AO, AT, AU, AZ, BA, BB, BG, BH, BN, BR, BW, BY, BZ, CA, CH, CL, CN, CO, CR, CU, CZ, DE, DK, DM, DO, DZ, EC, EE, EG, ES, FI, GB, GD, GE, GH, GM, GT, HN, HR, HU, ID, IL, IN, IR, IS, JP, KE, KG, KN, KP, KR, KZ, LA, LC, LK, LR, LS, LT, LU, LY, MA, MD, ME, MG, MK, MN, MW, MX, MY, MZ, NA, NG, NI, NO, NZ, OM, PA, PE, PG, PH, PL, PT, QA, RO, RS, RU, RW, SA, SC, SD, SE, SG, SK, SL, SM, ST, SV, SY, TH, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ, VC, VN, ZA, ZM, ZW.
(84) Designated States (unless otherwise indicated, for every kind of regional protection available): ARIPO (BW, GH, GM, KE, LR, LS, MW, MZ, NA, RW, SD, SL, SZ, TZ, UG, ZM, ZW), Eurasian (AM, AZ, BY, KG, KZ, RU, TJ, TM), European (AL, AT, BE, BG, CH, CY, CZ, DE, DK, EE, ES, FI, FR, GB, GR, HR, HU, IE, IS, IT, LT, LU, LV, MC, MK, MT, NL, NO, PL, PT, RO, RS, SE, SI, SK, SM, TR), OAPI (BF, BJ, CF, CG, CI, CM, GA, GN, GQ, GW, KM, ML, MR, NE, SN, TD, TG).
Published:
- with international search report (Art. 21(3))
- before the expiration of the time limit for amending the claims and to be republished in the event of receipt of amendments (Rule 48.2(h))
- with sequence listing part of description (Rule 5.2(a))

(54) Title: COMPOSITIONS AND METHODS FOR TREATING MUSCLE-INVASIVE BLADDER CANCER

(57) Abstract: The present invention relates to a method to classify patients suffering from a muscle-invasive bladder cancer for therapeutic intervention, in particular for selecting a patient afflicted with a muscle-invasive bladder cancer for a treatment comprising an EGFR kinase inhibitor and/or capecitabine.

Compositions and methods for treating muscle-invasive bladder cancer

FIELD OF THE INVENTION

The present invention relates to the field of medicine, in particular of oncology. It relates to a new method to classify patients suffering from a muscle-invasive bladder cancer for therapeutic intervention.

BACKGROUND OF THE INVENTION

Bladder carcinoma is one of the most common cancers in North America and Europe, accounting for approximately 200,000 new cases and 65,000 deaths in these regions in 2008. Bladder carcinoma may present as a non-muscle-invasive (70-80% of cases) or muscle-invasive (20-30% of cases) disease, with highly divergent outcomes. Most patients with non-muscle-invasive bladder cancers (NMIBC) suffer multiple recurrences of the disease without developing a muscle-invasive neoplasm. In contrast, muscle-invasive bladder cancer (MIBC) is a major clinical issue, with cancer-related mortality of 40-50% at five years for patients with organ-confined tumors and more than 80% for those with lymph node involvement or distant metastasis. Radical cystectomy is the standard treatment for MIBC. The addition of neoadjuvant and/or adjuvant chemotherapy has very modest benefits for overall survival (Sternberg et al., 2012). Iterative bladder resection and radiotherapy alone are generally considered palliative treatment options for patients unfit for cystectomy, or are used as part of a multimodal bladder-preserving approach (Bellmunt et al., 2010). Improvements in understanding the molecular mechanisms involved in bladder carcinoma have highlighted several potential molecular treatment targets, but no targeted treatment for MIBC is currently used in clinical practice (Dovedi and Davies, 2009). Clinical trials based on targeted therapies, either alone or in combination with conventional chemotherapy, have been largely unsuccessful (Bellmunt and Petrylak, 2012; Necchi et al., 2012; Pruthi et al., 2010; Wong et al., 2012). In particular, autocrine-mediated EGFR signaling has been shown to play an essential role in normal urothelium repair and its involvement in bladder carcinogenesis has been suggested, as the overexpression of EGFR and its ligands has been associated with advanced tumor grade/stage and poor clinical outcome (Chow et al., 2001; Thogersen et al., 2001).
Moreover, preclinical studies in human bladder cancer cell lines have identified a spectrum of sensitivity to EGFR inhibitors (Adam et al., 2009; Black et al., 2008). However, recent phase II clinical trials in neoadjuvant or adjuvant settings have suggested that EGFR inhibitors have limited effects in patients with MIBC (Pruthi et al., 2010; Wong et al., 2012). All clinical trials using EGFR inhibitors have been performed in unselected MIBC patients since, up to now, no predictive factors of response have been identified. Several expression profiling studies have revealed a high degree of molecular diversity in bladder carcinomas, including MIBC (Blaveri et al., 2005; Dyrskjot et al., 2003; Dyrskjot et al., 2007; Lindgren et al., 2010; Sjodahl et al., 2012), and patients with a particular tumor subtype might benefit from a targeted treatment considered ineffective in an unselected patient population (Baselga, 2008). As MIBC is a highly heterogeneous disease in both molecular and clinical terms, tumor stratification is a key issue in the identification of appropriate targeted treatments. In particular, a stratified approach to anti-EGFR therapy for MIBC may improve treatment efficacy. There is thus a strong need to provide reliable markers that could be used to stratify MIBC and to select patients for successful targeted therapies, in particular anti-EGFR therapy.

SUMMARY OF THE INVENTION

Towards a stratified approach to bladder cancer therapy, the inventors searched for clinically relevant, molecularly homogeneous subgroups of MIBC. They identified a subgroup of particularly aggressive MIBC tumors wherein the EGFR pathway was deregulated and they further provided evidence of a relationship between this subgroup of MIBC and sensitivity to anti-EGFR drugs.

Accordingly, in a first aspect, the present invention concerns a method for determining whether a muscle-invasive bladder cancer has a basal-like phenotype, wherein the method comprises
(i) determining the expression level of KRT5, KRT6A and/or KRT6B and the expression level of nuclear FOXA1 in a cancer sample; and/or
(ii) determining the expression level of at least 2 genes selected from the group consisting of PI3, KRT6B, CSTA, DSC2, MT1X, RAB38, SFN, SAMD9, EGFR, CD44, IL1RAP, DSP, PKP1, SERPINB7, CELSR2, DUSP7, TBC1D2, ARL4D, IPPK, MTSS1 and RGS20 genes, and the expression level of at least 2 genes selected from the group consisting of PHC1, THYN1, TACC1, PPAP2B, NRXN3, GNA14, ZFHX3, TLE2, MAML3, EPS8, CACNA1D, RAB15, MAN1C1, SORL1, CHN2, TGFBR3, CAB39L, LIMCH1 and BAMBI genes, in a cancer sample; and/or
(iii) determining the expression level of at least one exon selected from the group consisting of the exons of SEQ ID NO: 23 to 41, in a cancer sample; and/or
(iv) determining the DNA methylation status of at least 4 CpG islands selected from the group consisting of CpG islands listed in Table 7 in a cancer sample; and/or
(v) determining the DNA methylation status of at least 4 GpC sites selected from the group consisting of GpC sites listed in Table 9 in a cancer sample; and/or
(vi) determining the expression level of the TGM1 gene in a cancer sample,
thereby determining whether a muscle-invasive bladder cancer has a basal-like phenotype.
Preferably, the method comprises determining the expression level of KRT5, KRT6A and/or KRT6B and the expression level of nuclear FOXA1 in a cancer sample, the expression of KRT5, KRT6A and/or KRT6B and the absence of nuclear FOXA1 being indicative that the muscle-invasive bladder cancer has a basal-like phenotype. In particular, the expression level of KRT5, KRT6A and/or KRT6B and the expression level of nuclear FOXA1 may be assessed by immunohistochemistry.

In a particular embodiment, the method comprises determining the expression level of PKP1, IPPK, MAML3 and TGFBR3 genes in a cancer sample, high expression level of PKP1 and IPPK genes and low expression level of MAML3 and TGFBR3 genes being indicative that the muscle-invasive bladder cancer has a basal-like phenotype.

In another particular embodiment, the method comprises determining the expression level of the exon of SEQ ID NO: 24 in a cancer sample, low expression level of said exon being indicative that the muscle-invasive bladder cancer has a basal-like phenotype.

In another particular embodiment, the method comprises determining the expression level of the exon of SEQ ID NO: 37 in a cancer sample, high expression level of said exon being indicative that the muscle-invasive bladder cancer has a basal-like phenotype.

In another particular embodiment, the method comprises determining the expression level of the TGM1 gene, high expression level being indicative that the muscle-invasive bladder cancer has a basal-like phenotype. In particular, this method further comprises determining the expression level of the exon of SEQ ID NO: 37.

In another particular embodiment, the method comprises determining the expression level of the TGM1 gene and determining the expression level of the HDAC9 short isoform corresponding to the transcript ENST00000456174 and of the HDAC9 long isoforms corresponding to transcripts ENST00000406451, ENST00000405010 and ENST00000428307.
In particular, the method further comprises calculating the ratio of HDAC9 short isoform to HDAC9 long isoforms, a high expression level of the TGM1 gene and a high ratio being indicative that the muscle-invasive bladder cancer has a basal-like phenotype.

In a further particular embodiment, the method comprises determining the DNA methylation status of CpG islands of SEQ ID NO: 43, 45, 47 and 51, hypermethylation of the CpG island of SEQ ID NO: 43 and hypomethylation of CpG islands of SEQ ID NO: 45, 47 and 51 being indicative that the muscle-invasive bladder cancer has a basal-like phenotype.

In another particular embodiment, the method comprises determining the DNA methylation status of GpC sites of SEQ ID NO: 85, 86, 91 and 96, hypomethylation of said GpC sites being indicative that the muscle-invasive bladder cancer has a basal-like phenotype.

In a second aspect, the present invention concerns a method for predicting clinical outcome of a patient afflicted with a muscle-invasive bladder cancer, wherein the method comprises determining in a cancer sample from said patient whether the muscle-invasive bladder cancer has a basal-like phenotype with the method according to the invention, the presence of the basal-like phenotype being indicative of a poor prognosis.
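By way of illustration only, the embodiment combining TGM1 expression with the HDAC9 short/long isoform ratio could be sketched as follows. The transcript identifiers are those given above; the function names, the choice of summing the three long isoforms before taking the ratio, and the two cutoff parameters are assumptions made for this sketch, since no aggregation rule or numerical threshold is specified in this section.

```python
from typing import Mapping

# Transcript identifiers as given in the description.
HDAC9_SHORT_TRANSCRIPTS = ("ENST00000456174",)
HDAC9_LONG_TRANSCRIPTS = (
    "ENST00000406451",
    "ENST00000405010",
    "ENST00000428307",
)


def hdac9_short_long_ratio(transcript_expr: Mapping[str, float]) -> float:
    """Ratio of HDAC9 short-isoform to long-isoform expression.

    Assumption: long isoforms are aggregated by summation and expression
    values are on a linear (not log) scale.
    """
    short = sum(transcript_expr[t] for t in HDAC9_SHORT_TRANSCRIPTS)
    long_total = sum(transcript_expr[t] for t in HDAC9_LONG_TRANSCRIPTS)
    if long_total <= 0:
        raise ValueError("long-isoform expression must be positive")
    return short / long_total


def is_basal_like(
    tgm1_expr: float,
    transcript_expr: Mapping[str, float],
    tgm1_cutoff: float,   # hypothetical cutoff, to be set from reference samples
    ratio_cutoff: float,  # hypothetical cutoff, to be set from reference samples
) -> bool:
    """True when both TGM1 expression and the HDAC9 ratio are high."""
    ratio = hdac9_short_long_ratio(transcript_expr)
    return tgm1_expr > tgm1_cutoff and ratio > ratio_cutoff


# Made-up expression values, for demonstration only.
example = {
    "ENST00000456174": 8.0,
    "ENST00000406451": 1.0,
    "ENST00000405010": 2.0,
    "ENST00000428307": 1.0,
}
print(hdac9_short_long_ratio(example))
print(is_basal_like(5.0, example, tgm1_cutoff=3.0, ratio_cutoff=1.5))
```

In practice the two cutoffs would have to be derived from reference basal-like and non-basal-like samples; the values used in the example carry no biological meaning.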
In a further aspect, the present invention concerns a method for selecting a patient afflicted with a muscle-invasive bladder cancer for a treatment comprising an EGFR kinase inhibitor and/or capecitabine, wherein the method comprises determining in a cancer sample from said patient whether the muscle-invasive bladder cancer has a basal-like phenotype with the method according to the invention, and optionally determining whether the muscle-invasive bladder cancer has a RAS-activating mutation, the presence of the basal-like phenotype being indicative that said patient is susceptible to benefit from a treatment comprising capecitabine, and the presence of the basal-like phenotype and the absence of a RAS-activating mutation being indicative that said patient is susceptible to benefit from a treatment comprising an EGFR kinase inhibitor.

In particular, the EGFR kinase inhibitor is selected from the group consisting of erlotinib, cetuximab, gefitinib, lapatinib, panitumumab, zalutumumab, nimotuzumab and matuzumab, and any combination thereof. Preferably, the EGFR kinase inhibitor is selected from the group consisting of erlotinib and cetuximab, and a combination thereof.

The present invention also concerns a method of predicting the sensitivity of a muscle-invasive bladder cancer to a treatment comprising an EGFR kinase inhibitor and/or capecitabine, wherein the method comprises determining whether the muscle-invasive bladder cancer has a basal-like phenotype with the method according to the invention, and optionally determining whether the muscle-invasive bladder cancer has a RAS-activating mutation, the presence of the basal-like phenotype in said cancer being indicative that said cancer is sensitive to a treatment comprising capecitabine, and the presence of the basal-like phenotype and the absence of a RAS-activating mutation being indicative that said cancer is sensitive to a treatment comprising an EGFR kinase inhibitor.
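The selection logic stated above can be summarized in a short sketch. The function name and the returned labels are hypothetical, and treating an undetermined RAS status as "do not indicate an EGFR kinase inhibitor" is an assumption made here for caution; the text only states that RAS determination is optional.

```python
from typing import List, Optional


def candidate_treatments(
    basal_like: bool,
    ras_activating_mutation: Optional[bool],
) -> List[str]:
    """Return treatments the patient may be susceptible to benefit from.

    ras_activating_mutation is None when RAS status was not determined.
    """
    treatments: List[str] = []
    if basal_like:
        # Basal-like phenotype alone is indicative for capecitabine.
        treatments.append("capecitabine")
        # Basal-like phenotype AND confirmed absence of a RAS-activating
        # mutation is indicative for an EGFR kinase inhibitor.
        if ras_activating_mutation is False:
            treatments.append("EGFR kinase inhibitor")
    return treatments


print(candidate_treatments(True, False))
print(candidate_treatments(True, True))
print(candidate_treatments(False, False))
```

The sketch encodes only the indications recited in this section and is not a clinical decision tool.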
In a further aspect, the present invention concerns an EGFR kinase inhibitor for use in the treatment of muscle-invasive bladder cancer having a basal-like phenotype as determined with the method according to the invention and without RAS-activating mutation.

The present invention also concerns capecitabine for use in the treatment of muscle-invasive bladder cancer having a basal-like phenotype as determined with the method according to the invention.

In another aspect, the present invention concerns a kit and its use (i) for predicting clinical outcome of a patient afflicted with a muscle-invasive bladder cancer, (ii) for selecting a patient afflicted with a muscle-invasive bladder cancer for a treatment comprising an EGFR kinase inhibitor and/or capecitabine, and/or (iii) for predicting the sensitivity of a muscle-invasive bladder cancer to a treatment comprising an EGFR kinase inhibitor and/or capecitabine, wherein the kit comprises detection means selected from the group consisting of a pair of primers, a probe and an antibody specific to
(a) the genes KRT5, KRT6A, KRT6B and/or FOXA1; and/or
(b) at least 2 genes selected from the group consisting of PI3, KRT6B, CSTA, DSC2, MT1X, RAB38, SFN, SAMD9, EGFR, CD44, IL1RAP, DSP, PKP1, SERPINB7, CELSR2, DUSP7, TBC1D2, ARL4D, IPPK, MTSS1 and RGS20 genes, preferably PKP1 and IPPK, and at least 2 genes selected from the group consisting of PHC1, THYN1, TACC1, PPAP2B, NRXN3, GNA14, ZFHX3, TLE2, MAML3, EPS8, CACNA1D, RAB15, MAN1C1, SORL1, CHN2, TGFBR3, CAB39L, LIMCH1 and BAMBI genes, preferably MAML3 and TGFBR3; and/or
(c) at least one exon selected from the group consisting of the exons of SEQ ID NO: 23 to 41, preferably the exon of SEQ ID NO: 24; and/or
(d) at least 4 CpG islands selected from the group consisting of CpG islands listed in Table 7; and/or
(e) at least 4 GpC sites selected from the group consisting of GpC sites listed in Table 9; and/or
(f) the TGM1 gene;
and optionally, a leaflet providing guidelines for such use. The kit may further comprise detection means selected from the group consisting of a pair of primers, a probe and an antibody specific to the HDAC9 short isoform corresponding to the transcript ENST00000456174 and HDAC9 long isoforms corresponding to transcripts ENST00000406451, ENST00000405010 and ENST00000428307.

The present invention also concerns a DNA chip and its use (i) for predicting clinical outcome of a patient afflicted with a muscle-invasive bladder cancer, (ii) for selecting a patient afflicted with a muscle-invasive bladder cancer for a treatment comprising an EGFR kinase inhibitor and/or capecitabine, and/or (iii) for predicting the sensitivity of a muscle-invasive bladder cancer to a treatment comprising an EGFR kinase inhibitor and/or capecitabine, wherein the DNA chip comprises a solid support which carries nucleic acids that are specific to
(b) at least 2 genes selected from the group consisting of PI3, KRT6B, CSTA, DSC2, MT1X, RAB38, SFN, SAMD9, EGFR, CD44, IL1RAP, DSP, PKP1, SERPINB7, CELSR2, DUSP7, TBC1D2, ARL4D, IPPK, MTSS1 and RGS20 genes, preferably PKP1 and IPPK, and at least 2 genes selected from the group consisting of PHC1, THYN1, TACC1, PPAP2B, NRXN3, GNA14, ZFHX3, TLE2, MAML3, EPS8, CACNA1D, RAB15, MAN1C1, SORL1, CHN2, TGFBR3, CAB39L, LIMCH1 and BAMBI genes, preferably MAML3 and TGFBR3; and/or
(c) at least one exon selected from the group consisting of the exons of SEQ ID NO: 23 to 41, preferably the exon of SEQ ID NO: 24; and/or
(d) at least 4 CpG islands selected from the group consisting of CpG islands listed in Table 7; and/or
(e) at least 4 GpC sites selected from the group consisting of GpC sites listed in Table 9; and/or
(f) the TGM1 gene.
The DNA chip may further comprise nucleic acids that are specific to the HDAC9 short isoform corresponding to the transcript ENST00000456174 and HDAC9 long isoforms corresponding to transcripts ENST00000406451, ENST00000405010 and ENST00000428307.

In another aspect, the present invention concerns a combined preparation, product or kit containing (a) capecitabine and (b) an alkylating agent, as a combined preparation for simultaneous, separate or sequential use in the treatment of a muscle-invasive bladder cancer having a basal-like phenotype as determined with the method of the invention. Preferably, the alkylating agent is cisplatin.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1. Gene expression profiling identifies a basal-like molecular subtype of MIBC with specific genetic and immunohistochemical features. (A) Hierarchical consensus clustering and (B) principal component analysis of gene expression profiles of 85 MIBC (CIT series), showing two very distinct subgroups of tumors, the first (at left) comprising 64 tumors partitioned into several subclusters and the second (at right) comprising 21 tumors on which the inventors focused. (C) Heatmap of the top 52 differentially-expressed genes between the two subgroups of MIBC, using row mean centered data. (D) Immunohistochemical markers associated with the basal-like bladder cancer phenotype. Images on the left: three representative cases of basal-like MIBC, demonstrating strong, diffuse labeling for CK5/6 and an absence of nuclear FOXA1. Images on the right: three representative cases of non-basal-like MIBC, showing an absence of CK5/6 in conjunction with strong nuclear FOXA1 (CIT.39), an absence of CK5/6 in conjunction with nuclear FOXA1 (CIT.80) or positive labeling for both CK5/6 and nuclear FOXA1 (CIT.277) (x 20 objective magnification). The graph below shows the distribution of the various CK5/6 and FOXA1 phenotypes in basal-like and non-basal-like MIBC (the P-value given was obtained in a chi-squared test for trend). (E) Somatic mutation status of the FGFR3, RAS, PIK3CA and TP53 genes and histologic type for each MIBC sample. Two-tailed Fisher's exact tests were used to compare the distributions between basal-like and non-basal-like MIBC.

Figure 2. The basal-like phenotype is predictive of a poorer clinical outcome in patients with MIBC. Kaplan-Meier curves of overall survival in basal-like (BL) and non-basal-like (non-BL) muscle-invasive bladder carcinomas (MIBC) are shown for all patients (Fig. 2A), node negative, non-metastatic patients (N0, M0) (Fig. 2B), node positive, non-metastatic patients (N1, M0) (Fig. 2C) and patients with metastases (M1) (Fig. 2D).
P-values of log-rank tests for comparisons of overall survival between the BL and non-BL groups are shown for time points one, two and five years after diagnosis. Forest plots (Fig. 2E and F) illustrate overall survival-related hazard ratios (HR), with 95% confidence intervals (upper and lower limits) between BL and non-BL MIBC patients within subpopulations defined by sex (female/male), age group, stage (T2/T3/T4), node status (positive/negative) and metastasis (yes/no). Squares indicate the hazard ratios, segments show the corresponding 95% confidence intervals. The size of the square represents the size of the subpopulation (n). Hazard ratios were calculated with Cox models, for one year (Fig. 2E) and five years (Fig. 2F) after diagnosis.

Figure 3. Activation of the EGFR pathway in basal-like MIBC. (A) qRT-PCR validation of gene expression array data comparing mRNA levels for EGFR and its ligands (AREG, EREG, HBEGF and TGFA) between basal-like and non-basal-like MIBC. (B) Western-blotting analysis of EGFR and phospho-EGFR Tyr1068 (p-EGFR) levels in basal-like and non-basal-like MIBC; the tumor sample indicated by # displayed EGFR gene amplification. Lower panel: representative immunoblots. Upper panel: immunoblot quantification (arbitrary units). (C) Immunohistochemical analysis of phospho-EGFR Tyr1068. Upper panel: representative case of basal-like MIBC displaying intense membranous and cytoplasmic staining in tumor cells that was absent in stromal cells. Bottom panel: non-basal-like MIBC displaying negative staining. (D) Comparison of EGFR mRNA level between basal-like and non-basal-like MIBC for tumors with a normal EGFR gene copy number. Data are presented as medians and interquartile ranges. A two-tailed Mann-Whitney test was used in B and C (** P < 0.001; *** P < 0.0001).

Figure 4. EGFR signaling is essential for the growth of basal-like bladder cancer cells in vitro and in vivo.
(A) Heatmap comparing levels of basal cell cytokeratins and EGFR pathway-related components between basal-like and non-basal-like bladder cancer cell lines. mRNA levels were quantified by qRT-PCR and protein levels were quantified by western blotting. Fold-change ratios are the mean value for basal-like tumors with respect to that for non-basal-like bladder cancer cell lines. P-values indicate the significance of differences between the two groups of cell lines. Cell lines indicated by # displayed EGFR gene amplification and were excluded from fold-change ratio calculations for EGFR (mRNA and protein) and p-EGFR. The cell lines indicated by * carried a RAS-activating mutation. (B) Effect of erlotinib on the growth of basal-like and non-basal-like bladder cancer cell lines after 72 h of treatment. The GI50 for erlotinib is plotted for each cell line. (C) Comparison of the GI50 for erlotinib between basal-like (BL) and non-basal-like (Non-BL) bladder cancer cell lines (box plot representation); (+) indicates the mean. (D) Western-blot analysis of EGFR phosphorylation and downstream signaling, following erlotinib treatment, in two basal-like (SCaBER and UMUC6) and two non-basal-like (JMSU1 and KK47) bladder cancer cell lines. (E) The inhibition, by erlotinib (100 mg/kg), of the growth of xenografts of human basal-like and non-basal-like bladder cancer cell lines. Data are presented as means ± s.e.m. A two-tailed Mann-Whitney test was used in A, C and E (* P < 0.01; ** P < 0.001; *** P < 0.0001).

Figure 5. An EGFR-driven autocrine mitogenic loop is activated in human basal-like bladder cancer cell lines. (A) The effect of cetuximab on cell growth was evaluated at the clinically relevant concentration of 10 µg/ml for 11 basal-like and 11 non-basal-like bladder cancer cell lines, after 72 h of treatment. Results are expressed as the percentage of viable cells relative to untreated control cells. (B) Comparison of cell growth following cetuximab treatment between basal-like (BL) and non-basal-like (Non-BL) bladder cancer cell lines (box plot representation); (+) indicates the mean. (C) EGFR phosphorylation and the downstream signaling proteins ERK and AKT were analyzed by western blotting after six hours of cetuximab treatment, in two basal-like (UMUC16 and UMUC6) and two non-basal-like (JMSU1 and KK47) bladder cancer cell lines. Cells were deprived of serum overnight, before treatment. (D) Effect of a neutralizing antibody against amphiregulin (anti-AR) on the autonomous growth of three basal-like (SCaBER, UMUC16 and UMUC6) and two non-basal-like (JMSU1 and KK47) bladder cancer cell lines. Cells were treated for 6 days in serum-free medium supplemented with 40 µg/ml transferrin.
Results are expressed as the percentage of viable cells with respect to untreated control. Data are represented as means ± s.e.m. and P-values, indicating the significance of differences between cells treated with the IgG mouse control and cells treated with the anti-AR, as assessed in a two-tailed Mann-Whitney test (** P < 0.001; *** P < 0.0001; ns: not significant). (E) Effect of erlotinib and cetuximab on the transcription of EGFR ligand genes, as assessed by qRT-PCR after 8 h or 24 h of treatment of two basal-like bladder cancer cell lines, SCaBER and L1207. The L1207 cell line displayed EGFR gene amplification.

Figure 6. Mouse BBN-induced bladder tumors have a basal-like molecular profile and are sensitive to anti-EGFR therapy (erlotinib). (A) Expression levels of genes associated with the basal-like phenotype in human basal-like MIBC (BL MIBC) and in mouse basal-like BBN-induced bladder tumors (BL BBN-T) compared to normal urothelium (N), human or mouse. For each species, the heatmap on the left shows the level of expression (quantified by microarray) of each gene (rows) in each sample (columns). The bar charts on the right show the mean expression value in basal-like tumors (T) for each gene relative to the mean expression value in the normal urothelium samples (N) (normalized to 1). Data are represented as means ± s.e.m. P-values for differences between basal-like tumors and normal urothelium were obtained in two-tailed Mann-Whitney tests (* P < 0.01; ** P < 0.001). (B, C) Effect of erlotinib treatment (100 mg/kg, 6 days per week) on the progression of BBN-induced mouse bladder tumors. Kaplan-Meier curves of tumor-free survival (B, ultrasound scan showing a tiny tumor on the bladder wall) and overall survival (C, ultrasound scan showing complete obstruction of the bladder by the tumor) in mice treated with erlotinib or vehicle. P-values were obtained in a log-rank test.

Figure 7.
Summary of the molecular and pathological features associated with the basal-like phenotype in 82 human MIBC from the CIT series. A 40-gene predictor discriminating basal-like from non-basal-like bladder tumors was constructed from the CIT series. This predictor, validated in 6 independent transcriptome datasets, correctly classified MIBC into the basal-like and non-basal-like molecular subtypes, with the exception of one sample (CIT36), as shown by comparison with the clustering consensus approach used as the reference method for classification. An immunohistochemical signature predictive of the basal-like subtype was established in a series of 62 MIBC. The CK5/6+FOXA1- phenotype was found in most of the basal-like MIBC analyzed. mRNA levels for seven genes were analyzed by qRT-PCR (EGFR, AREG, EREG, HBEGF, TGFA and ΔNp63) or with gene expression arrays (KRT14). The optimal cutoff point defining high and low levels of expression was obtained for each gene by regression tree analysis (RPart). High levels of EGFR, AREG, EREG, HBEGF and TGFA expression were significantly associated with the basal-like subgroup in our study (indicated by *). Overexpression of KRT14 and of ΔNp63 have been identified as single markers, associated with the expression of basal cell differentiation markers, in previous studies by Karni-Schmidt et al., 2011 (#) and Volkmer et al., 2012 (†). Two-tailed Fisher's exact tests were used for statistical comparisons between basal-like and non-basal-like MIBC.

Figure 8. TYMP/DPYD ratio is increased in human MIBC. (A) qRT-PCR analysis of TYMP and DPYD mRNA expression in human normal urothelial samples and in human MIBC presenting (BL MIBC) or not (Non-BL MIBC) a basal-like signature as defined herein. (B) TYMP protein expression measured by reverse phase protein array in human normal urothelial samples and in human MIBC presenting or not a basal-like signature.
(C) Correlation between mRNA level and protein level of TYMP in urothelial samples and in basal-like and non-basal-like MIBC. Statistical significance of correlation was assessed by Spearman's rank correlation test. Data are presented as mean ± s.e.m. A two-tailed Mann-Whitney test was used in A and B. * P < 0.05; ** P < 0.001; *** P < 0.0001; ns: not significant.

Figure 9. Antitumor activity of capecitabine on tumor xenografts of four human basal-like bladder cancer cell lines and two human non-basal-like bladder cancer cell lines (n=6 animals/group). Mice were treated either with vehicle control (40 mM citrate buffer containing 0.5% of carboxymethylcellulose, pH 6.0) or capecitabine at a dose of 400 mg/kg for 7 consecutive days followed by a one-week rest. Data are presented as mean ± s.e.m. Statistical analysis was performed using a two-tailed Mann-Whitney test.

Figure 10. Class centroids for basal tumours and non-basal tumours. Horizontal bars on the right (resp. left) sides quantify the inclusion (resp. exclusion) level of the exon undergoing a splicing change in each of the 19 genes.

Figure 11. Expression of FOXA1 and CK5/6 assessed by immunohistochemistry in basal-like and non-basal-like MIBC. BL-MIBC are represented by plain black dots, whereas NBL-MIBC are represented by open black dots. The number beside each dot represents the number of tumors presenting the same expression of FOXA1 and CK5/6 in case there are several tumors of the same type (BL-MIBC or NBL-MIBC) with the same expression of FOXA1 and CK5/6. The black lines represent the thresholds which separate the BL-MIBC from the NBL-MIBC: BL-MIBC present a high expression of CK5/6 and a low expression of FOXA1 nuclear staining.

Figure 12. Alternatively spliced isoform of gene TGM1 lacking exon 9 is a specific marker for basal-like tumors.
Figure 13. Comparison of capecitabine and cisplatin treatment on tumor xenografts of one basal-like bladder cancer cell line presenting an N-RAS mutation (n=6 animals/group). Mice were treated either with vehicle control (40 mM citrate buffer containing 0.5% of carboxymethylcellulose, pH 6.0) or capecitabine at a dose of 400 mg/kg for 7 consecutive days followed by a one-week rest, or cisplatin at a dose of 6 mg/kg once every three weeks, or with a combination of capecitabine and cisplatin. Data are presented as mean ± s.e.m. Statistical analysis was performed using a two-tailed Mann-Whitney test.

Figure 14. Quantification of short (grey) and long (black) HDAC9 isoforms in basal-like and non-basal-like bladder tumors.

Figure 15. Quantification of short/long HDAC9 isoform expression ratio in basal-like and non-basal-like bladder tumors.

Figure 16. Quantification of TGM1 gene expression in basal-like and non-basal-like bladder tumors by RT-qPCR.

Figure 17. Number of tumors exhibiting a high expression level of TGM1 gene and a high HDAC9 short/long isoform ratio according to exon array dataset. This representation includes all tumors analyzed by exon array, including 30 basal-like and 177 non-basal-like tumors. As shown in this figure, 15 basal-like and no non-basal-like tumors have a high expression level of TGM1 and a high HDAC9 short/long isoform ratio.

Figure 18. Number of tumors positive for overexpression of TGM1 and HDAC9 short/long isoform ratio according to exon array dataset. This representation includes tumors analyzed by RT-qPCR. As shown in this figure, 13 basal-like and no non-basal-like tumors have a high expression level of TGM1 and a high HDAC9 short/long isoform ratio.

Figure 19. Number of tumors positive for overexpression of TGM1 and HDAC9 short/long isoform ratio by RT-qPCR. All basal-like tumors are identified with this marker combination.
As shown in this figure, 14 basal-like and none non-basal-like tumors have a high expression level of TGM1 and a high HDAC9 short/long isoform ratio. Figure 20. Number of tumors positive for overexpression of TGM1 and HDAC9 short isoform by RT-qPCR. As shown in this figure, 13 basal-like and none non-basal-like tumors have a high expression level of TGM1 and a high HDAC9 short/long isoform ratio.

DETAILED DESCRIPTION OF THE INVENTION

New treatment options are required to improve muscle-invasive bladder cancer (MIBC) outcomes, which are currently poor. In cases where primary treatment (surgery) has failed, no effective second line treatment has been found. From large-scale transcriptomic data, the inventors identified a distinct MIBC subgroup wherein the EGFR pathway was deregulated. Due to the consistent expression of basal cell markers, this subgroup was named "basal-like". Basal-like MIBC are particularly aggressive and account for 20% of MIBC cases. The levels of EGFR protein and its phosphorylated form were significantly higher in basal-like tumors than in non-basal-like tumors. However, these levels are not sufficient to distinguish these two subgroups. Accordingly, the inventors developed four methods to identify basal-like MIBC. This subgroup may be identified by immunohistochemical markers or transcriptomic, alternative splicing and DNA methylation signatures. They also demonstrated, with in vitro and in vivo preclinical models, that therapy targeting EGFR and/or comprising capecitabine, was particularly effective for basal-like tumors.

Definitions
The methods of the invention as disclosed herein may be in vivo, ex vivo or in vitro methods. Preferably, the methods of the invention are in vitro methods.
As used herein, the term "subject" or "patient" refers to an animal, preferably to a mammal, even more preferably to a human, including adult, child and human at the prenatal stage. However, the term "subject" can also refer to non-human animals, in particular mammals such as dogs, cats, horses, cows, pigs, sheep and non-human primates, among others, that are in need of treatment.
The term "sample", as used herein, means any sample containing cells derived from a subject, preferably a sample which contains nucleic acids. Examples of such samples include fluids such as blood, plasma, saliva, urine and seminal fluid samples as well as biopsies, organs, tissues, cell samples or cancer-associated ascites fluids. The sample may be treated prior to its use. The term "sample" may also refer to any sample containing free circulating nucleic acids.
The term "cancer sample" refers to any sample comprising tumor cells derived from a patient, preferably a sample which comprises nucleic acids. Preferably, the sample contains only tumor cells (i.e., no normal or healthy cell). The term "cancer sample" may also refer to any sample comprising free circulating nucleic acids from tumor cells. Preferably, the sample contains only nucleic acids from tumor cells.
The term "cancer" or "tumor", as used herein, refers to the presence of cells possessing characteristics typical of cancer-causing cells, such as uncontrolled proliferation, immortality, metastatic potential, rapid growth and proliferation rate, and certain characteristic morphological features.
The term "bladder cancer" or "bladder tumor", as used herein, refers to a urinary bladder tumor, bladder cancer or urinary bladder cancer, and bladder neoplasm or urinary bladder neoplasm.
A bladder tumor can be a bladder carcinoma or a bladder adenoma, preferably a bladder carcinoma. The most common staging system for bladder tumors is the TNM (tumor, node, metastasis) system. This staging system takes into account how deep the tumor has grown into the bladder, whether there is cancer in the lymph nodes and whether the cancer has spread to any other part of the body. The T part of TNM indicates how far into the bladder the cancer cells have grown: CIS/Ta: cancer cells are detected only in the innermost layer of the bladder lining; T1: the cancer has started to grow into the connective tissue beneath the bladder lining; T2: the cancer has grown through the connective tissue into the muscle; T3: the cancer has grown through the muscle into the fat layer; and T4: the cancer has spread outside the bladder. The term "muscle-invasive bladder cancer" or MIBC refers to a bladder tumor that is invasive, i.e. a bladder cancer from T2 to T4 stages. As used herein, the term "poor prognosis" refers to a decreased patient survival and/or an early disease progression and/or an increased disease recurrence and/or an increased metastasis formation, preferably a decreased patient survival. The term "patient survival" refers to the time interval between the date of diagnosis or the surgery and the date of death, and preferably to the time interval between the date of cystectomy and the date of death. As used herein, the term "treatment", "treat" or "treating" refers to any act intended to ameliorate the health status of patients such as therapy, prevention, prophylaxis and retardation of the disease. In certain embodiments, such term refers to the amelioration or eradication of a disease or symptoms associated with a disease. In other embodiments, this term refers to minimizing the spread or worsening of the disease resulting from the administration of one or more therapeutic agents to a subject with such a disease.
This term refers to the treatment at any stage of the disease. In particular, it can be an adjuvant therapy (chemo- or radiotherapy after surgery) or a neo-adjuvant therapy (chemo- or radiotherapy before surgery). In particular, the term "to treat a cancer" or "treating a cancer" means reversing, alleviating, inhibiting the progress of, or preventing, either partially or completely, the growth of tumors, tumor metastases, or other cancer-causing or neoplastic cells in a patient.

In a first aspect, the present invention relates to a method for determining whether a MIBC has a basal-like phenotype, i.e. belongs to the "basal-like" subgroup identified by the inventors. Immunohistochemical analysis carried out on 62 of the 85 MIBC revealed that most of the basal-like tumors (89%) exhibit the CK5/6+ FOXA1- phenotype, i.e. a strong expression of the basal cytokeratins CK5/6 and the absence of nuclear FOXA1, while only 4.5% of non-basal-like tumors presented this phenotype. Thus, in a first embodiment, the method for determining whether a MIBC has a basal-like phenotype comprises determining the expression level of KRT5, KRT6A and/or KRT6B and the expression level of nuclear FOXA1 in a cancer sample. In particular, the expression of KRT5, KRT6A and/or KRT6B and the absence of nuclear FOXA1 are indicative that the muscle-invasive bladder cancer has a basal-like phenotype. Cytokeratin 6A (KRT6A, also named cytokeratin 6C or 6D) is encoded by the gene KRT6A (also named K6A, K6C, K6D, CK6A, CK6C, CK6D, KRT6C or KRT6D; Entrez Gene ID: 3853) located at 12q13.13. Cytokeratin 6B (KRT6B) is encoded by the gene KRT6B (also named K6B, PC2, CK6B, CK-6B, KRTL1; Entrez Gene ID: 3854) located at 12q13.13. Cytokeratin 5 (KRT5) is encoded by the gene KRT5 (also named K5, CK5, DDD, EBS2, KRT5A; Entrez Gene ID: 3852) located at 12q13.13. Forkhead box A1 (FOXA1) is a nuclear factor encoded by the gene FOXA1 (also named HNF3A or TCF3A; Entrez Gene ID: 3169) located at 14q12-q13. The expression level of KRT5, KRT6A and/or KRT6B may be determined by any method known by the skilled person. In particular, expression level may be determined (i) by measuring the quantity of mRNA and/or (ii) by measuring the quantity of encoded protein. The expression level of nuclear FOXA1 may be determined by any method known by the skilled person that is suitable to distinguish cytoplasmic and nuclear expression of FOXA1.
Methods for determining the quantity of mRNA are well known in the art and include, but are not limited to, quantitative or semi-quantitative RT-PCR, real-time quantitative or semi-quantitative RT-PCR, Nanostring technology, sequencing-based approaches or transcriptome approaches. The nucleic acid contained in the sample (e.g., cells or tissue prepared from the patient) may be first extracted according to standard methods, for example using lytic enzymes or chemical solutions, or extracted by nucleic-acid-binding resins following the manufacturer's instructions. The extracted mRNA may then be detected by hybridization (e.g., Northern blot analysis) and/or amplification (e.g., RT-PCR). Quantitative or semi-quantitative RT-PCR is preferred. Real-time quantitative or semi-quantitative RT-PCR is particularly advantageous. Preferably, primer pairs are designed to overlap an intron, so as to distinguish cDNA amplification from putative genomic contamination. Such primers may be easily designed by the skilled person. Other methods of amplification include, but are not limited to, ligase chain reaction (LCR), transcription-mediated amplification (TMA), strand displacement amplification (SDA) and nucleic acid sequence based amplification (NASBA). Alternatively, the quantity of mRNA may also be measured using the Nanostring's NCOUNTER™ Digital Gene Expression System (Geiss et al, 2008), which captures and counts individual mRNA transcripts by a molecular bar-coding technology and is commercialized by Nanostring Technologies, or the QuantiGene® Plex 2.0 Assay (Affymetrix). The quantity of mRNA may further be determined using approaches based on high-throughput sequencing technology such as RNA-Seq (Wang et al, 2009). The expression level of a gene may also be determined by measuring the quantity of mRNA by transcriptome approaches, in particular by using DNA microarrays.
To determine the expression level of a gene, the sample, optionally first subjected to a reverse transcription, is labelled and contacted with the microarray in hybridization conditions, leading to the formation of complexes between target nucleic acids and the complementary probe sequences attached to the microarray surface. The labelled hybridized complexes are then detected and can be quantified or semi-quantified. Labelling may be achieved by various methods, e.g. by using radioactive or fluorescent labelling. Many variants of the microarray hybridization technology are available to the man skilled in the art. Examples of DNA biochips suitable to measure the expression level of the genes of interest include, but are not limited to, U133 Plus 2.0 array (Affymetrix), or any other whole human genome microarray, such as those from Agilent or Illumina. Next Generation Sequencing methods (NGS) may also be used. Preferably, the quantity of mRNA is measured by quantitative or semi-quantitative RT-PCR, by real-time quantitative or semi-quantitative RT-PCR, by Nanostring technology or sequencing-based approaches, or by transcriptome approaches. Methods for measuring the quantity or the activity of the encoded protein are also well known by the skilled person and the choice of the method depends on the encoded protein. Usually, these methods comprise contacting the sample with a binding partner capable of selectively interacting with the protein present in the sample. The binding partner is generally a polyclonal or monoclonal antibody, preferably monoclonal. The quantity of protein is measured by semi-quantitative Western blots, immunochemistry (enzyme-labeled and mediated immunoassays, such as ELISAs, biotin/avidin type assays, radioimmunoassay, immunoelectrophoresis or immunoprecipitation) or by protein or antibody arrays. The protein expression level may also be assessed by immunohistochemistry on a tissue section of the cancer sample (e.g.
frozen or formalin-fixed paraffin-embedded material). The reactions generally include revealing labels such as fluorescent, chemiluminescent, radioactive, enzymatic labels or dye molecules, or other methods for detecting the formation of a complex between the antigen and the antibody or antibodies reacted therewith. Specific activity assays may also be used, in particular when the encoded protein is an enzyme. Preferably, the expression level of nuclear FOXA1 is determined by immunohistochemistry. More preferably, the expression level of KRT5, KRT6A and/or KRT6B and the expression level of nuclear FOXA1 are assessed by immunohistochemistry. Antibodies specifically recognizing human KRT5, KRT6A, KRT6B and FOXA1 proteins are commercially available (e.g. FOXA1: Ref:23738, Abcam, Cambridge, United Kingdom; CK5/6: clone D5/16 B4, DakoCytomation, Glostrup, Denmark). Antibodies recognizing cytokeratins 5, 6A and 6B may react with only one of these cytokeratins or with two or three, preferably with all of cytokeratins 5, 6A and 6B. In an embodiment, the method comprises determining the expression level of KRT5, KRT6A and KRT6B. In another embodiment, the method comprises determining the expression level of one or two cytokeratins selected from the group consisting of KRT5, KRT6A and KRT6B. In a preferred embodiment, the method comprises determining the expression level of KRT6B.

The inventors analyzed transcriptomic data of 85 MIBC by two different unsupervised methods and identified a subgroup of 21 "basal-like" tumors wherein 761 genes were differentially expressed (P<0.01 and fold change >2) by comparison with the other tumors. Using supervised analyses, the inventors then selected a set of predictive genes for distinguishing basal-like tumors from non-basal-like tumors. Thus, in a second embodiment, the method for determining whether a MIBC has a basal-like phenotype comprises determining, in a cancer sample, the expression levels of
- at least 2 genes selected from the group consisting of PI3 (Entrez Gene ID: 5266), KRT6B (Entrez Gene ID: 3854), CSTA (Entrez Gene ID: 1475), DSC2 (Entrez Gene ID: 1824), MT1X (Entrez Gene ID: 4501), RAB38 (Entrez Gene ID: 23682), SFN (Entrez Gene ID: 2810), SAMD9 (Entrez Gene ID: 54809), EGFR (Entrez Gene ID: 1956), CD44 (Entrez Gene ID: 960), IL1RAP (Entrez Gene ID: 3556), DSP (Entrez Gene ID: 1832), PKP1 (Entrez Gene ID: 5317), SERPINB7 (Entrez Gene ID: 8710), CELSR2 (Entrez Gene ID: 1952), DUSP7 (Entrez Gene ID: 1849), TBC1D2 (Entrez Gene ID: 55357), ARL4D (Entrez Gene ID: 379), IPPK (Entrez Gene ID: 64768), MTSS1 (Entrez Gene ID: 9788), RGS20 (Entrez Gene ID: 8601), PHC1 (Entrez Gene ID: 1911), THYN1 (Entrez Gene ID: 29087), TACC1 (Entrez Gene ID: 6867), PPAP2B (Entrez Gene ID: 8613), NRXN3 (Entrez Gene ID: 9369), GNA14 (Entrez Gene ID: 9630), ZFHX3 (Entrez Gene ID: 463), TLE2 (Entrez Gene ID: 7089), MAML3 (Entrez Gene ID: 55534), EPS8 (Entrez Gene ID: 2059), CACNA1D (Entrez Gene ID: 776), RAB15 (Entrez Gene ID: 376267), MAN1C1 (Entrez Gene ID: 57134), SORL1 (Entrez Gene ID: 6653), CHN2 (Entrez Gene ID: 1124), TGFBR3 (Entrez Gene ID: 7049), CAB39L (Entrez Gene ID: 81617), LIMCH1 (Entrez Gene ID: 22998) and BAMBI (Entrez Gene ID: 25805) genes; and/or
- at least one gene selected from the group consisting of PI3, KRT6B, CSTA, DSC2, MT1X, RAB38, SFN, SAMD9, EGFR, CD44, IL1RAP, DSP, PKP1, SERPINB7, CELSR2, DUSP7, TBC1D2, ARL4D, IPPK, MTSS1 and RGS20 genes, and at least one gene selected from the group consisting of PHC1, THYN1, TACC1, PPAP2B, NRXN3, GNA14, ZFHX3, TLE2, MAML3, EPS8, CACNA1D, RAB15, MAN1C1, SORL1, CHN2, TGFBR3, CAB39L, LIMCH1 and BAMBI genes; and/or
- at least one gene selected from the group consisting of KRT6B, ZFHX3, SFN, TGFBR3 and CHN2 genes.
High expression level of genes selected from the group consisting of PI3, KRT6B, CSTA, DSC2, MT1X, RAB38, SFN, SAMD9, EGFR, CD44, IL1RAP, DSP, PKP1, SERPINB7, CELSR2, DUSP7, TBC1D2, ARL4D, IPPK, MTSS1 and RGS20, and low expression level of genes selected from the group consisting of PHC1, THYN1, TACC1, PPAP2B, NRXN3, GNA14, ZFHX3, TLE2, MAML3, EPS8, CACNA1D, RAB15, MAN1C1, SORL1, CHN2, TGFBR3, CAB39L, LIMCH1 and BAMBI, are indicative that the muscle-invasive bladder cancer has a basal-like phenotype. In a particular embodiment, the method comprises determining the expression levels of at least 3, 4, 5, 10, 15, 20, 25, 30 or 35 genes selected from the group consisting of PI3, KRT6B, CSTA, DSC2, MT1X, RAB38, SFN, SAMD9, EGFR, CD44, IL1RAP, DSP, PKP1, SERPINB7, CELSR2, DUSP7, TBC1D2, ARL4D, IPPK, MTSS1, RGS20, PHC1, THYN1, TACC1, PPAP2B, NRXN3, GNA14, ZFHX3, TLE2, MAML3, EPS8, CACNA1D, RAB15, MAN1C1, SORL1, CHN2, TGFBR3, CAB39L, LIMCH1 and BAMBI genes. More preferably, the expression levels of all these genes are determined. In another particular embodiment, the method comprises determining the expression levels of at least two genes selected from the group consisting of PI3, KRT6B, CSTA, DSC2, MT1X, RAB38, SFN, SAMD9, EGFR, CD44, IL1RAP, DSP, PKP1, SERPINB7, CELSR2, DUSP7, TBC1D2, ARL4D, IPPK, MTSS1 and RGS20 genes, and the expression level of at least two genes selected from the group consisting of PHC1, THYN1, TACC1, PPAP2B, NRXN3, GNA14, ZFHX3, TLE2, MAML3, EPS8, CACNA1D, RAB15, MAN1C1, SORL1, CHN2, TGFBR3, CAB39L, LIMCH1 and BAMBI genes in a cancer sample.
Preferably, the expression levels of at least 3, 4, 5, 6, 7, 8, 9, 10 or 15 genes of each group are determined. The inventors tested all 2-gene combinations comprising a gene selected from the group consisting of PI3, KRT6B, CSTA, DSC2, MT1X, RAB38, SFN, SAMD9, EGFR, CD44, IL1RAP, DSP, PKP1, SERPINB7, CELSR2, DUSP7, TBC1D2, ARL4D, IPPK, MTSS1 and RGS20 genes, and a gene selected from the group consisting of PHC1, THYN1, TACC1, PPAP2B, NRXN3, GNA14, ZFHX3, TLE2, MAML3, EPS8, CACNA1D, RAB15, MAN1C1, SORL1, CHN2, TGFBR3, CAB39L, LIMCH1 and BAMBI genes, all 4-gene combinations comprising two genes selected from the group consisting of PI3, KRT6B, CSTA, DSC2, MT1X, RAB38, SFN, SAMD9, EGFR, CD44, IL1RAP, DSP, PKP1, SERPINB7, CELSR2, DUSP7, TBC1D2, ARL4D, IPPK, MTSS1 and RGS20 genes, and two genes selected from the group consisting of PHC1, THYN1, TACC1, PPAP2B, NRXN3, GNA14, ZFHX3, TLE2, MAML3, EPS8, CACNA1D, RAB15, MAN1C1, SORL1, CHN2, TGFBR3, CAB39L, LIMCH1 and BAMBI genes, and all 3-gene combinations comprising a gene selected from the group consisting of PI3, KRT6B, CSTA, DSC2, MT1X, RAB38, SFN, SAMD9, EGFR, CD44, IL1RAP, DSP, PKP1, SERPINB7, CELSR2, DUSP7, TBC1D2, ARL4D, IPPK, MTSS1 and RGS20 genes, and two genes selected from the group consisting of PHC1, THYN1, TACC1, PPAP2B, NRXN3, GNA14, ZFHX3, TLE2, MAML3, EPS8, CACNA1D, RAB15, MAN1C1, SORL1, CHN2, TGFBR3, CAB39L, LIMCH1 and BAMBI genes, on the training CIT dataset described in the experimental section, i.e. 399, 35910 and 3591 combinations, respectively. They found that 2-, 3- and 4-gene combinations have a minimal sensitivity of about 76, 76 and 81% and specificity of about 77, 79 and 81%, respectively. Furthermore, about 80% of the 4-gene combinations correctly classified more than 95% of MIBC of the CIT dataset and about 1400 of these combinations correctly classified 100% of MIBC of the CIT dataset.
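The combination counts quoted above can be checked with a short calculation. The following is a minimal sketch, assuming only what the lists above show: 21 genes in the first group (from PI3 to RGS20) and 19 genes in the second group (from PHC1 to BAMBI). The function and variable names are illustrative and do not come from the application.

```python
from math import comb

N_FIRST_GROUP = 21   # PI3 ... RGS20 (high expression indicative of basal-like)
N_SECOND_GROUP = 19  # PHC1 ... BAMBI (low expression indicative of basal-like)


def count_combinations(k_first: int, k_second: int) -> int:
    """Number of gene sets with k_first genes from the first group
    and k_second genes from the second group."""
    return comb(N_FIRST_GROUP, k_first) * comb(N_SECOND_GROUP, k_second)


two_gene = count_combinations(1, 1)    # one gene from each group
three_gene = count_combinations(1, 2)  # one from the first, two from the second
four_gene = count_combinations(2, 2)   # two genes from each group

print(two_gene, three_gene, four_gene)
```

Under these assumptions the three values are 399, 3591 and 35910 for the 2-, 3- and 4-gene combinations respectively.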
Thus, using well-known statistical methods, the skilled person may easily select a combination comprising at least one gene selected from the group consisting of PI3, KRT6B, CSTA, DSC2, MT1X, RAB38, SFN, SAMD9, EGFR, CD44, IL1RAP, DSP, PKP1, SERPINB7, CELSR2, DUSP7, TBC1D2, ARL4D, IPPK, MTSS1 and RGS20 genes, and one gene selected from the group consisting of PHC1, THYN1, TACC1, PPAP2B, NRXN3, GNA14, ZFHX3, TLE2, MAML3, EPS8, CACNA1D, RAB15, MAN1C1, SORL1, CHN2, TGFBR3, CAB39L, LIMCH1 and BAMBI genes, for determining whether a MIBC has a basal-like phenotype. In particular, the method may comprise determining the expression levels of at least one of the following 2-gene combinations allowing to determine whether a MIBC has a basal-like phenotype with at least 98% specificity on the training CIT dataset: BAMBI, IPPK; TLE2, IPPK; RAB15, RAB38; EPS8, IPPK; SORL1, MTSS1; SORL1, ARL4D; SORL1, PKP1; SORL1, RAB38; TGFBR3, PI3; TGFBR3, RAB38; ZFHX3, PKP1; CAB39L, IPPK; CAB39L, RAB38; LIMCH1, DSC2; LIMCH1, IL1RAP; MAML3, IL1RAP; MAML3, IPPK; THYN1, IPPK; GNA14, IPPK; TACC1, IPPK; CACNA1D, IPPK; PPAP2B, IPPK; TGFBR3, IPPK; NRXN3, IPPK; CHN2, PKP1; CHN2, IPPK; LIMCH1, IPPK; MAN1C1, IPPK; and PHC1, IPPK. Alternatively, the method may comprise determining the expression levels of at least one of the following 3-gene combinations allowing to determine whether a MIBC has a basal-like phenotype with 100% sensitivity and specificity on the training CIT dataset: BAMBI, SORL1, PKP1; BAMBI, TGFBR3, IPPK; TLE2, TGFBR3, IPPK; PPAP2B, TGFBR3, PI3; PPAP2B, MAML3, SFN; SORL1, ZFHX3, PKP1; SORL1, CAB39L, PKP1; TGFBR3, NRXN3, KRT6B; TGFBR3, NRXN3, PI3; TGFBR3, ZFHX3, DSC2; TGFBR3, ZFHX3, CSTA; TGFBR3, ZFHX3, IPPK; TGFBR3, LIMCH1, IPPK; TGFBR3, MAN1C1, PI3; NRXN3, LIMCH1, IL1RAP; NRXN3, LIMCH1, IPPK; NRXN3, MAML3, PKP1; NRXN3, MAML3, CELSR2; CHN2, LIMCH1, IPPK; CHN2, MAML3, PKP1; ZFHX3, TACC1, DSC2; ZFHX3, TACC1, PKP1; LIMCH1, GNA14, IPPK; MAML3, MAN1C1, PKP1; MAML3, MAN1C1, SFN; RGS20,
KRT6B, MAML3; DSP, KRT6B, TGFBR3; DSP, PKP1, MAML3; EGFR, SFN, MAML3; KRT6B, IPPK, TGFBR3; KRT6B, CELSR2, SORL1; KRT6B, CELSR2, MAML3; PI3, DSC2, NRXN3; PI3, CSTA, TGFBR3; PI3, PKP1, SORL1; PI3, RAB38, TGFBR3; PI3, SFN, TGFBR3; PI3, CELSR2, SORL1; MTSS1, DSC2, ZFHX3; MTSS1, DSC2, MAML3; MTSS1, PKP1, BAMBI; MTSS1, PKP1, ZFHX3; MTSS1, PKP1, MAML3; ARL4D, DSC2, MAML3; ARL4D, DSC2, GNA14; ARL4D, SERPINB7, MAML3; DUSP7, PKP1, ZFHX3; CD44, DSC2, LIMCH1; CD44, CELSR2, MAML3; TBC1D2, CELSR2, MAML3; DSC2, PKP1, ZFHX3; DSC2, RAB38, ZFHX3; DSC2, CELSR2, TLE2; CSTA, CELSR2, MAML3; IL1RAP, PKP1, SORL1; SERPINB7, RAB38, MAML3; RAB38, CELSR2, THYN1; SFN, CELSR2, SORL1; and SFN, CELSR2, MAML3. The method may comprise determining the expression levels of at least one of the following 4-gene combinations allowing to determine whether a MIBC has a basal-like phenotype with 100% sensitivity and specificity on the training CIT dataset: PPAP2B, SORL1, PI3, PKP1; PPAP2B, SORL1, PI3, CELSR2; PPAP2B, SORL1, SFN, CELSR2; PPAP2B, TGFBR3, PI3, DSC2; PPAP2B, TGFBR3, PI3, RAB38; PPAP2B, TGFBR3, PI3, SFN; PPAP2B, TGFBR3, PI3, CELSR2; PPAP2B, TGFBR3, IPPK, RAB38; PPAP2B, NRXN3, PI3, DSC2; PPAP2B, NRXN3, PI3, RAB38; PPAP2B, NRXN3, RAB38, CELSR2; PPAP2B, ZFHX3, MTSS1, DSC2; PPAP2B, ZFHX3, MTSS1, PKP1; PPAP2B, ZFHX3, MTSS1, IPPK; PPAP2B, ZFHX3, MTSS1, RAB38; PPAP2B, ZFHX3, DSC2, PKP1; PPAP2B, ZFHX3, DSC2, SFN; PPAP2B, ZFHX3, PKP1, RAB38; PPAP2B, ZFHX3, PKP1, SFN; PPAP2B, LIMCH1, RAB38, SFN; PPAP2B, MAML3, PI3, MTSS1; PPAP2B, MAML3, PI3, RAB38; PPAP2B, MAML3, MTSS1, DSC2; PPAP2B, MAML3, MTSS1, PKP1; PPAP2B, MAML3, MTSS1, IPPK; PPAP2B, MAML3, MTSS1, SFN; PPAP2B, MAML3, RAB38, SFN; PPAP2B, MAML3, RAB38, CELSR2; PPAP2B, MAML3, SFN, CELSR2; SORL1, TGFBR3, PI3, PKP1; SORL1, TGFBR3, PI3, SFN; SORL1, TGFBR3, PI3, CELSR2; SORL1, TGFBR3, PKP1, IPPK; SORL1, TGFBR3, PKP1, SFN; SORL1, TGFBR3, PKP1, CELSR2; SORL1, TGFBR3, IPPK, SFN; SORL1, TGFBR3, IPPK, CELSR2; SORL1, TGFBR3, SFN, CELSR2; SORL1, NRXN3, PI3, PKP1;
SORL1, NRXN3, PI3, SFN; SORL1, NRXN3, PI3, CELSR2; SORL1, NRXN3, DSC2, PKP1; SORL1, NRXN3, PKP1, SFN; SORL1, NRXN3, PKP1, CELSR2; SORL1, NRXN3, IPPK, CELSR2; SORL1, NRXN3, SFN, CELSR2; SORL1, ZFHX3, MTSS1, PKP1; SORL1, ZFHX3, MTSS1, SFN; SORL1, ZFHX3, MTSS1, CELSR2; SORL1, ZFHX3, DSC2, PKP1; SORL1, ZFHX3, DSC2, SFN; SORL1, ZFHX3, DSC2, CELSR2; SORL1, ZFHX3, PKP1, IPPK; SORL1, ZFHX3, PKP1, RAB38; SORL1, ZFHX3, PKP1, SFN; SORL1, ZFHX3, PKP1, CELSR2; SORL1, ZFHX3, IPPK, RAB38; SORL1, ZFHX3, IPPK, SFN; SORL1, ZFHX3, RAB38, SFN; SORL1, ZFHX3, RAB38, CELSR2; SORL1, ZFHX3, SFN, CELSR2; SORL1, LIMCH1, PI3, CELSR2; SORL1, LIMCH1, DSC2, PKP1; SORL1, LIMCH1, DSC2, SFN; SORL1, LIMCH1, PKP1, CELSR2; SORL1, LIMCH1, IPPK, CELSR2; SORL1, LIMCH1, RAB38, CELSR2; SORL1, LIMCH1, SFN, CELSR2; SORL1, MAML3, PI3, PKP1; SORL1, MAML3, PI3, CELSR2; SORL1, MAML3, MTSS1, PKP1; SORL1, MAML3, PKP1, SFN; SORL1, MAML3, PKP1, CELSR2; SORL1, MAML3, IPPK, SFN; SORL1, MAML3, IPPK, CELSR2; SORL1, MAML3, RAB38, CELSR2; SORL1, MAML3, SFN, CELSR2; TGFBR3, NRXN3, PI3, DSC2; TGFBR3, NRXN3, PI3, RAB38; TGFBR3, NRXN3, PI3, SFN; TGFBR3, NRXN3, PI3, CELSR2; TGFBR3, NRXN3, PKP1, SFN; TGFBR3, NRXN3, IPPK, RAB38; TGFBR3, ZFHX3, PI3, RAB38; TGFBR3, ZFHX3, MTSS1, DSC2; TGFBR3, ZFHX3, MTSS1, PKP1; TGFBR3, ZFHX3, MTSS1, IPPK; TGFBR3, ZFHX3, MTSS1, SFN; TGFBR3, ZFHX3, DSC2, PKP1; TGFBR3, ZFHX3, DSC2, SFN; TGFBR3, ZFHX3, PKP1, IPPK; TGFBR3, ZFHX3, PKP1, SFN; TGFBR3, ZFHX3, IPPK, RAB38; TGFBR3, LIMCH1, PI3, DSC2; TGFBR3, LIMCH1, PI3, IPPK; TGFBR3, LIMCH1, PI3, SFN; TGFBR3, LIMCH1, DSC2, IPPK; TGFBR3, LIMCH1, PKP1, IPPK; TGFBR3, LIMCH1, PKP1, SFN; TGFBR3, LIMCH1, IPPK, RAB38; TGFBR3, LIMCH1, IPPK, SFN; TGFBR3, LIMCH1, IPPK, CELSR2; TGFBR3, LIMCH1, RAB38, CELSR2; TGFBR3, MAML3, PI3, MTSS1; TGFBR3, MAML3, PI3, DSC2; TGFBR3, MAML3, PI3, IPPK; TGFBR3, MAML3, PI3, RAB38; TGFBR3, MAML3, PI3, SFN; TGFBR3, MAML3, PI3, CELSR2; TGFBR3, MAML3, MTSS1, DSC2; TGFBR3, MAML3, MTSS1, IPPK; TGFBR3, MAML3, PKP1, IPPK; TGFBR3, MAML3, IPPK,
RAB38; TGFBR3, MAML3, SFN, CELSR2; NRXN3, ZFHX3, MTSS1, DSC2; NRXN3, ZFHX3, MTSS1, PKP1; NRXN3, ZFHX3, MTSS1, IPPK; NRXN3, ZFHX3, MTSS1, RAB38; NRXN3, ZFHX3, DSC2, PKP1; NRXN3, ZFHX3, DSC2, RAB38; NRXN3, ZFHX3, DSC2, SFN; NRXN3, ZFHX3, PKP1, RAB38; NRXN3, ZFHX3, RAB38, CELSR2; NRXN3, LIMCH1, DSC2, IPPK; NRXN3, LIMCH1, IPPK, CELSR2; NRXN3, MAML3, PI3, RAB38; NRXN3, MAML3, MTSS1, SFN; NRXN3, MAML3, DSC2, PKP1; NRXN3, MAML3, PKP1, SFN; NRXN3, MAML3, PKP1, CELSR2; NRXN3, MAML3, RAB38, CELSR2; ZFHX3, LIMCH1, MTSS1, DSC2; ZFHX3, LIMCH1, MTSS1, PKP1; ZFHX3, LIMCH1, MTSS1, IPPK; ZFHX3, LIMCH1, MTSS1, RAB38; ZFHX3, LIMCH1, DSC2, PKP1; ZFHX3, LIMCH1, DSC2, RAB38; ZFHX3, LIMCH1, DSC2, SFN; ZFHX3, LIMCH1, PKP1, IPPK; ZFHX3, LIMCH1, PKP1, RAB38; ZFHX3, MAML3, MTSS1, IPPK; ZFHX3, MAML3, RAB38, SFN; LIMCH1, MAML3, MTSS1, SFN; LIMCH1, MAML3, MTSS1, CELSR2; LIMCH1, MAML3, DSC2, RAB38; LIMCH1, MAML3, PKP1, SFN; LIMCH1, MAML3, PKP1, CELSR2; LIMCH1, MAML3, IPPK, RAB38; LIMCH1, MAML3, IPPK, SFN; LIMCH1, MAML3, IPPK, CELSR2; LIMCH1, MAML3, RAB38, SFN; LIMCH1, MAML3, RAB38, CELSR2; and LIMCH1, MAML3, SFN, CELSR2. The inventors also calculated the sensitivity and positive predictive value (true positives / (true positives + false positives)) of all 4-gene combinations on the group of cohorts consisting of the CIT series disclosed in the experimental section, GSE13507 (Kim et al, 2010), GSE1827 (Blaveri et al, 2005), GSE19915 (Lindgren et al, 2010), GSE31684 (Riester et al, 2012), E-TABM-147 (Stransky et al., 2006) and the cohort disclosed in Sanchez-Carbayo et al., 2006.
Based on these results, the method may comprise, in a particular embodiment, determining the expression levels of at least one of the following 4-gene combinations allowing to determine whether a MIBC has a basal-like phenotype with at least 95% sensitivity, at least 95% specificity and a positive predictive value (true positives / (true positives + false positives)) of at least 85% on the above-identified cohorts: TGFBR3, ZFHX3, CSTA, SFN; NRXN3, MAN1C1, KRT6B, TBC1D2; PPAP2B, ZFHX3, KRT6B, TBC1D2; PPAP2B, ZFHX3, KRT6B, IPPK; CHN2, ZFHX3, KRT6B, MTSS1; PPAP2B, ZFHX3, KRT6B, DSC2; TGFBR3, ZFHX3, KRT6B, MTSS1; TGFBR3, ZFHX3, PI3, PKP1; PPAP2B, ZFHX3, DSP, MTSS1; TGFBR3, ZFHX3, EGFR, CSTA; TGFBR3, ZFHX3, EGFR, PKP1; TGFBR3, ZFHX3, DSC2, PKP1; TGFBR3, ZFHX3, CSTA, PKP1; NRXN3, CHN2, KRT6B, SAMD9; TGFBR3, TACC1, KRT6B, TBC1D2; NRXN3, TACC1, KRT6B, TBC1D2; PPAP2B, ZFHX3, KRT6B, RAB38; PPAP2B, ZFHX3, KRT6B, MTSS1; PPAP2B, ZFHX3, EGFR, KRT6B; PPAP2B, ZFHX3, PKP1, SAMD9; TGFBR3, LIMCH1, KRT6B, MTSS1; NRXN3, ZFHX3, KRT6B, PI3; and CHN2, ZFHX3, KRT6B, CELSR2. In a more particular embodiment, the method comprises determining the expression levels of at least two genes selected from the group consisting of KRT6B, CSTA, EGFR, PKP1, TBC1D2 and MTSS1 genes, preferably selected from the group consisting of KRT6B, PKP1, TBC1D2 and MTSS1 genes, and the expression level of at least two genes selected from the group consisting of PPAP2B, NRXN3, ZFHX3, CHN2 and TGFBR3 genes, preferably selected from the group consisting of PPAP2B, ZFHX3 and TGFBR3 genes, in a cancer sample.
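The performance criteria used above (sensitivity, specificity and positive predictive value) can be illustrated as follows. This is a minimal sketch only: the function name, the boolean encoding (True for basal-like) and the ten toy labels are assumptions introduced for illustration and are not data from the cohorts cited above.

```python
def classification_metrics(truth, predicted):
    """Compute sensitivity, specificity and positive predictive value.

    truth and predicted are equal-length sequences of booleans,
    where True means "basal-like" and False means "non-basal-like".
    """
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)          # true positives
    fn = sum(1 for t, p in pairs if t and not p)      # false negatives
    fp = sum(1 for t, p in pairs if not t and p)      # false positives
    tn = sum(1 for t, p in pairs if not t and not p)  # true negatives

    def ratio(numerator, denominator):
        # None signals an undefined metric (empty denominator).
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),  # true positives / (true positives + false positives)
    }


# Invented toy example: 4 basal-like and 6 non-basal-like tumors.
truth = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False, False, False]
metrics = classification_metrics(truth, predicted)
print(metrics)
```

In this toy example three of the four basal-like tumors are detected and one non-basal-like tumor is misclassified, so sensitivity and positive predictive value are both 3/4 and specificity is 5/6.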
In another particular embodiment, the method may comprise determining the expression levels of at least one of the following gene combinations allowing to determine whether a MIBC has a basal-like phenotype with 100% sensitivity, at least 82% specificity and a positive predictive value (true positives / (true positives + false positives)) of at least 59% on the above-identified cohorts: CACNA1D, ZFHX3, DSP, PKP1; CACNA1D, ZFHX3, CD44, PKP1; CACNA1D, ZFHX3, TBC1D2, PKP1; BAMBI, MAN1C1, CD44, SFN; MAN1C1, THYN1, TBC1D2, DSC2; CACNA1D, ZFHX3, PKP1, SAMD9; EPS8, MAN1C1, CD44, CELSR2; CACNA1D, ZFHX3, DSP, SAMD9; EPS8, MAN1C1, CD44, TBC1D2; MAN1C1, PHC1, DSP, TBC1D2; MAN1C1, PHC1, CD44, TBC1D2; CACNA1D, NRXN3, CD44, SFN; MAN1C1, PHC1, MTSS1, CD44; MAN1C1, CD44; CACNA1D, LIMCH1, CD44, SAMD9; MAN1C1, CD44, TBC1D2; CACNA1D, LIMCH1, SAMD9, CELSR2; and CACNA1D, MAN1C1, TBC1D2, PKP1. In a more particular embodiment, the method comprises determining the expression levels of at least CD44 and TBC1D2 genes, and the expression level of at least two genes selected from the group consisting of CACNA1D, ZFHX3 and MAN1C1 genes, in a cancer sample. In another particular embodiment, the method comprises determining the expression level of at least PKP1, IPPK, MAML3 and TGFBR3 genes in a cancer sample, high expression level of PKP1 and IPPK genes and low expression level of MAML3 and TGFBR3 genes being indicative that the muscle-invasive bladder cancer has a basal-like phenotype. The expression level of each gene may be determined from a cancer sample by a variety of techniques. In particular, the expression level of each gene may be determined, as disclosed above, by measuring the quantity of mRNA and/or by measuring the quantity or the activity of encoded protein. Preferably, the expression level of a gene is determined by measuring the quantity of mRNA.
Based on the expression levels, the phenotype of the MIBC may be determined using any commonly used suitable algorithm such as, for example, the nearest shrunken centroid (NSC) algorithm, the support vector machine (SVM) algorithm, or the k-nearest neighbour algorithm. Preferably, the phenotype of the MIBC is obtained using the Nearest Shrunken Centroid method developed by Tibshirani et al, 2002. The training data set for the selected algorithm comprises MIBC with basal-like and non-basal-like phenotypes. In particular, the training data set may be selected, for example, from the group consisting of the CIT series disclosed in the experimental section, GSE13507 (Kim et al, 2010), GSE1827 (Blaveri et al, 2005), GSE19915 (Lindgren et al, 2010), GSE31684 (Riester et al, 2012), GSE5479 (Dyrskjot et al, 2007), E-TABM-147 (Stransky et al, 2006) and the cohort disclosed in Sanchez-Carbayo et al, 2006. In another embodiment, the method further comprises determining whether the expression levels of said genes are high or low compared to the reference expression level(s). The reference expression level may be the expression level of a gene having a stable expression in different cancer samples, such as RPLP0, HPRT1, GAPDH, B2M, TBP and 18S genes. In this case, the expression level of the gene of interest is considered as high if the level or quantity of mRNA is above a cut-off value, which the skilled person can easily adjust depending on the gene of interest and the reference gene. The cut-off value may be easily defined for each gene via a training data set by optimizing a criterion such as a chi-squared test. The reference expression levels may also be the mean expression levels of said genes among a population of randomly selected MIBC samples.
Before being compared with the reference expression level, the expression levels of the genes may be normalized using the expression level of an endogenous control gene having a stable expression in different cancer samples, such as RPLP0, HPRT1, GAPDH, B2M, TBP and 18S genes.
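The normalization and classification steps described above can be sketched as follows. This is an illustration under stated assumptions, not the inventors' implementation: it uses a plain nearest-centroid rule (without the shrinkage step of the Nearest Shrunken Centroid method of Tibshirani et al, 2002), a log2 ratio to GAPDH as the normalization, the PKP1, IPPK, MAML3 and TGFBR3 gene set as the example signature, and invented toy expression values. All function names are hypothetical.

```python
from math import dist, log2

GENES = ["PKP1", "IPPK", "MAML3", "TGFBR3"]  # example 4-gene set
REFERENCE_GENE = "GAPDH"                      # assumed endogenous control


def normalize(sample):
    """Return log2(expression / reference expression) for each gene of interest."""
    reference = sample[REFERENCE_GENE]
    return [log2(sample[gene] / reference) for gene in GENES]


def centroid(samples):
    """Mean normalized profile of a group of training samples."""
    profiles = [normalize(sample) for sample in samples]
    return [sum(values) / len(values) for values in zip(*profiles)]


def classify(sample, centroids):
    """Assign the label whose centroid is closest (Euclidean distance)."""
    profile = normalize(sample)
    return min(centroids, key=lambda label: dist(profile, centroids[label]))


# Invented toy training data (arbitrary units), for illustration only.
training = {
    "basal-like": [
        {"GAPDH": 100, "PKP1": 80, "IPPK": 60, "MAML3": 5, "TGFBR3": 4},
        {"GAPDH": 120, "PKP1": 90, "IPPK": 70, "MAML3": 8, "TGFBR3": 6},
    ],
    "non-basal-like": [
        {"GAPDH": 100, "PKP1": 5, "IPPK": 10, "MAML3": 50, "TGFBR3": 60},
        {"GAPDH": 110, "PKP1": 8, "IPPK": 12, "MAML3": 70, "TGFBR3": 55},
    ],
}
centroids = {label: centroid(samples) for label, samples in training.items()}

query_high = {"GAPDH": 90, "PKP1": 70, "IPPK": 50, "MAML3": 6, "TGFBR3": 5}
query_low = {"GAPDH": 95, "PKP1": 6, "IPPK": 9, "MAML3": 55, "TGFBR3": 50}
label_high = classify(query_high, centroids)
label_low = classify(query_low, centroids)
print(label_high, label_low)
```

A shrunken-centroid, SVM or k-nearest-neighbour classifier could be substituted for the `classify` step; the plain centroid rule is used here only because it can be written with the standard library alone.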

Using Affymetrix Human Exon arrays, the inventors analyzed splicing changes in 191 MIBC samples and 7 samples of normal urothelium. They then selected alternative splicing events to distinguish basal-like and non basal-like tumors. In particular, the inventors showed that a single alternative splicing event identified in TGM1 gene was enough to correctly classify 70% of basal-like tumours based on numeric data from the exon arrays 3.17. Alternative isoform including exon 9 (SEQ ID NO:24) appears to be ubiquitously expressed in basal-like, non basal-like and normal tissue. However, isoform lacking exon 9 is highly specific to basal-like tumours since the other samples only show a slightly detectable signal. The inventors further showed that basal-like tumors can be correctly classified based on an alternative splicing event identified in HDAC9 gene. Alternative isoform including exon 1 (SEQ ID NO:37) appears to be specific to basal-like tumors. In particular, basal-like tumors can be classified based on the short/long HDAC9 isoform expression ratio wherein the HDAC9 short isoform corresponds to the transcript ENST00000456174 and comprises exon 1 (SEQ ID NO:37) and HDAC9 long isoforms correspond to transcripts ENST00000406451, ENST00000405010 and ENST00000428307 and do not include exon 1. Indeed, the inventors demonstrated that this ratio is higher in basal-like tumors than in non basal-like tumors or normal urothelium samples.

Thus, in a third embodiment, the method for determining whether a MIBC has a basal-like phenotype comprises determining the expression level of at least one exon selected from the group consisting of the exons of SEQ ID NO: 23 to 41 in a cancer sample. Preferably, the expression levels of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17 or 18 exons selected from the group consisting of the exons of SEQ ID NO: 23 to 41 are determined. More preferably, the expression levels of all the exons of SEQ ID NO: 23 to 41 are determined.

In a particular embodiment, the method for determining whether a MIBC has a basal-like phenotype comprises determining the expression level of the exon of SEQ ID NO:24 in a cancer sample. The method may further comprise determining the expression level of at least one exon selected from the group consisting of the exons of SEQ ID NO: 23, and 25 to 41.

In another particular embodiment, the method for determining whether a MIBC has a basal-like phenotype comprises determining the expression level of the exon of SEQ ID NO:37 in a cancer sample. The method may further comprise determining the expression level of at least one exon selected from the group consisting of the exons of SEQ ID NO: 23 to 36 and 38 to 41, preferably of the exon of SEQ ID NO:24.

The expression level of each exon may be determined from a cancer sample by a variety of techniques, in particular by measuring the quantity of mRNA comprising said exon. The method may further comprise determining whether the expression levels of said exons are high or low compared to the reference expression level. The reference expression level may be the expression level of each exon in a normal sample, preferably a normal sample of urothelium.
The reference expression level may also be the expression level of other transcript isoforms from the same gene that do not comprise said exon. Before being compared with the reference expression level, the expression levels of the exons may be normalized using the expression level of an endogenous control gene having a stable expression in different cancer samples, such as RPLP0, HPRT1, GAPDH, B2M, TBP and 18S genes.

In particular, high expression levels of exons selected from the group consisting of the exons of SEQ ID NO:23, 25, 27, 28, 30, 31, 32, 34, 37 and 39, and low expression levels of exons selected from the group consisting of the exons of SEQ ID NO:24, 26, 29, 33, 35, 36, 38, 40 and 41, are indicative that the muscle-invasive bladder cancer has a basal-like phenotype.

In a particular embodiment, the method for determining whether a MIBC has a basal-like phenotype comprises determining the expression level of the exon of SEQ ID NO:24 in a cancer sample, a low expression level of said exon being indicative that the muscle-invasive bladder cancer has a basal-like phenotype. The expression level of the exon of SEQ ID NO:24 may be determined by in situ hybridization targeting the specific junction between exon 8 and exon 10. The method may further comprise determining the expression level of at least one exon selected from the group consisting of the exons of SEQ ID NO: 23, and 25 to 41, high expression levels of exons selected from the group consisting of the exons of SEQ ID NO:23, 25, 27, 28, 30, 31, 32, 34, 37 and 39, and low expression levels of exons selected from the group consisting of the exons of SEQ ID NO: 26, 29, 33, 35, 36, 38, 40 and 41, being indicative that the muscle-invasive bladder cancer has a basal-like phenotype.
In another particular embodiment, the method for determining whether a MIBC has a basal-like phenotype comprises determining the expression level of the exon of SEQ ID NO:37 in a cancer sample, a high expression level of said exon being indicative that the muscle-invasive bladder cancer has a basal-like phenotype. The method may further comprise determining the expression level of at least one exon selected from the group consisting of the exons of SEQ ID NO: 23 to 36 and 38 to 41, high expression levels of exons selected from the group consisting of the exons of SEQ ID NO:23, 25, 27, 28, 30, 31, 32, 34 and 39, and low expression levels of exons selected from the group consisting of the exons of SEQ ID NO: 24, 26, 29, 33, 35, 36, 38, 40 and 41, being indicative that the muscle-invasive bladder cancer has a basal-like phenotype.
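As a purely illustrative sketch, the direction expected for each exon in a basal-like tumor, as stated above, may be encoded as two sets of SEQ ID NO and compared with the high/low calls obtained for a sample. The description does not specify how many exons must agree with the expected direction, so the sketch only reports the number of concordant exons; the names `HIGH_IN_BASAL_LIKE`, `LOW_IN_BASAL_LIKE` and `concordance` are hypothetical.

```python
from typing import Mapping, Tuple

# SEQ ID NO of exons whose high (respectively low) expression level is
# indicative of a basal-like phenotype, as listed in the description.
HIGH_IN_BASAL_LIKE = frozenset({23, 25, 27, 28, 30, 31, 32, 34, 37, 39})
LOW_IN_BASAL_LIKE = frozenset({24, 26, 29, 33, 35, 36, 38, 40, 41})


def concordance(calls: Mapping[int, str]) -> Tuple[int, int]:
    """Return (exons agreeing with the basal-like direction, exons assessed).

    `calls` maps a SEQ ID NO (23 to 41) to "high" or "low", i.e. the result of
    comparing the exon's expression level with the chosen reference level.
    """
    agreeing = 0
    for seq_id, call in calls.items():
        if call not in ("high", "low"):
            raise ValueError(f"unexpected call {call!r} for SEQ ID NO: {seq_id}")
        if seq_id in HIGH_IN_BASAL_LIKE:
            expected = "high"
        elif seq_id in LOW_IN_BASAL_LIKE:
            expected = "low"
        else:
            raise ValueError(f"SEQ ID NO: {seq_id} is not one of the exons 23 to 41")
        if call == expected:
            agreeing += 1
    return agreeing, len(calls)
```

For example, a sample with a low call for the exon of SEQ ID NO:24 and a high call for the exon of SEQ ID NO:37 would be fully concordant with the basal-like direction for those two exons.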

In another embodiment, the method for determining whether a MIBC has a basal-like phenotype comprises determining the expression level of HDAC9 short isoform corresponding to the transcript ENST00000456174 and HDAC9 long isoforms corresponding to transcripts ENST00000406451, ENST00000405010 and ENST00000428307. The transcript ENST00000456174 is 2,307 bp in length and the corresponding cDNA is set forth in SEQ ID NO: 109. The transcript ENST00000406451 is 9,705 bp in length and the corresponding cDNA is set forth in SEQ ID NO: 110. The transcript ENST00000405010 is 4,239 bp in length and the corresponding cDNA is set forth in SEQ ID NO: 111. The transcript ENST00000428307 is 2,359 bp in length and the corresponding cDNA is set forth in SEQ ID NO: 112.

The method may further comprise calculating the ratio of HDAC9 short isoform expression level to HDAC9 long isoform expression levels, i.e. the ratio of the expression level of the transcript ENST00000456174 to the sum of the expression levels of the transcripts ENST00000406451, ENST00000405010 and ENST00000428307. In particular, this ratio can be calculated by determining the quantity of mRNA for HDAC9 short isoform corresponding to the transcript ENST00000456174 and the quantity of mRNA for HDAC9 long isoforms corresponding to transcripts ENST00000406451, ENST00000405010 and ENST00000428307. The quantity of mRNA for each transcript can be assessed by any method known by the skilled person, preferably by RT-qPCR. Prior to calculating this ratio, the expression level of each transcript may be normalized, for example using the expression level of an endogenous control gene having a stable expression in different cancer samples, such as RPLP0, HPRT1, GAPDH, B2M, TBP and 18S genes.

The method may further comprise determining whether the calculated ratio is high or low compared to a reference level, a high ratio being indicative that the muscle-invasive bladder cancer has a basal-like phenotype. The reference level may be the ratio calculated in a normal sample, preferably a normal sample of urothelium. The ratio is considered as high if its value is above a cut-off value easily adjusted by the skilled person depending on the reference level. In a preferred embodiment, the reference level is the ratio calculated in a normal urothelium sample and the ratio is considered as high if its value corresponds to at least a 20- or 25-fold increase compared to the reference level, preferably at least a 30-fold increase.
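The ratio defined above, and its comparison with the ratio obtained in a normal urothelium sample, may be sketched as follows. This is a minimal illustration assuming that the expression level of each transcript has already been quantified (for example by RT-qPCR) and, where desired, normalized; the function names `hdac9_ratio` and `is_high_ratio` are hypothetical.

```python
from typing import Mapping

SHORT_ISOFORM = "ENST00000456174"
LONG_ISOFORMS = ("ENST00000406451", "ENST00000405010", "ENST00000428307")


def hdac9_ratio(levels: Mapping[str, float]) -> float:
    """Short isoform level divided by the sum of the three long isoform levels."""
    long_total = sum(levels[transcript] for transcript in LONG_ISOFORMS)
    if long_total <= 0:
        raise ValueError("summed long isoform level must be positive")
    return levels[SHORT_ISOFORM] / long_total


def is_high_ratio(sample_ratio: float, reference_ratio: float, min_fold: float = 20.0) -> bool:
    """True when the sample ratio is at least `min_fold` times the reference ratio.

    The description gives 20-, 25- and, preferably, 30-fold as thresholds when
    the reference is a normal urothelium sample; 20 is used here as the default.
    """
    if reference_ratio <= 0:
        raise ValueError("reference ratio must be positive")
    return sample_ratio / reference_ratio >= min_fold
```

With illustrative (not experimental) values, a tumor sample with levels 60, 1, 1 and 1 for the four transcripts gives a ratio of 20; against a normal urothelium ratio of 0.5 this is a 40-fold increase, which satisfies the 20-, 25- and 30-fold thresholds.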
In this embodiment, the method may further comprise determining the expression level of at least one exon selected from the group consisting of the exons of SEQ ID NO: 23 to 41, preferably selected from the group consisting of the exons of SEQ ID NO: 24 and 37, even more preferably of the exon of SEQ ID NO:24.

The inventors further showed that basal-like tumors can be correctly classified not only based on an alternative splicing event identified in TGM1 gene, but also on the global expression of TGM1 gene. Indeed, they observed that TGM1 gene is overexpressed in basal-like tumors by comparison with non-basal-like tumors. Thus, in another embodiment, the method for determining whether a MIBC has a basal-like phenotype comprises determining the expression level of TGM1 gene (Entrez Gene ID: 7051) in a cancer sample. High expression level of this gene is indicative that the MIBC has a basal-like phenotype. Expression level of TGM1 gene may be determined from a cancer sample by a variety of techniques as disclosed above, in particular by measuring the quantity of mRNA.

The method may further comprise determining whether the expression level of TGM1 gene is high or low compared to the reference expression level(s). The reference expression level may be the expression level of a gene having a stable expression in different cancer samples, such as RPLP0, HPRT1, GAPDH, B2M, TBP and 18S genes. In this case, the expression level is considered as high if the level or quantity of mRNA is above a cut-off value easily adjusted by the skilled person. The cut-off value may be easily defined via a training data set by optimizing a criterion such as a chi-squared test. In a preferred embodiment, the reference expression level is the expression level of TGM1 gene in a normal urothelium sample.
In this embodiment, the expression level of TGM1 gene is considered as high if its value corresponds to at least a 3-fold increase compared to the reference level, preferably at least a 4-fold increase. Before being compared with the reference expression level, the expression level of TGM1 gene may be normalized using the expression level of an endogenous control gene having a stable expression in different cancer samples, such as RPLP0, HPRT1, GAPDH, B2M, TBP and 18S genes.

In a preferred embodiment, the method for determining whether a MIBC has a basal-like phenotype comprises determining the expression level of TGM1 gene and determining (i) the expression level of the exon of SEQ ID NO:37 or (ii) the expression levels of HDAC9 short isoform corresponding to the transcript ENST00000456174 and HDAC9 long isoforms corresponding to transcripts ENST00000406451, ENST00000405010 and ENST00000428307. Preferably, the method comprises determining the expression level of TGM1 gene, and determining the expression levels of HDAC9 short isoform corresponding to the transcript ENST00000456174 and HDAC9 long isoforms corresponding to transcripts ENST00000406451, ENST00000405010 and ENST00000428307. The method may further comprise calculating the ratio of HDAC9 short isoform to HDAC9 long isoforms as disclosed above. Preferably, the method comprises determining the expression level of TGM1 gene, determining the expression levels of HDAC9 short isoform corresponding to the transcript ENST00000456174 and HDAC9 long isoforms corresponding to transcripts ENST00000406451, ENST00000405010 and ENST00000428307, and calculating the ratio of HDAC9 short isoform expression level to HDAC9 long isoform expression levels, wherein a high expression level of TGM1 gene and a high ratio are indicative that the MIBC has a basal-like phenotype.
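The combined criterion of this preferred embodiment may be sketched as follows, assuming that both the TGM1 expression level and the HDAC9 short/long isoform ratio are compared with the corresponding values measured in a normal urothelium sample. The description only states that a high TGM1 level together with a high ratio is indicative of a basal-like phenotype; how to handle a sample in which only one of the two is high is not specified, and the sketch simply returns a negative call in that case. The names `fold_change` and `is_basal_like` are hypothetical.

```python
def fold_change(sample_value: float, reference_value: float) -> float:
    """Sample value expressed as a multiple of the reference value."""
    if reference_value <= 0:
        raise ValueError("reference value must be positive")
    return sample_value / reference_value


def is_basal_like(
    tgm1_sample: float,
    tgm1_reference: float,
    ratio_sample: float,
    ratio_reference: float,
    tgm1_min_fold: float = 3.0,    # at least 3-fold, preferably 4-fold
    ratio_min_fold: float = 20.0,  # at least 20- or 25-fold, preferably 30-fold
) -> bool:
    """True when both the TGM1 level and the HDAC9 short/long ratio are high."""
    tgm1_high = fold_change(tgm1_sample, tgm1_reference) >= tgm1_min_fold
    ratio_high = fold_change(ratio_sample, ratio_reference) >= ratio_min_fold
    return tgm1_high and ratio_high
```

The default thresholds are the least stringent values given in the description; the preferred values may be passed explicitly, e.g. `tgm1_min_fold=4.0` and `ratio_min_fold=30.0`.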

The inventors further identified DNA methylation signatures to distinguish basal-like and non basal-like MIBC. Thus, in a fourth embodiment, the method for determining whether a MIBC has a basal-like phenotype comprises determining
- the DNA methylation status of at least 4 CpG islands selected from the group consisting of CpG islands listed on Table 7 in a cancer sample; and/or
- the DNA methylation status of at least 4 CpG sites selected from the group consisting of CpG sites listed on Table 9 in a cancer sample.

The DNA methylation status may be determined using any method commonly known by the skilled person such as, for example, methods which use DNA methylation sensitive enzymes, or bisulfite treatment of DNA such as COBRA, methylation specific PCR (MSP), pyrosequencing, MethyLight, DNA arrays or high-throughput sequencing.

The method may comprise determining the DNA methylation status of at least 5, 6, 7, 8, 9, 10, 11 or 12 CpG islands selected from the group consisting of CpG islands listed on Table 7, preferably the methylation status of all CpG islands listed on Table 7. The method may comprise determining the DNA methylation status of at least 5, 6, 7, 8, 9, 10, 11, 12, 13 or 14 CpG sites selected from the group consisting of CpG sites listed on Table 9, preferably the methylation status of all CpG sites listed on Table 9. Preferably, the method comprises determining
- the DNA methylation status of CpG islands of SEQ ID NO: 43, 45, 47 and 51; and/or
- the DNA methylation status of CpG sites of SEQ ID NO: 85, 86, 91 and 96.

Hypermethylation of the CpG islands selected from the group consisting of the CpG island in the genomic region chr1: 22396508-223936838 (SEQ ID NO: 43) and the CpG island in the genomic region chr10: 6220879-6220943 (SEQ ID NO: 49), and hypomethylation of the CpG islands selected from the group consisting of the CpG island in the genomic region chr1: 155931629-155931858 (SEQ ID NO: 42), the CpG island in the genomic region chr1: 230249965-230250010 (SEQ ID NO: 44), the CpG island in the genomic region chr6: 163730158-163730368 (SEQ ID NO: 45), the CpG island in the genomic region chr6: 163730796-163730853 (SEQ ID NO: 46), the CpG island in the genomic region chr7: 872735-872797 (SEQ ID NO: 47), the CpG island in the genomic region chr9: 119976922-119976922, the CpG island in the genomic region chr14: 80328091-80328262 (SEQ ID NO: 50), the CpG island in the genomic region chr16: 1742021-1742281 (SEQ ID NO: 51), the CpG island in the genomic region chr16: 87491587-87491587, the CpG island in the genomic region chr18: 9536992-9536992 and the CpG island in the genomic region chr19: 10370550-10370550, are indicative that the muscle-invasive bladder cancer has a basal-like phenotype.

Thus, in a particular embodiment, the method comprises determining the DNA methylation status of CpG islands of SEQ ID NO: 43, 45, 47 and 51, hypermethylation of the CpG island of SEQ ID NO: 43 and hypomethylation of CpG islands of SEQ ID NO: 45, 47 and 51, being indicative that the muscle-invasive bladder cancer has a basal-like phenotype.
In another particular embodiment, the method comprises determining the DNA methylation status of CpG sites of SEQ ID NO: 85, 86, 91 and 96, hypomethylation of said CpG sites being indicative that the muscle-invasive bladder cancer has a basal-like phenotype.
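The four-island signature of the particular embodiment above (hypermethylation of the CpG island of SEQ ID NO: 43 and hypomethylation of the CpG islands of SEQ ID NO: 45, 47 and 51) may be illustrated by the following sketch. The description does not define how hyper- or hypomethylation is called; the sketch assumes, hypothetically, that methylation is expressed as a fraction between 0 and 1 and that a difference of at least 0.2 relative to a reference sample constitutes a change. The names `methylation_call` and `matches_basal_like_signature` and the 0.2 threshold are assumptions for illustration only.

```python
from typing import Mapping

# Expected status, keyed by SEQ ID NO, for a basal-like phenotype.
EXPECTED_IN_BASAL_LIKE = {43: "hyper", 45: "hypo", 47: "hypo", 51: "hypo"}


def methylation_call(sample_level: float, reference_level: float, min_delta: float = 0.2) -> str:
    """Classify a CpG island as "hyper", "hypo" or "unchanged" versus the reference.

    `min_delta` is an assumed threshold on the methylation fraction difference.
    """
    delta = sample_level - reference_level
    if delta >= min_delta:
        return "hyper"
    if delta <= -min_delta:
        return "hypo"
    return "unchanged"


def matches_basal_like_signature(
    sample: Mapping[int, float], reference: Mapping[int, float], min_delta: float = 0.2
) -> bool:
    """True when all four islands show the status expected for a basal-like tumor."""
    return all(
        methylation_call(sample[seq_id], reference[seq_id], min_delta) == expected
        for seq_id, expected in EXPECTED_IN_BASAL_LIKE.items()
    )
```

Any of the methods cited above (bisulfite-based methods, DNA arrays, sequencing) could supply the methylation levels; the thresholding scheme would need to be adapted to the chosen method.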

In the experimental section, the inventors demonstrated that overall survival was significantly shorter for patients with basal-like MIBC. They further showed that the basal-like subtype was a prognostic factor independent of sex, node and metastasis status. Accordingly, in a second aspect, the present invention relates to a method for predicting clinical outcome of a patient afflicted with a muscle-invasive bladder cancer, wherein the method comprises determining in a cancer sample from said patient whether the muscle-invasive bladder cancer has a basal-like phenotype with the method according to the invention, the presence of the basal-like phenotype being indicative of a poor prognosis. All embodiments disclosed for the method for determining whether a muscle-invasive bladder cancer has a basal-like phenotype are also contemplated in this aspect. In an embodiment, the method further comprises the step of providing a cancer sample from the patient.

Using in vitro and in vivo preclinical models, the inventors further demonstrated that, contrary to non-basal-like tumors, therapy targeting EGFR and/or comprising capecitabine was particularly effective for basal-like tumors. Thus, in a further aspect, the present invention relates to a method for selecting a patient afflicted with a muscle-invasive bladder cancer for a treatment comprising an EGFR kinase inhibitor and/or capecitabine, wherein the method comprises determining in a cancer sample from said patient whether the muscle-invasive bladder cancer has a basal-like phenotype with the method according to the invention, and optionally determining whether the muscle-invasive bladder cancer has a RAS-activating mutation, the presence of the basal-like phenotype being indicative that said patient is susceptible to benefit from a treatment comprising capecitabine, and the presence of the basal-like phenotype and the absence of RAS-activating mutation being indicative that said patient is susceptible to benefit from a treatment comprising an EGFR kinase inhibitor. All embodiments disclosed for the method for determining whether a muscle-invasive bladder cancer has a basal-like phenotype are also contemplated in this aspect. In an embodiment, the method further comprises the step of providing a cancer sample from the patient.

As used herein, the term "RAS-activating mutation" refers to an activating mutation in a gene encoding a RAS protein. RAS refers to a family of proteins (HRAS (SEQ ID NO: 100), KRAS (SEQ ID NO: 101), NRAS (SEQ ID NO: 102)) involved in signal transduction, in particular the signal transduction of tyrosine kinase receptors (Eswarakumar et al., 2005). They exist in two different states, an active state when bound to GTP and an inactive state when the GTP is converted to GDP because of the GTPase activity of RAS. In cancer, including bladder cancer, mutated forms of all three RAS genes have been described.
These mutated forms have less GTPase activity so they remain in an active state. An estimated rate of 17% of RAS mutations (HRAS, KRAS and NRAS mutations) in bladder cancer can be deduced from the Cosmic data base of the Sanger Institute (www.sanger.ac.uk/genetics/CGP/cosmic/). The references for the different activating mutations of HRAS, KRAS and NRAS can also be found in the review of Schubbert et al. (2007). For instance, the RAS-activating mutation can be selected among HRAS mutations G12S and G13V, KRAS mutations G12C and G12D and NRAS mutation M72I. However, in addition to mutations, an overexpression of RAS is also contemplated. As used herein, the term "EGFR kinase inhibitor" refers to a molecule which inhibits or reduces the kinase activity of the epidermal growth factor receptor. The activity of EGFR kinase can be easily assayed by any method known in the art for quantifying kinase activity or analyzing protein phosphorylation (see for example Olive, 2004). In particular, the EGFR kinase inhibitor may be selected from the group consisting of a small molecule inhibiting the EGFR kinase activity, an antibody directed against the extracellular domain of the EGFR and a nucleic acid molecule interfering specifically with the expression of EGFR. In an embodiment, the EGFR kinase inhibitor is a small molecule inhibiting the EGFR kinase activity. As used herein, the term "small molecule inhibiting the EGFR kinase activity" refers to small molecule that can be an organic or inorganic compound, usually less than 1000 daltons, with the ability to inhibit or reduce the activity of the EGFR kinase activity. This small molecule can be derived from any known organism (including, but not limited to, animals, plants, bacteria, fungi and viruses) or from a library of synthetic molecules. In particular, the small molecule may be selected from the group consisting of erlotinib, gefitinib and lapatinib, and any combination thereof. 
In another embodiment, the EGFR kinase inhibitor is an antibody, preferably a monoclonal antibody, directed against the extracellular domain of the EGFR. As used herein, the term "antibody" is intended to refer broadly to any immunologic binding agent such as IgG, IgM, IgA, IgD and IgE, and humanized or chimeric antibody. In certain embodiments, IgG and/or IgM are preferred because they are the most common antibodies in the physiological situation and they are most easily manufactured. The term "antibody" is used to refer to any antibody-like molecule that has an antigen binding region, and includes antibody fragments such as Fab', Fab, F(ab') 2, single domain antibodies (DABs), Fv, scFv (single chain Fv), and the like. The techniques for preparing and using various antibody-based constructs and fragments are well known in the art. Means for preparing and characterizing antibodies are also well known in the art (See, e.g., Harlow and Lane, 1988). A "humanized" antibody is an antibody in which the constant and variable framework region of one or more human immunoglobulins is fused with the binding region, e.g. the CDR, of an animal immunoglobulin. "Humanized" antibodies contemplated in the present invention are chimeric antibodies from mouse, rat, or other species, bearing human constant and/or variable region domains, bispecific antibodies, recombinant and engineered antibodies and fragments thereof. Such humanized antibodies are designed to maintain the binding specificity of the non-human antibody from which the binding regions are derived, but to avoid an immune reaction against the non-human antibody. 
A "chimeric" antibody is an antibody molecule in which (a) the constant region, or a portion thereof, is altered, replaced or exchanged so that the antigen binding site (variable region) is linked to a constant region of a different or altered class, effector function and/or species, or an entirely different molecule which confers new properties to the chimeric antibody, e.g., an enzyme, toxin, hormone, growth factor, drug, etc.; or (b) the variable region, or a portion thereof, is altered, replaced or exchanged with a variable region having a different or altered antigen specificity. Particularly, the term "antibody directed against the extracellular domain of EGFR" designates an antibody as described above which is able to bind to the extracellular domain of the EGF receptor and to block or reduce its activity. This inhibition can be due to steric hindrance or modification which prevents ligand binding. In particular, the antibody directed against the extracellular domain of the EGFR may be selected from the group consisting of cetuximab, panitumumab, zalutumumab, nimotuzumab and matuzumab, and any combination thereof.

In a further embodiment, the EGFR kinase inhibitor is a nucleic acid molecule interfering specifically with the expression of EGFR. The term "nucleic acid molecule" includes, but is not limited to, RNAi, antisense and ribozyme molecules. In the present invention, a "nucleic acid molecule specifically interfering with the expression of EGFR" is a nucleic acid molecule which is able to reduce or to suppress the expression of the gene coding for EGFR, in a specific way. The term "RNAi" or "interfering RNA" means any RNA which is capable of down-regulating the expression of the targeted protein. It encompasses small interfering RNA (siRNA), double-stranded RNA (dsRNA), single-stranded RNA (ssRNA), micro-RNA (miRNA), and short hairpin RNA (shRNA) molecules.
A number of patents and patent applications have described, in general terms, the use of siRNA molecules to inhibit gene expression, for example, WO 99/32619, US 20040053876, US 20040102408 and WO 2004/007718. siRNA are usually designed against a region 50-100 nucleotides downstream of the translation initiation codon, whereas 5'UTR (untranslated region) and 3'UTR are usually avoided. The chosen siRNA target sequence should be subjected to a BLAST search against EST database to ensure that only the desired gene is targeted. Various products are commercially available to aid in the preparation and use of siRNA. In a preferred embodiment, the RNAi molecule is a siRNA of at least about 15-50 nucleotides in length, preferably about 20-30 nucleotides, more preferably about 20-25 nucleotides in length. RNAi can comprise naturally occurring RNA, synthetic RNA, or recombinantly produced RNA, as well as altered RNA that differs from naturally-occurring RNA by the addition, deletion, substitution and/or alteration of one or more nucleotides. Such alterations can include addition of non-nucleotide material, such as to the end of the molecule or to one or more internal nucleotides of the RNAi, including modifications that make the RNAi resistant to nuclease digestion. RNAi may be administered in free (naked) form or by the use of delivery systems that enhance stability and/or targeting, e.g., liposomes, or incorporated into other vehicles, such as hydrogels, cyclodextrins, biodegradable nanocapsules, bioadhesive microspheres, or proteinaceous vectors (WO 00/53722), or in combination with a cationic peptide (US 2007275923). They may also be administered in the form of their precursors or encoding DNAs. In a particular embodiment, RNAi are encapsulated within vesicles, preferably within liposomes.

Antisense nucleic acid can also be used to down-regulate the expression of EGFR.
The antisense nucleic acid can be complementary to all or part of a sense nucleic acid encoding an EGFR polypeptide, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence, and is thought to interfere with the translation of the target mRNA. In a preferred embodiment, the antisense nucleic acid is a RNA molecule complementary to a target mRNA encoding an EGFR polypeptide. An antisense nucleic acid can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. Particularly, antisense RNA molecules are usually 18-50 nucleotides in length. An antisense nucleic acid for use in the method of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. Particularly, antisense RNA can be chemically synthesized, produced by in vitro transcription from linear (e.g. PCR products) or circular templates (e.g., viral or non-viral vectors), or produced by in vivo transcription from viral or non-viral vectors. Antisense nucleic acid may be modified to have enhanced stability, nuclease resistance, target specificity and improved pharmacological properties. For example, antisense nucleic acid may include modified nucleotides designed to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides.

Ribozyme molecules can also be used to decrease levels of EGFR. Ribozymes are catalytic RNA molecules with ribonuclease activity which are capable of cleaving a single-stranded nucleic acid, such as an mRNA, to which they have a complementary region. Thus, ribozymes can be used to catalytically cleave mRNA transcripts to thereby inhibit translation of the protein encoded by the mRNA.
Ribozyme molecules can be designed, produced, and administered by methods commonly known to the art (see e.g., Fanning and Symonds, 2006, reviewing therapeutic use of hammerhead ribozymes and small hairpin RNA). In a preferred embodiment, the EGFR kinase inhibitor is selected from the group consisting of erlotinib, gefitinib, lapatinib, cetuximab, panitumumab, zalutumumab, nimotuzumab and matuzumab, and any combination thereof. Preferably, the EGFR kinase inhibitor is selected from the group consisting of erlotinib and cetuximab, and combination thereof.

The present invention also relates to a method of predicting the sensitivity of a muscle-invasive bladder cancer to a treatment comprising an EGFR kinase inhibitor and/or capecitabine, wherein the method comprises determining whether the muscle-invasive bladder cancer has a basal-like phenotype with the method according to the invention, and optionally determining whether the muscle-invasive bladder cancer has a RAS-activating mutation, the presence of the basal-like phenotype in said cancer being indicative that said cancer is sensitive to a treatment comprising capecitabine, and the presence of the basal-like phenotype and the absence of RAS-activating mutation being indicative that said cancer is sensitive to a treatment comprising an EGFR kinase inhibitor. All embodiments disclosed for the method for determining whether a muscle-invasive bladder cancer has a basal-like phenotype are also contemplated in this aspect. In an embodiment, the method further comprises the step of providing a cancer sample from the patient.
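The decision logic recited in the selection and sensitivity-prediction methods above may be summarized by the following sketch. It merely restates the two indications given in the description as a function; the name `indicated_treatments` and the use of `None` for an undetermined RAS status are illustrative assumptions, the determination of the RAS status being optional in the method.

```python
from typing import Optional, Set

CAPECITABINE = "capecitabine"
EGFR_KINASE_INHIBITOR = "EGFR kinase inhibitor"


def indicated_treatments(
    basal_like: bool, ras_activating_mutation: Optional[bool] = None
) -> Set[str]:
    """Treatments to which the cancer is indicated as sensitive.

    - basal-like phenotype: capecitabine;
    - basal-like phenotype and absence of RAS-activating mutation:
      EGFR kinase inhibitor as well.
    `ras_activating_mutation=None` means the RAS status was not determined,
    in which case no indication is given for the EGFR kinase inhibitor.
    """
    treatments: Set[str] = set()
    if basal_like:
        treatments.add(CAPECITABINE)
        if ras_activating_mutation is False:
            treatments.add(EGFR_KINASE_INHIBITOR)
    return treatments
```

For instance, a basal-like tumor carrying a RAS-activating mutation yields an indication for capecitabine only, whereas a non basal-like tumor yields no indication under this logic.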

In a further aspect, the present invention also relates to an EGFR kinase inhibitor for use in the treatment of muscle-invasive bladder cancer having a basal-like phenotype as determined with the method according to the invention and without RAS-activating mutation. The present invention further relates to a method for treating a patient affected with a muscle-invasive bladder cancer having a basal-like phenotype as determined with the method according to the invention and without RAS-activating mutation, comprising administering a therapeutically efficient amount of a pharmaceutical composition comprising an inhibitor of EGFR, and optionally a pharmaceutically acceptable carrier.

The present invention also relates to capecitabine for use in the treatment of muscle-invasive bladder cancer having a basal-like phenotype as determined with the method according to the invention. The present invention further relates to a method for treating a patient affected with a muscle-invasive bladder cancer having a basal-like phenotype as determined with the method according to the invention, comprising administering a therapeutically efficient amount of a pharmaceutical composition comprising capecitabine, and optionally a pharmaceutically acceptable carrier. In a particular embodiment, the muscle-invasive bladder cancer has a basal-like phenotype and a RAS-activating mutation.

All embodiments disclosed for the method for determining whether a muscle-invasive bladder cancer has a basal-like phenotype are also contemplated in this aspect. In particular, the EGFR inhibitor may be used in combination with capecitabine. The EGFR inhibitor and/or capecitabine may also be used in combination with an alkylating agent.
Examples of alkylating agents include, but are not limited to, platinum-based chemotherapy drugs, cyclophosphamide, chlorambucil, uramustine, estramustine, ifosfamide, melphalan, bendamustine, carmustine, lomustine, semustine, streptozotocin, busulfan, dacarbazine, procarbazine, altretamine, adozelesin, thiotepa, mitozolomide and temozolomide. Platinum-based chemotherapy drugs may include, but are not limited to, cisplatin, carboplatin, iproplatin, spiroplatin, nedaplatin, oxaliplatin, triplatin tetranitrate and satraplatin. Preferably, the alkylating agent is selected from the group consisting of cisplatin, carboplatin, oxaliplatin, cyclophosphamide and ifosfamide, and any combination thereof. More preferably, the alkylating agent is cisplatin. Even more preferably, capecitabine is used in combination with an alkylating agent, preferably with cisplatin. By a "therapeutically efficient amount" is intended an amount of therapeutic agent administered to a patient that is sufficient to constitute a treatment of a bladder cancer. The amount of EGFR inhibitor and/or capecitabine to be administered has to be determined by standard procedure well known by those of ordinary skill in the art. Physiological data of the patient (e.g. age, size, and weight) and the routes of administration have to be taken into account to determine the appropriate dosage. The appropriate dosage of each compound may also vary if it is used alone or in combination.

In another aspect, the present invention concerns a combined preparation, product or kit containing (a) capecitabine and (b) an alkylating agent, preferably cisplatin, as a combined preparation for simultaneous, separate or sequential use in the treatment of a muscle-invasive bladder cancer having a basal-like phenotype as determined with the method according to the invention, preferably a muscle-invasive bladder cancer having a basal-like phenotype and RAS-activating mutation. The present invention also concerns a combined preparation, product or kit containing (a) an EGFR kinase inhibitor and (b) an alkylating agent, preferably cisplatin, as a combined preparation for simultaneous, separate or sequential use in the treatment of a muscle-invasive bladder cancer having a basal-like phenotype as determined with the method according to the invention, preferably a muscle-invasive bladder cancer having a basal-like phenotype and without RAS-activating mutation.

In a further aspect, the present invention relates to a kit and its use (i) for predicting clinical outcome of a patient afflicted with a muscle-invasive bladder cancer, (ii) for selecting a patient afflicted with a muscle-invasive bladder cancer for a treatment comprising an EGFR kinase inhibitor and/or capecitabine, and/or (iii) for predicting the sensitivity of a muscle-invasive bladder cancer to a treatment comprising an EGFR kinase inhibitor and/or capecitabine, wherein the kit comprises detection means selected from the group consisting of a pair of primers, a probe and an antibody specific to
(a) the genes KRT5, KRT6A, KRT6B and/or FOXA1; and/or
(b) at least 2 genes selected from the group consisting of PI3, KRT6B, CSTA, DSC2, MT1X, RAB38, SFN, SAMD9, EGFR, CD44, IL1RAP, DSP, PKP1, SERPINB7, CELSR2, DUSP7, TBC1D2, ARL4D, IPPK, MTSS1 and RGS20 genes, preferably PKP1 and IPPK, and at least 2 genes selected from the group consisting of PHC1, THYN1, TACC1, PPAP2B, NRXN3, GNA14, ZFHX3, TLE2, MAML3, EPS8, CACNA1D, RAB15, MAN1C1, SORL1, CHN2, TGFBR3, CAB39L, LIMCH1 and BAMBI genes, preferably MAML3 and TGFBR3; and/or
(c) at least one exon selected from the group consisting of the exons of SEQ ID NO: 23 to 41, preferably the exon of SEQ ID NO: 24 or 37, more preferably the exon of SEQ ID NO: 24; and/or
(d) at least 4 CpG islands selected from the group consisting of CpG islands listed in Table 7; and/or
(e) at least 4 CpG sites selected from the group consisting of CpG sites listed in Table 9; and/or
(f) the TGM1 gene; and/or
(g) the HDAC9 short isoform corresponding to the transcript ENST00000456174 and HDAC9 long isoforms corresponding to transcripts ENST00000406451, ENST00000405010 and ENST00000428307,
and optionally, a leaflet providing guidelines to such use. The kit may further comprise detection means selected from the group consisting of a pair of primers, a probe and an antibody specific to a RAS-activating mutation, in particular a mutation selected from the group consisting of HRAS mutations G12S and G13V, KRAS mutations G12C and G12D and NRAS mutation M72I.

The present invention also relates to a DNA chip and its use (i) for predicting clinical outcome of a patient afflicted with a muscle-invasive bladder cancer, (ii) for selecting a patient afflicted with a muscle-invasive bladder cancer for a treatment comprising an EGFR kinase inhibitor and/or capecitabine, and/or (iii) for predicting the sensitivity of a muscle-invasive bladder cancer to a treatment comprising an EGFR kinase inhibitor and/or capecitabine, wherein the DNA chip comprises a solid support which carries nucleic acids that are specific to
(a) at least 2 genes selected from the group consisting of PI3, KRT6B, CSTA, DSC2, MT1X, RAB38, SFN, SAMD9, EGFR, CD44, IL1RAP, DSP, PKP1, SERPINB7, CELSR2, DUSP7, TBC1D2, ARL4D, IPPK, MTSS1 and RGS20 genes, preferably PKP1 and IPPK, and at least 2 genes selected from the group consisting of PHC1, THYN1, TACC1, PPAP2B, NRXN3, GNA14, ZFHX3, TLE2, MAML3, EPS8, CACNA1D, RAB15, MAN1C1, SORL1, CHN2, TGFBR3, CAB39L, LIMCH1 and BAMBI genes, preferably MAML3 and TGFBR3; and/or
(b) at least one exon selected from the group consisting of the exons of SEQ ID NO: 23 to 41, preferably the exon of SEQ ID NO: 24 or 37, more preferably the exon of SEQ ID NO: 24; and/or
(c) at least 4 CpG islands selected from the group consisting of CpG islands listed in Table 7; and/or
(d) at least 4 CpG sites selected from the group consisting of CpG sites listed in Table 9; and/or
(f) the TGM1 gene; and/or
(g) the HDAC9 short isoform corresponding to the transcript ENST00000456174 and HDAC9 long isoforms corresponding to transcripts ENST00000406451, ENST00000405010 and ENST00000428307.
The DNA chip may further comprise nucleic acids that are specific to a RAS-activating mutation, in particular a mutation selected from the group consisting of HRAS mutations G12S and G13V, KRAS mutations G12C and G12D and NRAS mutation M72I.

Further aspects and advantages of the present invention will be described in the following examples, which should be regarded as illustrative and not limiting.

EXAMPLES

Example 1

Materials and methods

Patients and tissue samples (CIT series). A series of 85 MIBC (31 pT2, 35 pT3 and 19 pT4) was collected from patients treated surgically between 1988 and 2006 at Henri Mondor Hospital, Institut Gustave Roussy (Villejuif, France) and Foch Hospital (Suresnes, France). All tumors were pathologically reviewed, staged according to the 1997 TNM classification (Sobin and Fleming, 1997) and graded according to the 1973 WHO classification (Mostofi FK, 1973). The clinical annotations for the patients and the pathological features of each tumor were also recorded. Three normal urothelial samples were obtained by scraping fresh urothelial cells from the normal bladder wall and from the lamina propria during organ procurement from cadaveric donors for transplantation (Gil-Diez de Medina et al, 1998). All patients provided written informed consent and the study was approved by the ethics committees of the various hospitals. Additional information about patient samples is provided in Table 1 below.

Table 1: Clinical data and identification of the basal-like molecular subtype in the CIT dataset

Sample ID | Tumor stage | Tumor grade | Cystectomy | Follow up time (months) | Last known status (1=dead; 0=alive) | Lymph node | Distant metastasis | Squamous differentiation
CIT.1 | T4 | G3 | yes | 1 | 1 | positive | negative | no
CIT.100 | T3 | G3 | yes | 15 | 1 | positive | positive | no
CIT.103 | T2 | G3 | yes | 54 | 0 | negative | negative | no
CIT.106 | T4 | G3 | yes | 12 | 1 | positive | positive | no
CIT.108 | T3 | G3 | yes | 70 | NA | negative | negative | no
CIT.12 | T3 | G3 | yes | 26 | 1 | positive | positive | no
CIT.120 | T2 | G3 | no | 107 | 0 | negative | negative | no
CIT.127 | T2 | G3 | yes | 94 | 1 | negative | negative | no
CIT.134 | T2 | G3 | no | 13 | 0 | NA | negative | no
CIT.138 | T3 | G3 | yes | 3 | 1 | positive | positive | no
CIT.139 | T2 | G3 | no | 10 | 1 | NA | negative | no
CIT.14 | T3 | G3 | yes | 4 | 1 | negative | negative | no
CIT.141 | T2 | G3 | yes | 2 | 0 | NA | positive | no
CIT.142 | T2 | G3 | yes | 57 | 0 | NA | negative | no
CIT.15 | T3 | | | | 0 | negative | negative | no
CIT.17 | T3 | | | | 1 | positive | negative | no
CIT.170 | T2 | | | | NA | negative | negative | no
CIT.172 | T2 | | | | 0 | negative | negative | no
CIT.175 | T2 | | | | NA | negative | negative | no
CIT.177 | T4 | | | | 1 | positive | positive | no
CIT.179 | T4 | | | | 1 | positive | positive | no
CIT.180 | T4 | | | | 0 | positive | positive | no
CIT.181 | T2 | | | | 0 | negative | negative | yes
CIT.182 | T2 | | | | 1 | NA | positive | no
CIT.185 | T2 | | | | NA | negative | NA | no
CIT.186 | T4 | | | | 0 | positive | positive | no
CIT.187 | T3 | | | | NA | positive | negative | no
CIT.188 | T4 | | | | 1 | positive | positive | yes
CIT.19 | T3 | | | | 0 | negative | negative | no
CIT.190 | T3 | | | | 1 | negative | positive | yes
CIT.192 | T2 | | | | 0 | NA | positive | no
CIT.2 | T4 | | | | 1 | negative | negative | no
CIT.23 | T3 | | | | NA | positive | positive | no
CIT.242 | T4 | | | | 1 | negative | positive | no
CIT.25 | T3 | | | | 0 | negative | negative | yes
CIT.252 | T2 | | | | 1 | negative | negative | no
CIT.253 | T3 | | | | 1 | NA | negative | no
CIT.254 | T3 | | | | 1 | negative | positive | no
CIT.256 | T2 | | | | 0 | negative | negative | no
CIT.259 | T2 | | | | 1 | negative | negative | no
CIT.260 | T3 | | | | NA | negative | negative | no
CIT.262 | T3 | | | | 1 | negative | positive | no
CIT.265 | T3 | | | | NA | positive | positive | no
CIT.266 | T3 | | | | NA | negative | positive | no
CIT.267 | T2 | | | | 1 | negative | negative | yes
CIT.269 | T3 | | | | 1 | NA | positive | no
CIT.277 | T2 | | | | NA | negative | negative | no
CIT.28 | T3 | | | | 0 | negative | negative | yes
CIT.280 | T2 | | | | NA | negative | negative | no
CIT.290 | T2 | | | | 0 | NA | negative | no
CIT.30 | T3 | | | | 1 | positive | negative | no
CIT.31 | T4 | | | | 1 | negative | negative | yes
CIT.310 | T2 | | | | 0 | negative | negative | no
CIT.32 | T3 | | | | 0 | negative | negative | no
CIT.33 | T2 | | | | 0 | negative | negative | no
CIT.34 | T2 | | | | 0 | negative | negative | no
CIT.36 | T2 | | | | 0 | negative | negative | no
CIT.37 | T4 | | | | 1 | positive | negative | no
CIT.38 | T2 | | | | 1 | positive | negative | yes
CIT.39 | T2 | | | | 1 | negative | positive | no
CIT.40 | T2 | | | | NA | NA | positive | no
CIT.44 | T4 | | | | 1 | positive | negative | yes
CIT.6 | T4 | | | | 1 | positive | positive | no
CIT.65 | T2 | | | | 0 | NA | positive | no
CIT.67 | T3 | | | | 1 | positive | positive | no
CIT.68 | T4 | | | | 1 | negative | positive | no
CIT.7 | T4 | | | | 1 | positive | positive | no
CIT.70 | T3 | | | | 0 | positive | negative | no
CIT.72 | T3 | | | | 0 | negative | negative | yes
CIT.73 | T3 | G3 | yes | 35 | 1 | negative | positive | no
CIT.75 | T3 | G3 | yes | 1 | 0 | negative | negative | no
CIT.76 | T4 | G3 | yes | 16 | 1 | positive | positive | no
CIT.78 | T3 | G3 | yes | 21 | 0 | negative | negative | yes
CIT.79 | T2 | G3 | yes | 1 | 0 | negative | negative | no
CIT.8 | T4 | G3 | yes | 39 | 1 | positive | positive | no
CIT.80 | T3 | G3 | yes | 5 | 1 | positive | negative | no
CIT.81 | T3 | G3 | yes | 17 | 1 | positive | negative | no
CIT.82 | T3 | G3 | yes | 18 | 1 | positive | positive | no
CIT.83 | T3 | G3 | yes | 2 | 1 | positive | negative | no
CIT.84 | T4 | G3 | yes | 9 | 1 | negative | negative | yes
CIT.91 | T2 | G3 | no | 9 | 0 | NA | NA | no
CIT.93 | T3 | G3 | yes | 8 | 1 | negative | positive | yes
CIT.96 | T3 | G3 | yes | 18 | 0 | negative | negative | no
CIT.97 | T3 | G3 | yes | 4 | 0 | positive | negative | yes
CIT.99 | T4 | G3 | yes | 23 | 1 | negative | positive | no

NA indicates that data was not recorded in the corresponding sample

Patients and tissue samples (Stransky series). The Stransky series is composed of 39 MIBC samples. The corresponding expression profiles, as well as clinical and biological annotations, are publicly available (ArrayExpress Database (website: https://www.ebi.ac.uk/arrayexpress/), dataset ID = E-TABM-147) (Stransky et al, 2006).

RNA, DNA and protein extraction from tissues and cell lines. Immediately after surgery, the tissue samples were frozen in liquid nitrogen and stored at -80°C until nucleic acid and protein extraction. RNA, DNA and proteins were extracted from frozen human bladder samples and from human bladder cancer cell lines grown to 70% confluence, by centrifugation through a cesium chloride density gradient (Chirgwin et al, 1979; Coombs et al, 1990). RNA was extracted from BBN-induced mouse bladder tumors or normal mouse urothelium, with Trizol. The concentration, integrity and purity of each RNA sample were determined with the RNA 6000 LabChip Kit (Agilent Technologies) and an Agilent 2100 Bioanalyzer. DNA purity was assessed by determining the ratio of absorbances at 260 and 280 nm. DNA concentration was determined with a Hoechst dye-based fluorescence assay (Labarca and Paigen, 1980). Proteins were resuspended in 1x Laemmli buffer supplemented with protease and phosphatase inhibitors (Roche) and concentration was determined with a BCA Protein Assay-Reducing Agent Compatible kit (Pierce).

Gene mutation analysis. Seven genes were screened for mutation: FGFR3, TP53, KRAS, NRAS, HRAS, PIK3CA and EGFR. FGFR3 mutations were studied with the SNaPshot method, as previously described (van Oers et al, 2005). TP53 (exons 2 to 11), KRAS (exons 2-3), NRAS (exons 2-3), HRAS (exons 2-3) and PIK3CA (exons 2, 9 and 20) gene mutations were screened by direct sequencing with previously described primers and protocols (Boyault et al, 2007; Wallerand et al, 2005). Basal-like bladder tumors and cell lines were analyzed for the presence of hotspot mutations in the EGFR gene (exons 18-21) by direct sequencing with previously described primers (Boyault et al, 2007). The presence of the truncated EGFR variant III (EGFRvIII) was also assessed in these samples, by analysis of the cDNA with primers binding to sequences flanking the deletion of exons 2 to 7, after gel electrophoresis, as previously described (Sok et al, 2006). All mutations were confirmed by sequencing both strands of a second, independent PCR product.

Antibodies and western blotting. Immunoblot analysis was carried out with the following primary antibodies: anti-EGFR (#2232); anti-phospho-EGFR Tyr1068 (#3777); anti-p44/42 MAPK (#9102); anti-phospho-p44/42 MAPK (Thr202/Tyr204) (#9101); anti-AKT (#9272); anti-phospho-AKT Ser473 (#4060). All these antibodies were purchased from Cell Signaling Technology and used at a dilution of 1:1000. The same amount of total protein was loaded for each sample and an anti-α-tubulin antibody (T9026, 1:2000; Sigma-Aldrich) was used as a loading control (for experiments with bladder cancer cell lines). Signals were detected with the ECL SuperSignal West Pico chemiluminescent substrate (Pierce), with either anti-mouse or anti-rabbit (1:2000, Cell Signaling Technology) horseradish peroxidase-conjugated IgG as the secondary antibody. Western blots were analyzed with a Fujifilm LAS-3000 imager and protein levels were determined by densitometry with MultiGauge software (Fujifilm).

Immunohistochemistry. Tissue samples were immunostained for CK5/6 and FOXA1 on tissue microarrays (TMA) constructed with Beecher's Tissue arrayer®, according to the manufacturer's instructions (http://www.beecherinstruments.com), using three replicate cores (diameter: 0.6 mm) in each case. For FOXA1, the sections were placed in citrate buffer pH 6, and microwaved for antigen retrieval, and then incubated with the primary rabbit polyclonal antibody against FOXA1 (ref 23738, Abcam, Cambridge, United Kingdom) overnight (dilution 1:200) at +4°C. Nuclear staining was assessed by one pathologist, taking into account the intensity I (0, null; 1, mild; 2, moderate; 3, strong) and the percentage P of tumor cells with stained nuclei, and a quick score (QS) was then calculated as QS = I x P (from 0 to 300). For CK5/6 immunostaining (mouse monoclonal antibody, clone D5/16 B4, dilution 1:100, DakoCytomation, Glostrup, Denmark), antigen retrieval was carried out in EDTA pH 9, and paraffin sections were processed on an automated instrument (Ventana Nexes; Ventana Medical Systems, Tucson AZ, USA) with an indirect biotin-avidin system, the Ventana Basic DAB Detection kit (Ventana Medical Systems), according to the manufacturer's instructions. For phospho-EGFR immunostaining, nine whole tumor sections were used. The sections were placed in EDTA pH 9 and microwaved for antigen retrieval, then incubated with the rabbit monoclonal antibody (clone D7A5, ref 3777, Cell Signaling, Boston, MA) overnight (dilution 1:50) at +4°C.
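The quick score described above can be illustrated with a minimal sketch. The function name and the example values are hypothetical and are given only to show the arithmetic QS = I x P with I in {0, 1, 2, 3} and P between 0 and 100.

```python
def quick_score(intensity: int, percent_stained: float) -> float:
    """Quick score QS = I x P, which ranges from 0 to 300.

    intensity: staining intensity I (0 null, 1 mild, 2 moderate, 3 strong).
    percent_stained: percentage P of tumor cells with stained nuclei (0-100).
    """
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2 or 3")
    if not 0 <= percent_stained <= 100:
        raise ValueError("percent_stained must be between 0 and 100")
    return intensity * percent_stained


# Hypothetical example: moderate staining in 60% of tumor nuclei.
print(quick_score(2, 60))
```

A score of 0 is obtained whenever either the intensity or the percentage of stained nuclei is zero, which corresponds to the absence of nuclear staining.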

Quantitative RT-PCR. One microgram of total RNA was reverse transcribed with the High-Capacity cDNA Reverse Transcription kit (Applied Biosystems). mRNA levels were quantified on a LightCycler® 480 Instrument (Roche) in predesigned or custom gene expression assays, with gene-specific primers and a dye-labeled hydrolysis probe (TaqMan probes from Applied Biosystems or UPL (Universal ProbeLibrary) probes from Roche). Predesigned assays were used to quantify human 18S rRNA (reference gene) and EREG gene expression (Gene: EREG, Assay ID: 110729, Supplier: Roche; Gene: Human 18S rRNA, Assay ID: 4319413E, Supplier: Applied Biosystems). Custom assays were used for the other eight genes (Table 2). For custom assays, primers and probes were designed with Probe Finder software via the Universal Probe Library Assay Design Center (Roche). qRT-PCR was carried out with the LightCycler® 480 Instrument (Roche) in a 20 µl reaction mixture containing 10 ng of reverse-transcribed RNA, 1x LightCycler® 480 Probe Master, 25 µM each of the forward and reverse primers and 10 µM of the UPL probe (or 1x the predesigned assay probe). All expression assays were run in the same thermal cycling conditions, including an initial step at 95°C (10 min), followed by 40 cycles at 95°C (10 s), 60°C (30 s) and 72°C (10 s). For each gene of interest, the amount of mRNA was normalized with respect to that of the ribosomal 18S (R18S) reference gene by the 2^-ΔCt method.

Table 2: Custom gene expression assays

Gene | Forward primer | Reverse primer | UPL probe ID (Roche) | Amplicon size (bp)
KRT5 | GCAGATCAAGACCCTCAACAAT (SEQ ID NO: 1) | CCACTTGGTGTCCAGAACCT (SEQ ID NO: 2) | 84 | 91
KRT6A | AGTTTGCCTCCTTCATCGAC (SEQ ID NO: 3) | CAGCAGGGTCCACTTTGTTT (SEQ ID NO: 4) | 84 | 77
KRT17 | TTGAGGAGCTGCAGAACAAG (SEQ ID NO: 5) | AGTCATCAGCAGCCAGACG (SEQ ID NO: 6) | 76 | 93
EGFR | GATCCAAGCTGTCCCAATG (SEQ ID NO: 7) | GCACAGATGATTTTGGTCAGTT (SEQ ID NO: 8) | 3 | 77
HBEGF | TGGGGCTTCTCATGTTTAGG (SEQ ID NO: 9) | CATGCCCAACTTCACTTTCTC (SEQ ID NO: 10) | 55 | 77
AREG | CGGAGAATGCAAATATATAGAGCAC (SEQ ID NO: 11) | CACCGAAATATTCTTGCTGACA (SEQ ID NO: 12) | 38 | 68
TGFA | TTGCTGCCACTCAGAAACAG (SEQ ID NO: 13) | ATCTGCCACAGTCCACCTG (SEQ ID NO: 14) | 63 | 70
ΔNp63 | GGTTGGCAAAATCCTGGAG (SEQ ID NO: 15) | GGTTCGTGTACTGTGGCTCA (SEQ ID NO: 16) | 56 | 119
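The normalization to the 18S reference gene by the 2^-ΔCt method can be sketched as follows. This is a minimal illustration, assuming that ΔCt is the threshold cycle of the gene of interest minus that of the reference gene measured in the same sample; the function name and the Ct values are hypothetical.

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Relative mRNA amount by the 2^-dCt method.

    dCt = Ct(gene of interest) - Ct(reference gene, here 18S rRNA).
    A lower Ct for the gene of interest gives a higher relative amount.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values for one sample (not taken from the study).
ct_egfr = 27.0
ct_18s = 12.0
print(relative_expression(ct_egfr, ct_18s))
```

The method assumes an amplification efficiency close to 100% for both assays, so that one cycle of difference corresponds to a two-fold difference in template amount.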

Affymetrix microarray profiling of RNA extracted from human tumors (CIT series), human cell lines and mouse samples. Human muscle-invasive bladder cancer (MIBC) samples (n=85), normal human urothelium samples (n=3) and samples from a subset of human bladder cancer cell lines (n=7) were hybridized with Affymetrix HG-U133 Plus 2.0 arrays. Extracts of all the human bladder cancer cell lines used in this study (n=22) were hybridized with the Affymetrix Human Exon 1.0 ST Array GeneChip™. Mouse BBN-induced bladder tumor samples (n=11) and normal mouse urothelium samples (n=3) were hybridized with an Affymetrix Mouse Exon 1.0 ST Array GeneChip™. Raw feature data from Affymetrix HG-U133 Plus 2.0, Mouse Exon 1.0 ST Array and Human Exon 1.0 ST Array GeneChip™ microarrays were normalized by the robust multi-array average (RMA) method (R package affy) (Irizarry et al., 2003). For the mouse and human exon array datasets, probe-to-gene assignments were made on the basis of custom CDF files from Dai et al. (Dai et al., 2005) in version 12 for EntrezGene genes.

Publicly available gene expression datasets. Expression profiles and associated clinical data for 326 MIBC samples were obtained from six public datasets (Blaveri et al, 2005; Dyrskjot et al, 2007; Kim et al, 2010; Lindgren et al, 2010; Riester et al, 2012; Stransky et al, 2006). Expression profiles for bladder cancer cell lines were also obtained from two public series (Lee et al, 2007) (and an unpublished but publicly available dataset created by Wooster et al. in collaboration with GlaxoSmithKline). A detailed description of these series is provided in the table below.

Table 3: Detailed description of eight independent external public datasets

Dataset | Accession number | Platform | Probes | Human MIBC | Human bladder cancer cell lines
Riester | GSE31684 | Affymetrix HG-U133 Plus 2.0 | 54,675 | 78 |
Lindgren | GSE19915 | cDNA microarray | 36,288 | 45 |
Kim | GSE13507 | Illumina human-6 v2.0 expression beadchip | 48,000 | 61 |
Dyrskjot | GSE5479 | MDL human 3k oligo array | 1,369 | 51 |
Stransky | E-TABM-147 | Affymetrix HG-U95A/U95Av2 | 12,626 | 39 |
Blaveri | GSE1827 | cDNA microarray | 10,368 | 52 |
Lee | GSE5845 | Affymetrix HG-U133A | 22,283 | | 40
Wooster | E-MTAB-37 | Affymetrix HG-U133 Plus 2.0 | 54,675 | | 11

CGH array analysis. DNA copy number was analyzed for the 85 MIBC from the CIT series on the human genome-wide CIT-CGHarray (V6) designed by the CIT-CGH Consortium.

Clinical data analysis The overall survival of patients presenting with muscle-invasive bladder tumors was analyzed by combining seven independent datasets (CIT, Riester, Lindgren, Kim, Dyrskjot, Stransky, and Blaveri). Survival time was calculated from the date of cystectomy. Data were censored for patients lost to follow-up or alive at the time of last follow-up. Survival analysis was based on Kaplan-Meier curves (function surv), log rank tests (function survdiff) and Cox models (function coxph) using the survival package of R. Forest plots were generated with the R package rmeta.
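The Kaplan-Meier estimate underlying the survival curves can be sketched in a few lines. The study itself relied on the survival package of R; the code below is only an illustrative re-implementation of the product-limit estimator, with hypothetical function names and input values.

```python
def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) estimate of the survival function.

    times:  follow-up time for each patient (e.g. months since cystectomy).
    events: 1 if the patient died at that time, 0 if the patient was censored
            (lost to follow-up or alive at last follow-up).
    Returns a list of (time, survival probability) at each time with a death.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        n_at_risk -= removed
    return curve


# Hypothetical follow-up data (months, status), not taken from Table 1.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

Censored patients contribute to the number at risk up to their last follow-up but never lower the curve themselves, which is why data censored as described above can still be used.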

Cell culture and reagents. The inventors used 22 human bladder cancer cell lines in this study. Most were purchased from DSMZ (BFTC-905, CAL29, JMSU1, RT112, SW1710, T24, VMCUB1, VMCUB3, 5637), ECACC (UMUC1, UMUC5, UMUC6, UMUC9, UMUC10, UMUC16) and ATCC (J82). The MGHU3, RT4 and SCaBER cell lines were a gift from Francisco Real (CNIO Madrid). TCCSUP and KK47 were obtained from the laboratories of Dominique Chopin (Hôpital Henri Mondor, Créteil, France) and Jennifer Southgate (previously of Cancer Research Unit St James's University Hospital Leeds, UK), respectively. Cell line identity was confirmed by analyzing specific gene mutations previously described in each cell line and/or by CGH analysis. The L1207 cell line was derived from tumor T1207, as previously described (De Boer et al., 1997). All cell lines were maintained in Dulbecco's modified Eagle's medium (DMEM) supplemented with 10% fetal bovine serum, 100 U/ml penicillin and 100 U/ml streptomycin. Erlotinib was obtained from LC laboratories. Cetuximab (5 mg/ml) was purchased from Merck KGaA (Darmstadt, Germany) and stored at 4°C.

Assessment of cell viability. Cell viability was determined by colorimetric MTT assays. For anti-EGFR experiments, cell lines were dispensed into 96-well plates at a density of 1500-8000 cells per well and incubated overnight. They were then treated for 72 h with various doses of erlotinib (0.01, 0.05, 0.1, 0.5, 1, 5 and 10 µM, dissolved in 0.1% DMSO) or with 10 µg/ml cetuximab in 100 µl of DMEM supplemented with 0.5% fetal bovine serum and 40 µg/ml transferrin. The inventors included 0.1% DMSO (vehicle) alone, as a negative untreated control for erlotinib experiments. Each concentration was tested in triplicate, and each experiment was repeated two to four times for each cell line. The concentration of erlotinib inhibiting the growth of the treated cells by 50% with respect to untreated controls (GI50) was calculated after curve fitting with GraphPad Prism 5.0 software. Cell lines were considered to be sensitive if the GI50 was no higher than 1 µM, a concentration that can be achieved in the plasma of cancer patients treated with this agent. Neutralizing antibody experiments were performed with a mouse monoclonal antibody directed against human amphiregulin (MAB262) (purchased from R&D Systems). 24 h after the seeding of 96-well plates at a density of 4000-6500 cells per well, cells were treated for 6 days in DMEM (without serum) supplemented with 40 µg/ml transferrin and the neutralizing antibody directed against amphiregulin at a concentration of 5 µg/ml. Mouse IgG (5 µg/ml; R&D Systems) was included as a negative control. The culture medium was replaced after 3 days with medium containing fresh antibodies. All experiments were performed in triplicate and were carried out at least 3 times. For both anti-EGFR and neutralizing antibody experiments, cell viability was determined by adding MTT (1 mg/ml) to each well, after 1 hour of incubation at 37°C. The medium was then removed, and 100 µl of DMSO was added, to lyse the cells and solubilize the formazan. Data are expressed as the percentage of viable cells with respect to the appropriate untreated control, depending on the experiment.
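The GI50 and the sensitivity criterion (GI50 no higher than 1 µM) can be illustrated with a simplified sketch. The study fitted dose-response curves with GraphPad Prism; the code below does not reproduce that fit and instead assumes a plain linear interpolation on log10(dose) between the two tested doses that bracket 50% viability. All function names and viability values are hypothetical.

```python
import math


def gi50_by_interpolation(doses_um, viability_pct):
    """Estimate the dose (in µM) at which viability falls to 50% of control.

    Assumption: linear interpolation on log10(dose) between the two tested
    doses bracketing 50%. This is a simplification, not a curve fit.
    Returns None if 50% inhibition is not reached in the tested range.
    """
    points = sorted(zip(doses_um, viability_pct))
    for (d_lo, v_lo), (d_hi, v_hi) in zip(points, points[1:]):
        if v_lo >= 50.0 >= v_hi and v_lo != v_hi:
            frac = (v_lo - 50.0) / (v_lo - v_hi)
            log_lo, log_hi = math.log10(d_lo), math.log10(d_hi)
            return 10 ** (log_lo + frac * (log_hi - log_lo))
    return None


def is_sensitive(gi50_um, threshold_um=1.0):
    """A cell line is called sensitive if its GI50 is no higher than 1 µM."""
    return gi50_um is not None and gi50_um <= threshold_um


# Erlotinib doses used in the study (µM) with hypothetical viability values.
doses = [0.01, 0.05, 0.1, 0.5, 1, 5, 10]
viability = [100, 95, 90, 60, 40, 20, 10]
gi50 = gi50_by_interpolation(doses, viability)
print(gi50, is_sensitive(gi50))
```

A cell line whose viability never drops to 50% within the tested range has no GI50 estimate in this sketch and is therefore not classified as sensitive.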

Human bladder cancer cell line xenografts. Six-week-old female Swiss nu/nu mice (Charles River Laboratories) were raised in the animal facilities of Institut Curie, in specified pathogen-free conditions. Animal care and housing conformed to the institutional guidelines of the French National Ethics Committee (Ministère de l'Agriculture et de la Forêt, Direction de la Santé et de la Protection Animale, Paris, France) and was supervised by accredited investigators. Mice received a subcutaneous injection, into each flank (dorsal region), of 2xl0 UMUC6 or 6x10^6 KK47, JMSU1, BFTC905, VMCUB1 or L1207 bladder cancer cells in 100 µl PBS. For each of the cell lines injected, mice were randomly separated into two groups of six mice when tumors reached a volume of 100 mm3 (± 20). The mice were treated six days per week by oral gavage with erlotinib (100 mg/kg) in one group and with vehicle (0.5% carboxymethylcellulose) in the other. Tumor size was measured twice weekly with calipers, and the volume in mm3 was calculated with the formula: π/6 x (largest diameter) x (shortest diameter)^2.
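The tumor volume formula above can be written as a short sketch. The function names and the caliper measurements are hypothetical; the 100 mm3 (± 20) window is the randomization criterion stated above.

```python
import math


def tumor_volume_mm3(largest_diameter_mm: float, shortest_diameter_mm: float) -> float:
    """Volume = pi/6 x (largest diameter) x (shortest diameter)^2, in mm3."""
    return math.pi / 6.0 * largest_diameter_mm * shortest_diameter_mm ** 2


def ready_for_randomization(volume_mm3: float) -> bool:
    """True when the tumor volume lies within 100 mm3 +/- 20."""
    return 80.0 <= volume_mm3 <= 120.0


# Hypothetical caliper measurements (mm).
volume = tumor_volume_mm3(6.0, 6.0)
print(round(volume, 1), ready_for_randomization(volume))
```

With equal diameters the formula reduces to the volume of a sphere, π/6 x d^3, which gives a quick sanity check on the measurements.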

BBN-induced bladder tumors and preclinical testing of anti-EGFR therapy. Experiments were performed with the approval of the Ile-de-France ethics committee for animal experimentation. Bladder carcinogenesis was induced by supplying eight-week-old adult C57BL/6 male mice for 13 weeks with 0.05% N-butyl-N-(4-hydroxybutyl) nitrosamine (BBN) (TCI Europe) in drinking water (el-Marjou et al, 2000). Mice were kept in the carcinogenesis room, with standard water, for a further two weeks. They were then randomly separated into two groups (20 mice/group), which were treated six days per week by oral gavage with erlotinib (100 mg/kg) or with vehicle (0.5% carboxymethylcellulose) until the mice were killed. Tumor formation was followed weekly, by ultrasound, in anesthetized mice, with a High-Resolution In Vivo Micro-Imaging System with a 707B probe (Vevo770, VisualSonics, Toronto, Canada). Tiny tumors accounting for 2 to 5% of bladder volume were detectable by this method. The time at which a tumor could first be unambiguously localized and its growth followed on a weekly basis was considered for the tumor-free survival curve. Mice were killed when their tumors were 90-95% the size of the bladder or when they showed signs of weakness (large weight loss), and this was the time considered for the overall survival curve. Tumors were removed. A part of each tumor was fixed in formalin and embedded in paraffin for histological analyses, and another part of the tumor was flash-frozen in liquid nitrogen for mRNA extraction with the Trizol protocol, for gene expression analysis.

Results

Transcriptional profiling of muscle-invasive bladder cancers identifies a distinct molecular subtype, the basal-like subtype. Transcriptomic data of 85 MIBC (from the CIT series) were analyzed by two different unsupervised methods. Consensus clustering analysis revealed a subgroup of 21 tumors (Figure 1A, on the right) that was of particular interest as it was very stable. This subgroup was also clearly identified by a second unsupervised method: principal component analysis (Figure 1B). Moderated t-tests were then carried out to identify genes significantly differentially expressed between this subgroup of tumors and the other MIBC (P < 0.01 and fold change (FC) > 2). Using these two criteria, 1064 probe sets were identified corresponding to 761 genes. The 52 most significantly differentially expressed of these genes (P < 10^-12 and FC > 2) are shown in Figure 1C. The list of overexpressed genes (466 of 761) in the subgroup of interest was significantly enriched in genes associated with epithelial wound healing, including KRT6A, KRT16, KRT17 and PLAU, PLAUR (plasminogen activator/plasmin system) (Romer et al, 1994), keratinocyte differentiation, including several genes of the epidermal differentiation complex at 1q21, and genes expressed in the basal layer of stratified epithelia, such as KRT5, KRT6A, KRT14 and EGFR. The underexpressed genes included several bladder epithelial (urothelial) differentiation markers, such as uroplakin genes (UPK1A, UPK1B, UPK2, UPK3A) and FOXA1 (Varley et al, 2009). The list of differentially-expressed genes was also significantly enriched in markers of a particular subgroup of breast cancer known as basal breast cancer (Perou et al, 2000). This, together with the enrichment in basal epithelial cell markers, led the inventors to call this subgroup "basal-like".
Immunohistochemical analysis of 62 of the 85 MIBC from the CIT series with an antibody against both cytokeratins KRT5 (CK5) and KRT6 (CK6) and an antibody recognizing FOXA1 confirmed the strong expression of the basal cytokeratins CK5/6 and the absence of nuclear FOXA1 in basal-like tumors (Figure 1D). The CK5/6+ FOXA1- phenotype was found in most of the basal-like MIBC analyzed (16 of 18, i.e. 89%). Only two of the 44 non-basal-like (4.5%) tumors presented this phenotype. The combination of these two immunohistochemical markers may significantly distinguish between basal-like and non-basal-like MIBC in this series (four misclassifications in the series of 62 MIBC analyzed; Figure 1D and Figure 11). These results were confirmed by immunohistochemical analysis of an additional series of 32 MIBC (the "Stransky" series) with the same antibodies. The CK5/6+ FOXA1- phenotype was found in most of the basal-like MIBC analyzed (7 of 9). Only one of the 23 non-basal-like tumors presented this phenotype. In the CIT series of 85 MIBC, mutations of the genes known to be frequently altered in bladder cancer were then assessed (Forbes et al, 2011) (Figure 1E). TP53 was significantly (P = 0.02) more frequently mutated in the basal-like subgroup (16 of 21 tumors, 76%) than in the other MIBC (29 of 64 tumors, 45%). This subtype of bladder carcinoma also displayed significantly (P < 0.0001) more frequent histological signs of squamous cell differentiation (11 of 21 cases, 52.4%) than non-basal-like tumors (3 of 64 cases, 4.7%) (Figure 1E). Using BAC CGH arrays, the inventors identified three regions significantly (Chi2 P < 0.01) more altered in the basal-like subgroup than in other MIBC (at 3p14.2, 6p, 7p11.2). The 3p14.2 region, which presented significantly more losses (both hemizygous and homozygous deletions), contained 52 genes, including the tumor suppressor gene FHIT. The whole 6p arm showed significantly more gains.
The 7p11.2 region, which presented a significant increase in copy number (both gains (up to four copies) and amplifications (more than four copies)) contained two genes, including the gene encoding the tyrosine kinase receptor EGFR. However, they identified no mutations of EGFR or the truncated EGFR variant III in the 21 basal-like tumors analyzed.
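The performance of the two-marker immunohistochemical phenotype reported above can be checked with a short worked example. The counts are those given in the text for the CIT and Stransky series; the function name and the choice of summary measures (sensitivity, specificity, number of misclassifications) are illustrative assumptions.

```python
def marker_performance(basal_pos, basal_total, nonbasal_pos, nonbasal_total):
    """Summarize how well a marker phenotype identifies basal-like tumors.

    basal_pos / basal_total: basal-like tumors showing the phenotype / analyzed.
    nonbasal_pos / nonbasal_total: same counts for non-basal-like tumors.
    """
    false_negatives = basal_total - basal_pos
    return {
        "sensitivity": basal_pos / basal_total,
        "specificity": (nonbasal_total - nonbasal_pos) / nonbasal_total,
        "misclassified": false_negatives + nonbasal_pos,
        "total": basal_total + nonbasal_total,
    }


# CIT series: 16 of 18 basal-like and 2 of 44 non-basal-like tumors.
cit = marker_performance(16, 18, 2, 44)
# Stransky series: 7 of 9 basal-like and 1 of 23 non-basal-like tumors.
stransky = marker_performance(7, 9, 1, 23)
print(cit)
print(stransky)
```

For the CIT series this gives four misclassified tumors out of 62, in agreement with the figure quoted in the text.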

Identification of the basal-like subgroup in publicly-available transcriptomic data of MIBC. Association of this subgroup with shorter survival. In addition to the CIT transcriptomic dataset, the inventors analyzed publicly-available transcriptomic datasets from six series of MIBC for which survival data were available. They used consensus clustering to identify the basal-like subgroup in each of the six datasets. The percentage of basal-like tumors was 23% (90 of 383 MIBC) when all the datasets were considered together. The significantly higher proportion of bladder carcinomas displaying squamous differentiation in the basal-like subgroup was confirmed in the series (Riester et al, 2012) for which this parameter was available. Taking into account all series, the basal-like tumors were of a significantly higher stage (P=0.004). Overall survival was significantly shorter for patients with basal-like MIBC, with most events occurring within one year of diagnosis (Figure 2A). Similar results were obtained if the patients were stratified according to the two most important prognostic factors: lymph node and metastasis status (Figures 2B-D). Forest plots (Figures 2E and 2F) showed that the basal-like subtype was a prognostic factor independent of sex, node and metastasis status at one year (Figure 2E), and a prognostic factor independent of metastasis status at five years (Figure 2F).

Activation of EGFR in basal-like bladder tumors. The inventors determined from the KEGG and Biocarta databases those pathways most significantly altered in the basal-like subgroup with respect to other MIBC, using all seven available transcriptome datasets. The pathway involving EGFR ("EGF signaling" in Biocarta and "ErbB signaling" in KEGG) was one of the most significantly altered pathways in the various datasets. EGFR, the genes encoding five of its six ligands (AREG, AREGB, EREG, HBEGF, TGFA), a downstream effector of EGFR (MYC), and genes known to be induced by EGFR activation in the urothelium (IL8, SOX9) (Ling et al, 2011; Perrotte et al, 1999) were overexpressed in basal-like tumors compared to non-basal-like MIBC. By contrast, ERBB3, another member of the ERBB family, was found to be underexpressed. The inventors confirmed, by qRT-PCR, that EGFR and the genes encoding several of its ligands were significantly overexpressed in basal-like compared to non-basal-like tumors (Figure 3A). Moreover, the levels of EGFR protein and its phosphorylated form were significantly higher (2.4 times higher and 2 times higher, respectively) in basal-like tumor cells than in non-basal-like MIBC (Figure 3B and 3C). Finally, they investigated the mechanisms underlying the upregulation of EGFR and its ligands, by analyzing mRNA levels as a function of gene copy number. They found a highly significant positive correlation between EGFR mRNA levels and EGFR copy number in basal-like tumors. This relationship was much weaker in non-basal-like tumors. In tumors with normal EGFR copy number, EGFR mRNA levels were significantly higher in basal-like tumors than in non-basal-like tumors (Figure 3D), indicating that copy number was not the sole mechanism of EGFR overexpression in basal-like tumors. No relationship between gene copy number and RNA levels was observed for the EGFR ligands. Taken together, these results indicated that the EGFR pathway is activated specifically in the basal-like subgroup of MIBC.

Identification of bladder tumor cell lines with a basal-like phenotype. EGFR-dependent growth is stimulated by an autocrine ligand-receptor loop

The inventors hypothesized that the activation of the EGFR pathway observed in basal-like MIBC would play a role in tumor cell growth. They searched for basal-like bladder cancer cell lines using a 40-gene predictor (Table 4 below).

Table 4: Human basal-like bladder cancer gene expression predictor. A set of 40 predictive genes was selected to discriminate the basal-like samples from non-basal-like samples using the CIT discovery cohort and the area under curve (AUC) criterion.

Probe.Set.ID   Gene.Symbol  Entrez.Gene  AUC    AUC.sd  Moderate.T-test.p-value  Log2 ratio (basal-like vs non-basal-like MIBC)
41469_at       PI3          5266         0.981  0.011   8.53E-25                 5.08
209126_x_at    KRT6B        3854         0.979  0.013   1.28E-23                 3.67
204971_at      CSTA         1475         0.931  0.031   1.70E-10                 2.90
204751_x_at    DSC2         1824         0.955  0.020   1.02E-14                 2.68
204326_x_at    MT1X         4501         0.932  0.034   2.18E-14                 2.43
219412_at      RAB38        23682        0.926  0.035   5.85E-14                 2.30
209260_at      SFN          2810         0.953  0.023   1.70E-13                 2.23
228531_at      SAMD9        54809        0.926  0.033   4.98E-12                 2.10
201984_s_at    EGFR         1956         0.919  0.031   9.52E-09                 1.82
204490_s_at    CD44         960          0.912  0.031   9.30E-11                 1.82
205227_at      IL1RAP       3556         0.935  0.032   1.53E-15                 1.77
200606_at      DSP          1832         0.963  0.018   3.08E-11                 1.69
205724_at      PKP1         5317         0.955  0.021   7.95E-16                 1.63
206421_s_at    SERPINB7     8710         0.923  0.028   7.86E-08                 1.29
36499_at       CELSR2       1952         0.915  0.030   3.49E-09                 1.17
214793_at      DUSP7        1849         0.927  0.031   6.85E-14                 1.05
222173_s_at    TBC1D2       55357        0.908  0.031   7.23E-09                 0.92
203586_s_at    ARL4D        379          0.909  0.034   2.65E-11                 0.90
219092_s_at    IPPK         64768        0.968  0.023   5.23E-18                 0.88
203036_s_at    MTSS1        9788         0.938  0.029   3.59E-13                 0.82
1569303_s_at   RGS20        8601         0.913  0.036   1.58E-12                 0.63
225958_at      PHC1         1911         0.920  0.030   3.75E-08                 -0.74
218491_s_at    THYN1        29087        0.906  0.033   3.34E-09                 -0.83
242290_at      TACC1        6867         0.907  0.032   3.64E-08                 -1.15
212230_at      PPAP2B       8613         0.925  0.029   1.65E-09                 -1.24
205795_at      NRXN3        9369         0.943  0.024   7.54E-08                 -1.27
220108_at      GNA14        9630         0.907  0.035   6.88E-09                 -1.45
226137_at      ZFHX3        463          0.967  0.019   5.91E-16                 -1.46
40837_at       TLE2         7089         0.969  0.016   9.79E-15                 -1.46
242794_at      MAML3        55534        0.929  0.027   1.09E-11                 -1.51
202609_at      EPS8         2059         0.930  0.027   3.34E-11                 -1.53
210108_at      CACNA1D      776          0.916  0.033   3.44E-08                 -1.70
59697_at       RAB15        376267       0.923  0.033   9.16E-12                 -1.70
214180_at      MAN1C1       57134        0.946  0.022   4.30E-08                 -1.86
203509_at      SORL1        6653         0.961  0.018   6.78E-15                 -2.00
213385_at      CHN2         1124         0.955  0.023   4.07E-12                 -2.06
226625_at      TGFBR3       7049         0.951  0.024   4.12E-16                 -2.20
225915_at      CAB39L       81617        0.926  0.033   9.95E-13                 -2.33
212328_at      LIMCH1       22998        0.914  0.036   3.93E-13                 -2.49
203304_at      BAMBI        25805        0.937  0.029   1.25E-12                 -2.51

This predictor (Table 4), developed on the CIT series of tumors, was validated on the six independent publicly available tumor transcriptome datasets. The inventors applied this predictor to their own transcriptomic dataset of 22 bladder cancer cell lines. Eleven of the 22 bladder cancer cell lines analyzed were classified as basal-like. For the only tumor/tumor-derived cell line (T1207 / L1207) pair in our series (De Boer et al., 1997), the basal-like phenotype was identified in both samples. Affymetrix transcriptome data were publicly available for 15 of the 22 cell lines (Lee et al., 2007, and an unpublished but publicly available dataset (E-MTAB-37) created by Wooster et al. in collaboration with GlaxoSmithKline). This predictor applied to these data gave the same classification for 14 of the 15 cell lines (Table 5).

Table 5: Molecular subtype classification of 22 distinct human bladder cancer cell lines. The last column gives the supervised centroid-based prediction (1: basal-like subtype; 0: non-basal-like subtype).

Cell line ID  Dataset  Chip                         Prediction
5637          CIT      Affymetrix EXON Hs           1
5637          Lee      Affymetrix HG-U133A          1
5637          Wooster  Affymetrix HG-U133plus2.0    1
5637          Wooster  Affymetrix HG-U133plus2.0    1
5637          Wooster  Affymetrix HG-U133plus2.0    1
BFTC905       CIT      Affymetrix EXON Hs           1
BFTC905       Wooster  Affymetrix HG-U133plus2.0    1
BFTC905       Wooster  Affymetrix HG-U133plus2.0    1
BFTC905       Wooster  Affymetrix HG-U133plus2.0    1
CAL29         CIT      Affymetrix EXON Hs           1
J82           CIT      Affymetrix HG-U133plus2.0    0
J82           Lee      Affymetrix HG-U133A          0
J82           Wooster  Affymetrix HG-U133plus2.0    0
J82           Wooster  Affymetrix HG-U133plus2.0    0
J82           Wooster  Affymetrix HG-U133plus2.0    0
JMSU1         CIT      Affymetrix EXON Hs           0
KK47          CIT      Affymetrix EXON Hs           0
KK47          CIT      Affymetrix HG-U133plus2.0    0
KK47          Lee      Affymetrix HG-U133A          0
L1207         CIT      Affymetrix EXON Hs           1
MGHU3         CIT      Affymetrix EXON Hs           1
MGHU3         CIT      Affymetrix HG-U133plus2.0    1
MGHU3         Lee      Affymetrix HG-U133A          1
RT112         CIT      Affymetrix EXON Hs           0
RT4           CIT      Affymetrix EXON Hs           0
RT4           CIT      Affymetrix HG-U133plus2.0    0
RT4           Lee      Affymetrix HG-U133A          0
SCABER        CIT      Affymetrix EXON Hs           1
SCABER        CIT      Affymetrix HG-U133plus2.0    1
SCABER        Lee      Affymetrix HG-U133A          1
SCABER        Wooster  Affymetrix HG-U133plus2.0    1
SCABER        Wooster  Affymetrix HG-U133plus2.0    1
SCABER        Wooster  Affymetrix HG-U133plus2.0    1
SW1710        CIT      Affymetrix EXON Hs           0
SW1710        Lee      Affymetrix HG-U133A          0
T24           CIT      Affymetrix HG-U133plus2.0    0
T24           Lee      Affymetrix HG-U133A          0
TCCSUP        CIT      Affymetrix EXON Hs           0
TCCSUP        CIT      Affymetrix HG-U133plus2.0    0
TCCSUP        Lee      Affymetrix HG-U133A          0
UMUC1         CIT      Affymetrix EXON Hs           0
UMUC1         Lee      Affymetrix HG-U133A          1
UMUC10        CIT      Affymetrix EXON Hs           1
UMUC16        CIT      Affymetrix EXON Hs           1
UMUC5         CIT      Affymetrix EXON Hs           1
UMUC6         CIT      Affymetrix EXON Hs           1
UMUC6         Lee      Affymetrix HG-U133A          1
UMUC9         CIT      Affymetrix EXON Hs           0
UMUC9         Lee      Affymetrix HG-U133A          0
VMCUB1        CIT      Affymetrix EXON Hs           1
VMCUB1        Lee      Affymetrix HG-U133A          1
VMCUB3        CIT      Affymetrix EXON Hs           0
VMCUB3        Lee      Affymetrix HG-U133A          0

The mRNA levels were analyzed for some of the genes typical of the basal-like phenotype (basal cytokeratins; EGFR and its ligands) by qRT-PCR and the levels of EGFR protein and its phosphorylated form by western blotting, in the 22 bladder cancer cell lines (Figure 4A). The results obtained were consistent with the predicted basal-like/non-basal-like classification of the cell lines. The two cell lines with the highest levels of EGFR and phospho-EGFR (L1207 and UMUC5) were basal-like and were the only cell lines displaying EGFR gene amplification (Black et al., 2008; Nicolle et al., 2006). The inventors investigated the role of the EGFR signaling pathway in the basal-like MIBC subtype, by comparing the effect of erlotinib, a small-molecule inhibitor of EGFR, on the growth of basal-like and non-basal-like bladder cancer cell lines. Nine of the eleven basal-like bladder cancer cell lines were sensitive to the inhibitor (GI50 ≤ 1 µM) (Figure 4B).
One of the two remaining cell lines displayed intermediate sensitivity (1 µM < GI50 ≤ 10 µM; 5637), and the other was resistant (GI50 > 10 µM; BFTC-905) (Figure 4B). The lack of sensitivity of these two cell lines may be due to a gain of function of one of the downstream effectors of EGFR signaling (a RAF1 gene amplification in 5637 cells and an NRAS-activating mutation in BFTC-905 cells). By contrast, only one of the 11 non-basal-like cell lines (VMCUB3) was sensitive to erlotinib, the others displaying either intermediate sensitivity or resistance (Figure 4B). This difference in sensitivity to erlotinib between basal-like and non-basal-like bladder cancer cell lines was significant (P = 0.002, two-tailed Mann-Whitney test) (Figure 4C). Similar results were obtained with an antibody targeting EGFR (cetuximab) (Figure 5A and 5B). The inventors also showed that both erlotinib (Figure 4D) and cetuximab (Figure 5C) treatments inhibited both EGFR phosphorylation and the activation of the downstream signaling molecules AKT and ERK1/2, specifically in basal-like bladder cancer cell lines. The overexpression of EGFR ligands may contribute to EGFR activation and cell growth in basal-like bladder cancer cells. Amphiregulin was one of the most strongly overexpressed ligands in basal-like MIBC and bladder cancer cell lines. They therefore investigated the potential role of this molecule as an autocrine factor. As anticipated, an antibody against amphiregulin decreased the growth of basal-like bladder cancer cell lines but not of non-basal-like cell lines (Figure 5D). They also found that EGFR ligands were underexpressed following anti-EGFR treatment (Figure 5E). These results demonstrate the existence of an autocrine mitogenic loop between EGFR and EGFR ligands in basal-like bladder cancer cell lines.
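The three erlotinib sensitivity categories used above (sensitive, intermediate, resistant) can be expressed as a simple rule on the GI50 value. The following is a minimal illustrative sketch only; the function name and the example GI50 values are hypothetical and are not measurements from the present study.

```python
def classify_gi50(gi50_um: float) -> str:
    """Assign a sensitivity category from a GI50 expressed in µM.

    Thresholds follow the text: sensitive (GI50 <= 1 µM),
    intermediate (1 µM < GI50 <= 10 µM), resistant (GI50 > 10 µM).
    """
    if gi50_um <= 1.0:
        return "sensitive"
    if gi50_um <= 10.0:
        return "intermediate"
    return "resistant"


# Hypothetical GI50 values, for illustration only.
for value in (0.3, 4.0, 25.0):
    print(value, classify_gi50(value))
```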
The effect of erlotinib was further investigated in vivo (Figure 4E) on subcutaneous xenografts of four basal-like bladder cancer cell lines (L1207, VMCUB1, UMUC6, BFTC-905) and two non-basal-like bladder cancer cell lines (JMSU1, KK47). Erlotinib significantly decreased tumor growth for three of the four basal-like bladder cancer cell lines, the greatest inhibition being observed with the L1207 cell line, which had an EGFR gene amplification. The only resistant basal-like cell line (BFTC-905) had an NRAS-activating mutation. The two non-basal-like tumors were insensitive to the inhibitor.

The BBN-induced mouse model of muscle-invasive bladder cancer has a basal-like phenotype and is sensitive to an EGFR inhibitor, erlotinib

In mice, BBN (N-butyl-N-(4-hydroxybutyl)nitrosamine) induces muscle-invasive bladder tumors that typically have mixed histological features, displaying both urothelial and squamous cell differentiation (Becci et al., 1978; Tamano et al., 1991), as in the human basal-like subgroup. Additionally, the inventors previously showed that the growth of mouse cell lines derived from bladder tumors induced by chemical treatment with BBN is dependent on an autocrine loop involving EGFR and secreted growth factors (el-Marjou et al., 2000). They therefore hypothesized that BBN-induced mouse tumors could be used as a model of human basal-like tumors. To test this hypothesis they performed a cross-species comparison of gene-expression profiles by predicting basal-like group membership for the BBN-induced mouse bladder tumors using 37 orthologous genes (KRT6B, DSC2, CSTA, RAB38, SFN, IL1RAP, PKP1, EGFR, CD44, SERPINB7, DSP, DUSP7, CELSR2, IPPK, ARL4D, TBC1D2, MTSS1, RGS20, PHC1, THYN1, NRXN3, TACC1, PPAP2B, GNA14, MAN1C1, TLE2, CACNA1D, MAML3, ZFHX3, EPS8, RAB15, SORL1, CHN2, BAMBI, TGFBR3, LIMCH1, CAB39L), within the above established human 40-gene predictor. BBN-induced mouse bladder tumors closely resembled human bladder basal-like tumors, as nine out of the 11 mouse tumors showed a high correlation coefficient with the human basal-like centroid. Consistent with these results, analyses of the level of expression of genes typically associated with the basal-like phenotype showed the same profile of deregulation in mouse and human basal-like tumors, with respect to normal urothelium (mouse or human) (Figure 6A). Thus, the BBN mouse model of bladder cancer is a suitable model for the basal-like subtype of bladder tumors. The inventors investigated the effect of erlotinib on bladder tumor-free survival and overall survival in this mouse model.
Erlotinib treatment, initiated two weeks after the end of BBN treatment and administered six days per week, significantly delayed the detection of tumors assessed by echography (Figure 6B) and increased survival in mice (Figure 6C).

Conclusion

The basal-like subgroup in MIBC was first identified from a non-supervised analysis of tumor transcriptomic data. The inventors then developed, from their CIT series, a 40-gene predictor, which they validated in six independent transcriptomic datasets, with a sensitivity of 97% and a specificity of 88%. This 40-gene predictor can therefore be used to identify basal-like MIBC. Immunohistochemistry represents a simpler and more accessible assay in clinical practice. They showed that the expression of cytokeratins 5 and 6 and the lack of expression of FOXA1 can be used to identify basal-like MIBC in the CIT series (Figure 1, Figure 7), with a specificity of 89% and a sensitivity of 89%. Subclassification of bladder cancer on the basis of single markers (keratin 14, ΔNp63) has already been proposed (Karni-Schmidt et al., 2011; Volkmer et al., 2012). However, these single markers, which are basal cell markers, do not accurately identify the basal-like group (for example, the use of CK14 would exclude eight of the 21 tumors, whereas the use of ΔNp63 would lead to the inclusion of 26/62 non-basal-like tumors that, importantly, lack the overexpression of EGFR and its ligands; Figure 7).

Example 2 : Sensitivity of basal-like MIBC to capecitabine

Materials and methods

Patients and tissue samples

A series of 85 muscle-invasive bladder carcinomas (31 pT2, 35 pT3 and 19 pT4) was collected from patients surgically treated between 1988 and 2006 at Henri Mondor Hospital, Institut Gustave Roussy (Villejuif, France), and Foch Hospital (Suresnes, France). Tumors were staged according to the 1997 TNM classification (Sobin and Fleming, 1997) and graded according to the 1973 WHO classification (Mostofi, 1973). The clinical annotations for the patients and the pathologic features for each tumor were also recorded. Normal urothelial samples were obtained during organ procurement from cadaveric donors for transplantation, as previously described (Diez de Medina, 1998). All patients provided written informed consent and the study was approved by the ethics committees of the different hospitals.

RNA, DNA and protein extraction from human tissues and bladder cancer cell line xenografts

Immediately after surgery, the tissue samples were frozen in liquid nitrogen and stored at -80°C until nucleic acid and protein extraction. RNA, DNA and proteins were extracted from frozen bladder samples and from bladder cancer cell lines grown to 70% confluency, by cesium chloride density centrifugation (Chirgwin et al., 1979; Coombs et al., 1990). The concentration, integrity and purity of each RNA sample were determined with the RNA 6000 LabChip Kit (Agilent Technologies) and an Agilent 2100 Bioanalyzer. DNA purity was also assessed by determining the ratio of absorbances at 260 and 280 nm. DNA concentration was determined with a Hoechst dye-based fluorescence assay (Labarca et al., 1980). Proteins were resuspended in 1X Laemmli buffer containing anti-protease and anti-phosphatase (Roche) and concentration was determined by a BCA Protein Assay-Reducing Agent Compatible kit (Pierce).

Bladder cancer cell lines and reagents

Six human bladder cancer cell lines were used. Some of them were purchased from DSMZ (VMCUB1, JMSU1, BFTC-905) or ECACC (UMUC6). KK47 cells were obtained from the laboratory of Jennifer Southgate (Cancer Research Unit, St James' Hospital, Leeds, UK). Cell line identity was confirmed by analyzing specific gene mutations previously described in each cell line. The L1207 cell line was derived from tumor T1207, as previously described (De Boer et al., 1997). All cell lines were maintained in Dulbecco's modified Eagle's medium (DMEM) supplemented with 10% fetal bovine serum, 100 U/mL penicillin and 100 U/mL streptomycin. Capecitabine (LC Laboratories) was provided as a powder, suspended in vehicle (40 mM citrate buffer containing 0.5% carboxymethylcellulose, pH 6.0) and administered by oral gavage. Cisplatin was from Mylan Laboratory and was administered by intraperitoneal injection.

Human bladder cancer cell line xenografts

Six-week-old female Swiss nu/nu mice (Charles River Laboratories) were raised in the animal facilities of the Curie Institute, in specified pathogen-free conditions. Their care and housing were in accordance with the institutional guidelines of the French National Ethics Committee (Ministère de l'Agriculture et de la Forêt, Direction de la Santé et de la Protection Animale, Paris, France), with supervision by authorized investigators. Mice received a subcutaneous injection, into each flank (dorsal region), of 2 × 10^6 UMUC6 or 6 × 10^6 KK47, JMSU1, BFTC905 and L1207 bladder cancer cells in 100 µl PBS. For each injected cell line, when tumors reached a volume of 100 mm³ (+/- 20), mice were randomly separated into two groups of six mice. Mice were treated by oral gavage with vehicle control (40 mM citrate buffer containing 0.5% carboxymethylcellulose, pH 6.0) or capecitabine (400 mg/kg) for 7 consecutive days followed by a one-week rest. Tumor size was measured twice a week with calipers and the volume in mm³ was calculated using the formula: π/6 × (larger diameter) × (shorter diameter)².
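The tumor volume formula given above can be illustrated by the following minimal sketch. The function name and the caliper readings in the example are hypothetical; ordering the two diameters automatically is a convenience assumed here and is not described in the protocol.

```python
import math


def tumor_volume_mm3(diameter_a_mm: float, diameter_b_mm: float) -> float:
    """Volume in mm^3 as pi/6 x (larger diameter) x (shorter diameter)^2."""
    larger = max(diameter_a_mm, diameter_b_mm)
    shorter = min(diameter_a_mm, diameter_b_mm)
    return math.pi / 6.0 * larger * shorter ** 2


# Hypothetical caliper readings (mm), for illustration only.
print(round(tumor_volume_mm3(8.0, 5.0), 1))
```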

Quantitative RT-PCR

One microgram of total RNA was reverse transcribed using the High Capacity cDNA Reverse Transcription kit (Applied Biosystems). mRNA levels were quantified using pre-designed (for R18S) or custom gene expression (for TYMP and DPYD) assays containing gene-specific primers and a dye-labeled hydrolysis probe (TaqMan probes from Applied Biosystems or UPL (Universal ProbeLibrary) probes from Roche). For custom assays, primers and probes were designed with ProbeFinder software via the Universal ProbeLibrary Assay Design Center (Roche). The following primers and probes were used: TYMP-sense 5'-GCACACAGGAGGCACCTT-3' (SEQ ID NO: 17); TYMP-antisense 5'-CCTGGTCCAGCAGCACTT-3' (SEQ ID NO: 18); TYMP-probe 5'-CTCCAGCT-3' (SEQ ID NO: 19); DPYD-sense 5'-CAAGAGCTGCAAAGGAAGGT-3' (SEQ ID NO: 20); DPYD-antisense 5'-CCCATCAGACCTGAGACAGTG-3' (SEQ ID NO: 21); DPYD-probe 5'-AGCTGGAG-3' (SEQ ID NO: 22). qRT-PCR reactions were carried out with the LightCycler® 480 Instrument (Roche) in a 20 µl reaction mixture containing 10 ng of reverse-transcribed RNA, 1X LightCycler® 480 Probes Master, 25 µM each of the forward and reverse primers and 10 µM of the UPL probe (or 1X of the pre-designed assay). All expression assays were run using the same thermal cycling conditions, including an initial step at 95°C (10 min), followed by 40 cycles at 95°C (10 s), 60°C (30 s) and 72°C (10 s). For each gene of interest, the amount of mRNA was normalized to that of the ribosomal 18S (R18S) reference gene using the 2^-ΔCt method.
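By way of illustration, normalization to the R18S reference gene and the resulting TYMP/DPYD mRNA ratio may be computed as sketched below. This sketch assumes the standard 2^-ΔCt form with ΔCt = Ct(gene of interest) - Ct(reference) and an amplification efficiency of 100%; the function names and the Ct values are hypothetical.

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Relative mRNA amount, 2^-(Ct_target - Ct_reference)."""
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


def tymp_dpyd_ratio(ct_tymp: float, ct_dpyd: float, ct_r18s: float) -> float:
    """Ratio of the R18S-normalized TYMP and DPYD mRNA amounts."""
    return relative_expression(ct_tymp, ct_r18s) / relative_expression(ct_dpyd, ct_r18s)


# Hypothetical Ct values, for illustration only.
print(tymp_dpyd_ratio(ct_tymp=24.0, ct_dpyd=27.0, ct_r18s=15.0))
```

Because both genes are normalized to the same reference, the reference Ct cancels out in the ratio, which then depends only on the difference between the DPYD and TYMP Ct values.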

Reverse Phase Protein Array

Samples were deposited onto nitrocellulose-covered slides (Schott Nexterion NC-C, Jena, Germany) using a dedicated arrayer (2470 Arrayer, Aushon Biosystems, Billerica, MA, USA). Four serial dilutions, ranging from 1.5 to 0.094 mg/ml, and four technical replicates per dilution were deposited for each sample. Arrays were revealed with an anti-TYMP antibody (HPA001072 from Sigma) using an Autostainer Plus robot (Dako, Trappes, France). Briefly, slides were incubated with avidin, biotin and peroxidase blocking reagents (Dako) before saturation with TBS containing 0.1% Tween-20 and 5% BSA (TBST-BSA). Slides were then probed overnight at 4°C with primary antibodies diluted in TBST-BSA. After washes with TBST, arrays were probed with horseradish peroxidase-coupled secondary antibodies (Jackson ImmunoResearch Laboratories, Newmarket, UK) diluted in TBST-BSA for 1 h at room temperature. To amplify the signal, slides were incubated with Bio-Rad Amplification Reagent for 15 min at room temperature. The arrays were washed with TBST, probed with Cy5-Streptavidin (Jackson ImmunoResearch Laboratories) diluted in TBST-BSA for 1 h at room temperature and washed again in TBST. For staining of total protein, arrays were incubated for 15 min in 7% acetic acid and 10% methanol, rinsed twice in water, incubated for 10 min in Sypro Ruby (Invitrogen) and rinsed again. The processed slides were dried by centrifugation and scanned using a GenePix 4000B microarray scanner (Molecular Devices, Sunnyvale, USA). Spot intensity was determined with MicroVigene software (VigeneTech Inc). The specificity of the primary antibody used in this study was first validated by Western blotting on several cell and tumour lysates. For quantification, the optimal dilution was determined using the following two criteria: signal above the negative control (sample with secondary antibody only) and without saturation. Then, for each sample, the negative control values were subtracted from the signal intensities at the optimal dilution, and the result was normalized to the amount of total protein. Final results for each sample were expressed as an average of the four technical replicates.
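The quantification steps described above (negative control subtraction, normalization to total protein, averaging over the four technical replicates) may be sketched as follows. The order of operations, with each replicate normalized before averaging, is an assumption of this sketch, and the function name and intensity values are hypothetical.

```python
from statistics import mean


def rppa_value(signals, negative_controls, total_protein):
    """Average, over technical replicates at the optimal dilution, of
    (signal - negative control) / total protein."""
    return mean(
        (signal - background) / protein
        for signal, background, protein in zip(signals, negative_controls, total_protein)
    )


# Hypothetical spot intensities for four technical replicates.
print(rppa_value(
    signals=[1200.0, 1100.0, 1300.0, 1200.0],
    negative_controls=[200.0, 100.0, 300.0, 200.0],
    total_protein=[2.0, 2.0, 2.0, 2.0],
))
```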

Results

Capecitabine is an orally administered anticancer pro-drug of 5-fluorouracil (5-FU) that is selectively tumor-activated. Capecitabine is metabolized in the liver to the intermediate 5'-deoxy-5-fluorouridine, which is subsequently converted into cytotoxic 5-FU by thymidine phosphorylase (TYMP), an enzyme overexpressed in many tumors. Intracellular 5-FU is then catabolized to its inactive metabolites via dihydropyrimidine dehydrogenase (DPYD).

The ratio TYMP/DPYD mRNA is increased in basal-like MIBC and xenograft models of basal-like bladder tumors are sensitive to capecitabine treatment

Using microarray analysis we identified a significant increase in mRNA levels of TYMP in basal-like MIBC compared to non-basal-like tumors. We then confirmed the strong overexpression of TYMP both at the transcriptional and protein level in basal-like compared to non-basal-like MIBC and normal urothelium samples (Figure 8A and 8B). In these samples we also found a highly significant positive correlation between TYMP mRNA and protein expression (Figure 8C). As the ratio of TYMP/DPYD activity was found to be correlated with capecitabine susceptibility in various human cancer xenograft models (Ishikawa et al., 1998), we also investigated the expression level of DPYD by qRT-PCR in MIBC and normal urothelium samples. Although the TYMP/DPYD mRNA ratio was significantly elevated in both basal-like (13-fold) and non-basal-like MIBC (7-fold) compared to normal urothelium samples, the increase was much greater in basal-like tumors. These results led us to hypothesize that capecitabine may be particularly effective in the treatment of basal-like tumors. To verify this hypothesis, the effect of capecitabine was further investigated in vivo on subcutaneous xenografts of four basal-like bladder cancer cell lines (L1207, VMCUB1, UMUC6 and BFTC-905) and two non-basal-like bladder cancer cell lines (JMSU1 and RT112). In these models we showed that capecitabine significantly reduced tumor growth for the four basal-like bladder cancer cell lines (ranging from 43 to 65% of growth inhibition) (Figure 9). In contrast, xenografts derived from non-basal-like bladder cancer cell lines were unresponsive to capecitabine treatment (Figure 9).
The inventors further showed that the combination of capecitabine and cisplatin provided a synergistic effect and greatly reduced tumor growth of the basal-like cell line BFTC-905, which has an NRAS-activating mutation and is resistant to treatment with an EGFR kinase inhibitor (cf. above) (Figure 13).

Example 3 : alternative splicing signature to identify basal-like MIBC

Materials and methods

Patients and tissue samples

The set of 191 bladder carcinomas was obtained from tumour tissue banks at Henri Mondor Hospital, Institut Gustave Roussy (Villejuif, France), and Foch Hospital (Suresnes, France). These cancers were selected randomly from a consecutive set, to cover the different stages of bladder cancer as follows: 51 Ta, 37 T1, 27 T2, 46 T3, and 30 T4 tumours. Seven normal urothelial samples were also used for the analysis. They were obtained from fresh urothelial cells scraped from the normal bladder wall and dissected from the lamina propria during organ procurement from cadaveric donors for transplantation. All patients provided written informed consent and the study was approved by the ethics committees of the different hospitals.

Affymetrix Array hybridization

1.5 micrograms of total RNA from the 198 samples was processed and labelled using the Affymetrix GeneChip Whole Transcript Sense Target Labelling Assay as outlined in the manufacturer's instructions. Hybridization to Affymetrix Human Exon 1.0 ST arrays was carried out independently at the Institut Curie Affymetrix platform. Affymetrix Expression Console Software was used to perform quality assessment.

Data pre-processing and filtering

Signal estimates were derived from the CEL files of the microarrays. Partek® Genomics Suite™ was used to quantile normalize the probe fluorescence intensities and to summarize the probe set (representing exon expression) intensities using the RMA algorithm (Irizarry et al., 2003) and BrainArray custom CDF annotation files (Dai et al., 2005). Only a subset of well-annotated core probe sets from the Exon Array was used for the study. Cross-hybridizing probe sets (targeting non-unique sequences of the transcriptome) and probe sets having extremely low standard deviation (due to poor hybridization or an exon missing in all samples) were discarded to remove noisy data that might contribute to false positives in the alternative splicing analysis.

Statistical analysis of alternative splicing changes

The ARH method (Rasche et al., 2010) was used to identify exons differentially included in transcripts between tumours and normal samples. ARH scores by gene and ARH splicing deviations by exon were computed for each tumour using median values over normal samples as reference. ARH splicing deviations give a measure of inclusion level differences of each exon in the corresponding gene for each tumour compared to normal samples. The values are centered on 0: negative values suggest a loss of the exon whereas positive values suggest a gain of the exon in the tumour relative to normal samples. Fisher's exact test was used to identify relationships between each splicing change and molecular or clinical data. The computations were performed using the R platform (R Development Core Team, 2011).

Nearest shrunken centroid ASE classifier

The Nearest Shrunken Centroid method developed by Tibshirani (Tibshirani et al., 2002; Hastie et al., 2011) was used to build a classifier identifying basal tumours based on alternative splicing changes. The classifier was trained using the basal/non-basal class prediction based on the established transcriptomic signature, and the ARH splicing deviation data obtained for each tumour and each exon were used to train and classify the samples. Briefly, the method uses a training set to compute a standardized centroid for each class, which is derived from the average splicing deviation for each exon in each class of tumours divided by the within-class standard deviation for that exon. The classifier takes the exon splicing deviation profile of a new sample, and compares it to each of these class centroids (see class centroids on figure 10). The class whose centroid is closest to the new sample, in squared distance, is the predicted class for that new sample.
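The principle of the classifier described above may be illustrated by the following simplified sketch, which is not the implementation used by the inventors. It standardizes class centroids by a pooled within-class standard deviation, shrinks them towards the overall centroid by soft thresholding, and assigns a new sample to the class with the nearest centroid in standardized squared distance. The scaling factor m_k, the offset s0, the omission of class priors, the function names and the toy splicing deviation values are all assumptions of this sketch.

```python
import math


def fit_nsc(X, y, delta=0.0):
    """Fit shrunken class centroids. X: list of samples (lists of features); y: labels."""
    classes = sorted(set(y))
    n, p = len(X), len(X[0])
    counts = {k: y.count(k) for k in classes}
    overall = [sum(row[j] for row in X) / n for j in range(p)]
    means = {k: [0.0] * p for k in classes}
    for row, label in zip(X, y):
        for j in range(p):
            means[label][j] += row[j] / counts[label]
    # Pooled within-class standard deviation for each feature.
    s = []
    for j in range(p):
        ss = sum((row[j] - means[label][j]) ** 2 for row, label in zip(X, y))
        s.append(math.sqrt(ss / (n - len(classes))))
    s0 = sorted(s)[p // 2]  # offset (assumed here) guarding against very small s
    scale = [sj + s0 for sj in s]
    centroids = {}
    for k in classes:
        m_k = math.sqrt(1.0 / counts[k] - 1.0 / n)  # scaling factor assumed in this sketch
        shrunk = []
        for j in range(p):
            d = (means[k][j] - overall[j]) / (m_k * scale[j])
            d = math.copysign(max(abs(d) - delta, 0.0), d)  # soft thresholding
            shrunk.append(overall[j] + m_k * scale[j] * d)
        centroids[k] = shrunk
    return centroids, scale


def predict_nsc(model, x):
    """Return the class whose shrunken centroid is nearest in standardized squared distance."""
    centroids, scale = model

    def squared_distance(centroid):
        return sum(((xi - ci) / sc) ** 2 for xi, ci, sc in zip(x, centroid, scale))

    return min(centroids, key=lambda k: squared_distance(centroids[k]))


# Toy splicing deviations for two exons (hypothetical values).
X = [[2.0, 0.1], [2.2, -0.1], [1.8, 0.0], [-2.0, 0.0], [-1.9, 0.1], [-2.1, -0.1]]
y = ["basal", "basal", "basal", "non_basal", "non_basal", "non_basal"]
model = fit_nsc(X, y, delta=0.5)
print(predict_nsc(model, [1.5, 0.0]))
```

In this toy example the second exon carries no class information, so its contribution is shrunk to the overall centroid and only the first exon drives the prediction; increasing delta removes more features in the same way, which is how a small set of markers can be selected.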

Results

Identification of tumour-associated splicing events

Affymetrix Human Exon arrays were used to analyse splicing changes in 191 tumour samples and 7 samples of normal urothelium. ARH scores and splicing deviations were used to assess the significance of every putative splicing event in tumours compared to normal samples. A total of 26,038 alternative splicing changes were found to occur in at least 10% of tumours, considering p-values < 0.05 and fold-changes > 2. For each of those events, Fisher's exact tests were used to identify the relationships between splicing changes and molecular and clinical features of the disease, including the basal-like phenotype. Only splicing changes linked to basal tumours with the smallest significant p-value were selected to identify a basal-like-specific alternative splicing signature (1,309 events).

Alternative splicing signature of basal tumours Using nearest shrunken centroids method with ARH splicing deviations data, the inventors trained a classifier based on a subset of 103 tumours: only muscle-invasive tumours were selected for the training phase, so that the classification would not depend on the invasiveness of tumours. According to the training phase results, the inventors chose to select 19 alternative splicing events that were sufficient to distinguish basal-like and non basal-like tumours with only 2 misclassification errors, which corresponds to the best ratio [error rate / number of markers] achieved. The same efficiency was observed when applying the classifier on the whole dataset of 191 tumours. A simple hierarchical clustering based on ARH splicing deviations for the 19 markers performs as well for separating basal-like and non basal-like tumours (data not shown). The 19 alternative splicing changes selected therefore define an alternative splicing signature for the basal-like tumours which can be used by itself to characterize this aggressive subgroup of tumours with a sensitivity of 96% and a specificity of 99%. The genes and exons (ranked according to the longest coding transcript they belong to) included in the signature are listed in Table 6 below. Table 6: Signature of alternative splicing changes specific to basal-like tumours. 9 exons are down-regulated and 10 exons are up-regulated in basal-like tumours compared to non-basal-like tumours and normal samples. The ENSE and ENST identifiers refer to Ensembl database.

Ensembl exon ID                  HGNC Symbol  Ensembl transcript ID  Exon rank in transcript  Regulation Basal vs Non Basal
ENSE00001375695 (SEQ ID NO:23)   TTLL8        ENST00000433387        7                        up
ENSE00000654397 (SEQ ID NO:24)   TGM1         ENST00000206765        9                        down
ENSE00000670133 (SEQ ID NO:25)   CACNA1F      ENST00000376265        10                       up
ENSE00001165962 (SEQ ID NO:26)   HEPHL1       ENST00000315765        15                       down
ENSE00000756250 (SEQ ID NO:27)   KIF17        ENST00000247986        9                        up
ENSE00001229914 (SEQ ID NO:28)   SRRM3        ENST00000388802        10                       up
ENSE00001456383 (SEQ ID NO:29)   PDZK1IP1     ENST00000371885        2                        down
ENSE00001627024 (SEQ ID NO:30)   ATP2B2       ENST00000352432        6                        up
ENSE00001273363 (SEQ ID NO:31)   GRIN1        ENST00000371546        14                       up
ENSE00001784702 (SEQ ID NO:32)   PGLYRP4      ENST00000359650        5                        up
ENSE00000712909 (SEQ ID NO:33)   KLC3         ENST00000470402        7                        down
ENSE00001428027 (SEQ ID NO:34)   MBOAT1       ENST00000324607        2                        up
ENSE00001273220 (SEQ ID NO:35)   ADA          ENST00000372874        8                        down
ENSE00001475900 (SEQ ID NO:36)   MTSS1        ENST00000378017        11                       down
ENSE00001642512 (SEQ ID NO:37)   HDAC9        ENST00000456174        1                        up
ENSE00001643818 (SEQ ID NO:38)   SERPINB2     ENST00000413956        6                        down
ENSE00000968908 (SEQ ID NO:39)   TM4SF19      ENST00000273695        4                        up
ENSE00001757469 (SEQ ID NO:40)   TLE2         ENST00000262953        11                       down
ENSE00001032209 (SEQ ID NO:41)   CACNA1D      ENST00000288139        38                       down

Basal-like specific mRNA isoform of TGM1 gene

A single alternative splicing event identified in the TGM1 gene was enough to correctly classify 70% of basal-like tumours based on numeric data from the exon arrays 3.17. Experimental validation of the TGM1 alternative splicing pattern enabled the characterization of two unknown transcript structures involving mutually exclusive exons (Figure 12). The alternative isoform including exon 9 appears to be ubiquitously expressed in basal-like, non-basal-like and normal tissue. However, the isoform lacking exon 9 is highly specific to basal-like tumours, since the other samples only show a barely detectable signal. Thus, in situ hybridization targeting the specific junction between exon 8 and exon 10 provides a good marker to detect basal-like tumours.

Example 4 : methylation signature to identify basal-like MIBC

Material and methods

Patients and Tissue Samples

A set of 68 bladder carcinomas was obtained from tumour tissue banks at Henri Mondor Hospital, Institut Gustave Roussy (Villejuif, France), and Foch Hospital (Suresnes, France). These cancers were selected randomly from a consecutive set, to cover the different stages of bladder cancer as follows: 20 Ta, 9 T1, 11 T2, 17 T3, and 11 T4 tumours. Four normal urothelial samples and 3 normal bladder muscle samples were also used for the analysis. Normal urothelium samples were obtained from fresh urothelial cells scraped from the normal bladder wall and dissected from the lamina propria during organ procurement from cadaveric donors for transplantation. All patients provided written informed consent and the study was approved by the ethics committees of the different hospitals.

Illumina Infinium methylation assay

4 µl of bisulfite-converted DNA was used for hybridization on the Infinium HumanMethylation 450 BeadChip, following the Illumina Infinium HD Methylation protocol (Bibikova et al., 2011). Data from the array were then processed using GenomeStudio (Illumina, Inc.) in order to assess methylation levels through the computation of β-values for each targeted CpG site in each sample. We used annotation data provided by Illumina to group CpG sites according to their relation to identified CpG islands. For each annotated CpG island, we computed a unique mean β-value over all related CpG sites for each sample.
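The computation of a single mean β-value per CpG island for one sample may be sketched as follows. The data structures, the function name, and the CpG and island identifiers are hypothetical and do not reproduce the Illumina annotation format; CpG sites without an annotated island are simply left out in this sketch.

```python
from collections import defaultdict
from statistics import mean


def island_mean_beta(beta_by_cpg, island_of_cpg):
    """Mean beta-value per CpG island for one sample.

    beta_by_cpg: mapping CpG site ID -> beta-value (between 0 and 1).
    island_of_cpg: mapping CpG site ID -> CpG island ID.
    """
    grouped = defaultdict(list)
    for cpg, beta in beta_by_cpg.items():
        island = island_of_cpg.get(cpg)
        if island is not None:
            grouped[island].append(beta)
    return {island: mean(values) for island, values in grouped.items()}


# Hypothetical beta-values and annotation, for illustration only.
betas = {"cg01": 0.2, "cg02": 0.4, "cg03": 0.9, "cg04": 0.5}
annotation = {"cg01": "islandA", "cg02": "islandA", "cg03": "islandB"}
print(island_mean_beta(betas, annotation))
```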

Selection of CpG sites and CpG islands with differential methylation pattern in basal-like tumours
In order to define a signature of basal-like tumours based on their methylation profile, we pre-selected a subset of CpG sites and CpG islands according to the fold-change of median β-values in basal-like tumours compared to median β-values in normal urothelium samples (|fold-change| > 0.2) and in muscle samples (|fold-change| > 0.2). The computations were performed using the R platform (R Core Team, 2012).
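The pre-selection step can be sketched as below. The text calls the statistic a fold-change; because β-values lie between 0 and 1 and the cutoff is 0.2 on an absolute value, this sketch assumes it is the difference between medians. That reading, the function name `preselect` and the input layout are assumptions, and the numbers in the usage example are invented.

```python
from statistics import median


def preselect(beta_basal, beta_urothelium, beta_muscle, cutoff=0.2):
    """Keep features whose median beta in basal-like tumours differs from the
    median in normal urothelium AND from the median in normal muscle by more
    than `cutoff` (assumed here to be a difference of medians).

    Each argument maps a feature (CpG site or island) to a list of beta-values.
    Returns {feature: (delta_vs_urothelium, delta_vs_muscle)}.
    """
    selected = {}
    for feature, basal_values in beta_basal.items():
        basal_median = median(basal_values)
        delta_uro = basal_median - median(beta_urothelium[feature])
        delta_mus = basal_median - median(beta_muscle[feature])
        if abs(delta_uro) > cutoff and abs(delta_mus) > cutoff:
            selected[feature] = (delta_uro, delta_mus)
    return selected


# Invented beta-values for two hypothetical features.
selected = preselect(
    beta_basal={"site_a": [0.8, 0.9, 0.7], "site_b": [0.5, 0.5, 0.5]},
    beta_urothelium={"site_a": [0.3, 0.4], "site_b": [0.45, 0.5]},
    beta_muscle={"site_a": [0.2, 0.3], "site_b": [0.1, 0.1]},
)
```

A positive difference marks a candidate hypermethylated feature and a negative one a candidate hypomethylated feature.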

Nearest shrunken centroid ASE classifier
The Nearest Shrunken Centroid method developed by Tibshirani (Tibshirani et al., 2002; Hastie et al., 2011) was used to build a classifier identifying basal-like tumours based on their DNA methylation profile. The classifier was trained using the basal-like/non basal-like class predictions from the established transcriptomic signature together with the β-values obtained for each tumour and each CpG site (or each CpG island). Briefly, the method uses a training set to compute a standardized centroid for each class, which is derived from the average β-values of CpG sites (or CpG islands) in each class of tumours divided by the within-class standard deviation. The classifier takes the profile of β-values of a new sample and compares it to each of these class centroids (see class centroids on figures 1 and 2). The class whose centroid is closest to the sample, in squared distance, is the predicted class for that new sample.
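The classifier logic can be illustrated with a simplified pure-Python re-implementation. It is a sketch of the general nearest shrunken centroid idea and not the pamr package used in the study. The function names `fit_nsc` and `predict_nsc`, the shrinkage parameter `delta`, the choice of the median pooled standard deviation as offset `s0`, and the training values are all assumptions of this example.

```python
import math
from collections import defaultdict
from statistics import median


def fit_nsc(X, y, delta=0.0):
    """Fit shrunken class centroids.

    X: list of samples, each a list of beta-values (one per CpG site/island)
    y: class label of each sample
    delta: soft-threshold amount; 0 gives plain standardized centroids
    """
    n, p = len(X), len(X[0])
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    overall = [sum(row[i] for row in X) / n for i in range(p)]
    centroids = {k: [sum(r[i] for r in rows) / len(rows) for i in range(p)]
                 for k, rows in by_class.items()}
    # Pooled within-class standard deviation of each feature.
    pooled_sd = []
    for i in range(p):
        ss = sum((r[i] - centroids[k][i]) ** 2
                 for k, rows in by_class.items() for r in rows)
        pooled_sd.append(math.sqrt(ss / (n - len(by_class))))
    s0 = median(pooled_sd)  # offset guarding against tiny variances (assumed choice)
    scale = [s + s0 for s in pooled_sd]  # assumed non-zero for every feature
    shrunken = {}
    for k, rows in by_class.items():
        m_k = math.sqrt(1 / len(rows) - 1 / n)
        shrunken[k] = []
        for i in range(p):
            d = (centroids[k][i] - overall[i]) / (m_k * scale[i])
            d = math.copysign(max(abs(d) - delta, 0.0), d)  # soft thresholding
            shrunken[k].append(overall[i] + m_k * scale[i] * d)
    priors = {k: len(rows) / n for k, rows in by_class.items()}
    return {"centroids": shrunken, "scale": scale, "priors": priors}


def predict_nsc(model, x):
    """Return the class whose shrunken centroid is closest in standardized
    squared distance, corrected by the class prior."""
    def score(k):
        dist = sum(((xi - ci) / sc) ** 2
                   for xi, ci, sc in zip(x, model["centroids"][k], model["scale"]))
        return dist - 2 * math.log(model["priors"][k])
    return min(model["centroids"], key=score)


# Invented training profiles: two features, two classes.
X = [[0.80, 0.20], [0.90, 0.30], [0.20, 0.25], [0.30, 0.20], [0.10, 0.30]]
y = ["basal", "basal", "non-basal", "non-basal", "non-basal"]
model = fit_nsc(X, y, delta=1.0)
prediction = predict_nsc(model, [0.85, 0.25])
```

Features whose standardized difference is shrunk to zero in every class no longer influence the prediction, which is how the method selects a compact signature.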

Thresholds computations
For each CpG site (or island) included in the proposed signatures, we used in silico β-values to compute preliminary thresholds that could be used in clinical practice. The thresholds were computed as follows:

- For each CpG site (or island) identified as hypermethylated in basal-like tumours, we define a lower threshold as the maximum among normal β-values and the mean β-value over non basal-like tumours, to which we add 1.5 times the standard deviation of the β-value distribution considering normal samples and non basal-like tumours:

$$\mathrm{Max}\left(\beta_{\mathrm{Norm}},\ \mathrm{mean}(\beta_{\mathrm{NB}})\right) + 1.5 \times \mathrm{SD}\left(\beta_{\mathrm{Norm} \cup \mathrm{NB}}\right)$$

where Norm = {normal urothelium, normal muscle} and NB = {non basal-like tumours}. Samples whose measured CpG site (or island) methylation level is above this threshold will be considered to possess the basal-like profile for the considered CpG site (or island).

- For each CpG site (or island) identified as hypomethylated in basal-like tumours, we define an upper threshold as the minimum among normal β-values and the mean β-value over non basal-like tumours, from which we subtract 1.5 times the standard deviation of the β-value distribution considering normal samples and non basal-like tumours:

$$\mathrm{Min}\left(\beta_{\mathrm{Norm}},\ \mathrm{mean}(\beta_{\mathrm{NB}})\right) - 1.5 \times \mathrm{SD}\left(\beta_{\mathrm{Norm} \cup \mathrm{NB}}\right)$$

where Norm = {normal urothelium, normal muscle} and NB = {non basal-like tumours}. Samples whose measured CpG site (or island) methylation level is below this threshold will be considered to possess the basal-like profile for the considered CpG site (or island).
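The two threshold rules can be sketched in a few lines. This is an illustration under stated assumptions: "the maximum (or minimum) among normal β-values and the mean β-value over non basal-like tumours" is read as the extreme of the individual normal β-values and the non basal-like mean, and the standard deviation is taken as the sample standard deviation over the pooled normal and non basal-like values. The function names and the example β-values are hypothetical.

```python
from statistics import mean, stdev


def basal_like_threshold(normal_betas, non_basal_betas, direction, k=1.5):
    """Preliminary threshold for one CpG site (or island).

    normal_betas:    beta-values of normal urothelium and normal muscle samples
    non_basal_betas: beta-values of non basal-like tumours
    direction:       "hyper" or "hypo" (deregulation in basal-like tumours)
    """
    pooled = list(normal_betas) + list(non_basal_betas)
    margin = k * stdev(pooled)  # sample SD over Norm and NB (assumed definition)
    nb_mean = mean(non_basal_betas)
    if direction == "hyper":
        return max(max(normal_betas), nb_mean) + margin
    if direction == "hypo":
        return min(min(normal_betas), nb_mean) - margin
    raise ValueError("direction must be 'hyper' or 'hypo'")


def has_basal_like_profile(beta, threshold, direction):
    """A sample is called basal-like for this feature when it lies beyond the threshold."""
    return beta > threshold if direction == "hyper" else beta < threshold


# Invented beta-values for a hypothetical hypermethylated site.
normal = [0.10, 0.20]
non_basal = [0.20, 0.30, 0.10]
threshold = basal_like_threshold(normal, non_basal, "hyper")
call = has_basal_like_profile(0.45, threshold, "hyper")
```

Applied feature by feature, these calls give the per-site or per-island flags that a clinical test built on Tables 7 and 9 would combine.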

Results: Methylation signatures of basal tumours

Table 7: Signature of CpG island methylation specific to basal-like tumours. 2 CpG islands are hypermethylated and 11 CpG islands are hypomethylated in basal-like tumours compared to non-basal-like tumours.

| Genomic region | Deregulation | Threshold.cg | Threshold.I | Gene symbol | CpG ID | CpG position |
|---|---|---|---|---|---|---|
| chr1:155931629-155931858 (SEQ ID NO: 42) | hypo | 0,37 | 0,22 | ARHGEF2 | cg20847292 | 155931629 |
| | hypo | 0,00 | 0,22 | ARHGEF2 | cg13921921 | 155931763 |
| | hypo | 0,04 | 0,22 | ARHGEF2 | cg14204586 | 155931858 |
| chr1:223936508-223936838 (SEQ ID NO: 43) | hyper | 0,62 | 0,43 | CAPN2 | cg19318393 | 223936508 |
| | hyper | 0,32 | 0,43 | CAPN2 | cg06756211 | 223936799 |
| | hyper | 0,44 | 0,43 | CAPN2 | cg19598416 | 223936812 |
| | hyper | 0,40 | 0,43 | CAPN2 | cg21617516 | 223936838 |
| chr1:230249965-230250010 (SEQ ID NO: 44) | hypo | 0,52 | 0,41 | GALNT2 | cg16314254 | 230249965 |
| | hypo | 0,29 | 0,41 | GALNT2 | cg17737409 | 230250010 |
| chr6:163730158-163730368 (SEQ ID NO: 45) | hypo | 0,88 | 0,73 | PACRG | cg00797500 | 163730158 |
| | hypo | 0,71 | 0,73 | PACRG | cg10584587 | 163730283 |
| | hypo | 0,55 | 0,73 | PACRG | cg04978831 | 163730368 |
| chr6:163730796-163730853 (SEQ ID NO: 46) | hypo | 0,18 | 0,35 | PACRG | cg08555556 | 163730796 |
| | hypo | 0,35 | 0,35 | PACRG | cg06638568 | 163730822 |
| | hypo | 0,45 | 0,35 | PACRG | cg21484573 | 163730853 |
| chr7:872735-872797 (SEQ ID NO: 47) | hypo | 0,55 | 0,68 | UNC84A | cg10770230 | 872735 |
| | hypo | 0,62 | 0,68 | UNC84A | cg13975093 | 872797 |
| chr9:119976922-119976922 (SEQ ID NO: 48) | hypo | 0,80 | 0,80 | ASTN2 | cg07171518 | 119976922 |
| chr10:6220879-6220943 (SEQ ID NO: 49) | hyper | 0,91 | 0,70 | PFKFB3 | cg03555710 | 6220879 |
| | hyper | 0,62 | 0,70 | PFKFB3 | cg12664173 | 6220910 |
| | hyper | 0,59 | 0,70 | PFKFB3 | cg24202817 | 6220943 |
| chr14:80328091-80328262 (SEQ ID NO: 50) | hypo | 0,71 | 0,73 | NRXN3 | cg24197470 | 80328091 |
| | hypo | 0,65 | 0,73 | NRXN3 | cg02111786 | 80328189 |
| | hypo | 0,77 | 0,73 | NRXN3 | cg07001909 | 80328262 |
| chr16:1742021-1742281 (SEQ ID NO: 51) | hypo | 0,66 | 0,69 | HN1L | cg10458734 | 1742021 |
| | hypo | 0,62 | 0,69 | HN1L | cg06775420 | 1742245 |
| | hypo | 0,77 | 0,69 | HN1L | cg27431500 | 1742281 |
| chr16:87491587-87491587 (SEQ ID NO: 52) | hypo | 0,41 | 0,41 | ZCCHC14 | cg02660643 | 87491587 |
| chr18:9536992-9536992 (SEQ ID NO: 53) | hypo | 0,67 | 0,67 | RALBP1 | cg13272780 | 9536992 |
| chr19:10370550-10370550 (SEQ ID NO: 54) | hypo | 0,73 | 0,73 | MRPL4 | cg19819654 | 10370550 |

Table 8: Forward sequence of the area targeted with the array. CpG sites are in brackets.

| CpG position | FORWARD_SEQUENCE |
|---|---|
| 155931629 | GGTCCATGCGGTTGTAGATCTCCTGCAGACGGGCCCCTTTCTCCAGCTGATAAATACCCT[CG]TCCACATTGGACAGCAGCTCCTTCACTAGCCCCAGTGCTGTGGTCAGGTCCTGGCGCTCC (SEQ ID NO: 55) |
| 155931763 | GGCACCGGGGGTTGGCATGGGGAACGGCTCAACCAGTTTCACTCACACCCCAGTTCCCAT[CG]TGTCTTGATTCCACCTTTAGAGGCTGCCCAGGGTTTCACACCCGACCCCACCCTTCCTGT (SEQ ID NO: 56) |
| 155931858 | TTTCACACCCGACCCCACCCTTCCTGTGGTCATACCGAGCCTTTCCTCACCCCAGGTGGC[CG]TCTCCCAGTGGCCCTCTCCTGGGCCTGCCTACTCACCGTGGGAATGCTGCAGGATGCGGC (SEQ ID NO: 57) |
| 223936508 | TCCCAGCCTGCAGAAATCCGTGGATGCGCTATTCACACGGGAATGCCTGTGCTCGTTCCG[CG]AGAGGGTTGCATTCCGCCTTTTCTCTGGTATTCTCGGCGTTCAAGCATTTGAACGGGGTA (SEQ ID NO: 58) |
| 223936799 | AGATCACCAGCGCCGCGGACTCGGAGGCCATCACGTTTCAGAAGCTGGTGAAGGGGCACG[CG]TACTCGGTCACCGGAGCCGAGGAGGTAACGGCCGGCGCGGATGTGCAGGGGTCCTGCTGT (SEQ ID NO: 59) |
| 223936812 | CGCGGACTCGGAGGCCATCACGTTTCAGAAGCTGGTGAAGGGGCACGCGTACTCGGTCAC[CG]GAGCCGAGGAGGTAACGGCCGGCGCGGATGTGCAGGGGTCCTGCTGTCCTGACACGATGG (SEQ ID NO: 60) |
| 223936838 | AGAAGCTGGTGAAGGGGCACGCGTACTCGGTCACCGGAGCCGAGGAGGTAACGGCCGGCG[CG]GATGTGCAGGGGTCCTGCTGTCCTGACACGATGGCCACAGGCACAGTTTGTGGTGATGCC (SEQ ID NO: 61) |
| 230249965 | GATTCTTGGAAACGGGCCACAGTGATTCCTCTGGCCACATTCAGGGTCAGGTTTTCAGTT[CG]TCTTCCTGGTGTTTCCGCGCCTCATTCTAGGGGGAGGTTTAGTCGTTCCTTTCTCTGCTG (SEQ ID NO: 62) |
| 230250010 | GTCAGGTTTTCAGTTCGTCTTCCTGGTGTTTCCGCGCCTCATTCTAGGGGGAGGTTTAGT[CG]TTCCTTTCTCTGCTGTCCCCCAGTAGCTTCCGGCCATTCCTTGTGAAGGAATCTTCCTAA (SEQ ID NO: 63) |
| 163730158 | AGATTACCTGGCCCCAGTGGGGAACACATGGCAAGGCGATTTCTGCAGTCGGCTTGATTC[CG]ACAAATGCTGGGTACGGAAAGAGCTTGGATCTGGCCTGAGTTCACTCAAGCGCCATGTGT (SEQ ID NO: 64) |
| 163730283 | AAGTGGCTACGCCTGGCAGCAAATGCTCCACATGCCAACGGCCAGAGCACAGCCCTAGAC[CG]ATGGCGGTGCGGCTGAACCACGAGGCCTCCGCAGCTATTTATTTCAGAGCCTTGGTTTTA (SEQ ID NO: 65) |
| 163730368 | GGCCTCCGCAGCTATTTATTTCAGAGCCTTGGTTTTATGGCTGTCATTCAGCCTTGGGTG[CG]TGGAGAGAGTCAGTCTGCCTGAGTCAATGTCTCATGCTCTCTGCGGCTAAGGTGAATGTT (SEQ ID NO: 66) |
| 163730796 | AGCCTCCAGCTCTTCAGGCATCTCAGGCAGGAGAGCGTGTTGGGGCCGGGACTCTCCTCC[CG]TGGACCTGACAACTCTGCCTTCCCCGGTGTGGTCTCGCTGTTCCCTTTGAGAGTGCGATG (SEQ ID NO: 67) |
| 163730822 | GCAGGAGAGCGTGTTGGGGCCGGGACTCTCCTCCCGTGGACCTGACAACTCTGCCTTCCC[CG]GTGTGGTCTCGCTGTTCCCTTTGAGAGTGCGATGCTGCCGCTTCAGCCAGGACGTTCTCA (SEQ ID NO: 68) |
| 163730853 | TCCCGTGGACCTGACAACTCTGCCTTCCCCGGTGTGGTCTCGCTGTTCCCTTTGAGAGTG[CG]ATGCTGCCGCTTCAGCCAGGACGTTCTCAAAATTAGCAGAAGGGCTTCCTTCCTGAAATC (SEQ ID NO: 69) |
| 872735 | AGTTTTCTGCTTTTATTCATATTCCGATCCACTGAACCAAGACATTGGAGTCTGAAGAGG[CG]CTGCCCTGCCCCTTCTGCTTTCAGTTTTCCAAGTAGAAAGAGCGACTGAAACGAAGTTTG (SEQ ID NO: 70) |
| 872797 | CTGCCCTGCCCCTTCTGCTTTCAGTTTTCCAAGTAGAAAGAGCGACTGAAACGAAGTTTG[CG]TTTGATGACATTACGGGTCAGCAGCAAGGGAAGGTACACAGCTAGACGGGCCCGGATTTG (SEQ ID NO: 71) |
| 119976922 | CCTGGGGACCCAGCAGCACAGATGGGATGTAGTGGATCTCATGAGTGGCTTCTGTGCTTG[CG]CTCTTCTGGGGGATGCGGCGACGCTTCTGCCAACGTCGCTGGGCGTACAGCGCCACGGTG (SEQ ID NO: 72) |
| 6220879 | GCGCACTGGTCCCTACCTTGGAGGCCTGACTCCCTTGAGAAGTGTCCCCAACCCAGTTCC[CG]TCTCACTACCAGCCACCACCTCCCCAGCACGGGGTCCTCCGCAGGTGATTTCATCTCTGA (SEQ ID NO: 73) |
| 6220910 | CCCTTGAGAAGTGTCCCCAACCCAGTTCCCGTCTCACTACCAGCCACCACCTCCCCAGCA[CG]GGGTCCTCCGCAGGTGATTTCATCTCTGAGGCGGCTTTCTTATCTCTGGAAGAGCAGTAC (SEQ ID NO: 74) |
| 6220943 | TCACTACCAGCCACCACCTCCCCAGCACGGGGTCCTCCGCAGGTGATTTCATCTCTGAGG[CG]GCTTTCTTATCTCTGGAAGAGCAGTACTATCTCCCTGGTGGAGTCGTTCTAAAAATGAAA (SEQ ID NO: 75) |
| 80328091 | ACGGGTTCCGGGGGCCTCAGAGGTGATCCGGGAGTCGAGCAGCACAACAGGGATGGTCGT[CG]GCATTGTGGCTGCTGCCGCCCTCTGCATCTTGATCCTCCTGTACGCCATGTACAAGTACA (SEQ ID NO: 76) |
| 80328189 | TCCTGTACGCCATGTACAAGTACAGGAACAGGGACGAGGGGTCCTATCAAGTGGACGAGA[CG]CGGAACTACATCAGCAACTCCGCCCAGAGCAACGGCACGCTCATGAAGGAGAAGCAGCAG (SEQ ID NO: 77) |
| 80328262 | CAGCAACTCCGCCCAGAGCAACGGCACGCTCATGAAGGAGAAGCAGCAGAGCTCGAAGAG[CG]GCCACAAGAAACAGAAAAACAAGGACAGGGAGTATTACGTGTAAACATGCGAACACTGCT (SEQ ID NO: 78) |
| 1742021 | GCTTGGCACACCCAAACAAACCCAAGGTATGGACTGCATTCAGACGTGACAGCGCAGCAG[CG]GGTATGCCAGGTGCTCTTTCCAAAAAGGCTCCAAGGCAGATGCGACATGTTTTTAGGGAG (SEQ ID NO: 79) |
| 1742245 | GAAGCCGTGGGGGAAGCTCTTCTGTGCTGGTGGCGGACGCCCACTGCAGACGGGCTGTGG[CG]GCTCCTCACTGCAGTGCTGCGGGGCGCGGAGAAGCGGTGGGGAGCGGAACGTGCCGCAGA (SEQ ID NO: 80) |
| 1742281 | ACGCCCACTGCAGACGGGCTGTGGCGGCTCCTCACTGCAGTGCTGCGGGGCGCGGAGAAG[CG]GTGGGGAGCGGAACGTGCCGCAGACGAGCTGGGCCCTTGTCCGTCTTCCACTCTTCCTGT (SEQ ID NO: 81) |
| 87491587 | CTGCGGTATTTTAAGCACTGTGGGGTTTGAGAATGACCACTCGTGGCCTCAGAGCCGACC[CG]CCAGCCTACGCAGGCTTTGCCACAGCTCACATGAGAGTCAGCTCACCCTCTGTCTCTCTG (SEQ ID NO: 82) |
| 9536992 | CCTTTCCTTTGAATGATAGCTGTGATTCACCCCACCCCATTTTCTTGTTTCTGGTCCATC[CG]ATGAGACGGATGCTCTGATGCTCTGAGGCTTCTGGGAGGCTGGGCCCTGGAGGCAACGTG (SEQ ID NO: 83) |
| 10370550 | TGTGAAGCACCTCTTCTGAGCCAGGCCGAGCCCCTGGCCGACTTGGGAGCCTCAGGCCCA[CG]CCCACCCTTCGAGGAAGGTGTCACCTGGACCCCTTCATTCCACGGAGGAAGCTGAGGCCA (SEQ ID NO: 84) |

Table 9: Signature of CpG site methylation specific to basal-like tumours. All these CpG sites are hypomethylated in basal-like tumours compared to non-basal-like tumours. CpG sites are in brackets.

| Threshold cg | Gene | CpG ID | Chr. | CpG position | FORWARD_SEQUENCE |
|---|---|---|---|---|---|
| 0,53 | FLJ43663 | cg20228731 | 1 | 130646051 | GCTGGAGCCCGAAATCCCCTTGATGAATGAAACTGCACATCGTAATATACACATTCAGAG[CG]GGCTATTGGCTAAATGAAGGCAATGCGTGTGGAAGTCTGATTTGCCACCTTCAGGGCAGG (SEQ ID NO: 85) |
| 0,53 | FLJ43663 | cg15928106 | 1 | 130646078 | TGAAACTGCACATCGTAATATACACATTCAGAGCGGGCTATTGGCTAAATGAAGGCAATG[CG]TGTGGAAGTCTGATTTGCCACCTTCAGGGCAGGCGTCTTATATCCTAATTCAGAAAGATG (SEQ ID NO: 86) |
| 0,66 | ATXN7 | cg07275179 | 3 | 63962440 | AGTCAGTTCTGCTATAGCACAACATGCATTCCTAAAAAATTACTGCACAATGCAAAATTA[CG]CAACAAAAACCACATGAGGCTTATTGGGAAAACAGGGTTATGGCACAACACTCAAAGACC (SEQ ID NO: 87) |
| 0,48 | TNRC18 | cg05687083 | 7 | 5371160 | AAATGGGGTTTTCCTTCAGGTGAGAGTGGAGGCCACAAACCACAAAAGTACCAGCAGCGC[CG]GTGAAACCACAGAGTCTTGCAGCAACGGAAACCACAGAGACTTGCAGCACACAATAGTTA (SEQ ID NO: 88) |
| 0,40 | MCC | cg06062378 | 5 | 112387002 | CTCTCCCCCTACCCCCACCACCGAGGAAATGAGAATGAATCAGCACTGACGAAGAATGAG[CG]GATGCAAAAAGTAAGAGTCAGCATTTACAAGAAGCAGAAATAACAATGTGCAAAACTAGA (SEQ ID NO: 89) |
| 0,48 | LOC285550 | cg02353916 | 4 | 15691786 | TCTTTGCCTTTTTACTTTTAAAAATCTAATTTTGACATAACTGCTGTAACCATCCAGAAA[CG]GCATTGATGTTGCTTCACGTTGCTGATGCTTAAGCAATGTATATTGTGTAATATACAATG (SEQ ID NO: 90) |
| 0,83 | NRF1 | cg20712980 | 7 | 129305192 | TACCTGCCCAGGCTCTGGAATCAGGCTTCCTGGGTTGGAATCCCAACTCCACTTAACCAG[CG]TGGGGATTTTTCTGAGGAGCTGGTCTGTAGCATTCATCATATCTGAAAGGTGACCCAGAA (SEQ ID NO: 91) |
| 0,58 | ABL2 | cg03471346 | 1 | 179112077 | TGATTAATGTGGTTATCAAGCCTCCCTGAAACTATCAACTCTGAACCATACCTGTTAAGT[CG]GGTAGAGCAGATTCTGAGGCCTCAGTGCACAGGCAGCAAAGTGAAGTGTCCTGATCTCTG (SEQ ID NO: 92) |
| 0,68 | SUFU | cg18580385 | 10 | 104364518 | CATTCAGGAGAGTAGGGGTCCTAATCTCCCACCCAAGTCAGAGGGTAGGGCTGGGGATGG[CG]GCCCCTGGTGCCCAGAGGAGTTAATAGTTGACTAGGAAATCCCCTAACCTTGATGCAGAT (SEQ ID NO: 93) |
| 0,24 | SAMD12 | cg07960624 | 8 | 119208486 | TGAGCGGAGGCAGACAATGCTTATGTATCTGGAGACACCCTAAAACACCAAGCTTGCTTG[CG]TTGGAAGTCTTCCAGAACTGGTTACCCATATTTTAAAGCAAAGCCTCTGATGCCCGTCAG (SEQ ID NO: 94) |
| 0,34 | GNAI2 | cg01542384 | 3 | 50284305 | GGAAGGAGGCCCCAGCTGTGTAGGAGTGCCGGGTTATACTGAAGGGAAGTGACGACACAG[CG]GAAAGCCAACAGAGGAGCCAGTAGCTCTGACTCCTGTCCCAGGTTATGCTGAGTCCCAGC (SEQ ID NO: 95) |
| 0,51 | AP3S2 | cg02349866 | 15 | 90391877 | CTCCTGACCTTGTGATCCGCCTGCCTTGGCCTCCCAGGCTGATAGCTTTTTCTACTCCTT[CG]TGGCATAGCTGGTTTCATTAATTCCTAGGAAAGGGCTTCAGAAAATAAGTAGGGAAGAGA (SEQ ID NO: 96) |
| 0,41 | BIVM | cg24062389 | 13 | 103478447 | TAATTTAATGAATTCGCAATTGAGGATATGAGTCAGGAGGAATTCAGGAGATTGGGGTTA[CG]AAACTTCTAAGAGAGAATTTTGTTGTCCATTCCAAAGCTGGCATTCTTCCTTTGCAAGAG (SEQ ID NO: 97) |
| 0,51 | SYNJ2 | cg12002139 | 6 | 158478872 | CCAGGGGCAGAGCCTGCACGTGTCAGAGCCAGGCCTGGGATGCACCTTCTCTGGGCAGAC[CG]ACCCAAAAGAAAGCCCACAAGGGCTGAAGCCACACACAGCCCGCCCCAGGGCTGCCTGGA (SEQ ID NO: 98) |
| 0,24 | GNAI2 | cg11118235 | 3 | 50284010 | TACCCCCACCTGCCCCACCATATCCCACTGGGATAGGCTGGGGAGTTGTGGCAGTGTGGG[CG]GGCCTGGTGTTGTGGGTTTGAGCTGGGCCGCTGCCCAAGTCTACCCTGTGGTCCCAGCCG (SEQ ID NO: 99) |

Example 5: Identification of basal-like MIBC using basal-like specific mRNA isoform of HDAC9 gene and/or HDAC9 short/long isoform ratio and/or overexpression of TGM1 gene

Material and methods

Description of HDAC9 short and long isoforms. HDAC9 short and long isoforms are the result of alternative promoter usage. HDAC9 long isoforms correspond to transcripts ENST00000406451, ENST00000405010 and ENST00000428307 as described on the Ensembl database. Short isoform corresponds to the transcript ENST00000456174.

Affymetrix exon array profiling of RNA extracted from human tumors and cell lines
207 bladder tumors were collected from patients surgically treated between 1988 and 2006 at Henri Mondor Hospital, Institut Gustave Roussy (Villejuif, France), and Foch Hospital (Suresnes, France). 5 normal bladder urothelium samples were also collected. RNA was extracted by cesium chloride density centrifugation. The concentration, integrity and purity of each RNA sample were determined with the RNA 6000 LabChip Kit (Agilent Technologies, Santa Clara, California, USA) and an Agilent 2100 Bioanalyzer. 1.5 micrograms of total RNA from tissue samples were processed and labelled using the Affymetrix GeneChip Whole Transcript Sense Target Labelling Assay as outlined in the manufacturer's instructions. Hybridization to Affymetrix GeneChip Human Exon 1.0 ST arrays was carried out independently at the Institut Curie Affymetrix platform. Affymetrix Expression Console Software was used to perform quality assessment.

Data pre-processing and filtering
Signal estimates were derived from the microarray CEL files. Partek® Genomics Suite™ was used to quantile normalize the raw feature data and to summarize the probe set (representing exon expression) intensities using the Robust Multi-array Average (RMA) method (Irizarry et al., 2009). Probe-to-exon assignments were drawn from the custom CDF files of Dai et al. (Dai et al., 2005) in version 12 for Ensembl exons. Only a subset of well-annotated core probe sets from the exon array was used for the study. Cross-hybridizing probe sets (targeting non-unique sequences of the transcriptome) and probe sets having extremely low standard deviation (due to poor hybridization or an exon missing in all samples) were discarded to remove noisy data that might contribute to false positives in the alternative splicing analysis.

Alternative promoter usage analysis
Identification of alternative promoter usage from exon array data was conducted based on a published statistical method called ARH (Rasche et al., 2010). The method makes it possible to compare two samples and to measure the significance of possible differences in transcript structure for each gene between the samples. ARH scores by gene and ARH splicing deviations by exon were computed for each tumor using median values over normal samples (bladder urothelium samples) as reference. ARH scores are linked to p-values referring to the hypothesis that the gene is alternatively spliced (has at least one exon differentially included in the gene) between tumor and normal samples. ARH splicing deviations give a measure of the inclusion level difference of each exon in the corresponding gene for each tumor compared to normal samples. The values are centered on 0, negative values suggesting a loss of the exon and positive values a gain of the exon in the tumor relative to normal samples. When dealing with discrete data, an exon of a tumor was considered to show a significant differential inclusion level compared to normal if the corresponding gene p-value was lower than 0.05 and the absolute exon splicing deviation was greater than 1. Exons associated with cross-hybridizing probe sets or probe sets that showed very low variation (standard deviations < 0.4) were discarded from the results. All the computations were performed using the R platform (R Core Team, 2012).
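The discrete calling rule stated above (gene p-value below 0.05 and absolute exon splicing deviation above 1) can be written as a small filter. This sketch does not compute ARH scores; it only applies the cutoffs to values assumed to come from an ARH analysis. The function name, the input layout and the example numbers are hypothetical.

```python
def significant_exons(gene_p_value, exon_deviations, excluded=frozenset(),
                      p_cutoff=0.05, deviation_cutoff=1.0):
    """Apply the discrete rule to one gene in one tumor.

    gene_p_value:    p-value attached to the ARH score of the gene
    exon_deviations: {exon_id: ARH splicing deviation relative to normal samples}
    excluded:        exons on cross-hybridizing or low-variation probe sets
    Returns {exon_id: "gain" or "loss"} for exons passing both cutoffs.
    """
    if gene_p_value >= p_cutoff:
        return {}
    return {
        exon: ("gain" if deviation > 0 else "loss")
        for exon, deviation in exon_deviations.items()
        if exon not in excluded and abs(deviation) > deviation_cutoff
    }


# Invented deviations for a hypothetical three-exon gene.
calls = significant_exons(
    gene_p_value=0.01,
    exon_deviations={"exon_1": -1.4, "exon_2": 0.3, "exon_3": 1.2},
    excluded={"exon_3"},
)
```

Negative deviations are reported as a loss of the exon and positive ones as a gain, following the sign convention given in the text.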

Enrichment analysis of alternative promoter usage in bladder tumors
Significance assessment of possible associations between frequent alternative promoter usage and subgroups of bladder tumors (e.g. basal-like and non basal-like) was obtained by performing a two-step enrichment analysis. First, we computed Fisher exact test p-values for each pair {alternative promoter usage, tumor subgroup} among the alternative promoter usages identified and the tumor properties. An alternative promoter usage was then assigned to the tumor subgroup property with the smallest significant p-value (p-value < 0.001). Secondly, we used these assignments to perform a Fisher exact test for each tumor property in order to find the properties that were the most significantly enriched with specific alternative promoter usage.

Quantification of TGM1 gene expression and HDAC9 short and long isoforms by RT-qPCR
2 µg of RNA were reverse transcribed using the High Capacity cDNA Reverse Transcription kit (Applied Biosystems, Courtaboeuf, France). For all samples, 10 ng of cDNA were used for amplification with a LightCycler 480 Instrument (Roche Diagnostics, Meylan, France). All samples were run in duplicate. Amplification was performed using SybrGreen master mix (Roche Diagnostics). Forward/reverse primers used for TGM1 gene expression are 5'-AACTCCCTGGATGACAATGG-3' (SEQ ID NO: 103) / 5'-GCAGCACTGTGGTGGTCA-3' (SEQ ID NO: 104). Forward/reverse primers used for HDAC9 short and long isoforms are 5'-TCACAGTGTAGCTTGAGAAAAATG-3' (SEQ ID NO: 105) / 5'-TGCAACTTGATATGCTCCTGA-3' (SEQ ID NO: 106) and 5'-CAGATGGGGTGGCTGGAC-3' (SEQ ID NO: 107) / 5'-TGCTTCTGGATTTGTTGCTG-3' (SEQ ID NO: 108), respectively.
To normalize sample-to-sample differences in cDNA input and to perform relative quantification, 18S housekeeping gene expression was quantitatively measured in each sample using a pre-designed gene expression assay containing gene-specific primers and a dye-labeled hydrolysis probe (Assay-on-Demand, Applied Biosystems). Quantification was performed on 20 basal-like and 20 non basal-like tumors randomly chosen from the samples analyzed on exon array.
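The Fisher exact test used in the first step of the enrichment analysis can be sketched in pure Python for a 2x2 table (tumors with or without a given alternative promoter usage, inside or outside a subgroup). The two-sided convention used here, summing all tables no more probable than the observed one, is an assumption, as are the function name and the counts in the usage example; the study itself relied on R.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test p-value for the 2x2 table [[a, b], [c, d]].

    Rows: tumors in the subgroup / outside the subgroup.
    Columns: tumors with / without the alternative promoter usage.
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x events in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    p_value = sum(prob(x) for x in range(lowest, highest + 1)
                  if prob(x) <= p_observed * (1 + 1e-7))
    return min(1.0, p_value)


# Invented counts: 18 of 20 basal-like tumors show the event versus 2 of 20 others.
p_value = fisher_exact_two_sided(18, 2, 2, 18)
assigned = p_value < 0.001  # cutoff quoted in the text
```

With the 0.001 cutoff from the text, an event would be assigned to a subgroup only when the association is this strong or stronger.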

Results

HDAC9 short isoform corresponding to the transcript ENST00000456174 comprises exon 1 contrary to HDAC9 long isoforms corresponding to transcripts ENST00000406451, ENST00000405010 and ENST00000428307. Quantification of short and long HDAC9 isoforms in basal-like and non-basal-like bladder tumors was performed by RT-qPCR on 20 basal-like bladder tumors, 20 non basal-like bladder tumors and one normal bladder urothelium sample (919-1). As shown on Figure 14, short and long HDAC9 isoforms are differentially expressed in basal-like and non-basal-like bladder tumors, short HDAC9 isoform being mainly expressed in basal-like tumors.

Figure 15 shows short/long HDAC9 isoform expression ratios in basal-like and non-basal-like bladder tumors. These results demonstrate that this ratio is higher in basal-like tumors and thus discriminates basal-like from non-basal-like bladder tumors.

Quantification of TGM1 gene expression in basal-like and non-basal-like bladder tumors was performed by RT-qPCR (Figure 16). The inventors observed that the TGM1 gene was overexpressed in basal-like tumors by comparison with non-basal-like tumors and showed that basal-like MIBC can be classified based on global expression of the TGM1 gene.

As shown on Figures 17 to 20, combining TGM1 gene expression and the short/long HDAC9 isoform expression ratio resulted in highly efficient classification of basal-like and non-basal-like MIBC. In figures 19 and 20, the qPCR cut-off values were chosen to optimize both specificity and sensitivity. The HDAC9 short/long isoform ratio was considered as "high" if its value corresponded to a 30-fold increase compared to the normal urothelium sample. The expression level of the TGM1 gene was considered as "high" if its value corresponded to a 4-fold increase compared to the normal urothelium sample.
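The two qPCR cut-offs can be combined in a short decision sketch. The 4-fold and 30-fold values come from the text; treating them as inclusive lower bounds, requiring both markers to be high for a basal-like call, and the function and key names are assumptions of this example, and the fold-changes in the usage lines are invented.

```python
def classify_mibc(tgm1_fold, hdac9_ratio_fold, tgm1_cutoff=4.0, ratio_cutoff=30.0):
    """Flag each marker as high or not and combine them.

    tgm1_fold:        TGM1 expression relative to normal urothelium
    hdac9_ratio_fold: HDAC9 short/long ratio relative to normal urothelium
    """
    tgm1_high = tgm1_fold >= tgm1_cutoff
    ratio_high = hdac9_ratio_fold >= ratio_cutoff
    return {
        "TGM1_high": tgm1_high,
        "HDAC9_ratio_high": ratio_high,
        # Assumed combination rule: both markers must be high.
        "basal_like": tgm1_high and ratio_high,
    }


# Invented fold-changes for two hypothetical tumors.
tumor_1 = classify_mibc(tgm1_fold=12.0, hdac9_ratio_fold=85.0)
tumor_2 = classify_mibc(tgm1_fold=1.5, hdac9_ratio_fold=40.0)
```

Returning the two flags separately keeps discordant cases visible, so a different combination rule can be applied without re-measuring.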

REFERENCES

Adam et al. (2009) Clin Cancer Res 15, 5060-5072.
Baselga, J. (2008). Ann Oncol 19 Suppl 7, vii281-288.
Becci et al (1978). Cancer Res 38, 4463-4466.
Bellmunt et al. (2010) Ann Oncol 21 Suppl 5, v134-136.
Bellmunt, J., and Petrylak, D. P. (2012). Semin Oncol 39, 598-607.
Bibikova et al. (2011) Genomics 98, 288-295.
Black et al. (2008) Clin Cancer Res 14, 1478-1486.
Blaveri et al (2005) Clin Cancer Res 11, 4044-4055.
Boyault et al. (2007). Hepatology 45, 42-52.
Chirgwin et al (1979) Biochemistry 18, 5294-5299.
Chow et al (2001) Clin Cancer Res 7, 1957-1962.
Coombs et al (1990) Anal Biochem 188, 338-343.
Dai et al. (2005). Nucleic Acids Res. 33, e175-e175.
De Boer et al (1997) Int J Cancer 71, 284-291.
Dovedi, S. J., and Davies, B. R. (2009). Cancer Metastasis Rev 28, 355-367.
Dyrskjot et al (2003) Nat Genet 33, 90-96.
Dyrskjot et al. (2007) Clin Cancer Res 13, 3545-3551.
El-Marjou et al (2000). Carcinogenesis 21, 2211-2218.
Eswarakumar et al. 2005, Cytokine Growth Factor Rev. 16:139-49
Fanning and Symonds (2006) RNA Towards Medicine (Handbook of Experimental Pharmacology), ed. Springer p. 289-303
Forbes et al. (2011) Nucleic Acids Res 39, D945-950.
Geiss et al. 2008 Nat. Biotechnol. 26:317-325
Gil-Diez de Medina et al (1998) Hum Pathol 29, 1005-1012.
Hastie et al (2011). pamr: Pam: prediction analysis for microarrays. R package version 1.51. http://CRAN.R-project.org/package=pamr
Harlow, E. and Lane, D. (1988) Antibodies: A Laboratory Manual, ed., Cold Spring Harbor Laboratory.
Ishikawa et al (1998). Cancer Res 58(4):685-90
Irizarry et al (2003). Biostatistics 4, 249-264.
Irizarry et al. Nature Genetics 41, 178-186 (2009).
Karni-Schmidt et al (2011). Am J Pathol 178, 1350-1360.
Kim et al. (2010). Mol Cancer 9, 3.
Labarca, C., and Paigen, K. (1980). Anal Biochem 102, 344-352.
Lee et al (2007). Proc Natl Acad Sci U S A 104, 13086-13091.
Lindgren et al (2010) Cancer Res 70, 3463-3472.
Ling et al (2011) Cancer Res 71, 3812-3821.
Mostofi FK, S. L., Torloni H (1973). Histological Typing of Urinary Bladder Tumours. Geneva: World Health Organization.
Necchi et al. (2012) Lancet Oncol 13, 810-816.
Nicolle et al. (2006). Clin Cancer Res 12, 2937-2943.
Perou et al. (2000) Nature 406, 747-752.
Perrotte et al (1999) Clin Cancer Res 5, 257-265.
Pruthi et al. (2010). BJU Int 106, 349-354.
Olive, Expert Review of Proteomics, October 2004, Vol. 1, No. 3, Pages 327-341
Rasche, A., and Herwig, R. (2010). Bioinformatics 26, 84-90.
R Development Core Team (2011). R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org/
R Core Team (2012). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org/
Riester et al (2012). Clin Cancer Res 18, 1323-1333.
Romer et al (1994) J Invest Dermatol 102, 519-522.
Sanchez-Carbayo, J. Clin Oncol, Vol 24, No 5 (February 10), 2006: pp. 778-789
Schubbert et al, 2007. Nat Rev Cancer. 7:295-308
Sjodahl et al. (2012) Clin Cancer Res 18, 3377-3386.
Sobin, L. H., and Fleming, I. D. (1997). Cancer 80, 1803-1804.
Sok et al. (2006). Clin Cancer Res 12, 5064-5073.
Sternberg et al. (2012). ICUD-EAU International Consultation on Bladder Cancer 2012: Chemotherapy for Urothelial Carcinoma-Neoadjuvant and Adjuvant Settings. Eur Urol.
Stransky et al. (2006). Nat Genet 38, 1386-1396.
Tamano et al (1991). Jpn J Cancer Res 82, 650-656.
Thogersen et al (2001) Cancer Res 61, 6227-6233.
Tibshirani et al (2002). Proc Natl Acad Sci U S A 99, 6567-6572.
van Oers et al (2005). Clin Cancer Res 11, 7743-7748.
Varley et al (2009) Cell Death Differ 16, 103-114.
Volkmer et al. (2012). Proc Natl Acad Sci U S A 109, 2078-2083.
Wallerand et al. (2005). Carcinogenesis 26, 177-184.
Wang et al. Nat Rev Genet. 2009 January; 10(1): 57-63.
Wong et al (2012). J Clin Oncol 30, 3545-3551.

CLAIMS

1. An in vitro method for determining whether a muscle-invasive bladder cancer has a basal-like phenotype, wherein the method comprises (i) determining the expression level of KRT5, KRT6A and/or KRT6B and the expression level of nuclear FOXA1 in a cancer sample; and/or (ii) determining the expression level of at least 2 genes selected from the group consisting of PI3, KRT6B, CSTA, DSC2, MT1X, RAB38, SFN, SAMD9, EGFR, CD44, IL1RAP, DSP, PKP1, SERPINB7, CELSR2, DUSP7, TBC1D2, ARL4D, IPPK, MTSS1 and RGS20 genes, and the expression level of at least 2 genes selected from the group consisting of PHC1, THYN1, TACC1, PPAP2B, NRXN3, GNA14, ZFHX3, TLE2, MAML3, EPS8, CACNA1D, RAB15, MAN1C1, SORL1, CHN2, TGFBR3, CAB39L, LIMCH1 and BAMBI genes, in a cancer sample; and/or (iii) determining the expression level of at least one exon selected from the group consisting of the exons of SEQ ID NO: 23 to 41, in a cancer sample; and/or (iv) determining the DNA methylation status of at least 4 CpG islands selected from the group consisting of CpG islands listed in Table 7 in a cancer sample; and/or (v) determining the DNA methylation status of at least 4 CpG sites selected from the group consisting of CpG sites listed in Table 9 in a cancer sample; and/or (vi) determining the expression level of TGM1 gene in a cancer sample, thereby determining whether a muscle-invasive bladder cancer has a basal-like phenotype.

2. The method according to claim 1, wherein the method comprises determining the expression level of KRT5, KRT6A and/or KRT6B and the expression level of nuclear FOXA1 in a cancer sample, and wherein the expression of KRT5, KRT6A and/or KRT6B and the absence of nuclear FOXA1 are indicative that the muscle-invasive bladder cancer has a basal-like phenotype.

3. The method according to claim 1 or 2, wherein the expression level of KRT5, KRT6A and/or KRT6B and the expression level of nuclear FOXA1 are assessed by immunohistochemistry.

4. The method according to claim 1, wherein the method comprises determining the expression level of PKP1, IPPK, MAML3 and TGFBR3 genes in a cancer sample, and wherein high expression level of PKP1 and IPPK genes and low expression level of MAML3 and TGFBR3 genes, are indicative that the muscle-invasive bladder cancer has a basal-like phenotype.

5. The method according to claim 1, wherein the method comprises determining the expression level of the exon of SEQ ID NO: 24 in a cancer sample, and wherein low expression level of said exon is indicative that the muscle-invasive bladder cancer has a basal-like phenotype.

6. The method according to claim 1, wherein the method comprises determining the expression level of the exon of SEQ ID NO: 37 in a cancer sample, and wherein high expression level of said exon is indicative that the muscle-invasive bladder cancer has a basal-like phenotype.

7. The method according to claim 1, wherein the method comprises determining the expression level of TGM1 gene, and wherein high expression level is indicative that the muscle-invasive bladder cancer has a basal-like phenotype.

8. The method according to claim 7, wherein the method further comprises determining the expression level of the exon of SEQ ID NO:37.

9. The method according to claim 1, wherein the method comprises determining the expression level of TGM1 gene and determining the expression level of HDAC9 short isoform corresponding to the transcript ENST00000456174 and HDAC9 long isoforms corresponding to transcripts ENST00000406451, ENST00000405010 and ENST00000428307.

10. The method according to claim 9, further comprising calculating the ratio of HDAC9 short isoform to HDAC9 long isoforms, wherein high expression level of TGM1 gene and a high ratio are indicative that the muscle-invasive bladder cancer has a basal-like phenotype.

11. The method according to claim 1, wherein the method comprises determining the DNA methylation status of CpG islands of SEQ ID NO: 43, 45, 47 and 51, and wherein hypermethylation of the CpG island of SEQ ID NO: 43 and hypomethylation of CpG islands of SEQ ID NO: 45, 47 and 51, is indicative that the muscle-invasive bladder cancer has a basal-like phenotype.

12. The method according to claim 1, wherein the method comprises determining the DNA methylation status of CpG sites of SEQ ID NO: 85, 86, 91 and 96, and wherein hypomethylation of said CpG sites is indicative that the muscle-invasive bladder cancer has a basal-like phenotype.

13. An in vitro method for predicting clinical outcome of a patient afflicted with a muscle-invasive bladder cancer, wherein the method comprises determining in a cancer sample from said patient whether the muscle-invasive bladder cancer has a basal-like phenotype with the method according to any one of claims 1 to 12, the presence of the basal-like phenotype being indicative of a poor prognosis.

14. An in vitro method for selecting a patient afflicted with a muscle-invasive bladder cancer for a treatment comprising an EGFR kinase inhibitor and/or capecitabine, wherein the method comprises determining in a cancer sample from said patient whether the muscle-invasive bladder cancer has a basal-like phenotype with the method according to any one of claims 1 to 12, and optionally determining whether the muscle-invasive bladder cancer has a RAS-activating mutation, the presence of the basal-like phenotype being indicative that said patient is susceptible to benefit from a treatment comprising capecitabine, and the presence of the basal-like phenotype and the absence of a RAS-activating mutation being indicative that said patient is susceptible to benefit from a treatment comprising an EGFR kinase inhibitor.

15. An in vitro method of predicting the sensitivity of a muscle-invasive bladder cancer to a treatment comprising an EGFR kinase inhibitor and/or capecitabine, wherein the method comprises determining whether the muscle-invasive bladder cancer has a basal-like phenotype with the method according to any one of claims 1 to 12, and optionally determining whether the muscle-invasive bladder cancer has a RAS-activating mutation, the presence of the basal-like phenotype in said cancer being indicative that said cancer is sensitive to a treatment comprising capecitabine, and the presence of the basal-like phenotype and the absence of RAS-activating mutation being indicative that said cancer is sensitive to a treatment comprising an EGFR kinase inhibitor.

16. An EGFR kinase inhibitor for use in the treatment of muscle-invasive bladder cancer having a basal-like phenotype as determined with the method according to any one of claims 1 to 12 and without RAS-activating mutation.

17. Capecitabine for use in the treatment of muscle-invasive bladder cancer having a basal-like phenotype as determined with the method according to any one of claims 1 to 12.

18. Use of a kit (i) for predicting clinical outcome of a patient afflicted with a muscle-invasive bladder cancer, (ii) for selecting a patient afflicted with a muscle-invasive bladder cancer for a treatment comprising an EGFR kinase inhibitor and/or capecitabine, and/or (iii) for predicting the sensitivity of a muscle-invasive bladder cancer to a treatment comprising an EGFR kinase inhibitor and/or capecitabine, wherein the kit comprises detection means selected from the group consisting of a pair of primers, a probe and an antibody specific to (a) the genes KRT5, KRT6A, KRT6B and/or FOXA1; and/or (b) at least 2 genes selected from the group consisting of PI3, KRT6B, CSTA, DSC2, MT1X, RAB38, SFN, SAMD9, EGFR, CD44, IL1RAP, DSP, PKP1, SERPINB7, CELSR2, DUSP7, TBC1D2, ARL4D, IPPK, MTSS1 and RGS20 genes, preferably PKP1 and IPPK, and at least 2 genes selected from the group consisting of PHC1, THYN1, TACC1, PPAP2B, NRXN3, GNA14, ZFHX3, TLE2, MAML3, EPS8, CACNA1D, RAB15, MAN1C1, SORL1, CHN2, TGFBR3, CAB39L, LIMCH1 and BAMBI genes, preferably MAML3 and TGFBR3; and/or (c) at least one exon selected from the group consisting of the exons of SEQ ID NO: 23 to 41, preferably the exon of SEQ ID NO: 24; and/or (d) at least 4 CpG islands selected from the group consisting of CpG islands listed in Table 7; and/or (e) at least 4 CpG sites selected from the group consisting of CpG sites listed in Table 9; and/or (f) the TGM1 gene, and optionally, a leaflet providing guidelines to such use.

19. Use of a DNA chip (i) for predicting clinical outcome of a patient afflicted with a muscle-invasive bladder cancer, (ii) for selecting a patient afflicted with a muscle-invasive bladder cancer for a treatment comprising an EGFR kinase inhibitor and/or capecitabine, and/or (iii) for predicting the sensitivity of a muscle-invasive bladder cancer to a treatment comprising an EGFR kinase inhibitor and/or capecitabine, wherein the DNA chip comprises a solid support which carries nucleic acids that are specific to (b) at least 2 genes selected from the group consisting of PI3, KRT6B, CSTA, DSC2, MT1X, RAB38, SFN, SAMD9, EGFR, CD44, IL1RAP, DSP, PKP1, SERPINB7, CELSR2, DUSP7, TBC1D2, ARL4D, IPPK, MTSS1 and RGS20 genes, preferably PKP1 and IPPK, and at least 2 genes selected from the group consisting of PHC1, THYN1, TACC1, PPAP2B, NRXN3, GNA14, ZFHX3, TLE2, MAML3, EPS8, CACNA1D, RAB15, MAN1C1, SORL1, CHN2, TGFBR3, CAB39L, LIMCH1 and BAMBI genes, preferably MAML3 and TGFBR3; and/or (c) at least one exon selected from the group consisting of the exons of SEQ ID NO: 23 to 41, preferably the exon of SEQ ID NO: 24; and/or (d) at least 4 CpG islands selected from the group consisting of CpG islands listed in Table 7; and/or (e) at least 4 CpG sites selected from the group consisting of CpG sites listed in Table 9; and/or (f) the TGM1 gene.

20. The use of claim 18, wherein the kit further comprises detection means selected from the group consisting of a pair of primers, a probe and an antibody specific to the HDAC9 short isoform corresponding to the transcript ENST00000456174 and HDAC9 long isoforms corresponding to transcripts ENST00000406451, ENST00000405010 and ENST00000428307.

21. The use of claim 19, wherein the solid support further carries nucleic acids that are specific to the HDAC9 short isoform corresponding to the transcript ENST00000456174 and HDAC9 long isoforms corresponding to transcripts ENST00000406451, ENST00000405010 and ENST00000428307.

22. The method according to claim 14 or 15 or the EGFR kinase inhibitor according to claim 16, wherein the EGFR kinase inhibitor is selected from the group consisting of erlotinib, cetuximab, gefitinib, lapatinib, panitumumab, zalutumumab, nimotuzumab and matuzumab, and any combination thereof.

23. A combined preparation, product or kit containing (a) capecitabine and (b) an alkylating agent, as a combined preparation for simultaneous, separate or sequential use in the treatment of a muscle-invasive bladder cancer having a basal-like phenotype as determined with the method according to any one of claims 1 to 12.

24. The combined preparation, product or kit of claim 23, wherein the alkylating agent is cisplatin.
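For illustration only, the indication logic recited in claims 15 to 17 can be sketched as below. This is a minimal sketch and not part of the claims: the names `TumourProfile` and `indicated_treatments` are hypothetical, the basal-like status is assumed to have already been determined by a method of claims 1 to 12, and treating an undetermined RAS status as "EGFR kinase inhibitor not indicated" is an assumption made here, since the claims only state what the absence of a RAS-activating mutation indicates.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TumourProfile:
    """Hypothetical container for the two determinations named in claim 15."""
    basal_like: bool
    # None means the optional RAS determination was not carried out.
    ras_activating_mutation: Optional[bool] = None


def indicated_treatments(profile: TumourProfile) -> Dict[str, bool]:
    """Return which treatments the claimed criteria indicate sensitivity to.

    A False value only means the claimed criteria give no positive
    indication; it is not a prediction of resistance.
    """
    # Claims 15 and 17: basal-like phenotype indicates capecitabine sensitivity.
    capecitabine = profile.basal_like
    # Claims 15 and 16: basal-like phenotype AND no RAS-activating mutation.
    # Assumption: an undetermined RAS status (None) gives no indication.
    egfr_inhibitor = profile.basal_like and profile.ras_activating_mutation is False
    return {
        "capecitabine": capecitabine,
        "egfr_kinase_inhibitor": egfr_inhibitor,
    }
```

A basal-like tumour without a RAS-activating mutation is flagged for both treatments, whereas a basal-like tumour carrying such a mutation is flagged for capecitabine only.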

INTERNATIONAL SEARCH REPORT

International application No. PCT/EP2014/054384

Box No. I Nucleotide and/or amino acid sequence(s) (Continuation of item 1.c of the first sheet)

1. With regard to any nucleotide and/or amino acid sequence disclosed in the international application and necessary to the claimed invention, the international search was carried out on the basis of:

(means)

on paper

in electronic form

in the international application as filed

together with the international application in electronic form

subsequently to this Authority for the purpose of search

2. In addition, in the case that more than one version or copy of a sequence listing and/or table relating thereto has been filed or furnished, the required statements that the information in the subsequent or additional copies is identical to that in the application as filed or does not go beyond the application as filed, as appropriate, were furnished.

3. Additional comments:

Form PCT/ISA/210 (continuation of first sheet (1)) (July 2009)

Box No. II Observations where certain claims were found unsearchable (Continuation of item 2 of first sheet)

This international search report has not been established in respect of certain claims under Article 17(2)(a) for the following reasons:

□ Claims Nos.: because they relate to subject matter not required to be searched by this Authority, namely:

□ Claims Nos.: because they relate to parts of the international application that do not comply with the prescribed requirements to such an extent that no meaningful international search can be carried out, specifically:

□ Claims Nos.: because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a).

Box No. III Observations where unity of invention is lacking (Continuation of item 3 of first sheet)

This International Searching Authority found multiple inventions in this international application, as follows:

see additional sheet

1. As all required additional search fees were timely paid by the applicant, this international search report covers all searchable claims.

2. As all searchable claims could be searched without effort justifying additional fees, this Authority did not invite payment of additional fees.

3. As only some of the required additional search fees were timely paid by the applicant, this international search report covers only those claims for which fees were paid, specifically claims Nos.:

4. No required additional search fees were timely paid by the applicant. Consequently, this international search report is restricted to the invention first mentioned in the claims; it is covered by claims Nos.: 2-6, 20, 21 (completely); 1, 13-19, 22-24 (partially)

Remark on Protest

The additional search fees were accompanied by the applicant's protest and, where applicable, the payment of a protest fee.

The additional search fees were accompanied by the applicant's protest but the applicable protest fee was not paid within the time limit specified in the invitation.

No protest accompanied the payment of additional search fees.

Form PCT/ISA/210 (continuation of first sheet (2)) (April 2005)

A. CLASSIFICATION OF SUBJECT MATTER

INV. C12Q1/68
ADD.

According to International Patent Classification (IPC) or to both national classification and IPC

B. FIELDS SEARCHED

Minimum documentation searched (classification system followed by classification symbols)

C12Q

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched

Electronic data base consulted during the international search (name of data base and, where practicable, search terms used)

EPO-Internal, WPI Data

C. DOCUMENTS CONSIDERED TO BE RELEVANT

Category* Citation of document, with indication, where appropriate, of the relevant passages Relevant to claim No.

J.-P. VOLKMER ET AL: "Three differentiation states risk-stratify bladder cancer into distinct subtypes", PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES, vol. 109, no. 6, 7 February 2012 (2012-02-07), pages 2078-2083, XP055069880, ISSN: 0027-8424, DOI: 10.1073/pnas.1120605109, the whole document. Relevant to claim No.: 1, 2, 13, 18

X Further documents are listed in the continuation of Box C. See patent family annex.

* Special categories of cited documents:

"A" document defining the general state of the art which is not considered to be of particular relevance
"E" earlier application or patent but published on or after the international filing date
"L" document which may throw doubts on priority claim(s) or which is cited to establish the publication date of another citation or other special reason (as specified)
"O" document referring to an oral disclosure, use, exhibition or other means
"P" document published prior to the international filing date but later than the priority date claimed
"T" later document published after the international filing date or priority date and not in conflict with the application but cited to understand the principle or theory underlying the invention
"X" document of particular relevance; the claimed invention cannot be considered novel or cannot be considered to involve an inventive step when the document is taken alone
"Y" document of particular relevance; the claimed invention cannot be considered to involve an inventive step when the document is combined with one or more other such documents, such combination being obvious to a person skilled in the art
"&" document member of the same patent family

Date of the actual completion of the international search: 28 March 2014

Date of mailing of the international search report: 07/07/2014

Name and mailing address of the ISA: European Patent Office, P.B. 5818 Patentlaan 2, NL - 2280 HV Rijswijk, Tel. (+31-70) 340-2040, Fax: (+31-70) 340-3016

Authorized officer: Cornelis, Karen

C (Continuation). DOCUMENTS CONSIDERED TO BE RELEVANT

X G. SJODAHL ET AL: "A Molecular Taxonomy for Urothelial Carcinoma", CLINICAL CANCER RESEARCH, vol. 18, no. 12, 15 June 2012 (2012-06-15), pages 3377-3386, XP055070151, ISSN: 1078-0432, DOI: 10.1158/1078-0432.CCR-12-0077-T, the whole document. Relevant to claim No.: 1, 2, 13, 18

A CHAKRAVARTI A ET AL: "Expression of the epidermal growth factor receptor and Her-2 are predictors of favorable outcome and reduced complete response rates, respectively, in patients with muscle-invading bladder cancers treated by concurrent radiation and cisplatin-based chemotherapy: A report from the Radiation Therapy O", INTERNATIONAL JOURNAL OF RADIATION: ONCOLOGY BIOLOGY PHYSICS, PERGAMON PRESS, USA, vol. 62, no. 2, 1 June 2005 (2005-06-01), pages 309-317, XP025262773, ISSN: 0360-3016, DOI: 10.1016/J.IJROBP.2004.09.047 [retrieved on 2005-06-01], abstract. Relevant to claim No.: 1

A WO 2010/123354 A2 (UNIV ERASMUS MEDICAL CT [NL]; ZWARTHOFF ELLEN CATHARINA [NL]; VAN TILB) 28 October 2010 (2010-10-28), tables 2, 5, 7. Relevant to claim No.: 1

A M. SANCHEZ-CARBAYO ET AL: "Defining Molecular Profiles of Poor Outcome in Patients With Invasive Bladder Cancer Using Oligonucleotide Microarrays", JOURNAL OF CLINICAL ONCOLOGY, vol. 24, no. 5, 17 January 2006 (2006-01-17), pages 778-789, XP055069669, ISSN: 0732-183X, DOI: 10.1200/JCO.2005.03.2375, the whole document. Relevant to claim No.: 1

A DYRSKJØT L ET AL: "Identifying distinct classes of bladder carcinoma using microarrays", NATURE GENETICS, NATURE PUBLISHING GROUP, NEW YORK, US, vol. 33, 1 January 2003 (2003-01-01), pages 90-96, XP002271899, ISSN: 1061-4036, DOI: 10.1038/NG1061, the whole document. Relevant to claim No.: 1

A BLAVERI EKATERINI ET AL: "Bladder cancer outcome and subtype classification by gene expression", CLINICAL CANCER RESEARCH, THE AMERICAN ASSOCIATION FOR CANCER RESEARCH, US, vol. 11, no. 11, 1 June 2005 (2005-06-01), pages 4044-4055, XP002488795, ISSN: 1078-0432, DOI: 10.1158/1078-0432.CCR-04-2409, the whole document. Relevant to claim No.: 1

Y DAVID J. DEGRAFF ET AL: "Loss of the Urothelial Differentiation Marker FOXA1 Is Associated with High Grade, Late Stage Bladder Cancer and Increased Tumor Proliferation", PLOS ONE, vol. 7, no. 5, 10 May 2012 (2012-05-10), page e36669, XP055070192, DOI: 10.1371/journal.pone.0036669, the whole document. Relevant to claim No.: 1, 2, 13, 18

A WO 2012/018609 A2 (UNIV JOHNS HOPKINS [US]; SIDRANSKY DAVID [US]; CHANG XIAOFEI [US]) 9 February 2012 (2012-02-09), abstract. Relevant to claim No.: 14-18

A WO 2012/009382 A2 (UNIV COLORADO [US]; BARAS ALEX [US]; LEE JAE K [US]; SMITH STEVEN [US]) 19 January 2012 (2012-01-19), table 1. Relevant to claim No.: 1, 4, 13-19, 22-24

A US 2011/262921 A1 (SABICHI ANITA L [US] ET AL) 27 October 2011 (2011-10-27), claim 7. Relevant to claim No.: 1, 4, 13-19, 22-24

A WO 2012/178087 A1 (ONCOCYTE CORP; CHAPMAN KAREN [US]; WEST MICHAEL [US]; WAGNER JOSEPH [U) 27 December 2012 (2012-12-27), paragraph [0159] - paragraph [0164]. Relevant to claim No.: 1, 4, 13-19, 22-24

A ARUM C J ET AL: "Gene Expression Profiling and Pathway Analysis of Superficial Bladder Cancer in Rats", UROLOGY, BELLE MEAD, NJ, US, vol. 75, no. 3, 1 March 2010 (2010-03-01), pages 742-749, XP026939173, ISSN: 0090-4295, DOI: 10.1016/J.UROLOGY.2009.03.008 [retrieved on 2009-08-03], table 1. Relevant to claim No.: 5

WO 2004/060302 A2 (CEMINES LLC [US]; NEUMAN TOOMAS [US]; PALM KAIA [EE]) 22 July 2004 (2004-07-22), abstract. Relevant to claim No.: 1, 5, 6, 13-24

Information on patent family members

Patent document            Publication    Patent family        Publication
cited in search report     date           member(s)            date

WO 2010123354 A2           28-10-2010     CA 2759312 A1        28-10-2010
                                          EP 2421988 A2        29-02-2012
                                          US 2012101023 A1     26-04-2012
                                          WO 2010123354 A2     28-10-2010

WO 2012018609 A2           09-02-2012     EP 2598890 A2        05-06-2013
                                          US 2013190310 A1     25-07-2013
                                          WO 2012018609 A2     09-02-2012

WO 2012009382 A2           19-01-2012     NONE

US 2011262921 A1           27-10-2011     US 2011262921 A1     27-10-2011
                                          WO 2011133981 A1     27-10-2011

WO 2012178087 A1           27-12-2012     AU 2012203810 A1     17-01-2013
                                          CA 2840472 A1        27-12-2012
                                          EP 2723898 A1        30-04-2014
                                          US 2014154691 A1     05-06-2014
                                          WO 2012178087 A1     27-12-2012

WO 2004060302 A2           22-07-2004     AU 2003300368 A1     29-07-2004
                                          CA 2511816 A1        22-07-2004
                                          EP 1583504 A2        12-10-2005
                                          JP 2006524035 A      26-10-2006
                                          US 2004219575 A1     04-11-2004
                                          WO 2004060302 A2     22-07-2004

FURTHER INFORMATION CONTINUED FROM PCT/ISA/210

This International Searching Authority found multiple (groups of) inventions in this international application, as follows:

1. claims: 2-6, 20, 21 (completely); 1, 13-19, 22-24 (partially)

An in vitro method for determining whether a muscle-invasive bladder cancer has a basal-like phenotype by determining the expression level of KRT5, KRT6A and/or KRT6B and the expression level of nuclear FOXA1 in a cancer sample; methods to predict sensitivity to EGFR inhibitor therapy or capecitabine; use of a kit comprising means to detect the genes KRT5, KRT6A, KRT6B and/or FOXA1

AND

An in vitro method for determining whether a muscle-invasive bladder cancer has a basal-like phenotype by determining the expression level of at least 2 genes selected from the group consisting of PI3, KRT6B, CSTA, DSC2, MT1X, RAB38, SFN, SAMD9, EGFR, CD44, IL1RAP, DSP, PKP1, SERPINB7, CELSR2, DUSP7, TBC1D2, ARL4D, IPPK, MTSS1 and RGS20 genes, and the expression level of at least 2 genes selected from the group consisting of PHC1, THYN1, TACC1, PPAP2B, NRXN3, GNA14, ZFHX3, TLE2, MAML3, EPS8, CACNA1D, RAB15, MAN1C1, SORL1, CHN2, TGFBR3, CAB39L, LIMCH1 and BAMBI genes; methods to predict sensitivity to EGFR inhibitor therapy or capecitabine; use of a kit comprising means to detect the expression level of at least 2 genes selected from the group consisting of PI3, KRT6B, CSTA, DSC2, MT1X, RAB38, SFN, SAMD9, EGFR, CD44, IL1RAP, DSP, PKP1, SERPINB7, CELSR2, DUSP7, TBC1D2, ARL4D, IPPK, MTSS1 and RGS20 genes, and the expression level of at least 2 genes selected from the group; use of a DNA chip with nucleic acids specific to at least 2 genes selected from the group consisting of PI3, KRT6B, CSTA, DSC2, MT1X, RAB38, SFN, SAMD9, EGFR, CD44, IL1RAP, DSP, PKP1, SERPINB7, CELSR2, DUSP7, TBC1D2, ARL4D, IPPK, MTSS1 and RGS20 genes, and at least 2 genes selected from the group

AND

An in vitro method for determining whether a muscle-invasive bladder cancer has a basal-like phenotype by determining the expression level of at least one exon selected from the group consisting of the exons of SEQ ID NO: 23 to 41; methods to predict sensitivity to EGFR inhibitor therapy or capecitabine; use of a kit comprising means to detect at least one exon selected from the group consisting of the exons of SEQ ID NO: 23 to 41, use of a DNA chip comprising nucleic acids specific for at least one exon selected from the group consisting of the exons of SEQ ID NO: 23 to 41

2. claims: 11 (completely); 1, 13-19, 22-24 (partially)

An in vitro method for determining whether a muscle-invasive bladder cancer has a basal-like phenotype by determining the DNA methylation status of at least 4 CpG islands selected from the group consisting of CpG islands listed in Table 7; methods to predict sensitivity to EGFR inhibitor therapy or capecitabine; use of a kit comprising means to detect at least 4 CpG islands selected from the group consisting of CpG islands listed in Table 7, use of a DNA chip comprising nucleic acids specific for at least 4 CpG islands selected from the group consisting of CpG islands listed in Table 7

3. claims: 12 (completely); 1, 13-19, 22-24 (partially)

An in vitro method for determining whether a muscle-invasive bladder cancer has a basal-like phenotype by determining the DNA methylation status of at least 4 CpG islands selected from the group consisting of CpG islands listed in Table 9; methods to predict sensitivity to EGFR inhibitor therapy or capecitabine; use of a kit comprising means to detect at least 4 CpG islands selected from the group consisting of CpG islands listed in Table 9, use of a DNA chip comprising nucleic acids specific for at least 4 CpG islands selected from the group consisting of CpG islands listed in Table 9

4. claims: 7-10 (completely); 1, 13-19, 22-24 (partially)

An in vitro method for determining whether a muscle-invasive bladder cancer has a basal-like phenotype by determining the expression level of TGM1; methods to predict sensitivity to EGFR inhibitor therapy or capecitabine; use of a kit comprising means to detect the expression of TGM1, use of a DNA chip comprising nucleic acids specific for TGM1