Supplementary Figure S1

A 40

20 20

0 TCGA 0 −20 PETACC-3 Agendia (GSE42284) −40 Agendia (ICO) −20 Agendia (VHB)

Principal component 2 −60 Principal component 4 Marisa et al. (GSE39582) Schlicker et al. (GSE35896) −80 −40 GSE75316 GSE75315 −60 −40 −20 0 20 40 60 80 −40 −20 0 20 40 Principal component 1 Principal component 3

B 4 3 2 1 0

supplementary Figure S1. Clustering of colorectal cancer BRAF V600E mutant patients. A: Principal component analysis of BRAF mutant patients. The four first principal components which explain the most of data variation are shown. Patients are labeled according to the cohort to which they belong. B: Hierarchical clustering of BRAF mutant patients using the Ward method to compute distance between patients.

Supplementary Figure S2

A Row Z−Score

- 4 - 2 0 2 4 BM1 BM2 200 150 100 50 BRAF V600E mutant patients

1000 2000 3000 4000 5000 6000 7000 8000 9000 .6 .4 .2 0 BM Subtype

B

BM1 BM2 NMF clustering after 0 MSI status correction BM2 BM1 0.5 Unsupervised BM1 51 9 Cohen’s kappa = 0.82 Proportion NMF clustering [95%CI = 0.73-0.90]

6 125 BM subtypes

BM2 MSI corrected

1

supplementary Figure S2. Subtyping of CRC BRAF mutant patients. A: Heatmap displaying genewise and patientwise double clustering of 218 CRC BRAF mutant patients. Clustering was performed using the Ward method to compute distance between genes and between patients. expression is displayed as standardized z-score. B: NMF clustering after adjustment for MSI status, the BM subtype calls were then compared to the original calls computed without any adjustment. High agreement is shown between the two methods by the Cohen’s kappa coefficient of 0.82.

Supplementary Figure S3

EMT signature 0.6 BM Subtype 0.5 0.4 423 0.3 Site 0 0 0 0.2 0 0 53 0 EMT 0 0 8

Enrichment score 0.1 0.0 147 0 0 0 0 0 0 0 0 0 0 0 0 0 0 BM1 BM2 1 0 0 0 48 Ranked 476 genes MSI status Gender KRAS signature 0.30 BM Subtype 0.25 0.20 448 0.15 Site 0 0 0.10 0 0 0 28 0 0.05 KRAS 0 0 8 Enrichment score 0.00 170 0 0 -0.05 0 0 0 0 0 0 0 2 0 0 0 0 1 0 0 0 BM1 BM2 46 Ranked 476 genes MSI status Gender E2F signature 0.0 BM Subtype -0.1 -0.2 -0.3 452 -0.4 Site 0 0 -0.5 0 0 0 -0.6 24 0 E2F 0 0 7 Enrichment score -0.7 -0.8 172 0 0 1 0 0 0 0 0 0 3 0 0 0 0 1 0 0 0 BM1 BM2 45 Ranked 476 genes MSI status Gender G2M signature 0.0 -0.1 BM Subtype -0.2 -0.3 454 -0.4 Site 0 0 -0.5 0 0 0 22 0 -0.6 G2M 0 0 7 Enrichment score -0.7 172 0 0 -0.8 1 0 0 0 0 0 0 4 0 1 0 0 0 0 0 0 BM1 BM2 48 Ranked 476 genes MSI status Gender supplementary Figure S3. Gene Set Enrichment Analysis of BRAF mutant patients. The 476 BRAF mutant specific genes were used to perform a standard GSEA analysis between BM1 and BM2. EMT (MSigDB ref: M5930) and KRAS (MSigDB ref: M5953) signatures were found enriched in BM1 while G2M (MSigDB ref: M5901) and E2F (MSigDB ref: M5925) signatures were enriched in BM2. Venn diagrams on the right side highlight the number of genes that are shared or specific between the respective signatures and the indicated clinical factors. Supplementary Figure S4

BM1 BRAF/KRAS WT (WT2) Positive signatures BM2 KRAS mutant BM subtypes Vs WT2 Negative signatures COMPLEMENT KRAS SIGNALING COAGULATION EPITHELIAL MESENCHYMAL TRANSITION INFLAMMATORY RESPONSE w Z−Score R o UV RESPONSE 3 APICAL JUNCTION 2 1

MYOGENESIS 0

TNFA SIGNALING VIA NFKB −1

GLYCOLYSIS −2 E2F TARGETS −3 G2M CHECKPOINT MITOTIC SPINDLE ESTROGEN RESPONSE LATE

G2M CHECKPOINT MTORC1 SIGNALING MITOTIC SPINDLE E2F TARGETS ALLOGRAFT REJECTION GLYCOLYSIS IL2 STAT5 SIGNALING MYOGENESIS EPITHELIAL MESENCHYMAL TRANSITION APICAL JUNCTION ESTROGEN RESPONSE EARLY XENOBIOTIC METABOLISM COAGULATION UV RESPONSE APOPTOSIS

BM subtypes Vs KRAS mutant COMPLEMENT INFLAMMATORY RESPONSE KRAS SIGNALING EPITHELIAL MESENCHYMAL TRANSITION COAGULATION UV RESPONSE MYOGENESIS GLYCOLYSIS E2F TARGETS ESTROGEN RESPONSE LATE MITOTIC SPINDLE G2M CHECKPOINT

G2M CHECKPOINT MITOTIC SPINDLE E2F TARGETS MTORC1 SIGNALING ALLOGRAFT REJECTION GLYCOLYSIS MYOGENESIS ESTROGEN RESPONSE EARLY APICAL JUNCTION EPITHELIAL MESENCHYMAL TRANSITION APOPTOSIS XENOBIOTIC METABOLISM supplementary Figure S4. Pathway analysis comparison between BM subtypes, KRAS/BRAF wild-type and KRAS mutant CRCs. Single sample gene set enrichment analysis (ssGSEA) was performed for BM1 versus KRAS/BRAF wild-type (WT2) patients (top panel, top heatmap, 69 patients per group) or versus KRAS mutant patients (bottom panel, top heatmap, 69 patients per group) and similarly for BM2 versus WT2 patients (top panel, bottom heatmap, 149 patients per group) or versus KRAS mutant patients (bottom panel, bottom heatmap, 149 patients per group). These analyses were performed using the hallmark gene signature collection from MSigDB and using the only the 476 BM-specific genes. Supplementary Figure S5

Signature scores FDR

Row scaled average signature score

-1 -0.5 0 0.5 1

BM1 WT2 KRAS mut BM2 All genes 476 BM-specific genes MITOTIC SPINDLE MYC TARGETS CHOLESTEROL HOMEOSTASIS Enriched OXIDATIVE PHOSPHORYLATION in UV RESPONSE (DOWN) GLYCOLYSIS BM2 UNFOLDED PROTEIN RESPONSE 40 E2F TARGETS MTORC1 SIGNALING 30 DNA REPAIR G2M CHECKPOINT 20 BILE ACID METABOLISM PEROXISOME 10 MYC TARGETS FATTY ACID METABOLISM 0 PI3K/AKT/MTOR SIGNALING -log10(FDR) 10 SPERMATOGENESIS PROTEIN SECRETION 20 UV RESPONSE (UP) HYPOXIA Enriched APICAL_SURFACE in IL2/STAT5 SIGNALING TNFA SIGNALING VIA NFKB BM1 APOPTOSIS P53 PATHWAY COMPLEMENT INFLAMMATORY RESPONSE KRAS SIGNALING IL6/JAK/STAT3 SIGNALING ALLOGRAFT REJECTION INTERFERON ALPHA RESPONSE INTERFERON GAMMA RESPONSE PANCREAS BETA CELLS REACTIVE OXIGEN SPECIES PATHWAY HEME METABOLISM XENOBIOTIC METABOLISM WNT/BETA-CATENIN SIGNALING ESTROGENE RESPONSE LATE ANDROGEN RESPONSE LATE ANGIOGENESIS COAGULATION EPITHELIAL MESENCHYMAL TRANSITION KRAS SIGNALING (UP) TGF-B SIGNALING MYOGENESIS UV RESPONSE (DOWN) APICAL JUNCTION NOTCH SIGNALING HEDGEHOG SIGNALING ESTROGEN RESPONSE EARLY ADIPOGENESIS KRAS SIGNALING (DOWN) supplementary Figure S5. Molecular characterization of BM subtypes. Heatmap displaying the average ssGSEA hallmark signature scores for BM1, BM2, KRAS/BRAF wild-type (WT2) and KRAS mutant using all gene profiles (compared to Figure 3 where only the 476 BM-specific genes were used). False discovery rates adjusted p-values (FDR) for BM1 versus BM2 statistical analyses appear on the right under the form of a heatmap for both all genes and for the 476 BM-specific genes.

Supplementary Figure S6

M2882 M1417

44 182 M5930 M5953 M11513 M2878 26 1 8 1 42 6 0 0 1 1 0 15 163 0 132 130 0 0 0 2 0 0 1 2 19 0 7 4 0 0 0 0 0 0 4 0 0 0 0 3 0 6 1 0 0 0 0 0 0 0 5 0 5 0 1 1 170 25 194 79 EMT signatures KRAS signatures M8191 M5956 M10573 M5915 APICAL JUNCTION M12829

43 M1484 M1396 M2632 M5925 4 20 M8627 151 116 M5901 0 4 1 3 65 8 136 0 2 9 0 60 1 11 0 0 37 1 0 0 12 113 0 0 12 le signatures 12 0 0 0 0 1 0 y c 11 0 10 339 3 9 1 1 4

Cell c M543 M14894

MTORC1 HALLMARK M7561 M16909 M5924 46 535 M1919 1 3 15

176 1 1 329

0 11 0

mTOR-related signatures 1 0 10

supplementary Figure S6. MSigDB gene set overlap. Venn diagram showing the number of genes that are overlap between collections of signatures tested in Figure 3. Supplementary Figure S7

100 Neutrophils

Eosinophils

Mast cells resting

Dendritic cells activated

Dendritic cells resting

75 Macrophages M2

Macrophages M1

Macrophages M0

Monocytes

NK cells activated 50 NK cells resting T cells gamma delta

T cells regulatory (Tregs)

T cells follicular helper

Relative percentage T cells CD4 memory activated 25 T cells CD4 memory resting T cells CD4 naive

T cells CD8

Plasma cells

B cells memory

B cells naive 0

BM1 BM2

Mast cells activated Dendritic cells resting Macrophages M2 Macrophages M1 Monocytes Plasma cells 40 40 20 30 15 30 20 30 15

20 20 10 20 10 10 10 5 10 10 5

Relative percentage 0 0 0 0 0 0

FDR 5.1.10-3 3.5.10-2 2.8.10-6 3.8.10-2 5.5.10-3 5.5.10-3

Macrophage-associated genes

4

CYBB 2 CD163 CD84 0 MS4A4A GM2A

row z-score −2 CD68 MARCO −4 APOE CHIT1 CHI3L1 BCAT1 Subtype FN1 RAI14 BM1 COLEC12 BM2 COL8A2 SCARB2 ATG7 MSI status CXCL5 MSS SCG5 GPC4 MSI ME1 Not available FDX1 supplementary Figure S7. Immune profiles of BM subtypes. A CIBERSORT analysis that reveals the immune landscape of BM subtypes in terms of relative percentage for each immune subset (top panel). The bottom panel displays the relative percentage of immune subsets that were found significantly enriched (FDR < 0.01) for each BM subtypes. Bottom heatmap displays the expression of macrophage-associated genes according to BM subtypes. MSI status is displayed as well. Supplementary Figure S8 β-value A TCGA patients 0 0.2 0.4 0.6 0.8 Illumina 450k 1 BRAF WT BRAF Mutant Methylated Not methylated

Probes more methylated in BRAF mutant 15’591 / 374’801 (4.2 %)

Probes more methylated in BRAF WT

196 / 374’801 (0.05 %)

Differentially methylated Differentially probes (15’591) Probes not significantly different 359’014 / 374’801 (95.8 %) B TCGA patients Illumina 27k Illumina 450k

BM1 BM2

MSS

All DNA methylation probes All DNA in common between the two platforms (22’333) MSI C Illumina 27k Illumina 450k MLH1 methylation probes D Illumina 27k Illumina 450k

SLC16A8 CLDN15 RPS6KA2 SPAG6 C16orf95 NGEF KLK8 TIMP2 METRN DMRT2 NGEF TBC1D13 PLCH2 PSD KIF17 AHRR RHOT1 FAM124B NLRP2 CCBP2 COL18A1 C9orf47 NCRNA00176 DNAJB6 MAPK15 MTERF QSOX2 PSTPIP1 MTERF MCAM PCDHB12 SSTR5 TSSC1 SLC27A6 RAMP1 TRIM29 CD4 SECTM1 HIST1H3H PSMG3 GRAMD2 HSD17B7P2 GDPD5 HLA−DPB2 HSD17B7 GRIN2A AGK LINC00982 CD40 NR5A2 FBXO34

Differentially methylated probes Differentially VPS18 GRB14 SOGA1 supplementary Figure S8. Methylation profiles of BRAF mutant subtypes. A: In TCGA patients, general methylation profiles were assessed between BRAF wild-type and BRAF mutant patients. More than 95% of the probes had similar similar β-values between these two groups. B: The analysis of two different methylation platforms used by TCGA shows that the methylation profiles between BM1 and BM2 patients is overall highly similar with few differences that are obviously driven by the MSI status. C: As the MSI status was enriched according to BM subtypes, we assessed the methylation status of several MLH1 probes and did not find any difference between BM subtypes. Rather, the differences were driven by the MSI status. D: List of gene probes that are differentially methylated between BM1 and BM2 and according to analyses made in different methylation platforms. Supplementary Figure S9

Definition of relapse and time zero

TCGA PETACC3 GSE39582 Agendia Local metastasis Regional metastasis Distant metastasis Death Secondary Primary

randomiz- diagnosis surgery surgery Time zero ation

Events and censoring time OS RFS Events Med. Cens [d] Events Med. Cens [d] TCGA 1 213.5 NA NA PETACC3 20 2117.5 21 2121 GSE39582 10 1734.9 12 1734.9 Agendia 19 2678.5 21 2678.5

Clinical variables per cohort Site MSI status Gender Stage BM subtype Left Right Rect MSS MSI F M I II III BM1 BM2 TCGA 0 % 100 % 0 % 16 % 84 % 61 % 39 % 25 % 52 % 23 % 23 % 77 % PETACC3 24 % 76 % 0 % 60 % 38 % 46 % 54 % 0 % 20 % 80 % 32 % 68 % GSE39582 11 % 89 % 0 % 16 % 71 % 69 % 31 % 4 % 49 % 47 % 38 % 62 % Agendia 29 % 66 % 5 % 21 % 33 % 76 % 24 % 5 % 42 % 53 % 24 % 76 %

TCGA RFS OS Gender Tumor Site Stage MSI status BM subtype PETACC3 RFS OS Gender Tumor Site Stage MSI status BM subtype GSE39582 RFS OS Gender Tumor Site Stage MSI status BM subtype Agendia RFS OS Gender Tumor Site Stage MSI status BM subtype

OS/RFS Dead / Relapsed Gender Male Site Proximal Stage I MSI status MSS Subtype BM1 Alive Female Distal II MSI BM2 Rectal III Not available

supplementary Figure S9. Characteristics of patients enrolled in the survival analysis. In the four cohort used in the survival analysis, the tables at the top of the figures display how the relapses and time zeros are defined, the number of survival events for overall survival (OS) and relapse-free survival (RFS) and the median of the censoring time (Med. Cens.) in days. For each clinical variables used in the survival analysis, the percentage of features is shown for each cohort. Heatmaps also show what the clinical features of each patient are. Supplementary Figure S10 44 genes 0.35 0.30 selected lambda 0.25 0.20 0.15 Misclassication error 0.10

−20 −15 −10 −5 log(lambda)

supplementary Figure S10. Performance of the BM classifier. We built a generalized logistic model classifier for BM1/BM2. We performed one hundred leave-one-out cross-validation steps and selected a penalty coefficient lambda that minimizes the misclassification error. This plot shows the misclassifica- tion error rate of the hundred cross-validations according to the lambda (in the log scale). Error bars represents standard deviations. The selected lambda results in the selection of 44 non-zero coefficient genes (see list in supplementary Table S2) with a misclassification error of 11%.

Supplementary Figure S11 Sanger

Area under the curve ABT−263 PLX4720 GDC−0449 AZD7762 CI−1040 0 0.2 0.4 0.6 0.8 1 AMG−706 Vorinostat RDEA119 Resistance to drug RO−3306 SB 216763 PD−0325901 GDC0941 AG−014699 BRAF WT ABT−888 ZM−447439 BRAF mutant CCT018159 Docetaxel Cell lines predicted to be: Cytarabine NU−7441 KU−55933 BM1

O BM2 R K SW48 HCT15 SW837 LS1034 C2BBe1 SW1417 SW1116

CCLE

Drug Activity AZD0530 Topotecan 17−AAG TKI258 0 2 4 6 PLX4720 Paclitaxel Panobinostat Resistance to drug TAE684 RAF265 Sorafenib AZD6244 PD−0332991 ZD−6474 L−685458 Erlotinib Nutlin−3 PD−0325901 Lapatinib LBW242 PHA−665752 PF2341066 AEW541 O R K SW48 HCT15 C2BBe1 SW1417 supplementary Figure S11. Cancer cell line drug sensitivity analyses. Cell lines from the Sanger (A) and CCLE (B) databases were classified as BM1 or BM2 cell lines and their sensitivity to a panel of different drugs was assessed by differential analyses. Drugs from the top display the most different sensitivity between BM1 and BM2 cell lines.