SUPPLEMENTARY MATERIALS AND METHODS

Stable KHSRP knock-down cell line

An ultracomplex pooled shRNA library targeting each annotated human protein-coding gene with 25 independent shRNAs on average, as well as 500 negative control shRNAs, was previously described and used to perform genome-wide genetic interaction screens in mammalian cells 1. From this original library, 3 independently validated shRNA sequences targeting KHSRP were selected, along with one negative control scramble shRNA sequence; for each sequence, top and bottom oligonucleotides were synthesized (Integrated DNA Technologies, Coralville, IA, USA). The shRNA-expressing SW620-derivative cell line was generated following a previously published protocol 2. The lentiviral plasmid used for shRNA expression was pMK1201, a modified version of pMK1200 2 based on the design of pINDUCER10, which provides tetracycline-induced expression of the shRNA coupled with a fluorescent reporter (tRFP) and a puromycin resistance sequence for positive selection 3. To clone each individual shRNA into the lentiviral backbone, 2 µl of top and bottom oligonucleotides were annealed at 95°C for 5 min in a buffer containing 100 mM potassium acetate, 30 mM HEPES-KOH (pH 7.4) and 2 mM magnesium acetate. Annealed oligonucleotides (0.01 µM) were ligated into 25 ng of the pMK1201 vector, pre-digested with BstXI (New England Biolabs, Ipswich, MA, USA) and gel purified; ligation was carried out at RT for 2 h using 2000 U of T4 DNA ligase (New England Biolabs). DH5α cells (Thermo) were transformed and plated onto ampicillin-containing LB plates overnight at 37°C. Single colonies were picked and expanded for plasmid DNA mini prep (Qiagen). Correct insertion of shRNA sequences was confirmed by sequencing using the 5’ pSico-Eco-insert-seq primer 2. For lentivirus preparation, the second-generation virion packaging vector psPAX2 (plasmid #12260, Addgene, Cambridge, MA, USA) and the VSV-G envelope plasmid pMD2.G (plasmid #12259, Addgene) from Didier Trono (Lausanne) were used.
The producer cell line was 293T, a highly transfectable derivative of the human embryonic kidney 293 cell line (ATCC), which was maintained in DMEM supplemented with 10% fetal bovine serum, 1% penicillin/streptomycin and 1% Fungizone (Invitrogen Life Technologies). 293T cells were transfected in 10 cm plates using the calcium phosphate method: 2.5 μg of scramble shRNA vector, or a pooled mix of 0.83 μg each of the three KHSRP shRNA vectors, was combined with 0.58 μg of pMD2.G and 1.92 μg of psPAX2 and transfected with 12.5 μM chloroquine diphosphate and 12.5 mM calcium chloride. Transfection medium was replaced after 16 h with full DMEM, and virus-containing conditioned medium was harvested twice, at 48 h and 72 h post-transfection; pooled harvests from each condition were pre-cleared by centrifugation, filtered through a 0.45 μm membrane, and used immediately to transduce SW620 cells in 6-well plates at different dilutions (neat, 1:2, 1:4, 1:10 and 1:20 in full DMEM). Cells were selected with 1 μg/ml puromycin for 2 weeks, expanded and frozen as stocks. A fresh stock of cells was used for each experiment.
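As a quick arithmetic check of the transfection mix described above, the following Python sketch totals the plasmid DNA per 10 cm plate and computes virus volumes for the dilution series. The function names, the 2 ml well volume, and the reading of "1:n" as one part virus in n parts total are illustrative assumptions, not details taken from the protocol.

```python
# Plasmid amounts per 10 cm plate in micrograms, as stated in the text.
SCRAMBLE_VECTOR_UG = 2.5
KHSRP_VECTOR_UG_EACH = 0.83   # three KHSRP shRNA vectors are pooled
PMD2G_UG = 0.58               # VSV-G envelope plasmid
PSPAX2_UG = 1.92              # packaging plasmid


def total_dna_ug(transfer_vector_ug):
    """Total plasmid DNA per plate for a given amount of transfer vector."""
    return transfer_vector_ug + PMD2G_UG + PSPAX2_UG


def virus_volume_ml(dilution_factor, well_volume_ml=2.0):
    """Volume of conditioned medium for a 1:dilution_factor dilution.

    Assumption: "1:n" means one part virus in n parts total; the 2 ml
    well volume is a hypothetical value for a 6-well plate.
    """
    return well_volume_ml / dilution_factor


pooled_khsrp_ug = 3 * KHSRP_VECTOR_UG_EACH          # about 2.49 ug
scramble_total = total_dna_ug(SCRAMBLE_VECTOR_UG)   # about 5.0 ug
pooled_total = total_dna_ug(pooled_khsrp_ug)        # about 4.99 ug
# Dilution factor 1 corresponds to neat virus.
dilution_volumes = {n: virus_volume_ml(n) for n in (1, 2, 4, 10, 20)}
```

Under these assumptions the scramble and pooled KHSRP conditions receive nearly the same total DNA mass, which is the point of using 0.83 μg of each of the three vectors.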

Gene Expression Microarray

Total RNA was extracted from cells using the RNeasy Plus kit (Qiagen); RNA integrity and concentration were confirmed using the RNA 6000 Nano kit on a Bioanalyzer 2100 (Agilent), with reported RNA Integrity Numbers (RIN) > 9. Amplified cDNA for gene expression analysis was prepared with the Ovation® PicoSL WTA System V2 (Nugen, San Carlos, CA, USA); labeled cDNA targets were generated with the Encore® Biotin Module (Nugen) and hybridized to a GeneChip® Human Transcriptome Array 2.0 (Affymetrix, Thermo Fisher) following the manufacturer’s instructions. CEL files were analysed with the Affymetrix® Expression Console™ Software, using the Affymetrix Human Transcriptome Array 2.0 library files and the HTA-2_0.na35.2.hg19 annotation files. An RMA workflow was applied, using quantile normalization and a general background correction. The resulting normalised CHP files were then imported into the Affymetrix® Transcriptome Analysis Console software to test for differential expression using a paired one-way repeated-measures ANOVA approach and the default filter criteria (|fold change| > 2 and p-value < 0.05). The full data have been deposited in NCBI's Gene Expression Omnibus (https://www.ncbi.nlm.nih.gov/geo/) and can be accessed with the GEO Series accession number GSE112329. Over-representation analysis was used to identify canonical pathways and functional processes of biological importance within the list of differentially expressed genes; the analysis was performed using GeneTrail2 (version 1.5, https://genetrail2.bioinf.uni-sb.de) 4. Network analysis was performed using STRING (version 10.0, https://string-db.org/) 5. STRING is a database of known and predicted protein-protein interactions, including both direct (physical) and indirect (functional) associations, inferred from experimental data and computational predictions.
STRING computes a global score by combining the probabilities from the different types of evidence and correcting for the probability of randomly observing an interaction. Proteins in the network are then clustered based on the distance matrix obtained from the global scores, using the KMEANS clustering algorithm 6.
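The score combination described above can be sketched as follows. This is a minimal illustration of the scheme commonly described for STRING, assuming each evidence channel contributes an independent probability; the prior value of 0.041 and the function name are assumptions, not values stated in this document.

```python
ASSUMED_PRIOR = 0.041  # assumed probability of observing an interaction at random


def combined_score(channel_scores, prior=ASSUMED_PRIOR):
    """Combine per-channel probabilities into one global score.

    Each channel score is first corrected for the prior, the corrected
    scores are combined as the probability that at least one channel
    supports the interaction, and the prior is then added back once.
    """
    no_support = 1.0
    for score in channel_scores:
        corrected = max(score - prior, 0.0) / (1.0 - prior)
        no_support *= 1.0 - corrected
    combined = 1.0 - no_support
    return combined + prior * (1.0 - combined)


single_channel = combined_score([0.5])      # one channel round-trips to itself
two_channels = combined_score([0.5, 0.5])   # two channels reinforce each other
```

With this formulation a single channel returns its own score unchanged, while agreeing channels raise the global score above any individual one. A distance for clustering could then be derived from the global score (for example, one minus the score), although the exact transformation used is not given in the text.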

Shotgun Proteomics

SW620-M2 cells were treated with vehicle or 500 ng/ml doxycycline for 4 days to induce shRNA-mediated knockdown of KHSRP, as outlined above. Cells were then incubated in serum-free medium for 16 h, and triplicate conditioned media for each condition were harvested, buffer exchanged and concentrated to a final volume of 300 µl in PBS using Amicon centrifugal filter units with a 3 kDa cut-off membrane (MilliporeSigma, Burlington, MA, USA). From each sample, 10 µg of total protein was incubated with 6 M urea and 10 mM DTT for 20 min at 55°C. Samples were alkylated with 12.5 mM iodoacetamide in the dark at RT for 1 h, quenched with 10 mM DTT, and diluted 3-fold into 25 mM ammonium bicarbonate. Trypsin digestion was performed overnight at 37°C with a 1:50 mass ratio of sequencing-grade trypsin (Promega) to total protein. Samples were acidified to approximately pH 2 with formic acid, and peptides were desalted with C18 Desalting Tips (Rainin, Oakland, CA, USA), lyophilized, and rehydrated in 0.2% formic acid. Peptide sequencing by LC-MS/MS was performed on an LTQ Orbitrap Velos mass spectrometer (Thermo) equipped with a nanoACQUITY (Waters) ultraperformance LC (UPLC) system and an EASY-Spray PepMap C18 column (Thermo, ES800; 3-µm bead size, 75 µm by 150 mm) for reversed-phase chromatography. Peptides were eluted with a linear gradient from 2% to 50% (vol/vol) acetonitrile in 0.1% formic acid over 60 min. MS and MS/MS spectra were acquired in a data-dependent mode, with up to six higher-energy collisional dissociation (HCD) MS/MS spectra acquired for the most intense parent ions per MS scan. For data analysis, MS peak lists were generated with in-house software called PAVA, and database searches were performed with Protein Prospector software, v.5.16.1 (http://prospector.ucsf.edu/prospector/mshome.htm, UCSF) 7 against the SwissProt human protein database (2017.11.01).
The database was concatenated with an equal number of fully randomized entries for estimation of the false-discovery rate (FDR). Database searching was carried out with tolerances of 20 ppm for parent ions and 0.8 Da for fragment ions. Peptide sequences were matched as tryptic peptides with up to two missed cleavages, and carbamidomethylated cysteines as constant modification. Variable modifications included oxidation of methionine, N-terminal pyroglutamate from glutamine, loss of methionine and N-terminal acetylation. The following Protein Prospector score thresholds were selected to yield a maximum protein FDR of less than 1%: a minimum “protein score” of 22 and a minimum “peptide score” of 15, with maximum expectation values of 0.01 for protein and 0.05 for peptide matches. The list of proteins was further curated to include only proteins identified in at least 2 out of 3 biological replicates, and with a minimum of two unique peptides identified in at least one biological replicate. The natural log transformation of NSAF values and t-test analysis were performed as previously described 8. GO pathway enrichment analysis was performed using the DAVID 6.8 database (https://david.ncifcrf.gov) 9. Raw data are publicly available for download on the MassIVE database (ftp://massive.ucsd.edu/MSV000082206).
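The NSAF step mentioned above can be illustrated with a short Python sketch. It uses the standard definition of the normalized spectral abundance factor (spectral count divided by protein length, normalized to the sum over all proteins in the sample). The function names and the pseudocount used to avoid taking the log of zero are hypothetical; the exact handling of zero counts in the original analysis is not stated here.

```python
import math


def nsaf(spectral_counts, lengths):
    """Normalized spectral abundance factors for one sample.

    SAF_i = count_i / length_i, and NSAF_i = SAF_i / sum of all SAF.
    """
    saf = [count / length for count, length in zip(spectral_counts, lengths)]
    total = sum(saf)
    return [value / total for value in saf]


def ln_nsaf(spectral_counts, lengths, pseudocount=0.5):
    """Natural log of NSAF values.

    The pseudocount is a hypothetical choice that keeps proteins with
    zero spectral counts finite after the log transformation.
    """
    shifted = [count + pseudocount for count in spectral_counts]
    return [math.log(value) for value in nsaf(shifted, lengths)]


# Two illustrative proteins: 10 spectra over 100 residues, 20 over 400.
example_nsaf = nsaf([10, 20], [100, 400])
example_ln = ln_nsaf([0, 5], [100, 100])
```

The ln(NSAF) values from replicate samples of each condition would then be compared protein by protein with a t-test, for instance with `scipy.stats.ttest_ind` if SciPy is available.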


Supplementary Table 1: details of the differential expression analysis for KHSRP in CRC datasets from the Oncomine database. For each analysis the corresponding dataset and citation is reported, along with the KHSRP gene rank, p-value (Student’s t-test, corrected for multiple hypothesis testing using the false discovery rate method), and fold change. Percentile rank is reported, and is also shown by the degree of color saturation of the left-side box. The more saturated the color, the higher the percentile rank. The most highly saturated boxes denote that the gene rank is in the top 1 percentile for that analysis; medium saturation in the top 5 percentile, and pale in the top 10 percentile (according to the color legend at the bottom of the table).


Supplementary Table 2: details of the differential expression analysis for KHSRP in pan-cancer datasets from the Oncomine database. Each analysis is reported with the associated dataset in Oncomine, and the corresponding citation. Percentile rank is shown by the degree of color saturation of the left-side box, as described in Supplementary Table 1.


Supplementary Table 3: list of the 135 differentially regulated genes in the microarray. Log2-normalized fold change, and associated p-values, are reported for the comparison of SW480 cells transfected with siKHSRP versus siSCR control.

Transcript Cluster ID | Gene Symbol | Description | Fold Change (siKHSRP vs. siSCR) | p-value
TC15001577.hg.1 | SCARNA14 | small Cajal body-specific RNA 14 | 17.12 | 0.040413
TC16000473.hg.1 | MT1F | metallothionein 1F | 5.67 | 0.041758
TC14001369.hg.1 | SNORA79 | small nucleolar RNA, H/ACA box 79 | 5.4 | 0.024183
TC01000388.hg.1 | RNU11 | RNA, U11 small nuclear | 4.93 | 0.038545
TC16000476.hg.1 | MT1X | metallothionein 1X | 4.92 | 0.042708
TC19000722.hg.1 | SNAR-G1 | small ILF3/NF90-associated RNA G1 | 4.71 | 0.038779
TC08000259.hg.1 | SNORD13 | small nucleolar RNA, C/D box 13 | 4.34 | 0.021887
TC03000498.hg.1 | LINC00973 | long intergenic non-protein coding RNA 973; novel transcript | 4.25 | 0.03006
TC16001576.hg.1 | MT1JP | metallothionein 1J, pseudogene | 4.2 | 0.037313
TC12001144.hg.1 | SCARNA11 | small Cajal body-specific RNA 11 | 3.94 | 0.042436
TC03002732.hg.1 | HES1 | hes family bHLH transcription factor 1 | 3.59 | 0.033088
TC16002035.hg.1 | MT1A | metallothionein 1A | 3.47 | 0.029922
TC01002578.hg.1 | SLC2A1 | solute carrier family 2 (facilitated glucose transporter), member 1 | 3.31 | 0.033427
TC12003283.hg.1 | TMBIM4 | transmembrane BAX inhibitor motif containing 4 | 3.24 | 0.04265
TC0X001346.hg.1 | LOC101928402; RP1-315G1.3 | uncharacterized LOC101928402; putative novel transcript | 3.1 | 0.010148
TC0X000788.hg.1 | MIR1184-1; MIR1184-2; MIR1184-3 | microRNA 1184-1; microRNA 1184-2; microRNA 1184-3 | 3.09 | 0.008648
TC0X001548.hg.1 | MIR1184-1; MIR1184-2; MIR1184-3 | microRNA 1184-1; microRNA 1184-2; microRNA 1184-3 | 3.09 | 0.008648
TC0X001553.hg.1 | MIR1184-1; MIR1184-2; MIR1184-3 | microRNA 1184-1; microRNA 1184-2; microRNA 1184-3 | 3.09 | 0.008648
TC12000295.hg.1 | KIAA1551 | KIAA1551 | 3.08 | 0.032332
TC01005500.hg.1 | HSPB11 | heat shock protein family B (small), member 11 | 2.94 | 0.019518
TC19000634.hg.1 | APOC1 | apolipoprotein C-I | 2.92 | 0.034765
TC03001888.hg.1 | TM4SF1 | transmembrane 4 L six family member 1 | 2.79 | 0.015873
TC11001251.hg.1 | POLR2L | polymerase (RNA) II (DNA directed) polypeptide L, 7.6kDa | 2.71 | 0.024345
TC01002597.hg.1 | RNU5F-1 | RNA, U5F small nuclear 1 | 2.65 | 0.024963
TC19002097.hg.1 | FXYD5 | FXYD domain containing ion transport regulator 5 | 2.63 | 0.026623
TC08000829.hg.1 | ZNF623 | zinc finger protein 623 | 2.57 | 0.016508
TC12002888.hg.1 | KRT5 | keratin 5 | 2.57 | 0.041426
TC20001194.hg.1 | DNMT3B | DNA (cytosine-5-)-methyltransferase 3 beta | 2.55 | 0.008502
TC07000235.hg.1 | ANLN | anillin, actin binding protein | 2.55 | 0.009929
TC17002618.hg.1 | TOP2A | topoisomerase (DNA) II alpha 170kDa | 2.54 | 0.025983
TC11002727.hg.1 | NEAT1; MIR612 | nuclear paraspeckle assembly transcript 1 (non-protein coding); microRNA 612 | 2.54 | 0.041208
TC08001143.hg.1 | RP11-90P5.7 | U6 spliceosomal RNA (U6 snRNA) | 2.53 | 0.037878
TC16001135.hg.1 | MT1G | metallothionein 1G | 2.52 | 0.025686
TC06003665.hg.1 | CLIC5 | chloride intracellular channel 5 | 2.52 | 0.027029
TC18000953.hg.1 | KDSR | 3-ketodihydrosphingosine reductase | 2.52 | 0.031646
TC01005732.hg.1 | CD58 | CD58 molecule | 2.51 | 0.007847
TC07002518.hg.1 | BCAP29 | B-cell receptor-associated protein 29 | 2.5 | 0.02553
TC10000997.hg.1 | RP13-463N16.6 | putative novel transcript | 2.49 | 0.002268
TC12003109.hg.1 | TESC | tescalcin | 2.48 | 0.005011
TC05000872.hg.1 | THG1L | tRNA-histidine guanylyltransferase 1-like (S. cerevisiae) | 2.47 | 0.008288
TC08001467.hg.1 | RP11-410L14.2 | novel transcript | 2.41 | 0.036623
TC01005824.hg.1 | S100A10 | S100 calcium binding protein A10 | 2.39 | 0.023355
TC12002907.hg.1 | ERBB3 | v-erb-b2 avian erythroblastic leukemia viral oncogene homolog 3 | 2.35 | 0.026689
TC09001056.hg.1 | RMRP | RNA component of mitochondrial RNA processing endoribonuclease | 2.34 | 0.034312
TC15000603.hg.1 | RNU5B-1 | RNA, U5B small nuclear 1 | 2.34 | 0.047444
TC02003993.hg.1 | TMSB4XP2 | thymosin beta 4, X-linked pseudogene 2 | 2.34 | 0.047486
TC13001100.hg.1 | CKAP2 | cytoskeleton associated protein 2 | 2.34 | 0.049599
TC16002074.hg.1 | MT1M | metallothionein 1M | 2.33 | 0.007886
TC09000687.hg.1 | TMSB4XP4 | thymosin beta 4, X-linked pseudogene 4 | 2.33 | 0.04441
TC19002126.hg.1 | C19orf33 | chromosome 19 open reading frame 33 | 2.32 | 0.039592
TC10002105.hg.1 | DDX21 | DEAD (Asp-Glu-Ala-Asp) box helicase 21 | 2.32 | 0.039948
TC12000268.hg.1 | REP15 | RAB15 effector protein | 2.3 | 0.039496
TC06003152.hg.1 | SNX9 | sorting nexin 9 | 2.3 | 0.049966
TC12002547.hg.1 | TDG | thymine-DNA glycosylase | 2.29 | 0.046655
TC16000472.hg.1 | MT1B | metallothionein 1B | 2.28 | 0.036333
TC01002452.hg.1 | FABP3 | fatty acid binding protein 3, muscle and heart (mammary-derived growth inhibitor) | 2.26 | 0.000037
TC08000823.hg.1 | RP11-661A12.4 | 7SK RNA | 2.26 | 0.040453
TC01003751.hg.1 | NUCKS1 | nuclear casein kinase and cyclin-dependent kinase substrate 1 | 2.25 | 0.003626
TC11000295.hg.1 | ARL14EP | ADP-ribosylation factor-like 14 effector protein | 2.25 | 0.040296
TC02001171.hg.1 | NDUFB3 | NADH dehydrogenase (ubiquinone) 1 beta subcomplex, 3, 12kDa | 2.24 | 0.035406
TC14001078.hg.1 | LOC101927418 | uncharacterized LOC101927418 | 2.23 | 0.00557
TC20000126.hg.1 | LINC00493 | long intergenic non-protein coding RNA 493 | 2.23 | 0.02879
TC19001392.hg.1 | RN7SKP22 | RNA, 7SK small nuclear pseudogene 22 | 2.23 | 0.047348
TC19002500.hg.1 | ETHE1 | ethylmalonic encephalopathy 1 | 2.22 | 0.001952
TC16000857.hg.1 | EMP2 | epithelial membrane protein 2 | 2.22 | 0.003163
TC18000676.hg.1 | ZNF271 | zinc finger protein 271 | 2.22 | 0.026613
TC13000147.hg.1 | RGCC | regulator of cell cycle | 2.21 | 0.003909
TC13000352.hg.1 | RN7SKP9 | RNA, 7SK small nuclear pseudogene 9 | 2.21 | 0.021926
TC12002517.hg.1 | MRPL42 | mitochondrial ribosomal protein L42 | 2.21 | 0.03279
TC09000601.hg.1 | TLR4 | toll-like receptor 4 | 2.2 | 0.024108
TC07001869.hg.1 | MIR29A; MIR29B1 | microRNA 29a; microRNA 29b-1 | 2.19 | 0.013964
TC08000415.hg.1 | KRT8 | keratin 8 | 2.17 | 0.00153
TC01003029.hg.1 | CD58 | CD58 molecule | 2.17 | 0.009806
TC04002781.hg.1 | SH3D19 | SH3 domain containing 19 | 2.17 | 0.026195
TC12000487.hg.1 | METTL7B | methyltransferase like 7B | 2.16 | 0.045231
TC17002755.hg.1 | HELZ | helicase with zinc finger | 2.14 | 0.021481
TC19002708.hg.1 | PSG1 | pregnancy specific beta-1-glycoprotein 1 | 2.14 | 0.043093
TC12000946.hg.1 | DYNLL1 | dynein, light chain, LC8-type 1 | 2.14 | 0.045171
TC01006221.hg.1 | MT1HL1 | metallothionein 1H-like 1 | 2.13 | 0.038449
TC11000700.hg.1 | RNU6-46P | RNA, U6 small nuclear 46, pseudogene | 2.13 | 0.049692
TC01004166.hg.1 | CAMTA1 | calmodulin binding transcription activator 1 | 2.1 | 0.007778
TC05002760.hg.1 | TSPAN17 | tetraspanin 17 | 2.1 | 0.00968
TC01000576.hg.1 | TMEM69 | transmembrane protein 69 | 2.1 | 0.032788
TC08000782.hg.1 | RNU6-144P | RNA, U6 small nuclear 144, pseudogene | 2.1 | 0.034045
TC02001528.hg.1 | TMSB4XP2 | thymosin beta 4, X-linked pseudogene 2 | 2.09 | 0.042425
TC18000535.hg.1 | RNU6-737P | RNA, U6 small nuclear 737, pseudogene | 2.09 | 0.046735
TC0X000145.hg.1 | TAB3-AS2 | TAB3 antisense RNA 2; novel transcript | 2.08 | 0.014806
TC03002898.hg.1 | IP6K2 | inositol hexakisphosphate kinase 2 | 2.08 | 0.015098
TC01005719.hg.1 | BCL2L15 | BCL2-like 15 | 2.08 | 0.02053
TC20000913.hg.1 | SULF2 | sulfatase 2 | 2.07 | 0.046058
TC12001526.hg.1 | KRT5 | keratin 5 | 2.06 | 0.003984
TC03000729.hg.1 | CEP63 | centrosomal protein 63kDa | 2.05 | 0.03457
TC0X001325.hg.1 | LAMP2 | lysosomal-associated membrane protein 2 | 2.04 | 0.027238
TC02003843.hg.1 | GMPPA | GDP-mannose pyrophosphorylase A | 2.04 | 0.029998
TC02004969.hg.1 | SNRNP27 | small nuclear ribonucleoprotein 27kDa (U4/U6.U5) | 2.03 | 0.019528
TC01001413.hg.1 | DUSP12 | dual specificity phosphatase 12 | 2 | 0.015411
TC03000634.hg.1 | PARP14 | poly (ADP-ribose) polymerase family, member 14 | -2 | 0.008751
TC11001851.hg.1 | DDB1 | damage-specific DNA binding protein 1, 127kDa | -2 | 0.018486
TC01000683.hg.1 | RP1-158P9.1 | novel transcript | -2 | 0.019953
TC07000777.hg.1 | LINC01000 | long intergenic non-protein coding RNA 1000 | -2 | 0.031392
TC07000364.hg.1 | LINC01061 | long intergenic non-protein coding RNA 1061 | -2 | 0.047497
TC10001286.hg.1 | FAM21EP; RP11-324H6.5 | family with sequence similarity 21, member A (FAM21A) pseudogene | -2.01 | 0.039723
TC08001051.hg.1 | TNFRSF10D | tumor necrosis factor receptor superfamily, member 10d, decoy with truncated death domain | -2.03 | 0.011946
TC09000358.hg.1 | PSAT1 | phosphoserine aminotransferase 1 | -2.03 | 0.01388
TC21000398.hg.1 | SNORA80A | small nucleolar RNA, H/ACA box 80A | -2.03 | 0.017664
TC12003287.hg.1 | VPS33A | vacuolar protein sorting 33 homolog A (S. cerevisiae) | -2.05 | 0.015214
TC10000341.hg.1 | FAM21A; FAM21B; FAM21C; LOC101930591 | family with sequence similarity 21, member A, member B, member C; uncharacterized LOC101930591 | -2.06 | 0.015566
TC01006397.hg.1 | RP11-483I13.4 | | -2.1 | 0.007219
TC10000310.hg.1 | FAM21B; FAM21C; FAM21A | family with sequence similarity 21, member B, member C, member A | -2.16 | 0.019788
TC12002211.hg.1 | LPCAT3 | lysophosphatidylcholine acyltransferase 3 | -2.16 | 0.020631
TC06000058.hg.1 | RIOK1 | RIO kinase 1; RIO kinase 1 (yeast) | -2.16 | 0.032681
TC07000467.hg.1 | WBSCR16 | Williams-Beuren syndrome chromosome region 16 | -2.17 | 0.039999
TC03001171.hg.1 | RNA5SP123 | RNA, 5S ribosomal pseudogene 123 | -2.2 | 0.017628
TC10000294.hg.1 | FAM21C; LOC101930591; FAM21A | family with sequence similarity 21, member C, member A; uncharacterized LOC101930591 | -2.21 | 0.007283
TC04001211.hg.1 | CLOCK | clock circadian regulator | -2.3 | 0.023962
TC06001944.hg.1 | MDN1 | MDN1, midasin homolog (yeast) | -2.31 | 0.028077
TC03000429.hg.1 | EBLN2 | endogenous Bornavirus-like nucleoprotein 2 | -2.35 | 0.041933
TC06001152.hg.1 | IGF2R | insulin-like growth factor 2 receptor | -2.36 | 0.045611
TC07002575.hg.1 | LINC01000 | long intergenic non-protein coding RNA 1000 | -2.37 | 0.023946
TC02004602.hg.1 | CCNT2-AS1 | CCNT2 antisense RNA 1 | -2.37 | 0.035312
TC16001794.hg.1 | SMG1 | SMG1 phosphatidylinositol 3-kinase-related kinase | -2.42 | 0.01057
TC09002468.hg.1 | VCP | valosin containing protein | -2.45 | 0.007867
TC01001050.hg.1 | SRGAP2; SRGAP2C | SLIT-ROBO Rho GTPase activating protein 2; SLIT-ROBO Rho GTPase activating protein 2C | -2.46 | 0.020388
TC04000806.hg.1 | ETFDH | electron-transferring-flavoprotein dehydrogenase | -2.48 | 0.027637
TC19002505.hg.1 | ZNF45 | zinc finger protein 45 | -2.62 | 0.045943
TC05002974.hg.1 | LOC100506548 | uncharacterized LOC100506548 | -2.64 | 0.006107
TC01001090.hg.1 | TXNIP; LOC101060503 | thioredoxin interacting protein; thioredoxin-interacting protein-like | -2.65 | 0.016884
TC02002216.hg.1 | FLJ42351; AC079922.3 | uncharacterized LOC400999; novel gene (FLJ42351) | -2.65 | 0.039799
TC01002293.hg.1 | UBR4 | ubiquitin protein ligase E3 component n-recognin 4 | -2.76 | 0.021879
TC06001540.hg.1 | RNU6-850P | RNA, U6 small nuclear 850, pseudogene | -2.93 | 0.0081
TC17001065.hg.1 | RNA5SP435 | RNA, 5S ribosomal pseudogene 435 | -2.95 | 0.030654
TC13000461.hg.1 | XPO4 | exportin 4 | -3.09 | 0.016961
TC13001263.hg.1 | BIVM | basic, immunoglobulin-like variable motif containing | -3.47 | 0.039871
TC17001757.hg.1 | MIR4737 | microRNA 4737 | -3.69 | 0.000691
TC08000735.hg.1 | RP11-136O12.2 | novel transcript | -7.66 | 0.025323
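The filter criteria behind this table (fold change beyond 2 in either direction and p-value < 0.05) can be sketched in Python as below. This is an illustration only, not the Transcriptome Analysis Console implementation: the `Row` class and `passes_filter` function are hypothetical names, and the cutoff is treated as inclusive because the table lists rows displayed as 2 and -2, which is an assumption about rounding. The first three rows are taken from the table; the last two are invented examples of rows that would be excluded.

```python
from dataclasses import dataclass


@dataclass
class Row:
    symbol: str
    fold_change: float  # signed linear fold change, siKHSRP vs. siSCR
    p_value: float


def passes_filter(row, fc_cutoff=2.0, p_cutoff=0.05):
    """Keep rows with a large enough change in either direction and p below the cutoff."""
    return abs(row.fold_change) >= fc_cutoff and row.p_value < p_cutoff


rows = [
    Row("SCARNA14", 17.12, 0.040413),
    Row("DUSP12", 2.0, 0.015411),
    Row("RP11-136O12.2", -7.66, 0.025323),
    Row("HYPOTHETICAL_A", 1.5, 0.010),   # invented: change too small
    Row("HYPOTHETICAL_B", -3.0, 0.200),  # invented: not significant
]

kept = [row.symbol for row in rows if passes_filter(row)]
up_regulated = [row.symbol for row in rows if passes_filter(row) and row.fold_change > 0]
```

Splitting the kept rows by the sign of the fold change gives the up-regulated and down-regulated groups.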


Supplementary Table 4: list of the 40 differentially regulated proteins in the shotgun proteomics experiment. Log2-normalized fold change, and associated p-values, are reported for the comparison of doxycycline versus vehicle treated SW620 cells.

ACCNUM | Protein Name | ENTREZID | SYMBOL | log2 Ratio doxy/ctrl | log10 p-value
O43707 | Alpha-actinin-4 | 81 | ACTN4 | -1.184424571 | 1.183448416
P09651 | Heterogeneous nuclear ribonucleoprotein A1 | 3178 | HNRNPA1 | -1 | 1.930937361
P01892 | HLA class I histocompatibility antigen, A-2 alpha chain | 3105 | HLA-A | -2.389042291 | 0.95875269
Q15365 | Poly(rC)-binding protein 1 | 5093 | PCBP1 | -1.298081353 | 0.54316914
P52565 | Rho GDP-dissociation inhibitor 1 | 396 | ARHGDIA | -4.736965594 | 1.399044822
P55072 | Transitional endoplasmic reticulum ATPase | 7415 | VCP | -1.952694285 | 0.765195675
P25311 | Zinc-alpha-2-glycoprotein | 563 | AZGP1 | -1.108934372 | 0.514272657
Q9Y617 | Phosphoserine aminotransferase | 29968 | PSAT1 | -1.134301092 | 0.493655026
P48643 | T-complex protein 1 subunit epsilon | 22948 | CCT5 | -1 | 0.994855935
P48637 | Glutathione synthetase | 2937 | GSS | 1.538419915 | 0.192218621
P23526 | Adenosylhomocysteinase | 191 | AHCY | -3.058893689 | 1.205876241
O60568 | Procollagen-lysine,2-oxoglutarate 5-dioxygenase 3 | 8985 | PLOD3 | -1.514573173 | 0.591413537
P50395 | Rab GDP dissociation inhibitor beta | 2665 | GDI2 | -1.874469118 | 1.329517001
P52799 | Ephrin-B2 | 1948 | EFNB2 | -1.36923381 | 0.348013021
Q9P258 | Protein RCC2 | 55920 | RCC2 | -1.175086707 | 0.569582139
P28072 | Proteasome subunit beta type-6 | 5694 | PSMB6 | -1.134301092 | 0.517501592
O00391 | Sulfhydryl oxidase 1 | 5768 | QSOX1 | -1.772589504 | 0.373577394
Q8WUJ3 | Cell migration-inducing and hyaluronan-binding protein | 57214 | CEMIP | -1.280107919 | 0.169771033
Q13283 | Ras GTPase-activating protein-binding protein 1 | 10146 | G3BP1 | 2.058893689 | 0.953648538
P31949 | Protein S100-A11 | 6282 | S100A11 | -4.736965594 | 2.293455052
P16422 | Epithelial cell adhesion molecule | 4072 | EPCAM | -1.251538767 | 0.609038684
P78504 | Protein jagged-1 | 182 | JAG1 | -3.36923381 | 0.720328198
P61604 | 10 kDa heat shock protein, mitochondrial | 3336 | HSPE1 | -1.471305719 | 0.335666177
P23284 | Peptidyl-prolyl cis-trans isomerase B | 5479 | PPIB | -1.514573173 | 0.677408101
Q15223 | Nectin-1 | 5818 | NECTIN1 | -3.36923381 | 0.720328198
P40227 | T-complex protein 1 subunit zeta | 908 | CCT6A | -1.36923381 | 0.348013021
Q07654 | Trefoil factor 3 | 7033 | TFF3 | -1.367731785 | 0.623942196
Q92820 | Gamma-glutamyl hydrolase | 8836 | GGH | -3.736965594 | 2.085455996
O43278 | Kunitz-type protease inhibitor 1 | 6692 | SPINT1 | -1.184424571 | 0.667962156
P33316 | Deoxyuridine 5'-triphosphate nucleotidohydrolase, mitochondrial | 1854 | DUT | -4.058893689 | 2.160403276
Q07955 | Serine/arginine-rich splicing factor 1 | 6426 | SRSF1 | 1.736965594 | 0.85762402
P62308 | Small nuclear ribonucleoprotein G | 6637 | SNRPG | -4.058893689 | 2.160403276
O75787 | Renin receptor | 10159 | ATP6AP2 | -3.772589504 | 0.736355839
Q86TH1 | ADAMTS-like protein 2 | 9719 | ADAMTSL2 | 1.736965594 | 0.85762402
P30086 | Phosphatidylethanolamine-binding protein 1 | 5037 | PEBP1 | -3.36923381 | 0.720328198
P09327 | Villin-1 | 7429 | VIL1 | 3.36923381 | 0.720328198
Q99729 | Heterogeneous nuclear ribonucleoprotein A/B | 3182 | HNRNPAB | -3.736965594 | 2.085455996
P04040 | Catalase | 847 | CAT | 4.544320516 | 2.730882301
Q9BVM2 | Protein DPCD | 25911 | DPCD | -3.36923381 | 0.720328198
Q8WVQ1 | Soluble calcium-activated nucleotidase 1 | 124583 | CANT1 | 1.36923381 | 0.348013021
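To read the two numeric columns on a linear scale, the log values can be converted back as in the Python sketch below. The function names are hypothetical. Because every value in the "log10 p-value" column is positive while the log10 of a p-value cannot exceed zero, the sketch assumes the column reports -log10(p); that sign convention is an inference, not something stated in the table.

```python
def fold_change_from_log2(log2_ratio):
    """Linear doxycycline/control ratio from a log2 ratio."""
    return 2.0 ** log2_ratio


def p_value_from_neg_log10(neg_log10_p):
    """p-value from a -log10(p) value (assumed sign convention)."""
    return 10.0 ** (-neg_log10_p)


# ARHGDIA row from the table: log2 ratio -4.736965594, log10 p-value 1.399044822.
arhgdia_ratio = fold_change_from_log2(-4.736965594)
arhgdia_p = p_value_from_neg_log10(1.399044822)
```

Under this reading, a log2 ratio of -1 corresponds to a halving of the protein signal after doxycycline treatment, and a column value of 1.3 corresponds to a p-value of about 0.05.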

[Figure: dataset GSE6988. Y-axis: log2 normalized KHSRP signal intensity. Groups: Normal Colon, Primary Adenocarcinoma, Normal Liver, Liver Metastasis. Both comparisons are marked ***.]

Figure 1 – Supplement 1: Differential expression of KHSRP between paired matched tumor and normal tissue in primary adenocarcinoma and liver metastasis. *** p<0.0001 (Wilcoxon matched-pairs signed rank test).

[Figure: KHSRP expression (Log RNA Seq V2) by TCGA tumor type.]

Figure 1 – Supplement 2: RNA-Seq expression data for KHSRP in the TCGA cohort was plotted in increasing order, grouped by tumor type (for details of the tumor type codes refer to the TCGA Data Portal, https://tcga-data.nci.nih.gov/docs/publications/tcga/)


Figure 1 – Supplement 3: The frequency of genetic alterations in KHSRP across 4 different CRC datasets 10-13 as reported in the cBio portal.

[Figure panels A-D. (A) Stage II-III (N=7); y-axis: normalized KHSRP protein (fold difference); groups: Normal Colon, CRC. (B) Tumor vs. Normal Colon; y-axis: KHSRP staining percentage (T/N ratio); groups: epithelium, stroma. Reported p-values: p = 0.0772; p < 0.0001.]

Figure 2 – Supplement 1: (A) Western blot analysis of KHSRP in lysates from fresh-frozen tissue samples of patients with stage II-III CRC (N=7), comparing tumor (T) and matched normal tissue (N) from each patient. A representative blot for 2 patients is shown along with semi-quantitative densitometric analysis of all patients. (B) Quantification of tumor-to-normal (T/N) ratio of KHSRP staining in epithelial and stromal compartments of TMA. Waterfall plots of T/N ratios in the epithelium (C) or stroma (D) indicate patients with increased or decreased tumor-specific KHSRP expression.

[Figure panels A-E. Panel titles: Metastasis vs. Normal Liver; Primary vs. Liver Metastasis (epithelium); Primary vs. Liver Metastasis (stroma). Axes: KHSRP staining quickscore; KHSRP staining percentage (M/N ratio); KHSRP staining percentage (T/N or M/N ratio). Groups: Normal Liver, Liver Met, Primary Tumor; epithelium, stroma. Reported p-values: p = 0.0078; p = 0.0625; p = 0.0195.]

Figure 2 – Supplement 2: (A) Quantitation of stromal and epithelial KHSRP staining in normal liver vs. liver metastasis tissue from CRC patients from the TMA. (B) Quantification of metastasis-to-normal (M/N) ratio of KHSRP staining in epithelial and stromal compartments of TMA. Waterfall plots of M/N ratios for single patients are shown, along with (C) a comprehensive plot of all patients (N=8). (D) Quantitation of stromal and epithelial KHSRP staining in primary CRC tumor vs. liver metastasis tissue for 10 patients from the TMA. (E) Comparison of T/N ratios in the primary tumor with the M/N ratios in the corresponding liver metastasis for both epithelial and stromal compartments of 10 patients in the TMA.


Figure 4 – Supplement 1: Immunofluorescent images of (A) SW480 and (B) SW620 cells. Red: nuclei (DAPI), Green: KHSRP, Blue: F-actin (Phalloidin).

[Figure panels A-F, SW620 cells. Conditions: Mock, Scramble, siKHSRP. Axes: KHSRP mRNA relative expression; Cell Index vs. Time (Hour); Cell Index (120h); Neutral Red staining (% of control); Colony Area (%); Surviving Fraction; Neutral Red staining (% of untreated control).]

Figure 4 – Supplement 2: (A) qRT-PCR analysis of SW620 cells transfected with a pool of siRNAs targeting KHSRP or a scramble control pool for 48h. (B) Continuous monitoring of proliferation of siRNA pool-transfected cells; full time-course curves are shown, along with differences in cell index values after 5 days. **p<0.001 and *p<0.05 (ANOVA). (C) WB analysis of KHSRP protein expression in SW620 cells transfected with an siRNA targeting KHSRP, or a scramble control siRNA, or a mock transfection control for 48h. (D) Growth of transfected cells monitored after 7 days by neutral red assay; representative images are shown with staining quantification from triplicate assays. (E) Clonogenic potential for the transfected cells measured by both colony area and surviving fraction. (F) Spheroids from transfected cells grown in Matrigel; representative images are shown with staining quantification from triplicate assays. ***p<0.0001 (ANOVA).

[Figure panels A-C. Treatments: Vehicle, Doxycycline. Cell lines: Scramble, KHSRP (M2). Axes: Neutral Red staining (% of control); Colony Area (%); Surviving Fraction. Significance markers: ***.]

Figure 4 – Supplement 3: SW620 cells stably transfected with a conditionally-expressible shRNA pool targeting KHSRP (M2) or a non-targeting control pool (Scramble). (A) Phase-contrast and fluorescence images of cells treated with doxycycline or vehicle control for 4 days. Consistent expression of red fluorescent protein (RFP) indicates induced expression of the shRNA-containing cassette, and (B) WB analysis confirmed effective knockdown of KHSRP protein expression. (C) Growth of stably transfected cells in the presence and absence of doxycycline was quantified using the neutral red assay; colony formation was measured by both surviving fraction and colony area.


Figure 5 – Supplement 1: Transcriptomic profile of SW480 cells transfected with KHSRP siRNA compared to scramble negative control siRNA. (A) Volcano plot showing differentially regulated genes in green (down-regulated) and red (up-regulated) (|log2 fold change| > 1, p-value < 0.05). Genes that are differentially regulated above the fold change threshold with a non-significant p-value are colored in orange. (B) Distribution of the 135 significantly differentially regulated genes. (C) Over-representation analysis of the 135 differentially regulated genes. GO terms are arbitrarily grouped by biological function, and reported with the reference database, the corresponding number of gene hits and the associated p-value. (D) Network map of predicted associations for the protein products of the 135 differentially regulated genes. Proteins are represented by nodes (smaller nodes indicate proteins with unknown structural information); edges represent the predicted functional associations, with the thickness of the line indicating the degree of confidence for the prediction of the interaction. Proteins are color-coded arbitrarily by clustering, with the three main clusters annotated.

Supplementary References

1. Bassik MC, Kampmann M, Lebbink RJ, et al. A systematic mammalian genetic interaction map reveals pathways underlying ricin susceptibility. Cell. 2013;152(4):909–922. doi:10.1016/j.cell.2013.01.030.

2. Kampmann M, Bassik MC, Weissman JS. Functional genomics platform for pooled screening and generation of mammalian genetic interaction maps. Nat Protoc. 2014;9(8):1825–1847. doi:10.1038/nprot.2014.103.

3. Meerbrey KL, Hu G, Kessler JD, et al. The pINDUCER lentiviral toolkit for inducible RNA interference in vitro and in vivo. Proc Natl Acad Sci U S A. 2011;108(9):3665–3670. doi:10.1073/pnas.1019736108.

4. Stöckel D, Kehl T, Trampert P, et al. Multi-omics enrichment analysis using the GeneTrail2 web service. Bioinformatics. 2016;32(10):1502–1508. doi:10.1093/bioinformatics/btv770.

5. Szklarczyk D, Morris JH, Cook H, et al. The STRING database in 2017: quality-controlled protein–protein association networks, made broadly accessible. Nucleic Acids Res. 2017;45(Database issue):D362–D368. doi:10.1093/nar/gkw937.

6. Brohée S, van Helden J. Evaluation of clustering algorithms for protein-protein interaction networks. BMC Bioinformatics. 2006;7:488. doi:10.1186/1471-2105-7-488.

7. Chalkley RJ, Baker PR, Medzihradszky KF, Lynn AJ, Burlingame AL. In-depth analysis of tandem mass spectrometry data from disparate instrument types. Mol Cell Proteomics. 2008;7(12):2386–2398. doi:10.1074/mcp.M800021-MCP200.


8. Zybailov B, Mosley AL, Sardiu ME, Coleman MK, Florens L, Washburn MP. Statistical analysis of membrane proteome expression changes in Saccharomyces cerevisiae. J Proteome Res. 2006;5(9):2339–2347. doi:10.1021/pr060161n.

9. Huang DW, Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 2009;4(1):44–57. doi:10.1038/nprot.2008.211.

10. Giannakis M, Mu XJ, Shukla SA, et al. Genomic Correlates of Immune-Cell Infiltrates in Colorectal Carcinoma. Cell Rep. 2016;15(4):857–865. doi:10.1016/j.celrep.2016.03.075.

11. Seshagiri S, Stawiski EW, Durinck S, et al. Recurrent R-spondin fusions in colon cancer. Nature. 2012;488(7413):660–664. doi:10.1038/nature11282.

12. Brannon AR, Vakiani E, Sylvester BE, et al. Comparative sequencing analysis reveals high genomic concordance between matched primary and metastatic colorectal cancer lesions. Genome Biol. 2014;15(8):454. doi:10.1186/s13059-014-0454-7.

13. Cancer Genome Atlas Network. Comprehensive molecular characterization of human colon and rectal cancer. Nature. 2012;487(7407):330–337. doi:10.1038/nature11252.
