Supporting Information

Fei et al. 10.1073/pnas.1617467114

SI Materials and Methods

Cell Culture, Reagents, and Antibodies. LNCaP, CWR22Rv1, DU145, and PC3 cells were cultured in RPMI medium 1640 supplemented with 10% FBS. RWPE-1 cells were cultured in K-SFM (Kit Cat. no. 17005-042) with kit-supplied bovine pituitary extract and recombinant epidermal growth factor. RNase R was purchased from Epicentre (Cat. no. RNR07250). The antibodies were purchased from the following sources: GAPDH (FL-335, Santa Cruz, Cat. no. sc-25778), HNRNPL (4D11, Santa Cruz, Cat. no. sc-32317; D5, Santa Cruz, Cat. no. sc-48391), HNRNPK (3C2, Santa Cruz, Cat. no. sc-32307), and AR (N-20, Santa Cruz, Cat. no. sc-816).

Pooled Genome-Wide CRISPR Screen. LNCaP cells were cultured in RPMI medium 1640 supplemented with 10% FBS and infected with the pooled lentiviral library (GeCKO v2 library) at a multiplicity of infection of 0.5. Large-scale spin-infection of 1 × 10^8 cells was carried out in four 12-well plates with 2 × 10^6 cells per well. Wells were pooled together into larger flasks on the day after spin-infection. After 3 d of puromycin selection, half of the surviving cells were stored as 0-d control samples, and the rest of the cells were cultured in white RPMI medium 1640 supplemented with 10% charcoal-stripped FBS in the presence of 10 nM DHT for an additional 2 wk. PCR was performed on genomic DNA to construct sequencing libraries, each containing around 300 μg DNA. Each library was sequenced at around 30–40 million reads to achieve ∼300× average coverage over the CRISPR library. Data analysis was performed by MAGeCK and MAGeCK-VISPR.

Targeted siRNA Knockdown for Functional HNRNP Screen. LNCaP cells were seeded in a 24-well plate and transfected with 20 nM siRNA oligos by RNAiMax reagent (Life Technology, Cat. no. 13778-150). Knockdown efficiency was determined after 72 h of transfection. Cell counting was performed after 6 d of transfection. The siRNA oligos targeting the HNRNP genes were purchased from Sigma and Dharmacon. The siRNA target sequence for siControl is 5′-GCGACCAACGCCUUGAUUG-3′. The siRNA target sequences for the HNRNP genes are as follows:

Gene name   siRNA-a              siRNA-b
HNRNPK      GCAAGAAUAUUAAGGCUCU  GAUCUUGGUGGACCUAUUA
HNRNPL      CAUCAUGCCUGGUCAGUCA  AGGUUUGUAGAGGCUUACU
HNRNPC      CAGUAGAGAUGAAGAAUGA  GAUGAAGAAUGAUAAGUCA
RBMX        CGAUAGAGAUGGAUAUGGU  CUACUCAAGUGGUCGUGAU
HNRNPDL     GUCACUAUGGAGGAUAUGA  CAAGGAUAUGGAAAUUACA
PCBP1       CGGUUAAGAGGAUCCGCGA  GUAUUAGUCUGGCCCAGUA
HNRNPM      GAUUGACGUUCGAAUUGAU  CGAUUUGGAUCUGGGAUGA
FUS         CAGAGCUCCCAAUCGUCUU  GGCUAUGGAACUCAGUCAA
PCBP2       GCAUUAGCCUGGCUCAAUA  GAACCCAGUGGAAGGAUCU
HNRNPU      GUGGAAUCGGCUAUCCAUA  GUCACUAACUACAAGUGGA
HNRNPUL1    CUAUAUCCUAGAUCAGACA  GUUGCUAUUGACACCUAUA
HNRNPF      GGUACAUUGGCAUCGUGAA  CAAUAUGCAGCACAGAUAU
SYNCRIP     GCUAGUUGCACAUAGUGAU  GUUAUGCGUUUGUCACUUU
HNRNPH3     GACAGUACGACUUCGUGGA  CAAUUACAGUGGAGGAUAU
PTBP1       CAAGAACUUCCAGAACAUA  CUGACCAAGGACUACGGCA
HNRNPH2     GUACAUUUGUGGGAGUUGA  CUGUACAUUUGUGGGAGUU
HNRNPA3     GUACAUUCCUGAGGUCUUU  CAAUGUGUGCUCGACCACA
HNRNPA0     CAGACCAAGCGCUCCCGUU  CACUUUGAGGCCUUUGGGA
RALY        GGCAAGCAAUGUAACCAAC  GCAAGCAAUGUAACCAACA
HNRNPH1     CUUCUUGAAUUCUACAGCA  CUUUGUACGGCUUAGAGGA
HNRNPD      GUAGUAGUGUCUUACUGGU  GUUGGGUCCCUCUGAAGUU
HNRNPUL2    CUUAGUAAGCUGGUCCAGA  CUAUAUGGAUGAGGUGACA
HNRNPA2B1   CUCAUGUAACUGUGAAGAA  CUGAUAGGCAGUCUGGAAA
HNRNPLL     AGUGCAACGUAUUGUUAUA  CUUAAUGUUUGCGUGUCUA
HNRNPA1     GGAAGAGUUGUGGAACCAA  GUGGUAAUGAGAGAUCCAA
HNRNPR      GGAGUAUGGAGUAUGCUGU  GCUAGUGCUUUGUCUUAGU
HNRNPAB     GAGAUUGAGGCCAUUGAAU  CUGUGGACCCUGGUUGUAA

RNA Isolation and qRT-PCR. RNA was isolated using the RNeasy Mini Kit (Qiagen). Reverse transcriptase (Invitrogen) was used for random-primed first-strand cDNA synthesis. Real-time PCR was carried out on an ABI Prism 7300 detection system using SYBR Green PCR master mix. The ΔΔCt method was used to comparatively quantify mRNA levels. RPS28 gene expression served as the internal control. Primer sequences detecting mRNA levels are listed below:

Gene name   Forward primer (5′ to 3′)  Reverse primer (5′ to 3′)
RPS28       CGATCCATCATCCGCAATG        AGCCAAGCTCAGCGCAAC
HNRNPK      AGCTCCCGCTCGAATCTGAT       CCTCAACTCGCAGTCAAAGTC
HNRNPL      TTCTGCTTATATGGCAATGTGG     GACTGACCAGGCATGATGG
HNRNPC      GGAGATGTACGGGTCAGTAACA     CCCGAGCAATAGGAGGAGGA
RBMX        GCTCTTCATTGGTGGGCTTA       GGGCTTTCAAAGGTGACAAA
HNRNPDL     TGTGAGATCACCCGTTGTGT       CAGGTTTCAGAGGACCTGGA
PCBP1       AAAGGCGGGTGTAAGATCAAAG     GGCAAATCTGCTTGACACACTC
HNRNPM      TGGTCCGAGCAGACATTCTTG      TGACGTGCATTGGTCTATCAAA
FUS         TCAATCCTCCATGAGTAGTGGT     CACGGTCCTGCTGTCCATA
PCBP2       GCGCAGATCAAAATTGCGAAC      ATATTGAGCCAGGCTAATGCTG
HNRNPU      GAGCATCCTATGGTGTGTCAAA     TGACCAGCCAATACGAACTTC
HNRNPUL1    GAAGCACCTTCCGTCTACAGA      AGGAGAAAGGCTCTTCGCCTA
HNRNPF      CTGCTCTGTTGAGGACGTG        CCTGCCCTCTCTAGTGTAGATG
SYNCRIP     GAGCTAGAGGAAGGGGTGGT       CTCTTTGTTGTTGGGCACCT
HNRNPH3     AATGGTCCAAATGACGCTAGTG     CTCCCCTGGTAGTCCATCGT
PTBP1       AGCGCGTGAAGATCCTGTTC       CAGGGGTGAGTTGCCGTAG
HNRNPH2     GAAGCATACAGGTCCGAATAGC     CGCCCCTGAAAGTCCACTG
HNRNPA3     TGATGGGCGTGTAGTGGAAC       AGCAGACTGCATCTCTTGTTTAG
HNRNPA0     TGGCTTCGTGACCTACTCCAA      GGCCTCCGACAAAGAGCTT
RALY        TTCAGGCAAGCAATGTAACCA      CACGGCCATACTTAGAGAAGATG
HNRNPH1     ATTCAAAATGGGGCTCAAGGTAT    GTGTCAGGACTATTTGGACCAG
HNRNPD      GCGTGGGTTCTGCTTTATTACC     TTGCTGATATTGTTCCTTCGACA
HNRNPUL2    GGCAAAGGTAACCCAGAATCTC     GGACGGGAAAAATCAACAGACC
HNRNPA2B1   AGCTTTGAAACCACAGAAGAA      TTGATCTTTTGCTTGCAGGA

Fei et al. www.pnas.org/cgi/content/short/1617467114

Cont.

Gene name   Forward primer (5′ to 3′)  Reverse primer (5′ to 3′)
HNRNPLL     ACCATTCCTGGTACAGCACTG      TGGCCAGCACTTGTAAAGC
HNRNPA1     TCAGAGTCTCCTAAAGAGCCC      ACCTTGTGTGGCCTTGCAT
HNRNPR      GCAAGGTGCAAGAGTCCACA       CACGCCAGAGTACACACTGTC
HNRNPAB     ATTGAGGCCATTGAATTGCCA      GGCCACCTTGATCTCACACTT
KLK3        TGTGTGCTGGACGCTGGA         CACTGCCCCATGACGTGAT
TMPRSS2     GGACAGTGTGCACCTCAAAGAC     TCCCACGAGGAAGGTCCC
FKBP5       GCGGAGAGTGACGGAGTC         TGGGGCTTTCTTCATTGTTC

RIP. Adherent cells grown in 15-cm plates were first cross-linked with 0.3% formaldehyde for 10 min at room temperature before the reaction was quenched by adding one-tenth volume of 1.25 M glycine for 5 min. Cells were then scraped off the plates and lysed with RIPA lysis buffer (50 mM Tris, pH 7.6, 150 mM NaCl, 1 mM EDTA, 0.1% SDS, 1% Nonidet P-40, 0.5% sodium deoxycholate, protease inhibitor, and RNase inhibitor) for 10 min on ice. Then the lysate was sonicated to assist solubilization and RNA fragmentation before centrifugation at 20,000 × g for 10 min at 4 °C. The supernatant was collected and precleared with protein G beads. The input fraction was obtained from the supernatant after the preclear step. The antibodies were preincubated with protein G beads and washed with RIP wash buffer (RIPA lysis buffer without inhibitors) before adding the precleared cell lysate. After 4- to 6-h incubation, the beads were washed twice with RIPA lysis buffer followed by three washes with 1 M RIPA buffer (1 M NaCl in RIPA lysis buffer). The RNA was eluted from the beads with 100 μL NaHCO3 and 1% SDS in the presence of proteinase K and RNase inhibitor at room temperature for 10 min with occasional vortexing. The eluted material was decross-linked at 65 °C for 45–60 min before purification of RNA using TRIzol LS reagent (Life Technology). DNase I treatment was performed to remove any residual DNA before phenol/chloroform/ethanol purification of the final RNA. RIP RNA can be used for either library preparation or direct qPCR assay. HNRNPL (4D11) antibody was used for HNRNPL RIP-seq. The RIP-qPCR primers used in this study are as follows:

Gene name       Forward primer (5′ to 3′)  Reverse primer (5′ to 3′)
RPS28           CGTGGAATTCATGGACGAC        GCTTCTCGCTCTGACTCCAA
CTBP1           ACGTCTGTGCTGTGATGTCC       CGGATGTCATAGATGCCACA
ROR2            GTGTCATTCAATATTCTGTGTGTG   ACAGAGAACACACTTAGAGACACAA
STX3            GCCATGTTTTAGCTGTGTGG       TTGTTGCTGTTGGTTGTGGT
AR intron       CCCACCTTCTCCAGTCTGTC       GTTGCTGAGTCAGGGGAAAG
PPFIA2 site 1   GACAAAAGTAAGGTCCAAATGGT    CCCCACCACAATTCTGTTTT
PPFIA2 site 2   GAGCAACTGGAAAGGCAAAG       TATCCAGCCACAATCGAACA
MYH10 intron    ACCTGGTTGTATCCCCTGTG       TCCACCATCACAGATAAGCAA

RNA Pulldown Assay. RNA fragments were in vitro transcribed from corresponding PCR products amplified from genomic DNA by T7/T3 RNA polymerase. T7/T3 promoter sequences were introduced into PCR products via primers to generate strand-specific RNA fragments. Biotin-labeled UTP was incorporated into final RNA fragments during transcription. Cells were lysed with RIPA lysis buffer (50 mM Tris, pH 7.6, 150 mM NaCl, 1 mM EDTA, 0.1% SDS, 1% Nonidet P-40, 0.5% sodium deoxycholate, protease inhibitor, and RNase inhibitor) and then incubated with the indicated biotin-labeled RNA fragments for 1 h at room temperature, followed by incubation with streptavidin beads for 1 more hour. Beads were washed twice with RIPA lysis buffer and three times with 500 mM NaCl RIPA buffer. The associated proteins were analyzed by Western blot analysis. The PCR primers used to amplify in vitro transcription templates are as follows:

EGFP: 5′-GATCACAATTAACCCTCACTAAAGGGATGGTGAGCAAGGGCGAGGAGC-3′ and 5′-GATCACTAATACGACTCACTATAGGGTTATCTAGATCCGGTGGATCCC-3′;
RPS28: 5′-GATCACAATTAACCCTCACTAAAGGGCCATCATGGACACCAGCCGTGTG-3′ and 5′-GATCACTAATACGACTCACTATAGGGAACTTGAAACACAAACGCTTTAT-3′;
LARP: 5′-GATCACAATTAACCCTCACTAAAGGGCCTGGTGACTCGGACATTCCAGG-3′ and 5′-GATCACTAATACGACTCACTATAGGGTGATCCGCTGTGCGGCCACAGGTC-3′;
CTBP1: 5′-GATCACAATTAACCCTCACTAAAGGGCAGATAACGTACACGGATGCCACAG-3′ and 5′-GATCACTAATACGACTCACTATAGGGGTGTGTGACATCTGTGCAGGCCCTG-3′;
ROR2: 5′-GATCACAATTAACCCTCACTAAAGGGACCTTCTTACTGCCCCTTCTTCTTC-3′ and 5′-GATCACTAATACGACTCACTATAGGGTCTTTGTGTGTGTCTGAATATTCTG-3′;
STX3: 5′-GATCACAATTAACCCTCACTAAAGGGTTGTAGGAATTGTGTCTGGAACC-3′ and 5′-GATCACTAATACGACTCACTATAGGGACAGCTCTCTGATATATCAAATTCC-3′.

Tissue Microarray Analysis. The use of human prostate samples has been approved by The Gelb Center Committee. Immunohistochemical staining for HNRNPL was performed on 4-μm paraffin sections cut from two prostate tissue microarray (TMA) blocks provided by the Gelb Center Tissue Bank. The immunochemical stain was initially optimized on the bench, and then transitioned to the Leica BOND-III (Leica Biosystems) autostaining system. Immunostaining was performed on tissue sections following deparaffinization in two 5-min changes of xylene and rehydration through graded alcohols to distilled water. After blocking endogenous peroxidase activity, sections were subjected to heat-induced epitope retrieval in citrate buffer (pH 6.1) for 30 min. Following heat-induced epitope retrieval, the primary mouse monoclonal antibody targeting HNRNPL (4D11, sc-32317, Santa Cruz Biotechnology) was applied to the sections. Protein levels were examined using a dilution of 1:5,000 for 1 h at room temperature. Incubation with the biotinylated universal secondary antibody was then performed. Visualization was performed with 3,3′-diaminobenzidine (DAB) as the chromogen substrate. Once stained, both TMA slides were scanned on the Olympus BX-51 W1 microscope using Vectra 2.0.8 software (Perkin-Elmer). Cores that were disrupted or contained insufficient tissue were eliminated from the analysis. Finally, cores from 79 pairs of matched tumor and benign samples were valid for later analysis.

Following the standard bright-field TMA scanning protocol, a chromogenic spectral library was composed using the spectra of both the counterstain (hematoxylin) and the immunostain (DAB). Tissue segmentation algorithms were subsequently constructed using inForm v2.0.2 (Perkin-Elmer). Initially, a training set comprising three classes of tissue was created (i.e., tumor, stroma, and other). Representative regions of interest for each of these classes were marked on 15–20 images from the TMAs. The software was trained on these areas and tested to determine how accurately it could differentiate between the tissue classes. Cell-segmentation algorithms were then constructed for nuclear and cytoplasmic populations. Cell-segmentation algorithms identified nuclei as pixels above the minimum signal value and cytoplasm as a two-pixel radius around each identified nucleus. The spectral library and algorithms were then run on all samples. Poorly segmented cores were manually corrected via touchscreen editing following pathology review.
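The cell-segmentation rule stated above (nuclei as pixels above a minimum signal value, cytoplasm as a two-pixel radius around each nucleus) can be illustrated with a minimal sketch. This is not the inForm implementation, which is not described here; the function name `segment`, the use of a Euclidean radius, and the toy image are assumptions made only for illustration.

```python
from typing import List, Set, Tuple

Pixel = Tuple[int, int]  # (row, column)


def segment(image: List[List[float]], min_signal: float,
            radius: int = 2) -> Tuple[Set[Pixel], Set[Pixel]]:
    """Return (nuclear pixels, cytoplasmic pixels) for a 2D signal image.

    Assumption: "radius" is interpreted as Euclidean distance in pixels.
    """
    rows, cols = len(image), len(image[0])
    # Nuclei: every pixel whose signal exceeds the minimum value.
    nuclei = {(r, c) for r in range(rows) for c in range(cols)
              if image[r][c] > min_signal}
    # Cytoplasm: non-nuclear pixels within `radius` of any nuclear pixel.
    cytoplasm: Set[Pixel] = set()
    for r, c in nuclei:
        for dr in range(-radius, radius + 1):
            for dc in range(-radius, radius + 1):
                rr, cc = r + dr, c + dc
                inside = 0 <= rr < rows and 0 <= cc < cols
                if inside and dr * dr + dc * dc <= radius * radius:
                    if (rr, cc) not in nuclei:
                        cytoplasm.add((rr, cc))
    return nuclei, cytoplasm


# Toy 5 x 5 image (hypothetical values) with a single bright pixel.
toy = [[0.0] * 5 for _ in range(5)]
toy[2][2] = 1.0
nuclei, cytoplasm = segment(toy, min_signal=0.5)
```

With a single nuclear pixel, the cytoplasmic ring under these assumptions is the set of neighbors at Euclidean distance 2 or less, excluding the nucleus itself.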

Statistical analysis was performed using SPSS software, v22.0 (SPSS). All reported P values were two sided. For all statistical analysis, P < 0.05 was considered significant.

RNA Interference. In addition to the siRNA screen of HNRNPs, additional siRNA oligos were synthesized to silence indicated genes. These siRNA oligos were purchased from Dharmacon. The target sequences are as follows:

siControl: 5′-GCGACCAACGCCUUGAUUG-3′;
siHNRNPL-1 (siL-1): 5′-GAAUGGAGUUCAGGCGAUG-3′;
siHNRNPL-2 (siL-2): 5′-CUACGAUGACCCGCACAAA-3′;
siHNRNPK-1 (siK-1): 5′-GAUCUUGGUGGACCUAUUA-3′;
siHNRNPK-2 (siK-2): 5′-GUCGGGAGCUUCGAUCAAA-3′.

Cell-Growth and Soft Agar Assay. For cell-growth assays, cells were plated in 24-well plates and transfected with indicated siRNA oligos in triplicate. Cells were counted after 6 d of transfection using a hemocytometer. Anchorage-independent cell growth in soft agar was performed in triplicate with 10,000 LNCaP cells per well suspended in 1.5 mL medium containing 0.35% agar spread on top of 1.5 mL of 0.7% solidified agar in six-well plates. Colonies were stained with Crystal violet and counted after 4 wk of plating.

Minigene Assay. The GAPDH gene (NM_002046) fragment encompassing exon 5 and exon 6 along with their flanking introns was PCR amplified and cloned into pcDNA3.1+ plasmid using BamH I and EcoR I sites. The HNRNPL binding (CA)20 sequence was introduced into either or both ends along with PCR primers used to amplify the GAPDH minigene. The qPCR primers used for the GAPDH minigene assay are as follows: circRNA of GAPDH, 5′-GCCAAAAGGGTCATCATCTC-3′ and 5′-TGGACTCCACGACGTACTCA-3′; pre-mRNA of GAPDH, 5′-GCGAGATCCCTCCAAAATCAA-3′ and 5′-CAGGGCTGAGTCAGCTTCCC-3′.

RNA-Seq Analysis. LNCaP cells transfected with either siCtrl or siHNRNPL (siL1 + siL2) for 3 d were cultured in the presence of 10 nM DHT for 4 h before being harvested. Polyadenylated RNA was extracted and constructed into a sequencing library for RNA-seq analysis. Triplicates were prepared for each condition. Computational data analysis is elaborated in Bioinformatics Analysis, below. For individual cases of validation, qRT-PCR analysis was performed. The qPCR primers used for the splicing validation are as follows:

Gene name                 Forward primer (5′ to 3′)  Reverse primer (5′ to 3′)
AR isoform-a              AAGACCTGCCTGATCTGTGG       CGAAGACGACAAGATGGACA
AR isoform-b              GTGGAAGCTGCAAGGTCTTC       GCCACACTCTAGAGCTGCAA
AR pre-mRNA               CTGTGGAGATGAAGCTTCTGG      GGGCCCTGAAAGGTTAGTGT
MYH10 isoform-a exon5/6   GTTTCACTGGTTTAGGCGAT       AGGTGCTGGGAAGACAGAAA
MYH10 isoform-a exon6/7   CACAGTCTTCGCATTTCCAA       ATCGCCTAAACCAGTGAAAC
MYH10 isoform-b           CACAGTCTTCGCATTTCCAA       AGGTGCTGGGAAGACAGAAA

Circular RNA Analysis. LNCaP cells were transfected with either siCtrl or siHNRNPL (siL1 + siL2) for 3 d before being harvested. Total RNA was extracted with TRIzol reagent (Life Technology). The Ribominus kit (Life Technology) was used to remove rRNA. RNase R treatment was performed with 20 U RNase R (Epicentre) per microgram RNA at 37 °C for 15 min. RNAs with or without RNase R treatment were purified and subjected to library preparation and next generation sequencing. Duplicates were performed for each condition. Computational data analysis is elaborated in Bioinformatics Analysis, below. For individual cases of validation, qRT-PCR analysis was performed. The qPCR primers used for the circRNA validation are as follows:

Gene name       Forward primer (5′ to 3′)  Reverse primer (5′ to 3′)
circ-PRKAR1B    TCCAGCTTCTCGAAGTGCTC       TTCCCAAGGACTACAAAACCA
circ-ZMIZ1      GATGGAGCTGGAGTGAGGTG       TCTGCAGAAGGACCAGGACT
circ-FOXJ3      GACAAGCCTGTCCATACAAACC     TCTGTCATTGAACAAATGTTTCC
circ-PPFIA2-1   TTCAGAGATTTCTTCTTCCTTTTC   AGGTAAGGGAGCGACTGAGG
circ-PPFIA2-2   CCCTGCAGGCATTTAATTCT       ATTGTTGCCTTGCGTGAAC
circ-CCNY       CCGAAGTGCCACCAGATTAT       AAATGGTGGAGCAGGAACTG
pre-PRKAR1B     ATGGGAGTCCGACTGTGAGT       CTCCTGAGTTCAAGCGATCC
pre-ZMIZ1       TCTCAGGCACACCTCATCTG       GGCCCTCAGTTGTTCTCAAA
pre-FOXJ3       GTTCCTGGAGTGCTCACACA       AGGCAAAATCTCCCTCCTTC
pre-PPFIA2      TCTTCCTTTTCTAGAAGTTGTTCC   TCCCCATTCTTTTCATTTTGA
pre-CCNY        AGGACGAGTGAGCAATGGAC       GGGCAGAGTCAGGTTGTCTG

Bioinformatics Analysis. CRISPR screen data were analyzed by MAGeCK and MAGeCK-VISPR. De novo motif analysis was performed using MDscan (51) and SeqPos, both implemented in the Cistrome package (52). The RIP-seq and RNA-seq reads were aligned against the hg19 human reference genome with the UCSC known gene transcript annotation using TopHat v2.0.9 (53, 54). Because the RIP-seq and RNA-seq data are strand-specific, the RIP-seq peaks were identified for “+” and “–” strands separately using macs2 v2.0.10 (55), with a scanning window size of 100 bp, but without shifting the reads (peak length ≥ 150, FDR ≤ 0.01, fold-change ≥ 4, and RIP read count ≥ 50). The differential alternative splicing analysis was performed using rMATS (31), a computational tool that allows for the analysis of replicate RNA-seq data. Circular RNAs were identified using CIRCexplorer (32). Briefly, RNA-seq reads were first mapped using TopHat 2.0.9 (parameters: -a 6 --microexon-search -m 2) against the hg19 human reference genome with the UCSC known gene transcript annotation. Unmapped reads were then extracted and mapped onto the relevant reference genome using TopHat-Fusion (56) (TopHat 2.0.9, parameters: --fusion-search --keep-fasta-order --bowtie1 --no-coverage-search). Back-spliced junction reads were extracted and further realigned against existing gene annotations to determine the precise positions of the donor or acceptor splice site for each back-spliced event. Junction reads with shifted alignments against canonical splice sites were adjusted to the correct positions, and reads with alignments on different genes or noncanonical splice sites were discarded. Only those back-spliced junctions that were supported by at least one junction read in two replicates of RNase R-enriched RNA-seq were considered as candidate circRNAs. The back-spliced junction read count of individual circRNAs across different samples was normalized by the trimmed mean of M-values method implemented in edgeR (57, 58). The differential expression analysis of circRNAs was performed using LIMMA (59) (FDR ≤ 0.35, fold-change ≥ 2), requiring a total normalized count of no less than 15. Moreover, only those circRNAs whose parental gene showed little expression change (fold-change ≤ 1.2) were considered differentially expressed. Notably, increasing sequencing depth would further enhance the power and robustness to identify circRNAs or calculate their differential expression. Genomic alterations of HNRNPL were analyzed using cBioPortal (60, 61).
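The circRNA selection criteria in Bioinformatics Analysis can be summarized as a small filter. The sketch below is illustrative only: the record type `CircRNA`, its field names, and the example values are hypothetical, and the fold-change thresholds are assumed to apply in either direction (up or down), which the text does not state explicitly. It is not the LIMMA or edgeR computation itself, only the thresholding applied to their outputs.

```python
from dataclasses import dataclass


@dataclass
class CircRNA:
    name: str
    junction_reads_rep1: int       # back-spliced junction reads, RNase R replicate 1
    junction_reads_rep2: int       # back-spliced junction reads, RNase R replicate 2
    fold_change: float             # siHNRNPL / siCtrl, linear scale
    fdr: float
    total_normalized_count: float
    parental_fold_change: float    # parental gene, linear scale


def symmetric_fc(fc: float) -> float:
    """Fold-change magnitude regardless of direction (assumption)."""
    return max(fc, 1.0 / fc)


def is_candidate(c: CircRNA) -> bool:
    # At least one junction read in each of the two RNase R replicates.
    return c.junction_reads_rep1 >= 1 and c.junction_reads_rep2 >= 1


def is_differential(c: CircRNA) -> bool:
    return (is_candidate(c)
            and c.fdr <= 0.35
            and symmetric_fc(c.fold_change) >= 2.0
            and c.total_normalized_count >= 15
            and symmetric_fc(c.parental_fold_change) <= 1.2)


# Hypothetical records for illustration only.
examples = [
    CircRNA("circ_A", 3, 1, 0.4, 0.20, 40.0, 1.1),   # passes every criterion
    CircRNA("circ_B", 5, 0, 3.0, 0.01, 100.0, 1.0),  # missing in one replicate
    CircRNA("circ_C", 2, 2, 2.5, 0.10, 30.0, 1.8),   # parental gene changes
    CircRNA("circ_D", 2, 2, 2.5, 0.10, 10.0, 1.0),   # too few normalized counts
]
selected = [c.name for c in examples if is_differential(c)]
```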

[Fig. S1 graphic not reproduced. A: sgRNA fold-change for the groups Ribo_essential, 1k_essential, 1k_positive, and non-selective. B: −log10 (P value) for the categories ribosome, cell cycle, proteasome, ribonucleoprotein, and protein biosynthesis. C: −log10 (P value) for enriched PID, KEGG, and REACTOME pathways, including AR pathway. D: β-score distributions by copy number status, with the number of genes per group.]

Fig. S1. Genome-wide CRISPR screen in prostate cancer cells. (A) Fold-change of sgRNA abundance between day 14 and day 0 samples for the indicated groups of genes: ribosomal essential genes (Ribo_essential); top 1,000 essential genes (1k_essential); top 1,000 positively selected genes (1k_positive); and nonselective genes. *P < 6.75e-11. (B) The most significantly enriched functional categories of pan-essential genes calculated by DAVID functional annotation tools. (C) The most significantly enriched pathways for LNCaP essential genes. The pathway gene sets were extracted from MSigDB (software.broadinstitute.org/gsea/msigdb). (D) The distributions of β-essentiality scores of all genes, grouped by the copy number status (measured in log2 ratio) of the gene in LNCaP cells. A negative (or positive) β-score indicates that the corresponding gene has undergone negative (or positive) selection. The lower the β-score, the more essential the corresponding gene. *P = 0.035 by Wilcoxon rank sum test. Copy number variation (CNV) data were downloaded from The Cancer Cell Line Encyclopedia project (https://portals.broadinstitute.org/ccle/home).
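The screen parameters given in SI Materials and Methods (four 12-well plates at 2 × 10^6 cells per well, multiplicity of infection of 0.5, 30–40 million reads at ∼300× coverage) can be checked with simple arithmetic. The sketch below assumes that lentiviral integrations per cell follow a Poisson distribution, which is a common modeling assumption and not something stated in the text; the function names are hypothetical.

```python
import math


def poisson_infection(moi: float):
    """Return (fraction of cells infected, fraction of infected cells
    carrying exactly one integration), assuming Poisson statistics."""
    p_zero = math.exp(-moi)
    p_one = moi * math.exp(-moi)
    infected = 1.0 - p_zero
    return infected, p_one / infected


def implied_library_size(reads: float, coverage: float) -> float:
    """Number of library elements implied by a read depth and a coverage."""
    return reads / coverage


# Cells seeded: 4 plates x 12 wells x 2e6 cells, close to the stated 1 x 10^8.
cells_seeded = 4 * 12 * 2_000_000

infected_fraction, single_copy_fraction = poisson_infection(0.5)

# Library size range implied by 30-40 million reads at ~300x coverage.
low = implied_library_size(30_000_000, 300)
high = implied_library_size(40_000_000, 300)
```

Under the Poisson assumption, about 39% of cells receive at least one virus at this multiplicity of infection, and roughly three quarters of those carry a single integration, which is the usual motivation for a low multiplicity of infection in pooled screens.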

[Fig. S2 graphic not reproduced. A: relative mRNA level (siHNRNP/siControl) for siRNA-a and siRNA-b across the HNRNP genes. B: LNCaP soft agar assay, number of colonies for siCtrl, siL-1, siL-2, siK-1, and siK-2. C: relative cell growth (%) in DU145 and PC3 for siCtrl, siL-1, siL-2, siK-1, and siK-2, with HNRNPL, HNRNPK, and GAPDH blots. D: HNRNP genes across GM12878, H1, H2171, HCC1954, HCT-116, HMEC, HSMMtube, HUVEC, HeLa, HepG2, IMR90, Jurkat, K562, LNCaP, MCF-7, MM1S, NHDF-Ad, NHLF, Osteoblast, RPMI-8402, and Skeletal_Muscle_Myoblast.]

Fig. S2. HNRNPL is an essential RBP in prostate cancer. (A) RT-qPCR analysis of corresponding HNRNP gene expression upon two independent HNRNP knockdowns (siRNA-a and siRNA-b). Data are shown as mean ± SEM, n = 3. (B) Soft agar assay of LNCaP cells upon HNRNPL or HNRNPK knockdown. (C) Cell-growth effect of HNRNPL or HNRNPK knockdown in two AR-null prostate cancer cell lines, DU145 and PC3. (D) Superenhancer-associated HNRNP genes across 21 cell lines. The red box indicates the gene associated with a superenhancer in the respective cell line. The catalog of superenhancers in different human cell types was obtained from ref. 27. A gene is considered to be associated with a superenhancer if its transcription start site is within 10 kb of the superenhancer.
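The relative mRNA levels in A are reported as knockdown over control, and SI Materials and Methods names the ΔΔCt method with RPS28 as the internal control. A minimal sketch of that calculation follows. It assumes ideal amplification efficiency (a doubling per cycle), and the function names and Ct values are hypothetical, chosen only to illustrate the arithmetic and the mean ± SEM summary.

```python
import math
import statistics


def relative_expression(ct_target_kd: float, ct_ref_kd: float,
                        ct_target_ctrl: float, ct_ref_ctrl: float) -> float:
    """2^(-ddCt): target normalized to the reference gene (RPS28 here),
    then knockdown expressed relative to control."""
    delta_kd = ct_target_kd - ct_ref_kd        # dCt in knockdown sample
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl  # dCt in control sample
    return 2.0 ** (-(delta_kd - delta_ctrl))


def mean_sem(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    return (statistics.mean(values),
            statistics.stdev(values) / math.sqrt(len(values)))


# Hypothetical Ct values: the target appears 2 cycles later after knockdown.
level = relative_expression(ct_target_kd=26.0, ct_ref_kd=18.0,
                            ct_target_ctrl=24.0, ct_ref_ctrl=18.0)

# Hypothetical relative levels from three replicates (n = 3).
mean, sem = mean_sem([0.20, 0.25, 0.30])
```

A two-cycle delay with an unchanged reference gene corresponds to a relative level of 0.25, that is, a 75% knockdown under the efficiency assumption.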

[Fig. S3 graphic not reproduced. A: genome browser tracks of HNRNPL RIP-seq and input at NOS3, TCF3, PORCN, and TRIOBP. B: pulldown blot (input and HNRNPL) for sense (S) and antisense (AS) RNA of EGFP, RPS28, LARP, CTBP1, ROR2, and STX3, with a gel of the fragments for in vitro transcription of biotin RNA. C: HNRNPL RIP-seq, input, and RNA-seq (LNCaP_siCtrl, LNCaP_siL, LNCaP95) tracks over AR exon 2, exon 2b, and exon 3. D: relative cell growth (%) for siCtrl, siL-1, and siL-2 in parental versus AR-overexpressing cells and without versus with Dox, with AR, HNRNPL, and GAPDH blots.]

Fig. S3. RIP-seq analysis of HNRNPL and AR is a downstream target of HNRNPL. (A) Genome browser representation of previously reported HNRNPL binding (NOS3 and TCF3), 3′UTR binding (PORCN), and exon binding (TRIOBP) events. (B) RNA pulldown validation of several indicated HNRNPL-bound regions. The HNRNPL binding peak regions from indicated genes were PCR-amplified and in vitro transcribed to produce either the sense or antisense strand of corresponding RNAs with biotin labeling. The biotin-labeled RNA fragments were then incubated with LNCaP cell lysate and pulled down by streptavidin beads. RNA-associated proteins were resolved by Western blot analysis. EGFP and RPS28 served as negative controls for HNRNPL binding. (C) Genome browser representation of HNRNPL binding and the differentially spliced cryptic exon (exon 2b) over the AR locus from RIP-seq and RNA-seq data. (D) Overexpression of the full-length AR only partially rescues the cell-growth effect upon HNRNPL knockdown. (Left) LNCaP cells with stably transfected full-length AR and parental cells were used to determine the cell-growth effect upon HNRNPL knockdown. Cells were counted after 6 d of siRNA transfection. Western blot was performed 3 d after siRNA transfection. (Right) LNCaP cells were infected with lentivirus expressing doxycycline-inducible full-length AR to determine the cell-growth effect upon HNRNPL knockdown; 500 ng/mL doxycycline (Dox) was used and cell counting was performed after 6 d of induction. Western blot was done 3 d postinduction.
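The template primers listed under RNA Pulldown Assay each carry a phage promoter ahead of the gene-specific sequence, which is how the sense and antisense strands in B can be transcribed from one PCR product. The sketch below splits a primer into its parts using the standard T3 and T7 promoter sequences; the function name `split_primer` is hypothetical, and the six bases preceding the promoter are simply reported as a leader without assuming their purpose. The EGFP primer pair is taken from the primer list.

```python
T3_PROMOTER = "AATTAACCCTCACTAAAGGG"
T7_PROMOTER = "TAATACGACTCACTATAGGG"


def split_primer(primer: str):
    """Return (promoter name, 5' leader, gene-specific 3' part)."""
    for name, promoter in (("T3", T3_PROMOTER), ("T7", T7_PROMOTER)):
        idx = primer.find(promoter)
        if idx != -1:
            return name, primer[:idx], primer[idx + len(promoter):]
    raise ValueError("no T3 or T7 promoter found in primer")


# EGFP template primers from the RNA Pulldown Assay primer list.
EGFP_FWD = "GATCACAATTAACCCTCACTAAAGGGATGGTGAGCAAGGGCGAGGAGC"
EGFP_REV = "GATCACTAATACGACTCACTATAGGGTTATCTAGATCCGGTGGATCCC"

fwd_parts = split_primer(EGFP_FWD)
rev_parts = split_primer(EGFP_REV)
```

For every pair in the list, the first primer carries the T3 promoter and the second carries the T7 promoter, so each polymerase transcribes one strand of the amplified region.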

[Fig. S4 graphic not reproduced. A: HNRNPL RIP-seq and input tracks over MYH10. B: RIP-qPCR relative enrichment at the MYH10 intron for IgG, HNRNPL (4D11), and HNRNPL (D5). C: relative expression of MYH10 isoform-a (exons 5, 6, 7) and isoform-b (exons 5, 7) for siCtrl, siL-1, and siL-2, with qPCR primer positions. D: fold change (log2) of DHT up-regulated and DHT down-regulated genes for Ctrl and si-HNRNPL. E: Venn diagram of Veh and DHT peaks (608, 6284, 1872). F: relative expression of HNRNPL, KLK3, TMPRSS2, and FKBP5 for siCtrl, siL-1, and siL-2 under veh or DHT.]

Fig. S4. Splicing validation of MYH10 and HNRNPL does not affect DHT/AR target genes. (A) Genome browser representation of HNRNPL binding over the MYH10 locus as determined by RIP-seq. The peak region is indicated by a red rectangle. (B) RIP-qPCR confirmation of HNRNPL binding on the MYH10 intron. Data are shown as mean ± SEM, n = 3. (C) RT-qPCR confirmation of HNRNPL-dependent alternative splicing of MYH10. Data are shown as mean ± SEM, n = 3. (D) Box plot showing that there is no change in the expression levels of DHT up- or down-regulated genes upon HNRNPL knockdown. The DHT up- and down-regulated gene lists were obtained from Wang et al. (62). (E) Venn diagram of HNRNPL RIP-seq peaks between vehicle and DHT conditions. LNCaP cells were cultured in charcoal/dextran-treated FBS-containing medium for 3 d and then treated with either vehicle (1:1,000 ethanol) or 10 nM DHT for 4 h before being harvested for HNRNPL RIP-seq analysis. (F) Knockdown of HNRNPL does not affect classic target gene expression induced by androgen/AR signaling. Control or HNRNPL knockdown LNCaP cells were treated with either vehicle or 10 nM DHT for 4 h and RT-qPCR analysis was performed to evaluate the indicated androgen/AR target gene expression. Data are shown as mean ± SEM, n = 3.
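The peak comparison in E depends on the peak-calling thresholds given in Bioinformatics Analysis (peak length ≥ 150, FDR ≤ 0.01, fold-change ≥ 4, RIP read count ≥ 50, strands handled separately). A minimal sketch of that thresholding and of a strand-aware overlap count follows. The `Peak` record, the half-open coordinate convention, the one-base overlap rule, and the example peaks are assumptions for illustration; the text does not specify how shared peaks were defined for the Venn diagram.

```python
from typing import List, NamedTuple


class Peak(NamedTuple):
    chrom: str
    start: int          # 0-based, half-open interval [start, end) (assumption)
    end: int
    strand: str         # "+" or "-"
    fdr: float
    fold_change: float  # RIP over input
    rip_reads: int


def passes(p: Peak) -> bool:
    """Apply the thresholds stated in Bioinformatics Analysis."""
    return (p.end - p.start >= 150 and p.fdr <= 0.01
            and p.fold_change >= 4 and p.rip_reads >= 50)


def filter_peaks(peaks: List[Peak]) -> List[Peak]:
    return [p for p in peaks if passes(p)]


def overlaps(a: Peak, b: Peak) -> bool:
    """Same chromosome and strand, sharing at least one base (assumption)."""
    return (a.chrom == b.chrom and a.strand == b.strand
            and a.start < b.end and b.start < a.end)


def count_shared(a_peaks: List[Peak], b_peaks: List[Peak]) -> int:
    """Number of peaks in a_peaks that overlap any peak in b_peaks."""
    return sum(any(overlaps(a, b) for b in b_peaks) for a in a_peaks)


# Hypothetical peaks for illustration only.
veh = filter_peaks([
    Peak("chr1", 100, 300, "+", 0.001, 5.0, 80),
    Peak("chr1", 1000, 1100, "+", 0.001, 8.0, 90),  # too short
    Peak("chr2", 500, 700, "-", 0.05, 6.0, 60),     # FDR too high
])
dht = filter_peaks([
    Peak("chr1", 250, 450, "+", 0.005, 4.0, 50),
    Peak("chr1", 250, 450, "-", 0.005, 4.0, 50),    # opposite strand
])
shared = count_shared(veh, dht)
```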

[Fig. S5 graphic not reproduced. A: relative expression without (−) and with (+) RNase R for the mRNAs RPS28, HNRNPL, HNRNPK, and HNRNPC and the circRNAs ZMIZ1, FOXJ3, CCNY, PRKAR1B, PPFIA2-1, and PPFIA2-2. B: back-spliced junction traces for circ-PRKAR1B, circ-ZMIZ1, circ-FOXJ3, and circ-CCNY. C: back-spliced junction trace (exon 6 to exon 5) for the circ-GAPDH minigene. D: fraction of genes with circRNA (0.0 to 0.6) versus number of HNRNPL binding sites (0 to 8+).]

Fig. S5. CircRNA Regulation by HNRNPL. (A) RT-qPCR analysis of indicated circRNAs and mRNAs before and after RNase R treatment. (B) Sanger sequencing traces for confirmation of back-spliced exon junctions of indicated circRNAs. (C) Sanger sequencing traces for confirmation of back-spliced exon junction of the GAPDH circRNA from minigene assay. (D) Bar plot showing the relationship between the fractions of genes that form circular transcripts and the number of HNRNPL binding sites that they have.
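The summary in D groups genes by their number of HNRNPL binding sites and reports, for each group, the fraction that form circular transcripts. A minimal sketch of that tabulation is shown below, with counts of eight or more pooled into a single "8+" bin as on the plot axis. The function name and the input records are hypothetical and do not reproduce the values in the figure.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def fraction_by_binding_sites(genes: Iterable[Tuple[int, bool]],
                              cap: int = 8) -> Dict[int, float]:
    """genes: (number of binding sites, forms a circRNA) per gene.
    Returns {bin: fraction of genes with a circRNA}; bin `cap` means cap+."""
    totals: Dict[int, int] = defaultdict(int)
    with_circ: Dict[int, int] = defaultdict(int)
    for n_sites, has_circ in genes:
        bin_id = min(n_sites, cap)  # pool everything at or above the cap
        totals[bin_id] += 1
        if has_circ:
            with_circ[bin_id] += 1
    return {b: with_circ[b] / totals[b] for b in sorted(totals)}


# Hypothetical gene records for illustration only.
toy_genes = [(0, False), (0, False), (0, True),
             (3, True), (3, False),
             (9, True), (12, True)]
fractions = fraction_by_binding_sites(toy_genes)
```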

[Fig. S6 graphic not reproduced. A: percentage positivity (%), benign versus tumor (p = 0.019). B: mean intensity, benign versus tumor (p = 0.025). C: enrichment score (ES) against rank by essentiality, from negative to positive selection, for HNRNPL-regulated alternatively spliced genes (p = 0.466) and HNRNPL-regulated circRNA genes (p = 0.02). D: alteration frequency (0% to 24%; deletion, amplification, multiple alterations) by cancer type, with availability of mutation and CNA data, across cBioPortal studies, CCLE, and NCI-60.]

Fig. S6. Clinical relevance of HNRNPL. (A) Percentage of HNRNPL positive staining as determined by tissue microarray analysis. (B) Mean intensity of HNRNPL staining as determined by tissue microarray analysis. (C) Enrichment plot by gene set enrichment analysis. HNRNPL-regulated circRNA genes (Dataset S4), but not HNRNPL-regulated alternatively spliced genes (Dataset S3), are significantly enriched for LNCaP essential genes. (D) Genomic alterations of HNRNPL across several cancer types from cBioPortal.

Dataset S1. CRISPR screen data

Dataset S1

Dataset S2. HNRNPL-associated RNA regions and genes

Dataset S2

Dataset S3. HNRNPL-dependent splicing events

Dataset S3

Dataset S4. CircRNAs in LNCaP cells

Dataset S4
