ARTICLE

https://doi.org/10.1038/s41467-021-24795-1 OPEN AKT3-mediated IWS1 phosphorylation promotes the proliferation of EGFR-mutant lung adenocarcinomas through cell cycle-regulated U2AF2 RNA splicing ✉ Georgios I. Laliotis 1,2,3,11 , Evangelia Chavdoula1,2, Maria D. Paraskevopoulou4, Abdul Kaba1,2, Alessandro La Ferlita1,2,5, Satishkumar Singh2,6, Vollter Anastas 1,2,7, Keith A. Nair II 1,2, Arturo Orlacchio1,2, Vasiliki Taraslia4,12, Ioannis Vlachos8,13, Marina Capece1,2, Artemis Hatzigeorgiou8, Dario Palmieri1,2,

1234567890():,; Christos Tsatsanis3,9, Salvatore Alaimo 5, Lalit Sehgal 2,6, David P. Carbone 2,10, Vincenzo Coppola 1,2 & ✉ Philip N. Tsichlis 1,2,7

AKT-phosphorylated IWS1 regulates alternative RNA splicing via a pathway that is active in . RNA-seq studies in lung adenocarcinoma cells lacking phosphorylated IWS1, identified a exon 2-deficient U2AF2 splice variant. Here, we show that exon 2 inclusion in the U2AF2 mRNA is a cell cycle-dependent process that is regulated by LEDGF/SRSF1 splicing complexes, whose assembly is controlled by the IWS1 phosphorylation-dependent deposition of marks in the body of target . The exon 2-deficient U2AF2 mRNA encodes a Serine-Arginine-Rich (RS) domain-deficient U2AF65, which is defective in CDCA5 pre-mRNA processing. This results in downregulation of the CDCA5-encoded Sororin, a phosphorylation target and regulator of ERK, G2/M arrest and impaired cell proliferation and tumor growth. Analysis of human lung adenocarcinomas, confirmed activation of the pathway in EGFR-mutant tumors and showed that pathway activity correlates with tumor stage, histologic grade, metastasis, relapse after treatment, and poor prognosis.

1 Department of Cancer Biology and Genetics, The Ohio State University, Columbus, OH, USA. 2 The Ohio State University Comprehensive Cancer Center- Arthur G. James Cancer Hospital and Richard J. Solove Research Institute, Columbus, OH, USA. 3 School of Medicine, University of Crete, Heraklion, Crete, Greece. 4 Molecular Oncology Research Institute, Tufts Medical Center, Boston, MA, USA. 5 Department of Clinical and Experimental Medicine, Bioinformatics Unit, University of Catania, Catania, Italy. 6 Department of Medicine, Division of Hematology, The Ohio State University, Columbus, OH, USA. 7 Tufts Graduate School of Biomedical Sciences, Program in Genetics, Boston, MA, USA. 8 DIANA-Lab, Hellenic Pasteur Institute, Athens, Greece. 9 Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology Hellas, Heraklion, Crete, Greece. 10 Department of Internal Medicine, Division of Medical Oncology, The Ohio State University Medical Center, Columbus, OH, USA. 11Present address: Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, MD, USA. 12Present address: Center of Basic Research, Biomedical Research Foundation of the Academy of Athens, Athens, Greece. 13Present address: Department Of Pathology, Beth Israel-Deaconess Medical Center, Harvard ✉ Medical School, Boston, MA, USA. email: [email protected]; [email protected]

NATURE COMMUNICATIONS | (2021) 12:4624 | https://doi.org/10.1038/s41467-021-24795-1 | www.nature.com/naturecommunications 1 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-24795-1

he higher complexity of the proteome relative to the gen- sequencing of 183 lung adenocarcinomas in one study, ome is due to multiple factors, one of which is RNA identified somatic in the splicing factors U2AF1 and T 1 splicing . At least 97% of the genes in the RBM10 and in several epigenetic factors, which may also regulate have introns2 and more than 95% of them undergo alternative RNA splicing26. Such mutations may render the cancer cells RNA splicing3. RNA splicing therefore plays a critical role in vulnerable to modulators of the core splicing machinery, as determining the biological phenotype4. Several additional obser- suggested by experiments showing that H3B-8800, a recently vations also support the importance of alternative RNA splicing described modulator of SF3B1, preferentially kills cancer cells in biology. First, the pattern of alternative RNA splicing changes with mutations in spliceosomal components. Mechanistically, during differentiation and many of these changes contribute to H3B-8800 may function by promoting the retention of short GC- the differentiation process and/or the phenotype of the differ- rich introns, in the mRNA of mutant cells27. entiated cells5. In addition, cell survival and proliferation signals Activating mutations in epidermal growth factor receptor may function by regulating alternative RNA splicing6. Finally, (EGFR) and Kirsten rat sarcoma (KRAS) are the most common changes in alternative RNA splicing caused by mutations in cis- genetic alterations in NSCLC with 69% of the tumors harboring acting regulatory elements, such as splicing enhancers or silen- mutations in these genes. Both types of mutations activate mul- cers, and changes in the expression and/or post-translational tiple signaling pathways, including the AKT and the ERK path- modification of RNA splicing regulators have been linked to the ways and both carry poor prognosis. Data in this report delineate pathogenesis of several diseases, including cancer7–9. a pathway, which regulates alternative RNA splicing in lung Two complementary molecular mechanisms may contribute to adenocarcinomas, and they show that the influence of the path- alternative RNA splicing, the rate of transcription and way on the biology of tumors harboring EGFR mutations is more modifications in the body of transcribed genes. Splicing occurs robust, than its influence on the biology of tumors harboring cotranscriptionally and when the rate of transcription is high, it KRAS mutations. increases the probability for exons that are not efficiently spliced, We had previously shown that the transcription elongation to be spliced out of the mature transcript10. Also, chromatin factors IWS1 and AKT play a central role in the regulation of the modifications are recognized by readers of epigenetic marks, alternative splicing of FGFR2, by promoting the skipping of exon which form the nucleus for the assembly of molecular complexes 8 from the mature FGFR2 mRNA transcript. The exclusion of the that bind to, and functionally regulate RNA-associated enhancers FGFR2 exon 8 depends on the phosphorylation of IWS1 by AKT or silencers of splicing, perhaps by altering the rate of assembly (primarily AKT3) on Ser720/Thr721, which recruits SETD2 to an and the composition of spliceosomal complexes11–13. IWS1-containing complex in the C-terminal domain (CTD) of One cellular function, which interfaces with the RNA splicing RNA polymerase II. This results in the trimethylation of histone machinery, is the cell cycle14. The connection between RNA H3 at K36 in the body of the transcribed FGFR2 , which splicing and the cell cycle was originally suggested by experiments triggers the skipping of exon 8 from the mature transcript. As a in the yeast Saccharomyces cerevisiae, which showed that the follow-up to this study, we proceeded to address the global effects impact of mutations in the cell cycle gene cdc5, can be suppressed of IWS1 and IWS1 phosphorylation in lung adenocarcinomas. To by removing the intron of the tubulin-encoding gene TUB115. this end, we carried out an RNA-seq experiment in the lung Progression through the cell cycle depends on periodic changes of adenocarcinoma cell line NCI-H522, in which IWS1 was either gene function, which can be achieved by multiple mechanisms, knocked down or replaced by its phosphorylation site mutant one of which is the periodic modulation of RNA splicing and S720A/T721A. The results of this analysis revealed that exon spliceosomal components14. Importantly, some of the changes in inclusion was significantly more common than the exon skipping RNA splicing are regulated by periodic shifts in the expression we observed with the FGFR2 gene. One of the genes undergoing and/or the activity of known cell cycle regulators, such as the exon inclusion was the U2AF2 gene, which encodes the core Aurora kinases16, the large RS domain-containing protein SON17, splicing factor U2AF6528–30. Exon inclusion was also under the and the RNA-recognition motif (RRM)-containing control of SETD2 and histone H3K36 trimethylation. However, TgRRM118 and CLK114. Despite the fact that a connection the reader of the histone H3K36me3 mark was the p52 isoform of between RNA splicing and the regulation of the cell cycle has LEDGF12, which interacts with the RNA-binding protein already been established, our understanding of the mechanisms SRSF131. Therefore, although the chromatin-modification mark and the consequences of RNA splicing during cell cycle pro- promoting exon inclusion is the same as the mark promoting gression, remains rudimentary. Exploring the links between RNA exon exclusion, the effector complexes assembled on H3K36me3 splicing and the cell cycle is likely to yield important information, in the two cases are different. The alternatively spliced exon 2 with a significant impact in our understanding of human disease, encodes the U2AF65 N-terminal serine–arginine-rich domain especially cancer. (RS domain), which is required for the interaction between Shifts in RNA splicing play an important role in many types of U2AF65 and the splicing cofactor Prp1932–35. The binding of human cancer, including non-small-cell lung cancer (NSCLC)19, U2AF65 to Prp19 is required for RNA splicing and expression of the second most common cancer, with more than 250,000 new a gene set, which includes CDCA5, the gene encoding Sororin, a cases per year in the United States. There are three NSCLC his- component of the cohesin complex36. Here, we show that the tological subtypes, adenocarcinomas, squamous cell carcinomas, IWS1-regulated Sororin is phosphorylated by ERK and that and large-cell carcinomas20, and all of them carry a very poor phosphorylated Sororin promotes ERK phosphorylation. prognosis with <5% survival in 5 years21. Following our original Remarkably, inhibition of the IWS1 phosphorylation pathway, observations, linking the alternative RNA splicing of FGFR2 to which regulates the Sororin/ERK positive feedback loop, results in the biology of NSCLC22, several additional shifts in alternative inhibition of ERK phosphorylation, even in tumors with acti- RNA splicing were described in these tumors. These include vating KRAS or EGFR mutations. alternative RNA splicing shifts in the Bcl-XL, CD44, Androgen The Sororin/ERK feedback loop described above, promotes the Receptor, HLA-G, and PKM genes. These shifts ultimately pro- expression of CDK1 and CCNB1 (Cyclin B1), and the progression mote cell survival, metastasis, and chemoresistance, inhibit through the G2/M phase of the cell cycle. Importantly, similar to immune-surveillance mechanisms, or endow the cancer cells with other cell cycle-regulatory pathways, the IWS1 phosphorylation a metabolic advantage13,23–25. Massive parallel exome and pathway is also cell cycle-regulated. Mouse xenograft experiments

2 NATURE COMMUNICATIONS | (2021) 12:4624 | https://doi.org/10.1038/s41467-021-24795-1 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-24795-1 ARTICLE confirmed that the IWS1 phosphorylation-dependent U2AF2 and we observed that the most common event associated with the mRNA splicing controls tumor growth in vivo. Moreover, our expression and phosphorylation of IWS1, was exon inclusion studies on human lung adenocarcinoma samples and our analyses (Fig. 1c). of the data on lung adenocarcinomas in publicly available data- GO ()–biological process-based functional sets, confirmed the activation of the IWS1 phosphorylation analyses of the alternative splicing events in shControl versus pathway in these tumors. More important, the data derived from shIWS1 and shIWS1/WT-R versus shIWS1/MT-R cells identified these studies also showed that the activity of this pathway cor- GO functions “RNA splicing” and “RNA metabolic process”, relates with tumor grade, stage, metastatic potential, relapse after among the top functions regulated by IWS1 expression and treatment, and reduced patient survival, in patients with tumors phosphorylation (Fig. 1d, e). Comparisons were limited to harboring activating EGFR, but not KRAS mutations. This alternative splicing events whose abundance changed significantly observation was in agreement with our data showing that tumor with the expression and phosphorylation of IWS1 (p < 0.05). cell lines with EGFR mutations exhibit a stronger dependence on These findings imply that the effect of IWS1 on RNA processing this pathway than tumor cell lines with KRAS mutations. may be direct or indirect. The indirect effect may be due to the Overall, the data in this report describe a pathway, which starts IWS1 expression and phosphorylation-dependent differential with the phosphorylation of IWS1 by AKT3 and results in the regulation of genes involved in RNA processing. modulation of cell-cycle progression. The importance of this Validation of the RNA-seq data, using RT-PCR, confirmed pathway to human cancer was confirmed by our studies on several IWS1 and IWS1 phosphorylation-regulated alternative human lung adenocarcinomas and by meta-analysis of pre- RNA-splicing events, characterized by exon inclusion (Fig. 1 and existing patient data. Supplementary Fig. 1g–j). One of these events is the inclusion of exon 2 in the mature mRNA transcript of U2AF2, the gene encoding the splicing factor U2AF65. Whereas the predominant Results U2AF2 mRNA transcript in shControl and shIWS1/WT-R cells IWS1 expression and phosphorylation regulate alternative contains exon 2, the predominant transcript in shIWS1 and mRNA splicing. We had previously reported that IWS1 phos- shIWS1/MT-R cells is a transcript lacking exon 2 (Fig. 1f, g). The phorylation at Ser720/Thr721, primarily by AKT3, resulted in the decrease in the E2/E3 ratio in shIWS1 and shIWS1/MT-R, exclusion of exon 8 from the FGFR2 mRNA in the human relative to shControl cells, and the rescue of the shIWS1 NSCLC cell lines NCI-H522 and NCI-H129922. To explore the phenotype by wild-type IWS1, were confirmed by quantitative molecular mechanisms driving IWS1 phosphorylation-regulated RT-PCR (Supplementary Fig. 1e upper panels). Importantly, the RNA splicing and , we examined the tran- knockdown of IWS1 and the rescue with the IWS1-S720A/T721A scriptome of shControl, shIWS1, shIWS1/wild-type IWS1 rescue mutant did not significantly change the expression of U2AF2 or (shIWS1/WT-R), and shIWS1/IWS1S720A/T721A rescue the inclusion of the U2AF2 exon 3 in NCI-H522 and NCI-H1299 (shIWS1/MT-R) NCI-H522 cells by RNA-Seq. This allowed us to cells (Supplementary Fig. 1f, upper panel) (Supplementary Fig. 1f, identify additional target genes of the IWS1 phosphorylation lower panel). In parallel experiments, we examined the alternative pathway. First, we confirmed the downregulation of IWS1 in cells RNA splicing of FGFR2 in the same cells, by qRT-PCR. The transduced with the lentiviral shIWS1 construct and the rescue of results showed that the IIIb/IIIc FGFR2 transcript ratio was IWS1 expression in shIWS1-transduced cells with Flag-tagged increased in shIWS1 and shIWS1/MT-R cells (Supplementary wild-type IWS1, or the mutant IWS1-S720A/T721A (Fig. 1a). Fig. 1e, lower panel), confirming our earlier findings22. The Analysis of the RNA-seq data, identified 1621 genes, differentially preceding findings showed that the inclusion of exon 2 in the expressed between shControl and shIWS1 cells and 562 genes U2AF2 mRNA depends on IWS1 phosphorylation, and suggested differentially expressed between shIWS1/WT-R and shIWS1/MT- that the shift in U2AF2 mRNA splicing caused by the knockdown R cells (p ≤ 0.01, FDR ≤ 0.2). Three hundred and forty genes of IWS1 in NCI-H522 and NCI-H1299 cells might be rescued by upregulated or downregulated in shIWS1 cells were similarly the phosphomimetic mutant IWS1-S720D/T721E (DE MT-R) upregulated or downregulated in shIWS1/MT-R cells (Supple- (Fig. 1a). Expression of this mutant in shIWS1-transduced cells mentary Fig. 1a, b). Moreover, 19 out of the FDR-ranked top 100 indeed rescued the phenotype and confirmed the critical role of differentially expressed genes, in shControl vs shIWS1 cells, were IWS1 phosphorylation in this process (Supplementary Fig. 1f, e, also differentially expressed in shIWS1/WT-R versus shIWS1/ upper panel). MT-R cells (Supplementary Fig. 1c). Gene-set enrichment ana- To determine whether IWS1 is directly involved in U2AF2 lysis of differentially expressed genes37 revealed significant mRNA splicing, we used chromatin immunoprecipitation (ChIP) enrichment of genes involved in RNA metabolism and the reg- to examine the binding of IWS1 to the U2AF2 exons 2 and 3 in ulation of RNA processing (Supplementary Fig. 1d). shControl, shIWS1, shIWS1-WT-R, shIWS1-MT-R NCI-H522, Analysis of the data for differential exon usage, by DEXseq38 and NCI-H1299 cells. The results revealed that IWS1 binds exons identified 1434 differentially employed exons (corresponding to 2 and 3 of U2AF2, suggesting its direct involvement to the U2AF2 851 genes) between shControl and shIWS1 cells and 436 alternative RNA splicing (Fig. 1h). IWS1 WT and IWS1-S720A/ differentially employed exons (corresponding to 273 genes) T721A bind equally well (Fig. 1h), suggesting that IWS1 between shIWS1/WT-R and shIWS1/MT-R cells (FDR ≤ 0.05). phosphorylation controls U2AF2 exon 2 alternative RNA splicing The 1796 differentially expressed genes and the 692 genes with by regulating events occurring after the binding of IWS1 to differential exon usage in shIWS1 versus shControl cells exhibited chromatin. an overlap of 165 genes (p ≤ 0.05). Similarly, the 858 differentially expressed genes and the 230 genes with differential exon usage, between shIWS1/MT-R and shIWS/WT-R cells, revealed an IWS1 phosphorylation-dependent mRNA splicing of U2AF2 is overlap of 44 genes (Fig. 1b). regulated by serum and IGF1 via AKT3. IWS1 is phosphory- Exon inclusion and exon skipping represent common alter- lated by AKT3 and to a lesser extent by AKT1 at Ser720/ native RNA-splicing events. Our earlier studies had linked IWS1 Thr72122. To determine the physiological significance of this phosphorylation with an exon skipping event in the FGFR2 observation, we examined whether IGF1 stimulation of serum- gene22. Here, we analyzed the exon usage data in both shIWS1 starved NCI-H522 and NCI-H1299 cells promotes U2AF2 exon 2 versus shControl and shIWS1/WT-R versus shIWS1/MT-R cells, inclusion along with the expected AKT activation and IWS1

NATURE COMMUNICATIONS | (2021) 12:4624 | https://doi.org/10.1038/s41467-021-24795-1 | www.nature.com/naturecommunications 3 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-24795-1

a b shCon +-- -- shIWS1 -++++ NCI-H522 WT-R --+-- p<0.05 MT-R --- + - shCon WT-R vs vs DE MT-R ----+ (kDa) shIWS1 MT-R -150 IWS1 differentially differentially -150 expressed expressed Flag-IWS1 NCI-H522 858 44 230 -45 β-αctin -150 IWS1 differentially differentially spliced spliced -150 NCI-H1299 Flag-IWS1

-45 β-αctin c d NCI-H522 NCI-H522 GO Biological Processes gene expression # of significant events (kDa)nucleobase-containing compound metabolic process response to stimulus with psi ( ) >0.6 RNA splicing RNA metabolic process regulation of gene expression cellular protein metabolic process cell communication shCon signal transduction shCon developmental process organelle organization vs multicellular organism development shIWS1 transcription, DNA-templated shIWS1 protein localization regulation of transcription, DNA-templated exon inclusion 0 150 300 450 0 150 300 450 600 response to stimulus nucleobase-containing compound metabolic process RNA metabolic process RNA splicing exon exclusion signal transduction regulation of gene expression WT response to organic substance anatomical structure development rescue multicellular organism development WT-R cell differentiation amide transport vs MT establishment of protein localization protein transport rescue ion transport MT-R transmemberane transport ion transmemberane transport 0 50 100 150 0 50 100 150 Combined score

e f shCon +---- shIWS1 -++++ NCI-H522 WT-R --+-- 20 shCon MT-R --- + - vs DE MT-R ----+ U2AF2 15 shIWS1 -1300 bp -1100 bp 123

IFT88 -350 bp 345 10 U2AF2 NCI-H522 -550 bp 8910

GO Splicing -400 bp GAPDH -log p value p -log 5 U2AF2 -1300 bp -1100 bp 123 0 0 0.125 0.25 0.375 0.5 0.625 0.75 0.875 1.00 NCI-H1299 -350 bp 345 20 -550 bp 8910 WT-R SLC12A2

-400 bp GAPDH e

u 15 vs l

a U2AF2 quantification

v MT-R STXBP1

p l

10 1.5

o g r shCon

C1qTNF6 t *** **** o

n p=0.0658

l - o shIWS1

GO Splicing c 5 *** ****

o 1.0 WT-R

t

e MT-R

v i

0 t a

0 0.125 0.25 0.375 0.5 0.625 0.75 0.875 1.00 l DE MT-R

e 0.5

R

psi (ψ) 3

E

/ 2

E 0 NCI-H522 NCI-H1299 g h E3 TSS E2 U2AF2 cDNA

U2AF2 Exon 1 Exon 2 shCon shIWS1 WT-R MT-R ChIP IWS1 NCI-H522

10 (n=3) t

n ** e 8

shCon m

h *

6

c

i r

n 4

e p=0.1632 p=0.3112

C G AG AATAAACAAG A GC GGGACAAGG A GAAC d

l 2 o

F 0 Exon 1 Exon 3 TSS E2 E3 GAPDH E3 ChIP IWS1 NCI-H1299

10 (n=3)

t n

e ***

8 n.s

m h 6 **

shIWS1 c

i r

n 4

e

d

l n.s n.s 2

C G AG AATAAACAAG A CAAACCTTT GACCAGA o F 0 TSS E2 E3 GAPDH E3 phosphorylation. In addition, we examined whether IGF1 Supplementary Fig. 2a). To address whether the U2AF2 exon 2 enhances the skipping of the FGFR2 exon 8, increasing the ratio inclusion depends on AKT activation, we treated NCI-H522 and of IIIb/IIIc FGFR2 transcripts. The results confirmed that both NCI-H1299 cells growing in serum-containing media with 5 μM the inclusion of exon 2 in the U2AF2 transcripts and the skipping of the AKT inhibitor MK2206. The results confirmed that at this of FGFR2 exon 8, indeed parallel the activation of AKT (Fig. 2a, concentration, which fully inhibits all AKT isoforms, MK2206

4 NATURE COMMUNICATIONS | (2021) 12:4624 | https://doi.org/10.1038/s41467-021-24795-1 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-24795-1 ARTICLE

Fig. 1 IWS1 expression and/or phosphorylation regulate alternative mRNA splicing. a Western blots of NCI-H522 and NCI-H1299 cell lysates, transduced and probed with the indicated constructs and antibodies. b Overlaps between differentially-expressed and differentially spliced genes in the indicated groups (q < 0.05). c Bar graphs of alternative splicing events with exon inclusion. The comparisons were limited to alternative splicing events with a percentage of the alternatively spliced exon spliced in (psi/ψ) >0.6 and a p value < 0.05. d GO analysis of statistically significant alternative splicing events in the indicated groups (p < 0.05). Red boxes highlight gene sets involved in the regulation of RNA processing. e Volcano plots of all the exon inclusion and exclusion alternative splicing events, detected by DEXseq in the indicated groups. The statistically significant events (p < 0.05) with a percentage spliced in (psi/ψ) level of >0.6 or <0.4 are shown in red. Statistically significant events in the GO functions RNA splicing or RNA metabolic processes are shown in green. Alternatively spliced IWS1 targets validated in this report are shown in blue. f (Upper panel) RT-PCR of U2AF2 in the indicated NCI-H522 and NCI-H1299 cells, using primers mapping in exons 1 and 3, 3 and 5, and 8 and 10 (control). GAPDH was used as the loading control. The U2AF2 E2/E3 ratio was calculated from the GAPDH-normalized levels of the RT-PCR products. The bars show the mean ratio ± SD in the indicated NCI-H522 and NCI-H1299 cells relative to shControl. g Sequencing chromatograms of the two alternatively spliced U2AF2 RNA transcripts. h (Upper) UCSC browser snapshot showing exons 1, 2, and 3 of the human U2AF2 gene. The map position of the PCR primer sets used in the ChIP experiments in this figure is indicated by blue marks. (Lower) ChIP assays of IWS1 on the U2AF2 and GAPDH genes in shControl, shIWS1 shIWS1/WT-R and shIWS1/MT-R NCI-H522 and NCI-H1299 cells. Bars show the mean fold enrichment (anti-IWS1 IP, vs IgG control IP) in IWS1 binding, in shIWS1 relative to shControl cells or in shIWS1/MT-R relative to shIWS1/WT-R cells ±SD. Data were normalized relative to the input (2%). All assays were done in triplicate, on three biological replicates. n.s: non-significant, *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001 (one-sided unpaired t-test).

a NCI-H522 NCI-H1299 bcNCI-H522 NCI-H1299 NCI-H522 NCI-H1299 MK2206 -+ -+ IGF-1 -+-+(KDa) (KDa) -150 -150 p-IWS1 (S720) p-IWS1 (S720) -150 IWS1 -150 IWS1 (KDa) shAKT1 shAKT1 shControl shControl shControl shAKT2 shControl shAKT3 shControl -75 shAKT2 shControl shAKT3 -75 -75 p-AKT (T308) p-AKT (T308) AKT1 -50 -50 -50 -75 -75 -75 p-AKT (S473) AKT2 -50 p-AKT (S473) -50 -50 -75 -75 -75 AKT3 pan-AKT -50 -50 pan-AKT -50 β-actin -100 -37 GAPDH GAPDH -37 U2AF2 U2AF2 U2AF2 -1300 bp -1300 bp -1300 bp 123 -1100 bp 123 -1100 bp 123 -1100 bp -550 bp -550 bp -550 bp 8910 8910 8910 -400 bp -400 bp GAPDH GAPDH -400 bp GAPDH

U2AF2 quantification U2AF2 quantification U2AF2 quantification 15 1.5 1.5

**** **** shControl l

**** **** starved l **** **** DMSO

o o

r r t

t *** *** **** **** shAKT1 n

IGF-1 MK2206 n

o o

c c

10 1.0 1.0 shAKT2

o o

t t

e

e shAKT3

v v

i i

t t

a a

l l e

5 e 0.5 R

R 0.5

3 3

E E

/ /

2 2

E E2/E3 Relative to control to Relative E2/E3 E

0 0 0 NCI-H522 NCI-H1299 NCI-H522 NCI-H1299 NCI-H522 NCI-H1299

Fig. 2 IWS1 phosphorylation-dependent mRNA splicing of U2AF2 is regulated by serum and IGF-1-induced signals, transduced by AKT3. a (Upper panel) Following serum starvation for 24 h, NCI-H522 and NC-H1299 cells were stimulated with IGF-1 or harvested. Four hours later, the stimulated cells were lysed and all the lysates were probed with the indicated antibodies. (Middle panel) RT-PCR with mRNA derived from the cells shown in the upper panel and oligonucleotide primers mapping to exons 1 and 3 of the U2AF2 gene (Lower panel) The U2AF2 gene E2/E3 ratio was calculated following quantification of the RT-PCR products in the middle panel. The bars show this ratio (mean± SD) in IGF1-stimulated and in unstimulated, serum-starved NCI-H522 and NCI-H1299 cells. b (Upper panel) Lysates of NCI-H522 and NCI-H1299 cells, treated with MK2206 (5μM) or DMSO for 4 h, were probed with the indicated antibodies. (Middle panel) RT-PCR reactions, using mRNA derived from the cells in the upper panel and oligonucleotide primers mapping in U2AF2 exons 1 and 3. (Lower panel) The U2AF2 mRNA E2/E3 ratio was calculated following quantification of the RT-PCR products in the middle panel. The bars show this ratio (mean± SD) in MK2206-treated (5μM) and in DMSO-treated NCI-H522 and NCI-H1299 cells. c (Upper panel) Lysates of NCI-H522 and NCI-H1299 cells, transduced with shControl, shAKT1, shAKT2, or shAKT3, were probed with the indicated antibodies. (Middle panel) RT- PCR reactions, using mRNA derived from the cells in the upper panel and oligonucleotide primers mapping in U2AF2 exons 1 and 3 show that knocking down AKT3, and to a lesser extent AKT1, inhibits the inclusion of exon 2 in mature U2AF2 mRNA transcripts in both cell lines. (Lower panel) The U2AF2 E2/E3 ratio was calculated following quantification of the RT-PCR products in the middle panel. The bars show this ratio (mean ± SD) in shControl, shAKT1, shAKT2 and shAKT3 NCI-H522 and NCI-H1299 cells. All experiments in this figure were done in triplicate, on three biological replicates. n.s: non- significant, *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001 (one-sided unpaired t-test). inhibits both AKT and IWS1 phosphorylation and U2AF2 exon 2 constructs, and we examined the effects of the transduction on inclusion (Fig. 2b, Supplementary Fig 2b, upper panels). The the alternative RNA of splicing of U2AF2 and FGFR2. The results same treatment also increased the relative abundance of the exon (Fig. 2c, Supplementary Fig. 2c) confirmed that only the knock- 8-containing IIIb transcript of FGFR2 (Supplementary Fig. 2b down of AKT1 and AKT3 phenocopies the MK2206 results and lower panel), as expected. To determine whether it is the AKT3 that the effect of the AKT3 knockdown is significantly more isoform of AKT, which is responsible for these shifts in alter- robust than the effect of the AKT1 knockdown. These findings native RNA splicing, we transduced NCI-H522 and NCI-H1299 are consistent with the phosphorylation of IWS1 primarily by cells with shControl, shAKT1, shAKT2, or shAKT3 lentiviral AKT3, and strongly support the hypothesis that the

NATURE COMMUNICATIONS | (2021) 12:4624 | https://doi.org/10.1038/s41467-021-24795-1 | www.nature.com/naturecommunications 5 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-24795-1 phosphorylation of IWS1, primarily by AKT3 and secondarily by trimethyltransferase, there are several methyltransferases, AKT1, plays an important role in the physiological regulation of which catalyze mono- or dimethylation of histone H3 at K36 and alternative RNA splicing by external signals. may influence the SETD2 output. Transfection of the NCI-H522 and NCI-H1299 cells with siRNAs targeting a set of methyl- transferases that are known to catalyze histone H3K36 mono- U2AF2 exon 2 inclusion, induced by IWS1 phosphorylation at and dimethylation (NSD1, NSD2, and NSD3), or only dimethyla- Ser720/Thr721, depends on histone H3K36 trimethylation by tion (SMYD2 and ASHL1)42–47, revealed that none of them SETD2. We had previously reported that IWS1 phosphorylation contributes to the regulation of the alternative splicing of the by AKT3 promotes the exclusion of exon 8 from the mature U2AF2 exon 2 (Supplementary Fig. 3e). FGFR2 mRNA transcript, via a process that depends on histone The histone H3K36me3 marks are erased by KDM4A and H3K36 trimethylation by SETD2, and that the latter is recruited KDM4C, two members of the KDM4 JmjC domain–histone to the CTD of RNA Pol lI by phosphorylated IWS122.To demethylase family48–50. Ectopic expression of KDM4A, KDM4B, determine whether the U2AF2 exon 2 inclusion phenotype also and KDM4C from lentivirus constructs in NCI-H522 and NCI- depends on histone H3K36 trimethylation, we performed ChIP H1299 cells showed that none of the three altered the IWS1- assays in shControl, shIWS1, shIWS1/WT-R, and shIWS1/MT-R dependent pattern of the U2AF2 alternative RNA splicing NCI-H522 and NCI-H1299 cells, addressing the abundance of (Supplementary Fig. 3f). This was expected for KDM4B, which H3K36me3 marks on exons 2 and 3 of U2AF2. The U2AF2 does not target H3K36me3, and was used as a negative control, transcriptional start site (TSS) and the GAPDH gene, as well as but it was unexpected for KDM4A and KDM4C. Overall, these exons 8 and 9 of FGFR2, were used as controls. The results data suggest that if KDM4A and KDM4C contribute to the confirmed the IWS1 phosphorylation-dependent trimethylation demethylation of histone H3K36me3, they may do so, only under of histone H3 at K36, in exons 8 and 9 of the FGFR2 gene conditions that need to be determined. Alternatively, it is possible (Supplementary Fig. 3a). In addition, they showed that the IWS1 that the transcription-coupled SETD2-catalyzed histone H3K36 phosphorylation-dependent U2AF2 exon 2 inclusion is also trimethylation may be erased by an oxygenase or demethylase, associated with the trimethylation of histone H3 at K36 in U2AF2 whose specificity toward histone H3K36me3 has not been exons 2 and 3 (Fig. 3a). In parallel experiments, the AKT inhi- determined yet. bitor MK2206 phenocopied the phosphorylation-site mutant of Recently, and after the completion of the ChIP experiments IWS1 (Fig. 3b), confirming that H3K36 trimethylation in U2AF2 described above, we carried out ChIP-Seq experiments, addres- exons 2 and 3 was due to IWS1 phosphorylation by AKT. Given sing the binding of IWS1 and SETD2 and the distribution of the importance of SETD2 on transcription-coupled H3K36 tri- H3K36me3 marks genome-wide in shIWS1/WT-R and shIWS1/ methylation, we used ChIP assays to also address the binding of MT-R, NCI-H522 cells. The unbiased data on the abundance of SETD2 to exons 2 and 3 of U2AF2, in NCI-H522 and NCI-H1299 these markers in the U2AF2 gene were in general agreement with cells transduced with a lentiviral construct of hemagglutinin the ChIP data described above. Specifically, IWS1 was found to (HA)-tagged SETD2 (HA-SETD2). The TSS of U2AF2 GAPDH bind U2AF2 E2, independent of its phosphorylation, but SETD2 gene and exons 8 and 9 of FGFR2 were again used as controls. binding and H3K36me3 abundance on U2AF2 E2 increased only The results revealed that the pattern of SETD2 binding parallels when IWS1 was phosphorylated (Fig. 3e). the abundance of H3K36me3 marks in both exons 2 and 3 of U2AF2 (Fig. 3c) and exons 8 and 9 of FGFR2 (Supplementary Fig. 3b). These data combined suggest that the IWS1 The regulation of the alternative RNA splicing of the U2AF2 phosphorylation-dependent histone H3K36 trimethylation is exon 2 by IWS1 phosphorylation, depends on the p52 isoform mediated by SETD2. of the H3K36me3 reader LEDGF. Our earlier studies had shown The preceding data suggested that the enzymatically active that the regulation of the FGFR2 alternative RNA splicing by SETD2 contributes to the IWS1 phosphorylation-dependent IWS1 phosphorylation, depends on the reading of the histone regulation of the U2AF2 alternative RNA splicing. To determine H3K36me3 marks by MRG1522. To determine whether MRG15 is whether it is also required, we knocked down SETD2 in NCI- also the reader of the IWS1-dependent alternative RNA splicing H522 and NCI-H1299 cells and we rescued the knockdown with of U2AF2, we knocked down MRG15 in both NCI-H522 and wild-type SETD2 or the SETD2 methyltransferase mutant NCI-H1299 cells. Using RT-PCR and qRT-PCR to monitor the R1625C39. Using RT-PCR and qRT-PCR, we observed that the alternative RNA splicing of U2AF2 in these cells revealed that it is knockdown of SETD2 phenocopies the knockdown of IWS1 on independent of MRG15 (Fig. 4a, left panels). In agreement with the splicing of the U2AF2 and FGFR2 mRNAs and that the effect this result, the knockdown of the splicing repressor and binding of the knockdown on both the U2AF2 and FGFR2 mRNA splicing partner of MRG15, PTB, also has no role on the RNA splicing of is rescued by the wild-type SETD2, but not by its methyltransfer- U2AF2 (Fig. 4a right panels), These results were in sharp contrast ase mutant (Fig. 3d, Supplementary Fig. 3c). We conclude that the with the results of parallel control experiments, which confirmed enzymatic activity of SETD2 is indeed required for the regulation that the knockdown of MRG15 or PTB in NCI-H522 and NCI- of the alternative splicing of both the U2AF2 and FGFR2 mRNAs. H1299 cells increases the FGFR2 IIIb/IIIc, transcript ratio, as If SETD2 is recruited to the CTD of RNA Pol II by expected (Supplementary Fig. 5a)11,22. Given that in some cells, phosphorylated IWS1, wild-type SETD2 should not rescue the the knockdown of PTB upregulates PTBP2, which can compen- shIWS1 and shIWS1/MT-R phenotype. This was confirmed by sate for the loss of PTB51, we also examined the expression of experiments addressing the rescue of U2AF2 RNA splicing in PTBP2 before and after the knockdown of PTB in these cells, and shControl, shIWS1, shIWS1/WT-R and shIWS1/MT-R NCI- we observed no PTB-dependent changes (Fig. 4a, right upper H522, and NCI-H1299 cells, by a lentiviral construct of wild-type panel). In addition, the knockdown of PTBP2, similar to the HA-SETD2 (Supplementary Fig. 3d). The failure to rescue the knockdown of PTB, did not affect the alternative RNA splicing of phenotype supports the model of SETD2 recruitment by U2AF2 in either the NCI-H522 or NCI-H1299 cells (Supple- phosphorylated IWS1. mentary Fig. 5b), suggesting that U2AF2 mRNA splicing is also SETD2 is the only known H3K36 trimethyltransferase in independent of PTBP2. mammalian cells40. However, is a To identify the H3K36me3 reader that may control the IWS1- dynamic process41, and while SETD2 is the only H3K36 dependent splicing of U2AF2, we transfected NCI-H522 and

6 NATURE COMMUNICATIONS | (2021) 12:4624 | https://doi.org/10.1038/s41467-021-24795-1 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-24795-1 ARTICLE

a d shCon +- - - shCon +- - - NCI-H522 NCI-H1299 shSETD2 -+ + + shSETD2 -+ + + shCon shIWS1 WT-R MT-R SETD2 wt -- + - SETD2 wt -- + - 12 12 -- - + -- - +

*** t SETD2 R1625C SETD2 R1625C n *** *** ** ChIP H3K36me3 (KDa) (KDa) ** e ** ***

** -250 SETD2 -250 SETD2 m (n=3) -200 -200

8 *** h 8 -21 -21 ** H3K36me3 H3K36me3 c -15 -15 i ** **

r -120 Vinculin -120 Vinculin n

e U2AF2 U2AF2 4 4 p=0.0779 -1300 bp -1300 bp

d -1100 bp 123 -1100 bp 123

l p=0.1479 -550 bp -550 bp

o 8910 8910 F Fold enrichment Fold -400 bp -400 bp 0 0 GAPDH GAPDH TSS E2 E3 GAPDH E3 TSS E2 E3 GAPDH E3 b NCI-H522 NCI-H1299 shCon shSETD2 SETD2 WT-R SETD2 R1625C DMSO MK2206 U2AF2 quantification U2AF2 quantification

12 12 ChIP H3K36me3

t l

l 1.5 2.0

o o

**** n **** **** **** r

(n=3) r

t

t

e n

n **** ****

o

o

m c

*** **** c

1.5

8 h 8

o

o t

t 1.0

c

e

i

e

v

v

r

i

i t

t 1.0

n

a

a

l

l

e

e e 4 4

0.5 R R

d

l 3

3 0.5

E

E

o

/

/

2

2

F

Fold enrichment Fold E 0 E 0 0 TSS E2 E3 GAPDH E3 0 TSS E2 E3 GAPDH E3 c NCI-H522 NCI-H1299

shCon shIWS1 WT-R MT-R qPCR U2AF2 Splicing qPCR U2AF2 Splicing l p=0.1205**** l 1.5

8 **** 8 1.5 *** ***

o o

*** t r

**** **** ChIP HA-SETD2 p=0.1286 r n.s t

**** t

n

n n

*** e o

(n=3) *** o **** c

**** 6 **** c

6

m o

**** **** 1.0 o 1.0

t t

h

**

e e

c

i

v v

i i

r 4 t

4 t

a a

n

l l

e e

e 0.5

R

n.s R 0.5

2

2 d

3 3

l

E E

/ /

o

2 2

F

E Fold enrichment Fold 0 0 0 E 0 TSS E2 E3 GAPDH E3 TSS E2 E3 GAPDH E3 NCI-H522 NCI-H1299 e H522 shIWS1/WT-R H522 shIWS1/MT-R

(0-626)

Flag-IWS1 (0-2556)

(0-2055)

HA-SETD2 (0-277)

(0-737)

H3K36me3 (0-2236)

(0-729)

RNA-Seq (0-700)

U2AF2

Fig. 3 U2AF2 Exon 2 inclusion, induced by IWS1 phosphorylation at Ser720/Thr721, depends on H3K36 trimethylation by SETD2. a ChIP assays showing the abundance of H3K36me3 in the indicated NCI-H522 (left) and NCI-H1299 (right) cells. Bars show the mean fold enrichment in H3K36me3 (anti-H3K36me3 IP, vs IgG control IP) in the indicated regions of the U2AF2 gene, in shControl, shIWS1, shIWS1/WT-R and shIWS1/MT-R cells ±SD. Data were normalized relative to the input (2%). b ChIP assays showing the abundance of H3K36me3 in the indicated NCI-H522 (left) and NCI-H1299 (right) cells, treated with MK2206 (5μM) or DMSO. The bars show the mean fold enrichment of H3K36me3 (anti-H3K36me3 IP, vs IgG control IP) in the indicated regions of the U2AF2 gene, ±SD. Data were normalized relative to the input (2%). c ChIP assays showing the binding of HA-SETD2 in the indicated NCI-H522 (left) and NCI-H1299 (right) cells, transduced with a lentiviral HA-SETD2 construct. The bars show the mean fold enrichment in SETD2 binding (anti-HA IP, vs IgG control IP) ±SD. Data were normalized relative to the input (2%). The expression of HA-SETD2 is shown in Supplementary figure 4a. d (Upper panel) Lysates of NCI-H522 (left) and NCI-H1299 (right) cells transduced with the indicated constructs, were probed with the antibodies, as shown. RT-PCR addressing U2AF2 exon 2 splicing in the same cells. (Middle panel) GAPDH-normalized E2 and E3 bands in the RT-PCR experiment above, were used to calculate the E2/E3 ratio±SD. (Lower panel) Ratio of the U2AF2 exons 2 and 3 levels in the same cells ±SD, as determined by quantitative RT-PCR. E2/E3 ratio is shown relative to shControl (value = 1). All experiments had three biological replicates, all done in triplicate. n.s non-significant, *p <0.05,**p < 0.01, ***p < 0.001, ****p < 0.0001 (one-sided unpaired t- test). e Snapshots of the integrative genomic viewer showing the ChIP-Seq-determined distribution of Flag-IWS1, HA-SETD2, and histone H3K36me3, in U2AF2. Parallel snapshots show the distribution of RNA reads in the RNA-seq experiment. Scale represents reads per million (RPM). Snapshots of peaks detected in the two biological replicates are shown. Black box highlights U2AF2 exons 2, 3 and adjacent regions.

NATURE COMMUNICATIONS | (2021) 12:4624 | https://doi.org/10.1038/s41467-021-24795-1 | www.nature.com/naturecommunications 7 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-24795-1

497 333 abc p52 C-term d +- +- shControl +-+- 4 97 352 429 530 shControl p75 -+- + -+-+ PWWP AT-hook IBD shMRG15 shPTB (kDa) (kDa) (kDa) siCon siPHF1 siMSH6 siGLYR1 siLEDGF 60- siBRPF1 NCI-H522 ChIP V5-p52/LEDGF 40- PTB PHF1 MRG15 50- 50- 180- 60- (n=3) 50- BRPF1 shControl shLEDGF p75/WT-R p52/WT-R p52 A51P-R (kDa) V5-p75- 12 PTBP2 150- t *** β-actin 50- MSH6 p75- -75 n.s 50- n *** ***

U2AF2 37- GLYR1 LEDGF e 10 1300 bp- 37- *** 1100 bp- α-actinin 75- V5-p52- n.s 123 -52 m LEDGF p52- *** ****

U2AF2 h 8 1300 bp- 50- -55 550 bp- 1100 bp- β-actin a-tubulin c 123 i

8910 r 1300 bp- U2AF2 -1300 bp 6 1100 bp- U2AF2 123 -1100 bp n 550 bp- 400 bp- 550 bp- -550 bp 123 e GAPDH 8910 8910 4

8910 d

l n.s 400 bp- -400 bp n.s n.s

400 bp- GAPDH o

GAPDH GAPDH 2 F siControl siPHF1 siBRPF1 siMSH6 shControl shLEDGF p75 WT-R 0 siGLYR1 siLEDGF shControl shMRG15 shControl shPTB p52 WT-R p52 A51P-R TSS E2 E3 GAPDH E3

U2AF2 quantification U2AF2 quantification U2AF2 quantification U2AF2 quantification

l

l

o

l l o 1.5 **** r 1.5

r n.s ****

o

o t t 1.5 1.5

n.s r n.s n.s r MT-R

t shIWS1

n shCon WT-R

n.s n.s t n n.s ****

n.s n

o n

o ****

c

o

o

c

c

c

o o

1.0 t

t 1.0

o

o

1.0

1.0 t

t

e

e

e

v

e

v

i

i

v

v

t

t

i

i

t

t

a

a

l

l

a

a

l l 0.5 e e 0.5 0.5

0.5 e

e

R

R

R

R

3

3

3

3

E

E

/

/

E

E

/

/ 2 2 0 0 2

2 0 0

E

E

E E

qPCR U2AF2 Splicing qPCR U2AF2 Splicing qPCR U2AF2 Splicing qPCR U2AF2 Splicing

l

l

l

l o

o 1.5 r

o **** r o 1.5 n.s ****

t 1.5

r t

r n.s 1.5 t n.s n.s t **** n.s n.s n

n n.s

n

n o

o n.s ****

o

o

c

c

c

c

o

o t

1.0 o

o t

1.0

1.0 t

1.0 t

e

e

e

e

v

v

i

v

v

i

i

t

i

t

t

t

a

a

a

l

a

l

l l

0.5 e e 0.5 0.5 e

0.5 e

R

R

R

R

3

3

3

3

E

E

E

E

/

/

/

/

2

2 2 0 2 0

E 0

E 0 E NCI-H522 NCI-H1299 NCI-H522 NCI-H1299 E NCI-H522 NCI-H522 e f h shControl +- - shSRSF1 -+ + NCI-H522 ChIP V5-SRSF1 12

-- + t (n=3) SRSF1 wt (kDa) **

n **** V5-SRSF1- ** e 10 SRSF1- -33 SRSF1

m shCon

h 8 -37

GAPDH c shIWS1 i U2AF2 r 6 * -1300 bp n WT-R -1100 bp 123 e * * 4

d MT-R

-550 bp l n.s

8910 o 2 F -400 bp GAPDH 0 TSS E2 E3 GAPDH E3 shControl shSRSF1 SRSF1 WT-R

U2AF2 quantification l

o 1.5

r t

n ****

o g

c NCI-H522 RIP V5-SRSF1

o 1.0

t

e

v (n=3)

i ****

t 12

t a l 0.5

e **** ****

n

R

e

3 10 E

/ shCon 2

0 m E

h 8 shIWS1

qPCR U2AF2 Splicing c

l i

o 1.5

r r t 6

n **** WT-R

n o

c ***

e o 1.0 t 4 **

*** MT-R

e

d

v

l

i

t

a o

l 0.5 2

e

F

R

3 E

/ 0

2 0 E NCI-H522 E2 i2 E3

Fig. 4 The regulation of the alternative splicing of the U2AF2 exon 2 by IWS1 and IWS1 phosphorylation, depends on the p52 isoform of the H3K36me3 reader LEDGF and its splicing partner SRSF1. a (Upper panel) Lysates of the indicated cells were probed with antibodies, as shown. RT-PCR addressing exon 2 U2AF2 splicing in the same cells. (Middle and Lower Panels) U2AF2 E2/E3 ±SD, was determined by quantification of the RT-PCR results and by qRT-PCR. b (Upper Panel) Lysates of NCI-H522 cells transfected with the indicated siRNAs, probed with antibodies, as shown. RT-PCR addressing exon 2 U2AF2 splicing. (Middle and Lower Panels) U2AF2 E2/E3 ±SD, was determined by quantification of the RT-PCR results and by qRT-PCR. c (Upper panel) Domains of the p52 and p75 LEDGF isoforms (see Ferris et al., 201050). (Middle panel) Western blot of the indicated NCI-H522 cells, probed for LEDGF expression and RT-PCR, addressing U2AF2 exon 2 splicing in the same cells. (Lower panels) U2AF2 E2/E3 ratio ±SD, was determined by quantification of the RT-PCR results and by qRT-PCR. d ChIP of p52/LEDGF in NCI-H522 cells, transduced with V5-p52/LEDGF. Mean fold enrichment in p52/LEDGF binding (anti-V5 IP, vs IgG control IP) to the indicated U2AF2 regions±SD. V5-p52/LEDGF expression, in figure S5G. e (Upper panel) Immunoblotting, showing SRSF1 expression in the indicated NCI-H522 cells. RT-PCR addressing U2AF2 exon 2 splicing. (Lower panels) U2AF2 E2/E3 ratio ±SD, was determined by quantification of the RT-PCR results and by qRT-PCR. f SRSF1 ChIP in the indicated V5-SRSF1-transduced NCI-H522 cells. Mean fold enrichment in SRSF1 binding (anti-V5 IP, vs IgG control IP) to the indicated U2AF2 regions±SD. V5-SRSF1 expression in Supplementary figure S6b. g RIP, addressing SRSF1 binding to U2AF2 RNA in panel f cells. Mean fold enrichment in SRSF1 binding (anti-V5 IP, vs IgG control IP) in the indicated U2AF2 regions ±SD. All assays were done, using three biological replicates, in triplicate. n.s non-significant, *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001 (one-sided unpaired t-test). h Phosphorylation of IWS1 recruits SETD2 in CTD of RNA Pol II. SetD2 trimethylates histone H3K36 co-transcriptionally. p52/ LEDGF and its partner SRSF1, bind histone H3K36me3 promoting exon 2 inclusion in the mature U2AF2 transcript.

NCI-H1299 cells with a control siRNA and siRNAs of the known of U2AF2 (Fig. 4b, Supplementary Fig. 5c), suggesting that H3K36me3 readers PHF151, BRPF152, MSH653, GLYR-154, and LEDGF is the sole H3K36me3 reader responsible for the U2AF2 LEDGF12,55. Monitoring the effects of these transfections by RT- alternative RNA splicing. To confirm this observation and to PCR and qRT-PCR revealed that only the knockdown of LEDGF determine which isoform of LEDGF may be responsible for the phenocopied the knockdown of IWS1 on the alternative splicing detected phenotype, we used a lentiviral shRNA construct to

8 NATURE COMMUNICATIONS | (2021) 12:4624 | https://doi.org/10.1038/s41467-021-24795-1 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-24795-1 ARTICLE knock down LEDGF, and we rescued the knockdown by hypothesis, we transduced shControl, shIWS1, shIWS1/WT-R transducing the cells with lentiviral constructs of the p75 and and shIWS1/MT-R NCI-H522, and NCI-H1299 cells with a V5- p52 isoforms of LEDGF (Fig. 4c, upper panel). Monitoring the tagged SRSF1 lentiviral construct and we employed ChIP to effects of these transductions by RT-PCR and qRT-PCR revealed address the binding of V5-SRSF1 to U2AF2 exons 2 and 3. U2AF2 that only the p52 isoform rescues the U2AF2 alternative splicing TSS and GAPDH were again used as controls. The results con- phenotype. Importantly, the A51P mutant of the p52 isoform, firmed the binding of SRSF1 to exons 2 and 3 only in shControl which cannot bind histone H3K36me356, did not rescue the and shIWS1/WT-R cells (Fig. 4f and Supplementary Fig. 6c), phenotype, suggesting that the rescue depends on the binding of providing support to the proposed hypothesis. p52 to the H3K36me3 marks (Fig. 4c, Supplementary Fig. 5d). Based on the preceding data, we hypothesized that the binding Notably, the knockdown of LEDGF did not affect the alternative to the chromatin-associated p52/LEDGF should bring SRSF1 into RNA splicing of FGFR2 (Supplementary Fig. 5e), suggesting that proximity with the nascent pre-mRNA, facilitating their interac- although H3K36me3 may be a common signal for the IWS1 tion. Analysis of the U2AF2 mRNA sequence using the web-based phosphorylation-dependent RNA splicing of multiple targets, the pipeline RBP-map57, identified four potential SRSF1-binding sites RNA splicing regulators assembled by H3K36me3 on different (2 in U2AF2 exon 2, and 2 in exon 3) (Supplementary Fig. 6d)58, targets are target-specific. providing additional support to this hypothesis. These findings If the p52 isoform of LEDGF regulates the alternative splicing raised the question whether SRSF1 binding could be common of U2AF2 by reading the histone H3K36me3 marks, as suggested among genes undergoing IWS1 phosphorylation-dependent by the preceding data, and if the abundance of these marks alternative RNA splicing, characterized by exon inclusion. To depends on phosphorylated IWS1, the ectopic expression of p52/ address this question, we analyzed the sequences of four LEDGF should not rescue the shIWS1 and shIWS1/MT-R- alternatively spliced genes, which like U2AF2, undergo IWS1 induced alternative splicing phenotype. This was confirmed by phosphorylation-dependent exon inclusion (Supplementary experiments addressing the rescue of the U2AF2 RNA splicing in Fig. 1g–J). This analysis identified SRSF1-binding motifs in the shControl, shIWS1, shIWS1/WT-R, and shIWS1/MT-R NCI- alternatively spliced and/or flanking exons in all these genes H522, and NCI-H1299 cells transduced with a lentiviral construct (Supplementary Fig. 6e–h), and provided support to the of wild-type V5-p52/LEDGF (Supplementary Fig. 5g). The failure hypothesis that SRSF1 binding may be a common feature of to rescue the phenotype supports the model of p52/LEDGF genes undergoing alternative RNA splicing via this mechanism. regulating the alternative RNA splicing of U2AF2 by reading the To experimentally address the proposed model, we carried out IWS1 phosphorylation and SETD2-dependent histone H3K36 RNA-IP (RIP) experiments in the same shControl, shIWS1, and trimethylation in the body of the U2AF2 gene. shIWS1/WT-R and shIWS1/MT-R NCI-H522 and NCI-H1299 The preceding data provide strong genetic evidence that p52/ cells, focusing on the binding of SRSF1 to the U2AF2 exon 2, LEDGF regulates the alternative RNA splicing of the U2AF2 by intron 2, and exon 3. The results confirmed that SRSF1 binds reading the IWS1 phosphorylation-dependent histone primarily to exon 2, but only in the shControl and shIWS/WT-R H3K36me3 marks. To confirm this interpretation of the results, cells (Fig. 4g, Supplementary Fig. 6i), which parallels its binding we used ChIP to address the binding of p52/LEDGF on exons 2 to the H3K36me3-bound p52/LEDGF. We conclude that exon 2 and 3 of U2AF2 in shControl, shIWS1, shIWS1/WT-R and inclusion in the U2AF2 mRNA in cells expressing wild-type IWS1 shIWS1/MT-R NCI-H522, and NCI-H1299 cells. The results depends on the phosphorylation of IWS1 by AKT3, the confirmed that p52/LEDGF indeed binds U2AF2 exons 2 and 3 recruitment of SETD2 to the CTD of RNA Pol II, the and that the binding depends on IWS1 phosphorylation and transcription-coupled histone H3K36 trimethylation, and the correlates with the abundance of histone H3K36me3 marks subsequent bridging of chromatin with the splicing machinery by (Fig. 4d, Supplementary Fig. 5f). We conclude that p52/LEDGF LEDGF-p52 and SRSF1 (Fig. 4h). indeed regulates the alternative RNA splicing of U2AF2,by reading the trimethylation of histone H3 at K36, downstream of IWS1 phosphorylation, and SETD2 recruitment to RNA Pol II. U2AF65β, encoded by the exon 2-deficient splice variant of U2AF2, does not interact with Prp19. The predominant splice variant of the U2AF2 mRNA in shIWS1 and shIWS1/MT-R cells The p52 isoform of LEDGF regulates the alternative RNA is a variant, which lacks exon 2, the exon encoding the U2A65 N- splicing of U2AF2, via its interaction with the RNA splicing terminal RS domain (Fig. 5a). This domain is responsible for the factor SRSF1. It had been reported that the p52 isoform of interaction of U2AF65 with several factors that contribute to LEDGF is transported to the nucleus, in response to signals tar- mRNA splicing and 3′ cleavage and polyadenylation59,28. One of geting its unique CTD, and that in the nucleus it interacts with these factors is Prp19, a component of the seven-member ubi- the splicing factor SRSF1 regulating the distribution of SRSF1 to quitin ligase complex Prp19C32–35. alternatively spliced genes12. To investigate the role of SRSF1 in Using co-immunoprecipitation in HEK-293T cells transduced alternative RNA splicing, we knocked it down in NCI-H522 and with V5-tagged U2AF2 constructs containing or lacking exon 2, NCI-H1299 cells and we showed that its loss reproduces the we confirmed that whereas the protein encoded by the exon 2- IWS1-knockdown phenotype of U2AF2, but not FGFR2 alter- containing splice variant (U2AF65α) interacts with endogenous native RNA splicing (Supplementary Fig. 6a). The dependence of Prp19, the protein encoded by the exon 2-deficient splice variant the U2AF2 exon 2 splicing on SRSF1, which was suggested by this (U2AF65β), does not (Fig. 5b). More important, co- result, was confirmed by rescue experiments with wild-type immunoprecipitation of endogenous U2AF65 from shControl, SRSF1 (Fig. 4e, Supplementary Fig. 6b). We therefore conclude shIWS1, shIWS1/WT-R, and shIWS1-MT-R NCI-H522 and that SRSF1 regulates the alternative RNA splicing of the U2AF2 NCI-H1299 cells, revealed that the two proteins co- exon 2. However, SRSF1 did not rescue the U2AF2 splicing immunoprecipitate only in shControl and shIWS1-WT-R cells, phenotype in shIWS1 and shIWS1/MT-R cells (Supplementary which express primarily U2AF65α (Fig. 5c, Supplementary Fig. 6b). This finding suggested that SRSF1 does not function Fig. 7a). These data confirmed that the interaction of U2AF65 independently, but instead provides a link between p52/LEDGF, with Prp19 depends on the sequence encoded by U2AF2 exon 2, recruited to IWS1 phosphorylation-dependent chromatin mod- whose inclusion in the transcript is regulated by IWS1 ification marks, and the RNA splicing machinery. To test this phosphorylation.

NATURE COMMUNICATIONS | (2021) 12:4624 | https://doi.org/10.1038/s41467-021-24795-1 | www.nature.com/naturecommunications 9 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-24795-1

a d e shCon shIWS1 shCon shCon/U2AF65α shCon/U2AF65β WT/R MT/R shIWS1/U2AF65α shIWS1/U2AF65β

1.5 ** *** n.s 1.5 *** **** *** ****

1.0 1.0

bc 0.5 0.5 shControl +--- 0 0 Spliced/Unspliced U2AF65α -+- shIWS1 -+++ Spliced/Unspliced E1-E2/E1-I1E2-E3/E2-I2 E5-E6/I5-E6 E1-E2/E1-I1E2-E3/E2-I2 E5-E6/I5-E6 U2AF65β --+ WT-R --+- CDCA5 GUSB CDCA5 GUSB ---+ 12 n.s 12 +- - MT-R (kDa) n.s Ab only 10 RIP U2AF65 RIP U2AF65 55 10 (kDa) U2AF65 (n=3) (n=3) 8 8 55 58 IP V5-U2AF65 IP Prp19 6 6 V5-U2AF65 U2AF65 35 U2AF35 4 4 2 2 58 150 SF3B1

Prp19 Fold Enrichment 0 Fold Enrichment 0 IP : IgG 55 Prp19 12 12 RIP Prp19 55 *** *** *** RIP Prp19 ** ** *** 10 *** **** **** **** **** **** 10 V5-U2AF65 140 IWS1 (n=3) (n=3) 8 8 55 U2AF65 6 55 Prp19 10% Input 6 10% Input n.s 4 55 Prp19 4 2 37 2 GAPDH 37 GAPDH 0 Fold Enrichment

0 Fold Enrichment CDCA5 I1 CDCA5 I2 CDCA5 I3 GUSB I6 CDCA5 I1 CDCA5 I2 CDCA5 I3 GUSB I6 ΗΕΚ-293T NCI-Η522 NCI-H522 NCI-H522 E f g h 1 n n R R o R R o R - - -R 5 C C T DM- DM-A T-R hIWS1 hIWS1 h r wt hIWS h hCon WT- s WT s MT-R shCon MT- shCon shIWS1 WT-R M s s s M s MT-R shIWS1 WT- (kDa) WT (kDa) 65β-V5 65α-V5 /Sor /So / /

shCon shIWS1 WT-R MT-R 65β-V 140- -140 1 1 p-IWS1(S720) on/

shIWS1/U2AF65α shIWS1/U2AF65β -R 140- -140

T IWS1 hIWS hIWS (kDa) hCon shIWS1/Sor s shIWS1 s s M shCon/65α-V5 shC WT-R shIWS1 shIWS1 Cytosolic CDCA5 mRNA -1300bp 1300bp- U2AF2 140 p-IWS1 (S720) 1100bp- -1100bp 1.5 ** 40- Sororin -40 ** IWS1 ** 140 p-ERK1/2 42- (Y202/T204) -42 1.0 58 U2AF65 42- ERK1/2 -42 55 Prp19 55- α-tubulin -55 0.5 V5-Sororin to GAPDH 40 Sororin A549 NCI-H460 NCI-H1975 PC-9 NCI-H1650 K-RAS G12C K-RAS Q61H EGFR EGFR EGFR 58 V5-U2AF65

Expression relative L858R, T790M L858R, E746_750 del E746_750 del 0 V5 40 V5-Sororin shCon shIWS1 WT-R MT-R NCI-H522 p-ERK1/2 1.5 *** 42 (Y202/T204) qPCR U2AF2 Splicing 80 *** 1.5 **** *** 42 ERK1/2 **** **** *** *** **** **** **** **** **** *** *** **** **** **** 1.0 60 34 p-CDK1 (Y15) 1.0 40 0.5 34 CDK1 to GAPDH

0.5 % p-ERK 1/2 55 Cyclin B1 20 Expression relative 0 reduction by shIWS1 45 β-actin NCI-H1299 E2/E3 Relatrive to Control 0 0 NCI-H522 Η522Η1299 A549 H460 H1975 PC-9 H1650 H522 H1299 A549 H460 H1975 PC-9 H1650

Fig. 5 IWS1 phosphorylation controls the CDCA5/ERK phosphorylation feedback loop, through U2AF2 alternative RNA splicing. a Functional domain composition of U2AF65. Numbers mark the domain boundaries. Exon 2 encodes the RS domain. b HEK-293T cells were transduced with the V5-tagged splice variants of U2AF2, encoding V5-U2AF65α and V5-U2AF65β. Anti V5-U2AF65 immunoprecipitates and input lysates were probed with the indicated antibodies. c Anti-U2AF65, anti-IgG mouse isotype control immunoprecipitates and input lysates of NCI-H522 cells transduced with the indicated constructs were probed with antibodies, as shown. d (Upper panel) The ratio of spliced to unspliced CDCA5 and GUSB (control) RNAs was measured in the indicated NCI-H522 cells by qRT-PCR. (Middle and Lower panels) RIP assays of U2AF65 and Prp19 in the same cells. Anti-U2AF65 or anti-Prp19 IP, vs IgG control IP were used to calculate the mean fold binding enrichment in the indicated RNA regions ±SD. Primers listed in Supplementary Table 2. Primer location in Supplementary Fig. 7b. e (Upper panel) shIWS1 rescue by U2AF65α and U2AF65β. Spliced/unspliced RNA ratios of CDCA5 and GUSB, measured by qRT-PCR, as in Fig. 5d. (Middle and Lower panels) RIP assays of U2AF65 and Prp19 in the same cells. Data presented as in figure 5d. f Cytosolic CDCA5 mRNA levels in the indicated shIWS1-transduced cells were rescued by IWS1 and U2AF65α, as determined by q-RT-PCR. GAPDH- normalized CDCA5 mRNA ±SD. Validation of fractionation in Supplementary Fig. 7e. g Lysates of NCI-H522 cells, transduced with the indicated constructs, were probed with antibodies, as shown. DM-A and DM-E are the S79/S209AA and S79/S209EE Sororin mutants, respectively. h (Upper panels) Lysates of KRAS and EGFR mutant cell lines, transduced with the indicated constructs, were probed with antibodies as shown. RT-PCR, using U2AF2 exon 1 and 3 primers. (Lower Left panel) Bars show the qRT-PCR-determined U2AF2 E2/E3 ratios in NCI-H522 and NCI-H1299 cells, and in the KRAS and EGFR mutant cell lines, relative to the shControl±SD. (Lower Right panel) Bars show the shIWS1-induced percent reduction of ERK-phosphorylation in the same cell lines, normalized to tubulin ± SD. EGFR mutant and non-mutant cells were compared, using a one-way ANOVA. All assays were in triplicate, on three biological replicates. n.s non-significant, *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001 (one-sided unpaired t-test).

The splicing of the U2AF2 mRNA, downstream of IWS1 spliced and unspliced CDCA5 RNA in shControl, shIWS1, phosphorylation, regulates the mRNA splicing of CDCA5 and shIWS1/WT-R, and shIWS1/MT-R NCI-H522 and NCI-H1299 the abundance of its protein product Sororin. It had been cells. The RNA splicing of GUSB, does not depend on the previously shown that U2AF65 binds RNA Pol II and recruits U2AF65 interaction with Prp1936, and it was used as the negative Prp19 to the newly synthetized pre-mRNA, promoting co- control. The results confirmed that whereas the RNA splicing of transcriptional RNA splicing60. One of the genes whose RNA CDCA5 is impaired in both the shIWS1 and shIWS1/MT-R cells, splicing depends on the U2AF65-dependent recruitment of Prp19 the RNA splicing of GUSB is not (Fig. 5d, Supplementary Fig. 7b to RNA Pol II, is CDCA5, the gene encoding Sororin, a compo- and c, upper panels). More important, the splicing defect was nent of the cohesin complex36. Given that U2AF65β, which is the rescued by U2AF65α but not by U2AF65β (Fig. 5e, Supplemen- predominant U2AF65 isoform expressed in shIWS1 and shIWS1/ tary Fig. 7d, upper panels). MT-R cells does not bind Prp19, we hypothesized that the RNA To determine how IWS1 phosphorylation regulates CDCA5 splicing of CDCA5 in these cells will be impaired. To address this RNA splicing, we performed RNA immunoprecipitation (RIP) hypothesis, we employed qRT-PCR to determine the ratio of experiments in shControl, shIWS1, shIWS1/WT-R, and shIWS1/

10 NATURE COMMUNICATIONS | (2021) 12:4624 | https://doi.org/10.1038/s41467-021-24795-1 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-24795-1 ARTICLE

MT-R NCI-H522 and NCI-H1299 cells. The results confirmed phosphorylation and U2AF2 alternative RNA splicing and is that the two splice variants of U2AF65 bind equally well the active in lung adenocarcinomas. CDCA5 pre-mRNA, as well as the control GUSB pre-mRNA, as Inhibiting the expression of Sororin results in downregulation expected (Fig. 5d, e, Supplementary Fig. 7c and Supplementary of CDK1 and Cyclin B161,62. This observation suggested that Fig. 7d, middle panels). However, the binding of Prp19 to the IWS1 phosphorylation, which regulates the abundance of same pre-mRNA regions of CDCA5 was significantly impaired in Sororin, may also regulate the expression of CDK1 and Cyclin shIWS1 and shIWS1/MT-R cells, which predominantly express B1. Experiments addressing this hypothesis showed that shIWS1 the U2AF65β isoform (Fig. 5d, Supplementary Fig. 7c, lower and shIWS1/MT-R NCI-H522 and NCI-H1299 cells indeed panels). More important, the impaired Prp19 binding to the pre- express reduced levels of CDK1, phosphor-CDK1(Y15), and mRNA of CDCA5 in shIWS1-transduced cells, was rescued by Cyclin B1, and that the downregulation of these molecules in U2AF65α, but not U2AF65β (Fig. 5e and Supplementary Fig. 7d, shIWS1 cells is rescued by wild-type Sororin and Sororin DM-E, lower panels). but not Sororin DM-A (Fig. 5g, Supplementary Fig. 7f, lower Given that only spliced mRNAs are transported out of the panels). Based on these data, we conclude that the regulation of nucleus, we used qRT-PCR to determine the abundance of CDK1 and Cyclin B1 by IWS1 phosphorylation depends on the cytosolic CDCA5 mRNA in shControl, shIWS1, shIWS1/WT-R, activation of the Sororin-ERK phosphorylation feedback loop. and shIWS1/MT-R and shIWS1 NCI-H522 and NCI-H1299 cells, as well as in shIWS1 cells, before and after rescue with U2AF65α The IWS1 phosphorylation-dependent alternative RNA spli- or U2AF65β. To this end, we fractionated the cells into nuclear cing of U2AF2 regulates ERK phosphorylation in lung ade- and cytosolic compartments and we probed western blots of the nocarcinoma cell lines, including those harboring EGFR or fractions with antibodies to Lamin A/C and GAPDH, to confirm KRAS mutations. EGFR and KRAS are frequently mutated in the fractionation (Supplementary Fig. 7e). The results confirmed human lung adenocarcinomas and the mutated forms of these that the mature CDCA5 mRNA was present at low abundance in genes promote oncogenesis by activating multiple signaling the cytoplasmic fraction of shIWS1 and shIWS1/MT-R cells as pathways, including the ERK pathway64,65. Given that IWS1 expected and that its abundance was restored in shIWS1 cells phosphorylation also promotes the activation of ERK via the rescued with U2AF65α, but not U2AF65β (Fig. 5f). Sororin/p-ERK positive feedback loop, we asked whether IWS1 To determine whether the CDCA5 RNA splicing defect in phosphorylation influences ERK phosphorylation in lung ade- shIWS1 and shIWS1/MT-R NCI-H522 and NCI-H1299 cells nocarcinoma cell lines, harboring KRAS (A549 and NCI-H460) prevents the expression of its protein product Sororin, we or EGFR (NCI-H1975, PC-9, and NCI-H1650) mutations. The examined the expression of Sororin in these cells, along with the results showed that IWS1 phosphorylation and U2AF2 exon 2 expression of IWS1, pIWS1, U2AF65, and Prp19 by western inclusion were independent of the EGFR or KRAS mutational blotting. The results confirmed that the expression of Sororin was status (Fig. 5h). In addition, Sororin expression and ERK phos- indeed impaired as expected, in shIWS1 and shIWS1/MT-R NCI- phorylation were reduced in all shIWS1 and shIWS1/MT-R cell H522 and NCI-H1299 cells. More important, the expression of lines, including those with KRAS or EGFR mutations (Fig. 5h). Sororin was again rescued by U2AF65α, but not by U2AF65β We conclude that the Sororin/p-ERK positive feedback loop (Fig. 5g, Supplementary Fig 7f, upper panels). defines a pathway of ERK regulation, which has the potential to modulate ERK activation by KRAS or tyrosine kinase receptor signals. Surprisingly, the role of this pathway in the regulation of Sororin and p-ERK form a positive feedback loop, which is EGFR-induced ERK activation signals was more robust than its activated by IWS1 phosphorylation and promotes the expres- role in the regulation of KRAS-induced signals (Fig. 5h). sion of CDK1 and Cyclin B1. It had been shown previously that the downregulation of Sororin leads to reduced ERK phosphor- ylation at Y202/T204 in human colorectal cancer (CRC) and The AKT/IWS1/U2AF2 pathway promotes cell proliferation by human hepatocellular carcinomas (HCC)61,62. Assuming that the activating the Sororin/ERK positive feedback loop. Our earlier link between Sororin abundance and ERK phosphorylation is studies had shown that IWS1 phosphorylation promotes the conserved in lung adenocarcinomas, these findings suggested that proliferation of the lung adenocarcinoma cell lines NCI-H522 the knockdown of IWS1 and its replacement by the and NCI-H129922. Given that the Sororin/ERK positive feedback phosphorylation-site mutant IWS1S720A/T721A in NCI-H522 loop, downstream of the IWS1-dependent inclusion of exon 2 in and NCI-H1299 cells would also result in inhibition of ERK the U2AF2 mRNA, upregulates CDK1 and Cyclin B1, we hypo- phosphorylation. This question was addressed and the results thesized that IWS1 promotes cell proliferation, by activating this confirmed the prediction. More important, the reduction of p- loop. To address this hypothesis, we examined the rate of pro- ERK in shIWS1 cells was rescued by U2AF65α, but not by liferation of shControl, shIWS1, shIWS1/Sororin WT, shIWS1/ U2AF65β (Fig. 5g, Supplementary Fig. 7f, lower panels), con- Sororin DM-A, and shIWS1/Sororin DM-E NCI H522 and NCI- firming that the activity of the p-IWS1/Sororin/p-ERK axis in H1299 cells growing under standard culture conditions. The lung adenocarcinomas depends on the alternative RNA splicing results showed that cell proliferation was inhibited by shIWS1 of U2AF2. and that the inhibition was rescued by wild-type Sororin and Sororin is phosphorylated by ERK at Ser79 and Ser20963. This Sororin DM-E but not by Sororin DM-A. Importantly, they also observation raised the question whether it is Sororin, or the showed that the phosphomimetic Sororin mutant (DM-E) pro- phosphorylated Sororin, which promotes the phosphorylation of motes cell proliferation more robustly than the wild-type protein. ERK. Experiments addressing this question showed that whereas In addition, the experiment in Fig. 6b shows that whereas the wild-type Sororin and the Sororin phosphomimetic mutant S79E/ shIWS1-induced proliferation defect in NCI-H522 and NCI- T209E (Sororin DM-E) rescue the phosphorylation of ERK in H1299 cells is rescued by U2AF65α, it is not rescued by shIWS1 cells, the S79A/S209A (Sororin DM-A) mutant does not U2AF65β, confirming the role of U2AF2 alternative RNA splicing (Fig. 5g, Supplementary Fig. 7f, lower panels). We conclude that in this pathway. ERK phosphorylation is promoted by phosphorylated Additional experiments showed that IWS1 also promotes the Sororin, and that Sororin and ERK are components of a positive proliferation of A549 and NCI-H1975 cells and that its role in the feedback loop, which is controlled by AKT-dependent IWS1 proliferation of NCI-H1955 cells, which harbor an activating

NATURE COMMUNICATIONS | (2021) 12:4624 | https://doi.org/10.1038/s41467-021-24795-1 | www.nature.com/naturecommunications 11 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-24795-1

shControl shIWS1 200 a shIWS1/Sor wt shIWS1/Sor DM-A c shControl shIWS1 e H522 shControl H522 shIWS1 H522 U2AF65β-R 160 H522 U2AF65α-R shIWS1/Sor DM-E G0/G1 G0/G1 : 59.39 % G0/G1 : 37.93 % G0/G1 G0/G1 : 58.17 % G0/G1 : 38.04 %

120 S : 11.03 % S : 7.20 % S : 8.24 % S : 7.47 % s

t G0/G1 G0/G1

n G2/M : 26.67 % G2/M : 49.18 % G2/M : 29.35 % G2/M G2/M : 48.34 % u

% 100 80 o 40 G2/M p<0.0001 p<0.0001 % NCI-H522 A549 90 C G2/M

S e 80 40 e S G2/M 30 c 70 S S

c ** n 60

n ** 0 e 50 200

e 20

u 40 l

u H1299 shControl H1299 shIWS1 H1299 U2AF65α-R H1299 U2AF65β-R

f 30

l 160

f 10 * * 20 n G0/G1 : 63.21 % G0/G1 : 46.19 % G0/G1 : 53.63 % G0/G1 : 46.19 %

10 G0/G1 G0/G1 G0/G1 G0/G1 G2/M n

o 120 S : 22.21 % S : 22.67 % S : 24.83 % S : 22.67 % s o 0

0 t G2/M C

0 1234567 0 1234567 n G2/M : 12.57 % G2/M : 27.07 % G2/M : 16.19 % G2/M : 26.54 % C u 80

o S G2/M S p=0.0008 S G2/M S p=0.0003 C NCI-H1299 NCI-H1975 40

% 100 70

%

90 0

e 60 200 80 e

c 50 70 c A549 shControl A549 shIWS1 A549 U2AF65α-R A549 U2AF65β-R

160 n 60 n 40 G0/G1 : 64.73 % G0/G1 : 66.28 % G0/G1 : 58.01 % e 50 G0/G1 G0/G1 : 57.92 % **** e 30 u 40 120 S : 21.04 % S : 12.73 % S : 13.53 % S : 11.96 %

u G0/G1 G0/G1

l

s l 30 t

f 20 n f ** G2/MG2/M : 13.06 % G2/M : 20.21 % G2/M : 13.86 % G0/G1 G2/M : 24.99 %

20 *** ** u G2/M G2/M G2/M

n 80 o

10 n 10 p=0.0008 p<0.0001 C

o S S 0 o S 0 40

C 0 12345678 0 123 456789101112 C S Days in culture Days in culture 0 200 H1975 shControl H1975 shIWS1 H1975 U2AF65α-R H1975 U2AF65β-R 160 G0/G1 : 46.43 % G0/G1 : 24.67 % G0/G1 G0/G1 : 45.51 % G0/G1 G0/G1 : 25.90 %

120 G0/G1 S : 13.05 % G0/G1 S : 5.75 % S : 12.59 % S : 6.94 % s t G2/M n S G2/M : 31.66 % G2/M G2/M : 54.26 % G2/M : 30.19 % G2/M : 50.95 % u 80 o S p<0.0001 p=0.0008 b shControl shIWS1 d shControl shIWS1 C G2/M S G2/M U2AF65α-R U2AF65β-R U2AF65α-R U2AF65β-R 40 S 0 200 400 600 800 1000 200 400 600 800 1000 200 400 600 800 1000 200 400 600 800 1000 FL2-A FL2-A FL2-A FL2-A 20 % 100

% NCI-H522 A549 90

n.s. e e 80

15 c

c 70

n 60 n **

e 50 e 10 * *

u 40

u 100 l

l *

f 30

f 5 20 n

n 10

o 80 o 0 0

0 1234567 C 0 1234567 C NCI-H1299 NCI-H1975 60

% 100 70

90 % 40

e 80 n.s. 60 e

c 70 50 c n 60 ** n 40 20 e 50 by sh IWS1 by sh

n.s. e

u 40 30

l u

30 cells in G2/M f % increase of l 20 20 f *** *** 0 n 10 n 10 o 0 H522H1299 A549 H1975 o 0

C 0 123456

0 1234567 C Days in culture Days in culture

qPCR U2AF2 Splicing

f h i l

o r

t 14

NCI-H1299 NCI-H1299 n ****

(KDa) o 12 **** *** c G0/G1

G0/G1 S G2/M o 10 t

ChIC IWS1 (n=3) S -140 e 8

IWS1 10 v G2/M i

**** t 6

a t

p-IWS1 (S720) -140 **** **** l n

8 e 4

e

R

3 2

SETD2 -260 m ***

E /

h 6 **** ** 0

2 c

-17 i

H3K36me3 E r NCI-H1299 H3 n 4

-17 E U2AF2 -1300bp d j l 2 123

-1100bp o F Sororin -40 0 Cyclin E1 -48 ChIC SETD2 (n=3) 12 CDC25A -70 t **** n 10 **** *** p-H3 (S10) -17 e G0/G1 m 8 S -37 h *** GAPDH c G2/M

i **** **

r 6

n E

4 d g l

o 2 n.s NCI-H1299 F 0 8 *** **** ChIC H3K36me3 (n=3) **** ** **** **** 14 **** 6 t **** **

G0/G1 n 12 S e

m 10

4 G2/M h *** c 8

i **** * r

n 6 E

2 4

d

l relative to GAPDH to relative mRNA expression expression mRNA n.s

o 2 F 0 0 IWS1 CDCA5 U2AF2 U2AF2 TSS U2AF2 E2 U2AF2 E3

Fig. 6 IWS1 phosphorylation promotes cell proliferation by controlling a Sororin/ERK phosphorylation feedback loop, through U2AF2 RNA splicing. a, b Growth curves of shControl and shIWS1 NCI-H522 and NCI-H1299 cells, growing in fully supplemented media, before and after rescue with Sororin (wt or mutants), or U2AF65α/U2AF65β. Experiment was done in triplicate and proliferation was expressed as mean percent confluence ±SD. P values (one-sided unpaired t-test) were calculated for the endpoint measurements. *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001. b Growth curves of shControl and shIWS1 A549 and NCI-H1975 cells. c Growth curves of the same shIWS1 cells rescued with U2AF65α or U2AF65β. Proliferation in panels c and d was measured as in panels a and b. d (Upper panel) Cell cycle profiles of the indicated propidium iodide (PI)-stained cell lines. One representative, out of three biological replicates. Mean percentages of cells in different cell cycle phases in red. (Lower panel) shIWS1-induced percent change of cells in G2/M±SD. Statistical analysis was performed using a one-sided unpaired t-test, *p<0.05. Gating strategy in supplementary figure 8a. e NCI-H1299 cells were sorted into G0/G1, S and G2/M fractions. Lysates of sorted cells were probed with the indicated antibodies. RT-PCR on RNA from the same cells, using primers for U2AF2 exons 1 and 3. Gating strategy in supplementary figure 8a. f qRT-PCR-determined expression of IWS1, CDCA5 and U2AF2, relative to GAPDH±SD. SD was calculated based on three biological replicates. g ChIC-determined mean fold enrichment of IWS1, SETD2 and H3K36me3 in cell cycle fractionated cells in the indicated regions of the U2AF2 RNA±SD. h qRT-PCR-based calculation of the mean U2AF2 E2/E3 ratio in S and G2/M, relative to G0/G1 cells ±SD. SD was calculated based on three biological replicates *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001. (one-sided unpaired t-test). i IWS1 expression is induced during the S and G2/M phases of the cell cycle. Following phosphorylation by AKT3, IWS1 orchestrates the cell cycle- dependent assembly of epigenetic complexes on the U2AF2 gene. This promotes the inclusion of exon 2 in the U2AF2 mRNA. U2AF65α, encoded by the exon 2-containing U2AF2 mRNA, interacts with Prp19, promoting CDCA5 splicing and the expression of Sororin. The latter is phosphorylated by ERK, and promotes ERK phosphorylation, in a positive feedback loop, which stimulates proliferation of lung adenocarcinomas.

12 NATURE COMMUNICATIONS | (2021) 12:4624 | https://doi.org/10.1038/s41467-021-24795-1 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-24795-1 ARTICLE

EGFR , is significantly more robust than its role in the upregulated most abundantly during S phase, the upregulation of proliferation of A549 cells, which harbor a KRAS mutation Sororin, SETD2, and Histone H3K36me3 was more robust during (Fig. 6c, Supplementary Fig. 9a). Although it is difficult to G2/M (Fig. 6f), as previously reported36,68. RT-PCR, using RNA determine the significance of an observation that is based on only derived from the same cells, revealed that the inclusion of exon 2 two cell lines, we wish to point out that this observation is in in the U2AF2 mRNA, also fluctuates with the cell cycle, and agreement with the results of the experiment in Fig. 5h, which parallels the expression and phosphorylation of IWS1 (Fig. 6f). shows that IWS1 phosphorylation activates the Sororin/ERK qRT-PCR, monitoring the expression of IWS1 and CDCA5, positive feedback loop more strongly in EGFR mutant than in revealed that the abundance of the RNA transcripts of these genes KRAS mutant cell lines. Importantly, the observed antiprolifera- (Fig. 6g) parallels the abundance of their protein products tive effect of the knockdown of IWS1 in these cell lines, also (Fig. 6f), which indicates that the fluctuation of their expression depends on the alternative RNA splicing of U2AF2, as determined during the cell cycle is regulated at the RNA level. Parallel qRT- by the phenotypic rescue of shIWS1, by U2AF65α, but not by PCR experiments revealed that although the pattern of U2AF2 U2AF65β (Fig. 6d). The role of U2AF2 alternative RNA splicing mRNA splicing changes as the cells progresses through the cell in the regulation of cell proliferation by IWS1 in all four cell lines cycle, the overall abundance of the U2AF2 RNA does not change was also supported by the results of immunoblotting experiments (Fig. 6g). addressing the expression of the proliferation marker PCNA, Chromatin immunocleavage (ChIC) experiments revealed which showed that shIWS1 reduces the expression of PCNA, and increased binding of IWS1 on U2AF2 exons 2 and 3 during S that the reduced PCNA expression can be rescued again by and G2/M, with highest binding during S phase (Fig. 6h). Finally, U2AF65α, but not U2AF65β (Supplementary Fig. 9b). although SETD2 and H3K36me3 are more abundant during G2/ The activation of the Sororin/p-ERK positive feedback loop M (Fig. 6f), the binding of SETD2, the abundance of H3K36me3 and the induction of its downstream targets CDK1 and Cyclin B1 chromatin marks, and the U2AF2 E2/E3 ratio parallel the by IWS1 phosphorylation, suggested that IWS1 promotes cell abundance of IWS1 and its binding to the U2AF2 gene, which proliferation by facilitating progression through the G2/M phase are the highest during S phase (Fig. 6h, i). The cell cycle of the cell cycle. To address this hypothesis, we stained log-phase regulation of the pathway and the potential mechanisms involved cultures of shControl, shIWS1, shIWS1/U2AF65α, and shIWS1/ are outlined in Supplementary Fig. 9c. U2AF65β-rescued NCI-H522, NCI-H1299, A549, and NCI- Overall, these data provide strong support for the model in H1975 cells with propidium iodide and we analyzed them by Fig. 6j. IWS1 expression and AKT activation increase as the cells flow cytometry. The results of this experiment confirmed the enter S phase and their increase is maintained during G2M. AKT hypothesis by showing that the shIWS1 cells accumulate in G2/ (primarily AKT3) phosphorylates IWS1 at S720/T721. The IWS1 M, and that the G2/M arrest is rescued by U2AF65α, but not by phosphorylation signals the recruitment of SETD2 to the CTD of U2AF65β (Fig. 6e). In agreement with the results of the RNA Pol II and promotes the trimethylation of histone H3 at K36 experiments in Figs. 5h and 6c, the shIWS1-induced percent in U2AF2 and other target genes. In the case of U2AF2, reduction of cell proliferation (Supplementary Fig. 9a) and the H3K36me3 is recognized by p52/LEDGF, which interacts with percent increase of cells in G2/M (Fig. 6e, lower panel) were the RNA splicing regulator SRSF1, and promotes the inclusion of more robust in the EGFR mutant (NCI-H1975) than in the exon 2 in the mature U2AF2 mRNA transcript. The exon 2- KRAS mutant (A549) cell line. Given the small number of cell containing U2AF2 transcript encodes U2AF65α, while the exon lines, these experiments provide only an indication that the 2-deficient transcript encodes U2AF65β. Of those, only regulation of cell proliferation by IWS1 may be more robust in U2AF65α, whose expression is promoted by IWS1 phosphoryla- lung adenocarcinomas with EGFR mutations. Strong support to tion, binds Prp19 and facilitates the proper splicing of CDCA5, this hypothesis was provided by experiments in primary leading to accumulation of its protein product Sororin, during S lung adenocarcinomas, which will be presented in subsequent and G2/M. Finally, Sororin and ERK form a positive feedback sections. loop, with ERK phosphorylating Sororin and Sororin promoting indirectly the phosphorylation of ERK. Activation of this loop plays an important role in the maintenance of ERK phosphoryla- IWS1 expression and phosphorylation, and U2AF2 alternative tion, and in the progression through the G2/M phase of the RNA splicing, fluctuate during progression through the cell cell cycle. cycle. The preceding findings suggest that an RNA splicing event, regulated by the AKT-mediated phosphorylation of IWS1, plays a critical role in cell cycle progression. It is known that the The AKT/IWS1/U2AF2/CDCA5/ERK pathway transforms expression or activity of molecules critically involved in the reg- hTert-immortalized human bronchial epithelial cells (hTert- ulation of the cell cycle, tend to fluctuate as the cells transit from HBEC) in culture. The role of the AKT/IWS1/U2AF2/CDCA5/ one phase of the cell cycle to the next66. We therefore examined ERK pathway in cell cycle regulation raised the question whether the expression and phosphorylation of IWS1, the pattern of this pathway also transforms cells in culture. To address this U2AF2 alternative RNA splicing, and the expression of Sororin, question, we used a soft agar-based assay to determine whether along with the expression of SETD2 and the abundance of activation of the pathway promotes anchorage-independent H3K36me3 chromatin marks, in NCI-H1299 cells, sorted into growth of hTert-HBEC cells. The cells were first transduced G1, S, and G2/M pools. To separate cells in different phases of the with lentiviral constructs of constitutively active AKT3 (Myr- cell cycle into distinct pools, we stained exponentially growing AKT3 and AKT3-DD), the phosphomimetic IWS1-DE mutant, cells with a carboxyfluorescein succinimidyl ester (CFSE)-like U2AF65α, and U2AF65β, encoded by the two splice variants of DNA dye, and we sorted them by FACS67 (Supplementary Fig. 8). U2AF2 and Sororin, wild type, and its phosphomimetic and The cell cycle markers we used to validate the sorting, were Cyclin phosphorylation-deficient mutants (DM-E and DM-A, respec- E1 (G1 phase), CDC25A (S phase), and phosphorylated Histone tively). The expression of the proteins encoded by all the trans- H3 (S10) (G2/M phase). Western blotting of cells in different duced constructs was determined by western blotting pools revealed that IWS1, phospho-IWS1, Sororin, SETD2, and (Supplementary Fig. 10a). Cells were plated in triplicate and they histone H3K36me3 are indeed upregulated in S and G2/M were imaged seven days later, using an incucyte live-cell imager (Fig. 6f). Whereas IWS1 expression and phosphorylation were (Supplementary Fig. 10b). Live-cell numbers were measured

NATURE COMMUNICATIONS | (2021) 12:4624 | https://doi.org/10.1038/s41467-021-24795-1 | www.nature.com/naturecommunications 13 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-24795-1 immediately after imaging, as described in the experimental The AKT/IWS1/U2AF2/CDCA5/ERK pathway is active in procedures and the data are presented as the mean number ± SD human lung adenocarcinomas and impacts tumor grade, stage, of three independent cultures (Supplementary Fig. 10c). The metastatic potential, and treatment relapse in patients with results of this experiment fully support the role of this pathway in EGFR mutant, but not KRAS mutant tumors. To determine cell transformation, by showing that whereas constitutively active whether the pathway activated by IWS1 phosphorylation and AKT3, wild-type Sororin, the phosphomimetic mutants of IWS1, leading to Sororin expression and ERK phosphorylation is active Sororin (IWS1 DE and Sororin DM-E), and U2AF65α transform in human lung adenocarcinomas (LUAD), we examined the cells in culture, the phosphorylation-deficient mutant of Sororin expression and phosphorylation of IWS1, the alternative splicing (Sororin DM-A) and U2AF65β does not. of U2AF2, and the abundance of ERK and phosphor-ERK, CDK1, and phosphor-CDK1 and cyclin B1 in a set of 40 human LUAD samples. For 30 of these tumors, normal adjacent tissue (NAT) The AKT/IWS1/U2AF2/CDCA5/ERK pathway controls tumor was also available and was tested on parallel with the matching growth in vivo. Our earlier studies had shown that the loss of tumor sample. The results showed that the expression and IWS1, or IWS1 phosphorylation, inhibits tumor growth in a phosphorylation of IWS1, the E2/E3 ratio in the U2AF2 mRNA, mouse xenograft model22. To confirm this observation, we and the expression and/or phosphorylation of downstream tar- repeated the experiment in two lung adenocarcinoma cell lines gets of the pathway, were all higher in the tumors, than in normal not tested before (A549 and NCI-H1975), and in NCI-H1299 tissues (Fig. 8a). Importantly, IWS1 phosphorylation promoted cells, which were used as the positive control. Cells transduced the inclusion of exon 2 in the U2AF2 mRNA, but it did not alter with shIWS1 or shControl constructs, were inoculated sub- the expression of U2AF2 (Supplementary Fig. 12a). Overall, these cutaneously, in the flanks of immunocompromised NSG mice. data confirmed that the pathway is active in the tumors, but not Mice injected with NCI-H1299 and NCI-H1975 cells were in NAT. sacrificed at 4 weeks post injection, while mice injected with A549 Human LUAD frequently harbors KRAS or EGFR mutations cells were sacrificed at 6 weeks post-injection. The results revealed and data presented in this report suggested that lung adeno- that the IWS1 knockdown reduced tumor growth and that the carcinoma cells harboring EGFR mutations may be more sensitive growth reduction was least pronounced in tumors derived from to the loss of IWS1 than KRAS mutant cells (Figs. 5, 6, 7 and the KRAS mutant cell line A549 (Fig. 7a, b). The weak growth Supplementary Fig. 10a). To identify tumors harboring mutations reduction of tumors derived from shIWS1 A549 cells paralleled in these genes, we probed the tumor lysates with monoclonal the weak inhibition of ERK phosphorylation (Fig. 5h) and cell antibodies, which selectively recognize the G12V and G12D proliferation (Supplementary Fig. 9a) induced by the knockdown mutants of KRAS and the L858R mutant of EGFR69 (Fig. 8a). of IWS1 in these cells. Comparison of the abundance of IWS1 and phosphorylated IWS1 To address the mechanism of the inhibition of tumor growth with the U2AF2 E2/E3 ratio, and with the abundance of Sororin, by shIWS1, we first confirmed the efficiency of the IWS1 CDK1, phosphor-CDK1, and Cyclin B1, revealed strong correla- knockdown, by probing western blots of tumor cell lysates with tions in the entire cohort. However, the correlations were more anti-IWS1 and anti-phospho-IWS1 (S720) antibodies (Fig. 7c). robust in the EGFR than in the KRAS mutant tumors (Fig. 8b). Following this, we employed RT-PCR and qRT-PCR to address Consistent with these data, the abundance of phosphor-IWS1 and the usage of exon 2 in the U2AF2 mRNA in the tumors. The the U2AF2 E2/E3 ratio, correlates positively with tumor stage results confirmed that the knockdown of IWS1 has no effect on (Fig. 8c) and negatively with survival in patients with EGFR the total U2AF2 mRNA levels (Supplementary Fig. 11a, upper mutant, but not KRAS mutant tumors (Fig. 8d). panel), but promotes the exclusion of exon 2 from U2AF2 mRNA The preceding data were confirmed by IHC, using sequential (Fig. 7c, Supplementary Fig. 11a, lower panel). Probing both sections of a commercially available tissue microarray (TMA) of tumor lysates and tissue sections with antibodies to regulators 50 LUAD with paired NAT. The TMA samples were probed with and targets of the Sororin/ERK feedback loop, confirmed that its antibodies to p-IWS1, Sororin, p-ERK, p-CDK1, and EGFR activity was reduced in tumors derived from shIWS1 cells (Fig. 7c, ΔE746-A750 (Supplementary Fig. 12b, c). The results confirmed Supplementary Fig. 11b). Measuring the abundance of the that the pathway is more active in the tumors, than in NAT proliferation markers PCNA (western blotting) and Ki-67 (Supplementary Fig. 12d) and that its activity correlates with (immunohistochemistry), confirmed that the expression of these tumor stage and grade (Supplementary Fig. 12e, f). More markers was also reduced in the shIWS1 xenografts (Fig. 7c, d). important, the abundance of IWS1 phosphorylation correlates Quantitative analyses of the western blot (p-ERK and PCNA) and with the abundance of Sororin, phosphor-ERK, and phosphor- IHC data (Ki-67) showed more robust downregulation of all these CDK1, and the correlations are significantly more robust in the markers in tumors derived from the shIWS1-transduced NCI- EGFR mutant tumors (Supplementary Fig. 12g). In addition to H1975 than A549 cells (Fig. 7e), as expected. confirming the western blot data in our set of LUADs, the IHC To determine whether the regulation of xenograft growth by data also demonstrate that the activity of the pathway can be phosphorylated IWS1 depends on the inclusion of exon 2 in the monitored in human tumors by IHC. U2AF2 mRNA, we knocked down IWS1 in NCI-H1299 cells and The data generated from the analysis of the tumor samples in we rescued the knockdown with U2AF65α or U2AF65β our LUAD cohort, and the tumor samples in the TMA, were (U2AF65α-R and U2AF65β-R). These cells, as well as shControl confirmed by data in publicly available databases. Analysis of and shIWS1 NCI-H1299 cells, were injected in the flanks of NSG LUAD data derived from the Tumor Cancer Genome Atlas mice, as described in the “Methods” section. The results of this (TCGA), revealed correlations between IWS1 or SRSF1, and the experiment (Supplementary Fig. 11c and Fig. 7f) confirmed that U2AF2 E2/E3 ratio, as well as other components of the IWS1 the U2AF2 alternative RNA splicing plays a critical role in the phosphorylation pathway (Fig. 8e). The U2AF2 E2/E3 ratio and regulation of tumor growth by phosphorylated IWS1. The the expression of the CDCA5 mRNA were also significantly xenograft data presented here were in full agreement with the higher in tumors expressing high levels of IWS1 (Supplementary data on the role of the IWS1 phosphorylation pathway in cell Fig. 12h) and IWS1 expression exhibited a positive correlation proliferation (Fig. 6a, b) and cell transformation in culture with tumor stage, in tumors harboring EGFR but not KRAS (Supplementary Fig. 10). mutations (Supplementary Fig. 12i).

14 NATURE COMMUNICATIONS | (2021) 12:4624 | https://doi.org/10.1038/s41467-021-24795-1 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-24795-1 ARTICLE

a NCI-H1299 A549 NCI-H1975 b shCon NCI-H1299 A549 NCI-H1975 100 p=0.0027 shIWS1 p=0.0126 80

p<0.0001 2500 2500 p=0.0001 3 60 p<0.0001 p=0.0004 shCon 2000 2000 p=0.0005 shIWS1 p=0.0007 40 1500 1500 by shIWS1 % reduction 1000 1000 20 500 500

Tumor Weight (mg) Weight Tumor 0

0 (mm ) Volume Tumor 0 NCI-H1299 A549 NCI-H1975 NCI-H1299 A549 NCI-H1975 Tumor Weight Tumor Volume c d shCon shIWS1 shCon shIWS1 shCon shIWS1

NCI-H1299 A549 NCI-H1975

100 μm 100 μm 100 μm N1 N2 N3 N4 N5 N1 N2 N3 N4 N5 N1 N2 N3 N4 N5 100 μm 100 μm 100 μm shCon +-+-+-+-+- +-+-+-+-+- +-+-+-+-+- Ki-67 shIWS1 -+-+-+-+-+ -+-+-+-+-+ -+-+-+-+-+(KDa) p-IWS1 (S720) -140

IWS1 -140 25 μm 25 μm 25 μm 25 μm 25 μm 25 μm U2AF2 -1300bp 1 2 3 -1100bp NCI-H1299 A549 NCI-H1975 Sororin -40 p-ERK1/2 3.0 p=0.0002 (Y202/T204) -40 ERK1/2 -40 shCon p-CDK1 (Y15) -34 p=0.0002 2.0 CDK1 -34 p=0.0007 shIWS1 Cyclin B1 -55 PCNA -36 1.0 β-actin -55 Ki-67 IHC signal 0 per section area unit NCI-H1299 A549 NCI-H1975 ef NCI-H1299 A549 NCI-H1975 75 p<0.0001 p<0.0001 p<0.0001 2500 p=0.0004 2000 p=0.0002

NCI-H1299 1500

50 shCon 1000 500 shCon

Tumor Weight (mg) Weight Tumor shIWS1 shIWS1 0 U2AF65α-R p=0.0004 2500 3 U2AF65β-R

by shIWS1 U2AF65α-R p=0.0007 % reduction 25 2000 1500 U2AF65β-R 1000

500 0 Tumor Volume (mm ) Volume Tumor 0 p-ERK1/2 PCNA Ki-67 NCI-H1299

Fig. 7 IWS1 phosphorylation controls tumour growth in vivo, by regulating the U2AF2/Sororin/ERK axis. a (Upper panels) NSG mice were injected subcutaneously with shControl or shIWS1 NCI-H1299, A549, or NCI-H1975 cells. (N = 5 mice/group). Images of tumors, harvested at 4 weeks (NCI- H1299, NCI-H1975) or 6 weeks (A549) from the time of inoculation. (Lower panels) Scatter plots showing the weight and volume of individual tumors. The horizontal lines indicate mean tumor weight or volume. Statistical analyses were done using the one-sided paired t-test. b shIWS1-induced percent reduction of tumor weight and tumor volume, calculated from the weight and volume of the tumors induced by shIWS1 and shControl cells in panel a. Bars show the mean percent reduction±SD. Statistical analyses were done using the one-sided unpaired t-test. c Cell lysates derived from NCI-H1299, A549 and NCI-H1975 (shControl and shIWS1) mouse xenografts, were probed with the indicated antibodies. RT-PCR on RNA from the same cells, using primers for U2AF2 exons 1 and 3. d (Upper panels) The indicated shControl and shIWS1, formalin-fixed, paraffin-embedded tumor samples were stained with the Ki-67 antibody. Boxes in the upper images delineate the area of higher magnification shown in the lower images. Scale bar in the right lower corner of each image. (Lower panels) Scatter plots show the Ki-67 IHC signal per section area unit in shControl and shIWS1 tumors. The horizontal lines show mean Ki-67 signals. Statistical analyses were performed, using the one-sided paired t-test. e ShIWS1-induced percent reduction of the phosphor-ERK, PCNA and Ki-67 signals in the xenograft tumors. Bars show the mean signal reduction, based on quantification of the Western blot data in panel c (p-ERK and PCNA) and IHC data in figure 7e (Ki-67) ± SD. Statistics were performed using the one-sided unpaired t-test. f (Left panels) shControl, and shIWS1 NCI-H1299 cells, as well as shIWS1 NCI-H1299 cells rescued with U2AF65α, or U2AF65β, were injected subcutaneously in NSG mice, as in the experiment in panel a (N = 5 mice/group). Images of tumors harvested at 4 weeks from the time of inoculation. (Right panels) Scatter plots showing the weight and volume of the harvested tumors. The horizontal lines indicate mean tumor weight or volume. Statistical analyses were performed with the one-sided paired t-test.

Analysis of the molecular signature dataset GSE13213, which (Fig. 8f, left panel). Sorting tumors harboring EGFR or KRAS focuses on tumor relapse70, revealed that the expression of IWS1, mutations into separate groups, revealed that the activity of the CDCA5, CDK1, and CCNB1 (encoding Cyclin B1) was higher in a pathway was again higher, only in relapsing tumors with EGFR set of relapsing than in another set of non-relapsing LUADs mutations (Fig. 8f, middle and right panels, Supplementary

NATURE COMMUNICATIONS | (2021) 12:4624 | https://doi.org/10.1038/s41467-021-24795-1 | www.nature.com/naturecommunications 15 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-24795-1

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 3132333435 36 3738 39 40 a (kDa) NTNTNTNTNT NTNTNTNTNT NTNTNTNTNT NTNTNTNTNT NTNTNTNTNT NTNTNTNTNTTTTTTTTTTT 140- IWS1 140- p-IWS1 (S720) 60- p-AKT (S473) 60- AKT3 1300bp- U2AF2 1100bp- 123 10- 8- 6- 4- 2-

to normal 0-

E2/E3 relative 40- Sororin 42- p-ERK1/2 (Y202/T204) 42- ERK1/2 34- p-CDK1 (Y15) 0.599620819 34- CDK1 55- Cyclin B1 21- KRAS G12V 21- KRAS G12D 175- EGFR L858R 175- EGFR E746-A750 del 55- α-tubulin

n

1

3

i

2

r

S 3

F E

1

/

o

bcT d

A

r

S W

2

I

2 K

o

-

E

W

U

I p S A p-ERK p-CDK1 Cyclin B1 p-AKT CDK1 OSU LUAD cohort IWS1 OSU LUAD cohort p-IWS1 U2AF2 E2/E3 Stage I Stage II/III p-IWS1 (S720) Sororin U2AF2 E2/E3 p-ERK 250 Low U2AF2 E2/E3 (n=10) p=0.0073 100 Low p-IWS1 (n=11) 100 p-CDK1 High U2AF2 E2/E3 (n=20) p=0.0012 High p-IWS1 (n=19) CDK1 200 p=0.8063 80 80 Cyclin B1 (n=40) 60 60 AKT3 150 p-AKT 40 40 All patients 20 HR = 3.75 (1.029-13.66) 20 HR = 4.145 (1.148-14.96) IWS1 100 logrank p=0.0451 logrank p=0.0299 p-IWS1 0 0 Survival Propability Survival Propability

U2AF2 All patients 20 40 60 80 100 120 20 40 60 80 100 120 E2/E3 50 Time (Months) Time (Months) Sororin p-ERK 100 100 p-IWS1 (S720) p-CDK1 N=19 N=21 N=10 N=4 N=5 N=8 80 80 CDK1 60 60 Cyclin B1 Low p-IWS1 (n=6)

(n=14) 12 Low U2AF2 E2/E3 (n=4) AKT3 p=0.9963 40 High p-IWS1 (n=6) 40 p=0.001 p=0.001 High U2AF2 E2/E3 (n=8) p-AKT 10 KRAS mut 20 HR = 2.938 (0.24-23.48) 20 HR = 4.862 (0.46-51.16) logrank p=0.4524 logrank p=0.1878 IWS1 8 0 0 Survival Propability KRAS mut p-IWS1 40 80 Survival Propability 40 80 U2AF2 E2/E3 6 Time (Months) Time (Months) E2/E3 100 Low p-IWS1 (n=5) Sororin 4 100 Low U2AF2 E2/E3 (n=6) p-ERK High p-IWS1 (n=8) 80 80 High U2AF2 E2/E3 (n=7) p-CDK1 2 to normal CDK1 60 60 0 U2AF2 Cyclin B1 (n=13) N=19 N=21 N=10 N=4 N=5 N=8 40 40 AKT3 20 HR = 4.57 (1.256-7.27) 20 HR = 6.55 (1.006-42.70) p-AKT EGFR mut All KRAS EGFR logrank p=0.0292 logrank p=0.0492 0 0 Survival Propability Survival Propability

0 0.25 0.5 0.75 1.0 EGFR mut patients mut mut 20 40 60 80 100 120 20 40 60 80 100 120 Time (Months) Time (Months) efg TCGA LUAD GSE13213 GSE141685 GSE123903 All patients KRAS mut EGFR mut Incipient E2 E2 Metastasis B1 B1 Total Total IWS1 CDK1 CDK1

Prp19 (n=117) (n=14) (n=45) Cyclin Cyclin SRSF1 U2AF2 CDCA5 U2AF2 U2AF2 U2AF2 CDCA5 Macro (n=6) lesion metastasis Brain Metastatic (n=59)

(n=8) DTC (n=26) Relapsed Relapsed Relapsed (n=24) Tumor Primary LUAD (n=58) (n=6) (n=19) Non-Relapsed Non-Relapsed B1 Non-Relapsed 9.5 11 12.5 11.5 14 16.5 IWS1 E2/E3 (n=516) CDK1

(n=128) Cyclin IIIb/IIIc

log2 (norm. count) FGFR2 U2AF2 log2 (norm. count) CDCA5 IWS1 B1 B1 B1 CDCA5 IWS1 IWS1 IWS1 CDK1 CDK1 CDK1 -1.5 0 1.5 Cyclin Cyclin -3 -2 -1 0 1 2 3 Cyclin Sororin Sororin Sororin z-score z-normalized expression -3 -2 -1 0 1 2 3 z-score h TCGA LUAD IWS1 U2AF2 E2/E3 FGFR2 IIIb/IIIc

100 Low IWS1 (n=336) 100 Low U2AF2 E2/E3 (n=285) 100 High FGFR2 IIIb/IIIc (n=218) 80 High IWS1 (n=336) 80 High U2AF2 E2/E3 (n=282) 80 Low FGFR2 60 HR = 4.43 (1.21-7.21) 60 logrank p=0.0352 60 IIIb/IIIc (n=218) 40 40 40 HR = 2.75 (2.206-3.420) 20 HR = 1.45 (1.13-1.85) 20 20 logrank p=0.0001 logrank p=0.0031 0 0

0 Survival Propability Survival Propability 20 40 60 80 100 120 Survival Propability 0 20 40 60 80 100 120 All patients 50 100 150 200 250 Time (Months) Time (Months) Time (Months) 100 Low U2AF2 High FGFR2 100 100 Low IWS1 (n=21) E2-E3 (n=70) IIIb/IIIc (n=77) 80 High IWS1 (n=131) 80 High U2AF2 80 Low FGFR2 mut 60 60 E2/E3 (n=88) 60 IIIb/IIIc (n=78) HR = 1.26 (0.74-2.13) 40 40 40 logrank p=0.3807 20 HR = 0.625 (0.27-1.432) 20 HR = 0.97 (0.57-1.68) 20 logrank p=0.9394 0 logrank p=0.2669 0 0 Survival Propability Survival Propability KRAS 50 100 150 200 250 Survival Propability 0 20 40 60 80 100 120 0 20 40 60 80 100 120 Time (Months) Time (Months) Time (Months) 100 Low U2AF2 100 Low IWS1 (n=18) 100 High FGFR2 High IWS1 (n=17) E2/E3 (n=39) 80 80 80 IIIb/IIIc (n=39) HR = 4.002 (1.59-10.02) High U2AF2 Low FGFR2 mut 60 logrank p=0.0031 60 60 E2/E3 (n=27) IIIb/IIIc (n=27) 40 40 HR = 3.95 (1.631-9.56) 40 logrank p=0.0023 20 20 20 HR = 2.25 (1.06-4.78) logrank p=0.0342

0 Survival Propability 0 0 Survival Propability Survival Propability

EGFR 20 40 60 80 100 120 20 40 60 80 100 120 20 40 60 80 100 120 Time (Months) Time (Months) Time (Months)

Fig. 12j). We conclude that the IWS1 phosphorylation pathway phosphorylation pathway was higher in the metastatic tumors may also promote treatment relapse of lung adenocarcinomas, (Fig. 8g, Supplementary Fig. 12k). Importantly, the U2AF2 E2/E3 especially those with EGFR mutations. ratio was also higher in the metastatic tumors, while the IIIb/IIIc Analysis of the RNA-seq data in the GSE141685 lung FGFR2 transcript ratio was reduced (Supplementary Fig. 12l), in adenocarcinoma dataset of primary tumors and brain metastases agreement with our earlier observations, showing that IWS1 revealed that the expression of IWS1 and its targets in the IWS1 phosphorylation promotes FGFR2 exon 8 skipping22. The role of

16 NATURE COMMUNICATIONS | (2021) 12:4624 | https://doi.org/10.1038/s41467-021-24795-1 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-24795-1 ARTICLE

Fig. 8 The p-IWS1/U2AF2 pathway is active in human lung adenocarcinomas and impacts tumour grade, stage, metastatic potential and treatment relapse in patients with EGFR mutant, but not KRAS mutant tumors. a Lysates of 30 LUAD samples paired with NAT, and 10 unpaired samples, were probed with the indicated antibodies. RT-PCR of U2AF2 in the same samples was performed, using exon 1 and 3 primers. Bars show the U2AF2 E2/E3 exon ratio in the tumors in the upper panel, relative to the average of the 30 normal samples. b Correlation heatmaps between components of the IWS1 phosphorylation pathway, in the LUADs in a. Correlation coefficients were calculated using simple linear regression. The values and the statistical confidence of all the comparisons can be found in Supplementary Table 5. c Violin plots showing the abundance of IWS1 phosphorylation and the U2AF2 E2/E3 ratio (right) in stage I and Stage II/III tumors. Data shown for all tumors in panels a and b, and selectively for EGFR or KRAS mutant tumors. The horizontal black lines indicate mean values for phospho-IWS1 levels and U2AF2 E2/E3 ratios. Statistical analyses were performed using the one-sided unpaired t-test. d Kaplan-Meier Curves of phosphor-IWS1 and U2AF2 E2/E3 ratios in the LUAD cohort in panel a. Statistical analyses were performed using the one-sided log rank test and Cox’s proportional hazards model. e Heat Maps showing the correlation of IWS1 or SRSF1 with components of the IWS1/

U2AF2 pathway in the TCGA LUAD database using log2 normalized counts. f Heatmaps showing the expression of IWS1 pathway components in the tumors of the GSE13213 dataset. g (Left panel) Heat Maps showing expression of IWS1 pathway components in resected brain metastases and primary lung adenocarcinomas. RNA-seq data for metastatic lesions were derived from the GSE141685 dataset and for primary LUADs from TCGA. (Right upper panel) scRNA-Seq data from GSE123903. Force-directed layout/t-SNE of all metastatic tumor cells isolated from patient-derived mouse xenografts (Laughney et al., 202071), colored by source, (Right lower panel) Force-directed layout (as in the upper panel) of all xenograft tumor cells colored for z- normalized IWS1 and CDCA5 expression. h Kaplan–Meier curves showing the impact of IWS1, U2AF2 E2/E3 and FGFR2 IIIb/IIIc ratios on patient survival in the indicated patients in the TCGA LUAD database. Statistical analyses were performed as in panel d. the IWS1 phosphorylation pathway in metastasis was additionally promotes cell proliferation by regulating the alternative RNA supported by the results of single-cell RNA-Seq experiments on splicing of U2AF2, the RNA splicing of its target CDCA5, and the tumors, in a patient-derived LUAD mouse model71. These results expression of the CDCA5-encoded protein Sororin. Specifically, revealed that both IWS1 and CDCA5 are highly expressed in IWS1 phosphorylation promotes the inclusion of the alternatively subsets of the disseminated tumor cell (DTC) and incipient spliced exon 2 in the mature U2AF2 mRNA transcript. Exon 2 metastasis sets of tumor cells (Fig. 8g, right panel). Analysis of the encodes the RS domain of the U2AF2 protein product U2AF65, RNASeq data in the TCGA-LUAD dataset, which contains which interacts with several proteins involved in the regulation of information on cancer-associated mutations, confirmed the link RNA metabolism, one of which is the ubuiquitin ligase Prp19. between IWS1 expression and metastatic disease, but also showed The latter is a member of a splicing complex composed of four that IWS1 is upregulated in metastatic tumors with EGFR, but core and three accessory polypeptides, which is recruited to RNA not KRAS mutations (Supplementary Fig. 12m). Pol II via its interaction with the RS domain of U2AF65. Data in As expected from the preceding data, IWS1 expression, U2AF2 this report confirmed that only the protein encoded by the exon exon 2 inclusion, and FGFR2 exon 8 skipping are indicators of 2-containing splice form of U2AF2 (U2AF65α), which is poor prognosis in TCGA patients with lung adenocarcinomas, expressed in cells undergoing AKT-dependent IWS1 phosphor- harboring EGFR, but not KRAS mutations (Fig. 8h). In addition, ylation, interacts with Prp19. In addition, they showed that the EGFR mutations in LUADs in the TCGA and the GSE13213/ Prp19-interacting protein U2AF65α is required for CDCA5 GSE26939 datasets were associated with worse prognosis, when mRNA processing, Sororin expression, cell cycle progression they occurred in patients with high expression of IWS1 through G2/M, and cell proliferation. Sororin is phosphorylated (Supplementary Fig. 12n, o). by ERK, and following phosphorylation, promotes the activation The AKT/IWS1/U2AF2/CDCA5/ERK pathway can potentially of ERK by indirect and poorly understood mechanisms. We be activated by multiple mechanisms in human lung adenocarci- should add that the Sororin-dependent phosphorylation of ERK nomas. To determine the potential role of genetic changes in the plays a dominant role in ERK regulation, as inhibition of the activation of the pathway, we examined the whole-exome IWS1 phosphorylation pathway significantly inhibits the activa- sequencing information in the TCGA-LUAD dataset for copy- tion of ERK by EGFR mutations and has a major impact in the number variations, point mutations, and genomic fusions biology of lung adenocarcinomas harboring such mutations. involving genes in this pathway (Supplementary Data 1). This Importantly, the IWS1 phosphorylation pathway summarized analysis identified several genetic changes of which the most here is an integral component of the cell cycle machinery, as it common was an amplification of the AKT3 gene (Supplementary does not only regulate the cell cycle, but is also cell cycle- Fig. 13a). Gain-of-function mutations targeting AKT1 (E17K, regulated. D323Y) were also observed (Supplementary Fig. 13b). The The results of our earlier studies, combined with the data in significance of other mutations (L11F and Q10H, immediately this report, show that a given RNA splicing regulator may upstream of the RS domain of U2AF65) and the significance of modulate alternative RNA splicing of different target genes by X77 = , a splice-site U2AF2 mutation, is not known. different mechanisms. Our previous findings had shown that the The pathway described in this report is activated by AKT, abundance of IWS1 and IWS1 phosphorylation regulates the primarily AKT3, upregulates CDCA5, and is associated with poor alternative RNA splicing of FGFR2 by promoting the exclusion of prognosis. Here we show that both the high expression of AKT3, the alternatively spliced exon 8 from the mature transcript. Here the signaling molecule at the initial step of the activation of the we show that IWS1 and IWS1 phosphorylation promote the pathway, and CDCA5, the signaling molecule at the pathway inclusion of alternative spliced exons in the mature transcripts of endpoint, are also associated with poor prognosis (Supplementary several genes. Moreover, although the SETD2-dependent H3K36 Fig. 13c). These data provide additional support to the trimethylation is required for alternative RNA splicing in both importance of this pathway in the pathophysiology of human cases, the effector complexes nucleated by H3K36me3 differ. lung adenocarcinomas. Thus, whereas the H3K36me3 reader responsible for exon exclusion from the FGFR2 mRNA is MRG15, the H3K36me3 reader for exon inclusion in the U2AF2 mRNA is the p52 isoform Discussion of LEDGF. Also, whereas H3K36/MRG15 recruits the spliceoso- Data presented in this report describe a signaling pathway, which mal factor PTB for the alternative splicing of FGFR2, H3K36/ starts with the AKT3-dependent phosphorylation of IWS1 and LEDGF (p52) recruits SRRSF1 for the alternative splicing of

NATURE COMMUNICATIONS | (2021) 12:4624 | https://doi.org/10.1038/s41467-021-24795-1 | www.nature.com/naturecommunications 17 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-24795-1

U2AF2. Currently, we do not know why the two types of genes mechanisms on the regulation of CCNB1 and CDK1 expression is respond differently to the IWS1 phosphorylation-induced histone under investigation. H3K36 trimethylation. One possibility is that cis-acting elements The information discussed in the preceding paragraph in the RNA facilitate the binding of factors, which could synergize describes a dominant mechanism by which the IWS1 phos- with specific H3K36me3 readers and associated proteins. This is phorylation pathway regulates cell cycle progression and cell supported by findings reported here, which show that RNAs proliferation. Given that the cell cycle is an integrated system and undergoing IWS1-dependent exon inclusion, including the that cell cycle regulatory mechanisms tend to also be cell cycle- U2AF2 mRNA, contain sites that may be recognized by SRSF1. regulated, we examined whether the expression and phosphor- Alternatively, there may be differences in the epigenetic marks ylation of IWS1, the RNA splicing of U2AF2, and the expression responsible for the selection of H3K36me3 readers and reader- of Sororin fluctuate as the cells progress through the cell cycle. associated factors in different genes. We should add here that our The results revealed that the abundance of these molecules and earlier studies and data in this report show that not only both the RNA splicing of U2AF2 indeed fluctuate in a cell cycle- mechanisms have a role in human cancer, but they may also be dependent fashion, and confirmed that the regulation of RNA active simultaneously in the same cancer. splicing by IWS1 is an integral component of the cell cycle Another important observation presented in this report is that machinery. The fluctuation of the activity of the IWS1 phos- mRNA splicing is a process that is regulated at multiple levels. phorylation pathway during cell cycle progression is due to the Thus, the alternative splicing of U2AF2 is regulated directly by the fluctuating levels of IWS1 mRNA, protein, and protein phos- IWS1 phosphorylation-dependent abundance of histone phorylation. The latter may be due to cell cycle-dependent H3K36me3 marks in the body of the U2AF2 gene. However, by changes in the activity of AKT. Earlier reports had indeed shown regulating the alternative splicing of a basic RNA splicing factor, that CDK2 is activated in S phase by interacting with cyclin A2, this process introduces a new layer of RNA splicing regulation, and that following activation, phosphorylates AKT at Ser477/ which depends in part on the differential binding of U2AF65 to Thr479, enhancing its activity75. To the changing levels of IWS1 yet another splicing factor, Prp19. The binding between these expression and phosphorylation, we should add that the abun- factors is required for the efficient splicing of CDCA5 and perhaps dance of some core splicing factors also fluctuates with the cell other genes. The reason for the multilayered control of RNA cycle. Our working hypothesis therefore is that the activity of the splicing by a single RNA splicing regulator could be that this IWS1 phosphorylation pathway may fluctuate during the cell allows a limited number of available pathways to converge in cycle, due to a combination of cell cycle-dependent processes. different combinations for the differential modulation of a large An important component of the cell cycle regulatory number of RNA splicing events. This may be critical for the fine- mechanisms initiated by the AKT-dependent phosphorylation of tuning of the global regulation of RNA splicing under different IWS1 is a positive feedback loop between ERK and Sororin. ERK physiological conditions. We should add here that although the phosphorylates Sororin and the phosphorylated Sororin promotes effects of the RS domain-deficient U2AF65 on RNA splicing may the phosphorylation of ERK. How the ERK-phosphorylated be global, they are selective, affecting the RNA splicing of some, Sororin promotes the phosphorylation and activation of ERK is but not all the genes. One of the factors that determine the spe- currently unknown. Our working hypothesis is that the phos- cificity in the pathway described here could be the binding of phorylation and activation of ERK are due to signals induced by Prp19, but this question remains to be addressed. the interaction of Sororin with its partners in the cohesin com- The IWS1 phosphorylation pathway described in this report plex. If this is the case, the cell may use this mechanism to sense plays a critical role in cell cycle progression by regulating the the successful progression from prometaphase to metaphase, in RNA splicing of CDCA5 and the abundance of the CDCA5- order to activate a molecular switch, which enhances the phos- encoded protein Sororin. The latter is one of the seven members phorylation of Sororin, facilitating entry into, and progression of the cohesin complex, a ring-like structure, which holds the through the G2/M phase of the cell cycle. sister chromatids together during metaphase72. Defects in this One of the most important findings presented in this report is complex activate the spindle assembly checkpoint, arresting the link between the AKT3/IWS1/U2AF2/CDCA5/ERK pathway progression through G2/M73. This explains the partial G2/M and the biology of lung adenocarcinomas harboring EGFR arrest induced by the CDCA5 mRNA processing block associated mutations. The first surprising observation was the significant with the downregulation of IWS1 expression and/or phosphor- downregulation of the phosphorylation of ERK, induced by the ylation and with the RS domain-deficient U2AF65β. However, the knockdown of IWS1 in lung adenocarcinoma cell lines with downregulation of CDCA5/Sororin may interfere with additional EGFR mutations, and to a lesser extent KRAS mutations. Sub- G2/M-associated processes, such as transcription of genes con- sequent observations confirmed that the knockdown of IWS1 had tributing to progression through the G2/M phase of the cell cycle. a major impact on the proliferation of lung adenocarcinoma cell Data presented in this report show that Sororin expression and/or lines, particularly those with EGFR mutations. Moreover, studies phosphorylation, promotes the expression of CCNB1 and CDK1 on 40 lung adenocarcinomas from the OSU tumor bank and 50 at both the RNA and protein levels. In addition, the abundance of lung adenocarcinomas in commercially available tissue micro- the CDCA5 mRNA exhibits very strong correlations with the arrays showed that the IWS1 phosphorylation pathway is active abundance of the CCNB1 and CDK1 mRNAs, in the TCGA in primary human tumors. More important, these data also datasets of human lung adenocarcinomas. These observations revealed that the activity of the pathway correlates positively with indicate that the induction of cyclin B1 and CDK1 by Sororin is tumor grade and stage, and negatively with patient survival, regulated at the RNA level, most likely at the level of transcrip- selectively in tumors harboring EGFR mutations. These obser- tion. A potential mechanism for the transcriptional regulation of vations were also in agreement with data generated from the CCNB1 and CDK1 by Sororin was suggested by earlier studies meta-analysis of lung adenocarcinoma datasets. Meta-analysis of showing that the Cohesin complex interacts with the Mediator these datasets showed that the activity of the IWS1 phosphor- complex and that Mediator–Cohesin complexes are loaded by the ylation pathway selectively correlates not only with tumor grade, NIBPL Cohesin loading factor to enhancers and core promoters tumor stage, and patient survival, but also with metastasis and of target genes. Enhancer and core promoter-associated com- with tumor relapse following treatment. Collectively, these data plexes promote loop formation between these segments and suggest that the AKT3/IWS1/U2AF2/CDCA5/ERK pathway is regulate transcription74. The contribution of this and other associated with less differentiated, more invasive, and more

18 NATURE COMMUNICATIONS | (2021) 12:4624 | https://doi.org/10.1038/s41467-021-24795-1 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-24795-1 ARTICLE metastatic tumors, and perhaps with resistance to EGFR inhibi- Transfections were carried out using 2× HEPES Buffered Saline (Sigma, Cat. No tors. Based on these findings, we propose two translational 51558) and CaCl2 precipitation. Forty-eight hours later, virus-containing culture fi applications for the IWS1 phosphorylation pathway described in media were collected and ltered. Infections were carried out in the presence of 8 μg/mL polybrene (Sigma, Cat. this report: (a) the expression and phosphorylation of IWS1, the No. 107689). Depending on the selection marker in the vector, 48 h from the start alternative splicing of U2AF2, and the gene expression program of the exposure to the virus, cells were selected for resistance to puromycin (Gibco, initiated by these processes, can be used as biomarkers to stratify Cat. No. A11138) (10 μg/mL), G-418 (Cellgro, Cat. No. 30-234) (500 μg/mL), or μ patients for treatment. (b) Treatment with inhibitors of the EGFR blasticidin (Gibco, Cat. No A1113903) (5 g/mL). Cells infected with multiple constructs, were selected for infection with the first construct, prior to the next pathway, in combination with AKT1/AKT3 inhibitors or decoy infection. RNA oligonucleotides targeting U2AF2 RNA splicing, may Transfection of lung adenocarcinoma cell lines with siRNAs (20 nM final enhance the therapeutic potential of EGFR pathway inhibition concentration) was carried out, using the Lipofectamine 3000 Transfection Reagent (Invitrogen, Cat. No. 13778) and Opti-MEM Reduced Serum Medium (Gibco, Cat. and may prevent the emergence of treatment-resistant clones. ’ In conclusion, the data in this report describe an important no. 11058021), according to the manufacturer s protocol. pathway that links cell cycle-regulated AKT activity to RNA splicing and cell cycle regulation. More important, this process is Cell proliferation assay, cell cycle analysis, and FACS-sorting of cells in dif- active in a significant fraction of human lung adenocarcinomas, ferent phases of the cell cycle. shControl, shIWS1, shIWS1/CDCA5 WT rescue, and its activity is associated with poor prognosis selectively in shIWS1/CDCA5 S79A/S209A mutant rescue, and shIWS1 CDCA5 S79E/S209E mutant rescue cells were plated in triplicate in 12-well tissue culture plates. Given patients with lung adenocarcinomas harboring EGFR mutations. that the growth rates of these cell lines differ, they were plated at different densities The activity of this pathway therefore, provides a precision (NCI-H522 8000 cells/well, NCI-H1299 5000 cells/well, H1975 5000 cells/well, and medicine biomarker, which may be used to stratify human lung A549 8000 cells/well). Cell proliferation of cells growing under normal culture adenocarcinomas and inform the optimal treatment strategy. conditions was monitored every 6 h, using the Incucyte S3 Live-Cell Imaging and Analysis System (Essen Biosciences, Ann Arbor, MI). All cell lines were monitored for 7 days, with the exemption of the NCI-H1975 cell line, which was monitored Methods for 12 days. Images were captured and analyzed using the Incucyte confluence Cells, culture conditions, growth factors, and inhibitors. NCI-H522, NCI- masking software (Essen Biosciences, Ann Arbor, MI), which calculates the surface H1299, and HEK-293T cells were grown in Dulbecco’s modified Eagle’s medium area occupied by the growing cells, as a percentage of the total surface area of the (DMEM) (Sigma-Millipore, Cat No. D5796) supplemented with penicillin/strep- well at sequential time points. Confluence monitoring was optimized for each cell tomycin (Corning, Cat No. 30-002-CI), nonessential amino acids (Corning, Cat line to minimize background. To ensure an unbiased analysis, the optimization No. 25-025-CI), glutamine (Corning, Cat No. 25-005-CI), plasmocin 2.5 ng/μL parameters determined for a given cell line were also applied to all the derivatives (Invivogen, Cat No. ant-mpp), and 10% fetal bovine serum. A549, NCI-H460, of that cell line. NCI-H1975, PC-9, NCI-H1650 and HBEC hTERT cells were grown in similarly To determine the cell cycle distribution of exponentially growing cells, supplemented Roswell Park Memorial Institute 1640 (RPMI 1640) medium semiconfluent cultures were harvested by trypsinization and the cell pellet was (Sigma-Millipore, Cat No D8758). Cell lines were periodically checked for myco- resuspended in 700 μL of PBS and fixed by adding 2.8 mL of ice-cold ethanol. The plasma, using the PCR mycoplasma detection kit (ABM, Cat No. G238) and they ethanol-suspended cells were kept at −20 °C overnight. Following two washes with were used for up to five passages. All experiments were carried out in mycoplasma- 1× PBS, the fixed cells were stained with propidium iodide (Propidium Iodide free cultures. IGF1 (Cell Signaling, Cat. No. 8917) (20 ng/mL), was used to sti- (1:2500)) (Invitrogen, Cat. No. P3566), 0.1 mg/mL RNAse A (Invitrogen, Cat. No. mulate NCI-H522 or NCI-H1299 cells that had been serum-starved for 24 h. Cells 12091-039), and 0.05% Triton X and incubated in the dark at 37 °C for 30 min. were treated with IGF1 for up to 4 h. To inhibit AKT in cells growing in complete Subsequently, the cells were analyzed on a BD FACS Calibur v2.3 Flowcytometer media, we treated them with the AKT inhibitor MK2206 (MERCK) (5 μM) for 4 h. (BD Biosciences, San Jose, CA). All the experiments were performed in triplicate At this concentration, MK2206 inhibits all three AKT isoforms. and they were analyzed, using the FlowJo v9.3.3 software. The raw data obtained from this analysis can be found in Supplementary Table 4. The analysis was performed in the Flow Cytometry Shared Resource of the Ohio State University siRNAs, shRNAs, expression constructs, and site-directed mutagenesis . siR- (https://cancer.osu.edu/for-cancer-researchers/resources-for-cancer-researchers/ NAs, shRNAs, and expression constructs are described in Supplementary Table 3. fl α β fi shared-resources/ ow-cytometry). cDNA copies of the U2AF2 splice variants and were ampli ed by RT-PCR, To separate cells in different phases of the cell cycle for further analysis, 2 × 106 from NCI-H522 shControl and NCI-H522 shIWS1 cells, respectively. The ampli- fi fi exponentially growing NCI-H1299 cells were harvested by trypsinization, counted, ed cDNAs were electrophoresed in 1% agarose gels and they were gel-puri ed and resuspended in DMEM, to a final concentration of 5 × 105 cells/mL. Following using the NucleoSpin Gel and PCR Clean-Up kit (M&N, Cat. No. 740609.50). The μ ™ ™ fi this, cells were stained by adding 2 L/mL Vybrant DyeCycle Ruby Stain puri ed cDNAs were cloned in the pENTR/D-TOPO cloning vector (Invitrogen, (Thermo Fisher, Cat. No. V10309) and by incubating them in the dark at 37 °C for Cat. No. 45-0218). Subsequently, they were transferred by recombination from the 30 min. The stained cells were sorted based on their DNA content, using a BD pENTR/D-TOPO clones to pLx304-V5-DEST (Addgene #25890), using standard fi FACS Aria III cell sorter (BD Biosciences, San Jose, CA). Cellular fractions Clonase II LR mix (Thermo sher, Cat No 11791100). The Gateway LR reaction enriched for cells in G1, S, and G2/M were harvested for protein, RNA, and was incubated at room temperature overnight. pDONR201-p52/LEDGF was pur- chromatin analyses. For protein extraction, cells were lysed in RIPA lysis buffer chased from the DNAsu Plasmid Repository (DNAsu Plasmid Repository Clone: fi (LB) and processed for immunoblotting. For RNA extraction, we used the HsCD00000034). p75/LEDGF cDNA was ampli ed by RT-PCR from NCI-H522 PureLink RNA Kit. Extracted RNA was used for RT-PCR and qRT-PCR analyses, shControl cells and was cloned into pENTR/D-TOPO cloning vector. Cloning to as described in this report. Chromatin analyses were performed using ChIC. The the pENTR/D-TOPO Vector and transfer to the PLX-304-V5-DEST vector were antibodies and primer sets used are described in Supplementary Tables 1 and 2, carried out as described above for the U2AF2 cDNA clones. respectively. Site-directed mutagenesis was carried out using standard PCR-based techniques. Briefly, pairs of overlapping oligonucleotide primers, harboring the desired mutation, were used to amplify plasmids containing the genes we wished to Cell transformation assay. Cell transformation assays in immortalized HBEC mutagenize. To remove the original plasmid from the amplified reaction mix, we hTERT were performed using the Cell Transformation Assay Kit-Colorimetric incubated it with 2 μL of the 6-methyladenine-dendent restriction endonuclease (Abcam Cat No. ab235698). Based on the manufacturer’s protocol, two layers of DpnI,(NEB, Cat. No R0176) at 37 °C for 4 h. Subsequently, the DNA in the agarose were made (base and top layer). Prior to the initiation of the experiment, reaction mix was purified, using the NucleoSpin Gel and PCR Clean-Up kit. At the we performed a cell-dose curve by using seven serial dilutions of cells (twofold) and end, DH5-alpha Electrocompetent Escherichia coli (NEB, Cat. No. C2986) were incubating them for 4 h at 37 °C with WST working solution. After that, the transformed with 5 μL of the purified product, using an Eppendorf Eporator® absorbance at 450 nm was determined and the cell-dose curve was calculated (Eppendorf, Cat. No. 4309000027). All the mutagenized constructs were sequenced y ¼ αx þ β, using linear regression on GraphPad Prism 8.4. In order to perform in the Genomic Shared Resource (GSR) of The Ohio State University (https:// the assay, after solidification of the base agarose layer, 2.5 × 104 HBEC hTERT cells cancer.osu.edu/for-cancer-researchers/resources-for-cancer-researchers/shared- per condition were mixed with top agarose layer in 10× DMEM solution and plated resources/genomics), prior to use. The primers we used for all the mutagenesis in a 96-well plate, in triplicates along with blank wells. The cells were then plated experiments in this report are listed in Supplementary Table 2. for 7 days at 37 °C and monitored for colony formation. After 7 days, the cells were imaged in the Incucyte live-cell imager using the 20× lens. Then, the cells were Transfections and infections. Retroviral constructs were packaged by transient incubated for 4 h on WST working solution at 37 °C. The absorbance at 450 nm transfection in 293T cells, in combination with ecotropic (Eco-pac) or amphotropic was determined with a plate reader. Regarding the analysis, the average of the blank (Ampho-pac) packaging constructs. Lentivirus constructs were also packaged in wells was subtracted from all the readings of the experimental conditions. Then, the HEK-293T cells by transient transfection in combination with the packaging final number of the transformed cells was calculated by inserting the corrected constructs psPax2 (Addgene #12260) and pMΔ2.G (Addgene #12259). values in the cell-dose curve created prior to the experiment.

NATURE COMMUNICATIONS | (2021) 12:4624 | https://doi.org/10.1038/s41467-021-24795-1 | www.nature.com/naturecommunications 19 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-24795-1

Library preparation and RNA-seq. Total RNA was isolated from shControl, downstream applications. The precipitated nuclear fraction was further treated shIWS1, shIWS1/WT-R, and shIWS1/MT-R NCI-H522 cells, using the PureLink with LB2 {(10 mM Tris-HCL (pH 7.5), 20 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, RNA Kit (Invitrogen Cat No 12183018A). RNA samples were analyzed on and 1 mM DTT) and fresh 1× Halt™ Protease and Phosphatase Inhibitor Cocktails}. Advanced Analytical Fragment Analyzer, using an RNA kit for integrity check and The LB2 lysates were again clarified by centrifugation at 12,000 × g for 6 min. The quantification. About 100–500 ng of total RNA from each sample was used as input pellet containing the nuclei, was further lysed with LB3 {(10 mM Tris-HCL (pH for library preparation with the Illumina TruSeq stranded mRNA Library Pre- 7.5), 100 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, 0.1% (w/v) sodium deox- paration Kit (Cat. No. RS-122-2101) and they were individually indexed. Libraries ycholate, and 0.5% (v/v) N-lauroylsarcosine) and fresh 1× Halt™ Protease and were quantified on Fragment Analyzer using a next-generation sequencing (NGS) Phosphatase Inhibitor Cocktails}. The LB3 lysates were sonicated and clarified by kit and the libraries of all the samples were pooled in equal molar concentration. centrifugation at 21,000 × g for 15 min. To validate the fractionation, nuclear and The pooled library was sequenced on an Illumina HiSeq 2500 platform with Rapid cytosolic fractions were analyzed by immunoblotting for the abundance of Lamin V2 chemistry and 100-bp paired-end reads. Sequencing results were demultiplexed A/C and GAPDH. with bcl2fastq and compressed. Demultiplexed fastq file pairs from each sample The cells in the second fraction, were washed twice with TD buffer (135 mM were used for analysis. The whole procedure was performed in the Tufts University NaCl, 5 mM KCl, 0.7 mM Na2HPO4, and 25 mM Tris-HCl) and then lysed using Core Genomic Facility (TUCF-http://tucf-genomics.tufts.edu). TD/1% NP-40/RVC (Ribonucleoside-Vanadyl Complex, NEB, Cat. No. S1402) in All RNA-Seq experiments were performed in duplicate, and average depth of the presence of the Recombinant Ribonuclease Inhibitor RNaseOUT™ (Thermo sequenced samples was 37.5M (±5M fragments). Data preprocessing and Fisher, Cat. No. 10777019). Following incubation on ice for 10 min, and alignment was conducted as previously described76. RNA-Seq libraries were centrifugation at 21,000 × g for 1 min, the supernatant (cytosolic fraction), was quality-checked using FastQC (www.bioinformatics.babraham.ac.uk/projects/ aspirated and kept on ice. The nuclear fraction was washed twice with TD/0.5% fastqc/). Adapters and sequence contaminants were detected and removed using an NP-40/RVC. The RNA from both fractions was isolated using Trizol and a mixture in-house-developed algorithm and additional software such as the Kraken suite and of phenol–chloroform–isoamyl alcohol and it was precipitated with ethanol at −80 Cutadapt77. Paired-end reads were aligned against the human reference genome °C overnight. cDNA was synthesized from 1.0 μg of total RNA, using oligo-dT (GRCh38/hg38) with GSNAP78 spliced aligners. For gene and transcript priming and the QuantiTect Reverse Transcription Kit. Quantitative RT-PCR was annotation, we utilized Ensembl v85 reference database79. carried out as described in the following section.

Differential gene expression and alternative RNA splicing. Gene expression Immunoprecipitation84, immunoblotting, and image acquisition and utiliza- and exon-level expression was calculated by counting reads overlapping meta-gene tion fi and exon features using featureCounts80. DESeq81 and DEXSeq38 were employed . For the immunoprecipitation experiments in this report, we rst fractionated to address differential gene expression and differential exon usage, respectively. cell lysates into nuclear and cytoplasmic fractions, using the protocol described under cell fractionation. LB3 nuclear lysates were sonicated and clarified by cen- trifugation at 21,000 × g for 15 min. About 300 μL of the clarified lysates were Differential gene expression analysis. We performed differential gene expression added to Magnetic beads-Antibody conjugates, which were prepared as follows. analyses using R package DESeq, which utilizes a generalized linear model (GLM) Pierce™ Protein A/G Magnetic Beads (Thermofisher, Cat. No 88803) were washed 3 and is applied directly to raw read counts. When we compared the transcriptomes times, 5 min each, with LB3. Following overnight incubation at 4 °C with the of shIWS1 and shControl cells, we identified 1357 differentially expressed genes immunoprecipitating antibody or the Mouse Isotype Control antibody (Thermo- (p value ≤ 0.01, FDR ≤ 0.2), and when we compared the transcriptomes of shIWS1/ fisher, Cat. No 10400 C), the bead–antibody conjugates were again washed multiple WT-R and shIWS1/MT-R cells, we identified 417 differentially expressed genes times with LB3. (p value ≤ 0.01, FDR ≤ 0.2). Following the addition of 300 μL of the clarified lysates to the antibody–bead conjugates, the mixture was incubated at 4 °C overnight. The agarose bead-bound immunoprecipitates were washed five times, 5 min each, with LB3, and they were Differential exon usage. DEXSeq employs a GLM to model the differential exon electrophoresed (20 μg protein per lane) in SDS-PAGE. Following electrophoresis, usage between sample groups. Pairwise comparison of the transcriptomes of proteins were transferred to polyvinylidene difluoride (PVDF) membranes in 25 shIWS1 and shControl cells with DEXSeq identified 1,434 differentially employed mM Tris, 192 mM glycine. Immunoprecipitated proteins were detected by probing exons, assigned to 851 genes (FDR ≤ 0.05). Pairwise comparison of the tran- the membranes with the relevant antibodies, as described in the following scriptomes of shIWS1/WT-R and shIWS1/MT-R cells, identified 436 differentially paragraphs. To reduce the IgG heavy- and light-chain signal, V5-tagged and utilized exons, assigned to 273 genes (FDR ≤ 0.05). endogenous U2AF65 were immunoprecipitated using a mouse monoclonal Detailed lists of differentially expressed genes and differentially used exons, are antibody and they were detected with a rabbit monoclonal U2AF65 antibody. provided in supplementary files. In both DESeq and DEXSeq analyses, false Antibodies used for immunoprecipitation are listed in Supplementary Table 1. The discovery rate (FDR) was controlled with the Benjamini–Hochberg procedure82. detailed protocol can be found in the online protocol repository84. One hundred and sixty-five genes were identified as both differentially expressed For immunoblotting, cells were lysed using a RIPA LB {50 mM Tris (pH 7.5), and alternatively spliced when we compared the transcriptomes of shIWS1 and 0.1% SDS, 150 mM NaCl, 5 mM EDTA, 0.5% sodium deoxycholate, 1% NP-40, and shControl cells. Similarly, transcriptomic comparison of shIWS1/WT-R and fresh 1× Halt™ Protease and Phosphatase Inhibitor Cocktails (Thermofisher, Cat. shIWS1/MT-R cells identified 44 differentially expressed and alternatively No 78444)}. Lysates were sonicated twice for 30 s and clarified by centrifugation at spliced genes. 18,000 × g for 15 min at 4 °C. The clarified lysates were electrophoresed (20 μg protein per lane) in SDS-PAGE. Electrophoresed lysates were transferred to PVDF Gene-set enrichment analysis (GSEA). For this analysis, we used the GSEA membranes (EMD Millipore Cat No. IPVH00010) in 25 mM Tris and 192 mM v2.0.13 software. All the gene set files were downloaded from GSEA website (www. glycine. Following blocking with 5% nonfat dry milk in TBS and 0.1% Tween-20, broadinstitute.org/gsea/). Enrichment maps were used for visualization of the the membranes were probed with antibodies (at the recommended dilution), GSEA results. Enrichment score and FDR values were applied to sort pathways followed by horseradish peroxidase-labeled secondary antibodies (1:2500), and they enriched after gene set permutations were performed 1000 times for the analysis. were developed with Pierce ECL Western Blotting Substrate (Thermo Scientific, cat. no 32106). The antibodies used for western blotting are listed in Supplementary Table 1. Functional analysis of alternative RNA splicing events . Log2 fold-change values Western blot images were captured, using the Li-Cor Fc Odyssey Imaging of alternative spliced genes whose RNA splicing is differentially affected in shIWS1 System (LI-COR Biosciences, Lincoln, NE). For protein ladder detection, we used vs shControl and shIWS1/WT-R vs shIWS1/MT-R NCI-H522 cells, were imported the 700-nm channel and for protein band detection, we used the in the RStudio framework (V 3.5.2) for the GO analysis. GO analysis was per- 83 chemiluminescence channel. Data were collected using a linear acquisition method. formed, using the Bioconductor GOfuncR software . Alternatively spliced genes All images in this report were captured with the same protocol in order to ensure were annotated, according to their biological process. For each biological process, the comparability of the results from different experiments. Images were exported the number of associated genes and combined score, which is the absolute value of in high-quality image files (600-dpi png files) and they were processed with the X the sum of the Log2 fold change values of each gene associated with the biological Illustrator 2020 (Adobe, San Jose, CA) for figure preparation. The summary figures process, were also calculated. in Figs. 4h, 6j, and Supplementary Fig. 9c were designed in Bio Render using a student plan promo (legacy), and include content from Biorender (https:// Subcellular fractionation. About 5 × 106 cells were trypsinized, following two biorender.com/terms/). washes with ice-cold PBS. Harvested cells were centrifuged at 1200 × g for 5 min and the pellet was resuspended in 1 mL of PBS and aliquoted into two equal fractions, one for protein and the other one for RNA isolation. In the first fraction, qRT-PCR and RT-PCR. Total cell RNA was extracted using the PureLink RNA Kit the cells were lysed using a Triton X-100 cytosolic LB1 {(50 mM Tris-HCL (pH (Invitrogen, Cat. No 12183018A). cDNA was synthesized from 1.0 μg of total RNA, 7.5), 20 mM NaCl, 1 mM EDTA, 0.5% NP-40, 0.25% Triton X-100, 10% Glycerol, using oligo-dT priming and the QuantiTect Reverse Transcription Kit (QIAGEN, and 1 mM DTT) and fresh 1× Halt™ Protease and Phosphatase Inhibitor Cocktails Cat No. 205310). Gene and exon expression was measured by quantitative RT- (Thermofisher, Cat. No 78444)}. Lysates were rotated for 10 min at 4 °C, and PCR, using the iTaq™ Universal SYBR® Green Super mix (Biorad, Cat No. following this, they were clarified by centrifugation at 14,000 × g for 6 min. The 1725121) and a StepOne Plus qRT-PCR instrument (Thermofisher). qRT-PCR supernatant, containing the cytosolic protein fraction, was collected for experiments were performed with three biological replicates in triplicate. Data were

20 NATURE COMMUNICATIONS | (2021) 12:4624 | https://doi.org/10.1038/s41467-021-24795-1 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-24795-1 ARTICLE

2+ normalized to internal controls hGAPDH mRNA, or human 18S rRNA. The primer the addition of 100 mM Ca (CaCl2) and following activation, it digested the sets used for all the real-time PCR assays are listed in Supplementary Table 2. antibody-bound DNA in a reaction which was allowed to proceed for 30 min on ice. The reaction was terminated with the addition of 2× stop buffer (NaCl, 340 mM, EDTA 20 mM pH 8.0, EGTA 4 mM, digitonin 0.1% (wt/vol), RNAse A 0.2 Chromatin immunoprecipitation (ChIP). Attached cells were washed with PBS mg, Glycogen 0.02 mg) and the chromatin fragments were released, following a 10- and then treated with 1% formaldehyde (Sigma, Cat. No F8775) for 15 min at 37 °C min incubation at 37 °C. Subsequently, the chromatin fragments were extracted to cross-link proteins and DNA. The cross-linking reaction was stopped with a 5- using DNA Purification Buffers and Spin Columns (Cell SIgnaling, Cat. No 14209). fi min treatment with 0.125 M glycine ( nal concentration) at room temperature. DNA amplification by quantitative PCR, and data analyses were carried out as Cells were subsequently scraped off the Petri dish and they were washed and lysed described under chromatin immunoprecipitation. by treatment with Nuclear LB {(50 mM Tris (pH 8.0), 10 mM EDTA, and 0.5% SDS) with added fresh 1× Halt™ Protease and Phosphatase Inhibitor Cocktails (Thermofisher, Cat. No 78444)} for 10 min on ice. Cellular lysates were diluted RNA immunoprecipitation. The first step in the RNA immunoprecipitation with IP Dilution buffer (16.7 mM Tris (pH 8.0), 167 mM NaCl, 1.2 mM EDTA, protocol was the cross-linking of proteins with DNA, which was carried out by 1.1% Triton X-100, and 0.01% SDS) to a final volume of 1 mL and sonicated to treating the cells with 1% formaldehyde, as described under ChIP. Following cross- shear the DNA to an average length of 300- to 500-bp fragments. Following linking, the cells were scraped into 1 mL of Phosphate Buffered Saline (PBS)- sonication, the lysates were first centrifuged for 30 min at 18,000 g at 4 °C and the Nuclear Isolation Buffer (sucrose 1.28 M, Tris-HCl 40 mM, MgCl2 20 mM, and 4% supernatants were incubated with protein A and salmon sperm DNA-bound Triton X-100) (ratio 1:1:3). Following this, the cells were washed twice and then agarose beads (Cell Signaling, Cat. No 9863), for 1 h at 4 °C. The precleared lysates lysed with RIP buffer (150 mM KCl, 25 mM Tris-HCl, 5 mM EDTA, 0.5 mM DTT, were incubated overnight with the diluted primary antibody or with the Rabbit and 0.5% NP-40), supplemented with fresh 1× Halt™ Protease and Phosphatase Isotype Control antibody (Thermofisher, Cat. No 10500 C) and following this, they Inhibitor Cocktails (Thermofisher, Cat. No 78444) and RNaseOUT™ Recombinant were incubated with the Pierce™ Protein A/G Magnetic Beads (Thermofisher, Cat. Ribonuclease Inhibitor (Thermo Fisher, Cat. No. 10777019) and the lysates were No 88803) for 4 h at 4 °C. The immunoprecipitates were then washed sequentially kept on ice for 10 min. Subsequently, the lysates were clarified by centrifugation at with the following buffers. A Low Salt Wash Buffer {20 mM Tris (pH 8.0), 2 mM 18,400 x g at 4 °C for 30 min and a fraction of the supernatant was incubated with EDTA, 150 mM NaCl, 1% Triton X-100, and 0.1% SDS}, a High Salt Wash Buffer protein A and salmon sperm DNA-bound agarose beads (Cell Signaling, Cat. No {20 mM Tris (pH 8.0), 2 mM EDTA, 500 mM NaCl, 1% Triton X-100, and 0.1% 9863), for 1 h in 4 °C. The clarified lysates were then incubated with the immu- SDS}, a LiCl Wash Buffer {10 mM Tris (pH 8.0), 1 mM EDTA, 250 mM LiCl, 1% noprecipitating antibody or an isotype control antibody (rabbit isotype control- NP-40, and 1% (w/v) deoxycholic acid and TE buffer (10 mM Tris (pH 8.0), 1 mM Thermofisher, Cat. No 10500 C or mouse isotype Control-Thermofisher, Cat. No EDTA}. The immunoprecipitated DNA was recovered by reversing the cross- 10400C) at 4 °C overnight. The resulting antigen–antibody complexes were incu- linking with NaCl, and following incubation with proteinase K, it was extracted bated with Pierce™ Protein A/G Magnetic Beads (Thermofisher, Cat. No 88803) at with DNA Purification Buffers and Spin Columns (Cell SIgnaling, Cat. No 14209). 4 °C for 4 additional hours and the immunoprecipitates were washed four times The immunoprecipitated DNA of the target loci, was then amplified by quanti- using the RIP buffer. The RNA–protein complexes were eluted in 100 μL of RIP tative PCR, using the sets of primers listed in Supplementary Table 2, the iTaq™ buffer and the RNA was recovered by reverse cross-linking at 70 °C and proteinase Universal SYBR® Green Super mix (Biorad, Cat No. 1725121) and a StepOne Plus K incubation at 55 °C. The RNA was then extracted with qRT-PCR machine (Thermofisher). The same was done with input DNA, isolated phenol–chloroform–isoamyl alcohol and it was precipitated with ethanol at -80 °C from 2% of the pre-cleared nuclear lysate, prior to the immunoprecipitation. Fold overnight, in the presence of yeast tRNA carrier (10 mg/mL). The immunopreci- enrichment was calculated, using the software https://www.sigmaaldrich.com/ pitated RNA fragments and input RNA derived from clarified cell lysates corre- technical-documents/articles/biology/chip-qpcr-data-analysis.html) provided sponding to 2% of the amount of lysate used for RNA IP were reverse-transcribed online by Sigma-Aldrich. The detailed protocol can be found in the online protocol with random hexamers. The abundance of the amplified RNA fragment in the two repository85. pre-mRNA-derived pools was measured by quantitative RT-PCR, carried out in triplicate. The primer sets used in these amplification reactions correspond to intronic and exonic regions of target pre-mRNAs. Amplification reactions were ChIP-seq: library preparation, sequencing, and analysis. ChIP-seq libraries carried out, using the iTaq™ Universal SYBR® Green Super mix (Biorad, Cat No. were generated using NEB Next® Ultra™ II DNA Library Prep Kit for Illumina® 1725121) and a StepOne Plus qRT-PCR machine (Thermofisher). The data were (New England Biolabs, Cat. no. E7645) and following the manufacturer’s protocol. analyzed using software provided online by Sigma-Aldrich. (https://www. ChIP-seq libraries were quality-checked using FastQC (www.bioinformatics. sigmaaldrich.com/technical-documents/articles/biology/chip-qpcr-data-analysis. babraham.ac.uk/projects/fastqc/). High-quality libraries were sequenced on an html). SNRNP-70 binding in the human U1 snRNP gene, using the primers F: 5′- Illumina HiSeq 2500 platform. ChIP-seq experiments were performed in duplicate, GGG AGA TAC CAT GAT CAC GAA GGT-3′,R:5′-CCA CAA ATT ATG CAG and average depth of sequenced samples was 49 M (±5M) 100 bp paired-end reads. TCG AGT TTC CC-3′, was used as the control for RNA IPs. The detailed protocol Sequencing results were demultiplexed with bcl2fastq. Compressed and demulti- can be found in the online protocol repository85. plexed fastq file pairs from each sample were used for analysis. Adapters and sequence contaminants were detected and removed using Cutadapt77. Paired-end reads were aligned against the human reference genome (GRCh38/hg38) using Tumor xenografts Bowtie (version 2.2.6) (with default parameters). Peak discovery was performed Ethics statement. All mouse experiments were approved by the Institutional Animal with HOMER (version 4.6)86. Sonicated input DNA was used as a control for peak Care and Use Committee (IACUC) of the Ohio State University. IACUC protocol discovery. Data snapshots were created using the Integrative Genomic Viewer of number 2018A00000134, PI: Philip N. Tsichlis the Broad Institute (https://software.broadinstitute.org/software/igv/home)87. Sequencing was performed in the DNA sequencing Center of Brigham Young Experimental protocol. A total of 2 × 106 NCI-H1299 cells, 5 × 106 A549 cells, and University (Provo, UTAH) (https://biology.byu.edu/dnasc). 1×107 NCI-H1975 cells were suspended into 30% Matrigel (Corning, Cat. No. 356231) in PBS in a total volume of 200 µL and implanted subcutaneously into the flanks of 6-week-old NSG (NOD.Cg-Prkdcscid-IL2rgtm1Wjl/Scj) mice (left side for 88 Chromatin immunocleavage (ChIC) . NCI-H1299 cells were FACS-sorted based the shControl and right side for the shIWS1 cells). The mice were monitored every on DNA content, as described above. Cell pools enriched for cells in G0/G1, S, and 3 days and the size of the tumors was measured using a digital caliper. The tumor G2/M phase of the cell cycle (5 × 104 cells /pool) were washed multiple times with fi ¼ 1 ν ´ 2 volume was calculated with the modi ed ellipsoid formula: V 2 s (where v is wash buffer (20 mM HEPES pH 7.5, 150 mM NaCl and 0.5 mM Spermidine, fi ™ length and s is width). The mice were sacri ced 4 weeks (NCI-H1299 and NCI- supplemented with fresh 1x Halt Protease and Phosphatase Inhibitor Cocktails). H1975 cells), or 6 weeks (A549 cells) post inoculation. Tumors were resected and In parallel with the preparation of the cells, the antibodies to be used for ChIC their weights were measured. Part of each resected tumor was snap-frozen in liquid (Supplementary Table 1) were attached to activated Magnetic Biomag Plus Con- nitrogen and was kept at −80 °C for RNA and protein isolation. The remainder canavalin A Beads (Bangs Laboratories, Cat. No. BP531). To activate the Magnetic was fixed in 10% (v/v) formalin (Sigma, Cat. No. HT501640) overnight. Subse- Biomag Plus Concanavalin A Beads (Bangs Laboratories, Cat. No. BP531), we fi quently, it was transferred to 70% EtOH and following this, it was embedded in washed the rst multiple times with a binding buffer (20 mM HEPES-KOH, pH paraffin at the Comparative Pathology & Mouse Phenotyping Shared Resource of 7.9, 10 mM KCl, 1 mM CaCl2, and 1 mM MnCl2). The activated beads were mixed the OSUCCC, prior to H and E and immunohistochemistry (IHC) staining. with 50 μL of the primary antibody or Rabbit Isotype Control antibody (Ther- mofisher, Cat. No 10500C) diluted 1/50 in antibody buffer (2 mM EDTA (pH 8.0), 0.1% (wt/vol) digitonin diluted in wash buffer) and the bead-attached primary RNA and protein isolation from mouse xenografts. About 50–100 mg of the antibodies were mixed and incubated with the cell pellet at 4 °C overnight. The frozen mouse xenografts were homogenized in 1 mL of Trizol reagent (Thermo- resulting immunoprecipitates were washed multiple times with wash buffer and fisher Scientific, Cat. No. 15596026). RNA and protein were isolated from the then mixed with 50 μL of a 1/50 dilution of a Guinea Pig anti-Rabbit IgG (Heavy & homogenized samples by following the instructions of the manufacturer. Briefly, Light Chain) secondary antibody (Antibodies-Online, Cat. No. ABIN101961) and 200 µL of chloroform (Sigma, Cat. No. C2432) were added to all the 1 mL Trizol incubated at 4 °C for 4 h. Following multiple washes, the immunoprecipitates were extracts and following mixing, the extracts were centrifuged at 12,000 × g for 15 mixed with micrococcal nuclease (CUTANA™ pAG-MNase. EpiCypher, Cat No. min at 4 °C for phase separation. Following this, the RNA in the aqueous phase was SKU: 15-1116) (final concentration 700 ng/mL), which interacts with the transferred into a new tube, while the organic phase was stored O/N at 4 °C for ABIN101961 secondary antibody. The antibody-bound MNase was activated with protein isolation.

NATURE COMMUNICATIONS | (2021) 12:4624 | https://doi.org/10.1038/s41467-021-24795-1 | www.nature.com/naturecommunications 21 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-24795-1

RNA extraction. RNA was precipitated by mixing the aqueous phase with 0.5 mL of Homogenized samples were moved into chilled microcentrifuge tubes, kept for 40 isopropanol (Fisher Scientific, Cat. No. A416P-4). Following incubation at RT for min on ice, and then centrifuged at 16,000x g for 20 min at 4 °C. Supernatants were 15 min, the aqueous phase/isopropanol mixture was spun at 12,000 × g for 10 min collected in fresh tubes and placed on ice for protein quantification, which was at 4 °C. The RNA pellet was washed with 75% ethanol, followed by a second spin at performed using the BioRad Bradford Reagent (Biorad, Cat. No. 5000001). RNA 7500 × g for 5 min at 4 °C and was dissolved in 30 µL of DEPC-treated water (IBI was extracted by grinding tissue samples as before, in 1 mL Trizol Reagent Scientific, Cat. No. IB42210). (Thermo Fisher, Cat. No. 15596026) and by following the manufacturer’s instructions for subsequent extraction steps. Protein extraction. About 0.3 mL of 100% ethanol was added to the IHC staining of lung adenocarcinomas was done using lung adenocarcinoma interphase–organic phase and the samples were centrifuged at 2000 × g for 5 min at tissue arrays (US Biomax, LC1504). The staining procedures and the analysis of the 4 °C. The proteins in the phenol–ethanol supernatant were then precipitated by data were done as described for the mouse xenografts. Since the tissue arrays fi adding 1.5 mL of isopropanol, followed by incubation at room temperature for 10 contain only two sections from each tumor, the nal score for a given tumor was min and centrifugation at 12,000 × g at 4 °C for 10 min. The protein pellet was the average of the scores for the two sections. washed 3 times in a solution of 0.3 M guanidine hydrochloride (Sigma, Cat. No. SRE0066) in 95% ethanol. Each wash cycle included the resuspension of the pellet Data analysis. Western blot images were imported to ImageJ and the intensity of in the wash solution, a 20-min incubation at room temperature, and centrifugation the bands was measured. The values obtained from this analysis were normalized at 7500 × g at 4 °C for 5 min. After the final guanidine hydrochloride wash, the to tubulin and the normalized values were imported to GraphPad Prism 8.4. pellet was washed again, in 100% ethanol, the ethanol resuspended pellet was Correlation coefficients were calculated using simple linear regression. Following incubated for 20 min at room temperature, and was centrifuged at 7500 × g for 5 this, correlations were visualized in heatmaps. The exact values and statistical min at 4 °C. The protein pellet was dissolved in 200 µL of 1% SDS, supplemented significance of the correlations can be found in Supplementary Table 5. The U2AF2 with a protease and phosphatase inhibitor cocktail (Thermo Fisher Scientific, Cat. E2/E3 ratios, generated from the analysis of the RNAs of the 40 LUADs in our No. 78444). Insoluble material was removed by centrifugation at 10,000 × g for 10 patient cohort and the quantitative data generated from the IHC analyses of the min at 4 °C. Cellular protein and RNA were analyzed by immunoblotting and qRT- tissue microarrays were also imported in Image J and they were analyzed and PCR, respectively. The antibodies and primers we used for these analyses are listed presented as described for the western blot data. The exact values and statistics of Supplementary Tables 1 and 2. the correlation analyses, can also be found in Supplementary Table 5. The information available for all the lung adenocarcinomas in our cohort IHC staining. About 5-μm-thick sections of the paraffin-embedded mouse tumors included sex, age, and clinical stage, and the information available for the tissue were heated to 55 °C for 20 min prior to deparaffinization with xylene (Fisher array cohort, included clinical stage and histologic grade. For the 30 OSU tumor scientific, Cat. No. X3F-1GAL). Following deparaffinization, tissue sections were samples, patient survival was also available. All the tumors in our cohort were also rehydrated by treatment with decreasing concentrations of ethanol, down to dis- analyzed for EGFR and KRAS genetic alterations and, based on this analysis, they fi tilled water. The endogenous peroxidase activity was blocked by treatment with 3% were subclassi ed into EGFR mutant and KRAS mutant subgroups. As mentioned fi in the preceding paragraph, western blot and IHC data were quantified by Image J H2O2 (Fisher Scienti c, Cat. No. H325500) in PBS (pH 7.4), at room temperature for 10 min. This was followed by antigen retrieval via a 30-min treatment at 80 °C, and following normalization, they were imported into GraphPad Prism 8.4. The fi fi with Citrate Buffer, pH 6.0, Antigen Retriever (Sigma, Cat. No. C9999). Subse- rst question we addressed was whether the pathway we de ned with controlled quently, tissues were rinsed with PBS for 5 min and then treated with normal goat experiments in cultured cells was also active in naturally occurring human lung fi blocking serum at room temperature for 20 min. Subsequent steps were carried out adenocarcinomas. This was addressed by measuring the correlation coef cients using the Vectastain Elite ABC Universal kit peroxidase (Vector Laboratories, Cat. between IWS1 and phosphor-IWS1 and multiple mediators and targets of the No. PK-6200). Briefly, following treatment with the blocking serum, and a 5-min pathway, as described in the preceding paragraph. The correlations were presented wash with PBS at room temperature, the tissues were incubated with the primary in heatmaps and although correlations were observed in all the tumors, they were fi antibody diluted in PBS with 2.5% serum at 4 0C overnight. Subsequently, the signi cantly more robust in EGFR mutant tumors. To determine the clinical tissues were rinsed with PBS for 5 min at room temperature, and incubated with importance of the IWS1 phosphorylation pathway, we examined the activity of the the biotinylated Universal Antibody for 30 min, also at room temperature. Fol- pathway in Stage I vs Stage II/III tumors and in Grade 2 vs Grade 3 tumors. The lowing an additional single, room-temperature 5-min wash with PBS, the tissues surrogate markers for the activity of the pathway were the levels of phosphor-IWS1 fi were treated with the Vectastain Elite ABC reagent for 30 min at room tempera- and the U2AF2 E2/E3 ratio, both of which were quanti ed and imported to ture, to enhance the signal. An additional single 5-min wash with PBS was followed GraphPad Prism 8.4, as described in the preceding paragraph. To determine by a 2–10 min incubation with a DAB peroxidase substrate solution (Vector whether the activity of the pathway impacts patient survival, we divided the tumors Laboratories, Cat. No. SK-400) according to the manufacturer’s instructions. At the in our cohort into high and low phosphor-IWS1 tumors and into high and low end, the slides were washed with tap water and covered with the DPX mounting U2AF2 E2/E3 ratio tumors. Survival curves of tumors in the high and low – 90 medium (Sigma, Cat. No. 06522). The primary antibodies used for staining are categories were generated, using the Kaplan Meir method and they were fi listed in Supplementary table 1. statistically compared using the log-rank test. A tumor was identi ed as a high All IHC images were captured on a Nikon eclipse 50i microscope with attached IWS1 or high E2/E3 ratio tumor, if the value of these parameters in the given Axiocam 506 color camera using the ZEN 2.6 blue edition software (Zeiss). Imaging tumor was higher than the mean value, plus one standard deviation over the mean + fi files were imported to ImageJ89 for analysis. Using the native freehand function of (mean 1 SD). All tumors below this mark were classi ed as low phosphor-IWS1, the software, the signal derived from glandular areas of the tumor was measured or as low U2AF2 E2/E3 ratio tumors. and was divided by the surface area occupied by these glandular areas. For each fi fi sample, at least ve different sections of the tumor were scanned. The nal score for Publicly available LUAD datasets and analyses each tumor was the average value of these measurements. To ensure that the The Cancer Genome Atlas (TCGA) dataset. TCGA data were downloaded from analysis was unbiased, the protocol described above was followed for the analysis of https://portal.gdc.cancer.gov/. Overall, 658 TCGA-LUAD patients (all stages), 162 all the images generated in the course of all the experiments in this report. TCGA-LUAD (all stages) with KRAS mutations, and 67 TCGA-LUAD with EGFR mutations were obtained with their survival and clinical data (clinical stage, metas- tasis, and survival). A total of 516 out of 658 had RNA-seq data. For the expression of Human tumor samples SRSF1, 128 out of 658 had RNA-Seq data. Figures for the correlation analysis were Tumor procurement and analysis. Thirty LUAD samples with matching NAT were generated using the visualization tools of the Xena browser (http://xena.ucsc.edu/.) obtained from the Tissue Bank of The Ohio State University, under the universal and GraphPad Prism 8.4. The clinical stage and Metastasis analysis for the TCGA- consenting and biobanking protocol, Total Cancer Care (TCC). TCC is the single LUAD patients was performed as described above. The Survival analysis was per- protocol used by the Oncology Research Information Exchange Network (ORIEN), formed in GraphPad Prism 8.4 with a Kaplan–Meier method and log-rank p statistics, which was formed through a partnership between OSUCCC-James and the Moffitt with similar approach to the cut-off decision as described above. Cancer Center (Tampa, FL). For more information, please see the Biospecimen Core Services facility of The Ohio State University Comprehensive Cancer Center (https://cancer.osu.edu/for-cancer-researchers/resources-for-cancer-researchers/ Gene Expression Omnibus (GEO) repository datasets shared-resources/biospecimen-services) and the ORIEN project (https://cancer.osu. GSE141685. This dataset includes RNA-Seq data of brain metastases derived from edu/for-cancer-researchers/resources-for-cancer-researchers/orien). six LUAD patients, and 24 primary nonmetastatic LUADs, independent of stage Ten additional consented LUADs without matching normal tissue had been and mutational status (12 early- and 12 late- stage LUADs). The primary LUADs obtained earlier from the tissue bank of Tufts Medical Center. The latter had also were obtained from the TCGA dataset and the TCGA ID of these patients, along been used in an earlier study on the role of IWS1 in NSCLC (Sanidas et al.22). All with their clinical information are presented in Supplementary Table 6. The FPKM tumor samples were provided to this study as unidentified samples. (fragments per kilobase per million fragments) values were downloaded for further Frozen tissues were grinded on dry ice into very small pieces, which were then analysis. First, we compared the IWS1 FPKM values in the primary tumors and the transferred into chilled 2-mL round-bottom Eppendorf tubes. Protein was metastatic lesions and we presented them in violin plots. Next, we calculated the extracted by adding 500 μL of ice-cold NP-40 LB (50 mM Tris-HCl, pH 7.5, 150 U2AF2 E2/E3 and the FGFR2 E8/E9 ratios, as described under the TCGA dataset mM NaCl, 0.5% Triton X, and 1 mM EDTA, pH 8, supplemented with a protease and we presented them also in violin plots. Using GraphPad Prism 8.4, we and phosphatase inhibitor cocktail (Thermo Scientific Cat Nr 78442)) followed by visualized FPKM values of genes in the IWS1 pathway in the primary tumors and homogenization of the tissue fragments with an electric homogenizer, on ice. the metastatic lesions, as heatmaps.

22 NATURE COMMUNICATIONS | (2021) 12:4624 | https://doi.org/10.1038/s41467-021-24795-1 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-24795-1 ARTICLE

The exon expression profiles of the TCGA and GSE141685 LUAD samples were References measured experimentally using the Illumina HiSeq 2000 RNA Sequencing platform. 1. Harper, J. W. & Bennett, E. J. Proteome complexity and the forces that drive Exons were mapped to the human genome, using UCSC Xena unc_RNAseq_exon proteome imbalance. Nature 537, 328–338 (2016). probeMap. Exon-level transcription estimation was presented in RPKM values 2. Louhichi, A., Fourati, A. & Rebaï, A. IGD: a resource for intronless genes in + (reads per kilobase per million mapped reads). A log2 (RPKM 1) exon expression the human genome. Gene 488,35–40 (2011). matrix was then imported to the RStudio framework (V 3.5.2) for the selection and 3. Pan, Q., Shai, O., Lee, L. J., Frey, B. J. & Blencowe, B. J. Deep surveying of export of the values of exons 2 and 3 of U2AF2 and exons 8 and 9 of FGFR2. alternative splicing complexity in the human transcriptome by high- throughput sequencing. Nat. Genet. 40, 1413–1415 (2008). GSE1321370. A LUAD gene expression dataset, based on the Agilent microarray 4. Manning, K. S. & Cooper, T. A. The roles of RNA processing in translating technology. In total, 117 tumors were divided into two groups, one with high (n = genotype to phenotype. Nat. Rev. Mol. Cell Biol. 18, 102–114 (2017). 59) and another one with low (n = 58) probability of relapse. Fourteen of these 5. Agosto, L. M. & Lynch, K. W. Alternative pre-mRNA splicing switch controls tumors harbored KRAS mutations and 45 harbored EGFR mutations. To determine hESC pluripotency and differentiation. Genes Dev. 32, 1103–1104 (2018). the significance of the IWS1 phosphorylation pathway in tumor relapse, we 6. Paronetto, M. P., Passacantilli, I. & Sette, C. Alternative splicing and cell examined the expression of IWS1, Sororin, CDK1, and Cyclin B1 in patients with survival: from tissue homeostasis to disease. Cell Death Differ. 23, 1919–1929 high and low probability of relapse. Data were presented as heatmaps, which were (2016). generated, using GraphPad Prism 8.4. 7. Singh, R. N. & Singh, N. N., Mechanism of splicing regulation of spinal muscular atrophy genes. In RNA Metabolism In Neurodegenerative Disease. GSE2693991. A LUAD gene expression dataset based on the Agilent microarray 31–61 (Springer, Cham, 2018). technology. The microarray data of individual tumors were linked to patient sur- 8. Jayasinghe, R. G. et al. Systematic analysis of splice-site-creating mutations in vival data. Based on the microarray data and using the criteria described under cancer. Cell rep. 23, 270–281 (2018). “data analysis” in the “human tumor samples” section, tumors were first placed 9. Giaj Levra, M., Andre Olaussen, K., Novello, S. & Soria, J. C. PARP inhibitors: into high or low IWS1 subgroups. The low IWS1 subgroup contained 17 tumors an interesting pathway also for non-small cell lung cancer? Curr. Pharm. Des. with a KRAS mutation and 27 tumors with an EGFR mutation and the high 20, 3875–3882 (2014). IWS1 subgroup contained 18 tumors with a KRAS mutation and 26 with an EGFR 10. Listerman, I., Sapra, A. K. & Neugebauer, K. M. Cotranscriptional coupling of mutation. Survival curves of patients with low and high IWS1 tumors were gen- splicing factor recruitment and precursor messenger RNA splicing in – erated, using the Kaplan Meier methodology and Log-rank statistics. Gene mammalian cells. Nat. Struct. Mol. Biol. 13, 815 (2006). expression values in Agilent two color arrays were expressed as the log2 ratios of 11. Luco, R. F. et al. Regulation of alternative splicing by histone modifications. the two color signals. These normalized and background-corrected values were Science 327, 996–1000 (2010). imported from the microarray dataset into the RStudio-integrated development 12. Pradeepa, M. M., Sutherland, H. G., Ule, J., Grimes, G. R. & Bickmore, W. A. environment (IDE) (V 3.5.2) for analysis. Log2 ratios for IWS1, CDCA5, CDC2, and fi Psip1/Ledgf p52 binds methylated histone H3K36 and splicing factors and CCNB1 were exported from the RStudio IDE into an Excel le and they were used contributes to the regulation of alternative splicing. PLoS genet. 8, e1002717 to generate heatmaps, violin plots, or Kaplan–Meier survival curves. (2012). 13. Oltean, S. & Bates, D. O. Hallmarks of alternative splicing in cancer. Oncogene 71 GSE123903 . A set of single-cell RNA-Seq (scRNA-seq) data derived from the 33, 5311–5318 (2014). analysis of tumor cells from a patient-derived LUAD mouse model. The normal- 14. Dominguez, D. et al. An extensive program of periodic alternative splicing ized scRNA-seq counts were retrieved and analyzed for IWS1 and CDCA5 linked to cell cycle progression. Elife 5, e10288 (2016). expression. The data were visualized using the Barnes–Hut approximate version of α 92 15. Burns, C. G. et al. Removal of a single -tubulin gene intron suppresses cell t-SNE (https://github.com/lvdmaaten/bhtsne). cycle arrest phenotypes of splicing factor mutations in Saccharomyces cerevisiae. Mol. Cell. Biol. 22, 801–815 (2002). Statistics and reproducibility. The experiments in Fig. 1f–g and Supplementary 16. Moore, M. J., Wang, Q., Kennedy, C. J. & Silver, P. A. An alternative splicing Figs. 1e–i; 2a–c; 3a–d, a–c; 4a–c, a–g; 5a–g, b–h; 6a–d, a–i; 7a–f; 8b, c; 9a, b were network links cell-cycle control to apoptosis. Cell 142, 625–636 (2010). performed in a minimal of three independent biological experiments. The data in 17. Ahn, E. Y. et al. SON controls cell-cycle progression by coordinated regulation Fig. 7 (mouse xenografts) were performed once, using five mice/group. Western of RNA splicing. Mol. cell 42, 185–198 (2011). blots of the LUAD samples in 8a was performed two times. The IHC staining 18. Suvorova, E. S. et al. Discovery of a splicing regulator required for cell cycle experiments of the mouse xenografts in Supplementary Fig. 10c and of the human progression. PLoS Genet. 9, e1003305 (2013). tissue arrays, were performed once, using the antibodies and techniques listed in 19. Coomer, A. O., Black, F., Greystoke, A., Munkley, J. & Elliott, D. J. Alternative the Methods section. All the attempts at replications were successful. Statistical splicing in lung cancer. Biochimica et Biophysica Acta (BBA)-Gene Regulatory analyses were done using GraphPad Prism 8.4. All the statistical analyses can be Mechanisms 1862, 194388 (2019). found in the Mendeley dataset where the source data of this report were 20. Yuan, M., Huang, L. L., Chen, J. H., Wu, J. & Xu, Q. The emerging treatment 93 deposited . landscape of targeted therapy in non-small-cell lung cancer. Signal Transduct. Target. Ther. 4,1–14 (2019). Reporting summary. Further information on research design is available in the Nature 21. de Groot, P. M., Wu, C. C., Carter, B. W. & Munden, R. F. The epidemiology Research Reporting Summary linked to this article. of lung cancer. Transl. Lung Cancer Res. 7, 220 (2018). 22. Sanidas, I. et al. Phosphoproteomics screen reveals akt isoform-specific signals linking RNA processing to lung cancer. Mol. Cell 53, 577–590 (2014). Data availability 23. Li, Z. et al. Pro-apoptotic effects of splice-switching oligonucleotides targeting – All the raw data for Figs. 1 8 (uncropped gel images, qPCR, FACS, plate reader, and Bcl-x pre-mRNA in human glioma cell lines. Oncol. Rep. 35, 1013–1019 proliferation data, processed counts of RNA-seq, Junction-seq splicing platform for (2016). visualization of the exon usage results, and microscope images) have been deposited in 24. Todaro, M. et al. CD44v6 is a marker of constitutive and reprogrammed 93 Mendeley in two independent publicly available datasets (Laliotis et al. ). In addition, all cancer stem cells driving colon cancer metastasis. Cell Stem Cell 14, 342–356 fi the uncropped WB and PCR gels are provided in the source data le. These datasets also (2014). include data analyses and p values. The main uncropped full scans are provided as source 25. Calabretta, S. et al. Modulation of PKM alternative splicing by PTBP1 data with the paper. The latest version of Ensembl database can be downloaded through promotes gemcitabine resistance in pancreatic cancer cells. Oncogene 35, the Ensembl project (http://useast.ensembl.org/info/data/ftp/index.html) or github 2031–2039 (2016). (https://github.com/Ensembl/ensemblhive). All the RNA-seq and ChIP-seq data in this 26. Imielinski, M. et al. Mapping the hallmarks of lung adenocarcinoma with report have been deposited in the Gene Expression Omnibus (GEO), under the massively parallel sequencing. Cell 150, 1107–1120 (2012). Superseries accession number GSE166955 (RNAseq:GSE166953, ChIP-seq: 27. Seiler, M. et al. H3B-8800, an orally available small-molecule splicing GSE165954). Source data are provided with this paper. modulator, induces lethality in spliceosome-mutant cancers. Nat. Med. 24, 497 (2018). Code availability 28. Shen, H. & Green, M. R. A pathway of sequential arginine-serine-rich All the code used for the analysis in this report is derived from previously published domain-splicing signal interactions during mammalian spliceosome assembly. – reports. It is also explained and cited in the appropriate materials and methods or Mol. Cell 16, 363 373 (2004). supplementary experimental procedures sections. 29. Glasser, E., Agrawal, A. A., Jenkins, J. L. & Kielkopf, C. L. Cancer-associated mutations mapped on high-resolution structures of the U2AF2 RNA recognition motifs. Biochemistry 56, 4757–4761 (2017). Received: 30 July 2020; Accepted: 5 March 2021; 30. Li, J. et al. OTUB2 stabilizes U2AF2 to promote the Warburg effect and tumorigenesis via the AKT/mTOR signaling pathway in non-small cell lung cancer. Theranostics 9, 179 (2019).

NATURE COMMUNICATIONS | (2021) 12:4624 | https://doi.org/10.1038/s41467-021-24795-1 | www.nature.com/naturecommunications 23 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-24795-1

31. de Miguel, F. J. et al. Identification of alternative splicing events regulated 60. David, C. J., Boyne, A. R., Millhouse, S. R. & Manley, J. L. The RNA by the oncogenic factor SRSF1 in lung cancer. Cancer Res. 74, 1105–1115 polymerase II C-terminal domain promotes splicing activation through (2014). recruitment of a U2AF65–Prp19 complex. Genes Dev. 25, 972–983 (2011). 32. Chan, S. P., Kao, D. I., Tsai, W. Y. & Cheng, S. C. The Prp19p-associated 61. Wang, J. et al. Silencing of CDCA5 inhibits cancer progression and serves as a complex in spliceosome activation. Science 302, 279–282 (2003). prognostic biomarker for hepatocellular carcinoma. Oncol. Rep. 40, 33. Chan, S. P. & Cheng, S. C. The Prp19-associated complex is required for 1875–1884 (2018). specifying interactions of U5 and U6 with pre-mRNA during spliceosome 62. Shen, A. et al. Cell division cycle associated 5 promotes colorectal cancer activation. J. Biol. Chem. 280, 31190–31199 (2005). progression by activating the ERK signaling pathway. Oncogenesis 8,19 34. Hogg, R., McGrail, J. C. & O'Keefe, R. T. The function of the NineTeen (2019). Complex (NTC) in regulating spliceosome conformations and fidelity during 63. Nguyen, M. H. et al. Phosphorylation and activation of cell division cycle pre-mRNA splicing. Biochem. Soc. Trans. 38, pp. 1110–1115 (2010). associated 5 by mitogen-activated protein kinase play a crucial role in human 35. Chanarat, S. & Sträßer, K. Splicing and beyond: the many faces of the Prp19 lung carcinogenesis. Cancer Res. 70, 5337–5347 (2010). complex. Biochim. Biophys. Acta (BBA) Mol. Cell Res. 1833, 2126–2134 64. Guo, Y. J. et al. ERK/MAPK signalling pathway and tumorigenesis. Exp. Ther. (2013). Med. 19, 1997–2007 (2020). 36. Watrin, E., Demidova, M., Watrin, T., Hu, Z. & Prigent, C. Sororin pre‐ 65. Wee, P. & Wang, Z. Epidermal growth factor receptor cell proliferation mRNA splicing is required for proper sister chromatid cohesion in human signaling pathways. Cancers 9, 52 (2017). cells. EMBO Rep. 15, 948–955 (2014). 66. Kent, L. N. & Leone, G. The broken cycle: E2F dysfunction in cancer. Nat. Rev. 37. Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based Cancer 19, 326–338 (2019). approach for interpreting genome-wide expression profiles. Proc. Natl Acad. 67. Begum, J. et al. A method for evaluating the use of fluorescent dyes to track Sci. USA 102, 15545–15550 (2005). proliferation in cell lines by dye dilution. Cytometry A 83, 1085–1095 (2013). 38. Anders, S., Reyes, A. and Huber, W. Detecting differential usage of exons from 68. Dronamraju, R. et al. Set2 methyltransferase facilitates cell cycle progression RNA-seq data. Genome Res. 22, 2008-17 (2012). by maintaining transcriptional fidelity. Nucleic Acids Res. 46, 1331–1344 39. Hacker, K. E. et al. Structure/Function analysis of recurrent mutations in (2018). SETD2 protein reveals a critical and conserved role for a SET domain residue 69. Kim, C. H. et al. Identification of EGFR mutations by immunohistochemistry in maintaining protein stability and histone H3 Lys-36 trimethylation. J. Biol. with EGFR mutation–specific antibodies in biopsy and resection specimens Chem. 291, 21283–21295 (2016). from pulmonary adenocarcinoma. Cancer Res. Treat. 47, 653 (2015). 40. Greer, E. L. & Shi, Y. Histone methylation: a dynamic mark in health, disease 70. Tomida, S. et al. Relapse-related molecular signature in lung adenocarcinomas and inheritance. Nat. Rev. Genet. 13, 343 (2012). identifies patients with dismal prognosis. J. Clin. Oncol. 27, 2793–2799 (2009). 41. Hyun, K., Jeon, J., Park, K. & Kim, J. Writing, erasing and reading histone 71. Laughney, A. M. et al. Regenerative lineages and immune-mediated pruning lysine methylation. Exp. Mol. Med. 49, e324 (2017). in lung cancer metastasis. Nat. Med. 26, 259–269 (2020). 42. Lucio-Eterovic, A. K. et al. Role for the nuclear receptor-binding SET domain 72. Ishiguro, K. I. The cohesin complex in mammalian meiosis. Genes Cells 24, protein 1 (NSD1) methyltransferase in coordinating lysine 36 methylation at 6–30 (2019). histone 3 with RNA polymerase II function. Proc. Natl Acad. Sci. 107, 73. Dreier, M. R., Bekier, M. E. & Taylor, W. R. Regulation of sororin by Cdk1- 16952–16957 (2010). mediated phosphorylation. J. Cell Sci. 124, 2976–2987 (2011). 43. Qiao, Q. et al. The structure of NSD1 reveals an autoregulatory mechanism 74. Kagey, M. H. et al. Mediator and cohesin connect gene expression and underlying histone H3K36 methylation. J. Biol. Chem. 286, 8361–8368 (2011). chromatin architecture. Nature 467, 430–435 (2010). 44. Berdasco, M. et al. Epigenetic inactivation of the Sotos overgrowth syndrome 75. Liu, P. et al. Cell-cycle-regulated activation of Akt kinase by phosphorylation gene histone methyltransferase NSD1 in human neuroblastoma and glioma. at its carboxyl terminus. Nature 508, 541–545 (2014). Proc. Natl Acad. Sci. USA 106, 21830–21835 (2009). 76. Vlachos, I. S. et al. DIANA-mirExTra v2.0: uncovering microRNAs and 45. Rahman, S. et al. The Brd4 extra terminal domain confers transcription transcription factors with crucial roles in NGS expression data. Nucleic Acids activation independent of pTEFb by recruiting multiple proteins, including Res. 44, W128–W134 (2016). NSD3. Mol. Cell. Biol. 31, 2641–2652 (2011). 77. Martin, M. Cutadapt removes adapter sequences from high-throughput 46. Brown, M. A., Sims, R. J., Gottlieb, P. D. & Tucker, P. W. Identification and sequencing reads. EMBnet. J. 17,10–12 (2011). characterization of Smyd2: a split SET/MYND domain-containing histone H3 78. Wu, T. D. & Nacu, S. Fast and SNP-tolerant detection of complex variants and lysine 36-specific methyltransferase that interacts with the Sin3 histone splicing in short reads. Bioinformatics 26, 873–881 (2010). deacetylase complex. Mol. Cancer 5, 26 (2006). 79. Aken, B. L. et al. Ensembl 2017. Nucleic Acids Res. 45, D635–D642 (2017). 1. 47. Gregory, G. D. et al. Mammalian ASH1L is a histone methyltransferase that 80. Liao, Y. et al. featurecounts: an efficient general purpose program for assigning occupies the transcribed region of active genes. Mol. Cell. Biol. 27, 8466–8479 sequence reads to genomic features. Bioinformatics 30, 923–930 (2014). (2007). 81. Anders, S. & Huber, W. Differential expression analysis for sequence count 48. Berry, W. L. & Janknecht, R. KDM4/JMJD2 histone demethylases: epigenetic data. Genome Biol. 11, R106 (2010). regulators in cancer cells. Cancer Res. 73, 2936–2942 (2013). 82. Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical 49. Cloos, P. A. et al. The putative oncogene GASC1 demethylates tri-and and powerful approach to multiple testing. J. R. Stat. Soc. B. 57, 289–300 dimethylated lysine 9 on histone H3. Nature 442, 307 (2006). (1995). 50. Li, W. et al. Histone demethylase JMJD2B is required for tumor cell 83. Grote, S. GOfuncR: Gene ontology enrichment using FUNC. R package proliferation and survival and is overexpressed in gastric cancer. Biochem. version 1.12.0 Bioconductor. Open Source Software for Bioinformatics https:// Biophys. Res. Commun. 416, 372–378 (2011). www.bioconductor.org/packages/release/bioc/html/GOfuncR.html (2021). 51. Cai, L. et al. An H3K36 methylation-engaging Tudor motif of polycomb-like 84. Mohammed, H. et al. Rapid immunoprecipitation mass spectrometry of proteins mediates PRC2 complex targeting. Mol. Cell 49, 571–582 (2013). endogenous proteins (RIME) for analysis of chromatin complexes. Nat. 52. Vezzoli, A. et al. Molecular basis of histone H3K36me3 recognition by the Protoc. 11, 316 (2016). PWWP domain of Brpf1. Nat. Struct. Mol. Biol. 17, 617–619 (2010). 85. Laliotis, G. I., Tsichlis, P. N. Effective identification of protein-protein 53. Li, F. et al. The histone mark H3K36me3 regulates human DNA mismatch interaction using RIME-IP. Protocols.io, https://doi.org/10.17504/protocols.io. repair through its interaction with MutSα. Cell 153, 590–600 (2013). bqx8mxrw. Effective identification of DNA-bound protein complexes using 54. Vermeulen, M. et al. Quantitative interaction proteomics and genome-wide Chromatin Immunoprecipitation. Protocols.io, https://doi.org/10.17504/ profiling of epigenetic histone marks and their readers. Cell 142, 967–980 protocols.io.bjj6kkre. Effective identification of RNA-binding proteins using (2010). RNA Immunoprecipitation. Protocols.io, https://doi.org/10.17504/protocols. 55. Ferris, A. L. et al. Lens epithelium-derived growth factor fusion proteins io.bjpbkmin (2020). redirect HIV-1 DNA integration. Proc. Natl Acad. Sci. USA 107, 3135–3140 86. Heinz, S. et al. Simple combinations of lineage-determining transcription (2010). factors prime cis-regulatory elements required for macrophage and B cell 56. Shun, M. C. et al. LEDGF/p75 interacts with mRNA splicing factors and identities. Mol. Cell 38, 576–589 (2010). targets HIV-1 integration to highly spliced genes. Genes Dev. 29, 2287–2297 87. Robinson, J. et al. Integrative genomics viewer. Nat. Biotechnol. 29,24–26 (2015). (2011). 57. Paz, I., Kosti, I., Ares, M. Jr, Cline, M. & Mandel-Gutfreund, Y. RBPmap: a 88. Skene, P. J., Henikoff, J. G. & Henikoff, S. Targeted in situ genome-wide web server for mapping binding sites of RNA-binding proteins. Nucleic Acids profiling with high efficiency for low cell numbers. Nat. Protoc. 13, 1006 Res. 42, W361–W367 (2014). (2018). 58. Anczuków, O. et al. SRSF1-regulated alternative splicing in breast cancer. Mol. 89. Schneider, C. A., Rasband, W. S. & Eliceiri, K. W. NIH Image to ImageJ: 25 Cell 60, 105–117 (2015). years of image analysis. Nat. Methods 9, 671–675 (2012). 59. Millevoi, S. et al. An interaction between U2AF 65 and CF Im links the 90. Goel, M. K., Khanna, P. & Kishore, J. Understanding survival analysis: splicing and 3′ end processing machineries. EMBO J. 25, 4854–4864 (2006). Kaplan-Meier estimate. Int. J. Ayurveda Res. 1, 274 (2010).

24 NATURE COMMUNICATIONS | (2021) 12:4624 | https://doi.org/10.1038/s41467-021-24795-1 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-24795-1 ARTICLE

91. Wilkerson, M. D. et al. Differential pathogenesis of lung adenocarcinoma advised on the biology of lung cancer and on the design of experiments, provided cell lines, subtypes involving sequence mutations, copy number, chromosomal and reagents. V.C. contributed to the overall experimental design. P.N.T. conceived and instability, and methylation. PloS ONE 7, p. e36530 (2012). initiated the project, contributed to the experimental design, supervised the work and 92. Maaten, L. V. D. & Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. monitored its progress, and wrote the paper, together with G.I.L. Res. 9, 2579–2605 (2008). “ 93. Laliotis, G. I. et al. IWS1 phosphorylation promotes cell proliferation and Competing interests predicts poor prognosis in EGFR mutant lung adenocarcinoma patients, through the cell cycle-regulated U2AF2 RNA splicing”. Mendeley Data, The authors declare no competing interests. https://doi.org/10.17632/y82yx8gfkj.4 (2020). Additional information Supplementary information The online version contains supplementary material Acknowledgements available at https://doi.org/10.1038/s41467-021-24795-1. The authors wish to thank all the members of the Tsichlis Lab for helpful discussions. We also thank Drs. Carlo M. Croce, Michael Freitas, Wayne Miles, Raphael Pollock, Amanda Correspondence and requests for materials should be addressed to G.I.L. or P.N.T. Toland, Samir Achaya, and Joal Beane for critically reviewing the paper before the Peer review information Nature Communications thanks Klemens J Hertel and the submission. This work was supported by the National Institute of Health grants other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer R01CA186729 to P.N.T., and R01 CA198117 to P.N.T and V.C, and by the National Institutes of Health/National Cancer Institute P30 Grant CA016058 to the Ohio State reviewer reports are available. University Comprehensive Cancer Center (OSUCCC). G.I.L was supported by a Pelo- tonia Post-Doctoral fellowship from OSUCCC. Reprints and permission information is available at http://www.nature.com/reprints

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in Author contributions published maps and institutional affiliations. G.I.L. conceived and performed experiments, analyzed data, prepared figures, and con- tributed to the writing of the paper. E.C. designed and performed the mouse xenograft experiments, including the characterization of the tumors, performed the IHC experiments Open Access This article is licensed under a Creative Commons on human TMAs, and provided comments contributing to the writing of the paper. M.D.P. Attribution 4.0 International License, which permits use, sharing, performed bioinformatics analyses of RNA-seq data that led to the identification of alter- adaptation, distribution and reproduction in any medium or format, as long as you give natively spliced targets of the IWS1 pathway. A.D.K. performed experiments, under the appropriate credit to the original author(s) and the source, provide a link to the Creative supervision of G.I.L. A.L.F. performed bioinformatics analyses of microarray and TCGA Commons license, and indicate if changes were made. The images or other third party data. S.S. contributed to the FACS experiments. V.A. performed and analyzed the pro- material in this article are included in the article’s Creative Commons license, unless liferation experiments. S.A. performed and supervised bioinfomatics analyses. A.O. prepared indicated otherwise in a credit line to the material. If material is not included in the extracts of human tumors and provided comments on the paper. K.A.N. performed article’s Creative Commons license and your intended use is not permitted by statutory experiments, under the supervision of G.I.L. V.T. prepared the cells and the mRNA for the regulation or exceeds the permitted use, you will need to obtain permission directly from RNA-Seq experiment. I.V. Bioinformatics analyses of RNA-seq data. M.C. prepared extracts the copyright holder. To view a copy of this license, visit http://creativecommons.org/ of human tumors used in this study A.H. Provided supervision for the bioinformatics licenses/by/4.0/. analyses of the RNA-seq data. D.P. provided technical advice on several experiments in this paper and contributed to the design of these experiments. Provided comments on the paper. C.T. advised on the design of experiments. L.S. advised on the design of experiments. D.P.C. © The Author(s) 2021

NATURE COMMUNICATIONS | (2021) 12:4624 | https://doi.org/10.1038/s41467-021-24795-1 | www.nature.com/naturecommunications 25