Published OnlineFirst March 2, 2020; DOI: 10.1158/1078-0432.CCR-20-0184

CLINICAL CANCER RESEARCH | PRECISION MEDICINE AND IMAGING

RNA Splicing Alterations Induce a Cellular Stress Response Associated with Poor Prognosis in Acute Myeloid Leukemia

Govardhan Anande1, Nandan P. Deshpande2, Sylvain Mareschal3, Aarif M. N. Batcha4,5, Henry R. Hampton1, Tobias Herold6,7, Soren Lehmann3,8, Marc R. Wilkins2, Jason W.H. Wong1,9, Ashwin Unnikrishnan1, and John E. Pimanda1,10,11

ABSTRACT

Purpose: RNA splicing is a fundamental biological process that generates diversity from a finite set of genes. Recurrent somatic mutations of splicing factor genes are common in some hematologic cancers but are relatively uncommon in acute myeloid leukemia (AML, <20% of patients). We examined whether RNA splicing differences exist in AML, even in the absence of splicing factor mutations.

Experimental Design: We developed a pipeline to study alternative RNA splicing in RNA-sequencing data from large cohorts of patients with AML.

Results: We have identified recurrent differential alternative splicing between patients with poor and good prognosis. These splicing events occurred even in patients without any discernible splicing factor mutations. Alternative splicing recurrently occurred in genes with specific molecular functions, primarily related to protein translation. Developing tools to predict the functional impact of alternative splicing on the translated protein, we discovered that approximately 45% of the splicing events directly affected highly conserved protein domains. Several splicing factors were themselves misspliced and the splicing of their target transcripts was altered. Studying differential expression in the same patients, we identified that alternative splicing of protein translation genes in ELNAdv patients resulted in the induction of an integrated stress response and upregulation of inflammation-related genes. Finally, using machine learning techniques, we identified a splicing signature of four genes which refines the accuracy of existing risk prognosis schemes and validated it in a completely independent cohort.

Conclusions: Our discoveries therefore identify aberrant alternative splicing as a molecular feature of adverse AML with clinical relevance.

1Adult Cancer Program, Lowy Cancer Research Centre & Prince of Wales Clinical School, University of New South Wales Sydney, New South Wales, Australia. 2School of Biotechnology and Biomolecular Sciences, University of New South Wales Sydney, New South Wales, Australia. 3Hematology Centre, Karolinska University Hospital and Department of Medicine, Karolinska Institutet, Huddinge, Stockholm, Sweden. 4Institute of Medical Data Processing, Biometrics and Epidemiology, Faculty of Medicine, LMU Munich, Munich, Germany. 5Data Integration for Future Medicine, LMU Munich, Munich, Germany. 6Department of Medicine III, University Hospital, LMU Munich, Munich, Germany. 7Research Unit Apoptosis in Hematopoietic Stem Cells, Helmholtz Zentrum München, German Research Center for Environmental Health, Munich, Germany. 8Department of Medical Sciences, Uppsala University, Uppsala, Sweden. 9School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong. 10Department of Pathology, School of Medical Sciences, University of New South Wales Sydney, New South Wales, Australia. 11Department of Haematology, Prince of Wales Hospital, Sydney, New South Wales, Australia.

Note: Supplementary data for this article are available at Clinical Cancer Research Online (http://clincancerres.aacrjournals.org/).

A. Unnikrishnan and J.E. Pimanda contributed equally as co-senior authors of this article.

Corresponding Authors: Ashwin Unnikrishnan, UNSW Sydney, Room 217, Level 2, C25 Lowy Cancer Research Centre, UNSW Sydney, Sydney, NSW 2052, Australia. Phone: 612-9385-8045; E-mail: [email protected]; and John E. Pimanda, [email protected]

Clin Cancer Res 2020;XX:XX–XX
doi: 10.1158/1078-0432.CCR-20-0184
©2020 American Association for Cancer Research.

Introduction

Acute myeloid leukemia (AML) is a hematologic malignancy associated with a poor prognosis and a <30% 5-year survival rate (1). With an incidence rate of 4 per 100,000 adults per year (2) and a 5-fold higher rate in people over the age of 65, AML represents approximately 40% of all new adult-onset leukemias in developed societies (3). AML is characterized by the clonal proliferation of undifferentiated myeloid precursor cells in the bone marrow and impaired hematopoiesis (4). Patients with AML have recurrent somatic driver mutations (5–7) in addition to characteristic cytogenetic and chromosomal abnormalities. These alterations have prognostic significance and are used to classify AML (5). However, not all of these mutations are exclusive to AML, with many also being detected in myelodysplastic syndrome (MDS; ref. 8) as well as in healthy individuals with age-related clonal hematopoiesis (9).

The standard-of-care treatment for AML is intensive induction chemotherapy. However, despite complete remission (CR) rates of >50%, long-term disease-free survival remains poor at <10%, with a median overall survival of less than 12 months in patients aged over 60 years (10). In addition, because of significant comorbidities, intensive chemotherapy may not suit older patients (11). Alternate therapies for these individuals may include lower intensity treatments, DNA hypomethylating agents (HMA; ref. 12), or targeted therapies. However, response rates and survival benefits still remain poor (13), highlighting an important need to develop new therapeutic options for the management of AML.

To develop more effective drugs for AML, it is necessary to better understand the molecular aberrations present in leukemic cells. Aberrations in RNA splicing, a fundamental and highly conserved process


Downloaded from clincancerres.aacrjournals.org on October 1, 2021. © 2020 American Association for Cancer Research.


occurring in >95% of multi-exon human genes (14), are increasingly being described in many cancers. Pan-cancer studies have begun to reveal that tumors have an average of approximately 20% more alternative splicing events than matched healthy tissues (15, 16). Splicing is a cotranscriptional event, orchestrated by cis-acting regulatory elements as well as trans-acting factors of the spliceosomal complex. Dysregulation of the expression of splicing factors (17) and upstream signaling pathways (18), as well as genomic mutations in cis-splice sites (19), have all been reported in cancers. In addition, in hematologic malignancies such as MDS and chronic myelomonocytic leukemia, recurrent somatic mutations in members of the E- and A-splicing complexes, such as SF3B1, U2AF1, SRSF2, and ZRSR2, are detected in >50% of patients (20). The exact mechanisms through which these mutations contribute to the malignancy remain poorly understood. U2AF1 and SF3B1 mutations might alter the 3′ splice site in target transcripts (21, 22), SRSF2 hotspot mutations affect the preferred binding motif on transcripts (23, 24), while ZRSR2 loss-of-function mutations increase intron retention (25). The different mutations have been proposed to affect unique sets of genes (21, 24), although convergence has been proposed at the level of pathways (26) or through the induction of R-loops (27).

Somatic mutations in the splicing machinery are less frequent in AML, however. Analyses of large cohorts of patients have determined the overall frequency of splicing mutations to be <20% (6, 7). However, widespread dysregulation of RNA splicing has been observed even in cancers with low frequencies of splicing factor mutations (15, 16). We therefore examined whether RNA-splicing alterations exist in AML even in the absence of somatic splicing factor mutations, and whether they correlate with disease outcome.

Translational Relevance

Utilizing cytogenetic and mutational information, the European LeukemiaNet (ELN) algorithm is the clinical standard for prognosis in AML. However, there is considerable room for improvement, especially in patients classified as intermediate-risk for whom treatment is challenging. The 4-gene splicing signature that we have discovered improves the accuracy of classification, converting the existing three-group risk classification (favorable, intermediate, and adverse risk) into essentially two groups with significantly different overall survival. This will facilitate improved treatment decisions to be made for patients. Our findings also reveal new molecular vulnerabilities that can be potential drug targets for the treatment of AML, a disease that currently has poor overall outcomes (<30% 5-year survival rate). Direct pharmacologic inhibition of splicing factors is potentially challenging clinically due to the toxicity of the drugs. Our data suggest that targeting integrated stress response or pathways stimulated as a consequence of missplicing in leukemic cells could be an alternative approach.

Materials and Methods

Patient cohorts

Data from two adult AML cohorts were used in the discovery phase of the study: The Cancer Genome Atlas (TCGA)-AML cohort (6) and the Clinseq-AML cohort (28). Data from the Beat-AML cohort (7) were used to validate significance of the splicing signature. Full details are provided as Supplementary Data.

RNA-sequencing data analyses

RNA-sequencing (RNA-seq) data were analyzed for multiple types of alternative splicing using a custom in-house bioinformatics pipeline incorporating available tools, including Mixture of Isoforms (MISO) to determine Percent Spliced In (PSI) values in each sample and rMATS for differential splicing analyses. Differential expression analyses were performed using DESeq2. A custom in-house pipeline was developed to identify possible changes in well-annotated protein domains due to differentially spliced events. Full details are provided as Supplementary Data.

Transcript motif analyses

Predictions of differential binding of RNA-binding proteins were made using rMAPS (29). Maximum entropy modeling was done with MaxEntScan (30). Full details are provided as Supplementary Data.

Prognostic model generation

The splicing signature was generated using LASSO Cox Regression with 10-fold cross-validation implemented in glmnet (R package v 2.0-16). The splicing risk score for each patient was calculated from the regression coefficients. Performances of prognostic models were assessed by Harrell C index. Risk contributions and variable importance of all prognostic models were estimated as described previously (31). Full details are provided as Supplementary Data.

Results

Identification of differential alternative splicing related to outcome in patients with AML

To determine whether RNA splicing alterations might be a factor in adverse outcomes in AML, we developed a bioinformatics pipeline to quantify differential alternative splicing in RNA-seq data. We first analyzed AML data from The Cancer Genome Atlas (TCGA) (6). To detect whether splicing alterations can exist even in the absence of any somatic mutations in splicing factors, we focused our analyses on patients who did not have any splicing factor mutations using the available somatic mutation data (6). To validate the mutational data, we queried the RNA-seq data to detect the presence of known splicing factor hotspot mutations (20) in transcripts. We confirmed the mutation data in all patients identified to have splicing factor mutations and identified an additional 16 patients with splicing factor hotspot mutations detectable in the RNA-seq data (fourteen with SRSF2 and one each with SF3B1 and ZRSR2). Our findings of TCGA-AML patients with unannotated splicing mutations are consistent with a recent independent report (32).

We stratified patients in the TCGA-AML cohort using the widely used European LeukemiaNet (ELN) prognostic scheme (refs. 1, 4; Fig. 1A) and restricted the analysis to patients who received intensive induction chemotherapy and for whom full clinical data were available (n = 104, Supplementary Fig. S1A; Supplementary Table S1). Performing differential splicing analyses between ELNFav and ELNAdv (Fig. 1B), studying the five different types of alternative splicing events (schematized in Fig. 1C), we identified 1,288 differentially spliced events (at FDR ≤ 0.05) in 910 genes (Fig. 1D; Supplementary Table S2). A majority of the events involved the differential skipping (or retention) of exons preferentially in one set of patients (n = 716, 55.6%, Fig. 1D). Of these, 395 events involved preferential skipping of exons in ELNAdv patients (Supplementary Fig. S1C), with the remaining (n = 321) associated with exon skipping in ELNFav patients. An example of differential exon usage was the skipping of exon 37 of MYO9B in TCGA-ELNFav patients (Fig. 1E). Only reads spanning exon 36–exon


[Figure 1 graphic not reproduced. Panels: A, overall survival (years) in TCGA, ELNFav (n = 41) vs. ELNAdv (n = 21), P = 0.0187. B, analysis workflow. C, types of splicing events: skipped exon (SE), retained intron (RI), alternative 5′ splice site (A5′SS), alternative 3′ splice site (A3′SS), mutually exclusive exons (MXE). D, TCGA differentially spliced events (n = 1,288): SE 716 (55.6%), RI 201 (15.6%), MXE 185 (14.4%), with the two alternative splice site classes at 98 (7.6%) and 88 (6.8%). E, MYO9B exon skipping. F, CDK10 intron retention. G, overall survival (years) in Clinseq, ELNFav (n = 47) vs. ELNAdv (n = 75), P = 0.0006. H, Clinseq differentially spliced events (n = 2,484): SE 1,217 (49.0%), MXE 854 (34.4%), RI 232 (9.3%), with the two alternative splice site classes at 92 (3.7%) and 89 (3.6%). I, overlap of differentially spliced genes: TCGA only 688, shared 222, Clinseq only 1,344. J, pathway enrichment: EIF2 signaling, mTOR signaling, regulation of eIF4 and p70S6K, ILK signaling, signaling by Rho family GTPases, ERK/MAPK signaling.]

Figure 1. Identification of differential alternative splicing in patients with AML. A, Kaplan–Meier survival analysis of ELN stratification in the TCGA AML cohort showing significant survival differences between ELNFav and ELNAdv patients. P value computed using log-rank (Mantel–Cox) test. B, Schematic outline to analyze RNA-seq data for differential alternative splicing. C, Different alternative splicing events detected. Exons affected are represented in gray, while up- and downstream exons are shown in brown and green, respectively. Introns are represented as a black line, and depicted as a solid thick black line when retained. D, Distribution of differentially spliced events identified comparing ELNFav and ELNAdv in the TCGA AML cohort. SE, skipped exons; RI, retained introns; MXE, mutually exclusive exons; A5′SS, alternative 5′ splice sites; A3′SS, alternative 3′ splice sites. E, Sashimi plots of an alternative exon skipping event in the MYO9B gene in the TCGA data, with examples from representative patients shown. Sequencing reads support the skipping of exon 37 (boxed) in two representative ELNFav patients (#2914, #2955, orange tracks), while indicating exon inclusion in ELNAdv patients (#2855, #2817, red tracks). Lines connecting each exon represent splice junctions and numbers on each line indicate the number of supporting RNA-seq reads. F, Sashimi plots of a differential intron retention event in the CDK10 gene in the TCGA data from representative example patients. Differential retention of intron 8 (boxed) is observed in ELNAdv patients (exemplified by representative patients #2014, #2849, red tracks) and not observed in ELNFav patients (as seen in representative patients #2811, #2835, orange tracks). Lines connecting each exon represent splice junctions and numbers on each line indicate the number of supporting RNA-seq reads. G, Kaplan–Meier survival analysis of ELN stratification in the Clinseq AML cohort showing significant survival differences between three ELN subgroups. P value computed using log-rank (Mantel–Cox) test. H, Distribution of differentially spliced events identified comparing ELNFav and ELNAdv in the Clinseq AML cohort. Abbreviations as in D. I, Venn diagram depicting the overlap of differentially spliced genes in both cohorts. Bar plots represent the distribution of alternative splicing events in the shared set of genes. Abbreviations as in D. J, Bubble plots of Ingenuity Pathway Analysis of the differentially spliced genes, in TCGA (red), Clinseq (green), and shared genes (gray). The size of each bubble corresponds to significance of enrichment.
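The junction read counts annotated on Sashimi plots such as panels E and F can be turned into a rough percent-spliced-in (PSI) estimate for a skipped exon. The sketch below is only an illustration under stated assumptions: the study reports using MISO, which fits a statistical model over all reads, whereas this is a plain junction-count ratio, and the function name `psi_skipped_exon` is hypothetical. The counts are those quoted in the Results text for the MYO9B exon 37 event.

```python
def psi_skipped_exon(upstream_incl, downstream_incl, skip):
    """Junction-count PSI estimate for a skipped exon (simplified; not MISO).

    upstream_incl   -- reads joining the upstream exon to the alternative exon
    downstream_incl -- reads joining the alternative exon to the downstream exon
    skip            -- reads joining upstream and downstream exons directly
    Inclusion is supported by two junctions, so their counts are averaged.
    """
    inclusion = (upstream_incl + downstream_incl) / 2
    total = inclusion + skip
    if total == 0:
        return None  # no informative reads for this event
    return inclusion / total


# MYO9B exon 37 junction counts as quoted in the Results text.
patients = {
    "#2855 (ELNAdv)": (128, 114, 29),
    "#2817 (ELNAdv)": (41, 61, 24),
    "#2914 (ELNFav)": (0, 0, 63),
    "#2955 (ELNFav)": (0, 0, 46),
}

for name, counts in patients.items():
    print(name, round(psi_skipped_exon(*counts), 3))
```

With these counts the two ELNAdv patients come out well above zero while the two ELNFav patients come out at zero, matching the qualitative description of exon 37 inclusion versus skipping.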

38 were detected in ELNFav patients (representative examples, patients #2914 and #2955; Fig. 1E). In ELNAdv patients (representative patients #2855 and #2817, Fig. 1E), however, there was an increase in the number of reads indicating the inclusion of exon 37 (128 and 41 reads joining exons 36 and 37, and 114 and 61 reads joining exons 37–38, in ELNAdv patients #2855 and #2817, respectively, Fig. 1E; compared with no reads in the ELNFav patients) and a concomitant decrease in reads spanning exon 36 and 38, skipping exon 37 entirely (29 and 24 reads, respectively, compared with 63 and 46 in ELNFav patients #2914 and #2955; Fig. 1E). A related phenomenon, of mutually exclusive exon usage, where adjacent exons are alternately used, contributed to 185 differential events (Fig. 1D). The retention of introns was the next most prevalent class (n = 201, 15.6%, Fig. 1D) in TCGA-ELNAdv patients, as seen in the representative example of the retention of intron 8 in CDK10 (increased intron-specific reads and a decrease in exon–exon reads in ELNAdv patients, Fig. 1F). Additional examples of


differential 3′ or 5′ splice site usages are shown in Supplementary Figs. S1D and S1E, respectively.

To validate these findings, we analyzed an independent cohort of patients with AML from the Scandinavian Clinseq study (28). Selecting the patients similarly to the TCGA cohort (Supplementary Fig. S1B; Supplementary Table S1), we performed differential splicing analyses between Clinseq-ELNFav (n = 47) and ELNAdv (n = 75) patients (Fig. 1G). We detected a total of 2,484 alternative splicing events (FDR ≤ 0.05), affecting 1,566 genes (Fig. 1H; Supplementary Table S3). As in the TCGA cohort, the majority of the events in the Clinseq data were skipped exons (n = 1,217, 49.0%, Fig. 1H; Supplementary Fig. S1C). Mutually exclusive exon usage was the next most prevalent (n = 854, 34.4%) followed by intron retention (n = 232, 9.3%, Fig. 1H). Comparing both cohorts, we found differential splicing events occurring in the same direction, that is, enriched either in ELNAdv patients in both cohorts, or in ELNFav patients in both cohorts, in 222 genes (Fig. 1I; Supplementary Table S4). Of these, 93 splicing events (in 78 genes) were identical in both cohorts, which we define as Class A events. A second class, Class B (244 events/173 genes in TCGA, 424 events/182 genes in Clinseq), affected the same gene and with the same directionality but represented different splicing events or occurred at different locations within the gene in the two cohorts. In 19 genes, we observed splicing occurring in opposite directions between the two cohorts.

To determine the molecular impact of this alternative splicing, we performed pathway analyses. Ingenuity Pathway Analysis of the Class A plus Class B genes revealed enrichment for a number of pathways, including those with functions related to protein translation or intracellular signaling (Fig. 1J; Supplementary Table S5). Pathways related to protein translation remained enriched when considering Class A or B genes separately (Supplementary Fig. S1F). Orthogonal gene ontology–based analyses also supported these findings, with enrichment for pathways related to protein translation and RNA processing (Supplementary Fig. S1G). Our data reveal recurrent and shared alternative splicing differences between patients with AML with good or poor prognosis in two independent cohorts, converging on specific molecular pathways.

Prediction of the functional consequences of alternative splicing

Analogous to genetic mutations, we expected that while some splicing events would have potentially deleterious effects on subsequent protein translation, others might be silent. To identify deleterious splicing events, we developed a custom bioinformatics pipeline (described in Materials and Methods). Briefly, the chromosomal coordinates of each splicing event were used to generate nucleotide sequences for the spliced and unspliced transcripts. These were then in silico translated and the generated primary sequences were scanned to predict the protein effect (Fig. 2A).

Using this methodologic framework, we predict that 26% of the Class A plus Class B events (n = 87, Fig. 2B) cause a complete loss of well-annotated protein domains. The majority of these events involved intron retention events that alter the (Fig. 2B). An example of this is the retention of intron 8 of the splicing regulator HNRNPH1 disrupting the RNA Recognition Motif (RRM) protein domain (Fig. 2C) and an altered transcript predicted to trigger nonsense-mediated decay. An additional 20% of events (n = 67) lead to a partial loss of protein domains (Fig. 2B). The functional consequences of a partial loss of a domain are harder to predict a priori and likely to be protein-specific. Furthermore, of the remaining 181 events

[Figure 2 graphic not reproduced. Panels: A, informatics pipeline from alternative splicing events to predicted protein effect (loss of domain, frameshifts, truncations). B, predicted consequence: complete domain loss 25.97%, partial loss 20%, unknown 54.03%, with occurrence by event type (SE, RI, MXE, A5′SS, A3′SS). C, HNRNPH1 Sashimi plot with the RRM 1 domain marked. D, pathway enrichment for genes with complete domain loss: EIF2 signaling, mTOR signaling, regulation of eIF4 and p70S6K, signaling by Rho family GTPases, ILK signaling, ERK/MAPK signaling. E, clustering of TCGA and Clinseq patients into adverse-like and favorable-like groups, with per-gene mutation P values for NPM1, DNMT3A, FLT3, IDH1, NRAS, RUNX1, TP53, CEBPA, PTPN11, KIT, WT1, IDH2, and TET2.]

Figure 2. Analysis of the predicted impact of alternative splicing on protein function. A, Schematic of the analytic pipeline to identify potentially deleterious alternative splicing events. B, Pie chart distribution of protein domain prediction results. Bar plots on the right indicating the distribution of alternative splicing events predicted to lead to a complete loss of protein domains. C, Sashimi plot of a representative protein domain disruption event caused by intron retention in HNRNPH1 gene with examples from representative patients shown. Intron 11 is differentially retained in patients with ELNFav AML (exemplified by the two representative tracks at the top, orange), disrupting the RRM1 domain. Lines connecting flanking exons represent splice junctions and the numbers on each line indicate the number of supporting RNA-seq reads. D, Bubble plot of Ingenuity Pathway Analysis of genes with predicted complete loss of domains. The size of each bubble corresponds to significance of enrichment. E, Nonnegative matrix factorization clustering of patients with AML based on similarity of splicing [percent spliced in (PSI) values] of the shared splicing events, in the TCGA (left) and Clinseq (right) cohorts. Patients were classified as “adverse-like” (red), or “favorable-like” (orange) based on clustering. Oncoprints below denote somatic mutations identified in the patients. P values (Fisher exact test) are shown for TCGA (left) and Clinseq (right), with events with P < 0.05 in bold.
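The idea behind the pipeline in panel A, translating the spliced and unspliced transcripts in silico and comparing the resulting proteins, can be sketched for a single intron retention event. This is a minimal illustration under stated assumptions, not the authors' pipeline: the sequences are hypothetical toy sequences, the function names are invented for the example, translation simply starts at the first base, and no protein domain annotation is consulted.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate from the first base until the first stop codon."""
    protein = []
    for pos in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[pos:pos + 3].upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def retained_intron_effect(upstream_exon, intron, downstream_exon):
    """Compare proteins from the spliced and intron-retaining transcripts."""
    spliced = translate(upstream_exon + downstream_exon)
    retained = translate(upstream_exon + intron + downstream_exon)
    # A stop codon that ends before the downstream exon begins is premature.
    stop_end = 3 * len(retained) + 3
    return {
        "spliced_protein": spliced,
        "retained_protein": retained,
        "frame_preserved": len(intron) % 3 == 0,
        "premature_stop": stop_end <= len(upstream_exon) + len(intron),
    }


# Hypothetical toy sequences; the intron carries an in-frame TGA stop codon.
result = retained_intron_effect("ATGGCTGAA", "GTAAGTTGA", "CCTGGAAAATAA")
print(result)
```

In a fuller version, the two protein sequences would then be scanned against annotated domains to decide whether a domain is lost completely, partially, or not at all.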


(54%) of unknown consequence (Fig. 2B), we cannot rule out that some may also affect protein function, through altering protein secondary structure or unannotated domains. Focusing on the events leading to a complete domain loss, pathway analysis revealed that proteins affected by aberrant splicing are still enriched for specific molecular functions, including protein translation (Fig. 2D), which we previously observed (Fig. 1J). Our results suggest that alternative splicing changes leading to predicted protein dysfunction in genes involved in protein translation recurrently occur in patients with AML.

Analysis of the upstream drivers of alternative splicing differences

We next sought to understand the underlying reasons for the splicing differences in these risk groups. We investigated whether any of the somatic driver mutations (in the absence of splicing factor mutations) are correlated with the alternative splicing differences. We clustered patients based on the similarity of their splicing of Class A and B events and assessed the enrichment for somatic mutations within the clusters (Fig. 2E). Apart from NPM1, TP53 and CEBPA mutations, which are intrinsic to the ELN classification algorithm, no other somatic driver mutation showed any statistically significant correlation with the splicing groups (Fig. 2E).

Among the alternatively spliced genes, we observed there were RNA-binding proteins including factors with known roles in RNA splicing (Fig. 3A). These factors form a tightly interconnected network with multiple known protein–protein interactions (Fig. 3B), suggesting that missplicing of these factors could trigger a cascade of splicing alterations in patients with AML. To find evidence for this, we performed motif-scanning analyses (29) of the Class A plus B set of commonly differentially spliced transcripts to determine whether they might be targets for the misspliced splicing factors. We focused on the exon skipping events that formed the majority of the shared differential splicing events across both cohorts (Fig. 1I). Eighty-seven exons were recurrently differentially skipped in ELNAdv patients compared with ELNFav patients in both cohorts, while 94 exons were differentially retained in ELNAdv patients. To determine motif overenrichment, we identified a control set of nondifferentially spliced exons (n = 9,986). We then performed motif scanning analyses of the transcript

[Figure 3 graphic not reproduced. Panel A table:

Gene name   Splicing type   Predicted functional change   FDR
HNRNPA1     SE/MXE          Non-functional in Adv         0.0007
CIRBP       RI              Non-functional in Adv         0.0017
SRRM1       RI              Non-functional in Adv         0.0010
PABPC1      MXE             Non-functional in Adv         0.0454
PCBP2       MXE             Functional in Adv             0.0004
PTBP1       RI              Functional in Adv             0.0330
RBM8A       RI              Functional in Adv             0.0006
HNRNPC      A3'SS           Functional in Adv             0.0394
MATR3       A5'SS           Functional in Adv             0.0166

Other panels: B, interaction network of the splicing factors listed in A. C, HNRNPA1 motif [AGT]TAGGG[AT], mean motif score and negative log10(P) across a meta-exon for skipped and retained exons. D, HNRNPC motif [ACT]TTTTT[GT], same representation. E, sequence logos of donor sites for exons retained in Adv and exons skipped in Adv. F, density of Shapiro score for donor sites of skipped exons and background (bg) exons.]

Figure 3. Analysis of the upstream drivers of differential alternative splicing in AML. A, List of splicing factors that are commonly differentially spliced across both cohorts. The type of splicing, predicted effect on the protein and FDR are shown. B, Interaction network indicating validated protein–protein interactions (edges) between the differentially spliced, with predicted functional impairment, splicing factor genes (nodes). C, Motif scanning analysis for HNRNPA1 binding sites across a meta-exon generated from the differentially spliced events, with an arrow indicating a peak of significant overenrichment. Motif enrichment scores (left axis) and P values (right axis) are shown. The dashed lines indicate scores of skipped (red) and retained (blue) exons, while the black solid line indicates that of a background score from all nondifferentially spliced exons. The green horizontal line is set at P = 0.05. D, Motif scanning analysis for HNRNPC across a meta-exon generated from all differentially spliced events. Representation similar to C. Arrows indicate peaks of significant overenrichment. E, LOGO analyses of splice donor sites of exons differentially retained (left) or skipped (right) in ELNAdv patients. Analysis is within a 9-base window across the intron–exon junction (3 bases in exon and 6 bases in intron). F, Smoothed density estimates of the position weight matrices (Shapiro score) of the splice donor sites of all differential exon-skipping events. Skipped exons (blue) and background exons (gray) are displayed, illustrating weaker splice sites in the skipped exons.
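Motif scanning of the kind summarized in panels C and D can be illustrated with a simple count of degenerate motif matches in sequences flanking exons. This is a rough sketch under stated assumptions, not a reimplementation of rMAPS: it uses the two motif patterns shown in the figure ([AGT]TAGGG[AT] for HNRNPA1 and [ACT]TTTTT[GT] for HNRNPC), treats them as regular expressions, reports hits per kilobase rather than windowed scores with P values, and the function names and example sequences are hypothetical.

```python
import re

HNRNPA1_MOTIF = "[AGT]TAGGG[AT]"
HNRNPC_MOTIF = "[ACT]TTTTT[GT]"


def motif_positions(sequence, motif):
    """Return start positions of all (possibly overlapping) motif matches."""
    # A lookahead lets the regex engine report overlapping occurrences.
    pattern = re.compile(f"(?=({motif}))")
    return [match.start() for match in pattern.finditer(sequence.upper())]


def motif_density(sequences, motif):
    """Motif hits per kilobase over a collection of flanking sequences."""
    total_hits = sum(len(motif_positions(seq, motif)) for seq in sequences)
    total_bases = sum(len(seq) for seq in sequences)
    return 1000 * total_hits / total_bases if total_bases else 0.0


# Hypothetical flanking sequences for a test set and a background set.
test_flanks = ["CCATAGGGACC", "GGTTAGGGTCA"]
background_flanks = ["CCCCGGGGCCCC", "ACGTACGTACGT"]

print(motif_density(test_flanks, HNRNPA1_MOTIF))
print(motif_density(background_flanks, HNRNPA1_MOTIF))
```

Comparing densities between the differentially spliced set and the background set mirrors the comparison drawn in the figure, although a real analysis would also need a position-resolved score and a significance test.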


sequences flanking the differentially spliced exons (n = 181), comparing against the nondifferentially spliced set. These analyses indicated that a well-conserved binding motif for HNRNPA1 is significantly overrepresented flanking the 94 exons differentially retained in ELNAdv AML compared with nondifferentially spliced exons (blue dotted line compared with black line, P < 0.0002; Fig. 3C). Our informatics pipeline predicted that the detected alternative splicing in HNRNPA1, a multifunctional splicing regulator that is known to act as a splicing repressor (33), would produce a nonfunctional protein in ELNAdv patients. Consistent with this, exon inclusion in ELNAdv patients was higher within transcripts where HNRNPA1 would normally bind and repress splicing (Fig. 3C). We extended the motif scan analyses to HNRNPC, a splicing factor whose physiologic function is to repress exon inclusion (34) and which is predicted by our informatics analyses to be nonfunctional in ELNFav AML. Consistent with our predictions, we observed a significant overenrichment for HNRNPC-binding motifs flanking the 87 exons that were differentially retained in ELNFav AML patients compared with the background set of nondifferentially spliced exons (dotted red line vs. black line, Fig. 3D).

Extending our analyses of the sequence determinants of differential splicing further, we observed increased usage of noncanonical bases at the donor (Fig. 3E) and acceptor (Supplementary Fig. S2A) splice sites adjacent to exons differentially spliced in ELNAdv patients (n = 181) when compared with the background set of nondifferentially spliced exons (n = 9,986). Exons differentially skipped in ELNAdv patients (n = 87) also have weaker splice donor sites compared with background exons (Fig. 3F). To determine whether dysregulation of other RNA-binding factors (in addition to the ones we have already predicted above) might be contributing to differential splicing, we performed a systematic evaluation of 114 RNA-binding factors using a catalog of well-characterized binding motifs (29). We extended the analyses to include the entire set of Class A and B events beyond just skipped exons. We found overenrichment of motifs for PABPC1, an RNA-binding protein recently proposed to have roles in RNA splicing (35), and for RBM46 specifically adjacent to the 94 exons that were differentially retained in ELNAdv patients compared with ELNFav patients in both the TCGA and Clinseq cohorts (blue dotted line, Supplementary Fig. S2B and S2C). Similarly, introns that were preferentially retained in ELNAdv patients were enriched for SRSF3 binding at the 3′ end (Supplementary Fig. S2D). These RNA-binding factors are also known to have protein–protein interactions with other splicing proteins predicted by our analyses to be affected by differential splicing (Supplementary Fig. S2E). Our results suggest that missplicing of splicing factors, together with specific biophysical properties of cis-factors, contribute to the alternative splicing differences we have observed in patients with AML.

Induction of an integrated stress response in patients with ELNAdv AML
Our analyses had indicated that genes related to protein translation were differentially spliced (Fig. 1J), with predicted functional impairment (Fig. 2D) in patients with ELNAdv AML. A cellular consequence of impaired protein translation would be the induction of the Integrated Stress Response (ISR) within cells (36). To find evidence in support of this, we performed differential gene expression analyses between ELNFav and ELNAdv patients (Fig. 4A). A total of 2,219 genes were differentially expressed in the TCGA cohort at FDR < 0.05 (Fig. 4B; Supplementary Table S6) and 1,710 genes in Clinseq (Fig. 4B; Supplementary Table S6). GSEA analyses of the differentially expressed genes clearly indicate a strong upregulation of ISR genes (37) in patients with ELNAdv AML in both cohorts (Fig. 4C). In addition, individual patient analyses revealed a proportional trend between the strength of the induction of ISR gene expression and the extent of missplicing of protein translation genes within the same patient (Fig. 4D). ELNAdv patients, who had higher levels of expression of ISR target genes, tended to have higher levels of missplicing of protein translation genes (Fig. 4D).

It has recently been shown that metabolic stresses that decrease protein synthesis, including amino acid deprivation, trigger a proinflammatory response (38). Pathway analyses of the 602 genes that were commonly differentially expressed in the same direction in both cohorts (i.e., either upregulated in ELNAdv in both cohorts, or downregulated in both cohorts, Supplementary Table S5) revealed an enrichment for a number of inflammation-related pathways (Fig. 4E). We find an upregulation of these stress-induced inflammatory genes in patients with ELNAdv AML in whom protein translation is impacted due to splicing (Fig. 4F). Network analyses further confirm strong interconnections between the misspliced translation-related genes and the differentially expressed proinflammation genes in ELNAdv AML (Fig. 4G). Our data support a scenario where missplicing of protein translation genes triggers a proinflammatory stress response in ELNAdv patients.

Determining the prognostic relevance of alternative splicing events
Given these findings, we examined whether alternative splicing could serve as a prognostic marker for adverse outcomes in AML. While gene expression and epigenetic studies have been previously linked to AML outcome (39, 40), these analyses would have missed the impact of alternative splicing. Utilizing machine-learning techniques (schematized in Fig. 5A; see Supplementary methods for more details), we identified four genes (MYO9B, GAS5, GIGYF2, RPS9, Fig. 5B) whose differential alternative splicing could stratify AML patients with good and poor prognosis. The differential splicing of these four genes ("splicing signature") performs comparably with the ELN in both cohorts (Fig. 5C), with similar Harrell C-index (31; Supplementary Fig. S3A) where a C-index of 50% is equivalent to a random assignment and 100% represents a correct ranking of the survival times of all patients.

More accurate stratification and improved prognosis would especially benefit patients with AML classified as intermediate-risk, a group of patients with response and survival rates intermediate to ELNFav and ELNAdv. Accurately identifying ELNInt patients with the most severe risk prognosis would aid in treatment decisions made in the clinic. Equally, ELNInt patients with a predicted favorable prognosis could be treated appropriately. As the mutations and cytogenetics-based ELN, gene expression–based LSC17 signature (40), and our splicing signature represent complementary biological measurements all with the potential to contribute to disease severity, we investigated their combined potential to more accurately classify patients with AML. Addition of the splicing signature to the ELN or LSC17 alone improved the accuracy in both the TCGA (Supplementary Fig. S3B and S3C) and Clinseq cohorts (Supplementary Fig. S3D and S3E), with higher C-indices for the combined signatures (Supplementary Fig. S3F and S3G). Applied together, the combination of the three signatures improved the accuracy of classification of patients with AML, converting the three-group risk classification to essentially two groups with significantly different overall survival in both cohorts (Fig. 5D).

To independently validate the prognostic significance of the splicing signature, we also analyzed data from the BEAT-AML cohort (7). Selecting patients for the TCGA and Clinseq cohorts (Supplementary


[Figure 4, panels A to G; image not reproduced. Values recoverable from the panels are listed here.]
A, Workflow: AML patients (ELN Fav, ELN Adv); RNA-seq data analyses; alternative splicing; differential gene expression.
B, Volcano plots for TCGA and Clinseq; x-axis, log2 (fold change); y-axis, -log10 (P).
C, Integrated stress response genes: TCGA, NES: -1.998, FDR: 0; Clinseq, NES: -1.711, FDR: 0.0011.
D, Heatmaps of ISR gene expression (ELN Fav vs. ELN Adv) for TCGA and Clinseq, with splicing of protein translation genes shown below (more to less missplicing).
E, Venn diagram of differentially expressed genes: TCGA only, 1,617; shared, 602; Clinseq only, 1,108. Pathway names with -log (P): Natural Killer Cell Signaling, 4.12; Th1 and Th2 Activation Pathway, 3.61; Fc Epsilon RI Signaling, 3.55; NF-kB Activation, 3.45; IL3 Signaling, 2.96; IL15 Signaling, 2.59; CCR3 Signaling in Eosinophils, 2.29; LPS-stimulated MAPK Signaling, 2.26.
F, Inflammation genes up in cell stress (Gameiro & Struhl, 2018): TCGA, NES: -1.47, FDR: 0.013; Clinseq, NES: -1.74, FDR: 0.
G, Network of spliced protein translation genes and differentially expressed (DEG) inflammation genes (up and down).

Figure 4. Analysis of the impact of alternative splicing on the . A, Schematic outline of the informatics methodology used to identify differentially expressed genes. B, Volcano plots of differentially expressed genes in TCGA (left) and Clinseq (right). Genes highlighted in red have FDR < 0.05 and those in orange have FDR < 0.05 and |log2 (fold change)| > 1. C, Gene set enrichment analyses (GSEA) showing upregulation of a set of previously reported ATF4-regulated integrated stress response genes (37) in ELNAdv patients from the TCGA (left) and Clinseq (right) cohorts. D, Hierarchical clustering of individual patients (columns) based on the expression of integrated stress response genes (rows) with core enrichment from GSEA for the TCGA and Clinseq cohorts, respectively. Rows were scaled on the basis of expression. A scaled z-score of the PSI values of protein translation genes was calculated in each patient and is represented below. E, Ingenuity Pathway Analysis results of the differentially expressed genes. Venn diagram indicates differentially expressed genes shared by both cohorts. Inflammation-related pathways with associated enrichment values are shown. F, GSEA analyses of a published (38) set of inflammation genes upregulated as a result of decreased protein synthesis. Results show upregulation in ELNAdv patients in both the TCGA (left) and Clinseq (right) cohorts. G, Integrated network analysis of differentially spliced translation genes (green) and differentially expressed inflammation genes (upregulated in red, and downregulated in blue). Experimentally validated protein–protein interactions are depicted as lines, connecting the proteins (nodes). DEG, differentially expressed genes.
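The per-patient summary mentioned for panel D (a scaled z-score of PSI values) can be illustrated with a minimal sketch. The exact procedure is not given in this section, so the code assumes one plausible reading: each splicing event is z-scored across patients, and the z-scores are then averaged within each patient. The function name `missplicing_score` and the events-by-patients input layout are hypothetical, and the sketch ignores the direction of each event (skipped vs. retained), which a real analysis would need to handle.

```python
import numpy as np


def missplicing_score(psi):
    """Summarize PSI values into one score per patient (illustrative only).

    psi: array-like of shape (events, patients), PSI values in [0, 1].
    Assumption: z-score each event across patients, then average per patient.
    """
    psi = np.asarray(psi, dtype=float)
    mean = psi.mean(axis=1, keepdims=True)
    std = psi.std(axis=1, ddof=1, keepdims=True)
    # Events with no variation carry no information; avoid division by zero.
    std[std == 0] = 1.0
    z = (psi - mean) / std
    # One value per patient (column).
    return z.mean(axis=0)


if __name__ == "__main__":
    # Two hypothetical events measured in three hypothetical patients.
    example = [[0.1, 0.5, 0.9],
               [0.2, 0.4, 0.6]]
    print(missplicing_score(example))
```

Patients could then be ordered by this score and compared with the ISR expression heatmap, which is the kind of side-by-side view panel D presents.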


[Figure 5, panels A to E; image not reproduced. Content recoverable from the panels is listed here.]
A, Input: TCGA (62) and Clinseq (117); splicing events (93) and clinical information. Data partition: training set (80%), LASSO Cox (10-fold cross-validation), identified markers; test set (20%), internal validation.
B, Splicing markers (SE, skipped exon; RI, retained intron):
Gene name    Reg. coefficient    Event type    Effect
MYO9B        0.345360187         SE            Skipped in Fav
GAS5         0.115741639         RI            Retained in Adv
GIGYF2       0.080531251         SE            Skipped in Fav
RPS9         0.057787292         SE            Skipped in Fav
C, Kaplan–Meier curves, survival (%) vs. OS (years). TCGA: ELN Fav (41), ELN Adv (21); splicing signature low risk (31), high risk (31). Clinseq: ELN Fav (47), ELN Adv (70); splicing signature low risk (59), high risk (58).
D, ELN vs. ELN refined by LSC17 + Splicing, survival (%) vs. OS (years), with Sankey diagrams of patient redistribution. TCGA: ELN Fav (41), ELN Int (42), ELN Adv (21); P = 0.06 (ELN), P = 0.02 (refined). Clinseq: ELN Fav (47), ELN Int (67), ELN Adv (70); P = 0.006 (ELN), P = 0.0047 (refined).
E, Beat-AML cohort, survival (%) vs. OS (years). ELN: Fav (29), Int (30), Adv (26), P = 0.035. Splicing signature: low risk (43), high risk (42), P = 0.0018.

Figure 5. Evaluating the prognostic significance of alternative splicing in AML. A, Machine learning approach used to identify splicing markers. Input patients were randomly divided into two sets, a training set (80%) and a test set (20%). LASSO Cox regression with 10-fold cross-validation was applied to the training set to identify markers. Identified markers were internally validated on the test set. B, The four splicing markers identified to have prognostic significance in both AML cohorts. Regression coefficient and event type are displayed. C, Kaplan–Meier analysis of TCGA-AML (top row) or Clinseq-AML patients (bottom row) stratified either by the ELN or the splicing signature. P values were computed using log-rank (Mantel–Cox) test. D, TCGA-AML (top row) or Clinseq-AML patients (bottom row) classified initially by ELN (left) and reclassified by adding the LSC17 and the Splicing Signature (right). Sankey flow diagrams (middle) illustrate the redistribution of patients, with the widths of the lines proportional to numbers of patients redistributed (number also denoted). P values were computed using log-rank (Mantel–Cox) test. E, Independent validation of the splicing signature in the Beat-AML (7) cohort. Patients were stratified by the splicing signature-based risk score. P values were computed using log-rank (Mantel–Cox) test.
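To make the risk score and the concordance measure concrete, here is a minimal sketch in plain Python. It uses the four regression coefficients shown in panel B, but several parts are assumptions rather than the authors' procedure (which is in the Supplementary Methods): the score is taken as a simple weighted sum of per-gene splicing values, the high/low cutoff is taken as the cohort median, and tied survival times are skipped when computing Harrell's C-index. The function names and input layout are hypothetical.

```python
from itertools import combinations

# Regression coefficients as displayed in Fig. 5B.
COEFFICIENTS = {
    "MYO9B": 0.345360187,
    "GAS5": 0.115741639,
    "GIGYF2": 0.080531251,
    "RPS9": 0.057787292,
}


def risk_score(splicing_by_gene):
    """Weighted sum of per-gene splicing values (assumed form of the score)."""
    return sum(COEFFICIENTS[g] * splicing_by_gene[g] for g in COEFFICIENTS)


def split_at_median(scores):
    """Label patients 'high' or 'low' relative to the median (assumed cutoff)."""
    ordered = sorted(scores)
    n = len(ordered)
    if n % 2:
        median = ordered[n // 2]
    else:
        median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2
    return ["high" if s > median else "low" for s in scores]


def harrell_c_index(times, events, scores):
    """Fraction of comparable pairs in which the higher score has the shorter survival.

    A pair is comparable when the patient with the shorter follow-up had an
    event (events[i] is truthy); ties in score count as half. Pairs with tied
    times are skipped, which is a simplification.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier patient was censored, so the order is unknown
        comparable += 1
        if scores[first] > scores[second]:
            concordant += 1.0
        elif scores[first] == scores[second]:
            concordant += 0.5
    return concordant / comparable if comparable else float("nan")
```

With this definition, a score that ranks every comparable pair correctly gives 1.0 (100%), and a score that is uninformative gives about 0.5 (50%), matching the interpretation of the C-index given in the text.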


Fig. S4A), we performed RNA-splicing analyses on the transcriptomic data and calculated the splicing risk score (see Supplementary Methods for details). The prognostic significance of the splicing signature was better than the ELN 2017 (P = 0.0018 vs. P = 0.035, Fig. 5E) within this independent cohort of patients. Addition of the splicing signature to the ELN 2017 (Supplementary Fig. S4B) or the LSC17 (Supplementary Fig. S4C), and combining the signatures (Supplementary Fig. S4D), further improved the accuracy of classification (Supplementary Fig. S4E) consistent with what we had observed in the TCGA and Clinseq cohorts.

Discussion
Recurrent somatic mutations in RNA splicing factors have been reported in some hematologic malignancies (20). Analyzing AML transcriptomes, we have discovered recurrent alternative RNA splicing differences between ELNFav and ELNAdv patients even in the absence of splicing factor mutations. Many of these alternatively spliced events are predicted to alter protein function, including members of the spliceosomal complex and protein translation genes. Integration with gene expression revealed that ELNAdv patients had an induction of the ISR and a proinflammatory transcriptional program that was proportional to the degree of missplicing of protein translation genes. Furthermore, using machine learning, we identified four alternatively spliced genes that could be used to refine current mutation and transcriptome-based prognostic classification of patients with AML.

The origin of the missplicing that we have detected in patients with AML remains unknown. It is possible that aberrant transcriptional programs initiated by oncogenic driver mutations might dysregulate splicing networks through the misexpression of splicing cofactors. The splicing factors are also subject to a number of regulatory posttranslational modifications. Phosphorylation of splicing factors by kinases of the SRPK and CLK families controls their enzymatic activity and subcellular localization (41), and AML cells are sensitive to pharmacologic inhibition of these kinases (42). Many RNA-binding proteins are also methylated by the PRMT family of protein arginine methyltransferases, and PRMT inhibition kills leukemic cells (43). Furthermore, splicing alterations due to epigenetic or chromatin changes due to somatic mutations (32, 44) or possibly as a consequence of aging (45) have also been recently reported. It is possible that some or all of these mechanisms could contribute to the splicing alterations we have detected in AML. A cascade of missplicing would then be predicted to arise because of the highly interconnected regulatory networks involving a number of splicing factors and RNA-binding proteins (33, 46).

Decreased protein translation induces the ISR, a conserved pathway which promotes cell survival by modulating cellular homeostasis during cellular stress (36). Protein translational stress leads to the efficient translation of the ISR effector ATF4 and upregulation of its target genes (36). Increased ISR and ATF4 activity have been recently shown to be a marker of leukemic stem cells in patients with AML (37). Our data indicate aberrant alternative splicing of protein translation genes and an induction of the ISR in patients with AML with poor outcomes. Recently, a second cellular stress response, induction of a proinflammatory transcriptional program, has been identified as a result of decreased protein synthesis (38). Our data are consistent with this, where upregulation of inflammatory genes is seen in ELNAdv patients.

Induction of inflammatory genes and the NF-κB pathway have also been reported as a consequence of SF3B1 and SRSF2 mutations in MDS (47). It is possible that the functional consequences of aberrant RNA splicing, through somatic mutations or otherwise, might converge on common downstream consequences. The upregulation of inflammation could induce a leukemic microenvironment that supports the growth of AML clones. AML cells have been recently reported to be dependent on signaling from the proinflammatory cytokine IL1 (48). Furthermore, IL1 signaling suppressed the growth of healthy hematopoietic cells, thereby promoting leukemogenesis and influencing clonal selection of neoplastic cells (48). While pharmacologic inhibition of splicing factors has been proposed as a targetable vulnerability of leukemic cells (45, 49), the narrow therapeutic window for these drugs due to toxicity poses a potential challenge to using them clinically. Our data suggest that targeting integrated stress response or inflammation-promoting pathways that might be stimulated in leukemic cells as a consequence of missplicing could be an alternative approach.

While cytogenetic and mutational information have become the clinical standard for prognosis in AML, there is still significant heterogeneity that remains unresolved. Assessing additional molecular parameters, including gene expression (40) and DNA methylation (39), has been proposed to improve stratification of patients. However, these analyses would have missed capturing an important molecular feature of AML, aberrant alternative splicing. By complementing existing schema with splicing information, we were able to improve the accuracy of risk stratification, including for ELNInt patients, which should aid in treatment decisions in the clinic. Demographically, the three cohorts (TCGA and BEAT-AML, North American; Clinseq, European) do share some broad commonalities, including a high proportion of patients of white ethnicity (>80% in all) and older patients (median age >55 years in the TCGA and >60 years in the Clinseq and BEAT-AML cohorts; refs. 6, 7, 28). Further assessment of the splicing signature in additional AML cohorts with broader representation of ethnic diversity and younger patients should bring clarity to the wider utility of the splicing signature.

Key Points
* Widespread and recurrent alternative splicing differences exist between patients with AML with good or poor prognosis
* Missplicing of RNA splicing factors leads to altered splicing of their target transcripts
* Aberrant splicing of protein translation genes triggers the induction of an integrated stress response and concomitant inflammatory response
* Alternative RNA splicing information can be used to improve the accuracy of existing prognostic algorithms in AML
* The addition of the splicing signature accurately classifies patients with ELNInt AML, effectively converting a three-group classification system into a two-group one, facilitating better treatment decisions.

Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.

Authors' Contributions
Conception and design: G. Anande, A. Unnikrishnan, J.E. Pimanda
Development of methodology: G. Anande, N.P. Deshpande, A. Unnikrishnan
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): T. Herold, S. Lehmann
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): G. Anande, N.P. Deshpande, S. Mareschal,


A.M.N. Batcha, H.R. Hampton, T. Herold, J.W.H. Wong, A. Unnikrishnan, J.E. Pimanda
Writing, review, and/or revision of the manuscript: N.P. Deshpande, S. Mareschal, T. Herold, S. Lehmann, M.R. Wilkins, J.W.H. Wong, A. Unnikrishnan, J.E. Pimanda
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): G. Anande, N.P. Deshpande, T. Herold
Study supervision: N.P. Deshpande, M.R. Wilkins, J.W.H. Wong, A. Unnikrishnan, J.E. Pimanda

Acknowledgments
The authors would like to thank Dr. Ling Zhong (Mark Wainwright Analytical Centre, University of New South Wales, New South Wales, Australia) for technical assistance rendered. The authors also thank Dr. Annatina Schnegg (University of New South Wales, New South Wales, Australia) for critical review and discussion of the manuscript. This work was facilitated by infrastructure provided by the NSW Government co-investment in the National Collaborative Research Infrastructure Scheme (NCRIS, Australia). The authors acknowledge the following funding support: G. Anande was supported by a postgraduate scholarship from the University of New South Wales, with additional funding from the Translational Cancer Research Network. A. Unnikrishnan acknowledges funding support from the National Health and Medical Research Council of Australia (APP1163815), Leukemia & Lymphoma Society (USA) and Anthony Rothe Memorial Trust. J.E. Pimanda acknowledges funding from National Health and Medical Research Council of Australia (APP1024364, 1043934, 1102589), Cancer Institute of New South Wales/Translational Cancer Research Network and Anthony Rothe Memorial Trust. T. Herold is supported by the Wilhelm Sander Foundation (2013.086.2), the Physician Scientists Grant (G-509200-004) from the Helmholtz Zentrum München and the German Cancer Consortium (Deutsches Konsortium für Translationale Krebsforschung, Heidelberg, Germany) and a grant from Deutsche Forschungsgemeinschaft (DFG SFB 1243). A.M.N. Batcha was partially funded by the BMBF grant 01ZZ1804B (DIFUTURE).

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received January 15, 2020; revised February 13, 2020; accepted February 26, 2020; published first March 2, 2020.

References
1. Dohner H, Estey E, Grimwade D, Amadori S, Appelbaum FR, Buchner T, et al. Diagnosis and management of AML in adults: 2017 ELN recommendations from an international expert panel. Blood 2017;129:424–47.
2. SEER Cancer Stat Facts: acute myeloid leukemia. Bethesda, MD: National Cancer Institute. Available from: https://seer.cancer.gov/statfacts/html/amyl.html.
3. Shallis RM, Wang R, Davidoff A, Ma X, Zeidan AM. Epidemiology of acute myeloid leukemia: recent progress and enduring challenges. Blood Rev 2019;36:70–87.
4. Dohner H, Estey EH, Amadori S, Appelbaum FR, Buchner T, Burnett AK, et al. Diagnosis and management of acute myeloid leukemia in adults: recommendations from an international expert panel, on behalf of the European LeukemiaNet. Blood 2010;115:453–74.
5. Papaemmanuil E, Gerstung M, Bullinger L, Gaidzik VI, Paschka P, Roberts ND, et al. Genomic classification and prognosis in acute myeloid leukemia. N Engl J Med 2016;374:2209–21.
6. Cancer Genome Atlas Research Network, Ley TJ, Miller C, Ding L, Raphael BJ, Mungall AJ, et al. Genomic and epigenomic landscapes of adult de novo acute myeloid leukemia. N Engl J Med 2013;368:2059–74.
7. Tyner JW, Tognon CE, Bottomly D, Wilmot B, Kurtz SE, Savage SL, et al. Functional genomic landscape of acute myeloid leukaemia. Nature 2018;562:526–31.
8. Haferlach T, Nagata Y, Grossmann V, Okuno Y, Bacher U, Nagae G, et al. Landscape of genetic lesions in 944 patients with myelodysplastic syndromes. Leukemia 2014;28:241–7.
9. Genovese G, Kahler AK, Handsaker RE, Lindberg J, Rose SA, Bakhoum SF, et al. Clonal hematopoiesis and blood-cancer risk inferred from blood DNA sequence. N Engl J Med 2014;371:2477–87.
10. DiNardo CD, Cortes JE. Mutations in AML: prognostic and therapeutic implications. Hematology Am Soc Hematol Educ Program 2016;2016:348–55.
11. Kantarjian H, Ravandi F, O'Brien S, Cortes J, Faderl S, Garcia-Manero G, et al. Intensive chemotherapy does not benefit most older patients (age 70 years or older) with acute myeloid leukemia. Blood 2010;116:4422–9.
12. Dombret H, Seymour JF, Butrym A, Wierzbowska A, Selleslag D, Jang JH, et al. International phase 3 study of azacitidine vs. conventional care regimens in older patients with newly diagnosed AML with >30% blasts. Blood 2015;126:291–9.
13. Estey E, Levine RL, Lowenberg B. Current challenges in clinical development of "targeted therapies": the case of acute myeloid leukemia. Blood 2015;125:2461–6.
14. Wang ET, Sandberg R, Luo S, Khrebtukova I, Zhang L, Mayr C, et al. Alternative isoform regulation in human tissue transcriptomes. Nature 2008;456:470–6.
15. Kahles A, Lehmann KV, Toussaint NC, Huser M, Stark SG, Sachsenberg T, et al. Comprehensive analysis of alternative splicing across tumors from 8,705 patients. Cancer Cell 2018;34:211–24.
16. Climente-Gonzalez H, Porta-Pardo E, Godzik A, Eyras E. The functional impact of alternative splicing in cancer. Cell Rep 2017;20:2215–26.
17. Luo C, Cheng Y, Liu Y, Chen L, Liu L, Wei N, et al. SRSF2 regulates alternative splicing to drive hepatocellular carcinoma development. Cancer Res 2017;77:1168–78.
18. Vu NT, Park MA, Shultz JC, Goehe RW, Hoeferlin LA, Shultz MD, et al. hnRNP U enhances caspase-9 splicing and is modulated by AKT-dependent phosphorylation of hnRNP L. J Biol Chem 2013;288:8575–84.
19. Jayasinghe RG, Cao S, Gao Q, Wendl MC, Vo NS, Reynolds SM, et al. Systematic analysis of splice-site-creating mutations in cancer. Cell Rep 2018;23:270–81.
20. Yoshida K, Sanada M, Shiraishi Y, Nowak D, Nagata Y, Yamamoto R, et al. Frequent pathway mutations of splicing machinery in myelodysplasia. Nature 2011;478:64–9.
21. Ilagan JO, Ramakrishnan A, Hayes B, Murphy ME, Zebari AS, Bradley P, et al. U2AF1 mutations alter splice site recognition in hematological malignancies. Genome Res 2015;25:14–26.
22. Dolatshad H, Pellagatti A, Fernandez-Mercado M, Yip BH, Malcovati L, Attwood M, et al. Disruption of SF3B1 results in deregulated expression and splicing of key genes and pathways in myelodysplastic syndrome hematopoietic stem and progenitor cells. Leukemia 2015;29:1092–103.
23. Zhang J, Lieu YK, Ali AM, Penson A, Reggio KS, Rabadan R, et al. Disease-associated mutation in SRSF2 misregulates splicing by altering RNA-binding affinities. Proc Natl Acad Sci U S A 2015;112:E4726–34.
24. Kim E, Ilagan JO, Liang Y, Daubner GM, Lee SC, Ramakrishnan A, et al. SRSF2 mutations contribute to myelodysplasia by mutant-specific effects on exon recognition. Cancer Cell 2015;27:617–30.
25. Madan V, Kanojia D, Li J, Okamoto R, Sato-Otsubo A, Kohlmann A, et al. Aberrant splicing of U12-type introns is the hallmark of ZRSR2 mutant myelodysplastic syndrome. Nat Commun 2015;6:6042.
26. Qiu J, Zhou B, Thol F, Zhou Y, Chen L, Shao C, et al. Distinct splicing signatures affect converged pathways in myelodysplastic syndrome patients carrying mutations in different splicing regulators. RNA 2016;22:1535–49.
27. Chen L, Chen JY, Huang YJ, Gu Y, Qiu J, Qian H, et al. The augmented R-Loop is a unifying mechanism for myelodysplastic syndromes induced by high-risk splicing factor mutations. Mol Cell 2018;69:412–25.
28. Wang M, Lindberg J, Klevebring D, Nilsson C, Mer AS, Rantalainen M, et al. Validation of risk stratification models in acute myeloid leukemia using sequencing-based molecular profiling. Leukemia 2017;31:2029–36.
29. Park JW, Jung S, Rouchka EC, Tseng YT, Xing Y. rMAPS: RNA map analysis and plotting server for alternative exon regulation. Nucleic Acids Res 2016;44:W333–8.
30. Yeo G, Burge CB. Maximum entropy modeling of short sequence motifs with applications to RNA splicing signals. J Comput Biol 2004;11:377–94.
31. Gerstung M, Pellagatti A, Malcovati L, Giagounidis A, Porta MG, Jadersten M, et al. Combining gene mutation with gene expression data improves outcome prediction in myelodysplastic syndromes. Nat Commun 2015;6:5901.
32. Yoshimi A, Lin KT, Wiseman DH, Rahman MA, Pastore A, Wang B, et al. Coordinated alterations in RNA splicing and epigenetic regulation drive leukaemogenesis. Nature 2019;574:273–7.
33. Huelga SC, Vu AQ, Arnold JD, Liang TY, Liu PP, Yan BY, et al. Integrative genome-wide analysis reveals cooperative regulation of alternative splicing by hnRNP proteins. Cell Rep 2012;1:167–78.


34. Konig J, Zarnack K, Rot G, Curk T, Kayikci M, Zupan B, et al. iCLIP reveals the function of hnRNP particles in splicing at individual nucleotide resolution. Nat Struct Mol Biol 2010;17:909–15.
35. Horan L, Yasuhara JC, Kohlstaedt LA, Rio DC. Biochemical identification of new proteins involved in splicing repression at the Drosophila P-element exonic splicing silencer. Genes Dev 2015;29:2298–311.
36. Pakos-Zebrucka K, Koryga I, Mnich K, Ljujic M, Samali A, Gorman AM. The integrated stress response. EMBO Rep 2016;17:1374–95.
37. van Galen P, Mbong N, Kreso A, Schoof EM, Wagenblast E, Ng SWK, et al. Integrated stress response activity marks stem cells in normal hematopoiesis and leukemia. Cell Rep 2018;25:1109–17.
38. Gameiro PA, Struhl K. Nutrient deprivation elicits a transcriptional and translational inflammatory response coupled to decreased protein synthesis. Cell Rep 2018;24:1415–24.
39. Figueroa ME, Lugthart S, Li Y, Erpelinck-Verschueren C, Deng X, Christos PJ, et al. DNA methylation signatures identify biologically distinct subtypes in acute myeloid leukemia. Cancer Cell 2010;17:13–27.
40. Ng SW, Mitchell A, Kennedy JA, Chen WC, McLeod J, Ibrahimova N, et al. A 17-gene stemness score for rapid determination of risk in acute leukaemia. Nature 2016;540:433–7.
41. Stamm S. Regulation of alternative splicing by reversible protein phosphorylation. J Biol Chem 2008;283:1223–7.
42. Tzelepis K, De Braekeleer E, Aspris D, Barbieri I, Vijayabaskar MS, Liu WH, et al. SRPK1 maintains acute myeloid leukemia through effects on isoform usage of epigenetic regulators including BRD4. Nat Commun 2018;9:5378.
43. Fong JY, Pignata L, Goy PA, Kawabata KC, Lee SC, Koh CM, et al. Therapeutic targeting of RNA splicing catalysis through inhibition of protein arginine methylation. Cancer Cell 2019;36:194–209.
44. Inoue D, Chew GL, Liu B, Michel BC, Pangallo J, D'Avino AR, et al. Spliceosomal disruption of the non-canonical BAF complex in cancer. Nature 2019;574:432–6.
45. Crews LA, Balaian L, Delos Santos NP, Leu HS, Court AC, Lazzari E, et al. RNA splicing modulation selectively impairs leukemia stem cell maintenance in secondary human AML. Cell Stem Cell 2016;19:599–612.
46. Liang Y, Tebaldi T, Rejeski K, Joshi P, Stefani G, Taylor A, et al. SRSF2 mutations drive oncogenesis by activating a global program of aberrant alternative splicing in hematopoietic cells. Leukemia 2018;32:2659–71.
47. Lee SC, North K, Kim E, Jang E, Obeng E, Lu SX, et al. Synthetic lethal and convergent biological effects of cancer-associated spliceosomal gene mutations. Cancer Cell 2018;34:225–41.
48. Carey A, Edwards DK, Eide CA, Newell L, Traer E, et al. Identification of interleukin-1 by functional screening as a key mediator of cellular expansion and disease progression in acute myeloid leukemia. Cell Rep 2017;18:3204–18.
49. Lee SC, Dvinge H, Kim E, Cho H, Micol JB, Chung YR, et al. Modulation of splicing catalysis for therapeutic targeting of leukemia with mutations in genes encoding spliceosomal proteins. Nat Med 2016;22:672–8.



Supplementary Material: Access the most recent supplemental material at: http://clincancerres.aacrjournals.org/content/suppl/2020/02/29/1078-0432.CCR-20-0184.DC1
