Identification of Novel Compounds That Inhibit HIV-1 Expression by Targeting Viral RNA Processing

by

Ahalya Balachandran

A thesis submitted in conformity with the requirements for the degree of Master of Science

Department of Molecular Genetics

University of Toronto

© Copyright by Ahalya Balachandran 2015

Identification of Novel Compounds That Inhibit HIV-1 by Targeting Viral RNA Processing

Ahalya Balachandran

Master of Science

Department of Molecular Genetics

University of Toronto

2015

Abstract

Novel strategies targeting different stages of the HIV lifecycle are vital for continued success in combating viral infection. Since HIV gene expression is dependent upon controlled splicing of the viral transcript, small molecule modulators of RNA processing hold tremendous promise as novel drugs. To this end, we screened splicing modulators for their effect on HIV-1 gene expression. We identified four compounds, 191, 791, 833 and 892, that strongly suppressed accumulation of HIV-1 incompletely spliced RNA and expression of viral structural/regulatory proteins. Furthermore, compound treatment had limited effects on host RNA splicing events. Subsequent studies confirmed anti-HIV activity of two compounds in the context of peripheral blood mononuclear cells. The distinct effects of these compounds from previously characterized HIV-1 RNA processing inhibitors validate targeting this stage of the virus lifecycle. Elucidating the mechanism by which these compounds alter HIV-1 gene expression holds key insights for novel therapeutic strategies.


Acknowledgments

I would like to thank my supervisor, Dr. Alan Cochrane, for the opportunity to work on this project in his laboratory over the past few years. I would also like to thank my committee members, Dr. Lori Frappier, Dr. Craig Smibert, and Dr. Peter Roy, for their continued guidance and support. It has been a pleasure working with all members of the Cochrane Lab over the past few years. I’d like to thank all the students and post docs for their help and support along the way. Special thanks go to Raymond Wong for taking me under his wing when I was an undergraduate student and for sharing his knowledge and experience about the drug screening projects in our lab. I’d also like to thank Dr. Alex Chen for training me in preparation for working with replicative HIV in the BSL3 facility and the Scott Gray-Owen Lab for source of PBMCs. Last but not least, I’d like to thank our collaborators Dr. Peter Stoilov at West Virginia and Dr. Sandy Pan from the Blencowe Lab for examining the effect of the compounds on cellular alternative splicing. The work presented here would not be possible without funding provided by CIHR grants, as well as the Ontario Graduate Scholarship Award.


Table of Contents

Acknowledgments...... iii

Table of Contents ...... iv

List of Tables ...... viii

List of Figures ...... ix

List of Appendices ...... xi

Abbreviations ...... xii

1 Introduction ...... 1

1.1 mRNA processing ...... 1

1.1.1 mRNA capping ...... 1

1.1.2 Constitutive splicing and the spliceosome ...... 2

1.1.3 Alternative splicing ...... 2

1.1.4 Polyadenylation...... 4

1.1.5 RNA export ...... 4

1.1.6 Translational initiation ...... 6

1.1.7 Interdependence of events in mRNA processing ...... 6

1.2 Regulation of mRNA splicing ...... 7

1.2.1 Role of cis elements in splicing ...... 7

1.2.2 Role of trans factors in splicing ...... 10

1.2.2.1 SR-protein family of splicing factors ...... 10

1.2.2.2 Heterogeneous nuclear ribonucleoproteins (hnRNPs) ...... 10

1.2.3 Regulation of splicing factors ...... 11

1.2.4 Splicing factors and signaling pathways ...... 12

1.3 Perturbation of alternative splicing in disease ...... 14

1.4 HIV-1 utilizes host alternative splicing machinery for viral gene expression ...... 15


1.4.1 Overview of the HIV-1 lifecycle ...... 15

1.4.2 Current treatment strategies for HIV-1 ...... 16

1.4.3 Limitations of current HIV-1 therapies ...... 18

1.4.4 HIV-1 RNA processing...... 19

1.4.5 Regulation of HIV-1 RNA splicing ...... 19

1.4.6 HIV-1 gene expression and Rev-dependent export ...... 24

1.5 Modulation of RNA splicing as a therapeutic strategy ...... 27

1.5.1 Modulation of AS using small molecules ...... 27

1.5.1.1 Spliceosome inhibitors ...... 29

1.5.1.2 Histone deacetylase (HDAC) inhibitors ...... 29

1.5.1.3 Topoisomerase (Topo I) inhibitors ...... 30

1.5.1.4 Kinase and phosphatase inhibitors ...... 30

1.6 Effect of splicing modulators on HIV-1 gene expression ...... 31

1.7 Research objective and rationale ...... 33

2 Materials and Methods ...... 34

2.1 HIV-1 provirus doxycycline-inducible cell lines ...... 34

2.2 Assess activity of compounds on HIV-1 gene expression ...... 34

2.2.1 Preparation of compounds ...... 34

2.2.2 Compound treatment assay ...... 34

2.3 HIV-1 p24 antigen ELISA ...... 36

2.4 XTT cytotoxicity assay ...... 36

2.5 Analysis of HIV-1 protein expression ...... 37

2.6 Analysis of HIV-1 RNA expression and localization ...... 38

2.6.1 RNA extraction and reverse transcription ...... 38

2.6.2 Quantification of HIV-1 mRNA expression by qPCR ...... 38

2.6.3 Analysis of splice site selection within the HIV-1 MS RNA ...... 39

2.6.4 Analysis of HIV-1 US RNA subcellular localization ...... 40

2.7 Monitoring protein synthesis by SUnSET ...... 42

2.8 Viral protein degradation assay ...... 44

2.9 Proteasomal degradation protection assay ...... 44

2.10 Analysis of cellular alternative splicing events by RT-PCR ...... 45

2.11 Analysis of cellular alternative splicing by RNA sequencing ...... 45

2.11.1 Sample preparation for RNA sequencing (RNAseq) ...... 45

2.11.2 RNAseq ...... 46

2.11.3 Analysis of RNAseq data ...... 46

2.11.3.1 Gene expression estimation ...... 46

2.11.3.2 Percent spliced in (PSI) estimation ...... 47

2.12 Compound treatment assay in primary cells ...... 48

2.12.1 Human primary cell donors and cell preparation ...... 48

2.12.2 Generation of replication-competent HIV-1 virus ...... 48

2.12.3 HIV-1 BaL infection of primary cells ...... 49

2.12.4 Compound treatment of primary cells ...... 49

2.13 Statistical analysis ...... 50

3 Results ...... 51

3.1 Identification of four compounds that suppress HIV-1 gene expression in HeLa cells ....51

3.1.1 Previously published literature for 191, 791, 833, and 892 activity ...... 53

3.2 191, 791, 833, and 892 potently inhibited HIV-1 gene expression in a dose-dependent manner...... 54

3.3 191, 791, 833, and 892 decreased HIV-1 structural and regulatory protein expression ....54

3.4 191, 791, 833, and 892 reduced HIV-1 US and SS RNA but not MS RNA ...... 56

3.5 191 and 791 did not alter splice site usage among HIV-1 MS RNA ...... 60

3.6 Inhibition of cytoplasmic accumulation of HIV-1 US RNA and Gag with compound treatment was consistent with perturbation of Rev function ...... 62

3.7 191, 791, 833, and 892 did not affect total protein synthesis ...... 62

3.8 The compounds did not alter the stability of existing HIV-1 Tat protein...... 67

3.9 791 did not significantly affect cellular alternative splicing while 191, 833, and 892 had limited effects ...... 70

3.10 Preliminary analysis of the effect of the compounds on expression of cellular splicing factors ...... 75

3.11 191 and 791 inhibit HIV-1 BaL replication in primary cells ...... 77

4 Discussion ...... 80

4.1 Future Directions ...... 88

4.2 Conclusions ...... 90

Appendices ...... 102

I. Analysis of cellular alternative splicing by RT-PCR ...... 102

II. Global analysis of cellular alternative splicing and gene expression by RNA seq ...... 110


List of Tables

Table 1.1 List of small molecule inhibitors of alternative splicing and their molecular targets ... 41

Table I-1 Effect of 892 treatment on a subset of cellular alternative splicing (AS) ...... 112

Table I-2 Effect of 791 treatment on a subset of cellular alternative splicing (AS) ...... 114

Table I-3 Effect of 833 treatment on a subset of cellular alternative splicing (AS) ...... 116

Table I-4 Effect of 191 treatment on a subset of cellular alternative splicing (AS) ...... 118

Table II-1 Effect of 791 treatment on a cellular alternative splicing (AS) by RNAseq ...... 120

Table II-2 Effect of 191 treatment on a cellular alternative splicing (AS) by RNAseq ...... 126

Table II-3 Effect of 791 treatment on a gene expression by RNAseq...... 133

Table II-4 Effect of 191 treatment on a gene expression by RNAseq...... 136


List of Figures

Figure 1.1 Mechanism of mRNA splicing ...... 16

Figure 1.2 Possible products of alternative splicing of a hypothetical gene ...... 18

Figure 1.3 Regulation of alternative splicing by SR and hnRNP proteins ...... 21

Figure 1.4 Cellular alternative splicing factors ...... 22

Figure 1.5 Regulation of alternative splicing by signaling pathways ...... 26

Figure 1.6 HIV-1 lifecycle and current treatment strategies ...... 30

Figure 1.7 HIV-1 mRNA splicing and regulation ...... 33

Figure 1.8 HIV-1 gene expression in host cell ...... 38

Figure 2.1 Schematic of HIV-1 proviral system integrated in HeLa cell lines ...... 48

Figure 2.2 Characterization of HeLa C7 cells for fluorescence studies ...... 54

Figure 2.3 Compound treatment in HeLa C7 cells inhibits HIV-1 gene expression in a dose-dependent manner similar to effects observed in HeLa B2 cells ...... 56

Figure 3.1 Screen of RNA splicing modulators identifies four potent inhibitors of HIV-1 gene expression...... 65

Figure 3.2 Compound treatment inhibits HIV-1 gene expression in a dose-dependent manner .. 68

Figure 3.3 Compound treatment dramatically decreases the expression of HIV-1 structural proteins ...... 70

Figure 3.4 191, 791, 833, and 892 dramatically decrease the expression of HIV-1 regulatory proteins, in contrast to previously characterized HIV-1 inhibitors ...... 71

Figure 3.5 The compounds dramatically decrease the levels of HIV-1 US and SS RNAs ...... 72

Figure 3.6 191 and 791 do not alter splice site selection within HIV-1 MS RNAs...... 74

Figure 3.7 Compounds inhibit cytoplasmic accumulation of HIV-1 US RNA ...... 76

Figure 3.8 The compounds do not affect total protein synthesis ...... 78

Figure 3.9 191 and 791 had better long-term toxicity profiles than 833 and 892...... 79

Figure 3.10 Compounds do not affect the half-life of HIV-1 Tat relative to DMSO ...... 81

Figure 3.11 HIV-1 Tat expression can be rescued with proteasome inhibition by MG132 ...... 82

Figure 3.12 Compounds have limited effects on cellular alternative splicing events ...... 85

Figure 3.13 191 and 791 do not appreciably alter cellular alternative splicing events ...... 86

Figure 3.14 Differential host gene expression with 191 and 791 treatment ...... 87

Figure 3.15 Compounds have limited effects on expression of cellular splicing factors ...... 89

Figure 3.16 191 and 791 inhibit HIV-1 replication in PBMCs ...... 91

Figure 3.17 191 and 791 inhibit HIV-1 replication in PBMCs in a dose-dependent manner...... 92

Figure 4.1 Proposed model for how the compounds inhibit HIV-1 gene expression ...... 97


List of Appendices

I. Analysis of cellular alternative splicing by RT-PCR ...... 112

II. Global analysis of cellular alternative splicing and gene expression by RNA seq ...... 120


Abbreviations

AIDS acquired immunodeficiency syndrome

BSA bovine serum albumin

DMSO dimethyl sulfoxide

Dox doxycycline

ELISA enzyme-linked immunosorbent assay

ESE exon splicing enhancer

ESS exon splicing silencer

FBS fetal bovine serum

HAART highly active antiretroviral therapy

HIV-1 human immunodeficiency virus type 1

hnRNP heterogeneous nuclear ribonucleoprotein

IC inhibitory concentration

IMDM Iscove’s modified Dulbecco’s medium

MS multiply spliced

Nef negative effector

NRTI nucleoside or nucleotide reverse transcriptase inhibitor

NNRTI non-nucleoside reverse transcriptase inhibitor

P/S penicillin/streptomycin

PBS phosphate buffered saline


PCR polymerase chain reaction

qRT-PCR quantitative reverse transcription PCR

Rev regulator of expression of virion proteins

RT-PCR reverse transcription PCR

rtTA reverse tetracycline transactivator

snRNP small nuclear ribonucleoprotein particle

SS singly spliced

TAR trans-acting response region

Tat transactivator of transcription

tetO tet operator

US unspliced

Vif viral infectivity factor

Vpu virion protein unique to HIV-1


1 Introduction

Transcription of messenger RNA (mRNA) is the first step of converting the information stored within a genome into functional proteins. In eukaryotic cells, mRNA is further processed by events that include capping, splicing, and 3’ end formation to produce a mature mRNA prior to subsequent export to the cytoplasm and translation. Human immunodeficiency virus 1 (HIV-1) is a retrovirus that relies on host cellular mRNA processing for viral gene expression and replication. However, unlike most cellular mRNAs, HIV-1 encodes many of its structural and enzymatic proteins on unspliced viral RNAs. To overcome the requirement of fully processed mRNAs for export, HIV-1 encodes a viral regulatory protein, Rev, which specifically binds and exports incompletely spliced viral RNAs. Studying the interplay between host factors and viral proteins can provide insight into novel strategies for inhibiting HIV-1 replication.

1.1 mRNA processing

mRNA processing refers to the series of events that occur for mature mRNA to be generated from the primary transcript. This process was often seen as a linear cascade of events that included mRNA capping, splicing, polyadenylation, export to the cytoplasm and translation to produce the encoded protein. However, an increasing body of evidence over the years suggests a shift in this paradigm. In fact, there is extensive crosstalk between these events, and cellular factors involved in mRNA processing often have roles in more than one of these events (1).

1.1.1 mRNA capping

The earliest processing event is modification of the 5’ end of the nascent RNA polymerase II (Pol II) transcript, when it is 20-25 nucleotides in length, to form the 7-methylguanosine cap (2). This evolutionarily conserved modification is necessary for efficient eukaryotic gene expression and cell viability (2). Formation of the cap occurs via three reactions catalysed by three different enzymes. The 7-methylguanosine cap is joined to the first transcribed nucleotide through a 5′-to-5′ triphosphate linkage. First, the 5′ triphosphate of the nascent transcript is hydrolysed to a diphosphate by an RNA 5′ triphosphatase (2). Next, guanosine monophosphate is added to the diphosphate–RNA by a guanylyltransferase to produce the guanosine cap, via a two-step reversible reaction (2). Finally, an RNA (guanine-7-) methyltransferase catalyses the methylation of the guanosine cap at the N-7 position to produce the 7-methylguanosine cap, using S-adenosylmethionine as the methyl donor (2). The cap serves to protect mRNA from the action of 5’-exonucleases and promotes transcription, splicing, polyadenylation and mRNA export (2).

1.1.2 Constitutive splicing and the spliceosome

Splicing is the process of excising the sequences in pre-mRNA corresponding to introns (typically thousands of nucleotides in length), so that exons (typically hundreds of nucleotides in length) are connected into a continuous mRNA to form the coding sequence (3). When only one mature mRNA is formed in this process, it is called constitutive splicing. Splicing is carried out by a large ribonucleoprotein complex referred to as the spliceosome, which recognizes conserved sequence elements in the pre-mRNA. These include the 5’ splice site (5’ss) and 3’ splice site (3’ss), the polypyrimidine tract (PPT) and the branchpoint sequence (BPS) (3). Figure 1.1A depicts these elements and the proteins that bind to them. The spliceosome machinery consists of five core small nuclear ribonucleoproteins (snRNPs), U1, U2, U4, U5 and U6, and up to 300 other proteins. The pre-mRNA is recognized and bound by splicing factor 1 (SF1) at the BPS and the U2 auxiliary factor (U2AF; 65 and 35 kDa subunits) at the PPT upstream of the 3’ss (3). Following the binding of SF1 and U2AF, U1 snRNP binds the 5’ss and U2AF recruits U2 snRNP to the branch point. This U1-pre-mRNA-U2 complex then interacts with the U4-U5-U6 snRNP complex, and conformational rearrangement leads to the splicing reaction by two transesterification steps, outlined in Figure 1.1B (3). The first step involves the attack by the 2’ hydroxyl of the branch point adenine on the phosphate at the 5’ss, cleaving the RNA at the 5’ exon/intron boundary. The second step involves the attack by the 3’ hydroxyl of the released 5’ exon on the phosphate at the 3’ss, joining the two exons and liberating the intron in the form of a lariat (3). During the second step of the splicing reaction, a complex of proteins called the exon junction complex (EJC) recognizes the splicing complex and binds to the RNA (3). The EJC consists of more than nine proteins, including a group of proteins called the REF family (3).
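The net outcome of the two transesterification steps described above (introns excised, exons joined) can be illustrated with a minimal Python sketch. This is a toy model only: the sequence, the intron coordinates and the function name `splice` are hypothetical, and the check that each intron begins with GU and ends with AG is an assumption based on the canonical splice site dinucleotides rather than anything stated in the text.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Excise introns given as half-open [start, end) intervals and join the exons."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        # Assumption: canonical introns begin with GU (5'ss) and end with AG (3'ss).
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"non-canonical intron at {start}-{end}: {intron}")
        exons.append(pre_mrna[position:start])  # exon upstream of this intron
        position = end                          # resume just after the 3'ss
    exons.append(pre_mrna[position:])           # final exon
    return "".join(exons)


# Hypothetical transcript: three short exons separated by two introns.
segments = [
    ("exon", "AUGGCC"),
    ("intron", "GUAAGUCUAACAG"),
    ("exon", "UUCGAA"),
    ("intron", "GUGAGUUUCAG"),
    ("exon", "UAA"),
]

pre_mrna = ""
introns = []
for kind, sequence in segments:
    if kind == "intron":
        # Record where this intron sits in the growing pre-mRNA.
        introns.append((len(pre_mrna), len(pre_mrna) + len(sequence)))
    pre_mrna += sequence

mature_mrna = splice(pre_mrna, introns)
print(mature_mrna)
```

The sketch deliberately ignores the branchpoint, the polypyrimidine tract and the lariat intermediate; it only captures the bookkeeping of which nucleotides remain in the mature mRNA.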

1.1.3 Alternative splicing

Figure 1.1 Mechanism of mRNA splicing. A) Consensus splicing sequences. The 5’ss, BPS, PPT and 3’ss are represented and are bound by U1 snRNP, SF1, U2AF65 and U2AF35, respectively. B) Splicing reaction. The first step involves the attack by the 2’ hydroxyl of the branch point adenine on the phosphate at the 5’ss, releasing the 3’ end of the 5’ exon. The second step involves the attack by the 3’ hydroxyl of the released 5’ exon on the phosphate at the 3’ss, liberating the intron in the form of a lariat.

Brosseau, J-P and S. Abou-Elela. The Merit of Alternative Messenger RNA Splicing as a New Mine for the Next Generation Ovarian Cancer Biomarkers, Ovarian Cancer - A Clinical and Translational Update, InTech. Edited by Dr. I. Diaz-Padilla (2013). Copyright Brosseau and Abou-Elela. Reproduced with permission. Available from URL:

To increase the diversity of mRNAs expressed from the genome, almost all transcripts in higher eukaryotic cells undergo alternative splicing (AS) of the pre-mRNA (3). Both constitutive and alternative splicing are mediated by the spliceosome, but alternative splicing differentially links exon regions in a single precursor mRNA to produce two or more different mature mRNAs. The choice of which splice sites are used is regulated by cis-acting sequences present in the mRNA exonic and intronic regions and trans-acting factors that bind to these elements to promote or repress splicing at that site (3, 4). In addition, AS can also affect 5’ and 3’ untranslated region (UTR) regulatory sequences and polyadenylation site selection. There are a number of possible mRNA isoforms that may be generated by exon skipping, intron retention, alternative splice site selection, alternative promoter usage and alternative polyadenylation (5). Some of these isoforms are described in Figure 1.2. Thus, AS can lead to changes in the proteins encoded by mRNAs and result in more profound functional effects in the cell. In fact, AS has been shown to regulate binding, localization, enzymatic properties and interactions with ligands, and to enable additional post-transcriptional control of gene expression (5). Thus, it is not surprising that aberrations in alternative splicing have been implicated in numerous diseases, cancers and viral infections.
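To make concrete how a single pre-mRNA can yield several mature mRNAs, the following Python sketch enumerates the isoforms produced by exon skipping alone. It is a simplified illustration under stated assumptions: the exon sequences, the choice of which exons are alternative and the function name `exon_skipping_isoforms` are all hypothetical, and the other AS modes listed above (intron retention, alternative splice site selection and so on) are not modelled.

```python
from itertools import product


def exon_skipping_isoforms(exons: list[str], alternative: set[int]) -> list[str]:
    """Return every isoform obtained by including or skipping each alternative exon.

    `exons` is the ordered list of exon sequences; `alternative` holds the
    indices of exons that may be skipped. All other exons are constitutive.
    """
    alternative_indices = sorted(alternative)
    isoforms = []
    # Each alternative exon is independently included (True) or skipped (False).
    for choices in product([True, False], repeat=len(alternative_indices)):
        included = dict(zip(alternative_indices, choices))
        isoform = "".join(
            sequence
            for index, sequence in enumerate(exons)
            if included.get(index, True)  # constitutive exons are always kept
        )
        isoforms.append(isoform)
    return isoforms


# Hypothetical gene with four exons; the two internal exons are alternative.
exons = ["AUGGCC", "UUCGAA", "GGAUCC", "UAA"]
isoforms = exon_skipping_isoforms(exons, alternative={1, 2})
for isoform in isoforms:
    print(isoform)
```

With n independently regulated alternative exons this toy model yields 2^n isoforms, which gives a sense of how quickly transcript diversity can grow even before the other AS modes are considered.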

1.1.4 Polyadenylation

Mature 3’ ends of mRNAs are generated by endonucleolytic cleavage of the pre-mRNA, followed by polyadenylation of the upstream cleavage product (6, 7). 3'-cleavage and polyadenylation are closely coupled to the termination of transcription since Pol II transcribes the DNA template several hundred nucleotides downstream of the cleavage and polyadenylation site (marked by the conserved AAUAAA sequence), while specific sequence signals in the pre-mRNA direct the binding of protein factors (6, 7). Polyadenylation requires more than a dozen proteins but the main conserved factors include cleavage stimulation factor (CstF), cleavage/polyadenylation specificity factor (CPSF), poly(A) polymerase (PAP), and the poly(A) binding protein, PABPN1 (6, 7). PAP is not strongly associated with the end of the pre-mRNA initially, until approximately 20 adenosines have been added and PABPN1 binds to the short poly(A)-tail. Then, PAP is more firmly bound until 150-200 adenosines are rapidly added, at which point PAP dissociates (6, 7) and the mRNA can be transported to the cytoplasm for translation.
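The cleavage and polyadenylation steps described above can be sketched in a few lines of Python. This is a toy illustration, not a model of the actual machinery: the transcript is hypothetical, the function name `cleave_and_polyadenylate` is invented for this example, and the fixed distance of 20 nucleotides between the AAUAAA signal and the cleavage site is an assumed value chosen for simplicity. The default tail length of 200 adenosines follows the upper end of the range given in the text.

```python
def cleave_and_polyadenylate(
    pre_mrna: str,
    signal: str = "AAUAAA",
    cleavage_offset: int = 20,  # assumed distance from the signal to the cleavage site
    tail_length: int = 200,     # upper end of the 150-200 adenosine range
) -> str:
    """Cleave downstream of the first poly(A) signal and add a poly(A) tail."""
    signal_start = pre_mrna.find(signal)
    if signal_start == -1:
        raise ValueError("no polyadenylation signal found")
    cleavage_site = signal_start + len(signal) + cleavage_offset
    if cleavage_site > len(pre_mrna):
        raise ValueError("transcript ends before the assumed cleavage site")
    upstream_product = pre_mrna[:cleavage_site]  # the downstream product is discarded
    return upstream_product + "A" * tail_length


# Hypothetical transcript: coding region, spacer, signal, 20 nt, then extra
# sequence transcribed by Pol II beyond the cleavage site.
transcript = "AUGGCCUAA" + "GCGC" + "AAUAAA" + "CU" * 10 + "GUGUGUGU"
polyadenylated = cleave_and_polyadenylate(transcript)
print(len(polyadenylated))
```

The sketch adds the tail in one step, whereas the text describes a slow initial phase of roughly 20 adenosines followed by rapid extension once PABPN1 has bound.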

1.1.5 RNA export

Figure 1.2 Possible products of alternative splicing of a hypothetical gene. Types of alternative splicing that can generate functionally distinct transcripts are depicted. Blue boxes indicate alternative exons.

Blencowe, BJ. Alternative splicing: new insights from global analyses. Cell 126:1 (2006). Copyright Elsevier Inc. Reproduced with permission.

Nuclear export of mRNAs occurs through nuclear pore complexes (NPCs) embedded in the nuclear envelope. Generally, translocation of proteins and RNAs through the NPC is carried out by soluble transport receptors, which recognize specific signals on the transport substrate and mediate the interaction between the transport receptor–cargo complex and NPC components called nucleoporins (8, 9). However, export of fully processed RNA is more complicated, since the transport substrate recognized by the mRNA export machinery is the messenger ribonucleoprotein particle (mRNP) consisting of the mRNA molecule in association with cap binding complex (CBC; CBP20 and CBP80), RNA binding proteins, splicing factors, the EJC proteins (Aly/REF), PABPN1 and other factors involved in pre-mRNA processing (8, 9). Thus, export of bulk mRNA is thought to be mediated by members of the conserved family of TAP/NXF proteins (8, 9). TAP interacts with components of the NPC and binds directly or indirectly to its RNA cargoes (usually by interaction with Aly/REF) via two distinct functional domains: an N-terminal cargo-binding domain and a C-terminal NPC-binding domain (8, 9).

1.1.6 Translational initiation

Following nuclear export, the newly processed mRNA (in association with CBC, PABPN1 and the EJC) undergoes a “pioneer round of translation” (10). This step is thought to assess the quality of RNA processing before commitment to significant protein synthesis (10). During this step, EJC proteins are removed, PABPN1 is replaced by PABPC (cytoplasmic isoform) and the CBC is replaced by eukaryotic initiation factor 4E (eIF4E) (10). eIF4E is part of the eIF4F translation initiation complex, consisting of eIF4E, eIF4G, and eIF4A (11, 12). eIF4E binds to the mRNA cap and recruits the 43S pre-initiation complex (PIC) by binding to eIF4G (11, 12). eIF4G is the large scaffolding protein onto which the initiation factors assemble by interaction with their corresponding domains. eIF4A is the ATP-dependent helicase that unwinds the mRNA. Two additional factors, eIF4B and eIF4H, are RNA-binding proteins that stimulate the activity of eIF4A and stabilize single-stranded RNA regions. eIF4G also binds PABPC, causing the mRNA to circularize, and stimulates the formation of the PIC (11, 12). Finally, the 60S ribosomal subunit is recruited and protein translation begins.

1.1.7 Interdependence of events in mRNA processing

For many years, the paradigm for mRNA processing was that pre-mRNA splicing was a post-transcriptional process, with the spliceosomal machinery devoted to the removal of introns from the transcripts. However, over the years, splicing and splicing factors were shown to impact additional processes during transcription and extending to mRNA export and translation, indicating a link between splicing and all the other steps in gene expression (5, 13). Furthermore, it has also been demonstrated over the past decade that pre-mRNA can be spliced co-transcriptionally. Co-transcriptional splicing allows functional integration of transcription and RNA processing, and could allow them to modulate one another, whereas post-transcriptional splicing could facilitate coupling RNA splicing with downstream events such as RNA export (5, 13). Often RNA binding proteins can act as multi-taskers with roles in alternative splicing, polyadenylation, RNA export and RNA transport (5, 13). Thus, there appear to be many opportunities for crosstalk between splicing and other RNA-processing steps in the cell. This additional level of regulation means that alternative splicing has a profound impact on many aspects of gene regulation.

1.2 Regulation of mRNA splicing

It is estimated that 80–95% of human multi-exon pre-mRNAs are alternatively spliced (3). Thus, to regulate mRNA isoform generation, there must be additional RNA sequences present in both exon and intron elements to either stimulate or inhibit splicing (14). These sequences are referred to as cis-acting RNA sequences and are often bound by trans-acting factors which facilitate or prevent the recruitment of the splicing machinery to these sites as depicted in Figure 1.3 (14).

1.2.1 Role of cis elements in splicing

The requirement of exonic sequences other than the splice sites for correct processing of certain transcripts was demonstrated experimentally by Reed and Maniatis (1986) (15). It was shown that some cis-acting RNA sequence elements located within the regulated exons increase exon inclusion by serving as binding sites for the assembly of multicomponent splicing enhancer complexes (15). Thus, these sequence elements were termed exonic splicing enhancers (ESEs). Other classes of splicing regulatory elements that recruit proteins to enhance or silence splicing were subsequently identified and named intronic splicing enhancers (ISEs), exonic splicing silencers (ESSs) and intronic splicing silencers (ISSs), depending on their location and effect on neighbouring splice sites. These elements allow the splicing machinery to discriminate between pseudoexons and real exons, and between competing splice sites (14). These silencer and enhancer sequences are often present near exon/intron junctions, suggesting that the interplay between the activation and repression of cis-acting elements, by trans-acting factors, regulates the extent of exon inclusion.


Figure 1.3 Regulation of alternative splicing by SR and hnRNP proteins. (A) Model for RS domain proteins in mediating the ESE-dependent inclusion of the alternative exon. Members of the SR family and SR-related proteins bind to exonic splicing enhancer (ESE) motifs (green bands) within the alternative exon (blue box) and facilitate the stable assembly of U1 and U2 snRNPs. SR-related splicing coactivator proteins (green ovals) serve to bridge interactions involving snRNPs and ESE-bound SR proteins. (B) Model for exonic splicing silencer (ESS) dependent skipping of the alternative exon promoted by the binding of an hnRNP protein to an ESS motif (purple band). Binding of the ESS motif by the hnRNP protein disrupts binding of one or more adjacent SR proteins, resulting in exon skipping. Not shown are interactions involving intronic splicing enhancers (ISE) and silencers (ISS), which can function to promote or repress interactions required for the inclusion of adjacent alternative exons.

Blencowe, BJ. Alternative splicing: new insights from global analyses. Cell 126:1 (2006). Copyright Elsevier Inc. Reproduced with permission.

Figure 1.4 Cellular alternative splicing factors (A) Classification of the main human alternative splicing factors by RNA-binding domain composition. Only the proteins containing RRM domains are shown. (B) Members of the SR protein family of splicing factors and their evolutionary relationship. Cléry, A. and F. H-T. Allain. A structural biology perspective of proteins involved in splicing regulation (Chapter 4, page 34) from Alternative pre-mRNA Splicing: Theory and Protocols, First Edition. Edited by Stefan Stamm, Chris Smith, and Reinhard Lührmann. (2012). Copyright Wiley-VCH Verlag GmbH & Co. KGaA. Adapted and reproduced with permission. Francisco Javier Blanco and Carmelo Bernabéu. The splicing factor SRSF1 as a marker for endothelial senescence. Front. Physiol. (2012). Copyright Blanco and Bernabéu.


1.2.2 Role of trans factors in splicing

The trans-acting cellular factors that regulate splicing can be categorized into three main families: SR proteins, hnRNPs, and tissue-specific splicing factors. All of these splicing factors contain different types of RNA-binding domains, with the most common being the RNA-recognition motifs (RRMs), KH domains and zinc fingers (see Figure 1.4) (14). These factors recognize specific RNA sequences, which in turn dictates their effect on select RNAs. Splicing enhancer sequences generally recruit SR proteins or spliceosomal components to enhance exon recognition. In contrast, splicing silencers generally influence RNA splicing events by recruiting heterogeneous nuclear ribonucleoproteins (hnRNPs) (14). This concept has been the general rule; however, recent studies have shown that the activity of a splicing factor as an inhibitor or enhancer is dependent on the location of protein binding relative to the regulated exon (16). Thus, the location of the splicing regulatory sequence, in addition to the sequence specificity and the balance of antagonistic splicing factors (SR proteins and hnRNPs, described in the following sections), dictates the splicing reaction.

1.2.2.1 SR-protein family of splicing factors

The majority of cellular splicing factors include serine/arginine-rich (SR) proteins and SR-related proteins, which contain N-terminal RNA binding domains called RNA recognition motifs (RRMs) and C-terminal domains rich in serine and arginine residues (RS domains). SR and SR-related proteins containing a single RRM include SRSF3 (SRp20), SRSF7 (9G8), SRSF2 (SC35), SRSF8 (SRp46), SRSF11 (SRp54), Tra2α, and Tra2β (14). However, most splicing factors contain multiple RRM copies. Five human SR proteins, SRSF1 (SF2/ASF), SRSF9 (SRp30c), SRSF5 (SRp40), SRSF6 (SRp55) and SRSF4 (SRp75), contain a canonical RRM and a pseudo-RRM and have different RNA-binding specificities (14). SR proteins help define exons and introns in pre-mRNA splicing by acting as bridges between snRNPs along the length of the pre-mRNA. Generally, SR and SR-related proteins enhance splicing by binding to exonic or intronic splicing enhancer (ESE or ISE) motifs and facilitating the stable assembly of U1 and U2 snRNPs to the pre-mRNA at adjacent splice sites (5) (Figure 1.3).

1.2.2.2 Heterogeneous nuclear ribonucleoproteins (hnRNPs)

Heterogeneous nuclear ribonucleoproteins (hnRNPs) also have RNA recognition motifs (RRMs) by which they interact with the pre-mRNA to regulate splicing, but lack the RS domains found in SR proteins (14). The hnRNP family consists of approximately 20 splicing factors including hnRNP A1 – U (17). In contrast to SR proteins, hnRNPs generally bind exon splicing silencers (ESSs) or intron splicing silencers (ISSs) to repress splicing. Thus, they generally compete with SR proteins in an antagonistic manner to determine whether an exon is included or skipped. hnRNP-bound splicing silencers have been shown to repress spliceosomal assembly through steric hindrance, multimerization along exons, or by looping out exons (5, 17) (see Figure 1.3). The steric hindrance mechanism involves binding of an hnRNP protein to an ESS leading to the direct displacement of an adjacent SR protein. The multimerization of hnRNPs along the alternative exon is thought to be mediated by the arginine/glycine (RG) repeat region of the protein, and is proposed to block the recruitment of snRNPs and the spliceosome machinery, resulting in exon skipping. In the “looping-out” mechanism, binding of hnRNP proteins to distal sites within the introns flanking an alternative exon results in preferential splicing of the distal splice sites and skipping of the alternative exon (5, 17). These models are not mutually exclusive and may operate in different pre-mRNAs.

1.2.3 Regulation of splicing factors

Many splicing factors are post-translationally modified by phosphorylation, glycosylation or methylation, to allow rapid alteration of splice site selection (4). The most common modification is reversible phosphorylation, and the function of SR proteins and hnRNPs is primarily regulated by phosphorylation and dephosphorylation by kinases and phosphatases, respectively. The primary protein kinase families that control SR protein phosphorylation include the SR protein kinase (SRPK) family, the Cdc2-like kinase (Clk) family, and topoisomerase I (4). A single SR protein may be modified by members of more than one kinase family to regulate alternative splicing (4). In contrast to the over 400 protein kinases encoded by the human genome, only 25 serine/threonine protein phosphatases are known (4). Two such proteins, protein phosphatase 1 and 2C (PP1 and PP2C), have been shown to dephosphorylate splicing factors by binding to a degenerate RVXF motif present in their interacting proteins (4). In addition to alteration of their activity, changes in the subcellular localization of these splicing factors affect their concentration in areas where splicing occurs and result in altered splice site selection. Several splicing factors shuttle between the nucleus and the cytosol, and their localization is sensitive to reversible phosphorylation that mediates interactions with export and import systems (18).

Thus, by influencing protein-protein and protein-RNA interactions, reversible protein phosphorylation modulates the assembly of regulatory proteins on pre-mRNA. It follows that even a small change in the proportions of the spliceosomal components or their regulatory kinases could trigger a change from the inclusion of an exon to its exclusion (14).

1.2.4 Splicing factors and signaling pathways

In addition to the various kinases and phosphatases that regulate modification of SR proteins and hnRNPs, there are additional levels of control upstream of these processes. Signaling cascades in the cell lead to the phosphorylation and dephosphorylation of the kinases and phosphatases, rendering them active or inactive for their subsequent roles in phosphorylating splicing regulatory proteins (4). Some of these signaling pathways are depicted in Figure 1.5. Numerous studies have shown that targeting the proteins involved in these pathways can alter splicing reactions via the modulation of SR protein or hnRNP activity.

A large number of splicing events are regulated by the phosphoinositide-3 kinase (PI3K)/Akt pathway, since spliceosomal proteins are the most abundant substrates for Akt (4). For example, insulin activates Akt, which phosphorylates the SR proteins SRSF5, SRSF1, and SRSF7 (SRp40, SF2/ASF, and 9G8, respectively), resulting in a shift in the splicing pattern of protein kinase C beta (PKCβ) toward exon inclusion, creating a PKCβ isoform that facilitates glucose uptake (4, 19, 20). In addition, studies by the Schaal group have demonstrated that many viruses exploit the PI3K/Akt signaling pathway for efficient viral replication and that pharmacological inhibition of this signaling cascade alters viral mRNA splicing (21, 22).

T cell receptor signaling has also been implicated in regulating alternative splicing. A study by Heyd and Lynch (2011) showed that T cell activation leads to reduced glycogen synthase kinase 3 (GSK3) activity such that phosphorylation of PTB-associated splicing factor (PSF) by GSK3 is reduced. The unphosphorylated form of PSF is released from a complex with TRAP150, allowing PSF to mediate exon skipping within CD45 mRNA via the splicing regulatory sequence ESS1 (23, 24). Similarly, a study by Matter et al (2002) demonstrated that activation of the Ras-ERK signaling pathway leads to phosphorylation of Sam68, which mediates alternative splicing of CD44 mRNA (25). Furthermore, stress-induced signaling via the p38 mitogen-activated protein kinase (MAPK) pathway has been shown to increase hnRNP A1 phosphorylation, resulting in altered splicing of an adenovirus E1A mRNA reporter (26).

Figure 1.5 Regulation of alternative splicing by signaling pathways. The p38 kinase transduces stress signals to hnRNP A1 by the MAPK pathway. The Wnt or T cell receptor (TCR) signaling pathway, by regulating GSK3, phosphorylates and potentiates the activity of PSF by releasing the splicing regulator from the inhibitory complex with TRAP150. Growth factor signals (GFs) activate both the Raf-MEK-ERK pathway to modify Sam68 and the PI3K-Akt pathway. Activated Akt binds to SRPKs and induces nuclear translocation of SRPKs. In the nucleus, SRPKs act in synergy with Clks to phosphorylate SR proteins. Thus, these signaling pathways ultimately affect the ability of splicing factors (hnRNP A1, Sam68, SR proteins, and PSF) to bind to splicing regulatory sequences and alter splice site usage of mRNAs transcribed by RNA polymerase II (Pol II). Zhou, Z and X Fu. Regulation of splicing by SR proteins and SR protein-specific kinases. Chromosoma. 122:3 (2013). Copyright Springer-Verlag Berlin Heidelberg. Adapted and reproduced with permission.

Together, these studies demonstrate that changes in the activity or levels of kinases or phosphatases in response to extracellular stimuli and subsequent signaling cascades have far-reaching consequences for gene expression. Thus, signal transduction pathways induce post-translational modification of multiple splicing regulators, which in turn function to modulate splice site selection in the nucleus. The spectrum of splicing regulators and the distinct activities of individual signaling pathways suggest roles for specific splicing programs in different cell types, during development or in the context of disease, cancer or viral infection (18).

1.3 Perturbation of alternative splicing in disease

It is becoming increasingly evident that a number of diseases are caused by aberrant splicing or the selection of “wrong” splice sites during mRNA processing. The selection of these splice sites can be caused by mutations in cis-acting sequences or by changes in trans-acting factors and their regulation (27). Since mRNA processing is coupled to transcription and translation, it is likely that these changes in alternative splicing affect transcription and translation as well. Over the past decades, several groups have identified links between changes in alternative splicing and cancer, neuromuscular disorders, and viral infections (27).

A considerable amount of research has been published on the regulation of Survival of Motor Neuron (SMN) pre-mRNA splicing (14, 27), which can be seen as a model for aberrant alternative splicing causing disease. Two almost identical genes, SMN1 and SMN2, encode the SMN protein: SMN1 yields functional protein, whereas SMN2 yields mostly nonfunctional protein because a single base transition in exon 7 causes that exon to be preferentially skipped (14, 27). The C→T mutation (C→U in the mRNA transcript) in the SMN2 gene at the 6th position of exon 7 is translationally silent but results in low, insufficient levels of functional SMN protein due to truncation of the transcript (14, 27). Autosomal recessive spinal muscular atrophy (SMA) is caused by the loss of SMN1 and the inability of SMN2 to compensate for that loss (14, 27). The disease is characterized by progressive paralysis caused by the loss of alpha-motor neurons in the spinal cord and is the most frequent genetic cause of infantile death (14, 27, 28). Since the SMN1 and SMN2 genes are nearly identical, it was generally believed that restoration of SMN2 exon 7 inclusion held the promise of a cure for SMA.

Numerous studies have uncovered a number of splicing regulatory elements within exon 7 and its flanking introns, including an enhancer element associated with splicing factor SRSF1 (SF2/ASF), and a silencer element associated with hnRNP A1 (14, 27). Many groups have attempted to modulate splicing and influence inclusion of exon 7 in SMN2 as a therapeutic approach to treat SMA. A recent study by Naryshkin et al (2014) validates this approach, using small molecules to shift the balance of SMN2 splicing toward the production of full-length SMN2 messenger RNA with a high degree of selectivity (28). In fact, administration of these compounds to a mouse model of severe SMA led to an increase in SMN protein levels in the brain, improvement of motor function, and increased longevity, suggesting that selective SMN2 splicing modifiers are a promising therapeutic strategy for patients with SMA (28).

The success of this approach in modulating mRNA processing to promote exon inclusion and rescue protein expression suggests that a similar strategy could be used to perturb the balance of viral mRNA splicing during HIV infection. Indeed, a number of studies have verified the use of small molecules as modulators of mRNA splicing in the context of numerous diseases, cancer and viral infection, as outlined in further detail in section 1.5.

1.4 HIV-1 utilizes host alternative splicing machinery for viral gene expression

A common mechanism among many human and animal viruses is the use of alternative splicing (AS) to maximize viral protein expression from a limited genome size. HIV-1 is one such virus, requiring AS for efficient viral replication during the infectious life cycle.

1.4.1 Overview of the HIV-1 lifecycle

HIV-1 is a complex retrovirus consisting of two identical RNA strands of 9.3 kb contained in a conical capsid, surrounded by a lipoprotein membrane (29). The glycoproteins on the surface of the virion are composed of trimers of an external glycoprotein, gp120, and a transmembrane protein, gp41. The gp120-gp41 trimer structure mediates HIV tropism towards cells expressing CD4 and chemokine co-receptors CCR5 or CXCR4. Viral entry into susceptible cells, such as CD4+ T lymphocytes, is mediated by the binding of gp120 to CD4 on the cell surface (Figure 1.6, Step 1), resulting in a conformational change in gp120 and exposure of a region that is able to bind the co-receptor, CCR5 or CXCR4 (29). Binding of the co-receptor causes another conformational change in gp41, initiating fusion of viral and cellular membranes and release of the viral capsid into the cytoplasm of the target cell. Once in the cytoplasm, the virus undergoes partial disassembly of the capsid and initiates reverse transcription and delivery of the viral double stranded DNA to the nucleus (Step 2). Once integrated into the host cell genome (Step 3), the HIV-1 provirus uses the host transcription, mRNA processing, and translation pathways for efficient viral gene expression (Steps 4-5). The HIV-1 RNA genome and associated viral proteins are assembled at the plasma membrane where release of the viral particle occurs (Step 6). Finally, proteolytic cleavage and maturation must occur for the virus particle to infect new host cells (Step 7).

1.4.2 Current treatment strategies for HIV-1

The drugs currently used to treat HIV-1 infection belong to four distinct classes targeting viral entry and each of the viral enzymes: reverse transcriptase, integrase and protease. The stages targeted by these classes of drugs are illustrated in Figure 1.6. Entry inhibitors block the penetration of HIV virions into their target cells by blocking fusion of the viral and cellular membranes (e.g. enfuvirtide/T20) or binding to the co-receptor CCR5 (e.g. maraviroc). Nucleoside/nucleotide reverse transcriptase inhibitors (NRTIs) are nucleoside or nucleotide analogues which act as DNA-chain terminators and inhibit reverse transcription of the viral RNA genome into DNA (e.g. zidovudine/AZT), while non-nucleoside reverse transcriptase inhibitors (NNRTIs) bind and inhibit reverse transcriptase activity (e.g. nevirapine) (30). Protease inhibitors target the viral protease to inhibit cleavage of precursor proteins (gag and gag-pol) (e.g. ritonavir), and integrase inhibitors prevent the viral DNA from integrating into the host genome (e.g. raltegravir) (30). Therapy for HIV-1 infection uses combinations of these anti-retroviral drugs and is known as highly active antiretroviral therapy (HAART). Current HAART regimens generally comprise three anti-retroviral drugs, usually two nucleoside analogues and either a protease inhibitor or a non-nucleoside reverse transcriptase inhibitor. Effective combination anti-retroviral therapy can suppress HIV viral load in patients and has dramatically improved HIV-associated morbidity and mortality. However, HAART requires strict adherence to long-term therapy to prevent the emergence of drug resistant virus from latent reservoir pools.

Figure 1.6 HIV-1 lifecycle and current treatment strategies. Diagram depicts stages of the HIV-1 lifecycle in a host CD4+ T cell (Steps 1-7). Current drugs used in HIV-1 treatments are indicated in red boxes next to their viral targets. NRTIs = nucleoside/nucleotide reverse transcriptase inhibitors (zidovudine, didanosine, zalcitabine, stavudine, lamivudine, abacavir, emtricitabine and tenofovir). NNRTIs = non-nucleoside reverse transcriptase inhibitors (nevirapine, delavirdine, efavirenz and etravirine).

1.4.3 Limitations of current HIV-1 therapies

HAART has greatly improved the quality of life of HIV-1-infected individuals; however, the success of this therapeutic approach is limited by patient non-adherence to therapy, development of drug resistance, side effects with prolonged use, and virus persistence in latent reservoirs (30). Viral drug resistance is particularly problematic because of HIV-1 genetic heterogeneity, high replication rates and the high mutation rates associated with reverse transcriptase (30). For example, the proportion of multidrug-resistant virus transmitted in new HIV infections increased in North America from 1.1% to 6.2% within a five-year period between 1995 and 2000, while the frequency of multidrug resistance detected by sequence analysis increased from 3.8% to 10.2% (31). Furthermore, among subjects infected with drug-resistant virus, the time to viral suppression after the initiation of antiretroviral therapy was longer, and the time to virologic failure was shorter (31). The prevalence of transmitted drug-resistant virus, especially multidrug-resistant HIV, has important implications for the continued use and management of current anti-viral therapies. The existence of fewer options for initial treatment and suboptimal responses to treatment among recently infected patients may seriously limit the expected reduction in the rate of disease progression and increase secondary transmission of drug-resistant variants. An additional challenge in treating HIV-1 infection is the ability of the virus to establish a latent cellular reservoir and avoid immune detection. HIV infection is currently treated as a chronic condition requiring life-long daily treatment because HAART does not eliminate resting long-lived cells containing integrated proviruses. If treatment is stopped, HIV-1 rebounds to very high levels as virus emerges from these cells. The mechanism by which HIV-1 persists latently in these infected cells is not known, but presumably requires interaction with cellular factors involved in chromatin modification pathways to keep the provirus in a latent state.

Thus, a better understanding of how HIV-1 interacts with the host cell would give insight towards curbing viral drug resistance and combating persistent viral infection. New drugs that act on stages of the HIV-1 lifecycle not currently targeted by HAART, and that are less susceptible to the development of resistant viral strains, should be explored to continue combating HIV-1 infection. Over the past several years, there has been a refocusing of HIV-1 research towards the development of drugs targeting cellular factors that are essential for viral infection, rather than targeting viral proteins. This approach would be advantageous because it would reduce the risk of developing viral drug resistance. There are a number of cellular factors, including host proteins essential for viral gene expression and RNA processing, that are promising targets for novel therapeutic strategies for HIV-1 infection.

1.4.4 HIV-1 RNA processing

The HIV-1 genome consists of long terminal repeat (LTR) regions flanking the open reading frames that encode fifteen distinct proteins (Figure 1.7A). The gag gene encodes the nucleocapsid, capsid, and matrix proteins. The pol gene encodes the viral reverse transcriptase, integrase and protease. The env gene encodes the glycoproteins gp120 and gp41. Other proteins encoded by the viral genome include regulatory proteins Rev and Tat, and accessory proteins Nef, Vif, Vpu, and Vpr (29).

Following integration into the host cell genome, the HIV-1 provirus is transcribed by the cellular RNA polymerase II (Pol II) to generate a 9 kb pre-mRNA. To generate all the proteins required for virion assembly from a single 9 kb transcript, HIV-1 relies on a controlled process of alternative splicing to generate over 40 mRNAs (32), a subset of which are depicted in Figure 1.7B. The viral RNAs are divided into three classes depending on their degree of splicing: unspliced (US) 9 kb RNAs, singly spliced (SS) 4 kb RNAs, and the multiply spliced (MS) 1.8 kb RNAs. Unlike eukaryotic cellular transcripts, HIV-1 requires a significant portion of viral RNA to remain unspliced as the viral RNA genome and to encode viral structural proteins (32). Thus, there must be a controlled process of alternative splicing to get efficient viral gene expression. HIV-1 uses suboptimal 5’ and 3’ splice sites (5’ and 3’ ss) to generate the different viral RNA species (32). These splice sites are in turn regulated by exonic splicing enhancers (ESEs), exonic splicing silencers (ESSs), and intronic splicing silencers (ISSs). Regulation of viral mRNA processing is described in detail in the following section.
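The three size classes described above can be illustrated with a short Python sketch. It assumes, purely for illustration, that a transcript is assigned to the class whose nominal size (9, 4 or 1.8 kb) is closest to its measured length; the function name and the nearest-size rule are hypothetical, and in practice class assignment depends on which splice sites were used rather than on length alone.

```python
# Nominal sizes (kb) of the three HIV-1 RNA classes named in the text.
NOMINAL_SIZE_KB = {"US": 9.0, "SS": 4.0, "MS": 1.8}


def classify_rna(size_kb: float) -> str:
    """Return the class whose nominal size is closest to size_kb.

    The nearest-size rule is an illustrative assumption, not a published method.
    """
    if size_kb <= 0:
        raise ValueError("size_kb must be positive")
    return min(NOMINAL_SIZE_KB, key=lambda cls: abs(NOMINAL_SIZE_KB[cls] - size_kb))


if __name__ == "__main__":
    for size in (9.2, 4.3, 2.0):
        print(size, classify_rna(size))
```

Such a rule would only be a first approximation, since individual isoforms within the SS and MS classes differ in length depending on exon 2 and exon 3 inclusion.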

1.4.5 Regulation of HIV-1 RNA splicing

The efficiency of splice site use is determined by the interactions between the proteins and the pre-mRNA and is influenced by the action of many cellular splicing factors. These factors bind to splicing regulatory elements in the pre-mRNA near the splice acceptor and splice donor sites, almost all of which are conserved across HIV-1 strains (32), to mediate inclusion and exclusion of nearby exons. Many of these elements, as well as the splicing factors that bind them, are extensively reviewed in Stoltzfus (2009), with more recently identified viral cis-elements described by Erkelenz et al (2015).

Figure 1.7 HIV-1 mRNA splicing and regulation. (A) Schematic diagram of HIV-1 genome indicating open reading frames (open rectangles) and LTRs (gray rectangles). (B) Locations of 5’ and 3’ss and RRE in the HIV-1 genome. The exons present in the SS (4 kb) and MS (1.8 kb) mRNA species corresponding to the HIV-1 genes are shown as open rectangles. Noncoding exon 1 is present in all spliced HIV-1 mRNA species, while exons 2 and 3 (black rectangles) are included in a fraction of the mRNA species. The exon compositions of the RNA species are shown with “I” designating incompletely spliced mRNA species and brackets indicating mRNA isoforms containing neither, only one, or both exons 2 and 3. (C) Locations of known splicing regulatory elements in the HIV-1 genome. Splicing enhancers and splicing silencers are designated by green and red rectangles, respectively.

Stoltzfus, CM. Regulation of HIV-1 alternative RNA splicing and its role in virus replication. Advances in Virus Research, Volume 74, Chapter 1 (2009). Copyright Elsevier Inc. Reproduced with permission. Erkelenz, S et al. Balanced splicing at the Tat-specific HIV-1 3′ss A3 is critical for HIV-1 replication. Retrovirology, 12:29 (2015). Copyright Erkelenz et al. Reproduced with permission.

Studies examining the intrinsic strength of the viral 5’ss and 3’ss revealed that the 5’ss D1 and D4 are relatively strong (closely match the consensus motif) while the 5’ss D2 and D3 are relatively weak, consistent with reduced complementarity to U1 snRNA (32). Furthermore, in contrast to 3’ss A2 and A3, which splice with an efficiency of at least 40% compared to an optimal control, A1, A4c, A4a, A4b, A5 and A7 have weak intrinsic strength (32, 33).
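The idea that a 5’ss is "strong" when it closely matches the consensus can be sketched as a simple position-by-position comparison. The Python sketch below assumes the commonly cited nine-nucleotide consensus CAG/GUAAGU (positions -3 to +6, written here in the DNA alphabet) and counts matching positions; the function name and the example sequences are hypothetical and are not the actual HIV-1 D1 to D4 sequences. A plain match count is a simplification and is not the scoring method used in the cited studies.

```python
# Assumed 5' splice site consensus, positions -3..-1 (exon) and +1..+6 (intron),
# written in the DNA alphabet. It is complementary to the 5' end of U1 snRNA.
CONSENSUS_5SS = "CAGGTAAGT"


def consensus_matches(site: str) -> int:
    """Count positions at which a 9-nt 5'ss agrees with the assumed consensus."""
    site = site.upper().replace("U", "T")  # accept RNA or DNA input
    if len(site) != len(CONSENSUS_5SS):
        raise ValueError("expected a 9-nt sequence spanning positions -3 to +6")
    return sum(a == b for a, b in zip(site, CONSENSUS_5SS))


if __name__ == "__main__":
    # Hypothetical example sequences, for illustration only.
    for name, seq in {"strong_like": "CAGGUAAGU", "weak_like": "AAGGTACGC"}.items():
        print(name, consensus_matches(seq), "of", len(CONSENSUS_5SS))
```

Under these assumptions, a site matching all nine positions would be expected to pair most extensively with U1 snRNA, while each mismatch reduces complementarity.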

However, addition of exonic sequences downstream of the 3’ ss significantly changes the efficiency of splicing, demonstrating the importance of splicing regulatory elements to functional splice site strength. The use of 3’ss A1, A4cab, A5 and A7 is considerably increased in the presence of their respective downstream exonic sequences, whereas the splicing efficiency at 3’ss A2 and A3 is decreased (32). Viral mRNAs encoding Vif, Vpr and Tat proteins are expressed at relatively low levels in infected cells, suggesting that 3’ss A1, A2 and A3 are rarely spliced. In contrast, viral mRNAs encoding Rev, Nef, Env, and Vpu are expressed at higher levels, suggesting that splicing at the 3’ss A4cab and A5 occurs more efficiently (32). In addition, approximately one half of all spliced viral mRNAs remove the downstream env-intron, indicating that 3’ss A7 is also used with high efficiency (32). These observations demonstrate that alternative splicing of HIV-1 mRNAs must be strictly controlled to allow efficient expression of all viral proteins.

Splicing at each of the viral splice sites is tightly regulated by neighbouring splicing elements such as exonic and intronic splicing silencers (ESSs/ISSs) and exonic splicing enhancers (ESEs) (see Figure 1.7C). The viral 3’ss A2, A3 and A7 contain ESS elements (ESS2, ESS3, and ESSV) within their downstream exonic sequences. Most of these elements contain motifs which match the consensus binding sequence for hnRNP A/B proteins and were found to act negatively on splice site activation (34). Studies have shown that depletion of hnRNP A1/A1B/A2/B1 resulted in inhibition of ESS2, ESS3 and ESSV-mediated splicing and that splicing can be rescued by addition of any of the depleted hnRNPs (34). In addition, a UGGGU sequence downstream of 3’ss A3 (ESS2p) was shown to be bound by hnRNP H to inhibit the tat-mRNA specific 3’ss A3 (34). A mutation within ESS2p was shown to cause reduced hnRNP H binding and a 2-fold increase in splicing at A3 (34).

An ISS element was also identified that regulates splicing at 3’ss A7 and was found to be hnRNP A/B-dependent. Disruption of this ISS by mutagenesis increases the splicing efficiency at 3’ss A7 (34). Furthermore, hnRNP mediation of ESS and ISS repression was shown to occur at early steps in the splicing process, with models proposing that the cooperative binding of hnRNPs to these sequences prevents efficient binding of the cellular splicing factors to the 3’ ss (34). The binding of hnRNPs to the silencer sequences thus acts antagonistically to the cellular factors and cis elements that increase the efficiency of splicing at neighbouring splice sites.

Members of the SR protein family were identified to positively regulate viral splicing by recognizing exonic splicing enhancer (ESE) elements (34). One of the earliest SR protein requirements identified for HIV-1 splicing was the requirement of SRSF1 (SF2/ASF) for splicing at the 3’ ss A7 (34). In fact, deletions downstream of A7 led to the identification of ESE3/(GAA)3 as an enhancer element (35). However, further mutational studies suggested that the ESE3/(GAA)3 element could act either as an ESS or an ESE in the context of exon 7 (a so-called Janus element) and that its activity may be determined by the relative amounts of hnRNP proteins and SRSF1 (35). In addition, studies by the Schaal group revealed that a guanosine-adenosine rich (GAR) enhancer within HIV-1 exon 5 is bound by the SR proteins SRSF1 and SRSF5 and allows recruitment of the U1 snRNP to the flanking 5’ss D4 (36). This recruitment is necessary for bridging interactions across the exon and splice site pairing, as exon 5 recognition in the absence of the GAR element can be partially bypassed by coexpression of a mutated U1 snRNA perfectly matching 5’ss D4 (36). Subsequent overexpression studies have identified ESE2 to be SRSF2 (SC35)-dependent and ESEVpr to be SRSF1-dependent (16). In addition, Exline et al (2008) showed that ESEVif (within the 5'-proximal region of exon 2) binds specifically to SRSF4 and that mutations within ESEVif resulted in altered Vif expression (37). Similarly, Kammler et al (2006) described an SRSF1-dependent ESE (ESEM) within exon 2 in which a single point mutation was shown to be detrimental for HIV-1 exon 2 recognition without affecting Rev-dependent Vif expression (38). A recent study by Erkelenz et al (2015) identified an additional splicing enhancer, ESEtat, located between ESS2p and ESE2/ESS2, which is critical for regulation of 3′ss A3 usage and viral tat-mRNA splicing. Subsequent in vitro binding assays suggest SRSF2 and SRSF6 as candidate splicing factors acting through ESEtat and ESE2 for 3′ss A3 activation (39).

The strict requirement for balanced splicing of viral mRNAs for HIV-1 replication is most strongly demonstrated by mutations in ESSV. Disruption of ESSV activity resulted in a selective increase in the levels of incompletely spliced Vpr-mRNAs and a reduction in the levels of US mRNAs and intracellular Gag protein levels (40). This oversplicing phenotype is consistent with a dramatic perturbation of the balance between spliced and unspliced viral mRNAs and a 10- to 20-fold reduction in virus particle production, probably due to insufficient accumulation of structural proteins required for capsid assembly (40). Consistent with its role as a key regulator of viral splicing, it has been shown that viruses lacking ESSV escape from their replication defect by second site mutations upon prolonged culturing, to switch off unbalanced exon 3 splice site recognition (40). Therefore, ESSV appears to be important in regulating HIV-1 exon 3 splicing to levels permitting both accumulation of unspliced mRNA for structural protein expression and vpr-mRNA formation. Furthermore, a recent study by Erkelenz et al (2015) has revealed that mutational inactivation or masking of the ESEtat element resulted in dramatic impairment of viral replication due to decreased accumulation of mRNAs encoding Tat (39). These studies further demonstrate that regulation of HIV-1 splicing, particularly of the splicing pattern of viral mRNAs encoding regulatory proteins, is critical for viral gene expression and that perturbations in splicing cause severe defects in viral replication.

Several studies have examined the effect of overexpression of SR proteins on splicing of HIV-1 mRNAs. As mentioned briefly, overexpression of SRSF2 and SRSF5 resulted in a selective increase of tat mRNA isoforms spliced at 3’ ss A3, while overexpression of SRSF1 resulted in exon 3 inclusion by activation of 3’ ss A2 splice site use (34). In addition, overexpression of SRSF1, SRSF2 and SRSF7 resulted in significant reduction of unspliced HIV-1 mRNAs and decreased Env expression. Likewise, previous studies in our lab have revealed that changes in the expression of hnRNP D, Tra2α, and Tra2β also modulate HIV-1 mRNA alternative splicing (41, 42). Studies by Lund et al (2012) demonstrated that siRNA-mediated depletion of hnRNP A1 and hnRNP A2 increased expression of viral structural proteins, while depletion of hnRNP H, hnRNP I or hnRNP K had little effect (41). In contrast, depletion of hnRNP D expression decreased synthesis of HIV-1 Gag and Env due to reduced accumulation of HIV-1 unspliced and singly spliced RNAs in the cytoplasm (41). Similarly, overexpression of Tra2α or Tra2β resulted in a marked reduction in HIV-1 Gag/Env expression by perturbation of HIV-1 RNA accumulation, altered viral splice site usage, and a block to export of HIV-1 genomic RNA (42). In addition, depletion of Tra2β resulted in a selective reduction in HIV-1 Env expression and an increase in multiply spliced viral RNA (42). Kinases that regulate phosphorylation of splicing factors have also been shown to alter HIV-1 gene expression, as the overexpression of CLK1 and CLK2 resulted in the enhancement and inhibition of HIV-1 Gag production, respectively (43). Together, these findings demonstrate that tight regulation of HIV-1 splicing is required for efficient virus replication and that this regulation can be abrogated by changes in the levels or phosphorylation status of cellular splicing factors.

Given the numerous studies which demonstrate that HIV-1 replication is severely impaired by mutations within splicing regulatory elements or changes in the expression levels of splicing factors that bind to these elements, it seems likely that the perturbation of the expression of cellular splicing factors may be a novel avenue by which to inhibit HIV-1 gene expression. Since the action of these splicing factors can be modulated by specific kinases and phosphatases by differential phosphorylation of the RS domains, the ratio of these regulatory enzymes can also play an important role in determining which pairs of splice sites are selected. Therefore, a novel therapeutic strategy can be outlined where targeting specific regulatory proteins involved in alternative splicing pathways leads to inhibition of HIV-1 viral RNA processing and hence virus replication.

1.4.6 HIV-1 gene expression and Rev-dependent export

Since HIV-1 relies on the host cellular machinery for splicing and export of viral US, SS and MS RNAs, it must adopt ways to bypass the cellular restriction on export of incompletely spliced mRNAs. For many years, the paradigm for the export of viral RNAs was as follows: During the early phase of viral gene expression, only the completely spliced MS RNAs are exported, presumably via the TAP-dependent export pathway, like all spliced cellular mRNAs, while the US and SS RNAs are degraded in the nucleus. The MS RNA encodes the viral regulatory factor, Rev, which contains both a nuclear localization signal (NLS) and a nuclear export signal (NES). The NLS and NES allow Rev to interact with Importin β and CRM1, respectively, so that Rev can shuttle between the nucleus and cytoplasm via the nuclear pore complex. When Rev has accumulated in the nucleus during the late phase of viral gene expression, it recognizes and binds the Rev response element (RRE) present in both the US and SS RNAs. Binding of Rev to the RRE, and interaction between Rev and CRM1, allows the export of US and SS viral RNAs by the CRM1-dependent export pathway, and subsequent expression of viral proteins encoded by these RNAs. Thus, HIV-1 was thought to bypass the nuclear retention mechanism by expressing the viral regulatory factor, Rev, to specifically transport the incompletely spliced viral RNAs via the CRM1-mediated pathway.

Figure 1.8 HIV-1 gene expression in host cell. Following integration of the HIV-1 provirus into the host genome, a single 9 kb transcript is produced. This transcript needs to be alternatively spliced to generate >40 mRNAs, which are divided into 3 classes: US, SS and MS RNA. There are two phases of HIV-1 gene expression. In the early phase, only the MS RNA is exported from the nucleus while the US and SS RNA are degraded. The MS RNA encodes regulatory proteins, importantly Rev. Once Rev has accumulated, during the late phase of gene expression, Rev can shuttle back to the nucleus and bind to the RRE present on US and SS RNA to allow their export and subsequent expression of viral structural proteins.

A recent publication by Taniguchi et al (2014) shows that Rev-mediated export of viral RNAs is more complicated than the paradigm initially suggested. The authors proposed a model for Rev-mediated export of RRE-containing mRNAs, whereby Rev binds to the RRE and also interacts with the cap-binding complex (CBC) at the 5’ end of the mRNA and competitively inhibits the interaction between CBC and Aly/REF, a component of the TAP-mediated export pathway. In this way, interaction of Aly/REF with viral mRNAs and subsequent recruitment of the TREX complex is suppressed, such that RRE-containing RNAs are preferentially exported via the CRM1-mediated pathway (44). It was further suggested that HIV-1 likely suppresses TAP-dependent RNA export as a means to prevent the association of TAP with the incompletely spliced viral US and SS RNAs, and thereby the specific nuclear retention and reduction of these RNAs (44). Thus, HIV-1 presumably utilizes Rev to circumvent TAP-mediated reduction in viral gene expression.

The molecular mechanism by which Rev binds to both the RRE and the distantly located CBC remains to be elucidated, but the authors propose that it most likely occurs as follows: RRE-containing RNA may form a closed loop structure between the 5’ end and the RRE through the interaction of Rev with the CBC, similar to the closed loop structure formed between the 5’ end and the poly(A) tail during cellular mRNA translation (44). An alternative model, in which CRM1 binding of Rev-RRE enhances Rev multimerization along the entire length of the RRE-containing RNA and association of Rev with the 5’ end is stabilized by CBC, was deemed unlikely since the intervening sequence is cleavable by DNase and RNase H (44). Taken together, this study demonstrates that Rev is crucial for efficient HIV-1 gene expression and is the mediator required to bypass the cellular export block of incompletely spliced mRNAs. Furthermore, since Rev interacts with many cellular proteins to carry out its function, it suggests that perturbation of specific Rev-cellular factor interactions could be a strategy to specifically inhibit HIV-1 mRNA processing and subsequent viral gene expression. Thus, an approach that targets both cellular regulators of alternative splicing and Rev function would be highly detrimental to viral replication and offer a novel therapeutic strategy for HIV-1 infection.


1.5 Modulation of RNA splicing as a therapeutic strategy

Since alternative pre-mRNA splicing is an important regulator of gene expression, the selection of ‘wrong’ alternative exons, leading to differential protein expression, is being increasingly recognized as the cause of numerous human diseases, cancers and viral infections (reviewed in (21, 27, 45)). Thus, strategies that target regulation of alternative splicing can be used to modify aberrant splicing patterns to treat these diseases.

1.5.1 Modulation of AS using small molecules

A promising line of research that has attracted recent attention involves the use of small molecules that act by interfering with cellular signaling pathways, thereby modifying the activity of splicing regulatory proteins through an altered cellular distribution or a change in phosphorylation state. For this, screening methods have been developed to identify small molecules from chemical libraries that regulate a given splicing event. Stoilov et al (2008) described a high-throughput screening assay to discover compounds that target the splicing reaction using a two-color fluorescent reporter system. The authors tested known bioactive compounds for their effect on inclusion of microtubule-associated protein tau (MAPT) exon 10. From their compound library screen, they identified digoxin, a cardiotonic steroid used in the treatment of heart failure, as a novel splicing modulator. Furthermore, another study by Anderson et al (2012) demonstrated that digitoxin, another cardiotonic steroid, regulates alternative splicing by depletion of SRSF3 and Tra2β. These observations identify previously characterized drugs as novel modulators of alternative splicing and demonstrate the feasibility of screening for compounds that alter exon inclusion.

Indeed, research during the last several years has identified a number of small molecules that can change alternative exon usage, most often by targeting histone deacetylases or by interfering with the phosphorylation of splicing factors (reviewed in (45-47)). Table 1.1 lists some of the small molecules that were identified to modulate splicing, with the compounds tested in the context of HIV-1 highlighted orange. There still remain many compounds for which the mechanistic basis of how they perturb splicing is not yet fully understood. Thus, further examination of these small molecules gives insight into alternative pre-mRNA splicing and, more importantly, paves the way for therapeutic application of these compounds to control diseases and infections that are dependent upon alternative splicing.


Table 1.1 List of small molecule inhibitors of alternative splicing and their molecular targets. Compounds tested in the context of HIV are indicated in orange. *unknown mechanism.

Compound | Drug type | Mechanism | Reference(s)
Spliceostatin A | FR901464-derivative | SF3b | Kaida et al, 2007
Sudemycin C1 | FR901464-derivative | SF3b | Fan et al, 2011
Sodium butyrate | Short chain fatty acid | HDAC inhibition | Chang et al, 2001
Valproic acid | Carbon branched-chain fatty acid | HDAC inhibition | Brichta et al, 2003
Phenylbutyrate | Short chain fatty acid | HDAC inhibition | Andreassi et al, 2004
M344 | Benzamide | HDAC inhibition | Riessland et al, 2006
SAHA | Hydroxyl-phenyl-octanediamide | HDAC inhibition | Hahnen et al, 2006
Aclarubicin | Aclacinomycin A | Topo 1 | Andreassi et al, 2001
Camptothecin | Alkaloid | Topo 1 | Gonzalez-Molleda et al, 2012
Isodiospyrin | Diospyrin derivative | Topo 1 | Tazi et al, 2005; Ting et al, 2003
NB-506 | Indolocarbazole derivative | Topo 1 | Pilch et al, 2001
IDC16 | Indol derivative | Topo 1, SRSF1 | Bakkour et al, 2007
IDC13 | Indol derivative | SR proteins* | Keriel et al, 2009
IDC78 | Pyridocarbazole | SR proteins* | Keriel et al, 2009
Digitoxin | Cardiac glycoside | SRSF3, Tra2β | Anderson et al, 2012
SRPIN340 | Isonicotinamide derivative | SRPK1, SRPK2 | Karakama et al, 2010; Fukuhara et al, 2006
TG003 | Benzothiazole | CLK1, CLK4 | Muraki et al, 2004; Wong et al, 2011
Leucettine L41 | Leucettamine B derivative | CLKs, DYRKs | Debdab et al, 2011
KH-CB19 | Dichloroindolyl enaminonitrile | CLK1, CLK4 | Fedorov et al, 2011
Chlorhexidine | Biguanide | CLK2, CLK3, CLK4 | Younis et al, 2010; Wong et al, 2011
Lithium chloride | - | GSK3 | Hernandez et al, 2004
AR-A014418 | Thiazole | GSK3 | Yadav et al, 2014; Hernandez et al, 2004
SB216763 | Indole maleimide | GSK3 | Heyd and Lynch, 2010
C6-ceramide | Ceramide analog | PP1 regulation | Chalfant et al, 2002
Tautomycin | Alkylmaleic anhydride | PP1 inhibition | Novoyatleva et al, 2008
Cantharidin | Natural toxin | PP1 inhibition | Novoyatleva et al, 2008
Digoxin | Cardiac glycoside | * HIV-1 Rev | Stoilov et al, 2008; Wong et al, 2013
8-azaguanine | Purine analog | * HIV-1 Rev | Wong et al, 2013
5350150 | Quinoline | * HIV-1 Rev | Wong et al, 2013
ABX464 | IDC16-derivative | * HIV-1 Rev | Campos et al, 2015


1.5.1.1 Spliceosome inhibitors

Spliceostatin A is a stabilized derivative of FR901464, a Pseudomonas bacterial fermentation product that has been shown to modulate pre-mRNA splicing (48). Spliceostatin A inhibits alternative splicing by binding the U2 small nuclear ribonucleoprotein (snRNP) component SF3b, which is essential for recognition of the pre-mRNA branch point (48). Studies by Kaida et al (2007) revealed that spliceostatin A inhibited interaction of an SF3b subunit with the pre-mRNA by preventing recruitment of U2 snRNP to sequences 5′ of the branch point (48). Sudemycin C1 is an analog of FR901464 and its derivative spliceostatin A. This compound and another analog, sudemycin E, similarly bind to SF3b, induce dissociation of the U2 snRNPs and alter pre-mRNA splicing (49). These compounds illustrate a proof of principle, but the development of small molecule inhibitors of splicing as therapeutics requires compounds that act in a more selective manner. Compounds that were shown to inhibit various factors that regulate the activity of splicing factors are described in the following sections.

1.5.1.2 Histone deacetylase (HDAC) inhibitors

HDAC inhibitors were identified in studies aimed at promoting exon 7 inclusion in SMN2 mRNA. The function of HDACs is to regulate chromatin structure and gene expression by controlling the acetylation state of histones. The acetylation of histones determines histone affinity for DNA; hence it follows that application of HDAC inhibitors would cause a coordinated change in the expression of splicing regulatory factors, and thus splicing. In support of this hypothesis, a change in SR protein expression was observed after sodium butyrate application in mice (50). Similarly, valproic acid, phenylbutyrate, M344 and SAHA (suberoylanilide hydroxamic acid) increased SMN2 RNA and protein levels in vitro (51-54). For valproic acid and M344, this occurred via two mechanisms: an increase in overall SMN2 expression through inhibition of targeted HDACs, and increased incorporation of exon 7 into the SMN2 transcripts through the activation of splicing factors (51, 53). Both valproic acid and phenylbutyrate were tested in clinical trials; however, the results of the trials were varied (47). Given the therapeutic potential of HDAC inhibitors and their proposed mechanisms of action, a search for further alternative splicing inhibitors is warranted in an effort to identify molecules with more suitable properties that can be used as therapeutic agents.


1.5.1.3 Topoisomerase (Topo I) inhibitors

DNA topoisomerase I (Topo I) has a dual function in RNA metabolism. The enzyme nicks the DNA strand during transcription to regulate supercoiling of the DNA (55, 56). Furthermore, studies have shown that Topo I phosphorylates SR proteins that associate with the nascent pre-mRNA and may act as a protein kinase in vivo (55, 56). It therefore comes as no surprise that testing of numerous drugs that target Topo I found that several of them alter splice-site selection (46). Diospyrin was found to inhibit spliceosomal assembly whereas its derivatives had specific inhibitory effects on catalytic steps in splicing (57, 58). Another Topo I inhibitor, NB-506, inhibited phosphorylation of SRSF1 (SF2/ASF) and perturbed the early formation of the spliceosome (59). Furthermore, an indole derivative, IDC16, was shown to interfere with exonic splicing enhancer activity of the SR protein splicing factor SRSF1 (60).

1.5.1.4 Kinase and phosphatase inhibitors

SR proteins are also phosphorylated by a family of nuclear cell division cycle 2-related kinases, termed CDC-like kinases (CLKs) 1–4. A specific inhibitor of these kinases, TG003, changes alternative splicing in reporter genes and has been tested as an anti-viral agent (61), but it is not active against HIV-1 (43). Similarly, chlorhexidine was found to selectively inhibit CLK2, CLK3 and CLK4 without having a general effect on splicing, and also inhibited CLK3 in the context of HIV-1 (43, 62). Yet another inhibitor of CLKs, KH-CB19, specifically inhibited CLK1 and CLK4 and altered the phosphorylation patterns of SR proteins (63). Leucettine L41, an inhibitor of CLKs and dual-specificity tyrosine phosphorylation-regulated kinases (DYRKs), inhibits phosphorylation of several SR proteins, including SRSF4, SRSF6, and SRSF7 (64). In contrast, SRPIN340 selectively inhibited SRPK1 and SRPK2 with no inhibition of CLK1, CLK4 or other kinases (65). When tested in the context of viral infections, SRPIN340 was not able to reproducibly inhibit HIV replication, but suppressed propagation of Sindbis virus and inhibited HCV replication in vitro (65, 66), suggesting that SRPIN340 and other SRPK1/2 inhibitors may be useful for limiting viral infections.

Furthermore, inhibition of glycogen synthase kinase 3 (GSK3) by AR-A014418 resulted in significant downregulation of splicing factors (SRSF1, SRSF5, PTBP1, and hnRNP) in U87 cells with downregulation of anti-apoptotic genes (67). In addition, Hernandez et al (2004) showed that inhibition of GSK3 by lithium chloride and AR-A014418 changed alternative splicing of exon 10 of the tau gene, mutations in which were found to cause aberrant usage of the exon leading to frontotemporal dementia and Alzheimer’s disease (68). In fact, compound-induced inhibition of GSK3 resulted in redistribution of SRSF2 to nuclear speckles. Studies by Heyd and Lynch (2010) revealed that SB216763-mediated inhibition of GSK3 resulted in a decrease in PTB-associated splicing factor (PSF) phosphorylation and subsequently induced PSF-mediated CD45 exon skipping in an ESS1-dependent manner (23).

Since protein phosphatase-1 (PP1) binds directly to a conserved motif in the RNA-recognition motif of at least nine different splicing-regulatory proteins, inhibition of PP1 would be expected to affect alternative splicing. Indeed this is the case, as tautomycin, a specific inhibitor of PP1, was found to induce changes in alternative splicing in cell culture and mouse models (69). Similar effects were seen for cantharidin, which inhibits both PP1 and protein phosphatase-2A (PP2A) (69). In addition, C6-ceramide has been shown to change splice-site selection in some apoptotic genes (70).

Together, these observations demonstrate that targeting of alternative splicing by small molecules can be achieved in a specific manner without detriment to the normal cellular splicing process. Thus, these studies have tremendous implications for the treatment of diseases associated with altered mRNA splicing events. HIV-1 infection is one disease that requires new therapeutic strategies to continue combating the development of drug-resistant viral strains. Since HIV-1 relies on cellular mRNA splicing to generate all viral proteins, small molecule modulators of alternative splicing are a promising avenue for further research.

1.6 Effect of splicing modulators on HIV-1 gene expression

As outlined above, several studies have shown that it is indeed feasible to modulate mRNA processing as a therapeutic approach for treating disease, cancer and viral infection. It is fair to presume that this method would also work in the context of HIV-1 infection. Indeed, previous work from our lab, as well as two recent studies, has verified that small molecules can be used to inhibit HIV-1 infection by modulating viral RNA splicing. Previously, our lab has shown that chlorhexidine, digoxin, 8-azaguanine, and 5350150 treatment potently inhibited HIV-1 gene expression in vitro. These compounds inhibited HIV-1 RNA processing by inducing oversplicing of viral RNA and/or perturbing HIV-1 Rev function (43, 71, 72). Although the compounds inhibited HIV-1 through different mechanisms of action, all led to the same outcome of decreased expression of viral structural proteins and the incompletely spliced viral RNAs. Thus, these findings demonstrate that perturbation of HIV-1 splicing by small molecules is an effective strategy to inhibit viral gene expression.

In addition to the small molecules tested by our lab, a study published by Bakkour et al (2007) demonstrated that the indole derivative, IDC16, suppresses the production of key HIV-1 proteins, thereby compromising subsequent synthesis of full-length HIV-1 pre-mRNA and assembly of infectious particles. IDC16 was also shown to inhibit replication of macrophage- and T cell–tropic laboratory strains, clinical isolates, and strains with high-level resistance to inhibitors of viral protease and reverse transcriptase (60). Importantly, drug treatment of primary blood cells did not alter splicing profiles of endogenous genes involved in cell cycle transition and apoptosis (60).

Furthermore, a recent study by Campos et al (2015) showed that ABX464, a synthetic derivative of IDC16 with decreased cytotoxic effects, inhibits replication of HIV-1 clinical isolates and decreases viral proliferation in humanized mouse models (73). The inhibitory effect of ABX464 was shown to be dose-dependent in peripheral blood mononuclear cells and in macrophages infected with different subtypes of HIV, with no adverse effects on cell viability when treated at concentrations in the micromolar range (73). Importantly, this compound did not select for drug-resistant mutations in vitro and controlled viral rebound in humanized mouse models for two months following cessation of treatment, while viral loads rebounded within a week in animals following cessation of HAART treatment (73). Thus, this drug is promising as a novel therapeutic agent for HIV-1 infection and is currently being tested in clinical trials. Together, these studies validate that small molecules targeted at modulating alternative splicing can be used as a novel therapeutic approach to treat HIV-1 infection. Since these compounds act on host cellular processes required for viral replication rather than on viral proteins, they might carry a lower risk of selecting for drug resistance, complement existing anti-viral therapies in combination with HAART, or serve as a second line of defense to combat drug-resistant viral strains. Thus, further studies of compounds that specifically inhibit HIV-1 alternative splicing without perturbing cellular splicing are warranted for continued success in combating HIV-1 infection.


1.7 Research objective and rationale

Since HIV gene expression is critically dependent upon controlled splicing of the viral transcript, perturbing mRNA splicing would have detrimental effects on HIV-1 gene expression. Thus, small molecules that are able to modulate RNA processing are promising as novel anti-HIV drugs. We and others have previously shown this to be true with the small molecules digoxin, 8-azaguanine, 5350150 (71, 72), IDC16 (60), and ABX464 (73). The success of these compounds in inhibiting HIV-1 gene expression prompted us to expand the repertoire of HIV-1 inhibitors and look for compounds that have modes of action distinct from those previously described. The potential to differentially affect HIV-1 gene expression would further validate the use of small molecule modulators of alternative splicing as a viable new strategy against HIV-1 replication. Furthermore, since current anti-viral therapies for HIV do not target viral RNA processing, this approach can complement existing treatments or be used as salvage therapy to combat drug-resistant virus.


2 Materials and Methods

2.1 HIV-1 provirus doxycycline-inducible cell lines

To determine the effects of small molecule treatment on HIV-1 gene expression, HeLa cells stably transduced with an inducible Tet-On HIV-1 system (as described by (43, 71, 72)) were used. Briefly, an HIV-1 LAI-2 viral genome was modified with the following changes: tet operator (tetO) DNA binding sites incorporated into the LTR promoter, an inactivating mutation in the Tat gene, five nucleotide substitutions in the TAR hairpin motif, and replacement of the nef gene with reverse tetracycline transactivator (rtTA) (74, 75). The provirus was further modified either with a deletion in the reverse transcriptase and integrase region of the pol gene by an MlsI restriction digest (B2 cell line) or by insertion of the gfp gene in the pol open reading frame, deleting the PR- and RT-coding regions (C7 cell line). In this system, rtTA undergoes a conformational change when bound by doxycycline (dox), allowing dox-bound rtTA to bind to the tetO sites and activate viral gene expression. Tat and its TAR binding site are inactivated so that HIV-1 gene expression is only induced in the presence of dox. Thus, these cells allow the production of virus particles from a single round of replication upon dox induction. All cell lines were maintained in Iscove’s modified Dulbecco’s medium (IMDM; Wisent) supplemented with 10% (vol/vol) fetal bovine serum (FBS, Wisent), 1% penicillin/streptomycin (P/S, Wisent) and 0.2% Amphotericin B (Wisent).

2.2 Assess activity of compounds on HIV-1 gene expression

2.2.1 Preparation of compounds

The compounds used in the treatment assay were obtained from ChemBridge. All compounds were solubilized at 10 mM or 1 mM stock concentrations in dimethyl sulfoxide (DMSO), aliquoted into microtubes and stored at -20°C for subsequent experiments.

2.2.2 Compound treatment assay

The compound treatment assays were performed as described by Wong et al (43, 71, 72). Briefly, B2 or C7 cells were seeded at 60-80% cell confluence in IMDM complete medium in 6-well, 24-well, 6 cm or 10 cm tissue culture plates (Sarstedt) one day prior to compound treatment and cultured overnight at 37°C in a 5% CO2 humidified incubator. The following day, compounds were diluted in Opti-MEM (Invitrogen/GIBCO) with equivalent concentrations of DMSO and added to each well or plate in a circular drop-wise manner to achieve the desired final concentration. The plates were then incubated for 3-5 hours in the presence of the compounds prior to induction with doxycycline (dox) at a final concentration of 2 μg/mL (an equal volume of IMDM complete was added to uninduced control samples) and incubated overnight at 37°C in a 5% CO2 humidified incubator. At 24 hours post compound treatment, 900 μL of culture medium was harvested and added to 100 μL of 10% Triton X-100 and incubated at 37°C for 1 hour prior to storage at -20°C for p24 antigen ELISA. The remaining culture medium was discarded and the cells were washed with PBS twice, before the addition of 2 mM EDTA-PBS for 15 minutes at 37°C in a 5% CO2 humidified incubator. Cells were lifted from the well or plate, collected in separate microtubes for RNA and protein, and pelleted by centrifugation at 3,800 x g for 5 minutes at room temperature. The supernatant was discarded and the cells were lysed in either 350 μL of total RNA lysis buffer (BioRad) for RNA or 100-200 μL of RIPA buffer (1% NP-40, 0.1% SDS, 0.5% Sodium Deoxycholate, 150 mM NaCl, 50 mM Tris-HCl) for protein in RNase-free microtubes. The lysates were kept on ice prior to storage at -20°C for further analysis.

Figure 2.1. Schematic of HIV-1 proviral system integrated in HeLa cell lines. To assess the effect of small molecules on HIV-1 gene expression, we used HeLa cells that have been stably integrated with an HIV-1 provirus. The provirus consists of an X4-tropic LAI genome that has been modified with the Tet-On regulatory system as previously described (71, 72, 74, 75). Briefly, the HIV-1 Nef gene was replaced with rtTA (reverse tetracycline transactivator) and Tat and its TAR binding site were mutated and functionally replaced with a TetOperator (TetO, double copy) within the LTR region. The genome was further modified by 1) MlsI deletion of the pol gene, deleting RT & IN (1000 bp deletion) or 2) replacement of a portion of the pol gene with gfp (to produce a Gag-GFP fusion protein) and stably transfected into HeLa cells (we call these the HeLa B2 and HeLa C7 cell lines, respectively). With the addition of the activator molecule, doxycycline, rtTA can bind doxycycline, causing a conformational change that allows it to bind to the TetO and induce viral gene expression.

2.3 HIV-1 p24 antigen ELISA

HIV-1 gene expression was measured by quantifying the levels of HIV-1 present in culture supernatants by ELISA for p24 Gag antigen using kits purchased from Frederick National Laboratory for Cancer Research (Leidos) and performed according to manufacturer’s instructions. ELISA plates were read at 450 nm and 650 nm on Thermo Scientific Multiskan FC Filter-based Photometer (Thermo Scientific) or the VersaMax microplate reader using Softmax Pro version 5.0 software (Molecular Devices). HIV-1 p24 concentration in the samples was calculated by inputting the absorbance of the sample into a four parameter sigmoid fit equation based on the two-fold serial dilutions of the HIV-1 p24 standard lysate and expressed relative to the concentration in DMSO-treated samples.
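The back-calculation from absorbance to p24 concentration can be illustrated with a minimal Python sketch. It is not the Softmax Pro procedure used here: the four parameters below are hypothetical placeholders (in practice they come from fitting the two-fold serial dilutions of the p24 standard on each plate), the concentration unit is assumed, and subtracting the 650 nm reading from the 450 nm reading is an assumption about how the two wavelengths are combined.

```python
# Hypothetical 4PL parameters (assumptions, not values from this study).
A_ZERO = 0.05    # absorbance at zero p24 (lower asymptote)
HILL = 1.2       # slope factor
MIDPOINT = 50.0  # p24 concentration at the curve midpoint (unit assumed)
A_MAX = 2.5      # absorbance at saturating p24 (upper asymptote)


def corrected_absorbance(a450, a650):
    """Subtract the 650 nm reference reading from the 450 nm signal (assumed correction)."""
    return a450 - a650


def four_pl(conc, a=A_ZERO, b=HILL, c=MIDPOINT, d=A_MAX):
    """Four-parameter logistic curve: absorbance predicted for a concentration."""
    return d + (a - d) / (1.0 + (conc / c) ** b)


def inverse_four_pl(absorbance, a=A_ZERO, b=HILL, c=MIDPOINT, d=A_MAX):
    """Solve the 4PL curve for concentration given a measured absorbance."""
    if not min(a, d) < absorbance < max(a, d):
        raise ValueError("absorbance lies outside the range of the standard curve")
    return c * ((a - d) / (absorbance - d) - 1.0) ** (1.0 / b)


def relative_p24(sample_abs, dmso_abs):
    """Express sample p24 relative to the DMSO-treated control."""
    return inverse_four_pl(sample_abs) / inverse_four_pl(dmso_abs)
```

Readings that fall outside the asymptotes of the standard curve raise an error rather than being extrapolated, since the inverse is undefined there; such samples would need to be re-assayed at a different dilution.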

2.4 XTT cytotoxicity assay

Cellular metabolism following compound treatment was measured by an XTT-based in vitro toxicology assay kit (Sigma-Aldrich) as a proxy for the degree of cytotoxicity relative to DMSO control treatment. This assay provides a spectrophotometric method for estimating cell number based on the mitochondrial dehydrogenase activity in viable cells, since an increase or decrease in viable cells relative to control cells would result in an accompanying change in the amount of the coloured formazan derivative generated. Briefly, HeLa cells were seeded at a density of ~8,000 cells / 100 μL in IMDM complete medium in 96-well tissue culture flat-bottom plates (Sarstedt) and treated as described above for the compound treatment assay. After 24 hours, culture supernatant was removed, replaced with 20% XTT solution (40% IMDM complete, 40% PBS, 20% XTT) and incubated at 37°C in a 5% CO2 humidified incubator for 2-6 hours. Plates were read at 450 nm and 650 nm on a Thermo Scientific Multiskan FC Filter-based Photometer (Thermo Scientific). Relative cell viability was calculated as the absorbance at 450 nm minus the absorbance at 650 nm and minus the absorbance of blank wells containing only the XTT solution (background signal), in compound-treated cells relative to DMSO-treated cells. To examine the long-term effects of the compounds on cell proliferation, HeLa cells were seeded at 2,000 to 6,000 cells / 100 μL in IMDM complete medium in 96-well tissue culture flat-bottom plates (Sarstedt) and treated as described above for the compound treatment assay. At 24, 72, and 96 hours post treatment, culture supernatant was removed, replaced with 20% XTT solution and incubated at 37°C in a 5% CO2 humidified incubator for 2-6 hours, and relative cell viability was measured in compound-treated cells relative to DMSO-treated cells, as described above.
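The relative viability calculation described above can be sketched in Python as follows. The function names and the averaging of replicate wells are illustrative assumptions, as is the choice to apply the same 650 nm correction to the blank wells before using them as background.

```python
from statistics import mean


def xtt_signal(a450, a650, blank):
    """Background-corrected XTT signal for one well."""
    return (a450 - a650) - blank


def relative_viability(treated_wells, dmso_wells, blank_wells):
    """Mean corrected signal of treated wells relative to DMSO-treated wells.

    Each argument is a list of (A450, A650) tuples, one per replicate well.
    """
    # Assumption: blank wells (XTT solution only) get the same 650 nm correction.
    blank = mean([a450 - a650 for a450, a650 in blank_wells])
    treated = mean([xtt_signal(a450, a650, blank) for a450, a650 in treated_wells])
    dmso = mean([xtt_signal(a450, a650, blank) for a450, a650 in dmso_wells])
    return treated / dmso
```

A value of 1.0 corresponds to the DMSO control; values below 1.0 indicate reduced metabolic activity in compound-treated wells.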

2.5 Analysis of HIV-1 protein expression

Protein concentration in cell lysates was quantified by Bradford assay and equal amounts of protein were run on 7, 10, 12, or 14% SDS-PAGE, depending on the protein of interest, under reducing conditions. Proteins were transferred to 0.2-0.45 μm PVDF (BioRad or Perkin-Elmer) by electrophoretic transfer or by the Trans-Blot Turbo blotting system (BioRad). Blots were blocked in either 5% Milk-PBS-T (5% Milk, 0.05% Tween-20, 1x PBS) or 3% BSA-PBS-T (3% BSA, 0.05% Tween-20, 1x PBS) for ≥1 hour at room temperature, prior to incubating the blots in primary antibody (all diluted in 3% BSA-PBS-T). Conditions used for the primary antibodies were as follows: purified mouse anti-p24 supernatant from hybridoma 183 (anti-HIV-1 Gag, NIH) at 1/500 dilution probed for 2 hours at room temperature, mouse anti-gp120 purified supernatant from hybridoma 902 (anti-HIV-1 Env, NIH) at 1/10 dilution probed overnight at 4°C, mouse monoclonal antibody to HIV-1 Rev (Abcam) at 1/1000 dilution probed overnight at 4°C, rabbit polyclonal antibody to HIV-1 Tat (Abcam) at 1/7500 dilution probed for 2 hours at room temperature, rabbit polyclonal antibody to GAPDH (Sigma-Aldrich) at 1/5000 dilution probed for 2 hours at room temperature, and mouse monoclonal antibody to α-Tubulin (Sigma-Aldrich) at 1/5000 dilution probed for 1 hour at room temperature. After incubations, blots were washed three times with PBS-T and incubated with a 1/5000 dilution of isotype-specific HRP-conjugated secondary antibody (Jackson ImmunoResearch) in PBS-T. Following washes, blots were visualized by ECL, ECL Plus (Perkin-Elmer), or Clarity Western ECL substrate (BioRad) and exposed to autoradiography film or imaged using the ChemiDoc MP imager (BioRad) and ImageLab (BioRad) software. Quantification of the relative intensity of the detected bands was done using ImageLab software and normalized to corresponding bands of the loading control (GAPDH or α-Tubulin).

2.6 Analysis of HIV-1 RNA expression and localization

2.6.1 RNA extraction and reverse transcription

Samples were processed and assayed as previously described (43, 71, 72). Briefly, total RNA was extracted from compound-treated cell pellets and genomic DNA was eliminated using the BioRad Aurum Total RNA Lysis Kit (BioRad) as per manufacturer’s instructions with the addition of Turbo DNase (Ambion). Purified RNA (0.5-2 μg) was reverse transcribed using M-MLV (Invitrogen) to generate complementary DNA (cDNA). The cDNA product was then diluted 1:7.5 in nuclease-free water and the samples stored at -20°C for further analysis.

2.6.2 Quantification of HIV-1 mRNA expression by qPCR

HIV-1 mRNA levels in DMSO- and compound-treated samples were quantified by qPCR using the Mastercycler ep realplex (Eppendorf) as described by Wong et al (43, 71, 72). Briefly, 25 μL reactions were run in duplicate in 96-well skirted plates (Axygen) using the standard curve method with a non-template control blank for each primer to control for contamination or primer-dimers. Each reaction was set up as follows: 0.4 μL of Taq DNA polymerase (5 U/μL, NEB), 2.5 μL of ThermoPol buffer, 2.5 μL of 10X SYBR Green I (Sigma-Aldrich), 2.5 μL of 2.5 mM dNTPs, 1.0 μL of 5' primer (0.1 μg/μL), 1.0 μL of 3' primer (0.1 μg/μL), 10.1 μL of H2O, and 5 μL of cDNA. The forward and reverse primers used in the quantitation of HIV-1 mRNA are outlined below: unspliced (US), 5' - GAC GCT CTC GCA CCC ATC TC - 3' and 5' - CTG AAG CGC GCA CGG CAA - 3'; singly spliced (SS), 5' - GGC GGC GAC TGG AAG AAG C - 3' and 5' - CTA TGA TTA CTA TGG ACC ACA C - 3'; and multiply spliced (MS), 5' - GAC TCA TCA AGT TTC TCT ATC AAA - 3' and 5' - AGT CTC TCA AGC GGT GGT - 3'.
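As a quick consistency check, the per-reaction volumes listed above sum to 25 μL, and the recipe can be scaled into a master mix. The sketch below is illustrative only: the 10% pipetting overage and the decision to leave cDNA out of the master mix are assumptions, not part of the described protocol.

```python
# Per-reaction volumes in microlitres, as listed in the protocol text.
REACTION_UL = {
    "Taq DNA polymerase (5 U/uL)": 0.4,
    "ThermoPol buffer": 2.5,
    "10X SYBR Green I": 2.5,
    "2.5 mM dNTPs": 2.5,
    "5' primer (0.1 ug/uL)": 1.0,
    "3' primer (0.1 ug/uL)": 1.0,
    "H2O": 10.1,
    "cDNA": 5.0,
}


def total_volume(recipe):
    """Total volume of one reaction, rounded to avoid floating-point noise."""
    return round(sum(recipe.values()), 3)


def master_mix(recipe, n_reactions, overage=0.10, exclude=("cDNA",)):
    """Scale shared components for n reactions plus an assumed pipetting overage."""
    scale = n_reactions * (1.0 + overage)
    return {
        name: round(volume * scale, 2)
        for name, volume in recipe.items()
        if name not in exclude  # template is added to each well separately
    }
```

For example, `master_mix(REACTION_UL, 10)` returns the volumes of every shared component for ten reactions, with the cDNA template added to each well individually.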


Results were normalized to the housekeeping gene, β-actin, which served as an internal loading control. The forward and reverse primers used to detect β-actin were as follows: 5'-GAG CGG TTC CGC TGC CCT GAG GCA CTC-3' and 5'-GGG CAG TGA TCT CCT TCT GCA TCC TG-3'. cDNA amplification was detected under the following cycle conditions: 95°C, 2 min followed by 40 cycles of 95°C, 15s; 60°C, 15s; and 72°C, 15s (for US, MS, and Actin) and 95°C, 2 min followed by 40 cycles of 95°C, 30s; 55°C, 30s; and 72°C, 30s (for SS). qPCR values crossing threshold (Ct) were obtained during the exponential amplification phase and exported into Microsoft Excel where gene quantification was evaluated using the absolute quantification method, normalized to β-actin expression, and expressed relative to DMSO-treatment.
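The quantification steps above (standard curve, normalization to β-actin, expression relative to DMSO) can be sketched in Python. This is a generic illustration rather than the Excel workflow used here; the function names are hypothetical, and the slope and intercept of each standard curve are assumed to come from a linear fit of Ct against log10 of the standard quantity.

```python
def quantity_from_ct(ct, slope, intercept):
    """Invert a standard curve of the form Ct = slope * log10(quantity) + intercept."""
    return 10 ** ((ct - intercept) / slope)


def normalized_expression(target_ct, actin_ct, target_curve, actin_curve):
    """Target quantity divided by the beta-actin quantity of the same sample.

    Each curve is a (slope, intercept) tuple from that primer pair's standards.
    """
    target_qty = quantity_from_ct(target_ct, *target_curve)
    actin_qty = quantity_from_ct(actin_ct, *actin_curve)
    return target_qty / actin_qty


def relative_to_dmso(sample, dmso, target_curve, actin_curve):
    """Normalized expression of a treated sample relative to the DMSO control.

    `sample` and `dmso` are (target_ct, actin_ct) tuples.
    """
    treated = normalized_expression(sample[0], sample[1], target_curve, actin_curve)
    control = normalized_expression(dmso[0], dmso[1], target_curve, actin_curve)
    return treated / control
```

Because each primer pair has its own standard curve, differences in amplification efficiency between the HIV-1 and β-actin reactions are absorbed by the curves rather than assumed to be equal.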

2.6.3 Analysis of splice site selection within the HIV-1 MS RNA

The effect of compound treatment on splice site selection within the HIV-1 MS RNA class was analyzed by radioactive RT-PCR as described previously (43, 71, 72). Total RNA from DMSO- or compound-treated samples was extracted, reverse transcribed to cDNA and diluted as described above. The forward and reverse primers used to amplify HIV-1 MS RNAs are as follows: 5'-GGG CAG TGA TCT CCT TCT GCA TCC TG -3' and 5' -TCA TTG CCA CTG TCT TCT GCT CT - 3'. Initial rounds of cold RT-PCR were set up as follows: 1 μL of cDNA, 1 μL of Taq DNA polymerase, 5 μL of 10X ThermoPol buffer, 4 μL of 2.5 mM dNTPs, 10 μL of forward primer (10 μM), 10 μL of reverse primer (10 μM), and 19 μL of H2O in a 50 μL final reaction volume. Thermocycler conditions used were 95°C, 2 min followed by 34 cycles of 95°C, 1 min; 57°C, 1 min; and 68°C, 1 min; and ended with 68°C, 5 min; and 4°C, indefinitely. A second round of radioactive PCR was run with the following changes/additions to the conditions described above: 3 μL of diluted cDNA from the first PCR reaction (1/10th dilution), 0.5 μL of α-32P-dCTP (Perkin Elmer), and 16.5 μL of H2O. The same thermocycler conditions were also used except only 5 cycles were run. An equal volume of loading buffer (90% formamide, 10 mM EDTA, 0.025% xylene cyanol, and 0.025% bromophenol blue) was added to the products and heated at 95°C for 5 minutes prior to resolving radioactive reaction products using 6% denaturing polyacrylamide gels (8 M Urea, 1xTBE) and detection using a Typhoon 9400 PhosphorImager (Amersham). Gel densitometry was analyzed using ImageJ software (NIH) to calculate mRNA levels of HIV-1 MS mRNA isoforms, measured as the density of an individual isoform divided by the total density of all visible viral RNA species in a sample.
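The isoform calculation described above (density of one band divided by the total density of all visible viral RNA species in the lane) amounts to a simple normalization. A minimal Python sketch, with placeholder band labels rather than actual isoform names:

```python
def isoform_fractions(band_densities):
    """Return each band's share of the total lane density.

    `band_densities` maps a band label to its background-subtracted density
    (for example, as exported from ImageJ).
    """
    total = sum(band_densities.values())
    if total <= 0:
        raise ValueError("total lane density must be positive")
    return {band: density / total for band, density in band_densities.items()}
```

Because each lane is normalized to its own total, the fractions describe shifts in splice site usage within the MS RNA class and are independent of differences in overall signal between lanes.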


2.6.4 Analysis of HIV-1 US RNA subcellular localization

Changes in HIV-1 US RNA subcellular distribution in response to compound treatment were analyzed by fluorescence in situ hybridization in HeLa C7 cells, as described by Wong et al., 2013. I confirmed the induction of viral gene expression in HeLa C7 cells with doxycycline by fluorescence microscopy. Induced cells (+ Dox) showed strong GFP fluorescence in the cytoplasm while cells incubated in the absence of doxycycline only showed background fluorescence (Figure 2.2A). Briefly, HeLa C7 cells were treated with DMSO or compounds as described initially in the compound treatment assay, except that, after 24 hours, cells were fixed in 3.7% formaldehyde-1X PBS for 10 minutes at room temperature. Cells were permeabilized by treatment with 70% ethanol, then rehydrated in hybridization buffer (10% formamide, 2X SSPE). Hybridization was performed using a mixture of 48 Quasar 570-labelled oligonucleotides spanning the matrix, capsid, and nucleocapsid regions of HIV-1 as detailed by the supplier (Biosearch Technologies). Following washing to remove unbound probe, nuclei were stained with DAPI and images were acquired using a Leica DMR microscope at 630× magnification by Raymond Wong.

To ensure that the effects of the compounds in HeLa C7 cells were similar to their effects in HeLa B2 cells, I tested a wide range of concentrations of the compounds in HeLa C7 cells and measured GagGFP fluorescence intensity as a readout for HIV-1 gene expression using the Typhoon 9400 imager and ImageJ software. First, I determined the range of cell density that would provide a linear relationship between cell number and fluorescence intensity. Briefly, HeLa C7 cells were seeded at a range of densities and incubated either in the presence or absence of doxycycline for a period of 24 hours, after which the cells were washed and stored in PBS, covered, at either room temperature (less than 10 minutes) or at 4°C (longer than 10 minutes). GagGFP fluorescence was detected using the Typhoon 9400 imager (laser emission 488 nm) and the mean fluorescence intensities were used to calculate the HIV-1 GagGFP signal in uninduced and induced cells using ImageJ software (blank wells were used as background signal). Induced cells showed a linear relationship between cell number and mean fluorescence intensity between 2.0 × 10⁴ and 8.0 × 10⁴ cells (r² = 0.9810) while uninduced cells had almost undetectable fluorescence, as expected (Figure 2.2B). Next, I examined whether GagGFP fluorescence reflected HIV-1 Gag levels as measured by p24 antigen ELISA following compound treatment. To do this, HeLa C7 cells were seeded, treated and induced as outlined previously (section 2.2.2); however, instead of harvesting cells by EDTA, cells were


Figure 2.2. Characterization of HeLa C7 cells for fluorescence studies. (A) Representative images of HeLa C7 cells treated with DMSO in the absence (uninduced) or presence (induced) of doxycycline (N ≥ 3). Cells were viewed at 630X (oil immersion) magnification. Images are cropped to show a representative field of view. (B) Measurement of mean fluorescence intensity in uninduced and induced cells at various cell densities (N = 1). Linear regression of mean fluorescence intensity in induced cells (between 2 × 10⁴ and 8 × 10⁴ cells) is indicated by the dotted line and labelled with the regression coefficient.

washed and stored in PBS (covered) and GFP fluorescence was detected using the Typhoon 9400 imager as described above. The mean fluorescence intensities were used to calculate the HIV-1 GagGFP signal in the compound-treated cells relative to the DMSO control-treated cells. Furthermore, XTT assays were performed in parallel as described previously (section 2.4) to examine the effect of the compounds on HeLa C7 cell metabolism as a proxy for cell viability. The IC80-90 concentrations for the compounds were approximately 15 μM, 35 μM, >3 μM and 3 μM for 892, 791, 833 and 191, respectively (Figure 2.3). These concentrations correlate well with those used in HeLa B2 cells as measured by p24 antigen ELISA (Figure 3.2), suggesting that the effect of the compounds on HIV-1 US RNA localization and GagGFP expression in HeLa C7 cells reflects the effect of the compounds in HeLa B2 cells as well.

2.7 Monitoring protein synthesis by SUnSET

The effect of the compounds on nascent protein synthesis was measured by surface sensing of translation (SUnSET) as described by Schmidt et al., 2009 (76). Cells were incubated with puromycin, an aminoacyl tRNA analog, to allow puromycin incorporation into newly translated peptides and prevention of further ribosomal elongation by chain termination. In this way, newly synthesized polypeptides were “tagged” with puromycin and detected by SDS-PAGE using an antibody against puromycin. To assess the effect of the compounds on protein translation, B2 cells were prepared and treated as described for the compound treatment assay, but were incubated with 10 μg/mL of puromycin for a period of 30 minutes at 37°C in a 5% CO2 humidified incubator prior to harvesting cell lysates for protein analysis (as described previously). Protein concentration in cell lysates was quantified by Bradford assay and equal amounts of protein (30-50 μg) were run on either 10% or 4-15% (gradient) Tris-glycine gels. Proteins were transferred to 0.2 μm PVDF (BioRad) using the Trans-Blot Turbo blotting system (BioRad) and blots were blocked in 5% Milk-PBS-T for ≥2 hours at room temperature. Blots were probed overnight at 4°C with a 1/5000 dilution of mouse monoclonal antibody to puromycin (anti-12D10, EMD Millipore) in 3% BSA-PBS-T. After incubation, blots were washed three times with PBS-T for 10 minutes and incubated with a 1/5000 dilution of isotype-specific HRP-conjugated anti-mouse antibody in PBS-T (Jackson ImmunoResearch). Following washes, blots were developed using ECL Plus (Perkin-Elmer) or Clarity (BioRad) and imaged using the ChemiDoc MP Imager (BioRad). To quantify the levels of protein synthesis, the

Figure 2.3. Compound treatment in HeLa C7 cells inhibits HIV-1 gene expression in a dose-dependent manner similar to effects observed in HeLa B2 cells. The dose range of the compounds which inhibit HIV-1 GagGFP expression in HeLa C7 cells was measured by mean fluorescence intensity and expressed relative to fluorescence intensity in DMSO-treated samples (N ≥ 3, * = p ≤ 0.05, ** = p ≤ 0.01, and *** = p ≤ 0.001). The effect of the compounds on cellular metabolism at the indicated concentrations was measured using an XTT assay as a readout of viable cells and expressed relative to absorbance reads of DMSO-treated samples (N ≥ 3, * = p ≤ 0.05, ** = p ≤ 0.01, and *** = p ≤ 0.001). Error bars indicate standard error of the mean (SEM).

volume intensity in each lane of compound-treated sample was calculated relative to the DMSO-treated, dox-induced sample lane and normalized to GAPDH loading control using ImageLab software (BioRad) from at least four independent experiments.

2.8 Viral protein degradation assay

To determine whether the compounds directly cause destabilization and/or degradation of HIV-1 regulatory proteins, the decay of HIV-1 Tat levels was compared between DMSO-treated and compound-treated protein lysates in the presence of cycloheximide, an inhibitor of protein translation. First, B2 cells were seeded in 6 cm or 10 cm plates (multiple plates per treatment for different time points) in IMDM complete medium and HIV-1 gene expression was induced with doxycycline (dox) for 24 hours at 37°C in a 5% CO2 humidified incubator to allow viral protein expression. Next, 5 μg/mL cycloheximide (Sigma-Aldrich) was added to block new protein synthesis in combination with either DMSO or the compounds and cell lysates were harvested for protein every 2 hours. Protein concentration in cell lysates was quantified by Bradford assay and equal amounts of protein were run on 13 or 14% gels by SDS-PAGE. Proteins were transferred, blocked, probed with antibodies for Tat and GAPDH, and detected as described above. Quantification of the relative intensity of the detected bands was performed using ImageLab software (BioRad) and normalized to corresponding bands of the loading control (GAPDH) from at least three independent experiments.

2.9 Proteasomal degradation protection assay

To determine whether the effect of the compounds on HIV-1 gene expression can be directly reversed by protecting viral regulatory proteins from degradation, B2 cells were treated with compounds in the presence of MG132, a proteasome inhibitor. Briefly, the compound treatment assay was performed as previously described with the addition of 10 μM MG132 (Sigma-Aldrich) to compound-treated cells 8 hours prior to harvesting cell lysates for protein. Protein concentration in cell lysates was quantified by Bradford assay and equal amounts of protein were run on 13 or 14% gels by SDS-PAGE. Proteins were transferred, blocked, probed with antibodies for Tat and GAPDH, and detected as described above. Quantification of the relative intensity of the detected bands was performed using ImageLab software (BioRad) and normalized to corresponding bands of the loading control (GAPDH) from at least three independent experiments.

2.10 Analysis of cellular alternative splicing events by RT-PCR

The effect of the compounds on alternative splicing of cellular RNA was analyzed by RT-PCR by Peter Stoilov as previously described (Wong et al., 2013). Briefly, total RNA from three independent biological replicates of each compound treatment was reverse transcribed using random hexamers and RNaseH(-) reverse transcriptase. The samples were assayed by medium throughput RT-PCR to determine the inclusion levels of alternatively spliced exons and splice sites located in 73 events. For this purpose, 73 primer sets (see Appendix I), each containing a fluorescently (5-FAM) labeled primer, were used. The fluorescently labeled PCR products were denatured in formamide and quantified using an ABI Prism capillary sequencer (Life Technologies). The PCR reaction assembly and the subsequent liquid handling steps were carried out using 384 well PCR plates (Axygen) and automated using Biomek 2000 and Multimek 96 liquid handlers. The fragment analysis was performed using PeakScanner software (Life Technologies) in batch mode and automated using custom scripts written in Python. The inclusion level of each exon was calculated as the amount of transcripts carrying the alternative exon relative to the total amount of all transcripts detected in the PCR reaction, and results are summarized for compound treatment in comparison to DMSO treatment.

2.11 Analysis of cellular alternative splicing by RNA sequencing

2.11.1 Sample preparation for RNA sequencing (RNAseq)

The mRNA in total RNA from DMSO-, 791-, and 191-treated samples (RNA extraction described earlier) was converted into a library of template molecules suitable for subsequent cluster generation and DNA sequencing using the Illumina TruSeq RNA Sample Preparation Kit (Illumina) according to the manufacturer’s instructions. First, total RNA integrity was verified using an Agilent Technologies 2100 Bioanalyzer (RNA Integrity Number (RIN) value ≥ 8). Next, polyadenylated RNA was enriched twice from 1 μg of total RNA using oligo-dT attached magnetic beads and fragmented under elevated temperature. The RNA fragments were then copied into first strand cDNA using reverse transcriptase and random primers, followed by second strand cDNA synthesis using DNA Polymerase I and RNase H. Finally, end repair, A-tailing, and paired end adaptor ligation of the cDNA fragments were performed prior to PCR amplification to create the cDNA library.


2.11.2 RNAseq

The cDNA library was validated (passed quality control on a Bioanalyzer DNA 1000 chip (Agilent)), normalized and pooled for cluster generation. cDNA libraries were sequenced on the Illumina HiSeq2500 (paired-end, 125 bp) with version four chemistry following the manufacturer’s protocols.

2.11.3 Analysis of RNAseq data

The full human genome and transcriptomic sequences were downloaded from the UCSC Genome Browser database and Ensembl, respectively, as described by Irimia et al., 2014 (77), and were analyzed by Dr. Sandy Pan (Blencowe Lab, University of Toronto). For each gene, a canonical transcript was selected for gene expression (GE) analysis based on the hierarchy derived from the BioMart associated transcript names or, if this information was not available, the longest protein-coding transcript was selected as the gene representative. Exon annotations and genomic coordinates for alternative splicing (AS) analysis were derived from tables downloaded from the UCSC Genome Browser database. To determine GE or AS changes in an unbiased way, the effective number of unique mappable positions in each transcript (i.e. the effective length) was determined by aligning sequences with unique transcriptomic alignment to the human genome using Bowtie, by Dr. Sandy Pan (Blencowe Lab, University of Toronto). Briefly, the reads obtained from the sequencing were first mapped to the human genome, reads that mapped to more than one place in the genome were removed, and the remaining reads were aligned to the transcriptome. Then, the effective mappable positions were counted by mapping k-mers from the transcriptome, of the same length as the reads, to the genome, removing the k-mers that mapped to more than one place in the genome, and mapping the remaining k-mers back to the transcriptome. In this way, the "unmappable" positions were disregarded, since if a k-mer extracted from the transcriptome cannot be aligned, reads from that position cannot be aligned either.
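The k-mer counting idea behind the effective length can be illustrated with a toy example. The Python sketch below is not the Bowtie-based pipeline used for the analysis: it assumes exact string matching on a single strand, with no mismatches, and all function names and sequences are hypothetical. It follows the description above literally, discarding k-mers that occur more than once in the genome and then counting those with exactly one occurrence in the transcriptome.

```python
def count_occurrences(text, kmer):
    """Count (possibly overlapping) exact occurrences of kmer in text."""
    count = 0
    start = text.find(kmer)
    while start != -1:
        count += 1
        start = text.find(kmer, start + 1)
    return count


def effective_length(transcript, genome, transcriptome, k):
    """Count start positions in transcript whose k-mer is uniquely mappable.

    Toy stand-in for alignment: a k-mer is dropped if it occurs more than
    once in the genome, and kept only if it occurs exactly once across
    all transcripts in the transcriptome.
    """
    effective = 0
    for i in range(len(transcript) - k + 1):
        kmer = transcript[i:i + k]
        if count_occurrences(genome, kmer) > 1:
            continue  # multi-mapping in the genome: position is unmappable
        if sum(count_occurrences(t, kmer) for t in transcriptome) == 1:
            effective += 1
    return effective


# Hypothetical sequences chosen so that one k-mer (ACGT) repeats in the genome.
toy_genome = "ACGTACGTTTGACCA"
toy_transcript = "ACGTTTGA"
toy_effective = effective_length(toy_transcript, toy_genome, [toy_transcript], 4)
```

In this toy case the transcript has five 4-mer start positions, one of which is discarded because its k-mer occurs twice in the genome.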

2.11.3.1 Gene expression estimation

For each sample, the corresponding mRNA-Seq data were aligned against the human genome by Dr. Sandy Pan (Blencowe Lab, University of Toronto) using Bowtie, allowing for a maximum of two mismatches. Reads with one unique genomic alignment were then aligned against the canonical transcriptome and, for each transcript, the number of reads with one unique transcriptomic alignment was counted. The expression level of genes was quantified as corrected ‘reads per kilobase of exon model per million mapped reads’ (cRPKM), a widely used metric to estimate gene expression levels. The expression cutoff was 0.5 cRPKM, corresponding to the transcript of the gene being present if there were ≥10 reads that mapped uniquely to a single genomic locus. Approximately 19,847 Ensembl annotated protein-coding genes were compared to create a list of differentially expressed genes. Genes were considered differentially expressed if the fold change in cRPKM was ≥ 2 in compound-treated versus DMSO-treated samples.
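As a rough illustration of these thresholds, the Python sketch below assumes that cRPKM follows the standard RPKM formula with transcript length replaced by the effective number of uniquely mappable positions. It further assumes that a fold change of ≥ 2 is counted in either direction and that a gene must reach the 0.5 cRPKM cutoff in at least one sample; these conventions and all function names are assumptions for illustration rather than details taken from the analysis pipeline.

```python
def crpkm(unique_reads, effective_length, total_mapped_reads):
    """RPKM computed with the effective (uniquely mappable) transcript length."""
    if effective_length <= 0 or total_mapped_reads <= 0:
        raise ValueError("effective length and library size must be positive")
    return unique_reads * 1e9 / (effective_length * total_mapped_reads)


def is_differentially_expressed(treated, control, min_fold=2.0, cutoff=0.5):
    """Flag a gene whose cRPKM changes by at least min_fold in either direction.

    Assumption: the gene must reach the expression cutoff in at least one
    of the two samples to be considered at all.
    """
    if max(treated, control) < cutoff:
        return False
    low, high = sorted((treated, control))
    if low == 0:
        return True  # detected in only one of the two samples
    return high / low >= min_fold
```

For example, 100 uniquely mapped reads on a transcript with 2,000 effective positions in a library of 20 million mapped reads gives a cRPKM of 2.5.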

2.11.3.2 Percent spliced in (PSI) estimation

Every internal exon in each annotated transcript was considered a potential “cassette” exon as described previously (77). Briefly, each “cassette” AS event was defined by three exons: C1, A and C2, where A was the alternative exon, and C1 and C2 were the 5´ and 3´ constitutive exons, respectively. For each event, spliced junctions were defined as follows: C1A (connecting exons C1 and A), AC2 (connecting exons A and C2), and one alternative junction, C1C2 (connecting exons C1 and C2). For each sample, the corresponding mRNA-Seq data were aligned against the human genome using Bowtie, allowing for a maximum of two mismatches. Reads that did not map to the genome were then aligned to the full non-redundant set of junction sequences and, for each junction, the number of reads with one unique alignment mapping to it was counted. For each junction, the corresponding read count was normalized for mappability by multiplying the read count by the ratio between the maximum number of mappable positions and its effective number of unique mappable positions (as defined above). The percent inclusion, or “percent spliced-in” (PSI) value, for each internal exon was defined as: PSI = 100 × average(#C1A,#AC2) / (#C1C2 + average(#C1A,#AC2)), where #C1A, #AC2 and #C1C2 were the normalized read counts for the associated junctions. Exons were considered alternative in a sample if 5 ≤ PSI ≤ 95. In addition, “high confidence” PSI levels were defined as those PSI values that fulfilled the following coverage and balance criteria: (1) max(min(#C1A,#AC2),#C1C2) ≥ 5 AND min(#C1A,#AC2) + #C1C2 ≥ 10; and (2) |log2(#C1A/#AC2)| ≤ 1 OR max(#C1A,#AC2) < #C1C2. The goal of the first criterion was to ensure enough read coverage for sufficient precision and resolution in the estimation of PSI levels. The goal of the second criterion was to exclude AS events where there was a high imbalance in read counts between the two junctions formed by exon inclusion, since these imbalances can confound PSI estimates for cassette AS events. For comparison of AS levels between pairs of samples, Pearson correlation was applied to PSI levels. Events were considered differentially spliced between DMSO- and compound-treated samples if changes in PSI levels were ≥ 10.
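The PSI definition and the coverage and balance criteria above can be written out directly. The following Python sketch is illustrative only and is not the script used for the analysis: the function names are hypothetical, the inputs are assumed to be already normalized junction read counts, and treating a zero count at either inclusion junction as failing the log-ratio test is an assumption made here to avoid an undefined logarithm.

```python
import math


def psi(c1a, ac2, c1c2):
    """Percent spliced-in from normalized junction read counts."""
    inclusion = (c1a + ac2) / 2.0
    denominator = c1c2 + inclusion
    if denominator == 0:
        return None  # no junction reads, PSI is undefined
    return 100.0 * inclusion / denominator


def is_high_confidence(c1a, ac2, c1c2):
    """Apply the coverage criterion and the balance criterion."""
    coverage = max(min(c1a, ac2), c1c2) >= 5 and min(c1a, ac2) + c1c2 >= 10
    # Assumption: the log ratio is only evaluated when both counts are positive.
    within_twofold = c1a > 0 and ac2 > 0 and abs(math.log2(c1a / ac2)) <= 1
    balance = within_twofold or max(c1a, ac2) < c1c2
    return coverage and balance


def is_differentially_spliced(psi_treated, psi_control, min_delta=10.0):
    """Flag events whose PSI differs by at least min_delta between samples."""
    return abs(psi_treated - psi_control) >= min_delta
```

As a worked case, junction counts of 30 (C1A), 50 (AC2) and 10 (C1C2) give an average inclusion count of 40 and a PSI of 100 × 40 / 50 = 80, and the event passes both criteria.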

2.12 Compound treatment assay in primary cells

2.12.1 Human primary cell donors and cell preparation

Peripheral blood mononuclear cells (PBMCs) were isolated from healthy (HIV-uninfected) volunteer blood donors as described by Dobson-Belaire et al., 2010 (78). Informed consent was obtained from participants in accordance with the guidelines for conduct of clinical research at the University of Toronto and St. Michael’s Hospital, Toronto, Ontario, Canada. Briefly, PBMCs were isolated from the volunteers by leukapheresis (Spectra apheresis system, Gambro BCT) or whole blood collection (by Gordon McSheffrey). PBMCs were collected using Ficoll-Paque Plus (Amersham Biosciences) following the manufacturer’s instructions (PBMCs obtained from whole blood were further depleted of monocytes by Gordon McSheffrey) and stored at -80°C in 90% (vol/vol) heat-inactivated fetal calf serum (FCS, HyClone) and 10% (vol/vol) dimethyl sulfoxide (DMSO, Sigma-Aldrich) for subsequent experimentation.

2.12.2 Generation of replication-competent HIV-1 virus

HIV-1 R5 BaL virus was generated in U87.CD4.CCR5 cells (NIH AIDS reagent program #4035) by Dr. Alex Chen. Briefly, U87 cells were grown in Dulbecco’s Modified Eagle’s Medium (DMEM, Wisent) supplemented with 10% [vol/vol] heat-inactivated fetal bovine serum (FBS, Wisent), 1 μg/ml puromycin (Sigma-Aldrich), and 300 μg/ml G418 (Sigma-Aldrich) in a T75 tissue culture flask (Sarstedt). After 24 hours (approximately 70% cell confluency), the cells were infected with the HIV BaL stock (obtained from Dr. Donald Branch) at a multiplicity of infection (MOI) of 0.01 for 1 hour at 37°C in a 5% CO2 humidified incubator. After 1 hour, the cells were washed twice with DMEM medium to remove the remaining HIV BaL virus and cultured in fresh DMEM medium at 37°C in a 5% CO2 humidified incubator. Viral supernatants were harvested by filtering through a 0.45 μm filter at different days post-infection and the level of virus was measured by p24 antigen ELISA. Viral supernatants harvested on Day 10 post infection were found to correspond to peak levels of viral replication and these supernatants were stored in aliquots at -80°C for subsequent experiments.


2.12.3 HIV-1 BaL infection of primary cells

PBMCs were thawed, washed with RPMI 1640 complete medium and cultured in RPMI 1640 complete medium containing 2 μg/mL of PHA-L (Sigma-Aldrich) and 20 U/mL of IL-2 (BD Pharmingen) at 37°C in a 5% CO2 humidified incubator for 72 hours. Subsequently, cells were counted and a portion of the cells was separated to another tube for uninfected control treatments. The remaining PBMCs were resuspended in HIV-1 BaL at a multiplicity of infection (MOI) of approximately 0.01 in a total volume of 1 mL and infected by spinoculation for 1 hour at 900 x g at room temperature. Subsequently, cells were washed twice with room temperature RPMI 1640 complete medium and resuspended to a concentration of 5 × 10⁵ cells/mL in complete RPMI 1640 containing 40 U/mL of IL-2. Cells were seeded in 6-well or 12-well tissue culture plates (Sarstedt and Falcon, respectively) in a volume of 1 mL in preparation for compound treatment.

2.12.4 Compound treatment of primary cells

Compounds were prepared at 2X the desired concentrations in complete RPMI 1640 with equivalent concentrations of dimethyl sulfoxide (DMSO) and added to infected PBMCs or uninfected control PBMCs to a total volume of 2 mL/well. Azidothymidine (AZT, Sigma-Aldrich) was used as a control treatment at a final concentration of 3.74 μM. Plates were incubated at 37°C in a 5% CO2 humidified incubator for a period of eight days. On day 4 post infection, culture medium was replenished with the compounds and IL-2 in fresh complete RPMI 1640. On days 0, 2, 4, and 6 post infection, 450 μL of culture supernatant was harvested, lysed with 50 μL of 10% Triton X-100 at room temperature for approximately 1 hour and stored at -20°C for p24 antigen ELISA. Subsequently, 20 μL of culture medium was harvested to assess percent cell viability by trypan blue exclusion using glasstic slides (Kova). On day 8 post infection, 1.0-1.2 mL of culture medium was harvested and centrifuged at 2,000 rpm for 5 minutes, and 450 μL of culture supernatant was harvested for p24 antigen ELISA as described for the previous days. The remaining supernatant was discarded and the pellet was resuspended in 100-200 μL of complete RPMI 1640 for assessing cell viability by trypan blue exclusion as described for previous days. If necessary, cells were further diluted in complete RPMI 1640 for more accurate counts. Relative percent cell viability in compound-treated samples versus DMSO control-treated samples was calculated as follows: [(total viable cells / total cells) for compound-treated samples] / [(total viable cells / total cells) for DMSO-treated samples].
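The viability calculation above is a ratio of two viable fractions. A minimal Python sketch of that arithmetic follows; the function name and the example counts are hypothetical, and the result is returned as a ratio that can be multiplied by 100 to express it as a percentage.

```python
def relative_viability(viable_compound, total_compound, viable_dmso, total_dmso):
    """Viable fraction in a compound-treated well relative to the DMSO control."""
    if total_compound <= 0 or total_dmso <= 0 or viable_dmso <= 0:
        raise ValueError("cell counts must be positive")
    fraction_compound = viable_compound / total_compound
    fraction_dmso = viable_dmso / total_dmso
    return fraction_compound / fraction_dmso


# Hypothetical trypan blue counts: 72 of 100 cells viable with compound,
# 90 of 100 viable with DMSO.
example_ratio = relative_viability(72, 100, 90, 100)
```

With these illustrative counts the compound-treated well retains about 80% of the viability seen in the DMSO control.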


2.13 Statistical analysis

In vitro experiments were all performed on at least three separate occasions and are represented as the mean ± the standard error of the mean (SEM), unless otherwise stated. Statistical significance of comparisons between two samples was calculated using the paired two-tailed Student’s t test (Microsoft Excel) and graphs were generated using Prism 5.0 software (GraphPad). Significant differences are represented by comparison to DMSO-treated control samples with the following legend: * = p ≤ 0.05, ** = p ≤ 0.01 and *** = p ≤ 0.001. Values of p ≤ 0.05 were considered statistically significant.
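For readers who want to reproduce the summary statistics outside of Excel, the Python sketch below shows how the mean, SEM, paired t statistic and significance legend relate to one another. It is an illustration under stated assumptions rather than the procedure used here: the function names are hypothetical, only the standard library is used, and the p value itself is not computed because that requires a t distribution with n - 1 degrees of freedom from a statistics package.

```python
import math
import statistics


def mean_and_sem(values):
    """Mean and standard error of the mean for at least two replicates."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are required")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


def paired_t_statistic(treated, control):
    """t statistic for paired samples (mean difference divided by its SEM)."""
    if len(treated) != len(control):
        raise ValueError("paired samples must have equal length")
    differences = [t - c for t, c in zip(treated, control)]
    mean_diff, sem_diff = mean_and_sem(differences)
    if sem_diff == 0:
        raise ValueError("differences are identical; t is undefined")
    return mean_diff / sem_diff


def significance_stars(p_value):
    """Map a p value onto the legend used in the figures."""
    if p_value <= 0.001:
        return "***"
    if p_value <= 0.01:
        return "**"
    if p_value <= 0.05:
        return "*"
    return ""
```

The t statistic would then be compared against a two-tailed t distribution to obtain the p value that is passed to `significance_stars`.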


3 Results

Contributions: Results described in sections 3.1 through 3.5 include data collected and analyzed by both myself and Raymond W. Wong as part of my undergraduate research project. The initial screen of sixty compounds was done by Raymond W. Wong. The radioactive RT-PCR examining the effect of the compounds on splice site selection within the HIV-1 MS RNAs was done by Alan Cochrane from RNA samples prepared by me. Sections 3.6 and beyond describe studies conducted and analyzed by me as part of my graduate research project. Testing of the compounds in SupT1 T cell lines was done by Raymond W. Wong. The RT-PCR assessing the effect of the compounds on select cellular alternative splicing events was done by Peter Stoilov using RNA samples prepared by me. RNAseq was performed by the Donnelly Sequencing Centre with subsequent mapping of reads and calculation of percent spliced in (PSI) scores and corrected RPKM values done by Sandy Pan. Testing of the maximum tolerable doses of these compounds in mouse models was done by Liang Ming.

3.1 Identification of four compounds that suppress HIV-1 gene expression in HeLa cells

The success of digoxin as a potent inhibitor of HIV-1 gene expression, described previously by Wong et al. (2013), led us to screen other small molecule compounds for activity against HIV. We tested over sixty compounds identified as RNA splicing modulators using an SMN2 minigene reporter (Dr. Peter Stoilov at West Virginia, unpublished) for their ability to inhibit HIV-1 gene expression. We identified four compounds, designated 191, 791, 833, and 892, as potent inhibitors of HIV-1 gene expression (Figure 3.1). The four compounds differed in the number of five- and six-membered rings they contained, but did not have a steroid-ring structure like digoxin and other cardiotonic steroids (Figure 3.1A). Portions of both the 791 and 191 structures resembled nucleotide bases, while portions of the 892 and 833 structures resembled amido groups. In addition, both 791 and 191 contained chlorine and/or fluorine groups at the ends of their structures. These compounds were structurally dissimilar to each other and to previously characterized modulators of HIV-1 RNA processing: digoxin, 8-azaguanine, and 5350150 (the latter two herein referred to as 8-aza and 150) (Wong et al, 2013).


Figure 3.1. Screen of RNA splicing modulators identifies four potent inhibitors of HIV-1 gene expression. (A) Structures of compounds tested. (B) Effect of compound treatment on HIV-1 virion accumulation in culture supernatant as measured by p24 antigen ELISA and expressed relative to DMSO-treated samples (N ≥ 17, *** = p ≤ 0.001). Uninduced, DMSO-treated (DMSO, - Dox) samples were included as negative controls. Concentrations of the compounds were as follows: 15 μM for 892, 30 μM for 791, and 2 μM for 833 and 191.


3.1.1 Previously published literature for 191, 791, 833, and 892 activity

Since these compounds were active against HIV-1, I used SciFinder to investigate whether their activity had been previously described in the scientific literature or in patent applications. To date, 791 and 833 have not been published in the literature or patented; however, there is limited information available for the activity of 191 and 892, as well as structures similar to 791, in other contexts.

191 has been previously tested for activity against microsomal prostaglandin E synthase-1 (mPGES-1), an essential enzyme involved in inflammatory diseases such as rheumatoid arthritis, fever, and pain (79). Since several compounds targeting human mPGES-1 were not specific for murine models of mPGES-1, 191 was tested in a screen with three other compounds for their activity against murine mPGES-1. 191 was shown to inhibit the enzymatic activity of murine mPGES-1 by 71% when used at a concentration of 50 μM (79). In addition, binding of 191 to mPGES-1 was modeled using protein homology to define molecular determinants of mPGES-1 ligand binding for further rational drug design (79).

892 and similarly structured compounds have been patented as putative activators of AMP-activated protein kinase (AMPK) (WO 2012027548), modulators of telomerase binding (WO 20122097600 and US 201200160260), and activators of histone deacetylase 1 (HDAC1) (WO 2010011318). Interestingly, a compound that is structurally similar to 892 was tested for inhibitory activity in the context of Hepatitis C virus (HCV) and was shown to inhibit enzymatic activity of HCV protease by ~57% at 50 μM (80).

Two compounds resembling 791 were tested for the ability to inhibit the activity of cyclin dependent kinase 2 (CDK2)/cyclin A. These compounds differ in the side groups attached to the core pyrimidine ring structure. One compound, designated 12a, has a phenol group in place of the methyl group and a methyl group in place of the phenol ring with a chlorine in 791. 12a was shown to inhibit CDK2/cyclin A activity in vitro with an IC50 of 0.25 μM (81).


3.2 191, 791, 833, and 892 potently inhibited HIV-1 gene expression in a dose-dependent manner

To determine the basis for the effect of the compounds on HIV-1 gene expression, we treated HeLa cells containing a doxycycline-inducible HIV-1 provirus (Figure 2.1) with each of the compounds added to the cell culture medium. Treatment of HeLa B2 cells with the compounds and doxycycline resulted in inhibition of HIV-1 viral production by 80-90% relative to DMSO treatment, as measured by p24 antigen ELISA, at concentrations in the low μM range (Figure 3.1B). Uninduced, DMSO-treated cells showed no p24 Gag expression, as expected. Furthermore, inhibition of HIV-1 replication with compound treatment was dose-dependent with no significant cytotoxicity observed with compounds 892, 833, or 191 at 24 hours post treatment (Figure 3.2). High doses of 791 (>30 μM) had a significant effect on cell viability, as measured by an XTT assay, in HeLa B2 cells, but 791 did not show significant toxicity in CD4+ SupT1 cells at that concentration (Raymond W. Wong, unpublished) and was active in PBMCs at much lower concentrations with little to no cytotoxicity (preliminary data, see Figure 4.5). In addition, 791, 833, and 191 maintained their inhibitory activity in the context of HIV-1 replication in CD4+ SupT1 cells at concentrations which potently inhibited HIV-1 gene expression in B2 cells, with no significant cytotoxicity (Raymond W. Wong, unpublished).

3.3 191, 791, 833, and 892 decreased HIV-1 structural and regulatory protein expression

Since compound treatment potently inhibits virus production, we examined the effect of the compounds on expression of multiple viral proteins. Following compound treatment and doxycycline induction for 24 hours, cell lysates were harvested for protein and analyzed by SDS-PAGE using antibodies to detect viral structural proteins Gag and Env, as well as regulatory proteins Rev and Tat. Representative western blots from at least three independent experiments are shown in Figure 3.3 and Figure 3.4. All four compounds reduced the levels of p55, p41, and p24 Gag proteins and gp160 and gp120 Env proteins relative to DMSO treatment (Figure 3.3). Furthermore, uninduced, DMSO-treated cells showed no viral protein expression, as expected. Blotting for GAPDH or α-tubulin was used to ensure equal loading of total protein across all the samples and to allow comparison of viral protein expression. The effect of the compounds on viral regulatory proteins, however, is very different from that observed with previously

Figure 3.2. Compound treatment inhibits HIV-1 gene expression in a dose-dependent manner. The dose range of the compounds which inhibit HIV-1 virion production in culture supernatant was measured by p24 antigen ELISA and expressed relative to p24 Gag levels in DMSO-treated samples (N ≥ 3, * = p ≤ 0.05, ** = p ≤ 0.01, and *** = p ≤ 0.001). The effect of the compounds on cellular metabolism, at the ranges of concentrations tested, was measured using an XTT assay as a readout of viable cells and expressed relative to absorbance reads of DMSO-treated samples (N ≥ 3, * = p ≤ 0.05, ** = p ≤ 0.01, and *** = p ≤ 0.001). Error bars indicate standard error of the mean (SEM).

characterized HIV-1 inhibitors (Figure 3.4). Digoxin treatment resulted in the depletion of Rev and p14 Tat levels, but had no effect on the levels of p16 Tat, while 8-Aza and 150 treatment did not affect either Rev or Tat levels relative to DMSO treatment. These results are consistent with previously published data (Wong et al, 2013 and Wong et al, 2013). Together, these results suggest that 191, 791, 833, and 892 potently inhibit HIV-1 protein expression in vitro by blocking expression of both early (Rev, Tat) and late (Gag, Env) HIV-1 proteins.

3.4 191, 791, 833, and 892 reduced HIV-1 US and SS RNA but not MS RNA

To determine whether the dramatic loss of viral proteins is accompanied by changes in viral mRNA levels, the effect of compound treatment on the abundance of HIV-1 RNA classes was examined by qRT-PCR. Total RNA was isolated from DMSO- or compound-treated cells, and qPCR was performed using forward and reverse primers specific to β-actin (internal control for normalization) as well as HIV-1 unspliced (US), singly spliced (SS), and multiply spliced (MS) RNAs. Analysis of HIV-1 RNA abundance revealed that the compounds reduced levels of HIV-1 US and SS RNAs with no significant changes in levels of MS RNA relative to DMSO treatment. Uninduced, DMSO-treated cells showed no viral RNA expression, as expected (Figure 3.5). These data correlated with the reduced levels of Gag, Env, and p14 Tat (Figures 3.3 and 3.4) since these proteins are encoded by HIV-1 US and SS RNAs, respectively. However, the imbalance in viral RNA classes suggested that the compounds may be altering viral RNA splicing, a critical step in HIV-1 replication that relies heavily on regulation of splicing involving many cellular factors.


Figure 3.3. Compound treatment dramatically decreases the expression of HIV-1 structural proteins. Representative blots showing the effect of the compounds on HIV-1 (A) Gag protein and (B) Env protein expression relative to GAPDH or α-tubulin expression as loading controls (SDS-PAGE, N ≥ 3). Uninduced, DMSO-treated (DMSO, - Dox) samples and dox-induced, DMSO-treated samples serve as negative and positive controls, respectively. Images showing p55, p41, and p24 expression were cropped from the same blot visualized at different exposure times due to differences in abundance of these isoforms. Concentrations of the compounds were as follows: 15 μM for 892, 30 μM for 791, and 2 μM for 833 and 191.

Figure 3.4. 191, 791, 833, and 892 dramatically decrease the expression of HIV-1 regulatory proteins, in contrast to previously characterized HIV-1 inhibitors. Representative blots showing the effect of the compounds on HIV-1 Rev and Tat protein expression relative to α-tubulin expression as loading control (SDS-PAGE, N ≥ 3). Uninduced, DMSO-treated (DMSO, - Dox) samples and dox-induced, DMSO-treated samples serve as negative and positive controls, respectively. For the blot shown, lanes were cropped from the same blot to show compound-treated lanes adjacent to DMSO-treated control lanes. Concentrations of the compounds were as follows: 0.1 μM for digoxin, 50 μM for 8-Aza, 15 μM for 892, 30 μM for 791, and 2 μM for 833, 150, and 191.

Figure 3.5. The compounds dramatically decrease the levels of HIV-1 US and SS RNAs. (A) Schematic of HIV-1 genome with the positions of the forward and reverse primers used for qRT-PCR analysis indicated by the arrows. US = unspliced, SS = singly spliced and MS = multiply spliced. (B) Viral mRNA levels in compound-treated samples were normalized to β-actin, and the mean mRNA levels are expressed relative to DMSO treatment (N ≥ 4, ** = p ≤ 0.01, and *** = p ≤ 0.001). Error bars indicate standard error of the mean (SEM). Concentrations of the compounds were as follows: 15-20 μM for 892, 30 μM for 791, and 2-2.5 μM for 833 and 191.

3.5 191 and 791 did not alter splice site usage among HIV-1 MS RNA

Given that HIV-1 MS RNA abundance is unaffected by compound treatment but the MS-encoded viral regulatory proteins, Rev and Tat, are lost, the compounds could be inducing changes in splice site usage, thereby altering the levels of splice variants within the MS RNA class such that these proteins are no longer expressed. Hence, we analyzed whether the compounds induced preferential selection of splice sites within the MS RNAs by radioactive RT-PCR using forward and reverse primers that amplify the differentially spliced isoforms within the MS RNA class. Although the HIV-1 proviral genome in HeLa B2 cells contains modifications, it recapitulates the splicing events of HIV-1 pre-RNA, so that the levels of most MS RNA isoforms (less abundant isoforms are below the limit of detection) can be analyzed using this method (41, 82). Amplified products were visualized, and the levels of HIV-1 MS RNA isoforms were quantified by densitometric analysis and designated according to size as described by Purcell and Martin (82). No significant changes in splice site usage were observed with 791 and 191 treatment, relative to DMSO treatment (Figure 3.6), suggesting that the loss of HIV-1 regulatory proteins with compound treatment is not due to preferential production of specific viral MS RNAs encoding these proteins. In contrast, 892 and 833 treatment caused modest decreases in levels of Rev1/2 and Nef RNAs and increased Tat1 and Tat2 RNAs, relative to DMSO treatment. However, these changes do not explain the loss of p16 Tat, which is encoded by the MS RNA, in cells treated with 892 or 833. These results suggest that the compounds do not alter the production of Rev and Tat MS RNAs (early phase of viral gene expression) since the MS RNAs remain following compound treatment. Instead, the compounds appear to perturb the transition from early to late HIV-1 gene expression, consistent with inhibition of Rev function.

Since compound treatment resulted in loss of HIV-1 MS-encoded regulatory proteins Rev and Tat, but had no appreciable effect on the abundance or splice site usage within MS RNA, we hypothesized that the compounds may inhibit HIV-1 gene expression by perturbing Rev-mediated viral RNA transport, protein synthesis, or protein stability.

Figure 3.6. 191 and 791 do not alter splice site selection within HIV-1 MS RNAs. (A) Schematic of HIV-1 genome with the positions of the forward and reverse primers used to amplify the 1.8 kb class of HIV-1 RNAs indicated by the arrows. (B) Representative RT-PCR gel with HIV-1 MS isoforms labelled on the right according to Purcell and Martin, 1993 (N ≥ 3). (C) Quantification of PCR products was performed by densitometry analysis with the level of each isoform expressed as the mean percentage of the total density of all RNA species within the sample from at least three independent experiments. Error bars indicate standard error of the mean (SEM) and statistical significance is indicated by * (p ≤ 0.05, N ≥ 3). Concentrations of the compounds were as follows: 15-20 μM for 892, 30 μM for 791, and 2-2.5 μM for 833 and 191.
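
The densitometric quantification described in the caption (each isoform as a percentage of the total density in the lane, averaged over independent experiments with SEM) can be sketched as follows. The function names and the band densities are hypothetical and serve only to illustrate the arithmetic; the isoform labels are borrowed from the text.

```python
from math import sqrt
from statistics import mean, stdev


def isoform_percentages(densities):
    """Express each band density as a percentage of the lane total."""
    total = sum(densities.values())
    return {name: 100.0 * value / total for name, value in densities.items()}


def mean_and_sem(values):
    """Mean and standard error of the mean across independent experiments."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical band densities (arbitrary units) from three experiments.
experiments = [
    {"Tat1": 10.0, "Rev1/2": 30.0, "Nef": 60.0},
    {"Tat1": 12.0, "Rev1/2": 28.0, "Nef": 60.0},
    {"Tat1": 14.0, "Rev1/2": 26.0, "Nef": 60.0},
]
per_experiment = [isoform_percentages(lane) for lane in experiments]
for isoform in experiments[0]:
    avg, sem = mean_and_sem([p[isoform] for p in per_experiment])
    print(f"{isoform}: {avg:.1f}% +/- {sem:.1f}")
```

Normalizing within each lane before averaging makes the comparison insensitive to lane-to-lane differences in total signal, which is why the percentages rather than raw densities are compared between treatments.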

3.6 Inhibition of cytoplasmic accumulation of HIV-1 US RNA and Gag with compound treatment was consistent with perturbation of Rev function

To assess the effect of the compounds on the Rev-dependent export of incompletely spliced viral RNA, the subcellular localization of HIV-1 US RNA and Gag was examined by fluorescent in situ hybridization (FISH). If the compounds perturb Rev function, we would expect to see accumulation of US and SS viral RNAs in the nucleus with little to no expression in the cytoplasm. Since the compounds caused depletion of Rev protein (Figure 3.4), it was likely that HIV-1 US RNA could not be exported to the cytoplasm for subsequent virus particle assembly and translation of viral structural proteins. To determine if this was the case, HeLa C7 cells were treated with DMSO or compounds as described previously (see Methods section for data showing similar activity of the compounds in HeLa C7 cells) and inhibition of HIV-1 gene expression was measured by FISH. Induction of HIV-1 gene expression (DMSO, +Dox) resulted in US RNA localization in both the nucleus and the cytoplasm with strong GagGFP expression throughout the cell (Figure 3.7). Co-localization of viral US RNA and GagGFP is indicated by the merged signal (yellow). In contrast, compound treatment prevented cytoplasmic accumulation of HIV-1 US RNA and reduced GagGFP levels relative to DMSO treatment (N ≥ 3). No US RNA or GagGFP expression was detected in uninduced cells, as expected. The effect of the compounds on HIV-1 US RNA and GagGFP expression is consistent with US RNA abundance and Gag protein expression measured by qRT-PCR and SDS-PAGE, respectively (Figures 3.3 and 3.5). Furthermore, the nuclear retention of US RNA upon compound treatment is consistent with the loss of Rev protein observed by SDS-PAGE (Figure 3.4). These results suggest that the compounds prevent the early to late phase transition in HIV-1 gene expression by inhibiting Rev-mediated viral RNA transport, thereby effectively hindering viral replication.

3.7 191, 791, 833, and 892 did not affect total protein synthesis

To determine whether the compounds caused depletion of viral proteins by inhibiting cellular protein translation, the effect of compound treatment on protein synthesis was measured by surface sensing of translation (SUnSET) as described by Schmidt et al (76). This nonradioactive method to monitor protein synthesis uses puromycin, an aminoacyl-tRNA analog produced by Streptomyces alboniger, to “tag” nascent peptides by chain termination and allows their detection following SDS-PAGE using a monoclonal antibody to puromycin. Following compound treatment and induction of viral gene expression (24 hours), cells were “pulsed” with puromycin and cell lysates were harvested to directly monitor levels of newly synthesized proteins by western blotting. Analysis of blots from at least four independent experiments indicated that the compounds did not induce significant changes to protein synthesis relative to DMSO treatment by 24 hours post treatment (Figure 3.8). In contrast, cells incubated either in cycloheximide (CHX), an inhibitor of translation elongation, or without puromycin, showed decreased levels of puromycin-tagged polypeptides, as expected. These results suggest that the loss of HIV-1 proteins is not a consequence of a global block of cellular protein translation, but rather, is a selective effect on HIV-1 gene expression.

Figure 3.7. Compounds inhibit cytoplasmic accumulation of HIV-1 US RNA. Representative fluorescent in situ hybridization images of HeLa B2 cells treated with DMSO or the indicated compounds (N ≥ 3). Channels shown: DAPI, US RNA, and GagGFP. Cells were viewed at 630X (oil immersion) magnification. Images are cropped to show a representative field of view.

The observation that the compounds do not significantly perturb cellular protein synthesis is corroborated by the long-term toxicity profiles of the compounds (Figure 3.9). If the compounds induced a stress response or inhibited protein translation, a detrimental effect on cell proliferation would have been observed in cells treated with compounds for a period longer than 24 hours. I monitored cellular metabolism and cell proliferation in B2 cells up to four days post treatment by XTT assay. Although 191, 791, 833, and 892 treatment had significant effects on cell growth/cellular metabolism at three and four days post treatment, both 191 and 791 were much better tolerated by the cells over the four days compared to 892 or 833 treatment. In addition, 191 and 791 are active in primary cells at similar or lower concentrations than tested here up to six days post treatment (refer to section 3.11 and Figures 3.16 and 3.17). These observations suggest that the compounds do not directly perturb protein translation, but do not rule out the possibility that 833 and 892 induce signaling pathways involved in the stress response, since these compounds appeared to be more toxic with prolonged exposure in HeLa B2 cells.

Figure 3.8. The compounds do not affect total protein synthesis. (A) Representative blot showing the effect of the compounds on protein synthesis by puromycin labelling of nascent polypeptides (N ≥ 4). Samples not incubated with puromycin (No Puro) or treated with cycloheximide (CHX), to block translation, served as negative controls. (B) Quantification of protein synthesis in the presence of the compounds was measured by the volume intensity in each lane normalized to GAPDH intensity and expressed relative to the DMSO-treatment (N ≥ 4, *** = p ≤ 0.001). Error bars indicate standard error of the mean (SEM).

Figure 3.9. 191 and 791 had better long-term toxicity profiles than 833 and 892. The graph shows cell proliferation as measured by XTT assay 1, 3, and 4 days post-treatment with the compounds relative to DMSO-treated HeLa B2 cells (N = 3). Error bars depict standard error of the mean and *, **, and *** indicate P values ≤ 0.05, 0.01, and 0.001, respectively.

3.8 The compounds did not alter the stability of existing HIV-1 Tat protein

Since the compounds appeared to selectively decrease the viral regulatory proteins without altering the levels of the MS RNAs encoding them, I examined whether compound treatment had a direct effect on the stability of these proteins. To determine if the compounds directly caused destabilization and/or degradation of HIV-1 MS-encoded proteins, the decay of HIV-1 Tat levels was compared between DMSO-treated and compound-treated cells in the presence of cycloheximide, an inhibitor of protein translation. Briefly, HeLa B2 cells were induced with doxycycline for 24 hours to allow viral protein expression in the absence of the compounds. Cycloheximide was then added, to block new protein synthesis, in combination with either DMSO or the compounds. Cell lysates were harvested for protein every 2 hours to measure the decay of HIV-1 Tat. If the compounds directly caused destabilization and/or degradation of Tat, Tat expression would be lost much sooner with compound treatment than with DMSO treatment. Representative western blots from at least two independent experiments are shown in Figure 3.10 with a summary of the data illustrated in the graph below. HIV-1 p14 Tat expression was lost more quickly than p16 Tat, with both proteins lost by approximately 8 hours. Quantification of multiple blots revealed that the compounds did not enhance the decay of Tat relative to DMSO treatment, since Tat levels in compound-treated samples fell within the standard error of the mean for Tat levels with DMSO treatment. This observation suggests that addition of the compounds did not have an effect on the stability of existing Tat protein. Furthermore, the levels of both Tat isoforms were rescued with the addition of the proteasome inhibitor MG132, suggesting that HIV-1 Tat may be degraded by the proteasomal degradation pathway.

To determine whether HIV-1 regulatory proteins can be protected from proteasomal degradation in the presence of the compounds, HeLa B2 cells were treated with compounds and induced with doxycycline as previously described, but were additionally treated with MG132 for eight hours prior to harvesting cell lysates for protein analysis. Representative blots from at least three independent experiments are shown in Figure 3.11. MG132 treatment dramatically increased the levels of both p14 and p16 Tat, indicating that proteasomal inhibition rescued the accumulation of Tat isoforms in the presence of the compounds. Thus, there was ongoing synthesis of Tat in the presence of the compounds. This effect was not mirrored with respect to the levels of HIV-1 Gag. In fact, addition of MG132 to DMSO-treated cells resulted in a reduction in p24 Gag.

Figure 3.10. Compounds do not affect the half-life of HIV-1 Tat relative to DMSO. (A) Representative blots showing the decay of Tat protein in the presence of cycloheximide (10 μM) and DMSO or indicated compounds (N ≥ 3, except for 833, N = 1-2). MG132 (10 μM) was added for 8h as an additional control to determine whether inhibition of the proteasome prevents protein degradation. All uninduced (unind.) and 0h samples were treated with DMSO. GAPDH serves as loading control. (B) Summary of effect of compounds on HIV-1 Tat degradation. Band volume intensities of both p14 and p16 Tat isoforms were calculated for each treatment relative to that of the DMSO control treatment and were then normalized to corresponding GAPDH bands (N ≥ 3, except for 833, N = 1-2). Error bars depict standard error of the mean, if possible.
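
One way to compare Tat decay between treatments in a cycloheximide chase is to estimate a half-life from the normalized band intensities. The sketch below assumes simple first-order (exponential) decay and fits a line to log-transformed intensities; this model, the function name, and the time-course values are illustrative assumptions and are not taken from the analysis reported here, which compared normalized intensities at each time point.

```python
import math


def half_life_hours(times_h, intensities):
    """Estimate half-life assuming first-order decay, I(t) = I0 * exp(-k * t).

    Fits ln(I) against t by ordinary least squares and returns ln(2) / k.
    Intensities must be positive (e.g. GAPDH-normalized band volumes).
    """
    logs = [math.log(value) for value in intensities]
    n = len(times_h)
    t_mean = sum(times_h) / n
    y_mean = sum(logs) / n
    slope = (sum((t - t_mean) * (y - y_mean) for t, y in zip(times_h, logs))
             / sum((t - t_mean) ** 2 for t in times_h))
    return math.log(2) / -slope


# Hypothetical chase sampled every 2 hours; intensity halves at each step.
times = [0, 2, 4, 6, 8]
tat_signal = [1.0, 0.5, 0.25, 0.125, 0.0625]
print(f"Estimated half-life: {half_life_hours(times, tat_signal):.2f} h")
```

Under this model, a compound that destabilized Tat would give a shorter estimated half-life than the DMSO control, whereas overlapping estimates would be consistent with no effect on the stability of existing protein.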

Figure 3.11. HIV-1 Tat expression can be rescued with proteasome inhibition by MG132. (A) Representative blot showing effect of the compounds on HIV-1 Gag and Tat expression in the presence or absence of proteasome inhibitor MG132, relative to DMSO treatment in HeLa B2 cells. GAPDH serves as loading control. (B) Summary of band intensities of HIV-1 p24 Gag and p14 and p16 Tat with each treatment relative to that of the DMSO control normalized to the corresponding GAPDH bands (N ≥ 3). Error bars depict standard error of the mean and *, **, and *** indicate P values ≤ 0.05, 0.01, and 0.001, respectively (gray * reflects significance relative to DMSO, + Dox, + MG132 treatment).

Together, these results indicate that Tat synthesis did indeed occur in the presence of the compounds, since Tat accumulation was rescued with proteasomal inhibition, but that the compounds did not directly induce Tat destabilization. This suggests that the compounds affect processes that alter the rate of synthesis of HIV-1 regulatory proteins, their degradation, or both. Furthermore, the lack of changes in the levels of Gag with MG132 treatment suggests that the compounds inhibit HIV-1 gene expression by other modes of action in addition to altering the stability of viral regulatory proteins.

3.9 791 did not significantly affect cellular alternative splicing while 191, 833, and 892 had limited effects

To evaluate the effect of compound treatment on alternative splicing of select endogenous transcripts (see Appendix for list), RT-PCR was performed using RNA isolated from DMSO- and compound-treated HeLa B2 cells, and the amplicons were quantitated by capillary electrophoresis in collaboration with Dr. Peter Stoilov (West Virginia). The ‘percent spliced in’ or PSI in annotated cassette exons was determined and compared to DMSO treatment (Figure 3.12). Treatment with 791 showed no appreciable changes in alternative splicing of the examined events, as most events fell along the theoretical diagonal dotted line depicting no difference between compound and DMSO treatments (Pearson correlation coefficient, R = 0.97). The other three compounds showed some deviation from the diagonal line, with a few events falling above or below the diagonal indicating increased and decreased exon inclusion, respectively, but also correlated well with DMSO treatment (R = 0.94). Changes in alternative splicing of endogenous genes/transcripts with |ΔPSI| ≥ 10% and 20% are represented as red and yellow dots, respectively, and a subset of these genes are labelled next to their respective data points (Figure 3.12). Interestingly, three differentially spliced genes, fgfr1op2, macf1, and gm130/golga2, were common among all four compounds, while an additional gene, nap1l1, was common to three of the four compounds, within the subset of alternative splicing events examined (see Appendix, events marked in bold font). The functions of these genes and the roles they may play in the inhibition of HIV-1 gene expression are outlined in the Discussion section.

To determine the global effect of the compounds on alternative splicing of endogenous transcripts in an unbiased fashion, paired-end RNAseq was performed on RNA isolated from DMSO-, 791-, and 191-treated HeLa B2 cells. I focused on 791 and 191 since these two compounds had the best long-term toxicity profiles of the four compounds (see Figure 3.9). To calculate altered splicing events in response to 791 or 191 treatment, the PSI in annotated cassette exons was determined and compared to DMSO treatment. Based on the analysis of biological duplicates of the ≥ 9,000 alternatively spliced events detected, 791 treatment resulted in very few altered splicing events (2 AS events with exon inclusion/exclusion ≥20% out of >10,000 events) and correlated well (R = 0.99) with changes seen in DMSO-treated samples (Figure 3.13). 191 treatment induced more changes (25 AS events with exon inclusion/exclusion ≥20% out of >9,800 events) in endogenous alternatively spliced events, but also correlated well (R = 0.97) with splicing changes observed in DMSO-treated samples (Figure 3.13). The patterns of alternative splicing changes observed by RNAseq were consistent with data from the subset of AS events measured by the Stoilov group. In fact, 791 altered splicing of fgfr1op2 with a ΔPSI of -20% (p = 0.0004, N = 2, >9,000 events), relative to DMSO treatment. Together, these results indicate that 191 and 791 did not significantly perturb cellular alternative splicing and suggest that their inhibitory effect is selective to processes involved in HIV-1 gene expression. This idea is corroborated by the lack of alternative splicing changes (|ΔPSI| ≥ 20%) that are common to both 191 and 791 (Figure 3.13C).
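
The comparison described above rests on three quantities: the PSI of each cassette exon, its change relative to DMSO (ΔPSI), and the Pearson correlation between treatments. The sketch below illustrates these using the common definition of PSI as inclusion reads over inclusion plus exclusion reads. The function names, event names, and read counts are hypothetical, and real RNAseq pipelines apply additional normalization and read-coverage filters that are not shown.

```python
import math


def psi(inclusion_reads, exclusion_reads):
    """Percent spliced in: share of reads supporting exon inclusion."""
    return 100.0 * inclusion_reads / (inclusion_reads + exclusion_reads)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def changed_events(psi_control, psi_treated, threshold=20.0):
    """Names of events whose |dPSI| meets the threshold (in PSI points)."""
    return [name for name in psi_control
            if abs(psi_treated[name] - psi_control[name]) >= threshold]


# Hypothetical (inclusion, exclusion) read counts for three cassette exons.
dmso = {"event_a": (30, 70), "event_b": (80, 20), "event_c": (50, 50)}
drug = {"event_a": (32, 68), "event_b": (55, 45), "event_c": (49, 51)}
psi_dmso = {name: psi(*counts) for name, counts in dmso.items()}
psi_drug = {name: psi(*counts) for name, counts in drug.items()}

print(changed_events(psi_dmso, psi_drug))  # ['event_b']
names = sorted(psi_dmso)
print(pearson_r([psi_dmso[n] for n in names], [psi_drug[n] for n in names]))
```

A correlation close to 1 together with few events above the ΔPSI threshold corresponds to points clustering along the diagonal in the scatter plots.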

To determine whether signal-induced changes in mRNA expression levels may have affected the detection of exon inclusion changes, changes in total mRNA expression were compared with changes in alternative splicing. The differential expression level of genes with DMSO, 191, or 791 treatment was quantified as corrected reads per kilobase of exon model per million mapped (cRPKM) reads. The expression cutoff was a cRPKM value of 0.5, corresponding to ≥ 10 reads that uniquely mapped to a single genomic locus. Genes were described as differentially expressed (DE) if the cRPKM fold change was ≥ 2 or ≤ 0.5. Of 11,406 total genes examined, relatively few DE genes were detected following compound treatment (Figure 3.14). In fact, 791 and 191 treatment only induced changes in 0.74% and 0.46% of total genes analyzed, respectively, relative to DMSO treatment. 791 treatment resulted in more upregulated genes while 191 treatment resulted in approximately equal numbers of upregulated and downregulated genes (Figure 3.14A). Of the genes whose expression levels were altered, trib3, which encodes a putative protein kinase, was expressed about 9-fold more with 791 treatment relative to DMSO treatment (N = 2; see Appendix). Examination of differentially expressed genes that were shared between both 791 and 191 treatment revealed little overlap. In fact, only six of the differentially expressed genes (three upregulated and three downregulated) were common to both compounds (Figure 3.14B). Furthermore, there was very little overlap between genes that showed altered splicing and genes that showed differential expression (Figure 3.14D). To put these observations into perspective, Martinez et al demonstrated that T cell activation, a normal cellular signaling process, results in changes in alternative splicing in approximately 10% of the >10,000 events examined (83). Furthermore, they also observed very little overlap between alternatively spliced and differentially expressed genes (83).

Figure 3.12. Compounds have limited effects on cellular alternative splicing events. Mean alternative splicing changes (PSI, percent spliced in) were plotted comparing DMSO and compound treatment (N = 3, RT-PCR). Diagonal dotted line: no difference between treatments. Dots above/below the diagonal: increased/decreased exon inclusion. |ΔPSI| ≥ 10% and 20% are indicated as red and yellow dots (labelled), respectively. Statistically significant alternative splicing changes with |ΔPSI| ≤ 10% are indicated by the gray dots (Student’s t-test, two-tailed). Error bars not shown. Pearson correlations (R values) are shown.

Figure 3.13. 191 and 791 do not appreciably alter cellular alternative splicing events. (A) Mean alternative splicing changes (PSI or percent spliced in) were plotted comparing DMSO and compound treatment (N = 2, RNA-seq). |ΔPSI| ≥ 10% and 20% are represented as red and yellow dots, respectively. AS genes with exon inclusion/exclusion ≥ 20% are labelled or listed on the right. Statistically significant alternative splicing changes with |ΔPSI| ≤ 10% are indicated by the gray dots (Student’s t-test, two-tailed). Error bars not shown. (B) Summary of altered exon inclusion or exclusion (Incl. or Excl.) with compound treatment (RNAseq, N = 2). (C) Venn diagram comparing AS events with exon inclusion/exclusion ≥10% between 791 and 191 treatment (N = 2). Pearson correlations (R values) are shown.

Figure 3.14. Differential host gene expression with 191 and 791 treatment. (A) Differentially expressed (DE) genes described as cRPKM fold change ≥ 2 or ≤ 0.5 with compound treatment relative to DMSO treatment (p ≤ 0.05, 11,406 genes, N = 2). (B) Venn diagram comparing shared DE events between 791 and 191 treatment (N = 2). Orange and blue indicate up- and down-regulated genes, respectively. (C) Fold change distribution of differentially expressed genes based on compound treatment relative to DMSO treatment within the RNAseq dataset (p ≤ 0.05, 1,020 genes, N = 2). (D) Venn diagrams comparing DE and AS (|ΔPSI| ≥ 10%) events with 791 (left) and 191 (right) treatment (N = 2).
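
The differential expression criteria stated above (expression cutoff of cRPKM 0.5; fold change ≥ 2 or ≤ 0.5 relative to DMSO) can be written as a small classifier. This is a sketch under stated assumptions: the function name and example values are hypothetical, the cutoff is assumed to apply to both conditions (the text does not specify), and the p ≤ 0.05 filter mentioned in the figure caption is omitted.

```python
def classify_gene(crpkm_dmso, crpkm_treated, min_expression=0.5):
    """Classify a gene by cRPKM fold change (treated / DMSO).

    Returns "not_tested" when either condition falls below the expression
    cutoff (an assumption about how the cutoff is applied), otherwise
    "up" (fold >= 2), "down" (fold <= 0.5), or "unchanged".
    """
    if crpkm_dmso < min_expression or crpkm_treated < min_expression:
        return "not_tested"
    fold_change = crpkm_treated / crpkm_dmso
    if fold_change >= 2.0:
        return "up"
    if fold_change <= 0.5:
        return "down"
    return "unchanged"


# Hypothetical cRPKM values (DMSO, treated) for four genes.
examples = {
    "gene_1": (1.0, 9.0),   # 9-fold increase
    "gene_2": (4.0, 2.0),   # exactly 0.5-fold
    "gene_3": (1.0, 1.5),   # within the unchanged range
    "gene_4": (0.1, 5.0),   # below the expression cutoff in DMSO
}
for gene, (control, treated) in examples.items():
    print(gene, classify_gene(control, treated))
```

Counting the "up" and "down" calls over all tested genes gives the kind of percentage reported for each compound.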

Together, these results suggest that 191 and 791 do not appreciably alter cellular alternative splicing or gene expression, but instead selectively alter the balance of HIV-1 RNA splicing and gene expression. Thus, it seems likely that these compounds do not primarily inhibit HIV-1 by perturbing alternative splicing, but rather induce the loss of HIV-1 Rev protein such that the balance of viral RNAs is altered.

3.10 Preliminary analysis of the effect of the compounds on expression of cellular splicing factors

Given that HIV-1 RNA processing relies on host cell splicing machinery and since splicing factors can selectively alter RNA splicing, the effect of the compounds on select cellular splicing factors was examined. The expression of members of the SR protein family of splicing factors, SRSF3 (SRp20), SRSF5 (SRp40), and SRSF6 (SRp55), was measured from at least three independent experiments and normalized to either GAPDH or α-tubulin. Bands corresponding to SRSF5 and SRSF6 were detected using the pan-SR antibody, 1H4, and designated based on their predicted size. Treatment with the compounds induced modest changes in the expression of SRSF3 (N = 2-3), relative to DMSO treatment, but other members of the SR family (N = 2-4) were largely unchanged across the treatments (Figure 3.15). These results suggest that the primary mechanism of action of these compounds is not mediated by altering splicing but by perturbing the balance of HIV-1 mRNAs. In contrast, digoxin alters HIV-1 splicing by decreasing the MS RNA isoforms encoding Rev and has been shown to induce changes in post-translational modifications of SRSF3 and Tra2β (72). Similarly, another cardiotonic steroid, digitoxin, was shown to alter splicing by depleting the levels of SRSF3 and Tra2β (84). Thus, the effect of 191, 791, 833, and 892 on SR proteins is consistent with the low degree of cellular alternative splicing changes observed with compound treatment, but does not rule out the involvement of signaling pathways in splicing regulation as a way by which the compounds selectively inhibit HIV-1 RNA processing.

Figure 3.15. Compounds have limited effects on expression of cellular splicing factors. Representative immunoblots showing the effect of the compounds on the expression of SR proteins relative to GAPDH or α-tubulin expression (N = 2-4). Quantification of mean SRSF3 (SRp20), SRSF5 (SRp40), and SRSF6 (SRp55) protein levels (blot probed for pan-SR proteins using 1H4 antibody) from multiple blots shown on the right. Error bars indicate SEM. Concentrations of the compounds were: 892 (15 μM), 791 (30 μM), 833 (2 μM) and 191 (2 μM).

3.11 191 and 791 inhibit HIV-1 BaL replication in primary cells

The ability of the compounds to potently inhibit HIV-1 gene expression in the context of HeLa cells led me to confirm their activity in the context of HIV-1 BaL replication in peripheral blood mononuclear cells (PBMCs) from healthy donors. PBMCs were activated for three days prior to infection with HIV-1 BaL (MOI < 0.01) and treatment with DMSO, 191, or 791. Cell culture medium from compound-treated cells was sampled every two days to measure the effect of compound treatment on virus production and cell viability. HIV-1 production in PBMCs infected in vitro was potently inhibited upon treatment with 191 and 791 in comparison to the viral growth observed with DMSO alone in at least three independent experiments using cells from two different donors (representative data shown in Figure 3.16). Azidothymidine (AZT), one of the first drugs used to treat HIV-1 infection in patients, completely inhibited virus production, as expected. In fact, treatment with either 191 or 791 inhibited HIV-1 replication to a similar extent as AZT up to 4 days post infection. Furthermore, inhibition of HIV-1 replication with 191 and 791 treatment was dose-dependent with little to no cytotoxicity observed at concentrations below 4 μM (preliminary cell viability data, Figure 3.17). Therefore, the compounds inhibited HIV-1 replication in a mixed cell population even under in vitro HIV infection conditions where cell infection rates are substantially higher than in HIV+ patients. Furthermore, 191 and 791 maintain their inhibitory activity in primary cells against replication-competent HIV-1 at similar or lower concentrations than needed in HeLa cells, suggesting that these compounds are active at low μM concentrations in a physiologically relevant context.

Figure 3.16. 191 and 791 inhibit HIV-1 replication in PBMCs. Representative experiment from a single donor showing HIV-1 BaL virus replication over a period of eight days post-infection (p.i.) as measured by p24 antigen ELISA (N = 4, 2 donors). PBMCs were infected with HIV-1 BaL (MOI < 0.01) and treated on days 0 and 4 post infection with DMSO, AZT (3.74 μM), or 791 and 191 at the concentrations indicated. Uninfected control PBMCs were similarly treated with DMSO on days 0 and 4. Error bars indicate standard error of the mean (SEM) of replicate wells from an independent experiment.

Figure 3.17. 191 and 791 inhibit HIV-1 replication in PBMCs in a dose-dependent manner. (A) The effect of increasing concentrations of the compounds on HIV-1 BaL virion production in PBMCs. Culture supernatant was measured by p24 antigen ELISA and expressed relative to p24 Gag levels with DMSO-treatment (N ≥ 3 for 0-3 μM of 191 and 0-3.8 μM of 791, N = 1-2 for rest, 2 donors, * = p ≤ 0.05, ** = p ≤ 0.01, and *** = p ≤ 0.001). (B) The effect of the compounds on cell viability was measured by trypan blue exclusion as a percentage of total cells and expressed relative to percent cell viability with DMSO-treatment (preliminary data; N = 2 for 0-3 μM of 191, N = 1 for rest, 1 donor). Error bars indicate standard error of the mean (SEM).

4 Discussion

Continued success in combating HIV infection globally relies on discovery of novel therapeutic strategies against previously untargeted stages of the HIV lifecycle. Current treatment options for HIV-1 infection primarily target the activities of the viral enzymes reverse transcriptase, integrase, and protease. Although this is an effective strategy to specifically inhibit HIV, viral genetic diversity due to high viral replication rates and reverse transcriptase mutation rates means that there is a greater risk of developing drug-resistant viruses. In contrast, novel therapeutic strategies that exploit specific host-virus interactions without perturbing normal cellular processes would be more effective at preventing viral drug resistance across various HIV subtypes. The requirement of HIV-1 for the host cellular splicing machinery for efficient expression of viral proteins provides many opportunities for identifying novel therapeutic targets. In fact, a recent study by Campos et al (2015) has validated this approach (73). The authors showed that treatment of infected PBMCs with ABX464, a small molecule that interacts with the cellular cap binding complex (CBC) and specifically prevents Rev-mediated RNA export, was able to sustainably suppress viral load without selecting for resistance mutations (73). More importantly, studies revealed a dramatic rebound of viral load within a week in HIV-infected humanized mouse models after cessation of HAART treatment, while only a slight rebound was observed by 52 days after cessation of ABX464 treatment alone (73). These findings suggest that targeting cellular components required for efficient HIV replication is a promising strategy that can complement existing anti-viral treatments.

Since HIV-1 requires strict regulation and processing of its RNA for efficient replication and expression of viral proteins, our lab focused on perturbing this stage of the viral lifecycle using small molecules. From a screen of compounds shown to modulate splicing of an SMN2 mini-gene reporter (collaboration with Peter Stoilov), we identified four compounds that potently inhibited HIV-1 gene expression. Although the four compounds are structurally very dissimilar, each compound inhibited HIV-1 p24 Gag expression by 80-90% relative to DMSO-treated cells (Figure 3.1). In addition, these four compounds are very different from the previously characterized HIV-1 inhibitors digoxin (72), 8-azaguanine, and 5350150 (the latter two herein referred to as 8-Aza and 150, respectively). Digoxin, a cardiotonic steroid, inhibited HIV-1 by perturbing viral RNA splicing in two ways. First, digoxin selectively decreased the levels of Rev1/2 mRNA by 73% relative to the levels observed with DMSO treatment, thereby dramatically decreasing the levels of Rev present in the cytoplasm (72). Secondly, digoxin resulted in oversplicing of HIV-1 RNAs, such that HIV-1 MS RNA abundance was greatly increased and incompletely spliced RNA abundance was decreased (72). In this way, digoxin perturbs the balance of HIV-1 RNAs and thus viral gene expression. The loss of both Rev protein and incompletely spliced viral mRNAs severely impairs the export of viral genomic RNA and the production of viral structural proteins. 8-Aza and 150, on the other hand, inhibited HIV-1 gene expression by perturbing Rev-mediated viral RNA transport without affecting Rev expression directly (71). Since 191, 791, 833, and 892 are structurally dissimilar from digoxin, 8-Aza, and 150, there may be multiple ways to perturb HIV-1 replication via small molecule intervention. Indeed, this appears to be the case, as the four compounds presented here inhibit HIV-1 gene expression in a manner that results in the depletion of both Rev and Tat, in contrast to the previously characterized HIV-1 RNA processing inhibitors.

191, 791, 833, and 892 inhibited HIV-1 in a dose-dependent manner at concentrations in the low micromolar range in multiple contexts. Initial screening and characterization of the effect of the compounds on HIV-1 gene expression was done using HeLa B2 cells (Figures 3.1 and 3.2), yet 191, 791, and 833 were also active at similar, if not identical, concentrations in the context of CD4+ SupT1 cells (Raymond W. Wong, unpublished). Furthermore, I have shown that both 191 and 791 inhibit HIV-1 BaL replication in primary peripheral blood mononuclear cells (PBMCs) at concentrations at or below those tested in HeLa B2 cells with little to no toxicity (Figures 3.16 and 3.17). To test the long-term effects of the compounds on cell growth, HeLa B2 cells were incubated with either the compounds or DMSO for a period of four days. Although the compounds had a significant effect on cellular metabolism with prolonged treatment (Figure 3.9), 191 and 791 were much better tolerated by the cells than the remaining two compounds, suggesting that 191 and 791 would be less likely to induce adverse effects in vivo. Consistent with this, both 191 and 791 were able to inhibit HIV-1 replication in PBMCs over a period of six days (Figure 3.17). These results confirm activity of the compounds in a more physiologically relevant context and suggest that small molecules can effectively be used to inhibit HIV-1 replication as a novel strategy.

Analysis of HIV-1 protein expression revealed that compound treatment resulted in the loss of both early (Rev, Tat) and late (Gag, Env) viral proteins. The compounds decreased the expression of HIV-1 structural proteins (Figure 3.3) that are dependent on Rev function for their expression, as well as key viral regulatory proteins that are generated early in HIV-1 replication (Figure 3.4). Inhibition of cytoplasmic localization of Rev by Leptomycin B results in a similar reduction in cytoplasmic accumulation of HIV-1 US and SS RNAs without affecting MS RNA accumulation (85; Alan Cochrane, unpublished). Thus, the loss of the late proteins and p14 Tat can be explained by the decrease in US and SS RNA abundance (Figures 3.5 and 3.7), given the requirement of Rev for the export and translation of these RNAs. This was confirmed by the inhibition of cytoplasmic accumulation of HIV-1 US RNA upon compound treatment (Figure 3.7). The abundance of HIV-1 MS RNA, however, does not correlate with the loss of Rev and p16 Tat. Furthermore, there was no significant variation in the levels of splice variants within this class of RNAs with either 191 or 791 treatment (Figure 3.6), suggesting that the compounds did not induce preferential selection of viral splice sites. 892 and 833 treatment induced a few changes in the levels of splice variants encoding Rev, Nef, and Tat (Figure 3.6); however, these changes are much less profound than the changes in splice site selection induced by digoxin (72). Together, these results suggest that perturbation of the balance of HIV-1 splicing is a consequence of decreased Rev activity in exporting incompletely spliced viral RNAs.

To verify that the compounds did not significantly or globally affect the splicing of endogenous genes, the effect of the compounds on cellular splicing factors and on either a panel (73 events) or a library (>9,000) of alternatively spliced events was examined. Preliminary studies looking at the effect of the compounds on the expression of endogenous cellular splicing factors revealed only modest changes in the levels of SRSF3 (SRp20) and little to no change in SRSF5 (SRp40) and SRSF6 (SRp55) levels in the presence of the compounds relative to DMSO treatment (Figure 3.15). This is consistent with the minimal effects on global mRNA splicing observed (Figures 3.12 and 3.13). Since the activity of SR proteins is dependent on their phosphorylation status, analysis of posttranslational modifications of these splicing factors may be more informative of perturbations of cellular signaling events involved in RNA processing in the presence of the compounds. Overall, the compounds had limited effects on global cellular alternative splicing events (Figure 3.12), as there was a high correlation between compound-treated and DMSO-treated samples (R = 0.94-0.99), and a similar conclusion was drawn when specific alternative splicing events were examined. In contrast, 892 and 833 treatment resulted in changes of 30-50% in the levels of HIV-1 MS RNA variants relative to DMSO treatment (Figure 3.6). This suggests that the inhibitory effect of the compounds is selective to HIV-1. Consistent with this suggestion, the compounds alter the splicing profile of HIV-1 by enhancing the expression of spliced viral RNA and reducing the expression of incompletely spliced viral RNAs, with only limited effects on cellular splicing.
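The correlation quoted above (R = 0.94-0.99) compares splicing levels for the same events in compound-treated and DMSO-treated samples. The sketch below shows how such a coefficient could be computed. It assumes R is a Pearson coefficient calculated over per-event percent-spliced-in (PSI) values, which the text does not state explicitly. The function name `pearson_r` and the PSI values are hypothetical and are not data from this work.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)

# Hypothetical PSI values for five alternative splicing events (not thesis data)
psi_dmso = [12.0, 45.0, 88.0, 30.0, 67.0]
psi_compound = [14.0, 43.0, 85.0, 33.0, 70.0]

r = pearson_r(psi_dmso, psi_compound)
print(f"R = {r:.3f}")
```

A value of R close to 1 means that most events keep nearly the same inclusion level after treatment, which is how the high correlation supports the conclusion of limited effects on cellular splicing.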

Since there was a disconnect between the levels of HIV-1 MS RNA and the expression of viral regulatory factors, the compounds likely inhibit HIV-1 gene expression by perturbing mRNA export, protein synthesis, or protein stability. I have shown that the compounds do not affect cellular protein synthesis (Figure 3.8) even though they induce significant depletion of viral protein expression. Thus, these compounds selectively decrease HIV-1 protein expression without perturbing global protein synthesis. Studies examining the immediate effects of the compounds on the stability of HIV-1 regulatory proteins revealed that Tat degrades quite rapidly (half-life approximately 8 hours) but that the compounds do not directly alter the decay of Tat relative to DMSO treatment (Figure 3.10). Further analysis of the effect of compound treatment on the stability of viral proteins at a posttranslational level revealed that expression of both p16 and p14 Tat could be rescued with proteasome inhibition after 24 hours of treatment with the compounds. In contrast, proteasome inhibition with MG132 caused a decrease in the levels of HIV-1 p24 Gag. Previously, Schubert et al. (86) demonstrated that MG132-induced proteasomal inhibition severely decreases the budding, maturation, and infectivity of HIV-1 by reducing the level of free ubiquitin in HIV-1-infected cells, thereby preventing mono-ubiquitination of p6gag, which is important for virus assembly and release. Thus, the decreased p24 Gag levels with MG132 treatment are consistent with the requirement of a functional proteasome for proteolytic processing of HIV-1 Gag (86). Since proteasome inhibition prevented the compound-induced loss of p14 Tat (encoded on SS RNA), Rev-mediated export of incompletely spliced viral RNAs did indeed appear to occur when viral regulatory proteins were protected from degradation. Therefore, the compounds most likely inhibit HIV-1 gene expression by affecting Rev and Tat protein accumulation, which leads to perturbation of viral US and SS RNA accumulation (see Figure 4.1 for proposed model of inhibition).
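To put the reported Tat half-life in context, a short worked example follows. It assumes simple first-order decay with no new synthesis, an idealization that was not tested here, and uses the approximate half-life of 8 hours quoted above. The symbols N_0 (initial protein level) and N(t) (level at time t) are introduced only for this illustration.

```latex
\[
\frac{N(t)}{N_0} = 2^{-t/t_{1/2}}, \qquad
t_{1/2} \approx 8\ \text{h} \;\Longrightarrow\;
\frac{N(24\ \text{h})}{N_0} = 2^{-24/8} = 2^{-3} = 0.125
\]
```

Under that idealization, about one eighth of an initial Tat pool would remain after 24 hours, so the steady-state level of Tat would depend strongly on continued synthesis.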

Figure 4.1 Proposed model for how the compounds inhibit HIV-1 gene expression. Following transcription of the HIV-1 provirus, RNA processing (5’ capping, splicing, and 3’ polyadenylation) leads to the generation of MS, SS, and US RNAs. In the early phase of HIV-1 gene expression, only the MS RNAs are exported (via the TAP/NXF1 export pathway). The US and SS RNAs, which require Rev for export, remain in the nucleus where they are degraded. In the cytoplasm, translation of MS RNA results in the production of viral regulatory proteins Rev and Tat (p16 isoform). The stability of Rev and Tat may be influenced by cellular chaperones that promote protein function, or by destabilizing factors that promote protein degradation. Our studies suggest that these compounds lead to the loss of the viral regulatory proteins by inhibiting the activity of chaperone proteins or by enhancing the effect of destabilizing factors, and subsequently inhibit the export and translation of Rev-dependent US and SS RNAs and virus replication.

Examination of the effect of the compounds on differential alternative splicing may allow us to implicate cellular signaling cascades involved in regulation of splicing and thereby identify putative cellular factors that may be involved in the destabilization of the viral regulatory proteins. The compounds induced limited changes in cellular alternative splicing events, with RNAseq studies revealing that <1% of >9,800 measured alternatively spliced events were altered by 191 or 791 relative to DMSO. In contrast, previous studies have demonstrated that T cell activation altered ~10% of >10,000 alternatively spliced events (83). T cell activation offers a useful comparison for assessing alternative splicing changes, since CD4+ T cells are the natural hosts for HIV-1 and the compounds may affect similar signaling cascades to inhibit HIV-1 gene expression. Thus, a cellular process involved in immune response alters splicing more than these compounds do, suggesting that 191, 791, 833, and 892 do not primarily inhibit HIV-1 RNA processing by altering splicing. Although the compounds did not significantly affect cellular splicing events in general, the splicing of three genes, fgfr1op2, macf1, and gm130/golga2, was altered by all four compounds, while the splicing of an additional gene, nap1l1, was altered by all compounds except 791 (as determined by RT-PCR). Furthermore, the RNAseq approach showed that splicing of fgfr1op2 was also altered by 791 (PSI = -20, p = 0.0004, N = 2). Given that only a few cellular alternatively spliced events were appreciably changed among the total number of detected events, any changes that are common among the compounds would be predicted to be involved in their shared activity as inhibitors of HIV-1 gene expression.
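The comparison above rests on the fraction of measured events that count as altered. A minimal sketch of that calculation is given below. The cutoff of 10 PSI units, the function name `fraction_altered`, and the example values are all assumptions made for illustration; the criteria actually used to call an event altered in the RNAseq analysis are not restated in this section.

```python
def fraction_altered(delta_psi, threshold=10.0):
    """Fraction of events whose absolute change in PSI meets the threshold.

    delta_psi: per-event change in percent spliced in (treated minus DMSO).
    threshold: hypothetical cutoff in PSI units (an assumption, not from the thesis).
    """
    if not delta_psi:
        raise ValueError("no events supplied")
    altered = sum(1 for d in delta_psi if abs(d) >= threshold)
    return altered / len(delta_psi)

# Hypothetical delta-PSI values for ten events (not thesis data)
example = [-20.0, 1.5, -0.8, 3.2, 0.0, 12.1, -2.4, 0.6, 4.9, -1.1]
print(f"{fraction_altered(example):.0%} of events altered")
```

Applied to a real dataset, the same ratio would be computed over all measured events, so that values for the compounds and for T cell activation can be compared on the same scale.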

The macf1, gm130/golga2, and fgfr1op2 genes encode microtubule-actin crosslinking factor 1 (MACF1), Golgin A2, and fibroblast growth factor receptor 1 oncogene partner 2 (FGFR1OP2), respectively. MACF1 is a large protein that forms bridges between different cytoskeletal elements and has been shown to regulate microtubule dynamics through GSK3 signaling in skin stem cells and developing neurons (87, 88). These studies found that GSK3 binds and phosphorylates MACF1, inhibiting MACF1’s ability to bind microtubules (87, 88). Thus, MACF1 appears to be a downstream target of GSK3 signaling, which further suggests that the compounds may impact the GSK3/Wnt signaling pathway. Similarly, Golgin A2 appears to be involved in cytoskeletal signaling pathways that regulate microtubule dynamics, and to have roles in the maintenance of the Golgi apparatus and secretory pathway (89). Golgin A2 is phosphorylated by cyclin dependent kinase 1 (Cdk1)-cyclin B and cyclin dependent kinase 5 (Cdk5) (90, 91). In turn, Golgin A2 binds and promotes the auto-phosphorylation of yeast Ste20-like kinases YST1 (human homologue is Stk25) and MST4, implicating Golgin A2 in the MAPK signaling pathway (92). In contrast to MACF1 and Golgin A2, the function of FGFR1OP2 is unknown, but the gene is predicted to be translated into an evolutionarily conserved protein containing coiled-coil domains that may also play a role in related FGFR1 signaling pathways (93). The nap1l1 gene encodes the histone chaperone Nap1. Given that splicing of this gene is altered by three of the four compounds and that Nap1 has previously been shown to interact with HIV-1 Rev and Tat and increase their activity (94-96), it can be predicted that perturbation of alternative splicing or expression of Nap1 would affect HIV-1 gene expression. It has previously been shown that siRNA knockdown of Nap1 altered HIV-1 Rev aggregation, localization, import, and function (94). Hence, it would be worthwhile to determine whether the compounds alter Nap1 function or perturb the Nap1-Rev interaction, thereby inhibiting Rev-mediated export of viral RNAs. A model of inhibition can be proposed whereby compound treatment leads to the loss of Nap1 (depicted as chaperone protein in Figure 4.1), which in turn leads to aggregation of HIV-1 Rev and its subsequent proteasomal degradation.

Furthermore, there was very little overlap between the alternatively spliced and differentially expressed genes for each compound (Figure 3.15). This is consistent with mounting evidence from genome-wide studies that most genes undergo alternative splicing changes affecting protein isoforms largely without accompanying changes in overall transcript levels (97, 98). Only a few genes (84 for 791 and 53 for 191) were differentially expressed among the 11,406 genes examined upon compound treatment. In fact, most of these differentially expressed genes were upregulated with 791 treatment, while roughly equal portions of genes were upregulated and downregulated with 191 treatment (Figure 3.15). Of the few genes that were differentially expressed, trib3, the gene encoding Tribbles pseudokinase 3 (TRIB3), was upregulated by over 9-fold with 791 treatment. TRIB3 is a putative protein kinase that is induced by transcription factor NFκB and is involved in numerous cellular processes (99). Its roles include inhibiting the activation of Akt, regulating activation of MAP kinases, and inhibiting APOBEC3A editing of nuclear DNA (99-101). Since TRIB3 plays a role in regulating the PI3K/Akt signaling pathway and its expression changes dramatically with 791 treatment, it would be interesting to further examine the involvement of TRIB3 during HIV-1 replication. Thus, the modest gene expression changes with 191 and 791 treatment and the few shared differentially expressed genes suggest that these compounds are selective inhibitors of HIV-1 gene expression that have little effect on normal cellular processes.
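As a quick check on the scale of these changes, the counts reported above can be expressed as fractions of the 11,406 genes examined. This is simple arithmetic on the numbers already given in the text and uses no additional data.

```latex
\[
\frac{84}{11{,}406} \approx 0.74\% \quad \text{(791)}, \qquad
\frac{53}{11{,}406} \approx 0.46\% \quad \text{(191)}
\]
```

In both cases fewer than 1% of the examined genes changed in expression.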

Taken together, these results indicate that the compounds 191, 791, 833, and 892 inhibit HIV-1 gene expression by inducing the loss of key early viral regulatory proteins, which in turn leads to a perturbation in the balance of HIV-1 RNAs and subsequent loss of viral structural proteins. The molecular mechanism by which this occurs remains to be determined, but nonetheless, these compounds add another strategy to the list of possible ways to target HIV-1 RNA processing. In addition to being structurally dissimilar to digoxin, 8-Aza, and 150, these four compounds are also structurally distinct from NB-506, a splicing inhibitor that specifically blocks the kinase activity of DNA topoisomerase I (59), and ABX464, an inhibitor of Rev-mediated RNA export (73). The fact that small-molecule compounds with distinct structures can affect gene expression by modulating pre-mRNA splicing (NB-506, digoxin), mRNA transport (ABX464, 8-Aza, 150), and protein stability (191, 791, 833, and 892) validates using small molecules as drugs to target specific cellular proteins implicated in disease or in viral infections that require the cellular splicing machinery to persist. Furthermore, the similarities between the effects of these compounds and ABX464 on both HIV and cellular splicing events suggest that these compounds may be able to inhibit HIV replication in vivo.

There are many challenges in translating the effect of small molecules in vitro to their application as novel drugs in humans. The four compounds described here may not be directly applicable in patients, as the systemic effects and therapeutic dose ranges remain unknown; however, confirming the activity of the compounds against HIV-1 replication in primary human cells and in humanized mouse models is the closest laboratory approximation of physiological conditions available prior to testing their efficacy in humans in clinical trials. I have shown that 791 and 191 inhibit HIV-1 BaL (R5-tropic) replication in peripheral blood mononuclear cells at levels comparable to AZT, one of the first drugs used to treat HIV+ patients, up to six days post infection with no significant effects on cell viability (Figures 3.16 and 3.17). Furthermore, initial studies of the maximum tolerated doses of the four compounds in NOD SCID gamma (NSG) mice were done by Dr. Liang Ming, a post-doctoral fellow in the lab. NSG mice were injected intraperitoneally (IP) with 892, 791, 833, or 191 and monitored for changes in body weight and behavior for up to two weeks. No significant changes in body weight or behavior were observed in NSG mice injected with 892 (36 mg/kg or 300 µM, once), 791 (210 mg/kg or 600 µM, every two days), or 833 (78 mg/kg or 200 µM, once) for one week, or with 191 (2.1 mg/kg or 6 µM, daily) for up to two weeks.

Therefore, these compounds are tolerated in mouse models at 3-100x the IC90 concentrations observed in HeLa B2 cells. These results are very promising for further testing and development of these compounds as novel drugs for treatment of HIV-1 infection.

4.1 Future Directions

Future studies should address two aspects: 1) elucidating the mechanism of action of these compounds in vitro and 2) confirming the efficacy of these compounds as therapeutic strategies in more physiological contexts of HIV-1 infection.

Given that the compounds induce only a small proportion of alternative splicing changes, any changes that are common between the compounds could potentially be important for inhibition of HIV-1 gene expression. Since all four compounds resulted in differential splicing of fgfr1op2, macf1, and gm130/golga2, it would be worthwhile to examine their sequences by motif analysis and to study their roles in inhibiting viral gene expression using minigene constructs combined with mutagenesis analysis. Motif discovery tools such as MEME (http://meme.nbcr.net/meme/), RescueESE, or ESE finder may be used to identify direct regulators of the exons presumed to be co-regulated and the corresponding cis-acting sequences. A consensus sequence can then be determined and used to identify putative cellular factors that bind to the transcripts of these genes. These studies would allow us to pinpoint regulators and cellular signaling cascades involved in inhibition of HIV-1 gene expression. Since analysis of common changes in cellular alternative splicing suggests a role for Nap1 in 892-, 833-, and 191-induced inhibition of HIV-1 gene expression, it would be interesting to determine whether Nap1 expression is altered with compound treatment.
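The idea of searching co-regulated exons for common cis-acting sequences can be illustrated with a much simpler stand-in for the tools named above. The sketch below lists the exact k-mers shared by every sequence in a set. It is not how MEME works (MEME fits probabilistic motif models rather than requiring exact matches), and the function names and toy sequences are hypothetical, made up for illustration rather than taken from fgfr1op2, macf1, or gm130/golga2.

```python
def kmers(seq, k):
    """Set of all length-k substrings of a sequence (case-insensitive)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def shared_kmers(seqs, k=6):
    """k-mers that occur in every sequence of a non-empty list."""
    if not seqs:
        raise ValueError("no sequences supplied")
    common = kmers(seqs[0], k)
    for s in seqs[1:]:
        common &= kmers(s, k)  # keep only k-mers also present in this sequence
    return common

# Toy sequences invented for illustration (not real exon sequences)
toy = ["ACGAAGAAGTTC", "TTGAAGAAGCCA", "GGGAAGAAGATA"]
print(sorted(shared_kmers(toy, k=6)))
```

Exact matching of this kind would miss degenerate motifs, which is one reason dedicated motif discovery tools are the more appropriate choice for the analysis proposed here.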

In parallel to these studies, it would be interesting to determine whether HIV-1 Rev expression and viral RNA export can be rescued with proteasome inhibition, since the compound-induced degradation of HIV-1 Tat isoforms, p16 (encoded on MS RNA) and p14 (encoded on SS RNA), can be reversed with the addition of MG132. This can be assessed by examining Rev subcellular localization and the abundance of HIV-1 US and SS RNAs following MG132 treatment in the presence of the compounds, using immunofluorescence, fluorescent in situ hybridization, and qRT-PCR, as described previously. These studies would allow us to directly determine whether the ability of Rev to shuttle between the nucleus and cytoplasm is perturbed with compound treatment. Furthermore, it would be interesting to determine whether transfection of HIV-1 Rev in trans in HeLa B2 cells can reverse inhibition of HIV-1 gene expression following compound treatment. Examination of the effect of the compounds in the presence of wildtype Rev or a mutant Rev incapable of binding HIV-1 RNA (negative control) would tell us whether addition of functional Rev could rescue the effect of the compounds on HIV-1 gene expression, or whether a cellular factor or pathway is involved in the degradation of viral regulatory proteins. Together, these studies will give insights into how these small molecules induce destabilization of HIV-1 regulatory proteins, what cellular factors are involved, and whether this can be adapted as an effective strategy against HIV-1 replication in vivo.

In addition to mechanistic studies, there should also be a focus on the application of these compounds in more physiologically relevant contexts. I have shown that two of the compounds, 791 and 191, maintain their inhibitory effect on HIV-1 replication in peripheral blood mononuclear cells (PBMCs) obtained from healthy human donors, at similar or lower doses than required in HeLa B2 cells and without affecting cell viability. Future studies should confirm whether the remaining two compounds, 892 and 833, are active in PBMCs in the context of replicating HIV-1. Since HIV is characterized by high genetic diversity, subsequent experiments should assess whether prolonged treatment with these compounds selects for drug-resistant mutations in vitro. In addition, determining the ability of these compounds to suppress replication of drug-resistant strains, clinical isolates, and viruses from different HIV clades would further strengthen the validity of this strategy to control HIV infection and complement existing anti-viral therapies.

Finally, to determine whether these compounds can be developed into safe, efficacious anti-viral drugs for the treatment of HIV-infected individuals, the activity of these compounds should be tested in humanized mouse models. Initial testing of the maximum tolerated doses of the compounds in NOD SCID gamma (NSG) mice revealed that the compounds are tolerated at 3-100x the IC90 concentrations in these mouse models. Thus, future studies should examine the effect of these compounds in HIV-infected humanized mouse models (NSG mice transplanted with haematopoietic progenitor cells isolated from umbilical cord blood) to assess their efficacy under physiological conditions comparable to those in HIV-infected patients. Determination of the therapeutic dose ranges and efficacy of these compounds in mouse models would allow us to recommend doses and treatment regimens for phase I clinical trials, the next step towards bringing these compounds to market as anti-HIV drugs. Even if these compounds do not progress to clinical trials in humans, studying their mechanism of action in vitro would allow us to identify key cellular factors that can be systematically targeted by rational drug design.

4.2 Conclusions

From a screen of small-molecule modulators of RNA splicing, we identified four compounds, 191, 791, 833, and 892, that potently inhibited HIV-1 gene expression in vitro in the context of both HeLa cells and peripheral blood mononuclear cells. Compound treatment resulted in loss of viral structural and regulatory proteins and reduced the abundance of incompletely spliced viral RNAs, without affecting the abundance of viral MS RNAs or splice site usage within this class. Furthermore, I have shown that compound treatment did not significantly affect protein synthesis or cellular alternative splicing, suggesting that the effect of the compounds is selective to HIV-1 RNA processing. Examination of their effect on the stability of viral proteins at a posttranslational level revealed that the compounds induced destabilization of viral regulatory proteins Tat and Rev, thereby preventing Rev-mediated export of incompletely spliced viral RNAs. Thus, destabilization of HIV-1 regulatory proteins appears to be a distinct way by which these compounds alter the balance of HIV-1 RNA splicing and inhibit HIV-1 gene expression and replication. The ability to differentially affect RNA processing without perturbing normal cellular processes validates targeting this stage of the virus lifecycle as a novel therapeutic strategy that can be developed to complement existing treatment regimens or used as a second line of defense against drug-resistant HIV strains.

References

1. Blencowe BJ. Alternative splicing: new insights from global analyses. Cell. 2006 Jul 14;126(1):37-47.

2. Cowling VH. Regulation of mRNA cap methylation. Biochem J. 2009 Dec 23;425(2):295-302.

3. Kelemen O, Convertini P, Zhang Z, Wen Y, Shen M, Falaleeva M, et al. Function of alternative splicing. Gene. 2013 Feb 1;514(1):1-30.

4. Stamm S. Regulation of alternative splicing by reversible protein phosphorylation. J Biol Chem. 2008 Jan 18;283(3):1223-7.

5. Braunschweig U, Gueroussov S, Plocik AM, Graveley BR, Blencowe BJ. Dynamic integration of splicing within gene regulatory pathways. Cell. 2013 Mar 14;152(6):1252-69.

6. Proudfoot NJ. Ending the message: poly(A) signals then and now. Genes Dev. 2011 Sep 1;25(17):1770-82.

7. Shatkin AJ, Manley JL. The ends of the affair: capping and polyadenylation. Nat Struct Biol. 2000 Oct;7(10):838-42.

8. Izaurralde E. A novel family of nuclear transport receptors mediates the export of messenger RNA to the cytoplasm. Eur J Cell Biol. 2002 Nov;81(11):577-84.

9. Erkmann JA, Kutay U. Nuclear export of mRNA: from the site of transcription to the cytoplasm. Exp Cell Res. 2004 May 15;296(1):12-20.

10. Maquat LE, Hwang J, Sato H, Tang Y. CBP80-promoted mRNP rearrangements during the pioneer round of translation, nonsense-mediated mRNA decay, and thereafter. Cold Spring Harb Symp Quant Biol. 2010;75:127-34.

11. Jackson RJ, Hellen CU, Pestova TV. The mechanism of eukaryotic translation initiation and principles of its regulation. Nat Rev Mol Cell Biol. 2010 Feb;11(2):113-27.

12. Sonenberg N, Hinnebusch AG. Regulation of translation initiation in eukaryotes: mechanisms and biological targets. Cell. 2009 Feb 20;136(4):731-45.

13. Han J, Xiong J, Wang D, Fu XD. Pre-mRNA splicing: where and when in the nucleus. Trends Cell Biol. 2011 Jun;21(6):336-43.

14. Stamm S, Smith CWJ, Lührmann R. Alternative pre-mRNA splicing. 2012:622.

15. Reed R, Maniatis T. A role for exon sequences and splice-site proximity in splice-site selection. Cell. 1986 Aug 29;46(5):681-90.

16. Erkelenz S, Mueller WF, Evans MS, Busch A, Schoneweis K, Hertel KJ, et al. Position-dependent splicing activation and repression by SR and hnRNP proteins rely on common mechanisms. RNA. 2013 Jan;19(1):96-102.

17. Han N, Li W, Zhang M. The function of the RNA-binding protein hnRNP in cancer metastasis. J Cancer Res Ther. 2013 Nov;9 Suppl:S129-34.

18. Zhou Z, Fu XD. Regulation of splicing by SR proteins and SR protein-specific kinases. Chromosoma. 2013 Jun;122(3):191-207.

19. Patel NA, Kaneko S, Apostolatos HS, Bae SS, Watson JE, Davidowitz K, et al. Molecular and genetic studies imply Akt-mediated signaling promotes protein kinase CbetaII alternative splicing via phosphorylation of serine/arginine-rich splicing factor SRp40. J Biol Chem. 2005 Apr 8;280(14):14302-9.

20. Blaustein M, Pelisch F, Tanos T, Munoz MJ, Wengier D, Quadrana L, et al. Concerted regulation of nuclear and cytoplasmic activities of SR proteins by AKT. Nat Struct Mol Biol. 2005 Dec;12(12):1037-44.

21. Diehl N, Schaal H. Make yourself at home: viral hijacking of the PI3K/Akt signaling pathway. Viruses. 2013 Dec 16;5(12):3192-212.

22. Hillebrand F, Erkelenz S, Diehl N, Widera M, Noffke J, Avota E, et al. The PI3K pathway acting on alternative HIV-1 pre-mRNA splicing. J Gen Virol. 2014 Aug;95(Pt 8):1809-15.

23. Heyd F, Lynch KW. Phosphorylation-dependent regulation of PSF by GSK3 controls CD45 alternative splicing. Mol Cell. 2010 Oct 8;40(1):126-37.

24. Lynch KW. Regulation of alternative splicing by signal transduction pathways. Adv Exp Med Biol. 2007;623:161-74.

25. Matter N, Herrlich P, Konig H. Signal-dependent regulation of splicing via phosphorylation of Sam68. Nature. 2002 Dec 12;420(6916):691-5.

26. van der Houven van Oordt W, Diaz-Meco MT, Lozano J, Krainer AR, Moscat J, Caceres JF. The MKK(3/6)-p38-signaling cascade alters the subcellular distribution of hnRNP A1 and modulates alternative splicing regulation. J Cell Biol. 2000 Apr 17;149(2):307-16.

27. Tazi J, Bakkour N, Stamm S. Alternative splicing and disease. Biochim Biophys Acta. 2009 Jan;1792(1):14-26.

28. Naryshkin NA, Weetall M, Dakka A, Narasimhan J, Zhao X, Feng Z, et al. Motor neuron disease. SMN2 splicing modifiers improve motor function and longevity in mice with spinal muscular atrophy. Science. 2014 Aug 8;345(6197):688-93.

29. Acheson NH. Fundamentals of molecular virology. 2nd ed. Hoboken, NJ: John Wiley & Sons; 2011.

30. Clavel F, Hance AJ. HIV drug resistance. N Engl J Med. 2004 Mar 4;350(10):1023-35.

31. Little SJ, Holte S, Routy JP, Daar ES, Markowitz M, Collier AC, et al. Antiretroviral-drug resistance among patients recently infected with HIV. N Engl J Med. 2002 Aug 8;347(6):385-94.

32. Stoltzfus CM. Chapter 1. Regulation of HIV-1 alternative RNA splicing and its role in virus replication. Adv Virus Res. 2009;74:1-40.

33. O'Reilly MM, McNally MT, Beemon KL. Two strong 5' splice sites and competing, suboptimal 3' splice sites involved in alternative splicing of human immunodeficiency virus type 1 RNA. Virology. 1995 Nov 10;213(2):373-85.

34. Stoltzfus CM, Madsen JM. Role of viral splicing elements and cellular RNA binding proteins in regulation of HIV-1 alternative RNA splicing. Curr HIV Res. 2006 Jan;4(1):43-55.

35. Marchand V, Mereau A, Jacquenet S, Thomas D, Mougin A, Gattoni R, et al. A Janus splicing regulatory element modulates HIV-1 tat and rev mRNA production by coordination of hnRNP A1 cooperative binding. J Mol Biol. 2002 Nov 1;323(4):629-52.

36. Caputi M, Freund M, Kammler S, Asang C, Schaal H. A bidirectional SF2/ASF- and SRp40-dependent splicing enhancer regulates human immunodeficiency virus type 1 rev, env, vpu, and nef gene expression. J Virol. 2004 Jun;78(12):6517-26.

37. Exline CM, Feng Z, Stoltzfus CM. Negative and positive mRNA splicing elements act competitively to regulate human immunodeficiency virus type 1 vif gene expression. J Virol. 2008 Apr;82(8):3921-31.

38. Kammler S, Otte M, Hauber I, Kjems J, Hauber J, Schaal H. The strength of the HIV-1 3' splice sites affects Rev function. Retrovirology. 2006 Dec 4;3:89.

39. Erkelenz S, Hillebrand F, Widera M, Theiss S, Fayyaz A, Degrandi D, et al. Balanced splicing at the Tat-specific HIV-1 3'ss A3 is critical for HIV-1 replication. Retrovirology. 2015 Mar 28;12(1):29.

40. Madsen JM, Stoltzfus CM. A suboptimal 5' splice site downstream of HIV-1 splice site A1 is required for unspliced viral mRNA accumulation and efficient virus replication. Retrovirology. 2006 Feb 3;3:10.

41. Lund N, Milev MP, Wong R, Sanmuganantham T, Woolaway K, Chabot B, et al. Differential effects of hnRNP D/AUF1 isoforms on HIV-1 gene expression. Nucleic Acids Res. 2012 Apr;40(8):3663-75.

42. Platt C, Calimano M, Nemet J, Bubenik J, Cochrane A. Differential Effects of Tra2β Isoforms on HIV-1 RNA Processing and Expression. PLoS One. 2015 May 13;10(5):e0125315.

43. Wong R, Balachandran A, Mao AY, Dobson W, Gray-Owen S, Cochrane A. Differential effect of CLK SR Kinases on HIV-1 gene expression: potential novel targets for therapy. Retrovirology. 2011 Jun 17;8:47.

44. Taniguchi I, Mabuchi N, Ohno M. HIV-1 Rev protein specifies the viral RNA export pathway by suppressing TAP/NXF1 recruitment. Nucleic Acids Res. 2014 Jun;42(10):6645-58.

45. Hernandez-Lopez HR, Graham SV. Alternative splicing in human tumour viruses: a therapeutic target? Biochem J. 2012 Jul 15;445(2):145-56.

46. Sumanasekera C, Watt DS, Stamm S. Substances that can change alternative splice-site selection. Biochem Soc Trans. 2008 Jun;36(Pt 3):483-90.

47. Mohseni J, Zabidi-Hussin ZA, Sasongko TH. Histone deacetylase inhibitors as potential treatment for spinal muscular atrophy. Genet Mol Biol. 2013 Sep;36(3):299-307.

48. Kaida D, Motoyoshi H, Tashiro E, Nojima T, Hagiwara M, Ishigami K, et al. Spliceostatin A targets SF3b and inhibits both splicing and nuclear retention of pre-mRNA. Nat Chem Biol. 2007 Sep;3(9):576-83.

49. Fan L, Lagisetti C, Edwards CC, Webb TR, Potter PM. Sudemycins, novel small molecule analogues of FR901464, induce alternative gene splicing. ACS Chem Biol. 2011 Jun 17;6(6):582-9.

50. Chang JG, Hsieh-Li HM, Jong YJ, Wang NM, Tsai CH, Li H. Treatment of spinal muscular atrophy by sodium butyrate. Proc Natl Acad Sci U S A. 2001 Aug 14;98(17):9808-13.

51. Brichta L, Hofmann Y, Hahnen E, Siebzehnrubl FA, Raschke H, Blumcke I, et al. Valproic acid increases the SMN2 protein level: a well-known drug as a potential therapy for spinal muscular atrophy. Hum Mol Genet. 2003 Oct 1;12(19):2481-9.

52. Andreassi C, Angelozzi C, Tiziano FD, Vitali T, De Vincenzi E, Boninsegna A, et al. Phenylbutyrate increases SMN expression in vitro: relevance for treatment of spinal muscular atrophy. Eur J Hum Genet. 2004 Jan;12(1):59-65.

53. Riessland M, Brichta L, Hahnen E, Wirth B. The benzamide M344, a novel histone deacetylase inhibitor, significantly increases SMN2 RNA/protein levels in spinal muscular atrophy cells. Hum Genet. 2006 Aug;120(1):101-10.

54. Hahnen E, Eyupoglu IY, Brichta L, Haastert K, Trankle C, Siebzehnrubl FA, et al. In vitro and ex vivo evaluation of second-generation histone deacetylase inhibitors for the treatment of spinal muscular atrophy. J Neurochem. 2006 Jul;98(1):193-202.

55. Rossi F, Labourier E, Forne T, Divita G, Derancourt J, Riou JF, et al. Specific phosphorylation of SR proteins by mammalian DNA topoisomerase I. Nature. 1996 May 2;381(6577):80-2.

56. Malanga M, Czubaty A, Girstun A, Staron K, Althaus FR. Poly(ADP-ribose) binds to the splicing factor ASF/SF2 and regulates its phosphorylation by DNA topoisomerase I. J Biol Chem. 2008 Jul 18;283(29):19991-8.

57. Tazi J, Bakkour N, Soret J, Zekri L, Hazra B, Laine W, et al. Selective inhibition of topoisomerase I and various steps of spliceosome assembly by diospyrin derivatives. Mol Pharmacol. 2005 Apr;67(4):1186-94.

58. Ting CY, Hsu CT, Hsu HT, Su JS, Chen TY, Tarn WY, et al. Isodiospyrin as a novel human DNA topoisomerase I inhibitor. Biochem Pharmacol. 2003 Nov 15;66(10):1981-91.

59. Pilch B, Allemand E, Facompre M, Bailly C, Riou JF, Soret J, et al. Specific inhibition of serine- and arginine-rich splicing factors phosphorylation, spliceosome assembly, and splicing by the antitumor drug NB-506. Cancer Res. 2001 Sep 15;61(18):6876-84.

60. Bakkour N, Lin YL, Maire S, Ayadi L, Mahuteau-Betzer F, Nguyen CH, et al. Small-molecule inhibition of HIV pre-mRNA splicing as a novel antiretroviral therapy to overcome drug resistance. PLoS Pathog. 2007 Oct 26;3(10):1530-9.

61. Muraki M, Ohkawara B, Hosoya T, Onogi H, Koizumi J, Koizumi T, et al. Manipulation of alternative splicing by a newly developed inhibitor of Clks. J Biol Chem. 2004 Jun 4;279(23):24246-54.

62. Younis I, Berg M, Kaida D, Dittmar K, Wang C, Dreyfuss G. Rapid-response splicing reporter screens identify differential regulators of constitutive and alternative splicing. Mol Cell Biol. 2010 Apr;30(7):1718-28.

63. Fedorov O, Huber K, Eisenreich A, Filippakopoulos P, King O, Bullock AN, et al. Specific CLK inhibitors from a novel chemotype for regulation of alternative splicing. Chem Biol. 2011 Jan 28;18(1):67-76.

64. Debdab M, Carreaux F, Renault S, Soundararajan M, Fedorov O, Filippakopoulos P, et al. Leucettines, a class of potent inhibitors of cdc2-like kinases and dual specificity, tyrosine phosphorylation regulated kinases derived from the marine sponge leucettamine B: modulation of alternative pre-RNA splicing. J Med Chem. 2011 Jun 23;54(12):4172-86.

65. Fukuhara T, Hosoya T, Shimizu S, Sumi K, Oshiro T, Yoshinaka Y, et al. Utilization of host SR protein kinases and RNA-splicing machinery during viral replication. Proc Natl Acad Sci U S A. 2006 Jul 25;103(30):11329-33.

66. Karakama Y, Sakamoto N, Itsui Y, Nakagawa M, Tasaka-Fujita M, Nishimura-Sakurai Y, et al. Inhibition of hepatitis C virus replication by a specific inhibitor of serine-arginine-rich protein kinase. Antimicrob Agents Chemother. 2010 Aug;54(8):3179-86.

67. Yadav AK, Vashishta V, Joshi N, Taneja P. AR-A 014418 Used against GSK3beta Downregulates Expression of hnRNPA1 and SF2/ASF Splicing Factors. J Oncol. 2014;2014:695325.

68. Hernandez F, Perez M, Lucas JJ, Mata AM, Bhat R, Avila J. Glycogen synthase kinase-3 plays a crucial role in tau exon 10 splicing and intranuclear distribution of SC35. Implications for Alzheimer's disease. J Biol Chem. 2004 Jan 30;279(5):3801-6.

69. Novoyatleva T, Heinrich B, Tang Y, Benderska N, Butchbach ME, Lorson CL, et al. Protein phosphatase 1 binds to the RNA recognition motif of several splicing factors and regulates alternative pre-mRNA processing. Hum Mol Genet. 2008 Jan 1;17(1):52-70.

70. Chalfant CE, Rathman K, Pinkerman RL, Wood RE, Obeid LM, Ogretmen B, et al. De novo ceramide regulates the alternative splicing of caspase 9 and Bcl-x in A549 lung adenocarcinoma cells. Dependence on protein phosphatase-1. J Biol Chem. 2002 Apr 12;277(15):12587-95.

71. Wong RW, Balachandran A, Haaland M, Stoilov P, Cochrane A. Characterization of novel inhibitors of HIV-1 replication that function via alteration of viral RNA processing and rev function. Nucleic Acids Res. 2013 Nov;41(20):9471-83.

72. Wong RW, Balachandran A, Ostrowski MA, Cochrane A. Digoxin suppresses HIV-1 replication by altering viral RNA processing. PLoS Pathog. 2013 Mar;9(3):e1003241.

73. Campos N, Myburgh R, Garcel A, Vautrin A, Lapasset L, Nadal ES, et al. Long lasting control of viral rebound with a new drug ABX464 targeting Rev-mediated viral RNA biogenesis. Retrovirology. 2015 Apr 9;12(1):30.

74. Zhou X, Vink M, Klaver B, Berkhout B, Das AT. Optimization of the Tet-On system for regulated gene expression through viral evolution. Gene Ther. 2006 Oct;13(19):1382-90.

75. Das AT, Zhou X, Vink M, Klaver B, Verhoef K, Marzio G, et al. Viral evolution as a tool to improve the tetracycline-regulated gene expression system. J Biol Chem. 2004 Apr 30;279(18):18776-82.

76. Schmidt EK, Clavarino G, Ceppi M, Pierre P. SUnSET, a nonradioactive method to monitor protein synthesis. Nat Methods. 2009 Apr;6(4):275-7.

77. Irimia M, Weatheritt RJ, Ellis JD, Parikshak NN, Gonatopoulos-Pournatzis T, Babor M, et al. A highly conserved program of neuronal microexons is misregulated in autistic brains. Cell. 2014 Dec 18;159(7):1511-23.

78. Dobson-Belaire WN, Rebbapragada A, Malott RJ, Yue FY, Kovacs C, Kaul R, et al. Neisseria gonorrhoeae effectively blocks HIV-1 replication by eliciting a potent TLR9-dependent interferon-alpha response from plasmacytoid dendritic cells. Cell Microbiol. 2010 Dec;12(12):1703-17.

79. Corso G, Coletta I, Ombrato R. Murine mPGES-1 3D structure elucidation and inhibitors binding mode predictions by homology modeling and site-directed mutagenesis. J Chem Inf Model. 2013 Jul 22;53(7):1804-17.

80. Li J, Liu X, Li S, Wang Y, Zhou N, Luo C, et al. Identification of novel small molecules as inhibitors of hepatitis C virus by structure-based virtual screening. Int J Mol Sci. 2013 Nov 20;14(11):22845-56.

81. Paruch K, Dwyer MP, Alvarez C, Brown C, Chan TY, Doll RJ, et al. Pyrazolo[1,5-a]pyrimidines as orally available inhibitors of cyclin-dependent kinase 2. Bioorg Med Chem Lett. 2007 Nov 15;17(22):6220-3.

82. Purcell DF, Martin MA. Alternative splicing of human immunodeficiency virus type 1 mRNA modulates viral protein expression, replication, and infectivity. J Virol. 1993 Nov;67(11):6365-78.

83. Martinez NM, Pan Q, Cole BS, Yarosh CA, Babcock GA, Heyd F, et al. Alternative splicing networks regulated by signaling in human T cells. RNA. 2012 May;18(5):1029-40.

84. Anderson ES, Lin CH, Xiao X, Stoilov P, Burge CB, Black DL. The cardiotonic steroid digitoxin regulates alternative splicing through depletion of the splicing factors SRSF3 and TRA2B. RNA. 2012 May;18(5):1041-9.

85. Wolff B, Sanglier JJ, Wang Y. Leptomycin B is an inhibitor of nuclear export: inhibition of nucleo-cytoplasmic translocation of the human immunodeficiency virus type 1 (HIV-1) Rev protein and Rev-dependent mRNA. Chem Biol. 1997 Feb;4(2):139-47.

86. Schubert U, Ott DE, Chertova EN, Welker R, Tessmer U, Princiotta MF, et al. Proteasome inhibition interferes with gag polyprotein processing, release, and maturation of HIV-1 and HIV-2. Proc Natl Acad Sci U S A. 2000 Nov 21;97(24):13057-62.

87. Wu X, Shen QT, Oristian DS, Lu CP, Zheng Q, Wang HW, et al. Skin stem cells orchestrate directional migration by regulating microtubule-ACF7 connections through GSK3beta. Cell. 2011 Feb 4;144(3):341-52.

88. Ka M, Jung EM, Mueller U, Kim WY. MACF1 regulates the migration of pyramidal neurons via microtubule dynamics and GSK-3 signaling. Dev Biol. 2014 Nov 1;395(1):4-18.

89. Nakamura N. Emerging new roles of GM130, a cis-Golgi matrix protein, in higher order cell functions. J Pharmacol Sci. 2010;112(3):255-64.

90. Lowe M, Rabouille C, Nakamura N, Watson R, Jackman M, Jamsa E, et al. Cdc2 kinase directly phosphorylates the cis-Golgi matrix protein GM130 and is required for Golgi fragmentation in mitosis. Cell. 1998 Sep 18;94(6):783-93.

91. Sun KH, de Pablo Y, Vincent F, Johnson EO, Chavers AK, Shah K. Novel genetic tools reveal Cdk5's major role in Golgi fragmentation in Alzheimer's disease. Mol Biol Cell. 2008 Jul;19(7):3052-69.

92. Preisinger C, Short B, De Corte V, Bruyneel E, Haas A, Kopajtich R, et al. YSK1 is activated by the Golgi matrix protein GM130 and plays a role in cell migration through its substrate 14-3-3zeta. J Cell Biol. 2004 Mar 29;164(7):1009-20.

93. Grand EK, Grand FH, Chase AJ, Ross FM, Corcoran MM, Oscier DG, et al. Identification of a novel gene, FGFR1OP2, fused to FGFR1 in 8p11 myeloproliferative syndrome. Genes Chromosomes Cancer. 2004 May;40(1):78-83.

94. Cochrane A, Murley LL, Gao M, Wong R, Clayton K, Brufatto N, et al. Stable complex formation between HIV Rev and the nucleosome assembly protein, NAP1, affects Rev function. Virology. 2009 May 25;388(1):103-11.

95. Vardabasso C, Manganaro L, Lusic M, Marcello A, Giacca M. The histone chaperone protein Nucleosome Assembly Protein-1 (hNAP-1) binds HIV-1 Tat and promotes viral transcription. Retrovirology. 2008 Jan 28;5:8.

96. De Marco A, Dans PD, Knezevich A, Maiuri P, Pantano S, Marcello A. Subcellular localization of the interaction between the human immunodeficiency virus transactivator Tat and the nucleosome assembly protein 1. Amino Acids. 2010 May;38(5):1583-93.

97. Giudice J, Xia Z, Wang ET, Scavuzzo MA, Ward AJ, Kalsotra A, et al. Alternative splicing regulates vesicular trafficking genes in cardiomyocytes during postnatal heart development. Nat Commun. 2014 Apr 22;5:3603.

98. Martinez NM, Pan Q, Cole BS, Yarosh CA, Babcock GA, Heyd F, et al. Alternative splicing networks regulated by signaling in human T cells. RNA. 2012 May;18(5):1029-40.

99. Kiss-Toth E, Bagstaff SM, Sung HY, Jozsa V, Dempsey C, Caunt JC, et al. Human tribbles, a protein family controlling mitogen-activated protein kinase cascades. J Biol Chem. 2004 Oct 8;279(41):42703-8.

100. Du K, Herzig S, Kulkarni RN, Montminy M. TRB3: a tribbles homolog that inhibits Akt/PKB activation by insulin in liver. Science. 2003 Jun 6;300(5625):1574-7.

101. Aynaud MM, Suspene R, Vidalain PO, Mussil B, Guetard D, Tangy F, et al. Human Tribbles 3 protects nuclear DNA from cytidine deamination by APOBEC3A. J Biol Chem. 2012 Nov 9;287(46):39182-92.

Appendices

I. Analysis of cellular alternative splicing by RT-PCR

Table I-1 Effect of 892 treatment on a subset of cellular alternative splicing (AS) events. HeLa B2 cells were treated as described previously. RT-PCR and analysis were performed by the Stoilov group. For each splicing event, the percent spliced in (PSI) scores, the mean change in exon inclusion with compound treatment (ΔPSI) and the associated p value (Student's t-test) are listed (N = 3). AS events with |ΔPSI| ≥ 10% are highlighted in orange. Bolded events are common to multiple compounds.

Summary
Total count of AS events: 70
AS events with P ≤ 0.05: 18
AS events with ΔPSI ≥ 10%: 2
AS events with ΔPSI ≤ -10%: 7

Transcript ID  DMSO PSI (25)  DMSO PSI (26)  DMSO PSI (27)  892 PSI (25)  892 PSI (26)  892 PSI (27)  ΔPSI  P Value
MACF1_1  62.00  65.00  64.00  76.00  85.00  84.00  18.00  0.004
EIF4A2_1  28.00  23.00  24.00  34.00  38.00  44.00  13.67  0.014
NUMB_2  20.00  15.00  17.00  26.00  26.00  26.00  8.67  0.004
EIF4A2_1  19.00  17.00  18.00  22.00  26.00  31.00  8.33  0.035
ZNF827_1  19.00  22.00  20.00  26.00  27.00  27.00  6.33  0.003
EXOC7_1  20.00  23.00  21.00  27.00  27.00  27.00  5.67  0.003
CAST_1  65.00  69.00  69.00  72.00  73.00  73.00  5.00  0.022
RAN_1  96.00  98.00  100.00  93.00  93.00  94.00  -4.67  0.018
APLP2_1  23.00  23.00  24.00  15.00  18.00  19.00  -6.00  0.009
FIP1L1_1  32.00  34.00  36.00  27.00  27.00  28.00  -6.67  0.005
NAP1L1_1  85.00  85.00  81.00  76.00  75.00  79.00  -7.00  0.018
FGFR1OP2_1  34.00  25.00  28.00  22.00  15.00  15.00  -11.67  0.030
MACF1_5  32.00  30.00  30.00  27.00  15.00  14.00  -12.00  0.047
SEC24B_1  13.00  24.00  29.00  7.00  9.00  8.00  -14.00  0.042
GM130_1  30.00  35.00  34.00  23.00  18.00  15.00  -14.33  0.007
DRCTNNB1A_1  21.00  27.00  24.00  7.00  7.00  6.00  -17.33  0.001
GGCT_1  48.00  61.00  60.00  24.00  24.00  26.00  -31.67  0.002
SMN2_1  94.00  95.00  91.00  39.00  44.00  46.00  -50.33  0.000
TRIM37_1  71.00  76.00  74.00  69.00  71.00  68.00  -4.33  0.063
FAM62B_1  31.00  35.00  36.00  26.00  31.00  29.00  -5.33  0.065
RPS24_1  7.00  8.00  7.00  8.00  13.00  12.00  3.67  0.079
MAP3K7_1  10.00  11.00  11.00  11.00  14.00  14.00  2.33  0.091
MRIP_1  30.00  31.00  30.00  32.00  31.00  31.00  1.00  0.101
NAP1L1_1  85.00  88.00  81.00  76.00  78.00  83.00  -5.67  0.123
PDCL_1  90.00  90.00  89.00  90.00  91.00  93.00  1.67  0.152
MVK_1  86.00  80.00  64.00  78.00  100.00  100.00  16.00  0.179
SMN2_2  70.00  73.00  71.00  71.00  76.00  77.00  3.33  0.180
SETD5_1  6.00  7.00  5.00  8.00  21.00  N/A  8.50  0.181
MBNL2_1  9.00  15.00  15.00  11.00  30.00  29.00  10.33  0.187
RAI14_1  81.00  89.00  89.00  86.00  95.00  95.00  5.67  0.231
KIF13A_1  23.00  23.00  22.00  22.00  26.00  25.00  1.67  0.252
APP_1  29.00  31.00  31.00  30.00  32.00  34.00  1.67  0.279
MFF_1  76.00  100.00  100.00  77.00  80.00  88.00  -10.33  0.298
TPM1_1  88.00  79.00  77.00  84.00  73.00  67.00  -6.67  0.330
TPM1_1  12.00  21.00  23.00  16.00  27.00  33.00  6.67  0.330
ATP6V0A1_1  77.00  81.00  76.00  81.00  86.00  77.00  3.33  0.331
AGPAT4_1  0.00  13.00  0.00  1.00  0.00  0.00  -4.00  0.409
POLDIP3_1  68.00  64.00  60.00  68.00  67.00  64.00  2.33  0.421
DNM1L_1  20.00  31.00  33.00  24.00  37.00  39.00  5.33  0.438
DNM1L_1  44.00  58.00  58.00  38.00  52.00  53.00  -5.67  0.447
EIF4H_1  17.00  13.00  12.00  17.00  12.00  20.00  2.33  0.450
MVK_1  63.00  54.00  41.00  64.00  74.00  45.00  8.33  0.477
SRPK2_1  16.00  7.00  8.00  14.00  5.00  2.00  -3.33  0.508
CRBN_1  99.00  100.00  100.00  100.00  99.00  99.00  -0.33  0.519
FAM104A_1  14.00  23.00  23.00  15.00  20.00  18.00  -2.33  0.523
NUMB_2  77.00  94.00  88.00  82.00  96.00  95.00  4.67  0.525
AGPAT4_1  97.00  95.00  96.00  96.00  100.00  79.00  -4.33  0.539
GRB10_1  97.00  99.00  100.00  97.00  98.00  99.00  -0.67  0.561
CASP9_1  36.00  45.00  41.00  36.00  47.00  46.00  2.33  0.622
POMT1_1  50.00  41.00  36.00  50.00  37.00  28.00  -4.00  0.626
POMT1_1  93.00  91.00  85.00  95.00  94.00  63.00  -5.67  0.627
SRPK2_1  90.00  95.00  96.00  90.00  97.00  99.00  1.67  0.640
MBD1_1  22.00  0.00  56.00  28.00  15.00  9.00  -8.67  0.641
ADD3_1  42.00  53.00  59.00  42.00  60.00  62.00  3.33  0.701
EXOC7_1  36.00  19.00  20.00  35.00  23.00  25.00  2.67  0.709
CA12_1  11.00  7.00  12.00  15.00  7.00  4.00  -1.33  0.731
MARK3_1  19.00  14.00  13.00  20.00  13.00  16.00  1.00  0.734
MARK3_1  12.00  10.00  10.00  11.00  9.00  11.00  -0.33  0.742
CLSTN1_2  11.00  12.00  12.00  10.00  11.00  13.00  -0.33  0.742
CA12_1  96.00  96.00  98.00  96.00  100.00  95.00  0.33  0.851
APP_1  73.00  79.00  79.00  71.00  77.00  81.00  -0.67  0.859
ZNF827_1  40.00  53.00  46.00  47.00  48.00  46.00  0.67  0.869
CTNND1_1  78.00  97.00  79.00  82.00  76.00  100.00  1.33  0.895
GLK_1  28.00  38.00  37.00  26.00  40.00  39.00  0.67  0.910
GGCT_1  81.00  91.00  89.00  83.00  90.00  89.00  0.33  0.934
CLSTN1_1  59.00  53.00  53.00  61.00  56.00  49.00  0.33  0.938
ERC1_1  35.00  52.00  54.00  36.00  55.00  52.00  0.67  0.941
CRBN_1  94.00  94.00  94.00  95.00  93.00  94.00  0.00  1.000
NPHP3_1  92.00  80.00  84.00  87.00  86.00  83.00  0.00  1.000
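
As a worked illustration of how the last two columns relate to the replicate PSI values, the following minimal Python sketch recomputes the mean change in PSI and a p value for the first row of Table I-1 (MACF1_1). It assumes a two-tailed Student's t-test with pooled (equal) variance, which is consistent with the tabulated values for this row but is not stated explicitly in the caption. The function names are hypothetical, and the p value is obtained by numerically integrating the t density rather than with a statistics library.

```python
import math
from statistics import mean, variance


def two_sided_t_pvalue(t, df, steps=20000):
    """Two-sided p value for Student's t, using Simpson's rule on the t density."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return c * (1 + x * x / df) ** (-(df + 1) / 2)

    b = abs(t)
    h = b / steps  # steps is even, as Simpson's rule requires
    total = pdf(0.0) + pdf(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central = 2 * total * h / 3  # probability mass in [-|t|, |t|] by symmetry
    return max(0.0, 1.0 - central)


def delta_psi_and_p(control, treated):
    """Mean change in PSI and pooled-variance t-test p value (assumed test)."""
    delta = mean(treated) - mean(control)
    n1, n2 = len(control), len(treated)
    pooled = ((n1 - 1) * variance(control) + (n2 - 1) * variance(treated)) / (n1 + n2 - 2)
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    if se == 0:
        return delta, float("nan")  # no variation in either group
    return delta, two_sided_t_pvalue(delta / se, n1 + n2 - 2)


# MACF1_1 replicates from Table I-1: DMSO versus compound 892
delta, p = delta_psi_and_p([62.0, 65.0, 64.0], [76.0, 85.0, 84.0])
print(f"dPSI = {delta:.2f}, p = {p:.3f}")
```

Under these assumptions the sketch gives a mean change of 18.00 and a p value that rounds to 0.004, matching the tabulated row; an unequal-variance (Welch) test would give a noticeably larger p value for the same data.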

Table I-2 Effect of 791 treatment on a subset of cellular alternative splicing (AS) events. HeLa B2 cells were treated as described previously. RT-PCR and analysis were performed by the Stoilov group. For each splicing event, the percent spliced in (PSI) scores, the mean change in exon inclusion with compound treatment (ΔPSI) and the associated p value (Student's t-test) are listed (N = 3). AS events with |ΔPSI| ≥ 10% are highlighted in orange. Bolded events are common to multiple compounds.

Summary
Total count of AS events: 70
AS events with P ≤ 0.05: 9
AS events with ΔPSI ≥ 10%: 0
AS events with ΔPSI ≤ -10%: 2

Transcript ID  DMSO PSI (25)  DMSO PSI (26)  DMSO PSI (27)  791 PSI (25)  791 PSI (26)  791 PSI (27)  ΔPSI  P Value
CAST_1  65.00  69.00  69.00  73.00  73.00  73.00  5.33  0.016
APLP2_1  23.00  23.00  24.00  24.00  25.00  25.00  1.33  0.047
FIP1L1_1  32.00  34.00  36.00  30.00  29.00  30.00  -4.33  0.023
EXOC7_1  20.00  23.00  21.00  16.00  16.00  18.00  -4.67  0.013
DRCTNNB1A_1  21.00  27.00  24.00  16.00  16.00  16.00  -8.00  0.010
FAM62B_1  31.00  35.00  36.00  23.00  26.00  28.00  -8.33  0.017
GM130_1  30.00  35.00  34.00  23.00  23.00  24.00  -9.67  0.003
FGFR1OP2_1  34.00  25.00  28.00  22.00  15.00  17.00  -11.00  0.031
MACF1_5  32.00  30.00  30.00  21.00  14.00  16.00  -13.67  0.003
RPS24_1  7.00  8.00  7.00  10.00  9.00  8.00  1.67  0.067
TRIM37_1  71.00  76.00  74.00  66.00  71.00  70.00  -4.67  0.091
SPAG9_1  26.00  22.00  20.00  21.00  13.00  14.00  -6.67  0.096
POMT1_1  93.00  91.00  85.00  96.00  97.00  93.00  5.67  0.103
CLSTN1_2  11.00  12.00  12.00  11.00  7.00  10.00  -2.33  0.135
KIF13A_1  23.00  23.00  22.00  20.00  22.00  22.00  -1.33  0.148
CLSTN1_1  59.00  53.00  53.00  55.00  44.00  40.00  -8.67  0.152
MVK_1  63.00  54.00  41.00  53.00  84.00  82.00  20.33  0.162
EIF4A2_1  19.00  17.00  18.00  18.00  20.00  22.00  2.00  0.196
EIF4A2_1  28.00  23.00  24.00  25.00  29.00  31.00  3.33  0.226
MACF1_1  62.00  65.00  64.00  63.00  71.00  67.00  3.33  0.249
PDCL_1  90.00  90.00  89.00  90.00  93.00  90.00  1.33  0.275
NUMB_2  77.00  94.00  88.00  89.00  95.00  95.00  6.67  0.282
MVK_1  86.00  80.00  64.00  81.00  100.00  82.00  11.00  0.289
AGPAT4_1  97.00  95.00  96.00  83.00  86.00  100.00  -6.33  0.296
POLDIP3_1  68.00  64.00  60.00  64.00  57.00  60.00  -3.67  0.299
DNM1L_1  20.00  31.00  33.00  24.00  43.00  44.00  9.00  0.305
CA12_1  11.00  7.00  12.00  22.00  12.00  10.00  4.67  0.310
SMN2_2  70.00  73.00  71.00  71.00  75.00  73.00  1.67  0.315
NAP1L1_1  85.00  88.00  81.00  97.00  86.00  85.00  4.67  0.343
RAN_1  96.00  98.00  100.00  93.00  97.00  98.00  -2.00  0.355
SETD5_1  6.00  7.00  5.00  7.00  4.00  3.00  -1.33  0.374
MAP3K7_1  10.00  11.00  11.00  11.00  11.00  11.00  0.33  0.374
EXOC7_1  36.00  19.00  20.00  24.00  15.00  18.00  -6.00  0.382
POMT1_1  50.00  41.00  36.00  57.00  45.00  42.00  5.67  0.409
AGPAT4_1  0.00  13.00  0.00  1.00  0.00  0.00  -4.00  0.409
RAI14_1  81.00  89.00  89.00  84.00  94.00  92.00  3.67  0.417
GGCT_1  81.00  91.00  89.00  86.00  93.00  92.00  3.33  0.425
NAP1L1_1  85.00  85.00  81.00  97.00  86.00  81.00  4.33  0.427
MARK3_1  12.00  10.00  10.00  12.00  7.00  9.00  -1.33  0.451
ZNF827_1  19.00  22.00  20.00  19.00  18.00  21.00  -1.00  0.468
MRIP_1  30.00  31.00  30.00  34.00  30.00  30.00  1.00  0.507
SEC24B_1  13.00  24.00  29.00  11.00  22.00  21.00  -4.00  0.534
MARK3_1  19.00  14.00  13.00  20.00  8.00  10.00  -2.67  0.555
MBD1_1  22.00  0.00  56.00  23.00  15.00  12.00  -9.33  0.604
ZNF827_1  40.00  53.00  46.00  51.00  45.00  50.00  2.33  0.607
GGCT_1  48.00  61.00  60.00  50.00  64.00  65.00  3.33  0.630
MFF_1  76.00  100.00  100.00  89.00  100.00  100.00  4.33  0.648
APP_1  73.00  79.00  79.00  72.00  78.00  77.00  -1.33  0.651
CTNND1_1  78.00  97.00  79.00  86.00  90.00  65.00  -4.33  0.685
CASP9_1  36.00  45.00  41.00  33.00  44.00  41.00  -1.33  0.766
ERC1_1  35.00  52.00  54.00  38.00  54.00  57.00  2.67  0.768
ADD3_1  42.00  53.00  59.00  40.00  61.00  61.00  2.67  0.772
APP_1  29.00  31.00  31.00  29.00  32.00  31.00  0.33  0.778
GRB10_1  97.00  99.00  100.00  97.00  99.00  99.00  -0.33  0.778
TPM1_1  88.00  79.00  77.00  92.00  80.00  77.00  1.67  0.784
TPM1_1  12.00  21.00  23.00  8.00  20.00  23.00  -1.67  0.784
NUMB_2  20.00  15.00  17.00  19.00  14.00  21.00  0.67  0.806
FAM104A_1  14.00  23.00  23.00  14.00  20.00  23.00  -1.00  0.815
NPHP3_1  92.00  80.00  84.00  86.00  87.00  80.00  -1.00  0.821
DNM1L_1  44.00  58.00  58.00  37.00  58.00  59.00  -2.00  0.827
CA12_1  96.00  96.00  98.00  96.00  93.00  100.00  -0.33  0.883
SRPK2_1  16.00  7.00  8.00  23.00  0.00  5.00  -1.00  0.901
SMN2_1  94.00  95.00  91.00  98.00  93.00  90.00  0.33  0.905
MBNL2_1  9.00  15.00  15.00  6.00  19.00  15.00  0.33  0.942
GLK_1  28.00  38.00  37.00  26.00  36.00  40.00  -0.33  0.952
CRBN_1  99.00  100.00  100.00  100.00  100.00  99.00  0.00  1.000
CRBN_1  94.00  94.00  94.00  95.00  94.00  93.00  0.00  1.000
EIF4H_1  17.00  13.00  12.00  14.00  13.00  15.00  0.00  1.000
SRPK2_1  90.00  95.00  96.00  83.00  100.00  98.00  0.00  1.000

Table I-3 Effect of 833 treatment on a subset of cellular alternative splicing (AS) events. HeLa B2 cells were treated as described previously. RT-PCR and analysis were performed by the Stoilov group. For each splicing event, the percent spliced in (PSI) scores, the mean change in exon inclusion with compound treatment (ΔPSI) and the associated p value (Student's t-test) are listed (N = 3). AS events with |ΔPSI| ≥ 10% are highlighted in orange. Bolded events are common to multiple compounds.

Summary
Total count of AS events: 70
AS events with P ≤ 0.05: 22
AS events with ΔPSI ≥ 10%: 1
AS events with ΔPSI ≤ -10%: 10

Transcript ID  DMSO PSI (25)  DMSO PSI (26)  DMSO PSI (27)  833 PSI (25)  833 PSI (26)  833 PSI (27)  ΔPSI  P Value
MACF1_1  62.00  65.00  64.00  71.00  79.00  83.00  14.00  0.018
ATP6V0A1_1  77.00  81.00  76.00  83.00  87.00  84.00  6.67  0.027
SRPK2_1  90.00  95.00  96.00  100.00  98.00  100.00  5.67  0.045
PDCL_1  90.00  90.00  89.00  91.00  94.00  95.00  3.67  0.042
CRBN_1  94.00  94.00  94.00  93.00  92.00  91.00  -2.00  0.026
APLP2_1  23.00  23.00  24.00  19.00  21.00  22.00  -2.67  0.047
CLSTN1_2  11.00  12.00  12.00  7.00  8.00  6.00  -4.67  0.002
TRIM37_1  71.00  76.00  74.00  67.00  66.00  68.00  -6.67  0.013
SMN2_2  70.00  73.00  71.00  67.00  62.00  63.00  -7.33  0.014
EIF4A2_1  19.00  17.00  18.00  15.00  7.00  9.00  -7.67  0.036
DRCTNNB1A_1  21.00  27.00  24.00  15.00  15.00  17.00  -8.33  0.011
SRPK2_1  16.00  7.00  8.00  0.00  4.00  0.00  -9.00  0.046
SPAG9_1  26.00  22.00  20.00  17.00  7.00  8.00  -12.00  0.030
GLK_1  28.00  38.00  37.00  21.00  20.00  21.00  -13.67  0.013
FIP1L1_1  32.00  34.00  36.00  26.00  16.00  16.00  -14.67  0.014
NAP1L1_1  85.00  88.00  81.00  62.00  68.00  77.00  -15.67  0.031
CLSTN1_1  59.00  53.00  53.00  47.00  32.00  35.00  -17.00  0.027
NAP1L1_1  85.00  85.00  81.00  64.00  55.00  77.00  -18.33  0.048
MACF1_5  32.00  30.00  30.00  24.00  7.00  6.00  -18.33  0.036
GM130_1  30.00  35.00  34.00  20.00  14.00  10.00  -18.33  0.005
SEC24B_1  13.00  24.00  29.00  8.00  1.00  1.00  -18.67  0.024
FGFR1OP2_1  34.00  25.00  28.00  20.00  1.00  4.00  -20.67  0.033
MARK3_1  12.00  10.00  10.00  9.00  3.00  4.00  -5.33  0.054
EIF4H_1  17.00  13.00  12.00  11.00  7.00  9.00  -5.00  0.059
EIF4A2_1  28.00  23.00  24.00  22.00  9.00  12.00  -10.67  0.065
MRIP_1  30.00  31.00  30.00  29.00  21.00  22.00  -6.33  0.067
MARK3_1  19.00  14.00  13.00  13.00  3.00  3.00  -9.00  0.078
POLDIP3_1  68.00  64.00  60.00  61.00  42.00  38.00  -17.00  0.085
MAP3K7_1  10.00  11.00  11.00  11.00  15.00  14.00  2.67  0.099
EXOC7_1  20.00  23.00  21.00  21.00  14.00  12.00  -5.67  0.119
SETD5_1  6.00  7.00  5.00  6.00  1.00  1.00  -3.33  0.132
CAST_1  65.00  69.00  69.00  66.00  62.00  65.00  -3.33  0.137
FAM62B_1  31.00  35.00  36.00  26.00  33.00  29.00  -4.67  0.140
ERC1_1  35.00  52.00  54.00  36.00  35.00  37.00  -11.00  0.143
MBNL2_1  9.00  15.00  15.00  11.00  41.00  39.00  17.33  0.154
NUMB_2  20.00  15.00  17.00  18.00  23.00  22.00  3.67  0.157
RPS24_1  7.00  8.00  7.00  7.00  13.00  11.00  3.00  0.170
NPHP3_1  92.00  80.00  84.00  83.00  76.00  77.00  -6.67  0.183
SMN2_1  94.00  95.00  91.00  94.00  97.00  96.00  2.33  0.193
MVK_1  86.00  80.00  64.00  77.00  100.00  100.00  15.67  0.196
MVK_1  63.00  54.00  41.00  68.00  53.00  87.00  16.67  0.228
RAN_1  96.00  98.00  100.00  94.00  97.00  97.00  -2.00  0.261
FAM104A_1  14.00  23.00  23.00  12.00  13.00  20.00  -5.00  0.271
DNM1L_1  20.00  31.00  33.00  26.00  39.00  42.00  7.67  0.294
MFF_1  76.00  100.00  100.00  69.00  85.00  87.00  -11.67  0.301
CA12_1  11.00  7.00  12.00  11.00  7.00  4.00  -2.67  0.353
AGPAT4_1  97.00  95.00  96.00  81.00  100.00  92.00  -5.00  0.418
APP_1  29.00  31.00  31.00  30.00  30.00  29.00  -0.67  0.422
GGCT_1  81.00  91.00  89.00  87.00  92.00  91.00  3.00  0.429
NUMB_2  77.00  94.00  88.00  82.00  99.00  97.00  6.33  0.436
RAI14_1  81.00  89.00  89.00  87.00  89.00  90.00  2.33  0.453
DNM1L_1  44.00  58.00  58.00  43.00  51.00  53.00  -4.33  0.481
AGPAT4_1  0.00  13.00  0.00  3.00  0.00  0.00  -3.33  0.495
KIF13A_1  23.00  23.00  22.00  23.00  22.00  22.00  -0.33  0.519
GRB10_1  97.00  99.00  100.00  99.00  99.00  100.00  0.67  0.519
POMT1_1  93.00  91.00  85.00  95.00  93.00  51.00  -10.00  0.530
CTNND1_1  78.00  97.00  79.00  86.00  N/A  70.00  -6.67  0.551
ADD3_1  42.00  53.00  59.00  41.00  62.00  64.00  4.33  0.651
POMT1_1  50.00  41.00  36.00  59.00  52.00  30.00  4.67  0.654
EXOC7_1  36.00  19.00  20.00  32.00  16.00  16.00  -3.67  0.657
CA12_1  96.00  96.00  98.00  97.00  100.00  95.00  0.67  0.698
CASP9_1  36.00  45.00  41.00  37.00  44.00  37.00  -1.33  0.722
APP_1  73.00  79.00  79.00  73.00  78.00  78.00  -0.67  0.811
TPM1_1  88.00  79.00  77.00  88.00  80.00  79.00  1.00  0.832
TPM1_1  12.00  21.00  23.00  12.00  20.00  21.00  -1.00  0.832
MBD1_1  22.00  0.00  56.00  17.00  37.00  14.00  -3.33  0.861
GGCT_1  48.00  61.00  60.00  49.00  59.00  60.00  -0.33  0.954
CRBN_1  99.00  100.00  100.00  100.00  100.00  99.00  0.00  1.000
ZNF827_1  19.00  22.00  20.00  23.00  14.00  24.00  0.00  1.000

Table I-4 Effect of 191 treatment on a subset of cellular alternative splicing (AS) events. HeLa B2 cells were treated as described previously. RT-PCR and analysis were performed by the Stoilov group. For each splicing event, the percent spliced in (PSI) scores, the mean change in exon inclusion with compound treatment (ΔPSI) and the associated p value (Student's t-test) are listed (N = 3). AS events with |ΔPSI| ≥ 10% are highlighted in orange. Bolded events are common to multiple compounds.

Summary
Total count of AS events: 70
AS events with P ≤ 0.05: 25
AS events with ΔPSI ≥ 10%: 0
AS events with ΔPSI ≤ -10%: 19

Transcript ID  DMSO PSI (25)  DMSO PSI (26)  DMSO PSI (27)  191 PSI (25)  191 PSI (26)  191 PSI (27)  ΔPSI  P Value
MACF1_1  62.00  65.00  64.00  67.00  73.00  69.00  6.00  0.038
AGPAT4_1  97.00  95.00  96.00  100.00  100.00  100.00  4.00  0.002
SETD5_1  6.00  7.00  5.00  2.00  1.00  0.00  -5.00  0.004
CAST_1  65.00  69.00  69.00  64.00  61.00  58.00  -6.67  0.038
MARK3_1  12.00  10.00  10.00  6.00  4.00  2.00  -6.67  0.007
MRIP_1  30.00  31.00  30.00  25.00  22.00  17.00  -9.00  0.019
MARK3_1  19.00  14.00  13.00  8.00  4.00  4.00  -10.00  0.012
SPAG9_1  26.00  22.00  20.00  16.00  12.00  6.00  -11.33  0.029
FAM104A_1  14.00  23.00  23.00  10.00  8.00  7.00  -11.67  0.020
TRIM37_1  71.00  76.00  74.00  63.00  65.00  58.00  -11.67  0.010
GM130_1  30.00  35.00  34.00  20.00  18.00  23.00  -12.67  0.004
GLK_1  28.00  38.00  37.00  24.00  19.00  19.00  -13.67  0.019
SMN2_2  70.00  73.00  71.00  58.00  57.00  57.00  -14.00  0.000
MACF1_5  32.00  30.00  30.00  22.00  19.00  8.00  -14.33  0.029
CLSTN1_1  59.00  53.00  53.00  43.00  42.00  35.00  -15.00  0.010
EIF4A2_1  19.00  17.00  18.00  2.00  2.00  4.00  -15.33  0.000
NAP1L1_1  85.00  88.00  81.00  65.00  71.00  N/A  -16.67  0.017
NAP1L1_1  85.00  85.00  81.00  65.00  68.00  N/A  -17.17  0.004
SEC24B_1  13.00  24.00  29.00  9.00  1.00  0.00  -18.67  0.028
FIP1L1_1  32.00  34.00  36.00  20.00  12.00  11.00  -19.67  0.003
EIF4A2_1  28.00  23.00  24.00  2.00  3.00  5.00  -21.67  0.000
ERC1_1  35.00  52.00  54.00  29.00  24.00  23.00  -21.67  0.026
CASP9_1  36.00  45.00  41.00  30.00  16.00  7.00  -23.00  0.033
FGFR1OP2_1  34.00  25.00  28.00  9.00  1.00  2.00  -25.00  0.002
POLDIP3_1  68.00  64.00  60.00  42.00  30.00  25.00  -31.67  0.005
KIF13A_1  23.00  23.00  22.00  21.00  14.00  12.00  -7.00  0.064
RAI14_1  81.00  89.00  89.00  79.00  76.00  64.00  -13.33  0.066
RAN_1  96.00  98.00  100.00  96.00  93.00  91.00  -4.67  0.066
MAP3K7_1  10.00  11.00  11.00  10.00  8.00  7.00  -2.33  0.069
MBNL2_1  9.00  15.00  15.00  16.00  32.00  28.00  12.33  0.077
CA12_1  11.00  7.00  12.00  8.00  4.00  0.00  -6.00  0.096
CA12_1  96.00  96.00  98.00  90.00  96.00  88.00  -5.33  0.099
MVK_1  86.00  80.00  64.00  83.00  100.00  100.00  17.67  0.111
GGCT_1  81.00  91.00  89.00  81.00  82.00  78.00  -6.67  0.112
NUMB_2  77.00  94.00  88.00  94.00  97.00  100.00  10.67  0.113
GGCT_1  48.00  61.00  60.00  46.00  52.00  43.00  -9.33  0.132
APLP2_1  23.00  23.00  24.00  21.00  21.00  13.00  -5.00  0.136
CLSTN1_2  11.00  12.00  12.00  11.00  4.00  9.00  -3.67  0.157
PDCL_1  90.00  90.00  89.00  89.00  96.00  100.00  5.33  0.174
EXOC7_1  20.00  23.00  21.00  23.00  9.00  2.00  -10.00  0.184
SMN2_1  94.00  95.00  91.00  94.00  98.00  96.00  2.67  0.185
SRPK2_1  16.00  7.00  8.00  9.00  4.00  1.00  -5.67  0.199
APP_1  73.00  79.00  79.00  70.00  76.00  73.00  -4.00  0.205
DRCTNNB1A_1  21.00  27.00  24.00  17.00  0.00  24.00  -10.33  0.232
FAM62B_1  31.00  35.00  36.00  24.00  32.00  33.00  -4.33  0.251
ZNF827_1  40.00  53.00  46.00  26.00  7.00  54.00  -17.33  0.288
DNM1L_1  20.00  31.00  33.00  33.00  33.00  32.00  4.67  0.314
RPS24_1  7.00  8.00  7.00  10.00  13.00  6.00  2.33  0.320
ZNF827_1  19.00  22.00  20.00  23.00  15.00  9.00  -4.67  0.324
APP_1  29.00  31.00  31.00  30.00  30.00  27.00  -1.33  0.329
NUMB_2  20.00  15.00  17.00  20.00  16.00  23.00  2.33  0.403
CRBN_1  94.00  94.00  94.00  91.00  90.00  96.00  -1.67  0.420
MBD1_1  22.00  0.00  56.00  12.00  0.00  N/A  -20.00  0.421
SRPK2_1  90.00  95.00  96.00  91.00  98.00  99.00  2.33  0.497
CRBN_1  99.00  100.00  100.00  99.00  99.00  100.00  -0.33  0.519
GRB10_1  97.00  99.00  100.00  99.00  99.00  100.00  0.67  0.519
EIF4H_1  17.00  13.00  12.00  12.00  11.00  15.00  -1.33  0.530
POMT1_1  50.00  41.00  36.00  48.00  37.00  28.00  -4.67  0.546
CTNND1_1  78.00  97.00  79.00  86.00  94.00  N/A  5.33  0.575
DNM1L_1  44.00  58.00  58.00  51.00  50.00  51.00  -2.67  0.599
AGPAT4_1  0.00  13.00  0.00  0.00  32.00  0.00  6.33  0.612
POMT1_1  93.00  91.00  85.00  95.00  94.00  63.00  -5.67  0.627
ATP6V0A1_1  77.00  81.00  76.00  81.00  80.00  68.00  -1.67  0.727
NPHP3_1  92.00  80.00  84.00  86.00  81.00  85.00  -1.33  0.746
TPM1_1  88.00  79.00  77.00  85.00  76.00  88.00  1.67  0.753
TPM1_1  12.00  21.00  23.00  15.00  24.00  12.00  -1.67  0.753
ADD3_1  42.00  53.00  59.00  41.00  56.00  51.00  -2.00  0.779
MFF_1  76.00  100.00  100.00  87.00  94.00  N/A  -1.50  0.897
EXOC7_1  36.00  19.00  20.00  35.00  21.00  18.00  -0.33  0.967
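
The table captions note that some events are common to multiple compounds. The following minimal Python sketch shows one way such shared events could be tallied from the mean change in PSI (ΔPSI) and p value columns. The selection rule used here (P ≤ 0.05 and |ΔPSI| ≥ 10% in at least two compounds) is an assumption for illustration, since the captions do not state the criterion used for bolding; the function name `significant_events` is hypothetical, and only a handful of rows transcribed from Tables I-1 to I-4 are included.

```python
from collections import Counter

# (transcript ID, mean change in PSI, p value), transcribed from Tables I-1 to I-4
rows = {
    "892": [("MACF1_1", 18.00, 0.004), ("MACF1_5", -12.00, 0.047),
            ("GM130_1", -14.33, 0.007), ("FGFR1OP2_1", -11.67, 0.030),
            ("SMN2_1", -50.33, 0.000)],
    "791": [("MACF1_1", 3.33, 0.249), ("MACF1_5", -13.67, 0.003),
            ("GM130_1", -9.67, 0.003), ("FGFR1OP2_1", -11.00, 0.031),
            ("SMN2_1", 0.33, 0.905)],
    "833": [("MACF1_1", 14.00, 0.018), ("MACF1_5", -18.33, 0.036),
            ("GM130_1", -18.33, 0.005), ("FGFR1OP2_1", -20.67, 0.033)],
    "191": [("MACF1_1", 6.00, 0.038), ("MACF1_5", -14.33, 0.029),
            ("GM130_1", -12.67, 0.004), ("FGFR1OP2_1", -25.00, 0.002)],
}


def significant_events(events, min_abs_delta=10.0, alpha=0.05):
    """Return IDs passing the assumed cut-offs: p <= alpha and |dPSI| >= min_abs_delta."""
    return {name for name, delta, p in events
            if p <= alpha and abs(delta) >= min_abs_delta}


per_compound = {compound: significant_events(ev) for compound, ev in rows.items()}
counts = Counter(name for hits in per_compound.values() for name in hits)

# Events passing the cut-offs in at least two compounds, and in all four
shared = sorted(name for name, k in counts.items() if k >= 2)
in_all = sorted(name for name, k in counts.items() if k == len(rows))
print("shared by two or more compounds:", shared)
print("shared by all compounds:", in_all)
```

Because several transcript IDs label more than one event within a table (for example EIF4A2_1), a full analysis would need a unique event identifier rather than the transcript ID alone; the subset above was chosen to avoid that ambiguity.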

II. Global analysis of cellular alternative splicing and gene expression by RNA seq

Table II-1 Effect of 791 treatment on cellular alternative splicing (AS) by RNAseq. HeLa B2 cells were treated as described previously. For each splicing event, the 'percent spliced in' (PSI) scores are given. The mean change in exon inclusion with compound treatment (mean ΔPSI) and the associated p value (Student's t-test) are listed (N = 2). AS events with |ΔPSI| ≥ 10% and ≥ 20% are coloured orange and red, respectively.

Summary
Raw total count of AS events: 18,611
AS events with confidence: 10,001
AS events with P ≤ 0.05: 265
AS events with ΔPSI ≥ 10%: 7
AS events with ΔPSI ≤ -10%: 8

DMSO 791 191 Mean Gene ID P Value PSI (25) PSI (29) PSI (25) PSI (29) PSI (25) PSI (29) PSI PLEKHA7 27.56 25.64 43.86 41.94 39.26 64.81 16.30 0.0069 DPYD 81.98 85.23 98.60 100.00 92.52 94.55 15.70 0.0363 ALS2CL 48.05 50.82 63.27 66.67 62.75 69.23 15.54 0.0215 TATDN2 16.09 17.23 27.24 29.96 25.58 26.23 11.94 0.0422 MIPOL1 74.00 74.67 85.62 84.91 75.00 74.74 10.93 0.0020 SAPS2 67.45 69.22 79.00 78.37 72.16 70.24 10.35 0.0338 NT5C2 7.59 9.44 20.06 17.60 28.61 51.46 10.32 0.0260 ZFYVE9 79.64 78.57 88.73 87.88 77.78 76.00 9.20 0.0066 SRSF2 10.21 11.06 20.00 19.14 20.92 20.50 8.94 0.0045 TMEM18 8.90 9.46 18.53 17.67 15.46 17.97 8.92 0.0062 USP33 43.89 42.02 52.79 50.74 47.16 43.50 8.81 0.0244 SYCP2 65.95 63.49 72.12 74.86 61.91 65.99 8.77 0.0423 AC009533.2 8.82 11.32 19.09 17.44 39.66 52.60 8.20 0.0432 PODXL 24.89 23.00 32.57 31.49 32.46 43.38 8.09 0.0320 USP14 6.43 7.33 15.24 13.93 3.51 4.57 7.71 0.0153 ZNF616 21.88 20.65 27.23 29.33 10.94 13.59 7.02 0.0459 TIA1 14.10 13.09 21.19 19.53 19.27 20.22 6.77 0.0323 SRC 92.46 92.83 98.50 98.59 95.70 94.74 5.90 0.0141 SNRK 92.92 94.20 98.75 100.00 98.28 97.08 5.82 0.0229 [Undefined] 67.30 67.89 73.04 73.06 72.28 62.37 5.46 0.0342 DCAKD 9.24 7.57 14.34 13.10 13.69 14.92 5.32 0.0427 C6orf192 94.04 94.44 99.20 99.27 99.07 97.10 5.00 0.0215 KIF27 10.84 11.48 15.94 15.70 10.26 21.95 4.66 0.0245 TAF4B 95.14 95.60 100.00 100.00 100.00 94.48 4.63 0.0316 PRPF38B 8.41 9.61 13.37 13.88 8.94 4.65 4.62 0.0497 CEP135 95.56 95.74 100.00 100.00 100.00 93.89 4.35 0.0132 STARD3 88.48 88.14 92.35 92.54 93.17 95.92 4.14 0.0064 C8orf59 88.75 87.64 92.54 91.94 92.14 85.07 4.05 0.0432 DEM1 72.17 71.13 75.76 75.16 77.78 94.29 3.81 0.0403 IMPAD1 1.71 2.88 5.47 6.64 3.64 4.04 3.76 0.0452 BX004987.4 92.44 92.96 95.94 96.68 90.39 92.27 3.61 0.0208

110

DMSO 791 191 Mean Gene ID P Value PSI (25) PSI (29) PSI (25) PSI (29) PSI (25) PSI (29) PSI OSBPL5 86.15 86.32 89.82 89.61 93.37 98.55 3.48 0.0019 FANCC 88.84 89.51 93.02 92.16 96.63 98.82 3.41 0.0282 SUV420H2 60.25 61.17 63.58 64.66 60.68 52.38 3.41 0.0427 MED24 4.40 5.37 7.74 8.41 8.26 15.86 3.19 0.0417 KIAA1324L 92.00 92.20 95.19 95.26 98.99 96.38 3.13 0.0102 USP37 96.83 97.08 100.00 100.00 100.00 100.00 3.05 0.0261 TTC27 90.06 90.24 93.26 93.06 90.28 94.48 3.01 0.0021 TGDS 2.57 1.67 5.41 4.66 2.97 0.80 2.92 0.0406 FANCG 96.28 96.27 99.01 99.29 96.79 96.10 2.88 0.0308 CXXC1 4.84 5.54 7.59 8.53 10.56 13.65 2.87 0.0458 MPRIP 4.58 4.85 7.28 7.88 5.51 6.39 2.87 0.0352 KIAA0284 94.16 93.78 96.32 96.91 92.08 96.38 2.65 0.0261 C13orf27 92.12 91.74 94.40 94.69 97.71 94.03 2.62 0.0104 CHD2 92.12 92.51 94.74 95.02 100.00 98.37 2.56 0.0120 P4HA3 90.72 91.52 93.22 93.84 92.53 90.38 2.41 0.0465 TMEM199 3.48 3.05 5.84 5.33 8.11 9.68 2.32 0.0216 WDR77 4.99 5.57 7.91 7.26 7.09 9.30 2.31 0.0348 OSTF1 95.51 94.94 97.75 97.19 95.69 95.62 2.25 0.0303 FAM96A 2.18 2.17 4.43 4.30 2.17 3.61 2.19 0.0182 CCDC18 7.90 7.23 9.42 10.06 14.10 10.94 2.18 0.0427 SF3B1 4.65 4.30 6.66 6.52 6.99 6.86 2.12 0.0291 CCDC123 96.26 96.92 98.97 98.44 100.00 90.70 2.11 0.0414 NUP205 97.45 97.82 99.63 99.86 99.65 99.75 2.11 0.0182 MED4 95.04 94.92 97.04 97.04 98.19 96.98 2.06 0.0185 VIPAR 97.92 97.68 99.56 100.00 100.00 100.00 1.98 0.0311 DPY19L4 98.00 98.24 100.00 100.00 100.00 100.00 1.88 0.0406 BAG1 4.33 4.58 6.42 6.19 5.25 5.31 1.85 0.0085 ST3GAL1 94.48 94.85 96.66 96.36 97.72 96.86 1.85 0.0183 NTHL1 97.24 96.72 99.06 98.53 98.21 99.81 1.82 0.0394 ITPR1 97.19 97.58 99.22 99.09 98.94 98.86 1.77 0.0485 GAK 98.05 97.54 99.72 99.41 99.30 97.86 1.77 0.0419 CDCA3 96.44 96.16 98.09 97.97 98.41 99.55 1.73 0.0262 ADARB1 82.86 82.87 84.60 84.52 89.73 89.91 1.69 0.0136 RNF215 98.26 98.37 100.00 100.00 100.00 100.00 1.69 0.0208 FCHSD1 98.39 98.33 100.00 100.00 100.00 99.14 1.64 0.0116 DOCK11 98.34 98.45 100.00 
100.00 100.00 100.00 1.61 0.0218 CDK17 96.77 97.07 98.61 98.40 92.34 97.41 1.59 0.0181 CABIN1 98.45 98.48 100.00 100.00 97.74 97.42 1.54 0.0062 PAM 97.76 98.03 99.35 99.51 97.72 97.70 1.54 0.0195 SFRS12 96.79 97.10 98.29 98.65 99.46 100.00 1.53 0.0247 MLLT6 97.59 97.62 99.02 99.08 98.75 95.83 1.44 0.0029 TELO2 98.55 98.23 100.00 99.59 99.08 100.00 1.41 0.0369 SIRT2 0.61 0.38 1.92 1.80 1.18 2.13 1.37 0.0216 NT5DC1 0.69 1.04 1.96 2.38 0.71 1.19 1.31 0.0438 SNRPD3 95.65 96.04 97.29 96.99 96.24 96.22 1.29 0.0391 FAT1 97.96 98.17 99.40 99.29 98.40 99.56 1.28 0.0207 MMAB 98.70 98.76 100.00 100.00 97.87 98.68 1.27 0.0150 LLGL1 98.70 98.88 100.00 100.00 99.48 100.00 1.21 0.0473 AHCYL2 98.71 98.89 100.00 100.00 100.00 100.00 1.20 0.0477 REXO4 98.64 98.79 99.77 100.00 98.84 100.00 1.17 0.0209 FNDC3B 97.54 97.92 99.08 98.72 99.54 96.91 1.17 0.0468


DMSO 791 191 Mean Gene ID P Value PSI (25) PSI (29) PSI (25) PSI (29) PSI (25) PSI (29) PSI MRPS27 97.10 97.17 98.18 98.27 95.75 87.48 1.09 0.0036 TTC17 98.64 98.36 99.38 99.69 98.50 99.14 1.04 0.0392 41889 1.13 1.11 2.17 2.11 1.25 1.53 1.02 0.0097 CPT1A 98.63 98.68 99.57 99.72 99.12 100.00 0.99 0.0308 SMARCD2 98.94 99.09 100.00 100.00 100.00 99.74 0.98 0.0484 MAST2 98.76 99.01 100.00 99.74 100.00 100.00 0.98 0.0320 TMEM131 99.09 99.05 100.00 100.00 97.88 90.57 0.93 0.0137 SMARCAD1 99.03 99.13 100.00 100.00 95.73 73.31 0.92 0.0346 DDA1 97.91 98.12 98.80 99.02 97.87 93.37 0.89 0.0278 ATHL1 99.18 99.06 100.00 100.00 100.00 100.00 0.88 0.0433 NENF 98.94 98.87 99.75 99.77 99.59 99.01 0.85 0.0169 HTRA2 96.02 96.08 96.89 96.90 98.74 98.59 0.85 0.0193 RAE1 1.32 1.53 2.19 2.33 0.81 0.63 0.84 0.0310 DMXL1 99.22 99.24 100.00 100.00 100.00 100.00 0.77 0.0083 FNIP1 99.22 99.30 100.00 100.00 100.00 100.00 0.74 0.0344 MED12 99.21 99.32 100.00 100.00 100.00 99.13 0.74 0.0475 SLC25A40 99.26 99.35 100.00 100.00 100.00 100.00 0.70 0.0412 PHKA2 99.27 99.36 100.00 100.00 100.00 97.17 0.69 0.0418 NIN 99.35 99.33 100.00 100.00 97.80 99.16 0.66 0.0096 GBE1 99.34 99.38 100.00 100.00 99.67 95.74 0.64 0.0199 PPIG 1.39 1.31 1.93 2.03 3.09 1.88 0.63 0.0119 COL18A1 98.61 98.53 99.16 99.17 98.51 97.76 0.59 0.0398 AP3M1 99.21 99.13 99.76 99.75 99.69 100.00 0.59 0.0405 HDAC6 99.38 99.45 100.00 100.00 98.50 100.00 0.59 0.0380 C1orf77 99.42 99.41 100.00 100.00 99.63 99.82 0.59 0.0054 SLC7A6 99.46 99.43 100.00 100.00 100.00 100.00 0.56 0.0172 RNF213 99.41 99.49 100.00 100.00 97.91 100.00 0.55 0.0462 WRAP53 99.51 99.43 100.00 100.00 99.50 100.00 0.53 0.0480 RPS6KB1 98.88 98.96 99.42 99.46 100.00 100.00 0.52 0.0201 LAMB1 99.30 99.22 99.77 99.77 99.23 99.85 0.51 0.0498 ZMYM2 99.49 99.53 100.00 100.00 100.00 97.68 0.49 0.0260 MBD2 97.98 98.01 98.43 98.52 99.29 98.89 0.48 0.0399 CCDC6 99.53 99.51 100.00 100.00 99.56 100.00 0.48 0.0133 GIT1 99.31 99.42 99.81 99.76 100.00 98.88 0.42 0.0474 KARS 99.54 
99.48 99.92 99.90 99.31 99.43 0.40 0.0304 KIAA1598 99.63 99.58 100.00 100.00 100.00 98.65 0.40 0.0402 SSR4 98.89 99.01 99.39 99.29 99.26 99.17 0.39 0.0404 AHCYL1 99.46 99.47 99.78 99.84 99.21 99.88 0.35 0.0495 PPP4C 95.30 95.30 95.67 95.62 95.41 95.55 0.35 0.0461 R3HCC1 99.44 99.33 99.76 99.69 100.00 99.54 0.34 0.0487 NCBP1 98.35 98.26 98.57 98.66 99.43 99.40 0.31 0.0397 LAMA5 99.70 99.68 100.00 100.00 99.23 99.86 0.31 0.0205 MVP 99.71 99.73 100.00 100.00 100.00 100.00 0.28 0.0227 CYFIP1 99.40 99.44 99.70 99.67 99.69 99.19 0.27 0.0113 ACO2 99.68 99.73 100.00 99.92 99.93 100.00 0.25 0.0470 LTBR 99.76 99.75 100.00 100.00 99.38 99.53 0.24 0.0130 CCT4 99.78 99.76 99.95 100.00 100.00 100.00 0.20 0.0481 CUEDC2 99.79 99.80 100.00 100.00 100.00 100.00 0.20 0.0155 C11orf48 99.86 99.84 100.00 100.00 100.00 100.00 0.15 0.0424 ATP6AP2 0.15 0.15 0.23 0.22 0.19 0.05 0.08 0.0424 AKIRIN2 100.00 100.00 99.86 99.84 100.00 99.59 -0.15 0.0424


DMSO 791 191 Mean Gene ID P Value PSI (25) PSI (29) PSI (25) PSI (29) PSI (25) PSI (29) PSI HYOU1 100.00 100.00 99.84 99.86 100.00 99.86 -0.15 0.0424 DLG5 99.71 99.76 99.49 99.53 100.00 100.00 -0.23 0.0222 SRPR 100.00 100.00 99.76 99.77 100.00 98.86 -0.23 0.0135 AHSA1 99.31 99.28 99.02 99.06 99.26 99.74 -0.26 0.0121 SETD3 100.00 100.00 99.74 99.73 99.78 99.77 -0.27 0.0120 ECHS1 99.93 100.00 99.72 99.65 99.93 99.73 -0.28 0.0299 DCUN1D4 100.00 100.00 99.67 99.69 100.00 99.43 -0.32 0.0199 UBR4 100.00 100.00 99.67 99.66 100.00 99.61 -0.34 0.0095 HOMER3 99.77 99.83 99.43 99.49 99.50 99.72 -0.34 0.0152 U2AF1 99.94 100.00 99.63 99.58 99.52 99.35 -0.37 0.0125 RBM26 100.00 100.00 99.65 99.61 99.35 97.97 -0.37 0.0344 PLIN3 99.73 99.69 99.34 99.29 99.45 99.61 -0.39 0.0077 IBTK 100.00 100.00 99.60 99.61 99.06 97.14 -0.40 0.0081 EMP3 99.37 99.38 98.94 98.95 98.79 98.74 -0.43 0.0003 LPCAT4 100.00 100.00 99.58 99.52 100.00 100.00 -0.45 0.0424 PYGL 99.36 99.52 98.87 99.01 99.44 99.60 -0.50 0.0438 IMPA2 1.36 1.43 0.81 0.89 1.01 0.73 -0.55 0.0099 HACL1 100.00 100.00 99.48 99.40 99.19 98.87 -0.56 0.0454 ACTR8 100.00 100.00 99.43 99.44 100.00 100.00 -0.56 0.0056 ZNF276 100.00 100.00 99.44 99.37 100.00 100.00 -0.59 0.0374 YEATS2 100.00 100.00 99.35 99.43 100.00 99.54 -0.61 0.0417 DUSP11 99.58 99.65 99.03 98.91 98.76 98.56 -0.65 0.0218 GPD1L 100.00 100.00 99.33 99.32 100.00 100.00 -0.68 0.0047 NCKIPSD 99.08 99.01 98.32 98.40 98.50 99.08 -0.69 0.0064 41893 99.71 99.57 98.89 99.01 99.57 100.00 -0.69 0.0185 PNPLA2 99.42 99.54 98.69 98.87 99.26 98.63 -0.70 0.0322 RCN2 1.10 0.99 0.40 0.26 0.23 0.18 -0.72 0.0177 PDE3A 98.27 98.31 97.62 97.51 98.89 92.86 -0.72 0.0286 PPP2R5B 100.00 100.00 99.26 99.28 98.19 100.00 -0.73 0.0087 PLXNB1 100.00 100.00 99.22 99.30 99.25 100.00 -0.74 0.0344 MMS19 97.45 97.27 96.50 96.69 94.89 98.93 -0.77 0.0282 DCTD 100.00 100.00 99.23 99.22 99.05 99.10 -0.77 0.0041 EHMT1 100.00 100.00 99.20 99.18 99.76 96.14 -0.81 0.0079 SLC25A23 1.36 1.61 0.56 0.75 0.48 1.44 -0.83 
0.0393 VPS33B 100.00 100.00 99.17 99.16 100.00 100.00 -0.84 0.0038 NT5DC2 99.36 99.53 98.68 98.52 99.28 98.93 -0.84 0.0187 ITGAV 99.54 99.31 98.66 98.43 99.54 97.32 -0.88 0.0325 BAIAP2 6.51 6.41 5.60 5.54 2.19 0.37 -0.89 0.0093 DDX23 100.00 99.84 98.95 99.10 100.00 100.00 -0.90 0.0149 WDR12 99.71 100.00 99.03 98.84 100.00 99.70 -0.92 0.0458 CEP250 100.00 100.00 99.07 99.09 100.00 99.61 -0.92 0.0069 ZNF791 100.00 100.00 99.08 98.96 99.10 95.54 -0.98 0.0389 BIRC6 100.00 100.00 98.94 98.99 99.03 100.00 -1.04 0.0154 SEC23A 99.85 99.85 98.72 98.87 98.83 99.28 -1.05 0.0452 GPSM2 100.00 100.00 98.88 99.01 98.88 96.21 -1.06 0.0392 PRKCZ 99.79 99.84 98.73 98.78 98.03 100.00 -1.06 0.0011 YARS2 98.62 98.93 97.82 97.61 98.39 92.23 -1.06 0.0395 C19orf63 5.23 5.06 4.04 4.03 9.44 6.14 -1.11 0.0479 RFC1 99.41 99.08 98.17 97.92 96.05 97.67 -1.20 0.0334 AGK 100.00 100.00 98.74 98.86 100.00 98.13 -1.20 0.0318 CEP57 99.57 99.44 98.23 98.32 94.49 65.95 -1.23 0.0066


DMSO 791 191 Mean Gene ID P Value PSI (25) PSI (29) PSI (25) PSI (29) PSI (25) PSI (29) PSI CLTC 99.37 99.76 98.04 98.30 98.86 99.22 -1.40 0.0370 AFF1 99.46 99.35 98.00 98.01 98.71 98.37 -1.40 0.0239 WDR43 99.51 99.53 98.14 98.07 97.55 86.85 -1.42 0.0094 AC099524.1 100.00 99.76 98.51 98.39 100.00 100.00 -1.43 0.0228 SMARCAD1 100.00 100.00 98.58 98.53 100.00 100.00 -1.45 0.0110 TEP1 100.00 100.00 98.53 98.57 100.00 100.00 -1.45 0.0088 SCD5 99.46 99.47 97.99 97.88 98.58 99.24 -1.53 0.0218 USP20 95.85 95.67 94.07 94.38 83.39 84.77 -1.54 0.0249 DNAJC11 99.55 99.02 97.94 97.49 97.43 99.15 -1.57 0.0479 CAB39 98.56 98.48 96.77 97.04 94.63 93.18 -1.62 0.0378 RUSC1 99.19 99.55 97.96 97.47 98.37 99.46 -1.66 0.0386 NEDD4 100.00 100.00 98.35 98.31 100.00 86.78 -1.67 0.0076 ERCC2 99.25 98.73 97.47 96.91 94.60 97.13 -1.80 0.0427 NCAPD3 98.65 99.20 97.27 96.71 99.11 97.23 -1.94 0.0388 EFR3A 100.00 100.00 97.98 98.10 100.00 97.80 -1.96 0.0195 DNMBP 97.48 97.74 95.56 95.65 97.97 96.45 -2.01 0.0246 MRPL13 98.99 98.94 96.97 96.93 98.11 97.33 -2.01 0.0003 MGAT5 97.55 97.84 95.93 95.41 97.67 95.76 -2.03 0.0379 POLA1 97.86 97.62 95.72 95.58 95.59 84.70 -2.09 0.0101 NFKBIZ 99.43 99.36 97.16 97.40 96.17 97.54 -2.12 0.0243 SURF6 99.02 99.37 96.96 97.19 98.08 98.07 -2.12 0.0154 SHMT2 98.36 97.95 96.35 95.71 98.12 97.70 -2.13 0.0432 AP001011.3 99.45 100.00 97.20 97.81 97.60 74.04 -2.22 0.0333 ARHGEF2 97.22 97.12 94.77 95.13 95.57 96.28 -2.22 0.0379 DHX38 98.43 98.29 95.99 96.24 97.35 97.83 -2.25 0.0103 RMND1 100.00 100.00 97.84 97.64 94.38 70.59 -2.26 0.0282 PKN2 97.82 97.61 95.69 95.17 95.47 97.76 -2.29 0.0436 ANKRD28 99.53 100.00 97.57 97.27 95.44 95.74 -2.35 0.0221 BLZF1 99.34 99.29 97.06 96.83 92.44 92.09 -2.37 0.0243 TDRD7 100.00 100.00 97.50 97.47 97.50 87.65 -2.52 0.0038 SH3BP4 100.00 99.59 97.36 97.15 99.48 97.86 -2.54 0.0208 SOX13 100.00 100.00 97.40 97.45 100.00 99.10 -2.58 0.0062 COL4A5 100.00 99.50 96.81 97.50 100.00 95.35 -2.60 0.0322 ZNF2 100.00 100.00 97.33 96.88 100.00 98.06 
-2.90 0.0494 FOXM1 13.56 13.55 10.71 10.48 12.03 12.82 -2.96 0.0245 UBTF 41.61 41.32 38.46 38.15 31.83 34.20 -3.16 0.0046 ZNHIT6 95.44 94.86 92.41 91.57 86.00 73.68 -3.16 0.0332 CASC5 99.36 100.00 96.30 96.55 98.60 100.00 -3.26 0.0372 TRPM7 99.28 98.88 95.43 95.80 93.03 90.65 -3.47 0.0063 VPS35 93.46 93.28 89.95 89.82 95.87 94.89 -3.49 0.0017 CHD1 7.75 8.64 4.90 4.49 5.74 7.79 -3.50 0.0450 ORC3L 96.24 96.16 92.70 92.69 95.09 91.48 -3.51 0.0064 DENND1B 100.00 100.00 96.30 96.61 82.61 61.90 -3.55 0.0278 SOAT1 97.00 97.95 94.13 93.45 96.38 91.00 -3.69 0.0307 ELP3 98.84 100.00 94.97 96.08 98.39 88.51 -3.90 0.0401 EIF2C4 100.00 100.00 96.30 95.88 100.00 97.30 -3.91 0.0342 CLTC 6.44 7.43 3.19 2.51 2.46 2.74 -4.09 0.0284 HRSP12 96.72 97.34 93.50 92.39 92.70 88.28 -4.09 0.0412 KIAA0317 98.60 98.37 94.38 94.21 97.25 92.59 -4.19 0.0018 OSBPL3 94.94 95.97 90.32 91.65 88.43 68.86 -4.47 0.0382 C10orf75 43.87 44.11 38.81 39.70 57.09 63.00 -4.74 0.0456


Gene ID             DMSO      DMSO       791       791       191       191      Mean         P
                PSI (25)  PSI (29)  PSI (25)  PSI (29)  PSI (25)  PSI (29)      ΔPSI     Value
TBC1D22B           92.83     91.54     86.88     87.54     81.76     59.44     -4.98    0.0418
UGGT1              91.92     90.76     86.37     85.38     82.30     77.80     -5.47    0.0202
WDR19              10.28      9.45      4.51      4.05     22.76     29.80     -5.59    0.0164
VLDLR              95.67     97.09     91.29     90.06     90.66     96.65     -5.71    0.0273
KIAA1919           96.33     96.26     90.20     90.32     86.05     89.57     -6.04    0.0006
PHF19              87.28     87.13     81.18     81.13     88.84     76.00     -6.05    0.0034
BTBD3              35.43     37.16     29.09     31.02     32.06     31.52     -6.24    0.0415
FAM114A2           94.48     96.63     90.13     88.34     77.63     66.67     -6.32    0.0485
FIS1               92.95     94.11     87.41     86.97     96.54     96.74     -6.34    0.0348
KDM6A              73.06     71.57     66.50     65.11     68.24     52.21     -6.51    0.0239
INTS4              16.48     14.47      9.61      8.14      8.13      8.57     -6.60    0.0406
KIF15             100.00     99.22     92.86     92.44     98.94     78.08     -6.96    0.0111
SRRM2              40.36     41.29     33.26     34.16     33.42     20.75     -7.12    0.0082
KNTC1              97.49     96.47     88.77     90.52     93.68     94.94     -7.34    0.0323
ARL6               94.87     93.33     87.10     86.21     75.00     69.77     -7.45    0.0260
VPS13C             96.83     96.06     88.24     89.47     85.37     85.54     -7.59    0.0159
MED17              95.00     93.62     86.05     86.30     83.33     71.55     -8.14    0.0473
OSBPL9             87.74     86.41     78.59     77.82     82.44     88.28     -8.87    0.0156
R3HDM1             31.91     34.32     23.67     24.82     25.49     25.35     -8.87    0.0477
AC136619.1         34.07     31.58     25.08     22.73     15.37      7.85     -8.92    0.0352
DCUN1D3            36.64     36.36     27.31     27.31     24.93     11.82     -9.19    0.0097
CHRNA5             93.55     93.83     83.51     84.47     87.34     89.74     -9.70    0.0207
SPAG1              96.88    100.00     87.10     90.16     97.10     85.19     -9.81    0.0462
NCOA5*             94.87     93.21     83.21     81.17     71.93     64.29    -11.85    0.0137
BTBD9              25.30     24.00     12.00     13.43     11.43     15.15    -11.94    0.0067
TNFAIP3            88.65     89.50     75.23     73.33     75.00     88.24    -14.80    0.0181
DST                82.57     79.68     65.99     65.09     69.22     49.41    -15.59    0.0413
RBMX               51.69     55.52     36.07     38.35     31.87     20.47    -16.40    0.0306
CCDC18             96.92     92.16     79.10     75.57     56.58     34.55    -17.21    0.0341
FGFR1OP2           71.43     71.87     51.40     51.71     49.51     25.08    -20.10    0.0004
C12orf29           61.70     65.24     40.44     37.44     41.06     15.17    -24.53    0.0097


Table II-2 Effect of 191 treatment on cellular alternative splicing (AS) as assessed by RNAseq. HeLa B2 cells were treated as described previously. For each splicing event, the ‘percent spliced in’ (PSI) score is given. The mean change in exon inclusion with compound treatment (ΔPSI) and the associated p value (Student’s t test) are listed (N = 2). AS events with |ΔPSI| ≥ 10% are coloured orange.
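The mean change in exon inclusion described in this caption can be illustrated with a minimal Python sketch. It assumes that the change is the treated PSI minus the DMSO PSI, averaged over the two samples labelled (25) and (29); the function names `mean_delta_psi` and `is_highlighted` are hypothetical and are not part of the original analysis pipeline.

```python
from statistics import mean


def mean_delta_psi(control_psi, treated_psi):
    """Mean change in PSI (treated minus control), pairing samples by position.

    Assumption: samples are paired by their (25)/(29) column labels.
    """
    if len(control_psi) != len(treated_psi):
        raise ValueError("control and treated need the same number of samples")
    return mean(t - c for c, t in zip(control_psi, treated_psi))


def is_highlighted(delta_psi, cutoff=10.0):
    """True when the absolute change in PSI meets the 10% cutoff used for colouring."""
    return abs(delta_psi) >= cutoff


# PSI values copied from the KIAA0649 and CENPE rows of Table II-2 (DMSO vs 191).
kiaa0649 = mean_delta_psi([29.13, 27.10], [54.06, 53.83])
cenpe = mean_delta_psi([90.56, 88.89], [100.00, 99.15])

print("KIAA0649", round(kiaa0649, 2), is_highlighted(kiaa0649))
print("CENPE", round(cenpe, 2), is_highlighted(cenpe))
```

Under these assumptions the two rows give 25.83 and 9.85, matching the tabulated mean values, so only KIAA0649 passes the 10% cutoff. The p values are not recomputed here because the exact form of the t test is not stated in the caption.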

Summary
Raw total count of AS events: 18,611
AS events with confidence: 9,806
AS events with P ≤ 0.05: 339
AS events with ΔPSI ≥ 10%: 24
AS events with ΔPSI ≤ -10%: 42
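To put the summary counts in proportion, the short Python sketch below expresses them as percentages of the confidently detected AS events. The dictionary keys and the `percent` helper are illustrative names only; the counts are copied from the summary above.

```python
# Counts copied from the Table II-2 summary (191 treatment).
summary = {
    "confident_events": 9806,
    "significant_events": 339,   # P <= 0.05
    "increased_inclusion": 24,   # change in PSI >= 10%
    "decreased_inclusion": 42,   # change in PSI <= -10%
}


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


significant_pct = percent(summary["significant_events"], summary["confident_events"])
large_change_pct = percent(
    summary["increased_inclusion"] + summary["decreased_inclusion"],
    summary["confident_events"],
)

print(f"Events with P <= 0.05: {significant_pct:.2f}% of confident events")
print(f"Events with at least a 10% change in PSI: {large_change_pct:.2f}% of confident events")
```

With these counts, roughly 3.5% of confidently detected events reach P ≤ 0.05 and under 1% change by 10% or more in either direction.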

Gene ID             DMSO      DMSO       791       791       191       191      Mean         P
                PSI (25)  PSI (29)  PSI (25)  PSI (29)  PSI (25)  PSI (29)      ΔPSI     Value
KIAA0649           29.13     27.10     32.69     18.53     54.06     53.83     25.83    0.0233
ANKRD9             76.11     71.84     81.07     83.26     95.49     99.23     23.39    0.0152
TCF19              68.55     72.16     76.09     79.93     91.82     94.12     22.62    0.0151
CLCN6              67.83     67.14     59.38     66.67     87.72     90.86     21.81    0.0372
AF011889.4         58.54     62.89     60.00     78.38     80.95     83.91     21.72    0.0206
IFT88              66.99     72.33     80.85     83.74     85.00     89.47     17.58    0.0395
TBCK                5.62      3.76     13.04     10.00     20.31     23.56     17.25    0.0228
STIL               29.29     26.05     32.21     27.90     41.54     45.99     16.10    0.0344
ZNF445             83.78     81.58     91.84     92.21     97.30    100.00     15.97    0.0132
ZNF317             71.90     73.64     77.69     70.71     86.43     90.32     15.61    0.0449
THSD1P1            43.75     40.00     32.37     31.78     60.00     54.84     15.55    0.0473
FUBP1              10.65      8.49     20.02     12.88     25.57     24.60     15.52    0.0201
SLC4A7             80.84     83.74     95.65     88.84     95.79     99.17     15.19    0.0221
WDR91              85.71     87.01     88.38     87.23     99.18    100.00     13.23    0.0068
HYAL3              78.08     81.92     88.14     83.43     91.43     94.12     12.78    0.0406
KITLG              73.83     75.71     78.54     82.31     86.09     88.94     12.75    0.0256
CEP170             77.40     79.39     77.65     80.51     89.56     92.06     12.42    0.0185
UBP1               38.96     38.84     52.11     48.56     50.54     50.23     11.49    0.0029
VPS41               2.03      1.70      3.07      1.47     12.17     13.63     11.04    0.0335
CRAMP1L            84.71     88.33     95.92     83.23     98.67     96.00     10.82    0.0478
XPO4               27.30     27.99     25.45     37.38     37.56     38.82     10.55    0.0119
STK19              86.26     84.70     90.37     89.56     95.84     96.03     10.46    0.0445
STK39              41.33     43.71     55.69     49.72     52.21     53.38     10.28    0.0370
SRSF2              10.21     11.06     20.00     19.14     20.92     20.50     10.08    0.0085
CENPE              90.56     88.89     88.19     85.44    100.00     99.15      9.85    0.0226
FBXO18              9.88      9.69     12.29      6.91     18.49     19.99      9.46    0.0471
TATDN2             16.09     17.23     27.24     29.96     25.58     26.23      9.25    0.0117
OGT                86.39     84.73     90.94     91.46     94.64     93.05      8.29    0.0188
ADARB1             82.86     82.87     84.60     84.52     89.73     89.91      6.96    0.0080
WDR90              88.64     89.22     95.64     88.69     95.73     95.88      6.88    0.0188
MFSD3              89.47     90.76     91.07     89.18     95.77     97.08      6.31    0.0206


DMSO 791 191 Mean Gene ID P Value PSI (25) PSI (29) PSI (25) PSI (29) PSI (25) PSI (29) PSI TIA1 14.10 13.09 21.19 19.53 19.27 20.22 6.15 0.0126 PCNXL3 10.54 10.02 4.52 7.38 16.70 16.13 6.14 0.0041 CDCA7 16.21 17.72 19.84 17.33 23.41 22.60 6.04 0.0378 DCAKD 9.24 7.57 14.34 13.10 13.69 14.92 5.90 0.0356 CTC-338M12.5 12.51 14.13 16.52 19.57 18.10 19.97 5.72 0.0455 C19orf50 5.59 4.33 9.70 7.53 9.28 11.04 5.20 0.0492 ZER1 93.68 94.18 95.99 93.48 98.45 99.71 5.15 0.0486 C10orf35 94.68 95.21 96.67 100.00 100.00 100.00 5.06 0.0333 MLL3 88.89 90.05 93.52 91.23 95.31 93.67 5.02 0.0466 PDE8B 89.17 90.37 92.06 89.20 94.09 94.93 4.74 0.0302 CRELD2 5.03 4.44 4.43 4.26 9.61 9.25 4.70 0.0108 RUFY2 75.00 75.84 82.22 90.58 80.25 79.78 4.60 0.0223 METAP1 92.04 91.81 95.64 96.94 96.27 96.76 4.59 0.0130 MAP4K2 93.60 94.49 95.43 94.82 98.01 99.14 4.53 0.0276 POGZ 92.56 91.67 92.03 95.56 96.68 96.13 4.29 0.0242 INVS 95.52 95.94 96.26 98.10 100.00 100.00 4.27 0.0313 SNRK 92.92 94.20 98.75 100.00 98.28 97.08 4.12 0.0428 EML2 95.06 95.20 96.38 97.68 99.06 99.06 3.93 0.0113 ARVCF 96.34 95.99 91.79 100.00 100.00 100.00 3.84 0.0290 KIRREL 96.25 95.72 100.00 97.31 99.26 100.00 3.65 0.0201 KIAA1549 96.23 96.59 96.82 99.42 100.00 100.00 3.59 0.0319 ANAPC1 1.20 1.30 1.63 1.68 4.79 4.49 3.39 0.0160 NSUN2 2.26 2.65 2.76 2.10 5.33 5.89 3.16 0.0163 TMEM138 94.99 95.27 96.91 97.58 97.89 98.50 3.07 0.0321 UHRF1BP1L 96.44 95.79 97.84 97.63 98.73 99.59 3.05 0.0351 USP37 96.83 97.08 100.00 100.00 100.00 100.00 3.05 0.0261 CDCA7 16.17 15.79 19.92 23.26 18.97 18.91 2.96 0.0364 TNIP2 96.55 96.67 97.11 99.51 99.25 99.60 2.82 0.0236 SFRS12 96.79 97.10 98.29 98.65 99.46 100.00 2.79 0.0237 HTRA2 96.02 96.08 96.89 96.90 98.74 98.59 2.62 0.0073 MLKL 97.22 97.59 98.10 95.88 100.00 100.00 2.60 0.0453 C22orf13 96.33 96.87 94.99 98.47 98.92 99.41 2.57 0.0201 NCSTN 1.02 0.84 1.08 2.27 3.40 3.47 2.51 0.0102 C1orf93 97.70 97.31 100.00 98.61 100.00 100.00 2.50 0.0497 KIAA1109 3.42 2.99 2.94 2.91 6.01 5.34 2.47 0.0363 
SF3B1 4.65 4.30 6.66 6.52 6.99 6.86 2.45 0.0260 STAG1 97.61 97.58 93.79 100.00 100.00 100.00 2.41 0.0040 ILKAP 1.88 1.28 4.23 0.51 3.59 4.36 2.40 0.0439 RRM2B 97.55 97.67 99.07 100.00 100.00 100.00 2.39 0.0160 BAT2L 97.49 97.71 98.66 98.27 99.85 100.00 2.33 0.0055 PREB 96.80 97.36 95.81 97.16 99.05 99.70 2.30 0.0348 DAGLA 97.85 97.65 96.30 93.97 100.00 100.00 2.25 0.0283 CTTNBP2 97.92 97.62 97.56 88.41 100.00 100.00 2.23 0.0428 VIPAR 97.92 97.68 99.56 100.00 100.00 100.00 2.20 0.0347 HPS3 97.31 97.89 98.89 100.00 100.00 99.43 2.12 0.0351 C7orf64 97.86 97.29 97.22 100.00 99.35 100.00 2.10 0.0412 MBD1 97.54 97.00 98.05 96.87 99.19 99.50 2.08 0.0376 NUP205 97.45 97.82 99.63 99.86 99.65 99.75 2.07 0.0431 NME4 6.83 6.97 8.37 6.99 8.91 9.02 2.07 0.0024 CACNA1H 97.80 97.24 98.98 98.13 99.65 99.15 1.88 0.0386 DPY19L4 98.00 98.24 100.00 100.00 100.00 100.00 1.88 0.0406


DMSO 791 191 Mean Gene ID P Value PSI (25) PSI (29) PSI (25) PSI (29) PSI (25) PSI (29) PSI P4HA2 97.50 97.97 98.32 98.68 99.27 99.83 1.82 0.0406 PCNXL3 98.22 97.73 98.97 98.50 100.00 99.57 1.81 0.0321 FAIM 98.18 98.23 98.80 99.49 100.00 100.00 1.79 0.0089 CELF1 97.62 97.97 98.18 99.16 99.80 99.36 1.79 0.0270 STK11IP 98.29 98.25 100.00 95.60 100.00 100.00 1.73 0.0074 RNF215 98.26 98.37 100.00 100.00 100.00 100.00 1.69 0.0208 MAGOHB 3.42 3.53 4.64 6.70 4.98 5.31 1.67 0.0425 ACBD5 97.15 97.42 97.00 96.72 98.87 98.96 1.63 0.0343 DOCK11 98.34 98.45 100.00 100.00 100.00 100.00 1.61 0.0218 NEURL 98.13 97.79 98.74 99.76 99.51 99.32 1.46 0.0327 CXXC1 98.52 98.60 97.92 98.48 100.00 100.00 1.44 0.0177 ABCB6 90.07 89.97 91.71 89.76 91.48 91.37 1.41 0.0029 MRPL52 2.20 2.50 4.37 6.04 3.82 3.67 1.40 0.0326 RP6-109B7.3 2.44 2.75 3.48 2.65 4.08 3.84 1.37 0.0234 MIIP 97.91 98.06 98.71 97.25 99.25 99.13 1.21 0.0075 AHCYL2 98.71 98.89 100.00 100.00 100.00 100.00 1.20 0.0477 UNK 98.80 98.82 100.00 98.80 100.00 100.00 1.19 0.0053 ADNP 83.72 83.98 80.59 82.70 85.08 84.98 1.18 0.0436 DOCK9 98.86 98.81 99.08 100.00 100.00 100.00 1.17 0.0137 NCBP1 98.35 98.26 98.57 98.66 99.43 99.40 1.11 0.0144 BAT2L 98.80 98.67 98.79 99.38 99.87 99.80 1.10 0.0121 RPS6KB1 98.88 98.96 99.42 99.46 100.00 100.00 1.08 0.0236 WDFY3 98.97 98.88 98.09 100.00 100.00 100.00 1.08 0.0266 SNX2 98.68 98.94 99.27 99.89 99.75 100.00 1.07 0.0276 MED12 98.95 99.00 100.00 98.88 100.00 100.00 1.03 0.0155 ATHL1 99.18 99.06 100.00 100.00 100.00 100.00 0.88 0.0433 NAP1L4 0.48 0.41 0.62 0.90 1.32 1.30 0.87 0.0167 CCT6A 98.09 97.99 98.35 98.70 98.94 98.85 0.85 0.0064 SMARCD2 98.94 99.09 100.00 100.00 100.00 99.74 0.85 0.0476 DHX30 99.09 99.21 100.00 99.45 100.00 100.00 0.85 0.0449 ULK3 99.21 99.12 99.32 99.26 100.00 100.00 0.84 0.0343 IGHMBP2 99.13 99.25 99.35 100.00 100.00 100.00 0.81 0.0471 NCOA5 99.21 99.18 97.95 98.41 100.00 100.00 0.81 0.0119 CXXC1 99.22 99.20 99.41 99.27 100.00 100.00 0.79 0.0081 NKIRAS2 98.83 98.90 96.62 
100.00 99.61 99.68 0.78 0.0040 STT3B 98.68 98.92 99.56 99.21 99.66 99.49 0.77 0.0425 DMXL1 99.22 99.24 100.00 100.00 100.00 100.00 0.77 0.0083 AP1M1 99.22 99.25 98.25 99.42 100.00 100.00 0.77 0.0125 FNIP1 99.22 99.30 100.00 100.00 100.00 100.00 0.74 0.0344 PICALM 99.14 99.10 99.85 99.38 99.84 99.84 0.72 0.0177 SLC25A40 99.26 99.35 100.00 100.00 100.00 100.00 0.70 0.0412 ITGA5 99.32 99.31 99.22 100.00 100.00 100.00 0.69 0.0046 TRABD 99.17 98.99 99.21 100.00 99.82 99.63 0.64 0.0390 NUP188 99.37 99.27 99.24 99.16 99.89 100.00 0.63 0.0142 UPF3B 99.42 99.35 99.31 99.60 100.00 100.00 0.62 0.0362 GMIP 99.37 99.40 98.80 97.60 100.00 100.00 0.61 0.0155 INO80 99.40 99.38 98.31 99.48 100.00 100.00 0.61 0.0104 KIF1B 99.41 99.45 99.36 100.00 100.00 100.00 0.57 0.0223 PNPLA6 99.43 99.46 99.69 100.00 100.00 100.00 0.56 0.0172 SLC7A6 99.46 99.43 100.00 100.00 100.00 100.00 0.56 0.0172 ALDH16A1 99.46 99.45 98.84 100.00 100.00 100.00 0.55 0.0058


DMSO 791 191 Mean Gene ID P Value PSI (25) PSI (29) PSI (25) PSI (29) PSI (25) PSI (29) PSI GAK 96.68 96.86 93.09 97.12 97.22 97.37 0.52 0.0492 MSLN 99.21 99.33 99.41 99.43 99.73 99.85 0.52 0.0256 AMPD2 97.00 97.14 98.00 96.36 97.50 97.66 0.51 0.0422 TRAPPC1 99.26 99.37 99.39 99.93 99.74 99.90 0.50 0.0448 DCTN2 99.34 99.35 99.26 99.37 99.79 99.86 0.48 0.0424 SRP68 99.59 99.58 99.85 100.00 100.00 100.00 0.41 0.0077 BTAF1 99.57 99.61 99.03 97.97 100.00 100.00 0.41 0.0310 STAM 99.39 99.38 100.00 99.78 99.78 99.81 0.41 0.0127 SH3GL1 99.16 99.15 99.27 99.64 99.55 99.58 0.41 0.0127 USP5 99.54 99.53 99.38 99.01 99.88 99.89 0.35 0.0004 CLPTM1 99.18 99.13 99.10 99.18 99.46 99.52 0.33 0.0147 YME1L1 99.50 99.48 99.51 99.79 99.79 99.82 0.31 0.0058 SLC25A24 99.70 99.72 99.28 98.31 100.00 100.00 0.29 0.0219 MVP 99.71 99.73 100.00 100.00 100.00 100.00 0.28 0.0227 DDX11 99.13 99.15 99.44 99.20 99.41 99.42 0.27 0.0067 TBL1Y 99.72 99.76 99.80 99.59 100.00 100.00 0.26 0.0489 ACO2 99.68 99.73 100.00 99.92 99.93 100.00 0.26 0.0332 EXOSC8 99.76 99.73 99.71 100.00 100.00 100.00 0.25 0.0374 DDX54 99.76 99.77 100.00 99.69 100.00 100.00 0.23 0.0135 CCT4 99.78 99.76 99.95 100.00 100.00 100.00 0.23 0.0277 FAM129B 99.70 99.70 99.63 99.85 99.94 99.91 0.22 0.0424 CUEDC2 99.79 99.80 100.00 100.00 100.00 100.00 0.20 0.0155 PTPN1 99.13 99.11 98.77 98.60 99.34 99.30 0.20 0.0294 BAT3 99.71 99.70 99.64 99.73 99.85 99.88 0.16 0.0399 C11orf48 99.86 99.84 100.00 100.00 100.00 100.00 0.15 0.0424 CCNG1 99.43 99.46 99.66 99.28 99.56 99.58 0.13 0.0286 ESYT1 100.00 100.00 100.00 99.69 99.89 99.90 -0.10 0.0303 P4HB 99.86 99.84 99.78 99.86 99.75 99.73 -0.11 0.0161 CENPF 99.13 99.15 99.04 98.22 98.98 99.02 -0.14 0.0492 DDX24 100.00 100.00 99.85 100.00 99.85 99.87 -0.14 0.0454 PDCD6IP 100.00 100.00 99.15 99.42 99.83 99.85 -0.16 0.0397 SETD3 100.00 100.00 99.74 99.73 99.78 99.77 -0.23 0.0141 SMARCA4 100.00 100.00 100.00 99.60 99.71 99.72 -0.29 0.0112 ISY1 0.62 0.61 0.20 0.43 0.33 0.32 -0.29 0.0006 CUL4A 99.91 
100.00 99.71 100.00 99.69 99.62 -0.30 0.0388 SPHK1 100.00 100.00 99.35 100.00 99.62 99.63 -0.38 0.0085 HNRNPAB 99.95 99.94 99.89 99.75 99.55 99.58 -0.38 0.0140 RNF31 100.00 100.00 100.00 99.52 99.62 99.59 -0.39 0.0242 MTOR 99.66 99.66 99.67 99.27 99.29 99.23 -0.40 0.0477 DDX17 100.00 99.91 99.57 99.90 99.49 99.62 -0.40 0.0466 RRN3 100.00 100.00 100.00 100.00 99.57 99.63 -0.40 0.0477 ASAP3 100.00 100.00 99.14 100.00 99.60 99.57 -0.42 0.0230 AP1G1 99.48 99.51 96.27 99.53 99.04 99.11 -0.42 0.0272 ZC3H7B 100.00 100.00 98.97 99.65 99.56 99.60 -0.42 0.0303 RELA 99.59 99.49 98.47 99.50 99.12 99.06 -0.45 0.0280 VAMP1 100.00 100.00 100.00 100.00 99.58 99.52 -0.45 0.0424 MRTO4 99.71 99.82 99.56 99.46 99.27 99.35 -0.45 0.0271 USP32 100.00 100.00 100.00 99.08 99.56 99.52 -0.46 0.0277 LRBA 100.00 100.00 100.00 97.30 99.56 99.50 -0.47 0.0406 SAMM50 100.00 100.00 99.72 99.65 99.50 99.53 -0.48 0.0197 PCBP2 99.37 99.41 99.13 99.58 98.87 98.76 -0.57 0.0382


DMSO 791 191 Mean Gene ID P Value PSI (25) PSI (29) PSI (25) PSI (29) PSI (25) PSI (29) PSI FASTKD1 100.00 100.00 98.48 99.52 99.41 99.44 -0.58 0.0166 RP11-313P13.3 99.32 99.28 99.26 99.61 98.74 98.68 -0.59 0.0065 EMP3 99.37 99.38 98.94 98.95 98.79 98.74 -0.61 0.0210 LIMCH1 0.87 0.99 0.70 0.79 0.25 0.34 -0.64 0.0171 NMT1 100.00 100.00 99.87 100.00 99.33 99.38 -0.65 0.0247 GLB1 99.80 99.85 99.83 99.68 99.22 99.10 -0.66 0.0312 FBXL2 100.00 100.00 100.00 100.00 99.37 99.28 -0.67 0.0424 RAE1 1.32 1.53 2.19 2.33 0.81 0.63 -0.71 0.0382 TINAGL1 100.00 100.00 100.00 99.31 99.27 99.32 -0.71 0.0226 DGCR14 98.77 98.82 97.34 98.90 98.11 97.95 -0.76 0.0475 EXOC3 99.65 99.78 99.19 100.00 99.03 98.81 -0.80 0.0403 DNM1L 1.11 1.28 1.19 0.31 0.49 0.30 -0.80 0.0252 EFNA1 97.93 97.86 97.97 97.79 97.17 97.00 -0.81 0.0385 RCN2 1.10 0.99 0.40 0.26 0.23 0.18 -0.84 0.0182 GFM2 100.00 100.00 99.51 100.00 99.18 99.11 -0.85 0.0260 PSMA5 98.74 98.87 97.78 98.58 97.83 98.05 -0.87 0.0352 SMG5 1.80 2.00 1.78 1.54 0.96 1.08 -0.88 0.0290 LTBP3 100.00 100.00 99.14 100.00 99.11 99.10 -0.90 0.0036 AZI2 98.86 98.79 98.18 98.68 97.88 97.93 -0.92 0.0035 BCL2L1 2.13 2.19 2.05 2.39 1.18 1.29 -0.93 0.0119 DCTD 100.00 100.00 99.23 99.22 99.05 99.10 -0.93 0.0172 DUSP11 99.58 99.65 99.03 98.91 98.76 98.56 -0.95 0.0440 SEC31B 97.85 97.70 91.58 95.00 96.75 96.89 -0.95 0.0115 CHCHD3 97.96 97.95 97.79 96.88 96.92 96.99 -1.00 0.0198 BAG2 2.42 2.30 2.03 3.24 1.47 1.22 -1.02 0.0414 CYLD 100.00 100.00 100.00 98.79 99.03 98.94 -1.02 0.0282 RTTN 100.00 100.00 97.56 98.65 98.94 98.99 -1.04 0.0154 PDCD4 98.68 98.91 99.33 96.56 97.86 97.60 -1.07 0.0265 BMS1 99.03 98.68 98.57 99.44 97.84 97.54 -1.17 0.0388 BRE 99.70 100.00 99.78 100.00 98.79 98.46 -1.23 0.0322 FBXL17 100.00 100.00 98.36 99.07 98.63 98.82 -1.28 0.0473 STK11IP 100.00 100.00 96.97 100.00 98.63 98.53 -1.42 0.0224 GREB1L 100.00 100.00 100.00 97.01 98.45 98.64 -1.46 0.0415 AP1G1 100.00 99.59 97.67 100.00 98.45 98.20 -1.47 0.0397 NPEPPS 99.23 99.09 98.44 97.78 
97.52 97.84 -1.48 0.0378 ODF2 99.73 100.00 99.61 100.00 98.48 98.26 -1.50 0.0150 LRP5 99.09 99.40 97.77 98.66 97.95 97.50 -1.52 0.0399 SP110 91.58 91.40 72.88 100.00 89.90 90.00 -1.54 0.0113 PIKFYVE 100.00 100.00 100.00 94.28 98.48 98.36 -1.58 0.0242 MEGF9 100.00 99.64 99.41 98.53 98.35 98.11 -1.59 0.0259 MINA 100.00 100.00 99.03 100.00 98.19 98.41 -1.70 0.0411 C6orf125 3.19 3.71 4.53 4.24 1.52 1.78 -1.80 0.0500 TUBGCP5 95.60 95.65 93.98 97.41 93.68 93.81 -1.88 0.0098 NUF2 98.49 98.86 98.28 99.48 96.68 96.79 -1.94 0.0439 KIAA1429 100.00 100.00 99.47 100.00 97.88 98.14 -1.99 0.0415 DDX18 98.71 99.29 98.65 98.48 97.33 96.67 -2.00 0.0464 FUCA1 100.00 99.34 99.32 100.00 97.77 97.31 -2.13 0.0429 XPC 99.60 100.00 98.40 98.99 97.51 97.43 -2.33 0.0465 TTC31 99.60 100.00 98.98 97.98 97.77 97.12 -2.36 0.0387 EFHA1 99.14 99.53 99.63 100.00 96.87 96.89 -2.46 0.0499 BAZ1B 2.81 3.37 2.07 2.46 0.85 0.39 -2.47 0.0229


DMSO 791 191 Mean Gene ID P Value PSI (25) PSI (29) PSI (25) PSI (29) PSI (25) PSI (29) PSI PICK1 100.00 100.00 98.48 99.22 97.14 97.52 -2.67 0.0452 FNBP4 96.40 96.61 94.49 95.73 94.05 93.59 -2.69 0.0263 SNX4 4.47 3.78 3.28 3.72 1.08 1.72 -2.73 0.0289 CDAN1 100.00 100.00 97.44 100.00 97.10 97.45 -2.73 0.0408 SESTD1 98.26 98.24 97.84 90.79 95.68 95.26 -2.78 0.0475 TNPO2 3.88 4.76 3.89 4.22 1.97 1.07 -2.80 0.0470 TMEM219 11.26 11.53 8.59 7.36 8.88 8.26 -2.83 0.0388 GIT1 4.69 4.48 1.99 4.55 1.48 1.99 -2.85 0.0312 TAOK2 15.33 14.85 14.03 17.34 12.37 12.07 -2.87 0.0168 XRN2 99.34 99.72 96.07 98.99 96.20 96.85 -3.01 0.0276 ANAPC4 98.98 99.55 100.00 98.25 96.54 95.80 -3.10 0.0258 TAOK3 6.02 6.59 1.59 4.17 2.70 3.67 -3.12 0.0486 ZFYVE20 96.10 95.10 94.27 89.25 92.00 92.82 -3.19 0.0417 SF3B1 4.50 5.49 5.55 4.70 2.15 1.24 -3.30 0.0396 CCDC76 100.00 100.00 100.00 100.00 96.49 96.88 -3.32 0.0374 SSX2IP 100.00 99.42 97.75 100.00 96.63 96.13 -3.33 0.0138 NSMCE1 4.41 4.98 1.64 3.33 1.61 0.98 -3.40 0.0157 POLH 97.51 97.22 94.50 96.21 93.95 93.81 -3.49 0.0087 CEP250 98.75 99.45 96.45 97.87 95.08 96.07 -3.53 0.0359 CDCA2 4.67 4.78 2.69 3.98 1.12 0.88 -3.73 0.0067 ERMP1 96.93 96.22 95.20 97.74 92.34 93.01 -3.90 0.0155 PPP2R3B 5.80 6.52 2.50 4.58 1.97 2.35 -4.00 0.0235 KIAA0391 95.25 96.57 94.34 96.30 92.36 91.30 -4.08 0.0442 PGM2 100.00 99.31 99.23 100.00 95.42 95.69 -4.10 0.0304 AATF 99.36 100.00 97.89 99.06 95.31 95.83 -4.11 0.0113 ANKRD28 99.53 100.00 97.57 97.27 95.44 95.74 -4.18 0.0084 PRTG 100.00 100.00 100.00 88.00 96.08 95.56 -4.18 0.0395 ACOT9 87.77 87.86 88.90 86.59 83.75 83.20 -4.34 0.0356 ZNF770 96.83 97.35 97.60 99.21 92.36 92.91 -4.46 0.0072 ATR 100.00 100.00 100.00 99.15 95.72 95.31 -4.49 0.0291 MYO9B 99.60 100.00 94.33 99.05 94.55 95.29 -4.88 0.0176 TMEM55A 5.49 6.84 3.01 8.80 2.02 0.51 -4.90 0.0412 MXD3 97.14 96.24 90.64 92.35 92.44 91.06 -4.94 0.0376 PCBP4 91.82 92.72 93.90 91.83 86.63 87.82 -5.05 0.0254 COL4A3BP 6.12 6.94 4.51 6.01 1.77 0.86 -5.22 0.0140 CCNE2 
99.16 100.00 100.00 100.00 94.03 94.37 -5.38 0.0266 ERCC8 97.81 98.94 95.64 99.25 93.61 92.37 -5.39 0.0239 RIOK3 94.84 94.38 89.02 92.94 88.83 89.32 -5.54 0.0037 PATZ1 13.22 14.06 14.47 13.33 8.36 7.75 -5.59 0.0116 TCERG1 9.06 10.33 12.79 9.84 4.20 3.63 -5.78 0.0377 MAP3K9 100.00 100.00 97.10 100.00 94.17 94.08 -5.88 0.0049 USP40 96.03 97.99 97.59 99.05 89.89 91.88 -6.13 0.0483 BCKDHB 98.27 99.66 97.34 97.11 91.60 93.56 -6.39 0.0418 INADL 100.00 100.00 100.00 94.50 93.45 93.44 -6.56 0.0005 GOLGA5 96.53 96.94 93.24 97.33 90.50 89.81 -6.58 0.0084 WRN 100.00 98.46 94.95 97.85 93.55 91.74 -6.59 0.0328 FIBP 94.00 94.23 89.16 90.92 86.99 87.81 -6.72 0.0272 TTLL5 23.50 25.13 22.70 19.44 18.02 17.15 -6.73 0.0363 ERBB2 98.08 96.65 93.88 94.57 90.66 90.03 -7.02 0.0347 DENND1A 98.45 97.55 94.74 98.42 90.71 91.20 -7.05 0.0134 BLZF1 99.34 99.29 97.06 96.83 92.44 92.09 -7.05 0.0138


DMSO 791 191 Mean Gene ID P Value PSI (25) PSI (29) PSI (25) PSI (29) PSI (25) PSI (29) PSI FBF1 93.44 91.60 96.67 95.35 84.47 85.59 -7.49 0.0324 DPY19L3 100.00 100.00 89.83 98.18 91.84 92.52 -7.82 0.0277 LRRC16A 95.83 96.26 94.44 93.58 88.17 88.24 -7.84 0.0148 SOS1 100.00 98.10 99.34 100.00 90.55 91.61 -7.97 0.0338 PTPN4 95.97 95.22 95.41 95.52 86.74 87.85 -8.30 0.0103 ALDH5A1 19.91 19.45 16.05 13.36 10.39 11.95 -8.51 0.0422 DMXL1 95.65 92.67 98.30 97.54 84.31 86.52 -8.75 0.0492 GPR89B 97.52 98.43 94.16 98.20 88.36 89.13 -9.23 0.0046 BICC1 98.80 99.19 94.56 96.80 90.22 89.06 -9.36 0.0240 POLA1 99.67 98.31 98.74 97.75 89.02 89.53 -9.72 0.0251 PWWP2A 15.97 13.11 11.36 8.72 5.60 3.85 -9.82 0.0424 VPS13D 12.24 15.06 4.21 9.75 2.54 4.56 -10.10 0.0353 MATN2 97.83 100.00 100.00 100.00 89.19 86.67 -10.99 0.0234 VPS13C 96.83 96.06 88.24 89.47 85.37 85.54 -10.99 0.0169 PPP3CB 100.00 98.24 92.55 95.11 87.68 87.27 -11.65 0.0383 USP20 95.85 95.67 94.07 94.38 83.39 84.77 -11.68 0.0347 INSIG2 100.00 100.00 95.04 100.00 87.23 88.68 -12.05 0.0383 ZDHHC13 93.24 89.64 89.32 94.27 78.06 80.10 -12.36 0.0453 MCTP1 63.06 60.19 57.26 54.85 47.88 49.61 -12.88 0.0280 SLC19A2 95.35 96.61 91.71 96.20 84.62 81.47 -12.94 0.0480 SLC25A15 100.00 98.69 93.28 96.58 87.29 85.04 -13.18 0.0190 KLHL29 98.58 98.80 95.51 100.00 84.48 86.44 -13.23 0.0446 KIAA0586 98.47 95.12 94.59 100.00 85.19 81.25 -13.58 0.0363 MAP4K4 54.68 53.07 50.81 54.89 39.80 40.34 -13.81 0.0223 TMOD1 95.47 93.40 78.55 87.62 79.18 81.97 -13.86 0.0193 CHCHD7 38.41 40.37 39.53 46.10 24.09 22.57 -16.06 0.0074 HISPPD1 41.78 44.23 38.65 34.66 26.65 26.74 -16.31 0.0474 PIAS2 88.44 92.54 91.79 80.44 76.47 71.31 -16.60 0.0412 APLP1 81.61 85.71 76.98 81.22 65.40 68.25 -16.84 0.0284 ACVR2A 100.00 100.00 87.69 100.00 83.67 82.46 -16.94 0.0227 COL27A1 96.32 100.00 90.12 94.02 82.06 77.38 -18.44 0.0285 SAP130 49.87 54.88 52.21 57.10 35.98 30.79 -18.99 0.0343 C20orf4 31.28 28.87 27.78 22.70 11.26 9.30 -19.80 0.0071 SYNJ2 98.18 97.06 100.00 
93.42 78.09 75.32 -20.92 0.0215 GUF1 48.98 53.78 47.44 33.96 33.33 27.32 -21.06 0.0353 NCOR2 42.48 44.72 30.15 47.90 23.34 21.45 -21.21 0.0053 RP11-187C18.3 65.10 68.72 51.44 62.43 40.96 46.87 -23.00 0.0346 AC009086.1 38.90 35.93 40.86 39.83 15.13 11.17 -24.27 0.0131 RALGAPA2 100.00 100.00 100.00 95.35 76.67 74.19 -24.57 0.0321 ACAD10 89.47 97.66 82.55 92.94 72.09 64.56 -25.24 0.0459 CASP8AP2 97.80 93.87 80.36 86.36 72.97 68.12 -25.29 0.0168 SMEK2 50.71 50.85 46.29 34.27 26.11 23.91 -25.77 0.0266 HOXC4 36.00 42.86 30.91 59.26 8.57 13.04 -28.63 0.0291 SMPDL3A 83.72 81.20 57.89 71.11 56.36 50.00 -29.28 0.0417 MBTD1 69.62 63.64 61.45 61.90 34.51 29.20 -34.78 0.0135 FAM48A 81.22 79.73 77.21 78.24 46.84 43.35 -35.38 0.0135 RFX3 82.86 76.81 76.62 82.98 38.89 47.22 -36.78 0.0242 TMEM20 58.54 56.32 30.77 23.36 17.65 14.81 -41.20 0.0025 C13orf23 95.95 96.15 92.52 93.78 53.00 50.35 -44.38 0.0184 KIAA0240 60.00 65.08 25.00 43.21 16.28 16.67 -46.07 0.0341 CCDC150 69.77 83.91 48.28 77.05 36.26 24.68 -46.37 0.0397 ALKBH8 75.86 60.66 46.51 59.49 25.00 13.04 -49.24 0.0407 WDFY3 76.56 68.75 52.17 41.77 27.27 12.09 -52.98 0.0481 122

Table II-3 Effect of 791 treatment on gene expression as assessed by RNAseq. HeLa B2 cells were treated as described previously. The expression level of each gene with DMSO, 791 or 191 treatment is quantified as corrected RPKM (cRPKM) values (reads per kilobase of exon model per million mapped reads), with the p value determined by Student’s t test (N = 2). The expression cutoff is 0.5 RPKM (≥ 10 reads that mapped uniquely to a single genomic locus). Mean fold changes > 2 or < 0.5 are coloured orange, and mean fold changes > 5 or < 0.2 are coloured red. A subset of the data is shown. Bolded events are common to both the 791 and 191 treatment samples.
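The quantities named in this caption can be sketched in Python as follows. This is a minimal illustration under stated assumptions: `rpkm` implements the standard uncorrected RPKM definition (the correction used for cRPKM is not described here), `mean_fold_change` assumes the fold change is the treated/DMSO ratio averaged over the two samples labelled (25) and (29), and `colour_class` encodes the colouring thresholds. All three function names are hypothetical.

```python
from statistics import mean


def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Uncorrected RPKM: reads per kilobase of exon model per million mapped reads."""
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


def mean_fold_change(control, treated):
    """Mean of per-sample treated/control ratios, pairing samples by position.

    Assumption: samples are paired by their (25)/(29) column labels.
    """
    if len(control) != len(treated):
        raise ValueError("control and treated need the same number of samples")
    return mean(t / c for c, t in zip(control, treated))


def colour_class(fold_change):
    """Return 'red', 'orange' or None following the thresholds in the caption."""
    if fold_change > 5 or fold_change < 0.2:
        return "red"
    if fold_change > 2 or fold_change < 0.5:
        return "orange"
    return None


# cRPKM values copied from the TRIB3 and WARS rows of Table II-3 (DMSO vs 791).
trib3 = mean_fold_change([23.20, 14.75], [176.50, 161.73])
wars = mean_fold_change([37.85, 28.74], [137.22, 155.32])

print("TRIB3", round(trib3, 3), colour_class(trib3))
print("WARS", round(wars, 3), colour_class(wars))
```

Under the pairing assumption the TRIB3 row gives 9.286, which agrees with the tabulated mean fold change and falls in the red class, while WARS falls in the orange class. The example call `rpkm(500, 2000, 25_000_000)` would give 10.0 for a hypothetical gene with 500 reads over a 2 kb exon model in a library of 25 million mapped reads.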

Summary
Raw total count of genes: 19,847
Genes with RPKM ≥ 0.5: 11,406
Gene expression with P ≤ 0.05: 1,020
Genes with fold change ≥ 2: 60
Genes with fold change ≤ 0.5: 24

Gene Expression (cRPKM), two replicates per treatment
Gene ID | DMSO | DMSO | 791 | 791 | 191 | 191 | Mean fold change | P Value
TRIB3 | 23.20 | 14.75 | 176.50 | 161.73 | 40.12 | 27.00 | 9.286 | 0.0082
GDF15 | 81.71 | 59.28 | 493.65 | 563.64 | 163.40 | 249.66 | 7.775 | 0.0321
CHAC1 | 1.89 | 1.29 | 10.04 | 11.18 | 5.44 | 3.42 | 6.989 | 0.0139
ASNS | 31.53 | 32.92 | 179.31 | 201.11 | 100.84 | 122.33 | 5.898 | 0.0431
WARS | 37.85 | 28.74 | 137.22 | 155.32 | 50.62 | 53.59 | 4.515 | 0.0211
ETV4 | 6.37 | 4.56 | 21.62 | 23.66 | 15.83 | 15.65 | 4.291 | 0.0066
PCK2 | 20.17 | 23.70 | 82.27 | 90.50 | 42.69 | 46.11 | 3.949 | 0.0190
ARG2 | 2.93 | 1.77 | 8.22 | 7.81 | 4.53 | 4.11 | 3.609 | 0.0425
MAP1B | 1.23 | 0.96 | 3.53 | 4.06 | 2.75 | 2.24 | 3.550 | 0.0280
CEBPG | 12.90 | 10.61 | 40.79 | 40.87 | 26.14 | 21.72 | 3.507 | 0.0249
PSAT1 | 60.90 | 53.67 | 182.93 | 199.37 | 107.97 | 110.20 | 3.359 | 0.0174
ABCG1 | 3.18 | 4.36 | 11.35 | 13.31 | 5.73 | 4.95 | 3.311 | 0.0293
LARP6 | 6.80 | 7.04 | 20.37 | 22.48 | 12.27 | 11.30 | 3.094 | 0.0437
PHLDA1 | 5.64 | 4.96 | 15.18 | 16.42 | 12.73 | 13.21 | 3.001 | 0.0117
HSPB6 | 0.67 | 0.70 | 1.86 | 2.04 | 0.97 | 0.64 | 2.845 | 0.0401
FAM86B2 | 1.86 | 2.21 | 5.71 | 5.78 | 3.47 | 3.58 | 2.843 | 0.0244
AARS | 72.29 | 77.97 | 206.19 | 219.44 | 123.92 | 150.85 | 2.833 | 0.0130
NUPR1 | 58.16 | 50.86 | 147.17 | 156.22 | 104.32 | 89.40 | 2.801 | 0.0043
MORN4 | 1.84 | 2.20 | 5.54 | 5.52 | 3.78 | 2.84 | 2.760 | 0.0321
GPT2 | 8.41 | 7.13 | 19.45 | 22.45 | 12.10 | 17.81 | 2.731 | 0.0415
PHGDH | 102.68 | 99.49 | 286.35 | 265.58 | 160.29 | 149.92 | 2.729 | 0.0338
SARS | 81.23 | 72.14 | 210.52 | 201.93 | 127.28 | 131.46 | 2.695 | 0.0024
ABCC3 | 14.28 | 19.73 | 41.17 | 49.34 | 23.03 | 11.37 | 2.692 | 0.0392
CTH | 10.06 | 8.74 | 22.83 | 26.42 | 14.90 | 17.60 | 2.646 | 0.0492
RNF187 | 45.46 | 38.13 | 109.95 | 108.82 | 56.11 | 53.20 | 2.636 | 0.0307
AC068020.2 | 0.69 | 1.05 | 1.99 | 2.35 | 1.47 | 0.93 | 2.561 | 0.0363
FAM86B1 | 6.77 | 7.45 | 18.16 | 17.16 | 10.79 | 5.40 | 2.493 | 0.0056
GADD45A | 26.88 | 21.10 | 59.67 | 58.26 | 40.92 | 58.14 | 2.491 | 0.0414
TM4SF19 | 1.81 | 1.35 | 4.04 | 3.61 | 2.55 | 3.27 | 2.453 | 0.0193
RP11-121N13.3 | 0.64 | 0.92 | 1.70 | 2.04 | 1.67 | 1.12 | 2.437 | 0.0414
AC138904.2 | 11.31 | 9.67 | 27.21 | 23.76 | 19.09 | 19.84 | 2.431 | 0.0379


Gene Expression (cRPKM), two replicates per treatment
Gene ID | DMSO | DMSO | 791 | 791 | 191 | 191 | Mean fold change | P Value
RBCK1 | 19.01 | 21.56 | 47.83 | 49.79 | 29.32 | 32.32 | 2.413 | 0.0042
SNX10 | 4.38 | 3.80 | 10.48 | 9.14 | 6.53 | 8.06 | 2.399 | 0.0426
TCEA1 | 60.01 | 57.30 | 138.64 | 140.08 | 69.97 | 62.91 | 2.377 | 0.0018
FAM27E2 | 2.61 | 3.77 | 6.80 | 7.97 | 4.99 | 1.61 | 2.360 | 0.0365
CARS | 31.92 | 29.23 | 70.48 | 71.80 | 51.18 | 53.88 | 2.332 | 0.0061
SLC22A15 | 1.07 | 0.72 | 2.19 | 1.87 | 1.46 | 1.18 | 2.322 | 0.0416
LONP1 | 68.63 | 62.86 | 150.24 | 153.02 | 86.16 | 95.69 | 2.312 | 0.0064
GPCPD1 | 3.18 | 3.59 | 7.52 | 7.93 | 7.27 | 12.56 | 2.287 | 0.0044
PGPEP1 | 8.46 | 8.58 | 18.63 | 19.54 | 12.78 | 15.50 | 2.240 | 0.0250
EIF4EBP1 | 79.98 | 78.73 | 172.40 | 182.87 | 115.37 | 127.50 | 2.239 | 0.0316
PSPH | 16.17 | 14.54 | 32.31 | 35.82 | 17.79 | 17.38 | 2.231 | 0.0292
ARHGEF2 | 15.36 | 15.55 | 33.31 | 35.38 | 19.79 | 19.14 | 2.222 | 0.0334
MLPH | 1.01 | 0.77 | 2.07 | 1.82 | 1.24 | 1.86 | 2.207 | 0.0260
AC004490.2 | 1.16 | 0.98 | 2.39 | 2.30 | 1.10 | 0.56 | 2.204 | 0.0177
PTP4A3 | 2.31 | 2.35 | 4.93 | 5.23 | 5.61 | 6.09 | 2.180 | 0.0318
HSPA9 | 245.42 | 185.92 | 477.68 | 446.89 | 277.97 | 269.17 | 2.175 | 0.0372
TOR3A | 15.69 | 12.50 | 30.79 | 29.70 | 18.70 | 19.07 | 2.169 | 0.0417
TIMP4 | 3.73 | 3.42 | 7.83 | 7.17 | 4.12 | 2.70 | 2.098 | 0.0247
IFRD1 | 60.02 | 67.82 | 134.90 | 131.75 | 92.24 | 139.05 | 2.095 | 0.0173
SPRED2 | 5.67 | 6.22 | 12.14 | 12.56 | 8.95 | 6.94 | 2.080 | 0.0039
C19orf57 | 1.48 | 1.08 | 2.43 | 2.68 | 1.12 | 0.55 | 2.062 | 0.0470
C2orf18 | 18.41 | 16.50 | 36.87 | 34.84 | 17.79 | 14.88 | 2.057 | 0.0058
XPOT | 47.93 | 44.72 | 92.78 | 97.05 | 63.15 | 64.02 | 2.053 | 0.0042
C6orf48 | 90.19 | 100.40 | 188.58 | 200.03 | 133.24 | 137.29 | 2.042 | 0.0062
PCLO | 0.95 | 0.96 | 2.01 | 1.88 | 1.25 | 1.26 | 2.037 | 0.0406
WI2-3658N16.1 | 14.64 | 17.22 | 31.30 | 33.07 | 13.17 | 7.50 | 2.029 | 0.0136
TGFA | 0.96 | 0.80 | 1.78 | 1.74 | 0.62 | 0.94 | 2.015 | 0.0456
SLC1A5 | 93.82 | 76.59 | 176.05 | 164.57 | 141.31 | 133.62 | 2.013 | 0.0214
B3GNT5 | 3.06 | 3.61 | 6.23 | 7.15 | 4.68 | 7.09 | 2.008 | 0.0393
PLK3 | 2.84 | 2.51 | 5.27 | 5.38 | 6.23 | 8.80 | 2.000 | 0.0243
RHBDD1 | 3.61 | 4.07 | 7.41 | 7.85 | 5.40 | 5.25 | 1.991 | 0.0070
ATP6AP1L | 2.33 | 2.98 | 4.92 | 5.50 | 2.63 | 1.97 | 1.979 | 0.0287
GCC1 | 7.60 | 7.09 | 15.18 | 13.79 | 8.92 | 8.76 | 1.971 | 0.0388
KCTD15 | 8.37 | 7.08 | 14.78 | 15.32 | 8.02 | 6.15 | 1.965 | 0.0301
WDR86 | 2.19 | 2.50 | 4.67 | 4.45 | 4.03 | 2.42 | 1.956 | 0.0105
NFIL3 | 12.05 | 10.94 | 22.59 | 22.29 | 14.71 | 24.54 | 1.956 | 0.0225
LGALS9 | 1.49 | 1.22 | 2.84 | 2.43 | 2.47 | 3.26 | 1.949 | 0.0469
SEC31B | 2.97 | 3.05 | 5.67 | 6.06 | 4.43 | 5.82 | 1.948 | 0.0361
CCNB1IP1 | 31.63 | 33.62 | 61.39 | 65.20 | 45.30 | 47.77 | 1.940 | 0.0137
PIGZ | 0.89 | 0.80 | 1.58 | 1.67 | 1.25 | 1.00 | 1.931 | 0.0066
ALDH2 | 9.10 | 11.50 | 18.23 | 21.31 | 11.87 | 16.53 | 1.928 | 0.0448
EDA2R | 1.51 | 1.80 | 3.01 | 3.34 | 2.41 | 5.06 | 1.924 | 0.0211
HAUS6 | 12.67 | 12.10 | 24.10 | 23.21 | 16.69 | 16.11 | 1.910 | 0.0046
CLEC2D | 1.16 | 1.16 | 2.18 | 2.24 | 1.37 | 0.95 | 1.905 | 0.0182
GRPEL2 | 7.43 | 6.00 | 13.04 | 12.31 | 12.50 | 16.97 | 1.903 | 0.0374
CCDC40 | 1.82 | 1.49 | 3.00 | 3.18 | 2.54 | 1.91 | 1.891 | 0.0328
GOLGA9P | 1.29 | 1.56 | 2.53 | 2.82 | 1.95 | 2.19 | 1.884 | 0.0245
ZNF25 | 0.52 | 0.62 | 0.99 | 1.15 | 0.94 | 1.69 | 1.879 | 0.0485
KCNJ2 | 2.33 | 2.27 | 4.13 | 4.48 | 2.33 | 1.98 | 1.873 | 0.0494


Gene Expression (cRPKM), two replicates per treatment
Gene ID | DMSO | DMSO | 791 | 791 | 191 | 191 | Mean fold change | P Value
DUSP8 | 1.43 | 1.55 | 0.79 | 0.93 | 0.70 | 1.42 | 0.576 | 0.0220
C9orf3 | 7.53 | 8.45 | 5.07 | 4.02 | 4.54 | 2.82 | 0.575 | 0.0400
CSRP2 | 55.89 | 54.82 | 30.49 | 33.02 | 46.40 | 59.17 | 0.574 | 0.0153
TAGLN2 | 352.40 | 354.27 | 214.11 | 190.98 | 353.61 | 472.42 | 0.573 | 0.0474
PDGFD | 2.35 | 2.61 | 1.53 | 1.28 | 1.86 | 1.18 | 0.571 | 0.0271
NEXN | 10.31 | 10.61 | 5.72 | 6.18 | 7.02 | 5.37 | 0.569 | 0.0068
MKL1 | 13.58 | 14.49 | 8.02 | 7.76 | 9.79 | 8.76 | 0.563 | 0.0336
PDGFRL | 9.49 | 9.52 | 5.51 | 5.15 | 6.52 | 5.65 | 0.561 | 0.0264
SETBP1 | 1.58 | 1.81 | 0.85 | 1.05 | 0.92 | 0.66 | 0.559 | 0.0409
P2RY6 | 17.64 | 17.77 | 9.36 | 10.32 | 12.33 | 12.64 | 0.556 | 0.0356
PTRF | 90.53 | 85.28 | 52.52 | 45.24 | 64.09 | 54.65 | 0.555 | 0.0172
MPPED2 | 12.35 | 13.74 | 6.88 | 7.54 | 8.43 | 4.65 | 0.553 | 0.0399
F2R | 50.96 | 54.21 | 28.45 | 29.47 | 40.88 | 23.43 | 0.551 | 0.0288
SAMD14 | 1.40 | 1.40 | 0.78 | 0.76 | 1.25 | 0.50 | 0.550 | 0.0101
PROC | 1.67 | 1.82 | 0.99 | 0.90 | 1.67 | 1.01 | 0.544 | 0.0213
TRIM5 | 5.68 | 6.07 | 3.09 | 3.26 | 4.28 | 5.45 | 0.541 | 0.0220
CYTH3 | 25.06 | 26.27 | 13.78 | 13.70 | 26.12 | 21.88 | 0.536 | 0.0316
MAP3K14 | 15.76 | 15.99 | 8.59 | 8.30 | 12.18 | 9.88 | 0.532 | 0.0008
SMTN | 11.54 | 10.02 | 6.34 | 5.08 | 10.91 | 13.20 | 0.528 | 0.0385
ATP8B1 | 1.40 | 1.31 | 0.78 | 0.65 | 1.84 | 1.37 | 0.527 | 0.0208
USP43 | 1.86 | 2.04 | 0.89 | 1.17 | 1.60 | 1.11 | 0.526 | 0.0438
APOBEC3B | 15.07 | 13.73 | 8.21 | 6.81 | 12.20 | 9.99 | 0.520 | 0.0193
CALD1 | 49.49 | 52.98 | 28.16 | 24.92 | 45.68 | 35.17 | 0.520 | 0.0093
CGN | 2.51 | 2.48 | 1.21 | 1.38 | 1.86 | 1.39 | 0.519 | 0.0393
CAP2 | 13.18 | 12.19 | 6.21 | 6.88 | 8.95 | 6.56 | 0.518 | 0.0142
PRRX2 | 16.83 | 19.40 | 9.88 | 8.48 | 14.03 | 9.25 | 0.512 | 0.0460
DNMT3L | 1.83 | 2.10 | 1.04 | 0.90 | 1.20 | 1.07 | 0.498 | 0.0441
IDI1 | 29.45 | 31.17 | 13.51 | 16.55 | 16.39 | 40.81 | 0.495 | 0.0251
PPP1R13L | 30.53 | 33.16 | 15.94 | 15.16 | 35.80 | 29.49 | 0.490 | 0.0362
MYL9 | 134.60 | 116.42 | 67.03 | 54.32 | 153.15 | 109.37 | 0.482 | 0.0360
S100A10 | 202.14 | 192.72 | 105.38 | 81.16 | 165.12 | 161.70 | 0.471 | 0.0463
CDKN2AIP | 13.14 | 14.08 | 6.15 | 6.57 | 10.26 | 8.71 | 0.467 | 0.0184
GDPD5 | 7.06 | 6.67 | 3.54 | 2.81 | 6.31 | 3.18 | 0.461 | 0.0267
FZD2 | 25.81 | 26.93 | 12.29 | 11.76 | 21.98 | 17.31 | 0.456 | 0.0082
APBB1 | 4.04 | 4.41 | 1.76 | 2.10 | 2.46 | 1.69 | 0.456 | 0.0120
ZNF488 | 1.79 | 1.63 | 0.89 | 0.67 | 1.97 | 1.62 | 0.454 | 0.0261
NUAK2 | 5.15 | 6.01 | 2.09 | 2.82 | 4.61 | 4.40 | 0.438 | 0.0329
CITED2 | 50.54 | 48.51 | 19.77 | 23.16 | 17.34 | 19.88 | 0.434 | 0.0105
GRAMD3 | 3.89 | 3.37 | 1.73 | 1.42 | 3.97 | 2.87 | 0.433 | 0.0347
CACNG4 | 3.73 | 3.22 | 1.85 | 1.18 | 3.46 | 1.41 | 0.431 | 0.0491
CAV1 | 54.45 | 48.76 | 24.86 | 18.51 | 45.74 | 27.64 | 0.418 | 0.0203
TAGLN | 5.94 | 6.54 | 2.32 | 2.86 | 6.26 | 7.92 | 0.414 | 0.0124
C19orf21 | 16.31 | 17.87 | 5.87 | 7.62 | 15.45 | 16.18 | 0.393 | 0.0131
COL9A3 | 7.39 | 6.11 | 3.20 | 1.95 | 4.79 | 1.67 | 0.376 | 0.0430
CTGF | 302.98 | 363.04 | 94.56 | 156.20 | 95.80 | 44.77 | 0.371 | 0.0404
CYR61 | 633.51 | 653.94 | 215.14 | 260.35 | 248.87 | 146.23 | 0.369 | 0.0146
CRISPLD2 | 2.02 | 1.85 | 0.72 | 0.70 | 3.42 | 4.46 | 0.367 | 0.0415
TMEM139 | 9.82 | 10.34 | 3.15 | 4.17 | 5.18 | 2.87 | 0.362 | 0.0205
CSDC2 | 13.86 | 15.78 | 6.05 | 4.53 | 13.43 | 6.57 | 0.362 | 0.0186
GPR146 | 2.16 | 2.36 | 0.62 | 0.86 | 0.89 | 1.18 | 0.326 | 0.0115
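
The tabulated mean fold changes for 791 are consistent with averaging the per-replicate ratios (791 cRPKM divided by DMSO cRPKM, replicate by replicate) rather than dividing the treatment mean by the control mean. That pairing is an inference from the numbers, not a method stated in the caption, and the function names are hypothetical. A minimal Python sketch under that assumption, including the colour thresholds described in the caption:

```python
from statistics import mean

def mean_fold_change(control, treated):
    """Mean of per-replicate treated/control ratios (assumed pairing)."""
    return mean(t / c for c, t in zip(control, treated))

def colour_class(fold_change):
    """Colour category following the thresholds given in the table caption."""
    if fold_change > 5 or fold_change < 0.2:
        return "red"
    if fold_change > 2 or fold_change < 0.5:
        return "orange"
    return "none"

# TRIB3 row: DMSO replicates vs 791 replicates, values copied from the table.
trib3 = mean_fold_change([23.20, 14.75], [176.50, 161.73])
print(round(trib3, 3), colour_class(trib3))  # 9.286 red

# CTGF row: a down-regulated gene under 791 treatment.
ctgf = mean_fold_change([302.98, 363.04], [94.56, 156.20])
print(round(ctgf, 3), colour_class(ctgf))  # 0.371 orange
```

Dividing the mean 791 value by the mean DMSO value would give roughly 8.91 for TRIB3 instead of 9.286, which is why the per-replicate form is assumed here.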


Table II-4 Effect of 191 treatment on gene expression by RNAseq. HeLa B2 cells were treated as described previously. Expression levels of genes under DMSO, 791 or 191 treatment are quantified as corrected RPKM (cRPKM) values (reads per kilobase of exon model per million mapped reads), with P values calculated by Student's t-test (N = 2). The expression cutoff is 0.5 RPKM (≥10 reads that mapped uniquely to a single genomic locus). Mean fold changes > 2 or < 0.5 are coloured orange, and those > 5 or < 0.2 are coloured red. A subset of the data is shown. Bolded entries are common to both the 791 and 191 treatment samples.

Summary
Raw total count of genes: 19,847
Genes with RPKM ≥ 0.5: 11,406
Gene expression with P ≤ 0.05: 540
Genes with fold change ≥ 2: 21
Genes with fold change ≤ 0.5: 32

Gene Expression (cRPKM), two replicates per treatment
Gene ID | DMSO | DMSO | 791 | 791 | 191 | 191 | Mean fold change | P Value
AL138831.1 | 0.51 | 0.53 | 1.10 | 0.80 | 2.17 | 2.12 | 4.127 | 0.0032
TMEM178 | 0.70 | 0.54 | 2.03 | 2.97 | 2.11 | 2.51 | 3.831 | 0.0462
ADM2 | 1.95 | 1.70 | 10.48 | 12.72 | 4.82 | 4.91 | 2.680 | 0.0134
FBXO36 | 1.04 | 0.86 | 1.75 | 1.35 | 2.74 | 2.33 | 2.672 | 0.0480
MICAL2 | 1.31 | 1.27 | 1.41 | 1.56 | 3.20 | 3.34 | 2.536 | 0.0142
PTP4A3 | 2.31 | 2.35 | 4.93 | 5.23 | 5.61 | 6.09 | 2.510 | 0.0420
ZNF177 | 0.95 | 0.90 | 1.62 | 1.27 | 2.19 | 2.38 | 2.475 | 0.0330
PHLDA1 | 5.64 | 4.96 | 15.18 | 16.42 | 12.73 | 13.21 | 2.460 | 0.0047
PER2 | 1.50 | 1.66 | 1.50 | 1.28 | 3.75 | 3.93 | 2.434 | 0.0030
ZNF805 | 1.10 | 0.85 | 1.53 | 1.12 | 2.01 | 2.30 | 2.267 | 0.0267
SLC2A4 | 1.35 | 1.19 | 0.90 | 0.58 | 2.86 | 2.56 | 2.135 | 0.0289
RGPD8 | 3.72 | 3.68 | 6.32 | 5.41 | 8.13 | 7.60 | 2.125 | 0.0394
AC009133.2 | 0.87 | 0.73 | 1.31 | 1.22 | 1.76 | 1.61 | 2.114 | 0.0134
NT5DC3 | 2.19 | 2.22 | 2.73 | 2.69 | 4.77 | 4.46 | 2.094 | 0.0392
FAM176B | 1.08 | 1.05 | 1.37 | 1.42 | 2.12 | 2.28 | 2.067 | 0.0384
C3orf71 | 1.25 | 1.59 | 2.37 | 2.98 | 2.70 | 3.13 | 2.064 | 0.0358
CCDC88B | 3.41 | 2.68 | 4.64 | 3.39 | 6.58 | 5.87 | 2.060 | 0.0247
CTC-241N9.1 | 1.22 | 1.37 | 2.44 | 2.33 | 2.78 | 2.51 | 2.055 | 0.0259
TMEM231 | 2.66 | 3.08 | 4.65 | 4.61 | 5.84 | 5.75 | 2.031 | 0.0375
PCK2 | 20.17 | 23.70 | 82.27 | 90.50 | 42.69 | 46.11 | 2.031 | 0.0118
ZNF441 | 2.03 | 1.61 | 2.72 | 2.62 | 3.54 | 3.73 | 2.030 | 0.0402
SESN2 | 4.81 | 4.08 | 15.86 | 19.31 | 8.01 | 9.21 | 1.961 | 0.0419
SPOCD1 | 0.65 | 0.82 | 1.57 | 2.24 | 1.46 | 1.35 | 1.946 | 0.0323
ZBTB6 | 5.87 | 5.04 | 7.28 | 6.06 | 10.11 | 10.91 | 1.943 | 0.0128
ANKRD9 | 17.31 | 14.33 | 16.37 | 19.01 | 29.52 | 31.22 | 1.942 | 0.0260
ZNF555 | 0.63 | 0.53 | 0.82 | 0.66 | 1.05 | 1.17 | 1.937 | 0.0228
PSAT1 | 60.90 | 53.67 | 182.93 | 199.37 | 107.97 | 110.20 | 1.913 | 0.0297
GARS | 110.53 | 85.61 | 239.79 | 240.42 | 176.43 | 190.82 | 1.913 | 0.0445
AC138904.2 | 11.31 | 9.67 | 27.21 | 23.76 | 19.09 | 19.84 | 1.870 | 0.0287
AC073995.2 | 13.55 | 13.41 | 14.79 | 15.37 | 25.12 | 24.53 | 1.842 | 0.0116
EIF4A2 | 95.46 | 125.19 | 120.27 | 125.03 | 188.24 | 214.22 | 1.842 | 0.0456


Gene Expression (cRPKM), two replicates per treatment
Gene ID | DMSO | DMSO | 791 | 791 | 191 | 191 | Mean fold change | P Value
ATP6V0E1 | 81.07 | 85.53 | 82.40 | 92.03 | 49.59 | 46.31 | 0.577 | 0.0083
NT5DC2 | 146.26 | 140.50 | 92.31 | 97.97 | 83.08 | 82.19 | 0.577 | 0.0267
MFSD11 | 7.63 | 8.51 | 7.95 | 9.86 | 4.88 | 4.35 | 0.575 | 0.0347
MLF1IP | 11.23 | 10.50 | 9.36 | 11.71 | 6.52 | 5.85 | 0.569 | 0.0113
YJEFN3 | 1.59 | 1.61 | 1.58 | 2.10 | 0.91 | 0.91 | 0.569 | 0.0092
FBXO9 | 18.86 | 20.02 | 20.62 | 24.12 | 11.28 | 10.66 | 0.565 | 0.0153
TMEM140 | 7.75 | 8.58 | 10.66 | 12.80 | 4.82 | 4.30 | 0.562 | 0.0283
AC006486.1 | 1.16 | 1.08 | 0.73 | 0.81 | 0.65 | 0.60 | 0.558 | 0.0158
SLC35E3 | 4.26 | 4.66 | 3.63 | 4.43 | 2.61 | 2.31 | 0.554 | 0.0189
PARP12 | 1.88 | 2.15 | 1.51 | 1.87 | 1.18 | 1.01 | 0.549 | 0.0419
UPRT | 3.67 | 3.86 | 3.54 | 4.43 | 2.18 | 1.93 | 0.547 | 0.0105
CHML | 14.96 | 16.18 | 9.81 | 13.93 | 8.62 | 8.05 | 0.537 | 0.0250
C1orf145 | 1.34 | 1.17 | 1.64 | 1.71 | 0.74 | 0.61 | 0.537 | 0.0374
SGK1 | 101.75 | 118.62 | 73.67 | 77.05 | 52.87 | 65.64 | 0.536 | 0.0465
TCF7 | 9.69 | 11.25 | 6.60 | 8.94 | 6.29 | 4.69 | 0.533 | 0.0469
SIRPA | 1.59 | 1.48 | 1.53 | 1.74 | 0.89 | 0.72 | 0.523 | 0.0280
SAMHD1 | 20.08 | 22.11 | 16.54 | 18.04 | 12.19 | 9.26 | 0.513 | 0.0367
WWP2 | 9.06 | 10.33 | 7.98 | 8.18 | 5.45 | 4.11 | 0.500 | 0.0337
RBM12 | 20.08 | 21.73 | 12.29 | 15.95 | 9.31 | 11.28 | 0.491 | 0.0157
GPR75 | 1.01 | 1.10 | 0.76 | 1.08 | 0.50 | 0.53 | 0.488 | 0.0346
C9orf156 | 5.88 | 5.64 | 5.73 | 6.45 | 2.96 | 2.62 | 0.484 | 0.0074
AL353791.1 | 6.86 | 7.63 | 3.87 | 6.53 | 3.35 | 3.65 | 0.483 | 0.0395
LMAN2L | 9.97 | 11.04 | 8.93 | 10.67 | 5.40 | 4.57 | 0.478 | 0.0175
SETBP1 | 1.58 | 1.81 | 0.85 | 1.05 | 0.92 | 0.66 | 0.473 | 0.0360
RDH5 | 1.65 | 1.72 | 1.49 | 1.92 | 0.80 | 0.75 | 0.460 | 0.0035
GPR146 | 2.16 | 2.36 | 0.62 | 0.86 | 0.89 | 1.18 | 0.456 | 0.0272
MED17 | 12.18 | 11.52 | 10.94 | 11.12 | 5.63 | 5.18 | 0.456 | 0.0064
MEIS2 | 11.05 | 12.90 | 8.39 | 9.01 | 6.02 | 4.69 | 0.454 | 0.0353
DUSP1 | 382.44 | 368.48 | 278.63 | 269.45 | 175.44 | 164.85 | 0.453 | 0.0026
BRMS1L | 5.04 | 4.71 | 3.49 | 4.02 | 2.52 | 1.90 | 0.452 | 0.0342
CCDC103 | 2.45 | 2.48 | 2.83 | 2.07 | 1.21 | 1.00 | 0.449 | 0.0450
C12orf26 | 1.57 | 1.82 | 1.01 | 1.40 | 0.84 | 0.65 | 0.446 | 0.0309
C7orf31 | 1.46 | 1.61 | 1.49 | 1.80 | 0.77 | 0.57 | 0.441 | 0.0246
SECTM1 | 8.84 | 9.94 | 7.56 | 7.61 | 4.28 | 3.93 | 0.440 | 0.0467
ZBED1 | 17.73 | 18.91 | 11.40 | 13.10 | 9.01 | 6.84 | 0.435 | 0.0284
NMI | 1.84 | 2.05 | 1.83 | 2.41 | 0.91 | 0.76 | 0.433 | 0.0178
PHTF1 | 7.32 | 8.57 | 5.75 | 8.21 | 3.79 | 2.93 | 0.430 | 0.0347
AMIGO3 | 3.30 | 2.81 | 2.59 | 2.87 | 1.37 | 1.13 | 0.409 | 0.0466
AMH | 5.88 | 6.43 | 5.71 | 7.83 | 2.90 | 2.05 | 0.406 | 0.0276
TMEM133 | 2.29 | 2.15 | 1.55 | 1.46 | 1.07 | 0.74 | 0.406 | 0.0475
C21orf67 | 1.50 | 1.63 | 2.28 | 2.65 | 0.75 | 0.50 | 0.403 | 0.0427
CITED2 | 50.54 | 48.51 | 19.77 | 23.16 | 17.34 | 19.88 | 0.376 | 0.0034
HYAL3 | 5.15 | 5.61 | 4.82 | 5.88 | 2.15 | 1.88 | 0.376 | 0.0132
AMOTL2 | 31.89 | 39.43 | 12.75 | 18.50 | 15.87 | 9.83 | 0.373 | 0.0460
SLC22A3 | 4.57 | 5.50 | 3.22 | 3.05 | 2.28 | 1.29 | 0.367 | 0.0413
C12orf4 | 4.38 | 4.09 | 3.94 | 3.59 | 1.83 | 1.17 | 0.352 | 0.0437
C5orf36 | 2.20 | 2.71 | 1.64 | 0.97 | 0.81 | 0.52 | 0.280 | 0.0436
CTGF | 302.98 | 363.04 | 94.56 | 156.20 | 95.80 | 44.77 | 0.220 | 0.0232
NCOA5* | 18.36 | 16.98 | 15.16 | 13.76 | 5.17 | 1.77 | 0.193 | 0.0464
