Regulation of Core Splicing Factors by Alternative Splicing and Nonsense-mediated mRNA Decay

by

Arneet L. Saltzman

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy Department of Molecular Genetics University of Toronto

© Copyright by Arneet L. Saltzman 2011

Abstract

The majority of human genes are transcribed into a precursor messenger RNA (pre-mRNA) that is processed to produce multiple mRNA variants through alternative splicing. Although alternative splicing is known for its role in generating proteomic diversity, it can also regulate expression by introducing premature termination codons that target the spliced transcript for nonsense-mediated mRNA decay (AS-NMD). In order to understand the impact of AS-NMD on gene expression, I performed quantitative AS microarray profiling of NMD-inhibited human cells. Using this system, I address the prevalence, trans-acting factor requirements and the range of cellular functions regulated by AS-NMD. While this pathway had been implicated in homeostatic feedback regulation of genes encoding splicing-regulatory proteins, my results revealed highly conserved alternative exons regulated by AS-NMD in genes encoding basal or ‘core’ splicing factors. I further characterized one of these exons in the gene encoding SmB/B′, and demonstrated that SmB/B′ autoregulates its expression through AS-NMD. Furthermore, AS profiling revealed that knockdown of this core splicing factor affects the inclusion levels of additional alternative exons enriched in genes with functions in RNA processing and RNA binding. In summary, my results reveal a role for AS-NMD in regulating the expression of core splicing factors, as well as a role for the core spliceosomal machinery in coordinating a network of alternative exons in RNA processing factor genes.

Acknowledgments

I am grateful to many people for their support during my graduate work. My supervisor and mentor Dr. Ben Blencowe has never wavered from the full support that he offered me right from the time I joined the lab as a naïve and “generally keen” student. He fostered my scientific development by providing me with the opportunity to do exciting research, present at conferences and publish my work. Ben’s guidance and encouragement have been essential for my progress as a graduate student and beyond. I would also like to thank my supervisory committee members, Dr. Howard Lipshitz, Dr. Tim Hughes and Dr. Quaid Morris, whose support and advice have helped me to develop as a scientist.

For their direct contributions to the work in this thesis, I would like to thank Matthew Fagnani and my collaborators Dr. Yoon Ki Kim, Dr. Lynne Maquat, Dr. Ofer (“the data are the data”) Shai and Dr. Brendan Frey. For their indirect contributions, I would like to thank the people behind the UCSC genome browser and Galaxy, who have made the genome accessible to the masses. For financial support, I am grateful to NSERC and the Jennifer Dorrington Graduate Student Endowment Fund.

I’ve had the privilege to work with many great people during my time in the Blencowe lab. I wish to sincerely thank all my current and former labmates, whose friendship, support, helpful advice, and love of fine beverages have been invaluable. I am particularly indebted to our Lab “Sages” Mr. Dave O’Hanlon and Dr. Susan McCracken. For supporting me along my path to graduate school, I also thank my past teachers and mentors, Mr. Flemming Kress, Dr. Shelagh Mirski, Ms. Kathy Sparks, Dr. Peter Davies and Dr. Igor Bendik.

Of course words cannot express my gratitude to my family and my ‘life partner’ for their unflappable love and support.

Table of Contents

Acknowledgments
Table of Contents
List of Tables
List of Figures
List of Appendices
Abbreviations Used
Chapter 1
1 Introduction
1.1 Coordination of the gene expression machinery
1.1.1 Interdependence among transcription, mRNA processing and chromatin
1.1.2 mRNA processing remodels the messenger RNP
1.2 Pre-mRNA splicing
1.2.1 Core and auxiliary splicing signals
1.2.2 Spliceosome assembly
1.2.3 Exon definition
1.2.4 Spliceosomal snRNPs and Sm proteins
1.3 Regulation of alternative splicing
1.3.1 Roles of alternative splicing
1.3.2 Mechanisms of alternative splicing regulation
1.3.3 Families of alternative splicing regulatory factors
1.3.4 Regulation of splice site recognition
1.3.4.1 SR and SR-related proteins
1.3.4.2 hnRNPs
1.3.5 Regulation of splice site pairing and catalysis
1.3.6 Roles of basal splicing factors in alternative splicing regulation
1.3.7 Breaking the ‘code’ of cis-acting alternative splicing regulatory sequences
1.3.8 Large-scale analysis of alternative splicing regulation
1.3.9 Overview of large-scale alternative splicing detection methods used in this thesis
1.3.9.1 Alternative splicing microarray profiling
1.3.9.2 AS profiling by high throughput RNA sequencing (RNA-Seq)
1.4 Nonsense-mediated mRNA decay (NMD)
1.4.1 Features targeting transcripts for NMD
1.4.2 NMD trans-acting factors and mechanisms of decay
1.4.3 Discriminating between normal and premature nonsense codons: integrating the EJC-dependent and faux 3′UTR models
1.5 Feedback regulation of gene expression
1.5.1 Post-transcriptional autoregulation
1.5.1.1 Splicing regulatory factors
1.5.1.2 Ribosomal proteins, translation factors and other examples
1.5.2 Roles of post-transcriptional autoregulation
1.5.2.1 Developmentally-regulated AS programs
1.5.2.2 Plant circadian oscillations
1.5.2.3 Coordinating gene expression
1.5.3 Sequence and functional conservation
1.6 Rationale and outline
Chapter 2
2 Impact of nonsense-mediated mRNA decay (NMD) factors on alternative splicing (AS)
2.1 Introduction
2.1.1 Prevalence of AS-NMD
2.1.2 Differential requirements for UPF factors in NMD
2.1.3 Summary
2.2 Materials and Methods
2.2.1 Cell culture, siRNA and plasmid transfection
2.2.2 RT-PCR assays and Western blotting
2.2.3 Microarray design and hybridization
2.2.4 Microarray data analysis
2.2.5 Annotation of PTC-introducing AS events
2.2.6 Categorization of conserved and species-specific alternative exons
2.3 Results
2.3.1 Predicted PTC-containing splice variants represent minor isoforms across ten mouse tissues
2.3.2 Most predicted PTC-introducing AS events are not conserved between human and mouse
2.3.3 Alternative splicing microarray profiling following knockdown of the essential NMD factor UPF1 in HeLa cells
2.3.4 A subset of PTC-introducing AS events are regulated by NMD
2.3.5 Effect of UPF1 knockdown on the expression of genes containing PTC-introducing AS events
2.3.6 Alternative splicing microarray profiling following individual knockdowns of NMD factors UPF1, UPF2 or UPF3X
2.3.7 Overlapping but distinct effects of UPF1, UPF2 and UPF3X knockdowns on PTC-introducing AS events
2.4 Discussion
2.4.1 Function versus ‘noise’ in PTC-introducing AS events
2.4.2 Alternative branches of the mammalian NMD pathway
Chapter 3
3 Conserved AS-NMD in genes encoding core splicing factors
3.1 Introduction
3.1.1 Cellular functions regulated by AS-NMD
3.1.2 Summary
3.2 Materials and Methods
3.2.1 RT-PCR and Western blotting
3.2.2 Analysis of conservation of flanking intron sequence and conserved AS
3.2.3 Identification of AS events in spliceosomal and control gene sets
3.2.4 Statistical Analysis
3.3 Results
3.3.1 PTC-introducing AS events affected by UPF knockdowns are flanked by highly conserved sequences
3.3.2 Core spliceosomal proteins are new regulatory targets of AS-NMD
3.3.3 Conserved AS in genes encoding spliceosomal factors enriched in PTC-introducing events
3.3.4 Autoregulation of core splicing factors by AS-NMD
3.4 Discussion
3.4.1 AS-NMD and the regulation of core spliceosomal proteins
Chapter 4
4 Autoregulation of the core splicing factor SmB/B′ via AS-NMD
4.1 Introduction
4.1.1 AS-NMD of SNRPB, encoding SmB/B′
4.1.2 Summary
4.2 Materials and Methods
4.2.1 Cell culture, siRNA and plasmid transfection
4.2.2 Estimation of mRNA half-lives
4.2.3 RNA and protein isolation, RT-PCR and Western blotting
4.2.4 Plasmid Construction
4.3 Results
4.3.1 Inclusion of a highly conserved premature termination codon (PTC)-introducing alternative exon in SNRPB pre-mRNA is affected by SmB/B′ protein levels
4.3.2 Knockdown of the core snRNP protein SmD1 affects the inclusion of the conserved SNRPB alternative exon
4.3.3 Knockdown of SmB/B′ or SmD1 affects the levels of Sm-class snRNAs
4.3.4 Cis-acting elements regulating inclusion of the SNRPB alternative exon
4.3.5 Mutations that strengthen the 5′ss reduce the effects of SmB/B′ knockdown
4.4 Discussion
4.4.1 Feedback and cross-regulation of splicing factors
Chapter 5
5 Regulation of alternative splicing by the core spliceosomal machinery
5.1 Introduction
5.1.1 Summary
5.2 Materials and Methods
5.2.1 Analysis of AS and transcript levels by RNA-Seq
5.2.2 Calculation of Splice Site Strength
5.2.3 Gene Ontology (GO) analysis
5.2.4 Statistical Analysis
5.3 Results
5.3.1 A widespread role for core splicing factors in promoting the inclusion of alternative exons
5.3.2 Characteristics of SmB/B′ knockdown-dependent alternative exons
5.3.3 Changes in transcript levels associated with SmB/B′ knockdown-dependent PTC-introducing alternative exons
5.3.4 SmB/B′ knockdown affects AS events in RNA-processing factor genes
5.4 Discussion
5.4.1 Mechanisms of AS regulation by core splicing factors
5.4.2 Physiological roles of AS regulation by general splicing factors
Chapter 6
6 Conclusions
6.1 Future Directions
6.1.1 What features underlie the differential dependencies of NMD substrates on UPF2 and UPF3/UPF3X?
6.1.2 Mechanisms of core splicing factor-dependent AS regulation
6.1.3 Origins of ultra- and highly-conserved nonsense exons
6.1.4 Networks of auto- and cross-regulation among RNA processing factors
References
Appendices

List of Tables

Table 1-1. Post-transcriptional auto- and cross-regulation of proteins with roles in RNA biogenesis and metabolism.
Table 3-1. Selected microarray PTC-introducing AS events in genes with functions related to RNA processing.
Table 3-2. Conserved, PTC-introducing AS events identified in transcripts from spliceosome-associated proteins.

List of Figures

Figure 1-1. Coordination of transcription and pre-mRNA processing machineries.
Figure 1-2. Overview of core splicing signals and early stages of spliceosome assembly.
Figure 1-3. Outline of microarray and RNA-Seq AS profiling methods used in this work.
Figure 1-4. Alternative splicing of cassette-type exons can lead to introduction of a premature termination codon (PTC) in the included or skipped splice variant (AS-NMD).
Figure 1-5. An integrated model for discrimination between premature and normal stop codons.
Figure 1-6. Simplified model for autoregulation of a splicing-regulatory factor through AS-NMD.
Figure 2-1. Overview of Chapter 2.
Figure 2-2. Alternative splicing microarray data reveal that predicted PTC-introducing splice variants represent minor forms across ten mouse tissues.
Figure 2-3. Representative RT-PCRs of PTC upon inclusion and PTC upon skipping AS events in ten mouse tissues.
Figure 2-4. Predicted PTC-introducing AS events are more often species-specific than conserved between human and mouse.
Figure 2-5. Knockdown of the essential NMD factor UPF1 leads to an increase in a subset of PTC-containing splice variants.
Figure 2-6. Changes in % exon inclusion and transcript levels upon UPF1 knockdown predicted by the AS microarray are confirmed by RT-PCR.
Figure 2-7. Overlapping but distinct effects of UPF protein knockdowns on PTC-introducing AS events.
Figure 2-8. Representative RT-PCR assays showing effects of UPF protein knockdowns on levels of PTC-introducing alternative exons.
Figure 3-1. Overview of Chapter 3.
Figure 3-2. Conservation of intron sequences flanking PTC-introducing exons affected by UPF factor knockdowns.
Figure 3-3. PTC upon inclusion alternative exons that show UPF1- or UPF2-dependent changes in inclusion level are often flanked by highly conserved intronic sequences.
Figure 3-4. Conserved PTC-introducing AS events in genes encoding spliceosomal proteins.
Figure 3-5. SNRPB (also known as SmB/B′) or SMNDC1 (also known as SPF30) over-expression leads to increased levels of the respective PTC-containing (PTC+) alternative transcript.
Figure 4-1. Overview of Chapter 4.
Figure 4-2. The inclusion of a highly conserved PTC-introducing alternative exon in SNRPB is affected by SmB/B′ knockdown.
Figure 4-3. The half-life of the endogenous SNRPB PTC-containing included splice variant (A) but not that of the exon-included variant from the SNRPB reporter ‘miniSmB’ (B) is increased upon treatment with cycloheximide (CHX) to inhibit NMD.
Figure 4-4. Knockdown of SmD1 leads to more skipping of the SNRPB alternative exon in miniSmB (A), and knockdown of SmB/B′ (B) or SmD1 (C) affects snRNA levels.
Figure 4-5. Auxiliary cis-acting elements regulating inclusion of the SNRPB alternative exon in miniSmB are proximal to the splice sites.
Figure 4-6. Mutations that strengthen the 5′ss (splice site), but not mutations that strengthen the 3′ss, reduce the effects of SmB/B′ knockdown on miniSmB AS.
Figure 5-1. Overview of Chapter 5.
Figure 5-2. Quantitative analysis of alternative splicing by RNA-Seq reveals that knockdown of SmB/B′ leads to increased skipping of alternative exons.
Figure 5-3. Changes in alternative exon inclusion levels measured by RNA-Seq are confirmed by RT-PCR assays.
Figure 5-4. Confirmation of the effects of SmB/B′ knockdown on alternative exon inclusion in two independent knockdowns with different siRNAs.
Figure 5-5. Characteristics of alternative exons affected by knockdown of SmB/B′.

List of Appendices

Appendices to Chapter 2: Impact of nonsense-mediated mRNA decay (NMD) factors on alternative splicing (AS)
Appendix 1. Reprint: Pan Q, Saltzman AL, Kim YK, Misquitta C, Shai O, Maquat LE, Frey BJ, Blencowe BJ. 2006. Quantitative microarray profiling provides evidence against widespread coupling of alternative splicing with nonsense-mediated mRNA decay to control gene expression. Genes Dev 20 (2): 153-158.
Appendix 2. Reprint: Saltzman AL, Kim YK, Pan Q, Fagnani MM, Maquat LE, Blencowe BJ. 2008. Regulation of multiple core spliceosomal proteins by alternative splicing-coupled nonsense-mediated mRNA decay. Mol Cell Biol 28 (13): 4320-4330.
Appendix 3. Correlation of probe intensities (A) or % exon inclusion (B) between Cy3 and Cy5 fluor reversals for six samples.
Appendix 4. Correlation of % inclusion between pairs of AS events with duplicate probes on the AS microarray.
Appendix 5. Correlation between % exon skipping (A) or knockdown-dependent difference in % exon skipping (B) measurements by AS microarray or RT-PCR.
Appendix 6. Microarray data for 1704 AS events that met our detection criteria.
Appendix 7. Annotation for 1704 microarray-monitored AS events that met our detection criteria.
Appendix 8. Significant overlaps in AS events with a consistent change in exon inclusion levels when comparing any two UPF KDs.
Appendix 9. Effects of each UPF factor knockdown on PTC-introducing AS events.
Appendix 10. Frequency of changes in exon inclusion level upon knockdown of UPF1, UPF2, or UPF3X for all detectable AS events (A) or for specific categories (B-D).

Appendices to Chapter 3: Conserved AS-NMD in genes encoding core splicing factors
Appendix 11. Cumulative distribution function (CDF) plots of flanking intron sequence overlap with phastCons elements for the ‘No PTC’ group.
Appendix 12. Annotation for microarray-monitored PTC-introducing AS events with conserved flanking intron sequences.
Appendix 13. Annotation of cassette AS events identified in spliceosome-associated genes.
Appendix 14. Annotation of cassette AS events identified in the control gene set.

Appendices to Chapter 4: Autoregulation of the core splicing factor SmB/B′ via AS-NMD
Appendix 15. Reprint: Saltzman AL, Pan Q, Blencowe BJ. 2011. Regulation of alternative splicing by the core spliceosomal machinery. Genes Dev 25 (4): 373-384.
Appendix 16. Comparison of SmB/B′ and SmN amino acid sequences (A) and mRNA expression patterns across 84 tissue and cell types (B).
Appendix 17. Abrogation of NMD by treatment of HeLa cells with the translation inhibitor cycloheximide (CHX) leads to an increase in the steady-state level of the endogenous exon-included PTC-containing SNRPB variant (A), but not the exon-included variant from the SNRPB reporter ‘miniSmB’ (B).
Appendix 18. A deletion adjacent to the 5′ss that strengthens potential base-pairing to U1 snRNA abrogates SmB/B′ knockdown-dependent skipping.
Appendix 19. Mutations that strengthen the 3′ss do not abrogate SmB/B′ knockdown-dependent skipping.

Appendices to Chapter 5: Regulation of alternative splicing by the core spliceosomal machinery
Appendix 20. Data and annotations for 5752 AS events monitored by RNA-Seq that passed our filtering criteria.
Appendix 21. Data and annotations for 8626 triplets of consecutive ‘constitutive’ exons monitored by RNA-Seq that passed our filtering criteria.
Appendix 22. Gene Ontology (GO) and Pathway Commons enrichment analysis for 235 genes containing AS events with a ≥30% change in % exon inclusion upon SmB/B′ knockdown.
Appendix 23. Exon inclusion levels and knockdown-dependent changes for all 27 assayed alternative exons agree well with RNA-Seq predictions.

Abbreviations Used

AS alternative splicing
cDNA complementary DNA
CTD carboxy-terminal domain
EJC exon junction complex
EST expressed sequence tag
GO gene ontology
mRNP messenger ribonucleoprotein
NMD nonsense-mediated mRNA decay
NT/siNT non-targeting siRNA
pol II RNA polymerase II
PTC premature termination codon
RNA ribonucleic acid
RNA-Seq high throughput RNA sequencing
RT-PCR reverse transcription-polymerase chain reaction
snRNA small nuclear RNA
snRNP small nuclear ribonucleoprotein
siRNA short interfering RNA
UTR untranslated region

Chapter 1

1 Introduction

A major theme of my thesis research is how gene expression can be regulated by coordination between different steps, particularly alternative splicing (AS) and nonsense-mediated mRNA decay (NMD). I will therefore begin with an overview of the coordination among different steps in mammalian gene expression (Section 1.1). Next, I will discuss AS and its regulation, focusing on the relatively uncharacterized roles of the basal splicing machinery and on insights from high throughput analysis methods (Sections 1.2 and 1.3). This is followed by an introduction to the NMD pathway, focusing on the recognition of premature stop codons and on AS-NMD (Section 1.4). Finally, I will discuss how genes with diverse roles in RNA biogenesis and metabolism take advantage of their own cellular functions to autoregulate their expression (Section 1.5).

1.1 Coordination of the gene expression machinery

Almost all human protein-coding genes are transcribed into a precursor messenger RNA (pre-mRNA) that must be extensively processed before it is exported to the cytoplasm and recognized by the translation machinery. In the nucleus, pre-mRNA undergoes capping, splicing, cleavage and polyadenylation. These processes are integrated with transcription by RNA polymerase II (pol II) and also result in the association of proteins with the mRNA to form a messenger ribonucleoprotein (mRNP). Extensive crosstalk among these processes plays important roles in the fidelity, efficiency and regulation of gene expression (Figure 1-1) (reviewed in Maniatis and Reed 2002; Komili and Silver 2008; Pandit et al. 2008; Moore and Proudfoot 2009).

1.1.1 Interdependence among transcription, mRNA processing and chromatin

The carboxy-terminal domain (CTD) of the largest subunit of pol II plays a central role in the crosstalk between transcription and pre-mRNA processing (reviewed in Perales and Bentley 2009; Munoz et al. 2010). The mammalian CTD contains 52 heptamer repeats with the consensus sequence YS₂PTS₅PS, and it is required for efficient mRNA processing (McCracken et al. 1997b). During the transcription cycle, changes in the phosphorylation pattern of the CTD serine residues allow the recruitment of pre-mRNA processing, elongation, and histone-modifying factors (reviewed in Buratowski 2009). Early in transcription, the CTD is phosphorylated on Ser5 by TFIIH, and the Ser5-phosphorylated CTD recruits and activates the mRNA capping enzymes (Cho et al. 1997; McCracken et al. 1997a; Cho et al. 1998; Ho and Shuman 1999) (Figure 1-1A). The nuclear cap-binding complex (CBC, CBP80/20 heterodimer) recognizes the capped mRNA 5′ end and promotes efficient splicing, export and translation initiation (Section 1.1.2). As elongation proceeds, the CTD becomes highly phosphorylated on Ser2 residues by the positive transcription elongation factor b (P-TEFb) and pol II enters into the productive elongation phase (Figure 1-1B) (Marshall et al. 1996) (reviewed in Bres et al. 2008). The Ser2-phosphorylated CTD recruits the cleavage and polyadenylation machinery, which then stimulates 3′-end formation once the poly(A) site is transcribed (Figure 1-1D) (Licatalosi et al. 2002; Ahn et al. 2004; Meinhart and Cramer 2004; Ni et al. 2004; Rosonina and Blencowe 2004).

Figure 1-1. Coordination of transcription and pre-mRNA processing machineries. See text for details.

Crosstalk between transcription elongation and splicing is mediated by the CTD as well as by factors that associate with the nascent transcript and the chromatin template (Figure 1-1B) (reviewed in Perales and Bentley 2009). This ‘coupling’ is functionally important for the efficiency of splicing (Das et al. 2006; Hicks et al. 2006) and for regulating the differential use of splice sites through alternative splicing (AS) (Cramer et al. 1997; Auboeuf et al. 2002; Kadener et al. 2002; Nogues et al. 2002; de la Mata et al. 2003; Pagani et al. 2003; Ip et al. 2011). Transcription can influence AS through both ‘recruitment’ and ‘kinetic’ coupling (reviewed in Munoz et al. 2010). In recruitment coupling, specific splicing regulatory proteins as well as factors with dual roles in transcription and splicing regulation (see below for examples) are recruited to the transcribing polymerase, often via association with the CTD. In kinetic coupling, changes in the pol II elongation rate affect splice site choice by influencing the timing of presentation of splicing signals in the pre-mRNA to the splicing machinery. The pol II elongation rate may be influenced by promoter identity and associated transcriptional activators or coactivators and by elongation factors associated with the CTD.

Studies of the Ser/Arg-rich (SR) proteins, a class of sequence-specific RNA-binding factors that bind pre-mRNA to regulate AS (Section 1.3.4), illustrate the interdependence between transcription and AS regulation. The activities of several SR proteins are modulated in a promoter- and CTD-dependent manner (Cramer et al. 1999; de la Mata and Kornblihtt 2006). In addition, several factors involved in splicing, including the SR protein SRSF2 (also known as SC35), enhance transcription through the recruitment or stimulation of elongation factors such as the CTD kinase P-TEFb (Figure 1-1B) (Fong and Zhou 2001; Bres et al. 2005; Lin et al. 2008). Thus, transcription can affect splicing and, reciprocally, splicing can affect transcription.

During the transcription cycle, Ser2- or Ser5-phosphorylated pol II and associated elongation factors recruit chromatin-modifying complexes that establish or maintain characteristic patterns of histone modifications on active genes (reviewed in Buratowski 2009). In human cells, the 5′ ends of active genes are typically marked by histone H3 lysine 4 trimethylation (H3K4me3) (Bernstein et al. 2005). This chromatin mark is recognized by CHD1 (chromodomain helicase DNA binding protein 1), which can recruit the splicing machinery (U2 snRNP; see Section 1.2.2) to facilitate efficient pre-mRNA splicing (Sims et al. 2007). In addition, the histone modification H3K36me3 is enriched in gene regions encoding alternative exons regulated by the splicing factor PTB (polypyrimidine tract binding protein; see Section 1.3.3). These modified histone tails are recognized by MRG15 (MORF-related gene 15), which enhances the recruitment of PTB to the pre-mRNA (Luco et al. 2010). Thus, physical crosstalk between chromatin and the splicing machinery represents an additional layer of gene regulation (Figure 1-1C) (reviewed in Allemand et al. 2008; Luco et al. 2011).

1.1.2 mRNA processing remodels the messenger RNP

Capping, splicing and polyadenylation in the nucleus result in the association of protein complexes with the mRNA, which in turn influence mRNA export, localization, translation and stability. The transcription and export (TREX) complex is recruited to the 5′ end of mRNAs in a cap- and splicing-dependent manner in human cells (Figure 1-1B) (Masuda et al. 2005; Cheng et al. 2006). The TREX subunit ALY (also known as REF, THOC4 or Yra1 in yeast) directly binds the mRNA as well as the CBP80 subunit of the cap-binding complex. ALY functions as an mRNA export adapter by transferring the mRNA to TAP (TIP-associated protein; also known as NXF1, nuclear RNA export factor 1, or Mex67 in yeast). Together with its partner p15, TAP interacts with the nuclear pore complex to mediate mRNA export (Hautbergue et al. 2008). Additional RNA-binding proteins, including some SR proteins, can also act as TAP-dependent export adapters (Huang et al. 2003).

Splicing results in the deposition of a multiprotein exon junction complex (EJC) approximately 20 nt upstream of exon-exon junctions (Figure 1-1B) (reviewed in Le Hir and Andersen 2008). The four core factors of the EJC are eIF4A3, Y14, MAGOH (mago-nashi homologue), and MLN51 (metastatic lymph node gene 51; also known as Barentz, Btz, CASC3). These four proteins along with RNPS1 and UPF3 remain associated with the mRNA during export, until they are removed during the first round of translation (Dostie and Dreyfuss 2002; Lejeune et al. 2002). Additional splicing-related proteins are peripherally associated with the EJC in the nucleus but do not remain bound during export. While it was initially believed that all splice junctions are marked by the EJC, recent evidence in fly cells suggests that EJC deposition may be a regulated process (Sauliere et al. 2010). The EJC factors have multiple roles in RNA metabolism, including in mRNA localization (Palacios et al. 2004) and translation (Wiegand et al. 2003; Nott et al. 2004). The EJC also communicates the positions of splice junctions to cytoplasmic factors involved in nonsense-mediated mRNA decay (NMD), a pathway that degrades mRNAs containing premature termination codons (PTC). Specifically, the presence of an EJC downstream of a PTC strongly stimulates mammalian NMD (see Section 1.4). In addition to these post-splicing roles, new findings in Drosophila show that the EJC functions in the splicing of exons flanked by long introns (Ashton-Beaucage et al. 2010; Roignant and Treisman 2010).
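The positional logic described above (an EJC roughly 20 nt upstream of each exon-exon junction, and NMD stimulated when an EJC lies downstream of a stop codon) can be illustrated with a minimal Python sketch. The function names, the 0-based coordinate convention and the assumption that every junction carries an EJC are illustrative simplifications and are not taken from the analyses in this thesis.

```python
from typing import List

# Assumption from the text: the EJC sits ~20 nt upstream of each exon-exon junction.
EJC_OFFSET_NT = 20


def ejc_positions(exon_lengths: List[int], offset: int = EJC_OFFSET_NT) -> List[int]:
    """Return hypothetical 0-based mRNA coordinates of EJCs on a spliced transcript.

    Simplification: every exon-exon junction is assumed to be marked by an EJC.
    """
    positions = []
    junction = 0
    for length in exon_lengths[:-1]:  # the last exon has no downstream junction
        junction += length            # index of the first nucleotide after the junction
        positions.append(max(junction - offset, 0))
    return positions


def has_downstream_ejc(stop_codon_start: int, exon_lengths: List[int]) -> bool:
    """True if at least one EJC lies downstream of the stop codon.

    In this simplified view, such a stop codon would be a candidate PTC.
    """
    stop_codon_end = stop_codon_start + 3
    return any(pos >= stop_codon_end for pos in ejc_positions(exon_lengths))


# Three 100-nt exons: junctions at 100 and 200, EJCs at 80 and 180.
exons = [100, 100, 100]
print(ejc_positions(exons))            # [80, 180]
print(has_downstream_ejc(50, exons))   # stop codon in exon 1: EJCs remain downstream
print(has_downstream_ejc(250, exons))  # stop codon in the last exon: no EJC downstream
```

This sketch ignores the space occupied by the translating ribosome and any regulation of EJC deposition, so it should be read as a picture of the principle rather than as a predictor of NMD substrates.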

1.2 Pre-mRNA splicing

Approximately 92% of human protein-coding genes are interrupted by introns, and on average each gene contains 8-9 introns (Fedorova and Fedorov 2005). The excision of introns from pre-mRNA, or splicing, is catalyzed by the spliceosome, a large ribonucleoprotein (RNP) complex comprising the U1, U2, U4/6 and U5 small nuclear (sn)RNPs and a few hundred protein factors (reviewed in Wahl et al. 2009). Both RNA and protein components of the spliceosome play important roles in recognition of the core splicing signals and in catalysis. This section outlines the recognition of the core splicing signals and subsequent assembly of the spliceosome on the pre-mRNA. I will also focus on the core snRNP Sm proteins, which will be relevant in the later chapters of my thesis.

1.2.1 Core and auxiliary splicing signals

The core splicing signals in the pre-mRNA are short motifs with considerable sequence flexibility. The 5′ (donor) and 3′ (acceptor) splice sites (ss) are located at the 5′ and 3′ boundaries of the intron, respectively, and the branch point is located upstream of the 3′ss. Consensus sequences for the mammalian core splicing signals are shown in Figure 1-2A. The splicing reaction involves two successive transesterifications. In the first step, the 2′ hydroxyl of the branch point adenosine attacks the phosphodiester bond at the 5′ss, generating a free 3′ hydroxyl on the 5′ exon and a branched intron-lariat-3′ exon as intermediates. In the second step, the free 3′ hydroxyl of the 5′ exon attacks the phosphodiester bond at the 3′ss, resulting in ligation of the exons and release of the intron lariat.
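The two transesterification steps can be followed at the level of sequence bookkeeping with a short Python sketch. This is only a string-level illustration under stated assumptions: the function name, the 0-based coordinates and the toy sequence are hypothetical, and the chemistry of the 2′-5′ branch linkage is not modelled.

```python
from dataclasses import dataclass


@dataclass
class SplicingResult:
    five_prime_exon: str      # step 1 intermediate: 5' exon with a free 3' hydroxyl
    lariat_intermediate: str  # step 1 intermediate: intron lariat still joined to the 3' exon
    spliced_exons: str        # step 2 product: ligated exons
    intron_lariat: str        # step 2 product: released intron


def splice(pre_mrna: str, intron_start: int, intron_end: int, branch_point: int) -> SplicingResult:
    """Apply the two steps of splicing to one intron.

    intron_start: index of the first intron nucleotide (5'ss boundary)
    intron_end:   index of the first nucleotide of the 3' exon (3'ss boundary)
    branch_point: index of the branch point adenosine, inside the intron
    """
    if not 0 < intron_start < branch_point < intron_end < len(pre_mrna):
        raise ValueError("coordinates must satisfy 5'ss < branch point < 3'ss")
    if pre_mrna[branch_point] != "A":
        raise ValueError("the branch point nucleophile must be an adenosine")

    # Step 1: cleavage at the 5'ss frees the 5' exon and leaves the intron
    # (as a lariat) attached to the 3' exon.
    five_prime_exon = pre_mrna[:intron_start]
    lariat_intermediate = pre_mrna[intron_start:]

    # Step 2: cleavage at the 3'ss ligates the exons and releases the intron lariat.
    intron_lariat = pre_mrna[intron_start:intron_end]
    three_prime_exon = pre_mrna[intron_end:]
    return SplicingResult(five_prime_exon, lariat_intermediate,
                          five_prime_exon + three_prime_exon, intron_lariat)


# Toy pre-mRNA: 4-nt exon, 19-nt intron, 4-nt exon (sequence is illustrative only).
result = splice("AAGG" + "GUAAGUCCUAACUUUUCAG" + "CCUU",
                intron_start=4, intron_end=23, branch_point=14)
print(result.spliced_exons)  # AAGGCCUU
```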

The short, degenerate core splicing signals that mark the boundaries of introns do not contain sufficient information to accurately define the exons in human transcripts (Lim and Burge 2001). Introns often contain many ‘pseudoexons’ – intronic sequences flanked by ‘decoy’ consensus ss sequences that are not normally recognized by the splicing machinery. Thus additional cis-acting regulatory sequences are necessary to distinguish introns and exons (reviewed in Chasin 2007). These auxiliary sequences are known as exonic or intronic splicing enhancers when they promote splicing (ESE/ISEs), or as splicing silencers when they inhibit splicing (ESS/ISS). Splicing enhancers and silencers are usually short, degenerate sequence motifs (5-10 nt) and they play roles in the recognition of constitutive exons (Section 1.2.3) as well as in the regulation of inclusion of alternative exons (Section 1.3.2).
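To illustrate why the core signals alone are insufficient, the following Python sketch enumerates candidate ‘pseudoexons’ in an intron using only the nearly invariant splice site dinucleotides (AG at the end of the upstream intron, GU at the start of the downstream intron). Reducing the consensus to these dinucleotides, the length bounds and the function name are all assumptions made for illustration; this is not a method used in this thesis.

```python
from typing import List, Tuple


def find_pseudoexons(intron: str, min_len: int = 50, max_len: int = 200) -> List[Tuple[int, int]]:
    """Return (start, end) pairs of candidate pseudoexons, as 0-based half-open intervals.

    A candidate is any stretch intron[start:end] that is immediately preceded by
    'AG' (decoy 3'ss) and immediately followed by 'GU' (decoy 5'ss).
    """
    hits = []
    n = len(intron)
    for start in range(2, n):
        if intron[start - 2:start] != "AG":
            continue
        # The candidate must be followed by two nucleotides, so end <= n - 2.
        last_end = min(start + max_len, n - 2)
        for end in range(start + min_len, last_end + 1):
            if intron[end:end + 2] == "GU":
                hits.append((start, end))
    return hits


# Toy intron with a single decoy exon of ten adenosines (illustrative sequence).
toy_intron = "CCAG" + "A" * 10 + "GUCC"
print(find_pseudoexons(toy_intron, min_len=5, max_len=20))  # [(4, 14)]
```

Applied to a real intron of several kilobases, a scan of this kind would be expected to return many candidates, which is the point of the argument above: additional enhancer and silencer elements are needed to separate genuine exons from decoys.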

Figure 1-2. Overview of core splicing signals and early stages of spliceosome assembly. (A) Consensus sequences of the mammalian core splicing signals. PPT, polypyrimidine tract; ss, splice site. (B) Early stages of spliceosome assembly are shown. The U1, U2, and U4/6.U5 snRNPs contain the indicated snRNA(s) and associated proteins. Sequences of U1 and U2 snRNAs that base-pair with the 5′ss and branch site, respectively, are shown in white text. Ψ, pseudouridine; R, A/G; Y, U/C.

1.2.2 Spliceosome assembly

The consensus model of spliceosome assembly has been mostly characterized using in vitro approaches (reviewed in Matlin and Moore 2007; Wahl et al. 2009). Spliceosome assembly is a stepwise process involving recruitment of snRNPs and proteins to the pre-mRNA and dynamic rearrangements of RNA–RNA, RNA–protein and protein–protein interactions (Figure 1-2B). In the early (E) complex, also known in yeast as the ‘commitment complex’, the 5′ss is recognized by U1 snRNP, the branch point is recognized by SF1 (Splicing Factor 1; also known in yeast as BBP, branchpoint binding protein), and the PPT and 3′ss are recognized by the subunits of the U2 accessory factor (U2AF) heterodimer (U2AF65 and U2AF35, respectively). Recognition of the 5′ss involves base-pairing between the 5′ end of U1 snRNA and the pre-mRNA, which is stabilized by proteins in the U1 snRNP (Zhang and Rosbash 1999).

The U2 snRNP then replaces SF1 at the branchpoint, forming the A complex (also referred to as the prespliceosome) (Figure 1-2B). Formation of the A complex is ATP-dependent and involves base-pairing of U2 snRNA at the branch site region, which is stabilized by components of the U1 and U2 snRNPs and by U2AF65 (Barabino et al. 1990; Valcarcel et al. 1996; Gozani et al. 1998). A bulged duplex formed between the U2 snRNA and the branch site region specifies the protruding adenosine as the nucleophile for the first transesterification reaction of splicing (Query et al. 1994). The bulged adenosine is also recognized by the U2 snRNP protein p14 (SF3B14) (MacMillan et al. 1994; Schellenberg et al. 2011). While the splice sites are recognized in E complex, the pairing of splice sites for catalysis occurs at, or subsequent to, A complex formation (Chiara and Reed 1995; Lim and Hertel 2004; Kotlajich et al. 2009).

The U4/6.U5 tri-snRNP then joins the spliceosome, forming the B complex. This complex undergoes extensive remodelling to form the catalytically active spliceosome. The multiprotein PRP19/CDC5L complex (also known in yeast as the NineTeen Complex or NTC) and additional RNA helicases also associate with the spliceosome and function in spliceosome activation and splicing fidelity (reviewed in Valadkhan 2007; Hogg et al. 2010). The remodelling of RNA–RNA interactions during spliceosome activation includes disruption of U4–U6 snRNA base-pairing to allow base-pairing of U6 snRNA with intronic nucleotides at the 5′ss, release or destabilization of the U1 and U4 snRNPs, and rearrangement of interactions between U2 and U6 snRNA and within U6 snRNA. Following the two transesterification reactions of splicing, the products are released and the components of the spliceosome are recycled.

In contrast to the stepwise model of spliceosome assembly characterized in vitro, the isolation of a ‘penta-snRNP’ from yeast cells led to the hypothesis that the spliceosome may encounter the pre-mRNA in a preassembled form in vivo (Stevens et al. 2002). However, it has been suggested that the two models might be reconciled if the stepwise assembly characterized in vitro could be viewed instead as stepwise rearrangement and activation of the penta-snRNP (reviewed in Brow 2002; Nilsen 2002). Recent work also supports the relevance of the stepwise assembly model in vivo. Several groups used chromatin immunoprecipitation to monitor the cotranscriptional recruitment of snRNP components and other splicing factors to nascent transcripts of yeast intron-containing genes. These studies showed a sequential pattern of snRNP or splicing factor recruitment that was consistent with stepwise spliceosome assembly (Gornemann et al. 2005; Lacadie and Rosbash 2005; Tardiff and Rosbash 2006). In addition, live imaging of snRNP components tagged with fluorescent proteins revealed distinct interaction dynamics of individual snRNPs with pre-mRNA, in support of a stepwise recruitment model in human cells (Huranova et al. 2010).

1.2.3 Exon definition

The splicing reaction takes place between 5′ and 3′ splice sites paired across an intron. However, in metazoan genes, where introns are often longer than exons by an order of magnitude or more, it is likely that splicing is facilitated by a process termed ‘exon definition’ (Berget 1995). In the exon definition model, the factors bound to the splice sites on either side of internal exons initially interact and are stabilized across the exon (Figure 1-2B). Early evidence for exon definition included the finding that the presence and strength of a 5′ss downstream of an exon affects the recognition and splicing of the upstream intron (Nasim et al. 1990; Robberson et al. 1990; Talerico and Berget 1990; Kuo et al. 1991). In addition, using a reporter containing an isolated exon flanked by splice sites, it was found that the 5′ss sequence and U1 snRNP promoted UV crosslinking of U2AF65 at the PPT/3′ss, thus providing further evidence for the importance of cross-exon interactions (Hoffman and Grabowski 1992). Key mediators of this cross-exon bridging activity include proteins in the SR family (Section 1.3.4). These proteins interact with exonic splicing enhancers (ESEs) and promote binding of U1 snRNP and U2AF to the pre-mRNA through direct interactions as well as through interactions with splicing coactivator proteins (reviewed in Blencowe 2000). The RS domains of SR proteins also promote or stabilize RNA–RNA contacts between the core splicing signals and the UsnRNAs (Shen and Green 2006). Computational analysis of splicing signals in human and mouse also supports the exon definition model. Compensatory changes in the strength of 5′ and 3′ splice sites are observed across exons, but not across introns (Xiao et al. 2007). Furthermore, splice sites, ESEs, and ESSs coevolve to preserve the overall exon strength (Xiao et al. 2007).


Our understanding of exon definition complexes is incomplete, since the majority of in vitro spliceosome assembly assays have used reporter pre-mRNAs containing two exons separated by a single short intron. However, several recent studies have shed light on exon-defined complexes. Assembly of spliceosome complexes in vitro on a three-exon pre-mRNA reporter revealed that an exon-defined E complex can be chased into an exon-defined A complex in the presence of ATP (Sharma et al. 2008). In addition, proteomics analysis indicated that these exon-defined complexes were similar in composition to previously characterized intron-defined complexes (Sharma et al. 2008). The mechanism for conversion of cross-exon interactions into cross-intron interactions is also an area under active investigation. Recently, using an in vitro trans-splicing assay, it was shown that the U4/6.U5 tri-snRNP can associate with an exon-defined A complex, without requiring prior establishment of cross-intron interactions between U1 and U2 snRNP (Schneider et al. 2010). In addition, the establishment of cross-intron interactions upstream of an exon did not require disruption of the interactions formed across that exon (Schneider et al. 2010). In a related study, conversion of cross-intron to cross-exon interactions was investigated using pre-mRNA reporters with multiple introns. Following splicing of one intron, U1 snRNP previously engaged in cross-exon interactions on the 3′ exon remains associated with the mRNA and promotes efficient splicing of the neighbouring intron (Crabb et al. 2010).

1.2.4 Spliceosomal snRNPs and Sm proteins

The snRNPs are major components of the spliceosome. Each snRNP contains a uridine-rich snRNA (U1, U2, U4, U5 or U6) and associated proteins; however, U4 and U6 are base-paired in a U4/6 di-snRNP (Bringmann et al. 1984; Hashimoto and Steitz 1984), which is also found associated with U5 snRNP in a U4/6.U5 tri-snRNP complex (Konarska and Sharp 1987). The purification of snRNP components from mammalian cells was fortuitously accomplished using serum from a patient with the autoimmune disease systemic lupus erythematosus (SLE) (Lerner and Steitz 1979). This SLE serum was known to contain antibodies that react with a nuclear antigen present in many mammalian tissues (Tan and Kunkel 1966). The nuclear antigen was designated ‘Sm’, for ‘Smith’, in honour of Stephanie Smith, the SLE patient from whom the serum was isolated (Tan and Kunkel 1966) (reviewed in Reeves et al. 2003). Using the anti-Sm serum, RNPs containing the UsnRNAs and 7 small (12–35 kDa) proteins designated A–G were immunoprecipitated from mammalian cell extracts (Lerner and Steitz 1979). A subset of these proteins that are common to the U1, U2, U4 and U5 snRNPs became known as the Sm proteins.

The snRNPs contain both common and unique proteins. The seven common Sm proteins (B/B′ (see Chapter 4), D1, D2, D3, E, F and G) are assembled onto the snRNAs by the SMN complex (survival of motor neuron) (reviewed in Neuenkirchen et al. 2008). Formation of this snRNP ‘core’ is essential for subsequent steps in the biogenesis of mature snRNP particles. The Sm proteins bind a conserved single-stranded ‘Sm site’ with consensus sequence PuA(U)3–6GPu, located between two stem-loops near the 3′ end of the ‘Sm-class’ snRNAs (U1, U2, U4 and U5) (Branlant et al. 1982; Liautard et al. 1982). Based on crystal structures of the BD3 and D1D2 Sm protein dimers along with previous biochemical data, a model was proposed in which the Sm site RNA passes through the central cavity formed by a heteroheptameric Sm protein ring (Kambach et al. 1999). This model was recently confirmed by two crystal structures of the U1 snRNP assembled from recombinant components (Pomeranz Krummel et al. 2009) or generated by limited proteolysis of native snRNPs isolated from HeLa cells (Weber et al. 2010).
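As a purely illustrative aside, the Sm site consensus (a purine, an A, three to six U residues, a G and a purine) can be expressed as a regular expression and used to scan an RNA sequence. The sketch below is not part of the analyses in this thesis; the function name `find_sm_sites` and the example sequence are hypothetical, and real Sm sites may deviate from the strict consensus.

```python
import re

# Sm site consensus: Pu A (U)3-6 G Pu, where Pu is a purine (A or G).
# A lookahead is used so that overlapping candidate sites are all reported.
SM_SITE = re.compile(r"(?=([AG]AU{3,6}G[AG]))")

def find_sm_sites(rna):
    """Return (start, matched_sequence) for each strict consensus match.

    Hypothetical helper for illustration only. DNA-style input is accepted
    by converting T to U before scanning.
    """
    rna = rna.upper().replace("T", "U")
    return [(m.start(), m.group(1)) for m in SM_SITE.finditer(rna)]

# Made-up example sequence (not a real snRNA sequence)
example = "CCAAUUUUUGAGG"
sites = find_sm_sites(example)
for start, seq in sites:
    print(start, seq)
```

A strict pattern of this kind only flags exact consensus matches, so it would serve as a first-pass filter at most; it says nothing about the single-stranded context or the flanking stem-loops.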

The Sm proteins are essential for the assembly and stability of snRNPs. However, their role in the splicing process is not well characterized. In yeast, Sm proteins B, D1 and D3 contact the pre-mRNA near the 5′ss in the commitment/E complex (Zhang and Rosbash 1999). These three Sm proteins have extensions or ‘tails’ located C-terminal to their conserved Sm domains. Splicing assays in yeast strains harboring tail-truncated Sm proteins suggested that the tails of Sm B, D1 and D3 contribute to the stability of the U1 snRNA–pre-mRNA interaction, perhaps through basic arginine and lysine residues in the yeast Sm protein tails (Zhang et al. 2001). The mammalian C-terminal tails are also rich in positively charged residues. The D1 and D3 tails contain glycine-arginine (GR) repeats. In contrast, the SmB tail in mammals is quite divergent from that of yeast and contains a striking stretch of repeats of 3–4 prolines interspersed with glycine, methionine and arginine residues (e.g. GMPPPGMRPPPPGMR). These ‘PGM’ motifs in the tail interact with the WW domain of FBP21 (formin-binding protein 21), a spliceosome-associated protein implicated in cross-intron bridging interactions (Bedford et al. 1998). However, the function of this interaction in splicing has not been studied. In addition, while the U1 snRNP crystal structures mentioned above provided insight into recognition of the snRNA by the Sm ring, they were less informative regarding the function of the C-terminal Sm tails, since these repetitive regions were either omitted from recombinant proteins or found to be disordered (Pomeranz Krummel et al. 2009; Weber et al. 2010). Overall, while the C-terminal tails of the Sm B/B′, D1 and D3 proteins play a role in nuclear localization of the snRNPs (Bordonne 2000; Girard et al. 2004), the roles of the mammalian tails in U1 snRNA–pre-mRNA interaction or other steps in splicing remain to be studied.

Additional insights into the function of Sm proteins in splicing might be inferred by analogy to the functions of these proteins in other RNA–protein complexes. In addition to the spliceosomal snRNPs, Sm proteins form a related but distinct heptamer on U7 snRNA. The U7 heptamer contains five Sm proteins (B/B′, D3, E, F, and G), along with two Sm-like (LSm) proteins, LSm10 and LSm11, which replace Sm proteins D1 and D2, respectively. The U7 snRNP functions in histone 3′ end processing (reviewed in Dominski and Marzluff 2007). A recent study found that the U7 snRNP components SmB, SmD3 and LSm10 UV-crosslinked to the histone mRNA (Yang et al. 2009). A model was proposed in which these proteins might function as a ‘molecular ruler’ to specify the histone mRNA cleavage site at a fixed distance upstream of an RNA sequence (the ‘histone downstream element’) that is recognized by base-pairing to U7 snRNA (Yang et al. 2009). In this model, Sm proteins B and D3 function as part of the heptamer to mediate RNA–RNA interactions between the U7 snRNA and the histone mRNA. This function is reminiscent of the proposed role of the yeast Sm complex in U1 snRNA–pre-mRNA interaction discussed above (Zhang et al. 2001). However, it is not known if the Sm protein–RNA interaction occurs via the C-terminal tails, as suggested in the yeast model, or another region of the Sm proteins.

1.3 Regulation of alternative splicing

Alternative splicing (AS) is the process of differential splice site usage to generate multiple mRNA variants from a single pre-mRNA. Upon release of the draft human genome sequence, it was estimated that at least 59% of genes undergo AS, based on aligning expressed sequence tags (ESTs) and cDNAs to coding genes on chromosome 22 (International Human Genome Sequencing Consortium 2001). A higher frequency of AS, affecting 74% of multi-exon genes, was then estimated based on data from tissue profiling on exon junction microarrays and EST/cDNA evidence (Johnson et al. 2003). More recently, the use of high-throughput RNA sequencing (RNA-Seq) has led to an estimate that transcripts from 95% of human multi-exon genes undergo AS (Pan et al. 2008; Wang et al. 2008). Alternative splicing affects transcript diversity in several ways, including cassette-type exons, mutually exclusive exons, alternative 5′ or 3′ss selection, alternative promoters, alternative polyadenylation, and intron retention. In my work, I will focus on cassette-type exons, which are either included or skipped in the spliced mRNA, and which represent the most common type of AS (Castle et al. 2008; Wang et al. 2008). Although AS is widespread, the functional importance of most splice variants remains to be investigated.

1.3.1 Roles of alternative splicing

Very soon after the discovery that genes are interrupted by introns, it was proposed that exons might be joined in different combinations to generate multiple polypeptides from a single gene (Gilbert 1978). This role of AS in expansion of the proteome has been particularly emphasized following the sequencing of the human genome (International Human Genome Sequencing Consortium 2001), which was found to encode fewer protein-coding genes than anticipated by many (reviewed in Aparicio 2000; Pennisi 2003). A primary outcome of AS is the expansion of transcriptome complexity. An important consequence is an increase in the diversity of the encoded proteome (reviewed in Maniatis and Tasic 2002; Nilsen and Graveley 2010). However, an additional outcome of transcriptome expansion by AS is an increase in posttranscriptional regulatory potential. For example, differences in the coding region, 5′UTR or 3′UTR between mRNA variants produced from the same pre-mRNA can affect translation (e.g. upstream ORFs), stability (e.g. microRNA binding sites, AU-rich elements, premature stop codons), and mRNA localization, and thus have important consequences for the regulation of gene expression (Majoros and Ohler 2007; Tan et al. 2007; Mayr and Bartel 2009; Resch et al. 2009; Bell et al. 2010; Salomonis et al. 2010) (reviewed in Smith et al. 1989; Hughes 2006). The roles of AS in regulating gene expression will be discussed further in Section 1.5 below.

1.3.2 Mechanisms of alternative splicing regulation

Alternative splicing can be controlled in a developmental stage and cell type-specific manner, as well as in response to signaling or environmental cues (reviewed in Chen and Manley 2009). This AS regulation is achieved through multiple levels of control. For example, transcription elongation rate, chromatin modification, EJC deposition (see Section 1.1) and pre-mRNA secondary structure (reviewed in Warf and Berglund 2010) can influence splice site choice. However, the best-characterized mechanism of AS regulation is through the recognition of short cis-acting RNA sequence motifs (ESE/S, ISE/S) by splicing-regulatory proteins. Initial studies of AS regulation focused on the enhancement or repression of splice site recognition at the early stages of spliceosome assembly (Section 1.3.4). In contrast, some regulatory mechanisms affect splice site pairing, rather than recognition, or recruitment of the U4/U6.U5 tri-snRNP. These diverse mechanisms allow regulation of splice site choice at later stages of spliceosome assembly or even during splicing catalysis (Section 1.3.5).

1.3.3 Families of alternative splicing regulatory factors

The most extensively studied groups of splicing-regulatory factors are the SR (Ser/Arg-rich), SR-related and hnRNP (heterogeneous nuclear ribonucleoprotein) families, which I will discuss in the next section (1.3.4). Many of these proteins are widely expressed and thought to affect AS regulation in a concentration-dependent manner (Mayeda et al. 1993; Caceres et al. 1994; Hanamura et al. 1998) (reviewed in Chen and Manley 2009). However, some members of these families have tissue-restricted expression patterns. For example, our lab recently identified and characterized the first example of a nervous system-specific SR-related protein, nSR100 (also known as SRRM4, serine/arginine repetitive matrix 4) (Calarco et al. 2009). In addition, the hnRNP family member PTBP1 (polypyrimidine tract binding protein P1; also known as PTB, hnRNPI) is widely expressed, while two PTBP1 paralogues, PTBP2 (also known as nPTB, brPTB, neural/brain PTB) and ROD1 (regulator of differentiation 1), are expressed in specific cell types. Interestingly, regulation of the AS of the genes encoding these proteins plays a role in establishing their expression patterns (see Section 1.5) (Wollerton et al. 2004; Boutz et al. 2007b; Makeyev et al. 2007; Spellman et al. 2007).

Several other AS factors with tissue-restricted expression have also been characterized. Members of the NOVA (neuro-oncological ventral antigen) and ELAV-like (embryonic lethal, abnormal vision-like; also known as paraneoplastic encephalomyelitis antigen Hu) families are expressed in neurons, FOX (Feminizing gene On X homolog) and CELF (CUG-binding protein and ETR3-like family, also known as Bruno-like) proteins are expressed in the brain, heart or muscle (reviewed in Li et al. 2007), and ESRPs (epithelial splicing regulatory proteins) are expressed in epithelial cells (Warzecha et al. 2009). Like many of the SR proteins and hnRNPs, these factors bind short RNA motifs in a sequence-specific manner, through RNA recognition motifs (RRMs) or hnRNP K homology (KH) domains (Cook et al. 2011).


1.3.4 Regulation of splice site recognition

1.3.4.1 SR and SR-related proteins

The SR proteins contain 1–2 N-terminal RNA recognition motifs (RRMs) and a C-terminal RS domain that is rich in alternating serine and arginine dipeptides (reviewed in Lin and Fu 2007; Long and Caceres 2009). The prototypical SR proteins function in both constitutive and alternative splicing. Based on in vitro splicing assays, these SR proteins appear to be functionally redundant in their ability to complement splicing-deficient HeLa S100 extract (Fu et al. 1992; Mayeda et al. 1992). However, additional studies indicate that SR proteins bind distinct RNA sequences and that they have non-redundant AS functions in vivo (reviewed in Long and Caceres 2009). For example, depletion of the prototypical SR protein SRSF1 (also known as SF2, ASF) in C. elegans by RNAi results in late embryonic lethality (Longman et al. 2000). Similarly, loss of SRSF1 in chicken DT40 cells or in mouse embryos is lethal (Wang et al. 1996; Xu et al. 2005). Moreover, tissue-specific ablation of SRSF1 in the mouse heart resulted in misregulation of an SRSF1-dependent AS event in Ca2+/calmodulin-dependent kinase IIδ (CaMKIIδ) and a defect in postnatal heart remodelling (Xu et al. 2005). Thus, SR proteins have specific, non-redundant functions in the regulation of AS.

In addition to the prototypical SR proteins, many other ‘SR-related’ proteins also function as regulators of splicing and AS. These proteins often contain RS and RRM domains, but in a different configuration than the classical SR proteins. Examples of such SR-related proteins include TRA2A and TRA2B, which are homologues of transformer-2, an AS regulator involved in Drosophila sex determination. Other SR-related proteins contain RS domains alone or in combination with other RNA-binding domains (reviewed in Blencowe et al. 1999).

Though best known as positive regulators of AS, SR proteins can both promote and inhibit the inclusion of alternative exons (reviewed in Lin and Fu 2007; Long and Caceres 2009). SR proteins function in ESE-dependent splicing in several ways (reviewed in Blencowe 2000; Graveley 2000). SR proteins can bind specific ESE sequences and recruit the splicing machinery via interactions of their RS domains with snRNP components (e.g. U2AF35 and U1-70K) (Lavigueur et al. 1993; Wu and Maniatis 1993; Wang et al. 1995; Zuo and Maniatis 1996; Graveley et al. 2001). Alternatively, some SR-related proteins can function in ESE-dependent splicing by acting as splicing coactivators that bridge interactions between ESE-bound SR/SR-related proteins and snRNPs (Blencowe et al. 1998; Eldridge et al. 1999; Blencowe et al. 2000). Binding of SR proteins can also enhance exon inclusion by antagonizing the activity of negative regulators bound at nearby silencer elements (Kan and Green 1999). Recent results also show that inclusion of an alternative exon can be repressed by strong interactions of SR proteins with the flanking constitutive exons (Han et al. 2011). In addition to roles in AS regulation, some SR and SR-related proteins function in transcription, 3′ end formation, mRNA export and translation (reviewed in Blencowe et al. 1999; Long and Caceres 2009).

1.3.4.2 hnRNPs

The heterogeneous nuclear ribonucleoproteins (hnRNPs) are a diverse group of proteins functionally defined by their association with nascent hnRNA (pre-mRNA). The hnRNPs typically contain one to four RNA-binding domains (RRMs, quasi-RRMs or KH domains), as well as other auxiliary domains such as RGG boxes (Arg-Gly-Gly) or Gly-rich domains (reviewed in Martinez-Contreras et al. 2007). Many of the hnRNPs that have been implicated in AS regulation can inhibit splice site recognition through binding to specific silencer sequences (Caputi et al. 1999; Chen et al. 1999; Del Gatto-Konczak et al. 1999). Some hnRNPs such as hnRNPA1 may also cooperatively multimerize on the pre-mRNA to block the association of other factors at a distance (Zhu et al. 2001). The recognition of silencers by hnRNPs can thus block or compete with the recognition of either nearby or distal enhancer sequences by positive regulatory factors. Alternatively, hnRNPs may block or compete with the binding of snRNP-associated factors such as U2AF to the core splicing signals (Lin and Patton 1995; Singh et al. 1995). Some hnRNPs also stimulate intron definition through interactions between multiple proteins recognizing sites at the boundaries of long introns (Martinez-Contreras et al. 2006). In addition, when intronic hnRNP binding sites flank an alternative exon, interaction between the hnRNPs can lead to exon silencing by ‘looping out’ the alternative exon and bringing the splice sites of the flanking exons into close proximity (Chabot et al. 1997; Blanchette and Chabot 1999). However, at least in one case of such a looping mechanism, the binding of U1 snRNP to the 5′ss of the silenced exon was not inhibited (Chabot et al. 1997; Blanchette and Chabot 1999). Therefore, this mechanism may involve inhibition of splice site pairing rather than recognition, as described in the next section.


1.3.5 Regulation of splice site pairing and catalysis

In addition to the regulation of splice site recognition at the earliest stages of spliceosome assembly, a number of recent studies have revealed that AS can be regulated at later stages, including the subsequent steps involved in the pairing of splice sites or the recruitment of the tri-snRNP (reviewed in House and Lynch 2008). Moreover, some trans-acting splicing factors can regulate AS at both early and late stages of spliceosome assembly. For example, the hnRNP PTBP1 can repress alternative exon inclusion by inhibiting early steps leading to exon definition (Izquierdo et al. 2005; Sharma et al. 2005). However, in another mechanism, PTB can act after exon definition, by binding in an intron and blocking the functional cross-intron pairing of U1 and U2 snRNPs already associated with the splice sites (Sharma et al. 2008). Repression of alternative exon inclusion by hnRNPL and hnRNPE2 can also occur through a post–exon definition mechanism. In this case, the binding of the hnRNPs to an exon prevents the U1 and U2 snRNPs bound at its splice sites from forming productive cross-intron interactions with snRNPs at the flanking exons (House and Lynch 2006). Post–exon definition mechanisms are also not limited to hnRNPs. The SR-related tumor suppressor RBM5 can repress exon inclusion by a dual mechanism that involves blocking the transition to intron definition of the snRNP-recognized splice sites flanking a repressed alternative exon, as well as facilitating the pairing of the splice sites of the flanking constitutive exons (Bonnal et al. 2008). Splice site choice can also be regulated during catalysis. In the Drosophila melanogaster sex determination gene Sex-lethal, the Sex-lethal protein causes skipping of an alternative exon in its own transcript through an interaction with the splicing factor SPF45 that blocks splicing at the second catalytic step (Lallena et al. 2002). Together, these studies reveal the diversity of splicing regulatory mechanisms.

1.3.6 Roles of basal splicing factors in alternative splicing regulation

Studies in yeast and metazoans have shown that the levels of some basal or ‘core’ components of the splicing machinery can affect splice site choice. Microarray profiling revealed transcript-specific splicing effects in yeast strains harboring mutations in or deletions of core splicing components (Clark et al. 2002; Pleiss et al. 2007; Kawashima et al. 2009). In addition, an RNAi screen in Drosophila cells identified transcript-specific effects on AS upon depletion of general spliceosome factors, including U2AF and components of U1, U2 and U4/U6 snRNPs (Park et al. 2004). Studies in C. elegans and mammalian cells also suggested that the U2AF subunits and the U2 snRNP component SAP155 can affect splice site choice (Massiello et al. 2006; Pacheco et al. 2006; Hastings et al. 2007; Ma and Horvitz 2009). Two very recent studies implicate additional core splicing factors in AS regulation and identify associated target sequence features. The branchpoint recognition factor SF1 may regulate AS of some transcripts by binding to branch site-like sequences (Corioni et al. 2011). Also, transcriptome profiling in zebrafish embryos deficient in the U1 snRNP-specific protein U1C revealed altered splice site choice in targets with intronic U-rich sequences (Rosel et al. 2011). In a mouse model of spinal muscular atrophy (SMA), deficiency of the snRNP assembly factor SMN (Survival of Motor Neuron) resulted in tissue-specific perturbations in snRNP levels and splicing defects (Gabanella et al. 2007; Zhang et al. 2008; Baumer et al. 2009). Tiling microarray profiling analysis of fission yeast RNA also revealed transcript-specific splicing defects of a temperature-degron allele of SMN, and that some of the defects could be alleviated by strengthening the pyrimidine tract upstream of the branchpoint (Campion et al. 2010). In addition to these studies, the work in my thesis will provide new evidence for the role of core splicing factors in AS regulation (Saltzman et al. 2011).

In summary, the features that underlie the differential sensitivity of introns or alternative exons to particular defects in the core splicing machinery are only beginning to be explored. Moreover, in contrast to the AS regulatory factors described above, the mechanisms of these effects are poorly understood. Some clues may be provided by analogy to the kinetic proofreading model of splicing fidelity in yeast. This model broadly predicts that any changes that alter the kinetics of transitions in the splicing pathway, including the availability or activity of core splicing factors, can alter splice site choice (Yu et al. 2008) (reviewed in Smith et al. 2008).

1.3.7 Breaking the ‘code’ of cis-acting alternative splicing regulatory sequences

A goal of the study of AS is to build predictive models for AS regulation, or a splicing regulatory ‘code’ (reviewed in Matlin et al. 2005; Blencowe 2006; Wang and Burge 2008). Deciphering the rules that control AS will be important for understanding gene expression on a genome-wide scale, and for the ability to predict how mutations affect this regulation. However, the nature of splicing regulation complicates the path from genomic sequence to AS predictions. For example, a particular cis-regulatory sequence can have opposite effects on AS regulation depending on its position within an intron or exon, even when the sequence is recognized by the same trans-acting regulator (reviewed in Chen and Manley 2009). The activity of an AS regulator can also depend on local sequence context (Xiao et al. 2009; Motta-Mena et al. 2010) or on its posttranslational modification state (Feng et al. 2008). Many regulated alternative exons and their flanking introns also have binding sites for multiple factors, suggesting they are controlled in a combinatorial manner. Nevertheless, significant advances have been made recently in identifying sequence features that predict tissue-regulated AS as well as regulation by specific trans-acting factors (Barash et al. 2010; Zhang et al. 2010). This progress has been accelerated by integrating information from multiple sources, especially sequence conservation across species, splicing regulatory motifs identified through bioinformatic and experimental screening approaches, RNA target binding data for AS regulators, RNA structural features, and splice variant profiling data from microarrays or high-throughput RNA sequencing (RNA-Seq).

1.3.8 Large-scale analysis of alternative splicing regulation

Many insights into AS and its regulation have been made possible using high-throughput methods to study the transcriptome. Technologies used to detect and quantify the levels of splice variants in an mRNA sample include microarrays (tiling, exon, exon-junction and exon/exon-junction combinations) (Shoemaker et al. 2001; Johnson et al. 2003; Pan et al. 2004) (reviewed in Calarco et al. 2007; Hallegger et al. 2010), fibre-optic bead arrays (Yeakley et al. 2002), high-throughput RT-PCR (Klinck et al. 2008), and RNA-Seq (Cloonan et al. 2008; Mortazavi et al. 2008; Pan et al. 2008; Sultan et al. 2008; Wang et al. 2008) (reviewed in Blencowe et al. 2009). These methods have been used to profile differences in the mammalian splice variant repertoire among tissues, individuals, developmental stages and cell culture models of developmental transitions, as well as in cancer versus normal tissues (reviewed in Calarco et al. 2007; Hartmann and Valcarcel 2009; Hallegger et al. 2010). High-throughput methods have also been used to identify functional targets of specific AS regulators by profiling AS following knockdown or loss of a particular protein (Blanchette et al. 2005; Ule et al. 2005) (reviewed in Calarco et al. 2007; Hallegger et al. 2010). Combining this profiling data with factor binding site preferences determined by methods such as SELEX (Tuerk and Gold 1990) or RNAcompete (Ray et al. 2009) can then provide insights into the biological function of an AS regulator. Furthermore, to distinguish direct from indirect targets, methods such as UV Crosslinking and Immunoprecipitation coupled with high-throughput sequencing (CLIP-Seq; also known as high-throughput sequencing of RNA isolated by CLIP, HITS-CLIP) allow the isolation of RNA targets directly bound by a protein of interest on a genome-wide scale (Ule et al. 2003) (reviewed in Witten and Ule 2011).

In addition to cataloguing transcriptome complexity, the approaches mentioned above have revealed sequence features associated with AS regulation and allowed construction of ‘RNA splicing maps’ of the position-dependent effects of AS regulators (reviewed in Witten and Ule 2011). More generally, while mRNA expression profiling microarrays showed that functionally related genes are often coexpressed in mammalian cells and tissues (Eisen et al. 1998; Su et al. 2004; Zhang et al. 2004), AS microarray profiling studies revealed that functionally related genes are also coordinately regulated by AS. These ‘AS networks’ or ‘exon networks’ have functional properties reflecting tissue identity, but the groups of genes are often distinct from those coregulated at the transcriptional level (Le et al. 2004; Pan et al. 2004; Fagnani et al. 2007; Castle et al. 2008). In addition, functionally related genes are often coregulated by tissue-restricted AS factors such as NOVA, nSR100, ESRP and CELF/MBNL (reviewed in Licatalosi and Darnell 2010; Calarco et al. 2011). The coordination of gene expression through AS networks extends previous models proposing that mRNPs represent “post-transcriptional operons” in eukaryotes (Keene and Tenenbaum 2002).

1.3.9 Overview of large-scale alternative splicing detection methods used in this thesis

In my thesis work, I used both microarray- and RNA-Seq-based methods to quantify the relative abundance of mRNA splice variants. An overview comparing and contrasting these approaches is presented in Figure 1-3. In both cases, the experimental workflow begins with isolation of polyadenylated (polyA+) RNA from cells or tissues, which is then reverse-transcribed to cDNA (Figure 1-3A). Fluor-labeled single-stranded cDNA is generated for hybridization to AS microarrays (Hughes et al. 2006), whereas fragmented, double-stranded cDNA flanked by adapters is generated for RNA-Seq following the Illumina mRNA-Seq protocol. In parallel to these steps, a database of cassette-type AS events is generated by identifying such events in cDNA and EST sequences that have been aligned to the genome (Figure 1-3B) (performed by Sandy Pan) (Pan et al. 2004; Pan et al. 2005). This AS database is used to design oligonucleotide probes for the AS microarray, or as a set of exon-exon junction sequences onto which RNA-Seq reads are bioinformatically aligned (Figure 1-3A). The % alternative exon inclusion measurements (‘% inclusion’, i.e. the percentage of transcripts in which the alternative exon is included) calculated using the AS microarray platform or the RNA-Seq method are then quality-filtered using simple criteria. The resulting AS predictions correlate well with measurements made by independent methods such as RT-PCR (Chapter 2, Chapter 5).

Figure 1-3. Outline of microarray and RNA-Seq AS profiling methods used in this work. (A) Left: For AS microarray profiling, fluor-labeled cDNAs are hybridized to the AS microarray. The GenASAP algorithm is then used to estimate the % exon inclusion levels and confidence ranks from the signal intensities of the scanned microarray images. Right: For RNA-Seq AS profiling, 50-nt high-throughput short read sequencing is performed on cDNA libraries using the Illumina Genome Analyzer II. The % exon inclusion levels are calculated by counting the number of sequence reads that align to the included or skipped junctions in the AS database. (B) Construction of a database of cassette-type AS events mined from ESTs/cDNAs. These AS events are used to design exon and exon-exon junction microarray probes or to align RNA-Seq reads to exon-exon junction sequences.


1.3.9.1 Alternative splicing microarray profiling

The AS microarray platform developed by the Blencowe and Frey labs contains sets of six probes for ~3000 AS events (three exon probes: C1, A, C2; and three junction probes: C1-A, A-C2, C1-C2) (Figure 1-3A) (Pan et al. 2004). Ideally, both splice variants should hybridize to the C1 and C2 exon probes, whereas the included variant should hybridize specifically to the C1-A, A, and A-C2 probes, and the skipped variant should hybridize specifically to the C1-C2 junction probe. Although the probes are designed for optimal specificity, in practice the probe signals do not correspond to this ‘ideal hybridization profile’, especially as a result of cross-hybridization of the splice variants to the junction probes. In addition, accurate prediction of relative splice variant levels for some AS events is complicated by outlier probes, whose signals are not consistent with the other five probes for the AS event, as well as by other sources of noise. Therefore, a Bayesian learning algorithm called the Generative model for the Alternative Splicing Array Platform (GenASAP) is used to accurately predict the AS levels (% inclusion) from the microarray data (Shai et al. 2006) (Figure 1-3A). GenASAP uses the microarray data to model the hybridization of the included and skipped splice variants to the six probes. This significantly improves the accuracy of the % inclusion predictions in comparison to using the ‘ideal’ hybridization profile described above. In addition, GenASAP models aberrant probe data and noise, and assigns a confidence rank for each prediction based on the fit of the probes to the learned model and their signal intensities (Pan et al. 2004; Shai et al. 2006). This confidence rank is then used to filter the list of AS events used for further analysis.
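To make the ‘ideal hybridization profile’ concrete, the following Python sketch encodes which of the six probes each splice variant would be expected to bind, and derives a naive % inclusion estimate that trusts this profile completely. It is an illustration only and is not GenASAP: the junction probe labels (written here as C1-A, A-C2 and C1-C2), the function names and the naive estimator are assumptions introduced for this example. The naive estimate ignores cross-hybridization, outlier probes and noise, which is why a learned model is used in practice.

```python
from statistics import mean

# Probe order for one cassette-type AS event (labels are illustrative).
PROBES = ("C1", "A", "C2", "C1-A", "A-C2", "C1-C2")

# Ideal profile: 1 = the variant is expected to hybridize, 0 = it is not.
IDEAL_PROFILE = {
    "included": {"C1": 1, "A": 1, "C2": 1, "C1-A": 1, "A-C2": 1, "C1-C2": 0},
    "skipped":  {"C1": 1, "A": 0, "C2": 1, "C1-A": 0, "A-C2": 0, "C1-C2": 1},
}


def variant_specific_probes(variant):
    """Probes bound by `variant` but not by the other variant in the ideal profile."""
    other = "skipped" if variant == "included" else "included"
    return [p for p in PROBES
            if IDEAL_PROFILE[variant][p] and not IDEAL_PROFILE[other][p]]


def naive_percent_inclusion(signals):
    """Naive % inclusion from probe signals, assuming the ideal profile holds.

    `signals` maps probe label -> background-corrected intensity.
    Returns None if there is no variant-specific signal at all.
    """
    included = mean(signals[p] for p in variant_specific_probes("included"))
    skipped = mean(signals[p] for p in variant_specific_probes("skipped"))
    total = included + skipped
    if total <= 0:
        return None
    return 100.0 * included / total
```

For example, `variant_specific_probes("skipped")` returns only the C1-C2 junction probe, which shows how heavily a naive estimate depends on a single probe for the skipped variant.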

1.3.9.2 AS profiling by high-throughput RNA sequencing (RNA-Seq)

In more recent work in this thesis, I used RNA-Seq to measure the % inclusion levels of alternative exons (Chapter 5). In this approach, short sequence reads generated using Illumina RNA-Seq are aligned to the included (C1-A, A-C2) or skipped (C1-C2) exon-exon junction sequences in our database of cassette-type AS events (Figure 1-3) (Pan et al. 2008). The % exon inclusion level is then calculated by counting the number of reads aligning to the included junctions as a percentage of the total number of reads aligning to both the included and skipped junctions. The AS events are then filtered using criteria such as setting a lower limit on the number of junction read counts identified for the event. When compared to AS profiling using the microarray, the RNA-Seq method typically offers a substantial increase in the number of AS events for which % inclusion can be accurately predicted in a single experiment.
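The junction-count calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline used in this work: the function name, the choice to average the two inclusion junctions so that each variant is counted once, and the default read-count threshold are all assumptions made for the example.

```python
def percent_inclusion(c1a_reads, ac2_reads, c1c2_reads, min_reads=20):
    """Estimate % exon inclusion from exon-exon junction read counts.

    c1a_reads, ac2_reads: reads aligned to the two inclusion junctions.
    c1c2_reads: reads aligned to the skipping junction.
    min_reads: hypothetical lower limit on junction read support; events
               below it are reported as None (filtered out).
    """
    # Assumption: average the two inclusion junctions so that an included
    # transcript is not counted twice relative to a skipped transcript.
    included = (c1a_reads + ac2_reads) / 2
    skipped = c1c2_reads
    total = included + skipped
    if total < min_reads:
        return None
    return 100.0 * included / total
```

Returning `None` for poorly supported events keeps the filtering step explicit, so that downstream comparisons only use events with adequate junction coverage.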


1.4 Nonsense-mediated mRNA decay (NMD)

In all eukaryotes studied to date, transcripts containing premature termination codons (PTCs) are degraded by the translation-dependent nonsense-mediated mRNA decay (NMD) pathway (reviewed in Chang et al. 2007; Isken and Maquat 2008). The NMD pathway functions in both quality control and the regulation of gene expression. In addition, the activity of the NMD pathway may itself be regulated. The efficiency of NMD can vary among tissues or individuals (Bateman et al. 2003; Linde et al. 2007; Viegas et al. 2007) and recent evidence also suggests that NMD may be developmentally regulated in C. elegans (Barberan-Soler et al. 2009).

1.4.1 Features targeting transcripts for NMD

NMD degrades PTC-containing mRNAs arising from nonsense or frameshift mutations, expressed pseudogenes and transposons, and, in the immune system, degrades transcripts from unproductively rearranged immunoglobulin and T-cell receptor genes. Transcript features such as upstream open reading frames (uORFs), 3′UTR introns and long 3′UTRs can also target mRNAs for NMD (reviewed in Chang et al. 2007; Isken and Maquat 2008). Cytoplasmic intron-retaining transcripts are also degraded by NMD in yeast, C. elegans, and ciliates (He et al. 1993; Mitrovich and Anderson 2000; Jaillon et al. 2008; Sayani et al. 2008; Ramani et al. 2009).

An additional group of NMD targets that will be a subject of my thesis are PTC-containing splice variants resulting from AS. I will refer to this process as ‘AS coupled with NMD’ (AS-NMD), although it has also been called ‘regulated unproductive splicing and translation’ (RUST) (Green et al. 2003; Lewis et al. 2003). My work will focus on the introduction of PTCs by inclusion or skipping of cassette-type alternative exons. Two specific cases that will be studied are (1) the inclusion of an alternative exon that contains an in-frame PTC or shifts the reading frame to introduce a PTC (PTC upon inclusion); and (2) the skipping of an alternative exon that shifts the reading frame to introduce a PTC (PTC upon skipping) (Figure 1-4). Roles for AS-NMD in regulating gene expression will be discussed below in Section 1.5.
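The two cases above rest on simple properties of the cassette exon: whether it carries a stop codon in the incoming reading frame, and whether its length is a multiple of three. The Python sketch below checks both. It is an illustrative simplification under stated assumptions: the function names and the `frame_offset` parameter are hypothetical, stop codons that span an exon-exon junction are ignored, and a frameshift is only flagged, since finding the resulting PTC would require scanning the downstream sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_inframe_stop(seq, frame_offset=0):
    """Return the index of the first in-frame stop codon in `seq`, or None.

    frame_offset: number of exon nucleotides (0, 1 or 2) needed to complete
    a codon begun in the upstream exon before the first full codon starts.
    """
    seq = seq.upper()
    for i in range(frame_offset, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def cassette_exon_effects(exon_seq, frame_offset=0):
    """Summarize how a cassette exon could introduce a PTC.

    inframe_stop_on_inclusion: the exon itself contains an in-frame stop.
    frame_shifting: exon length is not a multiple of 3, so including it
    (or skipping it, if the included variant is the coding one) shifts
    the downstream reading frame.
    """
    return {
        "inframe_stop_on_inclusion":
            first_inframe_stop(exon_seq, frame_offset) is not None,
        "frame_shifting": len(exon_seq) % 3 != 0,
    }
```

Whether a frame-shifting exon corresponds to ‘PTC upon inclusion’ or ‘PTC upon skipping’ depends on which variant carries the normal reading frame, which this sketch leaves to the caller.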


Figure 1-4. Alternative splicing of cassette-type exons can lead to introduction of a premature termination codon (PTC) in the included or skipped splice variant (AS-NMD). (A) In ‘PTC upon inclusion’, inclusion of an alternative exon introduces an in-frame PTC or, alternatively, shifts the reading frame to introduce a PTC (not shown). (B) In ‘PTC upon skipping’, skipping of an alternative exon shifts the reading frame leading to a PTC. Green octagon, normal stop codon; red octagon, PTC.

1.4.2 NMD trans-acting factors and mechanisms of decay

Three core NMD factors, UPF1 (Up-Frameshift 1; also known as Regulator of Nonsense Transcripts 1, RENT1, or Suppressor with Morphogenetic Effect on Genitalia 2, SMG2), UPF2 (RENT2, SMG3) and UPF3 (SMG4), are conserved among all studied eukaryotes. In vertebrates, there are two UPF3-related paralogues, UPF3 and UPF3X (also known as UPF3A and UPF3B, respectively) (Lykke-Andersen et al. 2000; Serin et al. 2001; Wittkopp et al. 2009). UPF3X more strongly activates NMD than UPF3 when tethered downstream of a stop codon (Kunz et al. 2006), but these proteins may have partially redundant functions (Lykke-Andersen et al. 2000; Kim et al. 2001; Tarpey et al. 2007; Chan et al. 2009). Additional NMD factors SMG1, SMG5, SMG6 and SMG7 are conserved in C. elegans, Drosophila (except for SMG7), and mammals. Although these latter factors were initially thought to be specific to metazoans, a putative yeast SMG7 orthologue, Ebs1p, may also play a role in NMD (Luke et al. 2007). Two other factors, SMGL1 and SMGL2 (SMG-lethal 1 and 2, also known as NAG, neuroblastoma amplified gene, and DHX34, DEAH box protein 34, respectively) also function in NMD in C. elegans, zebrafish, and human cells, but their specific roles are not known (Longman et al. 2007; Anastasaki et al. 2011). In addition, SMG8 and SMG9 are involved in NMD in C. elegans and mammals through regulation of SMG1 kinase activity (Yamashita et al. 2009).

In the consensus model of NMD in mammals (reviewed in Isken and Maquat 2008), UPF2 and UPF3 associate with spliced mRNA via the EJC (discussed above in Section 1.1.2) (Kim et al. 2001; Le Hir et al. 2001; Lykke-Andersen et al. 2001). When the ribosome encounters a stop codon upstream of an EJC, UPF1 and SMG1 are recruited to the terminating ribosome (Kashima et al. 2006), likely via interactions with the eukaryotic release factors eRF1 and eRF3 (Czaplinski et al. 1998; Kobayashi et al. 2004). Interaction of UPF1 with UPF2 and/or UPF3 and of SMG1 with UPF2, UPF3 and other EJC proteins then promotes the phosphorylation of UPF1 by SMG1 (Kashima et al. 2006). Phosphorylated UPF1 recruits SMG5 and SMG7 (Ohnishi et al. 2003), as well as SMG6, which also interacts with EJC components (Kashima et al. 2010). The association of SMG5 and SMG7 promotes the dephosphorylation of UPF1 (Ohnishi et al. 2003) and the recruitment of the mRNA decay machinery. In addition, the interaction of UPF1 with UPF2 and UPF3 stimulates the ATPase and helicase activity of UPF1 (Chamieh et al. 2008), which then promotes the disassembly of the mRNP and the completion of decay (Franks et al. 2010).

The decay of PTC-containing mRNAs can occur through two pathways in mammals (reviewed in Muhlemann and Lykke-Andersen 2010). In the first pathway, which is similar to decay of NMD targets in yeast, the binding of SMG5/SMG7 to phosphorylated UPF1 results in decapping and deadenylation followed by exonucleolytic degradation. Alternatively, in a second pathway that also represents the principal decay pathway in Drosophila NMD, SMG6 endonucleolytically cleaves the mRNA near the PTC, which is then followed by exonucleolytic degradation of the decay intermediates (Gatfield and Izaurralde 2004; Glavan et al. 2006; Huntzinger et al. 2008; Eberle et al. 2009).

The mechanism by which the NMD machinery discriminates between normal and premature nonsense codons may vary between organisms and has been a subject of controversy in the field, as described in the next section. In addition, there is evidence for alternative branches of the NMD pathway with differential requirements for EJC and UPF components, which will be discussed in Chapter 2.


1.4.3 Discriminating between normal and premature nonsense codons: integrating the EJC-dependent and faux 3′UTR models

In the prevailing model of mammalian NMD, a nonsense codon is recognized as premature when it is located at least 50-55 nucleotides upstream of the final splice junction. This splicing-dependent ‘50 nt rule’ was initially proposed based on analysis of disease-associated nonsense mutations and mutagenesis of nonsense-containing reporters. It was also supported by bioinformatic examination of genes containing 3′UTR introns, which found that the intron is located less than 50 bp downstream of the natural stop codon in 98% of genes studied (n=105) (Nagy and Maquat 1998). This empirical rule accurately predicts many cases of NMD-inducing PTCs, although some exceptions will be discussed below in this section. The subsequent discovery that splice junctions are ‘marked’ by the EJC provided a biochemical mechanism to determine the position of a stop codon relative to the exon-exon junctions of a spliced mRNA. Further support for this ‘EJC-dependent’ mechanism for PTC recognition in mammalian cells was provided by experiments demonstrating that artificially tethering EJC proteins downstream of a stop codon triggers NMD, as observed for the UPF proteins (Lykke-Andersen et al. 2000), and also by experiments showing that knockdown of EJC proteins impairs NMD (reviewed in Lejeune and Maquat 2005). As mentioned in the previous section, the EJC also functions to recruit the UPF proteins to the mRNA. In addition, recent studies in zebrafish suggest that a splicing-dependent mechanism of PTC recognition is conserved across vertebrates (Wittkopp et al. 2009).
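As a worked illustration of the ‘50 nt rule’, the Python sketch below applies it to positions given in spliced-mRNA coordinates. The function name, the coordinate convention (distance measured from the first nucleotide of the stop codon to the first nucleotide of the exon downstream of the final junction) and the default threshold of 50 nt are assumptions chosen for this example. The sketch captures only the empirical rule and does not model the exceptions discussed later in this section.

```python
def is_premature_by_50nt_rule(stop_pos, junction_positions, min_distance=50):
    """Apply the empirical '50 nt rule' to a stop codon in a spliced mRNA.

    stop_pos: position of the first nucleotide of the stop codon.
    junction_positions: positions of exon-exon junctions, each given as the
        first nucleotide of the downstream exon (same coordinate system).
    min_distance: minimum distance (nt) upstream of the final junction at
        which a stop codon is called premature (the source cites 50-55 nt).
    """
    if not junction_positions:
        # An unspliced (intronless) mRNA has no junction to measure from.
        return False
    final_junction = max(junction_positions)
    return final_junction - stop_pos >= min_distance
```

For example, a stop codon at position 100 with the final junction at position 200 would be called premature, whereas the same stop codon with the final junction at position 120, or located in the last exon, would not.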

In contrast to vertebrates, NMD is not dependent on splicing or EJC components in Drosophila or C. elegans (Gatfield et al. 2003; Longman et al. 2007). Moreover, a splicing-dependent pathway for PTC recognition would not be generally applicable in budding yeast, in which most genes are intronless and NMD is not splicing-dependent. Early studies of NMD targets in yeast suggested that cis-acting sequences downstream of a PTC are recognized by specific RNA binding factors, which then recruit the UPF factors to trigger decay (Peltz et al. 1993). This ‘downstream sequence element’ model is conceptually similar to the mammalian EJC-dependent model, since both suggest that specific factors bound downstream of a termination codon ‘mark’ it as premature and promote NMD. In contrast to these models, in vitro studies of translation termination in yeast suggested that NMD is triggered by the absence of factors downstream of a PTC that would normally promote efficient translation termination and antagonize NMD (Amrani et al. 2004). This ‘faux 3′UTR’ model is supported by the biochemically aberrant termination of ribosomes at PTCs, which can be alleviated by loss of UPF1 or by placing a normal 3′UTR downstream of the stop codon. Tethering poly(A)-binding protein (PABP) downstream of a PTC also stabilizes the mRNA in yeast. It was subsequently also observed that tethering of cytoplasmic PABP 1 (PABPC1) downstream of a PTC inhibits NMD in Drosophila (Behm-Ansmant et al. 2007), which suggested that the ‘faux 3′UTR’ model of PTC recognition is conserved in other eukaryotes.

Thus, despite the conservation of the core NMD machinery, two different models for discriminating between normal and premature stop codons emerged: an ‘EJC-dependent’ mechanism in vertebrates and a ‘faux 3′UTR’ mechanism in yeast and flies. However, recent studies have led to an ‘integrated’ model that incorporates features of both mechanisms (Figure 1-5) (reviewed in Rebbapragada and Lykke-Andersen 2009). In yeast, neither a poly(A) tail nor PABP is required for NMD, suggesting that additional redundant features not accounted for in the ‘faux 3′UTR’ model are involved in discriminating normal and premature stop codons (Meaux et al. 2008). Conversely, in human cells, mRNAs with long 3′UTRs are subject to NMD in a splicing- and EJC-independent manner (Buhler et al. 2006; Eberle et al. 2008; Hogg and Goff 2010). In addition, consistent with previous observations in yeast and flies, recent studies in human cells show that recruitment of cytoplasmic PABP downstream of a PTC inhibits NMD, whereas artificially extending the 3′UTR stimulates NMD (Eberle et al. 2008; Ivanov et al. 2008; Silva et al. 2008; Singh et al. 2008). These studies also suggest that competition between UPF1 and cytoplasmic PABP for binding to the translation release factor eRF3 may underlie the opposing effects of these two proteins on efficient translation termination and NMD (Ivanov et al. 2008; Singh et al. 2008). These new findings have led to an ‘integrated’ model of PTC recognition, in which competition between factors that promote or antagonize NMD determines whether or not the UPF complex associates with the terminating ribosome to elicit NMD (summarized in Figure 1-5).


Figure 1-5. An integrated model for discrimination between premature and normal stop codons. Recent studies suggest that competition between factors that promote or inhibit NMD and/or translation termination determines if a stop codon is recognized as premature or normal. An exon junction complex (EJC) downstream of a stop codon stimulates recruitment of UPF factors to the terminating ribosome, thus promoting NMD. Poly(A)-binding protein (PABP) bound to the poly(A) tail proximal to a stop codon antagonizes the recruitment of UPF factors, thus inhibiting NMD, and also stimulates normal translation termination. CBC, cap-binding complex; eIF4G, eukaryotic translation initiation factor 4 gamma; eRF, eukaryotic release factor.

1.5 Feedback regulation of gene expression

The gene expression programs underlying cell functions include both transcriptional and post-transcriptional layers of regulation. Detailed studies of transcriptional regulatory networks have led to the identification of common regulatory patterns, or biological network motifs (reviewed in Alon 2007). One common transcription network motif is positive or negative autoregulation, in which a transcription factor stimulates or represses its own expression. This autoregulation is common in both prokaryotic and eukaryotic transcription factors (Thieffry et al. 1998; Lee et al. 2002; Odom et al. 2006). Interconnected autoregulatory loops are also important for regulation of the key pluripotency transcription factors OCT4, SOX2 and NANOG (Boyer et al. 2005). In contrast, network motifs in post-transcriptional regulation are less well characterized. In this section, I provide examples suggesting that autoregulatory and cross-regulatory feedback circuits are a common theme among factors with diverse roles in RNA metabolism. Examination of this regulation highlights the roles of AS in gene expression and the multifunctionality of many RNA-binding proteins (Table 1-1).

1.5.1 Post-transcriptional autoregulation

1.5.1.1 Splicing regulatory factors

An ever-increasing number of splicing regulatory proteins regulate the AS of their own pre-mRNAs. These include members of the SR and hnRNP families, as well as numerous tissue-restricted factors (Table 1-1). Autoregulation through AS can have a number of functional outcomes. The most common result of autoregulated AS is the control of gene expression, by producing alternative transcripts containing premature termination codons (PTCs) that are degraded by nonsense-mediated mRNA decay (AS-NMD). A simple model of feedback regulation through AS-NMD involving a PTC-introducing cassette-type alternative exon is presented in Figure 1-6. Depending on the gene, PTCs targeting an alternative splice variant for NMD may be introduced through inclusion of cassette-type alternative exons as well as through alternative splice site selection or alternative intron retention, particularly of 3′UTR introns. In addition, autoregulation can involve multiple levels of control. For example, the multifunctional SR protein SRSF1 (SF2/ASF) autoregulates its expression through both AS and translation regulation (Sun et al. 2010).

Alternatively, in other cases of splicing factor autoregulation, AS can lead to protein isoforms with different functional properties. The neuronal KH-domain protein NOVA1 autoregulates its expression by inhibiting inclusion of a cassette exon encoding a phosphorylation domain. Studies of NOVA1 autoregulation revealed for the first time that the effects of this protein on AS were position-dependent (Dredge et al. 2005). In the FOX1 and FOX2 proteins, autoregulated exon skipping leads to expression of dominant-negative isoforms with truncated second RRM domains (Damianov and Black 2010).

In addition to autoregulation, there are many splicing factors that cross-regulate the AS of pre-mRNAs encoding other AS factors (reviewed in McGlincy and Smith 2008) (Table 1-1). Cross-regulatory interactions among paralogues play important roles in the regulation of developmental transitions (see Section 1.5.2 below).


Table 1-1. Post-transcriptional auto- and cross-regulation of proteins with roles in RNA biogenesis and metabolism. In the first column: †, cross-regulation; *, autoregulation not demonstrated but suggested by AS-NMD. Examples are in mammalian species unless indicated otherwise in the first column. Abbreviations: At, Arabidopsis thaliana; Ce, Caenorhabditis elegans; Dm, Drosophila melanogaster; Sc, Saccharomyces cerevisiae; Xl, Xenopus laevis.

 | Gene Name (Alias) (homologues) | References

SR proteins
 | SRSF1 (ASF, SF2, SRp30a, SFRS1) | (Sun et al. 2010); (Wang et al. 1996)
 | SRSF2 (SC35, SRp30b, SFRS2) | (Sureau et al. 2001); (Dreumont et al. 2010)
 | SRSF3 (SRp20, SFRS3) | (Jumaa and Nielsen 1997); (Anko et al. 2010)
 | SRSF4 (SRp75, SFRS4) | (Anko et al. 2010)
 | SRSF7 (9G8, SFRS7) | (Lejeune et al. 2001)
* | SRSF5 (SRp40, SFRS5); SRSF6 (SRp55, B52, SFRS6); SRSF8 (SRp46, SFRS2B); SRSF9 (SRp30c, SFRS9); SRSF10 (SRp38, FUSIP1, SRrp40); SRSF11 (SRp54, SFRS11) | (Lareau et al. 2007b)
*, Ce | SRp20; SRp30b (SC35 homologue) | (Morrison et al. 1997); (Ramani et al. 2009)
*, Ce | rsp1 (SRp75); rsp2 (SRp40); rsp7 (p54) | (Ramani et al. 2009)
†, Dm | Rbp1; Rbp1like; (SRp20 homologues) | (Kumar and Lopez 2005)
†, At | SR1 (atSRp34); (SRSF1 homologue); atSRp30; (SRSF1 homologue); atRSZ33 | (Lopato et al. 1999); (Kalyna et al. 2003)

SR-related proteins
 | TRA2B (SFRS10) | (Stoilov et al. 2004)
Dm | transformer-2 | (Mattox and Baker 1991)
*, Ce | rsp8 (TRA2B homologue) | (Ramani et al. 2009)
 | SFSWAP (SWAP, SFRS8) | (Sarkissian et al. 1996)
Dm | su(w a) (SWAP homologue) | (Zachar et al. 1994)
Sc | Npl3 | (Lund et al. 2008)
* | RBM39 (CAPER, RNPC2); SRRM1 (SRm160) | (Ni et al. 2007)

hnRNPs
 | HNRNPA1 | (Chabot et al. 1997); (Blanchette and Chabot 1999)
 | HNRNPA2B1 | (McGlincy et al. 2010)
 | HNRNPD (AUF1, HNRPD) | (Banihashemi et al. 2006); (Wilson et al. 1999)
 | PTBP1 (PTB, hnRNPI) | (Wollerton et al. 2004); see also PTBP2 below
 | HNRPL (hnRNPL); HNRPLL (hnRNPLL) | (Rossbach et al. 2009)
Dm | hrp59 (HNRNPM homologue) | (Hase et al. 2006)
* | hnRNPH1; hnRNPK; hnRNPM | (Ni et al. 2007)

Tissue-restricted AS regulators
† | PTBP2 (nPTB, brPTB); ROD1 (PTBP3); (+PTBP1 (PTB)) | (Wollerton et al. 2004); (Boutz et al. 2007b); (Makeyev et al. 2007); (Spellman et al. 2007)
 | NOVA1 | (Dredge et al. 2005)
 | RBFOX1 (FOX1; A2BP1); RBFOX2 (FOX2, RBM9) | (Baraniak et al. 2006); (Damianov and Black 2010)
† | CELF1 (BRUNOL2, CUGBP1); CELF2 (BRUNOL3, CUGBP2, ETR3) | (Dembowski and Grabowski 2009)
† | TIA1; TIAL1 (TIAR) | (Le Guiner et al. 2001); (Izquierdo and Valcarcel 2007)
† | MBNL1 (MBNL); MBNL2 (MBLL) | (Lin et al. 2006); (Kalsotra et al. 2008)
 | ELAVL1 (HuR) | (AlAhmadi et al. 2009); (Yi et al. 2010)
Dm | Elav | (Borgeson and Samson 2005)
 | TARDBP (TDP43, ALS10) | (Ayala et al. 2010); (Polymenidou et al. 2011)

Spliceosome-related
 | U1A | (Boelens et al. 1993)
 | SMN | (Jodelka et al. 2010)
 | SNRNP48 (U1148K); RNPC3 (U11/U1265K) | (Verbeeren et al. 2010)
*, Ce | U2AF65 | (Zorio et al. 1997)

Ribosomal proteins
Sc | Rpl2 | (Presutti et al. 1991); (Presutti et al. 1995)
Xl | Rpl1 | (Caffarelli et al. 1987)
 | Rpl3 | (Cuccurese et al. 2005)
 | Rpl12 | (Cuccurese et al. 2005)
Ce | Rpl12 | (Mitrovich and Anderson 2000)
Sc | Rpl30 | (Macias et al. 2008)
Sc | Rpl32 | (Dabeva and Warner 1993)

Other
 | FMR1 (FMRP, FRAXA) | (Didiot et al. 2008)
 | ADAR2 | (Rueter et al. 1999); (Feng et al. 2006)
Dm | Adar | (Keegan et al. 2005)
Dm | Su(f) (CSTF3 homologue) | (Audibert and Simonelig 1998); (Juge et al. 2000)
Dm | Sex-lethal | (Bell et al. 1991)
Sc | Dbp2 (p68/DDX5 homologue) | (Barta and Iggo 1995)
 | PABP | (de Melo Neto et al. 1995); (Patel et al. 2005)
Sc | NAB2 (nPABP) | (Roth et al. 2005)
Sc | Yra1 (ALY/REF homologue) | (Preker and Guthrie 2006); (Preker et al. 2002); (Dong et al. 2007)
 | DGCR8 (pasha) | (Triboulet et al. 2009); (Han et al. 2009)
Dm | pasha | (Kadener et al. 2009)

Figure 1-6. Simplified model for autoregulation of a splicing-regulatory factor through AS-NMD.

1.5.1.2 Ribosomal proteins, translation factors and other examples

Other proteins involved in pre-mRNA processing and translation are also post-transcriptionally autoregulated. These include factors involved in 3′-end processing (su(f)), mRNA export (Yra1), mRNA editing (ADAR), polyA-tail binding (PABP and NAB2) and mRNA localization and translation (FMRP) (Table 1-1). In addition, several ribosomal proteins in yeast, C. elegans, Xenopus and mammals autoregulate their expression by directly binding their transcripts to regulate splicing and/or translation (Table 1-1) (Caffarelli et al. 1987; Presutti et al. 1991; Dabeva and Warner 1993; Cuccurese et al. 2005; Macias et al. 2008). Another interesting example was recently described for the Microprocessor, a complex comprising Drosha and DGCR8 (also known as pasha) that cleaves hairpins in primary microRNA transcripts during microRNA biogenesis. Through a mechanism that is conserved in Drosophila and mammals, the Microprocessor negatively autoregulates DGCR8 expression by cleavage of a primary microRNA-like hairpin in the DGCR8 5′UTR, resulting in DGCR8 transcript destabilization (Han et al. 2009; Triboulet et al. 2009). These examples are united by a common theme in which the RNA-recognition function of the factor plays dual roles in its homeostatic autoregulation and in the regulation of other RNAs.

1.5.2 Roles of post-transcriptional autoregulation

1.5.2.1 Developmentally regulated AS programs

The Drosophila melanogaster sex determination pathway not only is a prototypical example of the role of AS in developmental control, but also revealed the first examples of splicing factor autoregulation (reviewed in Lopez 1998). At the core of this AS cascade, the splicing regulator sex-lethal autoregulates its expression via AS, and also regulates AS of target transcripts including transformer. Together with transformer-2, which also autoregulates its AS, transformer regulates the AS of the transcription factor double-sex and other target transcripts with roles in sex determination. Thus, a regulatory cascade involving feedback and cross-regulation of splicing factors controls a developmental choice between two states.

Regulatory circuits involving feedback and cross-regulation of splicing factors also play key roles in mammalian developmental transitions. During neuronal and muscle differentiation, these circuits are integrated with microRNA-mediated regulation to orchestrate a post-transcriptional switch between the hnRNP paralogues PTBP1 and PTBP2 to establish tissue-specific AS programs (Boutz et al. 2007a; Boutz et al. 2007b; Makeyev et al. 2007). Auto- and cross-regulation, together with microRNA regulation, also coordinate expression of the CELF and MBNL proteins during heart development (Kalsotra et al. 2008; Kalsotra et al. 2010).

1.5.2.2 Plant circadian oscillations

AS-NMD is prevalent in genes encoding Arabidopsis thaliana SR protein homologues (Filichkin et al. 2010; Palusa and Reddy 2010). Several of these proteins auto- and cross-regulate their expression (Table 1-1). This regulation may play a role in plant abiotic stress response (Filichkin et al. 2010). The A. thaliana glycine-rich RNA-binding proteins AtGRP7 and AtGRP8 exhibit circadian clock-dependent oscillations. Interestingly, the oscillations of these proteins are further regulated through an interlocked negative feedback loop via AS-NMD (Staiger et al. 2003; Schoning et al. 2008). Thus, auto- and cross-regulatory feedback can contribute to oscillatory gene expression patterns.

1.5.2.3 Coordinating gene expression

Negative transcriptional autoregulation can reduce variability in gene expression (Becskei and Serrano 2000) and speed the response time of gene circuits (Rosenfeld et al. 2002). Although the consequences of post-transcriptional autoregulatory network motifs have not been directly examined, one might speculate that they share analogous functions. Such a function may be important in maintaining homeostatic levels of splicing regulatory factors, since inappropriate expression levels of these factors may affect splice site selection or have detrimental effects including oncogenic transformation (Karni et al. 2007) (reviewed in Wang and Cooper 2007). Auto- and cross-regulatory feedback may also play an important role in coordinating the expression of components of macromolecular machines such as the spliceosome or the ribosome.

1.5.3 Sequence and functional conservation

Autoregulated AS events associated with AS-NMD often lie within highly conserved or ultraconserved sequence regions, in accordance with their functional importance. In a landmark comparative genomics study, among 93 known protein-coding genes containing ‘ultraconserved’ elements, defined as ≥200 bp of 100% identity between the human, mouse and rat genomes, 59 genes, including several encoding known AS regulators, contained ultraconserved elements that overlapped with AS events (Bejerano et al. 2004). No explicit link to AS-NMD was made at the time; however, two subsequent studies found that 15 of these AS events overlapping ultraconserved regions were also regulated by NMD (Lareau et al. 2007b; Ni et al. 2007). Moreover, other groups demonstrated experimentally that four of the genes containing ultraconserved elements associated with AS-NMD autoregulated their expression through AS (SRSF1 (ASF/SF2), SRSF3 (SRp20), SRSF7 (9G8) and TRA2B). Notably, additional examples of autoregulated AS-NMD are also associated with highly conserved sequence elements, though these did not pass the strict ‘ultraconservation’ criteria above (Table 1-1) (Lareau et al. 2007b; Ni et al. 2007; Yeo et al. 2007; Saltzman et al. 2008).


In addition to the cross-species sequence conservation of specific gene regions that are post-transcriptionally autoregulated, the use of a similar regulatory strategy in many related and unrelated genes suggests the utility and importance of this type of mechanism. For example, the use of AS-NMD in autoregulation is conserved in yeast and mammals for ribosomal proteins, and is conserved in the plant and animal kingdoms for SR proteins (Table 1-1). In some cases, different mechanisms to autoregulate orthologous genes also appear to have evolved in different species (e.g. ADAR2, some SR and SR-related proteins). Even within the related mammalian SR protein family, exons implicated in AS-NMD were found in nonhomologous positions and did not share sequence similarity among genes. These observations led to a model in which conserved sequences involved in AS-NMD evolved independently in multiple SR protein genes (Lareau et al. 2007b). Further study of the emergence of post-transcriptional autoregulation may shed light on the evolution of post-transcriptional gene regulatory networks.

1.6 Rationale and outline

Transcriptome analysis has shown that most human genes are alternatively spliced. Deciphering the functional consequences of these AS events is a major goal in the genomic era. Many AS events are predicted to introduce PTCs that target the spliced transcript for NMD (AS-NMD). However, the prevalence of AS-NMD, its factor requirements and the repertoire of gene functions regulated by this pathway are not well understood. I begin to address these questions in Chapter 2 by using AS microarray profiling to determine the contributions of three core NMD factors to the degradation of splice variants containing PTCs. Then, in Chapter 3, I examine the sequence properties and gene functions associated with regulated AS-NMD. The data presented in these chapters indicate that a subset of PTC-introducing AS events are regulated by NMD and that AS-NMD can occur through alternative UPF1-dependent branches of the NMD pathway. In addition, I found that many genes encoding core spliceosome components contain highly conserved PTC-introducing alternative exons.

The results in Chapters 2 and 3 suggested that AS-NMD plays a wider role in the expression of splicing factors than previously appreciated, as a mechanism for homeostatic regulation not only of AS regulatory proteins, but of components of the basal splicing machinery as well. In support of this hypothesis, data presented in Chapter 4 show that the core splicing factor SmB/B′ regulates its expression by promoting the inclusion of a conserved PTC-introducing alternative exon in its own pre-mRNA. In Chapter 5, AS profiling using RNA-Seq reveals that SmB/B′ knockdown affects the inclusion of many additional alternative exons that are enriched in genes encoding proteins with functions in RNA metabolism. These results provide new insight into the relatively uncharacterized role of basal splicing factors in AS regulation and suggest a role for this regulation in coordinating the expression of RNA processing factors. More broadly, the work in this thesis supports a model of functional coupling between RNA splicing and decay pathways to control specific gene expression networks.

Chapter 2

Data presented in this chapter are adapted with permission from the following publications:

Pan Q, Saltzman AL, Kim YK, Misquitta C, Shai O, Maquat LE, Frey BJ, Blencowe BJ. 2006. Quantitative microarray profiling provides evidence against widespread coupling of alternative splicing with nonsense-mediated mRNA decay to control gene expression. Genes Dev 20 (2): 153-158. doi: 10.1101/gad.1382806 Copyright © 2006, Cold Spring Harbor Laboratory Press. For reprint, see Appendix 1.

Saltzman AL, Kim YK, Pan Q, Fagnani MM, Maquat LE, Blencowe BJ. 2008. Regulation of multiple core spliceosomal proteins by alternative splicing-coupled nonsense-mediated mRNA decay. Mol Cell Biol 28 (13): 4320-4330. doi: 10.1128/MCB.00361-08 Copyright © 2008, American Society for Microbiology. For reprint, see Appendix 2.

Contributions:

The knockdowns of three UPF factors (siRNA transfection, total RNA isolation and Western blots) were performed by Dr. Yoon Ki Kim (at the lab of Lynne E. Maquat, University of Rochester).

Mining of alternative splicing events and their conservation in ESTs/cDNAs, mapping of premature termination codons, design of AS microarrays and analysis of the mouse AS microarray data were performed by Sandy (Qun) Pan (Blencowe Lab).

The algorithm for analysis of the AS microarrays (GenASAP) was designed by Dr. Ofer Shai (at the lab of Dr. Brendan J. Frey, U of T) and Dr. Quaid D. Morris (then at the labs of Drs. Brendan J. Frey and Timothy R. Hughes, U of T).

2 Impact of nonsense-mediated mRNA decay (NMD) factors on alternative splicing (AS)

2.1 Introduction

2.1.1 Prevalence of AS-NMD

The production of multiple mRNA variants through alternative splicing (AS) represents a widespread mechanism for the expansion of proteomic diversity, and regulated AS plays important roles in many physiological processes (reviewed in Black 2003; Matlin et al. 2005; Blencowe 2006). However, sequence-based predictions have revealed that approximately one third or more of AS events have the potential to introduce a premature termination codon (PTC) that could target the resulting spliced transcript for nonsense-mediated mRNA decay (NMD) (Green et al. 2003; Lewis et al. 2003). This finding led to speculation that AS-NMD may represent a widely used mechanism for the downregulation of gene expression (Hillman et al. 2004; Neu-Yilik et al. 2004; Lejeune and Maquat 2005) and for the tissue-specific regulation of gene expression (Holbrook et al. 2004; Raes and Van de Peer 2005). However, the levels of predicted PTC-containing mRNA splice variants in normal cells or tissues had not previously been measured on a large scale. In addition, it was reported that inhibition of the NMD pathway by siRNA-mediated knockdown of UPF1 in cultured human cells resulted in changes in the levels of ~9% of transcripts detected using mRNA expression microarrays (Mendell et al. 2004). However, the proportion of predicted PTC-containing splice variants affected by NMD inhibition was not determined.

2.1.2 Differential requirements for UPF factors in NMD

Microarray profiling studies in yeast (Lelivelt and Culbertson 1999; He et al. 2003) and Drosophila cells (Rehwinkel et al. 2005) revealed that a common set of transcripts was differentially expressed following inactivation or depletion of UPF1, UPF2 or UPF3. These results were consistent with previous genetic and biochemical evidence that UPF1, UPF2 and UPF3 act in a single, linear pathway in these organisms. In contrast, studies in human cells provided evidence for alternative branches of the NMD pathway. Tethering of EJC components Y14, MAGOH or EIF4A3 to an mRNA downstream of a stop codon activated NMD in a UPF2-independent manner, whereas NMD activated by tethering of the EJC factor RNPS1 required UPF2 (Gehring et al. 2005). In addition, mRNA expression microarray profiling revealed NMD-targeted transcripts that were stabilized upon knockdown of UPF1 but not upon knockdown of UPF3X alone or in combination with its paralogue UPF3 (Chan et al. 2007). These studies suggested that NMD requires UPF1 but for certain transcripts is not dependent on either UPF2 or UPF3X. However, the effects of all three UPF proteins had not been directly compared, and the relative requirements of these proteins in AS-NMD had not been investigated.

2.1.3 Summary

Figure 2-1. Overview of Chapter 2.

In this chapter, AS microarray profiling is used to measure the relative levels of PTC-containing versus non-PTC-containing splice variants in ten diverse mammalian tissues (Figure 2-1). In addition, the proportion of PTC-introducing AS events regulated by NMD and the relative requirements of the core NMD factors UPF1, UPF2 and UPF3X in this regulation are investigated. These experiments reveal that the majority of predicted PTC-containing splice variants are minor isoforms, which are present at low steady-state abundance relative to the non-PTC splice variants arising from the same gene. Only a small proportion of these predicted PTC-introducing AS events display pronounced changes in splice variant levels when NMD is inhibited by knockdown of UPF1. Together with the finding that most of these AS events are not conserved in human and mouse EST/cDNA databases, these results suggest that many of the PTC-containing splice variants predicted by sequence analysis are rare splice variants that are not regulated by NMD and may not be functionally relevant. In addition, comparing the effects of individual knockdowns of UPF1, UPF2 and UPF3X revealed that these factors have overlapping but distinct effects on the degradation of the PTC-containing transcripts arising from AS, indicating that AS-NMD is regulated by different UPF1-dependent mechanisms.

2.2 Materials and Methods

2.2.1 Cell culture, siRNA and plasmid transfection

Human HeLa cells (2 x 10^8) were propagated in DMEM medium (GIBCO-BRL) containing 10% fetal bovine serum (GIBCO-BRL) and transiently transfected with 100 nM siRNA (Dharmacon) using Oligofectamine (Invitrogen). Protein and total RNA were purified three days later. Control, UPF1, UPF2 and UPF3X siRNAs have been described (Kim et al. 2005). For transient protein overexpression, cDNAs encoding SNRPB (SmB), SMNDC1 and DDX42 from the human ORFeome collection (Open Biosystems) were cloned into pMT3989 (Marcia Roy, M. Tyers lab) using Gateway LR Clonase II (Invitrogen). The resulting N-terminally 3xFlag-tagged constructs were transfected into HeLa cells using Lipofectamine 2000 (Invitrogen) or Fugene6 (Roche Applied Science) according to the manufacturer’s instructions. Mouse NIH3T3 cells were maintained in DMEM supplemented with 10% calf serum (GIBCO-BRL). Total RNA was harvested using Trizol Reagent (Invitrogen) according to the manufacturer’s instructions following three hours in the presence of DMSO (0.24%; v/v) or cycloheximide (300 µg/mL in DMSO; Sigma).

2.2.2 RT-PCR assays and Western blotting

RT-PCR assays using primers specific for constitutive exons flanking the alternative exon were performed using 0.2-0.5 ng input polyA+ RNA in a 10 µL reaction using the OneStep RT-PCR kit (Qiagen). Reactions were performed with or without addition of α-32P-dCTP (PerkinElmer). Reaction products for non-radioactive RT-PCR were separated on a 2% agarose gel, stained with ethidium bromide and quantified using Quantity One (Bio-Rad). For radioactive RT-PCR, reaction products were separated on a 5% polyacrylamide gel. Dried gels were visualized by scanning of a phosphor imaging screen after 6-18 hours exposure (Molecular Imager FX, Bio-Rad or Typhoon, GE Healthcare). Bands were quantified using Quantity One (Bio-Rad) or ImageQuant (GE Healthcare). Western blotting for UPF factors was performed as described (Kim et al. 2005).

2.2.3 Microarray design and hybridization

Cassette AS events were identified in cDNA and EST sequences and filtered as described (Pan et al. 2004; Pan et al. 2005). Cassette and AS events in human transcripts were mined from UniGene Build #158 (ftp://ftp.ncbi.nih.gov/repository/UniGene/) and human genome sequence data (ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/). Probes representing 3055 human AS events (of which 2923 were unique) were designed as described (Pan et al. 2004; Hughes 2006) and printed on a 22K microarray (Agilent Technologies). Cy3- and Cy5-labelled cDNAs were prepared from polyA+ RNA in duplicate with fluor-reversal and hybridized to Agilent microarrays as described (Pan et al. 2004; Hughes 2006).

2.2.4 Microarray data analysis

Microarrays were scanned and signal intensities were detrended and normalized as described (Zhang et al. 2004). Relative inclusion levels and confidence ranks of alternative exons monitored on the microarray were determined using the ‘Generative model for the Alternative Splicing Array Platform’ (GenASAP) algorithm (Shai et al. 2006). For human AS microarray data, analysis was limited to events with confidence ranks in the top half of the data in at least two of the six samples, representing 1704 of the 2923 unique AS events on the microarray. Correlations of probe intensities and % exon inclusion between fluor reversals for six HeLa samples (three control siRNA, three knockdown siRNA) are shown in Appendix 3. Correlations of AS duplicates on the same array are shown in Appendix 4. Correlations between microarray and RT-PCR measurements of % exon inclusion are shown in Appendix 5. Microarray data for six HeLa samples are in Appendix 6 and information for these AS events is in Appendix 7. Analysis of percent exon inclusion levels for 3126 AS events in ten mouse tissues was performed using data described previously (Pan et al. 2004).
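
The filtering criterion described above can be illustrated with a minimal Python sketch. It assumes that each AS event has one confidence rank per sample and that rank 1 denotes the most confident event; this convention, the function name and the example ranks are assumptions made for illustration and are not taken from GenASAP.

```python
def passes_confidence_filter(ranks, n_events, min_samples=2):
    """Return True if an AS event is in the top half of the ranked data
    in at least `min_samples` samples.

    ranks    -- confidence rank of the event in each sample
                (assumed convention: 1 = most confident)
    n_events -- total number of ranked AS events on the microarray
    """
    cutoff = n_events / 2  # boundary of the top half of the ranking
    n_confident = sum(rank <= cutoff for rank in ranks)
    return n_confident >= min_samples


# Hypothetical ranks for one event across six samples, 2923 events in total.
print(passes_confidence_filter([10, 2000, 2500, 100, 2600, 2700], 2923))
```

Requiring support from at least two samples means that a single high-confidence measurement is not sufficient for an event to be retained.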

2.2.5 Annotation of PTC-introducing AS events

Open reading frame (ORF) information for genes represented on the microarray was obtained from the header line of the corresponding cDNA sequences in UniGene (ftp://ftp.ncbi.nih.gov/repository/UniGene/). Alternative splicing events were then separated into categories as follows (counts are given for the entire human AS microarray, followed by the count for events passing our filtering criteria in brackets): of 2171 (1360) unique events mapping to a cDNA with a known ORF, 434 (249) were excluded from further PTC analysis because the alternative exon was upstream of, or overlapping with, the start codon, or because the termination codon was not located in the last exon. Termination codons were mapped to the ORF sequences in relation to the splice junctions for the remaining 1737 (1111) events. After applying the “50-nucleotide rule” (Nagy and Maquat 1998), 735 (495) of the AS events had potential PTCs. Of these, 282 (164) were PTC upon inclusion and 453 (331) were PTC upon skipping events. Gene identifiers, aliases and gene ontology annotation (Appendix 7) were obtained manually or using SOURCE (http://source.stanford.edu) (Diehn et al. 2003) or Gene/Clone ID Converter (http://idconverter.bioinfo.cipf.es/) (Montaner et al. 2006). For the mouse AS microarray, of 1522 AS events, 318 were excluded from PTC analysis as described above. After applying the “50-nucleotide rule” for the remaining 1204 AS events, 172 were PTC upon inclusion and 342 were PTC upon skipping events.
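
The “50-nucleotide rule” can be sketched in Python as follows. This is a minimal illustration, not the annotation pipeline used here: the function name, the 0-based mRNA coordinates and the way the distance is measured (from the end of the stop codon to the final exon-exon junction) are assumptions.

```python
def is_premature_stop(stop_end, exon_lengths, min_distance=50):
    """Apply the 50-nucleotide rule to one spliced transcript.

    stop_end     -- 0-based mRNA coordinate one past the last nucleotide
                    of the termination codon
    exon_lengths -- lengths of the exons in transcript order
    Returns True if the stop codon ends at least `min_distance` nt
    upstream of the final exon-exon junction.
    """
    if len(exon_lengths) < 2:
        return False  # no splice junction, so the rule does not apply
    # Coordinate of the first nucleotide of the last exon, i.e. the
    # position of the final exon-exon junction in the spliced mRNA.
    last_junction = sum(exon_lengths[:-1])
    return last_junction - stop_end >= min_distance


# Hypothetical three-exon transcript: the final junction is at position 220.
print(is_premature_stop(150, [100, 120, 200]))  # stop ends 70 nt upstream
print(is_premature_stop(200, [100, 120, 200]))  # stop ends 20 nt upstream
```

To annotate a cassette exon, the same check would be applied to both the included and the skipped transcript, each with its own exon lengths and stop codon position.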

2.2.6 Categorization of conserved and species-specific alternative exons

Two “probe” sequences were designed to determine whether an AS event is conserved between human and mouse or specific to one species: (1) a 100-mer ‘skipped’ probe consisting of 50 nucleotides of the constitutive exon upstream of the alternative exon fused to 50 nucleotides of the constitutive exon downstream of the alternative exon; and (2) a longer ‘included’ probe spanning the same region but with the alternative exon sequence included in the middle. These two probes were used to BLAST-search the orthologous expressed sequences using an e-value of 10^-3. A ‘skipped’ isoform was identified if there was a match to the middle 40 or more nucleotides of the ‘skipped’ probe. An ‘included’ isoform was identified if the alignment spanned the two splice sites flanking the alternative exon. A conserved AS event was identified if the ‘included’ and ‘skipped’ isoforms were detected in both human and mouse.
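
The construction of the two probes and the criterion for a ‘skipped’ match can be sketched as below. This is only an illustration under stated assumptions: the function names are hypothetical, alignment coordinates are taken to be 0-based half-open intervals on the probe, and the BLAST search itself is not shown.

```python
def build_probes(upstream_exon, alt_exon, downstream_exon, flank=50):
    """Return the (skipped, included) probe sequences for one cassette exon."""
    up = upstream_exon[-flank:]     # last 50 nt of the upstream constitutive exon
    down = downstream_exon[:flank]  # first 50 nt of the downstream constitutive exon
    skipped = up + down             # 100-mer spanning the skipping junction
    included = up + alt_exon + down # same region with the alternative exon
    return skipped, included


def covers_middle(hit_start, hit_end, probe_length=100, middle=40):
    """True if an alignment [hit_start, hit_end) covers the central
    `middle` nucleotides of the 'skipped' probe."""
    mid_start = (probe_length - middle) // 2
    mid_end = mid_start + middle
    return hit_start <= mid_start and hit_end >= mid_end


# Hypothetical exon sequences.
skipped, included = build_probes("A" * 80, "G" * 30, "C" * 70)
print(len(skipped), len(included))
```

Requiring the match to cover the central region of the ‘skipped’ probe ensures that the alignment crosses the exon-exon junction rather than matching only one of the flanking exons.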

2.3 Results

2.3.1 Predicted PTC-containing splice variants represent minor isoforms across ten mouse tissues

To measure the inclusion levels of alternative exons whose inclusion or skipping is predicted to introduce a PTC, AS microarray profiling data from ten diverse mouse tissues were used (Pan et al. 2004). For 1204 AS events on the mouse AS microarray, open reading frames were mapped computationally by Sandy Pan. The AS events were separated into three categories based on whether inclusion or skipping of the alternative exon introduced a PTC (‘PTC upon inclusion’, n=172; ‘PTC upon skipping’, n=342) or whether neither splice variant contained a PTC (‘No PTC’, n=690) (Figure 2-2). Nonsense codons were considered premature if they occurred ≥50 nt upstream of the final splice junction, an empirically defined rule for mammalian NMD (Nagy and Maquat 1998).

Alternative exon inclusion levels for PTC upon inclusion, PTC upon skipping, and No PTC groups are shown in heat map and cumulative distribution function plots (Figure 2-2). Alternative exons introducing a PTC upon inclusion are overall highly skipped across the ten tissues (median 19% inclusion), whereas those introducing a PTC upon skipping are highly included (median 78% inclusion). These median inclusion levels are both significantly different from the non-PTC-introducing events (median 70% inclusion; p (PTC inclusion vs. no PTC) <10^-80; p (PTC skipping vs. no PTC) =5*10^-62, Wilcoxon rank sum test; Figure 2-2). For PTC upon inclusion AS events, the skipped variant is the major form in at least half of the ten tissues in 82% (141) of events, and is the major form across all ten tissues in 46% (79) of events. Conversely, for PTC upon skipping AS events, the included variant is the major form in at least half of the ten tissues in 93% (317) of events, and is the major form across all ten tissues in 62% (210) of events. Therefore, the AS microarray data show that computationally predicted PTC-containing splice variants generally represent minor isoforms across ten diverse mouse tissues.
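
The per-event tallies above (major form in at least half of the tissues, or in all ten) can be illustrated with a short Python sketch. It assumes that a variant is the major form when it accounts for more than 50% of the transcripts; this threshold, the function name and the inclusion values are assumptions for illustration only.

```python
from statistics import median


def ptc_variant_minor_count(inclusion_by_tissue, ptc_upon):
    """Count the tissues in which the PTC-containing variant is the minor form.

    inclusion_by_tissue -- % exon inclusion in each tissue
    ptc_upon            -- "inclusion" or "skipping"
    """
    if ptc_upon == "inclusion":
        # The included variant carries the PTC: minor when inclusion < 50%.
        return sum(pct < 50 for pct in inclusion_by_tissue)
    if ptc_upon == "skipping":
        # The skipped variant carries the PTC: minor when inclusion > 50%.
        return sum(pct > 50 for pct in inclusion_by_tissue)
    raise ValueError("ptc_upon must be 'inclusion' or 'skipping'")


# Hypothetical % inclusion values for one PTC upon inclusion event.
example = [12, 20, 8, 15, 30, 55, 18, 10, 25, 60]
n_minor = ptc_variant_minor_count(example, "inclusion")
print(n_minor, n_minor >= len(example) / 2, median(example))
```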

Figure 2-2. Alternative splicing microarray data reveal that predicted PTC-introducing splice variants represent minor forms across ten mouse tissues. (A) The % exon inclusion for AS events that introduce a PTC upon inclusion (left panel), a PTC upon skipping (middle panel) or that do not introduce a PTC (right panel) is represented using the indicated color scale. Each row represents an AS event. Rows are ordered (top to bottom) by the average % exon inclusion across the tissues. Each column represents a tissue (from left to right: brain, heart, intestine, kidney, liver, lung, muscle, salivary, spleen, and testis). (B) Cumulative distribution function (CDF) plots to compare the % exon inclusion among the PTC categories. The p-values (Wilcoxon rank sum test) compare the % inclusion between the PTC upon inclusion and No PTC categories (blue text), or between the PTC upon skipping and No PTC categories (red text). The three heavier lines show the CDFs for the given PTC category with data from the ten tissues pooled together. The thinner, lighter lines show the CDFs for each PTC category in each tissue individually (there are ten thin lines for each PTC category).

To validate the % inclusion values predicted by the AS microarray, representative PTC-introducing AS events were selected for measurement of % inclusion by RT-PCR (Figure 2-3). These assays complement extensive validation of % inclusion for non-PTC-introducing AS events previously performed by others in the lab (Pan et al. 2004). Two groups of AS events were selected, where either (1) the PTC-containing variant is predicted to be the minor isoform across all ten tissues (Figure 2-3A), or (2) the PTC-containing variant is predicted to be the major isoform in one or more tissues (Figure 2-3B, C). For the first group, eight AS events were selected and RT-PCR assays were performed for all ten mouse tissues. The % exon inclusion measured by RT-PCR agreed well with the microarray data in all cases (five representative examples are shown in Figure 2-3A). For the second group, ten events were assayed by RT-PCR (six representative examples are shown in Figure 2-3B, C). However, the predicted PTC-containing variant was found to be more abundant than the non-PTC-containing variant in only three of these ten events (Figure 2-3B). For two AS events, this unexpected pattern of % inclusion was found only in the testes (AS#2033, 2103). Only one AS event in which the PTC-introducing variant was predicted by the microarray to be the major variant across all tissues was validated by RT-PCR (AS#252; Figure 2-3B). Notably, none of seven AS events for which the microarray predicted that a PTC upon skipping variant was the major isoform in one or more tissues were confirmed by RT-PCR (three examples are shown in Figure 2-3C). For all seven such AS events, the PTC-containing skipped variant was instead found to be the minor isoform across all ten tissues (Figure 2-3C). In summary, these RT-PCR assays confirm that PTC-containing splice variants are generally minor isoforms, and suggest that, particularly for PTC upon skipping events, tissue-specific changes in their inclusion levels are even more rare than already predicted by the AS microarray.

To assess the role of NMD in degrading the predicted PTC-containing splice variants, mouse NIH3T3 cells were treated with the translation inhibitor cycloheximide (CHX) to inhibit NMD or mock-treated with DMSO (Control) and RT-PCR assays were performed (Figure 2-3). In the presence of CHX, an increase in the relative abundance of the PTC-containing splice variant was observed for four of six tested AS events (Figure 2-3A and AS#252 in Figure 2-3B). However, no increase was observed for AS#2033 or #4341 (Figure 2-3B). It is therefore possible that these AS events already generate low levels of PTC-containing splice variants, independently of NMD. This possibility will be investigated on a larger scale below.

Figure 2-3. Representative RT-PCRs of PTC upon inclusion and PTC upon skipping AS events in ten mouse tissues. Arrows to the right of each gel indicate the expected sizes of included and skipped products. The color bar below each gel shows the % inclusion predicted by the microarray (upper color) or measured by RT-PCR (lower color). Sequencing was used to confirm the identity of a subset of the bands. (A) All AS events where the predicted PTC-containing variant is at low relative levels across the ten tissues were confirmed by RT-PCR. (B, C) AS events where the microarray data predict that the PTC-containing splice variant is more abundant in one or more tissues. (B) Three such PTC upon inclusion events (AS# 2033, 2103, 252) are confirmed by RT-PCR. The upper band (●) in #252 was sequenced and is also predicted to introduce a PTC. (C) No such PTC upon skipping events (AS# 4341, 3115, 2519) are confirmed. RT-PCR indicates instead that the included, non-PTC-containing variant is more abundant across all tissues, as in (A). The upper band (+) in #2519 may represent a transcript retaining an intron.

2.3.2 Most predicted PTC-introducing AS events are not conserved between human and mouse

To assess the functional importance of predicted PTC-introducing AS events, the proportion of these AS events that are conserved in transcripts from orthologous human and mouse gene pairs was determined and compared to the proportion for non-PTC-introducing AS events (Figure 2-4). Human cassette-type AS events (~1800) were mapped to orthologous mouse ESTs/cDNAs and divided into three categories of AS conservation established previously (Modrek and Lee 2003; Pan et al. 2005; Yeo et al. 2005) as follows (Figure 2-4): (1) ‘conserved AS’: sequence-conserved alternative and flanking exons detected in both human and mouse transcripts; (2) ‘species-specific AS of conserved exons’: a sequence-conserved exon that is alternatively spliced in human but constitutively included in orthologous mouse transcripts; (3) ‘genome-specific AS’: an alternative exon in human transcripts that is not detected in the orthologous mouse transcripts.
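
The three conservation categories can be expressed as a simple decision rule. The Python sketch below assumes that the human event has already been established as a cassette AS event (both isoforms detected in human transcripts) and that the mouse evidence is summarized as two boolean flags; the function name and this representation are hypothetical.

```python
def classify_conservation(mouse_included, mouse_skipped):
    """Assign a human cassette AS event to one of three conservation categories.

    mouse_included -- the orthologous exon is detected in mouse transcripts
    mouse_skipped  -- a transcript skipping the exon is detected in mouse
    """
    if mouse_included and mouse_skipped:
        # Both isoforms are seen in mouse as well as in human.
        return "conserved AS"
    if mouse_included:
        # Exon present in mouse transcripts but never observed to be skipped.
        return "species-specific AS of conserved exons"
    # The alternative exon is not detected in the orthologous mouse transcripts.
    return "genome-specific AS"


print(classify_conservation(mouse_included=True, mouse_skipped=False))
```

Because the categories depend on what is observed in EST/cDNA collections, an event could be reassigned as transcript coverage of either species increases.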

The majority of PTC-introducing AS events are not conserved between human and mouse transcripts (Figure 2-4). Eighty-eight percent of PTC upon inclusion events represent genome-specific AS (cf. 11% conserved AS, p<10^-60, chi-square). Similarly, 91% of PTC upon skipping AS events represent species-specific AS of conserved exons (cf. 9% conserved AS, p=3.3*10^-15, chi-square). The PTC-introducing AS events also represent a larger proportion of both types of species-specific AS (42% of species-specific AS of conserved exons, p=4*10^-8; 62% of genome-specific AS, p<10^-60; chi-square) than of conserved AS (24%). These results suggest that the majority of predicted PTC-containing splice variants, which are not conserved, may not represent functionally important transcripts. However, the small proportion (~10%) of predicted PTC-introducing events that are conserved may be physiologically relevant and potentially subject to regulation. The conservation of these AS events at the nucleotide level will also be discussed in the next chapter.

Figure 2-4. Predicted PTC-introducing AS events are more often species-specific than conserved between human and mouse. Percentages of AS events predicted to be PTC-introducing or non-PTC-introducing in three AS conservation categories are shown. Diagrams above the bar graph depict the splice variant pattern in human and mouse transcripts (EST/cDNA sequences) for each conservation category.

2.3.3 Alternative splicing microarray profiling following knockdown of the essential NMD factor UPF1 in HeLa cells

To examine what proportion of computationally predicted PTC-containing splice variants are regulated by NMD, I performed AS microarray profiling on RNA from HeLa cells following siRNA knockdown of the NMD factor UPF1, and on RNA from cells transfected with a non-specific siRNA as a control (Figure 2-5). UPF1 protein levels were depleted to approximately 5% of the level in the control knockdown (Figure 2-5A; knockdown and Western blot performed by Yoon Ki Kim). Fluor-labeled cDNAs were hybridized to custom microarrays containing sets of exon body and splice junction probes for monitoring the inclusion levels of approximately 3000 human cassette alternative exons mined from EST and cDNA sequence alignments. The PTC upon inclusion and PTC upon skipping AS events represent 15% (n=164) and 30% (n=331), respectively, of the profiled AS events for which sufficient ORF information was available to map PTCs, and which also met our detection criteria (total events=1111). As observed in the ten mouse tissue AS microarray data (Figure 2-2), PTC upon inclusion events are overall highly skipped (median 19% inclusion), whereas PTC upon skipping events are highly included (median 80% inclusion) in HeLa cells (Figure 2-5B). Therefore, PTC-containing splice variants represent minor isoforms in diverse cell types from different mammalian species.

2.3.4 A subset of PTC-introducing AS events are regulated by NMD

To compare the effects of UPF1 knockdown on PTC-introducing versus non-PTC-introducing AS events, the frequency and direction of AS level changes in the UPF1 knockdown relative to the control knockdown were examined (Figure 2-5). The AS changes are enriched in predicted PTC-introducing AS events (p=5*10^-46 for 10% difference in exon inclusion; Fisher’s exact test) (Figure 2-5C). Approximately 10% of the predicted PTC-introducing AS events show a 15% or greater AS change upon UPF1 knockdown, whereas 30-40% display a 5% or greater AS change (Figure 2-5C). The direction of the majority (>80%; p<0.01, Fisher’s exact test) of these changes represents an increase in the level of the PTC-containing splice variant upon knockdown of UPF1. This trend is also seen in a cumulative distribution plot of the % inclusion changes upon UPF1 knockdown, in which the median AS change for the PTC upon inclusion and PTC upon skipping events show significantly more inclusion or more skipping, respectively, relative to the No PTC events (Figure 2-5D, top panel).
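
For readers unfamiliar with the enrichment test used above, the following Python sketch computes a one-sided Fisher’s exact test p-value for a 2x2 table directly from the hypergeometric distribution. Whether the reported p-values are one-sided or two-sided is not stated here, so the one-sided form is an assumption, and the table in the example is a small hypothetical one rather than the counts underlying Figure 2-5.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows could be PTC-introducing vs. No PTC events, and columns events
    with vs. without an AS change. Returns the probability of observing
    `a` or more events in the top-left cell when all margins are fixed.
    """
    row1 = a + b            # e.g. total PTC-introducing events
    col1 = a + c            # e.g. total events with an AS change
    n = a + b + c + d
    upper = min(row1, col1)  # largest possible top-left count
    numerator = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return numerator / comb(n, col1)


# Hypothetical small table, for illustration only.
print(fisher_exact_greater(3, 1, 1, 3))
```

Because it uses exact integer binomial coefficients, this approach also remains accurate for very small p-values of the kind reported above, although the final division is limited by floating-point range.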

Figure 2-5. Knockdown of the essential NMD factor UPF1 leads to an increase in a subset of PTC-containing splice variants. (A) Western blots show the siRNA-mediated knockdown of UPF1 in HeLa cells. Control cells were transfected with a non-specific siRNA. The protein lysates were probed with an antibody specific for UPF1 and with an anti-Calnexin antibody as a loading control. Serial three-fold dilutions of protein in the left three lanes demonstrate that the Western blot is semi-quantitative and allow an estimation of the UPF1 depletion efficiency at ~95%. (B) The % inclusion for AS events that introduce a PTC upon inclusion (left panel), a PTC upon skipping (middle panel) or that do not introduce a PTC (right panel) is represented using the indicated color scale. Each row represents an AS event. Rows are ordered (top to bottom) by the % inclusion in the control knockdown. (C) The proportion of AS events in the three PTC categories (y-axis) showing the indicated change in % inclusion (x-axis) in the UPF1 kd relative to the control kd is shown in a bar graph. (D) The difference in % inclusion (upper panel) and ratios of transcript levels (lower panel) between the UPF1 kd and Control samples for the three PTC categories are displayed in cumulative distribution function (CDF) plots. The p-values (Wilcoxon rank sum test) compare the median change in % inclusion (upper panel) or ratio of transcript level (lower panel) between the PTC upon inclusion and No PTC categories (blue text), or between the PTC upon skipping and No PTC categories (red text).

To confirm the effects of UPF1 knockdown on AS levels, I performed RT-PCR assays on both PTC-introducing and non-PTC-introducing AS events that display a range of UPF1-dependent changes in % exon inclusion (Figure 2-6). There is a close correlation between the results obtained from the microarray data and the RT-PCR assays (Figure 2-6; see also Appendix 5). The direction of the change in % inclusion predicted from the microarray data was validated by RT-PCR for 22/26 (85%) of the reactions. A smaller proportion (relative to PTC-introducing events) of the ‘No PTC’ AS events also display changes upon UPF1 knockdown (Figure 2-5 and Figure 2-6). However, unlike the PTC-introducing AS events, increased exon inclusion or skipping were observed at similar frequencies. Further investigation of the UPF1-dependent AS changes in ‘No PTC’ events would be necessary to determine if these result from indirect effects of UPF1 knockdown, or, alternatively, if these transcripts have other features that target them for NMD.

Overall, the observed UPF1 knockdown-dependent AS changes are consistent with a role for NMD in degrading PTC-containing splice variants. However, these results also indicate that only a subset (at least 10%, and perhaps up to 40%) of predicted PTC-containing splice variants are regulated by NMD, while the majority of these splice variants may be at low levels independent of NMD.

2.3.5 Effect of UPF1 knockdown on the expression of genes containing PTC-introducing AS events

To examine the role of NMD in reducing the expression of genes containing PTC-introducing AS events, the effect of UPF1 knockdown on steady-state transcript levels was measured. Steady-state transcript levels were estimated using the average of the signals from the two flanking constitutive exon microarray probes. The change in transcript level, expressed as a ratio of the transcript level in the UPF1 knockdown versus the control, was plotted for the PTC-introducing and non-PTC-introducing AS events (Figure 2-5D, lower panel). The change in transcript level was also estimated by RT-PCR, using the ratio of the sums of the included and skipped bands in the UPF1 knockdown versus the control (Figure 2-6). The direction of change in transcript level predicted by the microarray data was validated for 22/26 (85%) tested AS events (Figure 2-6).
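
The two transcript-level estimates described above, together with % exon inclusion from RT-PCR band intensities, reduce to simple arithmetic. The Python sketch below is illustrative only; the function names and the intensity values are hypothetical, and normalization and background correction are assumed to have been done beforehand.

```python
def percent_inclusion(included, skipped):
    """% exon inclusion from the included and skipped band intensities."""
    return 100 * included / (included + skipped)


def microarray_transcript_level(upstream_probe, downstream_probe):
    """Transcript level as the mean of the two flanking constitutive exon probes."""
    return (upstream_probe + downstream_probe) / 2


def rtpcr_fold_change(kd_included, kd_skipped, ctrl_included, ctrl_skipped):
    """Ratio of total (included + skipped) signal, knockdown vs. control."""
    return (kd_included + kd_skipped) / (ctrl_included + ctrl_skipped)


# Hypothetical band intensities for one PTC upon inclusion event.
ctrl_inc, ctrl_skip = 20.0, 80.0
kd_inc, kd_skip = 40.0, 80.0
change = percent_inclusion(kd_inc, kd_skip) - percent_inclusion(ctrl_inc, ctrl_skip)
print(change, rtpcr_fold_change(kd_inc, kd_skip, ctrl_inc, ctrl_skip))
```

In this hypothetical case the PTC-containing variant doubles while the total signal rises by only a fifth, which illustrates how a clear change in % inclusion can coincide with a modest change in overall transcript level.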

Figure 2-6. Changes in % exon inclusion and transcript levels upon UPF1 knockdown predicted by the AS microarray are confirmed by RT-PCR. The assayed AS events in the PTC upon inclusion (left panel), PTC upon skipping (middle panels), and No PTC (right panel) categories are ordered from left to right by the microarray-predicted change in % inclusion in the UPF1 kd relative to the control (more skipping to more inclusion). The color bars below the gels represent the change in % inclusion (top), and the fold change in transcript level (bottom; UPF1 kd vs. Control) quantified from the microarray and RT-PCR data as indicated. RT-PCR assays were performed using primers specific for the constitutive exons flanking each cassette alternative exon, and products corresponding to the expected sizes of the included and skipped splice variants are indicated by arrows.

In contrast to the changes in AS levels described above, the UPF1 knockdown-dependent changes in transcript level for the PTC-introducing AS events were not significantly different from the non-PTC-introducing AS events (Figure 2-5D, lower panel). Similar results were obtained when comparing the change in transcript levels for only the subset of AS events that show a change in % exon inclusion upon UPF1 knockdown (data not shown). Among the PTC-introducing AS events assayed by RT-PCR that show an increase in the PTC-containing splice variant upon UPF1 knockdown, some also show an overall increase in transcript level (e.g. AS events #140, 1920, 1268, 470, Figure 2-6), whereas others show no change or a slight decrease in overall transcript level. For this second group, it is likely that some AS events do not show a transcript level increase since the PTC-containing splice variant is at very low levels (is a minor isoform) in the control knockdown, and its increase upon UPF1 knockdown is not large enough to change the overall transcript levels. For other AS events, the increase in the PTC-containing splice variant appears to be accompanied by a decrease in the non-PTC-containing splice variant, such that the overall transcript levels are not significantly altered (e.g. AS events #2378, 1188, 1238, 263, 462, Figure 2-6). Further investigation would be necessary to determine the reason for these changes, which could potentially include feedback effects on AS. In summary, UPF1 knockdown results in increased levels of PTC-containing splice variants, but variable effects on overall transcript levels. As a result, the AS level changes observed for PTC-introducing AS events upon UPF1 knockdown are not correlated with increases in transcript levels.

2.3.6 Alternative splicing microarray profiling following individual knockdowns of NMD factors UPF1, UPF2 or UPF3X

To assess the relative requirements of the three core UPF proteins UPF1, UPF2 and UPF3X in AS-NMD, I performed AS microarray profiling on RNA from cells transfected with siRNAs to individually knock down these factors, or transfected with a non-targeting siRNA as a control. The siRNAs were previously reported to specifically and efficiently target these factors (Kim et al. 2005). Based on Western blots, the knockdown levels of UPF1, UPF2 and UPF3X were estimated to be between ~5% and 30% of the levels of the proteins detected in the corresponding control lysates (Figure 2-7A; for UPF1 knockdown, please see Figure 2-5A).

To initially examine the effects of UPF1, UPF2, or UPF3X knockdown on AS, the sets of AS events affected by the knockdowns were compared. The overall frequency of AS level changes in all microarray-profiled AS events passing the filtering criteria described in the Methods (n=1704) was similar in the UPF1 and UPF3X knockdowns, and only slightly lower in the UPF2 knockdown (Appendix 8 and Appendix 10). Alternative splicing events affected by any one of the three knockdowns were significantly more likely than expected by chance to be similarly affected (more skipping or more inclusion) by knockdown of one or both of the other two factors (p<2×10^-12, Fisher’s exact test; Appendix 8). Among the subset of AS events similarly affected when comparing any pair of knockdowns, on average approximately half (minimum of 26% and maximum of 86%) were also similarly affected by the third knockdown (Appendix 8 and data not shown). In summary, there were significant overlaps between sets of AS events affected by pairs of knockdowns, variable proportions of which were affected by all three knockdowns. The effects of the UPF1, UPF2 and UPF3X knockdowns on PTC-introducing AS events are investigated in further detail below.

Figure 2-7. Overlapping but distinct effects of UPF protein knockdowns on PTC-introducing AS events. (A) Western blots of HeLa cell lysates following siRNA-mediated UPF factor knockdowns. Western blots were probed with UPF2- or UPF3-specific antibodies (upper panels) and with calnexin- or beta-actin-specific antibodies as loading controls (lower panels). Serial threefold dilutions of control lysates are shown in the left panels. An estimation of the knockdown efficiency following targeting siRNA treatment relative to the control siRNA treatment is shown below the gels (right panels). For a Western blot showing knockdown of UPF1, please see Figure 2-5. (B) The change in % alternative exon inclusion is shown for AS events that introduce a PTC upon inclusion (left panels) or upon skipping (right panels) and for which I detected at least a 5% change in % inclusion upon UPF1 knockdown. The change in % inclusion is calculated for each AS event as the % inclusion in the UPF1, UPF2 or UPF3X knockdown minus the % inclusion for the same AS event in the corresponding control siRNA treatment. This change is represented by the yellow-black-cyan color scale shown, where brighter yellow indicates more inclusion in the knockdown than in the control and brighter cyan indicates more skipping in the knockdown than in the control. Black indicates no change in % inclusion level between the knockdown and the control. The rows are ordered according to the median of the change in % inclusion across the three knockdowns. The change in % inclusion calculated from RT-PCR assays performed for a subset of the data is shown in the panels to the right of the microarray data.

2.3.7 Overlapping but distinct effects of UPF1, UPF2 and UPF3X knockdowns on PTC-introducing AS events

To compare the effects of the UPF1, UPF2 and UPF3X knockdowns on AS-NMD, the changes in inclusion levels of PTC-introducing alternative exons were examined (Figure 2-7). As observed for the larger set of AS events in the previous section, there were significant overlaps in the sets of PTC-introducing events affected by pairs of UPF factor knockdowns (Figure 2-7, Appendix 9 and data not shown). My previous results had also shown that AS level changes upon UPF1 knockdown were enriched for PTC-introducing AS events, and that ~90% of these changes were consistent with a loss of NMD activity (i.e. UPF1 knockdown resulted in an increase in the abundance of the PTC-containing splice variant; Figure 2-5). Therefore, the AS changes in the three knockdowns for the PTC-introducing AS events that display at least a 5% change in inclusion level following UPF1 knockdown were examined in further detail (Figure 2-7; PTC upon inclusion: 40% of 164 events; PTC upon skipping: 30% of 331 events).
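The selection and ordering used for this comparison (change in % inclusion defined as knockdown minus control, a 5% threshold applied to the UPF1 knockdown, and rows ordered by the median change across the three knockdowns) can be illustrated with a minimal Python sketch. The event names and % inclusion values are hypothetical and are not data from this study; the use of a single shared control and the ascending sort order are simplifying assumptions.

```python
from statistics import median

# Hypothetical % inclusion values, for illustration only.
control = {"event_A": 20.0, "event_B": 55.0, "event_C": 80.0}
knockdowns = {
    "UPF1": {"event_A": 42.0, "event_B": 57.0, "event_C": 68.0},
    "UPF2": {"event_A": 30.0, "event_B": 54.0, "event_C": 79.0},
    "UPF3X": {"event_A": 35.0, "event_B": 56.0, "event_C": 70.0},
}

def inclusion_changes(control, knockdowns):
    """Return {event: {factor: knockdown % inclusion - control % inclusion}}."""
    return {
        event: {factor: values[event] - ctrl for factor, values in knockdowns.items()}
        for event, ctrl in control.items()
    }

def select_and_order(changes, factor="UPF1", threshold=5.0):
    """Keep events whose absolute change in `factor` is at least `threshold`
    percentage points, ordered by the median change across all knockdowns."""
    kept = [e for e, c in changes.items() if abs(c[factor]) >= threshold]
    return sorted(kept, key=lambda e: median(changes[e].values()))

changes = inclusion_changes(control, knockdowns)
ordered = select_and_order(changes)
print(ordered)
```

In this toy example, event_B is dropped because its change upon UPF1 knockdown is below the threshold, and the remaining events are listed from more skipping to more inclusion.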

While some PTC-introducing AS events are similarly affected by all three knockdowns, others are more affected by only two knockdowns (e.g. UPF1 and UPF3X, or UPF1 and UPF2) or only show a pronounced effect upon UPF1 knockdown (Figure 2-7). Several observations argue that these overlapping but distinct effects of the three UPF factors are not a result of differences in the efficiencies of the siRNA knockdowns. First, although AS changes for PTC-introducing events were more often observed upon UPF1 knockdown than upon UPF2 or UPF3X knockdown (Figure 2-7 and Appendix 9), the frequency of changes in non-PTC-introducing AS events was similar for the three knockdowns (Appendix 10). Second, more pronounced changes upon UPF2 or UPF3X than upon UPF1 knockdown are observed for a subset of AS events (Appendix 8 and Appendix 9). These results therefore provide evidence that AS-NMD has differential dependencies on the three core NMD factors.

To confirm the differential effects of the UPF protein knockdowns predicted by the microarray, RT-PCR assays were performed using primers specific for the constitutive exons flanking the PTC-introducing alternative exons. Quantifications of RT-PCR data are shown next to the microarray data in Figure 2-7, and representative assays for twelve AS events (out of 128 across six samples) are shown in Figure 2-8. The % exon inclusion measured by RT-PCR assays agrees very well with the microarray data (R^2=0.8-0.9, Appendix 5; all RT-PCR quantifications are also included in Appendix 7). For knockdown-dependent AS level changes predicted by the microarray data to be at least 5% different in exon inclusion level, RT-PCR assays showed a change in the same direction in 83% of cases tested (Appendix 5). Overall, the RT-PCR assays were consistent with the distinct and overlapping effects of the three knockdowns observed in the microarray data.

Figure 2-8. Representative RT-PCR assays showing effects of UPF protein knockdowns on levels of PTC-introducing alternative exons. RT-PCR assays were performed using primers specific for the constitutive exons flanking the alternative exons. RNA samples analyzed are indicated above the lanes and correspond to cells transfected with either a control siRNA (-) or with an siRNA specific for the indicated UPF factor (+). Arrows to the right of each gel indicate expected sizes of included and skipped products. The color bar below each gel panel shows the quantification of the knockdown-dependent changes in % skipping predicted by the microarray (upper color), and measured by RT-PCR (lower color). The change in % skipping is shown using the same color scale as in Figure 2-7. For details on these AS events see Appendix 7 (PTC upon inclusion AS events #3, 146, 957, 557, 2315, and 2372; PTC upon skipping AS events #750, 1268, 76, 188, 1909 and 2914).

2.4 Discussion

2.4.1 Function versus ‘noise’ in PTC-introducing AS events

Our AS microarray profiling data revealed that splice variants in EST/cDNA databases predicted to contain PTCs generally represent minor isoforms across diverse mouse tissues. Most of these AS events are not conserved between orthologous human and mouse transcripts, in agreement with studies indicating that conserved AS events are more often frame-preserving than species-specific AS events (Philipps et al. 2004; Resch et al. 2004; Sorek et al. 2004; Yeo et al. 2005). Knockdown of the essential NMD factor UPF1 led to a substantial increase in the relative levels of PTC-containing splice variants (≥15% change in exon inclusion level) for ~10% of events, and a small but reproducible increase for ~40% of events (≥5% change). Thus, only a subset of the predicted PTC-containing splice variants are degraded by NMD, whereas most PTC-containing splice variants are expressed at low steady-state levels in an NMD-independent manner. We cannot rule out the possibility that some of these latter rare, nonconserved splice variants serve functions in specific contexts not examined here or are degraded by a different pathway. However, it is likely that many of these transcripts are nonfunctional, and may represent aberrant splice variants that have been proposed to contaminate EST databases (Sorek et al. 2004). In contrast, I have also shown that a subset of PTC-containing splice variants is degraded in an NMD-dependent manner. These AS events are likely to play a role in regulating gene expression through the coupling of AS and NMD. Therefore, I will examine the features that distinguish the regulated AS-NMD events from other PTC-introducing AS events in the next chapter.

2.4.2 Alternative branches of the mammalian NMD pathway

Using quantitative AS microarray profiling, I compared the effects of individually knocking down each of the three core NMD factors, UPF1, UPF2 and UPF3X, on the inclusion levels of alternative exons that do or do not introduce PTCs. In pairwise comparisons of the effects of each knockdown, significant overlaps are observed between the PTC-introducing AS events that display inclusion level changes. In addition, a substantial proportion (on average ~50%) of the AS events similarly affected by any two knockdowns are also affected by the third knockdown. This enrichment for AS events that have overlapping UPF protein requirements is consistent with these factors operating in a common pathway in many cases. However, at least one-third of PTC-introducing AS events displayed distinct effects in the different knockdowns, with many of the UPF1-dependent events showing little to no detectable dependence on UPF2 and/or UPF3X. My results thus support differential requirements for UPF proteins in AS-NMD, and extend previous studies suggesting multiple branches of the NMD pathway based on tethering assays or conventional microarray analysis (Gehring et al. 2005; Chan et al. 2007).

Both UPF1 and UPF2 are encoded by unique genes whereas UPF3X (also UPF3B) and UPF3 (also UPF3A) are paralogs. This raises the question as to whether UPF3X-independent NMD could be due to the existence of possible redundant functions with UPF3. While this possibility has not been specifically tested here, previous studies suggest that functional redundancy between these proteins at most plays a limited role in explaining the effects that I have observed. Despite sharing 42% and 60% amino acid sequence identity and similarity, respectively (Serin et al. 2001), and both associating with spliced mRNA (Kim et al. 2001), UPF3X and UPF3 have divergent C-terminal regions, and the activation of NMD by tethered UPF3X is more efficient than it is by tethered UPF3, and requires different factors (Kunz et al. 2006). Moreover, the majority of UPF3X-independent targets validated in the aforementioned microarray profiling study were not affected by knockdown of UPF3 alone or in combination with UPF3X (Chan et al. 2007). Nevertheless, some degree of redundant effects may exist between these factors in AS-NMD. In addition, since truncating mutations associated with X-linked mental retardation were found to eliminate expression of UPF3X in an NMD-dependent manner, UPF3X may have partially redundant functions in NMD (Tarpey et al. 2007).

Studies published after the data in this chapter provide additional support for UPF2- and/or UPF3-independent NMD. Unlike UPF1 and UPF2, which are required for development in mice and in flies (Medghalchi et al. 2001; Metzstein and Krasnow 2006; Weischenfeldt et al. 2008), upf3 null flies are viable and fertile, and UPF3 is not required for the degradation of several Drosophila NMD targets in vivo (Avery et al. 2011). In addition, recent in vitro EJC assembly assays revealed that the EJC protein MLN51 (also known as Barentz) can be recruited to a trimeric EJC containing EIF4A3, MAGOH and Y14, and can interact with UPF1 independently of UPF2 and UPF3 (Gehring et al. 2009). This study provides a potential mechanistic explanation for many UPF1-dependent, but UPF2- and UPF3X-independent, effects on PTC-containing AS events observed in the AS microarray profiling data presented here.

Chapter 3

Data presented in this chapter are adapted with permission from the following publication:

Saltzman AL, Kim YK, Pan Q, Fagnani MM, Maquat LE, Blencowe BJ. 2008. Regulation of multiple core spliceosomal proteins by alternative splicing-coupled nonsense-mediated mRNA decay. Mol Cell Biol 28 (13): 4320-4330. doi:10.1128/MCB.00361-08 Copyright © 2008, American Society for Microbiology. For reprint, see Appendix 2.

Contributions:

Initial analysis of intron conservation using phastCons was suggested by and performed with the assistance of Matthew M. Fagnani (Blencowe Lab).

Mining of alternative splicing events was performed by Sandy (Qun) Pan (Blencowe Lab).

3 Conserved AS-NMD in genes encoding core splicing factors

3.1 Introduction

3.1.1 Cellular functions regulated by AS-NMD

Previous studies have provided evidence for an autoregulatory role of AS-NMD in the expression of AS factors (Sureau et al. 2001; Wollerton et al. 2004; Hase et al. 2006), ribosomal proteins (Dabeva and Warner 1993; Mitrovich and Anderson 2000; Cuccurese et al. 2005) and several genes with other functions (Duncan et al. 1997; Hyvonen et al. 2006; Lareau et al. 2007a). In addition, PTC-introducing AS events in genes encoding splicing regulatory factors, including members of the hnRNP and SR families, have been associated with highly conserved genomic sequences, implying NMD-dependent regulatory roles for these events (Lareau et al. 2007b; Ni et al. 2007). However, the range of cellular functions regulated by AS-NMD has not been fully elucidated. In the previous chapter, I used AS microarray profiling to identify PTC-introducing AS events regulated by NMD. These data provide a basis for identifying new genes and functional processes regulated by AS-NMD.

3.1.2 Summary

Figure 3-1. Overview of Chapter 3.

In this chapter, I found that the intron sequences flanking alternative exons that are regulated by NMD are often highly conserved, suggesting important regulatory roles for these AS events (Figure 3-1). The genes containing these alternative exons encode proteins with diverse cellular functions. Unexpectedly, within this group there are multiple genes encoding core spliceosomal proteins and assembly factors. I also show that conserved, PTC-introducing AS events are enriched in genes encoding core spliceosomal proteins. Altering the expression levels of the core snRNP protein SmB/B′ (SNRPB) or the assembly factor SPF30 (SMNDC1) affects the regulation of PTC-containing splice variants from the corresponding gene. These results indicate that AS-NMD plays a much wider role in the regulation of the splicing machinery than previously appreciated. The results also implicate general spliceosome components in AS regulation.

3.2 Materials and Methods

3.2.1 RT-PCR and Western blotting

RT-PCR assays (Figure 3-5) using one primer specific for the alternative exon and a second primer specific for either the 5′ UTR or another coding exon were performed as above with the following changes: 10 ng input total cell RNA (isolated using the RNeasy Mini kit, Qiagen) was used and ethidium bromide-stained bands were quantified using QuantityOne (Bio-Rad). Total cell protein lysates in RIPA buffer were separated using SDS-PAGE. Western blotting (Figure 3-5) was performed using anti-Flag M2 (F3165, Sigma) or anti-alpha-tubulin (T6074, Sigma) as a loading control.

3.2.2 Analysis of conservation of flanking intron sequence and conserved AS

Genomic coordinates of microarray-profiled alternative exons, and the 150 nucleotides of intronic sequences flanking these exons, were determined by aligning the exon sequences to the human genome (hg18, Mar. 2006 assembly, UCSC genome browser at http://genome.ucsc.edu/) using BLASTN. Conserved elements identified by the phastCons algorithm (Siepel et al. 2005) were retrieved from the UCSC human genome browser ‘phastConsElements17way’ table of the ‘17Way Most Conserved’ track (Kuhn et al. 2007). The overlap of the flanking intron sequences with these elements was determined using the Galaxy web application (Giardine et al. 2005). The conservation of the AS event in mouse ESTs was determined as described (Pan et al. 2004).
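The overlap calculation itself was carried out in Galaxy; the following Python sketch only illustrates the underlying interval arithmetic. The function names, the half-open coordinate convention, the plus-strand orientation and the example coordinates are all assumptions made for illustration. The '35 of the first 50 nucleotides on both sides' rule is taken from the criterion stated for the stacked bar graph in the Results.

```python
def flanking_introns(exon_start, exon_end, flank=150):
    """Return (left, right) flanking windows as half-open (start, end) intervals.
    Assumes a plus-strand exon, so 'left' is upstream and 'right' is downstream."""
    return (exon_start - flank, exon_start), (exon_end, exon_end + flank)

def overlap_nt(region, elements):
    """Count nucleotides of `region` covered by any conserved element.
    Elements may overlap each other; each nucleotide is counted once."""
    start, end = region
    covered = 0
    last_end = start  # rightmost position already counted
    for el_start, el_end in sorted(elements):
        lo = max(el_start, last_end)
        hi = min(el_end, end)
        if hi > lo:
            covered += hi - lo
            last_end = hi
    return covered

# Hypothetical exon and conserved elements (not real genomic coordinates).
elements = [(900, 980), (970, 990), (1120, 1200)]
left, right = flanking_introns(1000, 1100, flank=50)
left_nt = overlap_nt(left, elements)
right_nt = overlap_nt(right, elements)

# 'Overlap' call: at least 35 of the first 50 nt on BOTH sides of the exon.
is_conserved = left_nt >= 35 and right_nt >= 35
print(left_nt, right_nt, is_conserved)
```

For a minus-strand exon the two windows would simply swap their upstream and downstream roles.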

3.2.3 Identification of AS events in spliceosomal and control gene sets

Unigene identifiers for a curated list of spliceosome-associated proteins (Barbosa-Morais et al. 2006) were retrieved using MatchMiner (Bussey et al. 2003) or manually. To generate the ‘control’ gene set, for each spliceosome-associated factor with ‘n’ sequences in its Unigene cluster, all other Unigene clusters with ‘n’ sequences were grouped. One cluster, other than the one associated with the spliceosomal factor, was then randomly selected from this group to be in the control set (or if no other such clusters were found, from the next group with ‘n+1’ sequences). Cassette AS events were mined as described (Pan et al. 2004; Pan et al. 2005) and PTCs were annotated as described in Chapter 2. Gene identifiers and aliases (Table 3-1, Table 3-2, Appendix 13, Appendix 14) were obtained manually or using SOURCE (http://source.stanford.edu) (Diehn et al. 2003) or Gene/Clone ID Converter (http://idconverter.bioinfo.cipf.es/) (Montaner et al. 2006).
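The coverage-matched selection of control clusters described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the script used for the analysis: the cluster identifiers and sizes are hypothetical, all spliceosomal clusters are excluded from the candidate pool, the same control may be drawn more than once, and the search continues past ‘n+1’ if that group is also empty.

```python
import random
from collections import defaultdict

def pick_controls(cluster_sizes, spliceosomal_ids, seed=0):
    """Map each spliceosomal cluster to a randomly chosen non-spliceosomal
    cluster with the same number of sequences (or the next larger size)."""
    rng = random.Random(seed)
    spliceosomal = set(spliceosomal_ids)

    # Group candidate control clusters by their number of sequences.
    by_size = defaultdict(list)
    for cluster_id, n in cluster_sizes.items():
        if cluster_id not in spliceosomal:
            by_size[n].append(cluster_id)

    max_size = max(cluster_sizes.values())
    controls = {}
    for sid in spliceosomal_ids:
        n = cluster_sizes[sid]
        # Fall back to the next group size when no match of size n exists.
        while n <= max_size and not by_size.get(n):
            n += 1
        if n <= max_size:
            controls[sid] = rng.choice(by_size[n])
    return controls

# Hypothetical Unigene-like clusters: id -> number of EST/cDNA sequences.
sizes = {"S1": 10, "S2": 7, "C1": 10, "C2": 10, "C3": 8}
controls = pick_controls(sizes, ["S1", "S2"])
print(controls)
```

Matching on the number of sequences per cluster is what keeps EST coverage, and therefore the chance of detecting an AS event, comparable between the two gene sets.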

3.2.4 Statistical Analysis

Microarray-profiled AS events were classified by overlap with phastCons elements and by UPF dependence (Figure 3-2). Fisher’s exact test was used to determine whether or not these two groupings are independent. This test was also used to compare sets of AS events showing pronounced dependence on one or another UPF factor (Appendix 8). AS events from the spliceosomal and control groups of genes were classified by conservation and by PTC status (Figure 3-4), and the Chi-square test was used to test whether or not these groupings are independent. The number (n) of AS events considered in each case is given in the text and/or appropriate figure legend.
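For readers unfamiliar with the test, a two-sided Fisher's exact test on a 2x2 table (for example, phastCons overlap versus UPF dependence) can be written with the standard library alone. The sketch below is a generic implementation, not the software used for the analyses in this chapter, and the counts in the example are a textbook-style toy table rather than data from this study.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].
    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Toy table: rows = group 1 / group 2, columns = overlap / no overlap.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))
```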

3.3 Results

3.3.1 PTC-introducing AS events affected by UPF knockdowns are flanked by highly conserved sequences

To identify PTC-introducing AS events that are most likely to be associated with conserved gene regulatory functions, the sequence conservation of introns flanking PTC-introducing exons that display UPF factor-dependent changes (Chapter 2) was examined. Alternative exons represented on our array were aligned to human genome sequences and the coordinates of the upstream and downstream flanking intron sequences (50 or 150 nucleotides) were identified. The proportion of nucleotides that overlap with the ‘most conserved’ sequence elements detected by the phastCons algorithm in alignments of the genomes of 17 species (Siepel et al. 2005) was then calculated. These overlaps for PTC-introducing AS events are shown in Figure 3-2. Introns flanking PTC upon inclusion alternative exons that display a UPF1 knockdown-dependent increase in relative abundance showed a statistically significant enrichment for overlap with these phastCons elements when compared with all PTC upon inclusion exons (Figure 3-3A; 35% vs. 16% with phastCons overlap, n=46 and n=164, respectively; p=2×10^-4, Fisher’s exact test). Similarly, a significant enrichment for overlap with phastCons elements in introns flanking the UPF2-dependent group of PTC upon inclusion exons was observed (Figure 3-3A; 38% with phastCons overlap, n=16, p=0.03). No such enrichment was observed for the UPF3X-dependent events. In addition, the overlap with conserved elements for intron sequences flanking PTC upon inclusion alternative exons was significantly different for events that do or do not show a UPF1 knockdown-dependent increase in % inclusion (Figure 3-3B; upstream intron, p=1×10^-4; downstream intron, p=8×10^-4; Wilcoxon rank-sum test).

Figure 3-2. Conservation of intron sequences flanking PTC-introducing exons affected by UPF factor knockdowns. (left panels) As shown in Figure 2-7, the effects of each of the three KDs on % exon inclusion of PTC upon inclusion (A) and PTC upon skipping (B) events (scale, row order and details as in Figure 2-7). (right panels) For each AS event, the number of upstream and downstream flanking intron nucleotides (nt) (out of 50) overlapping with conserved phastCons elements is shown using the color scale at the bottom. Black indicates all 50 proximal nt overlap and white indicates no nt overlap.

Figure 3-3. PTC upon inclusion alternative exons that show UPF1- or UPF2-dependent changes in inclusion level are often flanked by highly conserved intronic sequences. (A) The stacked bar graph shows the proportion of PTC upon inclusion AS events that overlap (black) or do not overlap (white) phylogenetically conserved sequences, as identified using the phastCons algorithm (Siepel et al. 2005). ‘Overlap’ requires that at least 35 of the first 50 nucleotides of both the upstream and downstream intron sequences flanking the alternative exon overlap phastCons elements. ‘All’ represents all detectable PTC upon inclusion events (n=164), and ‘KD’ represents PTC upon inclusion events with at least a 5% difference in the indicated direction (more inclusion or more skipping) and knockdown (UPF1-KD more inclusion, n=46; UPF2-KD more inclusion, n=16). Comparing proportions marked by asterisks to the proportion for the total group, ‘All’: *1: p=2×10^-4; *2: p=3×10^-2, Fisher’s exact test. (B) Cumulative distribution function (CDF) plots of the overlap between the flanking intron sequence (upstream, black; downstream, red) and phastCons elements for the PTC upon inclusion exons that show (solid line) or do not show (dotted line) a UPF1 KD-dependent change in % inclusion. The differences between the AS events showing ≥5% more inclusion and AS events not affected by UPF1 knockdown are significant (upstream intron, p=1×10^-4; downstream intron, p=8×10^-4; Wilcoxon rank-sum test).

Some PTC upon skipping alternative exons displaying UPF knockdown-dependent changes in AS levels also overlapped phastCons conserved elements (Figure 3-2); however, a statistically significant enrichment was not observed (data not shown). Enrichment of phastCons elements was also observed among AS events that display an increase in skipping upon knockdown of UPF2 or UPF3X but do not introduce a PTC (Appendix 11). Further experimentation will be necessary to determine whether or not these effects are associated with NMD. In summary, the analysis described above reveals that a significant number of PTC-introducing AS events regulated by one or more UPF proteins, particularly those in the PTC upon inclusion category, have flanking introns that overlap highly conserved sequences. This observation suggests that these AS events and their associated flanking intronic sequences have important regulatory roles.

3.3.2 Core spliceosomal proteins are new regulatory targets of AS-NMD

To identify new regulatory targets of AS-NMD, I examined the list of functions represented by genes with PTC-introducing AS events that show UPF factor knockdown-dependent changes in % inclusion (Figure 3-2). Among the subset flanked by conserved sequences (Figure 3-2, Figure 3-3) are 13 AS events located in transcripts from genes with RNA processing-related functions (Table 3-1). Other such AS events are located in genes that represent a range of other cellular functions, including nucleotide metabolism (e.g. NT5C3), signaling (e.g. RAB5A) and sumoylation (e.g. SENP1; for additional examples, see Appendix 12). Surprisingly, in addition to AS regulators such as SR and hnRNP proteins, which have recently been associated with conserved PTC-introducing events (Lareau et al. 2007b; Ni et al. 2007), regulated PTC-introducing AS events were found in core spliceosome components. Among this list are genes encoding the U1 snRNP-specific 70-kDa protein (U1-70K), the common snRNP Sm protein SmB/B′ (SNRPB), and other genes involved in spliceosome formation (SF1, PRPF18, SMNDC1; Table 3-1). Consistent with conserved regulatory roles of the PTC-introducing AS events in these genes implied by detection of overlaps with phastCons elements, most of these AS events were also detected in alignments of mouse ESTs (Table 3-1). Moreover, the microarray-predicted change in relative inclusion levels of these exons was validated by RT-PCR assays in at least one of the UPF factor knockdowns (Figure 2-7, Figure 2-8, and Appendix 12). I identified conserved, UPF-regulated, PTC-introducing AS events in several genes encoding core spliceosomal components. This led me to investigate whether such AS events are a common feature in additional spliceosomal factor genes not represented on our microarray.

Table 3-1. Selected microarray PTC-introducing AS events in genes with functions related to RNA processing. These alternative exons show changes in inclusion level upon UPF knockdown(s) and are flanked by highly conserved intron sequences.

3.3.3 Conserved AS in genes encoding spliceosomal factors enriched in PTC-introducing events

To determine if additional genes encoding spliceosomal factors contain conserved PTC-introducing AS events, we computationally mined PTC- and non-PTC-introducing cassette-type AS events from EST/cDNA data representing a curated list of 253 genes encoding spliceosome-associated proteins and other splicing factors (Barbosa-Morais et al. 2006). For comparison, a set of non-spliceosome-related Unigene clusters with similar EST coverage was also analyzed in parallel. We identified 443 AS events in 149 Unigene clusters in the spliceosome-associated set (Appendix 13), and 547 events in 161 Unigene clusters in the control set (Appendix 14). Both sets had similar distributions of AS events per gene (data not shown), with a median of two AS events per gene. Available sequence information for these transcripts allowed annotation of whether the AS events were PTC-introducing or non-PTC-introducing for 73% and 74% of events in the spliceosomal and control gene groups, respectively.

My microarray results above suggested that PTC-introducing events affected by UPF factor knockdowns in HeLa cells are often flanked by conserved sequences (Figure 3-2) and many show conservation with mouse based on ESTs (Table 3-1). These features were assessed for the AS events mined from the spliceosomal and control gene groups. As described above for the microarray events, I calculated the proportions of the 150 nucleotides of intron sequences flanking the human alternatively spliced exons that overlap with phastCons elements. In parallel, we mined mouse ESTs to assess which of the human AS events are ‘conserved’ in mouse. I found that among AS events showing flanking intron sequence conservation and/or conservation based on analysis of EST/cDNA sequences (n=84 and n=54 for the spliceosomal and control groups, respectively), the distribution of PTC- and non-PTC-introducing AS events was significantly different between the two groups, with a higher proportion of PTC-introducing events in the spliceosomal group than in the control group (71% vs. 46%, p=3×10^-3, Chi-square test; Figure 3-4A). Within the spliceosomal group, there is also a higher proportion of PTC-introducing AS events in the conserved group than in the nonconserved group (71% vs. 61%; Figure 3-4A). The distributions of conserved sequence elements in the introns flanking PTC-introducing alternative exons are also significantly different between the spliceosomal and control groups (Figure 3-4B).

A subset of the genes contains more than one type (PTC category) of AS event. However, similar results were obtained when these genes were removed from the analysis (data not shown). Alternative splicing factors including SR and hnRNP proteins, which have been previously associated with conserved, PTC-introducing AS events (Lareau et al. 2007b; Ni et al. 2007), were also identified by my analysis (Table 3-2). However, these did not account for the entire enrichment of conserved, PTC-introducing events in the spliceosomal genes, and similar results were also obtained when these genes were removed (data not shown). Thus, among the genes identified with at least one conserved, PTC-introducing event, most encode spliceosomal proteins and other splicing factors not known previously to be associated with AS-NMD (Table 3-2). Moreover, the association of conserved, PTC-introducing AS events with genes encoding these spliceosomal factors suggests that these AS events play important regulatory roles.

Figure 3-4. Conserved PTC-introducing AS events in genes encoding spliceosomal proteins. (A) Conserved AS events either have flanking introns overlapping (≥35 nt) phastCons conserved elements or are conserved in mouse based on analysis of cDNAs/ESTs. Conserved AS events in genes of spliceosomal proteins are more often PTC-introducing than in genes from the control set (compare proportions marked by asterisks; p=3×10^-3, n=84, 54; Chi-square test). PTC-introducing AS events in the spliceosomal protein set are also more often conserved than non-conserved (compare proportions marked by closed circles). Genes with more than one AS event of the same type (PTC- or non-PTC-introducing) were only counted once. (B, C) Cumulative distribution function (CDF) plots of flanking intron sequence overlap (B, upstream intron; C, downstream intron flanking the alternative exon) with phastCons elements for the indicated PTC groups in the splicing factor AS events and control AS events. The distributions are significantly different when comparing the splicing factor and control groups, for both the PTC upon inclusion (blue) and PTC upon skipping (red) events (2-sample Kolmogorov-Smirnov test).

Table 3-2. Conserved, PTC-introducing AS events identified in transcripts from spliceosome-associated proteins. Each AS event either has at least 35 out of 150 nt of flanking intron overlapping conserved nucleotides (phastCons, Siepel et al. 2005) or is conserved between human and mouse based on cDNA/EST analysis.

3.3.4 Autoregulation of core splicing factors by AS-NMD

Previous studies have shown that several AS regulatory factors, such as SR and hnRNP proteins, can autoregulate their expression levels via AS-NMD (reviewed in Lareau et al. 2007a). However, such a function for core spliceosomal proteins has not been previously established. The identification of conserved, PTC-introducing alternative exons in core spliceosomal factors suggested that the levels of these proteins might be subject to autoregulation. Accordingly, I tested the effect of increased expression of SNRPB or SMNDC1 on the inclusion levels of the PTC-introducing exons in their respective transcripts. Vectors encoding Flag epitope-tagged versions of these proteins, and for control purposes the parental vector or a vector encoding a Flag epitope-tagged U2 snRNP-associated protein, DDX42 (Will et al. 2002), were transiently transfected into HeLa cells. RT-PCR assays were performed using primer pairs designed to specifically amplify the endogenous PTC-containing splice variants (Figure 3-5). Comparable levels of expression were obtained for the three Flag epitope-tagged proteins (Figure 3-5A). Increased expression of SNRPB and SMNDC1 led to a reproducible (observed in three independent experiments), approximately twofold increase in the level of the PTC-containing splice-variant transcript arising from alternative exon inclusion, specifically in SNRPB and SMNDC1 transcripts, respectively (Figure 3-5B). In contrast, overexpression of DDX42 in parallel had comparatively little effect on the levels of the exon-including transcript. Overexpression of SNRPB or SMNDC1 also had lesser effects on levels of PTC-containing transcripts from other splicing-related genes (SF1, TRA2A, SR140; Figure 3-5). Thus, artificially increasing the amount of SNRPB or SMNDC1 protein leads to an increase in the level of endogenous PTC-containing, alternative exon-including SNRPB or SMNDC1 transcripts, respectively. These results are thus consistent with the autoregulation of expression levels of these factors via AS-NMD. However, further experimentation will be necessary to establish whether this autoregulation occurs via direct or indirect mechanisms.

Figure 35. SNRPB (also known as SmB/B′) or SMNDC1 (also known as SPF30) overexpression leads to increased levels of the respective PTCcontaining (PTC+) alternative transcript. (A) Either Flagtagged SNRPB or SMNDC1 was transiently overexpressed in HeLa cells, and the endogenous PTCcontaining transcript of SNRPB (A, left top panel) or SMNDC1 (A, right top panel) was amplified by RTPCR using a forward primer specific for the 5′UTR or an upstream exon, and a reverse primer specific for the PTCintroducing alternative exon. Levels of the conserved, PTCcontaining transcripts of SF1, TRA2A and SR140, also assayed by RTPCR, were not affected to the same degree by overexpression of SNRPB (A, left lower panels) or SMNDC1 (A, right lower panels). Overexpression of another Flagtagged spliceosomeassociated protein (DDX42) in parallel had little effect on the level of the SNRPB and SMNDC1 transcripts. (B) A Western blot shows expression of each Flagtagged protein (left, approximate molecular weight markers, kDa; right, arrows indicate expected protein sizes). Data are representative of three independent transfections and RTPCR assays were performed in triplicate.

3.4 Discussion

In this chapter, I examined the functional properties of PTCintroducing AS events affected by knockdown of NMD factors. The intronic sequences flanking these alternative exons are often highly conserved. Sequence conservation is generally associated with selection pressure to preserve function, and conserved flanking intron sequences are specifically implicated in the regulation of alternative exon inclusion levels (Sorek and Ast 2003; Yeo et al. 2005; Sugnet et al. 2006). Thus, sequence conservation suggests that regulation of these PTCintroducing alternative exons is functionally important. My results also identified core spliceosome components and spliceosome assembly factors as a new functional gene group not previously known to be regulated by ASNMD. I further showed that increased expression of some of these core splicing proteins can activate autoregulatory ASNMD. The mechanism of this feedback regulation will be explored in more detail in the next chapter.

3.4.1 ASNMD and the regulation of core spliceosomal proteins

I initially identified conserved and UPF factorregulated PTCintroducing alternative exons in transcripts from a subset of spliceosomal component or assembly factor genes based on AS microarray profiling. Since the microarray did not represent a comprehensive set of AS events in splicing factor genes, we performed a computational search for AS events in a list of spliceosomal factor genes curated based on experimental evidence supporting a functional or physical association with spliceosomal complexes (BarbosaMorais et al. 2006). Strikingly, I found that conserved, PTCintroducing AS events were significantly enriched in these genes, even when excluding previously identified examples of regulatory splicing factor genes with PTCintroducing AS events. My results therefore extended the identification of UPF1dependent, PTCintroducing AS events in highly conserved regions of genes encoding known AS regulators of the SR and hnRNP families (Lareau et al. 2007b; Ni et al. 2007).

My results indicating that proteins involved in formation of the core spliceosome are regulated by ASNMD, and the results implicating these proteins in autoregulatory loops, also extend previous findings that components of the ‘basal’ splicing machinery can function in the regulation of AS. An RNAi screen for AS regulators revealed that knockdown of several core splicing factors in Drosophila cells results in transcriptspecific AS effects (Park et al. 2004). Among these factors was SPF30, the ortholog of human SMNDC1, which I have shown contains a conserved PTCintroducing alternative exon that is regulated by UPF1, and which, when overexpressed, affects the levels of its PTCcontaining transcript (Table 31, Figure 35). Furthermore, while my results point to autoregulatory functions of core spliceosomal proteins operating via NMD, it is also interesting to consider that these and other proteins can regulate additional splicing events, including, for example, ASNMD events that are associated with other splicing factors. Such auto- and crossregulation could provide an important mechanism for ensuring both the appropriate absolute and relative levels of proteins comprising the core splicing machinery. This hypothesis will be investigated further in Chapter 5.

Chapter 4

Data presented in this chapter are adapted with permission from the following publication:

Saltzman AL, Pan Q, Blencowe BJ. 2011. Regulation of alternative splicing by the core spliceosomal machinery. Genes Dev 25(4), 373-384. doi: 10.1101/gad.2004811 Copyright © 2011 by Cold Spring Harbor Laboratory Press. For reprint, see Appendix 15.

4 Autoregulation of the core splicing factor SmB/B′ via ASNMD

4.1 Introduction

4.1.1 ASNMD of SNRPB, encoding SmB/B′

Using AS microarray profiling following knockdown of NMD factors, I identified highly conserved, PTCintroducing alternative exons in genes encoding multiple core splicing factors (Chapter 3) (Saltzman et al. 2008). These genes included SNRPB, which encodes the core snRNP component SmB/B′, a subunit of the heteroheptameric Sm protein complex that is assembled by the SMN complex onto the Smsite of U1, U2, U4 and U5 snRNAs (reviewed in Neuenkirchen et al. 2008). The SmB and SmB′ proteins are nearly identical and arise from alternative 3′ splice site usage in the terminal exon of SNRPB (Figure 42A, top and Appendix 16), leading to an additional repeat of a short Prolinerich motif at the Cterminus of SmB′ (van Dam et al. 1989). The PTCintroducing alternative exon in SNRPB lies within a region of the second intron that is highly conserved in mammalian genomes (Figure 42A). Splice variants including this PTCintroducing exon accumulate when NMD is disrupted, as well as when expression of SmB is increased exogenously (Figure 35, Figure 42). These results suggest that ASNMD plays a role in the homeostatic regulation of basal splicing factors.

4.1.2 Summary

Figure 41. Overview of Chapter 4.

In this chapter, I show that the core snRNP protein SmB/B′ regulates its own expression by promoting the inclusion of a highly conserved PTCintroducing alternative exon in its premRNA (Figure 41). This mode of regulation depends on a suboptimal 5′ splice site associated with the SNRPB PTCintroducing exon and appears to be controlled by changes in the level of U1 snRNP as a consequence of SmB/B′ depletion. My results also suggest that inclusion of the SNRPB PTCintroducing exon is involved in crossregulation of SmB/B′ by the related tissuerestricted SmB/B′ paralogue SmN. The autoregulation of SmB/B′ further suggests that this core splicing factor may also play a role in the regulation of other alternative exons.

4.2 Materials and Methods

4.2.1 Cell culture, siRNA and plasmid transfection

HeLa cells were grown in DMEM (Sigma) supplemented with 10% fetal bovine serum (Sigma) and penicillin/streptomycin (Gibco). For knockdowns, cells were transfected with siRNAs (ON-TARGETplus modified, Dharmacon) at a final concentration of 100 nM using DharmaFECT 1 (Dharmacon) according to the manufacturer’s instructions. Plasmids were transfected using Lipofectamine 2000 (Invitrogen) according to the manufacturer’s instructions. For knockdown/minigene experiments, cells were transfected with siRNAs and then cotransfected with minigene plasmids and cDNA expression plasmids two days later. Cells were then harvested two days after the plasmid transfection.

4.2.2 Estimation of mRNA halflives

HeLa cells were transfected with miniSmB and two days later were treated for three hours with CHX (300 µg/mL) to inhibit translation and NMD, or with an equivalent amount of DMSO as a control. Cells were then additionally treated with actinomycin D (5 µg/mL) to inhibit transcription, and RNA was isolated at several timepoints. Exponential decay curves were fit to the data and used to estimate the halflives of each splice variant.
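The curve-fitting step can be sketched as follows. This is a minimal illustration rather than the analysis code used for this work: it assumes single-exponential decay, fits by ordinary least squares on log-transformed signal, and uses hypothetical timepoints and signal values. The function name `estimate_half_life` is introduced here for illustration only.

```python
import math


def estimate_half_life(times_h, signals):
    """Fit signal = A * exp(-k * t) by least squares on ln(signal).

    Returns (k, half_life), where half_life = ln(2) / k.
    Assumes all signals are positive and the decay is single-exponential.
    """
    logs = [math.log(s) for s in signals]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    # Slope of the regression line of ln(signal) against time.
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    k = -sxy / sxx
    return k, math.log(2) / k


# Hypothetical timepoints (hours after actinomycin D) and normalized signals,
# generated here from a known decay constant so the fit can be checked.
times = [0.0, 1.0, 2.0, 4.0]
signals = [100.0 * math.exp(-0.5 * t) for t in times]
k, t_half = estimate_half_life(times, signals)
print(f"decay constant = {k:.3f} per hour, half-life = {t_half:.2f} h")
```

Fitting on the log scale corresponds to reading the slope of a semilog plot; a nonlinear fit on the untransformed signal would weight the early timepoints more heavily and may give a somewhat different estimate.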

4.2.3 RNA and protein isolation, RTPCR and Western blotting

Total RNA was isolated using TRI reagent (Sigma) according to the manufacturer’s instructions. RTPCR assays were performed using 5-10 ng input total RNA in a 10 µL reaction using the OneStep RTPCR kit (Qiagen) with or without addition of [α-32P]dCTP (PerkinElmer). Total protein lysates in radioimmunoprecipitation buffer supplemented with complete mini EDTAfree protease inhibitor cocktail (Roche) were separated by SDSPAGE and Western blotting was performed with the monoclonal antibody Y12 to detect SmB/B′ (Lerner et al. 1981), with mAb96 to detect SRSF1 (Hanamura et al. 1998) or with antiαtubulin (Sigma; T6074). Primers for RTPCR analysis of snRNAs were as described (Zhang et al. 2008).

4.2.4 Plasmid Construction

To construct miniSmB, the 124 nt SNRPB PTCintroducing alternative exon along with 124 nt of upstream intron and 122 nt of downstream intron was amplified by PCR from HeLa genomic DNA and cloned into the XhoI/NotI sites of the pET01/Exontrap vector (Mobitec). The amplified SNRPB fragment corresponds to human chr20:2395715-2396221 (Hg18; reverse strand). Mutations of miniSmB were introduced by sitedirected mutagenesis or by overlapPCR using Phusion polymerase (NEB). For expression of 3xFlagtagged proteins, the cDNAs encoding SmB, SmD1 or SmN from the human ORFeome collection (Open Biosystems) were cloned into pMT3989 (a gift from Marcia Roy, Tyers Laboratory) using Gateway LR Clonase II (Invitrogen). The SmB′ construct was generated by sitedirected mutagenesis of SmB. All constructs were verified by sequencing.

4.3 Results

4.3.1 Inclusion of a highly conserved premature termination codon (PTC) introducing alternative exon in SNRPB premRNA is affected by SmB/B′ protein levels

To initially explore the role of the core spliceosomal machinery in the regulation of AS, I determined the effect of SmB/B′ knockdown on the AS of the PTCintroducing SNRPB exon. This exon, together with its highly conserved flanking intronic sequences, was cloned into a minigene reporter plasmid containing upstream and downstream heterologous intron and constitutive exon sequences (‘miniSmB’; Figure 42A). Unlike endogenous SmB/B′ transcripts including the PTCintroducing exon (Figure 42B), exonincluded transcripts derived from the miniSmB reporter are not degraded by NMD, as neither the steadystate level nor the halflife of these transcripts is increased upon disruption of NMD (Appendix 17 and Figure 43). Monitoring transcripts derived from miniSmB therefore allows an analysis of the splicing regulation of the PTCintroducing SNRPB exon in the absence of effects of NMD. HeLa cells were transfected with a control nontargeting (NT) siRNA or an siRNA to knock down SmB/B′, followed by transfection with the miniSmB reporter plasmid. Knockdown of SmB/B′ led to increased skipping of the SNRPB alternative exon in miniSmB (Figure 42C). Loss of inclusion of the SNRPB alternative exon was rescued by expression of a Flagepitopetagged cDNA construct encoding SmB or SmB′ (Figure 42C). It was also rescued by expression of SmN, a tissuerestricted paralogue of SmB/B′ that is 93% identical to SmB′ (Appendix 16). However, loss of inclusion of the SNRPB alternative exon upon SmB/B′ knockdown could not be rescued by expression of the related protein SmD1, another component of the heteroheptameric Sm ring (Figure 42C). Thus, SmB/B′ knockdown leads to an increase in the exonskipped miniSmB splice variant (Figure 42C), which represents the proteincoding isoform. Reciprocally, my previous experiments have shown that increasing SmB/B′ protein levels leads to an increase in the exonincluded PTCcontaining splice variant (Figure 35) (Saltzman et al. 2008). Together, these results support a role for the highly conserved SNRPB PTCintroducing alternative exon in the homeostatic autoregulation of SmB/B′ via ASNMD.

Figure 42. The inclusion of a highly conserved PTCintroducing alternative exon in SNRPB is affected by SmB/B′ knockdown. (A) (Upper panel) Diagram of the exon/intron structure of the SNRPB gene. The two encoded proteins, SmB and SmB′, arise from alternative 3′ splicesite usage in the final exon. A conservation plot (phyloP) is shown below (generated using the UCSC genome browser (Rhead et al. 2010), http://genome.ucsc.edu/). (Lower panel) An expanded view of the region between the second and third SNRPB exons. The highly conserved area within this intron contains an alternative exon (‘A’). This alternative exon and its conserved flanking intron regions (boxed in blue) were cloned into a minigene (miniSmB), in which they are flanked by heterologous intron and exon sequences. (B) RTPCR assays confirm an increase in the level of the PTCcontaining SNRPB variant when NMD is abrogated by knockdown of UPF1. See also Figure 43 and Appendix 17. (C) Knockdown of SmB/B′ leads to more skipping of the SNRPB alternative exon in miniSmB. HeLa cells were transfected with a control nontargeting (NT) siRNA, or an siRNA targeting the 3′UTR of SmB/B′. Cells were then cotransfected with miniSmB and either an empty vector or a vector encoding a 3xFlagtagged cDNA, as indicated above the gels. To assay alternative exon inclusion in miniSmB (top panel), RTPCR assays were performed using primers specific for the flanking constitutive exons of the minigene (hatched boxes). Quantifications of the % exon inclusion level are shown below the gel, and the average % inclusion ± standard deviation calculated from at least three independent analyses is shown in the bar graph. The levels of SmB/B′ mRNA (middle panel) and 5S rRNA (loading control, bottom panel) were assayed by RTPCR.

Figure 43. The halflife of the endogenous SNRPB PTCcontaining included splice variant (A) but not that of the exonincluded variant from the SNRPB reporter ‘miniSmB’ (B) is increased upon treatment with cycloheximide (CHX) to inhibit NMD. Cells were transfected with miniSmB and two days later were pretreated for three hours with CHX or with an equivalent amount of DMSO as a control. Cells were then additionally treated with actinomycin D to inhibit transcription, and RNA was isolated at the indicated timepoints. The endogenous included PTCcontaining splice variant (A) or the transcripts derived from the minigene (B, C) were detected by RTPCR. Quantifications of the RTPCR signals (average ± standard deviations) were calculated from four RTPCR assays performed on samples from two independent transfections and are shown in semilog plots. Exponential decay curves were fit to the data and used to estimate the halflives of each splice variant.

4.3.2 Knockdown of the core snRNP protein SmD1 affects the inclusion of the conserved SNRPB alternative exon

To investigate whether other proteins in the Sm heptameric complex might also affect the inclusion of the SNRPB alternative exon, SmD1 was knocked down and the effect on the AS of miniSmB was assayed by RTPCR (Figure 44A). As observed for SmB/B′ (Figure 42C), knockdown of SmD1 resulted in more skipping of the miniSmB alternative exon. This effect was rescued by exogenous expression of SmD1, but not of SmB (Figure 44A). Taken together with the results described above, these observations suggest that SmB and SmD1 affect AS in a similar but nonredundant manner. One possibility is that depletion of each component of the Sm heptameric complex similarly affects the overall levels and/or integrity of one or more snRNPs in a manner that affects the recognition of the SNRPB alternative exon.

Figure 44. Knockdown of SmD1 leads to more skipping of the SNRPB alternative exon in miniSmB (A), and knockdown of SmB/B′ (B) or SmD1 (C) affects snRNA levels. (A) HeLa cells were transfected with nontargeting (NT) or SmD1specific siRNAs. Cells were then cotransfected with miniSmB and with either an empty vector or a vector encoding the 3xFlagtagged cDNA indicated above the gel. RTPCR assays were performed using primers specific for the flanking constitutive exons of the minigene (top panel) or primers specific for SmD1 or βactin transcripts (lower panels). The quantifications and bar graph of % inclusion levels are as in Figure 42. (B) Knockdown of SmB/B′ leads to a decrease in steadystate levels of three of four Smclass snRNAs that form the major spliceosome (U1, U4, U5, but not U2) and of the Smclass snRNAs that form the minor spliceosome (U11, U12, U4atac). (C) Knockdown of SmD1 shows similar effects, but also a slight decrease in U2 snRNA levels. The levels of LSm (Smlike)class snRNAs U6 and U6atac were not decreased in either the SmD1 or SmB/B′ knockdown.

4.3.3 Knockdown of SmB/B′ or SmD1 affects the levels of Smclass snRNAs

To investigate how the knockdown of SmB/B′ or SmD1 might affect SmB/B′ AS, I next determined whether reduced levels of either Sm protein affect the steadystate levels of spliceosomal snRNAs using RTPCR assays. The relative levels of snRNAs measured in total cellular RNA are comparable to those in snRNPs immunoprecipitated with antiSm antibodies (Zhang et al. 2008). This is in agreement with previous studies indicating that the pool of “Smfree” snRNAs is relatively small (Sauterer et al. 1988; Zieve et al. 1988). Consistent with an important role for Sm core assembly in snRNP stability (Jones and Guthrie 1990), knockdown of SmB/B′ led to a decrease in Smclass snRNAs (U1, U4, U5, U11, U12, U4atac), with the exception of U2 (Figure 44B). Knockdown of SmD1 also led to a decrease in Smclass snRNAs, including a slight decrease in U2 (Figure 44C). The similar effects observed for both knockdowns are consistent with a shared role for Sm proteins in the inclusion of alternative exons through modulating the overall levels of one or more snRNPs.

4.3.4 Cis acting elements regulating inclusion of the SNRPB alternative exon

To investigate the mechanism by which knockdown of Sm proteins and the associated reductions in snRNP levels affect AS of SmB/B′ premRNA, I performed a detailed mutagenesis analysis of the miniSmB reporter. Recapitulation in this reporter (Figure 42) of the Smprotein dependent AS effects seen for endogenous transcripts (Saltzman et al. 2008) suggested that all of the cis acting elements required for mediating regulation are contained within the highly conserved SNRPB PTCintroducing exon and/or its flanking intronic sequences. Linkerscanning mutagenesis was performed in which successive 12base segments of the alternative exon and its upstream and downstream flanking introns were deleted or substituted with a linker sequence. This strategy identified sequences in the exon and introns acting as splicing enhancers or silencers. These elements are concentrated near splice sites (Figure 45A). In most cases, knockdown of SmB/B′ led to a comparable increase in exon skipping of the mutated or deleted minigenes as observed for the wildtype miniSmB (Figure 45B). These results suggest that either these sequence motifs act through different trans acting factors, or that they are involved in regulation in a manner that is redundant with other sequence motifs. However, in several cases, the mutation and/or deletion of a particular sequence reduced or eliminated the impact of SmB/B′ knockdown on inclusion of the alternative exon. In particular, a deletion from the 7th to 18th nucleotide downstream of the 5′ splice site (ss) led to an increase in the inclusion of the alternative exon. However, in contrast to wildtype miniSmB, knockdown of SmB/B′ had no detectable effect on the exon inclusion level of this mutant reporter (Figure 45 and Appendix 18). Substitution of the same sequence with the 12base linker also led to an increase in exon inclusion, but, in contrast to the deletion mutant, did not prevent increased exon skipping caused by SmB/B′ knockdown (Figure 45 and Appendix 18). Examination of sequences adjacent to the 5′ss created by this pair of mutants revealed that the deletion but not the substitution created a sequence with increased potential for basepairing to U1 snRNA. These results therefore suggest that regulation of the SNRPB alternative exon by Sm protein levels depends on sequences at or proximal to the exon 5′ss. Therefore, the role of 5′ss sequences in regulating the inclusion of the SNRPB alternative exon was investigated in greater detail.
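The design of a linkerscanning series can be illustrated with a short sketch. It is not the procedure used to build the constructs described here (those were made by sitedirected mutagenesis or overlapPCR); it only shows how successive 12base windows yield one deletion and one substitution variant each. The input sequence and the linker sequence are hypothetical placeholders, and the sketch does not handle the splice sites separately.

```python
def linker_scan_variants(sequence, linker, window=12):
    """Return (start, deletion_variant, substitution_variant) for successive
    non-overlapping windows of `window` bases along `sequence`.

    The linker must be the same length as the window so that substitution
    variants keep the original sequence length.
    """
    if len(linker) != window:
        raise ValueError("linker length must equal the window size")
    variants = []
    for start in range(0, len(sequence) - window + 1, window):
        end = start + window
        deletion = sequence[:start] + sequence[end:]
        substitution = sequence[:start] + linker + sequence[end:]
        variants.append((start, deletion, substitution))
    return variants


# Hypothetical 36-nt sequence and hypothetical 12-nt linker (not from this work).
example_seq = "ACGTACGTACGT" + "TTTTCCCCGGGG" + "AACCGGTTAACC"
example_linker = "GGATCCGAATTC"
scan = linker_scan_variants(example_seq, example_linker)
for start, deleted, substituted in scan:
    print(start, len(deleted), substituted[start:start + 12])
```

Comparing the deletion and substitution variants for the same window helps separate the effect of losing a sequence element from the effect of changing the spacing or the newly juxtaposed sequence, which is the distinction drawn above for the window downstream of the 5′ss.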

Figure 45. Auxiliary cis acting elements regulating inclusion of the SNRPB alternative exon in miniSmB are proximal to the splice sites. Adjacent 12mers were either substituted or deleted by linkerscanning mutagenesis of miniSmB. Sequences in the 3′ splice site (−20 to the first 2 nt of the exon) and 5′ splice site (last 2 nt of the exon to +6) were analyzed separately. (A) HeLa cells were transfected with either a minigene containing the mutation/deletion, or with the wildtype miniSmB. The effect on alternative exon inclusion for each substitution (left bars within each 12nt window) or deletion (right bars within each 12nt window) was assayed by RTPCR, using primers specific for the heterologous flanking exons of the minigene. The bar height represents the exon inclusion level of transcripts from the minigene containing the mutation or deletion, when compared to the exon inclusion level from the wildtype miniSmB. Error bars represent the standard deviations determined from two or three independent transfections. (B) Effect of SmB/B′ knockdown on exon inclusion levels of mutated or deleted minigenes. The bar height represents the exon inclusion level of the minigene containing a particular mutation or deletion in the SmB/B′ knockdown, when compared to its inclusion level in cells transfected with a control nontargeting siRNA. For comparison, the effect of SmB/B′ knockdown on the wildtype miniSmB (26±4% more skipping) is shown by the dotted lines (calculated from 10 independent transfections). Where present, error bars in (B) represent standard deviation for two independent transfections. Abbreviations: %in, percent exon inclusion; nt, nucleotides; I/ESS, intronic/exonic splicing silencer; I/ESE, intronic/exonic splicing enhancer; siSmB/B′, siRNA knockdown of SmB/B′; siNT, control knockdown using nontargeting siRNA.

4.3.5 Mutations that strengthen the 5′ss reduce the effects of SmB/B′ knockdown

To determine the role of 5′ss sequences in Sm protein knockdowndependent effects on miniSmB AS, 5′ss mutations were introduced into miniSmB, and the level of exon inclusion was assayed in control and SmB/B′ knockdown conditions (Figure 46A). Two mutations that strengthen the 5′ss increase the inclusion level of the exon, yet, unlike wildtype miniSmB but similar to the deletion mutant described above that strengthens the 5′ss, show very little skipping when SmB/B′ is knocked down (Figure 46A; ‘consensus’ and ‘strong’). However, SmB/B′ knockdown sensitivity is restored by the introduction of additional mutations in the 5′ss consensus minigene that weaken the 5′ss (U+6C, A+3G; Figure 46A). In contrast, a mutation that strengthens the 3′ss (U10ACAG|G) also increases the inclusion level of the minigene yet does not abrogate the effect of SmB/B′ knockdown (Figure 46B). Similar results were obtained for two other mutations that increase the strength of the 3′ss (Appendix 18). Thus, minigenes with strong 5′ or 3′ splice sites have similarly high basal levels of exon inclusion (~90%), yet the effect of SmB/B′ knockdown on exon skipping is only abrogated for minigenes that have mutations that strengthen the 5′ss. Taken together with the observation that SmB/B′ or SmD1 knockdown results in reduced levels of U1 snRNP (Figure 44), my data provide evidence that the suboptimal 5′ss of the SNRPB PTCintroducing alternative exon is necessary for its sensitivity to Sm protein depletion.

Figure 46. Mutations that strengthen the 5′ss (splice site), but not mutations that strengthen the 3′ss, reduce the effects of SmB/B′ knockdown on miniSmB AS. HeLa cells were transfected with nontargeting (NT) or SmB/B′specific siRNAs. Cells were then cotransfected with wildtype (wt) miniSmB or with minigenes harbouring mutations in the 5′ss (A) or 3′ss (B) as indicated above the gel images, along with either empty vector or a vector encoding the 3xFlagtagged SmB cDNA construct. (A) Increasing the 5′ss strength (‘consensus’ and ‘strong’) results in increased % inclusion relative to wt, as well as a marked reduction in the effect of SmB/B′ knockdown on % inclusion. Secondary mutations introduced into the ‘consensus’ construct to decrease the 5′ss strength (‘U+6C’ and ‘A+3G’) restore sensitivity to SmB/B′ knockdown. (B) Increasing the 3′ss strength results in increased % inclusion relative to wt, but no reduction in the effect of SmB/B′ knockdown. RTPCR assays and quantifications of % exon inclusion are as in Figure 42C. ψ, pseudouridine.

4.4 Discussion

In this chapter, I characterized the role of ASNMD in the feedback regulation of the core splicing factor SmB/B′. Through detailed mutagenesis of sequences surrounding a highly conserved PTCintroducing SNRPB exon, and the observation that knockdown of SmB/B′ results in reduced levels of U1 but not U2 snRNP, my results demonstrated that the strength of the 5′ss of an alternative exon may be an important determinant of its sensitivity to depletion of this core spliceosomal component. The role of SmB/B′ in regulating its own expression via ASNMD further suggested that it may also regulate the AS of other transcripts. This possibility will be explored in the next chapter.

4.4.1 Feedback and crossregulation of splicing factors

Several splicing factors and other RNA binding proteins are controlled through feedback via ASNMD (reviewed in Lareau et al. 2007a; McGlincy and Smith 2008). In addition, ‘crossregulation’ through ASNMD has been found to occur between gene family members and paralogues of auxiliary splicing regulators, including: HNRNPL and HNRPLL (Rossbach et al. 2009); PTBP1, PTBP2 (also known as nPTB/brPTB) and ROD1 (also known as PTBP3) (Wollerton et al. 2004; Boutz et al. 2007b; Makeyev et al. 2007; Spellman et al. 2007); the fly homologues of the SR protein SRSF3 (also known as SRp20), Rbp1 and Rbp1like (Kumar and Lopez 2005); the Tcellrestricted intracellular antigen1 RNAbinding proteins TIA1 and TIAL1 (also known as TIAR) (Le Guiner et al. 2001; Izquierdo and Valcarcel 2007); CUGBP and ETR3like family members CELF1 (also known as CUGBP1) and CELF2 (also known as CUGBP2) (Dembowski and Grabowski 2009); and the muscleblindlike factors MBNL1 and MBNL2 (Lin et al. 2006; Kalsotra et al. 2008). Interestingly, analogous to the observations in the present study, it was recently shown that the minor spliceosomal snRNP components U11-48K and U11/U12-65K (also known as SNRNP48 and RNPC3, respectively) are regulated posttranscriptionally through a feedback mechanism involving AS and ASNMD (Verbeeren et al. 2010).

Building on these examples, I propose that there is crossregulation between SmB/B′ and its closely related paralogue, SmN, encoded by the imprinted SNRPN locus which arose by duplication of the SNRPB gene in mammals (Rapkins et al. 2006). Unlike SmB/B′, which is widely expressed, SmN is expressed primarily in the brain and heart (Appendix 16) (McAllister et al. 1988; McAllister et al. 1989). The expression of SmN is often disrupted in PraderWilli syndrome (PWS), a disorder with a range of symptoms including cognitive impairment (reviewed in Cassidy and Driscoll 2009). However, in brain tissue from PWS individuals or mouse models lacking SmN expression, SmB expression is upregulated through a previously unknown mechanism (Yang et al. 1998; Gray et al. 1999a; Gray et al. 1999b). My results strongly suggest that this apparent dosage compensation occurs by crossregulation between SmN and SmB/B′ involving the highly conserved PTCintroducing exon I have defined in SNRPB. In particular, I observed that SmN, SmB′ and SmB expressed from cDNAs display very similar activity in the restoration of inclusion levels of the SNRPB PTCexon when endogenous SmB/B′ is knocked down. It follows, therefore, that elevated SmN expression in the brain would lead to reduced levels of SmB/B′ by promoting inclusion of its PTCexon. This repression would be relieved when SmN expression is disrupted in PWS, allowing increased expression of SmB/B′. Such a mechanism could also relate to the concomitant reduction in SmB/B′ expression upon increased expression of SmN in the postnatal relative to the embryonic rodent brain (Grimaldi et al. 1993). My results show that these highly similar proteins have overlapping functions and therefore are capable of crossregulation via ASNMD in vivo. It is also interesting to consider that crossregulation of these paralogues via ASNMD may reduce the phenotypic severity of loss of SmN expression.

Chapter 5

Data presented in this chapter are adapted with permission from the following publication:

Saltzman AL, Pan Q, Blencowe BJ. 2011. Regulation of alternative splicing by the core spliceosomal machinery. Genes Dev 25(4), 373-384. doi: 10.1101/gad.2004811 Copyright © 2011 by Cold Spring Harbor Laboratory Press. For reprint, see Appendix 15.

Contributions:

Sandy (Qun) Pan (Blencowe Lab) constructed the database of human cassette alternative exons, aligned the RNASeq reads to splice junctions, and calculated the expression levels from the RNASeq data.

5 Regulation of alternative splicing by the core spliceosomal machinery

5.1 Introduction

Previous studies have shown that mutation, deletion, or knockdown of core spliceosomal and spliceosome assembly factors can result in altered splicing patterns in yeast (Clark et al. 2002; Pleiss et al. 2007; Kawashima et al. 2009; Campion et al. 2010), fly (Park et al. 2004) and mammalian cells (Massiello et al. 2006; Pacheco et al. 2006; Hastings et al. 2007; Zhang et al. 2008; Baumer et al. 2009). However, the features that underlie the differential sensitivity of introns or alternative exons to particular defects in the core splicing machinery are not well understood. In the previous chapter, I showed that levels of the core splicing factor SmB/B′ affect AS of an exon in its own premRNA. These results indicated that SmB/B′ would provide a good model to investigate how core spliceosomal components can regulate AS, in addition to their wellestablished roles in constitutive splicing.

5.1.1 Summary

Figure 51. Overview of Chapter 5.

In this chapter, I show that knockdown of SmB/B′ leads to a striking reduction in the inclusion levels of many alternative exons, with comparatively few effects on constitutive exon splicing levels (Figure 51). The alternative exons affected by SmB/B′ knockdown are significantly enriched in functions related to RNA processing and RNA binding. Changes in the inclusion levels of a subset of these alternative exons also appear to control the expression levels of the corresponding mRNAs by ASNMD. My results thus reveal a role for the core spliceosomal machinery in establishing the inclusion levels of a specific subset of alternative exons, and further suggest that changes in the levels of these alternative exons control the expression of other RNA processing factors.

5.2 Materials and Methods

5.2.1 Analysis of AS and transcript levels by RNASeq

Total RNA was submitted to Illumina for the FastTrack mRNASeq service, and 50 nt reads were generated (siNT, 4923 MB; siSmB/B′, 2551 MB; siSRSF1, 2814 MB). Cassette AS events (n=27,240) were mined by aligning EST/cDNA sequences to the genome essentially as described (Pan et al. 2004; Pan et al. 2005). The mRNASeq reads were mapped to exonexon junction sequences in this database of cassette AS events as described (Pan et al. 2008). Exonexon junctions were filtered for coverage in all three samples by matching to one or both of the following two criteria, where exonA is the alternative exon, and exons C1 and C2 are the upstream and downstream flanking exons, respectively: (i) ≥20 reads matching the skipped junction (exonC1:exonC2), or (ii) ≥20 reads matching the included junction with higher coverage and ≥15 reads matching the included junction with lower coverage (included junctions: exonC1:exonA and exonA:exonC2). Percent inclusion was calculated using the junction read counts as follows: avg(C1:A, A:C2)/[C1:C2 + avg(C1:A, A:C2)]. In parallel, sequencing reads were also aligned to RefSeq transcripts, and transcript levels were estimated using the reads per kilobase of exon per million mapped reads (RPKM) calculation (Mortazavi et al. 2008). Data for filtered AS events (n=5752) are provided in Appendix 20. Sequencing read data are also deposited in the NCBI Gene Expression Omnibus (GEO; accession GSE26463).
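The coverage filter, the percent inclusion calculation and the RPKM calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used for this work: the function names and the example read counts are hypothetical, the ratio is multiplied by 100 to express it as a percentage, and the filter is applied to a single sample (the text requires it to hold in all three samples).

```python
def passes_coverage(c1_a, a_c2, c1_c2, min_reads=20, min_lower=15):
    """Return True if junction counts for one sample meet criterion (i) or (ii).

    c1_a, a_c2: reads on the two included junctions (C1:A and A:C2).
    c1_c2: reads on the skipped junction (C1:C2).
    """
    skipped_ok = c1_c2 >= min_reads                      # criterion (i)
    included_ok = (max(c1_a, a_c2) >= min_reads          # criterion (ii)
                   and min(c1_a, a_c2) >= min_lower)
    return skipped_ok or included_ok


def percent_inclusion(c1_a, a_c2, c1_c2):
    """100 * avg(C1:A, A:C2) / [C1:C2 + avg(C1:A, A:C2)]; None if no reads."""
    included = (c1_a + a_c2) / 2
    total = c1_c2 + included
    if total == 0:
        return None
    return 100.0 * included / total


def rpkm(read_count, exon_length_nt, total_mapped_reads):
    """Reads per kilobase of exon per million mapped reads."""
    return read_count * 1e9 / (exon_length_nt * total_mapped_reads)


# Hypothetical junction counts for one cassette exon in one sample.
counts = {"c1_a": 30, "a_c2": 10, "c1_c2": 20}
if passes_coverage(**counts):
    print(f"percent inclusion = {percent_inclusion(**counts):.1f}")
```

Averaging the two included junctions keeps the included and skipped isoforms on a comparable footing, since an included transcript contributes reads to two junctions whereas a skipped transcript contributes to only one.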

A database of internal consecutive constitutive exon triplets (n=33,319) was constructed using the Galaxy tool (http://main.g2.bx.psu.edu/) (Blankenberg et al. 2007; Blankenberg et al. 2010) as follows: exons from UCSC known genes that overlapped with genes in our AS database were selected following removal of exons that overlap sequences in the UCSC knownAlt track as well as removal of exons that overlap our cassette AS database. Reads were aligned to the exon-exon junctions and filtered as described above for the AS events. Data for filtered exon triplets (n=8626) are provided in Appendix 21.


5.2.2 Calculation of Splice Site Strength

Strengths of splice sites for exons profiled by RNA-Seq were calculated using maximum entropy models (Yeo and Burge 2004), available online at the Burge lab at MIT (http://genes.mit.edu/burgelab/maxent/Xmaxentscan_scoreseq.html). The 5′ss scoring uses the last 3 nt of the exon and the first 6 nt of the downstream intron, while the 3′ss scoring uses the last 20 nt of the upstream intron and the first 3 nt of the exon.
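As an illustration of the sequence windows described above, the following Python sketch extracts the 9 nt 5′ss window and the 23 nt 3′ss window for an exon from a longer sequence. The function names, the 0-based half-open coordinate convention and the toy sequence are assumptions made for this example; the maximum entropy scoring itself is not reproduced here.

```python
def five_prime_ss_window(seq, exon_end):
    """Last 3 nt of the exon plus first 6 nt of the downstream intron (9 nt).

    exon_end is the 0-based position just after the last exon nucleotide.
    """
    if exon_end < 3 or exon_end + 6 > len(seq):
        raise ValueError("sequence too short for a 5' splice site window")
    return seq[exon_end - 3:exon_end + 6]


def three_prime_ss_window(seq, exon_start):
    """Last 20 nt of the upstream intron plus first 3 nt of the exon (23 nt).

    exon_start is the 0-based position of the first exon nucleotide.
    """
    if exon_start < 20 or exon_start + 3 > len(seq):
        raise ValueError("sequence too short for a 3' splice site window")
    return seq[exon_start - 20:exon_start + 3]


# Toy example: 20 nt of upstream intron, a 9 nt exon, 8 nt of downstream intron.
upstream_intron = "T" * 18 + "AG"
exon = "ACGTTGCAG"
downstream_intron = "GTAAGTCC"
sequence = upstream_intron + exon + downstream_intron
exon_start = len(upstream_intron)
exon_end = exon_start + len(exon)

print(three_prime_ss_window(sequence, exon_start))
print(five_prime_ss_window(sequence, exon_end))
```

The two windows would then be passed to the scoring tool, one sequence per splice site.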

5.2.3 Gene ontology (GO) analysis

Enrichment of GO (Ashburner et al. 2000) or Pathway Commons (Cerami et al. 2010) terms (Figure 5-5D and Appendix 22) was calculated for genes containing alternative exons changing by ≥30% inclusion in the SmB/B′ knockdown (n=235) relative to all genes containing alternative exons passing our filtering criteria (n=3173) using WEB-based GEne SeT AnaLysis Toolkit (WebGestalt; http://bioinfo.vanderbilt.edu/webgestalt/) (Zhang et al. 2005a). A minimum of 10 genes per category was specified and enrichment p-values were calculated using the hypergeometric test and adjusted for multiple testing using the false discovery rate (FDR) (Benjamini and Hochberg 1995). A network of GO terms with p<0.005 and FDR<0.1 was constructed using the Enrichment Map plugin (http://baderlab.org/Software/EnrichmentMap/) (Isserlin et al. 2010) for Cytoscape (Cline et al. 2007). Three nodes for GO terms with an identical set of 14 genes were collapsed into the single 14-gene node shown (Figure 5-5D), and nodes were arranged using the Cytoscape hierarchic layout.
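The enrichment statistics described above were computed with WebGestalt. The following standalone Python sketch only illustrates the underlying calculations, namely a one-sided hypergeometric test followed by Benjamini-Hochberg adjustment. The function names and the per-term gene counts are hypothetical; only the foreground (n=235) and background (n=3173) sizes come from the text.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k): probability of seeing at least k annotated genes among
    n foreground genes drawn from N background genes, K of which are annotated."""
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical terms: (annotated genes in foreground, annotated genes in background).
terms = {"term_A": (30, 150), "term_B": (12, 140), "term_C": (15, 60)}
raw = {name: hypergeom_enrichment_p(k, 235, K, 3173) for name, (k, K) in terms.items()}
fdr = dict(zip(raw, benjamini_hochberg(list(raw.values()))))
for name in terms:
    print(name, raw[name], fdr[name])
```

In practice, terms annotated to fewer than the minimum number of genes (10 in this analysis) would be dropped before the test, which also reduces the multiple testing burden.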

5.2.4 Statistical Analysis

To compare the frequency of SmB/B′ knockdown-dependent changes in % inclusion of alternative vs. constitutive exons profiled by RNA-Seq, the Chi-square test was used. Sample sizes are given in the text. To compare the median splice site strengths, the lengths of profiled exons, and the changes in transcript levels between subsets of AS events, the non-parametric Wilcoxon rank sum test was used. Sample sizes are shown in Figure 5-5.
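For readers unfamiliar with the rank sum test, the following Python sketch shows one common way to compute it, using average ranks for ties and a normal approximation for the two-sided p-value. This is an illustration under stated assumptions rather than the software used for the analyses in this chapter: the function names and example scores are hypothetical, and the variance is not corrected for ties or continuity.

```python
from math import erfc, sqrt


def average_ranks(values):
    """Assign 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Return (U statistic for x, z score, two-sided p) by normal approximation."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])                      # rank sum of the first sample
    u = w - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = erfc(abs(z) / sqrt(2))
    return u, z, p


# Hypothetical splice site scores for two small groups of exons.
affected = [7.1, 8.0, 8.3, 6.9, 7.5]
others = [8.4, 9.1, 8.8, 7.9, 9.5]
print(rank_sum_test(affected, others))
```

The normal approximation is adequate for groups of the size profiled here (hundreds to thousands of exons); for very small samples an exact test would be preferable.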


5.3 Results

5.3.1 A widespread role for core splicing factors in promoting the inclusion of alternative exons

To determine whether reducing the levels of SmB/B′, and thus Sm-class snRNPs, via SmB/B′ knockdown affects the inclusion of alternative exons from other genes, high-throughput RNA sequencing (RNA-Seq) was performed on RNA from HeLa cells following knockdown of SmB/B′ using an siRNA pool, and on RNA from cells transfected with a non-targeting siRNA pool as a control (Figure 5-2A). To compare the specificity and extent of the effects of SmB/B′ knockdown on AS with those of a relatively well-defined splicing regulator, I used another siRNA pool to knock down the SR family protein SRSF1 (also known as ASF, SF2, SFRS1; Figure 5-2A). A comparable knockdown efficiency of >85% was achieved in both factor knockdowns.

The RNA-Seq reads (50 nt) were mapped to exon-exon junctions in a database of EST/cDNA-supported cassette-type AS events (see Methods). Counts of reads mapping to included versus skipped exon junctions were used to calculate the percent inclusion level (% inclusion) of these alternative exons. Alternative exons meeting filtering criteria based on junction read coverage (n=5752; Appendix 20) were analyzed, and the proportions of cassette alternative exons changing in inclusion level between each knockdown and the control were plotted (Figure 5-2B, left).

Knockdown of SmB/B′ specifically reduced the inclusion levels of a large number of alternative exons and, overall, affected the inclusion levels of more than twice as many alternative exons as knockdown of SRSF1. Relative to the control knockdown, 18% (n=1035) of alternative exons were ≥10% more skipped in the SmB/B′ knockdown, whereas only 0.8% (n=48) were ≥10% more included. Moreover, all alternative exons showing a change in % inclusion of ≥30% (n=268) were more skipped in the SmB/B′ knockdown compared to the control. In contrast, knockdown of SRSF1 resulted in 7.4% (n=423) of alternative exons changing by ≥10% inclusion, and 61% of these were more skipped while 39% were more included (Figure 5-2B, left). Most alternative exons strongly affected by SmB/B′ knockdown were not similarly affected by knockdown of SRSF1, as shown in the heat map of the % inclusion of alternative exons showing ≥30% change in the SmB/B′ knockdown (Figure 5-2C). Thus, the SmB/B′ and SRSF1 knockdowns affected distinct sets of alternative exons, and nearly all changes in alternative exon inclusion following knockdown of SmB/B′ represent increased exon skipping.
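The tallies reported above amount to classifying each alternative exon by its change in % inclusion between knockdown and control. A minimal Python sketch of that bookkeeping follows; the function name and the example values are hypothetical, and the sketch assumes that the thresholds refer to absolute differences in % inclusion (percentage points).

```python
from collections import Counter


def classify_change(psi_control, psi_knockdown, threshold=10.0):
    """Label an exon by its change in % inclusion relative to the control."""
    delta = psi_knockdown - psi_control
    if delta <= -threshold:
        return "more skipped"
    if delta >= threshold:
        return "more included"
    return "unchanged"


# Hypothetical (control, knockdown) % inclusion values for four exons.
events = [(80.0, 45.0), (60.0, 58.0), (20.0, 35.0), (90.0, 55.0)]
counts = Counter(classify_change(ctrl, kd) for ctrl, kd in events)
for label, n in counts.items():
    print(label, n, f"{100.0 * n / len(events):.1f}%")
```

Raising the threshold argument to 30 gives the stricter subset used for the heat map and the gene ontology analysis.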

Figure 5-2. Quantitative analysis of alternative splicing by RNA-Seq reveals that knockdown of SmB/B′ leads to increased skipping of alternative exons. (A) Western blots indicate that SmB/B′ (upper panels) or SRSF1 (lower panels; also known as SF2, ASF, SFRS1) were efficiently depleted by siRNA transfections. NT, non-targeting siRNA. Serial dilutions of protein extract indicate that the blots are semi-quantitative. (B) The percentage of alternative exons (left) or constitutive exons (right and inset) showing changes in inclusion levels upon knockdown of SmB/B′ or SRSF1 when compared to inclusion levels in cells transfected with a control non-targeting (NT) siRNA are shown in a bar graph. (C) Distinct effects of SmB/B′ and SRSF1 knockdowns. The % exon inclusion values in the control non-targeting (NT), SmB/B′ and SRSF1 knockdowns are shown for 268 alternative exons found to have a ≥30% inclusion change (increased skipping) in the SmB/B′ knockdown when compared to the control (NT). The AS events are ordered from top to bottom by the absolute difference in % inclusion between the SRSF1 knockdown and the control (NT).


To determine the effect of knockdown of SmB/B′ on the inclusion levels of constitutive exons, the RNA-Seq reads were aligned to exon-exon junctions in a database of high-confidence internal constitutive exons using the same filtering criteria as applied for alternative exons (refer to Methods; Appendix 21). Only 1.9% of the constitutive exons (160/8626) showed ≥10% skipping when SmB/B′ was knocked down, compared to 18% (1035/5752) of alternative exons (Figure 5-2B; p<1*10^-4; chi-square). These results indicate that a specific subset of alternative exons is particularly sensitive to reduced snRNP levels as a consequence of SmB/B′ knockdown, whereas constitutive exon splicing is largely unaffected.
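The comparison above corresponds to a chi-square test on a 2×2 table of exon type (alternative or constitutive) against outcome (≥10% more skipped or not). The Python sketch below shows one way such a test can be computed from the counts given in the text. The function name is hypothetical, and the sketch assumes a Pearson chi-square statistic with one degree of freedom and no continuity correction, which may differ from the exact procedure used for the reported p-value.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (chi-square statistic, p-value for 1 degree of freedom).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 df.
    p = erfc(sqrt(chi2 / 2))
    return chi2, p


# Counts from the text: 1035 of 5752 alternative exons and
# 160 of 8626 constitutive exons showed >=10% more skipping.
chi2, p = chi_square_2x2(1035, 5752 - 1035, 160, 8626 - 160)
print(chi2, p)
```

With these counts the statistic is far above any conventional significance cutoff, consistent with the p<1*10^-4 reported above.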

To assess the accuracy of alternative exon inclusion levels and knockdown-dependent changes detected by analysis of the RNA-Seq data, AS events analyzed above (n=5752) were divided into three equally sized groups based on their junction read coverage. Events showing more skipping, no change, or more inclusion when comparing the SmB/B′ knockdown to the control were selected from these three groups and the alternative exon inclusion levels were measured by RT-PCR using primer pairs targeting the flanking constitutive exons (n=28 events). The RNA-Seq measurements for percent inclusion levels agreed very well with those from the RT-PCR data (Figure 5-3A; r=0.97). Representative RT-PCR results are shown in Figure 5-3B and all results are shown in Appendix 23. Twenty-one of these AS events were also assayed in two additional independent knockdowns of SmB/B′, and the knockdown-dependent AS changes were confirmed in all cases (Figure 5-4).
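The agreement between the two measurement methods is summarized above by a correlation coefficient. Assuming that r denotes the Pearson correlation, it can be computed from paired % inclusion values as in the following Python sketch; the function name and the example values are hypothetical and are not the measurements reported in Appendix 23.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical % inclusion values for the same exons measured by two methods.
rna_seq = [12.0, 35.5, 58.0, 76.5, 91.0]
rt_pcr = [15.0, 33.0, 61.0, 72.0, 94.0]
print(pearson_r(rna_seq, rt_pcr))
```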


Figure 5-3. Changes in alternative exon inclusion levels measured by RNA-Seq are confirmed by RT-PCR assays. (A) Scatterplot showing agreement between % inclusion of 27 alternative exons in the three knockdowns, as measured by RT-PCR vs. RNA-Seq (left), and between differences in inclusion levels (knockdown relative to control non-targeting, NT) for the same 27 alternative exons (right). (B) Representative RT-PCR assays using primers annealing to flanking constitutive exons. For all RT-PCR assays, see Appendix 23. Gene names: hnRNPAB, hnRNPH1, heterogeneous nuclear ribonucleoprotein A/B and H1; SRSF7 (also known as SFRS7, 9G8), serine/arginine-rich splicing factor 7; SFRS18 (also known as SRrp130), splicing factor, arginine/serine-rich 18; DDX11, DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 11; DDX49, DEAD (Asp-Glu-Ala-Asp) box polypeptide 49; CPSF7, cleavage and polyadenylation specific factor 7, 59kDa; CENPN, centromere protein N. The number following the period designates the AS event ID. Names with an asterisk, PTC (Premature Termination Codon) upon skipping.


Figure 5-4. Confirmation of the effects of SmB/B′ knockdown on alternative exon inclusion in two independent knockdowns with different siRNAs. (A) Increased skipping of all tested alternative exons (21 of the 27 shown in Appendix 23) was confirmed by RT-PCR assays in two independent knockdowns (SmB/B′10, SmB/B′11). NT, non-targeting. (B) Western blot showing SmB/B′ protein depletion level using three different siRNAs (9, 10, 11). Two transfection replicates per siRNA are shown. The three RNA samples used for RT-PCR in (A) correspond to the marked (●) lanes.


5.3.2 Characteristics of SmB/B′ knockdown-dependent alternative exons

The results from analyzing the miniSmB reporter mutants indicated that the presence of a suboptimal 5′ss is an important determinant of the effect of SmB/B′ knockdown on alternative exon inclusion levels. To address whether this and other sequence features account more generally for the effects of SmB/B′ knockdown on exon inclusion levels, I next investigated the relationship between splice site strength and sensitivity to SmB/B′ knockdown by comparing the average splice site strength scores (Yeo and Burge 2004) of the affected alternative exons to those of other alternative and constitutive exons analyzed above by RNA-Seq.

Consistent with previous results (Stamm et al. 2000; Clark and Thanaraj 2002; Itoh et al. 2004), the 3′ss (Figure 5-5A, top panel) and 5′ss (Figure 5-5A, middle panel) of the profiled alternative exons are on average weaker than those of the constitutive exons (Figure 5-5A, see legend, bottom panel; 3′ss: p=4.5*10^-27; 5′ss: p=5.7*10^-41, Wilcoxon rank sum test). However, the average strength of the 3′ss of alternative exons whose inclusion is affected by the SmB/B′ knockdown is higher than that of the other profiled alternative exons (8.57 vs. 8.14; p=0.02, Wilcoxon rank sum test), and not significantly different from the average strength of the 3′ss of the constitutive exons (8.57 vs. 8.60; Figure 5-5A). In contrast, the average strength of the 5′ss of alternative exons affected by SmB/B′ knockdown was lower than that of the other profiled alternative exons, although this difference was not statistically significant (8.06 vs. 8.34; Figure 5-5A). In addition, alternative exons affected by SmB/B′ knockdown were on average shorter (median=86 nt) than the other alternative exons (median=104 nt; p=3.9*10^-11, Wilcoxon rank sum test; Figure 5-5B). Thus, alternative exons showing more skipping when SmB/B′ is knocked down are on average shorter and have a stronger 3′ss than other profiled alternative exons. These results are consistent with my SNRPB minigene mutagenesis results, in which the effect of SmB/B′ knockdown on exon inclusion was not reduced by mutations increasing the 3′ss strength, but was essentially eliminated by mutations increasing the 5′ss strength (Figure 4-6).

5.3.3 Changes in transcript levels associated with SmB/B′ knockdown-dependent PTC-introducing alternative exons

To investigate the functional consequences of SmB/B′ knockdown-dependent AS changes, the capacity of these AS events to produce NMD-targeted isoforms that affect overall mRNA expression levels of the corresponding genes was next determined. The mRNA expression levels for genes containing SmB/B′ knockdown-sensitive alternative exons (≥10% more skipping) were measured by aligning RNA-Seq reads to RefSeq transcripts (see Methods). The fold-change in expression level in the SmB/B′ knockdown compared to the control was plotted for AS events that do not introduce a premature termination codon (PTC), and for events that introduce a PTC in the exon-included or skipped isoform (Figure 5-5C). For exons more skipped in the SmB/B′ knockdown that introduce a PTC upon skipping, the overall mRNA levels from the genes were on average reduced in the SmB/B′ knockdown, and the median fold-change was significantly different from that of non-PTC-introducing events (p=2.5*10^-8; Wilcoxon rank sum test; Figure 5-5C). Examples of three such AS events are shown in Figure 5-3B (DDX49.6, DDX11.11, CPSF7.4). Conversely, for exons more skipped in the SmB/B′ knockdown that introduce a PTC upon inclusion, the overall mRNA levels from the corresponding genes were on average higher in the SmB/B′ knockdown (p=4*10^-3; Figure 5-5C). These results are consistent with AS-NMD acting to both positively and negatively modulate transcript levels of these genes in response to reduced snRNP levels as a consequence of SmB/B′ depletion.

Figure 5-5. Characteristics of alternative exons affected by knockdown of SmB/B′. (A, B) Cumulative distribution function (CDF) plots of 3′ splice site (ss) scores (A, top), 5′ss scores (A, middle), and exon lengths (B) for alternative and constitutive exons profiled by RNA-Seq (Figure 5-2). Alternative exons that show a pronounced increase in skipping (≥30%) upon knockdown of SmB/B′ are plotted separately from other profiled alternative exons, as shown in the legend (bottom panel). (C) CDF of the fold change in overall mRNA transcript level (log2 scale; SmB/B′ knockdown vs. control) of transcripts containing AS events that are more skipped upon knockdown of SmB/B′ compared to the control knockdown. Transcripts containing AS events that introduce a PTC upon exon inclusion or skipping, or that do not introduce a PTC (‘No PTC’) are plotted separately as shown in the legend below the plot. Abbreviations: alt, alternative; const, constitutive; PTC, premature termination codon. (D) Enriched gene ontology (GO) terms (p<0.005 and FDR<0.1) annotating genes containing exons affected by SmB/B′ knockdown (≥30% more skipping) are represented as a network of gene sets. Each node represents the set of genes annotated with the indicated GO term. Node size is proportional to the number of genes annotated by the term (indicated by the node label), and edge thickness is proportional to the number of genes in common between the sets.
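The transcript-level comparison in Section 5.3.3 rests on two simple quantities: RPKM values for each transcript and the log2 fold change between knockdown and control. The Python sketch below illustrates both calculations. The function names and the read counts are hypothetical, and the sketch assumes that both RPKM values are positive (no pseudocount is added).

```python
from math import log2


def rpkm(read_count, exon_length_nt, total_mapped_reads):
    """Reads per kilobase of exon per million mapped reads."""
    if exon_length_nt <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (exon_length_nt * total_mapped_reads)


def log2_fold_change(rpkm_knockdown, rpkm_control):
    """log2 of the knockdown/control expression ratio."""
    if rpkm_knockdown <= 0 or rpkm_control <= 0:
        raise ValueError("RPKM values must be positive")
    return log2(rpkm_knockdown / rpkm_control)


# Hypothetical transcript: 2000 nt of exon sequence, 10 million mapped reads
# per sample, 500 reads in the control and 250 reads in the knockdown.
control = rpkm(500, 2000, 10_000_000)
knockdown = rpkm(250, 2000, 10_000_000)
print(control, knockdown, log2_fold_change(knockdown, control))
```

A negative log2 fold change corresponds to reduced transcript levels in the knockdown, as expected when increased skipping introduces a PTC.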

5.3.4 SmB/B′ knockdown affects AS events in RNA-processing factor genes

The functional categories represented in genes containing SmB/B′ knockdown-sensitive alternative exons were examined using Gene Ontology (GO) term enrichment (Figure 5-5D and Appendix 22). Genes containing alternative exons showing more skipping upon knockdown of SmB/B′ (≥30%) were significantly enriched for terms related to nucleic acid binding and RNA processing (Figure 5-5D). These genes include spliceosome components, splicing regulatory factors such as SR, SR-related and hnRNP family proteins, mRNA 3′-end processing factors, RNA helicases and other RNA-binding proteins (e.g. Figure 5-3B, Figure 5-4 and Appendix 23). Similar results were also obtained using Pathway Commons annotations, which are compiled mostly from protein-protein interaction data (see Methods and Appendix 22). These results therefore support the conclusion that an important role for alternative exons affected by changes in the level of the core spliceosomal snRNP machinery is to coordinately control the expression of many RNA processing factors and other regulators of RNA.

5.4 Discussion

In this chapter, I identified alternative exons in RNA processing factor genes that are controlled by the levels of the core spliceosomal machinery. Some of these exons affect mRNA levels by introducing PTCs that elicit NMD. A subset of the affected exons likely plays a critical role in maintaining balanced levels of splicing and other RNA-associated factors. These results thus provide new insight into regulated exon networks as well as the functions of core spliceosomal components in AS.

5.4.1 Mechanisms of AS regulation by core splicing factors

Global analysis of AS events displaying altered inclusion upon knockdown of SmB/B′ revealed an association of these alternative exons with relatively weak 5′ splice sites, and with 3′ splice sites that were on average as strong as those of constitutive exons. The results presented in the previous chapter (Chapter 4) showed that knockdown of SmB/B′ results in depletion of U1 snRNP, and that 5′ss strength is an important determinant of the sensitivity of the SNRPB alternative exon to SmB/B′ knockdown. Thus, the reduced inclusion levels of a large number of alternative exons as a consequence of Sm protein depletion may be mediated more generally by reduced rates of interaction between the 5′ splice site and U1 snRNP. These findings may relate to previous observations revealing that the Sm complex contributes to the stability of the U1 snRNA–pre-mRNA interaction in yeast (Zhang et al. 2001), and that proper Sm core assembly is essential for snRNP stability (Jones and Guthrie 1990; Zhang et al. 2008). My results also relate to in vitro studies demonstrating that differential binding of U1 snRNP to stronger or weaker 5′ splice sites can affect the inclusion levels of a reporter alternative exon (Kuo et al. 1991), and that some splicing substrates differ in their requirement for U1 snRNP (Crispino et al. 1994; Tarn and Steitz 1994; Crispino et al. 1996). Furthermore, my data support evidence that altering the kinetics of spliceosomal rearrangements can affect splice site selection (Query and Konarska 2004; Yu et al. 2008). Such kinetic competition between splice sites may provide a basis for the changes in the alternative exon inclusion levels that I observe in this chapter, where specific splice sites are no longer efficiently recognized when core splicing components, normally present at saturating levels, may become rate-limiting (Smith et al. 2008; Graveley 2009; Nilsen and Graveley 2010).

5.4.2 Physiological roles of AS regulation by general splicing factors

The results in this chapter contribute to emerging evidence in the field that the relative concentration or activity of general splicing factors can affect splice site selection. In addition, the results suggest that feedback and coordinated control of RNA processing factors is a physiological role for such regulation. Previous studies have also supported physiological roles for the core spliceosomal machinery in the regulation of AS. For example, components of snRNPs are differentially expressed in mammalian cells and tissues (Grosso et al. 2008; Castle et al. 2010) and several core splicing factors are differentially expressed during development and in tissues of the fly (Park et al. 2004). Evidence for critical roles of core spliceosomal components and assembly factors has also emerged from the study of certain human diseases. For example, mutations in components of the U4/U6.U5 tri-snRNP particle are associated with retinitis pigmentosa (reviewed in Mordes et al. 2006), mutations in the U4/U6 snRNP recycling factor SART3 (also known as p110) are associated with the skin disorder disseminated superficial actinic porokeratosis (Zhang et al. 2005b), and loss or mutation of the widely expressed snRNP assembly factor SMN1 causes spinal muscular atrophy (reviewed in Burghes and Beattie 2009). Although the specific mechanisms and transcript targets that are responsible for these diseases are largely unknown, these studies point to the importance of maintaining appropriate expression of the core splicing machinery and to tissue-specific effects of loss or mutation of core splicing factors.


Chapter 6


6 Conclusions

While AS is known for its role in expanding the proteome, my work focuses on the role of AS in expanding posttranscriptional regulatory potential through functional coupling with NMD. The work in this thesis provides new evidence for the roles of AS-NMD in feedback control and in the coordination of gene expression. My initial AS microarray profiling was the first large-scale experiment to determine the effect of NMD inhibition on mRNAs containing PTCs introduced by AS. This work showed that many predicted PTC-containing splice variants in EST/cDNA databases are unlikely to be substrates for NMD. Further analysis of these PTC-introducing AS events suggested that conservation of sequence or transcript architecture should be considered in order to bioinformatically distinguish NMD-regulated PTC-containing splice variants from those unlikely to be regulated by NMD.

My results from AS microarray profiling expanded the repertoire of genes that are likely to be regulated or autoregulated by AS-NMD by adding many genes encoding general components of the spliceosome. I also showed that the core snRNP component SmB/B′ can autoregulate its expression through AS-NMD, which suggested that this and perhaps other core splicing factors play a role in AS regulation. Consistent with this model, profiling of AS changes following SmB/B′ knockdown revealed a striking reduction in the inclusion levels of a subset of alternative exons, and further suggested that the levels of the core spliceosomal machinery can coordinate a network of alternative exons in RNA processing factor genes.

Together with other studies discussed in Sections 1.3.6 and 5.4.2, my work challenges the traditional view that basal splicing factors are not directly involved in AS regulation (reviewed in Graveley 2009). In a broader context, the concept that ‘constitutive’ spliceosome components may play transcript-specific regulatory roles shares parallels with recent work investigating the regulation of gene expression at the levels of transcription and translation. For example, in addition to the well-characterized actions of sequence-specific transcription factors, changes in the core promoter recognition complex can drive cell-type-specific changes in transcription (reviewed in Goodrich and Tjian 2010). In addition, a speculative model proposes that variations in the composition of ribosomes may play a role in translational regulation of specific transcripts (reviewed in Gilbert 2011). Understanding the roles of cell-type- or condition-specific changes in the basal gene expression machines may also help to explain the mechanisms through which genetic mutations in core splicing or translation factors lead to specific disease phenotypes (reviewed in Cooper et al. 2009).

6.1 Future Directions

6.1.1 What features underlie the differential dependencies of NMD substrates on UPF2 and UPF3/UPF3X?

My results from AS microarray profiling following knockdown of the three core NMD factors suggested that AS-NMD can occur through alternative UPF1-dependent branches of the NMD pathway. These results support other studies using reporters or endogenous NMD targets identified by mRNA expression microarrays, which have suggested that certain transcripts are degraded by NMD through mechanisms independent of UPF2 (Gehring et al. 2005), UPF3/3X (Chan et al. 2007; Tarpey et al. 2007; Avery et al. 2011), both UPF2 and UPF3/3X (Gehring et al. 2009) or the EJC (Buhler et al. 2006; Eberle et al. 2008; Singh et al. 2008). However, it is not known what determines differential sensitivity of NMD targets to these factors. The PTC-containing splice variants identified in this thesis with variable dependencies on UPF2/3X could be used to investigate the basis for these differences. Recapitulation of these differences in minigene reporters would allow the contribution of particular cis-acting RNA sequences to be analyzed. In addition, in light of recent work suggesting that EJC deposition is regulated (Sauliere et al. 2010), the role of transcript- or intron-specific EJCs could be studied as a potential mechanism for differential UPF factor dependencies.

6.1.2 Mechanisms of core splicing factor-dependent AS regulation

Data presented in my thesis provide insight into the features underlying the differential sensitivity of alternative exons to particular perturbations in the core splicing machinery. My data suggest that impaired U1 snRNA–5′ss interaction plays a role in SmB/B′ knockdown-dependent effects on the inclusion of the autoregulated exon and other alternative exons. I also showed that SmB/B′ knockdown results in both destabilization of U1 snRNA and reduced inclusion levels of alternative exons.

To help clarify whether the effects of SmB/B′ knockdown on alternative exon inclusion are primarily mediated by loss of functional U1 snRNP, the effects of U1 snRNA depletion on alternative exon inclusion levels could be assayed. In addition, the regions of SmB/B′ protein important for its effects on AS could be investigated through mutagenesis. This approach would provide functional insight into the role of the Sm proteins in mediating RNA-RNA interactions in the mammalian spliceosome, which might extend previous results in the yeast spliceosome (Zhang et al. 2001) and in histone 3′-end processing via the U7 snRNP (Yang et al. 2009) (see Introduction, Section 1.2.4). Along these lines, it is still unclear whether SmB/B′ directly contacts its own pre-mRNA, and whether or not such an interaction is sequence- or position-specific. My initial attempts to address this question using UV crosslinking strategies were inconclusive and could not definitively rule out a direct interaction (data not shown).

Recent work by Dr. Joanna Ip in the Blencowe lab indicated an additional layer of control for the SmB/B′ PTC-introducing alternative exon. Her work showed that SmB/B′ expression can be downregulated through RNA pol II elongation-dependent changes in the inclusion of this alternative exon, and suggested that this mechanism functions in response to cell stress (Ip et al. 2011). Therefore, the SmB/B′ PTC-introducing alternative exon could be used as a model to investigate the determinants of transcription-coupled splicing regulation.

6.1.3 Origins of ultra- and highly conserved nonsense exons

As shown by the work in this thesis as well as other studies (Lareau et al. 2007b; Ni et al. 2007; Yeo et al. 2007) (reviewed in McGlincy and Smith 2008), regulated AS-NMD is associated with highly conserved sequence elements. Though most of the examples of conserved AS-NMD that I have identified do not pass the stringent ‘ultraconservation’ criteria of another study (Section 1.5.3) (Bejerano et al. 2004), they are nevertheless highly conserved, and have significantly more conservation in the proximal flanking introns than other alternatively spliced exons examined. Furthermore, the conservation of the ‘PTC upon inclusion’ alternative exons, such as the example studied in the gene encoding SmB/B′, is particularly striking, since these exons do not appear to have protein-coding potential. These new cases of AS-NMD can provide a basis to examine the evolutionary origins and reasons for the extensive conservation of these sequences, which have thus far remained enigmatic.

The high sequence conservation associated with regulated AS-NMD might reflect a high density of cis-regulatory elements, such as those involved in splicing or other layers of regulation, conserved RNA secondary structure, and/or retroposon origins. Alternatively, these regions could have non-protein-coding functions when transcribed. It is also possible that combinations of these features contribute to conservation in different genes. For example, in the case of the SR protein SRSF1 (SF2/ASF), conserved 3′UTR sequences involved in AS-NMD also function in both translational autoregulation (Sun et al. 2010) and microRNA-mediated regulation (Wu et al. 2010). In the case of SRSF2 (SC35), structure probing revealed that a highly conserved 3′UTR region involved in AS-NMD forms a stem-loop that contains multiple binding sites bound by SRSF2 and other RNA-binding factors in both the terminal loop and unpaired regions of the stem (Dreumont et al. 2010). Thus, selection pressure to preserve RNA secondary structure, as well as multiple splicing factor recognition sites, likely contribute to the high sequence conservation of the SRSF2 3′UTR.

To investigate possible reasons underlying their sequence conservation, the examples of highly conserved regulated AS-NMD I have identified could be used for computational prediction of RNA secondary structure. In addition, small RNA transcriptome libraries could be examined for evidence of noncoding RNAs originating from these gene regions or targeting these regions. It would also be interesting to examine some of the most deeply conserved AS-NMD exons and predict their evolutionary histories. Analysis of a small group of classical mammalian SR proteins led to the hypothesis that AS-NMD arose independently in each gene, with the exception of two recent paralogues (Lareau et al. 2007b). However, the evolutionary dynamics of other highly conserved examples of AS-NMD have not been studied.

6.1.4 Networks of auto- and cross-regulation among RNA processing factors

Many examples of autoregulated AS events in splicing factors that are cross-regulated by paralogous factors or by unrelated factors have been identified (reviewed in McGlincy and Smith 2008). However, in most cases the functional roles of this cross-regulation have not been characterized in detail. My work suggested that cross-regulation via AS-NMD between paralogues SmB/B′ and SmN may enable a switch in SmN expression during neuronal development as well as dosage compensation upon loss of SmN expression in Prader-Willi syndrome (PWS). Further study of the physiological role of this switch would provide new insight into the roles of splicing factor cross-regulation. For example, a knockdown of SmN in mouse primary neurons or neuronal cell lines expressing high levels of SmN could be used to determine if the loss of SmN is sufficient to upregulate SmB/B′ expression, and if these highly similar proteins are functionally redundant in the regulation of neuronal splicing. In addition, model cell lines specifically lacking the SmB/B′ autoregulated exon (or such exons in other genes) could be created in order to probe its function. For example, single-cell analysis could be used to determine whether such exons reduce cell-to-cell variability in protein levels, as previously observed for transcriptional autoregulation (Becskei and Serrano 2000).

My mutagenesis of the autoregulated SmB/B′ exon suggested that additional regulators control its inclusion. This possibility could be investigated with an siRNA screen in which AS of this exon is assayed using a fluorescent reporter or another high-throughput method. This approach could also be extended to additional conserved AS events identified in this thesis in basal or regulatory splicing factors. Such a strategy would comprehensively determine the extent of AS cross-regulation among many splicing factors and provide a broad view of the coordination of the splicing machinery.


References

Ahn SH, Kim M, Buratowski S. 2004. Phosphorylation of serine 2 within the RNA polymerase II C-terminal domain couples transcription and 3' end processing. Mol Cell 13: 67-76.
Al-Ahmadi W, Al-Ghamdi M, Al-Haj L, Al-Saif M, Khabar KS. 2009. Alternative polyadenylation variants of the RNA binding protein, HuR: abundance, role of AU-rich elements and auto-Regulation. Nucleic Acids Res 37: 3612-3624.
Allemand E, Batsche E, Muchardt C. 2008. Splicing, transcription, and chromatin: a menage a trois. Curr Opin Genet Dev 18: 145-151.
Alon U. 2007. Network motifs: theory and experimental approaches. Nat Rev Genet 8: 450-461.
Amrani N, Ganesan R, Kervestin S, Mangus DA, Ghosh S, Jacobson A. 2004. A faux 3'-UTR promotes aberrant termination and triggers nonsense-mediated mRNA decay. Nature 432: 112-118.
Anastasaki C, Longman D, Capper A, Patton EE, Caceres JF. 2011. Dhx34 and Nbas function in the NMD pathway and are required for embryonic development in zebrafish. Nucleic Acids Res.
Anko ML, Morales L, Henry I, Beyer A, Neugebauer KM. 2010. Global analysis reveals SRp20- and SRp75-specific mRNPs in cycling and neural cells. Nat Struct Mol Biol 17: 962-970.
Aparicio SA. 2000. How to count ... human genes. Nat Genet 25: 129-130.
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT et al. 2000. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 25: 25-29.
Ashton-Beaucage D, Udell CM, Lavoie H, Baril C, Lefrancois M, Chagnon P, Gendron P, Caron-Lizotte O, Bonneil E, Thibault P et al. 2010. The exon junction complex controls the splicing of MAPK and other long intron-containing transcripts in Drosophila. Cell 143: 251-262.
Auboeuf D, Honig A, Berget SM, O'Malley BW. 2002. Coordinate regulation of transcription and splicing by steroid receptor coregulators. Science 298: 416-419.
Audibert A, Simonelig M. 1998. Autoregulation at the level of mRNA 3' end formation of the suppressor of forked gene of Drosophila melanogaster is conserved in Drosophila virilis. Proc Natl Acad Sci U S A 95: 14302-14307.
Avery P, Vicente-Crespo M, Francis D, Nashchekina O, Alonso CR, Palacios IM. 2011. Drosophila Upf1 and Upf2 loss of function inhibits cell growth and causes animal death in a Upf3-independent manner. Rna 17: 624-638.
Ayala YM, De Conti L, Avendano-Vazquez SE, Dhir A, Romano M, D'Ambrogio A, Tollervey J, Ule J, Baralle M, Buratti E et al. 2010. TDP-43 regulates its mRNA levels through a negative feedback loop. Embo J.
Banihashemi L, Wilson GM, Das N, Brewer G. 2006. Upf1/Upf2 regulation of 3' untranslated region splice variants of AUF1 links nonsense-mediated and A+U-rich element-mediated mRNA decay. Mol Cell Biol 26: 8743-8754.

107

Barabino SM, Blencowe BJ, Ryder U, Sproat BS, Lamond AI. 1990. Targeted snRNP depletion reveals an additional role for mammalian U1 snRNP in spliceosome assembly. Cell 63 : 293302. Baraniak AP, Chen JR, GarciaBlanco MA. 2006. Fox2 mediates epithelial cellspecific fibroblast growth factor receptor 2 exon choice. Mol Cell Biol 26 : 12091222. Barash Y, Calarco JA, Gao W, Pan Q, Wang X, Shai O, Blencowe BJ, Frey BJ. 2010. Deciphering the splicing code. Nature 465 : 5359. BarberanSoler S, Lambert NJ, Zahler AM. 2009. Global analysis of alternative splicing uncovers developmental regulation of nonsensemediated decay in C. elegans. Rna 15 : 16521660. BarbosaMorais NL, CarmoFonseca M, Aparicio S. 2006. Systematic genomewide annotation of spliceosomal proteins reveals differential gene family expansion. Genome Res 16 : 66 77. Barta I, Iggo R. 1995. Autoregulation of expression of the yeast Dbp2p 'DEADbox' protein is mediated by sequences in the conserved DBP2 intron. Embo J 14 : 38003808. Bateman JF, Freddi S, Nattrass G, Savarirayan R. 2003. Tissuespecific RNA surveillance? Nonsensemediated mRNA decay causes collagen X haploinsufficiency in Schmid metaphyseal chondrodysplasia cartilage. Hum Mol Genet 12 : 217225. Baumer D, Lee S, Nicholson G, Davies JL, Parkinson NJ, Murray LM, Gillingwater TH, Ansorge O, Davies KE, Talbot K. 2009. Alternative splicing events are a late feature of pathology in a mouse model of spinal muscular atrophy. PLoS Genet 5: e1000773. Becskei A, Serrano L. 2000. Engineering stability in gene networks by autoregulation. Nature 405 : 590593. Bedford MT, Reed R, Leder P. 1998. WW domainmediated interactions reveal a spliceosome associated protein that binds a third class of prolinerich motif: the proline glycine and methioninerich motif. Proc Natl Acad Sci U S A 95 : 1060210607. BehmAnsmant I, Gatfield D, Rehwinkel J, Hilgers V, Izaurralde E. 2007. A conserved role for cytoplasmic poly(A)binding protein 1 (PABPC1) in nonsensemediated mRNA decay. 
Embo J 26 : 15911601. Bejerano G, Pheasant M, Makunin I, Stephen S, Kent WJ, Mattick JS, Haussler D. 2004. Ultraconserved elements in the human genome. Science 304 : 13211325. Bell LR, Horabin JI, Schedl P, Cline TW. 1991. Positive autoregulation of sexlethal by alternative splicing maintains the female determined state in Drosophila. Cell 65 : 229 239. Bell ML, Buvoli M, Leinwand LA. 2010. Uncoupling of expression of an intronic microRNA and its myosin host gene by exon skipping. Mol Cell Biol 30 : 19371945. Benjamini Y, Hochberg Y. 1995. Controlling the False Discovery Rate a practical and powerful approach to multiple testing. J R Statist Soc B 57 : 289300. Berget SM. 1995. Exon recognition in vertebrate splicing. J Biol Chem 270 : 24112414.

108

Bernstein BE, Kamal M, LindbladToh K, Bekiranov S, Bailey DK, Huebert DJ, McMahon S, Karlsson EK, Kulbokas EJ, 3rd, Gingeras TR et al. 2005. Genomic maps and comparative analysis of histone modifications in human and mouse. Cell 120 : 169181. Black DL. 2003. Mechanisms of alternative premessenger RNA splicing. Annu Rev Biochem 72 : 291336. Blanchette M, Chabot B. 1999. Modulation of exon skipping by highaffinity hnRNP A1binding sites and by intron elements that repress splice site utilization. Embo J 18 : 19391952. Blanchette M, Green RE, Brenner SE, Rio DC. 2005. Global analysis of positive and negative premRNA splicing regulators in Drosophila. Genes Dev 19 : 13061314. Blankenberg D, Taylor J, Schenck I, He J, Zhang Y, Ghent M, Veeraraghavan N, Albert I, Miller W, Makova KD et al. 2007. A framework for collaborative analysis of ENCODE data: making largescale analyses biologistfriendly. Genome Res 17 : 960964. Blankenberg D, Von Kuster G, Coraor N, Ananda G, Lazarus R, Mangan M, Nekrutenko A, Taylor J. 2010. Galaxy: a webbased genome analysis tool for experimentalists. Curr Protoc Mol Biol Chapter 19 : Unit 19 10 1121. Blencowe BJ. 2000. Exonic splicing enhancers: mechanism of action, diversity and role in human genetic diseases. Trends Biochem Sci 25 : 106110. . 2006. Alternative splicing: new insights from global analyses. Cell 126 : 3747. Blencowe BJ, Ahmad S, Lee LJ. 2009. Currentgeneration highthroughput sequencing: deepening insights into mammalian transcriptomes. Genes Dev 23 : 13791386. Blencowe BJ, Bauren G, Eldridge AG, Issner R, Nickerson JA, Rosonina E, Sharp PA. 2000. The SRm160/300 splicing coactivator subunits. Rna 6: 111120. Blencowe BJ, Bowman JA, McCracken S, Rosonina E. 1999. SRrelated proteins and the processing of messenger RNA precursors. Biochem Cell Biol 77 : 277291. Blencowe BJ, Issner R, Nickerson JA, Sharp PA. 1998. A coactivator of premRNA splicing. Genes Dev 12 : 9961009. 
Boelens WC, Jansen EJ, van Venrooij WJ, Stripecke R, Mattaj IW, Gunderson SI. 1993. The human U1 snRNPspecific U1A protein inhibits polyadenylation of its own premRNA. Cell 72 : 881892. Bonnal S, Martinez C, Forch P, Bachi A, Wilm M, Valcarcel J. 2008. RBM5/Luca15/H37 regulates Fas alternative splice site pairing after exon definition. Mol Cell 32 : 8195. Bordonne R. 2000. Functional characterization of nuclear localization signals in yeast Sm proteins. Mol Cell Biol 20 : 79437954. Borgeson CD, Samson ML. 2005. Shared RNAbinding sites for interacting members of the Drosophila ELAV family of neuronal proteins. Nucleic Acids Res 33 : 63726383. Boutz PL, Chawla G, Stoilov P, Black DL. 2007a. MicroRNAs regulate the expression of the alternative splicing factor nPTB during muscle development. Genes Dev 21 : 7184. Boutz PL, Stoilov P, Li Q, Lin CH, Chawla G, Ostrow K, Shiue L, Ares M, Jr., Black DL. 2007b. A posttranscriptional regulatory switch in polypyrimidine tractbinding proteins reprograms alternative splicing in developing neurons. Genes Dev 21 : 16361652.

109

Boyer LA, Lee TI, Cole MF, Johnstone SE, Levine SS, Zucker JP, Guenther MG, Kumar RM, Murray HL, Jenner RG et al. 2005. Core transcriptional regulatory circuitry in human embryonic stem cells. Cell 122 : 947956. Branlant C, Krol A, Ebel JP, Lazar E, Haendler B, Jacob M. 1982. U2 RNA shares a structural domain with U1, U4, and U5 RNAs. Embo J 1: 12591265. Bres V, Gomes N, Pickle L, Jones KA. 2005. A human splicing factor, SKIP, associates with P TEFb and enhances transcription elongation by HIV1 Tat. Genes Dev 19 : 12111226. Bres V, Yoh SM, Jones KA. 2008. The multitasking PTEFb complex. Curr Opin Cell Biol 20 : 334340. Bringmann P, Appel B, Rinke J, Reuter R, Theissen H, Luhrmann R. 1984. Evidence for the existence of snRNAs U4 and U6 in a single ribonucleoprotein complex and for their association by intermolecular base pairing. Embo J 3: 13571363. Brow DA. 2002. Allosteric cascade of spliceosome activation. Annu Rev Genet 36 : 333360. Buhler M, Steiner S, Mohn F, Paillusson A, Muhlemann O. 2006. EJCindependent degradation of nonsense immunoglobulinmu mRNA depends on 3' UTR length. Nat Struct Mol Biol 13 : 462464. Buratowski S. 2009. Progression through the RNA polymerase II CTD cycle. Mol Cell 36 : 541 546. Burghes AH, Beattie CE. 2009. Spinal muscular atrophy: why do low levels of survival motor neuron protein make motor neurons sick? Nat Rev Neurosci 10 : 597609. Bussey KJ, Kane D, Sunshine M, Narasimhan S, Nishizuka S, Reinhold WC, Zeeberg B, Ajay W, Weinstein JN. 2003. MatchMiner: a tool for batch navigation among gene and gene product identifiers. Genome Biol 4: R27. Caceres JF, Stamm S, Helfman DM, Krainer AR. 1994. Regulation of alternative splicing in vivo by overexpression of antagonistic splicing factors. Science 265 : 17061709. Caffarelli E, Fragapane P, Gehring C, Bozzoni I. 1987. The accumulation of mature RNA for the Xenopus laevis ribosomal protein L1 is controlled at the level of splicing and turnover of the precursor RNA. Embo J 6: 34933498. 
Calarco JA, Saltzman AL, Ip JY, Blencowe BJ. 2007. Technologies for the global discovery and analysis of alternative splicing. in Alternative splicing in the postgenomic era (eds. BJ Blencowe, BR Graveley). Landes Biosciences, Austin, TX. Calarco JA, Superina S, O'Hanlon D, Gabut M, Raj B, Pan Q, Skalska U, Clarke L, Gelinas D, van der Kooy D et al. 2009. Regulation of vertebrate nervous system alternative splicing and development by an SRrelated protein. Cell 138 : 898910. Calarco JA, Zhen M, Blencowe BJ. 2011. Networking in a global world: Establishing functional connections between neural splicing regulators and their target transcripts. Rna . Campion Y, Neel H, Gostan T, Soret J, Bordonne R. 2010. Specific splicing defects in S. pombe carrying a degron allele of the Survival of Motor Neuron gene. Embo J 29 : 18171829. Caputi M, Mayeda A, Krainer AR, Zahler AM. 1999. hnRNP A/B proteins are required for inhibition of HIV1 premRNA splicing. Embo J 18 : 40604067.

110

Cassidy SB, Driscoll DJ. 2009. PraderWilli syndrome. Eur J Hum Genet 17 : 313. Castle JC, Armour CD, Lower M, Haynor D, Biery M, Bouzek H, Chen R, Jackson S, Johnson JM, Rohl CA et al. 2010. Digital genomewide ncRNA expression, including SnoRNAs, across 11 human tissues using polyAneutral amplification. PLoS ONE 5: e11779. Castle JC, Zhang C, Shah JK, Kulkarni AV, Kalsotra A, Cooper TA, Johnson JM. 2008. Expression of 24,426 human alternative splicing events and predicted cis regulation in 48 tissues and cell lines. Nat Genet 40 : 14161425. Cerami EG, Gross BE, Demir E, Rodchenkov I, Babur O, Anwar N, Schultz N, Bader GD, Sander C. 2010. Pathway Commons, a web resource for biological pathway data. Nucleic Acids Res 39 : D685690. Chabot B, Blanchette M, Lapierre I, La Branche H. 1997. An intron element modulating 5' splice site selection in the hnRNP A1 premRNA interacts with hnRNP A1. Mol Cell Biol 17 : 17761786. Chamieh H, Ballut L, Bonneau F, Le Hir H. 2008. NMD factors UPF2 and UPF3 bridge UPF1 to the exon junction complex and stimulate its RNA helicase activity. Nat Struct Mol Biol 15 : 8593. Chan WK, Bhalla AD, Le Hir H, Nguyen LS, Huang L, Gecz J, Wilkinson MF. 2009. A UPF3 mediated regulatory switch that maintains RNA surveillance. Nat Struct Mol Biol 16 : 747753. Chan WK, Huang L, Gudikote JP, Chang YF, Imam JS, MacLean JA, 2nd, Wilkinson MF. 2007. An alternative branch of the nonsensemediated decay pathway. Embo J 26 : 18201830. Chang YF, Imam JS, Wilkinson MF. 2007. The nonsensemediated decay RNA surveillance pathway. Annu Rev Biochem 76 : 5174. Chasin LA. 2007. Searching for splicing motifs. Adv Exp Med Biol 623 : 85106. Chen CD, Kobayashi R, Helfman DM. 1999. Binding of hnRNP H to an exonic splicing silencer is involved in the regulation of alternative splicing of the rat betatropomyosin gene. Genes Dev 13 : 593606. Chen M, Manley JL. 2009. Mechanisms of alternative splicing regulation: insights from molecular and genomics approaches. 
Nat Rev Mol Cell Biol 10 : 741754. Cheng H, Dufu K, Lee CS, Hsu JL, Dias A, Reed R. 2006. Human mRNA export machinery recruited to the 5' end of mRNA. Cell 127 : 13891400. Chiara MD, Reed R. 1995. A twostep mechanism for 5' and 3' splicesite pairing. Nature 375 : 510513. Cho EJ, Rodriguez CR, Takagi T, Buratowski S. 1998. Allosteric interactions between capping enzyme subunits and the RNA polymerase II carboxyterminal domain. Genes Dev 12 : 34823487. Cho EJ, Takagi T, Moore CR, Buratowski S. 1997. mRNA capping enzyme is recruited to the transcription complex by phosphorylation of the RNA polymerase II carboxyterminal domain. Genes Dev 11 : 33193326.

111

Clark F, Thanaraj TA. 2002. Categorization and characterization of transcriptconfirmed constitutively and alternatively spliced introns and exons from human. Hum Mol Genet 11 : 451464. Clark TA, Sugnet CW, Ares M, Jr. 2002. Genomewide analysis of mRNA processing in yeast using splicingspecific microarrays. Science 296 : 907910. Cline MS, Smoot M, Cerami E, Kuchinsky A, Landys N, Workman C, Christmas R, Avila Campilo I, Creech M, Gross B et al. 2007. Integration of biological networks and gene expression data using Cytoscape. Nat Protoc 2: 23662382. Cloonan N, Forrest AR, Kolle G, Gardiner BB, Faulkner GJ, Brown MK, Taylor DF, Steptoe AL, Wani S, Bethel G et al. 2008. Stem cell transcriptome profiling via massivescale mRNA sequencing. Nat Methods 5: 613619. Cook KB, Kazan H, Zuberi K, Morris Q, Hughes TR. 2011. RBPDB: a database of RNA binding specificities. Nucleic Acids Res 39 : D301308. Cooper TA, Wan L, Dreyfuss G. 2009. RNA and disease. Cell 136 : 777793. Corioni M, Antih N, Tanackovic G, Zavolan M, Kramer A. 2011. Analysis of in situ premRNA targets of human splicing factor SF1 reveals a function in alternative splicing. Nucleic Acids Res 39 : 18681879. Crabb TL, Lam BJ, Hertel KJ. 2010. Retention of spliceosomal components along ligated exons ensures efficient removal of multiple introns. Rna 16 : 17861796. Cramer P, Caceres JF, Cazalla D, Kadener S, Muro AF, Baralle FE, Kornblihtt AR. 1999. Coupling of transcription with alternative splicing: RNA pol II promoters modulate SF2/ASF and 9G8 effects on an exonic splicing enhancer. Mol Cell 4: 251258. Cramer P, Pesce CG, Baralle FE, Kornblihtt AR. 1997. Functional association between promoter structure and transcript alternative splicing. Proc Natl Acad Sci U S A 94 : 1145611460. Crispino JD, Blencowe BJ, Sharp PA. 1994. Complementation by SR proteins of premRNA splicing reactions depleted of U1 snRNP. Science 265 : 18661869. Crispino JD, Mermoud JE, Lamond AI, Sharp PA. 1996. 
Cisacting elements distinct from the 5' splice site promote U1independent premRNA splicing. Rna 2: 664673. Cuccurese M, Russo G, Russo A, Pietropaolo C. 2005. Alternative splicing and nonsense mediated mRNA decay regulate mammalian ribosomal gene expression. Nucleic Acids Res 33 : 59655977. Czaplinski K, RuizEchevarria MJ, Paushkin SV, Han X, Weng Y, Perlick HA, Dietz HC, Ter Avanesyan MD, Peltz SW. 1998. The surveillance complex interacts with the translation release factors to enhance termination and degrade aberrant mRNAs. Genes Dev 12 : 16651677. Dabeva MD, Warner JR. 1993. Ribosomal protein L32 of Saccharomyces cerevisiae regulates both splicing and translation of its own transcript. J Biol Chem 268 : 1966919674. Damianov A, Black DL. 2010. Autoregulation of Fox protein expression to produce dominant negative splicing factors. Rna 16 : 405416. Das R, Dufu K, Romney B, Feldt M, Elenko M, Reed R. 2006. Functional coupling of RNAP II transcription to spliceosome assembly. Genes Dev 20 : 11001109.

112 de la Mata M, Alonso CR, Kadener S, Fededa JP, Blaustein M, Pelisch F, Cramer P, Bentley D, Kornblihtt AR. 2003. A slow RNA polymerase II affects alternative splicing in vivo. Mol Cell 12 : 525532. de la Mata M, Kornblihtt AR. 2006. RNA polymerase II Cterminal domain mediates regulation of alternative splicing by SRp20. Nat Struct Mol Biol 13 : 973980. de Melo Neto OP, Standart N, Martins de Sa C. 1995. Autoregulation of poly(A)binding protein synthesis in vitro. Nucleic Acids Res 23 : 21982205. Del GattoKonczak F, Olive M, Gesnel MC, Breathnach R. 1999. hnRNP A1 recruited to an exon in vivo can function as an exon splicing silencer. Mol Cell Biol 19 : 251260. Dembowski JA, Grabowski PJ. 2009. The CUGBP2 splicing factor regulates an ensemble of branchpoints from perimeter binding sites with implications for autoregulation. PLoS Genet 5: e1000595. Didiot MC, Tian Z, Schaeffer C, Subramanian M, Mandel JL, Moine H. 2008. The Gquartet containing FMRP binding site in FMR1 mRNA is a potent exonic splicing enhancer. Nucleic Acids Res 36 : 49024912. Diehn M, Sherlock G, Binkley G, Jin H, Matese JC, HernandezBoussard T, Rees CA, Cherry JM, Botstein D, Brown PO et al. 2003. SOURCE: a unified genomic resource of functional annotations, ontologies, and gene expression data. Nucleic Acids Res 31 : 219 223. Dominski Z, Marzluff WF. 2007. Formation of the 3' end of histone mRNA: getting closer to the end. Gene 396 : 373390. Dong S, Li C, Zenklusen D, Singer RH, Jacobson A, He F. 2007. YRA1 autoregulation requires nuclear export and cytoplasmic Edc3pmediated degradation of its premRNA. Mol Cell 25 : 559573. Dostie J, Dreyfuss G. 2002. Translation is required to remove Y14 from mRNAs in the cytoplasm. Curr Biol 12 : 10601067. Dredge BK, Stefani G, Engelhard CC, Darnell RB. 2005. Nova autoregulation reveals dual functions in neuronal splicing. Embo J 24 : 16081620. Dreumont N, Hardy S, BehmAnsmant I, Kister L, Branlant C, Stevenin J, Bourgeois CF. 2010. 
Antagonistic factors control the unproductive splicing of SC35 terminal intron. Nucleic Acids Res 38 : 13531366. Duncan PI, Stojdl DF, Marius RM, Bell JC. 1997. In vivo regulation of alternative premRNA splicing by the Clk1 protein kinase. Mol Cell Biol 17 : 59966001. Eberle AB, LykkeAndersen S, Muhlemann O, Jensen TH. 2009. SMG6 promotes endonucleolytic cleavage of nonsense mRNA in human cells. Nat Struct Mol Biol 16 : 49 55. Eberle AB, Stalder L, Mathys H, Orozco RZ, Muhlemann O. 2008. Posttranscriptional gene regulation by spatial rearrangement of the 3' untranslated region. PLoS Biol 6: e92. Eisen MB, Spellman PT, Brown PO, Botstein D. 1998. Cluster analysis and display of genome wide expression patterns. Proc Natl Acad Sci U S A 95 : 1486314868.

113

Eldridge AG, Li Y, Sharp PA, Blencowe BJ. 1999. The SRm160/300 splicing coactivator is required for exonenhancer function. Proc Natl Acad Sci U S A 96 : 61256130. Fagnani M, Barash Y, Ip J, Misquitta C, Pan Q, Saltzman AL, Shai O, Lee L, Rozenhek A, Mohammad N et al. 2007. Functional coordination of alternative splicing in the mammalian central nervous system. Genome Biol 8: R108. Fedorova L, Fedorov A. 2005. Puzzles of the Human Genome: Why Do We Need Our Introns? Curr Genomics 6: 589595. Feng Y, Chen M, Manley JL. 2008. Phosphorylation switches the general splicing repressor SRp38 to a sequencespecific activator. Nat Struct Mol Biol 15 : 10401048. Feng Y, Sansam CL, Singh M, Emeson RB. 2006. Altered RNA editing in mice lacking ADAR2 autoregulation. Mol Cell Biol 26 : 480488. Filichkin SA, Priest HD, Givan SA, Shen R, Bryant DW, Fox SE, Wong WK, Mockler TC. 2010. Genomewide mapping of alternative splicing in Arabidopsis thaliana. Genome Res 20 : 4558. Fong YW, Zhou Q. 2001. Stimulatory effect of splicing factors on transcriptional elongation. Nature 414 : 929933. Franks TM, Singh G, LykkeAndersen J. 2010. Upf1 ATPasedependent mRNP disassembly is required for completion of nonsense mediated mRNA decay. Cell 143 : 938950. Fu XD, Mayeda A, Maniatis T, Krainer AR. 1992. General splicing factors SF2 and SC35 have equivalent activities in vitro, and both affect alternative 5' and 3' splice site selection. Proc Natl Acad Sci U S A 89 : 1122411228. Gabanella F, Butchbach ME, Saieva L, Carissimi C, Burghes AH, Pellizzoni L. 2007. Ribonucleoprotein assembly defects correlate with spinal muscular atrophy severity and preferentially affect a subset of spliceosomal snRNPs. PLoS ONE 2: e921. Gatfield D, Izaurralde E. 2004. Nonsensemediated messenger RNA decay is initiated by endonucleolytic cleavage in Drosophila. Nature 429 : 575578. Gatfield D, Unterholzner L, Ciccarelli FD, Bork P, Izaurralde E. 2003. 
Nonsensemediated mRNA decay in Drosophila: at the intersection of the yeast and mammalian pathways. Embo J 22 : 39603970. Gehring NH, Kunz JB, NeuYilik G, Breit S, Viegas MH, Hentze MW, Kulozik AE. 2005. Exonjunction complex components specify distinct routes of nonsensemediated mRNA decay with differential cofactor requirements. Mol Cell 20 : 6575. Gehring NH, Lamprinaki S, Hentze MW, Kulozik AE. 2009. The hierarchy of exonjunction complex assembly by the spliceosome explains key features of mammalian nonsense mediated mRNA decay. PLoS Biol 7: e1000120. Giardine B, Riemer C, Hardison RC, Burhans R, Elnitski L, Shah P, Zhang Y, Blankenberg D, Albert I, Taylor J et al. 2005. Galaxy: a platform for interactive largescale genome analysis. Genome Res 15 : 14511455. Gilbert W. 1978. Why genes in pieces? Nature 271 : 501. Gilbert WV. 2011. Functional specialization of ribosomes? Trends Biochem Sci 36 : 127132.

114

Girard C, Mouaikel J, Neel H, Bertrand E, Bordonne R. 2004. Nuclear localization properties of a conserved protuberance in the Sm core complex. Exp Cell Res 299 : 199208. Glavan F, BehmAnsmant I, Izaurralde E, Conti E. 2006. Structures of the PIN domains of SMG6 and SMG5 reveal a nuclease within the mRNA surveillance complex. Embo J 25 : 51175125. Goodrich JA, Tjian R. 2010. Unexpected roles for core promoter recognition factors in celltype specific transcription and gene regulation. Nat Rev Genet 11 : 549558. Gornemann J, Kotovic KM, Hujer K, Neugebauer KM. 2005. Cotranscriptional spliceosome assembly occurs in a stepwise fashion and requires the cap binding complex. Mol Cell 19 : 5363. Gozani O, Potashkin J, Reed R. 1998. A potential role for U2AFSAP 155 interactions in recruiting U2 snRNP to the branch site. Mol Cell Biol 18 : 47524760. Graveley BR. 2000. Sorting out the complexity of SR protein functions. Rna 6: 11971211. . 2009. Alternative splicing: regulation without regulators. Nat Struct Mol Biol 16 : 1315. Graveley BR, Hertel KJ, Maniatis T. 2001. The role of U2AF35 and U2AF65 in enhancer dependent splicing. Rna 7: 806818. Gray TA, Saitoh S, Nicholls RD. 1999a. An imprinted, mammalian bicistronic transcript encodes two independent proteins. Proc Natl Acad Sci U S A 96 : 56165621. Gray TA, Smithwick MJ, Schaldach MA, Martone DL, Graves JA, McCarrey JR, Nicholls RD. 1999b. Concerted regulation and molecular evolution of the duplicated SNRPB'/B and SNRPN loci. Nucleic Acids Res 27 : 45774584. Green RE, Lewis BP, Hillman RT, Blanchette M, Lareau LF, Garnett AT, Rio DC, Brenner SE. 2003. Widespread predicted nonsensemediated mRNA decay of alternativelyspliced transcripts of human normal and disease genes. Bioinformatics 19 Suppl 1 : i118121. Grimaldi K, Horn DA, Hudson LD, Terenghi G, Barton P, Polak JM, Latchman DS. 1993. Expression of the SmN splicing protein is developmentally regulated in the rodent brain but not in the rodent heart. Dev Biol 156 : 319323. 
Grosso AR, Gomes AQ, BarbosaMorais NL, Caldeira S, Thorne NP, Grech G, von Lindern M, CarmoFonseca M. 2008. Tissuespecific splicing factor gene expression signatures. Nucleic Acids Res 36 : 48234832. Hallegger M, Llorian M, Smith CW. 2010. Alternative splicing: global insights. FEBS J 277 : 856866. Han J, Ding JH, Byeon CW, Kim JH, Hertel KJ, Jeong S, Fu XD. 2011. SR proteins induce alternative exon skipping through their activities on the flanking constitutive exons. Mol Cell Biol 31 : 793802. Han J, Pedersen JS, Kwon SC, Belair CD, Kim YK, Yeom KH, Yang WY, Haussler D, Blelloch R, Kim VN. 2009. Posttranscriptional Crossregulation between Drosha and DGCR8. Cell 136 : 7584. Hanamura A, Caceres JF, Mayeda A, Franza BR, Jr., Krainer AR. 1998. Regulated tissue specific expression of antagonistic premRNA splicing factors. Rna 4: 430444.

115

Hartmann B, Valcarcel J. 2009. Decrypting the genome's alternative messages. Curr Opin Cell Biol 21 : 377386. Hase ME, Yalamanchili P, Visa N. 2006. The Drosophila heterogeneous nuclear ribonucleoprotein M protein, HRP59, regulates alternative splicing and controls the production of its own mRNA. J Biol Chem 281 : 3913539141. Hashimoto C, Steitz JA. 1984. U4 and U6 RNAs coexist in a single small nuclear ribonucleoprotein particle. Nucleic Acids Res 12 : 32833293. Hastings ML, Allemand E, Duelli DM, Myers MP, Krainer AR. 2007. Control of premRNA splicing by the general splicing factors PUF60 and U2AF65. PLoS ONE 2: e538. Hautbergue GM, Hung ML, Golovanov AP, Lian LY, Wilson SA. 2008. Mutually exclusive interactions drive handover of mRNA from export adaptors to TAP. Proc Natl Acad Sci U S A 105 : 51545159. He F, Li X, Spatrick P, Casillo R, Dong S, Jacobson A. 2003. Genomewide analysis of mRNAs regulated by the nonsensemediated and 5' to 3' mRNA decay pathways in yeast. Mol Cell 12 : 14391452. He F, Peltz SW, Donahue JL, Rosbash M, Jacobson A. 1993. Stabilization and ribosome association of unspliced premRNAs in a yeast upf1 mutant. Proc Natl Acad Sci U S A 90 : 70347038. Hicks MJ, Yang CR, Kotlajich MV, Hertel KJ. 2006. Linking splicing to Pol II transcription stabilizes premRNAs and influences splicing patterns. PLoS Biol 4: e147. Hillman RT, Green RE, Brenner SE. 2004. An unappreciated role for RNA surveillance. Genome Biol 5: R8. Ho CK, Shuman S. 1999. Distinct roles for CTD Ser2 and Ser5 phosphorylation in the recruitment and allosteric activation of mammalian mRNA capping enzyme. Mol Cell 3: 405411. Hoffman BE, Grabowski PJ. 1992. U1 snRNP targets an essential splicing factor, U2AF65, to the 3' splice site by a network of interactions spanning the exon. Genes Dev 6: 2554 2568. Hogg JR, Goff SP. 2010. Upf1 senses 3'UTR length to potentiate mRNA decay. Cell 143 : 379 389. Hogg R, McGrail JC, O'Keefe RT. 2010. 
The function of the NineTeen Complex (NTC) in regulating spliceosome conformations and fidelity during premRNA splicing. Biochem Soc Trans 38 : 11101115. Holbrook JA, NeuYilik G, Hentze MW, Kulozik AE. 2004. Nonsensemediated decay approaches the clinic. Nat Genet 36 : 801808. House AE, Lynch KW. 2006. An exonic splicing silencer represses spliceosome assembly after ATPdependent exon recognition. Nat Struct Mol Biol 13 : 937944. . 2008. Regulation of alternative splicing: more than just the ABCs. J Biol Chem 283 : 1217 1221. Huang Y, Gattoni R, Stevenin J, Steitz JA. 2003. SR splicing factors serve as adapter proteins for TAPdependent mRNA export. Mol Cell 11 : 837843.

116

Hughes TA. 2006. Regulation of gene expression by alternative untranslated regions. Trends Genet 22 : 119122. Hughes TR, Hiley SL, Saltzman AL, Babak T, Blencowe BJ. 2006. Microarray analysis of RNA processing and modification. Methods Enzymol 410 : 300316. Huntzinger E, Kashima I, Fauser M, Sauliere J, Izaurralde E. 2008. SMG6 is the catalytic endonuclease that cleaves mRNAs containing nonsense codons in metazoan. Rna 14 : 26092617. Huranova M, Ivani I, Benda A, Poser I, Brody Y, Hof M, ShavTal Y, Neugebauer KM, Stanek D. 2010. The differential interaction of snRNPs with premRNA reveals splicing kinetics in living cells. J Cell Biol 191 : 7586. Hyvonen MT, Uimari A, Keinanen TA, Heikkinen S, Pellinen R, Wahlfors T, Korhonen A, Narvanen A, Wahlfors J, Alhonen L et al. 2006. Polyamineregulated unproductive splicing and translation of spermidine/spermine N1acetyltransferase. Rna 12 : 1569 1582. International Human Genome Sequencing Consortium. 2001. Initial sequencing and analysis of the human genome. Nature 409 : 860921. Ip JY, Schmidt D, Pan Q, Ramani AK, Fraser AG, Odom DT, Blencowe BJ. 2011. Global impact of RNA polymerase II elongation inhibition on alternative splicing regulation. Genome Res 21 : 390401. Isken O, Maquat LE. 2008. The multiple lives of NMD factors: balancing roles in gene and genome regulation. Nat Rev Genet 9: 699712. Isserlin R, Merico D, AlikhaniKoupaei R, Gramolini A, Bader GD, Emili A. 2010. Pathway analysis of dilated cardiomyopathy using global proteomic profiling and enrichment maps. Proteomics 10 : 13161327. Itoh H, Washio T, Tomita M. 2004. Computational comparative analyses of alternative splicing regulation using fulllength cDNA of various eukaryotes. Rna 10 : 10051018. Ivanov PV, Gehring NH, Kunz JB, Hentze MW, Kulozik AE. 2008. Interactions between UPF1, eRFs, PABP and the exon junction complex suggest an integrated model for mammalian NMD pathways. Embo J 27 : 736747. 
Izquierdo JM, Majos N, Bonnal S, Martinez C, Castelo R, Guigo R, Bilbao D, Valcarcel J. 2005. Regulation of Fas alternative splicing by antagonistic effects of TIA1 and PTB on exon definition. Mol Cell 19 : 475484. Izquierdo JM, Valcarcel J. 2007. Two isoforms of the Tcell intracellular antigen 1 (TIA1) splicing factor display distinct splicing regulation activities. Control of TIA1 isoform ratio by TIA1related protein. J Biol Chem 282 : 1941019417. Jaillon O, Bouhouche K, Gout JF, Aury JM, Noel B, Saudemont B, Nowacki M, Serrano V, Porcel BM, Segurens B et al. 2008. Translational control of intron splicing in eukaryotes. Nature 451 : 359362. Jodelka FM, Ebert AD, Duelli DM, Hastings ML. 2010. A feedback loop regulates splicing of the spinal muscular atrophymodifying gene, SMN2. Hum Mol Genet 19 : 49064917.

117

Johnson JM, Castle J, GarrettEngele P, Kan Z, Loerch PM, Armour CD, Santos R, Schadt EE, Stoughton R, Shoemaker DD. 2003. Genomewide survey of human alternative pre mRNA splicing with exon junction microarrays. Science 302 : 21412144. Jones MH, Guthrie C. 1990. Unexpected flexibility in an evolutionarily conserved proteinRNA interaction: genetic analysis of the Sm binding site. Embo J 9: 25552561. Juge F, Audibert A, Benoit B, Simonelig M. 2000. Tissuespecific autoregulation of Drosophila suppressor of forked by alternative poly(A) site utilization leads to accumulation of the suppressor of forked protein in mitotically active cells. Rna 6: 15291538. Jumaa H, Nielsen PJ. 1997. The splicing factor SRp20 modifies splicing of its own mRNA and ASF/SF2 antagonizes this regulation. Embo J 16 : 50775085. Kadener S, Fededa JP, Rosbash M, Kornblihtt AR. 2002. Regulation of alternative splicing by a transcriptional enhancer through RNA pol II elongation. Proc Natl Acad Sci U S A 99 : 81858190. Kadener S, Rodriguez J, Abruzzi KC, Khodor YL, Sugino K, Marr MT, 2nd, Nelson S, Rosbash M. 2009. Genomewide identification of targets of the droshapasha/DGCR8 complex. Rna 15 : 537545. Kalsotra A, Wang K, Li PF, Cooper TA. 2010. MicroRNAs coordinate an alternative splicing network during mouse postnatal heart development. Genes Dev 24 : 653658. Kalsotra A, Xiao X, Ward AJ, Castle JC, Johnson JM, Burge CB, Cooper TA. 2008. A postnatal switch of CELF and MBNL proteins reprograms alternative splicing in the developing heart. Proc Natl Acad Sci U S A 105 : 2033320338. Kalyna M, Lopato S, Barta A. 2003. Ectopic expression of atRSZ33 reveals its function in splicing and causes pleiotropic changes in development. Mol Biol Cell 14 : 35653577. Kambach C, Walke S, Young R, Avis JM, de la Fortelle E, Raker VA, Luhrmann R, Li J, Nagai K. 1999. Crystal structures of two Sm protein complexes and their implications for the assembly of the spliceosomal snRNPs. Cell 96 : 375387. Kan JL, Green MR. 
1999. PremRNA splicing of IgM exons M1 and M2 is directed by a juxtaposed splicing enhancer and inhibitor. Genes Dev 13 : 462471. Karni R, de Stanchina E, Lowe SW, Sinha R, Mu D, Krainer AR. 2007. The gene encoding the splicing factor SF2/ASF is a protooncogene. Nat Struct Mol Biol 14 : 185193. Kashima I, Jonas S, Jayachandran U, Buchwald G, Conti E, Lupas AN, Izaurralde E. 2010. SMG6 interacts with the exon junction complex via two conserved EJCbinding motifs (EBMs) required for nonsensemediated mRNA decay. Genes Dev 24 : 24402450. Kashima I, Yamashita A, Izumi N, Kataoka N, Morishita R, Hoshino S, Ohno M, Dreyfuss G, Ohno S. 2006. Binding of a novel SMG1Upf1eRF1eRF3 complex (SURF) to the exon junction complex triggers Upf1 phosphorylation and nonsensemediated mRNA decay. Genes Dev 20 : 355367. Kawashima T, Pellegrini M, Chanfreau GF. 2009. Nonsensemediated mRNA decay mutes the splicing defects of spliceosome component mutations. Rna 15 : 22362247. Keegan LP, Brindle J, Gallo A, Leroy A, Reenan RA, O'Connell MA. 2005. Tuning of RNA editing by ADAR is required in Drosophila. Embo J 24 : 21832193.

Keene JD, Tenenbaum SA. 2002. Eukaryotic mRNPs may represent posttranscriptional operons. Mol Cell 9: 11611167. Kim VN, Kataoka N, Dreyfuss G. 2001. Role of the nonsensemediated decay factor hUpf3 in the splicingdependent exonexon junction complex. Science 293 : 18321836. Kim YK, Furic L, Desgroseillers L, Maquat LE. 2005. Mammalian Staufen1 recruits Upf1 to specific mRNA 3'UTRs so as to elicit mRNA decay. Cell 120 : 195208. Klinck R, Bramard A, Inkel L, DufresneMartin G, GervaisBird J, Madden R, Paquet ER, Koh C, Venables JP, Prinos P et al. 2008. Multiple alternative splicing markers for ovarian cancer. Cancer Res 68 : 657663. Kobayashi T, Funakoshi Y, Hoshino S, Katada T. 2004. The GTPbinding release factor eRF3 as a key mediator coupling translation termination to mRNA decay. J Biol Chem 279 : 4569345700. Komili S, Silver PA. 2008. Coupling and coordination in gene expression processes: a systems biology view. Nat Rev Genet 9: 3848. Konarska MM, Sharp PA. 1987. Interactions between small nuclear ribonucleoprotein particles in formation of spliceosomes. Cell 49 : 763774. Kotlajich MV, Crabb TL, Hertel KJ. 2009. Spliceosome assembly pathways for different types of alternative splicing converge during commitment to splice site pairing in the A complex. Mol Cell Biol 29 : 10721082. Kuhn RM, Karolchik D, Zweig AS, Trumbower H, Thomas DJ, Thakkapallayil A, Sugnet CW, Stanke M, Smith KE, Siepel A et al. 2007. The UCSC genome browser database: update 2007. Nucleic Acids Res 35 : D668673. Kumar S, Lopez AJ. 2005. Negative feedback regulation among SR splicing factors encoded by Rbp1 and Rbp1like in Drosophila. Embo J 24 : 26462655. Kunz JB, NeuYilik G, Hentze MW, Kulozik AE, Gehring NH. 2006. Functions of hUpf3a and hUpf3b in nonsensemediated mRNA decay and translation. Rna 12 : 10151022. Kuo HC, Nasim FH, Grabowski PJ. 1991. Control of alternative splicing by the differential binding of U1 small nuclear ribonucleoprotein particle. Science 251 : 10451050. 
Lacadie SA, Rosbash M. 2005. Cotranscriptional spliceosome assembly dynamics and the role of U1 snRNA:5'ss base pairing in yeast. Mol Cell 19: 65-75. Lallena MJ, Chalmers KJ, Llamazares S, Lamond AI, Valcarcel J. 2002. Splicing regulation at the second catalytic step by Sex-lethal involves 3' splice site recognition by SPF45. Cell 109: 285-296. Lareau LF, Brooks AN, Soergel DAW, Meng Q, Brenner SE. 2007a. The coupling of alternative splicing and nonsense-mediated mRNA decay. In Alternative splicing in the postgenomic era (eds. BJ Blencowe, BR Graveley), pp. 191-212. Landes Biosciences, Austin, TX. Lareau LF, Inada M, Green RE, Wengrod JC, Brenner SE. 2007b. Unproductive splicing of SR genes associated with highly conserved and ultraconserved DNA elements. Nature 446: 926-929.

Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R et al. 2007. Clustal W and Clustal X version 2.0. Bioinformatics 23 : 29472948. Lavigueur A, La Branche H, Kornblihtt AR, Chabot B. 1993. A splicing enhancer in the human fibronectin alternate ED1 exon interacts with SR proteins and stimulates U2 snRNP binding. Genes Dev 7: 24052417. Le Guiner C, Lejeune F, Galiana D, Kister L, Breathnach R, Stevenin J, Del GattoKonczak F. 2001. TIA1 and TIAR activate splicing of alternative exons with weak 5' splice sites followed by a Urich stretch on their own premRNAs. J Biol Chem 276 : 4063840646. Le Hir H, Andersen GR. 2008. Structural insights into the exon junction complex. Curr Opin Struct Biol 18 : 112119. Le Hir H, Gatfield D, Izaurralde E, Moore MJ. 2001. The exonexon junction complex provides a binding platform for factors involved in mRNA export and nonsensemediated mRNA decay. Embo J 20 : 49874997. Le K, Mitsouras K, Roy M, Wang Q, Xu Q, Nelson SF, Lee C. 2004. Detecting tissuespecific regulation of alternative splicing as a qualitative change in microarray data. Nucleic Acids Res 32 : e180. Lee TI, Rinaldi NJ, Robert F, Odom DT, BarJoseph Z, Gerber GK, Hannett NM, Harbison CT, Thompson CM, Simon I et al. 2002. Transcriptional regulatory networks in Saccharomyces cerevisiae. Science 298 : 799804. Lejeune F, Cavaloc Y, Stevenin J. 2001. Alternative splicing of intron 3 of the serine/arginine rich protein 9G8 gene. Identification of flanking exonic splicing enhancers and involvement of 9G8 as a transacting factor. J Biol Chem 276 : 78507858. Lejeune F, Ishigaki Y, Li X, Maquat LE. 2002. The exon junction complex is detected on CBP80bound but not eIF4Ebound mRNA in mammalian cells: dynamics of mRNP remodeling. Embo J 21 : 35363545. Lejeune F, Maquat LE. 2005. Mechanistic links between nonsensemediated mRNA decay and premRNA splicing in mammalian cells. Curr Opin Cell Biol 17 : 309315. 
Lelivelt MJ, Culbertson MR. 1999. Yeast Upf proteins required for RNA surveillance affect global expression of the yeast transcriptome. Mol Cell Biol 19: 6710-6719. Lerner EA, Lerner MR, Janeway CA, Jr., Steitz JA. 1981. Monoclonal antibodies to nucleic acid-containing cellular constituents: probes for molecular biology and autoimmune disease. Proc Natl Acad Sci U S A 78: 2737-2741. Lerner MR, Steitz JA. 1979. Antibodies to small nuclear RNAs complexed with proteins are produced by patients with systemic lupus erythematosus. Proc Natl Acad Sci U S A 76: 5495-5499. Lewis BP, Green RE, Brenner SE. 2003. Evidence for the widespread coupling of alternative splicing and nonsense-mediated mRNA decay in humans. Proc Natl Acad Sci U S A 100: 189-192. Li Q, Lee JA, Black DL. 2007. Neuronal regulation of alternative pre-mRNA splicing. Nat Rev Neurosci 8: 819-831.

Liautard JP, SriWidada J, Brunel C, Jeanteur P. 1982. Structural organization of ribonucleoproteins containing small nuclear RNAs from HeLa cells. Proteins interact closely with a similar structural domain of U1, U2, U4 and U5 small nuclear RNAs. J Mol Biol 162 : 623643. Licatalosi DD, Darnell RB. 2010. RNA processing and its regulation: global insights into biological networks. Nat Rev Genet 11 : 7587. Licatalosi DD, Geiger G, Minet M, Schroeder S, Cilli K, McNeil JB, Bentley DL. 2002. Functional interaction of yeast premRNA 3' end processing factors with RNA polymerase II. Mol Cell 9: 11011111. Lim LP, Burge CB. 2001. A computational analysis of sequence features involved in recognition of short introns. Proc Natl Acad Sci U S A 98 : 1119311198. Lim SR, Hertel KJ. 2004. Commitment to splice site pairing coincides with A complex formation. Mol Cell 15 : 477483. Lin CH, Patton JG. 1995. Regulation of alternative 3' splice site selection by constitutive splicing factors. Rna 1: 234245. Lin S, CoutinhoMansfield G, Wang D, Pandit S, Fu XD. 2008. The splicing factor SC35 has an active role in transcriptional elongation. Nat Struct Mol Biol 15 : 819826. Lin S, Fu XD. 2007. SR proteins and related factors in alternative splicing. Adv Exp Med Biol 623 : 107122. Lin X, Miller JW, Mankodi A, Kanadia RN, Yuan Y, Moxley RT, Swanson MS, Thornton CA. 2006. Failure of MBNL1dependent postnatal splicing transitions in myotonic dystrophy. Hum Mol Genet 15 : 20872097. Linde L, Boelz S, NeuYilik G, Kulozik AE, Kerem B. 2007. The efficiency of nonsense mediated mRNA decay is an inherent character and varies among different cells. Eur J Hum Genet 15 : 11561162. Long JC, Caceres JF. 2009. The SR protein family of splicing factors: master regulators of gene expression. Biochem J 417 : 1527. Longman D, Johnstone IL, Caceres JF. 2000. Functional characterization of SR and SRrelated genes in Caenorhabditis elegans. Embo J 19 : 16251637. Longman D, Plasterk RH, Johnstone IL, Caceres JF. 
2007. Mechanistic insights and identification of two novel factors in the C. elegans NMD pathway. Genes Dev 21: 1075 1085. Lopato S, Kalyna M, Dorner S, Kobayashi R, Krainer AR, Barta A. 1999. atSRp30, one of two SF2/ASFlike proteins from Arabidopsis thaliana, regulates splicing of specific plant genes. Genes Dev 13 : 9871001. Lopez AJ. 1998. Alternative splicing of premRNA: developmental consequences and mechanisms of regulation. Annu Rev Genet 32 : 279305. Luco RF, Allo M, Schor IE, Kornblihtt AR, Misteli T. 2011. Epigenetics in alternative pre mRNA splicing. Cell 144 : 1626. Luco RF, Pan Q, Tominaga K, Blencowe BJ, PereiraSmith OM, Misteli T. 2010. Regulation of alternative splicing by histone modifications. Science 327 : 9961000.

Luke B, Azzalin CM, Hug N, Deplazes A, Peter M, Lingner J. 2007. Saccharomyces cerevisiae Ebs1p is a putative ortholog of human Smg7 and promotes nonsense-mediated mRNA decay. Nucleic Acids Res. Lund MK, Kress TL, Guthrie C. 2008. Autoregulation of Npl3, a yeast SR protein, requires a novel downstream region and serine phosphorylation. Mol Cell Biol 28: 3873-3881. Lykke-Andersen J, Shu MD, Steitz JA. 2000. Human Upf proteins target an mRNA for nonsense-mediated decay when bound downstream of a termination codon. Cell 103: 1121-1131. Lykke-Andersen J, Shu MD, Steitz JA. 2001. Communication of the position of exon-exon junctions to the mRNA surveillance machinery by the protein RNPS1. Science 293: 1836-1839. Ma L, Horvitz HR. 2009. Mutations in the Caenorhabditis elegans U2AF large subunit UAF-1 alter the choice of a 3' splice site in vivo. PLoS Genet 5: e1000708. Macias S, Bragulat M, Tardiff DF, Vilardell J. 2008. L30 binds the nascent RPL30 transcript to repress U2 snRNP recruitment. Mol Cell 30: 732-742. MacMillan AM, Query CC, Allerson CR, Chen S, Verdine GL, Sharp PA. 1994. Dynamic association of proteins with the pre-mRNA branch region. Genes Dev 8: 3008-3020. Majoros WH, Ohler U. 2007. Spatial preferences of microRNA targets in 3' untranslated regions. BMC Genomics 8: 152. Makeyev EV, Zhang J, Carrasco MA, Maniatis T. 2007. The MicroRNA miR-124 promotes neuronal differentiation by triggering brain-specific alternative pre-mRNA splicing. Mol Cell 27: 435-448. Maniatis T, Reed R. 2002. An extensive network of coupling among gene expression machines. Nature 416: 499-506. Maniatis T, Tasic B. 2002. Alternative pre-mRNA splicing and proteome expansion in metazoans. Nature 418: 236-243. Marshall NF, Peng J, Xie Z, Price DH. 1996. Control of RNA polymerase II elongation potential by a novel carboxyl-terminal domain kinase. J Biol Chem 271: 27176-27183. Martinez-Contreras R, Cloutier P, Shkreta L, Fisette JF, Revil T, Chabot B. 2007. hnRNP proteins and splicing control. Adv Exp Med Biol 623: 123-147. 
Martinez-Contreras R, Fisette JF, Nasim FU, Madden R, Cordeau M, Chabot B. 2006. Intronic binding sites for hnRNP A/B and hnRNP F/H proteins stimulate pre-mRNA splicing. PLoS Biol 4: e21. Massiello A, Roesser JR, Chalfant CE. 2006. SAP155 Binds to ceramide-responsive RNA cis-element 1 and regulates the alternative 5' splice site selection of Bcl-x pre-mRNA. FASEB J 20: 1680-1682. Masuda S, Das R, Cheng H, Hurt E, Dorman N, Reed R. 2005. Recruitment of the human TREX complex to mRNA during splicing. Genes Dev 19: 1512-1517. Matlin AJ, Clark F, Smith CW. 2005. Understanding alternative splicing: towards a cellular code. Nat Rev Mol Cell Biol 6: 386-398.

Matlin AJ, Moore MJ. 2007. Spliceosome assembly and composition. Adv Exp Med Biol 623 : 1435. Mattox W, Baker BS. 1991. Autoregulation of the splicing of transcripts from the transformer2 gene of Drosophila. Genes Dev 5: 786796. Mayeda A, Helfman DM, Krainer AR. 1993. Modulation of exon skipping and inclusion by heterogeneous nuclear ribonucleoprotein A1 and premRNA splicing factor SF2/ASF. Mol Cell Biol 13 : 29933001. Mayeda A, Zahler AM, Krainer AR, Roth MB. 1992. Two members of a conserved family of nuclear phosphoproteins are involved in premRNA splicing. Proc Natl Acad Sci U S A 89 : 13011304. Mayr C, Bartel DP. 2009. Widespread shortening of 3'UTRs by alternative cleavage and polyadenylation activates oncogenes in cancer cells. Cell 138 : 673684. McAllister G, Amara SG, Lerner MR. 1988. Tissuespecific expression and cDNA cloning of small nuclear ribonucleoproteinassociated polypeptide N. Proc Natl Acad Sci U S A 85 : 52965300. McAllister G, RobyShemkovitz A, Amara SG, Lerner MR. 1989. cDNA sequence of the rat U snRNPassociated protein N: description of a potential Sm epitope. Embo J 8: 11771181. McCracken S, Fong N, Rosonina E, Yankulov K, Brothers G, Siderovski D, Hessel A, Foster S, Shuman S, Bentley DL. 1997a. 5'Capping enzymes are targeted to premRNA by binding to the phosphorylated carboxyterminal domain of RNA polymerase II. Genes Dev 11 : 33063318. McCracken S, Fong N, Yankulov K, Ballantyne S, Pan G, Greenblatt J, Patterson SD, Wickens M, Bentley DL. 1997b. The Cterminal domain of RNA polymerase II couples mRNA processing to transcription. Nature 385 : 357361. McGlincy NJ, Smith CW. 2008. Alternative splicing resulting in nonsensemediated mRNA decay: what is the meaning of nonsense? Trends Biochem Sci 33 : 385393. McGlincy NJ, Tan LY, Paul N, Zavolan M, Lilley KS, Smith CW. 2010. Expression proteomics of UPF1 knockdown in HeLa cells reveals autoregulation of hnRNP A2/B1 mediated by alternative splicing resulting in nonsensemediated mRNA decay. 
BMC Genomics 11 : 565. Meaux S, van Hoof A, Baker KE. 2008. Nonsensemediated mRNA decay in yeast does not require PAB1 or a poly(A) tail. Mol Cell 29 : 134140. Medghalchi SM, Frischmeyer PA, Mendell JT, Kelly AG, Lawler AM, Dietz HC. 2001. Rent1, a transeffector of nonsensemediated mRNA decay, is essential for mammalian embryonic viability. Hum Mol Genet 10 : 99105. Meinhart A, Cramer P. 2004. Recognition of RNA polymerase II carboxyterminal domain by 3' RNAprocessing factors. Nature 430 : 223226. Mendell JT, Sharifi NA, Meyers JL, MartinezMurillo F, Dietz HC. 2004. Nonsense surveillance regulates expression of diverse classes of mammalian transcripts and mutes genomic noise. Nat Genet 36 : 10731078.

Metzstein MM, Krasnow MA. 2006. Functions of the nonsensemediated mRNA decay pathway in Drosophila development. PLoS Genet 2: e180. Mitrovich QM, Anderson P. 2000. Unproductively spliced ribosomal protein mRNAs are natural targets of mRNA surveillance in C. elegans. Genes Dev 14 : 21732184. Modrek B, Lee CJ. 2003. Alternative splicing in the human, mouse and rat genomes is associated with an increased frequency of exon creation and/or loss. Nat Genet 34 : 177180. Montaner D, Tarraga J, HuertaCepas J, Burguet J, Vaquerizas JM, Conde L, Minguez P, Vera J, Mukherjee S, Valls J et al. 2006. Next station in microarray data analysis: GEPAS. Nucleic Acids Res 34 : W486491. Moore MJ, Proudfoot NJ. 2009. PremRNA processing reaches back to transcription and ahead to translation. Cell 136 : 688700. Mordes D, Luo X, Kar A, Kuo D, Xu L, Fushimi K, Yu G, Sternberg P, Jr., Wu JY. 2006. Pre mRNA splicing and retinitis pigmentosa. Mol Vis 12 : 12591271. Morrison M, Harris KS, Roth MB. 1997. smg mutants affect the expression of alternatively spliced SR protein mRNAs in Caenorhabditis elegans. Proc Natl Acad Sci U S A 94 : 97829785. Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B. 2008. Mapping and quantifying mammalian transcriptomes by RNASeq. Nat Methods 5: 621628. MottaMena LB, Heyd F, Lynch KW. 2010. Contextdependent regulatory mechanism of the splicing factor hnRNP L. Mol Cell 37 : 223234. Muhlemann O, LykkeAndersen J. 2010. How and where are nonsense mRNAs degraded in mammalian cells? RNA Biol 7: 2832. Munoz MJ, de la Mata M, Kornblihtt AR. 2010. The carboxy terminal domain of RNA polymerase II and alternative splicing. Trends Biochem Sci 35 : 497504. Nagy E, Maquat LE. 1998. A rule for terminationcodon position within introncontaining genes: when nonsense affects RNA abundance. Trends Biochem Sci 23 : 198199. Nasim FH, Spears PA, Hoffmann HM, Kuo HC, Grabowski PJ. 1990. 
A Sequential splicing mechanism promotes selection of an optimal exon by repositioning a downstream 5' splice site in preprotachykinin premRNA. Genes Dev 4: 11721184. NeuYilik G, Gehring NH, Hentze MW, Kulozik AE. 2004. Nonsensemediated mRNA decay: from vacuum cleaner to Swiss army knife. Genome Biol 5: 218. Neuenkirchen N, Chari A, Fischer U. 2008. Deciphering the assembly pathway of Smclass U snRNPs. FEBS Lett 582 : 19972003. Ni JZ, Grate L, Donohue JP, Preston C, Nobida N, O'Brien G, Shiue L, Clark TA, Blume JE, Ares M, Jr. 2007. Ultraconserved elements are associated with homeostatic control of splicing regulators by alternative splicing and nonsensemediated decay. Genes Dev 21 : 708718. Ni Z, Schwartz BE, Werner J, Suarez JR, Lis JT. 2004. Coordination of transcription, RNA processing, and surveillance by PTEFb kinase on heat shock genes. Mol Cell 13 : 5565. Nilsen TW. 2002. The spliceosome: no assembly required? Mol Cell 9: 89.

Nilsen TW, Graveley BR. 2010. Expansion of the eukaryotic proteome by alternative splicing. Nature 463 : 457463. Nogues G, Kadener S, Cramer P, Bentley D, Kornblihtt AR. 2002. Transcriptional activators differ in their abilities to control alternative splicing. J Biol Chem 277 : 4311043114. Nott A, Le Hir H, Moore MJ. 2004. Splicing enhances translation in mammalian cells: an additional function of the exon junction complex. Genes Dev 18 : 210222. Odom DT, Dowell RD, Jacobsen ES, Nekludova L, Rolfe PA, Danford TW, Gifford DK, Fraenkel E, Bell GI, Young RA. 2006. Core transcriptional regulatory circuitry in human hepatocytes. Mol Syst Biol 2: 2006 0017. Ohnishi T, Yamashita A, Kashima I, Schell T, Anders KR, Grimson A, Hachiya T, Hentze MW, Anderson P, Ohno S. 2003. Phosphorylation of hUPF1 induces formation of mRNA surveillance complexes containing hSMG5 and hSMG7. Mol Cell 12 : 11871200. Pacheco TR, Moita LF, Gomes AQ, Hacohen N, CarmoFonseca M. 2006. RNA interference knockdown of hU2AF35 impairs cell cycle progression and modulates alternative splicing of Cdc25 transcripts. Mol Biol Cell 17 : 41874199. Pagani F, Stuani C, Zuccato E, Kornblihtt AR, Baralle FE. 2003. Promoter architecture modulates CFTR exon 9 skipping. J Biol Chem 278 : 15111517. Palacios IM, Gatfield D, St Johnston D, Izaurralde E. 2004. An eIF4AIIIcontaining complex required for mRNA localization and nonsensemediated mRNA decay. Nature 427 : 753 757. Palusa SG, Reddy AS. 2010. Extensive coupling of alternative splicing of premRNAs of serine/arginine (SR) genes with nonsensemediated decay. New Phytol 185 : 8389. Pan Q, Bakowski MA, Morris Q, Zhang W, Frey BJ, Hughes TR, Blencowe BJ. 2005. Alternative splicing of conserved exons is frequently speciesspecific in human and mouse. Trends Genet 21 : 7377. Pan Q, Shai O, Lee LJ, Frey BJ, Blencowe BJ. 2008. Deep surveying of alternative splicing complexity in the human transcriptome by highthroughput sequencing. Nat Genet 40 : 14131415. 
Pan Q, Shai O, Misquitta C, Zhang W, Saltzman AL, Mohammad N, Babak T, Siu H, Hughes TR, Morris QD et al. 2004. Revealing global regulatory features of mammalian alternative splicing using a quantitative microarray platform. Mol Cell 16 : 929941. Pandit S, Wang D, Fu XD. 2008. Functional integration of transcriptional and RNA processing machineries. Curr Opin Cell Biol 20 : 260265. Park JW, Parisky K, Celotto AM, Reenan RA, Graveley BR. 2004. Identification of alternative splicing regulators by RNA interference in Drosophila. Proc Natl Acad Sci U S A 101 : 1597415979. Patel GP, Ma S, Bag J. 2005. The autoregulatory translational control element of poly(A) binding protein mRNA forms a heteromeric ribonucleoprotein complex. Nucleic Acids Res 33 : 70747089.

Peltz SW, Brown AH, Jacobson A. 1993. mRNA destabilization triggered by premature translational termination depends on at least three cisacting sequence elements and one transacting factor. Genes Dev 7: 17371754. Pennisi E. 2003. Human genome. A low number wins the GeneSweep Pool. Science 300 : 1484. Perales R, Bentley D. 2009. "Cotranscriptionality": the transcription elongation complex as a nexus for nuclear transactions. Mol Cell 36 : 178191. Philipps DL, Park JW, Graveley BR. 2004. A computational and experimental approach toward a priori identification of alternatively spliced exons. Rna 10 : 18381844. Pleiss JA, Whitworth GB, Bergkessel M, Guthrie C. 2007. Transcript specificity in yeast pre mRNA splicing revealed by mutations in core spliceosomal components. PLoS Biol 5: e90. Polymenidou M, LagierTourenne C, Hutt KR, Huelga SC, Moran J, Liang TY, Ling SC, Sun E, Wancewicz E, Mazur C et al. 2011. Long premRNA depletion and RNA missplicing contribute to neuronal vulnerability from loss of TDP43. Nat Neurosci 14 : 459468. Pomeranz Krummel DA, Oubridge C, Leung AK, Li J, Nagai K. 2009. Crystal structure of human spliceosomal U1 snRNP at 5.5 A resolution. Nature 458 : 475480. Preker PJ, Guthrie C. 2006. Autoregulation of the mRNA export factor Yra1p requires inefficient splicing of its premRNA. Rna 12 : 9941006. Preker PJ, Kim KS, Guthrie C. 2002. Expression of the essential mRNA export factor Yra1p is autoregulated by a splicingdependent mechanism. Rna 8: 969980. Presutti C, Ciafre SA, Bozzoni I. 1991. The ribosomal protein L2 in S. cerevisiae controls the level of accumulation of its own mRNA. Embo J 10 : 22152221. Presutti C, Villa T, Hall D, Pertica C, Bozzoni I. 1995. Identification of the ciselements mediating the autogenous control of ribosomal protein L2 mRNA stability in yeast. Embo J 14 : 40224030. Query CC, Konarska MM. 2004. 
Suppression of multiple substrate mutations by spliceosomal prp8 alleles suggests functional correlations with ribosomal ambiguity mutants. Mol Cell 14 : 343354. Query CC, Moore MJ, Sharp PA. 1994. Branch nucleophile selection in premRNA splicing: evidence for the bulged duplex model. Genes Dev 8: 587597. Raes J, Van de Peer Y. 2005. Functional divergence of proteins through frameshift mutations. Trends Genet 21 : 428431. Ramani AK, Nelson AC, Kapranov P, Bell I, Gingeras TR, Fraser AG. 2009. High resolution transcriptome maps for wildtype and nonsensemediated decaydefective Caenorhabditis elegans. Genome Biol 10 : R101. Rapkins RW, Hore T, Smithwick M, Ager E, Pask AJ, Renfree MB, Kohn M, Hameister H, Nicholls RD, Deakin JE et al. 2006. Recent assembly of an imprinted domain from non imprinted components. PLoS Genet 2: e182. Ray D, Kazan H, Chan ET, Pena Castillo L, Chaudhry S, Talukder S, Blencowe BJ, Morris Q, Hughes TR. 2009. Rapid and systematic analysis of the RNA recognition specificities of RNAbinding proteins. Nat Biotechnol 27 : 667670.

Rebbapragada I, LykkeAndersen J. 2009. Execution of nonsensemediated mRNA decay: what defines a substrate? Curr Opin Cell Biol 21 : 394402. Reeves WH, Narain S, Satoh M. 2003. Henry Kunkel, Stephanie Smith, clinical immunology, and split genes. Lupus 12 : 213217. Rehwinkel J, Letunic I, Raes J, Bork P, Izaurralde E. 2005. Nonsensemediated mRNA decay factors act in concert to regulate common mRNA targets. Rna 11 : 15301544. Resch A, Xing Y, Alekseyenko A, Modrek B, Lee C. 2004. Evidence for a subpopulation of conserved alternative splicing events under selection pressure for protein reading frame preservation. Nucleic Acids Res 32 : 12611269. Resch AM, Ogurtsov AY, Rogozin IB, Shabalina SA, Koonin EV. 2009. Evolution of alternative and constitutive regions of mammalian 5'UTRs. BMC Genomics 10 : 162. Rhead B, Karolchik D, Kuhn RM, Hinrichs AS, Zweig AS, Fujita PA, Diekhans M, Smith KE, Rosenbloom KR, Raney BJ et al. 2010. The UCSC Genome Browser database: update 2010. Nucleic Acids Res 38 : D613619. Robberson BL, Cote GJ, Berget SM. 1990. Exon definition may facilitate splice site selection in RNAs with multiple exons. Mol Cell Biol 10 : 8494. Roignant JY, Treisman JE. 2010. Exon junction complex subunits are required to splice Drosophila MAP kinase, a large heterochromatic gene. Cell 143 : 238250. Rosel TD, Hung LH, Medenbach J, Donde K, Starke S, Benes V, Ratsch G, Bindereif A. 2011. RNASeq analysis in mutant zebrafish reveals role of U1C protein in alternative splicing regulation. Embo J . Rosenfeld N, Elowitz MB, Alon U. 2002. Negative autoregulation speeds the response times of transcription networks. J Mol Biol 323 : 785793. Rosonina E, Blencowe BJ. 2004. Analysis of the requirement for RNA polymerase II CTD heptapeptide repeats in premRNA splicing and 3'end cleavage. Rna 10 : 581589. Rossbach O, Hung LH, Schreiner S, Grishina I, Heiner M, Hui J, Bindereif A. 2009. Auto and crossregulation of the hnRNP L proteins by alternative splicing. 
Mol Cell Biol 29 : 1442 1451. Roth KM, Wolf MK, Rossi M, Butler JS. 2005. The nuclear exosome contributes to autogenous control of NAB2 mRNA levels. Mol Cell Biol 25 : 15771585. Rueter SM, Dawson TR, Emeson RB. 1999. Regulation of alternative splicing by RNA editing. Nature 399 : 7580. Salomonis N, Schlieve CR, Pereira L, Wahlquist C, Colas A, Zambon AC, Vranizan K, Spindler MJ, Pico AR, Cline MS et al. 2010. Alternative splicing regulates mouse embryonic stem cell pluripotency and differentiation. Proc Natl Acad Sci U S A 107 : 1051410519. Saltzman AL, Kim YK, Pan Q, Fagnani MM, Maquat LE, Blencowe BJ. 2008. Regulation of multiple core spliceosomal proteins by alternative splicingcoupled nonsensemediated mRNA decay. Mol Cell Biol 28 : 43204330. Saltzman AL, Pan Q, Blencowe BJ. 2011. Regulation of alternative splicing by the core spliceosomal machinery. Genes Dev 25 : 373384.

Sarkissian M, Winne A, Lafyatis R. 1996. The mammalian homolog of suppressorofwhite apricot regulates alternative mRNA splicing of CD45 exon 4 and fibronectin IIICS. J Biol Chem 271 : 3110631114. Sauliere J, Haque N, Harms S, Barbosa I, Blanchette M, Le Hir H. 2010. The exon junction complex differentially marks spliced junctions. Nat Struct Mol Biol 17 : 12691271. Sauterer RA, Feeney RJ, Zieve GW. 1988. Cytoplasmic assembly of snRNP particles from stored proteins and newly transcribed snRNA's in L929 mouse fibroblasts. Exp Cell Res 176 : 344359. Sayani S, Janis M, Lee CY, Toesca I, Chanfreau GF. 2008. Widespread impact of nonsense mediated mRNA decay on the yeast intronome. Mol Cell 31 : 360370. Schellenberg MJ, Dul EL, MacMillan AM. 2011. Structural model of the p14/SF3b155 . branch duplex complex. Rna 17 : 155165. Schneider M, Will CL, Anokhina M, Tazi J, Urlaub H, Luhrmann R. 2010. Exon definition complexes contain the trisnRNP and can be directly converted into Blike precatalytic splicing complexes. Mol Cell 38 : 223235. Schoning JC, Streitner C, Meyer IM, Gao Y, Staiger D. 2008. Reciprocal regulation of glycine rich RNAbinding proteins via an interlocked feedback loop coupling alternative splicing to nonsensemediated decay in Arabidopsis. Nucleic Acids Res 36 : 69776987. Serin G, Gersappe A, Black JD, Aronoff R, Maquat LE. 2001. Identification and characterization of human orthologues to Saccharomyces cerevisiae Upf2 protein and Upf3 protein (Caenorhabditis elegans SMG4). Mol Cell Biol 21 : 209223. Shai O, Morris QD, Blencowe BJ, Frey BJ. 2006. Inferring global levels of alternative splicing isoforms using a generative model of microarray data. Bioinformatics 22 : 606613. Sharma S, Falick AM, Black DL. 2005. Polypyrimidine tract binding protein blocks the 5' splice sitedependent assembly of U2AF and the prespliceosomal E complex. Mol Cell 19 : 485 496. Sharma S, Kohlstaedt LA, Damianov A, Rio DC, Black DL. 2008. 
Polypyrimidine tract binding protein controls the transition from exon definition to an intron defined spliceosome. Nat Struct Mol Biol 15 : 183191. Shen H, Green MR. 2006. RS domains contact splicing signals and promote splicing by a common mechanism in yeast through humans. Genes Dev 20 : 17551765. Shoemaker DD, Schadt EE, Armour CD, He YD, GarrettEngele P, McDonagh PD, Loerch PM, Leonardson A, Lum PY, Cavet G et al. 2001. Experimental annotation of the human genome using microarray technology. Nature 409 : 922927. Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, Rosenbloom K, Clawson H, Spieth J, Hillier LW, Richards S et al. 2005. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res 15 : 10341050. Silva AL, Ribeiro P, Inacio A, Liebhaber SA, Romao L. 2008. Proximity of the poly(A)binding protein to a premature termination codon inhibits mammalian nonsensemediated mRNA decay. Rna 14 : 563576.

Sims RJ, 3rd, Millhouse S, Chen CF, Lewis BA, ErdjumentBromage H, Tempst P, Manley JL, Reinberg D. 2007. Recognition of trimethylated histone H3 lysine 4 facilitates the recruitment of transcription postinitiation factors and premRNA splicing. Mol Cell 28 : 665676. Singh G, Rebbapragada I, LykkeAndersen J. 2008. A competition between stimulators and antagonists of Upf complex recruitment governs human nonsensemediated mRNA decay. PLoS Biol 6: e111. Singh R, Valcarcel J, Green MR. 1995. Distinct binding specificities and functions of higher eukaryotic polypyrimidine tractbinding proteins. Science 268 : 11731176. Smith CW, Patton JG, NadalGinard B. 1989. Alternative splicing in the control of gene expression. Annu Rev Genet 23 : 527577. Smith DJ, Query CC, Konarska MM. 2008. "Nought may endure but mutability": spliceosome dynamics and the regulation of splicing. Mol Cell 30 : 657666. Sorek R, Ast G. 2003. Intronic sequences flanking alternatively spliced exons are conserved between human and mouse. Genome Res 13 : 16311637. Sorek R, Shamir R, Ast G. 2004. How prevalent is functional alternative splicing in the human genome? Trends Genet 20 : 6871. Spellman R, Llorian M, Smith CW. 2007. Crossregulation and functional redundancy between the splicing regulator PTB and its paralogs nPTB and ROD1. Mol Cell 27 : 420434. Staiger D, Zecca L, Wieczorek Kirk DA, Apel K, Eckstein L. 2003. The circadian clock regulated RNAbinding protein AtGRP7 autoregulates its expression by influencing alternative splicing of its own premRNA. Plant J 33 : 361371. Stamm S, Zhu J, Nakai K, Stoilov P, Stoss O, Zhang MQ. 2000. An alternativeexon database and its statistical analysis. DNA Cell Biol 19 : 739756. Stevens SW, Ryan DE, Ge HY, Moore RE, Young MK, Lee TD, Abelson J. 2002. Composition and functional characterization of the yeast spliceosomal pentasnRNP. Mol Cell 9: 31 44. Stoilov P, Daoud R, Nayler O, Stamm S. 2004. 
Human tra2beta1 autoregulates its protein concentration by influencing alternative splicing of its premRNA. Hum Mol Genet 13 : 509524. Su AI, Wiltshire T, Batalov S, Lapp H, Ching KA, Block D, Zhang J, Soden R, Hayakawa M, Kreiman G et al. 2004. A gene atlas of the mouse and human proteinencoding transcriptomes. Proc Natl Acad Sci U S A 101 : 60626067. Sugnet CW, Srinivasan K, Clark TA, O'Brien G, Cline MS, Wang H, Williams A, Kulp D, Blume JE, Haussler D et al. 2006. Unusual intron conservation near tissueregulated exons found by splicing microarrays. PLoS Comput Biol 2: e4. Sultan M, Schulz MH, Richard H, Magen A, Klingenhoff A, Scherf M, Seifert M, Borodina T, Soldatov A, Parkhomchuk D et al. 2008. A global view of gene activity and alternative splicing by deep sequencing of the human transcriptome. Science 321 : 956960. Sun S, Zhang Z, Sinha R, Karni R, Krainer AR. 2010. SF2/ASF autoregulation involves multiple layers of posttranscriptional and translational control. Nat Struct Mol Biol 17 : 306312.

Sureau A, Gattoni R, Dooghe Y, Stevenin J, Soret J. 2001. SC35 autoregulates its expression by promoting splicing events that destabilize its mRNAs. Embo J 20: 1785-1796.

Talerico M, Berget SM. 1990. Effect of 5' splice site mutations on splicing of the preceding intron. Mol Cell Biol 10: 6299-6305.

Tan EM, Kunkel HG. 1966. Characteristics of a soluble nuclear antigen precipitating with sera of patients with systemic lupus erythematosus. J Immunol 96: 464-471.

Tan S, Guo J, Huang Q, Chen X, Li-Ling J, Li Q, Ma F. 2007. Retained introns increase putative microRNA targets within 3' UTRs of human mRNA. FEBS Lett 581: 1081-1086.

Tardiff DF, Rosbash M. 2006. Arrested yeast splicing complexes indicate stepwise snRNP recruitment during in vivo spliceosome assembly. Rna 12: 968-979.

Tarn WY, Steitz JA. 1994. SR proteins can compensate for the loss of U1 snRNP functions in vitro. Genes Dev 8: 2704-2717.

Tarpey PS, Lucy Raymond F, Nguyen LS, Rodriguez J, Hackett A, Vandeleur L, Smith R, Shoubridge C, Edkins S, Stevens C et al. 2007. Mutations in UPF3B, a member of the nonsense-mediated mRNA decay complex, cause syndromic and nonsyndromic mental retardation. Nat Genet 39: 1127-1133.

Thieffry D, Huerta AM, Perez-Rueda E, Collado-Vides J. 1998. From specific gene regulation to genomic networks: a global analysis of transcriptional regulation in Escherichia coli. Bioessays 20: 433-440.

Triboulet R, Chang HM, Lapierre RJ, Gregory RI. 2009. Post-transcriptional control of DGCR8 expression by the Microprocessor. Rna 15: 1005-1011.

Tuerk C, Gold L. 1990. Systematic evolution of ligands by exponential enrichment: RNA ligands to bacteriophage T4 DNA polymerase. Science 249: 505-510.

Ule J, Jensen KB, Ruggiu M, Mele A, Ule A, Darnell RB. 2003. CLIP identifies Nova-regulated RNA networks in the brain. Science 302: 1212-1215.

Ule J, Ule A, Spencer J, Williams A, Hu JS, Cline M, Wang H, Clark T, Fraser C, Ruggiu M et al. 2005. Nova regulates brain-specific splicing to shape the synapse. Nat Genet 37: 844-852.

Valadkhan S. 2007. The spliceosome: caught in a web of shifting interactions. Curr Opin Struct Biol 17: 310-315.

Valcarcel J, Gaur RK, Singh R, Green MR. 1996. Interaction of U2AF65 RS region with pre-mRNA branch point and promotion of base pairing with U2 snRNA. Science 273: 1706-1709.

van Dam A, Winkel I, Zijlstra-Baalbergen J, Smeenk R, Cuypers HT. 1989. Cloned human snRNP proteins B and B' differ only in their carboxy-terminal part. Embo J 8: 3853-3860.

Verbeeren J, Niemela EH, Turunen JJ, Will CL, Ravantti JJ, Luhrmann R, Frilander MJ. 2010. An ancient mechanism for splicing control: U11 snRNP as an activator of alternative splicing. Mol Cell 37: 821-833.

Viegas MH, Gehring NH, Breit S, Hentze MW, Kulozik AE. 2007. The abundance of RNPS1, a protein component of the exon junction complex, can determine the variability in efficiency of the Nonsense Mediated Decay pathway. Nucleic Acids Res 35: 4542-4551.


Wahl MC, Will CL, Luhrmann R. 2009. The spliceosome: design principles of a dynamic RNP machine. Cell 136: 701-718.

Wang ET, Sandberg R, Luo S, Khrebtukova I, Zhang L, Mayr C, Kingsmore SF, Schroth GP, Burge CB. 2008. Alternative isoform regulation in human tissue transcriptomes. Nature 456: 470-476.

Wang GS, Cooper TA. 2007. Splicing in disease: disruption of the splicing code and the decoding machinery. Nat Rev Genet 8: 749-761.

Wang J, Takagaki Y, Manley JL. 1996. Targeted disruption of an essential vertebrate gene: ASF/SF2 is required for cell viability. Genes Dev 10: 2588-2599.

Wang Z, Burge CB. 2008. Splicing regulation: from a parts list of regulatory elements to an integrated splicing code. Rna 14: 802-813.

Wang Z, Hoffmann HM, Grabowski PJ. 1995. Intrinsic U2AF binding is modulated by exon enhancer signals in parallel with changes in splicing activity. Rna 1: 21-35.

Warf MB, Berglund JA. 2010. Role of RNA structure in regulating pre-mRNA splicing. Trends Biochem Sci 35: 169-178.

Warzecha CC, Sato TK, Nabet B, Hogenesch JB, Carstens RP. 2009. ESRP1 and ESRP2 are epithelial cell-type-specific regulators of FGFR2 splicing. Mol Cell 33: 591-601.

Weber G, Trowitzsch S, Kastner B, Luhrmann R, Wahl MC. 2010. Functional organization of the Sm core in the crystal structure of human U1 snRNP. Embo J 29: 4172-4184.

Weischenfeldt J, Damgaard I, Bryder D, Theilgaard-Monch K, Thoren LA, Nielsen FC, Jacobsen SE, Nerlov C, Porse BT. 2008. NMD is essential for hematopoietic stem and progenitor cells and for eliminating by-products of programmed DNA rearrangements. Genes Dev 22: 1381-1396.

Wiegand HL, Lu S, Cullen BR. 2003. Exon junction complexes mediate the enhancing effect of splicing on mRNA expression. Proc Natl Acad Sci U S A 100: 11327-11332.

Will CL, Urlaub H, Achsel T, Gentzel M, Wilm M, Luhrmann R. 2002. Characterization of novel SF3b and 17S U2 snRNP proteins, including a human Prp5p homologue and an SF3b DEAD-box protein. Embo J 21: 4978-4988.

Wilson GM, Sun Y, Sellers J, Lu H, Penkar N, Dillard G, Brewer G. 1999. Regulation of AUF1 expression via conserved alternatively spliced elements in the 3' untranslated region. Mol Cell Biol 19: 4056-4064.

Witten JT, Ule J. 2011. Understanding splicing regulation through RNA splicing maps. Trends Genet 27: 89-97.

Wittkopp N, Huntzinger E, Weiler C, Sauliere J, Schmidt S, Sonawane M, Izaurralde E. 2009. Nonsense-mediated mRNA decay effectors are essential for zebrafish embryonic development and survival. Mol Cell Biol 29: 3517-3528.

Wollerton MC, Gooding C, Wagner EJ, Garcia-Blanco MA, Smith CW. 2004. Autoregulation of polypyrimidine tract binding protein by alternative splicing leading to nonsense-mediated decay. Mol Cell 13: 91-100.


Wu C, Orozco C, Boyer J, Leglise M, Goodale J, Batalov S, Hodge CL, Haase J, Janes J, Huss JW, 3rd et al. 2009. BioGPS: an extensible and customizable portal for querying and organizing gene annotation resources. Genome Biol 10: R130.

Wu H, Sun S, Tu K, Gao Y, Xie B, Krainer AR, Zhu J. 2010. A splicing-independent function of SF2/ASF in microRNA processing. Mol Cell 38: 67-77.

Wu JY, Maniatis T. 1993. Specific interactions between proteins implicated in splice site selection and regulated alternative splicing. Cell 75: 1061-1070.

Xiao X, Wang Z, Jang M, Burge CB. 2007. Coevolutionary networks of splicing cis-regulatory elements. Proc Natl Acad Sci U S A 104: 18583-18588.

Xiao X, Wang Z, Jang M, Nutiu R, Wang ET, Burge CB. 2009. Splice site strength-dependent activity and genetic buffering by poly-G runs. Nat Struct Mol Biol 16: 1094-1100.

Xu X, Yang D, Ding JH, Wang W, Chu PH, Dalton ND, Wang HY, Bermingham JR, Jr., Ye Z, Liu F et al. 2005. ASF/SF2-regulated CaMKIIdelta alternative splicing temporally reprograms excitation-contraction coupling in cardiac muscle. Cell 120: 59-72.

Yamashita A, Izumi N, Kashima I, Ohnishi T, Saari B, Katsuhata Y, Muramatsu R, Morita T, Iwamatsu A, Hachiya T et al. 2009. SMG-8 and SMG-9, two novel subunits of the SMG-1 complex, regulate remodeling of the mRNA surveillance complex during nonsense-mediated mRNA decay. Genes Dev 23: 1091-1105.

Yang T, Adamson TE, Resnick JL, Leff S, Wevrick R, Francke U, Jenkins NA, Copeland NG, Brannan CI. 1998. A mouse model for Prader-Willi syndrome imprinting-centre mutations. Nat Genet 19: 25-31.

Yang XC, Torres MP, Marzluff WF, Dominski Z. 2009. Three proteins of the U7-specific Sm ring function as the molecular ruler to determine the site of 3'-end processing in mammalian histone pre-mRNA. Mol Cell Biol 29: 4045-4056.

Yeakley JM, Fan JB, Doucet D, Luo L, Wickham E, Ye Z, Chee MS, Fu XD. 2002. Profiling alternative splicing on fiber-optic arrays. Nat Biotechnol 20: 353-358.

Yeo G, Burge CB. 2004. Maximum entropy modeling of short sequence motifs with applications to RNA splicing signals. J Comput Biol 11: 377-394.

Yeo GW, Nostrand EL, Liang TY. 2007. Discovery and analysis of evolutionarily conserved intronic splicing regulatory elements. PLoS Genet 3: e85.

Yeo GW, Van Nostrand E, Holste D, Poggio T, Burge CB. 2005. Identification and analysis of alternative splicing events conserved in human and mouse. Proc Natl Acad Sci U S A 102: 2850-2855.

Yi J, Chang N, Liu X, Guo G, Xue L, Tong T, Gorospe M, Wang W. 2010. Reduced nuclear export of HuR mRNA by HuR is linked to the loss of HuR in replicative senescence. Nucleic Acids Res 38: 1547-1558.

Yu Y, Maroney PA, Denker JA, Zhang XH, Dybkov O, Luhrmann R, Jankowsky E, Chasin LA, Nilsen TW. 2008. Dynamic regulation of alternative splicing by silencers that modulate 5' splice site competition. Cell 135: 1224-1236.


Zachar Z, Chou TB, Kramer J, Mims IP, Bingham PM. 1994. Analysis of autoregulation at the level of pre-mRNA splicing of the suppressor-of-white-apricot gene in Drosophila. Genetics 137: 139-150.

Zhang B, Kirov S, Snoddy J. 2005a. WebGestalt: an integrated system for exploring gene sets in various biological contexts. Nucleic Acids Res 33: W741-748.

Zhang C, Frias MA, Mele A, Ruggiu M, Eom T, Marney CB, Wang H, Licatalosi DD, Fak JJ, Darnell RB. 2010. Integrative modeling defines the Nova splicing-regulatory network and its combinatorial controls. Science 329: 439-443.

Zhang D, Abovich N, Rosbash M. 2001. A biochemical function for the Sm complex. Mol Cell 7: 319-329.

Zhang D, Rosbash M. 1999. Identification of eight proteins that cross-link to pre-mRNA in the yeast commitment complex. Genes Dev 13: 581-592.

Zhang W, Morris QD, Chang R, Shai O, Bakowski MA, Mitsakakis N, Mohammad N, Robinson MD, Zirngibl R, Somogyi E et al. 2004. The functional landscape of mouse gene expression. J Biol 3: 21.

Zhang Z, Lotti F, Dittmar K, Younis I, Wan L, Kasim M, Dreyfuss G. 2008. SMN deficiency causes tissue-specific perturbations in the repertoire of snRNAs and widespread defects in splicing. Cell 133: 585-600.

Zhang ZH, Niu ZM, Yuan WT, Zhao JJ, Jiang FX, Zhang J, Chai B, Cui F, Chen W, Lian CH et al. 2005b. A mutation in SART3 gene in a Chinese pedigree with disseminated superficial actinic porokeratosis. Br J Dermatol 152: 658-663.

Zhu J, Mayeda A, Krainer AR. 2001. Exon identity established through differential antagonism between exonic splicing silencer-bound hnRNP A1 and enhancer-bound SR proteins. Mol Cell 8: 1351-1361.

Zieve GW, Sauterer RA, Feeney RJ. 1988. Newly synthesized small nuclear RNAs appear transiently in the cytoplasm. J Mol Biol 199: 259-267.

Zorio DA, Lea K, Blumenthal T. 1997. Cloning of Caenorhabditis U2AF65: an alternatively spliced RNA containing a novel exon. Mol Cell Biol 17: 946-953.

Zuo P, Maniatis T. 1996. The splicing factor U2AF35 mediates critical protein-protein interactions in constitutive and enhancer-dependent splicing. Genes Dev 10: 1356-1368.


Appendices

Appendix 1. Reprint: Pan Q, Saltzman AL, Kim YK, Misquitta C, Shai O, Maquat LE, Frey BJ, Blencowe BJ. 2006. Quantitative microarray profiling provides evidence against widespread coupling of alternative splicing with nonsense-mediated mRNA decay to control gene expression. Genes Dev 20 (2): 153-158.

NOTE: this appendix can be found on the enclosed CD.

Appendix 2. Reprint: Saltzman AL, Kim YK, Pan Q, Fagnani MM, Maquat LE, Blencowe BJ. 2008. Regulation of multiple core spliceosomal proteins by alternative splicing-coupled nonsense-mediated mRNA decay. Mol Cell Biol 28 (13): 4320-4330.

NOTE: this appendix can be found on the enclosed CD.


Appendix 3. Correlation of probe intensities (A) or % exon inclusion (B) between Cy3 and Cy5 fluor reversals for six samples. (A) Correlation of probe intensities for 6 probes (exon body: C1, A, C2 and exon junction: C1-A, A-C2, C1-C2) per AS event × 3055 AS events. (B) Correlation of % inclusion values for the AS events in the ‘top half’ of the data (n=1704), as described in the Methods section. Abbreviations: r, Pearson’s correlation coefficient; n, number of data points.
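As an illustration of the statistic reported in this appendix, the following minimal sketch computes Pearson's correlation coefficient (r) between paired measurements from a fluor reversal. The function name `pearson_r` and the example % inclusion values are hypothetical and are not taken from the thesis data.

```python
import math


def pearson_r(x, y):
    """Return Pearson's correlation coefficient for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations from the mean
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical % inclusion values for five AS events in the two dye orientations
cy3_inclusion = [10.0, 35.0, 50.0, 80.0, 95.0]
cy5_inclusion = [12.0, 30.0, 55.0, 78.0, 97.0]
r = pearson_r(cy3_inclusion, cy5_inclusion)
print(f"r = {r:.3f}, n = {len(cy3_inclusion)}")
```

The same function applies to probe intensities (panel A) or % inclusion values (panel B); only the input vectors differ.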


Appendix 4. Correlation of % inclusion between pairs of AS events with duplicate probes on the AS microarray. A subset of AS events was represented by two sets of six probes. The % inclusion values for duplicate AS events ranking in the top half (for each sample individually) are shown for the six samples. Abbreviations: r, Pearson’s correlation coefficient; n, number of data points.


Appendix 5. Correlation between % exon skipping (A) or knockdown-dependent difference in % exon skipping (B) measurements by AS microarray or RT-PCR. Abbreviations: n (below plot): number of RT-PCR reactions for each pair of samples; R², Pearson’s correlation coefficient.


Appendix 6. Microarray data for 1704 AS events that met our detection criteria. For each of 3 sample pairs (3 knockdowns and corresponding controls), the change in % skipping (knockdown − control) and change in transcript level (arcsinh scale) are given. For each of the 6 samples (3 control siRNA treatments, 3 knockdowns), the % skipping and corresponding confidence rank, transcript level (average of constitutive exon-specific probes; arcsinh transformed) and probe intensities (after normalization, see Methods) are given. For details of probe identifiers (C1, C2, A, C1:A, A:C2, C1:C2) see Pan et al. 2004.

NOTE: this appendix can be found on the enclosed CD.

Appendix 7. Annotation for 1704 microarray-monitored AS events that met our detection criteria. Gene name, LocusLink IDs, aliases and gene ontology (GO) annotation were obtained from either Stanford SOURCE (http://source.stanford.edu) or Clone/Gene ID converter (ref. 38). The counts of nucleotides of introns flanking the alternative exon that overlap with phastCons elements are given for 50 and 150 nucleotide windows. The sequences of the alternative exons and flanking constitutive exons monitored for each event are given. Measurements of % skipping from RT-PCR assays are also included. For microarray data for these events, see Appendix 6.

NOTE: this appendix can be found on the enclosed CD.


Appendix 8. Significant overlaps in AS events with a consistent change in exon inclusion levels when comparing any two UPF KDs. (A) Three-way Venn diagrams show the set relationships among AS events with at least 5% more skipping (left, blue) or more inclusion (right, yellow) upon KD of any UPF factor. There is significant enrichment for consistent changes (i.e. in the same direction of change in % exon skipping) between any pair of UPF KDs. P-values shown for the 2-way overlaps were calculated using Fisher’s exact test. (B) Color plots show microarray data for changes in % exon skipping obtained for a particular KD minus the % exon skipping obtained in the presence of control siRNA. These data correspond to the AS events used to generate the Venn diagrams in (A). Each panel shows events for which the indicated KD resulted in at least a 5% increase in skipping (B, left) or 5% increase in inclusion (B, right), with the % skipping for the other KDs shown in the same row. Rows are ordered according to the median of the change in % skipping across the three KDs. Results shown in this figure are for all ‘detectable’ (see Methods) AS events (n=1704). Similar results were obtained for the PTC-introducing subset of AS events (Appendix 9 and data not shown).
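The 2-way overlap test named in (A) can be sketched as follows. This is a minimal illustration only: it assumes a one-sided (enrichment) Fisher's exact test against the background of all detectable AS events, which the caption does not specify, and the function name `overlap_enrichment_p` and the example counts are hypothetical.

```python
from math import comb


def overlap_enrichment_p(total, size_a, size_b, overlap):
    """One-sided Fisher's exact test p-value for the overlap of two sets.

    Returns P(X >= overlap), where X is hypergeometric: the number of events
    shared by a set of size_a and a set of size_b, both drawn from `total`
    background events.
    """
    upper = min(size_a, size_b)
    if not 0 <= overlap <= upper or max(size_a, size_b) > total:
        raise ValueError("inconsistent set sizes")
    # Number of ways to choose set B so that it shares exactly k events with A
    tail = sum(
        comb(size_a, k) * comb(total - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(total, size_b)


# Hypothetical counts: 1704 detectable events, 120 changing in one KD,
# 100 changing in another KD, 40 changing in both in the same direction.
p_example = overlap_enrichment_p(1704, 120, 100, 40)
print(f"p = {p_example:.3g}")
```

Exact integer arithmetic via `math.comb` avoids floating-point underflow for the very small p-values that arise when the overlap far exceeds the expected count.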


Appendix 9. Effects of each UPF factor knockdown on PTC-introducing AS events. As in Figure 27, the change in % alternative exon inclusion level is shown for AS events that introduce a PTC upon inclusion (A) or upon skipping (B). Events for which at least a 5% change in % inclusion was detected upon UPF1, UPF2 or UPF3X knockdown are shown, and rows are ordered according to the average of the % inclusion change across the three knockdowns.


Appendix 10. Frequency of changes in exon inclusion level upon knockdown of UPF1, UPF2, or UPF3X for all detectable AS events (A) or for specific categories (BD). Colored portions of stacked bar graphs reveal the proportion of AS events showing a given amount of change in % inclusion in the knockdown relative to the control siRNA treatment. The colors represent increasing cutoffs for the change in % exon skipping. The frequency of changes in UPF1 and UPF3X KDs is similar overall (A), while slightly fewer changes (58%) were seen in the UPF2 KD. Notably, a higher frequency of AS changes for PTC-introducing events (C and D) was observed for the UPF1 KD. For events with No PTC (E), similar frequencies of changes were seen for UPF1 and UPF2 KDs, while for the UPF3X KD the frequency was slightly higher. For a description of categories, see Methods.


Appendix 11. Cumulative distribution function (CDF) plots of flanking intron sequence overlap with phastCons elements for the ‘No PTC’ group. Differences in the distributions of nt overlapping phastCons sequences for events showing ≥5% more exon skipping upon knockdown of UPF2 or UPF3X were observed (top panels; UPF2: upstream intron (black): p=0.006, downstream intron (red): p=0.0001; UPF3X: upstream intron: p=0.004, downstream intron: p=0.03, Wilcoxon rank sum test).


Appendix 12. Annotation for microarray-monitored PTC-introducing AS events with conserved flanking intron sequences. AS events that (i) introduce a PTC upon inclusion (#1-28) or skipping (#29-43), (ii) have conserved flanking introns (at least 35/50 nt up/downstream of alternative exon overlap conserved elements predicted by phastCons, Siepel et al. 2005) and (iii) show changes in abundance upon UPF factor knockdown (Figure 27) are shown in this table. For microarray data for these events, see Appendix 6. Refer to Appendix 7 and Methods for column explanations and resources used.

NOTE: this appendix can be found on the enclosed CD.

Appendix 13. Annotation of cassette AS events identified in spliceosome-associated genes. Gene identifiers, whether the event is conserved between human and mouse (based on ESTs) or introduces a PTC, number of nucleotides of intron flanking the alternative exon that overlap phastCons elements, and the sequence of the alternative exon and flanking constitutive exons are given.

NOTE: this appendix can be found on the enclosed CD.

Appendix 14. Annotation of cassette AS events identified in the control gene set. Columns as in Appendix 13.

NOTE: this appendix can be found on the enclosed CD.

Appendix 15. Reprint: Saltzman AL, Pan Q, Blencowe BJ. 2011. Regulation of alternative splicing by the core spliceosomal machinery. Genes Dev 25 (4): 373-384.

NOTE: this appendix can be found on the enclosed CD.


Appendix 16. Comparison of SmB/B′ and SmN amino acid sequences (A) and mRNA expression patterns across 84 tissue and cell types (B). (A) A multiple sequence alignment of human SmB, SmB′ and SmN protein sequences was constructed using Clustal W version 2 (Larkin et al. 2007). The 17 amino acid differences between SmB′ and SmN are highlighted in blue, and the C-terminal difference between SmB and SmB′ is highlighted in green. The Sm1 and Sm2 motifs are boxed. (B) Microarray expression data (Su et al. 2004) for SmB/B′ (Affymetrix U133A probeset 213175_s_at) and SmN (probeset 201522_x_at) were obtained from the BioGPS website (Wu et al. 2009). Each bar represents the average of 25 replicates. Dotted lines indicate expression that is twofold above or below the median across the 84 samples. Expression of SmB/B′ (left panel) is highest in cell lines and tissues of the hematopoietic system. Expression of SmN (right panel) is highest in the central nervous system. The next highest expression of SmN is in the heart, pituitary gland, prostate, thyroid and fetal thyroid (indicated by ♦).


Appendix 17. Abrogation of NMD by treatment of HeLa cells with the translation inhibitor cycloheximide (CHX) leads to an increase in the steady-state level of the endogenous exon-included, PTC-containing SNRPB variant (A), but not the exon-included variant from the SNRPB reporter ‘miniSmB’ (B). Cells were transfected with miniSmB and treated with CHX (300 μg/mL) or with an equivalent amount of DMSO as a control for three hours prior to RNA isolation. (A) Endogenous SmB/B′ transcripts were detected by RT-PCR using primers designed to amplify both the included and skipped variants (upper panel) or to specifically amplify the included PTC-containing variant (lower panel). (B) Transcripts derived from the miniSmB reporter were detected by RT-PCR using primers designed to amplify both the included and skipped variants (upper panel) or to specifically amplify the included variant (middle panel). RT-PCR assays for 5S rRNA are shown as a control for RNA concentration (bottom panel). Dilutions of input RNA (from DMSO-treated cells) in the left three lanes indicate that the RT-PCR assays are semiquantitative. Quantifications of % inclusion or relative signal ± standard deviations calculated from four RT-PCR assays performed on samples from two independent transfections are shown in the bar graphs to the right of each gel.


Appendix 18. A deletion adjacent to the 5′ss that strengthens potential base-pairing to U1 snRNA abrogates SmB/B′ knockdown-dependent skipping. Assays performed as in Figure 45. MiniSmBD53 contains a deletion from +7 to +18, immediately downstream of the 5′ss. MiniSmBM52 contains a 12-nt substitution from +7 to +18. (Top panel) The potential base-pairing of D53 to U1 snRNA is stronger than that of M52. (Bottom panel) In the control (NT) knockdown, both the deletion and mutation increase the inclusion of miniSmB when compared to wildtype (~40% inclusion; see Figs. 13). However, the effect of SmB/B′ knockdown on exon inclusion is reduced for D53 but not for M52. Abbreviations: ψ, pseudouridine; NT, nontargeting; D, deletion; M, mutation.


Appendix 19. Mutations that strengthen the 3′ss do not abrogate SmB/B′ knockdown-dependent skipping. In the control (NT) knockdown, three mutations that increase the strength of the 3′ss (‘strong’, ‘A(15)U’ and ‘G(13)U’) increase the inclusion of miniSmB when compared to wildtype (see Figs. 13). The effect of SmB/B′ knockdown (grey bars) on miniSmB exon inclusion is essentially unchanged when compared to the effect observed in the wildtype miniSmB (~25% more skipping than in the control; see Figs. 13). Alternative exon inclusion from miniSmB was measured by RT-PCR assays performed using primers specific for the flanking constitutive exons (hatched boxes). Quantifications of % inclusion ± standard deviations calculated from triplicate RT-PCR assays performed on samples from two independent transfections are shown in the bar graph below the gel. Note that the ‘strong’ 3′ss (left) represents the same mutation shown in the main text (Figure 46B); however, an assay performed on RNA from independent transfections is shown here. Abbreviations: NT, nontargeting; ss, splice site; wt, wildtype; mut, mutation.


Appendix 20. Data and annotations for 5752 AS events monitored by RNA-Seq that passed our filtering criteria.

NOTE: this appendix can be found on the enclosed CD.

Appendix 21. Data and annotations for 8626 triplets of consecutive 'constitutive' exons monitored by RNA-Seq that passed our filtering criteria.

NOTE: this appendix can be found on the enclosed CD.

Appendix 22. Gene Ontology (GO) and Pathway Commons enrichment analysis for 235 genes containing AS events with a ≥30% change in % exon inclusion upon SmB/B′ knockdown. Genes #1-90 were annotated by any of the enriched GO or pathway terms. Genes #91-235 were not annotated by these terms. For each GO term (columns GO) or pathway (columns PY), the enrichment p-value and the genes annotated by the term are listed in the column in the appropriate rows. For each gene, all event ID(s) for AS events in the gene showing more skipping (at least 10% difference) in the SmB/B′ knockdown relative to the control are shown in column E. The difference in % inclusion (SmB/B′ knockdown − control) for each event in column E is shown in column F. Genes were selected based on having a single AS event showing at least 30% more skipping in the SmB/B′ knockdown, and any additional events affected by SmB/B′ knockdown are included for completeness. Each gene was only counted once for the analysis. For details of the enrichment analysis, please refer to the Methods section.

NOTE: this appendix can be found on the enclosed CD.


Appendix 23. Exon inclusion levels and knockdown-dependent changes for all 27 assayed alternative exons agree well with RNA-Seq predictions. Alternative exons monitored by RNA-Seq (n=5752) were split into three equally sized groups by the read coverage of their splice junctions in the SmB/B′ knockdown sample (top 3rd, middle 3rd, lowest 3rd). Alternative splicing events (n=27) were selected from each group representing a range of % inclusion differences between the control and SmB/B′ knockdown. Within each group, alternative exons are ordered from left to right by the change in inclusion in the SmB/B′ knockdown compared to the control (left, more skipping; right, more inclusion). PTC, premature termination codon.
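The grouping step described above can be sketched as follows. This is a minimal illustration under stated assumptions: events are ranked by a single splice-junction read-coverage value per event, ties are broken arbitrarily, and the function name `split_into_thirds` and the example event IDs and coverage values are hypothetical.

```python
def split_into_thirds(coverage_by_event):
    """Split events into top, middle and lowest thirds by read coverage.

    coverage_by_event maps an event ID to its splice-junction read coverage.
    Group sizes differ by at most one when the count is not divisible by 3.
    """
    # Rank event IDs from highest to lowest coverage
    ranked = sorted(coverage_by_event, key=coverage_by_event.get, reverse=True)
    n = len(ranked)
    bounds = [(i * n) // 3 for i in range(4)]
    names = ("top", "middle", "lowest")
    return {name: ranked[bounds[i]:bounds[i + 1]] for i, name in enumerate(names)}


# Hypothetical junction read coverage for six alternative exons
coverage = {"e1": 500, "e2": 20, "e3": 90, "e4": 300, "e5": 5, "e6": 60}
groups = split_into_thirds(coverage)
print(groups)
```

Events for validation would then be drawn from each group so that high-, medium- and low-coverage predictions are all represented.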