REGULATION OF ALTERNATIVE POLY(A) SITE CHOICE BY CO-TRANSCRIPTIONAL FACTOR RECRUITMENT, POL II PAUSING AND SPLICING

By

BECKY ANN FUSBY

B.S. University of Nebraska Kearney, 2010

A thesis submitted to the

Faculty of the Graduate School of the

University of Colorado in partial fulfillment

of the requirements for the degree of

Doctor of Philosophy

Molecular Biology Program

2016

This thesis for the Doctor of Philosophy degree by

Becky Ann Fusby

has been approved for the

Molecular Biology Program

by

Mair Churchill, Chair

Aaron Johnson

Jay Hesselberth

Richard Davis

Dylan Taatjes

David Bentley, Advisor

Date 05/20/2016

Fusby, Becky Ann (Ph.D. Molecular Biology)

Regulation of Alternative Poly(A) Site Choice by Co-Transcriptional Factor

Recruitment, Pol II Pausing, and Splicing

Thesis directed by Professor David L. Bentley

ABSTRACT

The majority of mammalian genes undergo alternative polyadenylation, resulting in production of alternate mRNA isoforms. Aberrant mRNA length due to alternative poly(A) site choice has been associated with numerous diseases. For this reason, understanding the mechanistic regulation of poly(A) site choice is critical. The work in this thesis specifically addresses numerous aspects of transcription and mRNA processing that regulate poly(A) site choice. The relationship between a CTD-ligand, its ability to promote Ser2P, and the functional consequence of reduced Ser2P on 3’ UTR poly(A) site choice is examined. Additionally, the importance of poly(A) site processing in promoting pol II pausing, Ser2P, and processing factor recruitment is investigated. Lastly, the regulation of intronic poly(A) sites by splicing factors and functional splicing is analyzed. I find that the extent of Ser2P is not closely coupled to poly(A) site use as predicted, but pol II pausing and processing factor recruitment are directly coupled with poly(A) site utilization. Furthermore, I find that splicing is the major regulator of intronic poly(A) site usage, in contradiction to a model of U1 snRNP-specific protection. This research further enhances the understanding of how poly(A) sites are regulated by identifying critical factors that ultimately influence alternative polyadenylation.

The form and content of this abstract are approved. I recommend its publication.

Approved: David Bentley

To the people who kept me going when I wanted to give up.

Thank you for believing in me.

ACKNOWLEDGEMENTS

I would like to thank my family and friends for their unconditional love, support, encouragement and sacrifice. I would like to especially thank my parents who gave me the freedom to be myself and pursue my dreams. Their only wish for me has always been my happiness, and for that, I am truly grateful. To my life-partner Jesse, thank you for your love, unwavering support and patience.

Thank you for teaching me to worry less and for reminding me to live in the moment. Your love of science and quest for knowledge remind me daily why I pursued this degree. To Stella, thank you for being my support system while finishing my thesis and for being my reality check when I needed it most.

I would also like to thank all the people who have helped me along my academic journey. Thank you to my college mentor, Kim Carlson, who taught me not to apologize for who I am and encouraged me to go to graduate school.

David, thank you for letting me join your lab and investing your time and energy in training me. I would also like to thank all the present and past members of the Bentley lab for their help and support. To Roberto and Kris, thank you for your friendship both inside and outside the lab. To Tassa, thank you for bringing your cheerfulness and knowledge into the lab. I learned so much from you in a short time. To Michael and Ryan, it has been a pleasure learning and training alongside you, and I wish you all the luck in your futures.

I would also like to thank the BMG department, the Molecular Biology program, and my committee for being passionate and insightful colleagues.

TABLE OF CONTENTS

CHAPTER

I. INTRODUCTION

1.1 Mechanism of 3’-End Processing

1.1A Properties of Poly(A) Sites

CPSF and AAUAAA

CstF and DSE

CFI and UGUA

Additional Cleavage/Polyadenylation Factors

1.1B APA and Mechanisms of Regulation

“First Come, First Served” Model

Survival of the Fittest Model

Agonist/Antagonist Model

1.2 APA Regulation by RNA Polymerase II

1.2A Regulation by RNA Polymerase II Pausing

Promoter Proximal Pause

3’-End Pol II Pause

1.2B Ser2-CTD Phosphorylation and APA

Ser2-CTD Kinases and Phosphatases

Phospho-Ser2-CTD Ligands

Integration of Phospho-Ser2-CTD with 3’-End Processing

1.3 Integration of APA and Splicing

1.3A Mechanism of Splicing

Major Spliceosome

1.3B Splicing and Cleavage/Polyadenylation

Interaction of Splicing and Cleavage/Polyadenylation Factors

IgM Gene: A Balance of Splicing and Polyadenylation

U1 snRNP and Polyadenylation

1.4 Specific Questions

II. MATERIALS AND METHODS

2.1 ChIP

2.2 ChIP-seq

2.3 …

2.4 Mini-Gene Transfection

2.5 RNA Extraction

2.6 Poly(A)+ RNA-seq

2.7 Cell Lines and Growth Conditions

2.8 Extractions and Immunoblotting

2.9 3’ RACE

2.10 Real-Time PCR

2.11 ASO Transfection

2.12 shRNA-Mediated Spt6 Knockdown

2.13 ChIP-seq Pipeline

III. THE ROLE OF SPT6 IN REGULATING TRANSCRIPTION, SER2-CTD PHOSPHORYLATION AND MRNA PROCESSING

3.1 Introduction

3.2 Results

3.2A Spt6 is Localized at 3’-Ends of Genes in Correlation with Ser2P-CTD

3.2B Knockdown of Spt6 Through an Inducible shRNA

3.2C Spt6 Affects Pol II Distribution at the 5’- and 3’-Ends of Genes

3.2D Spt6 Does Not Generally Affect Pol II Distribution Genome-Wide

3.2E Spt6 Promotes Ser2-CTD Phosphorylation at 3’-Ends of Genes

3.2F Spt6 Does Not Affect Other Phospho-CTD Isoforms

3.2G Spt6 Regulates Elongation Complex Factor but Not Ser2 Kinase

3.2H Spt6 Affects 3’ UTR Alternative Poly(A) Site Choice

3.2I Spt6 Alters H3K36me3 on Gene Bodies

3.3 Discussion

IV. COORDINATION OF RNA POLYMERASE II PAUSING AND 3’-END PROCESSING FACTOR RECRUITMENT WITH ALTERNATIVE POLYADENYLATION

4.1 Introduction

4.2 Results

4.2A Pol II Pause Correlates with Poly(A) Site Usage

4.2B The μS Poly(A) Site is Necessary for the μS+500 Pause

4.2C The μS Poly(A) Site is Not Sufficient to Induce Pausing

4.2D CTD Ser2 Phosphorylation is Uncoupled from Pol II Pausing at the μS+500 Site

4.2E The β-Globin Poly(A) Site is Necessary for Ser2-CTD Hyperphosphorylation

4.2F Alternative Poly(A) Site Use and CstF Recruitment

4.3 Discussion

V. REGULATION OF ALTERNATIVE POLY(A) SITE CHOICE BY SPLICING

5.1 Introduction

5.2 Results

5.2A snRNA ASOs Attenuate Splicing

5.2B Reduction of Splicing Alters Poly(A) Site Use Within Gene Bodies

5.2C Different Mechanisms of Splicing Inhibition Alternatively Regulate Poly(A) Site Use

5.2D U1, U4, and U6 Protect Poly(A) Sites Near the TSS

5.2E Splicing Preferentially Regulates Use of Intronic Poly(A) Sites

5.2F snRNP Depletion Favors Intronic Poly(A) Site Use on NR3C1 Mini-Gene

5.2G Attenuation of Splicing by Alternative Mechanisms Increases Intronic Poly(A) Site Use

5.2H snRNPs Regulate APA on Integrator Genes in a Potential Negative-Feedback Loop

5.3 Discussion

VI. CONCLUSIONS

REFERENCES

APPENDIX

A. CHAPTER III SEQUENCING LIBRARIES

B. SPT6 AFFECTED POLY(A) SITES

C. CHAPTER IV SEQUENCING LIBRARIES

D. CHAPTER IV PRIMERS

E. CHAPTER V SEQUENCING LIBRARIES

F. CHAPTER V PRIMERS

G. SNRNP AND SSA AFFECTED POLY(A) SITES

LIST OF FIGURES

FIGURE

1-1: Diagram of pre-mRNA poly(A) site cis-elements and corresponding factors

1-2: Density of total pol II, Ser2P- and Ser5P-CTD on a typical mammalian gene

1-3: Schematic of the step-wise major spliceosome reaction

1-4: Diagram of poly(A) site use on the IgM gene and relevant mutants

2-1: ChIP-seq pipeline flowchart

3-1: Spt6 is localized at 3’-ends of genes in correlation with Ser2P-CTD

3-2: Knockdown of Spt6 through an inducible shRNA

3-3: Spt6 affects pol II distribution at both the 5’- and 3’-ends of genes

3-4: Spt6 does not generally affect pol II distribution genome-wide

3-5: Spt6 promotes Ser2-CTD phosphorylation at 3’-ends of genes

3-6: Spt6 globally regulates Ser2P-CTD at 3’-ends of genes

3-7: Spt6 does not affect other phospho-CTD isoforms

3-8: Spt6 affects recruitment of elongation complex factor but not Ser2 kinase

3-9: Spt6 affects 3’ UTR alternative poly(A) site choice

3-10: Spt6 alters H3K36me3 on gene bodies

3-11: Spt6 globally regulates placement of H3K36me3 on genes

4-1: IgM WT transgene expression in M12 and S194 cells

4-2: Pol II pausing in the Cμ4-M1 intron is coupled to μS poly(A) site use

4-3: IgM pA21 and 5’-SP transgene expression in S194 cells

4-4: The μS poly(A) site is necessary but not sufficient for μS+500 pol II pause

4-5: Uncoupling of pausing from pol II Ser2-CTD hyperphosphorylation in mutants that alter μS poly(A) site usage

4-6: The β-Globin poly(A) site is necessary for pol II pausing and Ser2-CTD hyperphosphorylation

4-7: Alternative poly(A) site use and CstF recruitment

4-8: Model for coordination of pausing and 3’ processing factor recruitment with alternative poly(A) site choice at the IgM gene

5-1: snRNA ASOs attenuate splicing

5-2: Reduction of splicing alters poly(A) site use within gene bodies

5-3: Dendrogram of poly(A) site use in different conditions

5-4: U1, U4 and U6 preferentially protect poly(A) sites near the TSS

5-5: Splicing preferentially regulates use of intronic poly(A) sites

5-6: snRNPs regulate the intronic poly(A) site on a NR3C1 mini-gene

5-7: Attenuation of splicing by alternative mechanisms increases intronic poly(A) site use

5-8: Validation of intronic poly(A) site use by 3’ RACE

5-9: snRNPs regulate APA on Integrator genes

5-10: Model of intronic poly(A) site regulation by splicing

6-1: Alternative models of 3’ UTR poly(A) site choice

LIST OF ABBREVIATIONS

APA: Alternative polyadenylation

ASOs: Antisense oligonucleotides

BED: Browser Extensible Data

CDK: Cyclin dependent kinase

ChIP: Chromatin immunoprecipitation

ChIP-seq: ChIP coupled with sequencing

CFI: Cleavage factor I

CPA: Cleavage/polyadenylation

CPSF: Cleavage and polyadenylation specificity factor

CstF: Cleavage stimulation factor

CTD: C-terminal domain (pol II)

DSE: Downstream element

EtBr: Ethidium bromide

H3K36me3: Histone 3, lysine 36, trimethylation

NELF: Negative elongation factor

PARP: Poly(ADP)-ribose polymerase

PAS: Poly(A) site

PCPA: Premature cryptic poly(A) site

P-TEFb: Positive transcription elongation factor b

Pol II: RNA Polymerase II

RACE: Rapid amplification of cDNA ends

SEC: Super elongation complex

Ser2, 5, 7: Serine2, 5, 7 (CTD)

shRNA: Short hairpin RNA

snRNA: Small nuclear RNA

snRNP: Small nuclear ribonucleoprotein

Spt6: Suppressor of Ty6

TSS: Transcription start site

Tyr1: Tyrosine1 (CTD)

U1, U2, U4, U5, U6: Uridylic acid-rich 1, 2, 4, 5, 6

UCSC: University of California Santa Cruz

USE: Upstream element

CHAPTER I

INTRODUCTION

Alternative mRNA processing significantly facilitates the diversity of the transcriptome. The research summarized in this dissertation focuses on understanding the mechanistic regulation of alternative poly(A) site choice.

Specifically, this work reveals 1) the ability of an RNA polymerase II (pol II) CTD ligand to modify Ser2-phosphorylation levels, 2) the role of Ser2-phosphorylation in promoting poly(A) site choice, 3) the coupling of poly(A) site use with pol II pausing and 3’ processing factor recruitment, and 4) the regulation of poly(A) site use by snRNPs and splicing.

The first section of this chapter will discuss the mechanism of 3’-end processing by focusing on how a poly(A) site is defined. I will also discuss the importance of alternative polyadenylation (APA) and theories of how poly(A) site choice is regulated. In the second section, I will discuss how pol II regulates poly(A) site choice by examining how transcription is integrated with APA. I will focus on how pol II pausing impacts mRNA processing and how phosphorylation of Ser2-CTD couples, and possibly promotes, 3’-end processing. In the third section, I will discuss evidence that connects splicing and polyadenylation. I will examine the IgM gene where a balance between splicing and polyadenylation regulates alternative poly(A) site choice during B-cell maturation. I will also examine a novel function of a splicing factor, U1 snRNP, in regulating polyadenylation. Finally, the specific questions addressed in this thesis will be outlined.

1.1 Mechanism of 3’-End Processing

Processing of pre-mRNAs into mature transcripts involves the addition of a 7-methylguanosine cap at the 5’-end, removal of introns by splicing, and synthesis of a polyadenine tail at the 3’-end. Almost all eukaryotic protein-coding genes have a non-templated poly(A) tail added to their 3’-ends. This involves a well-documented two-step process of pre-mRNA endonucleolytic cleavage followed by poly(A) tail addition (Chan et al., 2011; Colgan and Manley, 1997; Proudfoot, 2011; Shi and Manley, 2015; Zhao et al., 1999a). Interestingly, genome-wide analysis revealed that more than 70% of eukaryotic genes have multiple mRNA isoforms through APA (Di Giammartino et al., 2011; Tian et al., 2005). The work in this thesis focuses on mechanisms that regulate alternative poly(A) site choice. The first part of this introduction will focus on how a poly(A) site (PAS) is defined and the mechanism of 3’-end processing. This section will also address general theories of how APA is regulated.

1.1A Properties of Poly(A) Sites

mRNAs were first found to have unique 3’-terminal tails by purifying subcellular fractions via column chromatography; the 3’-ends were adenine rich and resistant to pancreatic and T1 RNases, which cleave at C/U and G residues, respectively (Adesnik et al., 1972; Birnboim et al., 1973; Edmonds et al., 1971; Lim and Canellakis, 1970; Mendecki et al., 1972). The function of long poly(A) tails at the 3’-end of mRNAs was not known at the time, but it provided a useful property for isolating mRNA from ribosomal RNA using oligo(dT) chromatography (Aviv and Leder, 1972). Being able to isolate mRNA, along with the technological development of cDNA synthesis and Sanger sequencing, allowed for thorough characterization of the PAS. Poly(A) sites are defined by highly conserved nucleotide composition and a number of key cis-elements including an AAUAAA hexanucleotide sequence, a CA dinucleotide cleavage site ~10-30 nucleotides downstream of the hexamer, a U/GU-rich downstream element (DSE) that is ~40 nucleotides downstream of the cleavage site, and a U-rich upstream auxiliary element (USE) containing a UGUA consensus sequence (Figure 1-1). PAS-recognition is dependent on protein-RNA interaction at these specific elements and is discussed in detail below.
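The arrangement of cis-elements described above can be illustrated with a minimal Python sketch that scans a sequence for candidate poly(A) sites. It is an illustration only and is not part of the analyses in this thesis. The function name `find_pas_candidates`, the 50-nucleotide upstream window for UGUA, and the use of G/U content as a stand-in for the U/GU-rich DSE are all assumptions made for this example; the 10-30 and 40 nucleotide windows are taken from the approximate distances quoted in the text.

```python
import re

HEXAMER = "AAUAAA"
# Assumed search windows, loosely based on the approximate distances quoted above.
CLEAVAGE_MIN, CLEAVAGE_MAX = 10, 30  # nt from hexamer end to the CA dinucleotide
USE_WINDOW = 50                      # nt upstream of the hexamer scanned for UGUA (assumption)
DSE_WINDOW = 40                      # nt downstream of the CA scored for G/U content


def find_pas_candidates(sequence):
    """Scan an RNA (or DNA) string for AAUAAA hexamers with a nearby CA site."""
    rna = sequence.upper().replace("T", "U")
    candidates = []
    # The lookahead lets overlapping hexamer matches be reported.
    for match in re.finditer("(?=" + HEXAMER + ")", rna):
        hex_start = match.start()
        hex_end = hex_start + len(HEXAMER)
        # First CA whose start lies 10-30 nt past the end of the hexamer.
        ca_pos = rna.find("CA", hex_end + CLEAVAGE_MIN, hex_end + CLEAVAGE_MAX + 2)
        if ca_pos == -1:
            continue
        upstream = rna[max(0, hex_start - USE_WINDOW):hex_start]
        downstream = rna[ca_pos + 2:ca_pos + 2 + DSE_WINDOW]
        gu_fraction = (
            sum(base in "GU" for base in downstream) / len(downstream)
            if downstream else 0.0
        )
        candidates.append({
            "hexamer_start": hex_start,              # 0-based index of the hexamer
            "cleavage_offset": ca_pos - hex_end,     # nt between hexamer end and CA
            "has_ugua": "UGUA" in upstream,          # crude USE check
            "downstream_gu_fraction": gu_fraction,   # crude DSE proxy
        })
    return candidates
```

Calling `find_pas_candidates` on a sequence returns one dictionary per hexamer that has a CA dinucleotide in the expected window. Because many functional poly(A) sites lack one or more of these elements, a scan of this kind would miss real sites and should be read only as a picture of the canonical layout.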

CPSF and AAUAAA

The AAUAAA sequence is a critical PAS element, as mutation of the late SV40 PAS prevented 3’-end polyadenylation (Fitzgerald and Shenk, 1981). This hexanucleotide cis-element is recognized by cleavage and polyadenylation specificity factor (CPSF), which is a multisubunit complex comprised of CPSF160, Wdr33, CPSF100, CPSF73, Fip1 and CPSF30 (Bienroth et al., 1991; Murthy and Manley, 1992; Shi et al., 2009). CPSF73 possesses RNA endonuclease activity and is essential for cleavage of the pre-mRNA (Mandel et al., 2006). CPSF100 is also critical for the cleavage reaction (Kolev et al., 2008), but unlike CPSF73, it does not have endonuclease activity. Recombinant Fip1 binds U-rich RNA, and within the CPSF complex, it binds the RNA sequence upstream of the AAUAAA hexamer (Chan et al., 2014; Kaufmann et al., 2004). However, the exact mechanism of how CPSF directly binds the hexamer sequence is not yet resolved.

It was initially shown that two CPSF subunits of approximately 160 kDa and 30 kDa crosslink to an AAUAAA RNA oligo (Keller et al., 1991), and these were presumed to be CPSF160 and CPSF30 due to their relative sizes. It was shown in yeast that Yhh1/Cft1, the CPSF160 homolog, binds to a PAS-containing RNA near the cleavage site (Dichtl et al., 2002). Mammalian recombinant CPSF160 has also been shown to possess RNA-binding activity with preference for the AAUAAA sequence (Murthy and Manley, 1995). However, global mapping of CPSF160 to RNA failed to detect specific enrichment near the AAUAAA hexanucleotide element (Martin et al., 2012). Interestingly, Wdr33 is of the same approximate size as CPSF160 and directly interacts with RNA (Schonemann et al., 2014). Recent PAR-CLIP experiments provide evidence that Wdr33 and CPSF30 directly bind AAUAAA RNA sequences (Chan et al., 2014; Schonemann et al., 2014).

The native CPSF complex was originally estimated to be in the range of 200-300 kDa (Christofori and Keller, 1988; Takagaki et al., 1989), which could not include both CPSF160 and Wdr33 in the same complex, as they are both ~160 kDa proteins. This and the discrepant results of CPSF160 and Wdr33 in AAUAAA binding suggest that CPSF might have multiple distinct complex compositions that function on distinct classes of poly(A) sites. This could prove to be another layer of APA regulation.

CstF and DSE

The DSE was identified as a GU-rich sequence downstream of the AAUAAA element that enhances 3’-end formation (Gil and Proudfoot, 1984, 1987; McLauchlan et al., 1985). Cleavage stimulation factor (CstF) recognizes the DSE and is a trimeric complex comprised of CstF77, CstF50, and CstF64 or paralog CstF64τ. CstF77, in addition to interacting with the components of CstF, has been shown to interact directly with CPSF160 (Murthy and Manley, 1995) and indirectly with pol II CTD (McCracken et al., 1997b). It has been suggested that the role of CstF77 is to bridge and position various 3’ processing factors in relation to transcribing pol II. CstF50 directly interacts with pol II CTD as well (McCracken et al., 1997b) and is required for CstF activity (Takagaki and Manley, 1992, 1994).

The ability of CstF to bind the DSE of the pre-mRNA is mediated by CstF64(τ) (MacDonald et al., 1994; Takagaki and Manley, 1997). CstF64τ was originally identified as being testis-specific and has a higher RNA-binding specificity than its paralog, CstF64 (Monarez et al., 2007; Wallace et al., 1999). However, new evidence suggests that CstF64τ is similar to CstF64 and is widely expressed in mammalian tissues (Yao et al., 2013). In fact, individual knockdown of CstF64 and CstF64τ had little impact on 3’-end processing, but a double knockdown of the paralogs had a significant impact in altering poly(A) site usage in cells (Yao et al., 2012).

CFI and UGUA

3’-end processing is additionally enhanced by the presence of the U-rich USE upstream of the AAUAAA element, which contains a UGUA consensus motif (Carswell and Alwine, 1989; Moreira et al., 1995). Cleavage factor I (CFI) binds the UGUA motif (Brown and Gilmartin, 2003; Hu et al., 2005), and it is comprised of CFI25 and either CFI59 or CFI68 (Ruegsegger et al., 1996). It is thought that all three CFI subunits have the ability to bind RNA since they can be UV cross-linked to RNA (Ruegsegger et al., 1996).

CFI25 is the only subunit always found in the CFI complex, and it has a non-canonical Nudix domain that allows for specific binding to the UGUA motif (Yang et al., 2011; Yang et al., 2010), which has been shown genome-wide (Martin et al., 2012). Interestingly, CFI25 can form a dimer, allowing multiple UGUA sequences to be recognized simultaneously (Yang et al., 2010). As CFI contains CFI59 and CFI68 in a mutually exclusive manner, it is not surprising that CFI59 and CFI68 share similar structural features such as N-terminal RNA recognition motifs and C-terminal RS-repeats (Ruegsegger et al., 1998). Intriguingly, CFI59 and CFI68 also share similar structures to SR proteins, which function in splicing (Manley and Tacke, 1996), and have been shown to interact with splicing factors (Millevoi et al., 2002; Millevoi et al., 2006; Rappsilber et al., 2002; Yang et al., 2011; Zhou et al., 2002). This connection between cleavage/polyadenylation and splicing factors is further discussed in section 1.3B.

Additional Cleavage/Polyadenylation Factors

CPSF, CstF and CFI constitute the core mRNA 3’-processing complex. It is thought that CPSF and CstF synergistically bind AAUAAA and DSE, respectively, while the CFI complex binds to the UGUA motif upstream (Hu et al., 2005). However, over 20 proteins are directly involved in cleavage/polyadenylation (Mandel et al., 2008; Zhao et al., 1999b), and it is after the core complex binds pre-mRNA that other cleavage/polyadenylation factors are recruited, including cleavage factor II (CFII), containing Pcf11 and Clp1, Symplekin, poly(A) polymerase (PAP), nuclear poly(A) binding protein (PABPN) and pol II (Shi et al., 2009). Once the pre-mRNA is cleaved, there is an exposed 3’ OH that is rapidly polyadenylated by PAP and a downstream fragment bearing a 5’ phosphate that is targeted for degradation.

1.1B APA and Mechanisms of Regulation

Discussed above are the properties that define a poly(A) site. However, poly(A) sites can be divergent and often lack one or more cis-elements, and poly(A) site recognition does not appear to follow strict requirements compared to other processing events such as splicing. First of all, the AAUAAA hexanucleotide sequence is highly divergent, as it is estimated that approximately 30% of human PASs lack the canonical AAUAAA or even a close variant (Beaudoing et al., 2000). Genome-wide analysis of PAS usage revealed that AAUAAA is the hexamer PAS sequence ~60% of the time while AUUAAA is used ~15% of the time and there are 9 additional PAS variants used ~14% of the time (Tian et al., 2005). Additionally, it has been reported that ~20% of human PASs do not possess a U/GU-rich DSE (Zarudnaya et al., 2003).

Additionally, genes typically have multiple poly(A) sites, making 3’-end processing considerably more complicated. It is estimated that greater than 70% of eukaryotic genes undergo APA, producing multiple mRNA isoforms with distinct 3’-ends (Di Giammartino et al., 2011; Elkon et al., 2013; Shi, 2012; Tian and Manley, 2013). Specifically, APA in mammals is estimated to occur on 70-79% of genes (Derti et al., 2012; Hoque et al., 2013). Changes in 3’-ends by APA can occur in the 3’ UTR, which impacts mRNA stability and translocation. APA can also occur within the gene body in exons or introns, which ultimately alters the coding sequence, resulting in different protein isoforms.

This leads to the fundamental, but unresolved, question of “how are multiple poly(A) sites on the same gene regulated?” The work in this thesis provides insight into the mechanism of alternative poly(A) site choice. Below, I will discuss current models of how APA is regulated and the evidence that supports and opposes each model.

“First Come, First Served” Model

The “first come, first served” model predicts that the distance between poly(A) sites and elongation rate play important roles in APA (Danckwardt et al., 2008; Davis and Shi, 2014; Shi, 2012). In any given gene with multiple poly(A) sites, there is a poly(A) site that is closer to the transcription start site (TSS) and a poly(A) site that is further downstream, which are typically referred to as proximal and distal, respectively. If both poly(A) sites were identical in sequence and cis-elements, this model predicts that the proximal poly(A) site would have the intrinsic advantage. Additionally, processing of the proximal poly(A) site also depends on how fast pol II is elongating, the distance to the next poly(A) site and the length of the gene.
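The dependence on distance and elongation rate in this model can be made concrete with a toy calculation. The sketch below is not taken from any of the cited studies. It assumes that the proximal site is the only site available for the time pol II needs to reach the distal site, and that processing at the proximal site occurs at a constant rate during that window (a simple exponential waiting-time assumption). The function names and the example rate values are hypothetical.

```python
import math


def proximal_head_start(distance_nt, elongation_nt_per_s):
    """Seconds during which only the proximal site exists on the nascent RNA."""
    if distance_nt < 0 or elongation_nt_per_s <= 0:
        raise ValueError("distance must be >= 0 and elongation rate must be > 0")
    return distance_nt / elongation_nt_per_s


def proximal_use_probability(distance_nt, elongation_nt_per_s, processing_rate_per_s):
    """Chance the proximal site is processed before the distal site is transcribed.

    Assumes a constant processing rate, so the waiting time is exponential.
    """
    window = proximal_head_start(distance_nt, elongation_nt_per_s)
    return 1.0 - math.exp(-processing_rate_per_s * window)


# Hypothetical numbers: sites 3000 nt apart, processing rate 0.01 per second.
fast = proximal_use_probability(3000, 50, 0.01)   # pol II at 50 nt/s
slow = proximal_use_probability(3000, 25, 0.01)   # pol II at 25 nt/s
```

Under these assumptions, halving the elongation rate or doubling the distance between sites doubles the head start of the proximal site and therefore raises its predicted use, which is the qualitative behavior the model describes. The sketch treats both sites as equally strong and so shares the limitation of the model itself.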

Experimental evidence supports this theory, as a Drosophila pol II mutant that exhibits slower elongation resulted in use of the proximal poly(A) site rather than the canonical distal poly(A) site on the POLO gene (Pinto et al., 2011). Additionally, in yeast, chemical treatment slowing elongation resulted in increased usage of the proximal poly(A) site on RPB2 (Yu and Volkert, 2013). Pol II elongation can be altered endogenously by chromatin post-translational modifications and DNA modifications, which are obstacles for the transcribing polymerase (Brown et al., 2012). In fact, there is evidence for DNA methylation altering poly(A) site choice. The mouse H13 gene has alternative poly(A) sites with a CpG island located in between. When the CpG island is unmethylated, the proximal poly(A) site is used, whereas when it is methylated, the distal poly(A) site is used (Wood et al., 2008). However, this result does not support the proposed model: the methylated CpG island presumably slows elongation, yet the distal poly(A) site is favored. Unpublished work by the Bentley lab also disagrees with this model, as analysis of poly(A) site usage in human cells expressing a slow pol II mutant revealed poly(A) usage shifts in both directions, proximal-to-distal and distal-to-proximal.

The second aspect of the ‘first come, first served’ model considers the distance between poly(A) sites. In accordance with the model, a longer distance would favor the use of the proximal poly(A) site. This was shown to be true on a reporter plasmid, constructed from the herpes simplex virus type 1 (HSV) tk gene and the AATAAA-segment of the SV40 early gene, where addition of sequence between the two PASs resulted in increased use of the proximal PAS (Denome and Cole, 1988). This was also shown to be true on the IgM gene (Peterson and Perry, 1989) and is discussed in further detail in section 1.3B. Interestingly, Fip1 depletion in embryonic stem cells revealed that mRNAs with increased proximal poly(A) site use have a distance 4-5 times greater between the alternative poly(A) sites than mRNAs that presented increased use of the distal poly(A) site (Lackford et al., 2014). These results support the model that proximal poly(A) sites are intrinsically favored. However, this model does not always hold true, and it does not explain how distal poly(A) sites are often the primary poly(A) sites.

Survival of the Fittest Model

The ‘first come, first served’ model only holds true if both poly(A) sites are treated equally, but evidence suggests that is most often not the case. Genome-wide analyses have revealed that the distal poly(A) site is often stronger than the proximal poly(A) site (Tian et al., 2005). In fact, studies across multiple species suggest that the distal PAS has more canonical features including the conserved AAUAAA hexamer and a DSE (Lackford et al., 2014; Martin et al., 2012; Smibert et al., 2012; Tian and Graber, 2012). This evidence supports a ‘survival of the fittest’ model where a strong PAS has canonical elements that better recruit processing factors, thus increasing the concentration of processing factor at the stronger poly(A) site.

The impact of processing factor concentration on APA has been examined. A formative analysis of IgM alternative poly(A) sites revealed that an increase of CstF64 concentration correlates with increased use of the weaker proximal poly(A) site (Takagaki and Manley, 1998; Takagaki et al., 1996). These results suggest that under normal conditions the stronger poly(A) site is monopolizing processing factor, and oversaturation of processing factor gives advantage to the weaker poly(A) site. In agreement with this model, depletion of CstF64(τ) or Fip1 leads to a general shift to the stronger distal poly(A) site (Lackford et al., 2014; Yao et al., 2012; Yao et al., 2013). However, knockdown of mammalian CFI25 or CFI68 resulted in the opposite effect with a shift to the proximal poly(A) site (Martin et al., 2012). Notably, processing factors bind specific pol II phospho-CTD isoforms and thus the extent of phospho-CTD might additionally regulate APA through the amount of processing factor it is capable of binding in proximity to the poly(A) site.

These results suggest that processing factor concentration can play an important role in poly(A) site choice, and it is a relevant biological mechanism of how APA is regulated during development. For example, upon activation of B-cells, where the strong distal IgM poly(A) site is expressed, there is a corresponding repression of CstF64 (Chuvpilo et al., 1999; Takagaki et al., 1996). This provides less opportunity for the weaker IgM poly(A) site to be used, thus ensuring that the distal poly(A) site is selectively processed.
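The concentration dependence discussed in this section can be illustrated with a simple equilibrium-binding sketch. This is a toy picture and not a model taken from the cited work. It assumes each poly(A) site binds a processing factor independently with a single dissociation constant, with the stronger site having the lower value. The function names and the Kd values are hypothetical and are expressed in arbitrary concentration units.

```python
def occupancy(factor_conc, kd):
    """Fraction of a site bound by factor under simple one-site equilibrium binding."""
    if factor_conc < 0 or kd <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return factor_conc / (factor_conc + kd)


def weak_to_strong_ratio(factor_conc, kd_strong, kd_weak):
    """Occupancy of the weak site relative to the strong site (needs conc > 0)."""
    if factor_conc <= 0:
        raise ValueError("concentration must be > 0 to compare occupancies")
    return occupancy(factor_conc, kd_weak) / occupancy(factor_conc, kd_strong)


# Hypothetical affinities: the strong site binds ten times more tightly.
KD_STRONG, KD_WEAK = 1.0, 10.0
low_factor = weak_to_strong_ratio(1.0, KD_STRONG, KD_WEAK)     # limiting factor
high_factor = weak_to_strong_ratio(100.0, KD_STRONG, KD_WEAK)  # abundant factor
```

In this sketch the weak site is occupied far less than the strong site when factor is limiting, and the two approach equal occupancy as factor becomes abundant. This matches the direction of the observation that raising CstF64 favors the weaker site, although it ignores direct competition between sites and the kinetic head start of the proximal site.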

Agonist/Antagonist Model

The ‘agonist/antagonist’ model suggests poly(A) site use is regulated by protein-RNA interactions that are enhancing or inhibiting (Shi, 2012). This model proposes that factors can bind RNA and recruit other processing factors. Alternatively, a factor could bind PAS elements competitively or bind the RNA in a manner that blocks the RNA processing element. Evidence suggests both types of factors alter poly(A) site use.

It has been shown across species that factors compete with CstF64 for RNA recognition and thus compete with PAS recognition (Alkan et al., 2006; Arhin et al., 2002; Gawande et al., 2006). Conversely, elements have been shown to bind upstream of PAS and help recruit processing factors to suboptimal proximal PAS (Bava et al., 2013). A very intriguing poly(A) site antagonist, and the focus of Chapter 5 in this thesis, is U1 snRNP, which has been shown to have a novel role in preventing the use of cryptic poly(A) sites (Kaida et al., 2010).

The work presented in this thesis integrates aspects from all of these models. I examine how a phospho-isoform of pol II CTD affects poly(A) site choice in Chapter 3. I delve further into this question in Chapter 4 where I examine how poly(A) site processing impacts pol II pausing, phosphorylation of pol II CTD, and how these elements regulate the recruitment of processing factor. Lastly, I attempt to further elucidate the mechanism of how U1 snRNP functions as an antagonist factor by examining whether other snRNPs similarly regulate APA.

1.2 APA Regulation by RNA Polymerase II

3’-end processing of pre-mRNAs is coupled co-transcriptionally with the DNA-dependent polymerase that transcribes the gene (Bentley, 2005). Eukaryotes have 3 multi-subunit enzymes that transcribe DNA, each dedicated to a specific gene set. RNA polymerase I transcribes ribosomal RNA and RNA polymerase III transcribes a range of short genes including tRNA and 5S RNA. RNA polymerase II (pol II) is the enzyme responsible for transcribing protein-coding genes as well as many non-coding genes (Egloff and Murphy, 2008). Pol II is a 12 subunit complex and its largest subunit, Rpb1, has a unique C-terminal domain (CTD) that is comprised of a 52-heptad repeat in mammals whose consensus sequence, YSPTSPS, is identical across species (Bentley, 2005). In yeast, there are over 100 different CTD interacting proteins (Phatnani and Greenleaf, 2004), which supports CTD function as a landing pad for proteins during transcription. In fact, the CTD has been shown to be essential for proper capping, splicing and 3’-end formation (McCracken et al., 1997a; McCracken et al., 1997b). The functional coupling of the CTD to mRNA processing provides direct evidence of co-transcriptional regulation by pol II.

The CTD is able to interact with a vast number of different proteins as it is dynamically modified throughout transcription initiation, elongation and termination. All residues within the CTD heptad can either be isomerized or phosphorylated, providing temporal and spatial coordination of proteins involved in transcription. However, phosphorylation of serine 5 (Ser5P) and serine 2 (Ser2P) are the best-characterized and most conserved marks of transcription (Jeronimo et al., 2013). It is well established that CTD is highly phosphorylated at Ser5 on 5’-ends of genes while CTD is highly phosphorylated at Ser2 on 3’-ends of genes (Komarnitsky et al., 2000), and these modifications correlate with different steps of the transcription cycle as diagrammed in Figure 1-2. Ser2P is a major focus of this thesis and is discussed below in section 1.2B.

As denoted in Figure 1-2, the accumulation of phospho-CTD isoforms on either end of the gene is marked by accumulation of pol II on these regions, which is commonly referred to as a pol II pause. The advancement of genome-wide, high-throughput analysis has provided a better understanding of how pol II is distributed across genes offering further insight into how pol II regulates . The pausing of pol II at 5’-ends of genes is a major rate-limiting step for elongation. However, the role of pol II pausing at 3’-ends remains elusive. It is speculated that the 3’-pause has a similar impact on the transcription cycle as the 5’-pause, but it aids specifically in controlling termination and 3’-end processing. How pol II pausing regulates mRNA processing is discussed below.

1.2A Regulation by RNA Polymerase II Pausing

Promoter proximal pause (PPP) was definitively shown in Drosophila, where pol II cross-linked near the promoter of the HSP70 gene was shown to be transcriptionally engaged, having formed a short nascent RNA chain, but was arrested (Rougvie and Lis, 1988). The arrested pol II was only released upon heat-shock induction. This provided the basis of pol II pausing as a mechanism of transcriptional control. Pol II pause has been reported on other genes (Bentley and Groudine, 1986; Plet et al., 1995; Strobl and Eick, 1992), and has now been shown numerous times genome-wide, including in pol II ChIP-seq experiments presented in this thesis (Figure 3-4).

The mechanism and impact of the 5’ PPP is discussed in detail below. The extensive regulation of the 5’-pause and its impact on mRNA processing provide the basis for the hypothesis that the 3’-pause would similarly regulate mRNA processing. Below, I discuss what is known about the 3’-pol II pause and the evidence linking it to 3’-end processing.

Promoter Proximal Pause

Transcription initiation was once considered the rate-limiting step of transcription. However, it is now known that once transcription is initiated, pol II pauses shortly after the TSS, and in Drosophila, pol II accumulates 20-60 bp downstream of the TSS (Gariglio et al., 1981; Rasmussen and Lis, 1993). This accumulation of pol II is referred to as the promoter proximal pause (PPP) and it is where the flux of pol II from initiation to elongation is regulated.

The PPP is estimated to be the rate-limiting step for greater than 70% of metazoan genes (Liu et al., 2015), and it has been proposed to have numerous co-transcriptional functions. A high concentration of pol II transcriptionally engaged at the TSS allows for rapid induction of genes that are needed in development and in response to stimuli, such as heat shock (Bentley and Groudine, 1986; Core and Lis, 2008; Gilmour and Lis, 1986; Krumm et al., 1995; Levine, 2011; Rahl et al., 2010; Zeitlinger et al., 2007). Additionally, the PPP is marked by a high density of Ser5P that is preferentially bound by numerous factors, including capping factor, providing a functional link between pol II pausing at the 5’-end of genes and capping of the pre-mRNA (Cho et al., 1998; Ho and Shuman, 1999). Lastly, recent findings published by the Bentley lab revealed that decapping and termination factors promote the PPP (Brannan et al., 2012), suggesting a novel role of termination factors at the PPP.

Promoter-proximal paused pol II is phosphorylated by positive transcription elongation factor b (P-TEFb) resulting in release into productive elongation (Marshall and Price, 1995; Zhu et al., 1997). P-TEFb is comprised of cyclin T and cyclin dependent kinase 9 (CDK9) (Peng et al., 1998a; Peng et al., 1998b). CDK9 functions in phosphorylating negative elongation factor (NELF) and DRB sensitivity-inducing factor (DSIF), which are part of the promoter-proximal paused pol II complex (Fujinaga et al., 2004; Wada et al., 1998a; Wada et al., 1998b; Yamada et al., 2006). CDK9 also phosphorylates Ser2-CTD, which is a critical mark for elongating pol II and 3’-end function as discussed below in section 1.2B.

3’-End Pol II Pause

Once pol II is released from the PPP, it functions in transcription elongation and pauses again once it reaches the 3’-end of the gene. Pausing at 3’-ends of genes has been extensively observed downstream of the PAS and is thought to promote transcription termination and 3’-end formation (Eggermont and Proudfoot, 1993; Gromak et al., 2006; West and Proudfoot, 2009). It has been shown that 3’-end pausing of pol II is mediated by CPSF recognition of the AAUAAA hexanucleotide element (Nag et al., 2007). Pol II paused at 3’-ends of genes is characterized by a high concentration of Ser2P-CTD as demonstrated by ChIP experiments (Glover-Cutter et al., 2008; Grosso et al., 2012). Additionally, the Ser2P-CTD isoform interacts with mRNA processing factors that regulate 3’-end formation as discussed in more detail in section 1.2B.

Unlike the PPP, the mechanisms regulating the 3’ pause remain unclear. There has been some work suggesting DNA-encoded pol II elements function in termination (Dye and Proudfoot, 2001; Gromak et al., 2006; et al., 2005; Proudfoot, 1989; Tantravahi et al., 1993; West et al., 2006). Although 3’-end pause is seen near the poly(A) site, it is not known if the poly(A) site induces the pausing of pol II directly or if other DNA elements induce pol II pausing. Work in Chapter 4 of this thesis addresses this question by examining the impact of poly(A) site processing on pol II pausing and coupling of Ser2P.

1.2B Ser2-CTD Phosphorylation and APA

Ser2P-CTD is strongly linked to 3’-end processing as the phospho-isoform is most densely accumulated at 3’-ends where it interacts with numerous processing factors, including Pcf11 and CstF50 (Licatalosi et al., 2002; McCracken et al., 1997b). Interestingly, pol II that is lacking phospho-Ser2-CTD is unable to support efficient 3’-end processing in human cells (Gu et al., 2013). However, the relationship between Ser2P and 3’-end processing in mammals is not well understood. A recent study examining a β-globin reporter gene found that pol II pausing near a single PAS promotes both Ser2P and recruitment of 3’-end processing factors (Davidson et al., 2014). Work presented in this thesis examines the relationship between pol II pausing and Ser2P when there are alternative poly(A) sites. Below, I will discuss the enzymes that control Ser2 phosphorylation, what factors bind to Ser2P, and finally, evidence supporting Ser2P integration with 3’-end processing.

Ser2-CTD Kinases and Phosphatases

Phosphorylation of Ser2-CTD had been solely credited to CDK9, and it is only within recent years that CDK12/CDK13 and BRD4 have been identified as additional Ser2 kinases. A Ser2 kinase was first identified in yeast (Lee and Greenleaf, 1989) and was shown to be a homolog of yeast cyclin dependent serine/threonine kinase Ctk1 (Lee and Greenleaf, 1991). Bur1 was also identified to be a Ser2 kinase in yeast (Keogh et al., 2003; Prelich and Winston, 1993). Ctk1 is considered the major yeast Ser2 kinase (Yao et al., 2000), but the presence of Bur1 results in an additive effect on Ser2 phosphorylation (Qiu et al., 2009). Human Ser2 kinases CDK9 and CDK12 are homologs of yeast kinases Bur1 and Ctk1, respectively (Bowman and Kelly, 2014).

In mammals, P-TEFb was identified to phosphorylate CTD (Marshall et al., 1996), and CDK9 was identified as the kinase component of P-TEFb (Peng et al., 1998a; Peng et al., 1998b). In vitro studies have suggested that CDK9 not only phosphorylates Ser2-CTD but also Ser5- and Ser7-CTD (Czudnochowski et al., 2012; Glover-Cutter et al., 2009; Pinhero et al., 2004; Ramanathan et al., 2001; Ramanathan et al., 1999; Zhou et al., 2000). However, in vivo studies have demonstrated that loss of Ser2 phosphorylation is the main consequence of CDK9 inactivation (Boehm et al., 2003; Bowman et al., 2013; Eissenberg et al., 2007; Ni et al., 2004; Shim et al., 2002). Interestingly, CDK9 is primarily located at 5’-ends of genes even though the majority of Ser2P is seen at 3’-ends (Ghamari et al., 2013; Larochelle et al., 2012; Lin et al., 2011). Work in this thesis also shows CDK9 localization at 5’-ends of genes genome-wide (Figure 3-8A).

Interestingly, CDK12 has been shown in vitro and in vivo to be a Ser2-CTD kinase, and it localizes at 3’-ends of genes in proximity to Ser2P-CTD accumulation (Bartkowiak et al., 2010). Although not tested, it is possible that different Ser2 kinases have mutually exclusive functions such as CDK9 in promoting PPP release and CDK12 in promoting 3’-end formation. Drosophila CDK12 (dCDK12) is the major Ser2-CTD kinase (Bartkowiak et al., 2010), and is homologous to CDK12 and CDK13 in humans (Blazek et al., 2011). However, neither depletion of CDK12 or CDK13 in human cells nor in vitro kinase assays using purified CDK complexes support a role for CDK12 or CDK13 as the major Ser2 kinase (Liang et al., 2015). Additionally, BRD4 has been shown to be an atypical Ser2 kinase with a weakly conserved eukaryotic kinase motif that both binds and phosphorylates Ser2-CTD (Devaiah et al., 2012). Interestingly, inhibition of BRD4 chromatin binding inhibits Ser2 phosphorylation (Devaiah et al., 2012), and loss of BRD4 results in transcription termination and embryonic lethality (Houzelstein et al., 2002; Jang et al., 2005; Yang et al., 2005). The molecular mechanism by which BRD4 promotes transcription in relationship to other Ser2 kinases remains unknown.

Fcp1 (TFIIF-associated CTD phosphatase 1) is the preferred Ser2P phosphatase (Cho et al., 2001; Ghosh et al., 2008; Hausmann and Shuman, 2002), but it is also capable of dephosphorylating Ser5P-CTD (Lin et al., 2002). However, in yeast FCP1 resulted in elevating Ser2P levels only (Bataille et al., 2012). It has also been shown in yeast that Ser2P is dephosphorylated in a step-wise manner, first by Fcp1 and then by phosphatase Ssu72 (Bataille et al., 2012).

There have been few studies focused on factors that are not kinases or phosphatases impacting the level of phosphorylation on CTD isoforms. Notably, Ser2P-CTD ligand FUS has been shown to prevent hyperphosphorylation of Ser2-CTD at 5’-ends of genes (Schwartz et al., 2012). I propose in Chapter 3 that a Ser2P-CTD ligand, Spt6, is able to promote phosphorylation of Ser2 at 3’-ends of genes. Interestingly, a recent study has shown that SPT6 mutation in yeast resulted in a global reduction of Ser2P-CTD as observed by Western blot of whole-cell protein extracts (Dronamraju and Strahl, 2014), but the level of Ser2P on chromatin was not examined.


Phospho-Ser2-CTD Ligands

As pol II elongates, Ser5 is dephosphorylated and Ser2 is phosphorylated, which results in the gene body often being phosphorylated at both marks simultaneously, Ser2P-Ser5P. Numerous factors preferentially bind this double phospho-mark including splicing factors Prp40 (Phatnani and Greenleaf, 2004) and U2AF65 (David et al., 2011), yeast export factor Yra1 (MacKellar and Greenleaf, 2011), H3K36 methyltransferase Set2 (Kizer et al., 2005; Li et al., 2005), RNA-binding factor Ssd1, Hrr25 mitotic kinase (Phatnani and Greenleaf, 2004), and RecQ5 genome stability helicase (Kanagaraj et al., 2010).

The increase in Ser2P during elongation results in a high density of Ser2P at 3’-ends of genes, which is accompanied by recruitment of histone modifying, splicing, elongation, termination, transport and RNA-processing factors that preferentially bind Ser2P-CTD. Spt6 is a transcription elongation factor that has been shown in vitro to preferentially bind Ser2P (Yoh et al., 2007). There is some conflicting evidence in the literature as to whether Ser2P is the preferential ligand of Spt6, but the work presented in this thesis supports the results that it is a Ser2P-CTD ligand (Chapter 3). Additionally, another elongation factor, Npl3, has been shown to preferentially bind Ser2P (Dermody et al., 2008) as well as termination factors Pcf11 (Licatalosi et al., 2002) and Rtt103 (Kim et al., 2004b). Interestingly, Pcf11 and Rtt103 have been shown to bind cooperatively to neighboring Ser2P repeats (Lunde et al., 2010).


Integration of Phospho-Ser2-CTD with 3’-End Processing

Seminal experiments from the Bentley lab established the coupling of CTD with efficient mRNA processing (Fong and Bentley, 2001; Licatalosi et al., 2002; McCracken et al., 1997b). Additionally, Ser2P-CTD was shown to be particularly important in 3’-end processing, as loss of the Ser2 kinase Ctk1 in yeast cells resulted in disruption of 3’-end processing factor recruitment and aberrant 3’-end processing (Ahn et al., 2004). In human cells, endogenous CTD was replaced with S2A CTD, where the Ser2 had been replaced with alanine in all the heptads, resulting in inefficient cleavage/polyadenylation at 3’-ends (Gu et al., 2013).

A major focus of this thesis is to further characterize the relationship between Ser2-CTD phosphorylation and 3’-end processing in mammalian cells.

As summarized above, there is evidence to support the role of Ser2P in promoting 3’-end processing. However, there is little known about how processing events might promote phosphorylation of the CTD.

1.3 Integration of APA and Splicing

In addition to being processed at their 3’-ends, mRNAs are spliced, which in contrast to cleavage/polyadenylation, occurs at precise nucleotide positions that define exons and introns. In metazoans, exons are typically short, ~50-250 bp, and introns are much larger, up to 1,000 kb (Matera and Wang, 2014). Due to the large size of introns, there are numerous cis-elements within introns, including poly(A) sites that are removed upon splicing. For this reason, splicing can regulate 3’-end processing as a retained intronic poly(A) site provides an opportunity to be processed.

Interestingly, there are multiple ways in which poly(A) site choice and splicing are connected. First, there are numerous examples of both genetic and biochemical interactions between splicing and cleavage/polyadenylation factors within the cell. Second, there is a known balance between splicing efficiency and the strength of the poly(A) site, which compete for expression. Lastly, there are numerous examples of how the splicing factor U1 snRNP functions in polyadenylation outside of its canonical role in splicing. All of these examples are discussed in detail below after I review the basic mechanism of splicing.

1.3A Mechanism of Splicing

Splicing involves the removal of an intron followed by ligation of the adjacent exons in a highly dynamic two-step reaction. The spliceosome is the molecular machine that carries out splicing (Lerner et al., 1980) and is comprised of 5 different small nuclear ribonucleoproteins (snRNPs) and many associated co-factors (Jurica and Moore, 2003). There are two different classes of spliceosome, a major and a minor. However, the minor spliceosome is much less abundant, representing less than 1% of spliceosomes in the cell (Hall and Padgett, 1996; Patel and Steitz, 2003; Tarn and Steitz, 1996), and the study presented in this thesis focuses on splicing by the major spliceosome.

Major Spliceosome

The major spliceosome is comprised of U1, U2, U4, U6, and U5 snRNPs, which, aided by other factors, recognize specific sequences that define introns and exons. The accuracy and efficiency of splicing are aided by cis-elements including the 5’ and 3’ splice sites (ss), branch site, polypyrimidine tract, as well as splicing enhancers and silencers (Figure 1-3A) (Shefer et al., 2014). How the spliceosome components recognize the key cis-elements and carry out the splicing reaction is very precise, as an imprecise splicing mechanism could result in spliced-mRNA frame shifts, ultimately producing a different translated protein product.

Step-wise assembly of the spliceosome facilitates the splicing reaction (Brow, 2002; Matera and Wang, 2014; Shefer et al., 2014) and is diagrammed in Figure 1-3. The reaction begins with U1 snRNP recognizing the 5’ ss (GU) by base pairing of U1 snRNA with the pre-mRNA (Reed, 1996). Next, the 3’ ss (AG), polypyrimidine tract (Py), and branch site (yUnAy) are recognized and bound by U2AF35 (Wu et al., 1999), U2AF65 (Zamore et al., 1992), and SF1/mBBP (Liu et al., 2001), respectively. The binding of these spliceosome components forms the E complex, bridging the intron and bringing the 5’ and 3’ splice sites together (Figure 1-3B). In an ATP-dependent reaction, U2 snRNP displaces SF1/mBBP to bind the branch site via direct base pairing of U2 snRNA, forming complex A (Figure 1-3C) (Gozani et al., 1996; Query et al., 1996; Will et al., 2001). Binding of the pre-formed U4/U6.U5 tri-snRNP forms the pre-catalytic B complex. Next, U1 and U4 snRNPs are destabilized and ejected, forming the active B* complex (Figure 1-3D). The B* complex catalyzes the nucleophilic attack of the branch site adenosine 2’ OH on the 5’ ss guanine. The free 3’ OH of exon 1 then attacks the conserved guanine at the 3’ ss (Figure 1-3E). Finally, the catalytically competent complex C is formed, resulting in liberation of the intron lariat and joining of the spliced exons (Figure 1-3F).
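
The consensus cis-elements named in this pathway can be illustrated with a short sketch. The example below is for illustration only and is not part of the analyses performed in this thesis. It assumes an intron sequence supplied as a string, and the function name describe_intron and the example sequence are hypothetical. It checks for the 5’ ss GU and 3’ ss AG dinucleotides and lists positions that match the branch site consensus yUnAy, where y is taken to be a pyrimidine (C or U) and n any nucleotide.

```python
import re

# Branch site consensus yUnAy as written in the text:
# y = pyrimidine (C or U), n = any nucleotide.
# The lookahead lets overlapping candidate sites be reported.
BRANCH_SITE = re.compile(r"(?=([CU]U[ACGU]A[CU]))")


def describe_intron(intron: str) -> dict:
    """Report consensus splicing cis-elements in a single intron sequence.

    Hypothetical helper for illustration; DNA input (T) is converted to RNA (U).
    Positions are 0-based offsets from the first intron nucleotide.
    """
    seq = intron.upper().replace("T", "U")
    return {
        "has_5ss_GU": seq.startswith("GU"),  # 5' splice site dinucleotide
        "has_3ss_AG": seq.endswith("AG"),    # 3' splice site dinucleotide
        "branch_site_positions": [m.start() for m in BRANCH_SITE.finditer(seq)],
    }


if __name__ == "__main__":
    # Made-up intron: 5' ss, spacer, branch site, polypyrimidine tract, 3' ss.
    example = "GUAAGU" + "CCC" + "CUAAC" + "UUUUUUUU" + "CAG"
    print(describe_intron(example))
```

A consensus match of this kind only marks candidate elements; as described in this section, recognition in the cell depends on U1 snRNA base pairing, U2AF, SF1/mBBP and auxiliary factors rather than on sequence alone.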

Even though splicing occurs at precise nucleotide positions directed by conserved cis-elements, auxiliary-splicing cis-elements can either enhance or inhibit exon-intron recognition (Zhang et al., 2008). These splicing enhancers/silencers found in both exons and introns were once thought to only play a role in alternative splicing but are now considered to be necessary for exon recognition (Han et al., 2011). Serine/arginine-rich (SR) proteins bind enhancing elements while hnRNP proteins recognize silencing elements.

The work in this thesis investigates how splicing and splicing factors regulate APA. A weak 5’ ss that diminishes splicing but promotes processing at an intronic poly(A) site is further characterized in Chapter 4. Additionally, Chapter 5 investigates novel functions of the major spliceosome snRNPs in preventing early alternative poly(A) site choice.

1.3B Splicing and Cleavage/Polyadenylation

As discussed above in section 1.2, pol II CTD plays a critical role in coupling mRNA processing events to transcription and potentially to each other. For this reason, it is conceivable that splicing and cleavage/polyadenylation events are connected. In fact, coupling cleavage/polyadenylation and splicing events was initially suggested as a mechanism for defining the 3’ terminal exon (Niwa et al., 1990). A mechanistic relationship between splicing and polyadenylation remains elusive but the evidence supporting a functional link is discussed below.

Interaction of Splicing and Cleavage/Polyadenylation Factors

The revelation that pol II lacking the CTD does not support splicing or 3’-end processing (McCracken et al., 1997b) has led to further analysis of how CTD connects these two processing events. It has been shown that transcription initiation causes splicing factors to localize on active genes (Listerman et al., 2006; Neugebauer and Roth, 1997; Zeng et al., 1997), and this localization of splicing factors is lost when pol II is lacking the CTD (Misteli and Spector, 1999). Pol II has been extensively shown to indirectly interact with splicing factors including PSF, SCAFs, SR proteins, U2AF, and U1, U4, U6 snRNPs (Emili et al., 2002; Kameoka et al., 2004; Kim et al., 1997; Kwek et al., 2002; Patturajan et al., 1998; Robert et al., 2002; Yuryev et al., 1996). The first splicing factor shown to directly bind pol II CTD was Prp40 in yeast (Morris and Greenleaf, 2000). Interestingly, a proteomic analysis of a human pol II co-immunoprecipitation revealed that SR proteins and U1 snRNP components are highly enriched on pol II but no other snRNPs or splicing factors were identified (Das et al., 2007). However, U2AF65 was shown to directly bind phosphorylated CTD (David et al., 2011), and a S2A CTD mutant failed to recruit U2AF65 resulting in splicing and 3’-end defects in human cells (Gu et al., 2013). Although the full extent to which splicing factors and pol II interact is unclear, it is evident that pol II provides a functional link between splicing and cleavage/polyadenylation factors.


There is further evidence that splicing and cleavage/polyadenylation factors interact directly with each other. U2 auxiliary factor 65 (U2AF65) has been shown to have direct molecular contacts with poly(A) polymerase (PAP) (Vagner et al., 2000), and it interacts with CFI (Millevoi et al., 2006). Additionally, U2 snRNP interacts with CPSF (Kyburz et al., 2006). U1 snRNP has also been shown to interact with CPSF160 (Lutz et al., 1996), and the full extent of U1 snRNP’s non-canonical functions is discussed below. These results provide evidence that splicing and cleavage/polyadenylation are both physically and functionally connected, but the full extent of this coupling is not understood.

IgM Gene: A Balance of Splicing and Polyadenylation

In addition to the reasons discussed above, splicing and APA are intrinsically connected in instances where a poly(A) site is located within an intron. This is the case for APA in the immunoglobulin heavy chain (IgM) genes, and the IgM (μ) isotype is the reporter gene used for the study presented in Chapter 4. IgM was the first cellular gene shown to produce two different, but related, mRNAs through alternative processing (Early et al., 1980). IgM has two poly(A) sites that are alternatively regulated in B-cell maturation. The distal 3’ UTR poly(A) site is preferentially used in B-cells and results in production of membrane-bound (μM) antibody (Figure 1-4A). The proximal poly(A) site is used in plasma cells and is located within an intron, resulting in production of secreted (μS) antibody.

Systematic studies have identified key regulatory properties on the IgM gene. The intron size, but not the intron sequence, surrounding the proximal PAS impacts the regulation between poly(A) sites. A smaller intron results in a higher efficiency of splicing as measured by more complete splicing resulting in total use of the distal poly(A) site and no use of the proximal-intronic poly(A) site (Figure 1-4B) (Galli et al., 1987; Peterson and Perry, 1986; Tsurushita and Korn, 1987). Conversely, if the intron is made longer, splicing is less efficient as only the proximal-intronic poly(A) site is utilized (Figure 1-4C) (Peterson and Perry, 1986; Tsurushita and Korn, 1987). It is possible that the extra time it takes pol II to transcribe a longer intron is what favors processing of the poly(A) site, and vice-versa, in agreement with the ‘first come, first served’ model (section 1.1B). Notably, the size of the exon upstream of the intronic poly(A) site is also critical. When the upstream exon is made smaller, splicing is more efficient, resulting in use of the distal poly(A) site (Peterson et al., 1994). This result supports the exon-definition model, which suggests that size impacts exon-recognition thus regulating splicing efficiency (Sterner et al., 1996).

The IgM proximal-intronic poly(A) site is weak and is coupled with a weak 5’ ss. When the 5’ ss is strengthened to a consensus sequence and expressed in plasma cells, the cell-type specific poly(A) site regulation is altered resulting in total use of the distal poly(A) site (Figure 1-4D) (Peterson and Perry, 1989). Correspondingly, if the proximal-intronic poly(A) site is strengthened and expressed in B-cells, only the proximal poly(A) site is used (Figure 1-4E) (Peterson and Perry, 1989). When both the 5’ ss and the intronic poly(A) site are strengthened on one gene, the regulation between the poly(A) sites is restored in both B-cells and plasma cells (Peterson et al., 2002).

The characterization of the IgM gene has provided evidence that the efficiency of splicing and cleavage/polyadenylation influence APA. Interestingly, global analysis has identified weak 5’ ss and long introns as key elements governing intronic poly(A) sites near 3’ ends of genes (Tian et al., 2007). A balance of efficiencies model suggests that an intronic poly(A) site is waiting for the opportunity to be used when factors are balanced in its favor and is incorporated within the survival of the fittest model of APA regulation (section 1.1B).

U1 snRNP and Polyadenylation

When the spliceosome forms, the snRNPs come together in 1:1 stoichiometry (Wahl et al., 2009). However, there are approximately one million copies of U1 snRNP per HeLa cell while other snRNPs have less than half as many copies (Baserga, 1993). This disparity in snRNP concentration has prompted numerous investigations into a potential function for the cellular abundance of U1 snRNP. The Dreyfuss lab depleted functional U1 snRNP within HeLa cells using an antisense morpholino targeted to the 5’-end of U1 snRNA, which is responsible for base pairing with the pre-mRNA 5’ ss (Kaida et al., 2010). Upon U1 snRNP depletion, there was an accumulation of prematurely cleaved and polyadenylated transcripts via use of premature cryptic poly(A) sites (PCPAs), which are characterized as being near the 5’-end of genes and located within introns. Interestingly, depletion of functional U2 snRNP or inhibition of splicing through spliceostatin A (SSA) treatment did not result in use of PCPAs (Kaida et al., 2010). It has been hypothesized that U1 snRNP functions outside of its role in splicing to protect premature poly(A) sites by directly binding mRNA and blocking use of the PCPA, supporting an antagonist model of APA regulation (section 1.1B).

Additionally, U1 snRNP directly interacts with poly(A) polymerase (PAP), inhibiting polyadenylation of pre-mRNAs (Gunderson et al., 1998; Gunderson et al., 1997). U1 snRNP has also been implicated in regulating promoter directionality in coordination with polyadenylation signals (Almada et al., 2013). The Sharp lab showed that there is a U1-PAS axis near the TSS where, in the antisense direction, there is a high number of PAS and a low number of U1-binding sites, and in the sense direction, the opposite is observed. This suggests an evolutionary mechanism where U1-binding sites are present to prevent poly(A) site use, potentially through binding of U1 snRNP. However, the precise mechanism by which U1 snRNP regulates poly(A) sites is unclear.

1.4 Specific Questions

It is widely accepted that mRNA processing is tightly coupled to the transcribing polymerase via the CTD. However, the extent to which transcription and mRNA processing events regulate one another remains unclear. The specific questions addressed in this thesis include the following: 1) Can a Ser2-CTD ligand promote phosphorylation of a CTD isoform? 2) Does attenuation of Ser2-CTD phosphorylation at 3’-ends regulate poly(A) site choice? 3) Is processing of alternative poly(A) sites coupled to Ser2-CTD phosphorylation? 4) How does alternative poly(A) site use affect pol II pausing and recruitment of processing factor? 5) Does U1 snRNP have a unique role in preventing intronic poly(A) site use or do other snRNPs have a similar function? 6) Is splicing the primary regulator of processing at intronic poly(A) sites?


Figure 1-1: Diagram of pre-mRNA poly(A) site cis-elements and corresponding factors. Genomic DNA is shown in black and nascent transcribed pre-mRNA is in red. The poly(A) site is defined by key cis-elements including the AAUAAA hexamer, followed by the CA dinucleotide cleavage site, U/GU-rich DSE and the UGUA motif in the U-rich USE. Cleavage/polyadenylation factors CPSF, CSTF and CFI bind the pre-mRNA cis-elements resulting in an endonucleolytic cleavage (scissors) of the RNA followed by polyadenylation at the 3’ OH and degradation of the 5’ phosphate.
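
The arrangement of cis-elements in this diagram can be sketched in code. The example below is for illustration only and is not a method used in this thesis. The function name find_candidate_pas, the max_gap window of 30 nt and the test sequence are all assumptions made for the example. It reports each AAUAAA hexamer that is followed, within the assumed window, by a CA dinucleotide.

```python
import re

HEXAMER = "AAUAAA"  # hexamer named in the figure legend


def find_candidate_pas(rna: str, max_gap: int = 30) -> list:
    """Return (hexamer_start, ca_start) pairs, using 0-based positions.

    Hypothetical helper: max_gap is an assumed search window (in nt)
    downstream of the hexamer, not a value taken from this thesis.
    DNA input (T) is converted to RNA (U).
    """
    seq = rna.upper().replace("T", "U")
    hits = []
    # Lookahead so that overlapping hexamers would each be reported.
    for match in re.finditer(f"(?={HEXAMER})", seq):
        window_start = match.start() + len(HEXAMER)
        ca_start = seq.find("CA", window_start, window_start + max_gap)
        if ca_start != -1:
            hits.append((match.start(), ca_start))
    return hits


if __name__ == "__main__":
    # Made-up sequence: hexamer, 10 nt spacer, CA cleavage site.
    example = "GGG" + "AAUAAA" + "G" * 10 + "CAUUU"
    print(find_candidate_pas(example))
```

The sketch ignores the U/GU-rich DSE and the UGUA motif in the USE, so it would overcall sites relative to any scheme that also scores those elements.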


Figure 1-2: Density of total pol II, Ser2P- and Ser5P-CTD on a typical mammalian gene. A pol II pause is defined as an accumulation of pol II on chromatin and is marked by yellow pause-boxes. Pol II pauses shortly after transcription initiation and is marked by high density of Ser5P-CTD. During elongation, Ser5 is dephosphorylated as Ser2 is phosphorylated. Pol II pauses again 1-2 kb after the poly(A) site marked by a high density of Ser2P-CTD.


Figure 1-3: Schematic of the step-wise major spliceosome reaction. (A) The key cis-elements are diagrammed on the pre-mRNA. Exons are represented by black boxes and introns by black lines. (B) The E complex is formed by U1 snRNP recognition of the 5’ ss and recognition of the branch site, polypyrimidine tract, and 3’ ss by U2AF components. (C) U2 snRNP displaces SF1/mBBP to directly bind the branch site. (D) U4/U6.U5 tri-snRNP binds U1 snRNP forming the pre-catalytic B complex. (E) U4 and U1 snRNPs are ejected from the complex forming the active B* complex resulting in the two-step splicing reaction. (F) The intron lariat is removed, the two exons are joined, and snRNPs are recycled.


Figure 1-4: Diagram of poly(A) site use on the IgM gene and relevant mutants. (A) WT IgM is shown with exons in grey boxes and introns as black lines. Yellow diamonds indicate poly(A) sites. The proximal poly(A) site located between Cμ4-M1 is primarily used in plasma cells. The distal poly(A) site is downstream of M2 and is primarily used in B-cells. (B) When the Cμ4-M1 intron is shortened, the distal poly(A) site is favored. (C) The proximal poly(A) site is favored when the Cμ4-M1 intron is lengthened. (D) Strengthening the 5’ ss (red asterisk and strong arm) to a strong consensus sequence favors splicing of the Cμ4-M1 intron and use of the distal poly(A) site. (E) A strong proximal poly(A) site (strong arm) favors processing over the distal poly(A) site and competes with splicing of the Cμ4-M1 intron.


CHAPTER II

MATERIALS AND METHODS

2.1 ChIP

ChIPs were performed as previously described (Kim et al., 2011). Cells were washed with PBS and cross-linked with 1% formaldehyde in PBS for 10 minutes and quenched with 125 mM for 10 minutes at room temperature.

Cells were washed twice with cold PBS and resuspended in RIPA buffer (150 mM NaCl, 1% Nonidet P-40, 0.5% deoxycholate, 0.1% SDS, 50 mM Tris pH 8.0, 5 mM EDTA) supplemented with 50 mM NaF, 0.2 mM NaVO4, 55 mM Na butyrate, and 0.5 mM PMSF. Chromatin was then sheared into ~250 bp fragments using a Bioruptor at 4°C on high output for 45 minutes in 30 second on/off intervals.

For IPs, extract (1 ml) at 1 μg/ml in RIPA buffer was pre-cleared at 4°C for one hour with 30 μl of protein A-Sepharose. Input DNA was collected at this point.

Antibodies were added to 30 μl of pre-blocked (1 mg/ml BSA and 0.3 mg/ml salmon sperm DNA) protein A-Sepharose and incubated overnight at 4°C on a nutator. The beads were washed twice with 1 ml RIPA buffer, four times with 1 ml of ChIP wash buffer (100 mM Tris-HCl pH 8.5, 500 mM LiCl, 1% Nonidet P-40, 1% deoxycholic acid), and twice with 1 ml 1X TE (Tris-EDTA); each wash was nutated for 5 minutes. The immunoprecipitated protein-chromatin complexes were reverse cross-linked overnight at 65°C in 70 mM Tris-HCl pH 8.0, 1 mM EDTA, 1.5% SDS and 200 mM NaCl. Reverse cross-linked samples were treated with proteinase K and phenol/chloroform extracted. One μg of immunoprecipitated extract was dissolved in 45 μl of water.

2.2 ChIP-seq

Illumina libraries were prepared as previously described (Brannan et al., 2012) on SPRI beads (Beckman AMPure XP SPRI). Phosphates were added to the ends of the DNA molecules and overhangs were filled in for addition of A-bases to the ends of the DNA fragment followed by adapter ligation. The DNA fragments were then PCR enriched and tagged with a unique Illumina barcode.

Libraries in Chapter III (Figure 3-1) were sequenced on the Illumina Genome Analyzer IIx and single-end 34 base reads (after removing barcodes) were mapped to hg19 UCSC human genome (Feb, 2009). All other libraries in Chapter III and Chapter IV were sequenced on the Illumina Hi-Seq platform. Single-end 50 base reads were mapped to mm10 UCSC mouse genome (Dec, 2011) in Chapter IV, to a custom genome corresponding to the wild type IgM SVneo plasmid pR-SP6 (Ochi et al., 1983) in Chapter IV, or to hg19 UCSC human genome (Feb, 2009) in Chapter III. Mapping was done using Bowtie version 0.12.5 (Langmead et al., 2009). BED and WIG files were generated using 50 bp bins and 200 bp windows assuming a 180 bp fragment size shifting effect. Libraries were normalized by total number of aligned reads (RPBM, reads per bin per million bases) and results were viewed with the UCSC genome browser.


2.3 Antibodies

Rabbit anti-pan CTD against pol II (Schroeder et al., 2000), rabbit anti-H3K36me3 (Kim et al., 2011), rabbit anti-AFF4 (Lin et al., 2010), rat monoclonal anti-Ser7P-CTD (4E12) (Chapman et al., 2007), rat monoclonal anti-Tyr1P-CTD (3D12) (Mayer et al., 2012), rabbit polyclonal anti-Ser2P-CTD (Glover-Cutter et al., 2008), and rabbit anti-phospho-Cdk9 (Thr186) (Larochelle et al., 2012) were all previously described. Rat monoclonal anti-Ser2P-CTD (3E10, Chromotek), rat monoclonal anti-Ser5P-CTD (3E8, Chromotek) and other rat monoclonals were used with rabbit anti-rat IgG (Jackson ImmunoResearch) for IPs. Polyclonal anti-Spt6 was raised in rabbits against a GST fusion to human Spt6 (amino acids 478-780) and the serum was used for IPs and Western blot. Anti-CstF77 was raised in rabbits against a GST fusion to the C-terminus of human CstF77 (amino acids 539-717) and was affinity purified.

2.4 Mini-Gene Transfection

NR3C1 mini-gene was constructed as previously described (Kaida et al., 2010). pCDNA3-NR3C1 construct was transfected into HeLa cells simultaneously with ASOs using Lipofectamine 2000 (Thermo Fisher Scientific) following the ASO transfection protocol. Total RNA was collected using Zymo RNA mini-prep kits and was DNased at 37°C with DNase I, Xho I, 1X BSA and 0.5 mM CaCl2 to thoroughly degrade transfected plasmid. cDNA was then generated and used for 3’ RACE as described.


2.5 RNA Extraction

Total RNA was extracted from tissue culture cells using commercial RNA mini-prep kits from Qiagen and Zymo. Samples were DNased on the column.

2.6 Poly(A)+ RNA-seq

Poly(A)+ RNA was purified from 30-100 μg of total RNA on BioMag Oligo (dT)20 magnetic beads. Poly(A)+ RNA was fragmented using Ambion Fragmentation Reagents and then EtOH precipitated. cDNA was synthesized using a reverse oligo(dT) primer that is designed for circularization of the single-stranded cDNA. ExoI was used to eliminate excess primer after the cDNA reaction, and single-stranded cDNA was purified on a Qiagen column. The cDNA was then circularized using CircLigase II (Epicentre) and purified on a Qiagen column. Next, the library was PCR amplified using Phusion (NEB) incorporating a unique Illumina barcode. Libraries in Chapter III and Chapter V were sequenced on the Illumina Hi-Seq platform. Hyunmin Kim mapped single-end 50 bp reads to hg19 UCSC human genome (Feb, 2009).

2.7 Cell Lines and Growth Conditions

HEK293-Flp-in T-REX-glob cells (Chapter III) stably containing pINDUCER10-Spt6 shRNA were grown in DMEM medium supplemented with 10% fetal bovine serum, 1% pen/strep, and 1 μg/ml puromycin at 37°C and 5% CO2. HEK293-Flp-in T-REX-glob cells contain a stably integrated, hygromycin-resistant gene and CMV β-globin reporter that is not relevant to these studies.

M12 and S194 cell lines (Chapter IV) stably transfected with WT Ig Cμ gene (Peterson and Perry, 1989) were grown in DMEM medium supplemented with 10% heat-inactivated fetal calf serum, 1% pen/strep at 37°C and 5% CO2. HeLa cells (Chapter V) were grown in DMEM medium supplemented with 10% fetal bovine serum, 1% pen/strep at 37°C and 5% CO2.

2.8 Protein Extractions and Immunoblotting

Total protein from monolayer mammalian cells was extracted using M-PER reagent (Thermo Scientific) with protease inhibitors. Protein extracts were separated by SDS-PAGE (Bio-Rad) and transferred to PVDF membranes via a semi-dry blotting apparatus (Bio-Rad) at 0.2 amps for 30 minutes. Membranes were blocked for at least one hour at 25°C in either 5% non-fat dry milk in PBST or 5% BSA in PBST for non-phospho and phospho-specific antibodies, respectively. Blocked membranes were probed with rabbit polyclonal or mouse monoclonal antibodies diluted in PBST + 0.02% sodium azide overnight at 4°C.

Membranes were washed three times for 15 minutes with PBST at 25°C, incubated with either rabbit anti-mouse Ig-HRP (for monoclonals) or swine anti-rabbit Ig-HRP (for polyclonals) in 5% non-fat dry milk in PBST for no longer than 30 minutes at 25°C, and visualized by enhanced chemiluminescence.

2.9 3’RACE

cDNA was synthesized using Protoscript II reverse transcriptase (NEB) with the dT18-XbaKpnBam primer for 3’ RACE (Kaida et al., 2010). 3’ RACE was carried out using nested forward primers and XbaKpnBam reverse primer as previously described (Kaida et al., 2010).


2.10 Real-Time PCR

PCR reactions (10 μl) were performed with SybrGreen using the Roche LC-480 (Roche Applied Science) in 384 well plates. Denaturation was 5 minutes at 95°C followed by 50 cycles of 15 seconds at 95°C, 20 seconds at 60°C, and 20 seconds at 72°C. Primer sets were designed using Roche Light Cycler Probe Design version 2.0 with Tms between 55°C and 62°C and product sizes between 50 and 125 base pairs. After each run, all primer pairs were checked to have efficiency between 1.8 and 2.2 and an error less than 0.01.
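As a rough illustration of the primer efficiency check, the sketch below derives amplification efficiency from the slope of a standard curve (Cp versus log10 of template dilution) using the common relation E = 10^(-1/slope), where E = 2 corresponds to perfect doubling per cycle. The dilution series, the Cp values, and the function name are hypothetical and chosen only for illustration; in practice these values are reported by the instrument software.

```python
def amplification_efficiency(log10_dilutions, cp_values):
    """Fit Cp against log10(dilution) by least squares and convert the
    slope to an amplification efficiency, E = 10 ** (-1 / slope)."""
    n = len(log10_dilutions)
    mean_x = sum(log10_dilutions) / n
    mean_y = sum(cp_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_dilutions)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_dilutions, cp_values))
    slope = sxy / sxx
    return 10 ** (-1.0 / slope)


# Hypothetical 10-fold dilution series (illustrative numbers only).
log10_dilutions = [0, -1, -2, -3]
cp_values = [18.0, 21.4, 24.8, 28.2]

efficiency = amplification_efficiency(log10_dilutions, cp_values)
# Acceptance window taken from the text: efficiency between 1.8 and 2.2.
primer_pair_passes = 1.8 <= efficiency <= 2.2
```

A primer pair whose efficiency falls outside the 1.8 to 2.2 window would be redesigned rather than corrected for.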

2.11 ASO Transfection

HeLa cells were split at 5 × 10^5 cells/ml in 5 ml of DMEM complete medium (10% FBS, 1% pen/strep) and allowed to adhere overnight. 100 nM of ASO was gently mixed with 25 μg Lipofectamine 2000 (Thermo Fisher Scientific) in 500 μl Opti-MEM and incubated at room temperature for 20 minutes before being added dropwise directly to HeLa cells. After four hours, the transfection mixture was aspirated from cells, and fresh media was added. Twenty-four hours after initial transfection, the cells were harvested for downstream assays.

2.12 shRNA-Mediated Spt6 Knockdown

shRNA sequence targeting Spt6 was subcloned into pInducer10 (Meerbrey et al., 2011) and 293 Flp-In cells were transduced with pINDUCER10-Spt6. The transduced cells were stably maintained in 1 μg/ml of puromycin. Cells were treated with 1 μg/ml of doxycycline for 72 hours to induce shRNA expression before being harvested for downstream assays. Knockdown was confirmed by Western blot (Figure 3-2).

2.13 ChIP-seq Pipeline

The sequencing core uses the unique barcodes to parse the sequenced ChIP data generating FastQ files. The FastQ files are then taken through the Bentley Lab pipeline, and all FastQ files and respective information have been filed in the Twiki database. The ChIP-seq pipeline is detailed in Figure 2-1. First, the short DNA reads are aligned to the respective genome using bowtie. From a bowtie file, the data is processed into a browser extensible data (BED) format. The BED file is converted into a bedGraph file using a default bin size of 50 bp, a fragment size of 200 bp centered from the 5’-end of the read, and a window size of 200 bp, and the data is normalized by reads per bin per million bases (rpbm). Lastly, the data is formatted into a bigWig file, which is an indexed binary format for dense, continuous data that can be displayed on the UCSC genome browser. A bigWig file can be uploaded directly on the genome browser for visualization, but bigWig files are typically visualized on the genome browser by unique URLs, which is convenient for storing and sharing files.
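The BED-to-bedGraph step can be sketched roughly as follows. This is a simplified illustration, not the Bentley Lab pipeline itself: it assumes each read is extended to a 200 bp fragment from its 5’-end in the direction of its strand, counts fragments overlapping each 50 bp bin, and scales counts to one million aligned reads. The 200 bp window smoothing is omitted, and the function names and example reads are hypothetical.

```python
from collections import defaultdict

BIN_SIZE = 50        # bp, bin size given in the text
FRAGMENT_SIZE = 200  # bp, fragment size given in the text


def binned_signal(reads, bin_size=BIN_SIZE, fragment_size=FRAGMENT_SIZE):
    """reads: iterable of (chrom, five_prime_pos, strand), 0-based positions.

    Returns {(chrom, bin_start): fragments per bin per million reads}.
    The exact normalization used in the thesis is an assumption here.
    """
    counts = defaultdict(int)
    total = 0
    for chrom, pos, strand in reads:
        total += 1
        # Extend the read from its 5' end to the assumed fragment length.
        if strand == "+":
            start, end = pos, pos + fragment_size
        else:
            start, end = max(0, pos - fragment_size + 1), pos + 1
        # Credit every bin that the half-open interval [start, end) overlaps.
        for b in range(start // bin_size, (end - 1) // bin_size + 1):
            counts[(chrom, b * bin_size)] += 1
    scale = 1e6 / total if total else 0.0
    return {key: n * scale for key, n in counts.items()}


def to_bedgraph_lines(signal, bin_size=BIN_SIZE):
    """Format the binned signal as tab-separated bedGraph lines."""
    return [f"{chrom}\t{start}\t{start + bin_size}\t{value:.2f}"
            for (chrom, start), value in sorted(signal.items())]


# Hypothetical reads for illustration only.
reads = [("chr8", 1000, "+"), ("chr8", 1020, "+"),
         ("chr8", 1399, "-"), ("chr8", 5000, "+")]
signal = binned_signal(reads)
bedgraph = to_bedgraph_lines(signal)
```

Scaling by total aligned reads is what makes tracks from libraries of different sequencing depth comparable on the genome browser.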


Figure 2-1: ChIP-seq pipeline flowchart. A parsed fastq file is aligned to the hg19, or respective, genome using bowtie. The data is then processed into a BED file, a bedGraph file and a bigWig file that is ultimately visualized on the UCSC genome browser. A unique URL is assigned to bigWig files to easily load, manage and share.


CHAPTER III

THE ROLE OF SPT6 IN REGULATING TRANSCRIPTION, SER2-CTD

PHOSPHORYLATION AND MRNA PROCESSING

3.1 Introduction

Pre-mRNA processing in eukaryotic cells is regulated co-transcriptionally through the C-terminal domain (CTD) of RNA polymerase II (pol II). The CTD is an evolutionarily conserved unstructured domain consisting of multiple heptad repeats, YSPTSPS, which can be dynamically modified throughout the transcription cycle. The phosphorylation of serine 5 (Ser5P) and serine 2 (Ser2P) are perhaps the best characterized modifications, but tyrosine 1 (Tyr1), threonine 4 (Thr4), and serine 7 (Ser7) can also be phosphorylated (Heidemann et al., 2013). Ser5P marks the promoter proximal pol II pause, and when pol II is signaled into elongation by the Super Elongation Complex (SEC) and association of P-TEFb, which contains the Ser2 kinase CDK9, Ser5 is dephosphorylated while Ser2 is phosphorylated. Ser2P is densely accumulated at 3’-ends of genes in association with the terminal pol II pause. In addition to marking specific stages of the transcription cycle, phospho-CTD isoforms recruit specific factors that regulate mRNA processing. One such factor that preferentially binds phospho-CTD is Spt6.

Spt6 is a conserved and essential nuclear protein (Clark-Adams and Winston, 1987; Swanson et al., 1990) first identified in yeast as one of seven genes that affect the phenotype of HIS- caused by the insertion of a Ty δ element at the 5’ end of HIS4 (Winston et al., 1984). Spt6 is a histone chaperone of histones H3 and H4 (Bortvin and Winston, 1996) that directly binds nucleosomes (McDonald et al., 2010), and genetically interacts with H2A and H2B (Compagnone-Post and Osley, 1996; Swanson et al., 1990).

Formative experiments revealed that Spt6 is critical in maintaining chromatin structure as Spt6 inactivation leads to loss of nucleosomes and aberrant transcription from cryptic promoters (Cheung et al., 2008; Kaplan et al., 2003). Yeast Spt6 has also been shown to play an important role in maintaining H3 at nucleosome-depleted-regions (NDRs) near gene 5’-ends (Perales et al., 2013). Interestingly, a mutation in yeast Spt6 resulted in a loss of H3K36me3, a mark associated with actively transcribed genes, which could not be suppressed by exogenous expression of H3K36 methyltransferase, Set2 (Youdell et al., 2008). Set2 has been shown to localize on chromatin indirectly through an interaction with Iws1 (Interacts with Spt6-1) and Spt6 (Yoh et al., 2008), which is potentially why Set2 cannot restore H3K36me3 when Spt6 is mutated.

Additionally, Spt6 is an essential (Swanson and Winston, 1992) that interacts with the transcription machinery (Hartzog and Quan, 2008) and is localized on active genes through an association with pol II (Andrulis et al., 2000; Endoh et al., 2004; Kaplan et al., 2000; Kim et al., 2004a; Krogan et al., 2002; Mayer et al., 2012). Murine Spt6 was shown to directly bind Ser2P-CTD through its SH2 domain (Yoh et al., 2007). The SH2 domain of Spt6 is unique in the fact that it is the only SH2 domain found in the yeast genome (Maclennan and Shaw, 1993), and the crystal structure of yeast Spt6’s SH2 domain indicates it preferentially binds phospho-serine rather than phospho-tyrosine (Dengl et al., 2009). However, studies measuring the binding affinity of yeast Spt6 revealed that Spt6’s SH2 domain has a strikingly higher affinity for Tyr1P-CTD than Ser2P-CTD (Mayer et al., 2012).

In addition to conflicting reports of Spt6 binding affinity, Spt6 localization on chromatin has differed across species. Initial ChIP experiments showed that Drosophila Spt6 occupies the gene body and 3’-end of HSP70 (Saunders et al., 2003). Genome-wide profiling by tiling microarrays of yeast Spt6 ChIP revealed that Spt6 is high throughout the entire gene body and does not correlate with Ser2P-CTD localization (Mayer et al., 2010). Yeast Spt6 ChIP-Chip conducted in the Bentley lab also confirmed that Spt6 is high throughout gene bodies but revealed that Spt6 is densely accumulated at 3’-ends of genes (Perales et al., 2013). Interestingly, it was shown that Tyr1P-CTD, independent of Ser2P-CTD, recruits yeast Spt6 to 3’-ends of genes while Ser2-CTD phosphorylation by Ser2 kinases Bur1 and Ctk1 is important for Spt6 recruitment to gene bodies (Burugula et al., 2014). Genome-wide localization of Spt6 in a mammalian multicellular organism has not been previously published but is presented in this chapter.

Although the preferred CTD-isoform of Spt6 has yet to be resolved, the association of Spt6 with CTD is critical for transcription as mutation of yeast Spt6 results in aberrant transcription and mRNA processing (Kaplan et al., 2005; Yoh et al., 2007). It may be through Spt6’s role as a CTD-ligand that it regulates mRNA processing. It has been previously shown that a Ser2P-CTD ligand FUS prevents hyperphosphorylation of Ser2 at 5’-ends of genes (Schwartz et al., 2012). Interestingly, yeast Spt6 has been shown to stabilize global Ser2P-CTD levels in whole cell extracts (Dronamraju and Strahl, 2014), but the effect on Ser2P-CTD bound to chromatin has not been examined.

There is a strong connection between Ser2P and 3’-end processing, but the extent to which Ser2P-CTD levels regulate processing events is unknown. The CTD is essential for proper mRNA processing (McCracken et al., 1997b), and Ser2-CTD phosphorylation promotes proper 3’-end processing in yeast and humans (Ahn et al., 2004; Gu et al., 2013). However, it is unclear how too much or too little Ser2P affects poly(A) site choice. It was shown that premature hyperphosphorylation of Ser2-CTD on human genes results in a proximal shift in poly(A) site use (Schwartz et al., 2012). It is possible that a reduction in Ser2P-CTD would similarly impact poly(A) site choice.

Mammalian Spt6 and its relationship with the CTD remain largely undefined. It is unclear what mammalian Spt6’s preferred phospho-CTD isoform is or where Spt6 localizes on genes. Additionally, it is unknown whether Spt6 stabilizes Ser2P-CTD levels on chromatin and whether Ser2P levels regulate 3’-end processing. Furthermore, it is unknown how mammalian Spt6 might regulate other aspects of transcription and histone modifications. To answer these questions, I conducted a genome-wide analysis of Spt6 knockdown in mammalian cells.


I find that Spt6 localizes at 3’-ends of mammalian genes in correlation with Ser2P-CTD suggesting its preferred ligand is Ser2P, and the presence of Spt6 promotes Ser2P at 3’-ends of genes. There is no discernible change in other phospho-CTD isoforms and pol II remains unchanged at 3’-ends. Interestingly, there is an increase of pol II density downstream of the promoter proximal pause that suggests a potential defect in transcription in connection with Spt6’s role as an elongation factor. Additionally, the reduction of Ser2P at 3’-ends of genes correlates with both aberrant poly(A) site choice at 3’ UTRs and a 3’ shift in H3K36me3 suggesting a loss of critical CTD-interacting factors that function in mRNA processing and histone methylation. This study provides a better mechanistic understanding of how Spt6 regulates transcription and mRNA processing.

3.2 Results

3.2A Spt6 is Localized at 3’-Ends of Genes in Correlation with Ser2P-CTD

Spt6 is known to localize on active genes through its interaction with the CTD, but the preferred CTD-isoform for human Spt6 remains unclear. Knowing what isoform Spt6 preferentially binds and where on genes it localizes would provide insight into potential roles of Spt6 within the transcription cycle. To determine where human Spt6 localizes on genes, I performed ChIP coupled to sequencing with anti-Spt6 serum in HeLa cells. I visualized the Spt6 ChIP-seq on the UCSC genome browser where I observed accumulation of Spt6 at the 3’-end of MYC (Figure 3-1A, purple). To compare Spt6 localization with total pol II, I aligned a previously generated anti-total pol II ChIP-seq library to MYC (Figure 3-1A, blue). Notably, Spt6 correlates with the 3’-end pol II pause and not the 5’-end pause. To determine if Spt6 localization correlates with Ser2P-CTD, I compared a previously generated anti-Ser2-phospho-isoform ChIP-seq library to Spt6 on MYC (Figure 3-1A, orange), and I observe a clear overlap in Spt6 and Ser2P-CTD at the 3’-end, which corresponds to the 3’ pol II pause. I also examined how Spt6 localization correlates with Tyr1P-CTD by generating an anti-Tyr1-phospho-isoform ChIP-seq library (Figure 3-1A, green), and I observe that Tyr1P-CTD more closely resembles total pol II and is not specifically accumulated at the 3’-end like Spt6 and Ser2P-CTD.

To determine if Spt6 localizes to 3’-ends of genes throughout the human genome, I generated a meta-plot of the ChIP signals. The meta-plot was generated by dividing all annotated human genes into three regions with the 5’ region defined as 1.5 kb upstream to 500 bp downstream of the TSS divided into 20 equal bins, the 3’ region as 500 bp upstream to 3.5 kb downstream of the poly(A) site divided into 40 equal bins, and a variable region between the defined 5’ and 3’ ends divided into 20 bins where the size of each bin depends on the length of the gene. The relative frequency of the mean Spt6 ChIP signal for each bin is reported (Figure 3-1B). I observe a distinct 3’ accumulation of Spt6 across 16,268 genes suggesting Spt6 is preferentially accumulated at the 3’-ends of genes (Figure 3-1B, arrow).
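The three-region binning scheme can be sketched as follows. This is a minimal illustration that assumes plus-strand genes longer than 1 kb (so that the variable region has positive length) and takes relative frequency to mean the per-bin mean signal divided by the sum over all bins; the function names and example coordinates are hypothetical.

```python
def metagene_bin_edges(tss, polya):
    """Return 81 edges delimiting 80 bins for a plus-strand gene:
    20 bins over TSS-1.5 kb..TSS+500 bp, 20 variable-size bins over the
    gene body, and 40 bins over poly(A)-500 bp..poly(A)+3.5 kb."""
    def region_edges(start, end, n_bins):
        step = (end - start) / n_bins
        return [start + i * step for i in range(n_bins)]

    five_prime = region_edges(tss - 1500, tss + 500, 20)
    gene_body = region_edges(tss + 500, polya - 500, 20)
    three_prime = region_edges(polya - 500, polya + 3500, 40)
    return five_prime + gene_body + three_prime + [polya + 3500]


def relative_frequency(per_gene_bins):
    """Average each bin across genes, then scale the profile to sum to 1."""
    n_genes = len(per_gene_bins)
    means = [sum(column) / n_genes for column in zip(*per_gene_bins)]
    total = sum(means)
    return [m / total for m in means]


# Hypothetical 10 kb gene: fixed 100 bp flanking bins, 450 bp body bins.
edges = metagene_bin_edges(tss=10_000, polya=20_000)
```

Because the flanking regions use fixed 100 bp bins while the gene body bins scale with gene length, genes of very different sizes can be averaged on a common axis.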

These results reveal that Spt6 specifically localizes at 3’-ends of genes in the first genome-wide analysis of Spt6 in a mammalian multicellular organism. My results suggest that Spt6 does not preferentially bind Tyr1P-CTD on chromatin, as I do not observe a strong correlation between the two factors. Rather, this study provides support for Spt6 as a Ser2P-CTD specific ligand in human cells.

3.2B Knockdown of Spt6 Through an Inducible shRNA

To further characterize the role of Spt6 in relation to pol II and a potential function at 3’-ends of genes, I developed inducible shRNAs to transiently knock down Spt6 in vivo. I used a tet-inducible shRNA system developed by the Elledge lab (Meerbrey et al., 2011) (Figure 3-2A). I transduced the inducible lentiviruses into HEK 293 Flp-In cells, which are a human embryonic kidney cell line. Through Western blot analysis, I was able to validate Spt6 knockdown upon addition of 1 μg/ml doxycycline for 72 hours (Figure 3-2B). In the rest of this chapter, I use 293 Flp-In cells that stably express the inducible tet-system; cells treated with doxycycline (+doxy) undergo Spt6 knockdown, while cells not treated with doxycycline (-doxy) are used as a control.

3.2C Spt6 Affects Pol II Distribution at the 5’- and 3’-Ends of Genes

Human Spt6 is localized specifically at the 3’-end of genes in correlation with Ser2P-CTD, which is likely its preferred ligand. However, Spt6 is a known transcription elongation factor, and it is unclear if or how mammalian Spt6 regulates elongation if it is localized at 3’-ends. To determine if Spt6 is regulating pol II distribution on genes, I performed an anti-total pol II ChIP-seq experiment with expression of Spt6 shRNA #1. Visualization of pol II distribution on the MYC gene using the UCSC genome browser revealed a higher accumulation of pol II in the region downstream of the 5’ promoter proximal pause of pol II (Figure 3-3A, blue tracks, red arrow). Redistribution of pol II on MYC was confirmed in a replicate experiment using Spt6 shRNA #4 (Figure 3-3A, green tracks). Additionally, EEF1A1 (Figure 3-3B) and GAPDH (Figure 3-3C) both exhibit a redistribution of pol II into the gene body with Spt6 knockdown. MALAT1, a lncRNA, is a unique example where pol II terminated early with Spt6 knockdown (Figure 3-3D).

In general, I find that Spt6 knockdown results in an increase in pol II downstream of the promoter proximal pause on numerous genes. The promoter proximal pause is regarded as an important regulating step in pol II’s transition from initiation to elongation. It is important to note that there is not a decrease in the promoter proximal pause itself when Spt6 is depleted, but potentially an increase on some genes (Figure 3-3B, C). This increase of pol II near the TSS suggests a defect in transcription. However, elongation rate is not directly tested in these experiments and further analysis is needed to determine if elongation is being affected. Interestingly, screening of individual genes on the UCSC genome browser revealed there is not a substantial change in pol II distribution at 3’-ends of most genes with Spt6 knockdown.


3.2D Spt6 Does Not Generally Affect Pol II Distribution Genome-Wide

Spt6 knockdown affected pol II distribution both near 5’- and 3’-ends of genes although Spt6 is specifically localized at 3’-ends of genes. To see if there is a global trend of pol II redistribution with Spt6 knockdown, I generated meta-plots as previously described for both Spt6 shRNAs (Figure 3-4A, B). I observe no evident changes in pol II distribution with Spt6 knockdown. This suggests that Spt6 does not have a universal role in regulating pol II although there are some gene-specific examples of where Spt6 is impacting pol II distribution.

3.2E Spt6 Promotes Ser2-CTD Phosphorylation at 3’-Ends of Genes

The importance of human Spt6 localization at 3’-ends of genes (Figure 3-1B) remains unclear, but yeast Spt6 is critical for stabilizing total Ser2P-CTD protein levels (Dronamraju and Strahl, 2014). To determine if human Spt6 regulates Ser2P-CTD on chromatin, I performed an anti-Ser2-phospho-isoform ChIP-seq with Spt6 knockdown. Visualization of the Ser2P ChIP-seq libraries on the UCSC genome browser revealed distinct changes in Ser2P-CTD localization on genes. The most prominent effect of Spt6 shRNA #1 knockdown is a decrease in Ser2P at 3’-ends of genes including MYC, EEF1A1, GAPDH, ACTG1, and RPS14 (Figures 3-5A-E, blue tracks). This decrease in Ser2P at 3’-ends was reproduced on each gene with Spt6 shRNA #4 (Figures 3-5A-E, green tracks). Interestingly, UBC showed the opposite effect with an increase of Ser2P at the 5’-end of the gene with Spt6 knockdown (Figure 3-5F), which was also reproducible.


To determine the effect of Spt6 knockdown on Ser2P-CTD genome-wide, I generated meta-plots of the relative frequency of mean Ser2P ChIP-seq reads as previously described. I found that Spt6 knockdown with shRNA #1 resulted in a shift in Ser2P accumulation at 3’-ends of genes (Figure 3-6A, arrow). This effect on Ser2P is reproduced with Spt6 shRNA #4 (Figure 3-6B). To be certain that the change in Ser2P-CTD is not due to a change in pol II occupancy at 3’-ends, the mean RPBM reads in 40 equal bins spanning a 4 kb region around the poly(A) site were determined for both Ser2P and total pol II ChIPs. The total pol II log2 of the mean was subtracted from the Ser2P log2 of the mean for each condition to determine the ratio of Ser2P relative to total pol II. This analysis revealed that Ser2P is decreased at the 3’-end relative to total pol II when Spt6 is knocked down (Figures 3-6C, D).
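The Ser2P-to-total-pol II normalization amounts to a per-bin difference of log2 means. The sketch below illustrates the arithmetic under the assumption that the inputs are already mean RPBM values per bin; the pseudocount guarding against empty bins, the function name, and the two-bin example values are hypothetical additions for illustration.

```python
import math


def log2_ratio_profile(ser2p_means, pol2_means, pseudocount=1e-3):
    """Per-bin log2(Ser2P mean) minus log2(total pol II mean).

    The pseudocount is an assumption added to avoid log2(0) in empty bins.
    """
    return [math.log2(s + pseudocount) - math.log2(p + pseudocount)
            for s, p in zip(ser2p_means, pol2_means)]


# Hypothetical mean RPBM values in two bins around the poly(A) site.
pol2_control, pol2_knockdown = [2.0, 2.0], [2.0, 2.0]
ser2p_control, ser2p_knockdown = [4.0, 8.0], [2.0, 4.0]

control = log2_ratio_profile(ser2p_control, pol2_control, pseudocount=0.0)
knockdown = log2_ratio_profile(ser2p_knockdown, pol2_knockdown, pseudocount=0.0)
# A negative difference means Ser2P fell relative to total pol II.
change = [k - c for k, c in zip(knockdown, control)]
```

In this toy case total pol II is unchanged while the Ser2P signal halves, so every bin shows a change of -1 log2 unit.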

These results suggest a novel function of Spt6 in regulating Ser2P-CTD on chromatin. In general, Spt6 promotes Ser2P at 3’-ends, and when Spt6 is knocked down, there is a reduction in Ser2P. Alternatively, Spt6 may prevent premature hyperphosphorylation of Ser2-CTD on UBC, but similarly regulated genes have yet to be identified.

3.2F Spt6 Does Not Affect Other Phospho-CTD Isoforms

Spt6 plays a critical role in promoting Ser2P at 3’-ends of genes (Figure 3-6), but it is unknown if it alters other phospho-CTD isoforms. Spt6 has been shown to directly bind Ser2P-CTD, but there is evidence that it has a higher affinity for both Ser5P- and Tyr1P-CTD than Ser2P-CTD (Mayer et al., 2012). To determine if Spt6 regulates other phospho-CTD isoforms, I examined how Spt6 knockdown affects their distribution on genes.

I performed anti-phospho-Ser5-isoform, anti-phospho-Ser7-isoform, and anti-phospho-Tyr1-isoform ChIP-seq and analyzed the resulting data genome-wide by meta-plot analysis. I found that Spt6 shRNA #1 knockdown had little detectable effect on Ser5P-CTD (Figure 3-7A), Ser7P-CTD (Figure 3-7B), or Tyr1P-CTD (Figure 3-7C). None of these ChIPs have been replicated with Spt6 shRNA #4. When analyzing meta-plots, relative frequency provides a reasonable solution to address issues that arise from variable read numbers due to sequencing depth. However, relative frequency can often introduce artifacts in the meta-plot analysis. For instance, an insignificant change across many bins, like in the 3’ region, can result in a more visible change in another region, like the TSS. The Ser5P, Ser7P and Tyr1P meta-plots all exhibit slight changes at the 5’ end that are identical to one another (Figure 3-7, *) and are likely due to a relative frequency artifact as described.
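The relative frequency artifact can be illustrated with a toy profile. In the sketch below, which uses made-up numbers and a hypothetical four-bin profile, the absolute signal in the first (TSS) bin is identical in both conditions, yet its relative frequency rises because the remaining bins each drop slightly.

```python
def relative_frequency(profile):
    """Scale a binned profile so that its values sum to 1."""
    total = sum(profile)
    return [value / total for value in profile]


# Bin 0 stands in for the TSS; bins 1-3 stand in for the 3' region.
control = [10.0, 10.0, 10.0, 10.0]
knockdown = [10.0, 9.0, 9.0, 9.0]  # TSS unchanged, small 3' decrease

tss_control = relative_frequency(control)[0]      # 10 / 40
tss_knockdown = relative_frequency(knockdown)[0]  # 10 / 37
apparent_tss_increase = tss_knockdown - tss_control
```

An apparent increase of this kind reflects redistribution of the normalized total, not a change in signal at the TSS, which is why an identical 5’ change across unrelated ChIPs points to an artifact.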

These experiments reveal that Spt6 does not considerably affect other phospho-CTD isoforms and suggest that Spt6 is able to specifically regulate Ser2P-CTD although it has no known kinase or phosphatase activity.

3.2G Spt6 Regulates Elongation Complex Factor but Not Ser2 Kinase

Spt6 specifically promotes Ser2P-CTD at 3’-ends of genes but the mechanism by which Spt6 regulates Ser2P is unclear. It was shown in yeast that Spt6 regulates Ser2P-CTD levels by stabilizing Ser2 kinase, Ctk1, which is the homologue of mammalian CDK9 (Dronamraju and Strahl, 2014). To determine if human Spt6 similarly regulates CDK9 on chromatin, I performed anti-CDK9 ChIP-seq with Spt6 knockdown. I analyzed the results by generating a meta-plot analysis of the data and examined the relative frequency of the mean. I observed no visible difference in CDK9 localization with Spt6 knockdown (Figure 3-8A).

It is unclear why yeast Spt6 regulates Ctk1, but mammalian Spt6 does not affect the human homologue CDK9 localization on chromatin. Interestingly, CDK9 is a subunit of positive elongation factor P-TEFb (Peng et al., 1998b), which is known to interact with super elongation complex (SEC) (Luo et al., 2012). It is possible that Spt6 regulates other functions of CDK9, potentially through the SEC, which might also be how Spt6 alters pol II occupancy on some genes (Figure 3-3A-C).

To determine if Spt6 affects SEC localization on genes, I examined SEC subunit, AFF4, by performing anti-AFF4 ChIP-seq with Spt6 knockdown. I visualized the AFF4 ChIP on the UCSC genome browser, and I observed a notable increase in AFF4 near the TSS of RPL41 and ZC3H10 (Figure 3-8B, arrows). However, the ChIP signals for these libraries were unusually low and were hard to distinguish from background on most genes. In order to visualize global localization of AFF4, I generated a meta-plot of relative frequency of mean read counts (Figure 3-8C). I observe a change in AFF4 accumulation at the TSS with Spt6 knockdown. These results suggest that Spt6 is potentially regulating transcription through SEC recruitment.


3.2H Spt6 Affects 3’ UTR Alternative Poly(A) Site Choice

Ser2P is critical in promoting efficient 3’-end processing in yeast and humans (Ahn et al., 2004; Gu et al., 2013). Premature hyperphosphorylation of Ser2-CTD on human genes results in a proximal shift in poly(A) site use (Schwartz et al., 2012), but how attenuation of Ser2P might impact poly(A) site choice is not known.

To determine the impact of diminished Ser2P-CTD with Spt6 knockdown on poly(A) site choice, I generated poly(A)+ RNA-seq libraries with Spt6 shRNA #1. The poly(A)+ RNA-seq protocol (Figure 3-9A) involves collecting total RNA, generating single stranded cDNA using oligo(dT) reverse primer, and addition of an adapter sequence, which is then utilized for library amplification and addition of a unique barcode. Libraries are submitted for Illumina sequencing and the resulting data is analyzed through the Bentley lab pipeline. Dr. Hyunmin Kim performed a linear trend analysis by clustering poly(A) sites and determining if there are significant shifts in 3’ UTR poly(A) site usage upstream or downstream in each gene.

I found 1,191 genes that had significant (p < 0.05) shifts in poly(A) site use within the 3’ UTR (Figure 3-9B). Notably, 63% of the significant shifts were downstream shifts with Spt6 knockdown. EBNA1BP2 is a gene with a significant downstream poly(A) site shift with Spt6 knockdown and the change in poly(A) site use is visualized on the UCSC genome browser (Figure 3-9C). There are two major poly(A) sites on the 3’ UTR of EBNA1BP2 and the major poly(A) site switches from the upstream in the control to the downstream with Spt6 knockdown (Figure 3-9C, arrow).
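The direction of a poly(A) site shift for a single gene can be illustrated with a simple read-weighted site index. This is not the linear trend analysis used for Figure 3-9 and it performs no significance testing; it is a minimal sketch with hypothetical function names and made-up read counts for a gene with two 3’ UTR poly(A) sites.

```python
def mean_site_index(counts):
    """Read-weighted mean index of poly(A) sites, with counts ordered from
    the most proximal (index 0) to the most distal site."""
    total = sum(counts)
    return sum(i * c for i, c in enumerate(counts)) / total


def shift_direction(control_counts, knockdown_counts, tolerance=0.0):
    """Classify the change in weighted site index between two conditions."""
    delta = mean_site_index(knockdown_counts) - mean_site_index(control_counts)
    if delta > tolerance:
        return "downstream"
    if delta < -tolerance:
        return "upstream"
    return "none"


# Hypothetical read counts at (proximal, distal) sites of one gene.
control_counts = [70, 30]
knockdown_counts = [40, 60]
direction = shift_direction(control_counts, knockdown_counts)
```

Here the major site switches from the proximal site in the control to the distal site in the knockdown, which the index reports as a downstream shift.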

These results suggest that Spt6 regulates poly(A) site choice in addition to promoting Ser2P at 3’-ends of genes. In principle, it is possible that Spt6 is directly regulating poly(A) site choice and/or poly(A) site choice is affected by the extent of Ser2P-CTD. However, there is evidence that supports the importance of Ser2P in regulation of poly(A) site choice (Davidson et al., 2014; Schwartz et al., 2012), and it is not known how Spt6 could be directly regulating poly(A) site choice. Notably, attenuation of Ser2P did not correlate with a unilateral shift in poly(A) site choice. This suggests that although the degree of Ser2P is critical, it does not clearly predict poly(A) site choice.

3.2I Spt6 Alters H3K36me3 on Gene Bodies

Both Spt6 and Ser2P-CTD have been shown to be critical for establishing H3K36me3 in yeast (Youdell et al., 2008). However, the relationship between mammalian Spt6, Ser2P-CTD and H3K36me3 is not fully understood. To determine if mammalian Spt6 regulates H3K36me3, I performed anti-H3K36me3 ChIP-seq with Spt6 shRNA #1. Visualization of H3K36me3 on the UCSC genome browser revealed a reduction and redistribution of H3K36me3 away from the 5’-ends towards the 3’-ends of genes on ACTG1, EEF1A1, RPS14, RPS16, and DDIT4 (Figures 3-10A-E, blue tracks). This redistribution of H3K36me3 was validated with Spt6 shRNA #4 (Figures 3-10A-E, green tracks). Notably, UBC once again proved to be a contradicting gene displaying an increase in H3K36me3 at the 5’-end with Spt6 knockdown (Figure 3-10F). Interestingly, this 5’ increase in H3K36me3 correlates with the 5’ increase in Ser2P-CTD (Figure 3-5F).

To determine if there is a general effect of Spt6 knockdown on H3K36me3, I generated meta-plots from the H3K36me3 ChIP reads as previously described. I found that Spt6 knockdown results in a shift of H3K36me3 within the gene body towards the 3’-end (Figures 3-11A, B). These results suggest that Spt6 is critical in maintaining proper H3K36me3. However, it is unclear whether this function depends on Spt6’s role in promoting Ser2P-CTD or on the presence of Spt6 itself.

3.3 Discussion

In this chapter, I show that Spt6, a CTD-ligand with no known kinase or phosphatase activity, promotes Ser2P-CTD at 3’-ends of genes, which ultimately regulates 3’ UTR poly(A) site choice. The prominent results are: (i) Mammalian Spt6 localizes at 3’-ends of genes in correlation with Ser2P-CTD, indicating Ser2P is Spt6’s preferred CTD ligand in humans (Figure 3-1). (ii) Spt6 promotes Ser2P at 3’-ends of genes (Figure 3-6), but it does not affect other phospho-CTD isoforms (Figure 3-7). (iii) Attenuation of Ser2P at 3’-ends correlates with use of alternative 3’ UTR poly(A) sites (Figure 3-9).

The role of Spt6 as a positive elongation factor has been established through its interaction with other transcription factors (Hartzog and Quan, 2008; Swanson and Winston, 1992), and it has been shown to localize on actively transcribed genes across species (Andrulis et al., 2000; Endoh et al., 2004; Kaplan et al., 2000; Kim et al., 2004a; Krogan et al., 2002; Mayer et al., 2012; Yoh et al., 2007). Knockdown of mammalian Spt6 supports a role in transcription regulation as it resulted in an increase of pol II downstream of the promoter proximal pause (Figure 3-3A-C). Interestingly, depletion of Spt6 also resulted in an increase of the SEC factor AFF4 at the TSS (Figure 3-8C), suggesting that Spt6 might regulate transcription through the SEC. It is known that yeast Spt6 regulates the stability of Ctk1 (Dronamraju and Strahl, 2014), whose human homologue, CDK9, is a component of P-TEFb, which interacts with the SEC. Perhaps it is through cellular CDK9 that Spt6 regulates SEC recruitment and ultimately transcription. However, depletion of Spt6 did not impact CDK9 localization on chromatin (Figure 3-8A).

Although Spt6 is known to directly bind the CTD, its preferred binding partner has been disputed in recent findings. Murine Spt6 was originally shown to directly bind Ser2P-CTD (Yoh et al., 2007); however, yeast Spt6 has the lowest affinity for Ser2P-CTD of all phospho-CTD isoforms and the highest affinity for Tyr1P-CTD (Mayer et al., 2012). I performed the first genome-wide analysis of mammalian Spt6, and I found that Spt6 is localized at 3’-ends of genes in correlation with Ser2P-CTD but not Tyr1P-CTD (Figure 3-1A). These results suggest Spt6 is bound to Ser2P-CTD on chromatin.

Yeast Spt6 was shown to stabilize global levels of Ser2P-CTD within the cell (Dronamraju and Strahl, 2014), but whether Spt6 regulates Ser2P on chromatin has not been previously shown. Interestingly, a CTD-ligand has been shown to prevent aberrant Ser2 hyperphosphorylation at 5’-ends of genes (Schwartz et al., 2012), establishing that a CTD ligand is capable of regulating its phospho-CTD mark. I found that Spt6 knockdown resulted in a reduction of Ser2P-CTD at 3’-ends of genes (Figure 3-6), suggesting mammalian Spt6 potentially functions in stabilizing Ser2P. Yeast Spt6 was shown to stabilize Ser2P through stabilization of Ctk1, but I found mammalian Spt6 does not regulate the mammalian homologue, CDK9, on chromatin (Figure 3-8A).

A potential reason Spt6 does not affect chromatin-bound CDK9 is that CDK9 localizes at 5’-ends of genes while Spt6 localizes at 3’-ends of genes. However, CDK12, another Ser2 kinase, localizes at 3’-ends of genes (Bartkowiak et al., 2010), suggesting it might be the kinase responsible for 3’-end Ser2 phosphorylation. Notably, Spt6 knockdown did not affect other phospho-CTD isoforms on chromatin (Figure 3-7), indicating that Spt6 only regulates the phospho-CTD isoform it binds. This supports a previous hypothesis (Dronamraju and Strahl, 2014) that Spt6 binding of Ser2P-CTD stabilizes, or potentially protects, the phosphorylation state of Ser2 in a positive feedback loop.

The extent to which Ser2P-CTD promotes 3’-end processing has yet to be fully resolved. The CTD is critical for proper 3’-end processing (McCracken et al., 1997b), and Ser2-CTD phosphorylation promotes proper 3’-end processing in yeast and humans (Ahn et al., 2004; Gu et al., 2013). However, it is unclear how too much or too little Ser2P affects poly(A) site choice. It was shown that premature hyperphosphorylation of Ser2-CTD on human genes results in a proximal shift in poly(A) site use (Schwartz et al., 2012). For this reason, it is possible that hypophosphorylation of Ser2-CTD would result in a distal shift in poly(A) site use. Interestingly, reduction of Ser2P-CTD at 3’-ends correlated with significant changes in 3’ UTR poly(A) site choice, with 63% of affected poly(A) sites exhibiting downstream shifts (Figure 3-9). However, attenuation of Ser2P did not correlate with a shift in poly(A) site choice exclusively in one direction, suggesting that perhaps the extent of Ser2P-CTD has a limited impact on poly(A) site choice in comparison to other factors.

Additionally, Spt6, Ser2P-CTD and Ctk1 are essential for establishing H3K36me3 in yeast (Youdell et al., 2008), as Spt6 binds Ser2P and indirectly interacts with the H3K36 methyltransferase SETD2 (Yoh et al., 2008). I found that Spt6 knockdown results in both a delay and a reduction of H3K36me3 within gene bodies (Figure 3-11). Interestingly, H3K36me3 is a mark of active elongation, and a decrease might indicate a decrease in productive elongation, substantiating Spt6’s role as an elongation factor. It is unclear how mammalian Spt6 affects H3K36me3, as it may be through loss of the Spt6-SETD2 interaction or through attenuation of transcription. Interestingly, one unique gene, UBC, exhibits both hyperphosphorylation of Ser2 at the 5’-end (Figure 3-5F) and a 5’ shift in H3K36me3 (Figure 3-10F) with Spt6 knockdown. This suggests that Ser2P is a strong indicator of H3K36me3, and that Spt6 likely does not regulate the recruitment of SETD2 on UBC.


Interestingly, it has been reported that highly used poly(A) sites are depleted of nucleosomes while downstream of the poly(A) site there is an increase of nucleosomes in correlation with an accumulation of pol II (Khaladkar et al., 2011). It is possible that the observed downstream shift of H3K36me3 with Spt6 knockdown is also impacting 3’ UTR poly(A) site choice. If a 3’ shift in H3K36me3 obstructs upstream 3’ UTR poly(A) sites, this could potentially result in a downstream shift in poly(A) sites as observed with Spt6 knockdown.

In summary, these results provide evidence for a novel role of the CTD-ligand Spt6 in promoting Ser2P on chromatin and suggest that the extent of Ser2P potentially regulates poly(A) site choice at 3’-ends. Additionally, Spt6 plays a critical role in promoting transcription and H3K36me3, but it is unclear if this is through a direct role of Spt6 or through Spt6’s ability to stabilize Ser2P-CTD. In the future, it will be interesting to examine how Spt6 promotes Ser2P by investigating the relationship between CDK12 and Spt6, and to determine if attenuation of Ser2P affects elongation rate.


Figure 3-1: Spt6 is localized at 3’-ends of genes in correlation with Ser2P-CTD. (A) Visualization of anti-Spt6 (purple), anti-total pol II (blue), anti-Ser2P (orange), anti-Tyr1P (green) ChIP-seq libraries aligned to MYC on the UCSC hg19 genome browser. Gene direction is left to right (black arrow). (B) All human genes were divided into 20 equal bins in the 5’ region 1.5 kb upstream to 500 bp downstream of the TSS, 40 equal bins in the 3’ region 500 bp upstream to 3.5 kb downstream of the poly(A) site, and 20 variable bins in a region between the defined 5’ and 3’ boundaries. The relative frequency of the mean ChIP reads (RPBM) in each bin was calculated to generate a meta-plot analysis of Spt6 localization across 16,268 genes.
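The binning scheme in panel (B) can be illustrated with a minimal Python sketch. It is not the pipeline used to produce the figure: the function names are hypothetical, the gene is assumed to lie on the plus strand, and "relative frequency" is assumed here to mean each bin's mean signal divided by the sum over all bins.

```python
def linspace_edges(start, end, n_bins):
    """Return n_bins + 1 evenly spaced bin edges from start to end."""
    step = (end - start) / n_bins
    return [start + i * step for i in range(n_bins + 1)]


def metagene_edges(tss, polya):
    """Bin edges for one plus-strand gene (hypothetical helper).

    5' region: 1.5 kb upstream to 500 bp downstream of the TSS, 20 equal bins.
    Gene body: between the 5' and 3' regions, 20 bins whose width varies by gene.
    3' region: 500 bp upstream to 3.5 kb downstream of the poly(A) site, 40 equal bins.
    """
    if polya - 500 <= tss + 500:
        raise ValueError("gene too short for separate 5' and 3' regions")
    five = linspace_edges(tss - 1500, tss + 500, 20)
    body = linspace_edges(tss + 500, polya - 500, 20)
    three = linspace_edges(polya - 500, polya + 3500, 40)
    # Drop the shared boundary edges so that 81 edges define 80 bins.
    return five + body[1:] + three[1:]


def relative_frequency(bin_means):
    """Scale per-bin mean signal so the profile sums to 1 (assumed definition)."""
    total = sum(bin_means)
    return [m / total for m in bin_means]


edges = metagene_edges(tss=10_000, polya=30_000)
print(len(edges) - 1)  # number of bins
```

With these assumptions the fixed 5’ and 3’ bins are 100 bp wide for every gene, while the gene-body bins stretch or shrink with gene length, which is what allows genes of different sizes to be averaged in one profile.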


Figure 3-2: Knockdown of Spt6 through an inducible shRNA. (A) Diagram of the tet-inducible shRNA system, which expresses shRNA specific to Spt6 upon the addition of doxycycline (Meerbrey et al., 2011). (B) Western blot analysis of whole protein extracts after 72 hours of treatment with 1 μg/ml doxycycline (+doxy) revealed that Spt6 shRNAs #1 and #4 reduced Spt6 expression (lanes 2, 4) relative to the control cells not treated with doxycycline (-doxy) (lanes 1, 3). Xrn2 serves as a loading control.


Figure 3-3: Spt6 affects pol II distribution at both the 5’- and 3’-ends of genes. (A-D) Anti-pol II ChIP-seq (RPBM) is mapped to the UCSC hg19 genome browser. Each gene shows Spt6 shRNA #1 +doxy (dark blue) and -doxy (light blue) and Spt6 shRNA #4 +doxy (dark green) and -doxy (light green). The arrow indicates gene direction. Note the increased pol II signal within the gene body upon Spt6 knockdown on MYC (A), EEF1A1 (B) and GAPDH (C). (D) In contrast, MALAT1 shows early termination of pol II at the 3’-end when Spt6 is knocked down.


Figure 3-4: Spt6 does not generally affect pol II distribution genome-wide. (A, B) Meta-plots of anti-total pol II ChIP-seq data were generated by calculating the relative frequency of the mean RPBM reads as described in 3-1B. Knockdown of Spt6 with neither shRNA #1 (A) nor shRNA #4 (B) resulted in a discernible difference in pol II localization compared to the control.


Figure 3-5: Spt6 promotes Ser2-CTD phosphorylation at 3’-ends of genes. (A-F) Rabbit polyclonal anti-Ser2P ChIP-seq reads (RPBM) are mapped to the UCSC genome browser. Each gene shows Spt6 shRNA #1 +doxy (dark blue) and -doxy (light blue) and Spt6 shRNA #4 +doxy (dark green) and -doxy (light green). The arrows below the genes indicate gene direction. Note the decrease in Ser2P at the 3’-end on MYC (A), EEF1A1 (B), GAPDH (C), ACTG1 (D), and RPS14 (E) with knockdown of Spt6. (F) In contrast, UBC revealed an increase of Ser2P-CTD at the 5’ end of the gene.


Figure 3-6: Spt6 globally regulates Ser2P-CTD at 3’-ends of genes. (A, B) Meta-plots of the Ser2P ChIP-seq data in Figure 3-5 were generated by calculating the relative frequency of the mean RPBM reads as described in 3-1B. Note the shift in Ser2P downstream of the poly(A) site with Spt6 knockdown using shRNA #1 (A) and shRNA #4 (B). (C, D) Meta-plots of anti-Ser2P ChIP-seq normalized to anti-total pol II ChIP-seq were determined by finding the mean RPBM reads in 40 equal bins surrounding the 4 kb region around the poly(A) site. The total pol II log2 of the mean was subtracted from the Ser2P log2 of the mean for each condition. Note the decrease in Ser2P relative to total pol II at the 3’-end for knockdown of Spt6 with both shRNA #1 (C) and shRNA #4 (D).
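The normalization in panels (C, D) amounts to a per-bin log2 ratio of Ser2P to total pol II. The short Python sketch below illustrates that subtraction under stated assumptions: the function name and the optional pseudocount are hypothetical and are not described in the caption.

```python
import math


def log2_ratio_profile(ser2p_means, pol2_means, pseudocount=0.0):
    """Per-bin log2(Ser2P) - log2(total pol II).

    `pseudocount` is an illustrative safeguard against empty bins; with the
    default of 0.0 a bin with zero signal raises ValueError from math.log2.
    """
    if len(ser2p_means) != len(pol2_means):
        raise ValueError("profiles must have the same number of bins")
    return [
        math.log2(s + pseudocount) - math.log2(p + pseudocount)
        for s, p in zip(ser2p_means, pol2_means)
    ]


# Hypothetical bin means: a positive value means Ser2P is enriched
# relative to total pol II in that bin; zero means no enrichment.
profile = log2_ratio_profile([4.0, 2.0, 1.0], [2.0, 2.0, 2.0])
```

Subtracting logs rather than dividing raw means keeps enrichment and depletion symmetric around zero, so a two-fold loss and a two-fold gain have the same magnitude on the plot.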


Figure 3-7: Spt6 does not affect other phospho-CTD isoforms. (A-C) Meta-plots of anti-Ser5P (A), anti-Ser7P (B), and anti-Tyr1P (C) ChIP-seq data were generated by calculating the relative frequency of the mean RPBM reads as described in 3-1B. None of these phospho-CTD isoforms reveal distinct differences with Spt6 knockdown via shRNA #1. Note there is a discernible change upstream (*) of the TSS in each isoform that is likely an artifact of relative frequency.


Figure 3-8: Spt6 affects recruitment of elongation complex factor but not Ser2 kinase. (A) Meta-plot of anti-CDK9 ChIP-seq data was generated by calculating the relative frequency of the mean RPBM reads as described in 3-1B. There is no detectable difference with knockdown of Spt6 by shRNA #1. (B) Anti-AFF4 ChIP-seq reads (RPBM) are mapped to the UCSC genome browser. Two neighboring genes are shown, RPL41 and ZC3H10, and the black arrow indicates gene direction. Spt6 shRNA #1 +doxy (dark blue) and -doxy (light blue) are shown. Note the increase of AFF4 surrounding the TSS in both genes (red arrows). (C) Meta-plot of anti-AFF4 ChIP-seq data was generated by calculating the relative frequency of the mean RPBM reads as described in 3-1B. Note the genome-wide increase of AFF4 near the TSS with Spt6 knockdown.


Figure 3-9: Spt6 affects 3’ UTR alternative poly(A) site choice. (A) Schematic of the poly(A)+ RNA-seq protocol. Total RNA is collected from cells and primed with oligo(dT) to select poly(A)+ mRNA. When cDNA is made, an adapter sequence is added, which is then utilized for library amplification and addition of a unique barcode. Libraries are submitted for Illumina sequencing and the resulting data are analyzed through the Bentley lab pipeline. Dr. Hyunmin Kim then performed a linear trend analysis by clustering 3’ UTR poly(A) sites and determining if there are significant shifts in poly(A) site usage upstream or downstream in each gene. (B) Spt6 knockdown by shRNA #1 resulted in nearly 1200 significant (p < 0.05) 3’ UTR poly(A) site shifts, with the majority being downstream shifts. (C) Mapped cleavage sites from each poly(A)-seq read normalized to library size are visualized on the UCSC genome browser. The EBNA1BP2 gene has a significant downstream shift with knockdown of Spt6 (arrow).
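To make the notion of an upstream or downstream shift concrete, the toy Python sketch below compares the usage-weighted mean position of poly(A) sites between two conditions. It is only an illustration of shift direction: it is not the linear trend analysis referred to in the caption, it performs no significance testing, and the function names and read counts are hypothetical.

```python
def mean_site_rank(counts):
    """Usage-weighted mean rank of poly(A) sites ordered 5' to 3'.

    `counts` holds read counts per site, most upstream site first.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("no reads at any poly(A) site")
    return sum(rank * c for rank, c in enumerate(counts)) / total


def shift_direction(control, knockdown):
    """Classify the change in site usage between two conditions."""
    delta = mean_site_rank(knockdown) - mean_site_rank(control)
    if delta > 0:
        return "downstream"
    if delta < 0:
        return "upstream"
    return "none"


# Hypothetical two-site gene: the upstream site dominates in the control
# and the downstream site dominates after knockdown.
print(shift_direction(control=[80, 20], knockdown=[30, 70]))
```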


Figure 3-10: Spt6 alters H3K36me3 on gene bodies. (A-E) Anti-H3K36me3 ChIP-seq reads (RPBM) are mapped to the UCSC genome browser. Each gene shows Spt6 shRNA #1 +doxy (dark blue) and -doxy (light blue) and Spt6 shRNA #4 +doxy (dark green) and -doxy (light green). The arrow indicates gene direction. Note the shift in H3K36me3 within the gene body away from the TSS towards the 3’-end on ACTG1 (A), EEF1A1 (B), RPS14 (C), RPS16 (D), and DDIT4 (E) with knockdown of Spt6. (F) In contrast, UBC revealed an increase of H3K36me3 at the 5’ end of the gene similar to the increase of Ser2P-CTD shown in 3-5F.


Figure 3-11: Spt6 globally regulates placement of H3K36me3 on genes. (A, B) Meta-plots of anti-H3K36me3 ChIP-seq data were generated by calculating the relative frequency of the mean RPBM reads as described in 3-1B. Note that knockdown of Spt6 with either shRNA #1 (A) or shRNA #4 (B) resulted in a shift of H3K36me3 towards the 3’-end, indicating that Spt6 is critical in the early deposition of this post-translational mark.


CHAPTER IV

COORDINATION OF RNA POLYMERASE II PAUSING AND 3’-END

PROCESSING FACTOR RECRUITMENT WITH ALTERNATIVE

POLYADENYLATION¹

4.1 Introduction

The majority of RNA polymerase II (pol II) transcribed mRNAs must undergo cleavage and polyadenylation (CPA) at their 3’-ends. A poly(A) site is defined by numerous cis-elements including the AAUAAA hexanucleotide sequence (Fitzgerald and Shenk, 1981) that is recognized by cleavage and polyadenylation specificity factor (CPSF) (Bienroth et al., 1991; Murthy and Manley, 1992), which is responsible for endonucleolytic cleavage (Dominski, 2007) at the CA dinucleotide element. Further downstream of the AAUAAA element is the U/GU-rich downstream element (DSE) that is recognized by cleavage stimulation factor (CstF) (MacDonald et al., 1994). Additionally, there is a U-rich upstream element (USE) with a conserved UGUA motif that is specifically bound by cleavage factor I (CFI) (Brown and Gilmartin, 2003; Hu et al., 2005). Poly(A) sites can be highly divergent, lacking one or more cis-elements, resulting in strong and weak poly(A) sites depending on conservation of the elements and the sequence conservation of the hexanucleotide element. The core 3’-processing complex is formed when CPSF, CstF and CFI bind the RNA, which results in recruitment of other factors such as cleavage factor II (CFII) and poly(A) polymerase (PAP) (Shi and Manley, 2015). Once the cleavage reaction is complete, the upstream cleavage product has an exposed 3’ OH and the downstream product has an exposed 5’ phosphate, which are respectively polyadenylated and targeted for degradation.

1 The text and figures in this chapter are adapted from a previous publication to fit the format and continuity of this thesis (Fusby et al., 2015).

CPA occurs co-transcriptionally and is coordinated by the C-terminal domain (CTD) of pol II, which is necessary for efficient mRNA 3’-end processing (McCracken et al., 1997b). The CTD is composed of a conserved heptad repeat, YSPTSPS, which is reversibly phosphorylated at different positions in coordination with the initiation, elongation and termination phases of the transcription cycle (Buratowski, 2009; Heidemann et al., 2013; Komarnitsky et al., 2000). Phosphorylation of Ser2-CTD (Ser2P) during elongation enhances mRNA 3’-end processing and binding to CPA factors (Ahn et al., 2004; Licatalosi et al., 2002; Ni et al., 2004). Pol II pauses approximately 1-5 kb downstream of poly(A) sites in human cells in a state that is hyperphosphorylated on Ser2-CTD residues and associated with cleavage/polyadenylation factors CstF and CPSF (Anamika et al., 2012; Dye et al., 2006; Glover-Cutter et al., 2008; Gromak et al., 2006; Lian et al., 2008). However, the relationship between pausing, Ser2P and cleavage factor recruitment is poorly understood. It is not known whether the poly(A) site in the nascent RNA is necessary or sufficient to induce pol II pausing, Ser2 phosphorylation, or 3’ processing factor recruitment.

Analysis of a human β-globin reporter gene revealed that mutation of the AATAAA poly(A) site inhibits both pol II pausing and Ser2P at the normal position 1-2 kb downstream of the gene (Davidson et al., 2014; Kim et al., 2011). Additionally, knockdown of the cleavage factor, CPSF73, also inhibited Ser2P downstream of the β-globin poly(A) site. These results suggest that Ser2P promotes 3’ processing factor recruitment, which in turn reinforces Ser2P in a positive feedback loop. However, how alternative poly(A) site choice is coupled to pol II pausing, Ser2P, and 3’ processing factor recruitment remains unclear.

Alternative polyadenylation (APA) is a major regulator of mRNA isoforms that affects expression of most human genes (Chan et al., 2014; Hoque et al., 2013; Lianoglou et al., 2013). APA was first identified on the IgM heavy chain gene, where a single gene produces two distinct mRNA isoforms at different stages of B-cell maturation. Plasmacytoma cells preferentially use the proximal (μS) poly(A) site within the Cμ4-M1 intron, producing the secreted IgM isoform. Alternatively, immature B-cells use the distal (μM) poly(A) site in the 3’ UTR, producing the membrane-bound IgM isoform. The cell-type specific selection of alternative Ig μ poly(A) sites is regulated by differential expression of the CstF64 subunit (Takagaki and Manley, 1998; Takagaki et al., 1996), splicing factors (Bruce et al., 2003; Ma et al., 2006) and transcription elongation factors (Martincic et al., 2009). Competition between splicing of the Cμ4-M1 intron and processing at the μS poly(A) site within the intron is a major determinant of the decision between alternative Ig μ poly(A) sites (Peterson and Perry, 1986). Thus, strengthening the weak 5’ splice site of this intron inhibits μS poly(A) site processing and favors the alternative μM site (Peterson and Perry, 1989). A putative pause site where pol II density accumulates has been identified downstream of the μS poly(A) site, and the delay associated with transcription elongation through the Cμ4-M1 intron appears to provide a competitive advantage for the μS over the μM poly(A) site (Peterson et al., 2002; Peterson and Perry, 1986).

In principle, the decision between alternative poly(A) sites could be made by co-transcriptional and/or post-transcriptional mechanisms. It is not well understood whether alternative poly(A) site usage is associated with changes in the state of pol II transcription elongation complexes such as their pausing, CTD Ser2 phosphorylation or recruitment of 3’ processing factors. In this study, I examined several features of the transcription elongation complex at IgM transgenes modified such that either μS or μM poly(A) site utilization is favored. I report that sites of pol II pausing and CstF recruitment to the transcription elongation complex correlate closely with which poly(A) site is utilized, whereas CTD Ser2 phosphorylation does not. These results suggest that co-transcriptional events make an important contribution to the decision between processing at alternative poly(A) sites.

4.2 Results

4.2A Pol II Pausing Correlates with Poly(A) Site Usage

To investigate the relationship between transcription and coupled mRNA 3’-end processing, I examined the mouse IgM gene where alternative poly(A) sites are used to encode the secreted (μS) and membrane-bound (μM) forms of Ig μ heavy chain. μM expressed in B-cells is processed at the downstream-membrane poly(A) site whereas μS expressed in plasma cells uses the upstream poly(A) site located in the intron between the Cμ4 and M1 exons. The locations of these poly(A) sites on the IgM gene are diagrammed in Figure 4-1A. I compared M12 B-cells and S194 plasmacytoma cells that contain integrated intact rearranged IgM transgenes (pR-SP6) (Ochi et al., 1983; Peterson et al., 2002; Peterson and Perry, 1986). 3’ RACE confirmed the previously reported shift in poly(A) site use from almost exclusively μS in S194 plasmacytoma cells to a mix of μS and μM in M12 B-cells (Peterson et al., 2002) (Figure 4-1B).

To investigate whether pol II pausing changes when alternative poly(A) sites are used, I performed pol II ChIP-seq in these cell lines. In these experiments, I interpret an accumulation of pol II ChIP signal at a defined region as a pause. In the S194 plasmacytoma cells there is a discrete pol II pause spanning a 1 kb region downstream of the μS poly(A) site (shaded bar in Figure 4-2A) that is absent in the M12 B-cells. A biological replicate pol II ChIP-seq experiment confirmed the pol II pause is specific to the S194 plasmacytoma cells and is absent or much reduced in M12 B-cells (Peterson et al., 2002) (Figure 4-2B). Note that the peak of pol II ChIP-seq reads at this pause has a dip in the center due to repeated sequences (Figure 4-2A) that are masked out during mapping. The center of this pause detected by pol II ChIP is approximately 300 bp downstream of a pause site mapped previously by nuclear run-on analysis (Peterson et al., 2002). I refer to this accumulation of pol II density centered 500 bases downstream of the poly(A) site as the “μS+500 pause”. The peak of pol II density directly downstream of the μS poly(A) site likely represents the previously identified pause, which is within the μS+500 pause region.

Large peaks of pol II density were consistently observed in the intron between the VDJ and Cμ1 exons (Figure 4-2A, 4-2B), which possibly correspond to paused pol II at cryptic transcription start sites. Promoter activity of the Eμ enhancer in this intron has previously been documented (Lennon and Perry, 1985). Substantial pol II ChIP signals were also observed downstream of the μM poly(A) site, but interpretation of this signal is confounded by the convergent SVneo transcription unit in the transgene (Figure 4-1A, Figure 4-2B). My conclusions are therefore limited to analysis of the pause within the Cμ4-M1 intron, which is well downstream of SVneo. SVneo serves as an internal control for transgene activity, and as expected, ChIP signals over the IgM sequences relative to SVneo are comparable between the different cell lines (Figure 4-2B).

To validate and quantify pausing at the μS+500 site within the Cμ4-M1 intron, I performed independent ChIP-qPCR experiments on the IgM transgene in S194 and M12 cells. Anti-pol II ChIP signals were quantified at two amplicons near the μS+500 pause site (Figure 4-2A, 8303, 8903) and at control upstream amplicons in the VDJ-Cμ1 intron (Figure 4-2A, 2453, 2743). qPCR confirmed the μS+500 pause site identified by ChIP-seq and showed it is approximately 5-fold higher in S194 plasmacytoma cells than in M12 B-cells (Figure 4-2C). In summary, these experiments show that use of the μS poly(A) site in the S194 cells results in an early and distinct pol II μS+500 pause that is diminished or absent when the downstream-membrane poly(A) site is used in B-cells.

4.2B The μS Poly(A) Site is Necessary for the μS+500 Pol II Pause

Since the μS+500 pause is correlated with use of the μS poly(A) site in plasmacytoma cells, I asked whether this poly(A) site is necessary to establish pol II pausing. For these experiments, I used the previously characterized mutant, pA21, with a 21-nucleotide deletion of the μS AATAAA consensus sequence and the surrounding AU/A-rich sequence (Fig. 4-3A) (Peterson et al., 2002; Peterson et al., 2006). I confirmed by 3’ RACE that this mutation causes a profound switch from the μS to the μM poly(A) site (Fig. 4-3B lanes 1, 2). Anti-pol II ChIP-seq of the pA21 and WT IgM transgenes in S194 cells (Fig. 4-4A) revealed a marked decrease in the pol II μS+500 pause in the pA21 μS poly(A) site deletion relative to WT (Fig. 4-4A, shaded box), which was confirmed in a replicate ChIP-seq experiment (Fig. 4-4B).

To validate and quantify altered pausing at the μS+500 site in the pA21 mutant, I performed independent anti-pol II ChIP-qPCR experiments. Pol II ChIP signals at two amplicons near the μS+500 pause site (Fig. 4-4A, primers 8303, 8903) were quantified relative to a control amplicon located upstream in the Cμ4 exon (Fig. 4-4A, primer 7036). This analysis shows the ratio of pol II at the μS+500 site relative to the upstream position is approximately 6-fold higher in the WT IgM gene than in the pA21 mutant (Fig. 4-4C). I conclude that the μS poly(A) site is necessary for pol II pausing at the intronic μS+500 site.
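The pause quantification described above, a ratio of signal at each pause amplicon to a single upstream control amplicon followed by a comparison between transgenes, can be sketched in Python as follows. The amplicon names come from the text, but the signal values and function names are hypothetical and chosen only to show the arithmetic; they are not measurements from this study.

```python
def ratios_to_control(signals, control):
    """Divide the ChIP signal at each amplicon by the control amplicon signal."""
    reference = signals[control]
    return {amp: value / reference
            for amp, value in signals.items() if amp != control}


def fold_difference(ratios_a, ratios_b):
    """Per-amplicon fold difference between two sets of pause ratios."""
    return {amp: ratios_a[amp] / ratios_b[amp] for amp in ratios_a}


# Hypothetical pol II ChIP-qPCR signals (arbitrary units).
wt = {"8303": 4.5, "8903": 6.0, "7036": 1.5}
pa21 = {"8303": 1.0, "8903": 1.0, "7036": 2.0}

wt_ratios = ratios_to_control(wt, control="7036")
pa21_ratios = ratios_to_control(pa21, control="7036")
print(fold_difference(wt_ratios, pa21_ratios))
```

Normalizing to an upstream amplicon within the same transgene controls for differences in overall transcription between cell lines, so the fold difference reflects local accumulation of pol II rather than expression level.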


4.2C The μS Poly(A) Site is Not Sufficient to Induce Pausing

I next investigated whether the μS poly(A) site is sufficient to induce the prominent pol II μS+500 pause in the Cμ4-M1 intron. To address this question, I used the 5’-SP mutant in which the 5’ splice site of the Cμ4-M1 intron is mutated to a strong canonical splice site sequence (Fig. 4-3A). Efficient splicing of the Cμ4-M1 intron in the 5’-SP mutant outcompetes any use of the μS poly(A) site even though the μS poly(A) site sequence is left intact (Peterson and Perry, 1989) as I confirmed by 3’ RACE (Fig. 4-3B lanes 1, 3). I performed pol II ChIP-seq experiments in S194 cells expressing the 5’-SP transgene and observed loss of the μS+500 pause in this mutant relative to WT (Fig. 4-4A). The loss of the μS+500 pause in the 5’-SP mutant was reproduced in a biological replicate ChIP-seq experiment (Fig. 4-4B).

To independently verify reduced pausing in the 5’-SP mutant, I quantified pol II ChIP signals by qPCR as described above for the pA21 mutant. Ratios of pol II occupancy in the μS+500 region (amplicons 8303 and 8903) relative to the control upstream region (amplicon 7036) were determined for 5’-SP and WT (Fig. 4-4C). By this criterion, the 5’-SP mutant has a nearly 5-fold decrease in μS+500 pol II pausing compared to WT. Together the experiments indicate that while the μS poly(A) site is necessary, it is not sufficient for downstream pol II pausing. When μS poly(A) site usage is outcompeted by efficient splicing of the Cμ4-M1 intron, then pausing within the intron is strongly inhibited.


4.2D CTD Ser2 Phosphorylation is Uncoupled from Pol II Pausing at the μS+500 Site

I next investigated the relationship between pol II pausing at the μS+500 site and Ser2P that is associated with co-transcriptional mRNA 3’-end processing. To measure Ser2P in the Cμ4-M1 intron, I carried out ChIP-seq with antibody specific to this phospho-isoform in S194 plasmacytoma cells with the WT, pA21, and 5’-SP IgM transgenes. As expected, pol II paused at the μS+500 site in the WT IgM transgene is associated with a strong Ser2P ChIP signal (Fig. 4-5A, top track, shaded box). Unexpectedly, I also detected clear evidence of Ser2P pol II at the same position in the pA21 and 5’-SP mutants (Fig. 4-5A, see shaded box) even though pol II pausing at μS+500 is much reduced in these mutants (Fig. 4-4A). These results were reproduced in biological replicates that also revealed Ser2P ChIP signals at the μS+500 Cμ4-M1 pause site on WT, pA21 and 5’-SP transgenes (Fig. 4-5B).

To validate and quantify Ser2 CTD phosphorylation relative to total pol II at the μS+500 pause site, I performed qPCR on independent ChIP samples at three amplicons near the μS+500 pause (8303, 8584, and 8734) and several others further downstream (Fig. 4-5C). Ratios of Ser2P/total pol II ChIP signals at each amplicon were normalized to Ser2P/total pol II at the 3’-end of the ACTB gene that served as an internal control. This experiment showed that the level of Ser2P relative to total pol II was not diminished in the region of the μS+500 pause in either the pA21 or the 5’-SP mutant (Fig. 4-5C) even though μS poly(A) site usage and pol II pausing are much reduced relative to WT. Surprisingly, at amplicons 8303 and 8584 in the μS+500 pause region, Ser2P/total pol II was actually higher in the pA21 mutant than in the WT. Together these results suggest that CTD Ser2 phosphorylation can be uncoupled from poly(A) site processing and pol II pausing downstream of a poly(A) site. Furthermore, the maintenance of relatively high Ser2P/total pol II ratios near the μS poly(A) site in the 5’-SP mutant compared to WT suggests that Ser2P-CTD hyperphosphorylation is not sufficient to induce processing at the poly(A) site.
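The double normalization described above is a ratio of ratios: Ser2P over total pol II at a transgene amplicon, divided by the same quantity at the internal control region. The minimal Python sketch below shows that calculation; the function name and the example values are hypothetical.

```python
def normalized_ser2p(ser2p, pol2, ser2p_ref, pol2_ref):
    """(Ser2P / total pol II) at an amplicon, relative to the same
    ratio at an internal control region (here, the ACTB 3'-end)."""
    return (ser2p / pol2) / (ser2p_ref / pol2_ref)


# Hypothetical ChIP-qPCR signals: a value of 1.0 means the amplicon carries
# the same Ser2P per pol II as the control region.
value = normalized_ser2p(ser2p=3.0, pol2=2.0, ser2p_ref=3.0, pol2_ref=4.0)
```

Dividing by total pol II first separates the phosphorylation state from polymerase occupancy, which is why Ser2P can appear undiminished in a mutant even where the pol II pause itself is reduced.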

4.2E The β-Globin Poly(A) Site is Necessary for Ser2-CTD Hyperphosphorylation

My finding that the level of Ser2-CTD phosphorylation downstream of the μS poly(A) site is not well correlated with processing at that site is surprising in light of a recent study that showed CTD Ser2 hyperphosphorylation at the 3’-end of a human β-globin reporter gene is dependent on a functional poly(A) site (Davidson et al., 2014). To independently assess the relationship between Ser2P and use of the β-globin poly(A) site, I examined single-copy human β-globin transgenes in CHO cells (Kim et al., 2011) where the poly(A) site is WT or mutated at a single position from AATAAA to AAGAAA (A2GA3), which prevents 3’ processing (Fig. 4-6A) and causes a downstream shift of pol II pausing relative to WT (Fig. 4-6B) (Kim et al., 2011). I analyzed levels of Ser2P relative to total pol II by ChIP-qPCR in the WT and A2GA3 mutant β-globin genes. These experiments showed that Ser2P levels are decreased in the A2GA3 mutant in agreement with previous results (Fig. 4-6C) (Davidson et al., 2014). Together these observations suggest that the relationship between poly(A) site use and Ser2P differs between genes and may depend on whether a single major poly(A) site or multiple alternative sites are present.

4.2F Alternative Poly(A) Site Use and CstF Recruitment

To investigate the relationship between μS poly(A) site usage and recruitment of the cleavage/polyadenylation factor CstF, I conducted anti-CstF77 ChIP-seq of WT and mutant IgM transgenes in S194 cells. As expected, there is strong recruitment of CstF77 to WT IgM in the region of the μS+500 pol II pause (Fig. 4-7A). In contrast, relative CstF77 ChIP signals in the region of the pause site are lower in the pA21 and 5’-SP mutants than in WT, consistent with reduced pol II occupancy at these positions (Fig. 4-7A).

To quantify CstF77 relative to pol II, I performed an independent ChIP qPCR experiment and quantified the results with 16 primer pairs spanning the

IgM and SVneo transcription units (Fig. 4-7B). CstF77/pol II values were normalized to a control region of maximal recruitment downstream of the SVneo gene. This experiment showed that, consistent with the ChIP-seq results, CstF77 recruitment downstream of the μS poly(A) site is lower in both the pA21 and 5’SP mutants (Fig. 4-7B, arrow) in correlation with reduced processing at the μS poly(A) site. The greatest reduction in CstF77 recruitment occurred in the pA21 mutant consistent with the idea that recognition of the consensus AAUAAA RNA processing element is important for stable association of CstF with the transcription complex. These results also suggest that maintenance of WT or


even greater levels of Ser2P in the pA21 and 5’-SP mutants is not sufficient to optimally recruit the CstF77 3’-end processing factor.
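The normalization described above (a CstF77/pol II ratio at each amplicon, scaled to a control amplicon) can be sketched as follows. This is only an illustration: the function name, the amplicon labels, and the signal values are hypothetical placeholders, not the primer pairs or measurements from this experiment.

```python
def normalized_ratio(cstf, pol2, control_amplicon):
    """CstF77/pol II ratio per amplicon, scaled to the control amplicon.

    cstf, pol2: dicts mapping amplicon name -> ChIP-qPCR signal.
    control_amplicon: name of the amplicon used as the reference (set to 1.0).
    """
    ratios = {amp: cstf[amp] / pol2[amp] for amp in cstf}
    reference = ratios[control_amplicon]
    return {amp: r / reference for amp, r in ratios.items()}


# Made-up signals in arbitrary units, for illustration only.
cstf_signal = {"pause": 2.0, "control": 4.0}
pol2_signal = {"pause": 4.0, "control": 2.0}
print(normalized_ratio(cstf_signal, pol2_signal, "control"))
```

Expressing each amplicon relative to the same control region allows cell lines with different absolute ChIP efficiencies to be compared on one scale.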

4.3 Discussion

In this report, I show that use of the alternative poly(A) sites at the IgM gene is associated with distinct co-transcriptional events manifested by different properties of the pol II transcription elongation complexes on the gene. The salient results are: (i) Pol II pausing downstream of the μS poly(A) site and recruitment of the 3’ processing factor CstF correlate with processing at that site suggesting a functional coupling. (ii) A functional poly(A) site is necessary but not sufficient for pausing. When the μS poly(A) site and the necessary factors in plasmacytoma cells are present, but the site is not utilized, as in the 5’-SP mutant, then pausing is suppressed. (iii) In contrast to the β-globin gene with a single strong poly(A) site, the level of Ser2P downstream of the alternative IgM

μS poly(A) site did not correlate closely with the extent of processing at that site.

A previously reported pol II pause site located 50-200 nucleotides downstream of the μS poly(A) site was mapped by nuclear run-on analysis

(Peterson et al., 2002). The higher resolution pol II ChIP-seq experiments reported here suggest that this pause extends several hundred bases further downstream into the middle of the Cμ4-M1 intron and is centered approximately

500 bases downstream of the poly(A) site. Interestingly, both deletion of the μS poly(A) site, in the pA21 mutant, and reduced processing at this site without changing its sequence, in the 5’-SP mutant, strongly inhibit pol II pausing at the


μS+500 position. This paused peak of pol II in the WT gene is also more prominent in the plasmacytoma cells, where processing at the μS poly(A) site predominates, compared to B-cells, where it is used less frequently. I conclude that the 3’ pol II pause is strongly associated with functional processing at the μS poly(A) site rather than just the presence of the consensus cleavage polyadenylation sequences. This result is consistent with the hypothesis that poly(A) site processing within the context of a pol II transcription elongation complex causes the polymerase to pause (Fig. 4-8A). This idea is also supported by previous results implicating the cleavage factor CPSF in promoting pausing in vitro and in vivo (Davidson et al., 2014; Nag et al., 2007). In addition, pausing could feed back to facilitate co-transcriptional μS poly(A) site processing as previously suggested (Enriquez et al., 1991; Yonaha and Proudfoot, 1999). It is also possible that binding of U1 snRNP to the 5’ splice site inhibits pausing within the intron. I am not aware of any evidence for such a mechanism, but it would be consistent with the proposed inhibition of premature cleavage polyadenylation by

U1 snRNP (Berg et al., 2012; Kaida et al., 2010). My observations are not easily reconciled with the idea that pausing is caused by intrinsic dominantly acting

DNA sequence elements within the Cμ4-M1 intron because such sequences are unaffected by the poly(A) site (pA21) and 5’ splice site (5’-SP) mutants that severely reduce pausing.

Ser2P correlates with pol II pausing and recruitment of 3’-end processing factors downstream of poly(A) sites on human genes (Davidson et al., 2014;


Glover-Cutter et al., 2008). However, what this correlation signifies for the mechanism relating 3’ processing, processing factor recruitment and CTD phosphorylation remains unclear. Deletion of the single major poly(A) site in the human β-globin gene caused a marked loss of Ser2P in the 3’ flanking region

(Davidson et al., 2014), and I repeated this observation in independent reporter cell lines. These results suggest some form of coupling between poly(A) site recognition and CTD Ser2 phosphorylation. In contrast to β-globin, I found that

Ser2P downstream of the IgM μS alternative poly(A) site was maintained and even enhanced in mutants that severely reduced processing at this site. These results therefore suggest that maintenance of Ser2P can be uncoupled from poly(A) site processing and pausing, but they do not eliminate the possibility that pausing can enhance Ser2 phosphorylation. In summary, the relationship between Ser2 hyperphosphorylation and RNA 3’-end processing can differ between genes and may depend on gene length and whether a single strong poly(A) site is present or multiple alternative sites. It remains possible that inherent poly(A) site strength might influence the relationship between 3’-end processing and Ser2 CTD phosphorylation. It is not known whether the β-globin and IgM μS poly(A) sites have different strengths though both have consensus

AAUAAA elements and the μS site has functional AU and GU rich consensus elements (Peterson et al., 2006; Phillips and Virtanen, 1997).

The roles of the poly(A) site consensus sequences, Ser2 phosphorylation of the CTD, and transcriptional pausing in recruitment of


cleavage/polyadenylation factors to the pol II elongation complex remain to be resolved. I observed that co-transcriptional recruitment of CstF to elongation complexes on the IgM gene changed when different alternative poly(A) sites were utilized. Processing at the μS poly(A) site was associated with strong recruitment of CstF77 to pol II complexes at the μS+500 pause site. Inhibition of processing at this site either by deletion of the consensus AATAAA sequence

(pA21) or by strengthening of the competing 5’ splice site (5’-SP) diminished

CstF77 recruitment, although CTD Ser2 phosphorylation levels relative to WT were maintained. These results argue that recognition of 3’ processing signals in the nascent transcript is necessary for maximal association of CstF with the transcription elongation complex. Recruitment of CstF may also be enhanced by the Ser2 hyperphosphorylated CTD; however, Ser2P alone appears not to be sufficient for formation of a stable complex. Although CstF77 recruitment downstream of the μS poly(A) site approximately parallels the level of processing at this site, I cannot exclude the possibility that other factors might regulate processing at a step after CstF binding. For example, it is possible that U1 snRNP, which inhibits processing at premature poly(A) sites (Berg et al., 2012;

Kaida et al., 2010), might inhibit μS poly(A) site processing at a stage after

CstF77 recruitment.

In summary, my results show that changes in alternative poly(A) site choice are associated with changes in pol II pausing and 3’-end processing factor recruitment to the elongation complex. The results suggest that the decision


between alternative poly(A) sites is likely made in part at the co-transcriptional level. In future, it will be of interest to investigate how properties of the pol II transcription elongation complex, such as its elongation rate and CTD phosphorylation state, could influence the decision between alternative poly(A) sites, and to identify factors responsible for Ser2-CTD hyperphosphorylation independent of poly(A) site use.


Figure 4-1: IgM WT transgene expression in M12 and S194 cells. (A) Diagram of the WT IgM transgene showing the location of the μS and μM poly(A) sites. This plasmid (pR-SP6) also contains a convergent SVneo gene. The grey dashed box indicates the region of the transgene shown in Fig. 4-2A. The intronic Eμ enhancer is marked. (B) 3’ RACE of RNA from S194 and M12 cells with IgM WT transgenes. Oligo(dT)-primed cDNA was amplified using nested forward PCR primers. PCR fragments were visualized by EtBr staining. * marks an unidentified RT-PCR product specific to S194 cells. (C) IgM copy number was determined by qPCR of input DNA from cells with the respective IgM transgenes relative to parental S194 cells.


Figure 4-2: Pol II pausing in the Cμ4-M1 intron is coupled to μS poly(A) site use. (A) Anti-pol II ChIP-seq of IgM transgenes in S194 plasmacytoma and M12 B cells reveals a pol II pause (shaded box) centered ~500 bp downstream of the μS poly(A) site (green arrow) specific to S194 cells. UCSC genome browser screen shot is shown. (B) Biological replicate anti-pol II ChIPs on the IgM transgenes in M12 B-cells and S194 plasmacytoma cells. The μS+500 pause specific to the WT gene in S194 cells is indicated with the shaded grey box. The red shaded box and red arrows indicate the converging IgM and SVneo 3’-ends. (C) Biological replicate anti-pol II ChIPs analyzed by qPCR with primers specific to the μS+500 pause site (8303 and 8903) normalized to amplicons upstream (2453 and 2743). Values are expressed relative to those for the M12 line for each pair of amplicons. Means and SEM for > 2 PCRs are shown. The standard deviation of the ratios was calculated using the BRR formula and used to derive the SEM represented by the error bars. Note the increased pausing in the S194 WT.


Figure 4-3: IgM pA21 and 5’-SP transgene expression in S194 cells. (A) Diagram of the IgM Cμ mutant transgenes, pA21 that deletes the μS poly(A) site and 5’-SP that strengthens the 5’ splice site. The grey dashed box indicates the region of the transgene shown in Fig. 4-4A. (B) 3’ RACE of WT and mutant IgM transgene transcripts in S194 cell lines as in Fig. 4-1B. * marks an unidentified RT-PCR product specific to S194 cells. Note the pA21 and 5’-SP mutants shift poly(A) site use in favor of μM. (C) IgM copy number was determined by qPCR of input DNA from cells with the respective IgM transgenes relative to parental S194 cells.


Figure 4-4: The μS poly(A) site is necessary but not sufficient for the μS+500 pol II pause. (A) UCSC genome browser screen shot of anti-pol II ChIP-seq of WT and mutant IgM transgenes in S194 cells. Note the μS+500 pause (grey shaded box) is reduced or absent in the pA21 and 5’-SP mutants (grey arrows). (B) UCSC genome browser shot of anti-pol II ChIP-seq replicates on the IgM transgene in S194 cells. These ChIPs are biological replicates of the experiment in Fig. 4-4A. Note the reduction of the WT specific μS+500 pause in the IgM mutants. (C) Biological replicate anti-pol II ChIPs analyzed by qPCR with primers specific to the μS+500 pause site (8303 and 8903) relative to an amplicon upstream (7036). Means and SEM of >3 PCRs are shown. Note reduced pausing in both pA21 and 5’-SP mutants relative to WT.


Figure 4-5: Uncoupling of pausing from pol II Ser2-CTD hyperphosphorylation in mutants that alter μS poly(A) site usage. (A) UCSC genome browser shot of anti-Ser2P-CTD ChIP-seq of IgM transgenes in S194 cells. Note Ser2P-CTD ChIP signals (arrows) at the μS+500 pause marked by the shaded grey box. (B) Anti-Ser2P-CTD ChIP-seq biological replicates on the IgM transgene in S194 cells. Note Ser2P-CTD is detectable at the μS+500 pause (grey shaded box) in the pA21 and 5’-SP mutants. (C) Biological replicate anti-Ser2P-CTD ChIPs were quantified by real-time PCR. Ser2P/total pol II ratios at each amplicon were normalized to the values at the 3’ end of the ACTB gene. Means and SEM of > 3 PCRs are shown.


Figure 4-6: The β-globin poly(A) site is necessary for pol II pausing and Ser2-CTD hyperphosphorylation. (A) Map of the integrated tet-inducible CMV-human β-globin gene and the upstream hygromycin resistance gene (white box) in CHO Flp-in cells with a poly(A) site mutation (AAGAAA) marked. Arrows mark the TSS and poly(A) site. PCR amplicons are indicated with their position relative to the TSS. (B) Relative anti-pol II ChIP signals normalized to the maximum value across β-globin. Means and SEM are shown. Note the WT gene has a pol II pause proximal to the poly(A) site (grey arrow) that is diminished in the A2GA3 poly(A) site mutant, where there is evidence of a distal pause (brown arrow). These data are from Kim et al. (2011). (C) Poly(A) site mutation reduces CTD Ser2P on β-globin in contrast to the IgM μS poly(A) site. Anti-Ser2P/total pol II values at each amplicon were normalized to position 1063 upstream of the poly(A) site in the WT and A2GA3 poly(A) site mutant β-globin transgenes in CHO Flp-in cells. Means and SEM are shown. Note decreased Ser2P/total pol II in the A2GA3 mutant (arrow), consistent with previous results (Davidson et al., 2014).


Figure 4-7: Alternative poly(A) site use and CstF recruitment. (A) UCSC genome browser shot of anti-CstF77 ChIP-seq of IgM transgenes in S194 cells. Note reduced CstF77 recruitment at the μS+500 pause (shaded box) in the pA21 and 5’-SP mutants (grey arrows) relative to WT. (B) Biological replicate anti-CstF77 ChIP quantified by real-time PCR. CstF77/total pol II ratios at each amplicon were normalized to the maximum values in the region between convergent IgM and SVneo genes. Means and SEM of 2 PCRs are shown. Note reduced CstF77 recruitment relative to pol II in pA21 and 5’-SP mutants (arrow).


Figure 4-8: Model for coordination of pausing and 3’ processing factor recruitment with alternative poly(A) site choice at the IgM gene. (A) In plasma cells, processing of the μS poly(A) site outcompetes splicing of the Cμ4-M1 intron and is associated with recruitment of cleavage polyadenylation (CPA) factors to the transcription elongation complex. Our results suggest that recognition of the poly(A) site results in pol II pausing. It is not known whether splicing factors are excluded as shown, or if they are present in the complex but unable to function. (B) In B-cells and in the poly(A) site deletion (pA21; ΔμS pA) and consensus 5’ splice site (5’-SP; *) mutants, splicing outcompetes the μS poly(A) site. When splicing factors have the competitive advantage, CPA factors are excluded and pol II does not pause at the μS+500 site within the Cμ4-M1 intron.


CHAPTER V

REGULATION OF ALTERNATIVE POLY(A) SITE CHOICE BY SPLICING

5.1 Introduction

Processing of most eukaryotic transcripts involves capping, splicing, cleavage and polyadenylation of the pre-mRNA. Splicing removes introns from a pre-mRNA transcript in a two-step catalytic reaction via the spliceosome. The major spliceosome core comprises five snRNPs, U1, U2, U4, U6, and U5, which are assembled around their corresponding snRNA. U1 snRNP initiates spliceosome assembly by recognizing the 5’ splice site (Reed, 1996) while the U2 auxiliary factor (U2AF) subunits, U2AF35 and U2AF65, recognize the 3’ splice site and polypyrimidine tract, respectively (Wu et al., 1999; Zamore et al., 1992). The branch point is recognized by splicing factor 1 (SF1) (Liu et al., 2001) before U2 snRNP displaces SF1 to bind the branch point in an ATP-dependent reaction

(Gozani et al., 1996; Query et al., 1996; Will and Luhrmann, 2001). Next, a pre-formed U4/U6.U5 tri-snRNP joins the reaction, resulting in the ejection of U1 and

U4 snRNP and formation of a catalytically active spliceosome. The splicing reaction can then occur resulting in removal of the intron lariat and joining of the two exons.

3’-end processing typically involves recognition of the poly(A) site (PAS) and results in cleavage of nascent RNA exposing two ends, a 3’ OH and a 5’ phosphate, that are polyadenylated and targeted for degradation, respectively. A

PAS is defined by the presence of numerous cis-elements that are recognized by


specific processing factors. The AAUAAA hexanucleotide is located ~10-30 nt upstream of the cleavage site and is recognized by cleavage and polyadenylation specificity factor

(CPSF) (Keller et al., 1991). Although not required, there is often a U/GU-rich downstream element (DSE) recognized by cleavage stimulation factor (CstF) and a U-rich upstream element (USE) with a UGUA consensus motif that is recognized by cleavage factor I (CFI) (Brown and Gilmartin, 2003; MacDonald et al., 1994). Recognition of the core cis-elements by their respective factors forms the core 3’ processing complex and recruits additional processing factors including cleavage factor II (CFII), poly(A) polymerase (PAP) and polyadenylate-binding protein nuclear 1 (PABPN1) among others. Interestingly, one group of eukaryotic genes not processed at their 3’-ends by this mechanism is snRNA genes, which contain their own set of 3’-end cis-elements and are processed by the Integrator complex (Shi et al., 2009).

Interestingly, poly(A) sites can be highly divergent, lacking one or more cis-elements, and can have a variant hexanucleotide sequence. Additionally, poly(A) sites are abundant throughout the human genome with most genes containing more than one viable poly(A) site, and alternative polyadenylation (APA) is estimated to occur in ~70-79% of mammalian genes (Hoque et al., 2013). APA was first observed on the IgM gene, which has two developmentally regulated poly(A) sites where in plasma cells the proximal intronic poly(A) site is used, and in B-cells, the distal 3’ poly(A) site is used (Early et al., 1980; Peterson, 2011).

The regulation of alternative poly(A) sites on IgM has been shown to be


dependent on a balance of efficiencies as plasma cells use the weak intronic poly(A) site in coordination with a weak 5’ splice site, and strengthening the 5’ splice site results in use of the distal poly(A) site only (Peterson and Perry, 1989).

These results suggest a balance between splicing and 3’-end processing efficiencies regulates intronic poly(A) sites. The full extent of APA regulation remains unclear, but the relationship between splicing and 3’-end processing is particularly interesting, as there are functional and biochemical links between the processes.

Both splicing and 3’-end processing are coordinated co-transcriptionally through the C-terminal domain (CTD) of RNA polymerase II (pol II) providing a functional link between the processing events. The CTD is essential for proper mRNA processing (McCracken et al., 1997b) and numerous splicing and cleavage/polyadenylation factors directly interact with the CTD (Bourquin et al.,

1997; Kim et al., 1997; Licatalosi et al., 2002; Pal-Bhadra et al., 2004; Robert et al., 2002; Tanner et al., 1997). Additionally, splicing and cleavage/polyadenylation factors are able to interact directly as U2 snRNP interacts with CPSF (Kyburz et al., 2006), U2AF65 interacts with PAP (Vagner et al., 2000) and CFI (Millevoi et al., 2006), and U1 snRNP interacts with CPSF

(Lutz et al., 1996). However, the functional impact of these interactions on their respective processes is unclear.

A mutual functional coupling between splicing and 3’-end processing was first characterized in a late region deletion mutation of SV40 where a 3’ splice site


deletion resulted in aberrant 3’-end cleavage and inhibition of polyadenylation

(Villarreal and White, 1983). Additionally, it was shown that intron splicing significantly enhances polyadenylation (Niwa et al., 1990) and that mutation of either the terminal splicing or cleavage/polyadenylation sequences disrupts both splicing and 3’-end formation (Dye and Proudfoot, 1999). These experiments led to the terminal exon definition model, which suggests a mutually beneficial relationship between terminal splicing and proper cleavage/polyadenylation.

However, there is also evidence of splicing factors negatively regulating 3’-end formation. U1 snRNA interaction with the 5’ splice site has been shown to silence both the upstream poly(A) site of HIV-1 (Ashe et al., 1995) and the downstream poly(A) site of BPV-1 (Furth and Baker, 1991). Additionally, U1 snRNP binding at 5’ splice sites has been shown to recruit and bind PAP through

U1-70K, effectively inhibiting polyadenylation (Gunderson et al., 1998).

Interestingly, the U1 snRNP component U1A autoregulates its own expression by binding a sequence in the 3’ UTR of U1A pre-mRNA and inhibiting use of the poly(A) site

(Boelens et al., 1993; Gunderson et al., 1994; van Gelder et al., 1993). There is a strikingly high amount of U1 snRNP in the cell, with an estimated one million copies per human cell, although snRNPs form the spliceosome in 1:1 stoichiometry (Baserga, 1993). For this reason, U1 snRNP has been of particular interest as it is hypothesized that the high amount of U1 snRNP is due to potential alternative functions.

Notably, depletion of functional U1 snRNP resulted in accumulation of


premature transcripts through utilization of upstream cryptic poly(A) sites typically located within introns (Kaida et al., 2010). Interestingly, depletion of functional U2 snRNP or global splicing inhibition with spliceostatin A (SSA) treatment did not result in use of these premature poly(A) sites. For this reason, it is proposed that

U1 snRNP functions outside of its role in splicing to bind pre-mRNA at potential

U1 binding sites, which are weakly conserved, and prevent the use of cryptic poly(A) sites through steric inhibition. However, it has not been shown that a cryptic poly(A) site can be upregulated by mutating a nearby U1 binding site.

Genome-wide analysis identified both splicing and cleavage/polyadenylation factors that regulate poly(A) sites (Li et al., 2015).

Interestingly, components of CFI and PABPN1 promote use of distal 3’ UTR poly(A) sites while CPSF component, Fip1, and polyadenylation/cleavage factor,

Pcf11, promote use of upstream 3’ UTR poly(A) sites (Li et al., 2015).

Additionally, knockdown of numerous splicing factors and depletion of U1 snRNP resulted in an upregulation of poly(A) sites in intronic and exonic regions suggesting that splicing generally represses upstream cleavage/polyadenylation

(Li et al., 2015).

The mechanism regulating APA use remains poorly understood but evidence supports regulation of cleavage/polyadenylation by splicing. It is unclear if a balance of processing efficiencies regulates human intronic poly(A) sites or if it is a role unique to U1 snRNP, as there is evidence supporting both mechanisms. This study examines the role of splicing and spliceosome


components in regulating alternative poly(A) site choice genome-wide. I report that snRNA globally regulates poly(A) site choice, with a notable increase in use of intronic poly(A) sites that is broadly due to an attenuation of splicing and not a function unique to U1 snRNP. Interestingly, I found that U1, U4 and U6 snRNPs protect similar poly(A) sites near the TSS while U2 snRNP protects an alternative group of poly(A) sites with no 5’-bias. Lastly, I found that depletion of snRNPs results in a potential negative feedback loop that regulates APA on Integrator genes, which function in snRNA 3’-end processing.

5.2 Results

5.2A snRNA ASOs Attenuate Splicing

In order to examine the role of splicing in poly(A) site choice, I selectively depleted functional snRNPs in HeLa cells using anti-sense oligonucleotides

(ASOs). The ASOs are designed to specifically bind snRNA resulting in an

RNA:DNA hybrid, which is targeted by RNase H for degradation (Vickers et al.,

2011). I examined the efficiency of splicing on MYC to test if ASO transfection in

HeLa cells inhibits splicing. I collected total RNA from treated cells and generated single-stranded poly(A)+ cDNA using an oligo(dT) reverse primer. I PCR amplified the cDNA using paired primers that either spanned an exon-intron junction or spanned a region on the terminal exon of MYC (Figure 5-1A). PCR signal from the exon-intron junction represents unspliced MYC while signal from the terminal exon represents the constitutive level of MYC within the cells. I found that treatment with 100 nM ASO for 24 hours resulted in some reduction of splicing


for all conditions except the control (Figure 5-1A).

I quantified the MYC PCR reactions by measuring the densitometry of the

EtBr signal. I first found the ratio of the exon-intron PCR product relative to the terminal exon PCR product for each ASO condition. I then divided the ratios for the treated conditions by the control to determine the fold increases of unspliced

MYC relative to the control. The greatest reduction of MYC splicing resulted from

U4 ASO treatment with 8.7 fold more unspliced MYC relative to the control and

U1 ASO treatment exhibited the smallest reduction in splicing with 1.6 fold more unspliced MYC (Figure 5-1B).
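The fold-change calculation described above can be sketched in a few lines. This is a minimal illustration only: the function name and the band intensities are hypothetical placeholders and are not the densitometry values behind Figure 5-1B.

```python
def fold_unspliced(treated, control):
    """Fold increase of unspliced MYC in a treated sample relative to control.

    Each argument is a (exon_intron_signal, terminal_exon_signal) pair of
    EtBr band intensities obtained by densitometry.
    """
    treated_ratio = treated[0] / treated[1]   # unspliced / total, treated sample
    control_ratio = control[0] / control[1]   # unspliced / total, control ASO
    return treated_ratio / control_ratio


# Hypothetical band intensities (arbitrary units), for illustration only.
control_aso = (10.0, 100.0)
example_aso = (40.0, 80.0)
print(fold_unspliced(example_aso, control_aso))
```

Dividing by the terminal exon signal first corrects for differences in overall MYC abundance between samples, so the final value reflects the unspliced fraction rather than total expression.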

Although splicing of MYC was not completely inhibited, I found that 100 nM was the highest ASO concentration I could use without inhibiting splicing with the control ASO. The fact that U1 ASO had the smallest attenuation of splicing is not unexpected as there is an estimated one million U1 snRNAs per

HeLa cell, which is greater than twice the amount of any other snRNA (Baserga,

1993). In principle, snRNP depletion will not equally affect all introns, as introns vary in splicing efficiency. I expect weak splice signals to be most impacted by snRNP depletion and strong splice signals to be the least affected.

5.2B Reduction of Splicing Alters Poly(A) Site Use within Gene Bodies

To examine how attenuation of splicing impacts poly(A) site usage on gene-bodies, I generated poly(A)+ RNA-seq libraries. In addition to depleting functional snRNPs with snRNA ASOs, I used SSA, which targets U2 snRNP component, SF3b, and is a robust inhibitor of splicing (Kaida et al., 2007). The


number of mapped reads for each poly(A)+ RNA-seq library is detailed in Figure

5-2A. Interestingly, SSA treatment allows for early spliceosome formation on the pre-mRNA, but it prevents the transition of the spliceosome into its catalytic form effectively inhibiting splicing (Roybal and Jurica, 2010). Treatment with SSA allows me to test how snRNPs, individually or through spliceosome formation, block poly(A) site usage on the gene body.

Alternative poly(A) site usage was determined by first calculating the total amount of poly(A) site usage on a single gene up to 200 bp downstream of the 3’

UTR. Next, the amount of poly(A) site use within the gene body, which I defined as upstream of the 3’ UTR, was given a ratio of usage based on the total use of poly(A) sites on the gene. This ratio of gene-body poly(A) site usage was then compared between the treated conditions relative to the control ASO condition. I defined a significant change in poly(A) site usage as having an FDR ≤ 0.05.
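The gene-body usage ratio defined above can be illustrated with a short sketch. It assumes that the per-site read counts have already been restricted to the gene plus 200 bp downstream of the 3’ UTR, and that positions increase in the 5’ to 3’ direction; the function name and the counts are hypothetical. The statistical test behind the FDR threshold is not specified in the text and is therefore not shown.

```python
def gene_body_usage(site_counts, utr_start):
    """Fraction of a gene's poly(A) reads that fall upstream of the 3' UTR.

    site_counts: dict mapping cleavage-site position (increasing 5'->3')
                 to the number of reads supporting that site.
    utr_start:   first position of the 3' UTR; sites before it count as
                 gene-body sites.
    """
    total = sum(site_counts.values())
    if total == 0:
        return None  # no poly(A) reads on this gene, ratio undefined
    body = sum(n for pos, n in site_counts.items() if pos < utr_start)
    return body / total


# Hypothetical read counts for one gene, for illustration only.
control = {1200: 5, 4800: 95}
treated = {1200: 30, 4800: 70}
change = gene_body_usage(treated, 4500) - gene_body_usage(control, 4500)
print(change)  # positive values indicate increased gene-body poly(A) site use
```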

I found that reduction of splicing, either through disruption of snRNP formation or SSA treatment, greatly affects poly(A) site usage within gene bodies and there were both considerable increases and decreases in poly(A) site usage

(Figure 5-2B). Interestingly, treatment with 20 ng/ml SSA for 12 hours and with

U2 ASO resulted in the greatest number of altered poly(A) sites while U1 and U4

ASO treatment resulted in the fewest affected poly(A) sites (Figure 5-2B). For this study, I focused on characterizing the poly(A) sites that demonstrate increased usage to further elucidate if U1 snRNP is unique in preventing processing at cryptic poly(A) sites. However, it will be interesting to further characterize the


down-regulated poly(A) sites in the future. Notably, treatment with the different

ASOs resulted in approximately 40% of affected poly(A) sites exhibiting increased use and 60% with decreased use (Figure 5-2C). SSA treatment exhibited the opposite effect with 66% of affected poly(A) sites being increased and 34% decreased (Figure 5-2C).

My analysis supports previous findings that U1 snRNP functions in preventing processing at premature poly(A) sites (Kaida et al., 2010), but also suggests that reduction of splicing through alternative mechanisms regulates poly(A) site usage within gene bodies. Interestingly, I found that U1 snRNP depletion affects the fewest number of poly(A) sites and SSA treatment affected the most poly(A) sites. Additionally, the disparity in percent of poly(A) sites protected by individual snRNPs (ASO treatment) compared to poly(A) sites protected by universal regulation of splicing (SSA treatment) (Figure 5-2C) suggests that splicing is a general regulator of intronic poly(A) sites while snRNPs might function on distinct classes of poly(A) sites. For example, U2 snRNP might protect poly(A) sites near the 3’ splice sites while U1 snRNP might protect poly(A) sites near 5’ splice sites.

5.2C Different Mechanisms of Splicing Inhibition Alternatively Regulate Poly(A)

Site Use

In principle, depletion of different snRNPs would impact splicing by alternative mechanisms. For example, U1 snRNP and U2 snRNP both directly bind the pre-mRNA but at opposite ends of the intron so depletion of either


snRNP would affect different steps in spliceosome assembly. However, I cannot eliminate the unlikely possibility that depletion of a single snRNP destabilizes the rest of the snRNPs. SSA treatment inhibits splicing by blocking the formation of the catalytic spliceosome but still allows early spliceosome formation. Since depletion of U1, U2, U4, U6 snRNPs and SSA treatment differentially hinder splicing but similarly affect poly(A) site use, I wanted to examine how closely related the affected poly(A) sites are in each condition.

To do this, Dr. Hyunmin Kim analyzed the poly(A)+ RNA-seq data for similar poly(A) site usage by sample clustering. The mapped cleavage sites were clustered in 20 bp increments throughout the genome, and the heights of these poly(A) site clusters were compared and summarized to calculate Spearman’s rank correlation coefficient (rho). The rho values were then clustered using the R function hclust, which takes a pair-wise distance matrix calculated as 1-rho, generating a dendrogram depicting similarity in poly(A) site usage between the different conditions.
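The first steps of this analysis (binning cleavage sites, computing rho, and building the 1-rho distance matrix) can be sketched as below. This is one possible reading of the procedure written in plain Python rather than R, not the pipeline that was actually run; all function names and the toy coordinates are hypothetical. The resulting distances are what a hierarchical clustering routine such as hclust would take as input.

```python
from collections import Counter


def bin_heights(cleavage_positions, bin_size=20):
    """Count mapped cleavage sites per bin_size-bp bin."""
    return Counter(pos // bin_size for pos in cleavage_positions)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def distance_matrix(samples):
    """Pairwise 1 - rho distances; samples maps name -> Counter of bin heights."""
    bins = sorted(set().union(*samples.values()))
    vectors = {name: [c[b] for b in bins] for name, c in samples.items()}
    names = sorted(vectors)
    return {(a, b): 1 - spearman_rho(vectors[a], vectors[b])
            for a in names for b in names if a < b}


# Toy cleavage-site coordinates for three samples, for illustration only.
samples = {
    "sample_a": bin_heights([5, 12, 25, 47, 48, 49]),
    "sample_b": bin_heights([3, 8, 9, 30, 41, 44, 45, 46]),
    "sample_c": bin_heights([1, 21, 22, 23, 24, 50]),
}
for pair, dist in distance_matrix(samples).items():
    print(pair, round(dist, 3))
```

Because rho depends only on ranks, two samples with the same ordering of cluster heights have a distance of 0 even when their sequencing depths differ.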

This analysis revealed that U1, U4 and U6 snRNP depletion impacted similar poly(A) sites (Figure 5-3). Likewise, U2 snRNP depletion and SSA treatment resulted in use of similar poly(A) sites (Figure 5-3), which is not unexpected as SSA functions by targeting a U2 snRNP component. Interestingly, depletion of U1, U4, and U6 snRNPs affected the fewest poly(A) sites compared to the control. In contrast, U2 snRNP depletion and SSA treatment resulted in the greatest number of affected poly(A) sites relative to control. These


results suggest that alternative classes of poly(A) sites are regulated by U1, U4 and U6 snRNPs and by U2 snRNP.

5.2D U1, U4, and U6 snRNPs Protect Poly(A) Sites Near the TSS

To further characterize how attenuation of splicing regulates poly(A) site choice, I examined if there was any bias in the gene position of affected poly(A) sites. To do this, the significantly impacted poly(A) sites (FDR ≤ 0.05) were divided into two categories, increased and decreased poly(A) site usage as previously described in section 5.2B. Dr. Hyunmin Kim then mapped the density of significantly altered poly(A) sites relative to gene length where ‘0.0’ refers to the most proximal TSS and ‘1.0’ refers to the most distal poly(A) site for each gene (Figure 5-4).
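The scaling of site positions to gene length can be sketched as follows. This is an illustrative reconstruction, assuming genomic coordinates and strand-aware offsets; the function names, the bin count, and the coordinates are hypothetical and are not taken from the analysis behind Figure 5-4.

```python
def relative_position(site, tss, distal_pas, strand="+"):
    """Scale a poly(A) site so the TSS is 0.0 and the distal poly(A) site is 1.0.

    Coordinates are genomic; strand is '+' or '-'.
    """
    length = abs(distal_pas - tss)
    if length == 0:
        raise ValueError("TSS and distal poly(A) site coincide")
    offset = site - tss if strand == "+" else tss - site
    return offset / length


def density(positions, n_bins=10):
    """Histogram of relative positions in n_bins equal-width bins over [0, 1]."""
    counts = [0] * n_bins
    for p in positions:
        idx = min(int(p * n_bins), n_bins - 1)  # a value of 1.0 goes in the last bin
        counts[idx] += 1
    return counts


# Hypothetical coordinates, for illustration only.
print(relative_position(1500, tss=1000, distal_pas=6000))               # plus strand
print(relative_position(9000, tss=10000, distal_pas=5000, strand="-"))  # minus strand
print(density([0.05, 0.15, 0.95, 1.0]))
```

Scaling every gene to the same 0.0 to 1.0 interval lets sites from genes of very different lengths be pooled into a single density profile.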

This analysis revealed that increased poly(A) sites with U1, U4 and U6 snRNP depletion are notably biased towards the transcription start site (TSS) of genes (Figure 5-4A-C). Interestingly, the 5’ bias of increased poly(A) site usage is not observed with U2 snRNP depletion or SSA treatment (Figure 5-4D,E), and in fact, there appears to be no strong bias for where poly(A) sites are being impacted on genes. This further suggests that U1, U4 and U6 snRNP share a similar mechanism of poly(A) site regulation perhaps near weak 5’ splice sites, and U2 snRNP is potentially functioning in an alternative mechanism in correlation with weak 3’ splice sites.


5.2E Splicing Preferentially Regulates Use of Intronic Poly(A) Sites

In the model of balanced efficiencies, it is hypothesized that weakening of splicing would result in increased use of intronic poly(A) sites as they would gain the competitive advantage. Alternatively, it is possible that the presence of snRNPs and assembly of the spliceosome blocks access to intronic poly(A) sites.

The steric-blockage model could be further extended to exonic poly(A) sites as the exon definition model suggests that U1 and U2 snRNP interaction is bridged over exons due to their relatively short lengths compared to introns. For this reason if snRNP depletion equally favored exonic and intronic poly(A) sites, it would support a mechanism of steric blockage. Alternatively if intronic poly(A) sites were preferentially repressed by splicing, it would support a model of balanced-efficiencies regulating intronic poly(A) sites.

To address which mechanism is regulating intronic poly(A) sites, I analyzed the poly(A)+ RNA-seq data to determine if increased poly(A) site usage occurs more often in exons or introns when splicing is diminished. The data were separated into gene-body poly(A) sites that have significantly (FDR ≤ 0.05) increased and decreased usage relative to control, as described above in section 5.2B. I converted the list of significant poly(A) sites into a BED file and used Galaxy to intersect it with the UCSC hg19 intron BED file. This provided me with a list of significant poly(A) site changes that occurred within introns for each treatment condition.
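The intersection itself was performed in Galaxy. As an illustration of the underlying operation only, the following minimal sketch reports which poly(A) sites overlap any intron, assuming standard 0-based, half-open BED coordinates and ignoring strand. The function names are hypothetical, and the linear scan is meant for clarity rather than genome-scale speed.

```python
from collections import defaultdict


def parse_bed(lines):
    """Parse BED text lines into (chrom, start, end) tuples."""
    records = []
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines and header lines
        chrom, start, end = line.split()[:3]
        records.append((chrom, int(start), int(end)))
    return records


def intronic_sites(sites, introns):
    """Return the poly(A) sites that overlap at least one intron."""
    by_chrom = defaultdict(list)
    for chrom, start, end in introns:
        by_chrom[chrom].append((start, end))
    hits = []
    for chrom, start, end in sites:
        # Half-open intervals overlap when each starts before the other ends.
        if any(start < i_end and end > i_start
               for i_start, i_end in by_chrom[chrom]):
            hits.append((chrom, start, end))
    return hits


def percent_intronic_sites(sites, introns):
    """Percentage of the supplied sites that fall within introns."""
    if not sites:
        raise ValueError("no poly(A) sites supplied")
    return 100.0 * len(intronic_sites(sites, introns)) / len(sites)
```

Applying `percent_intronic_sites` separately to the increased and decreased site lists for each condition would yield percentages of the kind compared in this section.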

I found that significantly increased poly(A) sites due to reduced splicing were found 81-93% of the time in introns (Figure 5-5A). Interestingly, U1 snRNP depletion resulted in the greatest percentage of increased poly(A) sites within introns, supporting previous evidence that U1 snRNP preferentially protects intronic poly(A) sites (Kaida et al., 2010). One explanation for why the majority of increased poly(A) sites are in introns is that introns are a large part of a eukaryotic gene’s structure. It is estimated that introns account for ten times more DNA than exons in genes (Cooper, 2010). However, if the high percentage of increased poly(A) sites were due to the gene composition, then I would expect a comparable percentage of decreased poly(A) sites to be intronic. I found that 70-72% of significantly decreased poly(A) sites due to snRNP depletion were located within introns, and 77% of decreased poly(A) sites due to SSA treatment were located within introns (Figure 5-5B). Although a modest difference, these results suggest that splicing preferentially prevents use of intronic poly(A) sites.

5.2F snRNP Depletion Favors Intronic Poly(A) Site Use on NR3C1 Mini-Gene

To further characterize regulation of intronic poly(A) sites by splicing, I generated an NR3C1 mini-gene, which was previously utilized to demonstrate U1 snRNP regulation of an intronic cryptic poly(A) site (Kaida et al., 2010).

Interestingly, U2 snRNP depletion and splicing inhibition due to SSA treatment had no significant impact on the NR3C1 mini-gene cryptic poly(A) site (Kaida et al., 2010), suggesting that NR3C1’s intronic poly(A) site is specifically regulated by U1 snRNP. When I observe the endogenous NR3C1 cryptic poly(A) site on the UCSC genome browser, there is a low number of mapped reads (Figure 5-6A), which makes interpretation difficult. However, use of the NR3C1 mini-gene allows me to investigate if depletion of other snRNPs regulates this intronic poly(A) site. Since my poly(A)+ RNA-seq data reveals that U2, U4, and U6 snRNP depletion affects other intronic poly(A) sites, I expect that they will also regulate NR3C1’s intronic poly(A) site.

The NR3C1 mini-gene was constructed as previously described (Kaida et al., 2010) to contain the exons and shortened intron region flanking the intronic poly(A) site as shown by the blue boxes and dashed line in Figure 5-6A. The mini-gene was cloned into a pcDNA3 expression plasmid and simultaneously transfected into HeLa cells with the respective ASOs. Total RNA was collected and reverse-transcribed into single-stranded cDNA using a reverse oligo(dT) primer to select only polyadenylated mRNA. 3’ RACE was conducted on single-stranded cDNA using nested forward primers that align to the second exon (Figure 5-6B), and a reverse primer that recognizes a unique sequence engineered into the oligo(dT) primer (Kaida et al., 2010). The resulting PCR products were analyzed on an agarose gel.

I found that treatment with the control ASO resulted in use of only the 3’-end poly(A) site located on the pcDNA3 vector (Figure 5-6C, lane 1). Depletion of U1, U2, U4, and U6 snRNP all resulted in increased use of the intronic NR3C1 poly(A) site and a reduction in use of the 3’-end poly(A) site (Figure 5-6C, lanes 2-5). The 3’ RACE results were further quantified by densitometry of the EtBr signal. Total NR3C1 mRNA level was determined by adding the intronic and 3’-end poly(A) site values, and the intronic poly(A) site value was divided by the total signal to find the percent use of the intronic poly(A) site on the NR3C1 mini-gene (Figure 5-6D). There was greater than 85% use of the intronic poly(A) site in each case where splicing was diminished due to snRNP depletion.
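The percent-use calculation described above can be written as a short sketch. The function name and the band intensities in the usage note are hypothetical and do not reproduce the measured densitometry values.

```python
def percent_intronic_use(intronic_signal, three_prime_signal):
    """Percent use of the intronic poly(A) site from two band intensities.

    The total signal is the sum of the intronic and 3'-end poly(A) site
    bands; the intronic band is expressed as a percentage of that total.
    """
    total = intronic_signal + three_prime_signal
    if total <= 0:
        raise ValueError("total signal must be positive")
    return 100.0 * intronic_signal / total
```

For example, hypothetical band intensities of 90 (intronic) and 10 (3’-end) correspond to 90% use of the intronic site, whereas an absent intronic band corresponds to 0%.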

These results suggest that the NR3C1 intronic poly(A) site is intrinsically regulated by the presence of functional snRNPs, and not just by U1 snRNP.

Since independent depletion of U1, U2, U4 and U6 snRNPs all resulted in a dramatic switch to use of the intronic poly(A) site, it is unlikely that the mechanism of regulation is due to a steric inhibition of the poly(A) site by the snRNPs. This is especially true for the U4 and U6 snRNP results, as these snRNPs do not directly bind the mRNA. These results more closely support a mechanism of splicing out-competing intronic poly(A) site use. I have not yet tested whether NR3C1’s cryptic poly(A) site is regulated by SSA treatment, but it will be interesting to examine if inherent splicing is the mechanism regulating NR3C1’s intronic poly(A) site. Additionally, the current NR3C1 mini-gene could be optimized and is being redesigned to be shorter, containing only the essential cis-elements.

5.2G Attenuation of Splicing by Alternative Mechanisms Increases Intronic Poly(A) Site Use

Although there appear to be classes of genes that are distinctly impacted by snRNP depletion and SSA treatment as described in sections 5.2C and 5.2D, the NR3C1 intronic poly(A) site is regulated by independent depletion of all snRNPs examined. To determine if there are other intronic poly(A) sites that are similarly impacted by snRNP depletion and SSA treatment, I further analyzed the poly(A)+ RNA-seq data by visualizing the data on the UCSC genome browser.

I focused specifically on the list of previously reported poly(A) sites preferentially protected by U1 snRNP that were termed premature cryptic poly(A) sites (PCPAs) (Kaida et al., 2010). I found numerous examples where these PCPAs were indiscriminately increased with snRNP depletion and SSA treatment. In all the gene examples shown, I compared the number of mapped poly(A)+ RNA-seq reads at each intronic poly(A) site to the number of mapped reads at the 3’ UTR, as an internal control. FTH1 shows increased usage in all conditions, with U6 snRNP depletion having the most reads at the intronic poly(A) site and SSA treatment resulting in total use of upstream poly(A) sites (Figure 5-7A). SLC25A3 has increased poly(A) site usage with U1, U4, U6 depletion and SSA treatment, but it does not appear to be increased with U2 snRNP depletion (Figure 5-7B). PMPCA shows a dramatic increase in use of the intronic poly(A) site with U1 snRNP depletion and a moderate increase with U4 and U6 snRNP depletion, but no change with U2 depletion or SSA treatment (Figure 5-7C).

To independently validate these results, I used biological replicate samples to perform 3’ RACE. As described above, nested primers near each poly(A) site were used to amplify single-stranded poly(A)+ cDNA (Kaida et al., 2010). The PCR products were visualized on an agarose gel and the EtBr signals were quantified. The 3’ RACE results were further validated with a –RT control, which revealed no detectable product. The total poly(A) mRNA for each gene was calculated by adding the intronic and 3’ poly(A) site signals, and the percent of intronic poly(A) site usage was calculated. FTH1 was robustly validated, showing no usage of the intronic poly(A) site in the control and 50-90% usage when splicing was attenuated (Figure 5-8A). SLC25A3 was also confirmed to show a 15-37% increase in use of the intronic poly(A) site, indiscriminate of how splicing was reduced (Figure 5-8B). Additionally, PMPCA was validated, revealing a 19-38% increase in use of the intronic poly(A) site (Figure 5-8C).

These results reveal that previously characterized U1 snRNP-protected intronic poly(A) sites are, in fact, regulated by other snRNPs or potentially by splicing, as shown by SSA treatment. This provides further evidence that a competitive balance between splicing and 3’-end processing regulates intronic poly(A) sites.

5.2H snRNPs Regulate APA on Integrator Genes in a Potential Negative-Feedback Loop

Since previously identified intronic poly(A) sites are globally regulated by attenuation of splicing, I next wanted to identify novel intronic poly(A) sites regulated by splicing. To do this, I intersected the lists of significantly increased poly(A) sites for all conditions to find poly(A) sites significantly increased when each snRNP is depleted and with SSA treatment.
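The list intersection described above amounts to keeping only the poly(A) sites that appear in every per-condition list. A minimal sketch, assuming each site is represented by a hashable identifier (the identifiers and condition labels used in any call are hypothetical):

```python
def shared_increased_sites(increased_by_condition):
    """Return the poly(A) sites significantly increased in every condition.

    `increased_by_condition` maps a condition label (for example "U1" or
    "SSA") to an iterable of site identifiers for that condition.
    """
    site_sets = [set(sites) for sites in increased_by_condition.values()]
    if not site_sets:
        return set()
    return set.intersection(*site_sets)
```

A site is retained only if it is present in all supplied lists, so adding a condition can only shrink or preserve the result.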

The top hit of this analysis was INTS10, which encodes a subunit of the Integrator complex that is responsible for processing 3’-ends of snRNAs. When I examine the poly(A) site usage of INTS10 on the UCSC genome browser, I observe that attenuation of splicing by snRNP depletion and by SSA treatment results in an increased use of an intronic poly(A) site that is not utilized in the control (Figure 5-9A). Interestingly, I found numerous INTS genes had a shift in poly(A) site usage towards the 5’-end of the gene when splicing was diminished. For example, INTS7 has increased usage of an intronic poly(A) site with U1, U4, and U6 snRNP depletion and a shift towards an upstream intronic poly(A) site with SSA treatment (Figure 5-9B).

I validated and quantified the use of these intronic poly(A) sites on biological replicate samples by 3’ RACE of cDNA, as described above. I confirmed that the INTS10 intronic poly(A) site has a 29-37% increase with depletion of U1, U4 and U6 snRNPs and a 90% increase with SSA treatment (Figure 5-9C). However, I was not able to validate an increased use of the intronic poly(A) site with U2 snRNP depletion. INTS7 was further validated to have a 12-14% increase in use of the intronic poly(A) site with U1, U4 and U6 snRNP depletion relative to the control (Figure 5-9D).

These results implicate a role of functional snRNPs and splicing in a negative feedback loop regulating expression of Integrator. It is possible that as snRNAs are depleted, and splicing is hindered, the snRNAs are further down regulated by aberrant Integrator expression due to APA on INTS genes.


5.3 Discussion

In this chapter, I showed that alternative poly(A) site use within introns is regulated by snRNPs and splicing. The relevant results are: (i) Attenuation of splicing, either by snRNP depletion or SSA treatment, results in aberrant poly(A) site choice within gene bodies. (ii) Functional snRNPs and efficient splicing protect intronic poly(A) sites from being utilized. (iii) snRNPs appear to regulate expression of the Integrator subunits in a potential negative feedback loop.

It has been previously reported that U1 snRNP has a unique function in protecting cryptic poly(A) sites that is independent of its role in splicing, as neither U2 snRNP depletion nor SSA treatment impacted use of intronic poly(A) sites (Kaida et al., 2010). However, a recent report that confirmed the importance of U1 snRNP in regulating poly(A) site choice also revealed that numerous splicing factors and 3’-end processing factors regulate poly(A) site choice within gene bodies and on the 3’ UTR (Li et al., 2015).

The results in this chapter reveal that U1 snRNP is not unique in its ability to prevent use of intronic poly(A) sites, as depletion of all snRNPs examined and inhibition of splicing with SSA resulted in increased usage of poly(A) sites previously characterized as being specifically and solely regulated by U1 snRNP (Figures 5-7, 5-8). I propose that there is a balance of splicing and 3’-end processing efficiencies that regulates intronic poly(A) site use (Figure 5-10). Under normal conditions, splicing is often the stronger reaction, out-competing use of intronic poly(A) sites (Figure 5-10A). However, when the splicing reaction is weakened or abolished, this provides a competitive advantage for the intronic poly(A) site to be processed (Figure 5-10B).

The mechanism of how splicing and snRNPs are regulating intronic poly(A) sites remains unclear. It is possible that a weak 5’ splice site is affected more by U1 snRNP depletion and a weak 3’ splice site is impacted more by U2 snRNP depletion. Additionally, it is possible that depletion of functional snRNPs destabilizes other regulatory factors that bind the pre-mRNA to enhance or inhibit 3’-end processing. In future, it will be interesting to further characterize the cis-elements surrounding intronic poly(A) sites most altered by a weakening of splicing. In summary, this report provides evidence that splicing is a major regulator of alternative poly(A) site choice. It is likely that intronic poly(A) sites are in a direct competition with the splicing reaction that removes them from the mRNA, and when the splicing reaction is no longer favored, the intronic poly(A) site is more likely to be utilized.


Figure 5-1: snRNA ASOs attenuate splicing. (A) PCR of single-stranded poly(A)+ cDNA on the MYC gene with primers specific to the first exon-intron junction (red) and last exon (blue) allows amplification of unspliced (exon-intron) and constitutive (exon) MYC with snRNP depletion. The PCR products were visualized on a 1% agarose gel. (B) The PCR products of MYC were quantified by densitometry. The ratio of the unspliced (exon-intron) product to the total amount of MYC product (exon-intron plus exon) was determined and then divided by the control ratio to get the fold change in unspliced product. There are no error bars as this represents one gel.
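The fold-change calculation described in panel B can be sketched as follows. The function names and any intensities passed to them are hypothetical and serve only to make the arithmetic explicit.

```python
def unspliced_fraction(unspliced, spliced):
    """Fraction of total product (exon-intron plus exon) that is unspliced."""
    total = unspliced + spliced
    if total <= 0:
        raise ValueError("total signal must be positive")
    return unspliced / total


def fold_change_unspliced(unspliced, spliced, ctrl_unspliced, ctrl_spliced):
    """Unspliced fraction in a treated lane relative to the control lane."""
    control = unspliced_fraction(ctrl_unspliced, ctrl_spliced)
    if control == 0:
        raise ValueError("control lane has no unspliced signal")
    return unspliced_fraction(unspliced, spliced) / control
```

With hypothetical intensities of 50 and 50 in a treated lane and 25 and 75 in the control lane, the unspliced fractions are 0.5 and 0.25, giving a 2-fold change.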


Figure 5-2: Reduction of splicing alters poly(A) site use within gene bodies. (A) The poly(A)+ RNA-seq libraries are detailed, including the number of processed reads, the percentage of mapped reads, and the total number of mapped reads. (B) Significantly (FDR ≤ 0.05) altered poly(A) sites were determined by comparing cleavage sites aligned to the hg19 human genome for each treatment condition relative to the control. The number of significantly altered poly(A) sites in each condition relative to the control is reported. The altered poly(A) sites were further split into two categories representing poly(A) sites with increased use or decreased use relative to the control. (C) The percentages of increased and decreased poly(A) sites out of the total number of altered poly(A) sites are shown. Note that U1, U2, U4 and U6 ASO treatment results in increased use of approximately 40% of affected poly(A) sites (green) while SSA treatment results in increased use of 66% of affected poly(A) sites (red).


Figure 5-3: Dendrogram of poly(A) site use in different conditions. The cleavage sites from the poly(A)+ RNA-seq data were aligned to hg19 and grouped into 20 bp increments, and the number of cleavage sites in each group was compared among the treated and control conditions to calculate the rho coefficient. The rho values were clustered using the R function hclust to generate a dendrogram. Note that U2 and SSA share similar poly(A) site usage and are most different from the control sample. Also, U1, U4 and U6 have similar poly(A) site usage.
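The binning and correlation steps in this legend can be illustrated with a minimal sketch. It assumes that the rho coefficient is Spearman's rank correlation, which the legend does not state explicitly, and it covers only the pairwise comparison, not the hclust clustering. All function names are hypothetical.

```python
import math
from collections import Counter


def bin_counts(cleavage_sites, bin_size=20):
    """Count cleavage sites per (chromosome, bin) using fixed-width bins."""
    return Counter((chrom, pos // bin_size) for chrom, pos in cleavage_sites)


def average_ranks(values):
    """Rank values from 1, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant counts")
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(counts_a, counts_b):
    """Spearman's rho between two samples over the union of their bins."""
    bins = sorted(set(counts_a) | set(counts_b))
    a = [counts_a.get(b, 0) for b in bins]
    b = [counts_b.get(b, 0) for b in bins]
    return pearson(average_ranks(a), average_ranks(b))
```

Computing `spearman_rho` for every pair of conditions gives a similarity matrix, and one minus rho is one possible distance to hand to a hierarchical clustering routine.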


Figure 5-4: U1, U4, and U6 preferentially protect poly(A) sites near the TSS. The density of significantly (FDR ≤ 0.05) increased and decreased poly(A) sites relative to gene length was determined for each condition. ‘0.0’ represents the most proximal TSS and ‘1.0’ represents the most distal poly(A) site on a given gene. (A-C) Poly(A) sites with increased usage with U1 (A), U4 (B), and U6 (C) snRNP depletion are found near the TSS as indicated by the red arrows. (D, E) Increased poly(A) sites with U2 snRNP depletion (D) and SSA treatment (E) did not exhibit a bias towards the TSS as indicated by red downward facing arrows.


Figure 5-5: Splicing preferentially regulates use of intronic poly(A) sites. (A) Poly(A) sites with significantly (FDR ≤ 0.05) increased usage were intersected with the hg19 intron BED file to calculate the number of affected poly(A) sites located within introns. The percent of significantly increased poly(A) sites located within introns relative to the total number of altered poly(A) sites is illustrated by a bar chart. Note that > 81% of increased poly(A) sites in each condition are located within an intron. (B) The percentage of significantly decreased poly(A) sites located within introns was calculated as described in A. Note that no error bars are included as there are no biological replicates of the poly(A)+ RNA-seq data.


Figure 5-6: snRNPs regulate the intronic poly(A) site on an NR3C1 mini-gene. (A) The poly(A)+ RNA-seq cleavage sites were mapped to hg19 and visualized on the UCSC genome browser for NR3C1. The image is zoomed in on a previously reported intronic poly(A) site (PAS) in the second intron of NR3C1. The region of NR3C1 visualized on the UCSC browser is shown by the blue box and blue arrow. (B) The NR3C1 mini-gene construct is shown in blue. The blue boxes and dashed lines on the NR3C1 full-length gene in A indicate the cloned sections of NR3C1 used in the mini-gene construct. The forward nested primers align to exon 2 and are indicated by the double arrow. Note that there are two poly(A) sites in the construct, one in the intron and one located after the 3’ UTR that is part of the pcDNA3 plasmid. (C) The 3’ RACE of the NR3C1 mini-gene is visualized on a 1% agarose gel. The two PCR products represent use of the intronic poly(A) site (top band) and use of the 3’-end poly(A) site that is coupled with splicing of the intron (bottom band). (D) The EtBr signal on the agarose gel in C was quantified and the percent use of the intronic poly(A) site relative to total NR3C1 was determined. Note there are no error bars as the data represents only the one gel shown.


Figure 5-7: Attenuation of splicing by alternative mechanisms increases intronic poly(A) site use. The cleavage site of the poly(A)+ RNA-seq data was mapped to hg19 and visualized on the UCSC genome browser at the previously reported intronic poly(A) site (red) and at the 3’-end poly(A) site (green) which are shown relative to their position on the full length gene for each example. (A) The intronic poly(A) site (PAS) of FTH1 reveals increased usage in all treated conditions relative to the control. (B) SLC25A3 has increased usage of its intronic poly(A) site relative to the control for U1, U4 and U6 snRNP depletion as well as SSA treatment. (C) There is increased use of PMPCA’s intronic poly(A) site when U1, U4 and U6 snRNPs are depleted.


Figure 5-8: Validation of intronic poly(A) site use by 3’ RACE. 3’ RACE of biological replicate samples with nested PCR primers near the intronic poly(A) site (PAS) (red) or 3’-end poly(A) site (green) is shown. 3’ RACE products were visualized on agarose gels and the EtBr signal was quantified. (A) Depletion of U1, U2, U4 and U6 snRNPs as well as SSA treatment all resulted in a dramatic increase in usage of FTH1’s intronic poly(A) site. (B) The use of SLC25A3’s intronic poly(A) site was validated for all conditions. (C) PMPCA was also validated to show that depletion of snRNPs and SSA treatment result in increased use of the intronic poly(A) site. The bar graph represents the single gel shown.


Figure 5-9: snRNPs regulate APA on Integrator genes. (A) The cleavage site of the poly(A)+ RNA-seq data was mapped to hg19 and visualized on the UCSC genome browser for the entire INTS10 gene. Note the intronic poly(A) site (arrows) that is used in all conditions except the control. (B) The poly(A)-seq data for the INTS7 gene is visualized on the genome browser and there is increased usage of an intronic poly(A) site for U1, U4 and U6 snRNP depletion. Note that there is use of a novel poly(A) site even further upstream with SSA treatment. (C) 3’ RACE of biological replicates confirms the use of the intronic poly(A) site on INTS10 except with U2 snRNP depletion. The EtBr signal on the agarose gel was quantified to determine the percent of intronic poly(A) site usage. (D) 3’ RACE of INTS7 validated an increased use of the intronic poly(A) site with U1, U4 and U6 snRNP depletion. Note that the EtBr quantification only represents the single gel shown for each gene.


Figure 5-10: Model of intronic poly(A) site regulation by splicing. (A) In normal conditions, intronic poly(A) sites are likely removed by splicing before they can be properly processed. (B) Alternatively, when splicing is inhibited either through snRNP depletion or with SSA, the intronic poly(A) site is processed as it is present on the transcript.


CHAPTER VI

CONCLUSIONS

The majority of mammalian genes have alternate mRNA isoforms due to alternative polyadenylation (APA), but how poly(A) site choice is regulated is not fully understood. Ser2P-CTD is most strongly linked to 3’-end processing as inhibition of Ser2-CTD phosphorylation results in inefficient 3’-end processing in mammalian cells (Gu et al., 2013). However, the relationship between Ser2P and poly(A) site choice has not yet been fully characterized. For instance, it is unclear if accumulation of Ser2P promotes 3’-end processing or if attenuation of Ser2P affects 3’-end processing. Additionally, pol II pausing is known to be an important regulating event at 5’-ends of genes but the pause at 3’-ends of genes is not fully understood (Chapter 1.2A). It is possible that the 3’-end pause is critical for regulation of poly(A) site use. Lastly, there are numerous lines of evidence that functionally and biochemically connect splicing and 3’-end processing (Chapter 1.3B), but it is unclear how splicing or splicing factors might regulate poly(A) site choice. This study provides further insight into the regulation of APA through a CTD-ligand that modifies Ser2P, pol II pausing, and splicing.

In Chapter 3, I identify a novel function for a CTD-ligand in stabilizing Ser2P on chromatin, and I find that attenuation of Ser2P leads to aberrant 3’ UTR poly(A) site choice. As ChIP-seq is not inherently quantitative, in the future it will be important to generate Ser2P ChIPs using a yeast spike-in control to definitively quantitate observed changes in chromatin-bound Ser2P. However, a recent finding that yeast Spt6 stabilizes global Ser2P-CTD levels within the cell (Dronamraju and Strahl, 2014) supports my results that Spt6 promotes Ser2P at 3’-ends of genes.

The attenuation of Ser2P on chromatin due to Spt6 knockdown (Figures 3-5, 3-6) provided an opportunity to investigate the effect of reduced Ser2P on poly(A) site choice. Since hyperphosphorylation of Ser2-CTD at 5’-ends of genes resulted in a proximal shift in poly(A) site usage (Schwartz et al., 2012), it was my hypothesis that a decrease in Ser2P at 3’-ends would result in a downstream shift in poly(A) site choice, as a decrease in Ser2P could potentially result in a decrease in processing factor recruitment to chromatin. I found nearly 1,200 genes with significant poly(A) site shifts at their 3’ UTR in correlation with attenuation of Ser2P at 3’-ends due to Spt6 knockdown (Figure 3-9, Appendix B). Notably, only about two-thirds were a downstream shift, which is substantial, but it indicates that the extent of Ser2P does not robustly predict poly(A) site choice.

In future, it would be interesting to further characterize the cis-elements surrounding the alternatively used poly(A) sites when Ser2P is attenuated. It is known that an important determinant of poly(A) site use is strength of the site, which is determined by conservation of the hexanucleotide sequence as well as the presence of the DSE and USE. For example, if a 3’ UTR has two poly(A) sites where the proximal poly(A) site is stronger than the distal poly(A) site, it is possible that under normal conditions both sites are used as the proximal poly(A) site is strong and the weak distal poly(A) site is balanced by a higher concentration of processing factor in association with a high density of Ser2P (Figure 6-1A). In this example, it would be my prediction that a decrease in Ser2P would favor the proximal poly(A) site as the weak distal poly(A) site no longer has the advantage of increased processing factor in close proximity (Figure 6-1B). Alternatively, if there were two weak poly(A) sites on the 3’ UTR, I would predict that under normal conditions the distal poly(A) site is favored as it is in closer proximity to Ser2P accumulation (Figure 6-1C). However, a reduction in Ser2P could result in the proximal poly(A) site being favored as the concentration of processing factor no longer favors the distal site, and the regulation becomes kinetic, resulting in the first encountered poly(A) site being processed first (Figure 6-1D). In principle, there is evidence to support regulation of poly(A) sites by the CTD-code, but it is likely multifaceted and needs further characterization.

In Chapter 4, I examined coupling of poly(A) site use to pol II pausing, Ser2P and recruitment of processing factor on the IgM gene. I found that pol II pausing and 3’ processing factor recruitment are strongly linked with poly(A) site use, but Ser2P is not coupled with processing of a poly(A) site (Fusby et al., 2015). Pausing of pol II both at the 5’- and 3’-ends of genes is thought to be an important regulator of transcription and mRNA processing (Levine, 2011; Liu et al., 2015), but the signals that regulate pol II pausing at the 3’-end are not well defined. It is possible that DNA elements signal pol II pausing and the accumulation of pol II then signals 3’-end processing. Alternatively, active processing of a poly(A) site might signal pol II to pause near the site of processing. I found the poly(A) site is necessary for pol II pausing, but its presence is not sufficient to induce pausing. This suggests that pol II pausing is a result of 3’-end processing on the pre-mRNA and not signaled by a specific DNA cis-element.

Interestingly, I did not find the same correlation with poly(A) site processing and Ser2P accumulation. Rather, I found that Ser2-CTD was universally hyperphosphorylated regardless of whether the poly(A) site was being utilized (WT), not utilized due to splicing (5’-SP), or not present (pA21) (Figure 4-4). Additionally, I found that the 3’-end processing factor, CstF77, is associated only with actively processed poly(A) sites, regardless of Ser2P accumulation when the poly(A) site is not processed (Figure 4-7). All together, these results suggest that processing of a poly(A) site promotes pol II pausing and processing factor recruitment uncoupled from Ser2P.

Similarly, analysis of human β-globin revealed that mutation of the sole 3’ UTR poly(A) site disrupts canonical pol II pausing and shifts the pol II pause 1-2 kb downstream (Davidson et al., 2014; Kim et al., 2011). However, mutation of the β-globin poly(A) site also disrupts 3’-end accumulation of Ser2P, and depletion of CDK12 impaired Ser2P, processing factor recruitment and 3’-end processing (Davidson et al., 2014). These contradicting results suggest that Ser2P might not regulate all poly(A) sites the same, and notably the genes are different as IgM has two alternatively regulated poly(A) sites while β-globin has only one poly(A) site.

In Chapter 5, I further investigated how splicing and splicing factors regulate alternative poly(A) sites. It has been proposed that U1 snRNP functions outside its canonical role in splicing to regulate poly(A) site use by binding to pre-mRNA at U1 binding sites and sterically blocking processing of the poly(A) site (Kaida et al., 2010). I found that U2, U4, and U6 snRNPs, in addition to U1 snRNP, regulate processing of poly(A) sites within gene bodies (Figure 5-2, Appendix G). Most interestingly, I found that poly(A) sites are also regulated by inhibition of splicing with spliceostatin A (SSA) (Figure 5-2, Appendix G), which allows spliceosome formation on pre-mRNAs but inhibits the catalytic formation of the spliceosome. In fact, I showed that poly(A) sites previously categorized as U1 snRNP-protected could be regulated by depletion of other snRNPs and by inhibition of splicing through SSA (Figure 5-6, 5-7). These results suggest that splicing is a critical regulator of poly(A) site choice within gene bodies.

Interestingly, poly(A) sites that are down regulated by splicing and the presence of functional snRNPs are preferentially located within introns. However, different poly(A) sites were regulated with different treatments. U1, U4 and U6 snRNP depletion impacted similar poly(A) sites but had the smallest difference from control (Figure 5-3, 5-4). In contrast, U2 snRNP depletion and SSA treatment, which targets a U2 snRNP component, affected similar poly(A) sites that were quite distinct from the control (Figure 5-3, 5-4). This suggests that although splicing significantly influences intronic poly(A) sites, there might be multiple mechanisms of regulation that function on distinct classes of poly(A) sites.

In future, it would be interesting to further characterize alternative mechanisms of splicing regulation on poly(A) site usage. For example, are poly(A) sites that are up regulated with U1, U4 and U6 snRNP depletion correlated with weak 5’ splice sites, and are poly(A) sites that are up regulated with U2 snRNP depletion and SSA treatment correlated with weak 3’ splice sites? Additionally, it would be interesting to know if U1-, U4-, U6-affected poly(A) sites are closer to the 5’ splice site and if U2/SSA-affected poly(A) sites are closer to the 3’ splice site. This would suggest a model where intronic poly(A) sites are regulated by proximity and efficiency of splicing.

In summary, this thesis provides further characterization of the signals and factors that regulate poly(A) site choice and how alternative poly(A) site choice, in turn, signals co-transcriptional events. Although not directly addressed in this thesis, transcription elongation rate is likely to be an important determinant in poly(A) site choice. Additionally, it has been shown that high poly(A) site use is correlated with an increase in downstream nucleosomes and an increase in pol II accumulation (Khaladkar et al., 2011). Interestingly, the increase in downstream nucleosomes is correlated with a more stable mRNA structure around the poly(A) site that better exposes critical cis-elements. In fact, mRNA structure has been shown to be a critical determinant of polyadenylation in vitro (Graveley et al., 1996), and RNA editing is a direct result of mRNA structure and the adenosine-rich composition near the poly(A) site is a likely target for RNA-editing. In future, it will be interesting to further investigate pol II elongation, the chromatin state, and RNA structure in relation to poly(A) site choice.


Figure 6-1: Alternative models of 3’ UTR poly(A) site choice. (A) A strong poly(A) site followed by a weak site are likely equally used in normal conditions as accumulation of Ser2P at the 3’-end of a gene increases the concentration of processing factor localized near the weaker poly(A) site. (B) When Ser2P is decreased, the stronger proximal poly(A) site is favored and the weaker poly(A) site is not used as much because there is decreased concentration of processing factor. (C) When there are two weak 3’ UTR poly(A) sites, it is likely that the downstream poly(A) site is favored as it is in closer proximity to localized processing factor. (D) When Ser2P is decreased, it is possible that both poly(A) sites have equal processing efficiency but the proximal site could be favored simply because it is encountered first and processed first.


REFERENCES

Adesnik, M., Salditt, M., Thomas, W., and Darnell, J.E. (1972). Evidence that all messenger RNA molecules (except histone messenger RNA) contain Poly (A) sequences and that the Poly(A) has a nuclear function. J Mol Biol 71, 21-30.

Ahn, S.H., Kim, M., and Buratowski, S. (2004). Phosphorylation of serine 2 within the RNA polymerase II C-terminal domain couples transcription and 3' end processing. Mol Cell 13, 67-76.

Alkan, S.A., Martincic, K., and Milcarek, C. (2006). The hnRNPs F and H2 bind to similar sequences to influence gene expression. Biochem J 393, 361-371.

Almada, A.E., Wu, X., Kriz, A.J., Burge, C.B., and Sharp, P.A. (2013). Promoter directionality is controlled by U1 snRNP and polyadenylation signals. Nature 499, 360-363.

Anamika, K., Gyenis, A., Poidevin, L., Poch, O., and Tora, L. (2012). RNA polymerase II pausing downstream of core histone genes is different from genes producing polyadenylated transcripts. PLoS One 7, e38769.

Andrulis, E.D., Guzman, E., Doring, P., Werner, J., and Lis, J.T. (2000). High-resolution localization of Drosophila Spt5 and Spt6 at heat shock genes in vivo: roles in promoter proximal pausing and transcription elongation. Genes Dev 14, 2635-2649.

Arhin, G.K., Boots, M., Bagga, P.S., Milcarek, C., and Wilusz, J. (2002). Downstream sequence elements with different affinities for the hnRNP H/H' protein influence the processing efficiency of mammalian polyadenylation signals. Nucleic Acids Res 30, 1842-1850.

Ashe, M.P., Griffin, P., James, W., and Proudfoot, N.J. (1995). Poly(A) site selection in the HIV-1 provirus: inhibition of promoter-proximal polyadenylation by the downstream major splice donor site. Genes Dev 9, 3008-3025.

Aviv, H., and Leder, P. (1972). Purification of biologically active globin messenger RNA by chromatography on oligothymidylic acid-cellulose. Proc Natl Acad Sci U S A 69, 1408-1412.

Bartkowiak, B., Liu, P., Phatnani, H.P., Fuda, N.J., Cooper, J.J., Price, D.H., Adelman, K., Lis, J.T., and Greenleaf, A.L. (2010). CDK12 is a transcription elongation-associated CTD kinase, the metazoan ortholog of yeast Ctk1. Genes Dev 24, 2303-2316.

Baserga, S.J., and Steitz, J.A. (1993). The RNA World (Cold Spring Harbor Laboratory Press).

Bataille, A.R., Jeronimo, C., Jacques, P.E., Laramee, L., Fortin, M.E., Forest, A., Bergeron, M., Hanes, S.D., and Robert, F. (2012). A universal RNA polymerase II CTD cycle is orchestrated by complex interplays between kinase, phosphatase, and isomerase enzymes along genes. Mol Cell 45, 158-170.

Bava, F.A., Eliscovich, C., Ferreira, P.G., Minana, B., Ben-Dov, C., Guigo, R., Valcarcel, J., and Mendez, R. (2013). CPEB1 coordinates alternative 3'-UTR formation with translational regulation. Nature 495, 121-125.

Beaudoing, E., Freier, S., Wyatt, J.R., Claverie, J.M., and Gautheret, D. (2000). Patterns of variant polyadenylation signal usage in human genes. Genome Res 10, 1001-1010.

Bentley, D.L. (2005). Rules of engagement: co-transcriptional recruitment of pre-mRNA processing factors. Curr Opin Cell Biol 17, 251-256.

Bentley, D.L., and Groudine, M. (1986). A block to elongation is largely responsible for decreased transcription of c-myc in differentiated HL60 cells. Nature 321, 702-706.

Berg, M.G., Singh, L.N., Younis, I., Liu, Q., Pinto, A.M., Kaida, D., Zhang, Z., Cho, S., Sherrill-Mix, S., Wan, L., et al. (2012). U1 snRNP determines mRNA length and regulates isoform expression. Cell 150, 53-64.

Bienroth, S., Wahle, E., Suter-Crazzolara, C., and Keller, W. (1991). Purification of the cleavage and polyadenylation factor involved in the 3'-processing of messenger RNA precursors. J Biol Chem 266, 19768-19776.

Birnboim, H.C., Mitchel, R.E., and Straus, N.A. (1973). Analysis of long pyrimidine polynucleotides in HeLa cell nuclear DNA: absence of polydeoxythymidylate. Proc Natl Acad Sci U S A 70, 2189-2192.

Blazek, D., Kohoutek, J., Bartholomeeusen, K., Johansen, E., Hulinkova, P., Luo, Z., Cimermancic, P., Ule, J., and Peterlin, B.M. (2011). The Cyclin K/Cdk12 complex maintains genomic stability via regulation of expression of DNA damage response genes. Genes Dev 25, 2158-2172.

Boehm, A.K., Saunders, A., Werner, J., and Lis, J.T. (2003). Transcription factor and polymerase recruitment, modification, and movement on dhsp70 in vivo in the minutes following heat shock. Mol Cell Biol 23, 7628-7637.

Boelens, W.C., Jansen, E.J., van Venrooij, W.J., Stripecke, R., Mattaj, I.W., and Gunderson, S.I. (1993). The human U1 snRNP-specific U1A protein inhibits polyadenylation of its own pre-mRNA. Cell 72, 881-892.

Bortvin, A., and Winston, F. (1996). Evidence that Spt6p controls chromatin structure by a direct interaction with histones. Science 272, 1473-1476.

Bourquin, J.P., Stagljar, I., Meier, P., Moosmann, P., Silke, J., Baechi, T., Georgiev, O., and Schaffner, W. (1997). A serine/arginine-rich nuclear matrix cyclophilin interacts with the C-terminal domain of RNA polymerase II. Nucleic Acids Res 25, 2055-2061.

Bowman, E.A., Bowman, C.R., Ahn, J.H., and Kelly, W.G. (2013). Phosphorylation of RNA polymerase II is independent of P-TEFb in the C. elegans germline. Development 140, 3703-3713.

Bowman, E.A., and Kelly, W.G. (2014). RNA polymerase II transcription elongation and Pol II CTD Ser2 phosphorylation: A tail of two kinases. Nucleus 5, 224-236.

Brannan, K., Kim, H., Erickson, B., Glover-Cutter, K., Kim, S., Fong, N., Kiemele, L., Hansen, K., Davis, R., Lykke-Andersen, J., et al. (2012). mRNA decapping factors and the exonuclease Xrn2 function in widespread premature termination of RNA polymerase II transcription. Mol Cell 46, 311-324.

Brow, D.A. (2002). Allosteric cascade of spliceosome activation. Annu Rev Genet 36, 333-360.

Brown, K.M., and Gilmartin, G.M. (2003). A mechanism for the regulation of pre-mRNA 3' processing by human cleavage factor Im. Mol Cell 12, 1467-1476.

Brown, S.J., Stoilov, P., and Xing, Y. (2012). Chromatin and epigenetic regulation of pre-mRNA processing. Hum Mol Genet 21, R90-96.

Bruce, S.R., Dingle, R.W., and Peterson, M.L. (2003). B-cell and plasma-cell splicing differences: a potential role in regulated immunoglobulin RNA processing. RNA 9, 1264-1273.

Buratowski, S. (2009). Progression through the RNA polymerase II CTD cycle. Mol Cell 36, 541-546.

Burugula, B.B., Jeronimo, C., Pathak, R., Jones, J.W., Robert, F., and Govind, C.K. (2014). Histone deacetylases and phosphorylated polymerase II C-terminal domain recruit Spt6 for cotranscriptional histone reassembly. Mol Cell Biol 34, 4115-4129.

Carswell, S., and Alwine, J.C. (1989). Efficiency of utilization of the simian virus 40 late polyadenylation site: effects of upstream sequences. Mol Cell Biol 9, 4248-4258.

Chan, S., Choi, E.A., and Shi, Y. (2011). Pre-mRNA 3'-end processing complex assembly and function. Wiley Interdiscip Rev RNA 2, 321-335.

Chan, S.L., Huppertz, I., Yao, C., Weng, L., Moresco, J.J., Yates, J.R., 3rd, Ule, J., Manley, J.L., and Shi, Y. (2014). CPSF30 and Wdr33 directly bind to AAUAAA in mammalian mRNA 3' processing. Genes Dev 28, 2370-2380.

Chapman, R.D., Heidemann, M., Albert, T.K., Mailhammer, R., Flatley, A., Meisterernst, M., Kremmer, E., and Eick, D. (2007). Transcribing RNA polymerase II is phosphorylated at CTD residue serine-7. Science 318, 1780-1782.

Cheung, V., Chua, G., Batada, N.N., Landry, C.R., Michnick, S.W., Hughes, T.R., and Winston, F. (2008). Chromatin- and transcription-related factors repress transcription from within coding regions throughout the Saccharomyces cerevisiae genome. PLoS Biol 6, e277.

Cho, E.J., Kobor, M.S., Kim, M., Greenblatt, J., and Buratowski, S. (2001). Opposing effects of Ctk1 kinase and Fcp1 phosphatase at Ser 2 of the RNA polymerase II C-terminal domain. Genes Dev 15, 3319-3329.

Cho, H., Orphanides, G., Sun, X., Yang, X.J., Ogryzko, V., Lees, E., Nakatani, Y., and Reinberg, D. (1998). A human RNA polymerase II complex containing factors that modify chromatin structure. Mol Cell Biol 18, 5355-5363.

Christofori, G., and Keller, W. (1988). 3' cleavage and polyadenylation of mRNA precursors in vitro requires a poly(A) polymerase, a cleavage factor, and a snRNP. Cell 54, 875-889.

Chuvpilo, S., Zimmer, M., Kerstan, A., Glockner, J., Avots, A., Escher, C., Fischer, C., Inashkina, I., Jankevics, E., Berberich-Siebelt, F., et al. (1999). Alternative polyadenylation events contribute to the induction of NF-ATc in effector T cells. Immunity 10, 261-269.

Clark-Adams, C.D., and Winston, F. (1987). The SPT6 gene is essential for growth and is required for delta-mediated transcription in Saccharomyces cerevisiae. Mol Cell Biol 7, 679-686.

Colgan, D.F., and Manley, J.L. (1997). Mechanism and regulation of mRNA polyadenylation. Genes Dev 11, 2755-2766.

Compagnone-Post, P.A., and Osley, M.A. (1996). Mutations in the SPT4, SPT5, and SPT6 genes alter transcription of a subset of histone genes in Saccharomyces cerevisiae. Genetics 143, 1543-1554.

Cooper, G.M. (2010). The Cell: A Molecular Approach, 2nd edn (Sinauer Associates).

Core, L.J., and Lis, J.T. (2008). Transcription regulation through promoter-proximal pausing of RNA polymerase II. Science 319, 1791-1792.

Czudnochowski, N., Bosken, C.A., and Geyer, M. (2012). Serine-7 but not serine-5 phosphorylation primes RNA polymerase II CTD for P-TEFb recognition. Nat Commun 3, 842.

Danckwardt, S., Hentze, M.W., and Kulozik, A.E. (2008). 3' end mRNA processing: molecular mechanisms and implications for health and disease. EMBO J 27, 482-498.

Das, R., Yu, J., Zhang, Z., Gygi, M.P., Krainer, A.R., Gygi, S.P., and Reed, R. (2007). SR proteins function in coupling RNAP II transcription to pre-mRNA splicing. Mol Cell 26, 867-881.

David, C.J., Boyne, A.R., Millhouse, S.R., and Manley, J.L. (2011). The RNA polymerase II C-terminal domain promotes splicing activation through recruitment of a U2AF65-Prp19 complex. Genes Dev 25, 972-983.

Davidson, L., Muniz, L., and West, S. (2014). 3' end formation of pre-mRNA and phosphorylation of Ser2 on the RNA polymerase II CTD are reciprocally coupled in human cells. Genes Dev 28, 342-356.

Davis, R., and Shi, Y. (2014). The polyadenylation code: a unified model for the regulation of mRNA alternative polyadenylation. J Zhejiang Univ Sci B 15, 429-437.

Dengl, S., Mayer, A., Sun, M., and Cramer, P. (2009). Structure and in vivo requirement of the yeast Spt6 SH2 domain. J Mol Biol 389, 211-225.

Denome, R.M., and Cole, C.N. (1988). Patterns of polyadenylation site selection in gene constructs containing multiple polyadenylation signals. Mol Cell Biol 8, 4829-4839.

Dermody, J.L., Dreyfuss, J.M., Villen, J., Ogundipe, B., Gygi, S.P., Park, P.J., Ponticelli, A.S., Moore, C.L., Buratowski, S., and Bucheli, M.E. (2008). Unphosphorylated SR-like protein Npl3 stimulates RNA polymerase II elongation. PLoS One 3, e3273.

Derti, A., Garrett-Engele, P., Macisaac, K.D., Stevens, R.C., Sriram, S., Chen, R., Rohl, C.A., Johnson, J.M., and Babak, T. (2012). A quantitative atlas of polyadenylation in five mammals. Genome Res 22, 1173-1183.

Devaiah, B.N., Lewis, B.A., Cherman, N., Hewitt, M.C., Albrecht, B.K., Robey, P.G., Ozato, K., Sims, R.J., 3rd, and Singer, D.S. (2012). BRD4 is an atypical kinase that phosphorylates serine2 of the RNA polymerase II carboxy-terminal domain. Proc Natl Acad Sci U S A 109, 6927-6932.

Di Giammartino, D.C., Nishida, K., and Manley, J.L. (2011). Mechanisms and consequences of alternative polyadenylation. Mol Cell 43, 853-866.

Dichtl, B., Blank, D., Sadowski, M., Hubner, W., Weiser, S., and Keller, W. (2002). Yhh1p/Cft1p directly links poly(A) site recognition and RNA polymerase II transcription termination. EMBO J 21, 4125-4135.

Dominski, Z. (2007). Nucleases of the metallo-beta-lactamase family and their role in DNA and RNA metabolism. Crit Rev Biochem Mol Biol 42, 67-93.

Dronamraju, R., and Strahl, B.D. (2014). A feed forward circuit comprising Spt6, Ctk1 and PAF regulates Pol II CTD phosphorylation and transcription elongation. Nucleic Acids Res 42, 870-881.

Dye, M.J., Gromak, N., and Proudfoot, N.J. (2006). Exon tethering in transcription by RNA polymerase II. Mol Cell 21, 849-859.

Dye, M.J., and Proudfoot, N.J. (1999). Terminal exon definition occurs cotranscriptionally and promotes termination of RNA polymerase II. Mol Cell 3, 371-378.

Dye, M.J., and Proudfoot, N.J. (2001). Multiple transcript cleavage precedes polymerase release in termination by RNA polymerase II. Cell 105, 669-681.

Early, P., Rogers, J., Davis, M., Calame, K., Bond, M., Wall, R., and Hood, L. (1980). Two mRNAs can be produced from a single immunoglobulin mu gene by alternative RNA processing pathways. Cell 20, 313-319.

Edmonds, M., Vaughan, M.H., Jr., and Nakazato, H. (1971). Polyadenylic acid sequences in the heterogeneous nuclear RNA and rapidly-labeled polyribosomal RNA of HeLa cells: possible evidence for a precursor relationship. Proc Natl Acad Sci U S A 68, 1336-1340.

Eggermont, J., and Proudfoot, N.J. (1993). Poly(A) signals and transcriptional pause sites combine to prevent interference between RNA polymerase II promoters. EMBO J 12, 2539-2548.

Egloff, S., and Murphy, S. (2008). Cracking the RNA polymerase II CTD code. Trends Genet 24, 280-288.

Eissenberg, J.C., Shilatifard, A., Dorokhov, N., and Michener, D.E. (2007). Cdk9 is an essential kinase in Drosophila that is required for heat shock gene expression, histone methylation and elongation factor recruitment. Mol Genet Genomics 277, 101-114.

Elkon, R., Ugalde, A.P., and Agami, R. (2013). Alternative cleavage and polyadenylation: extent, regulation and function. Nat Rev Genet 14, 496-506.

Emili, A., Shales, M., McCracken, S., Xie, W., Tucker, P.W., Kobayashi, R., Blencowe, B.J., and Ingles, C.J. (2002). Splicing and transcription-associated proteins PSF and p54nrb/nonO bind to the RNA polymerase II CTD. RNA 8, 1102-1111.

Endoh, M., Zhu, W., Hasegawa, J., Watanabe, H., Kim, D.K., Aida, M., Inukai, N., Narita, T., Yamada, T., Furuya, A., et al. (2004). Human Spt6 stimulates transcription elongation by RNA polymerase II in vitro. Mol Cell Biol 24, 3324-3336.

Enriquez, H.P., Levitt, N., Briggs, D., and Proudfoot, N.J. (1991). A pause site for RNA polymerase II is associated with termination of transcription. EMBO J 10, 1833-1842.

Fitzgerald, M., and Shenk, T. (1981). The sequence 5'-AAUAAA-3' forms parts of the recognition site for polyadenylation of late SV40 mRNAs. Cell 24, 251-260.

Fong, N., and Bentley, D.L. (2001). Capping, splicing, and 3' processing are independently stimulated by RNA polymerase II: different functions for different segments of the CTD. Genes Dev 15, 1783-1795.

Fujinaga, K., Irwin, D., Huang, Y., Taube, R., Kurosu, T., and Peterlin, B.M. (2004). Dynamics of human immunodeficiency virus transcription: P-TEFb phosphorylates RD and dissociates negative effectors from the transactivation response element. Mol Cell Biol 24, 787-795.

Furth, P.A., and Baker, C.C. (1991). An element in the bovine papillomavirus late 3' untranslated region reduces polyadenylated cytoplasmic RNA levels. J Virol 65, 5806-5812.

Fusby, B., Kim, S., Erickson, B., Kim, H., Peterson, M.L., and Bentley, D.L. (2015). Coordination of RNA polymerase II pausing and 3' end processing factor recruitment with alternative polyadenylation. Mol Cell Biol 36, 295-303.

Galli, G., Guise, J.W., McDevitt, M.A., Tucker, P.W., and Nevins, J.R. (1987). Relative position and strengths of poly(A) sites as well as transcription termination are critical to membrane versus secreted mu-chain expression during B-cell development. Genes Dev 1, 471-481.

Gariglio, P., Bellard, M., and Chambon, P. (1981). Clustering of RNA polymerase B molecules in the 5' moiety of the adult beta-globin gene of hen erythrocytes. Nucleic Acids Res 9, 2589-2598.

Gawande, B., Robida, M.D., Rahn, A., and Singh, R. (2006). Drosophila Sex-lethal protein mediates polyadenylation switching in the female germline. EMBO J 25, 1263-1272.

Ghamari, A., van de Corput, M.P., Thongjuea, S., van Cappellen, W.A., van Ijcken, W., van Haren, J., Soler, E., Eick, D., Lenhard, B., and Grosveld, F.G. (2013). In vivo live imaging of RNA polymerase II transcription factories in primary cells. Genes Dev 27, 767-777.

Ghosh, A., Shuman, S., and Lima, C.D. (2008). The structure of Fcp1, an essential RNA polymerase II CTD phosphatase. Mol Cell 32, 478-490.

Gil, A., and Proudfoot, N.J. (1984). A sequence downstream of AAUAAA is required for rabbit beta-globin mRNA 3'-end formation. Nature 312, 473-474.

Gil, A., and Proudfoot, N.J. (1987). Position-dependent sequence elements downstream of AAUAAA are required for efficient rabbit beta-globin mRNA 3' end formation. Cell 49, 399-406.

Gilmour, D.S., and Lis, J.T. (1986). RNA polymerase II interacts with the promoter region of the noninduced hsp70 gene in cells. Mol Cell Biol 6, 3984-3989.

Glover-Cutter, K., Kim, S., Espinosa, J., and Bentley, D.L. (2008). RNA polymerase II pauses and associates with pre-mRNA processing factors at both ends of genes. Nat Struct Mol Biol 15, 71-78.

Glover-Cutter, K., Larochelle, S., Erickson, B., Zhang, C., Shokat, K., Fisher, R.P., and Bentley, D.L. (2009). TFIIH-associated Cdk7 kinase functions in phosphorylation of C-terminal domain Ser7 residues, promoter-proximal pausing, and termination by RNA polymerase II. Mol Cell Biol 29, 5455-5464.

Gozani, O., Feld, R., and Reed, R. (1996). Evidence that sequence-independent binding of highly conserved U2 snRNP proteins upstream of the branch site is required for assembly of spliceosomal complex A. Genes Dev 10, 233-243.

Graveley, B.R., Fleming, E.S., and Gilmartin, G.M. (1996). RNA structure is a critical determinant of poly(A) site recognition by cleavage and polyadenylation specificity factor. Mol Cell Biol 16, 4942-4951.

Gromak, N., West, S., and Proudfoot, N.J. (2006). Pause sites promote transcriptional termination of mammalian RNA polymerase II. Mol Cell Biol 26, 3986-3996.

Grosso, A.R., de Almeida, S.F., Braga, J., and Carmo-Fonseca, M. (2012). Dynamic transitions in RNA polymerase II density profiles during transcription termination. Genome Res 22, 1447-1456.

Gu, B., Eick, D., and Bensaude, O. (2013). CTD serine-2 plays a critical role in splicing and termination factor recruitment to RNA polymerase II in vivo. Nucleic Acids Res 41, 1591-1603.

Gunderson, S.I., Beyer, K., Martin, G., Keller, W., Boelens, W.C., and Mattaj, I.W. (1994). The human U1A snRNP protein regulates polyadenylation via a direct interaction with poly(A) polymerase. Cell 76, 531-541.

Gunderson, S.I., Polycarpou-Schwarz, M., and Mattaj, I.W. (1998). U1 snRNP inhibits pre-mRNA polyadenylation through a direct interaction between U1 70K and poly(A) polymerase. Mol Cell 1, 255-264.

Gunderson, S.I., Vagner, S., Polycarpou-Schwarz, M., and Mattaj, I.W. (1997). Involvement of the carboxyl terminus of vertebrate poly(A) polymerase in U1A autoregulation and in the coupling of splicing and polyadenylation. Genes Dev 11, 761-773.

Hall, S.L., and Padgett, R.A. (1996). Requirement of U12 snRNA for in vivo splicing of a minor class of eukaryotic nuclear pre-mRNA introns. Science 271, 1716-1718.

Han, J., Ding, J.H., Byeon, C.W., Kim, J.H., Hertel, K.J., Jeong, S., and Fu, X.D. (2011). SR proteins induce alternative exon skipping through their activities on the flanking constitutive exons. Mol Cell Biol 31, 793-802.

Hartzog, G.A., and Quan, T.K. (2008). Just the FACTs: histone H2B ubiquitylation and nucleosome dynamics. Mol Cell 31, 2-4.

Hausmann, S., and Shuman, S. (2002). Characterization of the CTD phosphatase Fcp1 from fission yeast. Preferential dephosphorylation of serine 2 versus serine 5. J Biol Chem 277, 21213-21220.

Heidemann, M., Hintermair, C., Voss, K., and Eick, D. (2013). Dynamic phosphorylation patterns of RNA polymerase II CTD during transcription. Biochim Biophys Acta 1829, 55-62.

Ho, C., and Shuman, S. (1999). Distinct effector roles for Ser2 and Ser5 phosphorylation of the RNA polymerase II CTD in the recruitment and allosteric activation of mammalian capping enzyme. Mol Cell 3, 405-411.

Hoque, M., Ji, Z., Zheng, D., Luo, W., Li, W., You, B., Park, J.Y., Yehia, G., and Tian, B. (2013). Analysis of alternative cleavage and polyadenylation by 3' region extraction and deep sequencing. Nat Methods 10, 133-139.

Houzelstein, D., Bullock, S.L., Lynch, D.E., Grigorieva, E.F., Wilson, V.A., and Beddington, R.S. (2002). Growth and early postimplantation defects in mice deficient for the bromodomain-containing protein Brd4. Mol Cell Biol 22, 3794-3802.

Hu, J., Lutz, C.S., Wilusz, J., and Tian, B. (2005). Bioinformatic identification of candidate cis-regulatory elements involved in human mRNA polyadenylation. RNA 11, 1485-1493.

Jang, M.K., Mochizuki, K., Zhou, M., Jeong, H.S., Brady, J.N., and Ozato, K. (2005). The bromodomain protein Brd4 is a positive regulatory component of P-TEFb and stimulates RNA polymerase II-dependent transcription. Mol Cell 19, 523-534.

Jeronimo, C., Bataille, A.R., and Robert, F. (2013). The writers, readers, and functions of the RNA polymerase II C-terminal domain code. Chem Rev 113, 8491-8522.

Jurica, M.S., and Moore, M.J. (2003). Pre-mRNA splicing: awash in a sea of proteins. Mol Cell 12, 5-14.

Kaida, D., Berg, M.G., Younis, I., Kasim, M., Singh, L.N., Wan, L., and Dreyfuss, G. (2010). U1 snRNP protects pre-mRNAs from premature cleavage and polyadenylation. Nature 468, 664-668.

Kaida, D., Motoyoshi, H., Tashiro, E., Nojima, T., Hagiwara, M., Ishigami, K., Watanabe, H., Kitahara, T., Yoshida, T., Nakajima, H., et al. (2007). Spliceostatin A targets SF3b and inhibits both splicing and nuclear retention of pre-mRNA. Nat Chem Biol 3, 576-583.

Kameoka, S., Duque, P., and Konarska, M.M. (2004). p54(nrb) associates with the 5' splice site within large transcription/splicing complexes. EMBO J 23, 1782-1791.

Kanagaraj, R., Huehn, D., MacKellar, A., Menigatti, M., Zheng, L., Urban, V., Shevelev, I., Greenleaf, A.L., and Janscak, P. (2010). RECQ5 helicase associates with the C-terminal repeat domain of RNA polymerase II during productive elongation phase of transcription. Nucleic Acids Res 38, 8131-8140.

Kaplan, C.D., Holland, M.J., and Winston, F. (2005). Interaction between transcription elongation factors and mRNA 3'-end formation at the Saccharomyces cerevisiae GAL10-GAL7 locus. J Biol Chem 280, 913-922.

Kaplan, C.D., Laprade, L., and Winston, F. (2003). Transcription elongation factors repress transcription initiation from cryptic sites. Science 301, 1096-1099.

Kaplan, C.D., Morris, J.R., Wu, C., and Winston, F. (2000). Spt5 and spt6 are associated with active transcription and have characteristics of general elongation factors in D. melanogaster. Genes Dev 14, 2623-2634.

Kaufmann, I., Martin, G., Friedlein, A., Langen, H., and Keller, W. (2004). Human Fip1 is a subunit of CPSF that binds to U-rich RNA elements and stimulates poly(A) polymerase. EMBO J 23, 616-626.

Keller, W., Bienroth, S., Lang, K.M., and Christofori, G. (1991). Cleavage and polyadenylation factor CPF specifically interacts with the pre-mRNA 3' processing signal AAUAAA. EMBO J 10, 4241-4249.

Keogh, M.C., Podolny, V., and Buratowski, S. (2003). Bur1 kinase is required for efficient transcription elongation by RNA polymerase II. Mol Cell Biol 23, 7005-7018.

Khaladkar, M., Smyda, M., and Hannenhalli, S. (2011). Epigenomic and RNA structural correlates of polyadenylation. RNA Biol 8, 529-537.

Kim, E., Du, L., Bregman, D.B., and Warren, S.L. (1997). Splicing factors associate with hyperphosphorylated RNA polymerase II in the absence of pre-messenger RNA. J Cell Biol 136, 19-28.

Kim, M., Ahn, S.H., Krogan, N.J., Greenblatt, J.F., and Buratowski, S. (2004a). Transitions in RNA polymerase II elongation complexes at the 3' ends of genes. EMBO J 23, 354-364.

Kim, M., Krogan, N.J., Vasiljeva, L., Rando, O.J., Nedea, E., Greenblatt, J.F., and Buratowski, S. (2004b). The yeast Rat1 exonuclease promotes transcription termination by RNA polymerase II. Nature 432, 517-522.

Kim, S., Kim, H., Fong, N., Erickson, B., and Bentley, D.L. (2011). Pre-mRNA splicing is a determinant of histone H3K36 methylation. Proc Natl Acad Sci U S A 108, 13564-13569.

Kizer, K.O., Phatnani, H.P., Shibata, Y., Hall, H., Greenleaf, A.L., and Strahl, B.D. (2005). A novel domain in Set2 mediates RNA polymerase II interaction and couples histone H3 K36 methylation with transcript elongation. Mol Cell Biol 25, 3305-3316.

Kolev, N.G., Yario, T.A., Benson, E., and Steitz, J.A. (2008). Conserved motifs in both CPSF73 and CPSF100 are required to assemble the active endonuclease for histone mRNA 3'-end maturation. EMBO Rep 9, 1013-1018.

Komarnitsky, P., Cho, E.J., and Buratowski, S. (2000). Different phosphorylated forms of RNA polymerase II and associated mRNA processing factors during transcription. Genes Dev 14, 2452-2460.

Krogan, N.J., Kim, M., Ahn, S.H., Zhong, G., Kobor, M.S., Cagney, G., Emili, A., Shilatifard, A., Buratowski, S., and Greenblatt, J.F. (2002). RNA polymerase II elongation factors of Saccharomyces cerevisiae: a targeted proteomics approach. Mol Cell Biol 22, 6979-6992.

Krumm, A., Hickey, L.B., and Groudine, M. (1995). Promoter-proximal pausing of RNA polymerase II defines a general rate-limiting step after transcription initiation. Genes Dev 9, 559-572.

Kwek, K.Y., Murphy, S., Furger, A., Thomas, B., O'Gorman, W., Kimura, H., Proudfoot, N.J., and Akoulitchev, A. (2002). U1 snRNA associates with TFIIH and regulates transcriptional initiation. Nat Struct Biol 9, 800-805.

Kyburz, A., Friedlein, A., Langen, H., and Keller, W. (2006). Direct interactions between subunits of CPSF and the U2 snRNP contribute to the coupling of pre-mRNA 3' end processing and splicing. Mol Cell 23, 195-205.

Lackford, B., Yao, C., Charles, G.M., Weng, L., Zheng, X., Choi, E.A., Xie, X., Wan, J., Xing, Y., Freudenberg, J.M., et al. (2014). Fip1 regulates mRNA alternative polyadenylation to promote stem cell self-renewal. EMBO J 33, 878-889.

Langmead, B., Trapnell, C., Pop, M., and Salzberg, S.L. (2009). Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10, R25.

Larochelle, S., Amat, R., Glover-Cutter, K., Sanso, M., Zhang, C., Allen, J.J., Shokat, K.M., Bentley, D.L., and Fisher, R.P. (2012). Cyclin-dependent kinase control of the initiation-to-elongation switch of RNA polymerase II. Nat Struct Mol Biol 19, 1108-1115.

Lee, J.M., and Greenleaf, A.L. (1989). A protein kinase that phosphorylates the C-terminal repeat domain of the largest subunit of RNA polymerase II. Proc Natl Acad Sci U S A 86, 3624-3628.

Lee, J.M., and Greenleaf, A.L. (1991). CTD kinase large subunit is encoded by CTK1, a gene required for normal growth of Saccharomyces cerevisiae. Gene Expr 1, 149-167.

Lennon, G.G., and Perry, R.P. (1985). C mu-containing transcripts initiate heterogeneously within the IgH enhancer region and contain a novel 5'-nontranslatable exon. Nature 318, 475-478.

Lerner, M.R., Boyle, J.A., Mount, S.M., Wolin, S.L., and Steitz, J.A. (1980). Are snRNPs involved in splicing? Nature 283, 220-224.

Levine, M. (2011). Paused RNA polymerase II as a developmental checkpoint. Cell 145, 502-511.

Li, M., Phatnani, H.P., Guan, Z., Sage, H., Greenleaf, A.L., and Zhou, P. (2005). Solution structure of the Set2-Rpb1 interacting domain of human Set2 and its interaction with the hyperphosphorylated C-terminal domain of Rpb1. Proc Natl Acad Sci U S A 102, 17636-17641.

Li, W., You, B., Hoque, M., Zheng, D., Luo, W., Ji, Z., Park, J.Y., Gunderson, S.I., Kalsotra, A., Manley, J.L., et al. (2015). Systematic profiling of poly(A)+ transcripts modulated by core 3' end processing and splicing factors reveals regulatory rules of alternative cleavage and polyadenylation. PLoS Genet 11, e1005166.

Lian, Z., Karpikov, A., Lian, J., Mahajan, M.C., Hartman, S., Gerstein, M., Snyder, M., and Weissman, S.M. (2008). A genomic analysis of RNA polymerase II modification and chromatin architecture related to 3' end RNA polyadenylation. Genome Res 18, 1224-1237.

Liang, K., Gao, X., Gilmore, J.M., Florens, L., Washburn, M.P., Smith, E., and Shilatifard, A. (2015). Characterization of human cyclin-dependent kinase 12 (CDK12) and CDK13 complexes in C-terminal domain phosphorylation, gene transcription, and RNA processing. Mol Cell Biol 35, 928-938.

Lianoglou, S., Garg, V., Yang, J.L., Leslie, C.S., and Mayr, C. (2013). Ubiquitously transcribed genes use alternative polyadenylation to achieve tissue-specific expression. Genes Dev 27, 2380-2396.

Licatalosi, D.D., Geiger, G., Minet, M., Schroeder, S., Cilli, K., McNeil, J.B., and Bentley, D.L. (2002). Functional interaction of yeast pre-mRNA 3' end processing factors with RNA polymerase II. Mol Cell 9, 1101-1111.

Lim, L., and Canellakis, E.S. (1970). Adenine-rich polymer associated with rabbit reticulocyte messenger RNA. Nature 227, 710-712.

Lin, C., Garrett, A.S., De Kumar, B., Smith, E.R., Gogol, M., Seidel, C., Krumlauf, R., and Shilatifard, A. (2011). Dynamic transcriptional events in embryonic stem cells mediated by the super elongation complex (SEC). Genes Dev 25, 1486-1498.

Lin, C., Smith, E.R., Takahashi, H., Lai, K.C., Martin-Brown, S., Florens, L., Washburn, M.P., Conaway, J.W., Conaway, R.C., and Shilatifard, A. (2010). AFF4, a component of the ELL/P-TEFb elongation complex and a shared subunit of MLL chimeras, can link transcription elongation to leukemia. Mol Cell 37, 429-437.

Lin, P.S., Dubois, M.F., and Dahmus, M.E. (2002). TFIIF-associating carboxyl-terminal domain phosphatase dephosphorylates phosphoserines 2 and 5 of RNA polymerase II. J Biol Chem 277, 45949-45956.

Listerman, I., Sapra, A.K., and Neugebauer, K.M. (2006). Cotranscriptional coupling of splicing factor recruitment and precursor messenger RNA splicing in mammalian cells. Nat Struct Mol Biol 13, 815-822.

Liu, X., Kraus, W.L., and Bai, X. (2015). Ready, pause, go: regulation of RNA polymerase II pausing and release by cellular signaling pathways. Trends Biochem Sci 40, 516-525.

Liu, Z., Luyten, I., Bottomley, M.J., Messias, A.C., Houngninou-Molango, S., Sprangers, R., Zanier, K., Kramer, A., and Sattler, M. (2001). Structural basis for recognition of the intron branch site RNA by splicing factor 1. Science 294, 1098-1102.

Lunde, B.M., Reichow, S.L., Kim, M., Suh, H., Leeper, T.C., Yang, F., Mutschler, H., Buratowski, S., Meinhart, A., and Varani, G. (2010). Cooperative interaction of transcription termination factors with the RNA polymerase II C-terminal domain. Nat Struct Mol Biol 17, 1195-1201.

Luo, Z., Lin, C., and Shilatifard, A. (2012). The super elongation complex (SEC) family in transcriptional control. Nat Rev Mol Cell Biol 13, 543-547.

Lutz, C.S., Murthy, K.G., Schek, N., O'Connor, J.P., Manley, J.L., and Alwine, J.C. (1996). Interaction between the U1 snRNP-A protein and the 160-kD subunit of cleavage-polyadenylation specificity factor increases polyadenylation efficiency in vitro. Genes Dev 10, 325-337.

Ma, J., Gunderson, S.I., and Phillips, C. (2006). Non-snRNP U1A levels decrease during mammalian B-cell differentiation and release the IgM secretory poly(A) site from repression. RNA 12, 122-132.

MacDonald, C.C., Wilusz, J., and Shenk, T. (1994). The 64-kilodalton subunit of the CstF polyadenylation factor binds to pre-mRNAs downstream of the cleavage site and influences cleavage site location. Mol Cell Biol 14, 6647-6654.

MacKellar, A.L., and Greenleaf, A.L. (2011). Cotranscriptional association of mRNA export factor Yra1 with C-terminal domain of RNA polymerase II. J Biol Chem 286, 36385-36395.

Maclennan, A.J., and Shaw, G. (1993). A yeast SH2 domain. Trends Biochem Sci 18, 464-465.

Mandel, C.R., Bai, Y., and Tong, L. (2008). Protein factors in pre-mRNA 3'-end processing. Cell Mol Life Sci 65, 1099-1122.

Mandel, C.R., Kaneko, S., Zhang, H., Gebauer, D., Vethantham, V., Manley, J.L., and Tong, L. (2006). Polyadenylation factor CPSF-73 is the pre-mRNA 3'-end-processing endonuclease. Nature 444, 953-956.

Manley, J.L., and Tacke, R. (1996). SR proteins and splicing control. Genes Dev 10, 1569-1579.

Marshall, N.F., Peng, J., Xie, Z., and Price, D.H. (1996). Control of RNA polymerase II elongation potential by a novel carboxyl-terminal domain kinase. J Biol Chem 271, 27176-27183.

Marshall, N.F., and Price, D.H. (1995). Purification of P-TEFb, a transcription factor required for the transition into productive elongation. J Biol Chem 270, 12335-12338.

Martin, G., Gruber, A.R., Keller, W., and Zavolan, M. (2012). Genome-wide analysis of pre-mRNA 3' end processing reveals a decisive role of human cleavage factor I in the regulation of 3' UTR length. Cell Rep 1, 753-763.

Martincic, K., Alkan, S.A., Cheatle, A., Borghesi, L., and Milcarek, C. (2009). Transcription elongation factor ELL2 directs immunoglobulin secretion in plasma cells by stimulating altered RNA processing. Nat Immunol 10, 1102-1109.

Matera, A.G., and Wang, Z. (2014). A day in the life of the spliceosome. Nat Rev Mol Cell Biol 15, 108-121.

Mayer, A., Heidemann, M., Lidschreiber, M., Schreieck, A., Sun, M., Hintermair, C., Kremmer, E., Eick, D., and Cramer, P. (2012). CTD tyrosine phosphorylation impairs termination factor recruitment to RNA polymerase II. Science 336, 1723-1725.

Mayer, A., Lidschreiber, M., Siebert, M., Leike, K., Soding, J., and Cramer, P. (2010). Uniform transitions of the general RNA polymerase II transcription complex. Nat Struct Mol Biol 17, 1272-1278.

McCracken, S., Fong, N., Rosonina, E., Yankulov, K., Brothers, G., Siderovski, D., Hessel, A., Foster, S., Shuman, S., and Bentley, D.L. (1997a). 5'-Capping enzymes are targeted to pre-mRNA by binding to the phosphorylated carboxy-terminal domain of RNA polymerase II. Genes Dev 11, 3306-3318.

McCracken, S., Fong, N., Yankulov, K., Ballantyne, S., Pan, G., Greenblatt, J., Patterson, S.D., Wickens, M., and Bentley, D.L. (1997b). The C-terminal domain of RNA polymerase II couples mRNA processing to transcription. Nature 385, 357-361.

McDonald, S.M., Close, D., Xin, H., Formosa, T., and Hill, C.P. (2010). Structure and biological importance of the Spn1-Spt6 interaction, and its regulatory role in nucleosome binding. Mol Cell 40, 725-735.

McLauchlan, J., Gaffney, D., Whitton, J.L., and Clements, J.B. (1985). The consensus sequence YGTGTTYY located downstream from the AATAAA signal is required for efficient formation of mRNA 3' termini. Nucleic Acids Res 13, 1347-1368.

Meerbrey, K.L., Hu, G., Kessler, J.D., Roarty, K., Li, M.Z., Fang, J.E., Herschkowitz, J.I., Burrows, A.E., Ciccia, A., Sun, T., et al. (2011). The pINDUCER lentiviral toolkit for inducible RNA interference in vitro and in vivo. Proc Natl Acad Sci U S A 108, 3665-3670.

Mendecki, J., Lee, S.Y., and Brawerman, G. (1972). Characteristics of the polyadenylic acid segment associated with messenger ribonucleic acid in mouse sarcoma 180 ascites cells. Biochemistry 11, 792-798.

Millevoi, S., Geraghty, F., Idowu, B., Tam, J.L., Antoniou, M., and Vagner, S. (2002). A novel function for the U2AF 65 splicing factor in promoting pre-mRNA 3'-end processing. EMBO Rep 3, 869-874.

Millevoi, S., Loulergue, C., Dettwiler, S., Karaa, S.Z., Keller, W., Antoniou, M., and Vagner, S. (2006). An interaction between U2AF 65 and CF I(m) links the splicing and 3' end processing machineries. EMBO J 25, 4854-4864.

Misteli, T., and Spector, D.L. (1999). RNA polymerase II targets pre-mRNA splicing factors to transcription sites in vivo. Mol Cell 3, 697-705.

Monarez, R.R., MacDonald, C.C., and Dass, B. (2007). Polyadenylation proteins CstF-64 and tauCstF-64 exhibit differential binding affinities for RNA polymers. Biochem J 401, 651-658.

Moreira, A., Wollerton, M., Monks, J., and Proudfoot, N.J. (1995). Upstream sequence elements enhance poly(A) site efficiency of the C2 complement gene and are phylogenetically conserved. EMBO J 14, 3809-3819.

Morris, D.P., and Greenleaf, A.L. (2000). Analysis of CTD and phospho-CTD binding by Yeast WW domain-containing proteins reveals that the splicing factor, Prp40, is a phospho-CTD binding protein. J Biol Chem 275, 39935-39943.

Murthy, K.G., and Manley, J.L. (1992). Characterization of the multisubunit cleavage-polyadenylation specificity factor from calf thymus. J Biol Chem 267, 14804-14811.

Murthy, K.G., and Manley, J.L. (1995). The 160-kD subunit of human cleavage-polyadenylation specificity factor coordinates pre-mRNA 3'-end formation. Genes Dev 9, 2672-2683.

Nag, A., Narsinh, K., and Martinson, H.G. (2007). The poly(A)-dependent transcriptional pause is mediated by CPSF acting on the body of the polymerase. Nat Struct Mol Biol 14, 662-669.

Neugebauer, K.M., and Roth, M.B. (1997). Distribution of pre-mRNA splicing factors at sites of RNA polymerase II transcription. Genes Dev 11, 1148-1159.

Ni, Z., Schwartz, B.E., Werner, J., Suarez, J.R., and Lis, J.T. (2004). Coordination of transcription, RNA processing, and surveillance by P-TEFb kinase on heat shock genes. Mol Cell 13, 55-65.

Niwa, M., Rose, S.D., and Berget, S.M. (1990). In vitro polyadenylation is stimulated by the presence of an upstream intron. Genes Dev 4, 1552-1559.

Ochi, A., Hawley, R.G., Hawley, T., Shulman, M.J., Traunecker, A., Kohler, G., and Hozumi, N. (1983). Functional immunoglobulin M production after transfection of cloned immunoglobulin heavy and light chain genes into lymphoid cells. Proc Natl Acad Sci U S A 80, 6351-6355.

Pal-Bhadra, M., Leibovitch, B.A., Gandhi, S.G., Chikka, M.R., Bhadra, U., Birchler, J.A., and Elgin, S.C. (2004). Heterochromatic silencing and HP1 localization in Drosophila are dependent on the RNAi machinery. Science 303, 669-672.

Patel, A.A., and Steitz, J.A. (2003). Splicing double: insights from the second spliceosome. Nat Rev Mol Cell Biol 4, 960-970.

Patturajan, M., Wei, X., Berezney, R., and Corden, J.L. (1998). A nuclear matrix protein interacts with the phosphorylated C-terminal domain of RNA polymerase II. Mol Cell Biol 18, 2406-2415.

Peng, J., Marshall, N.F., and Price, D.H. (1998a). Identification of a cyclin subunit required for the function of Drosophila P-TEFb. J Biol Chem 273, 13855-13860.

Peng, J., Zhu, Y., Milton, J.T., and Price, D.H. (1998b). Identification of multiple cyclin subunits of human P-TEFb. Genes Dev 12, 755-762.

Perales, R., Erickson, B., Zhang, L., Kim, H., Valiquett, E., and Bentley, D. (2013). Gene promoters dictate histone occupancy within genes. EMBO J 32, 2645-2656.

Peterson, M.L. (2011). Immunoglobulin heavy chain gene regulation through polyadenylation and splicing competition. Wiley Interdiscip Rev RNA 2, 92-105.

Peterson, M.L., Bertolino, S., and Davis, F. (2002). An RNA polymerase pause site is associated with the immunoglobulin mus poly(A) site. Mol Cell Biol 22, 5606-5615.

Peterson, M.L., Bingham, G.L., and Cowan, C. (2006). Multiple features contribute to the use of the immunoglobulin M secretion-specific poly(A) signal but are not required for developmental regulation. Mol Cell Biol 26, 6762-6771.

Peterson, M.L., Bryman, M.B., Peiter, M., and Cowan, C. (1994). Exon size affects competition between splicing and cleavage-polyadenylation in the immunoglobulin mu gene. Mol Cell Biol 14, 77-86.

Peterson, M.L., and Perry, R.P. (1986). Regulated production of mu m and mu s mRNA requires linkage of the poly(A) addition sites and is dependent on the length of the mu s-mu m intron. Proc Natl Acad Sci U S A 83, 8883-8887.

Peterson, M.L., and Perry, R.P. (1989). The regulated production of mu m and mu s mRNA is dependent on the relative efficiencies of mu s poly(A) site usage and the c mu 4-to-M1 splice. Mol Cell Biol 9, 726-738.

Phatnani, H.P., and Greenleaf, A.L. (2004). Identifying phosphoCTD-associating proteins. Methods Mol Biol 257, 17-28.

Phillips, C., and Virtanen, A. (1997). The murine IgM secretory poly(A) site contains dual upstream and downstream elements which affect polyadenylation. Nucleic Acids Res 25, 2344-2351.

Pinhero, R., Liaw, P., Bertens, K., and Yankulov, K. (2004). Three cyclin-dependent kinases preferentially phosphorylate different parts of the C-terminal domain of the large subunit of RNA polymerase II. Eur J Biochem 271, 1004-1014.

Pinto, P.A., Henriques, T., Freitas, M.O., Martins, T., Domingues, R.G., Wyrzykowska, P.S., Coelho, P.A., Carmo, A.M., Sunkel, C.E., Proudfoot, N.J., et al. (2011). RNA polymerase II kinetics in polo polyadenylation signal selection. EMBO J 30, 2431-2444.

Plant, K.E., Dye, M.J., Lafaille, C., and Proudfoot, N.J. (2005). Strong polyadenylation and weak pausing combine to cause efficient termination of transcription in the human Ggamma-globin gene. Mol Cell Biol 25, 3276-3285.

Plet, A., Eick, D., and Blanchard, J.M. (1995). Elongation and premature termination of transcripts initiated from c-fos and c-myc promoters show dissimilar patterns. Oncogene 10, 319-328.

Prelich, G., and Winston, F. (1993). Mutations that suppress the deletion of an upstream activating sequence in yeast: involvement of a protein kinase and histone H3 in repressing transcription in vivo. Genetics 135, 665-676.

Proudfoot, N.J. (1989). How RNA polymerase II terminates transcription in higher eukaryotes. Trends Biochem Sci 14, 105-110.

Proudfoot, N.J. (2011). Ending the message: poly(A) signals then and now. Genes Dev 25, 1770-1782.

Qiu, H., Hu, C., and Hinnebusch, A.G. (2009). Phosphorylation of the Pol II CTD by KIN28 enhances BUR1/BUR2 recruitment and Ser2 CTD phosphorylation near promoters. Mol Cell 33, 752-762.

Query, C.C., Strobel, S.A., and Sharp, P.A. (1996). Three recognition events at the branch-site adenine. EMBO J 15, 1392-1402.

Rahl, P.B., Lin, C.Y., Seila, A.C., Flynn, R.A., McCuine, S., Burge, C.B., Sharp, P.A., and Young, R.A. (2010). c-Myc regulates transcriptional pause release. Cell 141, 432-445.

Ramanathan, Y., Rajpara, S.M., Reza, S.M., Lees, E., Shuman, S., Mathews, M.B., and Pe'ery, T. (2001). Three RNA polymerase II carboxyl-terminal domain kinases display distinct substrate preferences. J Biol Chem 276, 10913-10920.

Ramanathan, Y., Reza, S.M., Young, T.M., Mathews, M.B., and Pe'ery, T. (1999). Human and rodent transcription elongation factor P-TEFb: interactions with human immunodeficiency virus type 1 tat and carboxy-terminal domain substrate. J Virol 73, 5448-5458.

Rappsilber, J., Ryder, U., Lamond, A.I., and Mann, M. (2002). Large-scale proteomic analysis of the human spliceosome. Genome Res 12, 1231-1245.

Rasmussen, E.B., and Lis, J.T. (1993). In vivo transcriptional pausing and cap formation on three Drosophila heat shock genes. Proc Natl Acad Sci U S A 90, 7923-7927.

Reed, R. (1996). Initial splice-site recognition and pairing during pre-mRNA splicing. Curr Opin Genet Dev 6, 215-220.

Robert, F., Blanchette, M., Maes, O., Chabot, B., and Coulombe, B. (2002). A human RNA polymerase II-containing complex associated with factors necessary for spliceosome assembly. J Biol Chem 277, 9302-9306.

Rougvie, A.E., and Lis, J.T. (1988). The RNA polymerase II molecule at the 5' end of the uninduced hsp70 gene of D. melanogaster is transcriptionally engaged. Cell 54, 795-804.

Roybal, G.A., and Jurica, M.S. (2010). Spliceostatin A inhibits spliceosome assembly subsequent to prespliceosome formation. Nucleic Acids Res 38, 6664-6672.

Ruegsegger, U., Beyer, K., and Keller, W. (1996). Purification and characterization of human cleavage factor Im involved in the 3' end processing of messenger RNA precursors. J Biol Chem 271, 6107-6113.

Ruegsegger, U., Blank, D., and Keller, W. (1998). Human pre-mRNA cleavage factor Im is related to spliceosomal SR proteins and can be reconstituted in vitro from recombinant subunits. Mol Cell 1, 243-253.

Saunders, A., Werner, J., Andrulis, E.D., Nakayama, T., Hirose, S., Reinberg, D., and Lis, J.T. (2003). Tracking FACT and the RNA polymerase II elongation complex through chromatin in vivo. Science 301, 1094-1096.

Schonemann, L., Kuhn, U., Martin, G., Schafer, P., Gruber, A.R., Keller, W., Zavolan, M., and Wahle, E. (2014). Reconstitution of CPSF active in polyadenylation: recognition of the polyadenylation signal by WDR33. Genes Dev 28, 2381-2393.

Schroeder, S.C., Schwer, B., Shuman, S., and Bentley, D. (2000). Dynamic association of capping enzymes with transcribing RNA polymerase II. Genes Dev 14, 2435-2440.

Schwartz, J.C., Ebmeier, C.C., Podell, E.R., Heimiller, J., Taatjes, D.J., and Cech, T.R. (2012). FUS binds the CTD of RNA polymerase II and regulates its phosphorylation at Ser2. Genes Dev 26, 2690-2695.

Shefer, K., Sperling, J., and Sperling, R. (2014). The Supraspliceosome - A Multi-Task Machine for Regulated Pre-mRNA Processing in the Cell Nucleus. Comput Struct Biotechnol J 11, 113-122.

Shi, Y. (2012). Alternative polyadenylation: new insights from global analyses. RNA 18, 2105-2117.

Shi, Y., Di Giammartino, D.C., Taylor, D., Sarkeshik, A., Rice, W.J., Yates, J.R., 3rd, Frank, J., and Manley, J.L. (2009). Molecular architecture of the human pre-mRNA 3' processing complex. Mol Cell 33, 365-376.

Shi, Y., and Manley, J.L. (2015). The end of the message: multiple protein-RNA interactions define the mRNA polyadenylation site. Genes Dev 29, 889-897.

Shim, E.Y., Walker, A.K., Shi, Y., and Blackwell, T.K. (2002). CDK-9/cyclin T (P-TEFb) is required in two postinitiation pathways for transcription in the C. elegans embryo. Genes Dev 16, 2135-2146.

Smibert, P., Miura, P., Westholm, J.O., Shenker, S., May, G., Duff, M.O., Zhang, D., Eads, B.D., Carlson, J., Brown, J.B., et al. (2012). Global patterns of tissue-specific alternative polyadenylation in Drosophila. Cell Rep 1, 277-289.

Sterner, D.A., Carlo, T., and Berget, S.M. (1996). Architectural limits on split genes. Proc Natl Acad Sci U S A 93, 15081-15085.

Strobl, L.J., and Eick, D. (1992). Hold back of RNA polymerase II at the transcription start site mediates down-regulation of c-myc in vivo. EMBO J 11, 3307-3314.

Swanson, M.S., Carlson, M., and Winston, F. (1990). SPT6, an essential gene that affects transcription in Saccharomyces cerevisiae, encodes a nuclear protein with an extremely acidic amino terminus. Mol Cell Biol 10, 4935-4941.

Swanson, M.S., and Winston, F. (1992). SPT4, SPT5 and SPT6 interactions: effects on transcription and viability in Saccharomyces cerevisiae. Genetics 132, 325-336.

Takagaki, Y., and Manley, J.L. (1992). A human polyadenylation factor is a G protein beta-subunit homologue. J Biol Chem 267, 23471-23474.

Takagaki, Y., and Manley, J.L. (1994). A polyadenylation factor subunit is the human homologue of the Drosophila suppressor of forked protein. Nature 372, 471-474.

Takagaki, Y., and Manley, J.L. (1997). RNA recognition by the human polyadenylation factor CstF. Mol Cell Biol 17, 3907-3914.

Takagaki, Y., and Manley, J.L. (1998). Levels of polyadenylation factor CstF-64 control IgM heavy chain mRNA accumulation and other events associated with B cell differentiation. Mol Cell 2, 761-771.

Takagaki, Y., Ryner, L.C., and Manley, J.L. (1989). Four factors are required for 3'-end cleavage of pre-mRNAs. Genes Dev 3, 1711-1724.

Takagaki, Y., Seipelt, R.L., Peterson, M.L., and Manley, J.L. (1996). The polyadenylation factor CstF-64 regulates alternative processing of IgM heavy chain pre-mRNA during B cell differentiation. Cell 87, 941-952.

Tanner, S., Stagljar, I., Georgiev, O., Schaffner, W., and Bourquin, J.P. (1997). A novel SR-related protein specifically interacts with the carboxy-terminal domain (CTD) of RNA polymerase II through a conserved interaction domain. Biol Chem 378, 565-571.

Tantravahi, J., Alvira, M., and Falck-Pedersen, E. (1993). Characterization of the mouse beta maj globin transcription termination region: a spacing sequence is required between the poly(A) signal sequence and multiple downstream termination elements. Mol Cell Biol 13, 578-587.

Tarn, W.Y., and Steitz, J.A. (1996). A novel spliceosome containing U11, U12, and U5 snRNPs excises a minor class (AT-AC) intron in vitro. Cell 84, 801-811.

Tian, B., and Graber, J.H. (2012). Signals for pre-mRNA cleavage and polyadenylation. Wiley Interdiscip Rev RNA 3, 385-396.

Tian, B., Hu, J., Zhang, H., and Lutz, C.S. (2005). A large-scale analysis of mRNA polyadenylation of human and mouse genes. Nucleic Acids Res 33, 201-212.

Tian, B., and Manley, J.L. (2013). Alternative cleavage and polyadenylation: the long and short of it. Trends Biochem Sci 38, 312-320.

Tian, B., Pan, Z., and Lee, J.Y. (2007). Widespread mRNA polyadenylation events in introns indicate dynamic interplay between polyadenylation and splicing. Genome Res 17, 156-165.

Tsurushita, N., and Korn, L.J. (1987). Effects of intron length on differential processing of mouse mu heavy-chain mRNA. Mol Cell Biol 7, 2602-2605.

Vagner, S., Vagner, C., and Mattaj, I.W. (2000). The carboxyl terminus of vertebrate poly(A) polymerase interacts with U2AF 65 to couple 3'-end processing and splicing. Genes Dev 14, 403-413.

van Gelder, C.W., Gunderson, S.I., Jansen, E.J., Boelens, W.C., Polycarpou-Schwarz, M., Mattaj, I.W., and van Venrooij, W.J. (1993). A complex secondary structure in U1A pre-mRNA that binds two molecules of U1A protein is required for regulation of polyadenylation. EMBO J 12, 5191-5200.

Vickers, T.A., Sabripour, M., and Crooke, S.T. (2011). U1 adaptors result in reduction of multiple pre-mRNA species principally by sequestering U1snRNP. Nucleic Acids Res 39, e71.

Villarreal, L.P., and White, R.T. (1983). A splice junction deletion deficient in the transport of RNA does not polyadenylate nuclear RNA. Mol Cell Biol 3, 1381-1388.

Wada, T., Takagi, T., Yamaguchi, Y., Ferdous, A., Imai, T., Hirose, S., Sugimoto, S., Yano, K., Hartzog, G.A., Winston, F., et al. (1998a). DSIF, a novel transcription elongation factor that regulates RNA polymerase II processivity, is composed of human Spt4 and Spt5 homologs. Genes Dev 12, 343-356.

Wada, T., Takagi, T., Yamaguchi, Y., Watanabe, D., and Handa, H. (1998b). Evidence that P-TEFb alleviates the negative effect of DSIF on RNA polymerase II-dependent transcription in vitro. EMBO J 17, 7395-7403.

Wahl, M.C., Will, C.L., and Luhrmann, R. (2009). The spliceosome: design principles of a dynamic RNP machine. Cell 136, 701-718.

Wallace, A.M., Dass, B., Ravnik, S.E., Tonk, V., Jenkins, N.A., Gilbert, D.J., Copeland, N.G., and MacDonald, C.C. (1999). Two distinct forms of the 64,000 Mr protein of the cleavage stimulation factor are expressed in mouse male germ cells. Proc Natl Acad Sci U S A 96, 6763-6768.

West, S., and Proudfoot, N.J. (2009). Transcriptional termination enhances protein expression in human cells. Mol Cell 33, 354-364.

West, S., Zaret, K., and Proudfoot, N.J. (2006). Transcriptional termination sequences in the mouse serum albumin gene. RNA 12, 655-665.

Will, C.L., and Luhrmann, R. (2001). Spliceosomal UsnRNP biogenesis, structure and function. Curr Opin Cell Biol 13, 290-301.

Will, C.L., Schneider, C., MacMillan, A.M., Katopodis, N.F., Neubauer, G., Wilm, M., Luhrmann, R., and Query, C.C. (2001). A novel U2 and U11/U12 snRNP protein that associates with the pre-mRNA branch site. EMBO J 20, 4536-4546.

Winston, F., Chaleff, D.T., Valent, B., and Fink, G.R. (1984). Mutations affecting Ty-mediated expression of the HIS4 gene of Saccharomyces cerevisiae. Genetics 107, 179-197.

Wood, A.J., Schulz, R., Woodfine, K., Koltowska, K., Beechey, C.V., Peters, J., Bourc'his, D., and Oakey, R.J. (2008). Regulation of alternative polyadenylation by genomic imprinting. Genes Dev 22, 1141-1146.

Wu, S., Romfo, C.M., Nilsen, T.W., and Green, M.R. (1999). Functional recognition of the 3' splice site AG by the splicing factor U2AF35. Nature 402, 832-835.

Yamada, T., Yamaguchi, Y., Inukai, N., Okamoto, S., Mura, T., and Handa, H. (2006). P-TEFb-mediated phosphorylation of hSpt5 C-terminal repeats is critical for processive transcription elongation. Mol Cell 21, 227-237.

Yang, Q., Coseno, M., Gilmartin, G.M., and Doublie, S. (2011). Crystal structure of a human cleavage factor CFI(m)25/CFI(m)68/RNA complex provides an insight into poly(A) site recognition and RNA looping. Structure 19, 368-377.

Yang, Q., Gilmartin, G.M., and Doublie, S. (2010). Structural basis of UGUA recognition by the Nudix protein CFI(m)25 and implications for a regulatory role in mRNA 3' processing. Proc Natl Acad Sci U S A 107, 10062-10067.

Yang, Z., Yik, J.H., Chen, R., He, N., Jang, M.K., Ozato, K., and Zhou, Q. (2005). Recruitment of P-TEFb for stimulation of transcriptional elongation by the bromodomain protein Brd4. Mol Cell 19, 535-545.

Yao, C., Biesinger, J., Wan, J., Weng, L., Xing, Y., Xie, X., and Shi, Y. (2012). Transcriptome-wide analyses of CstF64-RNA interactions in global regulation of mRNA alternative polyadenylation. Proc Natl Acad Sci U S A 109, 18773-18778.

Yao, C., Choi, E.A., Weng, L., Xie, X., Wan, J., Xing, Y., Moresco, J.J., Tu, P.G., Yates, J.R., 3rd, and Shi, Y. (2013). Overlapping and distinct functions of CstF64 and CstF64tau in mammalian mRNA 3' processing. RNA 19, 1781-1790.

Yao, S., Neiman, A., and Prelich, G. (2000). BUR1 and BUR2 encode a divergent cyclin-dependent kinase-cyclin complex important for transcription in vivo. Mol Cell Biol 20, 7080-7087.

Yoh, S.M., Cho, H., Pickle, L., Evans, R.M., and Jones, K.A. (2007). The Spt6 SH2 domain binds Ser2-P RNAPII to direct Iws1-dependent mRNA splicing and export. Genes Dev 21, 160-174.

Yoh, S.M., Lucas, J.S., and Jones, K.A. (2008). The Iws1:Spt6:CTD complex controls cotranscriptional mRNA biosynthesis and HYPB/Setd2-mediated histone H3K36 methylation. Genes Dev 22, 3422-3434.

Yonaha, M., and Proudfoot, N.J. (1999). Specific transcriptional pausing activates polyadenylation in a coupled in vitro system. Mol Cell 3, 593-600.

Youdell, M.L., Kizer, K.O., Kisseleva-Romanova, E., Fuchs, S.M., Duro, E., Strahl, B.D., and Mellor, J. (2008). Roles for Ctk1 and Spt6 in regulating the different methylation states of histone H3 lysine 36. Mol Cell Biol 28, 4915-4926.

Yu, L., and Volkert, M.R. (2013). UV damage regulates alternative polyadenylation of the RPB2 gene in yeast. Nucleic Acids Res 41, 3104-3114.

Yuryev, A., Patturajan, M., Litingtung, Y., Joshi, R., Gentile, C., Gebara, M., and Corden, J. (1996). The CTD of RNA polymerase II interacts with a novel set of SR-like proteins. Proc Natl Acad Sci U S A 93, 6975-6980.

Zamore, P.D., Patton, J.G., and Green, M.R. (1992). Cloning and domain structure of the mammalian splicing factor U2AF. Nature 355, 609-614.

Zarudnaya, M.I., Kolomiets, I.M., Potyahaylo, A.L., and Hovorun, D.M. (2003). Downstream elements of mammalian pre-mRNA polyadenylation signals: primary, secondary and higher-order structures. Nucleic Acids Res 31, 1375-1386.

Zeitlinger, J., Stark, A., Kellis, M., Hong, J.W., Nechaev, S., Adelman, K., Levine, M., and Young, R.A. (2007). RNA polymerase stalling at developmental control genes in the Drosophila melanogaster embryo. Nat Genet 39, 1512-1516.

Zeng, C., Kim, E., Warren, S.L., and Berget, S.M. (1997). Dynamic relocation of transcription and splicing factors dependent upon transcriptional activity. EMBO J 16, 1401-1412.

Zhang, C., Li, W.H., Krainer, A.R., and Zhang, M.Q. (2008). RNA landscape of evolution for optimal exon and intron discrimination. Proc Natl Acad Sci U S A 105, 5797-5802.

Zhao, J., Hyman, L., and Moore, C. (1999a). Formation of mRNA 3' ends in eukaryotes: mechanism, regulation, and interrelationships with other steps in mRNA synthesis. Microbiol Mol Biol Rev 63, 405-445.

Zhao, J., Kessler, M., Helmling, S., O'Connor, J.P., and Moore, C. (1999b). Pta1, a component of yeast CF II, is required for both cleavage and poly(A) addition of mRNA precursor. Mol Cell Biol 19, 7733-7740.

Zhou, M., Halanski, M.A., Radonovich, M.F., Kashanchi, F., Peng, J., Price, D.H., and Brady, J.N. (2000). Tat modifies the activity of CDK9 to phosphorylate serine 5 of the RNA polymerase II carboxyl-terminal domain during human immunodeficiency virus type 1 transcription. Mol Cell Biol 20, 5077-5086.

Zhou, Z., Licklider, L.J., Gygi, S.P., and Reed, R. (2002). Comprehensive proteomic analysis of the human spliceosome. Nature 419, 182-185.

Zhu, Y., Pe'ery, T., Peng, J., Ramanathan, Y., Marshall, N., Marshall, T., Amendt, B., Mathews, M.B., and Price, D.H. (1997). Transcription elongation factor P-TEFb is required for HIV-1 tat transactivation in vitro. Genes Dev 11, 2622-2632.

APPENDIX A

CHAPTER III SEQUENCING LIBRARIES

APPENDIX B

SPT6 AFFECTED POLY(A) SITES

Spt6 shRNA #1: Genes with Upstream 3’ UTR Shifts NIPAL3 ZMIZ2 IL17RA CSNK1E USP11 ZNF3 JMJD4 CDC27 CHKA METTL4 SULF2 SF1 LSM8 U2SURP VASH1 CDK6 MRFAP1L1 TOP1 DIS3 COPS6 CLIC4 CENPM PIK3C2A SMC1A CLIP1 MZT1 NKTR EIF2A MCM5 RNF141 EEA1 PCDH7 SNRPB2 DDX17 RFC2 MYH9 CPSF6 MTMR4 CHML RGP1 HMGN1 TAF8 ST13 IQGAP1 PLA2G16 TOMM20 C6orf62 ARHGAP5 DCTN1 CERK CRTC3 ANO5 C1orf198 ITM2B TIA1 MSH5 SIX4 SECTM1 LMNB2 SRP9 UQCRQ HSBP1 CDK11A USP31 SOD1 CUL4A CNIH4 MBTPS2 FYCO1 SNF8 KNOP1 CELSR2 ZNF26 MARCH2 POMP PDIA3 GUCY1B3 KCTD9 SNRPG BRCC3 LYPLAL1 RDH11 UBE2J1 MYO10 AES SLC20A1 ATP6V0A2 NENF PPAPDC2 DNTTIP2 PFDN1 CAV2 CCNA2 SULT4A1 PPP1R15B MT-TF RPL15 LUC7L3 ZNF688 NDUFS6 PHF12 TMEM183A MT-TG ATP13A3 SKP2 NUFIP2 FAM177A1 MTMR3 CYB5R1 CAMSAP1 CDKN1B CCNG1 SHPK PLOD2 CMC1 RABIF COX16 SGOL1 JADE1 TMEM97 UBASH3B SRA1 GLRX2 SNX9 FANCI UBE2D2 PRPF19 ATP5G3 SDC3 RNF2 ORMDL1 PDHX PAICS PRDM4 PDLIM3 ZNF224 FAM129A SLC33A1 GNG4 PCBD2 UBE3A ABCF2 SPRYD4 DHX9 FAM50A TPRA1 TMA16 TTL GATAD1 SMIM15 PIGM UBTD2 KDM5C SREK1IP1 MED28 ZNF19 USP1 GPATCH4 PRDM5 VTA1 MAP1B CRIPT ZNF562 KCTD21 INPP5A TRPT1 RIF1 ENPP1 ZC3H13 RPL29 CDK13 ACADSB C16orf87 XPC PARP8 PLA2G12A MAPKAPK2 DST IGSF3 KIF20A TOX4 LACTB2 ELK1 CDC42EP3 SCIN PCMTD2 SMARCAD1 RP4-665J23.1 HDAC2 HOXD11 LZTFL1 LGALS8 SLC30A7 QPRT KTN1 ATP6V0E1 DGCR14 RPN1 SRSF10 TMEM56 SNX10 NCOR1 EIF3H ARPC1B FBXW11 FAM111B GCLM SH3KBP1 SEC62 SREK1 MRPS25 HEATR3 NR3C1 EXOSC1 SLC38A1 CEP250 MTUS1 WBP2 ZNF319 HMGN3 LCOR POLR1D MTIF3 ERICH1 H3F3B PHB CLTB STAG2 P4HA2 PRKRIP1 TXNRD3 RFXAP MARK4 OSTF1 GHITM CBX6 QTRTD1 RPS25 IMPA1 GLOD4 CHST15 SH3GLB2 C22orf29 PNISR EIF4G2 NARS TK1 WDR3 TCEAL4 SERF2 ENO1 ACER3 PRPF38A MLST8 H2AFV TOMM34 TBC1D10B THOC2 NASP TFCP2 PDS5A ZFP30 MAFB CWC22 METTL10 RPS6KA4 MPHOSPH6 PCBP1 FAM114A2 AK2 AC006547.8 ATP1B3 DPF2 DYNC1LI2 HSPA4 SMC6 S100PBP SETD5 PREB DDHD2 PDCL TMEM43 GIT2 RPA2 SMARCA4 PQLC2 CCDC34 TRMT5 FASN HNRNPH3 PRPF4 ENO1-IT1 SMC4 PPFIA1 RSBN1 PDCD6IP AP3M1 INIP GOSR2 MTHFD2 TBCEL MESDC2 HOOK3 GTF2E2 TXN 
RP11-54O7.3 CD47 EPB41L2 SLC30A4 NFYC ZNF770 EDA2R RHOA ACIN1 ODF2L IMPAD1 FAM192A AREL1 NELFE LUZP1 WNT5A NONO COPE FKBP2 PEX2 BICD2 PAX6 FNDC4 ZKSCAN1 MCM6 KRAS HELZ IER3 BRAP GLT8D1 TSN ORC4 RHOT2 KLHL9 MDC1 RPL31 MRPS16 RSPRY1 CPOX ZFP91 VSIG10 PCSK5 RP11-488L18.4 CCAR1 FKBP5 PPAP2A RAB23 PHF2 GPR180 PRPF39 BRD2 POLDIP2 SSR3 NFX1 WDSUB1 ARAF ZNF561 TSC22D2 ALKBH5

HNRNPF PPP1R12A PKNOX1 COX4I1 CTRL NARS HNRNPUL1 LETMD1 NACA ANKRD17 ARPP19 FLII MRPL4 EEF2 DCTN2 SYNE2 RPL28 TIPIN RECQL5 NSRP1 DHPS CHURC1 SRP54 TRPM7 C15orf39 RPL36 POP4 MED29 CD63 IRX5 HN1L RBMX KCTD1 FZR1 CALM3 C14orf80 LRRC28 BRD7 TBC1D24 SPAG5 ZNF850 DTNA UBE2N SEC11A NETO2 ITGAE ZNF230 PRKAR1A MRPL35 SPATS2 HSP90AA1 PHKB ATPAF1 RP11-15A1.7 ALDH16A1

Spt6 shRNA #1: Genes with Downstream 3’ UTR Shifts FKBP4 SAFB2 LAMA3 PYGO1 NUP98 CDYL COX20 RPAP3 EIF2S3 TXNL4A ABR CUL1 ICMT CEP170 RPS20 RPA1 ECSIT NUDCD2 SOX2 PYCR2 MTR SLC39A9 COQ3 PSMA5 SUV420H1 EID2B PTP4A2 IRF2BP2 BOD1L1 FOPNL JTB NANP ZBTB33 TPRG1L FAM89A DERL2 CCDC59 SDE2 DTYMK SEPHS1 MALT1 PARP1 GPATCH1 MED6 PDIA6 LSM3 ZNF704 ACBD3 KEAP1 SLC38A2 IQSEC1 CBR4 COA3 APP ANGEL2 CA12 ZCCHC11 SETD7 IBTK C19orf68 ZNF546 NSL1 PVR TMEM231 ABT1 FEM1B NPLOC4 ANK1 NEK2 ACTN1 TMEM127 ARMC1 C14orf142 PURA RBMS1 TMEM181 SPG21 MRPS9 GLUD1 JAGN1 P4HB CKB NUCKS1 GOLGA3 MED4 PPRC1 SRP68 ACTG1 TBX3 SNRPE KDELR3 ADAM10 CCDC82 TMOD3 SUMO3 SULT1A1 GINM1 GTPBP1 RSL24D1 DUSP6 RBM4 MINA SNRNP35 AHI1 TTLL12 BCAT1 YBX3 DENND6A PGP MLLT4 HBS1L ACO2 PPP1R12A GUF1 ZMYND8 ZNF555 NFYA POGK SNAPC1 CCND2 MED13L GNPDA1 PITPNB DDX21 PTPRK AHCY RAB21 SRP19 ZBTB4 GPATCH8 ZNF195 SNX27 NME3 KLF3 UBP1 PPP1CA HM13 HNRNPA2B1 PDSS2 HSDL1 ARL1 HOXA10 GNAS BTF3 NRD1 TIAL1 EMC2 CLEC16A ADAMTS1 ZMYM4 TUBA1B ZNF607 APH1A PPP2CB BLMH NUP205 WSB2 LARP1 SNU13 FAM204A SF3A2 TAOK1 BACH1 RP11-542C16.2 TNPO1 NUDCD3 PHGDH TSPAN12 NAA25 RAB11FIP1 ROCK2 NIPA1 UBE2H DKC1 RBM28 PTGES3 APPL1 ZNF786 ITPRIP STRN3 DNASE1L1 SETX WDR59 HIST1H2BD EEF1A1 ZNF394 CENPL EVI5 PFN1 PSMB1 SSU72 CLK4 RERE CNNM1 YTHDF1 SLC25A11 TRAM1 NR2F6 VAMP2 UBQLN2 PBRM1 RP11- TMEM109 ACTR3 ALKBH4 VRK3 ING1 CDK11A PTP4A1 MLEC SC5D MAML1 WHSC1L1 ARHGEF3 MOB3A SLC9A6 MRPL51 GSK3B RPL26 HADHB SATB1 ADIPOR2 GADD45A ASF1A YWHAE ARL5A PABPC1 IMPDH1 FAM212B TM9SF3 RPS12 NSUN2 PCOLCE2 HDDC2 ABHD12 BRCA1 NFIA CEBPZ VDAC1 SHROOM3 FOXP1 SERINC1 FRMD4A DHCR24 CDC7 SRF MRPL33 RAPH1 TUBB4B RAB3GAP2 NDC1 RPL22 MCCC1 ICE1 MPDZ GTF2F2 ANKRD12 B4GALT5 WDR77 TMEM115 RPS23 DCTPP1 MIA3 ATG2B DDX27 MYL12B DIDO1 PHAX KARS GOLGB1 ANKS1A GTF3C4 IRF2BPL N4BP2L2 KIAA1429 FUBP3 PPP1R9A ZNF148 RAPGEF1 PANK3 TMBIM6 WAPAL HACL1 LDOC1L GLT8D2 YRDC PFN2 ANKRD52 KIAA0101 CLK1 TMEM189-UBE2V1 MPC1 MEAF6 SERP1 TSPAN3 MAPRE2 MYPOP JARID2 PJA2 UNC5B ATP5E TICRR NUDT21 ZBTB7A DLX1
RUNDC1 AGO1 DNAJC2 CDR2 ATP5L E2F7 SEMA6A SMOC1 EIF4EBP2 FOXA1 ZFHX3 DDB1 MRE11A E2F1 KIF1BP C6orf106 ESYT2 SSH2 UBXN1 TAF1D UBTF HNRNPU ATP6V1G1

BSPRY ZNF326 IRGQ DDX3X CNOT4 KBTBD4 NFATC3 UGCG RPL34 PUM1 AC079922.3 NCL ACADM DNAJA2 SRRM1 SLC39A8 DECR2 TMED4 CEP104 RBM25 CMC2 KIF4A PSAP RNF216 DDX42 XRCC2 ADM COX4I1 LPAR1 ALAS1 MAP4K4 DNTTIP2 HNRNPA2B1 ASRGL1 NFATC2IP NCBP1 AFF1 PKP4 SUCLG2 PSIP1 CAPRIN1 SCAPER TSR2 VDR HUWE1 CPSF3 SCHIP1 TPT1 XPO6 UPF3A JMY RP4-592A1.2 ARHGAP30 XPA UBTF KNOP1 SVIL TRAK1 PSMD4 RBM39 ARHGEF26 AP3M2 BBS4 ABHD13 NBN WDR75 SIKE1 FNDC5 CUL5 CCNDBP1 YME1L1 FXN OGG1 TKT CEBPA PICALM COQ9 TLE1 TRIM39 MBD4 UPRT ETF1 RPS3 SNUPN GNL1 PPARG RPL32 U2SURP DCTD ALKBH3 KIF22 DRAP1 PABPN1 PRNP SMC1A CRNDE TRIM3 PSMD7 KLF9 ARL16 MRPL11 HBP1 TMEM167A APLP2 ADPGK DTD1 ZNF709 SUN1 YDJC LSM6 BCLAF1 CFDP1 BTN2A1 PQLC1 PPT1 ACBD3 SPCS3 RPL8 RSL1D1 VPS36 KHSRP XPO7 AMD1 TMEM161B METAP2 KANSL1 NPHP4 DPYSL3 AC007238.1 ATPIF1 TIMM8B WNK1 MIS12 SLC35B4 TOMM6 TAPBP ARID4B ZGRF1 TRMT112 FN3KRP TPT1 DTNA XIST NOP56 ZFYVE16 MELK TRIM37 SAT1 RBM12B EMC8 FTSJ2 MAN2B2 POLR3B TOP2A IPO7 QSER1 C5orf24 RPL13A WDR36 OAT CUEDC1 RBBP7 COLEC12 HSPH1 SLC19A1 SKIV2L2 RPAIN B4GALT6 ANKRD16 DIAPH3 CRELD1 DAXX TXNDC15 METTL3 CLTC CWC27 CTCF PTPN12 TRAPPC6B JAKMIP2 NPC2 CPD KLHL42 UQCR10 MLX RABGAP1L ZNF827 TMEM120B EIF4A1 THUMPD1 XPO1 SLC22A23 LDHB TSPAN5 RAP1B COX10 IDI1 TTC32 PDLIM5 DNAJC5 COX7C C12orf73 GRB2 MTMR6 RTN4 OXR1 WRAP73 NFKBIB LAMTOR1 RPL19 DCAF16 UBE2I COG1 CTC-359D24.3 C5orf42 SMAD5 SOCS6 FSCN1 COMMD6 STYX PSMG4 PHF3 CHERP SAFB2 SAP18 ICOSLG PRRC1 WDR46 NME1 HNRNPA1 JUP MRFAP1 BTRC DYM DARS DROSHA GATC LASP1 N6AMT2 SSB ZNF146 HINT2 MTRR SMARCC2 KANK2 HACD2 SPATS2L YWHAH DNAJC2 H2AFY ELK3 SEC14L1 MT-TH CHN1 HNRNPU NSRP1 CANX SLC48A1 BCAS3 UNKL PRPF40A EFHD2 ITGB3BP CLIP1 COPZ1 TMEM101 ATXN7L3 FOXP2 ZNF473 SDCCAG8 FIP1L1 NAA25 WDR7 RTF1 ZNF655 FKBP15 CLK2 DCP2 FKBP11 PLEKHH3 TMEM87A HAT1 TRRAP HNRNPR TCOF1 GTF2A1 SAFB2 CCNK DIABLO RAB5A C3orf62 KHDRBS3 UBR7 NCLN ARL4C PAPOLG MANF MICU2 CSPP1 PAPOLA SPC24 ZNF576 ZNF200 DNAJC25 ZNF282 C5orf45 KLC1 NDUFB7 ZNF569 BCL2L11 DCTN4 RSRC1 FBXO32 CDK2 NUP62 RFC5 
ZNF101 NEDD9 TRA2B TACC1 LGMN UBA52 TMPO ECE1 ZNF687 PRPF4B SREK1 NEMF HNRNPM ZNF800 THAP5 TMEM107 ATXN2 DPYSL2 GPATCH2L SYMPK HNRNPH1 ABI2 TCEA3 RPL29 SGCE AP5M1 ZNF160 ZNF23 KNSTRN MRPL21 PLS3 PNMA2 KMT2C UPF1 MRPL4 SNHG7 CYR61 CAMTA1 GDAP1 EMC4 ATF5 PSMC3IP SATB2 L3MBTL2 SRSF6 LYRM2 PSMA4 SNRNP70 NPM1 HMGXB4 MAPK9 HNRNPU FBXO28 EIF3J HNRNPL CHD2 STK3 NR2F2 STMN1 HSPA9 AP4E1 BRD4 RPL23 DPF1 ERP29 DAPK1 TMEM41B NUSAP1 RNF40 ATP5G2 ZNF697 NT5C3A MDK POLR2G PKM MRPL37

APPENDIX C

CHAPTER IV SEQUENCING LIBRARIES

APPENDIX D

CHAPTER IV PRIMERS

APPENDIX E

CHAPTER V SEQUENCING LIBRARIES

*Note that libraries are not biological replicates. The same libraries were additionally sequenced to obtain more reads.


APPENDIX F

CHAPTER V PRIMERS


APPENDIX G

SNRNP AND SSA AFFECTED POLY(A) SITES

U1 ASO: Genes with Increased Poly(A) Sites ABCC9 BUB1 CTNNA3 FIS1 KLHL7 NR2C2 PUS7 ABHD12 BZW2 CTNNB1 FKBP4 KTN1 NUP133 PVRL1 ABLIM1 C16orf52 CTPS1 FRAS1 KXD1 NUP155 PXMP2 AC009403.2 C17orf58 DCAF10 FRMD8 LAMC3 OBSCN RABEPK ACAT1 C17orf75 DCUN1D5 FUS LAP3 OFD1 RANBP6 ACLY C17orf85 DDB2 G3BP1 LIG1 OLFM2 RARS ACTR10 C19orf68 DDX19A GAL LONP1 OSBPL1A RBBP8 ACTR6 C9orf3 DDX21 GAPVD1 LOX OSBPL5 RBFA ADGRA3 CA5B DHX57 GARS LPCAT1 PADI4 RBM26 AEBP2 CACNA1A DLD GATA2 LRBA PAIP2 RBM28 AEN CACNA1D DLST GFPT2 LRPPRC PALLD RBM34 AFAP1L1 CAMK1D DNAH12 GLIS2 LRRC28 PAPD5 RELB AGPAT5 CANT1 DNAH17 GORAB LRRC59 PAPSS1 REXO4 AK5 CAPZA1 DNAH7 GPALPP1 LRSAM1 PARD3 RIN2 AKNAD1 CASC3 DNAJA2 GPBP1 LSG1 PCBD2 RMND5A ALCAM CBX2 DNAJC8 GPR176 LSS PDE2A RNF103 ALDH1A3 CCDC47 DOCK11 GRB10 LTV1 PDLIM1 RNF149 ALDH3B1 CCDC71 DOT1L GRB2 LYRM5 PDLIM2 RNMTL1 ALDH4A1 CCNB1IP1 DPY19L2 GTPBP4 MAP4K5 PDPR RNPS1 ALG8 CCNG1 DROSHA HBS1L MAPK8IP3 PGBD1 ROCK1 AMMECR1 CCT6A DYNC1H1 HEATR6 MBTPS1 PGM1 ROCK2 ANK2 CCT8 EAF2 HELLS MCF2 PHF10 RP11- ANKRD13A CDC37 EEF1D HMGXB3 MCM3 PIGT RP11- ANKRD13C CDC6 EIF2B5 HNF4A MCM7 PKN1 RPE ANKS6 CDCA4 EIF3A HNRNPH1 MEF2A PLCXD2 RPS4X ANP32A CDCA7 EIF3B HPRT1 MERTK PLEKHG1 RPSA AP3S2 CDK5RAP2 EIF3D HSDL1 METTL2B PNO1 RPUSD4 APEH CDK6 EIF3G HSP90AA1 MFSD8 POLA1 RRM1 APPL1 CDV3 EIF4B HTATSF1 MIPOL1 POLA2 RRP1 AQR CEBPZ EIF4G1 IBTK MMP15 POLR1C RRP12 ARFRP1 CENPJ ELAC2 IDH1 MRPL21 POLR2M RTF1 ARHGAP29 CEP192 ELAVL1 INHBC MRPS30 POLR3C RUFY2 ARHGEF7 CEP290 ENPP1 INO80E MSR1 POLR3K RUVBL2 ARID4B CEP95 EPC2 INSIG1 MTG2 POTEJ SARS ARL6 CHCHD3 EPHA2 INTS10 MTUS2 PPFIA1 SATB2 ARL6IP6 CHD2 EPS8L2 ITCH MYH11 PPFIA2 SCNM1 ARMCX1 CHEK2 ESYT1 ITGA8 MYO10 PPP1R8 SCUBE2 ASIC2 CHMP3 ETV5 ITGB8 MZT1 PRDM4 SDCCAG3 ATAD1 CLNS1A EVA1C JARID2 NBEA PRKAR1A SEC31A ATAT1 CLPTM1L EXD3 JPH2 NBPF3 PRKAR2A SEC61A1 ATCAY CLVS1 EXOSC7 KANSL1L NBR1 PRMT3 SEL1L ATF3 CNOT6 F3 KARS NCAPG PROS1 SEMA3E ATG4C COBL F8 KAT5 NCOR2 PRPF31 SETD5 ATP1B1 CORO7 FADS1 KCNQ5 NEBL PRPF6 SF3B1 ATP7A CPSF6 FAH KDM1A 
NEDD4 PSMD11 SH2B1 ATP9B CRKL FAM126B KDM8 NEK9 PSMD3 SHTN1 ATRX CS FAM172A KIAA0020 NELFB PTAR1 SKAP2 ATXN7 CSNK2A1 FASTKD2 KIAA0196 NFYC PTP4A2 SLC16A14 B4GALT2 CSNK2B FBRSL1 KIAA0825 NHLRC3 PTPDC1 SLC25A3 BAZ1B CSNK2B FBXO34 KIF13A NLRC5 PTPN12 SLC27A4 BNIP2 CTAGE5 FIBCD1 KIF24 NMD3 PTPRG SLC33A1 BRMS1 CTD-2207O23.3 FIGN KLC2 NOL11 PTTG1IP SLC38A10

SLC38A2 SP3 TARSL2 TNPO1 TTL VWF ZNF114 SLC38A9 SPDL1 TCF25 TOP3A TUT1 WDR3 ZNF121 SLC43A3 SRP68 TCP1 TOPBP1 UBE2D3 WDR76 ZNF131 SLC5A1 SRSF11 TEX14 TP53BP1 UBE2E3 WNT7B ZNF341 SLC7A5 ST6GALNAC6 TEX22 TPK1 UBR3 WRNIP1 ZNF367 SLTM STK16 TGFB1I1 TPR UIMC1 XPO5 ZNF568 SMAD4 STK4 TIGD6 TPTE2 UPF3B XRCC3 ZNF639 SMARCA5 STPG1 TIPRL TRAF7 UPP2 YARS ZNF71 SMC2 STRIP1 TM9SF3 TRDN UQCC1 YKT6 ZNF730 SMCHD1 STXBP5 TMC1 TRIM52 USP12 YWHAQ SMYD4 SUSD1 TMED10 TRIP13 USP25 YWHAZ SNAPC4 SYMPK TMEM132D TRMT6 USP43 ZBED9 SNW1 SYT16 TMEM161B TSPAN17 VCP ZCCHC6 SNX16 TAF6L TMEM80 TTBK2 VRK2 ZMAT3 SORBS2 TARDBP TMLHE TTC4 VTA1 ZMYND19

U1 ASO: Genes with Decreased Poly(A) Sites ABCA13 ATIC CCDC82 CRIPT DYNC2H1 FUBP3 HSP90B1 ABHD3 ATP2B4 CCNB1 CSDE1 EAF2 FXR2 HSPA14 AC009403.2 ATP2C1 CCNG1 CSPP1 EDC3 GABARAPL1 HSPA4L AC104534.3 ATP7A CCT4 CSTF3 EDEM1 GALNT2 HYOU1 AC109829.1 ATP8A2 CCT6A CTAGE5 EED GAMT IARS2 ACADL ATRX CD47 CTD-2116N17.1 EEF1G GAPVD1 ICE1 ACOXL AUH CDC26 CTD-2135J3.4 EGLN1 GART ICMT ADAM9 B9D2 CDC42 CTGF EIF2AK4 GAS2L1 IGF2BP3 ADGRE1 BAG1 CDC5L CTNNA3 EIF2S3 GCFC2 IGF2R AGAP1 BAG6 CDCA3 CUL2 EIF3E GCOM1 IGFL2 AGBL1 BAZ1A CDH5 CUL7 EIF3M GET4 IKBKB AGGF1 BAZ1B CDKAL1 DARS EIF5B GGA2 IKZF3 AKR1D1 BCAS1 CDKL3 DBN1 ELAVL4 GGPS1 IL18 ALDH7A1 BCLAF1 CENPE DBNL EMB GMPR2 IL7 ALMS1 BDP1 CENPI DCTN2 EMC3 GNB1 IMPDH2 AMOTL1 BNIP3L CENPL DDX1 ENPP6 GNB2L1 INTS10 AMY2A BOD1 CENPN DDX21 EPB41L4B GNPAT IPO8 ANK2 BOD1L1 CEP104 DDX3X ERC1 GOLGA8A IQGAP1 ANKAR BRD4 CEP76 DDX3Y ERGIC1 GOLGA8B IRS1 ANKIB1 BTAF1 CGB1 DDX42 ERGIC2 GOLIM4 ITGB1 ANKRD13C BTF3 CHD2 DDX46 ESF1 GPATCH8 ITSN2 ANKRD30B BTG3 CHEK2 DDX60 ESPL1 GPC5 KATNA1 ANKRD55 BZW1 CHN2 DDX60L EVPLL GPR107 KATNBL1 ANKS3 BZW2 CHPT1 DEC1 EXOSC10 GPR149 KCTD16 ANXA2 C11orf58 CHST6 DESI1 EXOSC9 GRTP1 KCTD9 ANXA5 C15orf38-AP3S2 CHUK DHRS7 EYA2 GSR KDM3A ANXA6 C15orf41 CIRH1A DHRSX F8 H2AFY KDM4C AP3B2 C16orf62 CKLF-CMTM1 DHX34 FAAH2 HBS1L KHDRBS1 AP3S2 C1orf27 CLDN12 DHX57 FAM101A HCCS KIAA0020 APAF1 C1RL CLPTM1L DIAPH3 FAM135A HCN3 KIAA0196 API5 C3 CLPX DIMT1 FAM13B HEATR1 KIAA0586 APLP2 C4orf17 CMPK1 DLX6 FAM162A HEATR6 KIAA0825 APMAP C5orf30 CNKSR2 DMRT1 FAM172A HLA-B KIAA1755 APOOP5 CA4 CNN2 DMXL1 FAM189A2 HMGXB3 KIF11 AQR CACNA1A COL14A1 DMXL2 FAM19A2 HNRNPA1L2 KIF13A ARAP1 CAGE1 COL3A1 DNAH11 FAM208A HNRNPA2B1 KIF1BP ARID5B CAMSAP2 COL4A2 DNAJA1 FARSB HNRNPC KIFC3 ARIH2 CAND1 COPA DNAJA2 FASTKD2 HNRNPCL1 KIR2DL4 ARMC4 CANX COQ5 DNAJA4 FBXO11 HNRNPH3 KIR3DL1 ARMT1 CASQ2 COX10 DOCK5 FCF1 HNRNPK KIR3DL3 ARPP21 CC2D2A COX4I1 DONSON FGF18 HNRNPL KITLG ASAP1 CCDC150 CPLX1 DPH7 FIGNL1 HNRNPLL KRT222 ASPH CCDC18 CPNE3 DSCR3 FMN2 HNRNPR KSR2 ATAD1 
CCDC47 CPT1A DSEL FNDC7 HNRNPU KTN1 ATE1 CCDC58 CR1 DSN1 FOXL1 HRC L3MBTL3 ATF7IP CCDC77 CRBN DYNC1H1 FOXP1 HS6ST2 LARS ATG16L1 CCDC79 CRIM1 FSD2 HSF2 LAX1

LDHA MYOF PDE1A RAB18 SAE1 SSR3 UBE3A LDHB MYT1 PDE2A RAB6A SAMHD1 STIM1 UBE4A LGALS3 NAA16 PDE3A RAB9A SAMM50 STK39 UBLCP1 LINC01119 NADK2 PDE3B RABEP1 SCAPER STT3A UBR7 LMAN1 NAP1L1 PDE4B RAD1 SCFD1 STX16 UCHL5 LMBR1 NAP1L4 PDHA1 RAD50 SCRN3 STXBP5 UFL1 LMBRD1 NAPEPLD PDILT RAF1 SEC13 SUGCT UGGT1 LRBA NASP PDSS2 RANBP6 SEC61G SUMF1 UNC50 LRCH1 NBAS PHF20L1 RASAL2 SEL1L SUPT16H UNC5C LRP11 NBEA PI4KA RB1CC1 SEMA6B SUSD3 USH2A LRP6 NBEAL1 PICALM RBBP7 SERBP1 SYMPK USO1 LRRC59 NBPF1 PIGC RBFOX1 SERINC1 SYN1 USP1 LRRC9 NBPF3 PIGK RBM12 SERPINB1 TARS USP13 LRRN2 NCAPD2 PIGN RBM23 SESTD1 TATDN1 USP15 LRSAM1 NCAPG2 PINX1 RBM26 SETD1B TBC1D1 USP34 LTBP1 NCEH1 pk RBM27 SF3A3 TBCE USP47 LTV1 NDUFA9 PKN2 RBM28 SF3B2 TBK1 VMP1 LYAR NEDD4 PLEKHA7 RCC1 SGCE TBPL1 VPS13A LYSMD2 NEIL2 PLOD2 RCN2 SGIP1 TCF3 VPS35 M1AP NFE2L2 PLXNC1 REEP3 SGPL1 TDRKH VPS54 M6PR NFKB2 PMM2 REV3L SIK1 TEK VRK2 MAD1L1 NFRKB PMPCA RFC1 SLC25A24 TERF1 VWF MAEA NFU1 POLA1 RFWD2 SLC26A8 TES WASF1 MAN2A1 NIFK POLDIP2 RHOT1 SLC35B1 TFAP2A WDFY1 MAP3K7 NKIRAS1 POLG2 RIF1 SLC38A2 THADA WDFY3 MARCH6 NLN POU6F1 RIN3 SLC38A6 TIMM17A WDR1 MAST3 NMNAT3 PPAN RNF103-CHMP3 SLC39A8 TKT WDR11 MATR3 NOD1 PPAN-P2RY11 RNF130 SLC44A1 TLK1 WDR12 MAX NOL8 PPARGC1B RNF217 SLC45A4 TLL1 WDR3 MCCC1 NONO PPFIA1 RNF220 SLC6A3 TMBIM6 WDR43 MCF2L NREP PPID RP11-111H13.1 SLC7A1 TMEM105 WDR70 MDGA2 NSMCE4A PPIH RP11-38C17.1 SLC9B2 TMEM184C WDR86 MDH2 NT5E PPIL3 RP11-407N17.3 SMAD5 TMEM222 WDSUB1 MDN1 NUCKS1 PPME1 RPA1 SMARCA2 TMEM44 WLS MED1 NUDT12 PPP2R2A RPAP2 SMARCA5 TMLHE WWP1 MELK NUDT4 PRDM6 RPL10 SMC3 TMPRSS11E XRCC3 METAP2 NUDT5 PRDX5 RPL19 SMC6 TNKS2 XRCC5 MIS18BP1 NUP133 PREP RPL23 SMCHD1 TNRC6A XRN2 MKRN3 NUP153 PRLR RPL37 SNAPC4 TP53TG5 YAE1D1 MLXIP NUP205 PRPF19 RPL37A SNRNP200 TPR YARS2 MMP26 NUP37 PRPF3 RPL5 SNRNP70 TPTE2 YIPF4 MND1 OFD1 PRPF4 RPL7 SNRPD3 TRAF3IP1 YTHDC2 MORN2 OGDH PRPF40B RPN1 SNRPF TRIM33 ZC3H7A MPDZ OR2W3 PSMA2 RPP14 SNTG1 TRIM58 ZDBF2 MPHOSPH10 OSBPL9 PSMA4 RPP40 SNW1 TRIP12 ZFP2 MRPL16 OSMR PSMA6 
RPS14 SNX14 TRMT6 ZMYM6 MRPL18 OXCT1 PSMA7 RPS19 SNX7 TSEN15 ZNF107 MRPL30 PAIP1 PSMB5 RPS26 SNX8 TSEN54 ZNF124 MSH2 PAIP2 PSMC2 RPS6 SORBS2 TSPAN3 ZNF181 MSRA PARP15 PSMC5 RPS7 SOX7 TTC13 ZNF280D MTBP PAXBP1 PTEN RRAGC SPAG6 TTC27 ZNF410 MTMR8 PBK PTGFRN RRM2 SPATA16 TXN ZNF578 MTPAP PBRM1 PTMA RSL24D1 SPATS2L TXNL1 ZNF622 MTSS1 PCBD2 PTMS RSPH1 SPECC1L U2AF1 ZNF664 MUC16 PCMTD1 PTPRG RSRP1 SPG21 U2SURP ZNF720 MYBL1 PCNXL4 PTPRM RTCA SPINK5 UBA1 ZNF81 MYCBP2 PCYT1A PUM1 RUNX3 SPRED1 UBAP2 ZNFX1 MYO1E PCYT1B PWP1 RUVBL1 SPSB4 UBE2R2 ZZEF1 MYO9A PDCD4 QRSL1 RWDD1 SREK1 UBE2T

U2 ASO: Genes with Increased Poly(A) Sites AAMDC ABCB10 ABCD4 ABHD17C AC007040.11 ACAT1 ACSF2 AARS ABCB9 ABCE1 ABLIM1 AC009403.2 ACBD3 ACSL3 AASDH ABCC10 ABCF2 AC003002.4 AC011997.1 ACBD4 ACSL4 ABAT ABCC2 ABHD12 AC004076.9 ACAP2 ACRC ACSS2

ACTN1 ANAPC5 ASAH1 BRWD1 CASP7 CENPI CNOT1 ACTR3B ANAPC7 ASAP1 BSCL2 CAST CENPK CNOT11 ACVR2A ANK3 ASB8 BSG CBWD1 CENPN CNOT4 ADA ANKEF1 ASCC3 BTAF1 CBWD2 CENPO CNOT7 ADAM19 ANKFN1 ASIP BTF3 CBX1 CENPP CNR2 ADAMTS6 ANKH ASRGL1 BTG1 CC2D2A CENPW CNTLN ADARB1 ANKHD1 ASTN2 BTG3 CCAR1 CEP112 CNTN1 ADARB2 ANKHD1-EIF4EBP3ATAD2 BTN3A1 CCDC101 CEP120 CNTN5 ADAT1 ANKMY1 ATAT1 BUD31 CCDC112 CEP128 CNTNAP2 ADCY9 ANKMY2 ATF3 C10orf11 CCDC122 CEP131 CNTRL ADD1 ANKRD12 ATF5 C10orf76 CCDC137 CEP135 COA5 ADGRA3 ANKRD13A ATG13 C11orf30 CCDC138 CEP164 COL12A1 ADGRL3 ANKRD13C ATG4D C11orf70 CCDC144A CEP295 COL25A1 ADH5 ANKRD18B ATG7 C11orf80 CCDC15 CEP57 COL4A5 ADK ANKRD26 ATIC C12orf29 CCDC171 CEP63 COL4A6 ADORA2B ANKRD28 ATL1 C14orf37 CCDC181 CEP89 COL5A2 ADPRM ANKRD29 ATL2 C15orf57 CCDC3 CEP95 COL6A2 ADRBK2 ANKRD42 ATL3 C16orf46 CCDC30 CEP97 COMMD1 ADSL ANKRD44 ATP11C C17orf53 CCDC34 CFAP44 COMMD10 AEBP2 ANKS6 ATP2A2 C18orf54 CCDC66 CFAP54 COPS2 AFF1 ANO4 ATP5B C1orf56 CCDC77 CFAP97 COPZ1 AFMID ANP32E ATP5F1 C1QTNF3 CCDC82 CFDP1 CORIN AGAP3 ANXA1 ATP8B1 C1S CCDC85C CGA CORO7 AGBL1 ANXA11 ATPAF2 C20orf196 CCDC88A CHD1 COX10 AGBL3 ANXA3 ATR C20orf24 CCDC93 CHD2 COX7C AGL ANXA6 ATXN10 C2CD3 CCNE2 CHD3 CPD AGMO AP000295.9 AUH C2orf44 CCNT2 CHD4 CPED1 AGO3 AP000350.10 AUNIP C2orf49 CCP110 CHD7 CPNE3 AGO4 AP1G1 AURKA C2orf70 CCT2 CHEK2 CPSF3 AGPAT6 AP1S3 AUTS2 C2orf81 CCT3 CHID1 CPSF3L AHCTF1 AP2B1 AVL9 C3orf17 CCT5 CHMP2B CPSF6 AHI1 AP3S2 AVPI1 C3orf62 CCT7 CHMP4A CPT1C AHNAK AP4E1 AZI2 C3orf67 CCZ1 CHN1 CRCP AHSA2 APOD AZIN1 C4orf27 CCZ1B CHN2 CREB3 AIMP1 APOOL B2M C5orf28 CD109 CHORDC1 CREB5 AIMP2 AQR B3GALNT2 C5orf34 CD163L1 CHPT1 CRHR1 AJUBA ARAF BAG1 C5orf66 CD44 CHRNA5 CRISPLD2 AK2 ARAP2 BAIAP2L1 C6orf89 CD58 CHRNA7 CRLF3 AK4 ARGLU1 BAZ1A C8orf59 CDC37 CHST11 CRMP1 AK9 ARHGAP10 BAZ2B C9orf84 CDC40 CHSY3 CRYBB1 AKAP11 ARHGAP19 BBS2 CA10 CDC42BPA CKAP2 CSMD1 AKAP2 ARHGAP19-SLIT1BCAS4 CA5B CDC42BPB CKAP5 CSMD3 AKAP3 ARHGAP21 BCL2L11 CAB39 CDC73 CKLF CSNK1A1 AKAP6 ARHGAP27 BCL2L14 
CABLES1 CDCA7L CKLF CSNK1G1 AKIRIN2 ARHGAP28 BCO2 CACNA1A CDH19 CLASP1 CSNK2A2 AKNAD1 ARHGAP44 BDP1 CACNA2D3 CDH4 CLCC1 CSPP1 AKT3 ARHGAP8 BET1L CACNB4 CDK19 CLDN14 CTAGE5 ALDH1A2 ARHGEF1 BFAR CACNG7 CDK5RAP2 CLGN CTB-60B18.6 ALDH3A2 ARHGEF11 BFSP1 CAD CDKAL1 CLIC2 CTC-360G5.8 ALDH7A1 ARHGEF26 BIRC2 CADM2 CDKL3 CLSPN CTCF ALG9 ARID4A BMP8B CADPS2 CDKL5 CLTCL1 CTD-2135J3.4 ALOX12-AS1 ARIH1 BNC2 CAGE1 CDKN1A CLVS1 CTD-2510F5.6 ALPL ARL2 BNIP1 CAMKMT CDS1 CLYBL CTNNA3 AMBRA1 ARL2BP BPNT1 CANX CDV3 CMB9 CTNNB1 AMD1 ARL3 BRAP CAPRIN1 CDYL CMTM1 CTPS1 AMN1 ARMC6 BRCA2 CAPS2 CEBPZ CMTR2 CTSZ AMY2B ARNT BRD4 CASC4 CELF2 CNBD1 CTTN ANAPC16 ARSH BRD7 CASC5 CENPC CNKSR2 CUTC ANAPC4 ARSJ BRD8 CASK CENPE CNN2 CWC22

CWF19L2 DNAH12 ENG FANCB GANC GRIN2B HSPA4 CXorf23 DNAH14 ENOX1 FANCD2 GAPVD1 GRIN2D HSPA9 DAB1 DNAH5 ENTPD5 FANCI GART GRIPAP1 HSPB8 DAB2 DNAJB11 EP400 FAP GAS6 GRK5 HTR2C DAB2IP DNAJB4 EPC1 FARS2 GATA2 GRTP1 HYOU1 DAP3 DNAJC10 EPHX4 FARSB GBAS GSR IARS DAPK2 DNAJC25 EPN1 FBL GCC2 GSTCD ICE1 DARS2 DNAJC25-GNG10EPN2 FBLN7 GCDH GSTO1 IDH3A DBF4 DNAJC8 EPRS FBN1 GCH1 GTDC1 IFI44 DBF4B DNER EPS15 FBXL13 GCN1 GTF2F2 IFIH1 DBR1 DNM2 EPS15L1 FBXO17 GCNT2 GTF2IRD2 IFNAR2 DCHS2 DNMT1 ERBB2 FBXO4 GDF15 GTF3C3 IFRD1 DCLK1 DNTTIP2 ERBB4 FBXO42 GDI2 GTPBP2 IFT52 DCPS DOCK10 ERC2 FBXO46 GDPD3 GUF1 IFT57 DCUN1D2 DOPEY1 ERCC4 FBXO5 GET4 HACD3 IFT88 DDB2 DPH6 ERCC6 FBXO9 GFM1 HACE1 IGF1R DDHD1 DPM1 ERCC6-PGBD3 FBXW7 GGA2 HADHA IGF2R DDX1 DPYD ERCC6L2 FCHSD2 GGA3 HAUS6 IGSF10 DDX10 DSCR3 ERI3 FER GGPS1 HAVCR2 IKBKB DDX18 DSEL ERICH1 FGF14 GHITM HBEGF IL15 DDX19B DSP ERMP1 FGFR1 GIGYF2 HBG2 IL1RAP DDX31 DTL ERP29 FGFR1OP GIPC1 HBP1 IL1RAPL1 DDX3X DUSP22 ESF1 FIGNL1 GLCE HDAC11 IL6 DDX55 DYM ESRP1 FILIP1 GLIS1 HDAC5 IL7 DDX59 DYNC1H1 ETV4 FIS1 GLP2R HEATR1 IMMP2L DDX60 DYX1C1 EXO1 FKBP10 GLS HEATR6 IMMT DDX60L E2F1 EXOC1 FKBP1A GLT1D1 HECTD2 IMPA2 DEAF1 E2F3 EXOC6 FKBP5 GM2A HELB INADL DEK E2F6 EXOSC4 FKTN GMPR2 HELZ ING1 DENND1B EAF2 EXTL3 FLNB GNL2 HERC2 INPP4B DENND4A EBPL EYS FLOT2 GNL3 HEXA INTS10 DENND4B ECM2 EZH2 FMN2 GNL3L HEXB INTS12 DENR EDC4 EZR FMNL2 GNPTAB HEXDC INVS DEPDC7 EDNRA FAAH2 FNBP1L GNS HEXIM2 IPCEF1 DERA EED FAF1 FNIP2 GOLGA8A HIATL2 IPO11 DESI1 EFCAB11 FAM117B FOCAD GOLGA8B HIBCH IPO13 DGKD EFCAB13 FAM118B FOXK2 GOLGB1 HIP1 IPO5 DGKE EFHC2 FAM120B FOXP1 GOLPH3 HIVEP1 IPO8 DGKH EGFLAM FAM120C FRG1 GOLPH3L HIVEP2 IQCA1 DGKZ EGLN1 FAM129B FRG1BP GON4L HM13 IQCB1 DHFR EGLN3 FAM133A FRMD4A GOPC HMG20A IQCK DHX30 EHBP1L1 FAM133B FRMD4B GPATCH1 HMGB3 IQGAP2 DHX32 EHHADH FAM134B FRMD5 GPATCH2 HMGN1 IRF3 DHX36 EIF1B FAM135A FRMD8 GPBP1L1 HMGN3 IST1 DHX40 EIF2AK2 FAM13B FRS2 GPC5 HNRNPD ISY1 DHX57 EIF2B5 FAM160B1 FUBP1 GPHN HNRNPM ISY1-RAB43 DIAPH2 EIF2S2 FAM168A FUCA2 
GPM6A HNRNPU ITGA1 DIEXF EIF2S3 FAM172A FUT8 GPN3 HNRNPUL2 ITGA3 DIP2A EIF3A FAM174B FXYD5 GPR156 HOOK3 ITGA4 DIRC2 EIF3G FAM177B FYB GPR173 HP1BP3 ITGA5 DIS3L EIF3H FAM188B FZD6 GRAMD1C HPCAL1 ITGB3BP DLAT EIF3L FAM198A G3BP2 GRAP2 HPSE2 ITPK1 DLD EIF4G1 FAM200B GAA GRB14 HRH1 ITPR1 DLG1 ELAVL4 FAM208B GABARAPL1 GREB1L HS1BP3 ITPR3 DLG2 ELF2 FAM210A GABPB1 GRHL1 HSF2 ITSN2 DLGAP5 ELL2 FAM53A GABRE GRHPR HSF2BP IWS1 DNA2 ELOVL2 FAM63B GALM GRIA4 HSF5 JAGN1 DNAAF2 ELOVL6 FAM83D GALNT15 GRID1 HSP90AB1 JAK2 DNAH1 EMB FAM98B GANAB GRID2 HSPA12A JAZF1

JPH3 KNTC1 MAGI1 MKRN3 MYH9 NME2 P4HA1 KANK1 KPNA2 MAGI3 MLLT11 MYLK NME8 P4HB KANSL1L KPNA3 MALRD1 MLLT3 MYO10 NMNAT1 PABPC4 KANSL2 KRAS MAMDC2 MMS22L MYO16 NMT1 PACS1 KARS KREMEN1 MAN1A1 MND1 MYO1E NNMT PACSIN2 KAT5 KRT8 MAN2A1 MOB3B MYO1H NOL11 PACSIN3 KATNAL2 KRT86 MAP2K2 MOB4 MYO5B NOL7 PAFAH1B1 KAZN KSR2 MAP2K4 MOCOS MYO6 NOL8 PAICS KCNC4 KTN1 MAP2K5 MON1A MYO9A NOL9 PAK1 KCNG3 KXD1 MAP3K13 MORC4 MYO9B NOLC1 PAK2 KCNH1 LACTB MAP3K2 MORN2 MYOF NOP14 PALM2 KCNH8 LAMA1 MAP3K4 MOSPD1 MYRIP NOP56 PAM KCNIP1 LAMA4 MAP3K5 MOXD1 MYSM1 NOP58 PAN3 KCNK13 LAMB1 MAP4K3 MPP1 N4BP2L2 NOS1 PANK2 KCNMA1 LAMB4 MAP4K4 MPP3 N6AMT2 NOX1 PANK3 KCNMB4 LAMTOR3 MAP7D3 MR1 NAA25 NPC1 PANX2 KCTD1 LANCL2 MAPK8 MRE11A NAA35 NPHP1 PAPD4 KCTD21 LAP3 MAPKAP1 MRPL11 NAA38 NR2C2 PAQR5 KDM1A LARP4 MAPRE1 MRPL30 NAF1 NRD1 PARD3 KDM3A LARS MARCH5 MRPL39 NALCN NREP PARG KDM4C LAS1L MAST4 MRPL42 NAP1L4 NRG3 PARK7 KDM5B LAT2 MASTL MRPL47 NARS NRG4 PARL KHDRBS3 LCLAT1 MB21D1 MRPS22 NASP NSA2 PARP4 KIAA0020 LDHB MBD4 MRPS31 NBEA NSD1 PATL1 KIAA0319L LETM2 MBD5 MSH2 NBEAL1 NSMAF PAWR KIAA0368 LHFPL3 MBIP MSH3 NBPF1 NSRP1 PAXBP1 KIAA0430 LIN52 MBTPS1 MSI2 NBPF20 NSUN2 PBK KIAA0556 LINC00923 MCCC2 MSL2 NBPF3 NSUN6 PBRM1 KIAA0586 LIPA MCMBP MSRA NCAM2 NT5C2 PC KIAA0825 LMAN1 MCU MSRB3 NCKAP1 NT5DC1 PCBD2 KIAA0907 LMBRD1 MECOM MSTO1 NCKAP1L NTN1 PCCA KIAA1033 LNX1 MECP2 MTBP NCL NUCB1 PCDH11X KIAA1217 LONP1 MED13 MTCL1 NCOA1 NUCKS1 PCDH11Y KIAA1324L LPXN MED13L MTF1 NCOA6 NUDC PCDH15 KIAA1551 LRBA MED28 MTG2 NCOR1 NUDT3 PCDHGA1 KIAA1715 LRCH1 MED4 MTHFD1 NDC80 NUDT9 PCF11 KIAA1841 LRP5 MEF2A MTIF2 NDST1 NUP133 PCGF3 KIAA1958 LRPPRC MEF2D MTL5 NDUFAF1 NUP155 PCM1 KIAA2012 LRR1 MEI1 MTMR1 NDUFAF7 NUP188 PCMT1 KIF11 LRRC16A MELK MTMR8 NDUFB6 NUP210 PCMTD1 KIF15 LRRC23 MESDC2 MTPAP NDUFS5 NUP214 PCNT KIF23 LRRC39 METTL4 MTRF1 NDUFS7 NUP43 PCNXL2 KIF26B LRRC3B MFAP1 MTRF1L NDUFV2 NUP54 PCSK1 KIF2A LRRC40 MFF MTRNR2L1 NEDD4 NUP85 PCSK6 KIF3C LRRIQ1 MGEA5 MTRNR2L10 NEIL2 NUPL2 PCYT1A KIF4A LRRIQ3 MGME1 
MTRNR2L11 NEK1 NVL PDCD10 KIF4B LRRTM4 MGMT MTRNR2L12 NEK11 ODF2 PDE1A KIFC3 LRSAM1 MGST3 MTRNR2L2 NELFE OFD1 PDE4B KIN LSM14A MIB1 MTRNR2L6 NETO2 OGFOD3 PDE8A KLF4 LTBP2 MICAL2 MTRNR2L8 NF1 OGT PDGFRB KLHDC2 LUC7L2 MICU1 MTRNR2L9 NFRKB OLFM2 PDHA1 KLHDC3 LYPD1 MIEF1 MTUS2 NGDN OPA1 PDHX KLHDC4 LZIC MIER1 MUM1 NGRN OSBPL5 PDS5B KLHL23 M1AP MIOS MVB12A NIP7 OSMR PDZD2 KLHL29 MACROD1 MIS18BP1 MVP NKD1 OTUD3 PDZRN4 KLHL5 MACROD2 MKI67 MXD1 NLRC5 OTUD7A PEBP1 KLHL7 MAD1L1 MKL1 MYBL1 NLRP4 OVCH1-AS1 PEBP4 KLRC3 MADD MKL2 MYCT1 NMD3 P2RY11 PELI3 KMT2A MAGEA6 MKLN1 MYH10 NME1-NME2 P3H3 PEX1

PEX14 POLQ PSMA4 RANBP6 ROCK1 SARS2 SKIV2L2 PFDN2 POLR2A PSMB7 RANBP9 ROCK2 SASH1 SLC13A3 PFDN6 POLR2C PSMC1 RAP1GAP2 ROR2 SASS6 SLC14A2 PGBD3 POLR2M PSMC2 RAPGEF1 RORA SATB2 SLC16A14 PGPEP1 POLR3A PSMD1 RAPGEF4 RP11-101E3.5 SBDS SLC19A2 PHACTR1 POLR3C PSMD7 RAPGEF6 RP11-111H13.1 SBF2 SLC1A5 PHF14 POP1 PSME4 RAPH1 RP11-15K19.2 SCAF11 SLC20A2 PHF20L1 POR PSRC1 RARB RP11-162P23.2 SCAF4 SLC24A2 PHF7 POTEB PTBP1 RARS2 RP11-295P9.3 SCAF8 SLC25A24 PHKA1 POTEB2 PTCRA RASA1 RP11-343C2.12 SCAMP3 SLC25A42 PHKB POU2AF1 PTDSS1 RASA2 RP11-407N17.3 SCAPER SLC27A1 PHLPP1 PPAN PTGER2 RASA3 RP11-458D21.5 SCFD2 SLC27A2 PI4K2B PPAN-P2RY11 PTGR1 RASAL2 RP11-468E2.1 SCG5 SLC27A4 PI4KA PPAP2A PTP4A2 RASGEF1B RP11-529K1.3 SCLY SLC27A6 PIAS2 PPAP2C PTPDC1 RASSF1 RP11-545J16.1 SCUBE2 SLC2A11 PICALM PPHLN1 PTPN1 RB1 RP11-706O15.1 SCYL1 SLC30A6 PIGF PPIE PTPN21 RBBP8 RP13-672B3.2 SDAD1 SLC30A9 PIGL PPIH PTPN3 RBBP9 RPA1 SDCCAG3 SLC35B3 PIGT PPIP5K1 PTPRA RBFOX1 RPAP3 SDF2 SLC35F3 PIH1D1 PPIP5K2 PTPRD RBFOX3 RPF2 SDHA SLC35F4 PIK3C2A PPM1B PTPRJ RBL1 RPH3A SEC11C SLC38A10 PIK3C3 PPM1G PTPRM RBL2 RPL11 SEC13 SLC38A2 PILRB PPM1L PTPRO RBM10 RPL35 SEC22A SLC38A6 PINX1 PPP1R12A PTPRR RBM22 RPL36A SEC24D SLC39A10 PIR PPP1R12C PTPRT RBM26 RPL36A-HNRNPH2SEC31A SLC39A11 PITHD1 PPP1R2 PTRF RBM28 RPL37A SECISBP2L SLC3A2 PITPNA PPP1R8 PTRH2 RBM34 RPL5 SEH1L SLC43A3 PIWIL2 PPP3CC PUM1 RBM6 RPL7 SENP2 SLC45A4 pk PPP4R1 PUM2 RBMS2 RPN1 SENP3 SLC4A1AP PKHD1 PPP6R3 PUS1 RC3H1 RPP30 SENP7 SLC9A5 PKM PRADC1 PUS7 RECK RPRD2 SERINC5 SLC9A8 PKMYT1 PRDM10 PUS7L RECQL RPS24 SERP1 SLC9A9 PKP4 PRDM11 PXMP2 REPS1 RPS26 SESN3 SLCO1B1 PLA2G4C PRDM5 PXN REV3L RPS27A SESTD1 SLCO1B3 PLA2R1 PRDX4 PYGL RFFL RPS29 SETBP1 SLCO1B7 PLCB4 PREX2 QRFPR RFWD2 RPS6KA2 SETD1B SLF2 PLCXD2 PRIM2 QTRTD1 RGL3 RPS6KC1 SETD2 SLFN11 PLEC PRKAA1 R3HCC1 RGS12 RRAGC SETX SLFN12 PLEK2 PRKAG2 R3HCC1L RGS3 RRM2B SF3A2 SLFN12L PLEKHA1 PRKAR1B R3HDM1 RHOT1 RRNAD1 SF3B2 SLMO1 PLEKHA3 PRKCE RAB11FIP1 RIF1 RSF1 SFMBT1 SLU7 PLEKHG5 PRKG1 RAB18 RILPL1 
RSPH3 SFPQ SMAD4 PLEKHM3 PRKRA RAB1A RIMS2 RSRP1 SFSWAP SMAGP PLIN2 PROSC RAB20 RIN2 RTCA SGCD SMAP2 PMM2 PROSER1 RAB22A RIT1 RTKN2 SGMS2 SMARCA1 PMS1 PRPF18 RAB2A RLF RTN4IP1 SGOL2 SMARCA5 PMS2 PRPF3 RAB3GAP2 RMND1 RUNX1 SGPL1 SMARCAL1 PNISR PRPF38B RAB5A RNF111 RUNX3 SGTB SMC5 PNKD PRPF4B RABEPK RNF128 RUVBL1 SH3KBP1 SMC6 PNPT1 PRPF6 RABL6 RNF130 RWDD2B SH3TC2 SMCHD1 POC5 PRPSAP2 RAC1 RNF169 RYK SHARPIN SMG6 PODXL2 PRR11 RAD18 RNF20 RYR2 SHC4 SMIM19 POGLUT1 PRR14L RAD50 RNF212 S100A11 SHISA5 SMIM4 POLA1 PRR5 RALA RNF216 S100A13 SHTN1 SMOC1 POLB PRR5-ARHGAP8 RALBP1 RNF220 SAFB SIM2 SMURF1 POLD3 PRRC2C RALGAPA1 RNF31 SAFB2 SIPA1L3 SNAP25 POLE PRSS12 RALGDS RNF40 SAMD11 SIRT1 SNAP29 POLG PSEN1 RALY RNLS SAMD12 SIX4 SNAPC1 POLK PSG8 RANBP2 RNMT SARS SKAP2 SNAPC4

SND1 STAU2 TATDN2 TMEM217 TRPC4AP USP34 YIF1B SNED1 STIM1 TAX1BP1 TMEM222 TRPM3 USP35 YIPF6 SNRPA1 STK10 TBC1D16 TMEM223 TRPM4 USP37 YTHDC1 SNRPC STK3 TBC1D19 TMEM33 TRPM6 USP39 YTHDF1 SNRPD2 STK32C TBC1D24 TMEM38A TSC22D2 USP43 ZBED4 SNRPN STK33 TBC1D30 TMEM39B TSG101 USP45 ZBED8 SNTB1 STK38 TBC1D31 TMEM44 TSHZ2 USP48 ZBTB1 SNTG1 STK38L TBC1D4 TMEM45A TSN USP8 ZBTB37 SNU13 STK39 TBC1D5 TMEM53 TSR1 USP9X ZBTB40 SNX14 STOM TBC1D9 TMEM62 TTC1 UTRN ZC3H13 SNX29 STON2 TBC1D9B TMEM64 TTC27 UVRAG ZC3H3 SNX3 STPG2 TBCD TMEM67 TTC28 VAMP1 ZC3H4 SNX4 STT3A TBCE TMF1 TTC3 VAMP4 ZC3H6 SOBP STX16 TBK1 TMOD2 TTC30A VAPA ZC3H7A SOCS5 STX6 TBL1Y TMTC4 TTC33 VAPB ZC3H8 SORT1 STX8 TCEAL8 TNFRSF10D TTF1 VAV3 ZC3HAV1 SOS1 STXBP5 TCEANC2 TNFRSF21 TTK VCPKMT ZCCHC11 SOX13 STXBP5L TCERG1 TNPO1 TTLL11 VEPH1 ZCCHC17 SOX7 SUB1 TCF3 TNS3 TTLL4 VIPAS39 ZCCHC6 SPAG16 SUCLG1 TCP1 TNS4 TUBA1A VIPR1 ZCCHC7 SPAG5 SUCO TCP11L1 TOM1L2 TUBA4A VPS13B ZCCHC8 SPAG9 SUGCT TDRD3 TOP1MT TUBE1 VPS13D ZDHHC6 SPATA5L1 SUGT1 TDRD9 TOP2A TULP4 VPS35 ZEB1 SPATA7 SUMF1 TEK TOP2B TWF1 VPS53 ZFAND3 SPC25 SUPT20H TERF2 TOP3A TWISTNB VRK1 ZFAND4 SPCS3 SUPT3H TEX14 TOPBP1 TXLNG VRK2 ZFC3H1 SPDL1 SUPT6H TFAM TOR1AIP2 TXNDC11 VWF ZFYVE1 SPDYA SUSD1 TFAP4 TOR1B TXNL4A WAPAL ZFYVE16 SPECC1 SUV39H2 TFCP2 TOX TXNRD3 WBP11 ZFYVE26 SPECC1L SV2C TFDP1 TP53BP2 U2SURP WBP4 ZGRF1 SPG20 SVBP TFDP2 TP63 UBA1 WDHD1 ZHX1 SPHK2 SVIL TFG TPCN2 UBAP2 WDR3 ZHX1-C8 SPOP SVOP TFIP11 TPK1 UBAP2L WDR33 ZMYM1 SPPL2A SWT1 TFPI TPM1 UBE2A WDR35 ZMYM5 SPRED1 SYCP1 TGFB1 TPP2 UBE2D3 WDR43 ZMYM6 SPTLC3 SYDE1 TGIF2-C20orf24 TPR UBE2G1 WDR44 ZNF100 SQLE SYN3 THOC2 TPRA1 UBE2I WDR47 ZNF106 SRA1 SYNCRIP THOC5 TPRG1 UBE2J1 WDR5 ZNF107 SRCAP SYT12 THSD4 TPST2 UBE4A WDR59 ZNF124 SRFBP1 SYT9 THSD7A TRAF3IP1 UBL7 WDR60 ZNF143 SRGAP1 SYTL3 TIAL1 TRAF5 UBLCP1 WDR7 ZNF146 SRP68 TACC1 TIAM2 TRAFD1 UBR2 WDR70 ZNF160 SRPK1 TACC2 TIMELESS TRAPPC11 UBR4 WDR74 ZNF181 SRPK2 TAF12 TIPARP TRAPPC2B UEVLD WDR76 ZNF182 SRSF11 TAF1A TJP1 TRIM2 UGGT2 WHSC1 ZNF195 SRSF4 
TAF1B TLE3 TRIM24 UHRF1BP1L WHSC1L1 ZNF197 SRSF6 TAF3 TLK2 TRIM26 UPF3B WIPF2 ZNF200 SSB TAF5 TLL1 TRIM35 UQCRB WLS ZNF236 SSBP1 TAF8 TLN2 TRIM37 UQCRQ WNK3 ZNF250 SSBP2 TANC1 TM9SF1 TRIM5 URGCP WSCD1 ZNF267 SSR1 TANGO2 TMBIM6 TRIM69 USH2A XPNPEP3 ZNF280C ST3GAL2 TANGO6 TMCC1 TRIO USO1 XPO1 ZNF284 ST3GAL3 TANK TMCO3 TRIP11 USP1 XPO6 ZNF304 STAC TAOK3 TMEM132B TRIP4 USP11 XRCC3 ZNF326 STAC2 TAPT1 TMEM132D TRMT1 USP13 XRCC4 ZNF331 STAG1 TARDBP TMEM143 TRMT10B USP15 XRCC5 ZNF33B STARD9 TARSL2 TMEM156 TRMT13 USP28 XRN2 ZNF394 STAT1 TASP1 TMEM181 TRMT1L USP31 YARS2 ZNF473 STAT6 TATDN1 TMEM200A TRMT6 USP32 YBX1 ZNF514

ZNF547 ZNF589 ZNF609 ZNF730 ZNF83 ZNFX1 ZSCAN5A ZNF557 ZNF594 ZNF69 ZNF782 ZNF839 ZRANB1 ZXDC ZNF564 ZNF606 ZNF713 ZNF81 ZNF860 ZRANB2 ZZEF1 ZNF586 ZNF607 ZNF720 ZNF813 ZNF98 ZSCAN25 ZZZ3

U2 ASO: Genes with Decreased Poly(A) Sites AADAT AGTPBP1 AP000275.65 ASRGL1 BCAT1 C3orf58 CCDC150 AAGAB AHCTF1 AP000295.9 ASTN2 BCCIP C4orf27 CCDC18 ABCB4 AHCYL2 AP000304.12 ASUN BCL2L14 C4orf29 CCDC25 ABCC2 AHI1 AP1S3 ASXL2 BCLAF1 C5orf15 CCDC30 ABCE1 AHNAK AP3B1 ATAD1 BDP1 C5orf30 CCDC47 ABCF2 AHR AP3B2 ATAD2 BIRC2 C5orf34 CCDC58 ABHD12 AHRR AP3D1 ATE1 BIRC3 C5orf42 CCDC66 ABR AIMP1 AP3M1 ATF3 BLM C6orf62 CCDC77 AC018816.3 AK2 AP4B1 ATF6 BLOC1S2 C7orf50 CCDC79 AC024592.12 AK3 APAF1 ATG12 BLOC1S6 C8orf88 CCDC82 AC068533.7 AKAP12 APC ATG13 BMPR2 C9orf171 CCDC88A AC073610.5 AKAP13 API5 ATG3 BMS1 C9orf3 CCDC88C AC091801.1 AKAP2 APLP2 ATIC BNIP3L C9orf72 CCDC90B AC104534.3 AKAP6 APMAP ATL2 BOD1 C9orf78 CCNA2 AC109829.1 AKAP9 APOBEC3G ATOX1 BOD1L1 CA5B CCNB1 ACADL AKR1D1 APOO ATP13A3 BOLA3 CAAP1 CCNG1 ACAT1 AL021546.6 APOOP5 ATP1B1 BORA CACNA1A CCNH ACOXL ALCAM APP ATP1B3 BPTF CACNA1I CCNK ACP1 ALDH4A1 AQP9 ATP2A2 BRAP CACNA2D1 CCPG1 ACRC ALDH7A1 AQR ATP2B1 BRCA2 CACNA2D3 CCSER2 ACSBG2 ALG6 ARFGEF1 ATP2B4 BRD3 CADM2 CCT2 ACSL3 ALMS1 ARFGEF2 ATP2C1 BRD7 CADPS2 CCT3 ACTG1 ALOX12-AS1 ARFGEF3 ATP5A1 BRD8 CAGE1 CCT4 ACTL6A AMIGO2 ARFIP1 ATP5J2-PTCD1 BRIX1 CALB1 CCT5 ACTL8 AMMECR1 ARGLU1 ATP5O BROX CALM2 CCT6A ACTR3 AMN1 ARHGAP10 ATP5S BSN CALN1 CCT7 ACTR6 AMOTL1 ARHGAP11A ATP6AP2 BTAF1 CAMK2G CCT8 ADAM10 AMY2A ARHGAP24 ATP6V1B2 BTBD3 CAMLG CD164 ADAM17 ANAPC16 ARHGAP6 ATP6V1D BTF3 CAMSAP2 CD2AP ADAM9 ANAPC7 ARHGEF10 ATP6V1E2 BUB1 CAMTA1 CD44 ADAMTS1 ANGEL2 ARHGEF7 ATP7A BZW1 CAND1 CD46 ADAT2 ANK2 ARID1A ATRX BZW2 CANX CD47 ADCY2 ANKH ARID4B ATXN2 C11orf49 CAPN15 CD58 ADGRE1 ANKIB1 ARID5B ATXN7L1 C11orf57 CAPRIN1 CD59 ADGRL2 ANKLE2 ARIH2 AUTS2 C11orf58 CAPZB CD80 ADH5 ANKMY1 ARL14EPL AVIL C11orf71 CARNMT1 CDC40 ADHFE1 ANKRD11 ARMC4 AVL9 C11orf80 CARS2 CDC42BPA ADK ANKRD17 ARMT1 AZIN1 C11orf85 CASK CDC42SE1 ADNP ANKRD18B ARNT AZIN2 C14orf166 CASP8 CDC42SE2 ADNP2 ANKRD26 ARPC3 B2M C15orf38-AP3S2 CASQ2 CDC5L ADRA1D ANKRD28 ARPC5 B4GALT1 C15orf40 CAST CDC6 ADSL ANKRD50 ARPP21 
B4GALT2 C15orf41 CATSPERB CDCA2 ADSS ANKRD55 ARSE B9D2 C16orf45 CBR4 CDCA3 AFF4 ANLN ASAH1 BACH1 C16orf62 CBS CDCA8 AFG3L2 ANO4 ASAP1 BAG1 C16orf72 CBWD2 CDH12 AGAP1 ANP32B ASB8 BANP C16orf74 CBX3 CDH18 AGBL1 ANTXR1 ASF1A BAZ1A C17orf85 CBX5 CDH20 AGBL3 ANXA1 ASIC2 BAZ1B C1orf141 CCAR1 CDH4 AGGF1 ANXA11 ASIP BAZ2B C1orf27 CCDC109B CDH5 AGO2 ANXA2 ASNSD1 BBS2 C21orf59 CCDC112 CDK13 AGO3 ANXA3 ASPH BCAS1 C2orf70 CCDC125 CDK14 AGPAT5 ANXA7 ASPM BCAS2 C3 CCDC14 CDK19

CDK5RAP2 CHUK COL5A1 CTNNA3 DESI2 DPYD ELP6 CDK7 CIB4 COL8A1 CTNNAL1 DET1 DPYSL5 EMB CDK8 CINP COL8A2 CTR9 DGCR8 DRAM2 EMC3 CDKAL1 CIR1 COLGALT1 CTSC DHFR DRD3 ENAH CDKL3 CIRH1A COLGALT2 CTTN DHRS7 DROSHA ENDOV CDKL4 CKAP2 COMMD10 CUL2 DHRSX DSG2 ENGASE CDKL5 CKAP2L COMMD2 CUL3 DHX15 DSN1 ENO1 CDKN3 CKAP5 COPB2 CUL4A DHX16 DSP ENPP3 CDR2 CKLF COPS2 CUL4B DHX29 DST ENPP6 CDYL CKLF-CMTM1 COPS5 CUL5 DHX32 DSTN EPDR1 CEACAM19 CLASP2 COPZ1 CWC27 DHX33 DTWD1 EPRS CEBPZ CLCN3 COQ5 CWF19L2 DHX57 DTYMK EPS8 CECR2 CLDN12 CORIN CYB5RL DHX9 DUSP19 EPSTI1 CELSR1 CLDND1 CORO1C CYHR1 DIAPH3 DUT ERBB4 CENPC CLGN CORO7 CYR61 DICER1 DYM ERC1 CENPE CLIC2 CORO7-PAM16 CYTH1 DIMT1 DYNC1I1 ERC2 CENPF CLINT1 COX4I1 DAB1 DIS3 DYRK1A ERCC4 CENPH CLIP1 COX5A DAB2 DIS3L DYX1C1 ERCC8 CENPI CLMP CPD DACH2 DIS3L2 DZIP1 ERGIC1 CENPL CLN5 CPLX1 DAGLB DKC1 E2F3 ERGIC2 CENPN CLPTM1L CPNE1 DAP3 DKK2 EBAG9 ERLEC1 CENPU CLPX CPNE3 DARS DLG1 ECI1 ERO1A CENPV CLSPN CPNE4 DAZAP1 DLG2 ECT2 ESCO1 CEP104 CLTC CPS1 DAZAP2 DLGAP5 EDC3 ESCO2 CEP112 CLUAP1 CPSF2 DBF4B DLX6 EDEM3 ESF1 CEP120 CLVS1 CPT1A DBN1 DMAP1 EEA1 ESYT1 CEP135 CMAS CPVL DBNL DMRT1 EEF1A1 ETAA1 CEP152 CMB9-22P13.1 CR1 DCAF13 DMWD EEF1G ETS2 CEP290 CMBL CRBN DCAF8 DMXL1 EEF2K EXO1 CEP295 CMPK1 CRCP DCBLD1 DMXL2 EFCAB14 EXOC1 CEP44 CNBD1 CREG2 DCBLD2 DNA2 EFCAB2 EXOC3 CEP55 CNGB1 CRIM1 DCC DNAAF2 EFHC2 EXOC6 CEP57 CNKSR2 CRIPT DCDC1 DNAH1 EFR3B EXOC6B CEP89 CNN2 CRISPLD2 DCLK1 DNAH10 EFTUD1 EXOSC10 CERK CNNM2 CRTC3 DCLRE1A DNAH11 EGFL6 EXOSC2 CFAP36 CNOT1 CSDE1 DCTN1 DNAH12 EGFR EXOSC4 CFAP44 CNOT4 CSE1L DCTN2 DNAH14 EGLN1 EXOSC8 CFAP52 CNOT6 CSMD1 DCTN5 DNAJA1 EHBP1 EXOSC9 CFAP97 CNOT6L CSMD2 DCUN1D4 DNAJA2 EHHADH EYA1 CFHR2 CNOT8 CSMD3 DCUN1D5 DNAJA4 EIF2A EYA2 CGB1 CNPY2 CSNK1A1 DDAH1 DNAJC1 EIF2AK4 EZH2 CGGBP1 CNR2 CSNK1G1 DDB1 DNAJC10 EIF2B5 F3 CHCHD3 CNTLN CSPP1 DDHD2 DNAJC11 EIF2S1 F8 CHD1 CNTN1 CSRP2 DDX1 DNAJC13 EIF2S3 FAF2 CHD2 CNTNAP2 CTAGE5 DDX10 DNAJC16 EIF2S3L FAIM CHD4 CNTNAP5 CTB-60B18.6 DDX19A DNAJC21 EIF3A FAM101A CHD7 CNTRL 
CTBP2 DDX21 DNAJC3 EIF3D FAM102A CHEK2 COA4 CTC-360G5.8 DDX24 DNAJC8 EIF3H FAM102B CHIC1 COA5 CTC-429P9.4 DDX31 DNM1L EIF3M FAM120A CHKA COBLL1 CTC-454I21.3 DDX3X DNM3 EIF4B FAM133B CHML COG2 CTD-2116N17.1 DDX3Y DNMT1 EIF4G1 FAM135A CHMP3 COG6 CTD-2287O16.3 DDX42 DOCK2 EIF4G2 FAM13A CHN1 COG8 CTD-2410N18.5 DDX46 DOCK5 EIF4G3 FAM13B CHN2 COL14A1 CTD-2510F5.6 DEGS1 DOCK7 EIF5B FAM162A CHODL COL25A1 CTD-3088G3.8 DENND1A DOCK8 ELAC2 FAM172A CHORDC1 COL3A1 CTGF DENND1B DOK1 ELAVL4 FAM177B CHPT1 COL4A2 CTIF DENND4A DPH6 ELL2 FAM184A CHRNA5 COL4A4 CTNNA1 DENND5A DPP10 ELMOD2 FAM189A2 CHST6 COL4A6 CTNNA2 DENND5B DPY30 ELP3 FAM198A

FAM19A2 FRG1 GNAL HCCS HSPA9 ITGA6 KIAA1429 FAM19A5 FRG1BP GNB1 HDAC8 HSPD1 ITGA8 KIAA1551 FAM200A FRMD4B GNB2L1 HDGFRP3 HSPE1 ITGAV KIAA1755 FAM204A FRMD5 GNG7 HDLBP HSPE1-MOB4 ITGB1 KIAA2012 FAM208A FRMD6 GNGT1 HEATR3 HSPH1 ITGB3BP KIAA2018 FAM208B FRS2 GNL2 HEATR5A HTATSF1 ITGB8 KIF11 FAM213A FSD2 GNL3L HEATR6 HTR2C ITPR2 KIF13A FAM214A FSTL4 GNPAT HECTD1 HYDIN ITSN1 KIF15 FAM227B FTO GNPNAT1 HECTD4 IARS ITSN2 KIF16B FAM49B FTSJ1 GOLGA1 HELLS IARS2 JAK2 KIF18A FAM73A FUBP1 GOLGA3 HERC1 ICE1 JAKMIP2 KIF1B FAM76A FUBP3 GOLGA4 HERC4 ICE2 JMJD1C KIF1BP FAM81A FUCA2 GOLGA8A HERPUD2 ICMT JRKL KIF20B FAM81B FUS GOLGB1 HHIP IDE KANK1 KIF21A FAM83D FUT5 GOLIM4 HIATL1 IDI1 KANSL1L KIF23 FAM98A FXN GOLPH3 HIATL2 IFFO2 KARS KIF26B FAM98B FXR1 GON4L HIF1A IFNAR1 KAT7 KIF2A FAN1 FXR2 GOSR2 HIGD1A IFNAR2 KAT8 KIF2C FANCB FYB GPATCH2 HKR1 IFT57 KATNA1 KIF3A FANCD2 G3BP2 GPATCH2L HLA-B IGF2BP2 KATNBL1 KIF5B FANCI GAB2 GPATCH8 HLCS IGF2BP3 KAZN KIFC1 FANCM GABPB1 GPBP1L1 HMGXB3 IGF2R KBTBD2 KIFC3 FARP1 GALK2 GPC3 HMMR IGFBP4 KCNAB2 KIR2DL4 FARSB GALNT18 GPC5 HNMT IGFL2 KCND2 KIR3DL1 FAT1 GALNT2 GPC6 HNRNPA1L2 IGFN1 KCNG1 KIR3DL3 FBN1 GALNT7 GPR107 HNRNPA2B1 IGSF10 KCNG3 KITLG FBN2 GAPDH GPR149 HNRNPA3 IKBKAP KCNH1 KLC1 FBRSL1 GAREM GPSM2 HNRNPAB IKZF2 KCNMA1 KLHL29 FBXL5 GARS GPX4 HNRNPC IKZF3 KCNMB3 KLHL5 FBXO11 GART GRAMD1C HNRNPCL1 IL10RB KCNMB4 KMT2E FBXO22 GAS2 GRAP2 HNRNPD IL18 KCNQ3 KNSTRN FBXO3 GAS2L1 GRB10 HNRNPDL IL1RAPL1 KCNU1 KPNA2 FBXO4 GBAS GRHPR HNRNPH1 IL33 KCTD1 KPNA4 FBXO5 GBE1 GRID1 HNRNPH2 IL6ST KCTD16 KPNB1 FBXW11 GCC2 GRID2 HNRNPH3 IL7 KCTD19 KRBOX1 FCF1 GCDH GSKIP HNRNPK ILF2 KCTD3 KRIT1 FCGR2B GCFC2 GSR HNRNPL IMMP2L KCTD5 KRR1 FCHSD2 GCLC GSS HNRNPLL IMPDH2 KCTD9 KRT222 FDX1 GDAP2 GTF2A1 HNRNPM INADL KDELC2 KRT7 FER GDE1 GTF2B HNRNPR INF2 KDELR2 KSR2 FGF13 GDI2 GTF2E2 HNRNPU ING2 KDM2A KTN1 FGF14 GEMIN2 GTF2H1 HOMEZ INTS10 KDM3A KYNU FIG4 GEMIN5 GTF2H3 HOOK3 INTS2 KDM4C L1CAM FIGNL1 GFM1 GTF2IRD2B HP1BP3 INTS7 KDM6A L3MBTL3 FILIP1 GGA2 GTF3C6 HPS1 IPCEF1 
KHDRBS1 LA16c-306E5.2 FIP1L1 GGCT GTPBP10 HPSE2 IPO11 KIAA0020 LAMA1 FLNB GGH GTPBP4 HRC IPO5 KIAA0196 LAMA4 FLVCR1 GINM1 GTSE1 HS2ST1 IPO7 KIAA0226L LAMB1 FMN2 GLG1 GUCY2C HS6ST2 IPO8 KIAA0232 LARGE FMO4 GLIS1 GUSB HS6ST3 IPO9 KIAA0319L LARP7 FMR1 GLO1 H2AFV HSD17B12 IQCB1 KIAA0368 LARS FNBP1L GLP2R H2AFY HSF1 IQGAP1 KIAA0430 LBR FNDC3A GLRB HACD2 HSP90AA1 IQSEC2 KIAA0586 LCORL FNDC7 GLRX3 HACD3 HSP90AB1 IRAK4 KIAA0753 LDHA FOCAD GLS HAT1 HSP90B1 IREB2 KIAA0825 LDHB FOXK1 GLUD1 HAUS3 HSPA14 IRS1 KIAA0922 LEKR1 FOXL1 GMDS HAVCR2 HSPA4 ITCH KIAA1033 LEMD3 FRA10AC1 GMFB HBG2 HSPA4L ITFG1 KIAA1217 LEO1 FREM2 GNA12 HBS1L HSPA8 ITGA1 KIAA1324L LEPR

LEPROT MAP3K7 MMRN2 MTRNR2L6 NDC1 NR3C1 P4HA1 LGALS3 MAP4 MN1 MTRNR2L8 NDC80 NR4A1 PABPC1 LHFPL3 MAP4K4 MNAT1 MTRR NDUFA11 NRCAM PABPC3 LIMCH1 MAP7D3 MND1 MTUS1 NDUFA5 NRXN1 PABPC4 LINC01119 MAPK12 MOB1A MUC16 NDUFA8 NSA2 PABPN1 LINS MAPK1IP1L MOCOS MUM1 NDUFAF5 NSMAF PACSIN2 LIPN MAPK8 MON2 MXD4 NDUFB4 NSMCE4A PAFAH1B1 LMAN1 MARCH1 MORF4L1 MXI1 NDUFC2 NSRP1 PAIP1 LMBR1 MARCH6 MORF4L2 MYBPC1 NDUFC2-KCTD14NSUN2 PAIP2 LMBRD1 MARK1 MOXD1 MYCBP2 NDUFS4 NT5C3B PAK1 LMO7 MARK3 MPC2 MYH11 NDUFS6 NT5E PAK1IP1 LONP2 MASTL MPDZ MYH7 NEDD4 NTM PALLD LPCAT1 MATR3 MPHOSPH10 MYH9 NEDD4L NTN1 PALM2 LPIN2 MAX MPHOSPH9 MYL6 NEDD8 NUB1 PAM16 LRBA MB21D1 MPP1 MYO10 NEDD8-MDP1 NUCB2 PAPOLA LRP11 MBD4 MPP6 MYO16 NEIL3 NUCKS1 PARD3 LRP6 MBNL1 MRE11A MYO1B NEK11 NUDT5 PARK7 LRPPRC MBTPS2 MROH8 MYO1E NELFA NUF2 PARP1 LRRC23 MCAM MRPL15 MYO1H NEMF NUFIP2 PARP11 LRRC28 MCC MRPL16 MYO6 NETO2 NUP107 PARP15 LRRC3B MCCC1 MRPL18 MYO9A NEXN NUP153 PARP16 LRRC41 MCF2 MRPL22 MYO9B NF1 NUP160 PARP2 LRRC47 MCF2L2 MRPL3 MYOF NFAT5 NUP205 PARP8 LRRC49 MCFD2 MRPL30 MYRIP NFATC2IP NUP210 PARPBP LRRC53 MCM6 MRPL33 MYT1 NFATC3 NUP37 PAWR LRRC59 MCM8 MRPL34 NAA15 NFE2L2 NUP54 PBK LRRC9 MCPH1 MRPL42 NAA16 NFIA NUP88 PBRM1 LRRCC1 MCTP1 MRPS11 NAA38 NFKB2 NUP98 PBX1 LRRFIP2 MCTS1 MRPS18C NAA60 NFU1 NUPL2 PBX4 LRRN2 MDGA2 MRPS22 NAALADL2 NFX1 NXPE3 PCBP2 LRRTM4 MDH1 MSANTD3-TMEFF1NAB1 NGDN NYAP2 PCDH11X LSAMP MDH2 MSH2 NABP1 NHSL1 OARD1 PCDH11Y LSG1 MDN1 MSH6 NACA NIFK ODC1 PCDH15 LSM14A ME1 MSI2 NACA2 NIPBL ODF2L PCDH17 LTA4H MECOM MSRA NADK2 NKRF OGDH PCGF5 LTBP1 MED1 MSTO1 NAE1 NKTR OGFOD1 PCIF1 LTN1 MED13 MTA1 NAP1L1 NLN OLA1 PCM1 LTV1 MED14 MTA3 NAP1L4 NMD3 ONECUT2 PCMTD1 LUC7L3 MED21 MTBP NARS NME1-NME2 OOEP PCNA LYAR MEF2A MTCH2 NASP NME2 OPA1 PCNXL4 LYRM7 MELK MTDH NAT10 NMNAT3 OPN3 PCP4 LYSMD2 MET MTHFD1 NAV3 NMT1 OR2W3 PCSK1 LYST METAP2 MTHFD1L NBEA NMU ORC3 PCTP LZIC METTL12 MTHFD2 NBEAL1 NOA1 ORC5 PCYT1A M1AP MFF MTHFD2L NBN NOC3L OSBPL10 PCYT1B MACF1 MFN1 MTIF2 NBPF3 NOD1 OSBPL1A PDCD4 MACROD1 
MGARP MTL5 NBR1 NOL11 OSBPL3 PDCD6IP MAEA MGME1 MTMR12 NCAM2 NOL8 OSBPL8 PDCL MAGI2 MGST1 MTMR2 NCAPD2 NOM1 OSBPL9 PDE1A MAGOH MIB1 MTMR6 NCAPG NONO OTUD5 PDE2A MAGOHB MICU2 MTMR8 NCAPG2 NOP14 OTUD6B PDE3A MAK16 MID1 MTO1 NCAPH NOP58 OTUD7A PDE4B MALRD1 MIER1 MTOR NCK1 NOS1AP OTULIN PDE4D MALT1 MIS12 MTPAP NCKAP5 NOV OVCH1-AS1 PDE6A MAN1A2 MIS18BP1 MTRNR2L1 NCL NOX1 OXCT1 PDE8B MAN2A1 MKKS MTRNR2L12 NCOA1 NPEPPS OXR1 PDGFRB MANBA MKLN1 MTRNR2L13 NCOA5 NPHP1 OXSR1 PDHX MAP2K5 MLK4 MTRNR2L2 NCOA6 NPM1 P2RX7 PDILT MAP3K4 MLXIP MTRNR2L5 NCOR1 NR2C1 P3H2 PDK2

PDSS2 POLD3 PRDM6 PTPRG RBFOX2 ROCK2 RRM1 PDZD2 POLH PRDX5 PTPRM RBL1 ROR1 RRM2 PDZRN4 POLK PREP PTPRR RBL2 RORA RRM2B PES1 POLN PRIM2 PTPRS RBM12 RP11-1035H13 RRN3 PFKP POLQ PRKACB PTPRT RBM17 RP11-111H13 RRP1 PGM2L1 POLR1A PRKCA PUM1 RBM22 RP11-159D12 RRP15 PGRMC2 POLR1B PRKCH PUM2 RBM23 RP11-178C3 RRP1B PHACTR1 POLR1D PRKD3 PUS7 RBM25 RP11-277P12 RSF1 PHACTR4 POLR2B PRKRA PVRL3 RBM26 RP11-286N22 RSL24D1 PHB POLR2H PRLR PWP1 RBM27 RP11-295P9 RSRC2 PHF11 POLR3C PRMT3 QRSL1 RBM34 RP11-302B13 RSRP1 PHF20 POM121 PROS1 QSER1 RBM39 RP11-38C17 RTCA PHF20L1 POMP PRPF18 R3HCC1L RBM41 RP11-407N17 RUNX1 PHF21A POTEB PRPF19 R3HDML RBM48 RP11-47I22 RUNX3 PHF3 POTEB2 PRPF3 RAB10 RBM4B RP11-529K1 RUSC2 PHIP POTEI PRPF4 RAB11FIP2 RBM6 RP11-545J16 RUVBL1 PHKA2 POTEJ PRPF40A RAB18 RBMS1 RP11-574F21 RWDD1 PHLDB2 POU2AF1 PRPF40B RAB21 RC3H2 RP11-618P17 RWDD2B PHLDB3 PPA2 PRPSAP1 RAB27A RCAN3 RP11-96O20 RYK PI4KA PPAN PRR12 RAB3C RCC1 RP11-977G19 S100A11 PIAS2 PPAN-P2RY11 PRR14L RAB3GAP1 RCN1 RP13-512J5 SACS PIBF1 PPARA PRR5L RAB3GAP2 RCN2 RP5-1052I5 SAE1 PICALM PPARG PRRC2B RAB6A REEP3 RP5-972B16 SAMHD1 PIEZO2 PPAT PRRC2C RAB8A REV3L RP9 SAP30BP PIGC PPFIA1 PRRG1 RAB9A RFC1 RPA3 SARS2 PIGK PPFIA2 PRSS21 RABEP1 RFT1 RPAP3 SASS6 PIGL PPHLN1 PSAT1 RABGAP1 RFX5 RPF1 SAV1 PIGN PPIA PSD3 RABGGTB RGPD1 RPL10 SBDS PIK3C2A PPIB PSIP1 RABL3 RGS10 RPL14 SBF2 PIK3C2B PPIC PSMA1 RABL6 RGS2 RPL19 SCAF11 PIK3R3 PPIG PSMA3 RACGAP1 RGS22 RPL22 SCAI PILRB PPIH PSMA4 RAD1 RHD RPL23 SCAPER PITPNC1 PPIL3 PSMA6 RAD21 RHOA RPL27 SCFD1 PJA2 PPIL4 PSMA7 RAD50 RHOBTB3 RPL3 SCFD2 PKN2 PPIP5K2 PSMB5 RAD51B RHOF RPL35A SCLT1 PKP4 PPM1G PSMC2 RAD54B RIF1 RPL37 SCML1 PLA2G4A PPM1L PSMC6 RAF1 RIMS2 RPL37A SCRN3 PLAA PPP1CC PSMD14 RAI14 RINT1 RPL39L SDCBP PLAC1 PPP1R12A PSMD7 RALGAPA1 RIOK1 RPL4 SEC11A PLAGL1 PPP1R13B PSMG2 RALGPS2 RIOK2 RPL5 SEC11C PLCB4 PPP1R16B PTBP3 RALY RIOK3 RPL7A SEC14L1 PLCE1 PPP1R2 PTCD1 RANBP17 RLF RPLP1 SEC22A PLCH1 PPP2R2A PTCD3 RANBP2 RMDN2 RPLP2 SEC23IP PLCXD2 PPP2R2B PTEN RANBP6 
RMND1 RPP14 SEC31A PLEKHA1 PPP2R2D PTGES3 RAP1A RNASEH1 RPP30 SEC61A1 PLEKHA5 PPP2R5A PTGFR RARB RNASEH2B RPP40 SEC61G PLK4 PPP2R5C PTGFRN RARS RNF103-CHMP3 RPRD1A SEC63 PLLP PPP2R5E PTGS2 RASA1 RNF111 RPS12 SEH1L PLOD2 PPP3CA PTMA RASAL2 RNF128 RPS14 SEL1L PLS1 PPP3CB PTMS RASGEF1B RNF157 RPS15A SEL1L3 PLS3 PPP3CC PTP4A1 RASSF8 RNF19A RPS23 SEMA4B PMM2 PPP3R1 PTP4A2 RB1 RNF212 RPS26 SEMA6B PMPCA PPP4R1 PTPN11 RB1CC1 RNF216 RPS4X SENP2 PMPCB PPP6R1 PTPN12 RBAK-RBAKDN RNF217 RPS6 SENP6 PMS2 PPP6R3 PTPN13 RBBP6 RNF6 RPS6KA2 SENP7 PNKD PRAME PTPN21 RBBP7 RNGTT RPS6KB1 SEPHS2 PODNL1 PRC1 PTPN9 RBBP8 RNPS1 RPS7 SEPT10 POLA1 PRCP PTPRA RBFA ROBO1 RPS9 SEPT2 POLB PRDM4 PTPRD RBFOX1 ROCK1 RREB1 SEPW1

SERF2 SLC35F2 SOCS4 STIM1 TCAIM TMEM222 TRMT5 SERINC1 SLC35F3 SOCS5 STIM2 TCEA1 TMEM260 TRMT6 SERINC3 SLC38A1 SOD2 STIP1 TCEA2 TMEM31 TROVE2 SERPINB6 SLC38A2 SON STK10 TCEAL8 TMEM33 TRPC4AP SESTD1 SLC39A10 SORBS2 STPG2 TCEB1 TMEM44 TRPM3 SETD2 SLC39A11 SORCS1 STRN TCERG1 TMEM5 TRPM7 SETD3 SLC39A8 SOX6 STRN3 TCF12 TMEM52B TRRAP SETD5 SLC39A9 SP110 STRN4 TCF7L2 TMEM62 TSC22D1 SETD7 SLC44A1 SPAG6 STT3B TCFL5 TMEM68 TSEN15 SETDB2 SLC6A3 SPAG9 STX2 TCOF1 TMOD3 TSEN2 SETX SLC7A1 SPAST STX8 TCTEX1D1 TMTC4 TSEN54 SF1 SLC7A11 SPATA16 STXBP5 TDP2 TNFRSF10A TSG101 SF3A3 SLC9A7 SPATA5 SUB1 TDRD9 TNIP1 TSHZ2 SF3B1 SLC9A9 SPATS2 SUCLA2 TDRKH TNKS2 TSNAX SF3B6 SLC9B2 SPATS2L SUCLG1 TEK TNPO1 TSPAN3 SFMBT1 SLCO1B3 SPDL1 SUCO TENM4 TNRC18 TSR1 SFPQ SLCO1B7 SPECC1 SUGCT TERF1 TNRC6A TTBK2 SFTA3 SLCO5A1 SPECC1L SUGP2 TERF2 TNS4 TTC1 SFXN1 SLF2 SPG11 SUPT16H TES TOE1 TTC13 SGCD SLFN12L SPIDR SUZ12 TEX11 TOMM70A TTC17 SGCE SLK SPIN1 SVBP TFAM TOP1 TTC27 SGK1 SLMAP SPINK6 SWT1 TFAP2A TOP2A TTC37 SGMS1 SLTM SPOCK1 SYCP1 TFDP2 TOR1AIP2 TTC39C SGOL2 SLU7 SPRED2 SYMPK TFG TOX2 TTC4 SH2D4A SMAD5 SPSB4 SYN1 TFPI TP53 TTC9 SH3D19 SMAD6 SPTBN1 SYN3 TFPI2 TP53TG5 TTF1 SH3GLB1 SMAGP SPTLC3 SYNCRIP TG TPK1 TTK SH3KBP1 SMAP2 SPTSSA SYNE1 TGFB2 TPM1 TTL SHANK2 SMARCA1 SQLE SYNE2 THADA TPM4 TTLL1 SHCBP1 SMARCA2 SQSTM1 TAB2 THBS1 TPR TTLL11 SHISA9 SMARCA5 SREK1 TACC1 THOC2 TPRA1 TTLL5 SHOC2 SMARCAD1 SRP72 TACR1 THOC7 TPRKB TTN SHPK SMARCC1 SRPK1 TAF12 THUMPD1 TPT1 TUBA1B SHTN1 SMARCD1 SRRM1 TAF1B TIAL1 TPTE2 TUBB6 SIK1 SMC1A SRRM2 TAF3 TIAM1 TPX2 TUFM SIKE1 SMC2 SRRM5 TAF4 TIMELESS TRA2A TULP2 SIMC1 SMC3 SRRT TAF9B TIPIN TRA2B TWF1 SIN3A SMC4 SRSF1 TAMM41 TJP1 TRABD2A TXLNB SIN3B SMC5 SRSF10 TANC1 TLE3 TRAF3IP1 TXLNG SIPA1L1 SMC6 SRSF11 TANK TLK1 TRAIP TXN SKAP2 SMIM7 SRSF4 TAOK1 TLK2 TRANK1 TXNRD1 SKIV2L2 SMR3B SRSF5 TAP2 TLL1 TRAP1 TXNRD3 SLC12A2 SMS SSB TARBP1 TM4SF1 TRAPPC8 TYW1 SLC19A1 SNAPC3 SSBP3 TARDBP TM9SF2 TRERF1 U2AF1 SLC19A2 SNCA SSFA2 TARS TM9SF3 TRIAP1 U2SURP SLC22A5 SND1 SSR3 TATDN1 
TMBIM6 TRIM24 UACA SLC25A13 SNRNP200 SSRP1 TAX1BP1 TMCO3 TRIM27 UBA1 SLC25A24 SNRNP70 SSX2IP TBC1D1 TMCO4 TRIM33 UBA2 SLC25A3 SNRPD1 ST13 TBC1D12 TMED2 TRIM4 UBA5 SLC25A32 SNRPD3 ST6GALNAC4 TBC1D15 TMEFF1 TRIM58 UBA6 SLC25A33 SNTG1 ST7L TBC1D19 TMEM105 TRIM61 UBAP2 SLC25A37 SNW1 STAG1 TBC1D22A TMEM106B TRIM69 UBE2G2 SLC25A51 SNX14 STAG2 TBC1D23 TMEM116 TRIO UBE2I SLC26A4 SNX24 STAM2 TBC1D31 TMEM117 TRIP11 UBE2Q1 SLC26A8 SNX4 STAMBP TBC1D5 TMEM132D TRIP12 UBE2R2 SLC30A5 SNX5 STARD9 TBC1D9B TMEM184C TRIP13 UBE3A SLC30A9 SNX6 STAT1 TBCD TMEM189 TRIP4 UBE3C SLC33A1 SNX7 STAU1 TBK1 TMEM189-UBE2V1 TRMT10B UBLCP1 SLC35B1 SOAT1 STEAP1B TBPL1 TMEM209 TRMT13 UBN1

UBP1 USP1 VPS13B WDR74 YY1AP1 ZGRF1 ZNF564 UBR2 USP13 VPS13D WDR75 YY2 ZIK1 ZNF578 UBR5 USP15 VPS35 WDSUB1 ZBED5 ZKSCAN5 ZNF585A UBR7 USP16 VPS37A WEE1 ZBTB1 ZMAT2 ZNF609 UBXN2A USP20 VPS41 WHSC1 ZBTB11 ZMPSTE24 ZNF639 UBXN4 USP24 VPS53 WIPF1 ZBTB20 ZMYM1 ZNF644 UBXN7 USP28 VRK2 WLS ZBTB25 ZMYM4 ZNF652 UCHL5 USP32 VRTN WNK1 ZBTB38 ZMYND11 ZNF664 UFL1 USP34 VTA1 WWC2 ZBTB8A ZNF131 ZNF709 UGGT1 USP38 VWA8 WWP1 ZC3H11A ZNF143 ZNF720 UGGT2 USP39 VWF WWTR1 ZC3H14 ZNF160 ZNF730 UGT2B7 USP47 WAPAL XPNPEP3 ZC3H15 ZNF169 ZNF770 UHMK1 USP48 WARS2 XPO1 ZC3H18 ZNF202 ZNF791 UHRF1BP1L USP49 WASF1 XPO5 ZC3H7A ZNF207 ZNF81 UHRF2 USP53 WASH4P XRCC5 ZC3HAV1 ZNF277 ZNF827 UNC50 USP7 WDFY1 XRN2 ZCCHC17 ZNF280D ZNF891 UNC5C USP8 WDFY3 XXbac-BPG246D15.9 ZCCHC6 ZNF326 ZNF98 UNK USP9X WDR1 YBX1 ZCCHC7 ZNF337 ZNFX1 UPF2 UTP11L WDR11 YBX3 ZCRB1 ZNF365 ZNHIT6 UPF3A UTP18 WDR12 YES1 ZDHHC6 ZNF397 ZNRF3 UPF3B UXS1 WDR19 YME1L1 ZFAND1 ZNF410 ZRANB2 UPP2 UXT WDR3 YTHDC1 ZFC3H1 ZNF429 ZRANB3 UQCRC1 VAPA WDR41 YTHDC2 ZFP2 ZNF430 ZSCAN1 UQCRC2 VBP1 WDR43 YTHDF3 ZFP30 ZNF438 ZSCAN5A URB1 VDAC3 WDR48 YWHAB ZFP91 ZNF470 ZSWIM5 URGCP VKORC1L1 WDR53 YWHAH ZFR ZNF507 ZWILCH URI1 VMP1 WDR59 YWHAQ ZFYVE16 ZNF517 ZXDC USH2A VPS13A WDR70 YY1 ZFYVE26 ZNF544 ZZEF1

U4 ASO: Genes with Increased Poly(A) Sites
AATK ARPC5L BZW2 CFAP44 DCUN1D5 ETF1 GADD45A ABCC9 ATAD1 C14orf37 CFHR4 DDX21 ETS2 GBE1 AC006486.9 ATE1 CACNA1I CFLAR DEK ETV5 GCDH AC073610.5 ATF3 CALM2 CHERP DFNB31 EVA1C GGCT ACSL3 ATG16L1 CAMKMT CHODL DGKD EXOC6 GLCCI1 ADAM9 ATN1 CAMSAP1 CHST15 DHX40 EZH1 GLRX3 ADORA2B ATP11A CAMSAP2 CLASP2 DIMT1 EZR GNL3 AEBP2 ATP1B1 CAPRIN1 CLDN14 DIS3 FAH GOLGA1 AGBL1 ATP7A CASD1 CLIC2 DNAH8 FAM162A GOLGA7 AGPAT4 ATRX CASP2 CLK1 DNAJC11 FAM208A GOLGA8B AHRR AVPI1 CBFB CLSPN DNAJC16 FAM60A GOLM1 AK9 AXL CCDC71 CMB9-22P13.1 DOCK11 FANCL GPALPP1 AKIRIN2 B4GALT2 CCND1 CMTM1 DPYD FAR1 GPATCH2 ALDH5A1 BAP1 CCNF CNN2 DUSP5 FARSA GPATCH8 AMMECR1 BASP1 CCT8 CNNM2 DYNC2H1 FASTKD2 GPRC5A ANKRD13C BCAP29 CD80 CNR2 DYX1C1 FBXO27 GRSF1 ANKRD13D BCCIP CD96 CNTLN EEF1D FBXO41 GTF2H1 ANKRD46 BECN1 CDC42BPA COPS2 EFHB FBXO5 GTF2IRD1 ANKS6 BIRC3 CDCA2 CPNE3 EGLN3 FGD4 GTPBP1 ANP32A BMX CDCA4 CPSF6 EIF3B FGF2 HEATR5A AP5M1 BNIP2 CDH24 CRAMP1L EIF3G FHL2 HELQ APLF BRCA2 CDIPT CSTF1 ELAVL1 FHL3 HENMT1 APPL1 BRD3 CDK11B CTD-2135J3.4 EMC8 FIBP HIPK2 AQR BRD4 CDK6 CTNNA3 ENDOV FIGN HIVEP1 ARFIP2 BRD9 CEBPG CTSZ EPHA2 FLNA HNF4A ARHGAP8 BRIP1 CENPJ CUL7 ERCC6 FTH1 HNRNPDL ARID4B BSDC1 CEP72 CUTC ERCC6 FZD6 HNRNPL ARMCX1 BUB1 CFAP221 DAGLB ERG G3BP2 HORMAD1

HSPA4 MATN2 NUDCD3 PPP1R8 ROCK1 STXBP6 UFL1 HUS1 MCM3AP NUP214 PPP2R2B RP11-192H23.4 SUZ12 UIMC1 ICE2 MET NUPL1 PRDM4 RP11-302B13.5 SVBP UPF3A IFIT2 MKRN3 NXPH1 PRKAA1 RRAGB SYDE1 USP15 IGF2BP1 MRPL44 OCRL PRLR RRM2 SYN3 USP37 IGF2R MRPL48 OLA1 PROS1 RYK SYT12 USP38 IL7R MRPS30 OSBPL5 PRPF31 SACM1L SZT2 USP7 INTS10 MSH2 OSBPL8 PRPF40A SAR1A TARDBP VCL IPO11 MTERF2 PADI4 PRR5-ARHGAP8 SAV1 TAX1BP1 VEGFA JAK2 MTIF2 PAK2 PSAT1 SCOC TBC1D15 VPS51 JPH2 MUM1 PAPSS1 PSMD3 SEMA3E TC2N VSIG10 KCMF1 MXD1 PAWR PTDSS1 SEMA4B TCHP WDR47 KCND2 MYBL2 PAXBP1 PTPRD SFMBT1 TEC WHSC1 KCNK1 MYCBP2 PCBD2 PTPRS SFPQ TFCP2 WNT5A KCTD7 MYH15 PCBP2 PUM2 SGOL2 TFDP1 WTAP KDELR2 MYO1E PDCD6 PUS7 SHPK TGFB1 WWP1 KDM2B MYO5A PDPK1 QTRTD1 SKAP2 TGFB1I1 XRCC3 KIAA0020 MYO5B PDSS2 RAB14 SLC25A3 TIMELESS YARS KIAA0195 NAA50 PGBD3 RAB5A SLC25A37 TLK2 YWHAG KIAA0825 NAB1 PGRMC2 RABGEF1 SLC38A2 TM9SF3 YY2 KIAA1524 NAF1 PHF7 RAD1 SLC39A14 TMCO4 ZBED9 KIAA1958 NAIF1 PHF8 RANBP10 SLTM TMLHE ZC3H11A KIAA2018 NAV3 PHLDB2 RANGAP1 SMARCA2 TNKS2 ZDHHC16 KIF20B NCAPG PI4KA RAP1A SMC3 TNPO2 ZFP30 KLHL7 NCLN PIAS2 RASAL2 SMG1 TNR ZGRF1 KPNA2 NCOA7 PINX1 RB1CC1 SMU1 TP53BP2 ZNF121 KTN1 NELFB PLEKHA8 RBBP6 SNAPC1 TPK1 ZNF131 LAX1 NEMF PLK4 RBFA SORL1 TPR ZNF266 LIG4 NET1 PMPCA RBL1 SOX7 TRAF3 ZNF397 LMO7 NEURL1 PNISR RBM26 SPDL1 TRAPPC8 ZNF496 LOXHD1 NFASC PNO1 RBM33 SPRTN TRIM39 ZNF644 LRIF1 NHLRC3 PNPLA6 RELB SPTBN2 TRIM4 ZNF704 LRP5 NLRP4 PODXL2 REXO4 SPTLC2 TRMT6 ZNF709 LRP8 NOL11 POLR3C RFWD3 SRPK1 TRPC1 ZNF860 LRRC28 NOP14 POLR3G RICTOR SRSF11 TTC12 ZWINT LSG1 NOP58 POTEI RIF1 ST5 TTLL1 LYST NRAS POTEJ RIMS2 ST6GALNAC6 TUBB6 MAGOHB NRG3 POTEM RMND5A STAMBPL1 TXNDC9 MAP3K7 NSMAF PPFIA1 RNFT2 STEAP1B UBA2 MAP4K5 NSUN3 PPM1G RNMT STK16 UBLCP1 MAPKAPK2 NUAK2 PPP1CB RNPS1 STXBP3 UBR5

U4 ASO: Genes with Decreased Poly(A) Sites
AASDH AGPAT9 ANKRD13C ARFIP1 ATP7A BRD8 C6orf62 AC024592.12 AHNAK ANKRD28 ARID4B ATP8A2 BSN C7orf50 AC068533.7 AHR ANKRD55 ARID5B ATR BTF3L4 C9orf78 ACADL AIMP1 ANKS3 ARIH2 ATRX BTG3 CA4 ACP1 AKAP11 ANXA2 ARPP21 ATXN7L1 BZW2 CACNA1I ACTR6 AKAP12 ANXA3 ARSJ AVIL C11orf49 CALB1 ADA AKAP6 ANXA6 ASAP1 AZIN1 C11orf54 CALCB ADAM10 AKR1D1 AP000295.9 ASPM B9D2 C11orf58 CALD1 ADAM9 ALCAM AP3B1 ATE1 BAG6 C11orf70 CAMK2G ADCY2 ALDH18A1 APAF1 ATG16L1 BCAS1 C11orf85 CAMSAP2 ADIPOR2 ALDH1L2 API5 ATIC BCAS2 C17orf64 CAND1 AFF4 ALG3 APOBEC3G ATP1B3 BIRC2 C1orf27 CANX AFG3L2 ALMS1 APOOL ATP2C1 BMS1 C4orf17 CASQ2 AGBL1 AMOTL1 APOOP5 ATP6AP2 BNC2 C4orf29 CCDC109B AGO3 AMY2A ARAP1 ATP6V1B2 BOD1 C6orf203 CCDC117

CCDC150 COL4A4 DPH7 FIG4 ICE1 KRT222 MRPL30 CCDC79 COPS2 DSCR3 FNDC3A IDH3A LAMA1 MRPL33 CCDC82 COQ5 DSG2 FNDC7 IFI44 LANCL2 MRPL39 CCDC84 CORO1C DSTN FNIP1 IGF2BP3 LARS MRPL42 CCNT2 COX4I1 DTL FOXN3 IGF2R LAS1L MRPS27 CCT4 CPLX1 DYM FOXP1 IGFBP4 LAX1 MSH2 CCT6A CPNE1 DYNC1I1 FUT5 IGFL2 LDHA MSTO1 CD46 CPVL DYNC2H1 FXR2 IGFN1 LDHB MTBP CD47 CRBN DYRK1A G3BP2 IKBKB LEPR MTCH2 CD58 CRCP EAF2 GABARAPL1 IKZF3 LEPROT MTHFD2 CDADC1 CSE1L EBAG9 GABRR1 IL10RB LGALS3 MTHFSD CDC42 CSNK1A1 EDC3 GAMT IL1RAPL1 LINC01119 MTL5 CDC42SE1 CSRP2 EDEM1 GAPVD1 IMMP2L LIPN MTOR CDC7 CTC-432M15.3 EEF1A1 GGA2 IMPDH2 LMBR1 MTPAP CDCA2 CTD-2116N17.1 EEF1G GNB2L1 INF2 LNP1 MTRF1 CDH5 CTD-2135J3.4 EFCAB2 GNPAT INO80D LPIN2 MTUS1 CDK7 CTD-2410N18.5 EFR3B GOLGA8A INTS10 LRBA MUC16 CDK8 CTD-2510F5.6 EIF3H GOLGA8B IPCEF1 LRCH1 MUM1 CECR2 CTD-3088G3.8 EIF3M GOLIM4 IPO13 LRP6 MYCBP2 CENPF CTNNA1 EIF4G2 GPR107 IPO5 LRPPRC MYH7 CENPN CTNNA3 ELF2 GPR149 IPO8 LRRCC1 MYO1E CEP104 CTSC ELL2 GRB10 IQGAP1 LRRN2 MYO5B CEP192 CUL4A ELMOD2 GSR ITGB1 LRRTM4 MYO9A CEP290 CYTH3 ENGASE GSS ITSN2 LRSAM1 MYOF CERK DAGLB ENPP1 GTF2A1 KATNA1 LTBP1 MYRIP CFAP52 DAP3 ENPP6 GTF2H1 KATNBL1 LTV1 MYT1 CHCHD3 DARS EPS8 GTPBP2 KCND2 LYAR N4BP2L2 CHD2 DAZAP2 EPSTI1 GTPBP4 KCNH1 LYST NADK2 CHD4 DBR1 ERBB2IP H2AFY KCNMB4 M1AP NAP1L4 CHML DCC EREG HBS1L KCNQ1 MAGI1 NBN CHN2 DCLRE1A ERGIC1 HCCS KCTD16 MAP3K7 NBPF1 CHPT1 DCTN1 ESF1 HEATR3 KDELR2 MAP4K4 NBPF3 CHST6 DCTN2 ESYT1 HEATR6 KDM1A MAPK1IP1L NBR1 CIB4 DDX1 EXOC1 HECTD1 KDM3A MARCH6 NCAM2 CIRH1A DDX19A EXOC6 HEXA KDM4C MASTL NCAPD2 CKAP2 DDX3X EXOC6B HIAT1 KDM6A MATR3 NDUFA11 CKAP5 DDX46 EXOSC9 HIGD1A KIAA0020 MAX NDUFA9 CKLF-CMTM1 DDX59 EYA2 HLA-B KIAA0196 MCCC1 NDUFC2 CLDN12 DDX60L EYA4 HMGXB3 KIAA0368 MCF2L NDUFC2-KCTD14 CLIC2 DEC1 F3 HNRNPCL1 KIAA0586 MCF2L2 NEMF CLN5 DEPDC7 F8 HNRNPH1 KIAA0825 MDGA2 NFATC3 CLNK DESI1 FAM101A HNRNPK KIAA0922 MDH2 NFE2L2 CLPTM1L DHRS7 FAM151A HNRNPLL KIAA1217 MDN1 NFKB2 CLPX DHRSX FAM162A HNRNPM KIAA1551 MELK NFRKB CLSPN DHX57 FAM189A2 
HNRNPR KIAA1715 MET NIFK CLTC DIAPH3 FAM200B HPX KIAA1755 METTL4 NKIRAS1 CMAS DICER1 FAM208A HRC KIF11 MFAP1 NKRF CMB9-22P13.1 DIEXF FANCA HS2ST1 KIF15 MGARP NLN CMBL DIMT1 FAP HSF2 KIF18A MGST1 NOC3L CMTM1 DIS3L FARP1 HSP90AB1 KIF20B MICU2 NOP14 CNKSR2 DKK2 FASTKD2 HSP90B1 KIF21B MIER1 NPC1 CNN2 DLG1 FASTKD3 HSPA4L KIF5B MIS18BP1 NSUN2 CNNM2 DMAP1 FBN1 HSPA8 KIR2DL4 MKKS NUCKS1 CNOT4 DMRT1 FBXO11 HSPA9 KIR3DL1 MKRN3 NUDT12 COA5 DMXL1 FBXO22 HSPD1 KIR3DL3 MOB1A NUDT4 COG2 DMXL2 FBXO34 HSPH1 KITLG MOXD1 NUDT5 COL14A1 DNAH12 FCF1 HTT KLHL29 MPC2 NUFIP2 COL25A1 DNAJC1 FGFR1OP IARS KLRC3 MPHOSPH10 NUP133 COL3A1 DNM3 FGFR1OP2 IARS2 KMT2C MRPL16 NUP210

NUP214 PMPCA RANGAP1 SCFD1 SPDL1 TMEM222 USO1 NYAP2 PMS1 RARS SCML1 SPG21 TMEM240 USP1 ODF2 PNPT1 RASAL2 SEC11A SPIN1 TMEM260 USP15 OGDH POLG2 RB1CC1 SEC61A1 SPTSSA TMEM31 USP24 OLA1 POLQ RBBP7 SEC61G SQSTM1 TMEM44 USP34 ONECUT2 PPARD RBFOX1 SEMA4B SRRM1 TMEM68 USP45 OPA1 PPARG RBFOX2 SEMA6B SRRM2 TNFRSF10A USP47 OPN3 PPHLN1 RBM12 SEPHS2 SRSF1 TNPO1 USP7 OR2W3 PPIH RBM23 SEPT10 SRSF11 TNPO3 UXS1 ORC3 PPME1 RBM26 SERF2 SSBP3 TNRC6A VMP1 OSBPL10 PPP1R2 RBM39 SERINC3 SSFA2 TOE1 VPS13B OSBPL1A PPP2R2A RBM5 SERPINB1 SSR3 TOMM20 VRTN OTULIN PPP3R1 RCC1 SET SSRP1 TOMM70A VWF PABPC3 PRDM2 REEP1 SETBP1 ST5 TOP2B WASF1 PABPC4 PRDM4 REEP3 SETD3 STEAP1B TOX2 WASH4P PAIP1 PRDM6 RFC1 SF3A3 STK38L TP53TG5 WDR12 PARP15 PRDX5 RFT1 SFMBT1 STRN TPD52L2 WDR36 PARP16 PREP RFX7 SFTA3 STX16 TPK1 WDR70 PARP2 PRLR RGPD2 SGIP1 STX2 TRA2B WDSUB1 PARPBP PRPF3 RHOA SH3BGRL2 STXBP5 TRAF5 WLS PAXBP1 PRPF40A RHOT1 SH3KBP1 SUB1 TRAIP WNK1 PBK PRPF40B RIF1 SHC3 SUDS3 TRIM33 XPO1 PCBD2 PRPSAP2 RIMS2 SHCBP1 SUGCT TRIM58 XRCC5 PCM1 PSD3 RNASEH1 SIK1 SUMF1 TRIM69 YARS2 PCMTD1 PSMA4 RNF128 SKAP2 SYCP1 TRIO ZBED5 PCP4 PSMA6 RNF217 SLC12A5 SYN1 TRIP12 ZBTB20 PCYT1B PSMA7 ROBO1 SLC25A24 TAF12 TRIP13 ZBTB38 PDCD4 PSMC2 ROCK1 SLC25A51 TALDO1 TRRAP ZC3HAV1 PDE2A PSMC5 ROCK2 SLC35B1 TARBP1 TSEN15 ZCCHC11 PDE3A PSMG2 RP11-1035H13.3 SLC39A9 TATDN1 TSNAX ZFP30 PDE4D PTEN RP11-111H13.1 SLC44A1 TBC1D15 TSPAN3 ZFX PDE6A PTGFR RP11-159D12.5 SLC6A3 TBCD TTC1 ZFY PDILT PTGFRN RP11-277P12.6 SLC7A1 TBCE TTC17 ZFYVE16 PDK2 PTMA RP11-38C17.1 SLC9B2 TBPL1 TTC3 ZMYM4 PDSS2 PTMS RP11-529K1.3 SLMAP TCTEX1D1 TTLL5 ZMYM6 PES1 PTPRA RP5-1052I5.2 SLU7 TDRKH TTN ZNF277 PHACTR4 PTPRD RPAP2 SMARCA1 TEK TUFM ZNF280D PHF20L1 PUM1 RPL22 SMARCA2 TERF1 TULP2 ZNF326 PHF7 PUS7 RPL37 SMC4 TEX11 TXLNB ZNF429 PICALM PWP1 RPL37A SMC5 TEX9 TXN ZNF470 PIGC QSER1 RPL5 SMG8 TFG TYRO3 ZNF507 PIK3CB RAB10 RPL7 SMR3B TG TYW1 ZNF578 PINX1 RAB27A RPN1 SNCA THAP2 UBA1 ZNF638 PITPNC1 RAB3GAP1 RPP14 SND1 THSD4 UBA2 ZNF644 pk RAB3GAP2 RPP40 SNED1 TIAM2 UBA5 
ZNF664 PKD2L2 RAB9A RPS15A SNRNP70 TIMELESS UBE2G1 ZNF720 PKMYT1 RABEP1 RTCA SNRPF TIMM17A UBE2R2 ZNF721 PLA2G4C RAD1 RUNX3 SNX6 TKT UBR2 PLA2R1 RAD18 RUVBL1 SNX7 TLL1 UCHL5 PLAGL1 RAD50 SAMD11 SOX7 TM9SF2 UEVLD PLCXD2 RALGAPA1 SBF2 SPAG6 TM9SF3 UNC119B PLEKHA1 RANBP2 SCAF11 SPATA16 TMBIM6 UNC5C PLS1 RANBP6 SCAPER SPATS2 TMCO3 UQCRH

U6 ASO: Genes with Increased Poly(A) Sites
AARS ABCG2 ABHD6 AC073610.5 ACP2 ACSS2 ADAL ABCC2 ABHD12 ABLIM1 ACAD11 ACRC ACTR1A ADAMTS6 ABCC9 ABHD14A-ACY1 ABTB2 ACLY ACSL3 ACY1 ADAMTS8

ADGB AP3M2 BET1L CCDC71 CMB9-22P13.1 DCP2 EIF2S1 ADGRB3 APBA3 BICC1 CCDC82 CMBL DCUN1D5 EIF3A ADGRG2 API5 BLZF1 CCDC88A CMIP DDAH1 EIF3G ADHFE1 APOLD1 BMPER CCDC93 CNBD1 DDB1 EIF3K ADORA2B APPL1 BMX CCM2 CNBP DDB2 EIF4A2 ADRA1D AQR BNIP2 CCNC CNKSR2 DDHD2 EIF4B ADSS ARCN1 BOD1L1 CCND1 CNN2 DDX10 EIF5B AEBP2 AREL1 BRCA2 CCNJL CNNM2 DDX18 ELAVL1 AEN ARFGEF1 BRD4 CCNT2 CNOT10 DDX5 ELAVL2 AFF1 ARFGEF2 BRD9 CCSER1 CNOT6L DDX55 ELL2 AFF4 ARFGEF3 BRE CD2AP CNOT7 DDX60L ELOVL5 AFG3L2 ARFIP2 BRF1 CD99L2 CNTLN DEPDC1 ELP2 AGAP1 ARHGAP21 BRIP1 CDC37L1 CNTNAP5 DGKB ENO2 AGAP3 ARHGAP24 BTAF1 CDC40 COA5 DHX16 ENSA AGBL1 ARHGAP8 BTBD3 CDC42BPA COG5 DHX30 ENTPD4 AGPAT5 ARHGEF17 BUB1B CDC42BPB COL25A1 DHX37 EPAS1 AHRR ARHGEF28 BZW2 CDC7 COL4A6 DHX40 EPB41 AIG1 ARHGEF7 C10orf11 CDCA4 COL5A1 DIAPH2 EPDR1 AJUBA ARID4B C10orf88 CDH12 COLGALT1 DICER1 EPGN AK2 ARID5B C11orf57 CDH4 COLGALT2 DIP2B EPHX1 AK5 ARIH2 C11orf80 CDK5RAP2 COQ5 DIS3L2 EPT1 AKAP13 ARPC5L C12orf4 CDK6 CPSF6 DISP1 ERCC5 AKAP6 ASAP2 C14orf166 CDKAL1 CRB1 DLG1 ERCC6 AKAP9 ASB3 C16orf62 CDKL3 CREB1 DLG2 ERCC6-PGB AKIRIN1 ASCC1 C17orf85 CDKN2C CREB5 DLGAP4 ERCC8 AKIRIN2 ASCC3 C19orf38 CDON CREBBP DMD ERG ALCAM ASIC2 C19orf43 CDR2L CREG2 DMTF1 ERI3 ALDH3B1 ASPH C1orf123 CEBPZ CRHR1 DNAH8 ESYT2 ALDH5A1 ATAD1 C1orf21 CENPI CRNKL1 DNAH9 ETS2 ALG11 ATAD2 C21orf58 CENPJ CRYZL1 DNAJC11 ETV5 ALK ATF1 C2orf27A CENPT CSMD3 DNAJC6 EVA1C ALS2 ATG12 C3orf62 CENPU CSNK2B DNM3 EXD3 AMDHD2 ATG16L1 C4BPB CEP128 CSNK2B DOCK11 EXO1 AMMECR1 ATIC C4orf27 CEP152 CSPP1 DOCK5 EXOC1 ANAPC4 ATL2 C5orf22 CEP350 CSRP2 DOCK7 EXOC3 ANAPC7 ATOX1 C5orf63 CEP83 CSTF3 DOCK8 EXOC4 ANGPTL5 ATP10D C6orf203 CEP89 CTBP2 DOK6 EXOSC2 ANK2 ATP5J2-PTCD1 C9orf171 CERS1 CTCF DOPEY1 EXOSC7 ANKAR ATP5S C9orf3 CFAP221 CTD-2583A14.10 DOT1L EXT2 ANKHD1 ATP6V1D CAB39 CFLAR CTD-3074O7.11 DPYD EYA2 ANKHD1-EIF4EBP3 ATP8A2 CAD CHD1 CTDSPL2 DRD3 EYS ANKRD1 ATP9A CADPS2 CHD2 CTNNA3 DROSHA EZR ANKRD10 ATP9B CALM1 CHD7 CTNNAL1 DSCAM FADS1 ANKRD13C ATRIP CAMLG CHEK2 CTNNB1 DSEL FAH 
ANKRD13D AVL9 CAPN15 CHIC1 CUL3 DTL FAM129B ANKRD18A AVPI1 CARD8 CHL1 CUX1 DUSP16 FAM135A ANKRD18B B4GALT2 CAST CHST3 CUX2 DYRK1A FAM13A ANKRD28 BACH1 CATSPER2 CHUK CWC27 DZIP3 FAM172A ANKS1A BAIAP2 CBLL1 CKAP5 CYTH1 ECT2 FAM178B ANKS6 BASP1 CBWD2 CLEC2L D2HGDH EDEM2 FAM184B ANLN BAZ1A CBX1 CLEC4F DAB2IP EEF1D FAM185A ANO4 BAZ2B CC2D2A CLIC2 DACH2 EEF2K FAM193A ANP32B BCAP29 CCAR1 CLIP1 DARS2 EFHC1 FAM204A ANXA2 BCCIP CCDC130 CLPB DCAF10 EFHC2 FAM208A AP000295.9 BCKDK CCDC14 CLSPN DCAF16 EFTUD1 FAM81A AP000304.12 BCL2L2-PABPN1 CCDC18 CLSTN1 DCAF6 EHBP1 FAM83A AP1G1 BCORL1 CCDC30 CLUAP1 DCAF7 EIF2AK1 FANCA AP1M1 BECN1 CCDC6 CLVS1 DCBLD1 EIF2B2 FANCB AP2A2 BEND6 CCDC66 CMAS DCBLD2 EIF2B5 FANCL

FANCM GNL3 IFT122 KPNA4 MED31 N4BP2L2 NRG1 FAP GNPTAB IFT22 KSR1 MED9 N6AMT2 NRG3 FAR1 GNS IFT88 KTN1 MEF2A NAA20 NUCB1 FARP1 GOLGA3 IGF2BP1 LAMA1 MET NAA38 NUDT21 FASTKD1 GOLIM4 IGF2BP2 LAMA2 METTL13 NAB1 NUFIP2 FASTKD3 GOLM1 IGF2R LAMB1 METTL16 NADK2 NUP133 FAT1 GOLPH3 IGFBP3 LAMC1 METTL4 NADSYN1 NUP214 FBLN1 GON4L IGFL3 LAMC3 METTL6 NAP1L4 NUP37 FBXO11 GOT2 IGHMBP2 LAMP1 MGEA5 NAPG NUP50 FBXW2 GPALPP1 IL17RA LARP1B MGMT NASP NUP88 FBXW7 GPATCH1 IL1RAPL2 LARS MIA3 NAT10 NUPL1 FCHO2 GPATCH2 ING5 LASP1 MIB1 NBAS NUS1 FEM1C GPATCH4 INPP5A LCMT1 MICB NBEA NXPH1 FER GPC4 INTS10 LDHB MICU1 NBN OARD1 FGF14 GPN3 INTS6 LEO1 MID1 NBPF20 OBSCN FGFR1OP GPR157 INTS7 LHFPL3 MIER1 NBR1 OGFOD2 FGGY GPR75-ASB3 IPO5 LIG4 MIOS NCALD OLA1 FHIT GRID2 IQGAP2 LIN52 MIPEP NCAPG OOEP FIG4 GRIPAP1 IQGAP3 LINGO2 MIS18BP1 NCBP1 OPN3 FIP1L1 GRPEL1 IRAK2 LINS MLPH NCDN ORAOV1 FIS1 GTF2E2 IRS2 LMAN2L MMP26 NCKAP1 ORC2 FKBP15 GTF2F2 IST1 LMBRD1 MOCOS NCL OSBPL3 FKBP4 GTF2H1 ITCH LMO7 MORC4 NCOA7 OSBPL5 FLJ22447 GTPBP1 ITGA6 LPCAT1 MPI NCOR2 OSBPL8 FLNA GTPBP4 ITGA8 LPIN2 MPP3 NDOR1 OSCP1 FLNB GXYLT1 ITGB3 LRBA MPRIP NDUFA6 OXR1 FNTA HACD3 IVNS1ABP LRP10 MRC2 NDUFS4 P4HA2 FOXK2 HAUS7 JAGN1 LRP11 MRPL12 NEDD4 PABPC4 FOXL1 HDHD1 JAKMIP2 LRPPRC MRPL21 NEIL3 PACSIN2 FOXM1 HDLBP JDP2 LRRC28 MRPL3 NEK4 PADI4 FOXP4 HEATR1 JMJD1C LRRC3B MRPL44 NEK9 PAIP2 FTH1 HECTD1 JPH2 LRRCC1 MRPL46 NEMP2 PAK2 FUBP1 HECW1 KANSL1L LRRFIP2 MRPL48 NEXN PALLD FUT8 HELQ KANSL3 LRRIQ3 MRPL55 NFASC PAM16 FYB HERC1 KCMF1 LSG1 MRPS30 NFKBIE PARP1 G3BP2 HGSNAT KCND2 LSR MSH2 NFS1 PARVB GALK2 HIBCH KCNH1 MACROD1 MSRB2 NFX1 PCBD2 GALNT15 HIF1A KCNIP1 MACROD2 MSTO1 NFYC PCCA GANC HK1 KCNJ6 MAGOHB MTBP NHS PCNXL4 GAREM HMGCS1 KCTD16 MAML2 MTDH NHSL2 PDCD6 GART HNF4A KDM5B MAP3K11 MTERF4 NIPBL PDE3A GAS2L1 HNRNPA2B1 KIAA0141 MAP3K13 MTFR1L NKAIN2 PDE6A GATAD2B HNRNPC KIAA0196 MAP4K4 MTG1 NKTR PDGFRB GBA2 HNRNPDL KIAA0319L MAP4K5 MTHFD1 NLN PDHX GBE1 HNRNPL KIAA0430 MAP7D3 MTL5 NMD3 PDK2 GCC2 HNRNPU KIAA0556 MAPK1IP1L MTMR14 NME6 
PDK4 GDAP2 HORMAD1 KIAA0586 MAPK8IP3 MTMR6 NOB1 PDLIM5 GDF1 HP1BP3 KIAA0825 MAPKAPK2 MTMR9 NOM1 PDXK GEMIN5 HS6ST3 KIAA1191 MARCH6 MTO1 NOP56 PDZD8 GEN1 HSPA4 KIAA1586 MARK3 MTRF1 NOP58 PEBP4 GGA2 HSPA5 KIAA1841 MAZ MTUS2 NPAS2 PGBD3 GGCT HSPA9 KIF11 MB21D1 MUM1 NPC1 PGRMC2 GGCX HSPD1 KIF21B MBTPS1 MXD4 NPC2 PHB GGPS1 HSPH1 KIF22 MBTPS2 MYBL1 NPEPL1 PHF7 GIT2 HTRA1 KIF23 MDM1 MYBL2 NPM1 PHGDH GLCE IARS KIF3B MDN1 MYH9 NR3C1 PHLDB2 GLOD4 IFI16 KLC1 ME3 MYO10 NR4A2 PIAS2 GLRB IFNAR1 KLHL1 MED13L MYOF NR6A1 PIBF1 GLRX3 IFNAR2 KLHL7 MED15 MYPN NRD1 PIGT

PIK3R3 PSMD12 RICTOR SCAMP1 SMAP1 SYN3 TMEM5 PINK1 PSMD14 RIMS1 SCG5 SMAP2 SYNJ1 TMEM67 PKN2 PSMD3 RIMS2 SCYL3 SMARCAD1 SYNPR TMEM68 PKP4 PTCD1 RITA1 SDAD1 SMC2 SYT12 TMEM99 PLA2R1 PTDSS1 RLF SDCCAG3 SMC3 SYT7 TMPO PLAA PTP4A1 RMND5A SDHA SMC5 TAF1A TMTC1 PLCB4 PTP4A2 RNASEH2B SEC14L5 SMG7 TAF4 TMX2 PLCD3 PTPRG RND3 SEC31A SNAPC1 TAF6L TNPO2 PLCH1 PTPRS RNF139 SEC61A1 SNAPC3 TAF9B TNRC18 PLCXD2 PUM1 RNF167 SEC63 SNRNP48 TANC1 TOMM70A PLD5 PUM2 RNF19A SEH1L SNU13 TANGO6 TOPBP1 PLEC PUS7 RNF19B SEL1L SNX17 TAOK2 TOR1AIP2 PLEKHA5 PUS7L RNF20 SENP7 SNX5 TARDBP TP53RK PLK4 PWP1 RNF38 SEPT11 SNX6 TARSL2 TPD52L2 PLOD2 PXDN RNF6 SERBP1 SOCS4 TAX1BP1 TPM1 PLXNA2 PYGB RNFT2 SERINC5 SORBS2 TAZ TPRG1 PLXND1 R3HDM1 RNMT SESTD1 SORCS1 TBC1D15 TPTE2 PMM2 RAB3GAP1 RNPS1 SETD2 SOS1 TBC1D2 TRAF3 PMPCA RAB3GAP2 ROCK2 SETD5 SP110 TBC1D22A TRAF6 PMS1 RAB5A RP11-108K14.8 SETDB1 SP4 TBC1D23 TRAF7 PNISR RABL3 RP11-144F15.1 SETMAR SPDL1 TBC1D24 TRAP1 PNKD RABL6 RP11-178L8.4 SF3A2 SPEN TBC1D25 TRAPPC6B POC5 RAD1 RP11-20I23.3 SF3B3 SPINK6 TBC1D30 TRAPPC9 POLA1 RAD17 RP11-298I3.5 SGK3 SPPL3 TBC1D5 TREX2 POLD3 RAD21 RP11-302B13.5 SGMS1 SPRED1 TBCD TRIM23 POLG RAD23A RP11-47I22.4 SGOL2 SPRED2 TBK1 TRIM44 POLR1E RAD51B RP11-849H4.2 SH2D4A SPRTN TCAIM TRMT1 POLR2C RAD54B RP13-1032I1.10 SH3KBP1 SPTBN1 TCEA2 TRMT2A POLR3C RALBP1 RPE SHTN1 SREBF2 TCERG1 TRMT6 POLR3G RALGAPA1 RPL13A SIN3B SREK1 TDRD9 TRMT61B POLR3H RALY RPL32 SIRT7 SRRM1 TEX9 TRMU POLRMT RANBP10 RPL9 SLC12A2 SRRM2 TFAP2A TROVE2 POTEI RANBP17 RPP40 SLC19A1 SRSF10 TFAP2C TRPM3 POTEJ RANBP2 RPS13 SLC1A3 SRSF11 TFDP1 TSG101 PPARGC1A RAP1GDS1 RPS4X SLC1A4 SSBP2 TFG TSNAX PPAT RAPGEF6 RRM1 SLC20A1 SSH1 TGFB1 TTC1 PPHLN1 RARG RRM2 SLC20A2 ST20-MTHFS TGFB1I1 TTC3 PPIB RARS2 RRP1B SLC22A4 ST3GAL4 THNSL1 TTC38 PPIE RASA2 RRP7A SLC25A10 ST6GALNAC6 THOC2 TTC39C PPP1R3F RASAL2 RSL1D1 SLC25A13 STAG2 THSD7B TTC4 PPP1R7 RBAK-RBAKDN RSPH3 SLC25A23 STAMBPL1 THUMPD1 TTK PPP1R8 RBBP8 RTEL1 SLC25A24 STARD7 TIA1 TTL PPP2R1B RBFOX1 RTN4IP1 SLC25A3 STC2 
TIGD6 TTLL1 PPP2R3C RBFOX3 RUFY2 SLC25A36 STEAP1B TIMELESS TTLL11 PPP2R4 RBKS RUNDC3B SLC27A4 STIM2 TIPIN TUBA1C PPP2R5A RBL1 RXRB SLC35F3 STK16 TLDC1 TUBB6 PPP2R5C RBM12B RYK SLC38A2 STK3 TLK1 TXLNA PPP3CB RBM26 SACM1L SLC38A9 STK35 TLK2 TXLNG PPWD1 RBM34 SACS SLC39A11 STRC TM4SF1 TXNRD1 PRDM4 RBM39 SAMD4A SLC45A3 STRN4 TM9SF3 TXNRD3 PRDX1 RBPJ SAMM50 SLC7A1 STT3A TMC1 TYW5 PRLR RC3H1 SAP18 SLC7A2 STX12 TMCC1 UAP1 PRPF19 RC3H2 SARNP SLC9B2 STXBP3 TMCO1 UBA3 PRPF4 RECQL SARS SLF2 SUCO TMED3 UBA5 PRPF4B RERE SASS6 SLMAP SUPT3H TMEM116 UBA52 PRR5-ARHGAP8 REV3L SBDS SLTM SUPT7L TMEM147 UBE2G2 PRRC2C REXO4 SBF2 SMAD2 SUZ12 TMEM220 UBE2O PRSS12 RGPD2 SCAF4 SMAD4 SVEP1 TMEM222 UBFD1 PSMD1 RIC1 SCAF8 SMAD5 SVIL TMEM254 UBN1

UBR4 USP20 WAPAL WWOX ZC3H14 ZNF107 ZNF687 UBXN7 USP25 WDR12 WWP1 ZCCHC11 ZNF124 ZNF711 UCHL5 USP38 WDR25 XPO1 ZCCHC6 ZNF131 ZNF730 UFD1L USP47 WDR27 XRCC5 ZDHHC4 ZNF226 ZNF765 UGGT2 USP48 WDR3 YARS ZEB1 ZNF320 ZNF780A UHRF1BP1L USP9X WDR35 YEATS2 ZFAND1 ZNF335 ZNF852 ULK4 USP9Y WDR36 YES1 ZFAT ZNF383 ZPBP UNC79 VAC14 WDR43 YIPF3 ZFC3H1 ZNF407 ZPR1 UPF3A VAPA WDR59 YKT6 ZFP64 ZNF45 ZRANB1 UPP1 VASH2 WDR74 YTHDC1 ZFP82 ZNF451 ZRANB2 UPP2 VPS11 WDR76 YY1 ZFYVE1 ZNF485 ZRSR2 UQCC1 VPS41 WDR81 ZBED4 ZFYVE26 ZNF496 ZSCAN1 UQCC2 VRK1 WLS ZBED9 ZFYVE28 ZNF544 ZSCAN2 USP13 VRK2 WRN ZBTB24 ZKSCAN4 ZNF587B ZSWIM5 USP14 VTA1 WRNIP1 ZBTB34 ZMPSTE24 ZNF592 ZXDC USP15 VWA8 WSCD1 ZBTB37 ZMYND19 ZNF652 USP16 WAC WTAP ZBTB38 ZNF10 ZNF653

U6 ASO: Genes with Decreased Poly(A) Sites
AADAT AHSA2 AP1S3 ASTN2 BIRC3 CAGE1 CD58 AASDH AIMP1 AP2A2 ASUN BLOC1S6 CALB1 CDC26 ABCB4 AK2 AP3B1 ATAD1 BMPR2 CALCB CDC42BPA ABCC5 AK3 AP3B2 ATAD2 BMS1 CALU CDC42BPB ABCE1 AKAP11 AP3D1 ATF6 BNIP3L CAMKMT CDC42SE1 ABCF2 AKAP12 AP3S2 ATF7IP BOD1 CAMSAP1 CDC5L ABR AKAP13 AP4B1 ATG12 BOD1L1 CAMSAP2 CDC7 AC018816.3 AKAP6 AP4E1 ATG13 BOLA3 CAND1 CDCA3 AC104534.3 AKAP9 APBB2 ATIC BORA CAPRIN1 CDCP1 AC109829.1 AKR1D1 API5 ATL3 BPIFC CASQ2 CDH12 ACAP2 ALCAM APLP2 ATOX1 BPTF CAST CDH4 ACAT1 ALDH4A1 APMAP ATP1B1 BRD4 CBX3 CDH5 ACIN1 ALDH7A1 APOBEC3G ATP2A2 BRD7 CC2D2A CDK5RAP2 ACSL3 ALMS1 APOOP5 ATP2C1 BRIP1 CCAR1 CDK7 ACTG1 ALOX12-AS1 AQP9 ATP6V1B2 BRIX1 CCDC112 CDK8 ACTL6A AMIGO2 AQR ATP8A2 BTAF1 CCDC117 CDKAL1 ACTL8 AMY2A ARAP1 ATPAF2 BTBD3 CCDC150 CDKL3 ACTR6 ANAPC5 ARFGEF1 ATRX BTF3 CCDC18 CDV3 ADA ANAPC7 ARFGEF2 ATXN2 BTG3 CCDC3 CEBPZ ADAM10 ANKAR ARHGEF10 AUH BUB1 CCDC34 CECR2 ADAM17 ANKH ARHGEF7 AURKA BZW1 CCDC47 CENPC ADAM9 ANKHD1 ARID1A AZIN1 BZW2 CCDC77 CENPF ADAT2 ANKHD1-EIF4EBP3 ARID4B AZIN2 C11orf58 CCDC79 CENPH ADGRL2 ANKRD11 ARID5B B2M C11orf70 CCDC81 CENPL ADH5 ANKRD13C ARIH2 B9D2 C11orf85 CCDC82 CENPN ADIPOR2 ANKRD18A ARL13B BAG1 C15orf38-AP3S2 CCDC88A CEP131 ADNP ANKRD18B ARL14EPL BAG6 C17orf64 CCNB1 CEP135 ADRA1D ANKRD50 ARNT BANP C1orf141 CCNG1 CEP290 ADSS ANKS3 ARPC5 BAZ1A C1RL CCNH CEP295 AEBP2 ANLN ARPP21 BAZ1B C21orf59 CCNK CEP55 AFF4 ANO4 ARSJ BAZ2B C5orf15 CCPG1 CEP83 AFG3L2 ANO6 ASAP1 BBS1 C5orf30 CCT2 CEP97 AGBL1 ANP32B ASAP2 BBS7 C5orf34 CCT3 CERS6 AGGF1 ANXA1 ASCC1 BCAS1 C6orf203 CCT4 CFAP36 AGO3 ANXA11 ASIC2 BCAT1 C7orf50 CCT6A CFAP52 AGPAT9 ANXA2 ASIP BCL2L11 C9orf72 CCT8 CFAP97 AHCTF1 ANXA7 ASNSD1 BCL2L14 C9orf78 CD164 CFHR2 AHNAK AP000275.65 ASPH BCLAF1 CA5B CD2AP CGB1 AHR AP000295.9 ASPM BDP1 CADPS2 CD44 CGGBP1

CHD1 CORO7 DDX19B DST ERO1A FNBP1L GPALPP1 CHD2 CORO7-PAM16 DDX3X DTL ESCO2 FNDC3A GPATCH2 CHD4 COX10 DDX3Y DTYMK ESF1 FNDC3B GPC3 CHID1 COX4I1 DDX42 DUT ETAA1 FNDC7 GPC5 CHMP3 COX7A2 DDX46 DYM ETS2 FOXK1 GPCPD1 CHN1 CPD DDX59 DYNC1I1 EXO1 FOXP1 GPN3 CHORDC1 CPLX1 DDX60L DYNC2H1 EXOC3 FRA10AC1 GPR149 CHPT1 CPNE1 DEC1 DYNC2LI1 EXOC6 FRG1BP GPSM2 CHST6 CPNE3 DEK DYRK1A EXOC6B FRMD6 GRB10 CHUK CPS1 DENND4A DZIP1L EXOSC10 FSD2 GRID2 CIB4 CPT1A DENND5B EBAG9 EXOSC2 FSTL4 GRTP1 CIR1 CPVL DGKA ECI1 EXOSC9 FUBP1 GSPT1 CIRH1A CRNKL1 DHRS7 EDC3 EYA1 FUBP3 GSR CKAP2 CSDE1 DHX15 EDEM1 EYA2 FUCA2 GTF2A1 CKAP2L CSMD3 DHX33 EEA1 EZH2 FUS GTF2B CKAP5 CSNK1A1 DHX34 EEF1A1 F3 FXR1 GTF2E2 CLASP2 CSNK1G1 DHX57 EEF1G F8 FXR2 GTF2F2 CLDN12 CSNK2A1 DHX8 EFCAB14 FAAH2 FZD6 GTF2H1 CLEC4F CSPP1 DHX9 EFCAB2 FAF2 G3BP2 GTF3C3 CLGN CSRP2 DICER1 EFTUD1 FAM101A GABARAPL1 GTPBP10 CLIC2 CTC-429P9.4 DIEXF EGLN1 FAM133B GALK2 GTPBP2 CLINT1 CTC-454I21.3 DIMT1 EHBP1 FAM13A GALNT2 GTPBP4 CLIP1 CTCF DKC1 EIF2AK4 FAM13B GALNT7 GTSE1 CLN5 CTD-2116N17.1 DLG2 EIF2B1 FAM162A GAMT GUSB CLNS1A CTD-2287O16.3 DLX6 EIF2B5 FAM172A GAPVD1 HACD3 CLPTM1L CTD-2510F5.6 DMRT1 EIF2S1 FAM19A2 GAREM HAT1 CLPX CTD-3074O7.11 DMRTA1 EIF2S3L FAM200A GARS HAVCR2 CLSPN CTNNA1 DMXL2 EIF3A FAM200B GAS2 HBS1L CLTC CTNNA3 DNAAF5 EIF3H FAM204A GBE1 HCCS CLUAP1 CTNNAL1 DNAH10 EIF3M FAM208A GCDH HEATR3 CLVS1 CTNND1 DNAH11 EIF4A2 FAM208B GCLC HECTD1 CMAS CTR9 DNAH12 EIF4B FAM213A GDAP2 HELLS CMPK1 CTSC DNAH14 EIF4G2 FAM49B GDE1 HERC1 CMSS1 CTTN DNAJA1 EIF4G3 FAM53B GEMIN5 HERPUD2 CNBD1 CUL2 DNAJA2 EIF5B FAM73A GGA2 HEXA CNGB1 CUL3 DNAJA4 ELAVL4 FAM83D GGCT HIBADH CNKSR2 CUL4A DNAJB11 ELL2 FANCB GGH HIGD1A CNN2 CUL5 DNAJC1 ELMOD2 FANCD2 GINM1 HKR1 CNNM2 CUX1 DNAJC13 EMB FARSB GIPC1 HLA-B CNOT6L CWC15 DNAJC25 ENDOV FASTKD2 GLCE HLCS CNPY2 CWF19L2 DNAJC25-GNG10 ENGASE FASTKD3 GLG1 HMGXB3 CNTLN CYHR1 DNAJC8 ENO1 FAT1 GLIS1 HMMR CNTNAP5 CYP46A1 DNM1L ENPP1 FBN1 GLRX3 HNMT COBLL1 CYTH1 DNM3 ENPP3 FBN2 GLS HNRNPA2B1 COG6 DACH2 DNMT1 ENPP6 FBXL5 
GLUD1 HNRNPAB COG8 DAP3 DOCK7 EPB41L4B FBXO11 GMDS HNRNPC COL14A1 DARS DOCK8 EPC2 FBXO3 GMPR2 HNRNPCL1 COL25A1 DARS2 DOK1 EPS8 FBXO4 GNB2L1 HNRNPH1 COL4A2 DBN1 DONSON EPSTI1 FBXO5 GNB4 HNRNPH2 COL4A6 DBNL DOPEY1 ERBB2IP FCF1 GNGT1 HNRNPH3 COL8A1 DBR1 DPH6 ERC1 FCHO2 GNL2 HNRNPK COL8A2 DCAF8 DPH7 ERC2 FGF14 GNL3L HNRNPL COLGALT1 DCTN2 DPP10 ERCC6 FGFR1OP2 GNPAT HNRNPLL COMMD3-BMI1 DCTN5 DPY30 ERCC6-PGBD3 FHIT GOLGA1 HNRNPM COPA DCUN1D5 DPYD ERCC8 FIG4 GOLGA4 HNRNPR COPS5 DDX1 DRAM2 EREG FIGNL1 GOLGA8A HNRNPU COPZ1 DDX10 DROSHA ERGIC1 FIP1L1 GOLIM4 HNRNPUL1 COQ5 DDX18 DSG2 ERGIC2 FKBP3 GOLPH3 HOMER2 CORO1C DDX19A DSN1 ERLEC1 FLNB GON4L HOMEZ

HP1BP3 ITPR2 KMT2A LTN1 MORF4L1 NAP1L4 NSMAF HPS3 ITPR3 KMT2E LTV1 MORF4L2 NAPEPLD NSMCE2 HPX ITSN1 KNSTRN LUC7L3 MORN2 NARS NSMCE4A HRC ITSN2 KPNA2 LYAR MPHOSPH10 NBAS NSUN2 HS6ST2 JAKMIP2 KPNA3 LYRM7 MPHOSPH9 NBEAL1 NT5C3B HSD17B12 KANSL2 KPNA4 LYSMD2 MPP5 NBN NT5E HSF2 KATNBL1 KPNB1 LYST MPP6 NBPF1 NUCKS1 HSP90AB1 KBTBD2 KRAS M6PR MRE11A NBPF3 NUDT12 HSP90B1 KCNC4 KRIT1 MACF1 MROH8 NBR1 NUDT4 HSPA14 KCND2 KRR1 MAEA MRPL11 NCAM2 NUDT5 HSPA4 KCNG1 KRT222 MAGOH MRPL16 NCAPD2 NUP133 HSPA8 KCNH1 KSR2 MAGOHB MRPL18 NCAPG NUP153 HSPA9 KCNJ6 KTN1 MAK16 MRPL19 NCAPG2 NUP160 HSPD1 KCNMB4 KYNU MANBA MRPL22 NCAPH NUP205 HSPH1 KCNQ1 L3MBTL3 MAP2K5 MRPL3 NCK1 NUP214 HTATSF1 KCNU1 LA16c-306E5.2 MAP3K7 MRPL33 NCL NUP37 HYOU1 KCTD16 LAMA1 MAP4 MRPL39 NCR1 NUP98 IARS KCTD18 LAMA4 MAP4K4 MRPS22 NDC1 NUSAP1 IARS2 KCTD19 LAMB1 MASTL MRPS27 NDUFA8 NYAP2 IBTK KCTD3 LAMP2 MATR3 MRPS5 NDUFB4 OARD1 ICE1 KCTD9 LANCL2 MAX MSH2 NDUFC2 ODC1 ICMT KDELC2 LARS MB21D1 MSL3 NDUFC2 OFD1 IDH3A KDM1A LAS1L MCAM MSRA NEDD4 OGDH IDI1 KDM1B LDHB MCF2 MSTO1 NEDD8 OGT IFI44 KDM2A LEMD3 MCFD2 MTBP NEDD8 ONECUT2 IFNAR2 KDM3A LEO1 MCM3AP MTCH2 NEIL2 OOEP IFRD1 KDM6A LEPR MCM4 MTDH NEMF OPA1 IFT57 KHDRBS1 LEPROT MCM6 MTHFD1L NETO2 OR2W3 IFT88 KHSRP LGALS3 MCTS1 MTHFD2 NEXN ORC5 IGF2BP3 KIAA0020 LIMCH1 MDH1 MTIF2 NF1 OSBPL10 IGF2R KIAA0196 LINC01119 MDH2 MTMR12 NFKB2 OSBPL1A IGFBP4 KIAA0368 LINS MEAF6 MTMR2 NFRKB OSBPL3 IGFL2 KIAA0430 LMAN1 MED1 MTO1 NFU1 OSBPL8 IGFN1 KIAA0586 LMBR1 MED17 MTOR NIFK OSMR IGSF10 KIAA0825 LMBRD1 MED21 MTPAP NIPBL OTUD5 IKBKB KIAA0922 LMO7 MEF2A MTRF1 NKRF OTUD7A IKZF2 KIAA1033 LPCAT1 MELK MTRNR2L1 NKTR OXCT1 IKZF3 KIAA1324L LPIN2 MET MTRNR2L11 NLN P3H2 IL10RB KIAA1524 LRBA METAP2 MTRNR2L12 NME7 PA2G4 IL18 KIAA1551 LRCH1 METTL4 MTRNR2L9 NMNAT3 PABPC3 IL1RAPL1 KIAA1755 LRP11 MFAP1 MUC16 NMU PABPC4 IL33 KIAA2012 LRP1B MFF MUM1 NOC3L PAIP1 ILF2 KIF11 LRP6 MFN1 MVB12A NOD1 PAIP2 IMPDH2 KIF13A LRR1 MGA MYH9 NOL11 PAK1IP1 INTS10 KIF15 LRRC16A MGARP MYL6 NOL4L PAM IPCEF1 KIF18A LRRC23 
MGEA5 MYO1B NOL8 PAPOLA IPO8 KIF1BP LRRC28 MIB1 MYO1E NOLC1 PARD3 IQCA1 KIF20B LRRC41 MIER1 MYO9A NONO PARP1 IQCB1 KIF21A LRRC47 MIS12 MYOF NOP14 PARP15 IQGAP1 KIF23 LRRC59 MIS18BP1 MYT1 NOP58 PATL1 IRAK4 KIF2A LRRC9 MITD1 N4BP2 NOTCH4 PAXBP1 IREB2 KIFC1 LRRCC1 MKKS NAA15 NPC1 PBK IRS1 KIFC3 LRRFIP2 MKLN1 NAA16 NPM1 PCBD2 IST1 KIN LRRN2 MKRN3 NAA60 NR2C1 PCBP2 ITCH KIR2DL4 LRRTM4 MLKL NAB1 NR3C1 PCCB ITFG1 KIR3DL1 LRSAM1 MMRN2 NABP1 NR4A1 PCGF5 ITGA1 KIR3DL3 LSAMP MN1 NADK2 NR4A2 PCM1 ITGA6 KITLG LSM14A MOB1A NAF1 NREP PCMTD1 ITGB1 KLHL7 LTBP1 MORC3 NAP1L1 NSA2 PCNA

PCNXL4 PMPCA PSMA4 RBM26 RPP40 SESTD1 SMG8 PCSK1 PMS1 PSMA6 RBM27 RPRD1A SETD2 SMIM19 PCYT1B PNPT1 PSMB5 RBM28 RPS12 SETD3 SMIM7 PDCD11 PODNL1 PSMC2 RBM34 RPS14 SETD5 SMOC1 PDCD6IP POLA1 PSMC5 RBM39 RPS15A SETD7 SMPDL3A PDCL POLD3 PSMD14 RBM5 RPS19 SETD9 SMR3B PDE1A POLE PSMD4 RBMS1 RPS26 SETDB2 SNCA PDE2A POLH PSMD7 RC3H2 RPS6 SETX SND1 PDE3A POLK PSME4 RCAN3 RPS7 SF1 SNRNP200 PDE3B POLQ PSRC1 RCC1 RRAGC SF3A2 SNRNP70 PDE4D POLR1A PTCD3 RECQL RRM2 SF3A3 SNRPA1 PDE6A POLR1B PTEN REEP3 RRM2B SF3B1 SNRPD1 PDGFRB POLR1D PTGFR REV1 RRN3 SF3B2 SNRPD3 PDHA1 POLR2B PTGFRN RFC1 RRP1B SF3B6 SNRPF PDHX POLR2H PTGS2 RFT1 RSF1 SFMBT1 SNTG1 PDILT POLR3A PTMA RFX5 RSL24D1 SFPQ SNW1 PDK2 POLR3C PTMS RFX7 RSPH1 SFTA3 SNX14 PDS5A POLR3G PTP4A1 RGPD2 RSPH3 SGIP1 SNX4 PDSS2 POM121 PTPN11 RGS10 RSRC2 SGMS1 SNX6 PDXDC1 POU2AF1 PTPN12 RGS7 RTCA SGOL1 SNX7 PDZRN4 PPA1 PTPRA RHOBTB3 RUNX1 SGOL2 SNX8 PEX12 PPAN PTPRD RHOF RUNX3 SHCBP1 SOAT1 PGBD1 PPAN-P2RY11 PTPRG RHOT1 RUVBL1 SHTN1 SOCS4 PGBD3 PPARG PTPRT RIF1 RWDD1 SIN3B SOCS5 PHACTR1 PPARGC1B PUS7 RIMS2 RYK SIX4 SON PHACTR4 PPAT PWP1 RINT1 SAFB2 SKAP2 SORBS2 PHB PPFIA1 PXYLP1 RIT1 SAMM50 SLC12A5 SOS1 PHF20L1 PPIC QSER1 RNASEH1 SAP30BP SLC19A2 SOX6 PHF21A PPID R3HDML RNASEH2B SASS6 SLC20A2 SP110 PHF7 PPIE RAB27A RNF103-CHMP3 SBDS SLC25A24 SPAG6 PHIP PPP1R10 RAB3C RNF128 SBF2 SLC25A32 SPAST PHLDB2 PPP1R12A RAB3GAP1 RNF130 SCAF11 SLC25A33 SPATA16 PI4KA PPP2CB RAB6A RNF19A SCAF4 SLC25A51 SPATS2 PIAS2 PPP2R5A RAB9A RNF212 SCAPER SLC26A8 SPATS2L PIBF1 PPP2R5C RABEP1 RNF216 SCFD1 SLC27A2 SPC25 PICALM PPP2R5E RABGGTB RNF217 SCLT1 SLC35B1 SPDL1 PIGC PPP3R1 RABL6 RNMT SCML1 SLC35F3 SPECC1 PIGL PPP4R3A RACGAP1 ROCK1 SDCBP SLC38A1 SPG11 PIGN PRDM4 RAD1 ROCK2 SEC11A SLC38A10 SPIDR PIK3CB PRDM6 RAD21 RP11-1035H13.3 SEC11C SLC38A6 SPINK6 PIK3R3 PRDX4 RAD50 RP11-159D12.5 SEC13 SLC39A9 SPRED1 PILRB PREP RAD54B RP11-343C2.12 SEC23A SLC44A1 SPRED2 PITHD1 PREX1 RAF1 RP11-529K1.3 SEC31A SLC45A4 SPSB1 PITPNC1 PRIMPOL RALGAPA1 RP11-574F21.3 SEC61A1 SLC4A1AP 
SPTBN1 PJA2 PRKD3 RANBP2 RP11-96O20.4 SEC61G SLC6A3 SPTSSA pk PRKDC RANGAP1 RP11-977G19.10 SEC63 SLC9B2 SQLE PKN2 PRKRA RAP1A RP5-972B16.2 SEH1L SLF2 SREK1 PLAGL1 PRLR RARS RP9 SEMA4B SLFN12 SRP72 PLD5 PRMT3 RARS2 RPAP2 SEMA6B SLMAP SRRM1 PLEKHA1 PRPF4 RASAL2 RPF1 SENP2 SLTM SRRT PLEKHA5 PRPF40A RB1 RPGRIP1L SENP6 SLU7 SRSF1 PLEKHD1 PRPF6 RB1CC1 RPL3 SENP7 SMAD6 SRSF4 PLK4 PRPSAP2 RBBP7 RPL37 SEPHS2 SMARCA1 SRSF5 PLLP PRR12 RBBP8 RPL37A SEPT2 SMARCA2 SSB PLOD2 PRSS12 RBFOX1 RPL4 SEPW1 SMARCC1 SSBP3 PLS1 PSD3 RBL1 RPL5 SERBP1 SMARCD1 SSFA2 PLS3 PSEN1 RBM12 RPL7 SERF2 SMC3 SSR3 PLXDC2 PSIP1 RBM17 RPLP1 SERINC1 SMC4 SSRP1 PLXNA4 PSMA2 RBM23 RPLP2 SERPINB1 SMC5 ST13

STAG2 TBC1D15 TNFRSF10A TRPM7 UGCG WDR53 ZFP91 STAMBP TBC1D5 TNKS2 TRRAP UGGT1 WDR60 ZFR STAT1 TBCD TNNT1 TSC22D1 UHRF2 WDR7 ZFYVE16 STAU1 TBCE TNPO1 TSEN2 UNC50 WDR70 ZIK1 STEAP1B TBPL1 TNPO3 TSHZ2 UPF3A WDR75 ZMAT2 STIP1 TCEB1 TNRC18 TSNAX UPF3B WHSC1 ZMPSTE24 STK38L TCERG1 TNRC6A TSPAN3 UQCRH WLS ZMYM6 STMN1 TCFL5 TOMM70A TTBK2 URI1 WNK1 ZNF124 STRAP TCP11L1 TOP1 TTC1 USO1 WRN ZNF131 STRN TCTEX1D1 TOP1MT TTC13 USP1 WWC2 ZNF143 STX16 TDRKH TOP2A TTC17 USP13 WWP1 ZNF160 STX2 TEK TOP2B TTC27 USP14 XPNPEP3 ZNF181 STX8 TERF1 TOPBP1 TTC3 USP15 XPO1 ZNF202 SUB1 TEX11 TOR1AIP2 TTC37 USP16 XPO5 ZNF227 SUCO TFAM TP53BP1 TTC4 USP24 XXbac-BPG246D15.9 ZNF277 SUDS3 TFCP2 TP53BP2 TTC9 USP34 YAE1D1 ZNF280D SUGCT TFPI TP53TG5 TTK USP45 YAF2 ZNF304 SUN1 TFPI2 TPD52L2 TTLL11 USP47 YARS2 ZNF326 SUPT6H THADA TPK1 TTLL5 USP48 YBX1 ZNF410 SUZ12 THAP1 TPM1 TTN USP7 YES1 ZNF438 SVBP THOC2 TPM4 TULP3 USP9X YME1L1 ZNF507 SWT1 THUMPD1 TPP2 TWF1 UTP11L YTHDC1 ZNF510 SYCP1 TIAM2 TPR TXLNG UTP18 YTHDC2 ZNF517 SYMPK TIMM17A TPRA1 TXN VBP1 YTHDF3 ZNF532 SYN1 TIPARP TPRKB TXNRD1 VDAC1 YWHAB ZNF578 SYN3 TJP1 TPTE2 TXNRD2 VDAC3 YY1 ZNF585A SYNCRIP TLE3 TPX2 TXNRD3 VIMP YY1AP1 ZNF607 SYNE1 TLL1 TRA2A TYW1 VIPAS39 ZBED5 ZNF638 SYNE2 TLN2 TRAF3IP1 U2AF1 VPS13A ZBTB1 ZNF639 TAF12 TM9SF2 TRAF5 UBA1 VPS13B ZBTB11 ZNF664 TAF1A TM9SF3 TRAIP UBA2 VPS37A ZBTB25 ZNF692 TAF3 TMCC1 TRAPPC11 UBA6 VPS41 ZBTB8A ZNF721 TAF4 TMCO3 TRAPPC8 UBE2G1 VRK1 ZC3H14 ZNF730 TALDO1 TMEM116 TRAPPC9 UBE2R2 VRTN ZC3H15 ZNF773 TANK TMEM117 TRERF1 UBE3A VTA1 ZC3H7A ZNF789 TAOK1 TMEM126B TRIM33 UBLCP1 WAPAL ZC3HAV1 ZNF791 TAP2 TMEM156 TRIM58 UBN1 WASF1 ZCCHC11 ZNF81 TARBP1 TMEM184C TRIO UBR4 WDFY3 ZCCHC6 ZNF98 TARDBP TMEM189 TRIP11 UBR7 WDR1 ZCCHC7 ZRANB2 TARS TMEM189-UBE2V1 TRIP12 UBXN2A WDR11 ZCRB1 ZRANB3 TARSL2 TMEM44 TRMT13 UBXN4 WDR12 ZDHHC6 ZSCAN1 TATDN1 TMEM45A TRMT5 UCHL5 WDR19 ZDHHC7 TAX1BP1 TMEM5 TRMT6 UFD1L WDR43 ZFAND1 TBC1D12 TMOD3 TROVE2 UFL1 WDR48 ZFC3H1

SSA: Genes with Increased Poly(A) Sites

AADAT ABHD3 AC104534.3 ACSL1 ADAMTSL1 AEBP2 AGO3 AAGAB ABL1 ACACB ACSL3 ADAMTSL3 AFF4 AGPAT5 AAK1 ABL2 ACAP2 ACSL4 ADAT1 AFG3L2 AGPAT9 AAR2 ABLIM1 ACAT1 ACTL6A ADGRL2 AFMID AGPS ABCA13 ABR ACBD3 ACTR3 ADH5 AGAP1 AHCTF1 ABCB11 AC003002.4 ACIN1 ACTR6 ADHFE1 AGAP3 AHCYL2 ABCB5 AC003002.6 ACKR2 ACY1 ADIPOR1 AGBL1 AHI1 ABCC1 AC004076.7 ACO1 ADAM10 ADIPOR2 AGBL3 AHNAK ABCC5 AC009403.2 ACO2 ADAM12 ADK AGBL5 AHR ABCE1 AC011997.1 ACOXL ADAM17 ADNP2 AGGF1 AHRR ABCF3 AC024592.12 ACP1 ADAM18 ADRBK2 AGMO AHSA2 ABHD14A-ACY1 AC068533.7 ACRC ADAM9 ADSL AGO2 AJUBA

AK5 ANXA1 ASAH1 AUH BTBD7 CAD CCZ1 AK7 ANXA11 ASAP1 AURKA BTD CAMK1D CCZ1B AK9 ANXA2 ASAP2 AVL9 BUB1 CAMK2D CD44 AKAP10 ANXA3 ASB1 AXDND1 BZW2 CAMK2G CD46 AKAP11 ANXA6 ASB3 AZIN1 C10orf67 CAMLG CD47 AKAP12 AP000275.65 ASB8 B2M C11orf24 CAMSAP1 CD58 AKAP13 AP000295.9 ASCC1 B3GALNT1 C11orf57 CAMSAP2 CD59 AKAP2 AP1M1 ASCC2 B3GAT2 C11orf70 CAMTA1 CDC37 AKAP3 AP1S3 ASCC3 B4GALT1 C11orf71 CAND2 CDC42BPA AKAP6 AP2A2 ASF1A B4GALT4 C11orf80 CANX CDC42BPB AKAP9 AP3B1 ASH1L B4GALT5 C12orf66 CAPN10 CDC42SE1 AKIRIN2 AP3D1 ASIP BACE2 C14orf159 CAPN2 CDC42SE2 AKNAD1 AP3S2 ASL BACH2 C15orf38-AP3S2 CAPN7 CDC45 AKR1A1 AP4B1 ASNSD1 BAG1 C15orf41 CAPRIN1 CDC5L AKT2 AP4E1 ASPH BANP C15orf57 CARD8 CDC6 ALCAM AP4M1 ASPM BAP1 C16orf45 CARKD CDC7 ALDH1L2 APAF1 ASPSCR1 BASP1 C16orf62 CASC5 CDC73 ALDH3A2 APBA3 ASRGL1 BAZ1A C16orf72 CASK CDCA2 ALG14 APBB2 ATAD1 BAZ1B C16orf74 CASP8 CDCA4 ALG6 APC ATAD2 BAZ2B C16orf95 CAST CDCA8 ALG9 API5 ATE1 BBS2 C17orf67 CBR4 CDH12 ALKBH1 APMAP ATF3 BBX C17orf75 CBX3 CDH4 ALKBH3 APOO ATF6 BCAS3 C18orf54 CBX5 CDH5 ALKBH8 APOOL ATF7IP BCAT1 C1D CC2D2A CDH9 ALPK1 APP ATG3 BCKDHB C1orf112 CC2D2B CDK13 AMBRA1 AQR ATG4C BCL2 C1orf21 CCAR1 CDK17 AMFR ARAP1 ATG5 BCL2L13 C1orf43 CCBL1 CDK19 AMIGO2 ARAP2 ATIC BCL2L2-PABPN1 C1orf50 CCDC109B CDK5RAP2 AMMECR1 AREL1 ATL2 BEAN1 C1orf56 CCDC112 CDK6 AMOTL1 ARFGEF1 ATL3 BEND3 C1QTNF1 CCDC117 CDK7 ANAPC10 ARFGEF2 ATP11A BET1L C1S CCDC126 CDK8 ANAPC16 ARFIP1 ATP11B BIN3 C20orf194 CCDC18 CDKAL1 ANAPC5 ARFIP2 ATP11C BIRC2 C21orf2 CCDC25 CDKL4 ANAPC7 ARHGAP15 ATP13A3 BIRC3 C21orf58 CCDC30 CDKN1A ANGEL2 ARHGAP21 ATP1A1 BIRC6 C21orf59 CCDC34 CDR2 ANGPT1 ARHGAP27 ATP1B1 BIVM-ERCC5 C2orf42 CCDC39 CDYL ANK2 ARHGAP32 ATP1B3 BLM C3 CCDC40 CDYL2 ANKEF1 ARHGAP35 ATP2B1 BLOC1S2 C3orf33 CCDC59 CEBPZ ANKH ARHGAP44 ATP2C1 BMP8B C3orf58 CCDC66 CELF1 ANKHD1 ARHGEF10 ATP5C1 BMPR2 C3orf67 CCDC68 CELSR1 ANKHD1-EIF4EBP3 ARHGEF11 ATP5J2-PTCD1 BMS1 C4orf27 CCDC77 CENPC ANKIB1 ARHGEF18 ATP5S BNIP3L C4orf29 CCDC82 CENPE ANKLE2 ARHGEF26 ATP6V0A1 BOD1 
C4orf45 CCDC85C CENPF ANKRA2 ARHGEF28 ATP6V0A2 BOD1L1 C5orf30 CCDC88A CENPI ANKRD11 ARHGEF40 ATP6V0D1 BORA C5orf34 CCDC88C CENPJ ANKRD13C ARID1A ATP6V0E1 BPGM C6orf106 CCDC91 CENPK ANKRD17 ARID4A ATP6V1B2 BPTF C7orf50 CCDC93 CENPL ANKRD26 ARID4B ATP6V1H BRAF C7orf55-LUC7L2 CCNA2 CENPM ANKRD28 ARID5B ATP7B BRAP C8orf44-SGK3 CCNB1 CENPO ANKRD30B ARIH2 ATP8B4 BRCA2 C8orf88 CCNB1IP1 CENPP ANKRD42 ARL15 ATP9B BRD4 C9orf3 CCND3 CENPT ANKRD50 ARMC12 ATPAF1 BRD7 C9orf85 CCNK CENPU ANKS1A ARMC2 ATPAF2 BRD8 CA5B CCNT2 CENPV ANKS6 ARMC4 ATPIF1 BRF2 CAB39 CCPG1 CEP104 ANLN ARMC9 ATRNL1 BRIP1 CACHD1 CCT2 CEP120 ANO10 ARMT1 ATRX BRMS1L CACNA1A CCT3 CEP126 ANO2 ARRDC1-AS1 ATXN2 BROX CACNA2D1 CCT4 CEP131 ANP32B ARSB ATXN7 BTAF1 CACNA2D3 CCT5 CEP152 ANTXR1 ARSJ ATXN7L1 BTBD3 CACNB1 CCT6A CEP164

CEP192 CLUAP1 CPSF3L CYBA DHX34 DPM1 EIF3B CEP290 CLVS1 CPT1A CYBB DHX38 DPY19L1 EIF3L CEP295 CLYBL CPVL CYP27C1 DHX57 DPYD EIF4EBP2 CEP44 CMB9-22P13.1 CRADD CYR61 DHX9 DPYSL5 EIF4ENIF1 CEP55 CMBL CRCP CYTH3 DIAPH3 DRAM2 EIF4G1 CEP57 CMC1 CREB1 DAB1 DIDO1 DRD3 EIF4G3 CEP70 CMPK1 CREBL2 DAB2 DIMT1 DROSHA EIF5 CEP85 CMSS1 CREBRF DACH2 DIO2 DSCAM ELAVL1 CEP85L CMTM1 CREG2 DAGLB DIP2A DSCR3 ELAVL2 CEP89 CMTM4 CRIM1 DARS DIP2B DSEL ELL2 CEP95 CMTM7 CRIPT DARS2 DIS3L DSN1 ELMO1 CERK CNBD1 CRK DAZAP1 DIS3L2 DSP ELMO2 CERS5 CNGA1 CRLF3 DAZAP2 DKC1 DST ELP3 CERS6 CNKSR2 CRNKL1 DBN1 DKK2 DSTN ELP6 CFAP53 CNNM2 CROCC DBR1 DLEU1 DSTYK EME1 CFAP61 CNNM4 CRTC1 DCAF10 DLG1 DTWD1 ENAH CFAP97 CNOT1 CRTC3 DCAF12 DLG5 DTYMK ENKD1 CFTR CNOT10 CSGALNACT1 DCAF7 DLGAP4 DUS2 ENOSF1 CGNL1 CNOT11 CSMD2 DCAF8 DMAP1 DUSP16 ENOX1 CHAF1B CNOT6 CSMD3 DCBLD1 DMWD DYM ENSA CHCHD3 CNOT7 CSNK1A1 DCDC1 DMXL1 DYNC2H1 ENTHD1 CHD1 CNPY2 CSNK1E DCLK1 DNAAF5 DYNLL1 EP400 CHD2 CNR2 CSNK1G1 DCLRE1A DNAH11 DYRK1A EPB41L2 CHD4 CNRIP1 CSNK1G3 DCTN1 DNAH14 DYRK4 EPC1 CHD6 CNTLN CSPP1 DCTN5 DNAH2 DZIP1 EPC2 CHEK2 CNTN1 CSTF1 DCUN1D5 DNAH5 DZIP3 EPHA5 CHIC1 CNTN4 CSTF3 DDAH1 DNAH6 E2F1 EPHA6 CHKA CNTNAP2 CTAGE5 DDB1 DNAH7 E2F3 EPM2A CHM CNTNAP4 CTBP2 DDRGK1 DNAH8 E2F5 EPN1 CHML CNTNAP5 CTC-429P9.4 DDX1 DNAH9 E2F8 EPRS CHMP4A CNTRL CTC-432M15.3 DDX10 DNAJA2 EBAG9 EPS8 CHPT1 CNTROB CTCF DDX19A DNAJA3 EBF2 EPSTI1 CHRNA5 COA5 CTD-2116N17.1 DDX21 DNAJA4 ECH1 ERBB2 CHSY3 COG2 CTD-2135J3.4 DDX24 DNAJB11 ECI1 ERBB4 CHTOP COG3 CTD-2140B24.4 DDX31 DNAJB14 ECI2 ERC1 CHUK COG5 CTD-3074O7.11 DDX46 DNAJB6 EDC4 ERCC4 CIAO1 COG8 CTDSP1 DDX55 DNAJC1 EDEM2 ERCC5 CISD1 COIL CTGF DDX60 DNAJC10 EDEM3 ERCC6 CKAP4 COL14A1 CTH DDX60L DNAJC13 EDIL3 ERCC6-PGBD3 CKAP5 COL3A1 CTIF DEGS1 DNAJC21 EEA1 ERCC8 CKLF COL4A3 CTNNA1 DEK DNAJC24 EEF1B2 ERG CKLF-CMTM1 COL4A5 CTNNA3 DENND1B DNAJC3 EEF2K ERICH1 CLASP1 COL5A1 CTNNAL1 DENND4C DNAJC6 EFCAB14 ERLEC1 CLASP2 COL6A6 CTNNBL1 DENND5A DNASE1 EFCAB2 ERO1A CLCC1 COL8A1 CTNND2 DENND5B DNHD1 EFTUD1 
ESCO2 CLCN3 COLGALT1 CTR9 DEPDC4 DNM1L EGFR ESF1 CLDN12 COMMD1 CTSC DERA DNM2 EGLN2 ESR1 CLDND1 COMMD10 CTSZ DESI2 DNMT1 EHBP1 ESYT1 CLGN COPB2 CUL1 DGKA DOC2A EHBP1L1 ESYT2 CLIC2 COPS2 CUL2 DGKB DOCK11 EIF2A ETF1 CLIC5 COQ5 CUL3 DGKD DOCK2 EIF2AK2 ETS2 CLK1 COTL1 CUL4A DGKE DOCK4 EIF2B2 ETV5 CLN3 COX10 CUL5 DHCR7 DOCK7 EIF2B3 EVC CLN5 CPEB2 CUX1 DHFR DOCK8 EIF2B5 EXD3 CLNS1A CPLX1 CWC22 DHRS7B DOCK9 EIF2D EXOC1 CLOCK CPNE3 CWC27 DHX15 DONSON EIF2S1 EXOC3 CLPB CPNE7 CXADR DHX29 DOPEY1 EIF2S3 EXOC4 CLSPN CPS1 CYB561A3 DHX30 DPF3 EIF2S3L EXOC6 CLU CPSF2 CYB5RL DHX32 DPH7 EIF3A EXOSC10

EXOSC2 FBN2 FSHR GLA GSPT1 HM13 IFI16 EXOSC8 FBRSL1 FSIP1 GLCE GSR HMCN1 IFI44 EXOSC9 FBXL13 FSTL4 GLG1 GSTCD HMG20A IFIH1 EXPH5 FBXL20 FTH1 GLI3 GSTM3 HMGN5 IFNAR1 EXT1 FBXL5 FTO GLIS2 GTF2A1 HMMR IFNAR2 EXT2 FBXO10 FTSJ1 GLP2R GTF2E2 HNRNPA1L2 IFRD1 EXTL3 FBXO11 FUBP1 GLS GTF2F2 HNRNPA2B1 IFT122 EYA3 FBXO16 FUS GLYR1 GTF2H1 HNRNPC IFT172 EYA4 FBXO22 FUT5 GMDS GTF2H3 HNRNPD IFT52 EZH1 FBXO3 FUT8 GMEB2 GTF3C3 HNRNPH2 IFT88 EZH2 FBXO32 FXR1 GMFB GTF3C4 HNRNPK IGF1R F3 FBXO34 FYB GMIP GTF3C5 HNRNPL IGF2BP1 FAAH2 FBXO4 FYCO1 GNA12 GTF3C6 HNRNPLL IGF2BP3 FAF1 FBXO9 FZD6 GNAL GTPBP2 HNRNPM IGF2R FAF2 FBXW11 GAA GNAO1 GTPBP4 HNRNPU IGFBP4 FAM102B FBXW7 GAB1 GNAQ GTSE1 HNRNPUL2 IGHMBP2 FAM118B FDFT1 GABARAPL1 GNB1 GUF1 HOMER2 IGSF11 FAM120A FER GABPB1 GNB4 GYG1 HOMEZ IKBKB FAM120B FEZ2 GABPB2 GNG7 H2AFV HOOK1 IKZF3 FAM120C FGF12 GABRA3 GNL2 H2AFY HOOK3 IL10RB FAM135A FGF14 GABRE GNL3 HACD1 HOXC6 IL15 FAM13B FGF2 GADD45GIP1 GNPAT HACE1 HP1BP3 IL18 FAM161B FGFR1OP GAK GNPDA1 HADHA HPS1 IL32 FAM172A FGFR1OP2 GALK2 GOLGA1 HAT1 HPS3 IL4I1 FAM179B FGFR2 GALNT1 GOLGA3 HAUS3 HPSE IMMP1L FAM184A FHIT GALNT11 GOLGA4 HAUS7 HRSP12 IMMP2L FAM185A FIG4 GALNT15 GOLGA8A HAVCR2 HS2ST1 IMMT FAM189A1 FIGNL1 GALNT2 GOLGA8B HBS1L HS3ST3A1 IMPAD1 FAM189A2 FIP1L1 GALNT5 GOLGB1 HDAC4 HS3ST4 INADL FAM19A2 FKBP15 GAN GOLIM4 HDAC5 HS6ST2 INF2 FAM19A5 FKBP5 GAREM GOSR2 HDAC6 HS6ST3 INIP FAM200A FLNB GARNL3 GPATCH2L HDGFRP3 HSBP1L1 INO80E FAM200B FLVCR1 GARS GPATCH8 HDHD1 HSD17B12 INPP4B FAM208A FMN1 GART GPBP1L1 HDLBP HSF1 INPP5A FAM208B FMN2 GAS2L1 GPC3 HDX HSF2 INSR FAM210A FMNL1 GAS2L3 GPC5 HEATR3 HSP90AA1 INTS10 FAM227A FMNL2 GATAD1 GPCPD1 HEATR5A HSPA12A INTS2 FAM46A FMO4 GATAD2B GPHN HEATR6 HSPA14 INTS4 FAM49B FMO5 GATC GPI HECTD1 HSPA4 INTS7 FAM73A FMR1 GBE1 GPR107 HECW2 HSPA4L INTU FAM76A FNBP1 GBF1 GPR158 HELLS HSPA8 INVS FAM76B FNBP1L GCC2 GPR160 HELQ HSPA9 IPCEF1 FAM78B FNDC3A GCFC2 GPR176 HELZ HSPB8 IPMK FAM83D FNDC3B GCHFR GPRIN1 HERC1 HSPBP1 IPO11 FAM91A1 FNIP1 GCLC GPS2 HERPUD2 HSPD1 
IPO13 FAM92A1 FOCAD GCN1 GPSM2 HEXA HSPE1 IPO5 FAM98A FOSB GDI2 GRAMD1C HEXB HTT IPO7 FAN1 FOXJ3 GDPD1 GRAP2 HEXDC HYDIN IPO8 FANCA FOXK2 GEMIN5 GRB10 HEXIM2 HYOU1 IPO9 FANCB FOXL1 GEMIN6 GRID2 HGSNAT IARS IPP FANCD2 FOXP1 GEMIN7 GRID2IP HHLA2 IBTK IPPK FANCM FPGT-TNNI3K GGA2 GRIK3 HIATL1 ICAM2 IQCB1 FAP FRAS1 GGCT GRIK4 HIATL2 ICE1 IQCG FARSA FREM2 GGH GRIN2B HIBCH ICMT IQGAP1 FASTKD2 FRG1BP GGPS1 GRK5 HIGD1A ID2 IQGAP2 FAT1 FRMD4A GIGYF2 GRM8 HIVEP2 IDE IQSEC2 FBLN1 FRMD4B GINM1 GRTP1 HKR1 IDH3A IRAK1 FBLN2 FRMD5 GIPC1 GSDMC HLCS IDI1 IRAK1BP1 FBN1 FRMD6 GIT2 GSK3B HLX IFFO2 IRAK4

IREB2 KIAA0020 KTN1 LRP8 MARCH1 MGMT MSRA ISY1 KIAA0141 KYNU LRR1 MARCH6 MGRN1 MSTO1 ISY1-RAB43 KIAA0195 L1CAM LRRC23 MARK1 MGST1 MT1E ITCH KIAA0196 L3HYPDH LRRC3B MARK3 MIB1 MT1M ITFG1 KIAA0232 L3MBTL1 LRRC41 MARK4 MICAL3 MTA3 ITGA1 KIAA0319L LA16c-306E5.2 LRRC49 MAST2 MICU1 MTAP ITGA11 KIAA0368 LACE1 LRRC4C MAST4 MIER1 MTCH1 ITGA6 KIAA0391 LACTB LRRCC1 MATR3 MIOS MTCH2 ITGA8 KIAA0430 LACTB2-AS1 LRRFIP1 MAX MIS18BP1 MTDH ITGAE KIAA0586 LAMA4 LRRFIP2 MBD4 MITD1 MTHFD2 ITGAM KIAA0825 LAMB1 LRRK1 MBD5 MKLN1 MTHFS ITGAV KIAA0907 LAMC1 LRRTM4 MBNL1 MKRN3 MTIF2 ITGB8 KIAA0922 LAMP2 LSG1 MBTPS1 MLIP MTL5 ITPKB KIAA1033 LANCL1 LSM14A MBTPS2 MLK4 MTMR12 ITPR2 KIAA1217 LAPTM4B LSM5 MCAM MLLT10 MTMR14 ITPR3 KIAA1429 LARP1 LTA4H MCCC1 MLLT3 MTMR2 ITSN1 KIAA1549 LARP1B LTBP1 MCF2L MMD MTMR6 ITSN2 KIAA1715 LARP4 LUC7L2 MCFD2 MMEL1 MTOR JADE3 KIAA1958 LARP7 LURAP1L MCM3 MMP26 MTRF1 JAG2 KIF11 LARS LUZP2 MCM3AP MNAT1 MTRF1L JAGN1 KIF13A LARS2 LYAR MCM4 MND1 MTRNR2L10 JAK2 KIF15 LATS2 LYN MCM9 MNS1 MTRNR2L6 JAKMIP2 KIF18A LBR LYPD6 MCMBP MON2 MTRNR2L8 JAM2 KIF1B LCMT1 LYPD6B MCOLN2 MORC3 MTRR JARID2 KIF1BP LCOR LYRM7 MCPH1 MORF4L1 MTUS1 JMJD1C KIF20B LDAH LYSMD2 MCTS1 MORN2 MTUS2 JPH3 KIF23 LDB2 LYSMD3 MCU MPDZ MUM1 KANK1 KIF24 LDHA LYST MDFIC MPHOSPH10 MVB12A KANSL1 KIF26A LDHB LZIC MDGA2 MPP1 MVP KANSL1L KIF3C LDLR LZTFL1 MDH1B MPP3 MXD4 KANSL2 KIF5A LDLRAD3 M6PR MDN1 MPP5 MXI1 KANSL3 KIF5B LEKR1 MACF1 ME1 MPP6 MXRA7 KARS KIFC1 LEMD3 MAD1L1 ME2 MPP7 MYBL1 KAT2A KIFC3 LEPR MAD2L1BP MECP2 MRE11A MYBPC1 KAT8 KIN LGALS3 MAEA MED1 MRPL18 MYCBP2 KATNA1 KLC1 LHFPL2 MAGED1 MED13 MRPL19 MYH11 KATNBL1 KLF6 LHFPL3 MAGOH MED13L MRPL30 MYH9 KBTBD2 KLHDC4 LIMCH1 MAGOHB MED21 MRPL33 MYO10 KCMF1 KLHL1 LIMK2 MAGT1 MED28 MRPL34 MYO16 KCNG1 KLHL12 LINGO2 MAK16 MED4 MRPL39 MYO18A KCNG3 KLHL22 LINS MALRD1 MEF2A MRPL4 MYO1B KCNH1 KLHL5 LIPA MALT1 MEGF8 MRPL42 MYO1E KCNH8 KLHL7 LMAN2L MAMDC2 MEI1 MRPL43 MYO1H KCNJ15 KLRC3 LMBR1 MAN1A1 MEMO1 MRPL46 MYO5A KCNJ6 KMT2A LMBRD1 MAN2B2 MET MRPL47 MYO5B KCNQ1 
KMT2D LMO7 MANBA METAP1D MRPL48 MYO6 KCNQ3 KMT2E LNP1 MAP2K5 METTL13 MRPS11 MYO9A KCNQ5 KNSTRN LOH12CR1 MAP2K7 METTL14 MRPS14 MYO9B KCTD3 KNTC1 LONP1 MAP3K12 METTL20 MRPS22 MYOF KDELR2 KPNA4 LONP2 MAP3K5 METTL4 MRPS23 MYPN KDELR3 KPNA6 LOXHD1 MAP3K7 MFAP1 MRPS25 MYRFL KDM1B KPNB1 LPIN2 MAP4 MFF MRPS27 MYT1L KDM2A KRAS LPPR5 MAP4K4 MFN1 MRPS6 N4BP1 KDM3A KRBOX1 LRBA MAP7 MGA MRPS9 N4BP2 KDM4A KREMEN1 LRIG2 MAP7D3 MGAT4A MSANTD3 N4BP2L2 KDM4C KRIT1 LRP11 MAPK12 MGAT4C MSC-AS1 NAA10 KDM5A KRR1 LRP12 MAPK1IP1L MGAT5 MSH3 NAA16 KDM6A KRT34 LRP1B MAPK8 MGEA5 MSI2 NAA35 KHDRBS2 KRT7 LRP2 MAPKAP1 MGLL MSR1 NAA60

NAALADL2 NFASC NT5DC3 OSCP1 PCGF3 PHTF2 PNPT1 NAB1 NFATC2IP NT5E OSMR PCGF5 PI4K2B PODNL1 NABP1 NFATC3 NTAN1 OSTC PCM1 PI4KA POFUT2 NACC2 NFE2L2 NTN1 OTOA PCMTD2 PIAS1 POGLUT1 NADK NFIA NTRK2 OTUD4 PCNA PIAS2 POLA1 NADK2 NFIB NTRK3 OTUD6B PCNX PIBF1 POLB NADSYN1 NFIC NUB1 OTUD7A PCNXL2 PICALM POLD3 NAGA NFX1 NUBPL OTULIN PCNXL4 PIEZO2 POLE NALCN NGDN NUCB1 OVCH1-AS1 PCSK1 PIGC POLG2 NAP1L4 NGLY1 NUCB2 OXCT1 PCSK6 PIGN POLK NAPB NHSL1 NUCKS1 OXNAD1 PCYOX1 PIGX POLR1A NASP NIPBL NUDT12 OXR1 PCYT1A PIK3C2A POLR3B NAT10 NISCH NUDT13 P3H1 PDCD10 PIK3C2B POLR3C NAV1 NKAIN1 NUDT3 P3H2 PDCD11 PIK3R3 POLR3E NAV3 NKAIN2 NUDT4 P4HA1 PDCD6IP PIN1 POLR3G NBEA NKTR NUDT5 PABPC3 PDCL PINX1 POMGNT1 NBEAL1 NLN NUF2 PACRGL PDE1A PIP4K2A POMP NBPF1 NLRP4 NUFIP2 PACSIN2 PDE1C PITHD1 POMZP3 NBPF20 NME7 NUMB PACSIN3 PDE2A PITPNC1 POR NBPF3 NMNAT1 NUP133 PADI4 PDE3A PJA2 POU2F1 NCAPD2 NOC3L NUP153 PAFAH1B1 PDE4B pk PPA2 NCAPG NOL10 NUP160 PAIP1 PDE4D PKHD1L1 PPARD NCAPG2 NOL11 NUP205 PAIP2 PDE6A PKIG PPARG NCAPH NOL4L NUP210 PAK1 PDE8B PKMYT1 PPARGC1A NCBP1 NOLC1 NUP214 PAK1IP1 PDGFD PKN2 PPARGC1B NCDN NONO NUP50 PAK2 PDGFRA PKNOX2 PPAT NCEH1 NOP58 NUP54 PALB2 PDHX PKP2 PPFIA1 NCK1 NOS1 NUP62 PALLD PDIA6 PKP4 PPHLN1 NCKAP1 NOS1AP NUP88 PALM2-AKAP2 PDLIM5 PLA2G4A PPIB NCKAP5 NOTCH2 NUP98 PAM PDS5A PLA2R1 PPID NCOA5 NOV NUPL2 PANK2 PDS5B PLAA PPIE NCOA6 NPAS2 NUSAP1 PANX2 PDSS2 PLAGL1 PPIG NCOR1 NPAT NUTF2 PAPSS2 PDXDC1 PLCB1 PPIL3 NDC1 NPC1 NVL PARD3 PDXK PLCB2 PPIL4 NDFIP1 NPEPPS NXN PARD3B PEAK1 PLCB4 PPIL6 NDRG4 NPHP1 NXPE3 PARD6G PELI2 PLCE1 PPIP5K1 NDUFA5 NPSR1 NYAP2 PARK2 PEPD PLCZ1 PPL NDUFAB1 NR1I2 OARD1 PARL PERP PLD3 PPM1G NDUFAF4 NR2C1 ODF2 PARN PES1 PLEK2 PPM1K NDUFAF6 NR2C2 ODF2L PARP1 PFDN2 PLEKHA1 PPME1 NDUFB4 NR3C1 OFD1 PARP2 PFKFB3 PLEKHA2 PPP1R10 NDUFB9 NR3C2 OGDH PARP8 PFKP PLEKHA3 PPP1R12A NDUFS1 NR4A1 OGFOD1 PARPBP PFN2 PLEKHA5 PPP1R12B NDUFS4 NR4A3 OGT PARVB PGBD3 PLEKHA6 PPP1R21 NEBL NRCAM OLA1 PATL1 PGBD5 PLEKHA7 PPP1R37 NECAP1 NRD1 ONECUT2 PAWR PGPEP1 PLEKHA8 
PPP2CB NECAP2 NRG3 OPA1 PAXBP1 PHACTR4 PLEKHB2 PPP2R2D NEDD4 NRP1 OPCML PBK PHB PLEKHM3 PPP2R3A NEDD4L NRP2 OPHN1 PBRM1 PHF20 PLK1 PPP2R5C NEGR1 NRXN1 OPN3 PBX1 PHF20L1 PLOD2 PPP3CA NEIL2 NSD1 ORC1 PBX3 PHF6 PLS3 PPP3CC NEK10 NSFL1C ORC2 PC PHF7 PLSCR4 PPP3R1 NEK11 NSMAF ORC3 PCBD2 PHIP PLXDC2 PPP4R1 NELL2 NSMCE2 ORC5 PCBP3 PHKA1 PMEPA1 PPP5C NEMF NSMCE4A OSBP2 PCCA PHKB PML PPP6R1 NEO1 NSUN2 OSBPL1A PCDH11Y PHLDB3 PMM2 PPP6R3 NETO2 NSUN6 OSBPL3 PCDH17 PHLPP1 PMPCA PPWD1 NEXN NT5C2 OSBPL5 PCDH7 PHLPP2 PMS1 PQLC1 NF1 NT5DC2 OSBPL8 PCF11 PHTF1 PNKD PRC1

PRCP PTGES3 RALBP1 RFXAP RP11-322E11.6 RSRP1 SEPT8 PRDM11 PTGS2 RANBP17 RGS2 RP11-343C2.12 RTCA SEPT9 PRDM4 PTK2 RANBP2 RGS22 RP11-38C17.1 RTN3 SERF2 PRDX6 PTPN11 RANBP6 RGS3 RP11-407N17.3 RUFY2 SERGEF PRELID2 PTPN12 RANBP9 RGS6 RP11-458D21.5 RUNX1 SERINC3 PREX2 PTPN2 RAP1B RGS7 RP11-468E2.1 RUNX3 SERPINB1 PRKAA1 PTPN21 RAPGEF4 RHBDL2 RP11-47I22.4 RUVBL1 SERPINB6 PRKACB PTPN9 RAPGEF6 RHOBTB3 RP11-574F21.3 RWDD2B SESTD1 PRKAR1A PTPRA RARS RHOF RP11-618P17.4 RYK SET PRKCA PTPRB RARS2 RHOT1 RP11-65D24.2 SACM1L SETBP1 PRKCD PTPRK RASA2 RIF1 RP11-73M18.2 SACS SETD1A PRKCH PTPRN2 RASGEF1A RILPL1 RP11-834C11.12 SAFB SETD1B PRKD3 PTPRR RASGEF1B RIMS1 RP11-977G19.10 SAMD11 SETD2 PRKDC PTPRS RASSF8 RIMS2 RP13-512J5.1 SAP130 SETD5 PRKRA PTRF RAVER2 RIOK1 RP13-672B3.2 SAR1B SETD7 PRMT2 PTRH2 RB1 RIOK3 RP2 SARS SETX PROSC PUM1 RB1CC1 RIPK4 RP5-972B16.2 SASH1 SEZ6L PRPF18 PUM2 RBBP8 RLF RPAP2 SASS6 SEZ6L2 PRPF3 PUS1 RBFOX1 RMDN1 RPAP3 SATB2 SF1 PRPF4 PUS7 RBFOX2 RMDN2 RPF1 SAV1 SF3B1 PRPF40A PUS7L RBM19 RMI2 RPF2 SBDS SF3B2 PRPF4B PVR RBM25 RMND1 RPL10A SBF2 SFI1 PRPF6 PVRL1 RBM26 RNASEH1 RPL18 SCAF11 SFMBT1 PRPSAP1 PVRL3 RBM27 RNASEH2B RPL23 SCAF4 SFSWAP PRPSAP2 PXMP2 RBM28 RNF11 RPL23A SCAI SFXN5 PRR11 PYGL RBM3 RNF111 RPL37 SCAPER SGCE PRR14L QRSL1 RBM34 RNF123 RPL4 SCFD1 SGCG PRR16 QSER1 RBM39 RNF126 RPL5 SCFD2 SGK1 PRR3 R3HCC1 RBM41 RNF13 RPL7 SCG5 SGK3 PRR5L R3HDM2 RBM47 RNF130 RPL7A SCLT1 SGMS1 PRRC2A R3HDM4 RBM4B RNF150 RPN1 SCMH1 SGOL2 PRRC2B RAB10 RBM6 RNF182 RPN2 SCO1 SGPL1 PRRC2C RAB18 RBMS1 RNF19A RPP14 SCTR SGTB PRRG1 RAB1A RBMS2 RNF212 RPP40 SCUBE1 SH2D4A PRSS21 RAB21 RBMS3 RNF214 RPRD2 SDCBP SH3BGR PSD3 RAB27A RBPMS RNF216 RPS10-NUDT3 SDCCAG3 SH3D19 PSEN1 RAB27B RC3H2 RNF217 RPS12 SDHA SH3GLB1 PSMA1 RAB31 RCBTB2 RNF220 RPS16 SDHAF2 SH3GLB2 PSMA2 RAB3GAP1 RCC1 RNF31 RPS2 SEC13 SH3KBP1 PSMA5 RAB3GAP2 RCN2 RNF4 RPS26 SEC14L1 SH3PXD2A PSMA7 RAB5A RCOR1 RNF8 RPS4X SEC23IP SH3RF1 PSMC2 RAB8A RCOR3 RNGTT RPS5 SEC24B SH3RF2 PSMC3IP RAB9A RECQL RNMT RPS6 SEC24D SH3TC2 
PSMC6 RABEP1 REEP3 RNPS1 RPS6KA2 SEC31A SHB PSMD1 RABGGTB RELB ROBO1 RPS6KA4 SEC61A1 SHCBP1 PSMD14 RABL3 REPS1 ROCK2 RPS6KB1 SEC63 SHFM1 PSMD4 RABL6 RER1 ROR2 RPS7 SECISBP2L SHISA9 PSMD6 RAD1 RERE RORA RRAGC SEH1L SHMT1 PSME4 RAD17 REV3L RP1-66C13.4 RRAS2 SEL1L3 SHOC2 PSMG2 RAD18 REXO4 RP11-1035H13.3 RREB1 SELO SHROOM3 PSORS1C1 RAD21 RFC1 RP11-111H13.1 RRN3 SEMA3A SHTN1 PSPC1 RAD50 RFC4 RP11-159G9.5 RRNAD1 SEMA3E SIAH1 PSPH RAD51C RFT1 RP11-166B2.1 RRP15 SENP2 SIK1 PSRC1 RAD52 RFWD2 RP11-178L8.4 RRP36 SENP5 SIKE1 PSTPIP2 RAD54B RFX1 RP11-277P12.6 RSF1 SENP6 SIN3A PTBP3 RAD54L2 RFX2 RP11-286N22.8 RSL1D1 SENP7 SIN3B PTCD1 RAF1 RFX4 RP11-295P9.3 RSL24D1 SEPT11 SIPA1L1 PTCHD1 RAI1 RFX5 RP11-298I3.5 RSPH3 SEPT2 SIX4 PTEN RALB RFX7 RP11-302B13.5 RSPO3 SEPT6 SKAP2

SKIV2L2 SMARCA5 SPEN STX16 TBC1D5 TIMM17A TMLHE SLC12A2 SMARCAD1 SPG11 STX2 TBC1D9 TIMM8A TMOD1 SLC12A8 SMARCB1 SPG21 STX5 TBC1D9B TIMP3 TMOD3 SLC13A3 SMARCC1 SPG7 STXBP1 TBCD TIPIN TMTC1 SLC16A14 SMARCD1 SPICE1 STXBP5 TBCE TJP1 TMTC4 SLC19A1 SMC2 SPIDR STYXL1 TBCK TKT TMX1 SLC1A4 SMC3 SPINK6 SUB1 TBK1 TLE1 TNFRSF10A SLC20A2 SMC4 SPOCK1 SUCLA2 TBL1Y TLE2 TNFRSF21 SLC22A5 SMC5 SPOPL SUFU TBP TLK1 TNIP1 SLC24A3 SMC6 SPRED2 SUGCT TBPL1 TLK2 TNKS2 SLC25A12 SMCHD1 SPRTN SUGP2 TBXAS1 TLL1 TNNI3K SLC25A13 SMCO4 SPSB1 SUGT1 TC2N TLN2 TNPO1 SLC25A23 SMG5 SPSB4 SULF1 TCAIM TM4SF1 TNPO3 SLC25A24 SMOC1 SPTBN1 SULF2 TCEA1 TM9SF1 TNRC6A SLC25A3 SMPDL3A SPTLC2 SUPT16H TCEA2 TM9SF2 TNRC6B SLC25A32 SMS SPTSSA SUPT3H TCEANC2 TM9SF3 TNS4 SLC25A37 SMURF1 SQLE SUSD1 TCERG1 TMBIM6 TOMM20 SLC25A42 SMYD3 SQSTM1 SUV39H2 TCF20 TMC5 TOMM70A SLC27A2 SNCA SRBD1 SUZ12 TCF7L2 TMC7 TOP1 SLC27A4 SND1 SRCAP SVBP TCFL5 TMCC1 TOP1MT SLC30A4 SNED1 SREBF2 SWAP70 TCOF1 TMCO3 TOP2A SLC30A5 SNRNP200 SREK1 SWT1 TCTEX1D2 TMCO4 TOR1AIP2 SLC30A9 SNRNP27 SRFBP1 SYCP1 TDRD9 TMEFF1 TOR1B SLC35A5 SNTB1 SRGAP3 SYCP2L TEAD4 TMEM110 TOX SLC35B3 SNTB2 SRP19 SYDE1 TEC TMEM110 TP53 SLC35F2 SNX1 SRRD SYN3 TECPR2 TMEM116 TP53BP2 SLC38A1 SNX11 SRRM2 SYNE2 TEK TMEM117 TPD52 SLC38A10 SNX16 SRSF1 SYNGAP1 TENM3 TMEM126B TPD52L2 SLC38A2 SNX2 SRSF11 SYNJ2 TERF1 TMEM131 TPK1 SLC38A6 SNX24 SRSF4 SYT15 TERF2 TMEM132D TPM1 SLC39A11 SNX29 SRSF9 TAF1 TES TMEM147 TPM4 SLC39A14 SNX4 SSB TAF12 TEX11 TMEM156 TPP2 SLC41A2 SNX5 SSBP2 TAF1A TEX14 TMEM159 TPR SLC4A1AP SNX6 SSFA2 TAF1B TEX9 TMEM161B TPRA1 SLC5A3 SNX7 SSR3 TAF1D TFAP2A TMEM164 TPRG1 SLC6A11 SNX8 SSX2IP TAF3 TFCP2 TMEM180 TPST1 SLC6A15 SOAT1 ST18 TAF4 TFDP1 TMEM184C TPST2 SLC7A1 SOCS5 ST20-MTHFS TAF4B TFDP2 TMEM187 TPTE2 SLC7A11 SOD2 ST3GAL2 TAF9B TFG TMEM189 TPX2 SLC7A2 SON ST6GALNAC4 TANC1 TFIP11 TMEM189 TRA2A SLC7A6 SORBS2 ST6GALNAC6 TANC2 TFPI TMEM19 TRAF2 SLC8B1 SORCS1 ST7 TANGO6 TFRC TMEM2 TRAF3 SLC9A8 SOS1 ST7L TANK TG TMEM209 TRAF3IP1 SLCO1B1 SOS2 STAG1 TAOK3 TGFB2 
TMEM220 TRAF3IP2 SLCO1B7 SOX13 STAG2 TAP2 TGFBI TMEM222 TRAF5 SLCO3A1 SOX5 STAM TARBP1 TGS1 TMEM223 TRAFD1 SLF1 SOX7 STAMBPL1 TARS THADA TMEM245 TRAIP SLIT2 SPAG6 STARD9 TASP1 THAP1 TMEM248 TRAK2 SLMAP SPAG9 STAU2 TATDN1 THBS1 TMEM254 TRAP1 SLMO1 SPATA16 STEAP1B TAX1BP1 THEM4 TMEM260 TRAPPC11 SLMO2 SPATA2 STK3 TBC1D10B THOC3 TMEM33 TRAPPC2B SLTM SPATA22 STK32B TBC1D14 THOC7 TMEM44 TRAPPC8 SLU7 SPATA5L1 STK38L TBC1D15 THSD4 TMEM45A TRAPPC9 SMAD2 SPATS2 STPG2 TBC1D22A THUMPD2 TMEM59 TRDMT1 SMAD5 SPATS2L STRBP TBC1D22B THUMPD3 TMEM62 TREX2 SMAD6 SPDL1 STRC TBC1D23 TIAL1 TMEM63C TRIM26 SMAP2 SPECC1 STRN TBC1D30 TIAM2 TMEM64 TRIM33 SMARCA1 SPECC1L STRN4 TBC1D31 TICRR TMEM68 TRIM69 SMARCA2 SPEF2 STT3A TBC1D4 TIMELESS TMEM87B TRIM9

TRIO TXNDC16 UNC5C VPRBP WSCD1 ZDHHC6 ZNF430 TRIP11 TXNDC9 UNK VPS13A WWC1 ZDHHC7 ZNF45 TRIP13 TXNL1 UPF3A VPS13B WWC2 ZER1 ZNF462 TRIP4 TXNRD1 UPP2 VPS35 WWP1 ZFAND1 ZNF492 TRMT5 TXNRD2 UPRT VPS37A WWP2 ZFAND4 ZNF496 TRMT6 TYRO3 UQCC1 VPS41 WWTR1 ZFAT ZNF507 TROVE2 U2SURP URB1 VPS53 XPNPEP3 ZFC3H1 ZNF510 TRPC1 UACA URGCP VPS8 XPO1 ZFP30 ZNF514 TRPM3 UBA2 URGCP-MRPS24 VRK1 XPO5 ZFP64 ZNF536 TRPM6 UBA5 URI1 VRK2 XPO6 ZFP90 ZNF544 TRPM7 UBA6 UROD VTA1 XPR1 ZFX ZNF547 TRRAP UBAC1 UROS VTI1A XRCC3 ZFYVE16 ZNF565 TRUB2 UBAP1 USH2A VWF XRCC5 ZFYVE28 ZNF568 TSC22D1 UBAP2 USHBP1 WAC XXbac-BPG246D15.9 ZGRF1 ZNF589 TSC22D2 UBC USO1 WAPAL XYLB ZHX1 ZNF594 TSEN15 UBE2D1 USP1 WARS YAE1D1 ZHX1-C8orf76 ZNF607 TSEN2 UBE2D2 USP11 WARS2 YAF2 ZHX3 ZNF618 TSG101 UBE2G2 USP13 WASF1 YAP1 ZKSCAN1 ZNF622 TSNAX UBE2I USP14 WBP2NL YARS ZKSCAN3 ZNF623 TSPAN3 UBE2K USP15 WDPCP YARS2 ZMAT2 ZNF627 TSPAN8 UBE2L3 USP16 WDR1 YEATS2 ZMYND11 ZNF638 TTC1 UBE2R2 USP19 WDR11 YLPM1 ZMYND19 ZNF644 TTC13 UBE2T USP20 WDR12 YME1L1 ZNF106 ZNF652 TTC17 UBE2W USP24 WDR25 YTHDC1 ZNF107 ZNF684 TTC23 UBE3A USP25 WDR26 YTHDF3 ZNF124 ZNF692 TTC27 UBE3B USP3 WDR3 YWHAE ZNF148 ZNF704 TTC3 UBE3D USP30 WDR34 YWHAH ZNF160 ZNF708 TTC30A UBE4A USP34 WDR35 YY1 ZNF165 ZNF720 TTC33 UBLCP1 USP38 WDR36 ZBED4 ZNF181 ZNF721 TTC39C UBN1 USP4 WDR41 ZBED5 ZNF197 ZNF730 TTC4 UBN2 USP47 WDR43 ZBED9 ZNF207 ZNF75A TTC7B UBOX5 USP53 WDR44 ZBTB1 ZNF215 ZNF766 TTF1 UBP1 USP6 WDR45 ZBTB24 ZNF233 ZNF770 TTK UBR2 USP7 WDR53 ZBTB25 ZNF250 ZNF804B TTL UBR4 USP8 WDR60 ZBTB38 ZNF251 ZNF860 TTLL11 UBR5 UTP11L WDR7 ZBTB4 ZNF253 ZNF98 TTLL3 UBR7 UTRN WDR70 ZBTB8A ZNF264 ZNHIT6 TTLL4 UBXN2A UXS1 WDR74 ZBTB8OS ZNF273 ZNRF3 TTLL5 UBXN4 VAMP3 WDR75 ZC3H11A ZNF280D ZRANB1 TTLL7 UBXN7 VAMP4 WDR76 ZC3H14 ZNF304 ZRANB2 TTN UCHL5 VAPA WDR78 ZC3H15 ZNF317 ZRANB3 TUBB6 UEVLD VAV2 WDR89 ZC3H7A ZNF326 ZSCAN1 TUBGCP3 UFD1L VCL WDYHV1 ZC3H8 ZNF330 ZSCAN12 TULP3 UFL1 VCP WIBG ZC3HAV1 ZNF384 ZSCAN5A TVP23A UGCG VDAC2 WIPF1 ZC3HC1 ZNF385B ZSWIM6 TWF1 UGGT2 VEGFC WIPI1 
ZCCHC11 ZNF41 ZWILCH TWISTNB UHRF2 VGLL4 WLS ZCCHC17 ZNF410 ZYG11A TXLNG UNC13B VIPAS39 WNK1 ZCCHC2 ZNF420 ZYG11B TXN UNC50 VKORC1L1 WRNIP1 ZCCHC7 ZNF43 ZZZ3

SSA: Genes with Decreased Poly(A) Sites

AAK1 AC073610.5 ACSL4 ADAMTSL3 ADSS AHCYL2 ALDH18A1 AASDH AC091801.1 ACTL6A ADAT2 AEBP2 AHI1 ALDH4A1 ABCB4 AC109829.1 ACTL8 ADGB AFG3L2 AK3 ALDH7A1 ABCC2 ACADL ADA ADGRE1 AGAP1 AKAP13 ALMS1 ABCC5 ACAT1 ADAM10 ADGRL2 AGBL1 AKAP6 AMY2A ABR ACOXL ADAM9 ADK AGGF1 AKR1D1 ANAPC16 AC018816.3 ACSL3 ADAMTS1 ADRA1D AGO2 AL021546.6 ANAPC7

ANGPT1 ATOX1 C1orf112 CDH12 CNBD1 CTNNA2 DMXL1 ANK2 ATP10D C1orf141 CDH18 CNGB1 CTNNA3 DMXL2 ANKHD1 ATP11C C1orf27 CDH20 CNKSR2 CTNND1 DNAAF2 ANKHD1-EIF4EBP3 ATP2B4 C21orf58 CDH3 CNN2 CTTN DNAH1 ANKIB1 ATP5O C3orf62 CDH4 CNNM2 CUL2 DNAH10 ANKMY1 ATP6V1D C4orf27 CDH5 CNOT1 CUL3 DNAH11 ANKRD18B ATP8A2 C4orf29 CDK13 CNOT6L CUL5 DNAH12 ANKRD55 ATR C5orf34 CDK19 CNOT8 CWC15 DNAH14 ANKS6 ATRX C6orf62 CDK5RAP2 CNR2 CWC27 DNAH5 ANLN ATXN7L1 C7orf50 CDK8 CNTLN CYP46A1 DNAH7 ANO4 AUTS2 C9orf171 CDKAL1 CNTN1 CYTH1 DNAJA1 ANP32B AVIL C9orf3 CDKL3 CNTNAP2 DAB1 DNAJA2 ANTXR1 AVL9 CA5B CDKL5 CNTNAP5 DACH2 DNAJC10 ANXA11 AZIN2 CAB39 CDKN2C CNTRL DAGLB DNAJC27 ANXA5 B2M CACNA1A CDKN3 COA4 DAP3 DNAJC8 ANXA6 B4GALT2 CACNA1I CDV3 COBLL1 DARS DNM3 ANXA7 B9D2 CACNA2D1 CEBPZ COG2 DARS2 DNMT1 AP000304.12 BACH1 CACNA2D3 CECR2 COG6 DAZAP1 DNTTIP2 AP1S3 BAZ1A CADM2 CENPC COL14A1 DBF4B DOCK2 AP3B1 BAZ2B CADPS2 CENPF COL4A3 DCC DOCK4 AP3B2 BBS2 CAGE1 CENPH COL4A3BP DCDC1 DOCK7 AP3D1 BBS7 CALB1 CENPN COL4A6 DCLK1 DOCK8 AP3M1 BCAS1 CALN1 CEP112 COMMD3-BMI1 DCLRE1A DOK1 APAF1 BCCIP CALU CEP120 COPA DCTN2 DOPEY1 APLP2 BCL2L14 CAMLG CEP135 COPS5 DCTN5 DPH6 APOOP5 BCLAF1 CAMSAP1 CEP192 COPZ1 DCUN1D5 DPP10 AQP9 BFAR CAND1 CEP290 CORO7-PAM16 DDAH1 DPP8 ARFGEF3 BHLHE40 CAPN15 CEP57 COX10 DDX10 DPYD ARFIP1 BIRC2 CAPRIN1 CEP89 COX4I1 DDX18 DPYSL5 ARGLU1 BLM CAPZB CEP97 COX7A2 DDX21 DRD3 ARHGAP10 BLOC1S6 CASQ2 CERS6 CPD DDX3X DSG2 ARHGAP44 BMI1 CBS CFAP36 CPLX1 DDX3Y DSP ARHGAP6 BMPR2 CBX3 CFAP44 CPNE1 DDX42 DST ARID1A BNC2 CCAR1 CFAP52 CR1 DDX46 DSTN ARID4B BNIP3L CCDC109B CFDP1 CRBN DDX59 DUSP19 ARL13B BOLA3 CCDC117 CGB1 CREG2 DEC1 DUT ARL14EPL BPIFC CCDC125 CGGBP1 CRNKL1 DENND1A DYM ARL2 BPTF CCDC150 CHCHD3 CRTC3 DENND1B DYNC1I1 ARL2BP BRD4 CCDC3 CHEK2 CSDE1 DENND5B DYRK1A ARMC4 BRD8 CCDC30 CHIC1 CSE1L DEPDC4 DYX1C1 ARNT BRIP1 CCDC47 CHID1 CSMD1 DEPDC7 DZIP1L ARPC3 BRIX1 CCDC58 CHMP3 CSMD2 DET1 E2F3 ARPP21 BROX CCDC66 CHN1 CSNK1G1 DGCR8 E2F6 ARSE BSN CCDC79 CHODL CSNK2A1 DHFR ECI1 ASAH1 BTAF1 CCDC90B 
CHORDC1 CSNK2A2 DHRS7 ECT2 ASAP1 BTBD3 CCDC91 CHRNA5 CSRP2 DHRSX EDC3 ASB8 BTD CCDC93 CHST6 CSTF3 DHX15 EDEM2 ASCC1 BTF3 CCNG1 CIB4 CTAGE5 DHX29 EEA1 ASCC3 BZW1 CCT2 CKAP2 CTB-60B18 DHX33 EEF1A1 ASIC2 BZW2 CCT4 CKAP2L CTC-454I21 DHX8 EFHB ASIP C11orf49 CCT7 CKAP5 CTCF DHX9 EFR3B ASPH C11orf58 CCT8 CLIC2 CTD-2116N DIAPH2 EFTUD1 ASPM C11orf71 CD164 CLIP1 CTD-2135J DIAPH3 EGFL6 ASRGL1 C11orf80 CD59 CLMP CTD-2287O DICER1 EGFR ASTN2 C11orf85 CD80 CLNS1A CTD-2410N1 DIEXF EGLN1 ASUN C12orf29 CDC26 CLPTM1L CTD-2510F DIS3 EHBP1 ATE1 C14orf166 CDC42BPA CLTC CTD-3088G DKK2 EHHADH ATIC C16orf72 CDC42BPB CLVS1 CTIF DLG2 EIF2A ATL2 C17orf67 CDC5L CMBL CTNNA1 DLX6 EIF2B1

EIF3A FAM208A GLUD1 HMGB3 IQCE KIN LYRM7 EIF3B FAM227B GMDS HMGXB3 IQSEC2 KIR2DL4 LYST EIF3E FAM49B GNB1 HMMR IRAK2 KIR3DL1 M1AP EIF3H FAM73A GNB2L1 HNF4A IRAK4 KIR3DL3 MACF1 EIF3M FAM76A GNG7 HNRNPA2B1 IRS1 KITLG MAD1L1 EIF4A2 FAM81B GNL3L HNRNPA3 IST1 KLC1 MAGI2 EIF4B FAM98B GNPNAT1 HNRNPAB ISY1 KLHDC4 MAGOHB EIF4G1 FAN1 GOLGA4 HNRNPC ISY1-RAB43 KLHL7 MALRD1 EIF4G2 FANCD2 GOLIM4 HNRNPCL1 ITCH KMT2A MALT1 EIF4G3 FANCI GOLPH3L HNRNPD ITGA1 KMT2E MAN2A1 EIF5 FARP1 GON4L HNRNPH1 ITGA9 KNTC1 MANBA EIF5B FARSB GPALPP1 HNRNPK ITGB1 KPNA2 MAP4 ELAVL4 FBN2 GPATCH2 HNRNPL ITGB3BP KPNA4 MAP4K4 ELF2 FBXO4 GPATCH8 HNRNPM ITPR2 KRBOX1 MAP7D3 ELMOD2 FBXO5 GPC3 HNRNPR JAK2 KRIT1 MAPK9 ELP3 FCF1 GPC5 HNRNPU JAKMIP2 KRT222 MATR3 EMC3 FCGR2B GPC6 HOMER2 JMJD1C KSR2 MCCC1 ENDOV FGF13 GPN3 HOOK3 JRKL KTN1 MCF2 ENGASE FGF14 GPR149 HRC KANK1 L3MBTL3 MCF2L2 ENO1 FHIT GPX4 HS2ST1 KANSL1L LAMB1 MCM8 ENPP1 FIG4 GRAMD1C HS6ST2 KAT7 LARGE MCPH1 ENPP3 FIP1L1 GRAP2 HS6ST3 KATNA1 LARP7 MDGA2 ENPP6 FKBP3 GRB10 HSP90AA1 KCNG3 LARS MDH2 EPC1 FMN2 GRB14 HSP90B1 KCNH1 LBR MDN1 EPHA6 FMO4 GRHPR HSPA14 KCNJ6 LCORL MED13 EPS8 FNDC3A GRID1 HSPA4 KCNMA1 LDHB MEF2A EPSTI1 FNDC3B GRID2 HSPA4L KCNMB3 LEMD3 MELK ERBB4 FNDC7 GSKIP HSPA8 KCNMB4 LEO1 METAP2 ERC1 FOXN3 GSPT1 HSPA9 KCNQ1 LEPR METTL5 ERC2 FRA10AC1 GSS HSPBP1 KCNQ3 LEPROT MFAP1 ERCC8 FREM2 GTF2E2 HSPD1 KCNQ5 LHFPL3 MGA ERGIC2 FRMD4B GTF2H1 HSPE1-MOB4 KCNU1 LIMCH1 MGARP ESCO1 FRMD5 GTF2IRD2B HSPH1 KCTD16 LINC01119 MGEA5 ETAA1 FRMD6 GTPBP4 HYDIN KCTD19 LINS MIB1 ETS2 FSD2 GUCY2C ICE1 KCTD3 LIPA MICU2 EXOC1 FSTL4 GULP1 ICE2 KDELC2 LMAN1 MID1 EXOC3 FUCA2 GUSB IDE KDM1A LMBR1 MIER1 EXOC6 FXR1 HACD2 IFNAR1 KDM1B LONP2 MINOS1 EXOC6B FYB HADHA IFT57 KDM4C LPCAT1 MINOS1-NBL1 EYA1 G3BP2 HAT1 IGF2R KDM5B LPIN2 MIRLET7BHG EYA2 GAB2 HBG2 IGFL2 KDM6A LRCH1 MITD1 EYA3 GALK2 HDAC2 IGFN1 KDM7A LRP1B MKKS F8 GALNT18 HDGFRP3 IGSF10 KHDRBS2 LRP8 MKLN1 FAAH2 GALNT2 HDLBP IKBKAP KHSRP LRPPRC MMRN2 FAF2 GAREM HECTD1 IKZF2 KIAA0226L LRR1 MNAT1 FAM101A GARNL3 HECTD4 IKZF3 
KIAA0319L LRRC23 MND1 FAM102A GAS2 HECW1 IL10RB KIAA0368 LRRC28 MNS1 FAM134B GBE1 HERC1 IL1RAPL1 KIAA0391 LRRC53 MOB1A FAM13A GCDH HFM1 IL33 KIAA0825 LRRC59 MOCOS FAM13B GCLC HIATL2 IL6ST KIAA1217 LRRC9 MORC3 FAM162A GDI2 HIBADH IL7 KIAA1715 LRRCC1 MORC4 FAM172A GEMIN2 HIF1A ILF2 KIAA1755 LRRN2 MORF4L1 FAM177A1 GEMIN5 HIGD1A INADL KIAA2012 LRRTM4 MORF4L2 FAM177B GGH HIPK3 ING2 KIF11 LSAMP MOXD1 FAM184A GGPS1 HIVEP2 INTS10 KIF13A LSM14A MPHOSPH9 FAM189A2 GINM1 HK1 IPCEF1 KIF20A LTBP1 MROH8 FAM19A2 GLRB HKR1 IPO11 KIF21B LTN1 MRPL1 FAM19A5 GLRX3 HLA-B IPO5 KIF26B LTV1 MRPL15 FAM200B GLS HLCS IQCB1 KIFC1 LUC7L3 MRPL16

MRPL22 NDUFAF6 OVCH1-AS1 PIP5K1B PRDM6 RASA2 RPL3 MRPS22 NDUFC2 OXR1 PKP4 PRDX4 RASGEF1B RPL35A MSH2 NDUFC2-KCTD14 OXSR1 PLA2R1 PRKDC RASSF8 RPL37A MSH3 NDUFS4 PA2G4 PLAA PRLR RB1CC1 RPL39L MSH6 NDUFS6 PABPC1 PLAGL1 PRMT3 RBAK RPL4 MSL3 NEDD4 PABPC4 PLCE1 PROS1 RBBP8 RPL5 MSRA NEDD8 PACSIN2 PLCXD2 PRPF40A RBFA RPL7 MSTO1 NEDD8-MDP1 PAIP1 PLD5 PRPF40B RBFOX1 RPL7A MTCH2 NEDD9 PAIP2 PLEKHA5 PRPF4B RBM12 RPLP2 MTDH NETO2 PAK1 PLLP PRR12 RBM17 RPN1 MTHFD1L NF1 PALLD PLOD2 PRR14L RBM26 RPRD1A MTHFD2 NFIA PAM PLS1 PRRC2B RBM34 RPS14 MTHFD2L NFKB2 PAM16 PLS3 PRRC2C RBM39 RPS15A MTMR2 NFRKB PAPOLA PLXDC2 PSAT1 RBM48 RPS23 MTMR8 NFU1 PARD3 PMS1 PSMA1 RBM5 RPS4X MTPAP NFX1 PARK7 PMS2 PSMA4 RBM6 RPS6KA2 MTRNR2L11 NFYB PARL PNISR PSMA6 RC3H2 RPS6KB1 MTRNR2L12 NIFK PARP1 PNPT1 PSMB5 RCAN3 RPS9 MTRNR2L8 NIPBL PARP11 PODNL1 PSME4 RCC1 RRAGC MTRNR2L9 NKRF PARP15 POLA1 PTCD3 RCN1 RRM2 MTUS1 NKTR PATL1 POLB PTCHD1 RCN2 RRP1 MUC16 NMU PBX4 POLD3 PTGES3 REV3L RRP1B MYBL1 NOA1 PCBD2 POLH PTGFR RFC1 RSF1 MYBPC1 NOC3L PCBP2 POLR1B PTMA RGPD2 RSL24D1 MYH11 NOL11 PCCB POLR2H PTMS RGS22 RSRC2 MYH7 NOL7 PCDH11X POLR3A PTP4A2 RGS3 RSRP1 MYO10 NOL8 PCDH11Y POM121 PTPN12 RGS7 RUNX1 MYO16 NOM1 PCDH15 POMGNT1 PTPN13 RHOBTB3 RUVBL1 MYO1E NOP14 PCGF5 POMP PTPN21 RIF1 RWDD2B MYO1H NOP58 PCIF1 POTEB PTPRA RIMS2 RYK MYO6 NOS1AP PCTP POTEB2 PTPRD RINT1 S100A11 MYO9A NOX1 PCYT1B POTEI PTPRG RIOK2 SAAL1 MYT1 NPAS2 PDCD4 POU2AF1 PTPRM RLF SACS N4BP2 NR2C1 PDE1A PPA1 PTPRN2 RNASEH2B SAE1 N4BP2L2 NRCAM PDE3A PPAN PTPRR RNF103 SAMHD1 NAALADL2 NRXN1 PDE3B PPAN-P2RY11 PTPRT RNF128 SAMM50 NACA2 NSA2 PDE4B PPAP2A PUM1 RNF157 SAP30BP NAE1 NSMCE2 PDE4D PPARA PVR RNF216 SASS6 NAP1L1 NSUN2 PDE6A PPARD R3HCC1L RNF219 SBDS NARS NT5C3B PDHX PPARG R3HDML RNGTT SBF2 NASP NUCB2 PDILT PPARGC1B RAB27A ROBO1 SCAF4 NAV3 NUCKS1 PDK2 PPFIA2 RAB3C ROCK1 SCAI NBEA NUP107 PDS5A PPHLN1 RAB3GAP1 ROCK2 SCAPER NBEAL1 NUP153 PDXDC1 PPIA RAB3GAP2 ROR1 SCFD1 NBN NUP205 PDZRN4 PPIC RAB6A RORA SCFD2 NBPF20 NUP214 PGRMC2 PPIP5K2 RABEP1 
RP11-159D12 SCML1 NCALD NUP37 PHACTR1 PPM1G RABL6 RP11-286N22 SEC11C NCAM2 NUP54 PHF21A PPP1CC RACGAP1 RP11-302B13 SEC23A NCAPG NUP98 PHKA2 PPP1R12B RAD21 RP11-407N17 SEC61G NCEH1 NYAP2 PHLDB2 PPP1R13B RAD50 RP11-545J16 SEMA4B NCK1 ODC1 PHTF1 PPP1R37 RAD51B RP11-96O20 SEMA6B NCKAP5 OOEP PI4KA PPP2R2A RAF1 RP5-972B16 SENP6 NCL OPHN1 PIBF1 PPP2R2B RALY RPA3 SENP7 NCOA5 OR2W3 PIEZO2 PPP2R5A RANBP17 RPL10 SEPHS2 NCOA6 ORC5 PIGN PPP2R5C RANBP2 RPL14 SEPT2 NCOA7 OSBPL1A PIK3C2A PPP2R5E RAP1A RPL19 SERBP1 NCOR1 OSBPL8 PIK3C2B PRAME RARB RPL22 SERINC1 NDUFA8 OTUD5 PIK3R3 PRCP RARS RPL23 SESTD1 NDUFAF5 OTULIN PINX1 PRDM4 RASA1 RPL27 SETBP1

SETD2 SLK SRPK1 TBC1D15 TOP2A TULP2 UTP18 SETD3 SLMAP SRRT TBC1D19 TOP2B TWF1 VAT1L SETD5 SLTM SRSF1 TBC1D31 TOPBP1 TXLNB VDAC3 SETDB2 SMAD6 SRSF10 TBC1D5 TOX2 TXNRD1 VMP1 SF3A3 SMAP2 SSB TBC1D9B TP53TG5 TXNRD3 VPS13A SF3B1 SMARCC1 SSBP3 TBK1 TPK1 TYW1 VPS13B SF3B2 SMC2 SSRP1 TBPL1 TPM1 U2AF1 VPS26B SFPQ SMC4 ST13 TCAIM TPP2 UBA1 VPS33A SFTA3 SMCHD1 ST20-MTHFS TCEA1 TPR UBA2 VPS37A SFXN1 SMG6 ST7L TCERG1 TPRKB UBA52 VPS8 SGCD SMG8 STAM TCP11L1 TPTE2 UBA6 VRK1 SGMS1 SMIM19 STARD9 TCTEX1D1 TPX2 UBE2D1 VRTN SGOL2 SMPDL3A STAT1 TDRKH TRA2B UBE2G1 VWA8 SHC3 SMR3B STEAP1B TEK TRAIP UBE2G2 VWF SHCBP1 SNAPC3 STIP1 TENM4 TRANK1 UBE2I WAPAL SHISA9 SND1 STK10 TEX11 TRAP1 UBE2Q1 WARS2 SHTN1 SNRNP200 STK3 TFDP1 TRAPPC9 UBE3C WASH4P SIN3A SNRNP70 STK32C TFDP2 TRERF1 UBR2 WBP4 SIPA1L1 SNRPD1 STMN1 TFPI2 TRIM33 UBR4 WDFY2 SKAP2 SNRPF STPG2 TG TRIM58 UBR5 WDR19 SKIV2L2 SNTG1 STX16 THADA TRIM61 UBXN11 WDR41 SLC20A2 SNW1 STX8 THOC2 TRIM69 UBXN7 WDR43 SLC25A3 SNX4 SUB1 THUMPD1 TRIO UGGT2 WDR60 SLC25A33 SNX6 SUCLG1 TIAM1 TRIP12 UGT2B7 WDR70 SLC25A51 SOCS4 SUGCT TIAM2 TRIP4 ULK4 WEE1 SLC26A4 SOD2 SUGP2 TIPARP TRMT6 UNC119B WHSC1 SLC26A8 SORBS2 SUZ12 TJP1 TRPC4AP UNC5C WLS SLC30A9 SORCS1 SWT1 TLK2 TRPM3 UNK WNK1 SLC33A1 SOX6 SYCP1 TM9SF2 TRPM6 UPF3B WRN SLC35B1 SOX7 SYMPK TM9SF3 TRRAP UPP2 WWC2 SLC38A2 SPAG6 SYN1 TMED2 TSEN54 UQCRC2 XDH SLC39A10 SPAG9 SYN3 TMEM105 TSHZ2 UQCRH XPNPEP3 SLC39A11 SPATA16 SYNCRIP TMEM116 TSPAN3 URGCP YAE1D1 SLC39A8 SPATA5 SYNE1 TMEM117 TSR1 USH2A YBX1 SLC44A1 SPC25 SYNE2 TMEM132D TTBK2 USP1 YBX3 SLC45A4 SPCS3 TAB2 TMEM220 TTC1 USP25 YES1 SLC4A1AP SPDL1 TAF12 TMEM260 TTC13 USP28 YIPF4 SLC6A3 SPECC1 TAF1A TMEM57 TTC17 USP31 YME1L1 SLC9A7 SPECC1L TAF1B TMIGD3 TTC27 USP32 YWHAQ SLC9A9 SPG11 TAF3 TMTC4 TTC3 USP38 YY1AP1 SLC9B2 SPG21 TALDO1 TNNT1 TTC37 USP39 YY2 SLCO1B3 SPIDR TANC1 TNPO1 TTC39C USP4 ZBTB11 SLCO1B7 SPRED1 TARDBP TNPO3 TTC9 USP45 ZBTB20 SLCO5A1 SPSB4 TARSL2 TNRC18 TTL USP48 ZBTB8A SLF1 SPTLC3 TATDN1 TNRC6B TTLL1 USP7 ZCCHC6 SLF2 SQSTM1 TBC1D1 TOMM20 
TTLL11 USP8 ZCCHC7 SLFN12L SRP72 TBC1D12 TOP1 TTN USP9X
