U1 snRNP Telescripting Roles in Transcription and Its Mechanism
CHAO DI,1,2 BYUNG RAN SO,1,2 ZHIQIANG CAI,1 CHIE ARAI,1 JINGQI DUAN,1 AND GIDEON DREYFUSS1

1Howard Hughes Medical Institute, Department of Biochemistry and Biophysics, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania 19104-6148, USA

Correspondence: [email protected]

Telescripting is a fundamental cotranscriptional gene regulation process that relies on U1 snRNP (U1) to suppress premature 3′-end cleavage and polyadenylation (PCPA) in RNA polymerase II (Pol II) transcripts, which is necessary for full-length transcription of thousands of protein-coding (pre-mRNA) and long noncoding RNA (lncRNA) genes. Like U1's role in splicing, telescripting requires U1 snRNA base-pairing with nascent transcripts. Inhibition of U1 base-pairing by a U1 snRNA antisense morpholino oligonucleotide (U1 AMO) mimics widespread PCPA from cryptic polyadenylation signals (PASs) in human tissues, including PCPA in introns and last exons' 3′-untranslated regions (3′ UTRs). Changes in the U1 telescripting–PCPA balance generate diverse RNAs depending on where in a gene PCPA occurs. Compared to short genes, long genes are highly U1-telescripting-dependent because of PASs in their introns. Enrichment of cell cycle control, differentiation, and developmental functions in long genes, compared to housekeeping and acute cell stress response genes in short genes, reveals a gene size–function relationship in mammalian genomes. This polarization increased in metazoan evolution by previously unexplained intron expansion, suggesting that U1 telescripting could shift global gene expression priorities. We show that modulating U1 availability can profoundly alter cell phenotype, such as cancer cell migration and invasion, underscoring the critical role of U1 homeostasis and suggesting it as a potential target for therapies.
We describe a complex of U1 with cleavage and polyadenylation factors that silences PASs in introns and 3′ UTRs, which gives insights into the U1 telescripting mechanism and transcription elongation regulation.

2These authors contributed equally to this work.

© 2019 Di et al. This article is distributed under the terms of the Creative Commons Attribution-NonCommercial License, which permits reuse and redistribution, except for commercial purposes, provided that the original author and source are credited.
Published by Cold Spring Harbor Laboratory Press; doi:10.1101/sqb.2019.84.040451
Cold Spring Harbor Symposia on Quantitative Biology, Volume LXXXIV

Telescripting is an activity of U1 snRNP (U1) that suppresses premature termination of RNA polymerase II (Pol II) transcription at 3′-end processing signals, generally cleavage and polyadenylation (CPA) signals (PASs) in introns. Premature CPA (PCPA) (Kaida et al. 2010) was uncovered from transcriptome profiling of human cells transfected with antisense morpholino oligonucleotides (AMOs) that masked the U1 snRNA 5′ sequence, whose base-pairing with 5′ splice sites (5′ss) is necessary for the first step in splicing (Lerner et al. 1980; Mount et al. 1983; Padgett et al. 1983). These experiments showed PCPA in thousands of nascent transcripts of protein-coding genes (pre-mRNAs) and long noncoding RNAs (lncRNAs), indicating that U1 is necessary to suppress PCPA and thereby promote full-length transcription (hence the term telescripting). Telescripting is an additional U1 function, separate from its role in splicing. Here, we review the emergence of U1 telescripting as a key regulator of gene expression, some of its biological roles, current understanding of its mechanism, and outstanding questions for future research.

U1 snRNP comprises a single noncoding small nuclear RNA (164 nt in human) and 10 proteins (U1-specific U1-70K, U1A, U1C, and seven Sm proteins common to spliceosomal snRNPs). RNA sequencing (RNA-seq) from cells transfected with U1 snRNA antisense morpholino oligonucleotide (U1 AMO) titrated to mask all or nearly all of the U1 snRNA 5′ sequence ("high U1 AMO"), compared to control nonspecific AMO (cAMO), showed unfamiliar and striking "Z patterns" in thousands of genes, consisting of RNAs extending several kilobases (typically ∼1–3 kb and up to tens of kilobases) from the transcription start site (TSS) that end abruptly in an intron (Fig. 1). At their ends these RNAs had 3′-poly(A)s specified by canonical PAS hexamers, AAUAAA and variants thereof, indistinguishable from classical PASs at the ends of full-length pre-mRNAs and lncRNAs (Proudfoot and Brownlee 1976; Tian and Manley 2017). The widespread PCPA elicited by high U1 AMO revealed that U1 is a PCPA suppressor. Importantly, detection of 3′-poly(A) tags at U1 AMO-induced PCPA positions in normal tissues in human and other organisms (Derti et al. 2012; Oh et al. 2017) demonstrated that PCPA and U1 telescripting are physiological processes, and that U1 AMO is a useful tool to study them.

As expected, U1 AMO also caused widespread splicing inhibition, evident in intron retention in many introns that are not PCPAed. However, U1 telescripting appears to be additional to and separable from U1's role in splicing. As PCPA frequently occurs in introns well before their 3′ss is transcribed, PCPA is not secondary to splicing inhibition (of the same intron), and U1 telescripting is U1-specific and may not require other spliceosomal snRNPs (U2, U4, U5, U6, or the minor snRNPs U11, U12, U4atac, U6atac) (Kaida et al. 2010). Although telescripting and splicing both require U1 base-pairing to the nascent transcript and U1 bound at 5′ss can function in both, telescripting can also be supplied by U1 base-paired to sequences that cannot function as 5′ss, giving U1 telescripting more potential sites (Berg et al. 2012).

Figure 1. Premature 3′-end cleavage and polyadenylation (PCPA) terminates Pol II elongation cotranscriptionally in gene bodies. Genome browser (version hg19) views of representative PCPAed genes (e.g., RAB7A and E2F3) are shown with nascent RNA-seq and Pol II ChIP-seq (Oh et al. 2017) reads aligned to the human genome. HeLa cells were pulse-labeled for 5 min with 5-ethynyluridine (EU) at 8 h after transfection with a 25-mer control AMO or U1 AMO to the U1 snRNA 5′ sequence. Pol II near transcription start sites (TSSs) and Pol II near termination zones (TZs) are indicated as boxes. 3′-poly(A) sites detected in various human tissues (Derti et al. 2012) are shown as bars and indicate natural PCPAs. Tissue differences suggest a role for cell-specific factors in PCPA/telescripting or metabolism of PCPAed RNAs. Gene structure lines show introns (horizontal lines) and exons (vertical bars).

Pol II chromatin immunoprecipitation-sequencing (ChIP-seq) maps complemented the nascent RNA maps (Fig. 1; Gilmour and Lis 1984; Oh et al. 2017), indicating that U1 telescripting acts cotranscriptionally, as opposed to post-transcriptionally (e.g., by reprocessing from full-length transcripts). As with CPA at normal 3′ ends (the ultimate PASs), the Pol II ChIP-seq signal tapered off within a few kilobases of PCPA locations, forming a gradual post-3′-end transcription termination zone (TZ) (Fong et al. 2015) consistent with the torpedo termination model (Connelly and Manley 1988; Proudfoot 2016). Pol II occupancy downstream from TZs through the rest of the gene was sharply reduced or eliminated. Notably, U1 AMO did not inhibit transcription initiation (and promoter-proximal paused Pol II release into elongation); rather, transcription continued to stream into genes and frequently increased up to PCPA points, evident from high Pol II occupancy from the TSSs to PCPA points in U1 AMO compared to control. Thus, full-length Pol II transcription elongation in most genes in vertebrates depends on averting PCPA termination at PAS checkpoints, which U1 telescripting provides.

BIOLOGICAL ROLES OF U1 TELESCRIPTING

Technically, PAS selection in genes that have more than one PAS can be classified as alternative polyadenylation (APA). Although APA involving choice among tandem PASs in terminal exons' 3′-untranslated regions (3′ UTRs) and a few instances of APA in gene bodies had been known previously (Wang et al. 2008; Hartmann and Valcárcel 2009; Di Giammartino et al. 2011), the PCPA-U1 telescripting phenomenon as a major transcription control was unexpected.

Transcriptome Control and Proteome Diversification

RNA-seq and PCPA maps developed from Z-pattern-detecting algorithms and nongenomic 3′-poly(A) reads at various U1 AMO doses helped identify key features of telescripting and its biological roles. PCPA in a gene can be complete, effectively shutting down full-length transcription, or partial (in some genes even at high U1 AMO), affecting a fraction of a gene's transcripts (e.g., RAB7A and E2F3; Fig. 1), and can occur at multiple PASs in the same gene. Any PCPA is at the expense of full-length transcription; however, it can potentially generate diversity of RNAs and proteins. Figure 2 illustrates various outcomes of PCPA at different locations that U1 telescripting can regulate. The relative abundance of the different isoforms can be manipulated with U1 AMO dose, which correspondingly decreases available U1: higher U1 AMO increases usage of TSS-proximal PASs, consistent with a cotranscriptional process whereby the first PAS encountered by Pol II that is unprotected by U1 is PCPAed. Pervasive PCPA at multiple cryptic PASs in introns counters the textbook view that PASs function primarily in the 3′ UTR.

PCPA in the 1–3 kb from the first 5′ss (which is <0.5 kb from the TSS) in intron 1 makes polyadenylated RNAs that are generally rapidly degraded by nuclear exosomes

[Figure 2 schematic; recoverable labels: pre-mRNA, Pol II, splicing, PAS, intron, exon, Start, Stop, uaRNA/PROMPT, CPA, exosome, m7G cap, poly(A) tail (AAAAn), mRNAs, 3′ UTR]

Figure 2.