Published online 20 February 2016 Nucleic Acids Research, 2016, Vol. 44, No. 5 2283–2297 doi: 10.1093/nar/gkw088

αCP binding to a cytosine-rich subset of polypyrimidine tracts drives a novel pathway of cassette exon splicing in the mammalian transcriptome

Xinjun Ji1,*,†, Juw Won Park2,3,4,†, Emad Bahrami-Samani2, Lan Lin2, Christopher Duncan-Lewis1, Gordon Pherribo1, Yi Xing2,* and Stephen A. Liebhaber1,5,*

1Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA, 2Department of Microbiology, Immunology, and Molecular Genetics, University of California, Los Angeles, Los Angeles, CA 90095, USA, 3Department of Computer Engineering and Computer Science, University of Louisville, Louisville, KY 40292, USA, 4KBRIN Bioinformatics Core, University of Louisville, Louisville, KY 40202, USA and 5Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA

Received September 01, 2015; Revised January 27, 2016; Accepted February 03, 2016

ABSTRACT

Alternative splicing (AS) is a robust generator of mammalian transcriptome complexity. Splice site specification is controlled by interactions of cis-acting determinants on a transcript with specific RNA binding proteins. These interactions are frequently localized to the intronic U-rich polypyrimidine tracts (PPT) located 5′ to the majority of splice acceptor junctions. αCPs (also referred to as polyC-binding proteins (PCBPs) and hnRNPEs) comprise a subset of KH-domain proteins with high affinity and specificity for C-rich polypyrimidine motifs. Here, we demonstrate that αCPs promote the splicing of a defined subset of cassette exons via binding to a C-rich subset of polypyrimidine tracts located 5′ to the αCP-enhanced exonic segments. This enhancement of splice acceptor activity is linked to interactions of αCPs with the U2 snRNP complex and may be mediated by cooperative interactions with the canonical polypyrimidine tract binding protein, U2AF65. Analysis of αCP-targeted exons predicts a substantial impact on fundamental cell functions. These findings lead us to conclude that the αCPs play a direct and global role in modulating the splicing activity and inclusion of an array of cassette exons, thus driving a novel pathway of splice site regulation within the mammalian transcriptome.

*To whom correspondence should be addressed. Tel: +1 215 898 7834; Fax: +1 215 573 5892; Email: [email protected]
Correspondence may also be addressed to Xinjun Ji. Tel: +1 215 898 4250; Fax: +1 215 573 5157; Email: [email protected]
Correspondence may also be addressed to Yi Xing. Tel: +1 310 825 6806; Fax: +1 310 206 3663; Email: [email protected]
†These authors contributed equally to this work as the first authors.

© The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

INTRODUCTION

RNA splicing is a highly dynamic process that involves as many as 200 protein factors interacting with target sites on a PolII transcript. While many of these RNA-protein (RNP) interactions have been described in detail, the broad array of alternative splicing (AS) events detected in mammalian cells (1,2) and the large number of RNA binding proteins (RBPs) encoded by the mammalian genome suggest that numerous additional determinants of splicing controls remain to be identified and characterized (3–7).

Many of the relevant RBPs (at least 141) compose core components of the mammalian spliceosome complexes (3,8–10) that interact with splice donor and splice acceptor regions and catalyze intron excision and exon ligation (8). The 'strength' of a splice acceptor site is impacted by the assembly of RNP complexes at a polypyrimidine tract (PPT) located immediately 5′ of the AG splicing acceptor site. This PPT characteristically consists of a loosely defined U-rich sequence with interspersed C residues (8,11–14). The splicing factor U2AF65 binds directly to this U-rich PPT and recruits its heterodimeric partner U2AF35 and the U2 snRNP complex to the splicing branch site with the initiation of the first of two trans-esterification reactions (8).

Control of splice acceptor activity can be mediated by altering the efficiency and/or productivity of the interactions between U2AF65 and the PPT (8). For example, the RNA binding proteins PTB and hnRNPC have been proposed to repress splice acceptor utilization by blocking the binding and/or activity of U2AF65 at the PPT (11,14,15). The mechanistic details of repression by these two factors appear to be distinct and remain to be fully defined (11,14,15). Recent studies indicate that RBFOX2 protein regulates binding of the early intron recognition factors U2AF and the U1 small nuclear ribonucleoprotein complex (snRNP) (16). Additional sets of RNA binding proteins further contribute to activating or repressing controls over AS. Many of these proteins may be widely expressed while others may mediate regulation of splicing programs in specific cell types (e.g. ESRP family and Nova family (17–21)).

The αCP proteins (also referred to as polyC-binding proteins (PCBPs) and hnRNPEs) comprise a family of RBPs encoded at four dispersed loci (22). These widely expressed proteins contain three highly conserved RNA binding KH domains and shuttle between the nucleus and the cytoplasm (23). The two most highly expressed of these proteins, αCP1 and αCP2, are present in all metazoan organisms studied to date. αCPs have been demonstrated to regulate mRNA stability (24–26) and translation (27,28) in the cytoplasm and exert a global impact on 3′ processing of transcripts in the nucleus (29,30). Several studies have reported an additional impact of αCPs on splicing of specific transcripts encoding CD44 (31–33), Tau (34), CD45 (35) and human (h) α-globin (36,37).

In the current study, we define a global impact of αCPs on alternative splicing. The data reveal that the αCPs selectively enhance the splicing (inclusion) of a defined subset of cassette exons in the human transcriptome. These cassette exons targeted by αCPs are characterized by a cytosine-rich subset of PPTs adjacent to their splice acceptor sites. The demonstrated interactions of αCPs with U2 spliceosome-associated proteins and with U2AF65 are likely to support their observed splicing functions. These findings lead us to conclude that αCP interactions with C-rich PPTs play a significant role in the AS of a subset of cassette exons.

MATERIALS AND METHODS

Cell culture and siRNA/shRNA transfection

K562 cells were cultured in RPMI 1640 medium, supplemented with 10% fetal bovine serum (FBS) (HyClone) and antibiotic/antimycotic at 37°C in a 5% CO2 incubator. K562 cells were transfected with αCP siRNAs or shRNAs using Nucleofector V (Amaxa) as previously described (30). The two αCP1/2 co-depletion siRNAs were previously reported (30). siRNAs to U2AF65 (HSS117616, HSS117617, HSS117618); siRNAs to αCP1 (HSS107632, HSS107633, HSS107634); siRNAs to αCP2 (HSS143239, HSS143241, HSS181787) and negative control siRNA (medium GC content) were purchased from Invitrogen.

struction. The libraries corresponded to the three control siRNA depletions and three αCP1/2 co-depletions. These six samples comprised four samples used and validated in our previous analysis of 3′ processing (Figure 1, Ji, 2013 MCB (30)) and two additional siRNA-treated K562 samples, one control siRNA (CTRL-3: non-targeting, Dharmacon, Inc) and one αCP1/2 siRNA (αCP1/2–3, same as αCP1/2–1 in (30)) (Figure 1B). These libraries were multiplexed and sequenced on a single channel on the HiSeq-2000 platform using a 100 nt paired-end sequencing protocol by NGSC at University of Pennsylvania (https://ngsc.med.upenn.edu/).

Alternative splicing analysis

RNA-seq reads were mapped to the human transcriptome (Ensembl, release 65) and genome (hg19) using the software TopHat (38) (v1.4.1) allowing up to 3 bp mismatches per read and up to 2 bp mismatches per 25 bp seed. αCP-regulated differential AS events corresponding to five major types of AS patterns were identified by rMATS (v3.0.7) (39,40) (http://rnaseq-mats.sourceforge.net/) (Table 1). For each AS event, both the reads mapped to the exon–exon junction and the reads mapped to the exon body were used as rMATS input. Putative αCP-regulated AS events were identified as those with a significant difference in inclusion levels (|ΔPSI| ≥ 5%) between knockdown and control at an FDR < 5%.

Motif enrichment analysis

Motifs that were significantly enriched in differential exon skipping events between the αCP-depleted and control samples were identified by comparison to background (non-regulated) alternative exons. A total of 4527 alternative exons without splicing changes (rMATS FDR > 50%) in highly expressed genes (FPKM > 5.0 in at least one sample group) were treated as a background exon set. RNA-seq based gene expression levels (FPKM) were calculated by Cuffdiff (v2.2.0) (41). A MEME analysis (v4.9.0) was carried out to detect enriched motifs in a 500 bp window centered on the splice acceptor and splice donor sites (42). To examine the enrichment of identified motifs in the vicinity of the αCP-regulated exons, the 500 bp windows were analyzed as 50 bp bins with a step size of 1 bp, and the occurrence of each motif within each bin was assessed to determine its positional distribution.
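The selection rule stated under "Alternative splicing analysis" (|ΔPSI| ≥ 5% at FDR < 5%) can be illustrated with a minimal Python sketch. This is not rMATS itself and not the authors' code. The `ASEvent` record, its field names, the toy values, and the sign convention (knockdown minus control) are all assumptions made for illustration; PSI values are expressed as fractions between 0 and 1.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class ASEvent:
    """Hypothetical record for one alternative splicing event."""
    event_id: str
    psi_control: List[float]    # per-replicate inclusion levels, control siRNA
    psi_knockdown: List[float]  # per-replicate inclusion levels, aCP1/2 co-depletion
    fdr: float                  # FDR as reported by the differential splicing tool


def delta_psi(event: ASEvent) -> float:
    # Assumed sign convention: mean knockdown PSI minus mean control PSI.
    return mean(event.psi_knockdown) - mean(event.psi_control)


def is_putative_target(event: ASEvent,
                       min_abs_dpsi: float = 0.05,
                       max_fdr: float = 0.05) -> bool:
    # Keep events with |dPSI| >= 5% and FDR < 5%, as stated in the text.
    return abs(delta_psi(event)) >= min_abs_dpsi and event.fdr < max_fdr


# Toy events with made-up values (not data from the study).
events = [
    ASEvent("exon_A", [0.80, 0.82, 0.78], [0.60, 0.62, 0.58], fdr=0.001),
    ASEvent("exon_B", [0.50, 0.52, 0.48], [0.52, 0.54, 0.50], fdr=0.001),
    ASEvent("exon_C", [0.30, 0.32, 0.28], [0.60, 0.62, 0.58], fdr=0.20),
]

selected = [e.event_id for e in events if is_putative_target(e)]
print(selected)
```

In this toy set only `exon_A` passes both thresholds: `exon_B` changes by too little, and `exon_C` changes strongly but does not meet the FDR cutoff. A negative ΔPSI under the assumed sign convention corresponds to reduced exon inclusion after αCP depletion.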
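The positional scan described under "Motif enrichment analysis" (500 bp windows analyzed as 50 bp bins with a step size of 1 bp) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline: the motif is treated as an exact string rather than a MEME position weight matrix, the function name `sliding_bin_counts` is hypothetical, and the sequence is a toy example.

```python
from typing import List


def count_overlapping(seq: str, motif: str) -> int:
    """Count occurrences of motif in seq, allowing overlaps."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def sliding_bin_counts(window_seq: str, motif: str,
                       bin_size: int = 50, step: int = 1) -> List[int]:
    """Motif count in each bin of a window; a 500 bp window yields 451 bins."""
    counts = []
    for start in range(0, len(window_seq) - bin_size + 1, step):
        counts.append(count_overlapping(window_seq[start:start + bin_size], motif))
    return counts


# Toy 500 bp window: a T-rich tract with one C-rich patch at positions 240-246.
window = "T" * 240 + "CCCTCCC" + "T" * 253
counts = sliding_bin_counts(window, "CCC")

print(len(counts))   # number of bins
print(max(counts))   # highest motif count in any single bin
```

Averaging such per-bin counts over all regulated exons and over the background exon set, bin by bin, would give two positional profiles that can be compared. That aggregation step is left out here because the text does not specify how it was done.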
The αCP(1/2)-2 siRNA (30), which targets the 3′ UTR of native αCP1/2 mRNA, was converted to shRNA based on pGFP-V-RS parental vector (Origene Technologies, Inc.).

Ingenuity pathway analysis

The enrichment analyses of biological functions were gen-