Identification of the RNA Recognition Element of the RBPMS Family of RNA-Binding Proteins and Their Transcriptome-Wide Mrna Targets
Total Page:16
File Type:pdf, Size:1020Kb
Downloaded from rnajournal.cshlp.org on September 29, 2021 - Published by Cold Spring Harbor Laboratory Press Identification of the RNA recognition element of the RBPMS family of RNA-binding proteins and their transcriptome-wide mRNA targets THALIA A. FARAZI,1,5 CARL S. LEONHARDT,1,5 NEELANJAN MUKHERJEE,2 ALEKSANDRA MIHAILOVIC,1 SONG LI,3 KLAAS E.A. MAX,1 CINDY MEYER,1 MASASHI YAMAJI,1 PAVOL CEKAN,1 NICHOLAS C. JACOBS,2 STEFANIE GERSTBERGER,1 CLAUDIA BOGNANNI,1 ERIK LARSSON,4 UWE OHLER,2 and THOMAS TUSCHL1,6 1Laboratory of RNA Molecular Biology, Howard Hughes Medical Institute, The Rockefeller University, New York, New York 10065, USA 2Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, 13125 Berlin, Germany 3Biology Department, Duke University, Durham, North Carolina 27708, USA 4Institute of Biomedicine, The Sahlgrenska Academy, University of Gothenburg, Gothenburg, SE-405 30, Sweden ABSTRACT Recent studies implicated the RNA-binding protein with multiple splicing (RBPMS) family of proteins in oocyte, retinal ganglion cell, heart, and gastrointestinal smooth muscle development. These RNA-binding proteins contain a single RNA recognition motif (RRM), and their targets and molecular function have not yet been identified. We defined transcriptome-wide RNA targets using photoactivatable-ribonucleoside-enhanced crosslinking and immunoprecipitation (PAR-CLIP) in HEK293 cells, revealing exonic mature and intronic pre-mRNA binding sites, in agreement with the nuclear and cytoplasmic localization of the proteins. Computational and biochemical approaches defined the RNA recognition element (RRE) as a tandem CAC trinucleotide motif separated by a variable spacer region. Similar to other mRNA-binding proteins, RBPMS family of proteins relocalized to cytoplasmic stress granules under oxidative stress conditions suggestive of a support function for mRNA localization in large and/or multinucleated cells where it is preferentially expressed. Keywords: RNA-binding protein; PAR-CLIP; RNA recognition element; RBPMS INTRODUCTION tured, characterized by a high density of prolines, without ho- mology to other proteins. The RBPMS RRM domain is most RNA-binding protein with multiple splicing (RBPMS) be- similar to those of ELAVL/Hu proteins involved in mRNA longs to a family of RNA-binding proteins (RBPs) containing stability and splicing (Lebedeva et al. 2011; Mukherjee et al. a single RNA recognition motif (RRM), corresponding to a ′′ 2011), and the snRNA-binding proteins SNRPB2/U2B protein domain of ∼80 amino acids (aa). In RBPMS this and SNRPA/U1A (Oubridge et al. 1994). domain is flanked by 23 aa N-terminal and 95 aa C-terminal The RRM domain is the most common RNA-binding regions. The RBPMS gene spans over 230 kb in the human domain, present in at least 224 RBPs (Ascano et al. 2012a). genome and is expressed as multiple mRNA isoforms (four RRM domains are structurally diverse and bind to a multitude RefSeq and at least 19 transcript variants according to of sequence and structural motifs, such as the base and AceView) resulting in multiple protein isoforms (Shimamoto loop residues in stem–loop structures (Maris et al. 2005). et al. 1996). All Refseq protein isoforms are identical in the RRMs are present in proteins that regulate a variety of RNA majority of their coding sequence (1–170 aa). RBPMS2 is processes, including pre-mRNA splicing, RNA transport, predominantly expressed as a single isoform (one Refseq localization, translation, and stability. The RBPMS family transcript and at least seven spliced variants according to is conserved in vertebrates (e.g., 72% identity of H. sapiens AceView). RBPMS paralogs share 67% amino acid identity RBPMS isoform A vs. D. rerio RBPMS2) (Fig. 1A), and other (RBPMS2 vs. RBPMS isoform A), varying mostly in their vertebrates also contain at least two RBPMS family members. N- and C-termini. The C terminus of RBPMS is unstruc- © 2014 Farazi et al. This article is distributed exclusively by the RNA Soci- 5These authors contributed equally to this work. ety for the first 12 months after the full-issue publication date (see http:// 6Corresponding author rnajournal.cshlp.org/site/misc/terms.xhtml). After 12 months, it is available E-mail [email protected] under a Creative Commons License (Attribution-NonCommercial 4.0 Article published online ahead of print. Article and publication date are at International), as described at http://creativecommons.org/licenses/by- http://www.rnajournal.org/cgi/doi/10.1261/rna.045005.114. nc/4.0/. RNA 20:1–13; Published by Cold Spring Harbor Laboratory Press for the RNA Society 1 Downloaded from rnajournal.cshlp.org on September 29, 2021 - Published by Cold Spring Harbor Laboratory Press Farazi et al. RNP 2 RNP 1 A RBPMS H. sapiens ---------------------MNNGGKAEKEN--------TPSEANLQEEEVRTLFVSGLPLDIKPRELYLLFRPFKGYEGSLIKLTSKQPVGFVSFDSRSEAEAAKNALNGIRFDPEIP QTLRLEFAKANTKMAKNKLVG 112 RBPMS M. musculus ---------------------MNGGGKAEKEN--------TPSEANLQEEEVRTLFVSGLPLDIKPRELYLLFRPFKGYEGSLIKLTSKQPVGFVSFDSRSEAEAAKNALNGIRFDPEIP QTLRLEFAKANTKMAKNKLVG 112 RBPMS 2a D. rerio ---------------------MS--VKSDSEP----------NNNVSLEEEVRTLFVSGLPTDIKPRELYLLFRPFKGYEGSLIKLTSKQPVGFVTFDSRSGAEAAKNALNGVRFDPENP QTLRLEFAKANTKMAKSKLMG 108 RBPMS 2b D. rerio PEPKNOWNPROTEINCODINGMS--LKSDSET----------NTSVSLEEEVRTLFVSGLPVDIKPRELYLLFRPFKGYEGSLIKLTSKQPVGFVTFDSRSGAEAAKNALNGIRFDPESP QTLRLEFAKANTKMAKSKLMA 129 RBPMS 2 H. sapiens ---------------------MSN-LKPDGEHGGSTGTGSGAGSGGALEEEVRTLFVSGLPVDIKPRELYLLFRPFKGYEGSLIKLTARQPVGFVIFDSRAGAEAAKNALNGIRFDPENP QTLRLEFAKANTKMAKSKLMA 119 RBPMS 2 M. musculus ---------------------MSN-LKPDVEH------CTGAGTGSPLEEEVRTLFVSGLPVDIKPRELYLLFRPFKGYEGSLIKLTSRQPVGFVIFDSRAGAEAAKNALNGIRFDPENP QTLRLEFAKANTKMAKSKLIA 113 RBPMS D. rerio ---------------------MNN--TSEEEP---------KTPEEEEEEQVRTLFVSGLPMDIKPRELYLLFRPFQGYEGSLIKFTSKQPVGFVSFSSRSEAEAAKNRLNGVRFDPDVP QTLRLEFAKANTKMSR----R 99 *. ..: * **:********** **************:********:*::****** *.**: ****** ***:****: * **************:: RNA recognition motif RBPMS H. sapiens TPNPSTPLPNTVPQFIAREPYELTVPALYPSSPEVWAPY PLYPAELAPALPPP-AFTYP-------AS--LHAQMRWLPP-SEATSQGWKSRQFC 196 RBPMS M. musculus TPNPSTPLPNTVPQFIAREPYELTVPALYPSSPEVWAPY PLYPAELAPALPPPAAFTYP-------AS--LHAQMRWIPP-SEATSQGWKSRQFC 197 RBPMS 2a D. rerio TPNPTNIHPALGAHFIARDPYDLTGAALIPASPEAWSPY PLYTPELSPGLPHT-AFTYPAAAAAAAAA--LHAQMRWYPSTSDSSQPGWKSRQFC 200 RBPMS 2b D. rerio TPNPSNLHPALGAHFIARDPYDLTGAALIPASPEAWAPY PLYTTELTPGLPHA-AFTYP-AAAAAAAA--LHAQMRWYPSPSESSQPGWKSRQFC 220 RBPMS 2 H. sapiens TPNPSNVHPALGAHFIARDPYDLMGAALIPASPEAWAPY PLYTTELTPAISHA-AFTYPTATAAAAA---LHAQVRWYPS-SDTTQQGWKYRQFC 209 RBPMS 2 M. musculus TPNPTSVHPALGAHLIARDPYDLMGTALIPASPEAWAPY PLYTTELTPAISHT-TFTYPAATAAAAAAATLHAQVRWYPS-SDTTQQGWKYRQFC 206 RBPMS D. rerio NSNP-AARLSAGAACVSREPYELSVPALYPSSPDVWPAF PLYPADLTP------AYAYS-------SA--LHTQIRWLPT-ADTNHQSWKSRQFC 183 ..** . ::*:**:* .** *:**:.*..: ***..:*:* :::*. : **:*:** *. :::. .** **** B FH-RBPMS Hoechst Alexa 488 anti-HA Phalloidin 546 Nuc Cyt Tot (DNA) (HA) (Actin) LMNB1 TUBB RBPMS ELAVL1 HA RBPMS2 FH-RBPMS2 Nuc Cyt Tot LMNB1 TUBB HEK293 ELAVL1 HA 100% Library A CD16,198 E 90% 630 80% 857 9,901 70% all other 1065 repeat 60% 6,207 coding RBPMS binding 50 kDa 50% no annotation sites 1291 intron 40% 3' UTR 37 kDa 6,365 30% 20% 25 kDa 2170 Library B % Sequence reads included in clusters 12,572 10% IB: HA Number of clusters with >0.5 0% conversion specificity RBPMS binding sites FIGURE 1. RBPMS overview. (A) RBPMS is conserved in vertebrates. Alignments were generated using ClustalW (NP_956553.1, NP_001002409.1, NP_082306.2, NP_919248.1, NP_001008710.1, NP_001036139, XP_003199078.1). (B) RBPMS and RBPMS2 localize to the nucleus and cytoplasm. Western Blotting analysis of nucleo-cytoplasmic fractions prepared from FLAG-HA RBPMS and RBPMS2 cell lines, as indicated. Nuclear (Nuc), cy- toplasmic (Cyt), and total cell lysate (Tot) fractions were resolved on a 4%–12% SDS–polyacrylamide gel and then probed using an anti-HA antibody targeting FLAG-HA RBPMS and RBPMS2, and antibodies targeting ELAVL1 (HuR), TUBB (β-tubulin), and LMNB1 (lamin B) as controls for the purity of the cytoplasmic and nuclear fractions. (Right) Immunofluorescence staining of HA epitope in HEK293 cells stably expressing FLAG-HA RBPMS and FLAG-HA RBPMS2, using Hoechst and Phalloidin 546 as controls. (C) Phosphorimages of SDS-PAGE fractionating PAR-CLIP immu- noprecipitate from constitutive and inducible overexpressing FLAG-HA-tagged RBPMS HEK293 cells. The cross-linked RNA–RBPMS complexes are indicated for two biological replicate experiments. Anti-HA Western-blotting control for expression and loading is shown at the bottom. Two protein bands were recognized by anti-HA antibody, one at the expected size based on recombinant full-length RBPMS and the other shorter, suggesting proteolytic cleavage (bands confirmed as RBPMS by mass spectrometry). (D) Overlap of PAR-CLIP clusters with ≥0.5 T-to-C conversion specificity between libraries A and B defines 6207 RBPMS-binding sites. (E) Genomic distribution of RBPMS-binding sites. 2 RNA, Vol. 20, No. 7 Downloaded from rnajournal.cshlp.org on September 29, 2021 - Published by Cold Spring Harbor Laboratory Press RBPMS RNA recognition and targets Dysregulation of RBPMS family proteins has been re- we generated an atlas of RBPMS family expression using ported in cancer (Skawran et al. 2008; Miller and Stama- microarray data from 156 mouse samples from six reports toyannopoulos 2010; Drozdov et al. 2012; Hapkova et al. submitted to the Gene Expression Omnibus (GEO).