Posttranscriptional Regulation of Human Endogenous Retroviruses by RNA-Binding Motif Protein 4, RBM4
Total Page:16
File Type:pdf, Size:1020Kb
Posttranscriptional regulation of human endogenous retroviruses by RNA-binding motif protein 4, RBM4 Amir K. Foroushania, Bryan Chima,1, Madeline Wonga,1, Andre Rastegara,1, Patrick T. Smitha, Saifeng Wanga, Kent Barbianb, Craig Martensb, Markus Hafnerc,2, and Stefan A. Muljoa,2 aLaboratory of Immune System Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892; bResearch Technologies Branch, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Hamilton, MT 59840; and cLaboratory of Muscle Stem Cells and Gene Regulation, National Institute of Arthritis and Musculoskeletal and Skin Diseases, National Institutes of Health, Bethesda, MD 20892 Edited by Akiko Iwasaki, Yale University, New Haven, CT, and approved August 17, 2020 (received for review April 2, 2020) The human genome encodes for over 1,500 RNA-binding proteins threat to genome and transcriptome integrity (19). Recently, (RBPs), which coordinate regulatory events on RNA transcripts. Most epigenetic silencing by TRIM28 and HDAC1 to suppress ERV studies of RBPs have concentrated on their action on host protein- expression at the transcriptional level was identified as a crucial encoding mRNAs, which constitute a minority of the transcriptome. mechanism to control HERVs (20–23). Interestingly, the RNA- A widely neglected subset of our transcriptome derives from inte- binding proteins (RBPs) such as TARDBP/TDP-43 and Spen grated retroviral elements, termed endogenous retroviruses (ERVs), were found to bind and modulate ERV transcripts in flies, mice, that comprise ∼8% of the human genome. Some ERVs have been and humans (24–27), suggesting that RBPs may play an under- shown to be transcribed under physiological and pathological con- appreciated role in their regulation. The human genome encodes ditions, suggesting that sophisticated regulatory mechanisms to co- around 1,500 RBPs that bind and regulate localization, stability, ordinate and prevent their ectopic expression exist. However, it is adenylation, and splicing patterns of coding and noncoding RNAs unknown how broadly RBPs and ERV transcripts directly interact to in a sequence- or structure-dependent manner (28). A recent study provide a posttranscriptional layer of regulation. Here, we imple- showed that many RBPs bind and repress evolutionarily young mented a computational pipeline to determine the correlation of ex- mobile repetitive elements, including intronic long interspersed pression between individual RBPs and ERVs from single-cell or bulk nuclear elements (LINEs) to prevent them from interfering with RNA-sequencing data. One of our top candidates for an RBP nega- cellular gene expression (29, 30). By analogy, RBPs seem like at- tively regulating ERV expression was RNA-binding motif protein 4 tractive candidates for broad regulation of HERV transcripts (31). (RBM4). We used photoactivatable ribonucleoside-enhanced cross- Until recently, systematic studies of expression patterns of linking and immunoprecipitation to demonstrate that RBM4 indeed HERVs have been hampered by the lack of adequate bioinformatic bound ERV transcripts at CGG consensus elements. Loss of RBM4 tools. In 2015, the Hammell group released TEToolkit (32), which resulted in an elevated transcript level of bound ERVs of the HERV- enables family-wise quantification of HERVs and other classes of K and -H families, as well as increased expression of HERV-K envelope transposable elements (TEs) from RNA-sequencing (RNA-seq) protein. We pinpointed RBM4 regulation of HERV-K to a CGG- data. In 2018, the Iwasaki and Vincent groups released ERVmap containing element that is conserved in the LTRs of HERV-K-10, -K- and hervQuant, respectively (33, 34), which enable high-resolution 11, and -K-20, and validated the functionality of this site using re- porter assays. In summary, we systematically identified RBPs that Significance may regulate ERV function and demonstrate a role for RBM4 in controlling ERV expression. The expression of endogenous retroviruses (ERVs) appears to have broad impact on human biology. Nevertheless, only a RNA-binding proteins | PAR-CLIP | RNA sequencing | posttranscriptional handful of transcriptional or posttranscriptional regulators of regulation | endogenous retroviruses ERV expression are known. We implemented a computational pipeline that allowed us to identify RNA-binding proteins uman endogenous retroviruses (HERVs) are the remnants of (RBPs) that modulate ERV expression levels. Experimental val- Hseveral ancient retroviral germline infections that have ac- idation of one of the prime candidates we identified, RNA- cumulated over the past 60 million y (1). Today, they account for binding motif protein 4 (RBM4), showed that it indeed bound ∼8% of our genome (2). Consequently, as a class, ERVs are RNAs made from ERVs and negatively regulated the levels of perpetual members of our virome, the compendium of viruses that those RNAs. We hereby identify a layer of ERV regulation by exist in or on an organism (3). As life-long residents of all human RBPs. We suspect that this work on RBM4 is only the beginning cells, these sequences of viral origin likely continue to have a in recognizing a broader set among the >1,500 human RBPs considerable impact on our biology, yet they are not fully under- that act on transcripts derived from endogenous and exogenous stood. Many of the integrated viral DNA sequences contain viruses. transcription-factor (TF) binding sites and have been exapted as genomic coordinators of a wide range of important processes, Author contributions: A.K.F., B.C., M.W., A.R., P.T.S., S.W., C.M., M.H., and S.A.M. de- including the expression of IFN-γ–responsive transcripts (4). At signed research; A.K.F., M.W., A.R., P.T.S., S.W., K.B., and C.M. performed research; A.K.F. and A.R. contributed new reagents/analytic tools; A.K.F., B.C., M.W., A.R., C.M., the transcriptomic level, HERVs seem largely silent in healthy M.H., and S.A.M. analyzed data; and A.K.F., B.C., M.W., A.R., M.H., and S.A.M. wrote adult cells and tissues. However, they are highly active in embry- the paper. onic cells at the earliest stages of human development, and in The authors declare no competing interest. induced pluripotent stem cells (iPSCs), where they seem indis- This article is a PNAS Direct Submission. pensable but need to be tightly regulated for successful differen- Published under the PNAS license. – tiation (5 12). HERV expression is also elevated in a wide range 1B.C., M.W., and A.R. contributed equally to this work. of diseases, among others systemic lupus erythematosus (SLE) 2To whom correspondence may be addressed. Email: [email protected] or stefan. (13, 14), various cancers (15–17), and multiple sclerosis (18). [email protected]. Despite their strong association with fundamental biological This article contains supporting information online at https://www.pnas.org/lookup/suppl/ processes, little is known about if and how these integral parts of doi:10.1073/pnas.2005237117/-/DCSupplemental. our genome and “virome” are regulated to avoid a potential First published October 5, 2020. 26520–26530 | PNAS | October 20, 2020 | vol. 117 | no. 42 www.pnas.org/cgi/doi/10.1073/pnas.2005237117 Downloaded by guest on September 29, 2021 mapping of sequencing reads to a curated set of ∼3,200 HERV Analysis of our previous (39) and two new rounds of RNA-seq proviral loci. These pioneering resources enabled us to develop a data (SI Appendix,TableS1) comparing RBM4-deficient cells hybrid computational pipeline to test the hypothesis that RBPs act versus WT controls supported our hypothesis that RBM4 silences as regulators of HERV expression. Running our pipeline on two HERVs. In the absence of RBM4, a range of HERV transcripts different single-cell and one bulk cell population RNA-seq datasets were differentially expressed in the HAP1 model system (Fig. 2A from the public domain yielded lists of RBPs with correlated or and Dataset S3). Although, expression of some HERVs were anticorrelated expression to that of HERVs, which represent pos- decreased in RBM4-deficient cells, they were either of lower sible positive regulators or suppressors of HERV function. Among statistical confidence or of much lower baseline abundance. As the top-candidates for HERV-repressing RBPs, we identified one example, in the absence of RBM4, reads mapping to ERV- RNA-binding protein motif 4 (RBM4). Follow-up experiments in map ID# 4452, an HERV-K13 (HML4) family member, was the human HAP1 cell line demonstrated that RBM4 directly bound dramatically decreased in abundance. (Fig. 2B). However, its proviral transcripts of members of the HERV-K and HERV-H baseline expression is around 10-fold lower level than that of the families and that loss of RBM4 resulted in an increased abundance HERV-K (HML2) members (K-10/1.q22, 6272/22.q11.23, K-9/ of these HERV transcripts, as well as increased expression of its 6.q14.1, K-21/1.q23.3, K-11/5.q33.3, 1045/3.q12.3, K-7/8.p23.1) envelope (env) protein. In summary, we nominate RBPs that might that were all differentially up-regulated (Dataset S3). Interest- act as posttranscriptional suppressors of ERVs and identify a role ingly, there are also HERVs that are expressed but not signifi- for RBM4 in regulating HERV-K and -H proviral transcripts. cantly affected by RBM4 including, but not limited to, K-14 (Fig. 2C). K-14 is interesting for a reason to be discussed later. Results Overall, more HERV transcripts were elevated in the absence of Besides TARDBP and Spen (24–27), no RBP has been dem- RBM4 and thus we focused on them. By cross-referencing the onstrated to act directly on HERV transcripts. We hypothesized significantly changed HERVs to the TEToolkit’s (32) repeat- that RBPs represent prime candidates to regulate HERVs masker annotations, we realized that most HERVs negatively reg- posttranscriptionally.