RNA Structure Maps Across Mammalian Cellular Compartments
RESOURCE
https://doi.org/10.1038/s41594-019-0200-7

Lei Sun1,2,3,4,10, Furqan M. Fazal5,6,7,10, Pan Li1,2,3,4,10, James P. Broughton5,6,7, Byron Lee5,6,7, Lei Tang1,2,3,4, Wenze Huang1,2,3,4, Eric T. Kool8, Howard Y. Chang5,6,7,9,11* and Qiangfeng Cliff Zhang1,2,3,4,11*

1MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing 100084, China.
2Beijing Advanced Innovation Center for Structural Biology, Beijing Frontier Research Center for Biological Structures, Tsinghua University, Beijing 100084, China.
3Center for Synthetic and Systems Biology, Tsinghua University, Beijing 100084, China.
4Tsinghua-Peking Joint Center for Life Sciences, Tsinghua University, Beijing 100084, China.
5Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA, USA.
6Department of Dermatology, Stanford University, Stanford, CA, USA.
7Department of Genetics, Stanford University, Stanford, CA, USA.
8Department of Chemistry, Stanford University, Stanford, CA, USA.
9Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA.
10These authors contributed equally: Lei Sun, Furqan M. Fazal, Pan Li.
11These authors jointly supervised this work: Howard Y. Chang, Qiangfeng Cliff Zhang.
*e-mail: [email protected]; [email protected]

Nature Structural & Molecular Biology | VOL 26 | APRIL 2019 | 322–330 | www.nature.com/nsmb

RNA structure is intimately connected to each step of gene expression. Recent advances have enabled transcriptome-wide maps of RNA secondary structure, called ‘RNA structuromes’. However, previous whole-cell analyses lacked the resolution to unravel the landscape and also the regulatory mechanisms of RNA structural changes across subcellular compartments. Here we reveal the RNA structuromes in three compartments, chromatin, nucleoplasm and cytoplasm, in human and mouse cells. The cytotopic structuromes substantially expand RNA structural information and enable detailed investigation of the central role of RNA structure in linking transcription, translation and RNA decay. We develop a resource with which to visualize the interplay of RNA–protein interactions, RNA modifications and RNA structure and predict both direct and indirect reader proteins of RNA modifications. We also validate a novel role for the RNA-binding protein LIN28A as an N6-methyladenosine modification ‘anti-reader’. Our results highlight the dynamic nature of RNA structures and their functional importance in gene regulation.

RNAs fold into complex structures that are crucial for their function and regulation, including transcription, processing, localization, translation and decay1–6. Over the past few decades RNA structure has been studied extensively in vitro and in silico, and crystallography and cryo-EM structures of molecular machines such as the spliceosome and ribosome, containing RNAs at their core, have become available7,8. In recent years technologies have been developed to map RNA secondary structures for the whole transcriptome, that is, RNA structuromes, by combining biochemical probing with deep sequencing9–16. These systems biology studies have revealed many novel insights into the RNA structure basis of gene regulation17–20. However, so far, existing genome-wide structure probing studies have focused on whole-cell data, which only represent an ensemble average of RNA molecules in different subcellular compartments.

In fact, RNA undergoes a complex life cycle in eukaryotic cells, mirrored by its movement into distinct cytotopic locales4,21. RNA structure is thought to form co-transcriptionally on the chromatin template, undergo conformational changes resulting from RNA chemical modification and processing in the nucleus, and experience further changes in the cytoplasm during translation and RNA decay. Averaging the RNA structure signal in the entire cell may obscure these critical features. More importantly, detailed mapping of RNA structures in vivo will help to elucidate how they are regulated, which is essential to understanding the RNA structure basis for the regulation of gene expression.

The RNA-binding proteins (RBPs) are an important driving force that regulates the landscape of RNA structural changes in post-transcriptional regulation. A study in Arabidopsis revealed that RNA secondary structure is anti-correlated with protein-binding density22. We recently used icSHAPE to probe RNA structuromes in mouse embryonic stem cells (ESCs) and examined the in vivo and in vitro structure profiles of RBFOX2, a splicing factor of the feminizing locus on X (Fox) protein family, and HuR, an RBP that regulates transcript stability12. We implemented a machine-learning algorithm and found that using structure signals significantly improved the prediction of RNA-binding sites of both RBPs, suggesting that RNA structure signature analysis is a powerful tool to investigate RNA–RBP interactions. However, in spite of these recent advances in our understanding of the association between RNA structure and RBP binding, a compendium of the RNA structural basis of RBP binding is not available.

In addition to RBP binding, the modification and editing of RNAs are also important mechanisms for the regulation of RNA structure. RNA modification can regulate almost all RNA processes including RNA maturation, nuclear retention and export, translation and decay, as well as cell differentiation and reprogramming23,24. As one of the most abundant and important types of messenger RNA modification, N6-methyladenosine (m6A) has been shown to favor the unwinding of duplex RNAs by conformational switching12,25,26. The impact of the structure-destabilizing effect of m6A is exemplified by a study that investigated heterogeneous nuclear ribonucleoprotein C (HNRNPC), a splicing factor that preferentially binds to single-stranded polyU tracts27. Biochemical studies showed that m6A modification can disrupt the local RNA structures and promote HNRNPC binding in nearby regions28. The study defined these m6A sites as ‘m6A-switches’, and identified tens of thousands of m6A-switches enriched in the vicinity of HNRNPC-binding sites, which alter HNRNPC binding and splicing of the target mRNAs. However, whether RNA structural context is a general mechanism for the recognition of other ‘reader’ proteins of m6A and other RNA modifications is still unclear29.

Fig. 1 | Chromatin fractions are enriched for pre-mRNA and lncRNA structures. a, Experimental overview of the icSHAPE protocol. The dashed box highlights the chemical structure of NAI-N3 and its covalent bond with the 2′-OH group of RNA, which allows probing of RNA structures inside living cells. mESCs, mouse ESCs. b, Donut charts showing read distributions of different RNA types in the three cellular compartments. The outer circles represent exon coverage while the inner circles represent intron coverage. MiscRNA is miscellaneous RNA from GENCODE (https://www.gencodegenes.org/). c, GAS5 RNA secondary structure with icSHAPE reactivity scores shown in color. The nucleotides outlined in red interact with GR amino acids, shown in blue. GR, glucocorticoid receptor. d, UCSC tracks showing icSHAPE reactivity scores (y-axis) along the RNA sequence. The scale relates to the degree of structure: 1 denotes unstructured (single-stranded) regions and 0 denotes fully structured regions. e, Violin plot of the Gini index of icSHAPE data in exons versus introns. The thick black bar in the center of the violin plot represents the interquartile range, the thin black line extending from it represents the 95% confidence intervals, and the white dot is the median. The numbers of sliding windows (width = 20 nt) on the respective regions from left to right are n = 18,930, n = 5,926, n = 51,409, n = 82,648.

Here we use in vivo click-selective 2′-hydroxyl acylation and profiling experiments (icSHAPE)12, a technique we developed to map RNA structure in vivo, in three compartments (chromatin, nucleoplasm and cytoplasm) in both mouse and human cells. Consequently, we were able to determine the precise relationship between RNA structure and cellular processes including transcription, translation and RNA decay in the compartment in which they occur. Separately, we could quantify how RNA

cells allowed us to examine whether the structural patterns we observed are conserved across the two species and cell types. We determined RNA structure, as previously described12,32, after enriching for mRNAs and long noncoding RNAs (lncRNAs) by ribosome depletion, and sequencing the resulting icSHAPE libraries at high depth (approximately 200 million reads per replicate, Supplementary Table 1). We first confirmed the quality of fractionation using quantitative PCR with reverse transcription (RT–qPCR) for landmark RNAs, and immunoblots for specific