Fast and Reliable Prediction of Noncoding RNAs

Stefan Washietl*, Ivo L. Hofacker*, and Peter F. Stadler*†‡

*Department of Theoretical Chemistry and Structural Biology, University of Vienna, Währingerstrasse 17, A-1090 Wien, Austria; and †Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany

Communicated by Hans Frauenfelder, Los Alamos National Laboratory, Los Alamos, NM, December 14, 2004 (received for review November 2, 2004)

We report an efficient method for detecting functional RNAs. The approach, which combines comparative sequence analysis and structure prediction, already has yielded excellent results for a small number of aligned sequences and is suitable for large-scale genomic screens. It consists of two basic components: (i) a measure for RNA secondary structure conservation based on computing a consensus secondary structure, and (ii) a measure for thermodynamic stability, which, in the spirit of a z score, is normalized with respect to both sequence length and base composition but can be calculated without sampling from shuffled sequences. Functional RNA secondary structures can be identified in multiple sequence alignments with high sensitivity and high specificity. We demonstrate that this approach is not only much more accurate than previous methods but also significantly faster. The method is implemented in the program RNAZ, which can be downloaded from www.tbi.univie.ac.at/~wash/RNAz. We screened all alignments of length n > 50 in the Comparative Regulatory Genomics database, which compiles conserved noncoding elements in upstream regions of orthologous genes from human, mouse, rat, Fugu, and zebrafish. We recovered all of the known noncoding RNAs and cis-acting elements with high significance and found compelling evidence for many other conserved RNA secondary structures not described so far to our knowledge.

comparative genomics | conserved RNA secondary structure

Abbreviations: ncRNA, noncoding RNA; snoRNA, small nucleolar RNA; SRP, signal-recognition particle; MFE, minimum free energy; SCI, structure conservation index; SVM, support vector machine; CNB, conserved noncoding block; CORG, Comparative Regulatory Genomics.

See Commentary on page 2269.

‡To whom correspondence should be addressed. E-mail: [email protected].

© 2005 by The National Academy of Sciences of the USA

2454–2459 | PNAS | February 15, 2005 | vol. 102 | no. 7
www.pnas.org/cgi/doi/10.1073/pnas.0409169102

Traditionally, the role of RNA in the cell was considered mostly in the context of protein gene expression, limiting RNA to its function as mRNA, tRNA, and rRNA. The discovery of a diverse array of transcripts that are not translated to proteins but rather function as RNAs has changed this view profoundly (1–3). Noncoding RNAs (ncRNAs) are involved in a large variety of processes, including gene regulation (4), maturation of mRNAs, rRNAs, and tRNAs, or X-chromosome inactivation in mammals (5). In fact, a large fraction of the mouse transcriptome consists of ncRNAs (6), and about half of the transcripts from human chromosomes 21 and 22 are noncoding (7, 8). Structured RNA motifs furthermore function as cis-acting regulatory elements within protein-coding genes. Also in this context, new intriguing mechanisms are being discovered (9).

Hence, a comprehensive understanding of cellular processes is impossible without considering RNAs as key players. Efficient identification of functional RNAs (ncRNAs as well as cis-acting elements) in genomic sequences is, therefore, one of the major goals of current bioinformatics. Notwithstanding its utmost biological relevance, de novo prediction is still a largely unsolved issue. Unlike protein-coding genes, functional RNAs lack in their primary sequence common statistical signals that could be exploited for reliable detection algorithms. Many functional RNAs, however, depend on a defined secondary structure. In particular, evolutionary conservation of secondary structures serves as compelling evidence for biologically relevant RNA function. Comparative studies therefore seem to be the most promising approach. To date, complete genomic sequences of related species have been sequenced for almost all genetic model organisms as, for example, bacteria (10, 11), yeasts (12), nematodes (13, 14), and even mammals (15–17). Several studies (18–21) have identified a large collection of evolutionary conserved noncoding elements in mammalian (or, more generally, vertebrate) genomes, and it must be expected that a significant fraction of them are functional RNAs.

Possible candidates, however, have been identified only sporadically so far (19, 21), simply because there are no reliable tools to scan multiple sequence alignments for functional RNAs. The most widely used program QRNA (22), which has been successfully used to identify ncRNAs in bacteria (23) and yeast (24), is not suitable for screens of large genomes. QRNA is limited to pairwise alignments, and its reliability is low, especially if the evolutionary distance of the two sequences lies outside of the optimal range. An alternative approach, DDBRNA (25), suffers from similar problems and so far has not been used in a real-life application. MSARI (26), on the other hand, gains its drastically enhanced accuracy from the large amount of information contained in large multiple sequence alignments of 10–15 sequences with high sequence diversity. At present, however, data sets of this kind are not available at a genomewide scale, at least for multicellular organisms.

In this article we address the problem by using an alternative approach: we combine a measure for thermodynamic stability with a measure for structure conservation. Using a combination of both scores we are able to efficiently detect functional RNAs in multiple sequence alignments of only a few sequences. Our method is substantially more accurate than QRNA or DDBRNA and performs better on pairwise alignments than MSARI does on alignments with 15 sequences. On the large, diverse alignments used for testing MSARI in ref. 26, our RNAZ program achieved 100% sensitivity at 100% specificity.

Methods

Minimum Free Energy (MFE) RNA Folding. For MFE RNA folding we used the C libraries of the Vienna RNA package version 1.5 (27). We used RNAFOLD for folding single sequences and RNAALIFOLD (28) for consensus folding of aligned sequences. The same folding parameters were used for both algorithms to ensure that the obtained MFE values were comparable. For the covariation part of RNAALIFOLD we used default parameters. Gaps were removed for single sequence folding.

Calculation of z Scores Using Support Vector Machine (SVM) Regression. To calculate z scores by regression analysis we used the following procedure: we generated synthetic sequences of different length and base composition. The length of the test sequences ranged from 50 to 400 nt in steps of 50. To quantify base composition, we used the GC/AT, A/T, and G/C ratios of the sequences and chose values for all ratios ranging from 0.25 to 0.75 in steps of 0.05. This process resulted in 10,648 points in a four-dimensional space of the independent variables. For each of the points we calculated the mean and standard deviation of the MFE of 1,000 random sequences, representing the dependent variables in our regression.

We used the SVM library LIBSVM (www.csie.ntu.edu.tw/~cjlin/libsvm) to train two regression models for mean and standard deviation. Input data for the SVM were scaled to mean of 0 and standard deviation of 1. We chose the ν variant of regression and a radial basis function kernel. We optimized the parameters and found ν = 0.5, γ = 1, and C = 5 to yield the best results. Finally, we obtained two models for the mean and standard deviation we used for z-score calculation.

The traditional sampling of z scores depends on the randomization of the native sequence by shuffling the positions. In this context it was pointed out by Workman and Krogh (30) that a correct randomization procedure should conserve dinucleotide content because of the energy contributions of stacked base pairs in the energy model. In principle, the regression model could be extended to use dinucleotide frequencies. The good results with the simple model, however, allow us to neglect this effect.

Results

The SCI. In a recent article (34) we demonstrated that the program RNAALIFOLD [which originally was developed for prediction of secondary structure in aligned sequences (28)] also can be used for detection of evolutionarily conserved secondary structure. RNAALIFOLD implements a consensus folding algorithm generalizing the standard dynamic programming algorithms for RNA secondary structure prediction algorithms (35) by adding sequence covariation terms to the folding energy model (36, 37). More precisely, a consensus MFE is computed for an alignment that is composed of an energy term averaging the energy contributions of the single sequences and a covariance term rewarding compensatory and consistent mutations (28). As this consensus MFE is difficult to interpret in absolute terms, we previously used a time-consuming random sampling method to assess its significance (34). This approach would require massive computational effort even for small-sized genomes and it does not seem practicable for large genomes as, for example, the human genome. A much more efficient normalization
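
The sampling grid and z-score normalization described in Methods can be illustrated with a short Python sketch. It is not the RNAz implementation: the names `frange`, `grid`, and `z_score` are hypothetical, and the mean and standard deviation are passed in as plain numbers, whereas in the procedure above they would come from the two trained regression models. The z score uses the standard definition z = (m − μ)/σ, which is assumed here because the excerpt does not spell out the formula.

```python
from itertools import product


def frange(start, stop, step):
    """Inclusive list of evenly spaced floats, rounded to two decimals."""
    n = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 2) for i in range(n)]


# Sequence lengths: 50 to 400 nt in steps of 50 (8 values).
lengths = list(range(50, 401, 50))

# Each base-composition ratio: 0.25 to 0.75 in steps of 0.05 (11 values).
ratios = frange(0.25, 0.75, 0.05)

# One grid point per (length, GC/AT, A/T, G/C) combination.
grid = list(product(lengths, ratios, ratios, ratios))


def z_score(mfe, mean, std):
    """Standard z score of an MFE against a background distribution.

    mean and std stand in for the values a regression model would predict
    for the sequence's length and base composition (an assumption of this
    sketch). A negative result means the sequence folds more stably than
    expected for random sequences of that composition.
    """
    if std <= 0:
        raise ValueError("standard deviation must be positive")
    return (mfe - mean) / std


print(len(grid))  # 8 * 11**3 = 10648
```

The grid size reproduces the 10,648 points quoted in the text. Replacing per-sequence shuffling with a lookup of mean and standard deviation is what removes the sampling step from the z-score calculation.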