CoFold: an RNA Structure Prediction Method That Takes Co-Transcriptional Folding Into Account

by
JEFFREY RYAN PROCTOR
BSEng, The University of Victoria, 2010

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE
in
THE FACULTY OF GRADUATE STUDIES
(Bioinformatics)

THE UNIVERSITY OF BRITISH COLUMBIA
(Vancouver)
September, 2012
© Jeffrey Ryan Proctor, 2012

Abstract

RNA has a diverse array of cellular functions, and relies on molecular structure to carry them out. The vast majority of current methods for prediction of RNA secondary structure (i.e. the set of base pairs in the molecule) consider the minimum free energy structure (i.e. the most thermodynamically stable structure), and thus disregard the process of structural formation. There exists substantial evidence that the process of structure formation is important, and that it does impact the resulting functional RNA structure. Several methods currently exist that explicitly simulate the kinetic folding pathway as a time-ordered sequence of structural changes. However, these methods not only suffer from a long list of limiting assumptions about the cellular environment, but also are restricted to short sequences. In this thesis, we explore the idea of capturing the effects of kinetic folding rather than simulating in detail the process over time, and propose that accounting for kinetic effects of structural formation is crucial to further improve non-comparative RNA secondary structure prediction methods.

During transcription, RNA structure begins to form immediately as the molecule emerges from the polymerase (i.e. co-transcriptionally). Long-range base pairs suffer a disadvantage during this process, as quickly forming short-range base pairs act to block their formation (i.e. due to kinetic barriers).
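To make the distinction between short-range and long-range base pairs concrete, the following minimal sketch parses a secondary structure written in the standard dot-bracket notation into its set of base pairs and computes the sequence distance j - i spanned by each pair. It is an illustration only and is not part of CoFold; the function names `parse_dot_bracket` and `pair_spans` and the toy structure are hypothetical and were chosen purely for this example.

```python
def parse_dot_bracket(structure):
    """Return the sorted list of base pairs (i, j), 0-based with i < j.

    Assumes a pseudoknot-free structure that uses only '(', ')' and '.'.
    """
    stack = []   # positions of '(' still waiting for a partner
    pairs = []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def pair_spans(pairs):
    """Sequence distance j - i spanned by each base pair."""
    return [j - i for i, j in pairs]


# Hypothetical toy structure: a short hairpin enclosed by one longer-range pair.
example = "(..((...))..)"
pairs = parse_dot_bracket(example)
spans = pair_spans(pairs)
print(list(zip(pairs, spans)))
```

In this toy structure the outermost pair spans the largest sequence distance; pairs of this kind are the ones the abstract describes as disadvantaged during co-transcriptional folding.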
We propose a novel method, CoFold, that captures the reachability of potential pairing partners during co-transcriptional folding. We show that it significantly improves prediction accuracy over free energy minimization alone, particularly for long sequences. CoFold depends on only two free parameters that are highly correlated, and we demonstrate robust training. Furthermore, the resulting structures predicted by CoFold have a free energy measurement that does not significantly differ from that of the respective RNAfold prediction, indicating that they are indeed stable secondary structures. We propose that consideration of kinetics in RNA secondary structure prediction is crucial, and we hope that this work encourages further exploration of its effect on biological RNA structure.

Contents

Abstract
Contents
List of Figures
List of Tables
Acknowledgements
1 Introduction
  1.1 RNA secondary structure
    1.1.1 Secondary vs. tertiary structure
    1.1.2 Components of RNA structure
  1.2 Experimental determination of RNA structure
    1.2.1 X-ray crystallography
    1.2.2 Nuclear magnetic resonance spectroscopy
    1.2.3 Chemical and enzymatic probing
  1.3 Computational non-comparative RNA Secondary Structure Prediction
    1.3.1 Nussinov algorithm
    1.3.2 Zuker-Stiegler algorithm
    1.3.3 Prediction of pseudoknotted structures
    1.3.4 Suboptimal folding
    1.3.5 Equilibrium partition function
    1.3.6 Sfold
    1.3.7 Prediction guided by chemical probing data
    1.3.8 CONTRAfold
  1.4 Computational comparative RNA Secondary Structure Prediction
    1.4.1 Covariance models
    1.4.2 Pfold
    1.4.3 RNA-Decoder
    1.4.4 RNAalifold
    1.4.5 Alignment-free methods
  1.5 Thermodynamics alone does not tell the whole story
    1.5.1 Importance of kinetics
    1.5.2 Co-transcriptional Folding
    1.5.3 Kinetic folding methods
  1.6 Goals of this thesis
2 CoFold
  2.1 Motivation
  2.2 CoFold Algorithm
  2.3 Implementation
  2.4 Compilation of the long and combined data sets
    2.4.1 The long data set
    2.4.2 The combined data set
  2.5 Parameter training
    2.5.1 Training procedure
    2.5.2 Clade-specific parameter training
  2.6 CoFold Performance
  2.7 Calculation of free energy differences
  2.8 Case studies
  2.9 CoFold web server
3 Discussion
4 Future Work
References
Appendices
  Definition of covariation metric
  Data set statistics
  Detailed free energy difference distribution

List of Figures

1 RNA secondary structure components
2 Diagram of an RNA pseudoknot
3 Example of base pair covariation
4 Scaling function γ of CoFold
5 Visualization of the indices involved in CoFold recursion formulas
6 Heatmap of performance measurements for all (α, τ) combinations
7 Difference in predictive accuracy between CoFold and RNAfold (TPR vs PPV)
8 Difference in predictive accuracy between CoFold and RNAfold (TPR vs FPR)
9 Distribution of relative free energy difference between predicted structures and the respective RNAfold-predicted minimum free energy structures in the long data set
10 Distribution of absolute free energy difference (in kcal/mol) between predicted structures and the respective RNAfold-predicted minimum free energy structures in the long data set
11 RNA secondary structures predicted by RNAfold and CoFold-A for the 23S rRNA of the gamma-proteobacteria Pseudomonas aeruginosa
12 RNA secondary structures predicted by RNAfold and CoFold-A for the 16S rRNA of the fresh-water algae Cryptomonas sp. (species unknown)
A1 Distributions of relative free energy difference between predicted structures and the respective minimum free energy structures predicted by RNAfold
A2 Differences in prediction accuracy versus relative free energy changes between the predicted structures and the MFE structures predicted by RNAfold

List of Tables

1 Definition of method names used throughout the text
2 Composition of the long and combined datasets
3 Performance metrics for CoFold and RNAfold
4 Summary of relative free energy difference between predicted structures and the respective minimum free energy structures predicted by RNAfold
5 Details of the linear fits to the ∆ MCC versus % ∆∆G distributions
A1 RNA families of the long and the combined data sets
A2 Alignment quality and phylogenetic support for the reference RNA secondary structures

Acknowledgements

I would like to thank my supervisor, Irmtraud Meyer, as well as my committee members, Joerg Gsponer and Anne Condon. I would also like to thank all the members of the Meyer group, particularly Daniel Lai for the invaluable R code he has written as part of the R4RNA package, and Adi Steif for immensely helpful advice regarding statistics. My sources of funding include the MSFHR/CIHR Bioinformatics Training Program, and the Natural Sciences and Engineering Research Council (NSERC) Canada Graduate Scholarship.
1 Introduction

1.1 RNA secondary structure

Ribonucleic acids (RNA) perform a wide variety of essential and well-defined roles in the cell. Whether protein-coding mRNAs or the myriad non-coding RNAs, RNA molecules exert their function by assuming structural features.

Transfer RNA (tRNA), ribosomal RNA (rRNA), and messenger RNA (mRNA) act in concert to carry out faithful translation of the genetic code. The structure of transfer RNA is vital not only for interaction with codons within messenger RNA, but also for robust recognition by aminoacyl-tRNA synthetases, the enzymes that attach the appropriate amino acid to each tRNA [107]. The complex structure of ribosomal RNAs acts not only as structural scaffolding of the ribosome; interestingly, rRNA is in fact responsible for its catalytic peptidyl transferase activity [124].

The untranslated regions (UTRs) at the ends of messenger RNAs are repositories of RNA structural features responsible for regulation of the respective protein product. Riboswitches are often found in the UTRs of metabolic genes, where they modulate expression by actively binding to small metabolites [89, 115, 119]. Conserved hairpins in the UTRs of genes involved in iron metabolism are essential for regulation of cellular iron levels in mammals [47]. MicroRNA response elements, commonly found in UTRs of protein-coding genes, are short sequences with near-perfect complementarity to microRNAs, short RNA molecules which cause gene repression upon binding [105].

In the 1980s, it was discovered that RNA, in addition to proteins, can play a catalytic role. For instance, group I and II introns contain RNA structure that catalyzes its own excision from the transcript [10, 14]. Ribonuclease P (RNase P) is a catalytic RNA that cleaves tRNA precursors during their maturation pathway [32]. Viral genomes have been found to depend highly on RNA structural features, ostensibly due to the