RNA 3D Motifs: Identification, Clustering, and Analysis

ANTON I. PETROV

A Dissertation Submitted to the Graduate College of Bowling Green State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

May 2012

Committee:
Neocles Leontis, Advisor
Craig L. Zirbel, Graduate Faculty Representative
Paul Morris
Scott Rogers
Ray Larsen

ABSTRACT

Neocles B. Leontis, Advisor

Many hairpin and internal loop RNA 3D motifs are recurrent: they occur in various types of RNA molecules that are not necessarily homologous. Although they are usually drawn as single-stranded “loops” in RNA 2D diagrams, recurrent motifs share a common 3D structure but can vary in sequence. It is essential to understand the sequence variability of RNA 3D motifs in order to advance RNA 2D and 3D structure prediction and ncRNA discovery methods, to interpret mutations that affect ncRNAs, and to guide experimental functional studies.

The dissertation is organized into two parts. The first part describes the development of a new online resource called RNA 3D Hub, which is intended to support structure modeling and prediction. It houses non-redundant sets of RNA-containing 3D structures, RNA 3D motifs extracted from all RNA 3D structures, and the RNA 3D Motif Atlas, a representative collection of RNA 3D motifs. Unique and stable ids are assigned to all non-redundant equivalence classes of structure files, to all motifs, and to all motif instances. RNA 3D Hub is updated automatically on a regular schedule and is available at http://rna.bgsu.edu/rna3dhub.

The second part describes the development of WebFR3D (http://rna.bgsu.edu/webfr3d), a new webserver for finding and aligning RNA 3D motifs, and illustrates its use in a biologically relevant context with two RNA 3D motifs. The first motif was predicted in Potato Spindle Tuber Viroid (PSTVd), and the prediction was supported by functional evidence. The second motif had not previously been described, although it is found in multiple 3D structures.

RNA 3D Hub, RNA 3D Motif Atlas, and the bioinformatic techniques discussed in this dissertation lay the groundwork for further research into RNA 3D motif prediction starting from sequence and provide useful online resources for the scientific community worldwide.

TABLE OF CONTENTS

PART I. RNA 3D HUB AND RNA 3D MOTIF ATLAS

CHAPTER 1. MOTIVATION FOR RNA 3D MOTIF ATLAS
  1.1 Introduction to RNA 3D Motifs
    1.1.1 RNA Base Pair Classification
    1.1.2 RNA 3D Motifs
  1.2 Potential Applications for RNA 3D Motif Atlas
    1.2.1 RNA 3D Structure Prediction
    1.2.2 Searching for RNA 3D Motifs in Sequences
    1.2.3 Experimental Studies of RNA 3D Motifs
  1.3 Criteria for Successful Curation of an RNA 3D Motif Atlas

CHAPTER 2. SEARCHING FOR RNA 3D MOTIFS
  2.1 Overview of Existing Tools for Searching for RNA 3D Motifs
    2.1.1 MC-Search
    2.1.2 NASSAM
    2.1.3 PRIMOS
    2.1.4 FR3D
    2.1.5 Apostolico et al., 2009
    2.1.6 RNAMotifScan
    2.1.7 FRMF
  2.2 Choosing a Motif Search Tool for RNA 3D Motif Atlas
  2.3 Tools for RNA Structural Alignment

CHAPTER 3. AUTOMATIC CURATION OF NON-REDUNDANT LISTS OF RNA-CONTAINING 3D STRUCTURES
  3.1 Sources of Redundancy in Structural Data
  3.2 Existing RNA Non-redundant Lists
  3.3 RNA 3D Motifs and Data Redundancy
  3.4 Non-redundant Lists at RNA 3D Hub
    3.4.1 Versioning and Assigning Unique Ids
  3.5 Future Directions

CHAPTER 4. AUTOMATIC EXTRACTION OF RNA 3D MOTIFS
  4.1 Introduction
  4.2 Overview of Existing RNA 3D Motif Collections
    4.2.1 Loop-oriented Collections
      4.2.1.1 RNAJunction
      4.2.1.2 RNA STRAND
      4.2.1.3 RNA FRABASE 2.0
      4.2.1.4 RLooM
      4.2.1.5 RNA CoSSMos
    4.2.2 Motif-oriented Collections
      4.2.2.1 SCOR
      4.2.2.2 Comparative RNA Web (CRW) Site
      4.2.2.3 K-turn Database
      4.2.2.4 RNAMotifScan
      4.2.2.5 FRMF
    4.2.3 Comparison of Existing RNA 3D Motif Collections
  4.3 Extracting RNA 3D Motifs Using Symbolic FR3D Searches
    4.3.1 Symbolic FR3D Searches for Hairpin Loops
    4.3.2 Symbolic FR3D Searches for Internal Loops
    4.3.3 Symbolic FR3D Searches for Three-way Junction Loops
    4.3.4 Assigning Unique Ids to Loop Instances and Other 3D Fragments
  4.4 Quality Assurance Procedures
    4.4.1 Motivation for Quality Assurance
    4.4.2 Quality Assurance Algorithm
      4.4.2.1 Identifying Potential Gaps
      4.4.2.2 Identifying Self-Complementary Internal Loops
      4.4.2.3 Identifying Loops with Modified Nucleotides
      4.4.2.4 Identifying Loops with Missing Nucleotides
      4.4.2.5 Identifying Loops with Incomplete Nucleotides
      4.4.2.6 Identifying Loops with Abnormal Chain Counts
    4.4.3 Results of Quality Assurance
  4.5 Conclusions

CHAPTER 5. AUTOMATIC CLUSTERING OF RNA 3D MOTIFS
  5.1 Existing Techniques for Automatic Classification of RNA 3D Motifs
    5.1.1 COMPADRES
    5.1.2 Huang et al., 2005
    5.1.3 Wang et al., 2007
    5.1.4 Rna3Dmotif
    5.1.5 RNAMSC
  5.2 Comparison of the Existing RNA 3D Motif Clustering Techniques
  5.3 Implementation of Automatic Motif Classification in RNA 3D Motif Atlas
    5.3.1 Selection of Loop Instances for Clustering
    5.3.2 All-against-all Geometric FR3D Searches
    5.3.3 Quality Assurance of the Search Results
    5.3.4 Matching Matrix and Maximum Cliques
    5.3.5 Assigning
