RNA 3D Motifs: Identification, Clustering, and Analysis

RNA 3D MOTIFS: IDENTIFICATION, CLUSTERING, AND ANALYSIS

ANTON I. PETROV

A Dissertation Submitted to the Graduate College of Bowling Green State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

May 2012

Committee:
Neocles Leontis, Advisor
Craig L. Zirbel, Graduate Faculty Representative
Paul Morris
Scott Rogers
Ray Larsen

ABSTRACT

Neocles B. Leontis, Advisor

Many hairpin and internal RNA 3D motif structures are recurrent, occurring in various types of RNA molecules, not necessarily homologs. Although usually drawn as single-strand “loops” in RNA 2D diagrams, recurrent motifs share a common 3D structure, but can vary in sequence. It is essential to understand the sequence variability of RNA 3D motifs in order to advance RNA 2D and 3D structure prediction and ncRNA discovery methods, to interpret mutations that affect ncRNAs, and to guide experimental functional studies.

The dissertation is organized into two parts as follows. First, the development of a new online resource called RNA 3D Hub is described, which is intended to provide a useful resource for structure modeling and prediction. It houses non-redundant sets of RNA-containing 3D structures, RNA 3D motifs extracted from all RNA 3D structures, and the RNA 3D Motif Atlas, a representative collection of RNA 3D motifs. Unique and stable ids are assigned to all non-redundant equivalence classes of structure files, to all motifs, and to all motif instances. RNA 3D Hub is updated automatically on a regular schedule and is available at http://rna.bgsu.edu/rna3dhub.

In the second part of the dissertation, the development of WebFR3D (http://rna.bgsu.edu/webfr3d), a new webserver for finding and aligning RNA 3D motifs, is described and its use in a biologically relevant context is then illustrated using two RNA 3D motifs. The first motif was predicted in Potato Spindle Tuber Viroid (PSTVd), and the prediction was supported by functional evidence. The second motif had previously been undescribed, although it is found in multiple 3D structures. RNA 3D Hub, RNA 3D Motif Atlas, and the bioinformatic techniques discussed in this dissertation lay the groundwork for further research into RNA 3D motif prediction starting from sequence and provide useful online resources for the scientific community worldwide.

TABLE OF CONTENTS

PART I. RNA 3D HUB AND RNA 3D MOTIF ATLAS

CHAPTER 1. MOTIVATION FOR RNA 3D MOTIF ATLAS
  1.1 Introduction to RNA 3D Motifs
    1.1.1 RNA Base Pair Classification
    1.1.2 RNA 3D Motifs
  1.2 Potential Applications for RNA 3D Motif Atlas
    1.2.1 RNA 3D Structure Prediction
    1.2.2 Searching for RNA 3D Motifs in Sequences
    1.2.3 Experimental Studies of RNA 3D Motifs
  1.3 Criteria for Successful Curation of an RNA 3D Motif Atlas

CHAPTER 2. SEARCHING FOR RNA 3D MOTIFS
  2.1 Overview of Existing Tools for Searching for RNA 3D Motifs
    2.1.1 MC-Search
    2.1.2 NASSAM
    2.1.3 PRIMOS
    2.1.4 FR3D
    2.1.5 Apostolico et al., 2009
    2.1.6 RNAMotifScan
    2.1.7 FRMF
  2.2 Choosing a Motif Search Tool for RNA 3D Motif Atlas
  2.3 Tools for RNA Structural Alignment

CHAPTER 3. AUTOMATIC CURATION OF NON-REDUNDANT LISTS OF RNA-CONTAINING 3D STRUCTURES
  3.1 Sources of Redundancy in Structural Data
  3.2 Existing RNA Non-redundant Lists
  3.3 RNA 3D Motifs and Data Redundancy
  3.4 Non-redundant Lists at RNA 3D Hub
    3.4.1 Versioning and Assigning Unique Ids
  3.5 Future Directions

CHAPTER 4. AUTOMATIC EXTRACTION OF RNA 3D MOTIFS
  4.1 Introduction
  4.2 Overview of Existing RNA 3D Motif Collections
    4.2.1 Loop-oriented Collections
      4.2.1.1 RNAJunction
      4.2.1.2 RNA STRAND
      4.2.1.3 RNA FRABASE 2.0
      4.2.1.4 RLooM
      4.2.1.5 RNA CoSSMos
    4.2.2 Motif-oriented Collections
      4.2.2.1 SCOR
      4.2.2.2 Comparative RNA Web (CRW) Site
      4.2.2.3 K-turn Database
      4.2.2.4 RNAMotifScan
      4.2.2.5 FRMF
    4.2.3 Comparison of Existing RNA 3D Motif Collections
  4.3 Extracting RNA 3D Motifs Using Symbolic FR3D Searches
    4.3.1 Symbolic FR3D Searches for Hairpin Loops
    4.3.2 Symbolic FR3D Searches for Internal Loops
    4.3.3 Symbolic FR3D Searches for Three-way Junction Loops
    4.3.4 Assigning Unique Ids to Loop Instances and Other 3D Fragments
  4.4 Quality Assurance Procedures
    4.4.1 Motivation for Quality Assurance
    4.4.2 Quality Assurance Algorithm
      4.4.2.1 Identifying Potential Gaps
      4.4.2.2 Identifying Self-Complementary Internal Loops
      4.4.2.3 Identifying Loops with Modified Nucleotides
      4.4.2.4 Identifying Loops with Missing Nucleotides
      4.4.2.5 Identifying Loops with Incomplete Nucleotides
      4.4.2.6 Identifying Loops with Abnormal Chain Counts
    4.4.3 Results of Quality Assurance
  4.5 Conclusions

CHAPTER 5. AUTOMATIC CLUSTERING OF RNA 3D MOTIFS
  5.1 Existing Techniques for Automatic Classification of RNA 3D Motifs
    5.1.1 COMPADRES
    5.1.2 Huang et al., 2005
    5.1.3 Wang et al., 2007
    5.1.4 Rna3Dmotif
    5.1.5 RNAMSC
  5.2 Comparison of the Existing RNA 3D Motif Clustering Techniques
  5.3 Implementation of Automatic Motif Classification in RNA 3D Motif Atlas
    5.3.1 Selection of Loop Instances for Clustering
    5.3.2 All-against-all Geometric FR3D Searches
    5.3.3 Quality Assurance of the Search Results
    5.3.4 Matching Matrix and Maximum Cliques
    5.3.5 Assigning