A CASP-Like Evaluation of RNA Three-Dimensional Structure Prediction
Total Page:16
File Type:pdf, Size:1020Kb
Downloaded from rnajournal.cshlp.org on September 28, 2021 - Published by Cold Spring Harbor Laboratory Press BIOINFORMATICS RNA-Puzzles: A CASP-like evaluation of RNA three-dimensional structure prediction JOSE´ ALMEIDA CRUZ,1 MARC-FRE´DE´RICK BLANCHET,2 MICHAL BONIECKI,3 JANUSZ M. BUJNICKI,3,4 SHI-JIE CHEN,5 SONG CAO,5 RHIJU DAS,6,7 FENG DING,8 NIKOLAY V. DOKHOLYAN,8 SAMUEL COULBOURN FLORES,9 LILI HUANG,10 CHRISTOPHER A. LAVENDER,11 VE´RONIQUE LISI,2 FRANCxOIS MAJOR,2 KATARZYNA MIKOLAJCZAK,3 DINSHAW J. PATEL,10 ANNA PHILIPS,3,4 TOMASZ PUTON,4 JOHN SANTALUCIA,12,13 FREDRICK SIJENYI,13 THOMAS HERMANN,14 KRISTIAN ROTHER,4 MAGDALENA ROTHER,4 ALEXANDER SERGANOV,10 MARCIN SKORUPSKI,4 TOMASZ SOLTYSINSKI,3 PARIN SRIPAKDEEVONG,6,7 IRINA TUSZYNSKA,3 KEVIN M. WEEKS,11 CHRISTINA WALDSICH,15 MICHAEL WILDAUER,15 NEOCLES B. LEONTIS,16 and ERIC WESTHOF1,17 1Architecture et Re´activite´ de l’ARN, Universite´ de Strasbourg, IBMC-CNRS, F-67084 Strasbourg, France 2Institute for Research in Immunology and Cancer (IRIC), Department of Computer Science and Operations Research, Universite´ de Montre´al, Montre´al, Que´bec H3C 3J7, Canada 3Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, 02-109 Warsaw, Poland 4Laboratory of Bioinformatics, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, 61-614 Poznan, Poland 5Department of Physics and Department of Biochemistry, University of Missouri, Columbia, Missouri 65211, USA 6Department of Biochemistry, 7Department of Physics, Stanford University, Stanford, California 94305, USA 8Department of Biochemistry and Biophysics, University of North Carolina, School of Medicine, Chapel Hill, North Carolina 27599, USA 9Computational & Systems Biology Program, Institute for Cell and Molecular Biology, Uppsala University, 751 05 Uppsala, Sweden 10Structural Biology Program, Memorial Sloan-Kettering Cancer Center, New York, New York 10021, USA 11Department of Chemistry, University of North Carolina, Chapel Hill, North Carolina 27599, USA 12Department of Chemistry, Wayne State University, Detroit, Michigan 48202, USA 13DNA Software, Ann Arbor, Michigan 48104, USA 14Department of Chemistry and Biochemistry, University of California at San Diego, La Jolla, California 92093, USA 15Max F. Perutz Laboratories, Department of Biochemistry, University of Vienna, Vienna 1030, Austria 16Department of Chemistry and Center for Biomolecular Sciences, Bowling Green State University, Bowling Green, Ohio 43403, USA ABSTRACT We report the results of a first, collective, blind experiment in RNA three-dimensional (3D) structure prediction, encom- passing three prediction puzzles. The goals are to assess the leading edge of RNA structure prediction techniques; compare existing methods and tools; and evaluate their relative strengths, weaknesses, and limitations in terms of sequence length and structural complexity. The results should give potential users insight into the suitability of available methods for different applications and facilitate efforts in the RNA structure prediction community in ongoing efforts to improve prediction tools. We also report the creation of an automated evaluation pipeline to facilitate the analysis of future RNA structure prediction exercises. Keywords: 3D prediction; bioinformatics; force fields; structure INTRODUCTION basis of the underlying biological process. Each of the current experimental methods for determining the three-dimensional The determination of the atomic structure of any biological (3D) structures of RNA molecules—X-ray crystallography, macromolecule, RNA molecules being no exception, con- NMR, and cryo-electron microscopy—requires great exper- tributes regularly toward the understanding of the molecular tise and substantial technical resources. Therefore, the ability to reliably predict accurate RNA 3D structures based solely on their sequences, or in concert with efficiently obtained 17 Corresponding author. biochemical information, is an important problem and E-mail [email protected]. Article published online ahead of print. Article and publication date are constitutes a major intellectual challenge (Tinoco and at http://www.rnajournal.org/cgi/doi/10.1261/rna.031054.111. Bustamante 1999). Recent decades have seen several 610 RNA (2012), 18:610–625. Published by Cold Spring Harbor Laboratory Press. Copyright Ó 2012 RNA Society. Downloaded from rnajournal.cshlp.org on September 28, 2021 - Published by Cold Spring Harbor Laboratory Press Evaluation of RNA 3D structure prediction significant theoretical advances toward this goal that tures were undertaken once the structures were published. include: The resulting benchmarks function as a snapshot of the current status of this field. On the basis of this successful first 1. The development of predictive models for RNA second- round, we would like to extend to RNA the idea established ary structure, pioneered by the seminal work of Tinoco by the protein structure prediction community (Moult and coworkers (Tinoco et al. 1973) and made commonly 2006) and to propose a continuous, open, and collective available in a number of tools that perform reasonably structure prediction experiment, with the essential, active well for sequences of moderate size (Hofacker et al. 1994; participation of experimentalists. Zuker 2003; Reuter and Mathews 2010); 2. The ability to meaningfully deduce RNA structures through comparative sequence analysis (Woese et al. RNA-PUZZLES 1980; Michel and Westhof 1990); RNA-Puzzles is a collective blind experiment for evaluation 3. The systematization of the knowledge about RNA archi- of de novo RNA structure prediction. With this initiative, tecture and interactions (Leontis and Westhof 2001) to we hope to (1) assess the cutting edge of RNA structure gain a handle on the rapid increase in the number and prediction techniques; (2) compare the different methods size of RNA molecules with published structures available and tools, elucidating their relative strengths and weak- in public databases (Berman et al. 2000); nesses and clarifying their limits in terms of sequence 4. The availability of comprehensive sequence alignments length and structure complexity; (3) determine what has still (Gardner et al. 2009) permitting the study of the relation- to be done to achieve an ultimate solution to the structure ship between structure and sequence; prediction problem; (4) promote the available methods and 5. The development of improved molecular dynamics force guide potential users in the choice of suitable tools for fields and techniques (Ditzler et al. 2010); different problems; and (5) encourage the RNA structure 6. Finally, the increasing availability of inexpensive comput- prediction community in their efforts to improve the ing power and data storage allows for extensive compu- current tools. tational searches. The procedure that governs RNA-Puzzles is straightfor- ward. Based on the successful first round, we propose the As a consequence, exciting developments in the field of de following steps: novo structure prediction have occurred in the last few years: computer-assisted modeling tools (Martinez et al. 2008; Jossinet et al. 2010); conformational space search d Complete nucleotide sequences will be periodically re- (Parisien and Major 2008); discrete molecular dynamics leased to interested groups who agree to keep sequence (Ding et al. 2008a); knowledge-based, coarse-grained re- information confidential. These target sequences corre- finement (Jonikas et al. 2009); template-based (Flores and spond to experimentally determined crystallographic or Altman 2010; Rother et al. 2011b); and force-field-based NMR structures, kindly provided by experimental groups, approaches (Das et al. 2010) inspired by proven protein- and not yet published in any form. Confidentiality of RNA folding techniques adapted to the RNA field (for review, see sequence information is essential to protect the target Rother et al. 2011a). All these new approaches are pushing selection and molecular engineering strategies of par- the limits of automatic RNA structure prediction from short ticipating experimental groups. sequences of a few nucleotides to medium-sized molecules d The interested groups will have a specified length of time with several dozens. Assuming continuing, steady progress, (usually 4–6 wk) to submit their predicted models to one can expect that in the near-to-medium future, de a website in a standard pdb format that respects atom novo prediction of RNA 3D structures will become as naming and nomenclature conventions. common and useful as RNA secondary structure pre- d The predicted models will be evaluated with regard to diction is today. stereochemical correctness, topology, and geometrical These promising results and the increasing number of similarity, relative to the experimental structure. available tools raise the need for objective evaluation and d After publication of the original experimental structures, comparison. Indeed, the establishment of a benchmark for all predicted models, experimental results, and compar- RNA structure prediction has become essential in order to ative data will be made publicly available. optimize and improve the current methods and tools for structural prediction. Here, we present the results of a blind To set up and automate these steps, the RNA-Puzzles exercise in RNA structure prediction. Sequences of RNA team has put together a public website for announcing new