Mobidb 3.0: More Annotations for Intrinsic Disorder
Total Page:16
File Type:pdf, Size:1020Kb
Published online 10 November 2017 Nucleic Acids Research, 2018, Vol. 46, Database issue D471–D476 doi: 10.1093/nar/gkx1071 MobiDB 3.0: more annotations for intrinsic disorder, conformational diversity and interactions in proteins Damiano Piovesan1,†, Francesco Tabaro1,2,†, Lisanna Paladin1, Marco Necci1,3,4, Ivan Micetiˇ c´ 1, Carlo Camilloni5, Norman Davey6,7, Zsuzsanna Dosztanyi´ 8, Balint´ Mesz´ aros´ 8,9, Alexander M. Monzon10, Gustavo Parisi10, Eva Schad9, Pietro Sormanni11, Peter Tompa9,12,13, Michele Vendruscolo11, Wim F. Vranken12,13,14 and Silvio C.E. Tosatto1,15,* 1Department of Biomedical Sciences, University of Padua, via U. Bassi 58/b, 35131 Padua, Italy, 2Institute of Biosciences and Medical Technology, Arvo Ylpon¨ katu 34, 33520 Tampere, Finland, 3Department of Agricultural Sciences, University of Udine, via Palladio 8, 33100 Udine, Italy, 4Fondazione Edmund Mach, Via E. Mach 1, 38010 S. Michele all’Adige, Italy, 5Department of Biosciences, University of Milan, 20133 Milano, Italy, 6Conway Institute of Biomolecular & Biomedical Research, University College Dublin, Belfield, Dublin 4, Ireland, 7UCD School of Medicine & Medical Science, University College Dublin, Belfield, Dublin 4, Ireland, 8MTA-ELTE Lendulet¨ Bioinformatics Research Group, Department of Biochemistry, Eotv¨ os¨ Lorand´ University, 1/cPazm´ any´ Peter´ set´ any,´ H-1117, Budapest, Hungary, 9Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, PO Box 7, H-1518 Budapest, Hungary, 10Structural Bioinformatics Group, Department of Science and Technology, National University of Quilmes, CONICET, Roque Saenz Pena 182, Bernal B1876BXD, Argentina, 11Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, UK, 12Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels 1050, Belgium, 13VIB-VUB Center for Structural Biology, Flanders Institute for Biotechnology (VIB), Brussels 1050, Belgium, 14Interuniversity Institute of Bioinformatics in Brussels, ULB/VUB, 1050 Brussels, Belgium and 15CNR Institute of Neuroscience, via U. Bassi 58/b, 35131 Padua, Italy Received September 22, 2017; Revised October 13, 2017; Editorial Decision October 16, 2017; Accepted October 19, 2017 ABSTRACT creation of a wide array of custom-made datasets for download and further analysis. A large amount The MobiDB (URL: mobidb.bio.unipd.it) database of of information and cross-links to more specialized protein disorder and mobility annotations has been databases are intended to make MobiDB the cen- significantly updated and upgraded since its last ma- tral resource for the scientific community working jor renewal in 2014. Several curated datasets for in- on protein intrinsic disorder and mobility. trinsic disorder and folding upon binding have been integrated from specialized databases. The indirect evidence has also been expanded to better capture INTRODUCTION information available in the PDB, such as high tem- The protein structure-function paradigm is a cornerstone perature residues in X-ray structures and overall con- of molecular biology, offering a mechanistic understanding formational diversity. Novel nuclear magnetic reso- of processes ranging from enzyme catalysis, signal transduc- nance chemical shift data provides an additional ex- tion to molecular recognition and allosteric regulation. Un- perimental information layer on conformational dy- derlying this paradigm is the assumption that proteins be- namics. Predictions have been expanded to provide come functional by assuming a well-defined structure, typ- new types of annotation on backbone rigidity, sec- ically described by the coordinates of all its atoms. A solid ondary structure preference and disordered bind- foundation of this view is provided by the 130 000 struc- tures of proteins and complexes in the Protein Data Bank, ing regions. MobiDB 3.0 contains information for PDB (1). However, it is increasingly recognized that many the complete UniProt protein set and synchroniza- proteins do not obey this rule. Intrinsically disordered pro- tion has been improved by covering all UniParc se- teins (IDPs) or regions (IDRs) are devoid of order in their quences. An advanced search function allows the native unbound state (2–4). Intrinsic disorder is prevalent *To whom correspondence should be addressed. Tel: +39 49 827 6269; Fax: +39 49 827 6363; Email: [email protected] †These authors contributed equally to the paper as first authors. C The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected] Downloaded from https://academic.oup.com/nar/article-abstract/46/D1/D471/4612964 by University of Cambridge user on 22 April 2018 D472 Nucleic Acids Research, 2018, Vol. 46, Database issue in the human proteome (5), appears to play important sig- naling and regulatory roles (2) and is frequently involved in disease (6). The discovery of intrinsic disorder and its prevalence and functional importance is transforming the field of molecular biology. As intrinsic disorder is emerging as a general phenomenon, databases are collecting and pre- senting disorder related data in a systematic manner. Mo- biDB has been a major contributor by providing consensus predictions and functional annotations for all UniProt pro- teins, driving the field ahead (7,8). The MobiDB upgrade we present in this paper is essential for several reasons. There is a rapid advance in the functional understand- ing of intrinsic disorder. The functional classification of IDPs/IDRs is becoming ever more elaborate, with sev- eral newly recognized functional mechanisms (9). For ex- ample, the central role of intrinsic disorder in the forma- tion of membraneless organelles, such as nucleoli and stress granules, by liquid-liquid phase separation has been char- acterized recently (10–13). A wide range of experimen- tal observations on the structure-function relationship of IDPs/IDRs is furthering our understanding of disordered states and of the manners in which they function (14–16). These developments have also played a central role in the re- cent update of the DisProt database (17), the central repos- itory of experimentally characterized IDPs and IDRs. The re-curated version of this database contains experimental observations of disorder for more than 800 protein entries and a renewed functional ontology schema. The experimen- tal evidence on which it rests has also been significantly aug- mented to include a broad range of biophysical techniques. DisProt is the basis for most developments in disorder pre- dictors (18,19), and its recent update is a major motivation for a new version of MobiDB. Additional developments in the field make this release timely. A major source of intrinsic disorder is the identifi- Figure 1. Overview of different annotation data types (A) and levels of cation of residues with missing atomic coordinates in the accuracy (B) in MobiDB 3.0. PDB, which can now be augmented by cryo-electron mi- croscopy (cryo-EM) data. This is having a tremendous im- searches are facilitated by an improved search algorithm, pact on structural biology (20,21). Structural descriptions pre-calculated data and new sections in the database. of IDPs and IDRs under physiological conditions have also greatly advanced and are starting to appear in dedicated DATABASE DESCRIPTION databases such as IDEAL (22), DIBS (23) and MFIB (24). IDPs and IDRs can perform key roles in molecular recogni- MobiDB 3.0 is intended to be a central resource for large- tion by folding upon binding of short linear motifs (SLiMs) scale intrinsic disorder sequence annotation. This new ver- covered in the ELM database (25). Generally, the full func- sion is organized by both type of disorder annotation and tional characterization of IDPs and IDRs requires the de- quality of disorder evidence (Figure 1). Disorder informa- scription not just of their free (disordered) states (26,27), tion is grouped in three different sections: disorder, linear but also of their residual dynamics in the bound states (28). interacting peptides (LIPs) and secondary structure pop- Fuzzy (disordered) complexes can be found in FuzDB (29) ulations. The latter represents the conformational hetero- and structural ensembles describing the free form (30)in geneity of IDPs and IDRs as the ability to populate differ- the protein ensemble database (PED (31)). Techniques such ent secondary structure populations in solution. LIPs are as in-cell Nuclear magnetic resonance (NMR) spectroscopy structure fragments that interact with other molecules pre- (32,33) and single-molecule fluorescence (34) will soon help serving an elongated structure or folding upon binding. The study these structures in the physiological state. In reflec- data in MobiDB is organized hierarchically. The top tier is tion of all these developments, we are now launching a sig- formed by manually curated data from external databases nificantly updated version of our database, MobiDB 3.0. and represents the highest quality annotations. Annotations The new version incorporates additional curated data from derived from experimental data such as X-ray and NMR specialized databases. Novel annotation features include chemical