Accurate and Automated Classification of Protein Secondary Structure with Psicsi

Total Page:16

File Type:pdf, Size:1020Kb

Accurate and Automated Classification of Protein Secondary Structure with Psicsi Accurate and automated classification of protein secondary structure with PsiCSI LING-HONG HUNG AND RAM SAMUDRALA Computational Genomics, Department of Microbiology, University of Washington, Seattle, Washington 98109, USA (RECEIVED July 2, 2002; FINAL REVISION October 18, 2002; ACCEPTED October 31, 2002) Abstract PsiCSI is a highly accurate and automated method of assigning secondary structure from NMR data, which is a useful intermediate step in the determination of tertiary structures. The method combines information from chemical shifts and protein sequence using three layers of neural networks. Training and testing was performed on a suite of 92 proteins (9437 residues) with known secondary and tertiary structure. Using a stringent cross-validation procedure in which the target and homologous proteins were removed from the databases used for training the neural networks, an average 89% Q3 accuracy (per residue) was observed. This is an increase of 6.2% and 5.5% (representing 36% and 33% fewer errors) over methods that use chemical shifts (CSI) or sequence information (Psipred) alone. In addition, PsiCSI improves upon the and is able to use sequence (87.4% ס translation of chemical shift information to secondary structure (Q3 without 13C shifts and 86.9% ס information as an effective substitute for sparse NMR data (Q3 with only H␣ shifts available). Finally, errors made by PsiCSI almost exclusively involve the 86.8% ס Q3 interchange of helix or strand with coil and not helix with strand (<2.5 occurrences per 10000 residues). The automation, increased accuracy, absence of gross errors, and robustness with regards to sparse data make PsiCSI ideal for high-throughput applications, and should improve the effectiveness of hybrid NMR/de novo structure determination methods. A Web server is available for users to submit data and have the assignment returned. Keywords: NMR; chemical shifts; secondary structure; neural networks Supplemental material: See www.proteinscience.org. The flood of data from the genomic sequencing projects has Moseley et al. 2001), still requires considerable human in- inspired structural genomic projects aimed at determining tervention. One promising approach to this problem is to all of the possible protein folds (Burley 2000; Brenner couple theoretical simulations with NMR methods to reduce 2001). Although the major methodology being used in these the amount of data, effort, and time required to determine projects is X-ray crystallography, NMR is also being devel- the fold of a protein (Delaglio et al. 2000; Rohl and Baker oped as an alternative for high-throughput applications 2002). Automated and accurate secondary structure assign- (Montelione 2001). One of the major bottlenecks in NMR ments are necessary for these methods to be effective. structure determinations is in the interpretation and analysis of the spectral data, which, with the possible exception of chemical-shift assignment (Bailey-Kellogg et al. 2000; Secondary structure from chemical shifts (CSI) The first step of any NMR structure determination is the Reprint requests to Ram Samudrala, Computational Genomics, Depart- assignment of chemical shifts (CSI). Because this is also the ment of Microbiology, University of Washington, Box 357242, Rosen only step that has been partially automated, a considerable Building, 960 Republican St., Seattle, WA 98109, USA; e-mail: amount of effort has been expended in translating chemical [email protected]; fax: (206) 732-6055. Article and publication are at http://www.proteinscience.org/cgi/doi/ shifts into structural information (Wishart et al. 1992; Wis- 10.1110/ps.0222303. hart and Sykes 1994; Cornilescu et al. 1999; Bonvin et al. 288 Protein Science (2003), 12:288–295. Published by Cold Spring Harbor Laboratory Press. Copyright © 2003 The Protein Society Assignment of secondary structure with PsiCSI 2001). There is a fairly simple, if noisy, relationship be- works to derive a second set of refined potentials. Like CSI, tween secondary structure and the chemical shifts of certain multiple shifts are used to further increase accuracy. Addi- nuclei (Spera and Bax 1991; Wishart et al. 1991). For ex- tional information from 15N shifts and from Psipred predic- ample, H␣ chemical shifts are higher than average (down- tions is also used. Rather than utilizing a simple jury system, field) in extended structures and lower than average (up- PsiCSI trains a second layer of neural networks. Every pos- 15 field) in helices. The same is true for N and C␤ shifts, sible combination of the available data for the residue (i.e., whereas the opposite relationship holds for C, and C␣ shifts. refined potentials from the first layer of networks and To exploit this information, CSI (Chemical Shift Index) Psipred potentials) is fed into separate neural nets. Reliabili- (Wishart et al. 1992; Wishart and Sykes 1994) assigned ties for each combination are estimated and the best per- three indices, −1, 0, and 1, depending on whether the chemi- forming combination (for that residue type) is used to pro- cal shift was near the average value or at one of the ex- vide potentials for the next layer of neural networks. Fi- tremes. Consecutive occurrences of like indices were used nally, as with Psipred, the last neural net takes into account to identify the presence of secondary structure. To further local interactions. This is similar to the first layer of neural increase accuracy, a jury system averaged assignments from nets used to average out chemical shift noise. However, multiple chemical shifts—C, C␣,C␤, and H␣—to arrive at a because the accuracy of the inputs at this stage is much consensus assignment. higher, it is possible to utilize a much larger window (17 vs. 3 residues) to take into account more subtle interactions Secondary structure from sequence (Psipred) between distant residues. The most reliable outputs from the second layer along with estimated reliablities are fed into this Early secondary structure prediction methods relied upon final neural net to ultimately obtain the PsiCSI prediction. database (Chou and Fasman 1974; Garnier et al. 1978) or theoretically derived propensities (Lim 1974) for residue types to be in the three secondary structure states with Q3 Results and Discussion (i.e., the percentage total number of residues correctly as- signed to the three secondary structure states ) accuracies in PsiCSI significantly improves upon existing methods the 60% range. The current generation of methods exploits PsiCSI achieves a Q3 accuracy of 89% (per residue), which the information from multiple alignments to further enhance is a significant improvement over the 82.8% (z > 12) accu- the accuracy (Krogh et al. 1994; Rost 1996; Jones 1999), racy observed for CSI and the 83.5% (z > 11) accuracy which now approaches 80%. The use of neural nets, PHD observed for Psipred. The CSI accuracy observed for our (Rost 1996) and Psipred (Jones 1999), to interpret the large dataset differs from the originally published accuracy of amount of data has been also instrumental in increasing the 92%. However, this figure was obtained using a small accuracy. One of the most accurate methods, Psipred, uses sample of proteins on the basis of a combination of subjec- neural nets to convert PsiBlast (Altschul et al. 1997) profile tive identification of secondary structures from NMR data data to secondary structure propensities. A second set of (not structures) and on crystal structures. In addition, the neural nets then takes into account local interactions to dataset was not jackknifed, 8 of the 16 proteins used to smooth the resulting secondary structure predictions and evaluate CSI were also part of the set of 12 proteins used to further increase accuracy. determine the indices. The observed accuracy of Psipred also differs from the stated accuracy (80%) by more than Secondary structure from chemical shifts and can be accounted for by random chance (z > 8). The test set sequence (PsiCSI) may include proteins and/or homologs to proteins used to PsiCSI combines both the chemical shift-based and se- train Psipred’s neural networks, which could account for the quence-based methods to further increase the accuracy of higher accuracy. However, the accuracy of the chemical secondary structure assignments. It is also designed to best shift alone version of PsiCSI (87.4%) indicates that the high utilize whatever data is available. PsiCSI begins by refining accuracy of PsiCSI is not contingent upon the unusual ac- the CSI methodology. Rather than three indices, three sepa- curacy of Psipred on the test set. rate potentials ranging from 0 to 1 are assigned to reflect the The distribution of Q3 accuracies of PsiCSI, CSI, and relative likelihood of a given chemical-shift value being PsiPred, is shown in Figure 1. The distribution of PsiCSI associated with a given secondary structure state. Like CSI, accuracies is very tight, reflecting the consistency of the PsiCSI reduces noise by polling nearby shifts. PsiCSI ex- method. Some of the less accurate results come from large amines a small window of shifts (three residues) centered regions of coil being assigned as helix or extended (see around the residue in question. Potentials derived from Electronic Supplemental Material). It is possible that PsiCSI these shifts, along with the estimated residue-dependent re- is detecting some residual structure in these regions. PsiCSI liabilities (i.e., probability of the assignment being correct) does better than CSI or Psipred in the majority of cases as of these potentials, are fed into a first layer of neural net- is expected from the average per residue increase in accu- www.proteinscience.org 289 Hung and Samudrala Table 1. Accuracy and reliability of PsiCSI, Psipred, and CSI Accuracy (%)a Reliability (%)b Overall (Q3)% HECHEC PsiCSI 89.0 91.8 84.4 89.1 93.3 84.0 88.3 PsiCSI (shifts only) 87.4 90.5 80.2 88.3 92.1 82.9 86.2 PsiCSI (no 13C) 86.9 87.0 82.6 88.7 90.6 86.0 85.1 PsiCSI (no 13C/15N) 86.8 87.0 81.6 88.7 90.6 85.9 84.4 Psipred v2.3 83.5 88.7 79.3 81.8 83.1 77.2 86.6 CSI (consensus) 82.8 86.9 80.7 81.0 91.4 71.3 82.7 a Percentage of correct assignments of state/total number of residues that are actually in that state.
Recommended publications
  • A Web Server for Rapid NMR-Based Protein Structure Determination Berjanskii, M.; Tang, P.; Liang, J.; Cruz, J
    NRC Publications Archive Archives des publications du CNRC GeNMR: a web server for rapid NMR-based protein structure determination Berjanskii, M.; Tang, P.; Liang, J.; Cruz, J. A.; Zhou, J.; Zhou, Y.; Bassett, E.; Macdonell, C.; Lu, P.; Lin, G.; Wishart, D. S. This publication could be one of several versions: author’s original, accepted manuscript or the publisher’s version. / La version de cette publication peut être l’une des suivantes : la version prépublication de l’auteur, la version acceptée du manuscrit ou la version de l’éditeur. For the publisher’s version, please access the DOI link below./ Pour consulter la version de l’éditeur, utilisez le lien DOI ci-dessous. Publisher’s version / Version de l'éditeur: https://doi.org/10.1093/nar/gkp280 Nucleic Acids Research, 37, Suppl. 2, pp. W670-W677, 2009-07-01 NRC Publications Record / Notice d'Archives des publications de CNRC: https://nrc-publications.canada.ca/eng/view/object/?id=8c24f92f-5a63-4bc9-9510-868e720b942e https://publications-cnrc.canada.ca/fra/voir/objet/?id=8c24f92f-5a63-4bc9-9510-868e720b942e Access and use of this website and the material on it are subject to the Terms and Conditions set forth at https://nrc-publications.canada.ca/eng/copyright READ THESE TERMS AND CONDITIONS CAREFULLY BEFORE USING THIS WEBSITE. L’accès à ce site Web et l’utilisation de son contenu sont assujettis aux conditions présentées dans le site https://publications-cnrc.canada.ca/fra/droits LISEZ CES CONDITIONS ATTENTIVEMENT AVANT D’UTILISER CE SITE WEB. Questions? Contact the NRC Publications Archive team at [email protected].
    [Show full text]
  • Chemical Shift Secondary Structure Population Inference
    bioRxiv preprint doi: https://doi.org/10.1101/2021.02.20.432095; this version posted February 20, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. 1 CheSPI: Chemical shift Secondary structure Population Inference Jakob Toudahl Nielsen1* and Frans A.A. Mulder1* 1Interdisciplinary Nanoscience Center (iNANO) and Department of Chemistry, Aarhus University, Gustav Wieds VeJ 14, 8000 Aarhus C, Denmark *Correspondence to: [email protected], [email protected] ORCID: Jtn http://orcid.org/0000-0002-9554-4662 faam http://orcid.org/0000-0001-9427-8406 Keywords: NMR, protein, order, disorder, chemical shifts Abstract NMR chemical shifts (CSs) are delicate reporters of local protein structure, and recent advances in random coil CS (RCCS) prediction and interpretation now offer the compelling prospect of inferring small populations of structure from small deviations from RCCSs. Here, we present CheSPI, a simple and efficient method that provides unbiased and sensitive aggregate measures of local structure and disorder. It is demonstrated that CheSPI can predict even very small amounts of residual structure and robustly delineate subtle differences into four structural classes for intrinsically disordered proteins. For structured regions and proteins, CheSPI can assign up to eight structural classes, which coincide with the well-known DSSP classification. The program is freely available, and can either be invoked from URL www.protein-nmr.org as a web implementation, or run locally from command line as a python program.
    [Show full text]
  • CSI 3.0: a Web Server for Identifying Secondary and Super-Secondary Structure in Proteins Using NMR Chemical Shifts Noor E
    W370–W377 Nucleic Acids Research, 2015, Vol. 43, Web Server issue Published online 15 May 2015 doi: 10.1093/nar/gkv494 CSI 3.0: a web server for identifying secondary and super-secondary structure in proteins using NMR chemical shifts Noor E. Hafsa1, David Arndt1 and David S. Wishart1,2,* 1Department of Computing Science, University of Alberta, Edmonton, AB T6G 2E8, Canada and 2Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E8, Canada Received March 10, 2015; Revised April 23, 2015; Accepted May 02, 2015 ABSTRACT ture elements has long been an integral part of the pro- tein structure determination process. This is particularly The Chemical Shift Index or CSI 3.0 (http://csi3. true for nuclear magnetic resonance (NMR)-based protein wishartlab.com) is a web server designed to accu- structure determination where secondary structure is used rately identify the location of secondary and super- to help in structure generation and refinement (2,3). In pro- secondary structures in protein chains using only tein NMR, secondary structures are traditionally identi- nuclear magnetic resonance (NMR) backbone chem- fied and assigned using NOE-based (Nuclear Overhauser ical shifts and their corresponding protein sequence Effect) methods. By manually analyzing the positions and data. Unlike earlier versions of CSI, which only iden- patterns of weak, medium or strong NOEs it is possible to tified three types of secondary structure (helix, ␤- identify helices, ␤-turns and ␤-stands with reasonably good strand and coil), CSI 3.0 now identifies total of 11 accuracy. Even today NOE pattern measurements continue types of secondary and super-secondary structures, to be the most commonly used method for identifying sec- ondary structures in peptides and proteins (2).
    [Show full text]
  • BMC Structural Biology Biomed Central
    BMC Structural Biology BioMed Central Research article Open Access Relationship between chemical shift value and accessible surface area for all amino acid atoms Wim F Vranken*1 and Wolfgang Rieping2 Address: 1Protein Databank Europe, European Bioinformatics Institute, Cambridge, UK and 2Department of Biochemistry, University of Cambridge, Cambridge, UK Email: Wim F Vranken* - [email protected]; Wolfgang Rieping - [email protected] * Corresponding author Published: 2 April 2009 Received: 12 November 2008 Accepted: 2 April 2009 BMC Structural Biology 2009, 9:20 doi:10.1186/1472-6807-9-20 This article is available from: http://www.biomedcentral.com/1472-6807/9/20 © 2009 Vranken and Rieping; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Abstract Background: Chemical shifts obtained from NMR experiments are an important tool in determining secondary, even tertiary, protein structure. The main repository for chemical shift data is the BioMagResBank, which provides NMR-STAR files with this type of information. However, it is not trivial to link this information to available coordinate data from the PDB for non-backbone atoms due to atom and chain naming differences, as well as sequence numbering changes. Results: We here describe the analysis of a consistent set of chemical shift and coordinate data, in which we focus on the relationship between the per-atom solvent accessible surface area (ASA) in the reported coordinates and their reported chemical shift value.
    [Show full text]
  • An Efficient Backbone Assignment Tool for NMR Studies of Proteins
    Reduced Dimensionality (4,3)D-hnCOCANH Experiment: An Efficient Backbone Assignment tool for NMR studies of Proteins Dinesh Kumar* Centre of Biomedical Magnetic Resonance, SGPGIMS Campus, Raibareli Road, Lucknow-226014, India; *Email: [email protected] KEYWORDS: Multidimensional NMR; Backbone Assignment; hncoCANH, hnCOcaNH; Sequential Correlations. ABBREVIATIONS: NMR, Nuclear Magnetic Resonance; HSQC, Heteronuclear Single Quantum Correlation; CARA, Computer Aided Resonance Assignment; RD: Reduced dimensionality. 1 Abstract: Sequence specific resonance assignment and secondary structure determination of proteins form the basis for variety of structural and functional proteomics studies by NMR. In this context, an efficient standalone method for rapid assignment of backbone (1H, 15N, 13C and 13C') resonances and secondary structure determination of proteins has been presented here. Compared to currently available strategies used for the purpose, the method employs only a single reduced dimensionality (RD) experiment -(4,3)D-hnCOCANH and exploits the linear combinations of backbone (13C and 13C') chemical shifts to achieve a dispersion relatively better compared to those of individual chemical shifts (see the text) for efficient and rapid data analysis. Further, the experiment leads to the spectrum with direct distinction of self (intra-residue) and sequential (inter-residue) carbon correlation peaks; these appear opposite in signs and therefore can easily be discriminated without using an additional complementary experiment. On top of
    [Show full text]
  • Bioinformatics Methods for NMR Chemical Shift Data
    Bioinformatics Methods for NMR Chemical Shift Data Dissertation an der Fakult¨at fur¨ Mathematik, Informatik und Statistik der Ludwig-Maximilians-Universit¨at Munchen¨ Simon W. Ginzinger vorgelegt am 24.10.2007 Erster Gutachter: Prof. Dr. Volker Heun, Ludwig-Maximilians-Universit¨at Munchen¨ Zweiter Gutachter: Prof. Dr. Robert Konrat, Universit¨at Wien Rigorosum: 8. Februar 2008 Kurzfassung Die nukleare Magnetresonanz-Spektroskopie (NMR) ist eine der wichtigsten Meth- oden, um die drei-dimensionale Struktur von Biomolekulen¨ zu bestimmen. Trotz großer Fortschritte in der Methodik der NMR ist die Aufl¨osung einer Protein- struktur immer noch eine komplizierte und zeitraubende Aufgabe. Das Ziel dieser Doktorarbeit ist es, Bioinformatik-Methoden zu entwickeln, die den Prozess der Strukturaufkl¨arung durch NMR erheblich beschleunigen k¨onnen. Zu diesem Zweck konzentriert sich diese Arbeit auf bestimmte Messdaten aus der NMR, die so genannten chemischen Verschiebungen. Chemische Verschiebungen werden standardm¨aßig zu Beginn einer Struktur- aufl¨osung bestimmt. Wie alle Labordaten k¨onnen chemische Verschiebungen Fehler enthalten, die die Analyse erschweren, wenn nicht sogar unm¨oglich machen. Als erstes Resultat dieser Arbeit wird darum CheckShift pr¨asentiert, eine Meth- ode, die es erm¨oglich einen weit verbreiteten Fehler in chemischen Verschiebungs- daten automatisch zu korrigieren. Das Hauptziel dieser Doktorarbeit ist es jedoch, strukturelle Informationen aus chemischen Verschiebungen zu extrahieren. Als erster Schritt in diese Richtung wurde SimShift entwickelt. SimShift erm¨oglicht es zum ersten Mal, strukturelle Ahnlichkeiten¨ zwischen Proteinen basierend auf chemischen Verschiebungen zu identifizieren. Der Vergleich zu Methoden, die nur auf der Aminos¨aurensequenz basieren, zeigt die Uberlegenheit¨ des verschiebungsbasierten Ansatzes. Als eine naturliche¨ Erweiterung des paarweisen Vergleichs von Proteinen wird darauf fol- gend SimShiftDB vorgestellt.
    [Show full text]
  • Anna Katharina Dehof Novel Approaches for Bond Order Assignment and NMR Shift Prediction
    Anna Katharina Dehof Novel Approaches for Bond Order Assignment and NMR Shift Prediction Novel Approaches for Bond Order Assignment and NMR Shift Prediction Dissertation zur Erlangung des Grades eines Doktors der Naturwissenschaften (Dr. rer. nat.) der Naturwissenschaftlich–Technischen Fakult¨at I – Mathematik und Informatik – der Universit¨at des Saarlandes vorgelegt von Anna Katharina Dehof, M.Sc. Saarbr¨ucken 2011 c 2011 Anna Katharina Dehof Covergestaltung und Titelbild: c 2011 Anna Katharina Dehof, BALL developer team, http://www.ball-project.org Herstellung und Verlag: Books on Demand GmbH, Norderstedt ISBN 978-3-8482-0773-2 Bibliografische Informationen der Deutschen Nationalbibliothek Die Deutsche Nationalbibliothek verzeichnet diese Publikation in der Deutschen Nationalbibliographie; detaillierte bibliografische Daten sind im Internetuber ¨ http://dnb.d-ndb.de abrufbar. Tag des Kolloquiums 15.03.2012 Dekan Prof. Dr. Holger Hermanns Berichterstatter Prof. Dr. Hans-Peter Lenhof Prof. Dr. Sebastian B¨ocker Vorsitz Prof. Dr. Philipp Slusallek Akad. Mitarbeiter Dr. Oliver M¨uller Eidesstattliche Versicherung Hiermit versichere ich an Eides statt, dass ich die vorliegende Arbeit selbstst¨andig und ohne Benutzung anderer als der angegebenen Hilfsmittel angefertigt habe. Die aus anderen Quellen oder indirekt ubernommenen¨ Daten und Konzepte sind unter Angabe der Quelle gekennzeichnet. Die Arbeit wurde bisher weder im In- noch im Ausland in gleicher oder ¨ahnlicher Form in einem Verfahren zur Erlangung eines akademischen Grades vorgelegt. Saarbr¨ucken, den 17. November 2011 (Anna Katharina Dehof) Abstract Molecular modelling is one of the cornerstones of modern biological and pharmaceutical research. Accurate modelling approaches easily become computationally overwhelming and thus, different levels of approximations are typically employed. In this work, we develop such approximation approaches for problems arising in structural bioinformatics.
    [Show full text]
  • Equilibrium Simulations of Proteins Using Molecular Fragment Replacement and NMR Chemical Shifts
    Equilibrium simulations of proteins using molecular fragment replacement and NMR chemical shifts Wouter Boomsmaa,1,2, Pengfei Tianb,1, Jes Frellsenc, Jesper Ferkinghoff-Borgd, Thomas Hamelrycka, Kresten Lindorff-Larsena, and Michele Vendruscoloe,2 aDepartment of Biology, University of Copenhagen, 2200 Copenhagen, Denmark; bNiels Bohr Institute, University of Copenhagen, 2100 Copenhagen, Denmark; cDepartment of Engineering, University of Cambridge, Cambridge, CB2 1PZ, United Kingdom; dBiotech Research and Innovation Center, University of Copenhagen, Technical University of Denmark Campus, 2800 Kongens Lyngby, Denmark; and eDepartment of Chemistry, University of Cambridge, Cambridge CB2 1EW, United Kingdom Edited by José N. Onuchic, Rice University, Houston, TX, and approved August 7, 2014 (received for review March 18, 2014) Methods of protein structure determination based on NMR chemical data provide information primarily about local structural prop- shifts are becoming increasingly common. The most widely used erties of a molecule. In such cases, without modifying the force approaches adopt the molecular fragment replacement strat- field, the data can be integrated directly in the sampling pro- egy, in which structural fragments are repeatedly reassembled cedure, which can dramatically increase the efficiency of the into different complete conformations in molecular simulations. simulations. For instance, local structural information can be Although these approaches are effective in generating individual integrated into the simulations
    [Show full text]
  • Model-Based Assignment and Inference of Protein Backbone Nuclear Magnetic Resonances
    Model-based assignment and inference of protein backbone nuclear magnetic resonances. Olga Vitek∗ Jan Vitek† Bruce Craig∗ Chris Bailey-Kellogg†‡ Abstract Nuclear Magnetic Resonance (NMR) spectroscopy is a key experimental technique used to study protein structure, dynamics, and interactions. NMR methods face the bottleneck of spectral analysis, in particular determining the resonance assignment, which helps define the mapping between atoms in the protein and peaks in the spectra. A substantial amount of noise in spectral data, along with ambiguities in interpretation, make this analysis a daunting task, and there exists no generally accepted measure of uncertainty associated with the resulting solutions. This paper develops a model- based inference approach that addresses the problem of characterizing uncertainty in backbone resonance assignment. We argue that NMR spectra are subject to random variation, and ignoring this stochasticity can lead to false optimism and erroneous conclusions. We propose a Bayesian statistical model that accounts for various sources of uncertainty and provides an automatable framework for inference. While assignment has previously been viewed as a deterministic optimization problem, we demonstrate the importance of considering all solutions consistent with the data, and develop an algorithm to search this space within our statistical framework. Our approach is able to characterize the uncertainty associated with backbone resonance assignment in several ways: 1) it quantifies of uncertainty in the individually assigned resonances in terms of their posterior standard deviations; 2) it assesses the information content in the data with a posterior distribution of plausible assignments; and 3) it provides a measure of the overall plausibility of assignments. We demonstrate the value of our approach in a study of experimental data from two proteins, Human Ubiquitin and Cold-shock protein A from E.
    [Show full text]
  • NMR Study of the Secondary Structure and Biopharmaceutical Formulation of an Active Branched Antimicrobial Peptide
    molecules Article NMR Study of the Secondary Structure and Biopharmaceutical Formulation of an Active Branched Antimicrobial Peptide Francesca Castiglia 1, Fabrizia Zevolini 1, Giulia Riolo 1, Jlenia Brunetti 1 , Alessandra De Lazzari 2, Alberto Moretto 2, Giulia Manetto 2, Marco Fragai 3, Jenny Algotsson 4, Johan Evenäs 4, Luisa Bracci 1,5, Alessandro Pini 1,5 and Chiara Falciani 1,5,* 1 Department of Medical Biotechnology, University of Siena, via Aldo Moro 2, 53100 Siena, Italy; [email protected] (F.C.); [email protected] (F.Z.); [email protected] (G.R.); [email protected] (J.B.); [email protected] (L.B.); [email protected] (A.P.) 2 Zambon S.p.a, Via della Chimica, 9. 36100 Vicenza, Italy; [email protected] (A.D.L.); [email protected] (A.M.); [email protected] (G.M.) 3 Magnetic Resonance Center (CERM), University of Florence, via Luigi Sacconi 6, 50019 Sesto Fiorentino (Firenze), Italy; [email protected]fi.it 4 Red Glead Discovery AB, Medicon Village, 223 81 Lund, Sweden; [email protected] (J.A.); [email protected] (J.E.) 5 SetLance srl, via Aldo Moro 2, 53100 Siena, Italy * Correspondence: [email protected]; Tel.: +0039-577-232022 Academic Editor: Hirokazu Tamamura Received: 15 October 2019; Accepted: 20 November 2019; Published: 25 November 2019 Abstract: The synthetic antimicrobial peptide SET-M33 is being developed as a possible new antibacterial candidate for the treatment of multi-drug resistant bacteria. SET-M33 is a branched peptide featuring higher resistance and bioavailability than its linear analogues.
    [Show full text]
  • Structure Calculation of Proteins from Solution and Solid-State NMR Data : Application to Monomers and Symmetric Aggregates
    Structure calculation of proteins from solution and solid-state NMR data : Application to monomers and symmetric aggregates Inaugural-Dissertation to obtain the academic degree Doctor rerum naturalium (Dr. rer. nat.) submitted to the Department of Biology, Chemistry and Pharmacy of the Freie Universitat¨ Berlin by Benjamin Bardiaux from Nantes, France January, 2009 1st Reviewer: Prof. Dr. Michael Nilges (Institut Pasteur, Paris) 2nd Reviewer: Prof. Dr. Hartmut Oschkinat (FMP, Berlin) date of defence: 8 Sept. 2009 This work was carried out from October 2004 to September 2008 under the joint supervision of Prof. Dr. Michael Nilges (Institut Pasteur, Paris), Dr. Ther´ ese` E. Malliavin (Institut Pasteur, Paris) and Prof. Dr. Hartmut Oschkinat (Leibnizinstitut fur¨ Molekulare Pharmakologie, Berlin) at the Unite´ de Bioinformatique Structurale of the Institut Pasteur in Paris. iii Acknowledgements I would like to thank Michael Nilges for giving me the opportunity to work in an excellent scientific environment and entrusting me with all ARIA related projects during the last four years. I am deeply grateful to Ther´ ese` Malliavin for introducing me to the world of NMR and structure calculations. Her everyday support, inventiveness and cheerfulness was of great help during my thesis. I wish to thank Hartmut Oschkinat for his effective co-supervision of my thesis from Berlin. I would also like to express my gratitude to the persons across Europe without whom this work would not have been possible : Michael Habeck and Wolfgang Rieping who initiated
    [Show full text]
  • University of Florence International Doctorate in Structural Biology Cycle XXIII (2008–2010)
    View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Florence Research University of Florence International Doctorate in Structural Biology Cycle XXIII (2008{2010) Ph.D. thesis of Anna Pavelkov´a NMR Characterization of Protein{Protein Interactions Tutor: Prof. Isabella Caterina Felli Coordinator: Prof. Ivano Bertini S.S.D. CHIM/03 This thesis has been approved by the University of Florence, the University of Frankfurt and the Utrecht University There are many people who deserve aknowledgement for support that made my task easier and also more enjoyable, they are too numerous to be men- tioned here. I would like to devote thanks to Prof. Ivano Bertini for giving me the opportunity to work at CERM and to be involved in interesting projects introducing me to the world of modern nuclear magnetic resonance in struc- tural biology. Special thanks belong to Prof. Isabella C. Felli for dedicated care through informal discussions, lending advice, and other assistance. List of Abbreviations δ chemical shift τc rotational correlation time 1D, 2D, 3D one-, two-, three-dimensional AIR ambiguous interaction restraints ALR augmenter of liver regeneration, FAD-linked sulfhydryl oxi- dase ATP7A P-type ATPase BL21 (DE3) competent cells CLEANEX Chemical EXchange spectroscopy CSI chemical shift index DG distance geometry DNA deoxyribonucleic acid DTT dithiothreitol DYANA DYnamic Algorithm for Nmr Applications EDTA ethylene diamine tetraacetic acid FAD flavin adenine dinucleotide FPLC fast protein liquid chromatography
    [Show full text]