pharmaceuticals

Article

A New Computer Model for Evaluating the Selective Binding Affinity of Phenylalkylamines to T-Type Ca2+ Channels

You Lu 1 and Ming Li 2,*

1 Center for Aging, School of Medicine, Tulane University, New Orleans, LA 70112, USA; [email protected]
2 Department of Physiology, School of Medicine, Tulane University, New Orleans, LA 70112, USA
* Correspondence: [email protected]; Tel.: +1-504-988-8207

Abstract: To establish a computer model for evaluating the binding affinity of phenylalkylamines (PAAs) to T-type Ca2+ channels (TCCs), we created new homology models for both TCCs and an L-type calcium channel (LCC). We found that PAAs have a high affinity for domains I and IV of TCCs and a low affinity for domains III and IV of the LCC. Therefore, they should be considered favorable candidates for TCC blockers. The new homology models were validated with some commonly recognized, well-characterized TCC blockers. Additionally, examples of newly created TCC blockers were evaluated using these models.

Keywords: T-type calcium channel blocker; homology modeling; computer-aided drug design; virtual drug screening; L-type calcium channel

Citation: Lu, Y.; Li, M. A New Computer Model for Evaluating the Selective Binding Affinity of Phenylalkylamines to T-Type Ca2+ Channels. Pharmaceuticals 2021, 14, 141. https://doi.org/10.3390/ph14020141

Academic Editor: Osvaldo Andrade Santos-Filho

Received: 27 January 2021
Accepted: 8 February 2021
Published: 10 February 2021

Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction

As the only type of voltage-gated Ca2+ channels that are activated at or near resting membrane potentials, T-type Ca2+ channels (TCCs) play an important role in regulating [Ca2+]i homeostasis in a variety of tissues, including pancreatic β-cells and tumor cells [1–4]. Therefore, TCC antagonists could be potentially useful for the treatment of chronic diseases associated with Ca2+ dysregulation [5–7]. For this reason, it is imperative to develop more selective TCC antagonists for prospective clinical applications.

Because many existing TCC blockers, such as mibefradil, also inhibit L-type calcium channels (LCCs), the most important task in developing new TCC blockers is to enhance their selectivity for TCCs over LCCs. To achieve this, we established TCC–phenylalkylamine interaction models based on the specific amino acid sequences in the P-loops of TCCs and of the α1C LCC, in order to characterize the drug molecules' affinities for TCCs and LCCs, respectively.

TCCs have a close evolutionary relationship with LCCs. A recent cryo-electron microscopy study revealed that the frame of the α1G (Cav3.1) pore domain structure is similar to that of α1S (Cav1.1) [8]. This similarity allowed us to confidently adopt the global structure of the calcium channel CavAb model, constructed based upon Arcobacter butzleri crystallization [9], in the establishment of our TCC model. One of the most remarkable differences between all types of TCCs and LCCs is a lysine residue in TCCs located adjacent to the critical glutamic acid/aspartic acid residue in domain III. The existence of a positively charged lysine (K3p49) may swing the aspartic acid (D3p50) away from the center of the calcium filter and change the preferred calcium ion and drug binding sites from domains III and IV for LCCs to domains I and IV for TCCs. Therefore, we used the ZMM molecular modeling program [10–12] to create four-domain TCC models, in which the binding affinities of drugs to TCCs and the α1C LCC were determined by scoring their free energy in binding to the channels [13].

The new TCC models are adapted from a drug–protein interaction framework for modeling CavAb blocking by phenylalkylamines (PAAs) [9]. This approach is reasonable because many TCC blockers are PAAs or their derivatives, and because PAAs block CavAb [9,14]. It is proposed that PAAs bind to LCCs in an inverse V-shaped configuration, with the ammonium group towards the P-helices, and the nitrile group bound to the calcium ion coordinated by the selectivity filter glutamates in domains III and IV of the LCC [15]. We reason that this is also true for TCC blockers, except that the calcium ion is coordinated in the cavity between domains I and IV, since the depolarization confirmed that the pore domains of Cav3.1 and Cav1.1 are superimposed [8]. As a result, the two rings of the flexible PAA molecule [15] will make hydrogen bonding contacts with the mobile side chains of relevant amino acids from domains I and IV of TCCs. This strategy allowed us to create computer models for simulating the interactions between drugs and channel receptors for LCC and TCCs, respectively.
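To illustrate how scored binding energies can be turned into a selectivity comparison between TCCs and LCCs, the following is a minimal Python sketch. It is not the scoring procedure of ZMM. It assumes, purely for illustration, that the scores behave like binding free energies in kcal/mol, so that the standard relation Kd ∝ exp(ΔG/RT) applies. The ligand names, the energy values, and the function names `selectivity_ratio` and `rank_by_selectivity` are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def selectivity_ratio(e_tcc, e_lcc, temperature=298.15):
    """Return (ddG, Kd_LCC / Kd_TCC) for two binding energies in kcal/mol.

    A more negative energy means tighter binding. Treating docking scores
    as true free energies is an assumption made for illustration only.
    """
    ddg = e_tcc - e_lcc  # negative when the ligand prefers the TCC
    ratio = math.exp(-ddg / (R_KCAL * temperature))
    return ddg, ratio


def rank_by_selectivity(scores):
    """Sort ligands so that the most TCC-selective one comes first."""
    rows = []
    for name, energies in scores.items():
        ddg, ratio = selectivity_ratio(energies["TCC"], energies["LCC"])
        rows.append((name, ddg, ratio))
    return sorted(rows, key=lambda row: row[1])


# Hypothetical scores in kcal/mol; these are NOT values from the paper.
scores = {
    "ligand_A": {"TCC": -9.0, "LCC": -6.0},
    "ligand_B": {"TCC": -7.5, "LCC": -7.5},
}

for name, ddg, ratio in rank_by_selectivity(scores):
    print(f"{name}: ddG = {ddg:+.1f} kcal/mol, Kd(LCC)/Kd(TCC) = {ratio:.1f}")
```

Under these assumptions, a ligand with equal scores for both channels yields a ratio of 1 (no selectivity), while each additional kcal/mol in favor of the TCC multiplies the ratio by roughly a factor of five at room temperature.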
2. Results

2.1. Homology Modeling of TCCs and α1C LCC

Using the open-state 3D structure of the bacterial calcium channel CavAb as the input, ZMM generated the first template of the calcium channel, which was then modified with the S5-P-loop-S6 segments of α1C, and of α1G, α1H, and α1I (Table 1), to create the corresponding protein structures of the Cav1.2 LCC and the Cav3.1, Cav3.2, and Cav3.3 TCCs, respectively (Figure 1A–D). For cross-validation of the ZMM-generated 3D structure models, we also performed ab initio modeling of the α1C and α1G calcium channels. Since there is a considerable overlap of PAA inhibition between LCC and TCCs [9], the allosteric structures of LCC and TCCs are likely to be similar. Of the two modeling approaches compared, ZMM generated more consistent 3D models of the domain III S5-P-loop-S6 segment of α1C and α1G (Figure 1E,F) than the ab initio method (Figure 1G,H).

Figure 1. Top views of homology modeling results from ZMM for α1C, α1G, α1H, and α1I. (A–D) The structures of α1C, α1G, α1H, and α1I, respectively; blue, green, brown, and yellow represent domains I, II, III, and IV, respectively; the four selectivity-determining amino acids (glutamic acid or aspartic acid) in the P-loop are colored red and displayed as spheres. ZMM generates more consistent 3D structures than the ab initio modeling method for α1C and α1G: (E) the predicted 3D structure of the α1C domain III generated by ZMM, the glutamic acid is represented by red spheres; (F) the predicted 3D structure of the α1G domain III generated by ZMM, the lysine is represented by red spheres; (G) the most representative structure selected by Calibur clustering analysis [16] of α1C domain III, the glutamic acid is represented by red spheres; (H) the most representative structure selected by Calibur clustering analysis of α1G domain III, the lysine is represented by red spheres.
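The idea of selecting a "most representative" structure from a set of ab initio models can be sketched as follows. This is a simplified neighbor-count illustration, not the Calibur implementation, whose exact procedure and settings are not described in this section. It assumes the candidate models are already superimposed and reduced to equally long coordinate lists. The toy coordinates, the threshold, and the function names `rmsd` and `most_representative` are hypothetical.

```python
import math


def rmsd(a, b):
    """RMSD between two equally sized, pre-superimposed coordinate lists."""
    if len(a) != len(b):
        raise ValueError("coordinate lists must have the same length")
    squared = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(squared / len(a))


def most_representative(decoys, threshold):
    """Return (index, neighbor_count) of the decoy with the most neighbors.

    A neighbor is any other decoy whose RMSD to the candidate is at or
    below the threshold. Ties are resolved in favor of the earliest decoy.
    """
    if not decoys:
        raise ValueError("at least one decoy is required")
    best_index, best_count = 0, -1
    for i, candidate in enumerate(decoys):
        count = sum(
            1
            for j, other in enumerate(decoys)
            if j != i and rmsd(candidate, other) <= threshold
        )
        if count > best_count:
            best_index, best_count = i, count
    return best_index, best_count


# Toy three-atom "decoys" (hypothetical coordinates in angstroms).
decoys = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(0.0, 0.1, 0.0), (1.0, 0.1, 0.0), (2.0, 0.1, 0.0)],
    [(0.0, -0.1, 0.0), (1.0, -0.1, 0.0), (2.0, -0.1, 0.0)],
    [(0.0, 5.0, 0.0), (1.0, 5.0, 0.0), (2.0, 5.0, 0.0)],  # outlier
]

index, neighbors = most_representative(decoys, threshold=0.15)
print(f"representative decoy: {index} with {neighbors} neighbors")
```

In this toy set, the first decoy lies between the second and third, so it collects the most neighbors, while the outlier collects none. A real workflow would first superimpose the models and would typically compare backbone atoms only.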