A Simple Computational Predictor of Enzyme Function from Structure

Total Page:16

File Type:pdf, Size:1020Kb

A Simple Computational Predictor of Enzyme Function from Structure THEMATICS: A simple computational predictor of enzyme function from structure Mary Jo Ondrechen*†‡, James G. Clifton†, and Dagmar Ringe†§ *Department of Chemistry, Northeastern University, Boston, MA 02115; and †Rosenstiel Basic Medical Sciences Research Center and §Departments of Biochemistry and Chemistry, Brandeis University, Waltham, MA 02454 Communicated by Gregory A. Petsko, Brandeis University, Waltham, MA, August 18, 2001 (received for review June 19, 2001) We show that theoretical microscopic titration curves (THEMATICS) sometimes be verified by spectroscopic measurements. Also, the can be used to identify active-site residues in proteins of known pKa values of the catalytic residues can be determined indirectly structure. Results are featured for three enzymes: triosephosphate by the measured reaction rate as a function of pH. This measured isomerase (TIM), aldose reductase (AR), and phosphomannose pKa can then be compared with the computed pKa’s of the isomerase (PMI). We note that TIM and AR have similar structures residues of the protein to identify the active site. We now show but catalyze different kinds of reactions, whereas TIM and PMI that the shapes of the theoretical titration functions carry have different structures but catalyze similar reactions. Analysis of information about active-site location, even in the absence of the theoretical microscopic titration curves for all of the ionizable experimental kinetic data. residues of these proteins shows that a small fraction (3–7%) of the The titration curves of most of the ionizable residues in a curves possess a flat region where the residue is partially proton- protein have a typical shape: as pH is increased, there is a ated over a wide pH range. The preponderance of residues with sudden, sharp decline in the predicted net charge. This decline such perturbed curves occur in the active site. Additional results are occurs, as expected, in a narrow pH range around the pKa. This given in summary form to show the success of the method for behavior is well known as the source of the buffer capacity of the proteins with a variety of different chemistries and structures. ionizable groups of amino acids. Thus, most ionizable residues go from their protonated to unprotonated forms within a APPLIED here is a need to develop methods to predict protein function relatively narrow pH range, as predicted by the Henderson– MATHEMATICS Tfrom structure; this need becomes particularly acute as novel Hasselbalch equation, written as Ϫ protein folds are discovered for which there are no proteins of pH ϭ pK ϩ log͓͑A ͔͓͞HA͔͒. [1] similar structure with known function. We show that theoretical a microscopic titration curves (THEMATICS) can be used to A small fraction of the ionizable residues have curves with identify active-site residues in enzymes of known structure. This perturbed shapes that do not fit Eq. 1. information will prove to be important in the current challenging quest to predict protein function from sequence and structure. Methods BIOPHYSICS Knowledge of protein sequences and structures has burgeoned A number of methods are available for the prediction of pKa very recently as a result of genome sequencing (1) and structural values of ionizable groups in proteins based on a finite difference genomics efforts. To translate such information into tangible Poisson–Boltzmann method (12–15). This requires calculation of benefits to humankind, the next step is to develop methods that the electrostatic potentials, for which we employ the UHBD (16) enable one to predict and to establish function from structure. program. We use the program HYBRID (15, 17) to calculate the Techniques used to predict function from sequence or function mean net charge as a function of pH (18–21) for each ionizable from structure are in their infancy and rely either on analogies group (all Lys, Arg, Asp, Glu, His, Tyr, Cys, N terminus, and C to related proteins of known function (2–8) or on computational terminus) of each protein. For each residue type in a given searches for binding sites by docking (9) of selected sets of small enzyme, a plot is drawn of predicted mean net charge as a molecules onto the structure. For instance, recent work has function of pH—THEMATICS. Thus, for each residue type, we searched the Protein Data Bank (PDB; ref. 10; http:͞͞ have a plot with a family of curves: one plot with a curve for each www.rcsb.org͞pdb͞) for previously unrecognized cation-binding Lys residue, another plot with a curve for each Arg residue, etc. sites (11). We compare the curves for residues of the same type; curves with However, there is as yet no reliable method to identify active unusual shape are noted by visual inspection. sites of enzymes or other interaction sites of proteins in the In this brief report we feature the results for three represen- absence of biochemical data, even when the structure is known. tative enzymes: triosephosphate isomerase (TIM), aldose reduc- There is a wealth of biochemical data that demonstrates that tase (AR), and phosphomannose isomerase (PMI). We also give active-site residues involved in specific kinds of chemistry pos- results in summary form for eight other proteins. For all three sess predictable chemical properties that enable one to identify of the featured enzymes and for many others as well, we observe them as active-site residues. The most important of these prop- theoretical titration functions with perturbed shapes for some of erties is a perturbed pKa that can be determined experimentally the ionizable active-site residues. Other ionizable residues that by pH titrations of the activity of the enzyme. Herein we are not in the active site almost always have titration functions demonstrate that theoretical titration functions can identify that are more typical in that they fit the Henderson–Hasselbalch active-site residues that are involved in Brønsted acid–base equation. From the theoretical titration functions alone, one is chemistry for a variety of proteins, thereby identifying the active therefore able to identify the physical location of the active site site from the location of the residues. In an experimental titration curve, one typically plots pH as a function of the volume of standardized solution added. In a Abbreviations: THEMATICS, theoretical microscopic titration curves; TIM, triosephosphate isomerase; AR, aldose reductase; PMI, phosphomannose isomerase; PDB, Protein Data theoretical titration curve for a residue in a protein, one gen- Bank. erally plots the net charge on the residue (for an ensemble of ‡ To whom reprint requests should be sent at the * address. E-mail: [email protected]. protein molecules) as a function of pH. Thus, we regard pH as The publication costs of this article were defrayed in part by page charge payment. This the control variable and the charge on the group as the depen- article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. dent variable. The calculated charge on a residue in a protein can §1734 solely to indicate this fact. www.pnas.org͞cgi͞doi͞10.1073͞pnas.211436698 PNAS ͉ October 23, 2001 ͉ vol. 98 ͉ no. 22 ͉ 12473–12478 Downloaded by guest on September 24, 2021 in a variety of different protein structures. We shall also discuss the probable catalytic advantage afforded by the observed perturbations in the titration functions. Results We have obtained theoretical titration functions for all ionizable residues of enzymes of several different types and have noticed a high correlation between a particular type of perturbed shape and the location of the residue in the active site. Our results correlate strongly with biochemical results of kinetic titrations that implicate specific residues in acid–base chemistry of a catalyzed reaction. We note that two of the three featured enzymes, TIM and AR, have similar folds, the ␣͞␤-barrel (or ‘‘TIM barrel’’) structure, although they catalyze different chem- ical reactions. We note further that TIM and PMI catalyze the same kind of chemical reaction but have very different struc- tures. These three proteins have no significant sequence identity; alignment scores obtained with the program CLUSTALW (ref. 22; http:͞͞www.ebi.ac.uk͞clustalw) for the three possible pairings of the three sequences used herein were 9%, 6%, and 11% for TIM-AR, AR-PMI, and TIM-PMI, respectively. These values are considered unacceptable for identification of structure from sequence similarity. TIM. TIM (23–28) catalyzes the isomerization of D-glyceralde- hyde 3-phosphate to dihydroxyacetone phosphate, a key reaction in the glycolytic pathway. Calculations on TIM from Gallus gallus (chicken) were performed by using the 1.80-Å resolution struc- ture 1TPH (10, 23) of the TIM-phosphoglycolohydroxamate complex. Theoretical titration functions were obtained for the biologically active dimer from which the coordinates for the inhibitor were removed. Our observations of the theoretical titration functions for all of the ionizable residues of TIM find that four residues have curves with highly perturbed shapes: His-95, Glu-165, Lys-112, and Tyr-164. Fig. 1A shows the predicted mean net charge as a function of pH for the eight histidine residues in the A chain. Predicted titration functions for the B chain (not shown) are similar to the A chain. Fig. 2a shows the location of His-95 and Glu-165 in the active site of TIM. First, one notes that the theoretical titration functions for most of the histidine residues depicted in Fig. 1A have the typical Fig. 1. Sample theoretical titration curves. Predicted mean net charge as a shape. However we observe that His-95 has a strikingly different function of pH. (A) All of the histidine residues in the A chain of TIM: His-26 (ϩ), predicted titration function. Not only is the pKa of the conjugate His-95 (ϫ), His-100 (✳), His-115 (ᮀ), His-185 (■), His-195 (E), His-224 (F), and acid downshifted to 3.2, but also the shape of the curve is His-248 (‚).
Recommended publications
  • Computational Protein Function Predictions
    Methods 93 (2016) 1–2 Contents lists available at ScienceDirect Methods journal homepage: www.elsevier.com/locate/ymeth Guest Editor’s Introduction Computational protein function predictions What is the role of this protein? Where does this protein local- in CATH-Gene3D using a hidden Markov model and thereby ize in a cell? Are there any ligands that bind to this protein? If so, predict function of the query. what are they? Which residues constitute a functional site of the The next three methods [4–6] take a protein tertiary structure protein? These questions, which in a broader sense seek the biolog- as input and predict binding ligands for the query protein. Binding ical function of a protein, are fundamental and central in modern ligand prediction is not only for function prediction of proteins but biology. Ultimately, the biological function of a protein needs to it is also useful for drug design. The method developed by be determined by experiments. However, a hypothesis is needed Nakamura and Tomii represents a binding pocket by an ensemble to design an assay because it determines whether a target protein of all triangles consisting of three Ca atoms in the pocket [4]. The has a particular function or not. Biologists come up with hypothe- triangles are classified into 171,700 types considering amino acid ses of protein function from circumstantial evidence, and computa- types of vertices and edge distances. Thus, a pocket is represented tional function prediction can play an important part. by a vector that shows the frequency of each triangle type in the Computational function prediction methods are also useful for pocket, which is further reduced to a vector of 11 elements by analyzing protein function in a proteomic scale since methods multi-dimensional scaling.
    [Show full text]
  • Bioorganic & Medicinal Chemistry, Volume 11, Dec, 2003
    MMCC Results Volume 14 No. 3 1 April 2005 Vol. 14, No. 3 1 April 2005 Editorial and News A groundbreaker in membrane biophysics was Coverage period: 1 Mar. 2005 through 1 April 2005 published by Gunther Peters and colleagues this month in About 253 papers from 13 journals are cited. J. Chem. Physics (see Methodology, Molecular Dynamics). It deals with a problem in membrane 1. APPLICATIONS (105) . page 2 biophysics that has achieved nearly universal interest in the 1.1 Small Molecules – 29 past year or so, the computation of the lateral pressure profile in the lipid bilayer. The reason this is so interesting Water and Solvation – 6 QSAR — 12 Med Chem & Drug Design - 9 Carbon Nanoparticles - 2 is the heterogeneous and variable shapes of membrane proteins. Although we still don’t know much about them, 1.2 Biopolymers – 73 membrane NMR and spectroscopic studies of membrane Bioinformatics – 6 Protein Dynamics – 5 proteins are starting to reveal small but significant Protein Seq Anal and Align – 3 Ligand Binding – 4 variations in countour at their interfaces with lipid Protein Structure Prediction – 2 Enzyme Catalysis – 8 Comp and Homol Modeling – 3 Protein-Protein Inter. – 2 membranes and large conformational changes during Peptide Conform Anal – 1 Membrane Proteins - 9 function. Although most membrane proteins are essentially Protein Structure Anal – 8 Protein Nuc Acid Inter - 1 Protein Folding – 6 Nucleic Acids – 4 right cylinders (either β-barrels or helix bundles) with axes Protein Design& Engineering – 2 Lipids and Surfactants - 4 parallel to the membrane normal, border helices or strands Protein Hydration – 3 Carbohydrates - 1 can sometimes have tilts, kinks, or extrusions that require Protein Electr and Titr – 1 accommodation by bilayer lipids.
    [Show full text]
  • Application of Molecular Modeling to Drug Discovery and Functional Genomics
    APPLICATION OF MOLECULAR MODELING TO DRUG DISCOVERY AND FUNCTIONAL GENOMICS A Dissertation Presented By Zhouxi (Josie) Wang To The Department of Chemistry and Chemical Biology in partial fulfillment of the requirements For the degree of Doctor of Philosophy in the field of Chemistry Northeastern University Boston, Massachusetts May, 2012 1 © 2012 Zhouxi Wang ALL RIGHTS RESERVED 2 APPLICATION OF MOLECULAR MODELING TO DRUG DISCOVERY AND FUNCTIONAL GENOMICS By Zhouxi (Josie) Wang ABSTRACT OF DISSERTATION Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Chemistry in the Department of Chemistry and Chemical Biology in the Graduate School of Northeastern University, Boston, Massachusetts May 2012 3 ABSTRACT Molecular modeling can accelerate and guide drug design and contribute to the understanding of the biochemical functions of gene products. This thesis applies molecular modeling to facilitate the drug design for human African trypanosomiasis and develops a new modeling technique for protein biochemical function annotations. A special technique employed in this work is the prediction of the individual amino acids in protein structures that are involved in ligand interactions; these predicted local interaction sites are used for drug discovery and for the prediction of the biochemical function of proteins of unknown function. Chapter 2 applies molecular modeling to the structure based drug design for human African trypanosomiasis (HAT) at the Aurora kinase -1 target. HAT is a vector borne disease caused by several species of trypanosomes, affecting thousands of people every year. This disease is fatal if untreated. Current therapeutic interventions are unsatisfactory, all with limited efficacy or life-threatening side effects.
    [Show full text]
  • Specificity of DNA Damage Inducible DNA Polymerase IV from Escherichia Coli
    Specificity of DNA Damage Inducible DNA Polymerase IV from Escherichia coli A thesis presented by Jason M. Walsh to The Department of Chemistry and Chemical Biology in partial fulfillment of the requirements for the degree of Master of Science in the field of Chemistry Northeastern University Boston, Massachusetts December 2009 1 © 2009, Jason M Walsh All rights reserved. 2 Specificity of DNA Damage Inducible DNA Polymerase IV from Escherichia coli by Jason M. Walsh ABSTRACT OF THESIS Submitted in partial fulfillment of the requirements for the degree of Master of Science in Chemistry and Chemical Biology in the Graduate School of Arts and Sciences of Northeastern University, December 2009 3 ABSTRACT DNA polymerases are responsible for DNA replication during cell division. There are multiple families of polymerases (A, B, C, D, X) responsible for copying DNA during replication and repair. There is also a class of polymerases conserved throughout evolution, known as the Y family polymerases, that have reduced replication fidelity on undamaged DNA (Tang et al. 2000). However Y family DNA polymerases have the specialized property of replicating DNA by copying damaged DNA, a process known as translesion synthesis (TLS). Structural differences between Y family and replicative polymerases may account for the difference in enzymatic activity. However we demonstrate that the Klenow fragment (A family) can bypass a fluorescent cytosine analog known as 1, 3-diaza-2-oxophenothiazine (tC), that DinB, a Y family polymerase, cannot bypass. We show that DinB inserts dGTP faithfully, but cannot extend the DNA primer beyond that. Verifying which amino acid residues are responsible for both function and specificity of the Y family polymerases is accomplished by assessing the kinetic data of nucleotide incorporation events of DinB variants as compared to wild-type DinB.
    [Show full text]
  • Identification of Functional Subclasses in the DJ-1 Superfamily Proteins
    University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln Biochemistry -- Faculty Publications Biochemistry, Department of 1-2007 Identification of unctionalF Subclasses in the DJ-1 Superfamily Proteins Ying Wei Northeastern University, Dagmar Ringe University of Nebraska - Lincoln Mark A. Wilson Brandeis University, [email protected] Mary Jo Ondrechen Northeastern University, Follow this and additional works at: https://digitalcommons.unl.edu/biochemfacpub Part of the Biochemistry, Biophysics, and Structural Biology Commons Wei, Ying; Ringe, Dagmar; Wilson, Mark A.; and Ondrechen, Mary Jo, "Identification of unctionalF Subclasses in the DJ-1 Superfamily Proteins" (2007). Biochemistry -- Faculty Publications. 25. https://digitalcommons.unl.edu/biochemfacpub/25 This Article is brought to you for free and open access by the Biochemistry, Department of at DigitalCommons@University of Nebraska - Lincoln. It has been accepted for inclusion in Biochemistry -- Faculty Publications by an authorized administrator of DigitalCommons@University of Nebraska - Lincoln. Identification of Functional Subclasses in the DJ-1 Superfamily Proteins Ying Wei1,2, Dagmar Ringe3,4,5*, Mark A. Wilson3,4,5¤, Mary Jo Ondrechen1,2 1 Department of Chemistry and Chemical Biology, Northeastern University, Boston, Massachusetts, United States of America, 2 Institute for Complex Scientific Software, Northeastern University, Boston, Massachusetts, United States of America, 3 Department of Biochemistry, Brandeis University, Waltham, Massachusetts, United States of America, 4 Department of Chemistry, Brandeis University, Waltham, Massachusetts, United States of America, 5 Rosenstiel Basic Medical Sciences Research Center, Brandeis University, Waltham, Massachusetts, United States of America Genomics has posed the challenge of determination of protein function from sequence and/or 3-D structure. Functional assignment from sequence relationships can be misleading, and structural similarity does not necessarily imply functional similarity.
    [Show full text]
  • Che Newsletter
    SUMMER 2020 DIVISION OF CHEMISTRY CHE NEWSLETTER INSIDE CHE FUNDED RESEARCH 2 Update from the Division Director HIGHLIGHT Making single molecule electronic components, like rectifiers or switches, is 3 Chemistry Staffing News and Recruiting an ultimate goal for fabricating electronics a. Special Thank-You to Carol Bessel in the nano realm. Traditionally, single b. Congratulations to Dr. Evelyn Goldfield This work was published molecule junctions are fabricated by on her Retirement in Angewandte Chemie anchoring the molecule between two metal (https://doi.org/10.1002/ c. Welcome the new Rotators and Farewell electrodes and charge transport is measured to the departing Rotators anie.201903898, CHE-1508567). only along a single axis. When studying 8 Recruiting Rotators: Would you like to join the charge conduction through benzene the NSF team as a rotator? derivatives, Eric Borguet’s group at Temple University discovered that the geometry of a molecule between the electrodes has a significant impact 8 CHE COVID-19 Effort on the charge transport through that molecule. Harnessing that anisotropic effect in trimesic acid, Borguet’s group developed a single‐molecule 10 CHE Virtual Office Hours electromechanical switch. Using an applied voltage at the electrode, they controlled the geometry of the molecule and found that a planar orientation 11 Funding Opportunities gave rise to a conductance that is more than 400 times higher than that of a. Division of Chemistry’s updated Disciplinary Research Programs Solicitation (CHE-DRP) an upright molecule. b. MPS Alliances for Graduate Education and the Professoriate (AGEP) – Graduate Research Supplements (GRS) Dear Colleague Letter 12 Chemistry Outreach in 2020 POLARITON CHEMISTRY: 12 2021 NSF Chemistry Early Career Investigator Workshop EXPLOITING CONFINED LIGHT 12 Chemistry Highlights FOR CHEMICAL Figure from J.B.
    [Show full text]
  • Selective Prediction of Interaction Sites in Protein Structures with THEMATICS Ying Wei1, Jaeju Ko1,2, Leonel F Murga1,3 and Mary Jo Ondrechen*1
    BMC Bioinformatics BioMed Central Research article Open Access Selective prediction of interaction sites in protein structures with THEMATICS Ying Wei1, Jaeju Ko1,2, Leonel F Murga1,3 and Mary Jo Ondrechen*1 Address: 1Department of Chemistry and Chemical Biology and Institute for Complex Scientific Software, Northeastern University, Boston, Massachusetts 02115 USA, 2NSF-ROA awardee. Department of Chemistry, Indiana University of Pennsylvania, 975 Oakland Avenue, Indiana, Pennsylvania 15705 USA and 3Rosenstiel Basic Medical Sciences Center, Brandeis University, Waltham, MA 02454 USA Email: Ying Wei - [email protected]; Jaeju Ko - [email protected]; Leonel F Murga - [email protected]; Mary Jo Ondrechen* - [email protected] * Corresponding author Published: 9 April 2007 Received: 1 November 2006 Accepted: 9 April 2007 BMC Bioinformatics 2007, 8:119 doi:10.1186/1471-2105-8-119 This article is available from: http://www.biomedcentral.com/1471-2105/8/119 © 2007 Wei et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Abstract Background: Methods are now available for the prediction of interaction sites in protein 3D structures. While many of these methods report high success rates for site prediction, often these predictions are not very selective and have low precision. Precision in site prediction is addressed using Theoretical Microscopic Titration Curves (THEMATICS), a simple computational method for the identification of active sites in enzymes. Recall and precision are measured and compared with other methods for the prediction of catalytic sites.
    [Show full text]
  • Using Computation to Identify Catalytic Features of Enzymes And
    Graduate Category: Physical and Life Sciences Degree Seeking: PhD Abstract ID# 2277 Using Computation to Identify Catalytic Features of Enzymes and to POOL Server: www.pool.neu.edu Engineer Enzymes with Enhanced or Altered Function DNA Lab Site: www.dna.neu.edu 205-R2-CX-0011 Timothy A. Coulther, Israel Adam, Penny J. Beuning, Mary Jo Ondrechen (TAC) NSF MCB-1158176 ORG Site: nuweb15.neu.edu/org NSF-MCB-1517290 Retro-aldolases: de novo Engineering a replicative polymerase for accurate damage bypass The catalytic properties of enzymes make them useful for wide application, THEoretical Microscopic Anomalous designed and evolved in Identify mutations that promote the accurate lesion conformation while such as in medicine, biofuel production, or sample analysis. However, we are TItration Curve Shapes1 laboratories3,4,5,6,7 retaining catalytic properties currently restricted to the enzymes that nature has to offer, limiting their Typical H-H Accurate Mutagenic applicability. Unfortunately, most initial enzyme design attempts Curve Identification of catalytic residues G stays G G becomes T have resulted in very low activities or no activity at all. Long-term directed evolution computationally for understanding activity and further engineering experiments have increased enzyme activities, but these require Approach years, high cost, and sometimes sophisticated equipment, that limit their development and use. A major issue is the sheer size of mutational space, as there Perturbed H-H 48 Charge are over 1.0x10 possibilities for a protein with 200 residues. The goal of our 4 Curves research is to use computation to reduce the number of possibilities, identifying a smaller number of potential designs that are more likely to be successful.
    [Show full text]
  • Ronald E. Hatcher Science on Saturday Lecture Series 24 January 2015
    Ronald E. Hatcher Science on Saturday Lecture Series 24 January 2015 Using Physics to Understand the Genome Mary Jo Ondrechen Department of Chemistry & Chemical Biology Northeastern University, Boston, MA 02115 USA ABSTRACT: The sequencing of the genome, the complete DNA, for humans and for thousands of other species at the dawn of the new millennium brought promises of untold advances in medicine and of many other novel technologies of benefit to the world. However, big questions remain about the roles played by these DNA sequences, including many of the protein-coding genes. Structural Genomics (SG) projects have since determined the 3D structures of over 13,000 gene products, but most of them are of unknown or uncertain function. In order to transform the promises of genome science into useful and practical advances, we must understand the function of these protein gene products. Therefore the functional classification of protein 3D structures is an important current problem. A new approach to this problem is described based primarily on chemical and electrostatic properties computed using techniques from chemical physics. Our function prediction method utilizes electrical potential functions that we calculate from the protein 3D structure. These potential functions are in turn used to predict the amino acids that are involved directly in the protein’s function. For proteins of known function, these sets of amino acids are used to establish “chemical signatures” for each functional type. Then, we predict the sets of active amino acids for protein structures of unknown function and, using matching with the chemical signatures, seek a functional assignment for the proteins of unknown function.
    [Show full text]
  • Machine Learning Differentiates Enzymatic and Non- Enzymatic Metals in Proteins ✉ Ryan Feehan 1,3, Meghan W
    ARTICLE https://doi.org/10.1038/s41467-021-24070-3 OPEN Machine learning differentiates enzymatic and non- enzymatic metals in proteins ✉ Ryan Feehan 1,3, Meghan W. Franklin1,3 & Joanna S. G. Slusky 1,2 Metalloenzymes are 40% of all enzymes and can perform all seven classes of enzyme reactions. Because of the physicochemical similarities between the active sites of metal- loenzymes and inactive metal binding sites, it is challenging to differentiate between them. fi 1234567890():,; Yet distinguishing these two classes is critical for the identi cation of both native and designed enzymes. Because of similarities between catalytic and non-catalytic metal binding sites, finding physicochemical features that distinguish these two types of metal sites can indicate aspects that are critical to enzyme function. In this work, we develop the largest structural dataset of enzymatic and non-enzymatic metalloprotein sites to date. We then use a decision-tree ensemble machine learning model to classify metals bound to proteins as enzymatic or non-enzymatic with 92.2% precision and 90.1% recall. Our model scores electrostatic and pocket lining features as more important than pocket volume, despite the fact that volume is the most quantitatively different feature between enzyme and non- enzymatic sites. Finally, we find our model has overall better performance in a side-to-side comparison against other methods that differentiate enzymatic from non-enzymatic sequences. We anticipate that our model’s ability to correctly identify which metal sites are responsible for enzymatic activity could enable identification of new enzymatic mechanisms and de novo enzyme design. 1 Center for Computational Biology, The University of Kansas, Lawrence, KS, USA.
    [Show full text]
  • THEMATICS: a Simple Computational Predictor of Enzyme Function from Structure
    THEMATICS: A simple computational predictor of enzyme function from structure Mary Jo Ondrechen*†‡, James G. Clifton†, and Dagmar Ringe†§ *Department of Chemistry, Northeastern University, Boston, MA 02115; and †Rosenstiel Basic Medical Sciences Research Center and §Departments of Biochemistry and Chemistry, Brandeis University, Waltham, MA 02454 Communicated by Gregory A. Petsko, Brandeis University, Waltham, MA, August 18, 2001 (received for review June 19, 2001) We show that theoretical microscopic titration curves (THEMATICS) sometimes be verified by spectroscopic measurements. Also, the can be used to identify active-site residues in proteins of known pKa values of the catalytic residues can be determined indirectly structure. Results are featured for three enzymes: triosephosphate by the measured reaction rate as a function of pH. This measured isomerase (TIM), aldose reductase (AR), and phosphomannose pKa can then be compared with the computed pKa’s of the isomerase (PMI). We note that TIM and AR have similar structures residues of the protein to identify the active site. We now show but catalyze different kinds of reactions, whereas TIM and PMI that the shapes of the theoretical titration functions carry have different structures but catalyze similar reactions. Analysis of information about active-site location, even in the absence of the theoretical microscopic titration curves for all of the ionizable experimental kinetic data. residues of these proteins shows that a small fraction (3–7%) of the The titration curves of most of the ionizable residues in a curves possess a flat region where the residue is partially proton- protein have a typical shape: as pH is increased, there is a ated over a wide pH range.
    [Show full text]
  • Form and Function in Enzyme Activity 6 April 2012, by Angela Herring
    Form and function in enzyme activity 6 April 2012, By Angela Herring are structurally important while others are required for the enzyme's catalytic properties. "There are cavities on the surface of a protein where a molecule can come in and sit down," Ondrechen said. "The enzyme does a reaction on it and the product goes away." The current body of research on enzyme activity mostly focuses on the amino acids in that cavity, which come into direct contact with the reactive molecule. But over the years, some research has Mary Jo Ondrechen and Penny Beuning recently suggested that amino acids far away from the received an NSF grant to study enzyme activity in the active site also play a role in catalysis. lab and computationally. Credit: Dominick Reuter Ondrechen's team, using a method she developed 10 years ago, will be able to predict which remote amino acids will impact reactivity. Beuning's team Many industrial chemistry applications, such as will test these predictions experimentally. drug or biofuel synthesis, require large energy inputs and often produce toxic pollutants. But "My lab is really interested in specificity of chemistry and chemical biology professor Mary Jo enzymes," Beuning said. "We look enzymes and Ondrechen said enzymes - proteins that increase figure out how they recognize their substrates." the rate of chemical reactions in the body - could be used to effectively replace standard industrial To do this, her team takes a protein engineering processes. approach in which they manipulate the enzyme's composition and observe how it affects its function. "Enzymatic reactions are cleaner, produce fewer byproducts and use less energy," she explained.
    [Show full text]