A Simple Computational Predictor of Enzyme Function from Structure

THEMATICS: A simple computational predictor of enzyme function from structure Mary Jo Ondrechen*†‡, James G. Clifton†, and Dagmar Ringe†§ *Department of Chemistry, Northeastern University, Boston, MA 02115; and †Rosenstiel Basic Medical Sciences Research Center and §Departments of Biochemistry and Chemistry, Brandeis University, Waltham, MA 02454 Communicated by Gregory A. Petsko, Brandeis University, Waltham, MA, August 18, 2001 (received for review June 19, 2001) We show that theoretical microscopic titration curves (THEMATICS) sometimes be verified by spectroscopic measurements. Also, the can be used to identify active-site residues in proteins of known pKa values of the catalytic residues can be determined indirectly structure. Results are featured for three enzymes: triosephosphate by the measured reaction rate as a function of pH. This measured isomerase (TIM), aldose reductase (AR), and phosphomannose pKa can then be compared with the computed pKa’s of the isomerase (PMI). We note that TIM and AR have similar structures residues of the protein to identify the active site. We now show but catalyze different kinds of reactions, whereas TIM and PMI that the shapes of the theoretical titration functions carry have different structures but catalyze similar reactions. Analysis of information about active-site location, even in the absence of the theoretical microscopic titration curves for all of the ionizable experimental kinetic data. residues of these proteins shows that a small fraction (3–7%) of the The titration curves of most of the ionizable residues in a curves possess a ﬂat region where the residue is partially proton- protein have a typical shape: as pH is increased, there is a ated over a wide pH range. The preponderance of residues with sudden, sharp decline in the predicted net charge. This decline such perturbed curves occur in the active site. Additional results are occurs, as expected, in a narrow pH range around the pKa. This given in summary form to show the success of the method for behavior is well known as the source of the buffer capacity of the proteins with a variety of different chemistries and structures. ionizable groups of amino acids. Thus, most ionizable residues go from their protonated to unprotonated forms within a APPLIED here is a need to develop methods to predict protein function relatively narrow pH range, as predicted by the Henderson– MATHEMATICS Tfrom structure; this need becomes particularly acute as novel Hasselbalch equation, written as Ϫ protein folds are discovered for which there are no proteins of pH ϭ pK ϩ log͓͑A ͔͓͞HA͔͒. [1] similar structure with known function. We show that theoretical a microscopic titration curves (THEMATICS) can be used to A small fraction of the ionizable residues have curves with identify active-site residues in enzymes of known structure. This perturbed shapes that do not fit Eq. 1. information will prove to be important in the current challenging quest to predict protein function from sequence and structure. Methods BIOPHYSICS Knowledge of protein sequences and structures has burgeoned A number of methods are available for the prediction of pKa very recently as a result of genome sequencing (1) and structural values of ionizable groups in proteins based on a finite difference genomics efforts. To translate such information into tangible Poisson–Boltzmann method (12–15). This requires calculation of benefits to humankind, the next step is to develop methods that the electrostatic potentials, for which we employ the UHBD (16) enable one to predict and to establish function from structure. program. We use the program HYBRID (15, 17) to calculate the Techniques used to predict function from sequence or function mean net charge as a function of pH (18–21) for each ionizable from structure are in their infancy and rely either on analogies group (all Lys, Arg, Asp, Glu, His, Tyr, Cys, N terminus, and C to related proteins of known function (2–8) or on computational terminus) of each protein. For each residue type in a given searches for binding sites by docking (9) of selected sets of small enzyme, a plot is drawn of predicted mean net charge as a molecules onto the structure. For instance, recent work has function of pH—THEMATICS. Thus, for each residue type, we searched the Protein Data Bank (PDB; ref. 10; http:͞͞ have a plot with a family of curves: one plot with a curve for each www.rcsb.org͞pdb͞) for previously unrecognized cation-binding Lys residue, another plot with a curve for each Arg residue, etc. sites (11). We compare the curves for residues of the same type; curves with However, there is as yet no reliable method to identify active unusual shape are noted by visual inspection. sites of enzymes or other interaction sites of proteins in the In this brief report we feature the results for three represen- absence of biochemical data, even when the structure is known. tative enzymes: triosephosphate isomerase (TIM), aldose reduc- There is a wealth of biochemical data that demonstrates that tase (AR), and phosphomannose isomerase (PMI). We also give active-site residues involved in specific kinds of chemistry pos- results in summary form for eight other proteins. For all three sess predictable chemical properties that enable one to identify of the featured enzymes and for many others as well, we observe them as active-site residues. The most important of these prop- theoretical titration functions with perturbed shapes for some of erties is a perturbed pKa that can be determined experimentally the ionizable active-site residues. Other ionizable residues that by pH titrations of the activity of the enzyme. Herein we are not in the active site almost always have titration functions demonstrate that theoretical titration functions can identify that are more typical in that they fit the Henderson–Hasselbalch active-site residues that are involved in Brønsted acid–base equation. From the theoretical titration functions alone, one is chemistry for a variety of proteins, thereby identifying the active therefore able to identify the physical location of the active site site from the location of the residues. In an experimental titration curve, one typically plots pH as a function of the volume of standardized solution added. In a Abbreviations: THEMATICS, theoretical microscopic titration curves; TIM, triosephosphate isomerase; AR, aldose reductase; PMI, phosphomannose isomerase; PDB, Protein Data theoretical titration curve for a residue in a protein, one gen- Bank. erally plots the net charge on the residue (for an ensemble of ‡ To whom reprint requests should be sent at the * address. E-mail: [email protected]. protein molecules) as a function of pH. Thus, we regard pH as The publication costs of this article were defrayed in part by page charge payment. This the control variable and the charge on the group as the depen- article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. dent variable. The calculated charge on a residue in a protein can §1734 solely to indicate this fact. www.pnas.org͞cgi͞doi͞10.1073͞pnas.211436698 PNAS ͉ October 23, 2001 ͉ vol. 98 ͉ no. 22 ͉ 12473–12478 Downloaded by guest on September 24, 2021 in a variety of different protein structures. We shall also discuss the probable catalytic advantage afforded by the observed perturbations in the titration functions. Results We have obtained theoretical titration functions for all ionizable residues of enzymes of several different types and have noticed a high correlation between a particular type of perturbed shape and the location of the residue in the active site. Our results correlate strongly with biochemical results of kinetic titrations that implicate specific residues in acid–base chemistry of a catalyzed reaction. We note that two of the three featured enzymes, TIM and AR, have similar folds, the ␣͞␤-barrel (or ‘‘TIM barrel’’) structure, although they catalyze different chemical reactions. We note further that TIM and PMI catalyze the same kind of chemical reaction but have very different structures. These three proteins have no significant sequence identity; alignment scores obtained with the program CLUSTALW (ref. 22; http:͞͞www.ebi.ac.uk͞clustalw) for the three possible pairings of the three sequences used herein were 9%, 6%, and 11% for TIM-AR, AR-PMI, and TIM-PMI, respectively. These values are considered unacceptable for identification of structure from sequence similarity. TIM. TIM (23–28) catalyzes the isomerization of D-glyceralde- hyde 3-phosphate to dihydroxyacetone phosphate, a key reaction in the glycolytic pathway. Calculations on TIM from Gallus gallus (chicken) were performed by using the 1.80-Å resolution structure 1TPH (10, 23) of the TIM-phosphoglycolohydroxamate complex. Theoretical titration functions were obtained for the biologically active dimer from which the coordinates for the inhibitor were removed. Our observations of the theoretical titration functions for all of the ionizable residues of TIM find that four residues have curves with highly perturbed shapes: His-95, Glu-165, Lys-112, and Tyr-164. Fig. 1A shows the predicted mean net charge as a function of pH for the eight histidine residues in the A chain. Predicted titration functions for the B chain (not shown) are similar to the A chain. Fig. 2a shows the location of His-95 and Glu-165 in the active site of TIM. First, one notes that the theoretical titration functions for most of the histidine residues depicted in Fig. 1A have the typical Fig. 1. Sample theoretical titration curves. Predicted mean net charge as a shape. However we observe that His-95 has a strikingly different function of pH. (A) All of the histidine residues in the A chain of TIM: His-26 (ϩ), predicted titration function. Not only is the pKa of the conjugate His-95 (ϫ), His-100 (✳), His-115 (ᮀ), His-185 (■), His-195 (E), His-224 (F), and acid downshifted to 3.2, but also the shape of the curve is His-248 (‚).

A Simple Computational Predictor of Enzyme Function from Structure

Computational Protein Function Predictions

Bioorganic & Medicinal Chemistry, Volume 11, Dec, 2003

Application of Molecular Modeling to Drug Discovery and Functional Genomics

Specificity of DNA Damage Inducible DNA Polymerase IV from Escherichia Coli

Identification of Functional Subclasses in the DJ-1 Superfamily Proteins

Che Newsletter

Selective Prediction of Interaction Sites in Protein Structures with THEMATICS Ying Wei1, Jaeju Ko1,2, Leonel F Murga1,3 and Mary Jo Ondrechen*1

Using Computation to Identify Catalytic Features of Enzymes And

Ronald E. Hatcher Science on Saturday Lecture Series 24 January 2015

Machine Learning Differentiates Enzymatic and Non- Enzymatic Metals in Proteins ✉ Ryan Feehan 1,3, Meghan W

THEMATICS: a Simple Computational Predictor of Enzyme Function from Structure

Form and Function in Enzyme Activity 6 April 2012, by Angela Herring