Comparison of Two Optimization Methods to Derive Energy Parameters for Protein Folding: Perceptron and Z Score
Total Page:16
File Type:pdf, Size:1020Kb
PROTEINS: Structure, Function, and Genetics 41:192–201 (2000) Comparison of Two Optimization Methods to Derive Energy Parameters for Protein Folding: Perceptron and Z Score Michele Vendruscolo,1* Leonid A. Mirny,2 Eugene I. Shakhnovich,2 and Eytan Domany3 1Oxford Centre for Molecular Sciences, New Chemistry Laboratory, Oxford, United Kingdom 2Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 3Department of Physics of Complex Systems, Weizmann Institute of Science, Rehovot, Israel ABSTRACT Two methods were proposed re- Given a set of proteins whose native structure is known, cently to derive energy parameters from known experimentally or otherwise, the set w must stabilize native protein conformations and corresponding these structures against all the possible alternatives. sets of decoys. One is based on finding, by means of a Optimization of the stability can be realized using perceptron learning scheme, energy parameters different methods.2–9 In this work, we compare two re- such that the native conformations have lower ener- cently proposed approaches to this problem, the one by gies than the decoys. The second method maximizes Mirny and Shakhnovich (MS)10 and that of Vendruscolo the difference between the native energy and the and Domany (VD).1 average energy of the decoys, measured in terms of The purpose of the present study is to compare the the width of the decoys’ energy distribution (Z- merits and possible shortcomings of the two methods, score). Whereas the perceptron method is sensitive when applied to realistic situations and data, involving mainly to “outlier” (i.e., extremal) decoys, the Z- either real or artificial model proteins. score optimization is governed by the high density In the Z score method, one determines the energy regions in decoy-space. We compare the two meth- parameters by optimizing the gap between the native state ods by deriving contact energies for two very differ- and the average energy of alternative conformations, ent sets of decoys: the first obtained for model measured in units of standard deviations of the energy lattice proteins and the second by threading. We distribution. Imagine that a set of parameters with a very find that the potentials derived by the two methods low Z-score has been found. If the number of decoys is are of similar quality and fairly closely related. This large, the Z-score will not be affected by a small number of finding indicates that standard, naturally occurring “outlier” configurations. If we now add a few conformations sets of decoys are distributed in a way that yields whose energy is below that of the native structure, this will robust energy parameters (that are quite insensi- not affect in a significant way the average energy and, tive to the particular method used to derive them). hence, the Z-score. Therefore, for this new set of decoys the The main practical implication of this finding is that parameters that optimize the Z-score do not assign the it is not necessary to fine-tune the potential search lowest energy to the native state. We can easily create method to the particular set of decoys used. Proteins such a situation; it is not clear at all, however, whether 2000;41:192–201. © 2000 Wiley-Liss, Inc. such a mishap will or will not occur for a routinely obtained set of decoys. Key words: protein folding; contact maps; protein The perceptron method is aimed at enforcing the condi- potential; perceptron; Z-score tion Ͻ INTRODUCTION E0 E (2) To perform protein folding one assigns an energy E to a for one or more proteins. Here E0 is the energy of the protein sequence in a given conformation. One of the native state and E ( ϭ 1,...P) are the energies of P simplest approximations to the true energy is the pairwise alternative conformations. This is a necessary condition contact approximation for any energy function to be used for protein folding by energy minimization. When there exists a set of contact N energy parameters w for which (2) holds for all ,wesay true͑ ͒ Ӎ pair͑ ͒ ϭ ͑ ͒ E a, S E a, S, w Sij w ai, aj , (1) i Ͻ j Grant sponsor: US-Israel Binational Science Foundation (BSF); Grant sponsor: Germany-Israel Science Foundation (GIF); Grant where we denoted by a the sequence of amino acids, by S sponsor: Minerva Foundation; Grant sponsor: European Molecular the conformation (represented by its contact map,1 and by Biology Organization (EMBO); Grant sponsor: National Institute of Health (NIH); Grant number: GM52126. w the set of energy parameters. If there is a contact ϭ *Correspondence to: Michele Vendruscolo, Oxford Centre for Molecu- between residues i and j, then Sij 1 and the parameter lar Sciences, New Chemistry Laboratory, University of Oxford, South w(ai, aj), which represents the energy gained by bringing Parks Road, OX1 3QT Oxford UK. E-mail: [email protected] amino acids ai and aj in contact, is added to the energy. Received 28 February 2000; Accepted 27 June 2000 © 2000 WILEY-LISS, INC. COMPARISON OF TWO OPTIMIZATION METHODS 193 that the problem is learnable. For a learnable problem, N however, the solution is not unique: there is a region ͗ ͘ ϭ ͗ ͘ ͑ ͒ E Sij w ai, aj (5) (called “version space”) in energy parameter space, whose i Ͻ j points satisfy the P inequalities (2). One can identify a subset of conformations that are of low energy for at least N N 2͑ ͒ ϭ ͑ ͒ ͑ ͒ ͑ ͒ some of the points in version space. The result obtained by E cov Sij, Skl w ai, aj w ak, al , (6) perceptron learning is sensitive only to such a (possibly i Ͻ j k Ͻ l very small) subset of conformations. Using a different set ͗ ͘ of conformations, even generated in the same way may, in where Sij is the frequency of a contact between residues i ϭ ͗ ͘ Ϫ ͗ ͗͘ ͘ principle, change the solution considerably. and j in the decoys and cov(Sij, Skl) Sij Skl Sij Skl Hence the perceptron solution may be influenced very is covariance of contacts between i, j and k, l. For a given ͗ ͘ strongly by a few low-energy “outlier” conformations. set of decoys, one can easily compute Sij and cov(Sij, Skl). Therefore, it is possible to create a situation in which the Importantly, the Z-score method allows derivation of a energy parameters obtained by perceptron learning do potential using no explicit decoys. Assuming a certain form ͗ ͘ satisfy (2) and stabilize the native fold but, at the same of the distribution of contacts in the decoys Sij and their time, yield a relatively high value for the Z-score. Again we correlations cov(Sij, Skl) one can compute the Z-score and wish to find out whether for realistic decoys such a optimize a potential against these implicit decoys. situation will or will not actually occur. However, when the Z-score method is compared with the perceptron learning (see below) a set of actual decoys is ͗ ͘ OPTIMIZATION METHODS always present. In this case both Sij and cov(Sij, Skl) are Z-Score Method computed explicitly using these decoys. MS presented a method to derive a potential based on Optimization of Potential the optimization of the Z-score. The Z-score is defined by A potential w is obtained by minimization of the Z- scores simultaneously for all proteins in a database. As E Ϫ ͗E͘ explained above, this is achieved by using a harmonic ϭ 0 Z (3) mean of individual Z-scores as a function to be minimized M where E is the energy of the native state, and ͗E͘ and 0 ͗Z͘harm ϭ (7) are, respectively, the mean and the standard deviation of M the energy distribution. 1/Zm The procedure that they used to recover the true poten- m ϭ 1 tial worked by optimizing the Z-score simultaneously for At each step of the Monte Carlo procedure, an element of all the sequences as a function of the energy parameters w. w is chosen at random and a small random number ⑀ ʦ Using a Monte Carlo in parameter space they minimized [Ϫ0.1, 0.1] is added to it. This change is accepted or the harmonic mean of the Z-scores ͗ ͘ rejected according to the associated change in Z harm and the Metropolis criterion with algorithmic temperature T.11 M At low temperature T, the procedure rapidly converges to ͗Z͘harm ϭ (4) M the low values of ͗Z͘ . harm 1/Zm To assess the quality of obtained potential wopt one m ϭ 1 needs to compare the energy of the native conformation E0(wopt) with the energy of each decoy E(wopt). If the The procedure that they used to recover the true poten- actual decoys are present the procedure is straightfor- tial worked by optimizing the mean harmonic Z-score ward. When potential is obtained using implicit decoys Ͻ simultaneously for all proteins in the database as a (see above), one cannot check whether E0 E for all . function of the energy parameters w. The reason for However, it is possible to estimate whether E0 is below EC, taking the harmonic mean of Z-scores as a function to the bottom of the continuum part of the decoy’s energy optimize is that the harmonic mean is most sensitive to spectrum. Assuming the Gaussian energy distribution of “outliers,” i.e., it is a good approximation to obtain min- the decoys one gets (maxmZm). Physically it means that the optimization of the ϭ ͗ ͘ Ϫ ͱ harmonic mean is likely to exclude the situation when few EC E 2lnM, proteins are “over-optimized” while for most other proteins where M is the estimated number of decoys.