Method for Computing Protein Binding Affinity

Charles F. F. Karney∗ and Jason E. Ferrara
Sarnoff Corporation, Princeton, NJ 08543-5300

Stephan Brunner†
Locus Pharmaceuticals, Inc., Blue Bell, PA 19422-2700

(Dated: July 14, 2004; revised September 9, 2004)
arXiv:cond-mat/0401348v3 [cond-mat.stat-mech] 9 Sep 2004

A Monte Carlo method is given to compute the binding affinity of a ligand to a protein. The method involves extending configuration space by a discrete variable indicating whether the ligand is bound to the protein and a special Monte Carlo move which allows transitions between the unbound and bound states. Provided that an accurate protein structure is given, that the protein-ligand binding site is known, and that an accurate chemical force field together with a continuum solvation model is used, this method provides a quantitative estimate of the free energy of binding.

Keywords: free energy; binding affinity; Monte Carlo methods; equilibrium constants; proteins

Introduction

Many drugs work by binding to a target protein in an organism and affecting the action of this protein. The binding of the drug molecule, the ligand L, to the protein P under physiological conditions is usually reversible (characterized by weak chemical interactions rather than covalent bonds),

$$ \mathrm{L} + \mathrm{P} \rightleftharpoons \mathrm{LP}, $$

and, in equilibrium and in the dilute limit, the concentration of the ligand-protein complex [LP] is given by the dissociation constant

$$ K_d = \frac{[\mathrm{L}][\mathrm{P}]}{[\mathrm{LP}]}. $$

It is convenient to define the binding affinity as

$$ \mathrm{p}K_d = -\log_{10} \frac{K_d/N_A}{1\ \mathrm{kmol\,m^{-3}}}, $$

where $N_A$ is the Avogadro constant. A high value for $\mathrm{p}K_d$ is crucial to obtaining a good drug molecule and, consequently, the ability to compute $\mathrm{p}K_d$ accurately would greatly accelerate drug discovery by allowing many molecules to be screened in silico before any time-consuming syntheses and assays are done.

The dissociation constant can also be expressed in terms of the binding free energy $\Delta F$,

$$ K_d = \exp(\beta \Delta F)/V_0, \tag{1} $$

where $V_0$ is the system volume, $\beta = 1/(kT)$, $T$ is the temperature, and $k$ is the Boltzmann constant. Similarly $\mathrm{p}K_d$ is given by

$$ \mathrm{p}K_d = -\frac{\beta \Delta F}{\ln 10} + \log_{10}\bigl(V_0 N_A \times 1\ \mathrm{kmol\,m^{-3}}\bigr). $$

The quantity $\Delta F$ is the free energy difference of the ligand and the protein forming a bound complex LP, the "bound system," compared to the ligand and the protein isolated from one another L+P, the "unbound system."

In order to compute $\mathrm{p}K_d$, we require (1) a sufficiently accurate model of the protein and the ligand and their interaction and (2) a good way to compute the resulting value of $\mathrm{p}K_d$. In this paper, we assume the first requirement is fulfilled and instead focus on meeting the second.

The standard methods of computing free energies [1, 2] are not capable of computing $\Delta F$ directly because the unbound and bound systems are too dissimilar, which hinders transitions between these systems. Instead, typically, two close ligands $\mathrm{L}_a$ and $\mathrm{L}_b$ are compared separately unbound and bound to the protein, thereby obtaining the difference in the free energies $\Delta F_a - \Delta F_b$.

We present here a practical method for directly computing $\Delta F$ and hence $\mathrm{p}K_d$. The method consists of: (1) formulating the problem in an extended phase space which allows the unbound and bound systems to be treated as a single system and $K_d$ to be expressed as the ratio of two canonical averages; (2) introducing a new Monte Carlo move, the "wormhole move," to make transitions between the unbound and bound states in this extended system; and (3) a method to determine the "portals" needed for the wormhole move.

Formulation

Consider a system of volume $V_0$ consisting of a ligand molecule L and a protein molecule P dissolved in $N_S$ molecules S. The overall state of the system is given by $[\Gamma, \Gamma_S]$ where $\Gamma$ represents the phase space configuration of L and P and $\Gamma_S$ represents the configurations of all the solvent molecules S. The energy of the system is given by $E(\Gamma, \Gamma_S)$ and, in equilibrium, the system obeys the Boltzmann distribution [3]

$$ f(\Gamma, \Gamma_S) = \frac{\exp[-\beta E(\Gamma, \Gamma_S)]}{\int \exp[-\beta E(\Gamma, \Gamma_S)]\, d\Gamma\, d\Gamma_S}. $$

It is frequently useful to average over the configurations of the solvent molecules by integrating the Boltzmann distribution over $\Gamma_S$ to give a reduced Boltzmann distribution

$$ f(\Gamma) = \int f(\Gamma, \Gamma_S)\, d\Gamma_S = \frac{\exp[-\beta E(\Gamma)]}{\int \exp[-\beta E(\Gamma)]\, d\Gamma}, $$

where

$$ E(\Gamma) = -\frac{1}{\beta} \ln \int \exp[-\beta E(\Gamma, \Gamma_S)]\, d\Gamma_S $$

is the energy of the system with ligand and protein configurations specified by $\Gamma$ and with the equilibrium effects of the solvent implicitly included as a solvation free energy. In this paper, we will assume that $E(\Gamma)$ is given.

Typical molecular interactions have a short range. In view of this, let us define $\Sigma_0$ to represent all accessible $\Gamma$ space (i.e., L and P somewhere in the system volume $V_0$), and $\Sigma_1$ to represent that portion of $\Sigma_0$ where there is an appreciable interaction between L and P which therefore form the complex LP. In the phase-space volume $\Sigma_1$ we write the energy as $E_1(\Gamma)$ which is just an alternate notation for the full energy $E(\Gamma)$, while in the volume $\Sigma_0 - \Sigma_1$ we may write the energy as $E_0(\Gamma)$ which we define as the "unbound" energy, i.e., $E(\Gamma)$ excluding the interaction between L and P. The dissociation constant can then be written as

$$ K_d = \frac{1}{V_0} \frac{\left[\int_{\Sigma_0 - \Sigma_1} e^{-\beta E(\Gamma)}\, d\Gamma\right]^2}{\int_{\Sigma_0} e^{-\beta E(\Gamma)}\, d\Gamma \int_{\Sigma_1} e^{-\beta E(\Gamma)}\, d\Gamma} $$

$$ \phantom{K_d} = \frac{1}{V_0} \frac{\left[\int_{\Sigma_0 - \Sigma_1} e^{-\beta E_0(\Gamma)}\, d\Gamma\right]^2}{\int_{\Sigma_1} e^{-\beta E_1(\Gamma)}\, d\Gamma} \times \frac{1}{\int_{\Sigma_0 - \Sigma_1} e^{-\beta E_0(\Gamma)}\, d\Gamma + \int_{\Sigma_1} e^{-\beta E_1(\Gamma)}\, d\Gamma}. $$

In the dilute limit $V_0 \to \infty$, this can be simplified by ignoring the second term in the denominator of the last factor and by extending the limits of the integrals over $\Sigma_0 - \Sigma_1$ to include $\Sigma_1$. In extending the definition of $E_0(\Gamma)$ to $\Gamma \in \Sigma_1$, we include the intramolecular energy and the solvation free energy but continue to omit the intermolecular (ligand-protein) energy. This yields [1, 4, 5]

$$ K_d = \frac{1}{V_0} \frac{\int_{\Sigma_0} \exp[-\beta E_0(\Gamma)]\, d\Gamma}{\int_{\Sigma_1} \exp[-\beta E_1(\Gamma)]\, d\Gamma}. \tag{2} $$

The Helmholtz free energy of the unbound and bound systems is [3]

$$ F_\lambda = -\frac{1}{\beta} \ln \int_{\Sigma_\lambda} \exp[-\beta E_\lambda(\Gamma)]\, d\Gamma, $$

for $\lambda = 0$ and 1, and Eq. (1) is obtained from Eq. (2) with $\Delta F = F_1 - F_0$. The definition of $K_d$, Eq. (2), is strictly independent of $V_0$ because of translational symmetry (ignoring boundary effects). It is also independent of the precise definition of $\Sigma_1$, provided that $\Sigma_1$ includes the protein-ligand binding site and does not include a "macroscopic" volume beyond this.

In this formulation, we have assumed that the system volume $V_0$ is fixed. However, in most physiological systems, the pressure is held constant and the binding affinity is then related to the differences in the Gibbs free energy which introduces a correction term which is the product of the pressure and the change in the volume caused by the formation of the LP complex [6]. We expect this correction to be small for typical ligand-protein interactions in solution.

We would like to cast Eq. (2) as the ratio of canonical averages which can be computed using the canonical-ensemble Monte Carlo method [7]. To achieve this, we combine the unbound and bound systems by extending phase space to include a discrete index $\lambda \in \{0, 1\}$ and consider a system in this extended space with energy $E^*_\lambda(\Gamma)$ for which the canonical average is defined by

$$ \langle X \rangle = \frac{\sum_\lambda \int d\Gamma\, \exp[-\beta E^*_\lambda(\Gamma)]\, X_\lambda(\Gamma)}{\sum_\lambda \int d\Gamma\, \exp[-\beta E^*_\lambda(\Gamma)]}. $$

We take $E^*_\lambda(\Gamma)$ to be infinite for $\Gamma \notin \Sigma_\lambda$ and finite otherwise. Now Eq. (2) can be rewritten as

$$ K_d = \frac{1}{V_0} \frac{\bigl\langle \delta_{\lambda 0} \exp\{-\beta [E_0(\Gamma) - E^*_0(\Gamma)]\} \bigr\rangle}{\bigl\langle \delta_{\lambda 1} \exp\{-\beta [E_1(\Gamma) - E^*_1(\Gamma)]\} \bigr\rangle}, \tag{3} $$

where $\delta_{\lambda\mu}$ is the Kronecker delta. If $E^*_\lambda(\Gamma) \approx E_\lambda(\Gamma)$, the terms being averaged are $O(1)$. Because the definition of $K_d$ is independent of $V_0$, we can pick $V_0 \sim 1/K_d$ so that approximately the same number of samples contribute to each of the canonical averages. We show later, Eq. (7), that this choice minimizes the error in the estimate of $K_d$.

We can evaluate $E^*_\lambda(\Gamma)$ with short energy cutoffs allowing it to be computed more quickly than $E_\lambda(\Gamma)$ and the terms contributing to the averages in Eq. (3) can be accumulated every hundred steps, for example. Since there is typically a high correlation between successive steps in a Monte Carlo simulation, this method allows the averages to be computed to a given degree of accuracy in less time than if we had used $E^*_\lambda(\Gamma) = E_\lambda(\Gamma)$.

The extension of phase space has been used in other free energy calculations, to combine, for example, systems at several different temperatures [8] or to treat the "reaction coordinate" controlling the transition between two chemical species as a dynamic variable [9, 10]. In our case, the use of the wormhole Monte Carlo (described in the next section) allows us to include just the two systems of interest without the need to compute the properties of (possibly unphysical) intermediate systems.

Wormhole Monte Carlo

We can compute the canonical averages in Eq. (3) using the Monte Carlo method [7] to make steps from $[\Gamma, \lambda]$ to $[\Gamma', \lambda']$ whose equilibrium distribution is proportional to $g(\Upsilon)$.
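To make the ratio of canonical averages in Eq. (3) concrete, the following is a minimal sketch on a one-dimensional toy model, not the authors' implementation. Everything in it is an assumption made for illustration: the configuration is a single coordinate `x`, the unbound region Σ0 is the interval `[0, V0)` with zero energy, the bound region Σ1 is `[0, A)` with a harmonic well, and the sampling energy E* replaces that well by a flat one. The jump between λ = 0 and λ = 1 is a plain uniform re-draw with a Metropolis-Hastings correction; it stands in for, and should not be read as, the wormhole move of the paper.

```python
import math
import random

# Toy parameters (all hypothetical, arbitrary units).
BETA = 1.0          # 1/(kT)
V0 = 20.0           # system "volume": Sigma_0 is the interval [0, V0)
A = 1.0             # binding "site": Sigma_1 is the interval [0, A)
EPS, K = 3.0, 4.0   # depth and stiffness of the toy bound-state well


def region_size(lam):
    """Length of Sigma_lambda."""
    return V0 if lam == 0 else A


def e_full(x, lam):
    """Full energy E_lambda(x): zero when unbound, harmonic well when bound."""
    return 0.0 if lam == 0 else -EPS + 0.5 * K * (x - A / 2) ** 2


def e_star(x, lam):
    """Cheaper sampling energy E*_lambda(x); infinite outside Sigma_lambda."""
    if not 0.0 <= x < region_size(lam):
        return math.inf
    return 0.0 if lam == 0 else -EPS  # flat well approximating e_full


def estimate_kd(n_steps, seed=0):
    """Estimate K_d as the ratio of the two averages in Eq. (3)."""
    rng = random.Random(seed)
    x, lam = A / 2, 1
    num = den = 0.0
    for _ in range(n_steps):
        if rng.random() < 0.5:
            # Symmetric local move at fixed lambda.
            x_new, lam_new, log_q = x + rng.uniform(-0.5, 0.5), lam, 0.0
        else:
            # Jump to the other lambda with x drawn uniformly in its region;
            # log_q is the Hastings correction for this asymmetric proposal.
            lam_new = 1 - lam
            x_new = rng.uniform(0.0, region_size(lam_new))
            log_q = math.log(region_size(lam_new) / region_size(lam))
        log_acc = -BETA * (e_star(x_new, lam_new) - e_star(x, lam)) + log_q
        if math.log(1.0 - rng.random()) < log_acc:
            x, lam = x_new, lam_new
        # Reweight from the sampling energy E* to the full energy E.
        w = math.exp(-BETA * (e_full(x, lam) - e_star(x, lam)))
        if lam == 0:
            num += w
        else:
            den += w
    return num / (V0 * den)


def exact_kd():
    """Closed form of Eq. (2) for this toy model."""
    well = math.sqrt(2 * math.pi / (BETA * K)) * math.erf(
        0.5 * A * math.sqrt(BETA * K / 2))
    return 1.0 / (math.exp(BETA * EPS) * well)
```

Calling `estimate_kd(200_000)` and comparing with `exact_kd()` gives a quick consistency check. The value of `V0` was chosen so that `V0` is of order `1/K_d` for these parameters, which keeps the two values of λ roughly equally populated, in line with the remark following Eq. (3).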
