Histograms and free energies, ChE210D
© M. S. Shell 2009, last modified 11/7/2019

Today's lecture: basic, general methods for computing entropies and free energies from histograms taken in molecular simulation, with applications to phase equilibria.

Overview of free energies

Free energies drive many important processes and are among the most challenging quantities to compute in simulation. Free energies involve sampling at constant temperature, and ultimately are tied to summations involving partition functions. There are many kinds of free energies that we might compute.

Macroscopic free energies

We may be concerned with the Helmholtz free energy or the Gibbs free energy. We might compute changes in these as a function of their natural variables. For single-component systems:

$$A(T, V, N) \qquad G(T, P, N)$$

For multicomponent systems:

$$A(T, V, N_1, \ldots, N_M) \qquad G(T, P, N_1, \ldots, N_M)$$

Typically we are only interested in the dependence of these free energies along a single parameter, e.g., $A(V)$, $G(P)$, $G(T)$, etc., for constant values of the other independent variables.

Free energies for changes in the interaction potential

It is also possible to define a free energy change associated with a change in the interaction potential. Initially the energy function is $U_0(\mathbf{r}^N)$ and we perturb it to $U_1(\mathbf{r}^N)$. If this change happens in the canonical ensemble, we are interested in the free energy associated with this perturbation:

$$\Delta A = A_1(T, V, N) - A_0(T, V, N) = -k_B T \ln \left( \frac{\int e^{-\beta U_1(\mathbf{r}^N)} \, d\mathbf{r}^N}{\int e^{-\beta U_0(\mathbf{r}^N)} \, d\mathbf{r}^N} \right)$$

What kinds of states 1 and 0 might we use to evaluate this expression?
Here is a small number of sample applications:

• electrostatic free energy – charging of an atom or atoms in a molecule, in which state 0 has zero partial charges and state 1 has finite values
• dipolar free energy – adding a point dipole to an atom between states 0 and 1
• solvation free energy – one can “turn on” interactions between a solvent and a solute as a way to determine the free energy of solvation
• free energy associated with a field – states 0 and 1 correspond to the absence and presence, respectively, of a field, such as an electrostatic field
• restraint free energy – turning on some kind of restraint, such as confining a molecule to have a particular conformation or location in space. Such restraints would correspond to energetic penalties for deviations from the restrained space in state 1.
• free energies of alchemical transforms – we convert one kind of molecule (e.g., CH4) to another kind (e.g., CF4). This gives the relative free energies of these two kinds of molecules in the system of interest (e.g., solvation free energies in solution).

Potentials of mean force (PMFs)

Oftentimes we would like to compute the free energy along some order parameter or reaction coordinate of interest. These are broadly termed potentials of mean force, for reasons we will see shortly. This perspective enables us to understand free-energetic driving forces in many processes. For the purposes of this discussion, we will denote a PMF by $F(\xi)$, where $\xi$ is the reaction coordinate of interest. This coordinate might be, for example:

• an intra- or intermolecular distance (or combination of distances)
• a bond or torsion angle
• a structural order parameter (e.g., degree of crystallinity, number of hydrogen bonds)

Consider the example of a protein in aqueous solution interacting with a surface. The reaction coordinate might be the distance $z$ between the center of mass of the protein and the surface.

[Figure: the protein-surface distance $z$.]

The PMF along $z$, $F(z)$, would give the free energy of the system as a function of the protein-surface distance. It might look something like:

[Figure: sketch of $F(z)$ versus $z$.]

This curve would show us:

• the preferred distance at which the protein binds to the surface, from the value of $z$ at the free energy minimum
• the free energy change upon binding, from the difference in free energy between the minimum and large values of $z$
• the barrier in free energy for binding and unbinding, from the height of the hump

Importantly, the free energy function does not just include the direct potential energy interactions between atoms in the molecule and atoms in the surface. It also includes the effects of all of the interactions involving the solvent molecules. This may be crucial to the behavior of the system.

For example, the direct pairwise interactions of an alkane with a silica surface will be the same regardless of whether the solvent is water or octanol. However, the net interaction of the alkane and surface will be very different in the two cases due to solvent energies and entropies, and this effect is captured exactly by the PMF.

Definition

Formally, a potential of mean force (the free energy) along some reaction coordinate $\xi$ is given by a partial integration of the partition function. In the canonical ensemble, we begin with the configurational part of the Helmholtz free energy,

$$F(\xi) = A_c(T, V, N, \xi) = -k_B T \ln Z(T, V, N, \xi) = -k_B T \ln \int e^{-\beta U(\mathbf{r}^N)} \, \delta[\xi - \hat{\xi}(\mathbf{r}^N)] \, d\mathbf{r}^N$$

Here, $\hat{\xi}(\mathbf{r}^N)$ is a function that returns the value of the order parameter for a particular configuration $\mathbf{r}^N$. The integral in this expression entails a delta function that filters for only those Boltzmann factors for configurations with the specified $\xi$. One can think of the PMF as the free energy when the system is constrained to a given value of $\xi$.
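The partial integration in this definition can be made concrete with a small numerical sketch. The two-coordinate potential below is a hypothetical toy model (not from the lecture), and the reaction coordinate is assumed to be the coordinate x itself, so that the delta function simply fixes x and the PMF is obtained by integrating the Boltzmann factor over the remaining coordinate y. For this particular potential the integral over y is Gaussian, which gives an analytic result to compare against.

```python
import numpy as np

def potential(x, y):
    """Toy two-coordinate potential (hypothetical, for illustration only)."""
    stiffness = 1.0 + x**2            # the well in y narrows as |x| grows
    return (x**2 - 1.0)**2 + 0.5 * stiffness * y**2

def pmf_along_x(x_values, kT=1.0, y_max=12.0, n_y=4001):
    """F(x) = -kT ln of the integral of exp(-U(x, y)/kT) over y (Riemann sum)."""
    y = np.linspace(-y_max, y_max, n_y)
    dy = y[1] - y[0]
    boltzmann = np.exp(-potential(x_values[:, None], y[None, :]) / kT)
    z_of_x = boltzmann.sum(axis=1) * dy   # restricted partition function Z(x)
    return -kT * np.log(z_of_x)

x = np.linspace(-2.0, 2.0, 81)
F = pmf_along_x(x)

# Analytic result for kT = 1: the Gaussian integral over y contributes
# an entropic term (1/2) ln(1 + x^2) plus a constant.
F_analytic = (x**2 - 1.0)**2 + 0.5 * np.log(1.0 + x**2) - 0.5 * np.log(2.0 * np.pi)
max_error = np.max(np.abs(F - F_analytic))
```

In this toy model the minima of F(x) sit slightly inside |x| = 1, even though the potential energy alone is lowest at |x| = 1, because the well in y is wider (more configurations are available) at smaller |x|. This is the kind of contribution from the integrated-out degrees of freedom that the PMF includes and the direct interaction does not.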
Notice that we have the identity

$$\int e^{-\beta F(\xi)} \, d\xi = e^{-\beta A_c}$$

The potential of mean force is so named because its derivative gives the average force along the direction of $\xi$ at equilibrium. We proceed to find the derivative of the PMF:

$$\begin{aligned}
\frac{dF(\xi)}{d\xi} &= -k_B T \frac{d}{d\xi} \ln \int e^{-\beta U(\mathbf{r}^N)} \, \delta[\xi - \hat{\xi}(\mathbf{r}^N)] \, d\mathbf{r}^N \\
&= -k_B T \, \frac{\frac{d}{d\xi} \int e^{-\beta U(\mathbf{r}^N)} \, \delta[\xi - \hat{\xi}(\mathbf{r}^N)] \, d\mathbf{r}^N}{\int e^{-\beta U(\mathbf{r}^N)} \, \delta[\xi - \hat{\xi}(\mathbf{r}^N)] \, d\mathbf{r}^N}
\end{aligned}$$

To make progress, we need the mathematical identity

$$\frac{d}{da} \int g(x) \, \delta(x - a) \, dx = \int \frac{dg(x)}{dx} \, \delta(x - a) \, dx$$

This allows us to pull the derivative inside the integral:

$$\begin{aligned}
\frac{dF(\xi)}{d\xi} &= -k_B T \, \frac{\int \left( \frac{d}{d\xi} e^{-\beta U(\mathbf{r}^N)} \right) \delta[\xi - \hat{\xi}(\mathbf{r}^N)] \, d\mathbf{r}^N}{\int e^{-\beta U(\mathbf{r}^N)} \, \delta[\xi - \hat{\xi}(\mathbf{r}^N)] \, d\mathbf{r}^N} \\
&= -k_B T \, \frac{\int \left( -\beta \frac{dU}{d\xi} e^{-\beta U(\mathbf{r}^N)} \right) \delta[\xi - \hat{\xi}(\mathbf{r}^N)] \, d\mathbf{r}^N}{\int e^{-\beta U(\mathbf{r}^N)} \, \delta[\xi - \hat{\xi}(\mathbf{r}^N)] \, d\mathbf{r}^N} \\
&= -\frac{\int f_\xi \, e^{-\beta U(\mathbf{r}^N)} \, \delta[\xi - \hat{\xi}(\mathbf{r}^N)] \, d\mathbf{r}^N}{\int e^{-\beta U(\mathbf{r}^N)} \, \delta[\xi - \hat{\xi}(\mathbf{r}^N)] \, d\mathbf{r}^N}
\end{aligned}$$

Here, the term $f_\xi$ gives the force along the direction of $\xi$,

$$f_\xi = -\frac{dU}{d\xi} = -\frac{d\mathbf{r}^N}{d\hat{\xi}} \cdot \nabla U = \frac{d\mathbf{r}^N}{d\hat{\xi}} \cdot \mathbf{f}^N$$

The remainder of the terms in the PMF equation serve to average the force for a specified value of $\xi$. Thus,

$$\frac{dF(\xi)}{d\xi} = -\langle f_\xi(\xi) \rangle$$

Paths

Keep in mind that free energies are state functions. That is, if we are to compute a change in any free energy between two conditions, we are free to pick an arbitrary path of interest between them. This ultimately lends flexibility to the kinds of simulation approaches that we can take to compute free energies.

Overview of histograms in simulation

Until now, we have focused mainly on computing property averages in simulation. Histograms, on the other hand, are concerned with computing property distributions. These distributions can be used to compute averages, but they contain much more information. Importantly, they relate to the fluctuations in the ensemble of interest, and ultimately can be tied to statistical-mechanical partition functions. It is through this connection that histograms enable us to compute free energies and entropies.
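Before moving on to histograms, the mean-force relation derived earlier in this section (the derivative of the PMF equals minus the average force along the reaction coordinate) can be checked numerically. The sketch below uses a hypothetical two-coordinate toy potential, not one from the lecture, and assumes the reaction coordinate is the Cartesian coordinate x itself, so that the force along it is simply minus the partial derivative of U with respect to x. The PMF is computed by quadrature over y, differentiated by finite differences, and compared with the Boltzmann-weighted average force at each fixed x.

```python
import numpy as np

kT = 1.0

def potential(x, y):
    """Toy two-coordinate potential (hypothetical, for illustration only)."""
    return (x**2 - 1.0)**2 + 0.5 * (1.0 + x**2) * y**2

def force_x(x, y):
    """Force along x, f_x = -dU/dx, for the toy potential above."""
    return -(4.0 * x * (x**2 - 1.0) + x * y**2)

x = np.linspace(-2.0, 2.0, 401)[:, None]     # reaction coordinate values
y = np.linspace(-12.0, 12.0, 4001)[None, :]  # coordinate that is integrated out
weights = np.exp(-potential(x, y) / kT)      # Boltzmann factors at each fixed x

# PMF from the restricted partition function; the constant grid spacing in y
# only shifts F by a constant, so it is omitted.
F = -kT * np.log(weights.sum(axis=1))

# Conditional (fixed-x) average of the force along x.
mean_force = (force_x(x, y) * weights).sum(axis=1) / weights.sum(axis=1)

# Finite-difference derivative of the PMF along x.
dF_dx = np.gradient(F, x[:, 0])

# Compare on interior points, where np.gradient uses central differences.
max_mismatch = np.max(np.abs(dF_dx[1:-1] + mean_force[1:-1]))
```

The remaining mismatch comes from the finite-difference derivative and shrinks as the grid in x is refined. In an actual simulation the average force would come from sampled configurations at a constrained or restrained value of the reaction coordinate rather than from quadrature.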
Definitions and measurement in simulation

For the purposes of illustration, we will consider a histogram in potential energy. In our simulation, we might measure the distribution of the variable $U$ using a long simulation run and many observations of the instantaneous value of $U$.

In classical systems, the potential energy is a continuously varying variable. Therefore, the underlying $\wp(U)$ is a continuous probability distribution. However, in the computer we must measure a discretized version of this distribution.

• We specify a minimum and maximum value of the energy that defines a range of energies in which we are interested. Let these be $U_\text{min}$ and $U_\text{max}$.
• We define a set of $m$ bins into which the energy range is discretized. Each bin has a bin width of

$$\delta U = \frac{U_\text{max} - U_\text{min}}{m}$$

• Let the variable $k$ be the bin index. It varies from 0 to $m - 1$. The average (midpoint) energy of bin $k$ is then given by

$$U_k = U_\text{min} + \left( k + \frac{1}{2} \right) \delta U$$

• We create a histogram along the energy bins. This is simply an array in the computer that measures counts of observations:

$$c_k = \text{counts of } U \text{ observations in the range } [U_k - \delta U / 2, \; U_k + \delta U / 2)$$

For the sake of simplicity, we will often write the histogram array using the energy, rather than the bin index,

$$c(U) = c_k \quad \text{where} \quad k = \text{int}\left( \frac{U - U_\text{min}}{\delta U} \right)$$

Here the int function returns the integer part of its argument. For example, int(2.6) = 2.

To determine a histogram in simulation, we perform a very large number of observations $n$ from a long, equilibrated molecular simulation.
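The binning scheme above can be sketched in a few lines of Python. The function name `energy_histogram` and the sample energies are hypothetical. The handling of observations outside the range (they are skipped) is an assumption, since the text does not say what to do with them. The explicit range check matters because Python's `int` truncates toward zero, so an energy slightly below the minimum would otherwise be assigned to bin 0.

```python
import numpy as np

def energy_histogram(energies, u_min, u_max, m):
    """Count observations of U into m equal-width bins on [u_min, u_max)."""
    du = (u_max - u_min) / m                      # bin width
    centers = u_min + (np.arange(m) + 0.5) * du   # U_k for k = 0, ..., m - 1
    counts = np.zeros(m, dtype=int)               # c_k
    for u in energies:
        if u < u_min or u >= u_max:
            continue                              # assumption: skip out-of-range values
        k = int((u - u_min) / du)                 # integer part gives the bin index
        counts[min(k, m - 1)] += 1                # guard against round-off at the top edge
    return centers, counts

# Hypothetical observations; in practice these come from the simulation run.
energies = [-1.0, 0.0, 0.4, 2.6, 2.6, 9.99, 10.0, -0.3]
centers, counts = energy_histogram(energies, u_min=0.0, u_max=10.0, m=10)
```

With a bin width of 1, the two observations at 2.6 land in bin 2, matching the int(2.6) = 2 example, while -1.0, -0.3, and 10.0 fall outside the half-open range and are not counted.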