
PNAS PLUS

A generalized G-SFED continuum solvation free energy calculation model

Sehan Lee^a, Kwang-Hwi Cho^b, Young-Mook Kang^a, Harold A. Scheraga^c,1, and Kyoung Tai No^a,d,1

^aDepartment of Biotechnology, Yonsei University, Seoul 120-749, Korea; ^bDepartment of Bioinformatics, SoongSil University, Seoul 156-743, Korea; ^cBaker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, NY 14853-1301; and ^dBioinformatics and Molecular Design Research Center, Seoul 120-749, Korea

BIOPHYSICS AND COMPUTATIONAL BIOLOGY

Contributed by Harold A. Scheraga, December 20, 2012 (sent for review October 12, 2012)

An empirical continuum solvation model, solvation free energy density (SFED), has been developed to calculate solvation free energies of a molecule in the most frequently used solvents. A generalized version of the SFED model, generalized-SFED (G-SFED), is proposed here to calculate molecular solvation free energies in virtually any solvent. G-SFED provides an accurate and fast generalized framework without a complicated description of a solution. In the model, the solvation free energy of a solute is represented as a linear combination of empirical functions of the solute properties representing the effects of solute on various solute–solvent interactions, and the complementary solvent effects on these interactions were reflected in the linear expansion coefficients with a few solvent properties. G-SFED works well for a wide range of sizes and polarities of solute molecules in various solvents as shown by a set of 5,753 solvation free energies of diverse combinations of 103 solvents and 890 solutes. Octanol-water partition coefficients of small organic compounds and peptides were calculated with G-SFED with accuracy within 0.4 log unit for each group. The G-SFED computation time depends linearly on the number of nonhydrogen atoms (n) in a molecule, O(n).

implicit solvent | macromolecules | linear time

Significance

This paper deals with a long-standing problem in biophysics, which is resolved here. The model has a strong physical background and will have a wide range of applications to physical and biological problems. The model proposed in this article, the G-SFED model, can be used for the solvation free energy calculation of most organic solutes in most organic solvents. Since the computing time depends linearly on the size of the molecule, the model can be applied easily to large molecules, for example proteins. The model can provide reliable solvation free energies of experimentally unavailable solute-solvent pairs.

Author contributions: H.A.S. and K.T.N. designed research; S.L. performed research; K.-H.C. and Y.-M.K. contributed new reagents/analytic tools; and S.L. and K.T.N. wrote the paper.

The authors declare no conflict of interest.

1To whom correspondence may be addressed. E-mail: [email protected] or ktno@yonsei.ac.kr.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1221940110/-/DCSupplemental.

An accurate description of the effect of solvent on solvation is crucial for understanding chemical and biological phenomena. Because many peptide and protein drugs are currently being developed (1, 2), computational efficiency as well as accuracy are very important. Among the different approaches that have been used, continuum solvation models are appealing because of the simplified yet accurate description of the solvent effect (3–8). Continuum solvent models have been developed based on the early work of Born (9), Bell (10), Kirkwood (11), and Onsager (12). They focus primarily on the description of the solute either at the quantum mechanical level or as a classical collection of point charges and try to represent the influence of the solvent based on an approximate treatment as a dielectric continuum medium. The charge distribution of the solute induces electric polarization of the surrounding solvent, and the electric field generated by the polarized solvent, the reaction field, in turn perturbs the solute, leading to a modification of the charge distribution of the solute. This electrostatic problem of the mutual polarization between solute and solvent can be recast by using boundary conditions at the cavity surface, leading to significant simplifications in the associated equations.

In the classical approaches, most work is concerned with describing the electrostatic contribution to the solvation in terms of the Poisson–Boltzmann equation (PBE) (13). Several different computational techniques for solving the PBE have been developed, for example, finite difference methods in which the dielectric property of the solvent is described in terms of a 3D grid around the solute (14–17) and boundary element methods in which the reaction potential is described by an apparent charge spread on the cavity surface (18–22).

In the quantum mechanical approaches, the mutual polarization is taken into account by the self-consistent reaction field (SCRF) method (23). A polarizable continuum model (PCM), proposed by Tomasi and colleagues (24), casts the quantum mechanical SCRF equations into a boundary element problem with apparent surface charges spread on the cavity surface. Three types of PCMs have been used frequently: the original method, PCM (24); integral equation formalism PCM (25–27); and an alternative model in which the surrounding medium is modeled as a conductor instead of as a dielectric (28, 29).

Moreover, some models complement the description of the solute–solvent interactions by adding the missing nonelectrostatic terms. Hydrogen bonds are described adequately by density-functional theory exchange-correlation functionals as well as by post-Hartree-Fock (HF) methods (30–32). In the classical approach, the free energy of hydrogen-bond formation can be calculated with a product of parameters characteristic of acidity and basicity of molecules (33, 34). Solvent-accessible surface (SAS) area models are popular choices for nonpolar solute–solvent interactions, such as dispersion, repulsion, and cavitation (35–39).

The continuum models generally are parameterized for a specific solvent to reproduce the interactions of the solvent with a variety of solutes. An important advance in continuum approaches is the development of generalized (or universal) solvation models applicable to any solvent. The conductor-like screening model for realistic solvation (COSMO-RS) (40) extension of COSMO (41) derives the polarization charges of the continuum, caused by the charge distribution of the solute, from a scaled-conductor approximation. SMx models (SM5A, SM5.42R, SM8, and others) implement the parameterization of atomic surface tension not only in terms of the properties of the atoms of the solute but also in terms of the solvent properties (42–44). COSMO-RS and SMx can be applied only to molecules of limited size that can be treated by quantum or semiempirical quantum chemistry.

Previously, we demonstrated that the solvation free energy density (SFED) model is an efficient one for the description of molecular properties in solution with a linear combination of empirical functions of the solute properties (45–50). However, the parameterization of the linear expansion coefficients, optimized for each solvent, limits the application of the model to solvents for which the coefficients were determined. Here we present the generalized SFED (G-SFED) model to calculate the solvation free energies of a molecule, including biological molecules, in any environment. The effects of the solute on various solute–solvent interactions are represented by the basis functions developed earlier (46–50), and the complementary solvent effects on these interactions are reflected in the coefficients with a few solvent properties.

www.pnas.org/cgi/doi/10.1073/pnas.1221940110 | PNAS Early Edition

Theory

The solvation free energy of a solute s in a solvent m, $\Delta G_{\text{solv}}$, was described by two steps: (i) forming a solute-shaped cavity in the polarizable solvent and calculating the cavitation free energy, $\Delta G_{\text{cav}}$, and (ii) transferring the charge distribution of the solute from the gas phase to the cavity and calculating the interaction, $\Delta G_{\text{inter}}$, between the polarizable solvent and the polarizable solute. Because the continuum models used the potential of mean force produced by the solute to calculate $\Delta G_{\text{inter}}$, the contribution to the hydrogen bond, especially from proton tunneling, always is underestimated. Therefore, the contribution of hydrogen bonds to solvation, $\Delta G_{\text{HB}}$, is calculated independently of other possible solute–solvent interactions:

$$\Delta G_{\text{solv}} = \Delta G_{\text{inter}} + \Delta G_{\text{HB}} + \Delta G_{\text{cav}}. \tag{1}$$

To calculate $\Delta G_{\text{inter}}$, (i) the solute molecule is represented by polarizable atoms with point charges at the center of the atoms in the solute, and (ii) the solvent is represented by an SAS, and each surface unit, represented by a dot in Fig. 1, has an average dipole and an average induced dipole described by macroscopic solvent properties.

Fig. 1. The solute and solvent of a solution are described as an assemblage of interacting compartments. $r_{ik}$ is the distance between the ith atom and the kth surface fragment on the SAS of the solute, and $\gamma^m$ is the macroscopic surface tension of the solvent. A dielectric constant, $\varepsilon^m$, and a refractive index, $\eta^m$, of the solvent are placed at the grid points on the SAS. Hydrogen-bond acidity and basicity of the solvent, $A^m$ and $B^m$, and of the solute, $A^s$ and $B^s$, represent the contribution of a hydrogen bond.

The electric field produced by the solute aligns the dipoles and induced dipoles on the surface, and the aligned dipoles and induced dipoles of the solvent perturb the electron distribution of each atom of the solute. This perturbed electron distribution in the solute, described by the change of net atomic charge of each atom, which is proportional to the effective atomic polarizability, interacts with the aligned dipoles and the induced dipoles on the surface.

To describe $\Delta G_{\text{inter}}$, which originates from the phenomena described above, previous work (45–50) was carried out to find a proper functional form for $\Delta G_{\text{inter}}$. Several functional forms and several ways to combine the basis functions have been tested to calculate solvation free energies of diverse solutes in diverse solvents. Among the tested functions, $\Delta G_{\text{inter}}$ calculated with the following basis functions optimally reproduced the experimental solvation free energies of all pairs of solutes and solvents examined (SI Text):

Fig. 2. Comparison of the training and external validation sets based on molecular weight (A) and number of hydrogen-bond acceptors (B). The training set consists mainly of small non- or uni-polar molecules, and the external validation set includes a large number of multipolar large molecules. The rightmost bars represent the number of molecules that are larger than 400 Da (A) and have more than seven hydrogen-bond acceptors (B), respectively.

$$\Delta G_{\text{inter}} = \sum_{k=1}^{S}\left( C_1 \sum_{i=1}^{N_A} \frac{q_i}{r_{ik}^{2}} + C_2 \sum_{i=1}^{N_A} \frac{q_i^{2}}{r_{ik}^{3}} + C_3 \sum_{i=1}^{N_A} \frac{\alpha_i}{r_{ik}^{3}} + C_4 \sum_{i=1}^{N_A} \frac{\alpha_i}{r_{ik}^{6}} \right), \tag{2}$$

where $S$ and $N_A$ are the number of surface fragments on the cavity surface and atoms of the solute, respectively. $r_{ik}$ is the distance between the ith atom and the kth surface fragment. The net atomic charge $q_i$ and the effective atomic polarizability $\alpha_i$ of the ith atom of the solute were calculated by using the modified partial equalization of orbital electronegativity and the charge dependent effective atomic polarizability methods, respectively (51–55). The cavity surface of a solute is defined by the union of the SAS of atoms in the solute (Fig. 1). The radius of the SAS sphere of each atom is defined as the sum of its van der Waals radius, $R_{\text{vdw}}$, and the effective solvent shell thickness, $R_S$ (46).

The $C_j$s of Eq. 2 represent the degree of the interaction of the solvent with the field produced by the electron distribution of the solute. Because the first and second terms in Eq. 2 originate from the dipole interaction, and the third and fourth terms originate from polarization, the $C_j$s are described as follows,

$$C^m_{j=1\,\text{or}\,2} = c_{j,0}\,\varepsilon^m + c_{j,1} \quad\text{and}\quad C^m_{j=3\,\text{or}\,4} = c_{j,0}\,\eta^m + c_{j,1}, \tag{3}$$

where $C^m_j$ represents the jth coefficient of a solvent m, and $\varepsilon^m$ and $\eta^m$ are the dielectric constant and refractive index of the solvent (56, 57), respectively. The macroscopic properties, $\varepsilon^m$ and $\eta^m$, of the solvent originate from the average dipole moment and average induced dipole of the solvent, respectively.

Because $\Delta G_{\text{cav}}$ is the free energy cost to form the cavity in a solvent, it depends linearly on the cavity surface area, $S_{\text{cav}}$ (58):

$$\Delta G_{\text{cav}} = c_{\text{cav}}\,\gamma^m S_{\text{cav}}, \tag{4}$$

where $\gamma^m$ is the macroscopic surface tension of a solvent m in energy per unit area (56, 57).

$\Delta G_{\text{HB}}$ is given by the product of hydrogen-bond acidity, $A$, and basicity, $B$, of the hydrogen-bonded molecules. In the model, $\Delta G_{\text{HB}}$ was divided into two terms based on the role of the solvent molecule on the hydrogen bond, with acceptor and donor represented by subscript "a" and "d," respectively:

$$\Delta G_{\text{HB}} = \Delta G_a + \Delta G_d = c_a B^m A^s + c_d A^m B^s. \tag{5}$$

$A$ and $B$ were taken from Abraham's work (59–61). The solvation free energy calculated with the G-SFED model is

$$\Delta G_{\text{solv}} = \sum_{k=1}^{S}\left( C_1 \sum_{i=1}^{N_A} \frac{q_i}{r_{ik}^{2}} + C_2 \sum_{i=1}^{N_A} \frac{q_i^{2}}{r_{ik}^{3}} + C_3 \sum_{i=1}^{N_A} \frac{\alpha_i}{r_{ik}^{3}} + C_4 \sum_{i=1}^{N_A} \frac{\alpha_i}{r_{ik}^{6}} \right) + c_a B^m A^s + c_d A^m B^s + c_{\text{cav}}\,\gamma^m S_{\text{cav}} + C, \tag{6}$$

where $C$ is a constant to correct for the error in this linear equation. The coefficients $c_{1,0}$, $c_{1,1}$, $c_{2,0}$, $c_{2,1}$, $c_{3,0}$, $c_{3,1}$, $c_{4,0}$, $c_{4,1}$, $c_a$, $c_d$, $c_{\text{cav}}$, and $C$ can be applied universally for any solvents. The G-SFED model introduces only these 12 parameters to calculate the solvation free energy of any solute–solvent combination.

The parameterization and validation of the model were performed with 5,753 experimental solvation free energies of 890 solutes in 103 physically diverse solvents (Table S1 and Fig. S1). Because G-SFED employs parameters that were determined using rigid gas-phase solute structures, parameterization was performed by using a training set composed mainly of non- or unipolar small molecules to minimize the error originating from structural changes upon dissolution. The predictability of the model was validated by using an external validation set containing diverse small nonpolar molecules and large multipolar molecules including drug and natural compounds. Comparison between the training and external validation sets is shown in Fig. 2.

The training set consists of 2,173 solvation free energies of 366 neutral molecules in 91 solvents (49). This set encompassed a broad range of common functional groups present in biological and druggable small molecules that possess H, C, N, O, F, S, Cl, Br, and I atoms. The data set for validation of the model is taken from our previous work (50). After exclusion of data in common with the training set, the external validation set consisted of 3,580 unique solvation free energies of 741 neutral molecules in 36 polar solvents. Of these, 524 solutes and 12 solvents were used only to validate the model. The geometries of the solutes were obtained by energy minimization in the gas-phase, using HF/MIDIx in Gaussian (62) software.

The parameters for calculating $q_i$ and $\alpha_i$ of the solute and those describing the cavity surface were taken from our previous work (46–55), and only 12 empirical parameters were determined by minimizing the difference between calculated ($\Delta G^{\text{cal}}_{\text{solv}}$) and experimental ($\Delta G^{\text{exp}}_{\text{solv}}$) solvation free energies. The optimized 12 empirical parameters and the contribution of each basis function to $\Delta G_{\text{solv}}$ are given in Table 1 and Table S2.

Table 1. Optimized universal solvent G-SFED parameters for Eqs. 3–6

Parameter   Value        Parameter   Value        Parameter   Value
c1,0        −1.76E-03    c3,0        −2.16E-01    ca          −7.53
c1,1        −1.37E-01    c3,1        2.64E-01     cd          −4.35
c2,0        −2.89E-03    c4,0        6.72         ccav        7.12E-05
c2,1        −1.84E-01    c4,1        −8.99        C           −2.66E-01

The parameters have units that enable the product of the basis function and the coefficient to be expressed in kcal/mol.

The following sections discuss the training and validation results of the G-SFED model, followed by application to calculate partition coefficients of organic molecules and peptides.

Fig. 3. Performance, MAE and $R^2$, of G-SFED (filled black symbols) and SM5.42R (filled gray symbols) for the external validation set. MAE (left y-axis, bars) and $R^2$ (right y-axis, circles) are plotted for each solvent. The MAE and $R^2$ values are given in Table S4.
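As an illustration of how Eqs. 2–6 combine, the following is a minimal sketch in Python, not the authors' C++ implementation. The coefficient values are those listed in Table 1; the function names, the array layout (distances stored as an S × N_A matrix), and the requirement that all inputs already be in the unit system for which the Table 1 parameters yield kcal/mol are assumptions made here for illustration. Computing the charges, polarizabilities, and SAS fragments themselves (refs. 46–55) is outside the sketch.

```python
import numpy as np

# Table 1 coefficients: rows are j = 1..4, columns are (c_{j,0}, c_{j,1}).
C_J = np.array([[-1.76e-3, -1.37e-1],
                [-2.89e-3, -1.84e-1],
                [-2.16e-1,  2.64e-1],
                [ 6.72,    -8.99   ]])
C_A, C_D, C_CAV, C_CONST = -7.53, -4.35, 7.12e-5, -2.66e-1


def solvent_coefficients(eps_m, eta_m):
    """Eq. 3: C_1, C_2 are linear in eps_m; C_3, C_4 are linear in eta_m."""
    prop = np.array([eps_m, eps_m, eta_m, eta_m])
    return C_J[:, 0] * prop + C_J[:, 1]


def delta_g_inter(q, alpha, r, coeffs):
    """Eq. 2. q, alpha: shape (N_A,). r: shape (S, N_A), where r[k, i] is
    the distance between surface fragment k and atom i (hypothetical layout)."""
    basis = np.array([
        (q / r**2).sum(),       # sum over k, i of q_i / r_ik^2
        (q**2 / r**3).sum(),    # sum over k, i of q_i^2 / r_ik^3
        (alpha / r**3).sum(),   # sum over k, i of alpha_i / r_ik^3
        (alpha / r**6).sum(),   # sum over k, i of alpha_i / r_ik^6
    ])
    return float(coeffs @ basis)


def delta_g_solv(q, alpha, r, eps_m, eta_m, gamma_m,
                 a_m, b_m, a_s, b_s, s_cav):
    """Eq. 6: interaction + hydrogen-bond + cavitation terms + constant C."""
    inter = delta_g_inter(q, alpha, r, solvent_coefficients(eps_m, eta_m))
    hb = C_A * b_m * a_s + C_D * a_m * b_s      # Eq. 5
    cav = C_CAV * gamma_m * s_cav               # Eq. 4
    return inter + hb + cav + C_CONST
```

Each of the four double sums visits every (fragment, atom) pair once. The source reports overall runtime that is linear in the number of nonhydrogen atoms; how the authors' implementation achieves that is not shown in this sketch.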

Fig. 4. Scatter plots of $\Delta G_{\text{solv}}$ calculated with G-SFED (A) and SM5.42R (B) against experimental $\Delta G_{\text{solv}}$. Red and black dots represent the solvation free energies in the training and external validation sets, respectively. The MAEs and $R^2$s were 0.73 kcal/mol and 0.95, respectively, for G-SFED, and 0.90 kcal/mol and 0.89, respectively, for SM5.42R.
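The two summary statistics quoted in the figure captions can be computed as in the short sketch below. It assumes that $R^2$ is the squared Pearson correlation between calculated and experimental values; the text does not state which definition of the coefficient of determination was used, and the function names are illustrative only.

```python
import numpy as np


def mae(calc, exp):
    """Mean absolute error between calculated and experimental values."""
    calc = np.asarray(calc, dtype=float)
    exp = np.asarray(exp, dtype=float)
    return float(np.mean(np.abs(calc - exp)))


def r_squared(calc, exp):
    """Squared Pearson correlation (an assumed definition of R^2)."""
    calc = np.asarray(calc, dtype=float)
    exp = np.asarray(exp, dtype=float)
    r = np.corrcoef(calc, exp)[0, 1]
    return float(r**2)
```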

Results and Discussion

Calculation of $\Delta G_{\text{solv}}$ of the Training Set of Non- or Uni-Polar Small Molecules. The performance of the G-SFED model with the training set is compared with that of the SM5.42R model, one of the most accurate generalized continuum solvation models (63). The SM5.42R calculations were carried out with AMSOL (64) at the AM1 level of theory. Table S3 gives an error summary for each solvent. The results indicate that both the G-SFED and SM5.42R models reproduce the solvation free energies very well for small molecules in diverse solvents. The mean absolute errors (MAEs) of G-SFED and SM5.42R are 0.52 kcal/mol and 0.43 kcal/mol, respectively, over the entire training set. It should be noted that the training set is almost the same as that of SM5.42R (43), and SM5.42R adopted two parameterizations for atomic surface tensions, one for water and another for all organic solvents, whereas G-SFED adopted a single parameterization for all solvents.

Predictability for the External Validation Set Including Diverse Large Multipolar Molecules. Fig. 3 compares the performance [MAE and coefficient of determination ($R^2$)] of G-SFED and SM5.42R applied to the external validation set including diverse large multipolar molecules described above (Fig. 2). The MAEs of G-SFED and SM5.42R for the external validation set increased by 63% and 177%, respectively, in comparison with those of the training set, to 0.85 kcal/mol and 1.19 kcal/mol. Although the MAEs increased, G-SFED gives very high correlation between $\Delta G^{\text{exp}}_{\text{solv}}$ and $\Delta G^{\text{cal}}_{\text{solv}}$ in all solvents over the whole MAE range; the lowest $R^2$ was 0.92 in chloroform, and $R^2$ of other solvents was higher than 0.96. Also, the absolute error of each solute in a solvent increased in proportion to the magnitude of the solvation free energy of the solutes; the slopes from $\Delta G^{\text{exp}}_{\text{solv}}$ vs. $\Delta G^{\text{cal}}_{\text{solv}}$ were greater than but close to 1 in aliphatic alcohol and haloaromatic solvents and smaller than 1 in other solvents (Table S4). This proportional systematic error may have two main causes. The first one is omission of microscopic solvent effects around the solute molecules, because the solvent effects were reflected through minimal macroscopic solvent properties in Eqs. 3–5. The second is incomplete generalized functional forms for the hydrogen bond in Eq. 5; Abraham's parameters, adapted to represent the hydrogen-bond acidity and basicity of solute and solvent molecules, were developed only for solute molecules, and $c_a$ and $c_d$ in Eq. 5 are variables of solvent properties (65, 66).

Fig. 4 compares the training and external validation results for G-SFED and SM5.42R with experiments. The MAEs and $R^2$s for all data sets are 0.73 kcal/mol and 0.95, respectively, for G-SFED and 0.90 kcal/mol and 0.89, respectively, for SM5.42R. Overall, G-SFED shows consistent accuracy for both small nonpolar and large multipolar molecules.

Calculation of $\Delta G_{\text{solv}}$ with Empirical Hydrogen-Bond Acidity and Basicity. G-SFED, with Abraham's hydrogen-bond parameters, produces results in good agreement with the experimental solvation free energies. However, because there are a limited number of Abraham's parameters, the model cannot be applied to molecules for which these parameters are unavailable (for example, peptides and many drugs). For the complete generalization of the model, empirical hydrogen-bond acidity and basicity of solute and solvent molecules (50) were used without additional parameterization instead of with Abraham's parameters. The results were comparable to those for G-SFED with Abraham's parameters; compared with those based on Abraham's parameters, the MAEs increased by only 4% and 8%, to 0.54 kcal/mol and 0.91 kcal/mol, for the training and external validation sets, respectively. The errors increased mainly because of the inaccuracy of empirical hydrogen-bond parameters for the solvent molecules (a 0.1 unit error of solvent hydrogen-bond acidity corresponds to an MAE of 0.1 kcal/mol in methanol solvent).

Calculation of Partition Coefficients of Organic Compounds and Peptides. G-SFED was applied to predict the octanol-water partition coefficient (log P) of 601 organic compounds (61) and 193 nonzwitterionic peptides (67). Peptide data consist of 185 terminally blocked and eight cyclic neutral peptides of 1–11 amino acids in length. The geometries of the organic molecules and peptides were obtained by energy minimization in the gas-phase, using HF/MIDIx and the consistent force field (CFF, Discovery Studio; Accelrys), respectively. The log P values were calculated directly from the difference between solvation free energies in water ($\Delta G_{\text{water}}$) and 1-octanol ($\Delta G_{\text{1oct}}$):

$$\log P = -\frac{1}{2.303RT}\left(\Delta G^{\text{cal}}_{\text{1oct}} - \Delta G^{\text{cal}}_{\text{water}}\right). \tag{7}$$

Here $R$ is the universal gas constant, and $T$ is the system temperature. In Fig. 5, calculated partition coefficients of the organic compounds and the peptides are plotted against experimental partition coefficients. The MAE and $R^2$ of the organic compounds were 0.33 log units and 0.94, respectively. The model predicts the values well for most of the 601 organic compounds; however, 20 molecules, including sulfonamides and multihalogenated compounds, were predicted with absolute error larger than 1.0 log units because of inaccuracy of $q_i$ for these molecules (54, 55).

Fig. 5. The calculated octanol-water partition coefficients from G-SFED vs. experimental values for 601 organic molecules (A) and 193 neutral peptides (B). The MAEs and $R^2$s were 0.33 log units and 0.94, respectively, for A, and 0.41 log units and 0.91, respectively, for B. If the single largest outlier [(Me)Arg-Lys-Pro-Trp-tLeu-LeuOEt] in B is removed, the MAE and $R^2$ are 0.40 log unit and 0.92, respectively.

Partition coefficients of peptides were computed with empirical hydrogen-bond acidity and basicity parameters. Prediction accuracy for all peptides was excellent even for large cyclic peptides (RA-V, RA-VII, RA-V_acetate, melanotan, sandostatin, and cyclosporin) within one log unit where other methods are inaccurate (Table S5). The MAE and $R^2$ for 193 peptides were 0.41 log units and 0.91, respectively. Of 193 peptides, 133 were predicted within 0.5 log unit, and only 10 peptides were predicted over one log unit. The error originates mainly from structural changes upon dissolution and a proportional systematic error of the model described above. The application of the G-SFED model to macromolecules, especially to proteins, is one of the most important goals of our research. With the computational efficiency shown in Fig. 6, such a computational accuracy for large molecules indicates that G-SFED can be applied to macromolecules.

Fig. 6. Computation time vs. chain length. G-SFED and SM5.42R are written in C++ and FORTRAN 77, respectively. Because the G-SFED model is based on simple empirical functions, the runtime of the model is linearly proportional to the number of nonhydrogen atoms of a molecule.

Conclusions

We have described a generalized empirical continuum solvation model, G-SFED, for calculation of the solvation free energies of any solute and solvent pairs. G-SFED aims to provide a practical uniform framework based on a proper description of the effects of solute and solvent on solvation. In the model, effects of the solute on various solute–solvent interactions were represented by empirical functions of the solute properties, and the complementary effects of the solvent on these interactions were reflected with a few macroscopic solvent properties. G-SFED provides accurate prediction results for both small nonpolar and large multipolar molecules in diverse solvents. Based on the computational efficiency and accuracy for large molecules, as shown by calculation of peptide octanol-water partition coefficients, G-SFED can be applied to very large systems, and even to proteins.

ACKNOWLEDGMENTS. This study was supported by Grants A100096 and A085105 from the Korea Healthcare Technology Research and Development Project, Ministry for Health, Welfare and Family Affairs, Korea; Grant 2011-0001245 from the Translational Research Center for Protein Function Control, National Research Foundation of Korea; Grant GM-14312 from the National Institutes of Health; and Grant MCB-1019767 from the US National Science Foundation.

1. Mahato RI, Narang AS, Thoma L, Miller DD (2003) Emerging trends in oral delivery of peptide and protein drugs. Crit Rev Ther Drug Carrier Syst 20(2-3):153–214.
2. Fjell CD, Hiss JA, Hancock REW, Schneider G (2012) Designing antimicrobial peptides: Form follows function. Nat Rev Drug Discov 11(1):37–51.
3. Ángyán JG (1992) Common theoretical framework for quantum chemical solvent effect theories. J Meth Phys 10:93–137.
4. Tomasi J, Persico M (1994) Molecular interactions in solution: An overview of methods based on continuous distributions of the solvent. Chem Rev 94(7):2027–2094.
5. Smith PE, Pettitt BM (1994) Modeling solvent in biomolecular systems. J Phys Chem 98(39):9700–9711.
6. Chambers CC, Hawkins GD, Cramer CJ, Truhlar DG (1996) Model for aqueous solvation based on class IV atomic charges and first solvation effects. J Phys Chem 100(40):16385–16398.
7. Cramer CJ, Truhlar DG (1999) Implicit solvation models: Equilibria, structure, spectra, and dynamics. Chem Rev 99(8):2161–2200.
8. Tomasi J, Mennucci B, Cammi R (2005) Quantum mechanical continuum solvation models. Chem Rev 105(8):2999–3093.
9. Born M (1920) Volumen und Hydratationswärme der Ionen. Z Phys 1:45–48.
10. Bell RP (1931) The electrostatic energy of dipole molecules in different media. Trans Faraday Soc 27:797–802.
11. Kirkwood JG (1934) Theory of solutions of molecules containing widely separated charges with special application to zwitterions. J Chem Phys 2:351–361.
12. Onsager L (1936) Electric moments of molecules in liquids. J Am Chem Soc 58(8):1486–1493.
13. Fowler R, Guggenheim EA (1956) Statistical Thermodynamics (Cambridge Univ Press, London).
14. Rogers NK, Sternberg MJE (1984) Electrostatic interactions in globular proteins. Different dielectric models applied to the packing of α-helices. J Mol Biol 174(3):527–542.
15. Rogers NK, Moore GR, Sternberg MJE (1985) Electrostatic interactions in globular proteins: Calculation of the pH dependence of the redox potential of cytochrome c551. J Mol Biol 182(4):613–616.
16. Klapper I, Hagstrom R, Fine R, Sharp K, Honig B (1986) Focusing of electric fields in the active site of Cu-Zn superoxide dismutase: Effects of ionic strength and amino-acid modification. Proteins 1(1):47–59.
17. Nicholls A, Honig B (1991) A rapid finite difference algorithm, utilizing successive over-relaxation to solve the Poisson-Boltzmann equation. J Comput Chem 12:435–445.
18. Zauhar RJ, Morgan RS (1985) A new method for computing the macromolecular electric potential. J Mol Biol 186(4):815–820.
19. Rashin AA, Namboodiri K (1987) A simple method for the calculation of hydration of polar molecules with arbitrary shapes. J Phys Chem 91(23):6003–6012.
20. Rashin AA (1990) Hydration phenomena, classical electrostatics, and the boundary element method. J Phys Chem 94(5):1725–1733.
21. Vorobjev YN, Scheraga HA (1997) A fast adaptive multigrid boundary element method for macromolecular electrostatics in a solvent. J Comput Chem 18(4):569–583.
22. Vorobjev YN, Vila JA, Scheraga HA (2008) FAMBE-pH: A fast and accurate method to compute the total solvation free energies of proteins. J Phys Chem B 112(35):11122–11136.
23. Tapia O, Goscinski O (1975) Self-consistent reaction field theory of solvent effects. Mol Phys 29(6):1653–1661.
24. Miertus S, Scrocco E, Tomasi J (1981) Electrostatic interaction of a solute with a continuum. A direct utilization of ab initio molecular potentials for the prevision of solvent effects. Chem Phys 55:117–129.
25. Cancès E, Mennucci B, Tomasi J (1997) A new integral equation formalism for the polarizable continuum model: Theoretical background and applications to isotropic and anisotropic dielectrics. J Chem Phys 107(8):3032–3041.
26. Mennucci B, Cancès E, Tomasi J (1997) Evaluation of solvent effect in isotropic and anisotropic dielectric and in ionic solutions with a unified integral equation method: Theoretical bases, computational implementation, and numerical applications. J Phys Chem B 101(49):10506–10517.
27. Cancès E, Mennucci B (1998) New applications of integral equations methods for solvation continuum models: Ionic solutions and crystals. J Math Chem 23(3-4):309–326.
28. Barone V, Cossi M (1998) Quantum calculation of molecular energies and energy gradients in solution by a conductor solvent model. J Phys Chem A 102(11):1995–2001.
29. Cossi M, Rega N, Scalmani G, Barone V (2003) Energies, structures, and electronic properties of molecules in solution with the C-PCM solvation model. J Comput Chem 24(6):669–681.
30. Tuckerman ME, Marx D, Klein ML, Parrinello M (1997) On the quantum nature of the shared proton in hydrogen bonds. Science 275(5301):817–820.
31. Salahub DR, Martinez A, Wei DQ (1998) Theory of Atomic and Molecular Chemistry (Springer, New York).
32. Santra B, Michaelides A, Scheffler M (2007) On the accuracy of density-functional theory exchange-correlation functionals for H bonds in small water clusters: Benchmarks approaching the complete basis set limit. J Chem Phys 127(18):184104.
33. Raevsky OA, Grigor'ev VY, Kireev DB, Zefirov NS (1992) Complete thermodynamic description of H-bonding in the framework of multiplicative approach. Quant Struct-Act Relat 11(1):49–63.
34. Raevsky OA (1999) Molecular structure descriptors in the computer-aided design of biologically active compounds. Russ Chem Rev 68(6):505–524.
35. Chothia C (1974) Hydrophobic bonding and accessible surface area in proteins. Nature 248(446):338–339.
36. Miertus S, Tomasi J (1981) Approximate evaluations of the electrostatic free energy and internal energy changes in solution processes. Chem Phys 65(2):239–245.
37. Eisenberg D, McLachlan AD (1986) Solvation energy in protein folding and binding. Nature 319(6050):199–203.
38. Kang YK, Gibson KD, Némethy G, Scheraga HA (1988) Free energies of hydration of solute molecules. 4. Revised treatment of the hydration shell model. J Phys Chem 92(16):4739–4742.
39. Swanson JMJ, Henchman RH, McCammon JA (2004) Revisiting free energy calculations: A theoretical connection to MM/PBSA and direct calculation of the association free energy. Biophys J 86(1):67–74.
40. Klamt A (1995) Conductor-like screening model for real solvents: A new approach to the quantitative calculation of solvation phenomena. J Phys Chem 99(7):2224–2235.
41. Klamt A, Schüürmann G (1993) COSMO: A new approach to dielectric screening in solvents with explicit expressions for the screening energy and its gradient. J Chem Soc, Perkin Trans 2 799–805.
42. Giesen DJ, Gu MZ, Cramer CJ, Truhlar DG (1996) A universal organic solvation model. J Org Chem 61(25):8720–8721.
43. Li J, et al. (1999) Extension of the platform of applicability of the SM5.42R universal solvation model. Theor Chem Acc 103(1):9–63.
44. Cramer CJ, Truhlar DG (2008) A universal approach to solvation modeling. Acc Chem Res 41(6):760–768.
45. Son SH, Han CK, Ahn SK, Yoon JH, No KT (1999) Development of three-dimensional descriptors represented by tensors: Free energy of hydration density tensor. J Chem Inf Comput Sci 39(3):601–609.
46. No KT, Kim SG, Cho K-H, Scheraga HA (1999) Description of hydration free energy density as a function of molecular physical properties. Biophys Chem 78(1-2):127–145.
47. Nam K-Y, Cho DH, Paek K, No KT (2002) Investigation of some amino acids conformations at the interface of binary mixture using the solvation free energy density model. Chem Phys Lett 364(3-4):267–272.
48. In Y, Chai HH, No KT (2005) A partition coefficient calculation method with the SFED model. J Chem Inf Model 45(2):254–263.
49. Lee S, et al. (2011) Calculation of the solvation free energy of neutral and ionic molecules in diverse solvents. J Chem Inf Model 51(1):105–114.
50. Lee S, Cho K-H, Acree WE, Jr., No KT (2012) Development of surface-SFED models for polar solvents. J Chem Inf Model 52(2):440–448.
51. No KT, Grant JA, Scheraga HA (1990) Determination of net atomic charges using a modified partial equalization of orbital electronegativity method. 1. Application to neutral molecules as models for polypeptides. J Phys Chem 94(11):4732–4739.
52. No KT, Grant JA, Jhon MS, Scheraga HA (1990) Determination of net atomic charges using a modified partial equalization of orbital electronegativity method. 2. Application to ionic and aromatic molecules as models for polypeptides. J Phys Chem 94(11):4740–4746.
53. Park JM, No KT, Jhon MS, Scheraga HA (1993) Determination of net atomic charges using a modified partial equalization of orbital electronegativity method. III. Application to halogenated and aromatic molecules. J Comput Chem 14(12):1482–1490.
54. No KT, Cho K-H, Jhon MS, Scheraga HA (1993) An empirical method to calculate average molecular polarizabilities from the dependence of effective atomic polarizabilities on net atomic charge. J Am Chem Soc 115(5):2005–2014.
55. Park JM, Kwon OY, No KT, Jhon MS, Scheraga HA (1995) Determination of net atomic charges using a modified partial equalization of orbital electronegativity method. IV. Application to hypervalent sulfur- and phosphorus-containing molecules. J Comput Chem 16(8):1011–1026.
56. Lide DR (2005) CRC Handbook of Chemistry and Physics (CRC, Boca Raton).
57. Speight JG (2005) Lange's Handbook of Chemistry (McGraw-Hill, New York).
58. Uhlig HH (1937) The solubilities of gases and surface tension. J Phys Chem 41(9):1215–1226.
59. Abraham MH (1993) Scales of solute hydrogen-bonding: Their construction and application to physicochemical and biochemical processes. Chem Soc Rev 22:73–83.
60. Abraham MH (1993) Hydrogen bonding. 31. Construction of a scale of solute effective or summation hydrogen-bond basicity. J Phys Org Chem 6(12):660–684.
61. Abraham MH, Chadha HS, Whiting GS, Mitchell RC (1994) Hydrogen bonding. 32. An analysis of water-octanol and water-alkane partitioning and the delta log P parameter of Seiler. J Pharm Sci 83(8):1085–1100.
62. Frisch MJ, et al. (2004) Gaussian 03 (Gaussian, Inc., Pittsburgh), 6.0.
63. Thompson JD, Cramer CJ, Truhlar DG (2004) New universal solvation model and comparison of the accuracy of the SM5.42R, SM5.43R, C-PCM, D-PCM, and IEF-PCM continuum solvation models for aqueous and organic solvation free energies and for vapor pressures. J Phys Chem A 108(31):6532–6542.
64. Hawkins GD, et al. (2004) AMSOL-version 7.1 (University of Minnesota, Minneapolis).
65. Hunter CA (2004) Quantifying intermolecular interaction: Guidelines for the molecular recognition toolbox. Angew Chem Int Ed 43(40):5310–5324.
66. Abraham MH, Gola JMR, Cometto-Muñiz JE, Acree WE, Jr. (2010) Hydrogen bonding between solutes in solvents octan-1-ol and water. J Org Chem 75(22):7651–7658.
67. Buchwald P, Bodor N (1998) Octanol-water partition of nonzwitterionic peptides: Predictive power of a molecular size-based model. Proteins 30(1):86–99.

www.pnas.org/cgi/doi/10.1073/pnas.1221940110 | Lee et al.