Chemical Physics 307 (2004) 187–199 www.elsevier.com/locate/chemphys

Exploring the effects of hydrogen bonding and hydrophobic interactions on the foldability and cooperativity of helical proteins using a simplified atomic model

Michael Knott, Hue Sun Chan *

Protein Engineering Network of Centres of Excellence (PENCE), Department of , 1 Kings College Circle, University of Toronto, Toronto, Ont., Canada M5S 1A8 Department of Medical Genetics and Microbiology, Faculty of Medicine, University of Toronto, Toronto, Ont., Canada M5S 1A8

Received 16 February 2004; accepted 3 June 2004 Available online 6 July 2004

Abstract

Using a simplified atomic model, we perform Langevin dynamics simulations of polypeptide chains designed to fold to one-, two- and three-helix native conformations. The impact of the relative strengths of the hydrophobic and hydrogen bonding interactions on folding is investigated. Provided that the two interactions are appropriately balanced, a simple potential function allows the chains to fold to their respective target native structures, which are essentially lowest-energy conformations. However, if the hydrophobic interaction is too strong, helix formation is preempted by hydrophobic collapse into compact conformations with little helical content. While the transition from denatured to compact non-native conformations is not cooperative, the transition between native (helical) and denatured states exhibits certain cooperative features. The degree of apparent cooperativity increases with the length of the polypeptide, but it falls far short of that observed experimentally for small, single-domain "two-state" proteins. Even for the three-helix bundle, the present model interaction scheme leads to a distribution of energy which is not bimodal, although a peak associated with thermal unfolding is observed. This finding suggests that models with simple pairwise additive interaction schemes, involving hydrophobic interactions and hydrogen bonding, can mimic the folding of small helical proteins, but that such models are insufficient to produce high degrees of folding cooperativity. Certain features of our model are reminiscent of a recent scenario proposed for downhill folding. Their ramifications are discussed. © 2004 Elsevier B.V. All rights reserved.

Keywords: Calorimetry; Langevin dynamics; Two-state cooperativity; Helical proteins; Radius of gyration; Heat capacity

1. Introduction

Polymer chain models are indispensable tools for the understanding of protein folding. In particular, it has become clear that we can gain considerable insight into protein energetics by rigorously evaluating the ability of explicit-chain, self-contained polymer models to reproduce generic experimental properties of proteins. A prime example of such a property is cooperativity: many small single-domain proteins exhibit a high degree of thermodynamic (calorimetric) cooperativity and of folding/unfolding kinetic cooperativity. Thermodynamic cooperativity of protein folding refers to an apparent two-state behaviour deduced from differential scanning calorimetry and other measurements. This property implies that a given protein's conformational distribution (distribution in enthalpy or energy in the case of calorimetry) is well separated into two populations, native and denatured, at the folding/unfolding transition midpoint, with very low (though not non-existent) intermediate conformational population in between. Kinetic cooperativity of protein folding refers to a protein's apparent two-state folding and unfolding relaxation as characterised by a linear chevron plot.

* Corresponding author. Tel: +1-416-978-2697; fax: +1-416-978-8548. E-mail address: [email protected] (H.S. Chan).

0301-0104/$ - see front matter © 2004 Elsevier B.V. All rights reserved. doi:10.1016/j.chemphys.2004.06.014

Comparison of the behaviours of several representative models has shown that kinetic cooperativity is correlated with a high degree of thermodynamic cooperativity [1].

Because of their computational tractability, continuum (off-lattice) Gō models [2–5] have been instrumental in elucidating the folding behaviour of real proteins. However, as is the case for their lattice counterparts [6–8], common continuum Gō models (at least those tested so far) fail to reproduce the linear chevron plots that are observed in real "two-state" proteins [1,4], because kinetic trapping in these models, though not serious, becomes non-negligible under mildly native conditions [8].

Kinetic trapping in these models arises from stable compact non-native conformations, which have energies intermediate between the native and fully unfolded values and thus are associated with attenuated thermodynamic cooperativity. A possible reason for the insufficiently high cooperativity of the common Gō models is that their interaction schemes are essentially pairwise additive: recent investigations indicate that many-body interactions [9,10] are probably necessary for a high level of cooperativity quantitatively similar to that observed in some real, small, single-domain proteins. Indeed, it has been demonstrated, using native-centric lattice constructs, that a many-body local–non-local coupling mechanism [1,10] can lead to enhanced cooperativity as well as to a strong correlation between contact order and folding rate, reminiscent of the correlation that is observed experimentally [11]. This suggests that a similar local–non-local cooperative mechanism may be at play in certain classes of real proteins (see also [12]).

This line of investigation highlights our limited understanding of protein energetics: it shows that highly cooperative behaviour is non-trivial. A protein chain model that purports to incorporate simple, intuitive intrachain interactions will not inevitably be highly cooperative.

On the other hand, it is equally important to recognise that not all natural proteins are highly cooperative. In view of recent advances in protein science, it is reasonable to expect that a broad range of polymer behaviours can be exploited by Nature for biological function. It has been pointed out [1] that polymer models with somewhat reduced cooperativity (certain Gō models, for example) can be appropriate for proteins with chevron rollovers [8]. And polymer models with even much diminished cooperativity would be applicable to the study of downhill folding [13–16].

Although Gō-like native-centric constructs are useful for making conceptual advances [2–4], a fundamental physical understanding of protein folding ultimately requires an atomistic approach based on general, transferable interaction schemes that are applicable to all proteins. Recent years have witnessed tremendous progress in atomic simulation of peptide and protein folding. There have been all-atom simulations with explicit treatment of the aqueous solvent [17–21], implicit-solvent investigations [22–27], and critical comparisons of different approaches (see, e.g., [28,29]). Some of these efforts have been facilitated by massive distributed computing [24,27]. The most impressive advances have involved the simulation of small peptides and "mini-proteins", with several notable cases of successful in silico folding to stable conformations that essentially coincide with known native structures (e.g., [18,19,23,27]; see also the recent review [30] and references therein).

Traditionally, the main emphasis of many all-atom studies has been on molecular structures rather than on energetics [20,31]. This is because the complexity of these models has often precluded the extensive conformational sampling that statistical mechanical considerations require. Recently, however, simulations have begun to overcome this problem. Reversible folding and unfolding have been achieved in all-atom explicit-solvent simulations of small peptides, allowing thermodynamic issues to be addressed [30,32].

There are multiple approaches to the level of detail incorporated in a simulation model. All-atom simulation is perhaps the only approach that could, in principle, predict the details of the folding process of a given protein. However, we wish to address the question of cooperativity, and it is currently still problematic to use all-atom explicit-solvent models to address the folding thermodynamics of proteins with 50 or more residues. This is because explicit-solvent simulations of proteins of such sizes have yet to obtain a single continuous folding trajectory from an open conformation to the native structure. In all-atom implicit-solvent models, on the other hand, the lowest-energy conformations in the model often do not correspond to the known native structures [24,25].

With this background, a complementary approach, very much in the traditional spirit of physics, is to use simplified continuum chain representations that are reasonably geometrically accurate and that capture the desired aspects of the real system [33,34]. It is unlikely that this approach could ever predict the full behaviour of a specific protein, but by allowing us to make changes to a relatively small input of information, and to evaluate their effects on the results, it enables us to gain a general theoretical understanding of the process [35–38]. In accordance with this perspective, the overall goals of our effort to employ simplified models to address protein folding cooperativity are the following.

(i) To delineate the strengths and weaknesses of simplified, implicit-solvent protein models with respect to their ability to mimic real protein behaviour.
(ii) To build simplified models that can better capture real protein cooperativity, as part of an endeavour to develop novel hypotheses for possible cooperativity mechanisms, e.g., local–non-local coupling [1,10].
(iii) To facilitate experimental investigations into possible physical and evolutionary origins of cooperativity, e.g., to ask why coarse-grained pairwise additive interactions [4,7,39] may be insufficient [40] and how mutations and protein re-designs may affect a hypothesised cooperativity mechanism (item ii above).
(iv) To facilitate the theoretical search for the atomistic origins of cooperativity. It has yet to be shown whether existing all-atom explicit-solvent models can account for the high degree of cooperativity of many real, single-domain proteins. Ultimately, hypothesised coarse-grained cooperativity mechanisms need to be tested in all-atom constructs to gain insight into whether the physical picture offered by common atomic force fields is sufficient to reproduce proteinlike cooperativity or whether more novel physical understanding is required.

Recently, Wallin and co-workers [41–43] have used Monte Carlo simulation to investigate a simplified 3-letter protein model with an atomic representation of the peptide backbone. This model does not include explicit solvent or multibody interactions, and simplifies the polypeptide chain in a number of ways. Nevertheless, the results showed that a single-helix, and two- and three-helix bundles, could be formed as essentially the lowest-energy conformations of the model. The sequences were designed [44] to produce these results, but the energy function is general in that it does not rely on a knowledge of the target native structure (i.e., the energy function is non-Gō-like). Related, but slightly more complex, 5-letter and all-atom models [45–47] have also been used by the same group: [45] and [47] look at the three-helix B domain of staphylococcal protein A, which is a system that has been studied extensively using various computational methods (e.g., high-coordination lattice models [48], all-atom simulations using explicit solvent [21,49], implicit solvent [25,50], structure-based potentials [51] and Gō potentials [52]).

In this article, we present the results of simulating a model that is similar to the 3-letter model of Wallin and co-workers [41–43]. Instead of Monte Carlo sampling, Langevin dynamics is employed because it is more amenable to kinetic interpretation [53–55]. Aside from confirming that the model interactions do indeed produce helical structures, our main focus here is on thermodynamic cooperativity, an issue that has not been definitively addressed in the context of these models.

It is widely accepted that hydrogen bonding [56,57] and hydrophobicity [58,59] are main stabilising forces in protein folding [60,61]. Therefore, we are interested in ascertaining the extent to which a simple pairwise additive model of these forces can produce real proteinlike behaviour; in particular, we are interested here in whether such an interaction scheme can lead to calorimetric cooperativity. Although folding kinetics is not studied extensively in the present article, the investigation has kinetic implications, since a high degree of thermodynamic cooperativity appears to be a prerequisite for kinetic cooperativity [1].

2. Technical details

2.1. The model

The model and its potential energy function are similar to one used previously [41–43] in Monte Carlo simulation. It is a continuum model that represents every backbone atom N, Cα and C′, together with the O atom attached to the C′ and the H atom attached to the N; this allows hydrogen bonds to be represented as an interaction between H and O atoms. However, the model is not entirely all-atom, since the side group is represented as a single Cβ atom, and the H attached to the Cα is excluded in order to make the model computationally cheaper. The model allows three possibilities for each Cβ: it can be hydrophobic, polar or absent altogether. The first possibility permits a representation of the hydrophobic interaction; the last enables us to include a glycine-like residue to facilitate turns.

We study the three sequences (16-residue single-helix, 35-residue two-helix and 54-residue three-helix) used in [41,44]. Neighbouring helical sections are separated by three glycines. The sequences were designed to allow hydrophobic residues to come into contact in the two- and three-helix sequences. The numbers of backbone atoms, $N$, and side atoms, $N_s$, in these sequences are as follows. Single-helix: $N = 48$, $N_s = 46$; two-helix: $N = 105$, $N_s = 100$; three-helix: $N = 162$, $N_s = 154$. We use Å as the unit of length and atomic mass units (amu) as the unit of mass.

The conformational forces are derived from a potential energy function $V$, which is a function of the position vectors $\{\mathbf{r}_n\}$ of the atoms (labelled by $n$). It can be expressed as the sum of eight components,

$$ V = V_l + V_\theta + V_\chi + V_\omega + V_{dh} + V_{hc} + V_{hp} + V_{hb}. \tag{1} $$

Of these eight, the first four are designed to constrain the simulated molecule to physically realistic local conformations (bond lengths, angles, chirality and dihedral angles). This is their only purpose: the harmonic form of $V_l$, for example, is not being proposed as a realistic representation of the energetic cost of stretching or compressing a bond far from its equilibrium length.

$V_l$ keeps the lengths of the atomic bonds near their reference (equilibrium) values, and is given by

$$ V_l = k_l \sum_{m,n} \left( l^{m,n} - l_0^{m,n} \right)^2, \tag{2} $$

in which the summation is performed only over those pairs of atoms $(m, n)$ which are joined by a bond: a pair of atoms is only counted if $m$ and $n$ represent either a neighbouring pair of backbone atoms or a backbone atom and its attached side chain atom. Here, $l^{m,n} = |\mathbf{l}^{m,n}|$ is the magnitude of the vector $\mathbf{l}^{m,n} \equiv \mathbf{r}_n - \mathbf{r}_m$, i.e., the distance between the two atoms $m$ and $n$, $l_0^{m,n} = |\mathbf{l}_0^{m,n}|$ is the natural bond length [41,62] (from which the true value should not vary too much), and $k_l = 100$ Å$^{-2}$ is the parameter which determines the strength of the interaction.

$V_\theta$, which ensures that the bond angles remain near their natural values, is defined by

$$ V_\theta = \epsilon_\theta \sum_n \left( \theta^n - \theta_0^n \right)^2, \tag{3} $$

where $\theta^n$ is the angle made at a backbone atom by the bonds joining it to two other atoms (either two backbone atoms, or one backbone atom and one side chain atom), $\theta_0^n$ is the natural value of the same angle, and $\epsilon_\theta = 20$ is the interaction strength parameter. The summation is performed over all such angles $\theta^n$.

The third term in the energy is

$$ V_\chi = C_\chi \sum_{n=1}^{N_s} \left[ \left( \mathbf{l}^{m-1,m} \times \mathbf{l}^{m,m+1} \right) \cdot \mathbf{l}^{m,n} - \left( \mathbf{l}_0^{m-1,m} \times \mathbf{l}_0^{m,m+1} \right) \cdot \mathbf{l}_0^{m,n} \right]^2, \tag{4} $$

where $n$ represents a side chain atom and $m$ the backbone atom to which it is attached. $C_\chi = 20$ Å$^{-6}$ is the interaction strength parameter. This expression contains cross products of the form $\mathbf{v}_1 \times \mathbf{v}_2$ and dot products of the form $\mathbf{v}_1 \cdot \mathbf{v}_2$, where $\mathbf{v}_1$ and $\mathbf{v}_2$ are vectors. The summation is performed over all side chain atoms to ensure that the chain maintains the correct chirality; this is particularly important for the Cβ atom, since there is a significant difference between the biologically realistic left-handed isomer and the right-handed isomer. A similar method for enforcing chirality was employed in a study of helical protein folding by Takada et al. [44].

The fixed dihedral angle $\omega$ around the NC′ double bond is maintained by

$$ V_\omega = \epsilon_\omega \sum \left[ 1 - \cos(\omega - \omega_0) \right], \tag{5} $$

where $\omega_0$ is the minimum of the dihedral interaction, set to 180°, and $\epsilon_\omega = 1$ specifies the interaction strength. The summation is performed over all residues.

The remaining four components of $V$ account for the interactions that are intended to affect the secondary and tertiary structure of the model. The dihedral potential for angles $\phi$ and $\psi$ uses a traditional three-minimum energy [63] with minima at ±60° and 180°:

$$ V_{dh} = \epsilon_{dh} \sum \left[ 2 - \cos 3(\phi - \phi_0) - \cos 3(\psi - \psi_0) \right], \tag{6} $$

where $\phi_0$ and $\psi_0$ are minima of the dihedral interactions and $\epsilon_{dh} = 0.5$ is the interaction strength parameter [41]. The summation is performed over all residues.

The term $V_{hc}$ represents a hard-core interaction that maintains the self-avoidance of the chain. It is given by

$$ V_{hc} = \epsilon_{hc} \sum_{m,n} \left( \frac{\sigma_m + \sigma_n + \Delta\sigma_{mn}}{l^{m,n}} \right)^{12}, \tag{7} $$

where $\sigma_m$ and $\sigma_n$ are the radii of atoms $m$ and $n$, and $\Delta\sigma_{mn}$ is equal to zero unless one of the pair is a Cβ and the other is a C′, N or O separated from the Cβ by three bonds, in which case $\Delta\sigma_{mn} = 0.625$ Å. The introduction of $\Delta\sigma_{mn}$ is motivated by the fact that real side groups are spatially asymmetric: without $\Delta\sigma_{mn}$ the side groups represented by the Cβ atoms in the model would be effectively spherical. The summation is performed over all pairs of atoms (both backbone and side atoms) which are separated by three or more bonds, except pairs of hydrophobic Cβ (see below). The interaction strength parameter is $\epsilon_{hc} = 0.0034$ [41]. The interaction $V_{hc}$ has a cutoff at $l^{m,n} = 8$ Å.

The hydrophobic component of the energy is given by

$$ V_{hp} = \epsilon_{hp} \sum \left[ \left( \frac{\sigma_{hp}}{l^{m,n}} \right)^{12} - 2 \left( \frac{\sigma_{hp}}{l^{m,n}} \right)^{6} \right], \tag{8} $$

where the summation is performed over all pairs of hydrophobic Cβ, the equilibrium separation of the pair is $\sigma_{hp} = 5.0$ Å [41], and $\epsilon_{hp}$ determines the strength of the interaction. The above expression implies that $V_{hp} = 0$ at a separation where $l^{m,n}/\sigma_{hp} \approx 0.89$. In order to allow the attractive component of $V_{hp}$ to vary, while keeping the repulsive component constant to maintain the strong self-avoidance (excluded volume effect) of the chain, we set $\epsilon_{hp} = 2.2$ when $l^{m,n}/\sigma_{hp} < 0.89$ even when it takes a different value at larger separations. The interaction $V_{hp}$ has a cutoff at $l^{m,n} = 8$ Å.

Finally, the energy contribution of hydrogen bonds is given by

$$ V_{hb} = \epsilon_{hb} \sum \left[ 5 \left( \frac{\sigma_{hb}}{l^{m,n}} \right)^{12} - 6 \left( \frac{\sigma_{hb}}{l^{m,n}} \right)^{10} \right] \cos^2\alpha \, \cos^2\beta. \tag{9} $$

The summation is performed over all pairs of atoms that are composed of one O and one H. Here, $\sigma_{hb} = 2.0$ Å is the equilibrium separation of two atoms joined by a hydrogen bond and $\epsilon_{hb} = 2.8$ is the interaction strength parameter. Angle $\alpha$ is 180° minus the NHO angle, while $\beta$ is 180° minus the HOC′ angle. This energy term has no distance cutoff, but is set to zero if $\alpha > 90°$ or $\beta > 90°$. When we evaluate the total number of hydrogen bonds $N_{hb}$ in a particular conformation, a hydrogen atom and oxygen atom are considered to have formed a hydrogen bond if 1.5 Å $< l^{m,n} <$ 3.0 Å, $\alpha < 45°$ and $\beta < 45°$.

Table 1 provides the reference bond lengths and bond angles and the radii of the different atomic species [41,62].

The present energy function is similar to that used by Wallin and co-workers [41–43], and it is worthwhile to enumerate the differences, as follows. (i) The present model has extra energy terms $V_l$, $V_\theta$, $V_\chi$ and $V_\omega$. Instead of constraining the bond lengths, bond angles and peptide torsional angles to fixed values as in the original Monte Carlo investigation [41], here they are allowed to vary slightly in our Langevin dynamics simulations. An effect of this difference is that our system is slightly more flexible. (ii) The present model allows $\epsilon_{hp}$ to take different values in two regimes defined by different separations between hydrophobic Cβ atoms [5]. This enables the strength of the hydrophobic interaction to be varied without adversely affecting excluded volume. Finally, (iii) this model uses a cutoff distance of 8 Å for the hard-core repulsion, whereas the original model uses 4.5 Å; but this difference is minor.

Table 1
Geometric parameters of the model

Atomic radii (Å)      Bond lengths (Å)      Bond angles (°)
N    1.65             NCα    1.46           C′NCα    121.7
Cα   1.85             CαC′   1.52           NCαC′    111.0
C′   1.85             C′N    1.33           CαC′N    116.6
H    1.00             NH     1.03           C′NH     119.5
Cβ   2.50             CαCβ   1.53           HNCα     118.2
O    1.65             C′O    1.23           NCαCβ    110.0
                                            CβCαC′   110.0
                                            CαC′O    121.1
                                            OC′N     122.3

2.2. Langevin dynamics

We simulate using Langevin dynamics [55]. Each atom has the following equation of motion for each of the three spatial components:

$$ m \frac{\partial v^i(t)}{\partial t} = F^i_{conf}(t) - m \gamma v^i(t) + \eta^i(t), \tag{10} $$

where $m$ is the mass of the atom, $t$ is time, $\gamma$ is the coefficient of friction, and $v^i(t)$, $F^i_{conf}(t)$ and $\eta^i(t)$ represent spatial components of the velocity, conformational force and random force, respectively, as functions of time. The conformational force is the negative gradient of the potential energy $V$ with respect to the position of the atom. Each component of the random force is given by

$$ \eta^i(t) = \sqrt{\frac{2 m \gamma k_B T}{\delta t}} \, \xi^i(t), \tag{11} $$

where $k_B$ is the Boltzmann constant, $T$ is the absolute temperature, and $\delta t$ is the integration time step. The random variable $\xi^i(t)$ is taken from a Gaussian distribution with zero mean and unit variance. The atomic masses are set to physically realistic values ($m = 12$ amu for Cα, Cβ and C′; $m = 14$ amu for N; $m = 16$ amu for O) except that $m = 12$ amu is used for H in order to avoid high frequency vibrations. (Note that the purpose of the harmonic terms is simply to constrain molecular geometry: they are not intended to be a good physical representation of the behaviour of a bond.)

On dimensional grounds, a good time scale for the system can be estimated [55] by $\tau = \sqrt{m_0 a_0^2 / \epsilon_0}$, where $m_0$, $a_0$ and $\epsilon_0$ are the mass, length and energy scales. Setting $m_0 = 12$ amu (the mass of the smallest atoms), $a_0 = 1$ Å (approximately the length of the shortest bonds) and $\epsilon_0 = 2.8$ (the energy parameter of the hydrogen bonds), we estimate $\tau = 4$ and define in terms of this the time step $\delta t = 0.005\tau$ and the coefficient of friction $\gamma = 0.05\tau^{-1}$. To simplify notation, energy and temperature units are chosen such that $k_B = 1$, as in [4,55].

Simulations were performed at a wide range of temperatures on (i) the single-helix model with $\epsilon_{hp} = 0.0$, $\epsilon_{hp} = 1.0$ and $\epsilon_{hp} = 2.2$; (ii) the two-helix model with $\epsilon_{hp} = 1.0$ and $\epsilon_{hp} = 2.2$; and (iii) the three-helix model with $\epsilon_{hp} = 1.0$ and $\epsilon_{hp} = 2.2$. All simulations started from a fully extended conformation, and each ran for at least 48 million time steps. Since equilibrium properties are the primary focus of the present investigation, the first 10 million time steps of each simulation were excluded from our calculation of thermodynamic averages.

2.3. Thermodynamic quantities

To assess the thermodynamic cooperativity of a model protein, we calculate its specific heat capacity $C_V(T)$ using the following relation, based on the variance of the energy $E$,

$$ C_V(T) = \frac{1}{k_B T^2} \left[ \langle E^2(T) \rangle - \langle E(T) \rangle^2 \right], \tag{12} $$

where $\langle X(T) \rangle$ is the Boltzmann average of quantity $X$ at $T$. For continuum models of protein dynamics it is important [5] that the energy be calculated as the sum of potential and kinetic energies, $E = V + E_K$: we cannot just set $E = V$ as is done in studies of lattice models [6,7]. The present model energy $E$ and heat capacity $C_V$ are analogues of the enthalpy $H$ and heat capacity $C_P$ in a constant-pressure system.
Since the contribution of the PV term is in general small for protein folding under ambient conditions, we can regard the simulated $E$ and $C_V$ here to be equivalent to $H$ and $C_P$. In other words, the present simulations are considered to be performed at constant pressure, and the computed heat capacities are taken to be comparable with calorimetrically determined values of $C_P$.

If the folding/unfolding transition is somewhat thermodynamically cooperative, there will be a peak in the heat capacity at a transition temperature $T_m$. We quantify this thermodynamic cooperativity using the ratio $\kappa_2 = \Delta H_{vH} / \Delta H_{cal}$, where

$$ \Delta H_{vH} = 2 \sqrt{k_B T_m^2 \, C_P(T_m)} \tag{13} $$

is the van't Hoff enthalpy at the transition midpoint temperature, and

$$ \Delta H_{cal} = \int dT \, C_P(T) \tag{14} $$

Fig. 1. Specific heat capacity CV (upper panel) and average radius of is the calorimetric enthalpy of the transition, where the gyration hRgi (lower panel), as functions of temperature T , for single- integral is performed across the transition region. We helix sequence with no hydrophobic interaction (ehp ¼ 0). Radius of ðsÞ gyration has units of A. Diamonds indicate values calculated from can also calculate a revised ratio j2 for DHvH=DHcal by applying a baseline subtraction [6] in order to make the direct simulations at the given temperatures. Lines joining these data points serve merely as a guide for the eye. simulated heat capacities more directly comparable with experimental results. Experimental baseline subtractions are applied with the aim of eliminating solvent contri- 4000 butions [64–66] (but see also [67]). The calorimetric criterion requires DHvH=DHcal 1 for an apparent two-

state protein. When this condition is satisfied, it follows V 2000 from simple considerations that a C given protein’s enthalpy distribution at the folding/un- folding transition midpoint is strongly bimodal with few 0 intermediate-enthalpy conformations [1]. Another indicator of a thermodynamically coopera- 18 tive transition is a steplike sigmoidal change in the > Boltzmann average of the radius of gyration hRgi, with g R 12 little further expansion in the chain as the temperature < increases further above Tm [5,6, and references therein]. The radius of gyration of a single conformation is 6 0.3 0.4 0.5 0.6 0.7 defined by T X 2 1 2 R ¼ jr hijr ; ð15Þ Fig. 2. Specific heat capacity CV (upper panel) and average radius of g N N n þ s n gyration hRgi (lower panel), as functions of temperature, for single- helix (diamonds) and two-helix (squares) sequences with ehp ¼ 1:0. whereP rn is the position of atom n and Radius of gyration has units of A. As in Fig. 1, symbols indicate values hri¼ n rn=ðN þ NsÞ is the mean of the positions of all calculated from direct simulations, lines joining symbols are only a the atoms.qffiffiffiffiffi The sum is performed over all atoms n,and guide for the eye. 2 Rg Rg. models with relatively strong hydrogen bonding inter- action (ehb fixed at 2.8) but zero (ehp ¼ 0:0) or not too 3. Results and discussion strong (ehp ¼ 1:0) hydrophobic interactions exhibit some degree of thermodynamic cooperativity. This is indi- 3.1. Foldability, cooperativity, and heat capacity cated by a peak in the heat capacity function at some transition temperature, with the rate of change of radius To get a qualitative idea of the degree of thermody- of gyration with respect to temperature also being larger namic cooperativity of the models, we use Eqs. (12) and in the region of the transition temperature than at other (15) to calculate the specific heat capacity CV and the temperatures. 
average radius of gyration hRgi as functions of temper- Below the transition temperature, the system is found ature (Figs. 1–4). The first three figures show that largely in helical conformations (a single-helix or a two- M. Knott, H.S. Chan / Chemical Physics 307 (2004) 187–199 193

4000 moidal behaviour in the radius of gyration. This ob- servation (which is consistent with previous observations [41,42]) indicates that, for this class of

V 2000 models, the degree of thermodynamic cooperativity in- C creases with chain length. However, with stronger hydrophobic interactions 0 ehp ¼ 2:2 (Fig. 4), we see no evidence of apparent two-state behaviour. Moreover, the low-temperature 18 conformational radius of gyration when ehp ¼ 2:2is significantly smaller than it is when the hydrophobic > g interactions are weaker (Figs. 1–3). Hydrophobic col-

R 12 < lapse is the main factor driving the formation of these compact conformations, which have fewer hydrogen 6 0.3 0.4 0.5 0.6 0.7 bonds (data not shown) than the low-temperature heli- T cal conformations at ehp ¼ 1:0. In contrast, we note that in the Monte Carlo study of Fig. 3. Specific heat capacity C (upper panel) and average radius of V Irback€ et al. [41], low-energy helical conformations were gyration hRgi (lower panel), as functions of temperature, for three-helix sequence with ehp ¼ 1:0. Radius of gyration has units of A. Symbols observed at ehp ¼ 2:2 (with the same ehb ¼ 2:8 as was indicate values calculated from direct simulations at different temper- used here). We would expect that the difference between atures. Solid lines joining the symbols serve merely as a guide for the the two studies is caused by the differences in the for- eye. The dashed curves are extrapolations obtained by applying stan- mulations of the models, namely the presence of addi- dard histogram techniques to the results of conformational sampling at tional energy terms Vl, Vh, Vv and Vx that allow for more the transition midpoint T ¼ Tm ¼ 0:42. Dotted straight lines in the heat capacity plot indicate possible alternative empirical baseline flexibility in our chain models. At a more qualitative subtractions for the determination of the van’t Hoff to calorimetric level, our result is consistent with Irback€ et al.’s [42] enthalpy ratio (see text for details). It is instructive to note that the subsequent observation that making the hydrophobic kinetic energy contribution 3ðN þ NsÞkB=2 ¼ 3ð162 þ 154Þ=2 ¼ 474 to interaction stronger relative to the hydrogen bonding the heat capacity of the three-helix sequence (cf. equipartition theorem; interaction reduced the degree of thermodynamic co- kB ¼ 1 in the present units) makes up approximately one half of the CV baseline value observed here. operativity in their model. 
In this context, it should also be pointed out that, unlike in Monte Carlo sampling (which does not take kinetic energy into account [41]), entropic effects in the present study are not restricted to configurational entropy. This is because the kinetic energy distribution in our model (see below) contributes to phase-space entropy and therefore has an impact on thermodynamic stability.

or three-helix bundle, depending on the sequence). The larger the model molecule, the more prominent the heat capacity peak and the more noticeable the steplike sig-

Fig. 4. Specific heat capacity $C_V$ (upper panel) and average radius of gyration $\langle R_g\rangle$ (lower panel), as functions of temperature, for single-helix (diamonds), two-helix (squares) and three-helix (circles) sequences with $\epsilon_{hp} = 2.2$. Radius of gyration has units of Å. Symbols indicate values calculated from direct simulations. Lines joining symbols are only a guide for the eye.

To evaluate the thermodynamic cooperativity of the three-helix model with $\epsilon_{hp} = 1.0$ against the experimental calorimetric criterion, we use its heat capacity function (upper panel of Fig. 3) to estimate the van't Hoff to calorimetric enthalpy ratio $\Delta H_{\mathrm{vH}}/\Delta H_{\mathrm{cal}}$. Two sets of possible empirical baselines are used for this purpose (dotted straight lines in the upper panel of Fig. 3). These example baselines are constructed to be reasonably similar to those employed in experiments. Using the lower pair of baselines, we find the $\Delta H_{\mathrm{vH}}/\Delta H_{\mathrm{cal}}$ ratio after baseline subtraction [6] to be $\kappa_2^{(s)} \approx 0.56$. Using the upper pair of baselines we find $\kappa_2^{(s)} \approx 0.86$. Both of these values are significantly smaller than unity, indicating that the thermodynamic behaviour of the three-helix model does not mimic the thermodynamics of those real proteins that exhibit apparent two-state properties. Baselines more steeply inclined than those drawn in Fig. 3 can reduce $\Delta H_{\mathrm{cal}}$ dramatically and thus push the value of $\kappa_2^{(s)}$ higher [6], but such baselines would appear artificial and would bear little resemblance to those used in experimental calorimetric analyses. Indeed, the low $\kappa_2^{(s)}$ values that we have obtained are a good indication of the very weakly cooperative behaviour of the present model; this behaviour is fundamentally related to its underlying energy distribution, the details of which will be discussed below.

It is noteworthy that, because the present three-helix model is not sufficiently two-state-like, and because the kinetic component of the total energy depends on the temperature, simulations at the transition midpoint do not adequately sample conformations that are populated at higher and lower temperatures. As a result, although curves for $C_V$ and $\langle R_g\rangle$ that are derived using histogram techniques from sampling results at $T = T_m = 0.42$ (dashed curves in Fig. 3) are a reasonably good approximation for a narrow regime in the transition region, this technique does not provide a good estimate for $C_V$ or $\langle R_g\rangle$ outside the immediate vicinity of the simulation temperature (Fig. 3). In contrast, for two-state-like systems such as the Go models that we investigated recently [5], there is less discrepancy between the directly simulated heat capacities and the histogram-estimated heat capacities based on sampling at the transition midpoint.
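The limitation of histogram extrapolation described above can be illustrated with a short sketch. It assumes a standard single-histogram reweighting of total energies sampled at one temperature, with $k_B = 1$ as in the present notation; this may differ in detail from the histogram technique actually used for Fig. 3. The function name `reweight_heat_capacity` and the effective-sample-size diagnostic `n_eff` are hypothetical and introduced only for illustration.

```python
import numpy as np

def reweight_heat_capacity(energies, t_sim, t_target):
    """Estimate <E>, C_V and an effective sample size at t_target
    from total energies sampled at t_sim (units with k_B = 1).

    Hypothetical helper: a plain single-histogram reweighting sketch,
    not the implementation used in the paper.
    """
    e = np.asarray(energies, dtype=float)
    delta_beta = 1.0 / t_target - 1.0 / t_sim

    # Boltzmann reweighting factors exp(-(beta - beta_sim) * E).
    # Shift by the maximum exponent before exponentiating to avoid overflow.
    log_w = -delta_beta * e
    log_w -= log_w.max()
    w = np.exp(log_w)
    w /= w.sum()

    mean_e = np.sum(w * e)
    var_e = np.sum(w * (e - mean_e) ** 2)
    c_v = var_e / t_target**2          # fluctuation formula with k_B = 1

    # Kish effective sample size: number of samples that really contribute.
    n_eff = 1.0 / np.sum(w**2)
    return mean_e, c_v, n_eff
```

When `t_target` moves away from `t_sim`, `n_eff` shrinks and the reweighted variance is dominated by a few samples at the edge of the sampled energy range, which is the behaviour the text attributes to a model that is not sufficiently two-state-like.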

3.2. Folding and unfolding trajectories

We now focus on structural and energetic details of the three-helix system with $\epsilon_{hp} = 1.0$. Among the models explored in this work, this model is structurally most similar to a small globular protein. Fig. 5 shows examples of structures observed in simulations of this model at different temperatures. Fig. 6 displays representative folding/unfolding trajectories, simulated at temperatures bracketing $T_m$. At low temperatures, this model's secondary structure comprises three helices; the three helices can be arranged in two topologically distinct ways, with the result that the model has two different possible tertiary structures, which have been referred to as "FU" and "BU" ("Front of U" and "Back of U") [41,44]. Conformations (i) and (ii) in Fig. 5 illustrate these two native structures (FU and BU, respectively) at low temperature. At $T < T_m$, the model collapses to one of the native structures, and the transition time between the two is too long for interconversion to be seen on the time scale of our simulations.

Fig. 5. Examples of conformations of the three-helix sequence with $\epsilon_{hp} = 1.0$ obtained during simulations at different temperatures $T$, as follows. The times at which these "snapshots" are taken are provided in parentheses, expressed as a number of time steps from the initiation of the simulation run: (i) $T = 0.35$ ($3.8 \times 10^7$); (ii) $T = 0.40$ ($3.0 \times 10^7$); (iii) $T = 0.40$ ($3.9 \times 10^7$); (iv) $T = 0.40$ ($4.2 \times 10^7$); (v) $T = 0.42$ ($2.4 \times 10^7$); (vi) $T = 0.42$ ($3.2 \times 10^7$); (vii) $T = 0.42$ ($4.2 \times 10^7$); (viii) $T = 0.42$ ($4.6 \times 10^7$); (ix) $T = 0.45$ ($4.1 \times 10^7$); (x) $T = 0.45$ ($4.8 \times 10^7$). The preparation of this figure was aided by RASMOL [68].

As $T$ approaches $T_m$, unfolded structures begin to be observed, and conformations (iii) and (iv) in Fig. 5 are examples of these. Conformation (iii) can be described as comprising three helices, but they are no longer confined in a bundle; conformation (iv) retains a large amount of helical content but cannot be described as having a "three-helix" structure.

Conformations (v)–(viii) were observed at the transition midpoint ($T = T_m = 0.42$). At this temperature the system samples a wide range of conformations, although they usually retain some helical structure. Conformation (v) is extended and unfolded, while (vi) is a more compact non-native conformation. Conformations (vii) and (viii) show nativelike and essentially native three-helix bundle conformations (FU and BU, respectively). These snapshots are taken 4 million time steps apart, indicating that the two native structures readily interconvert at the transition midpoint.

At $T > T_m$, the native structures are no longer observed, and the system is found in conformations composed largely of "random coil". Conformation (ix) of Fig. 5, observed at $T = 0.45$, has some helical regions, while conformation (x), observed at the same temperature, is devoid of any helical structure.

Fig. 6. Radius of gyration $R_g$ [Eq. (15)], number of hydrogen bonds $N_{hb}$ [defined in text below Eq. (9)] and potential energy $V$ [Eq. (1)] as a function of time (number of time steps from the initiation of the simulation), for the three-helix sequence with $\epsilon_{hp} = 1.00$ at (a) $T = 0.40$, (b) $T = 0.42$ and (c) $T = 0.45$. Vertical dotted lines labelled with small roman numerals indicate the times at which some of the "snapshot" conformations in Fig. 5 were taken.

This trend of folding/unfolding transition is more clearly illustrated by Fig. 6. Between $T = 0.40$ and $T = 0.45$, the system passes from a largely helical and compact state (small $R_g$ with a large number of hydrogen bonds $N_{hb}$ and low potential energy $V$) to a largely non-helical state which expands and contracts (varying $R_g$ with few hydrogen bonds, i.e., small $N_{hb}$, and high values of $V$). Consistent with its low calorimetric cooperativity, even at the transition midpoint the present three-helix model does not exhibit the sharp, steplike kinetic jumps that are typical of thermodynamically more cooperative systems such as certain Go models [3–5]. Nonetheless, reversible folding and unfolding are readily discernible from the trajectory at $T = T_m = 0.42$. Fig. 6 also confirms that three of the BU and FU conformations in Fig. 5 [conformations (ii), (vii) and (viii)] are indeed essentially lowest-energy conformations with maximal numbers of hydrogen bonds $N_{hb}$ and very low $V$ [cf. vertical dashed lines (ii), (vii) and (viii) in Fig. 6].

3.3. Energy distributions

To better understand the present three-helix model's lack of apparent two-state cooperativity, it is instructive to examine how the distributions of the different components of the energy vary with temperature. Fig. 7 provides a set of such data for the two most conformationally relevant components of the potential energy in the model, namely the hydrogen bonding energy $V_{hb}$ and the hydrophobic interaction energy $V_{hp}$. These scatter plots are obtained from a random selection of conformations encountered in the simulation of the three-helix model. In addition, Fig. 8 shows the scatter plots of kinetic energy $E_K$ versus the sum $V_{hb} + V_{hp}$ of the hydrogen bonding and hydrophobic interaction energies for the same set of conformations.

It is clear from Figs. 7 and 8 that the largest variation of energies occurs at the transition temperature $T_m = 0.42$. This is, of course, consistent with the presence of a peak in the heat capacity at this temperature, since $C_V$ is proportional to the variance of the total energy. Now, if the model's behaviour were truly two-state (in the finite-sized-system, protein folding sense [1]), we would expect to see the energies approximately confined to two temperature-independent regions, with both of the regions occupied in the vicinity of the transition temperature [1,6]. However, Figs. 7 and 8 show that the distributions of the hydrophobic and hydrogen bonding energies drift significantly as the temperature changes, and the distribution around the transition temperature is not bimodal. This echoes the absence of steplike jumps (sharp conformational transitions) along the trajectories in Fig. 6 and fits well with the fact that, even after reasonable baseline subtractions, the $\Delta H_{\mathrm{vH}}/\Delta H_{\mathrm{cal}}$ ratios computed from Fig. 3 are significantly less than unity [1,6].

It is interesting to note from Fig. 7 that, although the drift in the energy distribution is caused by drifts in both $V_{hb}$ and $V_{hp}$, the hydrogen bonding energy changes faster and more continuously than the hydrophobic energy, and therefore contributes more to the continuous drift. The vertical variations in Fig. 7 indicate that the distribution of the hydrophobic component of the energy in this model does have a limited amount of bimodal character. These features suggest that modifications to the hydrogen bonding energy function might be particularly effective at improving the cooperativity of the model.
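The connection drawn in this section between the variance of the energy, bimodality and the calorimetric ratio can be written out as a short worked relation. The following is a sketch for an idealised two-state system with $k_B$ kept explicit. The symbols $E_N$, $E_D$ and $\Delta E$ are introduced here purely for illustration, and the expression for $\Delta H_{\mathrm{vH}}$ is the form commonly used with the calorimetric criterion; it is assumed here rather than quoted from this article.

```latex
% Fluctuation relation for the heat capacity at fixed temperature T
C_V(T) = \frac{\langle E^2 \rangle - \langle E \rangle^2}{k_B T^2}

% Idealised two-state system: sharp energies E_N and E_D, each with
% population 1/2 at the midpoint T_m, and \Delta E = E_D - E_N
\langle E^2 \rangle - \langle E \rangle^2 = \left( \frac{\Delta E}{2} \right)^2
\quad \Longrightarrow \quad
C_V(T_m) = \frac{(\Delta E)^2}{4 k_B T_m^2}

% Assumed van't Hoff enthalpy: it recovers \Delta E in this limit
\Delta H_{\mathrm{vH}} = 2 T_m \sqrt{k_B \, C_V(T_m)} = \Delta E
```

In this idealised limit the area under a sharp heat capacity peak approaches $\Delta E$ as well, so the ratio is close to unity. A single broad distribution whose peak drifts with temperature has a smaller variance at $T_m$ for the same overall change in energy, which is one way to see why the ratio falls below unity when the distribution is not bimodal.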

Fig. 7. Distribution of simulated states of the three-helix sequence, with $\epsilon_{hp} = 1.0$, according to their hydrophobic energy $V_{hp}$ [Eq. (8)] and hydrogen bonding energy $V_{hb}$ [Eq. (9)], at five different temperatures: $T = 0.35$, $T = 0.40$, $T = 0.42$, $T = 0.45$ and $T = 0.50$. The energy scale is the same for all five scatter plots. Note that $V_{hb}$ very rarely becomes positive (though such incidences exist) because this can only arise from severe excluded-volume overlaps in the model.

Fig. 8. Distribution of simulated states of the three-helix sequence, with $\epsilon_{hp} = 1.0$, according to their kinetic energy $E_K$ and the sum $V_{hb} + V_{hp}$ [Eqs. (8) and (9)] of their hydrogen bonding and hydrophobic energies, at five different temperatures: $T = 0.35$, $T = 0.40$, $T = 0.42$, $T = 0.45$ and $T = 0.50$. The energy scale is the same for all five scatter plots.

Fig. 8 shows that, as expected from elementary statistical mechanics, the degree of fluctuation in the kinetic energy $E_K$ increases with temperature. As a check on self-consistency, we determine the effective temperature for each of the ensembles in Fig. 8 by equating it to 2/3 of the average kinetic energy per atom (note that $k_B = 1$ in the present notation). The effective temperatures are, from top to bottom, 0.3512, 0.4017, 0.4204, 0.4523, and 0.5019, which are very close to their respective simulation temperatures, as expected.

Fig. 9 provides another perspective on the temperature dependence of the energy distribution of the three-helix model. The fact that the total energy $E$, the total potential energy $V$ and $V_{hb} + V_{hp}$ all show a similar trend of continuous drift in distribution as a function of temperature indicates that this general behaviour is not an artefact of the bond-stretching and bond-angle harmonic potentials that were introduced to constrain chain geometry. The continuous drift in energy distribution, with a shifting peak population and an absence of clear bimodal distribution, is the basic physical reason for the model's lack of apparent two-state cooperativity.

Computationally, this feature of the model also explains why the heat capacity function calculated by applying the histogram method to simulation results at $T = T_m = 0.42$ (dashed curve in the upper panel of Fig. 3) is so inaccurate outside the immediate vicinity of $T_m$. The inaccuracy occurs because the position of the peak of the distribution of total energy $E$ changes dramatically as a function of temperature (Fig. 9). As a result, the distribution at $T = 0.42$ has little overlap with the distribution at $T = 0.45$, and negligible overlap with the distributions at $T = 0.35$ and $T = 0.50$. Hence, if we use the histogram method to extrapolate, say, from
$T = 0.42$ to $T = 0.35$, we are using only the small number of conformations that are found in the region of the overlap. These conformations, at the edge of the energy range sampled at $T = 0.42$, do not adequately sample the states which would be observed in a simulation at $T = 0.35$, so we cannot expect the extrapolation to be accurate. In addition, the overlapping conformations cover only a very restricted energy range. Consequently, the variance of the extrapolated energy distribution, and therefore the extrapolated $C_V$ curve, goes to zero very rapidly as we move away from $T_m$, while the directly calculated $C_V$ curve maintains a finite value over the entire temperature range studied (as it should, since the variance of the energy distribution cannot be zero in a finite temperature system).

Fig. 9. Population distributions, as functions of total energy $E$ (top panel), potential energy $V$ (middle panel), and the sum of hydrogen bonding and hydrophobic energies $V_{hb} + V_{hp}$ (bottom panel), for the three-helix sequence with $\epsilon_{hp} = 1.00$, at temperatures $T = 0.35$, $T = 0.40$, $T = 0.42$, $T = 0.45$ and $T = 0.50$. Each individually normalised distribution in every panel is constructed using 1000 bins for the energy. Each hump here represents the energy distribution at a different temperature: there are no double-humped distributions in this figure. The $V_{hb} + V_{hp}$ distribution here at $T = T_m = 0.42$ shares some qualitative similarities with the free energy profile of energy at the transition midpoint in Fig. 2 of [43].

Therefore, to obtain the temperature dependence of $C_V$ or $R_g$ for a model that is far from being two-state-like, such as the present three-helix model, it is necessary to perform multiple simulations at different temperatures. The technique of simulating only at the transition temperature, and then extrapolating using the histogram method, is not appropriate for such a model, although the technique can be less problematic for more cooperative models [3–5].

4. Concluding remarks

The simple Langevin dynamics models investigated in this article have an energy function that contains general (non-Go-like) pairwise additive hydrophobic and hydrogen bonding terms. Provided that the model hydrophobic interaction is not too strong relative to the hydrogen bonding interaction, the model 3-letter sequences are able to fold to the helical structures for which they are designed, and the target structures emerge as essentially lowest-energy conformations. This confirms the earlier Monte Carlo results of Wallin and co-workers [41–43]. Despite this success for helical proteins, it should be pointed out that modelling the folding of proteins into target structures with substantial non-helical elements would probably require a more detailed representation of the sidechain geometry and additional energetic heterogeneity as compared with the present 3-letter formulation [45,46].

If the hydrophobic interaction is too strong, the models studied here undergo hydrophobic collapse to conformations that are more compact but significantly less helical than the designed structures.

4.1. Lack of apparent two-state behaviour

The three-helix bundle protein model is a major focus of our study. We find that, although the heat capacity function of this model exhibits a well-defined peak at a certain transition temperature and its average radius of gyration undergoes a concomitant rapid increase, its folding/unfolding transition is not two-state-like. This lack of a high degree of cooperativity corresponds to an underlying energy distribution that is not bimodal, and is underscored by $\Delta H_{\mathrm{vH}}/\Delta H_{\mathrm{cal}}$ values that are significantly less than unity even when ample latitude is allowed for baseline subtractions. Our results thus emphasise that simple models with seemingly intuitive energy functions do not necessarily display apparent two-state behaviour.

We note that the effective interactions [terms in Eq. (1)] in the present study are taken to be temperature independent, whereas effective solvent-mediated intraprotein interactions are often temperature dependent. Aspects of the impact of temperature-dependent interactions on protein folding cooperativity have been explored [69], although much remains to be investigated. Nonetheless, inasmuch as hydrophobic and other interactions vary relatively smoothly with respect to temperature, such temperature dependences are (as demonstrated in a recent example lattice calculation [59]) not expected to alter fundamentally the present conclusion that coarse-grained pairwise additive interactions are insufficient for a high degree of protein folding thermodynamic cooperativity.

Indeed, using physically plausible interactions to construct a protein chain model that possesses folding/unfolding cooperativity similar to that of real, small, single-domain two-state proteins is non-trivial [1,10]. The temperature-induced drift observed in the present energy distribution is disproportionately caused by the hydrogen bonds. Hence, this particular component of the energy is a prime target for improvement if one aims to enhance the folding/unfolding cooperativity of the model. The hydrogen bonding interactions in the present model allow for substantial helical structure in the unfolded state [see, e.g. conformations (iii) and (iv) in Fig. 5 above], but this feature is not conducive to calorimetric cooperativity [70].

A possible avenue to enhanced cooperativity, therefore, would be to weaken the hydrogen bonding interaction in the unfolded state by coupling local hydrogen bond formation with hydrophobic collapse [7,10]. However, an earlier implementation of non-additive context-dependent hydrogen bonding did not lead to highly cooperative behaviour (cf. the heat capacity function in Fig. 14 of [44]); and even the sign of such a coupling remains controversial (see [56,70] and references therein). Other proposed non-additive cooperative interaction schemes [71–74] are insightful and potentially important, but they remain to be rigorously evaluated against experimental thermodynamic and kinetic criteria for protein folding cooperativity.

Thus, the results presented in this article highlight the limitations of current understanding of protein energetics. Further work is needed to unravel the links between thermodynamic cooperativity and the various components (both potential and kinetic) of the energy.

4.2. Downhill folding

Not all proteins fold in an apparent two-state manner, however: the folding of some real, small proteins is not highly cooperative. Recent experiments by Garcia-Mira et al. [15] have shown that folding/unfolding of the peripheral subunit binding domain BBL does not appear to involve a major free energy barrier, and therefore might be characterised as downhill folding.

Interestingly, our simple three-helix bundle model may be particularly useful for understanding such "barrierless" folding [43]. The present model's lack of calorimetric cooperativity and its gradual loss of structure during thermal unfolding are very similar to the corresponding properties of BBL: as Garcia-Mira et al. [15] have commented, "non-concerted unfolding starting from defined 3D structures is exactly what we expect for downhill folding". Using an empirical statistical mechanical model, these authors have predicted the distribution of conformational similarity to the native structure. Their predicted distribution is not bimodal and has a single peak value that shifts gradually with temperature. Their model did not employ an explicit chain representation [75]. Nonetheless, the trend it predicted is qualitatively similar to the (non-bimodal) energy distributions in Figs. 7–9 above. In connection with this, it should be mentioned that aspects of the low cooperativity of the present continuum 3-letter three-helix bundle model, especially its underlying energy distribution, show similarities to an early 3-letter lattice model (see Fig. 6A of reference [1]), as well as to the HP model [76].

Because the folding process of the three-helix bundle studied here resembles downhill folding in many respects, we expect that valuable insights into the thermodynamics and kinetics of downhill folding can be gained from further investigation of this model and of related protein chain models.

Acknowledgements

We thank Prof. Anders Irbäck and Dr. Stefan Wallin for helpful discussions, and Dr. Stefan Wallin for a critical reading of the manuscript. The research reported here was partially supported by the Canadian Institutes of Health Research (CIHR Grant No. MOP-15323), PENCE, and a Premier's Research Excellence Award from the Province of Ontario. H.S.C. holds a Canada Research Chair in Biochemistry.

References

[1] H.S. Chan, S. Shimizu, H. Kaya, Methods Enzymol. 380 (2004) 350.
[2] C. Micheletti, J.R. Banavar, A. Maritan, F. Seno, Phys. Rev. Lett. 82 (1999) 3372.
[3] C. Clementi, H. Nymeyer, J.N. Onuchic, J. Mol. Biol. 298 (2000) 937.
[4] H. Kaya, H.S. Chan, J. Mol. Biol. 326 (2003) 911.
[5] M. Knott, H. Kaya, H.S. Chan, Polymer 45 (2004) 623.
[6] H. Kaya, H.S. Chan, Proteins 40 (2000) 637; Erratum 43 (2001) 523.
[7] H. Kaya, H.S. Chan, J. Mol. Biol. 315 (2002) 899.
[8] H. Kaya, H.S. Chan, Phys. Rev. Lett. 90 (2003) 258104.
[9] P. Pokarowki, A. Kolinski, J. Skolnick, Biophys. J. 84 (2003) 1518.
[10] H. Kaya, H.S. Chan, Proteins 52 (2003) 524.
[11] K.W. Plaxco, K.T. Simons, D. Baker, J. Mol. Biol. 227 (1998) 985.
[12] A.I. Jewett, V.S. Pande, K.W. Plaxco, J. Mol. Biol. 326 (2003) 247.
[13] J.D. Bryngelson, J.N. Onuchic, N.D. Socci, P.G. Wolynes, Proteins 21 (1995) 167.
[14] J. Sabelko, J. Ervin, M. Gruebele, Proc. Natl. Acad. Sci. USA 96 (1999) 6031.
[15] M.M. Garcia-Mira, M. Sadqi, N. Fischer, J.M. Sanchez-Ruiz, V. Muñoz, Science 298 (2002) 2191.
[16] S. Osvath, J.J. Sabelko, M. Gruebele, J. Mol. Biol. 333 (2003) 187.
[17] Y. Duan, P.A. Kollman, Science 282 (1998) 740.
[18] X. Daura, K. Gademann, H. Schafer, B. Jaun, D. Seebach, W.F. van Gunsteren, J. Am. Chem. Soc. 123 (2001) 2393.
[19] A.E. García, K.Y. Sanbonmatsu, Proc. Natl. Acad. Sci. USA 99 (2002) 2782.
[20] A.R. Fersht, V. Daggett, Cell 108 (2002) 573.
[21] A.E. García, J.N. Onuchic, Proc. Natl. Acad. Sci. USA 100 (2003) 13898.
[22] N.A. Alves, U.H.E. Hansmann, J. Chem. Phys. 117 (2002) 2337.

[23] C. Simmerling, B. Strockbine, A.E. Roitberg, J. Am. Chem. Soc. 124 (2002) 11258.
[24] C.D. Snow, N. Nguyen, V.S. Pande, M. Gruebele, Nature 420 (2002) 102.
[25] J.A. Vila, D.R. Ripoll, H.A. Scheraga, Proc. Natl. Acad. Sci. USA 100 (2003) 14812.
[26] S. Chowdhury, W. Zhang, C. Wu, G. Xiong, Y. Duan, Biopolymers 68 (2003) 63.
[27] E. Paci, A. Cavall, M. Vendruscolo, A. Caflisch, Proc. Natl. Acad. Sci. USA 100 (2003) 8217.
[28] J.E. Shea, C.L. Brooks, Annu. Rev. Phys. Chem. 52 (2001) 499.
[29] H. Nymeyer, A.E. García, Proc. Natl. Acad. Sci. USA 100 (2003) 13934.
[30] S. Gnanakaran, H. Nymeyer, J. Portman, K.Y. Sanbonmatsu, A.E. García, Curr. Opin. Struct. Biol. 13 (2003) 168.
[31] S. Gianni, N.R. Guydosh, F. Khan, T.D. Caldas, U. Mayor, G.W.N. White, M.L. DeMarco, V. Daggett, A.R. Fersht, Proc. Natl. Acad. Sci. USA 100 (2003) 13286.
[32] L.J. Smith, X. Daura, W.F. van Gunsteren, Proteins 48 (2002) 487.
[33] D. Thirumalai, D.K. Klimov, Curr. Opin. Struct. Biol. 9 (1999) 197.
[34] T. Head-Gordon, S. Brown, Curr. Opin. Struct. Biol. 13 (2003) 160.
[35] D.C. Rapaport, Phys. Rev. E 66 (2002) 011906.
[36] J.Z.Y. Chen, H. Imamura, Physica A 321 (2003) 181.
[37] A. Mukherjee, B. Bagchi, J. Chem. Phys. 118 (2003) 4733.
[38] G.H. Wei, P. Derreumaux, N. Mousseau, J. Chem. Phys. 119 (2003) 6403.
[39] H. Kaya, H.S. Chan, Proteins 52 (2003) 510.
[40] M. Scalley-Kim, D. Baker, J. Mol. Biol. 338 (2004) 573.
[41] A. Irbäck, F. Sjunnesson, S. Wallin, Proc. Natl. Acad. Sci. USA 97 (2000) 13614.
[42] A. Irbäck, F. Sjunnesson, S. Wallin, J. Biol. Phys. 27 (2001) 169.
[43] G. Favrin, A. Irbäck, B. Samuelsson, S. Wallin, Biophys. J. 85 (2003) 1457.
[44] S. Takada, Z. Luthey-Schulten, P.G. Wolynes, J. Chem. Phys. 110 (1999) 11616.
[45] G. Favrin, A. Irbäck, S. Wallin, Proteins 47 (2002) 99.
[46] A. Irbäck, B. Samuelsson, F. Sjunnesson, S. Wallin, Biophys. J. 85 (2003) 1466.
[47] G. Favrin, A. Irbäck, S. Wallin, Proteins 54 (2004) 8.
[48] A. Kolinski, J. Skolnick, Proteins 18 (1994) 353.
[49] E.M. Boczko, C.L. Brooks, Science 269 (1995) 393.
[50] A. Ghosh, R. Elber, H.A. Scheraga, Proc. Natl. Acad. Sci. USA 99 (2002) 10394.
[51] E. Kussell, J. Shimada, E.I. Shakhnovich, Proc. Natl. Acad. Sci. USA 99 (2002) 5343.
[52] Y.Q. Zhou, A. Linhananta, J. Phys. Chem. B 106 (2002) 1481.
[53] J.D. Honeycutt, D. Thirumalai, Biopolymers 32 (1992) 695.
[54] Z. Guo, D. Thirumalai, Biopolymers 36 (1995) 83.
[55] T. Veitshans, D. Klimov, D. Thirumalai, Fold. Des. 2 (1997) 1.
[56] J.K. Myers, C.N. Pace, Biophys. J. 71 (1996) 2033.
[57] A. Kentsis, T.R. Sosnick, Biochemistry 37 (1998) 14613.
[58] K.A. Dill, Biochemistry 29 (1990) 7133.
[59] S. Shimizu, H.S. Chan, Proteins 48 (2002) 15.
[60] R. Srinivasan, G.D. Rose, Proteins 22 (1995) 81.
[61] K. Yue, K.A. Dill, Protein Sci. 5 (1996) 254.
[62] T.E. Creighton, Proteins: Structures and Molecular Properties, second ed., W.H. Freeman, New York, 1993.
[63] D.A. Brant, W.G. Miller, P.J. Flory, J. Mol. Biol. 23 (1967) 47.
[64] R. Lumry, R. Biltonen, J.F. Brandts, Biopolymers 4 (1966) 917.
[65] T.Y. Tsong, R.P. Hearn, D.P. Wrathall, J.M. Sturtevant, Biochemistry 9 (1970) 2666.
[66] P.L. Privalov, N.N. Khechinashvili, J. Mol. Biol. 86 (1974) 665.
[67] D. Yang, Y.K. Mok, J.D. Forman-Kay, N.A. Farrow, L.E. Kay, J. Mol. Biol. 272 (1997) 790.
[68] R.A. Sayle, E.J. Milner-White, Trends Biochem. Sci. 20 (1995) 374.
[69] S. Shimizu, H.S. Chan, Proteins 49 (2002) 560.
[70] H. Kaya, H.S. Chan, Phys. Rev. Lett. 85 (2000) 4823.
[71] A. Kolinski, W. Galazka, J. Skolnick, Proteins 26 (1996) 271.
[72] S.S. Plotkin, J. Wang, P.G. Wolynes, J. Chem. Phys. 106 (1997) 2932.
[73] B.A. Shoemaker, J. Wang, P.G. Wolynes, Proc. Natl. Acad. Sci. USA 94 (1997) 777.
[74] M.P. Eastwood, P.G. Wolynes, J. Chem. Phys. 114 (2001) 4702.
[75] J. Karanicolas, C.L. Brooks, Proteins 53 (2003) 740.
[76] H.S. Chan, Proteins 40 (2000) 543.