Cooperative folding kinetics of BBL protein and peripheral subunit-binding domain homologues

Wookyung Yu*, Kwanghoon Chung*†, Mookyung Cheon*‡, Muyoung Heo*§, Kyou-Hoon Han¶, Sihyun Ham‖, and Iksoo Chang*,**

*National Research Laboratory for Computational Proteomics and Biophysics, Department of Physics, Pusan National University, Busan 609-735, Korea; ‡Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom; ¶Molecular Cancer Research Center, Korea Research Institute for Bioscience and Biotechnology, Daejeon 305-806, Korea; and ‖Department of Chemistry, Sookmyoung Women’s University, Seoul 140-742, Korea

Edited by Alan R. Fersht, University of Cambridge, Cambridge, United Kingdom, and approved December 21, 2007 (received for review September 7, 2007)

Recent experiments claiming that Naf-BBL protein follows a global downhill folding raised an important controversy as to the folding mechanism of fast-folding proteins. Under the global downhill folding scenario, not only do proteins undergo a gradual folding, but folding events along the continuous folding pathway also could be mapped out from the equilibrium denaturation experiment. Based on the exact calculation using a free energy landscape, relaxation eigenmodes from a master equation, and Monte Carlo simulation of an extended Muñoz–Eaton model that incorporates multiscale-heterogeneous pairwise interactions between amino acids, here we show that the very nature of a two-state cooperative transition such as a bimodal distribution from an exact free energy landscape and biphasic relaxation kinetics manifest in the thermodynamics and folding–unfolding kinetics of BBL and peripheral subunit-binding domain homologues. Our results provide an unequivocal resolution to the fundamental controversy related to the global downhill folding scheme, whose applicability to other proteins should be critically reexamined.

BIOPHYSICS

mechanism | global downhill folding | protein thermodynamics | relaxation kinetics of protein

Author contributions: I.C. designed research; W.Y., K.C., M.C., M.H., and I.C. performed research; K.-H.H., S.H., and I.C. contributed new reagents/analytic tools; W.Y., K.C., M.C., M.H., K.-H.H., S.H., and I.C. analyzed data; and I.C. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
†Present address: Ministry of Science and Technology, Gwacheon 427-715, Korea.
§Present address: Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138.
**To whom correspondence should be addressed. E-mail: [email protected].
This article contains supporting information online at www.pnas.org/cgi/content/full/0708480105/DC1.
© 2008 by The National Academy of Sciences of the USA

Our understanding of protein folding is primarily based on interpretation of the thermodynamics and kinetics from a simple protein folding model (1–10). The principle of minimal frustration and structural consistency captured by the Gō model allowed us to describe the complicated nature of protein folding at a coarse-grained level (5, 6). Bryngelson et al. (6, 7) posited a phenomenological picture of the free energy landscape of a protein-like heteropolymer and classified protein folding into three scenarios depending on whether the free energy landscape is unistable or bistable and also whether a glass transition occurs. Shakhnovich and Gutin (8) presented microscopic statistical-mechanical theories for a heteropolymer model, whose consequences provided important insights into the thermodynamics, kinetics, and nature of the free energy landscape for protein folding. The conventional view of protein folding assumes a discrete two-state transition between a native and a denatured ensemble of protein conformations that are separated by a folding barrier (9). The other view, termed one-state (or global) downhill folding, asserts that there is a continuous change of protein conformation during folding without a distinct free energy barrier (7, 10). The former could be understood as the first order (cooperative) whereas the latter as the second order (noncooperative) phase transition between a native and a denatured ensemble of protein conformation. An in-depth analysis for possible downhill folding was performed by Abkevich et al. (10) where it was shown that the major determinant for the cooperativity of protein folding is the relative abundance of nonlocal versus local contacts between residues in the native structure of a protein, which also simultaneously compete with the loss of conformational entropy. However, the folding process of two-state proteins may go downhill in the free energy landscape when the folding energy predominates over the loss of the conformational entropy (4), depending upon the conditions of a folding experiment such as their initial locations on the free energy landscape and the heterogeneity of the conformational ensemble (7). In the global downhill folding model the folding free energy landscape possesses a global unimodal distribution for all denaturation conditions so that a protein follows a smooth and progressive conformational change, and a continuous folding pathway can be observed simply by an equilibrium denaturation experiment (11, 12).

Muñoz and coworkers (13, 14) presented the experimental observation of a broad thermal denaturation midpoint for Naf-BBL protein, which is a truncated BBL peripheral subunit binding domain (PSBD) from Escherichia coli with a naphthyl-alanine. They not only suggested that Naf-BBL follows the global downhill folding scenario but also claimed that the same scenario holds for other PSBD homologues (13). In contrast, Fersht and coworkers (15, 16) examined the structural, denaturation, and kinetic properties of PSBD homologues from E. coli [unlabeled wild-type BBL, Protein Data Bank (PDB) entry 1W4H], Bacillus stearothermophilus (E3BD F166W, PDB entry 1W4E), and Pyrobaculum aerophilum (POB Y166W L146A, PDB entry 1W4J) (17) and found that 1W4H, 1W4E, and 1W4J follow the typical two-state folding scheme. Whereas Naganathan et al. (18) argued that the experimental data in ref. 15 support the global downhill folding scenario, Fersht and coworkers (16) believe that the conclusion drawn by Muñoz and coworkers was obtained by an incorrect assumption and analysis of the data in refs. 13 and 18. More recently, Sadqi et al. (19) presented an atom-by-atom analysis of NMR data to support the global downhill folding scheme of Naf-BBL (13). However, Ferguson et al. (20) and Zhou and Bai (21) argued back again that the new data presented by Sadqi et al. (19) could well be described by a two- or three-state model. Such an ongoing fundamental controversy (13–16, 18–25) about the folding mechanism for Naf-BBL and PSBD homologues would reshape our concept on protein folding. Faced with these conflicting and ambiguous assertions from experiment, we have attempted to resolve such a controversy by carrying out an exact analysis of thermodynamic and kinetic

www.pnas.org/cgi/doi/10.1073/pnas.0708480105 PNAS | February 19, 2008 | vol. 105 | no. 7 | 2397–2402

Fig. 1. Exact free energy $\Delta G$ in the unit of $RT_m$ (Upper) and equilibrium population $\nu^0$ (Lower) plotted as a function of $M$ for $T/T_m = 0.9$–$1.1$ for 1BBL (a), 1W4H (b), 1W4E (c), and 1W4J (d) using our extended ME model. The native (denatured) well is located at the global minimum of the free energy landscape for $T < T_m$ ($T > T_m$). As $T$ crosses $T_m$, the discrete transition occurs across the free energy barrier of 2–3 $RT_m$ for 1BBL (a) and 1W4J (d) and 4–5 $RT_m$ for 1W4H (b) and 1W4E (c), respectively. Because the native state is invariant, the peaks in $\nu^0$ corresponding to the native state do not move with temperature, whereas the denatured peaks move to lower values of $M$ while maintaining the bimodal distribution. The bimodal nature of eigenvector elements governs the equilibrium population.

properties for folding–unfolding of an unlabeled BBL (BBL short, PDB entry 1BBL) with no naphthyl moiety and three PSBD homologues. In addition, a Monte Carlo simulation was used to monitor out-of-equilibrium relaxation kinetics.

Results and Discussion

Thermodynamics of BBL and PSBD. The aforementioned controversy can be resolved by determining whether the thermodynamic and kinetic properties during folding–unfolding of BBL short and three PSBD homologues exhibit a bimodal or a unimodal pattern. To achieve such a goal one must use a free energy landscape in an exact manner along a proper reaction coordinate. A coarse-grained Muñoz–Eaton (ME) model of protein folding has been highly useful in explaining folding processes of many proteins (26, 27). However, this original ME model does not distinguish different amino acid types. Furthermore, pairwise interactions between two amino acids within a cut-off distance are treated as attractive interactions in this model. We have extended the original ME model (see Materials and Methods) to a new one that explicitly incorporates both the specificity of amino acid type and the heterogeneity of all pairwise interactions without a cut-off distance into a free energy function. Our model takes into account both attractive and repulsive pairwise interactions in a more realistic fashion because it uses a hybridized energy function that bridges the gap between the atom-level and residue-level descriptors of protein conformations while retaining the topology of protein structure at a coarse-grained level. Using our extended ME model, which is a multiscale-heterogeneous model capturing the atomic-scale energy function, one can perform an exact analysis for thermodynamic and kinetic properties of proteins and efficiently explore the out-of-equilibrium relaxation kinetics over the entire time scale, including the early folding events and the asymptotic behavior leading to an equilibrium state.

We first optimized the three-dimensional structures of PDB entries 1BBL, 1W4H, 1W4E, and 1W4J by simulated annealing using AMBER 8.0/ff99 (28) with a generalized Born approximation for an implicit consideration of solvent. The strength of pairwise interactions between the $i$th and $j$th amino acid for $j \ge i + 2$ was obtained by summing over all atomic (Coulombic and van der Waals) interaction energies. We then generalized the previously known exact solution (29) of an original ME model to one that contains information on the sequence-specific and heterogeneous pairwise interactions. Fig. 1 Upper shows an exact free energy landscape in the unit of $RT_m$ as a function of the fraction $M$ of native residue for 1BBL, 1W4H, 1W4E, and 1W4J, where $R$ is a gas constant and $T_m$ is a folding mid-temperature. Although the free energy landscape of the denatured state is broad and populated with multiple wells having low energy barriers with heights less than or equal to $RT_m$, a clear free energy barrier on the order of 2–3 $RT_m$ for 1BBL and 1W4J, and 4–5 $RT_m$ for 1W4H and 1W4E exists that separates the native from the denatured state. In addition, a discernible discrete jump is identified in the location of the global minimum of free energy as $T$ crosses $T_m$, which is a signature of the first order phase transition between the native and the denatured ensemble of protein conformation. The invariant position of the native well with temperature indicates that the native state is distinct and constant. When $T/T_m \ll 1$ ($T/T_m \gg 1$), the pairwise-contact folding energy predominates over the loss of the conformational entropy; thus, the folding barrier diminishes and the folding process goes downhill toward its equilibrium native state in the free energy landscape (the opposite abundance leads the unfolding process downhill) (4, 10). This reconciles with the type 0 scenario of protein folding posited by Bryngelson et al. (7). Because the hydrophobic effect, which plays different roles in affecting the folding behavior of proteins at different temperatures, can cause cold denaturation at low temperatures, the downhill free energy landscape at very low temperature in this model may not be relevant. The exact results for the equilibrium susceptibility $\chi = [\langle M^2 \rangle - \langle M \rangle^2]/T$ and specific heat $C = [\langle E^2 \rangle - \langle E \rangle^2]/T^2$ from our Monte Carlo simulation, which are the fluctuation in the fraction of native residue and in the pairwise interaction energy, respectively, are plotted in Fig. 2a as a function of $T/T_m$. The presence of pronounced peaks in the susceptibility and specific heat around $T/T_m = 1$ represents a folding–unfolding transition whose associated fluctuations are larger for 1W4E and 1W4H than those of 1BBL and 1W4J. The equilibrium melting curves of the fraction of native conformation $P_{T/T_m}(r)$ for 1BBL as a function of the location $r$ of residues along a sequence for each temperature $T/T_m$ are presented in Fig. 3a, whereas Fig. 3b plots $P_r(T/T_m)$ as a function of temperature for each residue. Fig. 3a illustrates that while residues $r_c$ on the α-helix at the C terminus do not melt for the temperature range studied, $P_{T/T_m}(r)$ at a given temperature depends on the location and secondary structure of residues. Fig. 3b demonstrates that

residues $r_b$ on the loop region unfold cooperatively to the most extent leading to a free energy barrier, which precedes the unfolding of residues $r_a$ on the α-helix at the N terminus. Therefore, a spread in $T_m$ does not necessarily indicate that the free energy landscape is downhill.

Fig. 2. Equilibrium susceptibility $\chi$, specific heat $C$, and chevron plot as a function of $T/T_m$. (a) The peaks around $T/T_m = 1$ for 1W4H and 1W4E are more pronounced than those of 1BBL and 1W4J. (b) The linear folding (unfolding) arm for $T < T_m$ ($T > T_m$) reflects the existence of a free energy barrier. The sum of folding and unfolding rates at $T = T_m$ is equal to the relaxation rate, signifying the two-state folding behavior.

Chevron Plot and Relaxation Eigenmode. In simplistic cases the folding of a protein can be modeled (27) by its relaxation kinetics along a one-dimensional free energy landscape such as the one shown in Fig. 1. We have analyzed exactly the time evolution of a conformational probability and its relaxation rate for 1BBL, 1W4H, 1W4E, and 1W4J by using the master equation (30) (see Materials and Methods). Fig. 2b shows a chevron plot as a function of $T/T_m$. If a protein were to follow a global downhill folding, a chevron plot should show flat dependence on temperature. The facts that a linear folding (unfolding) arm without a rollover behavior for $T < T_m$ ($T > T_m$) exists and that the sum of the folding and unfolding rates at $T = T_m$ is equal to the relaxation rate are important kinetic evidence that 1BBL, 1W4H, 1W4E, and 1W4J fold by a two-state folding mechanism. Fig. 2b also shows that 1BBL and 1W4J fold faster than 1W4H and 1W4E because the free energy barriers for the former are lower than those for the latter. It is worth noting that the recent kinetic experiment by Fersht et al. (24) for the BBL homologue Ala131Gly and wild-type POB PSBD also showed a similar chevron plot with the steep linear refolding arm, which is the hallmark of cooperative transition.

In an exact analysis (30) of eigenvectors of a relaxation matrix in the master equation, the eigenvector elements of the zero eigenvalue are the equilibrium populations of a protein conformation and their sum is 1.0, whereas eigenvectors of nonzero eigenvalues govern the relaxation mode of a conformational probability toward its equilibrium one. The sum of eigenvector elements for each nonzero eigenvalue is 0.0. The values of eigenvector elements for the zero eigenvalue and the first and second smallest nonzero eigenvalues are plotted in Fig. 1 Lower and supporting information (SI) Fig. 7 in terms of the fraction $M$ of native residue, where one gets an important insight about whether the equilibrium population and the relaxation kinetics toward an equilibrium state are bimodal or unimodal (24). A common feature emerging from these plots is that the distributions of eigenvector elements are bimodal for both the native and denatured state. The location of the distribution maximum along the x axis in Fig. 1 Lower for the native state does not change with temperature, implying that the native states for 1BBL, 1W4H, 1W4E, and 1W4J are invariant (24). The distribution pattern for denatured states is broad and shifted toward lower values of $M$. SI Fig. 7 Middle shows the longest-relaxation eigenmodes of the population exchange between the native and denatured state during the out-of-equilibrium relaxation toward an equilibrium state for a given temperature. It shows that an increase of native population is always accompanied by a decrease of denatured population and vice versa. SI Fig. 7 Bottom shows, in the next eigenmodes of the population exchange, that an increase of population in the native and denatured state is compensated for by a decrease of population in the other remaining states and vice versa, which illustrates that a bimodal distribution for the native and denatured states is well separated. Therefore, the exact analysis of relaxation eigenmodes based on our exact free energy landscape vividly reveals a bimodal nature of the free energy landscape and of the relaxation kinetics for 1BBL, 1W4H, 1W4E, and 1W4J.
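The master-equation statements in this section can be checked on a toy landscape. The following is a minimal sketch, not the calculation used in the paper: the 11-point double-well `dG` (in units of $RT$), the restriction to nearest-neighbour hops between $l$ and $l \pm 1$, the choice $\tau_0 = 1$, and all function names are assumptions made for illustration. It builds a Metropolis relaxation matrix with zero column sums, confirms that the Boltzmann population is its zero mode, and integrates $d\vec{P}/dt = -\mathbf{M}\vec{P}$ from the fully unfolded state.

```python
import math

def metropolis_rate_matrix(dG, tau0=1.0):
    """Relaxation matrix K for dP/dt = -K P (free energies in units of RT).

    Assumption: only nearest-neighbour transitions l -> l +/- 1 are allowed.
    """
    n = len(dG)
    K = [[0.0] * n for _ in range(n)]
    for src in range(n):
        for dst in (src - 1, src + 1):
            if 0 <= dst < n:
                climb = dG[dst] - dG[src]
                # Metropolis rule: uphill moves are Boltzmann-suppressed.
                rate = math.exp(-climb) / tau0 if climb > 0 else 1.0 / tau0
                K[dst][src] = -rate
        # Diagonal element makes every column sum to zero.
        K[src][src] = -sum(K[dst][src] for dst in range(n) if dst != src)
    return K

def boltzmann(dG):
    """Normalized equilibrium population exp(-dG) / Z."""
    weights = [math.exp(-g) for g in dG]
    z = sum(weights)
    return [w / z for w in weights]

def matvec(K, p):
    """Plain matrix-vector product K p."""
    size = len(p)
    return [sum(K[m][n] * p[n] for n in range(size)) for m in range(size)]

def relax(K, p0, dt, steps):
    """Explicit Euler integration of dP/dt = -K P."""
    p = list(p0)
    for _ in range(steps):
        flow = matvec(K, p)
        p = [pi - dt * fi for pi, fi in zip(p, flow)]
    return p

# Hypothetical double well: denatured minimum at l = 2, native minimum at
# l = 9, barrier of 3 RT at l = 5.
dG = [2.0, 0.5, 0.0, 0.8, 2.0, 3.0, 2.5, 1.5, 0.5, 0.0, 1.0]
K = metropolis_rate_matrix(dG)
p_eq = boltzmann(dG)

# The Boltzmann population should be the zero mode: K p_eq = 0.
residual = max(abs(x) for x in matvec(K, p_eq))

# Relax from the fully unfolded state (l = 0) well past the slowest time scale.
p0 = [1.0] + [0.0] * (len(dG) - 1)
p_late = relax(K, p0, dt=0.2, steps=20000)
drift = max(abs(a - b) for a, b in zip(p_late, p_eq))
```

With these assumed values, `residual` and `drift` should vanish to within rounding and discretization error, and `p_eq` has two maxima separated by the barrier at `l = 5`. Replacing the explicit Euler loop with an eigendecomposition routine would give the relaxation eigenmodes directly.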

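The two-state argument behind the chevron plot can also be illustrated numerically. This is a hedged sketch under stated assumptions: `toy_rates` is a hypothetical parametrization in which the folding and unfolding barriers vary linearly and oppositely with $T/T_m$ (the values `barrier = 4.0` and `slope = 30.0` are arbitrary), and rates are in units of $1/\tau_0$. It checks that the observed relaxation rate of a two-state system equals $k_f + k_u$, and that $\ln(k_f + k_u)$ then forms a V-shaped curve with its minimum at $T = T_m$.

```python
import math

def two_state_relaxation(k_f, k_u, p_native0, dt, steps):
    """Euler-integrate dP_N/dt = k_f * (1 - P_N) - k_u * P_N."""
    p = p_native0
    for _ in range(steps):
        p += dt * (k_f * (1.0 - p) - k_u * p)
    return p

def toy_rates(theta, barrier=4.0, slope=30.0):
    """Hypothetical folding/unfolding rates as a function of theta = T/Tm."""
    k_f = math.exp(-(barrier + slope * (theta - 1.0)))
    k_u = math.exp(-(barrier - slope * (theta - 1.0)))
    return k_f, k_u

# At T = Tm the two rates coincide in this toy parametrization.
k_f, k_u = toy_rates(1.0)

# Numerical relaxation from the fully denatured state (P_N = 0).
dt, steps = 0.001, 50000
t_end = dt * steps
p_num = two_state_relaxation(k_f, k_u, 0.0, dt, steps)

# Closed form: single-exponential decay with rate k_f + k_u.
p_native_eq = k_f / (k_f + k_u)
p_exact = p_native_eq + (0.0 - p_native_eq) * math.exp(-(k_f + k_u) * t_end)

# Chevron: log of the observed relaxation rate versus T/Tm.
thetas = [0.90, 0.95, 1.00, 1.05, 1.10]
chevron = [(theta, math.log(sum(toy_rates(theta)))) for theta in thetas]
theta_min = min(chevron, key=lambda pair: pair[1])[0]
```

In this toy model the folding arm ($T < T_m$) and unfolding arm ($T > T_m$) are nearly linear in $T/T_m$ away from the midpoint; a barrierless landscape would instead give a rate with little temperature dependence.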
Fig. 3. Equilibrium melting curves of each residue for 1BBL. The fraction of native conformation $P_{T/T_m}(r)$ [$P_r(T/T_m)$] as a function of $r$ ($T/T_m$) for each temperature (residue) is presented in a (b). (a) $P_{T/T_m}(r)$ depends on the location and secondary structure of residues. (b) Residues $r_b$ on the loop region unfold cooperatively to the most extent, which precedes the unfolding of residues $r_a$ on the α-helix at the N terminus. It illustrates that a spread of $T_m$ does not necessarily indicate that the free energy landscape is downhill. (See SI Fig. 6 for other melting curves for 1BBL and 1W4E.)

Biphasic Relaxation Kinetics. The out-of-equilibrium Monte Carlo simulations (31) relevant to the temperature-jump experiment (32) were carried out to investigate the relaxation kinetics toward an equilibrium state without resorting to a particular free energy landscape. Using the multiscale-heterogeneous free energy function in our extended ME model, 1,000 independent dynamic trajectories that start from an initial nonequilibrium conformation and end at an equilibrium conformation were generated at a given temperature. A single spin-flip Metropolis algorithm (31) was used in Monte Carlo simulations that were run well beyond the equilibrium time scale at each temperature (see SI Text). Among the particularly important quantities we focused on were an average spin–spin autocorrelation function $Y(t + t_w, t_w) = \langle \sum_{i=1}^{N} S_i(t_w) \cdot S_i(t + t_w) \rangle / N$ as a similarity measure of a protein conformation at the later time $t + t_w$ with respect to that at the time $t_w$ and an average pairwise-contact energy $E(t)$. $\langle\,\rangle$ denotes an average over 1,000

different dynamic trajectories, $t_w$ denotes a waiting time after a temperature-jump, and $S_i(t)$ denotes a spin value indicating the nativeness of the $i$th amino acid at time $t$. The product $S_i(t_w) \cdot S_i(t + t_w)$ is defined to be 1 when the $i$th spin values at both times are the same, and $-1$ otherwise. We transformed $Y(t + t_w, t_w)$ by $A(t + t_w, t_w) = [1 + Y(t + t_w, t_w)]/2$ so that the transformed autocorrelation function is rescaled to change from 0 to 1.

Starting from a fully unfolded (native) state with all spins $S_i = 0$ (1), $i = 1, 2, \ldots, N$, the time evolution of $A_D(t, 0)$ and $E_D(t, 0)$ [$A_N(t, 0)$ and $E_N(t, 0)$] over the whole hierarchy of time scales is shown in Fig. 4 a and c (Fig. 4 b and d) for 1W4E and in SI Fig. 8 a and c (SI Fig. 8 b and d) for 1BBL at different quenching temperatures $T/T_m = 0.9$–$1.1$ (see SI Fig. 8). Because 1W4H and 1W4J show a similar behavior to 1W4E and 1BBL, respectively, we only show the results for the latter in Fig. 4 and SI Fig. 8. It was not straightforward to intuitively grasp the physical realization of $A_D(t + t_w, t_w)$; nevertheless, $A_D(t, 0)$ [$A_N(t, 0)$] turned out to be the fraction of nonnative residue $1 - M(t)$ [native residue $M(t)$] at time $t$, where $M(t) = \sum_{i=1}^{N} S_i(t)/N$. Because the entropic reduction (gain) of a protein conformation in the ME model with respect to a fully unfolded (native) state is proportional to $M(t)$ [$1 - M(t)$], $1 - A_D(t, 0)$ [$1 - A_N(t, 0)$] are directly proportional to the entropy of a protein conformation. The folding funnel picture (7) arises naturally in Fig. 4 a and c and SI Fig. 8 a and c such that the magnitude of pairwise folding energy increases while entropy decreases as the folding proceeds for a given temperature $T < T_m$, whereas Fig. 4 b and d and SI Fig. 8 b and d show the unfolding funnel. The genuine picture demonstrated in Fig. 4 and SI Fig. 8 is the existence of (i) biphasic relaxation for $T \le T_m$ ($T \ge T_m$) and (ii) uniphasic relaxation for $T$ far away from $T_m$. The biphasic relaxation consists of a fast downhill relaxation for the re-equilibration of a fully unfolded (native) initial conformation to the denatured (native) conformation in the nearest free energy well, which is followed by a slow relaxation for crossing the free energy barrier to the native (denatured) conformation in the other free energy well (33–35). During the biphasic relaxation, the two time scales of the fast and slow relaxation for 1W4E and 1BBL are well separated and the slow relaxation for $A_D(t, 0)$ and $E_D(t)$ follows the stretched kinetics (36) of the form $e^{-(t/\tau)^{\beta}}$ with $\beta = 0.6$–$0.8$ for $T \approx T_m$, where the free energy barrier is highest. Although the heights of low-lying barriers ($\le RT_m$) in the denatured basin of the free energy landscape are not high enough to generate thermodynamically relevant intermediate states, the existence of such low-lying barriers may result in the interconversion between weak kinetic intermediates during the relaxation process of protein conformations and increase the roughness of the free energy landscape. The folding barrier and these low-lying barriers are most pronounced for $T \approx T_m$, as is shown in Fig. 1; therefore, the slow relaxation behavior of $A_D(t, 0)$ and $E_D(t, 0)$ could follow the stretched kinetics. What should be clarified here is whether the relaxation modes of $A_D(t, 0)$ and $E_D(t)$ [$A_N(t, 0)$ and $E_N(t)$] of 1W4E and 1BBL for the entire temperature range are governed by exponential relaxation, and more importantly whether these relaxations are barrier-limited. $A_D(t, 0)$ is best fitted by a tri-exponential function of the form $C_0 e^{-t/\tau_0} + C_1 e^{-t/\tau_1} + C_2 e^{-t/\tau_2}$, whereas $E_D(t)$ [$A_N(t, 0)$ and $E_N(t)$] is best fitted by a bi-exponential function of the form $C_1 e^{-t/\tau_1} + C_2 e^{-t/\tau_2}$ near and below (above) $T_m$, where $\tau_0 < \tau_1 < \tau_2$. The relaxation of $A_D(t, 0)$ that occurs in a time scale of $\tau_0$ is the ballistic one that corresponds to the local motion of amino acids and is irrelevant to our discussion. For $T$ far away from $T_m$, $A_D(t, 0)$ and $E_D(t)$ [$A_N(t, 0)$ and $E_N(t)$] are best fitted by a single-exponential function $C_3 e^{-t/\tau_3}$. $\tau_1$ and $\tau_3$ are temperature-independent because $\tau_1$ corresponds to the downhill motion of an initial configuration to the nearest denatured (native) well of the free energy landscape and $\tau_3$ to the native (denatured) state for $T \ll T_m$ ($T \gg T_m$). On the other hand, the longest relaxation time $\tau_2$ for each case exhibits a strong temperature dependence and becomes the largest around $T = T_m$. Fig. 4e for 1W4E and SI Fig. 8e for 1BBL both show an Arrhenius-type relation between $\tau_2$ and $T/T_m$, which is consistent with the barrier-limited relaxation due to the free energy barrier. If 1BBL and 1W4E were to follow the two-state folding behavior, the presence of a denatured well and a native well on the free energy landscape should be equally pronounced at $T = T_m$. This ought to result in the strong biphasic relaxation behavior in $A(t + t_w, t_w \neq 0)$ because heterogeneous initial conformations at $t_w$ could probe the fast and the slow relaxation depending on their initial locations on the free energy landscape (11, 32). Fig. 4f and SI Fig. 8f show $A_D^{\mathrm{1W4E}}(t + t_w, t_w)$, $A_N^{\mathrm{1W4E}}(t + t_w, t_w)$ and $A_D^{\mathrm{1BBL}}(t + t_w, t_w)$, $A_N^{\mathrm{1BBL}}(t + t_w, t_w)$ at $T = T_m$ for $t_w$ = 1,000, 10,000, 100,000, which demonstrates the existence of biphasic relaxation for large $t_w$ due to an apparent free energy barrier.

Fig. 4. Out-of-equilibrium relaxation behaviors of 1W4E toward an equilibrium state at $T/T_m = 0.9$–$1.1$. (a–d) Time evolution of an average autocorrelation function $A_D^{\mathrm{1W4E}}(t, 0)$ (a) [$A_N^{\mathrm{1W4E}}(t, 0)$ (d)] and an average pairwise-contact energy $E_D^{\mathrm{1W4E}}(t, 0)$ (c) [$E_N^{\mathrm{1W4E}}(t, 0)$ (b)] starting from a fully unfolded (native) state of 1W4E. The combination of a and c (b and d) shows the folding (unfolding) funnel picture. The solid lines are drawn from the fitted exponential function. (e) The longest relaxation time $\tau_2$ of the biphasic relaxation for $T < T_m$ ($T > T_m$) follows an Arrhenius relation with $T$ and manifests the existence of a barrier-limited relaxation. The error bars are drawn with three standard deviations. (f) $A(t + t_w, t_w \neq 0)$ at $T = T_m$ with different waiting times $t_w$ shows biphasic relaxation behavior for large $t_w$ and demonstrates the presence of fast and slow relaxation due to a free energy barrier.

Average Pathway of Folding and Unfolding. The equilibrium free energy landscape $F(M)$ was obtained via $F(M) = -RT \ln Z(M)$ in our Monte Carlo simulation, where $Z(M)$ is the number of protein conformations with $M$, and was confirmed to be the same

as the exact free energy landscape as shown in Fig. 1. We calculated the dynamic free energy-like quantity $F(M, t) = -RT \ln Z(M, t)$, where $Z(M, t)$ is the one calculated in the time window $t$ and $t + \delta t$, and plotted $F(M, t)$ in Fig. 5 for 1BBL and 1W4E. For $t$ larger than the equilibration time, $F(M, t)$ converges to $F(M)$ for each temperature. This ensures that the dynamic trajectory in our out-of-equilibrium Monte Carlo simulation indeed converges to the correct equilibrium state. Because we knew the average value of $M(t)$ as shown in Fig. 4 and SI Fig. 8, we could envisage the average pathway of folding and unfolding on the surface of $F(M, t)$. Fig. 5a (Fig. 5b) shows the average unfolding pathway of 1BBL (1W4E) at $T/T_m = 1.1$ starting from the fully native state at $t = 0$. It first re-equilibrates to a basin of the native state, crosses over an unfolding barrier of height 2–3 (4–5) $RT_m$, and then leads to a basin of the denatured state after the equilibration time. (For the folding pathway, see SI Fig. 9.) The average pathway of folding and unfolding shown on the surface of $F(M, t)$ indisputably demonstrates that both 1BBL and 1W4E must undergo a two-state cooperative transition.

Fig. 5. Average pathway of unfolding as a function of $M$ and $t$ at $T/T_m = 1.1$ for 1BBL (a) and 1W4E (b). It shows the average unfolding process: re-equilibrating to the native state, attempting to escape from the native well, climbing the unfolding barrier, and relaxing to the denatured state. The averaged value $M(t)$ on the top of a free energy barrier has the heterogeneous contributions mostly from the native and the denatured state.

Why Did the Previous Analysis Not Show the Cooperative Folding? We revisited Muñoz and coworkers’ (13) original numerical calculation based on the free energy function of the form $\Delta G(T) = n \cdot \Delta H + \Delta C (T - 373) - T[n \cdot \Delta S + \Delta C \cdot \ln(T/385)]$ that demonstrated the global downhill folding behavior of 1BBL, where $n$ is the number of native residues and $T$ is the temperature. This function was determined solely by the one-body environmental nature of individual residues for which only the mean enthalpy ($\Delta H$) and the mean entropy ($\Delta S$) per native residue were taken into account and the change in heat capacity ($\Delta C$) was estimated only from the differences in accessible surface area of an individual residue in each fragment of native structure. Although this function may control the competition between the stabilizing energy and the destabilizing entropy as $T$ changes, it is important to realize that the stabilizing energy there is effectively calculated from the nature of a single residue. Given that the major determinant for cooperative protein folding is nonlocal interresidue interactions especially with a large , a free energy function that does not properly account for nonlocal interresidue interactions is bound to lead to a global downhill folding behavior, to which Muñoz and coworkers (13) fitted their experimental results. Thus, the global downhill folding behavior of 1BBL described numerically by Muñoz and coworkers (13) is not unexpected. When the multiscale-heterogeneous nature of both attractive and repulsive pairwise contacts is not properly taken care of, the same reasoning applies and downplays the crucial role of nonlocal pairwise contacts in competition with the local pairwise contacts and the conformational entropy. The free energy function used by Muñoz and coworkers might describe the overall equilibrium energetics of a native state of Naf-BBL reasonably yet fails to provide an adequate description of the thermodynamics and folding kinetics involving a discrete folding barrier.

Limitations of the Current Model. Although our extended ME model captures an essential aspect of protein folding in a simple way and is more realistic than the original ME model, there are a few shortcomings in our model for describing the full complexity of real proteins in aqueous environments. For example, the effect of the desolvation barrier in the pairwise interaction potentials between two residues is not addressed in our model. Chan et al. (37) showed that desolvation barriers render protein models with a higher degree of folding cooperativity associated with a higher free energy barrier than their no-desolvation counterparts. When the original ME model (26) was first proposed, single (SSA) or double (DSA) sequence approximation was used. The consequences of such approximations are that the conformational ensemble of the unfolded state is underestimated and the possibility of pairwise contacts of native residues separated by a denatured loop is ignored, which may lead to an overestimate of folding cooperativity (4). Deficiencies from using SSA or DSA were removed completely by the recent exact solution (29) of the ME model, from which we extended our current model. We also performed additional extensive calculations, employing DSA while allowing pairwise contacts of native residues separated by a denatured loop (DSA/L) for which the loop entropy is assigned as in ref. 27. Interestingly enough, the qualitative features of protein folding from using DSA/L remain as cooperative as those from using the exact calculation in this work. Therefore, although neglecting the entropy of denatured loops might affect the degree of folding cooperativity, its effect is not large enough to alter the fundamental scheme of protein folding itself into a noncooperative one. Our model is still a Gō-like model so that it does not take into account the nonnative interactions most appropriately and tends to overestimate the folding cooperativity (38). Several works have shown that although the elimination of nonnative interactions enhances folding cooperativity in general, the role of nonnative interactions in protein folding is complicated and will depend on the precise sequence, structure of the native state, and average energy of nonnative contacts (39, 40). The degree of cooperativity could be compromised to some extent, yet the existence or nonexistence of cooperativity itself might not be altered by neglecting nonnative interactions. Although the major determinant for the cooperativity of protein folding is the relative abundance of nonlocal versus local pairwise contacts between residues in the native structure of a protein (4, 10), it will be nontrivial and highly challenging to come up with a more realistic model that can fully address the overall effect on the folding cooperativity that is orchestrated by the interplay of pH

Yu et al. PNAS ͉ February 19, 2008 ͉ vol. 105 ͉ no. 7 ͉ 2401 Downloaded by guest on September 24, 2021 value, denaturant concentration, and several factors mentioned Materials and Methods above. ME Model. An effective free energy of a protein conformation is described by ⌬ ϭ ͚ ␧ ͟j Ϫ ͚N ⌬ ⅐ G({Sk}) J ͗i,j͘ ij kϭiSk T kϭ1 Sconf Sk, where T is a temperature and N is the Conclusion number of amino acids in a protein. The conformational state of a protein with This article addressed one of the most important issues in protein N amino acids is represented by the N-spin binary variables {Sk} for amino acids ϭ ␾ ⌿ folding by tackling and resolving the recent controversy on the k 1,2,...,N. When the dihedral angles ( kϪ1, k)ofkth amino acid are folding mechanism of fast-folding proteins. We established a native (nonnative)-like, Sk takes the value 1 (0). The entropic cost of forming ⌬ Ͻ rigorous and general framework for the exact and simulational a native amino acid is Sconf 0 with respect to its nonnative conformation. The pairwise-contact energy between ith and jth amino acid is ␧ij when the investigation of thermodynamics and kinetics for protein folding ␣ and applied it to an unlabeled BBL protein and three PSBD distance between two C sofith and jth amino acid is less than a threshold homologues that are target proteins of the intensive debate for value, e.g., 6.5 Å, in its native structure. a global downhill protein folding. Although the new scenario of Folding Kinetics by Master Equation. 
The time evolution of the conformational global downhill protein folding for BBL protein and three PSBD ជ probability vector P(t) ϭ (Pl) ϭ (P0, P1, P2,...,PN) satisfies a master equation homologues has been proposed, we demonstrated evidences of dPជ(t)/dt ϭϪMPជ(t), where l ϭ 0,1,2,...,NϪ 1, N denotes (N ϩ 1) set of two-state cooperative (first order phase) transition for these protein conformations each having the same fraction M ϭ l/N of native residue proteins by using an extended ME model that uses an exact free such that 0(N) denotes a fully unfolded (native) state. M ϭ (Mmn) is a relaxation energy landscape, an exact analysis of relaxation eigenmodes matrix constructed by the transition probability from a state n to m based on from a master equation, and an out-of-equilibrium relaxation the Metropolis algorithm, where Mmn ϭϪ1/␶0 exp(Ϫ(⌬Gm Ϫ⌬Gn)/RT)ifGm Ͼ kinetics via kinetic Monte Carlo simulation. Our exact and Gn and Ϫ1/␶0 otherwise, and m, n ϭ 0,1,2,...,N Ϫ 1, N. Here, ⌬Gm and ⌬Gn numerical results are in a qualitative agreement with the recent are read from Fig. 1 and ␶0 is a molecular time scale. The diagonal elements of experimental results (15, 16, 24) and properly address the M are set by Mnn ϭϪ͚m,(m n)Mmn such that the sum of column matrix elements fundamental controversy related to the global downhill folding is zero and Mmn satisfies the detailed balance condition. not in terms of the statistical dispute but by the fundamental principles of equilibrium thermodynamics and out-of- ACKNOWLEDGMENTS. I.C. thanks M. Vendruscolo for careful comments on equilibrium relaxation kinetics. We believe that our work pro- the manuscript. We thank the BIT center of Pusan National University for the use of their 256 CPU-supercomputing clusters. This work was supported by vides a resolution to the long-standing controversy of global National Research Laboratory Program Grants R01-2006-10905 (to K.-H.H. 
and downhill protein folding, whose applicability for other proteins I.C.) and R01-2006-10696 (to S.H.) from the Korea Science and Engineering needs to be exercised with caution. Foundation.
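The two ingredients described in Materials and Methods, the effective ME free energy of a single binary state and the Metropolis relaxation matrix of the master equation, can be illustrated with a minimal Python sketch. It is not the code used for the results reported here. The function names, the `max_jump` parameter, and the double-well test profile are hypothetical. In particular, the text does not state which pairs (m, n) are connected, so the sketch assumes that only adjacent values of l exchange probability (`max_jump=1`). The relaxation rates are obtained from a symmetrized form of the matrix, which has the same eigenvalues when detailed balance holds.

```python
import numpy as np


def me_free_energy(state, contacts, J, T, dS_conf):
    """Effective ME free energy of one binary state (S_1..S_N, 0-based here).

    `contacts` maps a native contact (i, j), i < j, to its energy eps_ij.
    A contact contributes only if every residue from i to j is native (S_k = 1).
    """
    s = np.asarray(state)
    energy = sum(eps for (i, j), eps in contacts.items() if s[i:j + 1].all())
    return J * energy - T * dS_conf * s.sum()


def relaxation_matrix(dG, RT, tau0=1.0, max_jump=1):
    """Relaxation matrix M of dP/dt = -M P built from Metropolis rates.

    dG[l] is the free energy of the set of conformations with l native residues.
    ASSUMPTION: only states with |m - n| <= max_jump are connected.
    """
    dG = np.asarray(dG, dtype=float)
    diff = dG[:, None] - dG[None, :]                 # diff[m, n] = dG_m - dG_n
    # Uphill moves carry a Boltzmann factor, downhill moves have rate 1/tau0.
    M = -np.exp(-np.maximum(diff, 0.0) / RT) / tau0
    idx = np.arange(dG.size)
    M[np.abs(idx[:, None] - idx[None, :]) > max_jump] = 0.0
    np.fill_diagonal(M, 0.0)
    np.fill_diagonal(M, -M.sum(axis=0))              # M_nn = -sum_{m != n} M_mn
    return M


def relaxation_rates(dG, RT, tau0=1.0, max_jump=1):
    """Eigenvalues of M in ascending order; the smallest is the equilibrium mode."""
    dG = np.asarray(dG, dtype=float)
    M = relaxation_matrix(dG, RT, tau0, max_jump)
    # Detailed balance makes S_mn = M_mn * sqrt(P_n / P_m) symmetric,
    # with P the Boltzmann distribution, so a symmetric solver can be used.
    w = np.exp(-(dG - dG.min()) / (2.0 * RT))
    S = M * w[None, :] / w[:, None]
    return np.linalg.eigvalsh(S)


# Hypothetical double-well profile in units of RT with a 4 RT barrier.
# This is an illustration only, not the landscape of Fig. 1.
x = np.arange(21) / 20.0
dG_demo = 64.0 * x**2 * (1.0 - x)**2
rates = relaxation_rates(dG_demo, RT=1.0)
print(rates[:3])
```

For a profile with two basins separated by a barrier, the smallest eigenvalue is zero within rounding error (the equilibrium distribution) and the next one is the slow barrier-crossing rate, which is the spectral signature of two-state kinetics that the text refers to. Applying the sketch to a real protein would require the free energy values of Fig. 1 and the native contact map, neither of which is reproduced here.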

1. Anfinsen CB (1973) Principles that govern the folding of protein chains. Science 181:223–230.
2. Dill KA, et al. (1995) Principles of protein folding—A perspective from simple exact models. Protein Sci 4:561–602.
3. Daggett V, Fersht AR (2003) The present view of the mechanism of protein folding. Nat Rev Mol Cell Biol 4:497–502.
4. Shakhnovich EI (2006) Protein folding thermodynamics and dynamics: Where physics, chemistry, and biology meet. Chem Rev 106:1559–1588.
5. Go N (1983) Theoretical studies of protein folding. Annu Rev Biophys Bioeng 12:183–210.
6. Bryngelson JD, Wolynes PG (1987) Spin glasses and the statistical mechanics of protein folding. Proc Natl Acad Sci USA 84:7524–7528.
7. Bryngelson JD, Onuchic JN, Socci ND, Wolynes PG (1995) Funnels, pathways, and the energy landscape of protein folding: A synthesis. Proteins 21:167–195.
8. Shakhnovich EI, Gutin AM (1989) Formation of unique structure in polypeptide chains. Theoretical investigation with the aid of a replica approach. Biophys Chem 34:187–199.
9. Matouschek A, Serrano L, Fersht AR (1992) The folding of an enzyme. I. Theory of protein engineering analysis of stability and pathway of protein folding. J Mol Biol 224:771–782.
10. Abkevich VI, Gutin AM, Shakhnovich EI (1995) Impact of local and non-local interactions on thermodynamics and kinetics of protein folding. J Mol Biol 252:460–471.
11. Eaton WA (1999) Searching for "downhill scenarios" in protein folding. Proc Natl Acad Sci USA 96:5897–5899.
12. Muñoz V (2002) Thermodynamics and kinetics of downhill protein folding investigated with a simple statistical mechanical model. Int J Quantum Chem 90:1522–1528.
13. Garcia-Mira MM, Sadqi M, Fischer N, Sanchez-Ruiz JM, Muñoz V (2002) Experimental identification of downhill protein folding. Science 298:2191–2195.
14. Oliva FY, Muñoz VA (2004) A simple thermodynamic test to discriminate between two-state and downhill folding. J Am Chem Soc 126:8596–8597.
15. Ferguson N, Schartau PJ, Sharpe TD, Sato S, Fersht AR (2004) One-state downhill versus conventional protein folding. J Mol Biol 344:295–301.
16. Ferguson N, et al. (2005) Ultra-fast barrier-limited folding in the peripheral subunit-binding domain family. J Mol Biol 353:427–446.
17. Fitz-Gibbon ST, et al. (2002) Genome sequence of the hyperthermophilic crenarchaeon Pyrobaculum aerophilum. Proc Natl Acad Sci USA 99:984–989.
18. Naganathan AN, Perez-Jimenez R, Sanchez-Ruiz JM, Muñoz V (2005) Robustness of downhill folding: Guidelines for the analysis of equilibrium folding experiments on small proteins. Biochemistry 44:7435–7449.
19. Sadqi M, Fushman D, Muñoz V (2006) Atom-by-atom analysis of global downhill protein folding. Nature 442:317–321.
20. Ferguson N, Sharpe TD, Johnson CM, Schartau PJ, Fersht AR (2007) Analysis of 'downhill' protein folding. Nature 445:E14–E15.
21. Zhou Z, Bai Y (2007) Analysis of protein-folding cooperativity. Nature 445:E16–E17.
22. Sadqi M, Fushman D, Muñoz V (2007) Sadqi et al. reply. Nature 445:E17–E18.
23. Muñoz V, Sanchez-Ruiz JM (2004) Exploring protein-folding ensembles: A variable-barrier model for the analysis of equilibrium unfolding experiments. Proc Natl Acad Sci USA 101:17646–17651.
24. Huang F, Sato S, Sharpe TD, Ying L, Fersht AR (2007) Distinguishing between cooperative and unimodal downhill protein folding. Proc Natl Acad Sci USA 104:123–127.
25. Kelly JW (2006) Proteins downhill all the way. Nature 442:255–256.
26. Muñoz V, Eaton WA (1999) A simple model for calculating the kinetics of protein folding from three-dimensional structures. Proc Natl Acad Sci USA 96:11311–11316.
27. Henry ER, Eaton WA (2004) Combinatorial modeling of protein folding kinetics: Free energy profiles and rates. Chem Phys 307:163–185.
28. Case DA, et al. (2004) AMBER 8 (Univ of California, San Francisco).
29. Bruscolini P, Pelizzola A (2002) Exact solution of the Muñoz–Eaton model for protein folding. Phys Rev Lett 88:258101.
30. Chang I, Cieplak M, Banavar JR, Maritan A (2004) What can one learn from experiments about the elusive transition state? Protein Sci 13:2446–2457.
31. Binder K, Heermann DW (2002) Monte Carlo Simulation in Statistical Physics (Springer, Berlin).
32. Leeson DT, Gai F, Rodriguez HM, Gregoret LM, Dyer RB (2000) Protein folding and unfolding on a complex energy landscape. Proc Natl Acad Sci USA 97:2527–2532.
33. Shakhnovich E, Farztdinov G, Gutin AM, Karplus M (1991) Protein folding bottlenecks: A lattice Monte Carlo simulation. Phys Rev Lett 67:1665–1668.
34. Camacho CJ, Thirumalai D (1993) Kinetics and thermodynamics of folding in model proteins. Proc Natl Acad Sci USA 90:6369–6372.
35. Chan CK, et al. (1997) Submillisecond protein folding kinetics studied by ultrarapid mixing. Proc Natl Acad Sci USA 94:1779–1784.
36. Gutin A, Sali A, Abkevich V, Karplus M, Shakhnovich E (1998) Temperature dependence of the folding rate in a simple protein model: Search for a glass transition. Chem Phys 108:6466–6483.
37. Liu Z, Chan HS (2005) Solvation and desolvation effects in protein folding: Native flexibility, kinetic cooperativity and enthalpic barriers under isostability conditions. Phys Biol 2:S75–S85.
38. Kaya H, Chan HS (2000) Polymer principles of protein calorimetric two-state cooperativity. Proteins 40:637–661.
39. Li L, Mirny LA, Shakhnovich EI (2000) Kinetics, thermodynamics and evolution of non-native interactions in a protein folding nucleus. Nat Struct Biol 7:336–342.
40. Klimov DK, Thirumalai D (2001) Multiple protein folding nuclei and the transition state ensemble in two-state proteins. Proteins 43:465–475.

2402 | www.pnas.org/cgi/doi/10.1073/pnas.0708480105 Yu et al.