Native-Like Mean Structure in the Unfolded Ensemble of Small Proteins

doi:10.1016/S0022-2836(02)00888-4, available online at http://www.idealibrary.com
J. Mol. Biol. (2002) 323, 153–164

Bojan Zagrovic¹, Christopher D. Snow¹, Siraj Khaliq², Michael R. Shirts² and Vijay S. Pande¹,²*

¹Biophysics Program, Stanford University, Stanford, CA 94305-5080, USA
²Department of Chemistry, Stanford University, Stanford, CA 94305-5080, USA
*Corresponding author

The nature of the unfolded state plays a great role in our understanding of proteins. However, accurately studying the unfolded state with computer simulation is difficult, due to its complexity and the great deal of sampling required. Using a supercluster of over 10,000 processors we have performed close to 800 μs of molecular dynamics simulation in atomistic detail of the folded and unfolded states of three polypeptides from a range of structural classes: the all-alpha villin headpiece molecule, the beta hairpin tryptophan zipper, and a designed alpha-beta zinc finger mimic. A comparison between the folded and the unfolded ensembles reveals that, even though virtually none of the individual members of the unfolded ensemble exhibits native-like features, the mean unfolded structure (averaged over the entire unfolded ensemble) has a native-like geometry. This suggests several novel implications for protein folding and structure prediction as well as new interpretations for experiments which find structure in ensemble-averaged measurements.

© 2002 Elsevier Science Ltd. All rights reserved

Keywords: mean-structure hypothesis; unfolded state of proteins; distributed computing; conformational averaging

Abbreviations used: dRMS, distance-based root-mean square deviation; GB/SA, generalized Born/surface area.

E-mail address of the corresponding author: [email protected]

Introduction

Historically, the unfolded state of proteins has received significantly less attention than the folded state [1]. The reasons for this are primarily its structural heterogeneity and complexity, and secondarily a belief that biological function is predominately mediated by the native state. The molten globule, as a specific example of a non-folded state, has been studied somewhat more intensively, but little is known about its structure [2]. Several recent studies of the chemically or thermally denatured proteins, both experimental and theoretical, have suggested that the structure of the denatured state may not be as diverse as previously thought, and that long-range order in the denatured state may play a role in defining protein folding mechanism [3–13]. The majority of these studies have focused on the properties of the artificially generated non-native samples, and there has in general been very little focus on the structure and the dynamics of the unfolded state under folding conditions, with some notable exceptions [14–16]. This is understandable since under such conditions the unfolded state is an unstable, fleeting species making any kind of quantitative experimental measurement very difficult. Here, it is important to emphasize the distinction between the unfolded state, a transient species en route to the folded state, and the denatured state, an artificially stabilized non-native state. For instance, in a typical stop-flow folding experiment, the denatured state refers to the protein in the presence of urea or guanidinium chloride, while the unfolded state refers to the same species after the denaturant has been diluted out and the protein is beginning to fold. The structural and dynamic differences between the two species have been noted before [15].

Computer simulations of either the unfolded or the denatured state have so far been limited by the immense computational power required for accurate sampling. The unfolded state is in fact a greater challenge to simulate using conventional means than the folded state precisely because of its structural diversity. While there have been several simulations of the denatured state before, in particular high temperature denaturing studies [4,6,7,12,17–19], it has been debated whether the sampling has been sufficient. Indeed, most studies employ a few (one to ten) simulations on the nanosecond timescale. In addition, it is also debatable how relevant are the results obtained under high temperature conditions for our understanding of the unfolded state under folding conditions.

Recently, we have introduced a novel computational approach aimed at addressing the issue of vastly improving the sampling in protein simulations: for our calculations we have employed distributed computing techniques and a supercluster of more than 10,000 processors [20,21]. Using this computational resource, we have run thousands of fully independent atomistic molecular dynamics (MD) trajectories starting from the fully extended state of three small proteins, each tens of nanoseconds long (see Methods; Figures 1(a), 2(a) and 3(a)). In addition, as an important control for our folding simulations, we have also run thousands of trajectories starting from the experimental native structures of the three molecules.

The advantage of performing a large number of independent, relatively short folding simulations is twofold. First, by the stochastic nature of the folding process and exponential kinetics, in an ensemble consisting of thousands of such trajectories we can expect to observe a small but significant number of folding events which on average would take much longer to occur [21]. Specifically, in an ensemble of 10,000 trajectories, each of which is 10 ns long, one expects to see about ten folding events for a protein that folds with single exponential kinetics and time constant of 10 μs. Second, such an approach gives us a detailed picture of the unfolded ensemble very early into folding (tens of nanoseconds after initiation of folding). Here we have focused on this latter aspect of our data for three different polypeptides: a 36 residue three-helix bundle villin headpiece protein [22], a 12 residue β-hairpin tryptophan zipper peptide (TrpZip [23]), and a 23 residue designed ββα zinc finger mimic (BBA5 [24]). Our aggregate simulation time is nearly one millisecond, orders of magnitude larger than previous atomistic MD simulations [25,26]. This unprecedented sampling has allowed us to observe previously inaccessible effects, which we describe below. The central question that we ask is what is the structure of the unfolded state on average.

Results

The folding simulations for all three molecules studied were started from extended conformations. In approximately 10 ns (TrpZip and BBA5) or 20 ns (villin) the unfolded ensembles non-specifically collapse to form compact conformations. On average, these conformations exhibit native-like radii of gyration and solvent accessible surface areas (Table 1, Figures 1(b), 2(b) and 3(b)). Simulations started from the folded structures, in contrast, remain stable throughout, with respect to the radii of gyration, secondary structure content, solvent accessible surface area, and distance-based root-mean square deviation of the average structure (dRMS) (Table 1, Figures 1(b), 2(b) and 3(b)) from the experimental structures. In Figures 5–7 (see below), we compare the experimental structures of the three molecules with the representative native structures from our simulations (Figures 5(a) and (b), 6(a) and (b), and 7(a) and (b)), and the similarity is obvious. This attests to the stability of our simulations and suggests that our simulated native ensembles could be used for comparison with the unfolded ensembles.

Table 1. Summary of the native equilibrium simulations

                                                    Native villin   Native TrpZip   Native BBA5
Temperature (K)                                     300             278             278
Total time (μs)                                     90.6            19.3            72.0
Initial Rgyr (Å)                                    9.6             6.6             9.2
Initial SASA (Å²)                                   3076            1449            2422
Representative time point (ns) (no. structures)     20 (1401)       15 (481)        15 (1317)
⟨Rgyr⟩ at trep (Å)                                  9.8 ± 0.8       6.6 ± 0.2       8.6 ± 0.7
⟨SASA⟩ at trep (Å²)                                 3027 ± 137      1454 ± 57       2256 ± 115
SS at trep (%)                                      87              78              90
Cα-dRMSms at trep (Å)                               1.5             0.6             2.0

Representative time point (trep) indicates the time at which we have calculated the mean native structures used for comparison with the unfolded simulations throughout this article. The number of independent structures used in these averages is shown in the trep row. The secondary structure at trep (SS at trep) refers to the fraction of the native ensemble at trep which has 14 or more helical residues in the case of villin, four or more beta-sheet residues in the case of TrpZip, and four or more helical residues and two or more beta-sheet residues in the case of BBA5, indicating stable secondary structure. All secondary structure content is determined using DSSP [40]. Cα-dRMSms at trep refers to the Cα-dRMS of the mean native structure at trep from the initial experimental native structure used in the simulations. Rgyr refers to the radius of gyration, and SASA refers to the solvent accessible surface area as determined by DSSP. We show values for the two for both the initial structure and the ensemble at the trep. ⟨ ⟩ brackets refer to ensemble averages throughout the text. Error values refer to the standard deviation around the mean of a given population.

Inspection of individual members of the unfolded ensembles reveals very heterogeneous populations. Except for the majority of unfolded molecules being collapsed, they are structurally quite diverse. Despite this diversity, a meaningful question to ask is what do these heterogeneous ensembles look like on average. A natural first question is how do the averaged properties of the

Figure 1. Villin simulations. (a) The total number of independent simulations that have reached a given time point for the native ensemble (green) and the unfolded ensemble (red).
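The sampling argument in the Introduction (10,000 trajectories of 10 ns each, a folding time constant of 10 μs, about ten folding events) can be checked with a short calculation. The sketch below is illustrative only and is not taken from the paper: it assumes that trajectories are independent and that the probability that a single trajectory has folded by time t is 1 - exp(-t/τ), as for single-exponential kinetics. The function name `expected_folding_events` is hypothetical.

```python
import math


def expected_folding_events(n_trajectories, trajectory_length, time_constant):
    """Expected number of trajectories that have folded by the end of the run.

    Assumes independent trajectories and single-exponential folding kinetics,
    so each trajectory has folded by time t with probability 1 - exp(-t / tau).
    trajectory_length and time_constant must be given in the same time unit.
    """
    p_fold = 1.0 - math.exp(-trajectory_length / time_constant)
    return n_trajectories * p_fold


# Numbers quoted in the text: 10,000 trajectories, 10 ns each,
# time constant 10 microseconds = 10,000 ns.
events = expected_folding_events(10_000, 10.0, 10_000.0)
print(f"expected folding events: {events:.3f}")  # close to ten
```

Because each trajectory is much shorter than the time constant, the result is nearly equal to the simple ratio of aggregate simulation time to time constant (100 μs / 10 μs = 10), which is the "about ten" quoted in the text.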
