PROTEINS: Structure, Function, and Bioinformatics

Molecular dynamics simulations from putative transition states of α-spectrin SH3 domain

Xavier Periole,1* Michele Vendruscolo,2 and Alan E. Mark1,3,4*

1 Department of Biophysical Chemistry, Groningen Biomolecular Sciences and Biotechnology Institute (GBB), University of Groningen, 9747 AG Groningen, The Netherlands
2 Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, United Kingdom
3 School of Molecular and Microbial Sciences, University of Queensland, St Lucia, 4072, Queensland, Australia
4 Institute for Molecular Biosciences, University of Queensland, St Lucia, 4072, Queensland, Australia

ABSTRACT

A series of molecular dynamics simulations in explicit solvent were started from nine structural models of the transition state of the SH3 domain of α-spectrin, which were generated by Lindorff-Larsen et al. (Nat Struct Mol Biol 2004;11:443–449) using molecular dynamics simulations in which experimental Φ-values were incorporated as restraints. Two of the nine models were simulated 10 times for 200 ns and the remaining models were simulated two times for 200 ns. Complete folding was observed in one case, while in the other simulations partial folding or unfolding events were observed, which were characterized by a regularization of elements of secondary structure. These results are consistent with recent experimental evidence that the folding of SH3 domains involves low populated intermediate states.

Proteins 2007; 69:536–550. © 2007 Wiley-Liss, Inc.

Key words: molecular dynamics; transition state; protein folding; phi-value; Pfold.

Grant sponsor: European Community Training and Mobility of research "Protein (mis)folding"; Grant number: HPRN-CT-2002-00241.

*Correspondence to: Xavier Periole, Groningen Biomolecular Sciences and Biotechnology Institute (GBB), Department of Biophysical Chemistry, University of Groningen, Nijenborgh 4, 9747 AG Groningen, The Netherlands. E-mail: [email protected] or Alan E. Mark, School of Molecular and Microbial Sciences, University of Queensland, St. Lucia, 4072, Queensland, Australia. E-mail: [email protected]

Received 28 November 2006; Revised 22 February 2007; Accepted 7 March 2007. Published online 10 July 2007 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/prot.21491

INTRODUCTION

Understanding of the process of protein folding is one of the grand challenges in molecular biology. Ever since it was shown that the information contained in the sequence of amino acids is sufficient for a protein to find the structure of its native state,1 experimentalists and theoreticians have tried to understand the mechanisms of this important biological process.2–5

Much effort has been focused on proteins that undergo two-state folding.5 The primary advantage of two-state proteins is the lack of detectable intermediate states, so their folding process can be considered as involving a transition from a broad ensemble of configurations representing the unfolded state to a narrow ensemble of configurations making up the native state, via a specific transition state. The analysis of two-state proteins greatly simplifies the interpretation of experiments designed to elucidate the mechanism of folding.

The Φ-value analysis,6 in which the effects of specific amino acid substitutions on folding kinetics and equilibria are measured,7 has been widely used to obtain structural information regarding the nature of the transition state, as Φ-values reflect the degree to which the environments of specific residues are native-like in the transition state. By assuming that Φ-values correlate with the proportion of native contacts in the transition state,8–13 Φ-values can also be used as restraints in computer simulations. This approach has been used by several groups to propose models for the structures of the transition state ensembles (TSEs) of a range of proteins.10,12,14 However, as independent experimental information about transition states is difficult to obtain, it has been problematic to verify whether a TSE generated in this manner is in fact representative of the true TSE. Studies in which the TSE generated using a set of experimental Φ-values is used to predict the results of further Φ-value measurements,10,15 and studies showing that the contact order in the TSE correlates with protein folding rates,16 support the use of this approach. A further validation has been provided recently by the demonstration that the results of double-mutant cycle experiments could be predicted from the knowledge of a set of structures representing the TSE of barnase.17

In an alternative theoretical approach, a particular structure or set of structures is identified as belonging to the TSE by calculating the probability of folding from trajectories that are started from the proposed structures.18 Since the TSE corresponds to the point of highest free energy along the reaction coordinate of the folding process, a TSE structure should have an equal probability to fold or to unfold, and therefore its folding probability, Pfold, should be equal to 0.5.18 Several studies have used this type of approach to validate TSEs. Gsponer and Caflisch13 generated putative TSE conformations and estimated the Pfold values of six structures using an implicit model to account for solvation effects. They found that simulations of up to 200 ns were necessary to discriminate between folding and unfolding behavior. Shakhnovich and coworkers determined Pfold values in simulations using a Go potential19,20 and observed that not all conformations satisfying experimental Φ-values were part of the TSE on the free energy landscape of the model that they considered.20 More recently, however, Wolynes and coworkers concluded that the use of Φ-values as restraints represents an effective strategy for the accurate determination of transition state structures.21 Go potentials, in which native interactions are considered more favorable than nonnative ones, were used both by Shakhnovich and coworkers and by Wolynes and coworkers, and they have the advantage of being computationally very convenient. There has been, however, some debate about whether the free energy landscapes22 and the folding kinetics23 of Go models resemble the true ones faithfully enough to provide good estimates of the Pfold values, which are known to be extremely sensitive to changes in conformations and energetics.23 Within this context, the use of explicit solvent models in Pfold calculations has been analyzed recently by Rhee and Pande,23 who showed that it can represent, at least in the case of the 23-residue mini-protein BBA5, a promising approach for a reliable estimation of Pfold. Even with simplified models the evaluation of reliable Pfold values is computationally extremely demanding. Several methods have been proposed to reduce this demand by alternative determinations of the Pfold values of conformations sampled during reversible folding simulations.24–26 For example, Rao et al.24 showed that the Pfold of a conformation can be approximated by the probability that structurally similar conformations (clusters) fold during a reversible folding simulation. Such a procedure was suggested to be equivalent to a Pfold calculation for each individual conformation of the cluster obtained through multiple simulations, but much less computationally demanding.

In this work, extensive atomistic molecular dynamics (MD) simulations in explicit solvent are used to examine a set of nine structures selected from a model of the TSE of the α-spectrin SH3 domain, which was generated by Lindorff-Larsen et al.27 using experimental Φ-values [...] for comparison. In the following we examine a range of structural properties, including the solvent accessible surface area, the radius of gyration, the deviation of the structure from the native configuration (global, local, and per residue), and the ratio of native contacts.

MODEL AND METHODS

Starting structures for the simulations

The native state

The native structure of the SH3 domain of α-spectrin (PDB code 1BK2; Ref. 28) was used. The numbering of the residues is as in Ref. 27.

The transition state ensemble

We considered the structures representing the TSE of α-spectrin SH3 generated by Lindorff-Larsen et al.27 In brief, this TSE was generated using MD simulations12 in which experimental Φ-values, $\Phi^{\mathrm{exp}}$, were used as restraints by adding a term in the force field to penalize the difference between the experimental Φ-values and those calculated during the simulations, $\Phi^{\mathrm{calc}}$. The $\Phi^{\mathrm{exp}}$ values represent the ratio of the destabilization of the transition state (TS), $\Delta\Delta G_i^{\mathrm{TS-U}}$, compared to that of the native state (N), $\Delta\Delta G_i^{\mathrm{N-U}}$, due to a mutation $i$,7 and were interpreted as a measure of the ratio of native contacts, $\Phi^{\mathrm{calc}}$, present in the TSE.10,12,27

$$\Phi_i^{\mathrm{exp}} = \frac{\Delta\Delta G_i^{\mathrm{TS-U}}}{\Delta\Delta G_i^{\mathrm{N-U}}} \tag{1}$$

The unfolded state (U) is used as reference. The $\Phi^{\mathrm{calc}}$ values were computed for each residue in a given conformation as the ratio of native contacts8,10: $\Phi_i^{\mathrm{calc}} = Q_i^{\mathrm{conf}} / Q_i^{\mathrm{N}}$, where $Q_i^{\mathrm{conf}}$ designates the number of contacts of the residue $i$ in a conformation and $Q_i^{\mathrm{N}}$ the number of contacts of the same residue $i$ in the native state. A specific contact between two residues was considered to exist if the two residues were separated by at least two positions along the sequence and the distance between their Cα atoms was less than 8.5 Å.10 A contact was considered native if observed to exist for more than 80% of the time during the 100 ns simulation of the native state. $\Phi^{\mathrm{calc}}$ values were determined for each conformation of the MD simulations.

Nine transition state conformations (TSC) were extracted from the original set of 500 structures (the set Y = 500 K, see Ref. 27 for details) and will be referred to as TSC-X with X = 1 to 9.
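The contact-based definition of Φcalc given in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`contacts`, `native_contacts`, `phi_calc`), the use of plain Cα coordinate tuples, and the toy four-residue chain are hypothetical and chosen only for illustration. Two readings of the text are assumptions: "separated by at least two positions along the sequence" is taken to mean |i − j| ≥ 2, and the number of contacts of residue i in a conformation is taken to count only those contacts that are also native.

```python
import math
from typing import Dict, List, Optional, Sequence, Set, Tuple

Coord = Tuple[float, float, float]   # C-alpha position in Angstrom
Contact = Tuple[int, int]            # residue index pair (i, j) with i < j

CUTOFF = 8.5            # C-alpha distance cutoff quoted in the text (Angstrom)
MIN_SEPARATION = 2      # assumed reading of "at least two positions" apart
NATIVE_OCCUPANCY = 0.8  # a native contact exists in more than 80% of frames


def contacts(ca: Sequence[Coord]) -> Set[Contact]:
    """Return all residue pairs that are in contact in one conformation."""
    found: Set[Contact] = set()
    for i in range(len(ca)):
        for j in range(i + MIN_SEPARATION, len(ca)):
            if math.dist(ca[i], ca[j]) < CUTOFF:
                found.add((i, j))
    return found


def native_contacts(native_frames: Sequence[Sequence[Coord]]) -> Set[Contact]:
    """Contacts present in more than NATIVE_OCCUPANCY of the native frames."""
    counts: Dict[Contact, int] = {}
    for frame in native_frames:
        for pair in contacts(frame):
            counts[pair] = counts.get(pair, 0) + 1
    threshold = NATIVE_OCCUPANCY * len(native_frames)
    return {pair for pair, n in counts.items() if n > threshold}


def phi_calc(ca: Sequence[Coord], native: Set[Contact]) -> List[Optional[float]]:
    """Per-residue ratio Q_i^conf / Q_i^N; None if residue i has no native contact."""
    present = contacts(ca) & native  # assumption: only native contacts are counted
    phi: List[Optional[float]] = []
    for i in range(len(ca)):
        q_native = sum(1 for pair in native if i in pair)
        q_conf = sum(1 for pair in present if i in pair)
        phi.append(q_conf / q_native if q_native else None)
    return phi


# Toy example (hypothetical data): a straight four-residue chain as "native"
# state, and a conformation in which the last residue has moved away.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
partly_unfolded = native[:3] + [(30.0, 0.0, 0.0)]
native_set = native_contacts([native] * 5)
phi = phi_calc(partly_unfolded, native_set)
```

In this toy case the contact between the first and third residues survives while the contact between the second and fourth is lost, so the first and third residues keep a ratio of 1 and the other two drop to 0. For a real trajectory the coordinates would come from the simulation frames rather than from hand-written tuples.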