The Limited Role of Nonnative Contacts in the Folding Pathways of a Lattice Protein
doi:10.1016/j.jmb.2009.06.058
J. Mol. Biol. (2009) 392, 1303–1314

Brian C. Gin1,2,3, Juan P. Garrahan4 and Phillip L. Geissler1,2⁎

1 Department of Chemistry, University of California at Berkeley, Berkeley, CA 94720, USA
2 Chemical Sciences and Physical Biosciences Divisions, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
3 School of Medicine, University of California at San Francisco, San Francisco, CA 94143, USA
4 School of Physics and Astronomy, University of Nottingham, Nottingham NG7 2RD, UK

⁎ Corresponding author. 207 Gilman Hall, Department of Chemistry, University of California at Berkeley, Berkeley, CA 94720, USA. E-mail address: [email protected].

Received 25 March 2009; received in revised form 19 June 2009; accepted 23 June 2009. Available online 2 July 2009. Edited by M. Levitt.

Models of protein energetics that neglect interactions between amino acids that are not adjacent in the native state, such as the Gō model, encode or underlie many influential ideas on protein folding. Implicit in this simplification is a crucial assumption that has never been critically evaluated in a broad context: Detailed mechanisms of protein folding are not biased by nonnative contacts, typically argued to be a consequence of sequence design and/or topology. Here we present, using computer simulations of a well-studied lattice heteropolymer model, the first systematic test of this oft-assumed correspondence over the statistically significant range of hundreds of thousands of amino acid sequences that fold to the same native structure. Contrary to previous conjectures, we find a multiplicity of folding mechanisms, suggesting that Gō-like models cannot be justified by considerations of topology alone. Instead, we find that the crucial factor in discriminating among topological pathways is the heterogeneity of native contact energies: The order in which native contacts accumulate is profoundly insensitive to omission of nonnative interactions, provided that native contact heterogeneity is retained. This robustness holds over a surprisingly wide range of folding rates for our designed sequences. Mirroring predictions based on the principle of minimum frustration, fast-folding sequences match their Gō-like counterparts in both topological mechanism and transit times. Less optimized sequences dwell much longer in the unfolded state and/or off-pathway intermediates than do Gō-like models. For dynamics that bridge unfolded and folded states, however, even slow folders exhibit topological mechanisms and transit times nearly identical with those of their Gō-like counterparts. Our results do not imply a direct correspondence between folding trajectories of Gō-like models and those of real proteins, but they do help to clarify key topological and energetic assumptions that are commonly used to justify such caricatures.

© 2009 Elsevier Ltd. All rights reserved.

Keywords: Gō model; nonnative contacts; lattice model; protein folding; principle of minimum frustration

Abbreviations used: CAO, contact appearance order; CDO, contact disappearance order.

Introduction

The current understanding of protein folding has been strongly shaped by theoretical and computational studies of simplified models.1 Such models are typically constructed by discarding fine details of molecular structure or by making simplifying assumptions about the energies of interaction among amino acid residues. A special class of models, based on Gō's insights, asserts that only a subset of interactions (those between segments of a protein that contact one another in the native state) is crucially important for folding.2 The Gō model further assumes a unique energy scale for these native contacts. Here, we will focus on elaborated “Gō-like” models that allow for a diversity of native contact energies.

The Gō model was originally proposed as a schematic but microscopic perspective on the stability and kinetic accessibility of proteins' native states. It accordingly provided generic insight into issues of cooperativity and nucleation, and the relationship between sequence and structure.1 Results from early studies employing Gō-like models established a basis for theories that focus on gaps in the spectrum of conformational energies3,4 and the funnel-like nature of potential energy landscapes.5–9 Corroborated by experiments, concepts intrinsic to and inspired by Gō-like models now form a canon of widely accepted ideas about how proteins fold.1,10,11

The role of Gō-like models, in comparison with experimental data, has expanded beyond its original conception as a schematic tool. Neglect of nonnative contacts offers substantial computational relief to numerical simulations, allowing thorough kinetic and thermodynamic studies to be performed even for detailed molecular representations.12–15 For example, Gō-like models have been used successfully to predict the folding rates of small single-domain proteins from their native topologies.16 Recently, several studies have ascribed additional significance to the detailed dynamic pathways defined by Gō-like models.14 The transition states and order of folding events in Gō-like simulations of several single-domain proteins17 and multimeric structures18 were found to largely correspond with those inferred from an analysis of experimental ϕ-values. A recent comparison of equilibrium and kinetic data between Gō-like models and an experiment for the villin subdomain revealed that even the simplest topology-based models were able to reproduce a wide range of experimental results, but did not provide evidence sufficient to conclude that Gō-like predictions of folding order are followed by the real protein.19 While these results suggest that Gō-like models can provide much more than schematic physical perspectives, the advisability of drawing detailed mechanistic conclusions from their folding dynamics remains unclear. Arguments in favor of a precise correspondence with realistic molecular dynamics are often based on theories suggesting similarities between the energy landscapes of real proteins and their Gō-like representations, as afforded by topological constraints and evolutionary pressures. Nevertheless, such general principles offer only rough guidance, and few computational studies have compared the folding pathways of Gō-like models and their “full” counterparts (in which nonnative contact energies are included) in a broad context.20

Very favorable interactions between segments of a protein that are not adjacent in the folded state generally impede folding. They might do so by introducing detours or traps en route to the native state or simply by stabilizing the ensemble of unfolded

proteins taken from living organisms fold reliably and with relative efficiency.26

These notions and observations motivate a “principle of minimum frustration” asserting that natural amino acid sequences have been “designed” by evolution to minimize the disruptive influence of nonnative contacts on the dynamics of folding.5 One might thus apply Gō-like models to these designed sequences with confidence, since omitted interactions are precisely the ones whose effects have been mitigated by natural selection. By contrast, one might expect Gō-like models to poorly represent the folding mechanisms of slowly folding molecules, whose nonnative interactions are presumably responsible for hampering pathways to the native state.22,23

Testing these ideas of sequence design and kinetic frustration is made difficult by several factors. Experimentally, microscopic details of folding kinetics cannot be resolved, but only inferred, from indirect observables or effects of mutations. Furthermore, the most concrete hypotheses stemming from the principle of minimum frustration involve Gō-like models, which cannot be realized in the laboratory. Computer simulations of detailed molecular representations can generate, at great cost, dynamic information sufficient to determine a folding mechanism for only the smallest of natural proteins.27 Although the statistical dynamics of coarse-grained or schematic representations can be readily explored, biology does not provide collections of fast-folding and slow-folding sequences to compare in these artificial contexts. Finally, even when appropriate ensembles of sequences and ensembles of folding trajectories are available, a useful comparison of Gō-like models and their full counterparts requires a compact way of characterizing the course of highly chaotic dynamics.28 A general method for this purpose is not available, although studies of nucleation as a rate-limiting fluctuation provide a useful starting point.29,30

This article presents the first systematic large-scale comparison of folding pathways within Gō-like and full models. We focus on a schematic lattice representation of proteins that is well suited for this task in several ways: (a) geometrically, because contacting segments of the chain can be unambiguously identified; (b) statistically, because representative ensembles of folding trajectories can be generated for large numbers of amino acid sequences; and (c) conceptually, because the essential competition between contact energetics and chain connectivity can be isolated from the complicating effects of secondary structure, side-chain packing,