<<

PCA ETR:PERSPECTIVE FEATURE: SPECIAL

Molecular and function

M. Karplus*†§ and J. Kuriyan§¶ʈ *Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138; †Laboratoire de Chimie Biophysique, Institut de Science et d’Inge´nierie Supramole´culaires, Universite´Louis Pasteur, 67000 Strasbourg, France; ¶Howard Hughes Medical Institute and Departments of Molecular and Cell Biology and Chemistry, University of California, Berkeley, CA 94720-3202; and ʈPhysical Biosciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720

Edited by Bruce J. Berne, Columbia University, New York, NY, and approved February 15, 2005 (received for review January 18, 2005)

A fundamental appreciation for how biological work requires knowledge of structure and dynamics. Molecular dy- namics provide powerful tools for the exploration of the conformational energy landscape accessible to these , and the rapid increase in computational power coupled with improvements in methodology makes this an exciting time for the ap- plication of to . In this Perspective we survey two areas, and enzymatic catalysis, in which simulations have contributed to a general understanding of mechanism. We also describe results for the F1 ATPase molecular motor and the Src family of signaling as examples of applications of simulations to specific biological systems.

he conformational dynamics of protein domains that are not expected protein molecules is encoded in to undergo large conformational transi- their structures and is often a tions (e.g., an SH2 domain bound to a critical element of their func- phosphopeptide), a simulation lasting Ttion. A fundamental appreciation for several nanoseconds in a fully solvated how proteins work therefore requires an environment and using all- poten- understanding of the connection be- tials without truncation of the electro- tween three-dimensional structure, ob- static interactions will typically result in tained with increasing rapidity by x-ray rms deviations from the struc- crystallography and NMR, and dynam- ture of Ϸ1 Å for the backbone of ics, which is much more difficult to the core region of the protein (7). Such probe experimentally. Molecular dynam- simulations engender a degree of confi- ics simulations provide links between dence that the interesting structural structure and dynamics by enabling the changes that result from the simulation exploration of the conformational en- of larger assemblies [e.g., intact Src ki- ergy landscape accessible to protein nases containing SH2 domains (8)] are molecules (1–3). The first molecular dy- meaningful, as we describe later. namics simulation of a protein was re- Fig. 1. Simulation of a solvated protein. This slice Simulations are most effective when ported in 1977 and consisted of a 9.2-ps through a simulation system shows a Src kinase analyzed in close conjunction with ex- Ϸ trajectory for a small protein in protein (green) surrounded by 15,000 water mol- periments on protein function, which ecules (oxygen atoms are red and hydrogen atoms (4). Eleven years later, a 210-ps simula- play an essential role in validating and are white). The simulation system consists of improving the simulations (see, for ex- tion of the same protein in water was Ϸ50,000 atoms, including potassium and chloride reported (5), and the phenomenal in- ions (purple and orange spheres, respectively). A ample, refs. 9–12). One outcome of the crease in computing power since then 1-ns molecular dynamics trajectory for this system increase in readily available computer now makes it routine to run simulations can be generated in 4 days by using a cluster of four power and the standardization of simu- of much larger proteins that are 1,000– inexpensive Linux-based computers. (Courtesy of lation protocols is that it is now becoming 10,000 times as long as the original Olga Kuchment.) feasible for nonspecialists to reproduce simulation (Ϸ10–100 ns), in which the and check some, if not all, of the con- protein is surrounded by water and salt clusions of simulation analyses published lations can provide the ultimate detail by others. This kind of cross-validation (Fig. 1). Significant improvements in the concerning individual atomic as potential functions have also been will aid in the integration of simulations a function of time; thus, they can be achieved, making the simulations much into the normal process of interpreting used to answer specific questions about more stable and accurate (6). newly emerging structural results. A the properties of a model system often The engineering principles underlying prescription for making the analysis of are truly baroque, with more readily than experiments on the simulation results more rigorous and evolutionary tinkering resulting in molecu- actual system. For many aspects of bi- a list of possible errors has been pro- lar machines that may be effective and omolecule function, these details are of vided (13). efficient but whose three-dimensional interest (e.g., which of the many resi- In this Perspective, we illustrate the structures are often so complicated as to dues surrounding ATP in a motor pro- application of molecular dynamics simu- lations to biology by describing several obscure the mechanism of action. Clues tein are most important for coupling selected examples, emphasizing our own to the essential dynamics of the mole- ATP binding to molecular movement). work. We start with a broader survey of cule are quite often provided by the The combination of increased com- two areas, protein folding and enzymatic separation of the protein into domains puter power and improved potential connected by hinges and the availability functions has resulted in an ability to generate simulations that approach the of structures corresponding to different This paper was submitted directly (Track II) to the PNAS functional states, but it is not easy from point at which they can survive critical office. visual inspection or simple calculations examination by the experimentalists who §To whom correspondence may be addressed. E-mail: to deduce the workings of these molecu- determine the structures of the proteins [email protected] or [email protected]. lar machines. Molecular dynamics simu- being simulated. For small proteins or © 2005 by The National Academy of Sciences of the USA

www.pnas.org͞cgi͞doi͞10.1073͞pnas.0408930102 PNAS ͉ May 10, 2005 ͉ vol. 102 ͉ no. 19 ͉ 6679–6685 Downloaded by guest on September 30, 2021 catalysis, in which simulations combined with experiment have led to a general understanding of mechanism. The ex- plosive growth of structural results for an increasingly comprehensive range of biological processes demands simula- tions that address issues that are specific to particular systems. As examples, we describe simulation results for the F1 ATPase molecular motor and the Src family of signaling proteins. Protein Folding An understanding of how the newly syn- thesized polypeptide chain is able to fold to its native structure is fundamen- tal to the description of life at a molecu- lar level. This issue is made all the more critical because misfolding is involved in a range of diseases (14). A synergy be- tween simulation and experiment has led to a conceptual solution for one as- pect of the folding problem that is con- cerned with the mechanism by which the protein chain finds the native conforma- tion despite the immensity of the poten- tial search space. This advance has occurred despite the fact that folding is slow on the simulation time scale. (Even the fastest proteins take microseconds Fig. 2. Simulations of the folding of a three- protein. (a and b)(Upper) Semilog plots of the time dependence of the fractions of native helical and interhelical contacts and the inverse fractions of to fold, and the process cannot be simu- native volume (calculated from the inverse cube of the radius of gyration) for two different trajectories. lated directly.) Most of the simulations (Lower) Structures of the protein at selected times. [Reproduced with permission from ref. 20 of folding have used simplified models (Copyright 1999, Nature Publishing Group).] to capture the essential aspects of the process (15–17). Key insights as to how the native One such study of three-helix bundle formation from , structure can be found by the polypep- proteins used a C␣ model to represent illustrating a possible mechanism by tide chain in a reasonable time were the protein chain and a square well po- which large, relatively well ordered obtained by using Monte Carlo dynam- tential for the interactions between pairs ␤-sheet aggregates could appear in solu- ics for the folding of simplified bead of nonbonded residues. These simplifica- tion (23). models on a cubic lattice, a process fast tions made possible the use of discrete It is not yet possible to do statistically enough so that the hundreds of trajecto- molecular dynamics algorithms for meaningful folding simulations of pro- ries needed to obtain statistically mean- studying the folding process (20). The teins with all-atom models. However, ingful results could be calculated with speed of the latter is such that several peptides composed of 20 or so residues the available computers (18). These lat- hundred folding trajectories could be with a well defined native structure are tice Monte Carlo studies showed that calculated for different relative weights being simulated by molecular dynamics folding to the native state occurred on a of native and nonnative interactions in with implicit (15, 24, 25) and explicit time scale many orders of magnitude the model potential function. Fig. 2 (16) solvent. Although the peptides are small, the complexity of the simulations shorter than that required to sample all shows typical trajectories for a model can be usefully interpreted in terms of of the configurations (the so-called with the native state strongly favored disconnectivity graphical (24) and net- Levinthal time); e.g., with time mea- (Go-like potential) and a model in work (26) descriptions of the folding Ϸ ϫ which nonnative interactions make a sured in Monte Carlo steps, only 5 process. Given the increase in the speed 107 steps were required for folding a significant contribution along the folding 16 of computers, particularly through the 27-bead that had 10 possible pathway. In the former model, the heli- development of massively parallel ma- configurations. Although each folding ces form first and then diffuse to find chines, it is likely that within the next 10 trajectory is very different, the bias to the native fold [a limiting case of the years ab initio folding will reach the the native state on the free energy sur- diffusion–collision model (21)], whereas, stage at which it can be used not only to face is sufficient that the stochastic in the latter model, there first is a col- determine the details of the folding search samples only a small fraction of lapse to a relatively disordered globule mechanism but also to aid in solving the the total configurations in finding the and the helices form simultaneously ‘‘other’’ folding problem, that of deter- native state. This principle is now gener- with the native tertiary structure. The mining the structure of the native state ally accepted as the solution of the model results correlate with recent fold- directly from the sequence by molecular search problem in protein folding (17, ing experiments and all-atom (unfold- dynamics simulations. 19), and the objective of present studies ing) simulations in explicit solvent (22). is concerned with the specifics of the Interestingly, an extension of the dis- Enzyme Catalysis folding of individual proteins and the crete dynamics methodology has re- Enzyme catalysis can produce rate ac- ways that misfolding is avoided. cently been used to simulate fibril celerations of a factor of 1019 (27). This

6680 ͉ www.pnas.org͞cgi͞doi͞10.1073͞pnas.0408930102 Karplus and Kuriyan Downloaded by guest on September 30, 2021 process involves ‘‘molecular recognition’’ is primarily due to the lowering of the at the highest level; the catalysis of pro- activation barriers. ton-transfer reactions, for example, re- Dynamic effects associated with the quires the recognition of a change in a preexponential factor, A(T), are nor- COH of Ϸ0.5 Å in going mally very difficult to isolate experimen- from the reactant to the transition state. tally. One contribution concerns the In 1946, before structural information Eyring-type transmission coefficient, was available, Linus Pauling (28) pro- which can be less than unity because of posed that enzymes can accelerate reac- the recrossing of the barrier. Simula- tion rates because they bind the transition tions have shown that, although the state better than the substrate and recrossing factor is quantitatively inter- thereby lower the activation free energy, esting in terms of a full understanding ⌬G‡. The validity of this key concept in of the reaction, the magnitude of its enzyme catalysis has been confirmed by effect is generally small (no more than many studies (29, 30), although the a factor of 2 or 3) as compared with dynamical contribution to the preexpo- other contributions (30). By contrast, nential factor, A(T), in the Arrhenius tunneling can be more important in expression k ϭ A(T)exp(Ϫ⌬G‡͞RT), enzymatic reactions, particularly those where k is the rate constant, has to be involving the transfer of hydrogen (hy- considered as well (30). drogen atom, proton, or hydride ion) Computer simulations are essential (43–45). Because inferences concerning for understanding the lowering of the this effect from isotope studies are indi- activation free energy in terms of the rect (46, 47), simulations have been im- structure of the enzyme and its flexibil- portant for a direct determination of the ity. The structure provides a ‘‘preorga- magnitude of the tunneling. At ambient nized environment’’ (31) that enhances , the calculated rate en- catalysis by a greater stabilization of the hancements due to tunneling range from transition state than of the reactant a factor of 1.5 for the intermolecular state, with contributions from interac- proton transfer in TIM (43) to 780 for tions with the bound substrate (32, 33) soybean lipoxygenase (48). These rate and from the enzyme itself. A certain Fig. 3. Mechanism of TIM. (a) Electrostatic con- accelerations are equivalent to thermo- degree of enzyme flexibility is essential tribution of individual residues (in kcal͞mol on the dynamic free energy effects of 0.2–3.9 for catalysis because atomic motions of ordinate) to the lowering of the activation energy kcal͞mol, a significant contribution but the enzyme are required in all reactions barrier (TS1) of the reaction of the DHAP substrate one that is still small compared with (34). Moreover, larger scale motions to form the enolate intermediate. This is the rate- the lowering of the activation free en- also can be involved directly in catalysis determining step of the overall . ergy by 10 kcal͞mol or more. Detailed and in providing a protected catalytic The residues are plotted on the abscissa as a func- calculations of the tunneling contribu- site for the reaction while permitting the tion of the distance from the C␣ of the tion in DHFR are providing additional residue [or the oxygen of a water molecule (W)] to substrate to enter and the product to insights concerning its role in enzyme C1 of the substrate. Negative values correspond to escape. The changes in enzyme structure the lowering of the barrier. (b) Active-site structure catalysis (49). and vibrational modes associated with at transition state showing important residues and the progress along the reaction coordi- water molecules. [Reproduced with permission Molecular Dynamics Analyses of F1 nate have been shown to promote catal- from ref. 30 (Copyright 2004, AAAS).] ATPase ‡ ysis most efficiently by lowering ⌬G . The enzyme FoϪF1 ATP synthase is This effect is distinct from the role of perhaps the most remarkable of the mo- such motions in determining A(T). ute. This type of decomposition, which lecular machines that have been studied One of the enzymes that has been can be validated in part by experiment, at atomic resolution (Fig. 4A). This en- studied in detail experimentally and by also provides a basis for determining the zyme is central to life because it synthe- simulations is triosephosphate isomerase evolutionary variation in ⌬G‡ for the sizes ATP by harnessing the chemical (TIM), which catalyzes the conversion large number of available TIM se- potential of proton gradients across cell Ϫ of dihydroxyacetone phosphate (DHAP) quences (M.K., unpublished data). Be- membranes. One component of Fo F1 to (R)-glyceraldehyde 3-phosphate. The cause water can lead to a side reaction, ATP synthase (Fo) is mainly within the apparent barrier for the reaction in the protection of the active site is achieved membrane and is responsible for con- enzyme has been calculated to be 11–13 verting the chemical energy of the pro- ͞ ϭ by a ‘‘lid’’ , which makes the active kcal mol (1 cal 4.18 J) lower than site of TIM accessible to the substrate but ton gradient into rotatory motion of a that for the reaction in aqueous solution drive shaft that is located within the sec- closes it off for catalysis (37, 38). (35, 36). The rate-determining step is ond component, which is known as F Another enzyme that has been stud- 1 the transfer of a proton from DHAP to ATPase (50). The rotational motion of ied extensively is dihydrofolate reduc- Glu-165 and the contributions obtained the drive shaft is converted by F1 from a perturbation model (35) of indi- tase (DHFR), which catalyzes hydride ATPase into the production of ATP transfer between nicotinamide adenine vidual residues to lowering the activa- from ADP and Pi (inorganic phosphate, dinucleotide phosphate and 7,8-dihydro- Ϫ tion energy of this step are shown in H2PO4 ). The landmark crystal struc- folate. Experiments (39, 40) and simula- Fig. 3A; the positions of important resi- tures of F1 ATPase (51, 52), along with dues in the active site are shown in Fig. tions (41, 42) point to the fact that the the results of single molecule studies 3B. The charged residue Lys-12 makes reaction coordinate is complex, involv- (53) and more conventional biochemical the most important contribution, but the ing motions of parts of the enzyme experiments (54), have provided the ba- neutral His-95 side chain as well as cer- distant from the active site. Although sis for a series of simulations aimed at tain main chain NH groups also contrib- enzyme motion is involved, the catalysis understanding how F1 ATPase works.

Karplus and Kuriyan PNAS ͉ May 10, 2005 ͉ vol. 102 ͉ no. 19 ͉ 6681 Downloaded by guest on September 30, 2021 the conformation of the ␤-subunits. The ning and end states corresponding to a original crystal structure of F1 ATPase 120° rotation of the ␥-subunit are led to the identification of three confor- known (51), the challenge is essentially mations of the ␤-subunit: ␤E (empty), one of mapping a low-energy path that ␤TP (ATP bound), and ␤DP (ADP connects one stable conformation of the bound) (51). The open conformation of system to another, a general problem the ␤E-subunit is different from that of that is receiving considerable attention the ␤TP- and ␤DP-subunits, which are at present (64). The earliest attempt to both closed and very similar to each characterize the intermediate structures other. in F1 ATPase used an interpolation The prescient analysis of kinetic data method in a cyclindrical coordinate sys- by Paul Boyer (58) led to his proposal tem defined by the rotation axis of the (in advance of structural information) of ␥-subunit (65). These interpolated struc- a remarkable and unprecedented ‘‘bind- tures were recently used after local ing change mechanism’’ that described relaxation by molecular dynamics to an- the action of F1 ATPase. In terms of alyze the breaking of interactions be- this mechanism, ATP synthesis proceeds tween the ATP and the protein as the Fig. 4. F1 ATPase and targeted molecular dynam- by the cyclical conversion of the ␤- system is driven from the ␤TP state to Ϫ ics. (a) Structure of Fo F1 ATP synthase based on subunit from an ‘‘open’’ state that binds the ␤E state (59). More details concern- crystal structures of the F ATPase (51, 52). The F 1 o ATP only weakly to a ‘‘loose’’ state that ing the conformational transitions in F1 component is indicated as a gray cylinder within ␥ has higher affinity for ATP to a ‘‘tight’’ ATPase were obtained by molecular the membrane (yellow), and the -subunit shaft is state that has the highest affinity for dynamics simulations in the presence of shown in green. The three ␣-subunits are indicated ATP. The crystal structure (51), with its biasing that were applied to the by blue backbone traces, and the molecular sur- ␥ faces of the three ␤-subunits are shown. The three distinct conformations of the -subunit alone (60, 61) or to the entire ␥-subunit rotates in a clockwise direction during ␤-subunits and asymmetric disposition structure in a procedure that drives the ATP synthesis, as viewed from the membrane, and of the ␥-subunit shaft, clearly supported system from one state to the other with- the effects of a 120° rotation of the ␥-subunit are the binding change mechanism proposed out explicitly constraining the nature of manifested as a change in the conformation of the by Boyer (58). the transition path (61). The spirit in ␤-subunits. (b) Schematic representation of tar- Molecular dynamics simulations have which these simulations are done entails geted molecular dynamics. The simulated structure made contributions to our understand- ‘‘pushing’’ on the molecular assembly is restrained to be at a specified rms distance from ing of two aspects of the mechanism of (e.g., by forcing the ␥-subunit to rotate) the target structure, and this distance is gradually decreased during the simulation. Because the re- F1 ATPase. As is usually the case in and analyzing how the rest of the struc- straint is applied in an overall rms sense, the inter- crystallography, the structures are silent ture responds (Fig. 4B). Because the mediate structures are not specified explicitly and about the nature of the transitions from time scale of the forced rotational tran- different parts of the structure can relax toward one state to the other and the forces sition of the ␥-subunit is orders of mag- the final structures at different rates. involved. Simulations have been useful nitude faster than the actual rotation in piecing together at least some of what rate, the implicit assumption in such must happen as the motor moves studies is that meaningful information Of particular interest is the demonstra- through its duty cycle, and we discuss concerning the mechanism can be still tion that F1 ATPase, which by itself these studies first (59–61). The second be obtained. normally hydrolyzes ATP, can also issue concerns the identification of par- Both of these studies demonstrated synthesize ATP if the central shaft is ticular conformations of the ␤-subunit that the rotation of the ␥-shaft triggers rotated by external forces (55). High- with the open, loose, and tight states of the opening of the nucleotide-bound resolution structures for the Fo compo- the binding change mechanism. It is ap- ␤TP-subunit and the closure of the open nent are not available at present, but parent that the empty conformation ␤E-subunit (60, 61). The importance of this experiment demonstrates that it is (␤E) is the low-affinity state of the a track of ionic residues on the ␥-shaft reasonable to hope to understand how ␤-subunit, but the ␤DP and ␤TP states and on the inward-facing surfaces of the F1 ATPase synthesizes ATP without in- are very similar to each other, and it has ␣- and ␤-subunits has been highlighted cluding the Fo component in simulations not been possible on the basis of experi- (61). These ionic residues provide a (but see ref. 56 for a simulation analysis mental data alone to identify which of mechanism for smooth rotation of the of how Fo functions). the two is the high-affinity ␥-subunit by enabling the sequential The major structural element of F1 for ATP. Free energy difference sim- handover of ion pairing interactions. ATPase is a hexameric assembly of ulations of nucleotide binding to the The functional difference between the three ␣-subunits and three ␤-subunits ␤-subunits have resolved these ambigu- ␣- and ␤-subunits is demonstrated by surrounding the ␥-subunit, which has a ities (62), allowing the large body of ex- smooth bypass of the ␣-subunits by globular base and an extended coiled- perimental data on the complex to be the rotating ␥-subunit. In contrast, coil domain (51) (see Fig. 4A). All six of integrated into a detailed kinetic scheme interlocking elements from the ␥- and the ␣- and ␤- subunits bind nucleotides, that models the catalysis of ATP hydro- ␤-subunits generate responses in the ␤ but only the three -subunits are cata- lysis by F1 ATPase in the absence of the ␤-subunits as the ␥-subunits move. lytically active. The crystal structures of Fo component (63). One interesting phenomenon that F1 ATPase have proven to be enor- emerged from both sets of simulations mously informative regarding the mech- Conformational Transitions in F1 ATPase. (60, 61) is the rapid relaxation of the anism of the motor because each crys- The time scale for one rotation of the ␥ ␤E- subunit once the steric block im- tallographic snapshot provides views of shaft is in the microsecond to millisec- posed by the ␥-subunit is removed by three distinct states of the catalytic ond range (53) and is therefore not ac- rotation. Although solution NMR data ␤-subunits (51, 52, 57). The centrally cessible to the submicrosecond time for the isolated ␤E-subunit indicate that located and asymmetric ␥-subunit forms scales probed by standard molecular dy- the equilibrium is shifted toward the a shaft, and its orientation determines namics simulations. Because the begin- open conformation until the nucleotide

6682 ͉ www.pnas.org͞cgi͞doi͞10.1073͞pnas.0408930102 Karplus and Kuriyan Downloaded by guest on September 30, 2021 binds, the rapid closure of the ␤E- ATPase. The first contribution is the in- SH2 and SH3 domains tended to move subunit seen in the simulations with (61) crease in the binding free energy of ATP together as a unit, with the linker between and without (60) nucleotides is consis- in going from the ␤E to the ␤TP state; this them playing a particularly important role tent with the results of normal mode result is in agreement with the original in clamping these domains to the kinase calculations (66), which show that one model of Wang and Oster (65). However, domain. In contrast, NMR studies of iso- of the lowest frequency (i.e., most the second contribution is a new result; it lated SH2–SH3 units of Src kinases had readily excitable) normal modes of the arises from the fact that once hydrolysis shown that these domains are linked to- ␤-subunit accounts for a significant frac- has taken place and ADP.Pi occupies the gether flexibly (73). Indeed, simulations of tion of the conformational difference binding site, the conformation is biased isolated SH2–SH3 domains do reveal between the open and closed forms of toward ␤DP. A detailed kinetic model for them to be extremely flexible in terms of this subunit. ATP hydrolysis by F1 ATPase has been their relative orientation (8), suggesting developed with these results and solution that the linker between them functions as Free Energy Simulations of F1 ATPase. The kinetic constants identified with the dif- an ‘‘inducible snap lock,’’ a term intro- simulations described so far do not ad- ferent ␤-subunits (63). duced by Wright and coworkers (74) to dress the energetics of ATP binding to describe flexible connections between cer- the various states. The essential diffi- Dynamics of the Src Tyrosine Kinases tain zinc finger modules that snap into culty here arises from the striking simi- The Src tyrosine kinases are a family of rigidity when the zinc fingers bind to their larity in structure between the ␤DP and closely related proteins that transmit sig- cognate DNA recognition elements. In ␤TP states (52, 57). A recently deter- nals initiated by growth factor receptors in the Src proteins, the linker between the mined high-resolution structure of F1 human cells (67). A mutant and constitu- SH2 and SH3 domains is stabilized in the ATPase has the ATP analog ADP.AlF4 tively activated form of Src is encoded by assembled and inactive state by several bound at the ␤DP and the ␤TP sites, with the first oncogene to be discovered (v-Src; hydrogen bonds. When released from the tight coordination of the nucleotide in so named for the sarcomas caused by the interactions with the kinase domain, simu- both cases (52). Which site features expression of this retroviral gene). The lations show that these hydrogen bonds high-affinity ATP binding and which site Src proteins contain a tyrosine kinase do- are broken by water molecules and the more closely resembles the structure of main that catalyzes the transfer of phos- linker becomes flexible (8). the subunit when the chemical catalysis phate from ATP to tyrosine residues on Targeted molecular dynamics simula- step occurs? Answers to these questions substrate proteins. The inappropriate tions in which an ‘‘activation loop’’ that is have been provided by free energy dif- activation of tyrosine kinases such as Src located near the active site of the kinase ference simulations in which ATP.H2O can have deadly consequences in terms of domain is driven from the inactive confor- is converted computationally to ADP.Pi the onset of cancers because phosphory- mation to the active conformation while (62). These calculations, which are often lated tyrosines serve as targeting signals leaving the SH2 and SH3 domains unre- referred to as ‘‘computational alchemy,’’ for SH2 domains in signaling proteins strained showed that movements at the provide estimates of the standard free that control cell growth and differentia- active site of the kinase reverberate energy change, ⌬Go, for the conversion tion (68). through to the SH2 domain, even though of ATP into ADP at each of the occu- The Src kinases themselves contain an it is located Ͼ40 Å away (8). The cou- pied binding sites. This free energy SH2 domain and another -binding pling between the SH2 and SH3 domains change is calculated to be Ϫ9 kcal͞mol module known as the SH3 domain, both is the key to this intersite communication in the ␤DP site (i.e., ADP is strongly fa- of which are important for the mainte- in Src and underlies the importance of the vored over ATP at this site) and 1.5 nance of the inactive state of these pro- phosphotyrosine–SH2 linkage at the base kcal͞mole at the ␤TP site (i.e., ATP and teins. Crystal structures of inactive Src of the distal surface of the kinase domain. ADP have almost the same chemical kinases unexpectedly revealed that the If this linkage is broken, as in the form of potential at this site). SH2 and SH3 domains bind to the distal the protein produced by the v-Src onco- Solution measurements have shown face of the kinase domain, rather than gene, the kinase activity of Src is turned that, under ‘‘unisite’’ hydrolysis conditions near the active site (69, 70). How do the on constitutively (67). The molecular dy- (i.e., when the ATP concentration is so SH2 and SH3 domains affect the catalytic namics simulations show that the residues low that only the strongest binding site is activity of the kinase domain without be- in the SH2 domain that interact with the involved), the reaction free energy is near ing located near the active site? One pos- phosphotyrosine residue in the tail move zero; the experimental value is 0.4 kcal͞ sibility is that these domains affect the significantly when the activation loop in mol for the mitochondrial enzyme, whose global dynamics of the protein, making it the kinase domain is displaced. The SH2– structure has been determined, and Ϫ0.6 more difficult for the kinase domain to phosphotyrosine linkage would be ex- kcal͞mol for the Escherichia coli enzyme undergo the change in conformation from pected to resist these movements of the (63). Given the free energy simulation the observed inactive state (69, 70) to the activation loop and, thereby, to impede results just cited, it is possible to identify structurally different active form (71). the activation process. the ␤TP as the strong binding site for ATP Molecular dynamics simulations of the The importance of rigidity in the and the ␤DP site as the intermediate bind- Src kinases c-Src and Hck have revealed SH2–SH3 linker was tested experimen- ing site. This ‘‘missing link’’ between the an unexpected feature of the SH2 and tally and computationally (8). Simula- solution measurements and the crystal SH3 domains that may be a key to the tions in which linker residues were structure has made possible a complete regulation of the Src kinases (8). Before replaced by glycine caused the SH2 and assignment of the measured binding con- the simulation analysis, the SH2 and SH3 SH3 domains to become less tightly cou- stants not only for ATP but also for domains were considered to be flexibly pled. Introduction of corresponding mu- ADP.Pi (62). Interestingly, it was also linked and independently functioning tations into Src proteins in cell-based shown that the ␤DP site is the strong bind- modules (72, 73). Thus, it came as a sur- assays led to their constitutive activa- ing site for ADP.Pi (63). These results prise that the simulations showed the tion, bypassing the inhibitory action of indicate that there are two major contri- SH2–SH3 unit of the assembled Src pro- the distal SH2–phosphotyrosine linkage butions to the driving (actually a teins to be coupled together relatively rig- (8). We note that these mutations were biasing free energy) that rotates the idly. Unbiased simulations extending for designed only as a consequence of the ␥-subunit when ATP is hydrolysed by F1 several nanoseconds revealed that the molecular dynamics simulations, despite

Karplus and Kuriyan PNAS ͉ May 10, 2005 ͉ vol. 102 ͉ no. 19 ͉ 6683 Downloaded by guest on September 30, 2021 years of prior experimental investigation into Src function. The Abelson (Abl) Kinase and the Specificity of Imatinib Mesylate (Gleevec) An interesting corroboration of these re- sults from molecular dynamics was pro- vided by a subsequent structural analysis of the Abl tyrosine kinase (75). A close relative of the Src kinases, Abl contains SH2 and SH3 domains that are similar in sequence to those of Src, and, although it Fig. 5. Structure and dynamics of the Src and Abl kinases. (Left) The structures of c-Abl (green) and c-Src lacks the tyrosine residue that anchors the (red) are shown superimposed on their SH2 and SH3 domains (69, 70, 75). Note the dissimilarity in the SH2 domain to the kinase in Src when conformation of the kinase domains. (Center and Right) The results of unbiased molecular dynamics phosphorylated, the overall structure of simulations of c-Src. Residues in different domains that move in a correlated manner in the simulation are autoinhibited Abl is similar to that of Src linked by a red line. These correlations were calculated by superimposing each instantaneous structure in the simulation on the C-terminal lobe of the kinase domain, and motions that are correlated to the (75). Molecular dynamics simulations of C-terminal lobe are removed by this procedure. (Right) The mutation of residues in the SH2–SH3 linker to Abl carried out as for Src suggested that glycine reduces the correlation in the dynamics of these domains. Similar results were obtained for c-Abl. these domains form a similar snap lock in (Modified from refs. 8 and 75.) Abl (75). As in Src, mutations in the SH2–SH3 linker activate the Abl kinase (76, 77). The difference is that the SH2 difference in conformation to which deeper understanding of particular bio- domain in Abl is anchored to the kinase Gleevec is sensitive. logical systems. When molecular dynam- domain not by a phosphotyrosine-medi- ics simulations are a routine part of ated linkage, which keeps the SH2 and Future Prospects structural biology, it will become clearer kinase domains apart in Src, but through It is now often the case that experimental what refinements and extensions of the an intimate and direct interface that structures are available for more than one methodology are most needed to im- forms between the SH2 domain and the functional state of a molecular assembly, prove the results and to perfect the con- kinase domain. so that the problem of finding transition structive interplay between the simula- The Abl kinase domain is the target paths between low-energy regions of a tions and experiment. These conclusions of the successful cancer drug Gleevec, complex conformational space is becom- will, in turn, provide challenges for the which works by blocking the catalytic ing increasingly important. Despite the simulation experts and catalyze new de- activity of a form of the Abl protein difficulties inherent in the difference be- velopments in the field. In addition to that is activated in chronic myelogenous tween the experimental and simulation reaction path search methods, improved leukemia (78). Gleevec binds with high time scales, it is likely that creative solu- treatments of solvent by implicit meth- ods (84) are likely to play a role here. affinity to the inactive Abl kinase do- tions to the problem will be found, as il- Although simulations with explicit water main but not to that of Src, even though lustrated by the recent application of a molecules are the gold standard, implicit transition path sampling method (64) to all of the residues that make contact solvent models make possible the testing with the drug in Abl are conserved in the problem of fidelity checking by DNA of hypotheses by repeated simulations Src (79, 80). The striking feature of the polymerases (81). It will be particularly (e.g., with different mutants and͞or with structures of the inactive forms of Abl interesting to be able to calculate the free different constraints) for larger systems. and Src is that it is the conformation of energy barriers to conformational transi- Given the availability of several molecu- the SH2–SH3 unit and not that of the tions, which has not as yet been widely lar dynamics programs, large amounts of kinase domain that is conserved be- done for protein conformational changes computer time, and examples for which tween the two proteins (Fig. 5). It ap- because of the difficulties in specifying the molecular dynamics has really played a pears that the rigidity of the SH2–SH3 reaction coordinate for the transition. A role in furthering our understanding of unit, which is a feature of the Src and prominent exception occurs in the study protein functions, we look forward to a Abl simulations (8, 75), is manifested as of ion channels, for which the reaction wide field of biological applications for a conservation of the structure of the coordinate is essentially defined by the molecular dynamics in the future. SH2–SH3 unit in the inactive states of system (82). Alternatively, the generation both proteins. Because the SH2 domain of multiple targeted molecular dynamics We thank the National Energy Research Sci- in Abl is closer to the kinase domain trajectories could be used to extract the entific Computing Center for supercomputing than in Src, the SH3 domain in the rigid free energy directly (83). resources and Lore Leighton for preparing SH2–SH3 unit swings away, and the ki- It is to be hoped that experimental Fig. 4. J.K. thanks Matthew Young for many stimulating discussions regarding molecular nase domain adjusts its conformation so structural biologists, who know their dynamics. M.K. thanks M. Viloca-Garcia, J. as to preserve the interaction with the systems better than anyone else, will Gao, and D. G. Truhlar for helpful discus- SH3 domain. As a result, the kinase do- make increasing use of molecular dy- sions and collaboration on ref. 30, which main of Abl is more open, and it is this namics simulations for obtaining a forms the basis of the enzyme text.

1. Karplus, M. & McCammon, J. A. (2002) Nat. 4. McCammon, J. A., Gelin, B. R. & Karplus, M. 8. Young, M. A., Gonfloni, S., Superti-Furga, G., Struct. Biol. 9, 646–652. (1977) Nature 267, 585–590. Roux, B. & Kuriyan, J. (2001) Cell 105, 115–126. 2. Wang, W., Donini, O., Reyes, C. M. & Kollman, 5. Levitt, M. & Sharon, R. (1988) Proc. Natl. Acad. 9. Case, D. A. (2002) Acc. Chem. Res. 35, 325–331. P. A. (2001) Annu. Rev. Biophys. Biomol. Struct. Sci. USA 85, 7557–7561. 10. Soares, T. A., Daura, X., Oostenbrink, C., Smith, 30, 211–243. 6. Mackerell, A. D., Jr. (2004) J. Comput. Chem. 25, L. J. & van Gunsteren, W. F. (2004) J. Biomol. 3. Hansson, T., Oostenbrink, C. & van Gunsteren, 1584–1604. NMR 30, 407–422. W. (2002) Curr. Opin. Struct. Biol. 12, 190– 7. Price, D. J. & Brooks, C. L., III (2002) J. Comput. 11. Krieger, E., Darden, T., Nabuurs, S. B., Finkel- 196. Chem. 23, 1045–1057. stein, A. & Vriend, G. (2004) Proteins 57, 678–683.

6684 ͉ www.pnas.org͞cgi͞doi͞10.1073͞pnas.0408930102 Karplus and Kuriyan Downloaded by guest on September 30, 2021 12. Horita, D. A., Zhang, W., Smithgall, T. E., 38. Kursula, I., Salin, M., Sun, J., Norledge, B. V., 62. Yang, W., Gao, Y. ., Cui, Q., Ma, J. & Karplus, Gmeiner, W. H. & Byrd, R. A. (2000) Protein Sci. Haapalainen, A. M., Sampson, N. S. & Wierenga, M. (2003) Proc. Natl. Acad. Sci. USA 100, 874– 9, 95–103. R. K. (2004) Protein Eng., Des. Sel. 17, 375–382. 879. 13. van Gunsteren, W. F. & Mark, A. E. (1998) 39. Schnell, J. R., Dyson, H. J. & Wright, P. E. (2004) 63. Gao, Y. Q., Yang, W., Marcus, R. A. & Karplus, J. Chem. Phys. 108, 6109–6116. Annu. Rev. Biophys. Biomol. Struct. 33, 119–140. M. (2003) Proc. Natl. Acad. Sci. USA 100, 11339– 14. Dobson, C. M. (2003) Nature 426, 884–890. 40. Sawaya, M. R. & Kraut, J. (1997) 36, 11344. 15. Cavalli, A., Ferrara, P. & Caflisch, A. (2002) 586–603. 64. Bolhuis, P. G., Chandler, D., Dellago, C. & Gei- 47, Proteins 305–314. 41. Rod, T. H., Radkiewicz, J. L. & Brooks, C. L., III ssler, P. L. (2002) Annu. Rev. Phys. Chem. 53, 16. Simmerling, C., Strockbine, B. & Roitberg, A. E. (2003) Proc. Natl. Acad. Sci. USA 100, 6980–6985. 291–318. (2002) J. Am. Chem. Soc. 124, 11258– 42. Watney, J. B., Agarwal, P. K. & Hammes-Schiffer, 65. Wang, H. & Oster, G. (1998) Nature 396, 279–282. 11259. S. (2003) J. Am. Chem. Soc. 125, 3745–3750. 66. Cui, Q., Li, G., Ma, J. & Karplus, M. (2004) J. Mol. 17. Wolynes, P. G. (2005) Philos. Trans. R. Soc. Lon- 43. Cui, Q. & Karplus, M. (2002) J. Am. Chem. Soc. don A 363, 453–467. 124, 3093–3124. Biol. 340, 345–372. 18. Dobson, C. M., Sali, A. & Karplus, M. (1998) 44. Billeter, S. R., Webb, S. P., Iordanov, T., Agarwal, 67. Martin, G. S. (2004) Oncogene 23, 7910–7917. Angew. Chem. 37, 868–893. P. K. & Hammes-Schiffer, S. (2001) J. Chem. Phys. 68. Pawson, T. & Nash, P. (2003) Science 300, 445– 19. Karplus, M. (1997) Fold Des. 2, S69–S75. 114, 341–349. 452. 20. Zhou, Y. & Karplus, M. (1999) Nature 401, 400– 45. Truhlar, D. G., Gao, J., Alhambra, C., Garcia- 69. Sicheri, F., Moarefi, I. & Kuriyan, J. (1997) Nature 403. Viloca, M., Corchado, J., Sanchez, M. L. & Villa, 385, 602–609. 21. Islam, S. A., Karplus, M. & Weaver, D. L. (2002) J. (2002) Acc. Chem. Res. 35, 341–349. 70. Xu, W., Harrison, S. C. & Eck, M. J. (1997) Nature J. Mol. Biol. 318, 199–215. 46. Jonsson, T., Glickman, M. H., Sun, S. J. & Klin- 385, 595–602. 22. Gianni, S., Guydosh, N. R., Khan, F., Caldas, man, J. P. (1996) J. Am. Chem. Soc. 118, 10319– 71. Yamaguchi, H. & Hendrickson, W. A. (1996) T. D., Mayor, U., White, G. W., DeMarco, M. L., 10320. Nature 384, 484–489. Daggett, V. & Fersht, A. R. (2003) Proc. Natl. 47. Basran, J., Sutcliffe, M. J. & Scrutton, N. S. (1999) 72. Engen, J. R., Smithgall, T. E., Gmeiner, W. H. & Acad. Sci. USA 100, 13286–13291. Biochemistry 38, 3218–3222. Smith, D. L. (1999) J. Mol. Biol. 287, 645–656. 23. Nguyen, H. D. & Hall, C. K. (2004) Proc. Natl. 48. Tresadern, G., McNamara, J. P., Mohr, M., Wang, 73. Arold, S. T., Ulmer, T. S., Mulhern, T. D., Werner, Acad. Sci. USA 101, 16180–16185. H., Burton, N. A. & Hillier, I. A. (2002) Chem. J. M., Ladbury, J. E., Campbell, I. D. & Noble, 24. Krivov, S. V. & Karplus, M. (2004) Proc. Natl. Phys. Lett. 358, 489–494. M. E. (2001) J. Biol. Chem. 276, 17199–17205. Acad. Sci. USA 101, 14766–14770. 49. Hammes-Schiffer, S. (2004) Curr. Opin. Struct. 74. Laity, J. H., Dyson, H. J. & Wright, P. E. (2000) 25. Zhou, R. (2003) Proteins 53, 148–161. Biol. 14, 192–201. J. Mol. Biol. 295, 719–727. 26. Rao, F. & Caflisch, A. (2004) J. Mol. Biol. 342, 50. Boyer, P. D. (1997) Annu. Rev. Biochem. 66, 299–306. 717–749. 75. Nagar, B., Hantschel, O., Young, M. A., Schef- 27. Wolfenden, R. & Snider, M. J. (2001) Acc. Chem. 51. Abrahams, J. P., Leslie, A. G., Lutter, R. & fzek, K., Veach, D., Bornmann, W., Clarkson, B., Res. 34, 938–945. Walker, J. E. (1994) Nature 370, 621–628. Superti-Furga, G. & Kuriyan, J. (2003) Cell 112, 28. Pauling, L. (1946) Chem. Eng. News 24, 1375–1377. 52. Menz, R. I., Walker, J. E. & Leslie, A. G. (2001) 859–871. 29. Schowen, R. L. (1978) in Transition States of Cell 106, 331–341. 76. Hantschel, O., Nagar, B., Guettler, S., Biochemical Processes, eds. Gandour, R. D. & 53. Kinosita, K., Jr., Adachi, K. & Itoh, H. (2004) Kretzschmar, J., Dorey, K., Kuriyan, J. & Superti- Schowen, R. L. (Plenum, New York), pp. 77–114. Annu. Rev. Biophys. Biomol. Struct. 33, 245–268. Furga, G. (2003) Cell 112, 845–857. 30. Garcia-Viloca, M., Gao, J., Karplus, M. & Truhlar, 54. Weber, J. & Senior, A. E. (2003) FEBS Lett. 545, 77. Brasher, B. B., Roumiantsev, S. & Van Etten, D. G. (2004) Science 303, 186–195. 61–70. R. A. (2001) Oncogene 20, 7744–7752. 31. Villa, J. & Warshel, A. (2001) J. Phys. Chem. B 105, 55. Itoh, H., Takahashi, A., Adachi, K., Noji, H., 78. Druker, B. J. & Lydon, N. B. (2000) J. Clin. 7887–7907. Yasuda, R., Yoshida, M. & Kinosita, K. (2004) Invest. 105, 3–7. 32. Dinner, A. R., Blackburn, G. M. & Karplus, M. Nature 427, 465–468. 79. Schindler, T., Bornmann, W., Pellicena, P., (2001) Nature 413, 752–755. 56. Aksimentiev, A., Balabin, I. A., Fillingame, R. H. Miller, W. T., Clarkson, B. & Kuriyan, J. (2000) 33. Fromme, J. C., Bruner, S. D., Yang, W., Karplus, & Schulten, K. (2004) Biophys. J. 86, 1332–1344. Science 289, 1938–1942. M. & Verdine, G. L. (2003) Nat. Struct. Biol. 10, 57. Kagawa, R., Montgomery, M. G., Braig, K., Leslie, 80. Nagar, B., Bornmann, W. G., Pellicena, P., 204–211. A. G. & Walker, J. E. (2004) EMBO J. 23, 2734– Schindler, T., Veach, D. R., Miller, W. T., Clark- 34. Brooks, C. L., Karplus, M. & Pettitt, B. M. (1988) 2744. son, B. & Kuriyan, J. (2002) Cancer Res. 62, Proteins: A Theoretical Perspective of Dynamics, 58. Boyer, P. D. (1993) Biochim. Biophys. Acta 1140, 4236–4243. Structure and (Wiley, New 215–250. York). 59. Antes, I., Chandler, D., Wang, H. & Oster, G. 81. Radhakrishnan, R. & Schlick, T. (2004) Proc. Natl. 35. Cui, Q. & Karplus, M. (2002) J. Phys. Chem. B 106, (2003) Biophys. J. 85, 695–706. Acad. Sci. USA 101, 5970–5975. 1768–1798. 60. Bockmann, R. A. & Grubmuller, H. (2002) Nat. 82. Berneche, S. & Roux, B. (2001) Nature 414, 73–77. 36. Feierberg, I. & Åqvist, J. (2002) Theor. Chem. Acc. Struct. Biol. 9, 198–202. 83. Hummer, G. & Szabo, A. (2001) Proc. Natl. Acad. 108, 71–84. 61. Ma, J., Flynn, T. C., Cui, Q., Leslie, A. G., Walker, Sci. USA 98, 3658–3661. 37. Joseph, D., Petsko, G. A. & Karplus, M. (1990) J. E. & Karplus, M. (2002) Structure (Cambridge, 84. Feig, M. & Brooks, C. L., III (2004) Curr. Opin. Science 249, 1425–1428. MA,U.S.)10, 921–931. Struct. Biol. 14, 217–224.

Karplus and Kuriyan PNAS ͉ May 10, 2005 ͉ vol. 102 ͉ no. 19 ͉ 6685 Downloaded by guest on September 30, 2021