Rational stabilization of enzymes by computational redesign of surface charge–charge interactions

Alexey V. Gribenko1,2, Mayank M. Patel1, Jiajing Liu, Scott A. McCallum, Chunyu Wang, and George I. Makhatadze3

Department of Biology and Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, Troy, NY 12180

Edited by Robert L. Baldwin, Stanford University Medical Center, Stanford, CA, and approved December 15, 2008 (received for review August 20, 2008)

Here, we report the application of a computational approach that allows the rational design of enzymes with enhanced thermostability while retaining full enzymatic activity. The approach is based on the optimization of the energy of charge–charge interactions on the protein surface. We experimentally tested the validity of the approach on 2 human enzymes, acylphosphatase (AcPh) and Cdc42 GTPase, that differ in size (98 vs. 198-aa residues, respectively) and tertiary structure. We show that the designed proteins are significantly more stable than the corresponding WT proteins. The increase in stability is not accompanied by significant changes in structure, oligomerization state, or, most importantly, activity of the designed AcPh or Cdc42. This success of the design methodology suggests that it can be universally applied to other enzymes, on its own or in combination with the other strategies based on redesign of the interactions in the protein core.

computational design | protein engineering | protein stability

Author contributions: A.V.G. and G.I.M. designed research; A.V.G., M.M.P., J.L., S.A.M., and C.W. performed research; A.V.G. and M.M.P. analyzed data; and A.V.G., M.M.P., and G.I.M. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
Data deposition: The atomic coordinates and structure factors have been deposited in the Protein Data Bank, www.pdb.org (PDB ID codes 2K7J and 2K7K).
1A.V.G. and M.M.P. contributed equally to this work.
2Present address: Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, TX 77555.
3To whom correspondence should be addressed at: Center for Biotechnology and Interdisciplinary Studies 3244A, Rensselaer Polytechnic Institute, 110 8th Street, Troy, NY 12180. E-mail: [email protected].
This article contains supporting information online at www.pnas.org/cgi/content/full/0808220106/DCSupplemental.
© 2009 by The National Academy of Sciences of the USA

Until man duplicates a blade of grass, nature will laugh at his so-called scientific knowledge.
Thomas Edison

Rational engineering of proteins to enhance stability and yet retain their enzymatic activity is well motivated (1). One motivation is the practical significance of expanding the use of enzymes in many areas of the modern world, including protein therapeutics, enzymes for food industry, diagnostics, and other areas of industrial biotechnology. Another motivation is validation of the existing scientific knowledge. In this case, predictions made by the existing models for protein stability are subjected to thorough experiments, testing their applicability to protein design. In this paper, we present the results of rational design of enzymes with enhanced stability and unchanged enzymatic activity. This approach has 2 major differences from previously described successful protein design methods (2–5): (i) it concentrates only on the residues on the protein surface, and (ii) it optimizes just one type of interactions, namely, charge–charge interactions on the protein surface (6–15).

One of the most important aspects of engineering proteins with enhanced stability, retaining the enzymatic activity, is often forgotten. However, for all of these design efforts to be practically useful, it is important that the engineered proteins retain their biological and enzymatic activity. This issue is particularly important when enhanced protein stability is achieved by redesigning the charge–charge interactions on the protein surface. Such redesign can lead to several potentially detrimental effects on the activity: (i) it can affect the electrostatic potential in the active center, thus reducing or even abolishing the activity; (ii) it can affect substrate/ binding and again reduce or abolish the enzymatic activity; or (iii) it can have effects on the kinetics of substrate binding and, thus, lower the activity via reduced rates of electrostatic steering (3, 16–22).

To this end, it is important to test whether redesign of the surface charges that leads to the increase in stability affects enzyme activity. As the test model systems, we chose 2 human enzymes, acylphosphatase and Cdc42. These proteins were chosen for several reasons. First, they are both enzymes, and what is more important is that they both bind charged substrates. Second, they differ in size and topology, thus providing a test for universality of the design strategy. Third, the 3D structures of these proteins are known, which is crucial for setting up the computational energy-optimization procedure. The designed variants of these 2 proteins were characterized by a battery of biophysical methods showing that they are more stable than the respective WT proteins and, by using biochemical methods, showing that they do retain their enzymatic activities at levels comparable with the WT proteins.

Results and Discussion

Computational Design of Proteins. Two model proteins were chosen for the test: human acylphosphatase, AcPh (EC-Number 3.6.1.7), and human cell-division cycle 42 factor, Cdc42 (EC-Number 3.6.5.2). Acylphosphatase is a 98-residue protein that catalyzes hydrolysis of acylphosphates to produce carboxylate and inorganic phosphate (23). This is proposed to be involved in several metabolic pathways and, in particular, in /, pyruvate , and benzoate degradation via CoA ligation (23). Acylphosphatase activity has been identified in both prokaryotes and eukaryotes (23). According to SCOP classification, this protein belongs to the structural family of α+β sandwich proteins with antiparallel β-sheet (Fig. 1). The actual location of the - is not known. It is believed to be located near the sulfate-binding site observed in the 2ACY structure of the bovine protein (Fig. 1), with residues R23, K24, N41, and K98 located in close proximity. Biochemical and mutagenesis analyses have provided further support for this notion and identified R23 and N41 (23) as residues that are critically important for the activity of AcPh. This observation is also supported by our results on redesign of AcPh that were not concerned with maintaining activity. It was shown that a variant with 4 substitutions (K24E/E63K/N81K/Q95K), although more stable than the WT (7), is in fact inactive.

Cdc42 is a eukaryotic protein and belongs to the family of small GTPases (24). It is a key enzyme that is involved in regulation of numerous vital cellular pathways such as gene expression, cell-cycle progression, and rearrangement of the actin cytoskeleton (25). It has 198 amino acid residues arranged

into a fold common for all Ras GTPases: a 6-stranded β-sheet surrounded by 5 α-helices (Fig. 1). Eleven residues were identified to be in direct contact with the GTP molecule, of which three (K16, D57, and D118) are ionizable residues.

www.pnas.org/cgi/doi/10.1073/pnas.0808220106 PNAS Early Edition

Fig. 1. The diagram structures of 2 proteins studied in this paper. The locations of active sites are shown by van der Waals surfaces for the substrates. Positions of the substitutions are shown by gray circles corresponding to the backbone CA atoms. (A) Acylphosphatase (AcPh): comparison of the structures of bovine protein (PDB code 2ACY) (34) with the 3D structures of human protein AcPh-wt (blue) and its designed variant AcPh-des (green). The structures of AcPh-wt and AcPh-des were determined in this work by using multidimensional NMR spectroscopy (see Materials and Methods and SI Appendix for details). (B) Cdc42: structural model of human protein (PDB code 1AN0) as solved by X-ray crystallography in ref. 40.

Distribution of surface charges in these 2 proteins was optimized by the Tanford–Kirkwood surface accessibility genetic algorithm (TKSA-GA), which is based on a GA for optimization of charge–charge interactions in proteins (26), in which the energies of charge–charge interactions are calculated by using TK formalism (27) that includes SA correction as introduced in ref. 28. The dependence of the energy of charge–charge interactions on the number of substitutions as identified by TKSA-GA is shown in Fig. 2 A and C. The favorable energies of charge–charge interactions increase with the number of substitutions until reaching saturation values. Such saturation behavior has been observed for other proteins and is consistent with 2 facts (7, 8, 29): (i) for a given protein topology, there is a finite number of surface sites where charged residues can be introduced, and (ii) addition of a charged residue at a new site has both favorable (with residues of the opposite charge) and unfavorable (with residues of the same charge) interactions. For experimental tests, we selected sequences that have the largest increase in the energy of charge–charge interactions but have the number of substitutions at 5–6% of the total number of amino acid residues. These sequences are shown in Fig. 2 B and D. For AcPh, the substitutions were made at 5 positions, 3 of which led to charge reversal (H60E, E63K, and K72E) and 2 of which introduced new charges (Q50K and N81K). For Cdc42, we fully characterized 2 variants with 7 (Cdc42-des1) and 8 (Cdc42-des2) substitutions. Cdc42-des1 has 4 charge-reversal substitutions (E95K, K107E, D121K, and K131E) and 3 new charges (Q74E, N167K, and V189K). Cdc42-des2 has an additional charge-reversal substitution (E178K). Calculations show that the designed variants should have more favorable energies of charge–charge interactions, in particular because of a decrease in the number of amino acid residues with overall unfavorable energies of charge–charge interactions, ΔGqq (see Fig. 2 B and D for TKSA calculations and Fig. S1 for calculations using the MCCE software package) (30).

Experimental Characterization of Designed Proteins. These designed proteins were cloned, expressed, and purified (see Materials and Methods), and their physico–chemical properties were examined and compared with the properties of the corresponding WT proteins. First, we compared the oligomerization properties of the WT and designed proteins. Substitutions on the protein surface can influence the oligomerization state in solution by

Fig. 2. The dependence of the energy of charge–charge interactions on the number of amino acid substitutions (or the percentage of substitutions relative to the total number of amino acid residues) in AcPh (A) and Cdc42 (C). Each small dot corresponds to a different sequence. The sequences selected for experimental verification are shown in large symbols: AcPh-wt or Cdc42-wt, black circles; AcPh-des, red squares; Cdc42-des1, blue squares; and Cdc42-des2, red triangles. (B and D) The corresponding per residue energies of charge–charge interactions as calculated by TKSA model are given in B (AcPh-wt, black bars; AcPh-des, red bars) and D (Cdc42-wt, black bars; Cdc42-des1, red bars; Cdc42-des2, blue bars). Positive energies represent overall unfavorable interactions of a given residue with all other ionizable residues in the proteins, whereas negative values of ΔGqq reflect overall favorable interactions.
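As an illustration of the kind of search summarized in Fig. 2, the following minimal Python sketch runs a genetic algorithm over the charge states (+1, 0, -1) of a few surface sites. It is not the TKSA-GA implementation: the energy function is a bare Coulomb sum in arbitrary units with no solvent-accessibility correction, and the site coordinates, the wild-type charges, and all names (`pair_energy`, `optimize_charges`, `p_mut`) are assumptions made for this example.

```python
import random

def pair_energy(charges, coords):
    """Toy Coulomb sum over all site pairs (arbitrary units, no screening)."""
    total = 0.0
    n = len(charges)
    for a in range(n):
        for b in range(a + 1, n):
            r = sum((coords[a][k] - coords[b][k]) ** 2 for k in range(3)) ** 0.5
            total += charges[a] * charges[b] / r
    return total

def optimize_charges(wt, coords, generations=200, pop_size=40, p_mut=0.05, seed=0):
    """Genetic algorithm with tournament selection, one-point crossover, elitism."""
    rng = random.Random(seed)

    def score(ind):
        return pair_energy(ind, coords)

    def mutate(ind):
        # Each site is reassigned a random charge state with probability p_mut.
        return [rng.choice((-1, 0, 1)) if rng.random() < p_mut else q for q in ind]

    # The wild-type assignment is always part of the starting population.
    population = [list(wt)] + [mutate(wt) for _ in range(pop_size - 1)]
    best = min(population, key=score)
    for _ in range(generations):
        children = [best]  # elitism: the current best is carried over unchanged
        while len(children) < pop_size:
            p1 = min(rng.sample(population, 3), key=score)
            p2 = min(rng.sample(population, 3), key=score)
            cut = rng.randrange(1, len(wt))
            children.append(mutate(p1[:cut] + p2[cut:]))
        population = children
        best = min(population, key=score)
    return best

# Hypothetical sites on the corners of a 5 Å cube, with invented WT charges.
coords = [(0, 0, 0), (5, 0, 0), (0, 5, 0), (5, 5, 0),
          (0, 0, 5), (5, 0, 5), (0, 5, 5), (5, 5, 5)]
wt = [1, 1, -1, 0, 1, -1, 0, 0]
best = optimize_charges(wt, coords)
n_subs = sum(1 for a, b in zip(wt, best) if a != b)
print("WT energy:", pair_energy(wt, coords))
print("designed energy:", pair_energy(best, coords), "substitutions:", n_subs)
```

Because the wild-type assignment seeds the population and the best individual is always retained, the designed energy cannot exceed the wild-type energy. Tracking `n_subs` against the energy gain gives the same kind of substitutions-versus-energy view as Fig. 2 A and C.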

increasing the size of oligomer, by creating complementary surfaces, or by inducing domain swapping (31, 32). Any of these mechanisms would increase the association constant, thereby making the oligomerization stronger. Such changes in the quaternary structure can lead to changes in stability. However, this increase/decrease in stability will not be directly relevant to the main hypothesis of this work, which is the demonstration that protein stability can be enhanced by optimization of charge–charge interactions on the protein surface. Both WT AcPh and Cdc42 are monomeric in solution as measured by equilibrium analytical ultracentrifugation. Importantly, all of the designed proteins were also found to be monomeric in solution (Fig. S2). This finding suggests that if the designed AcPh and Cdc42 are more stable, the increase in stability would not be due to changes in the oligomerization state of the proteins.

Changes in the structure of proteins upon amino acid substitutions can be another reason for changes in stability and thus create additional challenges for testing the hypothesis. Such substitutions can sometimes lead to dramatic structural changes (33). These changes in the structure of the native-state protein could, in turn, alter stability in a way that could compromise experimental proofs of enhancing protein stability solely by optimization of charge–charge interactions on the protein surface. To address this issue, structures of proteins were examined first at the low level of resolution by using circular-dichroism spectroscopy. Comparison showed no changes in the far-UV CD spectra between WT and designed proteins, suggesting that there are no large structural changes in these proteins with substitutions (Fig. S3). To further validate this conclusion, at higher structural resolution, we solved solution structures of AcPh-wt and AcPh-des by using multidimensional NMR spectroscopy (see SI Appendix for procedures on structure determination and detailed statistics of the final structures). Fig. 1A shows the overlay of the averaged structures of these 2 proteins, AcPh-wt and AcPh-des, together with the reported X-ray structure of bovine AcPh (34) that was used for the TKSA-GA optimization protocol. The rms deviation of the backbone atoms between bovine, ensemble-averaged AcPh-wt, and AcPh-des is <1.1 Å. This observation further supports the notion that, structurally, the designed protein is very similar to the WT AcPh.

Because there are no changes in the oligomerization state of the proteins and there are no dramatic changes in their 3D structure, all changes in stability are probably directly related to differences in charge–charge interactions in the WT and designed proteins.

Fig. 3. Biophysical characterization of the designed and WT acylphosphatase (AcPh) and Cdc42. (A) Comparison of the experimental partial molar heat capacity profiles of AcPh-wt (white circles) and AcPh-des (red squares). Solid lines are the results of the fit according to a 2-state unfolding model. The results of the fit are given in Table 1. (B) Comparison of the experimental temperature-induced unfolding profiles of AcPh-wt (black thin line) and AcPh-des (red thin line) as monitored by the changes in ellipticity at 222 nm. Thick solid lines are the results of the fit according to a 2-state unfolding model. The results of the fit are given in Table 1. (C) Comparison of the experimental temperature-induced unfolding profiles of Cdc42-wt (thin black line), Cdc42-des1 (thin blue line), and Cdc42-des2 (thin red line) as monitored by the changes in ellipticity at 222 nm. (D) Dependence of specific activity of Cdc42 variants (Cdc42-wt, black circles; Cdc42-des1, blue squares; Cdc42-des2, red triangles) on temperature. Lines are drawn to guide the eye. (E) Resistance of Cdc42 variants to aggregation after exposure to high temperature. The fraction of soluble monomer that remained in solution after exposure to elevated temperatures was measured by using size-exclusion chromatography as described in Materials and Methods. Shown are Cdc42-wt (black circles), Cdc42-des1 (blue squares), and Cdc42-des2 (red triangles), and lines are drawn to guide the eye.

Stability of the AcPh-wt and AcPh-des proteins in solution was estimated from temperature-induced reversible unfolding by using 2 different methods: differential scanning calorimetry (DSC) and monitoring the changes in ellipticity at 222 nm by using CD spectroscopy (Fig. 3 A and B). Temperature-induced unfolding was fully reversible, and the transitions were analyzed by using a 2-state unfolding model (see SI Appendix). Importantly, both methods (DSC and CD) produced similar thermodynamic parameters, suggesting that there are no changes in the unfolding mechanism upon substitutions in AcPh-des (Table 1). Notably, AcPh-des has an unfolding temperature 10 °C higher than AcPh-wt, which translates into an increase in Gibbs energy of 9 ± 1 kJ/mol. This increase in stability is rather substantial (4), suggesting that indeed, the TKSA-GA protein-stabilization approach can identify protein sequences that will produce more stable proteins.

How does this increase in stability affect the enzymatic activity of AcPh? The activity of AcPh proteins was compared by using synthetic substrate benzoylphosphate (35, 36). The results of activity measurements at different temperatures are given in Table 1 (actual kinetic traces are shown in Fig. S4). It is remarkable that not only does the AcPh-des protein remain active, it also has very similar kinetic parameters as the WT. At all temperatures, the catalytic constant kcat is the same for both enzymes (AcPh-wt and AcPh-des). Although there is some increase in the Michaelis–Menten constant Km at room temperature, the difference becomes very small at 55 °C (Table 1). Thus, for AcPh, the amino acid substitutions that were introduced by design via optimization of charge–charge interactions on the protein surface produced an enzyme that is not only 10 °C more stable but also equally catalytically active.

Temperature-induced unfolding of Cdc42 is irreversible because of aggregation in the unfolded state of the protein (see below). To compare the transition profiles, temperature-induced unfolding was monitored by using changes in ellipticity at 222 nm at different protein concentrations and solvent conditions (2 M urea) (Fig. 3C and Table 2). Melting profiles for Cdc42-wt, Cdc42-des1, and Cdc42-des2, when compared at identical concentrations (Fig. 3C and Table 2), show that the designed variants have ≈8 °C and ≈10 °C higher midpoints of transition, respectively, than the Cdc42-wt protein.

The enzymatic activity of all 3 proteins (Cdc42-wt, Cdc42-des1, and Cdc42-des2) at room temperature remains unchanged (Table 2). The temperature dependence of the specific activity for these 3 proteins is compared in Fig. 3D. It is clear that both designed Cdc42 proteins remain active at temperatures higher than the WT protein. Estimates show that the difference in midpoint of temperature inactivation for the Cdc42-des1 and Cdc42-des2 proteins is ≈10 ± 2 °C, which is consistent with the estimates based on the CD-melting profiles (Fig. 3D and Table 2). The changes in CD signal and loss of activity are due to the protein unfolding followed by irreversible protein aggregation. To test whether substitutions in the designed Cdc42 variants led to offset in the temperature of aggregation, the temperature dependence of aggregation was determined by using size-exclusion chromatography to quantify the fraction of protein that remains as monomer in solution. Fig. 3E shows the per-

Table 1. Comparison of properties of the wild-type and designed AcPh proteins

                               AcPh-wt                    AcPh-des
MWtheor, kDa                   12.2                       12.2
MWAUC, kDa                     11.2 ± 0.9                 12.2 ± 0.1
Stability
  Tm (CD or DSC), °C           57.0 ± 0.5; 57.6 ± 0.1     66.2 ± 0.5; 66.3 ± 0.1
  ΔH (CD or DSC), kJ/mol       333 ± 12; 328 ± 9          365 ± 16; 343 ± 8
  ΔΔG (CD or DSC), kJ/mol      0                          9.3 ± 1; 8.3 ± 0.5
Activity
  25 °C
    Km, M                      (1.0 ± 0.2)·10⁻⁴           (2.3 ± 0.2)·10⁻⁴
    kcat, s⁻¹                  (1.0 ± 0.1)·10³            (0.9 ± 0.1)·10³
  40 °C
    Km, M                      (1.2 ± 0.1)·10⁻⁴           (2.2 ± 0.2)·10⁻⁴
    kcat, s⁻¹                  (1.6 ± 0.1)·10³            (1.6 ± 0.1)·10³
  55 °C
    Km, M                      (3.0 ± 0.2)·10⁻⁴           (4.0 ± 0.3)·10⁻⁴
    kcat, s⁻¹                  (2.4 ± 0.2)·10³            (2.4 ± 0.2)·10³

MWtheor, molecular mass calculated from the amino acid sequence. MWAUC, molecular mass in solution obtained from the analytical ultracentrifugation experiments as described [data for AcPh-wt from Strickler et al. (7); actual experimental data for AcPh-des, Cdc42-wt, Cdc42-des1, and Cdc42-des2 are presented in Fig. S2]. Tm (CD/DSC) and ΔH (CD/DSC) are the values for the transition temperature and enthalpy of unfolding, respectively, obtained from the analysis of CD or DSC by using a two-state model and ΔCp of 4.5 kJ/(mol·K) as shown in Fig. 3 A and B. ΔΔG (CD/DSC), values of Gibbs energy at 57 °C. Km and kcat were obtained from the fit of the initial velocity versus concentration of benzoylphosphate as a substrate by using Michaelis–Menten relationship (see Fig. S4).
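The ΔΔG entries in Table 1 can be checked against the tabulated Tm and ΔH values with the standard two-state Gibbs–Helmholtz relation, ΔG(T) = ΔH(1 − T/Tm) + ΔCp[T − Tm − T ln(T/Tm)], using ΔCp = 4.5 kJ/(mol·K). The Python sketch below is only a consistency check: it assumes that the reference temperature is the Tm of AcPh-wt for each method (57.0 °C for CD, 57.6 °C for DSC), where ΔG of the WT protein is zero by definition, and the function name is illustrative.

```python
import math

def delta_g_unfolding(t_c, tm_c, dh_tm, dcp=4.5):
    """Two-state Gibbs energy of unfolding (kJ/mol) at temperature t_c (°C).

    tm_c  : transition temperature (°C)
    dh_tm : enthalpy of unfolding at Tm (kJ/mol)
    dcp   : heat capacity change on unfolding (kJ/(mol·K))
    """
    t = t_c + 273.15
    tm = tm_c + 273.15
    return dh_tm * (1.0 - t / tm) + dcp * (t - tm - t * math.log(t / tm))

# CD-derived parameters from Table 1, evaluated at the CD Tm of AcPh-wt.
ddg_cd = delta_g_unfolding(57.0, 66.2, 365.0) - delta_g_unfolding(57.0, 57.0, 333.0)

# DSC-derived parameters from Table 1, evaluated at the DSC Tm of AcPh-wt.
ddg_dsc = delta_g_unfolding(57.6, 66.3, 343.0) - delta_g_unfolding(57.6, 57.6, 328.0)

print(f"ddG (CD)  = {ddg_cd:.1f} kJ/mol")
print(f"ddG (DSC) = {ddg_dsc:.1f} kJ/mol")
```

Under these assumptions the two values land close to the tabulated 9.3 and 8.3 kJ/mol, which suggests that the table entries follow from Tm, ΔH, and ΔCp alone.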

centage of the monomeric Cdc42 remaining in solution after exposure to different temperatures. The midpoint temperatures of aggregation for both designed Cdc42 variants are ≈9 ± 2 °C higher than that of the WT protein, which is also consistent with the data from CD-monitored unfolding and temperature dependence of the specific activity (Fig. 3 C and D). These results suggest that the offset in the aggregation temperature in designed Cdc42 variants is due to the higher stability of their native states that start to unfold at higher temperatures, and thus the onset of aggregation is at higher temperatures as well.

Implications for Protein Evolution. Is it fortuitous that the selected sequences are more stable? Is it that the reference (i.e., WT) proteins chosen were not very stable to begin with? Both AcPh-wt and Cdc42-wt have midpoint-transition temperatures >55 °C, which suggests that they are rather stable proteins. Are the introduced substitutions more frequent in the corresponding sequence positions of the homologous series of AcPh and Cdc42? If so, can they be identified by using a consensus-design approach that uses evolutionary conservation as a means to engineer more stable proteins? To answer these questions, we analyzed the normalized probabilities of the residues at the substitution sites. The normalized probability of a residue at a given position in sequence (Pi) indicates how many times over what would be expected by random chance does a given residue type occur at that position (see Materials and Methods for details). Interestingly, of 13 substitutions in the 2 proteins, 12 are substitutions from more frequent (Pi > 1) to less frequent (Pi <

Table 2. Comparison of properties of the wild-type and designed Cdc42 proteins

                               Cdc42-wt           Cdc42-des1         Cdc42-des2
MWtheor, kDa                   22.1               22.1               22.1
MWAUC, kDa                     21.7 ± 1.1         21.1 ± 0.7         21.3 ± 0.9
Stability
  T1/2,CD, °C
    60 μg/ml                   59 ± 1             68 ± 1             69 ± 1
    310 μg/ml                  56 ± 1             63 ± 1             66 ± 1
    310 μg/ml in 2 M urea      51 ± 1             59 ± 1             60 ± 1
  T1/2,activity, °C            60 ± 2             70 ± 2             70 ± 2
  T1/2,SEC, °C                 57 ± 2             66 ± 2             66 ± 2
Activity, 25 °C
  Km, M                        (8.2 ± 0.9)·10⁻³   (8.2 ± 0.9)·10⁻³   (8.2 ± 0.9)·10⁻³
  kcat, hr⁻¹                   2.8 ± 0.1          2.6 ± 0.1          3.1 ± 0.2
  Kd,app,GAP/GTP, μM           0.3 ± 0.1          0.4 ± 0.1          0.3 ± 0.1
  Kd,app,GAP/GDP, μM*          5 ± 2              12 ± 4             8 ± 3

MWtheor, molecular mass calculated from the amino acid sequence. MWAUC, molecular mass in solution obtained from the analytical ultracentrifugation experiments as described (experimental data are given in Fig. S2). T1/2,CD values were obtained from the CD experiments, one of which is shown in Fig. 3C; T1/2,activity values were obtained from the experiments shown in Fig. 3D; T1/2,SEC values were obtained from experiments shown in Fig. 3E; Km and kcat were obtained from the fit of the initial velocity versus concentration of GTP plot by using Michaelis–Menten relationship (see Fig. S5); Kd,app,GAP/GTP was obtained from the analysis of data given in Fig. 4; and Kd,app,GAP/GDP values were obtained from ITC experiments performed at 5 °C (shown in Fig. S6).
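The Kd,app,GAP/GTP values in Table 2 come from fitting the quadratic binding equation given in the legend of Fig. 4. A minimal Python sketch of that equation follows; the function and argument names are assumptions for illustration, and all concentrations must share one unit (μM here). It evaluates the model only and does not reproduce the fitting procedure.

```python
import math

def fractional_saturation(l_total, m_total, kd_app):
    """Fraction of Cdc42 (total m_total) bound to Cdc42GAP (total l_total).

    Exact solution of 1:1 binding with apparent dissociation constant kd_app,
    valid when ligand depletion cannot be neglected.
    """
    s = l_total + m_total + kd_app
    return (s - math.sqrt(s * s - 4.0 * l_total * m_total)) / (2.0 * m_total)

# Cdc42 held at 0.5 μM as in Fig. 4; Kd,app of 0.3 μM as reported in Table 2.
m_total, kd_app = 0.5, 0.3
for l_total in (0.0, 0.1, 0.3, 1.0, 3.0):
    y = fractional_saturation(l_total, m_total, kd_app)
    print(f"[Cdc42GAP] = {l_total:4.1f} μM  ->  y = {y:.3f}")
```

The quadratic form matters here because the Cdc42 concentration (0.5 μM) is comparable to Kd,app, so the free Cdc42GAP concentration cannot be approximated by the total.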

1) amino acid residues. The ratio Pi,wt/Pi,des ranges from 2 to 75 with an average value of 13 (Table S1). The only exception is the H60E substitution in AcPh, for which WT histidine is 1.6 times less frequent than glutamic acid. Overall, for all 12 substitution positions, the WT residues have normalized Pi > 1. Moreover, in many cases, WT residues have the maximal or second maximal Pi, and yet the substitutions in these residues lead to an increase in protein stability without changes in enzymatic activity. One possible explanation of these observations is that the interactions with other proteins and ligands in the cell dictate the preference for the residues at these positions, i.e., activity-for-stability tradeoff (37, 38). Although cellular partners interacting with AcPh are not known, proteins interacting with Cdc42 are well characterized, and disruption of these interactions frequently leads to lethal phenotype. For example, a set of alanine-scanning mutations has been generated in the yeast homolog of Cdc42, and the effect of these mutations on the growth of yeast has been examined (39). A number of mutations were found to be detrimental. However, for the mutations in positions E95, D121, K131, and E178 (i.e., sites of substitutions in our Cdc42-des1 and Cdc42-des2), the growth of haploid cells on YPD plates had a WT phenotype (39). Further support that the sites of the substitutions in Cdc42-des1 and Cdc42-des2 do not affect the function of this protein was obtained from the analysis of the Cdc42 interactions with Cdc42 GTPase-activating protein Cdc42GAP. Cdc42GAP binds Cdc42 and activates GTPase activity of Cdc42 (40). We analyzed the enzymatic activity (e.g., GTPase activity) enhancement of Cdc42 in the presence of Cdc42GAP protein. For all 3 proteins (Cdc42-wt, Cdc42-des1, and Cdc42-des2), it was observed that the addition of Cdc42GAP leads to an enhancement of GTPase activity in a concentration-dependent manner (Fig. S5). Moreover, the enhancement in activity among the WT and designed Cdc42 variants is practically indistinguishable.

To further characterize the interactions of the Cdc42 variants with Cdc42GAP, we performed 2 types of experiments. First, we measured the GTPase activity of the Cdc42 variants at a fixed concentration of GTP (100 μM) and different concentrations of Cdc42GAP. The results of these measurements are shown in Fig. 4 and suggest that all 3 Cdc42 proteins have similar enhancement of activity by Cdc42GAP. Analysis shows that the Kd,app is similar for Cdc42-wt, Cdc42-des1, and Cdc42-des2 and is on the order of 0.3 μM (Table 2). Interestingly, the same dissociation constant of Cdc42-wt and Cdc42GAP was measured by Cerione et al. (41), who used fluorescence polarization spectroscopy to monitor Cdc42 to Cdc42GAP binding in the presence of GTP. Second, we used isothermal titration calorimetry to measure Cdc42 binding to Cdc42GAP in the presence of GDP (Fig. S6). These experiments were performed in the presence of excess GDP, and thermodynamic analysis of the binding isotherms shows that under these conditions, the apparent dissociation constant for Cdc42 binding to Cdc42GAP is on the order of 10 ± 3 μM and is the same for all 3 Cdc42 variants. This estimate for Cdc42 binding to Cdc42GAP in the presence of GDP agrees well with the dissociation constant of 2.8–2.9 μM (24, 42) and is in reasonable agreement with the 50 μM value reported by Hoffman et al. (41). These experiments establish the feasibility of engineering a protein with enhanced stability without perturbing its enzymatic or signaling functions.

Fig. 4. Stimulation of the GTPase activity of the Cdc42 proteins by the increasing concentrations of Cdc42GAP. Experiments were performed at the initial GTP concentration of 100 μM. Shown are Cdc42-wt (black circles), Cdc42-des1 (blue squares), and Cdc42-des2 (red triangles). Concentration of all 3 Cdc42 proteins was kept constant at 0.5 μM. Data were fitted to a binding equation

$$y = \frac{(L_T + M_T + K_{d,app}) - \sqrt{(L_T + M_T + K_{d,app})^2 - 4 \cdot L_T \cdot M_T}}{2 \cdot M_T},$$

where y is the fractional saturation, L_T is the total Cdc42GAP concentration, M_T is the total Cdc42 concentration, and K_{d,app} is the apparent dissociation constant of Cdc42 to Cdc42GAP. Dashed lines are the results of the fit for each individual protein, and the solid line is the fit of all datapoints to a single K_{d,app}. See the results of the fit in Table 2.

Conclusion

The results presented here clearly support the idea that the stability of many proteins and enzymes is not fully optimized from the evolutional point-of-view and, thus, the stability of the enzymes can be increased via optimization of surface charge–charge interactions without perturbing the enzymatic activity. As such, this approach is a viable strategy that can be used on its own but also in combination with other design strategies that rationally optimize other types of interactions, predominantly in the protein core (2–5).

Materials and Methods

Protein Design. Optimization of energies of charge–charge interactions on the protein surface, as calculated by the TKSA model (7, 14, 43) by using GA, was identical to the approach used by us (7, 26) (see SI Appendix for details). X-ray structures of human Cdc42 and bovine acylphosphatase (PDB entries 1AN0 and 2ACY, respectively) were used as starting templates in the design (34, 40). To assess errors in the calculated energies of charge–charge interactions and to at least partially account for the dynamics of the side chains in solution, ensembles of 11 structures were generated from each starting template by using Modeler version 7.7 (44). Average surface accessibilities, distances between the side chains, and interaction potentials were calculated from these structural ensembles and used as input parameters during GA optimizations. New sites for introducing ionizable residues were picked from those that are >50% solvent exposed. Side chains with <50% solvent accessibility, side chains forming multiple hydrogen bonds, and side chains located within 10 Å of the active sites and/or ligand-binding sites were excluded from optimization. For AcPh, the sites included in the optimization were E2, D4, K31, K32, D43, Q44, Q50, H60, E63, E66, K68, K72, H74, D76, R77, S79, H81, N82, K84, K88, D90, D93, and Q95. For Cdc42, the optimization sites were Q2, N26, K27, N39, M45, E49, E62, D63, K66, Q74, E91, K94, E95, H103, K107, D121, E127, K128, K131, N132, Q134, K135, E140, K144, R147, D148, K150, K166, N167, E178, E181, K183, K184, and V189. During the TKSA-GA run, each and every site that was included in optimization was allowed to have positive, negative, or neutral charge, which is independent of the charges on the other residues. The energies of these different combinations of charges were evaluated, and those that have energies higher than a certain predetermined value were discarded.

Proteins and Enzymes. All proteins used in this work were cloned into pGia expression vectors, overexpressed in BL21(DE3) pLys E. coli strain, and purified to homogeneity by using column chromatography (7). Detailed description of the cloning, expression, purification, and characterization is provided in SI Appendix.

Statistical Analysis of Sequences. The sequences homologous to the human Cdc42 were identified by using BLASTP 2.2.17 and nonredundant protein sequences from all major databases (45); 995 sequences were identified this way. For AcPh, sequence alignment was taken from the database (46) and contained 682 sequences. The normalized probability of finding a residue type (i) at a substitution site (j) was calculated as:

Gribenko et al. PNAS Early Edition ͉ 5of6 Downloaded by guest on October 2, 2021 HNCO, HNCACB, and HN(CO)CACB (Fig. S7). Side-chain assignments were ͑ ͒ ϭ ͑ ͒ ͸ ͑ ͒ ͸ obtained through the analysis of 3D 13C-separated HCCH-TOCSY and COSY Pi j ͩni j / ni j ͪͲͩNi/ Niͪ spectra. Distance constraints were obtained from a 3D 15N-NOESY spectrum i i recorded at 1H frequency of 600 MHz and 2 3D 13C-NOESY spectra recorded with a 150-ms mixing time at a 1H frequency of 800 MHz with the 13C-carrier where ni(j) is the number of residues of type i at site j, ¥i ni(j) is the total number frequency in the aliphatic and the aromatic region, respectively. The 3 data- of sequences in the alignment, Ni is the number of residues of type i in sets were peak-picked and automatically assigned by using the CANDID macro the sequence alignment, and ¥i Ni is the total number of residues in the for CYANA (47). Dihedral-angle constraints were obtained by using the pro- alignment. gram Talos (48). The structural statistics are listed in Table S2.

NMR Methods. All NMR spectra were obtained at 27 °C on 600- and 800-MHz ACKNOWLEDGMENTS. We thank Dr. Cerione (Cornell University, Ithaca, NY) Bruker spectrometers equipped with cryoprobes. Protein concentrations of for the Cdc42 clone and Drs. Thomas Spratt and Susan Gilbert for advice on 0.4 to 1 mM were used for all experiments, in 20 mM phosphate buffer, 50 mM enzyme kinetic data acquisition and analysis. This work was supported by NaCl, at pH 5.7. Polypeptide backbone assignments were obtained with National Science Foundation Grant MCB-0110396 (to G.I.M.).
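As an illustration of the normalized probability P_i(j) defined under Statistical Analysis of Sequences, the calculation can be sketched in a few lines of Python. This is a minimal sketch, not the authors' code: the function name `normalized_probability`, the representation of the alignment as a list of equal-length strings, and the choice to skip gap characters are all assumptions made here for illustration.

```python
from collections import Counter


def normalized_probability(alignment, site, residue, gap_chars="-."):
    """Return P_i(j) = (n_i(j) / sum_i n_i(j)) / (N_i / sum_i N_i).

    alignment : list of equal-length aligned sequences (hypothetical input format)
    site      : 0-based column index j
    residue   : one-letter residue type i
    gap_chars : characters treated as gaps and ignored (an assumption)
    """
    # Residues observed at site j; n_i(j) is the count of `residue` among them.
    column = [seq[site] for seq in alignment if seq[site] not in gap_chars]
    # N_i for every residue type, counted over the whole alignment.
    overall = Counter(ch for seq in alignment for ch in seq if ch not in gap_chars)

    if not column or overall[residue] == 0:
        raise ValueError("P_i(j) is undefined: empty column or residue absent from alignment")

    site_frequency = column.count(residue) / len(column)
    background_frequency = overall[residue] / sum(overall.values())
    return site_frequency / background_frequency


# Toy alignment (made-up data): K fills 3 of 4 positions at site 0
# but only 3 of 8 residues overall, so it is enriched twofold there.
toy_alignment = ["KE", "KD", "KE", "RE"]
print(normalized_probability(toy_alignment, 0, "K"))  # 2.0
```

Values above 1 indicate that a residue type occurs at the site more often than its overall frequency in the alignment would predict; values below 1 indicate that it is under-represented there.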

1. Sanchez-Ruiz JM, Makhatadze GI (2001) To charge or not to charge? Trends Biotechnol 19:132–135.
2. Dantas G, Kuhlman B, Callender D, Wong M, Baker D (2003) A large scale test of computational protein design: Folding and stability of nine completely redesigned globular proteins. J Mol Biol 332:449–460.
3. Korkegian A, Black ME, Baker D, Stoddard BL (2005) Computational thermostabilization of an enzyme. Science 308:857–860.
4. Malakauskas SM, Mayo SL (1998) Design, structure and stability of a hyperthermophilic protein variant. Nat Struct Biol 5:470–475.
5. Pokala N, Handel TM (2001) Review: Protein design–where we were, where we are, where we're going. J Struct Biol 134:269–281.
6. Grimsley GR, et al. (1999) Increasing protein stability by altering long-range coulombic interactions. Protein Sci 8:1843–1849.
7. Strickler SS, et al. (2006) Protein stability and surface electrostatics: A charged relationship. Biochemistry 45:2761–2766.
8. Schweiker KL, Zarrine-Afsar A, Davidson AR, Makhatadze GI (2007) Computational design of the Fyn SH3 domain with increased stability through optimization of surface charge-charge interactions. Protein Sci 16:2694–2702.
9. Gribenko AV, Makhatadze GI (2007) Role of the charge-charge interactions in defining stability and halophilicity of the CspB proteins. J Mol Biol 366:842–856.
10. Lee CF, Makhatadze GI, Wong KB (2005) Effects of charge-to-alanine substitutions on the stability of ribosomal protein L30e from Thermococcus celer. Biochemistry 44:16817–16825.
11. Permyakov SE, et al. (2005) How to improve nature: Study of the electrostatic properties of the surface of alpha-lactalbumin. Protein Eng Des Sel 18:425–433.
12. Makhatadze GI, Loladze VV, Gribenko AV, Lopez MM (2004) Mechanism of thermostabilization in a designed cold shock protein with optimized surface electrostatic interactions. J Mol Biol 336:929–942.
13. Loladze VV, Makhatadze GI (2002) Removal of surface charge-charge interactions from ubiquitin leaves the protein folded and very stable. Protein Sci 11:174–177.
14. Loladze VV, Ibarra-Molero B, Sanchez-Ruiz JM, Makhatadze GI (1999) Engineering a thermostable protein via optimization of charge-charge interactions on the protein surface. Biochemistry 38:16419–16423.
15. Spector S, et al. (2000) Rational modification of protein stability by the mutation of charged surface residues. Biochemistry 39:872–879.
16. Antosiewicz J, McCammon JA (1995) Electrostatic and hydrodynamic orientational steering effects in enzyme-substrate association. Biophys J 69:57–65.
17. Bolon DN, Voigt CA, Mayo SL (2002) De novo design of biocatalysts. Curr Opin Chem Biol 6:125–129.
18. Honig B, Nicholls A (1995) Classical electrostatics in biology and chemistry. Science 268:1144–1149.
19. Vizcarra CL, Mayo SL (2005) Electrostatics in computational protein design. Curr Opin Chem Biol 9:622–626.
20. Elcock AH, Huber GA, McCammon JA (1997) Electrostatic channeling of substrates between enzyme active sites: Comparison of simulation and experiment. Biochemistry 36:16049–16058.
21. Warshel A, Papazyan A (1998) Electrostatic effects in macromolecules: Fundamental concepts and practical modeling. Curr Opin Struct Biol 8:211–217.
22. Roca M, Liu H, Messer B, Warshel A (2007) On the relationship between thermal stability and catalytic power of enzymes. Biochemistry 46:15076–15088.
23. Stefani M, Taddei N, Ramponi G (1997) Insights into acylphosphatase structure and catalytic mechanism. Cell Mol Life Sci 53:141–151.
24. Leonard DA, Lin R, Cerione RA, Manor D (1998) Biochemical studies of the mechanism of action of the Cdc42-GTPase-activating protein. J Biol Chem 273:16210–16215.
25. Cerione RA (2004) Cdc42: New roads to travel. Trends Cell Biol 14:127–132.
26. Ibarra-Molero B, Sanchez-Ruiz JM (2002) Genetic algorithm to design stabilizing surface-charge distributions in proteins. J Phys Chem B 106:6609–6613.
27. Tanford C, Kirkwood JG (1957) Theory of protein titration curves. 1. General equations for impenetrable spheres. J Am Chem Soc 79:5333–5339.
28. Matthew JB, Gurd FRN (1986) Calculation of electrostatic interactions in proteins. Methods Enzymol 130:413–436.
29. Schweiker KL, Makhatadze GI (2009) Protein stabilization by the rational design of surface charge–charge interactions. Methods Mol Biol 490:261–283.
30. Georgescu RE, Alexov EG, Gunner MR (2002) Combining conformational flexibility and continuum electrostatics for calculating pK(a)s in proteins. Biophys J 83:1731–1748.
31. Dantas G, et al. (2007) High-resolution structural and thermodynamic analysis of extreme stabilization of human procarboxypeptidase by computational protein design. J Mol Biol 366:1209–1221.
32. Byeon IJ, Louis JM, Gronenborn AM (2004) A captured folding intermediate involved in dimerization and domain-swapping of GB1. J Mol Biol 340:615–625.
33. Ermolenko DN, Dangi B, Gvritishvili A, Gronenborn AM, Makhatadze GI (2007) Elimination of the C-cap in ubiquitin - Structure, dynamics and thermodynamic consequences. Biophys Chem 126:25–35.
34. Thunnissen MM, Taddei N, Liguri G, Ramponi G, Nordlund P (1997) Crystal structure of common type acylphosphatase from bovine testis. Structure 5:69–79.
35. Ramponi G, Treves C, Guerritore AA (1966) Aromatic acyl phosphates as substrates of acyl phosphatase. Arch Biochem Biophys 115:129–135.
36. Camici G, Manao G, Cappugi G, Ramponi G (1976) A new synthesis of benzoyl phosphate: A substrate for acyl phosphatase assay. Experientia 32:535–536.
37. Ondrechen MJ, Clifton JG, Ringe D (2001) THEMATICS: A simple computational predictor of enzyme function from structure. Proc Natl Acad Sci USA 98:12473–12478.
38. Arnold FH, Wintrode PL, Miyazaki K, Gershenson A (2001) How enzymes adapt: Lessons from directed evolution. Trends Biochem Sci 26:100–106.
39. Kozminski KG, Chen AJ, Rodal AA, Drubin DG (2000) Functions and functional domains of the GTPase Cdc42p. Mol Biol Cell 11:339–354.
40. Nassar N, Hoffman GR, Manor D, Clardy JC, Cerione RA (1998) Structures of Cdc42 bound to the active and catalytically compromised forms of Cdc42GAP. Nat Struct Biol 5:1047–1052.
41. Hoffman GR, Nassar N, Oswald RE, Cerione RA (1998) Fluoride activation of the Rho family GTP-binding protein Cdc42Hs. J Biol Chem 273:4392–4399.
42. Zhang B, Chernoff J, Zheng Y (1998) Interaction of Rac1 with GTPase-activating proteins and putative effectors. A comparison with Cdc42 and RhoA. J Biol Chem 273:8776–8782.
43. Ibarra-Molero B, Loladze VV, Makhatadze GI, Sanchez-Ruiz JM (1999) Thermal versus guanidine-induced unfolding of ubiquitin. An analysis in terms of the contributions from charge-charge interactions to protein stability. Biochemistry 38:8138–8149.
44. Marti-Renom MA, et al. (2000) Comparative protein structure modeling of genes and genomes. Annu Rev Biophys Biomol Struct 29:291–325.
45. Altschul SF, et al. (1997) Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res 25:3389–3402.
46. Finn RD, et al. (2008) The Pfam protein families database. Nucleic Acids Res 36:D281–D288.
47. Mumenthaler C, Guntert P, Braun W, Wuthrich K (1997) Automated combined assignment of NOESY spectra and three-dimensional protein structure determination. J Biomol NMR 10:351–362.
48. Cornilescu G, Delaglio F, Bax A (1999) Protein backbone angle restraints from searching a database for chemical shift and sequence homology. J Biomol NMR 13:289–302.

Gribenko et al. www.pnas.org/cgi/doi/10.1073/pnas.0808220106