
Research Paper 473

Native-like secondary structure in a peptide from the α-domain of hen lysozyme

Jenny J Yang1*, Bert van den Berg*, Maureen Pitkeathly, Lorna J Smith, Kimberly A Bolin2, Timothy A Keiderling3, Christina Redfield, Christopher M Dobson and Sheena E Radford4

Background: To gain insight into the local and nonlocal interactions that contribute to the stability of hen lysozyme, we have synthesized two peptides that together comprise the entire α-domain of the protein. One peptide (peptide 1–40) corresponds to the sequence that forms two α-helices, a loop region, and a small β-sheet in the N-terminal region of the native protein. The other (peptide 84–129) makes up the C-terminal part of the α-domain and encompasses two α-helices and a 3₁₀ helix in the native protein.

Results: As judged by CD and a range of NMR parameters, peptide 1–40 has little secondary structure in aqueous solution and only a small number of local hydrophobic interactions, largely in the loop region. Peptide 84–129, by contrast, contains significant helical structure and is partially hydrophobically collapsed. More specifically, the region corresponding to helix C in native lysozyme is disordered, whereas regions corresponding to the D and 3₁₀ helices in the native protein are helical in this peptide. The structure in peptide 84–129 is at least partly stabilized by interactions between residues in the two helical regions, as suggested by further NMR analysis of three short peptides corresponding to the individual helices in this region of the native protein.

Conclusions: Stabilization of structure in the sequence 1–40 appears to be facilitated predominantly by long-range interactions between this region and the sequence 84–129. In native lysozyme, the existence of two disulphide bonds between the N- and C-terminal halves of the α-domain is likely to be a major factor in their stabilization. The data show, however, that native-like secondary structure can be generated in the C-terminal portion of the α-domain by nonspecific and nonnative interactions within a partially collapsed state.

Addresses: Oxford Centre for Molecular Sciences and New Chemistry Laboratory, University of Oxford, South Parks Road, Oxford OX1 3QT, UK.

Present addresses: 1Department of Molecular Biophysics and Biochemistry, Yale University, 266 Whitney Avenue, PO Box 208114, New Haven, CT 06520-8114, USA. 2Department of Chemistry and Biochemistry, University of California, Santa Cruz, CA 96064, USA. 3Department of Chemistry, University of Illinois, 845 West Taylor Street, Chicago, IL 60607-7061, USA. 4Department of Biochemistry and Molecular Biology, University of Leeds, Leeds LS2 9JT, UK.

*The first two authors contributed equally to this work.

Correspondence: Sheena E Radford
e-mail: [email protected]

Key words: hen egg white lysozyme, hydrophobic collapse, NMR, peptide models

Received: 30 Sep 1996
Revisions requested: 22 Oct 1996
Revisions received: 05 Nov 1996
Accepted: 07 Nov 1996
Published: 25 Nov 1996
Electronic identifier: 1359-0278-001-00473

Folding & Design 25 Nov 1996, 1:473–484

© Current Biology Ltd ISSN 1359-0278

Introduction
During the past decade, extensive research has led to a broader insight into the rules that govern the folding of a polypeptide chain into the complex native structure. Although the folding of some small proteins appears to be a two-state process [1–7], others show a series of events and well defined intermediates between the fully unfolded state and the fully native one [8]. One example of the latter class of proteins is hen lysozyme. After initiation of the refolding of this protein from a chemically denatured state, the polypeptide chain collapses to a heterogeneous population of states containing, on average, a native content of secondary structure [9,10], but lacking persistent tertiary interactions [11]. From this state, the native protein emerges in a complex series of events which involve the formation of stable structure in distinct domains of the protein [9]. One of the key issues that such observations raise is the manner in which native-like secondary structure can form in the absence of many of the specific sidechain interactions that characterize the three-dimensional structure of a protein.

One approach to the investigation of these issues involves the study of peptide fragments of proteins, particularly those corresponding to specific structural features of the intact protein [12–18]. Initial studies using such an approach have been carried out for lysozyme where the entire sequence of the protein has been synthesized as four peptides ranging from 20 to 46 residues in length [19]. Two of the peptides (1–40 and 84–129) together encompass the entire α-domain of native lysozyme. In native lysozyme, the sequence 1–40 spans two α-helices, involving residues 4–15 (helix A) and 24–36 (helix B). The two helices are connected by a loop (residues 16–23). In addition, a small antiparallel β-sheet is present between

474 Folding & Design Vol 1 No 6

residues 2–3 and 39–40. The sequence 84–129 in native lysozyme also comprises two α-helices (helix C, residues 88–99, and helix D, residues 108–115) together with a C-terminal 3₁₀ helix (residues 120–125). The locations of the sequences 1–40 and 84–129 and the positions of the two disulphide bonds (Cys6–Cys127 and Cys30–Cys115) in the α-domain of native lysozyme are shown in Figure 1a.

Figure 1

The α-domain of hen lysozyme. (a) Backbone trace of the α-domain of the native structure of hen lysozyme [57,58] highlighting the locations of α-helices A (residues 4–15; red) and B (residues 24–36; purple) in peptide 1–40, and α-helices C (residues 88–99; yellow), D (residues 108–115; green) and the C-terminal 3₁₀ helix (residues 120–125; blue) in peptide 84–129. The remaining residues comprising the α-domain are shown in white. The positions of the disulphide bonds Cys6–Cys127 and Cys30–Cys115 are indicated as spheres. (b) Examples of some of the hydrophobic sidechains that make contact between the regions 1–40 and 84–129 in the α-domain of hen lysozyme. The sidechains shown are coloured according to their location in different helices, coloured as in (a). The sidechains of residues F3, L8, A9, M12, L17, Y20, Y23, L25, W28, V29, F34, I88, V92, V99, M105, W108, W111, W123, A124 and L129 are shown. These residues make hydrophobic contacts with four or more other residues within the α-domain, identified with the program NAOMI [59]. The picture was made using the program Molscript [60].

Far-UV CD studies of the conformational properties of these peptides revealed that whereas the N-terminal half of the α-domain (peptide 1–40) is predominantly unstructured, the C-terminal half of this domain (peptide 84–129) showed significant structure in aqueous solution, although no detailed information on the nature of this structure was obtained. In this paper, we describe detailed 1H NMR studies of the conformational preferences of both of these peptides. In addition, by an approach involving dissection of peptide 84–129 into three short peptides (86–102, 105–115 and 116–129), we investigate the role of long-range interactions in stabilizing the conformation of peptide 84–129.

Results
Peptide 1–40
The fingerprint region of the 2D TOCSY spectrum of peptide 1–40 in 0.5 M urea is shown in Figure 2a. Under these conditions, the peptide is monomeric and spectra of high quality could be obtained (see Materials and methods). Although significant degeneracy exists, the chemical shift dispersion throughout the spectrum is sufficient to permit its complete assignment, using homonuclear 2D DQF-COSY, TOCSY, ROESY and NOESY experiments [20]. Although the majority of the CαH resonances in peptide 1–40 (Fig. 3a) are upfield shifted relative to the random coil values (average secondary shift –0.09 ppm), the most notable upfield shifts (> 0.1 ppm) occur for residues 8–12, 19–22 and 31–33, which in the native protein are located in the central regions of helices A and B and in the loop region connecting them. In addition, the CαH resonances of Gly26 and Val29, which lie towards the N terminus of the B helix in the native protein, are also significantly upfield shifted.

Very few medium-range NOEs indicative of regular secondary structure are observed for peptide 1–40 (Fig. 4a). No αN(i,i+3), N(i,i+3) or αN(i,i+4) NOEs are observed, despite the fact that 36 out of the total of 55 medium-range NOEs expected, if the native helices (residues 4–15 and 24–36) were adopted, would be resolved in the NOESY spectrum (Table 1). Four short-range αN(i,i+2) NOEs are detected, however, between residues in the region 16–23, which corresponds to a loop in the native protein. These NOEs are predicted to occur in peptides with random structures, in which population of α and β space occurs in adjacent residues [21]. The fact that these NOEs are not observed elsewhere in the spectrum, however, suggests that they could arise from preferential occupation of certain regions of φ,ψ space in this sequence, perhaps caused by sidechain–sidechain interactions (see below). Near the central region of the native A helix, three weak αβ(i,i+3) NOEs can be identified, two of which involve alanine methyl groups. Only one αβ(i,i+3) NOE (also involving an alanine methyl group) was detected in the sequence corresponding to the native B helix, even though the 10 NOEs expected for the native helix would all have been resolved in the spectrum of the peptide (Table 1). In view of the lack of other medium-range NOEs, it is likely that these NOEs merely reflect the high population of alanine residues in

Figure 3

[Bar plots of ΔδCαH (ppm; vertical axis from 0.2 to –0.6) against residue number. Panel (a): residue axis ticks 4–36, with helix A, the loop and helix B marked. Panel (b): residue axis ticks 88–128, with helix C, helix D and the 3₁₀ helix marked.]

Secondary CαH chemical shifts of (a) peptide 1–40 in 0.5 M urea and (b) peptide 84–129 (filled bars), and the short peptides 86–102, 105–115 and 116–129 (open bars). A positive difference (ΔδCαH = δmeasured – δcoil) indicates a downfield chemical shift relative to the random coil value [25]. The locations of the secondary structure elements in the native protein are indicated.
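The secondary shift used in Figure 3 is a simple difference, measured minus random coil, so a negative value is an upfield shift. A minimal Python sketch of that bookkeeping follows. The class and function names are illustrative assumptions, and the shift values in `EXAMPLE` are hypothetical numbers chosen only to exercise the code; they are not data from this paper. The 0.1 ppm threshold mirrors the cut-off quoted in the text for notable upfield shifts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShiftEntry:
    residue: str          # residue label (hypothetical examples below)
    measured_ppm: float   # observed CαH chemical shift
    coil_ppm: float       # random coil reference shift for that residue type


def secondary_shift(entry):
    """Measured minus coil: positive is downfield, negative is upfield."""
    return entry.measured_ppm - entry.coil_ppm


def mean_secondary_shift(entries):
    """Average secondary shift over a list of residues."""
    return sum(secondary_shift(e) for e in entries) / len(entries)


def upfield_residues(entries, threshold_ppm=0.1):
    """Residues whose resonance lies more than threshold_ppm upfield of coil."""
    return [e.residue for e in entries if secondary_shift(e) < -threshold_ppm]


# Hypothetical shifts for illustration only (not taken from the paper).
EXAMPLE = [
    ShiftEntry("Ala", 4.20, 4.35),
    ShiftEntry("Leu", 4.30, 4.38),
    ShiftEntry("Gly", 3.97, 3.97),
    ShiftEntry("Val", 4.00, 4.18),
]

if __name__ == "__main__":
    print(upfield_residues(EXAMPLE))
    print(round(mean_secondary_shift(EXAMPLE), 3))
```

Keeping the reference shift alongside each measured value makes the sign convention explicit and allows the same functions to be applied to the long peptide and to each short peptide before comparing them.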

α space, which is also anticipated for alanine residues in the random coil state [21,22]. Thus, the NMR data indicate that peptide 1–40 is largely unstructured. In accord with this, the majority of the 3JHNα coupling constants for the peptide are close in magnitude (within 0.5 Hz) to those predicted for a random coil of this sequence [23] (Table 2).

Figure 2

1H NMR spectra of peptides 1–40 and 84–129. Fingerprint regions of 2D TOCSY spectra in H2O of (a) 1.0 mM peptide 1–40 in 0.5 M urea (mixing time 50 ms; pH 2.0, 285 K) and (b) of 0.5 mM peptide 84–129 (mixing time 65 ms; pH 2.0, 290 K). The cross peaks labelled with an asterisk originate from the acetamidomethyl protecting groups. Residue-specific assignments are shown. The inclusion of 0.5 M urea improved the quality of the spectra of peptide 1–40, but had no influence on its structure as judged by far-UV CD. Under these conditions, the two peptides are monomeric and there is no concentration dependence of their NMR spectra over periods of at least several days.

NOEs are observed between several sidechains in the spectrum of peptide 1–40; these involve the aromatic ring hydrogens of Tyr20 and the Gly22 NH hydrogen, the and hydrogens of Tyr23 and the CαH hydrogen of Gly22, and the and aromatic ring hydrogens of Tyr23 and the Leu25 C H hydrogen. The interaction of a tyrosine aromatic ring with the NH of a glycine in the peptide sequence Tyr-X-Gly has been observed in other peptides and it has been suggested that such interactions could be important in stabilizing turn conformations in a polypeptide chain [24,25]. Indeed, Tyr20 and Gly22 lie in the loop region of sequence 1–40 in native lysozyme, but the contacts observed in the peptide are not those found in the native protein. NOEs are also observed between the aromatic ring hydrogens of Tyr20, Tyr23 and Trp28 and the methyl groups of valine and leucine residues (Fig. 5a). Although spectral overlap prevents full assignment, all but one of these cross peaks could be attributed to short-range or medium-range NOEs. The near-UV CD spectrum of peptide 1–40 is consistent with the view that at least a proportion of the aromatic sidechains have nonrandom orientations in this peptide (Fig. 5a). Thus, a small but significant intensity in the near-UV CD spectrum of the peptide around 270–280 nm is observed. The addition of 6 M GuHCl substantially reduces the intensity in the near-UV CD spectrum, consistent with NOESY data showing that in 8 M urea the NOEs arising from specific sidechain–sidechain interactions are lost or are much weaker in intensity (Fig. 5b).

Figure 4

NOE maps of (a) peptide 1–40 in 0.5 M urea, (b) peptide 84–129, and (c) the short peptides 86–102, 105–115 and 116–129. The relative intensities of the NOEs are not shown. The locations of the various secondary structural elements in the native protein are indicated.

Peptide 84–129
The NMR data indicate that the conformation of peptide 84–129 is very different from that of peptide 1–40. Firstly, although the average values of the CαH secondary shifts for both peptides are similar (average secondary shift value –0.11 ppm for peptide 84–129 and –0.09 ppm for peptide 1–40; Figs 2b,3b), peptide 84–129 displays many large secondary shifts for residues in the regions corresponding in native lysozyme to helix D and the C-terminal 3₁₀ helix. The NMR and far-UV CD spectra were both independent of concentration, and gel filtration demonstrated that the peptide is monomeric under these conditions (see Materials and methods). Analysis of the NOESY spectrum of peptide 84–129 provides direct evidence for significant helical structure in these regions. In the region 108–115, which corresponds to the D helix in native lysozyme, eight medium-range αβ(i,i+3), αN(i,i+3) and αN(i,i+4) NOEs are present (Fig. 4b). Indeed, four out of the five αβ(i,i+3) NOEs expected for a native-like helix in this region, together with one αN(i,i+4) NOE, are unambiguously observed in the spectrum of the peptide (Table 1). Two additional medium-range NOEs [Asn106–Val109 αβ(i,i+3) and Ala107–Ala110 αN(i,i+3)] are also observed, suggesting that the helical structure in the peptide is elongated by one turn relative to helix D in native lysozyme. In addition to the above NOEs there are seven αN(i,i+2) NOEs present in the region 106–115. The latter are expected to be weak relative to the αN(i,i+3) and αN(i,i+4) NOEs in regular α-helices. The presence of αN(i,i+2) NOEs between residues in this region of peptide 84–129 would, therefore, be consistent with α-helical structure. They would also be consistent with the notion that the helical structure is in equilibrium with a substantial population of random coil conformers [21] or turn-like structures [20].

In the region 120–125, corresponding to the 3₁₀ helix in native lysozyme, three medium-range αN(i,i+3) and αβ(i,i+3) NOEs are observed. These NOEs, together with the nonnative αN(i,i+3) NOE (Trp123–Gly126), suggest the formation of significant helical structure in this region of the peptide. No αN(i,i+4) NOEs are observed, even though the relevant cross peaks would be clearly resolved

in the spectrum if they were sufficiently intense. Although the relative population of 3₁₀ and α-helical structures in this region cannot be defined from these data, the lack of αN(i,i+4) NOEs would be consistent with a predominance of the former.

None of the eight αN(i,i+4) NOEs expected if peptide 84–129 adopted a native-like helix in the region corresponding to the C helix in native lysozyme are observed, even though six of these would have been clearly resolved in the spectrum (Table 1). In addition, only one αβ(i,i+3) NOE and two αN(i,i+2) NOEs are observed, indicating that this sequence, although it forms the longest and most regular α-helix in native lysozyme, is not extensively structured in peptide 84–129.

Table 1

Expected, observed and overlapping NOEs in the spectra of peptides 1–40 and 84–129.

Structural region        NOE type     Total no.    No.        No.
in native lysozyme                    expected*    present†   absent†

Helix A (4–15)           αN(i,i+2)    10           1          6
                         αN(i,i+3)    9            0          5
                         αN(i,i+4)    8            0          5
                         αβ(i,i+3)    9            3          2
Helix B (24–36)          αN(i,i+2)    11           1          4
                         αN(i,i+3)    10           0          6
                         αN(i,i+4)    9            0          6
                         αβ(i,i+3)    10           1          9
Helix C (88–99)          αN(i,i+2)    10           2          3
                         αN(i,i+3)    9            0          5
                         αN(i,i+4)    8            0          6
                         αβ(i,i+3)    9            1          4
Helix D (108–115)        αN(i,i+2)    6            5          0
                         αN(i,i+3)    5            3          0
                         αN(i,i+4)    4            1          0
                         αβ(i,i+3)    5            4          1
3₁₀ helix (120–125)      αN(i,i+2)    4            2          1
                         αN(i,i+3)    3            1          1
                         αN(i,i+4)    2            0          2
                         αβ(i,i+3)    3            2          0

*The number of NOEs of each type predicted if the helices in the peptides correspond to those in the native protein. †The number of NOEs identified as present or absent in the spectra. The remainder correspond to those where spectral overlap prevents their unambiguous assignment.

The 3JHNα coupling constants of nonglycine residues in the peptide 84–129 support the view that the peptide has significant helical structure in the regions corresponding to the native D and 3₁₀ helices (Table 2). Thus, with one exception (Ala90), the 3JHNα coupling constants for residues corresponding to the C helix in native lysozyme (88–99) are all greater than 6 Hz, the average (6.9 Hz) being close to that (6.8 Hz) predicted for amino acids in unstructured peptides [23]. By contrast, with the exception of Cys115, all of the 3JHNα coupling constants for the residues in the region corresponding to the D helix in native lysozyme (108–115) are smaller than 6.5 Hz, three consecutive residues (110–112) having values more than 1 Hz smaller than those predicted for a random structure. The helical content of the peptide in the region 120–125 is more difficult to judge by this analysis, since 3₁₀ helices in the Protein Data Bank have a broad range of torsion angles which gives rise to a wide distribution of 3JHNα coupling constants (the average coupling constant for which is 5.6 Hz [23]). The 3JHNα values of residues in this region (6.6 Hz on average), however, are much larger than those expected for a fixed, regular helical structure, suggesting that the conformations adopted in this region are irregular or highly dynamic.

Further dissection of peptide 84–129
Insight into the role of interactions between residues distant in the sequence in stabilizing the helical structure of peptide 84–129 was obtained by synthesizing three shorter peptides (86–102, 105–115 and 116–129) corresponding to the three helical regions in the sequence 84–129 (Fig. 1a) in native lysozyme (the degeneracy of sidechain resonances precluded direct calculation of the solution structure of 84–129). These short peptides were highly soluble and monomeric under the experimental conditions chosen (see Materials and methods).

The CαH secondary shifts of resonances in the spectra of the three short peptides are compared with those in peptide 84–129 in Figure 3b. Overall, very few resonances show large differences in secondary shifts between the long and the short peptides. In the region 108–115, however, three residues (Val109, Arg112 and Cys/Ser115) have differences in secondary shifts larger than 0.18 ppm. Although for residue 115 the difference can be explained by its location at the C terminus of the short peptide, the differences in secondary shifts of Val109 and Arg112 are consistent with there being a higher population of helical structure in this region of peptide 84–129 than in the shorter peptide.

The short-range and medium-range NOEs observed in spectra of the three short peptides are shown in Figure 4c. The lack of αN(i,i+3), αβ(i,i+3) and αN(i,i+4) NOEs in the spectrum of peptide 86–102 (Table 1) is in accord with our previous analysis of this peptide [26] and suggests that the C helix region is highly unstructured in both peptides 84–129 and 86–102. Very few medium-range NOEs are also observed in the spectrum of peptide 105–115, despite the fact that the majority of the NOE cross peaks expected for a native-like helix (108–115) would have been clearly resolved in the spectrum (Table

Table 2

Coupling constants for the various peptides measured from the antiphase splittings in high-resolution DQF-COSY spectra.

Columns, left block: Residue; 3JHNα (Hz) in peptide 1–40; 3JHNα (Hz) random coil.
Columns, right block: Residue; 3JHNα (Hz) in peptide 84–129; 3JHNα (Hz) in the short peptides; 3JHNα (Hz) random coil.

Lys1 6.6            Leu84 6.5 6.6
Val2 8.0 7.3        Ser85 6.6
Phe3 7.7 7.1        Ser86 6.7 6.6
Gly4                Asp87 7.3 6.6
Arg5 6.0 6.4        Ile88 7.3 6.7 7.2
Cys6 6.8 6.8        Thr89 7.2 7.3 7.5
Glu7 7.1 6.2        Ala90 5.6 5.7 5.8
Leu8 6.5 6.6        Ser91 6.4 6.2 6.6
Ala9 5.5 5.8        Val92 6.8 7.5 7.3
Ala10 4.6 5.8       Asn93 6.8 6.9 7.0
Ala11 4.4 5.8       Ala94 6.2 5.8
Met12 6.0 6.8       Ala95 5.9 5.8
Lys13 6.6 6.6       Lys96 6.5 6.7 6.6
Arg14 7.3 6.4       Lys97 7.2 6.6
His15 6.9 6.9       Ile98 7.6 6.8 7.2
Gly16               Val99 7.9 7.1 7.3
Leu17 6.5 6.6       Ser100 6.8 6.6
Asp18 6.9 6.6       Asp101 7.6 6.6
Asn19 7.9 7.0       Gly102
Tyr20 7.0 7.2       Asn103 7.4 7.0
Arg21 7.4 6.4       Gly104
Gly22               Met105 6.4 7.3
Tyr23 6.4 7.2       Asn106 5.7 7.0 7.0
Ser24 7.0 6.6       Ala107 5.5 5.8
Leu25 6.6           Trp108 6.3 6.5 6.5
Gly26               Val109 6.4 7.7 7.3
Asn27 7.6 7.0       Ala110 4.8 5.2 5.8
Trp28 5.9 6.5       Trp111 5.6 6.2 6.5
Val29 7.8 7.3       Arg112 5.4 6.7 6.4
Cys30 6.8           Asn113 6.4 7.0 7.0
Ala31 5.2 5.8       Arg114 6.4 7.1 6.4
Ala32 5.4 5.8       Cys115 6.7 (7.5) 6.8 (6.6)
Lys33 6.7 6.6       Lys116 6.4 6.6
Phe34               Gly117
Glu35 7.6 6.2       Thr118 7.6 7.6 7.5
Ser36 6.4 6.6       Asp119 7.6 7.6 6.6
Asn37 7.6 7.0       Val120 6.8 6.8 7.3
Phe38 7.1           Gln121 6.5 6.6 6.3
Asn39 7.5 7.0       Ala122 6.0 5.8
Thr40 8.0 7.5       Trp123 6.7 7.1 6.5
                    Ile124 7.3 7.6 7.2
                    Arg125 5.9 6.3 6.4
                    Gly126
                    Cys127 7.1 (6.8) 6.8 (6.6)
                    Arg128 7.3 7.2 6.4
                    Leu129 7.6 7.4 6.6

Coupling constants were calculated from the splittings in DQF-COSY spectra taking into account the observed line-widths of the resonances of interest [56]. For comparison, the 3JHNα coupling constants for each amino acid type in a random coil state are shown. These were predicted from the distribution of mainchain torsion angles for residues not involved in regular secondary structure elements in a database of 85 high-resolution protein structures, using the ALL model [23]. The errors in the measured coupling constant values are approximately ± 0.5 Hz. Values in parentheses indicate the value for a Ser which is substituted for Cys at that residue. Values for Gly were not measured. The cross peaks for Phe34 and Phe38 are not resolved in the fingerprint region of the DQF-COSY spectrum.
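The comparison of coupling constants between the long peptide and the short peptide 105–115 can be checked directly from the Table 2 rows that list all three values (84–129, short peptide, random coil). The sketch below uses the entries for Asn106, Val109 and Arg112 as read from Table 2; the dictionary and function names are illustrative assumptions, not part of the original analysis.

```python
# 3JHNα values in Hz read from Table 2: (peptide 84–129, short peptide 105–115).
COUPLINGS_HZ = {
    "Asn106": (5.7, 7.0),
    "Val109": (6.4, 7.7),
    "Arg112": (5.4, 6.7),
}


def coupling_increase(long_hz, short_hz):
    """Increase in 3JHNα on going from the long peptide to the short peptide."""
    return short_hz - long_hz


def mean_increase(table):
    """Average increase in 3JHNα over all residues in the table."""
    diffs = [coupling_increase(long_hz, short_hz)
             for long_hz, short_hz in table.values()]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    for residue, (long_hz, short_hz) in COUPLINGS_HZ.items():
        print(residue, round(coupling_increase(long_hz, short_hz), 1))
    print("mean increase (Hz):", round(mean_increase(COUPLINGS_HZ), 1))
```

With these three entries the mean increase rounds to 1.3 Hz, the figure quoted in the text for these residues. Each individual value carries the measurement error of approximately ± 0.5 Hz given in the table footnote.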

1). This indicates that peptide 105–115 is also predominantly unstructured, even though this region is highly helical in the peptide 84–129. The 3JHNα coupling constants of residues 105–115 support this view (Table 2). Thus, three residues (Asn106, Val109 and Arg112) have coupling constants larger by an average of 1.3 Hz in the small peptide relative to the value for these residues in the peptide 84–129. The data suggest, therefore, that either elongation of the peptide chain or interactions with residues flanking this sequence are important for stabilizing the helical structure in this region of peptide 84–129. To distinguish between these two possibilities, a longer peptide (103–118) was synthesized (both the N and C termini were blocked) and this peptide was also studied in detail by far-UV CD and 1H NMR (R Bryant et al., unpublished data). This peptide was also highly unstructured as judged by these methods, suggesting that interactions with residues distant from those in the D helix region are important for stabilizing this helix in the long peptide.

One of the most remarkable results from this study is that the 14-residue peptide 116–129 appears to be significantly structured. Thus, two of the three αβ(i,i+3) NOEs expected for a native-like helix and one αN(i,i+3) NOE are observed in the spectrum of the peptide (Fig. 4c). In addition, the pattern of NOEs observed in the spectrum of peptide 116–129 resembles that of peptide 84–129, and the 3JHNα values of these residues in the two peptides are very similar (Table 2), suggesting that the conformation of residues 116–129 in the two peptides is closely related, despite their very different lengths.

Figure 5

Aromatic–aliphatic regions of NOESY spectra in H2O of (a) 1.0 mM peptide 1–40 in 0.5 M urea and (b) 1.0 mM peptide 1–40 in 8.0 M urea. All spectra were recorded at pH 2.0 and 285 K and are plotted at similar contour levels. The near-UV CD spectra of peptide 1–40 in 0.5 M urea and in 6.0 M GuHCl are shown in the insets of (a) and (b), respectively. A number of possible assignments are indicated in the NMR spectra. Several NOEs between phenylalanine sidechain(s) and methyl groups of leucines and/or valines are also present. Since in these cases not only the methyl groups but also the phenylalanine ring hydrogen resonances are degenerate, residue-specific assignments could not be made.

The results of the NOE analysis are substantiated by a study of the peptides by far-UV CD (Fig. 6a). Thus, although the far-UV CD spectrum of peptide 84–129 is consistent with a substantial population of unstructured molecules, a simple analysis using the value of the ellipticity at 222 nm, and assuming that aromatic amino acids do not contribute to the far-UV CD signal [27], suggests that the peptide contains on average about 20% helical structure [19]. This would correspond remarkably well to the fraction expected if, as suggested by the NMR analysis, the regions corresponding to the D and 3₁₀ helices in native lysozyme are highly structured in the peptide. Consistent with this view, the ellipticity at 222 nm is dramatically reduced by the addition of denaturants [19], and the chemical shift dispersion of resonances is substantially reduced by the addition of 8 M urea. The far-UV CD spectra of the three short peptides are also consistent with the results of the NMR analysis. Thus, as judged by far-UV CD, peptides 86–102 and 105–115 contain little, if any, nonrandom structure. By contrast, although the spectrum of peptide 116–129 is indicative of substantial random structure, the negative ellipticity at 222 nm would be consistent with helical or turn-like structures [28], as suggested by the NMR analysis.

To compare the far-UV CD spectra of the three short peptides with that of peptide 84–129, their spectra were summed according to their fractional contribution (by residue) to the sequence 84–129. The resulting spectrum was then subtracted from that of the long peptide. The difference spectrum obtained (Fig. 6b) is typical of α-helical structure, indicating a loss of this structure upon fragmenting the long peptide. Based on the magnitude of the ellipticity at 222 nm (assuming no contribution from aromatic or cysteine residues [27]), a decrease in helicity of about 10% occurs upon fragmenting peptide 84–129, consistent with unfolding of the D helix region (108–115).

The lack of chemical shift dispersion of sidechains in the spectrum of peptide 84–129 suggests that the interactions

stabilizing the D helix in this peptide must be relatively nonspecific and nonpersistent. Consistent with this, a large blue shift (20 nm) in the λmax of ANS fluorescence, and a 20-fold increase in its intensity, occurs upon the addition of ANS to peptide 84–129 (Fig. 7). A similar, albeit smaller (7–8-fold), enhancement in ANS fluorescence occurs when peptide 1–40 is titrated with ANS, although a short unstructured peptide (residues 13–33) does not give a fluorescence enhancement upon the addition of ANS. The results suggest, therefore, that peptides 1–40 and 84–129 both possess a hydrophobic surface which binds ANS. In the case of peptide 84–129, this surface presumably coexists with the helical structure. In accord with this, in the presence of denaturants such as 6 M GuHCl or 8 M urea, the enhancement of ANS fluorescence is lost. Furthermore, no enhancement of fluorescence is observed upon the addition of ANS to the three short peptides either individually or in an equimolar mixture (Fig. 7), suggesting that formation of the hydrophobic surface in peptide 84–129 arises from the juxtaposition of nonlocal residues.

Figure 6

[Plots of [θ] × 10⁻³ (deg cm² dmol⁻¹ res⁻¹; vertical axis from –20 to 15) against wavelength (190–250 nm), panels (a) and (b).]

Far-UV spectra of the various peptides. (a) Far-UV CD spectra of peptides 84–129, 86–102, 105–115 and 116–129. (b) Difference between the spectrum of 84–129 and that calculated by summing spectra of three peptides 86–102, 105–115 and 116–129 according to the fractional contribution of each (by residue) in the sequence 84–129. Spectra were acquired at 290 K, pH 2.0.

Figure 7

[Plot of fluorescence intensity (0–120) against wavelength (400–650 nm), with curves labelled a, b and c.]

Fluorescence emission spectra of the various peptides upon the addition of ANS. (a) ANS alone (open circles), (b) ANS in the presence of an equimolar mixture of the peptides 86–102, 105–115 and 116–129 (solid line) and (c) ANS with peptide 84–129 (dashed line). Each spectrum was obtained at 290 K, pH 2.0. No changes in fluorescence were observed over periods of several hours.

Discussion
Importance of local and nonlocal interactions in stabilizing secondary structure
The CD and NMR data of peptides 1–40 and 84–129 show largely random coil behaviour for most of their sequences. Indeed, the regions corresponding to helices A, B and C in native lysozyme appear to have no significant propensity to form secondary structure in any of the peptides studied here. The lack of such structure in the three helices is interesting in view of the fact that prediction programs point to a significant helical propensity for residues in the sequence corresponding to helix A and, to a lesser extent, helices B and C [19,29–31]. Given the lack of helical structure in the above regions, which form relatively long helices in native lysozyme (Fig. 1a), it is remarkable that native-like secondary structure is found in two short regions of peptide 84–129. One region, corresponding to helix D in native lysozyme, does not have high helical propensity [19,31], and in accord with this, isolated peptides encompassing this sequence (105–115 and 103–118) do not show a significant helical content. The data suggest, therefore, that the flanking sequences are very important for the stabilization of helical structure in the sequence 105–115. The lack of structure in the short peptide 86–102, as well as in this region of peptide 84–129, suggests that the C helix region also requires residues in the preceding sequence, or residues from more distant regions

of the polypeptide chain, for stability. In support of the former notion, a peptide encompassing the C helix together with 52 residues preceding it has been reported to be significantly structured in aqueous solution [32].

A further important feature of the structure seen in peptide 84–129 is that the majority of stabilizing interactions that occur in the isolated peptide must be nonnative (Fig. 1b), simply because the majority of interactions that stabilize the α-domain in native lysozyme are formed between residues in the N- and C-terminal regions of the amino acid sequence (Fig. 1b), and not between residues within each region. For example, the aromatic rings of Tyr20 and Tyr23 (which are in the loop linking helices A and B) lie close to residues in the C helix (Lys96 and Val99), the C–D loop (Ser100, Gly104 and Met105), and the D helix (Trp111) (Fig. 1b). Similarly, helix B has a very hydrophobic sequence and forms the core of the α-domain, around which the other, more amphipathic, helices are arranged (Fig. 1a). In addition, two disulphide bonds in native lysozyme (Cys6–Cys127 and Cys30–Cys115) covalently link the two peptide sequences studied here; these presumably also play a major role in stabilizing the structure in the α-domain. These long-range interactions, therefore, are clearly the dominant factor in stabilization even of ‘local’ structures in these regions of the α-domain.

The structure of peptide 84–129 involving significant native-like secondary structure stabilized by nonspecific hydrophobic interactions is reminiscent of that observed in partially folded states of intact proteins [33]. Indeed, peptide 84–129 was found to be weakly protected from hydrogen exchange measured by acquiring one-dimensional 1H NMR spectra after dissolution of the peptide in D2O solution. Peptide 1–40, by contrast, exchanged its amide hydrogens at a rate commensurate with that predicted for a fully unfolded peptide of this sequence [34]. Studies of the peptide have allowed the nature of the stabilizing interactions to be pinpointed to relatively local regions of the polypeptide chain. The results demonstrate, therefore, that native-like secondary structure can be stabilized by nonspecific interactions between hydrophobic residues, and that even nonnative interactions can generate native-like secondary structure. In the case of peptide 84–129, clustering of hydrophobic residues to form a core capable of binding ANS appears to be a critical factor in stabilizing the observed helical structure. The hydrophobic cluster could involve Trp108 and Trp111 which, in a helical structure, would lie in close proximity, together with hydrophobic residues in the flanking regions (of which there are seven in the preceding sequence and five in the succeeding sequence). By [...] insufficient to generate helical structure. This emphasizes the importance of the formation of the hydrophobic cluster involving nonlocal residues in stabilizing helical structure in the lysozyme peptide.

Relationship to protein folding
Early in the folding of intact lysozyme, substantial secondary structure is formed before specific tertiary interactions have developed [11]. The results with peptide 84–129 suggest that nonspecific interactions involving hydrophobic residues could stabilize significant populations of native-like structure, in this case helices, in the early stages of folding. These interactions could be nonnative and local, as seen in peptide 84–129. In an intact protein, however, more extensive hydrophobic collapse provides an opportunity for interactions with residues distant in the sequence. One example of particular relevance to this study is seen in a partially folded state of equine lysozyme at pH 2, in which a subdomain involving stable native-like secondary structure in helices A, B and D is stabilized by a cluster of interactions between hydrophobic residues in these regions [36]. Interestingly, the tryptophan residues mentioned above are central to this structure. Other examples are found in the molten globule state of apomyoglobin, in which a combination of partially formed native-like tertiary interactions and nonspecific hydrophobic interactions has recently been shown to stabilize its structure [37], and in CheY, in which the ratio of local versus nonlocal interactions has been shown to be important in optimizing the stability and cooperativity of its native structure [38,39].

As well as casting light on the relative importance of local interactions in stabilizing secondary structure, peptide models have also been used to test ideas about the events occurring very early during folding [13,40–43]. This approach makes the assumption that the conformation of the fragments mimics that of the intact protein early in folding. Supporting evidence for this assumption has been provided in studies of barnase and chymotrypsin inhibitor 2, in which a correlation between the conformation of peptide fragments and theoretical predictions of protein folding initiation sites has been observed [44,45]. In the case of the lysozyme family, theoretical approaches have revealed several potential folding initiation sites. One approach [31] suggests that the sequence corresponding to helix A in the native protein assumes structure early in folding, whereas other models based on hydrophobic surface burial predict that the regions 17–28 (a loop), 19–39 (helix B) and 102–117 (helix D) are the earliest secondary structural elements to form [45,46] (Fig. 1).
It is interesting in this regard contrast with other simple peptides, however, in which that our study of peptides 1–40 and 84–129 has shown interaction of aromatic residues in adjacent turns stabilizes that helix D can form in the absence of specific tertiary helical structure significantly [35], the positioning of tryp- interactions, supporting the view that this helix could tophans four residues apart in this lysozyme sequence is play a role in initiating the folding of lysozyme. The 482 Folding & Design Vol 1 No 6

intermediates observed during folding of the disulphide-bonded protein, however, may be significantly different in conformation from the isolated peptide fragments; their relationship will have to await the results of further experiments.

For relatively small proteins, collapse and secondary structure formation are sufficient to drive rapid folding to the native state without the significant population of partially folded intermediates [1–7]. For larger and more complex proteins, however, significant barriers to folding can exist, resulting in the accumulation of partially folded states. Reorganization of secondary structural elements formed early in folding could be relatively facile, assuming that sufficient conformational freedom exists in the partially folded state. For some molecules, however, these intramolecular rearrangements could be relatively slow events, especially once extensive hydrophobic collapse has occurred and stable interactions have developed. If this is the case, the acquisition of nonnative interactions early in folding could be responsible for the generation of slow-folding species and parallel folding pathways, such as those observed in the folding of lysozyme [9,11,47]. Reorganization of the initially formed structure is therefore a key step in folding, an important feature of which is the swapping of nonnative contacts for native ones as domains form, specific tertiary interactions develop, and the ultimate native fold emerges.

Materials and methods

Peptide synthesis
Peptides 1–40 (sequence H2N-KVFGRC(Acm)ELAAAMKRHGLDNYRGYSLGNWVCAAKFESNFNT-NH2) and 84–129 (sequence Ac-LSSDITASVNAAKKIVSDGNGMNAWVAWRNRC(Acm)KGTDVQAWIRGCRL-COOH) were synthesized and purified as previously described [19]. Cys94, which forms a disulphide with Cys76 in the β-domain of native lysozyme, was replaced by alanine in peptide 84–129. Cys6 and Cys127 were modified with Acm groups, whereas Cys30 and Cys115 were unmodified. This synthesis strategy was chosen to leave open the possibility of the formation of the native disulphide pairs. The N terminus of peptide 1–40 was not acetylated to mimic this residue in the intact protein. The peptides encompassing helix C (86–102), helix D (105–115) and the C-terminal 3₁₀ helix (116–129) were synthesized using the same strategy. The native cysteine residues were replaced by alanine or serine. The sequences of these peptides are:

86–102 H2N-SDITASVNAAKKIVSDG-COOH
105–115 H2N-MNAWVAWRNRS-COOH
116–129 Ac-KGTDVQAWIRGSRL-COOH

The peptides were shown to have the correct sequence by analytical HPLC, electrospray mass spectrometry and amino acid analysis. The masses of peptides 1–40, 84–129, 86–102, 105–115 and 116–129 were 4596.21 ± 0.38 (expected 4596.14), 5133.34 ± 0.23 (expected 5133.88), 1675.10 ± 0.18 (expected 1675.86), 1390.16 ± 0.32 (expected 1390.58) and 1628.62 ± 0.18 (expected 1628.85), respectively.

To determine whether the peptides were monomeric under the experimental conditions, samples were analyzed by gel filtration in 45 mM sodium phosphate buffer pH 2.5, 293 K, and in the same buffer containing 0.5 M urea (for peptide 1–40), using a TSK 3000SW column. Molecular mass markers used for calibrating the column were insulin A-chain (2.5 kDa) and BPTI (6.5 kDa). Columns were eluted at a flow rate of 0.5 ml min–1. Peptides were detected by absorption at 214 and 280 nm. The concentration dependence of the far-UV CD and 1H NMR spectra (chemical shifts and linewidths) was also investigated. For peptide 1–40, addition of 0.5 M urea proved to be necessary to prevent aggregation of the sample and to obtain good-quality NMR spectra. Far-UV CD spectra demonstrated that the conformation of the peptide was not perturbed by the inclusion of 0.5 M urea, and no concentration dependence of the CD signal or the NMR spectrum was observed under these conditions. Peptide 84–129 appeared to be entirely monomeric at concentrations between 5 µM and 0.5 mM in aqueous solution and each of the short peptides was monomeric at concentrations up to 3 mM, as judged by all of the assays described above.

Peptide solutions were prepared by dissolving lyophilized peptides in 45 mM sodium phosphate buffer pH 2.0, or in the same buffer containing 0.5 M urea, 8.0 M urea or 6.0 M GuHCl, at 285 K for peptide 1–40 and 290 K for all other peptides. These conditions were also used for the 1H NMR experiments. The pH was adjusted with H3PO4 or NaOH. DTT was added in a 2–4-fold molar excess over peptides to prevent dimerization. Far-UV CD spectra were obtained using a Jasco J720 spectropolarimeter with cells of 0.1, 1, and 10 mm path length at 0.5 nm intervals over the wavelength ranges 190–260 nm (samples in water) and 210–260 nm (for samples containing denaturants). Sample concentrations ranged from 2.5 µM to 1.5 mM. Near-UV CD spectra of peptide 1–40 were recorded using a peptide concentration of 0.15 mM in 0.5 M urea and in 6.0 M GuHCl, using cells with a path length of 10 mm over the wavelength range 260–310 nm. Typically 4–8 scans were averaged. The peptide concentration was determined by the absorption at 280 nm [48] and by quantitative amino acid analysis.

The helical contents of the peptides were calculated using the mean residual molar ellipticity at 222 nm ([θ]222). The molar ellipticities expected for peptides 1–40, 84–129, 86–102, 105–115 and 116–129 in a 100% helical conformation (max [θ]222) are –35400, –36000, –29000, –21600 and –26800 respectively (deg cm2 dmol–1 res–1), obtained by using the formula:

$$\max[\theta]_{222} = \frac{(n - 4.6)(-40000)}{n} \qquad (1)$$

where n is the number of residues in the peptide [49,50].

ANS binding studies
Fluorescence measurements were performed using a Perkin–Elmer LS50B fluorimeter. The excitation wavelength was 394 nm, and fluorescence emission was recorded from 400–600 nm. Slit widths of 3 nm were used both for excitation and emission. Typically two scans per spectrum were recorded. ANS stock solutions were prepared in water, and the ANS concentration was determined using the molar extinction coefficient ε350 = 5.0 × 10³ (M–1 cm–1). In these experiments, peptide and ANS concentrations were varied between 5–20 and 20–500 µM, respectively.

NMR spectroscopy
Samples for NMR spectroscopy (typically 500 µl) were prepared by dissolving the lyophilized peptides in 93% H2O/7% D2O or in 100% D2O. Samples were dissolved in 45 mM sodium phosphate buffer pH 2.0, or in the same buffer containing 0.5 M (1H) urea (in the case of peptide 1–40). The pH was adjusted to 2.0 ± 0.05 (unless stated otherwise; uncorrected meter readings) with diluted H3PO4 and NaOH solutions. DTT was added in a two-fold molar excess over peptides. Concentrations for the various NMR samples ranged from 30 µM to 3 mM. The spectra of all peptides were independent of temperature over the range 283–298 K. All chemical shifts (both in 1D and 2D spectra) were referenced to the internal standard 1,4-dioxan, the shift of which was assumed to be at 3.75 ppm under the experimental conditions.
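As an illustration of equation (1) in the circular dichroism analysis, the following minimal Python sketch computes the limiting [θ]222 value for a fully helical peptide of n residues. The function names are hypothetical and chosen only for this example. The `helical_fraction` helper assumes that helical content is the simple ratio of observed to limiting ellipticity with no random-coil baseline; that ratio is an assumption made here, not a procedure stated in the text, and the observed value used in the demonstration is invented purely to exercise the function.

```python
"""Sketch of equation (1): limiting [theta]222 for a fully helical peptide."""


def max_helical_ellipticity(n_residues: int) -> float:
    """Return max [theta]222 (deg cm2 dmol-1 res-1) for n residues, per equation (1)."""
    if n_residues <= 0:
        raise ValueError("n_residues must be a positive integer")
    return (n_residues - 4.6) * -40000.0 / n_residues


def helical_fraction(observed_theta_222: float, n_residues: int) -> float:
    """Assumed estimate: observed [theta]222 divided by the 100%-helix value.

    This simple ratio ignores any random-coil baseline; it is an
    illustrative assumption, not a procedure stated in the text.
    """
    return observed_theta_222 / max_helical_ellipticity(n_residues)


if __name__ == "__main__":
    # Chain lengths of the two long peptides (residues 1-40 and 84-129).
    for name, n in [("1-40", 40), ("84-129", 46)]:
        value = max_helical_ellipticity(n)
        print(f"peptide {name}: max [theta]222 = {value:.0f}")
    # Hypothetical observed value, chosen only to exercise the function.
    print(f"helical fraction: {helical_fraction(-17700.0, 40):.2f}")
```

For n = 40 and n = 46 the function reproduces the –35400 and –36000 values quoted for peptides 1–40 and 84–129.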

1H NMR experiments were performed on Bruker AM600 and home-built Ω 500 MHz spectrometers at the Oxford Centre for Molecular Sciences. 1D experiments were typically acquired with 4K or 8K complex data points. 2D experiments were acquired with 2K complex points in t2, and in phase-sensitive mode using time-proportional phase incrementation (TPPI [51]) for quadrature detection in t1. Typically 350–600 t1 increments with 64–256 scans each were recorded for double quantum filtered COSY (DQF-COSY [52]), TOCSY [53], ROESY [54], and NOESY11 [55] experiments. In most cases, forward linear prediction was used to extend the data to 512–800 t1 increments. In all cases, except for the jump-return NOESY spectra, the water signal was suppressed using presaturation during the 1.2 s relaxation delay. The mixing time in the various TOCSY experiments was varied between 30 and 82 ms to emphasize direct and remote connectivities, respectively. The mixing times employed in the ROESY and jump-return NOESY experiments ranged from 75–400 ms. In all spectra, intraresidue NOE cross peaks for the Trp 3 to 2 protons were absent and correlations from 3 to 2 protons were weak or absent (data not shown), indicating that spin diffusion does not contribute significantly to the NOE/ROE intensities. Spectral widths of 5102.0 and 6024.1 Hz were used for spectra of peptide 1–40 acquired in D2O and H2O, respectively. For the other peptides, the spectral widths were 7246.3 (84–129 and 116–129), 6024.1 (86–102) and 7042.5 (105–115) Hz. Standard procedures were used to obtain the sequential assignments for the various peptides [56].

Data were processed with the program Felix 2.3 (BIOSYM) on SUN workstations. For 1D spectra, after Fourier transformation, window multiplication (typically a sinebell window function shifted over 60°) and (if necessary) zero-filling to 8K real points, the spectra were baseline corrected using a cubic spline function. For 2D experiments, the data were zero-filled to yield 1K × 1K, 2K × 1K or 2K × 2K (t2,t1) data matrices. Some data sets were processed twice, in order to give a spectrum with a high signal-to-noise ratio and low resolution (squared sine bell functions shifted over 90°) and another spectrum with lower signal-to-noise but higher resolution (squared sine bell functions shifted over 45° or 60°). Post-acquisition suppression of the water signal was achieved by a convolution of a Gaussian function with a function width of 15. In general, no baseline corrections were applied to the 2D spectra.

For the determination of the 3JHNα coupling constants, high-resolution DQF-COSY spectra in H2O were recorded on a Bruker AM600 spectrometer. 600 increments of 8K complex data points were collected using a sweep width of 5102 Hz. The data in the F1 dimension were extended to 800 points by linear prediction. Squared sinebell functions shifted over 30–40° in F1 and 50–70° in F2 were used to enhance resolution. After zero filling, data matrices of 8K × 2K were obtained, with a digital resolution of 0.62 Hz in F2. The cross peaks were fitted to simulated antiphase cross sections along the F2 dimension using a procedure interfaced to FELIX 2.3 [56] to yield values for the coupling constants and the line widths. Measurements for peptide 1–40 were made at 12°C, and for peptide 86–102 at 25°C. All other measurements were made at 25°C.

Acknowledgements
This paper is a contribution from the Oxford Centre for Molecular Sciences which is supported by the BBSRC, EPSRC and MRC. The research of CM Dobson is supported in part by an International Research Scholars award from the Howard Hughes Medical Institute. SE Radford and LJ Smith acknowledge the Royal Society for support. TA Keiderling was supported in part by the University of Illinois and the National Institutes of Health (GM30147) while a guest at the Oxford Centre for Molecular Sciences. KA Bolin thanks the Overseas Research Award Scheme for funding. The authors would further like to thank Jochen Balbach, Harald Schwalbe, Klaus Fiebig, Matthias Buck and Don Haynie for their helpful contributions to this work. We also thank Debbie Smith for providing data on a smaller lysozyme peptide (residues 13–33), which provided valuable insights for the current study.

References
1. Jackson, S.E. & Fersht, A.R. (1991). Folding of chymotrypsin inhibitor 2. 1. Evidence for a two-state transition. Biochemistry 30, 10428–10435.
2. Alexander, P., Orban, J. & Bryan, P. (1992). Kinetic analysis of folding and unfolding the 56 amino acid IgG-binding domain of streptococcal protein G. Biochemistry 31, 7243–7248.
3. Khorasanizadeh, S., Peters, I.D., Butt, T.R. & Roder, H. (1993). Folding and stability of a tryptophan containing mutant of ubiquitin. Biochemistry 32, 7054–7063.
4. Schindler, T., Herrler, M., Marahiel, M.A. & Schmid, F.X. (1995). Extremely rapid protein folding in the absence of intermediates. Nat. Struct. Biol. 2, 663–673.
5. Kragelund, B.B., Robinson, C.V., Knudsen, J., Dobson, C.M. & Poulsen, F.M. (1995). Folding of a four helix bundle; studies of acyl-coenzyme A binding protein. Biochemistry 34, 7217–7224.
6. Huang, G.S. & Oas, T.G. (1995). Submillisecond folding of a monomeric repressor. Proc. Natl. Acad. Sci. USA 92, 6878–6882.
7. Viguera, A.R., Martinez, J.C., Filimonov, V.V., Mateo, P.L. & Serrano, L. (1994). Thermodynamic and kinetic analysis of the SH3 domain of spectrin shows a two state folding transition. Biochemistry 33, 2142–2150.
8. Matthews, C.R. (1993). Pathways of protein folding. Annu. Rev. Biochem. 62, 653–683.
9. Radford, S.E., Dobson, C.M. & Evans, P.A. (1992). The folding of hen lysozyme involves partially structured intermediates and multiple pathways. Nature 358, 302–307.
10. Chaffotte, A., Guillou, Y. & Goldberg, M.E. (1992). Kinetic resolution of and far-UV circular dichroism during the folding of hen egg white lysozyme. Biochemistry 31, 9694–9702.
11. Radford, S.E. & Dobson, C.M. (1995). Insights into protein folding using physical techniques. Proc. Roy. Soc. Lond. B 348, 17–25.
12. De Prat-Gay, G., Ruiz-Sanz, J., Neira, J.L., Itzhaki, L.S. & Fersht, A.R. (1995). Folding of a nascent polypeptide chain in vitro: cooperative formation of structure in a protein module. Proc. Natl. Acad. Sci. USA 92, 3683–3686.
13. Itzhaki, L.S., Neira, J.L., Ruiz-Sanz, J., De Prat Gay, G.D. & Fersht, A.R. (1995). Search for nucleation sites in smaller fragments of chymotrypsin inhibitor 2. J. Mol. Biol. 254, 289–304.
14. Ruiz-Sanz, J., De Prat Gay, G., Otzen, D.E. & Fersht, A.R. (1995). Protein fragments as models for early events in protein folding pathways: protein engineering analysis of the association of two complementary fragments of barley chymotrypsin inhibitor 2 (CI-2). Biochemistry 34, 1695–1701.
15. Ittah, V. & Haas, E. (1995). Non-local interactions stabilise long range loops in the initial folding intermediates of reduced bovine pancreatic trypsin inhibitor. Biochemistry 34, 4493–4506.
16. Peng, Z.Y. & Kim, P.S. (1994). A protein dissection study of a molten globule. Biochemistry 33, 2136–2141.
17. Shin, H.-C., Merutka, G., Waltho, J.P., Wright, P.E. & Dyson, H.J. (1993). Peptide models of protein folding initiation sites. 3. The G-H helical hairpin of myoglobin. Biochemistry 32, 6356–6364.
18. Waltho, J.P., Feher, V.A., Merutka, G., Dyson, H.J. & Wright, P.E. (1993). Peptide models of protein folding initiation sites. 1. Secondary structure formation by peptides corresponding to the G- and H-helices of myoglobin. Biochemistry 32, 6337–6347.
19. Yang, J.J., et al., & Radford, S.E. (1995). Conformational properties of four peptides spanning the sequence of hen lysozyme. J. Mol. Biol. 252, 483–491.
20. Wüthrich, K. (1986). NMR of Proteins and Nucleic Acids. Wiley, New York.
21. Fiebig, K.M., Schwalbe, H., Buck, M., Smith, L.J. & Dobson, C.M. (1996). Towards a description of the conformation of denatured states of proteins. Comparisons of a random coil model with NMR measurements. J. Phys. Chem. 100, 2661–2666.
22. Smith, L.J., Fiebig, K.M., Schwalbe, H. & Dobson, C.M. (1996). The concept of a random coil. Residual structure in peptides and denatured proteins. Fold. Des. 1, R95–R106; 1359-0278-001-R0106.
23. Smith, L.J., Bolin, K.A., Schwalbe, H., MacArthur, M.W., Thornton, J.M. & Dobson, C.M. (1996). Analysis of main chain torsion angles in proteins: prediction of NMR coupling constants for native and random coil conformations. J. Mol. Biol. 255, 494–506.
24. Kemmink, J. & Creighton, T.E. (1995). The physical properties of local interactions of tyrosine residues in peptides and unfolded proteins. J. Mol. Biol. 245, 251–260.
25. Merutka, G., Dyson, H.J. & Wright, P.E. (1995). Random coil 1H chemical shifts obtained as a function of temperature and trifluoroethanol
concentration for the peptide series GGXGG. J. Biomol. NMR 5, 14–24.
26. Bolin, K.A., Pitkeathly, M., Miranker, A., Smith, L.J. & Dobson, C.M. (1996). Insights into a random coil conformation and an isolated helix: structural and dynamical characterisation of the C-helix peptide from hen lysozyme. J. Mol. Biol. 261, 443–453.
27. Chakrabartty, A., Kortemme, S., Padmanabhan, S. & Baldwin, R.L. (1993). Aromatic side chain contribution to far-ultraviolet circular dichroism of helical peptides and its effect on measurement of helix propensities. Biochemistry 32, 5560–5565.
28. Curtis Johnson, J.W. (1988). Secondary structure of proteins through circular dichroism spectroscopy. Annu. Rev. Biophys. Biophys. Chem. 17, 145–166.
29. Muñoz, V. & Serrano, L. (1994). Intrinsic secondary structure of the amino acids, using statistical φ–ψ matrices: comparison with experimental scales. Proteins 20, 301–311.
30. Muñoz, V. & Serrano, L. (1995). Elucidating the folding problem of helical peptides using empirical parameters. III. Temperature and pH dependence. J. Mol. Biol. 245, 297–308.
31. Rooman, M.J. & Wodak, S.J. (1992). Extracting information on folding from the amino acid sequence: consensus regions with preferred conformations in homologous proteins. Biochemistry 31, 10239–10249.
32. Ueda, T., Nakashima, A., Hashimoto, Y., Miki, T., Yamada, H. & Imoto, T. (1994). Formation of α-helix 88–98 is essential in the establishment of higher order structure from reduced lysozyme. J. Mol. Biol. 235, 1312–1317.
33. Dobson, C.M. (1994). Solid evidence for molten globules. Curr. Biol. 4, 636–639.
34. Bai, Y., Milne, J.S., Mayne, L. & Englander, S.W. (1993). Primary structure effects on peptide group hydrogen exchange. Proteins 17, 75–86.
35. Albert, J.S. & Hamilton, A.D. (1995). Stabilization of helical domains in short peptides using hydrophobic interactions. Biochemistry 34, 984–990.
36. Morozova, L.A., Haynie, D.T., Arico-Muendel, C., Van Dael, H. & Dobson, C.M. (1995). Structural basis of the stability of a lysozyme molten globule. Nat. Struct. Biol. 2, 871–875.
37. Kay, M.S. & Baldwin, R.L. (1996). Packing interactions in the apomyoglobin folding intermediate. Nat. Struct. Biol. 3, 439–445.
38. Muñoz, V., Cronet, P., López-Hernández, E. & Serrano, L. (1996). Analysis of the effect of local interactions on protein stability. Fold. Des. 1, 167–178; 1359-0278-001-00167.
39. Muñoz, V. & Serrano, L. (1996). Local versus nonlocal interactions in protein folding and stability: an experimentalist's point of view. Fold. Des. 1, R71–R77; 1359-0278-001-R0071.
40. Neira, J.L. & Fersht, A.R. (1996). An NMR study on the β-hairpin region of barnase. Fold. Des. 1, 231–241; 1359-0279-001-00231.
41. Dyson, H.J., Rance, M., Houghten, R.A., Lerner, R.A. & Wright, P.E. (1988). Folding of immunogenic peptide fragments of proteins in water solution II. The nascent helix. J. Mol. Biol. 201, 201–217.
42. Dyson, H.J., Rance, M., Houghten, R.A., Lerner, R.A. & Wright, P.E. (1988). Folding of immunogenic peptide fragments of proteins in water solution I. Sequence requirements for the formation of a reverse turn. J. Mol. Biol. 201, 161–200.
43. Kemmink, J. & Creighton, T.E. (1993). Local conformations of peptides representing the entire sequence of bovine pancreatic trypsin inhibitor and their roles in folding. J. Mol. Biol. 234, 861–878.
44. Abkevich, V.I., Gutin, A.M. & Shakhnovich, E.I. (1994). Specific nucleus as the transition state for protein folding: evidence from the lattice model. Biochemistry 33, 10026–10036.
45. Moult, J. & Unger, R. (1991). An analysis of protein folding pathways. Biochemistry 30, 3816–3824.
46. Chelvanayagam, G., Reich, Z., Bringas, R. & Argos, P. (1992). Prediction of protein folding pathways. J. Mol. Biol. 227, 901–916.
47. Thirumalai, D. & Guo, Z. (1994). Nucleation mechanism for protein folding and theoretical predictions for hydrogen-exchange labelling experiments. Biopolymers 35, 137–140.
48. Mach, H., Middaugh, C.R. & Lewis, R.V. (1992). Statistical determination of the average values of the extinction coefficient of tryptophan and tyrosine in native proteins. Analyt. Biochem. 200, 74–80.
49. Chen, Y.-H., Yang, J.T. & Chau, K.H. (1974). Determination of the helix and beta form of proteins in aqueous solution by circular dichroism. Biochemistry 16, 3350–3359.
50. Gans, P.J., Lyu, P.C., Manning, M., Woody, R.W. & Kallenbach, N.R. (1991). The helix-coil transition in heterogeneous peptides with specific side-chain interactions: theory and comparison with CD spectral data. Biopolymers 31, 1605–1614.
51. Marion, D. & Wüthrich, K. (1983). Application of phase-sensitive two dimensional correlation spectroscopy (COSY) for measurements of 1H–1H spin–spin coupling constants in proteins. Biochem. Biophys. Res. Commun. 113, 967–974.
52. Rance, M., Sorensen, O.W., Bodenhausen, G., Wagner, G., Ernst, R.R. & Wüthrich, K. (1983). Improved spectral resolution in COSY 1H spectra of proteins via double quantum filtering. Biochem. Biophys. Res. Commun. 177, 479–485.
53. Braunschweiler, L. & Ernst, R.R. (1983). Coherence transfer by isotropic mixing: application to proton correlation spectroscopy. J. Magn. Reson. 53, 521–528.
54. Bothner-By, A.A., Stephens, R.L., Lee, J.M., Warren, C.D. & Jeanloz, R.W. (1984). Structure determination of a tetrasaccharide: transient nuclear Overhauser effects in the rotating frame. J. Am. Chem. Soc. 106, 811–813.
55. Sklenar, V. & Bax, A. (1987). Spin-echo water suppression for the generation of pure-phase two-dimensional NMR spectra. J. Magn. Reson. 74, 469–479.
56. Smith, L.S., Sutcliffe, M.J., Redfield, C. & Dobson, C.M. (1991). Analysis of φ and χ1 torsion angles for hen lysozyme in solution from 1H NMR spin–spin coupling constants. Biochemistry 30, 986–996.
57. Blake, C.C.F., Koenig, D.F., Mair, G.A. & Sarma, R. (1965). structure of lysozyme by X-ray diffraction. Nature 206, 757–761.
58. Handoll, H. (1985). Crystallographic studies of proteins. PhD thesis, University of Oxford.
59. Brocklehurst, S.M. & Perham, R.N. (1993). Prediction of the three dimensional structure of the biotinylated domain from yeast pyruvate carboxylase and of the lipoylated H-protein from the pea leaf glycine-cleavage system; a new automated method for the prediction of protein tertiary structure. Protein Sci. 2, 626–639.
60. Kraulis, P.J. (1991). Molscript: a program to produce both detailed and schematic plots of protein structures. J. Appl. Crystallogr. 24, 946–950.

Because Folding & Design operates a 'Continuous Publication System' for Research Papers, this paper has been published via the internet before being printed. The paper can be accessed from http://biomednet.com/cbiology/fad.htm; for further information, see the explanation on the contents page.