Carlsberg Res. Commun. Vol. 49, p. 1-55, 1984

PROTEIN STRUCTURE AND FUNCTION, FROM A COLLOIDAL TO A MOLECULAR VIEW*

by HAROLD A. SCHERAGA

Baker Laboratory of Chemistry, Cornell University, Ithaca, New York 14853, USA

Presented as the 7th Linderstrøm-Lang Lecture at the Carlsberg Laboratory, Copenhagen, on 10th May, 1983

Keywords: Hydrodynamic properties, internal interactions, ribonuclease, synthetic polypeptides, conformational changes, protein folding, conformational energy calculations, structural elements of proteins, enzyme-substrate complexes

1. INTRODUCTION

Forty years ago, proteins were described in terms of ellipsoids of revolution in order to account for their hydrodynamic properties. With the determination of the amino acid sequence of insulin (290) and other proteins, it became meaningful to discuss protein structure and function in terms of the interatomic interactions within the protein molecule. Various physical chemical methods were developed and applied to elucidate such interactions. Many of the pairwise interactions deduced from such studies were found to exist, e.g. in bovine pancreatic ribonuclease A (303), when the X-ray crystal structure was subsequently determined (404, 405, 407). The ability to identify pairwise interactions by physical chemical means provided distance constraints with which to define the three-dimensional structure of a protein in solution. There then followed the rapid development of theoretical methods to use these experimental distance constraints, together with empirical potential energy functions, to gain an understanding as to how interatomic interactions dictate the structural features of proteins and to try to compute the three-dimensional structures of polypeptides and proteins in solution. At the same time, both theoretical and experimental methods were used to elucidate the pathway(s) of folding from a nascent polypeptide chain to the three-dimensional structure of the native protein. The same theoretical and experimental methods have also been applied to study the biological function of the folded (native) protein, e.g. the interaction of an enzyme with its substrate. The development of some of our current views of protein structure and function will be described here by focussing on the impact of physical chemistry on the advancement of our knowledge in this field.

*The work reviewed here was supported by research grants from the National Institutes of Health, the National Science Foundation, and the National Foundation for Cancer Research.

Springer-Verlag 0105-1938/84/0049/0001/$11.00

[Figure 1: scale drawing, with a 100 Å scale bar, comparing Na+, Cl- and glucose with blood proteins and their molecular weights: albumin, 69,000; hemoglobin, 68,000; globulin, 90,000; lipoprotein, 200,000; γ-globulin, 156,000; lipoprotein, 1,300,000; fibrinogen, 400,000.]

Figure 1. Early representation of protein molecules in terms of ellipsoids of revolution. Relative dimensions of several protein molecules in blood (293).

2. EARLY VIEW OF PROTEINS

In the 1940's, proteins were thought of as charged colloidal particles. A summary of their acid-base equilibria and dielectric properties has been provided by COHN and EDSALL (47), and a description of their sizes and shapes, determined by various hydrodynamic and optical methods, has been presented by EDSALL (63). Figure 1 is a pictorial interpretation of such data, in terms of ellipsoids of revolution (293). A re-interpretation of hydrodynamic data with the aid of shape-dependent β- and δ-functions (325) led to an alteration of the pictorial representations of Figure 1; e.g. the computed axial ratio of fibrinogen was reduced from 18:1 to 5:1 (327, 344), a result that is compatible with electron micrographs (103) of this protein. Further, the view that urea denaturation of proteins leads to increased asymmetry (due to unfolding of polypeptide chains) was altered to one in which the protein swells (almost isotropically) (325). Thus, the concept that a protein is a rigid particle had to be abandoned in favor of one in which the molecule is a dynamically flexible one (325).

The rigid-particle view had also been used to interpret titration data of proteins. LINDERSTRØM-LANG (173) was the first one to exploit the newly-formulated DEBYE-HÜCKEL theory (52, 53) for this purpose by applying it to a spherical model of a protein, with the charge distributed uniformly on its surface. Subsequently, the spherical model was extended to a cylindrical one by HILL (113), and the uniform distribution of charge was modified to a discrete one (for a spherical model) by TANFORD and KIRKWOOD (369, 371). In the succeeding years, TANFORD (370) and KLOTZ (138) made considerable use of such models to provide clues that interactions involving ionizable groups alter their normally-observed pK's and binding constants for various ligands.

Simultaneously, the thermodynamic properties of protein solutions had also been treated with rigid spherical charged models. For example, SCATCHARD and coworkers (294) accounted for the non-ideality of aqueous solutions of proteins by interpreting the second virial coefficient in the expression for the osmotic pressure in terms of interactions between the various components of the system.

Of course, it was realized that such simple models provided only the crudest description of a protein, and efforts were made to try to probe the more detailed structure of the protein molecule. The aforementioned titration and binding experiments suggested that there were interactions between functional groups that were reflected in modified pK's and binding constants. Such interactions also modified the ease of hydrolysis of peptide bonds, and LINDERSTRØM-LANG (174, 175, 177), OTTESEN (234), RICHARDS (285) and others demonstrated this very clearly by observations of limited proteolysis by enzymes. But a detailed interpretation of such results had to await the pioneering work of SANGER (290) who first determined the amino acid sequence of a protein, insulin.

3. TRANSITION TO A MOLECULAR APPROACH

With the determination of the amino acid sequence of insulin (290), and subsequently of ribonuclease (99, 114, 268, 347-349), lysozyme (32, 128) and other proteins, a new era in protein chemistry was opened up. The protein was no longer a colloidal particle, but an organic molecule whose complete covalent structure was describable. It then remained the province of the physical chemist to provide a description of its three-dimensional structure, of the interactions that led to it, and of its interactions with other molecules.

After a long series of trials by numerous investigators, PAULING and COREY (241, 242) succeeded in elucidating the stereochemistry of the polypeptide chain in terms of the α-helix and the β-pleated sheet. These pioneering proposals were confirmed when the α-helix was subsequently found in the crystal structures of myoglobin (133) and hemoglobin (244), and the β-pleated sheet in the crystal structure of lysozyme (18).

Simultaneously, the need was recognized [even before the successful determinations of protein structures by X-ray diffraction (18, 133, 244)] for obtaining structural information about proteins in solution, and for determining how polypeptide chains fold into the native conformations of proteins and then interact with other molecules. Hence, methods were developed to synthesize homopolymers and copolymers of amino acids as simple models (19, 20, 130) in order to elucidate the interactions present in proteins, and various physical chemical techniques [such as optical rotation and circular dichroism (204, 205, 298, 375, 377), infrared (203), Raman (381), and nuclear magnetic resonance (127, 406) spectroscopies, kinetics of deuterium-hydrogen exchange (122, 176), and immunochemical measures of folding equilibria (291)] were developed to investigate these interactions. At the same time, the development of thermodynamic and statistical mechanical methods began to emerge (150-152, 296, 297) to treat the interactions that determine conformation, conformational changes, and intramolecular structure in polypeptides and proteins. We shall therefore trace the parallel and symbiotic elaboration of theoretical and experimental procedures to study protein structure and function in solution.

4. INTERNAL INTERACTIONS

Internal interactions in proteins affect their various properties, and physical chemical studies of such properties, accompanied by appropriate theoretical interpretation, provide information about such internal interactions. Initially, the experiments were carried out with insulin, lysozyme and ribonuclease (301) because these were among the first proteins whose amino acid sequences were determined, and attention was focussed on electrostatic effects and on hydrogen and hydrophobic interactions.


[Figure 2 scheme: -CH2-CH2-COO(-) .... HO-C6H4-CH2-, the glutamyl carboxylate accepting a hydrogen bond from the tyrosyl hydroxyl.]

Figure 2. Schematic representation of a hydrogen bond between the hydroxyl group of a tyrosyl residue and the carboxylate ion of a glutamyl residue between two rigid, helical polypeptide chains (150).

4.1. Electrostatic effects

Generalized electrostatic effects, which are reflected in altered pK's observed in acid-base titrations of proteins, are usually treated by the aforementioned LINDERSTRØM-LANG model (173). More recently, both generalized and specific electrostatic interactions have been included as an essential component of empirical conformational energies (see section 7.1).

4.2. Hydrogen bonds

Very early, MIRSKY and PAULING (202) suggested that the hydrogen bond is an important structural element in proteins, and such an interaction (involving the backbone NH and CO groups) is an essential feature of the α-helix and β-pleated sheet (241, 242). The first statistical mechanical treatment of such structures was provided by SCHELLMAN (296, 297) (section 6) who focussed attention on the backbone hydrogen bonds. Additional insight was provided by considering the hydrogen bonds involving the polar side chains (150-152). A general hydrogen-bond potential is included in empirical conformational energy calculations (see section 7.1).

4.2.1. Role of hydrogen bonds in modifying the reactivity of polar groups

A schematic representation of a hydrogen bond (involving, as an example, two ionizable side chains) is shown in Figure 2 (150). Such a hydrogen bond will affect the observed pK's of the donor and acceptor groups (150). In the illustration of Figure 2, the observed pK of the tyrosyl group will be raised, and that of the carboxyl group will be lowered, by an amount depending on the free energy required to disrupt this hydrogen bond. Such modified pK's are observed in titrations of many proteins, and this model and more involved hydrogen-bonding schemes have been used to account for such observed "anomalous" pK's (150) and also for "anomalous" binding constants (150).

In studies of ribonuclease, potentiometric and spectrophotometric titration curves (343, 372) provided the first clues that abnormal pK's arise from internal interactions; e.g., three of the six tyrosyl groups of this protein have abnormally high pK's. Ultraviolet difference spectra (153, 155, 299) (Figure 3) and their pH dependence (15, 23, 81, 105, 109, 110, 299, 342) (Figure 4) indicated that tyrosyl groups were near carboxyl groups, perhaps interacting by means of hydrogen bonds of the type shown in Figure 2.

Figure 3. Difference in optical density at room temperature between ribonuclease solutions at pH 1.91 and 6.94 for experiments carried out at four different protein concentrations, increasing from curve A to curve D (299). [Axes: wavelength λ, 270 to 300 mμ; difference in optical density ΔD.]

Figure 4. Fraction converted versus temperature at several pH's for reversible denaturation of ribonuclease. Open and closed symbols correspond to ultraviolet spectra and optical rotation measurements, respectively (109). [Axes: temperature, 10 to 80 °C; fraction converted, 0 to 1.0.]

A series of chemical and physical chemical experiments on solutions of ribonuclease and some of its derivatives led to three specific pairings of tyrosyl with carboxyl groups in ribonuclease, viz. Tyr 25... Asp 14, Tyr 92... Asp 38 and Tyr 97... Asp 83 (303). When the crystal structure of ribonuclease was subsequently determined (405, 407) these three pairs were indeed found to be near each other (Figure 5). Considering that there are (6!/3!3!) and (11!/3!8!) ways to pick 3 out of 6 tyrosyl-OH and 3 out of 11 carboxyl groups, respectively, and 3! ways to form tyrosyl-carboxyl pairs from each subset of three, there are 19,800 possible sets; thus, the set of three pairings illustrated in Figure 5 is much more than a fortuitous choice, and represents a triumph of protein physical chemistry. Nowadays, such pair interactions are determinable by nuclear OVERHAUSER (25, 156) and non-radiative fluorescence energy transfer (193) experiments.

Figure 5. Three Tyr... Asp interactions deduced from physical chemical experiments (303). The drawings are based on the subsequently-determined X-ray coordinates of WLODAWER et al. (405).

The information in Figure 5, together with the known location of four disulfide bonds in ribonuclease (349) and the proximity of His 12, His 119 and Lys 41 in the active site of this protein (100, 107, 108, 115, 351) constitute distance constraints on the folding of the native molecule (301, 309). Hence, we began the development of theoretical computational methods (216) to exploit these distance constraints (see sections 7 and 8). However, it must be emphasized that distance constraints are not enough to determine the folding of a protein (393); these must be included in an empirical conformational energy algorithm.

4.2.2. Role of hydrogen bonds in modifying the reactivity of primary valence bonds

As mentioned in section 2, peptide bonds in proteins can have altered reactivities, compared to those in, say, dipeptides. Likewise, the redox potentials of disulfide bonds in proteins can differ from that in cystine. Such modified reactivity can lead to the aforementioned phenomenon of limited proteolysis, a study of which provides information about the existence of internal interactions. For example, if it is necessary to disrupt a hydrogen bond of the type illustrated in Figure 2 in order to liberate the hydrolyzed peptide fragment, then the accompanying free energy can stabilize the covalent peptide or disulfide bond undergoing reaction (151). In fact, the acquired stabilization can be sufficiently great so that, for example, a proteolytic enzyme can lead to synthesis rather than hydrolysis of a peptide bond. The reversibility of such processes has been demonstrated (by approaching equilibrium from both directions) in the thrombin-fibrinogen reaction (154, 321, 324), and in other proteolytic reactions involving, e.g., the action of trypsin on soybean trypsin inhibitor (146) and hemoglobin (207). It has also been used for peptide synthesis with proteolytic enzymes (26).

Limited proteolytic reactions are used to map the active sites of proteolytic enzymes (295, 310, 373). For example, with the aid of a Schechter-Berger type of analysis of the kinetics of enzyme action (295), interactions (such as an Arg... Asp salt link in the substrate) involved in the thrombin-fibrinogen complex are being identified (181, 310). [Details of local conformations at the active site are also obtainable by resonance Raman spectroscopy, a technique that has provided considerable information about the interaction of carboxypeptidase A with inhibitors (333-336, 386)]. Limited proteolytic reactions have also served to probe the stages of unfolding of a protein (28, 229-232, 289) (see section 5.1).

4.2.3. Role of hydrogen bonds in stabilizing a protein

As already implied in the work of PAULING and COREY (241, 242) and SCHELLMAN (296, 297), hydrogen bonds can stabilize a protein. If the hydrogen bonds involve polar side chains, especially those that can ionize, then the existence of such a hydrogen bond will depend on the pH. Consequently, the kinetics and equilibria of protein denaturation will depend on pH in a manner that can be accounted for (152, 300) by [...]

[...]tion of such internal interactions have provided an explanation for the pH dependence of the stability of proteins either in fibers and films (178, 209, 210) or in solution (109, 110). Figure 4 provides an example of the thermally-induced denaturation of ribonuclease at various pH's (see section 5 for further exploitation of the data of Figure 4 to study protein folding).

4.2.4. Role of hydrogen bonds in protein association

In those systems in which hydrogen bonds are the main interactions involved in protein association, the association reaction will be exothermic. Further, if ionizable donor and/or acceptor groups are involved in the (intermolecular) hydrogen bonds, then the enthalpy of reaction will depend (in a predictable manner) on pH. Such a pH-dependent exothermicity has been demonstrated for the association of fibrin monomer (355), and the donor and acceptor groups have been identified as tyrosyl and histidyl, respectively (324, 355). Exothermicity is a rare occurrence in protein association, and the example of fibrin monomer polymerization is almost a unique one. Most protein association reactions are endothermic, possibly involving hydrophobic interactions which are accompanied by a positive enthalpy of formation (see section 4.3.3).
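The count of 19,800 possible sets of tyrosyl-carboxyl pairings quoted in section 4.2.1 follows directly from the three factors given there. As a minimal check, the short Python sketch below multiplies them; the function name `count_pairing_sets` and its parameter names are illustrative only and do not come from the original work.

```python
from math import comb, factorial


def count_pairing_sets(n_donors: int, n_acceptors: int, n_pairs: int) -> int:
    """Count the ways to pick n_pairs donors and n_pairs acceptors
    and then match them one-to-one."""
    donor_subsets = comb(n_donors, n_pairs)        # e.g. 6!/(3!3!) for tyrosyl-OH
    acceptor_subsets = comb(n_acceptors, n_pairs)  # e.g. 11!/(3!8!) for carboxyls
    matchings = factorial(n_pairs)                 # 3! ways to pair the two subsets
    return donor_subsets * acceptor_subsets * matchings


# Ribonuclease: 6 tyrosyl groups, 11 carboxyl groups, 3 observed pairings.
tyr_carboxyl_sets = count_pairing_sets(n_donors=6, n_acceptors=11, n_pairs=3)
print(tyr_carboxyl_sets)
```

With 20 tyrosyl subsets, 165 carboxyl subsets and 6 matchings per pair of subsets, the product reproduces the figure given in the text.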


Figure 7. Illustrative examples of hydrophobic interactions between pairs of isolated side chains (215).

The hydrogen-bonding model used to interpret the dependence of ΔH on pH for the association of fibrin monomer (355) also accounts for the pH range of association and for the liberation of protons during association (64-66, 147, 321, 355). For example, the liberation of protons (Figure 6) in fibrin association reflects the influence of intermolecular hydrogen bonds on the acid-base equilibrium of the tyrosyl and histidyl groups involved.

4.3. Hydrophobic interactions

In 1954, KIRKWOOD (137) pointed out that VAN DER WAALS interactions between nonpolar groups were not large enough to lead to the apparently strong interactions between such groups in water, and suggested that alterations in the solvation of nonpolar groups were involved in their association. In 1959, KAUZMANN provided an analysis of the thermodynamics of hydrophobic interactions between nonpolar groups in water (131). Simultaneously, BERGER and LINDERSTRØM-LANG tried to obtain experimental evidence for hydrophobic interactions by attributing the stability of the poly(D,L-alanine) α-helix in water to such interactions (12).

Table I. Theoretical thermodynamic parameters (215) for formation of the hydrophobic interactions of Figure 7 at 25 °C.

Side chains                           ΔG° (kcal/mol)   ΔH° (kcal/mol)   ΔS° (e.u.)
A. Alanine... alanine                      -0.3              0.4             2.1
B. Isoleucine... isoleucine                -1.5              1.8            11.1
C. Phenylalanine... leucine                -0.4              0.9             4.7
D. Phenylalanine... phenylalanine          -1.4              0.8             7.5
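The entries of Table I can be cross-checked against one another. The sketch below assumes the standard relation ΔG° = ΔH° - TΔS° with T = 298.15 K and takes e.u. to mean cal/(mol K); the function name `gibbs_from_enthalpy_entropy` and the short row labels are illustrative. The recomputed free energies agree with the tabulated ones to within about 0.1 kcal/mol, which is consistent with rounding, and in every row the favorable ΔG° comes from the entropy term because ΔH° is positive.

```python
T_KELVIN = 298.15  # 25 degrees C, as stated in the table heading

# (label, dG in kcal/mol, dH in kcal/mol, dS in e.u.) transcribed from Table I
TABLE_I = [
    ("Ala...Ala", -0.3, 0.4, 2.1),
    ("Ile...Ile", -1.5, 1.8, 11.1),
    ("Phe...Leu", -0.4, 0.9, 4.7),
    ("Phe...Phe", -1.4, 0.8, 7.5),
]


def gibbs_from_enthalpy_entropy(dh_kcal: float, ds_eu: float,
                                t_kelvin: float = T_KELVIN) -> float:
    """Return dG = dH - T*dS in kcal/mol, with dS given in cal/(mol K)."""
    return dh_kcal - t_kelvin * ds_eu / 1000.0


for label, dg_table, dh, ds in TABLE_I:
    dg_calc = gibbs_from_enthalpy_entropy(dh, ds)
    print(f"{label}: tabulated {dg_table:+.1f}, recomputed {dg_calc:+.2f} kcal/mol")
```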


4.3.1. Early model for statistical mechanical 126). These parameters also provided a basis treatment of hydrophobic interactions for hydrophobic chromatography (352) and for Before reliable empirical potentials were de- understanding the role of nonpolar groups in veloped, hydrophobic interactions were treated the entropy of association of proteins (353). by statistical mechanical calculations on models of the structure of water (213) and of aqueous solutions of hydrocarbons (214). Partial cages 4.4. Alternative treatments ofthe effect ofwater (clathrates) were envisaged as forming around on noncovalent interactions the hydrocarbon molecules (hydrophobic hy- While molecular dynamics and Monte Carlo dration), and the formation of such ordered methods are suitable for treating aqueous solu- structures was reversed upon association of non- tions of small molecules, they suffer from a polar groups in the hydrophobic interaction serious deficiency in not being able to provide (215). This interaction is thus accompanied by free energies of hydration without considerable increases in enthalpy, entropy and volume, and modification of the computational procedure a decrease in free energy. Hydrophobic inter- (236, 237, 314, 376). Also, they are not easily actions stabilize protein structures by themsel- adaptable to conformational energy calcula- ves (328) and also by increasing the strength tions on proteins. Hence, two alternative treat- of hydrogen bonds (221). ments of protein hydration have been devel- oped. The first is a hydration-shell model in which empirical free energies of hydration are 4.3.2. 
Molecular dynamics and Monte Carlo assigned to polar and nonpolar groups, and the simulations overlap of hydration shells (leading to dehydra- While this approach underwent several im- tion) tends to force nonpolar and polar groups provements (101, 160), it was subsequently to the inside and outside, respectively, of protein superseded by molecular dynamics (224, 239, molecules (77, 116-118,224, 239, 240). A mea- 282, 283, 318, 354) and Monte Carlo (11,136, sure of the preference for the inside or outside 167, 224, 235-239, 318, 357) simulations of is provided by a procedure that computes sur- aqueous solutions, as a result of the acquisition face accessibility (157). of reliable empirical potential functions. Among A second treatment is based on a statistical other things, these simulation methods demon- analysis of the radial distribution of side-chain strated the existence of the clathrate-like struc- groups in a protein molecule (148, 194, 195, tures that had been assumed in the earlier 198, 224, 272, 318, 395, 396, 403). The data models (214, 215, 314). from such an analysis also reflect the tendencies of nonpolar and polar groups to lie on the inside and outside, respectively, of protein molecules 4.3.3. Experimental verification of parameters Both of these empirical procedures are easily Figure 7 provides some examples ofhydroho- incorporated into empirical conformational bic interactions, and Table I indicatesthe magni- energy schemes to compute protein structure. tudes of the thermodynamic parameters for The empirical potential functions, into which their formation. The thermodynamic parame- the effect of hydration is incorporated, are di- ters computed for hydrophobic interactions scussed in section 7.1. were verified by numerous experimental studies involving the association of small molecules containing nonpolar groups (224, 227, 239,302, 5. RIBONUCLEASE AS A MODEL FOR 311, 314, 318). 
The parameters for inter- STUDYING PROTEIN STRUCTURE molecular hydrophobic interactions also ac- AND PROTEIN FOLDING count for the large increases in enthalpy, entropy Many of our current ideas about protein and volume accompanying most protein associ- structure and protein folding were developed ation reactions (215, 353), e.g. the aggregation through studies of the interactions in native of the protein from tobacco mosaic virus (10, bovine pancreatic ribonuclease (section 4. 2. 1 )

Carlsberg Res. Commun. Vol. 49, p. 1-55, 1984 9 H. A. SCHERAGA: Protein structure

,-_ ~ s~---~ -~ ~,~,

,../ ~- ; ~?-'" "~'.'r~'-' ..,/

TTT

_ _.-('Y--~_3-..t3 s

tx~3 ..'.r bl ) , ,~ e "s I?, I r~'o~"

. ,2 o I~1 \

10 Carisberg Res. Commun. Vol. 49, p. 1-55, 1984 H. A. SCHERAGA:Protein structure

and of the unfolding and refolding of this protein possible to use the proteolysis data to identify ( 17,286, 316, 326). This is also the protein with several stages of unfolding in the reversible which ANFINSEN (8) first demonstrated the equilibrium thermal transition (28). These are spontaneous folding of a polypeptide chain into shown in Figure 8. the native conformation. Various experiments Since the transition is reversible, the diagrams have been carded out on the folding of ribonu- of Figure 8 represent stages of both unfolding clease either by maintaining the four disulfide and refolding. Additional evidence in support bonds intact, or by reducing and then oxidizing of these stages of folding, modified to a slight them. extent by subsequent experiments (34, 183), was obtained by studies ofcarboxypeptidase A diges- tion of ribonuclease (31), flash photolytic label- 5.1. Folding with intact disulfide bonds ing of the surface residues (183), EPR measure- Many investigations of the kinetics of unfold- ments of mobilities of spin labels attached to ing and refolding of ribonuclease have demon- ribonuclease (185), Raman (36, 80) and '3C strated the presence of kinetic intermediates NMR (119) spectra, immunochemical probing ( 135,340). Under some solvent conditions, local (34), and X-ray diffraction (80) at various tem- ordered structures are stabilized, but they do peratures in the transition region. The recent not necessarily play an essential role in the Raman and X-ray study (80) also indicates that folding of this protein (55, 179). 
In addition, the thermally-induced conformational changes cis/trans isomerism about peptide bonds pre- in aqueous solution and in the crystal parallel ceding proline may play an important role in each other, thereby showing that the stability the folding process, but there is some disagree- of the protein is determined primarily by intra- ment as to the stage of folding in which this molecular rather than intermolecular inter- isomerism takes place (171, 254, 338). actions. Studies of the reversible equilibrium ther- mally-induced unfolding/refolding of ribonu- clease (Figure 4) have provided details of intra- 5.2. Folding accompanying formation of molecular interactions. The data of Figure 4, disulfide bonds obtained by ultraviolet difference spectra and Air-oxidation of reduced ribonuclease led to optical rotatory dispersion measurements, indi- the regeneration of the native structure (8), from cate that the overall unfolding of the tyrosyl which ANFINSEN concluded that the amino acid side chains and of the polypeptide backbone sequence contains all the information required parallel each other. However, these two techni- for proper folding in a given solvent. This ques provide information that is averaged over observation has led to many similar studies not the whole molecule, rather than about local only of ribonuclease but also of hen egg white unfolding. Studies of limited proteolysis at dif- lysozyme, bovine pancreatic trypsin inhibitor ferent temperatures (28, 229-232, 289), on the and other proteins (1, 33, 34, 49, 104, 292, 316). other hand, have indicated the sequence in Most of the subsequent oxidation experiments which the peptide bonds of ribonuclease are have been carried out with a redox pair, such hydrolyzed by trypsin and chymotrypsin, re- as oxidized and reduced glutathione, rather than spectively, in the thermally-induced unfolding with air. transition. Presumably, such limited proteolysis KONISHI et al. 
(139-142, 332) carried out a takes place in those portions of the molecule detailed study of the glutathione-induced regen- which have unfolded. With the aid of the X-ray eration ofribonuclease from the reduced protein structure ofribonuclease (405, 407), it has been by fractionating, isolating, and identifying inter-

Figure 8. Schematic representation of the first five of six identifiable stages for the thermally-induced unfolding of ribonuclease A (the sixth stage, not shown, is the completely unfolded chain). Disulfide bonds connect half-cystine residues 26-84, 40-95, 58-110 and 65-72 (28).

Carlsberg Res. Commun. Vol. 49, p. 1-55, 1984 11 H. A. SCHERAGA:Protein structure

5 m D

0 I I I u I O I I E 3 S2"H" ,, I " I tD 3SIGIH '" ~_ L) 3S2G x-" I

V I (26-84). ZS4H"- I (40- 95)-.~.. " I I m (58- itO)-" 2s'G3" s ,' o~ -10 (65-72) -- I 2S3GIH 2S46 I I I <3 i S~'H~ ,, I I I x\ I I I --" I I I ISI65H. ~ .,. I I ; ,s264~--• .- .... ;~6 _ ~. , , .s3,.. ,s~2. s5,6. , -15 ",, I I I I I

167HTM .~ I I I I I I --~ I I I I 2GGH ~--_365H ---~I ~ 563H...... 6G2H 7GIH 8G 464H

-20

R Nase A Figure 9. Apparent standard state eonformational chemical potentials (relative to that of zero for 4S) of "Intermediates" in the regeneration of ribonuclease A from the reduced protein at 22 ~ and pH 8.2 (141). The horizontal and vertical dashed lines connect species that interconvert by two different types of reactions discussed in reference 135. The corresponding values for native ribonuclease A (N) and denatured ribonuclease A with intact disulfide bonds (D) are plotted on the right side of the figure. See references 141 and 332 for further details. mediates, charaterizing their distributions at and determining the relative stabilities of the various concentrations of oxidized and reduced intermediates and the rate-limiting steps in their glutathiones (GSSG and GSH, respectively), conversion to the native structure. The inter-

12 Carlsberg Res. Commun. Vol. 49, p. 1-55, 1984 H. A. SCHERAGA:Protein structure

30 that there are two general types of pathways for

It. non-native interactions play a significant role IO in the folding pathways and lead to metastable intermediate species. Such non-native inter- c actions must be disrupted or rearranged to nucleate the native interactions [in the rate r 12. limiting step(s)] for the protein to fold. Attempts are currently being made to shift the folding o -4.0 -3.0 -2.0 -I .o mechanism from a rearrangement-type to a Log [GSH] (M) growth-type pathway by introducing an extrin- Figure 10. Percent of fully regenerated ribonudease sic covalent crosslink into the native structure A produced by reaction of the reduced protein with (172, 332). GSSG (at a fixed concentration) and GSH at the It is of interest to point out that K~,f, the variable concentrations shown, at pH 8.22 and 22 ~ equilibrium constant for formation of the native for 60 rain. The symbols represent the total amount conformation from the unfolded form (291), is of ribonuclease regenerated and the amount regen- an extremely large number for native ribonu- erated through each of six specific pathways (140). clease, is a significantly large number (-0.06) for reduced ribonuclease, and is several orders of magnitude smaller for fragments of ribonu- mediates were grouped into 25 classes clease (35). This indicates that reduced ribonu- kSmGnH, where each set of structures in a given clease has local structure (in the antigenic sites class has k intramolecular disulfide bonds (S), to which the antibodies, used to measure I~o.f, m intermolecular disulfide bonds (G) between bind) that is quite similar to that of the native a cysteine residue of the protein and GSH, and molecule. It seems that the folding pathway of n free sulfhydryl groups (H). Their standard state this near-native "unfolded" structure is in- conformational chemical potentials, relative to fluenced by the solvent conditions. 
zero for 4S, were determined from equilibrium The growth-type mechanism, in which native constants among the various classes of inter- interactions play significant roles in the path- mediates kSmGnH, and are shown in Figure way, has also been studied theoretically (87, 88, 9; the values for the native protein (N) and the 366, 392). As proposed by TANAKA and denatured protein with the native disulfide SCHERAGA (366), residues that are nearby in bonds intact (D) are also shown in Figure 9. the amino acid sequence form local structures Six different pathways were identified for the (near the diagonal of a triangle map). These regeneration of the native structure from the structures, of which there are several in general, fully reduced material, depending on the con- then merge with nearby ones (represented by centrations of GSSG and GSH. Figure l0 shows regions on the triangle map as one moves suc- the amount of native ribonuclease A formed cessively away from the diagonal) until the in each pathway as a function of the concentra- whole protein is folded. Folding is thus envi- tion of GSH (at a fixed concentration of GSSG). saged as a directed rather than a random-search As a result of these studies, it was proposed process. WAKO and SAITO (392), and GO and


Figure 11. Contact map of ribonuclease S (218). Each point of the map represents the presence (square) or absence (no marking) of a contact between two amino acid residues i and j. Contacts between residues are omitted from the figure whenever |i-j| ≤ 4. The pairs of half-cystine residues forming the disulfide bonds are denoted by black squares. Contact regions (A-M) are bounded by dashed lines. Those contact regions near the diagonal are possible nucleation sites. [Axes: residue number, 0 to 124.]

ABE (88), have generalized this model by allowing the local structures to lie anywhere along the chain, and be of any size, whereas TANAKA and SCHERAGA (366) confine the local structures to those that are most likely to occur because of the way that the amino acids are arranged in the sequence. A Tanaka-Scheraga type triangle contact map for ribonuclease S is shown in Figure 11 (218), and one of the local structures (F) on the diagonal is the primary nucleation site (residues 106-118) predicted by MATHESON and SCHERAGA (182) on the basis of the hydrophobicity of the residues involved. The secondary nucleation site, predicted by the same model (182), is included in residues 61-111, and corresponds to region E in the contact map (218). The early formation of these nucleation sites is also supported by immunochemical experiments (34); i.e., the formation of these interior hydrophobic clusters induces the formation of the (surface) antigenic site in segment 87-104.
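As an illustration of how a contact map of the kind shown in Figure 11 can be tabulated, the following minimal sketch marks a contact whenever two residues lie within a distance cut-off and omits pairs with |i-j| ≤ 4, as in the figure. The use of one representative point per residue, the cut-off values, and the toy hairpin coordinates are assumptions made for the example only; the contact criterion actually used in reference 218 is not specified here.

```python
import math


def contact_map(coords, cutoff=8.0, min_separation=5):
    """Return the set of residue pairs (i, j), i < j, that are in contact.

    coords         -- one (x, y, z) point per residue (e.g. the C-alpha atom)
    cutoff         -- distance threshold in angstroms (assumed value)
    min_separation -- smallest |i - j| reported; 5 omits pairs with |i - j| <= 4
    """
    contacts = set()
    n = len(coords)
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                contacts.add((i, j))
    return contacts


# Toy hairpin (hypothetical coordinates): two antiparallel strands of six
# residues each, 3.8 angstroms between neighbors, strands 5 angstroms apart.
hairpin = [(3.8 * k, 0.0, 0.0) for k in range(6)] + \
          [(3.8 * (5 - k), 5.0, 0.0) for k in range(6)]

pairs = contact_map(hairpin, cutoff=6.0)
```

Contacts far from the diagonal (large |i-j|) correspond to residues that are distant in sequence but close in space; those near the diagonal are the candidates for local nucleation sites discussed above.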


Experiments of the type described above provide information as to how proteins fold. They also give us clues about constraints that can be incorporated into theoretical protein folding algorithms (see section 8.3).

6. SYNTHETIC POLYPEPTIDES AS MODELS FOR PROTEINS

As indicated in section 3, the availability of synthetic polypeptides, and physical chemical methods for studying their properties, led to a deeper understanding of the interactions in polypeptides and proteins (22, 243, 350). With the aid of these synthetic materials, the spectroscopic and deuterium-hydrogen-exchange characteristics of the various conformational elements of proteins (α-helix, β-sheet, β-bend, and disulfide bonds) were elucidated (9, 122, 127, 176, 189, 203, 204, 205, 298, 375, 377-385, 406), and used empirically to investigate protein structure in solution.

Some of the most fruitful results came from experimental and theoretical studies of the helix-coil transition in homopolymers and copolymers of amino acids (21, 240, 345). Largely through the work of BLOUT, DOTY, and KATCHALSKI (19, 20, 130), the existence of the α-helical conformation in synthetic polypeptides, and its conversion to the statistical-coil form by changes in temperature, pH or solvent, were demonstrated. The deduction of theoretical parameters characterizing the interactions in homopolyamino acids was facilitated by the development and application of phenomenological statistical mechanical theories of the helix-coil transition to interpret the data from such experiments. Following SCHELLMAN'S initial treatment (297), ZIMM and BRAGG, LIFSON and ROIG, and others (see reference 263 for a summary) applied primarily matrix and sequence-generating function methods (168, 170, 412) to evaluate the partition function and compute the transition curve. The utility of this combination of theory and experiment was greatly enhanced by extension of the theory to random copolymers of two (2, 69, 159, 169, 262, 389) or more (134) amino acids. The existence of phase-transition-like properties in solutions of polyamino acids (and also polynucleotides; see section 6.2) is related to the range of the interaction potential within the molecule, and these two types of biopolymers differ in the range of interaction and consequently in the sharpness of the transition (70, 258, 259, 315).

While most of the theories summarized in reference 263 pertain to interactions among the backbone atoms of the polypeptide chain, the influence of the side chains in both the helix and coil forms has also been taken into account (16, 257). For example, a hydrophobic interaction between the methyl side chain of the i-th residue and the backbone CαH group of the (i+3)-th residue in the α-helix (215) raises the melting point of an infinite poly(L-alanine) α-helix in water over that of a corresponding polyglycine α-helix by about 100 °C (16).

Other aspects of the role of intramolecular interactions in the helix-coil transition concern a molecular treatment, as opposed to the phenomenological one mentioned above (see section 7.8), and theories of the kinetics of the transition (260, 339).

Recently, an analogous phenomenological matrix treatment has been developed for the formation of intramolecular antiparallel β-sheets from statistical coil (186). This will provide a theoretical basis for future experiments designed to obtain parameters characterizing the tendency of the naturally occurring amino acids to take the β-sheet conformation. A molecular treatment of β-sheets is discussed in section 7.4.

6.1. Determination of σ and s with host-guest random copolymers

As an example of the application of the phenomenological theory to experimental data, we consider the use of host-guest random copolymers to determine the ZIMM-BRAGG (412) nucleation and growth parameters σ and s. Since nearest-neighbor interactions dominate in determining the conformation of a polypeptide in water (307), it is valid (134, 262, 389, 390) to use the theories based on the one-dimensional Ising model to determine the tendency of each of the 20 naturally occurring amino acids to adopt the helix, relative to the coil, conformation. Because most of the amino acid homopolymers are not soluble in water or, if soluble,


Figure 12. Temperature dependence of Zimm-Bragg parameters s for 18 amino acids in water (351). [Axes: s (approximately 0.6 to 1.2) versus temperature (°C), 0 to 60; curve labels recoverable from the plot include Met, Ile, Leu, Trp, Ala, Gln, Val, Glu-, Asp and His.]

do not form α-helices that melt between 0 °C and 100 °C, the σ and s parameters are determined from water-soluble host-guest random copolymers, in which the host is a helical homopolymer which does melt in this temperature range and the guest is usually present to the extent of only <15 or 20%. From the influence of the guest residue on the transition curve of the host homopolymer, it is possible to compute σ and s for the guest residue. Thus far, such experiments have been carried out for 18 of the 20 amino acids, and the temperature dependence of s for these residues is shown in Figure 12 (356).

The curves of Figure 12 express the relative preference of each of the amino acids to adopt the helix conformation at various temperatures. A similar ordering of the amino acids (ignoring the temperature dependence) is obtained from statistical analyses of X-ray structures of proteins, based either on short-range (312) or short-plus-medium-range (397) interactions. The general agreement of the ordering, obtained from these two very different types of analysis (i.e. from melting curves of synthetic random copolymers, on the one hand, and from protein structures on the other) is itself another validation of the principle of the dominance of nearest-neighbor interactions in these systems. This agreement is also another example of the utility of a physical chemical approach to protein conformation. The data of Figure 12 provide the initial input in protein-folding algorithms (see section 8.3.1).

The physical basis for the data of Figure 12 has been accounted for (83, 85, 86, 143, 144, 165, 309); i.e. the helix-making or helix-breaking tendencies depend on local interactions between a side chain and its own backbone. For example, as shown in Figure 13, the serine


side-chain -OH group can hydrogen bond to the backbone, and the tendency toward formation of such a hydrogen bond (at non-α-helical values of φ or ψ) may be the origin of the helix-breaking character of serine residues that is illustrated in Figure 12. Occasionally, a medium-range interaction plays an important role; e.g. negatively charged glutamic acid is a stronger helix former in proteins [where it tends to interact with nearby positively charged groups (187)] than it is in the host-guest copolymers used to obtain the data of Figure 12. The data of Figure 12 pertain to the intrinsic helix-forming tendency, which can then be modified by additional (longer-range) interactions. An example of such a medium-range electrostatic interaction involving charged glutamic acid appears in the C-peptide of ribonuclease (14).

Figure 13. Illustration of the types of hydrogen bonds between serine and threonine side chains and the backbone (165).

Additional evidence for the dominance of nearest-neighbor interactions comes from a comparison of the experimental melting curves of two-component random copolymers with those computed from σ and s for each of the pure components as homopolymers (390). In a similar way, theoretical melting curves for a three-component random copolymer (computed with the values of σ and s determined from binary host-guest random copolymers) agree with the corresponding experimental curves within experimental error (134).

6.2. Related studies of polynucleotides

Similar phenomenological statistical mechanical theories have been presented for helix-coil transitions in single- and double-stranded polynucleotides (82, 261, 263, 264, 391, 411). In the case of the double-stranded polymer, the entropy of interior loops in the partially-melted conformation plays an important role in determining the character of the phase transition (70, 259).

7. CONFORMATIONAL ENERGY CALCULATIONS ON MODEL SYSTEMS

The experimental determination of distance constraints such as those of Figure 5 led to the formulation of computational procedures to use them (together with empirical potential energy functions) to calculate the structures of polypeptides and proteins (216). The calculations are based on the assumption that the most stable state of a system at constant temperature and pressure is the one with the lowest Gibbs free energy. In this section we discuss the application of this methodology to model systems and, in later sections, to proteins.

7.1. Empirical potential energy functions and methodology

Initially, only hard-sphere potentials were used to determine the sterically allowed regions of conformational space for an oligopeptide (216, 279, 329), but these were later replaced by more detailed empirical potential energy functions (24, 56, 233, 341) which underwent various types of refinement that made use of crystal structures, lattice energies and gas-phase microwave data on small molecules that are prototypes for peptides (102, 111, 206, 225). These are essentially atom-centered pair potentials, with the energy of a molecule being a sum over all pairwise interactions. The latter include


Figure 14. Conformational energy map of N-acetyl-N'-methyl-alanine amide, for a side-chain dihedral angle χ¹ of 60°. Locations of minima are indicated by the filled circles. The contour lines are labelled with the energy ΔE in kcal/mol above the minimum-energy point at values of the backbone dihedral angles (φ, ψ) = (-84°, 79°) (416). [Axes: φ and ψ in degrees, -180 to 180.]

contributions from electrostatic, nonbonded, and hydrogen-bonding interactions, and an intrinsic torsional potential (206, 225, 267); it must be emphasized, however, that it is only the total pair potential, not its ingredients, that has physical meaning. Simplifications of the potentials [e.g. united-atom (61) and united-residue (246) approximations] have been introduced to increase computational speed. Other types of potentials, for example those centered on bonding and non-bonding electrons as well as on nuclei, have also been introduced (211). Conformational entropy effects are included (79, 96), based on an analysis of the vibrational degrees of freedom (78, 89, 91, 306). The influence of the free energy of hydration (including hydrophobic interactions) is incorporated by either of the two empirical procedures discussed in section 4.4.

Polypeptide chains can be generated by matrix methods using dihedral angles for rotation about the single bonds of the backbone (φ, ψ,


Figure 15. Orientation of the side chains of the lowest-energy left- and right-handed α-helices of poly(m-Cl-benzyl-L-aspartate). The arrows represent the directions of the C-Cl, ester, and amide dipoles (409). [Panels: left-handed; right-handed.]

ω) and side chains (χ's) as variables. In this procedure, the bond lengths and bond angles are kept fixed, but with different values assigned to each type of amino acid residue. Generally, the peptide group is maintained in the planar trans conformation, except for X-Pro dipeptide fragments [for energetic reasons that are understood (413)]. However, the option is maintained either to fix or vary the dihedral angle for rotation about the peptide bond. Alternatively, bond lengths, bond angles and dihedral angles are all allowed to vary by using the cartesian coordinates of all of the atoms as variables.

Various procedures are used to minimize the empirical conformational energy. However, the computations (especially on large systems) suffer from the existence of many local minima in the multi-dimensional energy surface, and minimization algorithms lead only to the nearest local minimum, not to the global minimum (the multiple-minima problem; see section 8 for further discussion of this problem). Monte Carlo and molecular dynamics methods are also used. Details of these procedures can be found in several review articles (7, 217, 304-306, 308, 313, 317, 319, 320, 322).

Some examples of conformational energy calculations on model systems are given below.

7.2. The terminally-blocked single residue (24, 165, 280, 341, 387, 416)

The stereochemistry of a polypeptide chain reflects, in large measure, that of the individual amino acids. In order that computations on single residues serve as prototypes for interior residues of a polypeptide chain, we block the N- and C-termini with acetyl and methyl amide groups, respectively. Thus, e.g., the terminally-blocked alanine residue would be CH3CO-NH-CH(CH3)-CO-NHCH3, the conformational energy map for which is shown in Figure 14. Data of the type represented in Figure 14 are available for the 20 naturally occurring amino acids (387), and constitute a data base for building up larger structures (see section 8.1).

7.3. Handedness (twist) of the α-helix

Similar (φ, ψ) maps can be computed for regular (helical) chain structures, i.e., those for which the values of (φ, ψ) are constrained to be the same in every residue. Each point on such a map corresponds to some kind of helical structure, and the right- and left-handed α-helices correspond to narrow low-energy regions around the two points at which (φ, ψ) ≈ (-50°, -55°) and ≈ (50°, 55°), respectively. In fact, one

of the earliest model systems for which conformational energy calculations were carried out was the α-helix in order to determine the interactions that lead to a preference for a right- or left-handed screw sense (233, 341, 408, 409). The presence of a β-carbon in the L-configuration favors right-handedness (233). However, other atoms at the outer end of the side chain can either enhance the preference for right-handedness or overcome this preference and lead to left-handedness (408, 409). The outer atoms of the side chains can have such an effect, by interacting with the backbone, because the preferred conformations of the side chains tend to be longitudinal or transverse, i.e., they lie nearly parallel to the helix axis or they wrap around the helix (408, 409). It has thus been possible to account for right- and left-handedness of poly(γ-benzyl-L-glutamate) and poly(β-benzyl-L-aspartate) α-helices, respectively, and of numerous other α-helices (305). (Also, as shown in Figure 13, side chain - backbone interactions can disrupt the α-helix).

An example of the interactions that influence the helix sense is illustrated in Figure 15. Whereas the aspartate helix is left-handed, the substitution of a chlorine atom in the para position of the benzyl group switches the screw sense to right-handed (106). However, if the chlorine atom is in the ortho or meta positions, the screw sense becomes left-handed again. Figure 15 shows the lowest-energy right- and left-handed helices of the meta derivative, and illustrates the electrostatic interaction between the C-Cl and nearest-backbone-amide dipoles. This interaction, which is attractive for the left-handed helix but repulsive for the right-handed one, plays the dominant role in forcing the helix to be left-handed. The computed helix senses of the ortho-, meta-, and para-derivatives have been verified subsequently by experiment (68, 106).

A surprising result of the calculations was the finding that poly(L-valine) can form a stable (right-handed) helix (233). Earlier experimental results (21) had led to the "rule" that amino acid residues, such as valine and isoleucine, that branch at the β-carbon cannot take on the α-helix conformation. The earlier experiments had been carried out in trifluoroacetic acid, in order to solubilize the polymer, but this is a helix-breaking solvent. When poly(L-valine) was solubilized in a helix-promoting solvent, by incorporating it into a block copolymer, the predicted α-helix conformation (233) was indeed found to be stable (67).

It is worth noting that the stable α-helix conformation is found not only by energy minimization but also by Monte Carlo computations (281).

7.4. Handedness (twist) of the β-sheet

Empirical observations on protein structures had shown that the β-sheet of PAULING and COREY is not flat but has a right-handed twist (37). As for the α-helix, conformational energy calculations yield a low-energy structure having the observed twist, and indicate the interactions that lead to it (38, 39, 41, 42, 330). For example, the right-handed twist of a β-sheet of parallel or anti-parallel poly(L-valine) chains (Figure 16) arises because of side chain - backbone interactions (38). The computations also indicate that parallel β-sheets are more stable than antiparallel ones for some amino acids, but vice versa for some others, in agreement with available experimental observations on β-sheets of these same polyamino acids (42).

7.5. Models of collagen

Some synthetic polymers of the form poly(Gly-X-Y), where X and Y are frequently proline or hydroxyproline, take on the triple-stranded helical structure of collagen whereas others of this general formula do not. With the development of methodology to treat coiled coils (226), it was possible to carry out computations on several of these synthetic models of collagen (see section 8.2).

For example, poly(Gly-Pro-Pro), poly(Gly-Pro-Hypro) and poly(Gly-Pro-Ala) are all found to form low-energy collagen-like structures (199-201), but poly(Gly-Ala-Pro) does not (222). The computed low-energy structure of poly(Gly-Pro-Pro) is shown in Figure 17. All of these computations are in agreement with experiment, not only as far as the overall structure is concerned but even quantitatively, in


Figure 16. Stereo drawings of minimum-energy β-sheets with five CH3CO-(L-Val)~-NHCH3 chains. (A) Antiparallel structure. (B) Parallel structure (38). The parallel structure is more stable than the antiparallel one for poly(L-valine).

predicting the helical parameters obtained by X-ray fiber diffraction. In fact, the non-hydrogen-(i.e. heavy)-atom coordinates of poly(Gly-Pro-Pro) agree with those obtained subsequently from a single-crystal X-ray diffraction study of (Gly-Pro-Pro)10 (228) within an r.m.s. deviation of 0.3 Å (199).

7.6. Packing of helical structures

The interactions between α-helices in isolated, non-crystalline arrays have also been treated by conformational energy computations (40, 43, 330). Several low-energy structures are found, some of which differ from those predicted from a simple knobs-in-holes approach based only

Figure 17. Triple-stranded coiled-coil complex of poly(Gly-Pro-Pro) of lowest energy (199).


Figure 18. Stereo drawing of two CH3CO-(L-Ala),,,-NHCH3 α-helices in the lowest-energy (essentially anti-parallel) packing arrangement (40).

on geometrical considerations (50). Figure 18 shows the most favored arrangement of two poly(L-alanine) α-helices. This antiparallel arrangement, which is influenced largely by the antiparallel dipoles of the helices, is found frequently in globular proteins. The theory developed to treat the interactions between two α-helices (40, 43) is also applicable to the interactions between an α-helix and a β-sheet, and between two β-sheets. Such computations are in progress to determine how interatomic interactions lead to the structures commonly observed in globular proteins.

The same computational methodology is also applicable to the packing of two or more triple-stranded poly(Gly-Pro-Pro) collagen-like structures (212, 330). The results indicate that specific interactions between the triple helices play an important role, favoring parallel packing (as seen in collagen fibers) over antiparallel packings.

7.7. β-bends

Attention was first called to the β-bend structure by VENKATACHALAM who considered this conformation with the aid of the hard-sphere potential, paying particular attention to the presence of a hydrogen bond (388). Bend and non-bend structures are distinguishable in a more general and natural way in terms of the probability distribution P(R) for the distance R between the Cα atoms of residues i and i+3 (for a bend involving residues i+1 and i+2); the function P(R) for globular proteins has two distinct peaks separated by a minimum at R ≈ 7 Å, indicating the existence of a natural division between bend (R < 7 Å) and non-bend (R > 7 Å) structures (417). A detailed analysis of β-bends in proteins was carried out by LEWIS et al. (164, 166) and by CRAWFORD et al. (48). It was demonstrated that the existence of the β-bend, like the α-helix and extended structure, could be understood on the basis of the dominance of nearest-neighbor interactions, and empirical probabilities were computed to predict the locations of β-bends in proteins (164). Subsequently, the various classes of β-bends (164, 166, 388) were shown to constitute a continuum of structures (219, 273, 274, 276) (see section 8.6).

Multiple bends also exist in proteins (124). A double bend is a nonhelical tripeptide sequence in which two successive distances between Cα atoms (i to i+3 and i+1 to i+4) are both < 7 Å, with analogous definitions for higher-order multiple bends. There is a strong preference for single and multiple bends to be right-handed, like the α-helices in proteins. The same type of nonbonded interactions appears to be operative in helical and in bend structures to lead to the right-handed preference.

In order to characterize β-bends spectroscopically, a series of fairly rigid, cyclic peptides were synthesized and shown to take on specific types of β-bend structures, depending on the amino acid sequence (9, 57, 191, 223). These characteristic spectra should be of help in identifying bend structures from the spectroscopic properties of proteins.

7.8. Molecular theory of the helix-coil transition

A molecular theory has been formulated to compute the phenomenological parameters σ and s of the helix-coil transition, using empirical potential functions (95). This required the computation of the free energy of both the helix and coil forms in water (83-86). The calculation of the free energy of the helix was based on the small-vibration harmonic approximation, and that of the coil was based on the nearest-neighbor approximation. The computations were carried out for polyglycine (85), poly(L-alanine) (85), poly(L-valine) (86) and poly(L-isoleucine) (83). Figure 19 shows the agreement between the computed and experimental results for poly(L-valine), wherein the dominant effect of hydrophobic interactions accounts for the increase in s with increasing temperature. A dominant feature that determines the s vs. T behavior in Figure 19 is the difference in hydration between the helical and coil forms. Poly(L-valine) and poly(L-isoleucine) differ considerably in their hydration properties, due to the extra methyl group in the isoleucine side chain, so that the s vs. T curve for poly(L-isoleucine) has the opposite slope of that of poly(L-valine) (83).

Figure 19. Comparison of s vs. T curves for poly(L-valine) in water. The line is a calculated one (86), and the squares are the experimental results (3). The calculated curve matches the observed increase in the values of s with increasing T. [Axes: s (approximately 0.8 to 1.2) versus temperature (°C), 0 to 60.]

In addition to the thermally-induced helix-coil transition, the pH-induced transition [in poly(L-lysine)] has been treated by similar computational methods (112). To obtain a physically realistic estimate of the electrostatic contribution to the free energy, it was necessary to introduce the screening effect of the ionic atmosphere (with a DEBYE-HÜCKEL screening function). It was thus possible to compute the effect of both pH and ionic strength on the transition curve. For example, the computed midpoint of the helix-coil transition in poly(L-lysine) in 0.1 M salt occurs at a degree of ionization of 0.5, compared to an experimental value of 0.35 (112).

Besides the detailed molecular theories discussed above, simpler models have also been used to estimate σ and s. One of these models (184) for estimating σ was based on the assumption that the hydrophobic interactions between nonpolar side chains serve to overcome the entropy loss in nucleation of the α-helix, with electrostatic interactions playing a role only when polar residues are constrained to be near each other. The values of σ computed with this simple model (184), and with another model based on a statistical mechanical theory that includes short- and medium-range interactions (397), agree with experimental values to within an order of magnitude. A similar comparison of calculated and experimental values of s was presented in section 6.1.
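To make the roles of σ and s concrete, the following minimal sketch evaluates the fractional helicity of a homopolymer from the simplified 2x2 nearest-neighbor form of the Zimm-Bragg matrix treatment: a coil residue carries weight 1, a helical residue that follows a helical residue carries weight s, and a helical residue that follows a coil residue carries weight σs. The function name, the chain length and the numerical values of σ and s are assumptions chosen for illustration; they are not parameters taken from the references cited above.

```python
def helix_fraction(n_residues, s, sigma):
    """Fractional helicity of a homopolymer in a 2x2 Zimm-Bragg-type model.

    Statistical weights (simplified nearest-neighbor form, assumed here):
      coil residue                    -> 1
      helix residue after a helix one -> s
      helix residue after a coil one  -> sigma * s
    The chain is taken to be preceded by a coil state.
    """
    # z_h, z_c: summed weights of chains ending in helix / coil.
    # n_h, n_c: the same sums, each weighted by the number of helical residues.
    z_h, z_c = 0.0, 1.0
    n_h, n_c = 0.0, 0.0
    for _ in range(n_residues):
        new_z_h = s * z_h + sigma * s * z_c
        new_z_c = z_h + z_c
        new_n_h = s * n_h + sigma * s * n_c + new_z_h  # one more helical residue
        new_n_c = n_h + n_c
        norm = new_z_h + new_z_c  # rescale each step to avoid overflow
        z_h, z_c = new_z_h / norm, new_z_c / norm
        n_h, n_c = new_n_h / norm, new_n_c / norm
    return (n_h + n_c) / (n_residues * (z_h + z_c))


# Illustrative transition curve for a 100-residue chain with a small sigma.
curve = [(s, helix_fraction(100, s, 1e-4)) for s in (0.8, 0.9, 1.0, 1.1, 1.2)]
```

With σ = 1 the residues are independent and the helicity reduces to s/(1+s); making σ small penalizes helix initiation, which sharpens the transition around s = 1 for long chains.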


Figure 20. View of the minimum-energy crystalline ω-helical form of poly(p-Cl-benzyl-L-aspartate) (74).

7.9. Molecular theory of helix-helix interconversion

The stable crystal arrangements of several homopolymers, including the thermally induced conversion between α- and ω-helical forms of polyamino acids, have been computed (74, 192). These involve intermolecular, crystal-packing degrees of freedom, as well as the internal ones. Whereas poly(p-Cl-benzyl-L-aspartate) exists as a right-handed α-helix in solution and in the crystal, it is possible to convert it to an ω-helix in the crystal. Figure 20 illustrates


Figure 21. Calculated transition curves (361) of poly(L-proline) in n-butyl alcohol: benzyl alcohol at 70 °C. The fraction of form I helix, θ, is plotted against the volume fraction of n-butyl alcohol. The experimental points (75) are shown for degrees of polymerization of 217, 90, 33 and 14, each with its own symbol. The various solid and dashed curves are based on different assumptions made in the computations (361). [Axes: θ, 0 to 1.0, versus f (volume fraction of n-butanol), 0 to 1.0.]

how favorable interchain interactions enable the backbone to adopt the ω-helical form in the crystal (74). Entropy effects play a role in the α → ω conversion as the temperature is raised (74). X-ray fiber-diffraction studies have demonstrated the presence of the ω-helix in crystals of this polymer (359).

Poly(L-proline) can also exist in two helical forms, I and II, with all-cis and all-trans peptide groups, respectively (180). Interconversion between these two helical forms can be induced by changes of solvent (180). For example, form I is stable in n-butanol, and form II in benzyl alcohol. The transition curves (shown in Figure 21), computed from empirical potential functions, also taking the effect of these solvents into account, match experimental data (75) fairly well (361).


8. CONFORMATIONAL ENERGY CALCULATIONS ON POLYPEPTIDES AND PROTEINS

The success of the computations on model systems attests to the reasonableness of the methodology, of the parameters characterizing the potential functions, and of the procedures for incorporating the effect of solvation. We may therefore apply this methodology to polypeptides and proteins. However, as pointed out in section 7.1, the main obstacle to be overcome is the multiple-minima problem. Therefore, the discussion of computational methods for treating polypeptides and proteins is presented in the context of surmounting the multiple-minima problem. For this purpose, we divide polypeptides into three categories: (a) small open-chain and cyclic oligopeptides (leaving until section 8.1 the specification of what "small" means), (b) fibrous proteins (using collagen as a model), and (c) globular proteins. The approach to the multiple-minima problem is different for each category and, indeed, it has been solved for categories (a) and (b), and progress is being made in solving this problem for category (c).

We do not describe the anatomy and taxonomy of protein structure here. Instead, we refer the reader to two reviews of this subject (217, 287).

8.1. Small open-chain and cyclic oligopeptides

The smallest "peptide" is, of course, the terminally-blocked single amino acid residue, e.g. alanine. The conformational energy map for this molecule was presented in Figure 14. It can be seen that there are 8 local minima (indicated by black dots), the global minimum being at (φ, ψ) = (-84°, 79°) (416). Since this map covers the whole conformational space of this molecule, it is clear that the global minimum is included among the various minima, and has thereby been identified; thus, the multiple-minima problem for this molecule has been surmounted. However, since the energies of some of the low-lying minima are close to that of the global minimum, the molecule does not exist as a single (global-minimum) conformation but rather as a Boltzmann-averaged ensemble over the low-lying minima.

We may then build up larger molecules by combinations of single-residue minima, as illustrated in Figure 15 of reference 309, and described in more detail in reference 319. As pointed out in reference 319, this method has some features in common with the real-space renormalization group technique, and is based on the assumption that short-range interactions

Figure 22. Schematic representation of the energies of the various minima in the conformational space of an oligopeptide. In using such a data base to build up larger structures, only those states below a (variable) cut-off are considered (319). [Labels in the schematic: reduction of number of states; variable cut-off.]

Figure 23. Schematic representation of the buildup of larger structures from smaller ones (AB, CD, ...; ABCD, EFGH, ...; or, alternatively, BC, DE, ...; BCDE, FGHI, ...). See references 253, 319 and 345 for strategies for selecting low-energy starting conformations for energy minimization.
[Schematic, for the sequence ABCDEFGHIJKLMNOP...:
first grouping: AB CD EF GH IJ KL MN OP; then ABCD EFGH IJKL MNOP; then ABCDEFGH IJKLMNOP;
alternative grouping: BC DE FG HI JK LM NO PQ; then BCDE FGHI JKLM NOPQ; then BCDEFGHI JKLMNOPQ.]
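The selection and buildup steps represented in Figures 22 and 23 can be sketched as follows. This is a schematic illustration only: each state is a tuple of minimum labels with an energy in kcal/mol, the `interaction` callable is a hypothetical stand-in for the energy minimization of each combined conformation, the default cut-off and the value of RT (about 298 K) are assumed, and the labels and energies in the example are invented rather than taken from any data base.

```python
import math
from itertools import product

RT = 0.593  # kcal/mol at about 298 K (assumed temperature)


def prune(states, cutoff=3.0):
    """Keep only states within `cutoff` kcal/mol of the lowest-energy state."""
    e_min = min(energy for _, energy in states)
    return [(conf, energy) for conf, energy in states if energy - e_min <= cutoff]


def combine(left, right, interaction=lambda a, b: 0.0, cutoff=3.0):
    """Join every state of one fragment with every state of the next, then prune.

    `interaction(a, b)` is a placeholder for the extra energy of the joined
    fragment; in a real calculation each combination would be energy-minimized.
    """
    merged = [(a + b, e_a + e_b + interaction(a, b))
              for (a, e_a), (b, e_b) in product(left, right)]
    return prune(merged, cutoff)


def boltzmann_weights(states, rt=RT):
    """Normalized Boltzmann populations of the retained states."""
    e_min = min(energy for _, energy in states)
    weights = [math.exp(-(energy - e_min) / rt) for _, energy in states]
    total = sum(weights)
    return [w / total for w in weights]


# Invented single-residue minima: (labels, relative energy in kcal/mol).
residue_minima = [(("m1",), 0.0), (("m2",), 0.7), (("m3",), 1.4), (("m4",), 4.2)]

low_lying = prune(residue_minima)               # drops the 4.2 kcal/mol state
dipeptide = combine(low_lying, low_lying)       # 3 x 3 combinations, then pruned
populations = boltzmann_weights(dipeptide)
```

Repeating `combine` on the pruned fragment lists, with different groupings of the residues, mirrors the two alternative buildup schemes of Figure 23; varying `cutoff` is one way to check whether discarded higher-energy states would have mattered.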


Figure 24. Computed (62, 220) (A) and X-ray (121) (B) structures of gramicidin S, showing (among other things) a hydrogen bond between the ornithine side chain and the phenylalanine backbone carbonyl group.

play a dominant role in determining the conformation of a polypeptide or protein.

As mentioned in section 7.2, conformational energy maps of the type shown in Figure 14 are available for the 20 naturally occurring amino acids (387), and serve as a data base for the larger structures. Each of them contains several backbone minima (of the order of 10). The conformational space of a terminally blocked dipeptide, CH3CO-X-Y-NHCH3, where X and Y are amino acid residues, is mapped by taking all possible combinations of low-lying minima for the single residues and minimizing their energies (414, 415). Only those conformations are retained whose energies are smaller than some cut-off value (see Figure 22). This cut-off value is usually taken as 3 kcal/mol above the lowest energy, but it can be varied to make sure that higher-energy states can justifiably be neglected.

As in the real-space renormalization group technique, larger structures are built up in this manner from smaller ones, by minimizing the energies of all possible combinations of low-energy structures of the component fragments and then reducing the total number of states at each stage by cutting off the higher-energy ones (5, 73, 345). This technique is augmented by a ring-closure constraint and, where applicable, by a symmetry constraint for cyclic molecules (62). The formation of larger structures from smaller ones (309, 345) is represented schematically in Figure 23, where two alternative groupings of the smaller fragments are shown. The uniqueness of the final result is tested by varying the groupings of the residues,


as shown in Figure 23, and by varying the cut-off energy of Figure 22 at each stage as larger fragments are built up from smaller ones.

Figure 25. Computed low-energy structures of cyclic hexaglycine, with various kinds of symmetry as indicated (90).

As one example of the application of this method, we cite the computation (62) of the structure of the cyclic decapeptide gramicidin S. The computed structure consists of two antiparallel extended chains, cross-linked with four hydrogen bonds, and connected by β turns at each end. A differential-geometrical analysis (274) of this structure (62) [shown (220) in Figure 24A] indicated that it was in good agreement (275) with a subsequently-determined (121) X-ray crystal structure of a hydrated gramicidin S-urea complex (shown in Figure 24B). When the structure was computed (62) in 1975, it was stated that "the ornithine side chain may be free to occupy more than one rotational state ... (and there is) a hydrogen bond between a δ-NH3+ proton of Orn and the backbone CO of Phe. There is no experimental evidence indicating the existence of this interaction". When the X-ray structure was determined (121) in 1978, it was stated that "there is an intramolecular hydrogen bond ... between ... the ornithine side-chain nitrogen atom and the D-phenylalanine carbonyl oxygen atom which had not been predicted". Figure 24A, which has a computed rotational variant (220) of the previously-computed (62) side-chain conformation of ornithine, clearly shows the predicted hydrogen bond [the original computed rotational variant (62) also possessed such a hydrogen bond, but it had the Orn and Phe partners interchanged].

The computed result of Figure 24A is not a fortuitously-obtained one, but was arrived at by energy minimization from 10,541 starting conformations, with the imposition of C2 symmetry. To provide an idea of the magnitude of the number of possible conformations, we refer to Figure 25 which shows some of the computed low-energy conformations of cyclohexaglycine (90). These are conformations that have some kind of symmetry, and there are still others [obtained by a Monte Carlo procedure (92)] with non-symmetric conformations. A decapeptide such as gramicidin S would have many more low-energy structures than the cyclic hexapeptide of Figure 25. When viewed in this light, it is seen how well the computational


Figure 26. Stereo view of the lowest-energy structure calculated for residues 1-20 of melittin (253).

methodology leads to the fairly unique low-energy conformation of Figure 24A.

Because of the extremely large number of low-energy minima for each component fragment of a polypeptide, the size of gramicidin S at first appeared to be the upper limit that could be treated by this methodology, and hence defined what was meant by "small" in the title of this section. However, recently, by not only eliminating high-energy structures of fragments (as illustrated in Figure 22), but also by reducing the number of low-energy minima further by selecting only those having different backbone conformations [designated as nondegenerate minima (245)], it appears to be possible to treat larger structures in this manner (245). Thus, the build-up of large fragments from small ones (245, 309) (as illustrated in Figure 23) may be possible. In fact, such a calculation has been carried out for the 20-residue membrane-bound portion of a relatively small globular protein, melittin (253). Only two (very similar) low-energy, largely α-helical structures were found (253); one of them is shown in Figure 26. X-ray (374) and NMR (27) structural information is available for melittin either as a tetramer in a crystal or as a monomer bound to micelles, respectively; considering possible environmental effects in either of these forms, the general agreement with experiment is satisfactory. It should be possible to extend this methodology to much larger structures (309, 319).

8.2. Fibrous proteins

The same approach to the multiple-minima problem has been applied to regular-repeating structures poly(Gly-X-Y) which are models of the fibrous protein collagen. The long chain was built up from Gly-X, X-Y and Y-Gly fragments, and conformational equivalence conditions were imposed on each tripeptide within each chain and on each chain. Thus, the energy of the triple-stranded structure becomes a function of the dihedral angles of the Gly-X-Y tripeptide only. The results of energy minimization of triple-stranded complexes (taking both intra- and interchain interactions into account), and a comparison with experiment, are summarized in section 7.5.

The computations also provided an explanation why collagen is triple-stranded and why the α-helix, for example, is generally single-stranded. In the case of collagen, the coiled-coil packing of the triple strand (and its consequent interchain interactions) lowers the energy per tripeptide, over that in the single-strand form, to a sufficient extent to overcome the entropy loss upon association. Apparently, this is usually not the case for the α-helix, unless it can take on (intermolecular) coiled-coil conformations.

The effects of hydration were not included in these computations on collagen-like poly(tripeptides). However, the interchain interactions, which accompany the tight packing in the triple-stranded structure, would exist even if the outer parts of the triple helix were surrounded by water.

The single-stranded structure can take on many more low-energy conformations than the triple-stranded one. Computations on the single-stranded terminally-blocked tetrapeptide Pro-Pro-Gly-Pro (158) demonstrated that there is a very high probability of occurrence of a β-bend at Pro-Gly; the occurrence of this favored


conformation, which can also be accommodated in single-stranded poly(Gly-Pro-Pro), supports the hypothesis (4) that a β-bend at Pro-Gly is required for post-translational hydroxylation of (only) the third residue of the Gly-Pro-Pro sequence during the biosynthesis of collagen, before assembly of the chains into the triple-helical structure.

8.3. Globular proteins

Despite the apparent formidability of extending the above methodology to globular proteins, it now appears to be feasible. For larger proteins than melittin, the computational time involved in this "building-up" procedure can be greatly reduced by introducing constraints that limit the volume of conformational space that has to be searched. If such constraints can limit the conformational space to the potential well in which the native structure lies, then presently available minimization procedures will locate the global minimum. In fact, such a procedure has been used to compute the structures of proteins from a knowledge of the structures of homologous ones (see section 8.5). Less well-defined constraints than those implied by the availability of the structure of a homologous protein may also place the starting conformation close enough to the native structure so that the latter can be reached by energy minimization. These constraints encompass three ranges of interactions, short-, medium-, and long-range (defined on p. 242 of reference 217), and successive approximations proceed from an initial use of a short-range algorithm to subsequent incorporation of medium- and long-range interactions, followed by minimization of the energy of the complete molecule.

8.3.1. Short-range-interaction approximation

Since short-range interactions dominate (307), one may begin the treatment of a globular protein by preliminarily assigning each residue to a region of conformational (φ, ψ) space. Various short-range prediction algorithms, based either on statistical analyses of protein X-ray data or on statistical mechanical procedures using the same data or data from host-guest copolymers (section 6.1), are available for this purpose (30, 44, 97, 129, 143-145, 161-163, 188, 190, 208, 270, 288, 307, 362, 363, 365, 367). For example, the data of Figure 12 (together with corresponding values of σ) provide initial input to predict the location of α-helical structures in the amino acid sequence. Since this is only an approximate procedure, it cannot be expected to yield predictions with 100% accuracy. Nevertheless, the lower accuracy achievable (50-75%) provides a useful starting point, and also a demonstration of the validity of the concept of the dominance of short-range interactions. Even if conformational regions could be predicted with 100% accuracy, however, the resulting structure would not resemble the native one (29) (as illustrated for bovine pancreatic trypsin inhibitor in Figure 27) because the conformational regions are too large for a narrow specification of the precise values of the backbone dihedral angles (φ, ψ); i.e., the conformations of Figures 27A and 27C have the same local structure (in the sense that corresponding residues are in the same local region of a φ, ψ conformational map) but they differ considerably in their long-range interactions.

It has recently been possible to improve the short-range (Ising model) approach by computing the actual values of (φ, ψ) for each residue (60) rather than simply assigning the residues to their respective conformational regions. In this computation, the statistical weights were obtained from empirical conformational energies (section 7.1) instead of from X-ray data. Work is currently in progress to extend this Ising model approach based on empirical energies (60) to longer range.

As pointed out in section 5.2, it is important to realize that even a protein with its disulfide bonds fully reduced has a local conformation that is close to that of the native structure (35). Presumably, the short-range (side chain-backbone) interactions within each residue play a major role in determining the ultimate conformation of each residue in the chain. Only a small extra stabilization free energy from medium- and long-range interactions is required to lead to the precise values of (φ, ψ) of the native structures.


Figure 27. Structural comparisons for bovine pancreatic trypsin inhibitor (29). (A) Spatial structure with correct short-range order, generated by assigning to each residue the average value of the backbone dihedral angles φ and ψ in its appropriate conformational state (obtained from the X-ray structure). Only the positions of the Cα atoms are shown as circles connected by virtual bonds. (B) Distance contours in Å of the structure shown in (A). (C) Spatial structure determined by X-ray crystallography. (D) Distance contours in Å of the structure shown in (C). The numbers in the contours should all be augmented by 2 Å. (Axes of panels B and D: residue number, 1 to 58.)
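The kind of comparison shown in panels (B) and (D) can be sketched in a few lines: compute all pairwise Cα-Cα distances for two virtual-bond chains and measure how much the two distance maps differ. This is only an illustrative sketch. The coordinates are hypothetical four-residue chains with a typical virtual-bond length of 3.8 Å, the function names are invented for this example, and the r.m.s. difference of interresidue distances used here is a stand-in measure, not necessarily the r.m.s. deviation reported in the studies cited in this review.

```python
import math

def distance_map(ca_coords):
    """Pairwise distances between C-alpha positions given as (x, y, z) tuples."""
    n = len(ca_coords)
    return [[math.dist(ca_coords[i], ca_coords[j]) for j in range(n)]
            for i in range(n)]

def rms_distance_difference(map_a, map_b):
    """Root-mean-square difference over all residue pairs i < j."""
    n = len(map_a)
    diffs = [(map_a[i][j] - map_b[i][j]) ** 2
             for i in range(n) for j in range(i + 1, n)]
    return math.sqrt(sum(diffs) / len(diffs))

# Hypothetical four-residue virtual-bond chains (coordinates in angstroms).
extended = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
folded = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]

d_ext = distance_map(extended)
d_fold = distance_map(folded)
print(rms_distance_difference(d_ext, d_fold))
```

In this toy case the nearest-neighbor distances are identical in the two chains, yet the maps differ strongly for residues far apart along the chain, which mirrors the point that two structures can share the same local order and still differ considerably in their long-range interactions.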

8.3.2. Introduction of medium-range interactions

Some of the interaction energies, missing from the short-range algorithms, can be incorporated by considering short fragments in which a central residue interacts with up to four residues on either side of it (29, 245, 266, 309, 397). Thus, in essence, one adopts the procedure implied in Figure 23, using the medium-range interactions to correct (partially) for the shortcomings of the short-range interaction approximation. A dramatic example of the improvement achieved thereby is illustrated in Figure 28, wherein the correctness of the predictions (using statistical weights from protein X-ray data) increases as one extends the range of interaction step by step from one to four residues (397).

Other approaches have also been taken to treat medium-range interactions. A model (referred to in section 5.2) based on maximizing hydrophobic contacts has been used to identify the primary (and secondary) hairpinlike nucleation sites for protein folding (182). Alternatively, α-helices or β-structures (in which breaks are introduced at a later stage of folding) are considered to serve as nucleating conformations


Figure 28. Calculated α-helix and extended-structure probability profiles for flavodoxin (397). The symbols ○, △, □ and a fourth marker represent profiles that take into account interactions up to 1, 2, 3 and 4 neighbors, respectively. The dashed lines are thresholds to predict the conformational state of each residue as either an α-helix (or extended structure) or a coil. The letters H and E indicate the residues that are in the α-helical or extended-structure states, respectively, in the native conformation. The amino acid sequence of flavodoxin is shown in the one-letter representation at the bottom. (Horizontal axis: residue number and sequence; vertical axes: probability, 0 to 1.)

(269). Both models are incorporated in a view of nucleation (276, 277) based on a differential geometric consideration of four- and five-residue structures (273, 274, 276). The naturally occurring amino acids have been divided into two groups, one of which is responsible for the nucleation of helix and bend structures and the other for the nucleation of extended structures. As an example, residues 106-118 have been identified as the primary nucleation site for the folding of bovine pancreatic ribonuclease (182), and experimental evidence supports this suggestion (316); thus, the preservation of the predicted nucleation site serves as a constraint on the folded structure.

8.3.3. Incorporation of long-range interactions

While large structures can be built up from short ones, as indicated in Figure 23, it is essential to introduce long-range interactions to enable the molecule to achieve its proper overall shape, before minimizing the energy of the whole molecule; i.e., it is necessary to avoid the problem inherent in Figure 27A.

One method (196) for introducing long-range interactions divides the conformational space of a protein into classes characterized by different spatial geometric arrangements of the loops formed by disulfide bonds and then randomly selects one or more members of each class for subsequent energy minimization. Because of the limited number of classes and the availability of an array processor (267), such an approach is feasible. It is assumed that one of the conformations selected from the class containing the native structure would emerge as the one of lowest energy. The selected conformations are defined initially with the help of a space-filling

model, thereby avoiding interatomic overlaps right from the beginning. This approach was tested on bovine pancreatic trypsin inhibitor in a preliminary way by selecting two starting conformations - one from the class of the native structure (R) and one from a different class (W). After complete energy minimization of both, the one obtained from R did have a lower energy than that obtained from W (94 compared to 168 kcal/mol), but its r.m.s. deviation from the native structure was 5.9 Å (compared to 8.3 Å for the structure obtained from W). Introduction of the correct short-range conformational information (to the extent of about 85% accuracy), and reminimization of the energy reduced the energy from 94 to 10 kcal/mol and the r.m.s. deviation from 5.9 to 4.4 Å (197).

8.3.4. Use of distance constraints

All of the foregoing procedures appear to be insufficient, by themselves, to obtain a starting structure to which it is worthwhile to apply a complete energy minimization algorithm. Additional information is required, and this can be provided by distance constraints. Such constraints constitute alternative expressions of interaction energies and are operative over all ranges of interaction. Information about distance constraints is obtainable from several experimental sources (25, 193, 301, 303, 309) (see e.g. Figure 5 and section 4.2.1), and from statistical treatments of X-ray structures of proteins (51, 98, 149, 364, 395, 410). This information is incorporated into protein-folding algorithms (45, 98, 149, 393, 394, 396, 410) in various ways. However, as emphasized in section 4.2.1, distance constraints are not enough to determine the folding of a protein (393); they must be included in an algorithm that minimizes the conformational energy.

In a test on bovine pancreatic trypsin inhibitor (396) (see reference 319 for the details), an object function was optimized to obtain a virtual-bond structure (i.e. one represented only by Cα atoms) that best satisfied various distance constraints. Several starting conformations were used, including the one mentioned in section 8.3.3 (197) with an r.m.s. deviation of 4.4 Å. The real chain (197) was converted to a virtual-bond one, and the optimization was carried out. The r.m.s. deviation was thereby reduced from 4.4 to 2.2 Å. Recently, we have solved the problem of converting from a virtual-bond chain back to a real polypeptide chain (271). This will enable the computation to be continued with a full-energy minimization, on a structure which optimally satisfies the distance constraints.

8.3.5. Other aspects of long-range interactions

Several aspects of the folding process have been brought out by considering two-dimensional models (93, 94, 394). These computations, and the three-dimensional one on bovine pancreatic trypsin inhibitor (396), have demonstrated, among other things, that nonspecific interactions stabilize the native structure of a protein if the hydrophobic and hydrophilic residues are mixed in a globular protein, rather than being segregated into blocks, and that there are two types of long-range interactions. One is a specific one that can act only when ordered structures form (by short- and medium-range interactions) and bring the specific pair together. The other is the nonspecific long-range interaction (e.g., those between hydrophobic residues), that can act at any stage of the folding process irrespective of the role played by short- and medium-range interactions. A protein-folding algorithm should include both types of long-range interactions.

There are essentially two ways to incorporate the long-range interactions. One method is that implied by the three-step mechanism of TANAKA and SCHERAGA (360, 366, 368): (a) formation of ordered backbone structures by short-range interactions, (b) formation of small contact regions by medium-range interactions, and (c) association of the small contact regions into the native structure by long-range interactions. In this mechanism, long-range interactions cannot act without the formation of ordered structures by short- and medium-range interactions. Such a folding pathway, which makes use of interactions present in the native structure, is designated as a growth-type process (142) to distinguish it from a rearrangement-type pathway, which initially makes use of

non-native interactions that must subsequently be disrupted during folding to the native structure (see section 5.2). The procedure discussed in section 8.3.3 (196, 197), insofar as it includes the correct disulfide bonds (long-range interactions), embodies the view implied by a growth-type process. To incorporate this mechanism in a distance constraint approach, one could, for example, start by trying to predict the location of helical segments and the contact sites between helices (45, 46); alternatively, one could start by predicting specific pairs that form hydrophobic contacts on the assumption of the existence of hairpinlike structures (149, 182). Specific long-range interactions (interactions between specific pairs of residues distant along the chain) in the lattice model of GO and coworkers (88, 93, 94) are also of the type that is characteristic of the above three-step mechanism, in the following sense. Although such specific pairs can interact when they occupy nearest-neighbor lattice points, the probability that such specific pairs form a contact in a folding simulation is much higher when all residues between the two specific ones take on the native conformation than it is by chance (88, 392).

A second method to treat long-range interactions is based on the hydrophobic and hydrophilic nature of the residues. This is a nonspecific interaction and, on the average, hydrophobic (hydrophilic) amino acids occur more frequently (infrequently) in the interior than on the surface of the protein. Such behavior makes it seem as if a centripetal (centrifugal) force acted on the hydrophobic (hydrophilic) residues. Therefore, a Monte Carlo simulation of protein folding incorporates this type of long-range interaction automatically, and so does the distance constraint approach through the use of mean values of the distances between hydrophobic (hydrophilic) residues (395, 396).

The balance between short- and long-range interactions has also been demonstrated in the "final stages" of the folding of bovine pancreatic trypsin inhibitor (6). By varying the relative weights of short- and long-range interactions, it was shown that both are required to bring remote parts of the protein together and to stabilize its native conformation.

8.4. Refinement of X-ray structures of proteins

In order to use X-ray structures of proteins to elucidate their function, it is necessary to know the positions of their constituent atoms much more accurately than can be achieved by even the (present-day) highest-resolution protein-structure-determination methods. Minimization of the conformational energy provides a way to realize such an improvement of an X-ray structure. Both reciprocal-space (125) and

Figure 29. Comparison of sections of electron density maps around Arg-42 of bovine pancreatic trypsin inhibitor, projected into the x,y plane (72). (Left) Map calculated by using the experimental 2.5-Å phases. (Right) Map calculated by using phases from structure factor calculation.


Figure 30. Stereo view of computed structure of α-lactalbumin (402).

real-space (58, 71, 72, 358, 398-401) procedures have been used for this purpose. Since one starts with the experimentally-determined coordinates, the multiple-minima problem is not as serious as for an ab initio computation of protein structure from an arbitrary starting point; i.e. the starting conformation is close to the potential well of the native structure. Energy minimization serves to eliminate steric overlaps and provide a low-energy structure. This procedure has been applied to actinomycin D (265), rubredoxin (284), lysozyme (399), cytochrome c (400), and bovine pancreatic trypsin inhibitor (71, 72, 358).

In the case of bovine pancreatic trypsin inhibitor, the energy constraints were used to obtain a high-resolution structure from low-resolution data (71, 72). The procedure was applied (72) to 2.5 Å-resolution data for this protein (120) to obtain a structure that was a satisfactory approximation to the one that had been obtained by use of 1.5 Å-resolution data (54). Figure 29 illustrates how the energy-refined map on the right (72) identifies more of the electron density of Arg-42 than the map on the left (120).

8.5. Calculation of structures of homologous proteins

The multiple-minima problem is also not as serious in the application of conformational energy calculations to obtain the structure of a protein from that of a homologous one, again because the starting conformation is close to the potential well of the native structure. The amino-acid sequence of the unknown structure and the X-ray coordinates of the homologous structure provide a starting point that usually is of high energy because of steric overlaps. However, these are relieved during the course of energy minimization. This procedure has been applied to compute the structure of α-lactalbumin (402) from that of lysozyme and those of three snake venom inhibitors (358) from that of bovine pancreatic trypsin inhibitor. For each protein (358, 402), the results are expressed as cartesian coordinates from which a model can be constructed; see, e.g., the model of the computed structure of α-lactalbumin in Figure 30. The same methodology is currently being used to try to obtain a low-energy structure of thrombin (59) from those of trypsin, chymotrypsin and elastase, and is applicable to other families of homologous proteins, e.g. the immunoglobulins.

In the case of α-lactalbumin, WARME et al. (402) cited various lines of experimental evidence (including the environment of tryptophan residues and the N-terminal amino group) in support of the computed structure. More recently, BERLINER and KAPTEIN (13) have obtained evidence which is compatible with the


predicted conclusions about tryptophan resi- section 8.1, it was demonstrated (275) that the dues, and GERKEN (76) reported a high pK, predicted and observed structures ofgramicidin for the N-terminal amino group which "sup- S are very similar, and that all divergences in ports its interaction in an ion pair as proposed backbone conformation can be accounted for by WARME et al.". by intermolecular interactions in the crystal, which arise from the formation of a very in- 8.6. Differential geometry and protein folding teresting intermolecular four-stranded [5-sheet The formation of nuclei and subsequent fold- (see Figure 3 of reference 275). ing implies the existence of a hierarchy of time and length scales which characterize the folding process. It further suggests that these two scales 8.7. Enzyme-substrate interactions are correlated - i.e., those structures which form In this section, we consider an example of in the shortest times are those characterized by protein function viz. one aspect of enzyme the shortest length scale and larger structures action. Both experimental and computational are formed at later times in the folding process. methods are being used to elucidate the nature The various characteristic structures of proteins of the interactions in the initial binding step vary in length, and procedures are required for between an enzyme and a substrate. One expe- their description. The usual representation on rimental approach was discussed in section a (0, ~) map (Figure 14) is a description on 4.2.2, and we consider here the computational a (terminally-blocked) single-residue scale. In method. order to extend the length scale to four or five Such conformational energy calculations residues and achieve, among other things, in- have been carded out on enzyme- substrate sight into the conformational characteristics of complexes, viz. 
those between u-chymotrypsin 13-bends, a differential-geometric representation and oligopeptides (249, 255,256, 323, 33 l) and of protein structure has been developed (273, between lysozyme and polymers and copo- 274, 276-278). The virtual-bond backbone lymers of N-acetyl-glucosamine (GlcNAc) and (which depends on four C~'s) can be described N-acetylmuramic acid (MurNAc) (247-252, in terms of~ and ~ (the curvature and torsion, 323, 331 ). This is essentially a docking problem, respectively, at each C~. The characteristic struc- where, in the last analysis, a flexible substrate tural elements of proteins lie in different regions is allowed to approach a flexible enzyme (explor- of the (K, ~) plane, and I~-bends, e.g., form a ing the whole conformational space of the sub- continuum of structures from those that resem- strate and of the active site of the enzyme) to ble right-handed helices, through essentially flat find the most stable binding disposition, as the bends, to those that resemble left-handed helices one for which the sum of the intermolecular (274). The differential-geometric analysis of energy and the intramolecular energies of both protein structures provides a unifying frame- the enzyme and substrate is a minimum. Each work for the proposed nucleation mechanisms partner in the complex affects the conformation and shows that there is relatively low selectivity of the other as the total energy is minimized. in nucleation on a scale of four C~'s, and that In the case oflysozyme, two low-energy struc- only certain combinations of four-C~ structures tures of the complex with a hexasaccharide occur on a scale of five C~'s; further, a higher substrate (GlcNAc)6 were found (247) (Figure degree of selectivity in nucleation appears on 31). These are a right-sided complex with a the five-C~ length scale than on the four-C~ scale. 
distorted (half-chair) conformation for the D The differential geometry representation also residue of the substrate and a left-sided undis- provides a useful method for comparing two torted complex. The left-sided complex is the protein structures. The similarity between local more stable one, and the position of the D regions of the two proteins is assessed in terms residue, near the surface of the active site, is of a distance function based on the (r, ~) behav- compatible with experimental results of ior of each. This procedure has been applied SCHINDLER et al. (337). The preference for the to several protein pairs; e.g., as pointed out in predicted left-sided binding mode over the right-

36 Carlsberg Res. Commun. Vol. 49, p. 1-55, 1984 H. A. SCHERAGA:Protein structure

Figure 31. Stereo views of space-filling models (247) off (top) the active site of native hen egg white lysozyme; (middle) energy-minimized model-built hexamer [(GIcNAc)6]bound to the active site (right-sided mode); and (bottom) ~owest-energy hexamer bound to the active site (left-sided mode). See reference 247 for color code.

sided one has recently been verified (346) by experiments involving competition between oligosaccharides and monoclonal antibodies for binding to hen egg white lysozyme and by measurements of the MICHAELIS-MENTEN constant KM for hydrolysis of (GlcNAc)6 by hen egg white lysozyme and by a homologous lysozyme from ringed neck pheasant, respectively. The latter has several different amino acids in the right-sided binding site (see Figure 32). This exercise is a nice illustration of the symbiosis between theory and experiment, in which the experiment was guided by a need to check a theoretical prediction.

The calculations have also been extended to include the binding conformations of alternating copolymers of GlcNAc and MurNAc at the active site of the enzyme. The calculated

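This section concludes that the binding energy is a sum over many small nonbonded contributions, identified by examining all pair interactions in the computed lowest-energy structure. A minimal sketch of such a pair-by-pair decomposition follows. The functional form (a generic 6-12 term plus a Coulomb term), the parameter values, and the atom coordinates and partial charges are all hypothetical and chosen only to make the example runnable; they are not taken from the calculations of references 247-252.

```python
import math

def pair_energy(atom_i, atom_j, eps=0.1, r_min=3.5,
                coulomb_k=332.0, dielectric=4.0):
    """Energy (kcal/mol) of one enzyme-substrate atom pair.

    Each atom is (x, y, z, partial_charge), distances in angstroms.
    All parameters are illustrative assumptions.
    """
    xi, yi, zi, qi = atom_i
    xj, yj, zj, qj = atom_j
    r = math.sqrt((xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2)
    lennard_jones = eps * ((r_min / r) ** 12 - 2.0 * (r_min / r) ** 6)
    coulomb = coulomb_k * qi * qj / (dielectric * r)
    return lennard_jones + coulomb

# Hypothetical coordinates and charges, for illustration only.
enzyme_atoms = [
    (0.0, 0.0, 0.0, -0.30),
    (3.8, 0.0, 0.0, 0.20),
    (0.0, 3.8, 0.0, 0.10),
    (3.8, 3.8, 0.0, -0.20),
]
substrate_atoms = [
    (1.0, 1.0, 3.6, 0.25),
    (2.8, 1.2, 3.7, -0.15),
    (1.9, 2.9, 3.5, 0.10),
]

# One entry per enzyme-substrate pair.
contributions = {
    (i, j): pair_energy(e, s)
    for i, e in enumerate(enzyme_atoms)
    for j, s in enumerate(substrate_atoms)
}
total = sum(contributions.values())

# Rank pairs from most to least favorable.
ranked = sorted(contributions.items(), key=lambda kv: kv[1])
for (i, j), energy in ranked:
    print(f"enzyme atom {i} - substrate atom {j}: {energy:8.3f} kcal/mol")
print(f"total intermolecular energy: {total:8.3f} kcal/mol")
```

Listing every pair alongside the total makes it possible to see how the intermolecular energy is distributed over individual contacts, rather than attributing it to one or two interactions in advance.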

Figure 32. Stereo view (346) of the computed structure (247) of the low-energy complex between (GlcNAc)6 and hen egg white lysozyme (same complex as in the bottom of Figure 31). The dark shaded regions contain the residues involved in left- and right-sided binding, respectively. Even though hen and ringed neck pheasant lysozymes differ in the dark shaded region on the right side, both exhibit similar values of KM towards (GlcNAc)6.

lowest-energy structure for MurNAc.GlcNAc.MurNAc bound to sites B, C, D (248) is in excellent agreement with the X-ray crystal structure of this complex (132). If either the calculated or the X-ray structure is extended into the lower active site with GlcNAc residues, it is found that it binds with its F site residue on the left side of the cleft. The alternating hexamer copolymer (GlcNAc.MurNAc)3, however, binds in a preferred right-sided mode (similar to the homopolymer in the middle of Figure 31) because of the absence of a favorable hydrogen bond in site F between the 3-OH of the substrate (the -OH of GlcNAc is now replaced with a lactic acid side chain in MurNAc) and the backbone C=O of Arg 45; also, the lactic acid side chain makes unfavorable steric contacts with Arg 45. In addition, from an investigation of the allowed binding sites for MurNAc residues, it appears that this residue can be accommodated in sites B, D and F only, in good agreement with experiment (248).

The computations also give information about other interactions in the complex. Approximately one-third of the binding energy for hexasaccharide complexes is attributable to the interactions of the N-acetyl groups of the substrate with the enzyme (mainly with the three active site tryptophans 62, 63 and 108 in sites B, C, D, and through hydrogen bonding with Asn 59 and Ala 107 in site C and with Glu 35 in site E) in agreement with experimental results. Polymers of saccharides without the N-acetyl group (e.g. glucose) are thus predicted to bind with low affinities, again in agreement with experiment (123). The calculated hydrogen bonding scheme (hydrogen bonding accounts for about one-fourth of the total energy of stabilization) is identical in site C to that inferred from X-ray crystallography.

A very important lesson learned from this and other studies is that it is practically impossible to pinpoint one or two specific interactions as being the most significant for the recognition process. Instead, the binding energy that leads to recognition and specificity is a sum over small contributions from many nonbonded interactions (which can be identified from an examination of all pair interactions in the computed lowest-energy structure).

9. CONCLUDING REMARKS

The past thirty years have witnessed a dramatic transition in the view of a protein molecule from a colloidal particle, described as a rigid

ellipsoid of revolution, to a flexible organic molecule with specifiable covalent and three-dimensional structure. By the development and application of X-ray crystallographic and experimental and theoretical physical chemical methods, it has been possible to elucidate the structures of protein molecules and functional complexes thereof in the solid state and in solution, and to determine the nature of the intra- and inter-molecular interactions that give rise to these structures. We thus have acquired considerable knowledge as to how polypeptide chains fold into the native conformations of proteins and then interact with other molecules to express their biological function. Many of the experimental studies of protein structure and folding pathways in solution have been carried out with ribonuclease, pancreatic trypsin inhibitor and lysozyme. Also, the active sites of several enzymes such as thrombin and carboxypeptidase have been mapped by kinetic and spectroscopic methods. Physical chemical studies of synthetic polypeptide models have provided considerable insight about the interactions in protein systems. Most of the theoretical studies have been carried out with various model systems, with collagen-like poly(tripeptides), with natural collagen, and with bovine pancreatic trypsin inhibitor (a protein of known structure, useful for testing computational procedures). Theoretical studies of enzyme-substrate complexes have dealt with chymotrypsin-plus-oligopeptide and lysozyme-plus-oligosaccharide systems. While dramatic results have been achieved, there is still much more to be learned about the details of protein-folding pathways. Also, we are just beginning to probe the dynamic processes involved in proteins and protein complexes. The experimental and theoretical methods developed over the past thirty years, and developments that will undoubtedly occur in the foreseeable future, e.g. the incorporation of site-specific mutations by genetic engineering techniques, should further increase our understanding of protein structure and function.

REFERENCES

1. ACHARYA, A. S. & H. TANIUCHI: Implication of the structure and stability of disulfide intermediates of lysozyme on the mechanism of renaturation. Mol. Cell. Biochem. 44, 129-148 (1982)
2. ALLEGRA, G.: The calculation of average functions of local conformations for a non-interacting copolymer system with neighbor interactions. J. Polymer Sci., Part C, 16, 2815-2824 (1967)
3. ALTER, J. E., R. H. ANDREATTA, G. T. TAYLOR & H. A. SCHERAGA: Helix-coil stability constants for the naturally occurring amino acids in water. VIII. Valine parameters from random poly(hydroxypropylglutamine-co-L-valine) and poly(hydroxybutylglutamine-co-L-valine). Macromolecules 6, 564-570 (1973)
4. ANANTHANARAYANAN, V. S.: Conformational criteria for, and consequence of proline hydroxylation in collagen. In: Conformation in Biology, eds. Srinivasan, N. & R. H. Sarma, Adenine Press, Guilderland, New York, pp. 99-111 (1983)
5. ANDERSON, J. S. & H. A. SCHERAGA: Conformational energy calculations on the contraceptive tetrapeptide H-Thr-Pro-Arg-Lys-OH. Macromolecules 11, 812-819 (1978)
6. ANDERSON, J. S. & H. A. SCHERAGA: Effect of short- and long-range interactions on protein folding. J. Protein Chem. 1, 281-304 (1982)
7. ANFINSEN, C. B. & H. A. SCHERAGA: Experimental and theoretical aspects of protein folding. Adv. Protein Chem. 29, 205-300 (1975)
8. ANFINSEN, C. B., E. HABER, M. SELA & F. H. WHITE, JR.: The kinetics of formation of native ribonuclease during oxidation of the reduced polypeptide chain. Proc. Natl. Acad. Sci., U. S. 47, 1309-1314 (1961)
9. BANDEKAR, J., D. J. EVANS, S. KRIMM, S. J. LEACH, S. LEE, J. R. McQUIE, E. MINASIAN, G. NÉMETHY, M. S. POTTLE, H. A. SCHERAGA, E. R. STIMSON & R. W. WOODY: Conformations of cyclo(L-alanyl-L-alanyl-ε-aminocaproyl) and of cyclo(L-alanyl-D-alanyl-ε-aminocaproyl); Cyclized dipeptide models for specific types of β-bends. Intntl. J. Peptide Protein Res. 19, 187-205 (1982)
10. BANERJEE, K. & M. A. LAUFFER: Polymerization-depolymerization of tobacco mosaic virus protein. VI. Osmotic pressure studies of early stages of polymerization. Biochemistry 5, 1957-1964 (1966)
11. BEN-NAIM, A.: "Water and aqueous solutions, introduction to a molecular theory", Plenum, New York, pp. 1-474 (1974)
12. BERGER, A. & K. LINDERSTRØM-LANG: Deuterium exchange of poly-DL-alanine in aqueous


solution. Arch. Biochem. Biophys. 69, 106-118 (1957)
13. BERLINER, L. J. & R. KAPTEIN: Nuclear magnetic resonance characterization of aromatic residues of α-lactalbumins. Laser photochemically induced dynamic nuclear polarization nuclear magnetic resonance studies of surface exposure. Biochemistry 20, 799-807 (1981)
14. BIERZYNSKI, A., P. S. KIM & R. L. BALDWIN: A salt bridge stabilizes the helix formed by isolated C-peptide of RNase A. Proc. Natl. Acad. Sci., U. S. 79, 2470-2474 (1982)
15. BIGELOW, C. C. & M. OTTESEN: Spectrophotometric titration of two ribonuclease derivatives. Biochim. Biophys. Acta 32, 574-575 (1959)
16. BIXON, M., H. A. SCHERAGA & S. LIFSON: Effect of hydrophobic bonding on the stability of poly-L-alanine helices in water. Biopolymers 1, 419-429 (1963)
17. BLACKBURN, P. & S. MOORE: Pancreatic ribonuclease. In: The Enzymes (3rd Ed.), ed. P. D. Boyer, Academic Press, New York, Vol. XV, pp. 317-433 (1982)
18. BLAKE, C. C. F., D. F. KOENIG, G. A. MAIR, A. C. T. NORTH, D. C. PHILLIPS & V. R. SARMA: Structure of hen egg-white lysozyme. Nature 206, 757-761 (1965)
19. BLOUT, E. R.: Synthesis and chemical properties of polypeptides. In: Polyamino Acids, Polypeptides, and Proteins, ed. Stahmann, M. A., Univ. of Wisconsin Press, Madison, pp. 3-11 (1962)
20. BLOUT, E. R., R. H. KARLSON, P. DOTY & B. HARGITAY: Polypeptides. I. The synthesis and the molecular weight of high molecular weight polyglutamic acids and esters. J. Am. Chem. Soc. 76, 4492-4493 (1954)
21. BLOUT, E. R., C. DE LOZE, S. M. BLOOM & G. D. FASMAN: The dependence of the conformations of synthetic polypeptides on amino acid composition. J. Am. Chem. Soc. 82, 3787-3789 (1960)
22. BLOUT, E. R., F. A. BOVEY, M. GOODMAN & N. LOTAN: Peptides, polypeptides, and proteins, John Wiley, New York, pp. 1-644 (1974)
23. BRANDTS, J. F.: The nature of the complexities in the ribonuclease conformational transition and the implications regarding clathrating. J. Am. Chem. Soc. 87, 2759-2760 (1965)
24. BRANT, D. A. & P. J. FLORY: The configuration of random polypeptide chains. II. Theory. J. Am. Chem. Soc. 87, 2791-2800 (1965)
25. BRAUN, W., C. BOSCH, L. R. BROWN, N. GO & K. WÜTHRICH: Combined use of proton-proton Overhauser enhancements and a distance geometry algorithm for determination of polypeptide conformations. Biochim. Biophys. Acta 667, 377-396 (1981)
26. BREDDAM, K., F. WIDMER & J. T. JOHANSEN: Carboxypeptidase Y catalyzed C-terminal modification in the B-chain of porcine insulin. Carlsberg Res. Commun. 46, 361-372 (1981)
27. BROWN, L. R., W. BRAUN, A. KUMAR & K. WÜTHRICH: High resolution nuclear magnetic resonance studies of the conformation and orientation of melittin bound to a lipid-water interface. Biophys. J. 37, 319-328 (1982)
28. BURGESS, A. W. & H. A. SCHERAGA: A hypothesis for the pathway of the thermally-induced unfolding of bovine pancreatic ribonuclease. J. Theor. Biol. 53, 403-420 (1975)
29. BURGESS, A. W. & H. A. SCHERAGA: Assessment of some problems associated with prediction of the three-dimensional structure of a protein from its amino-acid sequence. Proc. Natl. Acad. Sci., U. S. 72, 1221-1225 (1975)
30. BURGESS, A. W., P. K. PONNUSWAMY & H. A. SCHERAGA: Analysis of conformations of amino acid residues and prediction of backbone topography in proteins. Israel J. Chem. 12, 239-286 (1974)
31. BURGESS, A. W., L. I. WEINSTEIN, D. GABEL & H. A. SCHERAGA: Immobilized carboxypeptidase A as a probe for studying the thermally induced unfolding of bovine pancreatic ribonuclease. Biochemistry 14, 197-200 (1975)
32. CANFIELD, R. E.: The amino acid sequence of egg white lysozyme. J. Biol. Chem. 238, 2698-2707 (1963)
33. CHAVEZ, L. G., JR. & H. A. SCHERAGA: Immunological determination of the order of folding of portions of the molecule during air oxidation of reduced ribonuclease. Biochemistry 16, 1849-1856 (1977)
34. CHAVEZ, L. G., JR. & H. A. SCHERAGA: Folding of ribonuclease, S-protein, and Des(121-124)-ribonuclease during glutathione oxidation of the reduced proteins. Biochemistry 19, 996-1004 (1980)
35. CHAVEZ, L. G., JR. & H. A. SCHERAGA: Intrinsic stabilities of portions of the ribonuclease molecule. Biochemistry 19, 1005-1012 (1980)
36. CHEN, M. C. & R. C. LORD: Laser Raman spectroscopic studies of the thermal unfolding of ribonuclease A. Biochemistry 15, 1889-1897 (1976)
37. CHOTHIA, C.: Conformation of twisted β-pleated sheets in proteins. J. Mol. Biol. 75, 295-302 (1973)
38. CHOU, K. C. & H. A. SCHERAGA: Origin of the right-handed twist of β-sheets of poly(L-Val) chains. Proc. Natl. Acad. Sci., U. S. 79, 7047-7051 (1982)


39. CHOU, K. C., M. POTTLE, G. NÉMETHY, Y. UEDA & H. A. SCHERAGA: Structure of β-sheets. Origin of the right handed twist and of the increased stability of antiparallel over parallel sheets. J. Mol. Biol. 162, 89-112 (1982)
40. CHOU, K. C., G. NÉMETHY & H. A. SCHERAGA: Energetic approach to the packing of α-helices. 1. Equivalent helices. J. Phys. Chem. 87, 2869-2881 (1983)
41. CHOU, K. C., G. NÉMETHY & H. A. SCHERAGA: Role of interchain interactions in the stabilization of the right-handed twist of β-sheets. J. Mol. Biol. 168, 389-407 (1983)
42. CHOU, K. C., G. NÉMETHY & H. A. SCHERAGA: Effect of amino acid composition on the twist and the relative stability of parallel and antiparallel β-sheets. Biochemistry 22, 6213-6221 (1983)
43. CHOU, K. C., G. NÉMETHY & H. A. SCHERAGA: Energetic approach to the packing of α-helices. 2. Non-equivalent helices. J. Am. Chem. Soc., in press
44. CHOU, P. Y. & G. D. FASMAN: Conformational parameters for amino acids in helical, β-sheet, and random coil regions calculated from proteins. Biochemistry 13, 211-245 (1974)
45. COHEN, F. E. & M. J. E. STERNBERG: On the use of chemically derived distance constraints in the prediction of protein structure with myoglobin as an example. J. Mol. Biol. 137, 9-22 (1980)
46. COHEN, F. E., T. J. RICHMOND & F. M. RICHARDS: Protein folding: Evaluation of some simple rules for the assembly of helices into tertiary structures with myoglobin as an example. J. Mol. Biol. 132, 275-288 (1979)
47. COHN, E. J. & J. T. EDSALL: Proteins, amino acids and peptides. Reinhold Publishing Corp., New York (1943)
48. CRAWFORD, J. L., W. N. LIPSCOMB & C. G. SCHELLMAN: The reverse turn as a polypeptide conformation in globular proteins. Proc. Natl. Acad. Sci., U. S. 70, 538-542 (1973)
49. CREIGHTON, T. E.: Energetics of folding and unfolding of pancreatic trypsin inhibitor. J. Mol. Biol. 113, 295-312 (1977)
50. CRICK, F. H. C.: The packing of α-helices: Simple coiled-coils. Acta Cryst. 6, 689-697 (1953)
51. CRIPPEN, G. M.: Correlation of sequence and tertiary structure in globular proteins. Biopolymers 16, 2189-2201 (1977)
52. DEBYE, P.: Osmotische Zustandsgleichung und Aktivität verdünnter starker Elektrolyte. Physik. Zeitschr. 25, 97-107 (1924)
53. DEBYE, P. & E. HÜCKEL: Zur Theorie der Elektrolyte, I. Gefrierpunktserniedrigung und verwandte Erscheinungen. Physik. Zeitschr. 24, 185-206 (1923)
54. DEISENHOFER, J. & W. STEIGEMANN: Crystallographic refinement of the structure of bovine pancreatic trypsin inhibitor at 1.5 Å resolution. Acta Cryst. B 31, 238-250 (1975)
55. DENTON, J. B., Y. KONISHI & H. A. SCHERAGA: Folding of ribonuclease A from a partially disordered conformation. Kinetic study under folding conditions. Biochemistry 21, 5155-5163 (1982)
56. DESANTIS, R., E. GIGLIO, A. M. LIQUORI & A. RIPAMONTI: Van der Waals interaction and the stability of helical polypeptide chains. Nature 206, 456-458 (1965)
57. DESLAURIERS, R., D. J. EVANS, S. J. LEACH, Y. C. MEINWALD, E. MINASIAN, G. NÉMETHY, I. D. RAE, H. A. SCHERAGA, R. L. SOMORJAI, E. R. STIMSON, J. W. VAN NISPEN & R. W. WOODY: Conformation of cyclo(L-alanylglycyl-ε-aminocaproyl), a cyclized dipeptide model for a β-bend, 2. Synthesis, nuclear magnetic resonance, and circular dichroism measurements. Macromolecules 14, 985-996 (1981)
58. DIAMOND, R.: A real-space refinement procedure for proteins. Acta Cryst. A 27, 436-452 (1971)
59. DUDEK, M., D. TIMMS & H. A. SCHERAGA: Work in progress.
60. DUNFIELD, L. G. & H. A. SCHERAGA: Ising model treatment of short-range interactions in polypeptides and its application to the structure of bovine pancreatic trypsin inhibitor. Macromolecules 13, 1415-1428 (1980)
61. DUNFIELD, L. G., A. W. BURGESS & H. A. SCHERAGA: Energy parameters in polypeptides. 8. Empirical potential energy algorithm for the conformational analysis of large molecules. J. Phys. Chem. 82, 2609-2616 (1978)
62. DYGERT, M., N. GO & H. A. SCHERAGA: Use of a symmetry condition to compute the conformation of Gramicidin S. Macromolecules 8, 750-761 (1975)
63. EDSALL, J. T.: The size, shape and hydration of protein molecules. In: The Proteins, eds. H. Neurath & K. Bailey, Academic Press, New York, vol. IB, 549-726 (1953)
64. ENDRES, G. F. & H. A. SCHERAGA: Equilibria in the fibrinogen-fibrin conversion. VII. On the mechanism of the reversible polymerization of fibrin monomer. Biochemistry 5, 1568-1577 (1966)
65. ENDRES, G. F. & H. A. SCHERAGA: Equilibria in the fibrinogen-fibrin conversion. VIII. Polymerization of acceptor-modified fibrin monomer. Biochemistry 7, 4219-4226 (1968)
66. ENDRES, G. F., S. EHRENPREIS & H. A. SCHERAGA: Equilibria in the fibrinogen-fibrin conversion.


VI. Ionization changes in the reversible polymerization of fibrin monomer. Biochemistry 5, 1561-1567 (1966)
67. EPAND, R. F. & H. A. SCHERAGA: Conformations of poly-L-valine in solution. Biopolymers 6, 1551-1571 (1968)
68. ERENRICH, E. H., R. H. ANDREATTA & H. A. SCHERAGA: Experimental verification of predicted helix sense of two polyamino acids. J. Am. Chem. Soc. 92, 1116-1119 (1970)
69. FINK, T. R. & D. M. CROTHERS: Comparison of several calculations of helix-coil transitions in heterogeneous polymers. Biopolymers 6, 863-871 (1968)
70. FISHER, M. E.: Effect of excluded volume on phase transitions in biopolymers. J. Chem. Phys. 45, 1469-1473 (1966)
71. FITZWATER, S. & H. A. SCHERAGA: A model-building procedure with particular application to proteins. Acta Cryst. A 36, 211-219 (1980)
72. FITZWATER, S. & H. A. SCHERAGA: Combined-information protein structure refinement: Potential energy-constrained real-space method for refinement with limited diffraction data. Proc. Natl. Acad. Sci., U. S. 79, 2133-2137 (1982)
73. FITZWATER, S., Z. I. HODES & H. A. SCHERAGA: Conformational energy study of tuftsin. Macromolecules 11, 805-811 (1978)
74. FU, Y. C., R. F. McGUIRE & H. A. SCHERAGA: Intermolecular potentials from crystal data. V. Crystal packing of poly[β-(p-chlorobenzyl)-L-aspartate]. Macromolecules 7, 468-480 (1974)
75. GANSER, V., J. ENGEL, D. WINKLMAIR & G. KRAUSE: Cooperative transition between two helical conformations in a linear system: Poly-L-proline I ⇌ II. I. Equilibrium studies. Biopolymers 9, 329-352 (1970)
76. GERKEN, T. A.: 13C NMR studies of the solution structure of 13C methylated α-lactalbumin. Fed. Proc. 42, 2001 (1983)
77. GIBSON, K. D. & H. A. SCHERAGA: Minimization of polypeptide energy. I. Preliminary structures of bovine pancreatic ribonuclease S-peptide. Proc. Natl. Acad. Sci., U. S. 58, 420-427 (1967)
78. GIBSON, K. D. & H. A. SCHERAGA: Minimization of polypeptide energy. V. Theoretical aspects. Physiol. Chem. & Phys. 1, 109-126 (1969)
79. GIBSON, K. D. & H. A. SCHERAGA: Minimization of polypeptide energy. VII. Second derivatives and statistical weights of energy minima for deca-L-alanine. Proc. Natl. Acad. Sci., U. S. 63, 242-245 (1969)
80. GILBERT, W. A., R. C. LORD, G. A. PETSKO & T. J. THAMANN: Laser-Raman spectroscopy of biomolecules. 16. Temperature dependence of the conformation of crystalline ribonuclease A from X-ray diffraction and Raman spectroscopy. J. Raman Spectr. 12, 173-179 (1982)
81. GINSBERG, A. & W. R. CARROLL: Some specific ion effects on the conformation and thermal stability of ribonuclease. Biochemistry 4, 2159-2174 (1965)
82. GO, M.: Statistical mechanics of biopolymers and its application to the melting transition of polynucleotides. J. Phys. Soc. Japan 23, 597-608 (1967)
83. GO, M. & H. A. SCHERAGA: Molecular theory of the helix-coil transition in polyamino acids. V. Explanation of the different conformational behavior of valine, isoleucine and leucine in aqueous solution. Biopolymers, in press
84. GO, M., N. GO & H. A. SCHERAGA: Molecular theory of the helix-coil transition in polyamino acids. II. Numerical evaluation of s and σ for polyglycine and poly-L-alanine in the absence (for s and σ) and presence (for σ) of solvent. J. Chem. Phys. 52, 2060-2079 (1970)
85. GO, M., N. GO & H. A. SCHERAGA: Molecular theory of the helix-coil transition in polyamino acids. III. Evaluation and analysis of s and σ for polyglycine and poly-L-alanine in water. J. Chem. Phys. 54, 4489-4503 (1971)
86. GO, M., F. T. HESSELINK, N. GO & H. A. SCHERAGA: Molecular theory of the helix-coil transition in poly(amino acids). IV. Evaluation and analysis of s for poly(L-valine) in the absence and presence of water. Macromolecules 7, 459-467 (1974)
87. GO, N.: Theoretical studies of protein folding. Ann. Rev. Biophys. Bioeng. 12, 183-210 (1983)
88. GO, N. & H. ABE: Noninteracting local-structure model of folding and unfolding transition in globular proteins. I. Formulation. Biopolymers 20, 991-1011 (1981)
89. GO, N. & H. A. SCHERAGA: Analysis of the contribution of internal vibrations to the statistical weights of equilibrium conformations of macromolecules. J. Chem. Phys. 51, 4751-4767 (1969)
90. GO, N. & H. A. SCHERAGA: Calculation of the conformation of cyclo-hexaglycyl. Macromolecules 6, 525-535 (1973)
91. GO, N. & H. A. SCHERAGA: On the use of classical statistical mechanics in the treatment of polymer chain conformation. Macromolecules 9, 535-542 (1976)
92. GO, N. & H. A. SCHERAGA: Calculation of the conformation of cyclo-hexaglycyl. 2. Application of a Monte Carlo Method. Macromolecules 11, 552-559 (1978)
93. GO, N. & H. TAKETOMI: Studies on protein folding, unfolding and fluctuations by computer simulation. III. Effect of short-range interactions. Intntl. J. Peptide Protein Res. 13, 235-252 (1979)
94. GO, N. & H. TAKETOMI: Studies on protein folding, unfolding and fluctuations by computer simulation. IV. Hydrophobic interactions. Intntl. J. Peptide Protein Res. 13, 447-461 (1979)
95. GO, N., M. GO & H. A. SCHERAGA: Molecular theory of the helix-coil transition in polyamino acids. I. Formulation. Proc. Natl. Acad. Sci., U. S. 59, 1030-1037 (1968)
96. GO, N., P. N. LEWIS & H. A. SCHERAGA: Calculation of the conformation of the pentapeptide cyclo(glycylglycylglycylprolylprolyl). II. Statistical weights. Macromolecules 3, 628-634 (1970)
97. GO, N., P. N. LEWIS, M. GO & H. A. SCHERAGA: A model for the helix-coil transition in specific-sequence copolymers of amino acids. Macromolecules 4, 692-709 (1971)
98. GOEL, N. S. & M. YČAS: On the computation of the tertiary structure of globular proteins. II. J. Theor. Biol. 77, 253-305 (1979)
99. GROSS, E. & B. WITKOP: Nonenzymatic cleavage of peptide bonds: The methionine residues in bovine pancreatic ribonuclease. J. Biol. Chem. 237, 1856-1860 (1962)
100. GUNDLACH, H. G., W. H. STEIN & S. MOORE: The nature of the amino acid residues involved in the inactivation of ribonuclease by iodoacetate. J. Biol. Chem. 234, 1754-1760 (1959)
101. HAGLER, A. T., H. A. SCHERAGA & G. NÉMETHY: Structure of liquid water, statistical thermodynamic theory. J. Phys. Chem. 76, 3229-3243 (1972)
102. HAGLER, A. T., E. HULER & S. LIFSON: Energy functions for peptides and proteins. I. Derivation of a consistent force field including the hydrogen bond from amide crystals. J. Am. Chem. Soc. 96, 5319-5327 (1974)
103. HALL, C. E. & H. S. SLAYTER: The fibrinogen molecule: Its size, shape, and mode of polymerization. J. Biophys. Biochem. Cytol. 5, 11-16 (1959)
104. HANTGAN, R. R., G. G. HAMMES & H. A. SCHERAGA: Pathway of folding of reduced bovine pancreatic ribonuclease. Biochemistry 13, 3421-3431 (1974)
105. HARRINGTON, W. F. & J. A. SCHELLMAN: Evidence for the instability of hydrogen-bonded peptide structures in water, based on studies of ribonuclease and oxidized ribonuclease. Compt. Rend. Trav. Lab. Carlsberg, Ser. Chim. 30, 21-43 (1956)
106. HASHIMOTO, M. & S. ARAKAWA: Studies of poly-β-benzyl-L-aspartate helix. III. Infrared spectra of copolymers of β-benzyl-L-aspartate with β-p-methyl, chloro, cyano, or nitrobenzyl-L-aspartate in a chloroform solution. Bull. Chem. Soc. Japan 40, 1698-1701 (1967)
107. HEINRIKSON, R. L.: On the alkylation of amino acid residues at the active site of ribonuclease. J. Biol. Chem. 241, 1393-1405 (1966)
108. HEINRIKSON, R. L., W. H. STEIN, A. M. CRESTFIELD & S. MOORE: The reactivities of the histidine residues at the active site of ribonuclease toward halo acids of different structures. J. Biol. Chem. 240, 2921-2934 (1965)
109. HERMANS, J., JR. & H. A. SCHERAGA: Structural studies of ribonuclease. V. Reversible change of configuration. J. Am. Chem. Soc. 83, 3283-3292 (1961)
110. HERMANS, J., JR. & H. A. SCHERAGA: Structural studies of ribonuclease. VI. Abnormal ionizable groups. J. Am. Chem. Soc. 83, 3293-3300 (1961)
111. HERMANS, J., D. R. FERRO, J. E. McQUEEN & S. C. WEI: Use of energy calculations to refine and understand structure and function of protein. Jerusalem Symp. Quantum Chem. Biochem. 8, 459-483 (1975)
112. HESSELINK, F. T., T. OOI & H. A. SCHERAGA: Conformational energy calculations. Thermodynamic parameters of the helix-coil transition of poly(L-lysine) in aqueous salt solution. Macromolecules 6, 541-552 (1973)
113. HILL, T. L.: Approximate calculation of the electrostatic free energy of nucleic acids and other cylindrical macromolecules. Arch. Biochem. Biophys. 57, 229-239 (1955)
114. HIRS, C. H. W., S. MOORE & W. H. STEIN: The sequence of the amino acid residues in performic acid-oxidized ribonuclease. J. Biol. Chem. 235, 633-647 (1960)
115. HIRS, C. H. W., M. HALMANN & J. H. KYCIA: Reactivity of certain functional groups in ribonuclease A towards substitution by 1-fluoro-2,4-dinitrobenzene-inactivation of the enzyme by substitution at the lysine residue in position 41. In: Biological Structure and Function (T. W. Goodwin & O. Lindberg, eds.) vol. 1, pp. 42-57, Academic Press, N. Y. (1960)
116. HODES, Z. I., G. NÉMETHY & H. A. SCHERAGA: Model for the conformational analysis of hydrated peptides. Effect of hydration on the conformational stability of the terminally blocked residues of the 20 naturally occurring amino acids. Biopolymers 18, 1565-1610 (1979)
117. HODES, Z. I., G. NÉMETHY & H. A. SCHERAGA: Influence of hydration on the conformational stability and formation of bends in terminally blocked dipeptides. Biopolymers 18, 1611-1634 (1979)


118. HOPFINGER, A. J.: Polymer-solvent interactions for homopolypeptides in aqueous solution. Macromolecules 4, 731-737 (1971)
119. HOWARTH, O. W.: The thermal unfolding of ribonuclease A. A 13C NMR study. Biochim. Biophys. Acta 576, 163-175 (1979)
120. HUBER, R., D. KUKLA, A. RÜHLMANN, O. EPP & H. FORMANEK: The basic trypsin inhibitor of bovine pancreas. I. Structure analysis and conformation of the polypeptide chain. Naturwissenschaften 57, 389-392 (1970)
121. HULL, S. E., R. KARLSSON, P. MAIN, M. M. WOOLFSON & E. J. DODSON: The crystal structure of a hydrated gramicidin S-urea complex. Nature 275, 206-207 (1978)
122. HVIDT, A., G. JOHANSEN, K. LINDERSTRØM-LANG & F. VASLOW: Exchange of deuterium and 18O between water and other substances. Compt. Rend. Trav. Lab. Carlsberg, Ser. Chim. 29, 129-157 (1955)
123. IMOTO, T., L. N. JOHNSON, A. C. T. NORTH, D. C. PHILLIPS & J. A. RUPLEY: Vertebrate lysozymes. In: The Enzymes, Ed. P. D. Boyer, Academic Press, New York and London, Third ed., vol. VII, pp. 665-868 (1972)
124. ISOGAI, Y., G. NÉMETHY, S. RACKOVSKY, S. J. LEACH & H. A. SCHERAGA: Characterization of multiple bends in proteins. Biopolymers 19, 1183-1210 (1980)
125. JACK, A. & M. LEVITT: Refinement of large structures by simultaneous minimization of energy and R factor. Acta Cryst. A 34, 931-935 (1978)
126. JAENICKE, R. & M. A. LAUFFER: Polymerization-depolymerization of tobacco mosaic virus protein. XII. Further studies on the role of water. Biochemistry 8, 3083-3092 (1969)
127. JARDETZKY, O. & G. C. K. ROBERTS: NMR in molecular biology. Academic Press, New York (1981)
128. JOLLÈS, J., J. JAUREGUI-ADELL, I. BERNIER & P. JOLLÈS: La structure chimique du lysozyme de blanc d'oeuf de poule: Étude détaillée. Biochim. Biophys. Acta 78, 668-689 (1963)
129. KABAT, E. A. & T. T. WU: The influence of nearest-neighboring amino acid residues on aspects of secondary structure of proteins. Attempts to locate α-helices and β-sheets. Biopolymers 12, 751-774 (1973)
130. KATCHALSKI, E. & M. SELA: Synthesis and chemical properties of poly-α-amino acids. Adv. Protein Chem. 13, 243-492 (1958)
131. KAUZMANN, W.: Some factors in the interpretation of protein denaturation. Adv. Protein Chem. 14, 1-63 (1959)
132. KELLY, J. A., A. R. SIELECKI, B. D. SYKES, M. N. G. JAMES & D. C. PHILLIPS: X-ray crystallography of the binding of the bacterial cell wall trisaccharide NAM-NAG-NAM to lysozyme. Nature 282, 875-878 (1979)
133. KENDREW, J. C., H. C. WATSON, B. E. STRANDBERG, R. E. DICKERSON, D. C. PHILLIPS & V. C. SHORE: The amino-acid sequence of sperm whale myoglobin. A partial determination by X-ray methods, and its correlation with chemical data. Nature 190, 666-670 (1961)
134. KIDERA, A., M. MOCHIZUKI, R. HASEGAWA, T. HAYASHI, H. SATO, A. NAKAJIMA, R. A. FREDERICKSON, S. P. POWERS, S. LEE & H. A. SCHERAGA: Helix-coil transition in multi-component random copolypeptides in water. I. Theory, and application to random copolymers of (hydroxypropyl)-L-glutamine, L-alanine and glycine. Macromolecules 16, 162-172 (1983)
135. KIM, P. S. & R. L. BALDWIN: Specific intermediates in the folding reactions of small proteins and the mechanism of protein folding. Ann. Rev. Biochem. 51, 459-489 (1982)
136. KINCAID, R. H. & H. A. SCHERAGA: Acceleration of convergence in Monte Carlo simulations of aqueous solutions using the Metropolis algorithm. Hydrophobic hydration of methane. J. Computational Chem. 3, 525-547 (1982)
137. KIRKWOOD, J. G.: The nature of forces between protein molecules in solution. In: The Mechanism of Enzyme Action. The Johns Hopkins Press, Baltimore, pp. 4-23 (1954)
138. KLOTZ, I. M.: Protein interactions. In: The Proteins, eds. H. Neurath & K. Bailey, Academic Press, New York, vol. IB, 727-806 (1953)
139. KONISHI, Y., T. OOI & H. A. SCHERAGA: Regeneration of ribonuclease A from the reduced protein. Isolation and identification of intermediates, and equilibrium treatment. Biochemistry 20, 3945-3955 (1981)
140. KONISHI, Y., T. OOI & H. A. SCHERAGA: Regeneration of ribonuclease A from the reduced protein. Rate limiting steps. Biochemistry 21, 4734-4740 (1982)
141. KONISHI, Y., T. OOI & H. A. SCHERAGA: Regeneration of ribonuclease A from the reduced protein. Energetic analysis. Biochemistry 21, 4741-4748 (1982)
142. KONISHI, Y., T. OOI & H. A. SCHERAGA: Regeneration of RNase A from the reduced protein: Models of regeneration pathways. Proc. Natl. Acad. Sci., USA 79, 5734-5738 (1982)
143. KOTELCHUCK, D. & H. A. SCHERAGA: The influence of short-range interactions on protein conformation. I. Side chain-backbone interactions within a single peptide unit. Proc. Natl. Acad.


Sci. USA 61, 1163-1170 (1968)
144. KOTELCHUCK, D. & H. A. SCHERAGA: The influence of short-range interactions on protein conformation. II. A model for predicting the α-helical regions of proteins. Proc. Natl. Acad. Sci. USA 62, 14-21 (1969)
145. KOTELCHUCK, D., M. DYGERT & H. A. SCHERAGA: The influence of short-range interactions on protein conformation. III. Dipeptide distributions in proteins of known sequence and structure. Proc. Natl. Acad. Sci. USA 63, 615-622 (1969)
146. KOWALSKI, D. & M. LASKOWSKI, JR.: Chemical-enzymatic replacement of Ile64 in the reactive site of soybean trypsin inhibitor (Kunitz). Biochemistry 15, 1300-1309 (1976)
147. KRAKOW, W., G. F. ENDRES, B. M. SIEGEL & H. A. SCHERAGA: An electron microscopic investigation of the polymerization of bovine fibrin monomer. J. Mol. Biol. 71, 95-103 (1972)
148. KRIGBAUM, W. R. & A. KOMORIYA: Local interactions as a structure determinant for protein molecules. Biochim. Biophys. Acta 576, 204-246 (1979)
149. KUNTZ, I. D., G. M. CRIPPEN & P. A. KOLLMAN: Application of distance geometry to protein tertiary structure calculations. Biopolymers 18, 939-957 (1979)
150. LASKOWSKI, M., JR. & H. A. SCHERAGA: Thermodynamic considerations of protein reactions. I. Modified reactivity of polar groups. J. Am. Chem. Soc. 76, 6305-6319 (1954)
151. LASKOWSKI, M., JR. & H. A. SCHERAGA: Thermodynamic considerations of protein reactions. II. Modified reactivity of primary valence bonds. J. Am. Chem. Soc. 78, 5793-5798 (1956)
152. LASKOWSKI, M., JR. & H. A. SCHERAGA: Thermodynamic considerations of protein reactions. III. Kinetics of protein denaturation. J. Am. Chem. Soc. 83, 266-274 (1961)
153. LASKOWSKI, M., JR., J. M. WIDOM, M. L. McFADDEN & H. A. SCHERAGA: Differential ultraviolet spectra of insulin. Biochim. Biophys. Acta 19, 581-582 (1956)
154. LASKOWSKI, M., JR., S. EHRENPREIS, T. H. DONNELLY & H. A. SCHERAGA: Equilibria in the fibrinogen-fibrin conversion. V. Reversibility and thermodynamics of the proteolytic action of thrombin on fibrinogen. J. Am. Chem. Soc. 82, 1340-1348 (1960)
155. LASKOWSKI, M., JR., S. J. LEACH & H. A. SCHERAGA: Tyrosyl hydrogen bonds in insulin. J. Am. Chem. Soc. 82, 571-582 (1960)
156. LEACH, S. J., G. NÉMETHY & H. A. SCHERAGA: Use of proton nuclear Overhauser effects for the determination of the conformations of amino acid residues in oligopeptides. Biochem. Biophys. Res. Comm. 75, 207-215 (1977)
157. LEE, B. & F. M. RICHARDS: The interpretation of protein structures: Estimation of static accessibility. J. Mol. Biol. 55, 379-400 (1971)
158. LEE, E., G. NÉMETHY, H. A. SCHERAGA & V. S. ANANTHANARAYANAN: β-bend conformation of N-Ac-Pro-Pro-Gly-Pro-NHCH3. Implications for posttranslational proline hydroxylation in collagen. Biopolymers, in press
159. LEHMAN, G. W. & J. P. McTAGUE: Melting of DNA. J. Chem. Phys. 49, 3170-3179 (1968)
160. LENTZ, B. R., A. T. HAGLER & H. A. SCHERAGA: Structure of liquid water. II. Improved statistical thermodynamic treatment and implications of a cluster model. J. Phys. Chem. 78, 1531-1550 (1974)
161. LEWIS, P. N. & H. A. SCHERAGA: Predictions of structural homologies in cytochrome c proteins. Arch. Biochem. Biophys. 144, 576-583 (1971)
162. LEWIS, P. N. & H. A. SCHERAGA: Prediction of structural homology between bovine α-lactalbumin and hen egg white lysozyme. Arch. Biochem. Biophys. 144, 584-588 (1971)
163. LEWIS, P. N., N. GO, M. GO, D. KOTELCHUCK & H. A. SCHERAGA: Helix probability profiles of denatured proteins and their correlation with native structures. Proc. Natl. Acad. Sci. U. S. 65, 810-815 (1970)
164. LEWIS, P. N., F. A. MOMANY & H. A. SCHERAGA: Folding of polypeptide chains in proteins: A proposed mechanism for folding. Proc. Natl. Acad. Sci. U. S. 68, 2293-2297 (1971)
165. LEWIS, P. N., F. A. MOMANY & H. A. SCHERAGA: Energy parameters in polypeptides. VI. Conformational energy analysis of the N-acetyl N'-methyl amides of the twenty naturally occurring amino acids. Israel J. Chem. 11, 121-152 (1973)
166. LEWIS, P. N., F. A. MOMANY & H. A. SCHERAGA: Chain reversals in proteins. Biochim. Biophys. Acta 303, 211-229 (1973)
167. LIE, G. C., E. CLEMENTI & M. YOSHIMINE: Study of the structure of molecular complexes. XIII. Monte Carlo simulation of liquid water with a configuration interaction pair potential. J. Chem. Phys. 64, 2314-2323 (1976)
168. LIFSON, S.: Partition functions of linear-chain molecules. J. Chem. Phys. 40, 3705-3710 (1964)
169. LIFSON, S. & G. ALLEGRA: On the theory of order-disorder transition and copolymer structure of DNA. Biopolymers 2, 65-68 (1964)
170. LIFSON, S. & A. ROIG: On the theory of the helix-coil transition in polypeptides. J. Chem. Phys. 34, 1963-1974 (1961)


171. LIN, L. N. & J. F. BRANDTS: Mechanism for the unfolding and refolding of ribonuclease A. Kinetic studies utilizing spectroscopic methods. Biochemistry 22, 564-573 (1983)
172. LIN, S., Y. KONISHI, M. E. DENTON & H. A. SCHERAGA: Influence of an extrinsic cross-link on the folding pathway of ribonuclease A. Conformational and thermodynamic analysis of cross-linked (lysine7-lysine41)-ribonuclease A. Biochemistry, submitted
173. LINDERSTRØM-LANG, K.: The ionization of proteins. Compt. Rend. Trav. Lab. Carlsberg, 15, 1-29 (1924)
174. LINDERSTRØM-LANG, K.: The initial stages in the breakdown of proteins by enzymes. Lane Medical Lectures. Stanford Univ. Publ., Med. Series 6, 53-72 (1952)
175. LINDERSTRØM-LANG, K.: Degradation of proteins by enzymes. Inst. Intntl. Chim. Solvay, 9th Conseil Chim., Brussels, pp. 247-299 (1953)
176. LINDERSTRØM-LANG, K.: Deuterium exchange between peptides and water. Chem. Soc. London, Special Publ. No. 2, pp. 1-24 (1955)
177. LINDERSTRØM-LANG, K., R. D. HOTCHKISS & G. JOHANSEN: Peptide bonds in globular proteins. Nature 142, 996 (1938)
178. LOEB, G. & H. A. SCHERAGA: The thermally-induced transition in fibrin. J. Am. Chem. Soc. 84, 134-142 (1962)
179. LYNN, R. M., Y. KONISHI & H. A. SCHERAGA: Folding of ribonuclease A from a partially disordered conformation. Kinetic study under transition conditions. Biochemistry, in press
180. MANDELKERN, L.: Poly-L-proline. In: Poly-α-Amino Acids. Ed. G. D. Fasman, Marcel Dekker, New York, pp. 675-724 (1967)
181. MARSH, H. C., JR., Y. C. MEINWALD, T. W. THANNHAUSER & H. A. SCHERAGA: Mechanism of action of thrombin on fibrinogen. Kinetic evidence for involvement of aspartic acid at position P10. Biochemistry 22, 4170-4174 (1983)
182. MATHESON, R. R., JR. & H. A. SCHERAGA: A method for predicting nucleation sites for protein folding based on hydrophobic contacts. Macromolecules 11, 819-829 (1978)
183. MATHESON, R. R., JR. & H. A. SCHERAGA: Steps in the pathway of the thermal unfolding of ribonuclease A. A nonspecific photochemical surface-labeling study. Biochemistry 18, 2437-2445 (1979)
184. MATHESON, R. R., JR. & H. A. SCHERAGA: Calculation of the Zimm-Bragg cooperativity parameter σ from a simple model of the nucleation process. Macromolecules 16, 1037-1043 (1983)
185. MATHESON, R. R., JR., H. DUGAS & H. A. SCHERAGA: Electron paramagnetic resonance spectroscopy as a monitor of the pathway of the thermal unfolding of ribonuclease A. Biochem. Biophys. Res. Comm. 74, 869-876 (1977)
186. MATTICE, W. L. & H. A. SCHERAGA: Matrix formulation of the transition from a statistical coil to an antiparallel β-sheet. Biopolymers, in press
187. MAXFIELD, F. R. & H. A. SCHERAGA: The effect of neighboring charges on the helix forming ability of charged amino acids in proteins. Macromolecules 8, 491-493 (1975)
188. MAXFIELD, F. R. & H. A. SCHERAGA: Status of empirical methods for the prediction of protein backbone topography. Biochemistry 15, 5138-5153 (1976)
189. MAXFIELD, F. R. & H. A. SCHERAGA: A Raman spectroscopic investigation of the disulfide conformation in oxytocin and lysine vasopressin. Biochemistry 16, 4443-4449 (1977)
190. MAXFIELD, F. R. & H. A. SCHERAGA: Improvements in the prediction of protein backbone topography by reduction of statistical errors. Biochemistry 18, 697-704 (1979)
191. MAXFIELD, F. R., J. BANDEKAR, S. KRIMM, D. J. EVANS, S. J. LEACH, G. NÉMETHY & H. A. SCHERAGA: Conformation of cyclo(L-alanylglycyl-ε-aminocaproyl), a cyclized dipeptide model for a β-bend. 3. Infrared and Raman spectroscopic studies. Macromolecules 14, 997-1003 (1981)
192. McGUIRE, R. F., G. VANDERKOOI, F. A. MOMANY, R. T. INGWALL, G. M. CRIPPEN, N. LOTAN, R. W. TUTTLE, K. L. KASHUBA & H. A. SCHERAGA: Determination of intermolecular potentials from crystal data. II. Crystal packing with applications to poly(amino acids). Macromolecules 4, 112-124 (1971)
193. McWHERTER, C. A., H. A. SCHERAGA & E. HAAS: Work in progress
194. MEIROVITCH, H. & H. A. SCHERAGA: Empirical studies of hydrophobicity. 2. Distribution of the hydrophobic, hydrophilic, neutral and ambivalent amino acids in the interior and exterior layers of native proteins. Macromolecules 13, 1406-1414 (1980)
195. MEIROVITCH, H. & H. A. SCHERAGA: Empirical studies of hydrophobicity. 3. Radial distribution of clusters of hydrophobic and hydrophilic amino acids. Macromolecules 14, 340-345 (1981)
196. MEIROVITCH, H. & H. A. SCHERAGA: An approach to the multiple-minimum problem in protein folding, involving a long-range geometrical restriction and short-, medium- and long-range interactions. Macromolecules 14, 1250-1259 (1981)
197. MEIROVITCH, H. & H. A. SCHERAGA: Introduction of short-range restrictions in a protein-folding algorithm involving a long-range geometrical restriction and short-, medium- and long-range interactions. Proc. Natl. Acad. Sci., U. S. 78, 6584-6587 (1981)
198. MEIROVITCH, H., S. RACKOVSKY & H. A. SCHERAGA: Empirical studies of hydrophobicity. 1. Effect of protein size on the hydrophobic behavior of amino acids. Macromolecules 13, 1398-1405 (1980)
199. MILLER, M. H. & H. A. SCHERAGA: Calculation of the structure of collagen models. Role of interchain interactions in determining the triple-helical coiled-coil conformation. I. Poly(Glycyl-Prolyl-Prolyl). J. Polymer Sci.: Polymer Symposia, No. 54, 171-200 (1976)
200. MILLER, M. H., G. NÉMETHY & H. A. SCHERAGA: Calculation of the structures of collagen models. Role of interchain interactions in determining the triple-helical coiled-coil conformation. 2. Poly(glycyl-prolyl-hydroxyprolyl). Macromolecules 13, 470-478 (1980)
201. MILLER, M. H., G. NÉMETHY & H. A. SCHERAGA: Calculation of the structures of collagen models. Role of interchain interactions in determining the triple-helical coiled-coil conformation. 3. Poly(glycyl-prolyl-alanyl). Macromolecules 13, 910-913 (1980)
202. MIRSKY, A. E. & L. PAULING: On the structure of native, denatured, and coagulated proteins. Proc. Natl. Acad. Sci. U. S. 22, 439-447 (1936)
203. MIYAZAWA, T.: Characteristic amide bands and conformations of polypeptides. In: Polyamino Acids, Polypeptides, and Proteins. Ed. Stahmann, M. A., Univ. of Wisconsin Press, Madison, pp. 201-217 (1962)
204. MOFFITT, W.: The optical rotatory dispersion of simple polypeptides. II. Proc. Natl. Acad. Sci., U. S. 42, 736-746 (1956)
205. MOFFITT, W., D. D. FITTS & J. G. KIRKWOOD: Critique of the theory of optical activity of helical polymers. Proc. Natl. Acad. Sci., U. S. 43, 723-730 (1957)
206. MOMANY, F. A., R. F. McGUIRE, A. W. BURGESS & H. A. SCHERAGA: Energy parameters in polypeptides. VII. Geometric parameters, partial atomic charges, nonbonded interactions, hydrogen bond interactions, and intrinsic torsional potentials for the naturally occurring amino acids. J. Phys. Chem. 79, 2361-2381 (1975)
207. NAGAI, K., Y. ENOKI, S. TOMITA & T. TESHIMA: Trypsin-catalyzed synthesis of peptide bond in human hemoglobin. J. Biol. Chem. 257, 1622-1625 (1982)
208. NAGANO, K.: Logical analysis of the mechanism of protein folding. I. Predictions of helices, loops and β-structures from primary structure. J. Mol. Biol. 75, 401-420 (1973)
209. NAKAJIMA, A. & H. A. SCHERAGA: Thermodynamic study of shrinkage and of phase equilibrium under stress in films made from ribonuclease. J. Am. Chem. Soc. 83, 1575-1584 (1961)
210. NAKAJIMA, A. & H. A. SCHERAGA: Thermodynamic study of shrinkage in fibers made from insulin. J. Am. Chem. Soc. 83, 1585-1589 (1961)
211. NEMENOFF, R. A., J. SNIR & H. A. SCHERAGA: A revised empirical potential for conformational, intermolecular, and solvation studies. J. Phys. Chem. 82, 2497-2530 (1978)
212. NÉMETHY, G.: Interactions between Poly(Gly-Pro-Pro) triple helices: A model for molecular packing in collagen. Biopolymers 22, 33-36 (1983)
213. NÉMETHY, G. & H. A. SCHERAGA: The structure of water and hydrophobic bonding in proteins. I. A model for the thermodynamic properties of liquid water. J. Chem. Phys. 36, 3382-3400 (1962)
214. NÉMETHY, G. & H. A. SCHERAGA: The structure of water and hydrophobic bonding in proteins. II. A model for the thermodynamic properties of aqueous solutions of hydrocarbons. J. Chem. Phys. 36, 3401-3417 (1962)
215. NÉMETHY, G. & H. A. SCHERAGA: The structure of water and hydrophobic bonding in proteins. III. The thermodynamic properties of hydrophobic bonds in proteins. J. Phys. Chem. 66, 1773-1789 (1962)
216. NÉMETHY, G. & H. A. SCHERAGA: Theoretical determination of sterically allowed conformations of a polypeptide chain by a computer method. Biopolymers 3, 155-184 (1965)
217. NÉMETHY, G. & H. A. SCHERAGA: Protein folding. Quart. Rev. Biophys. 10, 239-352 (1977)
218. NÉMETHY, G. & H. A. SCHERAGA: A possible folding pathway of bovine pancreatic RNase. Proc. Natl. Acad. Sci. U. S. 76, 6050-6054 (1979)
219. NÉMETHY, G. & H. A. SCHERAGA: Stereochemical requirements for the existence of hydrogen bonds in β-bends. Biochem. Biophys. Res. Comm. 95, 320-327 (1980)
220. NÉMETHY, G. & H. A. SCHERAGA: Hydrogen bonding involving the ornithine side chain of gramicidin S. Biochem. Biophys. Res. Commun. 118, 643-647 (1984)
221. NÉMETHY, G., I. Z. STEINBERG & H. A. SCHERAGA: The influence of water structure and of hydrophobic interactions on the strength of side-chain hydrogen bonds in proteins. Biopolymers 1, 43-69 (1963)
222. NÉMETHY, G., M. H. MILLER & H. A. SCHERAGA: Calculation of the structures of collagen models.


Role of interchain interactions in determining the triple-helical coiled-coil conformation. 4. Poly(glycyl-alanyl-prolyl). Macromolecules 13, 914-919 (1980)
223. NÉMETHY, G., J. R. McQUIE, M. S. POTTLE & H. A. SCHERAGA: Conformation of cyclo(L-alanylglycyl-ε-aminocaproyl), a cyclized dipeptide model for a β-bend. 1. Conformational energy calculations. Macromolecules 14, 975-985 (1981)
224. NÉMETHY, G., W. J. PEER & H. A. SCHERAGA: Effect of protein-solvent interactions on protein conformation. Ann. Rev. Biophys. Bioeng. 10, 459-497 (1981)
225. NÉMETHY, G., M. POTTLE & H. A. SCHERAGA: Energy parameters in polypeptides. 9. Updating of geometrical parameters, nonbonded interactions, and hydrogen bond interactions for the naturally occurring amino acids. J. Phys. Chem. 87, 1883-1887 (1983)
226. NISHIKAWA, K. & H. A. SCHERAGA: Geometrical criteria for formation of coiled-coil structures of polypeptide chains. Macromolecules 9, 395-407 (1976)
227. NOZAKI, Y. & C. TANFORD: The solubility of amino acids and two glycine peptides in aqueous ethanol and dioxane solutions. J. Biol. Chem. 246, 2211-2217 (1971)
228. OKUYAMA, K., N. TANAKA, T. ASHIDA & M. KAKUDO: Structure analysis of a collagen model polypeptide, (Pro-Pro-Gly)10. Bull. Chem. Soc. Japan 49, 1805-1810 (1976)
229. OOI, T. & H. A. SCHERAGA: Structural studies of ribonuclease. XII. Enzymic hydrolysis of active tryptic modifications of ribonuclease. Biochemistry 3, 641-647 (1964)
230. OOI, T. & H. A. SCHERAGA: Structural studies of ribonuclease. XIII. Physicochemical properties of trypsin modifications of ribonuclease. Biochemistry 3, 648-652 (1964)
231. OOI, T. & H. A. SCHERAGA: Structural studies of ribonuclease. XIV. Tryptic hydrolysis of ribonuclease in propyl alcohol solution. Biochemistry 3, 1209-1213 (1964)
232. OOI, T., J. A. RUPLEY & H. A. SCHERAGA: Structural studies of ribonuclease. VIII. Tryptic hydrolysis of ribonuclease A at elevated temperatures. Biochemistry 2, 432-437 (1963)
233. OOI, T., R. A. SCOTT, G. VANDERKOOI & H. A. SCHERAGA: Conformational analysis of macromolecules. IV. Helical structures of poly-L-alanine, poly-L-valine, poly-β-methyl-L-aspartate, poly-γ-methyl-L-glutamate, and poly-L-tyrosine. J. Chem. Phys. 46, 4410-4426 (1967)
234. OTTESEN, M. & C. VILLEE: The peptides released in the enzymic transformation of ovalbumin to plakalbumin. Compt. Rend. Trav. Lab. Carlsberg, Sér. Chim. 27, 421-435 (1951)
235. OWICKI, J. C. & H. A. SCHERAGA: Monte Carlo calculations in the isothermal-isobaric ensemble. 1. Liquid water. J. Am. Chem. Soc. 99, 7403-7412 (1977)
236. OWICKI, J. C. & H. A. SCHERAGA: Monte Carlo calculations in the isothermal-isobaric ensemble. 2. Dilute aqueous solution of methane. J. Am. Chem. Soc. 99, 7413-7418 (1977)
237. OWICKI, J. C. & H. A. SCHERAGA: Monte Carlo free energy calculations on dilute solutions in the isothermal-isobaric ensemble. J. Phys. Chem. 82, 1257-1264 (1978)
238. PANGALI, C., M. RAO & B. J. BERNE: A Monte Carlo simulation of the hydrophobic interaction. J. Chem. Phys. 71, 2975-2990 (1979)
239. PATERSON, Y., G. NÉMETHY & H. A. SCHERAGA: Hydration of amino acids, peptides and model compounds. Annals N. Y. Acad. Sci. 367, 132-150 (1981)
240. PATERSON, Y., G. NÉMETHY & H. A. SCHERAGA: An empirical potential function for the interaction between univalent ions in water. J. Solution Chem. 11, 831-856 (1982)
241. PAULING, L. & R. B. COREY: The pleated sheet, a new layer configuration of polypeptide chains. Proc. Natl. Acad. Sci., U. S. 37, 251-256 (1951)
242. PAULING, L., R. B. COREY & H. R. BRANSON: The structure of proteins: Two hydrogen-bonded helical configurations of the polypeptide chain. Proc. Natl. Acad. Sci., U. S. 37, 205-211 (1951)
243. PEGGION, E.: Peptides, polypeptides and proteins: Interactions and their biological implications. Biopolymers 22, 1-586 (1983)
244. PERUTZ, M. F., H. MUIRHEAD, J. M. COX & L. G. GOAMAN: Three-dimensional Fourier synthesis of horse oxyhaemoglobin at 2.8 Å resolution: The Atomic Model. Nature 219, 131-139 (1968)
245. PINCUS, M. R. & R. D. KLAUSNER: Prediction of the three-dimensional structure of the leader sequence of Pre-κ light chain, a hexadecapeptide. Proc. Natl. Acad. Sci., U. S. 79, 3413-3417 (1982)
246. PINCUS, M. R. & H. A. SCHERAGA: An approximate treatment of long-range interactions in proteins. J. Phys. Chem. 81, 1579-1583 (1977)
247. PINCUS, M. R. & H. A. SCHERAGA: Conformational energy calculations of enzyme-substrate and enzyme-inhibitor complexes of lysozyme. 2. Calculation of the structures of complexes with a flexible enzyme. Macromolecules 12, 633-644 (1979)
248. PINCUS, M. R. & H. A. SCHERAGA: Prediction of the three-dimensional structures of complexes of lysozyme with cell wall substrates. Biochemistry


20, 3960-3965 (1981)
249. PINCUS, M. R. & H. A. SCHERAGA: Theoretical calculations on enzyme-substrate complexes: The basis of molecular recognition and catalysis. Accts. Chem. Res. 14, 299-306 (1981)
250. PINCUS, M. R., A. W. BURGESS & H. A. SCHERAGA: Conformational energy calculations of enzyme-substrate complexes of lysozyme. 1. Energy minimization of mono- and oligosaccharide inhibitors and substrates of lysozyme. Biopolymers 15, 2485-2521 (1976)
251. PINCUS, M. R., S. S. ZIMMERMAN & H. A. SCHERAGA: Prediction of three-dimensional structures of enzyme-substrate and enzyme-inhibitor complexes of lysozyme. Proc. Natl. Acad. Sci., U. S. 73, 4261-4265 (1976)
252. PINCUS, M. R., S. S. ZIMMERMAN & H. A. SCHERAGA: Structures of enzyme-substrate complexes of lysozyme. Proc. Natl. Acad. Sci., U. S. 74, 2629-2633 (1977)
253. PINCUS, M. R., R. D. KLAUSNER & H. A. SCHERAGA: Calculation of the three-dimensional structure of the membrane-bound portion of melittin from its amino acid sequence. Proc. Natl. Acad. Sci., U. S. 79, 5107-5110 (1982)
254. PINCUS, M. R., F. GEREWITZ, H. WAKO & H. A. SCHERAGA: Cis-trans isomerization of proline in the peptide (His105-Val124) of ribonuclease A containing the primary nucleation site. J. Protein Chem. 2, 131-146 (1983)
255. PLATZER, K. E. B., F. A. MOMANY & H. A. SCHERAGA: Conformational energy calculations of enzyme-substrate interactions. I. Computation of preferred conformations of some substrates of α-chymotrypsin. Intntl. J. of Peptide and Protein Research 4, 187-200 (1972)
256. PLATZER, K. E. B., F. A. MOMANY & H. A. SCHERAGA: Conformational energy calculations of enzyme-substrate interactions. II. Computation of the binding energy for substrates in the active site of α-chymotrypsin. Intntl. J. of Peptide and Protein Research 4, 201-219 (1972)
257. POLAND, D. & H. A. SCHERAGA: Statistical mechanics of non-covalent bonds in polyamino acids. Biopolymers 3, 275-419 (1965)
258. POLAND, D. & H. A. SCHERAGA: Phase transitions in one dimension and the helix-coil transition in polyamino acids. J. Chem. Phys. 45, 1456-1463 (1966)
259. POLAND, D. & H. A. SCHERAGA: Occurrence of a phase transition in nucleic acid models. J. Chem. Phys. 45, 1464-1469 (1966)
260. POLAND, D. & H. A. SCHERAGA: Kinetics of the helix-coil transition in polyamino acids. J. Chem. Phys. 45, 2071-2090 (1966)
261. POLAND, D. & H. A. SCHERAGA: The equilibrium unwinding in finite chains of DNA. Physiol. Chem. and Phys. 1, 389-446 (1969)
262. POLAND, D. & H. A. SCHERAGA: The Lifson-Allegra theories of the helix-coil transition for random copolymers: Comparison with exact results and extension. Biopolymers 7, 887-908 (1969)
263. POLAND, D. & H. A. SCHERAGA: Theory of helix-coil transitions in biopolymers. Academic Press, N. Y. (1970)
264. POLAND, D., J. N. VOURNAKIS & H. A. SCHERAGA: Cooperative interactions in single-strand oligomers of adenylic acid. Biopolymers 4, 223-235 (1966)
265. PONNUSWAMY, P. K., R. F. McGUIRE & H. A. SCHERAGA: Refinement of the molecular structure of actinomycin D by energy minimization. Intntl. J. Peptide Protein Res. 5, 73-84 (1973)
266. PONNUSWAMY, P. K., P. K. WARME & H. A. SCHERAGA: Role of medium-range interactions in proteins. Proc. Natl. Acad. Sci., U. S. 70, 830-833 (1973)
267. POTTLE, C., M. S. POTTLE, R. W. TUTTLE, R. J. KINCH & H. A. SCHERAGA: Conformational analysis of proteins: Algorithms and data structures for array processing. J. Computational Chem. 1, 46-58 (1980)
268. POTTS, J. T., A. BERGER, J. COOKE & C. B. ANFINSEN: A reinvestigation of the sequence of residues 11 to 18 in bovine pancreatic ribonuclease. J. Biol. Chem. 237, 1851-1855 (1962)
269. PTITSYN, O. B.: Protein folding: General physical model. FEBS Lett. 131, 197-202 (1981)
270. PTITSYN, O. B., V. I. LIM & A. FINKELSTEIN: Secondary structure of globular proteins and the principle of concordance of local and long-range interactions. Fed. Eur. Biochem. Soc. Meet. (Proc.) 25, 421-429 (1972)
271. PURISIMA, E. O. & H. A. SCHERAGA: Conversion from a virtual-bond chain to a complete polypeptide backbone chain. Biopolymers, in press
272. RACKOVSKY, S. & H. A. SCHERAGA: Hydrophobicity, hydrophilicity, and the radial and orientational distributions of residues in native proteins. Proc. Natl. Acad. Sci., U. S. 74, 5248-5251 (1977)
273. RACKOVSKY, S. & H. A. SCHERAGA: Differential geometry and polymer conformation. 1. Comparison of protein conformations. Macromolecules 11, 1168-1174 (1978)
274. RACKOVSKY, S. & H. A. SCHERAGA: Differential geometry and polymer conformation. 2. Development of a conformational distance function. Macromolecules 13, 1440-1453 (1980)
275. RACKOVSKY, S. & H. A. SCHERAGA: Intermolecular anti-parallel β-sheet: Comparison of predicted


and observed conformations of gramicidin S. Proc. Natl. Acad. Sci., U. S. 77, 6965-6967 (1980)
276. RACKOVSKY, S. & H. A. SCHERAGA: Differential geometry and polymer conformation. 3. Single-site and nearest-neighbor distributions, and nucleation of protein folding. Macromolecules 14, 1259-1269 (1981)
277. RACKOVSKY, S. & H. A. SCHERAGA: Differential geometry and polymer conformation. 4. Conformational and nucleation properties of individual amino acids. Macromolecules 15, 1340-1346 (1982)
278. RACKOVSKY, S. & H. A. SCHERAGA: Differential geometry and protein folding. Accts. Chem. Res., in press
279. RAMACHANDRAN, G. N., C. RAMAKRISHNAN & V. SASISEKHARAN: Stereochemistry of polypeptide chain configurations. J. Mol. Biol. 7, 95-99 (1963)
280. RAMACHANDRAN, G. N., C. M. VENKATACHALAM & S. KRIMM: Stereochemical criteria for polypeptide and protein chain conformations. III. Helical and hydrogen-bonded polypeptide chains. Biophys. J. 6, 849-872 (1966)
281. RAPAPORT, D. C. & H. A. SCHERAGA: Evolution and stability of polypeptide chain conformation: A simulation study. Macromolecules 14, 1238-1246 (1981)
282. RAPAPORT, D. C. & H. A. SCHERAGA: Structure and dynamics of the "Configuration Interaction" model of liquid water. Chem. Phys. Letters 78, 491-494 (1981)
283. RAPAPORT, D. C. & H. A. SCHERAGA: Hydration of inert solutes. A molecular dynamics study. J. Phys. Chem. 86, 873-880 (1982)
284. RASSE, D., P. K. WARME & H. A. SCHERAGA: Refinement of the X-ray structure of rubredoxin by conformational energy calculations. Proc. Natl. Acad. Sci., U. S. 71, 3736-3740 (1974)
285. RICHARDS, F. M.: An active intermediate produced during the digestion of ribonuclease by subtilisin. Compt. Rend. Trav. Lab. Carlsberg, Sér. Chim. 29, 329-346 (1955)
286. RICHARDS, F. M. & H. W. WYCKOFF: Bovine pancreatic ribonuclease. In: The Enzymes (3rd Ed.). Ed. P. D. Boyer, Academic Press, New York, Vol. IV, pp. 647-806 (1971)
287. RICHARDSON, J. S.: The anatomy and taxonomy of protein structure. Adv. Protein Chem. 34, 167-339 (1981)
288. ROBSON, B. & R. H. PAIN: Analysis of the code relating sequence to conformation in globular proteins. Biochem. J. 141, 869-904 (1974)
289. RUPLEY, J. A. & H. A. SCHERAGA: Structural studies of ribonuclease. VII. Chymotrypsin hydrolysis of ribonuclease A at elevated temperatures. Biochemistry 2, 421-431 (1963)
290. RYLE, A. P., F. SANGER, L. F. SMITH & R. KITAI: The disulphide bonds of insulin. Biochem. J. 60, 541-556 (1955)
291. SACHS, D. H., A. N. SCHECHTER, A. EASTLAKE & C. B. ANFINSEN: An immunologic approach to the conformational equilibria of polypeptides. Proc. Natl. Acad. Sci., U. S. 69, 3790-3794 (1972)
292. SAXENA, V. P. & D. B. WETLAUFER: Formation of three-dimensional structure in proteins. 1. Rapid nonenzymic reactivation of reduced lysozyme. Biochemistry 9, 5015-5023 (1970)
293. SCATCHARD, G.: Molecular interactions in protein solutions. Amer. Scientist 40, 61-83 (1952)
294. SCATCHARD, G., A. C. BATCHELDER & A. BROWN: Preparation and properties of serum and plasma proteins. VI. Osmotic equilibria in solutions of serum albumin and sodium chloride. J. Am. Chem. Soc. 68, 2320-2329 (1946)
295. SCHECHTER, I. & A. BERGER: On the active site of proteases. III. Mapping the active site of papain; specific peptide inhibitors of papain. Biochem. Biophys. Res. Commun. 32, 898-902 (1968)
296. SCHELLMAN, J. A.: The thermodynamics of urea solutions and the heat of formation of the peptide hydrogen bond. Compt. Rend. Trav. Lab. Carlsberg, Sér. Chim. 29, 223-229 (1955)
297. SCHELLMAN, J. A.: The stability of hydrogen-bonded peptide structures in aqueous solution. Compt. Rend. Trav. Lab. Carlsberg, Sér. Chim. 29, 230-259 (1955)
298. SCHELLMAN, J. A.: Optical rotatory properties of proteins and polypeptides. I. Theoretical discussion and experimental methods. Compt. Rend. Trav. Lab. Carlsberg, Sér. Chim. 30, 363-394 (1958)
299. SCHERAGA, H. A.: Tyrosyl-carboxylate ion hydrogen bonding in ribonuclease. Biochim. Biophys. Acta 23, 196-197 (1957)
300. SCHERAGA, H. A.: Influence of side-chain hydrogen bonds on the elastic properties of protein fibers and on the configurations of proteins in solution. J. Phys. Chem. 64, 1917-1926 (1960)
301. SCHERAGA, H. A.: Protein Structure. Academic Press, N. Y., pp. 241-287 (1961)
302. SCHERAGA, H. A.: The effect of solutes on the structure of water and its implications for protein structure. Ann. N. Y. Acad. Sci. 125, 253-276 (1965)
303. SCHERAGA, H. A.: Structural studies of pancreatic ribonuclease. Fed. Proc. 26, 1380-1387 (1967)
304. SCHERAGA, H. A.: Calculations of conformations of polypeptides. Adv. Phys. Org. Chem. 6, 103-184 (1968)


305. SCHERAGA, H. A.: Calculations of conformations of polypeptides from amino acid sequence. Nobel Symposium 11, on "Symmetry and Function of Biological Systems at the Macromolecular Level" (ed. A. Engström & B. Strandberg), Almqvist and Wiksell, Stockholm, pp. 43-78 (1969)
306. SCHERAGA, H. A.: Theoretical and experimental studies of conformations of polypeptides. Chem. Revs. 71, 195-217 (1971)
307. SCHERAGA, H. A.: On the dominance of short-range interactions in polypeptides and proteins. Pure and Applied Chem. 36, 1-8 (1973)
308. SCHERAGA, H. A.: Poly(amino acids), interatomic energies, and protein folding. In: Peptides, Polypeptides, and Proteins. Eds. Blout, E. R., F. A. Bovey, M. Goodman & N. Lotan, John Wiley & Sons, New York, pp. 49-70 (1974)
309. SCHERAGA, H. A.: Prediction of protein conformation. In: Current Topics in Biochemistry, 1973. Eds. Anfinsen, C. B. & A. N. Schechter, Academic Press, New York, pp. 1-42 (1974)
310. SCHERAGA, H. A.: Active site mapping of thrombin. In: Chemistry and Biology of Thrombin. Ed. R. L. Lundblad, J. W. Fenton II & K. G. Mann, Ann Arbor Science Publ., Ann Arbor, pp. 145-158 (1977)
311. SCHERAGA, H. A.: Intermolecular potential for water and the hydration of proteins. Ann. N. Y. Acad. Sci. 303, 2-9 (1977)
312. SCHERAGA, H. A.: Use of random copolymers to determine the helix-coil stability constants of the naturally occurring amino acids. Pure and Applied Chem. 50, 315-324 (1978)
313. SCHERAGA, H. A.: An approximate model for protein folding: Experimental and theoretical aspects. In: Versatility of Protein. Ed. C. H. Li, Academic Press, New York, pp. 119-132 (1978)
314. SCHERAGA, H. A.: Interactions in aqueous solution. Accts. Chem. Res. 12, 7-14 (1979)
315. SCHERAGA, H. A.: Phase transitions in synthetic polymers of amino acids, and their relation to protein folding. Ferroelectrics 30, 157-158 (1980)
316. SCHERAGA, H. A.: Protein folding: Application to ribonuclease. In: Protein Folding. Ed. R. Jaenicke, Elsevier, Amsterdam, pp. 261-288 (1980)
317. SCHERAGA, H. A.: Influence of interatomic interactions on the structure and stability of polypeptides and proteins. Biopolymers 20, 1877-1899 (1981)
318. SCHERAGA, H. A.: Structure and thermodynamic properties of aqueous solutions of small molecules and proteins. Pure and Applied Chem. 54, 1495-1505 (1982)
319. SCHERAGA, H. A.: Recent progress in the theoretical treatment of protein folding. Biopolymers 22, 1-14 (1983)
320. SCHERAGA, H. A.: Theoretical and experimental aspects of protein folding. In: Supramolecular Structure and Function. Eds. Pifat, G. & J. N. Herak, Plenum Publ. Corp., New York, pp. 45-58 (1983)
321. SCHERAGA, H. A.: Interaction of thrombin and fibrinogen, and the polymerization of fibrin monomer. Annals N. Y. Acad. Sci. 408, 330-343 (1983)
322. SCHERAGA, H. A.: Calculations of the three-dimensional structures of proteins. Annals N. Y. Acad. Sci., in press
323. SCHERAGA, H. A.: Theoretical studies of molecular recognition and catalysis by enzymes. Proc. Pontifical Academy of Sciences, in press. Also, Proc. XVIIIth International Solvay Conference on Chemistry, in press
324. SCHERAGA, H. A. & M. LASKOWSKI, JR.: The fibrinogen-fibrin conversion. Adv. Protein Chem. 12, 1-131 (1957)
325. SCHERAGA, H. A. & L. MANDELKERN: Consideration of the hydrodynamic properties of proteins. J. Am. Chem. Soc. 75, 179-184 (1953)
326. SCHERAGA, H. A. & J. A. RUPLEY: The structure and function of ribonuclease. Adv. in Enzymology 24, 161-261 (1962)
327. SCHERAGA, H. A., W. R. CARROLL, L. F. NIMS, E. SUTTON, J. K. BACKUS & J. M. SAUNDERS: Hydrodynamic properties of urea-denatured fibrinogen. J. Polymer Sci. 14, 427-442 (1954)
328. SCHERAGA, H. A., G. NÉMETHY & I. Z. STEINBERG: The contribution of hydrophobic bonds to the thermal stability of protein conformations. J. Biol. Chem. 237, 2506-2508 (1962)
329. SCHERAGA, H. A., S. J. LEACH, R. A. SCOTT & G. NÉMETHY: Intramolecular forces and protein conformation. Disc. Faraday Soc. 40, 268-277 (1965)
330. SCHERAGA, H. A., K.-C. CHOU & G. NÉMETHY: Interactions between the fundamental structures of polypeptide chains. In: Conformation in Biology. Ed. R. Srinivasan and R. H. Sarma, Adenine Press, pp. 1-10 (1982)
331. SCHERAGA, H. A., M. R. PINCUS & K. E. BURKE: Calculations of structures of enzyme-substrate complexes. In: Structure of Complexes between Biopolymers and Low Molecular Weight Molecules. Eds. W. Bartmann and G. Snatzke, John Wiley, Chichester, pp. 3-76 (1982)
332. SCHERAGA, H. A., Y. KONISHI & T. OOI: Multiple pathways for regenerating ribonuclease A. Adv. in Biophys., in press
333. SCHEULE, R. K., H. E. VAN WART, B. L. VALLEE & H. A. SCHERAGA: Resonance Raman spectroscopy of arsanilazocarboxypeptidase A: Determination of the nature of the azotyrosyl-248·zinc complex. Proc. Natl. Acad. Sci., U. S. 74, 3273-3277 (1977)
334. SCHEULE, R. K., H. E. VAN WART, B. O. ZWEIFEL, B. L. VALLEE & H. A. SCHERAGA: Resonance Raman spectroscopy of arsanilazocarboxypeptidase A: Assignment of the vibrations of azotyrosyl-248. J. Inorganic Biochem. 11, 283-301 (1979)
335. SCHEULE, R. K., H. E. VAN WART, B. L. VALLEE & H. A. SCHERAGA: Resonance Raman spectroscopy of arsanilazocarboxypeptidase A: Conformational equilibria in solution and crystal phases. Biochemistry 19, 759-766 (1980)
336. SCHEULE, R. K., S. L. HAN, H. E. VAN WART, B. L. VALLEE & H. A. SCHERAGA: Resonance Raman spectroscopy of arsanilazocarboxypeptidase A: Mode of inhibitor binding and active-site topography. Biochemistry 20, 1778-1784 (1981)
337. SCHINDLER, M., Y. ASSAF, N. SHARON & D. M. CHIPMAN: Mechanism of lysozyme catalysis: Role of ground-state strain in subsite D in hen egg-white and human lysozymes. Biochemistry
identification of a theoretically-predicted "left-sided" binding mode for (GlcNAc)6 in the active site of lysozyme. Biochemistry 23, 993-997 (1984)
347. SMYTH, D. G., W. H. STEIN & S. MOORE: On the sequence of residues 11 to 18 in bovine pancreatic ribonuclease. J. Biol. Chem. 237, 1845-1850 (1962)
348. SMYTH, D. G., W. H. STEIN & S. MOORE: The sequence of amino acid residues in bovine pancreatic ribonuclease. Revisions and confirmations. J. Biol. Chem. 238, 227-234 (1963)
349. SPACKMAN, D. H., W. H. STEIN & S. MOORE: The disulfide bonds of ribonuclease. J. Biol. Chem. 235, 648-659 (1960)
350. STAHMANN, M. A.: Polyamino acids, polypeptides, and proteins. Univ. of Wisconsin Press, Madison, pp. 1-394 (1962)
351. STEIN, W. D. & E. A. BARNARD: The histidine residue in the active centre of ribonuclease. II. The position of this residue in the primary protein chain. J. Mol. Biol. 1, 350-358 (1959)
352. STEINBERG, I. Z. & H. A.
SCHERAGA: Chromato- 16, 423-431 (1977) graphy on columns packed with a non-polar 338. SCHMID, F. X.: Mechanism of folding ofribonuc- material. J. Am. Chem. Soc. 84, 2890-2892 lease A. Slow refolding is a sequential reaction via (1962) structural intermediates. Biochemistry 22, 4690- 353. STEINBERG, I. Z. & H. A. SCHERAGA: Entropy 4696 (1983) changes accompanying association reactions of 339. SCHWARZ. G,: Oil the kinetics of the helix-coil proteins. J. Biol. Chem. 238, 172-181 (1963) transition of polypeptides in solution. J. Mol. 354, STILLINGER, F. H. & A. RAHMAN: Molecular dy- Biol. l 1, 64-77 (1965) namics study of temperature effects on water 340. SCOTT, R. A & H. A. SCHERAGA: Structural stu- structure and kinetics. J. Chem, Phys, 57, 1281- dies of ribonuclease. XI. Kinetics of denatura- 1292 (1972) tion. J. Am. Chem. Soc. 85, 3866-3873 (1963) 355. STURTEVANT. J. M., M. LASKOWSKI, JR., T, H. 34 I. SCOTT. R. A. & H. A. SCHERAGA: Conformational DONNELLY & H. A. SCHERAGA: Equilibria in the analysis of macromolecules. III, Helical structu- fibrinogen-fibrin conversion, III, Heats of poly- res of polyglycine and poly-L-alanine. J. Chem, merization and clotting of fibrin monomer. J. Phys. 45, 2091-2101 (1966) Am. Chem. Soc. 77, 6168-6172 (1955) 342. SELA, M. & C. B. ANFINSEN: Some spectrophoto- 356. SUEKI, M., S. LEE, S. P. POWERS, J. B, DENTON, metric and polarimetric experiments with ribo- Y, KONISHI & H. A, SCHERAGA: Helix-coil stabili- nuclease, Biochim. Biophys. Acta 24, 229-235 ty constants for the naturally occurring amino (1957) acids in water. 22.Histidine parameters from 343. SHUGAR, D: Ultraviolet absorption spectrum of random poly(hydroxybutylglutamine-co-L-his- ribonuclease. Biochem. J. 52, t42-149 (1952) tidine). Macromolecules 17, 148-155 (1984) 344. SIEGEL, B. M., J. P, MERNAN & H, A. SCHERAGA'. 357. SWAMINATHAN.S., S. W. HARRISON & D. L. 
BEV- The configuration of native and partially polyme- ERIDGE: Monte Carlo studies on the structure rized fibrinogen. Biochim. Biophys. Acta l 1, of a dilute aqueous solution of methane. J. Am. 329-336 (1953) Chem. Soc. 100, 5705-5712 (1978) 345. SIMON,l.. G. Ni~METHY & H. A. SCHERAGA: Con- 358. SWENSON, M. K., A. W. BURGESS & H. A. SCHE- formational energy calculations of the effects of RAGA" Conformationai analysis of polypeptides: sequence variations on the conformations of two Application to homologous proteins, ln: Fron- tetrapeptides. Macromolecules 11, 797-804 tiers in Physicochemical Biology. Ed. B. Pullman, (1978) Academic Press, pp. 115-142 (1978) 346. SMITH-GILL, S. J., J. A. RUPLEY, M. R. PINCUS, 359. TAKEDA, Y., Y. hTAKA & M. TSUBOI: Structure R. P. CARTY & H. A. SCHERAGA: Experimental of poly-fl-(p-chlorobenzyl)-L-aspartate: X-ray

52 Carlsberg Res. Commun. Voi. 49, p. 1-55, 1984 H. A. S('HERAGA: Protein structure

analysis of the a-helix form. J. Mol. Biol. 51, 372. TANFORD, C., J. D. HAUENSTEIN & D. G. RANDS: 101-113 (1970) Phenolic hydroxyl ionization in proteins. II. 360. TANAKA, S. & H. A. SCHERAGA: Model of protein Ribonuclease. J. Am. Chem. Soc. 77, 6409-6413 folding: inclusion of short-, medium- and long- (1955) range interactions. Proc. Natl. Acad. Sci., U. S. 373. TELFORD, J. N., J. A. NAGY. P. A. HATCHER & H. 72, 3802-3806 (1975) A. SCHERAGA: Location of peptide fragments in 361. TANAKA. S. & H. A_ SCHERAGA: Theory of the the fibrinogen molecule by immunoelectron mi- cooperative transition between two ordered con- croscopy. Proc. Natl. Acad, Sci., U. S. 77, 2372- formations of poly(L-Proline). III. Molecular 2376 (1980) theory in the presence of solvent. Macro- 374. TERWILLIGER, Z. C.. L WEISSMAN & O. EISEN- molecules 8, 516-521 (1975) aERG: The structure of melittin in the form l 362. TANAKA. S. & H. A. SCHERAGA: Statistical me- crystals and its implication for melittin'slytic and chanical treatment of protein conformation. I, surface activities. Biophys. J. 37, 353-361 (1982) II, III. Macromolecules 9, 142-182 (1976) 375. TINOCO, I. JR. & R. W. WOODY: Optical rotation 363. TANAKA, S. & H. A. SCHERAGA: Statistical me- of oriented helices. It. Calculation of the rotatory chanical treatment of protein conformation. 4. dispersion of the . J. Chem. Phys. 32, A four-state model for specific-sequence copo- 461-467 (1960) lymers of amino acids. Macromolecules 9, 812- 376. TORR1E.G.M. &J. P. VALLEAU." Monte Carlo free 833 (1976) energy estimates using non-Boltzmann sampling: 364. TANAKA. S. & H. A. SCHERAGA: Medium- and Application to the sub-critical Lennard-Jones long-range interaction parameters between fluid. Chem. Phys. Lett. 28, 578-581 (1974) amino acids for predicting three-dimensional 377. URNES, P. & P. DOTY: Optical rotation and the structures of proteins. Macromolecules 9, 945- conformation ofpolypeptides and proteins. Adv. 950 (1976) Protein Chem. 
16, 401-544 (1961) 365. TANAKA. S. & H. A. SCHERAGA: Statistical me- 378, VAN WART, H. E. & H. A. SCHERAGA: Raman chanical treatment of protein conformation. 5. spectra of cystine-related disulfides. Effect of A multistate model for specific-sequence copo- rotational isomerism about carbon-sulfur bonds lymers of amino acids. Macromolecules 10, 9-20 on sulfur-sulfur stretching frequencies. J. Phys. (1977) Chem. 80, 1812-1823 (1976) 366. TANAKA,S.& H.A. SCHERAGA."Hypothesis about 379. VAN WART, H. E. & H. A. SCHERAGA: Raman the mechanism of protein folding. Macro- spectra of strained disulfides. Effect of rotation molecules 10, 291-304 (1977) about sulfur-sulfur bonds on sulfur-sulfur 367. TANAKA, S. & H. A. SCHERAGA: Statistical me- stretching frequencies. J. Phys. Chem. 80, 1823- chanical treatment of protein conformation. 6. 1832 (1976) Elimination of empirical rules for prediction by 380. VAN WART, H E. & H. A. SCHERAGA: Stable use of a high-order probability. Correlation be- conformations of aliphatic disulfides: The influ- tween the amino acid sequences and conforma- ence of 1,4 interactions involving sulfur atoms. tions for homologous neurotoxin proteins. Mac- Proc. Natl. Acad. Sci. U. S. 74, 13-17 (1977) romolecules 10, 305-316 (1977) 381. VAN WART. H. E. & H. A. SCHERAGA: Raman and 368. TANAKA,S. & H. A. SCHERAGA: Model of protein resonance Raman spectroscopy: In: Enzyme folding: Incorporation of a one-dimensional Structure, Part G of Methods in Enzymology. short-range (Ising) model into a three-dimen- Vol. 49, Ch. 5, eds. C. H. W. HiTs & S. N. sional model. Proc. Natl. Acad. Sci., U. S. 74, Timasheff, Academic Press, New York, pp. 67- 1320-1323 (1977) 149 (1978) 369, TANFORD, C." Theory ofprotein titration curves. 382. VAN WART, H. E.. L L. SHIPMAN & H. A. SCHE- II. Calculations for simple models at low ionic RAGA: Variation of disulfide bond stretching strength. J. Am. Chem. Soc. 
79, 5340-5347 frequencies with disulfide dihedral angle in di- (1957) methyl disulfide. J. Phys. Chem. 78, 1848-1853 370. TANFORD,C.: The interpretation of hydrogen ion (1974) titration curves of proteins. Adv. Protein Chem. 383. VAN WART. H. E.. L. L SHIPMAN & H. A. SCHE- 17, 69-165 (1962) RAGA: The nature of the potential function for 371. TANFORD. C. & J. G. KIRKWOOD: Theory of pro- internal rotation about carbon-sulfur bonds in tein titration curves. I. General equations for disulfides. J. Phys. Chem. 79, 1428-1435 (1975) impenetrable spheres. J. Am. Chem. Soc. 79, 384. VAN WART, H. E., L. L SHIPMAN & H. A. SCHE- 5333-5339 (1957) RAGA: Theoretical and experimental evidence for

Carlsberg Res. Commun. Vol. 49, p. 1-55, 1984 53 H. A. SCHERAGA: Protein structure

a nonbonded 1,4 carbon-sulfur interaction in of three-dimensional structure of bovine pan- organosulfur compounds. J. Phys. Chem. 79, creatic trypsin inhibitor. J. Protein Chem. 1, 1436-1447 (1975) 85-117 (1982) 385. VAN WART. H. E. F. CARDINAUX & H. A, SCHE- 397. WAKO, H,,N SAITG& H. A SCHERAGA: Statistical RAGA" Low frequency Raman spectra of di- mechanical treatment of a-helices and extended methyl, methyl ethyl and diethyl disulfides, and structures in proteins with inclusion of short- and rotational isomerism about their carbon-sulfur medium-range interactions. J. protein Chem. 2, bonds. J. Phys. Chem, 80, 625-630 (1976) 221-249 (1983) 386. VAN WART, H. E., B. L VALLEE, R. K. SCHEULE 398. WARME, P. K. & H. A. SCHERAGA.: Refinement & H. A. SCHERAGA" Resonance Raman probes of X-ray data on proteins. II. Adjustment of of enzyme active sites, Trends in Biochemical structure of specified geometry to relieve atomic Sciences 6, 316-318 (1981) overlaps. J, Comput. Phys. 12, 49-64 (1973) 387. V,~SQUEZ. M., G. NEMETHY & H. A. SCHERAGA" 399. WARME, P. K. & H, A. SCHERAGA: Refinement Computed conformational states of the 20 nat- of the X-ray structure of lysozyme by complete urally occurring amino acid residues and of the energy minimization. Biochemistry 13, 757-767 prototype residue ct-aminobutyric acid. Macro- (1974) molecules 16, 1043-1049 (1983) 400. WARME, P. K. & H. A. SCHERAQA: Conforma- 388. VENKATACHALAM.C. M.: Stereochemical crite- tional energy refinement of horse-heart ferricy- ria for polypeptides and proteins. V. Conforma- tochrome c. Biochemistry 14, 3509-3517 (1975) tion of a system of three linked peptide units, 401. WARME, P. K,, N GO & H. A. SCHERAGA: Refine- Biopolymers 6, 1425-1436 (1968) ment of X-ray data on proteins. I. Adjustment 389. YON DREELE, P H., D. POLAND & H. A, SCHE- of atomic coordinates to conform to a specified RAGA: Helix-coil stability constants for the nat- geometry. J. 
Computational Physics 9, 303-317 urally occurring amino acids in water. I. Proper- (1972) ties of copolymers and approximate theories. 402. WARME, P. K., F. A. MOMANY, S, V RUMBALL, Macromolecules 4, 396-407 (1971) R. W. TUTTLE & H. A. SCHERAGA: Computation 390. YON DREELE, P. H., N. LOTAN, V, S. ANANTHANA- of structures of homologous proteins; a-lactalbu- RAYANAN, R. H. ANDREATTA. O. POLAND & H. rain from lysozyme. Biochemistry 13, 768-782 A. SCHERAGA7 Helix-coil stability constants for (1974) the naturally occurring amino acids in water. It. 403. WERTZ, D. & H. A, SCHERAGA: The influence of Characterization of the host polymers and appli- water on protein structure, An analysis of the cation of the host-guest technique to random rlreferences of amino acid residues for the inside poly(hydroxypropylglutamine-eo-hydroxybutyl- or outside and for specific conformations in a glutamine). Macromolecules 4, 408-417 ( 1971 ) protein molecule, Macromolecules 11, 9-15 39 I. VOURNAKIS,J. N., D. POLAND& H. A, SCHERAGA; (1978) Anti-cooperative interactions in single-strand oli- 404. WLODAWER, A. & L. SJOL|N" Structure ofribonu- gomers of deoxyriboadenylic acid. Biopolymers clease A: Results of joint neutron and X-ray 5, 403-422 (1967) refinement at 2.0-~, resolution. Biochemistry 392. WAKO. H. & N SAITO: Statistical mechanical 22, 2720-2728 (1983) theory of the protein conformation. J. Phys. Soc. 405. WLODAWER,A.. R, BOTT& L. SJOLIN: The refined Japan 44, 1931-1945 (1978) crystal structure of ribonuclease A at 2.0-A 393. WAKO, H. & H A. SCHERAGA: On the use of resolution. J. Biol. Chem. 257, 1325-1332 (1982) distance constraints to fold a protein. Macro- 406. WOTHRICH, K.: NMR in biological research: molecules 14, 961-969 (1981) Peptides and proteins. North-Holland Publ. 394. WAKO, H & H. A. SCHERAGA: Visualization of Amsterdam (1976) the nature of protein folding by a study of a 407. WYSKOFF, H. W., D. TSERNOGLOU, A. W. 
HAN- distance constraint approach in two-dimensional SON, J. R. KNOX. B. LEE & F. M RICHARDS: The models, Biopolymers 2 l, 611-632 (1982) three-dimensional structure of ribonuclease-S. J. 395. WAKO, H. & H. A. SCHERAGA" Distance-con- Biol. Chem. 245, 305-328 (1970) straint approach to protein folding. I. Statistical 408. YAN, J. F., G, VANDERKOOI & H. A. SCHERAGA: analysis of protein conformations in terms of Conformational analysis of macromolecules. V. distances between residues. J. Protein Chem. I, Helical structures of poly-L-aspartic acid and 5-45 (1982) poly-L-glutamic acid, and related compounds. J. 396. WAKO, H. & H. A. SCHERAGA: Distance-con- Chem. Phys, 49, 2713-2726 (1968) straint approach to protein folding. II. Prediction 409. YAN. J. F., F. A. MOMANY & H. A. SCHERAGA:

54 Carlsberg Res. Commun. Vol. 49, p. 1-55, 1984 H. A. S('HERAGA: Protein structure

Conformational analysis of macromolecules. VI. formational energy studies of N-acetyl-N'-methyl Helical structures ofo-, m-, and p-chlorobenzyl amides of the Pro-X and X-Pro dipeptides. Bio- esters of poly-L-aspartic acid. J. Am. Chem. Soc. polymers 16, 811-843 (1977) 92, 1109-1115 (1970) 415. ZIMMERMAN. S. S. & H. A. SCHERAGA" Influence 410. YC'AS. M., N. S. GOEL & J. W. JACOBSEN: On the of local interactions on protein structure. II, I11, computation of the tertiary structure of globular IV. Conformational energy studies of N-acetyl- proteins. J. Theor. Biol. 72, 443-457 (1978) N'-methyl amides of the AIa-X, X-AIa, Gly-X, 411. ZIMM, B. H.: Theory of "melting" of the helical X-GIy, Ser-X and X-Ser dipeptides. Biopolymers form in double chains of the DNA type. J. Chem. 17, 1849-1890 (1978) Phys. 33, 1349-1356 (1960) 416. ZIMMERMAN. S. S., M. S. POTTLE. G. NI~METHY 412. ZIMM. B. H. & J. K. BRAGG: Theory of the phase & H. A. SCHERAGA: Conformational analysis of transition between helix and random coil in the 20 naturally occurring amino acid residues polypeptide chains. J. Chem. Phys. 31,526-535 using ECEPP. Macromolecules 10, 1-9 (1977) (1959) 417. ZIMMERMAN. S. S.. L. L. SHIPMAN & H. A. SCHE- 413. ZIMMERMAN, S. S. & H. A. SCHERAGA; Stability RAGA: Bends in globular proteins. A statistical ofcis, trans, and nonplanar peptide groups. Mac- mechanical analysis of the conformational space romolecules 9, 408-416 (1976) of dipeptides and proteins. J. Phys. Chem. 81, 414. ZIMMERMAN, S. S. & H. A. SCHERAGA; Influence 614-622 (1977) of local interactions on protein structure. I. Con-