
J. Mol. Biol. (1996) 255, 536–553

Principles of Helix-Helix Packing in Proteins: The Helical Lattice Superposition Model

Dirk Walther¹*, Frank Eisenhaber¹,² and Patrick Argos¹

¹European Molecular Biology Laboratory, Meyerhofstraße 1, Postfach 10.2209, 69012 Heidelberg, Germany

²Biochemisches Institut der Charité, der Humboldt-Universität zu Berlin, Hessische Straße 3–4, 10115 Berlin, Germany

*Corresponding author

The geometry of helix-helix packing in globular proteins is comprehensively analysed within the model of the superposition of two helix lattices which result from unrolling the helix cylinders onto a plane containing points representing each residue. The requirements for the helix geometry (the radius R, the twist angle ω and the rise per residue Δ) under perfect match of the lattices are studied through a consistent mathematical model that allows consideration of all possible associations of all helix types (α-, π- and 3₁₀). The corresponding equations have three well-separated solutions for the interhelical packing angle, Ω, as a function of the helix geometric parameters allowing optimal packing. The resulting functional relations also show unexpected behaviour. For a typically observed α-helix (ω = 99.1°, Δ = 1.45 Å), the three optimal packing angles are Ω_a,b,c = −37.1°, −97.4° and +22.0° with a periodicity of 180° and respective helix radii R_a,b,c = 3.0 Å, 3.5 Å and 4.3 Å. However, the resulting radii are very sensitive to variations in the twist angle ω. At ω_triple = 96.9°, all three solutions yield identical radii at Δ = 1.45 Å where R_triple = 3.46 Å. This radius is close to that of a poly(Ala) helix, indicating a great packing flexibility when alanine is involved in the packing core, and ω_triple is close to the mean observed twist angle. In contrast, the variety of possible theoretical solutions is limited for the other two helix types. Besides the perfect matches, novel suboptimal "knobs into holes" hydrophobic packing patterns as a function of the helix radius are described.

Alternative "knobs onto knobs" and mixed models can be applied in cases where salt bridges, hydrogen bonds, disulphide bonds and tight hydrophobic head-to-head contacts are involved in helix-helix associations. An analysis of the experimentally observed packings in proteins confirmed the conclusions of the theoretical model. Nonetheless, the observed α-helix packings showed deviations from the 180° periodicity expected from the model. An investigation of the actual three-dimensional geometry of helix-helix packing revealed an explanation for the observed discrepancies where a decisive role was assigned to the defined orientation of the Cα-Cβ vectors of the side-chains. As predicted from the model, helices with different radii (differently sized side-chains in the packing core) were observed to utilize different packing cells (packing patterns). In agreement with the coincidence between R_triple and the radius of a poly(Ala) helix, Ala was observed to show greatest propensity to build the packing core. The application of the helix lattice superposition model suggests that the packing of residues is best described by a "knobs into holes" scheme rather than "ridges into grooves". The various specific packing modes made salient by the model should be useful in engineering and design.

© 1996 Academic Press Limited

Keywords: protein; helix; helix packing; protein secondary structure

0022–2836/96/030536–18 $12.00/0 © 1996 Academic Press Limited

Introduction

The topic of helix-helix pairwise packing in proteins was addressed soon after helical structures had been suggested. Several models were developed and were mostly devoted to surface complementarities upon packing. Crick's model (Crick, 1953), later referred to as "knobs into holes", introduced the unrolling of regular helices onto a plane and then finding the best fit of the resulting lattices (one point per residue). This was achieved by superposition in a face-to-face manner through rotation followed by translation such that residues of one helix (knobs) fit into cells formed by neighbouring residues in the other helix (holes). Assuming a helix radius R of 5.0 Å and a twist angle ω of 100.0° between residues along the helix path, he found optimal packing at a dihedral packing angle Ω between the helix axes at +20° (coiled-coil structures) and a suboptimal packing at Ω = −70°. Richmond & Richards (1978) also pursued the knobs into holes model and concluded further that the packing angle is inversely correlated to the helix radius. They suggested three possible classes of helix-helix packing and, for each class, listed possible amino acids central to the contact. These preferences were utilized to predict spatial helical arrangements from primary structural information (Richmond & Richards, 1978; Cohen et al., 1979; Cohen & Kuntz, 1987).

Chothia et al. (1977, 1981) introduced another and now widely accepted interpretation of the superimposed "helical" lattices. Instead of "knobs into holes" packing, they coined "ridges into grooves". Here, the ridges formed by residues with sequential spacing i in the first helix fit into grooves formed by residues in the second helix with spacing j. By assuming mean observed helix geometries, they found three basic packing types by varying i and j; namely, Ω_(i=1,j=4) = −105°, Ω_(i=4,j=4) = −52° and Ω_(i=3,j=4) = +23°. In principle, yet other combinations of i and j were possible (e.g. Ω_(i=3,j=3) = −109°); however, as they noted, these classes were barely distinguishable from the former because of their similar packing angle and pattern of amino acid contacts. They introduced yet another packing class ("crossed ridge" packing), where the ridges of two helices cross with expected packing angles at +55°, −15° and −105°. Chothia and his co-workers also argued that the observed preference for packing angles around Ω = −52° can be understood in that ridges, formed by contact residues spaced by i = 4, dominate the shape and surface of the helical face since they make the smallest angle to the helix axis.

Efimof (1979) attempted to relate the packing angle with preferred rotational states of the side-chains along the helix. He distinguished two types of packing; polar and apolar, each giving rise to different combinations of rotational isomeric states of the contacting amino acid residues. For a best fit, he proposed three discrete packing angles for the apolar case (Ω ≈ +30°, ≈ −30°, ≈ 90°) and a range of possible docking angles in the polar case (−30° ≤ Ω ≤ 30°). Reddy & Blundell (1993) correlated the distance of closest approach between two packed helices to the volume of the interface-forming amino acid residues and used the resulting linear dependency to predict the interhelical distance of structurally equivalent helices in homologous proteins. Other efforts have focused on the energetic aspects of helix-helix packing where different interaction potentials ranging from burial of hydrophobic residues (Ptitsyn & Rashin, 1975) and other simplified interaction potentials (Solovyov & Kolchanov, 1984) to atomic energy minimization and Monte-Carlo sampling (Chou et al., 1983, 1984; Tufféry & Lavery, 1993) have been applied. Murzin & Finkelstein (1988) attempted to predict the topology and orientation of certain helical assemblies by arranging them in polyhedral shells. Harris et al. (1994) have performed a careful study of the diversity in four-helix bundle proteins.

The work presented here was stimulated by the observation that observed helix-helix packing angles demonstrate a pronounced preference for Ω ≈ −50°/130°. It is difficult to imagine why this preference should be a result of the relative length of one ridge along one helical side, as argued by Chothia et al. (1981), or due to the less splayed character of residues in the i = 4 ridge (Chothia et al., 1981; Hutchinson et al., 1994). The contact-forming residues in helix association need not belong to one and the same ridge. Maximizing the burial of hydrophobic surface upon contact (presumably favouring smaller packing angles) or an easier and fitter packing of amino acid side-chains at a certain packing angle would seem to provide more natural explanations. Thus, the model of unrolled helix lattices was further investigated and treated mathematically in a rigorous fashion. Which set of helix parameters (the radius of the helix R, twist angle ω and the rise per residue Δ) guarantees an optimal match and association of two identical and ideal helical lattices in a face-to-face manner after translating one of them (homogeneous packing)? Can the ambiguities in the ridges-into-groove model be resolved by considering optimization of the packing density? To approach these questions, conditions for optimal packing were mathematically formulated to allow careful consideration of all solutions. To check the theoretical model, a statistical analysis of experimentally determined helix-helix packings was effected. The latter showed that a 180° periodic selection in Ω was not uniform. An explanation for this is provided here based on the tertiary structural configuration of helices, especially the Cα–Cβ bond direction. To the authors' knowledge, the treatment here is mathematically rigorous in contrast to all the previous works where more visual approaches were adopted and various helical geometric parameters were held fixed. The non-uniform Ω distribution has not been previously addressed. The various optimal and suboptimal packing modes made salient by the model should aid in protein engineering and design, especially in selection of residue types to achieve specific helical contact sites or axial orientations.

Table 1. Definition of symbols

Symbol          Definition
R               Radius of a helix
Δ               Rise per residue along the helix axis
ω               Angular twist per residue along the helix path
Ω               Dihedral packing angle (sign conventions as in Chothia et al., 1981)
a, b, c         Identifiers for the three optimal solutions of the functions describing the lattice superposition
a_m, b_m, c_m   Identifiers for the three optimal solutions of the model equations where the mean values of Δ and ω are taken from helices observed in protein tertiary structures
Ω_N             Dihedral packing angle using the 180° rotation symmetry of ideal helix-helix packing; i.e. Ω_N = Ω + 180° if Ω < 0°; otherwise Ω_N = Ω
τ               Angle measuring the deviation of the helix axis from the contact plane of the helix pair; i.e. the plane normal to the line of closest approach. The angle is non-zero when, for straight or curved helix axes, the line of closest approach is not perpendicular to at least one of the respective helix axes and crosses the axes at the helical termini
d               Distance of closest approach between two fitted helix axes
d_i             Distance of closest approach between a local helix axis assigned to the residue i of the first helix and the second fitted helix axis; i.e. the shortest distance between the position obtained by drawing a perpendicular from the geometric centre of the side-chain atoms of residue i (Cα for Gly) to its fitted helix axis to the second fitted helix axis
α               Skew angle (Harris et al., 1994) between a vector, obtained by drawing a perpendicular from the geometric centre of the side-chain (Cα for Gly) to its fitted helix axis and the local line of closest approach between the interacting helices
P_c             Site of contact between a residue of one helix and a second helix; i.e. the geometric centre of the positions of two side-chain atoms of the residue of the first helix that are closest to the second helix axis (Cα only for Gly and Cβ only for Ala)
P_tip           Position of the side-chain atom of a contacting helical residue that is furthest from the fitted helix axis (Cα for Gly)
R_a             Apparent helix radius defined as the distance of the P_tip-atom to the fitted helix axis; atomic radii were not considered and were assumed to be compensated by side-chain–side-chain interdigitation upon packing

Several of these parameters are illustrated by Figure 1.

Mathematical Description

Homogeneous (hydrophobic) packing

Optimal (perfect) packing

The model used here assumes regular and straight helices of radius R, twist angle ω between successive residues along the helix path and a rise per residue Δ along the helix axis. Various symbols utilized throughout the text are listed in Table 1 and their definitions are illustrated in Figure 1. Unrolling an ideal helix onto a plane towards the observer results in a regular lattice, as shown in Figure 2, where each point represents a residue. In associating α-helices, each of the same geometry, one lattice must be rotated relative to the other about a lattice position such that the points of the two lattices overlap. Then an appropriately chosen translation of one lattice must be effected so that the knobs (points in one lattice) fall into the centre of parallelograms (holes) in the other helix (Figure 2). The parallelograms are formed by connecting four neighbouring points in one of the lattices. This packing optimization, where infinite lattices of unrolled helices overlap, is justified by the assumption that the global two-dimensional optimum coincides with the best possible local packing optimum in three dimensions.

Figure 1. Schematic drawing of the parameters used for the description of helix-helix packing geometries. A1' and A2' correspond to the helix axes projected onto the contact plane, which is normal to the line of closest approach. Definitions of the parameters and associated symbols are given in Table 1.

This phenomenon can be mathematically formulated. Each lattice can be respectively described with two base vectors (v1 and v2; v1'' and v2''). In face-to-face packing of the helices, the base vectors are related through mirror symmetry: $v'_{x;1,2} = -v_{x;1,2}$ and $v'_{z;1,2} = v_{z;1,2}$, where x and z represent respective vector components and the mirror plane contains the z-axis (Figure 2). Superposition requires rotation of one lattice such that $\mathbf{v}''_{1,2} = \mathbf{R}_\Omega \mathbf{v}'_{1,2}$, where $\mathbf{R}_\Omega$ is a rotation matrix corresponding to two helices with axial packing angle Ω. The lattice point P_i is the centre of rotation (Figure 2). Under the condition of perfect overlap, the mirrored and rotated lattice can be described through a linear combination of the original base vectors (v1, v2) with four integer factors (n1, n2, n3 and n4) such that:

$$\mathbf{v}''_1 = n_1\mathbf{v}_1 + n_2\mathbf{v}_2 \tag{1}$$

$$\mathbf{v}''_2 = n_3\mathbf{v}_1 + n_4\mathbf{v}_2 \tag{2}$$

where n1..4 = 0, +1 or −1.

Figure 2. Detail of a helical lattice created by unrolling an infinite regular helix onto a plane towards the observer and specified by an xz-coordinate system. The origin is set onto one lattice point P_i representing residue i. The vectors correspond to possible base vectors for the lattice. The indices of the points refer to respective amino acid sequence positions along the helix relative to residue i, where k = [2π/ω] + 1, where 2π/ω is made integral through truncation and ω is in radians. The sequence separations for average α-helices are given in parentheses. Two of any of the three base vectors shown can be chosen to specify the lattice (three possibilities). The grey coloured parallelograms correspond to the three topologically possible packing cells (holes) into which the lattice points (knobs) can be fit, resulting in helix-helix packing. The identifier of a given cell is calculated from indices of lattice points associated with the cell, which are summed after becoming powers to the base 2, and k is assumed as 4; for example, a cell bounded by i, i + 3, i + 4 and i + 7 is identified by 153 = 2^0 + 2^3 + 2^4 + 2^7; similarly for cell 27 (2^0 + 2^1 + 2^3 + 2^4); cell 51 and so forth.

Without loss of generality, two base vectors with respective components can be selected (Figure 2) such that:

$$\mathbf{v}_1 = P_{i+1} - P_i = \begin{pmatrix} AR \\ \Delta \end{pmatrix} \tag{3}$$

$$\mathbf{v}_2 = P_{i+k} - P_i = \begin{pmatrix} BR \\ k\Delta \end{pmatrix} \tag{4}$$

where A = ω, B = kω − 2π and k = [2π/ω] + 1, where 2π/ω is truncated to an integral value and ω is in radians. Since the helices are considered as cylinders, the x-co-ordinate corresponds to arcs on the cylinder. As shown in the Appendix, this system of equations yields three distinct solutions (Table 2) for the packing angle Ω, corresponding to packing classes designated here as a, b and c. The solutions were found to possess a 180° periodicity as indicated by their signs. The functions for each class, corresponding to particular values of n1..4, are subsequently given where the (b) relationships for R/Δ are derived by squaring and summing the (a) relationships:

class a:

$$\cos(\Omega) = \pm\frac{(1-k)B + kA}{kA - B}, \qquad \sin(\Omega) = \pm\frac{R}{\Delta}\,\frac{2AB - B^2}{B - kA} \tag{5a}$$

$$\left(\frac{R}{\Delta}\right)^2 = \frac{kA(2k-4) + kB(2-k)}{B^3 - 4AB^2 + 4BA^2} \tag{5b}$$

class b:

$$\cos(\Omega) = \pm\frac{A - kB}{kA - B}, \qquad \sin(\Omega) = \pm\frac{R}{\Delta}\,\frac{A^2 - B^2}{B - kA} \tag{6a}$$

$$\left(\frac{R}{\Delta}\right)^2 = \frac{k^2 - 1}{A^2 - B^2} \tag{6b}$$

Table 2. The six solutions for optimal superposition of ideal helical lattices

Solution   n1   n2   n3   n4   Corresponding equation (a)   Ω (deg.) (b)   R (Å) (c)   Class ident. (d)
(1)        −1    1    0    1   (5a, b): + sign                  −37.1          3.0         a
(2)         1   −1    0   −1   (5a, b): − sign                  142.9          3.0         a
(3)         0    1    1    0   (6a, b): + sign                  −97.4          3.5         b
(4)         0   −1   −1    0   (6a, b): − sign                   82.6          3.5         b
(5)        −1    0   −1    1   (7a, b): + sign                   22.0          4.3         c
(6)         1    0    1   −1   (7a, b): − sign                 −158.0          4.3         c

(a) Solution corresponding to the given sign conditions in the specified equations.
(b) Values given for the packing angle assume a twist angle ω = 99.1°, which is the mean observed value in known protein structures.
(c) The helical radius given assumes ω = 99.1° and Δ = 1.45 Å, values most often observed in actual helices.
(d) Class identification (see the text).
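The entries of Table 2 can be checked numerically from equations (1) to (4) alone. The following Python sketch is an illustration, not the authors' procedure: it assumes a counter-clockwise rotation matrix, uses the fact that a rotation preserves vector length to obtain R/Δ, and then reads the packing angle from the mirrored base vector v1 and its image n1·v1 + n2·v2. The names `lattice_params`, `solve` and `TABLE2` are hypothetical.

```python
import math

# Integer factors (n1, n2, n3, n4) with the tabulated angle (deg) and radius (A).
TABLE2 = [((-1, 1, 0, 1), -37.1, 3.0), ((1, -1, 0, -1), 142.9, 3.0),
          ((0, 1, 1, 0), -97.4, 3.5), ((0, -1, -1, 0), 82.6, 3.5),
          ((-1, 0, -1, 1), 22.0, 4.3), ((1, 0, 1, -1), -158.0, 4.3)]

def lattice_params(omega_deg):
    """Return (k, A, B) of equations (3)-(4) for a twist angle in degrees."""
    omega = math.radians(omega_deg)
    k = int(2 * math.pi / omega) + 1          # 2*pi/omega truncated, plus 1
    return k, omega, k * omega - 2 * math.pi

def solve(n, omega_deg=99.1, delta=1.45):
    """Packing angle (deg) and helix radius (A) for integer factors n."""
    k, A, B = lattice_params(omega_deg)
    n1, n2, n3, n4 = n
    # Each vector is (x_coeff * R, z_coeff * delta); pair = (original, image).
    pairs = [((A, 1), (n1 * A + n2 * B, n1 + n2 * k)),
             ((B, k), (n3 * A + n4 * B, n3 + n4 * k))]
    ratio_sq = None
    for (x0, z0), (x1, z1) in pairs:
        den = x0 * x0 - x1 * x1
        if abs(den) > 1e-12:                  # length conservation fixes R/delta
            ratio_sq = (z1 * z1 - z0 * z0) / den
            break
    if ratio_sq is None or ratio_sq <= 0:
        raise ValueError("no real radius for these factors and twist angle")
    ratio = math.sqrt(ratio_sq)               # R / delta
    (x0, z0), (x1, z1) = pairs[0]
    ux, uz = -x0 * ratio, z0                  # mirrored v1 (x -> -x)
    wx, wz = x1 * ratio, z1                   # its image n1*v1 + n2*v2
    angle = math.degrees(math.atan2(ux * wz - uz * wx, ux * wx + uz * wz))
    return angle, ratio * delta

if __name__ == "__main__":
    for n, omega_tab, r_tab in TABLE2:
        angle, radius = solve(n)
        print(n, round(angle, 1), round(radius, 2), "| table:", omega_tab, r_tab)
```

Under these assumptions the computed angles agree with the tabulated ones to the printed precision, and the radii come out near 3.06, 3.49 and 4.31 Å, close to the rounded values in the table.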

Figure 3. Graphical representation for the three solutions (packing classes a, b and c) of optimally overlapped helical lattices. Calculated optimal packing angle Ω_N and calculated ratio R/Δ are shown as functions of the helix twist angle ω for the given interval. The vertical lines correspond to the mean twist angles for the three helix types (α (ω = 99.1 ± 15.0°), 3₁₀ and π-helices). The filled circles in B correspond respectively to the ratios R/Δ for different amino acids obtained by dividing the mean observed distances of the "tip" atoms (P_tip) of the residues Gly, Ala, Val and Leu to the axis of α-helices (i.e. the apparent helix radius) with the corresponding mean rise per residue of the respective helix type (Δ = 1.45 (±1.1) Å for the α-helix, 2.0 Å (3₁₀-helix) and 1.1 Å (π-helix)). The radii for the 3₁₀ and the π-helix were corrected for their different Cα based radii (1.9 and 2.8 Å, respectively). ω for the α-helix was taken from the mean observed twist angle between the geometric centres of side-chain atoms for two consecutive helix residues. These mean values (ω and Δ) did not include observations based on the four residues at either helix terminus where the helix axis is less accurately determined (see the text). Values for the 3₁₀ and π-helices were taken from Schulz & Schirmer (1979).
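The crossing of the R/Δ curves near the mean α-helical twist angle can be located numerically. The sketch below is an illustration only: it bisects on the difference between equations (5b) and (6b) with k held at 4, which is valid for the bracketing interval chosen here (95° to 99°). The function names and the bracket are assumptions, not part of the original analysis.

```python
import math

def ratio_sq_a(omega, k=4):
    """(R/Delta)^2 for class a, equation (5b); omega in radians."""
    A, B = omega, k * omega - 2 * math.pi
    return (k * A * (2 * k - 4) + k * B * (2 - k)) / (B**3 - 4 * A * B**2 + 4 * B * A**2)

def ratio_sq_b(omega, k=4):
    """(R/Delta)^2 for class b, equation (6b); omega in radians."""
    A, B = omega, k * omega - 2 * math.pi
    return (k * k - 1) / (A * A - B * B)

def crossing(lo_deg=95.0, hi_deg=99.0, tol=1e-12):
    """Twist angle (deg) and R/Delta where classes a and b give equal radii."""
    def diff(w):
        return ratio_sq_a(w) - ratio_sq_b(w)
    lo, hi = math.radians(lo_deg), math.radians(hi_deg)
    if diff(lo) * diff(hi) > 0:
        raise ValueError("bracket does not enclose a crossing")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if diff(lo) * diff(mid) <= 0:
            hi = mid
        else:
            lo = mid
    omega = 0.5 * (lo + hi)
    return math.degrees(omega), math.sqrt(ratio_sq_b(omega))

if __name__ == "__main__":
    omega_cross, ratio = crossing()
    print(round(omega_cross, 2), round(ratio * 1.45, 2))  # twist (deg), radius (A) at Delta = 1.45 A
```

As a cross-check under the same k = 4 assumption, equating (5b) and (6b) reduces to 8A² − 30AB + 7B² = 0, whose admissible root A = 7B/2 gives ω = 7π/13, about 96.92°, in line with the quoted ω_triple = 96.9°.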

class c:

$$\cos(\Omega) = \pm\frac{B + (k-1)A}{kA - B}, \qquad \sin(\Omega) = \pm\frac{R}{\Delta}\,\frac{2AB - A^2}{B - kA} \tag{7a}$$

$$\left(\frac{R}{\Delta}\right)^2 = \frac{A(2k-1) + B(2-4k)}{A^3 - 4A^2B + 4AB^2} \tag{7b}$$

It is clear that, for each class, the packing angle Ω is a function of the twist angle ω, as is the ratio R/Δ. Ω_N and R/Δ can be plotted against ω (Figure 3) for each class. As evident from the equations, each solution has a restricted definition space and there are singularities at different twist angles. For different k dependent upon ω, the solutions repeat but have different periods. The solutions for all three classes recurrently cross at points (Figure 3) in the radius dependency (for the triple point closest to the mean observed twist angle for α-helices ω_α, ω_triple = 96.9°). The corresponding helix lattices are regular hexagons such that the three solutions for the packing angle can accommodate the same helix radius. At ω_α, the three packing classes display different radii. The smallest accommodated radius is found in packing class a (R = 3.0 Å) and the largest in class c (R = 4.3 Å) assuming the mean observed rise per residue Δ = 1.45 Å. Only for helices of the α-type is the simultaneous occurrence of all three packing classes allowed and the mean observed value ω_α is found closer to a triple point than the mean twist angle of the other helix types (π and 3₁₀). Thus, the helical lattice obtained from unrolling an average α-helix more closely resembles a regular hexagonal lattice, which allows three packing angles simultaneously. The π-helix does not correspond to solution c and the mean twist angle for 3₁₀ helices (ω_3₁₀ = 120.0°) coincides with an ambiguity where the solution switches from k = 4 to k = 3 with increasing ω. Furthermore, the required helical radii corresponding to ω intervals about the observed mean of α-helices fall in a biologically reasonable range (Figure 3B), whereas for the other helix types, the required radius for one solution has to be either infinite (solution c for π-helices) or is ambiguously defined (solution a and b for 3₁₀ helices). Solution c for 3₁₀ helices is found just at the transition to an infinite radius.

Suboptimal packing

Apart from analysing perfect lattice overlap, helix-helix axial association angles may be observed corresponding to local packing optima where, for example, base vectors are superimposed such that regularity is achieved in only one lattice direction. Since helices often pack over a few turns, the condition of infinite lattice superposition may be too strong and not fulfilled. Thus, only the nearest six neighbours around a central lattice point are considered to determine suboptimal packing. As before, the central points of two sublattices are brought into coincidence. Subsequently, one sub-lattice is rotated at the angle Ω around the central point. A packing parameter S_P was constructed to measure the degree of overlap between the two sublattices:

$$S_P(\Omega_N, R) = \sum_{m} \min_{n=1..6}\left(\frac{\left|P_m(R) - Q_n(\Omega_N, R)\right|}{R}\right) \tag{8}$$

where m = 1..6 corresponds to the six neighbouring lattice points of P_i (Figure 2), the lattice vectors to points P_m are held fixed and Q_n are the vector positions in the mirrored and rotated lattice. The angle Ω_N covers the range 0 to 180° and implies a second possible packing angle at Ω = Ω_N − 180° due to the rotational symmetry of ideal helices. Since in the model an increased helix radius implies the same for the helix-forming side-chains, the distances between the lattice points were normalized to the helical radius (R).

The behaviour of the function S_P(Ω_N, R) is shown in Figure 4, where its value is plotted against Ω_N with Δ and ω taken as the mean observed values for α-helices. The three minima of the optimal packings (S_P(Ω_N, R) = 0) can be readily identified. For a given radius more than one packing angle meets the requirements of little steric clash, increasing the possible number of helix-helix packing angles. For example, at larger radii, in addition to the steep minimum at Ω_N ≈ 20°, a broad but shallow minimum is found near Ω_N = 120°.

Determination of the translation vector for the superimposed lattice

Only optimal superposition of lattice points through rotation has been thus far considered. For actual helix-helix packing, the translation vector by which one of the superimposed lattices is shifted to bring the knobs (lattice points) into holes (lattice

Figure 4. Three-dimensional plot of the function S_P = S_P(Ω_N, R). The colour spectrum corresponds to different isosurfaces where violet colours belong to minimal values of S_P. The twist angle and the rise per residue were respectively taken as the mean observed values, ω = 99.1° and Δ = 1.45 Å. The grey shadows are shown merely for dimensional perspective and reflect the direction of the light source.
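The surface of Figure 4 can be sampled directly from equation (8). The following Python sketch is a minimal illustration under stated assumptions: the six neighbours of P_i are taken as residues i ± 1, i ± (k − 1) and i ± k (as drawn in Figure 2), the lattice is mirrored in x and then rotated counter-clockwise by Ω_N, and the function names are hypothetical.

```python
import math

def neighbours(R, omega_deg=99.1, delta=1.45):
    """Six lattice points around P_i: residues i+-1, i+-(k-1) and i+-k."""
    A = math.radians(omega_deg)
    k = int(2 * math.pi / A) + 1
    B = k * A - 2 * math.pi
    v1 = (A * R, delta)                     # i -> i+1, equation (3)
    v2 = (B * R, k * delta)                 # i -> i+k, equation (4)
    v3 = (v2[0] - v1[0], v2[1] - v1[1])     # i -> i+k-1
    pts = [v1, v2, v3]
    return pts + [(-x, -z) for x, z in pts]

def packing_parameter(omega_n_deg, R, omega_deg=99.1, delta=1.45):
    """S_P of equation (8): summed nearest-point distances, divided by R."""
    P = neighbours(R, omega_deg, delta)
    c = math.cos(math.radians(omega_n_deg))
    s = math.sin(math.radians(omega_n_deg))
    # Mirror (x -> -x), then rotate counter-clockwise by omega_n.
    Q = [(-x * c - z * s, -x * s + z * c) for x, z in P]
    total = sum(min(math.hypot(px - qx, pz - qz) for qx, qz in Q) for px, pz in P)
    return total / R

if __name__ == "__main__":
    # Coarse scan over the packing angle at a fixed radius of 4.3 A.
    for angle in range(0, 181, 10):
        print(angle, round(packing_parameter(angle, 4.3), 3))
```

With R and Ω_N set to a perfect solution of class b or c, all six neighbours coincide with lattice points and the function returns zero up to rounding, which gives a simple consistency check on the implementation.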

cells) must be applied. In accordance with the three topologically possible lattice cells (referred to as 153, 27 and 51), three translation vectors are possible where lattice points are shifted to their centres (Figure 2). The cellular designations are explained in the legend to Figure 2. To achieve the most homogeneous and dense packing, the largest possible cell must be selected for association with a side-chain of an interacting helix. The length of the smaller diagonal of each cell was chosen as a simple estimate of cellular size. (The area of an inscribed circle, an alternative, does not exist for parallelograms.) The plots of Figure 5 demonstrate that a cell's capacity depends on the helix radius and thus different cells are preferentially occupied at different helix radii. For helices with smaller radii, cell 27 should be favoured, while for helices with bigger radii, side-chains should prefer to pack into cell 153. According to the size criterion adopted here, cell 51 is never the largest packing cell. For this reason, and since in cylindrically shaped helices cell 51 is mostly oriented away from the helix-helix interface, it will not be considered further. Figure 6 shows the three possible perfectly superimposed helix lattices where the mean observed twist angle is taken. The selected packing cells are cell 27 for solutions a_m and b_m and cell 153 for c_m.

Figure 5. Size of the packing cells 27, 51 and 153 as a function of the helix radius. The size is defined by the length of the smaller diagonal associated with the respective packing cells. The twist angle was set to 99.1° and the rise per residue Δ to 1.45 Å, the respective average observed values. Arrows indicate observed radii of helices composed only of Gly (G), Ala (A), Val (V) and Leu (L).

Non-homogeneous packing

The knobs-into-holes model assumes that the amino acid side-chains pack isotropically into a hole formed by four side-chains of the second helix and with parallelogram cross-section. This might not be necessary if some other interaction joined the two helices, such as bonds formed between the associating amino acids, including disulphide bonds, salt bridges, hydrogen bonds or tight hydrophobic head-to-head contact (knobs onto knobs packing). If the packing site merely consisted of this type of contact alone, the preferred packing angles would remain the same as those derived from superposition of the helical lattices but no translation would be necessary. This situation is unlikely. Nonetheless, a mixture of knobs-into-holes and bonded contacts is still possible. Chothia et al. (1981) have coined the term "crossed ridge helix packing" for these cases. The possible packing angles for this association type can be obtained by a consideration of suboptimal packing in the model developed here. The superposition of the two central lattice points can now be interpreted as a residue-residue bond of any type. Since the neighbouring residues should still obey the normal knobs-into-holes scheme but without shifting, the function S_P(Ω_N, R) for the sublattice has now to be maximal instead of minimal. In agreement with the angles predicted by Chothia for the crossed ridge case, Figure 4 reveals three isolated maxima at

Figure 6. Helical lattices according to the theoretical solutions for packing classes a_m, b_m and c_m at the mean observed parameters (ω = 99.1°, Δ = 1.45 Å) and the corresponding packing angles Ω(a_m, b_m, c_m) = −37.1°, −97.4° and +22.0°, respectively. Starting from perfect superposition achieved by rotation of the mirrored lattice (open circles), one lattice was shifted to centre the lattice points in the appropriately chosen packing cells (see the text). The continuous (broken) line denotes the helix axis of the lattice with the filled (open) circles.
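The cell choice and shift behind Figure 6 can be sketched as follows. This Python illustration assumes k = 4 (so that the cell identifiers 27, 51 and 153 apply), uses the smaller diagonal as the size estimate described in the text, restricts the choice to cells 27 and 153 as the text does, and takes the translation vector to be the centre of the chosen parallelogram. All function names are hypothetical.

```python
import math

def base_vectors(R, omega_deg=99.1, delta=1.45):
    """Lattice vectors from residue i to i+1, i+k and i+k-1."""
    A = math.radians(omega_deg)
    k = int(2 * math.pi / A) + 1
    B = k * A - 2 * math.pi
    v1 = (A * R, delta)
    v2 = (B * R, k * delta)
    v3 = (v2[0] - v1[0], v2[1] - v1[1])
    return v1, v2, v3

def cells(R, **kw):
    """Edge vectors of the three cells, keyed by identifier (k = 4 assumed)."""
    v1, v2, v3 = base_vectors(R, **kw)
    return {27: (v1, v3),    # corners i, i+1, i+3, i+4
            51: (v1, v2),    # corners i, i+1, i+4, i+5
            153: (v3, v2)}   # corners i, i+3, i+4, i+7

def cell_size(edges):
    """Length of the smaller diagonal of the parallelogram."""
    (ax, az), (bx, bz) = edges
    return min(math.hypot(ax + bx, az + bz), math.hypot(ax - bx, az - bz))

def best_cell(R, candidates=(27, 153), **kw):
    """Largest candidate cell and the shift that moves a knob to its centre."""
    table = cells(R, **kw)
    ident = max(candidates, key=lambda c: cell_size(table[c]))
    (ax, az), (bx, bz) = table[ident]
    return ident, ((ax + bx) / 2, (az + bz) / 2)

if __name__ == "__main__":
    for radius in (3.0, 4.3):
        print(radius, best_cell(radius))
```

At the class b radius the two candidate cells have equal size by construction (the condition for class b is that v1 and v2 have the same length), so the selection near R = 3.5 Å is a tie in this sketch.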

° ° ° a VN155 , 115 and 175 . In addition, helices with closest four consecutive C positions around the ° a, a, large radii should also pack at VN175 . residue i (i.e. Cj + i − 1 ... Cj + i + 2 where j = 0 for the inner helical residues and appropriately chosen at Analysis of Observed Helix-Helix the helix termini and the points are correspondingly Associations shifted along ui ). The length of the local axis is first set to 1.5 A˚ . The direction of the local axis Ai The theoretical model used here assumes that associated with residue i is then smoothed by taking ideal helices pack. In the following section, the the average direction of three consecutive local experimental verification of the conclusions drawn vectors centred at i (two at the helix ends). To from the mathematical lattice superposition model achieve a continuous axial curve over the entire is discussed. helix, the new starting and ending points of consecutive local lines are joined by calculating the Data middle point between the end point of the first local stretch and the starting point of the next local A total of 220 protein tertiary structures, stretch. The new lengths and directions of the local determined at 2.0 A˚ resolution or better and with axes are then recalculated. This smoothing pro- mutual sequence similarity less than 35% as cedure is repeated three times. Despite the selected by the program OBSTRUCT (Heringa simplicity of this algorithm, the improvement for et al., 1992, available via World-Wide-Web; URL: the fit of the local axis is considerable. 
http://www.embl-heidelberg.de/obstruct/obstruct info.html) were used for a statistical analysis of helix-helix association (the set is available upon request by e-mail to [email protected]). The assignments of the α-helical stretches were taken from the program DSSP (Kabsch & Sander, 1983). The angle between two consecutive carbonyl bonds was not allowed to exceed 65°; otherwise, the helix was divided into two at this residue. Two helices were defined to be in close contact if at least three residues of each helix had at least one interhelical atom-atom contact with maximal threshold distance of 4.5 Å between atom centres. The resulting dataset of proteins used in this study contained 687 closely packed pairs of helices. Membrane proteins were not included and only heavy atoms were considered.

Definition of the helix axis

The definition of the helix axis from which many contact characteristics are measured bears critically on the results. Since helices can be bent, a procedure to fit a local helix axis, A_i, to every residue i along the helix was adopted. It takes advantage of a straightforward algorithm for the overall axial definition given by Chothia et al. (1981). The vector coincident with the local helix axis of residue i, u_i, can be determined from the cross product of the vectors B_i and B_{i+1} such that:

    u_i = B_i × B_{i+1}    (9)

where:

    B_i = r_i + r_{i+2} − 2r_{i+1}    (10)

and r is the position vector of the Cα atom in residue i. At the C terminus of the helix, where the residue indices would go beyond those in the helix, the local line vector is taken from the closest helical constituent residues. A point on the local axis A_i is assigned by calculating the geometric centre of the

The standard deviation of the distances of each Cα atom to the helix axis decreased from s = 0.34 Å for a globally defined axis (obtained by averaging the vectors u_i over the whole helix and taking the geometric centre of every Cα position) to s = 0.14 Å when using the local axes. When only the inner helical residues were considered (four residues subtracted at either helix terminus), the accuracy was improved. The standard deviation decreased from s = 0.37 Å for the global axes to s = 0.07 Å for local axes.

The packing angles are positive if the background helix is rotated clockwise with respect to the frontal helix when facing them. The helices are parallel with respect to their sequence direction at V = 0°. The packing angles are sometimes normalized to the interval 0° < VN < 180° because of the 2-fold rotation axis of ideal helices; i.e. VN = V + 180° if V < 0°; otherwise VN = V. The line of closest approach between two helices was calculated by determining the line of closest approach with minimum length amongst all local axes pairs. To ensure a face-to-face packing, distorted helix-helix associations where the line of closest approach intersected at the helix termini and t > 5° (Table 1 and Figure 1) were omitted, leaving 449 helix pairs for analysis.

Packing cell determination

The packing cell of a second helix utilized by a contacting residue in the first helix was determined by the sequence separation of four residues containing, respectively, one of the four closest atoms (one closest atom per residue) to the geometric centre of the two atoms in the contacting residue (Cα for Gly and Cβ for Ala) in the first helix that are closest to the axis of the second helix (position of contact, Pc; see Figure 1 for illustration). To ensure a real packing conformation, the third closest residue of the second helix to the contacting residue in the first helix was required to be within a distance of 6.0 Å. Despite the imprecision of the cell-determining procedure, the observed ranking of cell usage is as predicted. The three topologically possible cells (153, 27 and 51) were detected most often with respective counts 767, 647 and 228. Other determined cells such as 23 and 275 had frequencies of 48 and 42, and were followed by others.
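The local-axis construction of equations (9) and (10) can be sketched in a few lines. The Python below is a minimal illustration, not the authors' program: the function names (bisector, local_axis_directions) are hypothetical, only the axis direction u_i is computed, and reusing the last defined direction for the C-terminal residues is one possible reading of the rule stated in the text.

```python
import math
from typing import List, Sequence, Tuple

Vec = Tuple[float, float, float]


def bisector(r0: Vec, r1: Vec, r2: Vec) -> Vec:
    """B_i = r_i + r_{i+2} - 2 r_{i+1} (equation 10)."""
    return tuple(a + c - 2.0 * b for a, b, c in zip(r0, r1, r2))


def cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(v: Vec) -> Vec:
    n = math.sqrt(sum(x * x for x in v))
    if n == 0.0:
        raise ValueError("degenerate geometry: zero-length axis vector")
    return tuple(x / n for x in v)


def local_axis_directions(ca: Sequence[Vec]) -> List[Vec]:
    """Unit vectors u_i = B_i x B_{i+1} (equation 9), one per residue.

    u_i needs the C-alpha positions r_i .. r_{i+3}. Assumption: residues
    near the C terminus, where those indices run out, reuse the closest
    direction that could be computed.
    """
    n = len(ca)
    if n < 4:
        raise ValueError("at least four C-alpha positions are required")
    b = [bisector(ca[i], ca[i + 1], ca[i + 2]) for i in range(n - 2)]
    u = [unit(cross(b[i], b[i + 1])) for i in range(n - 3)]
    u.extend([u[-1]] * (n - len(u)))
    return u
```

For an ideal right-handed helix built along the z axis, every direction returned by this sketch points along +z, which is a convenient sanity check before applying it to bent helices.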

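The close-contact criterion used to build the dataset (at least three residues of each helix with at least one interhelical atom-atom contact within 4.5 Å) is simple to state in code. The sketch below assumes a hypothetical data layout in which a helix is a list of residues and each residue is a list of heavy-atom coordinates; the function name and parameters are illustrative only.

```python
import math
from typing import List, Tuple

Atom = Tuple[float, float, float]
Residue = List[Atom]   # heavy atoms only, as in the source
Helix = List[Residue]


def helices_in_close_contact(helix_a: Helix, helix_b: Helix,
                             cutoff: float = 4.5,
                             min_residues: int = 3) -> bool:
    """True if both helices have >= min_residues contacting residues."""

    def touches(res: Residue, other: Helix) -> bool:
        # A residue is in contact if any of its atoms lies within the
        # cutoff of any atom of the other helix.
        return any(math.dist(p, q) <= cutoff
                   for other_res in other
                   for q in other_res
                   for p in res)

    n_a = sum(1 for res in helix_a if touches(res, helix_b))
    n_b = sum(1 for res in helix_b if touches(res, helix_a))
    return n_a >= min_residues and n_b >= min_residues
```

The brute-force double loop is adequate for pairs of helices; a spatial grid would only matter for whole-database scans.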
Algorithm for interhelical “bond” determination

Interhelical bonds were determined on the basis of geometric pattern recognition. A bond was identified between two side-chains in different helices if their corresponding Pc sites were mutually the closest to each other. The Pc sites must be no more than 4 Å apart and the closest Pc site for other residues in the same helix must be 5 Å or greater. Furthermore, the angle between the local line of closest approach and the vector joining the two mutually closest positions Pc was required to be smaller than 45°. These conditions assured knobs-onto-knobs packing, and that identified residues literally faced each other and did not pack into a cell formed by the oppositely facing helix. For 95 helix-helix pairs, this definition was fulfilled; 80 such pairs had only one interhelical bond while 15 displayed two.
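The bond criteria above amount to a mutual-nearest-neighbour search with two distance filters and an angular filter. The following Python is a hedged sketch under stated assumptions: find_bonds and its parameters are hypothetical names, the 5 Å condition is read as "every other Pc site of the partner helix lies at least 5 Å away", and a single direction vector stands in for the local line of closest approach, which the source defines per residue.

```python
import math
from typing import List, Sequence, Tuple

Vec = Tuple[float, float, float]


def angle_to_line_deg(v: Vec, line_dir: Vec) -> float:
    """Acute angle between vector v and an unsigned line direction."""
    dot = abs(sum(a * b for a, b in zip(v, line_dir)))
    norm = math.hypot(*v) * math.hypot(*line_dir)
    return math.degrees(math.acos(min(1.0, dot / norm)))


def find_bonds(pc_a: Sequence[Vec], pc_b: Sequence[Vec], approach: Vec,
               d_bond: float = 4.0, d_clear: float = 5.0,
               max_angle: float = 45.0) -> List[Tuple[int, int]]:
    """Index pairs (i, j) of Pc sites that satisfy all bond conditions."""
    bonds: List[Tuple[int, int]] = []
    if not pc_a or not pc_b:
        return bonds
    for i, p in enumerate(pc_a):
        # Nearest Pc site in helix B, then check that the relation is mutual.
        j = min(range(len(pc_b)), key=lambda k: math.dist(p, pc_b[k]))
        q = pc_b[j]
        i_back = min(range(len(pc_a)), key=lambda k: math.dist(q, pc_a[k]))
        if i_back != i:
            continue
        d = math.dist(p, q)
        if d == 0.0 or d > d_bond:
            continue
        # Assumed reading: no other Pc site of the partner helix within d_clear.
        others = [math.dist(p, pc_b[k]) for k in range(len(pc_b)) if k != j]
        others += [math.dist(q, pc_a[k]) for k in range(len(pc_a)) if k != i]
        if any(x < d_clear for x in others):
            continue
        joining = tuple(b - a for a, b in zip(p, q))
        if angle_to_line_deg(joining, approach) >= max_angle:
            continue
        bonds.append((i, j))
    return bonds
```

The clearance test is what separates a head-to-head contact from a knob sitting in a cell: a residue packed into a hole has several partner sites at similar distances and is therefore rejected.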

Results

The distribution of the observed global (per helix-helix pair) and local (per amino acid residue along the two packed helices) dihedral packing angles in the selected set of proteins is shown in Figure 7. To a certain extent, the histogram of the local packing angles biases the observations to more parallel or antiparallel associations because of longer possible contact regions. Yet, it allows considerations at which angle packings are possible over a longer stretch where the lattice model is certainly more critical. To account for possible restrictions due to short loops connecting two successive helices along the chain, which disallows parallel packing, the condition of more than 20 intervening residues was applied in a second histogram. In a third histogram, all helix-helix pairs were used except those displaying interhelical bonds.

Figure 7. Frequency histogram for the observed dihedral helix-helix packing angle V (bin width 10°). A, Packing angle about the global line of closest approach. B, Histogram of local packing angles; i.e. the packing angle about the local line of closest approach defined for each contacting amino acid in the helix-helix pair (Figure 1). The light grey filled histogram corresponds to data based on all observed helix-helix pairs while the dark grey filled histogram was determined from those with more than 20 intervening residues between the end of one helix in a contacting pair and the beginning of the other helix. The third histogram (thick line) corresponds to all helix-helix pairs with no detected interhelical bond. Arrows show the predicted packing angles for the three optimal solutions (am, bm and cm) according to the theoretical model developed here; the mean observed twist angle was taken as 99.1°.

The two largest peaks occur in the intervals −70° ≤ V ≤ −20° and +110° ≤ V ≤ +140°. The medium peaks are found at −170° ≤ V ≤ −150° and −110° ≤ V ≤ −90°. Fewer helical pairs pack at +10° ≤ V ≤ +60° and +160° ≤ V ≤ +180°. As indicated by arrows in Figure 7, in the negative angular range the optimal solutions (am, bm and cm) of the helical lattice superposition model match the observed peaks well. In the positive range, the class a peak at V ≈ +142° misses the observed peak by 20°, which rather corresponds to the angle of the predicted suboptimal solution for larger helix radii. The positive class b peak falls at a peak shoulder. The expected peak at V ≈ +22° is little observed in the histogram of the global packing angles.

The correlation between frequencies of packing angles in the negative range to its periodic angle in the positive range; i.e. r[f(V<0), f(V+180°)], was 0.74. The absence of a pronounced peak at V ≈ +22° is perhaps explained by the repulsion of the dipole moments in nearly parallel helices. However, the importance of this effect is doubted by several authors, and theoretical and experimental results indicate that the effect may be limited (Chou & Zheng, 1992; Robinson & Sligar, 1993). Furthermore, in the histogram of the local packing angles (Figure 7B), the expected peak at V ≈ +22° is better observed, indicating that the packings at larger

positive angles are accommodated by shorter contact regions for which the lattice approach is naturally less stringent. An examination of the packing parameter SP (Figure 4) at VN ≈ +60° and V ≈ 0° indicates steric unacceptability at all helical radii, which explains the paucity of observations near these packing angles where only interhelical bond packing is possible. Nonetheless, from the theoretical model described here, the observed distribution is likely to consist of a mixture of optimal and suboptimal solutions that are a function of the helical radii as well as knobs-onto-knobs packing cases. A detailed structural investigation will be subsequently described.

Goodness of packing at different packing angles

Helix-helix interfaces should display a high atom density at their association site (core). In Figure 8 this property is shown as a function of packing angle. The packing core was defined by a sphere centred at the middle position of the line of closest approach and with a radius of half the length of the line (r = d/2). The running average of the densities has pronounced maxima. Except packings at V ≈ 22° (very few data), high densities are generally achieved in agreement with the predicted optimal packing classes am, bm and cm (i.e. highest possible homogeneous packing density), which are derived from the theoretical model by using v = 99.1°, the mean observed value. The peak at V ≈ 140° is broad and extends to V = +120°, a suboptimal packing angle of helices with large radii (vide infra). Interestingly, the highest core densities are found at the most frequently observed packing angles (V ≈ −40°/+130°, Figure 7). The plot shows deviations from 180° periodicity.

Figure 8. Observed density of interface atoms at different packing angles. The core region is defined by a sphere centred at the middle of the line of closest approach (length d) between the two helices and with radius r = d/2. Only atoms belonging to contact-forming amino acids have been considered; i.e. at least one interhelical atom-atom contact was within 4.5 Å. Furthermore, it was required that all helix termini (Cα positions) be outside the defined sphere. Arrows correspond to the optimal packing angles for classes am, bm and cm. Results were obtained by a running average over 20 successive point clusters of the ordered helix data pairs with respect to VN with a mean standard deviation s = 0.013 Å^−3.

Radius dependency of helix-helix packing angles

In the present model, it is assumed that the associating helices have the same radius. A check was made regarding this assumption. The distance of the side-chain atom that belongs to a contacting residue in one helix and that is the furthest from its own helix axis was projected onto the local line of closest approach between the two helices and then normalized by division with the closest approach distance. Residues were considered only if the angle between the perpendicular drawn from the geometric centre of the side-chain to its helix axis and the local line of closest approach was smaller than 25° to ensure that the residue points almost directly to the second helix along the local line of closest approach (“radius” at the interaction site). It was further required that identified contacting residues be surrounded by at least three side-chains of the second helix with closest side-chain atoms within 6 Å to the Pc site of the contacting residue. Good knobs-into-holes packing was thus guaranteed. The normalized distances for residues fulfilling these conditions have a mean of 0.44 and a standard deviation s of 0.08 over 491 residues. The relatively low value of s confirms that the assumption of similar radii is reasonable for the many contacting regions.

Figure 9. Radius dependency as a function of the dihedral packing angle as expressed through a 30-point running average of the distances of closest approach ordered with ascending V values. Arrows correspond to the optimal packing angles for classes am, bm and cm. The mean standard distribution for the 30-point clusters was 1.2 Å.

It is clear from Figure 3B and the behaviour of the function SP(VN, R) (Figure 4) that for the optimal solutions the radii of the helices and thus the distance of closest approach between packed

helices should increase in the order R(am) < R(bm) < R(cm). As shown in Figure 9 (mean closest approach distances versus the packing angle), these predictions are confirmed by the observed data in the negative angular range. Shortest distances of closest approach are found in agreement with the optimal knobs-into-holes packing classes, which allow interdigitation of the side-chains. As predicted, the distances increase from am to bm and cm. However, the distribution is, like the histogram of the packing angles (Figure 7), not periodic. In the positive range, very close distances are not observed. Only a shallow minimum (knobs-into-holes interdigitation) is observed in an interval where a suboptimal packing for larger helix radii is predicted. This is consistent with the observation that the main peak in the histogram of packing angles in the positive range (Figure 7) actually agrees with the suboptimal packing arrangement of helices with larger helix radii (Figure 4). It seems as if the required small helix radii for the optimal solutions am and bm (short distances of closest approach), practically corresponding to packings of Gly or Ala in the packing core, are less compatible with positive packing angles. This recurrent and consistent deviation from periodicity can be explained by the actual spatial packing geometries of the two periodic solutions that are naturally excluded in the two-dimensional lattice model (see Discussion).

From Figure 4, it is evident that helices with radii between 3.0 Å and 4.5 Å fall simultaneously into the three minima of the function SP(VN, R). Thus, residue types that impart such helix radii should provide the largest packing flexibility and therefore be advantageous over others and correspondingly be often observed as radius-determining residues in the packing core; i.e. they should display a small skew angle a (see Table 1 and Figure 1 for definition and illustration) to the other helix axis. At larger a angles, the size of the residue becomes significant, such that extended residues can compensate for the cylindrical shape of helices and are observed more often because of the closer distance to the second helix. It can be shown that the mean skew angle increases from Ala to Val to Leu (aAla = 41.1°, aVal = 43.2° and aLeu = 47.7°). The logarithmic relative frequencies of the 20 amino acids as packing core residues normalized to the mean relative frequency of the residue within helices generally is shown in Figure 10. The relative probability is highest for Ala followed by Val and Ile. The radius of a poly(Ala) helix is near 3.4 Å and 4.5 Å for poly(Val) and 5.2 Å for poly(Ile). The preference of Ala and Val is thus explained by their ability to allow flexible packing arrangements to achieve an optimal protein fold.

Figure 10. Logarithmic relative probability for the occurrence of individual amino acids in the packing core. Only side-chains with a corresponding skew angle a smaller than 25° were taken under the condition that di < d + 1.0 Å (Figure 1, Table 1). The obtained relative frequencies were normalized by the relative frequencies of the appropriate amino acid as found in all helices of the dataset. Listed in the plot are the specific helical radii corresponding to helices composed entirely of a given amino acid type and the respective observed standard deviations. Numbers near the bars correspond to the number of observations.

Radius dependency of packing cell (holes) occupancy

Larger residues of one helix should preferably pack into the largest cell formed by the residues i, i + 3, i + 4 and i + 7 (cell 153) and smaller residues would prefer the smaller cell formed by residues i, i + 1, i + 3 and i + 4 (cell 27; Figures 2 and 5). Corresponding preferences for these cells are confirmed qualitatively by the experimentally observed data (Figure 11A). The small amino acids Gly and Ala clearly prefer to occupy cell 27. At an apparent helix radius of 4.5 Å (distance of the tip atom of the residue occupying the cell to its helix axis), the occupancy is balanced between cell 27 and cell 153. Larger residues (Ile, Leu and Phe) prefer cell 153 as predicted. The assignment of the packing cell is based on the position of contact (Pc) and not the tip atom (Ptip). For bigger side-chains these two positions may differ, likely explaining the weak difference for the two cells in occupancies at R > 7.0 Å. It was observed that cell 51 also prefers larger amino acids; since it is somewhat to the side of the helix-helix interface, extended residues must reach like arms to fill it. The discrete nature of residue sizes used in packing is also evident from the clear peaks in the plot of Figure 11A. Furthermore, the more direct approach relating the distance of closest approach to the predominantly used packing cell type at a given helix-helix interface also confirms the theoretical conclusions (Figure 11B). More closely packed helices preponderantly utilize cell 27, while cell 153 dominates for helices further apart at the association site.

Correlation of packing angle and preferred packing cell

Since the helix radius is related to the packing angle and to the packing cell predominantly occupied, a well-defined correlation should exist

between the packing angle and the preferred packing cell. At packing angles preferred by larger (smaller) helices, the packing cell 153 (27) should be mainly occupied. By assigning each pair of packed helices to one cell class determined by the prevailing cell type used, the resulting distribution is in good agreement with the predictions of the model (Figure 12). Because of the sparseness of data, the packings were normalized to the range 0° ≤ VN ≤ 180°. Cell 153 is preferably occupied over two packing angle intervals. It is the dominating cell at VN ≈ 25° and occurs also at VN ≈ 130°, a suboptimal packing angle for larger radii. The peaks are clearly distinct for cell 27 at VN ≈ 150° where the model predicts association of helices with smaller radii. Peaks for cell 27 are found also at VN ≈ 80° and VN ≈ 40°. The former angle corresponds to an optimal solution (class b) and the latter can be identified as suboptimal for helices with smaller radii. The distribution of SP(VN, R) for helices with large radii has two minima, in contrast to that of helices with smaller radii, which exhibits three minima (Figure 4). This is confirmed by the data shown in Figure 12, which reveals that the broad peak at VN ≈ 130° actually comprises two different packing modes, optimal packings (cell 27 peak) and suboptimal packings (cell 153 peak).

Figure 11. Correlation between the helix radius and the occupied packing cell. A, Normalized histogram of occupancies of a specific packing cell (hole) are plotted as a function of the length (size) of the occupying side-chain defined by the distance of its tip atom to its helix axis (apparent helix radius Ra, Figure 1). Packing cell 27 is indicated by a broken line and cell 153 by a continuous line. For comparison, the mean distances for selected amino acids as found in all helices of the protein dataset are indicated by the arrows. The bin width was taken as 0.25 Å. B, Normalized histogram of observed distances of closest approach for helices with a predominantly packed cell 27 (broken line, 34 examples) and cell 153 (continuous line, 48 examples). The respective difference in the number of occupied packing cells of the two types for a given helix-helix contact region was larger than 2 to ensure cell-type dominance. Conditions that deem a cell occupied are discussed in the text. The bin width was taken as 1 Å.

Figure 12. Relative occupancies of packing cells (holes) as a function of packing angle VN. A helical pair was assigned to only one cell packing type according to the most frequently occupied cell along the contact. Counts are registered only if the number of occupied cells of type 153 (continuous line) is larger by at least +3 than the number of occupied cell types 27 for a single pair of packed helices (48 examples) and vice versa for cell 27 (broken lines, 34 examples).

Non-homogeneous associations

Besides side-chain interdigitation facilitated by van der Waals contacts of apolar atoms (knobs into holes), interhelical salt bridges, disulphide bonds, hydrogen bonds and tight head-to-head van der Waals contacts can constitute interhelical contacts, referred to here as interhelical bond interactions (knobs-onto-knobs). Indeed, cysteine and charged residues, and the polar asparagine were found to show the highest propensity of forming such bonds. Helix-helix associations with only one such interhelical bond were found more often at the expected packing angles (vide supra), provided that at least one helix of the pair had less than 12 residues (37 examples, data not shown). In longer helices with larger contact regions, the packing angles behaved according to knobs-into-holes where hydrophobic contacts dominate.
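The cell-dominance assignment behind Figure 12 combines two rules stated in the text: the angle normalization VN = V + 180° for V < 0°, and the requirement that one cell type outnumber the other by at least three occupied cells. A minimal Python sketch follows; the function names, the input layout (a packing angle plus the list of cell labels observed along the contact) and the 10° bin width are assumptions made for illustration, since the source does not state the bin width of Figure 12.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional, Tuple


def normalize_packing_angle(v_deg: float) -> float:
    """V_N = V + 180 if V < 0, otherwise V_N = V."""
    return v_deg + 180.0 if v_deg < 0 else v_deg


def dominant_cell(cells: Iterable[int], margin: int = 3) -> Optional[int]:
    """Return 153 or 27 if that cell leads by at least `margin`, else None."""
    counts = Counter(cells)
    diff = counts[153] - counts[27]
    if diff >= margin:
        return 153
    if -diff >= margin:
        return 27
    return None


def occupancy_by_angle(pairs: Iterable[Tuple[float, List[int]]],
                       bin_width: float = 10.0) -> Dict[int, Counter]:
    """Histogram of dominant-cell helix pairs over V_N (lower bin edges)."""
    hist: Dict[int, Counter] = {153: Counter(), 27: Counter()}
    for v_deg, cells in pairs:
        cell = dominant_cell(cells)
        if cell is None:
            continue  # no clear cell-type dominance for this helix pair
        lower_edge = int(normalize_packing_angle(v_deg) // bin_width) * bin_width
        hist[cell][lower_edge] += 1
    return hist
```

Helix pairs without a clear majority are dropped rather than split between the two classes, which mirrors the statement that counts are registered only when the margin is reached.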

Discussion

Deviation from the 180° periodicity, limitations of the two-dimensional approach

A model for helix-helix packing based on superposition of two planar lattices yields 180° periodic solutions in the packing angle V. However, the observed properties show deviations from periodicity. In particular, the predicted optimal solutions am and bm are not convincingly represented by the experimental data in the positive angular range, neither the packing angles (Figure 7) nor at the expected smaller radii (Figure 9). What causes this discrepancy? Why is packing with short distances of closest approach (small helix radii) disfavoured in the positive V range? Three main features of the real spatial structure of α-helices are not described by a two-dimensional model: (1) the cylindrical shape; (2) the radii along the helix are discrete rather than continuous, as are the side-chain orientations (rotamers); and (3) the non-orthogonal extensions of side-chains; i.e. the Cα–Cβ vectors leave the helix backbone under a defined angle (extension angle) and are not, as assumed by the model, straight extensions of the perpendicular drawn from the Cα-positions to the helix axis. This latter property has been shown important in causing different oligomerization states of coiled coils (Harbury et al., 1993).

Principally, for a real three-dimensional but regular helix, the lattice obtained by unrolling such a helix onto a plane coincides with that used in the model. Despite an apparently smaller helix radius for the same helix-building amino acids caused by the extension angle, the solutions for the packing angles would still be 180°-periodic but differ only in a translation of one lattice. However, in three dimensions, the extension angle entails different alignments of the Cα–Cβ vectors of side-chains performing interhelical contacts (angle g) and thus different mutual orientations for the contacting residues; i.e. between the knob and corresponding hole residues. The angle g between the Cα–Cβ vectors of two contacting side-chains (at least one inter-helix atom-atom contact shorter than 4.5 Å) correlates at 71% with the angle between the corresponding Cα geometric-centre-of-side-chain vectors. Obviously, the alignment angle g depends on the packing angle V, as demonstrated by Figure 13. The sinusoidal shape of the observed mean reflects the full-circle rotation in V. Further, not only does gmean vary with the packing angle but also the observed standard deviations sg. High sg values reflect side-chain–side-chain contacts of residues with alternately nearly parallel (small g) and antiparallel Cα–Cβ (large g) vector pairs, whereas smaller deviations point to more regular packing with the corresponding mean g in the 90 to 120° range. In this respect, the optimal solutions am and bm show more regularity in the negative packing angle range than in the positive range. For solution cm, the standard deviations are slightly smaller in the positive range but the orientations of the Cα–Cβ vectors have less impact on the packing because of the larger required helix radii.

Figure 13. Observed angle (g) between the Cα–Cβ vectors of two interhelically contacting side-chains. Shown are the mean value (A) and the standard deviations (B) as a function of the packing angle of the corresponding helix-helix pair obtained by a 100-points (black lines) and 50-points (grey lines) running average of the V-ordered data points. The black line corresponds to all observed g-angles (4122 events) in A. In B, the raw data for the black curve were the standard deviations of g per helix-helix pair. The 100-point clusters of the running average had a mean standard deviation of 10.3°. The grey lines were obtained for observations where at least one Cα–Cβ vector of the contacting residue-residue pair made an angle to the global line of closest approach oriented to the adjacent helix smaller than 45° (2747 events); i.e. residues centrally involved in the helix-helix packing. Arrows correspond to the three periodic optimal solutions of the helical lattice superposition model.

Figure 14 reveals the consequences of the systematically different g-angles on the helix-helix packing. In the case of alternating parallel and antiparallel Cα–Cβ vectors (henceforth called alternating packing), where the optimal solutions am and bm are in the positive packing angle range (Figure 13), the three-dimensional packing differs from the more regular (g-angles) packings (henceforth called regular packing). Figure 14 illustrates this for solution am. Solutions am and bm require small helix radii (Figure 3) and, consequently, short distances of closest approach. This is achieved by small residues in the packing core (preferentially Gly, Ala or Pro). In three dimensions, the planar lattice approach

may be understood as packings of helices “unrolling” their side-chains onto the surface of the other. Thus, side-chains outside the packing core may be larger, thereby fulfilling the planar packing conditions and filling the crevice that would be opened up by the packing of ideal cylinders. This is supported by experimental observations such as the increasing mean skew angle from Ala to Val to Leu (vide supra). This mutual (“gearwheel”) unrolling is different for regular and alternating packings. In the alternating packing case, knob-residues repeatedly pack with hole-residues from the other helix with nearly antiparallel Cα–Cβ vectors (Figure 14). Obviously, given a corresponding packing angle V, alterations of the side-chain sizes are less tolerable in this case where steric clash of the respective side-groups from the two helices can easily result because of the parallel or facing Cα–Cβ vectors. In the regular case, steric hindrance is less likely because the hole-residues point away from the interface and may even be extended. Consequently, regular packings may have short closest approach distances and more sequences (greater tolerance to different side-chain sizes) fulfil the requirement for small helix radii for solutions am and bm. Alternating packing generally has larger distances of closest approach and thus, instead of utilizing the optimal solution am, they go to the next accessible solution for helices with larger radii; i.e. V ≈ 120 to 130° (Figure 4). The same principle considerations hold for the bm periodic solutions. The next accessible packing mode for solution bm is also the suboptimal with V ≈ 120 to 130°, resulting in frequent observations for this packing angle range (Figure 7).

Figure 14. Differences in the packing between the two 180°-periodic solutions of class am; illustration of the regular and alternating packing mode. The pictures show real examples of packed helices with corresponding packing angles and distances of closest approach illustrating the differences in the Cα–Cβ vector alignments: regular packing (PDB entry codes and sequence numbers) 1dbp, helix 1, 43 to 53, helix 9, 237 to 253; alternating packing (right graph) 1thl, helix 2, 137 to 151, helix 3, 159 to 179. The Cα–Cβ bonds are drawn in magenta. Interhelical contacts between residues with nearly perpendicular Cα–Cβ vectors are denoted by broken red lines. Cα–Cβ vectors with antiparallel orientation are indicated by broken blue lines and the ones with nearly parallel orientation are shown with dotted blue lines. The yellow curved lines are the helix axes, the thin blue continuous lines are the lines of closest approach. The broken dark grey lines connecting the Cβ positions denote the packing cell (cell 27). The sequences of the helices are given in the one-letter code.

Ridges into grooves: a model lacking structural details

In the work presented here, helix-helix packing was studied theoretically from the perspective of the helical lattice superposition concept, which allowed all possible associations to be systematically considered from a purely mathematical perspective and is not found in previous work (Crick, 1953; Efimof, 1979; Chothia et al., 1981). Thus, a more complete understanding of packing options, both optimal and suboptimal, has been achieved.

The lattice superposition model treats the packing problem on the basis of individual side-chains as the smallest packing unit, while higher-order structures are assumed by the ridges into groove (r/g) model where the dominating shape feature of helices are considered smooth, and continuous

ridges and grooves are formed by residues at regular sequence separation. These ridges and grooves correspond to base vectors in the model presented here, where helix-helix packing involves their alignment in the respective lattices such that the condition v_i = l R_V v'_j is fulfilled. The term l is a scalar value, R_V is a rotation matrix with the corresponding packing angle V, v'_j and v_i are vectors joining lattice points with sequence spacing i and j (e.g. i = j = 4 for class 4-4), and the prime denotes the applied mirror operation corresponding to face-to-face packing. The resulting packing angles are plotted in Figure 15 as a function of the helix radius. In the helix lattice superposition model, not only is the direction of a pair of base vectors considered but also their length and the packing properties of their neighbours. In most cases, this coincides with a “knobs into holes” (k/h) packing scheme. The equivalent k/h graph is given in Figure 4. The three k/h optimal solutions (am, bm and cm) are found at the intersection points in the r/g model where three different base vectors are involved (Figure 15). The k/h treatment deletes some of the possible solutions of the r/g model due to steric clashes at other lattice points; for example, the 1-4 and 1-3 r/g classes at larger helical radii or the smaller radial segments of the 3-3 class. In the k/h approach, the optimal solutions delineate the preferred packing angles. For different classes of the r/g model, packing angles are not as distinguishable. Nonetheless, both approaches rely on the direction of base vectors and thus some packing solutions are commonly predicted.

Figure 15. Expected packing angles VN for the “ridges into groove” model as a function of the helix radius. The numbers correspond to the combinations of the ridges and grooves; i.e. in the terminology of the model presented here, the oriented angles are given for the six possible combinations of the three base vectors (Figure 2) of one helical lattice with the corresponding three base vectors of the other; i.e. mirrored but not rotated lattice (D = 1.45 Å, v = 99.1°).

It is obvious that the ridges and grooves are “bumpy” and that protruding side-chains and local depressions are more appropriate helical surface descriptions. A smooth sliding of a ridge into a groove is therefore unlikely. There exists a register allowing only discrete translations where the side-chains of one helix can click into the local depressions of the other helix (knobs into holes). Only through consideration of these key features of helices can successful prediction of the radius dependencies and occupancies of packing cells (holes) according to packing angle be achieved. Though two continuous ridges can certainly be aligned, others must inevitably cross. This conflict can be resolved only by assuming a discrete nature for ridges and grooves. Furthermore, only 27.8% of the helical residues make intra-helical side-chain–side-chain contacts (atom-atom distances smaller than 4.0 Å). Thus, smooth ridges hardly predominate.

Through a consistent mathematical treatment, three and only three solutions for the perfect superposition of α-helical lattices have been demonstrated. Not only is suboptimal packing evident in the model but also the relationship between occupancy of packing cells and the helix radius. Further, it is shown that within the preferred packing angle range 120° < VN < 160°, there are two topologically different packing arrangements (small helix radii/cell 27 occupancy and large helix radii/cell 153 occupancy). This result cannot be inferred from the r/g model where packing cells are not considered.

Regularity of helices

The helix superposition model assumes regularity and that packing of two helices is strainless. The helix pairs must also display similar radii, possess relatively straight helical axes and constant twist angles and rises per residue. Significant violation lessens the applicability of the model. Despite the large variations in the observed twist angles v_{i,i+1} between consecutive centres of side-chains (standard deviation of 15°), the side-chains are covalently bound to the Cα backbone atoms which are very regular in their v_{i,i+1} (s = 3.7° for inner helical residues).

Helix radius dependency of packing

By analysing the helix radius dependency of the packing angle, it was possible to reshape the suggestions of Richmond & Richards (1978), who inferred that the radius is inversely correlated with the packing angles defined as the smaller of the two complementary angles with 180°. The model used here shows that the dependency is not a monotonic function, as observed also by Reddy & Blundell (1993), albeit without the detailed explanations provided here.

The helix geometric parameters that allow optimized packing were examined. It is noteworthy that the structural characteristics of α-helices designed by nature best and most consistently satisfy the requirements in the helix parameters

(Figure 3). This would allow considerable and advantageous flexibility in achieving the protein fold. Apart from internal structural strains, the clear disadvantage of other helical types in packing flexibility (π-helix and 310-helix) in viable folded proteins is evident.

It has been shown that alanine as a helical constituent provides the largest flexibility in possible packing angles, since the radius of a poly(Ala) α-helix is closest to that associated with the vtriple point. Alanine has accordingly been observed to be very often involved in helix-helix contacts as a central, radius-determining amino acid (Figure 10). This alanine preference in helices is thus explained not only by its compatibility with the helical structure as such but also by the attendant variety allowed for packing arrangements.

The observed relationship between packing angle and helix radius is likely to be of use in the engineering of proteins. If, for instance, the designing task required helices packed with 20° (or −160°), leucine would be the ideal candidate for hydrophobic associations. If a packing angle of about −40° is desired, glycine would be the better choice. This is supported by the work of Chou et al. (1984) who, in their energetic analysis of helix-helix packing, found the lowest interaction energies at −154° (VN = 26°) for packing of poly(Leu) helices and at 144° (VN = 144°) for poly(Ala) helices.

The model in this work also explains the observed increased occurrence of leucine and the decreased frequency of glycine and proline in four-helix bundle proteins, where helices pack at about VN ≈ 20° (Paliakasis & Kokkinidis, 1992). Alanine was also often involved, which supports the model in that alanine was shown to possess greatest packing flexibility. The significance of packing cell type and helix radius, and the corresponding need for a good residue fit into a specific cell should further aid in associating helices. Minimally, the number of possible interaction sites can be reduced for any two specific helices. Attempts in this prediction direction are in progress.

In conclusion, the observed preference for packing angles near −40° and +130° may not be explained by a better packing of side-chains alone. The presented study revealed that there are three optimal periodic solutions for the packing angle. Furthermore, the preferred angle in the positive range is not the calculated optimal solution and therefore corresponds to a suboptimal solution (vide supra); hence, other determinants like entropic effects or surface burial differences might be important.

References

Chothia, C., Levitt, M. & Richardson, D. (1977). Structure of proteins: packing of α-helices and pleated sheets. Proc. Natl Acad. Sci. USA, 74, 4130–4134.
Chothia, C., Levitt, M. & Richardson, D. (1981). Helix to helix packing in proteins. J. Mol. Biol.
Chou, K. C. & Zheng, C. (1992). Strong electrostatic loop-helix interactions in bundle motif protein structures. Biophys. J. 63, 682–688.
Chou, K. C., Némethy, G. & Scheraga, H. A. (1983). Energetic approach to the packing of α-helices. 1. Equivalent helices. J. Phys. Chem. 87, 2869–2881.
Chou, K. C., Némethy, G. & Scheraga, H. A. (1984). Energetic approach to the packing of α-helices. 2. General treatment of nonequivalent and nonregular helices. J. Am. Chem. Soc. 106, 3161–3170.
Cohen, F. E. & Kuntz, I. D. (1987). Prediction of the three-dimensional structure of human growth hormone. Proteins: Struct. Funct. Genet. 2, 162–166.
Cohen, F. E., Richmond, T. J. & Richards, F. M. (1979). Protein folding: evaluation of some simple rules for the assembly of helices into tertiary structures with myoglobin as an example. J. Mol. Biol. 132, 275–288.
Crick, F. H. C. (1953). The packing of α-helices: simple coiled coils. Acta Crystallog. 6, 689–697.
Efimof, A. V. (1979). Packing of α-helices in globular proteins. Layer-structure of globin hydrophobic cores. J. Mol. Biol. 134, 23–40.
Harbury, P. B., Zhang, T., Kim, P. S. & Alber, T. (1993). A switch between two-, three-, and four-stranded coiled coils in GCN4 leucine zipper mutants. Science, 262, 1401–1407.
Harris, N. L., Presnell, S. R. & Cohen, F. E. (1994). Four helix bundle diversity in globular proteins. J. Mol. Biol. 236, 1356–1368.
Heringa, J., Sommerfeldt, H., Higgins, D. & Argos, P. (1992). OBSTRUCT: a program to obtain largest cliques from a protein sequence set according to structural resolution and sequence similarity. CABIOS, 8, 599–600.
Hutchinson, E. G., Morris, A. L. & Thornton, J. M. (1994). Structural patterns in globular proteins. In Structure Correlation (Burgi, H. B. & Dunitz, J. D., eds) Verlag Chemie, Weinheim, vol. 2, pp. 643–650.
Kabsch, W. & Sander, C. (1983). Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers, 22, 2577–2637.
Murzin, A. G. & Finkelstein, A. V. (1988). General architecture of the α-helical globule. J. Mol. Biol. 204, 749–769.
Paliakasis, C. D. & Kokkinidis, M. (1992). Relationships between sequence and structure for the four-α-helix bundle tertiary motif in proteins. Protein Eng. 5, 739–748.
Ptitsyn, O. B. & Rashin, A. A. (1975). A model of myoglobin self-organisation. Biophys. Chem. 3, 1–20.
Reddy, B. V. B. & Blundell, T. L. (1993). Packing of secondary structural elements in proteins. Analysis and prediction of inter-helix distance. J. Mol. Biol. 233, 464–479.
Richmond, T. J. & Richards, F. M. (1978). Packing of α-helices: geometric constraints and contact area. J. Mol. Biol. 119, 537–555.
Robinson, C. R. & Sligar, S. G. (1993). Electrostatic stabilization in four-helix bundle proteins. Protein Sci. 2, 826–837.
Schulz, G. E. & Schirmer, R. H. (1979). Principles of Protein Structure. Springer-Verlag, Berlin.
Solovyov, V. V. & Kolchanov, N. A. (1984). A simple method for the calculation of low energy packings of α-helices - a threshold approximation. I. The use of the method to estimate the effects of amino acid
145, substitutions, deletions and insertions in . J. 215–250. Theoret. Biol. 110, 67–91. 552 Helix-helix Packing

Tufféry, P. & Lavery, R. (1993). Packing and recognition of protein structural elements: a new approach applied to the 4-helix bundle of myohemerythrin. Proteins: Struct. Funct. Genet. 15, 413–425.

Appendix

Under the condition of perfect overlap of the two helical lattices, equations (1) and (2) must be satisfied. The base vectors $\mathbf{v}''_1$ and $\mathbf{v}''_2$ describing the second lattice result from $\mathbf{v}''_{1,2} = \mathbf{R}_V \mathbf{v}'_{1,2}$, where the vectors $\mathbf{v}'_{1,2}$ are mirrors of the base vectors of the first lattice $\mathbf{v}_{1,2}$ and $\mathbf{R}_V$ is a rotation matrix (see Mathematical Description). By substituting the selected vectors of equations (3) and (4) into equations (1) and (2) the following system of equations results:

$$-AR\cos V - D\sin V = n_1 AR + n_2 BR \tag{A1}$$

$$-AR\sin V + D\cos V = n_1 D + n_2 kD \tag{A2}$$

$$-BR\cos V - kD\sin V = n_3 AR + n_4 BR \tag{A3}$$

$$-BR\sin V + kD\cos V = n_3 D + n_4 kD \tag{A4}$$

Given specific helix geometric parameters ($R$, $v$, $D$), this system of equations would contain five unknowns ($V$, $n_1$, $n_2$, $n_3$ and $n_4$). However, for the integer variables $n_{1..4}$, several boundary conditions apply:

$$n_{1,2,3,4} \in \{-1, 0, 1\} \tag{A5}$$

$$n_1^2 + n_2^2 \neq 0 \quad\text{and}\quad n_3^2 + n_4^2 \neq 0 \tag{A6}$$

$$|n_1 + n_2| < 2 \quad\text{and}\quad |n_3 + n_4| < 2 \tag{A7}$$

$$n_2 n_3 \neq n_1 n_4 \tag{A8}$$

These conditions reflect restrictions in the length and orientation of the base vectors. Obviously, $n_1$ and $n_2$ may not be simultaneously zero; the same holds for $n_3$ and $n_4$ (equation (A6)). The base vectors may not exceed in magnitude the distance of the closest hexagonal lattice points around the point $P_i$ (Figure 2; equations (A5) and (A7)) and they may not be linearly dependent (equation (A8)). These boundary conditions reduce the number of possible combinations of values for $n_1$ to $n_4$ from 81 to 24. Further restrictions can be elicited by reformulating equations (A1) to (A4).

(1) Multiplying equations (A1) and (A2) by $B$ and equations (A3) and (A4) by $A$, and subsequently subtracting equation (A1) from (A3) and (A2) from (A4), yields:

$$D(B - kA)\sin V = \left(AB(n_4 - n_1) + n_3 A^2 - n_2 B^2\right)R \tag{A9}$$

$$D(kA - B)\cos V = \left((n_3 + kn_4)A - (n_1 + kn_2)B\right)D \tag{A10}$$

Dividing equation (A9) by equation (A10), the 180° periodicity of the solutions becomes obvious; namely:

$$\tan V = \frac{R}{D}\, f(v) \tag{A11}$$

where $f$ indicates a function.

(2) Separating $R$ and $D$ in equations (A1) to (A4) and equating one side of equation (A1) with (A2) and one side of equation (A3) with (A4) yields:

$$n_2(kA - B)\cos V = A - (n_1 + kn_2)(n_1 A + n_2 B) \tag{A12}$$

$$n_3(kA - B)\cos V = (n_3 + kn_4)(n_3 A + n_4 B) - kB \tag{A13}$$

(3) By multiplying equations (A1) and (A3) by $D$ and equations (A2) and (A4) by $AR$ and subsequently subtracting equation (A1) from (A2) and (A3) from (A4), and by multiplying equations (A1) and (A3) by $kD$ and (A2) and (A4) by $BR$ and subsequently subtracting (A1) from (A2) and (A3) from (A4), it can be shown that:

$$(kD^2 - ABR^2)\sin V + DR(kA + B)\cos V = n_1 DR(B - kA) \tag{A14}$$

$$(D^2 - A^2R^2)\sin V + 2ARD\cos V = n_2 DR(kA - B) \tag{A15}$$

$$(k^2D^2 - B^2R^2)\sin V + 2kBRD\cos V = n_3 DR(B - kA) \tag{A16}$$

$$(kD^2 - ABR^2)\sin V + DR(kA + B)\cos V = -n_4 DR(B - kA) \tag{A17}$$

These latter transformations restrict the possible combinations of $n_{1,2,3,4}$. Comparing equations (A14) and (A17), it directly follows that $n_1 = -n_4$, given that $B - kA \neq 0$. The remaining cases of possible combinations must be investigated separately. If $n_2 = n_3 = 0$ and $n_1 = -n_4 = \pm 1$, then it follows from equation (A10) that:

$$(kA - B)\cos V = \pm(kA - B) \tag{A18}$$

Since $kA - B \neq 0$, then $\cos V = \pm 1$ and thus $\sin V = 0$. Under these conditions equations (A15) and (A16) yield $\pm 2ARD = 0$ and $\pm 2kBRD = 0$. Since $A = v$ and for any real helix $v$ cannot be zero, the combinations of $n$-values above can be dismissed. If $n_1 = n_4 = 0$ and $n_{2,3} = \pm 1$, then it follows from equations (A12) and (A13) that:

$$n_2(kA - B)\cos V = A - kB \tag{A19}$$

$$n_3(kA - B)\cos V = A - kB \tag{A20}$$

Since $kA - B \neq 0$, $n_2 = n_3$ providing $A - kB \neq 0$. If $A - kB = 0$, the twist angle $v$ must be $2k\pi/(k^2 - 1)$ and, from equation (A14), $D/R = 2\pi/(k^2 - 1)$. A detailed examination of equation (A9) taken with these values for $v$ and $D/R$ shows that $n_2 = n_3$.
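The counting argument of this Appendix lends itself to a brute-force check. The following Python sketch enumerates all index sets, applies boundary conditions (A5) to (A8), and then applies the further restrictions stated in the text (n1 = −n4; dismissal of n2 = n3 = 0; n2 = n3 when n1 = n4 = 0). The function names are illustrative assumptions and not part of the original work.

```python
from itertools import product


def boundary_sets():
    """Index sets (n1, n2, n3, n4) that satisfy conditions (A5) to (A8)."""
    kept = []
    # (A5): each index is -1, 0 or 1, giving 3**4 = 81 candidates.
    for n1, n2, n3, n4 in product((-1, 0, 1), repeat=4):
        if (n1, n2) == (0, 0) or (n3, n4) == (0, 0):   # (A6)
            continue
        if abs(n1 + n2) >= 2 or abs(n3 + n4) >= 2:     # (A7)
            continue
        if n2 * n3 == n1 * n4:                         # (A8): linear dependence
            continue
        kept.append((n1, n2, n3, n4))
    return kept


def restricted_sets(sets):
    """Apply the further restrictions derived from (A12) to (A20)."""
    kept = []
    for n1, n2, n3, n4 in sets:
        if n1 != -n4:                                  # from (A14) and (A17)
            continue
        if n2 == 0 and n3 == 0:                        # dismissed via (A18)
            continue
        if n1 == 0 and n4 == 0 and n2 != n3:           # from (A19) and (A20)
            continue
        kept.append((n1, n2, n3, n4))
    return kept


if __name__ == "__main__":
    after_boundary = boundary_sets()
    print(len(after_boundary), len(restricted_sets(after_boundary)))
```

Under these conditions the enumeration leaves 24 index sets after the boundary conditions and 6 after the further restrictions, which form three pairs related by a sign change of all four indices.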

Consequently, six possible combinations of $n_{1,2,3,4}$ remain. The solutions for the packing angle V can now be obtained directly by using these possible sets in equations (A9) and (A10). The allowed sets are given in Table 2 of the main text and correspond to packing classes designated here as a, b and c (each with 180° periodicity). Note that the equations are solved under the conditions of lattice superposition for two associating helices. V represents the rotation angle required for one lattice to achieve the overlap. Actual packing is, of course, a result of lattice translation as well.
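As a sketch of how equations (A9) and (A10) give the packing angle and the matching helix radius for one index set: (A10) fixes cos V, and (A9) together with sin²V + cos²V = 1 then fixes R and the sign of sin V. The sketch takes A = v in radians, as stated in this Appendix. It additionally assumes B = kv − 2π with k = 4; these two choices are not stated in this Appendix and are adopted here only because they are consistent with A − kB = 0 giving v = 2kπ/(k² − 1) and with kA − B ≠ 0. The function names and the listing of index sets are likewise illustrative; Table 2 of the main text remains the authoritative source for the allowed sets.

```python
import math

# Index sets (n1, n2, n3, n4) assumed to remain after the restrictions
# of this Appendix (illustrative listing, see Table 2 of the main text).
INDEX_SETS = [
    (1, -1, 0, -1), (-1, 1, 0, 1),
    (0, 1, 1, 0), (0, -1, -1, 0),
    (1, 0, 1, -1), (-1, 0, -1, 1),
]


def packing_solution(n, twist_deg, rise, k=4):
    """Return (V in degrees, R) for one index set, or None if no real solution."""
    n1, n2, n3, n4 = n
    A = math.radians(twist_deg)        # A = v, as stated in the Appendix
    B = k * A - 2.0 * math.pi          # ASSUMPTION: B = k*v - 2*pi
    # (A10): D cancels on both sides, leaving cos V.
    cos_v = ((n3 + k * n4) * A - (n1 + k * n2) * B) / (k * A - B)
    # Bracket on the right-hand side of (A9).
    c9 = A * B * (n4 - n1) + n3 * A**2 - n2 * B**2
    if abs(cos_v) > 1.0 or c9 == 0.0:
        return None
    sin_abs = math.sqrt(1.0 - cos_v**2)
    # (A9): D (B - kA) sin V = c9 R, solved for R > 0.
    radius = sin_abs * rise * abs(B - k * A) / abs(c9)
    # The sign of sin V follows from (A9) because R and D are positive.
    sin_v = math.copysign(sin_abs, c9 / (B - k * A))
    return math.degrees(math.atan2(sin_v, cos_v)), radius


def differ_mod_180(x, y):
    """Smallest difference between two angles that are equivalent modulo 180 degrees."""
    d = (x - y) % 180.0
    return min(d, 180.0 - d)


if __name__ == "__main__":
    for index_set in INDEX_SETS:
        print(index_set, packing_solution(index_set, 99.1, 1.45))
```

With v = 99.1° and D = 1.45 Å, the values quoted in the abstract, this sketch gives packing angles that agree modulo 180° with the three reported solutions (−37.1°, −97.4° and +22.0°) and radii of roughly 3.1, 3.5 and 4.3 Å, against the reported 3.0, 3.5 and 4.3 Å. Index sets that differ only by an overall sign give the same radius and angles 180° apart.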

Edited by B. Honig

(Received 4 July 1995; accepted 9 October 1995)