1860 LINUSPAULING AND CARLNIEMANN Vol. 61

[CONTRIBUTIONFROM THE GATESAND CRELLINLABORATORIES 6~ CHEMISTRY,CALIFORNIA INSTITUTE OF TECHNOLOGY, No. 7081 The Structure of

BY LINUSPAULING AND CARLNIEMANN

1. Introduction authors to provide proofs that the mole- It is our opinion that the polypeptide chain cule actually has the structure of the space-en- structure of proteins,l with bonds and closing cyclol CZ. Because of the great and wide- other interatomic forces (weaker than those corre- spread interest in the question of the structure of sponding to covalent bond formation) acting be- proteins, it is important that this claim that insu- tween polypeptide chains, parts of chains, and lin has been proved to have the cyclol structure side-chains, is compatible not only with the chemi- be investigated thoroughly. We have carefully cal and physical properties of proteins but also examined the X-ray arguments and other argu- with the detailed information about molecular ments which have been advanced in support of structure in general which has been provided by the cyclol hypothesis, and have reached the con- the experimental and theoretical researches of the clusions that there exists no evidence whatever in last decade. Some of the evidence substantiat- support of this hypothesis and that instead strong ing this opinion is mentioned in Section 6 of this evidence can be advanced in support of the con- paper. tention that bonds of the cyclol type do not occur Some time ago the alternative suggestion was at all in any . A detailed discussion of made by Frank2that hexagonal rings occur in pro- the more important pro-cyclol and anti-cyclol ar- teins, resulting from the transfer of hydrogen guments is given in the following paragraphs. from secondary amino to carbonyl groups 2. X-Ray Evidence Regarding with the formation of carbon-nitrogen single bonds. This cyclol hyfiothesis has been de- It has not yet been possible to make a complete veloped extensively by Wrin~h,~who has con- determination with X-rays of the positions of the sidered the geometry of cyclol molecules and has atoms in any protein crystal; and the great com- given discussions of the qualitative correlations plexity of proteins makes it unlikely that a com- of the hypothesis and the known properties of plete structure determination for a protein will proteins. ever be made by X-ray methods a10ne.~ Never- It has been recognized by workers in the field theless the X-ray studies of silk fibroin by Herzog of modern structural chemistry that the lack of and Jancke,8 Brill,g and Meye; and Markloand of conformity of the cyclol structures with the rules p-keratin and certain other proteins by Astbury found to hold for simple molecules makes it very and his collaboratorsll have provided strong (but improbable that any protein molecules contain (6) In ref. 4d, for example, the authors write “The superposability of these two sets of points represented the first stage in the proof of structural elements of the cyclol type. Until re- the correctness of the Ct structure proposed for insulin. . . . These cently no evidence worthy of consideration had investigations, showing that it is possible to deduce that the insulin molecule is a polyhedral cage structure of the shape and size pre- been adduced in favor of the cyclol hypothesis. dicted, give some indication of the powerful weapon which the Now, however, there has been published4 an inter- geometrical method puts at our disposal.” (7) A protein molecule, containing hundreds of resi- pretation of Crowfoot’s valuable X-ray data on dues, is immensely more complicated than a molecule of an amino crystalline insulin5 which is considered by the acid or of diketopiperazine. Yet despite attacks by numerous in- vestigators no complete structure determination for any amino acid (1) E. Fiacher, “Untersuchungen tiber Aminosauren, Polypeptide had been made until within the last year, when Albrecht and Corey und Protein,” J. Springer, Berlin, 1906 and 1923. succeeded, by use of the Patterson method, in accurately locating (2) F. C. Frank, Nafurc, 188, 242 (1936); this idea was first pro- the atoms in crystalline glycine [G. A. Albrecht and R. B. Corey, posed by Frank in 1933: see w.T. Astbury, J. Textile Inst., 31, 282 THISJOURNAL. 61, 1087 (1939)l. The only other crystal with a (1936). close structural relation to proteins for which a complete structure (3) D. M. Wrinch, (a) Nafuw, 137, 411 (1936); (b) 188, 241 determination has been made is diketopiperazine [R. B. Corey, ibid.. (1936); (c) 189, 651, 972 (1937); (d) Roc. Roy. SOC. (London), 60, 1598 (1938)l. The investigation of the structure of crystals of 8160, 59 (1937); (e) A161, 505 (1937); (f) Trans. Faraday SOC.,83, relatively simple substances related to proteins is being continued in 1368 (1937): (g) Phil. Mag., 26, 313 (1938); (h) Nafurc, 143, 482 these Laboratories. (1939); etc. (8) R.0. Herzog and W. Jancke, Bcr., 18, 2162 (1920). (4) (a) D. M. Wrinch, Science, 88, 148 (1938); (b) THISJOURNAL, (9) R. Brill, Ann., 434, 204 (1923). 60, 2005 (1938); (c) D. M. Wrinch and I. Langmuir, ibid., 60, (10) K. H.Meyer and H. Mark, Bey., 61,1932 (1928). 2247 (1938); (d) I. Langmuir and D. M. Wrinch, Nafurc, 142, 581 (11) W.T. Astbury, J SOC.Chem. Ind., 49, 441 (1930); W. T. (1938). Astbury and A. Street, Phil. Trans. Roy. SOC.,A280, 75 (1931): (5) D. Crowfoot, Proc. Roy. SOC.(London), A164, 580 (1938). W.T. Astbury and H. J. Woods, ibid., 8384,333 (1933); etc. July, 1939 STRUCTUREOF PROTEINS 1861 not rigorous) evidence that these fibrous proteins gions of the crystal (center of molecule, center of contain polypeptide chains in the extended configu- lacunae) have an electron density less than the ration. This evidence has been strengthened by average, and others (slits, zinc atoms) have an the fact that the observed identity distances corre- electron density greater than the average. The spond closely to those calculated with the covalent positions of these regions are predicted by the bond lengths, bond angles, and N-H . . .O hydrogen cyclol theory, but the magnitudes of the electron bond lengths found by Corey in diketopiperazine. density are not predicted quantitatively by the The X-ray work of Astbury also provides evi- theory. Accordingly the authors had at their dence that a-keratin and certain other fibrous disposal seven parameters, to which arbitrary proteins contain polypeptide chains with a folded values could be assigned in order to give agree- rather than an extended configuration. The X- ment with the data. Despite the numbers of ray data have not led to the determination of the these parameters, however, it was necessary to atomic arrangement, however, and there exists no introduce additional arbitrary parameters, bear- reliable evidence regarding the detailed nature of ing no predicted relation whatever to the cyclol the folding. structure, before rough agreement with the Crow- X-Ray studies of crystalline globular proteins foot diagrams could be obtained. Thus the peak have provided values of the dimensions of the B”, which is the most pronounced peak in the units of structure, from which some qualitative P(xy0) section (Fig. 2 of Wrinch and Langmuir’s conclusions might be drawn regarding the shapes paper) and is one of the four well-defined isolated of the protein molecules. An interesting at- maxima reported, is accounted for by use of a tempt to go farther was made by Crowf~ot,~who region (V) of very large negative deviation located used her X-ray data for crystalline insulin to cal- at a completely arbitrary position in the crystal; culate Patterson and Patterson-Harker dia- and this region is not used by the authors in in- grams. l2 Crowfoot discussed these diagrams in a terpreting any other features of the diagrams. sensible way, and pointed out that since the X- This introduction of four arbitrary parameters ray data correspond to effective interplanar dis- (the three coordinates and the intensity of the tances not less than 7 A. they do not permit the region V) to account for one feature of the experi- determination of the positions of individual mental diagrams would in itself make the argu- atoms;I3 the diagrams instead give some informa- ment advanced by Wrinch and Langmuir uncon- tion about large-scale fluctuations in scattering vincing; the fact that many other parameters power within the crystal. Crowfoot also stated were also assigned arbitrary values removes all that the diagrams provide no reliable evidence re- significance from their argument. garding either a polypeptide chain or a cyclol It has been pointed out by Bernal,“ moreover, structure for insulin. that the authors did not make the comparison of Wrinch and Langmuir4 have, however, con- their suggested structure and the experimental tended that Crowfoot’s X-ray data correspond diagrams correctly. They compared only a frac- in great detail to the structure predicted for the tion of the vectors defined by their regions with insulin molecule on the basis of the cyclol theory, the Crowfoot diagrams, and neglected the rest of and thus provide the experimental proof of the the vectors. Bernal reports that he has made the theory. We wish to point out that the evidence complete calculation on the basis of their structure, adduced by Wrinch and Langmuir has very little and has found that the resultant diagrams show value, because their comparison of the X-ray data no relation whatever to the experimental dia- and the cyclol structure involves so many arbi- grams. He states also that with seven density trary assumptions as to remove all significance values at closest-packed positions as arbitrary from the agreement obtained. In order to at- parameters he has found that a large number of tempt to account for the maxima and minima ap- structures which give rough agreement with the pearing on Crowfoot’s diagrams, Wrinch and experimental diagrams can be formulated. Langmuir made the assumption that certain re- We accordingly conclude that there exists no (12) A. L. Patterson, Z. Krisf., 90, 517 (1935); D. Harker, J. satisfactory X-ray evidence for the cyclol struc- Chcm. Phys., 4, 825 (1938). (13) It has also been pointed out by J. M. Robertson, Noluye, 148, ture for insulin. 75 (1939), that the intensities of 80 planes could not provide sufficient information to locate the several thousand atoms in the insulin (14) J. D. Bernal, Nafurc, 148, 74 (1939); see also D. P. Riley molecule. and I. Fankuchen, ibid., 143, 648 (1939). 1862 LINUSPAULING AND CARLNIEMANN Vol. 61

3. Thermochemical Evidence Regarding Pro- The change in bonds from polypeptide chain to tein Structure cyclol is N-H + C=O ----f N-C + C-0 + 0-H. It is, moreover, possible to advance a strong ar- With N-H = 83.3, C=O = 152.0, N-C = 48.6, gument in support of the contention that the cy- C-0 = 70.0, and 0-H = 110.2 kcal./mole, the clol structure does not occur to any extent in any bonds of an amino acid residue are found to be 6.5 protein. kcal./mole less stable for the cyclol configuration X-Ray photographs of denatured globular pro- than for the chain configuration. This must fur- teins are similar to those of @-keratin,and thus in- ther be corrected for of the double bond dicate strongly that these denatured proteins 0: /o:- contain extended polypeptide chains. Ast- resonance of the type {-‘(iH> -‘\;+.)), bury16has also obtained evidence that in protein I I films on surfaces the protein molecules have the which amounts for an to about 21 kcal./ extended-chain configuration, and this view is molelg; there is no corresponding resonance for shared by Langmuir, who has obtained indepen- the cyclol, which involves only single bonds. dent evidence in support of it.” Now the heat of We conclude that the cyclol structure is less stable denaturation of a protein is small-less than one than the polypeptide chain structure by 27.5 hundred kilogram calories per mole of protein kcal./mole per amino acid residue. molecules for denaturation in solution, that is, This value relates to gaseous molecules, con- only a fraction of a kilogram calorie per mole of taining no hydrogen bonds, and with the ordinary amino acid residues. Consequently the structure van der Waals forces also neglected. It is prob- of native proteins must be such that only a very able that the ordinary van der Waals forces would small change is involved in conversion to have nearly the same value for a cyclol as for a the polypeptide chain configuration. polypeptide chain ; and the available eviden~e1~J~ It is unfortunate that there exist no substances indicates that the polypeptide hydrogen bonds known to have the cyclol structure; otherwise would be slightly stronger than the hydrogen their heats of formation could be found experi- bonds for the cyclol structure. Moreover, the mentally for comparison with those of substances observed small values (about 2 kcal./mole) for the such as diketopiperazine which are known to con- heat of solution of and alcohols show that tain polypeptide chains or rings. It is possible, the stability relations in solution are little differ- however, to make this comparison indirectly in ent from those of the crystalline substances. We various ways. A system of values of bond ener- accordingly conclude that the Polypeptide chain gies and resonance has been f~rmulated’~structure for a protein is more stable than the which permits the total energy of a molecule of cyclol structure by about 28 kcal./mole per amino known structure to be predicted with an average acid residue, either for a solid protein or a protein uncertainty of only about 1 kcal./mole for a mole- in solution (with the active groups hydratedz1). cule the size of the average amino acid residue. The comparison of the polypeptide chain and The polypeptide chain (amide form) and cyclol cyclol can also be made without the use of bond can be represented by the following diagrams energy values. The heat of combustion of crystal- .C line diketopiperazine, which contains two glycine residues forming a polypeptide chain,22is known;23 Polypeptide chain from its value, 474.6 kcal./mole, the heat of for- mation of crystalline diketopiperazine (from ele- “>N-C50-H /c ments in their standard states) is calculated to be C Cyclol 128.4 kcal./mole, or 64.2 kcal./mole per glycine residue. A similar calculation cannot be made (15) W. T. Astbury, S. Dickiason and K. Bailey, Biochcm. J., 29, 2351 (1935). directly for the cyclol structure, because no sub- (16) W.T. Astbury, Nature. 148, 280 (1939). (17) I. Langmuir, ibid., 148, 280 (1939). (20) M. L. Huggins, J. Org. Chcm., 1, 407 (1936). (IS) M.L. Anson and A. E. Mirsky, J. Gcn. Physiol., 17, 393, 399 (21) The suggestion has been made IF. C. Frank, Notu~c,188, 242 (1934). (1936)l that the energy of hydration of hydroxyl groups might be (19) (a) L. Pauling and J. Sherman, J. Chem. Phys., 1, 606 (1933); very much greater than that of the carbonyl and secondary amino (b) L. Pauling, “The Nature of the ,” Cornell Uni- groups of a polypeptide chain; there exists, however, no evidence versity Press, Ithaca, N. Y.,1939. The values quoted above are indicating that this is so. from the latter source; they involve no significant change from the (22) R. B. Corey, ref. 7. earlier set. (23) M. S. Rharasch, Bur. Slandards J. Rcscarch, 2, 359 (1929). July, 1939 STRUCTUREOF PROTEINS 1863 stance is known to have the cyclol structure; but teins, under special conditions, in solution and in an indirect calculation can be made in many wa;% the crystal, we attribute to definite stabilizing such as the following. One hexamethylenetetra- fa~tors;2~12~namely, (1) hydrogen bonds between mine molecule and one pentaerythritol molecule the oxygens of certain of the triazine rings, (2) contain the same bonds as four cyclized glycine the multiple paths of linkage between atoms in the residues and three methane molecules; hence the fabric, (3) the closing of the fabric into a poly- heat of formation of a glycine cyclol per residue hedral surface which eliminates boundaries of the is predicted to have the value 32.2 kcal./mole fabric and greatly increases the symmetry, and found experimentallyz4 for +C~HIZN~(C)-l- ic- (4) the coalescence of the hydrophobic groups in (CH20H),(c) - iCHd(c). Similarly the value the interior of the cage.” These factors are, for N(C2H6)3(c) + C2H60H(c) - 3CzHdc) is however, far from sufficient to stabilize the cyclol 40.2 kcal./mole. The average of several calcula- structure relative to the polypeptide chain struc- tions of this type, 36 kcal./mole, differs from the ture. (1) The hydrogen bonds between hydroxyl experimental value of the heat of formation of di- groups in the cyclol structure would have nearly ketopiperazine per residue, 64 kcal./mole, by 28 the same energy (about 5 kcal./mole) as those in- kcal./mole. This agrees closely with the value volving the secondary amino and carbonyl groups 27.5 kcal./mole found by the use of bond energies, of the polypeptide chain. The suggestionz6that and we can be sure that the suggested cyclol struc- resonance of the protons between oxygen atoms ture for proteins is less stable than the polypep- would provide further stabilization is not accept- tide chain structure by about this amount per able, since the frequencies of nuclear motion are amino acid residue. Since denatured proteins so small compared with electronic frequencies that are known to consist of polypeptide chains, and no appreciable resonance energy can be obtained native proteins differ in energy from denatured by resonance involving the motion of nuclei. proteins by only a very small amount (less than (2) We are unable to find any aspects of the bond 1 kcal./mole per residue), we draw the rigorous distribution in cyclols which are not taken into conclusion that the cyclol structure cannot be of pri- consideration in our energy calculation given mary imfiorhnce for proteins; if it occurs ut all above. (3) There is no type of interatomic inter- (which is unlikely because of its great energetic dis- action known to us which would lead to additional advantage relative to polypefitide chains) not more stability of a cage cyclol as the result of eliminat- than about three per cent. of the amino acid residues ing boundaries and increasing the symmetry. could possess this configuration. (4) The stabilizing effect of the coalescence of the The above conclusion is not changed if the as- hydrophobic groups has been estimatedz6 to be sumption be made that polypeptide chains are in about 2 kcal./mole per CH2 group, and to amount the imide rather than the amide form,3bsince this to a total for the insulin molecule of about 600 would occur only if the imide form were the more kcal./mole. It seems improbable to us that the stable. In this case the experimental values of van der Waals interactions of these groups are heats of formation (such as that of diketopipera- much less than this for polypeptides. The maxi- zine) would still be used as the basis for compari- mum of 600 kcal./mole from this source is still son with the predicted value for the cyclol struc- negligibly small compared with the total energy ture, and the same energy difference would result difference to be overcome, amounting to about from the calculation. 8000 kcal./mole for a protein containing about It has been recogni~ed~~-~~that energy rela- 288 residuesz8 tions present some difficulty for the cyclol theory We accordingly conclude that the cyclol struc- (although the seriousness of the difficulty seems ture is so unstable relative to the polypeptide struc- not to have been appreciated), and various sug- ture that it cannot be of significance for proteins. gestions have been made in the attempt to avoid It may be pointed out that a number of experi- the difficulty. In her latest comm~nication~~ment~~~-~’ have added the weight of their evi-

Wrinch writes, “The stability of the globular pro- (28) Other suggestions regarding the source of stabilizing energy (24) The values of heats of combustion used are CaHaNdc), which have been made hardly merit discussion. ”Foreign molecules” 1006.7; C(CHzOH)r(c), 661.2; CHd(c), 210.6; N(CaHs)a(c), 1035.5; (Wrinch, ref. 271, for example, cannot be discussed until we have some CsHrOH(c), 325.7; CzHdc), 370.0 kcal./mole. information as to their nature. (25) F. C. Frank, Nature, 188, 242 (1936). (29) G. 1. Jenkins and T. W. J. Taylor, J. Chem. Soc., 495 (1937). (26) I. Langmuir and D. Wrinch, ibid., 143, 49 (1939). (30) L. Kellner, Nature, 140, 193 (1937). (27) D.Wrinch, Sym9osia on Quanr. Biol., 6, 122 (19381. (34) H. Meyer and W. Hohenemser. ibid., 141, 1138 (1938). 1864 LINUS PAULINGAND CARLNIEMANN Vol. 61 dence to the general conclusion reached in this vious experiment^^'-^^ on the acylation and communication that the cyclol bond and the cyclol alkylation of proteins that the experimental evi- fabric are energetically impossible. dence is in decided opposition to the conception that proteins possess great numbers of hydroxyl 4. Further Arguments Indicating the Non- groups and therefore to the cyclol hypothesis. existence of the Cyclol Structure It seems to us that the objection raised by Hau- There are many additional arguments which ro~itz~~is worthy of consideration and it cer- indicate more or less strongly that the cyclol tainly cannot be disposed of on the grounds that structure does not exist. Of these we shall men- the original structure has been destroyed unless tion only a few. some concrete evidence can be submitted to indi- It has been found experimentally that two cate that this is the case. In a second communica- atoms in adjacent molecules or in the same mole- tion Haurowitz and A~trup~~write that “Accord- cule but not bonded directly to one another reach ing to the classical theory of protein structure the equilibrium at a distance which can be represented carboxyl and amino groups found after hydrolytic approximately as the sum of certain van der Waals splitting of a protein come from 40-NH- radii for the ^.'^*^^ Two carbon atoms of bonds. According to the cyclol hypothesis, how- methyl or methylene groups not bonded to the ever, the free carboxyl and amino groups must be same atom never approach one another more formed, during the splitting, from bonds of the closely than about 4.0 A., and two hydrogen atoms structure =C(OH)-N=. The classical theory not bonded to the same atom are always at least would predict on hydrolysis no great change in the 2.0 A. apart. It has been pointed out by Hug- absorption spectrum below 2400 A. because the gins33that the cyclol structure places the carbon CO groups of the amino acids and of the atoms of side chains only 2.45 A. apart, and that bonds both are strongly absorbing in this region.@ in the Ct structure for insulin there are hydrogen On the other hand, the cyclol hypothesis would atoms only 0.67 A. apart. We agree with Hug- predict a greatly increased absorption because of gins that this difficulty alone makes the cyclol the formation of new CO groups. . . . The ab- hypothesis unacceptable. sorption for genuine and for hydrolyzed protein is A closely related argument, dealing with the about equal. This seems to be in greater accord- small area available for the side chains of a cyclol ance with the classical theory of the structure of fabric, has been advanced by Neurath and proteins than with the cyclol theory.’’ The area provided per side chain by the cyclol Mention may also be made of the facts that no fabric, about 10 sq. A.,is far smaller than that re- simple substances with the cyclol structure have quired; and, as Neurath and Bull point out, the ever been synthesizedzgand that in general chemi- suggesti~n~~~~~that some of the side chains pass cal reactions involving the breaking of covalent through the lacunae of the fabric to the other side bonds are slow, whereas rapid interconversion cannot be accepted, because this would require of polypeptide and cyclol structure must be as- non-bonded interatomic distances much less than sumed to occur in, for example, surface denatura- the minimum values found in crystals. tion. These chemical arguments indicate strongly One of the most striking features of the cyclol that the cyclol theory is not a~ceptable.~~ fabric is the presence of great numbers of hydroxyl 5. A Discussion of Arguments Advanced groups: in the case of cyclol CZthere are 288 hy- in Support of the Cyclol Theory droxyl groups exclusive of those present in the side chains. Recently Ha~rowitz~~,~~has subjected Although a great number of papers dealing with the cyclol hypothesis to experimental tests on the the cyclol theory have been published, we have (37) J. Herzig and K. Landsteiner, Biochcm. Z., 61, 458 (1914). basis of the existence or non-existence of cage hy- (38) B. M.Hendrix and F. Paquin, Jr., J. Biol. Chtm., 124, 135 droxyl groups. In the first communicationss (1938). (39) R. G.Stern and A. White, ibid., 122, 371 (1938). Haurowitz concludes on the basis of his and pre- (40) M. A. Magill, R. E. Steiger and A. J. Allen, Biochem. J., 31, (32) N. V. Sidgwick, “The Covalent Link in Chemistry,” Cornell 188 (1937). University Press, Ithaca, N. Y.,1933; E. Mack, Jr., Tars JOURNAL, (41) Another argument against cyclols of the Ct type can be based 64, 2141 (1932); S. B. Hendricks, Chcm. Rev., 7, 431 (1930); M.L. on the results reported by J. L. Oncley, J. D. Ferry and J. Shack, Huggins, ibid., 10, 427 (1932). Symposia on Quant. Biol., 6, 21 (1938),H. Neurath, ibid., 6, 196 (33) M.L. Huggins, THISJOURNAL, 61,755 (1939). (1938), and J. W. Williams and C. C. Watson, ibid., 6, 208 (1938). (34) H. Neurath and H. D. Bull, Chem. Rev., 28,427 (1938). who have shown that dielectric constant measurements and diffusion (35) T. Haurowitz, Z. physiol. Chcm.,866, 28 (1938). measurements indicate that the molecules of many proteins are far (36) T. Haurowitz and T. Astrup, Nature, 143, 118 (1939). from spherical in shape. July, 1939 STRUCTUREOF PROTEINS 1865 had difficulty in hding in them many points of cyclol cages; here the pairs of residues at a slit comparison with experiment (aside from the X- have ‘different environments’ and the residues not ray work mentioned above) which were put forth at a slit fall into sets which again have ‘different en- as definite arguments in support of the structure. vironments.’ We therefore expect characteristic One argument which has been advanced is that proportions to be associated with aromatic, basic, the cyclol theory “readily interprets the total acidic, and hydrocarbon R groups, respectively, number of amino acid residues per molecule, with- even perhaps with individual R groups. In any out the introduction of any ad hoc hypothe~is”~“case a non-random distribution of the proportions and that “The group of proteins with molecular of each residue in proteins in general is to be ex- weights ranging rom 33,600 to 40,500 are closed pected on any fabric hypothesis. On the cyclol cyclols of the type CZcontaining 288 amino acid hypothesis, for example, a-carbons having equiva- Now the presence of imino acids lent environments occur in powers of 2 and 3. . . . (, oxyproline) in a protein prevents its for- It is difficult to avoid interpreting the many cases mation of a complete cyclol such as Cz, and many which have recently been summarized in which proteins in this molecular weight range are known the proportions of many types of residue axe to contain significant amounts of proline: for powers of 2 and 3 as further direct evidence in insulin 10% is rep~rted,~‘for egg albumin 45!&,42 favor of the cyclol fabric. This fabric consists of for zein 9%,43for Bence-Jones protein 3%0,44and an alternation of diazine and triazine‘ hexagons, for 5%.45 Wrinch has stated that “a with symmetries respectively 2 and 3.” Also it future modification” (in regard to the number of has been said by Langm~ir~~that “The occur- residues) “is also introduced if imino acids are rence of these factors, 2 and 3, furnishes a power- present”3c; “these numbers perhaps being modi- ful argument for a geometrical interpretation such fied if imino acids are present”;3e and “if certain as that given by the cyclol theory. In fact, the numbers of imino acid residues are present, these hexagonal arrangement of atoms in the cyclol numbers” (of residues) “may be correspondingly fabric gives directly and automatically a reason m~dified.”~‘This uncertainty regarding the for the existence of the factors 2 and 3 and the effect of the presence of imino acids in cyclols on non-occurrence of such factors as 5 and 7.” the expected number of residues leaves the argu- On examining the cyclol CZ, however, we find ment little force. In fact, even the qualitative that these statements are not justified. The only claim that the cyclol hypothesis implies the exist- factors of 288 are of the form 2” 3m; moreover, ence of polyhedral structures containing certain the framework of the cyclol Cz has the tetrahedral numbers of amino acid residues and so predicts symmetry T, so that if the distribution of side that globular proteins have molecular weights chains conforms to the symmetry of the framework which fall into a sequence of separated classes can the amino acid residues would occur in equivalent be doubted for the same reason. groups of twelve. But in view of the rapid de- It has been claimed36that the cyclol hypothesis crease in magnitude of interatomic forces with explains the facts that proteins contain certain distance there would seem to be little reason for numbers of various particular amino acid residues the distribution of side chains over a large protein and that these numbers are frequently powers of molecule to conform to the symmetry T; it is ac- 2 and 3,46and it is proper that we inquire into the cordingly evident that any residue numbers might nature of the argument. Wrinch statesz7 “An occur for the cyclol Cz. We conclude that the individual R group” (side chain) “is presumably cyclol hypothesis does not provide an explanation attached, not to just any a-carbon atom, but only of the occurrence of amino acid residues in num- to those whose environment makes them appro- bers equal to products of powers of 2 and 3. priate in view of its specific nature. As an example Although there is little reason to expect that the dis- of different environments, we may refer to the tribution of side chains would correspond to the symmetry of the framework, it is interesting to note that the logical (42) H.0. Calvery, J. Bid. Chem.. 94, 613 (1931). application of the methods of argument used by Wrinch (43) T. B. Osborne and L. M. Liddle, Am. J. Physiol., 26, 304 suggests strongly that sixty residues of each of two amino (1910). (44) C. L. A. Schmidt, “Chemistry of Amino Acids and Proteins,” acids should be present in a CP cyclol. This cyclol con- C. C.Thomas, Springfield, Ill., 1938. tains twenty lacunae of a particular type-each surrounded (45) Unpublished determination by one of the authors. by a nearly coplanar border of twelve diazine and triazine (46) M. Bergmann and C. Niemann, J. Bid. Chcm., 118, 77 (1936); 118 307 (1987). (47) I. Langmuir, Symfiosia on Quant. Bid., 6,135 (1938). 1866 LINUSPAULING AND CARLNIEMANN Vol. 61 rings. Each of these has trigonal symmetry so far as gether by peptide bonds need not be reviewed this near environment is concerned. Hence it might well here. be expected that a particular amino acid would be repre- sented by three residues about each of these twenty The question now arises as to whether the poly- lacunae, giving a total of sixty residues. But the number peptide chains or rings contain many or few amino 60 cannot be expressed in the form 2" 3", it is not a factor acid residues. We believe that the chains or of 288, and the integer nearest the quotient 288/60, 5, rings contain many residues-usually several also cannot be expressed in the form 2" 3". hundred. The fact that in general proteins in One of the most straightforward arguments ad- solution retain molecular weights of the order of vanced by Wrin~h~',~is that a protein surface 17,000 or more until they are subjected to condi- film must have all its side chains on the same side, tions under which peptide hydrolysis occurs gives which would be the case for a cyclol fabric but not strong support to this view. It seems to us highly for an extended polypeptide chain. This argu- unlikely that any protein consist of peptide rings ment now has lost its significance through the re- containing a small number of residues (two to six) cently obtained strong evidence that proteins in held together by hydrogen bonds or similar rela- films have the polypeptide stru~ttlre,~~~~~and not tively weak forces, since, contrary to fact, in acid the cyclol structure. or basic solution a protein molecule of this type There can be found in the papers by Wrinch would be decomposed at once into its constituent many additional statements which might be con- small molecules. strued as arguments in support of the cyclol struc- There exists little evidence as to whether a long ture. None of these seems to us to have enough peptide chain in a protein has free ends or forms significance to justify discussion one more to become a ring. This is, 6. The Present State of the Protein Problem in fact, a relatively unimportant question with re- spect to the structure, as it involves only one pep- The amount of experimental information about tide bond in hundreds, but it may be of consider- proteins is very great, but in general the proc- able importance with respect to enzymatic attack esses of deducing conclusions regarding the struc- and biological behavior in general. ture of proteins from the experimental results are A native protein molecule with specific prop- so involved, the arguments are so lacking in rigor, erties must possess a definite configuration, in- and the conclusions are so indefinite that it would volving the coiling of the polypeptide chain or not be possible to present the experimental evi- chains in a rather well-defined way.4g The forces dence at the basis of our ideas of protein struc- holding the molecule in this configuration may ture@in a brief discussion. In the following para- arise in part from peptide bonds between side- graphs we outline our present opinions regarding chain amino and carboxyl groups or from side- the structure of protein molecules, without at- chain ester bonds or SS bonds; in the main, tempting to do more than indicate the general however, they are probably due to hydrogen bonds nature of the evidence supporting them. These and similar interatomic interactions. Interac- opinions were formed by the consideration not tions of this type, while individually weak, can by only of the experimental evidence obtained from combining their forces stabilize a particular struc- proteins themselves but also of the information ture for a molecule as large as that of a protein. regarding interatomic interactions and molecular In some cases (trypsin, hemoglobin) the structure structure in general which has been gathered by of the native protein is the most stable of those the study of simpler molecules. accessible to the polypeptide chain ; the structure We are interested here only in the role of amino can then be reassumed by the molecule after de- acids in proteins-that is, in the simple proteins naturation. In other cases (antibodies) the na- (consisting only of a-amino and a-imino acids) tive configuration is not the most stable of those and the corresponding parts of conjugated pro- accessible, but is an unstable configuration im- teins; the structure and linkages of prosthetic pressed on the molecule by its environment (the groups will be ignored. influence of the antigen) during its synthesis; de- The great body of evidence indicating strongly naturation is not reversible for such a protein. that the amino acids in proteins are linked to- Crystal structure investigations have shown (48) We believe that our views regarding the structure of protein molecules are essentially the same as those of many other investi- (49) H. Wu,Chinese J. Physiol., 6, 321 (1931); A. E. Mirsky and gators interested in this problem. L. Pauling, Proc. Not. Acad. Sci., 22, 439 (1938). July, 1939 STRUCTUREOF PROTEINS 1867 that in general the distribution of matter in a might well cause every second or third residue in molecule is rather uniform. A protein layer in a chain to be a glycine residue, for example. which the peptide backbones are essentially co- The evidence regarding frequencies of residues planar (as in the p-keratin structure) has a thick- involving powers of two and three leads to the ness of about 10 A. If these layers were ar- conclusion that there are 288 residues in the mole- ranged as surfaces of a polyhedron, forming a cules of some simple proteins. It is not to be ex- cage molecule, there would occur great steric in- pected that this number will be adhered to rigor- teractions of the side chains at the edges and cor- ously, Some variation in structure at the ends ners. (This has been used above as one of the ar- of a peptide chain might be anticipated; more- guments against the CZcyclol structure.) We ac- over, amino acids might enter into the structure cordingly believe that proteins do not have such of proteins in some other way than the cyclic se- cage structure^.^^ A compact structure for a glo- quence along the main chain.51 The structural bular protein might involve the superposition of significance of the number 288 is not clear at pres- several parallel layers, as suggested by Astbury, ent. It seems to us, however, very unlikely that or the folding of the polypeptide chain in a more the existence of favored molecular weights (or complex way. residue numbers) of proteins is the result of greater One feature of the cyclol hypothesis-the re- thermodynamic stability of these molecules than striction of the molecule to one of a few configura- of similar molecules which are somewhat smaller tions, such as Cz-seems to us unsatisfactory rather or larger, since there are no interatomic forces than desirable. The great versatility of antibodies known which could effect this additional stabili- in complementing antigens of the most varied na- zation of molecules of certain sizes. It seems ture must be the reflection of a correspondingly probable that the phenomenon is to be given a wide choice of configuration by the antibody pre- biological rather than a chemical explanation- cursor. We feel that the biological significance of we believe that the existence of molecular-weight proteins is the result in large part of their versa- classes of proteins is due to the retention of this tility, of the ability of the polypeptide chain to ac- protein property through the long process of the cept and retain that configuration which is suited evolution of species. to a special purpose from among the very great We wish to express our thanks to Dr. R. B. Corey number of possible configurations accessible to it. for his continued assistance and advice in the prepa- Proteins are known to contain the residues of ration of this paper, and also to other colleagues some twenty-five amino acids and it is not un- who have discussed these questions with us. likely that this number will be increased in the Summary future. A great problem in protein chemistry is It is concluded from a critical examination of the that of the order of the constituent amino acid X-ray evidence and other arguments which have residues in the peptide chains. Considerable been proposed in support of the cyclol hypothesis evidence has been accum~lated~~suggesting of the structure of proteins that these arguments strongly that the stoichiometry of the polypep- have little force. Bond energy values and heats tide framework of protein molecules can be inter- of combustion of substances are shown to lead to This preted in terms of a simple basic principle. the prediction that a protein with the cyclol struc- principle states that the number of each individual ture would be less stable than with the polypep- amino acid residue and the total number of all tide chain structure by a very large amount, amino acid residues contained in a protein mole- about 28 kcal./mole of amino acid residues; and cule can be expressed as the product of powers of the conclusion is drawn that proteins do not have the integers two and three. Although there is no the cyclol structure. Other arguments leading direct and unambiguous experimental evidence to the same conclusion are also presented. A confirming the idea that the constituent amino brief discussion is given summarizing the present acid residues are arranged in a periodic manner state of the protein problem, with especial refer- along the peptide chain, there is also no experi- ence to polypeptide chain structures. mental evidence which would deny such a possi- bility, and it seems probable that steric factors PASADENA,CALIF. RECEIVEDAPRIL 22, 1939 (51) H. Jensen and E. A. Evans, Jr., J. Bioi. Chem., 108, 1 (1935), (50) Wrinch recently has suggestedSb that even if proteins are not have shown that insulin probably contains several phenylalanine cyclols the cage structure might be significant. groups attached only by side-chain bonds to the main peptide chain.