<<

134 CHEMISTRY: J. LEDERBERG PROC. N. A. S.

This article was dedicated to EDWIN BIDWELL WILSON on the occasion of the fiftieth anniversary of his service as Managing Editor of the Academy's PROCEEDINGS. The author thanks John A. Hartigan for pointing out a significant slip in an earlier version.

* Prepared in part in connection with research at Princeton University sponsored by the U.S. Army Research Office (Durham). Reproduction permitted for any purposes of the U.S. Government.
1 Fisher, R. A., Proc. Cambridge Phil. Soc., 22, 700-725 (1925).
2 Cramer, H., Mathematical Methods of Statistics (Stockholm: Almqvist and Wiksells, 1945; Princeton: University Press, 1946), p. 480.
3 Bhattacharyya, A., Sankhya, 8, 1-14 (1946).
4 Chapman, D. G., and H. Robbins, Ann. Math. Statist., 22, 581-586 (1951).
5 Mosteller, F., Ann. Math. Statist., 17, 377-408 (1946).
6 Mandel, J., and R. D. Stiehler, J. Res. Nat. Bur. Std., 53, 155-159 (1954).
7 Doob, J. L., Stochastic Processes (New York: John Wiley and Sons, 1953), p. 90.
8 Tukey, J. W., Ann. Math. Statist., 18, 529-539 (1947).
9 Shapiro, S. S., and M. B. Wilk, Ann. Math. Statist., 33, 1499 (1963).
10 Bennett, C. A., Ann. Math. Statist., 24, 138 (1953).
11 Jung, J., Arkiv. Mat., 3, 199-209 (1955).
12 David, F. N., and N. L. Johnson, Biometrika, 41, 228-240 (1954).

TOPOLOGICAL MAPPING OF ORGANIC MOLECULES*
BY JOSHUA LEDERBERG

DEPARTMENT OF GENETICS, STANFORD UNIVERSITY SCHOOL OF MEDICINE, PALO ALTO

Communicated November 23, 1964

A structural formula in organic chemistry is a statement of the topological connectivity of the atoms of a compound. However, the topological theory of organic chemical structure has not been formally developed. Partly in consequence, the taxonomy of the field (i.e., its nomenclature, notation, and homology) lags behind its substance, which impedes communication, whether this be information retrieval or professional education. Witness the mystification often provoked by the proper name of a new drug. The need for a more complete formal system became acutely evident in an effort to write computer programs for the logical analysis of mass spectra.1,2 It was found that mapping organic structures on standardized forms simplified the problem, and this will be illustrated here.

Tree Structures.-Acyclic molecules are easy to standardize, but topological principles are hardly used in current practice. Over thirty years ago, Henze and Blair3 pointed out, in their enumeration of isomers, that a unique centroid can be found in any chemical tree. This is either a link that evenly divides the skeleton of the tree, or a single atom each branch from which carries less than half of the skeletal atoms. The unique centroid is then the starting point for a canonical mapping of the tree, following simple rules of precedence of the constituent radicals according to their composition and topological structure. A compact, unique, and unambiguous notational system1,2 has been established from these canons and need detain us no further here.

Cyclic Structures.-Rings are much more difficult to process on a node-by-node basis. Ambiguities due to symmetry are usual, and many paths can be evaluated

VOL. 53, 1965 CHEMISTRY: J. LEDERBERG 135

only by recursively searching through the entire graph. This approach was therefore abandoned in favor of a fundamental classification of the graphs.

VERTICES   FORM(S)              EXAMPLE
0          CIRCLE               (not legible)
2A         BICYCLANE            (not legible)
4A         TETRAHEDRON          TRIPHENYLENE
4B         TRICYCLANE           (not legible)
6A         PRISM                (not legible)
8A         CUBE                 CUBANE
8B         BIPENTAGON           BENZOPYRENE
8C         (not legible)        PERYLENE
10A        BIHEXAGON            BENZOPERYLENE
10B        (not legible)        DIBENZOCHRYSENE
10C        (not legible)        DIBENZOPYRENE
10E        PENTAGONAL PRISM     ETHANEDIYLIDENE-CYCLOPENTAPENTALENE

(The FIGURE and POLYGON columns of the original are drawings and are not reproduced.)

FIG. 1.-Fundamental trivalent polyhedra, including degenerate forms, with up to 10 vertices. Examples are drawn from polycyclic compounds where possible; these do show an unusual degree of symmetry.

To achieve this, a number of simplifying steps are introduced. The first of these is to isolate the paths within the ring. The classification then depends on the set of branch points. Organic rings rarely have more than three branches at any point; an instance of four branches can be accommodated by exception. A second simplification then asks only for a classification of regular trivalent graphs.

How, then, can the set of trivalent graphs be systematically arranged, and how can we be assured of having deduced the entire set, without isomorphic redundancies? The graphs of Figure 2 constitute such a set, of order 8, except for the gauche forms discussed later. Similar sets of orders 10 and 12 have also been generated on the computer.

Polygonal graphs are relatively easy to compute, but they fail to show many of the symmetries of the figures. This is dramatized by the two isomorphic polygonal representations of the bipentagon. Furthermore, not all the graphs have Hamilton circuits (i.e., can be represented as chorded polygons). Some, like 8M, require additional vertices, and these are not so readily generated by a polygon algorithm.

A third basis was therefore introduced, the trivalent graphs being identified with polyhedra, including some degenerate forms and derivations from them. As shown

in Figure 1, the formulation of polyhedra emphasizes the orderly development of the set of graphs and the symmetries of each structure, and thus facilitates the recognition of isomorphisms.

[Figure 2: each graph of order 8 is shown as a POLYGONAL REPRESENTATION beside its POLYHEDRAL FORM; the drawings are not reproduced. Composition codes legible for the unions: (4A + 4A), 8G (4A + 2(2A)), (4A + 4B), (4B + 4B), (6A + 2A), and (2A:2A,2A,2A).]

FIG. 2.-Trivalent graphs of order 8. 8A-C are fundamental forms already seen in Fig. 1. Polygonal representations are computer-generated plots; corresponding polyhedral presentations were drawn by hand. Adjacent to each of the unions is a code for its composition.

Polyhedral Forms.-For topological analysis of a ring the linear paths and the vertices connecting them are first identified. The vertices are simply the branch points, i.e., the atoms with three or more links to the rest of the ensemble. For these purposes a double or triple bond is a single link. The paths are then the intervals between the vertices. A path may be a simple link or a linear string of tandemly

linked atoms. For example, marking the paths of pyrene, (a), gives the diagram (b), which is readily recognized as isomorphic to the prism (c) and its formal graph (d). The isomorphism of (b) with (c) could also be established algorithmically by systematic permutation of the incidence matrix of the graphs.

[Inline figure: PYRENE, panels (a) to (d).]

Figure 1 lists convex trivalent polyhedra with up to 10 vertices, on which structures with up to six rings can be mapped. It also includes some laminar forms, e.g., the circle, bicyclane, and tricyclane, which might be regarded as degenerate polyhedra with 1, 2, or 3 faces, respectively. The series has also been algorithmically expanded further in a computer program.

FIG. 3.-Mapping a complex ring; morphinan. [Panels: MORPHINAN; CANONICAL MAP.]

Any trivalent graph is assumed to represent either a polyhedron5 or a gauche graph of the same order, or the union of two or more graphs of lower order. A union is obtained across a pair of cut edges of two graphs. The derived forms are classified according to the largest polyhedron or symmetrical union, e.g., bitetrahedron, contained in the graph.

Spiro-Atoms.-A quadrivalent vertex is mapped as a collapsed edge of a trivalent graph: >-< collapsing to ><. The parent graph is almost always subject to two or three choices. That partition is chosen which leads to the least complex map, i.e., the nearest to a polyhedron with the fewest appendages. Figure 3 illustrates a mapping of morphinan.

Gauche Graphs.-The Ring Index4 with its 11,524 examples of rings known to organic chemistry contains no example of a finite gauche graph, i.e., one whose representation on the plane has obligatory crossed paths. (Optional crossed-path formulas are sometimes preferred to show the homology of figures to one another.) A theorem of Kuratowski has shown that a gauche graph must contain either Figure 4a or 4b;6 Figure 4a may be discounted as an unlikely pentaspiro complex.
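The reduction described under Polyhedral Forms, and the pyrene example that follows it, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program used for this work: the bond list uses the standard ring numbering of pyrene, the names `reduce_to_paths` and `is_isomorphic` are hypothetical, the ring system is assumed to have no pendant (terminal) atoms, and the isomorphism test permutes vertex labels of an edge list, which is equivalent in effect to permuting the rows and columns of an incidence matrix.

```python
from itertools import permutations

# Carbon skeleton of pyrene (standard ring numbering, hydrogens omitted).
PYRENE_BONDS = [
    ("1", "2"), ("2", "3"), ("3", "3a"), ("3a", "4"), ("4", "5"),
    ("5", "5a"), ("5a", "6"), ("6", "7"), ("7", "8"), ("8", "8a"),
    ("8a", "9"), ("9", "10"), ("10", "10a"), ("10a", "1"),
    ("3a", "10b"), ("10a", "10b"), ("5a", "10c"), ("8a", "10c"),
    ("10b", "10c"),
]

# Triangular prism: two triangles joined by three parallel edges.
PRISM_EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5),
               (0, 3), (1, 4), (2, 5)]


def reduce_to_paths(bonds):
    """Return (vertices, paths): branch points and the paths between them.

    Each path is (start, end, number_of_interior_atoms). Assumes every
    non-branch atom has exactly two neighbours (no pendant atoms).
    """
    adj = {}
    for a, b in bonds:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    branch = {atom for atom, nbrs in adj.items() if len(nbrs) >= 3}
    seen = set()    # directed first steps already walked
    paths = []
    for start in sorted(branch):
        for first in sorted(adj[start]):
            if (start, first) in seen:
                continue
            prev, cur, interior = start, first, 0
            while cur not in branch:          # walk along the chain
                interior += 1
                nxt = next(n for n in adj[cur] if n != prev)
                prev, cur = cur, nxt
            seen.add((start, first))
            seen.add((cur, prev))             # same path, opposite direction
            paths.append((start, cur, interior))
    return sorted(branch), paths


def is_isomorphic(vertices, edges, ref_edges):
    """Brute-force test: try every relabelling of `vertices` onto 0..n-1."""
    ref_vertices = {v for e in ref_edges for v in e}
    if len(ref_vertices) != len(vertices) or len(edges) != len(ref_edges):
        return False
    ref = sorted(tuple(sorted(e)) for e in ref_edges)
    for perm in permutations(range(len(vertices))):
        relabel = dict(zip(vertices, perm))
        mapped = sorted(tuple(sorted((relabel[u], relabel[v])))
                        for u, v in edges)
        if mapped == ref:
            return True
    return False


vertices, paths = reduce_to_paths(PYRENE_BONDS)
edges = [(u, v) for u, v, _ in paths]
print(len(vertices), "vertices,", len(paths), "paths")
print("maps on the prism:", is_isomorphic(vertices, edges, PRISM_EDGES))
```

The exhaustive permutation is practical only for small orders (720 relabellings for six vertices); it is shown because it mirrors the systematic permutation mentioned in the text, not because it is efficient.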
Is the nonexistence of Figure 4b a coincidence? Its representation, Figure 4c, as an internally chorded tetrahedron may throw some light on this. A gauche structure would require access of a chemical path to the interior of a urotropinelike molecule, Figure 4d. However, with the interposition of longer paths, it should be possible to fill this topologicochemical hiatus.

[Figure 4, panels a to d; drawings not reproduced.]

Limitations of Topological Description.-Mapping is intended to convey only the connectivity relationship of a set of atoms, which is only the first-order account of a molecular structure. [...] conformations represent another domain, whose computation belongs mainly to numerical analysis rather than topology. Thus, from the standpoint of the present paper, interlocked rings have the same connectivity as separate rings. The catena structures cannot, however, be properly drawn without crossed paths. Many real conformations may also have to be shown with superimposed paths in any actual projection on the plane.

Applications of Topological Mapping.-The primary purpose of this analysis was to provide a framework for computable logic in organic chemistry, especially the analysis of mass spectra. The theoretical ideas of this presentation are very primitive and its main virtue may be to provoke a more sophisticated mathematical formulation. Once a standard form is chosen for mapping, canons can be elaborated for the ordering of paths, leading in turn to a systematic, compact, computable notation for organic structures.2 The main burden of the standardization is a dissection of the symmetries of the diagram, then a rule of choice among the permutations of the labels. Thus, in the example of Figure 3, the diagram 8B has fourfold symmetry. The symmetry permutations of the principal polygon can be expressed, with the corresponding path lists, as:

Path   [12345678]   (28)(37)(46)   (15)(26)(37)(48)   (15)(24)(68)
12     -            NCCC           -                  CCC
23     CCC          Fused          -                  0
34     0            -              Fused              CCC
45     CCC          -              CCCN               -
56     -            CCC            -                  NCCC
67     -            0              CCC                Fused
78     Fused        CCC            0                  -
81     CCCN         -              CCC                -
15     C            C              C                  C
28     -            -              -                  -
37     -            -              -                  -
46     -            -              -                  -

The top-heavy path list of (28)(37)(46) makes this the canonical choice. A linear code for the mapping of morphinan is then (8H NCCC,$,-,-,CCC,0,CCC,-,C,-, or more compactly, (8H N3,$,,,3,0,3,,1), the map being thus reduced to a twelve-dimensional vector. A computer program would unambiguously recognize any of the permuted path lists as equivalent forms, and can perform the tedious exercise just concluded.
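The exercise just concluded can be sketched as a short program. The sketch below starts from the path list under the identity labelling, applies the three non-trivial symmetry permutations, and picks one list as canonical. Two points are assumptions rather than statements of the canons used in this paper: the reading of "top heavy" as "largest atom counts earliest in the list" (the hypothetical `weight` and `canonical` functions), and the treatment of the entries `0` and `Fused` as carrying no atoms. The helper names `swaps`, `relabel`, and `as_vector` are likewise illustrative.

```python
# Edge order of the twelve-dimensional vector: polygon edges, then chords.
EDGES = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8), (8, 1),
         (1, 5), (2, 8), (3, 7), (4, 6)]

# Path list of morphinan under the identity labelling; "" is an empty path.
IDENTITY_PATHS = dict(zip(EDGES, ["", "CCC", "0", "CCC", "", "", "Fused",
                                  "CCCN", "C", "", "", ""]))

SPECIAL = {"", "0", "Fused"}   # entries that are not strings of atoms


def swaps(*pairs):
    """Build a label permutation from disjoint transpositions."""
    perm = {}
    for a, b in pairs:
        perm[a], perm[b] = b, a
    return perm


SYMMETRIES = [
    swaps(),                                   # identity
    swaps((2, 8), (3, 7), (4, 6)),
    swaps((1, 5), (2, 6), (3, 7), (4, 8)),
    swaps((1, 5), (2, 4), (6, 8)),
]


def relabel(paths, perm):
    """Path list after renumbering the vertices by `perm` (old -> new)."""
    out = {}
    for (a, b), atoms in paths.items():
        na, nb = perm.get(a, a), perm.get(b, b)
        if (na, nb) in paths:
            out[(na, nb)] = atoms
        else:
            # The edge is now read in the opposite direction.
            out[(nb, na)] = atoms if atoms in SPECIAL else atoms[::-1]
    return out


def as_vector(paths):
    return [paths[e] for e in EDGES]


def weight(entry):
    """Assumed ordering key: number of atoms carried by a path."""
    return 0 if entry in SPECIAL else len(entry)


def canonical(paths, symmetries):
    """Pick the relabelling whose weights are largest, earliest first."""
    candidates = [relabel(paths, p) for p in symmetries]
    return max(candidates, key=lambda c: [weight(x) for x in as_vector(c)])


best = canonical(IDENTITY_PATHS, SYMMETRIES)
print(",".join(x or "-" for x in as_vector(best)))
```

Under these assumptions the selected list is the one tabulated for (28)(37)(46), beginning NCCC, Fused. Any of the four permuted lists given as input leads to the same choice, since the set of candidates is the same.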
More important than the notation, this framework enables a computer program to generate hypotheses of organic molecular structure in an algorithmic, exhaustive, and, above all, redundancy-free sequence, important if the computer is to amplify human logical capacity in this field.

Summary.-An algorithmic approach to the topological mapping of organic molecules is presented. Tree structures are initiated at a unique centroid of the skeletal atoms. Cyclic structures are more difficult. However, the set of regular graphs of degree 3 can be generated on a basic set of polyhedra. Any organic ring molecule can be mapped on one of these graphs. Exceptional quadrivalent vertices (spiro fusions) are expanded to a pair of trivalent vertices. Gauche graphs, with obligatory crossed paths, have not yet been realized in organic chemistry, probably owing to the difficulty of accessing a chemical path as an interior chord of a closed molecule.
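The centroid rule for trees, stated under Tree Structures and recalled in the Summary, can also be sketched in Python. The sketch assumes the skeleton is given as a list of bonds forming a tree (connected, no rings); the function name `centroid` and the return convention are hypothetical, and the precedence rules that order the radicals around the centroid are not attempted here.

```python
def centroid(bonds):
    """Return ("atom", a) or ("link", (a, b)) for a tree given as bonds.

    An atom is the centroid if every branch from it carries less than half
    of the skeletal atoms; a link is the centroid if it divides the
    skeleton into two equal halves.
    """
    adj = {}
    for a, b in bonds:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    n = len(adj)

    # Breadth-first order from an arbitrary root, recording parents.
    root = next(iter(adj))
    parent = {root: None}
    order = [root]
    for node in order:                  # the list grows while we iterate
        for nb in adj[node]:
            if nb not in parent:
                parent[nb] = node
                order.append(nb)

    # Subtree sizes, accumulated from the leaves upward.
    size = {atom: 1 for atom in adj}
    for node in reversed(order):
        if parent[node] is not None:
            size[parent[node]] += size[node]

    for node in order:
        # Size of the branch reached through each neighbour of `node`.
        branches = {nb: size[nb] if parent[nb] == node else n - size[node]
                    for nb in adj[node]}
        biggest = max(branches.values(), default=0)
        if 2 * biggest < n:
            return ("atom", node)
        if 2 * biggest == n:
            other = next(nb for nb, s in branches.items() if s == biggest)
            return ("link", tuple(sorted((node, other))))
    raise ValueError("input is not a tree")


print(centroid([(1, 2), (2, 3), (3, 4)]))   # straight chain of four atoms
print(centroid([(1, 2), (2, 3), (2, 4)]))   # three atoms on a central atom
```

For the straight chain of four atoms the centroid is the middle link; for the branched skeleton it is the central atom, each of whose branches carries one atom out of four.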

* Research connected with this program has been supported by grants from the NIH (NB-04270 and AI-05160), National Science Foundation (G-6411), and NASA (NsG 81-60).
1 Lederberg, J., Computation of Molecular Formulas for Mass Spectrometry (San Francisco: Holden-Day, 1964).
2 Lederberg, J., DENDRAL-64, A System for Computer Construction, Enumeration and Notation of Organic Molecules as Tree Structures (NASA CR 57029).
3 Henze, H. R., and C. M. Blair, J. Am. Chem. Soc., 53, 3077 (1931).
4 The Ring Index (American Chemical Society, 1964), 2nd ed., suppl. 2.
5 All the polyhedra (but only some of the derived forms) have a polygonal representation, i.e., a Hamilton circuit, which greatly facilitates the computation of these graphs, as it does in many applications of graph theory. Tait7 had conjectured that any true, trivalent polyhedron had a Hamilton circuit. Tutte8 has demonstrated a counterexample with 46 vertices. If this is the smallest exception, the conjecture is an adequate basis for present purposes. No examples of chemical rings actually recorded in the Ring Index defy descriptions in this system.
6 Berge, C., The Theory of Graphs (New York: John Wiley & Sons, 1958), chap. 21.
7 Tait, P. G., Phil. Mag. (Series 5), 17, 30 (1884).
8 Tutte, W. T., J. London Math. Soc., 21, 98 (1946).

ENHANCEMENT OF NATURAL HEMOLYSIN IN ADULT RABBITS AFTER RADIATION*
BY WILLIAM H. TALIAFERRO AND LUCY GRAVES TALIAFERRO

DIVISION OF BIOLOGICAL AND MEDICAL RESEARCH, ARGONNE NATIONAL LABORATORY, ARGONNE, ILLINOIS

Read before the Academy October 13, 1964

Rabbits develop small amounts of natural hemolysin in 9-10 weeks after birth according to Aitken,1 and this hemolysin declines in titer when adult rabbits are irradiated with 500 or 600 r according to Talmage et al.15 The present work deals with postirradiation recovery of this hemolysin during which enhanced production takes place.

In the present paper, natural hemolysin is used to denote the low-titered hemolysin occurring in normal rabbits not intentionally immunized. It probably arises in response to the absorption by rabbits of minute amounts of the widespread Forssman antigen occurring in food and in Forssman-positive infectious organisms. In fact, it is probably identical with some part of the immune gamma-1 Forssman hemolysin arising in rabbits after injection with sheep red blood cells. It is, nevertheless, convenient to retain the term, natural hemolysin, because of the characteristic low titer which fails to stimulate anamnestic sensitivity (see Discussion).

Materials and Methods.-Healthy female Dutch belted rabbits, approximately 2.5 months old and weighing 2 kg, were obtained from Bunnyrun, California. They were kept for over a month in the Argonne Laboratory animal quarters before they were irradiated. During this time and throughout our experiment, they were kept singly in stainless steel cages on a 12-hr regulated light and dark schedule in air-conditioned quarters at about 22°C and were fed an unrestricted diet of Wayne Rabbit Ration pellets. They had no observable intestinal or respiratory infections.