
Commentary

Taming combinatorial explosion

Peter Schuster*

Institut für Theoretische Chemie und Molekulare Strukturbiologie, Universität Wien, A-1090 Vienna, Austria

Assembling objects from building blocks by means of predefined combination rules leads to combinatorial explosions. Indeed, it does not matter how many classes of building blocks or alternatives of combination rules are given, provided one of the two is two or larger, because then the number of possible objects commonly increases exponentially with the number of elements and soon exceeds the realizations that can be sustained by taking together all available resources (for a few simple examples, see Fig. 1). Problems of this kind are encountered in biophysical chemistry when biopolymer molecules are built from several classes of monomers, in combinatorial chemistry when new molecules are formed through combination of several reactions, or in molecular genetics when regulation and control networks are considered. Thus, combinatorial explosion is a universal threat to biopolymer sequences and structures as well as reaction and controlling networks. Examples are known from biology, in particular metabolic, genetic, developmental, signaling, and neural networks. If this is true, how, then, can organized objects originate? Must not all processes that are not regulated externally end up in a highly diverse mess of molecular species, each one at best realized only in a few molecules? The frequently given answers invoke self-organization as a (universal) principle introducing order into diverse manifolds. In general, self-organization requires nonequilibrium conditions and some kind of self-enhancement. Both criteria are often fulfilled and commonly met in biology under realistic conditions. The main questions, nevertheless, remain: Which are the chemical driving forces or reaction mechanisms that shape organized networks, for example, those we see in nature? What limits the size, the diversity, and the complexity of chemical reaction networks or, in particular, what determines the properties of the ones that operate in living organisms?

Morowitz et al. (1) present in this issue of PNAS a study on the origin of intermediary metabolism that can be interpreted as an illustrative example of a new strategy reducing complexity and undesirable diversity. They start from the Beilstein, the well known source of information in organic chemistry, which represents the most comprehensive database of organic molecules and contains more than 3.5 million entries, and apply a small number of simple and plausible selection rules: (i) The compounds considered contain six or fewer carbon atoms combined with an arbitrary (or in principle unlimited) number of hydrogen and oxygen atoms.† (ii) Compounds with too low a solubility in water are eliminated. (iii) Compounds with heats of combustion exceeding a predefined threshold value are not considered.

These three rules are readily converted into practically equivalent restrictions on the numbers and ratios of atoms, #C, #C/#O, and #H/#O. Then follows a combination of more specific rules (iv), which exclude structural elements that are characteristic of low stability in a prebiotic or biological scenario. These are radicals, energetically rich ions, and molecules containing highly reactive bonds, C=C=C, C≡C, or O–O, respectively. The final result is a subset of 153 molecules, and, not unexpectedly, all members of the citric acid cycle are in this subset.

Had no restrictions other than the specific rules (iv) been applied, the number of compounds would have been almost as large as 7,500 (Fig. 1).

It is worthwhile, nevertheless, to investigate critically and in detail the assumptions applied by Morowitz et al. (1). The selection rules used to reduce diversity are of differing characters. Rule i, the restriction to compounds containing six or fewer carbon atoms, is a plausible ad hoc assumption because the tricarboxylic acids appearing in the citric acid cycle contain six carbon atoms. It is, however, rather difficult to justify the assumption on the basis of kinetics because there are many processes with high efficiency and sufficiently high rate constants that lead to larger organic compounds, which are, nevertheless, soluble in water and have small enough heats of combustion. On the other hand, it is evident that larger molecules are generally more difficult to obtain from precursor C1-compounds. Rule i thus appears to be a kind of "best compromise" bridging insufficient knowledge on the frequency of chemical transformations among CmHnOp compounds and the data on current cellular metabolism. In contrast, the other two rules, ii and iii, have a firm physical and chemical basis in the context of prebiotic evolution and present day cellular biology. Sufficient solubility in aqueous solution is a conditio sine qua non for reaction partners in the cytoplasm. A high heat of combustion implies a large distance in free energy between the compound and its equivalent in (xCO2 + yH2O). Such a large free energy distance is a hardly surmountable obstacle for prebiotic synthesis from CO2, H2O, and reduction equivalents, which after all must occur without enzymatic catalysis. Rules i–iv reduce the numbers of organic molecules from somewhat less than 10,000 to 153. The remaining molecules are good candidates for the partners in prebiotic reaction networks, and, thus, Morowitz et al. (1) set the stage for modeling plausible precursors of present day core metabolism.

Early autotrophic organisms, like most present-day bacteria, were unable to perform photosynthesis. Then, the reductive citric acid cycle can serve as the major source of carbon compounds with two or more carbon atoms, which are synthesized through reduction of carbon dioxide. How could it be avoided that this synthesis ends up in a mess of thousands of compounds, each one present at very low concentration only? The reductive citric acid cycle is autocatalytic and can be formally written as ($[\mathrm{C_6H_5O_7}]^{3-}$ = citrate):

$$[\mathrm{C_6H_5O_7}]^{3-} + 6\,\mathrm{CO_2} + 9\,\mathrm{H_2} \rightarrow 2\,[\mathrm{C_6H_5O_7}]^{3-} + 5\,\mathrm{H_2O} + 3\,\mathrm{H^+}.$$

Autocatalytic reactions bind resources in the sense that they convert all available

See companion article on page 7704.

*E-mail: [email protected].

†The numbers of hydrogen and oxygen atoms are, of course, determined by the number of carbon atoms through the building principle of organic molecules (also see Fig. 1).

Article published online before print: Proc. Natl. Acad. Sci. USA, 10.1073/pnas.150237097. Article and publication date are at www.pnas.org/cgi/doi/10.1073/pnas.150237097

7678–7680 | PNAS | July 5, 2000 | vol. 97 | no. 14

Fig. 1. Examples of combinatorial explosion and a catalytic cycle reducing the numbers of possible objects. A shows polynucleotide sequences with increasing chain length n. The number of possibilities is $4^n$ and thus increases exponentially with n. B presents the numbers of possible CmHnOp compounds, computed from simple-minded combinatorial expressions assigning three, two, and one possibilities for introducing oxygen-containing functions into CH3, CH2, and CH groups, respectively. Clearly, we see an indication of exponential increase. These numbers are compared with those derived in ref. 1 through selection of compounds suitable for a primitive metabolism in early evolution. C presents diagrams for interactions. Here we are dealing with a single class of elements represented by squares. The elements are coupled through their edges to the neighboring squares. We show possible patterns on a two-dimensional (square) lattice. The numbers of these patterns, related to "polyominoes" or "animals" defined in discrete mathematics, increase exponentially with the numbers of squares (2). D, ultimately, represents a cycle of catalytic reactions (3). The catalysts, the enzymes En, are synthesized from substrates Sn. Closure of the cycle leads to an autocatalytic ensemble. Self-enhancement discriminates against molecules that are not members of the cycle and reduces the numbers of possibilities.
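The growth described for panels A and C of Fig. 1 can be illustrated with a short numerical sketch. The caption does not say exactly which lattice patterns are counted, so the code below assumes one common convention: fixed polyominoes, that is, edge-connected sets of squares that are considered identical only when they differ by a translation. The function names `count_sequences` and `count_fixed_polyominoes` are hypothetical and chosen only for this illustration.

```python
def count_sequences(n, alphabet_size=4):
    """Number of distinct sequences of length n over an alphabet
    (four letters for polynucleotides, as in panel A of Fig. 1)."""
    return alphabet_size ** n


def count_fixed_polyominoes(max_cells):
    """Count edge-connected patterns of 1..max_cells squares on a square
    lattice, treating two patterns as equal only if one is a translation
    of the other (assumption: 'fixed' polyominoes)."""

    def normalize(cells):
        # Shift the pattern so that its smallest x and y are both zero.
        min_x = min(x for x, _ in cells)
        min_y = min(y for _, y in cells)
        return frozenset((x - min_x, y - min_y) for x, y in cells)

    current = {frozenset({(0, 0)})}  # the single one-square pattern
    counts = [len(current)]
    for _ in range(max_cells - 1):
        grown = set()
        for shape in current:
            for x, y in shape:
                # Try to attach a new square on each of the four edges.
                for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    cell = (x + dx, y + dy)
                    if cell not in shape:
                        grown.add(normalize(shape | {cell}))
        current = grown
        counts.append(len(current))
    return counts


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        print(n, count_sequences(n))
    print(count_fixed_polyominoes(6))
```

Both counts grow rapidly with the number of elements, although the brute-force enumeration of lattice patterns becomes slow beyond roughly ten squares; it is meant as an illustration of the explosion, not as an efficient counting method.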

material into their products, and, thus, side reactions not belonging to the autocatalytic cycle are annihilated. Here the product is citrate, and it opens the pathway to a great variety of organic molecules. In other words, because of self-enhancement, the proposed cycle will canalize production of organic matter via its members. The reductive citric acid cycle, thus, represents a source of organic compounds, provided that enough reduction equivalents are available. This requires some early source of chemical energy that can be used for reducing carbon dioxide. Although the autocatalytic nature of the citric acid cycle is a convincing argument in favor of its role in a prebiotic scenario, we should keep in mind that the reactions are now carried out by highly specific proteins. At present, there is no clear experimental evidence that such a reaction cycle would work efficiently without enzyme catalysis. The paper by Morowitz et al. (1), nevertheless, identifies the molecular candidates for a metabolic reaction cycle in aqueous solution.

How could an early metabolism evolve toward the highly elaborate present-day reaction network? Or, in the sense of the initial question, how could evolution avoid being caught in "combinatorial explosion"? Let us consider two examples, one a proposed mechanism of early evolution and the other an experimentally tested process: autocatalytic reaction networks (4–7) and replication induced by templates (3). Networks of catalytic reactions were introduced already in the 1960s and 1970s as models for self-organization of biological macromolecules as well as for genetic and epigenetic control (4–7). Independently of their possible role in early evolution, genetic regulatory networks have now become a central issue in theoretical genome research (8–11). Reaction networks showing autocatalysis as an ensemble property were discussed as canalizing devices for the evolution of biopolymers (5, 6). The numbers of polymers, which play a role in such reaction networks, are drastically reduced through the occurrence of autocatalytic subsets. Thus, autocatalysis in the form of a reaction cycle is a possible strategy for taming combinatorial complexity, but there are more efficient and more powerful ways to do so. The "great invention of nature" is to reduce the catalytic cycle to one or two members. Autocatalysis is then a result of the logically simple but chemically difficult process of copying. Template-induced synthesis is found with oligonucleotides (12) and oligopeptides (13). Oligo- and polynucleotides are ideal templates because their template action is built directly into the molecular structure and is independent of specific sequence requirements. Therefore, they may be characterized as obligatory templates. Polynucleotides are copied either directly or via an intermediate in the form of a uniquely defined negative copy. Template action of oligopeptides was found with specific sequences only, and, in this case, the capacity to perform autocatalysis is sequence-specific and thus "non-obligatory." In particular, Reza Ghadiri's group (13) used sequences forming leucine zippers consisting of a ratchet-like arrangement of valine and leucine residues. After template-induced autocatalysis has been introduced, the numbers of alternatives are no longer important because self-enhancement and selection will lead automatically to a single or a few dominant species. Indeed, the exponentially exploding diversity of polynucleotides is not at all a restriction for the evolution of genes and genomes. In short, template-induced autocatalytic processes are excellent means for the taming of the combinatorial explosion.

A catalytic cycle, for example, the one shown in Fig. 1D, as an ensemble behaves essentially like an autocatalyst. If a system contains several catalytic cycles, selection takes place between individual cycles and the result is again a drastic reduction in the diversity of molecular species. There are, however, two points that require attention: (i) The citric acid cycle is different from the catalytic cycles shown in Fig. 1D or discussed in refs. 3 and 5 because, in the former case, the protein catalysts are not synthesized through reactions within the cycle. (ii) Catalytic networks are not obligatory autocatalysts as polynucleotide templates are. This means that mutations are not regularly conserved and passed on by copying the variant template. Evolution of catalytic cycles is a much more involved process than simple mutation (7).

Coming back again to metabolic cycles, we remark that the evolution of specific catalysts for individual reactions is not intrinsic to the network. A metabolic cycle, like the citric acid cycle discussed in ref. 1, is a straightforward property of the entity carrying it, be it a functionally coupled ensemble or an organism. Because the enzymes are not synthesized through reactions within the cycle, they are not part of the set of molecules produced by the cycle. The enzymes, nevertheless, could be developed by the carrier of the cycle, and higher metabolic efficiency would be beneficial for the whole entity. After enzyme development came under control of a genetic regulatory system, the proteins could be improved by the conventional mechanics of mutation and selection. In other words, an early autonomous, eventually autocatalytic, precursor of the current citric acid cycle could evolve into the controlled core of present day metabolism through a gene-regulated development of enzymes. What remains to be shown, however, is the existence and kinetic persistence of such a cycle of reactions without enzyme catalysis.
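The claim that self-enhancement and selection lead to a single or a few dominant species can be made concrete with a toy calculation. The sketch below is not a model taken from the commentary or from ref. 1. It assumes simple first-order autocatalysis of each species, a constant total concentration normalized to 1, and no mutation; the rate constants and the function name `simulate_selection` are invented for illustration.

```python
def simulate_selection(rates, x0, dt=0.01, steps=5000):
    """Euler integration of dx_i/dt = x_i * (k_i - phi), where
    phi = sum_j k_j * x_j keeps the total concentration constant.

    rates: hypothetical autocatalytic rate constants k_i
    x0:    initial relative concentrations (should sum to 1)
    """
    x = list(x0)
    for _ in range(steps):
        # Mean rate of production; acts as a dilution flux on every species.
        phi = sum(k * xi for k, xi in zip(rates, x))
        x = [xi + dt * xi * (k - phi) for k, xi in zip(rates, x)]
        # Renormalize to remove the small drift introduced by the Euler step.
        total = sum(x)
        x = [xi / total for xi in x]
    return x


if __name__ == "__main__":
    # Three competing autocatalysts starting at equal concentrations.
    final = simulate_selection(rates=[1.0, 1.2, 1.5], x0=[1 / 3, 1 / 3, 1 / 3])
    for k, xi in zip([1.0, 1.2, 1.5], final):
        print(f"k = {k}: relative concentration {xi:.3e}")
```

Under these assumptions the species with the largest rate constant takes over almost the whole population, regardless of how many slower competitors are present at the start, which is the sense in which autocatalysis reduces the number of realized alternatives.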

1. Morowitz, H. J., Kostelnik, J. D., Yang, J. & Cody, G. D. (2000) Proc. Natl. Acad. Sci. USA 97, 7704–7708.
2. Graham, R. L., Grötschel, M. & Lovász, L., eds. (1995) Handbook of Combinatorics (MIT Press, Cambridge, MA), Vol. II, pp. 1938–1942.
3. Eigen, M. & Schuster, P. (1977) Naturwissenschaften 64, 541–565.
4. Kauffman, S. A. (1969) J. Theor. Biol. 22, 437–467.
5. Kauffman, S. A. (1971) J. Cybernetics 1, 71–96.
6. Kauffman, S. A. (1993) The Origins of Order: Self-Organization and Selection in Evolution (Oxford Univ. Press, Oxford).
7. Frank, S. A. (1999) J. Theor. Biol. 197, 281–294.
8. Thomas, R. & D'Ari, R. (1990) Biological Feedback (CRC, Boca Raton, FL).
9. Mestl, T., Plahte, E. & Omholt, S. W. (1995) J. Theor. Biol. 176, 291–300.
10. McAdams, H. H. & Arkin, A. (1998) Annu. Rev. Biophys. Biomol. Struct. 27, 199–224.
11. Mendoza, L., Thieffry, D. & Alvarez-Buylla, E. R. (1999) Bioinformatics 15, 593–606.
12. Orgel, L. E. (1992) Nature (London) 358, 203–209.
13. Lee, D. H., Granja, J. R., Martinez, J. A., Severin, K. & Ghadiri, M. R. (1996) Nature (London) 382, 525–528.
