An Approach to Biological Computation: Unicellular Core-Memory Creatures Evolved Using Genetic Algorithms

Hideaki Suzuki
ATR Human Information Processing Research Laboratories
2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0288, Japan
[email protected]

Keywords
core memory, unicellular creature, membrane, biological computation, algorithmic complexity, machine language genetic programming, genetic algorithms

Abstract
A novel machine language genetic programming system that uses one-dimensional core memories is proposed and simulated. The core is compared to a biochemical reaction space, and in imitation of biological molecules, four types of data words (Membrane, Pure data, Operator, and Instruction) are prepared in the core. A program is represented by a sequence of Instructions. During execution of the core, Instructions are transcribed into corresponding Operators, and Operators modify, create, or transfer Pure data. The core is hierarchically partitioned into sections by the Membrane data, and the data transfer between sections by special channel Operators constitutes a tree data-flow structure among sections in the core. In the experiment, genetic algorithms are used to modify program information. A simple machine learning problem is prepared as the environment data set of the creatures (programs), and the fitness value of a creature is calculated from the Pure data excreted by the creature. Programs that output the predefined answer are successfully bred. Several future plans to extend this system are also discussed.

1 Introduction
Recent approaches to designing automatic programming systems in imitation of biological evolution are based on the notion that, over the long history of evolution, some lineages of living things have increased their degree of complexity, defined here as the number of functional genes in a living cell.
For example, higher organisms such as mammals are expected to have about 50,000 genes in each cell, which is about 10 times the number of genes in a yeast cell. Of course, it is controversial whether “functional” or “structural” complexity increases at all in evolution [29], and yet so-called “genomic” complexity, which we may view as the number of genes, has clearly increased during evolution [13]. If we could clarify the mechanisms (the necessary and sufficient conditions) that have facilitated this growth of complexity, we might be able to devise a computational system that increases algorithmic complexity by implementing those mechanisms. This is a strong motivation for many researchers in genetic programming, and with the aim of implementing such a system, various schemes have been proposed and tested [1–3, 5, 11, 26–28, 35, 38–40, 45]. Here I focus on two groups of studies that are closely relevant to this article.

© 2000 Massachusetts Institute of Technology. Artificial Life 5: 367–386 (1999). Downloaded from http://www.mitpressjournals.org/doi/pdf/10.1162/106454699568827 by guest on 25 September 2021.

The first group comprises studies on machine language genetic programming. Several systems have been proposed in this area. Nordin [34, 35] devised the compiling genetic programming system (CGPS), which directly manipulates machine code on a Sun SPARCstation. Ray [38–40] proposed the famous Tierra system, which uses a core memory for breeding self-replicating programs. The author [44, 45] and Huelsbergen [23, 24] independently proposed machine language genetic programming systems that use genetic algorithms (GAs) [21, 22, 32, 33, 43, 46, 49, 50] to evolve programs. The most successful of these systems is Tierra, which achieved the emergence of higher functions such as parasitism between programs.
Tierra was the first system to demonstrate that programs can undergo a kind of open-ended evolution in an appropriate environment. Since it was proposed, a number of different approaches have been taken to extend Tierra into a more advanced system that can evolve programs with high algorithmic complexity. One of them is Ray and Hart’s Network Tierra [41]. In this system, the Tierran core was extended to a vast memory space consisting of a large number of computers distributed throughout the world. Several interesting phenomena have been observed in this system; however, it seems that the creation of programs with high algorithmic complexity has not yet been accomplished there. Another approach to extending Tierra was taken by Adami and by Adami and Brown [1–3]. The Avida system they devised demonstrated how quite complex programs can be evolved in a Tierra-type architecture.

The second group relevant to this article comprises approaches to chemical computation in some mathematical medium. Fontana [20] proposed Algorithmic Chemistry (ALChemy), which manipulates Lisp trees as objects (molecules) and treats combinations of trees as reactions between objects. A similar approach was recently taken by Szuba [51], who designed a chemical-reaction-like system that proceeds by Prolog inference. Rasmussen and colleagues [36, 37] devised a core memory system in which core words react with one another and change their inner codes. Banzhaf and colleagues [8–10, 18, 19] introduced a kind of information object that catalyzes the change of another object. They expressed the objects as binary strings and made them work not only as “operands” but also as “operators” of computation. These studies succeeded in inducing so-called catalytic networks, in which reaction arrows are connected to each other and constitute intricate topologies such as loops. However, from the viewpoint of automatic programming, the functions achieved in these systems are still unsatisfactory.
Compared to these systems, the functions of real organisms are far more advanced and complicated. The processes of a biological system, namely the molecular reactions in a cell, are not simple chemical reactions. They are biochemical reactions catalyzed by enzymes that are created from genetic information that has evolved over three billion years. To create a computational system that can evolve complex functions, we might have to build one that imitates biological systems more elaborately.

Here, I propose a novel evolutionary programming system called SeMar. SeMar is an abbreviation of “sea of matter.” SeMar uses a core memory. The core is compared to a biochemical reaction space, and in imitation of biological matter (substances), four types of data words are prepared in the core: the Membrane, Pure data, Operator, and Instruction. The Membrane can be compared to a lipid bilayer, the Pure data to a small molecule, the Operator to a protein, and the Instruction to a gene. In the experiment, GAs are used to modify the sequence of Instructions that a creature (program) has as its genetic information. To calculate the fitness value of each creature, a core memory is prepared for each creature, and the Instruction sequence is placed in the core together with a sequence of environmental data (Pure data). The execution of the core proceeds through the transcription of Instructions into Operators and the actions of Operators, which induce modifications of Pure data. During these processes, an arbitrary number of Membranes, Pure data, or Operators are inserted or deleted at an arbitrary address of the core.

Artificial Life Volume 5, Number 4

The principal part of this article has already been described in preliminary reports [47, 48].
However, the brevity of those papers allowed only a brief description of the model, and its background was scarcely described. Here, I remedy those problems and give a full description of the basic strategy for SeMar.

The organization of the article is as follows: Section 2 describes several results of preliminary experiments that led me to devise SeMar. In Section 3, I present a concrete implementation of SeMar and the simulation procedure using GAs. The results of a SeMar simulation are given in Section 4, where an external problem is imposed upon the creatures and it is shown that SeMar creatures succeed in creating a program that outputs the desired answer. In Section 5, the characteristics of SeMar are briefly summarized and the differences between SeMar and other programming systems are discussed. The final section (Section 6) is devoted to a description of future plans to extend SeMar.

2 Preliminary Experiments
SeMar stems from a study of a machine language genetic programming system called MUNCs (MUltiple von Neumann Computers) devised by the author [44, 45]. In this section, I survey the journey that led me from MUNCs to SeMar. MUNCs form a system in which machine language programs evolve using GAs. In several experiments using environmental problems, MUNCs succeeded in creating a small functional program [45], and yet they could not create a higher (longer) program with high algorithmic complexity. One of the most serious problems MUNCs suffer from is what I call “evolutionary dead end” (Figure 1). In MUNCs, the instructions in a sequence are executed one by one using an instruction pointer. A “jump” instruction can move this pointer to an earlier address, and once such a loop has been established, the other program regions are never tested, no matter how good the functions within them are. (In an extreme case, a program appeared that executed only 5 of the 200 instructions it had.)
The crossover operation cannot destroy jump instructions that are fixed in the population, so the speed of evolution drops conspicuously. I tried several modifications of the CPU hardware architecture and the basic instruction set, but to no avail. Evolutionary dead end is a serious and inevitable problem for any sequentially executed programming system that evolves using GAs.
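
The dead-end effect can be illustrated with a minimal sketch. It assumes a hypothetical toy machine, not the actual MUNCs instruction set or CPU, which this section does not specify: each instruction is an (opcode, argument) pair, a "jump" opcode sets the instruction pointer to an absolute address, and every other opcode simply advances the pointer. The function name `executed_addresses` and the "work" opcode are illustrative assumptions.

```python
from typing import List, Set, Tuple

# Hypothetical toy encoding: (opcode, argument). Not the MUNCs instruction set.
Instruction = Tuple[str, int]


def executed_addresses(program: List[Instruction], max_steps: int = 1000) -> Set[int]:
    """Run a program sequentially and return the addresses that were executed."""
    visited: Set[int] = set()
    pointer = 0
    for _ in range(max_steps):
        if not 0 <= pointer < len(program):
            break  # the pointer ran off the end, so the program halts
        visited.add(pointer)
        opcode, argument = program[pointer]
        if opcode == "jump":
            pointer = argument  # unconditional jump to an absolute address
        else:
            pointer += 1  # any other opcode only advances the pointer
    return visited


# A 200-instruction program in which address 4 jumps back to address 0.
program: List[Instruction] = [("work", 0)] * 200
program[4] = ("jump", 0)

reached = executed_addresses(program)
print(len(reached), "of", len(program), "instructions executed")
```

Under these assumptions the script prints `5 of 200 instructions executed`. Any change to addresses 5 through 199 leaves the executed set, and therefore the behavior, unchanged, so selection cannot distinguish between variants of the untested region.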
