On the Parameterization of Cartesian Genetic Programming
Paul Kaufmann, Gutenberg School of Management & Economics, Johannes Gutenberg University, Mainz, Germany, [email protected]
Roman Kalkreuth, Department of Computer Science, Dortmund Technical University, Germany, [email protected]

Abstract—In this work, we present a detailed analysis of the Cartesian Genetic Programming (CGP) parametrization of the selection scheme (µ + λ) and the levels back parameter l. We also investigate CGP's mutation operator by decomposing it into self-recombination, node function mutation, and inactive gene randomization operators. We perform experiments in the Boolean and symbolic regression domains with which we contribute to the knowledge about the efficient parametrization of two essential parameters of CGP and the mutation operator.

Index Terms—Cartesian Genetic Programming

I. INTRODUCTION

Genetic Programming (GP) can be considered a nature-inspired search heuristic which allows for the automatic derivation of programs for problem-solving. First work on GP was done by [1], [2], and [3]. Later work by [4] significantly popularized the field of GP. GP is traditionally used with trees as the program representation. Over two decades ago, Miller, Thompson, Kalganova, and Fogarty presented the first publications on Cartesian Genetic Programming (CGP), an encoding model inspired by the two-dimensional array of functional nodes connected by feed-forward wires of a Field Programmable Gate Array (FPGA) device [5]. CGP offers a graph-based representation that, in addition to standard GP problem domains, makes it easy to apply to many graph-based applications such as electronic circuits [6], [7], image processing [8], and neural networks [9]. The geometric and algorithmic parameters of CGP are usually configured according to the findings of Miller [10].

The experiments of Goldman and Punch [11], [12] and Kaufmann and Kalkreuth [13], [14] showed, however, that CGP's standard parameterization can be improved significantly in different ways. In this work, we argue that by taking a more in-depth look into the interplay of CGP's inner mechanisms, the convergence can be improved even further. For instance, CGP relies solely on the mutation operator. The mutation operator decomposes functionally into the self-recombination, node function mutation, and inactive gene randomization operators. The first two operators act only on genes that contribute to the coding of a CGP phenotype. The inactive gene randomization operator exclusively touches genes that are not used for the genotype-to-phenotype mapping. The self-recombination and inactive gene randomization operators are linked together by CGP's optimization algorithm. Traditionally, CGP uses a variant of the (1 + 4) Evolutionary Strategy (ES) that prefers offspring individuals over parents of the same fitness. The consequence of this selection preference is that inactive genes that have been touched by the randomization operator do not contribute to a change of the fitness and therefore always propagate to the next generation. This circumstance allows a steady influx of random genetic material, and we assume that it supports the self-recombination operator in exploring close and distant parts of the search space, i.e., it increases its search horizon.
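As a concrete illustration of this selection scheme, the following minimal Python sketch shows a (1 + λ) ES of the kind commonly used with CGP. The names fitness and mutate are hypothetical user-supplied callables, and fitness minimization with 0 as the perfect score is an assumption made for the example, not a detail taken from the paper.

```python
def one_plus_lambda(parent, fitness, mutate, lam=4, max_generations=10_000):
    """Minimal (1 + lambda) ES, here lambda = 4, with offspring preference.

    `fitness` is minimized (0 = goal function found); `mutate` returns a
    mutated copy of a genotype. Both are user-supplied callables.
    """
    parent_fit = fitness(parent)
    for _ in range(max_generations):
        if parent_fit == 0:
            break
        offspring = [mutate(parent) for _ in range(lam)]
        scored = [(fitness(child), child) for child in offspring]
        child_fit, child = min(scored, key=lambda pair: pair[0])
        # '<=' rather than '<': an equally fit offspring replaces the parent,
        # so mutations of inactive genes propagate to the next generation
        # even though the fitness is unchanged (steady influx of random
        # genetic material).
        if child_fit <= parent_fit:
            parent, parent_fit = child, child_fit
    return parent, parent_fit
```

Replacing the '<=' with '<' in this sketch would discard equally fit offspring and thereby suppress the neutral drift on inactive genes described above.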
The search horizon of the self-recombination operator can also be improved using another inner mechanism of CGP. Goldman and Punch showed that active genes are distributed unevenly in a single-line CGP model [15]. The probability of being used for the encoding of a phenotype rises for genes located close to the beginning of the genotype. However, the effectiveness of the self-recombination operator depends on the amount of genetic and random genetic material accessible to it. Distributing the probability of active gene locations more evenly throughout the genotype by setting the levels back parameter to some fraction of the genotype length improves self-recombination's effectiveness.

The role of node function mutation and its relation to the self-recombination operator has not been investigated yet. This has also been the major motivation for our work. Our experiments show that node function mutation contributes to the successful evolution of goal functions but that self-recombination contributes to a greater extent. Switching off node function mutation can, therefore, improve the convergence rates of CGP.

This paper is divided into six sections. After the Introduction, Section II surveys related work for our experiments. Section III introduces Cartesian Genetic Programming. We discuss the motivation and methodology for our experiments in Section IV. The experiments are presented and discussed in Section V. Section VI concludes our work.

II. RELATED WORK

Many works and contributions in the field of CGP primarily aim at extending and improving CGP with novel ideas such as biogeography-based optimization [16], the introduction of recurrent wires [17], and the evolution of differentiation systems [18]. Recently, Miller [19] published a survey on the status of CGP and reflected on the many extensions that have been introduced to CGP in recent years. However, his article surveys only a few works that address the understanding of the inner mechanisms of standard CGP and its mutation operator. The work of Turner and Miller [20] takes a deeper look into the various "neutrality" effects in CGP. The authors investigate the effects of "algorithmic neutrality", i.e., preferring offspring individuals over parents of the same fitness, and of "neutral genetic drift" in inactive genotype regions. They also take a look into scaling up CGP's genotype sizes in search of the right balance between search space sizes and convergence speeds.

Fig. 1: Cartesian Genetic Program (top) and its encoding (bottom). All nodes contributing to primary outputs are called "active" and all other nodes "inactive". For example, node 6 is an inactive node. (Recoverable figure data: function set 0 = AND, 1 = NAND, 2 = OR, 3 = NOR; encoded genotype 1 0 2 1 2 0 3 4 1 2 2 3 5 4.)

Closer to our work are the results of Kaufmann and Kalkreuth [14], [13]. Using an automatic parameter tuner [21], the authors tested and evaluated different grid and (µ + λ) configurations of CGP and made comparisons to Simulated Annealing (SA). The outcome of their study is that large µ's and huge λ's perform best for regression benchmarks, while SA performs best for combinational benchmarks. The use of an automatic parameter tuner limits, however, the scope of Kaufmann and Kalkreuth's study. It becomes more challenging to derive general dependencies of CGP's parameters from these experiments. This work follows a different approach: the experiments test as many relevant parameter combinations as possible to find relationships between them.

The contribution that can be seen as closest to our work are the findings of Goldman and Punch [15], [11], [12]. In their work, the authors defined a parameter-free mutation procedure, called "single-mutation", that we utilize for our experiments. The mutation procedure implements a loop that picks a gene randomly and mutates it by flipping the respective value within its legal range. Setting the maximal loop count to one would correspond to the conventional mutation operator of CGP. Single-mutation's loop, however, stops after hitting an active gene for the first time. The rationale behind this mutation procedure is that the functional quality needs to be evaluated only if the encoded phenotype changes.
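The following Python sketch illustrates the single-mutation loop described above. The flat integer genotype, the set active_gene_indices, and the helper random_legal_value are assumptions introduced for the example; this is not the authors' implementation.

```python
import random

def single_mutation(genotype, active_gene_indices, random_legal_value):
    """Sketch of Goldman and Punch's "single" mutation.

    Randomly chosen genes are redrawn within their legal range until an
    active gene has been changed; only then does the encoded phenotype
    (and hence the fitness) change, so the loop stops and the offspring
    is evaluated.
    """
    child = list(genotype)
    while True:
        i = random.randrange(len(child))
        child[i] = random_legal_value(i, child[i])  # new legal value != old one
        if i in active_gene_indices:
            return child
```

Genes hit before the first active one are inactive, so the loop also performs the inactive gene randomization discussed in the Introduction as a side effect.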
Another result of Goldman and Punch is the analysis of the distribution of active genes within the genotype of a CGP individual. When CGP's wire length is not restricted, the probability of genes being active is higher if they are located at the beginning of the genotype. A cluster of active genes at the beginning of a genotype results in a representational bias towards smaller phenotypes. The evolution of functions with larger topologies is hampered by this property [15]. Goldman and Punch also conjectured that the steady injection of random genes into the inactive genotype regions may be beneficial for CGP. In this work, we validate this hypothesis experimentally. Compared to the number of works that introduce manifold extensions to CGP, only few works address the decomposition of CGP's mutation operator.

III. CARTESIAN GENETIC PROGRAMMING

This section introduces the representation model, the evolutionary operator, and the optimization procedure of CGP.

A. The Representation Model

Cartesian Genetic Programming is a representation model that encodes the goal function as a directed acyclic graph (DAG) [22]. An exemplification of the encoding is illustrated in Fig. 1. The graph nodes are arranged as an nc × nr grid and are connected by feed-forward wires (cf. top of Fig. 1). Inputs to the graph are sourced in by ni primary inputs, and the graph's outputs are fed into no primary output nodes. All nodes are enumerated column-wise and from left to right. A wire can be restricted to span at most l columns. Each node may have up to na ingoing wires and computes a single output. The lengths of wires connecting primary outputs with inner nodes and inner nodes with primary inputs are restricted by the levels back parameter l. Usually, l is set to span the entire genotype (l = nc). Each node may implement a function from the functional set.

The graph in CGP is encoded by a linear list of integers, as shown at the bottom of Fig. 1. The first na + 1 integers encode the na input wires (connection genes) and the function gene of the upper left node of the grid. In Fig. 1, the upper left node with the index 3 connects to the primary inputs 1 and 0 and computes the function f2 = OR. The encoding of node three is, therefore, "1 0 2". The next na + 1 integers encode the configuration of node four, and so on.
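To make the genotype-to-phenotype mapping concrete, here is a small Python sketch that recovers the active nodes of the genotype shown at the bottom of Fig. 1 (function set 0 = AND, 1 = NAND, 2 = OR, 3 = NOR; ni = 3, na = 2, no = 2). The decoder is an illustrative sketch under these assumptions, not code from the paper.

```python
def active_nodes(genome, n_inputs, n_outputs, arity=2):
    """Return the set of active inner node indices of a CGP genome.

    The genome is a flat list of integers: `arity + 1` genes per inner
    node (connection genes followed by the function gene), then one gene
    per primary output. Active nodes are found by walking backwards from
    the primary outputs along the connection genes.
    """
    genes_per_node = arity + 1
    outputs = genome[-n_outputs:]
    active, stack = set(), [o for o in outputs if o >= n_inputs]
    while stack:
        node = stack.pop()
        if node in active:
            continue
        active.add(node)
        offset = (node - n_inputs) * genes_per_node
        for conn in genome[offset:offset + arity]:  # the node's connection genes
            if conn >= n_inputs:                    # primary inputs are not nodes
                stack.append(conn)
    return active

# The genotype shown at the bottom of Fig. 1 (ni = 3, na = 2, no = 2):
genome = [1, 0, 2,   1, 2, 0,   3, 4, 1,   2, 2, 3,   5, 4]
print(sorted(active_nodes(genome, n_inputs=3, n_outputs=2)))  # -> [3, 4, 5]
```

Running the sketch on this genotype reports nodes 3, 4, and 5 as active, which matches the caption's observation that node 6 is inactive.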