Strategies for Constants Optimization in Genetic Programming

César L. Alonso* (Centro de Inteligencia Artificial, Universidad de Oviedo, Campus de Viesques, 33271 Gijón; [email protected])
José Luis Montaña and Cruz Enrique Borges† (Dpto. de Matemáticas, Estadística y Computación, Universidad de Cantabria, Avda. de los Castros s.n.; [email protected], [email protected])

Abstract

Evolutionary computation methods have been used to solve several optimization and learning problems. This paper describes an application of evolutionary computation methods to constants optimization in Genetic Programming. A general evolution strategy technique is proposed for approximating the optimal constants in a computer program representing the solution of a symbolic regression problem. The new algorithm has been compared with a recent linear genetic programming approach based on straight-line programs. The experimental results show that the proposed algorithm improves on that technique.

1 Introduction

In the last years Genetic Programming (GP) has been applied to a range of complex problems in a variety of fields like quantum computing, electronic design, sorting, searching, game playing, etc. Most of these applications can be seen as evolutionary optimization or evolutionary learning. For dealing with these complex tasks, GP evolves a population composed of symbolic expressions built from a set of functionals and a set of terminals (including the variables and the constants). In this paper we want to exploit the following intuitive idea: once the shape of the symbolic expression representing some optimal solution has been found, we try to determine the best values of the constants appearing in the symbolic expression. More specifically, the terminals being constants are not fixed numeric values, but references to numeric values. Specializing these references to fixed values, a specific symbolic expression, whose corresponding semantic function is a candidate solution for the problem instance, is obtained. One simple way to exemplify this situation is the following. Assume that we have to guess the equation of a geometric figure. If somebody (for example a GP algorithm) tells us that this figure is a quartic function, it only remains for us to guess the appropriate coefficients. This point of view is not new, and it constitutes the underlying idea of many successful methods in machine learning that combine a space of hypotheses with least square methods. Previous work in which constants of a symbolic expression have been effectively optimized has also dealt with memetic algorithms, in which classical local optimization techniques (gradient descent [9], linear scaling [3] or other methods based on diversity measures [7]) were used.

We have tested the performance of our strategy on symbolic regression problem instances. The problem of symbolic regression consists of finding, in symbolic form, a function that fits a given finite sample set of data points. More formally, we consider an input space X = IR^n and an output space Y = IR. We are given a sample of m pairs z = (x_i, y_i)_{1<=i<=m}. These examples are drawn according to an unknown probability measure ρ on the product space Z = X x Y and they are independent identically distributed (i.i.d.). The goal is to construct a function f : X -> Y which predicts the value y in Y from a given x in X. The criterion to choose a function f is a low probability of error. As usual in this context, we estimate error by empirical error.

In our algorithm for finding the function f, a GP algorithm will try to guess its shape, whereas an evolution strategy (ES) will try to find its coefficients.

The paper is organized as follows: in Section 2 we describe the components that constitute the ES for obtaining good values for the constants. Section 3 provides the definition of the structure that will represent the programs and also includes the details of the designed GP algorithm. Section 4 presents some numerical experiments. Finally, Section 5 draws some conclusions and addresses future research directions.

* The first two authors are supported by Spanish grant TIN2007-67466-C02-02.
† Supported by the FPU program and MTM2004-01167.

2 The algorithm and its convergence

In this section we describe an evolution strategy (ES) that provides good values for the numeric terminal symbols C = {c_1, ..., c_q} used by a population of symbolic expressions that evolves during a GP process. We assume a population P = {Γ_1, ..., Γ_N} constituted by N symbolic expressions over a set of numeric functionals F and a set of numeric terminals T = V ∪ C. Let [a, b] ⊂ IR be the search space for the constants c_i, 1 <= i <= q. In this situation, each individual c is represented by a vector of floating point numbers in [a, b]^q.

There are several ways of defining the fitness of a vector of constants c, but in all of them the current population P of symbolic expressions that evolves in the GP process is involved. Let z = (x_i, y_i) in IR^n x IR, 1 <= i <= m, be a sample defining a symbolic regression instance. Given a vector of values for the constants c = (c_1, ..., c_q), we define the fitness of c by the following expression:

    F_z^ES(c) = min{ F_z(Γ_i^c); 1 <= i <= N }    (1)

where F_z(Γ_i^c) represents the fitness of the symbolic expression Γ_i after substituting the references to the constants in C by the numeric values of c. The expression that computes F_z(Γ_i^c) is defined below by Equation (6).

Observe that when the fitness of c is computed by means of the above expression, the GP fitness values of a whole population of symbolic expressions are also computed. This could demand too much computational effort when the sizes of both populations increase. In order to prevent this situation, new fitness functions for c can be introduced, where only a subset of the population P of symbolic expressions is evaluated. Previous work based on cooperative coevolutionary architectures suggests two basic methods for selecting the subset of collaborators (see [11]). The first one, in our case, consists of selecting the best symbolic expression of the current population P, considering the GP fitness obtained from the last evaluation of P. The second one selects two individuals from P: the best one and a random symbolic expression; it then evaluates both symbolic expressions with the references to constants specialized to c and assigns the best value to the fitness of c. In general, there are three aspects to consider when selecting the subset of collaborators: the degree of greediness when choosing a collaborator (collaborator selection pressure), the number of collaborators (collaboration pool size), and the method of assigning a fitness value given the multiple partial results from the collaborators (collaboration credit assignment). In this paper we will consider the first method for computing the fitness of c. Then Equation (1) becomes

    F_z^ES(c) = F_z(Γ_0^c)    (2)

where Γ_0 is the best symbolic expression of the population P, obtained by the execution of a previous GP algorithm. The details of this GP algorithm will be described in the next section.

Next we describe the ES for optimizing constants. Similar to many other evolutionary algorithms, the ES always maintains a population of constants vectors {c_1, ..., c_M}. The initial vector of constants comes from the GP algorithm evolving symbolic expressions; different vectors of constants can be generated at random from the uniform distribution over the search space. Recombination in the ES involves all individuals in the population: if the population size is M, the recombination takes M parents and generates M offspring through linear combination. Mutation is achieved by performing an affine transformation. The fitness of a vector of constants is evaluated by Equation (2). The main steps of the ES are described as follows:

1. Recombination: Let A := (a_ij)_{1<=i,j<=M} be an M x M matrix that satisfies a_ij >= 0 for 1 <= i, j <= M and sum_{j=1}^M a_ij = 1 for 1 <= i <= M. Then generate an intermediate population of constants vectors X^I = (c_1^I, ..., c_M^I) from X = (c_1, ..., c_M) through the following recombination:

    (X^I)^t = A X^t,    (3)

where X^t represents the transposition of vector X. Many different methods for choosing the matrix A can be used; in practice A can be chosen either deterministically or stochastically.

2. Mutation: Generate the next intermediate population X^J from X^I as follows: for each individual c_i^I in population X^I produce an offspring according to

    (c_i^J)^t = B_i (c_i^I)^t + g_i^t    (4)

where B_i is a q x q matrix and g_i is a q-vector. The matrix B_i and the vector g_i can be chosen either deterministically or stochastically.

Next we show some calculations that justify the proposed recombination and mutation procedures. Suppose that the individuals at time I of the evolution process are X = (c_1, ..., c_M). Let c be an optimal set of constants and let e_j := c_j - c. According to the recombination,
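As a concrete illustration, the recombination of Equation (3) and the affine mutation of Equation (4) can be sketched in a few lines of Python. The representation of the population as a NumPy matrix (one constants vector per row) and all function names are our illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def recombine(X, A):
    """Eq. (3): X^I = A X, with each row of X one constants vector c_i
    and A row-stochastic (non-negative entries, rows summing to 1)."""
    assert np.all(A >= 0) and np.allclose(A.sum(axis=1), 1.0)
    return A @ X

def mutate(XI, Bs, gs):
    """Eq. (4): affine offspring c_i^J = B_i c_i^I + g_i, per individual."""
    return np.stack([B @ c + g for B, c, g in zip(Bs, XI, gs)])

# Toy run: M = 3 vectors of q = 2 constants drawn from [0, 1].
rng = np.random.default_rng(0)
M, q = 3, 2
X = rng.uniform(0.0, 1.0, size=(M, q))
A = rng.uniform(size=(M, M))
A /= A.sum(axis=1, keepdims=True)          # make A row-stochastic
XI = recombine(X, A)
XJ = mutate(XI, [np.eye(q)] * M, [np.zeros(q)] * M)
# Identity B_i and null g_i leave the population unchanged,
assert np.allclose(XI, XJ)
# and recombination keeps every value inside the current extremes,
# mirroring the contraction argument of Section 2.
assert X.min() - 1e-12 <= XI.min() and XI.max() <= X.max() + 1e-12
```

Because each entry of X^I is a convex combination of entries of X, the recombined population can never move outside the hull of the current one, which is exactly the inequality derived below.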

    c_i^I = sum_{j=1}^M a_ij c_j,    i = 1, ..., M.

Since sum_{j=1}^M a_ij = 1 and a_ij >= 0, for i = 1, ..., M:

    ||e_i^I|| = ||c_i^I - c|| <= sum_{j=1}^M a_ij ||c_j - c|| <= max_{1<=j<=M} ||e_j||.

This means essentially that the recombination procedure does not make population X^I worse than population X. In order to justify the proposed mutation procedure, the reader can verify, using a similar argument as above, that if there exists a selection of matrices B_i, with ||B_i|| < 1, and vectors g_i for which the optimal set of constants c is a fixed point, that is,

    c^t = B_i c^t + g_i^t,

then the evolution strategy converges to its optimum when the number of generations goes to infinity. Obviously, finding such matrices B_i and vectors g_i could be very difficult in practice, or it may even happen that they do not exist.

3 Evolving straight-line programs representing symbolic expressions

In the GP paradigm, symbolic expressions are usually represented by LISP S-expressions or by directed trees with ordered branches (see [4]). Recently, a new structure named straight-line program (slp) has been introduced with considerable success as an alternative for representing symbolic expressions ([1]). The GP approach based on slp's consistently outperforms conventional GP based on tree structured representations. A slp consists of a finite sequence of computational assignments, where each assignment is obtained by applying some functional to a set of arguments that can be variables, constants or pre-computed results. The formal definition of the slp structure is the following.

Definition ([1]) Let F = {f_1, ..., f_p} be a set of functions, where each f_i has arity a_i, 1 <= i <= p, and let T = {t_1, ..., t_m} be a set of terminals. A straight-line program (slp) over F and T is a finite sequence of computational instructions Γ = {I_1, ..., I_l}, where for each k in {1, ..., l},

    I_k ≡ u_k := f_{j_k}(α_1, ..., α_{a_{j_k}});  with f_{j_k} in F, α_i in T for all i if k = 1, and α_i in T ∪ {u_1, ..., u_{k-1}} for 1 < k <= l.

In our case, the set of terminals T satisfies T = V ∪ C, where V = {x_1, ..., x_n} is a finite set of variables and C = {c_1, ..., c_q} is a finite set of references to constants. The number of instructions l is the length of Γ.

Observe that a slp Γ = {I_1, ..., I_l} is identified with the set of variables u_i that are introduced by means of the instructions I_i. Thus the slp Γ can be denoted by Γ = {u_1, ..., u_l}.

Let Γ = {u_1, ..., u_l} be a slp over F and T. The symbolic expression represented by Γ is u_l, considered as an expression over the set of terminals T constructed by a sequence of recursive compositions from the set of functions F. Provided that V = {x_1, ..., x_n} ⊂ T is the set of terminal variables and C = {c_1, ..., c_q} ⊂ T is the set of references to the constants, for each specialization c in [a, b]^q ⊂ IR^q of the set of constants to fixed values, a specific symbolic expression Γ^c is obtained, whose semantic function Φ_{Γ^c} : IR^n -> IR satisfies Φ_{Γ^c}(a_1, ..., a_n) = b, where b stands for the value of the expression over V of the non-terminal variable u_l when we substitute each variable x_k by a_k, 1 <= k <= n.

Example Let F be the set given by the three binary standard arithmetic operations, F = {+, -, *}, and let T = {c_1, x_1, x_2, ..., x_n} be the set of terminals. Any slp Γ over F and T represents an n-variate polynomial whose coefficients are also polynomials in c_1. If we consider that c_1 takes values in [0, 1], for each specific value of c_1 a polynomial in IR[x_1, ..., x_n] will be obtained.

Consider now a symbolic regression problem instance represented by a sample set z = (x_i, y_i) in IR^n x IR, 1 <= i <= m, and let Γ^c be a specific slp over F and T obtained by giving values to the constant references C = {c_1, ..., c_q}. In this situation, the empirical error of Γ^c w.r.t. the sample set of data points z is defined by the following expression:

    ε_m(Γ^c) = (1/m) sum_{i=1}^m (Φ_{Γ^c}(x_i) - y_i)^2    (5)

For a class of structures H with finite complexity, representing the symbolic expressions, the proposed model can be chosen minimizing the empirical error. The problem of model selection, also called complexity control, arises when the class H consists of structures of varying complexity (for instance, in the case of Genetic Programming, trees with varying size or height). Then the problem of regression estimation requires optimal selection of model complexity (i.e., the size or the height) in addition to model estimation via minimization of the empirical error as defined in Equation (5). This is the case of our class of slp's. Hence, in this paper we use the number of non-scalar operations of the slp, that is, operations which are not in {+, -}. This is a measure of the non-linearity of the model considered. This notion is related with the Vapnik-Chervonenkis (VC) dimension of the set of symbolic expressions given by slp's using a bounded number of non-scalar operators.
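To make the definitions concrete, the following sketch shows one possible encoding of a slp, its semantic function Φ_{Γ^c}, and the empirical error of Equation (5). The tuple-based instruction format, the protected division (used later in Section 4), and all names are illustrative assumptions, not the authors' code:

```python
# An slp is a list of instructions (op, arg1, arg2); arguments are terminal
# names ("x1", "c1", ...) or previously computed results ("u1", "u2", ...).
OPS = {"+": lambda a, b: a + b,
       "-": lambda a, b: a - b,
       "*": lambda a, b: a * b,
       # protected division: x // y returns x/y if y != 0 and 1 otherwise
       "//": lambda a, b: a / b if b != 0 else 1.0}

def eval_slp(slp, x, c):
    """Semantic function Phi_{Gamma^c}: evaluate u_1, ..., u_l and return u_l."""
    env = {f"x{i+1}": v for i, v in enumerate(x)}
    env.update({f"c{i+1}": v for i, v in enumerate(c)})
    for k, (op, a1, a2) in enumerate(slp, start=1):
        env[f"u{k}"] = OPS[op](env[a1], env[a2])
    return env[f"u{len(slp)}"]

def empirical_error(slp, sample, c):
    """Eq. (5): mean squared error of Gamma^c over the sample z."""
    return sum((eval_slp(slp, x, c) - y) ** 2 for x, y in sample) / len(sample)

# Gamma = {u1 := x1 + c1, u2 := u1 * u1} represents (x1 + c1)^2.
gamma = [("+", "x1", "c1"), ("*", "u1", "u1")]
assert eval_slp(gamma, [2.0], [1.0]) == 9.0
sample = [([0.0], 1.0), ([1.0], 4.0)]      # exactly (x + 1)^2, no noise
assert empirical_error(gamma, sample, [1.0]) == 0.0
```

Note how the same slp yields a different semantic function for each specialization c of the constant references, which is precisely the degree of freedom the ES of Section 2 optimizes.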

The exact relationship between the non-scalar size of a slp (more generally, of a computer program) and its VC dimension is shown in [6]. Thus, given a slp Γ^c, we will define the fitness of Γ^c w.r.t. the sample set of data points z as

    F_z(Γ^c) = ε_m(Γ^c) · (1 - sqrt(p - p·ln p + (ln m)/(2m)))^(-1)    (6)

where p = h/m and h is the number of non-scalar operations of Γ^c. Equation (6) is also known as the Structural Risk (SR) of Γ^c (see [10]). The goal is then the minimization of the SR.

As we have mentioned in the introduction, the main contribution of this paper is the proposal of an evolution strategy for the optimization of the constant references that appear in a symbolic expression represented by a slp. Hence, a good starting point for the optimization method is a symbolic expression with a promising shape with respect to the sample set. This symbolic expression is obtained by means of the evolution of a population of slp's within a GP algorithm. The GP process begins with a random selection of a vector c of values for the constants. During the execution of the GP algorithm, the population of slp's keeps references to the constants while performing the recombination operators. The vector of constant values c is only used for the computation of the fitness of each slp, applying Equation (6). When the GP process finishes, the slp with the best fitness value is selected, and c is included in the initial population of the evolution strategy for constants optimization.

For the construction of each slp Γ over F and T of the initial population in the GP procedure, we first select an element f in F, and then the function arguments of f are randomly chosen in T for k = 1 and in T ∪ {u_1, ..., u_{k-1}} for k > 1. For the purpose of defining suitable recombination operators for slp's we will work with homogeneous populations of equal-length individuals. In this sense, observe that given a slp Γ = {u_1, ..., u_l} and L >= l, we can construct the slp Γ' = {u_1, ..., u_{l-1}, u'_l, ..., u'_{L-1}, u'_L}, where u'_L = u_l and u'_k, for k = l to L - 1, is any instruction satisfying the conditions in the definition of slp. In this situation it is easy to see that Γ and Γ' represent the same symbolic expression.

The recombination operators used are those described in [1]; because they are not well known, they are reproduced below, followed by an example.

Definition (slp-crossover) Let Γ = {u_1, ..., u_L} and Γ' = {u'_1, ..., u'_L} be two slp's over F and T. First, a position k in Γ is randomly selected, 1 <= k <= L. Let S_{u_k} = {u_{j_1}, ..., u_{j_m}} be the set of instructions of Γ involved in the evaluation of u_k, with the assumption that j_1 < ... < j_m. Next we randomly select a position t in Γ' with m <= t <= L and we modify Γ' by substituting the subset of instructions {u'_{t-m+1}, ..., u'_t} in Γ' with the instructions of Γ in S_{u_k}, suitably renamed. The renaming function R applied over the elements of S_{u_k} is defined as R(u_{j_i}) = u_{t-m+i}, for all i in {1, ..., m}. After this process we obtain the first offspring from Γ and Γ'. For the second offspring we symmetrically repeat this strategy, but now we begin by randomly selecting a position k' in Γ'.

Example Let us consider the following slp's:

    Γ  ≡ { u1 := x + y;   u2 := u1 * u1;  u3 := u1 * x;  u4 := u3 + c1;  u5 := u3 * u2 }
    Γ' ≡ { u1 := c1 * x;  u2 := u1 + y;   u3 := u1 + x;  u4 := u2 * x;   u5 := u1 + u4 }

If k = 3 then S_{u3} = {u1, u3}, and t must be selected in {2, ..., 5}. Assuming that t = 3, the first offspring will be

    Γ1 ≡ { u1 := c1 * x;  u2 := x + y;  u3 := u2 * x;  u4 := u2 * x;  u5 := u1 + u4 }

For the second offspring, if the selected position in Γ' is k' = 4, then S_{u'4} = {u1, u2, u4}. Now if t = 5, the offspring will be:

    Γ2 ≡ { u1 := x + y;  u2 := u1 * u1;  u3 := c1 * x;  u4 := u3 + y;  u5 := u4 * x }

When the mutation operator is applied to a slp Γ, the first step consists of selecting an instruction u_i in Γ at random. Then a new random selection is made within the arguments of the function f in F that constitutes the instruction u_i. Finally, the selected argument is replaced with another one in T ∪ {u_1, ..., u_{i-1}}, randomly chosen.

The reproduction operation of an individual Γ consists of making an identical copy of Γ.

We use generational replacement between populations. The construction process of the new population of slp's from the current one is as follows. First, the current population is randomly reordered as a mating pool. Then, for each pair of individuals in the reordered population, the recombination operators are applied with their corresponding probabilities. Finally, we introduce the individuals into the new population. For the mutation and reproduction operators, the obtained individuals are directly included in the new population, but in the case of the crossover operator, the offspring generated do not necessarily replace their parents.
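The slp-crossover defined above can be sketched as follows, reusing the tuple encoding (op, arg1, arg2) for instructions. All names are illustrative assumptions; for readability the random choices k and t are passed explicitly instead of being drawn inside the function:

```python
def used_instructions(slp, k):
    """S_{u_k}: 1-based indices of the instructions of slp involved in
    the evaluation of u_k (computed by walking the "u*" references)."""
    needed, stack = set(), [k]
    while stack:
        j = stack.pop()
        if j in needed:
            continue
        needed.add(j)
        for arg in slp[j - 1][1:]:
            if arg.startswith("u"):
                stack.append(int(arg[1:]))
    return sorted(needed)

def slp_crossover(g1, g2, k, t):
    """First offspring of the slp-crossover of [1]: the instructions
    S_{u_k} of g1 overwrite positions t-m+1..t of g2, renamed via
    R(u_{j_i}) = u_{t-m+i}. In the GP, k and t are random (m <= t <= L)."""
    S = used_instructions(g1, k)
    m = len(S)
    rename = {f"u{j}": f"u{t - m + i}" for i, j in enumerate(S, start=1)}
    child = list(g2)
    for i, j in enumerate(S, start=1):
        op, a1, a2 = g1[j - 1]
        child[t - m + i - 1] = (op, rename.get(a1, a1), rename.get(a2, a2))
    return child

# The paper's example: k = 3 selects S_{u3} = {u1, u3} in Gamma, and t = 3.
G1 = [("+", "x", "y"), ("*", "u1", "u1"), ("*", "u1", "x"),
      ("+", "u3", "c1"), ("*", "u3", "u2")]
G2 = [("*", "c1", "x"), ("+", "u1", "y"), ("+", "u1", "x"),
      ("*", "u2", "x"), ("+", "u1", "u4")]
child = slp_crossover(G1, G2, k=3, t=3)
assert child == [("*", "c1", "x"), ("+", "x", "y"), ("*", "u2", "x"),
                 ("*", "u2", "x"), ("+", "u1", "u4")]
```

With k = 3 and t = 3 the sketch reproduces the first offspring Γ1 of the worked example above.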

After a crossover we have four individuals: two parents and two offspring. We rank them by their fitness values and we pick one individual from each of the first two levels of the ranking. If, for example, three of the individuals have equal fitness value, we only select one of them, and the one selected in second place is in this case the worst of the four individuals. This strategy prevents premature convergence and maintains diversity in the population.

4 Experimentation

4.1 Experimental setting

As mentioned in the introduction, the strategy used consists of executing first the GP algorithm that evolves the symbolic expressions represented by slp's, and then optimizing the constants by means of the ES until the termination condition is reached. This strategy implies dividing the total computational effort between the two described algorithms. We define the computational effort (CE) as the total number of basic operations that have been computed up to a given moment. In our case we consider the following three possible divisions: 25% of the CE for the GP and 75% for the ES; 50% of the CE for each algorithm; and 75% for the GP and 25% for the ES. We also compare the results with a purely GP strategy, without optimization of constants. The total CE was fixed in all the executions to 10^7 basic operations.

We have executed our algorithms in order to solve instances of the symbolic regression problem associated to the following target functions:

    f1(x, y, z) = (x + y + z)^2 + 1,    x, y, z in [-100, 100]    (7)

    f2(x, y, z, w) = x/2 + y/4 + z/6 + w/8,    x, y, z, w in [-100, 100]    (8)

    f3(x) = x^4 + x^3 + x^2 + x,    x in [-5, 5]    (9)

    f4(x) = e x^2 + π x,    x in [-π, π]    (10)

    f5(x) = { 4x^2 (3 - 4x),                   x in [0, 0.5]
            { (4/3) x (4x^2 - 10x + 7) - 3/2,  x in (0.5, 0.75]    (11)
            { (16/3) x (x - 1)^2,              x in (0.75, 1]

    f6(x1, x2) = sin(sqrt(x1^2 + x2^2)) / (x1^2 + x2^2),    x1, x2 in [-5, 5]    (12)

These are functions of several classes: univariate and multivariate polynomial functions with different types of coefficients, a discontinuous piecewise polynomial function and a two-dimensional trigonometric function.

For every execution the corresponding sample set z = (x_i, y_i)_{1<=i<=m} is constituted by 30 points. In real-world problems, the sample points are obtained by means of some measurement tasks that usually contain errors or noise. Let f be the unknown target function. The real sample set z satisfies the following relationship:

    y_i = f(x_i) + ε_i    (13)

where ε_i is the error associated to the measurement corresponding to the point z_i = (x_i, y_i). In our experiments, 30 sample points are generated where the x-values follow a uniform distribution in the input domain, and for the computation of the y-values Equation (13) is used in order to corrupt the values with noise. The noise variance was fixed to 0.1.

The structures that represent the symbolic expressions are slp's over the set of functionals F = {+, -, *, //}, incremented with the sign operator for the target function f5 and with the sin function in the case of target function f6. In the set F, "//" indicates the protected division, i.e. x // y returns x/y if y != 0 and 1 otherwise. The terminal set T consists of the variables of the corresponding target function and includes a set of references to constants {c_1, ..., c_q}, where each constant takes values in [0, 1]. We will show results for q = 3 and q = 6.

The particular settings for the parameters of the GP algorithm with slp's are the following: population size 200, crossover rate 0.9, mutation rate 0.05, reproduction rate 0.05 and slp length 32.

For the case of the ES that provides values for the constants, the following fact must be considered. Although in Section 2 we have described general recombination and mutation operators, involving the whole population {c_1, ..., c_M} in the first case and all the genes of c_i in the case of the mutation of c_i, in this work we have considered the following particular cases: an arithmetic crossover (cf. [8]) and a non-uniform mutation operator adapted to our search space [a, b]^q, which is convex ([5]). Each crossover involves only two individuals of the population, obtained by tournament selection. The M x M matrix A is then constructed row by row applying the following method: for all i in {1, ..., M}, with probability p_c there exist j1_i, j2_i in {1, ..., M} such that a_{i,j1_i} = λ and a_{i,j2_i} = 1 - λ, and with probability 1 - p_c, a_ii = 1. For each row of A, λ is randomly chosen in [0, 1], and p_c = 0.95. The population size is M = 100.

The mutation is defined by Equation (4), where B_i is the identity matrix of size q and g_i is not a null vector with probability p_m = 0.05. If this is the case, each component k in {1, ..., q} of g_i satisfies:

    g_ik = Δ(t, b - c_ik)   with probability p
    g_ik = -Δ(t, c_ik - a)  with probability 1 - p

where p takes the value 0.5 and t is the generation number. The function Δ is defined as Δ(t, y) := y · r · (1 - t/T), where r is a random number in [0, 1] and T represents the maximum estimated number of generations. Note that the function Δ(t, y) returns a value in [0, y] such that the probability of Δ(t, y) being close to 0 increases as t increases. This property lets the operator search the space uniformly at early stages (when t is small), and very locally at later stages.

Finally, the ES includes elitism: the best individual obtained up to the moment is always contained in the current population.

    Table 1. Prediction risk values for the best symbolic expression of the best execution

    Function    75%-ES     50%-ES     25%-ES     GP
    q = 3
    f1          7.3E-01    3.8E-04    3.3E-04    3.4E-04
    f2          4.2E+00    1.3E+01    1.7E-01    2.9E+00
    f3          3.7E-07    3.5E-07    2.7E-07    2.3E-07
    f4          3.9E-01    2.8E-01    2.9E-01    4.2E-01
    f5          1.8E-02    1.7E-02    1.1E-02    1.3E-02
    f6          3.8E-02    4.2E-02    3.7E-02    4.0E-02
    q = 6
    f1          4.2E-04    7.4E-04    3.6E-04    4.6E-04
    f2          5.0E+01    3.3E+00    5.8E-03    2.0E+00
    f3          3.0E-06    1.3E-05    3.6E-07    3.2E-07
    f4          3.9E-01    4.2E-01    2.2E-01    4.5E-01
    f5          2.1E-02    2.3E-02    1.6E-02    1.3E-02
    f6          5.1E-02    4.8E-02    4.7E-02    5.0E-02
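The non-uniform mutation operator described in the experimental setting (the Δ function and the per-gene update of g_i) might be sketched as follows; the function names and parameter defaults are illustrative assumptions:

```python
import random

def delta(t, y, T, rng=random):
    """Delta(t, y) = y * r * (1 - t/T): a value in [0, y] whose expected
    magnitude shrinks as the generation number t approaches T."""
    return y * rng.random() * (1 - t / T)

def mutate_constant(c_k, t, T, a=0.0, b=1.0, p=0.5, rng=random):
    """Non-uniform mutation of one gene c_k in [a, b] (after [5]):
    move toward b with probability p, toward a otherwise."""
    if rng.random() < p:
        return c_k + delta(t, b - c_k, T, rng)
    return c_k - delta(t, c_k - a, T, rng)

rng = random.Random(0)
T = 100
early = [mutate_constant(0.5, t=1, T=T, rng=rng) for _ in range(1000)]
late = [mutate_constant(0.5, t=99, T=T, rng=rng) for _ in range(1000)]
# All mutants stay inside the search space [a, b] = [0, 1] ...
assert all(0.0 <= c <= 1.0 for c in early + late)
# ... and late-generation mutations are far smaller perturbations,
# which yields the coarse-to-fine search behaviour described above.
assert max(abs(c - 0.5) for c in late) < max(abs(c - 0.5) for c in early)
```

The bound c_k ± Δ(t, ·) never leaves [a, b] because Δ(t, y) lies in [0, y], matching the convexity requirement on the search space.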

4.2 Experimental results

Usually, a useful tool to compare the performance of evolution strategies is the average over all runs of the best fitness value at termination. This measure is known as the mean best fitness (MBF). It is also sometimes interesting to consider the absolute best obtained fitness (ABF) for each function after performing all the executions. But in this study, the sample points used as "training set" by our algorithms were generated with noise. Thus the MBF or the ABF, which are computed considering only this noisy sample set, could be unsatisfactory measures of the quality of the proposed symbolic expression as a model for the unknown target function. In order to give a more appropriate measure of the quality of this model, it makes sense to consider a new set of points generated without noise from the target function. This new set of examples is known as the test set or validation set. So, let (x_i, y_i)_{1<=i<=n_test} be a validation set for the target function f(x) (i.e. y_i = f(x_i)) and let g(x) be the model estimated from the sample set. Then the prediction risk ε_{n_test} is defined as the mean square error (MSE) between the values of g and the true values of the target function f over the validation set:

    ε_{n_test} = (1/n_test) sum_{i=1}^{n_test} (g(x_i) - y_i)^2    (14)

In this experimentation the size n_test of the validation set is 200. In the following tables we display the absolute best prediction risk value for each target function (Table 1) and the mean of the prediction risk values considering the 5 and the 25 best executions (Tables 2 and 3). The different variants of the proposed strategy are denoted by the amount of CE assigned to the ES. GP denotes the standard genetic programming approach based on straight-line programs, which does not include optimization of constants.

From the results presented in the above tables, we can see that the purely GP strategy is almost always outperformed by some of the other GP-ES combined methods. In terms of the best prediction risk value, the combination of 75% of the CE for the GP algorithm and 25% for the ES produces better results than the other combinations. Nevertheless, in the case of function f3, we can observe that purely GP obtains the best results for the two values of q, the number of references to the constants. Presumably f3 is a very simple function with "easy" coefficients that can be estimated without using the constants optimization evolution strategy. As can be seen in Tables 2 and 3, if we consider for each target function a reasonable set of the best executions, the mean quality of the set related to the combination rate 75%-25% for GP-ES is often the best. We can then state that the described cooperative coevolution strategy, in which first a GP evolves a population of symbolic expressions and then an ES provides good values for the numerical terminal symbols, outperforms the known GP methods for solving the studied symbolic regression problem instances.

Regarding the parameter q, its suitable value depends on the unknown target function. In our case, q = 3 is better than q = 6 for most of the studied target functions. However, for functions f2 and f4 the GP-ES strategy performs better with q = 6.

Finally, as complementary information to the above tables, Figures 1 and 2 show the empirical distribution of the 100 executions in terms of the prediction risk, for each strategy and model selection. This information is displayed by means of standard box plot notation with marks at 25%, 50% and 75% of the empirical distribution. The first and last marks in each case stand for the prediction risk of the best and the worst execution, respectively. In all cases the size n_test of the validation set is 200. Note that the scaling is different for each function.

    Table 2. Mean prediction risk of the best 5 and 25 executions, for target functions f1, f2, f3

                          q = 3                  q = 6
        Strategy     5 best    25 best     5 best    25 best
    f1  75%-ES      3.8E-01   1.2E+00    4.1E-03   1.5E+00
        50%-ES      7.7E-04   7.2E-01    2.1E-02   6.9E-01
        25%-ES      4.4E-04   4.5E-01    5.0E-04   4.9E-01
        GP          1.4E-03   5.2E-01    1.8E-03   6.0E-01
    f2  75%-ES      2.5E+01   7.3E+01    5.6E+01   9.5E+01
        50%-ES      2.8E+01   6.2E+01    1.9E+01   5.8E+01
        25%-ES      4.0E+00   4.0E+01    3.2E-01   3.0E+01
        GP          6.9E+00   4.3E+01    8.5E+00   4.1E+01
    f3  75%-ES      7.3E-03   5.7E-01    6.3E-03   4.0E+00
        50%-ES      6.3E-07   1.6E-01    7.2E-03   6.6E-01
        25%-ES      3.2E-07   3.5E-02    4.4E-03   1.4E+00
        GP          6.3E-03   3.6E-01    8.5E-03   1.2E+00

    Table 3. Mean prediction risk of the best 5 and 25 executions, for target functions f4, f5, f6

                          q = 3                  q = 6
        Strategy     5 best    25 best     5 best    25 best
    f4  75%-ES      4.4E-01   4.8E-01    4.4E-01   4.8E-01
        50%-ES      4.0E-01   4.7E-01    4.6E-01   4.9E-01
        25%-ES      4.1E-01   4.7E-01    3.4E-01   4.6E-01
        GP          4.5E-01   4.8E-01    4.6E-01   4.9E-01
    f5  75%-ES      2.5E-02   3.9E-02    3.2E-02   6.9E-02
        50%-ES      2.3E-02   3.1E-02    2.8E-02   5.8E-02
        25%-ES      2.0E-02   3.1E-02    2.8E-02   5.5E-02
        GP          2.0E-02   3.1E-02    2.2E-02   4.3E-02
    f6  75%-ES      4.3E-02   7.5E-02    5.4E-02   7.6E-02
        50%-ES      5.6E-02   9.3E-02    5.4E-02   7.9E-02
        25%-ES      4.4E-02   7.7E-02    5.8E-02   8.7E-02
        GP          5.2E-02   8.8E-02    5.4E-02   8.6E-02

    Figure 1. Empirical distribution of the prediction risk ε_{n_test} for target functions f1, f3, f5 and f6 over 100 experiments with q = 3.

    Figure 2. Distribution of ε_{n_test} for target functions f2 and f4 with q = 6.
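The prediction risk of Equation (14) can be sketched as follows, using the target f3 of Equation (9) and a hypothetical model with one slightly perturbed coefficient (both the model and the coefficient perturbation are illustrative assumptions, not results from the experiments):

```python
def prediction_risk(g, f, xs):
    """Eq. (14): MSE of model g against the noise-free target f over the
    inputs xs of a validation set."""
    return sum((g(x) - f(x)) ** 2 for x in xs) / len(xs)

def f3(x):
    return x**4 + x**3 + x**2 + x

def model(x):
    # hypothetical estimate: coefficient of x^2 off by 1%
    return x**4 + x**3 + 1.01 * x**2 + x

# 200 noise-free validation points equally spaced in [-5, 5],
# matching the n_test = 200 used in the experiments.
xs = [-5 + 10 * i / 199 for i in range(200)]
risk = prediction_risk(model, f3, xs)
assert risk > 0.0
assert prediction_risk(f3, f3, xs) == 0.0   # a perfect model has zero risk
```

Because the validation points are generated without noise, this risk isolates modelling error, unlike the MBF or ABF computed on the noisy training sample.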

5 Conclusive remarks and future research

We have designed a cooperative strategy between a genetic program and an evolution strategy. The genetic program evolves straight-line programs that represent symbolic expressions, whereas the evolution strategy optimizes the values of the references to the constants used by those symbolic expressions. The straight-line programs were introduced in the GP approach in a previous work and they maintain good performance in solving the symbolic regression problem. Experimentation has been performed on a group of target functions, and we have compared the performance of some variants of the described strategy against the standard slp-GP approach. The experimental results have shown that our cooperative architecture outperforms the purely GP algorithm.

Future work can be addressed to several research directions. We could construct other cooperative strategies adapted to our slp structure, like those described in [2] for trees. We could also include some local search procedure into the ES that adjusts the constants, or define new methods for the construction of the matrices A and B_i, corresponding to the recombination step (Equation 3) and the mutation step (Equation 4), respectively. Another goal could be to study the behavior of the GP algorithm without assuming previous knowledge of the length of the slp structure. To this end, new recombination operators must be designed, since the crossover procedure employed in this paper only applies to populations of slp's having fixed-length chromosomes. Finally, we might optimize the references to constants of our slp's by means of classical local optimization techniques such as gradient descent or linear scaling, as was done for tree-based GP (see [3] and [9]), and compare the results with those obtained by the computational model described in this paper.

References

[1] C.L. Alonso, J.L. Montaña, J. Puente. Straight line programs: a new linear genetic programming approach. Proc. 20th IEEE International Conference on Tools with Artificial Intelligence (ICTAI), pp. 517–524, 2008.

[2] L. Vanneschi, G. Mauri, A. Valsecchi, S. Cagnoni. Heterogeneous cooperative coevolution: strategies of integration between GP and GA. Genetic and Evolutionary Computation Conference (GECCO 2006), pp. 361–368, 2006.

[3] M. Keijzer. Improving symbolic regression with interval arithmetic and linear scaling. Proc. of EuroGP 2003, pp. 71–83, 2003.

[4] J.R. Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection. The MIT Press, Cambridge, MA, 1992.

[5] Z. Michalewicz, T. Logan, S. Swaminathan. Evolutionary operators for continuous convex parameter spaces. Proceedings of the 3rd Annual Conference on Evolutionary Programming, World Scientific, pp. 84–97, 1994.

[6] J.L. Montaña, C.L. Alonso, C.E. Borges, J.L. Crespo. Adaptation, performance and Vapnik-Chervonenkis dimension of straight line programs. Proc. 12th European Conference on Genetic Programming, pp. 315–326, 2009.

[7] C. Ryan, M. Keijzer. An analysis of diversity of constants of genetic programming. EuroGP 2003, vol. 2610 of LNCS, pp. 409–418, 2003.

[8] H.-P. Schwefel. Numerical Optimization of Computer Models. John Wiley and Sons, New York, 1981.

[9] A. Topchy, W.F. Punch. Faster genetic programming based on local gradient search of numeric leaf values. Proc. of GECCO 2001, pp. 155–162, 2001.

[10] V. Vapnik. Statistical Learning Theory. John Wiley & Sons, 1998.

[11] R.P. Wiegand, W.C. Liles, K.A. De Jong. An empirical analysis of collaboration methods in cooperative coevolutionary algorithms. Proc. of GECCO 2001, Morgan Kaufmann, pp. 1235–1242, 2001.
