
Topic 9: Evolutionary computation

• Introduction, or can evolution be intelligent?
• Simulation of natural evolution
• Genetic algorithms
• Evolution strategies
• Genetic programming
• Summary

Can evolution be intelligent?

• Intelligence can be defined as the capability of a system to adapt its behaviour to an ever-changing environment. According to Alan Turing, the form or appearance of a system is irrelevant to its intelligence.
• The evolutionary approach is based on computational models of natural selection and genetics. We call them evolutionary computation, an umbrella term that combines genetic algorithms, evolution strategies and genetic programming.
• The behaviour of an individual organism is an inductive inference about some yet unknown aspects of its environment. If, over successive generations, the organism survives, we can say that this organism is capable of learning to predict changes in its environment.
• Evolutionary computation simulates evolution on a computer. The result of such a simulation is a series of optimisation algorithms, usually based on a simple set of rules. Optimisation iteratively improves the quality of solutions until an optimal, or at least feasible, solution is found.

Simulation of natural evolution

• On 1 July 1858, Charles Darwin presented his theory of evolution before the Linnean Society of London. This day marks the beginning of a revolution in biology.
• Darwin's classical theory of evolution, together with Weismann's theory of natural selection and Mendel's concept of genetics, now represent the neo-Darwinian paradigm.
• Neo-Darwinism is based on processes of reproduction, mutation, competition and selection. The power to reproduce appears to be an essential property of life. The power to mutate is also guaranteed in any living organism that reproduces itself in a continuously changing environment. Processes of competition and selection normally take place in the natural world, where expanding populations of different species are limited by a finite space.
• Evolution can be seen as a process leading to the maintenance of a population's ability to survive and reproduce in a specific environment. This ability is called evolutionary fitness.
• Evolutionary fitness can also be viewed as a measure of the organism's ability to anticipate changes in its environment.
• The fitness, or the quantitative measure of the ability to predict environmental changes and respond adequately, can be considered as the quality that is optimised in natural life.

How is a population with increasing fitness generated?

• Let us consider a population of rabbits. Some rabbits are faster than others, and we may say that these rabbits possess superior fitness, because they have a greater chance of avoiding foxes, surviving and then breeding.
• If two parents have superior fitness, there is a good chance that a combination of their genes will produce an offspring with even higher fitness. Over time the entire population of rabbits becomes faster to meet their environmental challenges in the face of foxes.

Simulation of natural evolution

• All methods of evolutionary computation simulate natural evolution by creating a population of individuals, evaluating their fitness, generating a new population through genetic operations, and repeating this process a number of times.
• We will start with Genetic Algorithms (GAs), as most of the other evolutionary algorithms can be viewed as variations of genetic algorithms.

Genetic algorithms

• In the early 1970s, John Holland introduced the concept of genetic algorithms.
• His aim was to make computers do what nature does. Holland was concerned with algorithms that manipulate strings of binary digits.
• Each artificial "chromosome" consists of a number of "genes", and each gene is represented by 0 or 1:

  1 0 1 1 0 1 0 0 0 0 0 1 0 1 0 1
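A chromosome of this kind maps naturally onto a list of 0/1 integers. The snippet below is a minimal illustrative sketch (the function name is our own, not from the slides) that generates a random 16-gene chromosome:

```python
import random

def random_chromosome(length=16):
    """Create a chromosome: a fixed-length string of 0/1 genes."""
    return [random.randint(0, 1) for _ in range(length)]

print(random_chromosome())   # e.g. [1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1]
```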

Basic genetic algorithms

• Nature has an ability to adapt and learn without being told what to do. In other words, nature finds good chromosomes blindly. GAs do the same. Two mechanisms link a GA to the problem it is solving: encoding and evaluation.
• The GA uses a measure of fitness of individual chromosomes to carry out reproduction. As reproduction takes place, the crossover operator exchanges parts of two single chromosomes, and the mutation operator changes the gene value in some randomly chosen location of the chromosome.

Step 1: Represent the problem variable domain as a chromosome of a fixed length, choose the size of the chromosome population N, the crossover probability pc and the mutation probability pm.

Step 2: Define a fitness function to measure the performance, or fitness, of an individual chromosome in the problem domain. The fitness function establishes the basis for selecting chromosomes that will be mated during reproduction.

Step 3: Randomly generate an initial population of chromosomes of size N:
  x1, x2, . . . , xN

Step 4: Calculate the fitness of each individual chromosome:
  f(x1), f(x2), . . . , f(xN)

Step 5: Select a pair of chromosomes for mating from the current population. Parent chromosomes are selected with a probability related to their fitness.

Step 6: Create a pair of offspring chromosomes by applying the genetic operators − crossover and mutation.

Step 7: Place the created offspring chromosomes in the new population.

Step 8: Repeat Step 5 until the size of the new chromosome population becomes equal to the size of the initial population, N.

Step 9: Replace the initial (parent) chromosome population with the new (offspring) population.

Step 10: Go to Step 4, and repeat the process until the termination criterion is satisfied.

Genetic algorithms

• A GA represents an iterative process. Each iteration is called a generation. A typical number of generations for a simple GA can range from 50 to over 500. The entire set of generations is called a run.
• Because GAs use a stochastic search method, the fitness of a population may remain stable for a number of generations before a superior chromosome appears.
• A common practice is to terminate a GA after a specified number of generations and then examine the best chromosomes in the population. If no satisfactory solution is found, the GA is restarted.

Genetic algorithms: case study

A simple example will help us to understand how a GA works. Let us find the maximum value of the function 15x − x², where parameter x varies between 0 and 15. For simplicity, we may assume that x takes only integer values. Thus, chromosomes can be built with only four genes:

  Integer  Binary code    Integer  Binary code    Integer  Binary code
     1       0 0 0 1         6       0 1 1 0        11       1 0 1 1
     2       0 0 1 0         7       0 1 1 1        12       1 1 0 0
     3       0 0 1 1         8       1 0 0 0        13       1 1 0 1
     4       0 1 0 0         9       1 0 0 1        14       1 1 1 0
     5       0 1 0 1        10       1 0 1 0        15       1 1 1 1
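The encoding and fitness evaluation for this case study (Steps 1 to 4) can be sketched in a few lines of Python. This is an illustrative sketch only; the helper names are our own and the slides themselves give no code:

```python
import random

N = 6              # chromosome population size
CHROM_LEN = 4      # four genes encode the integers 0..15

def decode(chrom):
    """Convert a list of bits, e.g. [1, 1, 0, 0], to the integer it encodes (12)."""
    return int("".join(map(str, chrom)), 2)

def fitness(chrom):
    """Fitness function of the case study: f(x) = 15x - x^2."""
    x = decode(chrom)
    return 15 * x - x * x

# Step 3: random initial population; Step 4: evaluate each chromosome
population = [[random.randint(0, 1) for _ in range(CHROM_LEN)] for _ in range(N)]
for chrom in population:
    print(chrom, "-> x =", decode(chrom), " f(x) =", fitness(chrom))
```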

The fitness function and chromosome locations

Suppose that the size of the chromosome population N is 6, the crossover probability pc equals 0.7, and the mutation probability pm equals 0.001. The fitness function in our example is defined by

  f(x) = 15x − x²

• In natural selection, only the fittest species can survive, breed, and thereby pass their genes on to the next generation. GAs use a similar approach, but unlike nature, the size of the chromosome population remains unchanged from one generation to the next.
• The last column in the table shows the ratio of the individual chromosome's fitness to the population's total fitness. This ratio determines the chromosome's chance of being selected for mating. The chromosomes' average fitness improves from one generation to the next.

  Chromosome   Chromosome   Decoded   Chromosome   Fitness
  label        string       integer   fitness      ratio, %
  X1           1 1 0 0        12         36          16.5
  X2           0 1 0 0         4         44          20.2
  X3           0 0 0 1         1         14           6.4
  X4           1 1 1 0        14         14           6.4
  X5           0 1 1 1         7         56          25.7
  X6           1 0 0 1         9         54          24.8

[Figure: (a) chromosome initial locations and (b) chromosome final locations on the curve f(x) = 15x − x².]

Roulette wheel selection

• The most commonly used chromosome selection technique is roulette wheel selection. Each chromosome is given a slice of the wheel proportional to its fitness ratio (16.5%, 20.2%, 6.4%, 6.4%, 25.7% and 24.8% for X1 to X6 in our example).
• In our example, we have an initial population of 6 chromosomes. Thus, to establish the same population in the next generation, the roulette wheel would be spun six times.
• Once a pair of parent chromosomes is selected, the crossover operator is applied.

Crossover operator

• First, the crossover operator randomly chooses a crossover point where two parent chromosomes "break", and then exchanges the chromosome parts after that point. As a result, two new offspring are created.
• If a pair of chromosomes does not cross over, then chromosome cloning takes place, and the offspring are created as exact copies of each parent.
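Roulette wheel selection and single-point crossover, together with the bit-flip mutation operator described in the following slides, can be sketched as below. This is a minimal illustrative implementation with our own function names; the slides give no code:

```python
import random

def roulette_select(population, fitnesses):
    """Spin the roulette wheel once: each chromosome owns a slice of the
    wheel proportional to its share of the total fitness."""
    total = sum(fitnesses)
    spin = random.uniform(0, total)
    running = 0.0
    for chromosome, fit in zip(population, fitnesses):
        running += fit
        if spin <= running:
            return chromosome
    return population[-1]                 # guards against floating-point round-off

def crossover(parent1, parent2, pc=0.7):
    """With probability pc, exchange the chromosome parts after a random
    crossover point; otherwise clone the parents unchanged."""
    if random.random() < pc:
        point = random.randint(1, len(parent1) - 1)
        return (parent1[:point] + parent2[point:],
                parent2[:point] + parent1[point:])
    return parent1[:], parent2[:]

def mutate(chromosome, pm=0.001):
    """Flip each gene independently with the small mutation probability pm."""
    return [1 - gene if random.random() < pm else gene for gene in chromosome]
```

With these three operators, Steps 5 to 9 amount to repeatedly selecting two parents, applying crossover and mutation, and placing the two offspring in the new population until it reaches size N.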

Crossover

[Figure: crossover in generation i. Parent pairs (X6i, X2i), (X1i, X5i) and (X2i, X5i) exchange the parts of their strings after randomly chosen crossover points to produce the offspring chromosomes.]

Mutation operator

• Mutation represents a change in the gene.
• Mutation is a background operator. Its role is to provide a guarantee that the search algorithm is not trapped on a local optimum.
• The mutation operator flips a randomly selected gene in a chromosome.
• The mutation probability is quite small in nature, and is kept low for GAs, typically in the range between 0.001 and 0.01.

Mutation

[Figure: mutation flips single genes in the offspring chromosomes (X1'i, X2'i, X5'i, X6'i) before they enter generation (i + 1).]

The cycle

  Generation i                 Generation (i + 1)
  X1i  1 1 0 0   f = 36        X1i+1  1 0 0 0   f = 56
  X2i  0 1 0 0   f = 44        X2i+1  0 1 0 1   f = 50
  X3i  0 0 0 1   f = 14        X3i+1  1 0 1 1   f = 44
  X4i  1 1 1 0   f = 14        X4i+1  0 1 0 0   f = 44
  X5i  0 1 1 1   f = 56        X5i+1  0 1 1 0   f = 54
  X6i  1 0 0 1   f = 54        X6i+1  0 1 1 1   f = 56

Genetic algorithms: case study

• Suppose it is desired to find the maximum of the "peak" function of two variables:

  f(x, y) = (1 − x)² e^(−x² − (y+1)²) − (x − x³ − y³) e^(−x² − y²)

  where parameters x and y vary between −3 and 3.
• The first step is to represent the problem variables as a chromosome − parameters x and y as a concatenated binary string:

  1 0 0 0 1 0 1 0 0 0 1 1 1 0 1 1

• We also choose the size of the chromosome population, for instance 6, and randomly generate an initial population.
• The next step is to calculate the fitness of each chromosome. This is done in two stages.
• First, a chromosome, that is a string of 16 bits, is partitioned into two 8-bit strings:

  1 0 0 0 1 0 1 0   and   0 0 1 1 1 0 1 1

• Then these strings are converted from binary (base 2) to decimal (base 10):

  (10001010)₂ = 1×2⁷ + 0×2⁶ + 0×2⁵ + 0×2⁴ + 1×2³ + 0×2² + 1×2¹ + 0×2⁰ = (138)₁₀

  and

  (00111011)₂ = 0×2⁷ + 0×2⁶ + 1×2⁵ + 1×2⁴ + 1×2³ + 0×2² + 1×2¹ + 1×2⁰ = (59)₁₀

• Now the range of integers that can be handled by 8 bits, that is the range from 0 to (2⁸ − 1), is mapped to the actual range of parameters x and y, that is the range from −3 to 3:

  6 / (256 − 1) = 0.0235294

• To obtain the actual values of x and y, we multiply their decimal values by 0.0235294 and subtract 3 from the results:

  x = (138)₁₀ × 0.0235294 − 3 = 0.2470588   and   y = (59)₁₀ × 0.0235294 − 3 = −1.6117647

• Using the decoded values of x and y as inputs in the mathematical function, the GA calculates the fitness of each chromosome.
• To find the maximum of the "peak" function, we will use crossover with the probability equal to 0.7 and mutation with the probability equal to 0.001. As we mentioned earlier, a common practice in GAs is to specify the number of generations. Suppose the desired number of generations is 100. That is, the GA will create 100 generations of 6 chromosomes before stopping.

[Figure: chromosome locations on the surface of the "peak" function − initial population.]
[Figure: chromosome locations on the surface of the "peak" function − first generation.]
[Figure: chromosome locations on the surface of the "peak" function − local maximum.]
[Figure: chromosome locations on the surface of the "peak" function − global maximum.]
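Decoding and fitness evaluation for this two-variable case study can be sketched as follows. The function names are our own, but the scaling constant is the same one used in the slides:

```python
import math

def decode_xy(chromosome):
    """Split a 16-bit chromosome into two 8-bit genes and map each
    integer from the range 0..255 onto the interval [-3, 3]."""
    bits = "".join(map(str, chromosome))
    x_int, y_int = int(bits[:8], 2), int(bits[8:], 2)
    scale = 6 / (2**8 - 1)                # = 0.0235294
    return x_int * scale - 3, y_int * scale - 3

def peak_fitness(chromosome):
    """The 'peak' function f(x, y) used as the fitness measure."""
    x, y = decode_xy(chromosome)
    return ((1 - x)**2 * math.exp(-x**2 - (y + 1)**2)
            - (x - x**3 - y**3) * math.exp(-x**2 - y**2))

# The example chromosome from the slides decodes to x ≈ 0.247, y ≈ -1.612
example = [1,0,0,0,1,0,1,0, 0,0,1,1,1,0,1,1]
print(decode_xy(example), peak_fitness(example))
```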

[Figure: performance graphs for 100 generations of 6 chromosomes converging to a local maximum; best and average fitness per generation, pc = 0.7, pm = 0.001.]
[Figure: performance graphs for 100 generations of 6 chromosomes converging to the global maximum; best and average fitness per generation, pc = 0.7, pm = 0.01.]
[Figure: performance graphs for 20 generations of 60 chromosomes; best and average fitness per generation, pc = 0.7, pm = 0.001.]

Steps in the GA development

1. Specify the problem, define constraints and optimum criteria;
2. Represent the problem domain as a chromosome;
3. Define a fitness function to evaluate the chromosome performance;
4. Construct the genetic operators;
5. Run the GA and tune its parameters.

Evolution strategies

• Another approach to simulating natural evolution was proposed in Germany in the early 1960s. Unlike genetic algorithms, this approach − called an evolution strategy − was designed to solve technical optimisation problems.
• In 1963 two students of the Technical University of Berlin, Ingo Rechenberg and Hans-Paul Schwefel, were working on the search for the optimal shapes of bodies in a flow. They decided to try random changes in the parameters defining the shape, following the example of natural mutation. As a result, the evolution strategy was born.
• Evolution strategies were developed as an alternative to the engineer's intuition.
• Unlike GAs, evolution strategies use only a mutation operator.

Basic evolution strategies

In its simplest form, termed a (1+1)-evolution strategy, one parent generates one offspring per generation by applying normally distributed mutation. The (1+1)-evolution strategy can be implemented as follows (see the code sketch after the genetic programming introduction below):

Step 1: Choose the number of parameters N to represent the problem, and then determine a feasible range for each parameter:
  {x1min, x1max}, {x2min, x2max}, . . . , {xNmin, xNmax}
Define a standard deviation for each parameter and the function to be optimised.

Step 2: Randomly select an initial value for each parameter from the respective feasible range. The set of these parameters will constitute the initial population of parent parameters:
  x1, x2, . . . , xN

Step 3: Calculate the solution associated with the parent parameters:
  X = f(x1, x2, . . . , xN)

Step 4: Create a new (offspring) parameter by adding a normally distributed random variable a with mean zero and pre-selected deviation δ to each parent parameter:
  x′i = xi + a(0, δ),   i = 1, 2, . . . , N
Normally distributed mutations with mean zero reflect the natural process of evolution (smaller changes occur more frequently than larger ones).

Step 5: Calculate the solution associated with the offspring parameters:
  X′ = f(x′1, x′2, . . . , x′N)

Step 6: Compare the solution associated with the offspring parameters with the one associated with the parent parameters. If the solution for the offspring is better than that for the parents, replace the parent population with the offspring population. Otherwise, keep the parent parameters.

Step 7: Go to Step 4, and repeat the process until a satisfactory solution is reached, or a specified number of generations is considered.

• An evolution strategy reflects the nature of a chromosome.
• A single gene may simultaneously affect several characteristics of the living organism.
• On the other hand, a single characteristic of an individual may be determined by the simultaneous interactions of several genes.
• Natural selection acts on a collection of genes, not on a single gene in isolation.

Genetic programming

• One of the central problems in computer science is how to make computers solve problems without being explicitly programmed to do so.
• Genetic programming offers a solution through the evolution of computer programs by methods of natural selection.
• In fact, genetic programming is an extension of the conventional genetic algorithm, but the goal of genetic programming is not just to evolve a bit-string representation of some problem but the computer code that solves the problem.
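Returning to the (1+1)-evolution strategy above, a minimal sketch of Steps 1 to 7 might look as follows. This is an illustrative implementation with our own names; the objective function and parameter ranges in the example call are placeholders, and clipping the mutant back into its feasible range is our own addition:

```python
import random

def one_plus_one_es(f, ranges, sigma=0.5, generations=1000):
    """(1+1)-evolution strategy: one parent, one normally distributed
    mutant per generation; keep whichever scores higher on f."""
    # Steps 1-2: random initial parent within the feasible ranges
    parent = [random.uniform(lo, hi) for lo, hi in ranges]
    best = f(*parent)                              # Step 3
    for _ in range(generations):                   # Step 7: repeat
        # Step 4: add N(0, sigma) noise to each parameter, clipped to its range
        child = [min(max(x + random.gauss(0, sigma), lo), hi)
                 for x, (lo, hi) in zip(parent, ranges)]
        score = f(*child)                          # Step 5
        if score > best:                           # Step 6: keep the better solution
            parent, best = child, score
    return parent, best

# Example: maximise a simple two-parameter function on [-3, 3] x [-3, 3]
result = one_plus_one_es(lambda x, y: -(x**2 + y**2), ranges=[(-3, 3), (-3, 3)])
```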

• Genetic programming is a recent development in the area of evolutionary computation. It was greatly stimulated in the 1990s by John Koza.
• According to Koza, genetic programming searches the space of possible computer algorithms for a program that is highly fit for solving the problem at hand.
• Any computer program is a sequence of operations (functions) applied to values (arguments), but different programming languages may include different types of statements and operations, and have different syntactic restrictions.
• Since genetic programming manipulates programs by applying genetic operators, a programming language should permit a computer program to be manipulated as data and the newly created data to be executed as a program. For these reasons, LISP was chosen as the main language for genetic programming.

LISP structure

LISP has a highly symbol-oriented structure. Its basic data structures are atoms and lists. An atom is the smallest indivisible element of the LISP syntax. The number 21, the symbol X and the string "This is a string" are examples of LISP atoms. A list is an object composed of atoms and/or other lists. LISP lists are written as an ordered collection of items inside a pair of parentheses.

How do we apply genetic programming to a problem?

Before applying genetic programming to a problem, we must accomplish five preparatory steps:
1. Determine the set of terminals.
2. Select the set of primitive functions.
3. Define the fitness function.
4. Decide on the parameters for controlling the run.
5. Choose the method for designating a result of the run.

• The Pythagorean Theorem helps us to illustrate these preparatory steps and demonstrate the potential of genetic programming. The theorem says that the hypotenuse, c, of a right triangle with short sides a and b is given by

  c = √(a² + b²)

• The aim of genetic programming is to discover a program that matches this function.
• To measure the performance of the as-yet-undiscovered computer program, we will use a number of different fitness cases. The fitness cases for the Pythagorean Theorem are represented by the samples of right triangles in the table below. These fitness cases are chosen at random over a range of values of variables a and b.

  Side a   Side b   Hypotenuse c        Side a   Side b   Hypotenuse c
     3        5       5.830952             12       10      15.620499
     8       14      16.124515             21        6      21.840330
    18        2      18.110770              7        4       8.062258
    32       11      33.837849             16       24      28.844410
     4        3       5.000000              2        9       9.219545
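These fitness cases translate directly into test data. A small sketch (our own formulation, not from the slides) stores them as (a, b, c) tuples and checks them against the target function c = √(a² + b²):

```python
import math

# Fitness cases from the table: (side a, side b, hypotenuse c)
FITNESS_CASES = [
    (3, 5, 5.830952),   (8, 14, 16.124515), (18, 2, 18.110770),
    (32, 11, 33.837849), (4, 3, 5.000000),  (12, 10, 15.620499),
    (21, 6, 21.840330), (7, 4, 8.062258),   (16, 24, 28.844410),
    (2, 9, 9.219545),
]

for a, b, c in FITNESS_CASES:
    # every case agrees with the Pythagorean Theorem to the table's precision
    assert abs(math.sqrt(a**2 + b**2) - c) < 1e-5
```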

Step 1: Determine the set of terminals. The terminals correspond to the inputs of the computer program to be discovered. Our program takes two inputs, a and b.

Step 2: Select the set of primitive functions. The functions can be presented by standard arithmetic operations, standard programming operations, standard mathematical functions, logical functions or domain-specific functions. Our program will use four standard arithmetic operations +, −, * and /, and one mathematical function sqrt.

Step 3: Define the fitness function. A fitness function evaluates how well a particular computer program can solve the problem. For our problem, the fitness of the computer program can be measured by the error between the actual result produced by the program and the correct result given by the fitness case. Typically, the error is not measured over just one fitness case, but instead calculated as a sum of the absolute errors over a number of fitness cases. The closer this sum is to zero, the better the computer program.

Step 4: Decide on the parameters for controlling the run. For controlling a run, genetic programming uses the same primary parameters as those used for GAs. They include the population size and the maximum number of generations to be run.

Step 5: Choose the method for designating a result of the run. It is common practice in genetic programming to designate the best-so-far generated program as the result of a run.

Once these five steps are complete, a run can be made. The run of genetic programming starts with a random generation of an initial population of computer programs. Each program is composed of the functions +, −, *, / and sqrt, and the terminals a and b. In the initial population, all computer programs usually have poor fitness, but some individuals are more fit than others. Just as a fitter chromosome is more likely to be selected for reproduction, so a fitter computer program is more likely to survive by copying itself into the next generation.

Crossover in genetic programming

[Figure: crossover exchanges randomly chosen subtrees of two parental S-expressions.]

  Two parental S-expressions:
  (/ (− (sqrt (+ (* a a) (− a b))) a) (* a b))
  (+ (− (sqrt (− (* b b) a)) b) (sqrt (/ a b)))

  Two offspring S-expressions:
  (/ (− (sqrt (+ (* a a) (− a b))) a) (sqrt (− (* b b) a)))
  (+ (− (* a b) b) (sqrt (/ a b)))
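The fitness measure of Step 3 can be sketched by representing a program as a nested list that mirrors a LISP S-expression and summing the absolute errors over a few fitness cases from the table. The representation, function names and the guards on / and sqrt are our own illustrative choices:

```python
import math

CASES = [(3, 5, 5.830952), (8, 14, 16.124515), (4, 3, 5.0)]  # fitness cases from the table

FUNCTIONS = {
    '+': lambda x, y: x + y,
    '-': lambda x, y: x - y,
    '*': lambda x, y: x * y,
    '/': lambda x, y: x / y if y != 0 else 1.0,   # protected division (our guard)
    'sqrt': lambda x: math.sqrt(abs(x)),          # abs() guards against negative arguments
}

def evaluate(expr, a, b):
    """Recursively evaluate an S-expression whose terminals are a and b."""
    if expr == 'a':
        return a
    if expr == 'b':
        return b
    op, *args = expr
    return FUNCTIONS[op](*(evaluate(arg, a, b) for arg in args))

def error(expr, cases=CASES):
    """Step 3: sum of absolute errors over the fitness cases (0 is a perfect program)."""
    return sum(abs(evaluate(expr, a, b) - c) for a, b, c in cases)

perfect = ['sqrt', ['+', ['*', 'a', 'a'], ['*', 'b', 'b']]]   # sqrt(a*a + b*b)
print(error(perfect))   # close to zero
```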

Mutation in genetic programming

• A mutation operator can randomly change any function or any terminal in the LISP S-expression.
• Under mutation, a function can only be replaced by a function and a terminal can only be replaced by a terminal.

[Figure: mutation of two S-expressions; in the first, the function − is replaced by +, and in the second, the terminal b is replaced by a.]

  Original S-expressions:
  (/ (− (sqrt (+ (* a a) (− a b))) a) (* a b))
  (+ (− (sqrt (− (* b b) a)) b) (sqrt (/ a b)))

  Mutated S-expressions:
  (/ (+ (sqrt (+ (* a a) (− a b))) a) (* a b))
  (+ (− (sqrt (− (* b b) a)) a) (sqrt (/ a b)))

In summary, genetic programming creates computer programs by executing the following steps:

Step 1: Assign the maximum number of generations to be run and probabilities for cloning, crossover and mutation. Note that the sum of the probability of cloning, the probability of crossover and the probability of mutation must be equal to one.

Step 2: Generate an initial population of computer programs of size N by combining randomly selected functions and terminals.

Step 3: Execute each computer program in the population and calculate its fitness with an appropriate fitness function. Designate the best-so-far individual as the result of the run.

Step 4: With the assigned probabilities, select a genetic operator to perform cloning, crossover or mutation.

Step 5: If the cloning operator is chosen, select one computer program from the current population of programs and copy it into a new population.
• If the crossover operator is chosen, select a pair of computer programs from the current population, create a pair of offspring programs and place them into the new population.
• If the mutation operator is chosen, select one computer program from the current population, perform mutation and place the mutant into the new population.

Step 6: Repeat Step 4 until the size of the new population of computer programs becomes equal to the size of the initial population, N.

Step 7: Replace the current (parent) population with the new (offspring) population.

Step 8: Go to Step 3 and repeat the process until the termination criterion is satisfied.

Fitness history of the best S-expression

[Figure: fitness (%) of the best S-expression over the first generations of a run; the best-of-generation program shown is the tree (sqrt (+ (* a a) (* b b))).]

What are the main advantages of genetic programming compared to genetic algorithms?

• Genetic programming applies the same evolutionary approach. However, genetic programming is no longer breeding bit strings that represent coded solutions but complete computer programs that solve a particular problem.
• The fundamental difficulty of GAs lies in the problem representation, that is, in the fixed-length coding. A poor representation limits the power of a GA, and even worse, may lead to a false solution.
• A fixed-length coding is rather artificial. As it cannot provide a dynamic variability in length, such a coding often causes considerable redundancy and reduces the efficiency of genetic search. In contrast, genetic programming uses high-level building blocks of variable length. Their size and complexity can change during breeding.
• Genetic programming works well in a large number of different cases and has many potential applications.