arXiv:nlin/0006050v1 [nlin.AO] 30 Jun 2000 clysata h ee fsml oes ntrso pop- of terms In models. simple of typ to level has the However, one at that interest. start difficult of ically so deal usually great is a analysis generate landscape not do b [5] considered Eigen landscape (NIAH) such such needle-in-the-haystack “simu- landscapes, played the somewhat simple not is Paradigmatically has computation driven.” evolutionary it lation that why utilized is reason being role One under- a are to . freedom particular their of in in degrees and effective do, what they stand way algorithms the genetic as behave such us (GAs) systems help why on will and flows how computation population understand evolutionary of in scrutiny landscapes closer a fitness genetics that population clear as be mathematically, should least ty same at the system, much very of treats com latter evolutionary more the in that a Given landscapes for for of putation). (see [4] role role the and important of perspective an account historical recent played an not although for has role, [3] it central example say of to sort not same is the that played not has scape traditional selection.” the by to “order of opposed view as or Darwinian in — such ordering order” for spontaneous of paradigm — “origin another der the offering world, of biological issue the the ten utilized addressing last has [2] the for Kauffman in concept Stuart particular, systems unify- In complex a so. of as or theory years role the important in an concept played by ing conceived has originally [1], as Wright landscape, Sewell fitness a of notion The Introduction 1 simulations. by also illustrated concept and cases examples the complicated more analytic of some simple efficacy several The using shown role. selec- is fit- important than an effective other play under- operators an tion to genetic on when way flows landscape intuitive as ness an dynamics gives population and for- stand latter to the the probability models not the fitness of mer Effective measure a age. as reproductive num- or reach the offspring, of fit measure a of as ber that — appears ways fitness different of the concept to the traced be rule can the descrepency than rather This exception presence recombi- the the and is mutation in hill-climbing as general, nation, such in operators that genetic shown other landscape. is of rugged it a article on this por- process In being hill-climbing itself a evolution as role, trayed important an played has Abstract- EFCIE INS ADCPSFREOUINR SYSTEMS EVOLUTIONARY FOR LANDSCAPES FITNESS “EFFECTIVE” neouinr optto h ocp fafins land- fitness a of concept the computation evolutionary In landscape fitness a of concept the theory evolution In NM icioEtro,APsa 70-543 A.Postal Exterior, Circuito UNAM, NP nttt eCeca Nucleares, Ciencias de Instituto NNCP, -al [email protected] e-mail: ´ xc ..04510 M´exico D.F. .R Stephens R. C. the pe as it y - - - . norltd hl onsa distance a at points while uncorrelated, eeecsteen.Temjrt fpeiu okhsbeen has work previous of majority [10] The is example for group therein). (see “Vienna” references context the this in of interest work particular enormous of The e of an is task. be this ambitious though clearly landscapes,” tremely of would “theory It a have to dynami interest sensitive? the in are change which a under and universal, proper- what i.e. robust, question: are a important ties specify an to then fitness has is on one There flow flows dynamics. evolution a of specify of view to of problem Obviously, In point landscapes. whole the from landscape. the understood given that be a argue can on might flows one compu- population evolutionary fact, a as how well is as biology, what tation, However, in concept. importance static of a is is ruggedness of properti degree landscape The dynamic and static between distinguish optimum. global isolated th the in of anywhere existence indication the no of being correl landscape there natural zero, the is landscape length NIAH tion the In correlated. tially distance a corr by such separated of Points extent lations. and degree the of the measure length, characteristic of of correlation points measure The two rough between a landscape. available is information it as mutual search the corre- obviousl in of is importance structure degree highest Correlation the the landscape. is the importance thing in same more lation the Of not is modality ruggedness. that maxi- as is here a ca moral [8], hand The function easy. other porcupine be the the as On such that function, for modal else. and mally GA, anything a about for difficult just uni is matter known, is well landscape is as NIAH yet, the modal instance, somewhat be For can As things counter-intuitive. [7], useful. [6], immensely however, algo be discussed evolutionary would been easy has an are for that difficult those and are rithm land that of modality those classification more into a scapes that Obviously being difficulty. more assumption signifies land- the and difficulty modality; problem between scape relation the on focus to com- evolutionary the expects. to what community able putation and been theory landscape have bi- in mathematicians achieve theoretical and what between physicists, gap” ologists, “expectation t an of Thus, been be has to computation. seem evolutionary not for do interest but particular speaking, biologically interest of be enpi oaatv ak ntehprui configuration hypercubic Kauffman the the has of on attention spaces walks adaptive much to landscapes paid fitness been on dynamics ulation hntligaotfins adcpsi sipratto important is it landscapes fitness about talking When tended has instance, for theory, GA in analysis Landscape NK mdl 2.Sc yaiscan dynamics Such [2]. -models ξ ftelnsaei a is landscape the of , ξ > ξ < ilb essentially be will ilb substan- be will here of y and es. x- cs a- e- n e - - - restricted to dynamics generated by one or both of the two given that it is the phenotype that manifests the physical char- genetic operators selection and mutation, some celebrated re- acteristics on which natural selection acts. However, the raw sults being Fisher’s fundamental theorem of natural selection material of an evolutionary system is the genotype. Hence, [9] and the concept of an error threshold [5]. In particular, one needs to know how fitness manifests itself at the geno- adaptive walk models for Kaufmann NK landscapes [2] of- typic level. For that one defines a genotype-phenotype map, fer an arena wherein some analytical progress can be made. φ : G −→ Q, where G is the space of genotypes which has Work that includes the effect of recombination has been less dimension DG. One may thus define an induced fitness func- forthcoming, especially in terms of analytical work. tion on the space of genotypes, fG = fQ ◦φ. As the genotype- A fundamental question is: how do the different genetic phenotype map is more often than not non-injective (many- operators such as selection, mutation and recombination af- to-one) the function fG will be degenerate, many genotypes fect population flows? Here, I am defining a genetic operator correspondingto the same fitness value. Thus, fitness gives an to be any operation H such that P (t +1) = HP (t), where equivalence relation on G, many genotypes being equivalent P (t) is the population at time t. It is common to think of selectively. A simple example of this would be the standard selection as not being a genetic operator as it acts only on ex- synonym “symmetry” of the genetic code. I will thus refer to pressed behavior, i.e. the phenotype. However, given that, at the equivalence of a set of genotypes under the action of se- least formally, there exists a genotype-phenotype map selec- lection (i.e. they’re all equally fit) as a “symmetry” between tion also acts at the genotypic level. A lot of the power be- them. Obviously, by definition, selection preserves this sym- hind the standard visualization of a fitness landscape has been metry. associated with the view that evolution is a hill-climbing pro- Having defined fitness as a mathematical relation we must cess on such a landscape. This intuition however is linked to now understand it conceptually, in particular in its relation a particular class of dynamics — principally selection. Thus, to other genetic operators. In its simplest form [11] fitness one genetic operator dominates the intuition behind the land- and selection are measured by the number of fertile offspring scape. In the presence of other genetic operators it is quite produced by one genotype versus another. However, this is generic that population flows are not simple hill climbing pro- not the way it is normally portrayed in evolutionary compu- cesses. What one requires is a more democratic approach that tation, which follows the lead of population genetics, where treats the various genetic operators on a more equal footing. it is a measure of the probability that an individual survives Thus, one is led to enquire as to whether there exist other to reproductive age [12]. Clearly the two concepts are quite ways of thinking of landscapes so as to restore an intuitive different. The second is a property of an individual, in that it picture of landscape dynamics in the presence of genetic op- does not depend on other genotypes, even though the repro- erators other than selection. A precedent for such an alter- ductive fitness function may reflect “environmental” effects. native type of landscape can be found in thermodynamics In the first case we must ask: what type of fertile offspring where one may consider the difference between energy and are left to the next generation? Genetically identical copies free energy, the latter being the more relevant quantity for de- of the parents or what? Without thinking too much of the termining the behavior of a system. In particular the system’s biological realities proportional selection means selection for properties are much more readily seen from the free energy reproduction of certain parents, which in the absence of ge- landscape than the energy landscape. netic“mixing” operators such as mutation and recombination The plan of this contribution is as follows: in section 2 leads to the production of offspring genetically identical to I will discuss some issues related to the definition of fitness their parents. However, this idea of fitness does not take into drawing in particular a strong distinction between the con- account the important effect the other genetic operators may cepts of “reproductive”fitness and “offspring” fitness. In sec- have in determining the complete reproductive success of an tion 3, I discuss landscape statics and dynamics showing that individual. In particular, the effect of the other genetic op- a realistic landscape is almost always explicitly time depen- erators, as shall be shown below using several model exam- dent. In sections 4 and 5, I present several, generic analytic ples, can be such that the population flows on the standard and numerical examples wherein the corresponding popula- fitness landscape cannot be understood with any degree of in- tion flows cannot be intuitively understood on the standard tuition. When the effects of such genetic operators are small, fitness landscape thus illuminating the need for an alterna- and selection is dominant, it is quite likely that the two fit- tive paradigm. In section 6 this paradigm is introduced and nesses are quite close numerically. However, to take another discussed and the examples of the previous two sections re- extreme, neutral evolution where all genotypes have the same considered. Finally, in section 7, I draw some conclusions. probability to reach maturity, the two will be quite different. We will introduce later the concept of an “effective fitness” 2 What is fitness? that can encompass both limits of strong and weak selection within the same function. I will denote the reproductive fit-

Mathematically it is quite simple to define fitness: fQ :−→ ness of a genotype by f(Ci,t) and its success in producing + R , where Q is the space of phenotypes and is of dimension offspring by the offspring fitness, feff , which will be defined DQ. It is natural to define fitness as a function on phenotypes mathematically in section 6. Having discussed fitness we now come to the idea of a fit- represented by a set {gc}⊂{gk}, then we may ask whether ness landscape. The concept of fitness landscape is already we may assign a fitness to those commoncharacteristics. This implicit in the above definition of fitness which assigns to ev- can be simply done by defining ery g ∈ G a real, non-negative number r ∈ R+. Thus, one can intuitively think of a mountainous landscape where fit- f(ξ,t)= f(Ci)P (Ci,t)+ f(Ci)P (Ci,t) (1) ness is the height function above some DG-dimensional hy- perplane. Of course, the visualization of a typical landscape where P (C,t) is the probability of finding the configuration such as an N-dimensional hypercube is somewhat less pic- C at time t. Of course, the above is very familiar to people turesque. Crucial to the concept of a landscape is the idea working in GAs as ξ is just a schema. I mention it as it is also of a distance function, without which the landscape is shape- of fundamental importance in biology. Why? Because except less as we cannot say which genotypesare closely related and in a computer simulation one can never keep track of the evo- which are not. A very common distance function, particularly lution of all microscopic configurations. Typically, what are natural in the case of mutation and selection, is the Hamming of interest are more coarse-grained variables. For example, the fitness of a species, S, we can consider as distance, dij , between two genotypes Ci and Ci. If the fit- ness function is an explicit function of time the corresponding landscape will be a dynamical not a static object. Almost in- f(S,t)= f(Ci)P (Ci,t) (2) variably, landscape analysis has been restricted to landscapes CXi∈S associated with a static reproductive fitness. Such landscapes, where the sum is over all those genotypes that correspond as we shall see, are the exception rather than the rule. In par- to the species. The moral here is that any coarse graining ticular, given that the number of offspring of a genotype will whatsoever will introduce a time dependence into the fitness almost inevitably be a function of time a landscape based on function for the coarse grained effective degrees of freedom. feff will be an explicit function of time. Thus, in general the concept of a dynamic landscape is of more relevance than a static one. In the case of both biology 3 Landscape Statics and Dynamics and fitness as measured in terms of number of offspring will be time dependent, hence, any land- As mentioned above, a fitness landscape is normally thought scape portrayal of this function will also be time dependent. of as a static concept, the reproductive fitness assigned to a We nowcome to the questionof howto impose a dynamics given configuration being independent of time. It is clear that on the fitness landscape. A population M(t) ≡{g(t)}⊂ G, in any real, biological system this is a crude simplification. where {g(t)} is the set of genotypes present in the population Any realistic biological landscape must be time dependent, at at time t, flows according to least over some time scale, due to the effects on fitness of the environment. This is most clear in the concept of coevolution M(t +1)= H({f(Ci)}, {pk},t)M(t) (3) where changes in one species can affect the fitness of another. Under certain circumstances, however, and over certain time where H is an evolution operator that depends on the fitness scales, a static landscape may be a good approximation to landscape, {f(Ci)}, and the set of parameters, {pk}, that gov- the actual one. In this case a key concept is the ruggedness ern the other genetic operators; e.g. mutation and recombi- of the landscape which one can partially think of as being a nation probabilities. There are very many choices by which measure of the density of local optima but, more importantly, one can implement a population dynamics. A simple one, is a measure of the degree of correlation in the landscape. applicable in both biology and evolutionary computation, is An associated concept, a complexity catastrophe, shows that that of pure proportional selection which gives the following there are limits to the power of selection in the case of both equation for the mean number, n(Ci,t), of genotype Ci very smooth and very rough landscapes. ¯ In evolutionary computation there are many situations, es- n(Ci,t +1)=(f(Ci)/f(t))n(Ci,t) (4) pecially in global optimization such as the canonical trav- where I assume the reproductive fitness landscape to be time eling salesman problem, where the landscape is strictly independent. In this case it is clear how the population flows static. However, even in this case a time-dependent land- — monotonically towards the global optimum of the land- scape emerges in a very natural way. Consider any micro- scape (neglecting of course finite size effects). It is precisely scopic configuration, C , that we can represent by a set g of i such intuitive flows, according to the gradient of the land- N elements {g }, k ∈ [1,N]. Such a configuration could k scape, that have lent such power to the concept of evolution represent, for example, the genotype of an organism, or a as a flow on a fitness landscape. It should be fairly clear, and possible solution to a combinatorial problem. The fitness of will be shown explicitly in section 4, that in this dynamics such a configuration, f(C ), I assume to be independent of i reproductive and offspring fitnesses are equal due to the fact time. Now, consider another configuration, C 6= C , of fit- i i that the only operator present is reproductive selection. ness f(C ). We ask: what common characteristics do the two i Another interesting limiting case is that of the adaptive configurations have? If they have N2 elements in common, walk whereby one represents the entire population by one genotype that jumps instantaneously to a one-mutant neigh- mutations both the reproductive fitness and the offspring fit- bor. This dynamics and the effect of landscape ruggedness ness are the same. In the infinite populationcase, oron the av- has been well studied. However, this limiting case of hill- erage in the finite population case, ∆P (t)= P (1,t)−P (0,t) climbing, although it has biological interest, is not so inter- is constant in time. Therefore any initial deviations from ho- esting from the point of view of evolutionary computation mogeneity in the initial population will be preserved. For as adaptive walks get stuck at local optima. A more general non-zero mutation rate any initial inhomogeneity will be dynamics for proportional selection, mutation, and one-point eliminated by the effect of mutations. Thus, if ∆P > 0 one crossover can be described by the equation [13, 14] will find that the offspring fitness of allele 0 is greater than that of allele 1 until the deviation is eliminated. Thus, the → P (Ci,t +1)= P(Ci)Pc(Ci,t)+ P(Ci Cj )Pc(Cj ,t) (5) effect of mutations will be to bring the system into “equilib-

CXi6=Ci rium,” i.e. into the homogeneous population state. During this approach to equilibrium the less numerous allele, 0, is → where the effective mutation coefficients P(Ci) and P(Ci Cj ) “selected” more than the allele 1 in that it leaves more off- represent the probabilities that the genotype Ci remains un- spring. If the mutation rates for changing allele 1 to allele 0 mutated and the probability that the genotype C mutates to j and for changing allele 0 to allele 1 are not equal but are p1 the genotype C respectively. C is the mean propor- i Pc( i,t) and p2 respectively then the differences between reproductive tion of strings Ci at time t after selection and recombination. fitness and offspring fitness are even more pronounced as can Explicitly be seen by

l − 1 ′ Pc(Ci,t)= 1 − pc P (Ci,t) ∆P (t +1)= (1 − 2p2)∆P (t) + (p1 − p2)P (0,t) (7)  N − 1 l−1 In this case limit ∆P (t) → ((p1 − p2)/(p1 + p2)) p t→∞ + c P ′(CL(k),t)P ′(CR(l − k) (6) Now consider a two-locus system, once again with two N − 1 i i Xk=1 alleles, 0 and 1. The fitness landscape we will take to be: f(00) = f(01) = 1, f(11) = 10, f(10) = 0.1. The fitness ′ ¯ ¯ where P (Ci,t) = (f(Ci,t)/f(t))P (Ci,t), f(t) being the landscape in this case is only partially degenerate: the states average population fitness. pc is the crossover probability 00 and 01 having the same fitness value. However, although ′ L and k the crossover point. The quantities P (Ci (k),t) and the reproductive fitness values are the same the offspring fit- ′ R ′ P (Ci (k),t) are defined analogously to P (Ci,t) but refer ness values, once again, are different. The degeneracy in this L R to the coarse grained variables, i.e. schemata, Ci and Ci case is lifted by the effect of mutation as can be seen from the which are the parts of Ci to the left and right of k respec- equations tively. One can illustrate the content of the equation with a simple example: 0110100|0101010010. The crossover point f(00) 2 f(11) L P (00,t +1)= (1 − p)P (00,t)+ p P (11,t) is at k = 7 hence Ci , as a schema, has N2 = l = 7 while f¯(t) f¯(t) R Ci has N2 = l = 10. An analogous equation, identical in p(1 − p) functional form, for the case of a general schema ξ can also + (f(01)P (01,t)+ f(10)P (10,t)) (8) f¯(t) be derived [13, 14]. In the rest of the paper we will consider the dynamics generated by this equation in its various limits. f(01) f(10) P (01,t +1)= (1 − p)P (01,t)+ p2 P (10,t) 4 Effect of other genetic operators: analytic ex- f¯(t) f¯(t) amples p(1 − p) + (f(00)P (00,t)+ f(11)P (11,t)) (9) f¯(t) In this section I will consider more explicitly the effect of ge- netic operators other than reproductive selection on the pop- For p < 1/2, and starting with a random initial population, ulation flow on fitness landscapes in the context of some in terms of number of offspring the configuration 01 will be simple, analytically tractable models. I have discussed that preferred to 00. It is not hard to see why: the fittest config- the intuition behind the idea of fitness is to a large extent uration is 11 and this can more easily mutate to 01 than to that of giving a “reproductive edge” to certain genotypes, i.e. 00. Thus, there is a population flow from 00 to 01 in spite that certain genotypes give rise to more offspring than oth- of the fact that there is no gradient in the reproductive fitness ers. Also, that the intuition behind population flows on a landscape to induce it. On the contrary, even if there were a fitness landscape is that of hill-climbing. Let’s think about gradient in the direction from 01 to 00 if it were not too large this somewhat more critically in the light of some interesting the mutation induced flow from 00 to 01 could overcome it, counter-examples. i.e. the population can flow downhill against the reproduc- I will first consider some simple one and two locus sys- tive fitness gradient! Thus, there is a tendency for the system tems. Consider a single gene with two alleles, 0 and 1, which to evolve along a preferred direction not because of selection have the same reproductive fitness value, f. In the absence of constraints but because the system has preferred directions of change in the face of random mutations. This is the phe- To take an extreme example of this consider the follow- nomenon of orthogenesis. ing simple two-locus system in the presence of selection Naturally, this phenomenon encourages one to ask just and recombination, but not mutation, defined by a fitness when neutral evolution [15] is actually “neutral.” In the above landscape: f(01) = f(10) = 0, f(11) = f(00) = 1. case although the flow is reproductively neutral in the direc- The steady state solution of the evolution equation (5) is 1 pc pc tion 00, 01 the neutrality, or symmetry, is broken by the pres- P (11) = P (00) = 2 1 − 2 , P (01) = P (10) = 4 . ence of non-neutral adjacent mutants. For a flat fitness land- For pc = 1 one sees that half the steady state population is scape where all genotypes have fitness f composed of genotypes that have zero fitness. In this case we see that populations can even flow against infinite fit- 2N −1 d N−d ness gradients! Note also that this fixed point of the dynam- P (C ,t +1)= P (C ,t)p ij (1 − p) ij (10) i i ics is in fact a stable one. For the other two-locus system Xj=1 mentioned above P (00, 1) = (1 − (9.9pc/12.1))P (00, 0), For a homogeneous population the number of states Ham- P (01, 1) = (1+(9.9pc/12.1))P (01, 0). Hence, once again N ming distance dij from Ci is Cdij which implies that the off- we see a degeneracy in reproductive fitness being broken by spring fitness is the same for all genotypes. Small deviations the effect of another genetic operator, in this case recombina- from homogeneity will be manifest in small differences be- tion, with a consequent differential in offspring fitness. tween the reproductive fitness and the offspring fitness which will gradually diminish as the population homogenizes. If the 5 Effect of other genetic operators: examples initial population, M(0), is homogeneousthen it is preserved to be so in the evolution — on the average. The equations from simulations I am using here are of course mean field equations and as Having shown how genetic operators such as mutation and such do not capture finite size effects that can play an impor- recombination can drastically alter the directions populations tant role in neutral evolution. If the landscape only has a flat flow on fitness landscapes in some simple analytic models I subspace then how well one can describe the population evo- will now illustrate the same phenomenon in the case of some lution as being neutral will depend on where the population much more complex, analytically intractable models. I will is located and, if located predominantly in the flat subspace, be brief in detail referring, where applicable, the reader to the what is the Hamming distance to states not within the sub- original literature. space and what is the fitness of those states. Pictorially, if I will consider first the evolution of two schemata of vari- one thinks of a bowl with a flat bottom then the sides of the ous defining lengths in strings of size 8 in the case of a sim- bowl with the largest gradient will attract the population most ple counting ones landscape in the presence of selection and strongly. recombination, but without mutation [16]. In the absence of Another example is that of the NIAH landscape in the crossover there will certainly be no preference for one schema presence of mutation and selection. The landscape is f(Ci)= length versus another, i.e. the offspring fitness as well as the f0, i = opt, f(Ci) = f1, i 6= opt, opt being the optimum reproductive fitness of any two-schemata is on average the genotype. One can use as a measureof orderin the population same. To analyze this situation consider the quantity M(l) the relative concentration of optimum genotype, P (Copt ,t); where M(l) ≡ (nopt(l) − nopt(8))/nopt(8). Here, nopt(l) and in particular in the long time limit, P (Copt, ∞), where a is the number of optimal 2-schemata of defining length l nor- steady state population is reached known as a quasi-species. malized by the total number of length l 2-schemata per string, In this case as is well known [5] P (Copt, ∞) monotonically i.e. 9 − l. By optimal 2-schemata we mean schemata contain- decreases as a function of the mutation rate p until a critical ing the global optimum . is the number of optimal N 11 nopt(8) rate, pcri, is reached beyond which P (Copt , ∞)=1/2 . It 2-schemata of defining length 8. is important to realize that the fitness landscape is constant A large population of 5000 8-bit strings was considered. throughout this behaviour, i.e. for p = 0 the entire popu- Figure 1 shows an average over 30 different runs of M(l) lation climbs the fitness peak until f¯(t) = fopt, while for versus time with pc = 1. As mentioned, without crossover p ≥ pcri the population is uniformly dispersed throughout the there is essentially no preference for schemata of a given entire landscape. Clearly no intuition about these very differ- length. Adding in crossover leads to a remarkable change: ent types of population flow endpoints can be gleaned from Schemata prevalence is ordered monotonically with respect the structure of the landscape itself. In one limit selection to length but with the larger schemata being favored. Thus, dominates in the other limit it has no effect. Whether or not although there is no preference in terms of reproductive fit- the population will ascend a fitness peak due to reproductive ness for one schema length versus another quite the contrary selection depends crucially on the presence of another genetic is true in terms of offspring fitness. Next, I will consider operator — mutation. the differences between reproductive and offspring fitness in Above, I considered only mutation to support the sup- the case of a simple auto-adaptive system. Specifically, one position that reproductive fitness and offspring fitness can codes the mutation and 1-point crossover probabilities into an be very different in the presence of other genetic opera- Nc-bit binary extension of an N-bit genotype which is repre- tors. Similar considerations also apply to recombination. scape the “jumper” landscape. In Figure 2 one sees the results of an experiment where the mutation and crossover probabil- 0.015 l=2 l=3 ities were coded either with three or eight bits to codify each l=4 0.01 l=5 probability. Tournament selection of size 5 was used and a l=6 l=7 lower bound of 0.005 for mutation imposed. The success of 0.005 the self-adapting system in converging to the time dependent global optimum was compared to that of an “optimal” fixed 0 parameter system with p = 0.01 and pc = 0.8. The upper M(l)

-0.005

-0.01 1.0

CR-Mut8b 40,41 0.75 -0.015 CR-3b CR-8b

0.5 -0.02 0 5 10 15 20 25 30 generation 0.25

1-0 Figure 1: Graph of M(l) versus t in unitation model with

0.75 Cros 8b pc =1. Cros 3b Average Rates / CR

0.5 sented by a non-degenerate fitness landscape, i.e. fG = fQ. 0.25 8b-Mut This leads to a new (N + N )-bit genotype whose landscape Mut 3b c Mut 8b Nc has a degree of degeneracy of order 2 , i.e. the phenotype- 0 N 0 5 10 15 20 25 30 35 40 45 50 genotype map is now 2 c fold degenerate. Generation In practice, starting off with a random population, where the average rates are 0.5, one finds that the population in a Figure 2: Graph of relative concentration of the global op- class of interesting model landscapes self-organizes until pre- timum (CR) (upper graph) and average crossover and muta- ferred mutation and recombination rates appear [17]. Such tion probabilities (lower graph) as a function of time for the self-organization cannot come about due to any bias in terms “jumper” landscape. CR-3b and CR-8b are the results for 3- of reproductive fitness, as by construction there is no such bit and 8-bit encoded algorithms. CR-Mut8b is the result for bias. Neither can it come about by a spontaneous symme- coded mutation with pc = 0, with 40, 41 being the relative try breaking (i.e. a spontaneous breaking of the genotype- concentration of strings associated with the local optimum at phenotype degeneracy) due to stochastic effects, i.e. finite 40 and 41. Mut 3b, Mut 8b, Cross 3b and Cross 8b are the size effects, as in the majority of the simulations the size of average mutation and crossover probabilities in 3-bit and 8- the population was much bigger than DG. What is happen- bit representations. The solid line for Mut8b is the average ing is that even though particular values for p and p are not c mutation rate in the case pc =0. selected for in terms of reproductive fitness they are selected for in terms of offspring fitness. Thus, in G there is a flow curves show the relative frequencies of the optima using 8-bit to a certain subregion of the space wherein the mutation and and 3-bit codification and also what happens when pc = 0 recombination probabilities take on preferred values. Addi- and only the mutation rate is coded. It is notable that the opti- tionally, the probability distribution associated with the var- mal fixed parameter system was incapable of finding the new ious values of p and pc narrows as time increases indicating optimumwhereas the codedsystem had no such problem. For that there is convergence of the population. the case pc =0 the curve 40, 41 shows the relative frequency As a specific example, consider a time-dependent land- of the strings associated with the optimum at 40 and 41. Be- scape defined on 6-bit chromosomes that code the integers fore the landscape “jump” this optimum is local, being less fit between 0 and 63. The landscape is time dependent in the than the global optimum at 10 and 11. After the “jump” it is following way: the initial landscape has a global optimum fitter but less fit than the new global optimum 63 which is an situated at 10 and 11 and a local optimum at 40 and 41. How- isolated point. ever, after 60% of the population reaches the global optimum Notice that the global optimum was found in a two-step the landscape changes instantaneously, converting the origi- process after the landscape change. First, the strings started nal global optimum into a local one. The original local opti- finding the optima 40, 41 before moving onto the true global mum at 40 and 41 remains the same but with a higher fitness optimum, 63. Immediately after the jump the effective pop- value than the new local optimum at 10 and 11. Additionally, ulation of the new global optimum is essentially zero. The a new global optimum appears at 63. I will denote this land- number of strings associated with 40 and 41 first starts to grow substantially at the expense of 10 and 11 strings. At available from both small and large trees, the only selec- its maximum the number of optimum strings is still very low. tive criterion being that giraffes prefer to choose a mate from However, very soon thereafter the algorithm manages to find among those that have similar neck size. This “social pres- the optimum string which then increases very rapidly at the sure” landscape is modeled by defining the fitness of the expense of the rest. ith giraffe to be a function of its neck size ni and the av- The striking result here can be seen by comparing the erage neck size of the population, hni, with value one if changes in the relative frequencies with the changes in the hni− δ 0 is average mutation rate, especially in the case pc = 0. Clearly a tolerance window. Note that landscape fitness depends only they are highly correlated. First, while the population is or- on neck size, hence all genotypes that correspond to the same dering itself around the original optimum, there is negative dynamical fixed point (phenotype) have the same reproduc- selection in terms of offspring fitness against high mutation tive fitness. Thus, as in the other examples, there is no direct rates as one can see by the steady decay of the average muta- selective advantage for one genotype versus another. To in- tion rate. After the jump there is a noticeable increase in the troduce time dependence into the landscape one imposes a mutation probability as the system now has to try to find fitter short period of drought in which food begins to be available strings. As the global optimum is an isolated state it is much only in taller and taller trees. This period is mimicked by easier to find fit strings associated with 41 and 40. The popu- making fi =1 if hni− δ + ǫ

190 feff (ξ,t) P (ξ,t +1)= P (ξ,t) (11) 180 f¯(t)

170 One may think of the effective fitness as representing the ef- 160 fect of all genetic operators in a single “selection” factor.

150 feff (Ci,t) is the fitness value at time t required to increase average neck size or decrease P (Ci,t) by pure selection by the same amount 140 as all the genetic operators combined in the context of a re- 130 productive fitness f(Ci). If feff (Ci,t) > f(Ci,t) the effect

120 of the genetic operators other than selection is to enhance the number present of genotype Ci relative to the number 110 0 200 400 600 800 1000 1200 1400 number of generations found in their absence. Obviously, the converse is true when feff (Ci,t)

feff (Ci,t) = ((1 − 2p) + (p/P (Ci,t))) f (13) 6 Effective fitness and effective fitness land- scapes Thus, we can see explicitly that when the populationis homo- geneous effective fitness is the same as reproductive fitness. The previous two sections showed that it is difficult to in- Deviations from homogeneity result in a higher effective fit- tuitively understand population flows on fitness landscapes ness for the less numerous genotype. The effective fitness ad- when genetic operators other than selection play an important vantage then decreases as the system approaches equilibrium. role. This is manifest in the fact that the most selectively fit The effective fitness landscape for this model as a function of individuals reproductively do not always give rise to the most time can be seen in Figure 4. In this case, f =2, p1 =0.15, offspring. Thus, under many circumstances the hill-climbing p2 = 0.1 and P (0, 0) = P (1, 0) = 0.5. Note the significant analogyis not a good one. I emphasize that I am not just talk- differences in feff (i, 0) as a result of which there is a popula- ing abouta pathologythat in practiceis very unlikelyto occur, tion flow from 1 to 0 along this effective fitness gradient even but rather a phenomenon that is central to both biology and though there is no reproductivefitness gradient. This gradient evolutionary computation whenever genetic operators other decreases monotonically as a function of time thus the effec- than pure selection are of primary importance. tive landscape becomes flat asymptotically. We can see from I have throughout constantly emphasized the difference (11) that this will be a generic propertyof any effective fitness between reproductive “landscape” fitness and offspring fit- landscape, the only steady state solutions of the equation be- ¯ ness. I now wish to define mathematically the latter. There ing feff (Ci,t)= f(t) ∀Ci, or P (Ci,t)=0. Thus, already we are several possibilities for such a quantity. For instance, see the usefulness of effective fitness — it indicates by its de- based on the thermodynamic analogy [19] between an in- viation from flatness how close the population is to a steady homogeneous two-dimensional spin model and a population state. evolving with respect to selection and mutation, one could For the two-locus model the reproductive fitnesses of 00 define quite naturally the free energy per row as an effective and 01 are equal. However, their effective fitnesses are 2 measure of fitness. Here, however, I will use another defi- quite different being: feff (00, 0) = (1 − 0.9p +9.9p ) and Figure 4: Graph of effective fitness as a function of time for Figure 6: Graph of effective fitness at late times for the two- the one-locus model of section 4. locus model of section 4.

2 feff (01, 0)= (1+9p − 9.9p ) respectively, where initial pro- ness landscape which will gradually diminish as the popula- portions of all four states are taken to be equal. The effec- tion homogenizes. If the landscape only has a flat subspace tive fitness landscape in this problem is shown in Figure 5 the system will seek to escape the flat subspace by way of the for t = 0 with p = 0.05, P (00, 0) = 3, P (01, 0) = 0.2, direction with the highest effective fitness gradient. P (10, 0)= 0.4 and P (11, 0)= 0.1. Notice the initial effec- In the case of the NIAH landscape the effective fitness of the optimum string is

N feff (Copt ,t)= f0(1 − p)

P (Ci,t) d N−d +f1 p ij (1 − p) ij (15) P (Copt ,t) CXi6=Ci In Figure 7 we see a plot of the effective fitness as found in a simulation of the Eigen model in the steady state as a function of p. In this case f0 = 9 and f1 = 1. For p = 0,

feff (Copt , ∞) = f0 i.e. reproductive and effective fitness are the same. Note how the error threshold, located at about 0.2, manifests itself in terms of the effective fitness — that at and ¯ above the threshold feff (Copt ,t) → f(t) ≈ f1. i.e. once again the effective fitness landscape will be flat. The curve

for p>pcri is quite noisy due to the fact that there are very few optimum strings in the population. Thus, the effective Figure 5: Graph of effectivefitness at t =0 for the two-locus fitness itself can serve as an order parameter to distinguish model of section 4. the selection dominated regime from the mutation dominated one. tive fitness gradient from 00 to 01 that → 0 as t → ∞ as can Consider now the examples of section 5. In the first ex- be seen in Figure 6 which shows the same effective fitness ample, in Figure 8 we see the effective fitness as a function landscape as in Figure 5 but at late times. Once again note of time for the various two-schemata. Note the initial mono- the flatness of the effective landscape at late times. tonic ordering of the schemata — the longest being the most In the case of a strictly flat fitness landscape characteristic effectively fit. of neutral evolution the effective fitness is For the self-adaptive system if we concentrate for the mo- 2N −1 ment on just selection and mutation the effective fitness of a P (C ,t) j dij N−dij given string is feff (Ci,t)= f p (1 − p) (14) P (Ci,t) Xj=1 N feff (Ci,t)= f(Ci,t)(1 − pi) C C C For a homogeneous population feff ( i,t) = f∀ i,t. Thus, dij N−dij P ( j ,t) + pj (1 − pj ) f(Ci,t) (16) under these circumstances the effective fitness landscape is P (Ci,t) CXi6=Ci as flat as the normal one. Small deviations from homogene- ity will be manifest in small corrugations of the effective fit- where pi is the mutation rate associated with the genotype Ci. 0.004 l=2 l=3 l=4 0.002 l=5 l=6 l=7 0

-0.002

-0.004 F(l)

-0.006

-0.008

-0.01

-0.012 0 5 10 15 20 25 30 generation

Figure 8: Graph of effective fitness versus time for the uni- tation model of Figure 1.

Figure 7: Graph of effective fitness versus p in the steady All these examples show clearly how different the effec- state limit of the NIAH landscape. tive fitness landscape is from the reproductive fitness land- scape and how population flows can be understood much bet- To understand the behavior shown in Figure 2 we first realize ter within the framework of the former. that there are two contributions to feff (Ci,t): one from the genotype Ci itself and another from all the other genotypes 7 Conclusions that can mutate to it. For the first contribution we can see that genotypes with low mutation rates will be preferred while for In evolutionary computation, as in biology, it is obviously of the second contribution genotypes with higher mutation rates crucial importance to know what generic properties the effec- will tend to be preferred. Thus, we can see a quantitative way tive degrees of freedom exploited by a genetic system pos- of evaluating the explanation given in section 5. Initially, be- sess. Of particular concern is what constitutes a “fit” string. fore the landscape jump, selection dominated in that the first I have shown that in many situations high/low landscape re- term in (16) was the most important. Consequently the av- productive fitness does not correspond to high/low effective erage mutation rate decreased as genotypes of low pi were fitness and it is the latter that really determines reproductive more effectively fit. After the jump there is at first a pref- success. erence for the local optimum at 40, 41. This is due to the Several criticisms of the effective fitness concept come im- fact that both terms in the effective fitness are playing an im- mediately to mind. First of all, why do we need another type portant role whereas for the needle-like global optimum, as of landscape? Isn’t the present one good enough? After all, the original population before the jump had converged to 10, the notion of an adaptive landscape has turned out to be one of the most powerful concepts in evolutionary theory. One of 11, there was no contributionfrom the first term to feff (63,t). The fact that the average Hamming distance from 10 and 11 the strongest motivations for the fitness landscape concept is to 63 is 3.5, while from 10, 11 to 40, 41 it is 2.75, explains that it allows one to intuitively understand population flows as why the system first converges to 40, 41 before passing on to hill climbing processes, thus offering a compelling paradigm the global optimum. It is clear too why the average mutation for how evolution works. However, as I have demonstrated rate increases at two distinct points in the the search process. the hill climbing analogy is intimately linked to a certain type It is precisely when the second contribution in the effective of dynamics associated with pure selection. In the presence fitness is playing a dominant role in aiding the search. of other genetic operators hill climbing is the exception rather The giraffe example is rather similar to that of the self- than the norm and so the idea of a fitness landscape thereby adapting system, the master gene being analogous to the part loses some of its appeal. Another possible criticism is that of the genotypethat codes for the mutation and recombination effective fitness is intrinsically time-dependent. Doesn’t this rates in the latter. It plays no role in reproductive selection lead to an extra level of complication relative to that of a static but does play an important role in effective selection type one landscape? As I have argued there are several motivations for chromosomes having a higher effective fitness than type two thinking of a time dependent landscape as being more funda- during and after the drought. mental than a static one. First of all, it is more biologically realistic as environmental effects that affect fitness are almost [5] M.Eigen Self-Organization of Matter and the Evolution inevitably time dependent. Secondly, even if a landscape is of Biological Macromolecules, Naturwissenschaften static in terms of the microscopic degrees of freedom it will 58: 465 (1971). be time dependent in terms of any coarse grained degrees of freedom. On might also argue that the definition is tauto- [6] J. Horn and D.E. Goldberg, Diffi- logical — survival of the survivors. Such circularity is not culty and the Modality of Fitness Landscapes, FOGA very different to that which appears in other sciences such as 3, pp 243-271, eds D. Whitley and M.D. Vose, Morgan physics where one can level the same sort of criticism at an Kaufmann, San Francisco (1995). equation such as Newton’s Second Law. Such circularity is [7] K. Chellapilla, D.B. Fogel and S.S. Rao, Gaining In- usually a hallmark of a truly fundamentalconcept and its def- sight into Evolutionary Programming Through Land- inition. Additionally, in the present context we have a way scape Visualization: An Investigation into IIR Filtering, to, at least in principle, explicitly compute it in terms of the Evolutionary Programming VI, eds. P.J. Angeline, R.G. parameters associated with the various genetic operators. Reynolds, J.R. McConnell and R. Eberhardt, Springer Another advantage of the effective fitness concept is that Verlag, Berlin (1997). it allows one to quantitatively understand the different mech- anisms by which order may arise in evolution. i.e. that “or- [8] D.H. Ackley, An empirical study of bit vector optimiza- der” may arise due to the effect of genetic operators other tion, Genetic algorithms and simulated annealing, pp than selection. In particular it provides a framework within 170-204, ed L. Davis, Pitman, London (1987). which neutral evolution and natural selection can be under- stood as different sides of the same coin, and in particular un- [9] R.A. Fisher, The Genetical Theory of Natural Selection, der what circumstances neutral mutations may lead to adap- Clarendon Press, Oxford (1930). tive changes. In this sense the phenomenon of orthogenesis, [10] P. Stadler, Towards a theory of landscapes, Complex i.e. genetic drives in the presence of random mutations, is Systems and Binary Networks, pp 77-155, eds R. L´opez nothing more than the appearance of an effective fitness gra- Pe˜na et al, Springer (1995). dient in the case where there was no original reproductive fitness gradient. Clearly, much more work needs to be done [11] M.W. Strickberger, Evolution, Jones and Bartlett, in understanding the effective fitness concept and developing Boston (1990). an intuition for landscape analysis based on it rather than the traditional idea of fitness landscape. [12] J.D. Hofbauerand K. Sigmund, The Theory of Evolution and Dynamical Systems, Cambridge University Press (1992). Acknowledgements This work was partially supported through DGAPA-UNAM [13] C.R. Stephens and H. Waelbroeck, Analysis of the grant number IN105197. Many of the basic ideas presented Effective Degrees of Freedom in Genetic Algorithms, here were developed in collaboration with Henri Waelbroeck Physical Review E57, pp 3251-3264 (1998). to whom I am grateful for many stimulating conversations. [14] C.R. Stephens and H. Waelbroeck, Effective Degrees of I also wish to thank Sara Vera for help with the graphs and Freedom of Genetic Algorithms and the Block Hypoth- David Fogel for useful comments on the manuscript. esis, Proceedings of the Sixth International Conference on Genetic Algorithms, Morgan Kaufman, San Fran- Bibliography cisco, pp 34-41 (1997).

[1] S. Wright, The roles of mutation, inbreeding, cross- [15] M. Kimura, The Neutral Theory of Molecular Evolu- breeding and selection in evolution, In: Jones DF (ed) tion, Cambridge University Press, Cambridge (1983). Proceedings of the sixth international congress on ge- [16] C.R. Stephens, H. Waelbroeck and R. Aguirre, netics, vol 1, pp356-366, Brooklyn Botanic Garden, Schemata as Building Blocks: Does Size Matter?,to be New York (1932). published in FOGA 5, eds. C. Reeves and W. Banzhaf, [2] S. A. Kauffman, The Origins of Order, Oxford Univer- Morgan Kaufman, San Francisco (1999). sity Press, Oxford (1993). [17] C.R. Stephens, I. Garc´ıa Olmedo, J. Mora Vargas and [3] D.B. Fogel, Evolutionary Computation: The Fossil H. Waelbroeck, Self-Adaptation in Evolving Systems, Record, IEEE Press (1998). Artificial Life 4.2, pp 183-201 (1998). [4] T.C. Jones, Evolutionary Algorithms, Fitness Land- [18] J. Mora, H. Waelbroeck, C.R. Stephens and F. Zertuche scapes and Search, Ph. D thesis Univ. of New Mexico Symmetry Breaking and Adaptation: Evidence From a (1995). Simple Toy Model of a Viral Neutralization Epitope, National University of Mexico Preprint ICN-UNAM- 97-10 (adap-org/0797) to be published in Biosystems (1997). [19] I. Leutheusser, J. Chem. Phys. 84, pp. 1884 (1986).