The Evolution of Genetic Representations and Modular Neural
Total Page:16
File Type:pdf, Size:1020Kb
PhD thesis The evolution of genetic representations andmodularadaptation Dissertation zur Erlangung des Grades eines Doktors der Naturwissenschaften in der Fakult¨atf¨urPhysik und Astronomie der Ruhr-Universit¨atBochum Marc Toussaint Institut f¨urNeuroinformatik, Ruhr-Universit¨atBochum, ND 04, 44780 Bochum—Germany [email protected] March 31, 2003 Gutachter: Prof. Dr.-Ing. Werner von Seelen, Prof. Dr. Klaus Goeke Contents Introduction 7 1 Evolutionary adaptation 13 1.1 Introduction ........................................ 13 1.2 A theory on the evolution of phenotypic variability .................. 16 1.2.1 A prior in understanding evolution: There is No Free Lunch without assum- ing a constrained problem structure ...................... 16 If you do not presume that the problem has a structure, then any search strategy for good solutions is as good as random search. 1.2.2 How to assume and learn about a problem structure: A generic heuristic to adapt the search distribution .......................... 17 The structure of a problem is described by the structure of the distribution of good solutions—learning the structure of a problem means to adapt the search distribution toward the distribution of good solutions. 1.2.3 Evolutionary processes: A population, mutations, and recombination to rep- resent the search distribution .......................... 20 Evolution can be described as a stochastic dynamic process on the genotype space, com- parable to generic heuristic search, where the search distribution is parameterized by a finite parent population via mutation and recombination. 1.2.4 The genotype and phenotype: Reparameterizing the search space ...... 23 The mapping from genotype to phenotype is the key to understand complex phenotypic variability and evolution’s capability to adapt the search distribution on the phenotype space. 1.2.5 The topology of search: Genotype versus phenotype variability ....... 24 A simple, by mutation induced genotypic variability may, via a non-injective genotype- phenotype mapping, lead to arbitrarily complex phenotypic variability. 1.2.6 Commutativity and neutrality: When does phenotype evolution depend on neutral traits? .................................. 25 Neutral traits (of which strategy parameters are a special case) have an impact on phe- notype evolution if and only if they influence mutation probabilities and thus encode for different exploration distributions. 1.2.7 Behind this formalism .............................. 27 The abstract formalism developed so far relates to topics like evolving genetic represen- tations, strategy parameters, and evolvability. 1.2.8 Embedding neutral sets in the variety of exploration distributions ..... 29 The embedding defines a unique way of understanding neutral traits and will allow to derive an evolution equation for them. 4 CONTENTS 1.2.9 σ-evolution: A theorem on the evolution of phenotypic variability ...... 30 Exploration distributions naturally evolve towards minimizing the KL-divergence between exploration and the exponential fitness distribution, and minimizing the entropy of explo- ration. 1.2.10 Appendix — n-th order equivalence and n-th order delay .......... 32 How could evolution arrange to increase the probability for children to generate with high probability children that generate with high probability ...etc... children with high fitness? 1.2.11 Summary ..................................... 33 Recalling the steps toward this result. 1.3 Crossover, buildings blocks, and correlated exploration ................ 34 1.3.1 Two notions of building blocks of evolution .................. 34 The notion of building blocks as induced by crossover does not match the notion of func- tional phenotypic building blocks as it is induced by investigating correlated phenotypic variability. 1.3.2 The crossover GA: Explicit definitions of mutation and crossover ...... 35 In order to define crossover and derive results we assume that a genotype is composed of a finite number of genes and that crossover and mutation obey some constraints. 1.3.3 The structure of the mutation distribution ................... 37 Mutation increases entropy and decreases mutual information. 1.3.4 The structure of the crossover distribution ................... 39 Crossover destroys mutual information in the parent population by transforming it into entropy in the crossed population. 1.3.5 Correlated exploration .............................. 41 A precise definition of correlated exploration allows to pinpoint the difference between crossover exploration and correlated exploration in the case of EDAs. 1.3.6 Conclusions .................................... 43 Crossover is good to decorrelate exploration; is does not, as EDAs, induce complex ex- ploration. 1.4 Rethinking natural and artificial evolutionary adaptation .............. 45 1.4.1 From the DNA to protein folding: Genotype vs. phenotype variability in nature ....................................... 45 Without referring to its phenotypical meaning, the DNA is of a rather simple structure and variational topology. The codon translation is the first step in the GP-map that introduces neutrality and non-trivial phenotypic variability. Considering protein functionality as a phenotypic level, the GP-map from a protein’s primary to tertiary structure is complicated enough that neutral sets percolate almost all the primary structure space. 1.4.2 The operon and gene regulation: Structuring phenotypic variability .... 50 Considering the expression of genes as a phenotypic level, the operon is a paradigm for introducing correlations in phenotypic variability. 1.4.3 Flowers and Lindenmayer systems: An artificial model for a genotype-pheno- type mapping ................................... 52 Grammar-like genotype-phenotype mappings realize complex, correlated variability in ar- tificial evolutionary systems—similar to the operon—and induce highly non-trivial neu- trality. 1.4.4 Altenberg’s evolvable genotype-phenotype map ................ 54 Ultimately, discussing Lee Altenberg’s model of the evolution of the genotype-phenotype mapping pinpoints our point of view on the subject. CONTENTS 5 1.5 A computational model ................................. 57 1.5.1 The genotype and genotype-phenotype mapping ............... 57 An abstract model of ontogenetic development mimics the basic operon-like mechanism to induce complex phenotypic variability. 1.5.2 2nd-type mutations for the variability of exploration ............. 58 A new type of structural mutations allows for reorganizations of the genetic representa- tions and exploration of the respective neutral sets. 1.5.3 First experiment: Neutral σ-evolution ..................... 59 Genetic representations reorganize themselves in favor of short genome length and mod- ular phenotypic variability. 1.5.4 Second experiment: σ-evolution for phenotypic innovation .......... 64 Genetic representations develop in favor of highly correlated phenotypic variability allowing for simultaneous phenotypic innovations. 1.5.5 Third experiment: Evolving artificial plants .................. 66 Evolving artificial plants is an illustrative demonstration of the evolution of genetic rep- resentations to encode large-scale, structured, self-similar phenotypes. 1.6 Conclusions ........................................ 77 2 Neural adaptation 79 2.1 Introduction ........................................ 79 2.2 Functional modularity in NN .............................. 83 2.2.1 Decomposed adaptation ............................. 83 The adaptation dynamics of a neural system can be described as a Markov process—the structure of the transition distribution constitutes the notions of correlated adaptation and functional modules of a neural system. 2.2.2 The geometry of adaptation ........................... 85 The relation between a neural system’s free parameters and its functional traits can be described geometrically. The induced metric on the functional space determines the correlatedness of adaptation. Ordinary gradient online learning ........................ 85 Natural gradient online learning ........................ 88 2.2.3 An example: Comparing conventional NN with multi-expert models .... 89 An example demonstrates how the theory precisely describes the difference in adapta- tion dynamics between a conventional NN and a multi-expert w.r.t. their coadaptation behavior. 2.2.4 Generic properties of the class of conventional NN and multi-expert models 90 The statistics of coadaptational features in the space of all neural networks of a given architecture and in the space of all multi-expert systems of a given architecture exhibits a generic difference. 2.2.5 Summary ..................................... 93 2.3 A space of more modular neural systems ........................ 94 2.3.1 A neural model for multi-expert architectures ................. 94 A generalization of conventional artificial neural networks allows for a functional equiva- lence to multi-expert systems. Gradient learning ................................. 97 EM-learning .................................... 98 6 CONTENTS Q-learning ..................................... 100 Oja-Q learning .................................. 101 2.3.2 Empirical tests .................................. 102 Some tests of the proposed architecture and the learning schemes on a task similar to the what-and-where task demonstrate the functionality. Conclusions 105 Appendix 107 Theorems, definitions and symbols in chapter 1 .................... 108 Theorems, definitions and symbols