Evolvability ES: Scalable and Direct Optimization of Evolvability

Alexander Gajewski (Columbia University; work done during an internship at Uber AI Labs), Jeff Clune (Uber AI Labs and University of Wyoming), Kenneth O. Stanley (Uber AI Labs), Joel Lehman (Uber AI Labs)

ABSTRACT
Designing evolutionary algorithms capable of uncovering highly evolvable representations is an open challenge in evolutionary computation; such evolvability is important in practice, because it accelerates evolution and enables fast adaptation to changing circumstances. This paper introduces evolvability ES, an evolutionary algorithm designed to explicitly and efficiently optimize for evolvability, i.e. the ability to further adapt. The insight is that it is possible to derive a novel objective in the spirit of natural evolution strategies that maximizes the diversity of behaviors exhibited when an individual is subject to random mutations, and that efficiently scales with computation. Experiments in 2-D and 3-D locomotion tasks highlight the potential of evolvability ES to generate solutions with tens of thousands of parameters that can quickly be adapted to solve different tasks and that can productively seed further evolution. We further highlight a connection between evolvability in EC and a recent and popular gradient-based meta-learning algorithm called MAML; results show that evolvability ES can perform competitively with MAML and that it discovers solutions with distinct properties. The conclusion is that evolvability ES opens up novel research directions for studying and exploiting the potential of evolvable representations for deep neural networks.

CCS CONCEPTS
• Computing methodologies → Genetic algorithms; Neural networks;

KEYWORDS
Evolvability, neuroevolution, evolution strategy, meta-learning

1 INTRODUCTION
One challenge in evolutionary computation (EC) is to design algorithms capable of uncovering highly evolvable representations; though evolvability's definition is debated, the idea is to find genomes with great potential for further evolution [2, 10, 15, 19, 21, 26, 33, 43]. Here, as in previous work, we adopt a definition of evolvability as the propensity of an individual to generate phenotypic diversity [21, 23, 26]. Such evolvability is important in practice, because it broadens the variation accessible through mutation, thereby accelerating evolution; improved evolvability thus would benefit many areas across EC, e.g. evolutionary robotics, open-ended evolution, and quality diversity (QD; [22, 32]). While evolvability is seemingly ubiquitous in nature (e.g. the amazing diversity of dogs accessible within a few generations of breeding), its emergence in evolutionary algorithms (EAs) is seemingly rare [33, 43], and how best to encourage it remains an important open question.
There are two general approaches to encourage evolvability in EAs. The first is to create environments or selection criteria that produce evolvability as an indirect consequence [6, 10, 17, 21, 33]. For example, environments wherein goals vary modularly over generations may implicitly favor individuals better able to adapt to such variations [17]. The second approach, which is the focus of this paper, is to select directly for evolvability, i.e. to judge individuals by directly testing their potential for further evolution [26]. While the first approach is more biologically plausible and is important to understanding natural evolvability, the second benefits from its directness, its potential ease of application to new domains, and its ability to enable the study of highly-evolvable genomes without fully understanding evolvability's natural emergence. However, current implementations of such evolvability search [26] suffer from their computational cost.

A separate (but complementary) challenge in EC is that of effectively evolving large genomes. For example, there has been recent interest in training deep neural networks (DNNs) because of their potential for expressing complex behaviors, e.g. playing Atari games from raw pixels [28]. However, evolving DNNs is challenging because they have many more parameters than genomes typically evolved by comparable approaches in EC (e.g. neural networks evolved by NEAT [38]). For this reason, the study of scalable EAs that can benefit from increased computation is of recent interest [5, 34, 40], and evolvability-seeking algorithms will require similar considerations to scale effectively to large networks.
This paper addresses both of these challenges, by synthesizing three threads of research in deep learning and EC. The first thread involves a popular gradient-based meta-learning algorithm called MAML [11] that searches for points in the search space from which one (or a few) optimization step(s) can solve diverse tasks. We introduce here a connection between this kind of parameter-space meta-learning and evolvability, as MAML's formulation is very similar to that of evolvability search [26], which searches for individuals from which mutations (instead of optimization) yield a diverse repertoire of behaviors. MAML's formulation, and its success with DNNs on complicated reinforcement learning (RL) tasks, hints that there may similarly be efficient and effective formulations of evolvability. The second thread involves the recent scalable form of evolution strategy (ES) of Salimans et al. [34] (which at heart is a simplified form of natural evolution strategy [45]) shown to be surprisingly competitive with gradient-based RL. We refer to this specific algorithm as ES in this paper for simplicity, and note that the field of ES as a whole encompasses many diverse algorithms [3, 36]. The final thread is a recent formalism called stochastic computation graphs (SCGs) [35], which enables automatic derivation of gradient estimates that include expectations over distributions (such as the objective optimized by ES). We here extend SCGs to handle a larger class of functions, which enables formulating an efficient evolvability-inspired objective.

Weaving together these three threads, the main insight in this paper is that it is possible to derive a novel algorithm, called evolvability ES, that optimizes an evolvability-inspired objective without incurring any additional overhead in domain evaluations relative to optimizing a traditional objective with ES. Such efficiency is possible because each iteration of ES can aggregate information across samples to estimate local gradients of evolvability.
The experiments in this paper demonstrate the potential of evolvability ES in two deep RL locomotion benchmarks, wherein evolvability ES often uncovers a diversity of high-performing behaviors using the same computation required for ES to uncover a single one. Further tests highlight the potential benefits of such evolvable genomes for fast adaptability, show that evolvability ES can optimize a form of population-level evolvability [46], and demonstrate that evolvability ES performs competitively with the popular MAML gradient-based method (while discovering pockets of the search space with interestingly distinct properties). The conclusion is that evolvability ES is a promising new algorithm for producing and studying evolvability at deep-learning scale.

2 BACKGROUND

2.1 Evolvability and Evolvability Search
No consensus exists on the measure or definition of evolvability [31]; the definition we adopt here, as in prior work [21, 23, 26], follows one mainstream conception of evolvability as phenotypic variability [4, 18, 31], i.e. the phenotypic diversity demonstrated by an individual's offspring. Exact definition aside, reproducing the evolvability of natural organisms is an important yet unmet goal in EC. Accordingly, researchers have proposed many approaches for encouraging evolvability [6, 10, 17, 19, 21, 29, 33]. This paper focuses in particular on extending the ideas from evolvability search [26], an algorithm that directly rewards evolvability by explicitly measuring evolvability and guiding search towards it. The motivation is that it often may be more straightforward to directly optimize evolvability than to encourage its indirect emergence, that it may help compare the advantages or drawbacks of different quantifications of evolvability [24], and that it may facilitate the study of evolvability even before its natural emergence is understood.

In evolvability search, the central idea is to calculate an individual's fitness from domain evaluations of many of its potential offspring. In particular, an individual's potential to generate phenotypic variability is estimated by quantifying the diversity of behaviors demonstrated from evaluating a sample of its offspring. Note the distinction between evolvability search and QD: QD attempts to build a collection of diverse well-adapted genomes, while evolvability search attempts to find genomes from which diverse behaviors can readily be evolved. In practice, evolvability search requires (1) quantifying dimensions of behavior of interest, as in the behavior characterizations of novelty search [20], and (2) a distance threshold to formalize what qualifies as two behaviors being distinct. Interestingly, optimizing evolvability, just like optimizing novelty, can sometimes lead to solving problems as a byproduct [26]. However, evolvability search is computationally expensive (it requires evaluating enough offspring of each individual in the population to estimate its potential), and has only been demonstrated with small neural networks (NNs); evolvability ES addresses both issues.

2.2 MAML
Meta-learning [42] focuses on optimizing an agent's learning potential (i.e. its ability to solve new tasks) rather than its immediate performance (i.e. how well it solves the current task) as is more typical in optimization and RL, and has a rich history both in EC and machine learning at large. The most common forms of neural meta-learning algorithms train NNs to implement their own learning algorithms, either through exploiting recurrence [16, 39, 44] or plastic connections [12, 27, 37, 39].

However, another approach is taken by the recent and popular model-agnostic meta-learning (MAML) algorithm [11]. Instead of training a NN that itself can learn from experience, MAML searches for a fixed set of NN weights from which a single additional optimization step can solve a variety of tasks. The insight is that it is possible to differentiate through the optimization step, to create a fully gradient-based algorithm that seeks adaptable areas of the search space. Interestingly, MAML's formulation shares deep similarities with the idea of evolvability in EC: Both involve finding points in the search space nearby to diverse functional behaviors. One contribution here is to expose this connection between MAML and evolvability and explore it experimentally.

Intriguingly, MAML has been successful in training DNNs with many parameters to quickly adapt in supervised and RL domains [11]. These results suggest that evolvable DNNs exist and can sometimes be directly optimized towards, a key source of inspiration for evolvability ES.

2.3 Natural Evolution Strategies
The evolvability ES algorithm introduced here builds upon the ES algorithm of Salimans et al. [34], which itself is based on the Natural Evolution Strategies (NES; [45]) black-box optimization algorithm. Because in such a setting the gradients of the function to be optimized, f(z), are unavailable, NES instead creates a smoother version of f that is differentiable, by defining a population distribution π(z; θ) and setting the smoothed loss function J(θ) = E_{z∼π}[f(z)]. This function is then optimized iteratively with gradient descent, where the gradients are estimated by samples from π.

Salimans et al. [34] showed recently that NES with an isotropic Gaussian distribution of fixed variance (i.e. π = N(µ, σI) and θ = µ) is competitive with deep RL on high-dimensional RL tasks, and can be very time-efficient because the expensive gradient estimation is easily parallelizable. Salimans et al. [34] refer to this algorithm as Evolution Strategies (ES), and we adopt their terminology in this work, referring to it in the experiments as standard ES (to differentiate it from evolvability ES).

Connecting to traditional EAs, the distribution π can be viewed as a population of individuals that evolves by one generation at every gradient step, where µ can be viewed as the parent individual, and samples from π can be termed that parent's pseudo-offspring cloud. NES is formulated differently from other EAs; unlike them it operates by gradient descent. Because of NES's formulation, the desired outcome (i.e. to increase the average fitness f) can be represented by a differentiable function of trainable parameters of the population distribution. We can then analytically compute the optimal update rule for each generation: to take a step in the direction of the gradient. For such algorithms, the burden shifts from designing a generational update rule to designing an objective function. Thus, to enable fast experimentation with new objective functions, ES-like algorithms can benefit from automatic differentiation, an idea explored in the next two sections.
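To make the ES update concrete, the following minimal sketch (ours, not the authors' released implementation) estimates the gradient of J(θ) = E_{z∼N(µ, σ²I)}[f(z)] with the standard score-function estimator and takes one ascent step. The names fitness_fn, mu, sigma, and lr are illustrative placeholders, and the sketch z-scores fitness where the paper instead rank-normalizes it (see supplemental section S1).

import numpy as np

def es_generation(fitness_fn, mu, sigma=0.02, pop_size=10000, lr=0.01):
    # Sample a pseudo-offspring cloud around the central individual mu.
    eps = np.random.randn(pop_size, mu.size)
    fitness = np.array([fitness_fn(mu + sigma * e) for e in eps])
    # Normalize fitness for stability (the paper rank-normalizes; z-scoring is used here for brevity).
    fitness = (fitness - fitness.mean()) / (fitness.std() + 1e-8)
    # Score-function estimate of the gradient of E[f(mu + sigma * eps)] with respect to mu.
    grad = (fitness[:, None] * eps).mean(axis=0) / sigma
    return mu + lr * grad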
PyTorch [1] search of evaluating potential offspring of all individuals, because and Tensorflow30 [ ]), functions are represented as computation in an algorithm like ES that represents the population as a formal graphs, with input nodes representing a function’s arguments, and distribution over the search space, information about local diversity inner nodes representing compositions of certain elementary func- is shared between individuals; a further benefit is the computational tions like addition or exponentiation. The significance of computa- efficiency and scalability of ES itself, which this algorithm retains. tion graphs is that they support automatic differentiation, enabling While such an insight sounds similar to what motivates novelty- easy use of gradient descent-like algorithms with novel architec- driven ES algorithms [7, 9], there is a subtle but important differ- tures and loss functions. Note, however, that these computation ence: Novelty-driven ES drives the central individual of ES towards graphs are deterministic, as these variables have no probability dis- behaviors unseen in previous generations, while evolvability ES tributions associated with them. Recently introduced in Schulman instead aims to optimize the diversity of offspring of the current et al. [35], Stochastic Computation Graphs (SCGs) are a formalism central individual. Thus the product of evolvability ES is an evolv- adding support for random variables. Importantly, like deterministic able central individual whose mutations lead to diverse behaviors computation graphs, SCGs support automatic differentiation. (note that an experiment in section 5.4 extends evolvability ES to SCGs are thus a natural way to represent ES-like algorithms: include multiple central individuals). We now formally describe the 1 The function ES tries to optimize is an expectation over individuals method, and two concrete variants . in the population of some function of their parameters. In this way, automatic differentiation of SCGs enables rapid experimentation 4.1 Formal Description with modifications of ES to different loss functions or population Consider the isotropic Gaussian distribution π of ES. Recall the distributions; more details about SCGs can be found in supplemental analogy wherein this distribution represents the population, and sections S2 and S3. The next section extends SCGs, which is needed the distribution mean can be seen as the central individual, or par- to explore certain formulations of evolvability ES. ent. By the definition adopted here, evolvable points in the search space can through mutation produce many different behaviors: In 3 NESTED SCGS other words, mutations provide many different options for natu- While SCGs are adequate for simple modifications of ES (e.g. chang- ral selection to select from. Thus, as in evolvability search [26], ing the population distribution from an isotropic Gaussian to a our aim is to maximize some statistic of behavioral diversity of an Gaussian with diagonal covariance), they lack the expressiveness individual’s mutations. Formally, behavior is represented as a be- needed to enable certain more complex objectives. For example, havior characteristic (BC; [20]), a vector-valued function mapping ′ consider an ES-like objective of the form Ez [f (Ez′ [д(z, z )])] (the form of one variant of evolvability ES). 
4 APPROACH: EVOLVABILITY ES
One motivation for evolvability ES is to create an algorithm similar in effect to evolvability search [26] but without its immense computational requirements, such that it can scale efficiently with computation to large DNNs. To enable such an algorithm, a driving insight is that the same domain evaluations exploited in standard ES to estimate gradients of fitness improvement also contain the information needed to estimate gradients of evolvability improvement. That is, ES estimates fitness improvement by sampling and evaluating policies drawn from the neighborhood of a central individual, and updates the central individual such that future draws of the higher-fitness policies are more likely. Similarly, the central individual could be updated to instead make more likely those policies that contribute more to the diversity of the entire population-cloud.

The result is an algorithm that does not incur evolvability search's cost of evaluating potential offspring of all individuals, because in an algorithm like ES that represents the population as a formal distribution over the search space, information about local diversity is shared between individuals; a further benefit is the computational efficiency and scalability of ES itself, which this algorithm retains.

While such an insight sounds similar to what motivates novelty-driven ES algorithms [7, 9], there is a subtle but important difference: Novelty-driven ES drives the central individual of ES towards behaviors unseen in previous generations, while evolvability ES instead aims to optimize the diversity of offspring of the current central individual. Thus the product of evolvability ES is an evolvable central individual whose mutations lead to diverse behaviors (note that an experiment in section 5.4 extends evolvability ES to include multiple central individuals). We now formally describe the method, and two concrete variants (source code is available at: https://github.com/uber-research/Evolvability-ES).

4.1 Formal Description
Consider the isotropic Gaussian distribution π of ES. Recall the analogy wherein this distribution represents the population, and the distribution mean can be seen as the central individual, or parent. By the definition adopted here, evolvable points in the search space can through mutation produce many different behaviors: In other words, mutations provide many different options for natural selection to select from. Thus, as in evolvability search [26], our aim is to maximize some statistic of behavioral diversity of an individual's mutations. Formally, behavior is represented as a behavior characteristic (BC; [20]), a vector-valued function mapping a genome z to behaviors B(z). For example, in a locomotion task, a policy's behavior can be its final position on a plane.

Here, we consider two different diversity statistics, which lead to maximum variance (MaxVar) and maximum entropy (MaxEnt) variants of evolvability ES, here called MaxVar-EES and MaxEnt-EES. MaxVar-EES maximizes the trace of the covariance matrix of the BC over the population. The motivation is that the variance of a distribution captures one facet of the distribution's diversity. This formulation results in the following loss function:

J(θ) = Σ_j E_z[(B_j(z) − µ_j)²],   (1)

where the expectation is over policies z ∼ π(·; θ), the summation is over components j of the BC, and µ_j represents the mean of the jth component of the BC.

MaxEnt-EES maximizes a distribution's entropy rather than its variance. We expect in general this method to behave more similarly to evolvability search than MaxVar-EES, because over a fixed, bounded region of support in behavior space, the maximum-entropy distribution of behavior is a uniform distribution, which also would maximize evolvability search's fitness criterion, i.e. to maximize the amount of distinct behaviors displayed by offspring. To estimate entropy, we first compute a kernel-density estimate of the distribution of behavior for a chosen kernel function φ, giving

p(B(z); θ) ≈ E_z'[φ(B(z') − B(z))],   (2)

which can be applied to derive a loss function which estimates the entropy:

J(θ) = −E_z[log E_z'[φ(B(z') − B(z))]].   (3)

In practice, these losses are differentiated with PyTorch (MaxEnt-EES in particular depends on the nested SCGs described earlier) and both the losses and their gradients are then estimated from samples (recall that these samples all come from the current population distribution, and information from all samples is integrated to estimate the direction of improving evolvability for the population as a whole). The gradient estimates are then applied to update the population distribution, enabling a generation of evolution.
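As a concrete illustration of equations (1) through (3), the hedged sketch below estimates both diversity statistics from a matrix of sampled behavior characteristics (one row per pseudo-offspring), using a Gaussian kernel for φ. Variable names and the bandwidth are illustrative, the BCs are assumed to have been whitened per generation as in supplemental section S1, and gradients with respect to the population parameters would additionally require the log-probability-weighted surrogate construction of the supplemental sections, which this sketch omits; the released code may differ.

import torch

def maxvar_objective(bcs):
    # Eq. (1): summed per-dimension variance of the behavior characteristic, to be maximized.
    return ((bcs - bcs.mean(dim=0)) ** 2).sum(dim=1).mean()

def maxent_objective(bcs, bandwidth=1.0):
    # Eq. (2): Gaussian kernel-density estimate of the behavior density at each sample.
    sq_dists = torch.cdist(bcs, bcs) ** 2                       # pairwise ||B(z') - B(z)||^2
    density = torch.exp(-sq_dists / (2 * bandwidth ** 2)).mean(dim=1)
    # Eq. (3): entropy estimate, -E_z[log p(B(z))], to be maximized.
    return -density.log().mean()

bcs = torch.randn(1000, 2)   # stand-in for final (x, y) positions of sampled policies
print(maxvar_objective(bcs), maxent_objective(bcs))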
4.2 Proof of Feasibility
It may at first seem unreasonable to expect that evolvability ES could even in theory optimize the objective set out for it, i.e. to uncover a particular region of DNN weight-space such that mutations can produce the arbitrarily different mappings from input to output necessary to demonstrate diverse behaviors. In addition to the empirical results in this paper, we also prove by construction that such parts of the search space do exist.

Specifically, we show that given any two continuous policies, it is possible to construct a five-layer network from which random mutations result in "flipping" between the two initial policies. This construction is described in supplemental section S6.

5 EXPERIMENTS

5.1 Interference Pattern Task
To help validate and aid understanding of the evolvability ES approach, consider the interference pattern in figure 1a, generated as the product of a high-frequency and a low-frequency sine wave. In this figure, the horizontal axis represents a single-dimensional parameter space, and the vertical axis represents the single-dimensional BC. In this task, evolvability is maximized at the maximum of the slower sine wave, because this is the point at which the amplitude of the faster sine wave is greatest, meaning that mutation will induce the widest distribution of vertical change. Figure 1 shows that both variants of evolvability ES approached the optimal value in parameter space as training progressed. The next sections demonstrate that the algorithm also successfully scales to more complex domains.

5.2 2-D Locomotion Task
We next applied evolvability ES to a physically-simulated robotics domain. In this 2-D locomotion task, a NN policy with 74,246 parameters controlled the "Half-Cheetah" planar robot from the PyBullet simulator [8]; this simulation and NN architecture are modeled after the Half-Cheetah deep RL benchmark of Finn et al. [11]. The NN policy received as input 26 variables consisting of the robot's position, velocity, and the angle and angular velocity of its joints; the policy controlled the robot through NN outputs that are interpreted as torques applied to each of the robot's 6 joints. In this domain, we characterized behavior as the final horizontal offset of the robot after a fixed number of simulation steps. All population sizes were 10,000; other hyperparameters and algorithmic details can be found in supplemental section S1.

Figure 2 shows the distribution of behaviors in the population over training time for standard ES as well as both evolvability ES variants. First, these results show that, perhaps surprisingly, the results from the interference pattern task generalize to domains with complex dynamics and large NNs. In particular, evolvability ES discovered policies such that small mutations resulted in diametrically opposed behaviors, i.e. approximately half of all mutations caused the robot to move left, while the others resulted in it moving right. Videos of evaluations of these policies (available at http://t.uber.com/evolvabilityes) show the striking difference between the rightward and leftward behaviors, highlighting the non-triviality of "fitting" both of these policies into the same NN. Interestingly, both evolvability ES variants produced similar distributions of behavior, even though MaxEnt-EES's objective is explicitly intended to incentivize more uniform distributions of behavior.

Supplemental figure S1 shows raw locomotion ability over training in this domain, i.e. the mean distance from the origin randomly sampled policies travel. Note that this is the metric optimized by standard ES (and it does not reflect evolvability); standard ES policies indeed moved further on average than both MaxVar- and MaxEnt-EES (p < 0.05, 12 runs; all tests unless otherwise specified are Mann-Whitney U tests). However, on average the MaxVar- and MaxEnt-EES variants yielded policies which moved 92.1% and 93.9% as far as those found by standard ES, so the performance price for evolvability is minimal. There was no significant difference between MaxVar- and MaxEnt-EES (p > 0.1, 12 runs).

5.2.1 Comparison to MAML. Evolvability ES takes inspiration from MAML's ability to successfully find policies near in parameter space to diverse behaviors [11] and to (like evolvability-inspired methods) adapt quickly to new circumstances. Two questions naturally arise: (1) how does evolvability ES compare to MAML in enabling fast adaptation, and (2) is MAML itself drawn towards evolvable parts of the search space, i.e. are mutations of MAML solutions similar to those of evolvability ES solutions?


Figure 1: Interference Pattern Results. Panels: (a) Interference Pattern, (b) MaxVar Evolvability ES, (c) MaxEnt Evolvability ES. In the interference pattern task, a genome consisted of a single floating-point parameter (x), and the resulting behavior was generated as a function of the genome parameter by f(x) = 5 sin(x/5) sin(20x). (a) shows the plot of behavior (vertical axis) as a function of genome (horizontal axis); (b) and (c) plot the genome over generations of training. The training plots shown in (b) and (c) validate both evolvability ES variants, showing that they converged to the point with behavior that is most sensitive to perturbations, shown as a dashed line in all three plots.


Figure 2: Distribution of behaviors across evolution in the 2-D locomotion domain. Heat-maps of the final horizontal positions of 10,000 policies sampled from the population distribution are shown for each generation over training time for representative runs of each method. These plots suggest that both evolvability ES methods discovered policies that travel nearly as far both backwards and forwards as standard ES travels forward alone (note that standard ES is rewarded for traveling forward).

To address these questions, MAML was trained in the 2-D locomotion task. Recall that MAML learns a policy which can readily adapt to a new task through further training. MAML, unlike evolvability ES, requires a distribution of training tasks to adapt to rather than a BC that recognizes distinct behaviors. MAML's task distribution consisted of two tasks of equal probability, i.e. to walk left or right. MAML additionally makes use of per-timestep rewards (rather than per-evaluation information as in EC), given here as the distance travelled that timestep in the desired direction. All such details were similar to MAML's canonical application to this domain [11]; see supplemental section S1 for more information.

To explore how evolvability ES compares to MAML in enabling fast adaptation, we iteratively chose tasks, here directions to walk in, and allowed each algorithm to adapt to each given task. For evolvability ES, when given a task to adapt to, the central individual was mutated 40 times, and the mutation performing best was selected (and its reported performance calculated from 10 separate evaluations with different random seeds). For MAML, in an RL environment like the 2-D locomotion task, first the unmodified learned policy was evaluated in a new task (many times in parallel to reduce noise in the gradient), and then a policy-gradient update modified the MAML solution to solve the new task. During such test-time adaptation, MaxEnt-EES performed statistically significantly better than MAML (p < 0.005, 12 runs), walking further on average in the desired direction after mutation and selection than MAML after adaptation. There was no significant difference between MaxVar-EES and MAML (p > 0.1, 12 runs), and MaxEnt-EES was significantly better than MaxVar-EES (p < 0.05, 12 runs). The conclusion is that evolvability ES in this domain is competitive with the leading gradient-based approach for meta-learning.

To compare what areas of the search space are discovered by MAML and evolvability ES, we explored what distribution of behaviors resulted from applying random mutations to the solutions produced by MAML. First, we subjected MAML policies to mutations with the same distribution as that used by evolvability ES, and then recorded the final positions of each of these offspring policies. Figure 3 shows the resulting distributions of behaviors for representative runs of MAML and both EES variants. Qualitatively, the MAML policies are clumped closer to the origin (i.e. they typically move less far), and have less overall spread (i.e. their variance is lower). Both of these observations are statistically significant over 12 runs of each algorithm: The mean final distance from the origin for both evolvability ES variants was higher than for MAML (p < 0.00005 for both), and the variance of the distribution of final distances from the origin was higher for both evolvability ES variants than for MAML (p < 0.00005 for both). We also tested smaller perturbations of the MAML policy, under the hypothesis that gradient steps may be smaller than evolvability ES mutations, but found similar results; see supplemental figure S5. These two results together suggest that evolvability ES does indeed find more evolvable representations than MAML.

5.3 3-D Locomotion Task
The final challenge domain applied evolvability ES to a more complex 3-D locomotion task that controls a four-legged robot; here evolvability ES aimed to produce a genome with offspring that can locomote in any direction. In particular, in the 3-D locomotion task, a NN policy with 75,272 parameters controlled the "Ant" robot from the PyBullet simulator [8], modeled after the domain in the MuJoCo simulator [41] that serves as a common benchmark in deep RL [11]. The 28 variables the policy received as inputs consisted of the robot's position and velocity, and the angle and angular velocity of its joints; the NN output torques for each of the ant's 8 joints. In this domain, a policy's BC was its final (x, y) position.

Figure 4 shows the population of behaviors after training (note that standard ES is rewarded for traveling forward). Interestingly, both evolvability ES variants formed a ring of policies, i.e. they found parameter vectors from which nearly all directions of travel were reachable through mutations. For evolvability ES there were few behaviors in the interior of the ring (i.e. few mutations were degenerate), whereas standard ES exhibited a trail of reduced performance (i.e. its policy was less robust to mutation). Supporting this observation, for standard ES more policies ended their evaluation less than 5 units from the origin than did those sampled from either variant of evolvability ES (p < 0.01, 12 runs).

Supplemental figure S2 shows locomotion ability across training. As in the 2-D locomotion domain, standard ES policies moved further on average than MaxVar- and MaxEnt-EES (p < 0.05, 12 runs). However, the MaxVar- and MaxEnt-EES variants yielded policies which moved on average 85.4% and 83.2% as far as those from standard ES, which strikes a reasonable trade-off given the diversity of behaviors uncovered by evolvability ES relative to the single policy of standard ES. There was no significant difference in performance between MaxVar- and MaxEnt-EES (p > 0.1, 12 runs).

5.3.1 Seeding Further Evolution. While previous experiments demonstrate that evolvability ES enables adaptation to new tasks without further evolution, a central motivation for evolvability is to accelerate evolution in general. As an initial investigation of evolvability ES's ability to seed further evolution, we used trained populations from evolvability ES as initializations for standard ES, and rewarded the agent for walking as far as possible in a particular direction, here, for simplicity, the positive x direction.

Figure 6 shows a heat-map of behavior characteristics for the pre-trained populations produced by MaxEnt-EES, as well as heat-maps following up to twenty standard ES updates (for a similar MaxVar-EES plot see supplemental figure S3). The population successfully converged in accordance with the new selection pressure, and importantly, it adapted much more quickly than a randomly initialized population, as shown in figure 5. Indeed, further standard ES training of both variants of evolvability ES resulted in policies which moved statistically significantly further in the positive x direction than a random initialization after 5, 10, and 20 generations of adaptation (p < 0.0005, 12 runs). There was no significant difference between MaxVar- and MaxEnt-EES (p > 0.1, 12 runs).

5.4 Optimizing Population-Level Evolvability
Finally, we explore (in the 3-D locomotion domain) the potential to evolve what Wilder and Stanley [46] term population-level evolvability, i.e. to consider evolvability as the sum of behaviors reachable from possible mutations of all individuals within a population; this is in contrast to the individual evolvability targeted by evolvability search and evolvability ES as described so far (both reward behavioral diversity accessible from a single central individual). E.g. in nature different organisms are primed to evolve in different directions, and one would not expect a dandelion to be able to quickly adapt into a dinosaur, though both may be individually highly-evolvable in certain phenotypic dimensions.

Indeed, for some applications, allowing for many separately-evolvable individuals might be more effective than optimizing for a single evolvable individual, e.g. some environments may entail too many divergent behaviors for the neighborhood around a single central individual to encompass them all. Interestingly, the idea of joint optimization of many individuals for maximizing evolvability is naturally compatible with evolvability ES. The main insight is recognizing that the population distribution need not be unimodal; as an initial investigation of this idea, we experimented with a population distribution that is the mixture of two unimodal distributions, each representing a different species of individuals.

As a proof of concept, we compared two multi-modal variants of evolvability ES, where the different modes of a Gaussian Mixture Model (GMM; [25]) distribution can learn to specialize. In the first variant (vanilla GMM), we equipped evolvability ES with a multi-modal GMM population, where each mode was equally likely and was separately randomly initialized. In the second variant (splitting GMM), we first trained a uni-modal population (as before) with evolvability ES until convergence, then seeded a GMM from it, by initializing the new component means from independent samples from the pre-trained uni-modal population (to break symmetry). The second variant models speciation, wherein a single species splits into subspecies that can further specialize. A priori, it is unclear whether the different modes will specialize into different behaviors, because the only optimization pressure is for the union of both modes to cover a large region of behavior space. However, one might expect that more total behaviors would be covered if the species do specialize.
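For the multi-modal variants, a minimal sketch (ours, not the released implementation) of drawing the pseudo-offspring cloud from an equal-weight, two-component Gaussian mixture is given below; the mixture log-probabilities needed for the surrogate gradient are omitted, and all names are illustrative.

import torch

def sample_gmm_offspring(means, sigma=0.02, pop_size=10000):
    # means: tensor of shape (num_components, num_params), one row per central individual.
    component = torch.randint(means.shape[0], (pop_size,))   # equal-probability component choice
    centers = means[component]
    offspring = centers + sigma * torch.randn_like(centers)
    return offspring, component

# Example: a vanilla two-component GMM population over a 75,272-parameter policy.
means = torch.randn(2, 75272) * 0.01
offspring, component = sample_gmm_offspring(means)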

Figure 3: Distribution of behaviors compared to MAML in the 2-D locomotion domain. Histograms of the final x positions of 1,000 policies sampled from the final population distribution are shown for each variant of evolvability ES, as well as for perturbations of MAML solutions of the same size as those of evolvability ES. These plots suggest that both evolvability ES methods discovered more evolvable regions of parameter space than did MAML.


Figure 4: Distribution of behaviors in the final population in the 3-D locomotion domain. Shown are heat-maps (taken from representative runs) of the final positions of 10,000 policies sampled from the population distribution at generation 100 for each of (a) standard ES, (b) MaxVar-EES, and (c) MaxEnt-EES. These plots suggest that both evolvability ES variants successfully found policies which moved in many different directions, and roughly as far as standard ES traveled in the positive x direction alone.

Figure 7 and supplemental figure S4 show the qualitative performance of these two variants. These figures first show that specialization did emerge in both variants, without direct pressure to do so, although there was no significant difference in how the speciated version covered the space relative to individual-based evolvability ES (p > 0.1, 6 runs). This result shows that intriguingly it is possible to directly optimize population-level evolvability, but its benefit may come only from domains with richer possibilities for specialization (e.g. an environment in which it is possible to either swim or fly), a direction to be explored in future work.

Figure 5: Adaptation performance. The plot compares the performance of standard ES adapting to a new behavior when seeded with a random initialization, or with initializations from completed runs of MaxVar- and MaxEnt-EES. Mean and standard deviation over 12 runs shown. Populations trained by both methods of evolvability ES evolved more quickly than randomly initialized populations.

6 DISCUSSION AND CONCLUSION
This paper's results highlight that, surprisingly, it is possible to directly, efficiently, and successfully optimize evolvability in the space of large neural networks. One key insight enabling such efficient optimization of evolvability is that there exist deep connections between evolvability and parameter-space meta-learning algorithms like MAML, a gradient-based approach to meta-learning popular in traditional ML. These results suggest that not only may MAML-like algorithms be used to uncover evolvable policies, evolutionary algorithms that in general incline towards evolvability may serve as a productive basis for meta-learning. That is, evolutionary meta-learning has largely focused on plastic neural networks [12, 37, 39], but evolvable genomes that are primed to quickly adapt are themselves a form of meta-learning (as MAML and evolvability ES demonstrate); such evolvability could be considered instead of, or complementary to, NNs that adapt online.


Figure 6: Distribution of behaviors during adaptation in the 3-D locomotion domain. Heat-maps are shown of the final positions of 10,000 policies sampled from the population distribution initialized with MaxEnt-EES, and adapted to move in the positive x direction with standard ES over several generations. These plots suggest that MaxEnt-EES successfully found policies that could quickly adapt to perform new tasks. See supplemental figure S3 for the MaxVar version.


Figure 7: Distribution of behaviors for uni- and multi-modal MaxEnt-EES variants. Heat-maps are shown of the final positions of 10,000 policies sampled from the population at generation 100 of MaxEnt-EES, with (a) one component and (b) a vanilla GMM with two components. Also shown is the result of (c) splitting the trained single component into two components, and evolving for 20 additional generations. These plots suggest that both bi-modal variants performed similarly. See supplemental figure S4 for the MaxVar version.

The results of this paper also open up many future research directions. One natural follow-up concerns the use of evolvability as an auxiliary objective to complement novelty-driven ES [7, 9] or objective-driven ES. The intuition is that increased evolvability will catalyze the accumulation of novelty or progress towards a goal. Interestingly, the same samples used for calculating novelty-driven or objective-driven ES updates can be reused to estimate the gradient of evolvability; in other words, such an auxiliary objective could be calculated with no additional domain evaluations. Additionally, the population-level formulation of evolvability ES is itself similar to novelty-driven ES, and future work could compare them directly.

Interestingly, while part of evolvability ES's origin comes from inspiration from gradient-based ML (i.e. MAML), it also offers the opportunity to inspire new gradient-based algorithms: A reformulation of the evolvability ES loss function could enable a policy gradients version of evolvability ES which exploits the differentiability of the policy to decrease the variance of gradient estimates (supplemental section S5 further discusses this possibility). The conclusion is that evolvability ES is a new and scalable addition to the set of tools for exploring, encouraging, and studying evolvability, one we hope will continue to foster cross-pollination between the EC and deep RL communities.

REFERENCES
[1] Martín Abadi et al. TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. Software available from http://tensorflow.org/.
[2] Lee Altenberg. The evolution of evolvability in genetic programming. Advances in Genetic Programming, 3:47–74, 1994.
[3] Hans-Georg Beyer and Hans-Paul Schwefel. Evolution strategies: A comprehensive introduction. Natural Computing, 1:3–52, 2002.
[4] J.F.Y. Brookfield. Evolution: The evolvability enigma. Current Biology, 11(3):R106–R108, 2001.
[5] Patryk Chrabaszcz, Ilya Loshchilov, and Frank Hutter. Back to basics: Benchmarking canonical evolution strategies for playing Atari. arXiv preprint arXiv:1802.08842, 2018.
[6] Jeff Clune, Jean-Baptiste Mouret, and Hod Lipson. The evolutionary origins of modularity. Proc. R. Soc. B, 280(1755):20122863, 2013.
[7] Edoardo Conti, Vashisht Madhavan, Felipe Petroski Such, Joel Lehman, Kenneth Stanley, and Jeff Clune. Improving exploration in evolution strategies for deep reinforcement learning via a population of novelty-seeking agents. In Advances in Neural Information Processing Systems 31, 2018.
[8] E. Coumans and Y. Bai. PyBullet, a Python module for physics simulation for games, robotics and machine learning. GitHub repository, 2016.
[9] Giuseppe Cuccu and Faustino Gomez. When novelty is not enough. In European Conference on the Applications of Evolutionary Computation, pages 234–243. Springer, 2011.
[10] M. Ebner, M. Shackleton, and R. Shipman. How neutral networks influence evolvability. Complexity, 7(2):19–33, 2001.
[11] Chelsea Finn, Pieter Abbeel, and Sergey Levine. Model-agnostic meta-learning for fast adaptation of deep networks. arXiv preprint arXiv:1703.03400, 2017.
[12] Dario Floreano and Francesco Mondada. Evolution of plastic neurocontrollers for situated agents. In Proc. of the Fourth International Conference on Simulation of Adaptive Behavior (SAB), From Animals to Animats, 1996.
[13] Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, et al. Noisy networks for exploration. CoRR, abs/1706.10295, 2017.
[14] Michael C. Fu. Chapter 19: Gradient estimation. Simulation, Handbooks in Operations Research and Management Science, 13:575, 2006.
[15] J.J. Grefenstette. Evolvability in dynamic fitness landscapes: A genetic algorithm approach. In Proceedings of the 1999 Congress on Evolutionary Computation (CEC 99), volume 3. IEEE, 1999.
[16] Sepp Hochreiter, A. Steven Younger, and Peter R. Conwell. Learning to learn using gradient descent. In International Conference on Artificial Neural Networks, pages 87–94. Springer, 2001.
[17] Nadav Kashtan, Elad Noor, and Uri Alon. Varying environments can speed up evolution. Proceedings of the National Academy of Sciences, 104(34):13711–13716, 2007.
[18] Marc Kirschner and John Gerhart. Evolvability. Proceedings of the National Academy of Sciences, 95(15):8420–8427, 1998.
[19] Loizos Kounios, Jeff Clune, Kostas Kouvaris, Günter P. Wagner, Mihaela Pavlicev, Daniel M. Weinreich, and Richard A. Watson. Resolving the paradox of evolvability with learning theory: How evolution learns to improve evolvability on rugged fitness landscapes. arXiv preprint arXiv:1612.05955, 2016.
[20] Joel Lehman and Kenneth O. Stanley. Abandoning objectives: Evolution through the search for novelty alone. Evolutionary Computation, 19(2):189–223, 2011.
[21] Joel Lehman and Kenneth O. Stanley. Improving evolvability through novelty search and self-adaptation. In IEEE Congress on Evolutionary Computation, pages 2693–2700, 2011.
[22] Joel Lehman and Kenneth O. Stanley. Evolving a diversity of virtual creatures through novelty search and local competition. In Proceedings of the 13th Annual Conference on Genetic and Evolutionary Computation, pages 211–218. ACM, 2011.
[23] Joel Lehman and Kenneth O. Stanley. Evolvability is inevitable: Increasing evolvability without the pressure to adapt. PLoS ONE, 8(4):e62186, 2013.
[24] Joel Lehman and Kenneth O. Stanley. On the potential benefits of knowing everything. In Artificial Life Conference Proceedings, pages 558–565. MIT Press, 2018.
[25] Geoffrey J. McLachlan, Sharon X. Lee, and Suren I. Rathnayake. Finite mixture models. Annual Review of Statistics and Its Application, 6(1):355–378, 2019.
[26] Henok Mengistu, Joel Lehman, and Jeff Clune. Evolvability search. In Proceedings of the 2016 Genetic and Evolutionary Computation Conference (GECCO '16), 2016.
[27] Thomas Miconi, Jeff Clune, and Kenneth O. Stanley. Differentiable plasticity: training plastic neural networks with backpropagation. arXiv preprint arXiv:1804.02464, 2018.
[28] Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, et al. Human-level control through deep reinforcement learning. Nature, 518(7540):529, 2015.
[29] Anh Mai Nguyen, Jason Yosinski, and Jeff Clune. Innovation engines: Automated creativity and improved stochastic optimization via deep learning. In Proceedings of the 2015 Annual Conference on Genetic and Evolutionary Computation, pages 959–966. ACM, 2015.
[30] Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. Automatic differentiation in PyTorch. In NIPS-W, 2017.
[31] M. Pigliucci. Is evolvability evolvable? Nature Reviews Genetics, 9(1):75–82, 2008.
[32] Justin K. Pugh, Lisa B. Soros, and Kenneth O. Stanley. Quality diversity: A new frontier for evolutionary computation. Frontiers in Robotics and AI, 3:40, 2016.
[33] Joseph Reisinger, Kenneth O. Stanley, and Risto Miikkulainen. Towards an empirical measure of evolvability. In Genetic and Evolutionary Computation Conference (GECCO 2005) Workshop Program, pages 257–264. ACM, 2005.
[34] Tim Salimans, Jonathan Ho, Xi Chen, Szymon Sidor, and Ilya Sutskever. Evolution strategies as a scalable alternative to reinforcement learning. arXiv preprint arXiv:1703.03864, 2017.
[35] John Schulman, Nicolas Heess, Theophane Weber, and Pieter Abbeel. Gradient estimation using stochastic computation graphs. CoRR, abs/1506.05254, 2015.
[36] Hans-Paul Schwefel. Evolution and Optimum Seeking: The Sixth Generation. John Wiley & Sons, Inc., 1993.
[37] Andrea Soltoggio, John A. Bullinaria, Claudio Mattiussi, Peter Dürr, and Dario Floreano. Evolutionary advantages of neuromodulated plasticity in dynamic, reward-based scenarios. In Proceedings of the 11th International Conference on Artificial Life (Alife XI), pages 569–576. MIT Press, 2008.
[38] Kenneth O. Stanley and Risto Miikkulainen. Evolving neural networks through augmenting topologies. Evolutionary Computation, 10:99–127, 2002.
[39] Kenneth O. Stanley, Bobby D. Bryant, and Risto Miikkulainen. Evolving adaptive neural networks with and without adaptive synapses. In The 2003 Congress on Evolutionary Computation (CEC '03), volume 4, pages 2557–2564. IEEE, 2003.
[40] Felipe Petroski Such, Vashisht Madhavan, Edoardo Conti, Joel Lehman, Kenneth O. Stanley, and Jeff Clune. Deep neuroevolution: genetic algorithms are a competitive alternative for training deep neural networks for reinforcement learning. arXiv preprint arXiv:1712.06567, 2017.
[41] E. Todorov, T. Erez, and Y. Tassa. MuJoCo: A physics engine for model-based control. In 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 5026–5033, 2012.
[42] Ricardo Vilalta and Youssef Drissi. A perspective view and survey of meta-learning. Artificial Intelligence Review, 18(2):77–95, 2002.
[43] Günter P. Wagner and Lee Altenberg. Perspective: complex adaptations and the evolution of evolvability. Evolution, 50(3):967–976, 1996.
[44] Jane X. Wang, Zeb Kurth-Nelson, Dhruva Tirumala, Hubert Soyer, Joel Z. Leibo, Remi Munos, Charles Blundell, Dharshan Kumaran, and Matt Botvinick. Learning to reinforcement learn. arXiv preprint arXiv:1611.05763, 2016.
[45] Daan Wierstra, Tom Schaul, Tobias Glasmachers, Yi Sun, and Jürgen Schmidhuber. Natural evolution strategies, 2011.
[46] Bryan Wilder and Kenneth Stanley. Reconciling explanations for the evolution of evolvability. Adaptive Behavior, 23(3):171–179, 2015.


Figure S1: 2-D locomotion ability across training. This plot compares raw locomotion ability of policies sampled from standard ES, MaxVar-EES, and MaxEnt-EES during training. Each curve plots the mean final distance from the origin over 10,000 samples from the population distribution. The error bars indicate standard deviation over the 12 training runs. While standard ES learned slightly faster, this plot shows that both evolvability ES variants found policies which moved almost as far as standard ES in this domain, despite encoding both forwards- and backwards-moving policies.

Figure S2: 3-D locomotion ability across training. This plot compares raw locomotion ability of policies sampled from standard ES, MaxVar-EES, and MaxEnt-EES. Each curve shows the mean final distance from the origin over 10,000 samples from the population distribution during training. Mean and standard deviation over 12 runs shown. While standard ES learned its single forward-moving policy more quickly, this plot highlights that both evolvability ES variants found policies which moved almost as far as standard ES on the ant domain, despite encoding policies that move in many more directions.

SUPPLEMENTAL INFORMATION
Included in the supplemental information are experimental details and hyperparameters for all algorithms (section S1); a description of stochastic computation graphs and how we extended them (sections S2 and S3); and the particular stochastic computation graphs that enable calculating losses and gradients for ES and evolvability ES (section S4).

S1 EXPERIMENTAL DETAILS
This section presents additional plots useful for better understanding the performance of evolvability ES, as well as further details and hyperparameters for all algorithms.

S1.1 Additional Plots
Figures S1 and S2 display the mean distance from the origin of both standard ES and both variants of evolvability ES, on the 2D and 3D locomotion tasks respectively. Figure S3 highlights how the distribution of behaviors changes during meta-learning test-time to quickly adapt to the task at hand for MaxVar-EES (see figure 6 for the MaxEnt version). Figure S4 contrasts the two variants of multi-modal MaxVar-EES (see figure 7 for the MaxEnt version). Figure S5 compares perturbations of central evolvability ES policies to perturbations of MAML policies with a standard deviation 4 times smaller than that of EES.

S1.2 Hyperparameters and Training Details
For standard ES, fitness was rank-normalized to take values symmetrically between −0.5 and 0.5 at each generation before computing gradient steps. For both variants of evolvability ES, BCs were whitened to have mean zero and a standard deviation of one at each generation before computing losses and gradients. This was done instead of rank normalization in order to preserve density information for variance and entropy estimation. A Gaussian kernel with standard deviation 1.0 was used for MaxEnt-EES to estimate the density of behavior given samples from the population distribution.

S1.2.1 Interference Pattern Details. The interference pattern was generated by the function

f(x) = 5 sin(x/5) sin(20x).   (4)

Hyperparameters for the interference pattern task are shown in Tables S1 and S2.

Table S1: MaxVar Hyperparameters: Interference Pattern Task.
  Learning Rate: 0.03
  Population Standard Deviation: 0.5
  Population Size: 500
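A small sketch (ours) reproduces equation (4) and illustrates the claim in section 5.1 that evolvability peaks where the slow sine wave peaks: the spread of offspring behaviors under Gaussian mutation is largest there. The mutation standard deviation of 0.5 matches Tables S1 and S2; the probe points are illustrative.

import numpy as np

def behavior(x):
    # Interference pattern, eq. (4).
    return 5 * np.sin(x / 5) * np.sin(20 * x)

def offspring_behavior_variance(genome, sigma=0.5, samples=500):
    # Variance of behaviors produced by Gaussian mutations of a single genome.
    return behavior(genome + sigma * np.random.randn(samples)).var()

# Variance is far larger near the slow wave's peak (x = 5*pi/2, roughly 7.85) than near x = 0.
print(offspring_behavior_variance(7.85), offspring_behavior_variance(0.0))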


Figure S3: Distribution of behaviors during adaptation in the 3D locomotion domain. Heat-maps of the final positions of 10,000 policies sampled from the population distribution initialized with MaxVar-EES, and adapted to move in the positive x direction with Standard ES over several generations. These plots suggest that MaxVar-EES successfully found policies which could quickly adapt to perform new tasks. See figure 6 for the MaxEnt version.


Figure S4: Distribution of behaviors for uni- and multi-modal MaxVar-EES variants. Heat-maps of the final positions of 10,000 policies sampled from the population distribution at generation 100 of MaxVar-EES, with (a) one component and (b) a vanilla GMM with two components. Also shown is the result of splitting the single component in (a) into two components and evolving for 20 additional generations. These plots suggest that both bi-modal variants performed similarly. See figure 7 for the MaxEnt version.

Table S2: MaxEnt Hyperparameters: Interference Pattern Task.
Hyperparameter                  Setting
Learning Rate                   0.1
Population Standard Deviation   0.5
Population Size                 500
Kernel Standard Deviation       1.0

S1.2.2 Locomotion Task Details. For the locomotion tasks, all environments were run deterministically and no action noise was used during training. The only source of randomness was from sampling from the population distribution. Policies were executed in the environment for 1,000 timesteps. For comparison to evolvability ES, the fitness function for standard ES was set to be the final x position of a policy, rewarding standard ES for walking as far as possible in the positive x direction.
NNs for both 2-D and 3-D locomotion were composed of two hidden layers with 256 hidden units each, resulting in 166.7K total parameters, and were regularized with L2 penalties during training. Inputs to the networks were normalized to have mean zero and a standard deviation of one, based on the mean and standard deviation of the states seen in a random subset of all training rollouts, with each rollout having probability 0.01 of being sampled.
Experiments were performed on a cluster system and were distributed across a pool of 550 CPU cores shared between two runs of the same algorithm. Each run took approximately 5 hours to complete.
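To make the observation-normalization procedure above concrete, the sketch below (a minimal Python illustration, not the authors' code; the rollout container and sampling helper are assumptions for the example) computes per-dimension statistics from a random subset of rollouts and applies them to new observations.

```python
import numpy as np

def compute_obs_stats(rollouts, sample_prob=0.01, rng=None):
    """Estimate per-dimension observation mean/std from a random subset of rollouts.

    Each rollout (an array of shape (timesteps, obs_dim)) is included with
    probability `sample_prob`, mirroring the 0.01 probability described above.
    """
    if rng is None:
        rng = np.random.default_rng()
    sampled = [r for r in rollouts if rng.random() < sample_prob]
    states = np.concatenate(sampled, axis=0)
    return states.mean(axis=0), states.std(axis=0) + 1e-8

def normalize_obs(obs, mean, std):
    """Whiten a single observation to zero mean and unit standard deviation."""
    return (obs - mean) / std
```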

Figure S5: Distribution of behaviors compared to MAML in the 2-D locomotion domain. Histograms of the final x positions of 1,000 policies sampled from the final population distribution are shown for each variant of evolvability ES, as well as for perturbations of MAML policies 4 times smaller than those of evolvability ES. Smaller perturbations of MAML did not change the fundamental result (and interestingly eliminated the potential of generating far-right-walking policies).

Hyperparameters for both variants of evolvability ES are shown in Tables S3 and S4.

Table S3: MaxVar Hyperparameters: Locomotion Tasks.
Hyperparameter                  Setting
Learning Rate                   0.01
Population Standard Deviation   0.02
Population Size                 10,000
L2 Regularization Coefficient   0.05

Table S5: MAML Hyperparameters.
Hyperparameter                      Setting
Adaptation Learning Rate            0.1
Conjugate Gradient Damping          1e-5
Conjugate Gradient Iterations       10
Max Number of Line Search Steps     15
Line Search Backtrack Ratio         0.8
Discount Factor                     0.99
Adaptation Batch Size               10
Outer Batch Size                    40

Table S4: MaxEnt Hyperparameters: Locomotion Tasks.
Hyperparameter                  Setting
Learning Rate                   0.01
Population Standard Deviation   0.02
Population Size                 10,000
L2 Regularization Coefficient   0.05
Kernel Bandwidth                1.0

S1.2.3 MAML Details. The MAML algorithm was run on the 2-D locomotion task using the same fully-connected neural network with 2 hidden layers of 256 hidden units each as that used in evolvability ES. There are two main differences between our usage of MAML and those typically done in the past (e.g. in Finn et al. [11]). First, we used the PyBullet simulator [8] as opposed to the more canonical MuJoCo simulator [41]. Second, typically locomotion tasks have associated with them an energy penalty, supplementing the standard distance-based reward function. Because it is unclear how to incorporate an energy penalty into a (potentially vector-valued) behavior characteristic, we used no energy penalty in our evolvability ES experiments. Consequently, for a fairer comparison we also used no energy penalty in our MAML experiments. Table S5 displays additional hyperparameters for MAML.

S2 STOCHASTIC COMPUTATION GRAPHS
A stochastic computation graph, as defined in Schulman et al. [35], is a directed acyclic graph consisting of fixed input nodes, deterministic nodes representing functions of their inputs, and stochastic nodes representing random variables distributed conditionally on their inputs.
A stochastic computation graph G represents the expectation (over its stochastic nodes {z_i}) of the sum of its output nodes {f_i}, as a function of its input nodes {x_i}:

    G(x_1, \ldots, x_l) = \mathbb{E}_{z_1, \ldots, z_m}\left[\sum_{i=1}^{n} f_i\right]    (5)

For example, consider the stochastic computation graph in figure S6, reproduced from [35]. This graph G represents the expectation

    G(x_0, \theta) = \mathbb{E}_{x_1, x_2}\left[f_1(x_1) + f_2(x_2)\right],    (6)

where x_1 \sim p(\cdot\,; x_0, \theta) and x_2 \sim p(\cdot\,; x_1, \theta).
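As a purely illustrative sketch of what this example graph computes, the snippet below estimates G(x_0, θ) by Monte Carlo sampling. The particular Gaussian conditionals and output functions are assumptions made only for the example; they are not taken from the paper.

```python
import numpy as np

def estimate_G(x0, theta, num_samples=100_000, rng=None):
    """Monte Carlo estimate of G(x0, theta) = E_{x1, x2}[f1(x1) + f2(x2)]
    for the example graph of figure S6.

    Assumed (illustrative) conditionals: x1 ~ N(theta * x0, 1), x2 ~ N(theta * x1, 1),
    with outputs f1(x) = x**2 and f2(x) = x.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    x1 = rng.normal(loc=theta * x0, scale=1.0, size=num_samples)
    x2 = rng.normal(loc=theta * x1, scale=1.0)
    return np.mean(x1 ** 2 + x2)

print(estimate_G(x0=1.0, theta=0.5))
```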



Figure S6: Example stochastic computation graph: input nodes are depicted with no border, deterministic nodes with square borders, and stochastic nodes with circular borders.

Figure S7: Example nested stochastic computation graph: input nodes are depicted with no border, deterministic nodes with square borders, stochastic nodes with circular borders, and expectation nodes with double elliptical borders.

A key property of stochastic computation graphs is that they may be differentiated with respect to their inputs. Using the score function estimator [14], we have that

    \nabla_\theta G(x_0, \theta) = \mathbb{E}_{x_1, x_2}\big[\nabla_\theta \log p(x_1; \theta, x_0)\,(f_1(x_1) + f_2(x_2)) + \nabla_\theta \log p(x_2; \theta, x_1)\, f_2(x_2)\big].    (7)

Schulman et al. [35] also derive surrogate loss functions for stochastic computation graphs, allowing for implementations of stochastic computation graphs with existing automatic differentiation software. For example, given a sample \{x_1^i\}_{1 \le i \le N} of x_1 and \{x_2^i\}_{1 \le i \le N} of x_2, we can write

    \hat{L}(\theta) = \frac{1}{N}\sum_i \log p(x_1^i; \theta, x_0)\,\big(f_1(x_1^i) + f_2(x_2^i)\big) + \log p(x_2^i; \theta, x_1^i)\, f_2(x_2^i).    (8)
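The sketch below shows how a surrogate loss of this form can be differentiated with off-the-shelf automatic differentiation, anticipating the approximation in the next equation. It is a minimal PyTorch illustration: the Gaussian conditionals and the choices of f_1 and f_2 are stand-ins made for the example, not part of the paper.

```python
import torch

def surrogate_loss(theta, x0, num_samples=4096):
    """Surrogate loss (Equation 8) whose gradient estimates Equation 7
    for the example graph of figure S6.

    Assumed conditionals: x1 ~ N(theta * x0, 1), x2 ~ N(theta * x1, 1),
    with f1(x) = x**2 and f2(x) = x.
    """
    with torch.no_grad():                      # samples are treated as fixed data
        x1 = torch.normal(theta * x0 * torch.ones(num_samples), 1.0)
        x2 = torch.normal(theta * x1, 1.0)
        f1, f2 = x1 ** 2, x2
    logp1 = torch.distributions.Normal(theta * x0, 1.0).log_prob(x1)
    logp2 = torch.distributions.Normal(theta * x1, 1.0).log_prob(x2)
    return (logp1 * (f1 + f2) + logp2 * f2).mean()

theta = torch.tensor(0.5, requires_grad=True)
loss = surrogate_loss(theta, x0=torch.tensor(1.0))
loss.backward()          # gradient of the surrogate estimates the true gradient
print(theta.grad)
```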

Now to estimate \nabla_\theta G(x_0, \theta), we have

    \nabla_\theta G(x_0, \theta) \approx \nabla_\theta \hat{L}(\theta),    (9)

which may be computed with popular automatic differentiation software.

S3 NESTED STOCHASTIC COMPUTATION GRAPHS
We make two changes to the stochastic computation graph formalism:
(1) We add a third type of node which represents the expectation over one of its parent stochastic nodes of one of its inputs. We require that a stochastic node representing a random variable z be a dependency of exactly one expectation node over z, and that every expectation node over a random variable z depend on a stochastic node representing z.
(2) Consider a stochastic node representing a random variable z conditionally dependent on a node y. Rather than expressing this as a dependency of z on y, we represent this as a dependency of the expectation node over z on y. Formally, this means all stochastic nodes are required to be leaves of the computation graph.
Because "nested stochastic computation graphs," as we term them, contain their expectations explicitly, they simply represent the sum of their output nodes (instead of the expected sum of their output nodes, as with regular stochastic computation graphs).

As an example, consider the nested stochastic computation graph G depicted in figure S7. First, note that G is indeed a nested stochastic computation graph, because the stochastic nodes and expectation nodes correspond, and because all stochastic nodes are leaves of the graph. Next, note that G is equivalent to the stochastic computation graph of figure S6 in the sense that it computes the same function of its inputs:

    G(x_0, \theta) = \mathbb{E}_{x_1}\big[f_1(x_1) + \mathbb{E}_{x_2}[f_2(x_2)]\big]    (10)
                   = \mathbb{E}_{x_1, x_2}[f_1(x_1) + f_2(x_2)]    (11)

The original stochastic computation graph formalism has the advantage of more clearly depicting conditional relationships, but this new formalism has two advantages:
(1) Nested stochastic computation graphs can represent arbitrarily nested expectations. We have already seen this in part with the example of figure S7, but we shall see this more clearly in a few sections.
(2) It is trivial to define surrogate loss functions for nested stochastic computation graphs. Moreover, these surrogate loss functions have the property that in the forward pass, they estimate the true loss function.

Consider a nested stochastic computation graph G with input nodes \{\theta\} \cup \{x_i\}, and suppose we wish to compute the gradient \nabla_\theta G(\theta, x_1, \ldots, x_n). We would like to be able to compute the gradient of any node with respect to any of its inputs, as this would allow us to use the well-known backpropagation algorithm to compute \nabla_\theta G. Unfortunately, it is often impossible to write the gradient of an expectation in closed form; we shall instead estimate \nabla_\theta G given a sample from the stochastic nodes of G.

Suppose G has an output node \mathbb{E}_z[f], the only expectation node in G. Suppose moreover that \mathbb{E}_z[f] has inputs \{\xi_i\} (apart from f), so z \sim p(\cdot\,; \xi_1, \ldots, \xi_m), and suppose f has inputs \{y_i\}. Note that to satisfy the definition of a nested stochastic computation graph, f must ultimately depend on z, so we write f as f(y_1, \ldots, y_l; z). See figure S8 for a visual representation of G.



Figure S8: Nested stochastic computation graph with a single expectation node.


Figure S9: Nested stochastic computation graph representing Natural Evolution Strategies.

If we wish to compute \nabla_\omega \mathbb{E}_z[f(y_1, \ldots, y_l; z)], using the likelihood ratio interpretation of the score function [14] and given a sample \{z_i\}_{1 \le i \le N} of z, we can write

    \hat{L}(\omega) = \frac{1}{N}\sum_i f(y_1, \ldots, y_l; z_i)\, L(z_i),    (12)

where L is the likelihood ratio given by

    L(z_i) = \frac{p(z_i; \xi_1, \ldots, \xi_m)}{p(z_i; \xi_1', \ldots, \xi_m')},    (13)

and setting \xi_i' = \xi_i gives

    \nabla_\omega L(z_i) = \frac{\nabla_\omega p(z_i; \xi_1, \ldots, \xi_m)}{p(z_i; \xi_1, \ldots, \xi_m)}    (14)
                         = \nabla_\omega \log p(z_i; \xi_1, \ldots, \xi_m).    (15)

Note that f can either depend on \omega directly, if \omega \in \{y_i\}, or through the distribution of z, if \omega \in \{\xi_i\}. Differentiating, we have

    \nabla_\omega \hat{L}(\omega) = \frac{1}{N}\sum_i f(y_1, \ldots, y_l; z_i)\,\nabla_\omega L(z_i) + \nabla_\omega f(y_1, \ldots, y_l; z_i)\, L(z_i),    (16)

and setting \xi_i' = \xi_i we see that \nabla_\omega \hat{L}(\omega) is an estimate of

    \nabla_\omega \mathbb{E}_z[f(y_1, \ldots, y_l; z)].    (17)

Generalizing this trick to an arbitrary nested stochastic computation graph G, we see that creating a surrogate loss function \hat{L} is as simple as replacing each expectation node with a sample mean as in Equation 12, weighted by the likelihood ratio L(z_i). Note that since L(z_i) = 1 when \xi_i' = \xi_i, the surrogate loss is simply the method of moments estimate of the true loss.

Considering again the graph of figure S7, we can construct a surrogate loss function

    \hat{L}(\theta, x_0) = \frac{1}{N}\sum_i \Big(f_1(x_1^i) + \sum_j f_2(x_2^j)\, L(x_2^j; \theta, x_1^i)\Big)\, L(x_1^i; \theta, x_0).    (18)

While this may not seem like very much of an improvement at first, it is insightful to note how similar the forms of Equations 10 and 18 are. In particular, this similarity makes it straightforward to write a custom "expectation" operation for use in automatic differentiation software which computes the sample mean weighted by the likelihood ratio.

S4 STOCHASTIC COMPUTATION GRAPHS FOR STANDARD ES AND EVOLVABILITY ES
As mentioned in the main text of the paper, we can estimate the gradients of the standard ES and evolvability ES loss functions with the score function estimator because we can represent these loss functions as (nested) stochastic computation graphs. Figure S9 shows a (nested) stochastic computation graph representing the standard ES loss function. This yields the following surrogate loss function for ES:

    \hat{L}(\theta) = \frac{1}{N}\sum_i f(z_i)\, L(z_i),    (19)

where L(z_i) is the likelihood ratio.

Figure S10 shows a nested stochastic computation graph representing MaxVar-EES, yielding the following surrogate loss function:

    \hat{L}(\theta) = \frac{1}{N}\sum_i B(z_i)^2\, L(z_i; \theta).    (20)

Figure S10: Nested stochastic computation graph representing the loss function of MaxVar-EES.

Finally, figure S11 shows a nested stochastic computation graph representing the loss function for MaxEnt-EES, yielding the following surrogate loss function:

    \hat{L}(\theta) = -\frac{1}{N}\sum_i \log\Big(\sum_j \varphi\big(B(z_j') - B(z_i)\big)\, L(z_j'; \theta)\Big)\, L(z_i; \theta).    (21)
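As a sketch of the custom "expectation" operation described above, the snippet below computes a sample mean weighted by the likelihood ratio and uses it to form a MaxVar-EES surrogate in the spirit of Equation 20. This is a minimal PyTorch illustration of the idea, not the authors' implementation; the Gaussian population distribution and the toy behavior characteristic are assumptions for the example.

```python
import torch

def expectation(values, log_prob):
    """Sample mean weighted by the likelihood ratio.

    `values` holds per-sample terms f(z_i); `log_prob` holds log p(z_i; theta)
    with gradients flowing to theta. At theta' = theta the ratio equals 1, so the
    forward pass is the plain sample mean, while the backward pass yields the
    score-function gradient estimate.
    """
    ratio = torch.exp(log_prob - log_prob.detach())
    return (values * ratio).mean()

def maxvar_surrogate(mu, sigma, behavior_fn, num_samples=1024):
    """MaxVar-EES surrogate loss (cf. Equation 20) for a Gaussian population N(mu, sigma^2 I)."""
    dist = torch.distributions.Normal(mu, sigma)
    z = dist.sample((num_samples,))               # pseudo-offspring parameters
    log_prob = dist.log_prob(z).sum(dim=-1)       # log-density of each sample
    b = behavior_fn(z)                            # behavior characteristics B(z_i)
    b_centered = b - b.mean(dim=0, keepdim=True)  # BCs centred per generation
    return expectation((b_centered ** 2).sum(dim=-1), log_prob)

# Toy usage: 3-D parameters, behavior = first parameter coordinate (an assumption).
mu = torch.zeros(3, requires_grad=True)
loss = maxvar_surrogate(mu, sigma=0.02, behavior_fn=lambda z: z[:, :1])
(-loss).backward()   # ascend the variance objective by descending its negation
print(mu.grad)
```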

Figure S11: Nested stochastic computation graph representing the loss function of MaxEnt-EES.

S5 POLICY GRADIENTS EVOLVABILITY ES
With MaxVar-EES, the loss function we wish to optimize is given by:

    J(\mu) = \mathbb{E}_{\theta \sim N(\mu, \sigma^2)}\, \mathbb{E}_{a_t \sim \pi(\cdot|s_t, \theta)} \left[\sum_{t=1}^{\tau} \big(B_\tau - \mathbb{E}_\theta(B_\tau)\big)^2\, \mathbb{1}[t = \tau]\right]    (22)

where \tau is the length of an episode; the action a_t at time t is sampled from a distribution \pi(\cdot|s_t, \theta) depending on the current state s_t and the parameters of the current policy \theta; and B_\tau is the behavior at time \tau. Evolvability ES uses the following representation of the gradient of this function:

    \nabla_\mu J(\mu) = \mathbb{E}_{\theta \sim N(\mu, \sigma^2)}\, \mathbb{E}_{a_t \sim \pi(\cdot|s_t, \theta)} \left[\sum_{t=1}^{\tau} \big(B_\tau - \mathbb{E}_\theta(B_\tau)\big)^2\, \nabla_\mu \log N(\theta; \mu, \sigma^2)\, \mathbb{1}[t = \tau]\right]    (23)

Another way to write this same gradient is as follows, using the reparametrization trick:

    \nabla_\mu J(\mu) = \mathbb{E}_{\epsilon \sim N(0, \sigma^2)}\, \mathbb{E}_{a_t \sim \pi(\cdot|s_t, \mu + \epsilon)} \left[\sum_{t=1}^{\tau} \big(B_\tau - \mathbb{E}_\theta(B_\tau)\big)^2\, \nabla_\mu \log \pi(a_t|s_t, \mu + \epsilon)\, \mathbb{1}[t = \tau]\right]    (24)

This representation is reminiscent of the NoisyNet formulation of the policy gradient algorithm [13]. Evolvability ES estimates its representation of the gradient from samples like this:

    \nabla_\mu J(\mu) \approx \sum_{k=1}^{m} \left(\sum_{t=1}^{\tau} \big(B_\tau^k - \bar{B}_\tau\big)^2\, \mathbb{1}[t = \tau]\, \nabla_\mu \log N(\theta_k; \mu, \sigma^2)\right)    (25)

where B_\tau^k is the BC of the kth individual, and \bar{B}_\tau is the mean BC among all m individuals. A hypothetical policy gradients version of evolvability ES could estimate its gradient as follows:

    \nabla_\mu J(\mu) \approx \sum_{k=1}^{m} \left(\sum_{t=1}^{\tau} \big(B_\tau^k - \bar{B}_\tau\big)^2\, \mathbb{1}[t = \tau]\, \nabla_\mu \log \pi(a_t|s_t, \mu + \epsilon)\right)    (26)

The formulation above in terms of an explicit sum over \tau timesteps and an indicator on the current timestep makes it easy to extend both of these algorithms to allow for diversity of behavior at multiple timesteps, closer to a traditional RL per-timestep reward function.
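To make the sample-based estimate of Equation 25 concrete, the following sketch (a minimal NumPy illustration under the assumption of a scalar, end-of-episode behavior characteristic; `evaluate_bc` is a hypothetical stand-in for running a perturbed policy in the environment) forms the MaxVar-EES gradient estimate, averaged over the population rather than summed, which differs from Equation 25 only by a constant scaling.

```python
import numpy as np

def maxvar_es_gradient(mu, sigma, evaluate_bc, pop_size=10_000, rng=None):
    """Sample-based MaxVar-EES gradient estimate in the spirit of Equation 25.

    `evaluate_bc(theta)` is a hypothetical stand-in that runs the policy with
    parameters `theta` and returns its end-of-episode behavior characteristic.
    """
    if rng is None:
        rng = np.random.default_rng()
    eps = rng.normal(size=(pop_size, mu.shape[0]))    # perturbation directions
    thetas = mu + sigma * eps                         # pseudo-offspring theta_k
    bcs = np.array([evaluate_bc(t) for t in thetas])  # B_tau^k for each offspring
    centered = bcs - bcs.mean()                       # subtract the mean BC
    # For theta_k = mu + sigma * eps_k, grad_mu log N(theta_k; mu, sigma^2) = eps_k / sigma.
    return (centered[:, None] ** 2 * eps / sigma).mean(axis=0)

# Toy usage with a stand-in behavior characteristic (sum of parameters).
mu = np.zeros(5)
grad = maxvar_es_gradient(mu, sigma=0.02, evaluate_bc=lambda th: th.sum(), pop_size=2_000)
print(grad.shape)
```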

S6 THEORETICAL RESULTS
Theorem S6.1. For every pair of continuous functions g : [0, 1]^n → R and h : [0, 1]^n → R, and for every ϵ > 0, there exists a δ_2 > 0 such that for every 0 < δ_1 < δ_2, there exists a 5-hidden-layer neural network f(x; W^1, ..., W^5) : [0, 1]^n → R with ReLU activations such that, for any distribution of weight perturbations in which each component is sampled i.i.d. from a distribution with distribution function F, |f(x; W̃^1, ..., W̃^5) − g(x)| < ϵ for all x ∈ [0, 1]^n with probability at least F(δ_2) − F(δ_1), and |f(x; W̃^1, ..., W̃^5) − h(x)| < ϵ for all x ∈ [0, 1]^n with probability at least F(−δ_1) − F(−δ_2), where W̃^l are the perturbed weights.

Proof sketch. First, note that g and h can each be arbitrarily approximated as 2-layer neural networks G and H.
Next, construct a 2-layer neural network σ : R → R^2 with ReLU activations which is uniformly δ-close in each component to the function x ↦ (I[x > 0], I[x < 0]).
Each of these 3 neural networks is uniformly continuous (both in the inputs and in the weights). We now connect them into a neural network F. Create a node K in the first hidden layer of F with bias B, and connect G and H to the inputs such that their outputs are in the second hidden layer of F. Also construct a node K′ in the second hidden layer of F, and label the weight connecting K and K′ w_k. Now place σ such that its outputs are in the fourth hidden layer of F. Next, create two nodes Ĝ and Ĥ in the 5th hidden layer of F, initialize the weight connecting G and Ĝ to 1, and that connecting H and Ĥ to 1. Label the weight connecting σ_1 to Ĝ w_g and that connecting σ_2 to Ĥ w_h. Finally, connect Ĝ and Ĥ to the output node of F with weight 1 each. Initialize all unmentioned weights to 0.
Now pick δ_2 such that uniformly over perturbations less than δ/2, nodes G and H are ϵ_1-close to their original outputs, and such that nodes σ_1 and σ_2 are ϵ_1/M-close to their original outputs, where M is a bound on the activations of nodes g and h.
Fix 0 < δ_1 < δ_2. Set B such that every perturbation in (δ_1, δ_2) to w_k results in node K positive over all inputs, and every perturbation in (−δ_2, −δ_1) to w_k results in node K negative over all inputs. We can do this because the inputs to K are bounded.
Then for perturbations to w_k in (δ_1, δ_2), we can make σ ϵ_2-close to (1, 0) (in each component). We can then set w_g and w_h small enough that when σ_1 is ϵ_1/M-close to 1, node Ĝ is 0 and node Ĥ is ϵ_3-close to its original output, and when σ_2 is ϵ_1/M-close to 1, node Ĥ is 0 and Ĝ is ϵ_3-close to its original output. Putting everything together, we can choose ϵ_1, ϵ_2, and ϵ_3 such that for perturbations to w_k in (δ_1, δ_2), F is ϵ-close to g, and for perturbations to w_k in (−δ_2, −δ_1), F is ϵ-close to h. □

We hope that future work will strengthen these results in several ways. First, the number of layers can likely be made more economical, and the results should be extensible to other activation functions like Tanh. It should also be possible to merge more than two continuous functions in a similar way; a more open-ended question is whether, as in the Ant domain, some uncountable family of functions can be merged into a single neural network.