arXiv:1912.03395v2 [q-bio.PE] 19 Feb 2020

Information-geometric optimization with natural selection

Jakub Otwinowski and Colin H. LaMont
Max Planck Institute for Dynamics and Self-Organization

Evolutionary algorithms, inspired by natural evolution, aim to optimize difficult objective functions without computing derivatives. Here we detail the relationship between population genetics and evolutionary optimization and formulate a new evolutionary algorithm. Optimization of a continuous objective function is analogous to searching for high fitness phenotypes on a fitness landscape. We summarize how natural selection moves a population along the non-Euclidean gradient that is induced by the population on the fitness landscape (the natural gradient). Under normal approximations common in quantitative genetics, we show how selection is related to Newton's method in optimization. We find that intermediate selection is most informative of the fitness landscape. We describe the generation of new phenotypes and introduce an operator that recombines the whole population to generate variants that preserve normal statistics. Finally, we introduce a proof-of-principle algorithm that combines natural selection, our recombination operator, and an adaptive method to increase selection. Our algorithm is similar to covariance matrix adaptation and natural evolutionary strategies in optimization, and has similar performance. The algorithm is extremely simple in implementation with no matrix inversion or factorization, does not require storing a covariance matrix, and may form the basis of more general model-based optimization algorithms with natural gradient updates.

INTRODUCTION

Finding the optimal parameters of a high dimensional function is a common problem in many fields. We seek protein conformations with minimal free energy in biophysics, the genotypes with maximal fitness in evolution, the parameters of maximum likelihood in statistical inference, and optimal design parameters in countless engineering problems. Often derivatives of the objective function are not available or are too costly, and derivative-free algorithms must be applied.

Evolutionary optimization algorithms (EA) use a population of candidate solutions to generate new candidate solutions for the objective "fitness" function. In particular, genetic algorithms (GA) and evolution strategies (ES) are two classes of EAs most directly inspired by the Wright-Fisher and Moran models from population genetics [1, 2]. GAs are initialized with some population of genotypes, representing candidate solutions, and use some form of stochastic reproduction, incorporating a bias known as selection. Among the different selection schemes, fitness-proportionate selection is equivalent to natural selection in population genetics.

Stochasticity of reproduction may be helpful in overcoming local optima, and noise in fitness. In population genetics, stochasticity of reproduction is known as genetic drift, and has the important effect of scaling the strength of selection inversely with the magnitude of stochasticity [2]. Stochasticity also causes the loss of many genotypes, even if they have high fitness. For example in the strong selection weak mutation regime of the Moran model, the probability that a single genotype will sweep a population (fixation) is proportional to its selective advantage, and there is a 99% chance a genotype with a one percent fitness advantage will go extinct [2].

Some form of deterministic selection in optimization is desirable to not waste computational effort, and without stochasticity a population based algorithm will still be robust to noise in fitness since a population effectively integrates information over some region of the fitness landscape. Some ES and GAs use deterministic rank based selection which removes individuals below some threshold. However, such truncation selection is very coarse, and does not affect proportionally the genotypes that survive.

Many population based algorithms, including ES and estimation of distribution algorithms (EDA), are based on drawing a population of candidate solutions from a parameterized distribution $P(\theta)$ and iteratively updating the parameters $\theta$ [3]. The basic approach is to move the parameters in the direction of the gradient of the mean fitness: $\nabla_\theta F$. To account for the uncertainty of the parameters many algorithms move the parameters in the direction of the natural gradient [4], $g^{-1} \nabla_\theta F$, where $g^{-1}$ is the inverse of the Fisher information, which can be estimated from the population. The popular covariance matrix adaptation ES algorithm (CMA-ES) [5, 6] and related natural evolution strategies (NES) [7–10] parameterize a population as a normal distribution, and use samples from the distribution to update the mean and covariance with a natural gradient descent step. More generally, natural gradients describe ascent of the fitness landscape in terms of information geometry, and their use characterizes a wide class of information-geometric algorithms [11]. These algorithms differ from GAs and population genetics, in that there are no selection or mutation operators applied directly to individuals in a population.

Here, we point out that the natural gradient used in information-geometric optimization also appears in natural selection. Under normally distributed phenotypes, as is done in multivariate quantitative genetics, we show how selection is related to Newton's method in optimization. Then, we describe how intermediate levels of selection are best for optimization, and how mutation and recombination generate new variants without having to explicitly sample from the distribution. Finally, we develop a proof of principle quantitative genetic algorithm (QGA) which combines selection, recombination, and a form of adaptive selection tuning that shrinks the population towards an optimum. In contrast to GAs, QGA has deterministic selection, and compared to CMA-ES and NES, it is much simpler and does not store a covariance matrix.

NATURAL SELECTION GRADIENT

We begin by considering a population of infinite size, but with a finite number of unique phenotypes. Each unique variant $i$ has a continuous multivariate phenotype $x_i$ (a vector), with frequency $p_i$ and growth rate, or fitness $f(x_i)$, independent of time and frequencies. In the context of optimization, phenotypes are candidate solutions, and fitness is the objective function to be maximized. Classical replicator dynamics, leaving out mutation and genetic drift, describe the change in frequencies as

$$\frac{dp_i}{dt} = p_i \left( f(x_i) - F \right), \tag{1}$$

with mean fitness $F = \sum_i p_i f(x_i)$. In stochastic descriptions, these dynamics describe the expected changes due to selection.

In the absence of other processes, frequencies can be integrated over time resulting in

$$p_i(t) = \frac{1}{Z_t}\, p_i(0)\, e^{t f(x_i)}, \tag{2}$$

with normalization $Z_t$ ensuring the probabilities sum to one. At long times, the phenotype distribution will concentrate on high fitness phenotypes until the highest fitness phenotype reaches a frequency of unity. The change in mean fitness equals the fitness variance (Fisher's theorem), and higher moments evolve as well. As an example we show 100 variants in a quadratic fitness landscape and how their frequencies change over time (Fig. 1).

Figure 1. A) An example of 100 variants in a 2D phenotype space on a quadratic fitness landscape (blue contours). B) Frequencies evolve over time according to eq. 2, with t = 0.2 (red), t = 1 (green) and t = 5 (blue). [Axes: phenotype x1 and phenotype x2 in panel A; phenotype frequencies p(t) against phenotype x1 in panel B.]

Remarkably, replicator dynamics can be rewritten in terms of information geometry [12]. Frequencies can be considered as the parameters of a discrete categorical distribution, and selection moves them in the direction of the covariant derivative [13, 14], (also known as the natural gradient [4]),

$$\frac{dp}{dt} = g^{-1} \nabla_p F,$$

where $p$ is the vector of (linearly independent) frequencies, $\nabla_p F$ is the vector of partial derivatives, and $g^{-1}$ is the inverse of the Fisher information metric of the categorical distribution, which defines distances on the curved manifold of probability distributions (see appendix A). Selection changes the frequencies in the direction of steepest ascent in non-Euclidean coordinates defined by the geometry of the manifold of the distribution.

The natural gradient is independent of parameterization, and therefore, if the distribution over $x$ can be approximated by another distribution, selection will change those parameters in the direction of their natural gradient. This can be demonstrated by projecting onto a normal phenotype distribution, as is assumed in classic multivariate quantitative genetics. The population mean $\mu$ and population covariance matrix $\Sigma$ parameterize the distribution, and selection changes the mean as ([15, 16], appendix A)

$$\frac{d\mu}{dt} = \Sigma \nabla_\mu F, \tag{3}$$

where $\Sigma^{-1}$ is the associated Fisher information metric. Similarly, the covariance follows a natural gradient with a more complex metric (appendix A). If phenotype covariance reaches zero, then the population is monomorphic and there is no selection. However, an alternative population genetics model in the strong selection weak mutation regime can search a fitness landscape with limited diversity, with the mutation covariance matrix serv-

[...] distribution of frequencies. The exponential of entropy $K_t = e^{S_t}$ defines an effective number of variants, such
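
The replicator dynamics of eq. 1, their closed-form solution in eq. 2, Fisher's theorem, and the natural gradient form of selection can be checked numerically. The following is a minimal Python sketch, not the authors' implementation. The function names, the random phenotypes, the specific quadratic landscape, and the uniform initial frequencies are all assumptions made for illustration. The Fisher information of the n − 1 independent frequencies is taken to be δ_ij / p_i + 1 / p_n, the standard expression for a categorical distribution, which is assumed here to match the metric referred to in appendix A.

```python
import numpy as np

def replicator_rhs(p, f):
    """Right-hand side of eq. 1: dp_i/dt = p_i (f_i - F), with F = sum_i p_i f_i."""
    return p * (f - p @ f)

def integrate_selection(p0, f, t):
    """Closed-form solution of eq. 2: p_i(t) = p_i(0) exp(t f_i) / Z_t."""
    # Shifting f by a constant cancels in the normalization and avoids overflow.
    w = p0 * np.exp(t * (f - f.max()))
    return w / w.sum()

def natural_gradient(p, f):
    """g^{-1} grad_p F for the first n-1 (independent) frequencies."""
    q, pn = p[:-1], p[-1]
    # Fisher information of the reduced parameters: delta_ij / p_i + 1 / p_n.
    g = np.diag(1.0 / q) + 1.0 / pn
    # dF/dp_i when the last frequency is fixed by normalization.
    grad = f[:-1] - f[-1]
    return np.linalg.solve(g, grad)

# Hypothetical setup loosely mirroring Fig. 1: 100 variants in a 2D phenotype space.
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 2))
f = -0.5 * np.sum(x**2, axis=1)   # assumed quadratic fitness landscape
p0 = np.full(100, 0.01)           # assumed uniform initial frequencies

p = integrate_selection(p0, f, t=1.0)
mean_fitness = p @ f
fitness_variance = p @ (f - mean_fitness) ** 2

# Fisher's theorem: the rate of change of mean fitness equals the fitness variance.
mean_fitness_rate = f @ replicator_rhs(p, f)

# Natural gradient form: g^{-1} grad_p F reproduces eq. 1 for the independent frequencies.
ng = natural_gradient(p, f)
```

In this sketch `mean_fitness_rate` should agree with `fitness_variance`, and `ng` should agree with `replicator_rhs(p, f)[:-1]`, up to floating point error. Solving the linear system with `np.linalg.solve` is used only to make the role of the metric explicit; eq. 1 itself needs no matrix inversion.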