Quality and Diversity Optimization: a Unifying Modular Framework Antoine Cully and Yiannis Demiris, Senior Member, IEEE

1 Quality and Diversity Optimization: A Unifying Modular Framework Antoine Cully and Yiannis Demiris, Senior Member, IEEE Abstract—The optimization of functions to find the best solu- QD-algorithm tion according to one or several objectives has a central role in Collection of diverse and high-performing solutions many engineering and research fields. Recently, a new family of optimization algorithms, named Quality-Diversity optimization, has been introduced, and contrasts with classic algorithms. Instead of searching for a single solution, Quality-Diversity Quality algorithms are searching for a large collection of both diverse Coverage and high-performing solutions. The role of this collection is to search space descriptor space Previously encountered Solution contained cover the range of possible solution types as much as possible, solution (not stored) in the collection and to contain the best solution for each type. The contribution of this paper is threefold. Firstly, we present a unifying framework Fig. 1: The objective of a QD-algorithm is to generate a of Quality-Diversity optimization algorithms that covers the two collection of both diverse and high-performing solutions. This main algorithms of this family (Multi-dimensional Archive of collection represents a (model free) projection of the high- Phenotypic Elites and the Novelty Search with Local Compe- dimensional search space into a lower dimensional space tition), and that highlights the large variety of variants that defined by the solution descriptors. The quality of a collection can be investigated within this family. Secondly, we propose algorithms with a new selection mechanism for Quality-Diversity is defined by its coverage of the descriptor space and by the algorithms that outperforms all the algorithms tested in this global quality of the solutions that are kept in the collection. paper. Lastly, we present a new collection management that overcomes the erosion issues observed when using unstructured collections. These three contributions are supported by extensive evolutionary algorithms are used to generate neural networks, experimental comparisons of Quality-Diversity algorithms on three different experimental scenarios. robot behaviors, or objects [9], [10]. However, from a more general perspective and in contrast Index Terms—Optimization Methods, Novelty Search, Quality- with Artificial Evolution, Natural Evolution does not produce Diversity, Behavioral Diversity, Collection of Solutions. one effective solution but rather an impressively large set of different organisms, all well adapted to their respective I. INTRODUCTION environment. Surprisingly, this divergent search aspect of Natural Evolution is rarely considered in engineering and Searching for high-quality solutions within a typically high- research fields, even though the ability to provide a large dimensional search space is an important part of engineering and diverse set of high-performing solutions appears to be and research. Intensive work has been done in recent decades promising for multiple reasons. to produce automated procedures to generate these solutions, For example, in a set of effective solutions, each provides which are commonly called “Optimization Algorithms”. The an alternative in the case that one solution turns out to be less applications of such algorithms are numerous and range from effective than expected. This can happen when the optimization modeling purposes to product design [1]. More recently, process takes place in simulation, and the obtained result does arXiv:1708.09251v1 [cs.NE] 12 May 2017 optimization algorithms have become the core of most machine not transfer well to reality (a phenomenon called the reality learning techniques. For example, they are used to adjust gap [11]). In this case, a large collection of solutions can the weights of neural networks in order to minimize the quickly provide a working solution [4]. Maintaining multiple classification error [2], [3], or to allow robots to learn new solutions and using them concurrently to generate actions or behaviors that maximize their velocity or accuracy [4], [5]. predict actions when done by other agents has also been shown Inspired by the ability of natural evolution to generate to be very successful in bioinspired motor control and cognitive species that are well adapted to their environment, Evolutionary robotics experiments[12]. Computation has a long history in the domain of optimization, Moreover, most artificial agents, like robots, should be able particularly in stochastic optimization [6]. For example, evolu- to exhibit different types of behavior in order to accomplish tionary methods have been used to optimize the morphologies their mission. For example, a walking robot needs to be able and the neural networks of physical robots [7], and to infer to move not only forwards, but in every direction and at the equations behind collected data [8]. These optimization different speeds, in order to properly navigate in its environment. abilities are also the core of Evolutionary Robotics in which Similarly, a robotic arm needs to be able to reach objects at different locations rather than at a single, predefined target. A. Cully and Y. Demiris are with the Personal Robotics Laboratory, Department of Electrical and Electronic Engineering, Imperial College London, Despite this observation, most optimization techniques that U.K. (e-mail: [email protected]; [email protected]). are employed to learn behaviors output only a single solution: 2 the one which maximizes the optimized function [10], [7], [5]. tend to the global-optimum and the diversity of the produced Learning generic controllers that are able to solve several tasks walking behaviors will not be enough to properly control the is particularly challenging, as it requires testing each solution robot. For instance, it will not contain slow behaviors, which on several scenarios to assess their quality [13]. The automatic are essential for the robot’s manoeuvrability. This example creation of a collection of behaviors is likely to overcome these illustrates that sampling the entire range of possible solutions limitations and will make artificial agents more versatile. is not always related to searching for the local optima, and why The diversity of the solutions could also be beneficial for it may be useful to have the diversity preservation mechanism the optimization process itself. The exploration process may not correlated with the performance function, but rather based find, within the diversity of the solutions, stepping stones that on differences in the solution type. allow the algorithm to find even higher-performing solutions. Similarly, the algorithms may be able to solve a given problem B. Searching for Diverse Solutions faster if they can rely on solutions that have been designed Following this idea of a non performance-based diversity for different but related situations. For example, modifying mechanism, the Novelty Search algorithm [18] introduces the an existing car design to make it lighter might be faster than idea of searching for solutions that are different from the inventing a completely new design. previous ones, without considering their quality. This concept Attracted by these different properties several recent works, is applied by optimizing a “novelty score” that characterizes the such as Novelty Search with Local Competition [14] and the difference of a solution compared to those already encountered, MAP-Elites algorithm [15], started to investigate the question which are stored in a “novelty archive”. The novelty archive is of generating large collections of both diverse and high- independent from the population of the evolutionary algorithm. performing solutions. Pugh et al. [16], [17] nicely named this The novelty score is computed as the average distance of question as the Quality-Diversity (QD) challenge. the k-nearest neighboring solutions that currently are in the After a brief description of the origins of QD-algorithms novelty archive, while the distances are computed according in the next section, we unify these algorithms into a single to a user-defined solution descriptor (also called a behavioral modular framework, which opens new directions to create characterization, or behavioral descriptor [18], [13]). When the QD-algorithms that combine the advantages of existing meth- novelty score of a solution exceeds a pre-defined threshold, ods (see section III). Moreover, we introduce a new QD- this solution is added to the archive and thus used to compute algorithm based on this framework that outperforms the the novelty score of future solutions. existing approaches by using a new selective pressure, named The main hypothesis behind this approach is that, in some the “curiosity score”. We also introduce a new archive cases, the optimal solutions cannot be found by simply maxi- management approach for unstructured archives, like the mizing the objective function. This is because the algorithm novelty archive [18]. The performance of these contributions is first needs to find stepping stones that are ineffective according assessed via an extensive experimental comparison involving to the objective function, but lead to promising solutions numerous variants of QD-algorithms (see section IV). After afterwards. A good illustration of this problem is the “deceptive the conclusion, we introduce the open-source library designed maze” [18] in which following the objective function inevitably for this study, which

Quality and Diversity Optimization: a Unifying Modular Framework Antoine Cully and Yiannis Demiris, Senior Member, IEEE

Details

Download

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

Support