Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Volume 3 Summer 2012
Volume 3 Summer 2012 . Academic Partners . Cover image Magnetic resonance image of the human brain showing colour-coded regions activated by smell stimulus. Editors Ulisses Barres de Almeida Max-Planck-Institut fuer Physik [email protected] Juan Rojo TH Unit, PH Division, CERN [email protected] [email protected] Academic Partners Fondazione CEUR Consortium Nova Universitas Copyright ©2012 by Associazione EURESIS The user may not modify, copy, reproduce, retransmit or otherwise distribute this publication and its contents (whether text, graphics or original research content), without express permission in writing from the Editors. Where the above content is directly or indirectly reproduced in an academic context, this must be acknowledge with the appropriate bibliographical citation. The opinions stated in the papers of the Euresis Journal are those of their respective authors and do not necessarily reflect the opinions of the Editors or the members of the Euresis Association or its sponsors. Euresis Journal (ISSN 2239-2742), a publication of Associazione Euresis, an Association for the Promotion of Scientific Endevour, Via Caduti di Marcinelle 2, 20134 Milano, Italia. www.euresisjournal.org Contact information: Email. [email protected] Tel.+39-022-1085-2225 Fax. +39-022-1085-2222 Graphic design and layout Lorenzo Morabito Technical Editor Davide PJ Caironi This document was created using LATEX 2" and X LE ATEX 2 . Letter from the Editors Dear reader, with this new issue we reach the third volume of Euresis Journal, an editorial ad- venture started one year ago with the scope of opening up a novel space of debate and encounter within the scientific and academic communities. -
Optimization of Deep Architectures for EEG Signal Classification
sensors Article Optimization of Deep Architectures for EEG Signal Classification: An AutoML Approach Using Evolutionary Algorithms Diego Aquino-Brítez 1, Andrés Ortiz 2,* , Julio Ortega 1, Javier León 1, Marco Formoso 2 , John Q. Gan 3 and Juan José Escobar 1 1 Department of Computer Architecture and Technology, University of Granada, 18014 Granada, Spain; [email protected] (D.A.-B.); [email protected] (J.O.); [email protected] (J.L.); [email protected] (J.J.E.) 2 Department of Communications Engineering, University of Málaga, 29071 Málaga, Spain; [email protected] 3 School of Computer Science and Electronic Engineering, University of Essex, Colchester CO4 3SQ, UK; [email protected] * Correspondence: [email protected] Abstract: Electroencephalography (EEG) signal classification is a challenging task due to the low signal-to-noise ratio and the usual presence of artifacts from different sources. Different classification techniques, which are usually based on a predefined set of features extracted from the EEG band power distribution profile, have been previously proposed. However, the classification of EEG still remains a challenge, depending on the experimental conditions and the responses to be captured. In this context, the use of deep neural networks offers new opportunities to improve the classification performance without the use of a predefined set of features. Nevertheless, Deep Learning architec- Citation: Aquino-Brítez, D.; Ortiz, tures include a vast number of hyperparameters on which the performance of the model relies. In this A.; Ortega, J.; León, J.; Formoso, M.; paper, we propose a method for optimizing Deep Learning models, not only the hyperparameters, Gan, J.Q.; Escobar, J.J. -
A Neuroevolution Approach to Imitating Human-Like Play in Ms
A Neuroevolution Approach to Imitating Human-Like Play in Ms. Pac-Man Video Game Maximiliano Miranda, Antonio A. S´anchez-Ruiz, and Federico Peinado Departamento de Ingenier´ıadel Software e Inteligencia Artificial Universidad Complutense de Madrid c/ Profesor Jos´eGarc´ıaSantesmases 9, 28040 Madrid (Spain) [email protected] - [email protected] - [email protected] http://www.narratech.com Abstract. Simulating human behaviour when playing computer games has been recently proposed as a challenge for researchers on Artificial Intelligence. As part of our exploration of approaches that perform well both in terms of the instrumental similarity measure and the phenomeno- logical evaluation by human spectators, we are developing virtual players using Neuroevolution. This is a form of Machine Learning that employs Evolutionary Algorithms to train Artificial Neural Networks by consider- ing the weights of these networks as chromosomes for the application of genetic algorithms. Results strongly depend on the fitness function that is used, which tries to characterise the human-likeness of a machine- controlled player. Designing this function is complex and it must be implemented for specific games. In this work we use the classic game Ms. Pac-Man as an scenario for comparing two different methodologies. The first one uses raw data extracted directly from human traces, i.e. the set of movements executed by a real player and their corresponding time stamps. The second methodology adds more elaborated game-level parameters as the final score, the average distance to the closest ghost, and the number of changes in the player's route. We assess the impor- tance of these features for imitating human-like play, aiming to obtain findings that would be useful for many other games. -
Neuro-Crítica: Un Aporte Al Estudio De La Etiología, Fenomenología Y Ética Del Uso Y Abuso Del Prefijo Neuro1
JAHR ǀ Vol. 4 ǀ No. 7 ǀ 2013 Professional article Amir Muzur, Iva Rincic Neuro-crítica: un aporte al estudio de la etiología, fenomenología y ética del uso y abuso del prefijo neuro1 ABSTRACT The last few decades, beside being proclaimed "the decades of the brain" or "the decades of the mind," have witnessed a fascinating explosion of new disciplines and pseudo-disciplines characterized by the prefix neuro-. To the "old" specializations of neurosurgery, neurophysiology, neuropharmacology, neurobiology, etc., some new ones have to be added, which might sound somehow awkward, like neurophilosophy, neuroethics, neuropolitics, neurotheology, neuroanthropology, neuroeconomy, and other. Placing that phenomenon of "neuroization" of all fields of human thought and practice into a context of mostly unjustified and certainly too high – almost millenarianistic – expectations of the science of the brain and mind at the end of the 20th century, the present paper tries to analyze when the use of the prefix neuro- is adequate and when it is dubious. Key words: brain, neuroscience, word coinage RESUMEN Las últimas décadas, además de haber sido declaradas "las décadas del cerebro" o "las décadas de la mente", han sido testigos de una fascinante explosión de nuevas disciplinas y pseudo-disciplinas caracterizadas por el prefijo neuro-. A las "antiguas" especializaciones de la neurocirugía, la neurofisiología, la neurofarmacología, la neurobiología, etc., deben agregarse otras nuevas que pueden sonar algo extrañas, como la neurofilosofía, la neuroética, la neuropolítica, la neuroteología, la neuroantropología, la neuroeconomía, entre otras. El presente trabajo coloca ese fenómeno de "neuroización" de todos los campos del pensamiento y la práctica humana en un contexto de expectativas en general injustificadas y sin duda altísimas (y casi milenarias) en la ciencia del cerebro y la mente a fines del siglo XX; e intenta analizar cuándo el uso del prefijo neuro- es adecuado y cuándo es cuestionable. -
Machine Learning for Neuroscience Geoffrey E Hinton
Hinton Neural Systems & Circuits 2011, 1:12 http://www.neuralsystemsandcircuits.com/content/1/1/12 QUESTIONS&ANSWERS Open Access Machine learning for neuroscience Geoffrey E Hinton What is machine learning? Could you provide a brief description of the Machine learning is a type of statistics that places parti- methods of machine learning? cular emphasis on the use of advanced computational Machine learning can be divided into three parts: 1) in algorithms. As computers become more powerful, and supervised learning, the aim is to predict a class label or modern experimental methods in areas such as imaging a real value from an input (classifying objects in images generate vast bodies of data, machine learning is becom- or predicting the future value of a stock are examples of ing ever more important for extracting reliable and this type of learning); 2) in unsupervised learning, the meaningful relationships and for making accurate pre- aim is to discover good features for representing the dictions. Key strands of modern machine learning grew input data; and 3) in reinforcement learning, the aim is out of attempts to understand how large numbers of to discover what action should be performed next in interconnected, more or less neuron-like elements could order to maximize the eventual payoff. learn to achieve behaviourally meaningful computations and to extract useful features from images or sound What does machine learning have to do with waves. neuroscience? By the 1990s, key approaches had converged on an Machine learning has two very different relationships to elegant framework called ‘graphical models’, explained neuroscience. As with any other science, modern in Koller and Friedman, in which the nodes of a graph machine-learning methods can be very helpful in analys- represent variables such as edges and corners in an ing the data. -
Evolving Neural Networks Using Behavioural Genetic Principles
Evolving Neural Networks Using Behavioural Genetic Principles Maitrei Kohli A dissertation submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy Department of Computer Science & Information Systems Birkbeck, University of London United Kingdom March 2017 0 Declaration This thesis is the result of my own work, except where explicitly acknowledged in the text. Maitrei Kohli …………………. 1 ABSTRACT Neuroevolution is a nature-inspired approach for creating artificial intelligence. Its main objective is to evolve artificial neural networks (ANNs) that are capable of exhibiting intelligent behaviours. It is a widely researched field with numerous successful methods and applications. However, despite its success, there are still open research questions and notable limitations. These include the challenge of scaling neuroevolution to evolve cognitive behaviours, evolving ANNs capable of adapting online and learning from previously acquired knowledge, as well as understanding and synthesising the evolutionary pressures that lead to high-level intelligence. This thesis presents a new perspective on the evolution of ANNs that exhibit intelligent behaviours. The novel neuroevolutionary approach presented in this thesis is based on the principles of behavioural genetics (BG). It evolves ANNs’ ‘general ability to learn’, combining evolution and ontogenetic adaptation within a single framework. The ‘general ability to learn’ was modelled by the interaction of artificial genes, encoding the intrinsic properties of the ANNs, and the environment, captured by a combination of filtered training datasets and stochastic initialisation weights of the ANNs. Genes shape and constrain learning whereas the environment provides the learning bias; together, they provide the ability of the ANN to acquire a particular task. -
Evolving Autonomous Agent Controllers As Analytical Mathematical Models
Evolving Autonomous Agent Controllers as Analytical Mathematical Models Paul Grouchy1 and Gabriele M.T. D’Eleuterio1 1University of Toronto Institute for Aerospace Studies, Toronto, Ontario, Canada M3H 5T6 [email protected] Downloaded from http://direct.mit.edu/isal/proceedings-pdf/alife2014/26/681/1901537/978-0-262-32621-6-ch108.pdf by guest on 27 September 2021 Abstract a significant amount of time and computational resources. Furthermore, any such design decisions introduce additional A novel Artificial Life paradigm is proposed where experimenter bias into the simulation. Finally, complex be- autonomous agents are controlled via genetically- haviors require complex representations, adding to the al- encoded Evolvable Mathematical Models (EMMs). Agent/environment inputs are mapped to agent outputs ready high computational costs of ALife algorithms while via equation trees which are evolved using Genetic Pro- further obfuscating the inner workings of the evolved agents. gramming. Equations use only the four basic mathematical We propose Evolvable Mathematical Models (EMMs), a operators: addition, subtraction, multiplication and division. novel paradigm whereby autonomous agents are evolved via Experiments on the discrete Double-T Maze with Homing GP as analytical mathematical models of behavior. Agent problem are performed; the source code has been made available. Results demonstrate that autonomous controllers inputs are mapped to outputs via a system of genetically with learning capabilities can be evolved as analytical math- encoded equations. Only the four basic mathematical op- ematical models of behavior, and that neuroplasticity and erations (addition, subtraction, multiplication and division) neuromodulation can emerge within this paradigm without are required, as they can be used to approximate any ana- having these special functionalities specified a priori. -
Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning
Deep Neuroevolution: Genetic Algorithms are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning Felipe Petroski Such Vashisht Madhavan Edoardo Conti Joel Lehman Kenneth O. Stanley Jeff Clune Uber AI Labs ffelipe.such, [email protected] Abstract 1. Introduction Deep artificial neural networks (DNNs) are typ- A recent trend in machine learning and AI research is that ically trained via gradient-based learning al- old algorithms work remarkably well when combined with gorithms, namely backpropagation. Evolution sufficient computing resources and data. That has been strategies (ES) can rival backprop-based algo- the story for (1) backpropagation applied to deep neu- rithms such as Q-learning and policy gradients ral networks in supervised learning tasks such as com- on challenging deep reinforcement learning (RL) puter vision (Krizhevsky et al., 2012) and voice recog- problems. However, ES can be considered a nition (Seide et al., 2011), (2) backpropagation for deep gradient-based algorithm because it performs neural networks combined with traditional reinforcement stochastic gradient descent via an operation sim- learning algorithms, such as Q-learning (Watkins & Dayan, ilar to a finite-difference approximation of the 1992; Mnih et al., 2015) or policy gradient (PG) methods gradient. That raises the question of whether (Sehnke et al., 2010; Mnih et al., 2016), and (3) evolution non-gradient-based evolutionary algorithms can strategies (ES) applied to reinforcement learning bench- work at DNN scales. Here we demonstrate they marks (Salimans et al., 2017). One common theme is can: we evolve the weights of a DNN with a sim- that all of these methods are gradient-based, including ES, ple, gradient-free, population-based genetic al- which involves a gradient approximation similar to finite gorithm (GA) and it performs well on hard deep differences (Williams, 1992; Wierstra et al., 2008; Sali- RL problems, including Atari and humanoid lo- mans et al., 2017). -
Acetylcholine in Cortical Inference
Neural Networks 15 (2002) 719–730 www.elsevier.com/locate/neunet 2002 Special issue Acetylcholine in cortical inference Angela J. Yu*, Peter Dayan Gatsby Computational Neuroscience Unit, University College London, 17 Queen Square, London WC1N 3AR, UK Received 5 October 2001; accepted 2 April 2002 Abstract Acetylcholine (ACh) plays an important role in a wide variety of cognitive tasks, such as perception, selective attention, associative learning, and memory. Extensive experimental and theoretical work in tasks involving learning and memory has suggested that ACh reports on unfamiliarity and controls plasticity and effective network connectivity. Based on these computational and implementational insights, we develop a theory of cholinergic modulation in perceptual inference. We propose that ACh levels reflect the uncertainty associated with top- down information, and have the effect of modulating the interaction between top-down and bottom-up processing in determining the appropriate neural representations for inputs. We illustrate our proposal by means of an hierarchical hidden Markov model, showing that cholinergic modulation of contextual information leads to appropriate perceptual inference. q 2002 Elsevier Science Ltd. All rights reserved. Keywords: Acetylcholine; Perception; Neuromodulation; Representational inference; Hidden Markov model; Attention 1. Introduction medial septum (MS), diagonal band of Broca (DBB), and nucleus basalis (NBM). Physiological studies on ACh Neuromodulators such as acetylcholine (ACh), sero- indicate that its neuromodulatory effects at the cellular tonine, dopamine, norepinephrine, and histamine play two level are diverse, causing synaptic facilitation and suppres- characteristic roles. One, most studied in vertebrate systems, sion as well as direct hyperpolarization and depolarization, concerns the control of plasticity. The other, most studied in all within the same cortical area (Kimura, Fukuda, & invertebrate systems, concerns the control of network Tsumoto, 1999). -
Model-Free Episodic Control
Model-Free Episodic Control Charles Blundell Benigno Uria Alexander Pritzel Google DeepMind Google DeepMind Google DeepMind [email protected] [email protected] [email protected] Yazhe Li Avraham Ruderman Joel Z Leibo Jack Rae Google DeepMind Google DeepMind Google DeepMind Google DeepMind [email protected] [email protected] [email protected] [email protected] Daan Wierstra Demis Hassabis Google DeepMind Google DeepMind [email protected] [email protected] Abstract State of the art deep reinforcement learning algorithms take many millions of inter- actions to attain human-level performance. Humans, on the other hand, can very quickly exploit highly rewarding nuances of an environment upon first discovery. In the brain, such rapid learning is thought to depend on the hippocampus and its capacity for episodic memory. Here we investigate whether a simple model of hippocampal episodic control can learn to solve difficult sequential decision- making tasks. We demonstrate that it not only attains a highly rewarding strategy significantly faster than state-of-the-art deep reinforcement learning algorithms, but also achieves a higher overall reward on some of the more challenging domains. 1 Introduction Deep reinforcement learning has recently achieved notable successes in a variety of domains [23, 32]. However, it is very data inefficient. For example, in the domain of Atari games [2], deep Reinforcement Learning (RL) systems typically require tens of millions of interactions with the game emulator, amounting to hundreds of hours of game play, to achieve human-level performance. As pointed out by [13], humans learn to play these games much faster. This paper addresses the question of how to emulate such fast learning abilities in a machine—without any domain-specific prior knowledge. -
Automatic Synthesis of Working Memory Neural Networks with Neuroevolution Methods Tony Pinville, Stéphane Doncieux
Automatic synthesis of working memory neural networks with neuroevolution methods Tony Pinville, Stéphane Doncieux To cite this version: Tony Pinville, Stéphane Doncieux. Automatic synthesis of working memory neural networks with neu- roevolution methods. Cinquième conférence plénière française de Neurosciences Computationnelles, ”Neurocomp’10”, Aug 2010, Lyon, France. hal-00553407 HAL Id: hal-00553407 https://hal.archives-ouvertes.fr/hal-00553407 Submitted on 10 Mar 2011 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. AUTOMATIC SYNTHESIS OF WORKING MEMORY NEURAL NETWORKS WITH NEUROEVOLUTION METHODS Tony Pinville Stephane´ Doncieux ISIR, CNRS UMR 7222 ISIR, CNRS UMR 7222 Universite´ Pierre et Marie Curie-Paris 6 Universite´ Pierre et Marie Curie-Paris 6 4 place Jussieu, F-75252 4 place Jussieu, F-75252 Paris Cedex 05, France Paris Cedex 05, France email: [email protected] email: [email protected] ABSTRACT used in particular in Evolutionary Robotics to make real or simulated robots exhibit a desired behavior [3]. Evolutionary Robotics is a research field focused on Most evolutionary algorithms optimize a fixed size autonomous design of robots based on evolutionary algo- genotype, whereas neuroevolution methods aim at explor- rithms. -
When Planning to Survive Goes Wrong: Predicting the Future and Replaying the Past in Anxiety and PTSD
Available online at www.sciencedirect.com ScienceDirect When planning to survive goes wrong: predicting the future and replaying the past in anxiety and PTSD 1 3 1,2 Christopher Gagne , Peter Dayan and Sonia J Bishop We increase our probability of survival and wellbeing by lay out this computational problem as a form of approxi- minimizing our exposure to rare, extremely negative events. In mate Bayesian decision theory (BDT) [1] and consider this article, we examine the computations used to predict and how miscalibrated attempts to solve it might contribute to avoid such events and to update our models of the world and anxiety and stress disorders. action policies after their occurrence. We also consider how these computations might go wrong in anxiety disorders and According to BDT, we should combine a probability Post Traumatic Stress Disorder (PTSD). We review evidence distribution over all relevant states of the world with that anxiety is linked to increased simulations of the future estimates of the benefits or costs of outcomes associated occurrence of high cost negative events and to elevated with each state. We must then calculate the course of estimates of the probability of occurrence of such events. We action that delivers the largest long-run expected value. also review psychological theories of PTSD in the light of newer, Individuals can only possibly approximately solve this computational models of updating through replay and problem. To do so, they bring to bear different sources of simulation. We consider whether pathological levels of re- information (e.g. priors, evidence, models of the world) experiencing symptomatology might reflect problems and apply different methods to calculate the expected reconciling the traumatic outcome with overly optimistic priors long-run values of alternate courses of action.