The Landscape of AI Safety and Beneficence Research

16 pages · PDF · 1,020 KB

The Landscape of AI Safety and Beneficence Research: Input for Brainstorming at Beneficial AI 2017∗

Executive Summary

Success in the quest for artificial intelligence has the potential to bring unprecedented benefits to humanity, and it is therefore worthwhile to research how to maximize these benefits while avoiding potential pitfalls. This document gives numerous examples of research topics aimed at ensuring that AI remains robust and beneficial.

[Landscape diagram: a map of the research topics covered in this document, including foundations, verification, validation, security, control, and oversight; the individual topics are enumerated in the Contents below.]

∗This document borrows heavily from MIRI's agent foundations technical agenda [403], Amodei et al.'s concrete problems review [11], MIRI's machine learning technical agenda [437], and FLI's previous research priorities document [371]. This document was written by Richard Mallah and was last edited January 2, 2017. It reflects feedback from Abram Demski, Andrew Critch, Anthony Aguirre, Bart Selman, Jan Leike, Jessica Taylor, Max Tegmark, Meia Chita-Tegmark, Nate Soares, Paul Christiano, Roman Yampolskiy, Stuart Armstrong, Tsvi Benson-Tilsen, Vika Krakovna, and Vincent Conitzer. (ver. 0.41)
Contents

1 Introduction
2 Foundations
  2.1 Foundations of Rational Agency
    2.1.1 Logical Uncertainty
    2.1.2 Theory of Counterfactuals
    2.1.3 Universal Algorithmic Intelligence
    2.1.4 Theory of Ethics
      2.1.4.1 Ethical Motivation
      2.1.4.2 Ontological Value
      2.1.4.3 Human Preference Aggregation
      2.1.4.4 Infinite Ethics
      2.1.4.5 Normative Uncertainty
    2.1.5 Bounded Rationality
  2.2 Consistent Decision Making
    2.2.1 Decision Theory
      2.2.1.1 Logical Counterfactuals
      2.2.1.2 Open Source Game Theory
    2.2.2 Safer Self-Modification
      2.2.2.1 Vingean Reflection
        2.2.2.1.1 Abstractly Reason About Superior Agents
        2.2.2.1.2 Reflective Induction Confidence
        2.2.2.1.3 Löbian Obstacle
      2.2.2.2 Optimal Policy Preservation
      2.2.2.3 Safety Technique Awareness
    2.2.3 Goal Stability
      2.2.3.1 Nontransitive Options
  2.3 Projecting Behavioral Bounds
    2.3.1 Computational Complexity
3 Verification
  3.1 Formal Software Verification
    3.1.1 Verified Component Design Approaches
    3.1.2 Adaptive Control Theory
    3.1.3 Verification of Cyberphysical Systems
    3.1.4 Making Verification More User Friendly
  3.2 Automated Vulnerability Finding
  3.3 Verification of Intelligent Systems
    3.3.1 Verification of Whole AI Systems
    3.3.2 Verification of Machine Learning Components
  3.4 Verification of Recursive Self-Improvement
  3.5 Implementation Testing
4 Validation
  4.1 Averting Instrumental Incentives
    4.1.1 Error-Tolerant Agent Design
    4.1.2 Domesticity
      4.1.2.1 Impact Measures
        4.1.2.1.1 Impact Regularizers
          4.1.2.1.1.1 Defined Impact Regularizer
          4.1.2.1.1.2 Learned Impact Regularizer
        4.1.2.1.2 Follow-on Analysis
        4.1.2.1.3 Avoiding Negative Side Effects
          4.1.2.1.3.1 Penalize Influence
          4.1.2.1.3.2 Multi-agent Approaches
          4.1.2.1.3.3 Reward Uncertainty
        4.1.2.1.4 Values-Based Side Effect Evaluation
      4.1.2.2 Mild Optimization
        4.1.2.2.1 Optimization Measures
        4.1.2.2.2 Quantilization
        4.1.2.2.3 Multiobjective Optimization
      4.1.2.3 Safe Exploration
        4.1.2.3.1 Risk-Sensitive Performance Criteria
        4.1.2.3.2 Simulated Exploration
        4.1.2.3.3 Bounded Exploration
        4.1.2.3.4 Multiobjective Reinforcement Learning
        4.1.2.3.5 Human Oversight
        4.1.2.3.6 Buried Honeypot Avoidance
      4.1.2.4 Computational Humility
        4.1.2.4.1 Generalized Fallibility Awareness
        4.1.2.4.2 Resource Value Uncertainty
        4.1.2.4.3 Temporal Discount Rates
    4.1.3 Averting Paranoia
  4.2 Avoiding Reward Hacking
    4.2.1 Pursuing Environmental Goals
      4.2.1.1 Environmental Goals
      4.2.1.2 Predicting Future Observations
      4.2.1.3 Model Lookahead
      4.2.1.4 History Analysis
      4.2.1.5 Detecting Channel Switching
    4.2.2 Counterexample Resistance
    4.2.3 Adversarial Reward Functions
    4.2.4 Generalizing Avoiding Reward Hacking
    4.2.5 Adversarial Blinding
    4.2.6 Variable Indifference
    4.2.7 Careful Engineering
    4.2.8 Reward Capping
    4.2.9 Reward Pretraining
    4.2.10 Multiple Rewards
  4.3 Value Alignment
    4.3.1 Ethics Mechanisms …
Recommended publications
  • FOCS 2005 Program SUNDAY October 23, 2005
FOCS 2005 Program

The 46th Annual IEEE Symposium on Foundations of Computer Science
October 22-25, 2005, Omni William Penn Hotel, Pittsburgh, PA
Sponsored by the IEEE Computer Society Technical Committee on Mathematical Foundations of Computing, in cooperation with ACM SIGACT. FOCS '05 gratefully acknowledges financial support from Microsoft Research, Yahoo! Research, and the CMU Aladdin center.

SATURDAY October 22, 2005
Tutorials held at CMU University Center. Reception at Omni William Penn Hotel, Monongahela Room, 17th floor.
Tutorial 1: 1:30pm – 3:30pm (McConomy Auditorium). Chair: Irit Dinur
  Subhash Khot, On the Unique Games Conjecture
Break 3:30pm – 4:00pm

SUNDAY October 23, 2005
Talks in Grand Ballroom, 17th floor
Session 1: 8:50am – 10:10am. Chair: Éva Tardos
  8:50 Agnostically Learning Halfspaces. Adam Kalai, Adam Klivans, Yishay Mansour and Rocco Servedio
  9:10 Noise stability of functions with low influences: invariance and optimality. Elchanan Mossel, Ryan O'Donnell and Krzysztof Oleszkiewicz
  9:30 Every decision tree has an influential variable. Ryan O'Donnell, Michael Saks, Oded Schramm and Rocco Servedio
  9:50 Lower Bounds for the Noisy Broadcast Problem. Navin Goyal, Guy Kindler and Michael Saks
Break 10:10am – 10:30am
Session 2: 10:30am – 12:10pm. Chair: Satish Rao
  10:30 The Unique Games Conjecture, Integrality Gap for Cut Problems and Embeddability of Negative Type Metrics into ℓ1 [Best paper award]. Subhash Khot and Nisheeth Vishnoi
  10:50 The Closest Substring problem with small distances. Daniel Marx
  11:10 Fitting tree metrics: Hierarchical clustering and Phylogeny. Nir Ailon and Moses Charikar
  11:30 Metric Embeddings with Relaxed Guarantees. Ittai Abraham, Yair Bartal, T-H.
  • Hierarchical Clustering with Global Objectives: Approximation Algorithms and Hardness Results
HIERARCHICAL CLUSTERING WITH GLOBAL OBJECTIVES: APPROXIMATION ALGORITHMS AND HARDNESS RESULTS

A DISSERTATION SUBMITTED TO THE DEPARTMENT OF COMPUTER SCIENCE AND THE COMMITTEE ON GRADUATE STUDIES OF STANFORD UNIVERSITY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

Evangelos Chatziafratis
June 2020

© 2020 by Evangelos Chatziafratis. All Rights Reserved. Re-distributed by Stanford University under license with the author. This work is licensed under a Creative Commons Attribution-Noncommercial 3.0 United States License. http://creativecommons.org/licenses/by-nc/3.0/us/ This dissertation is online at: http://purl.stanford.edu/bb164pj1759

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy. Tim Roughgarden, Primary Adviser

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy. Moses Charikar, Co-Adviser

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy. Li-Yang Tan

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy. Gregory Valiant

Approved for the Stanford University Committee on Graduate Studies. Stacey F. Bent, Vice Provost for Graduate Education

This signature page was generated electronically upon submission of this dissertation in electronic format.
  • Classification Schemas for Artificial Intelligence Failures
Classification Schemas for Artificial Intelligence Failures
Peter J. Scott (Next Wave Institute, USA) and Roman V. Yampolskiy (University of Louisville, Kentucky, USA)
[email protected], [email protected]

Abstract: In this paper we examine historical failures of artificial intelligence (AI) and propose a classification scheme for categorizing future failures. By doing so we hope that (a) the responses to future failures can be improved through applying a systematic classification that can be used to simplify the choice of response and (b) future failures can be reduced through augmenting development lifecycles with targeted risk assessments.

Keywords: artificial intelligence, failure, AI safety, classification

1. Introduction

Artificial intelligence (AI) is estimated to have a $4-6 trillion market value [1] and employ 22,000 PhD researchers [2]. It is estimated to create 133 million new roles by 2022 but to displace 75 million jobs in the same period [6]. Projections for the eventual impact of AI on humanity range from utopia (Kurzweil, 2005) (p.487) to extinction (Bostrom, 2005). In many respects AI development outpaces the efforts of prognosticators to predict its progress and is inherently unpredictable (Yampolskiy, 2019). Yet all AI development is (so far) undertaken by humans, and the field of software development is noteworthy for unreliability of delivering on promises: over two-thirds of companies are more likely than not to fail in their IT projects [4]. As much effort as has been put into the discipline of software safety, it still has far to go. Against this background of rampant failures we must evaluate the future of a technology that could evolve to human-like capabilities, usually known as artificial general intelligence (AGI).
  • Letters to the Editor
Articles: Letters to the Editor

Research Priorities for Robust and Beneficial Artificial Intelligence: An Open Letter

Artificial intelligence (AI) research has explored a variety of problems and approaches since its inception, but for the last 20 years or so has been focused on the problems surrounding the construction of intelligent agents — systems that perceive and act in some environment. In this context, "intelligence" is related to statistical and economic notions of rationality — colloquially, the ability to make good decisions, plans, or inferences.

… is a product of human intelligence; we cannot predict what we might achieve when this intelligence is magnified by the tools AI may provide, but the eradication of disease and poverty are not unfathomable. Because of the great potential of AI, it is important to research how to reap its benefits while avoiding potential pitfalls. The progress in AI research makes it timely to focus research not only on making AI more capable, but also on maximizing the societal benefit of AI. Such considerations motivated the AAAI 2008–09 Presidential Panel on Long-Term AI Futures and other projects on AI impacts, and constitute a sig…

… computer scientists, innovators, entrepreneurs, statisticians, journalists, engineers, authors, professors, teachers, students, CEOs, economists, developers, philosophers, artists, futurists, physicists, filmmakers, health-care professionals, research analysts, and members of many other fields. The earliest signatories follow, reproduced in order and as they signed. For the complete list, see tinyurl.com/ailetter. - ed.

Stuart Russell, Berkeley, Professor of Computer Science, director of the Center for Intelligent Systems, and coauthor of the standard textbook Artificial Intelligence: a Modern Approach
Tom Dietterich, Oregon State, President of
  • Exploratory Engineering in AI
MIRI MACHINE INTELLIGENCE RESEARCH INSTITUTE

Exploratory Engineering in AI
Luke Muehlhauser, Machine Intelligence Research Institute
Bill Hibbard, University of Wisconsin Madison, Space Science and Engineering Center

Muehlhauser, Luke and Bill Hibbard. 2013. "Exploratory Engineering in AI." Communications of the ACM, Vol. 57 No. 9, Pages 32–34. doi:10.1145/2644257. This version contains minor changes.

We regularly see examples of new artificial intelligence (AI) capabilities. Google's self-driving car has safely traversed thousands of miles. Watson beat the Jeopardy! champions, and Deep Blue beat the chess champion. Boston Dynamics' Big Dog can walk over uneven terrain and right itself when it falls over. From many angles, software can recognize faces as well as people can. As their capabilities improve, AI systems will become increasingly independent of humans. We will be no more able to monitor their decisions than we are now able to check all the math done by today's computers. No doubt such automation will produce tremendous economic value, but will we be able to trust these advanced autonomous systems with so much capability? For example, consider the autonomous trading programs which lost Knight Capital $440 million (pre-tax) on August 1st, 2012, requiring the firm to quickly raise $400 million to avoid bankruptcy (Valetkevitch and Mikolajczak 2012). This event undermines a common view that AI systems cannot cause much harm because they will only ever be tools of human masters. Autonomous trading programs make millions of trading decisions per day, and they were given sufficient capability to nearly bankrupt one of the largest traders in U.S.
  • On the Differences Between Human and Machine Intelligence
On the Differences between Human and Machine Intelligence
Roman V. Yampolskiy, Computer Science and Engineering, University of Louisville
[email protected]

Abstract: Terms Artificial General Intelligence (AGI) and Human-Level Artificial Intelligence (HLAI) have been used interchangeably to refer to the Holy Grail of Artificial Intelligence (AI) research, creation of a machine capable of achieving goals in a wide range of environments. However, widespread implicit assumption of equivalence between capabilities of AGI and HLAI appears to be unjustified, as humans are not general intelligences. In this paper, we will prove this distinction.

1 Introduction

Imagine that tomorrow a prominent technology company announces that they have successfully created an Artificial Intelligence (AI) and offers for you to test it out.

… [Legg and Hutter, 2007a]. However, widespread implicit assumption of equivalence between capabilities of AGI and HLAI appears to be unjustified, as humans are not general intelligences. In this paper, we will prove this distinction. Others use slightly different nomenclature with respect to general intelligence, but arrive at similar conclusions. "Local generalization, or "robustness": … "adaptation to known unknowns within a single task or well-defined set of tasks". … Broad generalization, or "flexibility": "adaptation to unknown unknowns across a broad category of related tasks". … Extreme generalization: human-centric extreme generalization, which is the specific case where the scope considered is the space of tasks and domains that fit within the human experience. We … refer to "human-centric extreme generalization" as "generality". Importantly, as we deliberately define generality here by using human cog…
  • The Future of AI: Opportunities and Challenges
The Future of AI: Opportunities and Challenges
Puerto Rico, January 2-5, 2015

Ajay Agrawal is the Peter Munk Professor of Entrepreneurship at the University of Toronto's Rotman School of Management, Research Associate at the National Bureau of Economic Research in Cambridge, MA, Founder of the Creative Destruction Lab, and Co-founder of The Next 36. His research is focused on the economics of science and innovation. He serves on the editorial boards of Management Science, the Journal of Urban Economics, and The Strategic Management Journal.

Anthony Aguirre has worked on a wide variety of topics in theoretical cosmology, ranging from intergalactic dust to galaxy formation to gravity physics to the large-scale structure of inflationary universes and the arrow of time. He also has strong interest in science outreach, and has appeared in numerous science documentaries. He is a co-founder of the Foundational Questions Institute and the Future of Life Institute.

Geoff Anders is the founder of Leverage Research, a research institute that studies psychology, cognitive enhancement, scientific methodology, and the impact of technology on society. He is also a member of the Effective Altruism movement, a movement dedicated to improving the world in the most effective ways. Like many of the members of the Effective Altruism movement, Geoff is deeply interested in the potential impact of new technologies, especially artificial intelligence.

Blaise Agüera y Arcas works on machine learning at Google. Previously a Distinguished Engineer at Microsoft, he has worked on augmented reality, mapping, wearable computing and natural user interfaces. He was the co-creator of Photosynth, software that assembles photos into 3D environments.
  • Constraint Clustering and Parity Games
Constrained Clustering Problems and Parity Games
Clemens Rösner, born in Ulm

Dissertation submitted for the degree of Doctor of Natural Sciences (Dr. rer. nat.) to the Faculty of Mathematics and Natural Sciences of the Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn 2019. First reviewer: Prof. Dr. Heiko Röglin. Second reviewer: Prof. Dr. Anne Driemel. Date of the oral examination: 5 September 2019. Year of publication: 2019. Prepared with the approval of the Faculty of Mathematics and Natural Sciences of the Rheinische Friedrich-Wilhelms-Universität Bonn.

Abstract: Clustering is a fundamental tool in data mining. It partitions points into groups (clusters) and may be used to make decisions for each point based on its group. We study several clustering objectives. We begin with studying the Euclidean k-center problem. The k-center problem is a classical combinatorial optimization problem which asks to select k centers and assign each input point in a set P to one of the centers, such that the maximum distance of any input point to its assigned center is minimized. The Euclidean k-center problem assumes that the input set P is a subset of a Euclidean space and that each location in the Euclidean space can be chosen as a center. We focus on the special case with k = 1, the smallest enclosing ball problem: given a set of points in m-dimensional Euclidean space, find the smallest sphere enclosing all the points. We combine known results about convex optimization with structural properties of the smallest enclosing ball to create a new algorithm. We show that on instances with rational coefficients our new algorithm computes the exact center of the optimal solutions and has a worst-case run time that is polynomial in the size of the input.
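For readers unfamiliar with the smallest enclosing ball (1-center) problem described in the abstract, a minimal sketch may help. The code below is only an illustration of the problem, not the dissertation's exact algorithm (which combines convex optimization with structural properties of the optimal ball); it uses the simple iterative approximation attributed to Bădoiu and Clarkson, and the function name and sample data are assumptions made for the example.

```python
import numpy as np

def smallest_enclosing_ball_approx(points: np.ndarray, iterations: int = 1000):
    """Approximate the 1-center (smallest enclosing ball) of a point set.

    Starts at an arbitrary point and repeatedly steps toward the current
    farthest point with a shrinking step size; the center estimate
    approaches the optimal center as the number of iterations grows.
    """
    center = points[0].astype(float)
    for i in range(1, iterations + 1):
        # Find the point farthest from the current center estimate.
        distances = np.linalg.norm(points - center, axis=1)
        farthest = points[np.argmax(distances)]
        # Move a 1/(i+1) fraction of the way toward that farthest point.
        center += (farthest - center) / (i + 1)
    radius = np.linalg.norm(points - center, axis=1).max()
    return center, radius

if __name__ == "__main__":
    # Example: 200 random points in 3-dimensional Euclidean space.
    rng = np.random.default_rng(0)
    pts = rng.normal(size=(200, 3))
    center, radius = smallest_enclosing_ball_approx(pts)
    print("approximate center:", center, "radius:", radius)
```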
  • Public Response to RFI on AI
White House Office of Science and Technology Policy
Request for Information on the Future of Artificial Intelligence
Public Responses, September 1, 2016

Respondent 1: Chris Nicholson, Skymind Inc.

This submission will address topics 1, 2, 4 and 10 in the OSTP's RFI:
• the legal and governance implications of AI
• the use of AI for public good
• the social and economic implications of AI
• the role of "market-shaping" approaches

Governance, anomaly detection and urban systems

The fundamental task in the governance of urban systems is to keep them running; that is, to maintain the fluid movement of people, goods, vehicles and information throughout the system, without which it ceases to function. Breakdowns in the functioning of these systems and their constituent parts are therefore of great interest, whether it be their energy, transport, security or information infrastructures. Those breakdowns may result from deteriorations in the physical plant, sudden and unanticipated overloads, natural disasters or adversarial behavior. In many cases, municipal governments possess historical data about those breakdowns and the events that precede them, in the form of activity and sensor logs, video, and internal or public communications. Where they don't possess such data already, it can be gathered. Such datasets are a tremendous help when applying learning algorithms to predict breakdowns and system failures. With enough lead time, those predictions make pre-emptive action possible, action that would cost cities much less than recovery efforts in the wake of a disaster. Our choice is between an ounce of prevention or a pound of cure. Even in cases where we don't have data covering past breakdowns, algorithms exist to identify anomalies in the data we begin gathering now.
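To make the last point concrete, one simple family of such algorithms flags readings that deviate sharply from a robust baseline. The sketch below is an illustration of that idea, not part of the submission; the function name, threshold, and sample load values are assumptions chosen for the example.

```python
import numpy as np

def flag_anomalies(readings: np.ndarray, threshold: float = 3.5) -> np.ndarray:
    """Flag anomalous sensor readings using a robust (median/MAD) z-score.

    Returns a boolean mask marking readings whose modified z-score exceeds
    the threshold; 3.5 is a common rule-of-thumb cutoff.
    """
    median = np.median(readings)
    mad = np.median(np.abs(readings - median))  # median absolute deviation
    if mad == 0:
        return np.zeros_like(readings, dtype=bool)
    modified_z = 0.6745 * (readings - median) / mad
    return np.abs(modified_z) > threshold

# Example: hourly load on a piece of urban infrastructure with one overload spike.
load = np.array([52.0, 49.5, 51.2, 50.8, 48.9, 95.3, 50.1, 49.7])
print(flag_anomalies(load))  # only the 95.3 overload reading is flagged
```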
  • AI and You Transcript Guest: Roman Yampolskiy Episode 16 First Aired: Monday, October 5, 2020
    AI and You Transcript Guest: Roman Yampolskiy Episode 16 First Aired: Monday, October 5, 2020 Hi, welcome to episode 16. Today we have a feast, because I am talking with Dr. Roman Yampolskiy, a legend within the AI community. He is a tenured associate professor in the department of Computer Engineering and Computer Science at the University of Louisville in Kentucky where he is the director of Cyber Security. Look up his home page and Wikipedia entry because he has qualifications too numerous to detail here. Some of those include that he is also a Visiting Fellow at the Singularity Institute, he was also one of the scientists at the Asilomar conference on the future of AI, and the author of over 100 publications, including the book Artificial Superintelligence: A Futuristic Approach, which I would strongly recommend, and many peer-reviewed papers, including one coauthored with yours truly. Most of them are on the subject of AI Safety. In this episode we will talk mostly about a recent paper of his, a table-thumping 73 pages titled “On Controllability of Artificial Intelligence,” which you may recognize is talking about what we call the Control Problem. That’s the most fundamental problem in the whole study of the long-term impact of AI, and Roman explains it very well in this episode. If you’re looking at the possible future of artificial intelligence having serious negative as well as positive consequences for everyone, which is a concept that’s gotten a lot of popularization in the last few years, of course, and you’re finding computer professionals pooh-poohing that idea because AI right now requires a great deal of specialized effort on their part to do even narrow tasks, and the idea of it becoming an existential threat is just too sensationalist, the product of watching too many Terminator movies, then pay attention to Roman, because he has impeccable computer science credentials and yet does not shy away from exploring fundamental threats of AI.
  • The Age of Artificial Intelligence an Exploration
THE AGE OF ARTIFICIAL INTELLIGENCE: AN EXPLORATION
Edited by Steven S. Gouveia, University of Minho, Portugal
Cognitive Science and Psychology

Copyright © 2020 by the Authors. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior permission of Vernon Art and Science Inc. www.vernonpress.com

In the Americas: Vernon Press, 1000 N West Street, Suite 1200, Wilmington, Delaware 19801, United States. In the rest of the world: Vernon Press, C/Sancti Espiritu 17, Malaga, 29006, Spain.

Cognitive Science and Psychology. Library of Congress Control Number: 2020931461. ISBN: 978-1-62273-872-4

Product and company names mentioned in this work are the trademarks of their respective owners. While every care has been taken in preparing this work, neither the authors nor Vernon Art and Science Inc. may be held responsible for any loss or damage caused or alleged to be caused directly or indirectly by the information contained in it. Every effort has been made to trace all copyright holders, but if any have been inadvertently overlooked the publisher will be pleased to include any necessary credits in any subsequent reprint or edition. Cover design by Vernon Press using elements designed by FreePik.

TABLE OF CONTENTS
LIST OF ACRONYMS
LIST OF FIGURES
LIST OF TABLES
INTRODUCTION
SECTION I: INTELLIGENCE IN ARTIFICIAL INTELLIGENCE
CHAPTER 1: TOWARDS THE MATHEMATICS OF INTELLIGENCE. Soenke Ziesche (Maldives National University, Maldives) and Roman V. Yampolskiy (University of Louisville, USA)
CHAPTER 2: MINDS, BRAINS AND TURING. Stevan Harnad (Université du Québec à Montréal, Canada; University of Southampton, UK)
CHAPTER 3: THE AGE OF POST-INTELLIGENT DESIGN. Daniel C.
  • Leakproofing the Singularity Artificial Intelligence Confinement Problem
Roman V. Yampolskiy
Leakproofing the Singularity: Artificial Intelligence Confinement Problem

Abstract: This paper attempts to formalize and to address the 'leakproofing' of the Singularity problem presented by David Chalmers. The paper begins with the definition of the Artificial Intelligence Confinement Problem. After analysis of existing solutions and their shortcomings, a protocol is proposed aimed at making a more secure confinement environment which might delay potential negative effect from the technological singularity while allowing humanity to benefit from the superintelligence.

Keywords: AI-Box, AI Confinement Problem, Hazardous Intelligent Software, Leakproof Singularity, Oracle AI.

'I am the slave of the lamp' (Genie from Aladdin)

1. Introduction

With the likely development of superintelligent programs in the near future, many scientists have raised the issue of safety as it relates to such technology (Yudkowsky, 2008; Bostrom, 2006; Hibbard, 2005; Chalmers, 2010; Hall, 2000). A common theme in Artificial Intelligence (AI) [1] safety research is the possibility of keeping a superintelligent agent in a sealed hardware so as to prevent it from doing any harm to humankind. Such ideas originate with scientific visionaries such as Eric Drexler, who has suggested confining transhuman machines so that their outputs could be studied and used safely (Drexler, 1986).

Correspondence: Roman V. Yampolskiy, Department of Computer Engineering and Computer Science, University of Louisville. Email: [email protected]
[1] In this paper the term AI is used to represent superintelligence.

Journal of Consciousness Studies, 19, No. 1–2, 2012, pp. 194–214. Copyright © Imprint Academic 2011.