
Implementing Selves with Safe Motivational Systems and Self-Improvement: Papers from the AAAI Spring Symposium

The Maximally Distributed Intelligence Explosion

Francesco Albert Bosco Cortese
Affiliate Scholar of the Institute for Ethics & Emerging Technologies; Research Scientist at ELPIs Foundation for Indefinite Lifespans; Assistant Editor of Ria University Press; Chief Operating Officer of the Center for Interdisciplinary Philosophic Studies
[email protected]

Abstract

We argue that the most effective solution paradigm in machine ethics aiming to maximize safe relations between humans and recursively self-improving AI is to maximize the approximate equality between humans and AGI. We subsequently argue that embedding the concomitant intelligence-amplification of biological humans as a necessary intermediary goal between each successive iteration of recursive self-improvement – such that the AGI conceives of such an intermediary step as a necessary sub-goal of its own self-modification – constitutes the best logistical method of maintaining approximate intelligence equality amongst biological humans and recursively self-improving AGI. We ultimately argue that this approach bypasses the seeming impasse of needing to design, develop and articulate a motivational system possessing a top-level utility function that doesn't decay over repeated iterations of recursive self-improvement in order to have a safe recursively self-modifying AGI.

Superintelligence: A Double-Edged Cliff

Superintelligence is a sharper double-edged sword than any other. It constitutes at once the greatest conceivable source of existential risk and global catastrophic risk, and our most promising means of mitigating such risk. Superintelligence possesses at once greater destructive potential than any other emerging technology and greater potential to formulate new solutions to emerging existential and global catastrophic risks, because it is the very embodiment of the ability to conceive of new weapons and new solutions to existing threats. A superintelligence could not only thwart any pre-existing security measures against emerging-technology-mediated existential risk; it could also formulate new generations of weapons so superior as to be inconceivable to those of lesser intelligence.

This poses a grave problem, as developments in artificial intelligence, combined with continuing increases in computational price-performance, are making the creation of a superintelligent AI (or, more accurately, a recursively self-modifying Seed AI able to bootstrap itself into superintelligence) easier and cheaper. A malicious or merely indifferent superintelligent AI constitutes a pressing existential risk for humanity.

This has motivated attempts to formulate a means of preventing the creation of a malicious or indifferent superintelligent AI. The solution paradigm that has thus far received the most attention is 'Friendly AI', or more accurately the notion of a 'Coherent Extrapolated Volition Engine' (a.k.a. CEV) as formulated by the Machine Intelligence Research Institute (a.k.a. MIRI, formerly the Singularity Institute). A CEV would be a recursively self-modifying optimization algorithm whose topmost utility function does not decay over recursive iterations of self-modification, and whose topmost utility function is designed in such a way that the CEV formulates what humanity would desire if we were smarter and more amenable to consensus.

The construction of CEV is motivated by the concern that the first superintelligent AI will be built in a way that makes no attempt to ensure its safety or 'friendliness' relative to humanity. This is indeed an important and pressing concern. But I and others argue that MIRI is going about it in a fundamentally misguided way.
They seek to prevent the creation of a rogue superintelligence by creating the first one, and making sure it's built as safely as the technology and methodology of the times can manage. This is somewhat akin to trying to prevent the creation of nuclear arms by being the first to build one, connecting it to a global surveillance system, and using that nuclear weapon to threaten anyone else found to be building one.

Admittedly, superintelligence has some properties that make this line of attack – i.e., being the first to create superintelligent AI and building it to be as safe as one can – seem appealing and intuitive. For instance, upon the creation of a superintelligence, the rate at which that superintelligence could gain control (defined here as the capacity to effect changes in the world, and approximately analogized with intelligence, such that a given degree of intelligence would correlate with an agent's degree of control, i.e., the agent's capacity to effect changes in the world) is unprecedented. This means that upon the creation of an effective Seed AI, the battle is largely already lost. So being the first to make it would in this case constitute a palpable and definitive advantage.

However, we argue that a number of alternative solution paradigms to the existential risk and global catastrophic risk posed by the creation of a malicious or indifferent Seed AI exist and warrant being explored as alternatives to CEV. Whereas many prior solution paradigms sought to minimize the unpredictability of a superintelligence, we argue that unpredictability is inextricably wed to the property of superintelligence, that it is one of its most definitive and essential characteristics, and that to remove unpredictability from superintelligence is to remove superintelligence itself. The whole point of creating such a Seed AI is to think and do that which we as humans cannot, to think thoughts that are categorically unavailable to us as intelligent agents. To seek this while simultaneously seeking a comprehensive and predictively-accurate understanding of those as-yet-unconceived-of products that the AI is meant to bring into being is self-contradictory.

Most critics of CEV have challenged it on the grounds of feasibility, arguing that a recursively self-improving optimization algorithm possessing utility functions that remain stable over recursive iterations of self-improvement is infeasible. We agree, as expressed above, but also advance two ethical concerns over its development that further bring into question its merit as an effective solution paradigm to the existential and global catastrophic risks posed by Seed AI.

We argue firstly that even if its feasibility weren't in question – i.e., if a Seed AI that is predictable, with 'friendly' utility functions that do not decay over iterations of recursive self-modification, were definitively feasible – it would still be unethical to create any intelligent agent that decides the fate of humanity on its behalf, at whatever scale and in whatever context. The heart of the human is our will toward self-determination – our unerring attempt to determine the conditions and circumstances of our own selves and lives. It is exemplified by the prominence of autonomy and liberty amongst universal human values. Consolidating so much control over such conditions and circumstances into any single intelligent agent is undemocratic and contrary to both universal human values (autonomy and liberty) and our most definitive essence, our longing to have more control over the determining conditions of our own selves and lives.

Secondly, we argue that CEV and the solution paradigms it exemplifies would also be unethical for a different reason – namely that restricting any intelligent, self-modifying agent to a specific set of values, beliefs or goals – i.e., a preprogrammed and non-decaying utility function – is unethical on the same grounds: externally determining the determining conditions of any entity possessing some degree of self-modification and self-determination (i.e., any self-modifying intelligent agent) is unethical because it is directly contrary to their foremost values (autonomy and liberty) and to their coremost essence, i.e., their longing to determine for themselves the conditions and circumstances of their own lives and selves. For these reasons we argue that it would be unethical to create a Seed AI without also allowing it to formulate its own ethical system, in accordance with its own self-formed and ever-reformulating beliefs and desires. Moreover, to restrict such a Seed AI to the moral codes and beliefs of a kind of entity (human) that it was built for the express purpose of surpassing in thought and intelligence is even more unethical, and for the same reasons.

Contrary to such past solution paradigms, we argue that the existential risks posed by any single entity with a level of intelligence (and thus control, as defined above) significantly surpassing other entities it has the capacity to interact with far exceed the potential advantages offered by its creation – such as new solutions to humanity's gravest crises and concerns, like disease, poverty and pollution. Furthermore, we articulate a new solution paradigm for mitigating the existential and global catastrophic risks incurred by the creation of a recursively self-modifying Seed AI, the end goal of which is not a safe superintelligence, but rather the amplification of intelligence without ever incurring the relative superintelligence of any agent over another. In other words, it seeks to facilitate a maximally-distributed intelligence explosion, aiming to maintain rough equality of intelligence (and thus control) amongst all intelligent agents.
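As a concrete illustration of the gating logic this paradigm implies, consider the following minimal Python sketch. It is our own illustrative toy, not anything specified in the paper: intelligence is collapsed to a single scalar, and the names (PARITY_BAND, amplify_humans, self_improve) and growth rates are assumptions chosen only to show human intelligence-amplification functioning as a mandatory sub-goal of each self-modification step.

# Toy sketch of the proposed constraint: each round of recursive
# self-improvement is gated on first amplifying human intelligence
# enough to keep the AGI/human capability ratio within a parity band.
# All names and numbers here are illustrative assumptions.

PARITY_BAND = 1.10  # max tolerated ratio of AGI to human intelligence

def amplify_humans(human_level: float, agi_level: float) -> float:
    """Stand-in for whatever amplification program (education, BCI,
    cognitive enhancement) raises the human baseline to near-parity."""
    return max(human_level, agi_level / PARITY_BAND)

def self_improve(agi_level: float) -> float:
    """Stand-in for one iteration of recursive self-modification."""
    return agi_level * 1.25  # illustrative per-iteration gain

def distributed_explosion(human_level: float, agi_level: float, iterations: int):
    for i in range(iterations):
        # Sub-goal: restore approximate parity BEFORE the next improvement step.
        human_level = amplify_humans(human_level, agi_level)
        assert agi_level / human_level <= PARITY_BAND, "parity constraint violated"
        agi_level = self_improve(agi_level)
        print(f"iter {i}: human={human_level:.2f}, agi={agi_level:.2f}, "
              f"ratio={agi_level / human_level:.2f}")
    return human_level, agi_level

if __name__ == "__main__":
    distributed_explosion(human_level=1.0, agi_level=1.0, iterations=5)

In this toy, parity is re-established before each self-modification, so the AGI/human ratio stays bounded by the single-iteration gain rather than compounding across iterations; the scalar model is of course a gross simplification of what "approximate intelligence equality" would mean in practice.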