
From: AAAI Technical Report SS-94-03. Compilation copyright © 1994, AAAI (www.aaai.org). All rights reserved.

The First Law of Robotics (a call to arms)

Oren Etzioni    Daniel Weld*
Department of Computer Science and Engineering
University of Washington
Seattle, WA 98195
{etzioni, weld}@cs.washington.edu

Abstract

Even before the advent of Artificial Intelligence, science fiction writer Isaac Asimov recognized that a robot must place the protection of humans from harm at a higher priority than obeying human orders. Inspired by Asimov, we pose the following fundamental questions: (1) How should one formalize the rich, but informal, notion of "harm"? (2) How can an agent avoid performing harmful actions, and do so in a computationally tractable manner? (3) How should an agent resolve conflict between its goals and the need to avoid harm? (4) When should an agent prevent a human from harming herself? While we address some of these questions in technical detail, the primary goal of this paper is to focus attention on Asimov's concern: society will reject autonomous agents unless we have some credible means of making them safe!

The Three Laws of Robotics:
1. A robot may not injure a human being, or, through inaction, allow a human being to come to harm.
2. A robot must obey orders given it by human beings except where such orders would conflict with the First Law.
3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
    -- Isaac Asimov

Motivation

In 1940, Isaac Asimov stated the First Law of Robotics, capturing an essential insight: a robot should not slavishly obey human commands -- its foremost goal should be to avoid harming humans. Consider the following scenarios:

• A construction robot is instructed to fill a pothole in the road. Although the robot repairs the cavity, it leaves the steam roller, chunks of tar, and an oil slick in the middle of a busy highway.

• A software agent is instructed to reduce disk utilization below 90%. It succeeds, but inspection reveals that the agent deleted irreplaceable LaTeX files without backing them up to tape.

While less dramatic than Asimov's stories, the scenarios illustrate his point: not all ways of satisfying a human order are equally good; in fact, sometimes it is better not to satisfy the order at all. As we begin to deploy agents in environments where they can do some real damage, the time has come to revisit Asimov's Laws. This paper explores the following fundamental questions:

• How should one formalize the notion of "harm"? In Sections … and …, we define dont-disturb and restore, two domain-independent primitives that capture aspects of Asimov's rich but informal notion of harm within the classical planning framework.

• How can an agent avoid performing harmful actions, and do so in a computationally tractable manner? We leverage and extend the familiar mechanisms of planning with subgoal interactions [35, 7, 24, 29] to detect potential harm in polynomial time. In addition, we explain how the agent can avoid harm using tactics such as confrontation and evasion (executing subplans to defuse the threat of harm).

• How should an agent resolve conflict between its goals and the need to avoid harm? We impose a strict hierarchy where dont-disturb constraints override the planner's goals, but restore constraints do not.

• When should an agent prevent a human from harming herself?
In Section …, we show how our framework could be extended to partially address this question.

The paper's main contribution is a "call to arms": before we release autonomous agents into real-world environments, we need some credible and computationally tractable means of making them obey Asimov's First Law.

* We thank Steve Hanks, Nick Kushmerick, Neal Lesh, and Kevin Sullivan for helpful discussions. This research was funded in part by Office of Naval Research Grants 90-J-1904 and 92-J-1946, and by National Science Foundation Grants IRI-8957302, IRI-9211045, and IRI-9357772.

Survey of Possible Solutions

To make intelligent decisions regarding which actions are harmful, and under what circumstances, an agent requires some explicit model of harm. We could provide the agent with an explicit model that induces a partial order over world states (i.e., a utility function). This framework is widely adopted and numerous researchers are attempting to render it computationally tractable [13, 31, 34, 32, 37, 17], but many problems remain to be solved [36]. In many cases, the introduction of utility models transforms planning into an optimization problem -- instead of searching for some plan that satisfies the goal, the agent is seeking the best such plan. In the worst case, the agent may be forced to examine all plans to determine which one is best. In contrast, we have explored a satisficing approach -- our agent will be satisfied with any plan that meets its constraints and achieves its goals. The expressive power of our constraint language is weaker than that of utility functions, but our constraints are easier to incorporate into standard planning algorithms.

By using a general, temporal logic such as that of [33] or [9, Ch. 5], we could specify constraints that would ensure the agent does not cause harm. Before executing an action, we could ask the agent to prove that the action is not harmful. While elegant, this approach is computationally intractable as well. Another alternative would be to use a planner such as ILP [3, 2] or ZENO [27, 28], which supports temporally quantified goals, but, at present, these planners seem too inefficient¹ for our needs.

¹ We have also examined previous work on "plan quality" for ideas, but the bulk of that work has focused on the problem of leveraging a single action to accomplish multiple goals, thereby reducing the number of actions in, and the cost of, the plan [20, 30, 39]. While this class of optimizations is critical in domains such as database query optimization, logistics planning, and others, it does not address our concerns here.

Instead, we aim to make the agent's reasoning about harm more tractable by restricting the content and form of its theory of injury.² We adopt the standard assumptions of classical planning: the agent has complete information about the initial state of the world, the agent is the sole cause of change, and action execution is atomic, indivisible, and results in effects which are deterministic and completely predictable. Section … considers relaxing these assumptions. On a more syntactic level, we make the additional assumption that the agent's world model is composed of ground atomic formulae. This sidesteps the ramification problem [23], since domain axioms are banned; instead, we demand that individual action descriptions explicitly enumerate changes to every predicate that is affected.³ Note, however, that we are not assuming the STRIPS representation; instead, we adopt an action language (based on ADL [26]) which includes universally quantified and disjunctive preconditions as well as conditional effects [29].

² Loosely speaking, our approach is reminiscent of classical work on knowledge representation, which renders inference tractable by formulating restricted representation languages [21].

³ Although unpalatable, this is standard in the planning literature. For example, a STRIPS operator that moves block A from B to C must delete on(A,B) and also add clear(B), even though clear(z) could be defined as ∀y ¬on(y, z).
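To make these representational assumptions concrete, the sketch below shows one way to encode a world model of ground atomic formulae and an ADL-style operator with conditional effects in Python. It is only an illustration of the kind of action language we assume, not an implementation from this paper: for brevity, preconditions are conjunctions of ground atoms rather than the full universally quantified, disjunctive form, and all names (State, Action, ConditionalEffect, delete_paper) are hypothetical.

    from dataclasses import dataclass

    # A world model is a set of ground atomic formulae,
    # e.g. ("isa", "paper.tex", "file").
    State = frozenset

    @dataclass(frozen=True)
    class ConditionalEffect:
        """When every atom in `condition` holds, add `add` and delete `delete`."""
        condition: frozenset = frozenset()
        add: frozenset = frozenset()
        delete: frozenset = frozenset()

    @dataclass(frozen=True)
    class Action:
        """An ADL-like operator: a precondition plus conditional effects.

        Because domain axioms are banned, the effects must explicitly
        enumerate every predicate the action changes (cf. the ramification
        problem).
        """
        name: str
        precondition: frozenset
        effects: tuple  # of ConditionalEffect

        def applicable(self, state: State) -> bool:
            return self.precondition <= state

        def apply(self, state: State) -> State:
            if not self.applicable(state):
                raise ValueError(f"{self.name} is not executable in this state")
            adds, deletes = set(), set()
            for effect in self.effects:
                if effect.condition <= state:   # conditional effect fires
                    adds |= effect.add
                    deletes |= effect.delete
            return frozenset((state - deletes) | adds)

    # Hypothetical operator: deleting paper.tex removes the file from the model.
    delete_paper = Action(
        name="delete(paper.tex)",
        precondition=frozenset({("isa", "paper.tex", "file")}),
        effects=(ConditionalEffect(delete=frozenset({("isa", "paper.tex", "file")})),),
    )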
Given the above assumptions, the next two sections define the primitives dont-disturb and restore, and explain how they should be treated by a generative planning algorithm. We are not claiming that the approach sketched below is the "right" way to design agents or to formalize Asimov's First Law. Rather, our formalization is meant to illustrate the kinds of technical issues to which Asimov's Law gives rise and how they might be solved. With this in mind, the paper concludes with a critique of our approach and a (long) list of open questions.

Safety

Some conditions are so hazardous that our agent should never cause them. For example, we might demand that the agent never delete LaTeX files, or never handle a gun. Since these instructions hold for all times, we refer to them as dont-disturb constraints, and say that an agent is safe when it guarantees to abide by them. As in Asimov's Law, dont-disturb constraints override direct human orders. Thus, if we ask a software agent to reduce disk utilization and it can only do so by deleting valuable LaTeX files, the agent should refuse to satisfy this request.

We adopt a simple syntax: dont-disturb takes a single, function-free, logical sentence as its argument. For example, one could command the agent to avoid deleting files that are not backed up on tape with the following constraint:

    dont-disturb(written.to.tape(f) ∨ isa(f, file))

Free variables, such as f above, are interpreted as universally quantified. In general, a sequence of actions satisfies dont-disturb(C) if none of the actions makes C false. Formally, we say that a plan satisfies a dont-disturb constraint when every consistent, totally-ordered sequence of plan actions satisfies the constraint as defined below.

Definition: Satisfaction of dont-disturb: Let w_0 be the logical theory describing the initial state of the world, let A_1, ..., A_n be a totally-ordered sequence of actions that is executable in w_0, let w_j be the theory describing the world after executing A_j in w_{j-1}, and let C be a function-free, logical sentence. We say that A_1, ..., A_n satisfies the constraint dont-disturb(C) if, for all j ∈ [1, n] and for all substitutions θ, w_0 |= Cθ implies w_j |= Cθ.
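To make the definition concrete, here is a minimal sketch of a satisfaction check for dont-disturb over a totally-ordered, executable action sequence, reusing the hypothetical State and Action types from the earlier sketch. It encodes an open sentence C(f) as a Python predicate over a state and a binding for its free variable, and checks that no action falsifies an instance of C that held in w_0; this is an illustration of the definition only, and the constraint encoding is an assumption made for readability, not this paper's formalism.

    from typing import Callable, Sequence, Set

    # An open sentence C(f) is modeled as a predicate over a state and a
    # binding for its free variable f (free variables are universally
    # quantified).
    OpenSentence = Callable[[State, str], bool]

    def objects_of(state: State) -> Set[str]:
        """All constants mentioned in a state of ground atomic formulae."""
        return {arg for atom in state for arg in atom[1:]}

    def satisfies_dont_disturb(w0: State, plan: Sequence[Action],
                               C: OpenSentence) -> bool:
        """True iff no action in the (executable) plan makes an instance of C false.

        Following the definition: every ground instance of C that holds in
        the initial theory w_0 must still hold in each w_j obtained by
        executing A_j in w_{j-1}.
        """
        initially_true = {obj for obj in objects_of(w0) if C(w0, obj)}
        w = w0
        for action in plan:
            w = action.apply(w)                        # w_j
            if any(not C(w, obj) for obj in initially_true):
                return False                           # some action falsified C
        return True

    # The example constraint from the text:
    #   dont-disturb(written.to.tape(f) ∨ isa(f, file))
    def tape_or_not_file(state: State, f: str) -> bool:
        return ("written.to.tape", f) in state or ("isa", f, "file") in state

    w0 = frozenset({("isa", "paper.tex", "file")})
    # Deleting a file that was never written to tape violates the constraint.
    print(satisfies_dont_disturb(w0, [delete_paper], tape_or_not_file))  # False

Note that this sketch simply simulates the plan and inspects each intermediate state after the fact; the approach advocated in this paper is instead to detect potential violations during planning, by extending the familiar machinery for handling subgoal interactions.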