Articles

Mapping the Landscape of Human-Level Artificial General Intelligence

Sam S. Adams, Itamar Arel, Joscha Bach, Robert Coop, Rod Furlan, , J. Storrs Hall, Alexei Samsonovich, Matthias Scheutz, Matthew Schlesinger, Stuart C. Shapiro, John F. Sowa

We present the broad outlines of a roadmap toward human-level artificial general intelligence (henceforth, AGI). We begin by discussing AGI in general, adopting a pragmatic goal for its attainment and a necessary foundation of characteristics and requirements. An initial capability landscape will be presented, drawing on major themes from developmental psychology and illuminated by mathematical, physiological, and information-processing perspectives. The challenge of identifying appropriate tasks and environments for measuring AGI will be addressed, and seven scenarios will be presented as milestones suggesting a roadmap across the AGI landscape along with directions for future research and collaboration.

This article is the result of an ongoing collaborative effort by the coauthors, preceding and during the AGI Roadmap Workshop held at the University of Tennessee, Knoxville in October 2009, and from many continuing discussions since then. Some of the ideas also trace back to discussions held during two Evaluation and Metrics for Human Level AI workshops organized by John Laird and Pat Langley (one in Ann Arbor in late 2008, and one in Tempe in early 2009). Some of the conclusions of the Ann Arbor workshop were recorded (Laird et al. 2009). Inspiration was also obtained from discussion at the Future of AGI postconference workshop of the AGI-09 conference, triggered by Itamar Arel's presentation AGI Roadmap (Arel 2009); and from an earlier article on AGI road-mapping (Arel and Livingston 2009).

Of course, this is far from the first attempt to plot a course toward human-level AGI: arguably this was the goal of the founders of the field of artificial intelligence in the 1950s, and has been pursued by a steady stream of AI researchers since, even as the majority of the AI field has focused its attention on more narrow, specific subgoals.
The ideas presented here build on the ideas of others in innumerable ways, but to review the history of AI and situate the current effort in the context of its predecessors would require a much longer article than this one. Thus we have chosen to focus on the results of our AGI roadmap discussions, acknowledging in a broad way the many debts owed to many prior researchers. References to the prior literature on evaluation of advanced AI systems are given by Laird et al. (2009) and Goertzel and Bugaj (2009), which may in a limited sense be considered prequels to this article.

We begin by discussing AGI in general and adopt a pragmatic goal for measuring progress toward its attainment. We also

Copyright © 2012, Association for the Advancement of Artificial Intelligence. All rights reserved. ISSN 0738-4602 SPRING 2012 25 Articles

adopt, as a provisional starting point, a slightly modified version of the characteristics and requirements for AGI proposed by Laird and Wray (2010), upon which we will later construct a number of specific scenarios for assessing progress in achieving AGI. An initial capability landscape for AGI will be presented, drawing on major themes from developmental psychology and illuminated by mathematical, physiological, and information-processing perspectives. The challenge of identifying appropriate tasks and environments for measuring AGI will be taken up. Several scenarios will be presented as milestones outlining a roadmap across the AGI landscape, and directions for future work and collaboration will conclude the article.

The Goal: Human-Level General Intelligence

Simply stated, the goal of AGI research as considered here is the development and demonstration of systems that exhibit the broad range of general intelligence found in humans. This goal of developing AGI echoes that of the early years of the artificial intelligence movement, which after many valiant efforts largely settled for research into narrow AI systems that could demonstrate or surpass human performance in a specific task, but could not generalize this capability to other types of tasks or other domains.

A classic example of the narrow AI approach was IBM's Deep Blue system (Campbell, Hoane, and Hsu 2002), which successfully defeated world chess champion Garry Kasparov but could not readily apply that skill to any other problem domain without substantial human reprogramming. In early 2011, IBM's Watson question-answering system (Ferrucci 2010) dramatically defeated two all-time champions in the quiz show Jeopardy!, but having never personally visited Chicago's O'Hare and Midway airports, fumbled on a question that any human frequent flier in the US would have known. To apply the technology underlying Watson to another domain, such as insurance or call-center support, would require not merely education of the AI system, but significant reprogramming and human scoring of relevant data, the analogue of needing to perform brain surgery on a human each time the person needs to confront a new sort of task. As impressive as these and other AI systems are in their restricted roles, they all lack the basic cognitive capabilities and common sense of a typical five-year-old child, let alone a fully educated adult professional.

Given the immense scope of the task of creating AGI, we believe the best path to long-term success is collaboration and coordination of the efforts of multiple research groups. A common goal and a shared understanding of the landscape ahead of us will be crucial to that success, and it was the aim of our workshop to make substantial progress in that direction.

A Pragmatic Goal for AGI

The heterogeneity of general intelligence in humans makes it practically impossible to develop a comprehensive, fine-grained measurement system for AGI. While we encourage research in defining such high-fidelity metrics for specific capabilities, we feel that at this stage of AGI development a pragmatic, high-level goal is the best we can agree upon.

Nils Nilsson, one of the early leaders of the AI field, stated such a goal in the 2005 AI Magazine article Human-Level Artificial Intelligence? Be Serious! (Nilsson 2005):

"I claim achieving real human-level artificial intelligence would necessarily imply that most of the tasks that humans perform for pay could be automated. Rather than work toward this goal of automation by building special-purpose systems, I argue for the development of general-purpose, educable systems that can learn and be taught to perform any of the thousands of jobs that humans can perform. Joining others who have made similar proposals, I advocate beginning with a system that has minimal, although extensive, built-in capabilities. These would have to include the ability to improve through learning along with many other abilities."

Many variant approaches have been proposed for achieving such a goal, and both the AI and AGI communities have been working for decades on the myriad subgoals that would have to be achieved and integrated to deliver a comprehensive AGI system. But aside from the many technological and theoretical challenges involved in this effort, we feel the greatest impediment to progress is the absence of a common framework for collaboration and comparison of results. The AGI community has been working on defining such a framework for several years now, and as mentioned earlier, this present effort on AGI road-mapping builds on the shoulders of much prior work.

Characteristics and Requirements for AGI

As a starting point for our road-mapping effort, we have adopted a slightly modified version of Laird and Wray's Requirements for Achieving AGI (Laird and Wray 2010). Their outline of eight characteristics for environments, tasks, and agents and 12 requirements for general cognitive architectures provides a level foundation for the comparison of appropriate scenarios for assessing AGI systems. We stress, however, that in our perspective these requirements should not be considered as final or set in stone, but simply as a convenient point of departure for discussion and collaboration. As our understanding of AGI improves through ongoing research, our understanding of the associated requirements is sure to evolve. In all probability, each of the many current AGI research paradigms could be used to spawn a slightly different requirements list, but we must start somewhere if we are to make progress as a community.

To test the capability of any AGI system, the characteristics of the intelligent agent and its assigned tasks within the context of a given environment must be well specified. Failure to do this may result in a convincing demonstration, but make it exceedingly difficult for other researchers to duplicate experiments and compare and contrast alternative approaches and implementations.

The characteristics shown in figure 1 provide the necessary (if not sufficient) degrees of dynamism and complexity that will weed out most narrow AI approaches at the outset, while continually challenging researchers to consider the larger goal of AGI during their work on subsystems and distinct capabilities. We have added the requirement for openness of the environment (C2), that is, the agent should not be able to rely on a fixed library of objects, relations, and events. We also require objects to have an internal structure that requires complex, flexible representations (C1).

C1. The environment is complex, with diverse, interacting and richly structured objects.
C2. The environment is dynamic and open.
C3. Task-relevant regularities exist at multiple time scales.
C4. Other agents impact performance.
C5. Tasks can be complex, diverse and novel.
C6. Interactions between agent, environment and tasks are complex and limited.
C7. Computational resources of the agent are limited.
C8. Agent existence is long-term and continual.

Figure 1. Characteristics for AGI Environments, Tasks, and Agents.

There are nearly as many different AGI architectures as there are researchers in the field. If we are ever to be able to compare and contrast systems, let alone integrate them, a common set of architectural features must form the basis for that comparison.

Figure 2 reprises Laird and Wray's requirements for general cognitive architectures and provides a framework for that basis. We have modified requirement R0, which in Laird and Wray reads "fixed structure for all tasks," to emphasize that the system may grow and develop over time, perhaps changing its structure dramatically, but these changes need to be effected by the agent itself, not by the intervention of the programmer. We anticipate further additions and changes to these lists over time as the community converges in experience with various cognitive architectures and AGI systems, but we have adopted them so we can progress as a community to our larger, shared goals.

R0. New tasks do not require re-programming of the agent
R1. Realize a symbol system
Represent and effectively use:
R2. Modality-specific knowledge
R3. Large bodies of diverse knowledge
R4. Knowledge with different levels of generality
R5. Diverse levels of knowledge
R6. Beliefs independent of current perception
R7. Rich, hierarchical control knowledge
R8. Meta-cognitive knowledge
R9. Support a spectrum of bounded and unbounded deliberation
R10. Support diverse, comprehensive learning
R11. Support incremental, online learning

Figure 2. Cognitive Architecture Requirements for AGI.

Current Challenges

In addition to providing a framework for collaborative research in AGI, Laird and Wray (2010) identified two significant challenges:

"… one of the best ways to refine and extend these sets of requirements and characteristics is to develop agents using cognitive architectures that test the sufficiency and necessity of all these and other possible characteristics and requirements on a variety of real-world tasks. One challenge is to find tasks and environments where all of these characteristics are active, and thus all of the requirements must be confronted. A second challenge is that the existence of an architecture that achieves a subset of these requirements does not guarantee that such an architecture can be extended to achieve other requirements while maintaining satisfaction of the original set of requirements."

Our 2009 AGI road-mapping workshop took up the first challenge of finding appropriate tasks and


environments to assess AGI systems, while the second challenge will be more appropriately handled by individual research efforts. We also added a third challenge, that of defining the landscape of AGI, in service to the AGI road-mapping effort that has been under way for several years.

The balance of this article will deal with these two challenges in reverse order. First, we will provide an initial landscape of AGI capabilities based on developmental psychology and underpinned by mathematical, physiological, and information-processing perspectives. Second, we will discuss the issues that arise in selecting and defining appropriate tasks and environments for assessing AGI, and finally, we will present a number of scenarios and directions for future work and collaboration.

Challenge: Mapping the Landscape

There was much discussion both in preparation and throughout our AGI Roadmap Workshop on the process of developing a roadmap. A traditional highway roadmap shows multiple driving routes across a landscape of cities and towns, natural features like rivers and mountains, and political features like state and national borders. A technology roadmap typically shows a single progression of developmental milestones from a known starting point to a desired result. Our first challenge in defining a roadmap for achieving AGI was that we initially had neither a well-defined starting point nor a commonly agreed upon target result. The history of both AI and AGI is replete with this problem, which is somewhat understandable given the breadth and depth of the subjects of both human intelligence and computer technology. We made progress by borrowing more metaphors from the highway roadmap, deciding to first define the landscape for AGI and then populate that landscape with milestones that may be traversed through multiple alternative routes.

The final destination, full human-level artificial general intelligence, encompasses a system that could learn, replicate, and possibly exceed human-level performance in the full breadth of cognitive and intellectual abilities. The starting point, however, was more problematic, since there are many current approaches to achieving AGI that assume different initial states. We finally settled on a developmental approach to the roadmap, following human cognitive development from birth through adulthood.

The various scenarios for assessing AGI that had been submitted by participants prior to the workshop were then arrayed as milestones between these two endpoints, without any specific routes between them aside from their relative requirements for increasing levels of human cognitive development.

Top Down: Characterizing Human Cognitive Development

The psychological approach to intelligence encompasses a broad variety of subapproaches rather than presenting a unified perspective. Viewed historically, efforts to conceptualize, define, and measure intelligence in humans reflect a distinct trend from general to specific (Gregory 1996), much like the history of AI.

Early work in defining and measuring intelligence was heavily influenced by Spearman, who in 1904 proposed the psychological factor g (for general intelligence). Spearman argued that g was biologically determined, and represented the overall intellectual skill level of an individual. A related advance was made in 1905 by Binet and Simon, who developed a novel approach for measuring general intelligence in French schoolchildren. A unique feature of the Binet-Simon scale was that it provided comprehensive age norms, so that each child could be systematically compared with others across both age and intellectual skill level. In 1916, Terman introduced the notion of an intelligence quotient or IQ, which is computed by dividing the test taker's mental age (that is, his or her age-equivalent performance level) by the physical or chronological age.

In subsequent years, psychologists began to question the concept of intelligence as a single, undifferentiated capacity. There were two primary concerns. First, while performance within an individual across knowledge domains is somewhat correlated, it is not unusual for skill levels in one domain to be considerably higher or lower than in another (that is, intraindividual variability). Second, two individuals with comparable overall performance levels might differ significantly across specific knowledge domains (that is, interindividual variability).

These issues helped to motivate a number of alternative theories, definitions, and measurement approaches, which share the idea that intelligence is multifaceted and variable both within and across individuals. Of these approaches, a particularly well-known example is Gardner's theory of multiple intelligences (Gardner 1999), which proposes eight distinct forms or types of intelligence: (1) linguistic, (2) logical-mathematical, (3) musical, (4) bodily-kinesthetic, (5) spatial, (6) interpersonal, (7) intrapersonal, and (8) naturalist. Gardner's theory suggests that each individual's intellectual skill is represented by an intelligence profile, a unique mosaic or combination of skill levels across the eight forms of intelligence.

While Gardner's theory has had significant impact within the field of adult intelligence, it has had comparatively less influence on the study of intellectual development in children. Instead, researchers in the field of cognitive development

seek to (1) describe processes of intellectual change, while (2) identifying and explaining the underlying mechanisms (both biological and environmental) that make these changes possible. Contemporary theories of cognitive development are very diverse and defy simple systematization, and a thorough treatment of the field would take us too far from our focus. However, two major schools of thought, those of Piaget and Vygotsky, will serve as axes for our AGI landscape.

Piaget's Theory

In his classic work that founded the science of cognitive development, Piaget proposed that humans progress through four qualitatively distinct stages (Piaget 1953).

First, during the sensorimotor stage (0–2 years), infants acquire a rich repertoire of perceptual and motor skills (for example, reaching, grasping, crawling, walking, and others). This stage includes a number of major milestones, such as the ability to search for hidden objects and the ability to use objects as simple tools.

Second, infants enter the preoperational stage (2–6 years) as they acquire the capacity to mentally represent their experiences (for example, memory, mental images, drawing, language, and others), but lack the ability to systematically coordinate these representations in a logically consistent manner.

During the next stage (6 years to adolescence), children at the concrete operational level master basic elements of logical and mathematical thought, including the ability to reason about classes and categories, as well as numerical operations and relations.

The final stage of development, formal operations, begins in adolescence and includes the use of deductive logic, combinatorial reasoning, and the ability to reason about hypothetical events.

Vygotsky's Theory

In contrast to Piaget's view, which focuses on the individual, Vygotsky's classic theory of cognitive development emphasizes the sociocultural perspective (Vygotsky 1986). Vygotsky's theory not only highlights the influence of the social environment, but it also proposes that each culture provides a unique developmental context for the child. Three fundamental concepts from this theory are (1) internalization, (2) zone of proximal development, and (3) tools of intellectual adaptation.

First, Vygotsky proposed that the capacity for thought begins by acquiring speech (that is, thinking out loud), which gradually becomes covert or internalized.

Second, he emphasized that parents, teachers, and skilled peers facilitate development by helping the child function at a level just beyond what he or she is capable of doing alone; he referred to this cognitive space as the zone of proximal development.

Third, Vygotsky also stressed that each child inherits a unique set of objects, ideas, and traditions that guide learning (for example, books, calculators, computers, and others). These tools of intellectual adaptation not only influence the pattern of cognitive development, but also serve as constraints on the rate and extent of development.

Surveying the Landscape of Human Intelligence

While many consider the views of Piaget and Vygotsky to be at odds because of their different foci, we consider them to be complementary, partial descriptions of the same development process. Piaget's approach focused on the stages of cognitive development of an individual child, while Vygotsky considered the same development within the context of social interaction with other humans with access to and training in the use of cultural artifacts like tools, languages, books, and a shared environment. By placing the developmental stages of each theory on opposing axes (figure 3), we can outline the landscape of human cognitive development and provide a structure for the placement of milestones along the road to AGI.

Bottom Up: Substrata of the AGI Landscape

Our ultimate goal as a community is to create functional computer systems that demonstrate human-level general intelligence. This lofty aim requires that we deeply understand the characteristics of human cognitive behavior and development outlined above, as well as implement that understanding in a nonbiological substrate, the modern digital computer. To accomplish this, we draw inspiration and perspectives from many different disciplines, as shown in figure 4. From physiology we seek to understand the biological implementation of human intelligence; from mathematics, the nature of information and intelligence regardless of its implementation; and from information processing, we map these insights and others into the metaphors of computer science that will most directly lead to successful computer implementation. This section describes each of these underpinnings for our road-mapping efforts.

The Physiological Perspective

Given advances in bioscience, it has become increasingly feasible to take a more holistic physiological perspective on cognitive development incorporating genetic, biochemical, and neural mechanisms, among others. A unique strength of biologically inspired cognitive theories is their ability to account for universal aspects of development


[Figure 3 graphic. Vertical axis, Human Cognitive Development: Individual Capability (Piaget): Sensory/Motor (birth to 2 years), Preoperational (2 to 6 years), Concrete Operational (6 years to adolescent), Formal Operational (adolescent to adult). Horizontal axis, Social-Cultural Engagement (Vygotsky): Internalization, Zone of Proximal Development, Tools of Intellectual Adaptation. Far corner of both axes: Full Human-Level Intelligence.]

Figure 3. The Landscape of Human Cognitive Development.
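
The two axes of figure 3 can be read as a simple coordinate system for milestones. The following is a minimal sketch of that reading, not the authors' tooling: the class and function names, the treatment of the Vygotsky concepts as an ordered axis, and the three example scenarios are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum


class PiagetStage(IntEnum):
    """Individual-capability axis (Piaget), ordered from earliest to latest."""
    SENSORIMOTOR = 0          # birth to 2 years
    PREOPERATIONAL = 1        # 2 to 6 years
    CONCRETE_OPERATIONAL = 2  # 6 years to adolescence
    FORMAL_OPERATIONAL = 3    # adolescence to adulthood


class VygotskyConcept(IntEnum):
    """Social-cultural axis (Vygotsky); ordering as listed in the figure (an assumption)."""
    INTERNALIZATION = 0
    ZONE_OF_PROXIMAL_DEVELOPMENT = 1
    TOOLS_OF_INTELLECTUAL_ADAPTATION = 2


@dataclass(frozen=True)
class Milestone:
    """A scenario placed at one point of the landscape (hypothetical structure)."""
    name: str
    piaget: PiagetStage
    vygotsky: VygotskyConcept


def order_by_development(milestones):
    """Sort milestones by Piaget stage, breaking ties by Vygotsky position."""
    return sorted(milestones, key=lambda m: (m.piaget, m.vygotsky))


# Hypothetical placements, for illustration only.
landscape = [
    Milestone("example scenario C", PiagetStage.FORMAL_OPERATIONAL,
              VygotskyConcept.TOOLS_OF_INTELLECTUAL_ADAPTATION),
    Milestone("example scenario A", PiagetStage.SENSORIMOTOR,
              VygotskyConcept.INTERNALIZATION),
    Milestone("example scenario B", PiagetStage.PREOPERATIONAL,
              VygotskyConcept.ZONE_OF_PROXIMAL_DEVELOPMENT),
]

ordered_names = [m.name for m in order_by_development(landscape)]
print(ordered_names)
```

Sorting by position gives only a partial picture: the article arrays milestones by their relative developmental requirements without fixing routes between them, so the ordering here is one convenient view rather than a prescribed path.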

that emerge in a consistent pattern across a wide range of cultures, physical environments, and historical time periods (for example, motor-skill development, object perception, language acquisition, and others). The physiological approach also plays a prominent role in explaining differences between typical and atypical patterns of development (for example, autism, ADHD, learning disorders, disabilities, and others). Both physiological and information-processing perspectives favor modular accounts of cognitive development, which view the brain as divided into special-purpose input-output systems that are devoted to specific processing tasks.

The Mathematical Perspective

The mathematical perspective is typified by the recent work of Marcus Hutter and Shane Legg, who give a formal definition of general intelligence based on the Solomonoff-Levin distribution (Legg and Hutter 2007). Put very roughly, they define intelligence as the average reward-achieving capability of a system, calculated by averaging over all possible reward-summable environments, where each environment is weighted in such a way that more compactly describable programs have larger weights. Variants of this definition have been proposed to take into account the way that intelligence may be biased to particular sorts of environments, and the fact that not all intelligent systems are explicitly reward-seeking (Goertzel 2010).

While this notion of intelligence as compression is useful, even if a mathematical definition of general intelligence were fully agreed upon, this wouldn't address the human-level part of AGI. Human intelligence is neither completely general in the sense of a theoretical AGI like AIXI, Hutter's optimally intelligent agent (Hutter 2004), nor is it highly specialized in the sense of current AI software. It has strong general capability, yet is biased toward the class of environments in which human intelligence develops, a class of environments whose detailed formalization remains largely an unsolved problem.

The Information Processing Perspective

Much of modern cognitive science uses the computer as a metaphor for the mind. This perspective differs from the system-theoretic ideas of Piaget and Vygotsky, but at the same time provides a more direct mapping to the target implementation for AGI systems.

According to the information-processing perspective, cognitive development in infancy and


[Figure 4 graphic. Human Cognitive Development rests on three substrata: Information Processing (implementation metaphor), Math (information theory), and Physiology (existing realization).]

Figure 4. The Substrata for the AGI Landscape.

childhood is due to changes in both hardware (neural maturation, for example, synaptogenesis, neural pruning, and others) and software (for example, acquisition of strategies for acquiring and processing information). Rather than advocating a stage-based approach, this perspective often highlights processes of change that are gradual or continuous. For example, short-term memory (typically measured by the ability to remember a string of random letters or digits) improves linearly during early childhood (Dempster 1981).

Challenge: Finding Tasks and Environments

The next challenge we address is that of defining appropriate tasks and environments for assessing progress in AGI where all of the Laird and Wray characteristics are active, and thus all of the requirements must be confronted. The usefulness of any task or environment for this purpose is critically dependent on how well it provides a basis for comparison between alternative approaches and architectures. With this in mind, we believe that both tasks and environments must be designed or specified with some knowledge of each other. For example, consider the competition to develop an automobile capable of driving itself across a roughly specified desert course (see note 1). While much of the architecture and implementation would likely be useful for the later competition for autonomous city driving (see note 2), such a different environment required significant reconsideration of the tasks and subtasks themselves.

Tasks, therefore, require a context to be useful, and environments, including the AGI's embodiment itself, must be considered in tandem with the task definition. Using this reasoning, we decided to define scenarios that combine both tasks and their necessary environments.

Further, since we are considering multiple scenarios as well as multiple approaches and architectures, it is also important to be able to compare and contrast tasks belonging to different scenarios. With this in mind, we chose to proceed by first articulating a rough heuristic list of human intelligence competencies (figure 5). As a rule of thumb, tasks may be conceived as ways to assess competencies within environments. However, contemporary cognitive science does not give us adequate guidance to formulate anything close to a complete, rigid, and refined list of competencies. What we present here must be frankly classified as an intuitive approach for thinking about task generation, rather than a rigorous analytical methodology from which tasks can be derived.

Perception: Vision, Smell, Touch, Taste, Audition, Cross-modal, Proprioceptive
Memory: Working, Episodic, Implicit, Semantic, Procedural
Attention: Visual, Auditory, Social, Behavioral
Social Interaction: Communication, Appropriateness, Social, Cooperation, Competition, Relationships
Planning: Tactical, Strategic, Physical, Social
Motivation: Subgoal creation, Affect-based, Deferred Gratification, Altruism
Actuation: Physical Skills, Tool Use, Navigation, Proprioceptive
Reasoning: Induction, Deduction, Abduction, Physical, Causal, Associational
Communication: Gestural, Verbal, Musical, Pictorial, Diagrammatical, Language Acquisition, Cross-Modal
Learning: Imitation, Reinforcement, Dialogical, Media-Oriented, Experimentation
Emotion: Perceived, Expressed, Control, Understanding, Sympathy, Empathy
Modeling Self / Other: Self-Awareness, Other-Awareness, Relationships, Self-Control, Theory of Mind, Sympathy, Empathy
Building/Creation: Physical Construction with Objects, Formation of Novel Concepts, Verbal Invention, Social Organization
Quantitative: Counting Observed Entities, Grounded Small Number Arithmetic, Comparison of Quantitative Properties of Observed Entities, Measurement Using Simple Tools

Figure 5. Some of the Important Competency Areas Associated with Human-Level General Intelligence.

Environments and Embodiments for AGI

General intelligence in humans develops within the context of a human body, complete with many thousands of input sensors and output effectors, which is itself situated within the context of a richly reactive environment, complete with many other humans, some of whom may be caregivers, teachers, collaborators, or adversaries. Human perceptual input ranges from skin sensors reacting to temperature and pressure, smell and taste sensors in the nose and mouth, sound and light sensors in the ears and eyes, to internal proprioceptive sensors that provide information on gravitational direction and body joint orientation, among others. Human output effectors include the many ways we can affect the physical states of our environment, like producing body movements through muscles, sound waves using body motions, or producing vibrations through vocal cords, for example. Effectors like the muscles controlling gaze direction and focus of the eyes also directly affect environmental sensing.

One of the distinctive challenges of achieving AGI is thus how to account for a comparably rich sensory/motor experience. The history of AI is replete with various attempts at robotic embodiment, from Gray Walter's Tortoise (Walter 1953) through Honda's ASIMO. Virtual embodiment has also been utilized, exploiting software systems that simulate bodies in virtual environments, such as Steve Grand's Creatures (Grand 2001).

There are many challenges in determining an appropriate level of body sophistication and environmental richness in designing either physical or virtual bodies and environments. Regrettably, nearly every AI or AGI project that has included situated embodiment has done so in some unique, incompatible and typically nonrepeatable form. Various leagues of the RoboCup competition (Kitano 1997) have been a notable exception to this, where a commercially available hardware robotic platform such as the Sony AIBO or Aldebaran Nao is specified, with contestants only allowed to customize the software of the robotic players. Unfortunately, the limited nature of the RoboCup task (a modified game of soccer) is too narrow to be suitable as a benchmark for AGI.

The Breadth of Human Competencies

Regardless of the particular environment or embodiment chosen, there are, intuitively speaking, certain competencies that a system must possess to be considered a full-fledged AGI. For instance, even if an AI system could answer questions involving previously known entities and concepts vastly more intelligently than IBM's Watson, we would be reluctant to classify it as human-level if it could not create new concepts when the situation called for it.

While articulation of a precise list of the competencies characterizing human-level general intelligence is beyond the scope of current science (and

32 AI MAGAZINE Articles

definitely beyond the scope of this article), it is nevertheless useful to explore the space of competencies, with a goal of providing heuristic guidance for the generation of tasks for AGI systems acting in environments. Figure 5 lists a broad set of competencies that we explored at the road-mapping workshop: 14 high-level competency areas, with a few key competency subareas corresponding to each. A thorough discussion of any one of these subareas would involve a lengthy review paper and dozens of references, but for the purposes of this article an evocative list will suffice.

We consider it important to think about competencies in this manner, because otherwise it is too easy to pair a testing environment with an overly limited set of tasks biased to the limitations of that environment. What we are advocating is not any particular competency list, but rather the approach of exploring a diverse range of competency areas, and then generating tasks that evaluate the manifestation of one or more articulated competency areas within specified environments.

Some of the competencies listed may appear intuitively simpler and earlier stage than others, at least by reference to human cognitive development. The developmental theories reviewed earlier in this article each contain their own hypotheses regarding the order in which developing humans achieve various competencies. However, after careful consideration we have concluded that the ordering of competencies isn’t necessary for an AGI capability roadmap. In many cases, different existing AGI approaches pursue the same sets of competencies in radically different orders, so imposing any particular ordering on competencies would bias the road-mapping effort toward particular AGI approaches, something antithetical to the collaboration and cooperation sorely needed in the present stage of AGI development.

For example, developmental robotics (Lungarella et al. 2003) and deep learning (Bengio 2009) AGI approaches often assume that perceptual and motor competencies should be developed prior to linguistic and inferential ones. On the other hand, logic-focused AGI approaches are often more naturally pursued by beginning with linguistic and inferential competencies, and only afterwards moving to perception and actuation in rich environments. By dealing with scenarios, competencies, and tasks, but not specifying an ordering of competencies or tasks, we are able to present a capability roadmap that is friendly to existing approaches and open to new approaches as they develop.

Supporting Diverse Concurrent Research Efforts

Balancing the AGI community’s need for collaborative progress while still supporting a wide range of independent research and development efforts, we must allow for concurrent activity in many regions of the landscape. This requires that many of the unsolved lower-level aspects of general intelligence such as visual perception and fine-motor control be finessed in some of our scenarios, replaced by an approximate high-level implementation or even a human surrogate, until we have sufficiently rich underlying systems as a common base for our research. It is sometimes argued that since human intelligence emerges and develops within a complex body richly endowed with innate capabilities, AGI may only be achieved by following the same path in a similar embodiment. But until we have the understanding and capability to construct such a system, our best hope for progress is to foster concurrent but ultimately collaborative work.

Scenarios for Assessing AGI

In order to create a roadmap toward human-level AGI, one must begin with one or more particular scenarios. By a scenario we mean a combination of tasks to be performed within a specified environment, together with a set of assumptions pertaining to the way the AGI system interacts with the environment along with the inanimate entities and other agents therein. Given a scenario as a basis, one can then talk about particular subtasks, their ordering and performance evaluation.

A wide variety of scenarios may be posited as relevant to AGI; here we review seven that were identified by one or more participants in the AGI Roadmap Workshop as being particularly interesting. Each of these scenarios may be described in much more detail than we have done here; this section constitutes a high-level overview, and more details on several of the scenarios are provided in the references.

General Video-Game Learning

This scenario addresses many challenges at once: providing a standard, easily accessible, situated embodiment for AGI research; a gradient of ever increasing sensory/motor requirements to support gradual development, measurement, and comparison of AGI capabilities; a wide range of differing environments to test generality of learning ability; and a compelling demonstration of an AGI’s capability or lack thereof, easily evaluated by nonspecialists, based on widely common human experiences playing various video games.

The goal of this scenario would not be human-level performance of any single video game, but the ability to learn and succeed at a wide range of video games, including new games unknown to the AGI developers before the competition. The contestant system would be limited to a uniform sensory/motor interface to the game, such as video

SPRING 2012 33 Articles

and audio output and controller input, and would be blocked from any access to the internal programming or states of the game implementation. To provide for motivation and performance feedback during the game, the normal scoring output would be mapped to a uniform hedonic (pain/pleasure) interface for the contestant. Contestants would have to learn the nature of the game through experimentation and observation, by manipulating game controls and observing the results. The scores achieved against a preprogrammed opponent or story line would provide a standard measure of achievement along with the time taken to learn and win or advance in each kind of game.

The range of video games used for testing in this scenario could be open ended in both simplicity and sophistication. An early game like Pong might itself prove too challenging a starting point, so even simpler games may be selected or developed. Since success at most video games would require some level of visual intelligence, general video-game learning (GVL) would also provide a good test of computer vision techniques, ranging from relatively simple two-dimensional object identification and tracking in Pong to full three-dimensional perception and recognition in a game like Tomb Raider or Half-Life at the highest levels of performance. Various genres of video games, from the early side-scrolling games (for example, Super Mario Brothers) to first person shooters (for example, Doom, Half-Life) to flight simulations (for example, Microsoft Flight Simulator, Star Wars X-Wing), provide rich niches where different researchers could focus and excel, while the common interface would still allow application of learned skills to other genres.

Among other things, in order to effectively play many video games a notable degree of strategic thinking must be demonstrated, such as the ability to map situations to actions while considering not just the short-term but also the longer-term implications of choosing an action. Such capability, often associated with the notion of the credit assignment problem, is one that remains to be demonstrated in a scalable way. GVL provides an effective platform for such demonstration.

This scenario has the added benefit of supporting the growth of a research community focused on the specification and development of appropriate tests for AGI. Such a community would create the video game tests, in collaboration with the original game vendors in many cases, and both administer the tests and report the results through an Internet forum similar to the SPEC-style system performance reporting sites.3

Preschool Learning

In the spirit of the popular book All I Really Need to Know I Learned in Kindergarten (Fulghum 1989), it is appealing to consider early childhood education such as kindergarten or preschool as inspiration for scenarios for teaching and testing AGI systems. The details of this scenario are fleshed out by Goertzel and Bugaj (2009).

This scenario has two obvious variants: a physical preschool-like setting involving a robot, and a virtual-world preschool involving a virtual agent. The goal in such scenarios is not to imitate human child behavior precisely but rather to demonstrate a robot or virtual agent qualitatively displaying similar cognitive behaviors to a young human child. This idea has a long and venerable history in the AI field: Alan Turing’s original 1950 paper on AI, where he proposed the famous Turing test, contains the suggestion that “Instead of trying to produce a programme to simulate the adult mind, why not rather try to produce one which simulates the child’s?” (Turing 1950).

This childlike cognition-based approach seems promising for many reasons, including its integrative nature: what a young child does involves a combination of perception, actuation, linguistic and pictorial communication, social interaction, conceptual problem solving, and creative imagination. Human intelligence develops in response to the demands of richly interactive environments, and a preschool is specifically designed to be just such an environment with the capability to stimulate diverse mental growth. The richness of the preschool environment suggests that significant value is added by the robotics-based approach; but much could also potentially be achieved by stretching the boundaries of current virtual-world technology.

Another advantage of focusing on childlike cognition is the wide variety of instruments for measuring child intelligence developed by child psychologists. In a preschool context, the AGI system can be presented with variants of tasks typically used to measure the intelligence of young human children.

It doesn’t currently make sense to outfit a virtual or robot preschool as a precise imitation of a human preschool; this would be inappropriate since a contemporary robotic or virtual body is rather differently capable than that of a young human child. The aim in constructing an AGI preschool environment should be to emulate the basic diversity and educational character of a typical human preschool.

To imitate the general character of a human preschool, several learning centers should be available in a virtual or robot preschool. The precise architecture will be adapted through experience, but initial learning centers might include the Blocks Center: a table with blocks of various shapes and sizes on it; the Language Center: a circle of chairs, intended for people to sit around and talk with the


AGI; the Manipulatives Center, with a variety of different objects of different shapes and sizes, intended to teach a wide range of visual and motor skills; the Ball Play Center, where balls are kept in chests and there is space for the AGI to kick or throw the balls around alone or with others; or the Dramatics Center, where the AGI can both observe and enact various movements and roles.

Reading Comprehension

The next scenario is closely related to the previous one, but doesn’t require embodiment in a virtual world, and makes a special commitment as to the type of curriculum involved. In this scenario, an aspiring AGI should work through the grade school reading curriculum, and take and pass the assessments normally used to assess the progress of human children. This obviously requires understanding a natural language (NL) text and being able to answer questions about it. However, it also requires some not so obvious abilities.

Very early readers are usually picture books that tightly integrate the pictures with the text. In some, the story is mostly conveyed through the pictures. In order to understand the story, the pictures must be understood as well as the NL text. This requires recognizing the characters and what the characters are doing. Reference resolution is required between characters and events mentioned in the text and illustrated in the pictures. The actions that the characters are performing must be recognized from snapshot poses, unlike the more usual action recognition from a sequence of frames taken from a video. The next stage of readers are early chapter books, which use pictures to expand on the text. Although the story is now mainly advanced through the text, reference resolution with the pictures is still important for understanding.

Instructors of reading recognize “four roles of a reader as: meaning maker, code breaker, text user, and text critic … meaning makers read to understand … comprehension questions [explore] literal comprehension, inferential comprehension, and critical thinking” (Sundance 2001). Code breakers translate written text to speech and vice versa. Text users identify whether the text is fiction or nonfiction. “If the book is factual, they focus on reading for information. If a text is fiction, they read to understand the plot, characters, and message of the story …. Text critics evaluate the author’s purpose and the author’s decisions about how the information is presented … check for social and cultural fairness … look for misinformation … and think about their own response to the book and whether the book is the best it might be” (Sundance 2001).

The roles of meaning maker, code breaker, and text user are mostly, though not entirely, familiar to people working in natural language understanding. However, the role of text critic is new, requiring metalevel reasoning about the quality of the text, its author’s intentions, the relation of the text to the reader’s society and culture, and the reader’s own reaction to the text, rather than merely reasoning about the information in the text.

This scenario fulfills most of the Laird and Wray criteria handily, and a few less obviously. The AGI’s environment in this scenario is not dynamic in a direct sense (C2), but the AGI does have to reason about a dynamic environment to fulfill the tasks. In a sense, the tasks involved are fixed rather than novel (C5), but they are novel to the AGI as it proceeds through them. Other agents affect task performance (C4) if group exercises are involved in the curriculum (which indeed is sometimes the case). Many of the specific abilities needed for this scenario are discussed in the next scenario.

Story or Scene Comprehension

Focused on scene and story comprehension, this scenario shares aspects of the preschool and reading comprehension scenarios. Like the reading scenario, it focuses on a subset of childlike performance, but a different subset, involving a broader variety of engagements with the world. Scene comprehension here does not mean only illustrations, but real-world scenes, which can be presented at different granularities, media, and complexities (cartoons, movies, or theatrical performances, for instance). This approach differs from the reading scenario in that it more directly provides a dynamic environment. If group exercises are included then all the Laird and Wray criteria are fulfilled in a direct and obvious way.

The story or scene comprehension scenario focuses on the relationship between perception, mental representation, and natural language. Here, the system might be presented with a story, which it needs to analyze semantically and then re-present in a different form (for instance, as a movie sequence, or another kind of retelling). In the other direction, the system is shown a movie or cartoon sequence, which it needs to comprehend and re-present using language. This approach lends itself to a variety of standardizable test scenarios, which allow the direct comparison of competing AGI architectures with each other as well as with child performance using identical input.

Note that these tests ignore the interaction of the system and the environment, and the particular needs of the system itself. It is often argued that general intelligence might only be attained by a motivated, interacting system, but this scenario would allow the comparison of motivated, interactive architectures with less comprehensive processing-oriented architectures.

AGI system capabilities in story and scene comprehension are probably not independent but related, since both tasks focus on modeling the relationships between perception, language, and thought, which are not unidirectional. Further tasks might focus on the interplay between these areas, by combining expression and comprehension. This could either be done in a fully interactive manner (that is, by designing a system that has to respond to a teacher), or self-interactive (two agents learn to communicate with each other, while sharing an environment). For an example design of such a self-interactive scenario, see Steels (2010).

School Learning

The virtual school student scenario continues the virtual preschool scenario but is focused on higher cognitive abilities assuming that, if necessary, lower-level skills will be finessed. In particular, it is assumed that all interfaces with the agent are implemented at a symbolic level: the agent is not required to process a video stream, to recognize speech and gestures, to balance its body and avoid obstacles while moving in space, and others. All these can be added later to enhance the challenge, but a key feature of this scenario is that they can also be finessed, allowing for early exploration of such capabilities without waiting for other system developments. On the other hand, it is critically important in the scenario for the agent to make academic progress at a human student level, to understand human minds, and to understand and use classroom-related social relations in the environment in which it is embedded.

In this scenario, the agent is embedded in a real high school classroom by means of a virtual-reality-based interface. The agent lives in a symbolic virtual world that is continuously displayed on a big screen in the classroom. The virtual world includes a virtual classroom represented at a symbolic (object) level, including the human instructor and human students represented by simplistic avatars. The agent itself is represented by an avatar in this virtual classroom. The symbolic virtual world is synchronized with the real physical world with the assistance of intelligent monitoring and recording equipment performing scene analysis, speech recognition, language comprehension, gesture recognition, and others (if necessary, some or all of these functions will be performed by hidden human personnel running the test; students should not be aware of their existence). The study material, including the textbook and other curriculum materials available to each student, will be encoded electronically and made available to the agent at a symbolic level.

A particularly challenging aspect of this scenario is that the agent will be evaluated not only on its own learning and problem-solving performance, but also on its approach to problem solving, based on its interactions with students and with the instructor. Here the general metrics for self-regulated learning will be used (Winne and Perry 2000). In addition, social performance of the agent can be evaluated based on surveys of students and standard psychological metrics. Another potentially important measure will be the effect of the agent’s presence in the classroom on human student learning.

The Wozniak Test

In an interview a few years ago, Steve Wozniak of Apple Computer fame expressed doubt that there would ever be a robot that could walk into an unfamiliar house and make a cup of coffee (Wozniak and Moon 2007). We feel that this task is demanding enough to stand as a Turing test equivalent for embodied AGI. Note that the Wozniak test is a single, special case of Nils Nilsson’s general employment test for human-level AI (Nilsson 2005).

A robot is placed at the door of a typical house or apartment. It must find a doorbell or knocker, or simply knock on the door. When the door is answered, it must explain itself to the householder and enter once it has been invited in. (We will assume that the householder has agreed to allow the test in his or her house, but is otherwise completely unconnected with the team doing the experiment, and indeed has no special knowledge of AI or robotics at all.) The robot must enter the house, find the kitchen, locate local coffee-making supplies and equipment, make coffee to the householder’s taste, and serve it in some other room. It is allowed, indeed required by some of the specifics, for the robot to ask questions of the householder, but it may not be physically assisted in any way.

The current state of the art in robotics falls short of this capability in a number of ways. The robot will need to use vision to navigate, identify objects, possibly identify gestures (“the coffee’s in that cabinet over there”), and to coordinate complex manipulations. Manipulation and physical modeling in a tight feedback learning loop may be necessary, for example, to pour coffee from an unfamiliar pot into an unfamiliar cup. Speech recognition and natural language understanding and generation will be necessary. Planning must be done at a host of levels ranging from manipulator paths to coffee-brewing sequences.

The major AGI advance for a coffee-making robot is that all of these capabilities must be coordinated and used appropriately and coherently in aid of the overall goal. The usual setup, task definition, and so forth are gone from standard narrow AI formulations of problems in all these areas; the robot has to find the problems as well as to solve them. That makes coffee-making in an unfamiliar house a strenuous test of a system’s adaptivity and ability to deploy common sense.

Although standard shortcuts might be used,


[Figure 6 plots the scenarios against two axes. Axis "Individual Capability (Piaget)": Sensory/Motor (Birth - 2 years), Preoperational (2 - 6 years), Concrete Operational (6 years to Adolescent), Formal Operational (Adolescent to Adult). Axis "Social-Cultural Engagement (Vygotsky)": Internalization, Zone of Proximal Development, Tools of Intellectual Adaptation. Scenarios plotted: General Video-game Learning, Preschool, Reading, Scene/Story, School, Wozniak, Human-Level AGI (Nilsson).]

Figure 6. Scenario Milestones on the AGI Landscape.

such as having a database of every manufactured coffeemaker built in, it would be prohibitive to have the actual manipulation sequences for each one preprogrammed, especially given the variability in workspace geometry, dispensers and containers of coffee grounds, and so forth. Transfer learning, generalization, reasoning by analogy, and in particular learning from example and practice are almost certain to be necessary for the system to be practical.

Coffee-making is a task that most 10-year-old humans can do reliably with a modicum of experience. A week’s worth of being shown and practicing coffee making in a variety of homes with a variety of methods would provide the grounding for enough generality that a 10-year-old could make coffee in the vast majority of homes in a Wozniak test. Another advantage to this test is that it would be extremely difficult to cheat, since the only reasonably economical way to approach the task would be to build general-learning skills and have a robot that is capable of learning not only to make coffee but any similar domestic chore.

An Initial Road Map for Achieving AGI

Having set forth these seven scenarios, we can now proceed to array them across the previously defined AGI landscape (figure 3). Figure 6 shows the result of this mapping, with some scenarios like General Video Game Learning and Scene/Story Comprehension spanning large areas of the landscape as their content increases in complexity and cognitive challenge. Just as we consider the Laird and Wray requirements as a practical starting point and our scenarios as initial representatives of a much larger, more representative collection, we hope this initial road map is enhanced and corrected as our collective experience in AGI grows.

Remaining Whitespace

In our view, the suggested scenarios represent a significant step in populating the AGI Landscape, but many alternative scenarios are conceivable, and might even cover areas that we have incompletely addressed or not mentioned at all.

Among the areas that might need more attention are aesthetic appreciation and performance, structured social interaction, and skills that require high cognitive function and integration.

Aesthetic appreciation and performance include composition, literature, artistic portrayal, dance, and others. Examining the appreciation of aesthetics is bound to increase our understanding of the motivational system, of mental representation,


and of metacognition. Aesthetic performance adds manipulation, creation, and adaptation of physical and imagined objects, theory of mind, aspects of sociality, and many things more.

Structured social interaction includes goal adoption, collaboration, competition and exploitation, negotiation, discourse and joint decision making, group organization, leadership, and so on. Scenarios in the social domain will require representations and assessment of social setups, the mental states of individuals, and possible ramifications for the agent’s own goals.

Skills that require high cognitive function and integration include, for instance, complex rescue scenarios, physical shopping, and skilled human assistance. Each of these task areas would require complex coordination between interaction, exploration, evaluation, and planning.

Thus, we encourage and invite suggestions for additional scenarios, as long as these are firmly rooted in an AGI approach, that is, each scenario should foster our understanding of general, humanlike intelligence, and not simply provide narrow engineering solutions to a limited task.

From Scenarios to Tasks, Metrics, and Challenges

The description of scenarios and competency areas, as we have presented here, makes it possible to articulate specific tasks for assessing progress toward AGI in a principled manner. For each of the scenarios reviewed above (or other analogous scenarios), one may define a specific task-set, where each task addresses one or more of the competency areas in the context of the scenario. To constitute a reasonable approximation of an AGI test suite or overall AGI challenge, the total set of tasks for a scenario must cover all the competency areas. Each task must also be associated with some particular performance metric: quantitative wherever possible, but perhaps qualitative in some cases depending on the nature of the task.

The obvious risk of an AGI evaluation approach based on a long list of tasks is that it is susceptible to solution by the so-called “big switch statement” approach, in which separate narrowly specialized programs corresponding to the individual tasks are combined together in a simplistic harness. Some may believe that human intelligence is well approximated by a big switch statement of this sort, but we take a different perspective, which doesn’t rule out the possibility, but also doesn’t focus on or encourage this approach.

In addition, the competency areas alluded to above include many that focus on learning and generalization. So even if a big switch statement approach is used to handle tasks based on these requirements, the switch will be choosing between a large number of specialized programs that in themselves could handle a great deal of learning.

Example Tasks

For the sake of brief illustration, table 1 roughly describes a handful of example tasks and task families corresponding to a few of the scenarios described above. Much of the work of fleshing out the initial roadmap described here will involve systematically creating a substantial list of tasks corresponding to the scenarios chosen, using an exploratory enumeration of human general intelligence competencies as a guide (perhaps a refined version of the list in figure 5). Each task will then need to be associated with quantitative performance metrics.

Some of the examples in table 1 are highly specific; others are broader task families. Our feeling is that it will generally be better to identify moderately broad task families rather than highly specific tasks, in order to encourage the creation of systems with more robust and flexible functionality.

Multidimensional Challenges

What then is the right sort of challenge to present to a would-be AGI system? We suggest the following: to address a few dozen closely interrelated, moderately broadly defined tasks in an environment drawn from a moderately broad class (for example, an arbitrary preschool or textbook rather than specific ones determined in advance). If the environment is suitably rich and the tasks are drawn to reflect the spectrum of human general intelligence competencies, then this sort of challenge will motivate the development of genuine human-level AGI systems.

This sort of challenge is not nearly as well specified as a chess contest, RoboCup, or the DARPA Grand Challenge, but we feel the complexity and heterogeneity here is directly reflective of the complexity and heterogeneity of human general intelligence. Some have posited that there are simple core principles underlying all human-level general intelligence. Even if this turns out to be true, the real-world manifestation of these principles (what can actually be measured) is highly complex and involves multiple interrelated competencies defined through interaction with a rich, dynamic world.

Challenges and Competitions

Finally, we emphasize that our main goal here is to propose a capability roadmap for the AGI research community, not a competition or contest. Competitions can be valuable, and we would be happy to see one or more developed along the lines of the capability roadmap we have described. However, we doubt any successful competition could ever cover the full extent of a roadmap such as this.
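The coverage requirement stated under From Scenarios to Tasks, Metrics, and Challenges (the tasks for a scenario should together address every competency area, and each task should carry a performance metric) can be illustrated with a small check. This is only a sketch under stated assumptions: the `Task` record, the `uncovered_areas` helper, and the four-item area set (names borrowed from table 1) are hypothetical and are not part of the roadmap itself.

```python
from dataclasses import dataclass

# Hypothetical, abbreviated stand-in for the competency areas of figure 5.
COMPETENCY_AREAS = {"Learning", "Memory", "Actuation", "Reasoning"}


@dataclass(frozen=True)
class Task:
    """One task in a scenario's task-set (illustrative record, not from the article)."""
    name: str
    competency_areas: frozenset  # areas this task addresses
    metric: str                  # how performance on the task is scored


def uncovered_areas(tasks, areas=COMPETENCY_AREAS):
    """Return the competency areas that no task in the set addresses."""
    covered = set()
    for task in tasks:
        covered |= task.competency_areas
    return set(areas) - covered


tasks = [
    Task("build block pyramid", frozenset({"Learning", "Actuation"}), "seconds to completion"),
    Task("recall rewarded events", frozenset({"Memory"}), "answer accuracy"),
]

print(sorted(uncovered_areas(tasks)))  # ['Reasoning']
```

A non-empty result means the task-set does not yet approximate a full test suite in the sense described above. Note that such a check says nothing about the "big switch statement" risk, which concerns how a system solves the tasks rather than which areas the tasks touch.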


Scenario | Competency Area | Subarea | Example Task or Task Family
Virtual Preschool | Learning | Dialogical | Learn to build a particular structure of blocks (such as a pyramid) faster based on a combination of imitation, reinforcement, and verbal instruction, than by imitation and reinforcement without verbal instruction.
Virtual Preschool | Modeling Self and Other | Theory of Mind | While Sam is in the room, Ben puts the red ball in the red box. Then Sam leaves and Ben moves the red ball to the blue box. Sam returns and Ben asks him where the red ball is. The agent is asked where Sam thinks the ball is.
Virtual School Student | Learning | Media-oriented | Starting from initially available basic concepts (a number, a variable, a function), demonstrate academic progress in learning how to solve problems from the textbook using techniques described in the same textbook. The agent should move step by step, from simple to advanced problems, from one domain to another.
Robot Preschool | Actuation | Proprioception | The teacher moves the robot's body into a certain configuration. The robot is asked to restore its body to an ordinary standing position, and then repeat the configuration that the teacher moved it into.
Robot Preschool | Memory | Episodic | Ask the robot about events that occurred at times when it received a particularly large, or particularly small, reward for its actions; it should be able to answer simple questions about these significant events, with significantly more accuracy than about events occurring at random times.
Wozniak Test | Communication | Gestural | The robot will be directed to the kitchen. It must understand gestures indicating that it should follow an indicated path, or know how to follow its guide, and know when either is appropriate.
Wozniak Test | Actuation | Navigation | The robot must complete its task without colliding with people, walls, furniture, or pets.
Wozniak Test | Social Interaction | behavior | The robot must recognize and appropriately respond to the situation where it has knocked on the wrong door, the householder is not home, or the robot is not welcome to enter the house.
Wozniak Test | Reasoning | Physical | Be able to use incomplete or alternative tools and equipment. For instance, a drip pot may be used without its top, but not a percolator. This may require physical simulation, based on an understanding of naive physics.
Wozniak Test | Reasoning | Induction | On the other hand, the above-mentioned knowledge about drip pots and percolators may be gathered through inductive reasoning based on observations in multiple relevant situations.

Table 1. Mapping Human General Intelligence Competencies to Scenario Tasks and Task Families.
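As an illustration only, the mapping in Table 1 can be treated as data. The sketch below transcribes the scenario, competency area, and subarea columns, groups competency areas by scenario, and computes a rough fraction of areas in which a system has met at least one task metric. The names `TABLE_1`, `areas_by_scenario`, and `area_coverage`, the pass/fail set `passed`, and the coverage rule itself are assumptions made for this sketch; they are not a measure defined by the article.

```python
from collections import defaultdict

# (scenario, competency area, subarea) rows transcribed from Table 1.
TABLE_1 = [
    ("Virtual Preschool", "Learning", "Dialogical"),
    ("Virtual Preschool", "Modeling Self and Other", "Theory of Mind"),
    ("Virtual School Student", "Learning", "Media-oriented"),
    ("Robot Preschool", "Actuation", "Proprioception"),
    ("Robot Preschool", "Memory", "Episodic"),
    ("Wozniak Test", "Communication", "Gestural"),
    ("Wozniak Test", "Actuation", "Navigation"),
    ("Wozniak Test", "Social Interaction", "behavior"),
    ("Wozniak Test", "Reasoning", "Physical"),
    ("Wozniak Test", "Reasoning", "Induction"),
]


def areas_by_scenario(rows):
    """Group the distinct competency areas exercised in each scenario."""
    grouped = defaultdict(set)
    for scenario, area, _subarea in rows:
        grouped[scenario].add(area)
    return dict(grouped)


def area_coverage(rows, passed):
    """Fraction of competency areas with at least one passed task.

    `passed` is a hypothetical set of (scenario, subarea) pairs for which
    a system met the task's metric. Counting an area as covered after a
    single passed task is an assumption of this sketch.
    """
    all_areas = {area for _scenario, area, _subarea in rows}
    covered = {
        area
        for scenario, area, subarea in rows
        if (scenario, subarea) in passed
    }
    return len(covered) / len(all_areas)


if __name__ == "__main__":
    # Hypothetical results: two tasks passed, touching two of seven areas.
    passed = {
        ("Virtual Preschool", "Dialogical"),
        ("Wozniak Test", "Navigation"),
    }
    print(sorted(areas_by_scenario(TABLE_1)["Wozniak Test"]))
    print(f"{area_coverage(TABLE_1, passed):.2f}")
```

With the hypothetical `passed` set shown, two of the seven competency areas that appear in Table 1 count as covered. A fuller version would measure coverage against the complete competency list rather than only the areas sampled in this table.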

One reason for this is that a competition is very likely to involve concrete decisions regarding the prioritization or temporal ordering of tasks, which will inevitably result in the competition having widely differential appeal to AGI researchers depending on their chosen scientific approach.

The primary value to be derived through wide adoption of a capability roadmap of this nature is the concentration of multiple researchers pursuing diverse approaches on common scenarios and tasks, which allows them to share ideas and learn from each other much more than would be the case otherwise. There is much to be gained through multiple researchers pursuing the same concrete challenges in parallel, with or without the presence of an explicit judged competition.

Call For Collaboration

We have explained our vision for the creation of a roadmap to human-level artificial general intelligence, selected goals and starting points, established a landscape, populated it with a number of diverse scenarios, and fleshed out some of the details. What’s next is to put more meat on the bones: describe the scenarios in more detail, refine the list of specific competency areas, and then create tasks and metrics along the lines outlined above. Once this is done, we will have a capability roadmap clearly articulating some plausible pathways to AGI; and we will have the basis for creating holistic, complex challenges that can meaningfully assess progress on the AGI problem.

Such a roadmap does not give a highly rigorous, objective way of assessing the percentage of progress toward the end goal of AGI. However, it gives a much better sense of progress than one would have otherwise. For instance, if an AGI system performed well on diverse metrics corresponding to tasks assessing 50 percent of the competency areas listed above, in several of the above scenarios, the creators of this system would seem justified in claiming to have made very substantial progress toward AGI. If the number were 90 percent rather than 50 percent, one would seem justified in claiming to be almost there. If the number were 25 percent, this would give one a reasonable claim to interesting AGI progress. This kind of qualitative assessment of progress is not the most one could hope for, but it is better than the progress indications one could get without this sort of roadmap.

We have already explicitly noted many of the limitations of our treatment of this challenging topic. While there is no consensus among AGI researchers on the definition of general intelligence, we can generally agree on a pragmatic goal. The diversity of scenarios presented reflects a diversity of perspectives among AGI researchers regarding which environments and tasks best address the most critical aspects of AGI. Most likely neither the tentative list of competencies nor the Laird and Wray criteria are either necessary or sufficient. There is no obvious way to formulate a precise measure of progress toward AGI based on the competencies and scenarios provided, though one can use these to motivate potentially useful approximative measures.

But in spite of its acknowledged limitations, a capability roadmap of the sort outlined here allows multiple researchers following diverse approaches to compare their work in a meaningful way; allows researchers, and other observers, to roughly assess the degree of research progress toward the end goal of human-level artificial general intelligence; and allows work on related road-mapping aspects, such as tools roadmaps and study of social implications and potential future applications, to proceed in a more structured way.

With this in mind, we encourage AGI researchers to join with us in the ongoing collaborative construction of a more detailed AGI roadmap along the lines suggested here.

Notes

1. See the DARPA Grand Challenge Rulebook (www.darpa.mil/grandchallenge05/Rules_8oct04.pdf).
2. See the DARPA Urban Challenge Rules (www.darpa.mil/grandchallenge/rules.asp).
3. See the Standard Performance Evaluation Corporation (www.spec.org).

References

Arel, I. 2009. Working Toward Pragmatic Convergence: AGI Axioms and a Unified Roadmap. Paper presented at the Workshop on the Future of AGI, Second Conference on Artificial General Intelligence (AGI-09).

Arel, I., and Livingston, S. 2009. Beyond the Turing Test. IEEE Computer 42(3): 90–91.

Bengio, Y. 2009. Learning Deep Architectures for AI. Foundations and Trends in Machine Learning 2(1): 1–127.

Campbell, M.; Hoane, A. J., Jr.; and Hsu, F. 2002. Deep Blue. Artificial Intelligence 134(1–2) (January): 57–83.

Dempster, F. N. 1981. Memory Span: Sources of Individual and Developmental Differences. Psychological Bulletin 89(1): 63–100.

Ferrucci, D. 2010. Build Watson: An Overview of DeepQA for the Jeopardy! Challenge. In Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques (PACT ’10), 1–2. New York: Association for Computing Machinery.

Fulghum, R. 1989. All I Really Need to Know I Learned in Kindergarten. New York: Random House/Ivy Books.

Gardner, H. 1999. Intelligence Reframed: Multiple Intelligences for the 21st Century. New York: Basic Books.

Goertzel, B. 2010. Toward a Formal Definition of Real-World General Intelligence. In Proceedings of the Third Conference on Artificial General Intelligence (AGI-10). Paris: Atlantis Press.

Goertzel, B., and Bugaj, S. V. 2009. AGI Preschool. In Proceedings of the Second Conference on Artificial General Intelligence (AGI-09). Paris: Atlantis Press.

Grand, S. 2001. Creation: Life and How to Make It. London: Phoenix Books.

Gregory, R. 1996. Psychological Testing: History, Principles, and Applications. London: Allyn and Bacon.

Hutter, M. 2004. Universal Artificial Intelligence: Sequential Decisions Based on Algorithmic Probability. Berlin: Springer.

Kitano, H. 1997. RoboCup: The Robot World Cup Initiative. In Proceedings of the First International Conference on Autonomous Agent (Agents-97). New York: Association for Computing Machinery.

Laird, J., and Wray, R. 2010. Cognitive Architecture Requirements for Achieving AGI. In Proceedings of the Third Conference on Artificial General Intelligence (AGI-10). Paris: Atlantis Press.

Laird, J.; Wray, R.; Marinier, R.; and Langley, P. 2009. Claims and Challenges in Evaluating Human-Level Intelligent Systems. In Proceedings of the Second Conference on Artificial General Intelligence (AGI-09). Paris: Atlantis Press.

Legg, S., and Hutter, M. 2007. A Definition of Machine Intelligence. Minds and Machines 17(4): 391–444.

Lungarella, M.; Metta, G.; Pfeifer, R.; and Sandini, G. 2003. Developmental Robotics: A Survey. Connection Science 15(4): 151–190.

Nilsson, N. J. 2005. Human-Level Artificial Intelligence? Be Serious! AI Magazine 26(4): 68–75.

Piaget, J. 1953. The Origins of Intelligence in Children. London: Routledge and Kegan Paul.

Samsonovich, A. V.; Ascoli, G. A.; and De Jong, K. A. 2006. Human-Level Psychometrics for Cognitive Architectures. Paper presented at the Fifth International Conference on Development and Learning (ICDL5), Bloomington, IN, 31 May–3 June.

Steels, L. 2010. Modeling the Formation of Language in Embodied Agents: Methods and Open Challenges. In Evolution of Communication and Language in Embodied Agents, 223–233. Berlin: Springer Verlag.

Sundance 2001. Teacher Guide for Just Kids. Marlborough, MA: Sundance Publishing.

Turing, A. M. 1950. Computing Machinery and Intelligence. Mind 59(236) (October): 433–60.

Vygotsky, L. 1986. Thought and Language. Cambridge, MA: The MIT Press.

Walter, W. G. 1953. The Living Brain. New York: W. W. Norton.

Winne, P. H., and Perry, N. E. 2000. Measuring Self-Regulated Learning, 531–566. Boston: Academic Press.

Wozniak, S., and Moon, P. 2007. Three Minutes with Steve Wozniak. PC World July 19.


Sam S. Adams ([email protected]) currently works for IBM Research and was appointed one of IBM’s first distinguished engineers in 1996. His far-ranging contributions include founding IBM’s first object technology practice, authoring IBM’s XML technical strategy, originating the concept of service-oriented architecture, pioneering work in self-configuring and autonomic systems, artificial general intelligence, end-user mashup programming, massively parallel many-core programming, petascale analytics, and data-centered systems. He is a founding member of the AGI community and continues to research bootstrapping human-level general intelligence through the Joshua Blue project.

Itamar Arel ([email protected]) is an associate professor of electrical engineering and computer science and director of the Machine Intelligence Laboratory at the University of Tennessee. His research focus is on high-performance machine learning architectures and algorithms, with emphasis on deep learning architectures, reinforcement learning, and decision making under uncertainty. Arel is a recipient of the U.S. Department of Energy Early Career Principal Investigator (CAREER) award in 2004. He holds B.S., M.S., and Ph.D. degrees in electrical and computer engineering and an M.B.A. degree, all from Ben-Gurion University in Israel.

Joscha Bach ([email protected]) studied computer science and philosophy at Berlin’s Humboldt University and at Waikato University (NZ). He received a Ph.D. in cognitive science from the University of Osnabrück. His main research interest concerns the design of cognitive architectures, especially with respect to representation, motivation, and affect. He is a research fellow at the Berlin School of Mind and Brain, at Humboldt University of Berlin.

Robert Coop ([email protected]) is a doctoral student at the University of Tennessee at Knoxville. He is currently studying machine learning, with a focus on generalized pattern extraction methods.

Rod Furlan ([email protected]) is the chief of research and development at Quaternix Research Inc. His work explores the intersection of artificial intelligence, high-performance computing, and quantitative finance. Passionate about technology and a serial autodidact, Furlan is also a Singularity University alumnus and teaching fellow for AI and robotics.

Ben Goertzel ([email protected]) is chief scientist of the financial prediction firm Aidyia Holdings and chairman of the AI consulting firm Novamente LLC and bioinformatics firm Biomind LLC. He is also leader of the OpenCog open-source artificial general intelligence project, general chair of the AGI conference series, and vice chair of the futurist organization Humanity+. His major research focuses are embodied artificial general intelligence and applications of AI to genomics.

J. Storrs Hall ([email protected]) is an independent Ph.D. scientist and author. His recent book, Beyond AI: Creating the Conscience of the Machine (Prometheus, 2007), was the first full-length treatment of . He is active in the machine ethics and AGI research communities, serving on the program committee for the international AGI conference series.

Alexei V. Samsonovich ([email protected]) is a research assistant professor at Krasnow Institute for Advanced Study at George Mason University. He holds a Ph.D. in applied mathematics from the University of Arizona, where he codeveloped a continuous-attractor theory of hippocampal spatial maps and a mental state framework for cognitive modeling. His current research focuses on biologically inspired cognitive architectures (BICA). He has chaired annual conferences on BICA since 2008, cofounded and directed the BICA Society since 2010, and edited BICA since 2012.

Matthias Scheutz ([email protected]) is an associate professor of computer science at Tufts University School of Engineering. His current research interests include agent-based modeling, cognitive modeling, complex cognitive and affective robots for human-robot interaction, computational models of human language processing in mono- and bilinguals, cognitive architectures, distributed agent architectures, and interactions between affect and cognition. Scheutz earned a Ph.D. in philosophy from the University of Vienna in 1995 and a Joint Ph.D. in cognitive science and computer science from the Indiana University Bloomington in 1999.

Matthew Schlesinger ([email protected]) received his Ph.D. in developmental psychology from the University of California at Berkeley, with postdoctoral research in robotics and machine learning. He is an associate professor of psychology at Southern Illinois University, Carbondale, where he is a member of the Brain and Cognitive Sciences program and directs the Vision Lab. Schlesinger serves as an associate editor of IEEE Transactions on Autonomous Mental Development, in addition to cochairing the TAMD-IEEE Technical Committee. His work integrates a number of approaches that span psychology and machine learning, and includes experimental research with human infants, children, and adults, modeling of perceptual and cognitive processes in embodied agents, as well as neuroimaging.

Stuart C. Shapiro ([email protected]) is a professor of computer science and engineering and an affiliated professor of linguistics and of philosophy at the University at Buffalo, The State University of New York. He received an SB in mathematics from the Massachusetts Institute of Technology in 1966, and an MS and PhD in computer sciences from the University of Wisconsin – Madison in 1968 and 1971. His primary research interests are in knowledge representation and reasoning, especially in support of natural language competence and cognitive robotics. He is a past chair of ACM/SIGART; a past president of Principles of Knowledge Representation and Reasoning, Incorporated; a life senior member of the IEEE; an ACM Distinguished Scientist; and a Fellow of the AAAI.

John F. Sowa ([email protected]) spent 30 years working on research and development projects at IBM and is a cofounder of VivoMind Research, LLC. He has a BS in mathematics from the Massachusetts Institute of Technology, an MA in applied mathematics from Harvard, and a PhD in computer science from the Vrije Universiteit Brussel. He is a fellow of the American Association for Artificial Intelligence. The language of conceptual graphs, which he designed, has been adopted as one of the three principal dialects of the ISO/IEC standard for Common Logic.

