Progress and Challenges in Research on Cognitive Architectures
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17)

Pat Langley
Institute for the Study of Learning and Expertise
2164 Staunton Court, Palo Alto, CA 94306

Abstract

Research on cognitive architectures attempts to develop unified theories of the mind. This paradigm incorporates many ideas from other parts of AI, but it differs enough in its aims and methods that it merits separate treatment. In this paper, we review the notion of cognitive architectures and some recurring themes in their study. Next we examine the substantial progress made by the subfield over the past 40 years, after which we turn to some topics that have received little attention and that pose challenges for the research community.

1 Introduction and Overview

Most research in AI is analytic, in that it selects some facet of intelligence and attempts to understand it in detail, typically in isolation from other elements. This is balanced by a smaller movement, synthetic in character, that aims to discover how different aspects of intelligence interact. Without efforts of this latter sort, AI may be able to create idiot savants that outperform people in narrow arenas, but it cannot create complete intelligent agents that show the same breadth of abilities as seen in humans.

Theoretical physicists seek a grand unified theory that explains all known physical laws within a single consistent framework. The analog in AI is a unified theory of cognition, which Newell (1990) linked to the notion of a cognitive architecture. The mapping is imperfect in that most AI researchers focus on creating computational artifacts rather than explaining observations. However, if we want systems that exhibit the full range of human intelligence, they must reproduce all major phenomena associated with the latter.

In the sections that follow, we review the cognitive architecture paradigm and its recurring themes, then discuss the great strides it has made in the decades since its inception. We will argue that, despite this progress, research has focused on some topics to the near exclusion of others, and we examine a number of areas that deserve more attention. One open question is whether replicating these abilities requires changes to existing architectures or simply adding new types of knowledge. Ultimately, this is an empirical question that can only be answered by making the attempt, but we will take some tentative stances on the issue.

Copyright © 2017, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

2 Previous Work on Cognitive Architectures

The cognitive architecture movement shares many ideas with other branches of AI, but it has sufficiently different emphases that we should clarify them before discussing its status. In this section, we describe its basic aims, some recurring themes, and some well-known examples.

2.1 The Notion of a Cognitive Architecture

A cognitive architecture (Newell 1990) is a theory of intelligent behavior that specifies those facets of cognition hypothesized to remain constant over time and across different domains. This includes memory stores and the representations of elements in those memories, but not their contents, which change as the result of external stimuli and internal processing. In this sense, a cognitive architecture is analogous to a building architecture, which describes its fixed structure (e.g., floors, rooms, and doors), but not its replaceable elements (e.g., tables, chairs, and people).

However, such a framework incorporates more constraints than the 'software architectures' used in mainstream computer science. It makes strong assumptions about the representations and mechanisms that underlie cognition, typically incorporating ideas from psychology about the nature of the human mind. Most cognitive architectures have distinct modules, but these usually access and alter the same memories and representations. Moreover, they come with a programming language for constructing intelligent agents that adopts a high-level syntax reflecting theoretical assumptions. Production systems (Klahr et al. 1987) were both the first and the most common examples of the paradigm; they are not the only variety, but many frameworks labeled as 'cognitive architectures' do not fit the criteria we have specified.

2.2 Recurring Themes

The literature on cognitive architectures reflects a number of common assumptions or recurring themes that give it a distinctive character, although it shares some with other areas of artificial intelligence. These include postulates that:

• Short-term memories are distinct from long-term stores. The former, often called working memories, include content that changes rapidly over time, such as the agent's beliefs and goals. The latter store stable elements that remain static or change slowly through learning. Short-term elements are typically specific, while long-term content is often general in form; the former play the role of program variables in traditional languages, whereas the latter often serve as program instructions.

• Memories contain modular symbolic structures. Both short-term and long-term repositories are collections of distinct elements that, usually, are encoded as list structures. These may include numeric information, but each item is a relational structure that also specifies symbolic content. Short-term elements typically share symbols so that, jointly, they can denote large-scale relationships.

• Relational pattern matching accesses long-term content. Before an architecture can use the 'program' encoded in its long-term memories, it must find relevant structures. This invariably revolves around matching or unifying relational patterns in these elements against corresponding elements in the short-term stores. Such patterns match when variables in the long-term structure bind consistently with constants in dynamic memories.

• Cognitive processing occurs in recognize-act cycles. The core interpreter alternates between matching long-term structures against short-term ones, selecting a subset of the former to apply, and executing their associated actions to alter memory or the environment. Because many long-term elements may be satisfied, a conflict resolution mechanism selects among them. This reflects an abiding concern of architectural research with cognitive control.

• Cognition dynamically composes mental structures. Sequential application of rules or similar knowledge elements produces working memory elements, which in turn enable matching on the next cycle. This chaining process essentially combines the matched elements into new entities, as in work on logical reasoning. Similarly, structural learning involves the composition of new rules or skills from existing cognitive material.

These assumptions are well suited to the class of phenomena and level of analysis for which most architectures have been designed. Research in this paradigm was influenced strongly by studies of human thinking, with behavior taking seconds to minutes and with basic operations on the order of 50 to 100 milliseconds. In such settings, humans appear to access relevant content from long-term memory in parallel (or

2.3 Example Architectures

To clarify these ideas, we should briefly review some cognitive architectures that reflect these recurring themes. Here are three examples of such frameworks:

• ACT-R (Anderson 1993) is a mature architectural theory concerned mainly with explaining psychological data. It combines a procedural memory of generalized production rules with a declarative memory of specific facts, with the former accessing the latter through activation-based retrieval and pattern matching. Learning mechanisms include rule compilation and statistical updates, each based on use of knowledge structures.

• Soar (Laird et al. 1987) is another well-developed architecture that is organized around problem-space search. An elaboration stage invokes production rules to draw inferences and evaluate alternatives, with results used by a decision-making stage that selects an operator to apply mentally or in the environment. A chunking mechanism acquires rules based on results from problem solving. Recent versions include other modules that we discuss later.

• ICARUS (Langley, Choi, and Rogers 2009) is a more recent architecture that includes separate modules for conceptual inference, teleoreactive skill execution, means-ends problem solving, and skill acquisition. It differs from predecessors by positing that cognition is grounded in perception and action, conceptual knowledge is distinct from skills, both are organized in a hierarchy, and short-term elements are instances of long-term structures.

These architectures are important examples because they combine mechanisms for inference, routine activity, goal-directed problem solving, and structure learning.

We should also mention frameworks that omit some of these elements but share the core assumptions listed earlier:

• PRODIGY (Veloso et al. 1995) uses search-control rules to guide means-ends analysis and learns new rules from successes and failures. The architecture supports a variety of learning methods, but focuses on planning tasks rather than other forms of high-level cognition.

• CAPS (Thibadeau 1983) encodes knowledge as production rules but applies them in parallel to alter the activations of elements in working memory. This lets the architecture
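The recurring themes in Section 2.2 can be made concrete with a minimal production-system interpreter. The sketch below is illustrative only; it is not code from ACT-R, Soar, or any other architecture discussed here, and all names in it are hypothetical. It shows general rules with variables stored in long-term memory, specific ground facts in working memory, relational pattern matching by consistent variable binding, a recognize-act loop, and a trivial conflict-resolution policy (first match wins).

```python
def match(pattern, fact, bindings):
    """Unify one relational pattern (a tuple whose '?x'-style symbols
    are variables) against a ground fact, extending bindings consistently."""
    if len(pattern) != len(fact):
        return None
    new = dict(bindings)
    for p, f in zip(pattern, fact):
        if isinstance(p, str) and p.startswith("?"):
            if p in new and new[p] != f:
                return None          # inconsistent variable binding
            new[p] = f
        elif p != f:
            return None              # constant mismatch
    return new

def match_all(patterns, wm, bindings=None):
    """Yield every consistent binding set for a rule's condition patterns."""
    if bindings is None:
        bindings = {}
    if not patterns:
        yield bindings
        return
    for fact in wm:
        b = match(patterns[0], fact, bindings)
        if b is not None:
            yield from match_all(patterns[1:], wm, b)

def substitute(template, bindings):
    return tuple(bindings.get(t, t) for t in template)

def run(rules, wm, cycles=10):
    """Recognize-act loop: match rules, resolve conflicts, execute actions."""
    for _ in range(cycles):
        conflict_set = [(rule, b)
                        for rule in rules
                        for b in match_all(rule["if"], wm)]
        # Keep only instantiations whose actions would add something new.
        conflict_set = [(r, b) for r, b in conflict_set
                        if any(substitute(a, b) not in wm for a in r["then"])]
        if not conflict_set:
            break                    # quiescence: nothing left to do
        rule, b = conflict_set[0]    # conflict resolution: first match wins
        for action in rule["then"]:
            wm.add(substitute(action, b))
    return wm

# Long-term memory: general rules ('program instructions').
rules = [
    {"if":   [("parent", "?x", "?y"), ("parent", "?y", "?z")],
     "then": [("grandparent", "?x", "?z")]},
]
# Short-term (working) memory: specific ground facts ('program variables').
wm = {("parent", "ann", "bob"), ("parent", "bob", "cal")}
print(run(rules, wm))
```

Applying the single rule chains the two parent facts into a new grandparent element, illustrating how sequential rule application composes working-memory structures that enable matching on later cycles.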