Sequences of Mechanisms for Causal Reasoning in Artificial Intelligence

Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence Sequences of Mechanisms for Causal Reasoning in Artificial Intelligence Denver Dash Mark Voortman Martijn de Jongh Intel Corporation and University of Pittsburgh University of Pittsburgh Carnegie Mellon University USA USA USA [email protected] [email protected] [email protected] Abstract been generalized by [Pearl(2000)] and others (e.g., [Spirtes et al.(2000)Spirtes, Glymour, and Scheines]) to include dis- We present a new approach to token-level causal crete propositional variables, of which (causal) Bayesian net- reasoning that we call Sequences Of Mechanisms works (BNs) can be viewed as a subset because they are (SoMs), which models causality as a dynamic se- acyclic. Yet, despite decades of theoretical progress, causal quence of active mechanisms that chain together models have not been widely used in the context of core AI to propagate causal influence through time. We applications such as robotics. motivate this approach by using examples from AI and robotics and show why existing approaches are Because causality in econometrics operates at the pop- inadequate. We present an algorithm for causal ulation level, important human causal faculties that oper- reasoning based on SoMs, which takes as input a ate on single entities may have been neglected in current knowledge base of first-order mechanisms and a set causality formalisms. Such single-entity cases are com- of observations, and it hypothesizes which mecha- mon in AI: A robot gets stuck in a puddle of oil, a person nisms are active at what time. We show empirically has chest pains and we want to know the specific causes, that our algorithm produces plausible causal expla- etc. This distinction between population-level and individual- nations of simulated observations generated from a level causation is known in the philosophical literature as [ causal model. type-level versus token-level causation (e.g., Eells(1991); Kleinberg(2012)])orgeneral versus singular causation (e.g., We argue that the SoMs approach is qualitatively [Hitchcock(1995); Davidson(1967)]). For example, a type- closer to the human causal reasoning process, for level model for lung cancer might include all the possible example, it will only include relevant variables in causes, such as: CigaretteSmoking, AsbestosExposure, Ge- explanations. We present new insights about causal neticFactors, etc., whereas a token-level explanation contains reasoning that become apparent with this view. One only actual causes: “Bob’s lung cancer was caused by that such insight is that observation and manipulation time in the 80s when he snorted asbestos.” do not commute in causal models, a fact which we Models in econometrics rely less on token causality and show to be a generalization of the Equilibration- more on type-level reasoning. Causality-in-AI’s evolution Manipulation Commutability of [Dash(2005)]. from these disciplines may explain why token-level causal reasoning has been less studied in AI. However, in many AI 1 Introduction tasks such as understanding why a robot hand is stuck in a The human faculty of causal reasoning is a powerful tool cabinet, this ability may be crucial to posing and testing con- to form hypotheses by combining limited observational data crete causal hypotheses. This disconnect may further explain with pre-existing knowledge. This ability is essential to un- why causal reasoning has up to now not been widely used in covering hidden structure in the world around us, perform- the context of robotics and AI applications. ing scientific discovery and diagnosing problems in real time. In this paper, we consider the tasks of producing token- Enabling computers to perform this kind of reasoning in an level explanations and predictions for causal systems. We effective and general way is thus an important sub-goal to- present a new representation for these tasks which we call Se- ward achieving Artificial Intelligence. quences of Mechanisms (SoMs). We motivate SoMs with sev- The theoretical development of causality in AI has eral examples for how they improve causal reasoning in typi- up to now primarily been based on structural equation cal AI domains. We present an algorithm for the construction models (SEMs) [Strotz and Wold(1960); Simon(1954); of SoMs from a knowledge base of first-order causal mecha- Haavelmo(1943)], a formalism which originated in econo- nisms, and we show empirically that this algorithm produces metrics and which is still used commonly in the economic good causal explanations. The first-order nature of our mech- and social sciences. The models used in these disci- anisms facilitates scalability and more human-like hypotheses plines typically involve real-valued variables, linear equa- when the possible number of causes is large. tions and Gaussian noise distributions. In AI this theory has There has been some work on representation and al- 839 gorithms for token-level causal reasoning. In particular, a boxing match, and we would like to identify a specific [Halpern and Pearl(2005)] present a definition of causal mechanism that is becoming active when someone delivers explanation which uses the concept of actual cause from a Punch. We argue in this section that mechanisms such as [Halpern and Pearl(2001)], based on functional causal mod- these are more fundamental than a state-space representation els of [Pearl(2000)], to produce sets of variables which are which only deals with sets of variables and their states. deemed to be possible explanations for some evidence. We In many real-world problems, having a first-order represen- discuss some shortcomings of token causality with functional tation for token-level reasoning is essential. For example, the causal models such as BNs in Section 2. In particular, we method of [Halpern and Pearl(2005)] can generate explana- show that in order for this representation to be general, it tions, but only once a “complete” causal graph enumerating must essentially be reduced to the approach presented in this all possible causal ancestors of an event is specified. In the paper. [Kleinberg(2012)] discusses token causality explicitly office brawl scenario as well as other real-world simple sce- and presents a measure of significance for a token-level im- narios such as a pool table where there are an intractable or mediate cause given logical formulae which are similar syn- even infinite number of possible events that could cause other tactically to our mechanisms; however she does not attempt events to happen, building this “complete” causal model may to find optimal chains of causation which could serve as non- be impossible or impractical. trivial hypotheses. Furthermore, both Halpern and Pearl’s This limitation aside, in BNs and SEMs, something similar and Kleinberg’s approaches are fundamentally propositional to token-level reasoning can be performed by instantiating in nature, so lack our ability to scale when the number of pos- variables in the model to values based on the specific sible causes is large. instance at hand. For example, in the lung cancer case of Section 1, one might instantiate LungCancer to True and 2 Sequences of Mechanisms AsbestosExposure to True to indicate the token-level hypoth- To illustrate the type of reasoning we would like to enable, esis that Bob’s asbestos snorting adventure caused his lung consider the following simple running example of “human- cancer. However, this state-space approach is incomplete: like” causal inference: being able to reason only about states and possible causes is different from creating specific hypotheses about how While you are in a business meeting with Tom, Bob an event was caused. For example, it could very well be suddenly bursts into your office and punches Tom in that Bob was a smoker, but the smoking was not the cause the face. Tom falls to the ground, then gets up and of his lung cancer. In this case, the BN model is unable punches Bob back. to distinguish (and score) between the three hypotheses: In this example, which we will refer to as the Office Brawl Smoking→LungCancer, AsbestosExposure→LungCancer example, there are three main events spaced out in time: and Smoking & AsbestosExposure→LungCancer. Punch(Bob,Tom,T1), Fall(Tom,T2), and Punch(Tom,Bob,T3) This problem becomes exacerbated in a more realistic with T1 <T2 <T3. Humans, given their wealth of back- causal model where the number of nodes would be much ground knowledge might construct the graph of Figure 1(a). higher, and where nodes would combine in nontrivial ways. They may also be able to expand on these observed events to Consider, for example, the BN causal model of Figure 2(a) where all nodes are binary. Given some evidence we can ob- tain beliefs about the states of all the nodes in the graph (say dark represents False and light represents True). This representation of a type-level causal graph plus specific states does not necessarily provide us with a clear token-level picture of what is happening causally in this system. On the other hand, a token-level explanation represented by a subgraph (such as that shown in Figure 2(b) showing the likely causal ancestors of the event of interest provides a much clearer causal picture. If one wanted to consider, for example, which manipulations might change the outcome of the event of interest, the graph of Figure 2(b) would be much more informative. This sug- gests one possible algorithm to achieve such a token-level ex- Figure 1: (a) A simple causal explanation. (b) A more elabo- planation which looks strikingly similar to a structure search rate causal explanation with hypothesized events. given data, with two important differences: (1) the data of interest here is a single record, and (2) we have strong priors include hidden events, such as Angry, of which a causal graph on the structure, being provided by the full type-level causal is displayed in Figure 1(b).

Sequences of Mechanisms for Causal Reasoning in Artificial Intelligence

Artificial General Intelligence and Classical Neural Network

Artificial Intelligence: Distinguishing Between Types & Definitions

Artificial Intelligence/Artificial Wisdom

Machine Guessing – I

How Empiricism Distorts Ai and Robotics

Artificial Intelligence for Robust Engineering & Science January 22

Artificial Intelligence and National Security

A Framework for Designing Compassionate and Ethical Artificial Intelligence and Artificial Consciousness

Artificial Intelligence Measurement and Evaluation at The

The New Empiricism and the Semantic Web: Threat Or Opportunity?

Artificial Pain May Induce Empathy, Morality, and Ethics in The

Juniper and Cloud Ingenuity Transform Wi-Fi to Enable Wayfinding and Asset Tracking for Enhanced User Experience