
From: AAAI Technical Report WS-93-02. Compilation copyright © 1993, AAAI (www.aaai.org). All rights reserved.

Automating Path Analysis for Building Causal Models from Data: First Results and Open Problems

Paul R. Cohen, Lisa Ballesteros, Adam Carlson, Robert St. Amant
Experimental Knowledge Systems Laboratory
Department of Computer Science, University of Massachusetts, Amherst
[email protected]@cs.umass.edu [email protected] [email protected]

Abstract

Path analysis is a generalization of multiple linear regression that builds models with causal interpretations. It is an exploratory or discovery procedure for finding causal structure in correlational data. Recently, we have applied path analysis to the problem of building models of AI programs, which are generally complex and poorly understood. For example, we built by hand a path-analytic causal model of the behavior of the Phoenix planner. Path analysis has a huge search space, however. If one measures N parameters of a system, then one can build O(2^(N^2)) causal models relating these parameters. For this reason, we have developed an algorithm that heuristically searches the space of causal models. This paper describes path analysis and the algorithm, and presents preliminary empirical results, including what we believe is the first example of a causal model of an AI system induced from performance data by another AI system.

1. INTRODUCTION

This paper describes a statistical discovery procedure for finding causal structure in correlational data, called path analysis [Asher, 83; Li, 75], and an algorithm that builds path-analytic models automatically, given data. This work has the same goals as research in function finding and other discovery techniques, that is, to find rules, laws, and mechanisms that underlie nonexperimental data¹ [Falkenhainer & Michalski 86; Langley et al., 87; Schaffer, 90; Zytkow et al., 90]. Whereas function finding algorithms produce functional abstractions of (presumably) causal mechanisms, our algorithm produces explicitly causal models. Our work is most similar to that of Glymour et al. [87], who built the TETRAD system. Pearl [91; 93] and Spirtes [93] have recently developed causal induction algorithms with a more general mathematical basis than path analysis; they rely on evidence of nonindependence, a weaker criterion than path analysis, which relies on evidence of correlation. Those algorithms will be discussed in a later section.

We developed the path analysis algorithm to help us discover causal explanations of how a complex AI planning system works. The system, called Phoenix [Cohen et al., 89], simulates forest fires and the activities of agents such as bulldozers and helicopters. One agent, called the fireboss, plans how the others, which are semi-autonomous, should fight the fire; but things inevitably go wrong: winds shift, plans fail, bulldozers run out of gas, and the fireboss soon has a crisis on its hands. At first, this chaos was appealing and we congratulated ourselves for building such a realistic environment. However, we soon realized that we could explain very little about the behavior of the fireboss. We turned to regression analysis to answer some questions, such as, "Which has more impact on the time to contain a fire: the wind speed or the number of times the fireboss must replan?" But although regression assumed these factors interacted, it provided no explanation of their causal relationship. For example, we knew that the wind speed could affect the incidence of replanning and not vice versa, but this causal, explanatory knowledge was not to be found in regression models. Nor would automated regression procedures (e.g., stepwise multiple regression) find causal models of our planner. Path analysis, however, is a generalization of regression analysis that produces explicitly causal models. We built one such model of Phoenix by hand, and by automating path analysis as we describe below, we have been able to discover other causal explanations of how the Phoenix fireboss works.

Readers who are familiar with regression analysis might skip to the end of the next section, where we introduce path analysis, or to Section 3, where we illustrate a path analysis of Phoenix. Section 4 describes our algorithm. Section 5 discusses two experiments, an informal one in which we applied the algorithm to Phoenix data, and a factorial experiment in which the behavior of the algorithm was probed by applying it to artificial data.

¹ The term "nonexperimental" is perhaps confusing, because data are usually collected in an experiment. Nonexperimental means that the experiment is over and the opportunity to manipulate variables to see effects has passed. Causal hypotheses must therefore be generated and tested with the data alone.
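The sizes of the model spaces mentioned above are easy to check directly. The sketch below (our own illustration, not from the paper) counts N × N adjacency matrices, with and without the N diagonal (self-loop) cells:

```python
# An N-variable path model can be encoded as an N x N adjacency matrix,
# giving 2**(N*N) possible models; forbidding self-loops removes the N
# diagonal cells and leaves 2**(N*N - N), the figure used in Section 5.
def model_space(n, allow_self_loops=True):
    cells = n * n if allow_self_loops else n * n - n
    return 2 ** cells

for n in (3, 4, 5):
    print(n, model_space(n), model_space(n, allow_self_loops=False))
```

With five variables and no self-loops this gives 2^20 = 1,048,576 models, matching the search-space size reported for Experiment 1 below.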

AAAI-93 Knowledge Discovery in Databases Workshop 1993, Page 153

2. BACKGROUND: REGRESSION

Path analysis is a generalization of multiple linear regression, so we will begin with regression. Simple linear regression finds a least-squares line relating a single predictor variable x to a performance variable y. A least-squares line is one that minimizes the sum of squared deviations of predicted values from actual values. That is, simple linear regression finds a line ŷ = bx + a that minimizes Σ_i (y_i − ŷ_i)². Multiple linear regression finds least-squares rules (i.e., planes and hyperplanes) for more than one predictor variable, rules of the form ŷ = b1 x1 + ... + bk xk + a. The regression equation is better represented in standardized form as Ŷ = β1 X1 + β2 X2 + β3 X3 (standardized variables are denoted with uppercase letters). The interpretation of this model is that a change in X1 of one standard deviation, s1, produces β1 standard deviations change in Ŷ. Thus, beta coefficients are comparable: if β1 = .4 and β2 = .8, then a standard deviation change in X2 has twice the influence on Y as a standard deviation change in X1.

To find the beta coefficients, a set of equations is derived from the regression equation and solved. Written in terms of correlations, they are:

  r_YX1 = β1 + β2 r_X2X1 + β3 r_X3X1
  r_YX2 = β1 r_X1X2 + β2 + β3 r_X3X2        (1)
  r_YX3 = β1 r_X1X3 + β2 r_X2X3 + β3

Clearly, with these three equations we can solve for the three unknown beta coefficients. We have not shown that these coefficients guarantee that Ŷ = β1 X1 + β2 X2 + β3 X3 is a least-squares rule, but the interested reader can find this demonstration in [Li, 75] or any good statistics text.

The three equations (1) are called the normal equations, and they have an interesting interpretation, illustrated in Figure 1. Consider the first normal equation, r_YX1 = β1 + β2 r_X2X1 + β3 r_X3X1. The β1 term is represented in Figure 1 by the direct path between X1 and Y; the second term is represented by the indirect path from X1 through X2 to Y; and the third term is represented by the indirect path through X3. Thus, the correlation r_YX1 is given by the sum of the weights of three paths in Figure 1, where the weight of a path is the product of the coefficients (either correlations or betas) along the constituent links of the path. The second and third normal equations have similar interpretations in Figure 1. By convention, curved arcs without arrows represent correlations and directed arcs represent causes.

Figure 1: The path model that corresponds to the multiple linear regression of Y on X1, X2 and X3.

The causal interpretation of beta coefficients is plausible because betas are standardized partial regression coefficients; they represent the effect of a predictor variable on Y when all the other predictor variables are fixed. You can interpret β1 as what happens to Y when only X1 is systematically varied. In this sense, beta coefficients provide a statistical version of the control you get in an experiment in which X2 and X3 are fixed and X1 is varied; in such an experiment, the effect on Y is attributable to X1. (Alternatively, the effects might be due to an unmeasured or latent variable that is correlated with X1; we will not consider this case here.)

So far we have described how a prediction model Ŷ = β1 X1 + β2 X2 + β3 X3 gives rise to a set of normal equations, and to beta coefficients that make the prediction model a least-squares fit to our data. If this were all we could do, path analysis would be identical to linear regression and not worth the effort. The power of path analysis is that we can specify virtually any prediction model we like, and then solve for beta coefficients that ensure the model is a least-squares fit to our data.

3. PATH ANALYSIS OF PHOENIX DATA

Let us illustrate a path model other than the regression model with an example from the Phoenix system [Cohen & Hart, 93; Cohen et al., 89; Hart & Cohen, 92]. We ran Phoenix on 215 simulated forest fires and collected many measurements after each trial, including:

  WindSpeed   The wind speed during the trial
  RTK         The ratio of fireboss "thinking speed" to the rate at which fires burn
  NumPlans    The number of plans tried before the fire is contained
  FirstPlan   The name of the first plan tried
  Fireline    The length of fireline dug by bulldozers
  FinishTime  The amount of simulated time required to contain the fire

We specified a prediction model and the path model shown in Figure 2, and solved for the path coefficients on each link.

The coefficient on a path from X to Y is the correlation of X and Y if X is uncorrelated with any other node that points to Y; otherwise it is the beta coefficient from the regression of Y on all the correlated nodes that point directly to it (including X). With this rule it is easy to solve for path coefficients by hand, using the correlation matrix and running multiple regressions as necessary.

The estimated correlation between two nodes is the sum of the weights of all legal paths between them. A legal path goes through no node twice, and, for each node on the path, once a node has been entered by an arrowhead, it cannot be left by an arrowhead. The weight of a path is the product of the coefficients along it. For example, the estimated correlation between FirstPlan and FinishTime is the sum of three paths:

  r̂_FirstPlan,FinishTime = (−.432 × .287) + (.219 × .506) + (−.432 × .843 × .506) = −.197

As it happens, this estimated correlation is very close to the actual empirical correlation, calculated from our data.

Figure 2: The result of a path analysis of Phoenix data.

The disparities or errors between estimated and actual correlations of FinishTime with all the other factors are shown in Table 1. With the exception of the disparity between the correlations of WindSpeed and FinishTime, the estimated and actual correlations accord well, suggesting that Figure 2 is a good model of our data.

                                  WS      RTK     #Pln    First   Fline
  r̂(Factor, FinishTime)          .118   −.488    .704   −.197    .765
  r(Factor, FinishTime)         −.053   −.484    .718   −.193    .755

Table 1: Errors in estimated correlations. The first row is the estimated correlation of FinishTime and the factor listed in each column; the second row is the actual, empirical correlation.

4. AUTOMATIC GENERATION OF PATH MODELS

The question that motivated the current research is whether models like the one in Figure 2 can be generated automatically by a heuristic search algorithm. This section describes such an algorithm.

The search space for the algorithm is the set of all possible path models. We can represent any path model as an adjacency matrix of size N × N, where N is the number of variables in the model. Thus the search space is of size 2^(N^2).

The algorithm uses a form of best-first search. A single path model constitutes a state, while the sole operator is the addition of an arc to one variable in the model from another (possibly new) variable. We begin with a graph consisting of just the dependent variables. During the search we maintain a list of all the models and all possible modifications to those models. At each step in the search we select the best modification in the list, apply it to its target model, evaluate the new model, and add it to the list. We continue until either we have reached an acceptable model or we can make no more significant improvements. The most complex issue to be addressed in applying this algorithm is constructing the evaluation function.

We must first distinguish two senses in which a model can be "good." A common statistical measure of goodness is R², the percentage of variance in the dependent variable, Y, accounted for by the independent variables X1, X2, .... If R² is low, then other variables besides X1, X2, ... influence Y, and our understanding of Y is therefore incomplete. However, for any set of independent variables, no model accounts for more of the variance in Y than the regression model, in which all independent variables are correlated and all point directly to the dependent variable (e.g., Figure 1). No wonder this model accounts for so much variance: every variable directly influences Y! It is not a parsimonious model. Nor is it likely to be a plausible causal model of any system we analyze. For example, a regression analysis of the Phoenix data treats WindSpeed and Fireline as causes at the same "level," that is, both pointing directly to FinishTime, but we know that WindSpeed is "causally upstream" of Fireline. In fact, we know that wind speed influences the rate at which fires burn, and so probably influences the amount of fireline that is cut. So R², the statistical measure of goodness, is not necessarily the researcher's pragmatic measure of goodness. The researcher wants a model that is a plausible causal story, and that includes as few correlational and causal relations among variables as possible, so causal influences are localized, not dissipated through a network of correlations. Such a model will necessarily have a lower R² than a highly-connected model, but will often be preferred. In fact, when we constructed the Phoenix model by hand, an important measure of goodness was the errors between estimated and empirical correlations, and we never even calculated R².
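Those estimated correlations are just sums of path weights, so the FirstPlan–FinishTime figure quoted in Section 3 is easy to recompute:

```python
# Recompute the estimated FirstPlan-FinishTime correlation from Section 3.
# Each legal path contributes the product of the coefficients on its links;
# the coefficient tuples below are read off the Figure 2 example.
paths = [
    (-0.432, 0.287),
    (0.219, 0.506),
    (-0.432, 0.843, 0.506),
]

def path_weight(coeffs):
    w = 1.0
    for c in coeffs:
        w *= c
    return w

r_hat = sum(path_weight(p) for p in paths)
print(round(r_hat, 3))  # close to the -.197 estimate reported in Table 1
```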

Another distinction is between modification evaluation and model evaluation. Assuming that good models are clustered together in model space, a few heuristics can move the search into that general region of the space. These are the modification heuristics. A model evaluation function should dominate the search once the modification evaluation heuristics have brought the search into the right neighborhood. This is achieved by including the model evaluation function as a term in the modification evaluation function.

We rely on a variety of heuristics to guide search toward good models and to evaluate a path model. These comprise three general classes. The first contains what we might call syntactic criteria:

- no cycles are allowed;
- there must be a legal path (as defined in Section 3) from every variable to a dependent variable;
- a dependent variable may have no outgoing arcs.

The second class contains domain-independent heuristics. These apply to any path analysis problem; however, the weighting of a heuristic may depend on the particular problem. Some heuristics we have tried are:

- R², the statistical measure of goodness;
- the predicted correlation error (minimize the total squared error between the actual correlation matrix and the predicted correlation matrix for the model);
- parsimony (i.e., the ratio of variables to arcs, or the number of arcs that don't introduce a new variable);
- the correlation between the variables being connected by a new link;
- the total number of variables and arcs.

The third class represents domain/problem-dependent forms of knowledge. These include knowledge that:

- particular variables are independent;
- particular variables are likely/unlikely causes of others;
- a particular range of values for a domain-independent heuristic is appropriate for the problem.

Our evaluation of a modification is a function of these heuristics. In the current implementation we use a weighted sum of the actual correlation between the variables to be linked, the predicted correlation error, and the evaluation of the resulting model.

The modification evaluation step hides a good deal of computation required for the heuristic evaluation functions, including path coefficients in the model and the correlation estimates between variables. In the basic algorithm we calculate these parameters from scratch for each newly generated model. Because these calculations dominate search cost, we are currently working on improving the algorithm by incrementally calculating changes to each model's evaluation. This should greatly increase efficiency.

5. EXPERIMENTS

We have run several experiments to demonstrate the performance of our algorithm and probe factors that affect performance. In the first experiment we tested whether the algorithm could generate a model from the Phoenix data. The second experiment was more exhaustive; for this we used artificial data.

5.1. EXPERIMENT 1

We provided the algorithm with data from 215 trials of the Phoenix planner, specifically, WindSpeed, RTK, NumPlans, Fireline and FinishTime (we dropped FirstPlan from the data set). We designated FinishTime the dependent variable. The search space of models was large, 2^(N^2 − N) = 1,048,576, but the model-scoring and modification-scoring functions were sufficiently powerful to limit the solution space to 6856 models. When the algorithm terminated, after four hours' work on a Texas Instruments Explorer II+ Lisp Machine, its best two models were the ones shown in Figure 3. These models have much to commend them, but they are also flawed in some respects. A good aspect of the models is that they get the causal order of WindSpeed, NumPlans, and Fireline right: wind speed is causally upstream of the other factors, which are measures of the behavior of the Phoenix planner. However, both models say WindSpeed causes RTK when, in fact, these are independent variables set by the experimenter. (Later, probing this curiosity, we realized that due to a sampling bias in our experiment, WindSpeed and RTK covary, so connecting them is not absurd.)

Figure 3: The best and second best Phoenix models found by the algorithm.

A disappointing aspect of the models is that neither recognizes the important influence of RTK on NumPlans and the influence of NumPlans on Fireline.

In defense of the algorithm, however, we note that it used model-scoring and modification-scoring functions that valued R², while we, when we built Figure 2 by hand, were concerned primarily about the errors between estimated and empirical correlations, and not concerned at all about R². Thus we should not be too surprised that the algorithm did not reproduce our model in Figure 2.

Although the algorithm ran slowly, and is in some ways disappointing, it did produce what we believe is the first causal model of a complex software system ever generated automatically.

5.2. EXPERIMENT 2

The first experiment raised more questions than it answered: Does the performance of the algorithm depend on the sample variance for each variable? Does performance depend on the number of data for each variable? How do we know whether the algorithm is finding the "right" model? To address these and other questions we ran an experiment in which we constructed path models that represented "truth" and tested how frequently our algorithm could discover the true models. Specifically, we followed these steps:

1. Randomly fill some cells in an adjacency matrix to produce a path model.
2. Randomly assign weights to the links in the path model, subject to the constraint that the R² of the resulting model should exceed .9.
3. Generate data consistent with the weights in step 2.
4. Submit the data to our algorithm and record the models it proposed.
5. Determine how well our algorithm discovered the true model.

Steps 3 and 5 require some explanation. Imagine someone asks you to generate two samples of numbers drawn from a normal (Gaussian) distribution with mean zero such that the correlation of the samples is a particular, specified value, say, .8. Now make the problem more difficult: generate N columns of numbers so that all their pairwise correlations have particular, specified values. Solving this problem (step 3, above) ensures that we generate data with the same correlational structure as the path model specified in step 2. The details of the process are sketched in the Appendix.

We evaluated the performance of our algorithm (step 5, above) by two criteria:

Shared path score: For each true model, make a list of all the paths from each independent variable to the dependent variable; what fraction of these paths exist in the model discovered by our algorithm?

Shared link score: For each true model, make a list of all the links between variables; what fraction of these links exist in the model discovered by our algorithm?

Our experiment was designed to test the effects of sample size on the performance of the algorithm. The Phoenix data included 215 values of each variable, but for this experiment we looked at four levels of the factor: 10, 30, 50, and 100 data per sample. We thought that the performance of the algorithm might deteriorate when sample sizes became small, because the effects of outlier values in the samples would be exacerbated. We generated five true path models with four variables, shown in Figure 4. For each model we generated 12 variants, that is, 12 sets of data for variables A, B, C and D that conformed to the correlation structure established in step 2, above. This produced 60 models. We ran our algorithm on these variants in each of the 4 conditions just described. Thus, the algorithm ran 240 times.

Figure 4: Five path models used to test the algorithm.

The algorithm found the true model in 109 of the 240 trials. Summary statistics for these trials, and for trials in which the algorithm did not find the true model, are presented in Table 2. When the algorithm did find the true model, it explored 174.6 models, on average. This is about four percent of the 4096 models that were possible. Trials that did not find the true model explored only 86.96 models, which suggests that one reason they failed to find the true model is that they gave up too soon. This in turn suggests that our criteria for terminating a search can be improved.

In all trials, the average score of the best model found by the algorithm was better than the average score of the true model. Our immediate inclination was to say the scoring criteria led the algorithm astray, away from the true model. In fact, the problem arises from using R² as our model evaluation function. The value of R² monotonically increases as the number of links in the model increases. Consequently, in most of our trials, the model from which we generated the data is not the model with the highest score, since the addition of new links to it produced a model with a higher R². We must investigate other model evaluation metrics before we can evaluate whether or not our algorithm will find the best model.

We would like to place more stock in two measures of structural similarity between the true model and the best found model. The shared path score is the proportion of paths in the best found model that also exist in the true model. For example, the top-left model in Figure 4 has 5 paths from A, B and C to D, so if the best found model were, say, the top-right model, which has only three paths from A, B and C to D, then the shared path score would be 0.6.
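The shared path score can be computed mechanically by enumerating directed paths into the dependent variable. The sketch below uses two toy models of our own invention (not the Figure 4 models):

```python
# Sketch of the shared path score: enumerate every directed path from each
# independent variable to the dependent variable in the true model, then
# measure what fraction also appear in the found model.
def paths_to(arcs, dep):
    """All directed paths (as node tuples) ending at dep."""
    children = {}
    for a, b in arcs:
        children.setdefault(a, []).append(b)
    nodes = {n for arc in arcs for n in arc}
    found = []
    def walk(node, trail):
        if node == dep:
            found.append(tuple(trail))
            return
        for nxt in children.get(node, []):
            if nxt not in trail:          # a legal path visits no node twice
                walk(nxt, trail + [nxt])
    for start in nodes - {dep}:
        walk(start, [start])
    return set(found)

def shared_path_score(true_arcs, found_arcs, dep):
    true_paths = paths_to(true_arcs, dep)
    return len(true_paths & paths_to(found_arcs, dep)) / len(true_paths)

true_m = {("A", "B"), ("B", "D"), ("A", "D"), ("C", "D")}   # 4 paths into D
found_m = {("A", "B"), ("B", "D"), ("C", "D")}              # shares 3 of them
print(shared_path_score(true_m, found_m, "D"))
```

The shared link score is the same computation applied to individual arcs rather than whole paths.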

Another criterion is the shared link score, which is the fraction of the links connecting variables in the true model that are also in the best found model.

Unfortunately, neither score strongly differentiates trials in which the true model was found from those in which it wasn't. On average, 56–61% of the paths, and 50–54% of the links, in the true model are also in the best found model. These numbers are not high. On the other hand, if the disparity between the true model and the best found model were just two links, then the average shared link score would be 52%. So although the shared path and shared link scores are low, they suggest that the best found model differed from the true one by little less than two links on average. Clearly, the algorithm can be improved, but it performs better than the first impression given by the shared path score and shared link score.

                          Trials that found     Trials that did not
                          the true model        find the true model
                          Mean      Std.        Mean      Std.
  Number of models        174.600   93.300      86.962    72.770
  Best model score        .974      .027        .954      .018
  True model score        .847      .258        .708      .417
  Shared path score       .612      .202        .565      .133
  Shared link score       .542      .193        .501      .144

Table 2: Summary statistics from 240 trials.

We ran a one-way analysis of variance to find the effects of the number of data points in our samples. The dependent variables were the five measures in Table 2. Happily, sample size appears to have no impact on the number of models expanded by the algorithm, the score of the best found model, or the shared path or shared link scores.

6. RELATED ALGORITHMS

6.1 INDUCTIVE CAUSATION ALGORITHM

The underlying probability structure of a set V of random variables in a system can be described by a joint probability distribution P. Pearl and Verma [1991] have developed a method for inducing causal structure from P̂, a sampled distribution of V, which they refer to as the Inductive Causation algorithm (IC algorithm). Causal models are represented by directed acyclic graphs (DAGs) whose nodes represent random variables and whose edges represent causal relationships between those variables. DAG construction is based on the notion of conditional independence, under the assumptions that P̂ is stable and that model minimality holds. P is considered to be stable if it has a unique causal model. Minimality means it is reasonable to rule out any model for which there exists a simpler model that is equally consistent with the data. Two variables x and y are considered independent if a change in either of them does not cause a change in the other. Conditional independence can be stated in mathematical terms as P(x|y,z) = P(x|z). Intuitively, this means that given some knowledge about the value of x given z, discovering knowledge about y does not affect what we know about x. In this case, x and y are conditionally independent given z. This condition is also known as the Markov condition applied to graphs: for all graph vertices x, x is independent of its remote ancestors given its parents.

To find D, a causal model, the algorithm begins with P̂ by looking for a set S_xy ⊆ V for all pairs of variables x and y such that x and y are conditionally independent on z, for all z ∈ S_xy. If no such set can be found and x and y are dependent in every context, then an undirected link is placed between x and y to represent that there is a potential causal influence between x and y.

In the second step, for all pairs of unconnected variables having undirected links to a common variable z: if z ∈ S_xy then do nothing; otherwise, add arrowheads pointing at z. The reasoning behind this step is that if x and y are not independent conditional on z, but are correlated with z, then this correlation must be due to their common effect on z or to a common cause.

Finish building D by recursively adding arrowheads, applying the following rules:

1. If x and y are joined by an undirected link and there exists some directed path from x to y, then add an arrowhead at y. Since x is known to affect y through x's effect on some other variable, the direction of potential causal influence between x and y must be toward y (this prevents cycles).

2. If x and y are not connected but there exists a variable z such that there is a direct effect of x on z and there is an undirected link from z to y (i.e., x → z — y), then add an arrowhead at y. Nodes x and y are not connected because they were found to be conditionally independent on some S_xy. If z ∉ S_xy, then arrowheads would have been placed towards z from both x and y in step two, for the reason stated above. Therefore, x and y must be conditionally independent on z, and since we know x has a causal influence on z, the direction of potential causal influence must be from z to y in order for the conditional independency to hold.

Finally, if there is a unidirected or bidirected link from x to y, then mark all unidirected links from y to z in which z is not connected to x.
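The collider-orientation rule in the "second step" above can be sketched directly from its description. The data structures and names below are our own, not Pearl and Verma's:

```python
# Sketch of the IC algorithm's second step: for each unconnected pair (x, y)
# with a common neighbour z, point arrowheads at z (x -> z <- y) unless z
# was in the separating set S_xy that made x and y conditionally independent.
def orient_colliders(skeleton, sepset):
    """skeleton: set of frozenset({a, b}) undirected edges.
    sepset: dict mapping a pair (x, y), with x < y, to its separating set.
    Returns the directed arcs (tail, head) introduced by the collider rule."""
    nodes = sorted({n for edge in skeleton for n in edge})
    arcs = set()
    for i, x in enumerate(nodes):
        for y in nodes[i + 1:]:
            if frozenset((x, y)) in skeleton:
                continue                      # rule applies to unconnected pairs
            for z in nodes:
                if z in (x, y):
                    continue
                if (frozenset((x, z)) in skeleton
                        and frozenset((y, z)) in skeleton
                        and z not in sepset.get((x, y), set())):
                    arcs.add((x, z))          # x -> z <- y
                    arcs.add((y, z))
    return arcs

# Classic collider: X and Y are marginally independent (empty separating
# set) but both adjacent to Z, so the rule orients X -> Z <- Y.
skeleton = {frozenset(("X", "Z")), frozenset(("Y", "Z"))}
arcs = orient_colliders(skeleton, {("X", "Y"): set()})
print(sorted(arcs))
```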

Page 158 Knowledge Discovery in Databases Workshop 1993 AAAI-93

The causal model resulting from analysis of a data set by the IC algorithm will contain four types of links representing different relationships between the variables they connect: unmarked unidirectional (potential causal influence), marked unidirectional (direct causal influence), bidirectional (evidence of a common cause), and undirected (undetermined relationship). The model is used as a template for developing a causal theory which is consistent with the data. That theory is built upon the following theorems:

If every link of a directed path from x to z is marked, then x has a causal influence on z.

A marked unidirectional link from x to y suggests a direct causal influence of x on y.

The existence of a bidirectional link between x and y suggests a common cause z affecting both x and y, and no direct causal influence between x and y.

Pearl and Verma have shown that particular patterns of dependencies dictate a causal relationship between variables and that the IC algorithm can be used to recover the structure of those causal relationships. However, there are situations for which the method returns poor or inconclusive results. With respect to the latter, if there are no independencies between the variables, all of the variables will be interconnected with undirected links, since it may be impossible to determine the nature of the potential causal influence. It may also be impossible to determine the nature of a potential causal influence between two variables when there are no control variables acting on either of the variables in question. If the direction of potential causal influence between variables x and y is uncertain but there exists a variable z that is known to affect x and not y, then we can eliminate the possibility of x causing y; z is known as a control variable.

6.2 TETRAD II

Spirtes et al. [1993] take a different approach to discovering causal structure, based on constraints on correlations. Their approach is implemented in the Tetrad II program. During search, constraints implied by a model are compared to constraints found in the data, and the program returns a model that best reflects the constraints in the data. To describe this method it will be necessary to give a few definitions:

An open path is a directed link from variable x (the source) to variable y (the sink).

A trek is a pair of open paths between two variables x and y, having the same source and intersecting only at that source. One of the paths can be the empty path, which consists of one vertex (the source of the trek).

Partial equations represent correlational constraints on three variables. Let x, y, and z be three measured variables in a causal model; then rxz = rxy · ryz is their partial equation. This is equivalent to saying that the partial correlation between x and z (with y held constant) is equal to zero (known as a vanishing partial). For a linear model, vanishing partials are equivalent to conditional independencies.

Tetrad equations represent correlational constraints on four variables. If x, y, z, and v are four measured variables in a causal model, one of the following is their tetrad equation:

rxy · rzv = rxz · ryv, or, rxy · rzv − rxz · ryv = 0
rxy · rzv = rxv · ryz, or, rxy · rzv − rxv · ryz = 0
rxz · ryv = rxv · ryz, or, rxz · ryv − rxv · ryz = 0

where rij is the sample correlation between variables i and j. These constraints are known as vanishing tetrad differences. A DAG, G, linearly implies rxy · rzv − rxz · ryv = 0 iff either rxy or rzv = 0 and rxz or ryv = 0, or there is a set Q of random variables in G such that rxy.Q = rzv.Q = rxz.Q = ryv.Q = 0.

The associated probability, P(t), of a vanishing tetrad difference is the probability of getting a tetrad difference equal to or larger than that found for the sample, under the assumption that the difference vanishes in the population. P(t) is determined by lookup in a chart for the standard normal distribution.

T-maxscore is the sum of all P(t) implied by the model and judged to hold in the population.

Tetrad score is the difference between the sum of all P(t) implied by the model and judged to hold in the population, and the weighted sum of (1 − P(t)) for P(t) implied by the model that are not judged to hold in the population.

The input to Tetrad II is a sample size, a correlation or covariance matrix, and a DAG representing an initial causal structure, having no edges between measured variables and having all latent variables connected. The search generates all possible models derived by the addition of one edge to the initial model, and each model is ranked by its tetrad score, which rewards a model for implying constraints that are judged to hold in the population and penalizes it for implying constraints that are not judged to hold in the population. For example, consider the DAGs of Figure 5. 5a implies the following vanishing tetrad differences:

rxy · rzw − rxz · ryw = 0
rxy · rzw − rxw · ryz = 0
rxz · ryw − rxw · ryz = 0

If the population distribution is represented by 5b, then the only vanishing tetrad difference implied by 5a that exists in the population is the second one.
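These correlational constraints are easy to check numerically. The sketch below is our own illustration, not code from Tetrad II; the coefficients and variable names are arbitrary. It simulates a causal chain to show that the partial equation rxz = rxy · ryz holds, and a single latent common cause of four measured variables, which (like a single-source model such as Figure 5a) linearly implies all three vanishing tetrad differences.

```python
import numpy as np

def tetrad_differences(R, i, j, k, l):
    """The three tetrad differences among variables i, j, k, l of the
    correlation matrix R: r_ij*r_kl - r_ik*r_jl, r_ij*r_kl - r_il*r_jk,
    and r_ik*r_jl - r_il*r_jk."""
    return (R[i, j] * R[k, l] - R[i, k] * R[j, l],
            R[i, j] * R[k, l] - R[i, l] * R[j, k],
            R[i, k] * R[j, l] - R[i, l] * R[j, k])

rng = np.random.default_rng(0)
n = 200_000

# Partial equation: in the chain x -> y -> z, r_xz = r_xy * r_yz,
# i.e., the partial correlation of x and z with y held constant vanishes.
x = rng.normal(size=n)
y = 0.8 * x + rng.normal(size=n)
z = 0.7 * y + rng.normal(size=n)
r_xy = np.corrcoef(x, y)[0, 1]
r_yz = np.corrcoef(y, z)[0, 1]
r_xz = np.corrcoef(x, z)[0, 1]
print(abs(r_xz - r_xy * r_yz))   # close to zero (sampling error only)

# Tetrad differences: one latent common cause of X, Y, Z, W linearly
# implies that all three tetrad differences vanish.
latent = rng.normal(size=n)
X = 0.9 * latent + rng.normal(size=n)
Y = 0.8 * latent + rng.normal(size=n)
Z = 0.7 * latent + rng.normal(size=n)
W = 0.6 * latent + rng.normal(size=n)
R = np.corrcoef([X, Y, Z, W])
for d in tetrad_differences(R, 0, 1, 2, 3):
    print(abs(d))                # all close to zero
```

In Tetrad II the sampling variability of these differences is handled through the associated probability P(t); here we simply rely on the large sample to make the implied differences visibly small.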
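The score-guided search over one-edge extensions can be sketched as a best-first loop. Everything in this sketch is our own simplification: models are frozen sets of edges, and `tetrad_score` and `t_maxscore` are caller-supplied stand-ins for the P(t)-based scores defined above. The pruning rule, discussed later in the paper, relies on the fact that adding edges never increases the set of implied tetrad equations, so a model whose t-maxscore is below the best tetrad score found can be discarded along with all its descendants.

```python
import heapq
from itertools import count

def tetrad_search(initial_edges, candidate_edges, tetrad_score, t_maxscore):
    """Best-first search over models (sets of edges) in the style of
    Tetrad II: expand models in order of t-maxscore, and discard any
    model whose t-maxscore falls below the best tetrad score found,
    since none of its descendants can score higher."""
    best = frozenset(initial_edges)
    best_score = tetrad_score(best)
    tie = count()                       # tiebreaker so the heap never compares sets
    frontier = [(-t_maxscore(best), next(tie), best)]
    while frontier:
        neg_tmax, _, model = heapq.heappop(frontier)
        if -neg_tmax < best_score:      # prune this whole subtree of models
            continue
        for edge in candidate_edges:
            if edge in model:
                continue
            child = model | {edge}      # one-edge extension
            score = tetrad_score(child)
            if score > best_score:
                best, best_score = child, score
            if t_maxscore(child) >= best_score:
                heapq.heappush(frontier, (-t_maxscore(child), next(tie), child))
    return best, best_score
```

A depth bound, which Tetrad II imposes when search is too slow to be practical, could be added by tracking how many edges a model has gained and refusing to push children past the limit.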

[Figure 5: DAGs implying different sets of vanishing tetrad differences. (Diagrams over the measured variables X, Y, Z, and W; not reproduced here.)]

This process is repeated recursively on each model generated until no improvements can be made to any model. Search is guided by the t-maxscore, since models are expanded according to their t-maxscore. If M′ is the model generated by adding an edge to model M, and M′'s t-maxscore is less than the tetrad score of M, it is eliminated from the search. A proof of the following theorem is provided by Spirtes (1993):

If M′ is a subgraph of M, then the set of tetrad equations among variables of M′ that are linearly implied by M is a subset of those linearly implied by M′.

This theorem intuitively says that the addition of links to a graph will never cause more tetrad equations to be implied by the resulting graph. It then follows that if t-maxscore(M′) < tetrad-score(M), neither M′ nor any modification of M′ will have a tetrad score as high as M. The algorithm essentially uses a form of best-first search, and in cases where the search is too slow to be practical, the depth of search is limited to reduce the amount of time spent on search. The program returns a list of suggested additions of edges to the initial model which will yield the model found by the search to have the highest tetrad score.

7. CONCLUSION

We have described an algorithm for path analysis and first results from experiments with the algorithm. In a crude sense the algorithm "works"; that is, it generates causal models from data. However, preliminary results suggest that other methods of modification and model evaluation, as well as algorithm evaluation, need to be explored. We believe our use of the difference between estimated and empirical correlations, and of R2, as terms in our modification evaluation function serves to bring the search into the neighborhood of good model space, but it is obvious that R2 applied as a model evaluation function is less than ideal, because it drives the algorithm to generate regression models (i.e., "flat" models in which all the predictor variables point directly at the dependent variable and are intercorrelated). Followup experiments to those reported earlier confirm this: as R2 is given more weight in the modification evaluation and model evaluation, we see more regression models. Clearly our measures of structural similarity, the shared link score and shared path score, also can be improved. These problems must be addressed before we can establish that the algorithm works well.

At this early stage, we have not made any empirical comparisons between our algorithm and those of Pearl and Verma, and Spirtes and his colleagues. We believe that, unlike Pearl's algorithm, a lack of independencies will not negatively affect the results obtained by our algorithm; and unlike Tetrad II, ours does not require an initial model. However, the complexity of our algorithm is significantly greater than that of the others. Moreover, it performs more computation for each model because it estimates edge weights and predicted correlations. We are not convinced that this is a negative aspect of our algorithm, since we believe estimated correlations, and specifically the deviations between estimated and actual correlations, are important criteria in controlling search in our algorithm.

Our algorithm would probably be helped a lot if the modification and model evaluation functions were "monotonic" in a sense similar to Tetrad II, which eliminated models under particular conditions that guaranteed that no descendants of the eliminated models could be preferred to previous models. We need modification and model evaluation functions that will allow us to prune subtrees of the space of models. For example, it is easy to compute whether a modification will increase or decrease the difference between estimated and actual correlations; we might prune modifications that don't reduce disparities.

Causal modeling plays an important role in human understanding. As researchers, we feel it is not enough to be able to predict the behavior of computer systems; we must strive to discover the rules that dictate their behavior. We believe behavior results from the interaction of a system's architecture, environment, and task; all these components must be present in a causal model. These models are essential to establishing general theories concerning the types of behaviors that might occur in certain types of tasks, architectures, and environments. We have yet to answer many questions about factors that affect the performance of our algorithm. Still, we are encouraged by our results, and by the promise the algorithm holds for automatically finding causal models of systems. However ponderous it is, however difficult it is to evaluate, the fact remains that the algorithm generated causal models of a complex AI planning system, and it promises to be a valuable tool for those who seek to understand AI systems with statistical methods.

Acknowledgments

We would like to thank Glenn Shafer for his help with the development of these ideas.

This research was supported by DARPA-AFOSR contract F30602-91-C-0076. The United States Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright notation hereon.

References

Asher, H.B., 1983. Causal Modeling. Sage Publications, Newbury Park, CA.

Cohen, P.R. Empirical Methods for Artificial Intelligence. Forthcoming.

Cohen, P.R. & Hart, D.M., 1993. Path analysis models of an autonomous agent in a complex environment. To appear in Proceedings of the Fourth International Workshop on AI and Statistics. Ft. Lauderdale, FL. 185-189.

Cohen, P.R., Greenberg, M.L., Hart, D.M. & Howe, A.E., 1989. Trial by fire: Understanding the design requirements for agents in complex environments. AI Magazine, 10(3): 32-48.

Falkenhainer, B.C. & Michalski, R.S., 1986. Integrating quantitative and qualitative discovery: the ABACUS system. Machine Learning, 1(1): 367-401.

Glymour, C., Scheines, R., Spirtes, P. & Kelly, K., 1987. Discovering Causal Structure. Academic Press.

Hart, D.M. & Cohen, P.R., 1992. Predicting and explaining success and task duration in the Phoenix planner. Proceedings of the First International Conference on AI Planning Systems. Morgan Kaufmann. 106-115.

Langley, P., Simon, H.A., Bradshaw, G.L. & Zytkow, J.M., 1987. Scientific Discovery: Computational Explorations of the Creative Processes. The MIT Press.

Li, C.C., 1975. Path Analysis: A Primer. Boxwood Press.

Pearl, J. & Verma, T.S., 1991. A theory of inferred causation. Principles of Knowledge Representation and Reasoning: Proceedings of the Second International Conference, J. Allen, R. Fikes, & E. Sandewall (Eds.). Morgan Kaufmann. 441-452.

Pearl, J. & Wermuth, N., 1993. When can association graphs admit a causal interpretation? Preliminary Papers of the Fourth International Workshop on AI and Statistics. Ft. Lauderdale, FL. 141-150.

Schaffer, C., 1990. A proven domain-independent scientific function-finding algorithm. In Proceedings of the Eighth National Conference on Artificial Intelligence. The MIT Press. 828-833.

Spirtes, P., Glymour, C. & Scheines, R., 1993. Causation, Prediction and Search. Springer-Verlag.

Zytkow, J.M., Zhu, J. & Hussam, A., 1990. Automated discovery in a chemistry laboratory. In Proceedings of the Eighth National Conference on Artificial Intelligence. The MIT Press. 889-894.
