Sensitivity Analysis of Linear Structural Causal Models

Carlos Cinelli¹, Daniel Kumor², Bryant Chen³, Judea Pearl¹, Elias Bareinboim²

¹ Depts. of Statistics and Computer Science, University of California, Los Angeles, California, USA. ² Dept. of Computer Science, Purdue University, West Lafayette, IN, USA. ³ Brex, San Francisco, CA, USA. Most of this work was conducted while at IBM Research AI. Correspondence to: Carlos Cinelli <[email protected]>.

Proceedings of the 36th International Conference on Machine Learning, Long Beach, California, PMLR 97, 2019. Copyright 2019 by the author(s).

Abstract

Causal inference requires assumptions about the data generating process, many of which are unverifiable from the data. Given that some causal assumptions might be uncertain or disputed, formal methods are needed to quantify how sensitive research conclusions are to violations of those assumptions. Although an extensive literature exists on the topic, most results are limited to specific model structures, while a general-purpose algorithmic framework for sensitivity analysis is still lacking. In this paper, we develop a formal, systematic approach to sensitivity analysis for arbitrary linear Structural Causal Models (SCMs). We start by formalizing sensitivity analysis as a constrained identification problem. We then develop an efficient, graph-based identification algorithm that exploits non-zero constraints on both directed and bidirected edges. This allows researchers to systematically derive sensitivity curves for a target causal quantity with an arbitrary set of path coefficients and error covariances as sensitivity parameters. These results can be used to display the degree to which violations of causal assumptions affect the target quantity of interest, and to judge, on scientific grounds, whether problematic degrees of violations are plausible.

1. Introduction

Randomized controlled trials (RCT) are considered the gold standard for identifying cause-effect relationships in data-intensive sciences (Giffin et al., 2010). In practice, however, direct randomization is often infeasible or unethical, requiring researchers to combine non-experimental observations with assumptions about the data generating process in order to obtain causal claims. These assumptions are usually encoded as the absence of certain causal relationships, or as the absence of association between certain unobserved factors. Conclusions based on causal models are, therefore, provisional: they depend on the validity of causal assumptions, regardless of the sample size (Pearl, 2000; Spirtes et al., 2000; Bareinboim & Pearl, 2016).

In many real settings, it is not uncommon that these assumptions are subject to uncertainty or dispute. Scientists may posit alternative causal models that are equally compatible with the observed data; or, more mundanely, researchers can make identification assumptions for convenience, simply to proceed with estimation.¹ Regardless of the motivation, the provisional character of causal inference behooves us to formally assess the extent to which causal conclusions are sensitive to violations of those assumptions.

¹ As noted by Joffe et al. (2010), “such assumptions are usually made casually, largely because they justify the use of available statistical methods and not because they are truly believed”.

The importance of such exercises is best illustrated with a real example, which directly impacted public policy. During the late 1950s and early 1960s, there was a fierce debate regarding the causal effect of cigarette smoking on lung cancer. One of its most notable skeptics was the influential statistician Ronald Fisher, who claimed that, without an experiment, one cannot rule out unobserved common causes (e.g. the individual’s genotype) as being responsible for the observed association (Fisher, 1957; 1958). Technically speaking, Fisher’s statement was accurate; data alone could not refute his hypothesis. Yet, although no RCT measuring the effect of cigarette smoking on lung cancer was performed, currently there exists a broad consensus around the issue. How could such a consensus emerge?

An important step towards the current state of affairs was a sensitivity analysis performed by Cornfield et al. (1959). Their investigation consisted of the following hypothetical question: if Fisher’s hypothesis were true, how strong would the alleged confounder need to be to explain all the observed association between cigarette smoking and lung cancer? The analysis concluded that, since smokers had nine times the risk of nonsmokers for developing lung cancer, the latent confounder would need to be at least nine times more common in smokers than in nonsmokers, something deemed implausible by experts at the time.

Cornfield’s exercise reveals the fundamental steps of a sensitivity analysis. The analyst introduces a violation of a causal assumption of the current model, such as positing the presence of unobserved confounders that induce a non-zero association between two error terms. Crucially, however, we are willing to tolerate this violation up to a certain plausibility limit dictated by expert judgment (e.g., prior biological understanding, pilot studies). The task is, thus, to systematically quantify how different hypothetical “degrees” of violation (to be defined) affect the conclusions, and to judge whether expert knowledge can rule out problematic values.

The problem of sensitivity analysis has been studied throughout the sciences, ranging from statistics (Rosenbaum & Rubin, 1983; Small, 2007; Rosenbaum, 2010; Cinelli & Hazlett, 2018; Franks et al., 2019) to epidemiology (Brumback et al., 2004; Vanderweele & Arah, 2011; Ding & VanderWeele, 2016; Arah, 2017), sociology (Frank, 2000), psychology (Mauro, 1990), political science (Imai et al., 2010; Blackwell, 2013), and economics (Leamer, 1983; Imbens, 2003; Oster, 2017; Masten & Poirier, 2018). Notwithstanding all this attention, the current literature is still limited to specific models and solved on a case-by-case basis. Considering the ubiquity of causal questions in the sciences and artificial intelligence, a formal, algorithmic framework to deal with violations of causal assumptions is needed.

Causal modeling requires a formal language where the characterization of the data generating process can be encoded explicitly. Structural Causal Models (Pearl, 2000) provide such a language and, in many fields, including machine learning, the health and social sciences, linearity is a popular modeling choice. In this paper, we focus on the sensitivity analysis of linear acyclic semi-Markovian SCMs. We allow violations of exclusion and independence restrictions, such as (i) the absence or presence of unobserved common causes; and, (ii) the absence, presence or reversal of direct causal effects. Our contributions are the following:

1. We introduce a formal, algorithmic approach for sensitivity analysis in linear SCMs and show it can be reduced to a problem of identification with non-zero constraints, i.e., identification when certain parameter values are fixed to a known, but non-zero, number.

2. We develop a novel graphical procedure, [...]

[...] experimental results corroborate its generality, showing canonical sensitivity analysis examples are a small subset of the cases solved by our proposal.

This paper is structured as follows. Section 2 reviews basic terminology and definitions that will be used throughout the text. Section 3 shows how sensitivity analysis in the context of linear SCMs can be reduced to a constrained identification problem. In Section 4 we develop a novel approach that allows researchers to systematically incorporate constraints on error covariances of linear SCMs. Section 5 utilizes these results to construct a constrained identification algorithm for deriving sensitivity curves. Finally, Section 6 presents experimental results to evaluate our proposals.

2. Preliminaries

In this paper, we use the language of structural causal models as our basic semantic framework (Pearl, 2000). In particular, we consider linear semi-Markovian SCMs, consisting of a set of equations of the form $V = \Lambda V + U$, where $V$ represents the endogenous variables, $U$ the exogenous variables, and $\Lambda$ a matrix containing the structural coefficients representing both the strength of causal relationships and lack of direct causation among variables (when $\lambda_{ij} = 0$). The exogenous variables are usually assumed to be multivariate Gaussian with covariance matrix $E$, encoding independence between error terms (when $\epsilon_{ij} = 0$). We focus on acyclic models, where $\Lambda$ can be arranged to be lower triangular.

The covariance matrix $\Sigma$ of the endogenous variables induced by model $M$ is given by $\Sigma = (I - \Lambda)^{-1} E (I - \Lambda)^{-\top}$. Without loss of generality, we assume model variables have been standardized to unit variance. For any three variables $x$, $y$ and $z$, we denote $\sigma_{yx}$ to be the covariance of $x$ and $y$, $\sigma_{yx.z}$ to be the partial covariance of $y$ and $x$ given $z$, and $R_{yx.z}$ the regression coefficient of $y$ on $x$ adjusting for $z$. Causal quantities of interest in a linear SCM are usually entries of $\Lambda$ (or functions of those entries), and identifiability reduces to checking whether they can be uniquely computed from the observed covariance matrix $\Sigma$.

Causal graphs provide a parsimonious encoding of some of the substantive assumptions of a linear SCM. The causal graph (or the path diagram) of model $M$ is a graph $G = (V, D, B)$, where $V$ denotes the vertices (endogenous
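As a minimal numerical sketch of the Preliminaries notation (not the paper's algorithm), the snippet below builds the covariance matrix implied by a small linear SCM and computes an adjusted regression coefficient. The three-variable model, the coefficient values, and the helper names `implied_covariance`, `regression_coefficient` and `build_model` are all hypothetical choices made for illustration. For simplicity the error variances are set to one, so the variables are not standardized to unit variance as the text assumes.

```python
import numpy as np


def implied_covariance(Lam, E):
    """Covariance of V for the model V = Lam V + U with Cov(U) = E.

    Implements Sigma = (I - Lam)^{-1} E (I - Lam)^{-T}.
    """
    inv = np.linalg.inv(np.eye(Lam.shape[0]) - Lam)
    return inv @ E @ inv.T


def regression_coefficient(Sigma, y, x, z):
    """Coefficient of x in the regression of y on x and one covariate z."""
    # Partial covariance of y and x given z.
    s_yx_z = Sigma[y, x] - Sigma[y, z] * Sigma[z, x] / Sigma[z, z]
    # Partial variance of x given z.
    s_xx_z = Sigma[x, x] - Sigma[x, z] ** 2 / Sigma[z, z]
    return s_yx_z / s_xx_z


# Hypothetical model (illustrative values only): z -> x, z -> y, x -> y,
# plus an optional error covariance between x and y.
Z, X, Y = 0, 1, 2
a, b, c = 0.5, 0.3, 0.4  # z->x, x->y, z->y


def build_model(eps_xy):
    Lam = np.zeros((3, 3))  # lower triangular in the order (z, x, y)
    Lam[X, Z] = a
    Lam[Y, X] = b
    Lam[Y, Z] = c
    E = np.eye(3)  # unit error variances (an assumption of this sketch)
    E[X, Y] = E[Y, X] = eps_xy  # non-zero value encodes unobserved confounding
    return Lam, E


r_clean = regression_coefficient(implied_covariance(*build_model(0.0)), Y, X, Z)
r_confounded = regression_coefficient(implied_covariance(*build_model(0.2)), Y, X, Z)
print("no x-y error covariance:", r_clean)
print("x-y error covariance 0.2:", r_confounded)
```

In this toy model the adjusted regression coefficient recovers the structural coefficient `b` only when the x-y error covariance is zero. Otherwise it is shifted by that covariance divided by the variance of x's error term, which is the kind of dependence a sensitivity curve would display.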
