UvA-DARE (Digital Academic Repository)
Causal Calculus in the Presence of Cycles, Latent Confounders and Selection Bias

Patrick Forré, Joris M. Mooij
Informatics Institute, University of Amsterdam, The Netherlands

Publication date: 2019
Document version: Author accepted manuscript
Published in: Proceedings of the Thirty-Fifth Conference on Uncertainty in Artificial Intelligence
Citation for published version (APA): Forré, P., & Mooij, J. M. (2019). Causal Calculus in the Presence of Cycles, Latent Confounders and Selection Bias. In A. Globerson, & R. Silva (Eds.), Proceedings of the Thirty-Fifth Conference on Uncertainty in Artificial Intelligence: UAI 2019, Tel Aviv, Israel, July 22-25, 2019 [15] AUAI Press. http://auai.org/uai2019/proceedings/papers/15.pdf

Abstract

We prove the main rules of causal calculus (also called do-calculus) for i/o structural causal models (ioSCMs), a generalization of a recently proposed general class of non-/linear structural causal models that allow for cycles, latent confounders and arbitrary probability distributions. We also generalize adjustment criteria and formulas from the acyclic setting to the general one (i.e. ioSCMs). Such criteria then allow one to estimate (conditional) causal effects from observational data that was (partially) gathered under selection bias and cycles. This generalizes the backdoor criterion, the selection-backdoor criterion and extensions of these to arbitrary ioSCMs. Together, our results thus enable causal reasoning in the presence of cycles, latent confounders and selection bias. Finally, we extend the ID algorithm for the identification of causal effects to ioSCMs.

1 INTRODUCTION

Statistical models are governed by the rules of probability (e.g. sum and product rule), which link joint distributions with the corresponding (conditional) marginal ones. Causal models follow additional rules, which relate the observational distributions with the interventional ones. In contrast to the rules of probability theory, which directly follow from their axioms, the rules of causal calculus need to be proven when based on the definition of structural causal models (SCMs). As SCMs will, among other things, depend on the underlying graphical structure (e.g. with or without cycles or bidirected edges, etc.), the used function classes (e.g. linear or non-linear, etc.) and the allowed probability distributions (e.g. discrete, continuous, singular or mixtures, etc.), the respective endeavour is not immediate.

Such a framework of causal calculus contains rules about when one can 1.) insert/delete observations, 2.) exchange action/observation, 3.) insert/delete actions; and about when and how to recover from interventions and/or selection bias (backdoor and selection-backdoor criterion), etc. (see [1, 4, 5, 14, 21–24, 26, 27, 32–35]). While these rules have been extensively studied for acyclic causal models, e.g. (semi-)Markovian models, which are attached to directed acyclic graphs (DAGs) or acyclic directed mixed graphs (ADMGs) (see [1, 4, 5, 14, 21–24, 26, 27, 32–35]), the case of causal models with cycles has stayed in the dark.

To deal with cycles and latent confounders at the same time, in this paper we will introduce the class of input/output structural causal models (ioSCMs), a "conditional" version of the recently proposed class of modular structural causal models (mSCMs) (see [10, 11]) that also includes "input" nodes that can play the role of parameter/context/action/intervention nodes. ioSCMs have several desirable properties: They allow for arbitrary probability distributions, non-/linear functional relations, latent confounders and cycles. They can also model non-/probabilistic external and probabilistic internal nodes in one framework. The cycles are modelled in a least restrictive way such that the class of ioSCMs still becomes closed under arbitrary marginalizations and interventions. All causal models that are based on acyclic graphs like DAGs, ADMGs or mDAGs (see [9, 28]) can be interpreted as special acyclic ioSCMs. Besides feedback over time, ioSCMs can also express instantaneous and equilibrated feedback under the model assumptions made (e.g. the ODEs in [2, 18]). All models where the non-trivial cycles are "contractive" (negative feedback loops, see [11]) are ioSCMs without further assumptions. Thus ioSCMs generalize all these classes of causal models in one framework, which goes beyond the acyclic setting and also allows for conditional versions of those (e.g. CADMGs), expressed via external non-/probabilistic "input" nodes. Also the generalized directed global Markov property for mSCMs (see [10, 11]) generalizes to ioSCMs, i.e. ioSCMs entail the conditional independence relations that follow from the $\sigma$-separation criterion in the underlying graph, where $\sigma$-separation generalizes the usual d-separation (also called m- or $m^*$-separation, see [9, 20, 24, 28, 38]) from acyclic graphs to directed mixed graphs (DMGs) (and even HEDGes [10] and $\sigma$-CGs [11]) with or without cycles in a non-naive way.

This paper now aims at proving the mentioned main rules of causal calculus for ioSCMs and at deriving adjustment criteria with corresponding adjustment formulas like generalized (selection-)backdoor adjustments. We also provide an extension of the ID algorithm for the identification of causal effects to the ioSCM setting, which reduces to the usual one in the acyclic case.

The paper is structured as follows: We will first give the precise definition of ioSCMs, closely mirroring mSCMs from [10, 11]. We will then review $\sigma$-separation and generalize its criterion from mSCMs (see [10, 11]) to ioSCMs. As a preparation for the causal calculus, which relates observational and interventional distributions, we will then show how one can extend a given ioSCM to one that also incorporates additional interventional variables indicating the regime of interventions on the observed nodes. We will then show how the rules of causal calculus directly follow from applying the $\sigma$-separation criterion to such an extended ioSCM. We then derive the mentioned general adjustment criteria with corresponding adjustment formulas. Finally, we introduce the right definitions for ioSCMs to extend the ID algorithm for the identification of causal effects to the general setting.

2 INPUT/OUTPUT STRUCTURAL CAUSAL MODELS

In this section we will define input/output structural causal models (ioSCMs), which can be seen as a "conditional" version of modular structural causal models (mSCMs) defined in [10, 11]. We will then construct marginalized ioSCMs and intervened ioSCMs. To allow for cycles we first need to introduce the notion of loop of a graph and its strongly connected components.

Definition 2.1 (Loops). Let $G = (V, E)$ be a directed graph.
1. […] loops (independent of $v \to v \in E$ or not).
2. The set of loops of $G$ is written as $\mathcal{L}(G)$.
3. The strongly connected component of $v$ in $G$ is defined to be: $\mathrm{Sc}^G(v) := \mathrm{Anc}^G(v) \cap \mathrm{Desc}^G(v)$.
4. The set of strongly connected components is $\mathcal{S}(G)$.

Remark 2.2. Let $G = (V, E)$ be a directed graph.
1. We always have $v \in \mathrm{Sc}^G(v)$ and $\mathrm{Sc}^G(v) \in \mathcal{L}(G)$.
2. If $G$ is acyclic then: $\mathcal{L}(G) = \{\{v\} \mid v \in V\}$.

In the following all spaces are meant to be equipped with $\sigma$-algebras and all maps to be measurable. Whenever (regular) conditional distributions occur we implicitly assume standard measurable spaces (to ensure existence).

Definition 2.3 (Input/Output Structural Causal Model). An input/output (i/o) structural causal model (ioSCM) by definition consists of:
1. a set of nodes $V^+ = V \,\dot\cup\, U \,\dot\cup\, J$, where elements of $V$ correspond to output/observed variables, elements of $U$ to probabilistic latent variables and elements of $J$ to input/intervention variables,
2. an observation/latent/action space $\mathcal{X}_v$ for every $v \in V^+$, $\mathcal{X} := \prod_{v \in V^+} \mathcal{X}_v$,
3. a product probability measure $P_U = \bigotimes_{u \in U} P_u$ on the latent space $\mathcal{X}_U := \prod_{u \in U} \mathcal{X}_u$,
4. a directed graph structure $G^+ = (V^+, E^+)$ with the properties:
(a) $V = \mathrm{Ch}^{G^+}(U \cup J)$,
(b) $\mathrm{Pa}^{G^+}(U \cup J) = \emptyset$,
where $\mathrm{Ch}^{G^+}$ and $\mathrm{Pa}^{G^+}$ stand for children and parents in $G^+$, resp.,¹
5. a system of causal mechanisms $g = (g_S)_{S \in \mathcal{L}(G^+),\, S \subseteq V}$:

$$g_S : \prod_{v \in \mathrm{Pa}^{G^+}(S) \setminus S} \mathcal{X}_v \to \prod_{v \in S} \mathcal{X}_v,$$ ²

that satisfy the following global compatibility conditions: For every nested pair of loops $S' \subseteq S \subseteq V$ of $G^+$ and every element $x_{\mathrm{Pa}^{G^+}(S) \cup S} \in \prod_{v \in \mathrm{Pa}^{G^+}(S) \cup S} \mathcal{X}_v$ we have the implication:

$$g_S\big(x_{\mathrm{Pa}^{G^+}(S) \setminus S}\big) = x_S \;\Longrightarrow\; g_{S'}\big(x_{\mathrm{Pa}^{G^+}(S') \setminus S'}\big) = x_{S'},$$

where $x_{\mathrm{Pa}^{G^+}(S') \setminus S'}$ and $x_{S'}$ denote the corresponding components of $x_{\mathrm{Pa}^{G^+}(S) \cup S}$.
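The graph notions from Definition 2.1 and Remark 2.2, which index the causal mechanisms $g_S$ above, can be checked on tiny examples. The following Python sketch is only an illustration and not part of the paper. All function names (`reachable`, `sc`, `components`, `is_loop`, `loops`) are hypothetical. It assumes that ancestors and descendants of $v$ include $v$ itself (consistent with $v \in \mathrm{Sc}^G(v)$ in Remark 2.2), and it assumes that a loop is a non-empty node set whose induced subgraph is strongly connected, with every singleton counting as a loop.

```python
from itertools import combinations


def reachable(edges, start):
    """Return all nodes reachable from `start` along directed edges, including `start`."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for a, b in edges:
            if a == node and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen


def sc(edges, v):
    """Sc(v) = Anc(v) ∩ Desc(v); both sets are assumed to contain v itself."""
    desc = reachable(edges, v)
    anc = reachable({(b, a) for a, b in edges}, v)  # reverse edges to get ancestors
    return frozenset(anc & desc)


def components(nodes, edges):
    """The set of strongly connected components of the graph."""
    return {sc(edges, v) for v in nodes}


def is_loop(edges, subset):
    """ASSUMPTION: a loop is a non-empty node set whose induced subgraph
    is strongly connected; singletons always qualify."""
    subset = frozenset(subset)
    induced = {(a, b) for a, b in edges if a in subset and b in subset}
    return len(subset) > 0 and all(sc(induced, v) == subset for v in subset)


def loops(nodes, edges):
    """Enumerate all loops by brute force (exponential, for tiny graphs only)."""
    ordered = sorted(nodes)
    return {
        frozenset(c)
        for r in range(1, len(ordered) + 1)
        for c in combinations(ordered, r)
        if is_loop(edges, c)
    }


nodes = {"a", "b", "c"}
cyclic_edges = {("a", "b"), ("b", "a"), ("b", "c")}   # a <-> b -> c
acyclic_edges = {("a", "b"), ("b", "c")}              # a -> b -> c

cyclic_loops = loops(nodes, cyclic_edges)
acyclic_loops = loops(nodes, acyclic_edges)

# Remark 2.2, item 1: v lies in Sc(v), and Sc(v) is itself a loop.
assert all(v in sc(cyclic_edges, v) and sc(cyclic_edges, v) in cyclic_loops for v in nodes)
# Remark 2.2, item 2: in an acyclic graph the loops are exactly the singletons.
assert acyclic_loops == {frozenset({v}) for v in nodes}
```

In the cyclic example the loops contained in the node set are the three singletons and the pair of nodes on the two-cycle, so under the stated assumptions an ioSCM on such a graph would carry one mechanism $g_S$ for each of these sets that lies in $V$. The brute-force enumeration is a deliberate simplification; it is exponential in the number of nodes.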