Causal Inference: Classical Approaches


Si Yi (Cathy) Meng, Feb 5, 2020, UBC MLRG

Outline
• Potential Outcomes
• Confounding and Causal DAGs
• Granger Causality
• ICA for Causal Discovery

Associational Inference
• Universe U.
• For each unit u ∈ U:
  • Attribute variable X(u)
  • Observed variable Y(u)
• Inference: P(Y = y | X = x)

Causal Inference
• Universe U.
• For each unit u ∈ U:
  • Treatment variable T(u) ∈ {0, 1}
  • Potential outcomes Y₁(u), Y₀(u)
• Inference: Y₁(u) − Y₀(u)

Rubin's Framework
• For each unit u ∈ U:
  • Treatment variable T(u) ∈ {0, 1}
  • Potential outcomes Y₁(u), Y₀(u): the outcome that would be observed if treatment were set to T = 1 or T = 0, on the same unit.
• If T(u) is set to 1:
  • Y₁(u) is the observed outcome.
  • Y₀(u) is the counterfactual outcome.

Causal Effects
• Y₁(u) − Y₀(u) is the causal effect of treatment 1 (relative to 0) on u.
• Abbreviated as Y₁ and Y₀.
• Fundamental Problem of Causal Inference: it is impossible to observe both Y₁ and Y₀ on the same unit, and therefore impossible to observe the causal effect.

THE END

Scientific solution to the Fundamental Problem
• Assume temporal stability and causal transience:
  • The value of Y₀ does not depend on when T = 0 is applied and measured.
  • The effect of T = 0 and the measurement process that gives rise to Y₀ do not change u enough to affect Y₁ measured later.
  • With these two assumptions, we can measure both Y₀ and Y₁ by applying T = 0 and then T = 1, taking a measurement after each exposure.
  • Widely used in experiments involving physical devices.
• Assume unit homogeneity:
  • For two units u₁ and u₂, we assume Y₀(u₁) = Y₀(u₂) and Y₁(u₁) = Y₁(u₂).
  • The causal effect can then be computed as Y₁(u₁) − Y₀(u₂).
  • Implies the constant effect assumption: Y₁(u) − Y₀(u) is the same for all u ∈ U.
• It's very difficult to argue that these assumptions are valid…
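Before moving from the scientific solution to the statistical one, the fundamental problem itself is easy to see in a simulation where we get to play omniscient. The following is a minimal sketch (my own illustration, not from the slides; the effect size and variable names are invented): generate both potential outcomes for every unit, apply a treatment, and watch half the table become unobservable.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Omniscient view: both potential outcomes exist for every unit.
y0 = rng.normal(size=n)         # outcome that would occur under T = 0
y1 = y0 + 2.0                   # outcome under T = 1 (true unit-level effect: 2)

t = rng.integers(0, 2, size=n)  # treatment actually applied
y_obs = np.where(t == 1, y1, y0)

for u in range(n):
    hidden = y0[u] if t[u] == 1 else y1[u]
    print(f"unit {u}: T={t[u]}, observed Y={y_obs[u]:+.2f}, "
          f"counterfactual (never observable) = {hidden:+.2f}")

# The unit-level causal effect y1[u] - y0[u] requires both columns,
# so it can never be computed from the observed data alone.
```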
Statistical solution to the Fundamental Problem
• Shift the question: "What would have happened if I had not taken the flu shot?" → "What would the flu rate be if everyone got the flu shot vs. if no one did?"
• Average causal effect (ACE) of T = 1 (relative to T = 0) over U:
  • E[Y₁ − Y₀] = E[Y₁] − E[Y₀]
  • Imagine parallel universes with the same population… we can't observe this.
• Observed data can only give us information about the average outcome over the units u ∈ U actually exposed to T = t:
  • E[Y₁ | T = 1] − E[Y₀ | T = 0]
• In general, E[Yₜ] ≠ E[Yₜ | T = t].
• Randomized treatment assignment makes (Y₀, Y₁) independent of T, so the equality holds and the observed difference in means identifies the ACE (checked numerically in the sketch below).

Other assumptions
• Stable Unit Treatment Value Assumption (SUTVA):
  • No interference: units do not interact with each other.
  • One version of treatment.
• Consistency:
  • The potential outcome Yₜ is equal to the observed outcome if the actual treatment received is T = t.
• Positivity:
  • P(T = t | X = x) > 0 for all t and x.
• Ignorability (aka the no-unmeasured-confounders assumption):
  • (Y₀, Y₁) ⊥ T | X
  • Among people with the same features X, we can think of treatment T as being randomly assigned.
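Why randomization licenses that equality, and how a violation of ignorability breaks it, can be checked numerically. Below is a hedged sketch (illustrative code, not from the talk; the confounder x, the effect size 2.0, and the assignment model are all invented): under random assignment the naive difference in means recovers the ACE, while an assignment that depends on x biases it.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

x = rng.normal(size=n)       # covariate that will act as a confounder
y0 = x + rng.normal(size=n)  # potential outcome under T = 0
y1 = y0 + 2.0                # potential outcome under T = 1; true ACE = 2

def naive_diff(t):
    """E[Y | T = 1] - E[Y | T = 0], computable from observed data only."""
    y_obs = np.where(t == 1, y1, y0)
    return y_obs[t == 1].mean() - y_obs[t == 0].mean()

# Randomized assignment: T is independent of (Y0, Y1), so ignorability
# holds and E[Y_t | T = t] = E[Y_t].
t_rand = rng.integers(0, 2, size=n)
print("randomized: ", naive_diff(t_rand))  # close to 2.0

# Confounded assignment: larger x makes treatment more likely, and x also
# raises the outcome, so E[Y_t | T = t] != E[Y_t] and the estimate is biased.
p_treat = 1.0 / (1.0 + np.exp(-2.0 * x))
t_conf = rng.binomial(1, p_treat)
print("confounded: ", naive_diff(t_conf))  # noticeably above 2.0
```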
Outline
• Potential Outcomes
• Confounding and Causal DAGs
• Granger Causality
• ICA for Causal Discovery

DAGs
• Useful for identifying dependencies and ways to factor and simplify the joint distribution:
  • P(x₁, …, xₙ) = ∏ᵢ P(xᵢ | pa(xᵢ))
• Two variables X and Y are d-separated by a set of variables Z if Z blocks every path between them; d-separated variables are conditionally independent given Z:
  • P(X, Y | Z) = P(X | Z) P(Y | Z)
• Three basic path structures: chain, fork, inverted fork (collider).
• [Figure: firing squad example (Pearl, 2018): Court Order (CO) → Captain (C) → Soldier A, Soldier B → Prisoner D]

Causal DAGs
• DAGs where the directions of the edges represent causal relationships.
• In contrast to Rubin's potential outcomes framework, this is the structural approach to causal inference that Pearl advocates.
• The two frameworks have been shown to be mathematically equivalent.

Intervention and Pearl's do-calculus
• The do() operator signals an intervention on a variable:
  • Replace that variable with the value that we assign.
  • Remove all incoming edges to that node.
• Instead of P(D | A = 0), we want P(D | do(A = 0)):
  • the causal effect of A = 0 on D.
• [Figure: the firing squad DAG with Soldier A fixed to A = 0 and its incoming edge from the Captain removed]

Confounding
• Confounders: variables that influence both treatment and outcome.
• Want: a set of variables such that ignorability holds.
  • We don't need to identify specific confounders.
  • We just need to be able to control for confounding.
• Need to block the backdoor paths from T to Y.

Frontdoor paths
• We are not concerned about frontdoor paths such as T → Z → Y.
• We don't want to control for anything along the frontdoor paths…
  • unless we care about the magnitude of the causal effect.
• [Figure: two DAGs with confounders A and B, one with a direct edge T → Y and one with a mediator T → Z → Y]

Backdoor paths
• Begin with a parent of T and end at Y.
• These paths must be blocked, as they confound the causal effect.
• How? Identify a set of variables that blocks all backdoor paths from T to Y.

Backdoor criterion
• A set of variables Z satisfies the backdoor criterion if
  1. it blocks all backdoor paths from T to Y, and
  2. it does not include any descendants of T.
• Several different sets satisfy the criterion for the example DAG (nodes A, B, D around T and Y).
• Controlling for any one of these sets allows us to control for confounding; the adjustment sketch below shows how.
• [Figure: example DAG with nodes A, B, T, Y, D]
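Once a set Z satisfying the backdoor criterion is in hand, the interventional quantity is identified by the adjustment formula P(Y | do(T = t)) = Σ_z P(Y | T = t, Z = z) P(Z = z). Below is a minimal sketch of this adjustment for a single binary confounder (my own illustration with invented probabilities, not an example from the slides):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500_000

# Z -> T and Z -> Y plus T -> Y, so Z opens the backdoor path T <- Z -> Y.
z = rng.binomial(1, 0.5, size=n)
t = rng.binomial(1, np.where(z == 1, 0.8, 0.2))
y = rng.binomial(1, 0.2 + 0.3 * t + 0.4 * z)  # true effect of do(T): 0.3

# Naive (confounded) contrast: P(Y = 1 | T = 1) - P(Y = 1 | T = 0).
naive = y[t == 1].mean() - y[t == 0].mean()

def p_do(t_val):
    """Backdoor adjustment: sum_z P(Y = 1 | T = t, Z = z) * P(Z = z)."""
    return sum(
        y[(t == t_val) & (z == z_val)].mean() * (z == z_val).mean()
        for z_val in (0, 1)
    )

adjusted = p_do(1) - p_do(0)
print(f"naive: {naive:.3f}, adjusted: {adjusted:.3f} (truth: 0.300)")
```

Running this, the naive contrast lands well above 0.3 because treated units disproportionately have Z = 1, while the adjusted contrast recovers the interventional effect.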