A Language for Counterfactual Generative Models

Zenna Tavares 1  James Koppel 1  Xin Zhang 2  Ria Das 1  Armando Solar-Lezama 1

Abstract

We present OMEGAC, a probabilistic programming language with support for counterfactual inference. Counterfactual inference means to observe some fact in the present, and infer what would have happened had some past intervention been taken, e.g. "given that medication was not effective at dose x, how likely would it have been effective at dose 2x?" We accomplish this by introducing a new operator to probabilistic programming akin to Pearl's do, define its formal semantics, provide an implementation, and demonstrate its utility through examples in population dynamics, inverse planning, and graphics.

Figure 1: A speeding driver (Left: driver's view) crashes into a pedestrian (yellow) emerging from behind an obstruction (blue). Given a single frame of camera footage (Right), OMEGAC infers whether driving below the speed limit would have prevented the crash.

1. Introduction

In this paper we introduce OMEGAC: a Turing-universal programming language for causal reasoning. OMEGAC allows us to automatically derive causal inferences about phenomena that can only be modelled faithfully through simulation. We focus on counterfactuals – what-if inferences about the way the world could have been, had things been different. OMEGAC programs are simulation models augmented with probability distributions to represent any uncertainty. In a similar vein to other probabilistic languages, OMEGAC provides primitive operators for conditioning, which revises the model to be consistent with any observed evidence. Counterfactuals, however, cannot be expressed through probabilistic conditioning alone. They take the form: "Given that some evidence E is true, what would Y have been had X been different?" For example, given that a drug treatment was not effective on a patient, would it have been effective at a stronger dosage? Although one can condition on E being true, attempting to condition on X being different to the value it actually took is contradictory, and hence impossible.

To construct counterfactuals, OMEGAC introduces a do operator for constructing interventions:

    Y | do(X → x)    (1)

This evaluates to what Y would have been had X been bound to x when Y was defined. Consequently, if Y is a random variable and we define Yx = Y | do(X → x), then P(Yx = y) is the probability Y would have been y had X been x. A counterfactual is then simply Yx | E. Note that if E depends on X, conditioning on it affects X's factual, non-intervened value, which is critical to capturing the semantics of counterfactuals.

To illustrate the potential of counterfactual reasoning within a universal probabilistic language, consider the scenario of an expert witness called to determine, from only a frame of recorded video (Figure 1), whether a driver was to blame for crashing into a pedestrian. Using OMEGAC, the expert could first construct a probabilistic model that includes the car dynamics, the driver's and pedestrian's behaviour, and a rendering function that relates the three-dimensional scene structure to two-dimensional images. She could then condition the model on the captured images to infer the conditional distribution over the driver's velocity, determining the probability that the driver had been speeding. Using OMEGAC she could then pose a counterfactual, asking whether the crash would have still occurred even if the driver had obeyed the speed limit. If she later wanted to investigate the culpability of another candidate cause, such as the placement of the crosswalk, she could do so by adding a single line, and without modifying her underlying models at all.

Causal reasoning is currently done predominantly using causal graphical models (20): graphs whose vertices are variables, and whose directed edges represent causal dependencies. Despite widespread use, causal graphs cannot easily express many real-world phenomena. One reason for this is that causal graphs are equivalent to straight-line programs: programs without conditional branching or loops – just finite sequences of primitive operations. Straight-line languages are not Turing-complete; they cannot express unbounded models with an unknown number of variables. In practice, they lack many of the features (composite functions, data types, polymorphism, etc.) necessary to express the kinds of simulation models we would like to perform causal inference in.

Counterfactual reasoning in OMEGAC alleviates many of these limitations. In an OMEGAC intervention X → x, x can be a constant, a function, another random variable, or even refer to its non-intervened self, e.g. X → 2X. Moreover, users can construct various forms of stochastic interventions, and even condition the corresponding interventional worlds. This allows users to model experimental error, or scenarios where observers are unsure about which intervention has taken place.

A generic do operator that composes systematically with conditioning presents several challenges. In particular, to construct YX, we must be able to copy Y in such a way that the code that defines it is retroactively modified. This goes beyond the capabilities of existing programming languages, probabilistic or otherwise, and hence OMEGAC requires a non-standard semantics and implementation.

In summary, we (i) present the syntax and semantics of a universal probabilistic language for counterfactual generative models (Section 3); (ii) provide a complete implementation of OMEGAC; and (iii) demonstrate counterfactual generative modelling through a number of examples (Section 4). Regarding scope, causal inference includes problems of both (i) inferring a causal model from data, and (ii) given a causal model, predicting the result of interventions and counterfactuals on that model. This work focuses on the latter.

1 CSAIL, MIT, USA. 2 Key Lab of High Confidence Software Technologies, Ministry of Education, Department of Computer Science and Technology, Peking University, China. Correspondence to: Zenna Tavares, Xin Zhang. Proceedings of the 38th International Conference on Machine Learning, PMLR 139, 2021. Copyright 2021 by the author(s).

2. Overview of Counterfactuals

Counterfactual claims assume some structure is invariant between the original factual world and the intervened hypothetical world. For instance, the counterfactual "If I had trained more, I would have won the match" is predicated on the invariance of the opponent's skill, the existence of the game, the laws of physics, etc. Any system for counterfactual reasoning must provide mechanisms to construct hypothetical worlds that maintain invariances (and hence share information) with the factual world, so that, for instance, the fact that I actually lost the match helps predict whether I would have won the match had I trained harder.

These requirements have been resolved in the context of causal graphical models. Causal interventions are "surgical procedures" which modify single nodes but leave functional dependencies intact. Pearl's twin-network construction (20) of counterfactuals duplicates the model into one twice the size. One half is the original model. The other half is a duplicate, modified to express the counterfactual interventions. These halves are joined via a shared dependence on the background facts. Hence, conditioning a variable in the factual world influences the counterfactual world.

To generalize the twin-network construction to arbitrary programs, OMEGAC runs two copies of a program: one factual execution, and one counterfactual execution which shares some variables, but in which others have been given alternate definitions. It is folklore that programs doing this can be built by hand, but, as in the twin-network construction, each intervention requires writing a separate model, and each counterfactual included doubles the size of the program.

The solution in OMEGAC is to provide a new do operator which removes the need to modify an existing program to add a counterfactual execution. Instead, t1 | do(x → t2) is defined to be the value that a term t1 would take if x had been set to t2. This works even if the dependencies of t1 on x are indirect. For instance, if y = 2x, then y² | do(x → f) is equivalent to (2f)². Note that the variable x can be any variable, even one that is bound to a function, meaning users can compactly define interventions which are substantial modifications. Finally, combining the operator with conditioning automatically gives counterfactual inference. Our examples show that OMEGAC's do operator enables compact definition of many counterfactual inference problems. Indeed, in Appendix B, we prove that the do operator is not expressible as syntactic sugar (as defined by programming language theory).

3. A Calculus for Counterfactuals

Our language OMEGAC augments the functional probabilistic language OMEGA (31) with counterfactuals. To achieve this: (1) the syntax is augmented with a do operator, and (2) the language evaluation is changed from eager to lazy, which is key to handling interventions. In this section, we introduce λC, a core calculus of OMEGAC. We build the language up in pieces: first showing the standard deterministic features, then features for deterministic interventions, and finally the probabilistic ones. Together, intervention and conditioning give the language the ability to do counterfactual inference. Appendix A gives a more formal definition of the entire λC language. A Haskell implementation which provides complete execution traces of terms in λC is available from https://tinyurl.com/y3cusyoe.

    Variables x, y, z ∈ Var
    Type τ ::= Int | Bool | Real | τ1 → τ2
    Term t ::= n | b | r | t1 ⊕ t2 | x | let x = t1 in t2 |
               λx : τ.t | t1(t2) | if t1 then t2 else t3

Figure 2: Abstract syntax for λC, deterministic fragment.

Deterministic Fragment  We begin by presenting the fragment of λC for deterministic programming; Fig. 2 gives the syntax. A common formal way to specify the executions of a program is with an operational semantics (23), which defines how one expression reduces to another. Appendix A provides an operational semantics for OMEGAC; here, we describe these reductions through examples. The execution of an expression is defined both in terms of the expression and the current program state. In λC, this program state is an environment Γ: a mapping from variables to values.

The deterministic fragment is standard, so we explain it only briefly. λC has integer numbers (denoted n), Booleans {True, False} (denoted b), and real numbers (r). ⊕ represents a mathematical binary operator such as +, ∗, etc. let x = t1 in t2 binds variable x to expression t1 when evaluating t2. Lambda expressions create functions: λx.2 ∗ x defines a mapping x ↦ 2x. Function application and if-statements are standard.

Next, we demonstrate the semantics of operators and of let. The notation (Γ) e denotes a pair of an environment Γ and an expression e, and (Γ1) e1 → (Γ2) e2 indicates that e1 with environment Γ1 steps to e2 with environment Γ2. The let expression first binds x to 3, creating a new environment. Then x is evaluated by looking up its value in the environment:

    (Γ: ∅) let x = 3 in x  →  (Γ: x ↦ 3) x  →  (Γ: x ↦ 3) 3

Function applications are done by substitution, as in other variants of the lambda calculus:

    (Γ: ∅) (λx.(x + x))(2)  →  (Γ: ∅) 2 + 2  →  (Γ: ∅) 4

The above semantics is eager: let x = t1 in t2 first evaluates t1 and then binds the result to x, creating a new environment in which to then evaluate t2. We next show how this is problematic for counterfactuals.

Causal Fragment  Our causal fragment adds one new term: the do expression (Fig. 3). t1 | do(x → t2) evaluates t1 to the value that it would have evaluated to, had x been defined as t2 at its point of definition. Here, x can be any variable that is in scope, bound locally or globally, and t2 can be any term denoting a value. One idea is to define do similarly to let: t1 | do(x → t2) would rebind x to t2 when evaluating t1. However, this does not take into account transitive dependencies. For example, let x = 0 in let y = x in y | do(x → 1) should evaluate to 1, but by the time the execution evaluates the do, y has already been bound to 0, so that rebinding x does nothing.

    Term t ::= · · · | t1 | do(x → t2)

Figure 3: Abstract syntax for λC, causal fragment.

To overcome this, we redefine let to use lazy evaluation. In lazy evaluation, instead of storing the value of a variable in the environment, the execution stores its defining expression together with the environment at the point where the variable is defined. So, while environments for eager evaluation store mappings x ↦ v from variable x to value v, in lazy evaluation the environments store mappings x ↦ (Γ, e), which map each variable x to a closure containing both its defining expression e and the environment Γ in which it was defined. A variable, such as x, is evaluated by evaluating its definition under the environment where it is defined.

We can now define do: y | do(x → −1) evaluates y under a new environment which is created by recursively mapping all bindings for x in the current environment to −1. This includes both the binding of x at the top level and the bindings in an environment that is used in any closure. The following example demonstrates this process:

    (Γ: ∅) let x = 0 in let y = x + 1 in y + (y | do(x → −1))
    → (Γ: x ↦ (∅, 0)) let y = x + 1 in y + (y | do(x → −1))
    → (Γ: x ↦ (∅, 0), y ↦ (x ↦ (∅, 0), x + 1)) y + (y | do(x → −1))
    → (Γ: x ↦ (∅, 0), y ↦ (x ↦ (∅, 0), x + 1)) ((Γ: x ↦ (∅, 0)) x + 1) + (y | do(x → −1))
    → (Γ: x ↦ (∅, 0), y ↦ (x ↦ (∅, 0), x + 1)) 1 + (y | do(x → −1))
    → (Γ: x ↦ (∅, 0), y ↦ (x ↦ (∅, 0), x + 1)) 1 + ((Γ: x ↦ (∅, −1), y ↦ (x ↦ (∅, −1), x + 1)) y)
    → (Γ: x ↦ (∅, 0), y ↦ (x ↦ (∅, 0), x + 1)) 1 + ((Γ: x ↦ (∅, −1)) x + 1)
    → (Γ: x ↦ (∅, 0), y ↦ (x ↦ (∅, 0), x + 1)) 1 + 0
    → (Γ: x ↦ (∅, 0), y ↦ (x ↦ (∅, 0), x + 1)) 1

The program is evaluated under an empty environment. (1) Evaluating the outermost let binds x to a closure (∅, 0) (consisting of the initial environment and x's definition). (2) y is bound to a closure, containing the environment from step (1) and y's definition. The left operand of the addition is then evaluated, by first (3) looking up its closure in the environment, and then (4) evaluating its definition under the corresponding environment in the closure. To evaluate the do in the right operand, (5) the current environment is copied, and then (6) modified to rebind all definitions of x to −1. The right operand of the addition is a do expression of y, which the execution evaluates under this "intervened" environment, by (7) looking up the closure of y in the intervened environment, and then (8) evaluating it. (9) The final result of the program is then 1.
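The environment-rewriting account of do can be sketched in ordinary Python. The following is our own minimal sketch, not the paper's Haskell implementation: environments map names to closures (defining environment, expression), and do rewrites every binding of a name, including those stored inside closures.

```python
# Minimal sketch (ours, not the paper's implementation) of lazy environments
# and the do operator as recursive environment rewriting.

def lookup(env, name):
    # A lazy environment maps a name to a closure (defining_env, expr).
    def_env, expr = env[name]
    return evaluate(expr, def_env)

def evaluate(expr, env):
    # expr is a literal, a variable name, or a thunk taking the environment.
    if callable(expr):
        return expr(env)
    if isinstance(expr, str):
        return lookup(env, expr)
    return expr

def do(env, name, value):
    # Rebind every binding of `name`, including inside stored closures.
    new = {}
    for k, (def_env, expr) in env.items():
        if k == name:
            new[k] = ({}, value)
        else:
            new[k] = (do(def_env, name, value), expr)
    return new

# let x = 0 in let y = x + 1 in y + (y | do(x -> -1))
env = {}
env = {**env, "x": (dict(env), 0)}
env = {**env, "y": (dict(env), lambda e: lookup(e, "x") + 1)}
result = lookup(env, "y") + lookup(do(env, "x", -1), "y")
# result == 1, matching the trace: the factual y is 1, the intervened y is 0.
```

Because `do` returns a fresh environment, the factual binding of y is untouched; only the copy used for the right operand sees x ↦ −1.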

Probabilistic Fragment  In probability theory, a random variable is a function from a sample space Ω to some domain τ. λC defines random variables similarly: as functions of type Ω → τ. This separates the source of randomness of a program from its main body, leading to a clean definition of counterfactuals.

    Type τ ::= · · · | Ω
    Term t ::= · · · | ⊥ | t1 | t2 | rand(t)

Figure 4: Abstract syntax for λC, probabilistic fragment.

Fig. 4 shows the abstract syntax of the probabilistic fragment. It introduces a new type Ω, representing the sample space. Ω is left unspecified, save that it may be sampled from uniformly. In most applications, Ω will be a hypercube, with one dimension for each independent sample. To access the values of each dimension of this hypercube, one of the ⊕ operators must be the indexing operator [], where ω[i] evaluates to the ith component of ω.

Random variables are normal functions. If Ω = [0, 1], and a < b are integer constants, then R = λω : Ω.ω ∗ (b − a) + a is a random variable uniformly distributed in [a, b]. The rand operator then samples from a random variable: rand R returns a random value drawn uniformly from [a, b]. Note that unlike in other probabilistic languages, we separate the construction of random variables from their sampling. Consequently, rand does not occur in the definition of a random variable itself.

To support conditioning, we use ⊥ to denote the undefined value. Any non-rand expression that depends on a ⊥ value will result in another ⊥ value. A program execution is invalid if it evaluates to ⊥. One can imagine the execution of a λC program as a rejection sampling process: we ignore all samples from rand that would make the program evaluate to ⊥. In the implementation, we use a much more efficient inference algorithm (30). Conditioning can now be defined as syntactic sugar: t | E is defined as λω.if E(ω) then t(ω) else ⊥.

Let Ω = {1, 2, . . . , 10}, and consider the program rand(λω.ω ∗ 2 | λω.ω < 4). If ω ≥ 4, then evaluating the random variable results in ⊥. The rand operator hence runs the variable with ω drawn uniformly from {1, 2, 3}, resulting in 2, 4, or 6, each with probability 1/3.

Counterfactuals  A counterfactual is a random variable of the form (t1 | do(x → t2)) | E. Consider the following program, depicting a game where a player chooses a number c, and then a number ω is drawn randomly from a sample space Ω = {0, 1, . . . , 6}. The player wins iff c is within 1 of ω. The query asks: given that the player chose 1 and did not win, what would have happened had the player chosen 4?

    let c = 1 in
    let x = λω . if (ω-c) * (ω-c) <= 1 then 1 else -1 in
    let cfx = (x | do(c → 4)) | λω . x(ω) == -1
    in rand(cfx)

As before, the rand expression is evaluated in the context Γ1 = {c ↦ (∅, 1), x ↦ (c ↦ . . . , λω.if . . . )}. Its argument, a conditioning term, desugars to λω′.if x(ω′) == −1 then (x | do(c → 4))(ω′) else ⊥. This random variable evaluates to ⊥ for ω′ ∈ {0, 1, 2}, so the program is evaluated with ω′ drawn uniformly from {3, 4, 5, 6}. The do expression x | do(c → 4) is reduced to evaluating x in the context Γ2 = {c ↦ . . . , x ↦ (c ↦ (∅, 4), λω.if . . . )}. This is then applied to ω′, and the overall computation hence evaluates to 1 with probability 3/4 and −1 with probability 1/4.

Syntactic Sugar  OMEGAC introduces some syntactic conveniences on top of λC. Random variables are functions, but it is convenient to treat them as if they were the values in their domains. To support this, OMEGAC interprets the application of a function to one or more random variables pointwise – if both X and Y are random variables and x is a constant, then X + Y is also a random variable, defined as λω.X(ω) + Y (ω), and X = x is λω.X(ω) = x. In addition, OMEGAC represents distribution families as functions from parameters to random variables. For instance, bern = λp.λω.ω[1] < p represents the Bernoulli family by mapping a parameter p ∈ [0, 1] to a random variable that is true with probability p. Finally, since λC is purely functional, if X = bern(0.5) and Y = bern(0.5), then X and Y are not only i.i.d. but the very same random variable, which is often not what we want. OMEGAC defines the syntax ∼ X, so that in let X ∼ bern(0.5), Y ∼ bern(0.5), X and Y are independent.
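Because the sample space of the game is finite, the counterfactual distribution can be checked by exhaustive enumeration rather than sampling. The following sketch is ours; it mirrors the rejection-sampling reading of conditioning over Ω = {0, . . . , 6}.

```python
# Enumerate the counterfactual (x | do(c -> 4)) | (x == -1) from the game
# above over Omega = {0,...,6} (a sketch of the semantics, ours).
from fractions import Fraction

def x(omega, c):
    # The player wins (1) iff their choice c is within 1 of omega.
    return 1 if (omega - c) ** 2 <= 1 else -1

omegas = range(7)
# Condition: in the factual world (c = 1) the player lost.
kept = [w for w in omegas if x(w, 1) == -1]          # {3, 4, 5, 6}
# Intervene: evaluate x with c rebound to 4, on the conditioned samples.
counts = {}
for w in kept:
    v = x(w, 4)
    counts[v] = counts.get(v, 0) + 1
dist = {v: Fraction(n, len(kept)) for v, n in counts.items()}
# dist == {1: Fraction(3, 4), -1: Fraction(1, 4)}
```

Note that the condition uses the factual c = 1 while the intervened query uses c = 4 on the *same* retained ω′, which is exactly the shared-background-facts property of the twin-network construction.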

3.1. Other Composite Queries

Conditioning and intervening can be composed arbitrarily. This allows us to express a variety of causal queries.

To demonstrate, we adapt an example from (20), whereby (i) with probability p, a court orders riflemen A and B to shoot a prisoner, (ii) A's calmness C ranges uniformly from 1 (cool) to 0 (manic), (iii) if C falls below a threshold q (and hence with probability q) A nervously fires regardless of the order, and (iv) the prisoner dies (D = 1) if either shoots. In OMEGAC:

    let p = 0.7, q = 0.3,
        E = ~ bern(p),     -- Execution order
        C = ~ unif(0, 1),  -- Calmness
        N = C < q,         -- Nerves
        A = E or N,        -- A shoots
        B = E,             -- B shoots on order
        D = A or B in      -- Prisoner dies

Counterfactual queries condition the real world and consider the implications in a hypothetical world:

    -- Given D, would D be true had A not fired?
    (D | do(A → 0)) | D

Non-atomic Interventions  Atomic interventions, which replace a random variable with a constant, often do not reflect the kinds of interventions that have, or even could have, taken place in the real world. Various non-atomic interventions are easily expressed in OMEGAC.

Conditional interventions (8) replace a variable with a deterministic function of other observable variables:

    -- If A's nerves had spread to B, would D occur?
    D | do(B → C < q)

A mechanism change (32) alters the functional dependencies between variables:

    -- Would D occur if it took both shots to kill him?
    (D | do(D → A and B)) | D

Parametric interventions (9) alter, but do not break, causal dependencies. They are expressible by intervening a variable to be a function of its non-intervened self:

    -- If A were more calm, would D have occurred?
    D | do(C → C * 1.2)

Partial compliance is where an intervention fails to have any effect with some probability:

    -- Would D have occurred had we attempted (and failed
    -- with probability s) to prevent A shooting?
    D | do(A → if ~ bern(s) then 0 else A)

"Fat-hand" interventions (9) inadvertently (and probabilistically) affect variables other than the intended ones:

    -- Would D be dead if we stopped A from firing and
    -- (with probability r) also prevented B, too?
    D | do(A → 0, B → if ~ bern(r) then 0 else B)
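The firing-squad counterfactual (D | do(A → 0)) | D is small enough to compute exactly. The sketch below is ours: it enumerates the shared background facts E (order) and N (nerves) with their probabilities, conditions on the factual death, and evaluates the intervened world on the same facts.

```python
# Exact twin-world computation (our sketch) of (D | do(A -> 0)) | D for the
# firing-squad model with p = 0.7, q = 0.3. The background facts E and N are
# shared between the factual and the intervened world.
p, q = 0.7, 0.3

num = den = 0.0
for e in (False, True):
    for n in (False, True):
        w = (p if e else 1 - p) * (q if n else 1 - q)
        d_factual = (e or n) or e     # D = A or B, with A = E or N, B = E
        d_do_a0 = False or e          # intervene A -> 0; B still fires on order
        if d_factual:                 # condition on the factual death
            den += w
            num += w * d_do_a0
prob = num / den
# prob == p / (1 - (1-p)*(1-q)) = 0.7 / 0.79, roughly 0.886
```

The answer reduces to P(E | E ∨ N): given that the prisoner died, he would still have died without A's shot exactly when the court actually gave the order.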

4. Experiments

Here we demonstrate counterfactual reasoning in OMEGAC through three case studies. All experiments were performed using predicate exchange (30). Simulation parameters and code are in the supplementary material.

Car-Crash Model  Continuing from the introduction, this example asks whether a crash would have occurred had a car driven more slowly, given observed camera footage. Let S be the space of scenes, where each scene s ∈ S consists of the position, velocity, and acceleration of the car, the pedestrian, and an obstacle. A ray-marching based (1) rendering function r : S → I maps a scene to an image. The driver acts according to a driver model – a function mapping s ∈ S to a target acceleration:

    let
      drivermodel = λ car, ped, obs .
        if cansee(car, ped, obs)  -- if ped is visible
        then -9                   -- decelerate
        else 0,                   -- else maintain speed
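The shape of this model can be illustrated with a deliberately simplified simulation. Everything below is invented for illustration (1-D geometry, made-up speeds, distances, and time step; the paper's scenes, renderer, and dynamics are far richer): a car brakes at a fixed rate once the pedestrian becomes visible past the obstruction, and we rerun the same dynamics under an intervened initial speed.

```python
# Illustrative simulate loop for a drivermodel-style policy. All constants
# and geometry are ours, chosen only to make the factual/counterfactual
# contrast visible.

def cansee(car_p, ped_p, obs_p):
    # Pedestrian becomes visible once the car passes the obstruction.
    return car_p >= obs_p

def drivermodel(car_p, ped_p, obs_p):
    return -9.0 if cansee(car_p, ped_p, obs_p) else 0.0

def simulate(car_v, car_p=0.0, ped_p=50.0, obs_p=20.0, dt=0.1, steps=60):
    traj = []
    for _ in range(steps):
        acc = drivermodel(car_p, ped_p, obs_p)
        car_v = max(0.0, car_v + acc * dt)   # brake, but never reverse
        car_p += car_v * dt
        traj.append((car_p, car_v))
    return traj

def crashed(traj, ped_p=50.0):
    return any(pos >= ped_p for pos, _ in traj)

factual = simulate(car_v=30.0)      # speeding initial velocity
intervened = simulate(car_v=10.0)   # counterfactual: slower initial velocity
```

In this toy geometry the fast car cannot stop within its remaining 30 m while the slow car can; in the paper, the analogous intervention is expressed as do(CarV → 14) on the probabilistic model rather than by rerunning an edited simulator.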

The expert witness maintains random variables over the car's acceleration, velocity, and position at t = 0. The function simulate returns state-space trajectories of the form (st, st+1, . . . , sn). Since the initial scene is a random variable, Traj is a random variable over trajectories. Applying render to each scene in Traj yields a random variable over image trajectories.

      CarV = ~ normal(12, 4),
      CarP = ~ normal(30, 5),
      PedV = ~ normal(3, 1),
      PedP = ~ normal(1, 2),
      InitScene = (CarV, CarP, PedV, PedP, obs),
      Traj = simulate(InitScene, drivermodel),
      Images = map(render, Traj),

We then ask the counterfactual, conditioning the image at the observed time t on the observed data (Figure 1, right) and intervening CarV → 14:

      Ev = Images[t] == data and crashed(Traj)
      in (Traj | do(CarV → 14)) | Ev

We can also ask: would the crash have occurred had the obstacle not been there?

      in (Traj | do(InitScene → (CarV, CarP, PedV, PedP))) | Ev

Figures 5 and 6 visualize and explain the results.

Figure 5: Traces of counterfactual scenarios through time. Each figure is a single sample from (Left) the posterior – the car crashes into the pedestrian, (Middle) the counterfactual on intervening the obstacle position, and (Right) the counterfactual on intervening the driver speed. Each image shows the driver and car (in decreasing transparency) at times 1, 9, and 19.

Figure 6: Histograms of the causal effect of interventions: how close the car would have come to the pedestrian had (Left) the acceleration been reduced (CarV → 14), or (Right) the obstacle been removed. Even at the speed limit, the driver still would have crashed with high probability.

Glucose Modelling  This example queries whether a hypoglycemic episode could have been avoided in a diabetic patient. We first construct an ODE over variables captured in the Ohio Glucose dataset (17): (1) CGM: continuously monitored glucose measurements, (2) Steps: steps walked by the patient, (3) Bolus: insulin injection events, and (4) Meals: calorie intake. The recursive function euler implements Euler's method to solve the ODE, taking as input an initial state u and a derivative function f', and producing a time series (ut, ut+∆t, ut+2∆t, . . . , utmax).

    let t0 = 0, ∆t = 0.1, tmax = 1,
        τ = λ u, t . u,  -- identity; hook used below to intervene u
        euler = λ f', u, t .
          let u = τ(u, t), tnext = t + ∆t in
          if t < tmax
          then let unext = u + f'(tnext, u) * ∆t
               in cons(u, euler(f', unext, tnext))
          else u,

We pre-trained a neural network for the derivative function, and added normally distributed noise to the weights to introduce uncertainty, yielding F', a random variable over functions. Given F' as input, euler produces a random variable over time series.

        Series = euler(F', u, t0),

Now we can ask: had we eaten (increased food intake) at t = 0.2, would the hypoglycemic event have occurred? We use the function τ to intervene. It maps u at every time t to a new value; this indirection is needed since u is internal to euler.

        τint = λ u, t .
          if t == 0.2
          then [u[1], u[2], inc(u[3])]
          else u,
        Series = Series | do(τ → τint),

Suppose we are told that someone has intervened, and hypoglycemia was avoided, but we do not know when the intervention occurred. We construct a distribution over the intervention time T, then condition the intervened world.

        CGM = first(Series),
        Hypo = any(map(λ x . x < thresh, CGM)),
        T = ~ unif(0, 1),
        Tint2 = λ ω . λ u, t .
          if t == T(ω)
          then [u[1], u[2], inc(u[3])]
          else u,
        HC = λ ω . (Hypo | do(τ → Tint2(ω)))(ω)
        in CGM | HC

As shown in Figure 7(c), intervening earlier in the day makes a substantial difference.

Counterfactual Planning  Consider a dispute between three hypothetical islands (Figure 8): S (South), E (East), and N (North). The people of S consider a barrier between S and N, asking the counterfactual: given observed migration patterns, how would they have differed had a border existed? We model this as a population of agents, each acting in accordance with a Markov Decision Process (MDP) (24) model. Each grid cell is a state in a state space S = {(i, j) | i = 1 . . . 7, j = 1 . . . 6}.

Figure 7: Glucose time-series model. Dots are datapoints; trajectories are sampled from the prior. (Left) Prior samples, (Middle) samples from the interventional distribution Meal → 5 at t = 0.20, (Right) posterior over the time T of intervention, given that hypoglycemia did not occur after the intervention.

The action space A = {up, down, left, right} moves an agent a single cell. Each agent acts according to a reward function R : S → R that depends only on the state the agent is in. This reward function is normally distributed, conditional on the country the agent originates from. For t = 100 timesteps we simulate the migration behavior of each individual using value iteration and count the amount of time spent in each country over the time period. Figure 8 shows population counts according to these dynamics. Figure 9 shows migration in the prior, after conditioning on an observed migration pattern (constructed artificially), and in the counterfactual case (adding the border).

Figure 8: Map (i) without / (ii) with boundary. Samples from population counts after n timesteps of MDP-based migration: (iii) unconditional sample, (iv) conditional sample, (v) counterfactual sample.

Figure 9: Three samples of migration under three conditions. Each figure shows the migration of islanders born in S, N, or E (y-axis) to S, N, E, W (water) or B (barrier) on the x-axis. We accumulate all states visited in each person's trajectory. (Plots 1 to 3 from left) Prior samples, (4 to 6) conditioned on observations, (7 to 9) counterfactual: conditioned and with intervention (border).

But-for Causality in Occlusion  In this experiment, we implement "but-for" (13) causation to determine (i) whether a projectile's launch angle is the cause of it hitting a ball, and (ii) occlusion, i.e. whether one object is the cause of an inability to see another. An event C is the but-for cause of an event E if, had C not occurred, neither would have E (12). But-for judgements cannot be resolved by conditioning on the negation of C, since this fails to differentiate cause from effect. Instead, the modeler must find an alternative world where C does not hold. In OMEGAC, a value ω ∈ Ω encompasses all the uncertainty, and hence we define but-for causality relative to a concrete value ω.

Definition 1. Let C1, . . . , Cn be a set of random variables and c1, . . . , cn a set of values. With respect to a world ω, the conjunction C1 = c1 ∧ · · · ∧ Cn = cn is the but-for cause of a predicate E : Ω → Bool if (i) it is true wrt ω and (ii) there exist cˆ1, . . . , cˆn such that:

    (E | do(C1 → cˆ1, . . . , Cn → cˆn))(ω) = False    (2)

E(ω) = True is a precondition, i.e., the effect must actually have occurred for but-for to be defined.

But-for is defined existentially. To solve it, OMEGAC relies on predicate relaxation (30), which underlies inference in OMEGAC. In our two experiments, E is a predicate that in (i) is true iff the projectile hits the ball, and in (ii) is true iff the yellow object is occluded in the scene, computed by tracing rays from the viewpoint and checking for intersections. Predicate relaxation transforms E into a soft predicate E˜ which returns a value in [0, 1] denoting how close E is to being satisfied. Using this, our implementation uses gradient descent over cˆ1, . . . , cˆn to minimize (E˜ | do(C1 → cˆ1, . . . , Cn → cˆn))(ω). In (i) cˆi is the launch angle, and in (ii) cˆx,y,z is the position of the occluder. Finding cˆi such that the soft predicate evaluates to 0 confirms a but-for cause. In Figure 10 we present a visualization of the optimization, which ultimately infers that the angle is the cause of the collision and the grey sphere is the cause of the viewer's inability to see the yellow sphere.
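The but-for search for case (i) can be sketched in a few lines. This sketch is ours: the projectile physics is the textbook range formula, the constants are invented, and a simple scan over candidate angles stands in for the paper's gradient descent on the relaxed predicate.

```python
# Sketch (ours) of the but-for search: find a launch angle c_hat at which the
# relaxed hit predicate reaches 0, witnessing the angle as a but-for cause.
import math

V, G = 10.0, 9.81           # launch speed and gravity (illustrative constants)
BALL_X, RADIUS = 10.0, 0.5  # target position and hit tolerance

def landing(angle):
    # Range of a projectile launched at `angle` radians on flat ground.
    return V * V * math.sin(2 * angle) / G

def hit(angle):
    return abs(landing(angle) - BALL_X) < RADIUS

def soft_hit(angle):
    # Relaxation: 1 on a direct hit, decaying to 0 as the miss distance grows.
    return max(0.0, 1.0 - abs(landing(angle) - BALL_X) / RADIUS)

factual_angle = math.pi / 4  # hits: range = V*V/G, about 10.19

# Search candidate interventions on the angle, minimizing the soft predicate
# (a grid scan here, in place of gradient descent on the relaxed model).
c_hat = min((a / 100 for a in range(1, 157)), key=soft_hit)
but_for = hit(factual_angle) and not hit(c_hat) and soft_hit(c_hat) == 0.0
```

Finding any c_hat that drives the soft predicate to 0 discharges the existential in Definition 1, confirming the launch angle as a but-for cause of the hit.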

5. Related Work and Discussion

Related work.  Operators resembling do appear in existing PPLs. Venture (16) has a force expression which modifies the current trace object (a mapping from random primitives to values) so that the simulation of a given expression takes on a given value. It is intended as a tool for initialization and debugging. Pyro (5) and Anglican (34) have similar mechanisms. These can and have (18; 22) been used to compute counterfactuals by (i) approximating the posterior with samples, (ii) revising the model with an intervention, and then (iii) simulating the intervened model using the posterior samples instead of priors.

Figure 10: But-for causality. Left to right: stages of optimization to infer that the grey sphere is the cause of the inability to see the yellow sphere, and that the launch angle is the cause of the projectile colliding with the ball.

The fundamental distinction is that in OMEGAC, the operators to condition and intervene both produce new random variables, which can then be further conditioned or intervened to produce counterfactual variables, which in turn can either be sampled from or reused in some other process. The Pyro approach, in contrast, computes counterfactual queries by performing inference first and then changing the model second. This has several practical consequences. Counterfactual queries in OMEGAC tend to be significantly more concise, and require none of the manual hacks. More fundamentally, OMEGAC does not embed an inference procedure into the counterfactual model itself, which would muddle the distinction between modelling and inference. In this vein, Pyro is similar to Metaverse (22), a recent Python-based system which mirrors Pearl's three steps of abduction, action, and prediction, using importance sampling for inference. A downside of this approach is that it is difficult to create the kinds of composite queries we have demonstrated. We explore this in more detail in the Appendix.

RankPL (26) uses ranking functions in place of numerical probability. It advertises support for causal inference, as a user can manually modify a program to change a variable definition. However, no language construct is provided to automate this process, which they call "intervention". Baral et al. (4) described a recipe to encode counterfactuals in P-log, a probabilistic logic programming language. There has also been work on adding causal operators to knowledge-based programming (11), answer-set programming (7), and logic programming (21). There are several libraries for causal inference on causal graphs (6; 27; 33; 3; 2). Whittemore (6) is an embedded Clojure DSL implementing the do-calculus (20). It can estimate the results of interventions, but not counterfactuals, from a dataset.

Ibeling and Icard (14) introduce computable structural equation models (SEMs) to support infinite variable spaces, and prove that an axiomatization of counterfactuals is sound and complete. OMEGAC similarly supports open-world models, but our approach is constructive rather than axiomatic – we provide primitives to construct and compute counterfactuals. Ness et al. (19) relate SEMs to Markov process models, which are naturally expressible in OMEGAC. They introduce a novel kind of intervention that finds a change to induce a target post-equilibrium value. A version of this is expressible within OMEGAC: first construct a distribution over interventions, then condition that distribution on the target post-equilibrium value occurring.

Alternative approaches.  Some languages have inbuilt mechanisms for reflection – the ability to introspect and dynamically execute code. Python, for instance, includes getsource(foo), which returns the source code of a function foo. By extracting the source code of a model, transforming it, and re-executing the result with eval, a system of interventions could be formulated. This could be a useful way to bring counterfactuals to existing languages such as Python which cannot support lazy evaluation.

While we have presented a minimal language here, OMEGAC is also implemented in Julia. Since Julia is not lazy, this implementation is less flexible than OMEGAC, suffering some of the limitations of Pyro. We detail this in the Appendix.

Invariants in counterfactuals.  An important property of counterfactual inference is that observations in the factual world carry over to the counterfactual world. This property is easy to satisfy in conventional causal graphs, as all exogenous and endogenous variables are created and accessed statically. However, this is not true in OMEGAC, as variable creation and access can be dynamic. Concretely, interventions can change the control flow of a program, which in turn can cause mismatches between variable accesses in the factual world and those in the counterfactual world. To address this issue, we tie variable identities to program structures. Appendix B discusses this in detail.

Limitations.  Procedures such as the PC algorithm (29) handle situations where a causal relationship exists, but nothing is known about the relationship other than that it is an arbitrary function. Like other probabilistic programming languages, OMEGAC cannot reason about such models.

In some cases the variable we want to intervene is internal to some function and not in scope at the point where we want to construct an intervention. In other cases, the value we want to intervene (e.g. (x + 2) in 2 * (x + 2)) is not bound to a variable at all. While it is always possible to manually modify the program to expose these inaccessible values, future work is to increase the expressiveness of OMEGAC to be able to automatically intervene in such cases. Since our formalism relies on variable binding, this would require an entirely different mechanism to what we have presented.

References

[1] Differentiable Path Tracing on the TPU. https://blog.evjang.com/2019/11/jaxpt.html. Accessed: 2021-01-01.

[2] ggdag. https://ggdag.malco.io/. Accessed: 2019-03-08.

[3] pgmpy. http://pgmpy.org/. Accessed: 2019-03-08.

[4] Chitta Baral and Matt Hunsaker. Using the Probabilistic Logic Programming Language P-log for Causal and Counterfactual Reasoning and Non-Naive Conditioning. In IJCAI 2007, Proceedings of the 20th International Joint Conference on Artificial Intelligence, Hyderabad, India, January 6-12, 2007, pages 243–249, 2007.

[5] Eli Bingham, Jonathan P. Chen, Martin Jankowiak, Fritz Obermeyer, Neeraj Pradhan, Theofanis Karaletsos, Rohit Singh, Paul Szerlip, Paul Horsfall, and Noah D. Goodman. Pyro: Deep Universal Probabilistic Programming. Journal of Machine Learning Research, 2018.

[6] Joshua Brulé. Whittemore: An Embedded Domain Specific Language for Causal Programming. arXiv preprint arXiv:1812.11918, 2018.

[7] Pedro Cabalar. Causal Logic Programming. In Correct Reasoning, pages 102–116. Springer, 2012.

[8] Juan Correa and Elias Bareinboim. A Calculus for Stochastic Interventions: Causal Effect Identification and Surrogate Experiments. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 10093–10100, 2020.

[9] Frederick Eberhardt and Richard Scheines. Interventions and Causal Inference. Philosophy of Science, 74(5):981–995, 2007.

[10] Matthias Felleisen. On the Expressive Power of Programming Languages. In ESOP'90, 3rd European Symposium on Programming, Copenhagen, Denmark, May 15-18, 1990, Proceedings, pages 134–151, 1990.

[11] Joseph Halpern and Yoram Moses. Using Counterfactuals in Knowledge-Based Programming. Volume 17, pages 97–110, July 1998.

[12] Joseph Y Halpern and Christopher Hitchcock. Actual Causation and the Art of Modeling. arXiv preprint arXiv:1106.2652, 2011.

[13] Daniel M Hausman, Herbert A Simon, et al. Causal Asymmetries. Cambridge University Press, 1998.

[14] Duligur Ibeling and Thomas Icard. On Open-Universe Causal Reasoning. In Uncertainty in Artificial Intelligence, pages 1233–1243. PMLR, 2020.

[15] Oleg Kiselyov, Chung-chieh Shan, and Amr Sabry. Delimited Dynamic Binding. In Proceedings of the 11th ACM SIGPLAN International Conference on Functional Programming, ICFP 2006, Portland, Oregon, USA, September 16-21, 2006, pages 26–37, 2006.

[16] Vikash Mansinghka, Daniel Selsam, and Yura Perov. Venture: A Higher-Order Probabilistic Programming Platform with Programmable Inference. arXiv preprint arXiv:1404.0099, 2014.

[17] Cindy Marling and Razvan C Bunescu. The OhioT1DM Dataset for Blood Glucose Level Prediction. In KHD@IJCAI, 2018.

[18] Robert Ness. Lecture Notes for Causality in Machine Learning, Section 9.6: "Bayesian Counterfactual Algorithm with SMCs in Pyro", 2019.

[19] Robert Osazuwa Ness, Kaushal Paneri, and Olga Vitek. Integrating Markov Processes with Structural Causal Modeling Enables Counterfactual Inference in Complex Systems. arXiv preprint arXiv:1911.02175, 2019.

[20] Judea Pearl. Causality. Cambridge University Press, 2009.

[21] Luís Moniz Pereira and Ari Saptawijaya. Agent Morality via Counterfactuals in Logic Programming. In Proceedings of the Workshop on Bridging the Gap between Human and Automated Reasoning, co-located with the 39th Annual Meeting of the Cognitive Science Society (CogSci 2017), London, UK, July 26, 2017, pages 39–53, 2017.

[22] Yura Perov, Logan Graham, Kostis Gourgoulias, Jonathan Richens, Ciaran Lee, Adam Baker, and Saurabh Johri. Multiverse: Causal Reasoning Using Importance Sampling in Probabilistic Programming. In Symposium on Advances in Approximate Bayesian Inference, pages 1–36. PMLR, 2020.

[23] Gordon D Plotkin. A Structural Approach to Operational Semantics. 1981.

[24] Martin L Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons, 2014.

[25] Jarrett Revels, Valentin Churavy, Tim Besard, Lyndon White, Twan Koolen, Mike J Innes, Nathan Daly, Rogerluo, Robin Deits, Morten Piibeleht, Moritz Schauer, Kristoffer Carlsson, Keno Fischer, and Chris de Graaf. jrevels/Cassette.jl: v0.3.3, April 2020.

[26] Tjitze Rienstra. RankPL: A Qualitative Probabilistic Programming Language. In European Conference on Symbolic and Quantitative Approaches to Reasoning and Uncertainty, pages 470–479. Springer, 2017.

[27] Amit Sharma and Emre Kiciman. DoWhy: Making Causal Inference Easy. https://github.com/Microsoft/dowhy, 2018.

[28] Yehonathan Sharvit. Lazy Sequences are not Compatible with Dynamic Scope. https://blog.klipse.tech/clojure/2018/12/25/dynamic-scope-clojure.html, 2018.

[29] Peter Spirtes, Clark N Glymour, Richard Scheines, David Heckerman, Christopher Meek, Gregory Cooper, and Thomas Richardson. Causation, Prediction, and Search. MIT Press, 2000.

[30] Zenna Tavares, Javier Burroni, Edgar Minaysan, Armando Solar-Lezama, and Rajesh Ranganath. Predicate Exchange: Inference with Declarative Knowledge. In International Conference on Machine Learning, 2019.

[31] Zenna Tavares, Xin Zhang, Javier Burroni, Edgar Minasyan, Rajesh Ranganath, and Armando Solar-Lezama. The Random Conditional Distribution for Higher-Order Probabilistic Inference. arXiv, 2019.

[32] Jin Tian and Judea Pearl. Causal Discovery from Changes. arXiv preprint arXiv:1301.2312, 2013.

[33] Santtu Tikka and Juha Karvanen. Identifying Causal Effects with the R Package causaleffect. Journal of Statistical Software, 76(1):1–30, 2017.

[34] David Tolpin, Jan-Willem van de Meent, and Frank Wood. Probabilistic Programming in Anglican. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pages 308–311. Springer, 2015.

The Appendix is structured as follows:

1. Section A presents the full operational semantics of λC.

2. Section B details issues around invariance that can occur in dynamic OMEGAC programs.

3. Section C provides a proof that our intervention operator is not a superficial syntactic change.

4. Section D provides more details on OMEGAC that go beyond the core calculus λC.

5. Section E provides a comparison between OMEGAC and Pyro, Multiverse, and the Julia implementation.

An interpreter for the core calculus can be found at the following anonymized repository:

https://anonymous.4open.science/r/46bc0fa9-0981-4131-8031-573d997adb3d

The Julia implementation, as well as all the code for the examples, can be found in the following anonymized repository:

https://anonymous.4open.science/r/85c827a4-f725-44c1-891d-3ce93e28f3b0

A. The Full Semantics of λC

In this section, we outline the full semantics of our core calculus, λC.

Variables  x, y, z ∈ Var
Type       τ ::= Int | Bool | Real | τ1 → τ2 | Ω
Term       t ::= n | b | r | ⊥ | x | λx : τ. t
               | if t1 then t2 else t3 | t1 ⊕ t2
               | t1(t2) | let x = t1 in t2
               | t1 | t2            (prob. terms)
               | t1 | do(x → t2)    (causal terms)
Query      rand(t)

Figure 11: Abstract Syntax for λC

Fig. 11 shows the abstract syntax for λC. n represents integer numbers, b are Boolean values in {True, False}, and r are real numbers. ⊕ represents a mathematical binary operator such as +, ∗, etc. We assume there is a countable set of variables Var = {ω, x, y, z, . . .}; x represents a member of this set. ⊥ represents the undefined value. Finally, there is a sample space Ω, which is left unspecified, save that it may be sampled from uniformly. In most applications, Ω will be a hypercube, with one dimension for each independent sample.

Overall, λC is a normal lambda calculus with booleans, but with three unique features: conditioning (on arbitrary predicates), intervention, and sampling. Together, these give counterfactual inference.

Closure  c ::= clo(Γ, t)
Env      Γ ∈ Var ⇀ Closure

Figure 12: Runtime environments of λC

Semantics  Fig. 13 gives the big-step operational semantics of λC. A λC expression e is evaluated in an environment Γ, which stores previously-defined random variables as closures. Fig. 12 defines closures and environments: a closure is a pair of an expression and an environment, while an environment is a partial map of variables to closures. The notation Γ, x ↦ c refers to some environment Γ extended with a mapping from x to c. The judgement Γ ⊢ t ⇓ v means that, in environment Γ, completely evaluating t results in v. We explain each rule in turn.

Integers, booleans, and real numbers are values in λC, and hence evaluate to themselves, as indicated by the INT, BOOL, and REAL rules. Evaluating a lambda expression captures the current environment and the lambda into a closure (LAMBDA rule). The BINOP rule evaluates the operands of a binary operator left-to-right and then computes the operation. The IFTRUE and IFFALSE rules are also completely standard, evaluating the condition to either True or False, and then running the appropriate branch.

The VAR rule is the first nonstandard rule, owing to the lazy evaluation. When a variable x is referenced, its defining closure clo(Γ′, e) is looked up. x's defining expression e is then evaluated in environment Γ′. Correspondingly, the LET rule binds a variable x to a closure containing its defining expression and the current environment. Note that the closure for x does not contain a binding for x itself, prohibiting recursive definitions. LET also has a side-condition prohibiting shadowing.

As an example of the LET and VAR rules, consider the term let x = 1 in let y = x + x in y + y. The LET rule first binds x to clo(∅, 1), where ∅ is the empty environment, and then binds y to clo({x ↦ clo(∅, 1)}, x + x). It finally evaluates y + y in the environment {x ↦ clo(∅, 1), y ↦ clo({x ↦ clo(∅, 1)}, x + x)}. Each reference to y is evaluated with the VAR rule, which evaluates x + x in the environment {x ↦ clo(∅, 1)}. Each such reference to x is again evaluated with the VAR rule, which evaluates 1 in the environment ∅. The overall computation results in the value 4.

We are now ready to introduce the DO rule, which lies at the core of λC. The term t1 | do(x → t2) evaluates t1

Int:     Γ ⊢ n ⇓ n
Bool:    Γ ⊢ b ⇓ b
Real:    Γ ⊢ r ⇓ r
Lambda:  Γ ⊢ λx : τ. t ⇓ clo(Γ, λx : τ. t)

Binop:
    Γ ⊢ t1 ⇓ v1    Γ ⊢ t2 ⇓ v2    v3 = v1 ⊕ v2
    -------------------------------------------
    Γ ⊢ t1 ⊕ t2 ⇓ v3

IfTrue:
    Γ ⊢ t1 ⇓ True    Γ ⊢ t2 ⇓ v
    -----------------------------
    Γ ⊢ if t1 then t2 else t3 ⇓ v

IfFalse:
    Γ ⊢ t1 ⇓ False    Γ ⊢ t3 ⇓ v
    -----------------------------
    Γ ⊢ if t1 then t2 else t3 ⇓ v

Var:
    Γ′ ⊢ e ⇓ v
    --------------------------
    Γ, x ↦ clo(Γ′, e) ⊢ x ⇓ v

App:
    Γ ⊢ t1 ⇓ clo(Γ′, λx : τ. t3)    Γ ⊢ t2 ⇓ v1    Γ′ ⊢ t3[v1/x] ⇓ v2
    ------------------------------------------------------------------
    Γ ⊢ t1(t2) ⇓ v2

Let:
    x ∉ dom(Γ)    Γ, x ↦ clo(Γ, t1) ⊢ t2 ⇓ v
    -----------------------------------------
    Γ ⊢ let x = t1 in t2 ⇓ v

Do:
    Γ′ = RetroUpd(Γ, x, clo(Γ, t2))    Γ′ ⊢ t1 ⇓ v
    -----------------------------------------------
    Γ ⊢ t1 | do(x → t2) ⇓ v

Rand:
    Γ ⊢ t(ω) ⇓ v
    ----------------
    Γ ⊢ rand(t) ⇓ v
    where ω is uniformly drawn from {ω ∈ Ω | Γ ⊬ t(ω) ⇓ ⊥}

Cond:
    Γ ⊢ λω : Ω. if t2(ω) then t1(ω) else ⊥ ⇓ v
    -------------------------------------------
    Γ ⊢ t1 | t2 ⇓ v

⊥Val:  Γ ⊢ ⊥ ⇓ ⊥

⊥Binop1:
    Γ ⊢ t1 ⇓ ⊥
    ----------------
    Γ ⊢ t1 ⊕ t2 ⇓ ⊥

⊥Binop2:
    Γ ⊢ t1 ⇓ v    Γ ⊢ t2 ⇓ ⊥
    --------------------------
    Γ ⊢ t1 ⊕ t2 ⇓ ⊥

⊥App1:
    Γ ⊢ t1 ⇓ ⊥
    ----------------
    Γ ⊢ t1(t2) ⇓ ⊥

⊥App2:
    Γ ⊢ t1 ⇓ v    Γ ⊢ t2 ⇓ ⊥
    --------------------------
    Γ ⊢ t1(t2) ⇓ ⊥

⊥If:
    Γ ⊢ t1 ⇓ ⊥
    -------------------------------
    Γ ⊢ if t1 then t2 else t3 ⇓ ⊥

Figure 13: Operational semantics for λC
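As an aside, the COND and RAND rules can be mirrored in ordinary code by treating random variables as functions of ω and ⊥ as None. The following sketch is our illustration, not part of the λC implementation; it uses the finite sample space Ω = {0, ..., 6} of the player-game example in this section, and checks the claimed counterfactual probability of 3/4 by enumerating the conditioned support:

```python
import random

OMEGA = range(7)  # Ω = {0, ..., 6}, the sample space of the game example

def cond(t1, t2):
    # COND: (t1 | t2) is t1 where the predicate t2 holds, ⊥ (None) elsewhere
    return lambda w: t1(w) if t2(w) else None

def rand(t, rng=random):
    # RAND: draw ω uniformly from the points where t does not evaluate to ⊥
    support = [w for w in OMEGA if t(w) is not None]
    return t(rng.choice(support))

# X with c = 1: the player wins (1) iff |ω - c| <= 1, and loses (-1) otherwise
X = lambda w: 1 if (w - 1) ** 2 <= 1 else -1
lost = lambda w: X(w) == -1

# Conditioned on losing, X is -1 on its whole support {3, 4, 5, 6}
assert rand(cond(X, lost)) == -1

# X | do(c → 4), unfolded by hand, conditioned on the factual loss
X4 = lambda w: 1 if (w - 4) ** 2 <= 1 else -1
cf = cond(X4, lost)
support = [w for w in OMEGA if cf(w) is not None]
assert sum(cf(w) == 1 for w in support) / len(support) == 0.75
```

Enumerating the conditioned support recovers the distribution the RAND rule samples from: 1 with probability 3/4 and −1 with probability 1/4.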

to the value that it would have taken had x been bound to t2 at its point of definition. It does this by creating a new environment Γ′, which rebinds x in all closures to t2. This Γ′ is created by the retroactive-update function RETROUPD (Fig. 14).

RetroUpd : Env × Var × Closure → Env

RetroUpd(Γ, x, c)(y) = c
    if y = x ∧ x ∈ dom(Γ)
RetroUpd(Γ, x, c)(y) = clo(RetroUpd(Γ′, x, c), t′)
    if y ≠ x ∧ (y ↦ clo(Γ′, t′)) ∈ Γ

Figure 14: The RETROUPD procedure

For example, consider the term let x = 1 in let y = x + x in (y + y | do(x → 2)). The first part of the computation is the same as in the previous example, and results in evaluating y + y | do(x → 2) in the environment Γ1 = {x ↦ clo(∅, 1), y ↦ clo({x ↦ clo(∅, 1)}, x + x)}. The DO rule recursively updates all bindings of x, and evaluates y + y in the environment {x ↦ clo(Γ1, 2), y ↦ clo({x ↦ clo(Γ1, 2)}, x + x)}. The computation results in the value 8.

APP is the standard application rule for a semantics with closures. Unlike the LET rule, it is strict, so that t1(t2) forces t2 to a value before invoking t1. This destroys the provenance of t2, meaning that it will be considered exogenous to the computation of t1, and unaffected by any do operators.
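The DO rule and RETROUPD can be prototyped in a few lines. The following sketch is ours, covering only a first-order fragment (literals, addition, let, variables, and do); it reproduces both worked examples from this section:

```python
# Environments map names to closures; a closure pairs an env with a term.
# Terms are tuples: ('lit', n), ('var', x), ('add', t1, t2),
# ('let', x, t1, t2), ('do', t1, x, t2).

def retro_upd(env, x, clo):
    """Rebind x to `clo` in env and, recursively, in every stored closure."""
    new = {}
    for y, (g, t) in env.items():
        if y == x:
            new[y] = clo
        else:
            new[y] = (retro_upd(g, x, clo), t)
    return new

def evaluate(env, t):
    tag = t[0]
    if tag == 'lit':
        return t[1]
    if tag == 'var':                  # VAR: evaluate defining term in defining env
        g, e = env[t[1]]
        return evaluate(g, e)
    if tag == 'add':
        return evaluate(env, t[1]) + evaluate(env, t[2])
    if tag == 'let':                  # LET: bind lazily, no evaluation yet
        _, x, t1, t2 = t
        return evaluate({**env, x: (dict(env), t1)}, t2)
    if tag == 'do':                   # DO: retroactively rebind x, then evaluate
        _, t1, x, t2 = t
        return evaluate(retro_upd(env, x, (dict(env), t2)), t1)
    raise ValueError(tag)

# let x = 1 in let y = x + x in y + y  ⇓  4
ex1 = ('let', 'x', ('lit', 1),
       ('let', 'y', ('add', ('var', 'x'), ('var', 'x')),
        ('add', ('var', 'y'), ('var', 'y'))))

# let x = 1 in let y = x + x in (y + y | do(x → 2))  ⇓  8
ex2 = ('let', 'x', ('lit', 1),
       ('let', 'y', ('add', ('var', 'x'), ('var', 'x')),
        ('do', ('add', ('var', 'y'), ('var', 'y')), 'x', ('lit', 2))))
```

Evaluating ex1 yields 4 and ex2 yields 8, matching the derivations above.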

The final rules concern randomness and conditioning. The special ⊥ value indicates an undefined value, and any term which strictly depends on ⊥ is also ⊥, as indicated by ⊥VAL, ⊥BINOP1, and similar rules. Conditioning one random variable t1 on another random variable t2 is then defined via the COND rule as a new random variable which is t1 when t2 is true, and ⊥ otherwise. Finally, the RAND rule samples from a random variable by evaluating it on a random point in the sample space Ω.

As our final example in this section, we show how to combine the RAND, COND, and DO rules to evaluate a counterfactual. This program depicts a game where a player chooses a number c, and then a number ω is drawn randomly from a sample space Ω = {0, 1, ..., 6}, and the player wins iff c is within 1 of ω. The query asks: given that the player chose 1 and did not win, what would have happened had the player chosen 4? Consider the following program:

let c = 1 in
let X = λω . if (ω-c)*(ω-c) <= 1 then 1 else -1 in
rand((X | do(c → 4)) | λω . X(ω) == -1)

As before, the LET rule causes the inner rand expression to evaluate in the context Γ1 = {c ↦ clo(∅, 1), X ↦ clo({c ↦ clo(∅, 1)}, λω. if ...)}. The COND rule will essentially replace the argument to rand with λω′. if X(ω′) == −1 then (X | do(c → 4))(ω′) else ⊥. This random variable evaluates to ⊥ for ω′ ∈ {0, 1, 2}, so the RAND rule evaluates it with ω′ drawn uniformly from {3, 4, 5, 6}. The do expression evaluates to clo({c ↦ clo(Γ1, 4)}, λω. if ...). This is then applied to ω′, and the overall computation hence evaluates to 1 with probability 3/4 and −1 with probability 1/4.

B. Invariants in Counterfactuals

In an expression ω[e], e is an ordinary expression. Interventions may change e, and hence ω[e], in unexpected ways. This can lead to undesirable results for counterfactual queries. For example, take the following program, which centers on a function digits which computes a random n-digit base-10 number:

let
  digits = λω, d .
    if d == 0
    then 0
    else floor(10*ω[d]) + 10*digits(ω, d-1),
  n = 5,
  f = λω . digits(ω, n) * ω[n+1]
in rand((f | do(n → 4)) | λω . f(ω) < 10)

The function f is a random variable over 5-digit numbers whose digits are based on ω[1], . . . , ω[5], and which then scales them by ω[6]. The counterfactual query asks what the corresponding scaled 4-digit number would be, given that the scaled 5-digit number is less than 10. The user likely desired that this counterfactual be a 4-digit number whose digits are based on ω[1], . . . , ω[4], scaled by the same factor, ω[6]. In fact, the counterfactual execution will scale by ω[5], which was not intended to be a scale factor. The factual and counterfactual executions both use the same value ω[5], but at completely different points in the program!

The conceptual problem is that what the value ω[5] is intended to represent differs between the original and intervened models. This occurred because the intervention changed the control flow of the program; it does not occur in static models (where the number of variables is fixed).

Following this intuition, OMEGAC provides a macro uid for indexing ω, which is processed at compile time. The implementation keeps a separate counter for each program point, and uses the counter and program point to compute a unique index to access ω. In addition, since a function can be used to define different random variables, it resets the counter whenever it starts sampling a new random variable. Consider the following program:

let uniform = λ a . λ b . λω . (a-b)*ω[uid]+b
in rand(uniform(1,2))

Conceptually, it is translated into:

let uniform = λ a . λ b . λω .
    push_counters();
    (a-b)*ω[h(#pc, get_and_increase(#pc))]+b;
    pop_counters()
in rand(uniform(1,2))

The language runtime maintains a stack of maps from program points to counters. Whenever a random variable is sampled from, the built-in function push_counters is invoked to push a map of new counters onto the stack. And when the sampling finishes, the built-in function pop_counters is invoked to pop the map. The macro #pc returns the current program point. The built-in function get_and_increase returns the counter corresponding to the current program point and increases it by one. The hash function h returns a unique number for every pair of numbers. Note that now an exogenous variable is identified by the program point where it is accessed and the counter corresponding to this program point.

C. The do Operator is Foundational

In this section, we connect the do operator with foundational research in programming languages.

C.1. Lazy Dynamic Scope

As we developed the semantics of do, we realized that it was far too elegant to not already exist. After much reflection, we realized that it is an unknown variant of a well-studied language construct.

Dynamic scope refers to variables referenced in a function which may be bound to a different value each time the function is called. It is best known as the default semantics of variables in Emacs Lisp, and has also been used to model system environment variables such as $PATH (15).

All known uses of dynamic scope are strict, and, in the only reference to lazy dynamic scope we found, a blogger writes that laziness and dynamic scope are "not compatible" due to its surprising behavior (28). But, as we have shown, lazy dynamic scope is not unpredictable. It expresses counterfactuals.

C.2. Why do is not syntactic sugar

In his influential thesis work, Felleisen (10) addressed the question of when a language construct is mere "syntactic sugar," vs. when it increases a language's power. In this, he provided the notions of expressibility and macro-expressibility. A language construct F is expressible in terms of the rest of the language if the minimal subprograms containing F can be rewritten to not use F while preserving program semantics. Macro-expressibility further stipulates that these rewrites must be local.

With these, he also provided an ingeniously simple proof technique: a construct is not macro-expressible if there are two expressions which are indistinguishable without the language construct (i.e. they run the same when embedded into any larger program), but distinguishable with it.

In the following theorem, we prove that we cannot implement the do operator as syntactic sugar (i.e., a macro) in the original OMEGA language. From our literature search, this is also the first time any variant of dynamic scope has been proven not macro-expressible in a language without dynamic scope.

Theorem 1. The do operator is not macro-expressible in λC without do.

Proof. According to the proof technique of Felleisen (10), to show do is not macro-expressible in λC without do, it suffices to find two expressions P and P′ such that, for any evaluation context C in λC without do, C[P] = C[P′], but such that there is an evaluation context C in λC with do such that C[P] ≠ C[P′].

Let P = λf.λx.(f 0), and P′ = λf.(λa.λx.a)(f 0).

Note that all constructs of λC except do and rand are macro-expressible in terms of the pure lambda calculus. After fixing a random seed, rand is also deterministic. Hence, with a fixed seed, λC without do respects beta equivalence. Hence, since P ≡β P′, for any context C which does not contain do, C[P] = C[P′].

Now pick:

C[e] = ((λg. g 0 | do(p → 1))(e(λx.p))) | do(p → 0)

Then C[P] ⇓ 1, but C[P′] ⇓ 0.

D. OMEGAC Details

D.1. Ids and independent random variables

OMEGAC includes a ∼ operator, allowing us to construct a copy of a random variable that is independent but identically distributed. For example, if we have a standard Bernoulli distribution flip = λω . ω[1] > 0.5, then flip2 = 2 ~ flip will be i.i.d. with flip. Moreover, we can avoid specifying the id manually and simply write flip2 = ~flip.

To describe this, first we represent ω values as functions from an integer id to a value in the unit interval, and hence ω[i] is simply a syntactic convenience for the function application ω(i). The operator ∼ is then a binary function mapping an id and a random variable X to a new random variable that is i.i.d.:

-- Constructs the idth i.i.d. copy of X
~ = λ id, X . λ ω . X(project(ω, id))

It works by projecting any ω that is input to X onto a new space prior to applying X to it. A projection of ω onto id is a new value ω′ such that ω′(i) = ω(j), where j is a unique combination of i and id.

project = λ ω, id1 . λ id2 . ω(pair(id1, id2))

Such a unique combination can be constructed using a pairing function pair, which is a bijection from N × N to N. Many pairing functions exist; below we define Cantor's pairing function:

pair = λ k1, k2 . 1/2 (k1+k2)(k1+k2+1) + k2

If an id is not explicitly provided, as in flip2 = ~flip, it is automatically generated using the macro uid as described above.

D.2. Nesting

OMEGAC allows us to express flattened versions of nested let, do, and λ expressions.

In the case of let, this means that the following OMEGAC code:

let A = a,
    B = b,
    C = c
in t

is equivalent to the following λC code:

let A = a in
(let B = b in
(let C = c in t))

In the case of do, this means that the following OMEGAC code:

Y | do(A → a, B → b, C → c)

is equivalent to the λC code:

((Y | do(A → a)) | do(B → b)) | do(C → c)

Note that do is not commutative: the first intervention is applied first to produce a new variable, upon which the second intervention is applied. Reversing the order would not in general produce the same result if the interventions affect overlapping variables.

In the case of λ, this means that the OMEGAC code:

λ a b c . a + b + c

is equivalent to the λC code:

λ a . λ b . λ c . a + b + c

E. Comparison With Other Languages

E.1. Multiverse

Below is an example taken from the Multiverse (22) documentation, translated into OMEGAC:

let
    x = bern(0.0001),
    z = bern(0.001),
    x_or_z = x or z,
    y = if bern(0.00001) then not x_or_z else x_or_z
in (y | do(x → 0)) | y == 1

The corresponding Multiverse code is:

def cfmodel():
    x = BernoulliERP(prob=0.0001, proposal_prob=0.1)
    z = BernoulliERP(prob=0.001, proposal_prob=0.1)
    y = ObservableBernoulliERP(
        input_val=x.value or z.value,
        noise_flip_prob=0.00001,
        depends_on=[x, z]
    )
    observe(y, 1)
    do(x, 0)
    predict(y.value)

results = run_inference(cfmodel, 10000)

E.2. Pyro

Pyro is a popular Python-based probabilistic programming language, with some support for causal queries. Below is the rifleman example taken from Pearl, expressed in both OMEGAC and Pyro.

In OMEGAC:

let p = 0.7,
    q = 0.3,
    Order = ~ bern(p),
    Anerves = ~ bern(q),
    Ashoots = Order or Anerves,
    Bshoots = Order,
    Dead = Ashoots or Bshoots,
    Dead_cf = (Dead | do(Ashoots → 0)) | Dead
in rand(Dead_cf)

In Pyro:

p = 0.7
q = 0.3
exogenous_dists = {
    "order": Bernoulli(torch.tensor(p)),
    "Anerves": Bernoulli(torch.tensor(q))
}

def rifleman(exogenous_dists):
    order = pyro.sample("order", exogenous_dists["order"])
    Anerves = pyro.sample("Anerves", exogenous_dists["Anerves"])
    Ashoots = torch.logical_or(order, Anerves)
    Bshoots = order
    dead_ = torch.logical_or(Ashoots, Bshoots)
    dead = pyro.sample("dead", dist.Delta(dead_))
    return {"order": order,
            "Anerves": Anerves,
            "Ashoots": Ashoots,
            "Bshoots": Bshoots,
            "dead": dead}

cond = condition(rifleman, data={"dead": torch.tensor(1.0)})
posterior = Importance(cond, num_samples=100).run(exogenous_dists)
order_marginal = EmpiricalMarginal(posterior, "order")

order_samples = [order_marginal().item()
                 for _ in range(1000)]

Anerves_marginal = EmpiricalMarginal(posterior, "Anerves")
Anerves_samples = [Anerves_marginal().item()
                   for _ in range(1000)]

cf_model = pyro.do(rifleman, {'Ashoots': torch.tensor(0.)})
updated_exogenous_dists = {
    "order": dist.Bernoulli(
        torch.tensor(mean(order_samples))),
    "Anerves": dist.Bernoulli(
        torch.tensor(mean(Anerves_samples)))
}
samples = [cf_model(updated_exogenous_dists)
           for _ in range(100)]
b_samples = [float(b["dead"]) for b in samples]
print("CF prob death is", mean(b_samples))

In short, this example samples from the posterior of the exogenous variables, then constructs a new model where (i) these exogenous variables take their posterior values, and (ii) the model structure has been changed through an intervention.

E.3. Differences

The main differences are:

1. Pyro performs sampling to construct a posterior, then intervenes, and then resimulates. That is, it computes the (approximate) posterior at an intermediate state. In contrast, in OMEGAC one constructs a counterfactual generative model, which is itself a first-class random variable. One then later performs inference (such as sampling) on that model. The disadvantage of the Pyro approach is that it ties an inference procedure to the definition of a counterfactual. A practical limitation from this is that composite queries, such as intervening both the counterfactual and factual world, become problematic. It is not clear how this could be expressed in Pyro, and even if possible it would involve performing inference twice, and the accumulation of approximation errors that would entail.

2. Multiverse is built around an importance sampling approach, whereas OMEGAC is entirely agnostic to the probabilistic inference procedure used.

3. Multiverse specifies the interventions and conditioning through imperative operations. As shown in the example above, observe(y, 1); do(x, 0) first observes y and then intervenes x. It is not clear, without a semantics or some other guide, whether other compositions are expressible and sound, such as conditioning the intervened world (or both), intervening a value to be a function of its non-intervened (and posterior) self, or stochastic interventions.

4. Although it can be emulated as shown in the example above, Pyro does not enforce the exogenous/endogenous divide. In Pyro, the primitives are actually distribution families, such as Normal(µ, σ), whereas in OMEGAC the primitives are parameterless exogenous variables, and distribution families are transformations of these primitives. This is important because it allows OMEGAC to give meaning to an expression such as let μ = 0, X = normal(μ,1) in X | do(μ → 2), because the process by which X is generated is fully specified. This would not make sense in Pyro, where families are primitives.

5. Pyro and Multiverse can only intervene named random variables, whereas OMEGAC can intervene any variable bound to a value. More fundamentally, intervening in OMEGAC is not a probabilistic construct at all.

6. As a cosmetic (but important for practical usage) matter, since Pyro was not designed from the ground up for counterfactual reasoning, it is very cumbersome and verbose to do. If one advances to more advanced queries, this is only exacerbated. Multiverse is less verbose, but requires that you explicitly specify what variables every variable depends on (see depends_on in the example), whereas those dependencies are automatic in an OMEGAC program.

E.4. OMEGAC vs Julia implementation

The Julia implementation follows the basic structure of OMEGAC. That is, in Julia:

• Random variables are pure functions of Ω; that is, any value f of type T is a random variable if f(ω::Ω) is defined.

• Conditioning is performed through a function cond, which maps one random variable into another one which is conditioned. It is defined as:

cond(x, y) = ω -> y(ω) ? x(ω) : error()

• Interventions are performed by an operation intervene, which maps one random variable into one which is intervened.

However, as mentioned in the introduction, conventional programming languages do not provide a mechanism to redefine program variables retroactively, making it difficult to implement intervene. To circumvent this, we take advantage of recent developments in Julia which permit users to write dynamic compiler transformations (25), which enable us to perform certain kinds of non-standard execution. To demonstrate, consider the following example:

c = 5
X(ω) = ~ unif(ω)
Y(ω) = X(ω) + c
Y = intervene(Y, X, 10)

Here, Y is a random variable. Using Julia's dynamic compiler transformations, we are able to modify the standard interpretation of Y(ω) to intercept the application X(ω) within Y, such that it instead returns the constant 10. Hence Y will be a constant random variable that always returns 15.

Our Julia implementation shares two key properties with OMEGAC: (i) random variables are pure functions on a single probability space, and (ii) the conditioning and intervention operators are higher-order transformations between variables. This allows us to preserve many of the important advantages of OMEGAC, such as the ability to systematically compose different operators to construct the wide diversity of causal questions outlined in Section 3. The main limitation of this approach is that only random variables can be intervened, unlike any bound variable in OMEGAC. If we want a variable to be intervened, such as the constant c in the above example, we must explicitly construct a constant random variable c(ω) = 5.
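The function-based design above can be sketched in Python (our illustration, not the authors' Julia code). Here `cond` mirrors the definition given for the Julia implementation, while `intervene` is only emulated by explicit substitution: the real implementation performs this substitution with a compiler transformation (25), and the builder-function `Y_of` is a hypothetical device of this sketch:

```python
import random

# Random variables are pure functions of ω. Here ω is a dict of
# independent uniform draws, keyed by integer id (an assumed representation).

def unif(i):
    return lambda w: w[i]

def cond(x, y):
    # Mirrors cond(x, y) = ω -> y(ω) ? x(ω) : error()
    def conditioned(w):
        if not y(w):
            raise ValueError("condition violated")
        return x(w)
    return conditioned

def intervene(y_builder, replacement):
    # Emulates `intervene` by rebuilding Y with X replaced; the Julia
    # implementation instead intercepts X(ω) inside Y at execution time.
    return y_builder(replacement)

c = 5
X = unif(0)
Y_of = lambda X: (lambda w: X(w) + c)   # Y(ω) = X(ω) + c, parametric in X
Y = Y_of(X)
Y2 = intervene(Y_of, lambda w: 10)      # X intervened to the constant 10

w = {0: random.random()}
assert Y2(w) == 15                      # Y2 is constant: 10 + c
```

As in the Julia example, the intervened variable is constant at 15 regardless of ω, while the original Y still varies with the draw at id 0.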