Multiple Causal Inference with Latent Confounding
Total Page:16
File Type:pdf, Size:1020Kb
Multiple Causal Inference with Latent Confounding Rajesh Ranganath 1 Adler Perotte 2 Abstract from observational data. Causal inference from obser- Causal inference from observational data re- vational data requires assumptions. These assumptions quires assumptions. These assumptions range include measurement of all confounders (Rosenbaum and from measuring confounders to identifying in- Rubin, 1983), the presence of external randomness that struments. Traditionally, causal inference as- partially controls treatment (Angrist et al., 1996), and sumptions have focused on estimation of ef- structural assumptions on the randomness (Hoyer et al., fects for a single treatment. In this work, we 2009). construct techniques for estimation with mul- tiple treatments in the presence of unobserved Though causal inference from observational data has been confounding. We develop two assumptions used in many domains, the assumptions that underlie based on shared confounding between treat- these inferences focus on estimation of a causal effect ments and independence of treatments given with a single treatment. But many real-world applications the confounder. Together, these assumptions consist of multiple treatments. For example, the effects lead to a confounder estimator regularized by of genetic variation on various phenotypes (Consortium mutual information. For this estimator, we de- et al., 2007) or the effects of medications from order sets velop a tractable lower bound. To recover treat- in clinical medicine (O’connor et al., 2009) consist of ment effects, we use the residual information in causal problems with multiple treatments rather than a the treatments independent of the confounder. single treatment. Considering multiple treatments make We validate on simulations and an example from new kinds of assumptions possible. clinical medicine. We formalize multiple causal inference, a collection of causal inference problems with multiple treatments and 1. Introduction a single outcome. We develop a set of assumptions under which causal effects can be estimated when confounders Causal inference aims to estimate the effect one variable are unmeasured. Two assumptions form the starting has on another. Causal inferences form the heart of in- point: that the treatments share confounders, and that quiry in many domains, including estimating the value of given the shared confounder, all of the treatments are giving a medication to a patient, understanding the influ- independent. This kind of shared confounding structure ence of genetic variations on phenotypes, and measuring can be found in many domains such as genetics. the impact of job training programs on income. One class of estimators for unobserved confounders take Assumption-free causal inferences rely on randomized in treatments and output a noisy estimate of the unmea- experimentation (Cook et al., 2002; Pearl et al., 2009). sured confounders. Estimators for multiple causal infer- Randomized experiments break the relationship between arXiv:1805.08273v3 [stat.ML] 1 Mar 2019 ence should respect the assumptions of the problem. To the intervention variable (the treatment) and variables respect shared confounding, the information between that could alter both the treatment and the outcome— the confounder and a treatment given the rest of the confounders. Though powerful, randomized experimen- treatments should be minimal. However, forcing this in- tation fails to make use of large collections of non- formation to zero makes the confounder independent randomized observational data (like electronic health of the treatments. This can violate the assumption of records in medicine) and is inapplicable where broad independence given the shared confounder. This ten- experimentation is infeasible (like in human genetics). sion parallels that between underfitting and overfitting. The counterpart to experimentation is causal inference Confounders with low information underfit, while con- 1New York University 2Columbia University. Correspondence founders with high information memorize the treatments and overfit. to: Rajesh Ranganath <[email protected]>, Adler Perotte <[email protected]>. To resolve the tension between the two assumptions, we develop a regularizer based on the additional mutual in- Multiple Causal Inference with Latent Confounding formation each treatment contributes to the estimated et al.(2017) develop estimators with theoretical guaran- confounder given the rest of the treatments. We develop tees by building representations that penalize differences an algorithm that estimates the confounder by simulta- in confounder distributions between the treated and un- neously minimizing the reconstruction error of the treat- treated. ments, while regularizing the additional mutual informa- The above approaches focus on building estimators for tion. The stochastic confounder estimator can include single treatments, where either the confounder or a non- complex nonlinear transformations of the treatments. In treatment proxy is measured. In contrast, computational practice, we use neural networks. The additional mutual genetics has developed a variety of methods to control information is intractable, so we build a lower bound for unmeasured confounding in genome-wide associa- called the multiple causal lower bound (MCLBO). tion studies (GWAS). Genome-wide association studies The last step in building a causal estimator is to build the have multiple treatments in the form of genetic variations outcome model. Traditional outcome models regress the across multiple sites. Yu et al.(2006); Kang et al.(2010); confounders and treatments to the outcome (Morgan and Lippert et al.(2011) estimate kinship matrices between Winship, 2014). However, since the confounder estimate individuals using a subset of the genetic variations, then is a stochastic function of the treatments, it contains no fit a LMM where the kinship provides the covariance for new information about the response over the treatments— random effects. Song et al.(2015) adjust for confounding a regression on both the estimated confounder and treat- via factor analysis on discrete variables and use inverse ments can ignore the estimated confounder. Instead, we regression to estimate individual treatment effects. Tran build regression models using the residual information and Blei(2017) build implicit models for genome-wide in the treatments and develop an estimator to compute association studies and describe general implicit causal these residuals. We call the entire causal estimation pro- models in the same vein as Kocaoglu et al.(2017). A cess multiple causal estimation via information (MCEI). formulation of multiple causal inference was also pro- Under technical conditions, we show that the causal esti- posed by (Wang and Blei, 2018); they take a model-based mates converge to the true causal effects as the number approach in the potential outcomes framework that lever- of treatments and examples grow. The assumptions we ages predictive checks. develop strengthen the foundation for existing causal es- timation with unobserved confounders such as causal Our grounding for multiple causal inference complements this earlier work. We develop the two assumptions of estimation with linear mixed models (LMMs) (Kang et al., 2010; Lippert et al., 2011). shared confounding and of independence given shared confounders. We introduce a mutual information based We demonstrate MCEI on a large simulation study. Though regularizer that trades off between these assumptions. traditional methods like principal component analysis Earlier work estimates confounders by choosing their di- (PCA) adjustment (Yu et al., 2006) closely approximate mensionality (e.g., number of PCA components) to not the family of techniques we describe, we find that our overfit. This matches the flavor of the estimator we de- approach more accurately estimates the causal effects, velop. Lastly, we describe how residual information must even when the confounder dimensionality is misspecified. be used to fit complex outcome models. Finally, we apply the MCLBO to control for confounders in a medical prediction problem on health records from the 2. Multiple Causal Inference Multiparameter Intelligent Monitoring in Intensive Care (MIMIC III) clinical database (Johnson et al., 2016). We The trouble with causal inference from observational data show the recovered effects match the literature. lies in confounders, variables that affect both treatments and outcome. The problem is that the observed statistical Related Work. Causal inference has a long history in relationship between the treatment and outcome may many disciplines including statistics, computer science, be partially or completely due to the confounder. Ran- and econometrics. A full review is outside of the scope domizing the treatment breaks the relationship between of this article, however, we highlight some of recent ad- the treatment and confounder, rendering the observed vances in building flexible causal models. Wager and statistical relationship causal. But the lack of randomized Athey(2017) develop random forests to capture variabil- data necessitates assumptions to control for potential con- ity in treatment effects (Wager and Athey, 2017). Hill founders. These assumptions have focused on causal (2011) uses Bayesian nonparametric methods to model estimation with a single treatment and a single outcome. the outcome response. Louizos et al.(2017) build flexible In real-world settings such as in genetics and medicine,