Matching on the Estimated Propensity Score


Econometrica, Vol. 84, No. 2 (March, 2016), 781–807

NOTES AND COMMENTS

MATCHING ON THE ESTIMATED PROPENSITY SCORE

By Alberto Abadie and Guido W. Imbens¹

Propensity score matching estimators (Rosenbaum and Rubin (1983)) are widely used in evaluation research to estimate average treatment effects. In this article, we derive the large sample distribution of propensity score matching estimators. Our derivations take into account that the propensity score is itself estimated in a first step, prior to matching. We prove that first step estimation of the propensity score affects the large sample distribution of propensity score matching estimators, and derive adjustments to the large sample variances of propensity score matching estimators of the average treatment effect (ATE) and the average treatment effect on the treated (ATET). The adjustment for the ATE estimator is negative (or zero in some special cases), implying that matching on the estimated propensity score is more efficient than matching on the true propensity score in large samples. However, for the ATET estimator, the sign of the adjustment term depends on the data generating process, and ignoring the estimation error in the propensity score may lead to confidence intervals that are either too large or too small.

KEYWORDS: Matching estimators, propensity score matching, average treatment effects, causal inference, program evaluation.

1. INTRODUCTION

PROPENSITY SCORE MATCHING ESTIMATORS (Rosenbaum and Rubin (1983))² are widely used to estimate treatment effects. Rosenbaum and Rubin (1983) defined the propensity score as the conditional probability of assignment to a treatment given a vector of covariates. Suppose that adjusting for a set of covariates is sufficient to eliminate confounding. The key insight of Rosenbaum and Rubin (1983) is that adjusting only for the propensity score is also sufficient to eliminate confounding. Relative to matching directly on the covariates, propensity score matching has the advantage of reducing the dimensionality of matching to a single dimension. This greatly facilitates the matching process because units with dissimilar covariate values may nevertheless have similar values for their propensity scores.

¹We are grateful to the editor and three referees for helpful comments, to Ben Hansen, Judith Lok, James Robins, Paul Rosenbaum, Donald Rubin, and participants in many seminars for comments and discussions, and to Jann Spiess for expert research assistance. Financial support by the NSF through Grants SES 0820361 and SES 0961707 is gratefully acknowledged.

²Following the terminology in Abadie and Imbens (2006), the term “matching estimator” is reserved in this article for estimators that match each unit (or each unit of some sample subset, e.g., the treated) to a small number of units with similar characteristics in the opposite treatment arm. Thus, our discussion does not refer to regression imputation methods, like the kernel matching method of Heckman, Ichimura, and Todd (1998), which use a large number of matches per unit and nonparametric smoothing techniques to consistently estimate unit-level regression values under counterfactual treatment assignments. See Hahn (1998), Heckman, Ichimura, and Todd (1998), Imbens (2004), and Imbens and Wooldridge (2009) for a discussion of such estimators.

© 2016 The Econometric Society. DOI: 10.3982/ECTA11293
In observational studies, propensity scores are not known, so they have to be estimated prior to matching. In spite of the great popularity that propensity score matching methods have enjoyed since they were proposed by Rosenbaum and Rubin in 1983, their large sample distribution has not yet been derived for the case when the propensity score is estimated in a first step.³ A possible reason for this void in the literature is that matching estimators are non-smooth functionals of the distribution of the matching variables, which makes it difficult to establish an asymptotic approximation to the distribution of matching estimators when a matching variable is estimated in a first step. This has motivated the use of bootstrap standard errors for propensity score matching estimators. However, recently it has been shown that the bootstrap is not, in general, valid for matching estimators (Abadie and Imbens (2008)).⁴

In this article, we derive large sample approximations to the distribution of propensity score matching estimators. Our derivations take into account that the propensity score is itself estimated in a first step. We show that propensity score matching estimators have approximately Normal distributions in large samples. We demonstrate that first step estimation of the propensity score affects the large sample distribution of propensity score matching estimators, and derive adjustments to the large sample variance of propensity score matching estimators that correct for first step estimation of the propensity score. We do this for estimators of the average treatment effect (ATE) and the average treatment effect on the treated (ATET). The adjustment for the ATE estimator is negative (or zero in some special cases), implying that matching on the estimated propensity score is more efficient than matching on the true propensity score in large samples. As a result, treating the estimated propensity score as if it were the true propensity score for estimating the variance of the ATE estimator leads to conservative confidence intervals. However, for the ATET estimator, the sign of the adjustment depends on the data generating process, and ignoring the estimation error in the propensity score may lead to confidence intervals that are either too large or too small.

³Influential papers using matching on the estimated propensity score include Heckman, Ichimura, and Todd (1997), Dehejia and Wahba (1999), and Smith and Todd (2005).

⁴In contexts other than matching, Heckman, Ichimura, and Todd (1998), Hirano, Imbens, and Ridder (2003), Abadie (2005), Wooldridge (2007), and Angrist and Kuersteiner (2011) derived large sample properties of statistics based on a first step estimator of the propensity score. In all these cases, the second step statistics are smooth functionals of the propensity scores and, therefore, standard stochastic expansions for two-step estimators apply (see, e.g., Newey and McFadden (1994)).

2. MATCHING ESTIMATORS

The setup in this article is a standard one in the program evaluation literature, where the focus of the analysis is often the effect of a binary treatment, represented in this paper by the indicator variable W, on some outcome variable, Y. More specifically, W = 1 indicates exposure to the treatment, while W = 0 indicates lack of exposure to the treatment. Following Rubin (1974), we define treatment effects in terms of potential outcomes.
We define Y(1) as the potential outcome under exposure to treatment, and Y(0) as the potential outcome under no exposure to treatment. Our goal is to estimate the average treatment effect,

    τ = E[Y(1) − Y(0)],

where the expectation is taken over the population of interest. Alternatively, the goal may be estimation of the average effect for the treated,

    τ_t = E[Y(1) − Y(0) | W = 1].

Estimation of these average treatment effects is complicated by the fact that for each unit in the population, we observe at most one of the potential outcomes:

    Y = Y(0) if W = 0,
        Y(1) if W = 1.

Let X be a vector of covariates of dimension k. The propensity score is p(X) = Pr(W = 1 | X), and p* = Pr(W = 1) is the probability of being treated. The following assumption is often referred to as “strong ignorability” (Rosenbaum and Rubin (1983)). It means that adjusting for X is sufficient to eliminate all confounding.

ASSUMPTION 1: (i) (Y(1), Y(0)) ⊥⊥ W | X almost surely; (ii) p̲ ≤ p(X) ≤ p̄ almost surely, for some p̲ > 0 and p̄ < 1.

Assumption 1(i) uses the conditional independence notation in Dawid (1979). This assumption is often referred to as “unconfoundedness.” It will hold, for example, if all confounders are included in X, so that after controlling for X, treatment exposure is independent of the potential outcomes. Hahn (1998) derived asymptotic variance bounds and studied asymptotically efficient estimation under Assumption 1(i). Assumption 1(ii) implies that, for almost all values of X, the population includes treated and untreated units. Moreover, Assumption 1(ii) bounds the values of the propensity score away from zero and one. Khan and Tamer (2010) have shown that this condition is necessary for root-N consistent estimation of the average treatment effect.

Let μ(w, x) = E[Y | W = w, X = x] and σ²(w, x) = var(Y | W = w, X = x) be the conditional mean and variance of Y given W = w and X = x. Similarly, let μ̄(w, p) = E[Y | W = w, p(X) = p] and σ̄²(w, p) = var(Y | W = w, p(X) = p) be the conditional mean and variance of Y given W = w and p(X) = p. Under Assumption 1,

    τ = E[μ(1, X) − μ(0, X)]   and   τ_t = E[μ(1, X) − μ(0, X) | W = 1]

(see Rubin (1974)). Therefore, adjusting for differences in the distribution of X between treated and nontreated removes all confounding and, therefore, allows identification of the ATE and ATET. Rosenbaum and Rubin (1983) proved that W and X are independent conditional on the propensity score, p(X), which implies that under Assumption 1:

    τ = E[μ̄(1, p(X)) − μ̄(0, p(X))]   and   τ_t = E[μ̄(1, p(X)) − μ̄(0, p(X)) | W = 1].

In other words, under Assumption 1, adjusting for the propensity score only is enough to remove all confounding. This result motivates the use of propensity score matching estimators. A propensity score matching estimator for the average treatment effect can be defined as

    τ̂* = (1/N) ∑_{i=1}^{N} (2W_i − 1) (Y_i − (1/M) ∑_{j ∈ J_M(i)} Y_j),

where M is a fixed number of matches per unit and J_M(i) is the set of matches for unit i.⁵
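To fix ideas, here is a minimal sketch of the two steps just described: a logit first step for p(X), followed by matching each unit to its M nearest neighbors (with replacement) in the opposite treatment arm on the estimated score. The logit specification, the scikit-learn calls, and the simulated data are illustrative assumptions, not the paper's construction; indeed, the paper's point is that this first step changes the estimator's large sample variance.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.neighbors import NearestNeighbors

    def psm_ate(Y, W, X, M=1):
        """ATE by matching on an estimated propensity score: each unit is
        matched to its M nearest neighbors (with replacement) in the
        opposite treatment arm, by distance on the estimated score."""
        # First step: estimate p(X) = Pr(W = 1 | X), here by logit.
        pscore = LogisticRegression().fit(X, W).predict_proba(X)[:, 1]
        p = pscore.reshape(-1, 1)
        effects = np.empty(len(Y))
        for w in (0, 1):
            this, other = W == w, W != w
            nn = NearestNeighbors(n_neighbors=M).fit(p[other])
            _, idx = nn.kneighbors(p[this])           # M matches per unit
            imputed = Y[other][idx].mean(axis=1)      # matched-outcome average
            # (2W_i - 1)(Y_i - imputed): the sign flips for untreated units.
            effects[this] = (2 * w - 1) * (Y[this] - imputed)
        return effects.mean()

    rng = np.random.default_rng(0)
    N = 2000
    X = rng.normal(size=(N, 3))
    W = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))   # treatment assignment
    Y = 1.0 * W + X.sum(axis=1) + rng.normal(size=N)  # true ATE = 1
    print(psm_ate(Y, W, X, M=4))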
Recommended publications
  • Statistical Matching: a Paradigm for Assessing the Uncertainty in the Procedure
    Journal of Official Statistics, Vol. 17, No. 3, 2001, pp. 407–422. Statistical Matching: A Paradigm for Assessing the Uncertainty in the Procedure. Chris Moriarity and Fritz Scheuren. Statistical matching has been widely used by practitioners without always adequate theoretical underpinnings. The work of Kadane (1978) has been a notable exception and the present article extends his insights. Kadane's 1978 article is reprinted in this JOS issue. Modern computing can make possible, under techniques described here, a real advance in the application of statistical matching. Key words: multivariate normal; complex survey designs; robustness; resampling; variance-covariance structures; and application suggestions.

    1. Introduction. Many government policy questions, whether on the expenditure or tax side, lend themselves to microsimulation modeling, where "what if" analyses of alternative policy options are carried out (e.g., Citro and Hanushek 1991). Often, the starting point for such models, in an attempt to achieve a degree of verisimilitude, is to employ information contained in several survey microdata files. Typically, not all the variables wanted for the modeling have been collected together from a single individual or family. However, the separate survey files may have many demographic and other control variables in common. The idea arose, then, of matching the separate files on these common variables and thus creating a composite file for analysis. "Statistical matching," as the technique began to be called, has been more or less widely practiced since the advent of public use files in the 1960s. Arguably, the desire to employ statistical matching was even an impetus for the release of several of the early public use files, including those involving U.S.
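    To make the mechanics concrete, here is a hedged sketch of the file-fusion step described above: records in one file borrow a target variable from their nearest neighbor, on the common variables, in the other file. The files, variables, and nearest-neighbor rule are hypothetical; assessing the uncertainty of such procedures, the article's actual subject, is not attempted here.

        import numpy as np
        from sklearn.neighbors import NearestNeighbors

        rng = np.random.default_rng(1)
        # Two hypothetical survey files sharing common covariates Z
        # (say, age and income) but observing different target variables.
        Z_a = rng.normal(size=(500, 2))
        spending = Z_a @ np.array([1.0, 2.0]) + rng.normal(size=500)
        Z_b = rng.normal(size=(800, 2))
        taxes = Z_b @ np.array([0.5, 1.5]) + rng.normal(size=800)

        # For each record in file A, take the closest donor record in file B
        # on the common variables and borrow its target value.
        nn = NearestNeighbors(n_neighbors=1).fit(Z_b)
        _, donor = nn.kneighbors(Z_a)
        composite_taxes = taxes[donor.ravel()]
        # The composite file now pairs (Z_a, spending, composite_taxes).
        print(composite_taxes[:5])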
  • A Machine Learning Approach to Census Record Linking∗
    A Machine Learning Approach to Census Record Linking. James J. Feigenbaum. March 28, 2016.

    Abstract: Thanks to the availability of new historical census sources and advances in record linking technology, economic historians are becoming big data genealogists. Linking individuals over time and between databases has opened up new avenues for research into intergenerational mobility, the long run effects of early life conditions, assimilation, discrimination, and the returns to education. To take advantage of these new research opportunities, scholars need to be able to accurately and efficiently match historical records and produce an unbiased dataset of links for analysis. I detail a standard and transparent census matching technique for constructing linked samples that can be replicated across a variety of cases. The procedure applies insights from machine learning classification and text comparison to record linkage of historical data. My method teaches an algorithm to replicate how a well trained and consistent researcher would create a linked sample across sources. I begin by extracting a subset of possible matches for each record, and then use training data to tune a matching algorithm that attempts to minimize both false positives and false negatives, taking into account the inherent noise in historical records. To make the procedure precise, I trace its application to an example from my own work, linking children from the 1915 Iowa State Census to their adult-selves in the 1940 Federal Census. In addition, I provide guidance on a number of practical questions, including how large the training data needs to be relative to the sample.

    (I thank Christoph Hafemeister, Jamie Lee, Christopher Muller, Martin Rotemberg, and Nicolas Ziebarth for detailed feedback and thoughts on this project, as well as seminar participants at the NBER DAE Summer Institute and the Berkeley Demography Conference on Census Linking.)
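    The two-stage logic of the abstract, extracting candidate pairs and then tuning a classifier on hand-labeled training data, can be sketched as follows. The similarity features, the toy training set, and the field names are hypothetical stand-ins, not Feigenbaum's actual specification.

        import difflib
        import numpy as np
        from sklearn.linear_model import LogisticRegression

        def features(a, b):
            """Similarity features for a candidate pair: name-similarity
            ratios plus the absolute difference in year of birth."""
            first = difflib.SequenceMatcher(None, a["first"], b["first"]).ratio()
            last = difflib.SequenceMatcher(None, a["last"], b["last"]).ratio()
            return [first, last, abs(a["yob"] - b["yob"])]

        # Hand-labeled candidate pairs (1 = same person, 0 = different).
        train = [
            ({"first": "john", "last": "smith", "yob": 1900},
             {"first": "john", "last": "smith", "yob": 1901}, 1),
            ({"first": "john", "last": "smith", "yob": 1900},
             {"first": "jon", "last": "smyth", "yob": 1900}, 1),
            ({"first": "john", "last": "smith", "yob": 1900},
             {"first": "james", "last": "smith", "yob": 1912}, 0),
            ({"first": "mary", "last": "jones", "yob": 1895},
             {"first": "martha", "last": "john", "yob": 1890}, 0),
        ]
        F = np.array([features(a, b) for a, b, _ in train])
        y = np.array([label for _, _, label in train])
        clf = LogisticRegression().fit(F, y)

        # Score a new candidate pair; the acceptance threshold trades off
        # false positives against false negatives.
        pair = ({"first": "john", "last": "smith", "yob": 1900},
                {"first": "johm", "last": "smith", "yob": 1900})
        print(clf.predict_proba(np.array([features(*pair)]))[0, 1])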
  • Efficiency of Average Treatment Effect Estimation When the True Propensity Is Parametric
    econometrics (Article). Efficiency of Average Treatment Effect Estimation When the True Propensity Is Parametric. Kyoo il Kim, Department of Economics, Michigan State University, 486 W. Circle Dr., East Lansing, MI 48824, USA; [email protected]; Tel.: +1-517-353-9008. Received: 8 March 2019; Accepted: 28 May 2019; Published: 31 May 2019.

    Abstract: It is well known that efficient estimation of average treatment effects can be obtained by the method of inverse propensity score weighting, using the estimated propensity score, even when the true one is known. When the true propensity score is unknown but parametric, it is conjectured from the literature that we still need nonparametric propensity score estimation to achieve the efficiency. We formalize this argument and further identify the source of the efficiency loss arising from parametric estimation of the propensity score. We also provide an intuition of why this overfitting is necessary. Our finding suggests that, even when we know that the true propensity score belongs to a parametric class, we still need to estimate the propensity score by a nonparametric method in applications. Keywords: average treatment effect; efficiency bound; propensity score; sieve MLE. JEL Classification: C14; C18; C21.

    1. Introduction. Estimating treatment effects of a binary treatment or a policy has been one of the most important topics in evaluation studies. In estimating treatment effects, a subject's selection into a treatment may contaminate the estimate, and two approaches are popularly used in the literature to remove the bias due to this sample selection. One is the regression-based control function method (see, e.g., Rubin (1973); Hahn (1998); and Imbens (2004)) and the other is the matching method (see, e.g., Rubin and Thomas (1996); Heckman et al. (1998); and Abadie and Imbens (2002, 2006)).
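    For reference, a minimal sketch of the inverse propensity score weighting estimator under discussion, with the score estimated by a parametric logit on simulated data. Note that this sketch deliberately omits the nonparametric (sieve) estimation step the article argues is needed to attain the efficiency bound.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        def ipw_ate(Y, W, X):
            """IPW estimator of the ATE, weighting each observation by the
            inverse of its estimated propensity score e(X) = Pr(W = 1 | X)."""
            e = LogisticRegression().fit(X, W).predict_proba(X)[:, 1]
            return np.mean(W * Y / e - (1 - W) * Y / (1 - e))

        rng = np.random.default_rng(2)
        N = 5000
        X = rng.normal(size=(N, 2))
        W = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
        Y = 2.0 * W + X[:, 0] + rng.normal(size=N)    # true ATE = 2
        print(ipw_ate(Y, W, X))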
  • Stability and Median Rationalizability for Aggregate Matchings
    Games 2021, 12, 33 (Article). Stability and Median Rationalizability for Aggregate Matchings. Federico Echenique (California Institute of Technology), SangMok Lee (Washington University in St. Louis), Matthew Shum (California Institute of Technology), and M. Bumin Yenmez (Boston College).

    Abstract: We develop the theory of stability for aggregate matchings used in empirical studies and establish fundamental properties of stable matchings including the result that the set of stable matchings is a non-empty, complete, and distributive lattice. Aggregate matchings are relevant as matching data in revealed preference theory. We present a result on rationalizing matching data as the median stable matching. Keywords: aggregate matching; median stable matching; rationalizability; lattice.

    1. Introduction. Following the seminal work of [1], an extensive literature has developed regarding matching markets with non-transferable utility. This literature assumes that there are agent-specific preferences, and studies the existence of stable matchings in which each agent prefers her assigned partner to the outside option of being unmatched, and there are no pairs of agents that would like to match with each other rather than keeping their assigned partners. In this paper, we develop the theory of stability for aggregate matchings, which we
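    For readers new to stability, the following sketch shows the classic deferred-acceptance algorithm for an individual-level two-sided market, whose output is stable in the sense described (no blocking pairs). The preference lists are made up, and the paper's aggregate-matching theory is considerably more general than this special case.

        def deferred_acceptance(prop_prefs, recv_prefs):
            """Gale-Shapley deferred acceptance: proposers propose down their
            preference lists; receivers hold the best offer seen so far.
            The resulting matching is stable."""
            rank = {r: {p: i for i, p in enumerate(prefs)}
                    for r, prefs in recv_prefs.items()}
            free = list(prop_prefs)                   # currently unmatched proposers
            nxt = {p: 0 for p in prop_prefs}          # next preference to try
            match = {}                                # receiver -> proposer
            while free:
                p = free.pop()
                r = prop_prefs[p][nxt[p]]
                nxt[p] += 1
                if r not in match:
                    match[r] = p
                elif rank[r][p] < rank[r][match[r]]:  # r prefers new proposer
                    free.append(match[r])
                    match[r] = p
                else:
                    free.append(p)
            return match

        prop = {"a": ["x", "y"], "b": ["y", "x"]}
        recv = {"x": ["b", "a"], "y": ["a", "b"]}
        print(deferred_acceptance(prop, recv))        # {'y': 'b', 'x': 'a'}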
  • Report on Exact and Statistical Matching Techniques
    Statistical Policy Working Papers are a series of technical documents prepared under the auspices of the Office of Federal Statistical Policy and Standards. These documents are the product of working groups or task forces, as noted in the Preface to each report. These Statistical Policy Working Papers are published for the purpose of encouraging further discussion of the technical issues and to stimulate policy actions which flow from the technical findings and recommendations. Readers of Statistical Policy Working Papers are encouraged to communicate directly with the Office of Federal Statistical Policy and Standards with additional views, suggestions, or technical concerns. Office of Federal Statistical Policy and Standards, Joseph W. Duncan, Director. For sale by the Superintendent of Documents, U.S. Government Printing Office, Washington, D.C. 20402.

    Statistical Policy Working Paper 5: Report on Exact and Statistical Matching Techniques. Prepared by the Subcommittee on Matching Techniques, Federal Committee on Statistical Methodology. U.S. Department of Commerce: Philip M. Klutznick; Courtenay M. Slater, Chief Economist. Office of Federal Statistical Policy and Standards: Joseph W. Duncan, Director; Katherine K. Wallman, Deputy Director, Social Statistics; Gaylord E. Worden, Deputy Director, Economic Statistics; Maria E. Gonzalez, Chairperson, Federal Committee on Statistical Methodology. Issued: June 1980.

    Preface. This working paper was prepared by the Subcommittee on Matching Techniques, Federal Committee on Statistical Methodology. The Subcommittee was chaired by Daniel B. Radner, Office of Research and Statistics, Social Security Administration, Department of Health and Human Services. Members of the Subcommittee include Rich Allen, Economics, Statistics, and Cooperatives Service (USDA); Thomas B.
  • Alternatives to Randomized Control Trials: a Review of Three Quasi-Experimental Designs for Causal Inference
    Actualidades en Psicología, 29(119), 2015, 19–27. ISSN 2215-3535. http://revistas.ucr.ac.cr/index.php/actualidades. DOI: http://dx.doi.org/10.15517/ap.v29i119.18810. Alternatives to Randomized Control Trials: A Review of Three Quasi-experimental Designs for Causal Inference. Pavel Pavolovich Panko, Jacob D. Curtis, Brittany K. Gorrall, Todd Daniel Little. Texas Tech University, United States.

    Abstract: The Randomized Control Trial (RCT) design is typically seen as the gold standard in psychological research. As it is not always possible to conform to RCT specifications, many studies are conducted in the quasi-experimental framework. Although quasi-experimental designs are considered less preferable to RCTs, with guidance they can produce inferences which are just as valid. In this paper, the authors present three quasi-experimental designs which are viable alternatives to RCT designs: Regression Point Displacement (RPD), Regression Discontinuity (RD), and Propensity Score Matching (PSM). Additionally, the authors outline several notable methodological improvements to use with these designs. Keywords: psychometrics, quasi-experimental design, regression point displacement, regression discontinuity, propensity score matching.
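    As a concrete illustration of one of the three designs, here is a hedged sketch of a basic sharp regression discontinuity estimate: local linear fits on each side of the cutoff, differenced at the cutoff. The bandwidth, cutoff, and simulated data are arbitrary choices for the example, not recommendations from the paper.

        import numpy as np

        def rd_estimate(y, running, cutoff=0.0, bandwidth=1.0):
            """Sharp RD: fit a line on each side of the cutoff within the
            bandwidth, and take the jump between the two fits at the cutoff."""
            left = (running < cutoff) & (running > cutoff - bandwidth)
            right = (running >= cutoff) & (running < cutoff + bandwidth)
            coef_l = np.polyfit(running[left], y[left], 1)
            coef_r = np.polyfit(running[right], y[right], 1)
            return np.polyval(coef_r, cutoff) - np.polyval(coef_l, cutoff)

        rng = np.random.default_rng(3)
        r = rng.uniform(-2, 2, size=4000)             # running variable
        y = 0.5 * r + 1.5 * (r >= 0) + rng.normal(scale=0.5, size=4000)
        print(rd_estimate(y, r, bandwidth=0.75))      # true jump = 1.5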
  • Matching Via Dimensionality Reduction for Estimation of Treatment Effects in Digital Marketing Campaigns
    Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16). Matching via Dimensionality Reduction for Estimation of Treatment Effects in Digital Marketing Campaigns. Sheng Li (Northeastern University, Boston, MA, USA), Nikos Vlassis (Adobe Research, San Jose, CA, USA), Jaya Kawale (Adobe Research, San Jose, CA, USA), Yun Fu (Northeastern University, Boston, MA, USA).

    Abstract: A widely used method for estimating counterfactuals and causal treatment effects from observational data is nearest-neighbor matching. This typically involves pairing each treated unit with its nearest-in-covariates control unit, and then estimating an average treatment effect from the set of matched pairs. Although straightforward to implement, this estimator is known to suffer from a bias that increases with the dimensionality of the covariate space, which can be undesirable in applications that involve high-dimensional data. To address this problem, we propose a novel estimator that first projects the data to a number of random linear subspaces.

    Introduction (excerpt): Units in the control group are interpreted as counterfactuals, and the average treatment effect on treated (ATT) is estimated by comparing the outcomes of every matched pair. One of the widely used matching methods is Nearest Neighbor Matching (NNM) [Rubin, 1973a]. For each treated unit, NNM finds its nearest neighbor in the control group to generate a matched pair, and the ATT is then estimated from the set of matched pairs. Different NNM methods are characterized by the choice of a distance measure for determining such a match. Some of the popularly used distance measures are Exact Matching and its variant Coarsened Exact Matching (CEM) [Iacus et al., 2011], Mahalanobis distance matching [Rubin, 1979], and Propensity Score Matching (PSM) [Rosenbaum and Rubin, 1983].
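    The core idea, projecting the covariates to a random low-dimensional subspace before nearest-neighbor matching, can be sketched as follows. A single Gaussian projection and a 1-NN rule are simplifications for illustration; the paper's estimator aggregates over many random projections, which this sketch omits.

        import numpy as np
        from sklearn.neighbors import NearestNeighbors

        def att_after_projection(Y, W, X, dim=5, seed=0):
            """ATT by 1-NN matching on a random linear projection of X."""
            rng = np.random.default_rng(seed)
            P = rng.normal(size=(X.shape[1], dim)) / np.sqrt(dim)
            Z = X @ P                                 # low-dimensional covariates
            nn = NearestNeighbors(n_neighbors=1).fit(Z[W == 0])
            _, idx = nn.kneighbors(Z[W == 1])
            # Treated outcome minus its matched-control counterfactual.
            return np.mean(Y[W == 1] - Y[W == 0][idx.ravel()])

        rng = np.random.default_rng(4)
        N, k = 3000, 50                               # high-dimensional setting
        X = rng.normal(size=(N, k))
        W = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
        Y = 1.0 * W + X[:, 0] + rng.normal(size=N)    # true ATT = 1
        print(att_after_projection(Y, W, X, dim=5))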
  • Frequency Matching Case-Control Techniques: an Epidemiological Perspective
    Frequency Matching case-control techniques: an epidemiological perspective. Authors: Hai Nguyen, MS (first and corresponding author), Research Assistant; Trang Pham, MS; Garth Rauscher, PhD; all of the Division of Epidemiology and Biostatistics, School of Public Health, University of Illinois at Chicago.

    Abstract: In many cohort and case-control studies, subjects are matched with the intent to control confounding and to improve study efficiency by improving precision. An often-used approach is to check that the frequency distributions in each study group are alike. Being alike in the frequency distributions of key variables would provide evidence that the groups are comparable. However, there are instances where the overall distributions are alike but the individual cases vary substantially. While there are no methods that can guarantee comparability, individual case matching has often been used to provide assurance that the groups are comparable. We propose an algorithm, built as a SAS macro, to match controls to each case given a set of matching criteria, including an exact match on site and year, and an exact or 5-year interval match on age. Cases can be matched with a large number of controls. The algorithm was applied to a large 2000–2017 dataset from the Metropolitan Chicago Breast Cancer Registry, with more than 485,000 women obtaining breast screening or diagnostic imaging (more than 15,000 cases and 30,000 controls).
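    A sketch of the matching pass the abstract describes, exact on site and year with age within a fixed window and several controls per case drawn without reuse, written here in Python rather than as the authors' SAS macro; the field names and toy records are illustrative.

        import numpy as np

        def match_controls(cases, controls, n_per_case=2, age_window=5):
            """For each case, draw up to n_per_case controls matched exactly
            on site and year, with age within +/- age_window, without reuse."""
            rng = np.random.default_rng(5)
            used, matches = set(), {}
            for ci, case in enumerate(cases):
                eligible = [j for j, ctl in enumerate(controls)
                            if j not in used
                            and ctl["site"] == case["site"]
                            and ctl["year"] == case["year"]
                            and abs(ctl["age"] - case["age"]) <= age_window]
                take = min(n_per_case, len(eligible))
                picked = list(rng.choice(eligible, size=take, replace=False)) if take else []
                used.update(picked)
                matches[ci] = picked
            return matches

        cases = [{"site": "A", "year": 2010, "age": 52},
                 {"site": "B", "year": 2012, "age": 60}]
        controls = [{"site": "A", "year": 2010, "age": 50},
                    {"site": "A", "year": 2010, "age": 55},
                    {"site": "B", "year": 2012, "age": 63},
                    {"site": "B", "year": 2011, "age": 60}]
        print(match_controls(cases, controls))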
  • Package 'Matching'
    Package ‘Matching’. April 14, 2021. Version: 4.9-9. Date: 2021-03-15. Title: Multivariate and Propensity Score Matching with Balance Optimization. Author: Jasjeet Singh Sekhon <[email protected]>. Maintainer: Jasjeet Singh Sekhon <[email protected]>. Description: Provides functions for multivariate and propensity score matching and for finding optimal balance based on a genetic search algorithm. A variety of univariate and multivariate metrics to determine if balance has been obtained are also provided. For details, see the paper by Jasjeet Sekhon (2007, <doi:10.18637/jss.v042.i07>). Depends: R (>= 2.6.0), MASS (>= 7.2-1), graphics, grDevices, stats. Suggests: parallel, rgenoud (>= 2.12), rbounds. License: GPL-3. URL: http://sekhon.berkeley.edu/matching/. NeedsCompilation: yes. RoxygenNote: 7.1.1. Repository: CRAN. Date/Publication: 2021-04-13 22:00:15 UTC.

    R topics documented: balanceUV, GenMatch, GerberGreenImai, ks.boot, lalonde, Match, MatchBalance, Matchby, qqstats, summary.balanceUV, summary.ks.boot, summary.Match, summary.Matchby.

    balanceUV: Univariate Balance Tests. Description: This function provides a number of univariate balance metrics. Generally, users should call MatchBalance and not this function directly.

    Usage:

        balanceUV(Tr, Co, weights = rep(1, length(Co)), exact = FALSE, ks = FALSE,
                  nboots = 1000, paired = TRUE, match = FALSE,
                  weights.Tr = rep(1, length(Tr)), weights.Co = rep(1, length(Co)),
                  estimand = "ATT")

    Arguments: Tr, a vector containing the treatment observations. Co, a vector containing the control observations. weights, a vector containing the observation specific weights; only use this option when the treatment and control observations are paired (as they are after matching).
  • A Comparison of Different Methods to Handle Missing Data in the Context of Propensity Score Analysis
    European Journal of Epidemiology, https://doi.org/10.1007/s10654-018-0447-z. METHODS. A comparison of different methods to handle missing data in the context of propensity score analysis. Jungyeon Choi, Olaf M. Dekkers, Saskia le Cessie. Received: 21 May 2018 / Accepted: 25 September 2018. © The Author(s) 2018.

    Abstract: Propensity score analysis is a popular method to control for confounding in observational studies. A challenge in propensity methods is missing values in confounders. Several strategies for handling missing values exist, but guidance in choosing the best method is needed. In this simulation study, we compared four strategies of handling missing covariate values in propensity matching and propensity weighting. These methods include: complete case analysis, the missing indicator method, multiple imputation, and combining multiple imputation with the missing indicator method. Concurrently, we aimed to provide guidance in choosing the optimal strategy. Simulated scenarios varied regarding missing mechanism and the presence of effect modification or unmeasured confounding. Additionally, we demonstrated how missingness graphs help clarify the missing structure. When no effect modification existed, complete case analysis yielded valid causal treatment effects even when data were missing not at random. In some situations, complete case analysis was also able to partially correct for unmeasured confounding. Multiple imputation worked well if the data were missing (completely) at random, and if the imputation model was correctly specified. In the presence of effect modification, more complex imputation models than default options of commonly used statistical software were required. Multiple imputation may fail when data are missing not at random.
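    As an illustration of one of the four strategies compared, here is a minimal sketch of the missing indicator method applied before propensity score estimation: missing values are zero-filled and a dummy per covariate flags where filling occurred. The data and the logit model are illustrative assumptions.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        def missing_indicator(X):
            """Zero-fill NaNs and append one dummy column per covariate
            flagging where the value was missing."""
            miss = np.isnan(X)
            return np.hstack([np.where(miss, 0.0, X), miss.astype(float)])

        rng = np.random.default_rng(6)
        N = 1000
        X = rng.normal(size=(N, 2))
        W = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
        X[rng.random((N, 2)) < 0.2] = np.nan          # 20% of entries missing
        D = missing_indicator(X)
        pscore = LogisticRegression().fit(D, W).predict_proba(D)[:, 1]
        print(pscore[:5])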
  • Week 10: Causality with Measured Confounding
    Week 10: Causality with Measured Confounding. Brandon Stewart, Princeton. November 28 and 30, 2016. (These slides are heavily influenced by Matt Blackwell, Jens Hainmueller, Erin Hartman, Kosuke Imai and Gary King.)

    Where We've Been and Where We're Going... Last week: regression diagnostics. This week: Monday, the experimental ideal and identification with measured confounding; Wednesday, regression estimation. Next week: identification with unmeasured confounding; instrumental variables. Long run: causality with measured confounding → unmeasured confounding → repeated data. Questions?

    Outline: 1. The Experimental Ideal; 2. Assumption of No Unmeasured Confounding; 3. Fun With Censorship; 4. Regression Estimators; 5. Agnostic Regression; 6. Regression and Causality; 7. Regression Under Heterogeneous Effects; 8. Fun with Visualization, Replication and the NYT; 9. Appendix (Subclassification; Identification under Random Assignment; Estimation Under Random Assignment; Blocking).

    Lancet 2001: negative correlation between coronary heart disease mortality and level of vitamin C in bloodstream (controlling for age, gender, blood pressure, diabetes, and smoking). Lancet 2002: no effect of vitamin C on mortality in controlled placebo trial (controlling for nothing). Lancet 2003: comparing among individuals with the same age, gender, blood pressure, diabetes, and smoking, those with higher vitamin C levels have lower levels of obesity, lower levels of alcohol consumption, are less likely to grow up in working class, etc.
  • STATS 361: Causal Inference
    STATS 361: Causal Inference. Stefan Wager, Stanford University, Spring 2020.

    Contents: 1. Randomized Controlled Trials; 2. Unconfoundedness and the Propensity Score; 3. Efficient Treatment Effect Estimation via Augmented IPW; 4. Estimating Treatment Heterogeneity; 5. Regression Discontinuity Designs; 6. Finite Sample Inference in RDDs; 7. Balancing Estimators; 8. Methods for Panel Data; 9. Instrumental Variables Regression; 10. Local Average Treatment Effects; 11. Policy Learning; 12. Evaluating Dynamic Policies; 13. Structural Equation Modeling; 14. Adaptive Experiments.

    Lecture 1: Randomized Controlled Trials. Randomized controlled trials (RCTs) form the foundation of statistical causal inference. When available, evidence drawn from RCTs is often considered gold-standard statistical evidence; and even when RCTs cannot be run for ethical or practical reasons, the quality of observational studies is often assessed in terms of how well the observational study approximates an RCT. Today's lecture is about estimation of average treatment effects in RCTs in terms of the potential outcomes model, and discusses the role of regression adjustments for causal effect estimation. The average treatment effect is identified entirely via randomization (or, by design of the experiment). Regression adjustments may be used to decrease variance, but regression modeling plays no role in defining the average treatment effect.

    The average treatment effect. We define the causal effect of a treatment via potential outcomes. For a binary treatment w ∈ {0, 1}, we define potential outcomes Y_i(1) and Y_i(0) corresponding to the outcome the i-th subject would have experienced had they respectively received the treatment or not.
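    A small sketch of the two estimators this lecture contrasts, on simulated data: the difference in means, which is unbiased under randomization alone, and an OLS regression adjustment, which targets the same ATE but typically with smaller variance.

        import numpy as np

        rng = np.random.default_rng(7)
        N = 10000
        X = rng.normal(size=N)                        # pre-treatment covariate
        W = rng.binomial(1, 0.5, size=N)              # randomized treatment
        Y = 1.0 * W + 2.0 * X + rng.normal(size=N)    # true ATE = 1

        # Difference in means: unbiased by randomization alone.
        diff_means = Y[W == 1].mean() - Y[W == 0].mean()

        # Regression adjustment: OLS of Y on (1, W, X); the coefficient on W
        # still targets the ATE but typically has smaller variance.
        design = np.column_stack([np.ones(N), W, X])
        beta, *_ = np.linalg.lstsq(design, Y, rcond=None)
        print(diff_means, beta[1])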