Design of Observational Studies (Springer Series in Statistics)
Springer Series in Statistics
Advisors: P. Bickel, P. Diggle, S. Fienberg, U. Gather, I. Olkin, S. Zeger

For other titles published in this series, go to http://www.springer.com/series/692

Paul R. Rosenbaum
Design of Observational Studies

Paul R. Rosenbaum
Statistics Department
Wharton School
University of Pennsylvania
Philadelphia, PA 19104-6340
USA
[email protected]

ISBN 978-1-4419-1212-1
e-ISBN 978-1-4419-1213-8
DOI 10.1007/978-1-4419-1213-8
Springer New York Dordrecht Heidelberg London
Library of Congress Control Number: 2009938109

© Springer Science+Business Media, LLC 2010
All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

Printed on acid-free paper

Springer is a part of Springer Science+Business Media (www.springer.com).

For Judy

“Simplicity of form is not necessarily simplicity of experience.”
Robert Morris, writing about art.

“Simplicity is not a given. It is an achievement.”
William H. Gass, writing about literature.

“Simplicity . . . is a very important matter which must be constantly borne in mind.”
Sir David Cox, writing about experiments.

Preface

An observational study is an empiric investigation of effects caused by treatments when randomized experimentation is unethical or infeasible.
The quality and strength of evidence provided by an observational study is determined largely by its design. Excellent methods of analysis will not salvage a poorly designed study.

The line between design and analysis is easier to draw in practice than it is in theory. In practice, the design of an observational study consists of all activities that precede the examination or use of those outcome measures that will be the basis for the study’s conclusions. Unlike experiments, in some observational studies, the outcomes may exist as measurements prior to the design of the study; it is their examination and use, not their existence, that separates design from analysis. Aspects of design include the framing of scientific questions to permit empirical investigation, the choice of a context in which to conduct the investigation, decisions about what data to collect, where and how to collect it, matching to remove bias from measured covariates, strategies and tactics to limit uncertainty caused by covariates not measured, and sample splitting to guide design using individuals who will not be included in the final analysis. In practice, design ends and analysis begins when outcomes are examined for individuals who will be the basis of the study’s conclusions. An observational study that begins by examining outcomes is a formless, undisciplined investigation that lacks design.

In theory, design anticipates analysis. Analysis is ever present in design, as any goal is ever present in any organized effort, as a goal is necessary to organize effort. One seeks to ask questions and collect data so that results will be decisive when analyzed. To end well, how should we begin?
Philadelphia, PA
Paul Rosenbaum
5 August 2009

Acknowledgments

I am in debt to many people: to Jeff Silber, Dylan Small, and Ruth Heller for recent collaborations I describe here in detail; to my teacher, adviser, and coauthor Don Rubin, from whom I learned a great deal; to Ben Hansen, Bo Lu and Robert Greevy for making network optimization algorithms for matching generally accessible inside R; to colleagues, coauthors or former students Katrina Armstrong, Susan Bakewell-Sachs, Lisa Bellini, T. Behringer, Avital Cnaan, Shoshana Daniel, Gabriel Escobar, Orit Even-Shoshan, Joe Gastwirth, Robert Greevy, Sam Gu, Amelia Haviland, Robert Hornik, Abba Krieger, Marshall Joffe, Yunfei Paul Li, Scott Lorch, Bo Lu, Barbara Medoff-Cooper, Lanyu Mi, Andrea Millman, Kewei Ming, Dan Nagin, Dan Polsky, Kate Propert, Tom Randall, Amy Rosen, Richard Ross, Sandy Schwartz, Tom Ten Have, Richard Tremblay, Kevin Volpp, Yanli Wang, Frank Yoon, and Elaine Zanutto, for collaborations I describe more briefly; to Judith McDonald, John Newell, Luke Keele, Dylan Small, and anonymous reviewers for comments on drafts of the book; to Joshua Angrist, David Card, Susan Dynarski, Alan Krueger, and Victor Lavy for making micro-data from their research available in one form or another.

Parts of this book were written while I was on sabbatical. The hospitality of the Department of Statistics at Columbia University and the Department of Economics and the Statistics Program at the National University of Ireland at Galway are gratefully acknowledged. The work was supported in part by the Methodology, Measurement and Statistics Program of the U.S. National Science Foundation.

Of course, my greatest debts are to Judy, Sarah, Hannah, and Aaron.

Contents

Part I Beginnings

1 Dilemmas and Craftsmanship
   1.1 Those Confounded Vitamins
   1.2 Cochran’s Basic Advice
      1.2.1 Treatments, covariates, outcomes
      1.2.2 How were treatments assigned?
      1.2.3 Were treated and control groups comparable?
      1.2.4 Eliminating plausible alternatives to treatment effects
      1.2.5 Exclusion criteria
      1.2.6 Exiting a treatment group after treatment assignment
      1.2.7 Study protocol
   1.3 Maimonides’ Rule
   1.4 Seat Belts in Car Crashes
   1.5 Money for College
   1.6 Nature’s ‘Natural Experiment’
   1.7 What This Book Is About
   1.8 Further Reading
   References

2 Causal Inference in Randomized Experiments
   2.1 Two Versions of the National Supported Work Experiment
      2.1.1 A version with 185 pairs and a version with 5 pairs
      2.1.2 Basic notation
   2.2 Treatment Effects in Randomized Experiments
      2.2.1 Potential responses under alternative treatments
      2.2.2 Covariates and outcomes
      2.2.3 Possible treatment assignments and randomization
      2.2.4 Interference between units
   2.3 Testing the Null Hypothesis of No Treatment Effect
      2.3.1 Treated−control differences when the null hypothesis is true
      2.3.2 The randomization distribution of the mean difference
      2.3.3 The randomization distribution of Wilcoxon’s statistic
   2.4 Testing Other Hypotheses; Confidence Intervals; Point Estimates
      2.4.1 Testing a constant, additive treatment effect
      2.4.2 Confidence intervals for a constant, additive effect
      2.4.3 Hodges-Lehmann point estimates of effect
      2.4.4 Testing general hypotheses about treatment effects
      2.4.5 Multiplicative effects; Tobit effects
   2.5 Attributable Effects
   2.6 Internal and External Validity
   2.7 Summary
   2.8 Further Reading
   2.9 Appendix: Randomization Distribution of m-statistics
   References

3 Two Simple Models for Observational Studies
   3.1 The Population Before Matching
   3.2 The Ideal Matching
   3.3 A Naïve Model: People Who Look Comparable Are Comparable
   3.4 Sensitivity Analysis: People Who Look Comparable May Differ
   3.5 Welding Fumes and DNA Damage
   3.6 Bias Due to Incomplete Matching
   3.7 Summary
   3.8 Further Reading
   3.9 Appendix: Exact Computations for Sensitivity Analysis
   References

4 Competing Theories Structure Design
   4.1 How Stones Fall
   4.2 The Permanent-Debt Hypothesis
   4.3 Guns and Misdemeanors
   4.4 The Dutch Famine of 1944–1945
   4.5 Replicating Effects and Biases
   4.6 Reasons for Effects
   4.7 The Drive for System
   4.8 Further Reading