Applied Bayesian Modeling and Causal Inference from Incomplete-Data Perspectives an Essential Journey with Donald Rubin’S Statistical Family

Applied Bayesian Modeling and Causal Inference from Incomplete-Data Perspectives An essential journey with Donald Rubin’s statistical family Edited by Andrew Gelman Department of Statistics, Columbia University, USA Xiao-Li Meng Department of Statistics, Harvard University, USA Applied Bayesian Modeling and Causal Inference from Incomplete-Data Perspectives WILEY SERIES IN PROBABILITY AND STATISTICS Established by WALTER A. SHEWHART and SAMUEL S. WILKS Editors: David J. Balding, Peter Bloomfield, Noel A. C. Cressie, Nicholas I. Fisher, Iain M. Johnstone, J. B. Kadane, Geert Molenberghs, Louise M. Ryan, David W. Scott, Adrian F. M. Smith, Jozef L. Teugels; Editors Emeriti: Vic Barnett, J. Stuart Hunter, David G. Kendall A complete list of the titles in this series appears at the end of this volume. Applied Bayesian Modeling and Causal Inference from Incomplete-Data Perspectives An essential journey with Donald Rubin’s statistical family Edited by Andrew Gelman Department of Statistics, Columbia University, USA Xiao-Li Meng Department of Statistics, Harvard University, USA Copyright 2004 John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex PO19 8SQ, England Telephone (+44) 1243 779777 Email (for orders and customer service enquiries): [email protected] Visit our Home Page on www.wileyeurope.com or www.wiley.com All Rights Reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except under the terms of the Copyright, Designs and Patents Act 1988 or under the terms of a licence issued by the Copyright Licensing Agency Ltd, 90 Tottenham Court Road, London W1T 4LP, UK, without the permission in writing of the Publisher. Requests to the Publisher should be addressed to the Permissions Department, John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex PO19 8SQ, England, or emailed to [email protected], or faxed to (+44) 1243 770620. This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold on the understanding that the Publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional should be sought. Other Wiley Editorial Offices John Wiley & Sons Inc., 111 River Street, Hoboken, NJ 07030, USA Jossey-Bass, 989 Market Street, San Francisco, CA 94103-1741, USA Wiley-VCH Verlag GmbH, Boschstr. 12, D-69469 Weinheim, Germany John Wiley & Sons Australia Ltd, 33 Park Road, Milton, Queensland 4064, Australia John Wiley & Sons (Asia) Pte Ltd, 2 Clementi Loop #02-01, Jin Xing Distripark, Singapore 129809 John Wiley & Sons Canada Ltd, 22 Worcester Road, Etobicoke, Ontario, Canada M9W 1L1 Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books. British Library Cataloguing in Publication Data A catalogue record for this book is available from the British Library ISBN 0-470-09043-X Produced from LaTeX files supplied by the authors and processed by Laserwords Private Limited, Chennai, India Printed and bound in Great Britain by Antony Rowe Ltd, Chippenham, Wiltshire This book is printed on acid-free paper responsibly manufactured from sustainable forestry in which at least two trees are planted for each one used for paper production. Contents Preface xiii I Casual inference and observational studies 1 1 An overview of methods for causal inference from observational studies, by Sander Greenland 3 1.1 Introduction . 3 1.2Approachesbasedoncausalmodels................. 3 1.3 Canonical inference . 9 1.4 Methodologic modeling . 10 1.5Conclusion.............................. 13 2 Matching in observational studies, by Paul R. Rosenbaum 15 2.1 The role of matching in observational studies . 15 2.2Whymatch?.............................. 16 2.3Twokeyissues:balanceandstructure................ 17 2.4Additionalissues........................... 21 3 Estimating causal effects in nonexperimental studies, by Rajeev Dehejia 25 3.1 Introduction . 25 3.2 Identifying and estimating the average treatment effect . 27 3.3TheNSWdata............................ 29 3.4 Propensity score estimates . 31 3.5Conclusions.............................. 35 4 Medication cost sharing and drug spending in Medicare, by Alyce S. Adams 37 4.1 Methods . 38 4.2Results................................. 40 4.3 Study limitations . 45 4.4Conclusionsandpolicyimplications................. 46 5 A comparison of experimental and observational data analyses, by Jennifer L. Hill, Jerome P. Reiter, and Elaine L. Zanutto 49 5.1Experimentalsample......................... 50 v vi CONTENTS 5.2 Constructed observational study . 51 5.3Concludingremarks......................... 60 6 Fixing broken experiments using the propensity score, by Bruce Sacerdote 61 6.1 Introduction . 61 6.2Thelotterydata............................ 62 6.3 Estimating the propensity scores . 63 6.4Results................................. 65 6.5Concludingremarks......................... 71 7 The propensity score with continuous treatments, by Keisuke Hirano and Guido W. Imbens 73 7.1 Introduction . 73 7.2Thebasicframework......................... 74 7.3BiasremovalusingtheGPS..................... 76 7.4 Estimation and inference . 78 7.5 Application: the Imbens–Rubin–Sacerdote lottery sample . 79 7.6Conclusion.............................. 83 8 Causal inference with instrumental variables, by Junni L. Zhang 85 8.1 Introduction . 85 8.2 Key assumptions for the LATE interpretation of the IV estimand . 87 8.3 Estimating causal effects with IV . 90 8.4 Some recent applications . 95 8.5Discussion............................... 95 9 Principal stratification, by Constantine E. Frangakis 97 9.1 Introduction: partially controlled studies . 97 9.2 Examples of partially controlled studies . 97 9.3Principalstratification......................... 101 9.4 Estimands . 102 9.5Assumptions............................. 104 9.6 Designs and polydesigns . 107 II Missing data modeling 109 10 Nonresponse adjustment in government statistical agencies: constraints, inferential goals, and robustness issues, by John L. Eltinge 111 10.1 Introduction: a wide spectrum of nonresponse adjustment efforts in government statistical agencies . 111 10.2Constraints.............................. 112 10.3 Complex estimand structures, inferential goals, and utility functions 112 CONTENTS vii 10.4 Robustness . 113 10.5Closingremarks............................ 113 11 Bridging across changes in classification systems, by Nathaniel Schenker 117 11.1 Introduction . 117 11.2 Multiple imputation to achieve comparability of industry and occu- pationcodes.............................. 118 11.3 Bridging the transition from single-race reporting to multiple-race reporting................................ 123 11.4Conclusion.............................. 128 12 Representing the Census undercount by multiple imputation of households, by Alan M. Zaslavsky 129 12.1 Introduction . 129 12.2Models................................ 131 12.3Inference............................... 134 12.4Simulationevaluations........................ 138 12.5Conclusion.............................. 140 13 Statistical disclosure techniques based on multiple imputation, by Roderick J. A. Little, Fang Liu, and Trivellore E. Raghunathan 141 13.1 Introduction . 141 13.2Fullsynthesis............................. 143 13.3SMIKeandMIKe........................... 144 13.4Analysisofsyntheticsamples.................... 147 13.5Anapplication............................ 149 13.6Conclusions.............................. 152 14 Designs producing balanced missing data: examples from the National Assessment of Educational Progress, by Neal Thomas 153 14.1 Introduction . 153 14.2 Statistical methods in NAEP . 155 14.3 Split and balanced designs for estimating population parameters . 157 14.4 Maximum likelihood estimation . 159 14.5 The role of secondary covariates . 160 14.6Conclusions.............................. 162 15 Propensity score estimation with missing data, by Ralph B. D’Agostino Jr. 163 15.1 Introduction . 163 15.2Notation................................ 165 15.3Appliedexample:MarchofDimesdata............... 168 15.4 Conclusion and future directions . 174 viii CONTENTS 16 Sensitivity to nonignorability in frequentist inference, by Guoguang Ma and Daniel F. Heitjan 175 16.1 Missing data in clinical trials . 175 16.2 Ignorability and bias . 175 16.3 A nonignorable selection model . 176 16.4 Sensitivity of the mean and variance . 177 16.5 Sensitivity of the power . 178 16.6 Sensitivity of the coverage probability . 180 16.7Anexample.............................. 184 16.8Discussion............................... 185 III Statistical modeling and computation 187 17 Statistical modeling and computation, by D. Michael Titterington 189 17.1Regressionmodels.......................... 190 17.2Latent-variableproblems....................... 191 17.3 Computation: non-Bayesian . 191 17.4Computation:Bayesian........................ 192 17.5Prospectsforthefuture........................ 193 18 Treatment effects in before-after data, by Andrew Gelman 195 18.1 Default statistical models of treatment effects . 195 18.2 Before-after correlation is typically larger for controls than for treatedunits.............................. 196 18.3 A class of models for varying treatment effects . 200 18.4Discussion..............................

Applied Bayesian Modeling and Causal Inference from Incomplete-Data Perspectives an Essential Journey with Donald Rubin’S Statistical Family

1 the University of Chicago the Harris School of Public

Donald Rubin Three-Day Course in Causality 5 to 7 February 2020 At

Theory with Applications to Voting and Deliberation Experiments!

Arxiv:1805.08845V4 [Stat.ML] 10 Jul 2021 Outcomes Such As Images, Sequences, and Graphs

The Neyman-Rubin Model of Causal Inference and Estimation Via Matching Methods∗

To Adjust Or Not to Adjust? Sensitivity Analysis of M-Bias and Butterfly-Bias

1 the University of Chicago the Harris School of Public

Causality Causality Refers to the Relationship

Donald Rubin∗ Andrew Gelman† 3 Oct 2017

ASA 1999 JSM Teaching Causal Inference Dr. Donald Rubin in Experiments and Observational Studies

^^.^-3 Causal Diagrams for Empirical Research C^-J-H Di^^I^)

Potential Outcome and Directed Acyclic Graph Approaches to Causality: Relevance for Empirical Practice in Economics∗