Research Design

The Biggest Problem in Research: Establishing Causality
  Introduction: Research design
  17.871

Class exercise: HIV and circumcision

Field Experiment example: HIV and male circumcision
  3,274 uncircumcised men, aged 18–24, volunteered!
  Randomly assigned to a control or an intervention group, with follow-up visits at months 3, 12, and 21
  Did it work?
    Control group: 2.1 per 100 person-years
    Treatment group: 0.85 per 100 person-years

HIV and male circumcision
  When controlling for behavioral factors, including sexual behavior that increased slightly in the intervention group, condom use, and health-seeking behavior, the protection was 61% (95% CI: 34%–77%).
  Male circumcision provides a degree of protection against acquiring HIV infection, equivalent to what a vaccine of high efficacy would have achieved.
  Male circumcision may provide an important way of reducing the spread of HIV infection in sub-Saharan Africa.
  PLoS Medicine Vol. 2, No. 11

Field Experiment example: HIV and male circumcision
  Problems?
  Internally valid!
    Because of the randomized intervention, no bias from nonrandom selection into the treatment group. That is,
      No differences between the treatment and control group on confounding variables (only comparing apples with apples, no apples with oranges)
      No possibility of reverse causation
  Alternative interpretations of the treatment?
  External validity?
  Could the difference have occurred by chance?
    Unlikely: p < 0.001 on difference

Why is causality such a problem?
  In observational studies, selection into “treatment” and “control” cases is rarely random
    HIV example
    Schooling examples (private vs. public)
    Voting examples (pro-choice versus pro-life)
  Treatment and control cases may thus differ in other ways that affect the outcome of interest
    Problem with internal validity
  The two primary drivers of selection are
    Confounding variables
    Reverse causation
  If you can address these problems, you have an internally valid study

Internal validity: the two problems
  The two primary drivers of nonrandom selection into the treatment group (or on your key explanatory variable)
  1. Confounding variables
    Comparing apples with apples or apples with oranges?
    Random assignment ensures apple-to-apple comparisons
    Regression, matching, difference-in-differences also attempt to compare apples with apples
  2. Reverse causation
    The chicken and egg problem: which came first?
    Is your dependent variable influencing your treatment (your explanatory variable)?
  If you can address these problems, you almost always have an internally valid study
  Randomly assigned experiments address both

Review of internal and external validity

Good research is about addressing
  Internal validity
  External validity

External validity
  Is your sample representative of the population?
  Address by
    Randomly sampling
    Avoiding casewise deletion because of missing data

Clarification
  Randomly sampling cases gets you?
    External validity
  Randomly assigning to treatment group?
    Internal validity
  Controlling for variables with regression addresses?
    Internal validity
  What study design addresses both internal and external validity?
    Field experiments

How to Establish Causality (i.e., how to rule out alternatives)
  How do we establish causality? By ruling out alternative explanations
  Legal analogy: prosecutor versus defense
  Best approach to ruling out alternatives?
    Run a field experiment!
    E.g., HIV and circumcision

Review: Classic Post-test only experiment
  Donald Campbell and Julian Stanley, Experimental and Quasi-Experimental Designs for Research (1963)
  Summary:
    R X O
    R   O
  No prior observation
  Classical scientific and agricultural experimentalism

How to Establish Causality (i.e., how to rule out alternatives)
  But, running an experiment is often impossible
    Try anyway: e.g., HIV and circumcision
  If you can’t run an experiment: natural experiment
    Exploit something that is exogenous
      Accidental deaths
      Timing of Senate elections
      Imposition of new voting machines
      9/11 terrorist attacks
      Geographical boundaries
    Exploit a discontinuity
      Summa Cum Laude’s effect on income

Regression discontinuity (RD) design
  [Figure not recovered]

Regression discontinuity: Example from Brazil
  [Figures not recovered]

How to Establish Causality (i.e., how to rule out alternatives)
  If you can’t run an experiment or find a natural experiment/discontinuity
    Control for confounding variables
      Difference-in-differences (DD)
      Matching
      Controlling for variables with parametric models, e.g., regression
    Eliminate reverse causation
      Exploit time with panel data, i.e., measure the outcome before and after some treatment

Difference-in-differences
  Media effects example
    Endorsement changes in the 1997 British election
  Illustrates
    Difference-in-differences, which reduces bias from confounding variables
    Panel data, which can help rule out reverse causation

Difference-in-differences
  [Figure: % Labour vote among voters (axis from 25 to 60), 1992 and 1997, for two groups]
    Read paper before 1997 that switched to Labour (n=185): 12.6
    Did not read paper before 1997 that switched to Labour (n=1408): 5.3
    12.6 - 5.3 = 7.3

How to Establish Causality (i.e., how to rule out alternatives)
  If you can’t run an experiment or find a natural experiment
    Control for confounding variables (much of 17.871 is about this)
      Difference-in-differences (DD)
      Matching
      Controlling for variables with parametric models, e.g., regression
    Eliminate reverse causation
      Exploit time with panel data, i.e., measure the outcome before and after some treatment

Summary
  Field experiment always preferred
  Always keep a field experiment in mind when designing observational studies
  Strive for “natural” or quasi-experiments
    Timing of Senate elections
    Imposition of new voting machines
    9/11 terrorist attacks
    Regression-discontinuity designs
  Use difference-in-differences designs
  Gather as much cross-time data as possible (panel studies)
  If you only have cross-sectional data, be humble!

Supplemental examples

Another regression discontinuity example (Angrist and Lavy, 1999)
  [Figures not recovered]
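The difference-in-differences arithmetic from the 1997 British election example (12.6 - 5.3 = 7.3) can be sketched in a few lines of Python. This is an illustrative sketch, not the original analysis: the function names are hypothetical, and it assumes that 12.6 and 5.3 are the 1992-to-1997 changes in % Labour vote for readers and non-readers of the papers that switched to Labour.

```python
def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences from four group means.

    Subtracting the control group's change removes any trend that
    affected both groups equally over the same period.
    """
    treated_change = treated_post - treated_pre
    control_change = control_post - control_pre
    return treated_change - control_change


def diff_in_diff_from_changes(treated_change, control_change):
    """Same estimate when only each group's before/after change is known."""
    return treated_change - control_change


# Values read off the slide; interpreting them as percentage-point
# changes in Labour vote share is an assumption.
READERS_CHANGE = 12.6      # read a paper that switched to Labour (n=185)
NONREADERS_CHANGE = 5.3    # did not read such a paper (n=1408)

dd_estimate = diff_in_diff_from_changes(READERS_CHANGE, NONREADERS_CHANGE)
print(f"Difference-in-differences estimate: {dd_estimate:.1f} points")
```

The four-mean version is the one to use with panel data, where the outcome is measured for both groups before and after the treatment.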

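As a quick check on the HIV field experiment, the two incidence rates on the slide (2.1 and 0.85 per 100 person-years) can be turned into a rate ratio and a protective effect. This is a minimal sketch with hypothetical function names. It gives an unadjusted figure, so it need not match the 61% the slides report after controlling for behavioral factors.

```python
def incidence_rate_ratio(treated_rate, control_rate):
    """Ratio of the treatment group's incidence rate to the control group's."""
    if control_rate <= 0:
        raise ValueError("control_rate must be positive")
    return treated_rate / control_rate


def protective_effect(treated_rate, control_rate):
    """Share of infections avoided relative to control: 1 - rate ratio."""
    return 1.0 - incidence_rate_ratio(treated_rate, control_rate)


# Incidence rates from the slide, per 100 person-years.
CONTROL_RATE = 2.1
TREATMENT_RATE = 0.85

rate_ratio = incidence_rate_ratio(TREATMENT_RATE, CONTROL_RATE)
protection = protective_effect(TREATMENT_RATE, CONTROL_RATE)

print(f"Rate ratio: {rate_ratio:.2f}")             # 0.40
print(f"Unadjusted protection: {protection:.1%}")  # 59.5%
```

Because assignment was random, this simple comparison of group rates is already a credible causal estimate. In an observational study the same arithmetic would be open to confounding and reverse causation.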