Restricted Randomization: Blocking and Matching
Rocío Titiunik
University of Michigan

Prepared for Workshop on Field Experimentation in Political Economy
Columbia Center for the Study of Development Strategies
18-20 May 2011

Titiunik (UM) Block randomization May 19, 2011

Potential Outcomes

T_i = 1 or T_i = 0: binary treatment assignment
Y_i(1): potential outcome under treatment
Y_i(0): potential outcome under control
Y_i = Y_i(0)(1 - T_i) + Y_i(1) T_i: observed outcome
Fundamental Problem of Causal Inference: we see either Y_i(0) or Y_i(1), but never both simultaneously for the same i

Random Assignment: Solution to Fundamental Problem

Random assignment of treatment offers a solution to the Fundamental Problem of Causal Inference, because potential outcomes and treatment assignment are independent

E(Y_i | T_i = 0) = E(Y_i(0) | T_i = 0) = E(Y_i(0))
E(Y_i | T_i = 1) = E(Y_i(1) | T_i = 1) = E(Y_i(1))

The average treatment effect (ATE) is

ATE = E(Y_i(1)) - E(Y_i(0)) = E(Y_i | T_i = 1) - E(Y_i | T_i = 0)

So we can estimate the ATE by a simple difference in means

\widehat{ATE} = \frac{\sum_{i=1}^{n} T_i Y_i}{\sum_{i=1}^{n} T_i} - \frac{\sum_{i=1}^{n} (1 - T_i) Y_i}{\sum_{i=1}^{n} (1 - T_i)} = \bar{Y}_T - \bar{Y}_C

How To Randomize

When there is random assignment, participants in a trial are assigned to comparison groups based on some chance process
Two elements:
  - Generation of unpredictable randomized allocation sequence
  - Concealment of the sequence until assignment occurs
There are several ways in which random assignment can be conducted
We distinguish between two broad categories:
  - Simple or unrestricted randomization
  - Restricted randomization

Unrestricted (Simple) Randomization

Unrestricted or simple randomization has two main cases
Case 1: Random Allocation Rule
  - The total sample size n and the sample sizes in each group (treatment, control) are fixed and under control of researcher
  - A subset of n/2 out of n is randomly chosen and assigned to treatment, the remainder to control
Case 2: Complete Randomization
  - Final sample size is not known with certainty (but target established)
  - Randomization procedure is analogous to repeated tossing of fair coin
  - Sample sizes in each group are binomially distributed random variables
  - Most commonly used in clinical trials

Unrestricted (Simple) Randomization: Random Allocation Rule vs Complete Randomization

In both cases, marginal probability of assignment is 1/2 for all assignments
In complete randomization, there is a chance of having imbalanced sample sizes in both groups, but this imbalance becomes extremely unlikely for n ≥ 200
Since statistical power is maximized for equal sample sizes in both groups, random allocation rule might be preferable for experiments with small sample size
Greater likelihood of a covariate imbalance with the random allocation rule than with complete randomization, but the difference is trivial for large sample sizes
In an unmasked study with staggered patient entry, there is substantial potential for selection bias with random allocation rule, not with complete randomization

Restricted Randomization

In unrestricted randomization, during the recruitment process, there is a chance of undesirable differences in the number of subjects assigned to each group
If baseline characteristics of subjects change over time, this periodic imbalance in sample sizes may result in significant differences between treatment and control groups in the distribution of pre-treatment characteristics
These are important concerns in small trials
Restricted randomization procedures control the probability of obtaining an allocation sequence with severely imbalanced sample sizes in the treatment and control groups
The most common form of restricted randomization is blocked randomization or simply blocking
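
The two unrestricted schemes and the permuted-block idea can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used in any particular study: the function names, the equal allocation within blocks, and the helper `max_running_imbalance` are assumptions introduced here, and units are assumed to be enrolled in the order of the returned list.

```python
import random


def random_allocation_rule(n, rng):
    """Random allocation rule: exactly n // 2 of n units are treated (1)."""
    treated = set(rng.sample(range(n), n // 2))
    return [1 if i in treated else 0 for i in range(n)]


def complete_randomization(n, rng):
    """Complete randomization: one fair coin toss per unit.

    Group sizes are random (binomial), so they may be unequal.
    """
    return [rng.randint(0, 1) for _ in range(n)]


def permuted_block(n, block_size, rng):
    """Permuted-block design (assumed equal allocation within each block).

    Each consecutive block of block_size units has half treated, half
    control, in a random order.
    """
    if block_size % 2 != 0 or n % block_size != 0:
        raise ValueError("block_size must be even and divide n")
    assignment = []
    for _ in range(n // block_size):
        block = [1] * (block_size // 2) + [0] * (block_size // 2)
        rng.shuffle(block)
        assignment.extend(block)
    return assignment


def max_running_imbalance(assignment):
    """Largest |treated - control| seen at any point during enrollment."""
    diff, worst = 0, 0
    for t in assignment:
        diff += 1 if t == 1 else -1
        worst = max(worst, abs(diff))
    return worst


if __name__ == "__main__":
    rng = random.Random(2011)  # fixed seed so the sketch is reproducible
    for name, seq in [
        ("random allocation rule", random_allocation_rule(20, rng)),
        ("complete randomization", complete_randomization(20, rng)),
        ("permuted blocks of 4", permuted_block(20, 4, rng)),
    ]:
        print(name, "treated:", sum(seq),
              "max running imbalance:", max_running_imbalance(seq))
```

With blocks of size 4, the running imbalance in the permuted-block sequence can never exceed 2, which is the sense in which blocking forces periodic balance during recruitment.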

Restricted Randomization: Blocked Randomization

Blocked randomization refers to a randomization procedure that forces (periodic) balance in the number of subjects assigned to each treatment group
The most common form is the permuted-block design, in which the size of each block is decided in advance, and the treatment is randomly assigned within each block
Blocked randomization is usually combined with stratification to increase statistical power of treatment-control comparisons
A stratified blocked randomization divides experimental units into homogeneous strata (blocks), and then randomly assigns the treatment within blocks
Often in the social sciences, the term blocking is used to refer to a stratified blocked randomization

Stratified Blocked Randomization

Performing a stratified blocked randomization involves incorporating pre-treatment covariates in the design of the experiment
Pre-treatment covariates:
  - Variables that predict potential outcomes
  - Variables that are determined before treatment is assigned
If T_i is randomly assigned, the distribution of covariates in the treatment group is, in expectation, the same as that in the control group
We can learn about the treatment effect without having to explicitly model the way in which other variables affect the outcome
In randomized experiments, covariates can be safely ignored
However, by blocking on important covariates in the design stage we can improve power in the analysis stage

Stratified Blocked Randomization

Stratified blocked randomization can provide considerable benefits when blocks are defined on the basis of a pre-treatment covariate that is strongly correlated with the outcome of interest
The general procedure is as follows:
  - Divide total observations into homogeneous blocks
  - Each block has observations with same or similar values of pre-treatment covariates
  - Randomly assign treatment to observations within blocks
  - If we have J blocks, we run J "mini-experiments"

Stratified Blocked Randomization

Blocking on the basis of an important pre-treatment covariate has several advantages:
  - Eliminates the randomness in outcomes that occurs when covariates are imbalanced, since imbalance in the covariates used for blocking is ruled out by design
  - Eliminates correlation between blocked covariates and treatment assignment; such correlation would otherwise introduce a collinearity penalty when using regression
  - Increases the efficiency of estimation and the power of hypothesis tests
  - Reduces the required sample size for fixed precision or power
  - Forces the researcher to think about which covariates should be included in the analysis before randomization occurs
  - As a result, it sets a clear guideline for the analysis stage
Unless sample size is very small, blocking on a covariate that fails to predict experimental outcomes does no harm

Stratified Blocked Randomization: Analysis Stage

Blocking is part of the design stage of the experiment
However, it can, and in most cases it must, be incorporated into the analysis stage
How exactly does blocking affect the analysis stage?
Answer depends on whether the treatment effect varies with the pre-treatment covariate used for blocking
If the covariate has no predictive power over the outcome, blocking can be ignored in the analysis stage, but blocking is unlikely to be used in this case to begin with
If the covariate has strong predictive power over the outcome, the analysis stage must take blocking into consideration

Stratified Blocked Randomization: Analysis Stage

If the covariate has strong predictive power over the outcome, the analysis stage must take blocking into consideration
In this case, the expected intrablock correlation is nonzero
If the intrablock correlation is positive, ignoring blocking in the analysis stage will usually result in a conservative test
This may be an argument for ignoring blocking, but a conservative test results in an unnecessary loss of efficiency and power
Plus, there is always the possibility that ignoring blocks will lead to a non-conservative test, i.e., a test that rejects too often

Stratified Blocked Randomization: Pairwise Matching

The simplest particular case of a stratified blocked randomization is a pairwise matching randomization
  - There are two treatments
  - Blocks are pairs of subjects
  - One subject in each pair is picked at random to receive one treatment; the other subject receives the other treatment
  - Pairing can be done on the basis of one, two, or many covariates

How Blocking Reduces Variability: Example

An experiment may randomly encourage people to get out to vote
  - T_i = 1 if person i receives a card in the mail encouraging her to vote
  - T_i = 0 otherwise
Outcome of interest, Y_i, is turnout of person i in the next election
We know from cost theories of voting that education affects people's propensities to vote
Imagine:
  - Two levels of education: low and high
  - We create two blocks based on education
  - For people with low education, turnout ranges from 0.1 to 0.4
  - For people with high education, turnout ranges from 0.4 to 0.7
Overall variability in the sample is greater than the variability within each education block

How Blocking Reduces Variability: Exercise
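
The education example above can be checked numerically. The following is a minimal sketch and not part of the original slides: it assumes hypothetical turnout propensities spaced evenly over the stated ranges, with 50 people per block (both the spacing and the block sizes are illustrative assumptions), and compares the overall variance with the variance inside each block.

```python
from statistics import pvariance


def evenly_spaced(lo, hi, k):
    """Return k evenly spaced values from lo to hi (inclusive)."""
    step = (hi - lo) / (k - 1)
    return [lo + step * j for j in range(k)]


# Hypothetical turnout propensities; only the ranges come from the example.
low_education = evenly_spaced(0.1, 0.4, 50)
high_education = evenly_spaced(0.4, 0.7, 50)

# Population variance of turnout, pooled and within each education block.
overall_variance = pvariance(low_education + high_education)
low_block_variance = pvariance(low_education)
high_block_variance = pvariance(high_education)

if __name__ == "__main__":
    print("overall variance:   ", round(overall_variance, 4))
    print("low-education block:", round(low_block_variance, 4))
    print("high-education block:", round(high_block_variance, 4))
```

Under these assumptions the pooled variance exceeds the variance in either block, because pooling adds the between-block difference in mean turnout to the within-block spread. Comparing treated and control units inside a block removes that between-block component.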