
Restricted : and

Rocío Titiunik

University of Michigan

Prepared for Workshop on Field Experimentation in Political Economy, Columbia Center for the Study of Development Strategies, 18-20 May 2011

Titiunik (UM) Block randomization May 19, 2011 1 / 27

Potential Outcomes

$T_i = 1$ or $T_i = 0$: binary treatment assignment

$Y_i(1)$: potential outcome under treatment

$Y_i(0)$: potential outcome under control

$Y_i = Y_i(0)(1 - T_i) + Y_i(1) T_i$: observed outcome

Fundamental Problem of Causal Inference: we see either $Y_i(0)$ or $Y_i(1)$, but never both simultaneously for the same $i$
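As a minimal sketch of the observed-outcome formula, the Python snippet below applies it to one hypothetical subject. The function name `observed_outcome` and the numeric values are illustrative assumptions, not part of the original slides.

```python
def observed_outcome(y0: float, y1: float, t: int) -> float:
    """Observed outcome: Y(0) if t == 0, Y(1) if t == 1."""
    return y0 * (1 - t) + y1 * t


# Hypothetical subject with potential outcomes Y(0) = 0.6 and Y(1) = 0.7.
# Only one of the two is ever observed, depending on the assignment.
seen_if_treated = observed_outcome(0.6, 0.7, 1)
seen_if_control = observed_outcome(0.6, 0.7, 0)
```

Whichever assignment occurs, the other potential outcome stays unobserved, which is the Fundamental Problem stated above.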

Solution to Fundamental Problem

Random assignment of treatment offers a solution to the Fundamental Problem of Causal Inference, because potential outcomes and treatment assignment are independent

$E(Y_i \mid T_i = 0) = E(Y_i(0) \mid T_i = 0) = E(Y_i(0))$

$E(Y_i \mid T_i = 1) = E(Y_i(1) \mid T_i = 1) = E(Y_i(1))$

The average treatment effect (ATE) is

$ATE = E(Y_i(1)) - E(Y_i(0)) = E(Y_i \mid T_i = 1) - E(Y_i \mid T_i = 0)$

So we can estimate the ATE by a simple difference in means:

$\widehat{ATE} = \frac{\sum_{i=1}^{n} T_i \cdot Y_i}{\sum_{i=1}^{n} T_i} - \frac{\sum_{i=1}^{n} (1 - T_i) \cdot Y_i}{\sum_{i=1}^{n} (1 - T_i)} = \bar{Y}_T - \bar{Y}_C$
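The difference-in-means estimator can be sketched in a few lines of Python. The function name `difference_in_means` and the four data points are assumptions made for illustration only.

```python
from typing import Sequence


def difference_in_means(y: Sequence[float], t: Sequence[int]) -> float:
    """Return the treated-group mean minus the control-group mean."""
    n_treated = sum(t)
    n_control = len(t) - n_treated
    if n_treated == 0 or n_control == 0:
        raise ValueError("need at least one treated and one control unit")
    # Numerators mirror the estimator: sum of T*Y and sum of (1 - T)*Y.
    mean_treated = sum(ti * yi for ti, yi in zip(t, y)) / n_treated
    mean_control = sum((1 - ti) * yi for ti, yi in zip(t, y)) / n_control
    return mean_treated - mean_control


# Hypothetical observed outcomes and assignments for four units.
y_obs = [0.2, 0.1, 0.7, 0.6]
t_obs = [1, 0, 1, 0]
estimate = difference_in_means(y_obs, t_obs)
```

With these made-up values the treated mean is 0.45 and the control mean is 0.35, so the estimate is 0.1 up to floating-point rounding.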

How To Randomize

When there is random assignment, participants in a trial are assigned to comparison groups based on some chance process

Two elements:

- Generation of unpredictable randomized allocation sequence

- Concealment of the sequence until assignment occurs

There are several ways in which random assignment can be conducted

We distinguish between two broad categories:

- Simple or unrestricted randomization

- Restricted randomization

Unrestricted (Simple) Randomization

Unrestricted or simple randomization has two main cases

Case 1: Random Allocation Rule

- The total sample size n and the sample sizes in each group (treatment, control) are fixed and under control of researcher

- A subset of n/2 out of n is randomly chosen and assigned to treatment, the remainder to control

Case 2: Complete Randomization

- Final sample size is not known with certainty (but target established)

- Randomization procedure is analogous to repeated tossing of fair coin

- Sample sizes in each group are binomially distributed random variables

- Most commonly used in clinical trials
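The two cases can be contrasted with a short Python sketch. The function names, the seed, and the sample size of 10 are illustrative assumptions; the first function fixes the group sizes, the second tosses an independent fair coin for each unit.

```python
import random


def random_allocation_rule(n: int, rng: random.Random) -> list[int]:
    """Assign exactly n // 2 of n units to treatment (1), the rest to control (0)."""
    if n % 2 != 0:
        raise ValueError("this sketch assumes an even n")
    treated = set(rng.sample(range(n), n // 2))
    return [1 if i in treated else 0 for i in range(n)]


def complete_randomization(n: int, rng: random.Random) -> list[int]:
    """Assign each unit by an independent fair coin toss."""
    return [rng.randint(0, 1) for _ in range(n)]


rng = random.Random(2011)  # arbitrary seed, used only for reproducibility
fixed_sizes = random_allocation_rule(10, rng)   # always 5 treated, 5 control
coin_tosses = complete_randomization(10, rng)   # group sizes are random
```

Under the random allocation rule the number of treated units is fixed by design, while under complete randomization it is a binomial random variable.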

Unrestricted (Simple) Randomization: Random Allocation Rule vs Complete Randomization

In both cases, the marginal probability of assignment is 1/2 for all assignments

In complete randomization, there is a chance of having imbalanced sample sizes in the two groups, but this imbalance becomes extremely unlikely for n ≥ 200

Since statistical power is maximized for equal sample sizes in both groups, the random allocation rule might be preferable for studies with small sample sizes

There is a greater likelihood of a covariate imbalance with the random allocation rule than with complete randomization, but the difference is trivial for large sample sizes

In an unmasked study with staggered patient entry, there is substantial potential for selection bias with the random allocation rule, but not with complete randomization
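To give a rough sense of how the risk of unequal group sizes shrinks with n under complete randomization, the sketch below computes an exact binomial probability. The definition of "imbalanced" used here (the larger group holding more than 60% of the sample) is an assumption chosen for illustration, as is the function name.

```python
from fractions import Fraction
from math import comb


def prob_imbalance(n: int, max_share: Fraction = Fraction(3, 5)) -> float:
    """P(larger group holds more than max_share of n units) under fair coin tosses."""
    imbalanced = sum(
        comb(n, k) for k in range(n + 1) if max(k, n - k) > max_share * n
    )
    return imbalanced / 2**n


p_small = prob_imbalance(20)    # small trial
p_large = prob_imbalance(200)   # trial of the size mentioned above
```

Under this assumed threshold, the probability is above 0.2 for n = 20 and below 0.01 for n = 200.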

Restricted Randomization

In unrestricted randomization, during the recruitment process, there is a chance of undesirable differences in the number of subjects assigned to each group

If baseline characteristics of subjects change over time, this periodic imbalance in sample sizes may result in significant differences between treatment and control groups in the distribution of pre-treatment characteristics

These are important concerns in small trials

Restricted randomization procedures control the probability of obtaining an allocation sequence with severely imbalanced sample sizes in the treatment and control groups

The most common form of restricted randomization is blocked randomization or simply blocking

Restricted Randomization: Blocked Randomization

Blocked randomization refers to a randomization procedure that forces (periodic) balance in the number of subjects assigned to each treatment group

The most common form is the permuted-block design, in which the size of each block is decided, and the treatment is randomly assigned within each block
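A permuted-block allocation sequence can be sketched as follows. The function name, the block size of 4, and the seed are assumptions for illustration; the sketch also assumes two arms with equal allocation inside every block.

```python
import random


def permuted_block_sequence(
    n_blocks: int, block_size: int, rng: random.Random
) -> list[str]:
    """Build an allocation sequence that is balanced within every block."""
    if block_size % 2 != 0:
        raise ValueError("this sketch assumes an even block size")
    sequence: list[str] = []
    for _ in range(n_blocks):
        # Each block holds equal numbers of T and C in random order.
        block = ["T"] * (block_size // 2) + ["C"] * (block_size // 2)
        rng.shuffle(block)
        sequence.extend(block)
    return sequence


rng = random.Random(2011)  # arbitrary seed, used only for reproducibility
allocation = permuted_block_sequence(n_blocks=3, block_size=4, rng=rng)
```

Because balance is restored at the end of each block, the two group sizes can never differ by more than half a block at any point during recruitment.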

Blocked randomization is usually combined with stratification to increase statistical power of treatment-control comparisons

A stratified blocked randomization divides experimental units into homogeneous strata (blocks), and then randomly assigns the treatment within blocks

Often in the social sciences, the term blocking is used to refer to a stratified blocked randomization

Stratified Blocked Randomization

Performing a stratified blocked randomization involves incorporating pre-treatment covariates in the design of the experiment

Pre-treatment covariates:

- Variables that predict potential outcomes

- Variables that are determined before treatment is assigned

If $T_i$ is randomly assigned, the distribution of covariates in the treatment group is, in expectation, the same as that in the control group

We can learn about the treatment effect without having to explicitly model the way in which other variables affect the outcome

In randomized experiments, covariates can be safely ignored

However, by blocking on important covariates in the design stage we can improve power in the analysis stage

Stratified Blocked Randomization

Stratified blocked randomization can provide considerable benefits when blocks are defined on the basis of a pre-treatment covariate that is strongly correlated with the outcome of interest

The general procedure is as follows:

- Divide total observations into homogeneous blocks

- Each block has observations with same or similar values of pre-treatment covariates

- Randomly assign treatment to observations within blocks

- If we have J blocks, we run J “mini-experiments”
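The procedure above can be sketched in Python for a single categorical blocking covariate. The function name, the education labels, and the seed are assumptions for illustration; within each block, half of the units (rounded down) are drawn at random for treatment.

```python
import random
from collections import defaultdict


def stratified_block_assignment(
    covariate: list[str], rng: random.Random
) -> list[int]:
    """Randomize treatment separately within each covariate-defined block."""
    blocks: dict[str, list[int]] = defaultdict(list)
    for unit, value in enumerate(covariate):
        blocks[value].append(unit)  # step 1: form homogeneous blocks
    assignment = [0] * len(covariate)
    for units in blocks.values():
        # step 2: one "mini-experiment" per block
        for unit in rng.sample(units, len(units) // 2):
            assignment[unit] = 1
    return assignment


rng = random.Random(2011)  # arbitrary seed, used only for reproducibility
education = ["L", "L", "H", "H", "L", "H", "L", "H"]  # hypothetical covariate
treatment = stratified_block_assignment(education, rng)
```

By construction, each education level contributes the same number of treated and control units, so education cannot be imbalanced across groups.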

Stratified Blocked Randomization

Blocking on the basis of an important pre-treatment covariate has several advantages:

- Eliminates the in outcomes that occurs when covariates are imbalanced, since imbalance in the covariates used for blocking is ruled out by design

- Eliminates correlation between the blocking covariates and treatment assignment; such correlation would otherwise introduce a collinearity penalty when using regression

- Increases the efficiency of estimation and the power of hypothesis tests

- Reduces the required sample size for fixed precision or power

- Forces the researcher to think about which covariates should be included in the analysis before randomization occurs

- As a result, it sets a clear guideline for the analysis stage

Unless sample size is very small, blocking on a covariate that fails to predict experimental outcomes does no harm

Stratified Blocked Randomization: Analysis Stage

Blocking is part of the design stage of the experiment

However, it can, and in most cases it must, be incorporated into the analysis stage

How exactly does blocking affect the analysis stage?

Answer depends on whether the treatment effect varies with the pre-treatment covariate used for blocking

If the covariate has no predictive power over the outcome, blocking can be ignored in the analysis stage – but blocking is unlikely to be used in this case to begin with

If the covariate has strong predictive power over the outcome, the analysis stage must take blocking into consideration

Stratified Blocked Randomization: Analysis Stage

If the covariate has strong predictive power over the outcome, the analysis stage must take blocking into consideration

In this case, the expected intrablock correlation is nonzero

If the intrablock correlation is positive, ignoring blocking in the analysis stage will usually result in a conservative test

This may be an argument for ignoring blocking, but a conservative test results in an unnecessary loss of efficiency and power

Plus, there is always the possibility that ignoring blocks will lead to a non-conservative test – i.e., a test that rejects too much
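One common way to take blocking into account at the analysis stage, shown here only as a sketch, is to compute the difference in means within each block and then average those differences with weights proportional to block size. The function name and the data are assumptions for illustration; the slides do not prescribe a specific estimator.

```python
from collections import defaultdict
from typing import Hashable, Sequence


def blocked_difference_in_means(
    y: Sequence[float], t: Sequence[int], block: Sequence[Hashable]
) -> float:
    """Average within-block differences in means, weighted by block size."""
    members = defaultdict(list)
    for yi, ti, bi in zip(y, t, block):
        members[bi].append((yi, ti))
    n = len(y)
    estimate = 0.0
    for units in members.values():
        treated = [yi for yi, ti in units if ti == 1]
        control = [yi for yi, ti in units if ti == 0]
        if not treated or not control:
            raise ValueError("every block needs treated and control units")
        diff = sum(treated) / len(treated) - sum(control) / len(control)
        estimate += (len(units) / n) * diff
    return estimate


# Hypothetical data: two education blocks with two subjects each.
y_obs = [0.2, 0.1, 0.7, 0.6]
t_obs = [1, 0, 1, 0]
education = ["L", "L", "H", "H"]
blocked_estimate = blocked_difference_in_means(y_obs, t_obs, education)
```

With equal-sized blocks, as in this made-up example, the weights are all equal and the estimate is the simple average of the within-block differences.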

Stratified Blocked Randomization: Pairwise Matching

The simplest particular case of a stratified blocked randomization is a pairwise matching randomization

- There are two treatments

- Blocks are pairs of subjects

- One subject is picked at random in each pair to receive each treatment

- Pairing can be done on the basis of one, two, or many covariates
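A pairwise matching randomization on a single numeric covariate can be sketched as below. Forming pairs by sorting on the covariate and pairing adjacent units is an assumption made for this illustration, as are the function name and data; matching on several covariates would need a distance measure that is not covered here.

```python
import random


def matched_pair_assignment(
    covariate: list[float], rng: random.Random
) -> list[int]:
    """Pair units with adjacent covariate values, then treat one unit per pair."""
    n = len(covariate)
    if n % 2 != 0:
        raise ValueError("this sketch assumes an even number of units")
    order = sorted(range(n), key=lambda i: covariate[i])
    assignment = [0] * n
    for j in range(0, n, 2):
        pair = order[j:j + 2]
        assignment[rng.choice(pair)] = 1  # coin flip within the pair
    return assignment


rng = random.Random(2011)  # arbitrary seed, used only for reproducibility
baseline = [0.1, 0.6, 0.1, 0.6]  # hypothetical pre-treatment covariate
treatment = matched_pair_assignment(baseline, rng)
```

In this example units 0 and 2 form one pair and units 1 and 3 form the other, so exactly one unit in each pair is treated.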

How Blocking Reduces Variability: Example

An experiment may randomly encourage people to get out to vote

- $T_i = 1$ if person i receives a card in the mail encouraging her to vote

- $T_i = 0$ otherwise

The outcome of interest, $Y_i$, is the turnout of person i in the next election

We know from cost theories of voting that education affects people’s propensities to vote

Imagine:

- Two levels of education: low and high

- We create two blocks based on education

- For people with low education, turnout ranges from 0.1 to 0.4

- For people with high education, turnout ranges from 0.4 to 0.7

Variability within each education block is less than the overall variability in the sample

How Blocking Reduces Variability: Exercise

We turn this example into an exercise

We assume that those subjects who are assigned to the treatment condition increase their baseline turnout by 0.1.

Thus, if subject i had baseline turnout of 0.6 and she is assigned to treatment, she will have an observed outcome equal to 0.6 + 0.1 = 0.7.

We assume there are four subjects in our experiment

We compare unrestricted randomization using a random allocation rule that has 2 subjects in the treatment group and 2 subjects in the control group to a pairwise blocking randomization where blocks are homogeneous in education level

How Blocking Reduces Variability: Exercise

Table 1 shows all possible treatment assignments of these four individuals to two education-homogeneous pairs, together with their baseline turnout and their level of education

Table 2 shows all possible treatment assignments of these four individuals with unrestricted randomization (random allocation rule that has 2 treated and 2 controls), together with their baseline turnout and their level of education

How Blocking Reduces Variability: Exercise

Table 1: Possible treatment assignments for pairwise blocking on education

Individual  Education  Baseline Turnout (Y0)  P1  P2  P3  P4
1           L          0.1                    T   T   C   C
2           L          0.1                    C   C   T   T
3           H          0.6                    T   C   T   C
4           H          0.6                    C   T   C   T

How Blocking Reduces Variability: Exercise

Table 2: Possible treatment assignments for unrestricted randomization

Individual  Education  Baseline Turnout  P1  P2  P3  P4  P5  P6
1           L          0.1               T   T   T   C   C   C
2           L          0.1               T   C   C   T   T   C
3           H          0.6               C   T   C   T   C   T
4           H          0.6               C   C   T   C   T   T

How Blocking Reduces Variability: Exercise

The questions you are asked are the following:

Calculate the difference-in-means between the treated and control groups for all possible treatment assignments in both types of randomization, that is, for the allocations in Table 1 and Table 2

Under which randomization scheme do you see more variability in these estimated effects?

Is one randomization better than the other at preventing “worst-case scenario” assignments?

How Blocking Reduces Variability: Answer to Exercise

Table: Possible treatment effects for pairwise blocking on education, assuming constant treatment effect of 0.1

Treatment assignment  Average Treatment Effect
P1                    (0.2 - 0.1)/2 + (0.7 - 0.6)/2 = 0.1
P2                    (0.2 - 0.1)/2 + (0.7 - 0.6)/2 = 0.1
P3                    (0.2 - 0.1)/2 + (0.7 - 0.6)/2 = 0.1
P4                    (0.2 - 0.1)/2 + (0.7 - 0.6)/2 = 0.1

How Blocking Reduces Variability: Answer to Exercise

Table: Possible treatment effects for unrestricted randomization, assuming constant treatment effect of 0.1

Treatment assignment  Average Treatment Effect
P1                    (0.2 + 0.2)/2 - (0.6 + 0.6)/2 = -0.4
P2                    (0.2 + 0.7)/2 - (0.1 + 0.6)/2 = 0.1
P3                    (0.2 + 0.7)/2 - (0.1 + 0.6)/2 = 0.1
P4                    (0.2 + 0.7)/2 - (0.1 + 0.6)/2 = 0.1
P5                    (0.2 + 0.7)/2 - (0.1 + 0.6)/2 = 0.1
P6                    (0.7 + 0.7)/2 - (0.1 + 0.1)/2 = 0.6

How Blocking Reduces Variability: Answer to Exercise

As we can see, pairwise blocking on education levels eliminates, by construction, the possibility of obtaining severe imbalance in education

In unrestricted randomization, this possibility is not ruled out, and 2 out of the 6 possible assignments produce severe imbalance in education levels

Since education is highly correlated with potential outcomes, this imbalance leads to a large error in the estimated average treatment effect under those assignments

This difference between the two designs causes the variability in the estimated average treatment effect to be much larger in the unrestricted randomization than in the pairwise blocking randomization

Indeed, the estimated treatment effect does not vary at all in the pairwise blocking case
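The exercise can be reproduced by enumerating every possible assignment under both designs. The sketch below uses the baseline turnouts and the constant effect of 0.1 from the exercise; the variable and function names are assumptions. Because both pairs have the same size, the overall difference in means computed here equals the average of the within-pair differences.

```python
from itertools import combinations, product
from statistics import mean, pvariance

baseline = [0.1, 0.1, 0.6, 0.6]   # individuals 1-4 (indices 0-3)
effect = 0.1                       # constant treatment effect
pairs = [(0, 1), (2, 3)]           # education-homogeneous pairs: (L, L), (H, H)


def estimate(treated_units) -> float:
    """Difference in means for a given set of treated individuals."""
    treated = set(treated_units)
    y_t = [baseline[i] + effect for i in treated]
    y_c = [baseline[i] for i in range(len(baseline)) if i not in treated]
    return sum(y_t) / len(y_t) - sum(y_c) / len(y_c)


# Pairwise blocking: one treated unit per pair (4 assignments).
blocked = [round(estimate(choice), 10) for choice in product(*pairs)]
# Random allocation rule: any 2 of the 4 units treated (6 assignments).
unrestricted = [round(estimate(c), 10) for c in combinations(range(4), 2)]

blocked_variance = pvariance(blocked)
unrestricted_variance = pvariance(unrestricted)
unrestricted_average = mean(unrestricted)
```

The blocked estimates are all 0.1, so their variance is zero. The unrestricted estimates range from -0.4 to 0.6, yet they average 0.1 across the six assignments, so the large errors arise in individual assignments rather than on average.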

Factorial Designs

When we want to learn about the effect of two or more treatments on some response, we have two options:

- Run separate experiments for each factor

- Use a factorial design in which both treatments or factors are studied in the same experiment

Factorial Designs

A complete factorial design refers to an experiment with the following features:

- Two or more treatments are under study

- Each treatment takes two or more levels

- All combinations of treatments and levels are represented

The treatments of interest are referred to as “factors”

The possible values a treatment or factor may take are referred to as “levels”

Factorial Designs: Example

We want to study the effects of two drugs, A and B, at different dosages of both

Drug A: no drug, low dosage, high dosage

Drug B: no drug, low dosage

Complete factorial design in this case has six experimental groups:

                      Drug A
                      No drug   Low Dosage   High Dosage
Drug B   No drug      Group 1   Group 2      Group 3
         Low Dosage   Group 4   Group 5      Group 6
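The six cells of this complete factorial design can be generated by taking the Cartesian product of the factor levels. The sketch below is illustrative; the dictionary keys and variable names are assumptions, and Drug B is placed first in the product so that the group numbering follows the rows of the table.

```python
from itertools import product

drug_a_levels = ["no drug", "low dosage", "high dosage"]
drug_b_levels = ["no drug", "low dosage"]

# Drug B varies slowest, so groups are numbered row by row as in the table.
groups = {
    f"Group {k}": {"drug_b": b, "drug_a": a}
    for k, (b, a) in enumerate(product(drug_b_levels, drug_a_levels), start=1)
}
```

The number of cells is the product of the numbers of levels, here 3 x 2 = 6, which is what makes the design complete.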

Factorial Designs: Example

We want to study whether an educational program that provides mathematical tutoring for children has an effect on math test scores

We wish to vary both the amount of time students spend in tutoring sessions and the environment in which they receive these sessions:

- Amount of time: 2 hours or 4 hours of tutoring per week

- Tutoring sessions: individual or group

Four possible combinations of treatment conditions

                               Time per week
                               2 hours   4 hours
Type of session   Individual   Group 1   Group 2
                  Group        Group 3   Group 4
