Inference: Fisher's Exact P-Values

Inference: Fisher's Exact p-values
STA 320: Design and Analysis of Causal Studies
Dr. Kari Lock Morgan and Dr. Fan Li
Department of Statistical Science
Duke University
2/4/14

Quiz 1 Grades

> summary(Quiz1)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   8.00   15.00   16.00   15.33   17.00   19.00

Review of Quiz
• Potential outcomes: what would spring GPA be if student tents AND what would it be if student doesn't tent?
• Covariates must be pre-treatment
• Assumptions:
  o SUTVA no interference: Yi not affected by Wj
  o individualistic: Wi not affected by Xj or Yj
  o unconfounded: Wi not affected by Yi or Yj, conditional on X
• Wi can depend on Wj
• Context: don't just recite definitions

Review of Last Class
• Randomizing units to treatments creates balanced treatment groups
• Placebos and blinding are important
• Four types of classical randomized experiments:
  o Bernoulli randomized experiment
  o Completely randomized experiment
  o Stratified randomized experiment
  o Paired randomized experiment

Diet Cola and Calcium
• Does drinking diet cola leach calcium from the body?
• 16 healthy women aged 18-40 were randomly assigned to drink 24 ounces of either diet cola or water
• Their urine was collected for 3 hours, and calcium excreted was measured (in mg)
• Is there a significant difference?

Diet Cola and Calcium

Drink        Calcium Excreted (mg)
Diet cola    50
Diet cola    62
Diet cola    48
Diet cola    55
Diet cola    58
Diet cola    61
Diet cola    58
Diet cola    56
Water        48
Water        46
Water        54
Water        45
Water        53
Water        46
Water        53
Water        48

Test Statistic
• A test statistic, T, can be any function of:
  o the observed outcomes, Yobs
  o the treatment assignment vector, W
  o the covariates, X
• The test statistic must be a scalar (one number)
• Examples:
  o Difference in means
  o Regression coefficients
  o Rank statistics
  o See chapter 5 for a discussion of test statistics

Diet Cola and Calcium
• Difference in sample means between treatment group (diet cola drinkers) and control group (water drinkers)

\[
T^{obs} \equiv \bar{Y}_T^{obs} - \bar{Y}_C^{obs}
= \frac{\sum_{i=1}^{N} W_i Y_i(1)}{N_T} - \frac{\sum_{i=1}^{N} (1 - W_i) Y_i(0)}{N_C}
= 6.875 \text{ mg}
\]

Key Question
• Is a difference of 6.875 mg more extreme than we would have observed, just by random chance, if there were no difference between diet cola and water regarding calcium excretion?
• What types of statistics would we see, just by the random assignment to treatment groups?

p-value
• T: A random variable
• Tobs: the observed test statistic computed in the actual experiment
• The p-value is the probability that T is as extreme as Tobs, if the null is true
• GOAL: Compare Tobs to the distribution of T under the null hypothesis, to see how extreme Tobs is
• SO: Need distribution of T under the null

Sir R.A. Fisher
[Image: Sir R.A. Fisher]

Randomness
• In Fisher's framework, the only randomness is the treatment assignment: W
• The potential outcomes are considered fixed; it is only random which one is observed
• The distribution of T arises from the different possibilities for W
• For a completely randomized experiment, there are N choose NT possibilities for W

Sharp Null Hypothesis
• Fisher's sharp null hypothesis is that there is no treatment effect:
• H0: Yi(0) = Yi(1) for all i
• Note: this null is stronger than the typical hypothesis of equality of the means
• Advantage of Fisher's sharp null: under the null, all potential outcomes "known"!

Diet Cola and Calcium
• There is NO EFFECT of drinking diet cola (as compared to water) regarding calcium excretion
• So, for each person in the study, their amount of calcium excreted would be the same, whether they drank diet cola or water

Sharp Null Hypothesis
• Key point: under the sharp null, the vector Yobs does not change with different W
• Therefore we can compute T exactly under the null for each different W!
• Assignment mechanism completely determines the distribution of T under the null
• (why is this not true without sharp null?)

Randomization Distribution
• The randomization distribution is the distribution of the test statistic, T, assuming the null is true, over all possible assignment vectors, W
• For each possible assignment vector, compute T (keeping Yobs fixed, because we are assuming the null)
• The randomization distribution gives us exactly the distribution of T, assuming the sharp null hypothesis is true
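The randomization distribution described above can be enumerated directly for the diet cola data. The following is a minimal Python sketch, not the course's own code: the variable and function names are illustrative, and reading "as extreme as" as two-sided (|T| >= |Tobs|) is an assumption, since the slides do not say which tail convention they use.

```python
from itertools import combinations

# Calcium excreted (mg), copied from the Diet Cola and Calcium data slide.
diet_cola = [50, 62, 48, 55, 58, 61, 58, 56]
water = [48, 46, 54, 45, 53, 46, 53, 48]

# Under the sharp null, Yobs stays fixed no matter which W is drawn.
y_obs = diet_cola + water
n, n_t = len(y_obs), len(diet_cola)
n_c = n - n_t
total = sum(y_obs)


def diff_in_means(treated_sum):
    """Difference in sample means, given the sum of the treated outcomes."""
    return treated_sum / n_t - (total - treated_sum) / n_c


t_obs = diff_in_means(sum(diet_cola))

# One value of T per way of choosing which n_t of the n units are treated.
rand_dist = [
    diff_in_means(sum(y_obs[i] for i in treated))
    for treated in combinations(range(n), n_t)
]

# Assumption: "as extreme as" is read as two-sided, |T| >= |Tobs|.
tol = 1e-12  # guards against floating-point ties
p_value = sum(abs(t) >= abs(t_obs) - tol for t in rand_dist) / len(rand_dist)

print(f"Tobs = {t_obs} mg")
print(f"assignments enumerated: {len(rand_dist)}")
print(f"two-sided exact p-value: {p_value:.4f}")
```

For a one-sided version, the comparison would be `t >= t_obs - tol` instead. Which convention reproduces the p-value reported on the slides depends on the convention the slides used, which the source does not state.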
Diet Cola and Calcium
• 16 choose 8 = 12,870 different possible assignment vectors
• For each of these, calculate T, the difference in sample means, keeping the values for calcium excretion fixed

Exact p-value
• From the randomization distribution, computing the p-value is straightforward:
• The exact p-value is the proportion of test statistics in the randomization distribution that are as extreme as Tobs
• This is exact because there are no distributional assumptions: we are using the exact distribution of T

Diet Cola and Calcium
p-value = 0.005

Diet Cola and Calcium
• If there were no difference between diet cola and water regarding calcium excretion, only 5/1000 of all randomizations would lead to a difference as extreme as 6.875 mg (the observed difference)
• Drinking diet cola probably does leach calcium from your body!

Notes
• This approach is completely nonparametric: no model specified in terms of a set of unknown parameters
• We don't model the distribution of potential outcomes (they are considered fixed)
• No modeling assumptions or assumptions about the distribution of the potential outcomes

Approximate p-value
• For larger samples, the number of possible assignment vectors (N choose NT) gets very large
• Enumerating every possible assignment vector becomes computationally difficult
• It's often easier to simulate many (10,000? 100,000?) random assignments

Approximate p-value
• Repeatedly randomize units to treatments, and calculate test statistic keeping Yobs fixed
• If the number of simulations is large enough, this randomization distribution will look very much like the exact distribution of T
• Note: estimated p-values will differ slightly from simulation to simulation. This is okay!
• The more simulations, the closer this approximate p-value will be to the exact p-value

Diet Cola and Calcium
StatKey: www.lock5stat.com/statkey
p-value ≈ 0.004

To Do
• Read Ch 5
• Bring laptops to class Wednesday
• HW 2 due next Monday
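The approximate p-value procedure from the slides (repeatedly re-randomize units to treatments and recompute T with Yobs fixed) might be sketched in Python as follows. This is an illustration under stated assumptions, not the StatKey implementation: the function name `approx_p_value`, the default of 10,000 simulations, the seed, and the two-sided reading of "as extreme as" are all choices made here.

```python
import random

# Calcium excreted (mg), copied from the Diet Cola and Calcium data slide.
diet_cola = [50, 62, 48, 55, 58, 61, 58, 56]
water = [48, 46, 54, 45, 53, 46, 53, 48]

y_obs = diet_cola + water  # held fixed under the sharp null
n, n_t = len(y_obs), len(diet_cola)
total = sum(y_obs)


def diff_in_means(treated_sum):
    """Difference in sample means, given the sum of the treated outcomes."""
    return treated_sum / n_t - (total - treated_sum) / (n - n_t)


def approx_p_value(n_sims=10_000, seed=320):
    """Monte Carlo estimate of the randomization p-value (hypothetical helper)."""
    rng = random.Random(seed)  # seeded so that a run can be reproduced
    t_obs = diff_in_means(sum(diet_cola))
    hits = 0
    for _ in range(n_sims):
        # Draw one completely randomized assignment: n_t of the n units treated.
        treated = rng.sample(range(n), n_t)
        t = diff_in_means(sum(y_obs[i] for i in treated))
        # Assumption: two-sided comparison, |T| >= |Tobs|.
        if abs(t) >= abs(t_obs) - 1e-12:
            hits += 1
    return hits / n_sims


p_hat = approx_p_value()
print(f"approximate p-value from 10,000 random assignments: {p_hat:.4f}")
```

Changing `seed` gives slightly different estimates, which is the simulation-to-simulation variation the slides describe; raising `n_sims` shrinks that variation.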

