Hypothesis Testing Using Randomization Distributions
T. Scofield, 10/03/2016


Randomization Distributions in Two-Proportion Settings

By calling our setting a "two-proportion" one, I mean that the data frame has two binary categorical variables: one delineates which of two groups a subject comes from and serves as the explanatory variable, while the other, the response variable, also has just two outcomes.

In the cocaine addiction data, we have an explanatory variable, "treatment", which has three levels: "Desipramine", "Lithium", and "Placebo." We cut that back to two by ignoring one set of patients, perhaps those receiving Desipramine, thereby giving us just two groups to consider. The response variable is "relapsed or not?", which has just two values, "yes" or "no." We focus on the relapsers. Natural hypotheses for a study to see if Lithium helps to decrease the chance of relapse are

H0: pL − pP = 0,    Ha: pL − pP < 0.

The sample proportions among the lithium and placebo groups are p̂L = 18/24 and p̂P = 20/24, giving us test statistic

p̂L − p̂P = 18/24 − 20/24 = −2/24 ≈ −0.083.

Like the study about tapping fingers under the influence of caffeine, this study is an experiment, in which the treatment (Lithium or Placebo) was randomly assigned to patients. When we generate a randomization distribution, we want to be faithful to this process, even as we take the null hypothesis into account. That is, the mental image of dropping slips of paper into two bags, one bag containing the 48 relapse results (38 "yes" and 10 "no") and the other containing the 48 treatments (24 "Lithiums" and 24 "Placebos"), and randomly assigning the latter to the former as we select our randomization sample, achieves both goals.
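As a complement to that mental image, here is a minimal base-R sketch of one "two-bag" randomization sample. It assumes only the counts quoted above (no packages and no names from the actual data frame; `obs_stat` and `rand_stat` are my own names):

```r
# The two "bags": 48 relapse results (38 "yes", 10 "no") and
# 48 treatment labels (24 "Lithium", 24 "Placebo").
relapse <- c(rep("yes", 38), rep("no", 10))
drug    <- c(rep("Lithium", 24), rep("Placebo", 24))

# Observed test statistic: 18/24 - 20/24
obs_stat <- 18/24 - 20/24

# One randomization sample: shuffle the treatment labels (random
# assignment under H0), then recompute the difference in relapse
# ("yes") proportions, Lithium minus Placebo.
shuffled  <- sample(drug)
rand_stat <- mean(relapse[shuffled == "Lithium"] == "yes") -
             mean(relapse[shuffled == "Placebo"] == "yes")
```

Repeating the last two lines many times would build the full randomization distribution.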
Generating a randomization distribution, however, is trickier in RStudio for this situation than in earlier scenarios, primarily because of the work we must do to prepare data for randomization samples. You may well prefer to use StatKey, the software meant to accompany the textbook, over RStudio for cases involving two proportions. I will, however, provide details in RStudio for your perusal. The main difficulty, as indicated above, is preparing data. Here are two approaches.

Approach 1: Recreate the data from scratch

We have done this sort of thing once before, back in Section 2.1. Perhaps you recall the commands.

part1 <- do(6) * data.frame(Drug = "Lithium", Relapse = "no")
part2 <- do(18) * data.frame(Drug = "Lithium", Relapse = "yes")
part3 <- do(4) * data.frame(Drug = "Placebo", Relapse = "no")
part4 <- do(20) * data.frame(Drug = "Placebo", Relapse = "yes")
addictTreatments <- rbind(part1, part2, part3, part4)

Approach 2: Filtering the supplied data frame

It turns out we don't actually need to recreate the data, as it has been supplied to us as part of the Lock5withR package in a data frame called CocaineTreatment. But working with it is not so straightforward as it would at first seem, because this data frame contains all the patients, including those who received the drug called Desipramine. We can select the desired subset by leaving out these subjects:

myFilteredData <- subset(CocaineTreatment, Drug != "Desipramine")

However, there seems to be a lingering "memory" that there were three levels for the Drug variable. You see this, for instance, when you produce a frequency table on Drug:

tally(~Drug, myFilteredData)

## Drug
## Desipramine     Lithium     Placebo
##           0          24          24

While the count of Desipramine patients is 0, we would prefer that our filtered data frame not know Desipramine was part of this study. One way to make it "forget" is to combine the removal of Desipramine patients with the droplevels() command.
myFilteredData <- droplevels(subset(CocaineTreatment, Drug != "Desipramine"))
tally(~Drug, myFilteredData)

## Drug
## Lithium Placebo
##      24      24

Now our Drug variable truly has just two levels in the myFilteredData data frame.

Once data has been prepared . . .

If you carried out the commands above, you now have two data frames, addictTreatments and myFilteredData, which can be used for our analysis. Either will work, but I will use myFilteredData.

head(myFilteredData)

##       Drug Relapse
## 25 Lithium      no
## 26 Lithium     yes
## 27 Lithium     yes
## 28 Lithium     yes
## 29 Lithium     yes
## 30 Lithium      no

We obtain our test statistic from the sample itself:

diff(prop(Relapse ~ Drug, data = myFilteredData))

## no.Placebo
## -0.08333333

As when dealing with the difference of two means (see the example using data from CaffeineTaps in a prior handout), our null hypothesis dictates that the drug received (Lithium vs. Placebo) is not actually a factor, and we should generate many randomization statistics by shuffling values of the explanatory variable. One randomization statistic is obtained with the command

diff(prop(Relapse ~ shuffle(Drug), data = myFilteredData))

## no.Placebo
##  0.1666667

and this may be repeated many times to obtain a randomization distribution:

manyDiffs <- do(5000) * diff(prop(Relapse ~ shuffle(Drug), data = myFilteredData))
head(manyDiffs)

##    no.Placebo
## 1  0.25000000
## 2 -0.16666667
## 3  0.25000000
## 4  0.08333333
## 5 -0.16666667
## 6  0.00000000

The column, containing 5000 randomization statistics, has been given the curious name no.Placebo.
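If you would rather avoid mosaic's do() idiom, the same 5000-statistic distribution can be built with base R's replicate(). This is a sketch that rebuilds the data frame from the counts quoted in the text rather than loading CocaineTreatment; the names dat, one_stat, and rand_dist are mine:

```r
# Rebuild the filtered data from the counts in the text:
# Lithium: 6 "no", 18 "yes"; Placebo: 4 "no", 20 "yes".
dat <- data.frame(
  Drug    = rep(c("Lithium", "Placebo"), each = 24),
  Relapse = c(rep("no", 6), rep("yes", 18), rep("no", 4), rep("yes", 20))
)

# One randomization statistic: shuffle Drug, recompute the
# difference in relapse proportions (Lithium minus Placebo).
one_stat <- function() {
  shuffled <- sample(dat$Drug)
  mean(dat$Relapse[shuffled == "Lithium"] == "yes") -
    mean(dat$Relapse[shuffled == "Placebo"] == "yes")
}

set.seed(1)  # for reproducibility of this sketch
rand_dist <- replicate(5000, one_stat())
p_value   <- mean(rand_dist <= -1/12)  # proportion at or below observed
```

The resulting p_value should land near the handout's 0.35, though the exact figure varies with the seed.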
We may view a histogram and mark the region corresponding to our P-value:

histogram(~no.Placebo, data = manyDiffs, groups = no.Placebo <= -0.083333, width = .1)

[Histogram of the 5000 values of no.Placebo, on a density scale, with bars at or below −0.0833 shaded.]

nrow(subset(manyDiffs, no.Placebo <= -0.083333)) / 5000

## [1] 0.3488

This P-value, here approximately 0.35, represents the probability, in a world where Lithium does not help deter relapse into cocaine addiction, of obtaining a sample with a test statistic (difference in sample proportions) of −0.0833 or below. This P-value is not statistically significant under any of the usual significance levels α = 0.1, 0.05, or 0.01. In fact, such sample statistics would arise about 35% of the time, which makes our sample statistic appear consistent with the null hypothesis. We fail to reject the null hypothesis.

Example: Hypothesis Test for Positive Correlation (NFL Malevolence)

The hypotheses (explained in the text, Section 4.4):

H0: ρ = 0,    Ha: ρ > 0.

The test statistic:

cor(ZPenYds ~ NFL_Malevolence, data = MalevolentUniformsNFL)

## [1] 0.429796

Generation of many randomization statistics:

manyCors <- do(5000) * cor(ZPenYds ~ shuffle(NFL_Malevolence), data = MalevolentUniformsNFL)
head(manyCors)

##           cor
## 1 -0.22396686
## 2 -0.39130305
## 3  0.06329420
## 4  0.11707616
## 5  0.19503326
## 6  0.09328136

histogram(~cor, data = manyCors, groups = cor >= 0.42979)

[Histogram of the 5000 values of cor, with bars at or above 0.42979 shaded.]

The P-value:

nrow(subset(manyCors, cor >= 0.42979)) / 5000

## [1] 0.0108

With significance level α = 0.05, this result is statistically significant, and we would reject the null hypothesis in favor of the alternative, concluding that there is a positive correlation.

Example: Is the mean body temperature really 98.6°?

The hypotheses:

H0: µ = 98.6,    Ha: µ ≠ 98.6.
The test statistic:

mean(~BodyTemp, data = BodyTemp50)

## [1] 98.26

The natural thing would be to simulate the bootstrap distribution for x̄, as when we constructed a confidence interval for the population mean µ:

manyMeans = do(5000) * mean(~BodyTemp, data = resample(BodyTemp50))
head(manyMeans)

##     mean
## 1 98.332
## 2 98.190
## 3 98.250
## 4 98.280
## 5 98.206
## 6 98.280

histogram(~mean, data = manyMeans)

[Histogram of the 5000 bootstrap means, centered near 98.26.]

But this cannot be a proper simulation of the null distribution, as it is not centered at the right place. Its center is about 98.26, the value of our point estimate x̄, not the hypothesized (population) mean of 98.6; this is what happens whenever we bootstrap a mean. Our randomization statistics should not be the same as bootstrap statistics here, but need to be modified so that they are centered on the proposed mean 98.6. The modification can simply be that we add to each of our sample means the difference between the intended center (98.6) and where they were centered above (at the sample mean x̄ = 98.26): that is, we should add 98.6 − 98.26 = 0.34:

manyMeans = do(5000) * (mean(~BodyTemp, data = resample(BodyTemp50)) + 0.34)
names(manyMeans)

## [1] "result"

histogram(~result, data = manyMeans, groups = abs(result - 98.6) >= 0.34)

[Histogram of the 5000 shifted bootstrap means, centered near 98.6, with the few values at distance 0.34 or more from 98.6 shaded in both tails.]

We see this modified statistic has a randomization distribution centered where it ought to be if it is to serve as the null distribution. We have attempted to shade those regions in both tails corresponding to randomization statistics at least as extreme as ours, though there are very few. We obtain the approximate P-value by calculating the area in one tail and doubling it:

nrow(subset(manyMeans, result <= 98.26)) * 2 / 5000

## [1] 0.002

Given this small P-value, we reject the null hypothesis and conclude that the actual (population) mean body temperature is something other than 98.6.
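The recentering trick generalizes beyond this data set. Here is a base-R sketch of the same shift-to-null bootstrap test; temps is a made-up stand-in for the BodyTemp50 variable, so the numbers it produces will not match the handout's:

```r
set.seed(2)
temps <- rnorm(50, mean = 98.26, sd = 0.7)  # stand-in for BodyTemp50

mu0   <- 98.6          # hypothesized mean
xbar  <- mean(temps)   # sample mean (the observed test statistic)
shift <- mu0 - xbar    # amount that recenters the bootstrap at mu0

# Bootstrap means, each shifted so the simulated null distribution
# is centered at the hypothesized mean rather than at xbar.
null_means <- replicate(5000, mean(sample(temps, replace = TRUE)) + shift)

# Two-tailed P-value: the lower tail's proportion, doubled.
p_value <- 2 * mean(null_means <= xbar)
```

The key design choice is that resampling happens from the data as observed, while the shift alone imposes the null hypothesis.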
Example 4.34: A New Wrinkle on Finger Tapping and Caffeine

This example has already been done adequately. Since it was a controlled, randomized experiment in which one treatment, either caffeine or placebo, was assigned randomly to each subject, we obtained our randomization distribution in a manner that also randomly assigned treatment values while adhering to the null hypothesis that "treatment doesn't matter." We obtained one randomization statistic with the command

diff(mean(Taps ~ shuffle(Caffeine), data = CaffeineTaps))

and an entire distribution of such statistics by repeating this command often. Example 4.34 challenges us to imagine different ways of studying the question: "Does caffeine increase tapping rates?" Surely there are other approaches besides a controlled randomized experiment.
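For completeness, the shuffle in the command above can also be mimicked in base R. The vectors taps and caffeine below are hypothetical stand-ins for the CaffeineTaps variables, not the actual data:

```r
# Hypothetical stand-ins for CaffeineTaps (10 subjects per group).
taps     <- c(246, 248, 250, 252, 248, 250, 246, 248, 245, 244,
              242, 248, 247, 244, 246, 242, 248, 244, 245, 246)
caffeine <- rep(c("Caffeine", "Placebo"), each = 10)

# One randomization statistic: shuffle the treatment labels, then
# take the difference of the two group mean tapping rates.
shuffled  <- sample(caffeine)
rand_stat <- diff(tapply(taps, shuffled, mean))
```

As before, wrapping the last two lines in replicate() would produce the full randomization distribution.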