Chapter 10 – Lecture 10


Internal Validity – Control through Experimental Design
1) Tests the effects of the IV on the DV
2) Protects against threats to internal validity
Causation

Experimental Design
• Highest constraint
• Comparisons between groups
• Random sampling
• Random assignment
• Infer causality

Experimental Design (5 characteristics)
1) One or more hypotheses
2) Includes at least 2 “levels” of the IV
3) Random assignment
4) Procedures for testing the hypothesis
5) Controls for major threats to internal validity

Experimental Design
• Develop the problem statement
• Define IV & DV
• Develop research hypothesis
• Identify a population of interest
• Random sampling & random assignment
• Specify procedures (methods)
• Anticipate threats to validity
• Create controls
• Specify statistical tests
• Ethical considerations
Clear Experimental Design…

Experimental Design: 2 sources of variance
1. Between-groups variance (systematic): no drug vs. drug
2. Within-groups variance (nonsystematic; error variance)
Remember… sampling error
Significant differences: variability between means is larger than expected on the basis of sampling error alone (due to chance alone)

Variance
Need it! Without it… no go.

VARIANCE: “Partitioning of the variance”
Between Group: Experimental Variance (due to your treatment) + Extraneous Variance (confounds etc.)
Within Group: Error Variance (not due to treatment – chance)
[Figure: control (CON) and treatment (TX) groups of subjects]

Variance: Important for the statistical analysis
$F = \frac{\text{between-groups variance}}{\text{within-groups variance}}$
$F = \frac{\text{systematic effects} + \text{error variance}}{\text{error variance}}$
F = 1.00: no differences between groups

Variance
Your experiment should be designed to
• Maximize experimental variance
• Control extraneous variance
• Minimize error variance

Maximize “Experimental” Variance
• At least 2 levels of the IV (do the IVs really vary?)
• Manipulation check: make sure the levels (experimental conditions) differ from each other
Ex: anxiety levels (low anxiety / high anxiety) and performance on a math task; check the manipulation with an anxiety scale

Control “Extraneous” Variance
1. Experimental & control groups are similar to begin with
2. Within-subjects design (carryover effects??)
3. If need be, limit the population of interest
4. Make the extraneous variable an IV (age, sex, socioeconomic status) = factorial design

Factorial design (2 IVs)
               M        F
Lo Anxiety     M-low    F-low
Hi Anxiety     M-hi     F-hi

YOUR Proposals

Control through Design – Don’ts
1. Ex post facto
2. Single-group, posttest only
3. Single-group pretest-posttest
4. Pretest-posttest natural control group

1. Ex Post Facto – “after the fact”
Group A    Naturally occurring event    Measurement
No manipulation

2. Single-group posttest only
Group A    TX    Posttest

3. Single-group pretest-posttest
Group A    Pretest    TX    Posttest
Compare: pretest vs. posttest

4. Pretest-Posttest Naturalistic Control Group
Group A    Pretest    TX       Posttest
Group B    Pretest    no TX    Posttest
Compare: Group A vs. Group B (naturally occurring groups)

Control through Design – Do’s – Experimental Design
• Manipulate the IV
• Control group
• Randomization

4 Basic Designs Testing One IV
1. Randomized posttest only, control group
2. Randomized pretest-posttest, control group
3. Multilevel completely randomized between groups
4. Solomon’s four-group

Randomized Posttest Only – Control Group (most basic experimental design)
R    Group A (Ex)     TX       Posttest
R    Group B (Con)    no TX    Posttest
Compare: Group A vs. Group B

Randomized, Pretest-Posttest, Control Group Design
R    Group A (Ex)     Pretest    TX       Posttest
R    Group B (Con)    Pretest    no TX    Posttest
Compare: Group A vs. Group B

Multilevel, Completely Randomized Between-Subjects Design (more than 2 levels of the IV)
R    Group A    Pretest    TX1    Posttest
R    Group B    Pretest    TX2    Posttest
R    Group C    Pretest    TX3    Posttest
R    Group D    Pretest    TX4    Posttest
Compare: across groups

Solomon’s Four-Group Design (extension of the multilevel between-subjects design)
R    Group A    Pretest    TX      Posttest
R    Group B    Pretest    ----    Posttest
R    Group C    -------    TX      Posttest
R    Group D    -------    ----    Posttest
Compare: across groups
Powerful design!

What stats do you use to analyze experimental designs?
It depends on the level of measurement.

Test difference between groups
Nominal data (frequency/categorical)    chi-square
Ordered data                            Mann-Whitney U test
Interval or ratio data                  t-test / ANOVA (F test)

t-Test: Compare 2 groups
• Independent samples (between subjects): evaluate differences between 2 independent groups
• One sample (within): evaluate differences between two conditions in a single group

Assumptions to use the t-Test
1. The test variable (DV) is normally distributed in each of the 2 groups
2. The variances of the normally distributed test variable are equal – homogeneity of variance
3. Random assignment to groups

t-distribution
Represents the distribution of t that would be obtained if a value of t were calculated for each sample mean for all possible random samples of a given size from some population

Degrees of freedom (df)
• When we use samples, we approximate means & SDs to represent the true population
• Sample variability (SS = sum of squared deviations) tends to underestimate population variability
• A restriction is placed: we make up for this mathematically by using n - 1 in the denominator
$s^2 = \frac{SS}{df} = \frac{\sum (x - \bar{x})^2}{n - 1}$
where $s^2$ = variance, SS = sum of squares, df = degrees of freedom

Degrees of freedom (df): n - 1
The number of values (scores) that are free to vary given mathematical restrictions on a sample of observed values used to estimate some unknown population = the price we pay for sampling

Degrees of freedom (df): n - 1
Number of scores free to vary
Data set: you know the mean (the mean is used to compute the variance); n = 2 with a mean of 6
X
8
?
Mean = 6
In order to get a mean of 6 with an n of 2, you need a sum of 12, so the second score must be 4. The second score is restricted by the sample mean (this score is not free to vary).

Group Statistics (ENDURANC)
DRUG       N     Mean      Std. Deviation    Std. Error Mean
doped      10    7.9000    1.1972            .3786
no dope    10    2.6000    1.2649            .4000

Independent Samples Test (ENDURANC)
Levene's Test for Equality of Variances (equal variances assumed): F = .065, Sig. = .801
t-test for Equality of Means:
                               t        df        Sig. (2-tailed)    Mean Difference    Std. Error Difference    95% CI Lower    95% CI Upper
Equal variances assumed        9.623    18        .000               5.3000             .5508                    4.1429          6.4571
Equal variances not assumed    9.623    17.946    .000               5.3000             .5508                    4.1427          6.4573

ANOVA (ENDURANC)
                  Sum of Squares    df    Mean Square    F         Sig.
Between Groups    140.450           1     140.450        92.604    .000
Within Groups     27.300            18    1.517
Total             167.750           19

Analysis of Variance (ANOVA)
• Two or more groups (can be used on two groups: $t^2 = F$)
• Variance is calculated more than once because of varying levels (combinations of differences)
• Several sources of variance (partitioning the variance): SS-between, SS-within, SS-total
• Sum of squares: sum of squared deviations from the mean

Assumptions to use ANOVA
1. The test variable (DV) is normally distributed
2. The variances of the normally distributed test variable are equal – homogeneity of variance
3. Random assignment to groups

$F = \frac{\text{between-groups variance}}{\text{within-groups variance}}$
$F = \frac{\text{systematic effects} + \text{error variance}}{\text{error variance}}$
F = 1.00: no differences between groups
F = 21.50: about 22 times as much variance between the groups as we would expect by chance

After the Omnibus F… Planned comparisons & post hoc tests
• Planned comparisons – a priori (SPSS: contrast): part of your hypothesis; the prediction is made before data are collected
• Post hoc tests – a posteriori: not quite sure where differences will occur

Why not just do t-tests?
2 types of error (alpha) that you must consider when doing post hoc analysis:
1. Per-comparison error (PC)
2. Familywise error (FW)
Multiple t-tests inflate alpha!
$FW = c(\alpha)$, where c = number of comparisons made and $\alpha$ = your PC

Ex: IV with 5 conditions
1 vs 2    1 vs 3    1 vs 4    1 vs 5
2 vs 3    2 vs 4    2 vs 5
3 vs 4    3 vs 5
4 vs 5
FW = c(α) = 10(0.05) = .50
HSD
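The degrees-of-freedom example in the lecture (n = 2, mean of 6, first score of 8) can be worked through in a few lines. This is only a sketch: the variable names are made up for illustration, and the comparison against Python's `statistics.variance` (which divides by n - 1) is an added check, not something from the slides.

```python
import statistics

# Lecture example: n = 2 scores with a known mean of 6; the first score is 8.
n = 2
known_mean = 6
first_score = 8

# Once the mean is fixed, the last score is no longer free to vary.
required_sum = n * known_mean
second_score = required_sum - first_score

scores = [first_score, second_score]

# Sum of squares: squared deviations of each score from the sample mean.
ss = sum((x - known_mean) ** 2 for x in scores)

# Sample variance divides SS by df = n - 1 rather than by n.
df = n - 1
sample_variance = ss / df
variance_dividing_by_n = ss / n  # smaller: tends to underestimate population variability

print("second score:", second_score)
print("SS:", ss, "df:", df, "s^2:", sample_variance)
print("SS / n for comparison:", variance_dividing_by_n)
print("matches statistics.variance:", sample_variance == statistics.variance(scores))
```

With only one score free to vary, dividing by n - 1 doubles the variance estimate relative to dividing by n, which is the "price we pay for sampling" in its most extreme form.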
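As a check on the partitioning of variance, the t and F values in the SPSS output can be approximately reproduced from the Group Statistics table alone. The sketch below assumes equal group sizes as reported (N = 10 per group) and uses the rounded standard deviations from the table, so results should agree with SPSS only to about two decimals. Variable names are illustrative.

```python
import math

# Summary statistics copied from the Group Statistics table (ENDURANC by DRUG).
n1, mean1, sd1 = 10, 7.9, 1.1972   # doped
n2, mean2, sd2 = 10, 2.6, 1.2649   # no dope

# Within-groups (error) variance: squared deviations around each group's own mean.
ss_within = (n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
df_within = (n1 - 1) + (n2 - 1)
ms_within = ss_within / df_within

# Between-groups (systematic) variance: group means around the grand mean.
grand_mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)
ss_between = n1 * (mean1 - grand_mean) ** 2 + n2 * (mean2 - grand_mean) ** 2
df_between = 2 - 1  # number of groups minus 1
ms_between = ss_between / df_between

# F = between-groups variance / within-groups variance.
f_ratio = ms_between / ms_within

# Independent-samples t with pooled variance (equal variances assumed).
se_diff = math.sqrt(ms_within * (1 / n1 + 1 / n2))
t_value = (mean1 - mean2) / se_diff

print(f"SS between = {ss_between:.3f}, SS within = {ss_within:.3f}")
print(f"F({df_between}, {df_within}) = {f_ratio:.2f}")
print(f"t({df_within}) = {t_value:.2f}, t squared = {t_value ** 2:.2f}")
```

With two groups the pooled-variance t and the one-way F carry the same information, which is why the lecture notes that t squared equals F.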
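The familywise error example (5 conditions, 10 pairwise comparisons) can be sketched as follows. The lecture's FW = c(α) is an upper bound; the second formula, 1 - (1 - α)^c, is an addition not found in the slides and assumes the comparisons are independent, which pairwise tests on the same groups are not strictly.

```python
from itertools import combinations

alpha_pc = 0.05                # per-comparison alpha (PC)
conditions = [1, 2, 3, 4, 5]   # five levels of the IV, as in the lecture example

# Every pairwise comparison: 1 vs 2, 1 vs 3, ..., 4 vs 5.
pairs = list(combinations(conditions, 2))
c = len(pairs)

# Lecture formula: FW = c(alpha), an upper bound on the familywise error rate.
fw_upper_bound = c * alpha_pc

# Added for comparison (assumes independent tests): chance of at least one Type I error.
fw_if_independent = 1 - (1 - alpha_pc) ** c

for a, b in pairs:
    print(f"{a} vs {b}")
print(f"c = {c}")
print(f"FW upper bound = {fw_upper_bound:.2f}")
print(f"FW if tests were independent = {fw_if_independent:.3f}")
```

Either way, running all ten t-tests at .05 pushes the chance of at least one false positive far above .05, which is the motivation for a procedure such as HSD.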