Experimental Design
UNIVERSITY OF BERGEN
Experimental design
Sebastian Jentschke

Agenda
• experiments and causal inference
• validity
• internal validity
• statistical conclusion validity
• construct validity
• external validity
• tradeoffs and priorities

Experiments and causal inference

Some definitions
a hypothesis often concerns a cause-effect relationship

Scientific revolution
• discovery of America → French Revolution: Renaissance and Enlightenment
  - Copernicus, Galilei, Newton
• empiricism: use observation to correct errors in theory
• scientific experimentation: taking a deliberate action [manipulation, vary something] followed by systematic observation of what occurred afterwards [effect]
  controlling extraneous influences that might limit or bias observation: random assignment, control groups
• mathematization, institutionalization

Causal relationships
Definitions and some philosophy:
• causal relationships are recognized intuitively by most people in their daily lives
• Locke: "A cause is that which makes any other thing, either simple idea, substance or mode, begin to be; and an effect is that, which had its beginning from some other thing" (1975, p. 325)
• John Stuart Mill: a causal relationship exists if (a) the cause preceded the effect, (b) the cause was related to the effect, (c) we can find no plausible alternative explanation for the effect other than the cause.
Experiments: (a) manipulate the presumed cause, (b) assess whether variation in the cause is related to variation in the effect, (c) use various methods to reduce the plausibility of other explanations for the effect.
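The three steps of an experiment listed above can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions: the function name `run_experiment`, the baseline scores, the noise levels and the true effect of 2.0 are all hypothetical and do not come from the slides.

```python
import random
import statistics


def run_experiment(n_units=2000, true_effect=2.0, seed=0):
    """Simulate a two-group randomized experiment (all values hypothetical)."""
    rng = random.Random(seed)

    # Pre-existing differences between units (assumed baseline scores).
    baseline = [rng.gauss(10.0, 3.0) for _ in range(n_units)]

    # (a) Manipulate the presumed cause: randomly assign half of the
    #     units to the treatment, the rest to the control group.
    indices = list(range(n_units))
    rng.shuffle(indices)
    treated = set(indices[: n_units // 2])

    # Outcome = baseline + treatment effect (treated units only) + noise.
    outcomes = [
        baseline[i] + (true_effect if i in treated else 0.0) + rng.gauss(0.0, 1.0)
        for i in range(n_units)
    ]

    def group_mean(values, in_treatment):
        return statistics.mean(
            [v for i, v in enumerate(values) if (i in treated) == in_treatment]
        )

    # (b) Assess covariation between cause and effect:
    #     difference in mean outcome between the two groups.
    estimated_effect = group_mean(outcomes, True) - group_mean(outcomes, False)

    # (c) Reduce the plausibility of alternative explanations: with random
    #     assignment the groups should have similar baseline scores.
    baseline_gap = group_mean(baseline, True) - group_mean(baseline, False)

    return {
        "n_treated": len(treated),
        "estimated_effect": estimated_effect,
        "baseline_gap": baseline_gap,
    }


if __name__ == "__main__":
    print(run_experiment())
```

With a reasonably large number of units, the estimated effect should lie near the assumed true effect while the baseline gap stays close to zero, which is what makes pre-existing group differences an implausible explanation for the outcome difference.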
Non-experimental methods (e.g., correlation analyses) have weaknesses with (a), because it is unclear which variable came first, and with (c), because they can't rule out alternative explanations (third moderating variable); they therefore can't provide evidence for causation
• Popper: regarding (c), falsificationist logic: confirmation is often difficult (because we might not observe all instances) → one disconfirming instance is sufficient to falsify the hypothesis / conclusion → speak of "providing evidence" rather than "proving"

Causal relationships
• cause: constellation; many factors are usually required, and we rarely know all of them and how they relate (e.g., psychotherapy)
• inus condition (an insufficient but non-redundant part of an unnecessary but sufficient condition)
  insufficient: a match alone cannot start a fire → non-redundant part: it must be added to other fire-promoting factors (oxygen, dry leaves) → unnecessary: there might be other sets of conditions → sufficient: together, the conditions are enough to start a fire
• causes must be manipulable to be used in experiments; non-manipulable causes can still be studied and provide evidence (observe an effect and search for its cause)

Causal relationships
• effect: Hume, counterfactual model
  experiment: we observe what did happen after people got treatment, but we don't know what would have happened (counterfactual) if they had not received treatment
  effect: difference between what did happen and what would have happened; but: we cannot observe the counterfactual and need a reasonable approximation (e.g., treatment and control group)
• two central tasks of experimental design: (a) creating a high-quality (but necessarily imperfect) source of counterfactual inference; (b) understanding how this source differs from the treatment condition

Causal relationships
• experiments can provide a causal description – describing the consequences attributable to deliberately varying a treatment – but are less suited to provide a causal explanation –
clarifying mechanisms through which and the conditions under which a causal relationship holds
• analogy to molar (as a whole) and molecular (decomposed into parts) causation: causal description = describe the bivariate relationship between molar treatment and molar outcome; causal explanation = breaking molar causes into molecular parts to determine what causes the change (drug vs. placebo: decomposing medication effects and verbal interaction / social support)
• no clear dichotomy between causal description and explanation
• causal explanation is not always required for practical solutions

Components of experiments
• control of treatment → manipulating (one or more) independent variables to observe the effect on (one or more) dependent variables; caveat: observations / measurements are not theory-neutral (what is measured and how is influenced by, e.g., the researcher's theoretical assumptions, available measures, etc.)
• experiment: random assignment of the experimental units → creates two groups that are probabilistically similar to each other → outcome differences are likely due to the treatment, not to already existing group differences
• quasi-experiments: share most features of an experiment (e.g., control group, pretest) but lack random assignment; the cause is manipulable and occurs before the effect, but there is less compelling support for counterfactual inference (the control group may differ even though many alternative explanations are controlled for; solution: assess pre- and post-test scores and assess whether they vary in line with the hypothesized cause)
• natural experiments: naturally occurring difference between treatment and comparison
• non-experimental designs: correlational / passive observational design → identify presumed cause and effect without structural features of experiments (randomization, control group)

Experiments and generalizability
• experiments: strength – illuminate causal inference
vs. weakness – how far does that causal relationship generalize?
• highly localized and particularistic: restricted range of settings, theory-laden measures, convenience samples, conducted at a particular point in time vs. derived theories: abstract constructs with broad conceptual applicability, population
  construct validity: how well does the research operation represent the underlying theoretical / abstract construct?
• Cronbach (1982): decomposing experiments into units / persons, treatments, observations / outcomes and settings (UTOS)
  external validity: does the causal relationship hold over variations in persons, treatments, observations and settings?
• random selection as solution? persons (but requires a clearly delineated population and the opportunity to sample from it; self-selection remains a problem); treatments (conflicts with "optimal" treatment); outcomes (multi-method); settings (prototypical vs. heterogeneous instances)

Validity

Definition
• (approximate) truth of an inference (i.e., a property of the inference, not of design, methods, etc.) → judgement about the extent to which the empirical evidence supports this inference → always an approximation: no method guarantees the validity of an inference
• philosophical theories of truth:
  (1) correspondence: a claim is true if it corresponds to the world → gathering data to assess how well knowledge claims match the world
  (2) coherence: a claim is true if it belongs to a coherent set of claims → must cohere with existing knowledge; scepticism if new knowledge contradicts established knowledge
  (3) pragmatism: a claim is true if it is useful to believe it → assigns meaning or permits predictability; convince others to use it
• correspondence: empirical evidence → abstract inference
• various degrees and types / aspects of validity: use of a method may affect more than one type of validity simultaneously (e.g., internal vs.
external validity) → we may not anticipate all consequences

Validity typology
1) How reliable and large is the covariation between presumed cause and effect?
2) Is the covariation causal, or would the same covariation have been obtained without or with another treatment?
3) How well do the persons, treatments, observations and settings reflect the underlying general constructs?
4) How generalizable is the locally embedded causal relationship over varied persons, treatments, observations and settings?

Threats to validity
• threats can be identified conceptually or empirically; empirically based threats change over time, and the likelihood of occurrence varies across contexts
• lists of validity threats have a heuristic function: they help anticipate likely criticism of the inferences → minimize the number of threats and the plausibility of their occurrence through (1) design controls (e.g., randomization) and (2) statistical controls
• explore role and influence of threats: (1) How would the threat apply? (2) Is the threat plausible (not just possible)? (3) Does it operate in the same direction as the observed effect (confound)?

Internal validity

Definition
• does the observed covariation (between cause and effect) reflect a causal relationship? (1) cause must precede effect, (2) cause and effect must covary, (3) there is no plausible alternative explanation for the relationship
• internal validity as local molar causal validity: local (limited to particular treatments, outcomes, settings and persons), molar (treatment as a complex package, e.g. psychotherapy)

Threats
• generally, randomization works well (except for differential attrition by treatment group, or when the treatment groups require different testing procedures)
• identifying and quantifying possible threats (and statistically controlling for them)

Statistical conclusion validity