Principles of Experimental Design Measured and Included in an Analysis and Those Not Measured, May Influence the Response Variable of Interest
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Removing Hidden Confounding by Experimental Grounding
Removing Hidden Confounding by Experimental Grounding Nathan Kallus Aahlad Manas Puli Uri Shalit Cornell University and Cornell Tech New York University Technion New York, NY New York, NY Haifa, Israel [email protected] [email protected] [email protected] Abstract Observational data is increasingly used as a means for making individual-level causal predictions and intervention recommendations. The foremost challenge of causal inference from observational data is hidden confounding, whose presence cannot be tested in data and can invalidate any causal conclusion. Experimental data does not suffer from confounding but is usually limited in both scope and scale. We introduce a novel method of using limited experimental data to correct the hidden confounding in causal effect models trained on larger observational data, even if the observational data does not fully overlap with the experimental data. Our method makes strictly weaker assumptions than existing approaches, and we prove conditions under which it yields a consistent estimator. We demonstrate our method’s efficacy using real-world data from a large educational experiment. 1 Introduction In domains such as healthcare, education, and marketing there is growing interest in using observa- tional data to draw causal conclusions about individual-level effects; for example, using electronic healthcare records to determine which patients should get what treatments, using school records to optimize educational policy interventions, or using past advertising campaign data to refine targeting and maximize lift. Observational datasets, due to their often very large number of samples and exhaustive scope (many measured covariates) in comparison to experimental datasets, offer a unique opportunity to uncover fine-grained effects that may apply to many target populations. -
Latin Square Design and Incomplete Block Design Latin Square (LS) Design Latin Square (LS) Design
Lecture 8 Latin Square Design and Incomplete Block Design Latin Square (LS) design Latin square (LS) design • It is a kind of complete block designs. • A class of experimental designs that allow for two sources of blocking. • Can be constructed for any number of treatments, but there is a cost. If there are t treatments, then t2 experimental units will be required. Latin square design • If you can block on two (perpendicular) sources of variation (rows x columns) you can reduce experimental error when compared to the RBD • More restrictive than the RBD • The total number of plots is the square of the number of treatments • Each treatment appears once and only once in each row and column A B C D B C D A C D A B D A B C Facts about the LS Design • With the Latin Square design you are able to control variation in two directions. • Treatments are arranged in rows and columns • Each row contains every treatment. • Each column contains every treatment. • The most common sizes of LS are 5x5 to 8x8 Advantages • You can control variation in two directions. • Hopefully you increase efficiency as compared to the RBD. Disadvantages • The number of treatments must equal the number of replicates. • The experimental error is likely to increase with the size of the square. • Small squares have very few degrees of freedom for experimental error. • You can’t evaluate interactions between: • Rows and columns • Rows and treatments • Columns and treatments. Examples of Uses of the Latin Square Design • 1. Field trials in which the experimental error has two fertility gradients running perpendicular each other or has a unidirectional fertility gradient but also has residual effects from previous trials. -
Confounding Bias, Part I
ERIC NOTEBOOK SERIES Second Edition Confounding Bias, Part I Confounding is one type of Second Edition Authors: Confounding is also a form a systematic error that can occur in bias. Confounding is a bias because it Lorraine K. Alexander, DrPH epidemiologic studies. Other types of can result in a distortion in the systematic error such as information measure of association between an Brettania Lopes, MPH bias or selection bias are discussed exposure and health outcome. in other ERIC notebook issues. Kristen Ricchetti-Masterson, MSPH Confounding may be present in any Confounding is an important concept Karin B. Yeatts, PhD, MS study design (i.e., cohort, case-control, in epidemiology, because, if present, observational, ecological), primarily it can cause an over- or under- because it's not a result of the study estimate of the observed association design. However, of all study designs, between exposure and health ecological studies are the most outcome. The distortion introduced susceptible to confounding, because it by a confounding factor can be large, is more difficult to control for and it can even change the apparent confounders at the aggregate level of direction of an effect. However, data. In all other cases, as long as unlike selection and information there are available data on potential bias, it can be adjusted for in the confounders, they can be adjusted for analysis. during analysis. What is confounding? Confounding should be of concern Confounding is the distortion of the under the following conditions: association between an exposure 1. Evaluating an exposure-health and health outcome by an outcome association. -
Beyond Controlling for Confounding: Design Strategies to Avoid Selection Bias and Improve Efficiency in Observational Studies
September 27, 2018 Beyond Controlling for Confounding: Design Strategies to Avoid Selection Bias and Improve Efficiency in Observational Studies A Case Study of Screening Colonoscopy Our Team Xabier Garcia de Albeniz Martinez +34 93.3624732 [email protected] Bradley Layton 919.541.8885 [email protected] 2 Poll Questions Poll question 1 Ice breaker: Which of the following study designs is best to evaluate the causal effect of a medical intervention? Cross-sectional study Case series Case-control study Prospective cohort study Randomized, controlled clinical trial 3 We All Trust RCTs… Why? • Obvious reasons – No confusion (i.e., exchangeability) • Not so obvious reasons – Exposure represented at all levels of potential confounders (i.e., positivity) – Therapeutic intervention is well defined (i.e., consistency) – … And, because of the alignment of eligibility, exposure assignment and the start of follow-up (we’ll soon see why is this important) ? RCT = randomized clinical trial. 4 Poll Questions We just used the C-word: “causal” effect Poll question 2 Which of the following is true? In pharmacoepidemiology, we work to ensure that drugs are effective and safe for the population In pharmacoepidemiology, we want to know if a drug causes an undesired toxicity Causal inference from observational data can be questionable, but being explicit about the causal goal and the validity conditions help inform a scientific discussion All of the above 5 In Pharmacoepidemiology, We Try to Infer Causes • These are good times to be an epidemiologist -
Study Designs in Biomedical Research
STUDY DESIGNS IN BIOMEDICAL RESEARCH INTRODUCTION TO CLINICAL TRIALS SOME TERMINOLOGIES Research Designs: Methods for data collection Clinical Studies: Class of all scientific approaches to evaluate Disease Prevention, Diagnostics, and Treatments. Clinical Trials: Subset of clinical studies that evaluates Investigational Drugs; they are in prospective/longitudinal form (the basic nature of trials is prospective). TYPICAL CLINICAL TRIAL Study Initiation Study Termination No subjects enrolled after π1 0 π1 π2 Enrollment Period, e.g. Follow-up Period, e.g. three (3) years two (2) years OPERATION: Patients come sequentially; each is enrolled & randomized to receive one of two or several treatments, and followed for varying amount of time- between π1 & π2 In clinical trials, investigators apply an “intervention” and observe the effect on outcomes. The major advantage is the ability to demonstrate causality; in particular: (1) random assigning subjects to intervention helps to reduce or eliminate the influence of confounders, and (2) blinding its administration helps to reduce or eliminate the effect of biases from ascertainment of the outcome. Clinical Trials form a subset of cohort studies but not all cohort studies are clinical trials because not every research question is amenable to the clinical trial design. For example: (1) By ethical reasons, we cannot assign subjects to smoking in a trial in order to learn about its harmful effects, or (2) It is not feasible to study whether drug treatment of high LDL-cholesterol in children will prevent heart attacks many decades later. In addition, clinical trials are generally expensive, time consuming, address narrow clinical questions, and sometimes expose participants to potential harm. -
Design of Engineering Experiments Blocking & Confounding in the 2K
Design of Engineering Experiments Blocking & Confounding in the 2 k • Text reference, Chapter 7 • Blocking is a technique for dealing with controllable nuisance variables • Two cases are considered – Replicated designs – Unreplicated designs Chapter 7 Design & Analysis of Experiments 1 8E 2012 Montgomery Chapter 7 Design & Analysis of Experiments 2 8E 2012 Montgomery Blocking a Replicated Design • This is the same scenario discussed previously in Chapter 5 • If there are n replicates of the design, then each replicate is a block • Each replicate is run in one of the blocks (time periods, batches of raw material, etc.) • Runs within the block are randomized Chapter 7 Design & Analysis of Experiments 3 8E 2012 Montgomery Blocking a Replicated Design Consider the example from Section 6-2 (next slide); k = 2 factors, n = 3 replicates This is the “usual” method for calculating a block 3 B2 y 2 sum of squares =i − ... SS Blocks ∑ i=1 4 12 = 6.50 Chapter 7 Design & Analysis of Experiments 4 8E 2012 Montgomery 6-2: The Simplest Case: The 22 Chemical Process Example (1) (a) (b) (ab) A = reactant concentration, B = catalyst amount, y = recovery ANOVA for the Blocked Design Page 305 Chapter 7 Design & Analysis of Experiments 6 8E 2012 Montgomery Confounding in Blocks • Confounding is a design technique for arranging a complete factorial experiment in blocks, where the block size is smaller than the number of treatment combinations in one replicate. • Now consider the unreplicated case • Clearly the previous discussion does not apply, since there -
Chapter 5 Experiments, Good And
Chapter 5 Experiments, Good and Bad Point of both observational studies and designed experiments is to identify variable or set of variables, called explanatory variables, which are thought to predict outcome or response variable. Confounding between explanatory variables occurs when two or more explanatory variables are not separated and so it is not clear how much each explanatory variable contributes in prediction of response variable. Lurking variable is explanatory variable not considered in study but confounded with one or more explanatory variables in study. Confounding with lurking variables effectively reduced in randomized comparative experiments where subjects are assigned to treatments at random. Confounding with a (only one at a time) lurking variable reduced in observational studies by controlling for it by comparing matched groups. Consequently, experiments much more effec- tive than observed studies at detecting which explanatory variables cause differences in response. In both cases, statistically significant observed differences in average responses implies differences are \real", did not occur by chance alone. Exercise 5.1 (Experiments, Good and Bad) 1. Randomized comparative experiment: effect of temperature on mice rate of oxy- gen consumption. For example, mice rate of oxygen consumption 10.3 mL/sec when subjected to 10o F. temperature (Fo) 0 10 20 30 ROC (mL/sec) 9.7 10.3 11.2 14.0 (a) Explanatory variable considered in study is (choose one) i. temperature ii. rate of oxygen consumption iii. mice iv. mouse weight 25 26 Chapter 5. Experiments, Good and Bad (ATTENDANCE 3) (b) Response is (choose one) i. temperature ii. rate of oxygen consumption iii. -
Latin Squares in Experimental Design
Latin Squares in Experimental Design Lei Gao Michigan State University December 10, 2005 Abstract: For the past three decades, Latin Squares techniques have been widely used in many statistical applications. Much effort has been devoted to Latin Square Design. In this paper, I introduce the mathematical properties of Latin squares and the application of Latin squares in experimental design. Some examples and SAS codes are provided that illustrates these methods. Work done in partial fulfillment of the requirements of Michigan State University MTH 880 advised by Professor J. Hall. 1 Index Index ............................................................................................................................... 2 1. Introduction................................................................................................................. 3 1.1 Latin square........................................................................................................... 3 1.2 Orthogonal array representation ........................................................................... 3 1.3 Equivalence classes of Latin squares.................................................................... 3 2. Latin Square Design.................................................................................................... 4 2.1 Latin square design ............................................................................................... 4 2.2 Pros and cons of Latin square design................................................................... -
The Impact of Other Factors: Confounding, Mediation, and Effect Modification Amy Yang
The Impact of Other Factors: Confounding, Mediation, and Effect Modification Amy Yang Senior Statistical Analyst Biostatistics Collaboration Center Oct. 14 2016 BCC: Biostatistics Collaboration Center Who We Are Leah J. Welty, PhD Joan S. Chmiel, PhD Jody D. Ciolino, PhD Kwang-Youn A. Kim, PhD Assoc. Professor Professor Asst. Professor Asst. Professor BCC Director Alfred W. Rademaker, PhD Masha Kocherginsky, PhD Mary J. Kwasny, ScD Julia Lee, PhD, MPH Professor Assoc. Professor Assoc. Professor Assoc. Professor Not Pictured: 1. David A. Aaby, MS Senior Stat. Analyst 2. Tameka L. Brannon Financial | Research Hannah L. Palac, MS Gerald W. Rouleau, MS Amy Yang, MS Administrator Senior Stat. Analyst Stat. Analyst Senior Stat. Analyst Biostatistics Collaboration Center |680 N. Lake Shore Drive, Suite 1400 |Chicago, IL 60611 BCC: Biostatistics Collaboration Center What We Do Our mission is to support FSM investigators in the conduct of high-quality, innovative health-related research by providing expertise in biostatistics, statistical programming, and data management. BCC: Biostatistics Collaboration Center How We Do It Statistical support for Cancer-related projects or The BCC recommends Lurie Children’s should be requesting grant triaged through their support at least 6 -8 available resources. weeks before submission deadline We provide: Study Design BCC faculty serve as Co- YES Investigators; analysts Analysis Plan serve as Biostatisticians. Power Sample Size Are you writing a Recharge Model grant? (hourly rate) Short Short or long term NO collaboration? Long Subscription Model Every investigator is (salary support) provided a FREE initial consultation of up to 2 hours with BCC faculty of staff BCC: Biostatistics Collaboration Center How can you contact us? • Request an Appointment - http://www.feinberg.northwestern.edu/sites/bcc/contact-us/request- form.html • General Inquiries - [email protected] - 312.503.2288 • Visit Our Website - http://www.feinberg.northwestern.edu/sites/bcc/index.html Biostatistics Collaboration Center |680 N. -
Chapter 30 Latin Square and Related Designs
Chapter 30 Latin Square and Related Designs We look at latin square and related designs. 30.1 Basic Elements Exercise 30.1 (Basic Elements) 1. Di®erent arrangements of same data. Nine patients are subjected to three di®erent drugs (A, B, C). Two blocks, each with three levels, are used: age (1: below 25, 2: 25 to 35, 3: above 35) and health (1: poor, 2: fair, 3: good). The following two arrangements of drug responses for age/health/drugs, health, i: 1 2 3 age, j: 1 2 3 1 2 3 1 2 3 drugs, k = 1 (A): 69 92 44 k = 2 (B): 80 47 65 k = 3 (C): 40 91 63 health # age! j = 1 j = 2 j = 3 i = 1 69 (A) 80 (B) 40 (C) i = 2 91 (C) 92 (A) 47 (B) i = 3 65 (B) 63 (C) 44 (A) are (choose one) the same / di®erent data sets. This is a latin square design and is an example of an incomplete block design. Health and age are two blocks for the treatment, drug. 2. How is a latin square created? True / False Drug treatment A not only appears in all three columns (age block) and in all three rows (health block) but also only appears once in every row and column. This is also true of the other two treatments (B, C). 261 262 Chapter 30. Latin Square and Related Designs (ATTENDANCE 12) 3. Other latin squares. Which are latin squares? Choose none, one or more. (a) latin square candidate 1 health # age! j = 1 j = 2 j = 3 i = 1 69 (A) 80 (B) 40 (C) i = 2 91 (B) 92 (C) 47 (A) i = 3 65 (C) 63 (A) 44 (B) (b) latin square candidate 2 health # age! j = 1 j = 2 j = 3 i = 1 69 (A) 80 (C) 40 (B) i = 2 91 (B) 92 (A) 47 (C) i = 3 65 (C) 63 (B) 44 (A) (c) latin square candidate 3 health # age! j = 1 j = 2 j = 3 i = 1 69 (B) 80 (A) 40 (C) i = 2 91 (C) 92 (B) 47 (B) i = 3 65 (A) 63 (C) 44 (A) In fact, there are twelve (12) latin squares when r = 3. -
Internal Validity Page 1
Internal Validity page 1 Internal Validity If a study has internal validity, then you can be confident that the dependent variable was caused by the independent variable and not some other variable. Thus, a study with high internal validity permits conclusions of cause and effect. The major threat to internal validity is a confounding variable: a variable other than the independent variable that 1) often co-varies with the independent variable and 2) may be an alternative cause of the dependent variable. For example, if you want to investigate whether smoking cigarettes causes cancer, you will need to somehow control for other possible causes of cancer that tend to vary with smoking. Each of these other possible causes is a potential confounding variable. If people who smoke more also tend to experience greater stress, and stress is known to contribute to cancer, then stress could be a confounding variable. Not every variable that tends to co-vary with the independent variable is an equally plausible confounding variable. People who smoke may also strike more matches, but if striking matches is not plausibly linked to cancer in any way, then it is generally not worth considering as a serious threat to internal validity. Random assignment. The most common procedure to reduce the risk of confounding variables (and thus increase internal validity) is random assignment of participants to levels of the independent variable. This means that each participant in a study has an equal probability of being assigned to any of the levels of the independent variable. For the smoking study, random assignment would require some participants to be randomly assigned to smoke more than others. -
Counting Latin Squares
Counting Latin Squares Jeffrey Beyerl August 24, 2009 Jeffrey Beyerl Counting Latin Squares Write a program, which given n will enumerate all Latin Squares of order n. Does the structure of your program suggest a formula for the number of Latin Squares of size n? If it does, use the formula to calculate the number of Latin Squares for n = 6, 7, 8, and 9. Motivation On the Summer 2009 computational prelim there were the following questions: Jeffrey Beyerl Counting Latin Squares Does the structure of your program suggest a formula for the number of Latin Squares of size n? If it does, use the formula to calculate the number of Latin Squares for n = 6, 7, 8, and 9. Motivation On the Summer 2009 computational prelim there were the following questions: Write a program, which given n will enumerate all Latin Squares of order n. Jeffrey Beyerl Counting Latin Squares Motivation On the Summer 2009 computational prelim there were the following questions: Write a program, which given n will enumerate all Latin Squares of order n. Does the structure of your program suggest a formula for the number of Latin Squares of size n? If it does, use the formula to calculate the number of Latin Squares for n = 6, 7, 8, and 9. Jeffrey Beyerl Counting Latin Squares Latin Squares Definition A Latin Square is an n × n table with entries from the set f1; 2; 3; :::; ng such that no column nor row has a repeated value. Jeffrey Beyerl Counting Latin Squares Sudoku Puzzles are 9 × 9 Latin Squares with some additional constraints.