Statistical Experiment Design for Animal Research
Total Page:16
File Type:pdf, Size:1020Kb
Statistical experiment design for animal research C.O.S. Sorzano Natl. Center of Biotechnology (CSIC) Madrid, Spain and M. Parkinson School of Biotechnology, Dublin City University Dublin, Ireland September 6, 2019 2 Contents 1 Why do we need a statistical experiment design?7 1.1 Pilot, exploratory and confirmatory experiments............ 14 1.2 Independence between individuals: experimental units........ 17 1.3 Avoiding bias: blocking, randomization and blinding......... 20 1.4 Reducing variance: variable and population selection, experimental conditions, averaging, and blocking.................. 27 1.4.1 Variable selection........................ 28 1.4.2 Population selection...................... 28 1.4.3 Experimental conditions.................... 29 1.4.4 Population scope, outliers and lack of independence..... 30 1.4.5 Averaging and pooling..................... 34 1.4.6 Blocking............................ 38 1.4.7 Paired samples......................... 41 1.4.8 Blocking and randomization.................. 42 1.5 Automating decision making: hypothesis testing............ 43 1.5.1 An intuitive introduction to hypothesis testing......... 46 1.5.2 Statistical power and confidence................ 49 1.5.3 Multiple testing......................... 53 1.5.4 A worked example....................... 55 1.6 A primer in sample size calculations.................. 60 2 Sample size calculations 69 2.1 Sample size for the mean........................ 70 2.1.1 Hypothesis test on the mean of one sample when the variance is known............................ 70 2.1.2 Hypothesis test on the mean of one sample when the variance is unknown........................... 71 2.1.3 Confidence interval for the mean................ 73 2.1.4 Hypothesis test on the mean for paired samples........ 74 2.1.5 Hypothesis test on the difference of the mean of two samples. 76 2.1.6 Hypothesis test on the mean of several groups (ANOVA)... 77 2.1.7 Unequal group sizes...................... 80 2.1.8 Hypothesis test on the equivalence of two means....... 81 2.2 Sample size for proportions....................... 84 3 4 CONTENTS 2.2.1 Hypothesis test on one small proportion............ 84 2.2.2 Confidence interval for one proportion............. 88 2.2.3 Hypothesis test on one proportion............... 90 2.2.4 Confidence interval for the difference of two proportions... 92 2.2.5 Hypothesis test on the difference of two proportions..... 94 2.2.6 Hypothesis test on the difference of two paired proportions.. 95 2.2.7 Hypothesis test on the difference of multiple proportions... 97 2.2.8 Hypothesis test on the equivalence of one proportion..... 98 2.2.9 Hypothesis test on the equivalence of two proportions.... 100 2.3 Sample size for regression....................... 101 2.3.1 Linear regression: Confidence interval on the slope of a regres- sion coefficient......................... 103 2.3.2 Linear regression: Hypothesis test on the regression coefficients 107 2.3.3 Logistic regression: Hypothesis test on the regression coefficients108 2.3.4 Cox regression: Hypothesis test on the regression coefficients 112 2.3.5 Poisson regression: Hypothesis test on the regression coefficients117 2.4 Sample size for Poisson counts..................... 122 2.4.1 Hypothesis test for a single population............. 123 2.4.2 Hypothesis test for two populations.............. 124 2.5 Sample size for the variance...................... 125 2.5.1 Confidence interval for the standard deviation......... 126 2.5.2 Hypothesis test for one variance................ 127 2.5.3 Hypothesis test for two variances............... 128 2.6 Sample size for correlations....................... 129 2.6.1 Confidence interval for correlation............... 129 2.6.2 Hypothesis test on one sample correlation........... 130 2.6.3 Hypothesis test for the correlations in two samples...... 131 2.6.4 Hypothesis test for multiple correlation in one sample.... 132 2.6.5 Confidence interval for the Intraclass Correlation (ICC).... 133 2.6.6 Hypothesis test for Cohen’s k ................. 136 2.7 Sample size for survival analysis.................... 138 2.7.1 Confidence interval for the mean survival time........ 138 2.7.2 Confidence interval for survival time percentile........ 142 2.7.3 Confidence interval for survival rate.............. 146 2.7.4 Hypothesis test for one sample mean survival time...... 148 2.7.5 Hypothesis test for one sample survival rate.......... 151 2.7.6 Hypothesis test for two samples with exponential survival.. 153 2.7.7 Hypothesis test for two samples with log-rank test...... 154 2.8 Sample size for pilot experiments.................... 155 2.8.1 Pilot experiments for the variance and mean.......... 156 2.8.2 Pilot experiments for proportions............... 157 2.9 Adaptive sample size.......................... 158 2.9.1 Hypothesis test for the superiority of a proportion....... 159 2.9.2 Hypothesis test on the difference of the mean of two samples. 162 2.9.3 Hypothesis test on the difference of two proportions..... 165 2.9.4 Sample size reestimation in blind experiments......... 167 CONTENTS 5 2.9.5 Hypothesis test on the difference of two survival curves.... 167 3 Design of experiments 169 3.1 Basic designs.............................. 169 3.1.1 Completely randomized design, CRD............. 169 3.1.2 Regression design....................... 176 3.1.3 Randomized block design, RBD................ 179 3.1.4 Use of covariates........................ 186 3.1.5 Linear models, sample size and replications.......... 190 3.1.6 Factorial designs, FD...................... 193 3.1.7 Non-orthogonal, incomplete and imbalanced designs..... 203 3.2 Advanced designs............................ 219 3.2.1 Latin squares.......................... 219 3.2.2 Graeco-Latin squares...................... 222 3.2.3 Cross-over designs....................... 222 3.2.4 2k Factorial designs....................... 227 3.2.5 2k Fractional factorial designs................. 232 3.2.6 Split-unit designs........................ 238 3.2.7 Hierarchical or nested designs................. 240 3.2.8 Fixed vs. random and mixed effects.............. 243 3.2.9 Loop designs.......................... 248 3.2.10 Response surface designs.................... 250 3.2.11 Mixture designs......................... 257 3.3 Design selection guide......................... 259 3.4 Sample size for designed experiments................. 263 3.4.1 Sample size for completely randomized and randomized block designs............................. 263 3.4.2 Sample size for factorial designs................ 267 4 Statistical pitfalls 271 4.1 Probability pitfalls........................... 271 4.2 Data analysis pitfalls.......................... 275 4.3 Test selection guide........................... 293 6 CONTENTS Chapter 1 Why do we need a statistical experiment design? Animal research is crucial for biomedical advances because animal models often show higher discrimination than many other experimental alternatives and have the neces- sary fidelity which may be required (Russell WMS, Burch RL. 1959. The principles of humane experimental technique. Wheathampstead (UK): Universities Federation for Animal Welfare.). Although, results with animals cannot be directly extrapolated to hu- mans (Leist and Hartung, 2013), they provide key insight and clues about the possible behaviour of drugs and treatments in other species like ours. The European directive 2010/63/EU proposes the 3Rs (Replacement, Reduction, and Refinement) as an ethical approach to animal research, being conscious of benefits of animal experiments and the harm infringed to them. Replacement addresses the substitution of animals by other non-sentient experi- mental entities (cell cultures, invertebrates, or mathematical models). For instance, it has been shown that lethal doses are better extrapolated from human cell cultures to human subjects than from animals to human subjects (Ekwall et al, 1998). Refinement refers to the way in which the experiment is conducted so that it is performed in as humane a manner as possible (distress, pain and harm are reduced; social and intel- ligent animals enjoy an enriched environment). The second R, reduction, is the topic of these chapters. We should aim at performing experiments with as few animals as possible consistent with achieving the aims of the experiment, which is typically to produce reproducible, publishable, statistically significant results. This reduction goal is best achieved by a thorough and detailed statistical design of the experiment, in- cluding consideration of spatial/time/treatment organization, and a calculation of the appropriate number of animals required for the experiment based on power calculation and credibility for publication. Before carrying out the experiment we should carefully design the analysis methodology so that we plan in advance the way the data will be analyzed thus preventing avoidable surprises. 7 8 CHAPTER 1. WHY DO WE NEED A STATISTICAL EXPERIMENT DESIGN? How to read these chapters The four statistical chapters of this book cover: 1) an overview of the problem and the main statistical concepts involved; 2) a calculation of the sample size; 3) a statistical plan to analyze the data that has a direct impact on the layout of the experiment; 4) some of the most common statistical pitfalls. Chapters 1 and 4 provide a very useful insight into the problem of animal experimentation and its careful design from a statistical point of view. We recommend their full reading. Chapters 2 and 3 are more reference chapters. They cover many different