Completely Randomized Designs (CRD) One-Way ANOVA


Lukas Meier, Seminar für Statistik

Example: Meat Storage Study (Kuehl, 2000, Example 2.1)
- A researcher wants to investigate the effect of packaging on bacterial growth of stored meat.
- Some studies suggested controlled gas atmospheres as alternatives to existing packaging.
- Different treatments (= packaging types):
  - Current techniques (control groups): commercial plastic wrap (ambient air); vacuum package.
  - New techniques: 1% CO, 40% O2, 59% N; 100% CO2.
- Experimental units: 12 beef steaks (ca. 75 g).
- Measure the effectiveness of the packaging types by how successful they are in suppressing bacterial growth.

Example: Meat Storage Study
- Three beef steaks were randomly assigned to each of the packaging conditions. Each steak was packaged separately in its assigned condition.
- Response: (logarithm of the) number of bacteria per square centimeter.
- The number of bacteria was measured after nine days of storage at 4 degrees Celsius in a standard meat storage facility.

First Step (Always): Exploratory Data Analysis
- If there are very few observations: plot all data points.
- With more observations: use boxplots (side-by-side).
- Alternatively: violin plots, side-by-side histograms, ...
- See examples in R: 02_meat_storage.R
- Such plots typically give you the same information as a formal analysis (or even more); see later.

Side Remark: Factors
- Categorical variables are also called factors. The different values of a factor are called levels.
- Factors can be nominal or ordinal (ordered):
  - Hair color: {black, blond, ...} (nominal)
  - Gender: {male, female} (nominal)
  - Treatment: {commercial, vacuum, mixed, CO2} (nominal)
  - Income: {<50k, 50-100k, >100k} (ordinal)
- Useful functions in R: factor, as.factor, levels

Completely Randomized Design: Formal Setup
- Compare $g$ treatments.
- Available resources: $N$ experimental units.
- We need to assign the $N$ experimental units to the $g$ different treatments (groups) having $n_i$ observations each, $i = 1, \ldots, g$. Of course, $n_1 + n_2 + \cdots + n_g = N$.
- Use randomization: choose $n_1$ units at random to get treatment 1, $n_2$ units at random to get treatment 2, and so on.
- This randomization produces a so-called completely randomized design (CRD).

Setting up the Model
- We need to set up a model in order to do statistical inference.
- Good message: the problem looks rather easy.
- Bad message: some complications lie ahead regarding the parametrization.

Remember: Two-Sample t-Test for Unpaired Data
- Model:
  - $X_i$ i.i.d. $\sim N(\mu_X, \sigma^2)$, $i = 1, \ldots, n$
  - $Y_j$ i.i.d. $\sim N(\mu_Y, \sigma^2)$, $j = 1, \ldots, m$
  - $X_i$, $Y_j$ independent
- t-Test:
  - $H_0: \mu_X = \mu_Y$
  - $H_A: \mu_X \neq \mu_Y$ (or one-sided)
  - $T = \dfrac{\bar{X}_n - \bar{Y}_m}{S_{pool}\sqrt{\tfrac{1}{n} + \tfrac{1}{m}}} \sim t_{n+m-2}$ under $H_0$
- This allows us to test or construct confidence intervals for the true (unknown) difference $\mu_X - \mu_Y$.
- Note: both groups have their "individual" mean, but they share a common variance (this can be extended to other situations).

From Two to More Groups
- In the meat storage example we had 4 groups. Hence, the t-test is not directly applicable.
- We could try to construct something using only pairs of groups (e.g., doing all pairwise comparisons). We will do so later.
- For now we want to expand the model that we used for the two-sample t-test to the more general situation of $g > 2$ groups.
- As we might run out of letters, we use a common letter (say $Y$) for all groups and put the grouping and replication information in the index.
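The random assignment described in the formal setup is easy to generate in R. The following is a minimal sketch for the meat storage example (it is not the course script 02_meat_storage.R; the object names and the seed are chosen here only for illustration):

```r
## Completely randomized design: assign 12 steaks to 4 packaging
## treatments with n_i = 3 steaks each (illustrative sketch).
set.seed(1)                                   # only to make the example reproducible
treatments <- c("Commercial", "Vacuum", "Mixed", "CO2")
plan <- data.frame(
  steak     = 1:12,                                       # experimental units
  treatment = factor(sample(rep(treatments, each = 3)))   # random allocation = CRD
)
plan

## Once the responses y (log bacteria counts) are measured, a first
## exploratory look would be, e.g.:
## boxplot(y ~ treatment, data = plan)
```

Each steak receives exactly one treatment, and which steak receives which treatment is determined purely by the random permutation; this is the defining feature of a CRD.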
Cell Means Model
- We need two indices to distinguish between the different treatments (groups) and the different observations. Let $Y_{ij}$ be the $j$th observation in the $i$th treatment group, $i = 1, \ldots, g$; $j = 1, \ldots, n_i$.
- Cell means model: every group (treatment) has its own mean value, i.e.
  $Y_{ij} \sim N(\mu_i, \sigma^2)$, independent,
  where $i$ indexes the group and $j$ the observation within the group.
- This is also called the separate means model.
- Note: the variance is constant across groups (as for the standard two-sample t-test!).

Illustration of Cell Means Model
- See R code: 02_model_illustration.R
- Or visit https://gallery.shinyapps.io/anova_shiny_rstudio/
- Why "cell" means? Have a look at the meat storage data; each column is a cell:

  Commercial  Vacuum  Mixed  CO2
  7.66        5.26    7.41   3.51
  6.98        5.44    7.33   2.91
  7.80        5.80    7.04   3.66

Cell Means Model: Alternative Representation
- We can "extract" the deterministic part in $Y_{ij} \sim N(\mu_i, \sigma^2)$. This leads to
  $Y_{ij} = \mu_i + \epsilon_{ij}$ with $\epsilon_{ij}$ i.i.d. $\sim N(0, \sigma^2)$.
- The $\epsilon_{ij}$'s are random "errors" that fluctuate around zero.
- In the regression context: $Y$ is the response and treatment is a categorical predictor (a factor).
- Hence, this is nothing else than a regression model with a categorical predictor.

Yet Another Representation (!)
- We can also write $\mu_i = \mu + \alpha_i$, $i = 1, \ldots, g$.
- E.g., think of $\mu$ as a "global mean" and of $\alpha_i$ as the corresponding deviation from the global mean; $\alpha_i$ is also called the $i$th treatment effect.
- This looks like a needless complication now, but it will be very useful later (with factorial treatment structure).
- Unfortunately, this model is not identifiable anymore. Reason: $g + 1$ parameters $(\mu, \alpha_1, \ldots, \alpha_g)$ for $g$ different means...

Ensuring Identifiability
- We need a side constraint; many options are available:
  - Sum of the treatment effects is zero, i.e. $\alpha_g = -(\alpha_1 + \cdots + \alpha_{g-1})$ (R: contr.sum).
  - Sum of weighted treatment effects is zero: ... (R: do manually).
  - Set $\mu = \mu_1$, hence $\alpha_1 = 0$, $\alpha_2 = \mu_2 - \mu_1$, $\alpha_3 = \mu_3 - \mu_1$, ..., i.e. a comparison with group 1 as reference level (R: contr.treatment).
- Only $g - 1$ elements of the treatment effects are allowed to vary freely. We also say that the treatment effect has $g - 1$ degrees of freedom (df).

Encoding Scheme of Factors
- The encoding scheme (i.e., the side constraint being used) of a factor is called a contrast in R.
- To summarize: we have a total of $g$ parameters $\mu, \alpha_1, \ldots, \alpha_{g-1}$ to parametrize the $g$ group means $\mu_1, \ldots, \mu_g$.
- The interpretation of the parameters $\mu, \alpha_1, \ldots, \alpha_{g-1}$ strongly depends on the parametrization that is being used.
- We will re-discover the word "contrast" in a different way later...

Parameter Estimation
- Choose parameter estimates $\hat{\mu}, \hat{\alpha}_1, \ldots, \hat{\alpha}_{g-1}$ such that the model fits the data "well".
- Criterion: choose the parameter estimates such that
  $\sum_{i=1}^{g} \sum_{j=1}^{n_i} (y_{ij} - \hat{\mu} - \hat{\alpha}_i)^2$
  is minimal (the so-called least squares criterion, exactly as in regression).
- The estimated cell means are simply $\hat{\mu}_i = \hat{\mu} + \hat{\alpha}_i$.

Illustration of Goodness of Fit
- See blackboard (incl. the definition of a residual).

Some Notation

  Symbol                  Meaning                        Formula
  $y_{i\cdot}$            sum of all values in group i   $y_{i\cdot} = \sum_{j=1}^{n_i} y_{ij}$
  $\bar{y}_{i\cdot}$      sample average in group i      $\bar{y}_{i\cdot} = \frac{1}{n_i} \sum_{j=1}^{n_i} y_{ij} = \frac{1}{n_i} y_{i\cdot}$
  $y_{\cdot\cdot}$        sum of all observations        $y_{\cdot\cdot} = \sum_{i=1}^{g} \sum_{j=1}^{n_i} y_{ij}$
  $\bar{y}_{\cdot\cdot}$  grand mean                     $\bar{y}_{\cdot\cdot} = y_{\cdot\cdot} / N$

- Rule: if we replace an index with a dot ("·"), it means that we are summing up values over that index.
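The parametrizations discussed above can be examined directly in R. Below is a minimal sketch using the meat storage values from the table above (assuming the three numbers per column are the $n_i = 3$ replicates of that packaging type; the object names are arbitrary and this is not the course code):

```r
## Cell means model for the meat storage data, fitted under two different
## side constraints (contrasts). Data values taken from the table above.
y <- c(7.66, 6.98, 7.80,   # Commercial
       5.26, 5.44, 5.80,   # Vacuum
       7.41, 7.33, 7.04,   # Mixed
       3.51, 2.91, 3.66)   # CO2
treatment <- factor(rep(c("Commercial", "Vacuum", "Mixed", "CO2"), each = 3),
                    levels = c("Commercial", "Vacuum", "Mixed", "CO2"))

## contr.treatment (R default): mu = mu_1, alpha_1 = 0, alpha_i = mu_i - mu_1
fit.treat <- lm(y ~ treatment, contrasts = list(treatment = "contr.treatment"))
coef(fit.treat)

## contr.sum: (unweighted) treatment effects sum to zero
fit.sum <- lm(y ~ treatment, contrasts = list(treatment = "contr.sum"))
coef(fit.sum)

## The estimated cell means mu_i-hat are identical under both parametrizations:
predict(fit.treat, newdata = data.frame(treatment = levels(treatment)))
```

The coefficients differ between the two fits because they answer different questions (deviation from the reference group vs. deviation from an overall mean), but the fitted group means, and hence the residuals, are exactly the same.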
Parameter Estimates, the Other Way Round
- "Obviously", the $\hat{\mu}_i$'s that minimize the least squares criterion are $\hat{\mu}_i = \bar{y}_{i\cdot}$.
- This means: the expectation of group $i$ is estimated with the sample mean of group $i$.
- The $\hat{\alpha}_i$'s are then simply estimated by applying the corresponding parametrization, i.e.
  $\hat{\alpha}_i = \hat{\mu}_i - \hat{\mu} = \bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}$.
- The fitted values $\hat{\mu}_i$ (and the residuals) are independent of the parametrization, but the $\hat{\alpha}_i$'s (heavily) depend on it!

Parameter Estimation
- We denote the residual (or error) sum of squares by
  $SS_E = \sum_{i=1}^{g} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i\cdot})^2$.
- The estimator for $\sigma^2$ is $MS_E$, the mean squared error, i.e.
  $\hat{\sigma}^2 = MS_E = \frac{1}{N-g} SS_E = \frac{1}{N-g} \sum_{i=1}^{g} (n_i - 1) s_i^2$,
  where $s_i^2$ is the empirical variance in group $i$.
- This is an unbiased estimator for $\sigma^2$ (the reason for $N - g$ instead of $N$ in the denominator).
- We also say that the error estimate has $N - g$ degrees of freedom ($N$ observations, $g$ parameters), or $N - g = \sum_{i=1}^{g} (n_i - 1)$.

Estimation Accuracy
- Standard errors for the parameters (using the constraint that the weighted treatment effects sum to zero):

  Parameter                              Estimator                                  Standard Error
  $\mu$                                  $\bar{y}_{\cdot\cdot}$                     $\sigma / \sqrt{N}$
  $\mu_i$                                $\bar{y}_{i\cdot}$                         $\sigma / \sqrt{n_i}$
  $\alpha_i$                             $\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}$  $\sigma \sqrt{\tfrac{1}{n_i} - \tfrac{1}{N}}$
  $\mu_i - \mu_j = \alpha_i - \alpha_j$  $\bar{y}_{i\cdot} - \bar{y}_{j\cdot}$      $\sigma \sqrt{\tfrac{1}{n_i} + \tfrac{1}{n_j}}$

- Therefore, a 95% confidence interval for $\alpha_i$ is given by
  $\hat{\alpha}_i \pm t_{N-g}^{0.975} \cdot \hat{\sigma} \sqrt{\tfrac{1}{n_i} - \tfrac{1}{N}}$,
  where $t_{N-g}^{0.975}$ is the 97.5% quantile of the $t_{N-g}$ distribution; the $N - g$ degrees of freedom are those of $MS_E$.

Single Mean Model
- Extending the null hypothesis of the t-test to the situation where $g > 2$, we can (for example) use the (very strong) null hypothesis that the treatment has no effect on the response.
- In such a setting, all values (also across different treatments) fluctuate around the same "global" mean $\mu$.
- The model reduces to: $Y_{ij}$ i.i.d. $\sim N(\mu, \sigma^2)$.
- Or equivalently: $Y_{ij} = \mu + \epsilon_{ij}$, $\epsilon_{ij}$ i.i.d. $\sim N(0, \sigma^2)$.
- This is the single mean model.

Comparison of Models
- Note: the models are "nested"; the single mean model is a special case of the cell means model. Or: the cell means model is more flexible than the single mean model.
- Which one to choose? Let a statistical test decide.
- [Illustration: fitted cell means model vs. fitted single mean model]

Analysis of Variance (ANOVA)
- Classical approach: decompose the "variability" of the response into different "sources" and compare them.
- More modern view: compare (nested) models.
- In both approaches: use a statistical test with the global null hypothesis
  $H_0: \mu_1 = \mu_2 = \cdots = \mu_g$
  versus the alternative
  $H_A: \mu_k \neq \mu_l$ for at least one pair $k \neq l$.
- $H_0$ says that the single mean model is adequate.
- $H_0$ is equivalent to $\alpha_1 = \alpha_2 = \cdots = \alpha_g = 0$.
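As a complement to the formulas above, here is a minimal R sketch (same assumed meat storage data as before, not the course code) that carries out the comparison of the single mean and cell means models and extracts the error variance estimate and confidence intervals:

```r
## Compare the single mean model with the cell means model and extract
## sigma^2-hat = MS_E (assumed meat storage data from the table above).
y <- c(7.66, 6.98, 7.80, 5.26, 5.44, 5.80, 7.41, 7.33, 7.04, 3.51, 2.91, 3.66)
treatment <- factor(rep(c("Commercial", "Vacuum", "Mixed", "CO2"), each = 3))

fit.single <- lm(y ~ 1)          # single mean model
fit.cell   <- lm(y ~ treatment)  # cell means model

anova(fit.single, fit.cell)      # global F-test of H0: mu_1 = ... = mu_g
summary(aov(y ~ treatment))      # the classical one-way ANOVA table (equivalent)

sigma(fit.cell)^2                # sigma^2-hat = MS_E, with N - g degrees of freedom
confint(fit.cell)                # 95% CIs based on the t_{N-g} distribution
```

Rejecting the global null hypothesis in the F-test corresponds to preferring the cell means model over the single mean model; which specific groups differ is the topic of the pairwise comparisons mentioned earlier.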