3. Hypothesis Testing and Analysis of Variance with R: Parametric and Non-Parametric Tests. Probability and Statistics with R. Copyright © 2016

Tomás Goicoa, Department of Statistics and Operations Research, Public University of Navarre
Ana F. Militino, Department of Statistics and Operations Research, Public University of Navarre

Máster Universitario en Salud Pública, UPNA

Outline: Introduction; Hypothesis Tests for $\mu$; Hypothesis Tests for $\mu_X - \mu_Y$; Hypothesis Tests for $\pi$; Analysis of variance

Hypothesis Testing

- Introduction
- Hypothesis Test for $\mu$ and $\mu_1 - \mu_2$ (parametric and non-parametric)
- Hypothesis Test for $\pi$
- Analysis of variance (parametric and non-parametric)

Introduction

A hypothesis test is a decision criterion for choosing between two complementary hypotheses. The null hypothesis, $H_0$, is assumed to be true prior to conducting the test. It is compared with a second hypothesis, called the alternative hypothesis and denoted $H_1$. The alternative hypothesis is often called the research hypothesis, since it specifies the theory, or what is believed to be true about the parameter.

Table 1: Form of hypothesis test

Null hypothesis               Alternative hypothesis               Type of alternative
$H_0: \theta = \theta_0$      (A) $H_1: \theta < \theta_0$         lower one-sided
                              (B) $H_1: \theta > \theta_0$         upper one-sided
                              (C) $H_1: \theta \neq \theta_0$      two-sided

Introduction: Example

Doctors want to know the average height of women who are given epidural anesthesia in the traditional sitting position.
They suspect that the average height is greater than 163 cm.

$H_0: \mu = 163$
$H_1: \mu > 163$

Introduction

To help decide between the two hypotheses, calculate a test statistic based on the sample information from the experiment. Split the sample space into the rejection region $R$ and the acceptance region $R^c$. If the value of the test statistic falls in the rejection region, reject the null hypothesis and accept the alternative hypothesis.

Type I and Type II errors

Table 2: Possible outcomes and their consequences for a trial by jury

                                         Jury's decision
True state of the defendant (reality)    Accept $H_0$ (not guilty)    Reject $H_0$ (guilty)
$H_0$ true (innocent)                    Correct                      Type I error
$H_0$ false (guilty)                     Type II error                Correct

The probability of committing a type I error (rejecting $H_0$ when it is true) is called the level of significance of a hypothesis test. The level of significance is denoted by $\alpha$, where

$\alpha = P(\text{type I error}) = P(\text{reject } H_0 \mid H_0 \text{ is true}) = P(\text{accept } H_1 \mid H_0 \text{ is true}).$

The probability of committing a type II error is $\beta$, where

$\beta = P(\text{type II error}) = P(\text{fail to reject } H_0 \mid H_0 \text{ is false}) = P(\text{accept } H_0 \mid H_1 \text{ is true}).$

$1 - \beta$ is known as the power of the test.

- A type I error is frequently considered more serious than a type II error, and the probability of a type I error is easier to control than the probability of a type II error.
- It is common practice for researchers to specify a priori the largest acceptable probability of a type I error.
- Researchers typically fix the probability of committing a type I error at the 0.01, 0.05, or 0.1 significance level.

P-value

The ℘-value is defined as the probability of observing a difference as extreme as, or more extreme than, the difference actually observed, under the assumption that the null hypothesis is true.

Note that the ℘-value is not fixed a priori; it is determined after the sample is taken.

A small ℘-value indicates that differences as large as or larger than the one found in the sample would be rare if $H_0$ were true, and so are unlikely to have occurred by chance alone.

A small ℘-value lends support to $H_1$; so, given a fixed significance level $\alpha$, reject $H_0$ whenever the ℘-value is less than $\alpha$.

Test of significance

Step 1 (Hypotheses): State the null and alternative hypotheses.
Step 2 (Test Statistic): Select an appropriate test statistic and its sampling distribution under the null hypothesis.
Step 3: Calculate ℘-value
Step 4: Statistical Conclusion
Step 5: Explain Conclusion

R Commands

Hypothesis Test for $\mu$ and $\mu_1 - \mu_2$: t.test (unknown population variance)

Hypothesis Test for $\mu$: Unknown Population Variance

Table 3: Summary for testing the mean when sampling from a normal distribution with unknown variance (one-sample t-test)

Null hypothesis: $H_0: \mu = \mu_0$
Standardized test statistic's value: $t_{obs} = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$

Alternative hypothesis        Rejection region
$H_1: \mu < \mu_0$            $t_{obs} < t_{\alpha;\,n-1}$
$H_1: \mu > \mu_0$            $t_{obs} > t_{1-\alpha;\,n-1}$
$H_1: \mu \neq \mu_0$         $|t_{obs}| > t_{1-\alpha/2;\,n-1}$

Example: EPIDURAL (PASWR2)

Doctors want to know the average height of women who are given epidural anesthesia in the traditional sitting position. They suspect that the average height is greater than 163 cm.

Solution Using R: Checking normality

Check for normality with eda():

> library(PASWR2)
> attach(EPIDURAL)
> cm.sit <- cm[treatment == "Traditional Sitting"]
> eda(cm.sit)
[1] "cm.sit"
Size (n) .... SW p-val
  50.000 ....    0.230

[Figure: exploratory data analysis of cm.sit with four panels: histogram, density, boxplot, and Q-Q plot]

Example 3.1: Hypothesis Test for $\mu$, Solution

Step 1 (Hypotheses): $H_0: \mu = 163$ versus $H_1: \mu > 163$.

Step 2 (Test Statistic): The test statistic chosen is $\bar{X}$. The value of this test statistic is

$\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n} = \dfrac{8265}{50} = 165.3.$

The standardized test statistic and its distribution under the assumption that $H_0$ is true are

$\dfrac{\bar{X} - \mu_0}{S/\sqrt{n}} \sim t_{50-1}.$

Step 3 (Rejection Region Calculations): The rejection region is

$t_{obs} > t_{1-0.05;\,49} = t_{0.95;\,49} = 1.68.$

The value of the standardized test statistic is

$t_{obs} = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = \dfrac{165.3 - 163}{6.91/\sqrt{50}} \approx 2.36$ (using the unrounded $s$).

[Figure: density dt(x, 49) with the critical value $t_{0.95;\,49} = 1.68$ and the observed value 2.36 marked; the upper-tail area is 1 - pt(2.36, 49) = 0.011]

Step 4 (Statistical Conclusion): The ℘-value is $P(t_{49} \geq 2.36) = 0.01$.

I. From the rejection region, reject $H_0$ because $t_{obs} = 2.36$ is greater than 1.68.
II. From the ℘-value, reject $H_0$ because the ℘-value of 0.01 is less than 0.05.

Reject $H_0$.

Step 5 (Explain Conclusion): There is evidence to suggest that the mean height of women in the traditional sitting position is greater than 163 cm.
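The steps of Example 3.1 can also be retraced from the summary statistics alone. The following is a minimal sketch in base R, assuming only the values quoted in the example (n = 50, a sum of 8265, s = 6.91 rounded to two decimals, and a 0.05 significance level); the variable names are illustrative and do not come from the slides.

```r
# Summary statistics quoted in Example 3.1 (s is rounded to two decimals)
n     <- 50      # sample size
sum_x <- 8265    # sum of the observed heights (cm)
s     <- 6.91    # sample standard deviation (cm)
mu0   <- 163     # mean under H0
alpha <- 0.05    # significance level

# Step 2: sample mean and standardized test statistic
xbar  <- sum_x / n
t_obs <- (xbar - mu0) / (s / sqrt(n))

# Step 3: upper one-sided critical value and p-value from t with n - 1 df
t_crit <- qt(1 - alpha, df = n - 1)
p_val  <- pt(t_obs, df = n - 1, lower.tail = FALSE)

# Step 4: both decision rules should agree
reject_by_region <- t_obs > t_crit
reject_by_pvalue <- p_val < alpha

cat("xbar =", xbar, " t_obs =", round(t_obs, 3),
    " t_crit =", round(t_crit, 3), " p-value =", round(p_val, 4), "\n")
cat("Reject H0 (rejection region):", reject_by_region, "\n")
cat("Reject H0 (p-value):", reject_by_pvalue, "\n")
```

Because the sketch uses the rounded standard deviation, its test statistic differs slightly in the third decimal from the one that t.test() reports on the full data; the decision is the same under both rules.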
Solution Using R: t.test()

> t.test(cm.sit, mu = 163, alternative = "g")

        One Sample t-test

data:  cm.sit
t = 2.3552, df = 49, p-value = 0.01128
alternative hypothesis: true mean is greater than 163
95 percent confidence interval:
 163.6627      Inf
sample estimates:
mean of x
    165.3

Non-parametric alternative: Wilcoxon Signed-Rank Test, wilcox.test()

If normality does not hold, use wilcox.test():

> wilcox.test(cm.sit, mu = 163, alternative = "g")

        Wilcoxon signed rank test with continuity correction

data:  cm.sit
V = 650.5, p-value = 0.01614
alternative hypothesis: true location is greater than 163

Warning messages:
1: In wilcox.test.default(cm.sit, mu = 163, alternative = "g") :
  cannot compute exact p-value with ties
2: In wilcox.test.default(cm.sit, mu = 163, alternative = "g") :
  cannot compute exact p-value with zeroes

Test for a Difference in Means
