3. Hypothesis Testing and Analysis of Variance with R
Parametric and non-parametric tests

PROBABILITY and STATISTICS WITH R
Copyright © 2016

Tomás Goicoa
Department of Statistics and Operations Research
Public University of Navarre

Ana F. Militino
Department of Statistics and Operations Research
Public University of Navarre

Máster Universitario en Salud Pública, UPNA

Outline

Introduction
Hypothesis Tests for $\mu$
Hypothesis Tests for $\mu_X - \mu_Y$
Hypothesis Tests for $\pi$
Analysis of variance

Hypothesis Testing

Introduction
Hypothesis Test for $\mu$ and $\mu_1 - \mu_2$ (Parametric and non-parametric)
Hypothesis Test for $\pi$
Analysis of variance (Parametric and non-parametric)

Introduction

A hypothesis test is a decision criterion to select between two complementary hypotheses.
The null hypothesis, $H_0$, is assumed to be true prior to conducting the hypothesis test. It is compared with another hypothesis, called the alternative hypothesis and denoted $H_1$.
The alternative hypothesis is often called the research hypothesis, since it specifies the theory, or what is believed to be true about the parameter.

Table 1: Form of hypothesis test

| Null Hypothesis | Alternative Hypothesis | Type of Alternative |
|---|---|---|
| $H_0: \theta = \theta_0$ | (A) $H_1: \theta < \theta_0$ | lower one-sided |
| | (B) $H_1: \theta > \theta_0$ | upper one-sided |
| | (C) $H_1: \theta \neq \theta_0$ | two-sided |

Introduction-Example

Doctors want to know the average height of women who are given epidural anesthesia in the traditional sitting position.
They suspect that the average height is greater than 163 cm.

$H_0: \mu = 163$
$H_1: \mu > 163$

Introduction

To help decide between the two hypotheses, calculate a test statistic based on the sample information from the experiment.
Split the sample space into the rejection region $R$ and the acceptance region $R^c$.
If the value of the test statistic falls in the rejection region, reject the null hypothesis and accept the alternative hypothesis.

Type I and Type II errors

Table 2: Possible outcomes and the consequences for a trial by jury

| True state of the defendant | Jury's decision: accept $H_0$ (not guilty) | Jury's decision: reject $H_0$ (guilty) |
|---|---|---|
| $H_0$ true (innocent) | Correct | Type I error |
| $H_0$ false (guilty) | Type II error | Correct |

The probability of committing a type I error (rejecting $H_0$ when it is true) is called the level of significance of a hypothesis test. It is denoted by $\alpha$, where

$$\alpha = P(\text{type I error}) = P(\text{reject } H_0 \mid H_0 \text{ is true}) = P(\text{accept } H_1 \mid H_0 \text{ is true}).$$

The probability of committing a type II error is $\beta$, where

$$\beta = P(\text{type II error}) = P(\text{fail to reject } H_0 \mid H_0 \text{ is false}) = P(\text{accept } H_0 \mid H_1 \text{ is true}).$$

$1 - \beta$ is known as the power of the test.

A type I error is frequently considered more serious than a type II error, and the probability of a type I error is easier to control than the probability of a type II error.
It is common practice for researchers to specify a priori the largest acceptable probability of a type I error.
Researchers typically fix the probability of committing a type I error at the 0.01, 0.05, or 0.1 significance level.

P-value

The ℘-value is defined as the probability of observing a difference as extreme as or more extreme than the difference observed, under the assumption that the null hypothesis is true.
It is important to note that the ℘-value is not fixed a priori but rather is determined after the sample is taken.
A small ℘-value indicates that differences as large as or larger than the one found in the sample would be rare if $H_0$ were true, and thus are unlikely to be due to chance alone.
A small ℘-value lends support to $H_1$; so, given a fixed significance level $\alpha$, reject $H_0$ whenever the ℘-value is less than $\alpha$.

Test of significance

Step 1: Hypotheses. State the null and alternative hypotheses.
Step 2: Test Statistic. Select an appropriate test statistic and its sampling distribution under the null hypothesis.
Step 3: Calculate ℘-value.
Step 4: Statistical Conclusion.
Step 5: Explain Conclusion.

R Commands

Hypothesis Test for $\mu$ and $\mu_1 - \mu_2$
t.test: Unknown Population Variance

Hypothesis Test for $\mu$: Unknown Population Variance

Table 3: Summary for testing the mean when sampling from a normal distribution with unknown variance (one-sample t-test)

Null hypothesis: $H_0: \mu = \mu_0$
Standardized test statistic's value: $t_{\text{obs}} = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$

| Alternative hypothesis | $H_1: \mu < \mu_0$ | $H_1: \mu > \mu_0$ | $H_1: \mu \neq \mu_0$ |
|---|---|---|---|
| Rejection region | $t_{\text{obs}} < t_{\alpha;\,n-1}$ | $t_{\text{obs}} > t_{1-\alpha;\,n-1}$ | $\lvert t_{\text{obs}} \rvert > t_{1-\alpha/2;\,n-1}$ |

Example

EPIDURAL (PASWR2). Doctors want to know the average height of women who are given epidural anesthesia in the traditional sitting position. They suspect that the average height is greater than 163 cm.

Solution Using R: Checking normality

Check for normality with eda():

    > library(PASWR2)
    > attach(EPIDURAL)
    > cm.sit <- cm[treatment == "Traditional Sitting"]
    > eda(cm.sit)
    [1] "cm.sit"
    Size (n) ....  SW p-val
      50.000 ....     0.230

[Figure: EXPLORATORY DATA ANALYSIS of cm.sit, with four panels: histogram, density, boxplot, and Q-Q plot.]

Example 3.1: Hypothesis Test for $\mu$, Solution

Step 1: Hypotheses. $H_0: \mu = 163$ versus $H_1: \mu > 163$.

Step 2: Test Statistic. The test statistic chosen is $\bar{X}$. The value of this test statistic is

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{8265}{50} = 165.3.$$

The standardized test statistic and its distribution under the assumption that $H_0$ is true are

$$\frac{\bar{X} - \mu_0}{S/\sqrt{n}} \sim t_{50-1}.$$

Step 3: Rejection Region Calculations. The rejection region is $t_{\text{obs}} > t_{1-0.05;\,49} = t_{0.95;\,49} = 1.68$. The value of the standardized test statistic, computed with the unrounded $s$, is

$$t_{\text{obs}} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{165.3 - 163}{6.91/\sqrt{50}} \approx 2.36.$$

[Figure: density of the t distribution with 49 degrees of freedom, dt(x, 49), marking the critical value $t_{0.95;\,49} = 1.68$ and the observed value 2.36; the upper-tail area is 1 - pt(2.36, 49) = 0.011.]

Step 4: Statistical Conclusion. The ℘-value is $P(t_{49} \geq 2.36) = 0.01$.
I. From the rejection region, reject $H_0$ because $t_{\text{obs}} = 2.36$ is greater than 1.68.
II. From the ℘-value, reject $H_0$ because the ℘-value = 0.01 is less than 0.05.
Reject $H_0$.

Step 5: Explain Conclusion. There is evidence to suggest that the mean height of women in the sitting position is greater than 163 cm.
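The rejection-region and ℘-value calculations of Steps 3 and 4 can be reproduced from the summary statistics alone. The following is a minimal sketch in base R, not the authors' code: it assumes only the values quoted in the example (n = 50, sample mean 165.3, s = 6.91, hypothesized mean 163, significance level 0.05), and the variable names (t_obs, t_crit, p_value, and so on) are illustrative choices. Because s is rounded to two decimals, the results may differ slightly in the third decimal from those computed on the raw data.

```r
# Summary statistics quoted in Example 3.1 (s is rounded to two decimals)
n     <- 50      # sample size
xbar  <- 165.3   # sample mean, 8265 / 50
s     <- 6.91    # sample standard deviation (rounded)
mu0   <- 163     # value of the mean under H0
alpha <- 0.05    # significance level

# Standardized test statistic: (xbar - mu0) / (s / sqrt(n))
t_obs <- (xbar - mu0) / (s / sqrt(n))

# Critical value t_{1 - alpha; n - 1} for the upper one-sided alternative
t_crit <- qt(1 - alpha, df = n - 1)

# p-value: upper-tail probability P(t_{n-1} >= t_obs)
p_value <- pt(t_obs, df = n - 1, lower.tail = FALSE)

# The two decision rules: rejection region and p-value
reject_by_region <- t_obs > t_crit
reject_by_pvalue <- p_value < alpha

print(c(t_obs = t_obs, t_crit = t_crit, p_value = p_value))
print(c(region = reject_by_region, p_value = reject_by_pvalue))
```

Both rules lead to the same decision here, which is expected: for a one-sided test, the statistic exceeds the critical value exactly when the upper-tail probability falls below the significance level.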
Solution Using R: t.test()

    > t.test(cm.sit, mu = 163, alternative = "g")

            One Sample t-test

    data:  cm.sit
    t = 2.3552, df = 49, p-value = 0.01128
    alternative hypothesis: true mean is greater than 163
    95 percent confidence interval:
     163.6627      Inf
    sample estimates:
    mean of x
        165.3

Non-parametric alternative: Wilcoxon Signed-Rank Test, wilcox.test()

If normality does not hold, use wilcox.test():

    > wilcox.test(cm.sit, mu = 163, alternative = "g")

            Wilcoxon signed rank test with continuity correction

    data:  cm.sit
    V = 650.5, p-value = 0.01614
    alternative hypothesis: true location is greater than 163

    Warning messages:
    1: In wilcox.test.default(cm.sit, mu = 163, alternative = "g") :
      cannot compute exact p-value with ties
    2: In wilcox.test.default(cm.sit, mu = 163, alternative = "g") :
      cannot compute exact p-value with zeroes

Test for a Difference in Means