3. Hypothesis Testing and Analysis of Variance with R: Parametric and Non-Parametric Tests. PROBABILITY and STATISTICS with R. Copyright © 2016


Tomás Goicoa, Department of Statistics and Operations Research, Public University of Navarre
Ana F. Militino, Department of Statistics and Operations Research, Public University of Navarre
Máster Universitario en Salud Pública, UPNA

Outline
  - Introduction
  - Hypothesis Tests for µ
  - Hypothesis Tests for µX − µY
  - Hypothesis Tests for π
  - Analysis of variance

Hypothesis Testing
  - Introduction
  - Hypothesis Test for $\mu$ and $\mu_1 - \mu_2$ (parametric and non-parametric)
  - Hypothesis Test for $\pi$
  - Analysis of variance (parametric and non-parametric)

Introduction

A hypothesis test is a decision criterion for choosing between two complementary hypotheses. The null hypothesis, $H_0$, is assumed to be true prior to conducting the hypothesis test. It is compared with another hypothesis, called the alternative hypothesis and denoted $H_1$. The alternative hypothesis is often called the research hypothesis, since it specifies the theory, or what is believed to be true about the parameter.

Table 1: Form of hypothesis test

    Null hypothesis             Alternative hypothesis             Type of alternative
    $H_0: \theta = \theta_0$    (A) $H_1: \theta < \theta_0$       lower one-sided
                                (B) $H_1: \theta > \theta_0$       upper one-sided
                                (C) $H_1: \theta \neq \theta_0$    two-sided

Introduction: Example

Doctors want to know the average height of women who are given epidural anesthesia in the traditional sitting position. They suspect that the average height is greater than 163 cm.

    $H_0: \mu = 163$
    $H_1: \mu > 163$

To help decide between the two hypotheses, calculate a test statistic based on the sample information from the experiment. Split the sample space into the rejection region $R$ and the acceptance region $R^c$. If the value of the test statistic falls in the rejection region, reject the null hypothesis and accept the alternative hypothesis.
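
The decision rule just described can be sketched in R for the three alternatives (A), (B) and (C) of Table 1. This is only an illustration, not code from the slides: it assumes the standardized test statistic follows a standard normal distribution under $H_0$, and the function name `in_rejection_region` and its arguments are hypothetical.

```r
# Decide whether an observed standardized statistic falls in the
# rejection region R, for the three alternatives of Table 1.
# Assumption for illustration only: the statistic is N(0, 1) under H0.
in_rejection_region <- function(z_obs, alpha = 0.05,
                                alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  switch(alternative,
         less      = z_obs < qnorm(alpha),               # (A) lower one-sided
         greater   = z_obs > qnorm(1 - alpha),           # (B) upper one-sided
         two.sided = abs(z_obs) > qnorm(1 - alpha / 2))  # (C) two-sided
}

# TRUE means "reject H0", FALSE means "do not reject H0"
print(in_rejection_region(2.0, alternative = "greater"))
print(in_rejection_region(1.5, alternative = "two.sided"))
print(in_rejection_region(-2.0, alternative = "less"))
```

The same pattern would apply to other null distributions by swapping `qnorm()` for the matching quantile function, for example `qt()` for a t statistic.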

Type I and Type II errors

Table 2: Possible outcomes and the consequences for a trial by jury

                                              Jury's decision
    True state of the defendant (reality)     Accept $H_0$ (not guilty)    Reject $H_0$ (guilty)
    $H_0$ true (innocent)                     Correct                      Type I error
    $H_0$ false (guilty)                      Type II error                Correct

The probability of committing a type I error (rejecting $H_0$ when it is true) is called the level of significance of a hypothesis test. The level of significance is denoted by $\alpha$, where

    $\alpha = P(\text{type I error}) = P(\text{reject } H_0 \mid H_0 \text{ is true}) = P(\text{accept } H_1 \mid H_0 \text{ is true}).$

The probability of committing a type II error is $\beta$, where

    $\beta = P(\text{type II error}) = P(\text{fail to reject } H_0 \mid H_0 \text{ is false}) = P(\text{accept } H_0 \mid H_1 \text{ is true}).$

$1 - \beta$ is known as the power of the test.

  - A type I error is frequently considered to be more serious than a type II error, and the probability of a type I error is easier to control than the probability of a type II error.
  - It is common practice for researchers to specify a priori the largest acceptable probability of a type I error.
  - Researchers typically fix the probability of committing a type I error at the 0.01, 0.05, or 0.1 significance level.

P-value

The ℘-value is defined as the probability of observing a difference as extreme as or more extreme than the difference observed, under the assumption that the null hypothesis is true.

It is important to note that the ℘-value is not fixed a priori but is determined after the sample is taken. A small ℘-value indicates that observing differences as large as or larger than the one found in the sample would be rare if $H_0$ were true, so such differences are unlikely to occur by chance alone. A small ℘-value lends support to $H_1$; so, given a fixed significance level $\alpha$, reject $H_0$ whenever the ℘-value $< \alpha$.

Test of significance

Step 1: Hypotheses. State the null and alternative hypotheses.
Step 2: Test Statistic. Select an appropriate test statistic and its sampling distribution under the null hypothesis.
Step 3: Calculate ℘-value.
Step 4: Statistical Conclusion.
Step 5: Explain Conclusion.

R Commands

Hypothesis Test for $\mu$ and $\mu_1 - \mu_2$: t.test (unknown population variance)

Hypothesis Test for $\mu$: Unknown Population Variance

Table 3: Summary for testing the mean when sampling from a normal distribution with unknown variance (one-sample t-test)

    Null hypothesis:                        $H_0: \mu = \mu_0$
    Standardized test statistic's value:    $t_{obs} = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$

    Alternative hypothesis      Rejection region
    $H_1: \mu < \mu_0$          $t_{obs} < t_{\alpha;\,n-1}$
    $H_1: \mu > \mu_0$          $t_{obs} > t_{1-\alpha;\,n-1}$
    $H_1: \mu \neq \mu_0$       $|t_{obs}| > t_{1-\alpha/2;\,n-1}$

Example: EPIDURAL (PASWR2)

Doctors want to know the average height of women who are given epidural anesthesia in the traditional sitting position. They suspect that the average height is greater than 163 cm.

Solution Using R: Checking normality

Check for normality with eda():

    > library(PASWR2)
    > attach(EPIDURAL)
    > cm.sit <- cm[treatment == "Traditional Sitting"]
    > eda(cm.sit)
    [1] "cm.sit"
    Size (n) .... SW p-val
      50.000 ....    0.230

[Figure: EXPLORATORY DATA ANALYSIS output of eda(cm.sit), with four panels: Histogram of cm.sit, Density of cm.sit, Boxplot of cm.sit, Q-Q Plot of cm.sit.]

Example 3.1: Hypothesis Test for $\mu$, Solution

Step 1: Hypotheses. $H_0: \mu = 163$ versus $H_1: \mu > 163$.

Step 2: Test Statistic. The test statistic chosen is $\bar{X}$. The value of this test statistic is

    $\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n} = \dfrac{8265}{50} = 165.3.$

The standardized test statistic and its distribution under the assumption that $H_0$ is true are

    $\dfrac{\bar{X} - \mu_0}{S/\sqrt{n}} \sim t_{50-1}.$

Step 3: Rejection Region Calculations. The rejection region is

    $t_{obs} > t_{1-0.05;\,49} = t_{0.95;\,49} = 1.68.$

The value of the standardized test statistic is

    $t_{obs} = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = \dfrac{165.3 - 163}{6.91/\sqrt{50}} \approx 2.36.$

[Figure: density dt(x, 49) of the t distribution with 49 degrees of freedom, marking the critical value $t_{0.95;49} = 1.68$ and the observed value 2.36; the upper-tail area is 1 - pt(2.36, 49) = 0.011.]

Step 4: Statistical Conclusion. The ℘-value is $P(t_{49} \geq 2.36) = 0.01$.
  I. From the rejection region, reject $H_0$ because $t_{obs} = 2.36$ is greater than 1.68.
  II. From the ℘-value, reject $H_0$ because the ℘-value = 0.01 is less than 0.05.
Reject $H_0$.

Step 5: Explain Conclusion. There is evidence to suggest that the mean height of women in the sitting position is greater than 163 cm.
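
Steps 2 to 4 of Example 3.1 can be retraced from the summary statistics alone. The sketch below is not part of the slides and its variable names are illustrative. It assumes only the quantities quoted in the example: n = 50, the sample total 8265 (so the sample mean is 8265/50 = 165.3), the rounded standard deviation s = 6.91, and $\alpha$ = 0.05. Because s is rounded, the statistic may differ in the third decimal from one computed on the raw data.

```r
# Summary statistics quoted in Example 3.1 (s is rounded to two decimals)
n     <- 50
xbar  <- 8265 / n   # sample mean
s     <- 6.91       # sample standard deviation (rounded)
mu0   <- 163        # value of mu under H0
alpha <- 0.05       # significance level

# Step 2: standardized test statistic, t-distributed with n - 1 df under H0
t_obs <- (xbar - mu0) / (s / sqrt(n))

# Step 3: critical value for the upper one-sided alternative H1: mu > mu0
t_crit <- qt(1 - alpha, df = n - 1)

# Step 4: upper-tail p-value and the two equivalent decisions
p_value <- pt(t_obs, df = n - 1, lower.tail = FALSE)
reject_by_region <- t_obs > t_crit
reject_by_pvalue <- p_value < alpha

print(round(c(t_obs = t_obs, t_crit = t_crit, p_value = p_value), 3))
print(c(region = reject_by_region, p_value = reject_by_pvalue))
```

Both decision rules agree by construction: the statistic exceeds the critical value exactly when the upper-tail probability is below $\alpha$.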

Solution Using R: t.test()

    > t.test(cm.sit, mu = 163, alternative = "g")

            One Sample t-test

    data:  cm.sit
    t = 2.3552, df = 49, p-value = 0.01128
    alternative hypothesis: true mean is greater than 163
    95 percent confidence interval:
     163.6627      Inf
    sample estimates:
    mean of x
        165.3

Non-parametric alternative: Wilcoxon Signed-Rank Test, wilcox.test()

If normality does not hold, use wilcox.test():

    > wilcox.test(cm.sit, mu = 163, alternative = "g")

            Wilcoxon signed rank test with continuity correction

    data:  cm.sit
    V = 650.5, p-value = 0.01614
    alternative hypothesis: true location is greater than 163

    Warning messages:
    1: In wilcox.test.default(cm.sit, mu = 163, alternative = "g") :
      cannot compute exact p-value with ties
    2: In wilcox.test.default(cm.sit, mu = 163, alternative = "g") :
      cannot compute exact p-value with zeroes

Test for a Difference in Means