Nonparametric Location Tests: One-Sample

Total Page:16

File Type:pdf, Size:1020Kb

Nonparametric Location Tests: One-Sample Nonparametric Location Tests: One-Sample Nathaniel E. Helwig Assistant Professor of Psychology and Statistics University of Minnesota (Twin Cities) Updated 04-Jan-2017 Nathaniel E. Helwig (U of Minnesota) Nonparametric Location Tests: One-Sample Updated 04-Jan-2017 : Slide 1 Copyright Copyright c 2017 by Nathaniel E. Helwig Nathaniel E. Helwig (U of Minnesota) Nonparametric Location Tests: One-Sample Updated 04-Jan-2017 : Slide 2 Outline of Notes 1) Background Information: 3) Sign Test (Fisher): Hypothesis testing Overview Location tests Hypothesis testing Order and rank statistics Estimating location One-sample problem Confidence intervals 2) Signed Rank Test (Wilcoxon): 4) Some Considerations: Overview Choosing a location test Hypothesis testing Univariate symmetry Estimating location Bivariate symmetry Confidence intervals Nathaniel E. Helwig (U of Minnesota) Nonparametric Location Tests: One-Sample Updated 04-Jan-2017 : Slide 3 Background Information Background Information Nathaniel E. Helwig (U of Minnesota) Nonparametric Location Tests: One-Sample Updated 04-Jan-2017 : Slide 4 Background Information Hypothesis Testing Neyman-Pearson Hypothesis Testing Procedure Hypothesis testing procedure (by Jerzy Neyman & Egon Pearson1): (a) Start with a null and alternative hypothesis (H0 and H1) about θ (b) Calculate some test statistic T from the observed data (c) Calculate p-value; i.e., probability of observing a test statistic as or more extreme than T under the assumption H0 is true (d) Reject H0 if the p-value is below some user-determined threshold Typically we assume observed data are from some known probability distribution (e.g., Normal, t, Poisson, binomial, etc.). 1Egon Pearson was the son of Karl Pearson (very influential statistician). Nathaniel E. Helwig (U of Minnesota) Nonparametric Location Tests: One-Sample Updated 04-Jan-2017 : Slide 5 Background Information Hypothesis Testing Confidence Intervals In addition to testing H0, we may want to know how confident we can be in our estimate of the unknown population parameter θ. A symmetric 100(1 − α)% confidence interval (CI) has the form: ^ ∗ θ ± T1−α/2σθ^ ^ ^ ∗ where θ is our estimate of θ, σθ^ is the standard error of θ, and T1−α/2 is ∗ the critical value of the test statistic, i.e., P(T ≤ T1−α/2) = 1 − α=2. Interpreting Confidence Intervals: Correct: through repeated samples, e.g., 99 out of 100 confidence intervals would be expected to contain true θ with α = :01 Wrong: through one sample; e.g., there is a 99% chance the confidence interval around my θ^ contains the true θ (with α = :01) Nathaniel E. Helwig (U of Minnesota) Nonparametric Location Tests: One-Sample Updated 04-Jan-2017 : Slide 6 Background Information Location Tests Definition of “Location Test” Allow us to test hypotheses about mean or median of a population. There are one-sample tests and two-sample tests. One-Sample: H0 : µ = µ0 vs. H1 : µ 6= µ0 Two-Sample: H0 : µ1 − µ2 = µ0 vs. H1 : µ1 − µ2 6= µ0 There are one-sided tests and two-sided tests. One-Sided: H0 : µ = µ0 vs. H1 : µ < µ0 or H1 : µ > µ0 Two-Sided: H0 : µ = µ0 vs. H1 : µ 6= µ0 Nathaniel E. Helwig (U of Minnesota) Nonparametric Location Tests: One-Sample Updated 04-Jan-2017 : Slide 7 Background Information Location Tests Problems with Parametric Location Tests Typical parametric location tests (e.g., Student’s t tests) focus on analyzing mean differences. Robustness: sample mean is not robust to outliers Consider a sample of data x1;:::; xn with expectation µ < 1 Suppose we fix x1; x2;:::; xn−1 and let xn ! 1 ¯ 1 Pn Note x = n i=1 xi ! 1, i.e., one large outlier ruins sample mean Generalizability: parametric tests are meant for particular distributions Assume data are from some known distribution Parametric inferences are invalid if assumption is wrong Nathaniel E. Helwig (U of Minnesota) Nonparametric Location Tests: One-Sample Updated 04-Jan-2017 : Slide 8 Background Information Order and Rank Statistics Order Statistics Given a sample of data X1; X2; X3;:::; Xn from some cdf F, the order statistics are typically denoted by X(1); X(2); X(3);:::; X(n) where X(1) ≤ X(2) ≤ X(3) ≤ · · · ≤ X(n) are the ordered data Note that. the 1st order statistic X(1) is the sample minimum the n-th order statistic X(n) is the sample maximum Nathaniel E. Helwig (U of Minnesota) Nonparametric Location Tests: One-Sample Updated 04-Jan-2017 : Slide 9 Background Information Order and Rank Statistics Rank Statistics Given a sample of data X1; X2; X3;:::; Xn from some cdf F, the rank statistics are typically denoted by R1; R2; R3;:::; Rn where Ri 2 [1; n] for all i 2 f1;:::; ng are the data ranks If there are no ties (i.e., if Xi 6= Xj 8i; j), then Ri 2 f1;:::; ng Nathaniel E. Helwig (U of Minnesota) Nonparametric Location Tests: One-Sample Updated 04-Jan-2017 : Slide 10 Background Information Order and Rank Statistics Order and Rank Statistics: Example (No Ties) Given a sample of data X1 = 3; X2 = 12; X3 = 11; X4 = 18; X5 = 14; X6 = 10 from some cdf F, the order statistics are X(1) = 3; X(2) = 10; X(3) = 11; X(4) = 12; X(5) = 14; X(6) = 18 and the ranks are given by R1 = 1; R2 = 4; R3 = 3; R4 = 6; R5 = 5; R6 = 2 Nathaniel E. Helwig (U of Minnesota) Nonparametric Location Tests: One-Sample Updated 04-Jan-2017 : Slide 11 Background Information Order and Rank Statistics Order and Rank Statistics: Example (With Ties) Given a sample of data X1 = 3; X2 = 11; X3 = 11; X4 = 14; X5 = 14; X6 = 11 from some cdf F, the order statistics are X(1) = 3; X(2) = 11; X(3) = 11; X(4) = 11; X(5) = 14; X(6) = 14 and the ranks are given by R1 = 1; R2 = 3; R3 = 3; R4 = 5:5; R5 = 5:5; R6 = 3 This is fractional ranking where we use average ranks: Replace Ri 2 f2; 3; 4g with the average rank 3 = (2 + 3 + 4)=3 Replace Ri 2 f5; 6g with the average rank 5:5 = (5 + 6)=2 Nathaniel E. Helwig (U of Minnesota) Nonparametric Location Tests: One-Sample Updated 04-Jan-2017 : Slide 12 Background Information Order and Rank Statistics Order and Rank Statistics: Examples (in R) Revisit example with no ties: > x = c(3,12,11,18,14,10) > sort(x) [1] 3 10 11 12 14 18 > rank(x) [1] 1 4 3 6 5 2 Revisit example with ties: > x = c(3,11,11,14,14,11) > sort(x) [1] 3 11 11 11 14 14 > rank(x) [1] 1.0 3.0 3.0 5.5 5.5 3.0 Nathaniel E. Helwig (U of Minnesota) Nonparametric Location Tests: One-Sample Updated 04-Jan-2017 : Slide 13 Background Information Order and Rank Statistics Summation of Integers (Carl Gauss) Carl Friedrich Gauss was a German mathematician who made amazing contributions to all areas of mathematics (including statistics). According to legend, when Carl was in primary school (about 8 y/o) the teacher asked the class to sum together all integers from 1 to 100. This was supposed to occupy the students for several hours After a few seconds, Carl wrote down the correct answer of 5050! Carl noticed that 1 + 100 = 101, 2 + 99 = 101, 3 + 98 = 101, etc. P100 There are 50 pairs that sum to 101 =) i=1 i = 5050 Nathaniel E. Helwig (U of Minnesota) Nonparametric Location Tests: One-Sample Updated 04-Jan-2017 : Slide 14 Background Information Order and Rank Statistics General Summation Formulas Pn Summation of Integers: i=1 i = n(n + 1)=2 From pattern noticed by Carl Gauss Pn 2 Summation of Squares: i=1 i = n(n + 1)(2n + 1)=6 Can prove using difference approach (similar to Carl Gauss idea) These formulas relate to test statistics that we use for rank data. Nathaniel E. Helwig (U of Minnesota) Nonparametric Location Tests: One-Sample Updated 04-Jan-2017 : Slide 15 Background Information One-Sample Problem Problem(s) of Interest For the one-sample location problem, we could have: Paired-replicates data: (Xi ; Yi ) are independent samples One-sample data: Zi are independent samples We want to make inference about: Paired-replicates data: difference in location (treatment effect) One-sample data: single population’s location Nathaniel E. Helwig (U of Minnesota) Nonparametric Location Tests: One-Sample Updated 04-Jan-2017 : Slide 16 Background Information One-Sample Problem Typical Assumptions Independence assumption: Paired-replicates data: Zi = Yi − Xi are independent samples One-sample data: Zi are independent samples Symmetry assumption: Paired-replicates data: Zi ∼ Fi which is continuous and symmetric around θ (common median) One-sample data: Zi ∼ Fi which is continuous and symmetric around θ (common median) Nathaniel E. Helwig (U of Minnesota) Nonparametric Location Tests: One-Sample Updated 04-Jan-2017 : Slide 17 Wilcoxon’s Signed Rank Test Wilcoxon’s Signed Rank Test Nathaniel E. Helwig (U of Minnesota) Nonparametric Location Tests: One-Sample Updated 04-Jan-2017 : Slide 18 Wilcoxon’s Signed Rank Test Overview Assumptions and Hypothesis Assumes both independence and symmetry. The null hypothesis about θ (common median) is H0 : θ = θ0 and we could have one of three alternative hypotheses: One-Sided Upper-Tail: H1 : θ > θ0 One-Sided Lower-Tail: H1 : θ < θ0 Two-Sided: H1 : θ 6= θ0 Nathaniel E. Helwig (U of Minnesota) Nonparametric Location Tests: One-Sample Updated 04-Jan-2017 : Slide 19 Wilcoxon’s Signed Rank Test Hypothesis Testing Test Statistic Let Ri for i 2 f1;:::; ng denote the ranks of jZi − θ0j. Defining the indicator variable 1 if Zi − θ0 > 0 i = 0 if Zi − θ0 < 0 the Wilcoxon signed rank test statistic T + is defined as n + X T = Ri i i=1 where Ri i is the positive signed rank of Zi − θ0 Nathaniel E.
Recommended publications
  • Appendix a Basic Statistical Concepts for Sensory Evaluation
    Appendix A Basic Statistical Concepts for Sensory Evaluation Contents This chapter provides a quick introduction to statistics used for sensory evaluation data including measures of A.1 Introduction ................ 473 central tendency and dispersion. The logic of statisti- A.2 Basic Statistical Concepts .......... 474 cal hypothesis testing is introduced. Simple tests on A.2.1 Data Description ........... 475 pairs of means (the t-tests) are described with worked A.2.2 Population Statistics ......... 476 examples. The meaning of a p-value is reviewed. A.3 Hypothesis Testing and Statistical Inference .................. 478 A.3.1 The Confidence Interval ........ 478 A.3.2 Hypothesis Testing .......... 478 A.3.3 A Worked Example .......... 479 A.1 Introduction A.3.4 A Few More Important Concepts .... 480 A.3.5 Decision Errors ............ 482 A.4 Variations of the t-Test ............ 482 The main body of this book has been concerned with A.4.1 The Sensitivity of the Dependent t-Test for using good sensory test methods that can generate Sensory Data ............. 484 quality data in well-designed and well-executed stud- A.5 Summary: Statistical Hypothesis Testing ... 485 A.6 Postscript: What p-Values Signify and What ies. Now we turn to summarize the applications of They Do Not ................ 485 statistics to sensory data analysis. Although statistics A.7 Statistical Glossary ............. 486 are a necessary part of sensory research, the sensory References .................... 487 scientist would do well to keep in mind O’Mahony’s admonishment: statistical analysis, no matter how clever, cannot be used to save a poor experiment. The techniques of statistical analysis, do however, It is important when taking a sample or designing an serve several useful purposes, mainly in the efficient experiment to remember that no matter how powerful summarization of data and in allowing the sensory the statistics used, the inferences made from a sample scientist to make reasonable conclusions from the are only as good as the data in that sample.
    [Show full text]
  • 12.6 Sign Test (Web)
    12.4 Sign test (Web) Click on 'Bookmarks' in the Left-Hand menu and then click on the required section. 12.6 Sign test (Web) 12.6.1 Introduction The Sign Test is a one sample test that compares the median of a data set with a specific target value. The Sign Test performs the same function as the One Sample Wilcoxon Test, but it does not rank data values. Without the information provided by ranking, the Sign Test is less powerful (9.4.5) than the Wilcoxon Test. However, the Sign Test is useful for problems involving 'binomial' data, where observed data values are either above or below a target value. 12.6.2 Sign Test The Sign Test aims to test whether the observed data sample has been drawn from a population that has a median value, m, that is significantly different from (or greater/less than) a specific value, mO. The hypotheses are the same as in 12.1.2. We will illustrate the use of the Sign Test in Example 12.14 by using the same data as in Example 12.2. Example 12.14 The generation times, t, (5.2.4) of ten cultures of the same micro-organisms were recorded. Time, t (hr) 6.3 4.8 7.2 5.0 6.3 4.2 8.9 4.4 5.6 9.3 The microbiologist wishes to test whether the generation time for this micro- organism is significantly greater than a specific value of 5.0 hours. See following text for calculations: In performing the Sign Test, each data value, ti, is compared with the target value, mO, using the following conditions: ti < mO replace the data value with '-' ti = mO exclude the data value ti > mO replace the data value with '+' giving the results in Table 12.14 Difference + - + ----- + - + - + + Table 12.14 Signs of Differences in Example 12.14 12.4.p1 12.4 Sign test (Web) We now calculate the values of r + = number of '+' values: r + = 6 in Example 12.14 n = total number of data values not excluded: n = 9 in Example 12.14 Decision making in this test is based on the probability of observing particular values of r + for a given value of n.
    [Show full text]
  • One Sample T Test
    Copyright c October 17, 2020 by NEH One Sample t Test Nathaniel E. Helwig University of Minnesota 1 How Beer Changed the World William Sealy Gosset was a chemist and statistician who was employed as the Head Exper- imental Brewer at Guinness in the early 1900s. As a part of his job, Gosset was responsible for taking samples of beer and performing various analyses to ensure that the beer was of the proper composition and quality. While at Guinness, Gosset taught himself experimental design and statistical analysis (which were new and exciting fields at the time), and he even spent some time in 1906{1907 studying in Karl Pearson's laboratory. Gosset's work often involved collecting a small number of samples, and testing hypotheses about the mean of the population from which the samples were drawn. For example, in the brewery, Gosset may want to test whether the population mean amount of some ingredient (e.g., barley) was equal to the intended value. And, on the farm, Gosset may want to test how different farming techniques and growing conditions affect the barley yields. Through this work, Gos- set noticed that the normal distribution was not adequate for testing hypotheses about the mean in small samples of data with an unknown variance. 2 2 1 Pn As a reminder, if X ∼ N(µ, σ ), thenx ¯ ∼ N(µ, σ =n) wherex ¯ = n i=1 xi, which implies x¯ − µ Z = p ∼ N(0; 1) σ= n i.e., the Z test statistic follows a standard normal distribution. However, in practice, the 2 2 1 Pn 2 population variance σ is often unknown, so the sample variance s = n−1 i=1(xi − x¯) must be used in its place.
    [Show full text]
  • 1 Sample Sign Test 1
    Non-Parametric Univariate Tests: 1 Sample Sign Test 1 1 SAMPLE SIGN TEST A non-parametric equivalent of the 1 SAMPLE T-TEST. ASSUMPTIONS: Data is non-normally distributed, even after log transforming. This test makes no assumption about the shape of the population distribution, therefore this test can handle a data set that is non-symmetric, that is skewed either to the left or the right. THE POINT: The SIGN TEST simply computes a significance test of a hypothesized median value for a single data set. Like the 1 SAMPLE T-TEST you can choose whether you want to use a one-tailed or two-tailed distribution based on your hypothesis; basically, do you want to test whether the median value of the data set is equal to some hypothesized value (H0: η = ηo), or do you want to test whether it is greater (or lesser) than that value (H0: η > or < ηo). Likewise, you can choose to set confidence intervals around η at some α level and see whether ηo falls within the range. This is equivalent to the significance test, just as in any T-TEST. In English, the kind of question you would be interested in asking is whether some observation is likely to belong to some data set you already have defined. That is to say, whether the observation of interest is a typical value of that data set. Formally, you are either testing to see whether the observation is a good estimate of the median, or whether it falls within confidence levels set around that median.
    [Show full text]
  • Signed-Rank Tests for ARMA Models
    View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Elsevier - Publisher Connector JOURNAL OF MULTIVARIATE ANALYSIS 39, l-29 ( 1991) Time Series Analysis via Rank Order Theory: Signed-Rank Tests for ARMA Models MARC HALLIN* Universith Libre de Bruxelles, Brussels, Belgium AND MADAN L. PURI+ Indiana University Communicated by C. R. Rao An asymptotic distribution theory is developed for a general class of signed-rank serial statistics, and is then used to derive asymptotically locally optimal tests (in the maximin sense) for testing an ARMA model against other ARMA models. Special cases yield Fisher-Yates, van der Waerden, and Wilcoxon type tests. The asymptotic relative efficiencies of the proposed procedures with respect to each other, and with respect to their normal theory counterparts, are provided. Cl 1991 Academic Press. Inc. 1. INTRODUCTION 1 .l. Invariance, Ranks and Signed-Ranks Invariance arguments constitute the theoretical cornerstone of rank- based methods: whenever weaker unbiasedness or similarity properties can be considered satisfying or adequate, permutation tests, which generally follow from the usual Neyman structure arguments, are theoretically preferable to rank tests, since they are less restrictive and thus allow for more powerful results (though, of course, their practical implementation may be more problematic). Which type of ranks (signed or unsigned) should be adopted-if Received May 3, 1989; revised October 23, 1990. AMS 1980 subject cIassifications: 62Gl0, 62MlO. Key words and phrases: signed ranks, serial rank statistics, ARMA models, asymptotic relative effhziency, locally asymptotically optimal tests. *Research supported by the the Fonds National de la Recherche Scientifique and the Minist&e de la Communaute Franpaise de Belgique.
    [Show full text]
  • 6 Single Sample Methods for a Location Parameter
    6 Single Sample Methods for a Location Parameter If there are serious departures from parametric test assumptions (e.g., normality or symmetry), nonparametric tests on a measure of central tendency (usually the median) are used. Recall: M is a median of a random variable X if P (X M) = P (X M) = :5. • ≤ ≥ The distribution of X is symmetric about c if P (X c x) = P (X c + x) for all x. • ≤ − ≥ For symmetric continuous distributions, the median M = the mean µ. Thus, all conclusions • about the median can also be applied to the mean. If X be a binomial random variable with parameters n and p (denoted X B(n; p)) then • n ∼ P (X = x) = px(1 p)n−x for x = 0; 1; : : : ; n x − n n! where = and k! = k(k 1)(k 2) 2 1: • x x!(n x)! − − ··· · − Tables exist for the cdf P (X x) for various choices of n and p. The probabilities and • cdf values are also easy to produce≤ using SAS or R. Thus, if X B(n; :5), we have • ∼ n P (X = x) = (:5)n x x n P (X x) = (:5)n ≤ k k X=0 P (X x) = P (X n x) because the B(n; :5) distribution is symmetric. ≤ ≥ − For sample sizes n > 20 and p = :5, a normal approximation (with continuity correction) • to the binomial probabilities is often used instead of binomial tables. (x :5) :5n { Calculate z = ± − : Use x+:5 when x < :5n and use x :5 when x > :5n. :5pn − { The value of z is compared to N(0; 1), the standard normal distribution.
    [Show full text]
  • Tests of Hypotheses Using Statistics
    Tests of Hypotheses Using Statistics Adam Massey¤and Steven J. Millery Mathematics Department Brown University Providence, RI 02912 Abstract We present the various methods of hypothesis testing that one typically encounters in a mathematical statistics course. The focus will be on conditions for using each test, the hypothesis tested by each test, and the appropriate (and inappropriate) ways of using each test. We conclude by summarizing the di®erent tests (what conditions must be met to use them, what the test statistic is, and what the critical region is). Contents 1 Types of Hypotheses and Test Statistics 2 1.1 Introduction . 2 1.2 Types of Hypotheses . 3 1.3 Types of Statistics . 3 2 z-Tests and t-Tests 5 2.1 Testing Means I: Large Sample Size or Known Variance . 5 2.2 Testing Means II: Small Sample Size and Unknown Variance . 9 3 Testing the Variance 12 4 Testing Proportions 13 4.1 Testing Proportions I: One Proportion . 13 4.2 Testing Proportions II: K Proportions . 15 4.3 Testing r £ c Contingency Tables . 17 4.4 Incomplete r £ c Contingency Tables Tables . 18 5 Normal Regression Analysis 19 6 Non-parametric Tests 21 6.1 Tests of Signs . 21 6.2 Tests of Ranked Signs . 22 6.3 Tests Based on Runs . 23 ¤E-mail: [email protected] yE-mail: [email protected] 1 7 Summary 26 7.1 z-tests . 26 7.2 t-tests . 27 7.3 Tests comparing means . 27 7.4 Variance Test . 28 7.5 Proportions . 28 7.6 Contingency Tables .
    [Show full text]
  • Signrank — Equality Tests on Matched Data
    Title stata.com signrank — Equality tests on matched data Description Quick start Menu Syntax Option for signrank Remarks and examples Stored results Methods and formulas References Also see Description signrank tests the equality of matched pairs of observations by using the Wilcoxon matched-pairs signed-rank test (Wilcoxon 1945). The null hypothesis is that both distributions are the same. signtest also tests the equality of matched pairs of observations (Arbuthnott[1710], but better explained by Snedecor and Cochran[1989]) by calculating the differences between varname and the expression. The null hypothesis is that the median of the differences is zero; no further assumptions are made about the distributions. This, in turn, is equivalent to the hypothesis that the true proportion of positive (negative) signs is one-half. For equality tests on unmatched data, see[ R] ranksum. Quick start Wilcoxon matched-pairs signed-rank test for v1 and v2 signrank v1 = v2 Compute an exact p-value for the signed-rank test signrank v1 = v2, exact Conduct signed-rank test separately for groups defined by levels of catvar by catvar: signrank v1 = v2 Test that the median of differences between matched pairs v1 and v2 is 0 signtest v1 = v2 Menu signrank Statistics > Nonparametric analysis > Tests of hypotheses > Wilcoxon matched-pairs signed-rank test signtest Statistics > Nonparametric analysis > Tests of hypotheses > Test equality of matched pairs 1 2 signrank — Equality tests on matched data Syntax Wilcoxon matched-pairs signed-rank test signrank varname = exp if in , exact Sign test of matched pairs signtest varname = exp if in by and collect are allowed with signrank and signtest; see [U] 11.1.10 Prefix commands.
    [Show full text]
  • Reference Manual
    PAST PAleontological STatistics Version 3.16 Reference manual Øyvind Hammer Natural History Museum University of Oslo [email protected] 1999-2017 1 Contents Welcome to the PAST! ......................................................................................................................... 11 Installation ........................................................................................................................................... 12 Quick start ............................................................................................................................................ 13 How do I export graphics? ................................................................................................................ 13 How do I organize data into groups? ................................................................................................ 13 The spreadsheet and the Edit menu .................................................................................................... 14 Entering data .................................................................................................................................... 14 Selecting areas ................................................................................................................................. 14 Moving a row or a column................................................................................................................ 14 Renaming rows and columns ..........................................................................................................
    [Show full text]
  • Comparison of T-Test, Sign Test and Wilcoxon Test
    IOSR Journal of Mathematics (IOSR-JM) e-ISSN: 2278-5728, p-ISSN:2319-765X. Volume 10, Issue 1 Ver. IV. (Feb. 2014), PP 01-06 www.iosrjournals.org On Consistency and Limitation of paired t-test, Sign and Wilcoxon Sign Rank Test 1 2 3 Akeyede Imam, Usman, Mohammed. and Chiawa, Moses Abanyam. 1Department of Mathematics, Federal University Lafia, Nigeria. 2Department of Statistics, Federal Polytechnic Bali, Nigeria. 3Department of Mathematics and Computer Science, Benue State University Makurdi, Nigeria. Correspondence: Akeyede Imam, Department of Mathematics, Federal University Lafia, PMB 106 Lafia, Nasarawa State, Nigeria. Abstract: This article investigates the strength and limitation of t-test and Wilcoxon Sign Rank test procedures on paired samples from related population. These tests are conducted under different scenario whether or not the basic parametric assumptions are met for different sample sizes. Hypothesis testing on equality of means require assumptions to be made about the format of the data to be employed. The test may depend on the assumption that a sample comes from a distribution in a particular family. Since there is doubt about the nature of data, non-parametric tests like Wilcoxon signed rank test and Sign test are employed. Random samples were simulated from normal, gamma, uniform and exponential distributions. The three tests procedures were applied on the simulated data sets at various sample sizes (small, moderate and large) and their Type I error and power of the test were studied in both situations under study. Keywords: Parametric, t-test, Sign-test, Wilcoxon Sign Rank, Type I error, Power of a test. I.
    [Show full text]
  • 1 a Novel Joint Location-‐Scale Testing Framework for Improved Detection Of
    A novel joint location-scale testing framework for improved detection of variants with main or interaction effects David Soave,1,2 Andrew D. Paterson,1,2 Lisa J. Strug,1,2 Lei Sun1,3* 1Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, M5T 3M7, Canada; 2 Program in Genetics and Genome Biology, Research Institute, The Hospital for Sick Children, Toronto, Ontario, M5G 0A4, Canada; 3Department of Statistical Sciences, University of Toronto, Toronto, Ontario, M5S 3G3, Canada. *Correspondence to: Lei Sun Department of Statistical Sciences 100 St George Street University of Toronto Toronto, ON M5S 3G3, Canada. E-mail: [email protected] 1 Abstract Gene-based, pathway and other multivariate association methods are now commonly implemented in whole-genome scans, but paradoxically, do not account for GXG and GXE interactions which often motivate their implementation in the first place. Direct modeling of potential interacting variables, however, can be challenging due to heterogeneity, missing data or multiple testing. Here we propose a novel and easy-to-implement joint location-scale (JLS) association testing procedure that can account for compleX genetic architecture without eXplicitly modeling interaction effects, and is suitable for large-scale whole-genome scans and meta-analyses. We focus on Fisher’s method and use it to combine evidence from the standard location/mean test and the more recent scale/variance test (JLS- Fisher), and we describe its use for single-variant, gene-set and pathway association analyses. We support our findings with analytical work and large-scale simulation studies. We recommend the use of this analytic strategy to rapidly and correctly prioritize susceptibility loci across the genome for studies of compleX traits and compleX secondary phenotypes of `simple’ Mendelian disorders.
    [Show full text]
  • Nonparametric Test Procedures 1 Introduction to Nonparametrics
    Nonparametric Test Procedures 1 Introduction to Nonparametrics Nonparametric tests do not require that samples come from populations with normal distributions or any other specific distribution. Hence they are sometimes called distribution-free tests. They may have other more easily satisfied assumptions - such as requiring that the population distribution is symmetric. In general, when the parametric assumptions are satisfied, it is better to use parametric test procedures. However, nonparametric test procedures are a valuable tool to use when those assumptions are not satisfied, and most have pretty high efficiency relative to the corresponding parametric test when the parametric assumptions are satisfied. This means you usually don't lose much testing power if you swap to the nonparametric test even when not necessary. The exception is the sign test, discussed below. The following table discusses the parametric and corresponding nonparametric tests we have covered so far. (z-test, t-test, F-test refer to the parametric test statistics used in each procedure). Parameter Parametric Test Nonparametric Test π One sample z-test Exact Binomial µ or µd One sample t-test Sign test or Wilcoxon Signed-Ranks test µ1 − µ2 Two sample t-test Wilcoxon Rank-Sum test ANOVA F-test Kruskal-Wallis As a note on p-values before we proceed, details of computations of p-values are left out of this discussion. Usually these are found by considering the distribution of the test statistic across all possible permutations of ranks, etc. in each case. It is very tedious to do by hand, which is why there are established tables for it, or why we rely on the computer to do it.
    [Show full text]