STAT 416 – Nonparametric Tests

1. Test for Randomness (Trend)

(1). Test based on total number of runs
(2). Runs Up and Down
(3). Rank von Neumann Test

2. Goodness-of-fit Test

Hypothesis $H_0: F_X(x) = F_0(x)$ for all $x$, where $F_0(x)$ is specified.

(1). Chi-square Test.
\[ Q = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i} \sim \chi^2(k-1) \quad \text{under the null hypothesis}, \]
where $f_i$ is the observed frequency in the $i$-th group and $e_i$ is the expected frequency in the $i$-th group, $i = 1, \ldots, k$.

(2). Kolmogorov-Smirnov Test.
\[ D_n = \sup_x \left| F_n(x) - F_0(x) \right|, \]
where $F_n(x)$ is the empirical cdf of the observed sample.

3. Location Test for One-Sample and Paired-Sample

Data: $\{X_i\}_{i=1}^{n}$, or $\{D_i = Y_i - X_i\}_{i=1}^{n}$ for a paired sample $\{(X_i, Y_i)\}_{i=1}^{n}$.
Hypothesis $H_0: M = M_0$ (equivalently $\theta = P(X > M_0) = 0.5$), where $M_0$ is a specified median.

(1). Sign Test
\[ K = \sum_{i=1}^{n} I\{X_i > M_0\} \sim \text{Binomial}(n, p = 0.5) \quad \text{under } H_0. \]

(2). Wilcoxon Signed-Rank Test
\[ T^{+} = \sum_{i=1}^{n} Z_i \cdot r(|D_i|), \]
where $D_i = X_i - M_0$, $Z_i = I\{D_i > 0\}$, and $r(|D_i|)$ is the rank of $|D_i|$ among $|D_1|, \ldots, |D_n|$.

4. Test for General Two-Sample

Data: $\{X_1, \ldots, X_m\}$ and $\{Y_1, \ldots, Y_n\}$, two independent samples.

(1). Hypothesis $H_0: F_X(x) = F_Y(x)$ for all $x$. Kolmogorov-Smirnov Test.
\[ D_{m,n} = \sup_x \left| F_{m,X}(x) - F_{n,Y}(x) \right|, \]
where $F_{m,X}$ and $F_{n,Y}$ are the empirical cdfs of samples $X$ and $Y$.

(2). Location Test for General Two-Sample from a Location Family. Hypothesis $H_0: M_X = M_Y$.

a). Median Test: $U = \sum_{i=1}^{m} I\{X_i < M\}$, where $M$ is the median of the pooled sample $\{X_1, \ldots, X_m, Y_1, \ldots, Y_n\}$.

b). Control Median Test ($Y$ is a control sample): $V = \sum_{i=1}^{m} I\{X_i < M_Y\}$.

c). Mann-Whitney Test (for tendency):
\[ U = \sum_{i=1}^{m} \sum_{j=1}^{n} D_{ij}, \quad D_{ij} = I\{Y_j < X_i\}. \]

5. Linear Rank Test for Location Problems

Hypothesis $H_0: F_Y(x) = F_X(x)$ for all $x$, vs. $H_1: F_Y(x) = F_X(x - \theta)$ for all $x$ and some $\theta \neq 0$. The pooled sample size is $N = n + m$.

Wilcoxon Rank Sum Test. $H_0: \theta = 0$ vs. $H_1: \theta \neq 0$,
\[ W_N = \sum_{i=1}^{N} i Z_i, \]
where $Z_i = 1$ if the $i$-th ordered random variable in the pooled sample is an $X$; otherwise $Z_i = 0$.

6. Linear Rank Test for Scale Problems

Hypothesis $H_0: F_Y(x) = F_X(x)$ for all $x$, vs. $H_1: F_Y(x) = F_X(x \cdot \theta)$ for all $x$ and some $\theta > 0$, $\theta \neq 1$.

(1). Mood Test:
\[ M_N = \sum_{i=1}^{N} \left( i - \frac{N+1}{2} \right)^2 Z_i. \]

(2). Siegel-Tukey Test:
\[ S_N = \sum_{i=1}^{N} a_i Z_i, \quad \text{where } a_i = \begin{cases} 2i & \text{for } i \text{ even},\ 1 < i \le N/2 \\ 2i - 1 & \text{for } i \text{ odd},\ 1 \le i \le N/2 \\ 2(N - i) + 2 & \text{for } i \text{ even},\ N/2 < i \le N \\ 2(N - i) + 1 & \text{for } i \text{ odd},\ N/2 < i \le N \end{cases} \]

(3). Sukhatme Test:
\[ T = \sum_{i=1}^{m} \sum_{j=1}^{n} D_{ij}, \quad \text{where } D_{ij} = I\{Y_j < X_i < 0 \text{ or } 0 < X_i < Y_j\}. \]

7. Test for Equality of k Independent Samples

All samples are from a location model $F(x - \theta_i)$, $i = 1, \ldots, k$.
Hypothesis $H_0: \theta_1 = \theta_2 = \cdots = \theta_k$ vs. $H_1: \theta_i \neq \theta_j$ for at least one pair.

(1). Extension of Median Test and Control Median Test

(3). Kruskal-Wallis Test:
\[ H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{1}{n_i} \left[ R_i - \frac{n_i (N+1)}{2} \right]^2, \]
where $R_i$ is the rank sum of the $i$-th sample, $n_i$ is the size of the $i$-th sample, and $N = n_1 + \cdots + n_k$.

(4). Multiple Comparison:
\[ Z_{ij} = \frac{\bar{R}_i - \bar{R}_j}{\sqrt{\dfrac{N(N+1)}{12} \left( \dfrac{1}{n_i} + \dfrac{1}{n_j} \right)}}, \]
where $\bar{R}_i = R_i / n_i$. Compare the above statistic with the normal score $z^* = \Phi^{-1}\left( \alpha^* / (k(k-1)) \right)$, where $\alpha^* = 0.20$ is the level used for multiple comparisons.

8. Measure of Association for Bivariate Sample

Data: $\{(X_i, Y_i),\ i = 1, \ldots, n\}$. Hypothesis $H_0$: the two samples are independent.

(1). Kendall's Tau Statistic for $\tau = p_c - p_d$ (the difference between the probabilities of concordance and discordance):
\[ T = \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} A_{ij}}{n(n-1)}, \quad \text{where } A_{ij} = \operatorname{sign}(X_j - X_i) \operatorname{sign}(Y_j - Y_i). \]

(2). Spearman's Rho coefficient of rank correlation:
\[ R = \frac{12 \sum_{i=1}^{n} (R_i - \bar{R})(S_i - \bar{S})}{n(n^2 - 1)} = 1 - \frac{6 \sum_{i=1}^{n} D_i^2}{n(n^2 - 1)}, \]
where $R_i$ and $S_i$ are the ranks of $X_i$ and $Y_i$ respectively, and $D_i = R_i - S_i$.

9. Friedman's ANOVA Test by Ranks

A set of observations is collected over $k$ blocks and $n$ treatments (randomized complete block design). $R_{ij}$ is the rank of the observation for the $j$-th treatment within the $i$-th block, and $R_j = \sum_{i=1}^{k} R_{ij}$ is the rank sum of the $j$-th treatment.

Hypothesis on treatment effect: $H_0: \theta_1 = \theta_2 = \cdots = \theta_n$.

Friedman's Test Statistic:
\[ S = \sum_{j=1}^{n} \left( R_j - \frac{k(n+1)}{2} \right)^2, \qquad Q = \frac{12 \cdot S}{k n (n+1)} \sim \chi^2(n-1) \quad \text{under } H_0. \]

10. Kendall's Coefficient of Concordance of k sets of n objects

There are $k$ sets of observations collected, and each set includes $n$ objects. $R_{ij}$ is the rank of the $j$-th object within the $i$-th set, and $R_j = \sum_{i=1}^{k} R_{ij}$.

Hypothesis $H_0$: the $k$ sets are independent (there is no association).

The deviation statistic is
\[ S = \sum_{j=1}^{n} \left( R_j - \frac{k(n+1)}{2} \right)^2, \qquad Q = \frac{12 \cdot S}{k n (n+1)} \sim \chi^2(n-1) \quad \text{under } H_0. \]

Kendall's Coefficient of Concordance (ratio statistic):
\[ W = \frac{12 \cdot S}{k^2 n (n^2 - 1)}, \quad \text{where } 0 \le W \le 1. \]

11. Chi-square Test for Independence (Count Data)

A two-dimensional contingency table lists the count $X_{ij}$ at the $i$-th level of factor $A$ ($A_i$) and the $j$-th level of factor $B$ ($B_j$). Let $X_{i\cdot}$ and $X_{\cdot j}$ denote the row total and column total, and let $N$ be the total count. Let $\theta_{ij} = P(A_i \cap B_j)$, $\theta_{i\cdot} = \sum_j \theta_{ij} = P(A_i)$, and $\theta_{\cdot j} = \sum_i \theta_{ij} = P(B_j)$, subject to the restriction $\sum_i \theta_{i\cdot} = \sum_j \theta_{\cdot j} = 1$.

The hypothesis of independence: $H_0: \theta_{ij} = \theta_{i\cdot} \theta_{\cdot j}$ for all $i$ and all $j$.

Under the null hypothesis, the test statistic is
\[ Q = \sum_{i=1}^{r} \sum_{j=1}^{k} \frac{(N X_{ij} - X_{i\cdot} X_{\cdot j})^2}{N X_{i\cdot} X_{\cdot j}} \sim \chi^2\big((r-1)(k-1)\big). \]

12. Fisher's Exact Test

Two independent binomial random samples, $Y_i \sim \text{Bin}(n_i, \theta_i)$, $i = 1, 2$. Under the null hypothesis $H_0: \theta_1 = \theta_2 = \theta$, the exact distribution of $Y_1$ given $Y = Y_1 + Y_2 = y$ is
\[ P(Y_1 = y_1 \mid Y = y) = \frac{\dbinom{n_1}{y_1} \dbinom{n_2}{y - y_1}}{\dbinom{N}{y}}, \]
where $N = n_1 + n_2$.
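As an illustration of the conditional distribution in item 12, the following minimal Python sketch evaluates P(Y1 = y1 | Y = y) using only the standard library. The function names, the one-sided upper-tail p-value, and the example counts are assumptions made for illustration; they are not part of the original notes.

```python
from math import comb


def fisher_conditional_pmf(y1, y, n1, n2):
    """P(Y1 = y1 | Y1 + Y2 = y) under H0: theta1 = theta2 (hypergeometric)."""
    # Outside the support max(0, y - n2) <= y1 <= min(n1, y) the probability is 0.
    if y1 < max(0, y - n2) or y1 > min(n1, y):
        return 0.0
    return comb(n1, y1) * comb(n2, y - y1) / comb(n1 + n2, y)


def fisher_upper_tail(y1, y, n1, n2):
    """One-sided p-value P(Y1 >= y1 | Y = y); an illustrative choice of tail."""
    return sum(
        fisher_conditional_pmf(v, y, n1, n2) for v in range(y1, min(n1, y) + 1)
    )


# Hypothetical data: 3 successes out of n1 = 5, and 1 success out of n2 = 5.
p_point = fisher_conditional_pmf(3, 4, 5, 5)  # C(5,3) * C(5,1) / C(10,4)
p_tail = fisher_upper_tail(3, 4, 5, 5)        # adds the y1 = 4 term
print(p_point, p_tail)
```

Because the conditional distribution does not depend on the common value of θ, no nuisance parameter has to be estimated; the tail sum simply adds the probabilities of the tables at least as extreme in the chosen direction.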

