Hypothesis Testing III
Kolmogorov-Smirnov test Mann-Whitney test Normality test Brief summary Quiz

Hypothesis testing III
Botond Szabo
Leiden University
Leiden, 16 April 2018

Outline
1. Kolmogorov-Smirnov test
2. Mann-Whitney test
3. Normality test
4. Brief summary
5. Quiz

One-sample Kolmogorov-Smirnov test
IID observations $X_1, \ldots, X_n \sim F$.
Want to test $H_0 : F = F_0$ versus $H_1 : F \neq F_0$, where $F_0$ is some fixed CDF.
Test statistic:
$$T_n = \sqrt{n}\, D_n = \sqrt{n}\, \sup_x |\hat{F}_n(x) - F_0(x)|,$$
where $\hat{F}_n$ is the empirical CDF of the sample.
If $X_1, \ldots, X_n \sim F_0$ and $F_0$ is continuous, then the asymptotic distribution of $T_n$ is independent of $F_0$ and is given by the Kolmogorov distribution.
Reject the null hypothesis if $T_n > K_\alpha$, where $K_\alpha$ is the $1 - \alpha$ quantile of the Kolmogorov distribution.

Kolmogorov-Smirnov test (estimated parameters)
IID data $X_1, \ldots, X_n \sim F$.
Want to test $H_0 : F \in \mathcal{F}$ versus $H_1 : F \notin \mathcal{F}$, where $\mathcal{F} = \{F_\theta : \theta \in \Theta\}$.
Test statistic:
$$T_n = \sqrt{n}\, D_n = \sqrt{n}\, \sup_x |\hat{F}_n(x) - F_{\hat{\theta}}(x)|,$$
where $\hat{\theta}$ is an estimate of $\theta$ based on the $X_i$'s.
Astronomers often proceed by treating $\hat{\theta}$ as a constant and using the critical values from the usual Kolmogorov-Smirnov test.

Kolmogorov-Smirnov test (continued)
This is a faulty practice: asymptotically $T_n$ will typically not have the Kolmogorov distribution.
Extra work is required to obtain the correct critical values.
The case when $F_\theta$ is the normal CDF has been worked out explicitly (Lilliefors test).
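As an illustration of the one-sample test with a fully specified $F_0$, here is a minimal sketch in plain Python. The function names (`ks_statistic`, `kolmogorov_sf`, `standard_normal_cdf`), the simulated data, and the seed are assumptions made for this example and do not come from the slides. The tail probability of the Kolmogorov distribution is computed from its standard series $P(K > t) = 2 \sum_{k \ge 1} (-1)^{k-1} e^{-2 k^2 t^2}$.

```python
import math
import random


def ks_statistic(sample, cdf):
    """Return T_n = sqrt(n) * sup_x |F_hat_n(x) - F0(x)| for a continuous F0."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f0 = cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at the i-th order
        # statistic, so the supremum is attained on one side of a jump.
        d = max(d, i / n - f0, f0 - (i - 1) / n)
    return math.sqrt(n) * d


def kolmogorov_sf(t, terms=100):
    """P(K > t) for the Kolmogorov distribution via its alternating series."""
    if t < 0.2:
        # P(K <= 0.2) is below 1e-12, so the tail is 1 for practical purposes.
        return 1.0
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * t * t)
            for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * s))


def standard_normal_cdf(x):
    """CDF of N(0, 1), used here as the fixed null CDF F0."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


if __name__ == "__main__":
    random.seed(0)  # arbitrary seed, only for reproducibility
    data = [random.gauss(0.0, 1.0) for _ in range(200)]  # simulated sample
    t_n = ks_statistic(data, standard_normal_cdf)
    p_value = kolmogorov_sf(t_n)
    print(f"T_n = {t_n:.3f}, asymptotic p-value = {p_value:.3f}")
    print("reject H0" if p_value < 0.05 else "do not reject H0", "at level 0.05")
```

The p-value here is only asymptotic, and it is valid only when $F_0$ is fixed in advance. If the mean and variance were estimated from `data` and plugged into the CDF, the same critical values would no longer apply, which is the point made on the slides about the estimated-parameters case.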

Two-sample Kolmogorov-Smirnov test
IID observations $X_1, \ldots, X_n \sim F_X$ and $Y_1, \ldots, Y_m \sim F_Y$.
Want to test $H_0 : F_X = F_Y$ versus $H_1 : F_X \neq F_Y$.
Test statistic:
$$T_{n,m} = \sqrt{\frac{nm}{n+m}}\, D_{n,m} = \sqrt{\frac{nm}{n+m}}\, \sup_x |\hat{F}_X(x) - \hat{F}_Y(x)|.$$
The limit distribution of $T_{n,m}$ under the null hypothesis is the Kolmogorov distribution.
Reject the null hypothesis if $T_{n,m} > K_\alpha$, where $K_\alpha$ is the $1 - \alpha$ quantile of the Kolmogorov distribution.

Mann-Whitney test
IID observations $X_1, \ldots, X_n \sim F_X$ and $Y_1, \ldots, Y_m \sim F_Y$.
Want to test $H_0 : F_X = F_Y$ versus $H_1 : F_X \neq F_Y$.
Mann-Whitney test (or Wilcoxon rank sum test):
Basic idea: group all the $n + m$ observations together, rank them in order of increasing size and look at the rank sum of the $Y$'s, say $R$ (the Wilcoxon statistic).
If the latter takes too unlikely a value compared to what we could have obtained under the null hypothesis, we reject the null hypothesis.

Example
Consider the data in the following table (ranks are shown in parentheses):

X's      Y's
1 (1)    6 (4)
3 (2)    4 (3)

The rank sum of the $X$'s is 3 and that of the $Y$'s is $R = 7$.
Under the null hypothesis each of the $\binom{4}{2} = 6$ assignments of ranks to the $Y$'s is equally likely.

Example (continued)

Ranks of Y's    R
{1, 2}          3
{1, 3}          4
{1, 4}          5
{2, 3}          5
{2, 4}          6
{3, 4}          7

Example (continued)
Under the null hypothesis the distribution of $R$ is:

r           3     4     5     6     7
P(R = r)    1/6   1/6   1/3   1/6   1/6

In particular, $P(R = 7) = 1/6$. So if the null hypothesis were true, the rank sum we saw would occur one time out of six purely on the basis of chance.
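
The enumeration in the example above can be reproduced mechanically. The following is a small sketch using only the Python standard library. The helper names (`rank_sum_of_ys`, `rank_sum_null_distribution`, `upper_tail_probability`) are made up for this illustration, and the code assumes there are no ties in the pooled sample. Brute-force enumeration is only feasible for small $n$ and $m$.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def rank_sum_of_ys(xs, ys):
    """Rank sum R of the Y's in the pooled sample (assumes no ties)."""
    pooled = sorted(list(xs) + list(ys))
    rank = {value: i for i, value in enumerate(pooled, start=1)}
    return sum(rank[y] for y in ys)


def rank_sum_null_distribution(n, m):
    """Exact null distribution of R: every choice of m ranks out of
    1..n+m for the Y's is equally likely under H0."""
    counts = Counter(sum(c) for c in combinations(range(1, n + m + 1), m))
    total = sum(counts.values())
    return {r: Fraction(c, total) for r, c in sorted(counts.items())}


def upper_tail_probability(observed, dist):
    """P(R >= observed) under the null distribution `dist`."""
    return sum(p for r, p in dist.items() if r >= observed)


# Data from the example: X's are 1 and 3, Y's are 6 and 4.
r_obs = rank_sum_of_ys([1, 3], [6, 4])
dist = rank_sum_null_distribution(2, 2)
print(r_obs)                                # 7
for r, p in dist.items():
    print(r, p)                             # 3 1/6, 4 1/6, 5 1/3, 6 1/6, 7 1/6
print(upper_tail_probability(r_obs, dist))  # 1/6
```

Exact fractions are used instead of floats so that the output matches the table entry by entry.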

Mann-Whitney test: formalisation
It can be shown that the rank sum (Wilcoxon statistic) can be expressed in terms of the Mann-Whitney statistic
$$U = \frac{1}{nm} \sum_{i=1}^{n} \sum_{j=1}^{m} 1_{[X_i < Y_j]},$$
with $X_1, \ldots, X_n$ and $Y_1, \ldots, Y_m$ the two samples.
The latter is an estimator of $P(X < Y)$.
If $F_X = F_Y$ (and the common distribution is continuous), then $P(X < Y) = 1/2$. The statistic $U$ is trying to detect deviation from this.

Shapiro-Wilk test
IID observations $X_1, \ldots, X_n \sim F$.
Want to test $H_0 : F$ is normal versus $H_1 : F$ is not normal.
Test statistic:
$$W_n = \frac{\left(\sum_{i=1}^{n} a_i X_{(i)}\right)^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2},$$
where $X_{(i)}$ is the $i$th order statistic,
$$(a_1, \ldots, a_n) = \frac{m^T V^{-1}}{(m^T V^{-1} V^{-1} m)^{1/2}},$$
with $m = (m_1, \ldots, m_n)^T$ the vector of expectations of the order statistics of IID standard normal random variables $Z_1, \ldots, Z_n$, and $V$ the corresponding covariance matrix.

Shapiro-Wilk test (continued)
The test rejects for small values of $W_n$.
The distribution of $W_n$ is tabulated (or approximated), which allows determination of the critical values.

Brief summary
The Wald test relies on an asymptotic argument and is useful in large-sample settings.
The t-test is exact (no asymptotics) and useful, but makes the normality assumption.
The likelihood ratio test is excellent, and even the best in various senses in various setups, but makes parametric assumptions.
There are many normality tests; Shapiro-Wilk is just one example. They nicely complement graphical tools for checking normality (histogram, QQ-plot). They are not so useful with small samples, but small samples always pose difficulties.
Nonparametric tests are nice, sometimes exact, other times based on asymptotic arguments.
No silver bullet.

Question 1
What is not true for the one-sample Kolmogorov-Smirnov test?
Answers:
1. It is a nonparametric test.
2. The test statistic is $T_n = \sqrt{n}\, \sup_x |\hat{F}_n(x) - F_0(x)|$.
3. The asymptotic distribution of $T_n$ depends on $F_0$.
4. The asymptotic distribution of $T_n$ is the Kolmogorov distribution.

Question 2
What is the basic idea in the Mann-Whitney test for checking if two data sets are coming from the same distribution?
Answers:
1. Rank the observations and check if the rank sum of the first sample is not too unlikely.
2. Check if the average of the observations is different.
3. Check if the empirical distributions are close to each other.
4. Check if the differences of the observations concentrate around zero.

Question 3
What is not true for the Shapiro-Wilk test?
Answers:
1. It is a nonparametric test.
2. It is a normality test.
3. The distribution of the test statistic $W_n$ is given in a table.
4. It is the single best test to check normality and especially useful for small sample sizes.