R session

Shota Gugushvili November 12, 2013

1 Permutation test

Here I will illustrate the permutation test as described on pp. 162–163 in Wasserman. First let me generate two samples, X1,...,Xm ∼ FX and Y1,...,Yn ∼ FY.

> m <- 10
> n <- 10
> set.seed(123456)
> dat.1 <- rnorm(m)
> dat.2 <- rnorm(n, mean=3, sd=1)

I want to test H0 : FX = FY versus H1 : FX ≠ FY. Let T(X1,...,Xm,Y1,...,Yn) = |X̄m − Ȳn|. In the following code I implement the permutation test based on T (since N = m + n = 20, the total number of possible permutations is 20!; type factorial(20) in R to find out how large this number actually is!):

> dat.3 <- c(dat.1, dat.2)  # Combine dat.1 and dat.2 into one data set.
> t.obs <- abs(mean(dat.1) - mean(dat.2))  # t_{obs} in Wasserman, p. 163, step 1.
> t.obs

[1] 2.460882

> # Now steps 2 and 3 in Wasserman, p. 163:
> N <- m + n  # Size of dat.3.
> B <- 100000  # Number of replications.
> # In the next object we will fill in values of T based on permuted data.
> T.sim <- NULL
> # Loop for filling in T.sim values.
> index <- seq(1, N)
> for (i in 1:B) {
+   random1 <- sample(dat.3, m, replace=FALSE)
+   random2 <- setdiff(dat.3, random1)
+   T.sim[i] <- abs(mean(random1) - mean(random2))
+ }
> # Now we compute an approximate p-value (Wasserman, p. 163, step 4).
> p.value <- sum(T.sim > t.obs)/B  # Check the English R guide on the blackboard, pp. 11-12.
> p.value
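As an aside, here is a slightly different way of writing the same simulation (my own sketch, not from Wasserman): permuting indices instead of values avoids a subtle problem with setdiff(dat.3, random1), which would silently drop observations if dat.3 happened to contain tied values.

```r
# Permutation test via index sampling (tie-safe variant of the loop above).
set.seed(123456)
dat.1 <- rnorm(10)
dat.2 <- rnorm(10, mean = 3, sd = 1)
dat.3 <- c(dat.1, dat.2)
t.obs <- abs(mean(dat.1) - mean(dat.2))
B <- 10000
T.sim <- replicate(B, {
  idx <- sample(length(dat.3), length(dat.1))  # indices of the permuted "X" sample
  abs(mean(dat.3[idx]) - mean(dat.3[-idx]))
})
p.value <- mean(T.sim > t.obs)
```

With continuous data ties essentially never occur, so the setdiff version works in practice; the index version is simply the safer habit.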

[1] 7e-05

A small p-value gives very strong evidence against the null hypothesis.

Exercise 1. Figure out why the code does what the algorithm on p. 163 in Wasserman says.

Remark 1. The statistic T as above is not a good choice for detecting a difference between two distributions that have the same means. Always choose T carefully.
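To see the point of the remark concretely, here is a small sketch (my own toy example, not from Wasserman): two samples with equal means but different spreads. A permutation test based on |X̄m − Ȳn| is essentially blind to the difference, while one based on the difference of standard deviations picks it up.

```r
set.seed(1)
x <- rnorm(50, mean = 0, sd = 1)
y <- rnorm(50, mean = 0, sd = 3)   # same mean, three times the spread
z <- c(x, y)
B <- 5000

# Generic permutation p-value for a two-sample statistic t.fun.
perm.p <- function(t.fun) {
  t.obs <- t.fun(x, y)
  T.sim <- replicate(B, {
    idx <- sample(length(z), length(x))
    t.fun(z[idx], z[-idx])
  })
  mean(T.sim > t.obs)
}

p.mean <- perm.p(function(a, b) abs(mean(a) - mean(b)))  # blind to a spread difference
p.sd   <- perm.p(function(a, b) abs(sd(a) - sd(b)))      # targets the spread difference
```

Running this, p.sd comes out far smaller than p.mean: the mean-based statistic has essentially no power against a pure spread difference.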

2 One-sample Kolmogorov-Smirnov test

The command ks.test performs both the one-sample and two-sample Kolmogorov-Smirnov tests. Let us start with the one-sample case. We generate a sample of size n = 50 from the N(µ, σ²) distribution and test the null hypothesis that the data come from the normal CDF Φµ,σ with (µ, σ) = (0, 1).

> data <- rnorm(50)
> ks.test(data, "pnorm", mean=0, sd=1)

        One-sample Kolmogorov-Smirnov test

data:  data
D = 0.109, p-value = 0.5559
alternative hypothesis: two-sided

The syntax is quite obvious: the command takes as its input the data set, the name of the distribution and the parameter values determining the null hypothesis.

Exercise 2. Generate a sample of size n = 20 from the Gamma(α, β) distribution with parameters α = 1 and β = 2 (take good notice: here I use Wasserman's parametrisation of the gamma distribution). Next test the null hypothesis α0 = 0.8, β0 = 1.8 using the Kolmogorov-Smirnov test.
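A hint on the parametrisation (this is a common pitfall): in Wasserman's Gamma(α, β) the parameter β is a scale parameter, whereas rgamma()'s second positional argument is a rate. Pass scale= explicitly (and do the same in pgamma() when testing):

```r
set.seed(1)
alpha <- 1
beta <- 2
x <- rgamma(20, shape = alpha, scale = beta)  # Gamma(1, 2) in Wasserman's notation
# Beware: rgamma(20, alpha, beta) would instead use rate = beta, i.e. scale 1/2.
```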

3 Two-sample Kolmogorov-Smirnov test

The two-sample Kolmogorov-Smirnov test is just as easy to perform. To illustrate it I will again use the MWG and M 31 data sets. Assume Wikipedia is right in saying the distance modulus to Andromeda is 24.4. I will subtract that number from the values (apparent magnitudes) in the M 31 data set and compare the resulting data set to the MWG data set (absolute magnitudes).

> GC_M31 <- read.table("http://astrostatistics.psu.edu/MSMA/datasets/GlobClus_M31.dat",
+   header=TRUE)
> GC_MWG <- read.table("http://astrostatistics.psu.edu/MSMA/datasets/GlobClus_MWG.dat",
+   header=TRUE)
> data.x <- GC_M31[,2] - 24.4
> data.y <- GC_MWG[,2]
> ks.test(data.x, data.y)

        Two-sample Kolmogorov-Smirnov test

data:  data.x and data.y
D = 0.259, p-value = 0.0002817
alternative hypothesis: two-sided

The syntax is self-evident: I supply the command with two data sets. An extremely small p-value gives very strong evidence in favour of the claim that the two distributions are not the same.

4 Mann-Whitney test

Another nonparametric test for the null hypothesis of equality of two distributions is the Mann-Whitney test. It is included in R as part of the functionality of the wilcox.test command. Here is an illustration using the same example as in the case of the two-sample Kolmogorov-Smirnov test above.

> wilcox.test(data.x, data.y)

        Wilcoxon rank sum test with continuity correction

data:  data.x and data.y
W = 17268.5, p-value = 0.009496
alternative hypothesis: true location shift is not equal to 0

The result is comparable to the Kolmogorov-Smirnov test case in that the p-value is small.

5 Normality tests

There are many normality tests implemented in R. Let us try out the Shapiro-Wilk test using the MWG and M 31 data sets.

> data.x <- GC_M31[,2]
> data.y <- GC_MWG[,2]
> shapiro.test(data.x)

        Shapiro-Wilk normality test

data:  data.x
W = 0.9853, p-value = 0.001017

> shapiro.test(data.y)

        Shapiro-Wilk normality test

data:  data.y
W = 0.9883, p-value = 0.675

A small p-value gives strong evidence against the normality assumption in the M 31 case. We see no evidence against normality in the MWG case. You can compare the results of the Shapiro-Wilk test to the graphical checks (histogram, QQ-plot) you have applied to the two data sets before. Many other normality tests are implemented in the nortest package (if you want to use it, you first have to install and then load it).

6 Lilliefors test

Let me draw your attention to the Lilliefors test (a Kolmogorov-Smirnov test with estimated parameters).

> library(nortest)
> lillie.test(data.y)

        Lilliefors (Kolmogorov-Smirnov) normality test

data:  data.y
D = 0.0688, p-value = 0.4498

Compare this to the output of the Kolmogorov-Smirnov test that disregards the fact that the parameters have been estimated.

> ks.test(data.y, "pnorm", mean=mean(data.y), sd=sd(data.y))

        One-sample Kolmogorov-Smirnov test

data:  data.y
D = 0.0688, p-value = 0.8376
alternative hypothesis: two-sided

We get a much larger (and incorrect) p-value in the latter case. This does no harm in our particular example, but the message must be clear: the common practice in astronomy of disregarding the fact that the parameters have been estimated and proceeding with the usual Kolmogorov-Smirnov goodness-of-fit test can lead to invalid inference.
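To make the correction transparent, here is a sketch (my own, on made-up normal data) of approximating what lillie.test does by a parametric bootstrap: simulate the null distribution of D, re-estimating the parameters on every simulated sample.

```r
set.seed(1)
y <- rnorm(100, mean = 5, sd = 2)  # stand-in for a real data set
d.obs <- ks.test(y, "pnorm", mean = mean(y), sd = sd(y))$statistic

B <- 2000
d.sim <- replicate(B, {
  ystar <- rnorm(length(y), mean = mean(y), sd = sd(y))
  # The crucial step: parameters are re-estimated from each simulated sample.
  ks.test(ystar, "pnorm", mean = mean(ystar), sd = sd(ystar))$statistic
})
p.corrected <- mean(d.sim > d.obs)
```

The naive ks.test p-value uses the null distribution of D computed as if the parameters were fixed in advance; the bootstrap p-value above accounts for the estimation, which is why it comes out smaller (as lillie.test's does).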

7 Additional practice

You will be working with the Hipparcos data set. Read its description here: http://astrostatistics.psu.edu/datasets/HIP_star.html.

Using exploratory data analysis techniques in R, 92 Hyades stars were identified out of the data set consisting of 2719 Hipparcos stars. The way this has been done is described here (if you want, you can try the steps out, but perhaps at this stage it is better to be content with the end results given below): http://astrostatistics.psu.edu/datasets/2006tutorial/2006reg.html. I presume there exist more refined astronomical techniques for doing that (just as there are more sophisticated statistical techniques), but the goal of that tutorial is just to show how exploratory data analysis can be performed in R and why it is useful for astronomers to know how to do it.

> hip <- read.table("http://astrostatistics.psu.edu/datasets/HIP_star.dat",
+   header=T, fill=T)
> attach(hip)
> filter1 <- (RA>50 & RA<100 & DE>0 & DE<25)
> filter2 <- (pmRA>90 & pmRA<130 & pmDE>-60 & pmDE< -10)
> filter <- filter1 & filter2 & (e_Plx<5)

The user-defined variable filter identifies the Hyades stars. What we are interested in is the colour of the stars. In the code below H contains the colours of the 92 Hyades, while nH gives the colours of the other stars (had I not added !is.na(color) to the definition of nH, there would have been some missing values (NA values) in it; keeping them does no direct harm, but I nevertheless decided to throw them out). The variable B.V is the colour of a star in the original data set.

> color <- B.V
> H <- color[filter]
> nH <- color[!filter & !is.na(color)]

So now we have two data sets, one with the colours of the Hyades and another with the colours of other stars. We are interested in studying whether the two groups of stars really differ from each other as far as their colour is concerned. Numerical summaries indeed suggest some difference.

> summary(H)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.0490  0.3680  0.5600  0.6123  0.8410  1.3270

> summary(nH)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.1580  0.5662  0.7160  0.7668  0.9540  2.8000

We can be more formal here. Observations in each group can be thought of as coming from a certain distribution. Thus we can test whether the two distributions are the same.

Exercise 3. Carry out testing in the above setting using the two-sample Kolmogorov-Smirnov and the Mann-Whitney tests. What are your conclusions? The permutation test for the sample means can in principle also be used, but it is going to be slow.

8 General observations

Here I provide a little summary of the various tests I have introduced. One can classify them into two categories: those that make parametric assumptions (t-test, χ²-test) and those that do not (permutation test, Mann-Whitney test, Kolmogorov-Smirnov test).

Parametric tests, such as the t-test, should be applied in those cases where we have good reasons to believe that the parametric assumptions we are making are valid. In the case of the t-test we assume that our data have been generated from the normal distribution. We learned two graphical tools for checking the normality assumption: the histogram and the QQ-plot. But one has to be careful with them: histograms are sensitive to how the bin widths are chosen, and a QQ-plot depends on which quantiles we decide to plot. In this course we just trusted the default choices of R, but they do not always do the job, and in any case there is some subjectivity in drawing conclusions from graphs. More formally, we could have performed a normality test, of which there are many, and of which we considered just one, the Shapiro-Wilk test. These are good, but can be unreliable for small sample sizes. Inference with small sample sizes is difficult in any case.

The Wald test is used for testing hypotheses on a one-dimensional parameter θ. The estimator θ̂ of θ, on which the Wald test statistic is based, has to be asymptotically normal. Not all estimators are asymptotically normal. Also, the Wald test is based on a large-sample argument. It is conceptually simple in those cases where good, natural estimators of θ exist, but can be inaccurate for small sample sizes.

Permutation tests are most useful when sample sizes are small. They are exact in the sense that they do not rely on asymptotic approximations. When used properly (meaning when the test statistic is chosen properly), they are a very powerful tool.
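As a concrete reminder of what the Wald test computes, here is a minimal sketch on toy data (my own example): testing H0 : θ = 0 for θ the mean of the distribution, using the asymptotic normality of the sample mean.

```r
set.seed(1)
x <- rnorm(100, mean = 0.3)          # toy data with true mean 0.3
theta0 <- 0
theta.hat <- mean(x)                 # natural estimator of theta
se.hat <- sd(x) / sqrt(length(x))    # estimated standard error
W <- (theta.hat - theta0) / se.hat   # Wald statistic
p <- 2 * pnorm(-abs(W))              # two-sided p-value via the normal approximation
```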
The Mann-Whitney test, which is based on ranks, can be thought of as a particular instance of a permutation test, with the additional advantage that the distribution of the Mann-Whitney statistic U under the null hypothesis is tabulated (known) for small sample sizes, while for large samples accurate approximations based on asymptotic normality exist. This is a very good test: even when the data come from the normal distribution, the Mann-Whitney test is only slightly inferior to the t-test, while it beats the t-test when the data are not normal.
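The rank-based view can be made explicit with a small sketch (toy data, my own example): recompute the rank sum of the first sample over random relabellings of the pooled sample and compare the result with wilcox.test.

```r
set.seed(1)
x <- rnorm(15)
y <- rnorm(15, mean = 1)
z <- c(x, y)
r <- rank(z)                               # ranks of the pooled sample
w.obs <- sum(r[seq_along(x)])              # rank sum of the x-sample

B <- 5000
w.sim <- replicate(B, sum(sample(r, length(x))))
# Two-sided permutation p-value for the rank-sum statistic.
p.perm <- 2 * min(mean(w.sim <= w.obs), mean(w.sim >= w.obs))
# p.perm should come out close to wilcox.test(x, y)$p.value.
```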

The Kolmogorov-Smirnov test is very popular in astronomy. For it to be applicable, sample sizes must be large. In the two-sample case the test at times performs poorly when the means of the two distributions are the same. The χ²-test for grouped/binned data is typically inferior to the other options mentioned above.

There are a number of papers comparing different nonparametric tests. One that specifically has astronomical applications in mind is Hou et al. (2009): http://iopscience.iop.org/0004-637X/702/2/1199/.
