
Nonparametric statistics
Math 218, Mathematical Statistics
D Joyce, Spring 2016

Order statistics. The minimum value among the variables $X_1, X_2, \ldots, X_n$ in a random sample $X$ is also called the first order statistic and denoted $X_{(1)}$, and the maximum value among them is also called the $n$th order statistic, denoted $X_{(n)}$. There are also intermediate order statistics which result from ordering the values $X_1, X_2, \ldots, X_n$ from smallest to largest

$$X_{(1)} \le X_{(2)} \le \cdots \le X_{(n-1)} \le X_{(n)}.$$

These order statistics form the basis for nonparametric statistical inference since some of their properties do not depend on the family of distributions that the random sample comes from. For parametric statistical inference, there is an assumption that the distribution comes from a certain family. In many cases, such an assumption can be justified since there's an underlying Bernoulli process, Poisson process, or the sample is large and the Central Limit Theorem can usually be invoked. But in other cases, it's not clear what form the distribution might take, and in that case nonparametric methods may be the best way to go.

The sample median. If $n$ is an odd number, $n = 2r - 1$, then there is a middle order statistic $X_{(r)}$. Its index is $r = (n + 1)/2$, and that order statistic is called the sample median. (When $n$ is even, there won't be a single middle order statistic, but it's not uncommon to average the two middle order statistics and call that the median.)

There's a limit theorem for the sample median, but we won't prove it. It says that for a continuous population distribution, the sample median is asymptotically normal with mean $\tilde\mu$, the median of the population distribution.

Quartiles and percentiles. The first sample quartile is the $r$th order statistic $X_{(r)}$ where $r = \frac{1}{4}n$, and the third sample quartile is the $r$th order statistic $X_{(r)}$ where $r = \frac{3}{4}n$. The $k$th sample percentile is the $r$th order statistic $X_{(r)}$ where $r = \frac{k}{100}n$. Thus, the 50th sample percentile is the sample median, the 25th sample percentile is the first sample quartile, and the 75th sample percentile is the third sample quartile.

Quartile and percentile statistics enjoy limit theorems analogous to that for the sample median.

The sign test. The sign test is used to test the hypothesis $H_0$ that the median $\tilde\mu$ of a population is a particular value $\tilde\mu_0$ versus the alternate hypothesis $H_1$ that $\tilde\mu > \tilde\mu_0$. You can imagine that if you take a sample and find a great many of the sample values lie above $\tilde\mu_0$, then you should reject $H_0$. The sign test codifies that intuition.

Let $p$ denote the unknown probability that $X_i > \tilde\mu_0$. Then the number of trials $S_+$ in the sample $X$ greater than $\tilde\mu_0$ follows a binomial distribution Binomial$(n, p)$, that is

$$P(S_+ \ge s_+) = \sum_{i=s_+}^{n} \binom{n}{i} p^i (1 - p)^{n-i}.$$

When $H_0$ holds, then $p = \frac{1}{2}$, and conversely. That allows us to compute the $P$-value for a sign test since it's just

$$P = \sum_{i=s_+}^{n} \binom{n}{i} \left(\frac{1}{2}\right)^{n}.$$

You can use table A.1 in the text to find the $P$-value for $n \le 20$. Reject $H_0$ at the $\alpha$-level if

$$s_+ \ge b_{n,\alpha}$$

where $s_+$ is the number of observations greater than $\tilde\mu_0$.
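The exact $P$-value for the upper-tailed sign test can also be computed directly instead of being read from a table. The following is a minimal sketch, not part of the original notes: the function names `count_above` and `sign_test_p_value` and the sample data are hypothetical, and the sample is assumed to contain no observations exactly equal to $\tilde\mu_0$.

```python
from math import comb


def count_above(sample, median0):
    """Return s_plus, the number of observations strictly greater than median0."""
    return sum(1 for x in sample if x > median0)


def sign_test_p_value(s_plus, n):
    """Upper-tail probability P(S+ >= s_plus) when S+ ~ Binomial(n, 1/2)."""
    # Sum the binomial coefficients from s_plus to n, then divide by 2^n.
    return sum(comb(n, i) for i in range(s_plus, n + 1)) / 2**n


# Hypothetical data (illustration only): test H0: median = 5 vs H1: median > 5.
sample = [6.1, 4.2, 7.3, 5.8, 9.0, 5.5, 3.9, 6.7, 8.2, 7.1]
s_plus = count_above(sample, 5.0)
p_value = sign_test_p_value(s_plus, len(sample))
print(s_plus, p_value)
```

Here 8 of the 10 hypothetical values exceed 5, so the $P$-value is $(45 + 10 + 1)/1024 \approx 0.0547$, just short of significance at the 5% level.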

For larger $n$ you can use the normal approximation with a continuity correction to the binomial distribution. Reject $H_0$ if

$$z = \frac{s_+ - n/2 - 1/2}{\sqrt{n/4}} \ge z_\alpha,$$

or equivalently if

$$s_+ \ge \tfrac{1}{2}\left(n + 1 + z_\alpha \sqrt{n}\right).$$
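The two forms of the large-sample rejection rule can be checked against each other numerically. This is a sketch under stated assumptions: the helper names `sign_test_z` and `critical_count` are hypothetical, and $z_\alpha$ is taken to be the upper $\alpha$ point of the standard normal distribution, obtained here from Python's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def sign_test_z(s_plus, n):
    """Continuity-corrected z statistic: (s_plus - n/2 - 1/2) / sqrt(n/4)."""
    return (s_plus - n / 2 - 0.5) / sqrt(n / 4)


def critical_count(n, alpha):
    """Threshold (n + 1 + z_alpha * sqrt(n)) / 2 from the equivalent form of the rule."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # assumed meaning of z_alpha
    return 0.5 * (n + 1 + z_alpha * sqrt(n))


# Illustration with n = 100 and alpha = 0.05 (values chosen for the example).
n, alpha = 100, 0.05
z_alpha = NormalDist().inv_cdf(1 - alpha)
threshold = critical_count(n, alpha)

# Both forms of the rule should give the same decision for every possible count.
decisions_agree = all(
    (sign_test_z(s, n) >= z_alpha) == (s >= threshold) for s in range(n + 1)
)
print(threshold, decisions_agree)
```

With these values the threshold falls between 58 and 59, so the test rejects $H_0$ once at least 59 of the 100 observations exceed $\tilde\mu_0$.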

The other one-sided sign test for $H_1$ being $\tilde\mu < \tilde\mu_0$ is similar but uses $s_-$ instead of $s_+$.

The two-sided sign test for $H_1$ being $\tilde\mu \ne \tilde\mu_0$ uses $s_{\max}$, which is the maximum of $s_+$ and $s_-$. The criteria for rejection are the same except $\alpha$ is replaced by $\alpha/2$.

The confidence intervals correspond to the hypothesis tests, as usual.
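The two-sided rule can be sketched in the same way. Two assumptions here are not stated in the notes: the rule "reject when $s_{\max}$ reaches the critical value at level $\alpha/2$" is implemented as "reject when the upper-tail probability of $s_{\max}$ is at most $\alpha/2$", and observations equal to $\tilde\mu_0$ are dropped before counting. The function names and the data are hypothetical.

```python
from math import comb


def upper_tail(s, n):
    """P(S >= s) when S ~ Binomial(n, 1/2)."""
    return sum(comb(n, i) for i in range(s, n + 1)) / 2**n


def two_sided_sign_test(sample, median0, alpha):
    """Return (s_max, tail probability, reject?) for H1: median != median0."""
    s_plus = sum(1 for x in sample if x > median0)
    s_minus = sum(1 for x in sample if x < median0)
    n = s_plus + s_minus  # assumption: ties with median0 are discarded
    s_max = max(s_plus, s_minus)
    tail = upper_tail(s_max, n)
    # Same criterion as the one-sided test, with alpha replaced by alpha / 2.
    return s_max, tail, tail <= alpha / 2


# Hypothetical data (illustration only).
sample = [6.1, 4.2, 7.3, 5.8, 9.0, 5.5, 3.9, 6.7, 8.2, 7.1]
result_05 = two_sided_sign_test(sample, 5.0, 0.05)
result_20 = two_sided_sign_test(sample, 5.0, 0.20)
print(result_05, result_20)
```

For this sample $s_{\max} = 8$ with tail probability $56/1024 \approx 0.0547$, which exceeds $0.025$ but not $0.10$, so the two-sided test rejects at $\alpha = 0.20$ and does not reject at $\alpha = 0.05$.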

Math 218 Home Page at http://math.clarku.edu/~djoyce/ma218/
