6 Single Sample Methods for a Location Parameter
If there are serious departures from parametric test assumptions (e.g., normality or symmetry), nonparametric tests on a measure of central tendency (usually the median) are used.

Recall:
• $M$ is a median of a random variable $X$ if $P(X \le M) = P(X \ge M) = .5$.
• The distribution of $X$ is symmetric about $c$ if $P(X \le c - x) = P(X \ge c + x)$ for all $x$.
• For symmetric continuous distributions, the median $M$ equals the mean $\mu$. Thus, all conclusions about the median can also be applied to the mean.
• If $X$ is a binomial random variable with parameters $n$ and $p$ (denoted $X \sim B(n, p)$), then
  $$P(X = x) = \binom{n}{x} p^x (1-p)^{n-x} \quad \text{for } x = 0, 1, \ldots, n,$$
  where $\binom{n}{x} = \dfrac{n!}{x!\,(n-x)!}$ and $k! = k(k-1)(k-2)\cdots 2 \cdot 1$.
• Tables exist for the cdf $P(X \le x)$ for various choices of $n$ and $p$. The probabilities and cdf values are also easy to produce using SAS or R.
• Thus, if $X \sim B(n, .5)$, we have
  $$P(X = x) = \binom{n}{x} (.5)^n, \qquad P(X \le x) = \sum_{k=0}^{x} \binom{n}{k} (.5)^n,$$
  and $P(X \le x) = P(X \ge n - x)$ because the $B(n, .5)$ distribution is symmetric.
• For sample sizes $n > 20$ and $p = .5$, a normal approximation (with continuity correction) to the binomial probabilities is often used instead of binomial tables.
  – Calculate
    $$z = \frac{(x \pm .5) - .5n}{.5\sqrt{n}}.$$
    Use $x + .5$ when $x < .5n$ and use $x - .5$ when $x > .5n$.
  – The value of $z$ is compared to $N(0, 1)$, the standard normal distribution. For example: $P(X \le x) \approx P(Z \le z)$ and $P(X \ge x) \approx P(Z \ge z) = 1 - P(Z \le z)$.

6.1 Ordinary Sign Test

Assumptions: Given a random sample of $n$ independent observations:
• The measurement scale is at least nominal.
• Observations can be classified into 2 nonoverlapping categories whose union exhausts all possibilities. The categories will be labeled + and −.

Hypotheses: The inference involves comparing the probabilities $P(+)$ and $P(-)$ for outcomes + and −.
(A) Two-sided: $H_0: P(+) = P(-)$ vs $H_1: P(+) \ne P(-)$
(B) Upper one-sided: $H_0: P(+) \ge P(-)$ vs $H_1: P(+) < P(-)$
(C) Lower one-sided: $H_0: P(+) \le P(-)$ vs $H_1: P(+) > P(-)$
• Note: $H_0$ is true only if $P(+) = P(-) = .5$.

Method: For a given $\alpha$:
• Let $T_+$ = the number of + observations. Let $T_-$ = the number of − observations.
• If $H_0$ is true, then we would expect $T_+$ and $T_-$ to be nearly equal ($\approx n/2$).
• In other words, if $H_0$ is true, $T_+$ and $T_-$ are binomial $B(n, .5)$ random variables.
• For alternative hypothesis
  (A) $H_1: P(+) \ne P(-)$. Let $T = \min(T_+, T_-)$. Then find the largest $t$ such that the $B(n, .5)$ probability $P(X \le t) \le \alpha/2$.
  (B) $H_1: P(+) < P(-)$. Let $T = T_+$. Then find the largest $t$ such that the $B(n, .5)$ probability $P(X \le t) \le \alpha$.
  (C) $H_1: P(+) > P(-)$. Let $T = T_-$. Then find the largest $t$ such that the $B(n, .5)$ probability $P(X \le t) \le \alpha$.

Decision Rule:
• For (A), (B), or (C), if $T$ is too small, then we will reject $H_0$. That is,
  If $T \le t$, reject $H_0$. If $T > t$, fail to reject $H_0$.

Large Sample Approximation
1. For the one-sided $H_1$, calculate
   $$z = \frac{T_+ + .5 - .5n}{.5\sqrt{n}} \text{ for (B) if } T_+ < .5n, \qquad z = \frac{T_+ - .5 - .5n}{.5\sqrt{n}} \text{ for (B) if } T_+ > .5n,$$
   $$z = \frac{T_- + .5 - .5n}{.5\sqrt{n}} \text{ for (C) if } T_- < .5n, \qquad z = \frac{T_- - .5 - .5n}{.5\sqrt{n}} \text{ for (C) if } T_- > .5n.$$
2. For the two-sided $H_1$, take the smaller of the two $z$-values in (1.).
3. Find $\Phi(z) = P(Z \le z)$ from the standard normal distribution.
4. Reject $H_0$ if (i) $P(Z \le z) \le \alpha$ for either 1-sided test or (ii) $P(Z \le z) \le \alpha/2$ for the 2-sided test.

Example: (From Gibbons, Nonparametric Methods for Quantitative Analysis). An oil company is considering the following procedures for training prospective service station managers:
1. On-the-job training under actual working conditions for three months.
2. A company-run school training program concentrated over one month.
They plan to compare the two procedures in an experiment. No training program can be the only determining factor for the success of a manager.
Success is also affected by other factors such as age, intelligence, and previous experience. In order to eliminate the effects of these factors as much as possible, each trainee is "matched" with another trainee that has similar attributes (such as similar age and previous experience). If a good match does not exist for a trainee, then the trainee is not included in the experiment. Once pairs are determined, one member of each pair is randomly selected to receive the on-the-job training, while the other is assigned to the company school. After completing the assigned training program, the personnel manager assesses each trainee and judges which member of each pair has done a better job of managing the service station.

In total, 13 pairs had completed the training programs. The personnel manager stated that for 10 of the 13 pairs, the better manager received the company school training. Is there sufficient evidence to claim that the company-run school training program is more effective?

Table of Binomial Probabilities and Binomial CDF for n=13, p=.5

 n    p     x       f(x) = Pr(X=x)   F(x) = Pr(X<=x)
13   0.5    0       .0001220703      .0001220703
13   0.5    1       .0015869141      .0017089844
13   0.5    2 <--   .0095214844      .0112304688   <.025 <--
13   0.5    3       .0349121094      .0461425781   >.025
13   0.5    4       .0872802734      .1334228516
13   0.5    5       .1571044922      .2905273437
13   0.5    6       .2094726563      .5000000000
13   0.5    7       .2094726562      .7094726563
13   0.5    8       .1571044922      .8665771484
13   0.5    9       .0872802734      .9538574219
13   0.5   10       .0349121094      .9887695313
13   0.5   11       .0095214844      .9982910156
13   0.5   12       .0015869141      .9998779297
13   0.5   13       .0001220703      1.000000000

6.2 Sign (Binomial) Test for Location

Assumptions: Given a random sample of $n$ independent observations $x_1, x_2, \ldots, x_n$:
• The variable of interest is continuous, and the measurement scale is at least ordinal.

Hypotheses: The inference concerns a hypothesis about the median $M$ of a single population.
(A) Two-sided: $H_0: M = M_o$ vs $H_1: M \ne M_o$
(B) Upper one-sided: $H_0: M = M_o$ vs $H_1: M > M_o$
(C) Lower one-sided: $H_0: M = M_o$ vs $H_1: M < M_o$

Method: For a given $\alpha$:
• Let $T_+$ = the number of observations $> M_o$. Let $T_-$ = the number of observations $< M_o$.
• Delete any $x_i = M_o$ and adjust the sample size $n$ accordingly.
• If $H_0$ is true, then $T_+$ and $T_-$ are binomial $B(n, .5)$ random variables. Thus, we would expect $T_+$ and $T_-$ to be approximately equal ($\approx n/2$).
• For alternative hypothesis
  (A) $H_1: M \ne M_o$: Let $T = \min(T_+, T_-)$. Then find the largest $t$ such that the $B(n, .5)$ probability $P(X \le t) \le \alpha/2$.
  (B) $H_1: M > M_o$: Let $T = T_-$. Then find the largest $t$ such that the $B(n, .5)$ probability $P(X \le t) \le \alpha$.
  (C) $H_1: M < M_o$: Let $T = T_+$. Then find the largest $t$ such that the $B(n, .5)$ probability $P(X \le t) \le \alpha$.
• Perform the Ordinary Sign Test based on $T$ and $t$.

Decision Rule:
• For (A), (B), or (C), if $T$ is too small, then we will reject $H_0$. That is,
  If $T \le t$, reject $H_0$. If $T > t$, fail to reject $H_0$.

Large Sample Approximation: Same as for the Ordinary Sign Test.

Example 2.1 from Applied Nonparametric Statistics by W. Daniel. In a study of heart disease, a researcher measured the blood's "transit time" in subjects with healthy right coronary arteries. The median transit time was 3.50 seconds. In another study, the researchers repeated the transit time study but on a sample of 11 patients with significantly blocked right coronary arteries. The results (in seconds) were

1.80 3.30 5.65 2.25 2.50 3.50 2.75 3.25 3.10 2.70 3.00

1. Can these researchers conclude (using $\alpha = .05$) that the median transit time in the population of patients with significantly blocked right coronary arteries is different from 3.50 seconds?
2. Can these researchers conclude ($\alpha = .05$) that the median transit time in the population of patients with significantly blocked right coronary arteries is less than 3.50 seconds?
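The sign test procedure for location can be sketched in a few lines of Python, applied here to the transit time data. This is only an illustrative sketch, not code from the notes: the function names `binom_cdf_half`, `critical_value`, and `sign_test_location`, and the labels `"two-sided"`, `"greater"`, and `"less"` for alternatives (A), (B), and (C), are hypothetical choices. The exact $B(n, .5)$ cdf is computed from binomial coefficients instead of being read from a table.

```python
from math import comb


def binom_cdf_half(t, n):
    """Exact P(X <= t) for X ~ B(n, 0.5)."""
    if t < 0:
        return 0.0
    return sum(comb(n, k) for k in range(min(t, n) + 1)) / 2 ** n


def critical_value(n, level):
    """Largest t with P(X <= t) <= level; returns -1 if no such t exists."""
    t = -1
    while t + 1 <= n and binom_cdf_half(t + 1, n) <= level:
        t += 1
    return t


def sign_test_location(data, m0, alternative, alpha=0.05):
    """Sign test for the median (hypothetical helper following the steps above)."""
    # Delete observations equal to the hypothesized median and adjust n.
    # (Exact equality is fine here because the data are typed as literals.)
    kept = [x for x in data if x != m0]
    n = len(kept)
    t_plus = sum(x > m0 for x in kept)   # number of observations above m0
    t_minus = n - t_plus                 # number of observations below m0

    if alternative == "two-sided":       # (A) H1: M != m0
        T, level = min(t_plus, t_minus), alpha / 2
    elif alternative == "greater":       # (B) H1: M > m0
        T, level = t_minus, alpha
    elif alternative == "less":          # (C) H1: M < m0
        T, level = t_plus, alpha
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")

    t = critical_value(n, level)
    return {"n": n, "T_plus": t_plus, "T_minus": t_minus,
            "T": T, "t": t, "reject": T <= t}


transit = [1.80, 3.30, 5.65, 2.25, 2.50, 3.50, 2.75, 3.25, 3.10, 2.70, 3.00]
two_sided = sign_test_location(transit, 3.50, "two-sided")
lower = sign_test_location(transit, 3.50, "less")
print(two_sided)
print(lower)
```

For these data the value 3.50 is deleted, leaving $n = 10$ with $T_+ = 1$ and $T_- = 9$. Both calls give $T = 1$ and $t = 1$, so the decision rule $T \le t$ rejects $H_0$ at $\alpha = .05$ for both questions.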
Table of Binomial Probabilities and Binomial CDF for n=10, p=.5

 n    p     x       f(x) = Pr(X=x)   F(x) = Pr(X<=x)
10   0.5    0       .0009765625      .0009765625
10   0.5    1 <--   .0097656250      .0107421875   <-- <.025
10   0.5    2       .0439453125      .0546875000
10   0.5    3       .1171875000      .1718750000
10   0.5    4       .2050781250      .3769531250
10   0.5    5       .2460937500      .6230468750
10   0.5    6       .2050781250      .8281250000
10   0.5    7       .1171875000      .9453125000
10   0.5    8       .0439453125      .9892578125
10   0.5    9       .0097656250      .9990234375
10   0.5   10       .0009765625      1.000000000

6.2.1 Special Case: Paired Data

Assumptions: Given a random sample of $n$ independent pairs of observations $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$:
• Both variables $X$ and $Y$ are continuous, and the measurement scales are at least ordinal.

Testing Procedure:
• Calculate all differences $D_i = y_i - x_i$ for $i = 1, \ldots, n$.
• Use the median difference $M_D$ in the hypotheses. Typically, $M_D = 0$.
• Run the Sign Test based on the differences (the $D_i$ values).

Example 4.1 from Applied Nonparametric Statistics by W.