
6 Single Sample Methods for a Location Parameter

If there are serious departures from parametric test assumptions (e.g., normality or symmetry), nonparametric tests on a measure of location (usually the median) are used. Recall:

• M is a median of a random variable X if P(X ≤ M) = P(X ≥ M) = .5.

• The distribution of X is symmetric about c if P(X ≤ c − x) = P(X ≥ c + x) for all x.

• For symmetric continuous distributions, the median M = the mean µ. Thus, all conclusions about the median can also be applied to the mean.

• If X is a binomial random variable with parameters n and p (denoted X ∼ B(n, p)), then

    P(X = x) = C(n, x) p^x (1 − p)^(n−x)   for x = 0, 1, . . . , n

• where C(n, x) = n! / (x!(n − x)!) and k! = k(k − 1)(k − 2) ··· 2 · 1.

• Tables exist for the cdf P(X ≤ x) for various choices of n and p. The probabilities and cdf values are also easy to produce using SAS or R.
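The notes produce these probabilities with SAS and R; the same arithmetic can be sketched in any language. For instance, a minimal Python version using only the standard library:

```python
from math import comb

def binom_pmf(n, p, x):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(n, p, x):
    """P(X <= x) for X ~ B(n, p)."""
    return sum(binom_pmf(n, p, k) for k in range(x + 1))

# Reproduce two entries of the n = 13, p = .5 table shown later in this section:
print(binom_pmf(13, 0.5, 2))  # 0.009521484375
print(binom_cdf(13, 0.5, 2))  # 0.01123046875
```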

• Thus, if X ∼ B(n, .5), we have

    P(X = x) = C(n, x)(.5)^n

    P(X ≤ x) = Σ_{k=0}^{x} C(n, k)(.5)^n

    P(X ≤ x) = P(X ≥ n − x)   because the B(n, .5) distribution is symmetric.

• For sample sizes n > 20 and p = .5, a normal approximation (with continuity correction) to the binomial probabilities is often used instead of binomial tables.

    – Calculate z = ((x ± .5) − .5n) / (.5√n). Use x + .5 when x < .5n and use x − .5 when x > .5n.

    – The value of z is compared to N(0, 1), the standard normal distribution. For example:
      P(X ≤ x) ≈ P(Z ≤ z) and P(X ≥ x) ≈ P(Z ≥ z) = 1 − P(Z ≤ z).
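To see how close the continuity-corrected approximation gets, here is a small Python check (the n = 25, x = 8 values are chosen for illustration and are not from the notes):

```python
from math import comb, erf, sqrt

def phi(z):
    """Standard normal cdf P(Z <= z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def exact_cdf(n, x):
    """Exact P(X <= x) for X ~ B(n, .5)."""
    return sum(comb(n, k) for k in range(x + 1)) / 2**n

n, x = 25, 8                                  # illustrative values with p = .5
z = ((x + 0.5) - 0.5 * n) / (0.5 * sqrt(n))   # x < .5n, so use x + .5
print(z)                # -1.6
print(phi(z))           # approximate P(X <= 8), about 0.055
print(exact_cdf(n, x))  # exact P(X <= 8), about 0.054
```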

6.1 Ordinary Sign Test

Assumptions: Given a random sample of n independent observations

• The measurement scale is at least nominal.

• Observations can be classified into 2 nonoverlapping categories whose union exhausts all possibilities. The categories will be labeled + and −.

Hypotheses:

• The inference involves comparing the probabilities P(+) and P(−) for outcomes + and −.

(A) Two-sided: H0 : P(+) = P(−) vs H1 : P(+) ≠ P(−)
(B) Upper one-sided: H0 : P(+) ≥ P(−) vs H1 : P(+) < P(−)
(C) Lower one-sided: H0 : P(+) ≤ P(−) vs H1 : P(+) > P(−)

• Note: H0 is true only if P(+) = P(−) = .5

Method: For a given α

• Let T+ = the number of + observations.
• Let T− = the number of − observations.

• If H0 is true, then we would expect T+ and T− to be nearly equal (≈ n/2).

• In other words, if H0 is true, T+ and T− are binomial B(n, .5) random variables.

• For the alternative hypothesis:

(A) H1 : P(+) ≠ P(−). Let T = min(T+, T−). Then find the largest t such that the B(n, .5) probability P(X ≤ t) ≤ α/2.
(B) H1 : P(+) < P(−). Let T = T+. Then find the largest t such that the B(n, .5) probability P(X ≤ t) ≤ α.
(C) H1 : P(+) > P(−). Let T = T−. Then find the largest t such that the B(n, .5) probability P(X ≤ t) ≤ α.

Decision Rule

• For (A), (B), or (C), if T is too small, then we will reject H0. That is,

  If T ≤ t, Reject H0. If T > t, Fail to Reject H0.

Large Sample Approximation

1. For the one-sided H1, calculate

   z = (T+ + .5 − .5n) / (.5√n) for (B) if T+ < .5n     z = (T+ − .5 − .5n) / (.5√n) for (B) if T+ > .5n

   z = (T− + .5 − .5n) / (.5√n) for (C) if T− < .5n     z = (T− − .5 − .5n) / (.5√n) for (C) if T− > .5n

2. For the two-sided H1, take the smaller of the two z-values in (1.).

3. Find Φ(z) = P(Z ≤ z) from the standard normal distribution.

4. Reject H0 if (i) P(Z ≤ z) ≤ α for either one-sided test, or (ii) P(Z ≤ z) ≤ α/2 for the two-sided test.
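As a numerical sketch, the steps above can be applied to the training-program example that follows (T+ = 3 of n = 13 pairs; n is below the n > 20 guideline, so this only illustrates the arithmetic, in Python rather than the notes' R/SAS):

```python
from math import erf, sqrt

def sign_test_z(t, n):
    """Continuity-corrected z for a sign-test count t out of n."""
    if t < 0.5 * n:
        return (t + 0.5 - 0.5 * n) / (0.5 * sqrt(n))
    return (t - 0.5 - 0.5 * n) / (0.5 * sqrt(n))

def phi(z):
    """Standard normal cdf."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Training-program example below: T+ = 3 of n = 13 pairs favored on-the-job training.
z = sign_test_z(3, 13)
print(round(z, 3))       # -1.664
print(round(phi(z), 4))  # one-sided approximate p-value, near the exact .0461
```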

Example: (From Gibbons, Nonparametric Methods for Quantitative Analysis). An oil company is considering the following procedures for training prospective service station managers:

1. On-the-job training under actual working conditions for three months.

2. A company-run school training program concentrated over one month.

They plan to compare the two procedures in an experiment. The training program is not the only factor determining the success of a manager. Success is also affected by other factors such as age, intelligence, and previous experience. In order to eliminate the effects of these factors as much as possible, each trainee is "matched" with another trainee who has similar attributes (such as similar age and previous experience). If a good match does not exist for a trainee, then the trainee is not included in the experiment. Once pairs are determined, one member of each pair is randomly selected to receive the on-the-job training, while the other is assigned to the company school. After completing the assigned training program, the personnel manager assesses each trainee and judges which member of each pair has done a better job of managing the service station. In total, 13 pairs completed the training programs. The personnel manager stated that for 10 of the 13 pairs, the better manager received the company school training. Is there sufficient evidence to claim that the company-run school training program is more effective?

Table of Binomial Probabilities and Binomial CDF for n=13, p=.5

 n     p    x    f(x) = Pr(X=x)    F(x) = Pr(X<=x)
13   0.5    0    .0001220703       .0001220703
13   0.5    1    .0015869141       .0017089844
13   0.5    2    .0095214844       .0112304688   <-- largest x with F(x) < .025
13   0.5    3    .0349121094       .0461425781       (F(3) > .025)
13   0.5    4    .0872802734       .1334228516
13   0.5    5    .1571044922       .2905273437
13   0.5    6    .2094726563       .5000000000
13   0.5    7    .2094726562       .7094726563
13   0.5    8    .1571044922       .8665771484
13   0.5    9    .0872802734       .9538574219
13   0.5   10    .0349121094       .9887695313
13   0.5   11    .0095214844       .9982910156
13   0.5   12    .0015869141       .9998779297
13   0.5   13    .0001220703      1.000000000
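The exact p-values implied by this table can be checked directly; a short Python sketch for the example (10 of 13 pairs favored the company school):

```python
from math import comb

n = 13
# By symmetry of B(13, .5), P(X >= 10) = P(X <= 3).
p_upper = sum(comb(n, k) for k in range(4)) / 2**n
p_two = 2 * p_upper
print(round(p_upper, 4))  # 0.0461
print(round(p_two, 4))    # 0.0923
```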

6.2 Sign (Binomial) Test for Location

Assumptions: Given a random sample of n independent observations x1, x2, . . . , xn:

• The variable of interest is continuous, and the measurement scale is at least ordinal.

Hypotheses:

• The inference concerns a hypothesis about the median M of a single population. Given Mo, a hypothesized value of the median:

(A) Two-sided: H0 : M = Mo vs H1 : M ≠ Mo

(B) Upper one-sided: H0 : M = Mo vs H1 : M > Mo

(C) Lower one-sided: H0 : M = Mo vs H1 : M < Mo

Method: For a given α

• Let T+ = the number of observations > Mo.
• Let T− = the number of observations < Mo.

• Delete any xi = Mo and adjust the sample size n accordingly.

• If H0 is true, then T+ and T− are binomial B(n, .5) random variables. Thus, we would expect T+ and T− to be approximately equal (≈ n/2).

• For the alternative hypothesis:

(A) H1 : M ≠ Mo. Let T = min(T+, T−). Then find the largest t such that the B(n, .5) probability P(X ≤ t) ≤ α/2.
(B) H1 : M > Mo. Let T = T−. Then find the largest t such that the B(n, .5) probability P(X ≤ t) ≤ α.
(C) H1 : M < Mo. Let T = T+. Then find the largest t such that the B(n, .5) probability P(X ≤ t) ≤ α.

• Perform the Ordinary Sign Test based on T and t.

Decision Rule

• For (A), (B), or (C), if T is too small, then we will reject H0. That is,

  If T ≤ t, Reject H0. If T > t, Fail to Reject H0.

Large Sample Approximation −→ Same as for the Ordinary Sign Test.

Example 2.1 from Applied Nonparametric Statistics by W. Daniel.

In a study of heart disease, a researcher measured the blood's "transit time" in subjects with healthy right coronary arteries. The median transit time was 3.50 seconds. In another study, the researchers repeated the transit time study on a sample of 11 patients with significantly blocked right coronary arteries. The results (in seconds) were

1.80 3.30 5.65 2.25 2.50 3.50 2.75 3.25 3.10 2.70 3.00

1. Can these researchers conclude (using α = .05) that the median transit time in the pop- ulation of patients with significantly blocked right coronary arteries is different than 3.50 seconds?

2. Can these researchers conclude (α = .05) that the median transit time in the population of patients with significantly blocked right coronary arteries is less than 3.50 seconds?

Table of Binomial Probabilities and Binomial CDF for n=10, p=.5

n p x f(x) = Pr(X=x) F(x) = Pr(X<=x)

10   0.5    0    .0009765625       .0009765625
10   0.5    1    .0097656250       .0107421875   <-- largest x with F(x) < .025
10   0.5    2    .0439453125       .0546875000
10   0.5    3    .1171875000       .1718750000
10   0.5    4    .2050781250       .3769531250
10   0.5    5    .2460937500       .6230468750
10   0.5    6    .2050781250       .8281250000
10   0.5    7    .1171875000       .9453125000
10   0.5    8    .0439453125       .9892578125
10   0.5    9    .0097656250       .9990234375
10   0.5   10    .0009765625      1.000000000
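The whole calculation for Example 2.1 can be sketched in a few lines; this Python version (the notes use R and SAS below) drops the tie, counts T+, and reports the exact two-sided and lower one-sided p-values:

```python
from math import comb

times = [1.80, 3.30, 5.65, 2.25, 2.50, 3.50, 2.75, 3.25, 3.10, 2.70, 3.00]
M0 = 3.50
diffs = [t - M0 for t in times if t != M0]  # delete the one observation equal to M0
n = len(diffs)                              # adjusted sample size: 10
t_plus = sum(d > 0 for d in diffs)          # T+ = 1
T = min(t_plus, n - t_plus)                 # two-sided test statistic
p_two = 2 * sum(comb(n, k) for k in range(T + 1)) / 2**n
p_one = sum(comb(n, k) for k in range(t_plus + 1)) / 2**n  # for H1: M < 3.50
print(n, t_plus, round(p_two, 5), round(p_one, 5))
```

Both p-values fall below α = .05, so questions 1 and 2 both lead to rejecting H0 (compare the p-value 0.02148 from `binom.test` in the R output below).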

6.2.1 Special Case: Paired Data

Assumptions: Given a random sample of n independent pairs of observations

(x1, y1), (x2, y2),..., (xn, yn):

• Both variables X and Y are continuous, and the measurement scales are at least ordinal.

Testing Procedure:

• Calculate all differences Di = yi − xi for i = 1, . . . , n.

• Use the median difference MD in the hypotheses. Typically, MD = 0.

• Run the Sign Test based on the differences (the Di values).

Example 4.1 from Applied Nonparametric Statistics by W. Daniel.

Researchers studied the effects of “togetherness” on the heart rate in rats. They recorded the heart rates of 10 rats while they were alone and while in the presence of another rat. The results are shown below. Using an α = .05 significance level for the Sign Test, can we conclude that “togetherness” increases the heart rate in rats?

For these data, the ten Di values are

60  32  -1  79  26  28  30  -7  61  35
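A quick Python sketch of the exact calculation (the notes run this test in R and SAS below); since H1 says togetherness increases the heart rate, a large T+ supports H1 and the one-sided p-value is P(X ≥ T+):

```python
from math import comb

alone    = [463, 462, 462, 456, 450, 426, 418, 415, 409, 402]
together = [523, 494, 461, 535, 476, 454, 448, 408, 470, 437]
d = [t - a for t, a in zip(together, alone) if t != a]  # drop ties (none here)
n = len(d)                       # 10
t_plus = sum(x > 0 for x in d)   # 8
p_one = sum(comb(n, k) for k in range(t_plus, n + 1)) / 2**n  # P(X >= 8)
T = min(t_plus, n - t_plus)
p_two = 2 * sum(comb(n, k) for k in range(T + 1)) / 2**n
print(t_plus, round(p_one, 4), round(p_two, 4))  # 8 0.0547 0.1094
```

The one-sided p-value .0547 slightly exceeds α = .05, consistent with the fail-to-reject conclusion in the R and SAS output below.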


6.2.2 Sign Test Examples using R and SAS

R Output for Sign (Binomial) Test Examples

> # Sign Test Example from Gibbons
> binom.test(10,13)

        Exact binomial test

number of successes = 10, number of trials = 13, p-value = 0.09229   <-- Fail to reject Ho
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.4618685 0.9496189   <-- The CI contains .5, so we fail to reject Ho
sample estimates:
probability of success
             0.7692308

> # Sign (Binomial) Test for Location -- Daniel Ex. 2.1
> time <- c(1.80,3.30,5.65,2.25,2.50,3.50,2.75,3.25,3.10,2.70,3.00)
> time = time - 3.50
> time
 [1] -1.70 -0.20  2.15 -1.25 -1.00  0.00 -0.75 -0.25 -0.40 -0.80 -0.50

> ties = sum(time==0)
> ties
[1] 1
> binom.test(sum(time>0),length(time)-ties)

        Exact binomial test

number of successes = 1, number of trials = 10, p-value = 0.02148   <-- Reject Ho
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.002528579 0.445016117   <-- .5 is not in the CI, so reject Ho
sample estimates:
probability of success
                   0.1

> # Sign (Binomial) Test for Location -- Paired Data, Daniel Ex. 4.1
> alone <- c(463,462,462,456,450,426,418,415,409,402)
> together <- c(523,494,461,535,476,454,448,408,470,437)
> diff <- together - alone

> ties = sum(diff==0)
> ties
[1] 0
> binom.test(sum(diff>0),length(diff)-ties)

        Exact binomial test

number of successes = 8, number of trials = 10, p-value = 0.1094   <-- Fail to reject Ho
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.4439045 0.9747893
sample estimates:
probability of success
                   0.8

R Code for Sign (Binomial) Test Examples

# Sign Test Example from Gibbons
binom.test(10,13)

# Sign (Binomial) Test for Location -- Daniel Ex. 2.1
time <- c(1.80,3.30,5.65,2.25,2.50,3.50,2.75,3.25,3.10,2.70,3.00)
time = time - 3.50
time
ties = sum(time==0)
ties
binom.test(sum(time>0),length(time)-ties)

# Sign (Binomial) Test for Location -- Paired Data, Daniel Ex. 4.1
alone <- c(463,462,462,456,450,426,418,415,409,402)
together <- c(523,494,461,535,476,454,448,408,470,437)
diff <- together - alone
ties = sum(diff==0)
ties
binom.test(sum(diff>0),length(diff)-ties)

SAS Output for Sign (Binomial) Test Examples

In SAS, the Sign (Binomial) Test statistic is denoted M, and it represents the deviation of the observed count T+ from the expected count .5n when the null hypothesis is true.

Ordinary Sign Test for Training Program Example

The UNIVARIATE Procedure
Variable: level

Tests for Location: Mu0=0

Test           -Statistic-     -----p Value------

Student's t    t   -2.21359    Pr > |t|    0.0470
Sign           M   -3.5        Pr >= |M|   0.0923   <-- Fail to reject Ho
Signed Rank    S   -24.5       Pr >= |S|   0.0923

Sign (Binomial) Test for Example 2.1

Variable: diff

Basic Statistical Measures

    Location                Variability

Mean     -0.42727    Std Deviation    0.98903
Median   -0.50000    Variance         0.97818

Tests for Location: Mu0=0

Test           -Statistic-     -----p Value------

Student's t    t   -1.43282    Pr > |t|    0.1824
Sign           M   -4          Pr >= |M|   0.0215   <-- Reject Ho
Signed Rank    S   -17.5       Pr >= |S|   0.0840

Sign (Binomial) Test for Paired Differences -- Example 4.1

Obs alone together diff

 1   463   523    60
 2   462   494    32
 3   462   461    -1
 4   456   535    79
 5   450   476    26
 6   426   454    28
 7   418   448    30
 8   415   408    -7
 9   409   470    61
10   402   437    35

Variable: diff

Basic Statistical Measures

    Location                Variability

Mean     34.30000    Std Deviation     26.78329
Median   31.00000    Variance         717.34444

Tests for Location: Mu0=0

Test           -Statistic-     -----p Value------

Student's t    t   4.049769    Pr > |t|    0.0029
Sign           M   3           Pr >= |M|   0.1094   <-- Fail to reject Ho
Signed Rank    S   24.5        Pr >= |S|   0.0098

SAS Code for Sign (Binomial) Test Examples

DM 'LOG; CLEAR; OUT; CLEAR;';
OPTIONS NODATE NONUMBER LS=76 PS=54;

****************************************************************;
*** Ordinary Sign Test: Let -1,1 represent the 2 categories. ***;
*** The frq values are the category frequencies              ***;
****************************************************************;

DATA in;
  INPUT level frq @@;
LINES;
-1 10  1 3
;

DATA signtest (DROP=i);
  SET in;
  IF level = -1 THEN DO i = 1 TO frq; OUTPUT; END;
  IF level = 1 THEN DO i = 1 TO frq; OUTPUT; END;

PROC UNIVARIATE DATA=signtest;
  VAR level;
  TITLE 'Ordinary Sign Test for Training Program Example';

******************************************;
*** Sign (Binomial) Test for Location: ***;
***   Example 2.1 in course notes      ***;
******************************************;
DATA in2;
  med_time = 3.50;
  INPUT time @@;
  diff = time - med_time;
  OUTPUT;
LINES;
1.80 3.30 5.65 2.25 2.50 3.50 2.75 3.25 3.10 2.70 3.00
;

PROC UNIVARIATE DATA=in2;
  VAR diff;
  TITLE 'Sign (Binomial) Test for Example 2.1';
RUN;

****************************************************;
*** Sign (Binomial) Test for Paired Differences: ***;
***   Example 4.1 in course notes                ***;
****************************************************;
DATA in3;
  INPUT alone together @@;
  diff = together - alone;
  OUTPUT;
LINES;
463 523  462 494  462 461  456 535  450 476
426 454  418 448  415 408  409 470  402 437
;
PROC PRINT DATA=in3;
  TITLE 'Sign (Binomial) Test for Paired Differences -- Example 4.1';
PROC UNIVARIATE DATA=in3;
  VAR diff;
RUN;

6.3 Wilcoxon Signed Rank Test

Assumptions: Given a random sample of n independent observations X1,...,Xn:

• Each Xi was drawn from a symmetric and continuous population.

• Each Xi has the same median M for i = 1, . . . , n.

• The measurement scale is at least on the interval scale.

Hypotheses:

• The inference concerns a hypothesis about the median M of a single population. Given Mo, a hypothesized value of the median, we have:

(A) Two-sided: H0 : M = Mo vs H1 : M ≠ Mo

(B) Lower one-sided: H0 : M = Mo vs H1 : M < Mo

(C) Upper one-sided: H0 : M = Mo vs H1 : M > Mo

• Because of the symmetry assumption, we can replace the median M with the mean µ in the hypotheses.

Method: For a given α

• Calculate all differences Di = Xi − Mo. Remove all cases having Di = 0 and adjust the sample size n accordingly.

• Assign ranks 1, 2, . . . , n to the |Di|. For tied |Di| values, assign average ranks.

• If H0 is true, then the Di are symmetrically distributed about 0. That is, we expect

    Σ(ranks where Di > 0) ≈ Σ(ranks where Di < 0).

• Let T+ = Σ(Ri when Di > 0) and T− = Σ(Ri when Di < 0).

• Under H0, the sampling distributions of T+ and T− are symmetric about n(n + 1)/4 and can assume integer values from 0 to n(n + 1)/2. Note that T− = n(n + 1)/2 − T+.

• For the alternative hypothesis:

(A) H1 : M ≠ Mo, let T = min(T+, T−). Let w be the largest value from the Wilcoxon Signed Rank Test Table such that P(T ≤ w) ≤ α/2.
(B) H1 : M < Mo, let T = T+. Let w be the largest value from the Wilcoxon Signed Rank Test Table such that P(T ≤ w) ≤ α.
(C) H1 : M > Mo, let T = T−. Let w be the largest value from the Wilcoxon Signed Rank Test Table such that P(T ≤ w) ≤ α.

Decision Rule

• If T ≤ w, Reject H0. If T > w, Fail to Reject H0.
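The ranking bookkeeping above is easy to get wrong by hand; a small Python sketch computes T+ and T− with average ranks, checked here against the fish example that follows (R reports V = T+ = 8 for those data):

```python
def signed_rank(data, m0):
    """Return (T+, T-) with average ranks for tied |Di|, Di = xi - m0, zeros dropped."""
    d = [x - m0 for x in data if x != m0]
    n = len(d)
    abs_sorted = sorted(abs(x) for x in d)
    def avg_rank(v):
        positions = [i + 1 for i, a in enumerate(abs_sorted) if a == v]
        return sum(positions) / len(positions)
    t_plus = sum(avg_rank(abs(x)) for x in d if x > 0)
    return t_plus, n * (n + 1) / 2 - t_plus

fish = [2.11, 2.22, 2.23, 2.41, 2.54, 2.73, 2.80, 2.80, 2.92, 3.06, 3.12, 3.12]
print(signed_rank(fish, 3.0))  # (8.0, 70.0)
```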

Large Sample Approximation (with continuity correction) (n > 30)

• Computer packages like R and SAS calculate approximate p-values for the two-sided alternative H1 based on large sample normal distribution approximations. The normalizing formula is:

    T* = (T − n(n + 1)/4) / √(n(n + 1)(2n + 1)/24) = (T − E(T)) / √(Var(T)).

  Daniel (Applied Nonparametric Statistics, page 42) describes an adjustment to this formula in the event of ties.
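Plugging the fish example below (T = T+ = 8, n = 12) into this formula shows where the software p-values come from; a Python sketch (n = 12 is well below the n > 30 guideline, so this is purely illustrative):

```python
from math import erf, sqrt

def t_star(T, n):
    """Normalized signed rank statistic (no tie adjustment)."""
    return (T - n * (n + 1) / 4) / sqrt(n * (n + 1) * (2 * n + 1) / 24)

def phi(z):
    """Standard normal cdf."""
    return 0.5 * (1 + erf(z / sqrt(2)))

z = t_star(8, 12)            # (8 - 39)/sqrt(162.5)
print(round(z, 3))           # -2.432
print(round(2 * phi(z), 4))  # two-sided approximate p-value, about 0.015
```

R's reported p-value of 0.01664 differs slightly because, as its output says, R also applies a continuity correction.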

Example of Wilcoxon Signed Rank Test

A random sample of 12 fish was taken and the body weights recorded. Test the null hypothesis H0 : µ = 3.0 against the alternative H1 : µ < 3.0 pounds. (Assume they are sampled from the same symmetric distribution.)

2.11 2.22 2.23 2.41 2.54 2.73 2.80 2.80 2.92 3.06 3.12 3.12

R code for Wilcoxon Signed Rank Test with Confidence Interval

# Wilcoxon Signed Rank Test with Confidence Interval for Fish Data
fish <- c(2.11,2.22,2.23,2.41,2.54,2.73,2.80,2.80,2.92,3.06,3.12,3.12)
wilcox.test(fish,mu=3,conf.int=TRUE)

R output for Wilcoxon Signed Rank Test with Confidence Interval

> # Wilcoxon Signed Rank Test with Confidence Interval for Fish Data
> fish <- c(2.11,2.22,2.23,2.41,2.54,2.73,2.80,2.80,2.92,3.06,3.12,3.12)
> wilcox.test(fish,mu=3,conf.int=TRUE)

        Wilcoxon signed rank test with continuity correction

data:  fish
V = 8, p-value = 0.01664
alternative hypothesis: true location is not equal to 3
95 percent confidence interval:
 2.420005 2.930018
sample estimates:
(pseudo)median
      2.670069

SAS code and selected output:

SIGN TEST AND WILCOXON SIGNED RANK TEST

The UNIVARIATE Procedure

Basic Statistical Measures

    Location                Variability

Mean     -0.32833    Std Deviation          0.36321
Median   -0.23500    Variance               0.13192
Mode     -0.20000    Range                  1.01000
                     Interquartile Range    0.67000

Tests for Location: Mu0=0

Test           -Statistic-     -----p Value------

Student's t    t   -3.13143    Pr > |t|    0.0096
Sign           M   -3          Pr >= |M|   0.1460
Signed Rank    S   -31         Pr >= |S|   0.0112   <-- p-value

• The Signed Rank statistic S in SAS = T+ − n(n + 1)/4 = 8 − (12)(13)/4 = 8 − 39 = −31.

• The p-value is based on the normal approximation and is for a two-sided alternative. Thus, for the one-sided H1 : M < 3.0, the approximate p-value = .0112/2 = .0056.

OPTIONS LS=72 PS=60 NONUMBER NODATE;
DATA IN;
  INPUT X @@;
  X = X - 3;
CARDS;
2.11 2.22 2.23 2.41 2.54 2.73 2.80 2.80 2.92 3.06 3.12 3.12
;
PROC UNIVARIATE DATA=IN;
  VAR X;
  TITLE 'ONE SAMPLE TESTS FOR LOCATION:';
  TITLE2 'SIGN TEST AND WILCOXON SIGNED RANK TEST';
RUN;

6.3.1 Reference Distribution for the Signed Rank Test (n = 5)

If H0 is true, then any random Di has probability 1/2 of being > Mo and probability 1/2 of being < Mo. Without loss of generality, let D1, D2, D3, D4, D5 be ordered from smallest to largest. Then when n = 5, every possible sign assignment to D1, D2, D3, D4, D5 has a (1/2)^5 = 1/32 chance of occurring.

 T+    Possible Ranks with Di > 0      Probability    Cumulative Probability
  0    None                            1/32            1/32 = .03125
  1    1                               1/32            2/32 = .06250
  2    2                               1/32            3/32 = .09375
  3    3 or 1,2                        2/32            5/32 = .15625
  4    4 or 1,3                        2/32            7/32 = .21875
  5    5 or 1,4 or 2,3                 3/32           10/32 = .31250
  6    1,5 or 2,4 or 1,2,3             3/32           13/32 = .40625
  7    2,5 or 3,4 or 1,2,4             3/32           16/32 = .50000
  8    3,5 or 1,2,5 or 1,3,4           3/32           19/32 = .59375
  9    4,5 or 1,3,5 or 2,3,4           3/32           22/32 = .68750
 10    1,4,5 or 2,3,5 or 1,2,3,4       3/32           25/32 = .78125
 11    2,4,5 or 1,2,3,5                2/32           27/32 = .84375
 12    3,4,5 or 1,2,4,5                2/32           29/32 = .90625
 13    1,3,4,5                         1/32           30/32 = .93750
 14    2,3,4,5                         1/32           31/32 = .96875
 15    1,2,3,4,5                       1/32           32/32 = 1
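This table can be generated by brute force; a short Python enumeration over all 2^5 = 32 sign assignments:

```python
from itertools import combinations

ranks = [1, 2, 3, 4, 5]
counts = {}  # counts[t] = number of subsets of ranks with positive sign summing to t
for r in range(len(ranks) + 1):
    for subset in combinations(ranks, r):
        t = sum(subset)
        counts[t] = counts.get(t, 0) + 1

assert sum(counts.values()) == 32
print(counts[7])                                # 3 (ranks 2,5 or 3,4 or 1,2,4)
print(sum(counts.get(t, 0) for t in range(4)))  # 5, so P(T+ <= 3) = 5/32
```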

6.3.2 Special Case: Paired Data

Assumptions: Given a random sample of n pairs of observations (x1, y1), (x2, y2), . . . , (xn, yn). Let Di = yi − xi for i = 1, . . . , n.

• The Di's are independent.

• The measurement scale is at least on the interval scale.

• The distribution of the differences Di = yi − xi for i = 1, . . . , n is symmetric.

Testing Procedure:

• Calculate all differences Di = yi − xi for i = 1, . . . , n.

• Use the median difference MD in the hypotheses. Typically, MD = 0.

• Because of the symmetry assumption, we can replace the median MD with the mean µD in the hypotheses.

• Run the Wilcoxon Signed Rank Test based on the Di.

Example of Wilcoxon Signed Rank Test for Paired Data Two judges were asked to independently rate the rehabilitative potential for each of 22 male prison inmates. The following table contains the ratings:

Inmate (i)   Judge 1   Judge 2    Di   |Di|   Sign     Ri
    1           6         5        1     1      +     6.0
    2          12        11        1     1      +     6.0
    3           3         4       -1     1      -     6.0
    4           9        10       -1     1      -     6.0
    5           5         2        3     3      +    18.0
    6           8         6        2     2      +    13.5
    7           1         2       -1     1      -     6.0
    8          12         9        3     3      +    18.0
    9           6         5        1     1      +     6.0
   10           7         4        3     3      +    18.0
   11           6         6        0     .      .      .    <-- remove tie
   12           9         8        1     1      +     6.0
   13          10         8        2     2      +    13.5
   14           6         7       -1     1      -     6.0
   15          12         9        3     3      +    18.0
   16           4         3        1     1      +     6.0
   17           5         5        0     .      .      .    <-- remove tie
   18           6         4        2     2      +    13.5
   19          11         8        3     3      +    18.0
   20           5         3        2     2      +    13.5
   21          10         9        1     1      +     6.0
   22          10        11       -1     1      -     6.0
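The hand-computed ranks can be verified programmatically; a Python sketch that reproduces T+ and SAS's Signed Rank statistic S = T+ − n(n + 1)/4 for these data:

```python
judge1 = [6, 12, 3, 9, 5, 8, 1, 12, 6, 7, 6, 9, 10, 6, 12, 4, 5, 6, 11, 5, 10, 10]
judge2 = [5, 11, 4, 10, 2, 6, 2, 9, 5, 4, 6, 8, 8, 7, 9, 3, 5, 4, 8, 3, 9, 11]
d = [a - b for a, b in zip(judge1, judge2) if a != b]  # drop the two ties
n = len(d)                                             # 20
abs_sorted = sorted(abs(x) for x in d)

def avg_rank(v):
    """Average of the 1-based positions that value v occupies in the sorted |Di|."""
    positions = [i + 1 for i, a in enumerate(abs_sorted) if a == v]
    return sum(positions) / len(positions)

t_plus = sum(avg_rank(abs(x)) for x in d if x > 0)
S = t_plus - n * (n + 1) / 4   # SAS's Signed Rank statistic
print(t_plus, S)               # 180.0 75.0
```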

SAS code and output:

DATA IN;
  DO INMATE=1 TO 22;
    INPUT JUDGE1 JUDGE2 @@;
    DIFF = JUDGE1 - JUDGE2;
    OUTPUT;
  END;
CARDS;
6 5  12 11  3 4  9 10  5 2  8 6  1 2  12 9  6 5  7 4  6 6
9 8  10 8  6 7  12 9  4 3  5 5  6 4  11 8  5 3  10 9  10 11
;
PROC UNIVARIATE DATA=IN;
  VAR DIFF;
  TITLE 'WILCOXON SIGNED RANK TEST FOR PAIRED DATA';
RUN;
==========================================================
WILCOXON SIGNED RANK TEST FOR PAIRED DATA

The UNIVARIATE Procedure
Variable: DIFF

Tests for Location: Mu0=0

Test           -Statistic-     -----p Value------

Student's t    t   3.464102    Pr > |t|    0.0023
Sign           M   5           Pr >= |M|   0.0414
Signed Rank    S   75          Pr >= |S|   0.0031   <-- Reject Ho

The approximate p-value is .0031. Thus, we would reject the null hypothesis Ho : MD = 0.

6.3.3 Confidence Interval for the Median Based on the Wilcoxon Signed Rank Test

• To find the point estimate for the median M:

    – Calculate all paired averages uij = (xi + xj)/2, allowing i = j.
      There are C(n, 2) + n = C(n + 1, 2) such averages.

    – Arrange the uij in increasing order.

    – The point estimate for M is M̂ = the median of the {uij}.

Method: For an approximate confidence level 100(1 − α)%:

• Use the Wilcoxon Signed Rank Test Table to find the largest t such that P(T ≤ t) ≤ α/2.

• Let ML = the (t + 1)st uij observation from the beginning, and MU = the (t + 1)st uij observation from the end of the set of ordered uij values.

• Statistically, P(ML ≤ M ≤ MU) = P(t + 1 ≤ T ≤ n(n + 1)/2 − t), where T is the Wilcoxon signed rank statistic.

• The approximate 100(1 − α)% confidence interval is (ML, MU).

• The exact confidence level for (ML, MU) is determined by the distribution given in the Wilcoxon Signed Rank Test Table. That is, if p = P(X ≤ t), then (ML, MU) is an exact 100(1 − 2p)% confidence interval for M.

• Note: You do not need to calculate all of the uij values, but only the (t + 1) largest and smallest.

• This procedure is also known as the Hodges-Lehmann estimate of shift.

Example of Hodges-Lehmann confidence interval for M

A random sample of 12 fish was taken and the body weights were recorded.

2.11 2.22 2.23 2.41 2.54 2.73 2.80 2.80 2.92 3.06 3.12 3.12

Calculate an approximate 95% confidence interval for the median body weight M.

• For n = 12, the largest value in the Wilcoxon Signed Rank Test Table with P(T ≤ t) ≤ .025 is t = 13. Note: P(T ≤ 13) = .0212.

• Then t + 1 = 14 and n(n + 1)/2 − t = 78 − 13 = 65. Now find the 14th and 65th values in the list of the 78 uij values.

HODGES-LEHMANN CONFIDENCE INTERVAL

Obs    x1     x2      u                      Obs    x1     x2      u
  1   2.11   2.11   2.110                     40   2.54   2.80   2.670
  2   2.11   2.22   2.165                     41   2.54   2.80   2.670
  3   2.11   2.23   2.170                     42   2.23   3.12   2.675
  4   2.22   2.22   2.220                     43   2.23   3.12   2.675
  5   2.22   2.23   2.225                     44   2.54   2.92   2.730
  6   2.23   2.23   2.230                     45   2.73   2.73   2.730
  7   2.11   2.41   2.260                     46   2.41   3.06   2.735
  8   2.22   2.41   2.315                     47   2.73   2.80   2.765
  9   2.23   2.41   2.320                     48   2.73   2.80   2.765
 10   2.11   2.54   2.325                     49   2.41   3.12   2.765
 11   2.22   2.54   2.380                     50   2.41   3.12   2.765
 12   2.23   2.54   2.385                     51   2.54   3.06   2.800
 13   2.41   2.41   2.410                     52   2.80   2.80   2.800
 14   2.11   2.73   2.420 <-- lower endpoint  53   2.80   2.80   2.800
 15   2.11   2.80   2.455                     54   2.80   2.80   2.800
 16   2.11   2.80   2.455                     55   2.73   2.92   2.825
 17   2.22   2.73   2.475                     56   2.54   3.12   2.830
 18   2.41   2.54   2.475                     57   2.54   3.12   2.830
 19   2.23   2.73   2.480                     58   2.80   2.92   2.860
 20   2.22   2.80   2.510                     59   2.80   2.92   2.860
 21   2.22   2.80   2.510                     60   2.73   3.06   2.895
 22   2.11   2.92   2.515                     61   2.92   2.92   2.920
 23   2.23   2.80   2.515                     62   2.73   3.12   2.925
 24   2.23   2.80   2.515                     63   2.73   3.12   2.925
 25   2.54   2.54   2.540                     64   2.80   3.06   2.930
 26   2.22   2.92   2.570                     65   2.80   3.06   2.930 <-- upper endpoint
 27   2.41   2.73   2.570                     66   2.80   3.12   2.960
 28   2.23   2.92   2.575                     67   2.80   3.12   2.960
 29   2.11   3.06   2.585                     68   2.80   3.12   2.960
 30   2.41   2.80   2.605                     69   2.80   3.12   2.960
 31   2.41   2.80   2.605                     70   2.92   3.06   2.990
 32   2.11   3.12   2.615                     71   2.92   3.12   3.020
 33   2.11   3.12   2.615                     72   2.92   3.12   3.020
 34   2.54   2.73   2.635                     73   3.06   3.06   3.060
 35   2.22   3.06   2.640                     74   3.06   3.12   3.090
 36   2.23   3.06   2.645                     75   3.06   3.12   3.090
 37   2.41   2.92   2.665                     76   3.12   3.12   3.120
 38   2.22   3.12   2.670                     77   3.12   3.12   3.120
 39   2.22   3.12   2.670                     78   3.12   3.12   3.120

Thus, the confidence interval is (2.42, 2.93).
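The whole table can be generated in a few lines; a Python sketch of the Hodges-Lehmann interval and point estimate for the fish data:

```python
fish = [2.11, 2.22, 2.23, 2.41, 2.54, 2.73, 2.80, 2.80, 2.92, 3.06, 3.12, 3.12]
n = len(fish)
# All pairwise (Walsh) averages (x_i + x_j)/2 with i <= j: C(12,2) + 12 = 78 values
u = sorted((fish[i] + fish[j]) / 2 for i in range(n) for j in range(i, n))
assert len(u) == 78

t = 13                       # largest t with P(T <= t) <= .025 for n = 12
lo, hi = u[t], u[-(t + 1)]   # 14th value from each end (0-based index 13)
est = (u[38] + u[39]) / 2    # median of the 78 ordered averages
print(round(lo, 3), round(hi, 3), round(est, 3))  # 2.42 2.93 2.67
```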

R code for Wilcoxon Signed Rank Test with Confidence Interval

# Wilcoxon Signed Rank Test with Confidence Interval for Fish Data
fish <- c(2.11,2.22,2.23,2.41,2.54,2.73,2.80,2.80,2.92,3.06,3.12,3.12)
wilcox.test(fish,mu=3,conf.int=TRUE)

R output for Wilcoxon Signed Rank Test with Confidence Interval

        Wilcoxon signed rank test with continuity correction

data:  fish
V = 8, p-value = 0.01664
alternative hypothesis: true location is not equal to 3
95 percent confidence interval:
 2.420005 2.930018   <-- Approximate 95% confidence interval for M is (2.42, 2.93)
sample estimates:
(pseudo)median
      2.670069

6.4 Asymptotic Relative Efficiency (A.R.E.)

• One way to compare properties of statistical tests is to compare their efficiency. The definition of efficiency can vary but, generally speaking, it is used to compare the sample size required by one test with that required by another test under similar conditions.

• Suppose that two tests may be used to test a particular H0 against a particular H1, and both tests have the same specified α and β errors. These tests are therefore comparable under conditions related to the level of significance α and power (1 − β).

• Thus, the test requiring the smaller sample size to satisfy these conditions will have the smaller cost and effort. That is, the test with the smaller required sample size is more efficient than the other test, and its relative efficiency is greater than one.

• Let T1 and T2 represent two tests of the same H0 against the same H1 with the same specified α and β values. For example, T1 is the Sign Test and T2 is the Wilcoxon Signed Rank Test, used to test H0 : µ = µ0 with α = .05 and power 1 − β = .90.

• The relative efficiency of test T1 with respect to test T2 is the ratio n2/n1, where n1 is the sample size required for T1 to equal the power of test T2 with sample size n2 (assuming the same H0 and significance level α).

• Thus, there is a relative efficiency of T1 with respect to T2 for each choice of α and n2. A more general measure of efficiency (asymptotic relative efficiency) was developed.

• Consider the situation of letting the sample size n1 increase for T1 with specified α and β. Then there exists a sequence of n2 values such that, for each value of n1 (n1 → ∞), T2 has the same α and β values.

• In other words, there is a sequence of relative efficiency values n2/n1. If n2/n1 approaches a constant value as n1 → ∞, and if that constant is the same for all choices of α and β, then the constant is called the asymptotic relative efficiency of T1 with respect to T2.

• Note that if the A.R.E. exists for T1 and T2, then the limiting A.R.E. value is independent of the choice of α and β.

• To select a test with superior power, we generally select the test with the greatest A.R.E., because the power depends on many factors, such as the maximum number of observations that can be collected given experimental or sampling resources and the type of distribution that generates the data (normal? Weibull? gamma? ...), which is usually unknown.

• The A.R.E. is, in general, difficult to calculate. In this course, we will only consider A.R.E. results for various pairs of tests and for several choices of distributions.

• Note that the A.R.E. assumes that an infinite sample size can be taken. Thus, a natural question arises: How good is a measure assuming an infinitely large sample when most practical situations involve relatively small sample sizes? In an attempt to answer this question, studies of exact relative efficiency values for very small samples have shown that the A.R.E. provides a good approximation to the relative efficiency in many situations of practical interest.

6.4.1 A.R.E. Comparison for Three Single-Sample Tests of Location

• We will compare the t-test, Sign test, and Wilcoxon Signed Rank test using the A.R.E. values. To do this we will consider three situations involving symmetric distributions. Under symmetry assumptions, H0 and H1 are identical for all three tests.

(I) The sample was randomly sampled from a normal distribution having density function

      f(x; µ, σ) = (1 / (σ√(2π))) exp( −(x − µ)² / (2σ²) )   for −∞ < x < ∞

    Without loss of generality, we can assume it is a standard normal N(0, 1) having density function

      φ(x) = (1 / √(2π)) exp( −x²/2 )   for −∞ < x < ∞

(II) The sample was randomly sampled from a uniform distribution having density function:

      f(x; a, b) = 1 / (b − a)   for a < x < b
                 = 0             otherwise

    Without loss of generality, we can assume a uniform U(0, 1) having density function

      f(x) = 1   for 0 < x < 1
           = 0   otherwise

    The uniform distribution is considered a light-tailed symmetric distribution.

(III) The sample was randomly sampled from a double exponential (DE) distribution having density function

      f(x; a, b) = (1 / (2b)) exp( −|x − a| / b )   for −∞ < x < ∞

    Without loss of generality, we can assume DE(0, 1) having density function

      f(x) = (1/2) exp( −|x| )   for −∞ < x < ∞

    The DE distribution is a heavy-tailed symmetric distribution.

Table of A.R.E. Values

Test                          (I) Normal       (II) Uniform     (III) Double Exponential
Comparison                    Distribution     Distribution     Distribution

Sign test                     2/π ≈ 0.637      1/3 ≈ 0.333      4/3 ≈ 1.333
vs t-test                     (t)              (t)              (Sign)

Wilcoxon Signed Rank test     3/2 = 1.500      3.000            3/4 = 0.750
vs Sign test                  (Wlcxn)          (Wlcxn)          (Sign)

Wilcoxon Signed Rank test     3/π ≈ .955       1.000            3/2 = 1.500
vs t-test                     (t)              (=)              (Wlcxn)

The test named in parentheses under each entry is the more efficient of the pair.

6.5 Introduction to the One-Sample Randomization Test

Paired Data Example: An experimental drug was tested on 7 subjects. Blood level measurements were taken before (X) and after (Y) administering the drug. In this situation, we have paired data. The differences (Di = Yi − Xi) in blood level measurements for the subjects were:

Patient i          1      2      3      4      5      6      7     ΣDi       D̄
Difference Di   -.187   .011  -.250   .034  -.137  -.112  -.023   -.664   ≈ -.0949

• The goal is to test the hypothesis that there is no change in blood level measurement after taking the drug. Statistically, we will assume that the distribution of the difference in blood level measurements under the null hypothesis H0 is symmetric about 0.

• Thus, H0 : µD = 0. We will consider two possible alternatives:

  (1) H1 : µD ≠ 0   and   (2) H1 : µD < 0.

If H0 is true (and assuming symmetry about 0), the signs (+ or ) of the 7 measurements • can be considered random. For example, we could have just as likely− observed

Patient i         1      2      3      4      5      6      7      ∑Di       D̄
Difference Di   .187  −.011   .250   .034   .137  −.112   .023    .508    ≈ .0726

or

Patient i         1      2      3      4      5      6      7      ∑Di       D̄
Difference Di   .187   .011   .250  −.034  −.137   .112   .023    .412    ≈ .0589

or any other "randomization" of the signs.

The Randomization Reference Distribution

1. Consider all possible sign assignments or "randomizations" of signs for the seven differences.

2. Calculate ∑Di for each randomization. In this example, there are 2^7 = 128 different randomizations of the seven signs. This yields the "randomization distribution" of ∑Di.

• In terms of testing, it is statistically equivalent to use the randomization distribution of the mean D̄.

3. Now compare the OBSERVED ∑Di = −.664 to the randomization distribution to find the probability (p-value) associated with the test of H0 : µD = 0 against the alternative hypothesis H1.

• Case 1: For alternative H1 : µD ≠ 0, from the randomization reference distribution we see the p-value

      = P(|∑Di| ≥ |−.664|) = P(∑Di ≥ .664) + P(∑Di ≤ −.664) = (6 + 6)/128 = .09375.

• Case 2: For alternative H1 : µD < 0, from the randomization reference distribution we see the p-value = P(∑Di ≤ −.664) = 6/128 = .046875.

[SAS output: RANDOMIZATION TEST #1 — a listing of all 128 sign randomizations (ID1–ID7) sorted by D_SUM, with CDF = rank/128. The sums range from −.754 to .754. The observed sum D_SUM = −.664 is the 6th smallest value (CDF = .04688), and its mirror image D_SUM = .664 is the 123rd (CDF = .96094).]

SAS Code to Generate the Randomization Distribution for the Change in Blood Level Measurements

DM 'LOG;CLEAR;OUT;CLEAR;';
OPTIONS LS=72 PS=68 NONUMBER NODATE;
DATA IN;
  INPUT D1-D7 @@;
CARDS;
-.187 .011 -.250 .034 -.137 -.112 -.023
DATA IN; SET IN;
  DO I1=-1 TO 1 BY 2; ID1=I1*D1;
   DO I2=-1 TO 1 BY 2; ID2=I2*D2;
    DO I3=-1 TO 1 BY 2; ID3=I3*D3;
     DO I4=-1 TO 1 BY 2; ID4=I4*D4;
      DO I5=-1 TO 1 BY 2; ID5=I5*D5;
       DO I6=-1 TO 1 BY 2; ID6=I6*D6;
        DO I7=-1 TO 1 BY 2; ID7=I7*D7;
         D_SUM = SUM(OF ID1-ID7);
         OUTPUT;
  END; END; END; END; END; END; END;
  KEEP ID1-ID7 D_SUM;
PROC SORT DATA=IN; BY D_SUM;
DATA IN; SET IN; CDF=_N_/128;
PROC PRINT DATA=IN; TITLE 'RANDOMIZATION TEST #1';
RUN;

6.5.1 Randomization Test for Paired Data

Assumptions: Given a random sample of n pairs of observations (x1, y1), (x2, y2),..., (xn, yn).

• The differences Di = yi − xi are independent.

• The distribution of each Di is symmetric and has the same mean.

• The measurement scale for the Di's is at least interval.

Hypotheses:

• The inference concerns hypotheses about whether or not the mean difference µD = 0:

(A) Two-sided: H0 : µD = 0 vs H1 : µD ≠ 0

(B) Lower one-sided: H0 : µD = 0 vs H1 : µD < 0

(C) Upper one-sided: H0 : µD = 0 vs H1 : µD > 0

• Because of the symmetry assumption, we can replace the mean difference µD with the median difference MD in the hypotheses.

Method: For a given α

• Calculate the sum ∑Di for each of the 2^n possible sign randomizations.

• Order these values to form the randomization distribution for ∑Di.

Decision Rule

• For (A) H1 : µD ≠ 0,

  – If ACTUAL ∑Di < 0, let D = ∑Di. If ACTUAL ∑Di > 0, let D = −∑Di.

        p-value = 2 × (Number of ∑Di's ≤ D) / 2^n

• For (B) H1 : µD < 0, let D = ∑Di.

        p-value = (Number of ∑Di's ≤ D) / 2^n

• For (C) H1 : µD > 0, let D = ∑Di.

        p-value = (Number of ∑Di's ≥ D) / 2^n = (Number of ∑Di's ≤ −D) / 2^n

• Reject H0 if p-value ≤ α. Otherwise, we fail to reject H0.
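The decision rules above can be sketched in code. The following Python snippet (an illustrative sketch; the notes themselves use SAS and R) enumerates all 2^n sign randomizations and applies rules (A), (B), and (C) to the paired-data example:

```python
from itertools import product

def randomization_pvalue(D, alternative="two-sided"):
    """Exact sign-randomization p-value for H0: mu_D = 0 (illustrative sketch)."""
    # Build the full randomization distribution of sum(D_i): all 2^n sign patterns.
    sums = [sum(s * d for s, d in zip(signs, D))
            for signs in product([-1, 1], repeat=len(D))]
    obs, N, eps = sum(D), len(sums), 1e-12
    if alternative == "less":                      # (B) H1: mu_D < 0
        return sum(t <= obs + eps for t in sums) / N
    if alternative == "greater":                   # (C) H1: mu_D > 0
        return sum(t >= obs - eps for t in sums) / N
    # (A) two-sided: double the tail count on the side of the observed sum
    if obs < 0:
        return 2 * sum(t <= obs + eps for t in sums) / N
    return 2 * sum(t >= obs - eps for t in sums) / N

# Blood-level differences from the paired-data example
D = [-0.187, 0.011, -0.250, 0.034, -0.137, -0.112, -0.023]
print(randomization_pvalue(D, "less"))        # 6/128  = 0.046875
print(randomization_pvalue(D, "two-sided"))   # 12/128 = 0.09375
```

The small tolerance `eps` guards against floating-point noise when the observed sum ties with randomization sums, so ties are counted in the tail exactly as in the tabled randomization distribution.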

• Without loss of generality, you can replace ∑Di with D̄ in the preceding arguments.

• Note that the number of sign randomizations (2^n) forming the randomization distribution grows rapidly. For example, when n = 20, there are over 1 million randomizations. In such cases, it is generally not feasible to generate the entire randomization distribution.

• To handle this problem, a large number of randomizations of the signs are randomly taken. Then, an approximate randomization distribution is generated from this large subset of possible randomizations.

• Approximate p-values can then be determined from this distribution. This is known as the Monte Carlo approach to generating approximate p-values.
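A minimal sketch of this Monte Carlo idea (in Python for illustration; the notes use R below) draws random sign vectors instead of enumerating all 2^n of them:

```python
import random

# Monte Carlo approximation of the lower one-sided randomization p-value
# for the paired-data example (exact value: 6/128 = .046875).
D = [-0.187, 0.011, -0.250, 0.034, -0.137, -0.112, -0.023]
obs = sum(D)                      # observed sum of differences, -0.664
reps = 50_000
random.seed(1)                    # fixed seed so the sketch is reproducible

lower = 0
for _ in range(reps):
    t = sum(random.choice((-1, 1)) * d for d in D)   # one random sign assignment
    if t <= obs + 1e-12:
        lower += 1

p_lower = lower / reps            # should land close to .046875
print(p_lower)
```

With 50,000 replications the standard error of the estimate is about √(.047 × .953 / 50000) ≈ .001, so the approximation is typically good to two or three decimal places.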

The following R code will generate p-values using the Monte Carlo approach for the single-sample Randomization Test for Location.

R Code for Randomization Test on Paired Data (Differences)

# Single Sample Randomization Test for Location
# Enter the number of randomizations to take
Prep = 50000
Prep
# Enter vector of differences
D <- c(-.187,.011,-.250,.034,-.137,-.112,-.023)
D
# Calculate the mean difference
meanD <- mean(D)
meanD
sgnD <- sign(meanD)
upper <- 0
lower <- 0
n = length(D)
# Begin sign randomizations
meanpermD <- 1:Prep
for (i in 1:Prep){
  sgnvec <- sign(runif(n)-.5)   # random vector with 1 or -1 values
  permvec <- sgnvec*D
  # Calculate the mean difference for the i_th randomization vector
  meanpermD[i] <- mean(permvec)
  if(meanpermD[i]>=meanD) upper = upper+1
  if(meanpermD[i]<=meanD) lower = lower+1
}
# Calculate p-values:
# for lower one-sided Ho
pval_lower <- lower/Prep
pval_lower
# for upper one-sided Ho
pval_upper <- upper/Prep
pval_upper
# for two-sided Ho
if(sgnD < 0) pval_two_sided = 2*pval_lower
if(sgnD > 0) pval_two_sided = 2*pval_upper
pval_two_sided
hist(meanpermD)

R Output for Randomization Test on Paired Data

> meanD
[1] -0.09485714
> # Calculate p-values:
> # for lower one-sided Ho
[1] 0.04644
> # for upper one-sided Ho
[1] 0.96076
> # for two-sided Ho
[1] 0.09288

Note that the p-values from the Monte Carlo approach (.04644, .96076, .09288) approximate the exact p-values of (.046875, .960938, .09375) from the true randomization distribution.

• Without loss of generality, I used D̄ instead of ∑Di in my R code.

[Histogram of the 50,000 values of meanpermD generated using the Monte Carlo approach]

6.5.2 Single Sample Randomization Test for H0 : µ = µ0

• Suppose we want to perform a randomization test if our inference concerns hypotheses about whether or not µ = µ0 for some specified value µ0, against one of three alternatives:

(A) Two-sided: H0 : µ = µ0 vs H1 : µ ≠ µ0

(B) Lower one-sided: H0 : µ = µ0 vs H1 : µ < µ0

(C) Upper one-sided: H0 : µ = µ0 vs H1 : µ > µ0

• To perform a randomization test, simply subtract µ0 from each observation, and then run the randomization test as you would for paired data.

Example: A random sample of 12 fish was taken and the body weights recorded. Test the null hypothesis H0 : µ = 3.0 against the alternative H1 : µ < 3.0 pounds.

2.11 2.22 2.23 2.41 2.54 2.73 2.80 2.80 2.92 3.06 3.12 3.12

Based on the randomization test, the p-value is approximately .00474. Therefore, we would reject H0 : µ = 3.0 and conclude that µ < 3.0 pounds.
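As a cross-check on the Monte Carlo result, the full randomization distribution is still small enough here to enumerate (2^12 = 4096 sign patterns). A Python sketch (illustrative; not part of the original notes):

```python
from itertools import product

fish = [2.11, 2.22, 2.23, 2.41, 2.54, 2.73, 2.80, 2.80, 2.92, 3.06, 3.12, 3.12]
D = [x - 3.0 for x in fish]       # subtract the hypothesized mean mu_0 = 3.0

# Enumerate all 2^12 = 4096 sign randomizations of the differences
sums = [sum(s * d for s, d in zip(signs, D))
        for signs in product([-1, 1], repeat=len(D))]
obs = sum(D)

# Exact lower one-sided p-value for H1: mu < 3.0
p_lower = sum(t <= obs + 1e-12 for t in sums) / len(sums)
print(p_lower)                    # close to the Monte Carlo value .00474
```

The exact value agrees with the Monte Carlo approximation to about three decimal places, as expected with 50,000 replications.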

R Code for Randomization Test for Fish Weight Data

# Single Sample Randomization Test for Location
# Enter the number of randomizations to take
Prep = 50000
Prep
fish <- c(2.11,2.22,2.23,2.41,2.54,2.73,2.80,2.80,2.92,3.06,3.12,3.12)
# Enter the hypothesized mean
mu0 = 3                       # <-- enter mu_0
# Enter vector of differences
D <- fish - mu0               # <-- subtract mu_0 from the data
D
# Calculate the mean difference
meanD <- mean(D)
meanD
sgnD <- sign(meanD)
upper <- 0
lower <- 0
n = length(D)
# Begin sign randomizations
meanpermD <- 1:Prep
for (i in 1:Prep){
  sgnvec <- sign(runif(n)-.5)   # random vector with 1 or -1 values
  permvec <- sgnvec*D
  # Calculate the mean difference for the i_th randomization vector
  meanpermD[i] <- mean(permvec)
  if(meanpermD[i]>=meanD) upper = upper+1
  if(meanpermD[i]<=meanD) lower = lower+1
}
# Calculate p-values:
# for lower one-sided Ho
pval_lower <- lower/Prep
pval_lower
# for upper one-sided H0
pval_upper <- upper/Prep
pval_upper
# for two-sided Ho
if(sgnD < 0) pval_two_sided = 2*pval_lower
if(sgnD > 0) pval_two_sided = 2*pval_upper
pval_two_sided
hist(meanpermD)

R Output for Randomization Test on Fish Weight Data

> # Enter vector of differences
[1] -0.89 -0.78 -0.77 -0.59 -0.46 -0.27 -0.20 -0.20 -0.08  0.06  0.12  0.12

> # Calculate the mean difference
[1] -0.3283333

> # Calculate p-values:
> # for lower one-sided Ho
[1] 0.00474
> # for upper one-sided H0
[1] 0.99564
> # for two-sided Ho
[1] 0.00948

[Histogram of the 50,000 values of meanpermD generated using the Monte Carlo approach]
