
BIOST 514/517
Biostatistics I / Applied Biostatistics I

Kathleen Kerr, Ph.D.
Associate Professor of Biostatistics
University of Washington

Lecture 16: Rank-based Tests
December 2-4, 2013

Lecture Topics
• Sign test
• Mann-Whitney U Test (Wilcoxon rank sum test)
• Wilcoxon signed-rank test
• Logrank test
• As with all tests, we need to look at what summary measure they are based on, and what assumptions they make

Sign Test
• The sign test is a test that the median of a population is equal to some specified value.
• It is most often used to test that the median difference in paired observations is zero
  – median difference, not difference in medians
• If the median of a population is m, then half the population is above m and half is below m. Recoding the variable to 1 if it is above m and 0 if it is below m gives n binary observations that will have mean ½ if m is the population median.
• The sign test compares the observed proportion with ½ using the binomial distribution.
• STATA provides signtest for testing zero median difference between paired observations.

Sign Test: Example
• Shoulder pain data: Is pain higher or lower at the earliest time (time 1) than at the latest time (time 6)?
• Each person contributes pain scores at time 1 and time 6. Does the difference between them tend to be positive or negative?

    signtest pain1=pain6

    Sign test

            sign |    observed    expected
    -------------+------------------------
        positive |          15         9.5
        negative |           4         9.5
            zero |          22          22
    -------------+------------------------
             all |          41          41

• The 22 zero differences are dropped, so the binomial calculations use n = 15 + 4 = 19.

One-sided tests:
  Ho: median of pain1 - pain6 = 0 vs.
  Ha: median of pain1 - pain6 > 0
    Pr(#positive >= 15) = Binomial(n = 19, x >= 15, p = 0.5) = 0.0096

  Ho: median of pain1 - pain6 = 0 vs.
  Ha: median of pain1 - pain6 < 0
    Pr(#negative >= 4) = Binomial(n = 19, x >= 4, p = 0.5) = 0.9978

Two-sided test:
  Ho: median of pain1 - pain6 = 0 vs.
  Ha: median of pain1 - pain6 != 0
    Pr(#positive >= 15 or #negative >= 15) = min(1, 2*Binomial(n = 19, x >= 15, p = 0.5)) = 0.0192

Mann-Whitney U test
• The Mann-Whitney U test compares two groups. Unlike the two-sample t-test, the Mann-Whitney U test is based on a bivariate summary rather than the difference in a univariate summary.
• Write Xi (i=1,…,n1) for the first group and Yj (j=1,…,n2) for the second group. The U test is based on the statistic
    U = Σ I(Xi > Yj)
  where the sum is over all possible pairs (i, j).
• In other words, the test is based on the proportion of times an X is greater than a Y.
• The null hypothesis is that the proportion is ½.

Wilcoxon rank-sum test
• Another way to compute the same test involves replacing the data by ranks and computing the sum of ranks in each group.
• This leads to a second name for the test, “Wilcoxon rank-sum test.”
• They are the same test. The only reason to distinguish between the two versions is if you are doing the test by hand and have tables of U or the rank sum.

Issues
• There are two problems with the Mann-Whitney test
  – It is not based on any univariate summary, so it doesn’t lead to confidence intervals for a difference in location. It is often incorrectly described as a test for equality of medians.
  – The p-value is usually computed not under the assumption that P(X>Y)=1/2, but under the much stronger assumption that X and Y have exactly the same distribution.

Considerations
• Because of its use of ranks, the test is more sensitive to shifts in all observations and less sensitive to large shifts in a small number of observations.
• If the distributions of X and Y were known to have exactly the same shape, and differ only in location, then the Mann-Whitney U test/Wilcoxon rank sum test would be very useful. But we never know this.
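
Returning to the sign test example on the shoulder pain data: the three p-values can be reproduced directly from the binomial distribution. The following is a minimal Python sketch (the lecture itself uses Stata); the helper name binom_upper_tail is hypothetical, and the counts are taken from the signtest output.

```python
from math import comb

def binom_upper_tail(n: int, k: int, p: float = 0.5) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p). Hypothetical helper name."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))

# Counts from the signtest output: 15 positive, 4 negative, 22 zero differences.
n_pos, n_neg = 15, 4
n = n_pos + n_neg  # zero differences are dropped, leaving n = 19

p_greater = binom_upper_tail(n, n_pos)  # Ha: median of pain1 - pain6 > 0
p_less = binom_upper_tail(n, n_neg)     # Ha: median of pain1 - pain6 < 0
p_two_sided = min(1.0, 2 * min(p_greater, p_less))

print(f"{p_greater:.4f} {p_less:.4f} {p_two_sided:.4f}")
# prints: 0.0096 0.9978 0.0192
```

The two-sided value is simply twice the smaller one-sided tail, capped at 1, which is the rule stated on the slide.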
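
The claim that the U test and the rank-sum test are the same test can be illustrated numerically: U equals the rank sum of the first group minus n1(n1+1)/2. The sketch below is in Python rather than Stata, uses made-up data, and assumes that ties count as ½ in U and receive average ranks (the slides do not discuss ties); the function names are hypothetical.

```python
def u_statistic(x, y):
    """Count pairs (Xi, Yj) with Xi > Yj; a tie counts 1/2 (an assumption)."""
    return sum((xi > yj) + 0.5 * (xi == yj) for xi in x for yj in y)

def midranks(values):
    """Rank each value in the pooled sample, averaging ranks across ties."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(values):
        first = ordered.index(v) + 1       # lowest 1-based position of v
        count = ordered.count(v)           # number of tied copies of v
        rank_of[v] = first + (count - 1) / 2
    return [rank_of[v] for v in values]

# Hypothetical data, chosen only to illustrate the identity (includes one tie).
x = [1.2, 3.4, 2.2, 5.0]
y = [0.8, 2.2, 1.9]

u = u_statistic(x, y)
ranks = midranks(x + y)
rank_sum_x = sum(ranks[: len(x)])
u_from_ranks = rank_sum_x - len(x) * (len(x) + 1) / 2

print(u, u_from_ranks)  # prints: 9.5 9.5

# Rank sum reported by Stata for the PBC example: 40356 for the 275
# no-edema subjects, compared with 37 edema subjects.
u_pbc = 40356 - 275 * 276 / 2
print(round(u_pbc / (275 * 37), 3))  # prints: 0.236
```

Dividing U by the number of pairs n1·n2 gives the estimated P(X>Y); applied to the rank sum in the PBC output, this reproduces the 0.236 that the porder option reports.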

Mann-Whitney U Test: Example
• PBC dataset: bilirubin by presence of edema

    ranksum bilirubin, by(edema) porder

    Two-sample Wilcoxon rank-sum (Mann-Whitney) test

           edema |      obs    rank sum    expected
    -------------+---------------------------------
               0 |      275       40356     43037.5
               1 |       37        8472      5790.5
    -------------+---------------------------------
        combined |      312       48828       48828

    unadjusted variance   265397.92
    adjustment for ties     -410.96
                         ----------
    adjusted variance     264986.96

    Ho: biliru~n(edema==0) = biliru~n(edema==1)
                 z =  -5.209
        Prob > |z| =   0.0000

    P{biliru~n(edema==0) > biliru~n(edema==1)} = 0.236

• The p-value is <0.0001.
• The porder option asks for P(X>Y) to be given. We see that bilirubin is higher in the no-edema group only 23.6% of the time.

Signed rank test
• The Wilcoxon signed-rank test adds to the naming confusion.
• It is a test for difference in location between paired samples.
• Like the previous test, it is not based on any univariate summary.
• Write ∆i = Xi - Yi.
• The Wilcoxon signed rank test statistic is based on the proportion of the pairwise sums ∆i + ∆j that are positive.
• It is a test for the median pairwise sum being zero.
• The p-value is usually computed under the additional assumption that the distribution of ∆i is symmetric about zero.
• The signed rank test is more sensitive to changes in extreme values than the sign test, and less sensitive to them than the paired t-test.
• In large samples it has much more restrictive assumptions than either the sign test or paired t-test.

logrank test
• The logrank test is a rank-based test that has been modified for survival data.
• The logrank test is much more useful than the other rank tests because there are fewer alternatives available in standard software for survival data.

STATA
• PBC dataset, survival by treatment group

    stset time, failure(censoring)
    sts test treatment

             failure _d:  censoring
       analysis time _t:  time

    Log-rank test for equality of survivor functions

              |   Events         Events
    treatment |  observed       expected
    ----------+-------------------------
            1 |        65          63.22
            2 |        60          61.78
    ----------+-------------------------
        Total |       125         125.00

                chi2(1) =       0.10
                Pr>chi2 =     0.7498

    . sts graph, by(treatment)

[Figure: Kaplan-Meier survival estimates by treatment group (treatment = 1, treatment = 2); x-axis: analysis time, 0 to 5000; y-axis: 0.00 to 1.00]

Self-consistency
• Sometimes when comparing two groups X and Y, all univariate summary statistics give the same answer: the mean, median, geometric mean, all the quantiles are higher for X than for Y.
• Sometimes they don’t. Then the decision of which is larger depends on the definition of “large.”
• In the ambiguous case, any test based on a univariate summary is (thankfully) consistent with itself.
  – If group A has higher mean cost than B, and group B has higher mean cost than C, you know that A has higher mean cost than C
• This property, which may seem trivial, does not hold for rank-based tests.

Self-consistency: Example
• Suppose there are three illnesses with similar symptoms. Suppose all three take 3 days to recover when untreated.
• There are also treatments A and B, which may be beneficial or harmful depending on which underlying illness the patient actually has.

    Days of illness under each option:

    Illness   Proportion   untreated   A   B
    1         40%          3           2   0
    2         20%          3           2   4
    3         40%          3           5   4

• If you can’t tell which illness a patient actually has, which is the best treatment option?
• Base the choice on mean days of illness.
  – B (mean 2.4) is better than untreated (mean 3), which is better than A (mean 3.2)
• Base the choice on median days of illness.
  – A (median 2) is better than untreated (median 3), which is better than B (median 4)
• Base the choice on maximum days of illness.
  – untreated (maximum 3) is better than B (maximum 4), which is better than A (maximum 5)
• The Wilcoxon rank-sum test leads in circles.
  – Choose A because it has the best chance of beating untreated (60% chance)
  – But then choose B because it has an 80% chance of beating A.
  – But untreated (no treatment) beats B 60% of the time.
• NOTE: In the previous example there was no uncertainty.
• If rank comparisons cannot help us make decisions when there is no uncertainty, why would we use them in the presence of uncertainty?
• The same sort of problem can happen with other rank tests.

Summary
• In large samples of uncensored data there is little need for rank-based tests.
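
The circular ranking in the treatment example can be checked by direct enumeration. This is a minimal Python sketch, not part of the original lecture; it assumes, as the slide percentages imply, that each pairwise comparison is made within a patient (same underlying illness under both options) and that fewer days of illness wins. The names options, p_beats and mean_days are hypothetical.

```python
# Rows of the example table:
# (percent of patients, days untreated, days on A, days on B)
illnesses = [
    (40, 3, 2, 0),
    (20, 3, 2, 4),
    (40, 3, 5, 4),
]
options = {"untreated": 1, "A": 2, "B": 3}  # column index of each option

def p_beats(first: str, second: str) -> int:
    """Percent of patients with strictly fewer days on `first` than on `second`."""
    i, j = options[first], options[second]
    return sum(row[0] for row in illnesses if row[i] < row[j])

def mean_days(name: str) -> float:
    """Mean days of illness under one option, weighted by illness proportion."""
    i = options[name]
    return sum(row[0] * row[i] for row in illnesses) / 100

# Pairwise "who usually wins" comparisons form a cycle.
print(p_beats("A", "untreated"))  # prints: 60
print(p_beats("B", "A"))          # prints: 80
print(p_beats("untreated", "B"))  # prints: 60

# The mean, a univariate summary, gives one consistent ordering.
print(sorted(options, key=mean_days))  # prints: ['B', 'untreated', 'A']
```

Each option loses a majority comparison to some other option, so pairwise winning probabilities produce no best choice, whereas sorting by the mean (or any other single univariate summary) always yields a transitive ordering.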