
Nonparametric methods

Waraphon Phimpraphai DVM, PhD
Department of Veterinary Public Health

Measurement

¤ What are the 4 levels of measurement discussed?
– 1. Nominal or Classificatory Scale
  • Gender, ethnic background
– 2. Ordinal or Ranking Scale
  • Hardness of rocks, beauty, military ranks
– 3. Interval Scale
  • Temperature in Celsius or Fahrenheit
– 4. Ratio Scale
  • Speed, height, mass or weight

Parametric Assumptions

¤ The observations must be independent
¤ The observations must be drawn from normally distributed populations
¤ These populations must have the same variances

Introduction

¤ The theory upon which the two-sample t-test is based requires that the two sampled populations be normal and have equal variances.
¤ Many other common statistical procedures have similar assumptions.
¤ A large body of statistical methods is available that comprises procedures not requiring the estimation of the population variance or mean and not stating hypotheses about parameters. These testing procedures are termed “non-parametric tests”.
¤ Non-parametric tests may be applied in any situation where we would be justified in employing a parametric test, such as the two-sample t-test, as well as in instances where the assumptions of the latter are untenable.
¤ If both the parametric and the non-parametric approach are applicable (that is, the parametric assumptions hold), the parametric test will generally be more powerful than the non-parametric one.

Why use a non-parametric test?

¤ Very small samples (< 20 replicates): there is a high probability of violating the assumption of normality, which leads to spurious Type I (false alarm) errors
¤ Outliers more often lead to spurious Type I errors in parametric statistics
¤ Non-parametric statistics reduce the data to ordinal ranks, which reduces the impact or leverage of outliers

Error

¤ Type I error: False alarm for a bogus effect
– Reject the null hypothesis when it is really true
¤ Type II error: Miss a real effect
– Fail to reject the null hypothesis when it is really false
¤ Type III error ;)
– Laziness, incompetence, or willful ignorance of the truth

Nonparametric Assumptions

¤ Observations are independent
¤ Variable under study has underlying continuity

Nonparametric Methods

¤ There is at least one nonparametric equivalent for each general type of parametric test
¤ These tests fall into several categories
– Tests of differences between groups (independent samples)
– Tests of differences between variables (dependent samples)
– Tests of relationships between variables

• Wilcoxon Signed-Rank Test
• Mann-Whitney-Wilcoxon Test
• Kruskal-Wallis Test

Adapted from JOHN S. LOUCKS St. Edward’s University

Sign Test

• A common application of the sign test involves using a sample of n potential customers to identify a preference for one of two brands of a product.
• The objective is to determine whether there is a difference in preference between the two items being compared.
• To record the preference data, we use a plus sign if the individual prefers one brand and a minus sign if the individual prefers the other brand.

• Because the data are recorded as plus and minus signs, this test is called the sign test.

Example: Hand Cream Test

Sign Test: Large-Sample Case
• As part of a market research study, a sample of 36 consumers was asked to try two brands of hand cream and indicate a preference.
• Do the data shown below indicate a significant difference in the consumer preferences for the two brands?
• 18 preferred L’Occitane (+ sign recorded)
• 12 preferred Bath & Body (– sign recorded)
• 6 had no preference

The 6 consumers with no preference are discarded, so the analysis is based on a sample size of 18 + 12 = 30.

Hypotheses

H0: No preference for one brand over the other exists

Ha: A preference for one brand over the other exists

Rejection Rule
Using 0.05 level of significance, reject H0 if z < -1.96 or z > 1.96

Test Statistic
Under H0 each of the n = 30 consumers is equally likely to record a + or a – sign, so the number of + signs has mean 0.5(30) = 15 and standard deviation √(0.25 × 30) = 2.74.
z = (18 - 15)/2.74 = 3/2.74 = 1.095

Conclusion
Do not reject H0. There is insufficient evidence in the sample to conclude that a difference in preference exists for the two brands of hand cream. Fewer than 10 or more than 20 individuals would have to prefer a particular brand for us to reject H0.
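The large-sample sign test calculation for the hand cream example can be sketched in a few lines of Python. This is only an illustration: the function names `sign_test_z` and `two_sided_p` are made up for this sketch, and the p-value uses the normal approximation rather than exact binomial probabilities.

```python
import math

def sign_test_z(n_plus, n_minus):
    """z statistic for the large-sample sign test (ties already discarded)."""
    n = n_plus + n_minus
    mean = 0.5 * n                 # expected number of + signs under H0
    sd = math.sqrt(0.25 * n)       # standard deviation under H0
    return (n_plus - mean) / sd

def two_sided_p(z):
    """Two-sided p-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hand cream data: 18 plus signs, 12 minus signs, 6 ties dropped
z = sign_test_z(18, 12)
p = two_sided_p(z)
print(f"z = {z:.3f}, two-sided p = {p:.3f}")
print("reject H0" if abs(z) > 1.96 else "do not reject H0")
```

The z value agrees with the 1.095 computed by hand, and the decision is the same: do not reject H0.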

Wilcoxon Signed-Rank Test

• The methodology of the parametric matched-sample analysis requires:
– interval data, and
– the assumption that the population of differences between the pairs of observations is normally distributed
• If the assumption of normally distributed differences is not appropriate, the Wilcoxon signed-rank test can be used.

Preliminary Steps of the Test
• Compute the differences between the paired observations
• Discard any differences of zero
• Rank the absolute value of the differences from lowest to highest
• Tied differences are assigned the average ranking of their positions
• Give the ranks the sign of the original difference in the data
• Sum the signed ranks
. . . next determine whether the sum is significantly different from zero

Example: Express Deliveries

• A large animal hospital has decided to select one of two express delivery services.
• To test the delivery times of the two services, the vet sends two reports to a sample of 10 district animal clinics, with one report carried by one service and the other report carried by the second service.
• Do the data (delivery times in hours) indicate a difference in the two services?

District clinic   Overnight (hrs)   NiteFlite (hrs)
Seattle           32                25
Los Angeles       30                24
Boston            19                15
Cleveland         16                15
New York          15                13
Houston           18                15
Atlanta           14                15
St. Louis         10                8
Milwaukee         7                 9
Denver            16                11

District clinic   Difference   Rank of |Diff.|   Signed rank
Seattle            7           10                +10
Los Angeles        6           9                 +9
Boston             4           7                 +7
Cleveland          1           1.5               +1.5
New York           2           4                 +4
Houston            3           6                 +6
Atlanta           -1           1.5               -1.5
St. Louis          2           4                 +4
Milwaukee         -2           4                 -4
Denver             5           8                 +8

Sum of signed ranks: T = +44

Hypotheses

H0: The delivery times of the two services are the same; neither offers faster service than the other

Ha: Delivery times differ between the two services; recommend the one with the smaller times

Rejection Rule
Using 0.05 level of significance, reject H0 if z < -1.96 or z > 1.96

Test Statistic
Under H0 the sum of signed ranks T has mean μ_T = 0 and standard deviation σ_T = √[n(n+1)(2n+1)/6] = √[(10)(11)(21)/6] = 19.62.
z = (T - μ_T)/σ_T = (44 - 0)/19.62 = 2.24

Conclusion
Reject H0. There is sufficient evidence in the sample to conclude that a difference exists in the delivery times provided by the two services. Recommend using the NiteFlite service, which has the smaller delivery times.
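As a check on the hand calculation, the signed-rank steps for the express delivery data can be sketched in Python. The helper name `signed_rank_sum` is an assumption made for this sketch, and the normal approximation is the same one used in the example.

```python
import math

# Delivery times in hours for the 10 district clinics
overnight = [32, 30, 19, 16, 15, 18, 14, 10, 7, 16]
niteflite = [25, 24, 15, 15, 13, 15, 15, 8, 9, 11]

def signed_rank_sum(x, y):
    """Return (T, n): the sum of signed ranks and the number of non-zero differences."""
    diffs = [a - b for a, b in zip(x, y) if a != b]   # discard zero differences
    ordered = sorted(abs(d) for d in diffs)           # absolute values, lowest first

    def average_rank(value):
        first = ordered.index(value) + 1              # rank of the first tied position
        count = ordered.count(value)                  # number of tied values
        return first + (count - 1) / 2                # average of the tied positions

    # Give each rank the sign of its original difference and add them up
    t = sum(math.copysign(average_rank(abs(d)), d) for d in diffs)
    return t, len(diffs)

t, n = signed_rank_sum(overnight, niteflite)
sigma_t = math.sqrt(n * (n + 1) * (2 * n + 1) / 6)
z = t / sigma_t                                       # mean of T under H0 is 0
print(f"T = {t}, sigma_T = {sigma_t:.2f}, z = {z:.2f}")
```

With these data the sketch gives T = 44 and z of about 2.24, matching the values in the example.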

Kruskal-Wallis Test

• The Mann-Whitney-Wilcoxon (MWW) test can be used to test whether two populations are identical
• The MWW test has been extended by Kruskal and Wallis for cases of three or more populations
• The Kruskal-Wallis test can be used with ordinal, interval or ratio data
• It does not require the assumption of normally distributed populations
• The hypotheses are:

H0: All populations are identical
Ha: Not all populations are identical

Mann-Whitney U Test

Two-sample rank test

Although nonparametric procedures have been proposed for testing differences between the dispersion, or variability, of two populations, none has achieved widespread acceptance.

Differences between independent groups

¤ Two samples – compare mean value for some variable of interest
– Parametric test: t-test for independent samples
– Non-parametric tests: Wald-Wolfowitz runs test, Mann-Whitney U test, Kolmogorov-Smirnov two sample test

Mann-Whitney U Test

¤ Nonparametric alternative to the two-sample t-test
¤ For this test, as for many other nonparametric procedures, the actual measurements are not employed; the ranks of the measurements are used instead
¤ The data may be ranked either from the highest to the lowest or from the lowest to the highest values

Calculate the Mann-Whitney U statistic (one-sided test)

U = n1n2 + n1(n1+1)/2 – R1

Calculate the Mann-Whitney U statistics (two-sided test)

U = n1n2 + n1(n1+1)/2 – R1
U' = n1n2 – U

or, equivalently, starting from Sample two:

U' = n2n1 + n2(n2+1)/2 – R2
U = n1n2 – U'

– n1 and n2 are the number of observations in Sample one and two
– R1 is the sum of the ranks of the observations in Sample one
– R2 is the sum of the ranks of the observations in Sample two

Two-tailed null hypothesis that there is no difference between the heights of male and female students
H0: Male and female students are the same height
HA: Male and female students are not the same height

Example 1

Heights of     Heights of      Ranks of        Ranks of
males (cm)     females (cm)    male heights    female heights
193            175             1               7
188            173             2               8
185            168             3               10
183            165             4               11
180            163             5               12
178                            6
170                            9

n1 = 7   n2 = 5   R1 = 30   R2 = 48

U = n1n2 + n1(n1+1)/2 – R1
U = (7)(5) + (7)(8)/2 – 30
U = 35 + 28 – 30
U = 33

U' = n1n2 – U
U' = (7)(5) – 33
U' = 2

U 0.05(2),7,5 = U 0.05(2),5,7 = 30
As 33 > 30, H0 is rejected
0.01 < P (U ≥ 33 or U' ≤ 2) < 0.02

Calculation for z-statistics

E(U) = (n1n2)/2 = (7)(5)/2 = 17.5
S(U) = √[n1n2(n1+n2+1)/12] = √[(7)(5)(7+5+1)/12] = 6.16

z = [U' – E(U)]/S(U) = (2 – 17.5)/6.16 = -2.516

Rejection Rule
Using 0.05 level of significance, reject H0 if z < -1.96 or z > 1.96

Conclusion

Reject H0. There is a significant difference between the heights of male and female students.
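The calculation for Example 1 can be reproduced with a short Python sketch. It assumes, as in the example, that heights are ranked from highest to lowest and that there are no ties; the variable names are chosen only for this illustration.

```python
import math

males = [193, 188, 185, 183, 180, 178, 170]     # Sample one, heights in cm
females = [175, 173, 168, 165, 163]             # Sample two, heights in cm

# Rank the pooled heights from highest (rank 1) to lowest; no ties in these data
pooled = sorted(males + females, reverse=True)
rank = {height: position + 1 for position, height in enumerate(pooled)}

n1, n2 = len(males), len(females)
r1 = sum(rank[h] for h in males)                # sum of ranks in Sample one
r2 = sum(rank[h] for h in females)              # sum of ranks in Sample two

u = n1 * n2 + n1 * (n1 + 1) / 2 - r1
u_prime = n1 * n2 - u

# Normal approximation
mean_u = n1 * n2 / 2
sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z = (u_prime - mean_u) / sd_u

print(f"R1 = {r1}, R2 = {r2}, U = {u}, U' = {u_prime}, z = {z:.2f}")
```

The sketch gives R1 = 30, R2 = 48, U = 33 and U' = 2. Its z is about -2.52 because S(U) is not rounded to 6.16 before dividing.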

Example of Mann-Whitney U test

H0: The performance of students is the same under the two teaching assistants
Ha: Students do not perform equally well under the two teaching assistants
α = 0.05

Example 2

Teaching Assistant A        Teaching Assistant B
Grade   Rank of grade       Grade   Rank of grade
A       3                   A       3
A       3                   A       3
A       3                   B+      7.5
A-      6                   B+      7.5
B       10                  B       10
B       10                  B-      12
C+      13.5                C       16.5
C+      13.5                C       16.5
C       16.5                C-      19.5
C       16.5                D       22.5
C-      19.5                D       22.5
                            D       22.5
                            D       22.5
                            D-      25

n1 = 11   R1 = 114.5   n2 = 14   R2 = 210.5

U = n1n2 + n1(n1+1)/2 – R1
U = (11)(14) + (11)(12)/2 – 114.5
U = 154 + 66 – 114.5
U = 105.5

U' = n1n2 – U
U' = (11)(14) – 105.5
U' = 48.5

U 0.05(2),11,14 = 114
As 105.5 < 114, do not reject H0
0.10 < P (U ≥ 105.5 or U' ≤ 48.5) < 0.20

Calculation for z-statistics

E(U) = (n1n2)/2 = (11)(14)/2 = 77
S(U) = √[n1n2(n1+n2+1)/12] = √[(11)(14)(11+14+1)/12] = 18.27

z = [U' – E(U)]/S(U) = (48.5 – 77)/18.27 = -1.56

Rejection Rule
Using 0.05 level of significance, reject H0 if z < -1.96 or z > 1.96

Conclusion

Cannot reject H0. There is insufficient evidence to conclude that the performance of students differs under the two teaching assistants.
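Example 2 involves tied grades, so each grade receives the average of the rank positions it occupies. A possible Python sketch of that ranking step follows. The list `grade_order` and the helper `average_ranks` are assumptions made for illustration, and, like the hand calculation, the sketch applies no tie correction to S(U).

```python
import math

# Grades from best to worst; this ordering list is an assumption of the sketch
grade_order = ["A", "A-", "B+", "B", "B-", "C+", "C", "C-", "D", "D-"]

ta_a = ["A", "A", "A", "A-", "B", "B", "C+", "C+", "C", "C", "C-"]
ta_b = ["A", "A", "B+", "B+", "B", "B-", "C", "C", "C-", "D", "D", "D", "D", "D-"]

def average_ranks(values, key):
    """Map each distinct value to the average rank of the positions it occupies."""
    ordered = sorted(values, key=key)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1                              # j ends one past the last tied item
        ranks[ordered[i]] = (i + 1 + j) / 2     # mean of rank positions i+1 .. j
        i = j
    return ranks

ranks = average_ranks(ta_a + ta_b, key=grade_order.index)
n1, n2 = len(ta_a), len(ta_b)
r1 = sum(ranks[g] for g in ta_a)
r2 = sum(ranks[g] for g in ta_b)

u = n1 * n2 + n1 * (n1 + 1) / 2 - r1
u_prime = n1 * n2 - u
z = (u_prime - n1 * n2 / 2) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

print(f"R1 = {r1}, R2 = {r2}, U = {u}, U' = {u_prime}, z = {z:.2f}")
```

The ranks and sums match the table: R1 = 114.5, R2 = 210.5, U = 105.5, U' = 48.5, and z is about -1.56.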

Differences between independent groups

¤ Multiple groups
– Parametric: Analysis of variance (ANOVA/MANOVA)
– Nonparametric: Kruskal-Wallis analysis of ranks test

Differences between dependent groups

¤ Compare two variables measured in the same sample
– Parametric: t-test for dependent samples
– Nonparametric: Sign test, Wilcoxon’s matched pairs test
¤ If more than two variables are measured in same sample
– Parametric: Repeated measures ANOVA
– Nonparametric: Friedman’s two way analysis of variance, Cochran Q

Relationships between variables

– Parametric: Correlation coefficient
– Nonparametric: Spearman R, Kendall Tau, Coefficient Gamma
¤ Two variables of interest are categorical
– Nonparametric: Chi square, Phi coefficient, Fisher exact test
– Kendall coefficient of concordance

Summary Table of Statistical Tests

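Tests by level of measurement and sample characteristics

Categorical or Nominal
– 1 sample: Χ2 or binomial
– 2 samples, independent: Χ2
– 2 samples, dependent: McNemar’s Χ2
– K samples (i.e., >2), independent: Χ2
– K samples, dependent: Cochran’s Q

Rank or Ordinal
– 2 samples, independent: Mann-Whitney U
– 2 samples, dependent: Wilcoxon Matched Pairs Signed Ranks
– K samples, independent: Kruskal-Wallis H
– K samples, dependent: Friedman’s ANOVA
– Correlation: Spearman’s rho

Parametric (Interval & Ratio)
– 1 sample: z test or t test
– 2 samples, independent: t test between groups
– 2 samples, dependent: t test within groups
– K samples, independent: 1 way ANOVA between groups
– K samples, dependent: 1 way ANOVA (within or repeated measure)
– Factorial (2 way) ANOVA
– Correlation: Pearson’s r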

Advantages of Nonparametric Tests

¤ Probability statements obtained from most nonparametric statistics are exact probabilities, regardless of the shape of the population distribution from which the random sample was drawn
¤ If sample sizes as small as N=6 are used, there is no alternative to using a nonparametric test unless the nature of the population distribution is known exactly
¤ They can treat samples made up of observations from several different populations
¤ They can treat data which are inherently in ranks as well as data whose seemingly numerical scores have the strength of ranks
¤ They are available to treat data which are classificatory
¤ They are easier to learn and apply than parametric tests

Criticisms of Nonparametric Procedures

¤ Losing precision/wasteful of data
¤ Low power
¤ False sense of security
¤ Lack of software
¤ Testing distributions only
¤ Higher-ordered interactions not dealt with

A good tree will bear good fruits