c

Journal of Applied Mathematics Decision Sciences

Reprints Available directly from the Editor Printed in New Zealand

Extensions to the KruskalWallis Test and a

Generalised Test with Extensions

JCW RAYNER john rayneruoweduau

Department of Applied University of Wol longong Northelds Avenue NSW

Australia

DJ BEST johnbbiomsyddfstcsiroau

CSIRO Mathematical and Information Sciences PO Box North Ryde NSW Australia

Abstract The data for the tests considered here may b e presented in twowaycontingency ta

bles with all marginal totals xed Weshow that Pearsons test statistic X P for Pearson may

P

b e partitioned into useful and informative comp onents The rst detects lo cation dierences b e

tween the treatments and the subsequent comp onents detect disp ersion and higher order moment

dierences For KruskalWallistyp e data when there are no ties the lo cation comp onent is the

KruskalWallis test The subsequent comp onents are the extensions Our approach enables us to

generalise to when there are ties and to when there is a xed numb er of categories and a large

numb er of observations We also prop ose a generalisation of the wellknown median test In this

reduces to the usual median test statistic situation the lo cationdetecting rst comp onentofX

P

when there are only two categories Subsequent comp onents detect higher moment departures

from the null hyp othesis of equal treatmen t eects

Keywords Categorical data Comp onents Nonparametric tests Orthonormal p olynomial

Twoway data

Intro duction

The idea of decomp osing a test into orthogonal contrasts as in the analysis of

variance has long b een appreciated by statisticians as a way of making hyp othesis

tests more informative In the authors smo oth go o dness of t work see Rayner

and Best a similar approachis pursued Omnibus test statistics are par

titioned into smo oth comp onents We dene the comp onents of a test statistic

to be asymptotically pairwise indep endent with each asymptotically having the

chisquared distribution and suchthat their sum gives the original test statistic

ts provide powerful directional tests and permit a convenient and The comp onen

informative scrutinyofthe data This approachis applied to Sp earmans test in

Best and Rayner and Rayner and Best a Rayner and Best b

gaveanoverview of this approach applied to several commonly used nonparametric

tests including the Friedman and Durbin tests

Data for a generalisation of the median test that we subsequently prop ose and

for the KruskalWallis test both with and without ties may be presented in the

form of twoway tables with xed marginal totals Wederive the covariance matrix

 JCW RAYNER AND DJ BEST

of entries in such tables and then partition a multiple of X into comp onents that

P

detect lo cation and higher moment dierences b etween rows

For KruskalWallistyp e data when there are no ties the lo cation comp onentis

the KruskalWallis test Our approach enables us to generalise to when there are

ties and to when there is a xed number of categories and a large number of

observations We also prop ose a generalisation of the wellknown median test The

lo cation detecting rst comp onentofX reduces to the usual median test statistic

P

when there are only two categories Using more categories allows comp onents other

than this lo cation comp onent to b e calculated These additional comp onents that

detect disp ersion and higher moment eects are not available when using the usual

median test

The structure of this pap er is as follows In the next section the mo del for two

way contingency tables with xed marginal totals is given and Pearsons X is

P

derived as a test statistic for the null hyp othesis of like rows In section three a

multiple of X is partitioned into comp onents The material in section and

P

will b e familiar to many readers but is necessary background for the new work In

section four it is shown that when there are no ties the rst comp onent is the usual

KruskalWallis statistic The nonlo cation detecting comp onents are our extensions

Section ve generalises the treatment to when there are ties Section six intro duces

a generalisation of the usual median X test whichisthus identied as a lo cation

detecting test the extensions p ermit disp ersion and other eects to b e detected

A Mo del and Pearsons X Test

Supp ose wehavea twoway table of counts N with i r and j c

ij

ely n i r and n j c The row and column totals resp ectiv

i j

are known constants Under the null hyp othesis of simple random the

likeliho o d was given byRoy and Mitra as

r c r c

Y Y Y Y

n n n n

i j ij

i j  i j 

P P

in which n n n is the grand total of the observations The mo d

i j

i j

els for tables with just one set of marginal totals xed or only the grand total

xed are quite dierent from our mo del in which all row and column totals are

xed See Lancaster chapter XI section pp This likeliho o d can

b e expressed as a pro duct of extended or multivariate hyp ergeometric probability

functions

r c

Y Y

n n n n

j ij i

C C

n n

ij i

i j 

taken with resp ect to the To nd moments of the N exp ectations may be

ij

distribution of the second row conditional on knowledge of the column sums of the

EXTENDED KRUSKALWALLIS AND MEDIAN TESTS 

rst tworows then conditional on the column sums of the rst three rows and so

on It suces to know the moments of the extended hyp ergeometric distribution

Details are given in the App endix Wend

E N n n n i r and j c

ij i j

T T T

T

Write N N N i r and N N N so that N is

i i ic

 r

the vector of all the cell counts The jointcovariance matrix of N and N is for

i j

i j

n n n n n n

r r s i j

diag cov N N

i j

n n n

Write f n n j c and

j j

n n n n

r r s

R diag

n n

The covariance matrix of N is cov N fdiag f f f gR where is

j i j

the direct or Kronecker pro duct See Lancaster for details ab out direct or

Kronecker sums and pro ducts Now dene the standardised cell counts

q

Z N E N E N i r and j c

ij ij ij ij

T

Z Z Z Z Z

 c r  rc

I the a by a identity matrix and the a byvector with every element Then

a a

q

cov Z fI f f gR

r i j

p

The matrix fI f f g has r latent ro ots and one latent ro ot zero The

r i j

latent ro ots of R are dicult to nd in general but their asymptotic limits follow

from Lancaster Chapter V Lancaster showed that the quadratic form

with vector the standardised cell counts and matrix essentially R is the familiar

Pearson go o dness of t statistic with asymptotic distribution Hence the

c

latent ro ots of R are asymptotically one c times and zero once So under the

null hyp othesis of simple random sampling Z has zero mean and covariance matrix

cov Z which asymptotically has r c latent ro ots one and the remaining

r c latent ro ots zero

In the well known and often used classical mo del r and c are xed and the total

coun t n The test statistic X is given by

P

r c

X X

n n n n

i j i j

T

Z Z N X

ij

P

n n

i j 

Wenow conrm that our mo del leads to this test statistic Supp ose H is orthogonal

and diagonalises cov Z Asymptotically we then have

 JCW RAYNER AND DJ BEST

T

H covZ H I O

r c r c

T T

where means direct or Kronecker sum Dene Y H Z Now Z Z

T

Y Y in which Y by the multivariate Central Limit Theorem is asymptotically

N I under the null hyp othesis of simple random sampling

rc

r c r c

T T

It follows that under the null hyp othesis X Z Z Y Y asymptotically has

P

the distribution

r c

Partitioning Pearsons Statistic

We now show that X may be partitioned into comp onents the sth of which

P

detects sth moment departures from the null hyp othesis of similarly distributed

rows treatments

rc

X

There is some choice in dening Y The elements Y of Y are suchthat X

i

i P

i

the Y asH is not yet fully sp ecied In doing so our aim is to nd Y that can

i i

e one such partition rst supp ose be easily and usefully interpreted To achiev

that fg j g is the set of p olynomials orthonormal on fn n g See the App endix

s j

for the denitions of the rst two p olynomials and the derivation of subsequent

p olynomials This approach results when there are no ties in the rst comp onent

b eing the KruskalWallis test Write g for the c byvector with elements g j

s s

Dene G by

p

G G G c

c

in which G is the rc by r matrix

s

g

s

g

s

G s c and

s

g

s

c

c

is also rc by r G

c

c

q

T

n

G Z The elements of Y may be considered in blo cks of Dene Y

n

r the sth blo ck corresp onding to the p olynomial of order s These blo cks are

T T T

V in which asymptotically mutually indep endent Write Y V

 c

T T

V Y Y V Y Y

  r c

cr  cr

EXTENDED KRUSKALWALLIS AND MEDIAN TESTS 

and V all the V are r by so that

c s

n n

T T T T

X Z Z Y Y V V V V

 c

P  c

n n

T

n

X into comp onents V V s c The V are This partitions

s s

s

P

n

asymptotically mutually indep endent and asymptotically N I so that

r

r 

T

the V V are asymptotically mutually indep endent Explicitly wehave for

s

s r 

s c

p p

c

X

n n

T

A

V G Z g j Z

s s ij

s

n n

j 

Because V involves through g a p olynomial of order s the elements of V are

s s s

p olynomials of order s in the elements of N Under the null hyp othesis E Z

but when this is not true E V involves moments up to order s of Z So for

s

T

V detects sth moment departures from the null hyp othesis of s r V

s

s

similarly distributed rows treatments

Three instructors assign Instructors Example See Conover p

grades in ve categories according to the following table

Grade

A B C D E Total

Instructor

Instructor

Instructor

Total

Conover found the KruskalWallis statistic adjusted for ties to be

which is to b e compared with the p ointof We nd the lo cation de

T

V to havePvalue conrming as Conover rep orted tecting comp onent V





that none of the instructors can be said to grade higher or lower than the oth

ers on the basis of the evidence presented However the disp ersion detecting

T

comp onent V V has Pvalue indicating a signicantvariability dierence

From the data it app ears that the rst instructor is less variable than the other

t wo In fact with the elements of

T

v b eing values of approximately standard normally

distributed contributions from instructors and resp ectively The rst instruc

tor is less variable than the third who is less variable than the second This can

T T

V V V has Pvalue b e formalised by a LSD analysis The residual X V





P indicating no further eects in the data

 JCW RAYNER AND DJ BEST

Partition of X for instructors data

P

Statistics Degrees of Freedom Value Pvalue

T

V V





T

V V

T T

X V V V V





P

X

P

The KruskalWallis Test with No Ties

Wenow consider mo dels that lead to the KruskalWallis test when there are no ties

The latentrootsofcov Z will b e found explicitly rather than asymptotically as in

section Weshow that X is not an appropriate test statistic but nevertheless

P

its comp onents are The rst comp onent is the KruskalWallis test statistic and

the subsequentcomponents provide informative extensions

Supp ose we have distinct observations x b eing the j th of n observations on

ij i

the ith of t treatments All n n n observations are combined ordered

 t

ranked and the sums R of the ranks obtained by the ith treatment calculated

i

The KruskalWallis statistic is

H fnn g R n n

i i

i

See for example Conover section The data may b e presented as an t by

n contingency table of counts fN gwithN if rank j is allotted to treatment

ij ij

iandN if rank j is allotted to some other treatment The row and column

ij

totals are all xed the row totals are the treatment sample sizes so that n n

i i

for i t while the column totals are all one n for i n Such

j

a table has X t n no matter what the fN g Since X is constant it has

ij

P P

Pvalue Clearly X is not a suitable test statistic

P

The mo del of section holds except that now

n

T

I R

n n

n

n n

This matrix has one latent ro ot one and n latent ro ots nn It follows

that cov Z hast n latent ro ots nn and the remaining t n

latent ro ots zero So if H is orthogonal and diagonalises cov Z then

n

T

H cov Z H I

tn tn

n Dene

EXTENDED KRUSKALWALLIS AND MEDIAN TESTS 

s

n

T

H Z Y

n

Then

n n

T T

Y Y Z Z X

P

n n

With r replaced by t and c replaced by n this is the same as in section three

As in section we are interested in the distribution theory as n However

there Z was an rc by vector of xed length here Z is a tn by vector For

tunately it is not the asymptotic distribution of Z that is required First recall

that X has a xed value t n for all tables and so is not available as a test

P

statistic Second as in section three the multivariate Central Limit Theorem shows

that each V is asymptotically N I Moreover consideration of all pairs

s t

t

V V shows that they are asymptotically jointly multivariate normal and since

s t

their covariance matrix is zero they are asymptotically pairwise indep endent The

T

n

V V still partition X It is the pairwise indep endence and convenient

s

s

P

n

distribution of each V that makes data analysis so informative and conve

s

t

nient What is lost bytheunavailabilityof X is demonstrated in the Employees

P

Example b elow there is no residual available to assess if there are higher moment

dierences b etween the treatments

T

Wenowshowthatprovided there are no ties V V is the KruskalWallis statis





T

tic so that the subsequent V V provide extensions to the KruskalWallis test

s

s

is the set of p olynomials orthonormal on the discrete First note that the fg j g

s

uniform distribution so that g j aj b j ninwhich



p p

n and b n n fn ga a

n

X

The rank sum for treatment i R is jN i t Now since n for

i ij j

j 

j n

q

X X

p

g j g j n g j E N n n

   ij i

j j

and

X

p p

P

Z g j nn N aj b nn faR bn g

ij  i ij i i i

j

j

p

n

n a nn R

i i i

 JCW RAYNER AND DJ BEST

Now

n t

X X

n

T

A

g j Z V V Y Y

 ij 

 t



n

j  i

t t

X X

p

n R R n

i

i

a n n n

p

i

n n nn n

i i

i i

after some manipulation This is the KruskalWallis statistic well known to be

sensitive to lo cation departures from the null hyp othesis Since V assesses sth

s

n

X moment departures b etween treatments wehave partitioned the statistic

P

n

T

into asymptotically pairwise indep endent comp onents V V s n each

s

s

with the distribution and suchthat the sth detects sth moment departures

t

from the hyp othesis of similarly distributed rows treatments Since the rst of

these is the KruskalWallis statistic the subsequent comp onents provide extensions

to the KruskalWallis test

Employees Example Conover p exercise gave an exercise in which

new employees are randomly assigned to four dierent job training programmes

At the end of their training the employees are ranked with a low ranking reecting

alow job ability

Programme Ranks

The value of the KruskalWallis statistic is with Pvalue but Monte



Carlo p ermutation test Pvalue The latter is more likely to b e accurate as

the sample size is small Further comp onents are not signicant An LSD analysis

can b e used to show that programmes and and programmes and are equally

eective with and b eing sup erior

The KruskalWallis Test with Ties



If there are ties the data may be presented as an t by n contingency table of

counts fN g with the row totals are xed at the treatment sample sizes so again

ij

n n i t while the column totals are no longer all one The covariance

i i

matrix of Z is

q

n n n n

u v u

cov Z fI f f gR and R diag

t i j

n n

EXTENDED KRUSKALWALLIS AND MEDIAN TESTS 



As in section the latent ro ots of R are zero once and asymptotically one n



times It follows that cov Z hast n latent ro ots asymptotically one



and the remaining t n latent ro ots zero With suitable mo dications the



partitioning of section three holds For s n

n

X

p p

T

 

V G Z n g j Z n

s s ij

s

j 

Note that fg j g is the set of p olynomials orthonormal on fn n g not on the

s j

discrete uniform as in the previous section when there were no ties This is the

partition derived in section for X So the rst comp onent of X in the In

P P

structors example is the KruskalWallis statistic corrected for ties The subsequent

comp onents are extensions to the KruskalWallis test adjusted for ties Note that

for this example the mo del assumed in section with xed numbers of rows and



columns is more plausible than the mo del of this section since n is hardly

large

Generalised Median Tests

Conover section describ ed the median test in which random samples are

taken from each of c p opulations Each random sample is classied as ab ove and

b elow the grand median the median of the combined random samples forming

an r by contingency table with xed marginal totals The usual chisquared test

is then applied to this contingency table based on X

P

If instead of the grand median a grand quantile is used the resulting test is

describ ed as a quantile test see Conover p These tests can b e gen

eralised bycho osing c instead of two categories for the combined random samples

and so forming an r by c contingency table of counts N of the numb er of obser

ij

vations for the ith sample in the j th category This table has all row and column

totals xed and can b e tested for row consistency using the results of the sections

and The rst three say comp onents of X are of particular interest indicating

P

lo cation disp ersion and skewness dierences b etween treatments

T

V It is routine to showthatthe lo cation comp onent V of X reduces to the





P

median test statistic when observations are classied into just two categories This

is shown in the App endix The result identies the median test as a lo cation detect

ing test To detect up to sth moment dierences b etween the p opulations requires

categorisation into s categories and the use of the V V comp onents If

s

there are as many categories as observations and each category has one observation

the test based on the lo cation comp onent is the KruskalWallis test which is known

to be more powerful than the median test Using more than two categories will

result in less loss of information due to categorisation compared to the median test

and will p ermit assessment of higher moment dierences b etween the treatments

Corn Example Conover p gave the example of four dierent metho ds

of growing corn He classied the data as greater than and up to and applied

 JCW RAYNER AND DJ BEST

the median test In this form this do es not conform to the xed margins mo del

If the ob jective were to divide the data into groups of the lowestand highest

observations it would conform to the xed margins mo del Wenow classify the

data into four approximately equal groups

Using the median test Conover rep orted a Pvalue slightly less than the

metho d median yields are clearly dierent We calculate X on degrees

P

T T T

of freedom In addition V V V V and V V all

 

 

on degrees of freedom The lo cation and disp ersion comp onents and X are all

P

signicant with Pvalues all zero to three decimal places The residual or skewness

comp onenthas Pvalue The ner classication compare to that employed



by the median test has uncovered a variability dierence between the metho ds

metho ds and are signicantly less variable than and

First Second Third Fourth Total

Quartile Quartile Quartile Quartile

Metho d

Metho d

Metho d

Metho d

Total

App endix

The Orthogonal Polynomials

The rst two p olynomials dened on x x and orthonormal with regard to

 c

the weights p p areg x andg x given explicitly by

 c  j j

p

g x x

 j j

and

g x afx x g j c

j j  j

in which

c c

X X



r

x p x p and a

j j r j j 



j  j 

The subsequent p olynomials g x g x may b e derived by using the useful

 j c j

recurrence relations in Emerson In the text we have taken as in many

applications x j j c

j

Derivation of the Covariance Matrix of the Cell Counts

In section the metho d used to nd the moments of the N is describ ed Tond

ij

E N wetake E N jN N j c then the conditional exp ectation

  j j

of this expression with the sum of the rst three columns b eing known and so on

The successive exp ectations are

EXTENDED KRUSKALWALLIS AND MEDIAN TESTS 

n N N n n

  

fn n n gfn n N N N n n n g

      

n n n g fn n n gfn n

   

fn n n n gfn g n n n

  c  

c

By symmetry E N n n n i r and j c By dierence the

ij i j

exp ectations for the rst rowmay b e obtained giving the familiar

E N n n n i r and j c

ij i j

In the same way

E N N n n n n fn n g

   

from whichwe obtain varN and



n n n n

j j i

varN n i r and j c

ij i

n n n

Similarly

n n n n

j k i

cov N N n i r and j k c

ij ik i

n n n

By symmetry

n n n n

s j r

i r and j k c cov N N n

rj sj j

n n n

and by the exp ectation argumentagain

n n n n

i r j s

cov N N i j r and r s c

ir js

n n n

T T T

T

Write N N N i r and N N N The joint

i i ic

 r

covariance matrix of N and N is for i j

i j

n n n n n n

j r r s i

diag cov N N

i j

n n n n

Now since the fN g are suchthattherow and column totals are known constants

ij

cov N N N for i r So if wewritef n n j c

i  r j j and

 JCW RAYNER AND DJ BEST

n n n n

r s r

R diag

n n

then the covariance matrix of N is

i

X X

cov N cov N N f f R f f R

i i j i j i i

ij ij

which agrees with direct calculation So if is the Kronecker pro duct the covari

ance matrix of N is

cov N fdiag f f f gR

j i j

p

Recall that wehavedened Z N E N E N i r and j

ij ij ij ij

T

candZ Z Z Z Z It follows that

 c tl tc

q

cov Z fI f f gR

r i j

T

reduces to the median test statistic The lo cation comp onent V V of X





P

If there are b observations b elow a predetermined p oint in the combined sample

and a ab ove it then Conover p gavethe X Median test statistic as

r

X

n n b

i

T N n

i i

ab n

i

It is routine to show that g j j cinwhich and are are

j

the mean and standard deviation of the distribution dened by P X bn

and P X an It follows that an and abn Now

X

p

n V N g N an n N an

i i ij j i i i

j 

n an n N n bn N

i i i i i

T

V T It follows that as required V





References

DJ Best and JCW Rayner Nonparametric analysis for doubly ordered twowaycontin

gency tables Biometrics

WJ Conover Practical nd ed Wiley New York

PL Emerson Numerical construction of orthogonal p olynomials from a general recurrence

formula Biometrics

HO Lancaster The Chisquared Distribution Wiley New York

JCW Rayner and DJ Best Smooth Tests of Goodness of Fit Oxford University Press

New York

JCW Rayner and DJ Best Smo oth extensions of Pearsons pro duce moment correlation

and Sp earmans rho Statistics and Probability Letters a

EXTENDED KRUSKALWALLIS AND MEDIAN TESTS 

JCW Rayner and DJ Best Extensions to some imp ortant nonparametric tests In Pro

ceedings of the AC Aitken Centenary Conference Dunedin Kavalieris L et al ed

University of Otago Press Dunedin b

SN Roy and SK Mitra An intro duction to some nonparametric generalisations of analysis

of variance and multivariate analysis Biometrika Advances in Advances in Journal of Journal of Operations Research Decision Sciences Applied Mathematics Algebra Probability and Statistics Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014

The Scientific International Journal of World Journal Differential Equations Hindawi Publishing Corporation Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014

Submit your manuscripts at http://www.hindawi.com

International Journal of Advances in Combinatorics Mathematical Physics Hindawi Publishing Corporation Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014

Journal of Journal of Mathematical Problems Abstract and Discrete Dynamics in Complex Analysis Mathematics in Engineering Applied Analysis Nature and Society Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014

International Journal of Journal of Mathematics and Mathematical Discrete Mathematics Sciences

Journal of International Journal of Journal of Function Spaces Stochastic Analysis Optimization Hindawi Publishing Corporation Hindawi Publishing Corporation Volume 2014 Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 http://www.hindawi.com http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014 http://www.hindawi.com Volume 2014