Measuring association among ordinal categorical variables


Goodman and Kruskal’s Gamma

1 An example

Effect of smoking at 45 years of age on self reported health five years later

Variable   Categories
Smoking    1 Never smoked
           2 Stopped smoking
           3 1-14 cigarettes/day
           4 15-24 cigarettes/day
           5 25+ cigarettes/day
SRH        1 Very good
           2 Good
           3 Fair
           4 Bad

Both variables have ordinal categories

Expected monotonic association: increasing codes on Smoking → increasing codes on SRH

2 Data on males from the Glostrup surveys

SMOKE45 (J) | HEALTH51 (B)            |
            | Vgood  Good  Fair   Bad |  TOTAL
------------+-------------------------+-------
Never       |    16    73     6     1 |     96
       row% |  16.7  76.0   6.3   1.0 |  100.0
No mo       |    15    75     6     0 |     96
       row% |  15.6  78.1   6.3   0.0 |  100.0
1-14        |    13    59     7     1 |     80
       row% |  16.3  73.8   8.8   1.3 |  100.0
15-24       |    10    81    17     3 |    111
       row% |   9.0  73.0  15.3   2.7 |  100.0
25+         |     1    29     3     1 |     34
       row% |   2.9  85.3   8.8   2.9 |  100.0
------------+-------------------------+-------
TOTAL       |    55   317    39     6 |    417
       row% |  13.2  76.0   9.4   1.4 |  100.0

χ² = 16.2, df = 12, p = 0.182

The χ² test shows no evidence of association even though the expected monotonic relationship is plain as the nose on your face

3 Correlation coefficients for ordinal categorical variables

Pearson’s correlation coefficient

1) Measures linear association which is not meaningful for ordinal variables

2) Evaluation of significance requires normal distributions

Rank correlations (Kendall's τ and Spearman's ρ) are more appropriate but require continuous data with very little risk of ties.

4 Goodman and Kruskal's γ for ordinal categorical data

1) Similar to Kendall's τ.

2) Related to the odds ratio for 2×2 tables.

3) Well-known asymptotic properties.

4) A partial γ coefficient measuring the conditional monotonic relationship among ordinal variables is available.

5 Monotonic relationships

Y increases when X increases/decreases

Two variables: X,Y

Probabilities: pxy = Prob(X=x,Y=y)

X and Y are independent if

pxy = Prob(X=x) Prob(Y=y)

What exactly do we mean when we say that there is a monotonic relationship between X and Y?

6 Concordance and discordance

Compare outcomes on (X,Y) for two stochastically independent cases

(X1,Y1) and (X2,Y2)

Concordance (C) if X1 < X2 and Y1 < Y2, or X1 > X2 and Y1 > Y2

Discordance (D) if X1 < X2 and Y1 > Y2, or X1 > X2 and Y1 < Y2

Tie (T) if X1=X2 or Y1=Y2
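As a small illustration of these definitions, a pair of cases can be classified from the sign of (X1 - X2)(Y1 - Y2). The function name classify_pair and the example codes are assumptions made for this sketch only.

```python
def classify_pair(case1, case2):
    """Classify two (x, y) observations as concordant, discordant or tied."""
    (x1, y1), (x2, y2) = case1, case2
    sign = (x1 - x2) * (y1 - y2)
    if sign > 0:
        return "C"   # X and Y differ in the same direction
    if sign < 0:
        return "D"   # X and Y differ in opposite directions
    return "T"       # X1 = X2 or Y1 = Y2


# Hypothetical (Smoking, SRH) codes for pairs of persons
print(classify_pair((1, 2), (4, 3)))  # C
print(classify_pair((1, 3), (4, 2)))  # D
print(classify_pair((2, 2), (2, 4)))  # T
```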

Concordance = same trend in X and Y

Discordance = different trends in X and Y

Probabilities of concordance and discordance

\[ P_C = \sum_{(x_1,y_1,x_2,y_2) \in C} p_{x_1 y_1}\, p_{x_2 y_2} \qquad P_D = \sum_{(x_1,y_1,x_2,y_2) \in D} p_{x_1 y_1}\, p_{x_2 y_2} \]

Positive relationship: PC > PD

Negative relationship: PC < PD

8 The gamma coefficient

A measure of the strength of the monotonic relationship

\[ \gamma = \frac{P_C - P_D}{P_C + P_D} \]

Satisfies all conventional requirements of correlation coefficients:

-1 ≤ γ ≤ +1
γ = 0 if X and Y are independent
Positive association if γ > 0
Negative association if γ < 0
Reverse the order of the Y categories: γ after recoding = -γ before recoding

9 Interpretation of γ

\[ P(C \mid C \cup D) = \frac{P(C)}{P(C) + P(D)} \]

\[ P(D \mid C \cup D) = \frac{P(D)}{P(C) + P(D)} \]

such that

γ = P(C | C ∪ D) - P(D | C ∪ D)

γ is the difference between two conditional probabilities

10 Estimation of γ

Pairwise comparison of all persons in the data set

nC = number of concordances
nD = number of discordances
nT = number of ties

Relative frequencies

\[ h_C = \frac{n_C}{n_C + n_D + n_T} \qquad h_D = \frac{n_D}{n_C + n_D + n_T} \qquad h_T = \frac{n_T}{n_C + n_D + n_T} \]

The estimate of γ

\[ G = \frac{h_C - h_D}{h_C + h_D} = \frac{n_C - n_D}{n_C + n_D} \]
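The pairwise comparison can be sketched directly on the smoking table shown earlier. This is a minimal illustration, not the program that produced the slides; the names TABLE and pair_counts are assumptions. Pairs are counted cell by cell rather than person by person, which gives the same totals.

```python
from itertools import combinations

# Observed table: rows = SMOKE45 (Never .. 25+), columns = HEALTH51 (Vgood .. Bad)
TABLE = [
    [16, 73, 6, 1],
    [15, 75, 6, 0],
    [13, 59, 7, 1],
    [10, 81, 17, 3],
    [1, 29, 3, 1],
]


def pair_counts(table):
    """Return (n_C, n_D, n_T) over all unordered pairs of persons."""
    cells = [(x, y, n) for x, row in enumerate(table) for y, n in enumerate(row) if n]
    n = sum(count for _, _, count in cells)
    n_c = n_d = 0
    for (x1, y1, m1), (x2, y2, m2) in combinations(cells, 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            n_c += m1 * m2
        elif sign < 0:
            n_d += m1 * m2
    # Every pair that is neither concordant nor discordant is a tie
    n_t = n * (n - 1) // 2 - n_c - n_d
    return n_c, n_d, n_t


n_c, n_d, n_t = pair_counts(TABLE)
G = (n_c - n_d) / (n_c + n_d)
print(n_c, n_d, n_t, round(G, 2))
```

The rounded value of G should agree with the γ = 0.24 reported for this table later in the slides.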

11 A little bit of notation

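nxy = number of persons with X=x and Y=y

\[ A_{ij} = \sum_{x>i,\, y>j} n_{xy} + \sum_{x<i,\, y<j} n_{xy} \]

\[ D_{ij} = \sum_{x>i,\, y<j} n_{xy} + \sum_{x<i,\, y>j} n_{xy} \]

Layout of the table around cell (x,y):

          X:   1  ...   x   ...  c
Y:   1        Axy       .       Dxy
     y         .       nxy       .
     r        Dxy       .       Axy

Number of concordances and discordances

\[ n_C = \sum_{x,y} n_{xy} A_{xy} \qquad n_D = \sum_{x,y} n_{xy} D_{xy} \]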
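The notation can be checked on a small table. Because A and D each cover two opposite quadrants, the sums over all cells visit every concordant or discordant pair twice (once from each member of the pair); the factor 2 cancels in G. The sketch below uses a hypothetical 2×2 table, and the function names are assumptions.

```python
def quadrant_sums(table, i, j):
    """Return (A_ij, D_ij) for cell (i, j) of a table stored as a list of rows."""
    a = d = 0
    for x, row in enumerate(table):
        for y, n in enumerate(row):
            sign = (x - i) * (y - j)
            if sign > 0:
                a += n      # both coordinates larger, or both smaller
            elif sign < 0:
                d += n      # one coordinate larger, the other smaller
    return a, d


def summed_counts(table):
    """Return (sum of n*A, sum of n*D) over all cells."""
    n_c = n_d = 0
    for i, row in enumerate(table):
        for j, n in enumerate(row):
            a, d = quadrant_sums(table, i, j)
            n_c += n * a
            n_d += n * d
    return n_c, n_d


table = [[3, 1], [1, 3]]            # hypothetical table with a=3, b=1, c=1, d=3
print(quadrant_sums(table, 0, 0))   # (3, 0)
print(summed_counts(table))         # (18, 2), i.e. (2ad, 2bc)
```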

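12 The γ coefficient for 2×2 tables

a  b
c  d

nC = ad    nD = bc

\[ \gamma = \frac{n_C - n_D}{n_C + n_D} = \frac{ad - bc}{ad + bc} = \frac{\frac{ad}{bc} - 1}{\frac{ad}{bc} + 1} = \frac{OR - 1}{OR + 1} \]

\[ \gamma = \frac{OR - 1}{OR + 1} \qquad OR = \frac{1 + \gamma}{1 - \gamma} \]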

Gamma, odds ratio and logit values

gamma   odds ratio   logit = ln(odds ratio)
-1.00       0.00        -∞
-0.90       0.05       -2.94
-0.80       0.11       -2.20
-0.70       0.18       -1.73
-0.60       0.25       -1.39
-0.50       0.33       -1.10
-0.40       0.43       -0.85
-0.30       0.54       -0.62
-0.20       0.67       -0.41
-0.10       0.82       -0.20
 0.00       1.00        0.00
 0.10       1.22        0.20
 0.20       1.50        0.41
 0.30       1.86        0.62
 0.40       2.33        0.85
 0.50       3.00        1.10
 0.60       4.00        1.39
 0.70       5.67        1.73
 0.80       9.00        2.20
 0.90      19.00        2.94
 1.00        +∞         +∞

Note: logit ≈ 2·gamma in the interval [-0.30, 0.30]
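The conversions behind the table are one-liners. The sketch below (function names are assumptions) prints γ, the odds ratio, the logit and 2γ side by side, so the quality of the approximation logit ≈ 2γ can be inspected for small γ.

```python
import math


def gamma_from_or(odds_ratio):
    """Gamma of a 2x2 table expressed through its odds ratio."""
    return (odds_ratio - 1) / (odds_ratio + 1)


def or_from_gamma(gamma):
    """Inverse relation; undefined for gamma = 1."""
    return (1 + gamma) / (1 - gamma)


for g in (-0.5, -0.3, -0.1, 0.0, 0.1, 0.3, 0.5, 0.9):
    odds_ratio = or_from_gamma(g)
    logit = math.log(odds_ratio)
    print(f"{g:5.2f} {odds_ratio:6.2f} {logit:6.2f} {2 * g:6.2f}")
```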

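14 Properties of the estimate of γ

The estimate is asymptotically unbiased, E(G) ≈ γ in large samples,

and asymptotically normally distributed

with standard error, s1, given by

\[ s_1^2 = \frac{16}{(n_C + n_D)^4} \sum_{x,y} n_{xy} \left( n_D A_{xy} - n_C D_{xy} \right)^2 \]

If X and Y are independent, then the standard error, s0, of G is given by

\[ s_0^2 = \frac{4}{(n_C + n_D)^2} \left[ \sum_{x,y} n_{xy} \left( A_{xy} - D_{xy} \right)^2 - \frac{(n_C - n_D)^2}{n} \right] \]

In both formulas nC and nD are the sums of nxy·Axy and nxy·Dxy over all cells, as defined in the notation above.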

15 Statistical inference

95 % confidence intervals

G ± 1.96·s1

Test of significance

If X and Y are independent then

z = G / s0 ≈ Norm(0,1)

Notice that confidence intervals and assessment of significance use different estimates of the standard error
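The two standard errors, the confidence interval and the z statistic can be sketched as follows for the smoking table. This is an illustration under the definitions given above, with nC and nD computed as the sums of nxy·Axy and nxy·Dxy; the function name gamma_inference is an assumption and the code is not the program used for the slides.

```python
import math

TABLE = [
    [16, 73, 6, 1],
    [15, 75, 6, 0],
    [13, 59, 7, 1],
    [10, 81, 17, 3],
    [1, 29, 3, 1],
]


def gamma_inference(table):
    """Return (G, s1, s0, z) for a table stored as a list of rows."""
    n = sum(map(sum, table))
    cells = []                      # (n_xy, A_xy, D_xy) for every cell
    n_c = n_d = 0
    for i, row in enumerate(table):
        for j, n_ij in enumerate(row):
            a = d = 0
            for x, other in enumerate(table):
                for y, m in enumerate(other):
                    sign = (x - i) * (y - j)
                    if sign > 0:
                        a += m
                    elif sign < 0:
                        d += m
            cells.append((n_ij, a, d))
            n_c += n_ij * a
            n_d += n_ij * d
    g = (n_c - n_d) / (n_c + n_d)
    var1 = 16 / (n_c + n_d) ** 4 * sum(m * (n_d * a - n_c * d) ** 2 for m, a, d in cells)
    var0 = 4 / (n_c + n_d) ** 2 * (
        sum(m * (a - d) ** 2 for m, a, d in cells) - (n_c - n_d) ** 2 / n
    )
    s1, s0 = math.sqrt(var1), math.sqrt(var0)
    return g, s1, s0, g / s0


g, s1, s0, z = gamma_inference(TABLE)
print(f"G = {g:.2f}, s1 = {s1:.3f}, s0 = {s0:.3f}")
print(f"95% CI = ({g - 1.96 * s1:.2f}, {g + 1.96 * s1:.2f}), z = {z:.2f}")
```

For a 2×2 table the s1 formula reduces to the familiar standard error of Yule's Q, 0.5·(1 - Q²)·sqrt(1/a + 1/b + 1/c + 1/d), which gives a quick sanity check of an implementation.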

16 The example

SMOKE45 (J) | HEALTH51 (B)            |
            | Vgood  Good  Fair   Bad |  TOTAL
------------+-------------------------+-------
Never       |    16    73     6     1 |     96
       row% |  16.7  76.0   6.3   1.0 |  100.0
No mo       |    15    75     6     0 |     96
       row% |  15.6  78.1   6.3   0.0 |  100.0
1-14        |    13    59     7     1 |     80
       row% |  16.3  73.8   8.8   1.3 |  100.0
15-24       |    10    81    17     3 |    111
       row% |   9.0  73.0  15.3   2.7 |  100.0
25+         |     1    29     3     1 |     34
       row% |   2.9  85.3   8.8   2.9 |  100.0
------------+-------------------------+-------
TOTAL       |    55   317    39     6 |    417
       row% |  13.2  76.0   9.4   1.4 |  100.0

χ² = 16.2, df = 12, p = 0.182
γ = 0.24, p < 0.0005

Very strong evidence of an effect of smoking on health

For ordinal variables, γ is much more powerful than χ²-distributed test statistics

17 Exact conditional inference

The problem:

Can the distributions of estimates and test statistics be approximated by their asymptotic distributions in small and moderate samples?

The small number of persons with bad health would result in warnings from most statistical programs that asymptotics probably do not work.

If in doubt use exact conditional tests instead of asymptotic tests.

18 The hypergeometric distribution

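The contingency table: nxy, x = 1,…,c, y = 1,…,r

The margins of the table:

\[ n_{x+} = \sum_y n_{xy} \qquad n_{+y} = \sum_x n_{xy} \qquad n = \sum_{x,y} n_{xy} \]

The probability of the table

\[ P(n_{11},\ldots,n_{cr}) = \binom{n}{n_{11} \ldots n_{cr}} \prod_{x,y} p_{xy}^{n_{xy}} \]

H0: pxy = px+ p+y

\[ P(n_{11},\ldots,n_{cr}) = \binom{n}{n_{11} \ldots n_{cr}} \prod_x p_{x+}^{n_{x+}} \prod_y p_{+y}^{n_{+y}} \]

The marginal tables, nx+ and n+y, are sufficient under H0

\[ P(n_{11},\ldots,n_{cr} \mid n_{1+},\ldots,n_{c+}, n_{+1},\ldots,n_{+r}) = \frac{\left( \prod_x n_{x+}! \right) \left( \prod_y n_{+y}! \right)}{n! \prod_{x,y} n_{xy}!} \]

does not depend on unknown parameters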
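The conditional probability is a ratio of factorials and can be evaluated exactly. The sketch below uses exact fractions to avoid overflow and rounding; the function name and the 2×2 example table are assumptions for illustration.

```python
from fractions import Fraction
from math import factorial, prod


def conditional_probability(table):
    """P(table | margins) under H0, as an exact fraction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    numerator = prod(factorial(t) for t in row_totals + col_totals)
    denominator = factorial(n) * prod(factorial(c) for row in table for c in row)
    return Fraction(numerator, denominator)


# Hypothetical 2x2 table with all margins equal to 4
print(conditional_probability([[3, 1], [1, 3]]))   # 8/35
```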

20 The exact conditional test procedure 1

Find all tables with the same marginal tables as the observed table.

For each of these tables calculate:

The conditional probability of the table
The test statistics of interest

The exact p-value = the sum of probabilities for tables with test statistics that are at least as extreme as the test statistic of the observed table
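The steps above can be sketched for the simplest case, a 2×2 table, where every table with the observed margins is determined by its upper-left cell. The sketch uses G as the test statistic and a one-sided p-value (large G is extreme); both choices, and the function names, are assumptions made for illustration.

```python
from math import comb


def gamma_2x2(a, b, c, d):
    """Gamma for a 2x2 table; NaN when there are no untied pairs."""
    n_c, n_d = a * d, b * c
    return (n_c - n_d) / (n_c + n_d) if n_c + n_d else float("nan")


def exact_p_value(a, b, c, d):
    """One-sided exact conditional p-value P(G >= g_obs | margins)."""
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2
    g_obs = gamma_2x2(a, b, c, d)
    p = 0.0
    # Enumerate all tables with the observed margins via the upper-left cell k
    for k in range(max(0, c1 - r2), min(r1, c1) + 1):
        prob = comb(r1, k) * comb(r2, c1 - k) / comb(n, c1)
        g = gamma_2x2(k, r1 - k, c1 - k, r2 - c1 + k)
        if g >= g_obs - 1e-12:       # at least as extreme as observed
            p += prob
    return p


print(exact_p_value(3, 1, 1, 3))
```

For the hypothetical table (3, 1, 1, 3) the extreme tables are those with upper-left cell 3 or 4, so the p-value is 17/70.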

21 The exact conditional test procedure

Test statistic T(M), where M is an r×c table

Observed test statistic = tobs

The exact p-value

\[ p_{exact} = \sum_{M:\; m_{x+} = n_{x+},\; m_{+y} = n_{+y},\; T(M) \ge t_{obs}} P(M \mid n_{1+},\ldots,n_{c+}, n_{+1},\ldots,n_{+r}) \]

Fisher's exact test for 2×2 tables

Also appropriate for r×c tables, but may be very time consuming due to the number of tables fitting the margins

22 The Monte Carlo test

Since the conditional probabilities are known exactly, we may ask the computer to generate a random sample consisting of a large number of independent tables from this distribution:

The MC test procedure:

Generate tables:

M1,…,MNsim

Calculate test statistics for each table:

Ti = T(Mi), i = 1,…,Nsim

Count the number of random test statistics which are at least as extreme as the observed statistic

\[ S = \sum_{i=1}^{N_{sim}} 1\{T_i \ge t_{obs}\} \]

pMC = S/Nsim is an unbiased estimate of pexact
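A minimal sketch of the Monte Carlo procedure for the smoking table follows. Random tables with the observed margins are generated by shuffling the Y values over the persons while keeping the X values fixed, which samples from the conditional distribution given above. G is the test statistic and large values count as extreme; the function names, the seed and the choice Nsim = 1000 are assumptions for illustration.

```python
import math
import random

TABLE = [
    [16, 73, 6, 1],
    [15, 75, 6, 0],
    [13, 59, 7, 1],
    [10, 81, 17, 3],
    [1, 29, 3, 1],
]


def gamma(table):
    """G for a table stored as a list of rows (0.0 if all pairs are tied)."""
    cells = [(x, y, n) for x, row in enumerate(table) for y, n in enumerate(row) if n]
    n_c = n_d = 0
    for x1, y1, m1 in cells:
        for x2, y2, m2 in cells:
            sign = (x1 - x2) * (y1 - y2)
            if sign > 0:
                n_c += m1 * m2
            elif sign < 0:
                n_d += m1 * m2
    return (n_c - n_d) / (n_c + n_d) if n_c + n_d else 0.0


def monte_carlo_p(table, n_sim=1000, seed=1):
    """Estimate the one-sided exact p-value P(G >= g_obs | margins)."""
    rng = random.Random(seed)
    n_cols = len(table[0])
    # One (x, y) record per person, stored as two parallel lists
    xs = [x for x, row in enumerate(table) for n in row for _ in range(n)]
    ys = [y for row in table for y, n in enumerate(row) for _ in range(n)]
    t_obs = gamma(table)
    s = 0
    for _ in range(n_sim):
        rng.shuffle(ys)                      # keeps both margins fixed
        sim = [[0] * n_cols for _ in table]
        for x, y in zip(xs, ys):
            sim[x][y] += 1
        if gamma(sim) >= t_obs:
            s += 1
    return s / n_sim


p_mc = monte_carlo_p(TABLE)
se = math.sqrt(p_mc * (1 - p_mc) / 1000)     # binomial standard error of p_MC
print(p_mc, se)
```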

The standard error of pMC depends on Nsim

23 Sequential Monte Carlo tests

The sequential Monte Carlo test interrupts the Monte Carlo procedure when it becomes obvious that the test statistic will not be significant

Nsim = 10,000
Critical level of the test = 5 %

The sequential Monte Carlo test interrupts the Monte Carlo procedure when the number of tables with T(Mi) ≥ tobs is equal to 501:

S ≥ 501

pMC ≥ 501/10000 > 0.05
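The stopping rule can be sketched independently of how the tables are generated. In the sketch below, simulate_statistic is a hypothetical callable that generates one random table and returns its test statistic; a standard normal draw is used as a stand-in so that the example runs on its own.

```python
import math
import random


def sequential_mc(simulate_statistic, t_obs, n_sim=10_000, alpha=0.05):
    """Return (S, number of tables generated, stopped_early)."""
    # Smallest S that makes S / n_sim exceed alpha (501 for the values above)
    stop_count = math.floor(n_sim * alpha + 1e-9) + 1
    s = 0
    for generated in range(1, n_sim + 1):
        if simulate_statistic() >= t_obs:
            s += 1
            if s >= stop_count:
                # p_MC can no longer fall to alpha or below, so stop here
                return s, generated, True
    return s, n_sim, False


rng = random.Random(0)
# Stand-in for "generate a table and compute T(M)"
s, generated, stopped = sequential_mc(lambda: rng.gauss(0, 1), t_obs=0.5)
print(s, generated, stopped)
```

When the procedure stops early, S/Nsim is only a lower bound for pMC, which is all that is needed to conclude that the test is not significant at the chosen level.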

24 Repeated Monte Carlo tests

The repeated Monte Carlo test interrupts the Monte Carlo procedure when the “risk” of a significant pMC-value has become very small

Parameters of the repeated Monte Carlo test:

Nsim = the total number of tables to be generated
Nstart = the minimum number of tables to be generated
Critical value
Max risk of stopping too soon (default = 0.1 %)

25 Smoking and Self reported health

SMOKE45 (J) | HEALTH51 (B)            |
            | V.goo  Good  Fair   Bad |  TOTAL
------------+-------------------------+-------
Never       |    16    73     6     1 |     96
       row% |  16.7  76.0   6.3   1.0 |  100.0
No mo       |    15    75     6     0 |     96
       row% |  15.6  78.1   6.3   0.0 |  100.0
1-14        |    13    59     7     1 |     80
       row% |  16.3  73.8   8.8   1.3 |  100.0
15-24       |    10    81    17     3 |    111
       row% |   9.0  73.0  15.3   2.7 |  100.0
25+         |     1    29     3     1 |     34
       row% |   2.9  85.3   8.8   2.9 |  100.0
------------+-------------------------+-------
TOTAL       |    55   317    39     6 |    417
       row% |  13.2  76.0   9.4   1.4 |  100.0

X² = 16.2, df = 12, p = 0.182
Gam = 0.24, p = 0.000

Confounding?

26 Analysis of the conditional association given self-reported health at 45 years

HEALTH45 (G) = 1, Very good

SMOKE45 (J) | HEALTH51 (B)            |
            | V.goo  Good  Fair   Bad |  TOTAL
------------+-------------------------+-------
Never       |     4    12     0     0 |     16
       row% |  25.0  75.0   0.0   0.0 |  100.0
No mo       |     5     7     0     0 |     12
       row% |  41.7  58.3   0.0   0.0 |  100.0
1-14        |     9     6     0     0 |     15
       row% |  60.0  40.0   0.0   0.0 |  100.0
15-24       |     2     3     0     0 |      5
       row% |  40.0  60.0   0.0   0.0 |  100.0
25+         |     0     3     0     0 |      3
       row% |   0.0 100.0   0.0   0.0 |  100.0
------------+-------------------------+-------
TOTAL       |    20    31     0     0 |     51
       row% |  39.2  60.8   0.0   0.0 |  100.0

X² = 6.0, df = 4, p = 0.196
Gam = -0.18, p = 0.188

HEALTH45 (G) = 2, Good

SMOKE45 (J) | HEALTH51 (B)            |
            | V.goo  Good  Fair   Bad |  TOTAL
------------+-------------------------+-------
Never       |    11    55     5     1 |     72
       row% |  15.3  76.4   6.9   1.4 |  100.0
No mo       |    10    59     5     0 |     74
       row% |  13.5  79.7   6.8   0.0 |  100.0
1-14        |     3    50     4     1 |     58
       row% |   5.2  86.2   6.9   1.7 |  100.0
15-24       |     6    76     8     1 |     91
       row% |   6.6  83.5   8.8   1.1 |  100.0
25+         |     1    25     1     0 |     27
       row% |   3.7  92.6   3.7   0.0 |  100.0
------------+-------------------------+-------
TOTAL       |    31   265    23     3 |    322
       row% |   9.6  82.3   7.1   0.9 |  100.0

X² = 9.8, df = 12, p = 0.636
Gam = 0.17, p = 0.041

HEALTH45 (G) = 3, Fair

SMOKE45 (J) | HEALTH51 (B)            |
            | V.goo  Good  Fair   Bad |  TOTAL
------------+-------------------------+-------
Never       |     1     6     1     0 |      8
       row% |  12.5  75.0  12.5   0.0 |  100.0
No mo       |     0     6     1     0 |      7
       row% |   0.0  85.7  14.3   0.0 |  100.0
1-14        |     1     3     3     0 |      7
       row% |  14.3  42.9  42.9   0.0 |  100.0
15-24       |     2     1     6     2 |     11
       row% |  18.2   9.1  54.5  18.2 |  100.0
25+         |     0     1     2     1 |      4
       row% |   0.0  25.0  50.0  25.0 |  100.0
------------+-------------------------+-------
TOTAL       |     4    17    13     3 |     37
       row% |  10.8  45.9  35.1   8.1 |  100.0

X² = 17.5, df = 12, p = 0.131
Gam = 0.52, p = 0.001

HEALTH45 (G) = 4, Bad

SMOKE45 (J) | HEALTH51 (B)            |
            | V.goo  Good  Fair   Bad |  TOTAL
------------+-------------------------+-------
Never       |     0     0     0     0 |      0
       row% |   0.0   0.0   0.0   0.0 |    0.0
No mo       |     0     3     0     0 |      3
       row% |   0.0 100.0   0.0   0.0 |  100.0
1-14        |     0     0     0     0 |      0
       row% |   0.0   0.0   0.0   0.0 |    0.0
15-24       |     0     1     3     0 |      4
       row% |   0.0  25.0  75.0   0.0 |  100.0
25+         |     0     0     0     0 |      0
       row% |   0.0   0.0   0.0   0.0 |    0.0
------------+-------------------------+-------
TOTAL       |     0     4     3     0 |      7
       row% |   0.0  57.1  42.9   0.0 |  100.0

X² = 3.9, df = 1, p = 0.047
Gam = 1.00, p = 0.001

** Local test results for strata defined by HEALTH45 (G) **

                            p-values                 p-values (1-sided)
G: HEALTH45     X²    df   asympt    exact   Gamma   asympt    exact
----------------------------------------------------------------------
1: V.good     6.04     4   0.1960   0.1880   -0.18   0.1884   0.2050
2: Good       9.77    12   0.6358   0.6310    0.17   0.0410   0.0290
3: Fair      17.52    12   0.1311   0.1430    0.52   0.0010   0.0030
4: Bad        3.94     1   0.0472   0.1540    1.00   0.0006   0.1220
----------------------------------------------------------------------

29 Tests of conditional independence

H0: P(X,Y | Z=z) = P(X | Z=z) P(Y | Z=z) for all z

Z     X×Y table         Concordance and discordance   Test statistics
1     table given Z=1   N1C and N1D                   χ²1, γ1 = (N1C - N1D)/(N1C + N1D)
2     table given Z=2   N2C and N2D                   χ²2, γ2 = (N2C - N2D)/(N2C + N2D)
...
k     table given Z=k   NkC and NkD                   χ²k, γk = (NkC - NkD)/(NkC + NkD)

All test statistics must be insignificant

Global tests of conditional independence

The global χ²

\[ \chi^2 = \sum_z \chi^2_z \qquad df = \sum_z df_z \]

The partial γ coefficient

\[ N_C = \sum_z N_{zC} \qquad N_D = \sum_z N_{zD} \]

\[ \gamma_{partial} = \frac{N_C - N_D}{N_C + N_D} = \frac{\sum_z (N_{zC} - N_{zD})}{\sum_z (N_{zC} + N_{zD})} = \sum_z w_z \gamma_z \]

\[ \text{where } w_z = \frac{N_{zC} + N_{zD}}{\sum_i (N_{iC} + N_{iD})} \]

γ partial is a weighted mean of the local γ coefficients, with an asymptotic normal distribution

Monte Carlo approximation of exact conditional p-values as for 2-way tables
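The partial γ can be sketched from the four stratified tables shown above. The code below counts unordered concordant and discordant pairs within each HEALTH45 stratum and checks that the pooled ratio equals the weighted mean of the local coefficients; the names STRATA and pair_counts are assumptions for illustration.

```python
# Rows = SMOKE45 (Never .. 25+), columns = HEALTH51 (V.good .. Bad), one table per HEALTH45 stratum
STRATA = {
    "V.good": [[4, 12, 0, 0], [5, 7, 0, 0], [9, 6, 0, 0], [2, 3, 0, 0], [0, 3, 0, 0]],
    "Good": [[11, 55, 5, 1], [10, 59, 5, 0], [3, 50, 4, 1], [6, 76, 8, 1], [1, 25, 1, 0]],
    "Fair": [[1, 6, 1, 0], [0, 6, 1, 0], [1, 3, 3, 0], [2, 1, 6, 2], [0, 1, 2, 1]],
    "Bad": [[0, 0, 0, 0], [0, 3, 0, 0], [0, 0, 0, 0], [0, 1, 3, 0], [0, 0, 0, 0]],
}


def pair_counts(table):
    """Return (N_zC, N_zD), the unordered concordant and discordant pair counts."""
    n_c = n_d = 0
    for i, row in enumerate(table):
        for j, n in enumerate(row):
            for x in range(i + 1, len(table)):      # only rows below row i
                for y, m in enumerate(table[x]):
                    if y > j:
                        n_c += n * m
                    elif y < j:
                        n_d += n * m
    return n_c, n_d


counts = {z: pair_counts(t) for z, t in STRATA.items()}
total = sum(c + d for c, d in counts.values())
gammas = {z: (c - d) / (c + d) for z, (c, d) in counts.items()}
weights = {z: (c + d) / total for z, (c, d) in counts.items()}

partial = sum(c - d for c, d in counts.values()) / total
weighted_mean = sum(weights[z] * gammas[z] for z in STRATA)

for z in STRATA:
    print(f"{z:7s} gamma = {gammas[z]:5.2f}  weight = {weights[z]:.3f}")
print(f"partial gamma = {partial:.2f}")
```

The weights are proportional to the number of untied pairs in each stratum, so a large stratum with many untied pairs dominates the partial coefficient.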


Global χ² = 37.3, df = 29, p = 0.139, pexact = 0.148

γ partial = 0.17, p = 0.034, pexact = 0.027

32 Are the local γ coefficients homogeneous?

Least squares estimate: Gamma = 0.1998, s.e. = 0.0777

G: HEALTH45   Gamma   variance     s.e.   weight   residual
------------------------------------------------------------
1: V.good     -0.18     0.0397   0.1993    0.152     -2.060
2: Good        0.17     0.0097   0.0987    0.620     -0.411
3: Fair        0.52     0.0264   0.1625    0.228      2.237
4: Bad         1.00     standard error is not available
------------------------------------------------------------

Incomplete set of Gammas

Test for partial association: X² = 7.5, df = 2, p = 0.023

Pairwise comparisons of strata:

Comparison of strata 1 and 2: p = 0.11

Significant difference between strata 1+2 and stratum 3: p = 0.025

Notice the similarity of the analysis of γ coefficients and Mantel-Haenszel analysis of odds ratios
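The homogeneity analysis above can be approximately reproduced from the reported stratum estimates. The reported weights are consistent with inverse-variance weighting, so the sketch below assumes that this is how the least squares estimate was formed; that is an assumption about the software output, not something the slides state. The inputs are the rounded values from the table, so the results agree with the reported ones only to about two decimals.

```python
import math

# Rounded stratum estimates and variances from the table (stratum 4 has no standard error)
gammas = [-0.18, 0.17, 0.52]
variances = [0.0397, 0.0097, 0.0264]

precisions = [1 / v for v in variances]
weights = [p / sum(precisions) for p in precisions]

g_ls = sum(w * g for w, g in zip(weights, gammas))      # weighted least squares estimate
se_ls = math.sqrt(1 / sum(precisions))
x2 = sum((g - g_ls) ** 2 / v for g, v in zip(gammas, variances))
df = len(gammas) - 1
p_value = math.exp(-x2 / 2)     # chi-square tail probability; this closed form holds for df = 2 only

print([round(w, 3) for w in weights])
print(round(g_ls, 3), round(se_ls, 4), round(x2, 1), df, round(p_value, 3))
```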
