Nonparametric Correlation Techniques
Techniques for Correlating Nominal & Ordinal Variables

KEY CONCEPTS

Nonparametric Correlation Techniques
Scales of measurement
Nominal scale
Ordinal scale
Interval scale
Ratio scale
Metric vs. nonmetric variables
Spearman Rank-Order Correlation Coefficient: Rho (ρ)
Rho assumptions
Null hypothesis in rho
One- and two-tailed hypotheses
Reducing metric variables to ordinal scales of measurement
Resolving the problem of tied ranks
Goodman's & Kruskal's Gamma (γ)
Gamma assumptions
Null hypothesis in gamma
The concepts of consistency & inconsistency in gamma
Using Z to determine the significance of gamma
The Phi Coefficient (φ)
Phi assumptions
Null hypothesis in phi
The relationship between phi and chi-square
The Contingency Coefficient (C)
C assumptions
Null hypothesis in C
The relationship between C and chi-square
The relationship between C and phi
Limitation in the values that C can take
Cramér's V
V assumptions
Null hypothesis in V
The relationship between V and chi-square
Guttman's Lambda (λ)
Lambda assumptions
Null hypothesis in lambda
Lambda as an asymmetrical correlation coefficient
The concept of the reduction of the error in prediction
PRE: Proportionate reduction of error

Nonparametric Correlation Techniques: Charles M. Friel PhD, Criminal Justice Center, Sam Houston State University

Lecture Outline

What are nonparametric correlation techniques, and what kinds of research problems are they designed to solve?
Spearman Rank-Order Correlation Coefficient: Rho (ρ)
Goodman's & Kruskal's Gamma (γ)
The Phi Coefficient (φ)
Contingency Coefficient (C)
Cramér's V
Guttman's Coefficient of Predictability: Lambda (λ)

Nonparametric Correlation Techniques

If the variables X and Y are metric (i.e. interval or ratio measures) and they are to be correlated, then the appropriate technique is Pearson's Product-Moment Correlation Coefficient.

$r = \dfrac{\sum xy}{\sqrt{\sum x^2 \sum y^2}}$

Q: What if X and/or Y is nonmetric (i.e. a nominal or ordinal measure)? How can they be correlated?
A: By use of one of a variety of nonparametric correlational techniques.

Nonparametric correlational techniques are designed to estimate the correlation or association between variables measured on nominal and/or ordinal scales, or metric variables that have been reduced to nominal and/or ordinal scales.

Spearman Rank-Order Correlation Coefficient: ρ (rho)

$\rho = 1 - \dfrac{6\sum D^2}{N(N^2 - 1)}$

A technique for determining the correlation between two ordinal variables, or metric variables reduced to an ordinal scale.

Assumptions
The two variables are ordinal, or metric variables that have been reduced to an ordinal scale of measurement,
The correlation between the variables is linear, and
If a test of significance is applied, the sample has been selected randomly from the population.

An Example

A prosecutor received 10 felony cases filed by an interagency organized crime task force and ranked the cases by seriousness (serious = X) and relative prosecutability (prosecute = Y).*

Case    Serious (X)   Prosecute (Y)     D    D^2
A            6              3           3      9
B            1             10          -9     81
C            4              7          -3      9
D            7              5           2      4
E           10              1           9     81
F            3              8          -5     25
G            8              2           6     36
H            9              4           5     25
I            5              6          -1      1
J            2              9          -7     49
Total                                        320

*(Rankings: 1 = the highest and 10 = the lowest)
D = the difference between the rank positions of each case on X and Y.
N = the number of paired observations (cases).

Calculation of Rho (ρ)

$\rho = 1 - \dfrac{6\sum D^2}{N(N^2 - 1)}$
$= 1 - \dfrac{(6)(320)}{10(10^2 - 1)}$
$= 1 - \dfrac{1920}{10(99)}$
$= 1 - \dfrac{1920}{990}$
$= -0.939$

Interpretation
The correlation is negative and the magnitude is high.
As the seriousness of the crime increases, its prosecutability decreases.

Spearman's Rho SPSS Results

Rho = -0.939
Two-tailed level of significance: p < 0.001

Reducing a Metric Variable to an Ordinal Scale of Measurement

What is the correlation between the rank-ordered seriousness of 8 offences (ordinal variable) and the length of the sentences received by their perpetrators (ratio variable)?

Case    Seriousness   Sentence length (years)   Rank of sentence     D    D^2
A            5                  6                      5             0      0
B            2                  3                      2             0      0
C            7                  7                      6            +1      1
D            1                  2                      1             0      0
E            6                  8                      7            -1      1
F            3                  5                      4            -1      1
G            8                 10                      8             0      0
H            4                  4                      3            +1      1
Total                                                                       4

Seriousness of offence is rank-ordered from least serious (rank = 1) to most serious (rank = 8).
The length of sentence is rank-ordered from lowest (rank = 1) to highest (rank = 8).

Computation of rho

$\rho = 1 - \dfrac{6\sum D^2}{N(N^2 - 1)}$
$= 1 - \dfrac{(6)(4)}{8(8^2 - 1)}$
$= 1 - \dfrac{24}{504}$
$= +0.952$

Interpretation
The relationship is positive and the magnitude of the correlation is high.
As the seriousness of the offence increases, the length of sentence increases as well.

The Problem of Tied Ranks

In converting a metric variable to an ordinal scale of measurement, some cases may have tied values.
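As a minimal sketch of the computations so far, the rank-difference formula can be checked in a few lines of Python. The function and variable names here are illustrative and not part of the lecture. The ranking helper assumes there are no tied values, so the tie problem just described does not yet arise, and it raises an error if it meets one.

```python
def ranks_without_ties(values):
    """Rank values from lowest (rank 1) to highest (rank N).

    Assumption for this sketch: no two values are equal.
    """
    if len(set(values)) != len(values):
        raise ValueError("tied values need averaged ranks; not handled here")
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman_rho(x_ranks, y_ranks):
    """Rank-difference formula: rho = 1 - 6*sum(D^2) / (N*(N^2 - 1))."""
    n = len(x_ranks)
    sum_d2 = sum((x - y) ** 2 for x, y in zip(x_ranks, y_ranks))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))


# Prosecutor example: seriousness (X) and prosecutability (Y) ranks, cases A-J.
serious = [6, 1, 4, 7, 10, 3, 8, 9, 5, 2]
prosecute = [3, 10, 7, 5, 1, 8, 2, 4, 6, 9]
rho_prosecutor = spearman_rho(serious, prosecute)

# Sentence example: seriousness ranks and sentence length in years, cases A-H.
seriousness = [5, 2, 7, 1, 6, 3, 8, 4]
sentence_years = [6, 3, 7, 2, 8, 5, 10, 4]
sentence_ranks = ranks_without_ties(sentence_years)  # metric -> ordinal
rho_sentence = spearman_rho(seriousness, sentence_ranks)

print(round(rho_prosecutor, 3), round(rho_sentence, 3))  # -0.939 0.952
```

The second example ranks the sentence lengths before applying the formula, which is the reduction of a metric variable to an ordinal scale.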
(Tied scores, shaded in the original table, are marked here with an asterisk.)

Case    Seriousness   Sentence length (years)   Sentence rank position   Rank of sentence      D     D^2
A            5                  6*                       4                     4.5            0.5    0.25
B            2                  2*                       1                     1.5            0.5    0.25
C            7                  7                        6                     6              1.0    1.00
D            1                  2*                       2                     1.5           -0.5    0.25
E            6                  8                        7                     7             -1.0    1.00
F            3                  6*                       5                     4.5           -1.5    2.25
G            8                 10                        8                     8              0.0    0.00
H            4                  4                        3                     3             +1.0    1.00
Total                                                                                                6.00

Cases B & D have tied sentences (2 years), as do cases A & F (6 years).
In a rank ordering, cases B & D occupy rank positions 1 & 2, while cases A & F occupy rank positions 4 & 5.
To determine the appropriate rank of tied cases, add the rank positions and divide by the number of tied cases.

For cases B & D: (1 + 2) / 2 = 1.5, so 1.5 is the rank assigned to cases B & D.
For cases A & F: (4 + 5) / 2 = 4.5, so 4.5 is the rank assigned to cases A & F.

Computation of rho

$\rho = 1 - \dfrac{6\sum D^2}{N(N^2 - 1)}$
$= 1 - \dfrac{(6)(6)}{8(8^2 - 1)}$
$= 1 - \dfrac{36}{504}$
$= +0.929$

Interpretation
The relationship is positive and the magnitude of the correlation is high.
As the seriousness of the offence increases, the length of sentence increases.

Spearman's Rho With Tied Ranks SPSS Results

Rho with tied ranks = +0.928
Two-tailed level of significance: p = 0.001
(The hand-computed +0.929 differs slightly because the rank-difference formula is exact only when no ranks are tied.)

Significance of Rho

In testing the significance of rho, the null hypothesis H0 states that the value of rho in the population from which the sample was drawn is 0.0.
Therefore, the statistical question becomes: what is the probability that the obtained value of rho in the sample could have come from such a population?
Given a sample size of N cases, a statistical table can be used to answer this question.

Table for Determining the Significance of Rho (table not reproduced in this text)

Critical Values of Rho in Testing Significance

Consider the three previous examples involving:
The prosecutor ranking the seriousness & prosecutability of criminal cases (N = 10),
The correlation of offence seriousness and sentence length (N = 8), and
The correlation of offence seriousness and sentence length involving tied cases (N = 8).

Example        N     Rho       Critical value (0.05)   Critical value (0.01)
Prosecutor    10    -0.939           0.648                   0.794
Sentence       8    +0.952           0.738                   0.881
Tied ranks     8    +0.929           0.738                   0.881

The absolute values of all three sample rhos exceed the critical value at the p = 0.01 level of significance.
Therefore, we are more than 99% confident in rejecting each of these null hypotheses (H0).

Derivation of the Spearman Rank-Order Correlation Coefficient (ρ)

Spearman's rank-order correlation coefficient (ρ) can be derived from Pearson's correlation coefficient (r). When r is computed on ranks:

$r = \dfrac{\sum xy}{\sqrt{\sum x^2 \sum y^2}} = \rho = 1 - \dfrac{6\sum d^2}{N(N^2 - 1)}$

(Here d is the rank difference, written D in the examples.)

If X and Y are ordinal variables ranked 1, 2, …, N, then

$\sum X = \sum Y = \dfrac{N(N+1)}{2}$

and

$\sum X^2 = \sum Y^2 = \dfrac{N(N+1)(2N+1)}{6}$

Given that

$\sum x^2 = \sum (X - \bar{X})^2 = \sum X^2 - \dfrac{(\sum X)^2}{N}$

and

$\sum y^2 = \sum (Y - \bar{Y})^2 = \sum Y^2 - \dfrac{(\sum Y)^2}{N}$

then for ordinal variables X & Y

$\sum x^2 = \dfrac{N(N+1)(2N+1)}{6} - \dfrac{[N(N+1)/2]^2}{N}$

$\sum x^2 = \dfrac{N(N+1)(2N+1)}{6} - \dfrac{1}{N}\left[\dfrac{N(N+1)}{2}\right]\left[\dfrac{N(N+1)}{2}\right]$

This can be reduced as follows:

$\sum x^2 = \dfrac{N(2N^2 + N + 2N + 1)}{6} - \dfrac{1}{N}\left[\dfrac{(N^2 + N)(N^2 + N)}{4}\right]$

$\sum x^2 = \dfrac{2N^3 + N^2 + 2N^2 + N}{6} - \dfrac{1}{N}\left[\dfrac{N^4 + N^3 + N^3 + N^2}{4}\right]$

$\sum x^2 = \dfrac{2N^3 + 3N^2 + N}{6} - \dfrac{1}{N}\left[\dfrac{N^4 + 2N^3 + N^2}{4}\right]$

$\sum x^2 = \dfrac{2N^3 + 3N^2 + N}{6} - \dfrac{N^4 + 2N^3 + N^2}{4N}$

$\sum x^2 = \dfrac{2N^3 + 3N^2 + N}{6} - \dfrac{N^3 + 2N^2 + N}{4}$

Nonparametric Correlation Techniques: Charles M.