
http://www.jmde.com/ Ideas to Consider

Analysis of Paired Dichotomous Data: A Gentle Introduction to the McNemar Test in SPSS

Omolola A. Adedokun Purdue University

Wilella D. Burgess Purdue University

Background: Although the McNemar test is the most appropriate tool for analyzing pre-post differences in dichotomous items (e.g., “yes” or “no”, “correct” or “incorrect”), many scholars have noted the inappropriate use of Pearson’s chi-square test by researchers, including social scientists and evaluators, for the analysis of related or dependent dichotomous variables.

Purpose: The goal of this paper is to promote the use of the McNemar test among evaluators by providing a gentle introduction to the method.

Setting: Not applicable.

Intervention: Not applicable.

Research Design: Not applicable.

Data Collection and Analysis: Using data from 506 sixth-grade students’ responses to a pre-post science test, this contribution illustrates how to conduct the McNemar test in SPSS.

Findings: This contribution provides a non-technical introduction to the McNemar test and illustrates its use in an applied research/evaluation context.

Keywords: McNemar Test; paired dichotomous data; Chi-square test; equality of proportions.

Journal of MultiDisciplinary Evaluation, Volume 8, Number 17, ISSN 1556-8180, January 2012

Program evaluation often involves the examination of pretest-posttest differences in dichotomous items (e.g., yes or no, correct or incorrect). Although the McNemar test is one of the very few statistical procedures available for pretest-posttest analysis of related dichotomous variables, it remains largely unpopular among evaluators. We suspect that this unpopularity is due to a lack of knowledge of the method among evaluators, with the exception of practitioners in the subfield of health evaluation, among whom the method is relatively popular (Berenson & Koppel, 2005).

The unpopularity of the McNemar test has been noted by other scholars. For example, Levin and Serlin (2000) as well as Berenson and Koppel (2005) noted that hypothesis testing in most courses for social scientists, including evaluators, often centers on mainstream tests of mean differences (e.g., dependent and independent t-tests, ANOVA) and of independent sample proportions (e.g., chi-square analysis), with little or no attention paid to the McNemar test of correlated proportions. In the same vein, earlier and more recent studies (e.g., Hoffman, 1976; Berenson & Koppel, 2005) have commented on the improper use of Pearson’s chi-square test, instead of the McNemar test, for the analysis of correlated or dependent dichotomous variables. Indeed, the inappropriate use of chi-square analysis for dependent dichotomous variables may lead evaluators to make misleading conclusions and recommendations to program stakeholders.

The purpose of this contribution is to promote the use of the McNemar test for the analysis of correlated proportions among evaluators. We do not intend to provide technical details of the McNemar test. Rather, our goal is to provide a gentle introduction to its application in a relatively non-technical manner. Using data from a science test, we illustrate the application of the McNemar test and how to interpret and present the findings. We chose SPSS because most social scientists, including evaluators, are familiar with its user-friendly interface (Heck, Thomas & Tabata, 2010). We begin with a general description of the McNemar procedure, followed by an illustration of how the analysis is conducted in SPSS.

Description of the McNemar Test

The McNemar test (1947) is best described as a 2×2 cross classification of paired (or matched) responses to a dichotomous item. In simple terms, the McNemar test can be viewed as a type of chi-square test that uses dependent (i.e., correlated or paired) data rather than independent (unrelated) samples. The McNemar test is a non-parametric statistical test; i.e., it is distribution free and can be used with data sets and samples that are not normally distributed (Ciechalski, et al., 2002).

For illustration, we use data from a pretest-posttest science test that was administered to 506 sixth-grade students to evaluate the impact of a curricular unit/module on their understanding of Mixtures and Solutions. Students’ responses to each of 15 questions on the test were scored as correct (1) or incorrect (0). Table 1 describes response patterns to one of the questions in a typical 2×2 format.

Table 1
Example of 2×2 Classification Table for McNemar Analysis

| Pretest | Posttest: Correct (1) | Posttest: Incorrect (0) | Total |
|---|---|---|---|
| Correct (1) | a = 55 | b = 35 | n1 = 90 |
| Incorrect (0) | c = 199 | d = 217 | |
| Total | n2 = 254 | | n = 506 |


Where:
a = number of students who gave correct responses in both the pretest and posttest
b = number of students who gave a correct response in the pretest but an incorrect response in the posttest
c = number of students who gave an incorrect response in the pretest but a correct response in the posttest
d = number of students who gave incorrect responses in both the pretest and posttest
n = total number of matched pairs (i.e., a + b + c + d)
n1 = total number of students who provided correct responses in the pretest (i.e., a + b)
n2 = total number of students who provided correct responses in the posttest (i.e., a + c)
p1 = proportion of correct responses in the pretest, i.e., n1/n or (a + b)/n
p2 = proportion of correct responses in the posttest, i.e., n2/n or (a + c)/n

Hypothesis Testing

Suppose we wish to examine pretest-posttest changes in the proportion of students who gave correct responses before and after the intervention. We then test the null hypothesis that p1 = p2. Because p1 - p2 = (a + b)/n - (a + c)/n = (b - c)/n, the concordant cells a and d cancel, so hypothesis testing for the McNemar test uses data only from the two discordant cells, b and c (see Table 1), where change has occurred, to test the equality of the two proportions (i.e., marginal homogeneity). The uncorrected test statistic [1] for the McNemar procedure is a chi-square statistic with 1 degree of freedom, $\chi^2 = (b - c)^2 / (b + c)$, and the corrected test statistic is $\chi^2 = (|b - c| - 1)^2 / (b + c)$. Interested readers should see Berenson & Koppel (2005), McNemar (1947), Lehr (2010), Feuer & Kessler (1989), and Marascuilo & Serlin (1979) for detailed explanations of the derivation of the test statistic.

[1] A limitation of the McNemar test is that it was designed for use with large samples. For small sample sizes, a correction formula like the Yates correction formula should be used instead of the McNemar test (Ciechalski, Pinkney and Weaver, 2002). Also, the asymptotic 2×2 McNemar test assumes that the number of discordant pairs (i.e., b + c) is equal to or larger than 10. Hence, use of an exact binomial test is recommended if there are fewer than 10 discordant pairs (Rufibach, 2011).

Conducting McNemar Test in SPSS

1. Inspect the data set to ascertain that responses are appropriately coded as 1 or 0. For example, run frequency counts of pretest and posttest responses to the item(s) (Analyze → Descriptive Statistics → Frequencies, then select the variables of interest and click “OK”).

2. Conduct the analysis. There are two ways to run the McNemar analysis using the drop-down menus in SPSS: (1) the binomial approach (described in Figure 1), which is better suited to small samples and/or cases where b + c is less than 10, and (2) the original McNemar approach (described in Figure 2), which is appropriate for large samples. The SPSS output for the binomial approach includes an exact binomial p-value and does not provide a test statistic value, whereas the output for the original method includes the value of the test statistic and incorporates a correction for continuity in the analysis.

3. Interpret results. The outputs for both methods include the p-value for the test (in this case, p < 0.05, indicating statistically significant differences between pretest and posttest responses). Thus, we can reject the null hypothesis that p1 = p2.
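For readers who want to check the tabulation outside SPSS, the following is a minimal Python sketch of the first step: confirming the 0/1 coding and arranging paired responses into the cells a, b, c and d of Table 1. The function names `paired_crosstab` and `marginal_proportions` are hypothetical, and the response lists are rebuilt from the Table 1 counts purely for illustration; they are not the original student records.

```python
from collections import Counter


def paired_crosstab(pre, post):
    """Tabulate paired 0/1 responses into the cells a, b, c, d of Table 1."""
    if len(pre) != len(post):
        raise ValueError("pretest and posttest must cover the same students")
    # Coding check: every response must be 1 (correct) or 0 (incorrect).
    bad = {v for v in list(pre) + list(post) if v not in (0, 1)}
    if bad:
        raise ValueError(f"responses must be coded 0 or 1, found {bad}")
    counts = Counter(zip(pre, post))  # keys are (pretest, posttest) pairs
    return {
        "a": counts[(1, 1)],  # correct in both tests
        "b": counts[(1, 0)],  # correct pretest, incorrect posttest
        "c": counts[(0, 1)],  # incorrect pretest, correct posttest
        "d": counts[(0, 0)],  # incorrect in both tests
    }


def marginal_proportions(cells):
    """Return (p1, p2): proportion correct in the pretest and in the posttest."""
    n = sum(cells.values())
    p1 = (cells["a"] + cells["b"]) / n
    p2 = (cells["a"] + cells["c"]) / n
    return p1, p2


# Illustrative data only: paired responses rebuilt to match the Table 1 counts.
pre = [1] * 55 + [1] * 35 + [0] * 199 + [0] * 217
post = [1] * 55 + [0] * 35 + [1] * 199 + [0] * 217

cells = paired_crosstab(pre, post)
p1, p2 = marginal_proportions(cells)
print(cells, round(p1, 3), round(p2, 3))
```

With the Table 1 counts, p1 = 90/506 and p2 = 254/506, so roughly 18% of students answered correctly before the unit and roughly 50% afterwards.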

Figure 1. Conducting McNemar Test in SPSS (Binomial Method)
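The binomial approach can also be sketched in a few lines of Python. This is an illustrative calculation under the usual null hypothesis that each discordant pair is equally likely to fall in cell b or cell c; it is not a reproduction of the SPSS procedure, and the function name `mcnemar_exact_p` is hypothetical.

```python
from math import comb


def mcnemar_exact_p(b: int, c: int) -> float:
    """Two-sided exact binomial p-value based on the discordant cells b and c.

    Under the null hypothesis, b follows a Binomial(b + c, 0.5) distribution.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of change
    k = min(b, c)
    # One-tail probability of a split at least as lopsided as the observed one.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the tail for a two-sided test and cap the result at 1.
    return min(1.0, 2.0 * tail)


# Discordant cells from Table 1: b = 35, c = 199.
p_value = mcnemar_exact_p(35, 199)
print(f"exact two-sided p = {p_value:.3g}")
```

For the Table 1 data the resulting p-value is far below 0.05. As a small hand-checkable case, b = 0 and c = 5 give a one-tail probability of 1/32 and a two-sided p-value of 0.0625.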


Figure 2. Conducting McNemar Test in SPSS (Traditional Method with Yates Continuity Correction)
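The chi-square form of the test, with and without the continuity correction, can be sketched as follows. This is a hedged illustration using only the Python standard library; the function names are hypothetical, and the p-value relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable.

```python
import math


def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    # P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) for standard normal Z.
    return math.erfc(math.sqrt(x / 2.0))


def mcnemar_chi2(b: int, c: int, corrected: bool = True) -> tuple[float, float]:
    """Return (statistic, p-value) for the asymptotic McNemar test."""
    if b + c == 0:
        raise ValueError("no discordant pairs; the statistic is undefined")
    diff = abs(b - c)
    if corrected:
        diff -= 1  # continuity correction: (|b - c| - 1)^2 / (b + c)
    stat = diff ** 2 / (b + c)
    return stat, chi2_sf_1df(stat)


# Discordant cells from Table 1: b = 35, c = 199.
stat_corrected, p_corrected = mcnemar_chi2(35, 199, corrected=True)
stat_uncorrected, p_uncorrected = mcnemar_chi2(35, 199, corrected=False)
print(round(stat_corrected, 2), round(stat_uncorrected, 2))
```

For Table 1 the corrected statistic is 163²/234 ≈ 113.54 and the uncorrected statistic is 164²/234 ≈ 114.94. Both far exceed 3.84, the 5% critical value for 1 degree of freedom, which agrees with the conclusion that p < 0.05.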

Two Sample McNemar Test

McNemar analysis has been extended to multiple group designs (Feuer & Kessler, 1989; Marascuilo & Serlin, 1979) to compare changes in pretest and posttest responses between experimental, gender, racial or age groups, when the researcher wishes to test the hypothesis that the marginal change is the same across groups. For example, in the case of the science test illustrated above, we may wish to examine whether the proportion of pretest-posttest change is the same for boys and girls.

The first step in testing the two-sample hypothesis is to conduct separate McNemar analyses for each group (Marascuilo & Serlin, 1979) to identify the discordant or “change” cells as shown in Table 3. To perform the separate McNemar analyses in SPSS, split the data by group (e.g., “boys” and “girls” data sets) and repeat the steps described above to generate Table 2.


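Table 2
Two Sample McNemar Test

| Pretest | Boys (284): Posttest Correct (1) | Boys (284): Posttest Incorrect (0) | Girls (222): Posttest Correct (1) | Girls (222): Posttest Incorrect (0) |
|---|---|---|---|---|
| Correct (1) | 29 | 20 | 26 | 15 |
| Incorrect (0) | 106 | 129 | 93 | 88 |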
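The two-sample procedure (a separate McNemar analysis per group, then a comparison of the discordant cells across groups) can be sketched with the Table 2 counts. This is an illustration under stated assumptions: the cross-group comparison uses a Pearson chi-square test of independence without a continuity correction, which may differ from the settings of the calculator used in the article, and all function names are hypothetical.

```python
import math

# Table 2 counts per group in the order (a, b, c, d), pretest/posttest:
# a = correct/correct, b = correct/incorrect,
# c = incorrect/correct, d = incorrect/incorrect.
TABLE_2 = {
    "boys": (29, 20, 106, 129),
    "girls": (26, 15, 93, 88),
}


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def mcnemar_corrected(b, c):
    """Continuity-corrected McNemar statistic and p-value for one group."""
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    return stat, chi2_sf_1df(stat)


def pearson_chi2_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 table."""
    (r1c1, r1c2), (r2c1, r2c2) = table
    n = r1c1 + r1c2 + r2c1 + r2c2
    row1, row2 = r1c1 + r1c2, r2c1 + r2c2
    col1, col2 = r1c1 + r2c1, r1c2 + r2c2
    stat = n * (r1c1 * r2c2 - r1c2 * r2c1) ** 2 / (row1 * row2 * col1 * col2)
    return stat, chi2_sf_1df(stat)


# Step 1: separate McNemar analysis for each group (uses only cells b and c).
per_group = {g: mcnemar_corrected(b, c) for g, (_, b, c, _) in TABLE_2.items()}

# Step 2: compile the discordant cells into a 2x2 table.
# Rows are the direction of change; columns are boys and girls.
discordant = [
    [TABLE_2["boys"][1], TABLE_2["girls"][1]],  # correct pretest, incorrect posttest
    [TABLE_2["boys"][2], TABLE_2["girls"][2]],  # incorrect pretest, correct posttest
]
stat, p = pearson_chi2_2x2(discordant)
print(per_group, round(stat, 2), round(p, 2))
```

Under these assumptions, the within-group statistics are large for both boys and girls, while the cross-group chi-square is about 0.18 (p ≈ 0.67), which is in line with the article's finding of no statistically significant gender difference in change.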

The values of the discordant pairs for each sample are then compiled into a separate 2×2 table (see Table 3), and a chi-square test of independence or a z-test of the equality of two proportions is performed to examine whether the probability of change (or marginal probabilities) is the same across both groups (Marascuilo & Serlin, 1979; Levin & Serlin, 2000; Howell, 2008). Fortunately, online calculators (e.g., http://www.graphpad.com/quickcalcs/contingency1.cfm ) are available for such analysis. For the example considered here, the chi-square analysis revealed no statistically significant gender difference in students’ pretest-posttest responses to the question.

Table 3
2×2 Classification Table for Two Sample McNemar Test

| Discordant Pairs | Boys | Girls |
|---|---|---|
| Correct Pretest & Incorrect Posttest | 20 | 15 |
| Incorrect Pretest & Correct Posttest | 106 | 93 |

Conclusion

This contribution provides a non-technical introduction to the McNemar test with the overarching goal of promoting its use and application among evaluators. The article illustrates the use of the McNemar test in an applied research context and seeks to discourage the use of chi-square analysis for examining paired or dependent variables, for which the McNemar test is the most appropriate. As illustrated above, the method is simple, quick and easy to perform, has high practical power, and “it enables an appropriate confirmatory data analysis for situations dealing with paired dichotomous responses to surveys or experiments” (Berenson & Koppel, 2005, p. 134). In addition to its use for testing the significance of changes in related proportions, McNemar analysis also has potential as an exploratory tool in dichotomous item analysis. Indeed, the method provides a useful addition to evaluators’ analysis toolkit.

This article is by no means an exhaustive introduction to McNemar analysis. There are other advanced topics related to the method that may be of interest to evaluators. For example, although the McNemar test works best when data are complete (i.e., with no missing data), methods for handling the analysis with missing data, including maximum likelihood strategies, have also been developed (Marascuilo, Omelich, & Gokhale, 1988). Hopefully, this non-technical illustration will serve as a guide or resource for evaluators, who are often more interested in the appropriate application of statistical methods than in their mathematical derivations and technical details. We also hope that this illustration will spur more evaluators to gain deeper knowledge of the method and its extensions.

Acknowledgements

Data for this study come from the Evaluation of the Indiana Science Initiative Project implemented by the Indiana Science Technology Engineering and Mathematics (I-STEM) Network and funded by the Lilly Foundation. The authors wish to thank Ann Bessenbacher for her role in data preparation.

References

Berenson, M. L., & Koppel, N. B. (2005). Why McNemar’s procedure needs to be included in the Business Statistics Curriculum. Decision Sciences Journal of Innovative Education, 3(1), 125-136.

Ciechalski, J. C., Pinkney, J. W., & Weaver, F. S. (2002). A method for assessing change in attitude: The McNemar Test. Poster presented at the Annual Meeting of the American Educational Research Association, New Orleans, LA.

Heck, R. H., Thomas, S. L., & Tabata, L. N. (2010). Multilevel and longitudinal modeling with IBM SPSS. Routledge, NY.

Hoffman, J. I. E. (1976). The incorrect use of Chi-square analysis for paired data. Clinical and Experimental Immunology, 24, 227-229.

Howell, D. C. (2008). Testing change over two measurements in two independent groups. Retrieved June 21, 2011 from http://www.uvm.edu/~dhowell/methods7/Supplements/Testing%20Dependent%20Proportions.pdf

Lachenbruch, P. A. (2005). McNemar test. In P. Armitage & T. Colton (Eds.), Encyclopedia of biostatistics (pp. 3062-3063). New York, NY: John Wiley and Sons.

Lehr, . G. (2010). McNemar Test. In S. Chow (Ed.), Encyclopedia of biopharmaceutical statistics (pp. 740-744). Informa Healthcare.

Levin, J. R., & Serlin, R. C. (2000). Changing students’ perspectives of McNemar’s test of change. Journal of Statistics Education, 8(2). Retrieved March 15, 2011 from www.amstat.org/publications/jse/secure/v8n2/levin.cfm

Marascuilo, L. A., Omelich, C. L., & Gokhale, D. V. (1988). Planned and post hoc methods for multiple-sample McNemar (1947) tests with missing data. Psychological Bulletin, 103, 238-245.

Marascuilo, L. A., & Serlin, R. C. (1979). Tests and contrasts for comparing change parameters for a multiple sample McNemar data model. British Journal of Mathematical and Statistical Psychology, 32, 105-112.
