Parameter Recovery for the Four-Parameter Unidimensional Binary IRT Model: A Comparison of Marginal Maximum Likelihood and Markov Chain Monte Carlo Approaches

A dissertation presented to the faculty of The Gladys W. and David H. Patton College of Education of Ohio University

In partial fulfillment of the requirements for the degree Doctor of Philosophy

Hoan Do
April 2021

© 2021 Hoan Do. All Rights Reserved.

This dissertation titled
Parameter Recovery for the Four-Parameter Unidimensional Binary IRT Model: A Comparison of Marginal Maximum Likelihood and Markov Chain Monte Carlo Approaches
by
HOAN DO
has been approved for the Department of Educational Studies and The Gladys W. and David H. Patton College of Education by

Gordon P. Brooks
Professor of Educational Studies

Renée A. Middleton
Dean, The Gladys W. and David H. Patton College of Education

Abstract

DO, HOAN, Ph.D., April 2021, Educational Research and Evaluation
Parameter Recovery for the Four-Parameter Unidimensional Binary IRT Model: A Comparison of Marginal Maximum Likelihood and Markov Chain Monte Carlo Approaches
Director of Dissertation: Gordon P. Brooks

This study assesses the parameter recovery accuracy of marginal maximum likelihood (MML) and two Markov chain Monte Carlo (MCMC) methods, Gibbs sampling and Hamiltonian Monte Carlo (HMC), under the four-parameter unidimensional binary item response function. Data were simulated under a fully crossed design with three sample size levels (1,000, 2,500, and 5,000 respondents) and two types of latent trait distribution (normal and negatively skewed). Results indicated that, in general, MML was more strongly affected by latent trait skewness than MCMC but also gained more from increases in sample size. The two MCMC methods retained an advantage, with lower RMSE of item parameter recovery across all conditions investigated, but the gap between MML and MCMC narrowed as sample size increased, regardless of latent trait distribution.
Gibbs and HMC provided nearly identical outcomes across all conditions, and no considerable difference between the two MCMC methods was detected.

Specifically, when θs were generated from a normal distribution, MML and MCMC estimated the b, c, and d parameters with little mean bias, even at N = 1,000. Estimates of the a parameter were positively biased for MML and negatively biased for MCMC, and mean bias for all methods was considerable in absolute value (> 0.10) even at N = 5,000. MML item parameter recovery became less biased than Gibbs and HMC at N = 5,000. Under normal θ, all methods consistently improved RMSE of item parameter recovery as sample size increased, except for MCMC estimation of the c parameter, which did not exhibit a clear trend.

When latent trait scores were skewed to the left, the quality of item parameter recovery generally deteriorated for both MML and MCMC. Under skewed θ, the total error of MML item parameter recovery diminished as more examinees took the test, yet increasing sample size did not appear to benefit mean bias. Indeed, MML became increasingly negatively biased in estimating the d parameter as sample size increased, and mean biases in estimating the other item parameters remained considerable at N = 5,000. For Gibbs and HMC, increasing sample size under skewed θ benefited only the mean bias of item slope recovery while rendering their estimation of the other item parameters more negatively biased. In addition, unlike MML, the two MCMC methods showed no appreciable RMSE improvement in b and d parameter estimation as more cases were drawn from a skewed θ distribution.

Sample size and latent trait distribution had little observable effect on person parameter recovery on average. Both MML-EAP and MCMC were essentially unbiased and had similar RMSE of trait score estimation across all conditions.

Dedication

This dissertation is dedicated to my mother, Tam Nguyen.

Acknowledgments

The completion of my dissertation would not have been possible without the support and guidance of my professors. I would like to express my gratitude to Dr. Gordon Brooks, my advisor and dissertation chair, for encouraging me to go back to graduate school, allowing me to pursue the research topic I am interested in, and helping me formulate the research questions clearly. Under Dr. Brooks's supervision, I gained research methodology knowledge, statistical programming skills, critical perspectives on the research and knowledge production enterprise, a sense of humor, and five pounds of belly fat. The side effect, of course, is attributable to my over-following in Dr. Brooks's footsteps, and I take complete responsibility for it.

Dr. Bruce Carlson has always been an academic inspiration. His course on Bayesian analysis laid a strong foundation for my pursuit of this dissertation topic. His questions pushed me to think more philosophically, beyond the technical contents of my study. One can only feel overwhelmed by his knowledge and devotion to academic rigor. I am grateful to have had his instruction and guidance.

Dr. Sebastián Díaz has always been more than a professor to me. In him, I find a mentor, an advocate, and a friend. His encouragement made the dissertation research process less mentally brutal, and his support for me as a doctoral student over these years made graduate school more enjoyable. Discussions with him helped me develop a more practical approach to research and academic work. I am thankful for the well-rounded education I have received from Dr. Díaz.

I am grateful to have Dr. Adah Ward Randolph as my professor, dissertation committee member, and sister. Dr. Randolph helped me broaden my research methodological repertoire, strengthen my writing skills, and improve many aspects of my dissertation. She taught me how to position ourselves and navigate academia as feminists of color.
The critical thinking skills and commitment to justice I learnt from Dr. Randolph are meaningful lifelong lessons, and I am forever thankful.

I would like to thank the Ohio Supercomputer Center for granting me the resources and helping me through the process of running my R code on the Linux system, and my friend Nina Adanin for setting up a group of office computers for my simulation.

I am grateful to my family, especially my sister, my niece, and my nephew, for their support. My sister, Loan Do, has covered many family duties for me while I have been engaged in coursework and research at graduate school. Without my sister's sacrifice, my doctoral journey would not have borne fruit.

Finally, I would like to thank my friends, An Dinh, Mai Tran, Thuy Ho, Duong Tran, Hai Mai, and Linda Sauer, for their moral support, free meals, and free trips to Kroger. They made my graduate school experience a happy one.

Table of Contents

Abstract
Dedication
Acknowledgments
List of Tables
List of Figures
Chapter 1: Introduction
    Overview of IRT
    Assumptions of IRT
    Major Types of IRT Models
    Unidimensional IRT Models for Binary Data
        The Rasch/One-parameter IRT Model
        The Two-parameter IRT Model
        The Three-parameter IRT Model
        The Lesser-known Four-parameter IRT Model
    Parameter Estimation Approaches in IRT
        Joint Maximum Likelihood Estimation (JML)
        Marginal Maximum Likelihood Estimation (MML)
        Fully Bayesian Approach: Markov Chain Monte Carlo Estimation
    Problem Statement
    Research Objectives
        Research Question 1
        Research Question 2
    Significance of the Study
    Scope of the Study
    Definition