
GENERALIZED FIDUCIAL INFERENCE FOR GRADED RESPONSE MODELS

Yang Liu

A dissertation submitted to the faculty of the University of North Carolina at Chapel Hill in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Psychology.

Chapel Hill 2015

Approved by:

David Thissen

Daniel J. Bauer

Patrick J. Curran

Jan Hannig

Andrea Hussong

© 2015 Yang Liu
ALL RIGHTS RESERVED

ABSTRACT

YANG LIU: GENERALIZED FIDUCIAL INFERENCE FOR GRADED RESPONSE MODELS. (Under the direction of David Thissen)

Generalized fiducial inference (GFI) has been proposed as an alternative inferential framework in the statistical literature. Inferences of various sorts, such as confidence regions for (possibly transformed) model parameters, predictions about future observations, and goodness of fit evaluation, can be constructed from a fiducial distribution defined on the parameter space, in a fashion similar to the use of a Bayesian posterior. However, no prior distribution needs to be specified. In this work, the general recipe of GFI is applied to graded response models, which are widely used in psychological and educational studies for analyzing ordered categorical survey questionnaire data. Asymptotic optimality of GFI is established (Chapter 2), and a Markov chain Monte Carlo algorithm is developed for sampling from the resulting fiducial distribution (Chapter 3). The comparative performance of GFI, maximum likelihood, and Bayesian approaches is evaluated via Monte Carlo simulations (Chapter 4). The use of GFI as a convenient and powerful tool to quantify sampling variability in various inferential procedures is illustrated by an empirical analysis of patient-reported emotional distress data (Chapter 5).

To my dearest wife Flora.

ACKNOWLEDGMENTS

I would like to express my gratitude to all the committee members, especially my academic advisors, Drs. David Thissen and Jan Hannig. I could not have finished this work without their valuable feedback and advice. The project is sponsored by the Harold Gulliksen Psychometric Research Fellowship from the Educational Testing Service (ETS). I am grateful to my mentors Drs. Shelby Haberman and Yi-Hsuan Lee at ETS. It has been a great experience working under their supervision, and I have benefited greatly from their expertise in statistics. In addition, I would like to offer my sincere appreciation for the help and support of the current and former members of the Thurstone Lab—especially Brooke, Jim, and Jolynn—and those of Hannig's fiducial lab—Jessi, Qing, Dimitris, Abhishek, Rosie, and Jenny. I also cannot express enough thanks to Drs. Li Cai, Ji Seung Yang, and Scott Monroe for their continued support and encouragement of my work. Finally, to my caring, loving, and supportive wife, Flora: my heartfelt thanks. I deeply appreciate you taking care of most household activities in the past year; it was a great comfort and relief to be able to focus on completing my dissertation. Your encouragement during the difficult and painful times is duly noted; I could not have gone through the frustration and depression without your company.

TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

1 INTRODUCTION
  1.1 Overview
  1.2 The graded response model
    1.2.1 Point estimation
    1.2.2 Confidence interval/set
    1.2.3 Goodness of fit testing
  1.3 Generalized fiducial inference
2 THEORY
  2.1 A generalized fiducial distribution for item parameters
  2.2 A fiducial Bernstein-von Mises theorem
  2.3 Fiducial predictive inference
    2.3.1 Consistency
    2.3.2 Example: Response pattern scoring
  2.4 Goodness of fit testing with a fiducial predictive check (FPC)
    2.4.1 The centering approach
    2.4.2 The partial predictive approach
    2.4.3 Choice of test statistics
3 COMPUTATION
  3.1 General structure
  3.2 Conditional sampling steps
    3.2.1 Conditional sampling of A*_ij
    3.2.2 Conditional sampling of Z*_id
  3.3 Updating interior polytopes
  3.4 Starting values
  3.5 Heavy-tailedness and a workaround
  3.6 Computational time
4 MONTE CARLO SIMULATIONS
  4.1 Unidimensional models: Simulation design
  4.2 Unidimensional models: Parameter recovery
  4.3 Unidimensional models: Response pattern scoring
  4.4 Unidimensional models: Goodness of fit testing
  4.5 Bifactor models: Simulation design
  4.6 Bifactor models: Parameter recovery
  4.7 Conclusion
5 EMPIRICAL EXAMPLE
  5.1 A unidimensional model
  5.2 A three-dimensional exploratory model
  5.3 A bifactor model
  5.4 Summary
6 DISCUSSION AND CONCLUSION
Appendix A BASIC PROPERTIES
  A.1 Calculating the fiducial density
  A.2 The invariance property
Appendix B A BERNSTEIN-VON MISES THEOREM
Appendix C NON-UNIQUENESS DUE TO SELECTION RULES
Appendix D PREDICTIVE INFERENCE
Appendix E FIDUCIAL PREDICTIVE CHECK
  E.1 Asymptotic covariance with the sample score function
  E.2 Normal approximation of the likelihood
REFERENCES

LIST OF TABLES

3.1 The average CPU time (in seconds) consumed by a single MCMC iteration under different combinations of sample size n, test length m, latent dimensionality r (exploratory model, minimally constrained), and number of categories K (K_j = K for all j)

4.1 Data-generating parameter values for the unidimensional GRM (m = 9)

4.2 Data-generating parameter values for the bifactor GRM (m = 9)

5.1 PROMIS emotional distress short-form items

5.2 Fiducial point and interval estimates for item slopes and intercepts under the bifactor GRM

LIST OF FIGURES

1.1 The binomial proportion example

2.1 Set inverse functions

3.1 Trace plot for a slope parameter before and after implementing the workaround

4.1 Empirical coverage and median length of CIs for unidimensional GRM parameters (n = 100, m = 9)

4.2 Empirical coverage and median length of CIs for unidimensional GRM parameters (n = 200, m = 9)

4.3 Empirical coverage and median length of CIs for unidimensional GRM parameters (n = 500, m = 9)

4.4 Empirical coverage and median length of CIs for unidimensional GRM parameters (n = 100, m = 18)

4.5 Empirical coverage and median length of CIs for unidimensional GRM parameters (n = 200, m = 18)

4.6 Empirical coverage and median length of CIs for unidimensional GRM parameters (n = 500, m = 18)

4.7 Empirical coverage and median length of prediction intervals for latent variable scores (n = 100, m = 9)

4.8 Empirical coverage and median length of prediction intervals for latent variable scores (n = 200, m = 9)

4.9 Empirical coverage and median length of prediction intervals for latent variable scores (n = 500, m = 9)

4.10 Empirical coverage and median length of prediction intervals for latent variable scores (n = 100, m = 18)

4.11 Empirical coverage and median length of prediction intervals for latent variable scores (n = 200, m = 18)

4.12 Empirical coverage and median length of prediction intervals for latent variable scores (n = 500, m = 18)

4.13 Type I error results for score-group goodness of fit tests

4.14 Type I error results for bivariate goodness of fit tests

4.15 Power results for score-group goodness of fit tests: True model = 3-dimensional

4.16 Power results for bivariate goodness of fit tests: True model = 3-dimensional

4.17 Power results for score-group goodness of fit tests: True model = mixture

4.18 Power results for bivariate goodness of fit tests: True model = mixture

4.19 Empirical coverage and median length of CIs for bifactor GRM parameters (n = 200, m = 9)

4.20 Empirical coverage and median length of CIs for bifactor GRM parameters (n = 500, m = 9)

4.21 Empirical coverage and median length of CIs for bifactor GRM parameters (n = 200, m = 18)

4.22 Empirical coverage and median length of CIs for bifactor GRM parameters (n = 500, m = 18)

5.1 Fiducial predictive p-values for the sum-score group fit of the unidimensional GRM

5.2 Fiducial predictive p-values for the bivariate goodness of fit of the unidimensional GRM

5.3 Fiducial median and 95% fiducial percentile CIs for rotated factor loadings in the three-factor EIFA

5.4 Fiducial median and 95% fiducial percentile CIs for factor inter-correlations in the three-factor EIFA

5.5 Two-dimensional confidence regions for the primary and secondary loadings of anger items

5.6 Two-dimensional confidence regions for the primary and secondary loadings of anxiety items

5.7 Two-dimensional confidence regions for the primary and secondary loadings of depression items

5.8 Fiducial median and 95% confidence bands (CBs) of the marginal expected score curve for general emotional distress. Pointwise and simultaneous CBs are shown in different colors

5.9 Fiducial medians and 95% confidence bands (CBs) of the marginal expected score curves for anger. Only items in the anger subscale are shown. Pointwise and simultaneous CBs are shown in different colors

5.10 Fiducial medians and 95% confidence bands (CBs) of the marginal expected score curves for anxiety. Only items in the anxiety subscale are shown. Pointwise and simultaneous CBs are shown in different colors

5.11 Fiducial medians and 95% confidence bands (CBs) of the marginal expected score curves for depression. Only items in the depression subscale are shown. Pointwise and simultaneous CBs are shown in different colors

5.12 The fiducial median and 95% confidence bands (CBs) for marginal standard error curves for the four dimensions. Pointwise CBs are shown as colored dashed lines. Note that the lower-right panel for depression has a different y-axis from the rest

C.1 Illustration of notation used in the proof of Theorem 2

CHAPTER 1: INTRODUCTION

1.1 Overview

This research focuses on an application of generalized fiducial inference, an omnibus inferential framework, to a general family of graded response models, which has been extensively used in psychological research for analyzing survey data with ordinal response categories. Chapter 1 provides a brief literature review of various issues related to the graded response model and the rationale of generalized fiducial inference. Specifically, extant likelihood-based and Bayesian approaches for point estimation, confidence interval construction, test scoring, and goodness of fit testing are summarized. Derivation of a fiducial distribution for item parameters and related statistical properties are discussed in Chapter 2. The key result of this part is a Bernstein-von Mises theorem that guarantees the asymptotic optimality of the fiducial distribution. A Markov chain Monte Carlo algorithm is developed for sampling from the fiducial distribution, which is described in Chapter 3. A Fortran implementation of the sampler is available from the author upon request. In Chapter 4, a large-scale simulation study is conducted to evaluate the comparative behavior of the proposed procedure with existing likelihood-based and objective Bayesian approaches for various inferential problems. Finally, patient-reported emotional distress data from the Patient-Reported Outcomes Measurement Information System (PROMIS) study are analyzed in Chapter 5 using generalized fiducial inference. We emphasize that the uncertainty in parameter estimates due to sampling variability should be taken into account in subsequent analyses; to this end, the proposed framework serves as a theoretically sound but analytics-free tool. Detailed proofs of the theorems and propositions are given in appendices.

1.2 The graded response model

The graded response model (GRM; Samejima, 1969) has become a standard item response theory (IRT) model for analyzing Likert-type response scales, which have polytomous categories coded in order (e.g., strongly disagree, disagree, neutral, agree, strongly agree). Survey questionnaires including Likert items have been designed to measure many psychological constructs, including personality attributes, attitudes, health-related outcomes, etc. The GRM models responses to each item as an ordinal logistic regression (also known as a proportional odds model) on one or more latent variables representing the underlying constructs of interest. Heuristically, an item response is treated in the GRM as a discrete realization of a continuous but latent propensity that is related to individual differences in the target constructs as well as item characteristics. The relative position of a particular response category on the latent continuum is gauged by the adjacent item difficulty/intensity parameters, which are transformations of the slope and intercept parameters in the regression. The GRM reduces to the two-parameter logistic (2PL; Birnbaum, 1968) model when there are only two response categories.

In the current work, we focus our attention on a family of multidimensional logistic GRMs including unidimensional, bifactor, and exploratory GRMs as special cases. Our notation is more consistent with the mixed-effects modeling convention than the default choice in the IRT literature. For a K_j-category ordinal item j and a single respondent i, define the item response function (IRF), denoted f_j(θ_j, k | z_i), as the probability of endorsing the kth category, i.e., Y_ij = k, k = 0, 1, ..., K_j − 1, conditional on this particular person's latent variable values Z_i = z_i:

$$ f_j(\theta_j, k \mid z_i) = P\{Y_{ij} = k \mid Z_i = z_i\} = \begin{cases} 1 - \dfrac{1}{1 + e^{-(\alpha_{j1} + \beta_j^\top z_i)}}, & k = 0;\\[1ex] \dfrac{1}{1 + e^{-(\alpha_{j,K_j-1} + \beta_j^\top z_i)}}, & k = K_j - 1;\\[1ex] \dfrac{1}{1 + e^{-(\alpha_{jk} + \beta_j^\top z_i)}} - \dfrac{1}{1 + e^{-(\alpha_{j,k+1} + \beta_j^\top z_i)}}, & \text{otherwise}. \end{cases} \qquad (1.1)$$

In (1.1), the α_jk's denote the intercept parameters and β_j the slopes. We assume that all the intercept parameters are freely estimated, while some slopes are fixed for model identification; let θ_j be all free parameters that calibrate item j. The r-dimensional latent variables are assumed to be standard normal, Z_i ∼ N(0, I_{r×r}). Inference for models with unknown covariance structure among latent dimensions (e.g., simple-structure models) is beyond the scope of the present work. For a test comprising m graded items, we assume conditional independence among item responses given the latent variables (e.g., McDonald, 1981), which implies the likelihood function f(θ, y_i) of an individual item response vector y_i = (y_ij)_{j=1}^m:

$$ f(\theta, y_i) = \int_{\mathbb{R}^r} \prod_{j=1}^{m} f_j(\theta_j, y_{ij} \mid z_i)\, d\Phi(z_i), \qquad (1.2)$$

in which Φ(·) denotes the probability measure of an r-dimensional standard normal distribution. Further assume that the sample is composed of n independent and identically distributed (i.i.d.) item response vectors y = (y_i)_{i=1}^n; the corresponding sample likelihood is

$$ f_n(\theta, \mathbf{y}) = \prod_{i=1}^{n} f(\theta, y_i). \qquad (1.3)$$
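To make Equations 1.1 to 1.3 concrete, the following minimal Python sketch evaluates the category probabilities and the sample log-likelihood for a unidimensional GRM (r = 1), approximating the integral in Equation 1.2 by Gauss-Hermite quadrature. The function names and parameter values are illustrative assumptions, not part of the dissertation's software.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def grm_probs(alpha_j, beta_j, z):
    """Category probabilities f_j(theta_j, k | z), k = 0, ..., K_j - 1, for one
    unidimensional graded item (Equation 1.1).  alpha_j holds the K_j - 1
    intercepts, ordered so that all category probabilities are positive."""
    tau = np.asarray(alpha_j) + beta_j * z                 # tau_jk = alpha_jk + beta_j * z
    cdf = np.concatenate(([1.0], 1.0 / (1.0 + np.exp(-tau)), [0.0]))
    return cdf[:-1] - cdf[1:]                              # F(tau_jk) - F(tau_j,k+1)

def grm_loglik(alpha, beta, y, n_quad=49):
    """Sample log-likelihood log f_n(theta, y) (Equations 1.2-1.3), with the
    integral over the standard normal latent variable approximated by
    Gauss-Hermite quadrature."""
    nodes, weights = hermgauss(n_quad)
    z = np.sqrt(2.0) * nodes                               # rescale nodes for N(0, 1)
    w = weights / np.sqrt(np.pi)
    n, m = y.shape
    loglik = 0.0
    for i in range(n):
        like = np.array([np.prod([grm_probs(alpha[j], beta[j], zq)[y[i, j]]
                                  for j in range(m)]) for zq in z])
        loglik += np.log(np.dot(w, like))                  # log of Equation 1.2
    return loglik

# Illustrative values: two 3-category items with decreasing intercepts.
alpha = [np.array([1.0, -1.0]), np.array([0.5, -0.5])]
beta = np.array([1.2, 0.8])
y = np.array([[0, 2], [1, 1], [2, 0]])
print(grm_loglik(alpha, beta, y))
```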

1.2.1 Point estimation

In the frequentist framework, the gold-standard maximum likelihood (ML) estimates of item parameters are typically obtained by numerically maximizing Equation 1.3 via either Newton-type (Bock and Lieberman, 1970; Haberman, 1988) or Expectation-Maximization (EM; Dempster, Laird, and Rubin, 1977; Bock and Aitkin, 1981) algorithms. When the latent dimensionality r is not high, the intractable expression in Equation 1.2 can be efficiently approximated by numerical integration; various types of quadrature systems have been used—e.g., rectangular, Gauss-Hermite (Ralston, 1965, pp. 93-97; Bock and Aitkin, 1981), or adaptive (Schilling and Bock, 2005; Haberman, 2006) quadrature. When the latent dimensionality is high, however, approximation of the likelihood function based on tensor-product quadrature suffers from the well-known "curse of dimensionality": the number of quadrature points grows exponentially fast. One solution, given by Meng and Schilling (1996), is to incorporate a Gibbs sampler for the E-step computation, resulting in a Monte Carlo EM algorithm. Alternatively, Cai (2010a; 2010b) proposed approximating the gradient of the log-likelihood function by a Metropolis-Hastings sampler and locating its zero by a Robbins-Monro-type search.

On the other hand, Bayesian methods based on stochastic approximations of the posterior distribution (e.g., Albert, 1992; Patz and Junker, 1999; Bradlow, Wainer, and Wang, 1999; Curtis, 2010; also see Edwards (2010) for an application to the GRM) are not affected as much by increasing latent dimensionality. Such a complex sampling problem is usually addressed by Markov chain Monte Carlo (MCMC) methods. However, it is commonly agreed that Bayesian methods are less user-friendly than ML: statistical expertise is required to specify prior distributions and tune sampling algorithms. Even though the asymptotic optimality of Bayesian posteriors can be guaranteed by the celebrated Bernstein-von Mises theorem (e.g., Le Cam and Yang, 1986), erroneous results may be seen in finite-sample applications as a result of improperly chosen prior distributions or ill-behaved samplers.

1.2.2 Confidence interval/set

Associated with ML estimation, confidence intervals (CIs) are most often constructed by inverting the Wald test, with standard errors produced by suitable supplemental procedures (e.g., Cai, 2008; Yuan, Cheng, and Patton, 2014). However, caveats about Wald-type intervals have been raised in the statistical literature (e.g., Neale and Miller, 1997) because they rest on a quadratic approximation to the log-likelihood: they are not invariant under nonlinear transformations of parameters (the delta method is often used to obtain standard errors for a reparameterized model; its performance depends on how quadratic the log-likelihood function is with respect to the new parameterization). They may also cover values beyond the boundary of the parameter space, and may have unsatisfying small-sample behavior.

As a better alternative, CIs obtained by inverting the likelihood-ratio test are not yet available in the IRT literature; the procedure itself is computationally intensive, and might not be suitable for multidimensional GRMs. Quantification of uncertainty for more complex transformations of parameters, e.g., the problem of drawing simultaneous confidence bands for the item response functions (Thissen and Wainer, 1990) or information curves, is typically handled by less rigorous methods such as the bootstrap.

Bayesian methods based on posterior sampling are extremely flexible in terms of making inferences about arbitrary transformations of model parameters: one can apply the desired transformation to each Monte Carlo draw from the posterior distribution, leading naturally to a Monte Carlo sample of the transformed posterior. Moreover, we can save the random draws and pass them to subsequent analyses via, e.g., multiple imputation; compared to plugging in the point estimates, this accounts for the sampling variability in the initial model estimation stage. But the arbitrariness in prior selection may exert unpredictable influence on the finite-sample performance of Bayesian confidence sets, and thus they should be used with caution.
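As a small illustration of the transform-each-draw idea, the sketch below applies the common slope-difficulty reparameterization b_jk = −α_jk/β_j to simulated placeholder draws of one item's slope and intercept, and reads off percentile point and interval estimates. The draws are hypothetical stand-ins; in practice they would come from an actual posterior or fiducial sampler.

```python
import numpy as np

rng = np.random.default_rng(42)

# Placeholder draws standing in for an MCMC sample of (slope, intercept)
# for a single item; in practice these come from the sampler's output.
beta_draws = rng.normal(1.5, 0.2, size=5000)    # slopes beta_j
alpha_draws = rng.normal(-0.8, 0.3, size=5000)  # intercepts alpha_jk

# Transform each draw: slope-difficulty parameterization b_jk = -alpha_jk / beta_j.
b_draws = -alpha_draws / beta_draws

# Percentile point and interval estimates for the transformed parameter.
print("median:", np.median(b_draws))
print("95% percentile CI:", np.quantile(b_draws, [0.025, 0.975]))
```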

1.2.3 Goodness of fit testing

The item response data matrix y = (y_ij), i = 1, ..., n, j = 1, ..., m, can be reorganized as an m-way contingency table, in which each dimension corresponds to the K_j response categories of an item and each cell of the table corresponds to a response pattern y_i = (y_ij)_{j=1}^m. Therefore, it is natural to assess GRM model fit by means of residuals in contingency table cells, testing whether the observed proportions are identical to the model-implied response pattern probabilities. For a general discussion of GOF testing for contingency table data, see Rao (1973, pp. 391-394) and, more recently, Haberman and Sinharay (2013). One salient feature of item response data is sparseness, i.e., very small expected proportions for some cells, as a consequence of the number of cells, ∏_{j=1}^m K_j, increasing exponentially with the test length. It is well known that the asymptotic theory of residuals works poorly in sparse tables (e.g., Cochran, 1952). A simple workaround is to collapse the table in a systematic fashion such that the asymptotics are well suited to the resulting table of a smaller size; this has been termed the limited-information approach by some authors. Existing GOF testing procedures in the IRT literature are mostly based on this rationale (e.g., Glas, 1988; Reiser, 1996; Maydeu-Olivares and Joe, 2005, 2006; Cai, Maydeu-Olivares, Coffman, and Thissen, 2006; Joe and Maydeu-Olivares, 2010; Cai and Hansen, 2013).

It is as important to identify the source of model misfit as it is to test the overall GOF of item response models. In practice, model modifications and/or item-level fine-tuning should be performed before inferences can be safely drawn from the fitted GRM. When no a priori information about the misfitting pattern is available, the source of misfit can be investigated by exploring the fit of the IRT model in low-dimensional (typically two-way or three-way) marginal subtables. For a comprehensive description of this approach, see Liu and Maydeu-Olivares (2014). If there is information about a potential violation of an assumption (e.g., conditional dependence, differential item functioning, etc.), researchers may specify a less restrictive model that reduces to the original model after imposing constraints, which turns the examination of model fit into a nested model comparison problem.

On the Bayesian side, posterior predictive checking (PPC; Guttman, 1967; Rubin, 1984) serves as a straightforward approach for detecting model misfit. When the model is correctly specified and the sample size is large enough, the value of a test statistic T(y) computed from the observed data set should be close to the same statistic computed from a predictive data set conditional on the posterior distribution of IRT model parameters. Here, the distribution of predictive data is a composite of the posterior distribution and the data-generating model, and thus can be approximated stochastically by drawing model parameters from the posterior and then simulating data conditional on these draws. For applications of PPC to IRT models, see Sinharay (2005), Sinharay, Johnson, and Stern (2006), and Levy, Mislevy, and Sinharay (2009).
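This simulate-and-compare recipe is easy to sketch generically. In the minimal Python illustration below, the `simulate` and `statistic` callables and the toy Bernoulli check are hypothetical stand-ins, not the test statistics studied later in this dissertation.

```python
import numpy as np

rng = np.random.default_rng(0)

def predictive_pvalue(param_draws, simulate, statistic, y_obs):
    """Monte Carlo predictive check: for each parameter draw, simulate a
    replicate data set and compare T(y_rep) with T(y_obs).  `simulate` and
    `statistic` are user-supplied callables (hypothetical names)."""
    t_obs = statistic(y_obs)
    t_rep = np.array([statistic(simulate(theta)) for theta in param_draws])
    return np.mean(t_rep >= t_obs)

# Toy usage: checking a Bernoulli model with conjugate Beta posterior draws.
y_obs = rng.binomial(1, 0.3, size=50)
draws = rng.beta(1 + y_obs.sum(), 1 + 50 - y_obs.sum(), size=2000)
p = predictive_pvalue(draws,
                      simulate=lambda pi: rng.binomial(1, pi, size=50),
                      statistic=np.mean)
print(p)  # values near 0 or 1 would flag misfit; ~0.5 when the model fits
```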

To cater to the increasing prevalence of multidimensional GRMs, it is desirable to develop a comprehensive estimation and inference framework that is able to: a) deal with high-dimensional latent traits, b) facilitate the assessment of uncertainty for various kinds of inference based on the model, and c) avoid as much subjectivity and ambiguity as possible in application. In the current research, generalized fiducial inference (GFI; Hannig, 2009, 2013) for a general class of multidimensional GRMs is proposed as an alternative to the existing full-information methods. GFI satisfies most of the aforementioned desiderata. This recent variant of Fisher's (1930, 1932, 1935) fiducial inference is an interesting theoretical middle ground between frequentist and Bayesian methods. Inferential procedures are based on a probability distribution supported on the parameter space, namely a fiducial distribution, which is derived using only the information contained in the data. Consequently, it inherits all the flexibility of Bayesian methods, but requires no prior knowledge of model parameters.

1.3 Generalized fiducial inference

Fisher (1930) put forward the notion of fiducial probability in response to the method of inverse probability (i.e., Bayesian inference, especially with a uniform prior), which in his view was "fundamentally false and devoid of foundation". His concerns were conveyed in the following excerpt:

The peculiar feature of the inverse argument proper is to say something equivalent

to “We do not know the function Ψ specifying the super-population [i.e., the prior distribution of model parameters θ], but in view of our ignorance of the actual values of θ we may take Ψ to be constant.”. . . but however we might disguise it, the choice of this particular a priori distribution for the θ’s is just as arbitrary as any other could be.

He continued to point out that the claimed objectivity of the inverse probability cannot be preserved under reparameterization:

If we were, for example, to replace our θ's by an equal number of functions of them, θ′_1, θ′_2, θ′_3, . . . , all objective statements could be translated from the one notation to the other, but the simple assumption Ψ(θ_1, θ_2, θ_3, . . . ) = constant may translate into a most complicated frequency function for θ′_1, θ′_2, θ′_3, . . .

He also summarized the major reason why inverse probability was popular—that is, quantifying uncertainty with probability is intuitive and handy, and Bayes' rule seemed to be the only available tool at that time:

The underlying mental cause is . . . in the fact that we learn by experience, that science has its inductive processes so that it is naturally thought that such inductions, being uncertain, must be expressible in terms of probability. . . . The assumption was almost a necessary one seeing that no other mathematical apparatus existed for dealing with uncertainties. . . . The introduction of quantitative variates [representing model parameters], having continuous variation in place of simple frequencies as the observational basis, makes also a remarkable difference to the kind of inference which can be drawn. . . . Inverse probability has, I believe, survived so long in spite of its unsatisfactory basis, because its critics have until recent times put forward nothing to replace it as a rational theory of learning by experience.

In his 1930 article, Fisher provided a template fiducial argument for a one-parameter model Y ∼ F_θ, in which θ is the parameter and F_θ is the distribution function, monotonically decreasing in θ, based on a single observation Y = y. By the probability integral transformation, F_θ(Y) ∼ Uniform(0, 1), which is, in modern terminology, a pivotal quantity.

He transferred the pivotal distribution to the parameter space through F_θ(y), now considered a function of θ, equivalent to the operation of "de-pivoting" for the purpose of obtaining CIs with exact coverage (see, e.g., Casella and Berger, 2002). The resulting distribution, having density −dF_θ(y)/dθ, determines what Fisher called fiducial probability, which in this case is the same as the correct coverage probability accumulated from repeated samples. Later, Fisher illustrated this approach again with a Gaussian variance example (Fisher, 1933), and generalized it to multidimensional parameters (Fisher, 1935).

In the 1935 paper, Fisher claimed that his fiducial solution to the Behrens-Fisher problem is exact, which was later challenged by Bartlett; see Bartlett (1965) and Zabell (1992) for summaries of the dispute. Moreover, Fisher's interpretation of fiducial probability was by no means cohesive: in his earlier work (Fisher, 1930, 1933, 1935), fiducial probability was largely treated as a synonym of frequentist coverage as in the Neyman-Pearson repeated-sampling scheme (Neyman, 1934); however, a more epistemic view conceding to the Bayesian camp (Fisher, 1945, 1955) was adopted at about the same time that he launched a heated polemic against Neyman. The confusion, according to Zabell (1992), can likely be traced to Fisher's mixed understanding of the nature of probability: in his own writings, probability is both "a frequency in an infinite hypothetical population" (Fisher, 1922) and "a numerical measure of rational belief" (Fisher, 1930). As a consequence, fiducial inference was largely renounced by mainstream statisticians; it has been viewed as Fisher's "one great failure" (Zabell, 1992) or "the biggest blunder" (Efron, 1998).

Many attempts have been made to revitalize the idea of fiducial inference (see Hannig, 2009, for a historical review), among which two streams of research are particularly relevant to the current dissertation. One is the notion of a confidence distribution (CD; e.g., Efron, 1998; Schweder and Hjort, 2002; Xie and Singh, 2013). Tying back to Fisher's first definition of fiducial probability, a CD for a single parameter θ (with or without the presence of nuisance parameters) is defined such that its upper α quantile is the upper limit of a one-sided 100(1 − α)% CI under the true model. In many practical problems, however, only asymptotic CDs can be found. Various existing inference tools are special cases of asymptotic CDs, such as Bayesian posterior distributions and bootstrap distributions, provided suitable regularity conditions are satisfied. The other branch of studies is rooted in Fraser's (1968) structural inference and the Dempster-Shafer calculus (e.g., Dempster, 1968, 2008; Shafer, 1976), and includes generalized fiducial inference (GFI; Hannig, 2009, 2013) and inferential models (Martin, Zhang, and Liu, 2010; Martin and Liu, 2013). These works retained Fisher's idea of "finding solutions" of model parameters from the mathematical formula that links data, parameters, and pivotal quantities, but incorporated additional adjustments to achieve desirable statistical properties. Specifically, Hannig (2009) provided a simple recipe to construct an asymptotic CD, namely the generalized fiducial distribution, which is adaptable to a broad collection of statistical models. In brief, the goal of GFI is to find a fiducial distribution on the parameter space capturing all the information that the observed data convey about model parameters. It is achieved by a role-switching between data and parameters similar to that involved in the definition of a likelihood function. Hannig's (2009) fiducial argument operates on the data generating equation (also known as the structural equation):

$$ \mathbf{Y} = g(\theta, \mathbf{U}), \qquad (1.4)$$

which describes the data Y as a function of the parameters θ ∈ Θ and random components U having parameter-free distributions (i.e., pivotal quantities). For observed data Y = y, the data generating equation can be considered as an implicit function relating θ to U. Properly solving for θ from Equation 1.4, i.e., writing the parameters as a function of the data and random components, transfers the known distribution of U to the parameter space and produces a fiducial distribution. From now on, lowercase letters are routinely used for realizations of random variables. Let

$$ Q(\mathbf{y}, \mathbf{u}) = \{\theta : g(\theta, \mathbf{u}) = \mathbf{y}\} \qquad (1.5)$$

be the set inverse of Equation 1.4, containing the solutions of θ to Equation 1.4 for fixed y and u. In general, Equation 1.5 may contain more than one element for some values of u, and may be empty for others; in terms of finding a solution θ, these cases correspond to under-identified and over-identified systems, respectively. Here, drawing an analogy to solving a system of linear equations might be helpful. When Equation 1.5 consists of multiple elements, it resembles a linear system that has fewer equations than variables, and thus more than one solution is admissible. In this case, a preference for one value of θ over another cannot be decided from the values of y and u per se. A general solution for such an under-identified system can be denoted v(Q(y, u)), in which v(·) is some user-defined selection rule that chooses a point from the closure of Equation 1.5. On the other hand, when the set determined by Equation 1.5 is empty, the system is similar to a linear system that has more equations than variables, in which case conflict may arise and no solution can be found. This implies that no feasible parameter value is able to recover y combined with the particular u. Because we assume the model is correctly specified, at least the true parameter values should be contained in the set inverse; intuitively, an empty set inverse means that this u value is not helpful to the inference of θ and should be discarded. Therefore, we should always prevent this from happening, and one natural workaround is to concentrate on the set of u such that Equation 1.5 is non-empty. Following these heuristics, a fiducial distribution can be defined as

$$ v(Q(\mathbf{y}, \mathbf{U}^\star)) \mid \{Q(\mathbf{y}, \mathbf{U}^\star) \neq \emptyset\}, \qquad (1.6)$$

in which U* is an independent and identically distributed (i.i.d.) copy of the data-generating U. A (possibly vector-valued) random variable having the distribution determined by Equation 1.6 is referred to as a generalized fiducial quantity (GFQ), denoted R.

Next, we discuss an illustrative example, namely, the binomial proportion problem (see Dempster, 1966; Hannig, 2009). The GFQ for the binomial proportion parameter is derived following the generic recipe; the derivation is in many respects similar to that of our main problem described in the next chapter.

Example: Binomial proportion. Suppose Y_1, ..., Y_n are independent and identically distributed (i.i.d.) Bernoulli(π) random variables with success probability π. The data generating equation for each Y_i is

$$ Y_i = I\{U_i \le \pi\}, \qquad U_i \sim \mathrm{Uniform}(0, 1), \qquad (1.7)$$

in which I(·) denotes the indicator function. To make inference about π, we consider the set inverse of Equation 1.7:

$$ Q_i(y_i, u_i) = \begin{cases} [u_i, 1], & \text{if } y_i = 1;\\ [0, u_i), & \text{if } y_i = 0. \end{cases} \qquad (1.8)$$

Equation 1.8 is one of the two segments of the interval [0, 1] divided by the value of u_i; see the top panel of Figure 1.1 for a visualization.

[Figure 1.1: The binomial proportion example. The top panel shows the individual set inverse function Q_i(y_i, u_i) for y_i = 0 and 1, respectively. The middle panel gives an example of an empty Q(y, u) = Q_1(y_1, u_1) ∩ Q_2(y_2, u_2), in which y_1 = 0, y_2 = 1, and u_1 < u_2. The bottom panel displays a non-empty Q(y, u), in which the s = Σ_{i=1}^n y_i smallest u_i's, denoted u_{1:n}, ..., u_{s:n}, correspond to successes, and the rest correspond to failures.]

Let Y = (Y_i)_{i=1}^n, and S = Σ_{i=1}^n Y_i ∼ Binomial(n, π). The set inverse function for Y, denoted Q(y, u), in which u = (u_i)_{i=1}^n, can be obtained by intersecting all individual set inverse functions (Equation 1.8), because by definition the set inverse includes all values of π that are consistent with all individual data generating equations (Equation 1.7). Formally, it can be written as

$$ Q(\mathbf{y}, \mathbf{u}) = \bigcap_{i=1}^{n} Q_i(y_i, u_i) = \Big[\max_{i:\, y_i = 1} u_i,\ \min_{i:\, y_i = 0} u_i\Big). \qquad (1.9)$$

The set defined by Equation 1.9 can be empty; an example is given in the middle panel of Figure 1.1. To obtain a non-empty intersection, we need max_{i: y_i = 1} u_i < min_{i: y_i = 0} u_i, which is illustrated in the bottom panel of Figure 1.1. Also let v(·) be a selection rule that yields an element of Equation 1.9. Thereby we define the generalized fiducial distribution of π following the generic recipe (Equation 1.6):

$$ R \overset{d}{=} v\Big(\Big[\max_{i:\, y_i = 1} U_i^\star,\ \min_{i:\, y_i = 0} U_i^\star\Big)\Big) \,\Big|\, \Big\{\max_{i:\, y_i = 1} U_i^\star < \min_{i:\, y_i = 0} U_i^\star\Big\}, \qquad (1.10)$$

in which the U_i*'s are i.i.d. copies of the U_i's as usual. The GFD defined by Equation 1.10 satisfies the stochastic ordering U_{s:n} ⪯ R ⪯ U_{(s+1):n}, in which ⪯ means "stochastically smaller than or equal to". More detailed discussion of this example, including the choice of selection rules v(·), can be found in Hannig (2009).

A GFQ serves as a prospective probabilistic quantification of the plausibility of model parameters after observing data, in contrast to the deterministic quantification given by the likelihood function, and also to the posterior distribution obtained by updating the prior knowledge of model parameters with the observed data. The fiducial probability of an event {R ∈ A ⊂ Θ} corresponds to the long-run proportion of times that parameter values in A would be needed in order to reproduce the observed data y, over repeated data generation from the model (i.e., generating U* from its parameter-free distribution). Here, for the sake of a simpler elucidation, we ignore the fact that Q(y, u) is possibly set-valued; this can be taken as approximately correct in large samples (Hannig, 2013). The construction and interpretation do not require any prior knowledge of the model parameters, which is a marked logical difference from the Bayesian approach. Mathematically, however, the GFQ is closely related to empirical Bayesian inference: the fiducial density can be written in the form of a Bayesian posterior with a data-dependent prior (see Hannig, 2009, Section 4.2,

for a general discussion). A different connection between the two is established later for our problem involving a marginal likelihood, similar to that given by Liu and Hannig (2014, Remark 3).

GFQs defined by Equation 1.6 suffer from three major sources of non-uniqueness (Hannig, 2009): a) the choice of data generating equations, b) the choice of selection rules, and c) conditioning on a set with probability 0 (i.e., the Borel paradox; see Proschan and Presnell (1998) for detailed discussions). In the application of GFI to the GRM, we use a simple and natural data generating equation that parallels the way graded item response data are typically simulated in Monte Carlo studies, and that is shown to lead to a fiducial distribution satisfying a Bernstein-von Mises theorem (and consequently yielding an asymptotic CD); therefore, we do not feel pressed to explore other possible data generating equations. In addition, c) does not apply to categorical data, so it will not be discussed either. For b), it can be shown that the diameter of the set given by Equation 1.5 in our problem shrinks to 0 at the rate 1/n, faster than the rate 1/√n at which the GFQ approaches its normal limit as dictated by the Bernstein-von Mises theorem. Non-informative and data-independent selection rules are recommended by Hannig (2009, Section 7) for finite-sample applications.

In practice, Monte Carlo methods are frequently used when GFI is applied to complex parametric models, due to the fact that exact computation of functionals of the fiducial distribution, e.g., the median and quantiles, is often intractable. The target distribution (Equation 1.6) can be approximated by simulating U* subject to the constraint Q(y, U*) ≠ ∅ and constructing the implied set inverse Q(y, U*) from each Monte Carlo draw of U*, which is in essence a rather involved truncated sampling problem. Markov chain Monte Carlo (MCMC; e.g., E, Hannig, and Iyer, 2009; Hannig and Lee, 2009) or sequential Monte Carlo (SMC; e.g., Cisewski and Hannig, 2012) approaches have been invoked to generate samples from the target fiducial density; see Hannig, Lai, and Lee (2014) for a detailed discussion of various computational issues of GFI.
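For the binomial proportion example, this simulate-and-constrain scheme can be implemented exactly, because conditional on non-emptiness the interval in Equation 1.10 is distributed as a pair of adjacent uniform order statistics (bottom panel of Figure 1.1). The Python sketch below uses that representation with a uniform selection rule; it is an illustration under these assumptions, not the dissertation's sampler.

```python
import numpy as np

rng = np.random.default_rng(1)

def fiducial_binomial(s, n, n_draws=10000):
    """Monte Carlo draws from the fiducial distribution of a binomial
    proportion (Equation 1.10).  Conditional on non-emptiness of the set
    inverse, [max_{i: y_i=1} U*_i, min_{i: y_i=0} U*_i) has the same
    distribution as [U*_{s:n}, U*_{(s+1):n}) built from the order statistics
    of n i.i.d. uniforms, so no draws need to be rejected.  A uniform
    selection rule v picks one point inside each interval."""
    u = np.sort(rng.uniform(size=(n_draws, n)), axis=1)
    lower = u[:, s - 1] if s > 0 else np.zeros(n_draws)
    upper = u[:, s] if s < n else np.ones(n_draws)
    return rng.uniform(lower, upper)

# s = 7 successes out of n = 20 trials: fiducial median and 95% CI for pi.
draws = fiducial_binomial(7, 20)
print(np.quantile(draws, [0.025, 0.5, 0.975]))
```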

It has been demonstrated in applications that GFI not only offers asymptotically optimal inference but also outperforms ML and Bayesian approaches in small samples (e.g., Hannig, 2009; Cisewski and Hannig, 2012; Liu and Hannig, 2014). In the current work, we show that GFI, when applied to the GRM, again delivers added value over conventional likelihood-based and Bayesian methods: we derive a Bernstein-von Mises theorem and some other important properties that guarantee the asymptotic correctness of GFI; in the simulation study, we continue to see that GFI is well behaved even in very extreme conditions (small samples and skewed item parameters) where both ML and Bayesian approaches fail. In the next chapter, we first look at some large-sample properties of GFI in the family of GRMs (Equation 1.1).

CHAPTER 2: THEORY

In this chapter, we derive a generalized fiducial distribution of item parameters under the family of multidimensional graded response models (GRMs; characterized by Equation 1.1). A Bernstein-von Mises theorem is established to justify the asymptotic correctness of generalized fiducial inference (GFI) for making inferences about item parameters and their transformations. We also discuss the consistency of the fiducial predictive distribution for sample statistics whose distributions depend on the item parameters, which is applicable to constructing predictive intervals for response pattern scores. We conclude the chapter by discussing an easy-to-implement goodness of fit testing procedure, the fiducial predictive check (FPC), analogous to the posterior predictive check (Guttman, 1967; Rubin, 1984) in the Bayesian literature.

2.1 A generalized fiducial distribution for item parameters

Following the general recipe introduced in Chapter 1, we derive a generalized fiducial distribution for item intercepts and slopes given independent and identically distributed (i.i.d.) responses to a collection of graded items. We start from the data generating equation of a person's response to an item under the GRM, and find the set inverse function of item parameters corresponding to this particular data entry. Combining all individual set inverse functions, we arrive at the set inverse for the entire item response data set, based on which a fiducial distribution can be defined in the form of Equation 1.6.

Conditional on the latent variable Z_i, person i's response to item j, i.e., Y_ij, follows a multinomial distribution with probabilities P{Y_ij = k | Z_i} = f_j(θ_j, k | Z_i), k = 0, ..., K_j − 1, given by Equation 1.1 in the previous chapter, in which the free item parameters α_j1, ..., α_{j,K_j−1} and β_j are collected in θ_j. Similar to the binomial proportion example discussed in the previous chapter, the data generating equation (e.g., Hannig, 2009, Example 5) of the ordinal Y_ij can be written as

$$ Y_{ij} = \sum_{k=1}^{K_j - 1} I\Big\{ U_{ij} \le \sum_{l=k}^{K_j - 1} f_j(\theta_j, l \mid Z_i) \Big\} = \sum_{k=1}^{K_j - 1} I\big\{ A_{ij} \le \alpha_{jk} + \beta_j^\top Z_i \big\}, \qquad (2.1)$$

in which U_ij ∼ Uniform(0, 1), A_ij = logit(U_ij) ∼ Logistic(0, 1), and Z_i ∼ N(0, I_{r×r}). Here, A_ij and Z_i can be identified with the pivotal component U in the general formulae (Equations 1.4 to 1.6). Assume r_j slopes are free (r_j ≤ r), and thus the dimension of θ_j is q_j = r_j + K_j − 1; in the sequel, we only consider the case in which the fixed slopes are zero, for simplicity.¹ The set inverse function of Equation 2.1 is the following subset of the q_j-dimensional parameter space:

$$ Q_{ij}(y_{ij}, a_{ij}, z_i) = \Bigg\{ \theta_j \in \mathbb{R}^{q_j} : \begin{array}{ll} a_{ij} > \alpha_{j1} + \beta_j^\top z_i, & \text{if } y_{ij} = 0;\\ a_{ij} \le \alpha_{j,K_j-1} + \beta_j^\top z_i, & \text{if } y_{ij} = K_j - 1;\\ \alpha_{j,y_{ij}+1} + \beta_j^\top z_i < a_{ij} \le \alpha_{j y_{ij}} + \beta_j^\top z_i, & \text{otherwise} \end{array} \Bigg\}. \qquad (2.2)$$

Geometrically, Equation 2.2 corresponds to the intersection of two half-spaces if y_ij is a middle category, and a single half-space if y_ij = 0 or K_j − 1. A graphical illustration of Equation 2.2 for a three-category item is given in the left panel of Figure 2.1, in which the parameter space is three-dimensional (two intercepts and one slope).
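Equation 2.1 is also a recipe for simulating graded responses. A minimal Python sketch for the unidimensional case follows, with illustrative parameter values and a hypothetical function name:

```python
import numpy as np

rng = np.random.default_rng(7)

def generate_grm_data(alpha, beta, n):
    """Simulate n response vectors via the data generating equation (2.1):
    Y_ij = sum_k I{A_ij <= alpha_jk + beta_j' Z_i}, with A_ij standard
    logistic and Z_i standard normal.  alpha[j] holds the K_j - 1 decreasing
    intercepts of item j; beta is an (m, r) slope matrix."""
    m, r = beta.shape
    z = rng.standard_normal((n, r))                 # Z_i ~ N(0, I_{r x r})
    y = np.zeros((n, m), dtype=int)
    for j in range(m):
        a = rng.logistic(size=n)                    # A_ij = logit(U_ij)
        thr = np.asarray(alpha[j])[None, :] + (z @ beta[j])[:, None]
        y[:, j] = (a[:, None] <= thr).sum(axis=1)   # count satisfied indicators
    return y

# Two 3-category items, unidimensional latent variable (illustrative values).
alpha = [np.array([1.5, -0.5]), np.array([1.0, 0.0])]
beta = np.array([[1.2], [0.8]])
print(generate_grm_data(alpha, beta, n=5))
```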

The set inverse function for n i.i.d. responses to item j, denoted Y_(j) = (Y_ij)_{i=1}^n, is given

¹ In practice, slopes might be fixed at values other than zero. The theoretical properties discussed in the current work still apply after subtracting the inner product of those fixed slopes and the corresponding normal variates from the A_ij's and substituting its distribution function for the standard logistic cumulative distribution function (cdf).

[Figure 2.1: Set inverse functions for a single response entry Y_ij (left) and five i.i.d. responses (right) for a 3-category graded item. The three axes are the two intercepts and the slope. Colors of the wireframes indicate directions of inequality signs, and arrows point into the corresponding half-spaces. On the left, the purple-colored dashed line gives a boundary of the parameter space, α_j1 = α_j2. On the right, the intersection of all half-spaces is shown as the polytope surrounded by its purple-colored edges and highlighted vertices.]

by the intersection of all individual set inverse functions (Equation 2.2):

$$ Q_j(\mathbf{y}_{(j)}, \mathbf{a}_{(j)}, \mathbf{z}) = \bigcap_{i=1}^{n} Q_{ij}(y_{ij}, a_{ij}, z_i), \qquad (2.3)$$

in which a_(j) = (a_ij)_{i=1}^n and z = (z_i)_{i=1}^n are realizations of the logistic and normal random variables. We take the intersection for a reason similar to that discussed in the binomial proportion example: because the same intercept and slope parameters appear in the data generating equations of all n responses (Y_ij)_{i=1}^n, the set inverse should contain values of those item parameters that are consistent with all the equations. The right panel of Figure 2.1 depicts the set inverse for five responses to the same three-category item: a three-dimensional closed polyhedron is obtained as the intersection of the corresponding half-spaces.
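To see the half-space algebra of Equations 2.2 and 2.3 in action, the sketch below checks whether Q_j(y_(j), a_(j), z) is non-empty for a unidimensional item by posing the inequalities as a feasibility linear program (strict inequalities relaxed by a small epsilon). This is an assumed illustration, not the sampling algorithm developed in Chapter 3.

```python
import numpy as np
from scipy.optimize import linprog

def set_inverse_nonempty(y_j, a_j, z, eps=1e-9):
    """Test non-emptiness of Q_j(y_(j), a_(j), z) (Equation 2.3) for one
    unidimensional K-category item.  theta_j = (alpha_1, ..., alpha_{K-1},
    beta); each response i contributes the half-spaces of Equation 2.7, and
    feasibility of the linear system is checked by an LP with zero objective."""
    K = int(np.max(y_j)) + 1
    q = K                                    # K - 1 intercepts + 1 slope
    A_ub, b_ub = [], []
    for y_i, a_i, z_i in zip(y_j, a_j, z):
        if y_i > 0:                          # a_i <= alpha_{y_i} + beta * z_i
            row = np.zeros(q); row[y_i - 1] = -1.0; row[-1] = -z_i
            A_ub.append(row); b_ub.append(-a_i)
        if y_i < K - 1:                      # alpha_{y_i + 1} + beta * z_i < a_i
            row = np.zeros(q); row[y_i] = 1.0; row[-1] = z_i
            A_ub.append(row); b_ub.append(a_i - eps)
    res = linprog(np.zeros(q), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(None, None)] * q)
    return res.status == 0                   # 0: a feasible point was found

rng = np.random.default_rng(3)
y_j = np.array([0, 1, 1, 2, 2])              # five responses, 3 categories
print(set_inverse_nonempty(y_j, rng.logistic(size=5), rng.standard_normal(5)))
```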

Finally, consider a sample of i.i.d. responses Y = (Y_i)_{i=1}^n = (Y_ij), i = 1, ..., n, j = 1, ..., m, to a test of m graded items. Because we assume that items do not share parameters, the set inverse function for the entire set of item response data

$$ Q(\mathbf{y}, \mathbf{a}, \mathbf{z}) = \bigtimes_{j=1}^{m} Q_j(\mathbf{y}_{(j)}, \mathbf{a}_{(j)}, \mathbf{z}) \qquad (2.4)$$

is a subset of the entire parameter space Θ ⊂ R^{q_1} × ⋯ × R^{q_m}, in which × denotes the Cartesian product. A generalized fiducial distribution can be constructed following the general recipe (Equation 1.6):

$$ v(Q(\mathbf{y}, \mathbf{A}^\star, \mathbf{Z}^\star)) \mid \{Q(\mathbf{y}, \mathbf{A}^\star, \mathbf{Z}^\star) \neq \emptyset\}, \qquad (2.5)$$

in which v(·) denotes some user-defined rule that selects one point from each item's polyhedron, and A* and Z* are i.i.d. copies of A and Z, respectively; again, the asterisks are used to distinguish them from their data-generating counterparts. Note that both A* and Z* are continuous random variables, and thus we do not differentiate Q(y, A*, Z*) from its closure, i.e., the polyhedra with attained boundaries. In the sequel, we call a random variable following the distribution given by Equation 2.5 a generalized fiducial quantity (GFQ), denoted R.

In finite samples, the polyhedron implied by Equation 2.3 is not necessarily bounded; for example, when n ≤ q_j, it is certainly unbounded, because a bounded polytope in q_j-dimensional space has at least q_j + 1 faces. We require that a finite point be returned by the selection rule, i.e., |v(Q(y, a, z))| < ∞ for all y, a, and z such that Q(y, a, z) is non-empty; hence, infinity is not included in the support of the resulting fiducial distribution (Equation 2.5) in unbounded cases. Eventually, the polyhedra become bounded as the sample size tends to infinity; this is in fact a corollary of Theorem 2, which states the stronger property that the diameter of the set inverse shrinks to zero at a fast rate.

In addition, it is plausible that the set inverse (Equation 2.4) touches the boundary of the parameter space imposed by the ordering of the item intercepts, i.e., ∂Θ = {θ : α_jk = α_{j,k+1} for some j and k}. In large samples, however, this almost never happens. In fact, if there exists more than one endorsement of a response category k of item j, e.g., y_1j = y_2j = k, then α_{j,k+1} < min{A*_{1j} − β_j^⊤ Z*_1, A*_{2j} − β_j^⊤ Z*_2} ≤ max{A*_{1j} − β_j^⊤ Z*_1, A*_{2j} − β_j^⊤ Z*_2} ≤ α_jk, with a strict inequality attained almost surely by the continuous nature of the logistic and normal variates. As long as the data-generating values of the item parameters are in the interior of the parameter space, all response patterns happen with positive probability, and thus the set inverse is eventually bounded away from ∂Θ with probability one.

Some extra notation is introduced. Let τ_jk(θ_j, z_i) = α_jk + β_j^⊤ z_i be the linear regression on the latent variable. Also set τ_j0(·, ·) = ∞ and τ_{jK_j}(·, ·) = −∞ by convention; with the help of this notation, we simplify the IRF (Equation 1.1) to

$$ f_j(\theta_j, k \mid z_i) = \frac{1}{1 + e^{-\tau_{jk}(\theta_j, z_i)}} - \frac{1}{1 + e^{-\tau_{j,k+1}(\theta_j, z_i)}}, \qquad (2.6)$$

and the set inverse function (Equation 2.2) to

$$ Q_{ij}(y_{ij}, a_{ij}, z_i) = \big\{\theta_j \in \mathbb{R}^{q_j} : \tau_{j,y_{ij}+1}(\theta_j, z_i) < a_{ij} \le \tau_{j y_{ij}}(\theta_j, z_i)\big\}. \qquad (2.7)$$

Consider y fixed for now. Each vertex of the possibly unbounded R^{q_j}-polyhedron Q_j(y_(j), a_(j), z) residing in the interior of Θ is the solution of a set of q_j linear equations of the form a_ij = τ_jk(θ_j, z_i), contributed by q_j observations and some suitable choices of left/right bounds depending on the responses of those selected observations.² Notationally, let I_j be a size-q_j sub-sample of observations. Also let k_{I_j} = (k_ij)_{i∈I_j} be an index tuple of length q_j, each element of which, k_ij ∈ {y_ij, y_ij + 1}, indicates whether the right half-space a_ij ≤ τ_{j y_ij}(θ_j, z_i) or the left half-space τ_{j, y_ij + 1}(θ_j, z_i) < a_ij is selected for each i ∈ I_j.³ Only a small fraction of (I_j, k_{I_j}) pairs determine a vertex: this happens only when the q_j boundary hyperplanes of the selected half-spaces are finite and produce a non-singular linear system.

² If an observation contributes two equations, then the resulting vertex is on the boundary of Θ. This happens with probability zero for sufficiently large n.

³ Here, I_j is treated as an unordered set, while k_{I_j} is a q_j-tuple, each element of which, i.e., k_ij, uniquely maps onto an element i ∈ I_j.

Suppose a fixed non-empty set inverse function Q_j(y_(j), a_(j), z) has v_j vertices; they can be indexed by a collection of properly selected pairs P_j = {(I_j^(l), k_{I_j^(l)})}_{l=1}^{v_j}. Pooling across all m items in the test, write I = ∪_{j=1}^m I_j, k_I = (k_{I_j})_{j=1}^m, and P = ×_{j=1}^m P_j; P indexes all extremal points of the set inverse Q(y, a, z) for the entire set of item response data.

We first consider selection rules that yield interior extremal points of Q(y, A*, Z*) and are independent of A* and Z*, for which the resulting fiducial density has a closed-form expression. Write the selected point V = (V_j)_{j=1}^m = v(Q(y, A*, Z*)), in which V_j is the selected vertex of polyhedron Q_j(y_(j), A*_(j), Z*). For each item j, let V_{I_j, k_{I_j}} be the solution determined by a particular sub-sample I_j together with a particular combination of left/right bounds k_{I_j}, and E_{I_j, k_{I_j}} be the event that V_{I_j, k_{I_j}} gives an interior vertex of Q_j(y_(j), A*_(j), Z*). Also let V_{I, k_I} = (V_{I_j, k_{I_j}})_{j=1}^m, E_{I, k_I} denote the event that V_{I, k_I} determines an extremal point of Q(y, A*, Z*), and E_P denote the event that all the extremal points are indexed by P. The generic fiducial quantity (Equation 2.5) associated with a selected extremal point of the set inverse function (Equation 2.4) is given in the following lemma; see Appendix A for the derivation, which is similar in spirit to the proofs of Lemmas A.1 and B.1 in Hannig (2013), the first part of Theorem 1 in E et al. (2009), and Lemma 1 in Liu and Hannig (2014).

m ? ? expression. Write the selected point V = (Vj)j=1 = v(Q(y, A , Z )), in which Vj is the selected vertex of polyhedron Q (y , A? , Z?). For each item j, let V be the solution j (j) (j) Ij ,kIj

determined by a particular sub-sample Ij together with a particular combination of left/right bounds k , and E be the event that V gives an interior vertex of Q (y , A? , Z?). Ij Ij ,kIj Ij ,kIj j (j) (j) Also let V = (V )m , E denote the event that V determines an extremal I,kI Ij ,kIj j=1 I,kI Ij ,kIj ? ? point of Q(y, A , Z ), and EP denote the event that all the extremal points are indexed by . The generic fiducial quantity (Equation 2.5) associated with a selected extremal point P of the set inverse function (Equation 2.4) is given in the following lemma; see Appendix A for the derivation, which is similar in spirit to the proof of Lemma A.1 and B.1 in Hannig (2013), the first part of Theorem 1 in E et al. (2009), and Lemma 1 in Liu and Hannig (2014).

Lemma 1. Consider m graded items each of which is characterized by equation 2.6. Let

Θ Rq1 Rqm be the parameter space of item parameters θ, comprising all free item ⊂ × · · · × n intercepts and slopes. We observe i.i.d. ordinal item response data y = (yi)i=1, in which each response category of each item has more than one endorsement. Then, the density of

the GFQ corresponding to a selected extremal point, i.e., V Q(y, A?, Z?) = , can be | { 6 ∅}

21 written as4

X X g (θ y) w k (y) n | ∝ I, I I kI  m τ (θ ,z ) Z Y  Y e jkij j i d k (θ , z ) Ij , Ij j Ij τ (θ ,z ) · nr · jkij j i 2 R j=1  i∈Ij [1 + e ]  Y  fj(θj, yij zi) dΦ(z). (2.8) · c | i∈Ij 

In Equation 2.8, d_{I_j,k_{I_j}}(θ_j, z_{I_j}) = det(∂τ_{j k_ij}(θ_j, z_i)/∂θ_j)_{i∈I_j}, z_{I_j} = (z_i)_{i∈I_j}, Φ(·) denotes a standard normal probability measure⁵, and

$$ w_{I,k_I}(\mathbf{y}) = P\{V = V_{I,k_I} \mid E_{I,k_I}\} \propto \sum_{\mathcal{P} \ni (I, k_I)} P\{V = V_{I,k_I} \mid E_{\mathcal{P}}\}\, P\{E_{\mathcal{P}}\} \qquad (2.9)$$

is contingent upon P{V = V_{I,k_I} | E_P}, i.e., the specific selection rule being used.

Remark 1. The connection between GFI and Bayesian inference can be seen from Equation 2.8. As a general notation, we put an index set in the subscript to denote the corresponding

⁴ The sum on the right-hand side of Equation 2.8 is taken over all combinations of I and k_I. Some (I_j, k_{I_j}) pairs are not able to produce a vertex; in those cases, the Jacobian determinant d_{I_j,k_{I_j}} is zero, and thus the corresponding summand in Equation 2.8 vanishes.

⁵ Here, the dimensionality of the random variable is suppressed for succinctness.

elements: e.g., z_I = (z_i)_{i∈I}. Rewrite Equation 2.8 by splitting the integral into two parts—one for z_I, and the other for z_{I^c}:

$$ g_n(\theta \mid \mathbf{y}) \propto \sum_{I} \sum_{k_I} w_{I,k_I}(\mathbf{y}) \int \prod_{j=1}^{m} d_{I_j,k_{I_j}}(\theta_j, \mathbf{z}_{I_j}) \prod_{i \in I} \Bigg\{ \prod_{j \in J(i)} \frac{e^{\tau_{j k_{ij}}(\theta_j, z_i)}}{\big[1 + e^{\tau_{j k_{ij}}(\theta_j, z_i)}\big]^2} \prod_{j \notin J(i)} f_j(\theta_j, y_{ij} \mid z_i) \Bigg\}\, d\Phi(\mathbf{z}_I) \cdot \int \prod_{i \in I^c} \prod_{j=1}^{m} f_j(\theta_j, y_{ij} \mid z_i)\, d\Phi(\mathbf{z}_{I^c}), \qquad (2.10)$$

in which J(i) = {j : i ∈ I_j} for i ∈ I. Note that the final integral in Equation 2.10 is the marginal likelihood function of the remaining observations I^c. We can multiply and divide the right-hand side of Equation 2.10 by the likelihood of the vertex-determining observations I, and then simplify it to

$$ g_n(\theta \mid \mathbf{y}) \propto b_n(\theta, \mathbf{y})\, f_n(\theta, \mathbf{y}). \qquad (2.11)$$

In Equation 2.11,

$$ f_n(\theta, \mathbf{y}) = \int \prod_{i=1}^{n} \prod_{j=1}^{m} f_j(\theta_j, y_{ij} \mid z_i)\, d\Phi(\mathbf{z}) \qquad (2.12)$$

denotes the complete sample likelihood, and

$$ b_n(\theta, \mathbf{y}) = \sum_{I} \sum_{k_I} w_{I,k_I}(\mathbf{y}) \frac{\displaystyle \int \prod_{j=1}^{m} d_{I_j,k_{I_j}}(\theta_j, \mathbf{z}_{I_j}) \prod_{i \in I} \bigg\{ \prod_{j \in J(i)} \frac{e^{\tau_{j k_{ij}}(\theta_j, z_i)}}{\big[1 + e^{\tau_{j k_{ij}}(\theta_j, z_i)}\big]^2} \prod_{j \notin J(i)} f_j(\theta_j, y_{ij} \mid z_i) \bigg\}\, d\Phi(\mathbf{z}_I)}{\displaystyle \int \prod_{i \in I} \prod_{j=1}^{m} f_j(\theta_j, y_{ij} \mid z_i)\, d\Phi(\mathbf{z}_I)} \qquad (2.13)$$

is a function of both the item parameters and the data. Therefore, our fiducial distribution for the parameters, although obtained from a seemingly unrelated argument, can be conceived of as the (empirical) Bayesian posterior computed from the data-dependent prior proportional to Equation 2.13.

To simplify the proof of our main theorem (Theorem 1), we impose a further restriction on the selection rules: whenever I ≠ I′ but y_I = y_{I′} and k_I = k_{I′}, it is required that w_{I,k_I}(y) = w_{I′,k_{I′}}(y); the common function value is denoted by w_{y_I,k_I}(y). This implies that the number of distinct values of w_{y_I,k_I}(y) does not grow with the sample size, because y_I and k_I have only finitely many patterns. A simple selection rule that satisfies this condition, which is also recommended in actual computation, is to select with equal probability among the interior vertices of each polyhedron Q_j(y_(j), A*_(j), Z*).

The next result implies that the fiducial density (Equation 2.8) is invariant under smooth transformations, similar to the invariance of the Bayesian posterior derived from the Jeffreys prior. This is a desirable property because inference about item parameters remains the same when the model is re-parameterized. For example, researchers may be interested in the alternative slope-difficulty or the standardized loading-threshold parameterizations of the GRM, and inference can be safely drawn about those transformed parameters using the correspondingly transformed generalized fiducial distributions. Definitions of those specific transformations are provided later in Chapter 4.

Proposition 1 (Invariance). Let θ = q(ξ) be a one-to-one and continuously differentiable function onto the parameter space Θ. Denote the data generating equation corresponding to Equation 2.8 by Y = g(θ, A, Z), and write g̃_n(ξ | y) for the generalized fiducial distribution computed from the data generating equation Y = g(q(ξ), A, Z). Then for any measurable set B ⊂ Θ,

$$ \int_B g_n(\theta \mid \mathbf{y})\, d\theta = \int_{q^{-1}(B)} \tilde{g}_n(\xi \mid \mathbf{y})\, d\xi. \qquad (2.14)$$

2.2 A fiducial Bernstein-von Mises theorem

Now we are ready to expound our major theoretical result, a fiducial Bernstein-von Mises theorem, which describes the asymptotic optimality and normality of the fiducial distribution and further implies the large-sample correctness of fiducial interval estimators in the frequentist sense. In this section, we start with the introduction of some notation and a heuristic description of the Bernstein-von Mises phenomenon. Then, we provide the formal statement of the theorem (Theorem 1) based on the fiducial density (Equation 2.8) derived in Lemma 1. We conclude that the result is applicable regardless of the selection rule being used, due to the fact that the diameter of the set inverse function is a higher-order term (Theorem 2).

Some standard notation is needed for the asymptotic theory. The (marginal) multinomial likelihood for each response pattern y_i is expressed as Equation 1.2. Let s(θ, y_i) = ∂ log f(θ, y_i)/∂θ be the single-observation score vector, and H(θ, y_i) = ∂² log f(θ, y_i)/∂θ∂θ^⊤ be the single-observation Hessian matrix. Also define I(θ) = Cov_θ[s(θ, Y_i)], which is usually referred to as the Fisher information matrix. It can be verified by direct calculation that

be the single-observation Hessian matrix. Also define I(θ) = Covθ [s(θ, Yi)] which is usually referred to as the Fisher information matrix. It can be verified by direct calculation that

Eθ [s(θ, Yi)] = 0,

 > I(θ) = Eθ s(θ, Y )s(θ, Y ) = Eθ [H(θ, Y )] . (2.15) i i − i

Also let θ0 be the data-generating parameter values, I0 be a shorthand notation for I(θ0). Loosely speaking, the Bernstein-von Mises phenomenon refers to the fact that in large samples the random probability measure corresponding to the properly centered GFQ, √n(R θ ), “converges” to a , (X, I−1), whose mean X is a ran- − 0 N 0 dom quantity following a normal distribution with zero mean and the efficient covariance

−1 matrix I0 . It can be inferred that a proper central tendency measure (e.g., the median) of the fiducial distribution is asymptotically equivalent to the ML estimator, and that CIs con- structed from the fiducial distribution have the correct frequentist coverage asymptotically. The appropriate mode of convergence involved in the foregoing heuristics is that the total variation distance between the density of √n(R θ ) and (I−1S , I−1) converges to zero − 0 N 0 n 0

25 in Pθ0 -probability, in which the sample score function Sn satisfies

n 1 X d Sn = s(θ0, Yi) (0, I0) (2.16) √n → N i=1 θ=θ0

−1 by the Central Limit Theorem. I0 Sn serves as a finite-sample “centering sequence” for √n(R θ ) in place of the limiting version X in the heuristics. See Appendix B for the − 0 proof of the theorem, which is similar in spirit to Ghosh and Ramamoorthi’s (2003, Theorem 1.4.2) proof of a Bayesian Bernstein-von Mises theorem.

Theorem 1 (Bernstein-von Mises). Suppose that item response data $Y = (Y_i)_{i=1}^n$ are i.i.d. with probability mass function $f(\theta_0, y_i)$. Let $\Theta \subset \mathbb{R}^q$ be the parameter space as usual. Assume that

(i) $m \geq r + 1$;

(ii) for all $\theta, \theta' \in \Theta$ such that $\theta \neq \theta'$, $f_\theta \neq f_{\theta'}$ for some response pattern;

(iii) $\theta_0$ is in the interior of $\Theta$;

(iv) the Fisher information matrix $I_0$ is positive definite.

Let $\bar{g}_n(h \mid y) = g_n(\theta_0 + h/\sqrt{n} \mid y)/\sqrt{n}$ be the fiducial density of $\sqrt{n}(R - \theta_0)$, $H_n$ be the correspondingly rescaled parameter space, and $\phi_{I_0^{-1}S_n,\, I_0^{-1}}$ be the density of $\mathcal{N}(I_0^{-1}S_n, I_0^{-1})$. Then,
$$\int_{H_n} \big|\bar{g}_n(h \mid Y) - \phi_{I_0^{-1}S_n,\, I_0^{-1}}(h)\big|\, dh \xrightarrow{P_{\theta_0}} 0. \tag{2.17}$$

Remark 2. Assumptions (ii) to (iv) are standard regularity conditions for establishing the asymptotic optimality of the ML estimator. Assumption (i) ensures the existence of some neighborhood of $\theta_0$ such that for $\theta$ outside it the likelihood ratio statistic $f_n(\theta, Y)/f_n(\theta_0, Y)$ uniformly goes to zero; this is similar to Assumption (v) in Ghosh and Ramamoorthi (2003). Also, for some choices of $K_j$ and $r$, (i) is implied by (ii).

Remark 3. As remarked in van der Vaart (2000, Section 10.2), the alternative "centering sequence" $\sqrt{n}(\hat{\theta} - \theta_0)$, in which $\hat{\theta}$ is the ML estimator, can be used in place of $I_0^{-1}S_n$ in Equation 2.17, because the latter is a local linear approximation of the former at the true parameter values $\theta_0$ and the two are asymptotically equivalent.

The next theorem dictates that the diameter of the set inverse $Q(Y, A^\star, Z^\star)$ goes to 0 at the rate $1/n$, faster than the rate $1/\sqrt{n}$ at which the fiducial distribution approaches its normal limit. Consequently, different selection rules tend to yield the same inference about model parameters when the sample size is large enough.

Theorem 2. Suppose that assumptions (i)–(iv) of Theorem 1 hold. For any $K > 0$, define
$$\rho_K(y) = P^\star\{\mathrm{diam}\, Q(y, A^\star, Z^\star) > K/n \mid Q(y, A^\star, Z^\star) \neq \emptyset\}, \tag{2.18}$$
in which $P^\star$ denotes the probability measure of the generated variates $A^\star$ and $Z^\star$. Then, for each $\varepsilon > 0$,
$$P_{\theta_0}\{\exists K, N > 0 : \rho_K(Y) < \varepsilon,\ \forall n > N\} \to 1, \tag{2.19}$$
in which $P_{\theta_0}$ denotes the probability measure of $Y$ under the true parameter values $\theta_0$.

Remark 4. We only establish the proof for unidimensional GRMs (i.e., $r = 1$), which is relegated to Appendix C. We conjecture that a similar proof using a more sophisticated geometric argument can be established for multidimensional models.

2.3 Fiducial predictive inference

In this section, we discuss predictive inference using the derived fiducial distribution for item parameters. In the extant literature, prediction is typically defined as making inferences about future observations, or statistics computed from future observations, based on their distributions conditional on the values already observed; such distributions are likely to depend on unknown parameters that need to be estimated from the observed data (e.g., Aitchison and Dunsmore, 1975; Geisser, 1993). In the current discussion, we focus on constructing prediction intervals (PIs) for the target future data/statistics for the purpose of quantifying prediction errors. We provide a general consistency theorem for predictive densities computed from any consistent distributions of model parameters under mild smoothness requirements for the target statistics. As a corollary of the fiducial Bernstein-von Mises theorem, the GFQ is consistent; therefore, its use in predictive inference is justified. An important application of fiducial predictive inference is to obtain PIs for latent variable scores, which quantify the precision of the measurement for the substantive construct(s) of interest. With GFI, a Monte Carlo sample of the predictive distribution for each respondent's response pattern score is available as a by-product of the sampling algorithm described in Chapter 3.

2.3.1 Consistency

The following proposition is applicable to the prediction of any test statistic $T$ whose density function $h(t, \theta_0)$ with respect to some dominating measure $\mu$ depends on the data-generating parameter values $\theta_0$. We claim that the predictive density

$$h_n(t \mid y) = \int_\Theta h(t, \theta)\, g_n(\theta \mid y)\, d\theta \tag{2.20}$$

derived from a consistent distribution defined on the parameter space with density $g_n(\theta \mid y)$ converges in total variation to the target density $h(t, \theta_0)$ in $P_{\theta_0}$-probability, provided that in some small neighborhood of $\theta_0$ the density function $h(t, \theta)$ is continuous and dominated by an integrable function.

Proposition 2 (Predictive consistency). Let $\{g_n(\theta \mid y)\}$ be a consistent sequence of density functions at $\theta_0$ in the sense that, as $n \to \infty$,
$$\int_{\|\theta - \theta_0\| < \delta} g_n(\theta \mid Y)\, d\theta \to 1 \quad \text{in } P_{\theta_0}\text{-probability} \tag{2.21}$$
for all $\delta > 0$. Let $T$ be a statistic having density $h(t, \theta_0)$ with respect to some dominating measure $\mu$. Assume that there exists some neighborhood of $\theta_0$, denoted $N_0$, such that $h(t, \theta)$ is continuous in $\theta$ for each fixed $t$, and that there also exists a measurable function $e(t)$ such that
$$\sup_{\theta \in N_0} h(t, \theta) \leq e(t), \quad \text{and} \quad \int e(t)\, \mu(dt) < \infty. \tag{2.22}$$
Then the predictive density, i.e., Equation 2.20, satisfies
$$\int \big|h_n(t \mid Y) - h(t, \theta_0)\big|\, \mu(dt) \to 0 \quad \text{in } P_{\theta_0}\text{-probability}. \tag{2.23}$$

Remark 5. As a consequence of Theorems 1 and 2, the GFQ defined by Equation 2.5 with any selection rule $v(\cdot)$ satisfies Equation 2.21, and thus can be used for making predictions about the statistic $T$. The proposition can be considered an extension of Theorem 1 in Wang, Hannig, and Iyer (2012).

Remark 6. We need the mild continuity requirement and Equation 2.22 to apply the Dominated Convergence Theorem in the proof, which is rather straightforward and can be found in Appendix D. In applications of the GRM, those conditions are typically easy to check.

Remark 7. Proposition 2 can be used to check the compatibility of the calibrated GRM in independent cross-validation samples. Statistics that are linear combinations of response patterns, e.g., marginal response patterns, are often used for this purpose. In those cases, the continuity of the target density is guaranteed by the continuity of the response pattern likelihood function (Equation 1.2), and Equation 2.22 is trivially satisfied because statistics of this type take only finitely many possible values. However, Proposition 2 cannot be applied to probing the fit to the current data set; intuitively, this is because the current data set has already been used to obtain the predictive distribution, and thus its reuse in fit checking leads to unresolved dependencies. More involved techniques, i.e., the fiducial predictive check introduced in the next section, must be invoked in this scenario.

In practice, the predictive distribution (Equation 2.20) can be conveniently approximated by Monte Carlo simulations, especially when a sample from the consistent distribution $g_n(\theta \mid y)$ is available and the test statistic $T$ is some simple function of the data, $t(Y)$. For easy reference, we provide the generic pseudo-code, i.e., Algorithm 1, for constructing Monte Carlo percentile PIs.

Algorithm 1 Monte Carlo PIs
1: generate $\theta^{(1)}, \ldots, \theta^{(S)}$ from distribution $g_n(\theta \mid y)$
2: for all $s = 1, \ldots, S$ do
3:   if $T = t(Y)$ then
4:     generate $A^\star = a^{(s)}$, $Z^\star = z^{(s)}$
5:     compute $y^{(s)} = g(\theta^{(s)}, a^{(s)}, z^{(s)})$, $t^{(s)} = t(y^{(s)})$
6:   else
7:     generate $T = t^{(s)}$ from $h(t, \theta^{(s)})$
8:   end if
9: end for
10: construct PIs with empirical percentiles of $t^{(1)}, \ldots, t^{(S)}$
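To make the second branch of Algorithm 1 concrete, the following is a minimal Python sketch of the percentile-PI construction; the callables sample_fiducial and sample_statistic are hypothetical stand-ins for a sampler from $g_n(\theta \mid y)$ and from $h(t, \theta)$, respectively, and are not part of the dissertation's Fortran implementation.

```python
import numpy as np

def monte_carlo_pi(sample_fiducial, sample_statistic, S=5000, level=0.95):
    """Monte Carlo percentile prediction interval (sketch of Algorithm 1).

    sample_fiducial(): one draw theta^(s) from g_n(theta | y).
    sample_statistic(theta): one draw t^(s) from h(t, theta).
    """
    draws = np.empty(S)
    for s in range(S):
        theta = sample_fiducial()            # line 1: theta^(s) ~ g_n(. | y)
        draws[s] = sample_statistic(theta)   # line 7: t^(s) ~ h(t, theta^(s))
    alpha = 1.0 - level
    # line 10: equi-tailed percentile PI from the empirical distribution
    return np.quantile(draws, [alpha / 2.0, 1.0 - alpha / 2.0])
```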

2.3.2 Example: Response pattern scoring

Next, we discuss the use of fiducial predictive inference in the interval estimation of response pattern scores. When the items are already calibrated and the parameters are considered known, inference about an individual pattern score $z_i$ is a Bayesian problem and can be obtained from the posterior density of $Z_i$ given $y_i$:
$$p(z_i, \theta \mid y_i) \propto \phi(z_i) \prod_{j=1}^m f_j(\theta_j, y_{ij} \mid z_i) \tag{2.24}$$
evaluated at $\theta_0$, in which the standard normal density $\phi(\cdot)$ serves as the prior density of this Bayesian problem. Based on Equation 2.24, the posterior mean is typically used as a point estimate of $z_i$ and often referred to as the expected a posteriori (EAP) score; interval estimates of $z_i$ can be constructed by numerically computing the quantiles of the posterior or by a normal approximation using the EAP score and the posterior standard deviation. The resulting posterior intervals are asymptotically normal and efficient as a consequence of the Bayesian Bernstein-von Mises theorem (e.g., Le Cam & Yang, 1986; van der Vaart, 2000), given true item parameter values satisfying some mild conditions. In the situation in which item parameters need to be simultaneously estimated from the data, we resort to making predictive inference about the posterior (Equation 2.24) by substituting $p(z_i, \theta \mid y_i)$ for $h(t, \theta)$ in Equation 2.20.

In order to use Proposition 2, the local behavior of the posterior density $p(z_i, \theta \mid y_i)$ in the neighborhood of $\theta_0$ must be checked. The continuity of the posterior density with respect to $\theta$ is obvious from Expression 2.24. Condition 2.22 is also satisfied due to the following facts: a) $f_j(\theta_j, y_{ij} \mid z_i) \leq 1$; b) the standard normal density is integrable; and c) there are only finitely many patterns of $y_i$ for a fixed-length test, and thus the likelihood function values in a neighborhood of $\theta_0$ are bounded from below. Then Proposition 2 guarantees the asymptotic correctness of the corresponding predictive inference using Equation 2.20.

Next, we claim that the conditional distribution $Z^\star_i \mid \{Q(Y, A^\star, Z^\star) \neq \emptyset\}$ is asymptotically equivalent to the fiducial predictive distribution (Equation 2.20), given fixed $Y_i = y_i$. The posterior distribution (Equation 2.24) can be alternatively interpreted as the marginal distribution of $Z^\star_i$ when $Z^\star_i$ and $A^\star_i$ are generated such that $\theta_j^\top \tilde{Z}^\star_{ij,k+1} \leq A^\star_{ij} \leq \theta_j^\top \tilde{Z}^\star_{ijk}$ for all $j$, in which $\tilde{Z}^\star_{ijk}$ is the random version of $\tilde{z}_{ijk}$ as it appears in Equation 2.8. Denote by a subscript $-i$ the components corresponding to all but the $i$th observation. Then $Z^\star_i \mid \{Q(Y, A^\star, Z^\star) \neq \emptyset\}$ is in fact the marginal distribution of $Z^\star_i$ conditional on the event that there exists some $\theta = (\theta_j)_{j=1}^m \in Q(Y_{-i}, A^\star_{-i}, Z^\star_{-i})$ such that $\theta_j^\top \tilde{Z}^\star_{ij,k+1} \leq A^\star_{ij} \leq \theta_j^\top \tilde{Z}^\star_{ijk}$ for each $j$. The asymptotic equivalence follows as a corollary of Theorems 1 and 2. As a result, for observed response patterns, PIs for the corresponding scores can be obtained along with sampling from the fiducial distribution of item parameters (the detailed algorithm is relegated to Chapter 3). For patterns not present in the calibration sample, however, extra Monte Carlo simulations using Algorithm 1 are necessary.

The joint consistency or asymptotic normality of the fiducial distribution in estimating $\theta$ and $z$ as both $n$ and $m$ tend to infinity is beyond the scope of the current work. We conjecture that a more general fiducial Bernstein-von Mises theorem for item parameters can be established as the dimension of the parameter space grows to infinity at a slow enough rate, and that consequently interval estimators for both item parameters and response pattern scores are asymptotically correct.
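For a concrete picture of Equation 2.24 with known item parameters, here is a small Python sketch that computes the EAP score and posterior standard deviation for one response pattern by quadrature under a unidimensional logistic GRM; the decreasing-intercept convention follows Equation 3.8, and the function names are illustrative rather than taken from the dissertation's code.

```python
import numpy as np

def grm_category_probs(alpha, beta, z):
    """P(Y_j = k | z) for k = 0, ..., K_j - 1 under the logistic GRM.

    alpha: decreasing intercepts (alpha_j1 > ... > alpha_j,K-1);
    beta: scalar slope; z: 1-d array of latent values.
    """
    a = np.concatenate(([np.inf], np.asarray(alpha, float), [-np.inf]))
    cum = 1.0 / (1.0 + np.exp(-(a[:, None] + beta * z[None, :])))  # logistic cdfs
    return cum[:-1] - cum[1:]          # row k: P(Y = k | z)

def eap_score(pattern, alphas, betas, n_quad=49):
    """EAP score and posterior SD for one response pattern (Equation 2.24)."""
    z = np.linspace(-5.0, 5.0, n_quad)
    post = np.exp(-0.5 * z ** 2)       # standard normal prior, unnormalized
    for y, alpha, beta in zip(pattern, alphas, betas):
        post = post * grm_category_probs(alpha, beta, z)[y]
    post = post / post.sum()
    eap = float(z @ post)
    sd = float(np.sqrt(((z - eap) ** 2) @ post))
    return eap, sd
```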

2.4 Goodness of fit testing with a fiducial predictive check (FPC)

As mentioned in the previous section, we are tempted to check the compatibility of the fitted GRM to the observed item response data using a suitable test statistic $T = t(Y)$ via predictive simulations. In particular, we are interested in approximating the predictive p-value
$$p(y) = \int P_\theta\{t(Y) > t(y)\}\, g_n(\theta \mid y)\, d\theta \tag{2.25}$$
after observing $Y = y$, in which $P_\theta\{\cdot\}$ highlights the probability calculation under parameter values $\theta$. We call the resulting procedure a fiducial predictive check (FPC), inspired by the posterior predictive check in Bayesian statistics. The pseudo-code is provided as Algorithm 2, the structure of which closely resembles Algorithm 1. Note that the formula for a one-tailed empirical p-value is given in Line 7 of Algorithm 2; whenever desired, two-tailed p-values can be obtained in a similar manner.

Algorithm 2 Fiducial predictive check (FPC)
1: generate $\theta^{(1)}, \ldots, \theta^{(S)}$ from distribution $g_n(\theta \mid y)$
2: for all $s = 1, \ldots, S$ do
3:   generate $A^\star = a^{(s)}$, $Z^\star = z^{(s)}$
4:   compute $y^{(s)} = g(\theta^{(s)}, a^{(s)}, z^{(s)})$, $t^{(s)} = t(y^{(s)})$
5: end for
6: compute the observed statistic $t = t(y)$
7: compute the empirical p-value $\hat{p}(y) = S^{-1} \sum_{s=1}^S I\{t^{(s)} > t\}$
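A minimal Python sketch of Algorithm 2 follows; sample_fiducial and generate_data are hypothetical callables for drawing from $g_n(\theta \mid y)$ and simulating data through the data generating equation, and statistic is any scalar test statistic $t(\cdot)$.

```python
import numpy as np

def fpc_pvalue(sample_fiducial, generate_data, statistic, y_obs, S=5000):
    """Fiducial predictive check (sketch of Algorithm 2)."""
    t_obs = statistic(y_obs)              # line 6: observed statistic
    t_rep = np.empty(S)
    for s in range(S):
        theta = sample_fiducial()         # line 1: theta^(s) ~ g_n(. | y)
        y_rep = generate_data(theta)      # lines 3-4: y^(s) = g(theta^(s), a^(s), z^(s))
        t_rep[s] = statistic(y_rep)
    return float(np.mean(t_rep > t_obs))  # line 7: one-tailed empirical p-value
```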

As pointed out by Bayarri and Berger (2000) and Robins, van der Vaart, and Ventura (2000) in the context of the Bayesian posterior predictive check, the p-value calculated from Equation 2.25 is not always asymptotically uniform, because the observed data $y$ are effectively used both in computing the statistic and in obtaining its predictive distribution. There have been philosophical disputes among Bayesian statisticians about the necessity for a posterior p-value to be uniform (see, e.g., Bayarri and Berger, 1999; Gelman, 2013); however, FPC is treated as a frequentist method in the current work, so we consider asymptotically uniform p-values desirable. Two tweaks for Algorithm 2 are introduced next, both of which

are based on the theoretical discussion traced to Robins et al. (2000).

2.4.1 The centering approach

Robins et al. considered a family of asymptotically normal test statistics $t(Y)$, and concluded that the predictive p-value (Equation 2.25) is uniform if and only if a) the density $g_n(\theta \mid y)$ satisfies a Bernstein-von Mises theorem, b) the asymptotic mean of $t(Y)$ under a correctly specified model, denoted $\nu(\theta)$, is constant in $\theta$, and c) $\partial\nu(\theta)/\partial\theta|_{\theta=\theta_0}$ gives the asymptotic covariance of $\sqrt{n}[t(Y) - \nu(\theta_0)]$ and the sample score function $S_n$. For checking the fit of the GRM, a) follows from Theorems 1 and 2. When $\nu(\theta)$ is non-constant but continuous in $\theta$, a simple workaround to fulfill b) is to use the centered statistic $\hat{t}(Y) = t(Y) - \nu(\tilde{\theta}(Y))$, in which $Y$ is generated by parameter values $\theta$, and $\tilde{\theta}(Y)$ is some asymptotically normal and consistent estimator of $\theta$ computed from $Y$. As for practical choices of point estimators, Robins et al. (2000) suggest the ML estimator or a one-step Newton-Raphson approximation starting from the point estimates of the observed data. For the GRM, one could alternatively use the computationally less demanding weighted least squares estimators (e.g., Muthén, 1978; Gunsjö, 1994; Maydeu-Olivares, 2006). Finally, c) is guaranteed for our choices of test statistics, which is derived in Appendix E.

2.4.2 The partial predictive approach

This approach is based on Bayarri and Berger's (2000) partial posterior predictive p-value, which removes the dependency caused by a double use of the observed data by replacing $g_n(\theta \mid y)$ in Equation 2.25 with a conditional version:
$$g_n(\theta \mid t, y) \propto \frac{g_n(\theta \mid y)}{f_T(\theta, t)}, \tag{2.26}$$

in which $f_T(\theta, t)$ is the density/likelihood function of the test statistic $T$ evaluated at its observed value $t$. Intuitively, the use of Equation 2.26 partials out the information of $T = t$ when constructing the predictive distribution, as though the analyses were based upon the corresponding conditional model. Meanwhile, conditional on $T = t$, the resulting partial predictive p-value is subject to a usual predictive interpretation as discussed in the previous section. Robins et al. (2000) established the asymptotic uniformity of the partial predictive p-value, assuming a Bernstein-von Mises theorem holds for the conditional model. In the current work, we leave the rigorous proof of the asymptotic normality of the conditional model as a topic for future investigation, and only discuss approximating the partial predictive p-value using Monte Carlo simulations.

Taking advantage of Equation 2.26, i.e., the relationship between $g_n(\theta \mid t, y)$ and $g_n(\theta \mid y)$, we could modify Algorithm 2 to adopt the technique of importance sampling (Bayarri and Berger, 2000, Section 2.3), instead of implementing separately a direct Monte Carlo computation from $g_n(\theta \mid t, y)$. By the importance sampling identity, the partial predictive p-value, denoted $p^\star(y)$, can be re-written as

$$p^\star(y) = \int P_\theta\{t(Y) > t(y)\}\, g_n(\theta \mid t, y)\, d\theta = \int P_\theta\{t(Y) > t(y)\}\, \frac{g_n(\theta \mid t, y)}{g_n(\theta \mid y)}\, g_n(\theta \mid y)\, d\theta \propto \int \frac{P_\theta\{t(Y) > t(y)\}}{f_T(\theta, t)}\, g_n(\theta \mid y)\, d\theta. \tag{2.27}$$

Let $w(\theta, t) = 1/f_T(\theta, t)$ be the sampling weight. We can accordingly modify the empirical p-value calculation, i.e., Line 7 of Algorithm 2, to

$$\hat{p}^\star(y) = \frac{\sum_{s=1}^S w(\theta^{(s)}, t)\, I\{t^{(s)} > t\}}{\sum_{s=1}^S w(\theta^{(s)}, t)}. \tag{2.28}$$
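The following Python sketch illustrates the weighted p-value of Equation 2.28, together with the effective sample size diagnostic of Equation 2.29 introduced below for monitoring weight degeneracy; the inputs are assumed to come from a run of Algorithm 2.

```python
import numpy as np

def partial_predictive_pvalue(t_rep, t_obs, weights):
    """Importance-weighted p-value (Equation 2.28) and ESS (Equation 2.29).

    t_rep:   replicated statistics t^(1), ..., t^(S) from Algorithm 2.
    weights: sampling weights w(theta^(s), t) = 1 / f_T(theta^(s), t).
    """
    w = np.asarray(weights, dtype=float)
    t_rep = np.asarray(t_rep, dtype=float)
    p_hat = np.sum(w * (t_rep > t_obs)) / np.sum(w)  # Equation 2.28
    ess = np.sum(w) ** 2 / np.sum(w ** 2)            # Equation 2.29
    return p_hat, ess
```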

This importance sampling scheme is favored for the reason that the original Monte Carlo sample from the fiducial distribution can be re-used for every test statistic $T$ of interest; however, it has two significant drawbacks. First, it requires evaluating the density function $f_T(\theta, t)$, which can be challenging or even numerically impossible for certain choices of $T$. This is the case for our choice of bivariate fit diagnostics, and we resort to a normal approximation of the density function as a workaround. Second, it is well-known that when the proposal density ($g_n(\theta \mid y)$) is dissimilar to the target one ($g_n(\theta \mid t, y)$), a few draws may vastly outweigh the rest (i.e., possess a much larger weight $w(\theta, t)$), which consequently results in a sharp increase in approximation error. The degeneracy of sampling weights can be monitored by the effective sample size (ESS):

$$\hat{S}_e = \frac{\big[\sum_{s=1}^S w(\theta^{(s)}, t)\big]^2}{\sum_{s=1}^S w(\theta^{(s)}, t)^2}. \tag{2.29}$$

The reasoning behind Equation 2.29 is that the variance of an unweighted average of $\hat{S}_e$ i.i.d. random variables is identical to that of the weighted average of $S$ of them.

2.4.3 Choice of test statistics

We now discuss the choice of test statistics for testing sum-score-profile and bivariate fit for the GRM.

Fit to the sum-score profile. Following Sinharay et al. (2006), we consider assessing model fit to the observed sum-score distribution (see also Ferrando and Lorenzo-Seva, 2001; Hambleton and Han, 2004; and Haberman and Sinharay, 2013). At each sum-score level $l$, $l = 0, \ldots, \sum_{j=1}^m K_j - m$, the observed proportion of this particular level is used as the test statistic:
$$T_l = t_l(y) = \frac{1}{n} \sum_{i=1}^n I\{\mathbf{1}^\top y_i = l\}. \tag{2.30}$$

When the model is correctly specified, the mean of $T_l$, denoted $\nu_l(\theta)$, is the model-implied proportion at the sum-score level $l$:
$$\nu_l(\theta) = \sum_{y_i} I\{\mathbf{1}^\top y_i = l\}\, f(\theta, y_i). \tag{2.31}$$

Expression 2.31 is directly involved in the centering approach, and also in the calculation of the likelihood of $T_l$:
$$f_{T_l}(\theta, t) \propto \nu_l(\theta)^{nt}\, [1 - \nu_l(\theta)]^{n - nt}, \tag{2.32}$$
the reciprocal of which serves as the sampling weight in the partial predictive approach. In practice, $\nu_l(\theta)$ can be efficiently computed using the Lord-Wingersky recursive algorithm

(Lord and Wingersky, 1984; Thissen, Pommerich, Billeaud, and Williams, 1995); a sketch of the recursion is given below. When the sample size is small and/or the number of items is large, examining the entire sum-score profile may not be feasible. In this case, we resort to conveniently constructed score groups (e.g., equally spaced across the entire range), and compute observed proportions in each group. The corresponding mean and likelihood of the test statistics have expressions similar to Equations 2.31 and 2.32.
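The following is a minimal Python sketch of the Lord-Wingersky recursion under stated assumptions: each entry of prob_list holds one item's category probabilities at a fixed latent value $z$, and the model-implied proportions $\nu_l(\theta)$ of Equation 2.31 are then obtained by averaging the returned conditional distribution over quadrature points with standard normal weights.

```python
import numpy as np

def sum_score_distribution(prob_list):
    """Lord-Wingersky recursion: P(sum score = l | z), l = 0, ..., sum_j (K_j - 1).

    prob_list: list of 1-d arrays; prob_list[j][k] = P(Y_j = k | z).
    """
    dist = np.array([1.0])                       # before any item, the score is 0
    for p in prob_list:
        new = np.zeros(dist.size + p.size - 1)   # convolve with the next item
        for k, p_k in enumerate(p):
            new[k:k + dist.size] += p_k * dist
        dist = new
    return dist
```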

Fit to bivariate margins. For a pair of items $j$ and $k$, the marginal lack of fit of the GRM can be revealed by the bivariate cross-product statistic (e.g., Liu and Maydeu-Olivares, 2014):
$$T_{jk} = t_{jk}(y) = \frac{1}{n} \sum_{i=1}^n y_{ij}\, y_{ik}, \tag{2.33}$$
which is grounded in a rationale similar to that of calculating the Spearman correlation for ranked bivariate data. The mean of $T_{jk}$ under a correctly specified model can be written as

$$\nu_{jk} = \sum_{y_i} y_{ij}\, y_{ik}\, f(\theta, y_i). \tag{2.34}$$

The likelihood function of $T_{jk}$ is not easily computed in practice. A normal approximation to the likelihood is derived in Appendix E using the standard asymptotic normality result for i.i.d. multinomial random variables.
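For completeness, here is a tiny Python illustration of the observed statistic in Equation 2.33; the (n, m) response matrix layout is an assumption made for the example.

```python
import numpy as np

def cross_product_stat(Y, j, k):
    """Observed bivariate cross-product statistic T_jk (Equation 2.33).

    Y: (n, m) integer array of graded responses coded 0, ..., K_j - 1.
    """
    return float(np.mean(Y[:, j] * Y[:, k]))
```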

CHAPTER 3: COMPUTATION

In line with the general discussion in Chapter 1, sampling from the fiducial distribution (Equation 2.5) is isomorphic to truncated sampling of the random components $A^\star$ and $Z^\star$. A Gibbs sampler is developed in this chapter to produce a Markov chain that, by the general theory of Gibbs sampling, approaches the equilibrium given by the target fiducial distribution. We first introduce the general structure of the algorithm, followed by computational details involved at each stage. Some tuning aspects, i.e., choosing starting values and avoiding heavy-tailedness, are discussed next. The computational time needed for various combinations of sample sizes and test lengths is summarized in the end.

3.1 General structure

Throughout this chapter, consider the data $y$ fixed. Recall that the generalized fiducial distribution (Equation 2.5) is determined by the distribution of $A^\star$ and $Z^\star$ truncated to the set $\{Q(y, A^\star, Z^\star) \neq \emptyset\}$. Algorithm 3 defines a Gibbs sampler for the truncated sampling of $A^\star$ and $Z^\star$. Starting from $A^\star = a^{(0)}$, $Z^\star = z^{(0)}$, and a large bounding box on the parameter space (see the later discussion of starting the algorithm), the algorithm at each cycle updates sequentially each component of $A^\star$ and $Z^\star$ conditional on the current values of all other variates and the key restriction $Q(y, A^\star, Z^\star) \neq \emptyset$. The representation of the implied set inverse must be updated after each conditional sampling step, in order to yield the desirable truncation at the next conditional sampling step. As the number of cycles tends to infinity, the generated Markov chain is stable around the joint distribution of $A^\star, Z^\star \mid Q(y, A^\star, Z^\star)$. At the end of each cycle, an extremal point of the updated set inverse is selected, which is regarded as (approximately) an instance of a quantity having the generalized fiducial distribution.

Algorithm 3 A Gibbs sampler
1: Starting values: $A^\star = a^{(0)}$ and $Z^\star = z^{(0)}$ (Algorithm 7)
2: for cycles $s = 1, \ldots, S$ do
3:   for observations $i = 1, \ldots, n$ do
4:     for items $j = 1, \ldots, m$ do
5:       Unlink observation $i$ from the interior polytope $Q_j(y_{(j)}, a^{(s-1)}_{(j)}, z^{(s-1)})$
6:     end for
7:     for dimensions $d = 1, \ldots, r$ do
8:       Update $Z^\star_{id} = z^{(s)}_{id}$ (Algorithm 5)
9:     end for
10:    for items $j = 1, \ldots, m$ do
11:      Update $A^\star_{ij} = a^{(s)}_{ij}$ (Algorithm 4)
12:      Update the $j$th polytope (Algorithm 6)
13:    end for
14:  end for
15:  for items $j = 1, \ldots, m$ do
16:    Select with equal probability a vertex of $Q_j(y_{(j)}, a^{(s)}_{(j)}, z^{(s)})$
17:  end for
18: end for

Remark 8. For each $i$, we need the operation of Line 5 in Algorithm 3 prior to executing any updating steps for this particular observation. When neither half-space given by observation $i$ is interior, no extra computation is needed there. When $i$ constitutes the interior polytope, however, the unlinking step is computationally challenging. Currently, Line 5 is achieved by intersecting the initial bounding box with the half-spaces for all but the $i$th observation (i.e., repeatedly running Algorithm 6). Fortunately, we only need to run the unlinking once for each combination of $i$ and $j$.

3.2 Conditional sampling steps

3.2.1 Conditional sampling of $A^\star_{ij}$

Fix observation $i$ and item $j$. The goal of this step is to obtain an updated $A^\star_{ij}$ such that the implied half-spaces have a non-empty intersection with the interior polytope determined by all but the $i$th observation evaluated at the current values of the corresponding random components; the latter is readily available from Line 5 of Algorithm 3. Here, we only describe the case when a middle category on the response scale is selected, i.e., $0 < y_{ij} < K_j - 1$. The workaround we implement to reduce the impact of a heavy-tailed fiducial distribution (see the later discussion) amounts to augmenting the actual response scale with two "phantom" extreme categories that have no endorsement in the observed data. This effectively renders every actual response option a middle category, and thus the discussion here suffices.

For notational convenience, we use a superscript 0 to highlight the dependency solely on the current values of random components at this particular sampling step, and a superscript 1 to denote the involvement of the updated one. Let $a^0_{-i(j)}$ and $z^0_{-i}$ be the current values of the logistic and normal variates without the $i$th observation, $a^1_{(j)}$ be the logistic variates including the updated $i$th component, and $y_{-i(j)} = (y_{i'j})_{i' \neq i}$ be the corresponding item responses to the same item. The updated value $a^1_{ij}$ should yield a non-empty polytope: i.e.,

$$Q_j(y_{(j)}, a^1_{(j)}, z^0) = Q_j(y_{-i(j)}, a^0_{-i(j)}, z^0) \cap Q_j(y_{ij}, a^1_{ij}, z^0_i) \neq \emptyset. \tag{3.1}$$

Denote by $V^0_{-ij}$ the collection of vertices of $Q_j(y_{-i(j)}, a^0_{-i(j)}, z^0)$. Due to convexity, Equation 3.1 is further identical to the existence of at least one element of $V^0_{-ij}$ being consistent with each of the two updated half-spaces: i.e.,
$$a^1_{ij} > \tau_{j,k+1}(\theta_j, z^0_i) \tag{3.2}$$
for some $\theta_j \in V^0_{-ij}$, and
$$a^1_{ij} \leq \tau_{jk}(\theta_j, z^0_i) \tag{3.3}$$
for some $\theta_j \in V^0_{-ij}$ as well. It follows that $A^\star_{ij} = a^1_{ij}$ should be generated from a standard logistic distribution truncated to $[\min_{\theta_j \in V^0_{-ij}} \tau_{j,k+1}(\theta_j, z^0_i),\ \max_{\theta_j \in V^0_{-ij}} \tau_{jk}(\theta_j, z^0_i)]$. An implementation of the updating step is described in Algorithm 4.

Algorithm 4 Updating $A^\star_{ij}$
1: set $m = \infty$ and $M = -\infty$
2: for $\theta_j \in V^0_{-ij}$ do
3:   compute $m_1 = \tau_{j,k+1}(\theta_j, z^0_i)$
4:   if $m_1 < m$ then
5:     $m = m_1$
6:   end if
7:   compute $m_2 = \tau_{jk}(\theta_j, z^0_i)$
8:   if $m_2 > M$ then
9:     $M = m_2$
10:  end if
11: end for
12: generate $A^\star_{ij} = a^1_{ij}$ from the logistic distribution truncated to $[m, M]$

Remark 9. Samples from truncated logistic distributions (Line 12) are obtained by an implementation of a slice sampler (Neal, 2003), which is by itself an MCMC algorithm; five cycles are performed for each call of the sampler, which appears to behave well in a pilot study. We also found that slice sampling outperforms the inverse cumulative distribution function (cdf) approach when the truncation bounds are extreme.
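As an illustration of the slice-sampling step in Line 12 of Algorithm 4, here is a minimal Python sketch for a standard logistic density truncated to $[m, M]$; it exploits the analytic level set of the logistic density, so each slice is exact. This is an illustrative reconstruction, not the dissertation's Fortran implementation.

```python
import numpy as np

def truncated_logistic_slice(m, M, x0=None, cycles=5, rng=None):
    """Slice sampler for a standard logistic density truncated to [m, M].

    The logistic density f(x) = p(1 - p) with p = sigmoid(x) has the
    analytic level set {x : f(x) >= u} = [logit(1/2 - root), logit(1/2 + root)],
    where root = sqrt(1/4 - u) solves p(1 - p) = u.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = 0.5 * (m + M) if x0 is None else x0     # any point inside [m, M] works
    for _ in range(cycles):
        p = 1.0 / (1.0 + np.exp(-x))
        u = rng.uniform(0.0, p * (1.0 - p))     # vertical slice under f(x)
        root = np.sqrt(0.25 - u)
        lo = np.log(0.5 - root) - np.log(0.5 + root)  # logit(1/2 - root)
        x = rng.uniform(max(lo, m), min(-lo, M))      # uniform on slice within [m, M]
    return x
```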

3.2.2 Conditional sampling of $Z^\star_{id}$

The conditional sampling of $Z^\star_{id}$ is slightly more involved than that of $A^\star_{ij}$, because a single $Z^\star_{id}$ is linked to multiple polytopes belonging to the items loading on latent dimension $d$. $Z^\star_{id} = z^1_{id}$ should be sampled from a suitably truncated standard normal distribution, ensuring for each associated item that the updated interior polytope is not empty.

Fix $i$ and $d$. Let $z^0_{i,-d} = (z^0_{ie})_{e \neq d}$ be the current values of all but the $d$th dimension of the normal variates, and $\theta_{j,-d}$ be the item parameters without the $d$th slope. Also write
$$\tau^d_{jk}(\theta_{j,-d}, z^0_{i,-d}) = \alpha_{jk} + \sum_{e \neq d} \beta_{je}\, z^0_{ie}. \tag{3.4}$$

For all items $j$ loading on dimension $d$, the updated value $z^1_{id}$ should satisfy
$$\beta_{jd}\, z^1_{id} < a^0_{ij} - \tau^d_{j,k+1}(\theta_{j,-d}, z^0_{i,-d}) \tag{3.5}$$

for some $\theta_j = (\beta_{jd}\ \theta_{j,-d}^\top)^\top \in V^0_{-ij}$, and
$$\beta_{jd}\, z^1_{id} \geq a^0_{ij} - \tau^d_{jk}(\theta_{j,-d}, z^0_{i,-d}) \tag{3.6}$$

for some $\theta_j = (\beta_{jd}\ \theta_{j,-d}^\top)^\top \in V^0_{-ij}$ as well. Let $J_d$ be the collection of items that are associated with $Z^\star_{id}$; Equations 3.5 and 3.6 together yield the desirable truncation:
$$Z^\star_{id} \in \bigcap_{j \in J_d} \Bigg[ \bigg( \bigcup_{\theta_j \in V^0_{-ij}} \big\{z^1_{id} : \beta_{jd} z^1_{id} < a^0_{ij} - \tau^d_{j,k+1}(\theta_{j,-d}, z^0_{i,-d})\big\} \bigg) \cap \bigg( \bigcup_{\theta_j \in V^0_{-ij}} \big\{z^1_{id} : \beta_{jd} z^1_{id} \geq a^0_{ij} - \tau^d_{jk}(\theta_{j,-d}, z^0_{i,-d})\big\} \bigg) \Bigg]. \tag{3.7}$$

Both Equations 3.5 and 3.6 define one-sided intervals for $z^1_{id}$, the direction of which is contingent upon the sign of $\beta_{jd}$ for each vertex in $V^0_{-ij}$. As a consequence, Equation 3.7 might be an interval or a union of disjoint intervals. The foregoing updating mechanism is summarized as Algorithm 5.

Algorithm 5 Updating $Z^\star_{id}$
1: set $T = (-\infty, \infty)$
2: for items $j = 1, \ldots, m$ do
3:   if $\beta_{jd}$ is fixed to 0 then
4:     cycle the item loop
5:   else
6:     set $T_j = \emptyset$
7:     for $\theta_j \in V^0_{-ij}$ do
8:       if $\beta_{jd} = 0$ then
9:         cycle the vertex loop
10:      else
11:        compute $m_1 = [a^0_{ij} - \tau^d_{jk}(\theta_{j,-d}, z^0_{i,-d})]/\beta_{jd}$
12:        compute $m_2 = [a^0_{ij} - \tau^d_{j,k+1}(\theta_{j,-d}, z^0_{i,-d})]/\beta_{jd}$
13:        if $\beta_{jd} > 0$ then
14:          update $T_j = T_j \cup [m_1, m_2]$
15:        else
16:          update $T_j = T_j \cup [m_2, m_1]$
17:        end if
18:      end if
19:    end for
20:  end if
21:  update $T = T \cap T_j$
22: end for
23: generate $Z^\star_{id} = z^1_{id}$ from the standard normal distribution truncated to $T$

Remark 10. Again, the technique of slice sampling is used in Line 23 of Algorithm 5. As mentioned earlier, the truncation T can be either a bounded interval, or a disjoint union of bounded intervals. In the latter case, the sampling is done in three steps: a) computing probabilities of the intervals under a standard normal distribution and normalizing to a total sum of one; b) randomly selecting an interval with probabilities computed in step a); c) slice sampling on the selected interval.
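A small Python sketch of sampling a standard normal variate restricted to a disjoint union of intervals, following steps a)–c) of Remark 10; for brevity, step c) here uses the inverse-cdf method within the selected interval rather than the slice sampler used in the dissertation's implementation.

```python
import numpy as np
from scipy.stats import norm

def truncated_normal_union(intervals, rng=None):
    """Draw from a standard normal truncated to a union of disjoint intervals.

    intervals: list of (lo, hi) pairs with lo < hi.
    """
    rng = np.random.default_rng() if rng is None else rng
    lo = np.array([a for a, _ in intervals])
    hi = np.array([b for _, b in intervals])
    probs = norm.cdf(hi) - norm.cdf(lo)       # step a): interval probabilities
    probs = probs / probs.sum()               #          normalized to sum to one
    j = rng.choice(len(intervals), p=probs)   # step b): select an interval
    u = rng.uniform(norm.cdf(lo[j]), norm.cdf(hi[j]))
    return norm.ppf(u)                        # step c): draw within the interval
```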

3.3 Updating interior polytopes

Inside the observation loop of Algorithm 3, all interior polytopes need to be renewed after the logistic and normal variates are updated. This is geometrically a polytope-cutting problem: cutting the old polytope formed by the rest of the observations, i.e., $Q_j(y_{-i(j)}, a^0_{-i(j)}, z^0)$, by the two new half-spaces $\tau_{j,k+1}(\theta_j, z^1_i) < a^1_{ij}$ and $\tau_{jk}(\theta_j, z^1_i) \geq a^1_{ij}$; the resulting intersection is certainly non-empty due to the truncation enforced for the $A^\star_{ij}$'s and $Z^\star_{id}$'s.

The updating algorithm requires an effective representation of the $\mathbb{R}^{q_j}$-polytope $Q_j(y_{-i(j)}, a^0_{-i(j)}, z^0)$ for each item $j$. It is well-known that a convex polytope is uniquely determined by its vertices; for our problem, it suffices to record $V^0_{-ij}$. With a slight abuse of notation, we now consider $V^0_{-ij}$ as a set of doublets $\mathcal{V} = (\theta_j, I)$, in which $I$ indexes the observations that are used to solve for $\theta_j$. If a half-space, say $\tau_{jk}(\theta_j, z^1_i) \geq a^1_{ij}$, is known to cut the polytope, vertices in $V^0_{-ij}$ can be partitioned into two groups by whether or not they are consistent with the cutting half-space: those satisfying $\tau_{jk}(\theta_j, z^1_i) \geq a^1_{ij}$ are still feasible, and the rest become infeasible.

In addition to vertices, we also track the edges of the old polytope, denoted $E^0_{-ij}$. Each edge connects a pair of vertices sharing $q_j - 1$ observations, i.e., $\mathcal{E} = (\mathcal{V}, \mathcal{V}')$, in which $\mathcal{V} = (\theta_j, I)$, $\mathcal{V}' = (\theta'_j, I')$, and $|I \cap I'| = q_j - 1$; in other words, the shared observations determine this particular edge. One advantage of keeping the edges is that new vertices introduced by the cutting half-space can be easily obtained: an edge together with the cutting hyperplane $\tau_{jk}(\theta_j, z^1_i) = a^1_{ij}$ produces a vertex if and only if the edge connects a feasible-infeasible pair of vertices, provided the resulting linear system is non-singular. In addition, the vertex and edge lists need to be updated; entries that are no longer feasible should be removed, and the new ones produced by the cutting half-space should be appended. A pseudo-code of the polytope-cutting procedure is provided as Algorithm 6; in Line 12 of Algorithm 3, two executions of Algorithm 6 are needed for the left and right half-spaces corresponding to a single observation, respectively.

Algorithm 6 Cutting $Q_j(y_{-i(j)}, a^0_{-i(j)}, z^0)$ by $\tau_{jk}(\theta_j, z^1_i) \geq a^1_{ij}$
1: for $\mathcal{V} = (\theta_j, I) \in V^0_{-ij}$ do   ▷ check feasibility
2:   if $\tau_{jk}(\theta_j, z^1_i) \geq a^1_{ij}$ then
3:     cycle the vertex loop ($\mathcal{V}$ feasible)
4:   else
5:     remove $\mathcal{V}$ ($\mathcal{V}$ infeasible)
6:   end if
7: end for
8: if all vertices are feasible then
9:   terminate the program
10: end if
11: create empty vertex list $V^1_{ij}$ and edge list $E^1_{ij}$
12: for $\mathcal{E} = (\mathcal{V}, \mathcal{V}') \in E^0_{-ij}$ do   ▷ obtain new vertices
13:   if both $\mathcal{V}$ and $\mathcal{V}'$ are feasible then cycle the edge loop
14:   else if neither $\mathcal{V}$ nor $\mathcal{V}'$ is feasible then remove $\mathcal{E}$
15:   else
16:     set $I'' = (I \cap I') \cup \{i\}$
17:     calculate the new vertex determined by $I''$, denoted $\theta''_j$
18:     append $\mathcal{V}'' = (\theta''_j, I'')$ to $V^1_{ij}$
19:     in $\mathcal{E}$, replace the infeasible vertex by $\mathcal{V}''$
20:   end if
21: end for
22: for $\mathcal{V}, \mathcal{V}' \in V^1_{ij}$ do   ▷ obtain new edges
23:   if $|I \cap I'| = q_j - 1$ then
24:     add $(\mathcal{V}, \mathcal{V}')$ to $E^1_{ij}$
25:   end if
26: end for
27: append $V^1_{ij}$ to $V^0_{-ij}$
28: append $E^1_{ij}$ to $E^0_{-ij}$

Remark 11. In terms of data structure, we recommend the use of linked lists (i.e., adjacent units are concatenated via pointers) instead of arrays (i.e., adjacent units are stored in consecutive memory locations) as containers of vertex and edge lists, for the reason that the former eases removal and addition of elements at arbitrary locations in the list, operations that occur frequently in Algorithm 6.

Remark 12. Recording edges facilitates finding new vertices, i.e., Lines 16–19 of Algorithm 6; however, the algorithm may fail whenever the linear system determined by observations $(I \cap I') \cup \{i\}$ (Line 16) is singular. When $K_j > 2$, this could happen occasionally; it corresponds to the case in which the new half-space cuts the polytope exactly along an edge. In theory, this loophole can be redressed by treating all vertices satisfying $\tau_{jk}(\theta_j, z^1_i) - a^1_{ij} = 0$ as infeasible; it follows that the edge being cut along is first removed in Line 14, and then added back in Line 24 with the new observation included in the index set. In the presence of numerical errors, a slack constant $\varepsilon > 0$ should be used instead of exact zero in practice; in the current implementation of the algorithm, $\varepsilon$ is chosen to be $10^{-10}$.

3.4 Starting values

The proposed sampler requires initial values of the logistic and normal variates, i.e., $a^{(0)}$ and $z^{(0)}$, which imply a non-empty and bounded polytope $Q_j(y_{(j)}, a^{(0)}_{(j)}, z^{(0)})$ for each $j$. There is certainly more than one way to achieve this. Our algorithm, described in this section, requires user input of starting values $\theta^{(0)}$ and factor score estimates $z^{(0)}$. The logistic variates $a^{(0)}$ are subsequently generated using Algorithm 4, in which each interior polytope comprises only one vertex $\theta^{(0)}_j$; the non-emptiness and boundedness of the polytopes are ensured by the truncated sampling.

The boundedness requirement is unnecessary in theory; for each fixed $y$ the polytope can be unbounded with positive probability. However, the sampling algorithm, especially the polytope-updating part (Algorithm 6), only applies to bounded cases. As a result, an arbitrarily specified initial bounding box is needed (similar configurations can be found in Cisewski and Hannig, 2012, and Liu and Hannig, 2014). For the GRM, we define the following bounding box for $\theta_j$:

$$\alpha_{j1} \leq M, \qquad \alpha_{j,K-1} \geq -M,$$
$$\alpha_{jk} \geq \alpha_{j,k+1}, \quad \text{for all } k = 1, \ldots, K-2,$$
$$-M \leq \beta_{jd} \leq M, \quad \text{for all } d = 1, \ldots, r. \tag{3.8}$$
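A small Python sketch of the feasibility check implied by Equation 3.8 (the function name and array layout are assumptions made for illustration); with the monotonicity constraint in place, bounding the first and last intercepts is equivalent to bounding them all within $[-M, M]$.

```python
import numpy as np

def in_bounding_box(alpha, beta, M=20.0):
    """Check whether item parameters satisfy the Equation 3.8 bounding box.

    alpha: intercepts (alpha_j1, ..., alpha_j,K-1), expected decreasing;
    beta:  slopes (beta_j1, ..., beta_jr).
    """
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    ordered = bool(np.all(alpha[:-1] >= alpha[1:]))   # alpha_jk >= alpha_j,k+1
    return (ordered and alpha[0] <= M and alpha[-1] >= -M
            and bool(np.all(np.abs(beta) <= M)))
```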

The parameter bound $M$ is an important tuning parameter of the sampling algorithm; the discussion about how to choose $M$ in practice is deferred to Chapters 4 and 5. Based on the foregoing discussion, we outline the starting value program as Algorithm 7.

Algorithm 7 Starting values
1: for items $j = 1, \ldots, m$ do
2:   set $V^0_{-ij}$ and $E^0_{-ij}$ to represent the initial bounding box
3:   for observations $i = 1, \ldots, n$ do
4:     generate $A^\star_{ij} = a^{(0)}_{ij}$ from Logistic(0, 1) truncated to $[\tau_{j,k+1}(\theta^{(0)}_j, z^{(0)}_i), \tau_{jk}(\theta^{(0)}_j, z^{(0)}_i)]$
5:     Update the $j$th polytope (Algorithm 6)
6:   end for
7: end for

Remark 13. In practice, $\theta^{(0)}$ can be provided by computationally economical limited information estimators, such as various weighted least squares methods based on polychoric correlations (e.g., Muthén, 1978; Gunsjö, 1994). Alternatively, one could use naive starting values such as ordered constants for intercepts and 1 for slopes. $z^{(0)}$ can be generated from the conditional distribution of the latent variables given $y$ and $\theta^{(0)}$ (i.e., the posterior distribution (Equation 2.24) evaluated at $\theta^{(0)}$ for each observed response pattern), or set to point estimates (e.g., EAP) derived from such a distribution. $z^{(0)}$ can also be generated from a standard normal distribution unconditionally. The naive starting values are indeed nowhere near the true item parameters and factor scores, nor the center of the fiducial distribution, but they work reasonably well in our Monte Carlo experiments. From our experience, the generated Markov chain appears stationary after about a thousand iterations, and the final results are not significantly affected by the choice of initial state.

3.5 Heavy-tailedness and a workaround

The generalized fiducial distribution defined by Equation 2.5 is heavy-tailed. Although Theorem 2 guarantees the boundedness of the polytopes when the sample size tends to infinity, in finite samples unbounded polytopes emerge with positive probability. In fact, if the selection rule $v(\cdot)$ allows infinity, then the distribution does not even have a finite mean. This peculiar feature of the fiducial distribution produces undesirable behavior of the sampling algorithm in practice: the simulated Markov chain of item parameters may sometimes hit the bounding box, which leaves spikes on the trace plot (see the left panel of Figure 3.1 for an illustration).

[Figure 3.1: Trace plots for a slope parameter before (left panel, without phantom categories) and after (right panel, with phantom categories) implementing the workaround; vertical axis: slope parameter, horizontal axis: MCMC cycle. The data set used for illustration is composed of 50 observations and five 3-category items. The arbitrary bound M is set to 20, which is highlighted by the horizontal dashed lines.]

Liu and Hannig (2014) discussed a workaround that enforces a minimal number of lower and upper bounds for the slope parameters implied by the constituting half-spaces, based on the observation that a small number of such bounds is likely to induce unbounded polytopes. In the current work, we propose a more efficient fix, which is naturally adapted to the graded model. The idea is to introduce two "phantom" response categories outside the actual response scale (from 0 to $K_j - 1$), coded as $y_{ij} = -1$ and $K_j$, in company with two additional intercept parameters $\alpha_{j0}$ and $\alpha_{jK}$. This extra configuration converts the actual extremal responses 0 and $K_j - 1$ into middle categories; therefore, the set inverse (Equation 2.2) involves two-sided inequalities for all observable responses, and it follows that each observation provides both lower and upper bounds for each slope parameter. No endorsement of the phantom categories can be found in the observed data, so estimates of the extra intercepts are not meaningful. Moreover, freely estimating $\alpha_{j0}$ and $\alpha_{jK}$ increases the dimension of the parameter space, and results in longer computation time. Therefore, we fix $\alpha_{j0} = M$ and $\alpha_{jK} = -M$ in the current implementation, which has proved to perform well in our pilot investigation.

3.6 Computational time

Factors affecting the computational time of the sampling algorithm (Algorithm 3) are the sample size $n$, the test length $m$, and the number of dimensions $q_j = r + K_j - 1$ of each polytope. In Table 3.1, the average CPU time consumed by a single MCMC cycle (averaged across 1000 cycles) is tabulated for different model sizes; the computations were carried out on a laptop computer equipped with a quad-core 2.1GHz Intel Core i7-3687U processor and 8GB of RAM.

Table 3.1 shows that, as expected, the computational time increases approximately linearly as the sample size ($n$) or test length ($m$) increases, whereas the time needed grows at a faster than linear rate as the number of response categories ($K$) or the latent dimensionality ($r$) increases. $n$ and $m$ affect the dimensionality of the generated variates $A^\star$ and $Z^\star$, and thus we expect linear complexity, as the number of sampling and updating computations conducted is in proportion to the dimension of the generated variables. $K$ and $r$ determine the (maximum) dimension of the parameter space for each item, which is associated with the size of the vertex and edge lists involved in the polytope-cutting algorithm (Algorithm 6). For a $p$-dimensional simplex, i.e., a closed polytope with $p + 1$ vertices, it is well-known

that the number of edges is $\binom{p+1}{2} = p(p+1)/2$; therefore, the computational time may grow as a quadratic function of the dimension of the polytope. Indeed, the polytopes generated in the sampling algorithm often have more vertices than simplexes. As a result, this intuitive interpretation is not exact, but it does explain the observed super-linear complexity.

Table 3.1: The average CPU time (in seconds) consumed by a single MCMC iteration under different combinations of sample size n, test length m, latent dimensionality r (exploratory model, minimally constrained), and number of categories K (Kj = K for all j)

               r = 1                   r = 2                   r = 3
  n    m    K=2    K=3    K=5       K=2    K=3    K=5       K=2    K=3    K=5
 100    5  <0.01  <0.01   0.06     <0.01   0.02   0.15      0.02   0.06   0.43
 100   10  <0.01   0.02   0.12      0.02   0.05   0.34      0.05   0.14   1.04
 100   20   0.02   0.04   0.24      0.04   0.11   0.72      0.11   0.30   2.37
 200    5  <0.01   0.02   0.10      0.02   0.04   0.25      0.04   0.09   0.70
 200   10   0.01   0.04   0.20      0.04   0.09   0.57      0.09   0.22   1.61
 200   20   0.03   0.07   0.41      0.08   0.19   1.19      0.20   0.48   3.40
 500    5   0.02   0.04   0.25      0.04   0.10   0.54      0.09   0.21   1.31
 500   10   0.04   0.08   0.45      0.09   0.21   1.20      0.21   0.48   2.92
 500   20   0.07   0.17   0.91      0.18   0.43   2.48      0.50   1.05   6.50

CHAPTER 4: MONTE CARLO SIMULATIONS

In this chapter, we report a large-scale Monte Carlo simulation study to evaluate the finite sample behavior of the previously discussed implementation of GFI applied to unidimensional and bifactor GRMs. Comparisons of GFI with existing likelihood-based and Bayesian approaches are the foci of our attention; in particular, we are interested in their performance in three types of inference: parameter recovery, test scoring, and checking goodness of fit. For unidimensional GRMs, the simulation design is described in Section 4.1, followed by displays and discussions of the results for the three types of inference (Sections 4.2 to 4.4). The simulation design for bifactor models is introduced in Section 4.5, and the parameter recovery results are presented in Section 4.6. A majority of the computations involved in the simulation study were completed on the parallel computing cluster KillDevil located at the University of North Carolina at Chapel Hill.

4.1 Unidimensional models: Simulation design

Graded response data were generated from unidimensional GRMs under a fully factorial design involving three sample size levels, $n = 100$, 200, and 500, and two test length levels, $m = 9$ and 18. All items had five ordered response categories ($K_j = 5$ for all $j$). Results were accumulated across 500 simulated data sets in each condition.

For $m = 9$, the true item parameters were determined by two factors: communality and skewness. In the factor analysis literature, the communality of an item measures the proportion of variance explained by the latent variables. Under the logit parameterization of the unidimensional GRM (Equation 1.1), the variance explained is calculated as approximately the squared value of the standardized factor loading parameter (see, e.g., Wirth and Edwards, 2007):

$$\lambda_j = \frac{\beta_j/1.7}{\sqrt{1 + (\beta_j/1.7)^2}}, \tag{4.1}$$

in which βj is the unidimensional slope parameter for item j, and 1.7 is the constant to match the standardized logistic and normal cdfs. Values 0.1, 0.5, and 0.9 were selected to represent low, medium, and high levels of communality, respectively. Skewness refers to the degree to

which the intercept parameters αjk’s are centered around zero. Here, we also manipulated a standardized version of the intercept, namely, the threshold parameter:

$$\tau_{jk} = -\frac{\alpha_{jk}/1.7}{\sqrt{1 + (\beta_j/1.7)^2}}, \quad k = 1, 2, 3, 4, \tag{4.2}$$
in which the sign change aligns the intercepts with the threshold convention used in Table 4.1.

Values $(-0.75, -0.55, -0.05, 0.75)^\top$, $(-0.25, -0.05, 0.45, 1.25)^\top$, and $(0.25, 0.45, 0.95, 1.75)^\top$ were used as symmetric, moderately skewed, and skewed threshold conditions, respectively. The nine combinations of three communality and three threshold levels yielded the data generating parameter values for all items in the test. The true parameter values under both the standardized loading-threshold and the slope-intercept parameterizations are tabulated in Table 4.1. For $m = 18$, the first half of the items had the same parameter values as listed in Table 4.1; the second half had the same factor loading parameters, and threshold parameters with the same absolute values but with reversed signs and ordering. We remark that the parameter values considered here are more extreme than those used in many simulation studies (e.g., Forero and Maydeu-Olivares, 2009). Highly skewed or highly discriminating items are by no means rare in practice, especially in health-related surveys; an example would be an item about suicidal attempts in a scale measuring depressive symptoms.
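As a worked illustration of Equations 4.1 and 4.2, the following Python sketch converts slope-intercept values to the standardized loading-threshold parameterization; the function name is illustrative, and the sign convention follows Table 4.1.

```python
import numpy as np

D = 1.7  # scaling constant matching the standardized logistic and normal cdfs

def to_loading_threshold(beta, alpha):
    """Equations 4.1-4.2: loading and thresholds from slope and intercepts."""
    denom = np.sqrt(1.0 + (beta / D) ** 2)
    loading = (beta / D) / denom                          # Equation 4.1
    thresholds = -(np.asarray(alpha, float) / D) / denom  # Equation 4.2
    return loading, thresholds

# Item 4 of Table 4.1 (medium communality, symmetric thresholds):
lam, tau = to_loading_threshold(1.70, [1.80, 1.32, 0.12, -1.80])
# lam ~= 0.71; tau ~= (-0.75, -0.55, -0.05, 0.75)
```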

A Fortran program that implements the proposed Gibbs sampler was used to obtain Monte Carlo samples from the fiducial distribution of item parameters. We used $(1.5, 0.5, -0.5, -1.5)^\top$ as starting values for intercepts, and 1 for slopes. The starting values for the normal variates $z^{(0)}$ were generated from the standard normal distribution unconditionally, and those for the logistic variates $a^{(0)}$ were generated by Algorithm 7 described in the previous

Table 4.1: Data-generating parameter values for the unidimensional GRM (m = 9)

                                            Thresholds
 Item  Communality  Skewness   Loading λj   τj1    τj2    τj3    τj4
  1    low          symmetric     0.32     -0.75  -0.55  -0.05   0.75
  2    low          moderate      0.32     -0.25  -0.05   0.45   1.25
  3    low          skewed        0.32      0.25   0.45   0.95   1.75
  4    medium       symmetric     0.71     -0.75  -0.55  -0.05   0.75
  5    medium       moderate      0.71     -0.25  -0.05   0.45   1.25
  6    medium       skewed        0.71      0.25   0.45   0.95   1.75
  7    high         symmetric     0.95     -0.75  -0.55  -0.05   0.75
  8    high         moderate      0.95     -0.25  -0.05   0.45   1.25
  9    high         skewed        0.95      0.25   0.45   0.95   1.75

                     Intercepts                       Difficulties
 Item  Slope βj   αj1    αj2    αj3    αj4      δj1    δj2    δj3    δj4
  1     0.57      1.34   0.99   0.09  -1.34    -2.37  -1.74  -0.16   2.37
  2     0.57      0.45   0.09  -0.81  -2.24    -0.79  -0.16   1.42   3.95
  3     0.57     -0.45  -0.81  -1.70  -3.14     0.79   1.42   3.00   5.53
  4     1.70      1.80   1.32   0.12  -1.80    -1.06  -0.78  -0.07   1.06
  5     1.70      0.60   0.12  -1.08  -3.01    -0.35  -0.07   0.64   1.77
  6     1.70     -0.60  -1.08  -2.28  -4.21     0.35   0.64   1.34   2.47
  7     5.10      4.03   2.96   0.27  -4.03    -0.79  -0.58  -0.05   0.79
  8     5.10      1.34   0.27  -2.42  -6.72    -0.26  -0.05   0.47   1.32
  9     5.10     -1.34  -2.42  -5.11  -9.41     0.26   0.47   1.00   1.84

chapter. Item parameters were restricted to the bounding box defined by Equation 3.8 with $M = 20$. A small $M$ relative to the magnitude of the true item parameters may reduce the coverage of the resulting CIs; meanwhile, a large $M$ may increase the length of the CIs and thus reduce efficiency. The value $M = 20$ was selected after a pilot study, to strike a balance between those two concerns. In each replication of the simulation, the sampler was run for 60000 cycles. We visually examined the resulting trace plots in a pilot study, and concluded that 60000 cycles are sufficient for the generated Markov chain to attain stationarity. In addition, we burned in the first 10000 cycles to remove the influence of the starting values, and used a thinning interval of 10 to reduce the auto-correlation of the generated Markov chain, as well as the use of computer storage space. As a result, inference was based on 5000 draws in each replication.

ML estimates were computed via the Bock-Aitkin EM algorithm using Mplus 7.0 (Muthén and Muthén, 2012). The integral in the likelihood function was approximated using 49 equally spaced rectangular quadrature points on the interval $[-5, 5]$. We adopted the default convergence criteria and the maximum number of iterations provided by the software, and the same starting values for item intercepts and slopes as in fiducial estimation. In small sample conditions ($n = 100$ and 200), response data with unfilled item response categories emerged occasionally, in which case the ML estimate for the associated intercept parameter does not exist. In order to make the results comparable across all methods, we simulated 500 data sets with no missing response category for all items.

An objective Bayesian method was also considered for comparison. Motivated by Gelman's (2006) recommendation on less informative priors for variance components in the context of linear mixed effects modeling, we specified a standard Cauchy prior for each slope parameter $\beta_j$. A slightly more involved prior configuration was needed for intercept parameters due to the order constraints. The four-dimensional prior distribution for $(\alpha_{j1} \cdots \alpha_{j4})^\top$ was given by the joint distribution of all order statistics of four i.i.d. Cauchy random variables; a similar order-statistic approach was recommended by Curtis (2010). Again, a Cauchy prior was used to reflect our lack of a priori knowledge of those intercept parameters in a Bayesian sense.¹ In a pilot study, we also considered a wide uniform prior, Uniform($-20, 20$), and a diffuse Gaussian prior, $\mathcal{N}(0, 100)$; however, they both had substantially lower coverages for large item slopes and extreme intercept parameters compared to the Cauchy, and thus were dropped from the final simulation study. There might be other reasonable choices of prior distributions; the topic of finding the best objective Bayesian approach deserves a careful treatment on its own and should be addressed by future research. JAGS (Plummer, 2013) and its R interface package rjags (Plummer, 2013b) were used for Bayesian estimation. The

¹In the extant literature, the prior specification for intercept parameters has been treated in various ways: for example, Edwards (2010) specified a unidimensional prior for the first intercept, and then put non-negative priors on the consecutive intercept differences. A comprehensive comparison of different prior setups is beyond the scope of the current work.

JAGS code for fitting a unidimensional GRM can be found in Curtis (2010, Section 5). A single Markov chain of 60000 cycles was generated with a burn-in period of 10000 and a thinning interval of 10, which paralleled the sampler setup for fiducial estimation.

4.2 Unidimensional models: Parameter recovery

In this section, we describe the comparative behavior of fiducial, likelihood-based, and Bayesian interval estimators in recovering the data generating values of item parameters.

Five types of parameters are of interest: namely, the item slope $\beta_j$, the intercept $\alpha_{jk}$, the loading $\lambda_j$ (given by Equation 4.1), the threshold $\tau_{jk}$ (given by Equation 4.2), and the difficulty $\delta_{jk}$:

$$\delta_{jk} = -\alpha_{jk}/\beta_j. \tag{4.3}$$

The threshold and loading parameters are on a normalized scale and pertain to the notion of explained variance (communality); they serve as the preferred metric in the literature of item factor analysis. The $k$th difficulty parameter, $k = 1, \ldots, K_j - 1$, is the latent variable value at which the response categories $\{0, \ldots, k-1\}$ and $\{k, \ldots, K_j - 1\}$ are equally likely to be endorsed. In the scenario of assigning partial credit in an educational test, $\delta_{jk}$ gauges the difficulty of obtaining an item score higher than or equal to $k$. The true values of these parameters can be found in Table 4.1.

Two key evaluation criteria for interval estimators are coverage and length. Ideally, we prefer intervals having coverage probabilities greater than or equal to the nominal level (95% in the current study), and shorter in length. In practice, however, a trade-off between coverage and length is typically observed. We always prioritize coverage over length when comparing the performance of different intervals.

For the fiducial and Bayesian methods, we constructed equi-tailed percentile CIs from the fiducial or posterior distribution of model parameters: for a 95% nominal level, the lower and upper confidence bounds for a particular item parameter are set to the 2.5 and 97.5 empirical percentiles of a random sample from the corresponding marginal fiducial or posterior

distribution. As discussed earlier, fiducial or posterior distributions for transformed parameters (i.e., threshold, loading, and difficulty parameters) were approximated by transforming the Monte Carlo samples drawn from the original fiducial (posterior) distribution under the slope-intercept parameterization. For ML estimation, Wald-type CIs for item slopes and intercepts were obtained from two types of standard errors resulting from two commonly used sample estimates of the Fisher information matrix: i.e., the cross-product form (in Mplus, set estimator = MLF) and the Hessian form (set estimator = ML). For transformed parameters, the Delta method is used, which is the default method of Mplus. Other likelihood-based interval estimators such as the profile-likelihood method, as well as resampling-based heuristic methods such as bootstrapping, are not considered here.

Simulation results are summarized in Figures 4.1 to 4.6. We observe that the fiducial percentile CIs always exhibit on-target coverage for all five types of parameters of interest in all six sample size and test length conditions. Moreover, they are at least as short as the other interval estimators in most scenarios; for the difficulty parameters of the low-communality items (items 1 to 3), the fiducial CIs are less efficient than the cross-product-form Wald intervals when $n = 100$ and $m = 9$, and slightly less efficient than the Bayesian intervals when $n = 100$ and $m = 18$. Consequently, we conclude that GFI is the most reliable approach in recovering item parameters among the four candidates being considered in the current work.

The Hessian-form Wald CI, long regarded as the gold-standard interval estimator associated with ML estimation, is the most comparable alternative to the fiducial CI. However, it is liberal (i.e., it has significantly lower coverage than the nominal level) when applied to the loading parameters of the high-communality items (items 7 to 9) in small samples ($n = 100$ and 200). In those cases, the true parameter value (0.95) is very close to the boundary of the parameter space (1), and the quadratic approximation to the log-likelihood fails. Similar reasoning applies to the under-coverage for the difficulty parameters of the

[Figure 4.1: Empirical coverages and median lengths of the four interval estimators (ML Wald cross-product, ML Wald Hessian, fiducial percentile, and Bayesian percentile; shown in different colors) for the five types of parameters (threshold, difficulty, intercept, loading, and slope) of items 1 to 9. Each row of panels corresponds to one type of parameter, with coverage plotted in the upper panel and median interval length in the lower panel; parameters belonging to different items are separated by vertical dotted lines, and the two horizontal dashed lines in the coverage panel give a 95% normal-approximation confidence band for the nominal level 0.95. Here, the sample size n = 100 and the number of items m = 9.]

[Figure 4.2: Same layout as Figure 4.1, with n = 200 and m = 9.]

[Figure 4.3: Same layout as Figure 4.1, with n = 500 and m = 9.]

[Figure 4.4: Same layout as Figure 4.1, with n = 100 and m = 18.]

[Figure 4.5: Same layout as Figure 4.1, with n = 200 and m = 18.]

[Figure 4.6: Same layout as Figure 4.1, with n = 500 and m = 18.]

Similar reasoning applies to the under-coverage for the difficulty parameters of the low-communality items (items 1 to 3): When a small slope co-exists with a large intercept, the resulting difficulty parameters tend to be large; the largest difficulty parameter involved here is greater than 5, which is almost indistinguishable from infinity, i.e., the "bound" of the parameter space, given that item difficulty is defined on the scale of a standard normal latent variable. Furthermore, the Hessian-form Wald CI also yields slightly longer intervals than the fiducial approach for extreme slope and intercept values when the sample size is small.

The cross-product-form Wald CIs, on the other hand, are too conservative (i.e., they have significantly higher coverage than the nominal level) in small-sample conditions, resulting in substantially wider intervals than the other candidate methods. The pattern is most salient when n = 100 and m = 18 (Figure 4.4), in which case the coverages are almost always 1 and the intervals are on average three times as wide as those of the competitors. Note that the unidimensional GRM is almost unidentified in this extreme condition: The number of parameters is 18 × 5 = 90, which is likely to cause numerical difficulty in inverting the cross-product information matrix; that may underlie the excessively conservative results.

The specific type of Bayesian CIs considered in the simulation study can be very liberal for extreme intercept, threshold, and difficulty parameters when the test is short (m = 9) and the sample size is small (n = 100): The coverage can be as low as 0.75 under the nominal level 0.95. Although they improve as the sample size increases, coverage can still be less than the nominal level when n = 500. The problem is significantly alleviated in longer tests (m = 18). For the other parameterizations of interest, however, the Bayesian method behaves similarly to the fiducial and Hessian-form Wald methods. The observed inferior performance in short tests and small samples might be traced to the particular prior distribution we used; future research is encouraged to explore alternative prior configurations for improvement.

4.3 Unidimensional models: Response pattern scoring

Next, we compare the finite-sample behaviors of various types of asymptotically correct prediction intervals (PIs) for latent variable scores.

Recall that in a unidimensional GRM the latent variable $Z_i \sim N(0, 1)$ is a random effect; therefore, we often resort to the conditional distribution $Z_i \mid y_i$, i.e., Equation 2.24, which has been referred to as the posterior distribution of $Z_i$ in Chapter 2, for making predictions about response pattern scores. Ideally, all item parameters in the scoring model are known; this can be regarded as approximately true when the parameters have been calibrated with high accuracy and precision using a huge sample (e.g., in large-scale educational tests). Under such circumstances, a natural 95% PI for each response pattern score is the interval bounded by the 2.5 and 97.5 percentiles of the corresponding true posterior distribution. Since no closed-form expression is available for those percentiles, approximations via numerical quadrature or Monte Carlo sampling are necessary; in the current experiment, the quadrature approach was used. Alternatively, a Wald approach, i.e., treating the posterior distribution as approximately normal, is also applicable: A symmetric PI is given by the posterior mean (i.e., the EAP score) plus or minus the posterior standard deviation multiplied by the normal quantile matching the nominal coverage probability. In the simulation study, we only computed the percentile PIs under the true posterior, for the purpose of setting a gold-standard reference for comparison.

We are interested primarily in the situation in which item parameters need to be simultaneously calibrated from the scoring sample. Constrained by the possibly rare population being sampled (e.g., people diagnosed with a certain type of psychological disorder), the available data for item calibration and scoring often number fewer than 500 in many psychological applications of the GRM. In those cases, the plug-in method has been widely used: We first obtain point estimates (usually the ML estimates) of the item parameters, and subsequently evaluate the posterior distribution (Equation 2.24) at those estimated values. Percentile- or Wald-type PIs based on the plug-in posterior can then be constructed in a fashion similar to those obtained from the true posterior. Both types of PIs were computed in the simulation study; nevertheless, we note that the Wald approach is more popular in practice.
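The percentile and Wald PI constructions under a known-parameter posterior can be sketched as follows. This assumes a logistic GRM in the slope-intercept form with intercepts ordered so that the cumulative probabilities are decreasing; the item parameter values are made up for illustration, and the quadrature grid is a crude stand-in for the scheme actually used.

```python
import numpy as np

def grm_cat_probs(z, alpha, beta):
    """Category probabilities of a logistic GRM item at latent value z
    (slope-intercept form; intercepts `alpha` in decreasing order)."""
    cum = 1.0 / (1.0 + np.exp(-(alpha + beta * z)))  # P(Y >= 1), ..., P(Y >= K-1)
    cum = np.concatenate(([1.0], cum, [0.0]))        # pad with P(Y >= 0), P(Y >= K)
    return cum[:-1] - cum[1:]

def posterior_pis(pattern, alphas, betas, n_quad=201):
    """95% percentile and Wald PIs for Z | y with known item parameters,
    approximated on a fixed quadrature grid under a N(0, 1) prior."""
    z = np.linspace(-6.0, 6.0, n_quad)
    post = np.exp(-0.5 * z**2)                       # unnormalized N(0, 1) density
    for y_j, a_j, b_j in zip(pattern, alphas, betas):
        post *= np.array([grm_cat_probs(zi, a_j, b_j)[y_j] for zi in z])
    post /= post.sum()
    cdf = np.cumsum(post)
    percentile = (np.interp(0.025, cdf, z), np.interp(0.975, cdf, z))
    eap = np.sum(z * post)                           # posterior mean (EAP score)
    sd = np.sqrt(np.sum((z - eap) ** 2 * post))
    wald = (eap - 1.96 * sd, eap + 1.96 * sd)
    return percentile, wald

# Two made-up five-category items (intercepts decreasing across categories).
alphas = [np.array([2.0, 0.5, -1.0, -2.5]), np.array([1.5, 0.0, -1.5, -3.0])]
betas = [1.2, 0.8]
print(posterior_pis([3, 1], alphas, betas))
```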

As discussed in Chapter 2, the fiducial distribution $Z_i^\star \mid \{Q(y, A^\star, Z^\star) \neq \emptyset\}$ can be directly used to construct PIs for observed pattern scores. Monte Carlo samples from these fiducial distributions for all individuals in the data set are byproducts of the Gibbs sampler, from which percentile-based PIs can easily be computed. Predictive inference for response pattern scores can likewise be obtained in the Bayesian framework:

Monte Carlo samples of the $Z_i$'s are readily available from the sampler, drawn from the posterior distribution with all item parameters integrated out under the prior measure, and can subsequently be used for constructing PIs.

Because the true latent variable values were generated from a continuous probability distribution, we are not able to compare PIs at any particular score level. Therefore, we assign the true values to 10 score groups separated by 9 equally spaced cutoff values from -2 to 2, and compute empirical coverages and median lengths by group for the first 50000 simulated observations; these comprise all 500 simulated data sets when n = 100, the first 250 when n = 200, and the first 100 when n = 500. Results are visualized in Figures 4.7 to 4.12 for the different sample size and test length conditions, respectively.
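The grouping and group-wise summaries amount to simple bookkeeping, as the following sketch shows; the interval endpoints below are stand-ins for the PIs actually computed.

```python
import numpy as np

rng = np.random.default_rng(0)
z_true = rng.normal(size=50000)       # true latent values
lo, hi = z_true - 0.4, z_true + 0.5   # stand-ins for actual PI endpoints

# 10 groups separated by 9 equally spaced cutoffs from -2 to 2.
cuts = np.linspace(-2.0, 2.0, 9)
group = np.digitize(z_true, cuts)     # 0 = (-Inf, -2), ..., 9 = (2, Inf)

for g in range(10):
    idx = group == g
    coverage = np.mean((lo[idx] <= z_true[idx]) & (z_true[idx] <= hi[idx]))
    med_len = np.median(hi[idx] - lo[idx])
    print(f"group {g}: coverage = {coverage:.3f}, median length = {med_len:.3f}")
```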

[Figure 4.7: Empirical coverage (upper panel) and median length (lower panel) of the five types of prediction intervals (true percentile, plug-in percentile, plug-in Wald, fiducial percentile, and Bayesian percentile; shown in different colors) plotted against the true score groups. Here, the sample size n = 100 and the number of items m = 9. The horizontal dashed line marks the nominal coverage probability 0.95.]

[Figure 4.8: Same layout as Figure 4.7, with n = 200 and m = 9.]

[Figure 4.9: Same layout as Figure 4.7, with n = 500 and m = 9.]

[Figure 4.10: Same layout as Figure 4.7, with n = 100 and m = 18.]

[Figure 4.11: Same layout as Figure 4.7, with n = 200 and m = 18.]

[Figure 4.12: Same layout as Figure 4.7, with n = 500 and m = 18.]

Because the data generating item parameter values are used in the computation, the behavior of the true posterior percentile PIs, i.e., our reference approach, is not contingent on the size of the calibration sample. As expected, their coverages are almost always on target, and their lengths are often the shortest. For the extreme score groups, i.e., (-∞, -2) and (2, ∞), even the true posterior distributions do not support adequate coverage probability, because the number of items is limited. In particular, the lowest score groups are poorly recovered in short tests (m = 9; Figures 4.7 to 4.9), because the true thresholds are skewed to the positive side (see Table 4.1), so that little information about the lower end of the latent variable scale can be garnered from the response patterns.

The plug-in intervals under-cover in small samples (when n = 100, the coverage can be lower than 90%) because they fail to take into account the sampling variability of the ML estimates; intuitively speaking, they are not as wide as they should be. In the extreme score groups, the Wald approach fares worse than the percentile one, which suggests that approximating the posterior distribution by a normal one is not appropriate for those observations. As a consequence, we do not recommend the use of plug-in intervals in small-sample applications of the GRM, although that is the most widely adopted practice in substantive research.

In terms of empirical coverage, the fiducial and Bayesian percentile PIs are the closest to the true posterior percentile ones, and thus are preferred over the plug-in PIs. Bayesian PIs outperform fiducial ones in the extreme score groups; however, when a small sample (n = 100) is combined with a short test (m = 9), Bayesian PIs slightly under-cover for most groups on the positive half of the latent variable scale. Surprisingly, the Bayesian approach is well-behaved in making predictive inference about extreme latent variable scores, even though the corresponding CIs for extreme intercept/threshold/difficulty parameters (see the previous section) are poor. We also observe that the fiducial and Bayesian PIs are substantially wider than the other candidates, which is anticipated as the result of properly accounting for the sampling variability.

Recently, Xie, Liu, Chang, and Chen (2014) studied the theoretical properties of predictive distributions, extending the findings of Lawless and Fredette (2005). They established, under extra regularity conditions, that predictive densities obtained from confidence distributions dominate the plug-in densities in terms of the average Kullback-Leibler discrepancy from the target density. We conjecture that our findings admit an analogous theoretical interpretation.

4.4 Unidimensional models: Goodness of fit testing

In this section, we investigate the Type I error and power performance of the fiducial predictive check (FPC) in identifying score-group and bivariate model misfit of the unidimensional GRM. For simplicity, we focus on only one sample size and test length combination, i.e., n = 200 and m = 18. The previously generated graded response data were re-used to calculate the Type I error results; methods having empirical rejection rates close to the nominal α-level are considered well-calibrated. To evaluate power, two alternative data generating models similar to those studied by Liu and Maydeu-Olivares (2014) were used. The first model is a three-dimensional simple-structure GRM with correlated latent variables (factor inter-correlations = 0.3). Items 1, 4, ..., 16 load on the first factor, items 2, 5, ..., 17 on the second factor, and items 3, 6, ..., 18 on the third factor. The factor loadings have the same numerical values as tabulated in Table 4.1; as a result, all the factors have matched loadings. The same threshold parameters were used as well; therefore, the first factor is indicated by items having symmetric thresholds, the second by items having moderately skewed thresholds, and the third by items having skewed thresholds. The second model is a unidimensional GRM characterized by the same item parameters as listed in Table 4.1 but a non-normal latent variable $Z_i \sim \tfrac{1}{2}N(-1.5, 1) + \tfrac{1}{2}N(1.5, 1)$. In both cases, the unidimensional GRM with a normal latent variable is misspecified, and a good fit checking procedure should reject the fitted model as often as possible.

To reduce the computational burden, we applied a further thinning interval of 10 to the generated Markov chain in each replication; the fiducial predictive p-values were thereby calculated based on 500 Monte Carlo samples using Algorithm 2.
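As a minimal sketch of the second alternative, graded responses under the mixture latent variable might be generated as follows; the two items are made up rather than the Table 4.1 values, and the sampler uses the standard inverse-cdf argument for ordered cumulative curves.

```python
import numpy as np

def simulate_grm(alphas, betas, z, rng):
    """Draw graded responses given latent values z (logistic GRM,
    slope-intercept form; intercepts in decreasing order)."""
    n, m = len(z), len(betas)
    y = np.zeros((n, m), dtype=int)
    for j in range(m):
        cum = 1.0 / (1.0 + np.exp(-(alphas[j][None, :] + betas[j] * z[:, None])))
        u = rng.uniform(size=(n, 1))
        y[:, j] = np.sum(u < cum, axis=1)  # number of cumulative curves exceeded
    return y

rng = np.random.default_rng(2015)
n = 200
# Z_i ~ (1/2) N(-1.5, 1) + (1/2) N(1.5, 1): pick a component, then draw.
comp = rng.integers(0, 2, size=n)
z = rng.normal(loc=np.where(comp == 0, -1.5, 1.5), scale=1.0)

# Two made-up five-category items; the study used the Table 4.1 values instead.
alphas = [np.array([2.0, 0.5, -1.0, -2.5]), np.array([1.5, 0.0, -1.5, -3.0])]
betas = np.array([1.2, 0.8])
y = simulate_grm(alphas, betas, z, rng)
```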

Both the centering (Section 2.4.1) and partial (Section 2.4.2) approaches were implemented, and the extent to which they ensure asymptotically uniform p-values was empirically evaluated and compared. In the centering method, we evaluate the expectations of the test statistics under the null model, i.e., Equation 2.31 for the score-group fit statistics and Equation 2.34 for the pairwise fit statistics, at a mean- and variance-adjusted diagonally weighted least squares estimator that is available in Mplus (estimator = WLSMV). For the partial predictive approach, the effective sample size (ESS, Equation 2.29) was recorded in each replication; means and standard deviations of the ESS values are reported in addition to the empirical rejection rates.

In a test composed of 18 five-category items, the sum score ranges from 0 to 18 × 4 = 72. In the current simulation study, the score range was divided into 10 equi-width intervals, and the score-group fit statistics (Equation 2.30), previously defined at individual sum-score levels, were computed for the corresponding score groups instead; those statistics are denoted $T_{(1)}, \ldots, T_{(10)}$.

For comparison, the reduced $M_2$ statistic (Cai and Hansen, 2013), denoted $M_2^*$, was also computed to probe the score-group fit. The $M_2^*$ statistic is a quadratic form in all residual first and second moments (i.e., the mean for each item and the cross-product for each pair of items). The residual mean for item j is defined as the discrepancy between the observed and model-implied average scores:

$$\hat e_j = \frac{1}{n}\sum_{i=1}^{n} y_{ij} - \sum_{y_i} y_{ij}\, f(\hat\theta, y_i). \qquad (4.4)$$

Similarly, the residual cross-product for a pair of items j and k is defined as

$$\hat e_{jk} = \frac{1}{n}\sum_{i=1}^{n} y_{ij} y_{ik} - \sum_{y_i} y_{ij} y_{ik}\, f(\hat\theta, y_i), \qquad (4.5)$$

which is in fact the difference between Equations 2.33 and 2.34 evaluated at the ML estimates $\hat\theta$. The weight matrix of the quadratic form is a specific generalized inverse of the asymptotic covariance matrix of the residual moments, and thus the resulting statistic asymptotically follows a chi-square distribution (Joe and Maydeu-Olivares, 2010).
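A sketch of Equations 4.4 and 4.5 for a unidimensional GRM follows. Instead of enumerating all response patterns, it exploits local independence and approximates the model-implied moments by quadrature over the latent variable, which is equivalent for these first- and second-order moments; function names are illustrative.

```python
import numpy as np

def model_implied_moments(alphas, betas, n_quad=101):
    """Model-implied item means and pairwise cross-products for a
    unidimensional logistic GRM, by quadrature over Z ~ N(0, 1).
    Local independence lets E[Yj Yk | z] factor at each node."""
    z = np.linspace(-6.0, 6.0, n_quad)
    w = np.exp(-0.5 * z**2)
    w /= w.sum()
    m = len(betas)
    cond = np.empty((m, n_quad))                 # E[Y_j | z] at each node
    for j in range(m):
        cum = 1.0 / (1.0 + np.exp(-(alphas[j][:, None] + betas[j] * z[None, :])))
        cond[j] = cum.sum(axis=0)                # sum_k P(Y >= k | z) = E[Y | z]
    mu = cond @ w                                # E[Y_j]
    cross = (cond * w) @ cond.T                  # E[Y_j Y_k]; valid off-diagonal only
    return mu, cross

def residual_moments(y, mu, cross):
    """Observed minus model-implied moments, cf. Equations 4.4 and 4.5."""
    e_j = y.mean(axis=0) - mu                    # residual means
    e_jk = (y.T @ y) / len(y) - cross            # residual cross-products (j != k)
    return e_j, e_jk
```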

Cai and Hansen (2013) found that the $M_2^*$ statistic has better calibrated Type I error performance than its predecessor $M_2$ (Maydeu-Olivares and Joe, 2006) when the test is long and the items are polytomous.

For bivariate fit assessment, we compared the FPC with two diagnostics recommended by Liu and Maydeu-Olivares (2014): namely, the bivariate residual cross-product z-test and the bivariate Pearson's $X^2$-test with a method-of-moments correction. The former test statistic is a standardized version of Equation 4.5 that has an asymptotic standard normal distribution. The latter procedure amounts to computing Pearson's $X^2$ statistics for bivariate marginal tables (i.e., a 5 × 5 contingency table for two five-category items). Asymptotically, the statistic follows a mixture of independent chi-square distributions under the null model, which is further approximated by a scaled chi-square distribution having matched first and second moments. In both methods, the Hessian-form estimate of the Fisher information matrix is used. For ease of graphical presentation of the results, bivariate fit statistics were computed only for item pairs in the first half of the test.

To visualize the Type I error behavior of the test statistics of interest, we plot their empirical rejection rates against the nominal α-levels, i.e., the empirical cdf of the p-values, ranging from 0 to 0.2. The diagonal line on the plot corresponds to uniform p-values; liberal and conservative statistics fall in the upper- and lower-diagonal regions of the plot, respectively.
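The plotting convention can be reproduced in a few lines; the p-values below are uniform stand-ins, so the resulting curve should hug the diagonal up to Monte Carlo error.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
pvals = rng.uniform(size=500)            # stand-in for 500 replication p-values

levels = np.linspace(0.0, 0.2, 201)      # nominal alpha-levels examined
rejection = np.array([(pvals <= a).mean() for a in levels])

plt.plot(levels, rejection)              # empirical cdf of the p-values
plt.plot(levels, levels, "--")           # uniformity reference line
se = np.sqrt(levels * (1 - levels) / 500)
plt.plot(levels, levels + 1.96 * se, ":")                    # 95% band, upper
plt.plot(levels, np.clip(levels - 1.96 * se, 0, None), ":")  # 95% band, lower
plt.xlabel("Nominal level")
plt.ylabel("Empirical rejection")
plt.show()
```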

[Figure 4.13: Type I error results for score-group goodness of fit tests. In each panel, the x-axis is the nominal α-level, and the y-axis is the empirical rejection rate pooled across 500 replications; the diagonal solid line indicates empirical rejection equal to the nominal level, and the dashed lines give a 95% normal-approximation confidence band. The three statistics (FPC centering, FPC partial, and the $M_2^*$-test) are displayed in different colors. The ESS means (standard deviations) for the partial FPC are: $T_{(1)}$ 303 (125), $T_{(2)}$ 289 (130), $T_{(3)}$ 404 (93), $T_{(4)}$ 446 (64), $T_{(5)}$ 433 (75), $T_{(6)}$ 450 (59), $T_{(7)}$ 443 (67), $T_{(8)}$ 406 (94), $T_{(9)}$ 292 (129), $T_{(10)}$ 302 (133).]

[Figure 4.14: Type I error results for bivariate goodness of fit tests (residual z-test, Pearson's $X^2$-test, FPC centering, and FPC partial) for item pairs among items 1 to 9. In each panel, the x-axis is the nominal α-level, and the y-axis is the empirical rejection rate pooled across 500 replications; the diagonal solid line indicates empirical rejection equal to the nominal level, and the dashed lines give a 95% normal-approximation confidence band. The numbers in each panel are the means and standard deviations (in parentheses) of the ESS associated with the partial FPC approach.]

Figure 4.13 shows the results for score-group goodness of fit tests. The reference procedure, i.e., the $M_2^*$ test, is slightly liberal. Cai and Hansen (2013) showed that the limited-information $M_2$ statistic under-performs when the first- and second-order marginal contingency tables are sparse. The same rationale applies to the $M_2^*$ statistic, and that may cause the observed liberality in the current simulation study, in which the data-generating item parameter values are extreme and likely to produce sparse lower-order margins. The centering FPC approach works reasonably well in all but the last score group, i.e., using statistic $T_{(10)}$, for which it rejects slightly more often than the nominal level. The partial FPC procedures yield on-target Type I errors in the middle score groups (from $T_{(3)}$ to $T_{(9)}$; $T_{(2)}$ is acceptable when α is small), but become unacceptably liberal in the extreme groups ($T_{(1)}$ and $T_{(10)}$). The ESS values associated with the sampling weights are on average greater than 300 in most score groups; they tend to vary more in the extreme groups than in the middle ones, which may contribute to the inflated Type I error rates.

Results for bivariate goodness of fit tests are presented in Figure 4.14. The residual cross-product z-test over-rejects for pairs involving the high-communality items (i.e., items 7, 8, and 9). In addition, in very few cases (≤ 10 for each pair) we obtained a negative asymptotic variance for the bivariate residual cross-product due to numerical error, which leads to an undefined test statistic; these cases were excluded when computing empirical rejection rates. The p-values obtained from Pearson's $X^2$-test, although less liberal, deviate significantly from uniformity: The test tends to over-reject at small α-levels (< 0.05), and then switches to under-rejection as α increases. The unsatisfying behavior of the two reference methods is likely traceable to the sparseness of the two-way marginal tables, resulting from the combination of the small sample size and extreme true parameter values. The centering FPC is less prone to such effects; it yields empirical rejection rates close to the nominal α-level for all 36 pairs of items plotted in Figure 4.14. The partial FPC approach, on the other hand, is conservative for some pairs of items. The ESS values are always below 100, which indicates that the weighted Monte Carlo samples used to calculate the p-values are noticeably degenerate; this may be the leading cause of the conservativeness.

We briefly summarize the Type I error results. We found that traditional goodness of fit testing procedures associated with ML estimation may not work well in small-sample conditions. The partial FPC approach works reasonably well in assessing score-group fit when observed proportions in the middle score groups are used as test statistics, but over-rejects when applied to proportions in the extreme score groups. For bivariate fit assessment, the partial approach suffers from noticeable degeneracy of the sampling weights and is overly conservative. Based on the Type I error results, we recommend the centering FPC method, whose rejection of correctly specified models is well controlled at the designated α-level of the test.

Similar graphical tables are reported for the two conditions in which the unidimensional GRM is misspecified. Inasmuch as the fitted model is wrong, we anticipate small p-values from the fit checking procedures; in other words, they ought to reject substantially more often than the nominal level. The more powerful the test, the more the empirical cdf of the p-values climbs above the line indicating uniformity.

[Figure 4.15: Power results for score-group goodness of fit tests, true model = 3-dimensional. In each panel, the x-axis is the nominal α-level, and the y-axis is the empirical rejection rate pooled across 500 replications; the slanted solid line indicates empirical rejection equal to the nominal level, the slanted dashed lines give a 95% normal-approximation confidence band, and the horizontal dashed line indicates the maximum power 1. The statistics (FPC centering, FPC partial, and the $M_2^*$-test) are displayed in different colors. The ESS means (standard deviations) for the partial FPC are: $T_{(1)}$ 426 (107), $T_{(2)}$ 172 (153), $T_{(3)}$ 206 (149), $T_{(4)}$ 380 (113), $T_{(5)}$ 358 (123), $T_{(6)}$ 396 (99), $T_{(7)}$ 390 (104), $T_{(8)}$ 213 (149), $T_{(9)}$ 180 (154), $T_{(10)}$ 421 (108).]

[Figure 4.16: Power results for bivariate goodness of fit tests, true model = 3-dimensional. Items are reordered so that those loading on the same factor are grouped together; panels associated with within-factor pairs are highlighted with a light-gray background. Axes and reference lines are as in Figure 4.15; the numbers in each panel are the means and standard deviations (in parentheses) of the ESS associated with the partial FPC approach.]

When the data were generated from a three-factor model, the $M_2^*$ test has the highest power to reject the fitted unidimensional model (see Figure 4.15). Because $M_2^*$ incorporates all bivariate residual cross-product moments in its computation, it is sensitive to under- and over-fitted residual "correlations" among item responses, which is, in an intuitive sense, what differentiates a multidimensional factor analytic model from a unidimensional one. For the FPC, the centering and partial approaches exhibit disparate patterns: The centering approach has moderate power in the middle groups (i.e., statistics $T_{(5)}$ and $T_{(6)}$) and very little power otherwise; meanwhile, the partial approach has moderate power in the more extreme groups (i.e., statistics $T_{(2)}$, $T_{(3)}$, $T_{(8)}$, and $T_{(9)}$) and little power elsewhere. Note that the partial FPC also rejects substantially more often than the nominal level for $T_{(1)}$ and $T_{(10)}$; however, the inflated Type I error rates in those cases should be taken into account.

Results for bivariate tests are tabulated in Figure 4.16. All testing procedures under investigation tend to reject more within-factor pairs (highlighted by the gray background) than between-factor ones, and the power increases as the slopes increase. This pattern is similar to that observed by Liu and Maydeu-Olivares (2014). For within-factor pairs, the power of the centering FPC stays at roughly the same level regardless of the true threshold values involved in the pair, while the power of the other three methods increases as the thresholds become more extreme. As a result, the centering FPC is preferred for symmetric items (items 1, 4, and 7), and the partial FPC and the z-test are preferred for skewed items (items 3, 6, and 9); the $X^2$-test, however, always has the lowest power for detecting within-factor pairs. For between-factor pairs, the residual z-tests outperform the $X^2$-test and both FPC approaches. The proposed FPCs are under-powered for detecting between-factor misfit in general, especially the centering approach, which is non-responsive to any of those pairs.

[Figure 4.17: Power results for score-group goodness of fit tests, true model = mixture. Axes and reference lines are as in Figure 4.15. The ESS means (standard deviations) for the partial FPC are: $T_{(1)}$ 192 (126), $T_{(2)}$ 261 (144), $T_{(3)}$ 445 (60), $T_{(4)}$ 448 (49), $T_{(5)}$ 391 (72), $T_{(6)}$ 419 (57), $T_{(7)}$ 440 (54), $T_{(8)}$ 444 (68), $T_{(9)}$ 271 (136), $T_{(10)}$ 186 (121).]

[Figure 4.18: Power results for bivariate goodness of fit tests, true model = mixture. Axes and reference lines are as in Figure 4.15; the numbers in each panel are the means and standard deviations (in parentheses) of the ESS associated with the partial FPC approach.]

Finally, we summarize the simulation results for data generated from a unidimensional GRM in which the latent variable follows a two-component normal mixture; graphical tables are presented in Figures 4.17 and 4.18.

Among all the score-group fit diagnostics of interest, only statistics $T_{(2)}$ and $T_{(9)}$ are recognizably responsive to the misspecified latent variable distribution; in particular, the partial FPC is more powerful than the centering FPC. This is attributable to the shape of the latent variable distribution in the data generating model: The distribution has two modes located at the low and high ends of the standard normal scale, respectively, so the observed proportions in the extreme score groups are under-estimated when a standard normal latent variable is fitted. We also observe that the $M_2^*$ test rejects even less often than the nominal level. The fact that $M_2^*$ and the observed score profile combine information in the full item response contingency table differently might be the main source of the differential power, which itself has little to do with the comparison between likelihood-based and fiducial inference. Future investigations focusing on the latter comparison are encouraged: For example, statistics based on a summary of univariate and bivariate margins should be developed for the FPC, and their performance should be compared with $M_2$.

For pairwise tests, the power to reject the fitted model increases as the true slope values associated with the pair increase. For the low-slope items in the current simulation, the generating model is almost indistinguishable from the independence model, which makes the latent variable distribution irrelevant; that explains why the misspecified latent variable distribution is seldom identified for low-slope pairs. The partial FPC is the most powerful procedure, followed by the bivariate z-test; in contrast, the centering FPC and the $X^2$-test do not reject much more often than the nominal α-level.

Regarding power performance, different test statistics summarize information in different ways; consequently, for a specific type of data generating model differing from the fitted one, some diagnostics may be more sensitive than others. In the current work, we only consider misspecifying the dimensionality or the shape of the latent variable distribution, which are alternatives to the unidimensional GRM that have been widely used in practice. Both types of overall model misspecification can be detected by test statistics based on the sum-score profile; in contrast, the $M_2^*$ statistic fails to identify the incorrectly specified latent variable distribution. Therefore, the former approach is preferred. At the item-pair level, however, no clear winner emerges among the candidates investigated. The centering FPC rejects most of the fitted models for within-factor pairs under the three-dimensional alternative, but rejects no more often than the nominal level for between-factor pairs and under the mixture alternative. The partial FPC, despite its decent power to detect the mixture model, is also unable to identify between-factor pairs. The z-test has some power to detect most pairs under both alternative models; however, it suffers from an inflated Type I error, and occasionally the statistic cannot be computed due to a negative asymptotic variance estimate.

4.5 Bifactor models: Simulation design

In the final part of this chapter, we compare fiducial and likelihood-based interval estimators in the recovery of bifactor model parameters. We considered a fully crossed design with two levels of sample size, n = 200 and 500, and two levels of test length, m = 9 and 18; under each condition, 500 data sets were simulated. The data generating model has a general latent variable, on which all items in the test load, and three secondary latent variables, each of which links to one-third of the items; the four latent variables are orthogonal to each other.

Each item has $K_j = 3$ categories. As in the previous simulation study with unidimensional GRMs, we converted the item parameters to the standardized scale and manipulated those values. The standardized factor loading vector under a multidimensional GRM can be expressed as

$$\lambda_j = \frac{\beta_j/1.7}{\sqrt{1 + \beta_j^\top \beta_j/1.7^2}}, \qquad (4.6)$$

extending the previous expression (Equation 4.1). In the data-generating model, the loading/slope vector for each item j has only two non-zero elements, corresponding to the related primary and secondary latent variables for that item. In our simulation study, the two design factors that determine the true loading parameters are: a) the communality $\lambda_j^\top \lambda_j$, and b) the relative impact of the general factor, measured by the proportion of the common variance explained by the general factor. Values 0.1, 0.5, and 0.9 were chosen to represent low, medium, and high communality levels. The weak, moderate, and strong general factors explain 25%, 50%, and 75% of the common variance, respectively. Similar to Equation 4.2, the multidimensional version of the threshold parameter is defined as

$$\tau_{jk} = \frac{-\alpha_{jk}/1.7}{\sqrt{1 + \beta_j^\top \beta_j/1.7^2}}, \quad k = 1, 2. \qquad (4.7)$$

The threshold parameters used in data generation were $(-1.25\ \ 0.25)^\top$ for odd items and $(-0.25\ \ 1.25)^\top$ for even items.

In addition to the standardized and the original slope-intercept parameterizations, we are also interested in a multidimensional generalization of the item difficulty parameter that is interpreted somewhat differently from the unidimensional version in Equation 4.3. In a unidimensional GRM, the difficulty parameter $\delta_{jk} = -\alpha_{jk}/\beta_j$ can alternatively be explained as the solution in $z_i$ of the equation $\alpha_{jk} + \beta_j z_i = 0$, which pins down the center of the corresponding logistic curve, i.e., the point at which $1/(1 + e^{-\alpha_{jk} - \beta_j z_i}) = 1/2$. When $\beta_j$ is $r$-dimensional, however, the corresponding linear equation $\alpha_{jk} + \beta_j^\top z_i = 0$ determines an $(r-1)$-dimensional linear subspace, in which infinitely many values map onto a 50% probability. In this case, we compute the distance from the origin of the latent variable space to that subspace, and consider it an overall measure of item intensity (see, e.g., Reckase, 2009). The formula for this distance, subsequently referred to as the item difficulty parameter, can be derived straightforwardly using geometry:

$$\delta_{jk} = \inf\{\|z_i\| : \alpha_{jk} + \beta_j^\top z_i = 0\} = \frac{-\alpha_{jk}}{\sqrt{\beta_j^\top \beta_j}}. \qquad (4.8)$$
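The three transformations in Equations 4.6 to 4.8 are easy to check numerically; the sketch below reproduces the item 7 row of Table 4.2 (up to rounding) from its slope-intercept values.

```python
import numpy as np

def standardize_grm(beta, alpha, D=1.7):
    """Standardized loadings, thresholds, and multidimensional difficulties
    from one item's slope-intercept parameters (Equations 4.6-4.8)."""
    denom = np.sqrt(1.0 + beta @ beta / D**2)
    lam = (beta / D) / denom               # loadings, Equation 4.6
    tau = -(alpha / D) / denom             # thresholds, Equation 4.7
    delta = -alpha / np.sqrt(beta @ beta)  # difficulties, Equation 4.8
    return lam, tau, delta

# Item 7 of Table 4.2: beta = (4.42, 2.55, 0, 0), alpha = (6.72, -1.34).
lam, tau, delta = standardize_grm(np.array([4.42, 2.55, 0.0, 0.0]),
                                  np.array([6.72, -1.34]))
print(lam.round(2), tau.round(2), delta.round(2))
# -> [0.82 0.47 0.   0.  ] [-1.25  0.25] [-1.32  0.26]
```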

Table 4.2 shows the data generating values of the five types of parameters for the first nine items; these values were repeated twice in the 18-item conditions.

Table 4.2: Data-generating parameter values for the bifactor GRM (m = 9)

                                      Loading                      Thresholds
Item  Communality  General factor  λj1    λj2    λj3    λj4      τj1     τj2
1     low          strong          0.27   0.16   0.00   0.00    -1.25    0.25
2     low          moderate        0.22   0.00   0.22   0.00    -0.25    1.25
3     low          weak            0.16   0.00   0.00   0.27    -1.25    0.25
4     medium       strong          0.61   0.35   0.00   0.00    -0.25    1.25
5     medium       moderate        0.50   0.00   0.50   0.00    -1.25    0.25
6     medium       weak            0.35   0.00   0.00   0.61    -0.25    1.25
7     high         strong          0.82   0.47   0.00   0.00    -1.25    0.25
8     high         moderate        0.67   0.00   0.67   0.00    -0.25    1.25
9     high         weak            0.47   0.00   0.00   0.82    -1.25    0.25

            Slope βj                     Intercepts        Difficulties
Item  βj1    βj2    βj3    βj4       αj1      αj2        δj1      δj2
1     0.49   0.28   0.00   0.00     2.24    -0.45      -3.95     0.79
2     0.40   0.00   0.40   0.00     0.45    -2.24      -0.79     3.95
3     0.28   0.00   0.00   0.49     2.24    -0.45      -3.95     0.79
4     1.47   0.85   0.00   0.00     0.60    -3.01      -0.35     1.77
5     1.20   0.00   1.20   0.00     3.01    -0.60      -1.77     0.35
6     0.85   0.00   0.00   1.47     0.60    -3.01      -0.35     1.77
7     4.42   2.55   0.00   0.00     6.72    -1.34      -1.32     0.26
8     3.61   0.00   3.61   0.00     1.34    -6.72      -0.26     1.32
9     2.55   0.00   0.00   4.42     6.72    -1.34      -1.32     0.26

The candidate interval estimators in this section are the fiducial percentile CIs and the Hessian-form and cross-product-form ML Wald CIs. The Gibbs sampler for fiducial estimation was configured similarly to the unidimensional runs, with the exception that the parameter bound of the initial bounding box (Equation 3.8) was set to M = 50. Again, the value of M was selected to ensure desirable coverage while retaining efficiency; a larger value of M, implying less shrinkage of the parameters, was specified for the fitted bifactor model out of the concern that it is not as well identified as the unidimensional model. ML estimates and standard errors were computed using the software Mplus 7.0 with the same quadrature and convergence settings. Note that although the latent variable is four-dimensional in the fitted model, the effective dimensionality for each item is only two; the software implements by default an automatic dimension reduction strategy (see Gibbons and Hedeker, 1992; Cai, 2010c) so that the numerical integration is conducted over a two-dimensional space.

4.6 Bifactor models: Parameter recovery

We computed empirical coverage probabilities and median lengths of the three types of interval estimators for all seven types of parameters, and tabulated them in Figures 4.19 to 4.22. There were identification issues when fitting the bifactor GRM with ML estimation: For short tests (m = 9), there were 20 replications for which the Hessian matrix of the log-likelihood was non-positive-definite when n = 200, and 19 such replications when n = 500; for long tests (m = 18), there were 59 such replications when n = 200 and 13 when n = 500. In those cases, Wald-type intervals with the Hessian-form standard errors cannot be computed. Not only does the poor identification lead to numerically ill-conditioned Hessian matrices, it also reduces the coverage of the resulting interval estimators. For the high-communality items (items 7, 8, and 9), both types of Wald CIs have low coverage for the intercept and slope parameters; in the small-sample conditions, the coverage can be even lower than 0.8. Their performance improves slightly when the sample size increases to 500, but the empirical coverage probabilities are still significantly lower than the nominal 95% level. The Hessian-form intervals have low coverage for large loading parameters and small difficulty parameters, which is similar to what was observed in the unidimensional simulations and is conjectured to be ascribable to the failure of the quadratic approximation to the log-likelihood function when the true values approach the boundary. Across all the conditions and parameterizations under investigation, the cross-product-form standard errors are uniformly larger than the Hessian-form ones, and thus in most cases fare ineffectively. In comparison, the fiducial CIs are able to maintain on-target coverage for most parameters of interest. They can be slightly liberal for large factor loading values (e.g., both primary and secondary loadings for items 7, 8, and 9); but even in those cases, the coverages are almost always higher than those of the Wald CIs.

[Figure 4.19: Empirical coverage and median length of the three types of interval estimators (ML Wald cross-product, ML Wald Hessian, and fiducial quantile; shown in different colors) for the seven types of parameters (threshold, difficulty, intercept, primary and secondary loading, primary and secondary slope) of items 1 to 9. Coverage is plotted in the upper panel of each row and median length in the lower panel; parameters belonging to different items are separated by vertical dotted lines, and the two horizontal dashed lines in the coverage panel give a 95% normal-approximation confidence band for the nominal level 0.95. Here, the sample size n = 200 and the number of items m = 9.]

[Figure 4.20: Same layout as Figure 4.19, with n = 500 and m = 9.]

[Figure 4.21: Same layout as Figure 4.19, with n = 200 and m = 18.]

[Figure 4.22: Same layout as Figure 4.19, with n = 500 and m = 18.]

However, the good coverage comes at a noticeable cost in efficiency: For instance, wider intervals result for the slopes and intercepts of items 4, 5, and 6 when the tests are short, and for those of items 7, 8, and 9 when the tests are long.

4.7 Conclusion

We conclude that GFI is able to offer reliable statistical inference in extreme situations, i.e., under the combination of small sample sizes and extreme true parameter values, where conventional likelihood-based and Bayesian approaches may fail. This suggests that GFI is particularly useful in substantive studies in which a large calibration sample is not practical (e.g., sampling from a clinical population) and/or some of the items of interest have very low endorsement rates (e.g., suicide-attempt items in a depression scale). Methodologists have long been hesitant to calibrate items with extremely low endorsement rates, because such items are likely to cause trouble in the numerical search for the ML solution. Even when the ML estimates and standard errors can be obtained, inferences based on them are less trustworthy according to our simulation results, especially when the calibration sample is not large. Now, with the aid of the proposed Gibbs sampler, well-calibrated fiducial CIs for extreme item parameters are obtainable in very small samples, and thus those low-endorsement items can be handled more appropriately.

Item parameters control the sparseness of the univariate margins. Together with the Type I error results for the FPC, we conjecture that GFI is in general less affected by the sparseness of the item response contingency table, which has been identified as the major source of the discrepancy between the empirical performance and the asymptotic theory of conventional ML-based inference procedures. The higher-order asymptotic properties of the fiducial distribution, which have yet to be fully established, may serve as the theoretical basis for this observation.

Although the proposed implementation of GFI has been applied to unidimensional and bifactor GRMs with a certain degree of success, there are limitations to be addressed by future research.

A better mechanism for determining an optimal value of the parameter bound M involved in Equation 3.8 is one potential topic. M is a crucial tuning parameter required by the sampling algorithm, and we found in our pilot study that it does exert an influence on the results when the sample size is relatively small compared to the magnitude of the data-generating parameter values. A practical recommendation based on our experience is to run the sampler with multiple choices of M, from high to low, and select the smallest value before the interval estimates start to vary. In some scenarios, it might also be reasonable to allow different M values for different items.

We should also explore other choices of test statistics to be used in conjunction with the flexible FPC framework. It was observed in the power simulations that the current choices of statistics may have low power to detect certain types of model misspecification at both the global and the pairwise scope. For example, a score-group statistic based on all bivariate margins, analogous to the $M_2$ and $M_2^*$ statistics, may be more powerful for revealing misspecified dimensionality of the latent variable. For item pairs, the importance weights calculated in the partial FPC tend to concentrate on very few Monte Carlo draws, which may trigger the unstable performance; this could be induced by the normal approximation of the statistic's likelihood (see Appendix E), or by the specific choice of test statistic, i.e., the cross-product moment (Equation 2.33). Thorough investigations at both the theoretical and the empirical level should be conducted to resolve this problem.
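The recommendation above might be automated along the following lines. The sampler interface run_gibbs_sampler is hypothetical and is replaced here by a random stand-in so that the sketch executes; the tolerance is arbitrary, so this should be read as a heuristic rather than a validated selection rule.

```python
import numpy as np

def run_gibbs_sampler(data, M):
    """Hypothetical stand-in for the fiducial Gibbs sampler; it only mimics
    the interface (returns draws bounded by M) so the sketch executes."""
    rng = np.random.default_rng(M)
    return np.clip(rng.normal(size=(5000, data.shape[1])), -M, M)

def ci_from_draws(draws):
    return np.percentile(draws, [2.5, 97.5], axis=0)

def select_bound(data, grid=(200, 100, 50, 25, 10), tol=0.05):
    """Walk M from high to low; keep the smallest M whose percentile CIs
    still agree (within tol) with those from the largest bound."""
    ref = ci_from_draws(run_gibbs_sampler(data, M=grid[0]))
    chosen = grid[0]
    for M in grid[1:]:
        ci = ci_from_draws(run_gibbs_sampler(data, M=M))
        if np.max(np.abs(ci - ref)) > tol:   # interval estimates start to vary
            break
        chosen = M
    return chosen
```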

CHAPTER 5: EMPIRICAL EXAMPLE

In this chapter, ordinal response data from the Patient Reported Outcomes Measurement Information System (PROMIS; Irwin, Stucky, Langer, Thissen, DeWitt, Lai, Varni, Yeatts, and DeWalt, 2010) study of chronic illness are analyzed using the proposed implementation of generalized fiducial inference (GFI). Several interesting inferential examples are presented, including goodness of fit assessment and interval/set estimation for complex transformations of parameters, all of which are derived straightforwardly from a Monte Carlo sample of the fiducial distribution generated by the proposed Gibbs sampler. We illustrate that GFI, similar to Bayesian inference, is more flexible and straightforward than traditional likelihood-based methods in accounting for the variability of parameter estimation, an important part of many inferential procedures that has long been ignored in practice. GFI does not require prior specification, which circumvents the subjective judgment involved in Bayesian analysis.

The data set comprises 455 complete responses to 22 short-form items that are designed to measure three aspects of emotional distress: anger (6 items), anxiety (8 items), and depression (8 items). Although the proposed method is able to handle missing data in a natural fashion, we applied listwise deletion in order to mimic the scenario of small-sample calibration. All items have five response categories, on a common response scale ranging from 0 to 4: 0 = never, 1 = almost never, 2 = sometimes, 3 = often, and 4 = almost always. The item stems are tabulated in Table 5.1.

A unidimensional GRM was fitted as the base model. It has been repeatedly observed that anger, anxiety, and depression are highly correlated; as a result, the one-dimensional latent variable is likely to reflect a general dimension of emotional distress.

However, items measuring the same symptomatology tend to co-vary more than can be explained by the overall emotional distress latent variable. The global and local fit of the unidimensional model was checked using the fiducial predictive check (FPC). Next, we used a three-dimensional exploratory item factor analysis (EIFA) to investigate the underlying factor structure of the scale. Similar to Liu and Hannig (2014), we constructed confidence intervals (CIs) for rotated factor loadings and factor inter-correlations, which are implicitly defined transformations of the item slopes. Finally, we fit a bifactor model with a primary dimension, on which all items load, and three secondary dimensions for the items of the three subscales. Using the bifactor model, we discuss inferential examples including drawing bivariate confidence contours for a pair of item parameters and confidence bands (CBs) for functional transformations of item parameters.

All the models were fitted by the same Fortran program that was used in the simulation study. Parameter estimates from the default limited-information estimator in Mplus (i.e., estimator = WLSMV) were obtained in advance and set as the initial parameter values $\theta^{(0)}$; the corresponding factor score estimates (i.e., save = fscores) were used as the starting values for the normal variates $z^{(0)}$. Item parameters were restricted to the bounding box (Equation 3.8) with M = 50; the starting algorithm (Algorithm 7) was then executed to produce $a^{(0)}$, which completes the initialization stage of the sampler. For each model of interest, a total of 60000 Markov chain Monte Carlo cycles was obtained; after discarding the first 10000 cycles as burn-in to reduce the impact of the starting values, we recorded 1/10 of the remaining 50000 Monte Carlo samples by thinning. Therefore, all the statistical procedures discussed in the sequel, unless specifically indicated otherwise, are based on 5000 draws.
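The burn-in and thinning scheme just described reduces to array slicing, as the following sketch shows; the chain is a random stand-in for the sampler output.

```python
import numpy as np

# 60000 cycles, first 10000 discarded as burn-in, every 10th draw kept.
# `chain` is a random stand-in; a real chain would come from the sampler.
chain = np.random.default_rng(7).normal(size=(60000, 3))
kept = chain[10000::10]
assert kept.shape[0] == 5000                   # the 5000 retained draws

ci = np.percentile(kept, [2.5, 97.5], axis=0)  # 95% percentile CIs per column
```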

5.1 A unidimensional model

First, a unidimensional GRM was fitted to all 22 emotional distress items, and the model fit was checked in the sum-score profile and the bivariate marginal tables by both the centering and the partial FPC. For fit to sum-score levels, we split the entire score range into 10 groups and computed the observed proportion in each group as the test statistics; the results are shown in Figure 5.1.

Table 5.1: PROMIS emotional distress short-form items

Label  Item stem
Ang1   I felt fed up.
Ang2   I felt mad.
Ang3   I felt upset.
Ang4   I was so angry I felt like throwing something.
Ang5   I was so angry I felt like yelling at somebody.
Ang6   When I got mad, I stayed mad.
Anx1   I worried about what could happen to me.
Anx2   I was afraid that I would make mistakes.
Anx3   I felt nervous.
Anx4   I felt like something awful might happen.
Anx5   I felt scared.
Anx6   I worried when I went to bed at night.
Anx7   I thought about scary things.
Anx8   I felt worried.
Dep1   I felt alone.
Dep2   I felt like I couldn't do anything right.
Dep3   I felt everything in my life went wrong.
Dep4   I felt sad.
Dep5   I thought that my life was bad.
Dep6   I could not stop feeling sad.
Dep7   I felt lonely.
Dep8   I felt unhappy.

Figure 5.1. We also examined model fit to each pair of items with the bivariate cross-product statistic (Equation 2.33); a graphical summary of the results can be found in Figure 5.2.

It is anticipated that there exist residual associations among the items that measure the same domain; however, neither the centering nor the partial FPC based on the observed sum-score profile suggests model misfit (see Figure 5.1). The smallest p-value, i.e., 0.02, is observed for the lowest score group when the centering FPC is used. Although the numerical value is less than 0.05, it is likely to be a false positive case out of the multiple hypothesis tests displayed in Figure 5.1. It has been discovered in the simulation study that the sum-score profile approach is not particularly sensitive to the multidimensional alternative model, even when the true factors are distinct (correlation = 0.3). For the emotional distress scale, we expect much higher inter-correlations among the three domains (which is confirmed later by

the EIFA); therefore, the non-significant results might be attributed to the further dampened power of the testing procedures.
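As a hedged illustration of the sum-score test statistics, the observed group proportions can be computed as in the following sketch; `resp` (the 455 × 22 matrix of item responses scored 0-4) is a hypothetical object name, and the group boundaries are read off Figure 5.1.

    # Observed proportions in the 10 sum-score groups of Figure 5.1.
    sum_scores <- rowSums(resp)                 # possible range: 0 to 88
    breaks <- c(-0.5, 8.5, 17.5, 26.5, 35.5, 44.5, 52.5, 61.5, 70.5, 79.5, 88.5)
    groups <- cut(sum_scores, breaks = breaks)  # [0,8], [9,17], ..., [80,88]
    observed_prop <- as.numeric(table(groups)) / length(sum_scores)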

Figure 5.1: Fiducial predictive p-values for the sum-score group fit of a unidimensional GRM. The predictive p-values are plotted against the score groups. The centering and partial FPCs are shown in different colors; the numbers in the plot are the effective sample sizes (ESSs) associated with partial predictive p-values. The horizontal dashed line marks the nominal α-level 0.05.

In contrast, the bivariate tests suggest a clear pattern of symptom-specific clustering of items (see Figure 5.2): The bivariate covariances for items measuring the same domain tend to be significantly under-estimated by the fitted unidimensional model, while the cross-domain associations are likely to be over-estimated. The direction of misfit, which is shown by color codes in Figure 5.2, is determined differently for the two methods of FPC. For the centering approach, a positive residual after subtracting the asymptotic mean of the test statistic indicates under-estimation, and similarly a negative residual indicates over-estimation. For the partial approach, the direction is determined by which tail of the fiducial predictive distribution the observed test statistic falls into. We conclude that Figure 5.2 indicates the existence of three factors for anger, anxiety, and depression items, respectively; however, we also notice that the depression items (i.e., the third diagonal block) show less severe misfit,

and thus the corresponding factor is relatively weak. In addition to the obvious diagonal blocks of within-domain pairs, items Anx2 (“I was afraid that I would make mistakes”) and Dep2 (“I felt like I couldn't do anything right”) also exhibit a significantly positive residual dependency, which is identified by both the centering and partial FPCs. This particular pair of items was also found by Liu and Thissen (2014) using their score test of local dependence. They explained that since both items are related to “making mistakes”, the correlation is higher than can be explained by a general factor of emotional distress.

5.2 A three-dimensional exploratory model

Our bivariate fit diagnostics suggest that the scale has three underlying dimensions; next, we used a three-dimensional EIFA to examine such a structure. In EIFA, the standardized loading-threshold parameterization is often reported, for the reason that it is on a scale that facilitates the computation of the variance/covariance of test items explained by the common factors. In addition, analytic rotation methods are often applied to the estimated factor loadings (see Browne, 2001, for a review) to obtain more interpretable patterns of item-factor dependency. Our goal in this section is to obtain point estimates and CIs for rotated factor loadings and factor inter-correlations. The EIFA amounts to the minimally constrained model for a given number of factors.

Here, the fitted three-factor EIFA model was parameterized as Equation 1.1 with $\beta_1 = (\beta_{11}\ 0\ 0)^\top$, $\beta_2 = (\beta_{21}\ \beta_{22}\ 0)^\top$, and $\beta_j = (\beta_{j1}\ \beta_{j2}\ \beta_{j3})^\top$ for $j = 3, \dots, 22$. Then the

(multidimensional) unrotated factor loadings $\lambda_1, \dots, \lambda_{22}$ can be calculated by Equation 4.6. Let $\Lambda = (\lambda_1 \cdots \lambda_{22})^\top$ be the unrotated factor loading matrix. A well-known property of the exploratory model is rotational indeterminacy. Let $Q$ be a $3 \times 3$ oblique rotation matrix that is invertible and whose inverse is normalized, i.e., $Q^{-1}Q^{-\top}$ has unity on the diagonal. Define the rotated factor loadings $\tilde{\Lambda} = \Lambda Q$ and the factor inter-correlations $\Phi = Q^{-1}Q^{-\top}$. Then, rotational indeterminacy refers to the fact that

$$\Lambda\Lambda^\top = \Lambda Q Q^{-1} Q^{-\top} Q^\top \Lambda^\top = \tilde{\Lambda} \Phi \tilde{\Lambda}^\top. \tag{5.1}$$

Equation 5.1 implies that for all eligible rotation matrices $Q$ the rotated and unrotated solutions fit the data equally well; however, a properly selected rotation matrix may greatly simplify the pattern of the loading matrix and thus ease the interpretation. In particular, we select the $Q$ matrix that minimizes the Crawford-Ferguson quartimax criterion of simple structure (Crawford and Ferguson, 1970). The constrained optimization problem can be solved by a gradient projection algorithm, which has been implemented in the R package GPArotation (Bernaards and Jennrich, 2005). Consequently, the resulting rotated factor loadings $\tilde{\Lambda}$ and correlation matrix $\Phi$ are implicit non-linear transformations of the unrotated loadings $\Lambda$. For maximum likelihood (ML) estimation, the delta-method standard errors for the rotated ML solutions can be obtained via implicit differentiation; see Jennrich (1973) for details. With GFI, however, the fiducial distribution of the rotated solutions can be easily approximated by applying the transformation given in Equation 4.6 and then the rotation algorithm to each Monte Carlo sample from the marginal fiducial distribution of item slopes.

In the rotation algorithm, the sign and order of the factors are not identified, because altering them does not affect the value of the criterion function being minimized. In order to harmonize the orientation of the resulting factors across the 5000 Monte Carlo samples, we apply a matching procedure similar to that described by Asparouhov and Muthén (2012) in the context of Bayesian EIFA. Specifically, the sum of the rotated factor loadings related to each factor is constrained to be greater than zero in order to identify the sign. As for ordering, we select the particular permutation of the three factors such that the correspondingly permuted factor loading matrix has the least sum of squared differences from the one obtained from the fiducial median of the unrotated loadings. A sketch of the per-draw rotation and sign identification appears below.
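The following is a minimal sketch of this step, assuming that cfQ() in GPArotation performs the oblique Crawford-Ferguson rotation with kappa = 0 corresponding to the quartimax criterion; the factor-permutation matching is omitted, and Lambda is a hypothetical name for one draw's 22 × 3 unrotated loading matrix.

    library(GPArotation)

    rotate_draw <- function(Lambda) {
      rot <- cfQ(Lambda, kappa = 0)  # oblique Crawford-Ferguson quartimax rotation
      L   <- rot$loadings
      Phi <- rot$Phi
      # Sign identification: force each factor's loading sum to be positive.
      s <- ifelse(colSums(L) < 0, -1, 1)
      L   <- sweep(L, 2, s, `*`)
      Phi <- diag(s) %*% Phi %*% diag(s)
      list(loadings = L, Phi = Phi)
    }
    # Applying rotate_draw() to each of the 5000 fiducial draws (followed by the
    # permutation matching described above) yields the fiducial sample of rotated
    # loadings and factor inter-correlations.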

The fiducial densities, point estimates, and confidence intervals for rotated factor loadings and factor inter-correlations are displayed in Figures 5.3 and 5.4.

As expected, the quartimax rotation of three factors corresponds with the three subsets of items measuring anger, anxiety, and depression, respectively. Occasionally, we observe items with cross-loadings. For example, items “I felt upset” (Ang3) and “I was afraid that I would make mistakes” (Anx2) also load on the third factor dominated by depression items. Some of the weaker cross-loadings might not be significantly different from 0 after the false coverage-statement rate (FCR; Benjamini and Yekutieli, 2005) is controlled. We recommend adjusting for FCR in practice (see Liu and Hannig, 2014, for a detailed description of the procedure); however, it is omitted here for simplicity. In addition, items “I felt alone” (Dep1) and “I felt lonely” (Dep7) have very strong loadings (fiducial median ≈ 0.9) on the third factor, and thus their fiducial densities are highly skewed. Our simulation findings suggest that fiducial percentile CIs are more reliable than the Wald-type CIs when the parameters are close to the boundary; thus, we are confident in the reported intervals.

The factor inter-correlations are moderately high, with the CIs covering the range 0.5–0.7; this implies that anger, anxiety, and depression are likely to co-occur. Motivated by the high correlation estimates among the three factors, we continue to distinguish the proportion of individual differences in general emotional distress shared by all three symptomatology domains from those domain-specific granularities. This leads naturally to our next analysis: fitting a bifactor model with a general factor on which all items load, and three domain-specific factors for the anger, anxiety, and depression items, respectively.

5.3 A bifactor model

The point estimates (fiducial median) and the confidence limits (percentile CI) of all slope and intercept parameters in the bifactor GRM are in Table 5.2. Interestingly, the pair of items “I felt alone” (Dep1) and “I felt lonely” (Dep7) are the only significant indicators of the depression-specific dimension; the two item stems are almost identical (“alone” versus “lonely”). This phenomenon has been referred to as “theta-theft” in the existing literature¹: i.e., a latent variable is defined by a small subset of locally dependent items which are similar to each other in construct-irrelevant aspects (Thissen and Steinberg, 2010, p. 131). For the current analysis, it implies that the particular secondary dimension only captures the wording similarity of the two locally dependent items above and beyond the general emotional

¹In the IRT literature, the Greek letter θ has been commonly used to denote latent variables, which is different from our notation Z in the current work.

distress, and that, with those two items in the analysis, the construct of depression itself is almost indistinguishable from the macro-level construct of emotional distress.

Figure 5.2: Fiducial predictive p-values for the bivariate goodness of fit of a unidimensional GRM. The lower triangle displays the p-values resulting from the centering FPC, and the upper triangle shows those resulting from the partial FPC. The color scale displays one minus the p-value (1 − p); the warm half of the scale indicates under-estimation of bivariate associations, and the cold half indicates over-estimation.


Figure 5.3: Fiducial median and 95% fiducial percentile CIs for rotated factor loadings in the three-factor EIFA. The tabular layout has a row for each item; item stems are listed in the leftmost column. The three columns of graphics correspond to the three factors, dominated by anger, anxiety, and depression items, respectively. Within each cell, the estimated fiducial density is shown in the background; the fiducial median (dot) and the percentile CI (interval) are superimposed. The 0 points on the factor loading scale are shown by vertical dashed lines.


Figure 5.4: Fiducial median and 95% fiducial percentile CIs for factor inter-correlations in the three-factor EIFA. The tabular layout resembles a correlation matrix. Within each cell, the estimated fiducial density is shown in the background; the fiducial median (dot) and the percentile CI (interval) are superimposed. The 0 point on the correlation scale is highlighted by vertical dashed lines.

Table 5.2: Fiducial point and interval estimates for item slopes and intercepts under the bifactor GRM

Item | Primary slope (L M U) | Secondary slope (L M U) | Intercept 1 (L M U) | Intercept 2 (L M U) | Intercept 3 (L M U) | Intercept 4 (L M U)
Ang1 | 1.40 1.70 2.05 | 0.91 1.20 1.55 | -0.02 0.28 0.56 | -1.40 -1.07 -0.77 | -3.82 -3.31 -2.85 | -6.12 -5.29 -4.52
Ang2 | 1.74 2.10 2.54 | 1.27 1.68 2.14 | 1.35 1.74 2.18 | -0.79 -0.42 -0.08 | -4.75 -4.02 -3.43 | -7.88 -6.66 -5.65
Ang3 | 1.92 2.25 2.64 | 0.68 0.99 1.31 | 0.81 1.15 1.47 | -0.87 -0.53 -0.22 | -4.58 -3.94 -3.41 | -6.43 -5.56 -4.79
Ang4 | 1.52 1.92 2.40 | 1.44 1.90 2.43 | -0.79 -0.42 -0.06 | -2.30 -1.83 -1.41 | -5.00 -4.26 -3.57 | -6.84 -5.78 -4.84
Ang5 | 1.79 2.27 2.84 | 1.75 2.24 2.88 | 0.24 0.67 1.07 | -1.65 -1.14 -0.74 | -4.76 -3.89 -3.23 | -7.60 -6.30 -5.28
Ang6 | 1.29 1.59 1.98 | 0.99 1.30 1.65 | -0.64 -0.33 -0.04 | -2.15 -1.76 -1.42 | -4.45 -3.86 -3.32 | -6.21 -5.34 -4.56
Anx1 | 1.90 2.29 2.77 | 1.33 1.68 2.08 | -0.31 0.05 0.40 | -2.02 -1.59 -1.21 | -4.56 -3.91 -3.37 | -6.51 -5.60 -4.82
Anx2 | 1.37 1.65 1.95 | 0.49 0.72 0.98 | 0.24 0.48 0.73 | -1.05 -0.79 -0.52 | -3.55 -3.12 -2.70 | -5.50 -4.77 -4.14
Anx3 | 1.46 1.78 2.15 | 1.19 1.53 1.90 | 0.22 0.53 0.83 | -1.45 -1.08 -0.77 | -4.88 -4.20 -3.62 | -7.27 -6.19 -5.27
Anx4 | 1.71 2.12 2.59 | 1.49 1.88 2.36 | -0.75 -0.38 -0.05 | -2.80 -2.29 -1.87 | -5.94 -5.07 -4.33 | -8.05 -6.81 -5.75
Anx5 | 1.95 2.45 3.04 | 1.80 2.30 2.93 | -1.12 -0.67 -0.27 | -3.09 -2.52 -2.00 | -6.72 -5.60 -4.70 | -9.79 -8.08 -6.72
Anx6 | 1.53 1.90 2.31 | 1.19 1.59 1.98 | -1.40 -1.02 -0.68 | -2.98 -2.49 -2.05 | -5.33 -4.57 -3.91 | -6.59 -5.64 -4.82
Anx7 | 1.19 1.51 1.87 | 1.22 1.57 1.97 | -0.73 -0.42 -0.12 | -2.13 -1.75 -1.39 | -4.63 -4.00 -3.44 | -6.04 -5.19 -4.45
Anx8 | 1.60 1.97 2.41 | 1.49 1.88 2.31 | 0.05 0.38 0.73 | -2.05 -1.65 -1.26 | -5.06 -4.34 -3.73 | -7.43 -6.29 -5.37
Dep1 | 3.28 4.33 6.69 | 1.22 2.12 4.02 | -2.74 -1.54 -0.92 | -6.45 -4.02 -3.07 | -12.83 -8.28 -6.46 | -15.79 -10.30 -7.97
Dep2 | 2.04 2.47 2.99 | -0.49 -0.04 0.42 | -1.34 -0.98 -0.67 | -2.98 -2.48 -2.05 | -5.27 -4.50 -3.86 | -7.04 -6.00 -5.11
Dep3 | 2.81 3.56 4.79 | -1.51 -0.53 0.13 | -2.31 -1.67 -1.15 | -4.57 -3.46 -2.73 | -7.66 -5.81 -4.70 | -10.12 -7.70 -6.22
Dep4 | 1.99 2.39 2.81 | -0.16 0.24 0.64 | -0.01 0.28 0.58 | -1.81 -1.46 -1.14 | -5.03 -4.33 -3.74 | -7.37 -6.32 -5.39
Dep5 | 2.33 2.85 3.52 | -0.89 -0.32 0.22 | -1.81 -1.36 -0.97 | -3.48 -2.86 -2.35 | -6.02 -5.05 -4.28 | -9.09 -7.48 -6.24
Dep6 | 1.71 2.08 2.52 | -0.23 0.16 0.55 | -1.80 -1.45 -1.12 | -3.16 -2.69 -2.27 | -5.25 -4.53 -3.90 | -7.22 -6.11 -5.16
Dep7 | 3.09 4.10 8.27 | 1.34 2.29 5.59 | -2.87 -1.42 -0.82 | -7.04 -3.58 -2.64 | -15.00 -7.76 -5.96 | -18.69 -9.68 -7.44
Dep8 | 2.45 2.93 3.52 | -0.20 0.18 0.58 | 0.11 0.45 0.81 | -1.89 -1.48 -1.11 | -5.78 -4.93 -4.21 | -8.16 -6.98 -5.95
L: Fiducial 2.5 percentile; M: Fiducial median; U: Fiducial 97.5 percentile.

The highest fiducial density regions (at nominal coverage levels 75%, 90%, and 95%) for each item's primary and secondary factor loading pairs are displayed in Figures 5.5 to 5.7. To obtain the contours on each two-dimensional parameter space, we used the R package ks (Duong, 2014) to estimate a two-dimensional density by kernel smoothing from the 5000 fiducial samples. We selected the bandwidth by the plug-in method (Wand and Jones, 1994) using the function Hpi() in the ks package. The implementation relies on the optimizer nlm(), which was found to be very inefficient when the number of data points is large. As a result, a further thinned sample of 500 Monte Carlo draws was extracted for bandwidth selection, while the entire sample was still used for the subsequent density estimation.
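As a concrete sketch of this contour computation for one item, assuming loading_draws is a hypothetical 5000 × 2 matrix of fiducial draws of the primary and secondary loadings:

    library(ks)

    sub <- loading_draws[seq(10, 5000, by = 10), ]  # 500 thinned draws for bandwidth selection
    H <- Hpi(sub)                                   # plug-in bandwidth (Wand & Jones, 1994)
    dens <- kde(x = loading_draws, H = H)           # density estimated from the full sample
    plot(dens, cont = c(75, 90, 95))                # 75%, 90%, and 95% highest-density contours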

Figure 5.5: Two-dimensional confidence regions for the primary and secondary loadings of anger items. The points are 500 draws selected via a further thinning interval of 10. The three contours shown on each panel are the 75%, 90%, and 95% highest fiducial density regions. The fiducial medians are highlighted by green crosses, and the numerical values are also shown in green text. The diagonal dashed line indicates that an item contributes evenly to the primary and secondary factors.

We are able to visualize the relative contributions of primary and secondary factors to

each item on those bivariate plots: the primary one dominates if the point cloud is below the diagonal line. This is the case for the anger item “I felt upset” (Ang3), the anxiety item “I was afraid that I would make mistakes” (Anx2), and all but the two locally dependent depression items. For the “alone/lonely” pair that governs the depression dimension, the point clouds are non-elliptical. In those cases, a normal approximation, which produces ellipses, is likely to yield contours of a completely different shape, affected by the downward tail trailing along the vertical direction.

In Figures 5.8 to 5.12, we demonstrate drawing CBs for functional transformations of item parameters, in order to account for the sampling variability induced by parameter estimation. We first consider the marginal expected score curve, defined for each item $j = 1, \dots, 22$ on each latent dimension $d = 1, \dots, 4$ as

$$s(z_{id}, \theta_j) = \sum_{k=1}^{K_j - 1} k \int_{\mathbb{R}^3} f_j(\theta_j, k \mid z_i)\, d\Phi(z_{i,-d}), \tag{5.2}$$
in which $z_{i,-d}$ is the three-dimensional vector corresponding to all but the $d$th dimension. For the sake of succinctness, the expected score function has been preferred over the item response function in visualizing item characteristics of ordinal items. In computation, the three-dimensional integral in Equation 5.2 needs to be approximated by quadrature; the quadrature grid being used here amounts to an outer product of three identical lists of 21 equally-spaced points from $-3$ to $3$. The effective dimension of integration is in fact two, because each item only has two non-zero slopes. On each secondary dimension in the fitted bifactor model, the marginal expected score curve for an item not loading on that particular dimension is flat (its level depends on the primary slope and the intercepts), and such curves are thus not shown in Figures 5.9 to 5.11.
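A minimal sketch of this quadrature computation follows. It assumes the cumulative parameterization $P(Y_{ij} \ge k \mid z_i) = \mathrm{logit}^{-1}(\tau_{jk}(\theta_j, z_i))$ with $\tau_{jk}(\theta_j, z_i) = \alpha_{jk} + \beta_j^\top z_i$ (cf. Equation B.21), under which the expected score equals $\sum_k P(Y_{ij} \ge k \mid z_i)$; alpha and beta are illustrative names for one fiducial draw of an item's four intercepts and four slopes (two of them zero).

    marginal_expected_score <- function(alpha, beta, d, grid = seq(-3, 3, length.out = 21)) {
      other <- setdiff(seq_along(beta), d)
      nodes <- as.matrix(expand.grid(rep(list(grid), length(other))))
      w <- apply(nodes, 1, function(z) prod(dnorm(z)))  # N(0,1) weights on z_{i,-d}
      w <- w / sum(w)                                   # renormalize the grid weights
      sapply(grid, function(zd) {
        z <- matrix(0, nrow(nodes), length(beta))
        z[, other] <- nodes
        z[, d] <- zd
        lin <- as.vector(z %*% beta)                    # beta' z at each grid point
        sum(w * rowSums(plogis(outer(lin, alpha, `+`))))  # average of sum_k P(Y >= k | z)
      })
    }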

For fixed $z_{id}$, Equation 5.2 is just a single transformed parameter, and then its 95% fiducial CI can be computed as usual. Pooling across all $z_{id} \in \mathbb{R}$, we obtain the pointwise CB, the green dashed lines in Figures 5.8 to 5.11. A 95% pointwise CB only ensures that at

each level of $z_{id}$ the true marginal expected score is covered 95% of the time over repeated sampling; sometimes, however, an extended statement for all possible values of $z_{id}$ is also in demand. Here, a procedure adapted from Thissen and Wainer (1990) is used to construct simultaneous CBs, the yellow dashed lines in Figures 5.8 to 5.11. Note that Equation 5.2 depends on six free item parameters, denoted $\theta_j$: four intercepts and two slopes. In order to determine a simultaneous CB for each curve, we first calculate a six-dimensional kernel density estimate (again, using the R package ks) from the 5000 fiducial draws of $\theta_j$, each of which is a point in the six-dimensional parameter space. Next, we selected the draws enclosed in the 95% highest density region, and set the lower (upper) confidence limit at each $z_{id}$ level to the minimum (maximum) value of Equation 5.2 among the selected $\theta_j$ values. Because the true six-dimensional parameter vector falls in the 95% highest fiducial density region (approximately) 95% of the time over repeated sampling, the entire true marginal expected score curve is then covered by the simultaneous CB in at least those 95% replications.
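A minimal sketch of this construction for one item follows, assuming ks::kde accepts the six-dimensional draws with its default plug-in bandwidth; theta_draws (the 5000 × 6 matrix of fiducial draws of the item's free parameters) and curve_at (a function mapping one draw to the expected score curve over a grid of $z_{id}$ values) are hypothetical names.

    library(ks)

    dens <- kde(x = theta_draws, eval.points = theta_draws)   # 6-D KDE evaluated at the draws
    inside <- dens$estimate >= quantile(dens$estimate, 0.05)  # draws in the 95% highest density region
    curves <- apply(theta_draws[inside, ], 1, curve_at)       # one column per retained draw
    lower <- apply(curves, 1, min)  # simultaneous lower limit at each grid point
    upper <- apply(curves, 1, max)  # simultaneous upper limit at each grid point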

Figure 5.6: Two-dimensional confidence regions for the primary and secondary loadings of anxiety items. The points are 500 draws selected via a further thinning interval of 10. The three contours shown on each panel are the 75%, 90%, and 95% highest fiducial density regions. The fiducial medians are highlighted by green crosses, and the numerical values are also shown in green text. The diagonal dashed line indicates that an item contributes evenly to the primary and secondary factors.

Figure 5.7: Two-dimensional confidence regions for the primary and secondary loadings of depression items. The points are 500 draws selected via a further thinning interval of 10. The three contours shown on each panel are the 75%, 90%, and 95% highest fiducial density regions. The fiducial medians are highlighted by green crosses, and the numerical values are also shown in green text. The diagonal dashed line indicates that an item contributes evenly to the primary and secondary factors.

Figure 5.8: Fiducial median and 95% confidence bands (CBs) of the marginal expected score curve for general emotional distress. Pointwise and simultaneous CBs are shown in different colors.

Figure 5.9: Fiducial medians and 95% confidence bands (CBs) of the marginal expected score curve for anger. Only items in the anger subscale are shown here. Pointwise and simultaneous CBs are shown in different colors.

Figure 5.10: Fiducial medians and 95% confidence bands (CBs) of the marginal expected score curve for anxiety. Only items in the anxiety subscale are shown here. Pointwise and simultaneous CBs are shown in different colors.

Figure 5.11: Fiducial medians and 95% confidence bands (CBs) of the marginal expected score curve for depression. Only items in the depression subscale are shown here. Pointwise and simultaneous CBs are shown in different colors.

Finally, we study the impact of sampling error in reliability analysis. Various methods have been proposed to quantify the reliability of a scale in the context of item response theory (IRT). A popular method is to compute the Fisher information matrix with respect to the

Figure 5.11: Fiducial medians and 95% confidence bands (CBs) of the marginal expected score curve for depression. Only items in the depression subscale are shown here. Pointwise and simultaneous CBs are shown in different colors. Finally, we study the impact of sampling error in reliability analysis. Various methods have been proposed to quantify the reliability of a scale in the context of item response theory (IRT). A popular method is to compute the Fisher information matrix with respect to the

latent variables $z_i$ while treating the item parameters $\theta$ as known; for easy reference, we denote it by $\mathcal{J}(z_i, \theta)$. Also let $f(\theta, \mathbf{y}_i \mid z_i) = \prod_{j=1}^{22} f_j(\theta_j, y_{ij} \mid z_i)$ be the conditional likelihood of an individual response pattern $\mathbf{y}_i$. It can be verified by direct calculation that

$$\mathcal{J}(z_i, \theta) = -E_\theta\!\left[\frac{\partial^2}{\partial z_i\, \partial z_i^\top} \log f(\theta, \mathbf{y}_i \mid z_i)\right] = \sum_{j=1}^{22} \sum_{k=0}^{4} \frac{1}{f_j(\theta_j, k \mid z_i)} \left[\frac{e^{\tau_{jk}(\theta_j, z_i)}}{[1 + e^{\tau_{jk}(\theta_j, z_i)}]^2} - \frac{e^{\tau_{j,k+1}(\theta_j, z_i)}}{[1 + e^{\tau_{j,k+1}(\theta_j, z_i)}]^2}\right]^2 \beta_j \beta_j^\top. \tag{5.3}$$
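Equation 5.3 can be coded directly; the following is a minimal sketch of one item's contribution, assuming the cumulative parameterization $\tau_{jk} = \alpha_{jk} + \beta_j^\top z$ with the boundary conventions $\tau_{j0} = +\infty$ and $\tau_{j5} = -\infty$ (alpha, beta, and z are illustrative names for one item's intercepts, slopes, and a latent vector).

    item_information <- function(alpha, beta, z) {
      tau <- c(Inf, alpha + sum(beta * z), -Inf)  # cumulative logits for k = 0, ..., 5
      p_star <- plogis(tau)                       # P(Y >= k | z); 1 and 0 at the boundaries
      f <- p_star[-length(p_star)] - p_star[-1]   # category probabilities, k = 0, ..., 4
      psi <- dlogis(tau)                          # logistic densities e^tau / (1 + e^tau)^2
      sum((psi[-length(psi)] - psi[-1])^2 / f) * tcrossprod(beta)  # scalar times beta beta'
    }
    # Summing item_information() over the 22 items yields J(z_i, theta); its
    # inverse is the asymptotic covariance of the ML latent variable estimates.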

The inverse of $\mathcal{J}(z_i, \theta)$ gives the asymptotic covariance matrix for the ML estimates of the latent variables. Most often in practice, however, item parameters need to be calibrated, and thus the plug-in versions of the information/asymptotic covariance matrix evaluated at the point estimates are subject to sampling variability.

For the four-dimensional model being fitted here, the $4 \times 4$ asymptotic covariance matrix $\mathcal{J}(z_i, \theta)^{-1}$ is defined for each four-dimensional vector $z_i \in \mathbb{R}^4$. For each dimension $d$, we define the marginal standard error function as follows:

$$\upsilon_d(z_{id}, \theta) = \sqrt{\int \sigma_d^2(z_i, \theta)\, d\Phi(z_{i,-d})}, \tag{5.4}$$

in which $\sigma_d^2(z_i, \theta)$ is the $d$th diagonal element of $\mathcal{J}(z_i, \theta)^{-1}$, and the integral is taken with respect to the remaining three dimensions. $\upsilon_d(z_{id}, \theta)$ gauges the average precision of the scale at each level $z_{id}$ of a particular dimension $d$. Again, numerical integration is needed in the actual computation of Equation 5.4; because the previous dimension-reduction trick is no longer applicable, we reduce the number of quadrature points to 11 on each dimension. The fiducial median and 95% pointwise CB for the marginal standard error functions are

shown in Figure 5.12. In principle, simultaneous CBs can be obtained in a fashion similar to what we have done for expected score curves; however, the total number of item parameters involved in the calculation of Equations 5.3 and 5.4 is $22 \times 6 = 132$, and we are not aware of any existing software that is able to handle such a high-dimensional kernel density estimation problem. We observe in Figure 5.12 that the depression-specific latent variable is poorly measured at all levels; this is anticipated because the dimension is only effectively indicated by the “lonely/alone” pair. The general dimension is often more precisely assessed compared to the domain-specific dimensions, for the reason that all 22 items provide information about individual differences in emotional distress symptoms. Because all threshold parameters are skewed to the high end of the latent variable scale, the measurement error at negative $z_{id}$ levels is high.
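A hedged sketch of the Equation 5.4 computation follows; info_at is a hypothetical function returning the $4 \times 4$ matrix $\mathcal{J}(z, \theta)$ at a latent vector $z$ (e.g., by summing the item_information() sketch above over the 22 items).

    marginal_se <- function(d, info_at, grid = seq(-3, 3, length.out = 11)) {
      other <- setdiff(1:4, d)
      nodes <- as.matrix(expand.grid(rep(list(grid), 3)))  # 11^3 points on z_{i,-d}
      w <- apply(nodes, 1, function(z) prod(dnorm(z)))
      w <- w / sum(w)                                      # renormalized N(0,1) grid weights
      sapply(grid, function(zd) {
        sig2 <- apply(nodes, 1, function(zo) {
          z <- numeric(4); z[other] <- zo; z[d] <- zd
          solve(info_at(z))[d, d]                          # dth diagonal element of J^{-1}
        })
        sqrt(sum(w * sig2))                                # Equation 5.4
      })
    }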

Figure 5.12: The fiducial median and 95% confidence bands (CBs) for marginal standard error curves for the four dimensions. Pointwise CBs are shown in colored dashed lines. Note that the lower-right panel for depression has a different y-axis from the rest.

5.4 Summary

This analysis of the PROMIS emotional distress data shows that GFI offers a powerful and omnibus toolkit to address various inferential problems based on the GRM. Making use of the Monte Carlo sample generated by the proposed Gibbs sampling algorithm, we are able to check the compatibility of the model with the data, construct CIs for implicitly defined functions of model parameters, and draw CBs for functional transformations. Some of the analyses are novel: for example, calculating simultaneous CBs for item expected score curves (each of which depends on six item parameters) and CBs for the marginal standard error functions of the test (which depend on all 132 item parameters). The same level of flexibility can be achieved by adopting the full Bayesian framework, but at the cost of prior specification. For the PROMIS emotional distress domains, no a priori information about model parameters is available, making it difficult to specify a reasonable subjective prior. Moreover, the calibration sample is not large enough for the influence of the prior to be negligible. From our experience in the reported simulation study (Chapter 4), the small-sample performance of non-informative Bayesian methods is inferior to that of GFI.

There are several limitations in the presented analyses. First, we did not correct for the false discovery rate (FDR; Benjamini and Hochberg, 1995; Thissen, Steinberg, and Kuang, 2002) when conducting multiple hypothesis tests, nor the false coverage-statement rate (FCR; Benjamini and Yekutieli, 2005) when reporting multiple CIs/CBs. Our goal has been to illustrate the use of GFI rather than to study the scale itself, and thus we do not feel pressed to do so. Furthermore, all the p-values and confidence bounds are computed empirically based on only 5000 draws, which is too few to control the Monte Carlo error after applying the adjustment. For example, if we consider 100 tests, then the smallest p-value will be compared to 0.05/100 = 0.0005; the estimated 0.0005 quantile of a sample of 5000 Monte Carlo draws is almost the minimum. In applied research, we do suggest correcting for FDR/FCR in order to justify the validity of the results. If necessary, a longer Markov chain should be obtained to reduce the Monte Carlo error in approximating

extreme quantiles of the fiducial distribution.

Second, the R package ks for non-parametric estimation of multidimensional density functions can only handle up to six-dimensional data, which barely suffices for the simultaneous CB calculation for marginal expected score curves in a bifactor model. Apart from the dimensionality limit, it calls R's optimizer nlm() when determining the smoothing bandwidth, which lessens its efficiency.² It would be necessary to write a separate program in a compiled language such as Fortran or C in order to make the proposed method applicable to larger-scale problems.

Finally, our goodness of fit tests based on the sum-score profile fail to detect the misfit of the unidimensional model, whereas the bivariate tests clearly indicate an underlying three-dimensional structure. Li and Cai (2012) remarked, based on their simulation results, that misspecified latent dimensionality does not damage the fit to the sum-score profile so much as it does the fit to bivariate margins. They revealed that an overall fit statistic summarizing the residuals in bivariate subtable cells has much higher power than another statistic summarizing the residuals at sum-score levels, whereas the comparative power is reversed when the generating model remains unidimensional but has a misspecified distribution. Both the simulations reported in Chapter 4 and the current data example seem to confirm their findings. We acknowledge that perhaps there is no single diagnostic that is able to identify all types of discrepancies between the true and fitted models, because different diagnostics combine information in different manners. Practically, we recommend choosing test statistics in line with a priori information about plausible misfitting mechanisms. As a topic for future research, fit diagnostics that are sensitive to alternative data generating models should be developed and incorporated into the FPC.

²There is an option to switch to optim(), but there is no speed gain in our experience.

CHAPTER 6: DISCUSSION AND CONCLUSION

We have derived generalized fiducial inference (GFI) for a family of multidimensional graded response models (GRMs). It has been rigorously established that GFI for the GRM yields asymptotically correct inference in the frequentist sense, equivalent to the likelihood-based and Bayesian methods that have been extensively studied in the literature (Chapter 2). Furthermore, we have shown by Monte Carlo simulations (Chapter 4) that GFI using the proposed Gibbs sampler (Chapter 3) is reliable for parameter recovery, scoring, and goodness of fit testing, even in situations when the sample size is too small and/or the data generating parameters are too extreme for the likelihood-based and Bayesian counterparts to behave well. The usefulness and flexibility of the proposed method have been illustrated with an empirical example (Chapter 5). We conclude that GFI is a preferred inferential framework for calibrating ordinal items in small samples, and that sampling variability, which is a more salient issue in small-sample data analyses, can be gauged easily with a Monte Carlo approximation of the fiducial distribution.

There are several remaining challenges to be addressed by future research. First, theoretical explanations of the superiority of GFI over the normal approximation in small samples should be sought. It has been conjectured that GFI has more favorable higher-order asymptotic properties. Some preliminary investigation of a higher-order expansion of the fiducial distribution function can be found in Pal Majumder and Hannig (2015), in which an ingenious “shrinkage argument” (Ghosh and Bickel, 1990) was exploited to derive conditions under which the fiducial probability coincides with the coverage probability up to a certain order. Their results might not be directly applicable to the current GRM application; however, a similar theoretical justification may establish more solid grounds for

the use of the current work in small samples.

Second, effort should be devoted to improving the efficiency of the sampling algorithm. We believe that a more efficient algorithm would facilitate the dissemination of the proposed method, which has been demonstrated to outperform existing ones under certain circumstances. The combinatorial and computational mathematics of the polygon cutting problem should be studied in order to alleviate the computational burden presented in the current implementation of the updating stage (Section 3.3 of Chapter 3). Alternative Monte Carlo methods such as sequential Monte Carlo (SMC; Doucet, De Freitas, and Gordon, 2001) have been successfully used for GFI in the context of other models (e.g., linear-normal mixed effects models; Cisewski and Hannig, 2012); as a non-iterative method, SMC may be more efficient than the Gibbs sampler.

Finally, the application of GFI to other popular item response models should be explored. In the current work, we only focus on a family of GRMs in which the covariance structure of the latent variables is known, e.g., exploratory models. In practice, however, simple structure models (also known as independent cluster models) and general two-tier models (i.e., replacing the general factor in a bifactor model by multiple factors with an unconstrained covariance structure) might be preferred over exploratory models for estimation efficiency and ease of interpretation. In addition, it is also of interest to derive GFI for unordered polytomous item response models such as Bock's nominal response model (Bock, 1972), and for models with categorical latent variables such as cognitive diagnostic models (Rupp, Templin, and Henson, 2010) and latent class models (Lazarsfeld and Henry, 1968). We have observed in our simulation study that GFI is more reliable than ML in the presence of model identification difficulties, which renders it a promising alternative for many psychometric models in which the ML estimator has been known to be ill-behaved.

APPENDIX A: BASIC PROPERTIES

A.1 Calculating the fiducial density

Proof of Lemma 1. We start with the following fundamental probability calculation:
$$\begin{aligned}
P\{V \le \theta,\ Q(\mathbf{y}, A^\star, Z^\star) \ne \emptyset\}
&= \sum_{\mathcal{P}} P\Big\{V \le \theta,\ \bigcap_{(I',k_{I'}) \in \mathcal{P}} E_{I',k_{I'}} \cap \bigcap_{(I',k_{I'}) \notin \mathcal{P}} E^c_{I',k_{I'}}\Big\} \\
&= \sum_{\mathcal{P}} \sum_{(I,k_I) \in \mathcal{P}} P\Big\{V_{I,k_I} \le \theta,\ \bigcap_{(I',k_{I'}) \in \mathcal{P}} E_{I',k_{I'}} \cap \bigcap_{(I',k_{I'}) \notin \mathcal{P}} E^c_{I',k_{I'}},\ V = V_{I,k_I}\Big\} \\
&= \sum_{I} \sum_{k_I} P\{V_{I,k_I} \le \theta,\ E_{I,k_I},\ V = V_{I,k_I}\} \\
&= \sum_{I} \sum_{k_I} P\{V_{I,k_I} \le \theta,\ E_{I,k_I}\}\, P\{V = V_{I,k_I} \mid E_{I,k_I}\}.
\end{aligned} \tag{A.1}$$

The last line of Equation A.1 holds because selection rules are assumed to be independent of the logistic and normal variates, and subsequently independent of $V_{I,k_I}$. Equation 2.9 shows that $P\{V = V_{I,k_I} \mid E_{I,k_I}\} = w_{I,k_I}(\mathbf{y})$ does not depend on $\theta$; therefore, the remaining task is to derive the expression of $P\{V_{I,k_I} \le \theta,\ E_{I,k_I}\}$ and then differentiate with respect to $\theta$.

Consider a single item first. When $V_{I_j,k_{I_j}} = \theta_j'$, $E_{I_j,k_{I_j}}$ means that for all $i \in I_j$, $\tau_{jk_{ij}}(\theta_j', z_i) = A^\star_{ij}$, and that $\theta_j'$ should not conflict with the half-spaces of the other observations: i.e., for all $i \in I_j^c$, $\tau_{j,y_{ij}+1}(\theta_j', z_i) \le A^\star_{ij} < \tau_{jy_{ij}}(\theta_j', z_i)$. Thus, conditional on $Z^\star = \mathbf{z}$, we have

$$P\big\{V_{I_j,k_{I_j}} \le \theta_j,\ E_{I_j,k_{I_j}} \mid Z^\star = \mathbf{z}\big\} = \int_{\theta_j' \le \theta_j} d_{I_j,k_{I_j}}(\theta_j', z_{I_j}) \prod_{i \in I_j} \frac{e^{\tau_{jk_{ij}}(\theta_j', z_i)}}{[1 + e^{\tau_{jk_{ij}}(\theta_j', z_i)}]^2} \prod_{i \in I_j^c} f_j(\theta_j', y_{ij} \mid z_i)\, d\theta_j', \tag{A.2}$$
in which the determinant and the first product are due to the change of variables from $(A^\star_{ij})_{i \in I_j}$ to $V_{I_j,k_{I_j}}$ (the standard logistic density is $\psi(x) = e^x/(1+e^x)^2$), and the second product corresponds to the logistic probabilities of those inequalities that the other observations should satisfy. Due to the conditional independence assumption,
$$\begin{aligned}
P\{V_{I,k_I} \le \theta,\ E_{I,k_I}\}
&= \int_{\mathbb{R}^{nr}} \prod_{j=1}^{m} P\big\{V_{I_j,k_{I_j}} \le \theta_j,\ E_{I_j,k_{I_j}} \mid Z^\star = \mathbf{z}\big\}\, d\Phi(\mathbf{z}) \\
&= \int_{\mathbb{R}^{nr}} \int_{\theta' \le \theta} \prod_{j=1}^{m} \Bigg\{ d_{I_j,k_{I_j}}(\theta_j', z_{I_j}) \prod_{i \in I_j} \frac{e^{\tau_{jk_{ij}}(\theta_j', z_i)}}{[1 + e^{\tau_{jk_{ij}}(\theta_j', z_i)}]^2} \prod_{i \in I_j^c} f_j(\theta_j', y_{ij} \mid z_i) \Bigg\}\, d\theta'\, d\Phi(\mathbf{z}).
\end{aligned} \tag{A.3}$$
Equation 2.8 is established by substituting Equation A.3 back into Equation A.1, switching the order of integration, and differentiating with respect to $\theta$.

A.2 The invariance property

Proof of Proposition 1. Let $a_n(\theta, \mathbf{y})$ be the right-hand side of Equation 2.8, and similarly let $\tilde{a}_n(\xi, \mathbf{y})$ be the unnormalized version of $\tilde{g}_n(\xi \mid \mathbf{y})$. It suffices to show
$$\int_B a_n(\theta, \mathbf{y})\, d\theta = \int_{q^{-1}(B)} \tilde{a}_n(\xi, \mathbf{y})\, d\xi \tag{A.4}$$
for every measurable set $B$ on the parameter space. Let $\tau_{I,k_I}(\theta, z_I) = (\tau_{jk_{ij}}(\theta_j, z_i))_{i \in I_j,\, j=1,\dots,m}$. Then the Jacobian determinant $|\det(\partial \tau_{I,k_I}(\theta, z_I)/\partial\theta)| = \prod_{j=1}^{m} d_{I_j,k_{I_j}}(\theta_j, z_{I_j})$. Also write
$$R_{I,k_I}(\theta, \mathbf{y}, \mathbf{z}) = \prod_{j=1}^{m} \Bigg\{ \prod_{i \in I_j} \frac{e^{\tau_{jk_{ij}}(\theta_j, z_i)}}{[1 + e^{\tau_{jk_{ij}}(\theta_j, z_i)}]^2} \prod_{i \in I_j^c} f_j(\theta_j, y_{ij} \mid z_i) \Bigg\}. \tag{A.5}$$

By our differentiability assumption and the multivariate chain rule,

$$\det\left(\frac{\partial \tau_{I,k_I}(q(\xi), z_I)}{\partial \xi}\right) = \det\left(\frac{\partial \tau_{I,k_I}(q(\xi), z_I)}{\partial q(\xi)}\right) \cdot \det\left(\frac{\partial q(\xi)}{\partial \xi}\right). \tag{A.6}$$

Hence, we have
$$\begin{aligned}
\int_B a_n(\theta, \mathbf{y})\, d\theta
&= \sum_I \sum_{k_I} w_{I,k_I}(\mathbf{y}) \int_{\mathbb{R}^{nr}} \int_B \left|\det\left(\frac{\partial \tau_{I,k_I}(\theta, z_I)}{\partial \theta}\right)\right| R_{I,k_I}(\theta, \mathbf{y}, \mathbf{z})\, d\theta\, d\Phi(\mathbf{z}) \\
&= \sum_I \sum_{k_I} w_{I,k_I}(\mathbf{y}) \int_{\mathbb{R}^{nr}} \int_{q^{-1}(B)} \left|\det\left(\frac{\partial \tau_{I,k_I}(q(\xi), z_I)}{\partial q(\xi)}\right)\right| R_{I,k_I}(q(\xi), \mathbf{y}, \mathbf{z}) \left|\det\left(\frac{\partial q(\xi)}{\partial \xi}\right)\right| d\xi\, d\Phi(\mathbf{z}) \\
&= \sum_I \sum_{k_I} w_{I,k_I}(\mathbf{y}) \int_{\mathbb{R}^{nr}} \int_{q^{-1}(B)} \left|\det\left(\frac{\partial \tau_{I,k_I}(q(\xi), z_I)}{\partial \xi}\right)\right| R_{I,k_I}(q(\xi), \mathbf{y}, \mathbf{z})\, d\xi\, d\Phi(\mathbf{z}) \\
&= \int_{q^{-1}(B)} \tilde{a}_n(\xi, \mathbf{y})\, d\xi.
\end{aligned} \tag{A.7}$$

APPENDIX B: A BERNSTEIN-VON MISES THEOREM

Proof of Theorem 1. We start by re-expressing the fiducial density in Equation 2.10. Recall that we require $w_{I,k_I}(\mathbf{y}) = w_{I',k_{I'}}(\mathbf{y}) = w_{y_I,k_I}(\mathbf{y})$ whenever $y_I = y_{I'}$ and $k_I = k_{I'}$; therefore, the key observation is that the (outer) sum over index sets $I$ in Equation 2.10 can be reduced to a finite sum over sub-sample response patterns $y_I$. Recall that $I = \bigcup_{j=1}^{m} I_j$, which has at least $\max_j q_j$ and at most $\sum_{j=1}^{m} q_j$ elements. Let $G_n = \binom{n}{\sum_{j=1}^{m} q_j}$ be the total number of size-$\sum_{j=1}^{m} q_j$ sub-samples. Also let $p_n(y_I) = G_n^{-1} \sum_I \mathbb{1}\{Y_I = y_I\}$. By the standard theory of U-statistics, $p_n(y_I) \xrightarrow{P_{\theta_0}} \pi_0(y_I)$, in which $\pi_0(y_I)$ is determined by the data-generating parameter values $\theta_0$, and $\pi_0(y_I) = 0$ if $|I| < \sum_{j=1}^{m} q_j$. Then the fiducial density (Equation 2.8) can be written as

$$g_n(\theta \mid \mathbf{y}) \propto b_n(\theta, \mathbf{y})\, f_n(\theta, \mathbf{y}). \tag{B.1}$$

In Equation B.1, $f_n(\theta, \mathbf{y})$ is the sample likelihood, and
$$b_n(\theta, \mathbf{y}) = G_n \sum_{y_I} p_n(y_I) \left[ \sum_{k_I} w_{y_I,k_I}(\mathbf{y})\, b_{y_I,k_I}(\theta) \right], \tag{B.2}$$

in which

$$b_{y_I,k_I}(\theta) = \frac{\displaystyle\int \prod_{j=1}^{m} d_{I_j,k_{I_j}}(\theta_j, z_{I_j}) \cdot \prod_{i \in I} \Bigg\{ \prod_{j \in J(i)} \frac{e^{\tau_{jk_{ij}}(\theta_j, z_i)}}{[1 + e^{\tau_{jk_{ij}}(\theta_j, z_i)}]^2} \prod_{j \notin J(i)} f_j(\theta_j, y_{ij} \mid z_i) \Bigg\}\, d\Phi(z_I)}{\displaystyle\int \prod_{i \in I} \prod_{j=1}^{m} f_j(\theta_j, y_{ij} \mid z_i)\, d\Phi(z_I)}. \tag{B.3}$$

Equation B.2 is a repetition of Equation 2.13. Also let
$$a_n(\theta, \mathbf{y}) = f_n(\theta, \mathbf{y})\, b_n(\theta, \mathbf{y}) \tag{B.4}$$
be the right-hand side (RHS) of Equation B.1.

Next, we consider the local parameter $h = \sqrt{n}(\theta - \theta_0)$. Some short-hand notation is

introduced for conciseness: Let $b_{n,h} = b_n(\theta_0 + h/\sqrt{n}, \mathbf{y})/G_n$, $a_{n,h} = a_n(\theta_0 + h/\sqrt{n}, \mathbf{y})$, and $f_{n,h} = f_n(\theta_0 + h/\sqrt{n}, \mathbf{y})$; also let $b_{n,0} = \sum_{y_I} \pi_0(y_I) \big[\sum_{k_I} w_{y_I,k_I}(\mathbf{y})\, b_{y_I,k_I}(\theta_0)\big]$, $a_{n,0} = a_n(\theta_0, \mathbf{y})$, and $f_{n,0} = f_n(\theta_0, \mathbf{y})$. Using this new notation, the fiducial density of the local parameter can be written as
$$\bar{g}_n(h \mid \mathbf{y}) \propto a_{n,h} = b_{n,h}\, f_{n,h}. \tag{B.5}$$

For each $y_I$ and $k_I$, $b_{y_I,k_I}(\theta)$ is continuous in $\theta$ (it is in fact differentiable). In addition, we know that $p_n(y_I) \to \pi_0(y_I)$ in $P_{\theta_0}$-probability, and that $w_{y_I,k_I}(\mathbf{y})$ is bounded. Consequently, $|b_{n,h} - b_{n,0}|$ converges to 0 in $P_{\theta_0}$-probability.

We also consider the Taylor series expansion of $\log f_{n,h}$ at the true parameter $\theta_0$:
$$\log \frac{f_{n,h}}{f_{n,0}} = h^\top S_n + \frac{1}{2n}\, h^\top \sum_{i=1}^{n} H(\theta_0, Y_i)\, h + R_{n,h}. \tag{B.6}$$

Here, some comments are made for each term of Equation B.6. a) The sequence $\{S_n\}$ is tight by the convergence result given by Equation 2.16; hence, for each $\varepsilon > 0$, there exists a compact set $K_\varepsilon \subset \mathbb{R}^q$ such that $P(K_\varepsilon) > 1 - \varepsilon$ and $S_n \in K_\varepsilon$ for all $n$. If we restrict the consideration to $K_\varepsilon$, then the first term of Equation B.6 is bounded for each $h$. b) By the (Uniform) Law of Large Numbers, the second term converges to $-\frac{1}{2} h^\top I_0 h$ in probability (the convergence is uniform for $h$ in compact sets). c) The remainder term has the following form:
$$R_{n,h} = \sum_{|t|=3} \sum_{i=1}^{n} \frac{f^{(t)}(\bar{\theta}, Y_i)}{t!} \left(\frac{h}{\sqrt{n}}\right)^t. \tag{B.7}$$

In Equation B.7, $t = (t_1, \dots, t_q)$ is a $q$-tuple of nonnegative integers serving as a multi-index: $|t| = \sum_{s=1}^{q} t_s$, $h^t = h_1^{t_1} \cdots h_q^{t_q}$, $t! = t_1! \cdots t_q!$, and $f^{(t)} = \partial^{|t|} f / (\partial^{t_1}\theta_1 \cdots \partial^{t_q}\theta_q)$, where $h_1, \dots, h_q$ and $\theta_1, \dots, \theta_q$ are the coordinates of $h$ and $\theta$, respectively; $\bar{\theta}$ lies between $\theta_0$ and $\theta_0 + h/\sqrt{n}$.

Now we proceed to the proof of Theorem 1, i.e., Equation 2.17. By an argument similar

to Ghosh and Ramamoorthi (2003), it suffices to show for each $\varepsilon > 0$ that
$$\int_{H_n} \left| \frac{a_{n,h}}{f_{n,0}} - b_{n,0}\, e^{h^\top S_n - \frac{1}{2} h^\top I_0 h} \right| dh \xrightarrow{P_{\theta_0}} 0. \tag{B.8}$$

To see this, let $D_n = \int_{H_n} a_{n,h}\, dh / f_{n,0}$. The left-hand side (LHS) of Equation 2.17 can be bounded by
$$D_n^{-1} \int_{H_n} \left| \frac{a_{n,h}}{f_{n,0}} - b_{n,0}\, e^{h^\top S_n - \frac{1}{2} h^\top I_0 h} \right| dh + \int_{H_n} \left| D_n^{-1} b_{n,0}\, e^{h^\top S_n - \frac{1}{2} h^\top I_0 h} - \phi_{I_0^{-1}S_n,\, I_0^{-1}}(h) \right| dh. \tag{B.9}$$

Notice that Equation B.8 implies $\big| D_n - b_{n,0} \int_{H_n} e^{h^\top S_n - \frac{1}{2} h^\top I_0 h}\, dh \big| \xrightarrow{P_{\theta_0}} 0$. We also know that
$$D\, e^{\frac{1}{2} S_n^\top I_0^{-1} S_n} \le \int_{H_n} e^{h^\top S_n - \frac{1}{2} h^\top I_0 h}\, dh \le D'\, e^{\frac{1}{2} S_n^\top I_0^{-1} S_n}, \tag{B.10}$$
for some suitable constants $D$ and $D'$, because the local parameter space $H_n$ satisfies $\Theta - \theta_0 \subset H_n \subset \mathbb{R}^q$. It follows that $D_n^{-1}$ is $O_p(1)$, and that the first integral in Equation B.9

converges to zero in probability. Further let $T_{1,n}$ be the integral in Equation B.10, and let $T_{2,n} = |D_n^{-1} b_{n,0} - T_{1,n}^{-1}|$; then, the second integral of Equation B.9 can be written as $T_{1,n} T_{2,n}$. The sequence $\{T_{1,n}\}$ is tight by Equation B.10, so for each $\eta > 0$ there exists an $L_\eta$ such that $P(T_{1,n} \le L_\eta) > 1 - \eta$ for all $n$. Moreover, $T_{2,n} \xrightarrow{P_{\theta_0}} 0$ by Equation B.8. Fixing $\varepsilon, \eta > 0$, we have
$$P(T_{1,n} T_{2,n} > \varepsilon) \le P(T_{1,n} T_{2,n} > \varepsilon,\ T_{1,n} \le L_\eta) + P(T_{1,n} > L_\eta) \le P(T_{2,n} > \varepsilon/L_\eta) + \eta, \tag{B.11}$$
which can be made less than $2\eta$ for large enough $n$. Therefore, $T_{1,n} T_{2,n} \xrightarrow{P_{\theta_0}} 0$. Because both integrals in Equation B.9 converge to 0 in probability, Equation 2.17 is established.

For the remaining part of the proof, we partition the domain of integration of Equation 2.17 into four regions (for $n$ large enough), and establish the desired convergence on each

part. The four regions are:

$A_{1,n} = \{h : \|h\| < B \log n\} \cap H_n$, for some large number $B > 0$;

$A_{2,n} = \{h : B \log n \le \|h\| < \delta\sqrt{n}\} \cap H_n$, for some small number $\delta > 0$;

$A_{3,n} = \{h : \delta\sqrt{n} \le \|h\| \le B'\sqrt{n}\} \cap H_n$, for another large number $B' > 0$;

$A_{4,n} = \{h : \|h\| > B'\sqrt{n}\} \cap H_n$.

In terms of the constants, we first choose $\delta$ and $B$ to ensure the convergence on $A_{2,n}$. The convergence on $A_{1,n}$ holds for any $B > 0$, so it also holds for the particular $B$ that we select. Then we consider region $A_{4,n}$ and select $B'$. Finally, we show that the integral converges for $h/\sqrt{n}$ in any compact set excluding 0, from which the convergence on $A_{3,n}$ follows.

Region $A_{2,n}$. Because the likelihood function is three times continuously differentiable with respect to $\theta$, and also because there are finitely many (i.e., $\prod_{j=1}^{m} K_j$) individual patterns of $y_i$, the remainder term (Equation B.7) of the Taylor expansion (Equation B.6) has the following bound for each $\delta > 0$ and $\|h\| \le \delta\sqrt{n}$:
$$|R_{n,h}| \le M(\delta) \frac{\|h\|^3}{n^{3/2}} \le M(\delta)\, \delta^3, \tag{B.12}$$
as a result of the multinomial theorem and the Cauchy-Schwarz inequality, in which $M(\delta)$ is a constant multiple of $\max_{|t|=3,\, y_i} \sup_{\|\theta - \theta_0\| \le \delta} |f^{(t)}(\theta, y_i)|$. Since $M(\delta) \downarrow$ as $\delta \downarrow 0$, Equation B.12 allows us to choose $\delta$ small enough such that $|R_{n,h}| < \frac{1}{4} h^\top I_0 h$ for all $h \in A_{2,n}$. Then

129 for such δ,

$$\begin{aligned}
\int_{A_{2,n}} \left| \frac{a_{n,h}}{f_{n,0}} - b_{n,0}\, e^{h^\top S_n - \frac{1}{2} h^\top I_0 h} \right| dh
&\le \int_{A_{2,n}} \frac{a_{n,h}}{f_{n,0}}\, dh + \int_{A_{2,n}} b_{n,0}\, e^{h^\top S_n - \frac{1}{2} h^\top I_0 h}\, dh \\
&\le \sup_{h \in A_{2,n}} b_{n,h} \int_{A_{2,n}} \frac{f_{n,h}}{f_{n,0}}\, dh + b_{n,0} \int_{A_{2,n}} e^{h^\top S_n - \frac{1}{2} h^\top I_0 h}\, dh \\
&\le \left( \sup_{h \in A_{2,n}} b_{n,h} + b_{n,0} \right) \int_{A_{2,n}} e^{h^\top S_n - \frac{1}{4} h^\top I_0 h}\, dh + o_p(1).
\end{aligned} \tag{B.13}$$

In the last line of Equation B.13, the parenthesized term is bounded due to the continuity of the function $b_{y_I,k_I}(\theta)$, the boundedness of $w_{y_I,k_I}(\mathbf{y})$, the boundedness of the set $A_{2,n}$, and our selection of $\delta$. The $o_p(1)$ term comes from the uniform convergence of the second term in the Taylor expansion (Equation B.6). Also notice that

$$\int_{A_{2,n}} e^{-\frac{1}{4} h^\top I_0 h}\, dh \le C e^{-C' B \log n} (\delta\sqrt{n} - B \log n)^q \le C'' n^{q/2 - C'B}, \tag{B.14}$$
where $C$, $C'$, and $C''$ are constants. By selecting $B$ large enough, Equation B.14 implies $\int_{A_{2,n}} e^{-\frac{1}{4} h^\top I_0 h}\, dh \to 0$. Finally, an argument using tightness similar to Equations B.10 and B.11 shows that the RHS of Equation B.13 converges to 0 in probability.

Region $A_{1,n}$. The convergence on $A_{1,n}$ can be established similarly. Fix an arbitrary $B > 0$. For the particular $\delta$ we have selected,
$$\sup_{h \in A_{1,n}} |R_{n,h}| \le M(\delta) \sup_{h \in A_{1,n}} \frac{\|h\|^3}{n^{3/2}} \le M(\delta)\, B^3 \frac{\log^3 n}{n^{3/2}} = o(1), \tag{B.15}$$
in which $M(\delta)$ is the same as in Equation B.12. Then

$$\begin{aligned}
\int_{A_{1,n}} \left| \frac{a_{n,h}}{f_{n,0}} - b_{n,0}\, e^{h^\top S_n - \frac{1}{2} h^\top I_0 h} \right| dh
&\le \int_{A_{1,n}} b_{n,h} \left| e^{R_{n,h}} - 1 \right| e^{h^\top S_n - \frac{1}{2} h^\top I_0 h}\, dh \\
&\quad + \int_{A_{1,n}} |b_{n,h} - b_{n,0}|\, e^{h^\top S_n - \frac{1}{2} h^\top I_0 h}\, dh + o_p(1).
\end{aligned} \tag{B.16}$$

In Equation B.16, the $o_p(1)$ term is again due to the uniform convergence of the second term in Equation B.6. Equation B.15 implies that $\sup_{h \in A_{1,n}} |e^{R_{n,h}} - 1| \to 0$; together with $|b_{n,h} - b_{n,0}| \xrightarrow{P_{\theta_0}} 0$ and the boundedness of $A_{1,n}$, both integrals on the RHS of Equation B.16 converge to 0 in probability (the tightness argument needs to be used again).

Region $A_{4,n}$. Assume for a moment that there exists a large number $B'$ such that
$$\sup_{\|\theta - \theta_0\| > B'} \min_{y_i} f(\theta, y_i) < f(\theta_0, y_i^\circ)^{2/f(\theta_0, y_i^\circ)}, \tag{B.17}$$

in which $y_i^\circ$ is the least plausible individual response pattern under $\theta_0$. Also write $p_n(y_i^\circ)$ for the observed proportion of $y_i^\circ$; $p_n(y_i^\circ) \xrightarrow{P_{\theta_0}} f(\theta_0, y_i^\circ)$. Then, on the region $A_{4,n}$ defined by such a $B'$,
$$P\left\{ \left( \frac{\min_{y_i} f(\theta, y_i)}{f(\theta_0, y_i^\circ)} \right)^{p_n(y_i^\circ)} < 1 \right\} \ge P\left\{ p_n(y_i^\circ) > \frac{f(\theta_0, y_i^\circ)}{2} \right\} \to 1. \tag{B.18}$$

Therefore, we have
$$\frac{f_{n,h}}{f_{n,0}} \le \left( \frac{\min_{y_i} f(\theta, y_i)}{f(\theta_0, y_i^\circ)} \right)^{n\, p_n(y_i^\circ)} \le \rho^n + o_p(1) \tag{B.19}$$
for some $0 < \rho < 1$. Also note that this likelihood ratio bound is not affected if finitely many

observations are removed from fn,h, which is the case after dividing by the denominator of

each summand of bn,h. As a result,

$$\begin{aligned}
\int_{A_{4,n}} \left| \frac{a_{n,h}}{f_{n,0}} - b_{n,0}\, e^{h^\top S_n - \frac{1}{2} h^\top I_0 h} \right| dh
&\le \int_{A_{4,n}} \frac{b_{n,h} f_{n,h}}{f_{n,0}}\, dh + \int_{A_{4,n}} b_{n,0}\, e^{h^\top S_n - \frac{1}{2} h^\top I_0 h}\, dh \\
&\le K \rho^n + \int_{A_{4,n}} b_{n,0}\, e^{h^\top S_n - \frac{1}{2} h^\top I_0 h}\, dh + o_p(1),
\end{aligned} \tag{B.20}$$
in which $K$ is a constant. Equation B.20 results from the fact that: a) the numerator of Equation B.3 is integrable with respect to the Lebesgue measure on the parameter space, which contributes to the constant $K$; b) $p_n(y_I) \xrightarrow{P_{\theta_0}} \pi_0(y_I)$, so the latter also contributes to $K$, while the difference of the two is merged into the $o_p(1)$ term. The second term on the RHS of Equation B.20 converges to zero by a similar tightness argument and the tail estimate of a multivariate normal distribution. Altogether, these show that the LHS of Equation B.20 converges to zero in probability.

Now we prove the result stated by Equation B.17; we denote the RHS of Equation B.17 by $\eta$.

First, consider the parameter subspace of $\alpha_{jk}$ and $\beta_j$ for each $j$ and $k$. Let $L_{jk} = \|(\alpha_{jk}\ \beta_j^\top)^\top\|$, and let $d_{jk} = (\alpha_{jk}\ \beta_j^\top)^\top / L_{jk} \in \mathbb{R}^{r+1}$ be a unit directional vector, in which the coordinates corresponding to fixed slopes are set to 0. Also introduce the partition $d_{jk} = (x_{jk}\ e_j^\top)^\top$ separating the direction of the intercept parameter, i.e., the first coordinate $x_{jk}$, from those of the slopes. Then, we write
$$\tau_{jk}(\theta_j, Z_i^\star) = \alpha_{jk} + \beta_j^\top Z_i^\star = L_{jk}(x_{jk} + e_j^\top Z_i^\star), \tag{B.21}$$
in which $x_{jk} + e_j^\top Z_i^\star \sim \mathcal{N}(x_{jk},\, 1 - x_{jk}^2)$. For fixed $d_{jk}$, define $H^\varepsilon_{jk}(y) = \{z_i \in \mathbb{R}^r : (-1)^{\mathbb{1}\{y \ge k\}}(x_{jk} + e_j^\top z_i) \ge \varepsilon\}$ for $\varepsilon \ge 0$.

r Sr+1 0 r+1 soon, is that R H (yij) for properly selected (yij) (recall that we assume m > r, ⊂ j=1 jk j=1 so there are sufficient items). Then, for any ε > 0, the following bound can be established for the likelihood of an individual response pattern in which the first r + 1 items have the

132 r+1 selected pattern (yij)j=1:

m Z Y f(θ, yi) = fj(θj, yij zi)dΦ(zi) r | R j=1 r+1 X Z fj(θj, yij zi)dΦ(zi) ≤ 0 | j=1 Hjk(yij ) r+1 Z r+1 X X 0 ε fj(θj, yij zi)dΦ(zi) + Φ Hjk(yij) Hjk(yij) ≤ ε | { \ } j=1 Hjk(yij ) j=1 r+1 r+1 X 1 X 0 ε + Φ H (yij) H (yij) (B.22) ≤ 1 + eεLjk { jk \ jk } j=1 j=1

In the last line of Equation B.22, each summand of the second term can be made smaller

than $\frac{\eta}{2(r+1)}$ by choosing a proper $\varepsilon$; this result can be strengthened to hold uniformly for all directions $d_{jk}$ on $\mathbb{R}^{r+1}$, as a consequence of Lemma 3. In addition, since there are only finitely many intercept parameters, we can choose a large enough $B'$ (i.e., $\theta$ is sufficiently distant from $\theta_0$) such that $\frac{1}{1 + e^{\varepsilon L_{jk}}} < \frac{\eta}{2(r+1)}$ for all $j$ and $k$. Consequently, for each $\theta$ satisfying $\|\theta - \theta_0\| > B'$, we are able to find an individual response pattern $y_i$ such that the corresponding value of Equation B.22 can be bounded by the desired number $\eta$, which establishes the result stated by Equation B.17.

The two lemmas required in the foregoing proof are presented next.

Lemma 2. Consider a sequence of affine hyperplanes $\{z \in \mathbb{R}^r : a_i^\top z = b_i\}_{i=1}^{k}$. Let half-space $H_i$ be either $a_i^\top z \ge b_i$ or $a_i^\top z \le b_i$. There exists some choice of $\{H_i\}_{i=1}^{k}$ such that $\mathbb{R}^r \subset \bigcup_{i=1}^{k} H_i$, if and only if the $a_i$'s are linearly dependent.

Proof. ($\Leftarrow$) Suppose the $a_i$'s are linearly dependent. There exists an $a_i$ that can be written as a non-trivial linear combination of the others. Without loss of generality, let $a_1$ be such a vector:
$$a_1 = \sum_{i=2}^{k} c_i a_i, \tag{B.23}$$
in which at least one $c_i$ is non-zero. If $\sum_{i=2}^{k} c_i b_i \ge b_1$, then for $i = 2, \dots, k$ set $H_i = \{z : a_i^\top z \ge b_i\}$ when $c_i \le 0$ and $H_i = \{z : a_i^\top z \le b_i\}$ when $c_i > 0$. It follows that
$$\bigcap_{i=2}^{k} H_i^c \subset \Big\{z : \sum_{i=2}^{k} c_i a_i^\top z > \sum_{i=2}^{k} c_i b_i\Big\} \subset \{z : a_1^\top z \ge b_1\}. \tag{B.24}$$

By letting $H_1$ be the RHS of Equation B.24, we have $\mathbb{R}^r \subset \bigcup_{i=1}^{k} H_i$. A similar argument can be used to establish the statement when $\sum_{i=2}^{k} c_i b_i < b_1$.

($\Rightarrow$) Suppose the $a_i$'s are linearly independent, which implies that the set of equations $\{a_i^\top z = b_i\}_{i=1}^{k}$ has at least one solution, denoted $z^0$. Consider the $k$-dimensional subspace spanned by the coordinate system $\{a_i\}_{i=1}^{k}$ with an origin at $z^0$. For each $i$, the half-space $H_i$ corresponds to either the positive or negative side of the vector $a_i$, depending on the direction of the inequality. No matter how we choose the $H_i$'s, there will be one out of the $2^k$ “orthants” corresponding to $\bigcap_{i=1}^{k} H_i^c$ left uncovered, which proves the “only if” part.

Lemma 3. Let $Z_x \sim \mathcal{N}(x, 1 - x^2)$ be a one-parameter family of normal random variables with $x \in [-1, 1]$. Given any $\eta \in (0, 1/2)$, there exists an $\varepsilon > 0$ such that $\sup_{x \in [-1,1]} P(|Z_x| \le \varepsilon) < \eta$.

Proof. By symmetry, $\sup_{x \in [0,1]} P(|Z_x| \le \varepsilon) = \sup_{x \in [-1,1]} P(|Z_x| \le \varepsilon)$, so we only need to consider non-negative $x$'s in the proof. Note that for all $\varepsilon \in [0, 1)$ and $x > \varepsilon$,
$$P(Z_x \le \varepsilon) = \Phi\left(\frac{\varepsilon - x}{\sqrt{1 - x^2}}\right) \downarrow 0, \tag{B.25}$$
as $x \uparrow 1$, due to the monotonicity of the functions involved. Now fix an $\eta \in (0, 1/2)$. Equation B.25 implies there exists an $x' \in (1/2, 1)$ such that $P(Z_{x'} \le 1/2) < \eta$. Then for all $x \in (x', 1]$ and $\varepsilon \in (0, 1/2]$, we have

$$P(|Z_x| \le \varepsilon) \le P(Z_x \le \varepsilon) \le P(Z_{x'} \le \varepsilon) < \eta. \tag{B.26}$$

For $x \in [0, x']$, the variance of $Z_x$ is bounded from below by $1 - x'^2$. We select $\varepsilon'$ such that $P(|Z_{x'} - x'| \le \varepsilon') < \eta$. Then by Anderson's inequality,
$$P(|Z_x| \le \varepsilon') \le P(|Z_x - x| \le \varepsilon') \le P(|Z_{x'} - x'| \le \varepsilon') < \eta. \tag{B.27}$$

The statement follows by setting $\varepsilon = \min\{1/2, \varepsilon'\}$.

Region $A_{3,n}$. Let $K_{-0}$ be any compact subset of $\Theta$ which is bounded away from $\theta_0$. By a well-known application of Jensen's inequality,
$$E_{\theta_0} \log \frac{f(\theta, Y_i)}{f(\theta_0, Y_i)} \le \log E_{\theta_0} \frac{f(\theta, Y_i)}{f(\theta_0, Y_i)} = 0. \tag{B.28}$$

In fact, the inequality in Equation B.28 is strict by the model identification assumption (ii) of Theorem 1. Because $K_{-0}$ is compact, there exists a positive number $\kappa$ such that
$$\sup_{\theta \in K_{-0}} E_{\theta_0} \log \frac{f(\theta, Y_i)}{f(\theta_0, Y_i)} < -\kappa, \tag{B.29}$$

by the continuity of the LHS function. Moreover, by the Uniform Law of Large Numbers,

$$\sup_{\theta \in K_{-0}} \left| \frac{1}{n} \sum_{i=1}^{n} \log \frac{f(\theta, Y_i)}{f(\theta_0, Y_i)} - E_{\theta_0} \log \frac{f(\theta, Y_i)}{f(\theta_0, Y_i)} \right| \xrightarrow{P_{\theta_0}} 0. \tag{B.30}$$

Therefore, $\sup_{\theta \in K_{-0}} \prod_{i=1}^{n} f(\theta, Y_i) \big/ \prod_{i=1}^{n} f(\theta_0, Y_i) \xrightarrow{P_{\theta_0}} 0$, which implies
$$\sup_{h \in A_{3,n}} \frac{f_{n,h}}{f_{n,0}} \xrightarrow{P_{\theta_0}} 0, \tag{B.31}$$
because $h \in A_{3,n}$ implies $\|\theta - \theta_0\| \in [\delta, B']$. It follows that

$$\begin{aligned}
\int_{A_{3,n}} \left| \frac{b_{n,h} f_{n,h}}{f_{n,0}} - b_{n,0}\, e^{h^\top S_n - \frac{1}{2} h^\top I_0 h} \right| dh
&\le \int_{A_{3,n}} \frac{b_{n,h} f_{n,h}}{f_{n,0}}\, dh + \int_{A_{3,n}} b_{n,0}\, e^{h^\top S_n - \frac{1}{2} h^\top I_0 h}\, dh \\
&\le \sup_{h \in A_{3,n}} \frac{f_{n,h}}{f_{n,0}} \int_{A_{3,n}} b_{n,h}\, dh + \int_{A_{3,n}} b_{n,0}\, e^{h^\top S_n - \frac{1}{2} h^\top I_0 h}\, dh.
\end{aligned} \tag{B.32}$$

Equation B.32 converges in probability to 0 due to the integrability of $b_{n,h}$, the tail estimates of a multivariate normal distribution, and the tightness of $S_n$. The proof is now complete.

APPENDIX C: NON-UNIQUENESS DUE TO SELECTION RULES

Proof of Theorem 2. Recall that $V = v(Q(\mathbf{y}, A^\star, Z^\star))$, i.e., an extremal point of the non-empty set inverse function $Q(\mathbf{y}, A^\star, Z^\star)$, has density $g_n(\theta \mid \mathbf{y})$.¹ Take $\delta > 0$. For each fixed $\mathbf{y}$, $\rho_K(\mathbf{y})$, defined by Equation 2.19, can be bounded by
$$\begin{aligned}
\rho_K(\mathbf{y}) &= P^\star\{\mathrm{diam}\, Q(\mathbf{y}, A^\star, Z^\star) > K/n \mid Q(\mathbf{y}, A^\star, Z^\star) \ne \emptyset\} \\
&= P^\star\{\mathrm{diam}\, Q(\mathbf{y}, A^\star, Z^\star) > K/n,\ \|V - \theta_0\| \le \delta \mid Q(\mathbf{y}, A^\star, Z^\star) \ne \emptyset\} \\
&\quad + P^\star\{\mathrm{diam}\, Q(\mathbf{y}, A^\star, Z^\star) > K/n,\ \|V - \theta_0\| > \delta \mid Q(\mathbf{y}, A^\star, Z^\star) \ne \emptyset\} \\
&\le P^\star\{\mathrm{diam}\, Q(\mathbf{y}, A^\star, Z^\star) > K/n \mid \|V - \theta_0\| \le \delta\} \\
&\quad + P^\star\{\|V - \theta_0\| > \delta \mid Q(\mathbf{y}, A^\star, Z^\star) \ne \emptyset\}.
\end{aligned} \tag{C.1}$$

Theorem 1 implies that for $\mathbf{Y}$ generated from $P_{\theta_0}$, $P^\star\{\|V - \theta_0\| > \delta \mid Q(\mathbf{Y}, A^\star, Z^\star) \ne \emptyset\}$, as a measurable function of $\mathbf{Y}$, converges to 0 in $P_{\theta_0}$-probability: i.e.,
$$P^\star\{\|V - \theta_0\| > \delta \mid Q(\mathbf{Y}, A^\star, Z^\star) \ne \emptyset\} = \int_{\|\theta - \theta_0\| > \delta} g_n(\theta \mid \mathbf{Y})\, d\theta \xrightarrow{P_{\theta_0}} 0. \tag{C.2}$$

¹The definitions of $V$ and $g_n(\theta \mid \mathbf{y})$ are conditional on $Q(\mathbf{y}, A^\star, Z^\star) \ne \emptyset$. In the sequel, when $V$ appears in the conditioning, the notation automatically implies conditioning on $Q(\mathbf{y}, A^\star, Z^\star) \ne \emptyset$ as well.

Hence, we focus on the first term in Equation C.1. This term can be further bounded by:

$$\begin{aligned}
P^\star\{\mathrm{diam}\, &Q(\mathbf{y}, A^\star, Z^\star) > K/n \mid \|V - \theta_0\| \le \delta\} \\
&= \sum_{I} \sum_{k_I} P^\star\{\mathrm{diam}\, Q(\mathbf{y}, A^\star, Z^\star) > K/n,\ V = V_{I,k_I} \mid \|V - \theta_0\| \le \delta\} \\
&= \sum_{I} \sum_{k_I} P^\star\{\mathrm{diam}\, Q(\mathbf{y}, A^\star, Z^\star) > K/n \mid \|V - \theta_0\| \le \delta,\ V = V_{I,k_I}\} \cdot P^\star\{V = V_{I,k_I} \mid \|V - \theta_0\| \le \delta\} \\
&= \sum_{y_I} \sum_{k_I} P^\star\{\mathrm{diam}\, Q(\mathbf{y}, A^\star, Z^\star) > K/n \mid \|V - \theta_0\| \le \delta,\ V = V_{I,k_I}\} \cdot \Bigg[ \sum_{I': y_{I'} = y_I} P^\star\{V = V_{I',k_{I'}} \mid \|V - \theta_0\| \le \delta\} \Bigg] \\
&\le \sum_{y_I} \sum_{k_I} P^\star\{\mathrm{diam}\, Q(\mathbf{y}, A^\star, Z^\star) > K/n \mid \|V_{I,k_I} - \theta_0\| \le \delta,\ V = V_{I,k_I}\} \\
&= \sum_{y_I} \sum_{k_I} \int P^\star\{\mathrm{diam}\, Q(\mathbf{y}, A^\star, Z^\star) > K/n \mid \|V_{I,k_I} - \theta_0\| \le \delta,\ V = V_{I,k_I},\ Z_I = z_I\}\, d\Phi(z_I).
\end{aligned} \tag{C.3}$$

The first sum over index sets $I$ in the third line of Equation C.3 can be collapsed into a finite sum over all patterns of $\mathbf y_I$ (i.e., the fourth equation), for the reason that sub-samples $I$ and $I'$ having the same response pattern $\mathbf y_I = \mathbf y_{I'}$ are exchangeable under our requirement of the selection rules. Note that the event being conditioned on in the integrand of the last line of Equation C.3 happens with a positive probability almost surely under the probability measure of $Z^\star$; to simplify notation, write $E^\delta_{\mathbf y_I, k_I}(z_I) = \{\|V_{I,k_I} - \theta_0\| \le \delta,\ V = V_{I,k_I},\ Z_I = z_I\}$ for that event. Because there are only finitely many combinations of $\mathbf y_I$ and $k_I$, it suffices to prove that for each $\varepsilon > 0$ and some $\delta > 0$,
\[
P_{\theta_0}\Bigl\{\exists\, K, N > 0 : \int P^\star\{\operatorname{diam} Q(\mathbf Y, A^\star, Z^\star) > K/n \mid E^\delta_{\mathbf y_I, k_I}(z_I)\}\, d\Phi(z_I) < \varepsilon,\ \forall\, n > N\Bigr\} \to 1. \tag{C.4}
\]

So fix $\mathbf y_I$, $k_I$, and $\delta$ for the rest of the proof. Also note that conditional on $E^\delta_{\mathbf y_I, k_I}(z_I)$, the remaining observations $i \notin I$ are independent.

To proceed, we sequentially project the set inverse $Q(\mathbf y, A^\star, Z^\star)$ onto $J = \sum_{j=1}^m K_j - m$ subspaces, each of which is spanned by an intercept parameter $\alpha_{jk}$ and the free slopes in $\beta_j$ (for the same $j$). For each projection, we find a bounding random variable for its diameter; then the sum of the constructed bounds across all projections serves as an upper bound, up to a constant multiplier depending on the dimension of the parameter space, for the diameter of the set inverse. We prove the result stated in Equation C.4 with the diameter of $Q(\mathbf y, A^\star, Z^\star)$ replaced by the constructed bound. In order to establish the desired property for the bounding variables, we allocate the remaining observations (i.e., those not in $I$) to the projections, and subsequently use the standard theory for order statistics of i.i.d. random variables. In particular, we rearrange those observations to fill a growing three-dimensional array indexed by a triplet of indices $s$, $j$, and $k$: the last dimension of the array, $k = 1, \dots, K_j - 1$, is filled first², then $j = 1, \dots, m$, and finally $s$; therefore, only the first dimension, indexed by $s = 1, \dots, \lfloor n/J \rfloor$, grows as the sample size increases. Notationally, elements corresponding to an observation in the array are denoted by a subscript $[sjk]$³.

Fix $V = V_{I,k_I} = \theta$ for now. For each item $j$, let $\tilde\beta_j$ be the collection of the $r_j$ free slopes. Also for each $k = 1, \dots, K_j - 1$, let $\theta_{jk} = (\alpha_{jk}, \tilde\beta_j^\top)^\top$. $\theta_{jk}$ is uniquely determined by a properly selected size-$(r_j + 1)$ subset of $I_j$, denoted $I_{jk}$, which can be determined from the fixed combination of $\mathbf y_I$ and $k_I$⁴. Now intersecting the half-space of a new observation $[sjk]$ in the three-dimensional array with those of observations $I_{jk}$, the resulting intersection on the subspace of $\theta_{jk}$ can be either bounded (i.e., a simplex) or unbounded. The following lemma provides necessary and sufficient conditions for the (un)bounded case:

²Each item may have different numbers of response categories, so it is in fact a "ragged" array.

³For example, if $I = \{1, \dots, \sum_{j=1}^m q_j\}$ is the first $\sum_{j=1}^m q_j$ observations in the sample, then $[sjk]$ corresponds to the observation $i = \sum_{j=1}^m q_j + (s - 1)J + \sum_{j'=1}^{j-1}(K_{j'} - 1) + k$.

⁴Only one half-space is selected for each $i \in I_{jk}$, which is not reflected in our notation for succinctness.

Lemma 4. Consider $p + 1$ half-spaces $H_i = \{x \in \mathbb R^p : n_i^\top x \le b_i\}$, $i = 1, \dots, p+1$, in which the $n_i$'s are considered fixed. Then, the following statements are equivalent:

(i) $\bigcap_{i=1}^{p+1} H_i$ is bounded for all choices of $b_i$'s, $i = 1, \dots, p+1$, such that the intersection is not empty;

(ii) $\bigcap_{i=1}^{p+1} H_i$ is a bounded simplex for some choice of $b_i$'s, $i = 1, \dots, p+1$;

(iii) For all nonzero $c \in \mathbb R^p$, there exists $i \in \{1, \dots, p+1\}$ such that $n_i^\top c > 0$;

(iv) There exists $i \in \{1, \dots, p+1\}$ such that the $n_j$'s, $j \ne i$, are linearly independent, and that $n_i = -\sum_{j \ne i} \gamma_j n_j$ with $\gamma_j > 0$ for all $j \ne i$.

Proof. (i) ⇒ (ii). We can always make the intersection non-empty by choosing $b_i > 0$ for all $i = 1, \dots, p+1$. In this case, $\bigcap_{i=1}^{p+1} H_i$ must contain some neighborhood of $\mathbf 0$. So (i) ⇒ (ii) is trivial.

(ii) ⇒ (iii). Fix $b_i$'s, $i = 1, \dots, p+1$, such that $\bigcap_{i=1}^{p+1} H_i$ is a bounded simplex. Take $x_0 \in \bigcap_{i=1}^{p+1} H_i$; i.e., $n_i^\top x_0 \le b_i$ for all $i = 1, \dots, p+1$. If there exists $c \in \mathbb R^p$ such that $c^\top n_i \le 0$ for all $i$, then $n_i^\top x_0 + \lambda n_i^\top c \le b_i$ for all $i$ and all $\lambda > 0$. This implies $x_0 + \lambda c \in \bigcap_{i=1}^{p+1} H_i$ for all $\lambda > 0$, which contradicts the boundedness.

(iii) ⇒ (i). In each direction $c$, choose $i$ such that $n_i^\top c > 0$. For every possible value of the corresponding $b_i$, there exists some $\lambda_0 > 0$ such that for all $\lambda > \lambda_0$, $n_i^\top (\lambda c) > b_i$, i.e., $\lambda c \notin H_i$. So $\bigcap_{i=1}^{p+1} H_i$ is always bounded.

(iii) ⇒ (iv). Let $C_i$ be the convex cone generated by all but the $i$th normal vectors. (iii) implies $-n_i^\top c < 0$ for all $c \in C_i^N = \{c : n_j^\top c \le 0 \text{ for all } j \ne i\}$, i.e., the normal cone (denoted by a superscript $N$) of $C_i$. Hence, $-n_i \in (C_i^N)^N = C_i$.

(iv) ⇒ (iii). For $c \in C_i^N$, (iv) implies $n_i^\top c > 0$. For $c \notin C_i^N$, there exists some $j \ne i$ such that $n_j^\top c > 0$.
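Lemma 4 is easy to probe numerically. The following R sketch checks condition (iv) for $p = 2$ and the single choice $i = 3$ (a complete check would loop over $i$); by the lemma, the condition is equivalent to boundedness of the intersection of the three half-planes:

## Does -n3 = gamma1 * n1 + gamma2 * n2 admit a solution with gamma1, gamma2 > 0?
bounded_by_lemma4 <- function(n1, n2, n3) {
  A <- cbind(n1, n2)
  if (abs(det(A)) < 1e-12) return(FALSE)   # (iv) requires linear independence
  all(solve(A, -n3) > 0)
}
bounded_by_lemma4(c(1, 0), c(0, 1), c(-1, -1))   # TRUE:  x <= b1, y <= b2, -x - y <= b3 is a triangle
bounded_by_lemma4(c(1, 0), c(0, 1), c(1, 1))     # FALSE: all normals lie in one half-plane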

Let $\tilde z_{ij}$ be the elements of $z_i$ associated with $\tilde\beta_j$. For each $i \in I_{jk}$, write $n_{ijk} = \omega_{ijk}(\delta_{ijk}\ \tilde z_{ij}^\top)^\top$ as the normal vector of the corresponding $(r_j + 1)$-dimensional half-space, in which $\omega_{ijk} = \pm 1$ and $\delta_{ijk} \in \{0, 1\}$ are determined by the item response $y_{ij}$. Similar notation is defined for observations in the array: Let $\tilde Z^\star_{[sjk]}$ be the elements of $Z^\star_{[sjk]}$ associated with $\tilde\beta_j$, and $N^\star_{[sjk]} = \omega_{[sjk]}(\delta_{[sjk]}\ \tilde Z^{\star\top}_{[sjk]})^\top$ be the corresponding (random) normal vector; the random variables $\omega_{[sjk]} = \pm 1$ and $\delta_{[sjk]} \in \{0, 1\}$ depend on this observation's response to item $j$, which is denoted $y_{[sjk]}$ for simplicity. For each $j$ and $k$, Lemma 4 implies that observation $[sjk]$ produces a bounded intersection if there exist positive real numbers $(\gamma_i)_{i \in I_{jk}}$ such that
\[
\omega_{[sjk]} \tilde Z^\star_{[sjk]} = -\sum_{i \in I_{jk}} \gamma_i \omega_{ijk} \tilde z_{ij}, \tag{C.5}
\]
and
\[
\omega_{[sjk]} \delta_{[sjk]} = -\sum_{i \in I_{jk}} \gamma_i \omega_{ijk} \delta_{ijk}. \tag{C.6}
\]
Conditioning on $V_j = V_{I_j, k_{I_j}} = \theta_j$, the intersection cannot be empty, which introduces a truncation to $A^\star_{[sjk]}$, i.e., the associated logistic variate for observation $[sjk]$ and item $j$:
\[
(-1)^{I\{y_{[sjk]} \ge k'\}} \bigl(A^\star_{[sjk]} - \alpha_{jk'} - \tilde\beta_j^\top \tilde Z^\star_{[sjk]}\bigr) \ge 0 \quad \text{for all } k' = 1, \dots, K_j - 1. \tag{C.7}
\]
Fix $j$ and $k$. When Equations C.5 and C.6 hold, let $\theta^i_{[sjk]} = (\alpha^i_{[sjk]}, \tilde\beta^{i\top}_{[sjk]})^\top$, $i \in I_{jk}$, be the vertex on the subspace of $\theta_{jk}$ determined by observations $I_{jk} \setminus \{i\}$ together with the new observation $[sjk]$, which is random due to its dependency on $A^\star_{[sjk]}$ and $Z^\star_{[sjk]}$. Also let $I^i_{jk} = I_{jk} \setminus \{i\}$ for some $i \in I_{jk}$, $\delta_{I^i_{jk}} = (\delta_{ijk})_{i \in I^i_{jk}}$, and treat $\tilde z_{I^i_{jk}} = (\tilde z_{ij})_{i \in I^i_{jk}}$ as an $r_j \times r_j$ matrix throughout this part of the derivation. A geometric illustration of these notations for $r = 1$ is shown in Figure C.1. Applying the formula for inverting a partitioned matrix, we have

 −1 ? > −1 −1  i i z˜Ii δI Z[sjk] z˜Ii z˜Ii δI −1 jk jk jk − jk jk −1 z˜ i +    I ? −1 ? −1  jk ˜ > i ˜ > i  δ[sjk] Z[sjk] z˜Ii δI δ[sjk] Z[sjk] z˜Ii δI  z˜Ii δIi  − jk jk − jk jk   jk jk      =   . (C.8) ˜ ? > ? > −1 Z δ[sjk]  Z[sjk] z˜Ii  [sjk]  − jk 1   ? −1 ? −1  ˜ > i ˜ > i δ[sjk] Z[sjk] z˜Ii δI δ[sjk] Z[sjk] z˜Ii δI − jk jk − jk jk

Figure C.1: Illustration of notation used in the proof of Theorem 2. Here, $r = 1$, and $j$ and $k$ are fixed. $I_{jk} = \{1, 2\}$, which determines the fixed vertex $\theta_{jk}$ (shown as a red dot). The line corresponding to the new observation $[sjk]$ intersects with those of observations 1 and 2, respectively, and produces two new vertices $\theta^1_{[sjk]}$ and $\theta^2_{[sjk]}$ (shown as red circles). The sum of $\|\theta^1_{[sjk]} - \theta_{jk}\|$ and $\|\theta^2_{[sjk]} - \theta_{jk}\|$ (highlighted in blue) gives an upper bound of the diameter of the plotted triangle.

It follows that the elements of $\theta^i_{[sjk]} - \theta_{jk}$ can be expressed as follows:

\[
\tilde\beta^i_{[sjk]} - \tilde\beta_j = \frac{\tilde z^{-1}_{I^i_{jk}}\, \delta_{I^i_{jk}}\, \bigl(A^\star_{[sjk]} - \tilde\beta_j^\top \tilde Z^\star_{[sjk]} - \alpha_{jk}\bigr)}{\tilde Z^{\star\top}_{[sjk]}\, \tilde z^{-1}_{I^i_{jk}}\, \delta_{I^i_{jk}} - \delta_{[sjk]}}, \tag{C.9}
\]
and
\[
\alpha^i_{[sjk]} - \alpha_{jk} = -\tilde z_{ij}^\top \bigl(\tilde\beta^i_{[sjk]} - \tilde\beta_j\bigr), \quad \text{for all } i \in I^i_{jk} \text{ such that } \delta_{ijk} = 1. \tag{C.10}
\]

Define
\[
\bar U^\star_{[sjk]} = \sum_{i \in I_{jk}} \frac{\bigl\|\tilde z^{-1}_{I^i_{jk}}\, \delta_{I^i_{jk}}\bigr\| \Bigl(1 + \sum_{i' \in I^i_{jk}} \|\tilde z_{i'j}\|^2\Bigr)^{1/2}}{\bigl|\tilde Z^{\star\top}_{[sjk]}\, \tilde z^{-1}_{I^i_{jk}}\, \delta_{I^i_{jk}} - \delta_{[sjk]}\bigr|}\, \bigl|A^\star_{[sjk]} - \tilde\beta_j^\top \tilde Z^\star_{[sjk]} - \alpha_{jk}\delta_{[sjk]}\bigr|. \tag{C.11}
\]
If both Equations C.5 and C.6 are satisfied, the random variable defined by Equation C.11 gives an upper bound for $\|\theta^i_{[sjk]} - \theta_{jk}\|$. Also define
\[
U^\star_{[sjk]} = \begin{cases} \bar U^\star_{[sjk]}, & \text{if Equations C.5 and C.6 hold;} \\ \infty, & \text{otherwise,} \end{cases} \tag{C.12}
\]
which is a random variable defined on the extended real line. Pooling across all observations in the array, we have

\[
\operatorname{diam} Q(\mathbf y, A^\star, Z^\star) \le C \sum_{j=1}^m \sum_{k=1}^{K_j - 1} \min_{t \le s} U^\star_{[tjk]}, \tag{C.13}
\]
in which $C$ is a constant determined by the dimension of the parameter space. It follows that
\[
\begin{aligned}
&\int P^\star\{\operatorname{diam} Q(\mathbf y, A^\star, Z^\star) > K/n \mid E^\delta_{\mathbf y_I, k_I}(z_I)\}\, d\Phi(z_I) \\
&\quad\le \int P^\star\Biggl\{\sum_{j=1}^m \sum_{k=1}^{K_j - 1} \min_{t \le s} U^\star_{[tjk]} > \frac{K}{Cn} \,\Bigg|\, E^\delta_{\mathbf y_I, k_I}(z_I)\Biggr\}\, d\Phi(z_I) \\
&\quad\le \sum_{j=1}^m \sum_{k=1}^{K_j - 1} \int P^\star\Bigl\{\min_{t \le s} U^\star_{[tjk]} > K'/n \,\Big|\, E^\delta_{\mathbf y_I, k_I}(z_I)\Bigr\}\, d\Phi(z_{I_{jk}}), 
\end{aligned} \tag{C.14}
\]
in which $K' = K/(CJ)$. Now fix $\varepsilon, \delta > 0$. It suffices to prove for each summand of Equation C.14:
\[
P_{\theta_0}\Bigl\{\exists\, K', N > 0 : \int P^\star\{\min_{t \le s} U^\star_{[tjk]} > K'/n \mid E^\delta_{\mathbf y_I, k_I}(z_I)\}\, d\Phi(z_I) < \varepsilon,\ \forall\, n > N\Bigr\} \to 1. \tag{C.15}
\]

For each plausible response $y_{[1jk]}$, we assume the existence of a growing sub-collection of $\{1, \dots, s\}$, $T_s(y_{[1jk]}) = \{t : t \le s,\ y_{[tjk]} = y_{[1jk]}\}$, satisfying
\[
|T_s(y_{[1jk]})|/n \to \rho(y_{[1jk]}), \text{ as } n \to \infty, \text{ for some } 0 < \rho(y_{[1jk]}) < 1. \tag{C.16}
\]
Because there are only finitely many $y_{[1jk]}$ values, we write $\rho_0 = \min_{y_{[1jk]}} \rho(y_{[1jk]})$. Within each sub-collection, $U^\star_{[tjk]}$, $t \in T_s(y_{[1jk]})$, are i.i.d. Let $\varphi_{jk}(u, \theta_{jk}, \mathbf y_{I_{jk}}, \tilde z_{I_{jk}}, y_{[1jk]})$ be the density of $U^\star_{[1jk]}$ conditional on $E^\delta_{\mathbf y_I, k_I}(z_I)$. We intend to find a set $B_{jk} \subset \mathbb R^{2r_j}$ such that $P\{\tilde Z^\star_{I_{jk}} \notin B_{jk}\} < \varepsilon/2$, and also a $\kappa > 0$ such that for every $\tilde z_{I_{jk}} \in B_{jk}$, there exists a particular $y_{[1jk]}$ for which
\[
\inf\bigl\{\varphi_{jk}(u, \theta_{jk}, \mathbf y_{I_{jk}}, \tilde z_{I_{jk}}, y_{[1jk]}) : 0 \le u \le \eta,\ \|\theta - \theta_0\| \le \delta\bigr\} \ge \kappa \tag{C.17}
\]
for some $\eta > 0$. Assume for a moment that Equations C.16 and C.17 hold. Then we can construct a sequence of i.i.d. non-negative random variables $\{X_n\}$ whose density function is constantly equal to $\kappa$ within $[0, \eta]$. By the delta method and the standard result for i.i.d. uniform order statistics, $n \min_{i \le n} X_i \xrightarrow{d} W/\kappa$, in which $W \sim \mathrm{Exp}(1)$. Fix $K'$ such that $P(W/\kappa > K') < \varepsilon/4$. By the Portmanteau Lemma, there exists an $n_1$ such that for all $n > n_1$, $P\{n \min_{i \le \lfloor \rho_0 n/2 \rfloor} X_i > K'\} \le P\{W/\kappa > K'\} + \varepsilon/4 \le \varepsilon/2$. Also take $n_2$ such that $K'/n < \eta$ for all $n > n_2$, and $n_3$ such that $|T_s(y_{[1jk]})|/n > \rho_0/2$ for all $y_{[1jk]}$ when $n > n_3$. Thus, for every $\tilde z_{I_{jk}} \in B_{jk}$, there exists a particular $y_{[1jk]}$ such that along the corresponding subsequence $T_s(y_{[1jk]})$:
\[
\begin{aligned}
P^\star\Bigl\{\min_{t \le s} U^\star_{[tjk]} > K'/n \,\Big|\, E^\delta_{\mathbf y_I, k_I}(z_I)\Bigr\}
&\le P^\star\Bigl\{\min_{t \in T_s(y_{[1jk]})} U^\star_{[tjk]} > K'/n \,\Big|\, E^\delta_{\mathbf y_I, k_I}(z_I)\Bigr\} \\
&\le P\Bigl\{\min_{i \le \lfloor \rho_0 n/2 \rfloor} X_i > K'/n\Bigr\} \\
&\le \varepsilon/2 
\end{aligned} \tag{C.18}
\]
for all $n > \max\{n_1, n_2, n_3\}$. It follows that for all these large $n$'s,
\[
\begin{aligned}
\int P^\star\Bigl\{\min_{t \le s} U^\star_{[tjk]} > K'/n \,\Big|\, E^\delta_{\mathbf y_I, k_I}(z_I)\Bigr\}\, d\Phi(z_{I_{jk}})
&\le \int_{B_{jk}} P^\star\Bigl\{\min_{t \le s} U^\star_{[tjk]} > K'/n \,\Big|\, E^\delta_{\mathbf y_I, k_I}(z_I)\Bigr\}\, d\Phi(z_{I_{jk}}) + \varepsilon/2 \\
&\le \varepsilon. 
\end{aligned} \tag{C.19}
\]

When $\mathbf Y$ is considered random, the probability that Equation C.16 holds for all plausible $y_{[1jk]}$ goes to 1, because the data-generating parameter value $\theta_0$ is assumed to be in the interior of the parameter space (and thus $\rho_0 > 0$ is determined solely by $\theta_0$). This implies the intended result (Equation C.15).
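The order-statistic step that drives Equations C.18 and C.19 is easy to visualize (a sketch in R; Uniform$(0, 1/\kappa)$ variables stand in for the $X_n$'s, whose density equals $\kappa$ near zero):

set.seed(3)
kappa <- 2
n <- 2000
## n * min of iid variables with density kappa near 0 is approximately Exp(1)/kappa
sims <- replicate(4000, n * min(runif(n, 0, 1 / kappa)))
Kp <- 3                      # plays the role of K'
mean(sims > Kp)              # simulated tail probability
exp(-kappa * Kp)             # limiting value P(W / kappa > K'), W ~ Exp(1)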

Let $\bar\varphi_{jk}(u, \theta_{jk}, \mathbf y_{I_{jk}}, \tilde z_{I_{jk}}, y_{[sjk]})$ be the density of $\bar U^\star_{[sjk]}$ conditional on $E^\delta_{\mathbf y_I, k_I}(z_I)$, and
\[
C_{jk}(\mathbf y_{I_{jk}}, \tilde z_{I_{jk}}, y_{[sjk]}) = \Bigl\{ \tilde Z^\star_{[sjk]} = -\sum_{i \in I_{jk}} \gamma_i \omega_{ijk} \tilde z_{ij},\ \delta_{[sjk]} = -\sum_{i \in I_{jk}} \gamma_i \omega_{ijk} \delta_{ijk},\ \gamma_i > 0 \text{ for all } i \in I_{jk} \Bigr\}. \tag{C.20}
\]
Then $\varphi_{jk}(u, \theta_{jk}, \mathbf y_{I_{jk}}, \tilde z_{I_{jk}}, y_{[sjk]}) = \bar\varphi_{jk}(u, \theta_{jk}, \mathbf y_{I_{jk}}, \tilde z_{I_{jk}}, y_{[sjk]})\, P\{C_{jk}(\mathbf y_{I_{jk}}, \tilde z_{I_{jk}}, y_{[sjk]}) \mid E^\delta_{\mathbf y_I, k_I}(z_I)\}$. Next, we find lower bounds for the two parts on the RHS, respectively.

First, fix a $y_{[sjk]}$ ensuring Equations C.5 and C.6 hold. For easy reference, let
\[
\sigma_{jk}(\mathbf y_{I_{jk}}, \tilde z_{I_{jk}}, \tilde Z^\star_{[sjk]}, y_{[sjk]}) = \sum_{i \in I_{jk}} \frac{\bigl\|\tilde z^{-1}_{I^i_{jk}}\, \delta_{I^i_{jk}}\bigr\| \Bigl(1 + \sum_{i' \in I^i_{jk}} \|\tilde z_{i'j}\|^2\Bigr)^{1/2}}{\bigl|\tilde Z^{\star\top}_{[sjk]}\, \tilde z^{-1}_{I^i_{jk}}\, \delta_{I^i_{jk}} - \delta_{[sjk]}\bigr|} \tag{C.21}
\]
and
\[
\mu_{jk}(\theta_{jk}, \tilde Z^\star_{[sjk]}, y_{[sjk]}) = \tilde\beta_j^\top \tilde Z^\star_{[sjk]} + \alpha_{jk}\delta_{[sjk]}. \tag{C.22}
\]
Then we rewrite Equation C.11 as $\bar U^\star_{[sjk]} = \sigma_{jk}(\mathbf y_{I_{jk}}, \tilde z_{I_{jk}}, \tilde Z^\star_{[sjk]}, y_{[sjk]})\, \bigl|A^\star_{[sjk]} - \mu_{jk}(\theta_{jk}, \tilde Z^\star_{[sjk]}, y_{[sjk]})\bigr|$, whose density function is
\[
\begin{aligned}
\bar\varphi_{jk}(u, \theta_{jk}, \mathbf y_{I_{jk}}, \tilde z_{I_{jk}}, y_{[sjk]})
&= \int \frac{\bar\psi\bigl(\mu_{jk}(\theta_{jk}, \tilde z_{[sjk]}, y_{[sjk]}) + u/\sigma_{jk}(\mathbf y_{I_{jk}}, \tilde z_{I_{jk}}, \tilde z_{[sjk]}, y_{[sjk]})\bigr)}{\sigma_{jk}(\mathbf y_{I_{jk}}, \tilde z_{I_{jk}}, \tilde z_{[sjk]}, y_{[sjk]})}\, d\Phi(\tilde z_{[sjk]}) \\
&\quad + \int \frac{\bar\psi\bigl(\mu_{jk}(\theta_{jk}, \tilde z_{[sjk]}, y_{[sjk]}) - u/\sigma_{jk}(\mathbf y_{I_{jk}}, \tilde z_{I_{jk}}, \tilde z_{[sjk]}, y_{[sjk]})\bigr)}{\sigma_{jk}(\mathbf y_{I_{jk}}, \tilde z_{I_{jk}}, \tilde z_{[sjk]}, y_{[sjk]})}\, d\Phi(\tilde z_{[sjk]}), 
\end{aligned} \tag{C.23}
\]
in which $\bar\psi(\cdot)$ is the standard logistic density conditional on Equation C.7. By the theory of multivariate normal random variables, we can find

\[
\begin{aligned}
B^1_{jk} = \bigl\{\tilde z_{I_{jk}} \in \mathbb R^{2r_j} : \ & \lambda \le \|\tilde z_{ij}\| \le L \text{ for all } i \in I_{jk}; \\
& \lambda' \le x^\top \tilde z_{I_{jk}}^\top \tilde z_{I_{jk}} x \le L' \text{ for all } x \in \mathbb R^{r_j},\ \|x\| = 1 \bigr\} 
\end{aligned} \tag{C.24}
\]
with properly defined $\lambda$, $\lambda'$, $L$, and $L'$ such that $P^\star\{\tilde Z^\star_{I_{jk}} \in B^1_{jk}\} > 1 - \varepsilon/4$. Also for fixed $D' > \delta' > 0$ and $D > 0$, define
\[
G_{jk}(\tilde z_{I_{jk}}) = \bigl\{\tilde z_{[sjk]} \in \mathbb R^{r_j} : \delta' \le \tilde z_{[sjk]}^\top\, \tilde z^{-1}_{I^i_{jk}}\, \delta_{I^i_{jk}} - \delta_{[sjk]} \le D' \text{ for all } i \in I_{jk},\ \|\tilde z_{[sjk]}\| \le D \bigr\}. \tag{C.25}
\]
Note that $\tilde Z^{\star\top}_{[sjk]}\, \tilde z^{-1}_{I^i_{jk}}\, \delta_{I^i_{jk}} - \delta_{[sjk]} \sim \mathcal N\bigl(-\delta_{[sjk]},\ \delta_{I^i_{jk}}^\top \tilde z^{-\top}_{I^i_{jk}} \tilde z^{-1}_{I^i_{jk}} \delta_{I^i_{jk}}\bigr)$, in which the variance is uniformly bounded from above and below for all $\tilde z_{I_{jk}} \in B^1_{jk}$. It follows that
\[
\inf_{\tilde z_{I_{jk}} \in B^1_{jk}} P^\star\{G_{jk}(\tilde z_{I_{jk}})\} > 0. \tag{C.26}
\]

Thus, by restricting the integrals on the RHS of Equation C.23 to $G_{jk}(\tilde z_{I_{jk}})$, we are able to obtain a uniform lower bound of $\bar\varphi_{jk}(u, \theta_{jk}, \mathbf y_{I_{jk}}, \tilde z_{I_{jk}}, y_{[sjk]})$ for all $\tilde z_{I_{jk}} \in B^1_{jk}$.

Our final task is to find $B^2_{jk} \subset \mathbb R^{2r_j}$ such that $P\{\tilde Z^\star_{I_{jk}} \in B^2_{jk}\} > 1 - \varepsilon/4$, and that $P\{C_{jk}(\mathbf y_{I_{jk}}, \tilde z_{I_{jk}}, y_{[1jk]}) \mid E^\delta_{\mathbf y_I, k_I}(z_I)\}$ has a uniform lower bound for all $\tilde z_{I_{jk}} \in B^2_{jk}$. Here, we only prove the statement for $r = 1$, and we conjecture that an extended argument can be established for $r > 1$.

When $r = 1$, $|I_{jk}| = 2$; without loss of generality, let $I_{jk}$ be the first two observations. We fix $j$ and $k$, and for simplicity denote the two normal vectors corresponding to the first two observations by $n_1 = \omega_1(\delta_1\ z_1)^\top$ and $n_2 = \omega_2(\delta_2\ z_2)^\top$, in which $\omega_1, \omega_2 = \pm 1$ and $\delta_1, \delta_2 \in \{0, 1\}$. We now discuss cases for different combinations of $\omega_1$, $\omega_2$, $\delta_1$, and $\delta_2$ values, and establish in each case that the joint probability of $Z^\star_{[sjk]} = -\gamma_1\omega_1 z_1 - \gamma_2\omega_2 z_2$ and $\delta_{[sjk]} = -\gamma_1\omega_1\delta_1 - \gamma_2\omega_2\delta_2$, $\gamma_1, \gamma_2 > 0$, is uniformly bounded from below for $(z_1\ z_2)^\top \in B^2_{jk} = \{(z_1\ z_2)^\top : |z_1 - z_2| \ge \lambda',\ |z_1| \le L,\ |z_2| \le L\}$, for every $\lambda', L > 0$.

Case 1: $\omega_1 = 1$, $\omega_2 = 1$, $\delta_1 = 1$, and $\delta_2 = 1$. Set $\omega_{[sjk]} = -1$ and $\delta_{[sjk]} = 1$, which happens with positive probability provided the data-generating parameter values are in the interior of the parameter space. Then, $N^\star_{[sjk]} = -\gamma_1 n_1 - \gamma_2 n_2$ implies $\gamma_1 + \gamma_2 = 1$ and $Z^\star_{[sjk]} = -\gamma_1 z_1 - \gamma_2 z_2$; i.e., $Z^\star_{[sjk]}$ falls in the line segment between $-z_1$ and $-z_2$. For all $(z_1\ z_2)^\top \in B^2_{jk}$, $P\{\min\{-z_1, -z_2\} \le Z^\star_{[sjk]} \le \max\{-z_1, -z_2\}\} > \Phi(-L + \lambda') - \Phi(-L)$.

Case 2: $\omega_1 = 1$, $\omega_2 = -1$, $\delta_1 = 1$, and $\delta_2 = 1$. In this case, the constraints are $\omega_{[sjk]}\delta_{[sjk]} = -\gamma_1 + \gamma_2$ and $\omega_{[sjk]} Z^\star_{[sjk]} = -\gamma_1 z_1 + \gamma_2 z_2$. When $\omega_{[sjk]} = 1$ and $\delta_{[sjk]} = 1$, $\gamma_2 = 1 + \gamma_1$. It follows that $Z^\star_{[sjk]} = \gamma_1(z_2 - z_1) + z_2$, which is greater than $z_2$ when $z_2 > z_1$ and less than $z_2$ when $z_2 < z_1$. Then, both $P\{Z^\star_{[sjk]} < z_2\}$ and $P\{Z^\star_{[sjk]} > z_2\}$ are uniformly greater than $1 - \Phi(L)$ for all $(z_1\ z_2)^\top \in B^2_{jk}$. When $\omega_{[sjk]} = 1$ and $\delta_{[sjk]} = 0$, we have $\gamma_2 = \gamma_1$, and thus $Z^\star_{[sjk]} = \gamma_1(z_2 - z_1)$; given $(z_1\ z_2)^\top \in B^2_{jk}$, it only restricts the sign under a standard normal measure. Other combinations of $\omega_{[sjk]}$ and $\delta_{[sjk]}$ values can be dealt with in a similar fashion.

Case 3: $\omega_1 = 1$, $\omega_2 = 1$, $\delta_1 = 1$, and $\delta_2 = 0$. It requires $\omega_{[sjk]} = -1$, $\delta_{[sjk]} = 1$, and $\gamma_1 = 1$. So $Z^\star_{[sjk]} = z_1 + \gamma_2 z_2$, which is greater than $z_1$ when $z_2 > 0$, and smaller than $z_1$ when $z_2 < 0$. Similarly as before, both $P\{Z^\star_{[sjk]} < z_1\}$ and $P\{Z^\star_{[sjk]} > z_1\}$ are uniformly greater than $1 - \Phi(L)$ for all $(z_1\ z_2)^\top \in B^2_{jk}$.

All other combinations of $\omega_1$, $\omega_2$, $\delta_1$, and $\delta_2$ values are reflections of the three cases discussed above. Also note that $\delta_1$ and $\delta_2$ cannot both be zero; otherwise no vertex is determined from the two observations. Altogether we have shown that $P\{C_{jk}(\mathbf y_{I_{jk}}, \tilde z_{I_{jk}}) \mid E^\delta_{\mathbf y_I, k_I}(z_I)\}$ is uniformly bounded from below for $\tilde z_{I_{jk}} \in B^2_{jk}$. The proof is now complete for $r = 1$.
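The Case 1 bound also admits a direct Monte Carlo check (a sketch in R with arbitrary $L$ and $\lambda'$):

set.seed(4)
L <- 2; lambda_p <- 0.5
## P{ Z* falls between -z1 and -z2 } for a standard normal Z*
seg_prob <- function(z1, z2) {
  zs <- rnorm(1e5)
  mean(zs >= min(-z1, -z2) & zs <= max(-z1, -z2))
}
seg_prob(L, L - lambda_p)            # near-worst configuration attains the bound
pnorm(-L + lambda_p) - pnorm(-L)     # the uniform lower bound from Case 1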

APPENDIX D: PREDICTIVE INFERENCE

Proof of Proposition 2. Fix $\varepsilon > 0$. Using the result expressed in Equation 2.22, the dominated convergence theorem, and the fact that continuous functions are uniformly continuous on compacta, we can select $\delta > 0$ such that $\int \sup_{\|\theta - \theta_0\| \le \delta} |h(t, \theta) - h(t, \theta_0)|\, d\mu < \varepsilon$. Then
\[
\begin{aligned}
\int |h_n(t \mid \mathbf Y) - h(t, \theta_0)|\, d\mu
&\le \int \biggl(\int_\Theta |h(t, \theta) - h(t, \theta_0)|\, g_n(\theta \mid \mathbf Y)\, d\theta\biggr) d\mu \\
&= \int_\Theta \biggl(\int |h(t, \theta) - h(t, \theta_0)|\, d\mu\biggr) g_n(\theta \mid \mathbf Y)\, d\theta \\
&= \int_{\|\theta - \theta_0\| \le \delta} \biggl(\int |h(t, \theta) - h(t, \theta_0)|\, d\mu\biggr) g_n(\theta \mid \mathbf Y)\, d\theta \\
&\quad + \int_{\|\theta - \theta_0\| > \delta} \biggl(\int |h(t, \theta) - h(t, \theta_0)|\, d\mu\biggr) g_n(\theta \mid \mathbf Y)\, d\theta \\
&\le \varepsilon + 2 \int_{\|\theta - \theta_0\| > \delta} g_n(\theta \mid \mathbf Y)\, d\theta. 
\end{aligned} \tag{D.1}
\]
By Equation 2.21, $P_{\theta_0}\bigl\{\int_{\|\theta - \theta_0\| > \delta} g_n(\theta \mid \mathbf Y)\, d\theta \le \varepsilon\bigr\} \to 1$ as $n \to \infty$, which concludes the proof.
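A minimal illustration of Proposition 2 (a sketch in R; the normal kernel $h(t, \theta)$ and the stand-in fiducial draws are hypothetical, not quantities produced by the graded response model): the predictive density $h_n(t \mid \mathbf Y) = \int h(t, \theta)\, g_n(\theta \mid \mathbf Y)\, d\theta$ is simply the average of $h(t, \cdot)$ over fiducial draws, and it approaches $h(t, \theta_0)$ as the fiducial distribution concentrates at $\theta_0$.

set.seed(5)
## Stand-in fiducial sample concentrating near theta0 = 0.4 (e.g., MCMC output)
theta_draws <- rnorm(2000, mean = 0.4, sd = 0.05)
## Hypothetical predictive kernel h(t, theta): density of N(theta, 1) at t
h_n <- function(t) mean(dnorm(t, mean = theta_draws, sd = 1))
h_n(1.0)                        # fiducial predictive density at t = 1
dnorm(1.0, mean = 0.4, sd = 1)  # its large-sample limit h(t, theta0)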

APPENDIX E: FIDUCIAL PREDICTIVE CHECK

E.1 Asymptotic covariance with the sample score function

As discussed earlier, the probability of each individual response pattern $\mathbf y_i$ is given by $f(\theta, \mathbf y_i)$, i.e., Equation 1.2. Denote by $\mathbf f(\theta)$ the collection of all $\prod_{j=1}^m K_j$ response pattern probabilities, and by $\mathbf p$ the observed counterpart. Then $n\mathbf p \sim \mathrm{Multinomial}(n, \mathbf f(\theta))$. Also let $\mathbf D(\theta) = \mathrm{diag}(\mathbf f(\theta))$ be a diagonal matrix, and $\Delta(\theta) = \partial \mathbf f(\theta)/\partial \theta^\top$ be the Jacobian matrix. With these matrix notations, the sample score function evaluated at the data-generating parameters $\theta$, previously expressed in Equations 2.15 and 2.16, can be rewritten as
\[
\mathbf S_n = \sqrt n\, \Delta(\theta)^\top \mathbf D(\theta)^{-1} \mathbf p = \Delta(\theta)^\top \mathbf D(\theta)^{-1} \sqrt n\, [\mathbf p - \mathbf f(\theta)], \tag{E.1}
\]
in which the last equality follows from the facts that $\mathbf D(\theta)^{-1} \mathbf f(\theta) = \mathbf 1$ and $\mathbf 1^\top \Delta(\theta) = \mathbf 0^\top$ for any $\theta$. Also write the Fisher information matrix as
\[
\mathcal I(\theta) = \Delta(\theta)^\top \mathbf D(\theta)^{-1} \Delta(\theta). \tag{E.2}
\]
Test statistics of interest in the current work are all linear combinations of the observed proportions in $\mathbf p$; we express them generically as $T = \mathbf b^\top \mathbf p$, in which $\mathbf b$ is a constant vector, with population counterpart $\nu(\theta) = \mathbf b^\top \mathbf f(\theta)$. It follows that $\partial \nu(\theta)/\partial \theta^\top = \mathbf b^\top \Delta(\theta)$. Next, we verify that the asymptotic covariance between $\sqrt n\, T$ and $\mathbf S_n$ is given by $\Delta(\theta)^\top \mathbf b$. By the multivariate Central Limit Theorem,

\[
\sqrt n\, [\mathbf p - \mathbf f(\theta)] \xrightarrow{d} \mathcal N(\mathbf 0, \Gamma(\theta)) \tag{E.3}
\]
under the data-generating model, in which $\Gamma(\theta) = \mathbf D(\theta) - \mathbf f(\theta)\mathbf f(\theta)^\top$. It follows that
\[
\begin{pmatrix} \sqrt n\, [T - \nu(\theta)] \\ \mathbf S_n \end{pmatrix}
= \begin{pmatrix} \mathbf b^\top \\ \Delta(\theta)^\top \mathbf D(\theta)^{-1} \end{pmatrix} \sqrt n\, [\mathbf p - \mathbf f(\theta)]
\xrightarrow{d} \mathcal N\left(\mathbf 0,\ \begin{pmatrix} \mathbf b^\top \Gamma(\theta) \mathbf b & \mathbf b^\top \Gamma(\theta) \mathbf D(\theta)^{-1} \Delta(\theta) \\ \Delta(\theta)^\top \mathbf D(\theta)^{-1} \Gamma(\theta) \mathbf b & \Delta(\theta)^\top \mathbf D(\theta)^{-1} \Gamma(\theta) \mathbf D(\theta)^{-1} \Delta(\theta) \end{pmatrix}\right), \tag{E.4}
\]
in which the asymptotic covariance is $\Delta(\theta)^\top \mathbf D(\theta)^{-1} \Gamma(\theta) \mathbf b = \Delta(\theta)^\top \mathbf b = \partial \nu(\theta)/\partial \theta$.

The foregoing conclusion holds for centered statistics $T - \nu(\tilde\theta)$ as well, provided the estimator $\tilde\theta$ satisfies

\[
\sqrt n\, (\tilde\theta - \theta) = [\Delta(\theta)^\top \mathbf W \Delta(\theta)]^{-1} \Delta(\theta)^\top \mathbf W \sqrt n\, [\mathbf p - \mathbf f(\theta)] + o_p(1), \tag{E.5}
\]
in which $\mathbf W$ is some constant weight matrix. Equation E.5 can be established for a general class of weighted least squares estimators (see Maydeu-Olivares, 2006), including the ML estimator (with $\mathbf W = \mathbf D(\theta)^{-1}$). It follows that
\[
\begin{aligned}
\sqrt n\, [T - \nu(\tilde\theta)] &= \sqrt n\, [T - \nu(\theta)] - \sqrt n\, [\nu(\tilde\theta) - \nu(\theta)] \\
&= \sqrt n\, \mathbf b^\top \bigl\{\mathbf I - \Delta(\theta)[\Delta(\theta)^\top \mathbf W \Delta(\theta)]^{-1} \Delta(\theta)^\top \mathbf W\bigr\} [\mathbf p - \mathbf f(\theta)] + o_p(1). 
\end{aligned} \tag{E.6}
\]
The asymptotic covariance between $\sqrt n\, [T - \nu(\tilde\theta)]$ and $\mathbf S_n$ is then
\[
\Delta(\theta)^\top \mathbf D(\theta)^{-1} \Gamma(\theta) \bigl\{\mathbf I - \Delta(\theta)[\Delta(\theta)^\top \mathbf W \Delta(\theta)]^{-1} \Delta(\theta)^\top \mathbf W\bigr\}^\top \mathbf b = \Delta(\theta)^\top \mathbf b,
\]
obtained via a similar computation as before.
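The key identity $\Delta(\theta)^\top \mathbf D(\theta)^{-1} \Gamma(\theta) \mathbf b = \Delta(\theta)^\top \mathbf b$ used above is easily verified numerically (a sketch in R; an arbitrary four-pattern multinomial and a random Jacobian satisfying $\mathbf 1^\top \Delta(\theta) = \mathbf 0^\top$ stand in for the IRT quantities):

set.seed(6)
f <- c(0.1, 0.2, 0.3, 0.4)                # response-pattern probabilities
D <- diag(f)
Gamma <- D - tcrossprod(f)                # multinomial covariance, Equation E.3
Delta <- matrix(rnorm(8), 4, 2)
Delta <- sweep(Delta, 2, colMeans(Delta)) # force columns to sum to zero: 1' Delta = 0
b <- rnorm(4)
max(abs(t(Delta) %*% solve(D) %*% Gamma %*% b - t(Delta) %*% b))   # ~ 0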

E.2 Normal approximation of the likelihood

By the asymptotic normality of the observed response-pattern proportions (Equation E.3),
\[
\sqrt n\, \mathbf b^\top [\mathbf p - \mathbf f(\theta)] \xrightarrow{d} \mathcal N(0,\ \mathbf b^\top \Gamma(\theta) \mathbf b). \tag{E.7}
\]
The likelihood of $T = \mathbf b^\top \mathbf p$ can then be approximated by the density function of $\mathcal N(\mathbf b^\top \mathbf f(\theta),\ \mathbf b^\top \Gamma(\theta) \mathbf b / n)$.
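A quick simulation confirms the quality of the approximation (a sketch in R with an arbitrary multinomial standing in for the graded response model):

set.seed(7)
n <- 500
f <- c(0.1, 0.2, 0.3, 0.4)
b <- c(1, 0, 2, 1)
Gamma <- diag(f) - tcrossprod(f)
## Simulate T = b' p across multinomial samples and compare with the approximation
T_sims <- apply(rmultinom(10000, n, f) / n, 2, function(p) sum(b * p))
c(mean(T_sims), sum(b * f))                      # means agree
c(var(T_sims), drop(t(b) %*% Gamma %*% b) / n)   # variances agree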

REFERENCES

Aitchison, J. and Dunsmore, I. (1975). Statistical Prediction Analysis. Cambridge University Press.

Albert, J. H. (1992). Bayesian estimation of normal ogive item response curves using Gibbs sampling. Journal of Educational and Behavioral Statistics, 17(3):251–269.

Asparouhov, T. and Muthén, B. (2012). Comparison of computational methods for high dimensional item factor analysis. Unpublished manuscript retrieved from www.statmodel.com.

Bartlett, M. S. (1965). R. A. Fisher and the last fifty years of statistical methodology. Journal of the American Statistical Association, 60(310):395–409.

Bayarri, M. and Berger, J. O. (1999). Quantifying surprise in the data and model verification. Bayesian Statistics, 6:53–82.

Bayarri, M. and Berger, J. O. (2000). P-values for composite null models. Journal of the American Statistical Association, 95(452):1127–1142.

Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B (Methodological), 57(1):289–300.

Benjamini, Y. and Yekutieli, D. (2005). False discovery rate-adjusted multiple confidence intervals for selected parameters. Journal of the American Statistical Association, 100(469):71–81.

Bernaards, C. A. and Jennrich, R. I. (2005). Gradient projection algorithms and software for arbitrary rotation criteria in factor analysis. Educational and Psychological Measurement, 65:676–696.

Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee's ability. In Lord, F. M. and Novick, M. R., editors, Statistical theories of mental test scores, pages 395–479. Addison-Wesley, Reading, MA.

Bock, R. D. (1972). Estimating item parameters and latent ability when responses are scored in two or more nominal categories. Psychometrika, 37(1):29–51.

Bock, R. D. and Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46(4):443–459.

Bock, R. D. and Lieberman, M. (1970). Fitting a response model for n dichotomously scored items. Psychometrika, 35(2):179–197.

Bradlow, E. T., Wainer, H., and Wang, X. (1999). A Bayesian random effects model for testlets. Psychometrika, 64(2):153–168.

Browne, M. W. (2001). An overview of analytic rotation in exploratory factor analysis. Multivariate Behavioral Research, 36(1):111–150.

Cai, L. (2008). SEM of another flavour: Two new applications of the supplemented EM algorithm. British Journal of Mathematical and Statistical Psychology, 61(2):309–329.

Cai, L. (2010a). High-dimensional exploratory item factor analysis by a Metropolis-Hastings Robbins-Monro algorithm. Psychometrika, 75(1):33–57.

Cai, L. (2010b). Metropolis-Hastings Robbins-Monro algorithm for confirmatory item factor analysis. Journal of Educational and Behavioral Statistics, 35(3):307–335.

Cai, L. (2010c). A two-tier full-information item factor analysis model with applications. Psychometrika, 75(4):581–612.

Cai, L. and Hansen, M. (2013). Limited-information goodness-of-fit testing of hierarchical item factor models. British Journal of Mathematical and Statistical Psychology, 66(2):245–276.

Cai, L., Maydeu-Olivares, A., Coffman, D. L., and Thissen, D. (2006). Limited-information goodness-of-fit testing of item response theory models for sparse $2^p$ tables. British Journal of Mathematical and Statistical Psychology, 59(1):173–194.

Casella, G. and Berger, R. L. (2002). Statistical inference (2nd Ed.), volume 2. Duxbury, Pacific Grove, CA.

Cisewski, J. and Hannig, J. (2012). Generalized fiducial inference for normal linear mixed models. The Annals of Statistics, 40(4):2102–2127.

Cochran, W. G. (1952). The $\chi^2$ test of goodness of fit. The Annals of Mathematical Statistics, 23(3):315–345.

Crawford, C. B. and Ferguson, G. A. (1970). A general rotation criterion and its use in orthogonal rotation. Psychometrika, 35(3):321–332.

Curtis, S. M. (2010). BUGS code for item response theory. Journal of Statistical Software, 36(1):1–34.

Dempster, A. P. (1968). A generalization of Bayesian inference. Journal of the Royal Statistical Society. Series B (Methodological), 20(2):205–247.

Dempster, A. P. (2008). The Dempster-Shafer calculus for statisticians. International Journal of Approximate Reasoning, 48(2):365–377.

Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 39(1):1–38.

Doucet, A., De Freitas, N., and Gordon, N. (2001). An introduction to sequential Monte Carlo methods. Springer, New York.

Duong, T. (2014). ks: Kernel smoothing. R package version 1.9.3. http://CRAN.R-project.org/package=ks.

E, L., Hannig, J., and Iyer, H. K. (2009). Fiducial generalized confidence interval for median lethal dose (LD50). Unpublished manuscript.

Edwards, M. C. (2010). A Markov chain Monte Carlo approach to confirmatory item factor analysis. Psychometrika, 75(3):474–497.

Efron, B. (1998). R. A. Fisher in the 21st century. Statistical Science, pages 95–114.

Ferrando, P. J. and Lorenzo-Seva, U. (2001). Checking the appropriateness of item response theory models by predicting the distribution of observed scores: The program EO-Fit. Educational and Psychological Measurement, 61(5):895–902.

Fisher, R. (1955). Statistical methods and scientific induction. Journal of the Royal Statistical Society. Series B (Methodological), 17(1):69–78.

Fisher, R. A. (1922). On the mathematical foundations of theoretical statistics. Philosophical Transactions of the Royal Society, Series A, 222:309–368.

Fisher, R. A. (1930). Inverse probability. Proceedings of the Cambridge Philosophical Society, 26:528–535.

Fisher, R. A. (1933). The concepts of inverse probability and fiducial probability referring to unknown parameters. Proceedings of the Royal Society of London. Series A, 139(838):343–348.

Fisher, R. A. (1935). The fiducial argument in statistical inference. Annals of Eugenics, 6(4):391–398.

Fisher, R. A. (1945). The logical inversion of the notion of the random variable. Sankhy¯a: The Indian Journal of Statistics, 7(2):129–132.

Forero, C. G., Maydeu-Olivares, A., and Gallardo-Pujol, D. (2009). Factor analysis with ordinal indicators: A Monte Carlo study comparing DWLS and ULS estimation. Structural Equation Modeling, 16(4):625–641.

Fraser, D. A. S. (1968). The structure of inference. John Wiley & Sons, New York.

Geisser, S. (1993). Predictive inference: An introduction. Chapman and Hall, New York.

Gelman, A. et al. (2013). Two simple examples for understanding posterior p-values whose distributions are far from uniform. Electronic Journal of Statistics, 7:2595–2602.

Ghosh, J. and Bickel, P. J. (1990). A decomposition for the likelihood ratio statistic and the Bartlett correction: A Bayesian argument. Annals of Statistics, 18(3):1070–1090.

Gibbons, R. D. and Hedeker, D. R. (1992). Full-information item bi-factor analysis. Psychometrika, 57(3):423–436.

Glas, C. A. (1988). The derivation of some tests for the Rasch model from the multinomial distribution. Psychometrika, 53(4):525–546.

Gunsjö, A. (1994). Faktoranalys av ordinala variabler [Factor analysis of ordinal variables]. Studia Statistica Upsaliensia. Acta Universitatis Upsaliensis, Stockholm, Sweden.

Guttman, I. (1967). The use of the concept of a future observation in goodness-of-fit problems. Journal of the Royal Statistical Society. Series B (Methodological), 29(1):83–100.

Haberman, S. J. (1988). A stabilized Newton-Raphson algorithm for log-linear models for frequency tables derived by indirect observation. In Clogg, C. C., editor, Sociological methodology, pages 193–211. American Sociological Association, Washington, D.C.

Haberman, S. J. (2006). Adaptive quadrature for item response models. ETS Research Report Series, 2006(2):1–10.

Haberman, S. J. and Sinharay, S. (2013). Generalized residuals for general models for contingency tables with application to item response theory. Journal of the American Statistical Association, 108(504):1435–1444.

Hambleton, R. and Han, N. (2004). Assessing the fit of IRT models: Some approaches and graphical displays. Presentation at the Annual Meeting of the National Council on Measurement in Education, San Diego, CA, USA.

Hannig, J. (2009). On generalized fiducial inference. Statistica Sinica, 19(2):491–544.

Hannig, J. (2013). Generalized fiducial inference via discretization. Statistica Sinica, 23(2):489–514.

Hannig, J., Lai, R. C., and Lee, T. C. (2014). Computational issues of generalized fiducial inference. Computational Statistics & Data Analysis, 71:849–858.

Hannig, J. and Lee, T. C. (2009). Generalized fiducial inference for wavelet regression. Biometrika, 96(4):847–860.

Irwin, D. E., Stucky, B., Langer, M. M., Thissen, D., DeWitt, E. M., Lai, J.-S., Varni, J. W., Yeatts, K., and DeWalt, D. A. (2010). An item response analysis of the pediatric PROMIS anxiety and depressive symptoms scales. Quality of Life Research, 19(4):595–607.

Jennrich, R. I. (1973). Standard errors for obliquely rotated factor loadings. Psychometrika, 38(4):593–604.

Joe, H. and Maydeu-Olivares, A. (2010). A general family of limited information goodness-of-fit statistics for multinomial data. Psychometrika, 75(3):393–419.

Lawless, J. and Fredette, M. (2005). Frequentist prediction intervals and predictive distributions. Biometrika, 92(3):529–542.

Le Cam, L. and Yang, G. L. (2000). Asymptotics in Statistics: Some Basic Concepts. Springer Series in Statistics. Springer-Verlag, New York.

Levy, R., Mislevy, R. J., and Sinharay, S. (2009). Posterior predictive model checking for multidimensionality in item response theory. Applied Psychological Measurement, 33(7):519–537.

Li, Z. and Cai, L. (2012). Summed score based fit indices for testing latent variable distribution assumption in IRT. Presentation at the International Meeting of the Psychometric Society, Lincoln, NE, USA.

Liu, Y. and Hannig, J. (2014). Generalized fiducial inference for binary logistic item response models. Under review.

Liu, Y. and Maydeu-Olivares, A. (2014). Identifying the source of misfit in item response theory models. Multivariate Behavioral Research, 49(4):354–371.

Liu, Y. and Thissen, D. (2014). Comparing score tests and other local dependence diagnostics for the graded response model. British Journal of Mathematical and Statistical Psychology, 67(3):496–513.

Lord, F. M. and Wingersky, M. S. (1984). Comparison of IRT true-score and equipercentile observed-score “equatings”. Applied Psychological Measurement, 8(4):453–461.

Martin, R. and Liu, C. (2013). Inferential models: A framework for prior-free posterior probabilistic inference. Journal of the American Statistical Association, 108(501):301–313.

Martin, R., Zhang, J., and Liu, C. (2010). Dempster–Shafer theory and statistical inference with weak beliefs. Statistical Science, 25(1):72–87.

Maydeu-Olivares, A. (2006). Limited information estimation and testing of discretized multivariate normal structural models. Psychometrika, 71(1):57–77.

Maydeu-Olivares, A. and Joe, H. (2005). Limited- and full-information estimation and goodness-of-fit testing in $2^n$ contingency tables: A unified framework. Journal of the American Statistical Association, 100(471):1009–1020.

Maydeu-Olivares, A. and Joe, H. (2006). Limited information goodness-of-fit testing in multidimensional contingency tables. Psychometrika, 71(4):713–732.

McDonald, R. P. (1981). The dimensionality of tests and items. British Journal of Mathematical and Statistical Psychology, 34(1):100–117.

Meng, X.-L. and Schilling, S. (1996). Fitting full-information item factor models and an empirical investigation of bridge sampling. Journal of the American Statistical Association, 91(435):1254–1267.

Muthén, B. (1978). Contributions to factor analysis of dichotomous variables. Psychometrika, 43(4):551–560.

Muthén, L. K. and Muthén, B. O. (2012). Mplus User's Guide. Muthén & Muthén, Los Angeles, CA.

Neal, R. M. (2003). Slice sampling. Annals of Statistics, 31(3):705–741.

Neale, M. C. and Miller, M. B. (1997). The use of likelihood-based confidence intervals in genetic models. Behavior Genetics, 27(2):113–120.

Neyman, J. (1934). On the two different aspects of the representative method: the method of stratified sampling and the method of purposive selection. Journal of the Royal Statistical Society, 97(4):558–625.

Pal Majumder, A. and Hannig, J. (2015). On quantile matching fiducial distributions. Manuscript in preparation.

Patz, R. J. and Junker, B. W. (1999). A straightforward approach to Markov chain Monte Carlo methods for item response models. Journal of Educational and Behavioral Statistics, 24(2):146–178.

Plummer, M. (2013a). JAGS Version 3.4.0 user manual. http://sourceforge.net/projects/mcmc-jags/files/Manuals/3.x/.

Plummer, M. (2013b). rjags: Bayesian graphical models using MCMC. R package version 3-10. http://CRAN.R-project.org/package=rjags.

Ralston, A. (1965). A First Course in Numerical Analysis. McGraw-Hill, New York.

Rao, C. (1973). Linear Statistical Inference and Its Application (2nd Ed.). Wiley, New York.

Reckase, M. (2009). Multidimensional item response theory. Springer, New York.

Reiser, M. (1996). Analysis of residuals for the multinomial item response model. Psychometrika, 61(3):509–528.

Robins, J. M., van der Vaart, A. W., and Ventura, V. (2000). Asymptotic distribution of p-values in composite null models. Journal of the American Statistical Association, 95(452):1143–1156.

Rubin, D. B. (1984). Bayesianly justifiable and relevant frequency calculations for the applied statistician. The Annals of Statistics, 12(4):1151–1172.

Rupp, A. A., Templin, J., and Henson, R. A. (2010). Diagnostic assessment: Theory, methods, and applications. Guilford, New York.

Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores. Psychometrika monograph No. 17. Richmond, VA: Psychometric Society.

Schilling, S. and Bock, R. D. (2005). High-dimensional maximum marginal likelihood item factor analysis by adaptive quadrature. Psychometrika, 70(3):533–555.

Schweder, T. and Hjort, N. L. (2002). Confidence and likelihood. Scandinavian Journal of Statistics, 29(2):309–332.

Shafer, G. (1976). A mathematical theory of evidence. Princeton University Press, Princeton, NJ.

Sinharay, S. (2005). Assessing fit of unidimensional item response theory models using a Bayesian approach. Journal of Educational Measurement, 42(4):375–394.

Sinharay, S., Johnson, M. S., and Stern, H. S. (2006). Posterior predictive assessment of item response theory models. Applied Psychological Measurement, 30(4):298–321.

Thissen, D., Pommerich, M., Billeaud, K., and Williams, V. S. (1995). Item response theory for scores on tests including polytomous items with ordered responses. Applied Psychological Measurement, 19(1):39–49.

Thissen, D. and Steinberg, L. (2010). Using item response theory to disentangle constructs at different levels of generality. In Embretson, S., editor, Measuring psychological constructs: Advances in model-based approaches, pages 123–144. American Psychological Association, Washington, DC.

Thissen, D., Steinberg, L., and Kuang, D. (2002). Quick and easy implementation of the Benjamini-Hochberg procedure for controlling the false positive rate in multiple compar- isons. Journal of Educational and Behavioral Statistics, 27(1):77–83.

Thissen, D. and Wainer, H. (1990). Confidence envelopes for item response theory. Journal of Educational and Behavioral Statistics, 15(2):113–128.

van der Vaart, A. W. (2000). Asymptotic statistics. Cambridge University Press, New York.

Wand, M. P. and Jones, M. C. (1994). Kernel smoothing. Chapman and Hall, London.

Wang, C., Hannig, J., and Iyer, H. K. (2012). Fiducial prediction intervals. Journal of Statistical Planning and Inference, 142(7):1980–1990.

Wirth, R. and Edwards, M. C. (2007). Item factor analysis: Current approaches and future directions. Psychological Methods, 12(1):58–79.

Xie, M., Liu, R., Chang, K., and Chen, R. (2014). Prediction with confidence: A general and unified framework for prediction. Presentation at [where?].

Xie, M. and Singh, K. (2013). Confidence distribution, the frequentist distribution estimator of a parameter: a review. International Statistical Review, 81(1):3–39.

Yuan, K.-H., Cheng, Y., and Patton, J. (2014). Information matrices and standard errors for MLEs of item parameters in IRT. Psychometrika, 79(2):232–254.

Zabell, S. L. (1992). R. A. Fisher and fiducial argument. Statistical Science, 7(3):369–387.
