Predicting Differential Item Functioning in Cross-Lingual Testing: the Case of a High Stakes Test in the Kyrgyz Republic
PREDICTING DIFFERENTIAL ITEM FUNCTIONING IN CROSS-LINGUAL TESTING: THE CASE OF A HIGH STAKES TEST IN THE KYRGYZ REPUBLIC

By Todd W. Drummond

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirement for the degree of

DOCTOR OF PHILOSOPHY

Educational Policy

2011

ABSTRACT

Cross-lingual tests are assessment instruments created in one language and adapted for use with another language group. Practitioners and researchers use cross-lingual tests for various descriptive, analytical and selection purposes both in comparative studies across nations and within countries marked by linguistic diversity (Hambleton, 2005). Due to cultural, contextual, psychological and linguistic differences between diverse populations, adapting test items for use across groups is a challenging endeavor. The validity of inferences based on cross-lingual tests can only be assured if the content, meaning, and difficulty of test items are similar in the different language versions of the test items (Ercikan, 2002). Of paramount importance in the test adaptation process is the proven ability of test developers to adapt test items across groups in meaningful ways.

One way investigators seek to understand the level of item equivalence on a cross-lingual assessment is to analyze items for differential item functioning, or DIF. DIF is present when examinees from different language groups do not have the same probability of responding correctly to a given item, after controlling for examinee ability (Camilli & Shepard, 1994). In order to detect and minimize DIF, test developers employ both statistical methods and substantive (judgmental) reviews of cross-lingual items. In the Kyrgyz Republic, item developers rely on substantive review of items by bi-lingual professionals.
In situations where statistical DIF detection methods are not typically utilized, the accuracy of such professionals in discerning differences in content, meaning and difficulty between items is especially important.

In this study, the accuracy of bi-linguals’ predictions about whether differences between Kyrgyz and Russian language test items would lead to DIF was evaluated. The items came from a cross-lingual university scholarship test in the Kyrgyz Republic. Evaluators’ predictions were compared to a statistical test of “no difference” in response patterns by group using the logistic regression (LR) DIF detection method (Swaminathan & Rogers, 1990). A small number of test items were estimated to have “practical statistical DIF.” There was a modest, positive correlation between evaluators’ predictions and statistical DIF levels. However, with the exception of one item type, sentence completion, evaluators were unable to predict which language group was favored by differences on a consistent basis. Plausible explanations for this finding as well as ways to improve the accuracy of substantive review are offered.

Data was also collected to determine the primary sources of DIF in order to inform the test development and adaptation process in the republic. Most of the causes of DIF were attributed to highly contextual (within item) sources of difference related to overt adaptation problems. However, inherent language differences were also noted: Syntax issues with the sentence completion items made the adaptation of this item type from Russian into Kyrgyz problematic. Statistical and substantive data indicated that the reading comprehension items were less problematic to adapt than analogy and sentence completion items. I analyze these findings and interpret their implications to key stakeholders, provide recommendations for how to improve the process of adapting items from Russian into Kyrgyz and highlight cautions to interpreting the data collected in this study.
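
For readers unfamiliar with the logistic regression DIF method named above, its standard formulation can be sketched as follows. This is an illustration only, not a statement of the exact specification used in this study. The notation is assumed here: $u$ is the scored response to an item (1 = correct), $\theta$ is the matching variable that stands in for examinee ability (for example, total test score), $g$ is a 0/1 indicator of language group, and $\beta_0, \dots, \beta_3$ are coefficients estimated separately for each item.

```latex
\[
P(u = 1 \mid \theta, g) = \frac{\exp(z)}{1 + \exp(z)},
\qquad
z = \beta_0 + \beta_1 \theta + \beta_2 g + \beta_3 (\theta g)
\]
```

Under this formulation, the hypothesis of “no difference” corresponds to $\beta_2 = \beta_3 = 0$, typically tested jointly with a two-degree-of-freedom chi-square statistic. A nonzero $\beta_2$ with $\beta_3 = 0$ indicates uniform DIF (one group is favored at every ability level), while a nonzero $\beta_3$ indicates nonuniform DIF (the group difference changes with ability).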
Copyright by Todd W. Drummond 2011

ACKNOWLEDGEMENTS

I feel fortunate to have had the opportunity to pursue doctoral work in the College of Education at Michigan State University. I would like to express my sincere gratitude to my dissertation director, Dr. Mark Reckase, for his patient guidance as he mentored me throughout my doctoral studies. Special thanks also to my academic advisor Dr. Jack Schwille for his thoughtful probing and encouragement. Committee members Dr. Jim Fairweather and Dr. Ed Roeber were always accessible and provided constructive feedback. Dr. Michael Sedlak has been a consistent source of moral and financial support for all the students in the educational policy program at MSU. Though not directly involved with this dissertation, I learned a tremendous amount about leadership from Dr. John Hudzik in the Office for Global Engagement and about educational politics from the wisdom of Dr. Phillip Cusick. Thanks to Seung-Hwan Ham and Wang Jun Kim for their friendship and interest in reading my work.

Colleagues and friends at the American Councils for International Education in both Washington, D.C., and the Kyrgyzstan field office influenced my thinking about this dissertation. I would like to acknowledge Dr. Dan Davidson, Dr. David Patton, Michael Curtis and Kimberly Verkuilen as well as past and current members of the American Councils team in Bishkek for their friendship and the role they have played in my professional development for more than a decade.

This dissertation would not have been possible without the support of dozens of colleagues, students and friends in Kyrgyzstan. I thank Nina Dolzhenko, my former students, the entire collective at school number one in Kant, colleagues at the Ministry of Education, and the Shirinovi and Chokubeavi families for “introducing me” to Kyrgyzstan almost two decades ago. I also thank former Minister of Education, Camilla Sharshekeeva, for her friendship, inspirational courage, and tenacious optimism.
Past and current staff of the Center for Educational Assessment and Teaching Methods (CEATM) in Bishkek, led by Dr. Inna Valkova, have not only my deepest gratitude for their enthusiastic collaboration, but my sincere admiration for the outstanding work they do in very difficult conditions. Constantine Titov, Natalia Naumova, Merim Kadyrova and Asel Bazarbaeva, the study participants, and the rest of the CEATM team supported me in every possible way in the summer of 2010. This research was made possible by support from the U.S. State Department’s (Title VIII) Research Scholars program administered by the American Councils for International Education.

Finally, I want to express my heartfelt thanks to my family. Vitaly and Lubov Stolyarovi as well as their extended families in Kyrgyzstan have been constant supporters. I thank my parents, R. Wayne and Gayle Drummond, for their unwavering love and a lifetime of opportunities and encouragement. I dedicate this dissertation to the most important person in my life, my wife and best friend, Natalia, who has been with me every step of the way.

TABLE OF CONTENTS

LIST OF TABLES

CHAPTER 1: PREDICTING DIFFERENTIAL ITEM FUNCTIONING IN CROSS-LINGUAL TESTING
    OVERVIEW
    THE CHALLENGE OF CROSS-LINGUAL ASSESSMENT
    RESEARCH QUESTIONS
    UTILITY OF THIS STUDY
    SITUATING THE STUDY AND KEY TERMS
    STUDY LIMITATIONS
    ORGANIZATION OF THE STUDY

CHAPTER 2: EDUCATION & LANGUAGE(S) OF INSTRUCTION IN THE KR
    OVERVIEW
    CONTEMPORARY SCHOOLING AND LANGUAGE ISSUES
    THE STATUS OF RUSSIAN AS A MEDIUM OF INSTRUCTION
    QUALITY OF EDUCATION BY LANGUAGE OF INSTRUCTION
    TERTIARY EDUCATION AND THE NST
    STUDENT SELECTION IN THE SOVIET PERIOD
    THE NATIONAL SCHOLARSHIP TEST AND LANGUAGE POLITICS

CHAPTER 3: LITERATURE REVIEW
    SUBSTANTIVE REVIEW AND DIF PREDICTION
    LEVELS OF DIF IN CROSS-LINGUAL TESTING
    CAUSES OF DIF IN CROSS-LINGUAL TESTING