Technology and Testing

Total Pages: 16

File Type: PDF, Size: 1020 KB

Technology and Testing

From early answer sheets filled in with number 2 pencils, to tests administered by mainframe computers, to assessments wholly constructed by computers, it is clear that technology is changing the field of educational and psychological measurement. The numerous and rapid advances have an immediate impact on test creators, assessment professionals, and those who implement and analyze assessments. This comprehensive new volume brings together leading experts on the issues posed by technological applications in testing, with chapters on game-based assessment, testing with simulations, video assessment, computerized test development, large-scale test delivery, model choice, validity, and error issues. Including an overview of existing literature and groundbreaking research, each chapter considers the technological, practical, and ethical considerations of this rapidly changing area. Ideal for researchers and professionals in testing and assessment, Technology and Testing provides a critical and in-depth look at one of the most pressing topics in educational testing today.

Fritz Drasgow is Professor of Psychology and Dean of the School of Labor and Employment Relations at the University of Illinois at Urbana-Champaign, USA.

The NCME Applications of Educational Measurement and Assessment Book Series

Editorial Board:
Michael J. Kolen, The University of Iowa, Editor
Robert L. Brennan, The University of Iowa
Wayne Camara, ACT
Edward H. Haertel, Stanford University
Suzanne Lane, University of Pittsburgh
Rebecca Zwick, Educational Testing Service

In this series: Technology and Testing: Improving Educational and Psychological Measurement, edited by Fritz Drasgow.

Forthcoming:
Meeting the Challenges to Measurement in an Era of Accountability, edited by Henry Braun
Testing the Professions: Credentialing Policies and Practice, edited by Susan Davis-Becker and Chad W. Buckendahl
Fairness in Educational Assessment and Measurement, edited by Neil J. Dorans and Linda L. Cook
Validation of Score Meaning in the Next Generation of Assessments, edited by Kadriye Ercikan and James W. Pellegrino

First published 2016 by Routledge, 711 Third Avenue, New York, NY 10017, and by Routledge, 2 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN. Routledge is an imprint of the Taylor & Francis Group, an informa business. © 2016 Taylor & Francis.

The right of the editor to be identified as the author of the editorial material, and of the authors for their individual chapters, has been asserted in accordance with sections 77 and 78 of the Copyright, Designs and Patents Act 1988. All rights reserved. No part of this book may be reprinted or reproduced or utilised in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers. Trademark notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data:
Technology and testing : improving educational and psychological measurement / edited by Fritz Drasgow. pages cm. (NCME Applications of Educational Measurement and Assessment Book Series.) Includes bibliographical references and index. 1. Educational tests and measurements. 2. Educational technology. I. Drasgow, Fritz. LB3051.T43 2015 371.26 dc23 2015008611

ISBN: 978-0-415-71715-1 (hbk)
ISBN: 978-0-415-71716-8 (pbk)
ISBN: 978-1-315-87149-3 (ebk)

Typeset in Minion by Apex CoVantage, LLC.

Contents

List of Contributors
Foreword
Preface
1. Managing Ongoing Changes to the Test: Agile Strategies for Continuous Innovation (Cynthia G. Parshall and Robin A. Guille)
2. Psychometrics and Game-Based Assessment (Robert J. Mislevy, Seth Corrigan, Andreas Oranje, Kristen DiCerbo, Malcolm I. Bauer, Alina von Davier, and Michael John)
3. Issues in Simulation-Based Assessment (Brian E. Clauser, Melissa J. Margolis, and Jerome C. Clauser)
4. Actor or Avatar? Considerations in Selecting Appropriate Formats for Assessment Content (Eric C. Popp, Kathy Tuzinski, and Michael Fetzer)
Commentary on Chapters 1–4: Using Technology to Enhance Assessments (Stephen G. Sireci)
5. Using Technology-Enhanced Processes to Generate Test Items in Multiple Languages (Mark J. Gierl, Hollis Lai, Karen Fung, and Bin Zheng)
6. Automated Test Assembly (Krista Breithaupt and Donovan Hare)
7. Validity and Automated Scoring (Randy Elliot Bennett and Mo Zhang)
Commentary on Chapters 5–7: Moving From Art to Science (Mark D. Reckase)
8. Computer-Based Test Delivery Models, Data, and Operational Implementation Issues (Richard M. Luecht)
9. Mobile Psychological Assessment (Oleksandr S. Chernyshenko and Stephen Stark)
10. Increasing the Accessibility of Assessments Through Technology (Elizabeth Stone, Cara C. Laitusis, and Linda L. Cook)
11. Testing Technology and Its Effects on Test Security (David Foster)
Commentary on Chapters 8–11: Technology and Test Administration: The Search for Validity (Kurt F. Geisinger)
12. From Standardization to Personalization: The Comparability of Scores Based on Different Testing Conditions, Modes, and Devices (Walter D. Way, Laurie L. Davis, Leslie Keng, and Ellen Strain-Seymour)
13. Diagnostic Assessment: Methods for the Reliable Measurement of Multidimensional Abilities (Jonathan Templin)
14. Item Response Models for CBT (Daniel Bolt)
15. Using Prizes to Facilitate Change in Educational Assessment (Mark D. Shermis and Jaison Morgan)
Commentary on Chapters 12–15: Future Directions: Challenge and Opportunity (Edward Haertel)
Index

Contributors

Malcolm I. Bauer is a senior managing scientist in the Cognitive Science group at the Educational Testing Service. His research has two main foci: game-based assessment for complex problem solving and non-cognitive skills, and learning progressions for assessment in STEM areas.

Randy Elliot Bennett holds the Norman O. Frederiksen Chair in Assessment Innovation at Educational Testing Service in Princeton, New Jersey. His work concentrates on integrating advances in the learning sciences, measurement, and technology to create new approaches to assessment that measure well and that have a positive impact on teaching and learning.

Daniel Bolt is Professor of Educational Psychology at the University of Wisconsin-Madison and specializes in quantitative methods. His research interests are in item response theory (IRT), including multidimensional IRT and applications. He teaches courses in the areas of test theory, factor analysis, and hierarchical linear modeling, and consults on various issues related to the development of measurement instruments in the social sciences. Dan has been the recipient of a Vilas Associates Award and a Chancellor's Distinguished Teaching Award at the University of Wisconsin.
Krista Breithaupt, Ph.D., has 25 years of experience with modern psychometric methods and computerized testing, including assessment in clinical epidemiology, nursing, and population health. Krista served for 10 years as a psychometrician and then as Director of Psychometrics at the American Institute of CPAs (AICPA), where she guided research and development to computerize the national licensing exam for CPAs, contributing innovative research in test development, assembly, scoring, and delivery technologies for an exam administered to 250,000 examinees annually. Krista later held the position of Director, Research and Development, at the Medical Council of Canada, a post from which she has since retired. There she defined and led national studies to determine the skills, knowledge, and attitudes required for licensure as a physician in Canada, and developed and guided studies to validate and streamline the delivery, scoring, and comparability of certification examinations for Canadian and internationally trained physicians.

Oleksandr S. Chernyshenko received his Ph.D. from the University of Illinois at Urbana-Champaign in 2002, with a major in industrial-organizational psychology and minors in quantitative methods and human resource management. He is currently an Associate Professor at Nanyang Technological University in Singapore, conducting research and teaching courses related to personnel selection. Dr. Chernyshenko is an Associate Editor of the journal Personnel Assessment and Decisions and serves on the editorial board of the International Journal of Testing.

Brian E. Clauser has worked for the past 23 years at the National Board of Medical Examiners, where he has conducted research on automated scoring of complex assessment formats, standard setting, and applications of generalizability theory. He led the research project that developed the automated scoring system for a computer-simulation-based assessment of physicians' patient management skills; the simulation is part of the licensing examination for physicians. Brian is also the immediate past editor of the Journal of Educational Measurement.

Jerome C. Clauser is a Research Psychometrician at the American Board of Internal Medicine. His research interests include standard setting, equating, and innovative item types.

Linda L. Cook is recently retired from Educational Testing Service. While employed by ETS, she served in a number of roles, including Executive Director of the Admissions and
Recommended publications
  • Advancing Human Assessment: The Methodological, Psychological and Policy Contributions of ETS (Randy E. Bennett and Matthias von Davier, Editors)
    Methodology of Educational Measurement and Assessment
    Advancing Human Assessment: The Methodological, Psychological and Policy Contributions of ETS. Randy E. Bennett and Matthias von Davier, Editors.
    Series editors: Bernard Veldkamp, Research Center for Examinations and Certification (RCEC), University of Twente, Enschede, The Netherlands; and Matthias von Davier, National Board of Medical Examiners (NBME), Philadelphia, USA. (This work was conducted while M. von Davier was employed with Educational Testing Service.)

    This book series collates key contributions to a fast-developing field of education research. It is an international forum for theoretical and empirical studies exploring new and existing methods of collecting, analyzing, and reporting data from educational measurements and assessments. Covering a high-profile topic from multiple viewpoints, it aims to foster a broader understanding of fresh developments as innovative software tools and new concepts such as competency models and skills diagnosis continue to gain traction in educational institutions around the world. Methodology of Educational Measurement and Assessment offers readers reliable critical evaluations, reviews, and comparisons of existing methodologies alongside authoritative analysis and commentary on new and emerging approaches. It will showcase empirical research on applications, examine issues such as reliability, validity, and comparability, and help keep readers up to speed on developments in statistical modeling approaches. The fully peer-reviewed publications in the series cover measurement and assessment at all levels of education and feature work by academics and education professionals from around the world. Providing an authoritative central clearing-house for research in a core sector in education, the series forms a major contribution to the international literature.
  • Creating Coupled-Multiple Response Test Items in Physics and Engineering for Use in Adaptive Formative Assessments
    Creating coupled-multiple response test items in physics and engineering for use in adaptive formative assessments
    Laura Ríos, Benjamin Lutz, Eileen Rossman, Christina Yee, Dominic Trageser, Maggie Nevrly, Megan Philips, and Brian Self
    Physics Department and Department of Mechanical Engineering, California Polytechnic State University, San Luis Obispo, CA, USA. Corresponding author: [email protected]

    Abstract: This Research Work-in-Progress paper presents ongoing work in engineering and physics courses to create online formative assessments. Mechanics courses are full of non-intuitive, conceptually challenging principles that are difficult to understand. In order to help students with conceptual growth, particularly if we want to develop individualized online learning modules, we first must determine students' prior understanding. To do this, we are using coupled-multiple response (CMR) questions in introductory physics (a mechanics course), engineering statics, and engineering dynamics. CMR tests are assessments that use a nuanced rubric to examine the underlying reasoning elements students may have for decisions on a multiple-choice test and, as such, are uniquely situated to pinpoint specific student alternate conceptions.

    ... technologies [5], [6], [7]. In this work-in-progress, we describe our research efforts, which leverage existing online learning infrastructure (Concept Warehouse) [8]. The Concept Warehouse is an online instructional and assessment tool that helps students develop a deeper understanding of relevant concepts across a range of STEM courses and topics. We begin by briefly reviewing student difficulties in statics and dynamics, in addition to our motivation for developing specifically coupled-multiple response (CMR) test items to tailor online learning tools. CMR items leverage a combination of student answers and explanations to identify specific ways ...
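The idea of an item that couples an answer choice with a rubric over reasoning elements is concrete enough to sketch in code. Below is a minimal, hypothetical representation of a CMR item and one plausible partial-credit scoring rule; the field names and the penalty scheme are illustrative assumptions, not the Concept Warehouse's actual data model or published rubric.

```python
from dataclasses import dataclass

@dataclass
class CMRItem:
    """A coupled multiple-response item: one answer choice plus a
    multiple-select set of reasoning elements (hypothetical schema)."""
    prompt: str
    choices: list[str]            # answer options
    correct_choice: int           # index of the keyed answer
    reasoning_options: list[str]  # candidate justifications
    keyed_reasoning: set[int]     # justifications consistent with the key (non-empty)

def score_cmr(item: CMRItem, chosen: int, selected: set[int]) -> tuple[bool, float]:
    """Score the answer and the reasoning separately: the reasoning score
    rewards keyed justifications, penalizes off-key selections, and is
    floored at zero. This is one plausible rubric among many."""
    hits = len(item.keyed_reasoning & selected)
    misses = len(selected - item.keyed_reasoning)
    reasoning = max(0.0, (hits - misses) / len(item.keyed_reasoning))
    return chosen == item.correct_choice, reasoning
```

Keeping the answer score and the reasoning score separate is what lets such items flag students who pick the right answer for the wrong reasons, which is exactly the "alternate conception" signal the excerpt describes.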
  • AP® Art History Practice Exam
    Sample Responses from the AP® Art History Practice Exam: Sample Questions, Scoring Guidelines, Student Responses, and Commentaries on the Responses. Effective Fall 2015.

    About the College Board: The College Board is a mission-driven not-for-profit organization that connects students to college success and opportunity. Founded in 1900, the College Board was created to expand access to higher education. Today, the membership association is made up of over 6,000 of the world's leading educational institutions and is dedicated to promoting excellence and equity in education. Each year, the College Board helps more than seven million students prepare for a successful transition to college through programs and services in college readiness and college success, including the SAT® and the Advanced Placement Program®. The organization also serves the education community through research and advocacy on behalf of students, educators, and schools. For further information, visit www.collegeboard.org.

    AP® Equity and Access Policy: The College Board strongly encourages educators to make equitable access a guiding principle for their AP® programs by giving all willing and academically prepared students the opportunity to participate in AP. We encourage the elimination of barriers that restrict access to AP for students from ethnic, racial, and socioeconomic groups that have been traditionally underserved. Schools should make every effort to ensure their AP classes reflect the diversity of their student population. The College Board also believes that all students should have access to academically challenging course work before they enroll in AP classes, which can prepare them for AP success.
  • The Effect of Using Different Weights for Multiple-Choice and Free- Response Item Sections Amy Hendrickson Brian Patterson Gerald Melican
    The Effect of Using Different Weights for Multiple-Choice and Free-Response Item Sections
    Amy Hendrickson, Brian Patterson, and Gerald Melican, The College Board. Presented at the National Council on Measurement in Education annual meeting, March 27, 2008, New York.

    Combining Scores to Form Composites
    • Many high-stakes tests include both multiple-choice (MC) and free-response (FR) item types.
    • Linearly combining the item-type scores is equivalent to deciding how much each component will contribute to the composite.
    • Use of different weights potentially impacts the reliability and/or validity of the scores.

    Advanced Placement Program® (AP®) Exams
    • 34 of 37 Advanced Placement Program® (AP®) Exams contain both MC and FR items.
    • Weights for AP composite scores are set by a test development committee; these weights range from 0.33 to 0.60 for the FR section.
    • The relative weights are translated into absolute weights, the multiplicative constants that are applied directly to the item scores.
    • Previous research exists concerning the effect of different weighting schemes on the reliability of AP exam scores, but not on the validity of the scores.

    Weighting, Reliability, and Validity
    • Construct/face validity perspective: include both MC and FR items because they measure different and important constructs. De-emphasizing the FR section may disregard the intention of the test developers and policy makers, who believe that, if anything, the FR section should have the higher weight to meet face validity.
    • Psychometric perspective: FR items are generally less reliable than MC items, so it is best to give proportionally higher weight to the (more reliable) MC section.
    • The decision of how to combine item-type scores represents a trade-off between the test measuring what it is meant and expected to measure and the test measuring consistently (Walker, 2005).
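The translation from relative to absolute weights described in the excerpt above is simple arithmetic. The sketch below assumes one common convention, that each section's maximum raw score should map to its target share of the composite maximum; the College Board's exact operational procedure may differ.

```python
def absolute_weights(w_fr: float, max_mc: float, max_fr: float,
                     composite_max: float = 100.0) -> tuple[float, float]:
    """Turn a relative free-response weight (e.g., 0.33 to 0.60) into the
    multiplicative constants applied to each section's raw score, assuming
    each section's maximum contributes its target share of the composite
    maximum (one convention among several)."""
    a_mc = (1.0 - w_fr) * composite_max / max_mc
    a_fr = w_fr * composite_max / max_fr
    return a_mc, a_fr

# Hypothetical exam: 50 MC items and 4 FR tasks worth 9 points each,
# with the FR section weighted at 0.40 of a 100-point composite.
a_mc, a_fr = absolute_weights(0.40, max_mc=50, max_fr=36)
composite = a_mc * 41 + a_fr * 27  # one examinee's weighted composite score
```

The trade-off the authors name falls straight out of these constants: raising w_fr shifts composite variance toward the less reliable FR section, lowering composite reliability even as it strengthens the face-validity case.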
  • The Beliefs of Advanced Placement Teachers Regarding Equity
    The Beliefs of Advanced Placement Teachers Regarding Equity and Access to Advanced Placement Courses: A Mixed-Methods Study, by Mirynne O'Connor Igualada. A dissertation submitted to the Faculty of the College of Education in partial fulfillment of the requirements for the degree of Doctor of Philosophy, Florida Atlantic University, Boca Raton, Florida, December 2015. Copyright 2015 by Mirynne O'Connor Igualada.

    Acknowledgements: I cannot express enough gratitude to the many people who provided continued support and encouragement throughout this journey. Thank you to my mother, Maureen, as you have been an endless source of support and encouragement. You always have believed that I could achieve this degree, while also helping me to stay somewhat sane throughout the process. You provided my theoretical framework; had I not been raised in your household and under your spiritual guidance, I would not have felt the need to examine issues of inequity. To my husband, Eric, thank you for your love throughout this process. You may not have always wanted to hear about my research, but you always have been willing to support my hopes and dreams without question. I would not be where I am today without you. I love you. Thank you to my Angel Baby and my Rainbow Baby, Emalee Jane. I am not sure I would have found the strength and stamina to complete this degree had I not felt the need to make you both proud of my accomplishments. I would like to thank my dissertation chair, Dr. Dilys Schoorman, who has been by my side through every major event academically, personally, and professionally in my adult life.
  • ED384655.Pdf
    DOCUMENT RESUME: ED 384 655 (TM 023 948)
    Author: Mislevy, Robert J.
    Title: A Framework for Studying Differences between Multiple-Choice and Free-Response Test Items
    Institution: Educational Testing Service, Princeton, N.J.
    Report No.: ETS-RR-91-36
    Pub Date: May 1991
    Note: 59p.
    Pub Type: Reports - Evaluative/Feasibility (142)
    EDRS Price: MF01/PC03 Plus Postage
    Descriptors: Cognitive Psychology; *Competence; Elementary Secondary Education; *Inferences; *Multiple Choice Tests; *Networks; Test Construction; Test Reliability; *Test Theory; Test Validity
    Identifiers: *Free Response Test Items

    Abstract: This paper lays out a framework for comparing the qualities and the quantities of information about student competence provided by multiple-choice and free-response test items. After discussing the origins of multiple-choice testing and recent influences for change, the paper outlines an "inference network" approach to test theory, in which students are characterized in terms of levels of understanding of key concepts in a learning area. It then describes how to build inference networks to address questions concerning the information about various criteria conveyed by alternative response formats. The types of questions that can be addressed in this framework include those that can be studied within the framework of standard test theory. Moreover, questions can be asked about generalized kinds of reliability and validity for inferences cast in terms of recent developments in cognitive psychology. (Contains 46 references, 4 tables, and 16 figures.) Reproductions supplied by EDRS are the best that can be made from the original document.
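The inference-network approach the abstract names can be caricatured in a few lines: treat a student's level of understanding as a discrete latent variable and update a distribution over levels as item responses arrive. The likelihood values below are invented for illustration; Mislevy's actual networks link many concept variables and response formats, not a single node.

```python
def update_levels(prior: list[float], p_correct: list[float],
                  correct: bool) -> list[float]:
    """One Bayes update over discrete understanding levels:
    posterior is proportional to likelihood times prior."""
    likelihood = [p if correct else 1.0 - p for p in p_correct]
    joint = [lk * pr for lk, pr in zip(likelihood, prior)]
    total = sum(joint)
    return [j / total for j in joint]

# Three levels (low, medium, high) with a uniform prior; the item is
# easier for higher levels, so a correct response shifts mass upward.
posterior = update_levels([1/3, 1/3, 1/3], [0.2, 0.6, 0.9], correct=True)
```

In this framing, the paper's question about multiple-choice versus free-response items becomes a question about which response format's likelihoods move the posterior over understanding levels more sharply.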
  • Finding Relationships Between Multiple-Choice Math Tests and Their Stem-Equivalent Constructed Responses (Nayla Aad Chaoui, Claremont Graduate University)
    Finding Relationships Between Multiple-Choice Math Tests and Their Stem-Equivalent Constructed Responses, by Nayla Aad Chaoui. A dissertation submitted to the Faculty of Claremont Graduate University in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Graduate Faculty of Education, Claremont Graduate University, 2011.

    Recommended Citation: Chaoui, Nayla Aad, "Finding Relationships Between Multiple-Choice Math Tests and Their Stem-Equivalent Constructed Responses" (2011). CGU Theses & Dissertations, Paper 21. http://scholarship.claremont.edu/cgu_etd/21. DOI: 10.5642/cguetd/21.

    This open access dissertation is brought to you for free and open access by the CGU Student Scholarship at Scholarship @ Claremont. It has been accepted for inclusion in CGU Theses & Dissertations by an authorized administrator of Scholarship @ Claremont. For more information, please contact [email protected].

    Approval of the Dissertation Committee: We, the undersigned, certify that we have read, reviewed, and critiqued the dissertation of Nayla Aad Chaoui and do hereby approve it as adequate in scope and quality for meriting the degree of Doctor of Philosophy. Dr. Mary Poplin, Chair, School of Educational Studies.
  • AP COMPARATIVE STUDY GUIDE by Ethel Wood
    AP Comparative Study Guide, by Ethel Wood

    Comparative government and politics provides an introduction to the wide, diverse world of governments and political practices that currently exist in modern times. Although the course focuses on specific countries, it also emphasizes an understanding of conceptual tools and methods that form a framework for comparing almost any governments that exist today. Additionally, it requires students to go beyond individual political systems to consider international forces that affect all people in the world, often in very different ways. Six countries form the core of the course: Great Britain, Russia, China, Mexico, Iran, and Nigeria. The countries are chosen to reflect regional variations, but more importantly, to illustrate how important concepts operate both similarly and differently in different types of political systems: "advanced" democracies, communist and post-communist countries, and newly industrialized and less developed nations. This book includes review materials for all six countries.

    WHAT IS COMPARATIVE POLITICS? Most people understand that the term government is a reference to the leadership and institutions that make policy decisions for the country. However, what exactly is politics? Politics is basically all about power. Who has the power to make the decisions? How did they get the power? What challenges do leaders face from others, both inside and outside the country's borders, in keeping the power? So, as we look at different countries, we are not only concerned about the ins and outs of how the government works. We will also look at how power is gained, managed, challenged, and maintained. College-level courses in comparative government and politics vary, but they all cover topics that enable meaningful comparisons across countries.
  • Preference for Multiple Choice and Constructed Response Exams for Engineering Students with and Without Learning Difficulties
    Preference for Multiple Choice and Constructed Response Exams for Engineering Students with and without Learning Difficulties
    Panagiotis Photopoulos, Ilias Stavrakas, and Dimos Triantis (Department of Electrical and Electronic Engineering, University of West Attica, Athens, Greece) and Christos Tsonos (Department of Physics, University of Thessaly, Lamia, Greece)
    Keywords: Multiple Choice, Constructed Response, Essay Exams, Learning Difficulties, Power Relations

    Abstract: Problem solving is a fundamental part of engineering education. The aim of this study is to compare engineering students' perceptions of and attitudes towards problem-based multiple-choice and constructed-response exams. Data were collected from 105 students, 18 of whom reported facing some learning difficulty. All the students had experience of four or more problem-based multiple-choice exams. Overall, students showed a preference for multiple-choice exams, although they did not consider them to be fairer, easier, or less anxiety-invoking. Students facing learning difficulties struggle with written exams independently of their format, and their preferences between the two examination formats are influenced by the specificity of their learning difficulties. The degree to which each exam format allows students to show what they have learned and be rewarded for partial knowledge is also discussed; the replies to this question were influenced by students' preference for each examination format.

    1 INTRODUCTION: Public universities are experiencing budgetary cuts manifested in increased flexible employment, overtime working, and large classes (Parmenter et al. 2009; Watts 2017). ... consuming process and a computer-based evaluation of the answers is still problematic (Ventouras et al. 2010). Assessment embodies power relations between institutions, teachers and students (Tan, 2012; Holley ...
  • AP United States History Practice Exam and Notes
    Effective Fall 2017: AP United States History Practice Exam and Notes

    This exam may not be posted on school or personal websites, nor electronically redistributed for any reason. This exam is provided by the College Board for AP Exam preparation. Teachers are permitted to download the materials and make copies to use with their students in a classroom setting only. To maintain the security of this exam, teachers should collect all materials after their administration and keep them in a secure location. Further distribution of these materials outside of the secure College Board site disadvantages teachers who rely on uncirculated questions for classroom testing. Any additional distribution is in violation of the College Board's copyright policies and may result in the termination of Practice Exam access for your school as well as the removal of access to other online services such as the AP Teacher Community and Online Score Reports.

    About the College Board: The College Board is a mission-driven not-for-profit organization that connects students to college success and opportunity. Founded in 1900, the College Board was created to expand access to higher education. Today, the membership association is made up of over 6,000 of the world's leading educational institutions and is dedicated to promoting excellence and equity in education. Each year, the College Board helps more than seven million students prepare for a successful transition to college through programs and services in college readiness and college success, including the SAT® and the Advanced Placement Program®. The organization also serves the education community through research and advocacy on behalf of students, educators, and schools.