
DOCUMENT RESUME

ED 353 302    TM 019 366

AUTHOR        Mislevy, Robert J.
TITLE         Linking Educational Assessments: Concepts, Issues, Methods, and Prospects.
INSTITUTION   Educational Testing Service, Princeton, NJ. Policy Information Center.
SPONS AGENCY  Office of Educational Research and Improvement (ED), Washington, DC.
PUB DATE      Dec 92
CONTRACT      R117G10027
NOTE          97p.; Foreword by Robert L. Linn.
PUB TYPE      Reports - Evaluative/Feasibility (142)
EDRS PRICE    MF01/PC04 Plus Postage.
DESCRIPTORS   Academic Achievement; Competence; *Educational Assessment; Elementary Secondary Education; Equated Scores; Evaluation Utilization; *Measurement Techniques; Projective Measures; Statistical Analysis; Student Evaluation
IDENTIFIERS   Calibration; *Linking Educational Assessments; Statistical Moderation

ABSTRACT

This paper describes the basic concepts of linking educational assessments. Although it discusses statistical machinery for interpreting evidence about students' achievements, this paper's principal message is that linking tests is not just a technical problem. Technical questions cannot be asked or answered until questions about the nature of learning and achievement and about purposes and consequences of the assessment are addressed. Some fundamental ideas about educational assessment (chains of inferences and test theory and roles of judgment) are considered. The following approaches to linking assessments are described: (1) equating in physical measurement and educational assessment; (2) calibration in physical measurement and educational assessment; (3) projection in physical measurement and educational assessment; and (4) statistical moderation and social moderation. Implications for a system of monitoring progress toward educational standards are discussed. The following obtainable goals are highlighted: (1) comparing levels of performance across clusters directly in terms of common indicators of performance on a selected sample of consensually defined tasks administered under standard conditions; (2) estimating levels of performance of groups or individuals within clusters at the levels of accuracy demanded by purposes within clusters; (3) comparing levels of performance across clusters in terms of performance ratings on customized assessments in terms of a consensually defined, more abstract description of developing competence; and (4) making projections about how students from one cluster might have performed on the assessment of another cluster. Seven tables, 14 figures, and 41 references are included. (RLC)

Reproductions supplied by EDRS are the best that can be made from the original document.

U.S. DEPARTMENT OF EDUCATION
Office of Educational Research and Improvement
EDUCATIONAL RESOURCES INFORMATION CENTER (ERIC)
This document has been reproduced as received from the person or organization originating it.
Minor changes have been made to improve reproduction quality.
Points of view or opinions stated in this document do not necessarily represent official OERI position or policy.

"PERMISSION TO REPRODUCE THIS MATERIAL HAS BEEN GRANTED BY
TO THE EDUCATIONAL RESOURCES INFORMATION CENTER (ERIC)."

POLICY INFORMATION CENTER
Educational Testing Service

Linking Educational Assessments: Concepts, Issues, Methods, and Prospects

By Robert J. Mislevy
Educational Testing Service

With a Foreword by Robert L. Linn
Center for Research on Evaluation, Standards, and Student Testing
University of Colorado at Boulder

December 1992

ETS Policy Information Center
Princeton, NJ 08541

Copyright © 1992 by Educational Testing Service. All rights reserved.

Table of Contents

Preface
Acknowledgments
Foreword
Introduction
Educational Assessments
Purposes of Assessment
Conceptions of Competence
Assessments as Operational Definitions
Standardization
Test Theory
Chains of Inferences
The Roles of Judgment
Linking Tests: An Overview
A Note on Linking Studies
Equating
    Equating in Physical Measurement
    Equating in Educational Assessment
    Comments on Equating
Calibration
    Calibration in Physical Measurement
    Calibration in Educational Assessment, Case 1: Long and Short Tests Built to the Same Specifications
    Calibration in Educational Assessment, Case 2: Item Response Theory for Patterns of Behavior
    Calibration in Educational Assessment, Case 3: Item Response Theory for Judgments on an Abstractly Defined Proficiency
    Comments on Calibration
Projection
    Projection in Physical Measurement
    Projection in Educational Assessment
    Another Type of Data for Projection Linking
    Comments on Projection
Moderation
    Statistical Moderation
    Social Moderation
    Comments on Moderation
Conclusion
References

List of Tables and Figures

Tables
1. Description of Assessment Purposes
2. ACTFL Proficiency Guidelines for Reading
3. Methods of Linking Educational Assessments
4. Temperature Measures for 60 Cities
5. Correspondence Table for Thermometer X and Thermometer Y
6. Cross-Tabulation of Oregon Ideas Scores and Arizona Content Scores for the Oregon Middle-School Papers
7. Decathlon Results from the 1932 Olympics

Figures
1. Hierarchy of Educational Outcomes
2. Chains of Inference from Test X and Test Y to Two Sets of Specific Curricular Objectives
3. Equating Linkages
4. Distributions of True Temperatures and Temperature Readings
5. Sixty Cities
6. Unadjusted Thermometer Y Readings Plotted Against Thermometer X Readings
7. Cricket Temperature Readings Plotted Against True Temperatures
8. Distributions of True Temperatures and Selected Estimates
9. Calibration in Terms of Behavior
10. Calibration in Terms of Judgment at a Higher Level of Abstraction
11. Empirical Linkages for Projection
12. True Temperatures Plotted Against Latitude
13. True Temperatures Plotted Against Latitude, with Regions Distinguished
14. Moderated Distributions of Goals Scored and Batting Averages

Preface

As this is written, a fairly wide consensus has developed around the proposition that we need to establish national education standards and a system of assessments to measure whether the standards are being achieved. Rejecting a single national examination, the call is for a voluntary system of assessments, all geared to national standards, in which states, groups of states, or other groups design their own assessments. According to the report of the National Council on Education Standards and Testing, the key features of assessments "would be alignment with high national standards and the capacity to produce useful, comparable results."

We focus here on the word "comparable." Irrespective of which assessment is given, there is a desire to be able to translate the results so that students across the country can be compared on their achievement of the national standards. Indeed, this has become a key to developing consensus. To those who fear that national standards and an assessment system will lead to a national curriculum, the response is that the standards are just "frameworks" within which curriculum can be determined locally, and assessments can reflect this curriculum. Techniques would be used, such as "calibration," to make the results comparable.
At a meeting of the National Education Goals Panel, a governor said something to the effect that we can have as many tests as we want; all we have to do is calibrate them.

How far can local curriculum stray from the national standards and differ from one locality to another, and how different can assessments be in how and what they test and still enable comparable scores to be constructed? The ETS Policy Information Center commissioned Robert J. Mislevy to give guidance to the policy and education community on this question.

While his report establishes the range of freedom, which may not be as wide as many have assumed or hoped, it also tries to give guidance on how comparability can be achieved. It is critical, I believe, to know this in advance and design the system in a way that comparability is possible. There are no neat technical tricks by which the results of just any old assessments can be compared.

We asked Mislevy to write in as nontechnical a manner as possible so that a person with a need to know could understand. However, it is not a simple matter. The conclusion, on page 72, will give the less motivated reader the bottom line on what Mislevy thinks possible, along with a summary of approaches to linking tests.

We are grateful to the National Center for Research on Evaluation, Standards, and Student Testing (CRESST) for joining in the funding of this work, and we are grateful for the willingness of Robert Linn to write the foreword; his own work has already contributed much to the understanding of both limitations and possibilities.

Paul E. Barton, Director
Policy Information Center

Acknowledgments

This monograph evolved from an internal memorandum on linking assessments by Martha Stocking and me.
The work received support from the Policy Information Center and the Statistical and Psychometric Research Division of Educational Testing Service and the National Center for Research on Evaluation, Standards, and Student Testing (CRESST), Educational Research and Development Program, cooperative agreement number R117G10027 and CFDA catalog number 84.117G, as administered by the Office of Educational Research and Improvement, U.S. Department