ENCYCLOPEDIA OF LANGUAGE AND EDUCATION

SECOND EDITION Encyclopedia of Language and Education

VOLUME 7: LANGUAGE TESTING AND ASSESSMENT

General Editor Nancy H. Hornberger, University of Pennsylvania, Philadelphia, USA

Editorial Advisory Board

Neville Alexander, University of Cape Town, South Africa
Colin Baker, University of Wales, UK
Marilda Cavalcanti, UNICAMP, Brazil
Caroline Clapham, University of Lancaster, UK
Bronwyn Davies, University of Western Sydney, Australia
Viv Edwards, University of Reading, UK
Frederick Erickson, University of California at Los Angeles, USA
Joseph Lo Bianco, University of Melbourne, Australia
Luis Enrique Lopez, University of San Simon, Bolivia
Allan Luke, Queensland University of Technology, Australia
Tove Skutnabb-Kangas, Roskilde University, Denmark
Bernard Spolsky, Bar-Ilan University, Israel
G. Richard Tucker, Carnegie Mellon University, USA
Leo van Lier, Monterey Institute of International Studies, USA
Terrence G. Wiley, Arizona State University, USA
Ruth Wodak, University of Vienna, Austria
Ana Celia Zentella, University of California at San Diego, USA

The volume titles of this encyclopedia are listed at the end of this volume.

Encyclopedia of Language and Education

Volume 7

LANGUAGE TESTING AND ASSESSMENT

Edited by

ELANA SHOHAMY Tel Aviv University School of Education Israel

and

NANCY H. HORNBERGER University of Pennsylvania Graduate School of Education USA

Volume Editors:

Elana Shohamy, Tel Aviv University, School of Education, Tel Aviv, 69978, Israel

Nancy H. Hornberger, University of Pennsylvania, Graduate School of Education, Philadelphia, PA 19104-6216, USA

General Editor:

Nancy H. Hornberger, University of Pennsylvania, Graduate School of Education, Philadelphia, PA 19104-6216, USA

Library of Congress Control Number: 2007925265

ISBN-13: 978-0-387-32875-1

The electronic version will be available under ISBN 978-0-387-30424-3. The print and electronic bundle will be available under ISBN 978-0-387-35420-0.

Printed on acid-free paper.

© 2008 Springer Science+Business Media, LLC. All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC., 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

springer.com

TABLE OF CONTENTS

VOLUME 7: LANGUAGE TESTING AND ASSESSMENT

General Editor’s Introduction (Nancy H. Hornberger)
Introduction to Volume 7: Language Testing and Assessment (Elana Shohamy)
Contributors
Reviewers

Section 1: Assessing Language Domains

1. Assessing Oral and Literate Abilities (Alister Cumming)
2. Assessment in Multilingual Societies (Rama Mathew)
3. Assessing Content and Language (Heidi Byrnes)
4. Assessing Communicative Language Ability: Models and their Components (James E. Purpura)
5. Assessment at the Workplace (Kieran O’Loughlin)
6. Testing Aptitude for Second Language Learning (Charles Stansfield and Paula Winke)

Section 2: Methods of Assessment

7. Alternative Assessment (Janna Fox)
8. Task and Performance Based Assessment (Gillian Wigglesworth)
9. Utilizing Technology in Language Assessment (Carol A. Chapelle)
10. Large Scale Language Assessments (Antony John Kunnan)
11. Criteria for Evaluating Language Quality (Glenn Fulcher)
12. Methods of Test Validation (Xiaoming Xi)
13. Utilizing Qualitative Methods for Assessment (Anne Lazaraton)
14. Utilizing Psychometric Methods in Assessment (Micheline Chalhoub-Deville and Craig Deville)
15. Training in Language Assessment (Margaret E. Malone)
16. Using Corpora for Language Assessment (Lynda Taylor and Fiona Barker)

Section 3: Assessment in Education

17. Classroom-based Language Assessment (Pauline Rea-Dickins)
18. Dynamic Assessment (James P. Lantolf and Matthew E. Poehner)
19. Language Assessment Culture (Ofra Inbar-Lourie)
20. Assessing Second/Additional Language of Diverse Populations (Constant Leung and Jo Lewkowicz)
21. Assessment in Indigenous Language Programmes (Cath Rau)
22. Utilizing Accommodations in Assessment (Jamal Abedi)
23. Washback, Impact and Consequences (Liying Cheng)
24. Educational Reform and Language Testing (Geoff Brindley)
25. Assessing the Language of Young Learners (Alison L. Bailey)

Section 4: Assessment in Society

26. High-Stakes Tests as de facto Policies (Kate Menken)
27. The Socio-political and Power Dimensions of Tests (Tim McNamara)
28. Ethics, Professionalism, Rights and Codes (Alan Davies)
29. Language Assessment in Historical and Future Perspective (Bernard Spolsky)

Subject Index
Name Index
Tables of Contents: Volumes 1–10

E. Shohamy and N. H. Hornberger (eds), Encyclopedia of Language and Education, 2nd Edition, Volume 7: Language Testing and Assessment, v–vii. © 2008 Springer Science+Business Media LLC.

NANCY H. HORNBERGER

GENERAL EDITOR’S INTRODUCTION¹

¹ This introduction is based on, and takes inspiration from, David Corson’s general editor’s Introduction to the First Edition (Kluwer, 1997).

ENCYCLOPEDIA OF LANGUAGE AND EDUCATION

This is one of ten volumes of the Encyclopedia of Language and Education published by Springer. The Encyclopedia bears testimony to the dynamism and evolution of the language and education field, as it confronts the ever-burgeoning and irrepressible linguistic diversity and ongoing pressures and expectations placed on education around the world. The publication of this work charts the deepening and broadening of the field of language and education since the 1997 publication of the first Encyclopedia. It also confirms the vision of David Corson, general editor of the first edition, who hailed the international and interdisciplinary significance and cohesion of the field. These trademark characteristics are evident in every volume and chapter of the present Encyclopedia.

In the selection of topics and contributors, the Encyclopedia seeks to reflect the depth of disciplinary knowledge, breadth of interdisciplinary perspective, and diversity of sociogeographic experience in our field. Language socialization and language ecology have been added to the original eight volume topics, reflecting these growing emphases in language education theory, research, and practice, alongside the enduring emphases on language policy, literacies, discourse, language acquisition, bilingual education, knowledge about language, language testing, and research methods. Throughout all the volumes, there is greater inclusion of scholarly contributions from non-English speaking and non-Western parts of the world, providing truly global coverage of the issues in the field. Furthermore, we have sought to integrate these voices more fully into the whole, rather than as special cases or international perspectives in separate sections.

This interdisciplinary and internationalizing impetus has been immeasurably enhanced by the advice and support of the editorial advisory board members, several of whom served as volume editors in the Encyclopedia’s first edition (designated here with *), and all of whom I acknowledge here with gratitude: Neville Alexander (South Africa), Colin Baker (Wales), Marilda Cavalcanti (Brazil), Caroline Clapham* (Britain), Bronwyn Davies* (Australia), Viv Edwards* (Britain), Frederick Erickson (USA), Joseph Lo Bianco (Australia), Luis Enrique Lopez (Bolivia and Peru), Allan Luke (Singapore and Australia), Tove Skutnabb-Kangas (Denmark), Bernard Spolsky (Israel), G. Richard Tucker* (USA), Leo van Lier* (USA), Terrence G. Wiley (USA), Ruth Wodak* (Austria), and Ana Celia Zentella (USA).

In conceptualizing an encyclopedic approach to a field, there is always the challenge of the hierarchical structure of themes, topics, and subjects to be covered. In this Encyclopedia of Language and Education, the stated topics in each volume’s table of contents are complemented by several cross-cutting thematic strands recurring across the volumes, including the classroom/pedagogic side of language and education; issues of identity in language and education; language ideology and education; computer technology and language education; and language rights in relation to education.

The volume editors’ disciplinary and interdisciplinary academic interests and their international areas of expertise also reflect the depth and breadth of the language and education field. As principal volume editor for Volume 1, Stephen May brings academic interests in the sociology of language and language education policy, arising from his work in Britain, North America, and New Zealand. For Volume 2, Brian Street approaches language and education as a social and cultural anthropologist and critical literacy theorist, drawing on his work in Iran, Britain, and around the world. For Volume 3, Marilyn Martin-Jones and Anne-Marie de Mejía bring combined perspectives as applied and educational linguists, working primarily in Britain and Latin America, respectively. For Volume 4, Nelleke Van Deusen-Scholl has academic interests in linguistics and sociolinguistics, and has worked primarily in the Netherlands and the USA.
Jim Cummins, principal volume editor for Volume 5 of both the first and second editions of the Encyclopedia, has interests in the psychology of language, critical applied linguistics, and language policy, informed by his work in Canada, the USA, and internationally. For Volume 6, Jasone Cenoz has academic interests in applied linguistics and language acquisition, drawing from her work in the Basque Country, Spain, and Europe. Elana Shohamy, principal volume editor for Volume 7, approaches language and education as an applied linguist with interests in critical language policy, language testing and measurement, and her own work based primarily in Israel and the USA. For Volume 8, Patricia Duff has interests in applied linguistics and sociolinguistics, and has worked primarily in North America, East Asia, and Central Europe. Volume editors for Volume 9, Angela Creese and Peter Martin, draw on their academic interests in educational linguistics and linguistic ethnography, and their research in Britain and Southeast Asia. And for Volume 10, Kendall A. King has academic interests in sociolinguistics and educational linguistics, with work in Ecuador, Sweden, and the USA. Francis Hult, editorial assistant for the Encyclopedia, has academic interests in educational and applied linguistics and educational language policy, and has worked in Sweden and the USA. Finally, as general editor, I have interests in anthropological linguistics, educational linguistics, and language policy, with work in Latin America, the USA, and internationally. Beyond our specific academic interests, all of us editors, and the contributors to the Encyclopedia, share a commitment to the practice and theory of education, critically informed by research and strategically directed toward addressing unsound or unjust language education policies and practices wherever they are found.
Each of the ten volumes presents core information and is international in scope, as well as diverse in the populations it covers. Each volume addresses a single subject area and provides 23–30 state-of-the-art chapters of the literature on that subject. Together, the chapters aim to comprehensively cover the subject. The volumes, edited by international experts in their respective topics, were designed and developed in close collaboration with the general editor of the Encyclopedia, who is a co-editor of each volume as well as general editor of the whole work. Each chapter is written by one or more experts on the topic, consists of about 4,000 words of text, and generally follows a similar structure. A list of references to key works supplements the authoritative information that the chapter contains. Many contributors survey early developments, major contributions, work in progress, problems and difficulties, and future directions. The aim of the chapters, and of the Encyclopedia as a whole, is to give readers access to the international literature and research on the broad diversity of topics that make up the field.

The Encyclopedia is a necessary reference set for every university and college library in the world that serves a faculty or school of education. The Encyclopedia aims to speak to a prospective readership that is multinational, and to do so as unambiguously as possible. Because each book-size volume deals with a discrete and important subject in language and education, these state-of-the-art volumes also offer highly authoritative course textbooks in the areas suggested by their titles. The scholars contributing to the Encyclopedia hail from all continents of our globe and from 41 countries; they represent a great diversity of linguistic, cultural, and disciplinary traditions.
For all that, what is most impressive about the contributions gathered here is the unity of purpose and outlook they express with regard to the central role of language as both vehicle and mediator of educational processes and to the need for continued and deepening research into the limits and possibilities that implies.

Nancy H. Hornberger

ELANA SHOHAMY

INTRODUCTION TO VOLUME 7: LANGUAGE TESTING AND ASSESSMENT

This volume addresses the broad theme and specific topics associated with current thinking in the field of language testing and assessment. Interdisciplinary in nature, the field of language testing and assessment builds on theories and definitions provided by linguistics, applied linguistics, language acquisition and language teaching, as well as on the disciplines of testing, measurement and evaluation. Language testing uses these disciplines as foundations for researching, theorizing and constructing valid language tools for assessing and judging the quality of language.

The field of language testing is therefore viewed as consisting of two major components: one focusing on the ‘what’, referring to the constructs that need to be assessed (also known as ‘the trait’); and the other component pertaining to the ‘how’ (also known as ‘the method’), which addresses the specific procedures and strategies used for assessing the ‘what’. Traditionally, ‘the trait’ has been defined by the language testing field; these definitions have provided the essential elements for creating language tests. The ‘how’, on the other hand, is derived mostly from the field of testing and measurement which has, over the years, developed a broad body of theories, research, techniques and practices about testing and assessment. Language testers incorporated these two areas to create the discipline of language testing and assessment, a field which includes theories, research and applications; it has its own research publications, conferences and two major journals, Language Testing and Language Assessment Quarterly, where many of these publications appear.

An examination of the developments in the language testing and assessment field since the 1960s reveals that its theories and practices have always been closely related to definitions of language proficiency.
Matching the ‘how’ of testing with the ‘what’ of language uncovers several periods in the development of the field, with each one instantiating different notions of language knowledge along with specific measurement procedures that go with them. Thus, discrete-point testing viewed language as consisting of lexical and structural items so that the language tests of that era presented isolated items in objective testing procedures. In the integrative era, language tests tapped integrated and discoursal language; in the communicative era, tests aimed to replicate interactions among language users utilizing authentic oral and written texts; and in the performance testing era, language users were expected to perform tasks taken from ‘real life’ contexts. Alternative assessment was a way of responding to the realization that language knowledge is a complex phenomenon, which no single procedure can be expected to capture. Assessing language knowledge therefore requires multiple and varied procedures that complement one another.

While we have come to accept the centrality of the ‘what’ to the ‘how’ trajectory for the development of tests, extensive work in the past decade points to a less overt but highly influential dynamic in another direction. This dynamic has to do with the pivotal roles that tests play in societies in shaping the definitions of language, in affecting learning and teaching, and in maintaining and creating social classes. This means that contemporary assessment research perceives its obligations as being to examine the close relationship between methods and traits in broader contexts and to focus on how language tests interact with societal factors, given their enormous power. In other words, as language testers seek to develop and design methods and procedures for assessment (the ‘how’) they become mindful not only of the emerging insights regarding the trait (the ‘what’), and its multiple facets and dimensions, but also of the societal role that language tests play, the power that they hold, and their central functions in education, politics and society.
In terms of the interaction of society and language, it is evident that changes are currently occurring in the broader contexts and spaces in which language testing takes place. It is now recognized that language testing occurs not in homogeneous, uniform and isolated contexts but, rather, in diverse, multilingual and multicultural societies, which poses new challenges and questions with regard to what it means to know language(s) in education and society. For example, different meanings of language knowledge may be associated with learning foreign languages, second languages, language by immersion, heritage languages, languages of immigrants arriving in new places with no knowledge of the new languages, and the languages of those defined as ‘trans-nationals’. Knowing the English language, the current world’s lingua franca, is different from knowing other languages. Similarly, the language of classrooms and schools may be different from that of the workplaces or communities where bi- or multi-lingual patterns are the norm. Each of these contexts may require different and varied theories of language knowledge and hence different definitions, applications and methods of measuring these proficiencies.

In other words, the languages currently being used in different societies in different contexts no longer represent uniform constructs as these vary from one place to another, from one context to another, creating different language patterns, expectations and goals, and often resulting in hybrids and fusions, especially with regard to English. Such dynamic linguistic phenomena pose challenging problems to language testers. What is the language (or languages) that needs to be assessed? Where can it be observed in the best ways? Is it different at home, in schools, in classrooms and in the workplace? Should hybrids and fusions be assessed and how? Can levels of languages even be defined?
How should language proficiency be reported and to whom? What is ‘good language’? Does such a term even apply? Who should decide how tests should be used? Do testers have an obligation to express their views about language and testing policy? What is the responsibility of testers to language learning and language use in classrooms and communities? How can ethical and professional behaviours with regard to tests be maintained? These are some of the questions that language testers are currently preoccupied with.

Language testers are not technicians who just invent better and more sophisticated testing tools. Rather, they are constantly in search of and concerned with the ‘what’ and its complex meanings. Going beyond ‘general testing’, the unique aspect of language testing is that it is an integral part of a defined discipline, that of ‘language’. In this respect, language testers and the field of language testing are different from the field of ‘general testing’ in that language testers are confined to a specific discipline and are therefore in constant need of asking such language-related questions as listed above in order to develop valid language assessment tools. The concern of language testers in the past decade about the use of tests and their political, social, educational and ethical dimensions has made the field even more complex and uncertain and in need of new questions and debates. The current era can be described as the era of uncertainty, where questions are being raised about the meaning of language and the possibilities for measuring this complex and dynamic variable. At the same time, it is an era of an ever more compelling need to ensure that these tests are reliable and valid, where validity includes the protection and guarding of the personal rights of others, as well as positive washback on learning by addressing the diverse communities in which the tests are used.
Thus, the current era is not only concerned with a broader and more complex view of what it means to know a language, or with innovative methods of testing and assessment of complex constructs, but also with how these tests can be more inclusive, democratic, just, open, fair and equal and less biased. Even within the use of traditional large-scale testing, the field is asking questions about tests’ use: Why test? Who benefits, who loses? What are the impacts on, and consequences for, definitions of language in relation to people, education, language policy, and society? Tests are not viewed as innocent tools, but rather as instruments that play central roles for people, education and societies. Language testers, therefore, are asked to deal with broader issues: to examine the uses of tests in the complex multilingual and multicultural societies where tests are used, not only as naïve measurement tools, but also as powerful educational, societal and political devices.

This is the conceptual premise of this volume of the Encyclopedia of Language and Education on Language Testing and Assessment. It aims to cover (and uncover) the multiple versions and perspectives of the ‘what’ of languages along with the multiple approaches developed for assessment of the ‘what’, especially given the multiplicity of languages used by many diverse groups of learners in many different contexts. It aims to focus on the societal roles of language testers and their responsibility to be socially accountable and to ensure ethicality and professionalism. A special focus is given in this volume to the multilingual and diverse contexts in which language testing and assessment are currently anchored, and the difficult task of ‘doing testing’ in this complex day and age.

Accordingly, the first part of the volume addresses the ‘what’ of language testing and assessment.
It no longer divides language into neat and clear-cut skills of reading, writing, speaking and listening, but rather examines the ‘what’ of language in the diverse contexts in which it is used. Rather than proposing one uniform way of defining the language construct, the chapters in Part 1 present language from multiple perspectives. It begins with a chapter by Alister Cumming who reviews research and practices of language assessment from the perspectives of oral and literate modes of communication and their meanings in relation to language competencies, language learning and multimodalities. He notes that language assessment needs to be informed and extended by multiple forms of evidence in relation to educational purposes as well as diverse societies. Rama Mathew surveys developments in language assessment from the perspective of multilingual competencies as manifested in the case of India. She highlights the legitimacy of a multilingual reality in many societies nowadays, and emphasises the need to answer this call for different ways of thinking about language assessments. This is demonstrated through a survey of multilingual and multi-dialectical tests for assessing English. She then raises a number of assessment issues that emerge in these complex realities. Heidi Byrnes focusses on the role of ‘content’ as part of language proficiency as it is closely embedded with language. By using a Hallidayan approach to texts and knowledge, she shows how assessment can be interpreted as part of a set of sophisticated text meanings as well as part of knowledge that is relevant to handling content and granting differing priority to various elements of texts and their contribution to content. These approaches are anchored within , mainstream L2 curricula and second language literacy of diverse professional contexts, needed for ‘global literacy’.
Jim Purpura applies the ‘Communicative Language Ability’ framework to the task of defining language and uses it as the basis for test development. By surveying the different theoretical models (and the tests developed based on them), he argues that these models represent targets of assessment that can be adapted for a range of test purposes and contexts, which consist of both grammatical and pragmatic knowledge. Accordingly, tests which are developed based on such models can help to better understand the components underlying communicative language ability, and can also help to provide useful diagnostic information to learners. Kieran O’Loughlin examines language from the angle of the workplace, focussing on language as related to the occupational purposes of professional duties. He provides a review of historical and current practices of performance-based tests related to ‘real world’ functions and tasks in a number of professional areas. At the same time, he is sceptical of the future of these tests, given the spread of large-scale standardized tests. Stansfield and Winke provide a somewhat different perspective of the language construct by re-visiting language aptitude. They re-define language aptitude by expanding its meaning to include second language learning aspects such as the diagnosis and treatment of L2 learning problems in order to inform curricular design and to examine the relationship between working memory and L2 learning across a range of cognitive abilities. They survey the types of aptitude tests that are in line with these new theoretical constructs and raise questions about the validity of these tests and their uses.

Together, these six chapters provide multiple perspectives of the language construct and assessment practices associated with it. As these chapters demonstrate, definitions of language cannot be detached from the diverse contexts in which they are used.
The second part of the volume addresses the diverse methodological issues that language testers face in assessing the complex construct of language: that is, the ‘how’. These chapters demonstrate the sophisticated issues and deliberations as well as specific procedures used for assessing language. In the first chapter, Janna Fox reviews the developments in, and outlines the procedures of, alternative assessment. She expands the theoretical perspective not only by providing a longer list of ‘alternatives’, but also by asking whether alternative assessment represents a real paradigm shift or just additional procedures that actually preserve traditional methods of testing. She then expands the notion by incorporating different ways of thinking about testing in alternative modes, including accommodations, dynamic assessment and ethical, democratic, and equitable values. One of the dominant cases of alternative assessment is that of task and performance, issues that Jill Wigglesworth reviews in a chapter which focusses on the tasks designed to measure learners’ productive language skills through performances related to real world contexts (e.g. the workplace). She surveys the vast research literature on this topic, demonstrating the value of certain performance tests, the effect on task quality of certain variables, such as difficulty levels, cognitive demands, type of discourse they produce, as well as the extent to which they indeed represent ‘real life’. Continuing the discussion of the variety of possible assessment methods, Carol Chapelle delineates the new and current methods of utilizing technology in language assessment (i.e. Computer-Assisted Testing (CAT)) by reviewing tests using microcomputers and the Internet, and analysing them not only in terms of their greater efficiency but also in terms of the serious problems that they pose.
She surveys research on multimedia testing and its effects on learners in relation to specific skills such as listening, natural language processing, and written and spoken language. Issues of cost, training, access to infrastructure, and the intersections with construct validity are brought up, along with the question of whether computerized testing has been evolutionary or revolutionary.

While the debates on the appropriate methods of assessment are taking place, large-scale testing continues to be administered with even more force than ever before by governments and educational systems worldwide. In schools, tests are used for diagnostic purposes and to monitor students’ progress (through standardized tests); at college and university levels, tests are used for the screening and selection of applicants. Antony Kunnan discusses these issues and raises questions about the advantages of uniformity of tests for the sake of fairness. He reviews the history of large-scale testing and provides safeguards for fairness in the form of descriptive test information, codes of practice, test design and psychometric qualities.

Criteria for language assessment, such as the Common European Framework, have been receiving major attention and gaining dominance over the past decade, especially with regard to their effects on the definitions of language and language policy. Glenn Fulcher provides a comprehensive description of the methods used for examining the quality of language via rating scales, standards, benchmarks, band levels, frameworks and guidelines. He shows the advantages and disadvantages of these tools in terms of validity of progression, equivalence across languages, hierarchies, false claims and their effects on definitions of language beyond serving as criteria for language evaluation.
The field of psychometrics has gone through major changes as it has attempted to accommodate the more complex tests and tasks so that they will pass criteria of reliability, validity and ethicality. The chapter by Xiaoming Xi provides a comprehensive examination of these issues and updated methods of test validation. She shows how advances in validity benefit from progress in educational measurement, psychometrics and statistics, qualitative methods, discourse analysis, cognitive psychology as well as introspective methods about tasks’ complexity. Anne Lazaraton introduces new ways of utilizing qualitative methods for designing, describing and validating language tests, a topic that is gaining acceptance and legitimacy within the field of language testing, especially given the limitations of traditional statistical methods. She demonstrates how qualitative methods can provide indication of the quality of tests both on the process and the product levels. The chapter co-authored by Micheline Chalhoub-Deville and Craig Deville examines the common psychometric methods that are used in the field through a review and analysis of language testing research as reported in highly regarded testing journals. They show the multiple and varied methods which are used in testing research. Margaret Malone introduces the topic of training and teaching about language testing given the vast amount of knowledge available today so that testers can make informed decisions throughout the assessment process about test development, scoring, interpretation, selection and administration of tests. She introduces the term ‘assessment literacy’ to refer to the required knowledge about testing and its multiple interpretations. Another new topic relates to the emerging field of corpus linguistics.
In their chapter, Lynda Taylor and Fiona Barker illustrate how the field of corpus linguistics has become an important and relevant source of accurate language data which is useful for constructing tests based on scientific and empirical language documentation. Together, the chapters in Part 2 present multiple methods of language assessment while responding to current changes in the definitions of language.

Part 3 of the volume addresses issues of language testing as they are embedded in educational systems and contexts, where language tests are so widely used. It is in the educational system that tests and various assessment methods serve as major tools for assessing language for learning and teaching, for making decisions about programmes, teachers and learners, and finally for creating changes that lead to school reforms and that bring intended and unintended washback in classrooms and schools. Pauline Rea-Dickins opens Part 3 with a chapter focusing on classroom assessment, an area which is rather overlooked in relation to external high-stakes testing. She makes the distinction between assessment of learning (that is focused on achievement and summative in orientation) and assessment as learning (formative in its purpose, providing feedback to learners so that they can improve their learning). She points to the ample progress in the latter area in the past decade and surveys studies of the many uses of different assessment tools in the classroom for feedback and effective instructional methods.

Another new topic receiving recognition recently is that of Dynamic Assessment. Lantolf and Poehner introduce the topic in the context of language testing by applying Vygotsky’s sociocultural theories to show how testing and learning are closely connected. This approach leads to effective learning through testing, as it is revealed that tests are embedded in learning and can therefore also contribute to its improvement.
Another new development is the increased attention to assessment as part of effective learning and teaching in schools. Ofra Inbar-Lourie reviews studies that address the topic of ‘testing culture’, showing how the use of ongoing assessment in schools is an integral part of effective and beneficial learning, as well as of school organization.

It is the realization that current schools are diverse in terms of students’ languages and cultural backgrounds that has led to different assessment approaches, especially with regard to immigrants and indigenous populations. Using tests in the dominant school languages poses great difficulties for these students, who are engaged in language acquisition while attempting to acquire school contents. Several chapters in Part 3 address these issues. Constant Leung and Jo Lewkowicz provide an overview of the types of assessment procedures used in diverse multilingual and plurilingual communities in the context of second language assessment designed to measure language development of linguistic minority students where another language is the majority language. They note that new developments in this area are indicative of more progressive views, which recognize the multifaceted value of language proficiency. Cath Rau discusses assessment strategies for indigenous populations in schools in places where indigenous groups make up a big part of the population, as in the case of the Māori and other groups in New Zealand. She surveys descriptions of a number of strategies used to practice testing in fairer ways, incorporating existing language knowledge.

Test accommodations refer to strategies used for language learners to assess their content knowledge while compensating for lack of language knowledge in order to create fairer testing conditions for those for whom the language of assessment is their second language.
Jamal Abedi reviews the extensive research that has been conducted in the past decade on the topic, examining effective accommodations for language learners, mostly in the context of English language learners in the USA. He presents research evidence about different types of accommodations in content areas such as Mathematics, while being critical of the uses of some accommodations that have no empirical basis.

Issues of washback and impact of large-scale testing on teaching and learning have stimulated ample research and writings in the past decade. The chapter by Liying Cheng surveys the large number of empirical studies that have documented the effects and impacts that tests have on learning, teaching and curriculum development. It is evident that test washback is considered nowadays as an integral part of construct validity since it is incorporated in developments of large-scale tests. Geoff Brindley demonstrates how language tests are used by governments to reform educational systems, pointing to serious problems related to the practice of relying exclusively on tests for educational reform. Alison Bailey addresses methods and techniques used for assessing the language of young learners in schools, pointing to the different strategies of these kinds of tests compared with those used for adults. This topic is gaining major attention nowadays with the growing number of young learners of English worldwide. Taken together, the chapters in Part 3 cover a wide range of topics related to broad issues of language assessment in education, especially amidst the changing realities of school demographics with regard to diverse populations and the use of tests in bringing about educational reform.

The fourth and concluding part of the volume addresses societal, political, professional and ethical dimensions of tests, a topic that has been a major concern in the language testing field over the past decade. Each of the four chapters addresses different aspects of these dimensions. The chapter by Kate Menken illustrates how national language tests, especially those administered by government initiatives (e.g. the No Child Left Behind mandate in the US), affect language policy in schools and societies and deliver direct messages about the significance and insignificance of certain languages and language instruction policies.
She shows how language testing and language policy are closely connected, arguing that language tests have a greater effect than is apparent on the surface. This is especially relevant in contexts that include learners for whom the language of the test affects their ability to perform academically. Tim McNamara explains the need for taking into account the social and political dimensions of tests versus the structuralist and psychometric dimensions which have previously dominated academic discussions around language testing. In his chapter, he surveys various social theories of linguistics and their important input into the field of language testing, with special attention to the work of Messick, who described the values and consequences of tests as part of construct validity. He surveys studies and cases where language tests are used unjustly, such as in determining citizenship, employment and the status of asylum seekers. Alan Davies, who has written extensively on the ethical dimensions of tests and especially on the professional aspects related to ethicality, addresses these issues by covering the developments in the language testing field, showing how the Code of Ethics and Code of Practice, developed by the language testing profession via the International Language Testing Association (ILTA), can lead to the more ethical use of tests. He warns against the use of such codes as face-saving devices, an action which, Davies argues, overlooks the real commitment to ethics that is instrumental for the profession itself, for its stakeholders and for the rights of individual test-takers.

The final chapter, by Bernard Spolsky, examines the past, present and future of the field of language testing, providing guidance and direction for future vision.
He surveys the history of the field with its advances as well as the ample questions and uncertainties that emerge and that need to be addressed in the future, while pointing to the contradictions, problems and difficulties of measuring and assessing such a complex construct as ‘language’. He ends the chapter by stating that he remains sceptical given the role of industrial test-makers in computerizing tests and in reducing multidimensional profiles into uniform scales, and also given that educational systems continue to interpret test scores as if they are meaningful. At the same time, he expects the quality research that has been conducted in the field of language testing to continue, especially that which has been conducted in relation to the ‘nature’ of language proficiency and the diverse approaches to assessing it in defined social contexts.

I would like to thank each and every author of these chapters, which together make up a most valuable contribution to current thinking in the field of language testing and applied linguistics. The authors selected to write these chapters are among the most distinguished scholars and leaders in the field of language testing. The chapters herein reveal that the language testing field is dynamic, striving and vital. It is clear from these chapters that the field of language testing raises important and deep questions and does not overlook problems, difficulties, contradictions, malpractices and new societal realities and needs. While language testing is viewed by some as a technical field, this volume convincingly demonstrates that language testing and assessment is above all a scholarly and intellectual field that touches the essence of languages and their meanings. The need to get engaged in testing and assessment forces testers to face these issues head-on and attempt to deliberate on creative and thoughtful solutions.
Finally, special and deep personal thanks go to Caroline Clapham, who in her 1997 volume on Language Testing in the first edition of the Encyclopedia of Language and Education laid the foundations for the field in such insightful and thorough ways that it has now been possible to expand on them and create this very comprehensive and stimulating volume.

Elana Shohamy

CONTRIBUTORS

VOLUME 7: LANGUAGE TESTING AND ASSESSMENT

Jamal Abedi
University of California, School of Education, Davis, USA

Alison L. Bailey
University of California, Department of Education, Los Angeles, USA

Fiona Barker
University of Cambridge, ESOL Examinations, Cambridge, UK

Geoff Brindley
Macquarie University, Department of Linguistics, Sydney, Australia

Heidi Byrnes
Georgetown University, McLean, USA

Micheline Chalhoub-Deville
University of North Carolina, Educational Research Methodology, Greensboro, USA

Carol A. Chapelle
Iowa State University, Ames, USA

Liying Cheng
Queen’s University, Faculty of Education, Kingston, Canada

Alister Cumming
University of Toronto, Ontario Institute for Studies in Education, Toronto, Canada

Alan Davies
University of Edinburgh, Department of Theoretical and Applied Linguistics, Scotland, UK

Craig Deville
University of North Carolina, Center for Educational Research & Evaluation, Greensboro, USA

Janna Fox
Carleton University, School of Linguistics and Applied Language Studies, Ottawa, Canada

Glenn Fulcher
University of Leicester, School of Education, Leicester, UK

Ofra Inbar-Lourie
Tel Aviv University, School of Education, Tel Aviv, Israel

Antony John Kunnan
California State University, Charter College of Education, Los Angeles, USA

James P. Lantolf
Pennsylvania State University, Center for Language Acquisition, University Park, USA

Anne Lazaraton
University of Minnesota, ESL/ILES, Minneapolis, USA

Constant Leung
King’s College, Department of Education and Professional Studies, London, UK

Jo Lewkowicz
American University of Armenia, Department of English Programs, Yerevan, Armenia

Margaret E. Malone
Center for Applied Linguistics, Washington DC, USA

Rama Mathew
Delhi University, Department of Education, Delhi, India

Tim McNamara
The University of Melbourne, Victoria, Australia

Kate Menken
City University of New York, Graduate Center/Queens College, Linguistic Department, New York, USA

E. Shohamy and N. H. Hornberger (eds), Encyclopedia of Language and Education, 2nd Edition, Volume 7: Language Testing and Assessment, xxiii–xxiv. © 2008 Springer Science+Business Media LLC.

Kieran O’Loughlin
University of Melbourne, Faculty of Education, Victoria, Australia

Matthew E. Poehner
The Pennsylvania State University, Linguistics and Applied Language Studies, University Park, USA

James E. Purpura
Columbia University, Teachers College, New York, USA

Cath Rau
Ngāti Pūkeko, Ngāti Awa, Tūhoe and Kia Ata Mai Educational Trust, Ngaruawahia, New Zealand

Pauline Rea-Dickins
University of Bristol, Graduate School of Education, Bristol, UK

Bernard Spolsky
Bar-Ilan University, Jerusalem, Israel

Charles W. Stansfield
Second Language Testing Inc., Rockville, USA

Lynda Taylor
University of Cambridge, ESOL Examinations, Cambridge, UK

Gillian Wigglesworth
University of Melbourne, School of Languages, Victoria, Australia

Paula M. Winke
Michigan State University, Second Language Studies Program, East Lansing, USA

Xiaoming Xi
Center for Validity Research, Research & Development Division, Educational Testing Service, Princeton, USA

REVIEWERS

VOLUME 7: LANGUAGE TESTING AND ASSESSMENT

Lyle F. Bachman
Geoff Brindley
Annie Brown
Jeff Connor-Linton
Sara Cushing Weigle
Fred Davidson
Alan Davies
Cathie Elder
Bruce Evans
Janna Fox
Glenn Fulcher
Liz Hamp-Lyons
Nancy H. Hornberger
Francis M. Hult
Ofra Inbar-Lourie
Antony Kunnan
Constant Leung
Mike McCarthy
Tim McNamara
Trent Newman
Harold Ormsby
James E. Purpura
Charlene Rivera
Aliza Sacknovitz
Elana Shohamy
Carolyn Turner
Mari Wesche