ARI HUHTA

INNOVATIONS IN DIAGNOSTIC ASSESSMENT AND FEEDBACK: AN ANALYSIS OF THE USEFULNESS OF THE DIALANG LANGUAGE ASSESSMENT SYSTEM

Ari Huhta

Innovations in diagnostic assessment and feedback: An analysis of the usefulness of the DIALANG language assessment system

Esitetään Jyväskylän yliopiston humanistisen tiedekunnan suostumuksella julkisesti tarkastettavaksi yliopiston Villa Rana –rakennuksen Paulaharjun salissa helmikuun 27. päivänä 2010 kello 12.

Academic dissertation to be publicly discussed, by permission of the Faculty of Humanities of the University of Jyväskylä, in the Villa Rana Building, Paulaharju Auditorium, on February 27th, 2010, at 12 o'clock noon.

ABSTRACT

The thesis analyses the quality of DIALANG, a multilingual language assessment system that provides learners with immediate, on-line feedback about the strengths and weaknesses in their proficiency. Some of the evidence for its quality is provided by the present author in the articles appended to this thesis; other evidence comes from a range of studies summarized in the synthesis part of the thesis. The quality evidence falls into two categories: theoretical arguments presented in the design documents of DIALANG, and empirical evidence collected in the piloting and operational stages of the system. All this evidence is presented and analysed with reference to Bachman and Palmer's (1996) framework of test usefulness, which uses usefulness as the superordinate term characterizing the quality of a test or assessment system. According to Bachman and Palmer, several factors contribute to the usefulness of a test for its intended purpose(s): reliability, construct validity, authenticity, interactiveness, impact, and practicality. In principle, the better the quality of each of these aspects, the more useful the test. The available evidence does not allow us to obtain a balanced picture of all the aspects of test usefulness for DIALANG, nor are all its languages equally covered. The most comprehensive evidence concerns impact and construct validity. Overall, the system appears reasonably useful for purposes that concern individual learners and their teachers, such as finding out the learners' level of proficiency and their profile of strengths and weaknesses. A survey of over 550 DIALANG users by the author showed that a significant majority of the informants reacted positively to the system and found the feedback they received useful. User statistics show that hundreds of thousands of learners have taken DIALANG tests or visited its website, and even more must have been informed about the system by their teachers and institutions. The impact of DIALANG is greatest in the Netherlands, Germany, France, and Finland. Many institutions use the system for placing their students on language courses even though it was not developed for that purpose and thus lacks many features that would be useful in a proper placement test. The evidence also indicates certain problems, most notably with the placement test and with some technical aspects of the system, at least for some learners and institutions. In the longer term, the most important impact of DIALANG is probably its scientific impact on applied linguistics: it brought to researchers' attention how little we in fact know about diagnosing learning, which has spurred a considerable amount of new, interesting research into the diagnosis of language learning.

Keywords: language assessment, diagnosis of foreign language proficiency, diagnosis of foreign language learning, feedback, computerised testing, self-assessment


Author's address  Ari Huhta, Centre for Applied Language Studies, University of Jyväskylä, Box 35, 40014 University of Jyväskylä, Finland, [email protected]

Supervisors  Prof. Hannele Dufva, Department of Languages, University of Jyväskylä

Reviewers  Prof. emeritus Mats Oscarson, Institutionen för pedagogik och didaktik, Göteborgs Universitet, Sweden

Prof. Angela Hasselgreen, Senter for utdanningsforskning, Høgskolen i Bergen / Universitetet i Bergen, Norway

Opponents  Prof. emeritus Mats Oscarson, Institutionen för pedagogik och didaktik, Göteborgs Universitet, Sweden

PREFACE

This work is based on the DIALANG project (1996-2004), in which my own department, the Centre for Applied Language Studies (CALS), was involved, first as the coordinator and then as one of the four core project partners. I had the good fortune to work in the project in many different roles, as described in Appendix 2 of the synthesis part of this thesis. None of us who were involved in DIALANG at its early stages could foresee how long the project would last, nor how laborious and, sometimes, how frustrating the whole venture would be. However, designing this multilingual assessment system also turned out to be highly stimulating and rewarding for those participating in its development, as we had to face and solve a number of truly interesting and challenging problems along the way. Although we were able to tackle many questions about diagnosing strengths and weaknesses in language learners' proficiency, ironically, the most important lesson we learned from the work on DIALANG was how little we in fact know about diagnosing second / foreign language proficiency and learning. Hopefully this realization will lead to more research into diagnosing proficiency so that if anything like DIALANG gets designed in the future, the result will be something significantly more useful to the individual learner than DIALANG can be.

A considerable amount of research was carried out during the development of DIALANG to address the challenges facing the project, and studies on the system have continued after the project came to an end in 2004. Much of the earlier research is reported in Alderson's (2005) book, which brought our lack of knowledge on diagnosing second / foreign language learning to the attention of the language testing world. There is, however, no comprehensive account of the different kinds of studies that have analysed different features of DIALANG or that have used DIALANG as a tool to study other things and have, as a byproduct, yielded interesting information about its quality. The synthesis part of my thesis is an attempt to bring together as much information as possible about how DIALANG has succeeded (or, in some cases, failed) in meeting different expectations and in fulfilling the purposes it was created for.

No study is possible without the professional and other support of many people. In addition to being lucky enough to participate in the design of DIALANG, I have been fortunate to work in a very stimulating and supportive environment. I would like to warmly thank my former and present colleagues in the CALS and its predecessor, the Language Centre for Finnish Universities, and also elsewhere at the University of Jyväskylä as well as in several other universities and organizations in Finland and abroad, for countless meetings, discussions, and other exchanges of ideas that have directly and indirectly contributed to my work on DIALANG and everything else that I have done during my career.

I would like to single out three persons who have profoundly influenced not only DIALANG but also my study of that system, as well as my own professional development in general: the late Professor and Director of CALS Kari Sajavaara, Professor emeritus Sauli Takala, and Professor J. Charles Alderson. I was first a student of Kari Sajavaara and Sauli Takala in the 1980s and then worked with them for about 15 years in numerous projects from the late 1980s until their retirement in the early 2000s. Our joint projects included, for example, the development of a national test for business and administration purposes (the TKD or Työelämän kielidiplomi), the National Certificates examination system, the IEA Study, and, finally, DIALANG. Our co-operation continued even after their retirement, although, naturally, on a much smaller scale. My co-operation with Charles Alderson also goes back a long time: it started with an international survey of standards in language testing in the mid-1990s and was followed by other international projects, the most important of which was obviously DIALANG. Our collaboration continues in a research project aiming to shed more light on the diagnosis of second / foreign language learning. I owe Kari, Sauli and Charles much of what I know about applied linguistics, language test and research design, international networking, and project management.

I am indebted to a large number of people who helped me in the collection of the data for my own empirical studies on DIALANG. I am particularly grateful to the 557 language learners in Finland and Germany who took the trouble to fill in the extensive questionnaire that I used in my data collection. Some of them also consented to be interviewed by me. Equally important were the language teachers in numerous institutions who were already, or became during the study, interested in using DIALANG as part of their teaching and who encouraged their students to fill in my questionnaire. Some of the teachers, too, were kind enough to let me interview them about their experiences in using the system. Another vital group in the data gathering was the roughly 30 teachers and colleagues around Finland and at the CALS who volunteered to assist in the concept mapping study of self-assessment that is partly reported in one of the articles that form my thesis. Several students who worked at the CALS as trainees at different times helped me in getting my questionnaire and interview data ready for analysis – their assistance was simply invaluable and is gratefully acknowledged here.

Without the love, support and patience of my wife Anne and my sons Otto and Riku I could not have carried out my work. They have also shown me that there is life outside work by providing opportunities to relax at home in the evenings and on our trips to Lapland and even further away. I have been very fortunate in having a wife who is an academic herself and understands what research is like and the burden it often places on the family. Perhaps the amount of paper and books lying around in our house will diminish a bit now that I am finally completing the thesis – if only I can fit them in my office!

Palokka, February 15th 2010 Ari Huhta

LIST OF ORIGINAL PUBLICATIONS

I Huhta, Ari & Figueras, Neus. 2004. Using the CEF to promote language learning through diagnostic testing. In K. Morrow (ed.) Insights from the Common European Framework. Oxford University Press. 65-76.

II Alderson, J. Charles & Huhta, Ari. 2005. The development of a suite of computer-based diagnostic tests based on the Common European Framework. Language Testing, 22, 301–320.

III Huhta, Ari. 2007a. The Vocabulary Size Placement Test in DIALANG – why do users love and hate it? In Carlsen, Cecilie & Eli Moe (eds.) A Human Touch to Language Testing. Novus Publishers. 44-57.

IV Huhta, Ari. 2007b. Itsearviointi osana kielitestiä – tutkimus käyttäjien suhtautumisesta tietokoneen antamaan palautteeseen kielitaidon itsearvioinnista (Self-assessment as part of a language test – a study on user reactions to computerised feedback on self-assessment). In Salo, Olli-Pekka, Kalaja, Paula & Nikula, Tarja (eds.) AFinLA Yearbook 2007. Publications of the Association Finlandaise de Linguistique Appliquée 65. 371-388.

V Huhta, Ari. 2008. Diagnostic and formative assessment. In Spolsky, Bernard & Hult, Francis (eds.) Handbook of Educational Linguistics. Blackwell, 469-482.

VI Huhta, Ari (submitted & being revised). Concept-mapping – an approach to understanding self-assessment.

Copies of these publications are appended to the end of this thesis.

TABLE OF CONTENTS

ABSTRACT
PREFACE
LIST OF ORIGINAL PUBLICATIONS

1. INTRODUCTION
   1.1. Organisation of arguments and evidence for the usefulness of DIALANG in this paper
   1.2. Validation before and after: the a priori and a posteriori approaches to validation

2. THE BACHMAN AND PALMER FRAMEWORK OF TEST USEFULNESS
   2.1. Purpose of DIALANG
   2.2. Summary of the Bachman and Palmer framework of test usefulness

3. ANALYSIS OF THE ARGUMENTS AND EVIDENCE FOR THE USEFULNESS OF DIALANG
   3.1 Qualifications of the DIALANG development team
   3.2 Reliability of DIALANG
      3.2.1 Theoretical / logical attempts to achieve reliability
      3.2.2 Empirical evidence for the reliability of DIALANG
      3.2.3 Reliability of self-assessment statements
   3.3 Construct validity, authenticity, and interactiveness
      3.3.1 Theoretical basis of the construct validity, authenticity, and interactiveness of DIALANG
         3.3.1.1 Diagnosis of L2 proficiency – what is it?
         3.3.1.2 Language tests
         3.3.1.3 Feedback and self-assessment
         3.3.1.4 Computer as a testing platform
      3.3.2 Empirical evidence about the construct validity, authenticity and interactiveness of DIALANG
         3.3.2.1 Evidence about reliability is also evidence about construct validity
         3.3.2.2 Concurrent, criterion-related validity evidence
            A. Internal structure of DIALANG test batteries
               a. The DIALANG pilot test study
               b. Present author's survey study of DIALANG users
               c. Satakunta Polytechnic study
               d. HISBUS English proficiency testing project
            B. Internal structure of DIALANG self-assessment instruments
            C. Relationship of DIALANG tests with external measures of language proficiency
               a. Self-assessments as concurrent validity evidence for DIALANG language tests
               b. HISBUS English proficiency testing project
               c. Satakunta Polytechnic study
               d. Present author's survey study of DIALANG users
            D. Relationship of DIALANG self-assessments with other measures of language proficiency
         3.3.2.3 Learners' reactions to DIALANG, its tests, self-assessment instruments and feedback
            A. Present author's survey study of DIALANG users
               a. Users' overall reactions to DIALANG
               b. Learners' reactions to the VSPT
               c. Learners' reactions to self-assessment and self-assessment feedback
               d. Learners' reactions to other DIALANG feedback
            B. Other studies on DIALANG users' reactions
         3.3.2.4 Other empirical evidence about DIALANG feedback
   3.4 Impact of DIALANG
      3.4.1 Impact on individual learners
      3.4.2 Impact on language teachers, schools and other educational institutions
      3.4.3 How many people are affected by DIALANG?
      3.4.4 Scientific impact
      3.4.5 Use of DIALANG as a research instrument
      3.4.6 Reviews of DIALANG for other purposes
   3.5 Practicality of DIALANG
      3.5.1 Practicality of the computerized system underlying DIALANG
      3.5.2 Practicality of the content (assessment instruments and feedback)

4. SUMMARY AND CONCLUSION
   4.1. A priori theoretical arguments for the usefulness of DIALANG
   4.2. A posteriori empirical evidence for the usefulness of DIALANG
   4.3. Overall usefulness of DIALANG

APPENDIX 1: Learners' overall views on the best and worst aspects of DIALANG
APPENDIX 2: Own contribution in the two joint articles
APPENDIX 3: Concept map of the advantages of DIALANG self-assessment based on learners' sorting
APPENDIX 4: Summary of the content of the six articles that form the thesis

References

YHTEENVETO (Summary in Finnish)

THE ORIGINAL PUBLICATIONS APPENDED TO THE SYNTHESIS PART OF THE THESIS:

I Huhta, A. & Figueras, N. 2004.

II Alderson, J. C. & Huhta, A. 2005.

III Huhta, A. 2007a.

IV Huhta, A. 2007b.

V Huhta, A. 2008.

VI Huhta, A. (submitted)


1. INTRODUCTION

This paper serves two intertwined purposes. First, it summarizes and synthesizes the content of the six articles on the DIALANG assessment system that comprise, together with this paper, the doctoral thesis of the present writer, namely Huhta & Figueras (2004), Alderson & Huhta (2005), Huhta (2007a), Huhta (2007b), Huhta (2008), and Huhta (submitted). The content of these articles is summarized in Appendix 4. However, the present paper also gives a more comprehensive overview of what we know about the usefulness of DIALANG on the basis of research by both the present author and several other researchers who have either specifically studied DIALANG or used it as a data-gathering instrument in their studies and who, as a byproduct of their research, provide us with information that sheds light on the usefulness of DIALANG. As a clarification, it ought to be mentioned at this point that the current author has carried out more research on DIALANG than is reported in the six articles listed above, and that some of that research is reported in this summary paper whenever it contributes to empirical evidence about DIALANG. The thesis as a whole intends to provide answers to the following, rather broad questions:

• What do we know about the usefulness of DIALANG on the basis of the research and development work carried out and reported by the present author (both in the six articles and in this paper)?
• What do we know about the usefulness of DIALANG on the basis of research results, statistics on the users of the system, and other relevant information gathered by others working in the field of language education?
• What is the overall picture of the usefulness of DIALANG when we combine the two sources of information above?

Obviously, the empirically oriented studies by both the present author and the other researchers reviewed in this paper try to answer a number of considerably more specific research questions, but the three questions above capture the essence of what this thesis aims to accomplish.

1.1. Organization of arguments and evidence for the usefulness of DIALANG in this paper

The term usefulness is used in this paper because the present writer mainly relies on the Bachman and Palmer (1996) framework on test usefulness in order to organize the presentation: Bachman and Palmer use usefulness as the super-ordinate term that characterizes the quality of a test or assessment system. A detailed description of the Bachman and Palmer framework is presented in Chapter 2 below. Instead of ‘usefulness’, it would be possible to use terms such as ‘quality’ or ‘validity’ without radically changing the meaning of what the Bachman and Palmer framework is about.

Besides the Bachman and Palmer framework, the literature on educational assessment and language testing provides other approaches that could be used in an overall analysis of the quality of an assessment procedure. Basically, most, if not all, studies on language tests can be thought of as validation studies of the tests in question: they aim at providing evidence and justification for different aspects of the validity of the tests or of the interpretations made about the test takers on the basis of test scores.

Thus, it would be possible to build this review and synthesis of research on DIALANG on the traditional aspects of validity, and use such categories as criterion-related (or predictive) validity, content validity, and construct validity (see e.g. Cronbach, 1971). In fact, the number of aspects of validity that have been presented in the literature is very high: for example, Cumming and Berwick (1996) list about 20 types of validity. Although some aspects of validity are rather clear and easy to define, such a long list of validities would not be an ideal way to organize an analysis of validity evidence for a particular test. The great number of categories would in itself be a problem for a coherent analysis, and it may also be difficult to decide, in a principled way, which types of validity to include in and which to exclude from the analysis. In addition, validity theory has made progress since the days when validity was simply conceived of as a set of different but partially overlapping categories.

Probably the most widely used framework for conceptualizing validity is the one proposed by Samuel Messick in his extensive treatment of validity in 1989 (Messick 1989), which he later presented in a more concise and slightly modified form (Messick 1995, 1996). According to Messick,

Validity is an integrated evaluative judgement of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of inferences and actions based on test scores or other modes of assessment. (Messick 1989, 13; emphasis in the original)

… validity is broadly defined as nothing less than an evaluative summary of both the evidence for and the actual – as well as potential – consequences of score interpretation and use … This comprehensive view of validity integrates considerations of content, criteria, and consequences into a construct framework for empirically testing rational hypotheses about score meaning and utility. Therefore, it is fundamental that score validation is an empirical evaluation of the meaning and consequences of measurement. As such, validation combines scientific inquiry with rational arguments to justify (or nullify) score interpretations and use. (Messick 1995, 742)

Messick’s validity framework has two facets (Messick 1989, 20). One facet is the basis of justification of the testing and consists of (1) evidence or (2) consequences. The other facet is the function or outcome of testing, which can concern either (3) test interpretation or (4) test use.

The evidential basis of test interpretation (i.e. the intersection of points 1 and 3 above) is construct validity (Messick 1989, 20). Messick (1989, 1995, 1996) further specifies that construct validity involves at least the following aspects. The content aspect of construct validity relates to the relevance, representativeness and technical quality of test content. The substantive aspect concerns evidence on test takers’ responses and the processes underlying those responses – the extent to which they match the processes predicted by the theory. The structural aspect concerns the match between the scoring criteria and rubrics used in the test and the construct domain to be tested. The generalizability aspect relates to the extent to which the interpretations drawn from the test results generalize across different groups of test takers and across different settings and tasks. As the generalizability of the interpretations across different occasions and raters is included in this aspect, the reliability of the test falls into this category. Finally, the external aspect of construct validity includes convergent and divergent evidence about relationships between the test and different external measures.

The evidential basis of test use (1 & 4 above) is also construct validity but with the added consideration of the relevance and utility of the test for some specific, applied purpose or setting. Thus, the focus of attention is the degree of match between the test results and some relevant criteria or decisions (e.g. whether test takers are correctly selected or placed on the basis of the results).

The consequential basis of test interpretation (2 & 3 above) concerns values implied by how the constructs are labeled, which theories underlie test score interpretations, and which ideologies are involved in those theories. Messick (op cit. 20) argues that the main issue in this is whether the theoretical implications and value implications are commensurate because values are always involved when test results are interpreted.

The fourth intersection of the two facets in Messick’s framework concerns the consequential basis of test use (2 & 4 above) and is about the planned and unplanned social consequences of the test. However, it could be argued that this fourth combination of the two facets of the validity framework also involves consideration of all the other intersections of the two facets, i.e., construct validity (1 & 3), the relevance and utility of the test (1 & 4), and value implications (2 & 3) (Messick 1995, 748).

The Messick framework could fairly easily be used to organize the presentation of validity arguments for DIALANG in the present paper, since it has been used elsewhere for categorizing validation studies and for organizing the presentation and analysis of validity evidence for a particular test, either according to the main facets of validity, as has been done, for example, by Cumming (1996) and Kunnan (1997), or according to the more specific aspects of validity (i.e., content, substantive, structural, etc.), as has been done, for example, by Hasselgreen (2004). In fact, Messick's framework shares many characteristics with the Bachman and Palmer framework, as will become evident when the latter is described below. Thus, the choice between the two frameworks is largely a matter of preference. The decision to use the Bachman and Palmer approach was based on the fact that it is slightly more practically oriented – it in fact includes 'practicality' as one of the aspects of test usefulness – and that it appears to be less often used to organize the presentation of validity arguments. In fact, the author is not aware of it being used outside Bachman and Palmer's (1996) book, in which they apply their framework to analyze several language tests. On the whole, it is unlikely that the present synthesis of research on DIALANG would have been substantially different had it been based on the categorizations used by Messick. It is hard to imagine that, for example, any of the empirical studies reported in this paper would have been left out of this review or that the conclusions drawn from them about the quality and characteristics of DIALANG would have been significantly different.

However, there is one apparent difference between the two frameworks for analyzing validity arguments: the Bachman and Palmer framework does not directly address the consequential basis of test interpretation in Messick's system. Thus, the consideration of the values underlying DIALANG might not be addressed if the present author strictly adhered to Bachman and Palmer's categories. There are no studies on the value basis of DIALANG that the author is aware of, but there is some critical discussion in the assessment literature of the values implied by the Common European Framework of Reference and of the potential dangers of its institutionalization and politicization (e.g. Fulcher 2004).

One more validity framework that has recently gained popularity among language testing researchers should be mentioned before we move on to describe the Bachman and Palmer framework. This is the approach based on argumentation theory, which concerns the development of interpretative and validity arguments for tests and other assessment procedures. It is based on Toulmin's (1958, 2003) work on reasoning in non-mathematical sciences such as law and sociology, and has been applied in educational and psychological testing by Kane (1992, 2001) and Mislevy et al. (2003). Probably the best-known recent example from the field of language testing is the application of argumentation theory in presenting validity evidence for the new TOEFL test (Chapelle et al. 2008). In this approach, validity arguments for any interpretation of the test scores are structured in a specific way into a series of claims, warrants (reasons) with appropriate backing, and possible rebuttals (counter-arguments or exceptions to the claim).

Since the overall orientation of this framework is substantially different from Messick's and Bachman and Palmer's approaches to validation, it is not made use of in the present discussion of the validity arguments and evidence on DIALANG.

1.2. Validation before and after: the a priori and a posteriori approaches to validation

Finally, before describing the Bachman and Palmer framework in detail, it should be stated that the presentation of the evidence on the usefulness of DIALANG in this paper also makes some use of Weir's (1991, 1993, 2005) work on validity theory. Weir makes a distinction between a priori and a posteriori validation of assessment procedures. A priori validation includes all the non-empirical work that the test designers do during the development of the test, before it is piloted or used for real, which is why this type of validation can also be called theoretical validation (this, in fact, is similar to the point made by Messick (1989) above about theoretical rationales being one aspect of test validation). At the a priori validation stage, the designers attempt to ensure, with careful, knowledgeable planning, that the test is as good and as valid as it possibly can be.

A posteriori validation uses empirical data that are gathered when the test is actually administered to real language learners, either when the test is piloted or when it has reached the operational stage – or, ideally, both. The empirical data used in a posteriori validation typically consist of test scores and ratings, which are analyzed statistically, but qualitative data are also often used to find out whether the test actually works as it should: for example, interviews or questionnaire surveys of test takers or other stakeholders, think-aloud data from test takers, or analyses of responses to open-ended test items.

A well-grounded test design project relies on both a priori and a posteriori types of evidence about the quality of the test. In the case of DIALANG, the a priori evidence about the quality of the assessment system is mainly provided in the documents that guided the construction of the testing system: the DIALANG Assessment Framework document, test specifications, and a range of other documents that guided the development and review of the software, translations, feedback, and self-assessment instruments. These documents specify both the content of the various aspects of the assessment system and the procedures by which they were to be designed. It should be clarified that not all the documents mentioned above were created at the beginning of the project; the design and development of DIALANG took several years, from 1997 to about 2002. Some of the documents were needed, and were thus created, only after the initial phase of, for example, item writing was over, and these were intended to guide the reviewing and fine-tuning of the various parts of the system.

In addition to the documents, another key aspect of the a priori validation could be argued to be the personnel available to the test development project, although this is not specifically mentioned by Weir, for example. An experienced test design team is obviously more likely to be able to develop the documentation and procedures that ensure a high-quality test. Thus, in the case of DIALANG, it is also relevant to evaluate the level of experience and expertise of the personnel involved in developing the system.

As far as the a posteriori validation of DIALANG is concerned, it comprises all the empirical evidence gathered:

• by the project members themselves, during the development of DIALANG,
   o on the development of certain aspects of the system, such as the scales used for self-assessment and for reporting test results (see Kaftandjieva & Takala, 2002);
   o on the piloting of the Finnish tests, which were the first ones to be piloted: the procedures developed for them were used in the data analysis and item selection for the other language tests (see Kaftandjieva, Verhelst, & Takala, 1999; Luoma 2004);
   o on the piloting of English (Luoma 2004; Alderson 2005; Alderson and Huhta 2005);
   o on the standard setting exercises (not reported except in project-internal memos and, to a limited extent, in Alderson 2005 and Alderson & Huhta 2005);
• by the project members, based on the operational version of DIALANG (e.g. Huhta 2007a, 2007b; Huhta, submitted; and this paper);
• by outsiders to the DIALANG Project, based on the operational version of DIALANG (some of the publications listed in the column 'Empirical evidence' in Table 2).


2. THE BACHMAN AND PALMER FRAMEWORK OF TEST USEFULNESS

The framework that is mainly used in this paper to organize arguments and evidence about DIALANG is based on the notion of test usefulness developed by Lyle Bachman and Adrian Palmer (Bachman & Palmer, 1996), which consists of a number of elements or aspects, most of which also appear in the Messick framework of validity.

2.1 Purpose of DIALANG

Bachman and Palmer tie usefulness very closely to the purpose of the test: a test and its uses may be useful, to some degree, in terms of its intended purpose(s) (Bachman & Palmer, 1996). Although DIALANG has been used for several purposes, the number of its intended purposes is limited, and, furthermore, some of them are very closely related to each other. Thus, it appears feasible to use this purpose-dependent framework as a conceptual basis for organizing the presentation of research on DIALANG. Obviously, the more different purposes a test has, the more complex the analysis and evaluation of its usefulness for each purpose becomes. However, if the reality is complex, we should not expect the evaluation of such a multi-faceted reality to be simple either.

Let us examine more closely the purpose of DIALANG and its feedback, as they have been described by the developers of the system. The DIALANG website (www.dialang.org) provides this information:

Welcome to DIALANG, where you can learn about your strengths and weaknesses in a foreign language, and find out what level you are at. …

DIALANG does not issue certificates. It allows you to find out what level you are at and where your strengths and weaknesses are, so that you can decide how best to develop your mastery of a language.

DIALANG will help you to take control of your language learning and it will increase your awareness both of what you can do at the moment, and of what it means to know a language.

The above points are elaborated on in publications on DIALANG, such as Alderson and Huhta (2005), who write that

DIALANG is the first large-scale language assessment system that aims at diagnosing rather than certifying language proficiency. (p. 302)

DIALANG aims to be a tool that supports independent, life-long language learning by providing the users with a wide range of feedback that helps them to diagnose strengths and weaknesses in their proficiency, to plan further language learning and to become more aware of their language skills. (pp. 319–320)

Alderson (2005, 30) further elaborates on what diagnosis means in the context of DIALANG. According to him, DIALANG can be argued to be diagnostic in a number of ways and at different levels. At the macro level, the diagnosis is about finding out the learners' CEFR level, which “can be used, for example, to help learners or their teachers decide whether they are ready to take a public or commercial examination at that level, or to take an exam at a lower level.” Alderson adds that such general information can also be used for placement purposes.

At a more micro level, the diagnosis can be about subskills (Alderson 2005, 30) and individual items. The learners may find out that they are good at certain subskills (e.g. understanding the main ideas) but not at others (e.g. making inferences), and this information may be used by them and their teachers to guide further learning and teaching.

A further dimension of diagnosis, according to Alderson (2005, 31), relates to raising learners' self-awareness. DIALANG, and especially its self-assessment related feedback and information, is “… intended to raise learners' awareness about the nature of language and language learning, and thus to help them diagnose for themselves what their strengths and weaknesses might be in the light of such feedback”. The learners are encouraged, through the provision of verbal feedback, “to reflect upon their performance, how it compares with the CEFR and their own beliefs and expectations about their ability” (op cit. p. 31). Feedback also aims “to encourage learners to think about what might be an appropriate action if they wish to improve their language proficiency. In short, the DIALANG system aims to encourage self-diagnosis and self-help.” (op cit. p. 30).

To sum up, the main purpose of DIALANG is to diagnose language learners' proficiency. It aims to do that in several different but complementary ways and at different levels of specificity. Key points in this include:

• Finding out about one's level in the skill and language tested;
• Finding out about one's profile, i.e., strengths and weaknesses (at the level of skills and sub-skills, at least);
• Increasing one's awareness of
   o one's own proficiency
   o language proficiency in general
   o the CEFR and its levels
   o the match or mismatch between one's tested proficiency and one's expectations and beliefs about it (i.e., self-assessment);
• Taking more control of one's own learning: with the help of the test results and awareness-raising information, it is argued, learners are better able to plan their language studies and to decide whether they are ready for a formal language examination;
• Teachers, too, will be able to use the above information to plan their teaching of particular learners and/or to make judgments about learners' readiness for a particular level of course or an examination;
• An extension of the last purpose is that institutions where the teachers work use the information about learners' proficiency to decide on learners' readiness to participate in different types and levels of language courses. The use of DIALANG as a placement test in schools and other institutions was not one of the explicit goals of the system, but it is in line with the envisaged use of the test results by language teachers for decisions concerning teaching. This purpose has in fact become very important, as most users of the system probably take DIALANG tests in institutions that want to place them on language courses.

According to Millman and Greene's (1993, 336) classification of test purposes, the domain of inference that pertains to most of the above-mentioned purposes is the criterion-based one (see Table 1, the last column on the right), as these purposes relate to knowledge, skills or behavior that are required for success in specific settings – all of these are specified in the CEFR and the DIALANG Assessment Framework and Specifications. Some purposes of DIALANG – those related to learners' awareness and control of learning – appear to belong to Millman and Greene's second, cognitive, domain of inference. The aims of DIALANG do not appear to be directly relevant to their third domain, curriculum, because DIALANG is not related to any particular course or curriculum. However, it might be argued that the placement of students on courses by teachers and institutions is a purpose that is at least indirectly related to the curricula of the institutions that decided to use DIALANG for placement purposes.

Millman and Greene (1993) also define three types of inference that different purposes may focus on: describing individuals, describing a system or a group, and making mastery decisions (see Table 1). The majority of the above-mentioned purposes of DIALANG concern describing individuals (e.g., their level of proficiency in different skills, or performance on individual items). Placing students on courses probably also relates to that same type of inference. Thus, the envisaged uses of DIALANG do not concern the other two types of inference in the Millman and Greene framework: describing groups or systems, or making mastery decisions about individuals. However, as we will see later, some researchers have used DIALANG to study the proficiency of groups of language learners and have thus extended the scope of envisaged purposes of the system.

Table 1. Functional description of achievement and ability test purposes by Millman & Greene (1993, 336)

The table distinguishes the domains to which inferences will be made – the curricular domain (before, during, or after instruction), the cognitive domain, and the future desired criterion setting – and three types of inference with their associated test purposes:

• Description of individual examinees' attainments: placement; diagnosis; grading; reporting (e.g., to parents); guidance and counseling.
• Mastery decision (above/below a cutoff for individual examinees): selection; instructional guidance; certification; promotion; admission; licensing.
• Description of performance for a group or system: pre-instruction status, process and status for curriculum research / evaluation, and post-instruction status for research / evaluation; construct measurement for research / evaluation; reporting (e.g., on the effectiveness of an instructional programme or on educational achievement at state or national level).

Wolfe and Smith (2007) further elaborate on the types of purposes that a test can have by referring to the kinds of decisions that are made on the basis of the test results. They state that in most cases tests depict status for the sake of describing individuals or groups, which is also true for most of the purposes of DIALANG listed above. Other types of decision concern selecting individuals who are at a particular level of performance (cf. placement of students on courses) or evaluating claims based on a theory. This last type of decision is not covered by any of the purposes envisaged for DIALANG.

2.2 Summary of the Bachman and Palmer framework of test usefulness

According to Bachman and Palmer (1996), several factors contribute to the usefulness of a test for its intended purpose(s): reliability, construct validity, authenticity, interactiveness, impact, and practicality. In principle, the better the quality of each of these aspects, the more useful the test. However, there is usually a tension between the different factors, since maximising the quality of one is often detrimental to the quality of another. Thus, it is essential to consider the purpose of the assessment in order to decide whether all the aspects of quality are equally important for any given purpose. In fact, the overall usefulness of the assessment is what really counts, and that cannot be evaluated without taking into account, and weighing, all the factors above that contribute to usefulness (op cit. p. 18).

The first aspect of test usefulness in the Bachman and Palmer framework is reliability, which has been one of the key qualities of measurement for as long as testing and assessment have existed as a profession and an area of study. The reliability of a test concerns the extent to which it measures whatever it is designed to measure in a systematic and dependable fashion.
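To make the notion of measurement consistency more concrete, the following sketch (an illustration only, not a DIALANG procedure) shows how one widely used index of internal consistency, Cronbach's alpha, could be computed for a small, hypothetical matrix of dichotomously scored item responses; the data and function name are invented for the example.

    # Illustrative only: Cronbach's alpha for a hypothetical item-score matrix
    # (rows = test takers, columns = items; 1 = correct, 0 = incorrect).
    import numpy as np

    def cronbach_alpha(scores: np.ndarray) -> float:
        """alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)"""
        k = scores.shape[1]                          # number of items
        item_var = scores.var(axis=0, ddof=1)        # sample variance of each item
        total_var = scores.sum(axis=1).var(ddof=1)   # variance of test takers' total scores
        return k / (k - 1) * (1 - item_var.sum() / total_var)

    responses = np.array([
        [1, 1, 1, 1],
        [1, 1, 1, 0],
        [1, 1, 0, 0],
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0],
    ])
    print(f"Cronbach's alpha = {cronbach_alpha(responses):.2f}")

For this invented data set the coefficient is about 0.74; the point of the sketch is simply that a higher value indicates that the items rank test takers in a more consistent, dependable way.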

The next three aspects of test usefulness in the Bachman and Palmer framework – construct validity, authenticity, and interactiveness – are quite closely connected (this will be discussed in more detail in Section 3.3; see also Figure 1 in that section). Construct validity concerns the interpretations made on the basis of a test result: Are the interpretations justifiable and appropriate? A construct is a specific term for the ability that a test intends to measure, for example, foreign language ability.

Authenticity relates to the relationship between test takers' performance on the test tasks and tasks in the 'real world': What is the domain of generalization to which one wants to generalize the interpretations made from the test scores? The key consideration in determining the authenticity of a test is the correspondence between the test tasks and the tasks in the target language use domain (domain of generalization).

Interactiveness can be defined with reference to both construct validity and authenticity. Interactiveness is about the extent to which the test tasks engage the test taker’s areas of language ability and other relevant characteristics. The key point is thus the interaction between the test taker and the task.

The impact of test use operates at two levels: micro (individuals) and macro (institutions, society). The impact of assessment is often discussed in terms of how positive or negative it is, and the designers and administrators of assessments generally hope that the impact will be positive or that there will be no unintended consequences, especially negative ones.

Practicality differs from the other test qualities as it relates to the availability of resources and to decisions to use or develop a particular test – or not. A test is practical if the resources that it requires (time, money, work, personnel) do not exceed the available resources; a test that is too long or expensive for its purpose is impractical and is not likely to be used. The most common tension between the different aspects of usefulness probably occurs between practicality and the other aspects.

It is obvious from the brief description above that the factors contributing to test usefulness in the Bachman and Palmer model are in many ways similar to the aspects of validity in the other validity frameworks discussed earlier. In the presentation of the validity arguments and evidence for DIALANG in the chapters that follow, occasional reference will be made to how different aspects of the Bachman and Palmer framework relate to the Messick framework in particular.


3. ANALYSIS OF THE ARGUMENTS AND EVIDENCE FOR THE USEFULNESS OF DIALANG

In this main chapter of the paper, a range of arguments and evidence for the validity of the DIALANG assessment system is presented. As described above, the chapter is organized along the aspects of test usefulness proposed in the Bachman and Palmer (1996) framework. Additionally, each aspect is viewed from two complementary viewpoints: theoretical / logical (a priori) arguments, on the one hand, and empirical (a posteriori) evidence, on the other, which is an approach to validation advocated by Weir (1991, 1993) in particular, but which is not incompatible with how both Messick and Bachman and Palmer view validation.

Table 2 presents a summary of this chapter by describing, for each aspect of test usefulness, the theoretical-logical arguments and empirical evidence for the quality of that aspect, as well as the documents where the arguments and evidence are presented. The table is organised according to the aspects of usefulness (rows) and the type of evidence or arguments provided for each aspect (columns). The logical arguments comprise the a priori stage of validation and the empirical evidence the a posteriori stage. The articles and topics that relate to the present author's doctoral thesis are in bold typeface.

Table 2. Summary of the theoretical / logical arguments and empirical evidence for the usefulness of DIALANG

Reliability
Theoretical / logical arguments (a priori validation): All documents and procedures that aimed to ensure consistency of the assessment system: (1) the detailed item specifications that guided the operationalization of the assessment framework and aimed at ensuring high quality of items; (2) guidelines for ensuring uniform review of the test items, the accuracy and comparability of the translations used in the system, and the comparability of standard setting sessions; (3) specifications of the computerized system and programming that aimed to guarantee reliable service; (4) scoring keys to ensure consistent and adequate automatic scoring of short-answer items.
Documents: DIALANG Assessment Specifications; Guidelines for the review of pilot test items; Guide to DIALANG translations for DIALANG Version 1; Translation Input System (TIS) – Step-by-step guide for translators and reviewers.
Empirical evidence (a posteriori validation): Evidence about the reliability and validity of the DIALANG scales (Kaftandjieva & Takala 2002); the reliability of the English pilot tests (Alderson & Huhta 2005; Alderson 2005); the reliability of standard setting (Alderson & Huhta 2005; Alderson 2005); the reliability and validity of the Vocabulary Size Placement Test (Huhta 2007a); the reliability of scoring open-ended items (the present paper; Appendix 1); the reliability of service, i.e. the technical reliability of the computerised system (the present paper; Appendix 1).

Construct validity
Theoretical / logical arguments (a priori validation): (1) The definitions of language and use of language in the CEFR; (2) the overall assessment plan (framework) of DIALANG that defined how DIALANG draws on the CEFR; (3) the operationalisation of the assessment framework as defined in the test specifications and guidelines for item review and translations; (4) the planning, development and implementation of self-assessment and feedback into the system; (5) the development, trialling and calibration of test and self-assessment items; (6) the development of linking procedures to the CEFR (standard setting); (7) the definition and implementation of diagnosis of language proficiency in the system.
Documents: Common European Framework of Reference (1995, 2001); DIALANG Assessment Framework (DAF); DIALANG Assessment Specifications (DAS); Guidelines for the review of pilot test items (Huhta & Noijons, 2002); Guide to DIALANG translations for DIALANG Version 1 (Fligelstone & Huhta, 2001); Translation Input System (TIS) – Step-by-step guide for translators and reviewers (Huhta, 2002); DIALANG linking procedures with the CEFR (Kaftandjieva et al., 1999; Noijons et al., 2002). General descriptions of the design and development of DIALANG and of diagnosing language proficiency: Huhta et al. 2002; Alderson 2005; Alderson & Huhta 2005; Huhta & Figueras 2004; Huhta 2008.
Empirical evidence (a posteriori validation): All the evidence about the reliability of various aspects of the system (see above). Concurrent validity evidence (1) against self-assessments and some external tests (Finnish Matriculation examination, school grades) (the present paper) and (2) against the Matriculation examination and school grades (Jaatinen 2005). Concurrent validity evidence of DIALANG self-assessments against language tests (Desroches et al. 2005; Demeuse et al. 2006). Evidence about the internal structure of and relationships between DIALANG tests (the present paper; Peschel et al. 2006). Evidence about users' reactions to the system in general and to its feedback in particular (Huhta 2007a; 2007b; the present paper). Evidence about the validity of the self-assessment instruments and the overall implementation of self-assessment in the system (Luoma 2004; Huhta 2007a). Evidence about the reasons for the usefulness of DIALANG self-assessment and self-assessment feedback (Huhta, submitted). Evidence for the validity of some aspects of self-assessment feedback (Luukka et al. 2008).

Authenticity
Theoretical / logical arguments (a priori validation): The way in which DIALANG drew on the CEFR content categories and definitions of what it means to know and use a language.
Documents: see above (mainly the CEFR, DAF, DAS, and the item review guidelines).
Empirical evidence (a posteriori validation): Users' reactions to and views on different parts and functions of the system, including test content, feedback and self-assessment (Huhta 2007b; the present paper).

Interactiveness
Theoretical / logical arguments (a priori validation): The way in which the DIALANG assessment system in general, and the assessment specifications in particular, aimed at designing tasks and activities that engage the users' language knowledge, strategic competence, metacognitive strategies, affective schemata, and topical knowledge.
Documents: see above (mainly the CEFR, DAF, DAS, and the item review guidelines).
Empirical evidence (a posteriori validation): Some of the users' comments on their test taking experience, particularly on the VSPT. Other users' comments on the system, tasks, self-assessment or feedback may throw light on the interactiveness of DIALANG (Huhta 2007a; Huhta 2007b; the present paper).

Impact
Theoretical / logical arguments (a priori validation): The plans to create an assessment system that would achieve the different stated (mainly individual) purposes of DIALANG.
Documents: Huhta et al. 2002; Alderson 2005; Alderson & Huhta 2005; Huhta & Figueras 2004.
Empirical evidence (a posteriori validation): Impact on individual learners studied so far (Huhta 2007a; 2007b; the present paper). Impact on the institutions using DIALANG for placement and other purposes (the present paper). Impact on research carried out with the help of DIALANG (Jaatinen 2005; Desroches et al. 2005; Demeuse et al. 2006; Peschel et al. 2006). Impact on language testing research (Alderson 2005; Chapelle 2006; Zhang & Thompson, 2004). Impact on European language policy initiatives (West 2003; Haahr et al. 2004; 2006).

Practicality
Theoretical / logical arguments (a priori validation): The different design features of DIALANG that aim for it to be as easy, accessible, practical, quick, and inexpensive to use as possible: multilingual tests, instructions and feedback; the benefits of computerisation (e.g. immediate results) and of the use of the Internet (informative website, downloadable from the Internet, upgrading and test-taking via the Internet); free of charge.
Documents: Figueras et al. 2000; Fligelstone & Treweek 2000; Fligelstone 2001; Huhta et al. 2002; Alderson 2005; Alderson & Huhta 2005; Huhta & Figueras 2004.
Empirical evidence (a posteriori validation): Users' reactions to the practicality of DIALANG, such as ease of use, clarity of navigation and instructions, and technical matters (Huhta 2007b; the present paper; Floropoulou 2002a). Practicality of DIALANG for purposes other than originally envisaged (West 2003; Haahr et al. 2004).

3.1 Qualifications of the DIALANG development team

The level of experience and expertise of the test development team is not discussed in the literature on test validation, because publications on validation focus on the areas of work and procedures that are typically required to produce valid and useful assessments, not on the people who do the job (except raters and item writers, whose qualifications are sometimes referred to in the literature). However, it goes without saying that designing a multilingual, large-scale assessment system is an activity that requires a certain amount of expertise in order to provide its users with meaningful results and feedback. Indeed, DIALANG advertises itself on its website in the following way:

DIALANG offers carefully designed and validated tests of different language skills, together with a range of feedback and expert advice on how to improve your skills.

DIALANG also offers scientifically validated self-assessment activities, because it is now widely recognized that being able to judge your own language proficiency is an important part of learning a language. …

DIALANG has been developed by more than 20 major European institutions, with the backing of the European Commission.

The main reason for including the above statement was to make it clear to potential users of the system that DIALANG was qualitatively different from most other language tests that one can encounter on the Internet – that it was better and more dependable than most of the other tests. That was the 'sales pitch' targeted at the lay audience, but the inclusion of a list of the institutions involved in the creation of the system, which is also found on the DIALANG website, was intended to convince language education and testing professionals that DIALANG was a serious, professionally developed language assessment system. (The conference presentations and academic journal articles by Project members also served the same purpose.)

A possible way to link the level of expertise of the test developers to validity theory is via Weir's (1991, 1993, 2005) discussion of a priori and a posteriori validation. Developers' expertise – or lack of it – could perhaps be seen as part of the a priori validation of a test. The selection of the persons who are responsible for the testing process takes place before the actual use of the tests, in the same way as the design of test blueprints and the other documents listed in the column 'Theoretical / logical arguments' in Table 2, and these selections are as essential to the quality of the test as those documents.

As far as the expertise of the DIALANG development team is concerned, it can be argued to be quite considerable, which gives credibility to the a priori claims that DIALANG is a useful system of high quality. The key arguments for the expertise of the development team include at least the following:

• The partnership included institutions with a proven track record in doing high-quality research on language testing (mainly Lancaster University and Cito; University of Jyväskylä was beginning to get attention in this respect at that time), in developing and maintaining large-scale testing systems (Cito and U. of Jyväskylä), in participating in large-scale international comparative studies in education (U. of Jyväskylä and Cito) and in psychometrics (Cito).

• The partnership included a number of individuals, in different phases of the project, both from the above-mentioned institutions and from other partner institutions, who were world-class experts in their areas and/or who had extensive experience in coordinating European networks and large-scale projects and in working with the Council of Europe and the European Commission, such as J. Charles Alderson (Alderson & Urquhart 1984; Alderson & Windeatt 1991; Alderson, Clapham, & Wall 1995; Alderson 2000), Neus Figueras (Figueras 2001; Huhta & Figueras 2004; Alderson et al. 2004), Felianka Kaftandjieva (Takala & Kaftandjieva 2000; Kaftandjieva 2004), Sari Luoma (2001, 2004), Wolfgang Mackiewicz (Mackiewicz 1995), Mats Oscarson (Oskarsson 1978, 1984; Oscarson 1997), Kari Sajavaara (Sajavaara 1981, 1992; Takala & Sajavaara 2003), Sauli Takala (Purves & Takala 1982; Takala & Kaftandjieva 2000), Mirja Tarnanen (Tarnanen 2002; Luoma & Tarnanen 2003), Alex Teasdale (Teasdale & Leung 2000), and Norman Verhelst (Mislevy & Verhelst 1990; Verhelst & Glas 1993). Brian North, the key person behind the scales found in the CEFR, participated in the work of the self-assessment and feedback working group of the project and reviewed the self-assessment instruments developed by the project. A number of other competent individuals contributed to the project (see Alderson 2005, vi–xii for an exhaustive list of people involved in DIALANG), some of whom were already established figures in language testing and language education more generally.

• The partnership’s expertise in item writing varied across languages. In some cases, the item writers were experienced professionals who had worked for established language examinations for years but in other cases they were less experienced as item writers although they were professionals in language teaching, for example. Thus, the expectations of the quality of the items varied

somewhat depending on the language. In any case, however, the item development and review followed specified procedures that aimed at ensuring that all the languages would achieve the same minimum quality in terms of items produced. (Piloting was intended to be the final check on how well the work had succeeded but it could unfortunately be completed only for some of the languages.)

• The partnership’s expertise in creating the feedback system for DIALANG was perhaps the least developed of the content areas. This was largely due to the general lack of development of diagnosis of foreign language learning and proficiency (see Alderson 2005 for detailed discussion). If you cannot properly diagnose FL proficiency, you cannot give really good and useful feedback about it – or you can try, as we did in DIALANG, and devise a range of different types of feedback in the hope that different learners might benefit from some of the feedback. Thus, research into the usefulness of the feedback from DIALANG is particularly interesting as the developers faced a much tougher challenge there because of the lack of established predecessors and procedures – unlike in most other areas of work in DIALANG, with perhaps the exception of linking the test results with the CEFR, which was another novelty of the system.

• The partnership did not have expertise in programming and had to hire outsiders to do this work. It is difficult to evaluate the expertise of the programmers who worked for the project, except by referring to the fact that they managed to create a system that has been running successfully, with minimal maintenance, since the end of the project in 2004 (see Alderson 2005, 38, on the difficulties in hiring programmers for the project). The partnership also lacked expertise in commercialising DIALANG until about 2005. This did not directly affect the development of DIALANG, but it made it extremely difficult to find a solution for the permanent maintenance and further development of the system. At the time of writing this paper, the future of DIALANG is still quite open, and it is not clear how long the core partners can maintain the system.

On the whole, it can be argued that the DIALANG partnership possessed considerable expertise that should have enabled it to develop an assessment system which serves the

purposes it was designed for and which is thus useful for its users. However, as with a priori validation in general, an expert design team does not in itself guarantee that the test to be developed is valid and useful – no more than good test specifications guarantee the quality of the tests. We need empirical evidence in order to evaluate to what extent we can argue DIALANG to be a useful system for the purposes for which it is used.

3.2 Reliability of DIALANG

Reliability of a test concerns the extent to which it measures whatever it is designed to measure in a systematic, consistent and dependable fashion. A person taking the test on two occasions should receive the same score or grade, assuming he/she has not made significant progress in the skills and knowledge measured by the test between the two occasions.

As with any other aspect of usefulness in the Bachman and Palmer framework, reliability can be analysed from two complementary perspectives: theoretical / logical and empirical. The theoretical / logical perspective covers all the actions that take place before the test is actually used ( a priori ): it is about the design of the test (e.g. Weir 2005). All activities that potentially lead to the system measuring the intended skills in a systematic, unbiased way can be thought to contribute to the reliability of the tests. At this stage, it is often difficult to distinguish between reliability and validity: most of the design features of a testing system contribute to both its validity and reliability. Indeed, Messick (1995) regards reliability as one aspect of the generalizability of the meaning of test scores, which is one aspect of construct validity (i.e., the evidential basis for test interpretations).

3.2.1 Theoretical / logical attempts to achieve reliability

At the heart of a priori attempts to ensure the reliability of a test are test specifications that define how the areas of knowledge and skills are to be operationalized as concrete test items and that determine how different item types are to be designed in such a way that they function consistently and in the way they were planned to function. Failure of the specifications to do this in a clear and detailed way is likely to result in unreliable tests. The complete lack of test specifications – which is unfortunately the case with many real-life tests

– is an extremely serious threat to the reliability of a test or, indeed, to any other aspect of its quality.

The development of DIALANG was guided by test specifications. In addition, the process of item review and construction of tests was supported by other documents that can be considered more detailed extensions of the test specifications. These included the item review guidelines that defined in a more precise way how the items selected for piloting were to be checked in the specific technological context of DIALANG (Huhta & Noijons, 2002). They also specified how translations of the test instructions and the programme interface were to be done, reviewed, and implemented in the system (Fligelstone & Huhta 2001; Huhta, 2002). Thus, they contributed to the creation of as uniform an experience as possible for the users of the system – which is essential from the point of view of the reliability of the tests and of the system as a whole. Furthermore, guidelines existed for the systematic conduct of the standard setting exercises that provided data for determining the cut-off points on the test score scales for the six CEFR levels (Kaftandjieva et al. 1999; Noijons et al., 2002).

A small point in the context of the whole system but something that had a significant impact on the reliability of the test results was the creation of good scoring keys for free-response items. A ‘dumb’ scoring system such as the one used in DIALANG is dependent on the test designers’ ability to create an exhaustive list of acceptable answers for the items in which the test takers have to write, rather than select, their responses. While some of the scoring decisions have direct bearing on the validity of the test (e.g., Do you penalise for spelling errors in open-ended reading comprehension items?), the main challenge relates to reliability: Can you anticipate all valid answers so that you can create a comprehensive, reliable scoring system?
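The reliability challenge posed by such scoring keys can be illustrated with a minimal sketch of an exact-match scorer of the kind that a ‘dumb’ scoring system implies. The item content, accepted answers, and function names below are invented for illustration; this is not the actual DIALANG scoring code.

# Illustrative sketch only: a simple exact-match scorer of the kind implied by a
# 'dumb' scoring system for free-response items.

def normalise(response: str) -> str:
    """Lower-case and trim the learner's typed answer before matching."""
    return response.strip().lower()

def score_item(response: str, accepted_answers: list[str]) -> int:
    """Return 1 if the (normalised) response is found in the key, otherwise 0."""
    return 1 if normalise(response) in {normalise(a) for a in accepted_answers} else 0

# Hypothetical open-ended reading item: the key lists all anticipated correct answers.
key = ["in the morning", "morning", "early in the morning"]

print(score_item("Morning ", key))   # 1: an anticipated answer is scored correctly
print(score_item("at dawn", key))    # 0: an arguably valid but unanticipated answer
                                     #    is scored as wrong, i.e. unreliably

The point of the sketch is simply that the reliability of such scoring rests entirely on how exhaustively the key anticipates valid answers, exactly as argued above.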

As a computer system operating via the Internet, DIALANG faces other kinds of challenges in its attempts to provide a reliable testing system to the learners compared with a paper-and-pencil test. There are issues with the dependability of the system across different hardware, operating systems, and networks. The reliability of the Internet service in general poses potential problems for the systematic functioning of the system that are not directly related to the reliability of the test system as such – problems that the system designers cannot always control for. The careful planning of the overall system (e.g. the use of mirror servers that can handle interruptions in service) and of the different software components,

and the choice of programming language, etc. are aspects of design that have a direct effect on the reliability of the IT side of the testing system (cf. Figueras, Taalas, & Treweek, 2000).

It should be added that, in the case of DIALANG, the procedures for ensuring high reliability in the test design phase were not limited to providing the relevant actors with documents such as item writing and review guidelines. In addition, the training of item writers, together with face-to-face and on-line consultation and advice, was an important element in the attempts to ensure, for example, uniform and systematic item writing and review (Alderson 2005).

3.2.2 Empirical evidence for the reliability of DIALANG

No comprehensive study has been carried out on the reliability – or the validity – of DIALANG. It would in fact be impossible to provide a single picture of the quality of DIALANG because of the different states of readiness of the languages in the system: while the tests in some languages have been properly piloted and the results analysed, other languages remain non-piloted, which means that the reliability and validity of the 14 different language test systems of DIALANG are bound to differ. However, we have some evidence about the reliability of certain parts of the system, either from its developmental stages or from the actual operational system.

There is evidence of a fair level of internal consistency of the English pilot tests (see Alderson & Huhta 2005; Alderson 2005), which suggests that the operational English tests in DIALANG Version 1 are quite reliable. However, there is no published research into the reliability of the other piloted languages (Dutch, Finnish, French, German, Italian, and Spanish) although the relevant pilot data do exist. Studying the reliability of the tests in the non-piloted languages (e.g. Danish, Irish, Norwegian, Swedish) would be more difficult because the number of pilot test takers in those languages was so small in the lifetime of the Project that the analysis and calibration of the test items could not be carried out. Although the standardised item design and review procedures across all DIALANG languages suggest that the results obtained for English might tell us something about the reliability of the other tests, we obviously cannot be sure.

Another aspect of the development of the system for which we have reliability evidence is standard setting , which is a procedure by which test developers attempt to set meaningful

cut-off points on the test score scale. For example, the cut-off point for level B1 in the intermediate test of reading in English might be 45 (on a 0–100 scale) but the same cut-off score for reading in Spanish might be 56. The reason for the differences in the cut-offs is that each test is an individual instrument with a somewhat different combination of items of different difficulties, and thus, the tests cannot have the same cut-off points. The cut-offs were determined by a combination of information about empirical item difficulty (if it was a piloted language) and judgements about the level of the items made by experts in the standard setting sessions, guided by standardised procedures developed in the Project (Kaftandjieva et al., 1999; Noijons et al., 2002). For details on standard setting, see Alderson (2005) and Alderson and Huhta (2005). As in any other judgemental task, the judges of item difficulty can do their job more or less consistently, both internally and in relation to the other judges. Alderson and Huhta (2005, 317) provide evidence for very high intra- and inter-judge agreement for the German standard setters; most of the correlations were in the range of .8 or .9. Obviously, the standard setting teams in different languages varied to some extent as far as their consistency is concerned, and the degree of their consistency has had an effect on the accuracy of the cut-off points and, thus, on the reliability, and validity, of the test results in different tests and languages.
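The judge-consistency figures reported above can be estimated with simple correlational analyses. The following sketch, with invented ratings, shows one way of computing inter-judge agreement of the kind reported by Alderson and Huhta (2005); it is illustrative only and does not reproduce the Project's actual data or procedures.

# Minimal sketch of estimating inter-judge agreement in a standard setting exercise.
# The ratings below are invented; DIALANG's actual data and procedures differ in detail.
from itertools import combinations
from statistics import mean

def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation between two judges' item-level ratings."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical CEFR-level judgements (A1=1 ... C2=6) by three judges for ten items.
judges = {
    "J1": [2, 3, 3, 4, 2, 5, 4, 3, 6, 2],
    "J2": [2, 3, 4, 4, 2, 5, 5, 3, 6, 1],
    "J3": [1, 3, 3, 4, 3, 5, 4, 3, 6, 2],
}

for (n1, r1), (n2, r2) in combinations(judges.items(), 2):
    print(f"{n1} vs {n2}: r = {pearson(r1, r2):.2f}")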

One of the articles comprising the present thesis (Huhta 2007a) is an exception to the lack of reliability studies on the operational DIALANG system as it focuses on the reliability and validity of the Vocabulary Size Placement Test (VSPT). Any test score has some degree of imprecision and measurement error, but it appeared that the current version of the VSPT had some serious problems. The study into the VSPT was prompted by feedback from many users of DIALANG who claimed that the VSPT worked quite strangely for them. Some users appeared to receive scores that were totally out of line with what they could expect to get. The problem seemed to be that these test takers received too low scores from the VSPT – even zero points – which was incompatible with what the test takers themselves and their teachers knew about their level of proficiency, or even with the results of the main DIALANG skills tests. Furthermore, other users appeared to receive very different VSPT scores when they re-took the test and applied a somewhat different test-taking strategy.

Huhta (2007a) argues that there are at least two reliability issues with the current VSPT. The first one relates to inaccurate VSPT scores themselves. A wrong VSPT score means that the description of the meaning of the score that the learner can read after the test does not match

what he/she can do in the tested language. The way in which the VSPT scoring algorithm works almost certainly guarantees that the wrong score is practically always too low rather than too high. Any test score that clearly underestimates the test taker’s proficiency will almost certainly cause a negative reaction of some sort, either towards the test (‘what a bad test’ type of reaction) or towards the learner him/herself (‘am I really this bad?’ type of reaction). Huhta (2007a) provides evidence that low VSPT scores indeed provoked this type of reaction from the users. Such reactions to zero scores on the VSPT were also reported by Ylönen (2002), who used DIALANG with her learners of Finnish as L2 in Germany and who regarded such results as totally incorrect for the particular students on the basis of what she knew of their proficiency. The fact that it is possible to get a zero score from the VSPT clearly reinforces such negative reactions – it is after all somewhat unusual to get a zero as a result of any test.

It was thus clear on the basis of the author’s questionnaire survey of DIALANG users, which is reported in part in Huhta (2007a), that the VSPT scores caused quite negative reactions in some users. One can of course ask if it really matters if test takers do not like a particular test or the results they get from it. Learners’ perceptions of the test may not, however, be related to the way they perform on it. If it is important to do well in the test, most test takers will probably try their best no matter how stupid, boring, uninteresting, or unfair the test appears to them. However, most testing experts would nowadays agree that in normal circumstances it is neither fair nor ethical to ignore users’ perceptions of the validity and fairness of the tests that they are required to take. For DIALANG, which is often taken voluntarily by its users, it is obviously even more important to avoid unnecessarily antagonising its users.

In the case of DIALANG, the consequences of wrong VSPT scores are not limited to angering or depressing some of the users: inaccurate VSPT scores can also make the main test results for reading, listening, writing, vocabulary, and structures unreliable. Because the VSPT is used to pre-estimate the learner’s general level of language proficiency, it plays a key part in the system’s decision about which level of test to allocate to the learner. Since a wrong VSPT result is practically always an underestimation of the learner, he / she will be assigned to a test that is too easy. It is difficult to say if too easy a test affects actual test performances, but the main danger in terms of the reliability of DIALANG test results lies in the fact that its test versions are optimised to measure a certain range of proficiency. Thus, an easy test version mainly consists of easy items although it also includes a fair

number of intermediate (especially B1) items, but only a few if any really hard items. An advanced learner whose level in the skill tested is B2, C1, or C2, but who was assigned to an easy test because of a poor VSPT score, is the most likely candidate to receive an inaccurate result in reading, listening, or whatever the skill tested. The easy test versions are simply not suitable for testing very advanced learners, even if it may be possible for the test taker to get a very high level from an easy test version if he / she obtains a perfect, or near perfect, score in the skills test. The possibility of getting a very high result on an easy DIALANG test varies from one language and skill to the next, depending on the availability of items for each CEFR level in the particular test and on where exactly the cut-off points were placed in different tests. In some cases, full points on an easy test of, say, reading, result in level C1 being assigned to the user while in other languages and tests a C2 is given.
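The routing effect described above can be illustrated with a purely hypothetical sketch of threshold-based allocation from a placement score to a test version; the score range, thresholds, and version labels are assumptions made for illustration and do not reproduce DIALANG's actual algorithm.

# Purely illustrative sketch of threshold-based routing from a placement score to a
# test version; the score range, thresholds and labels are invented.

def choose_test_version(vspt_score: int) -> str:
    """Map a placement score (assumed here to run from 0 to 1000) to a test version."""
    if vspt_score < 350:
        return "easy"
    elif vspt_score < 650:
        return "intermediate"
    return "hard"

# An advanced learner (true level C1) whose guessing produces a near-zero VSPT score
# is routed to the easy version, which contains few items above B1 and therefore
# cannot measure his or her proficiency accurately.
print(choose_test_version(20))    # 'easy' -> likely underestimation for a C1 learner
print(choose_test_version(780))   # 'hard' -> appropriate for the same learner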

Feedback received from users and the testing of the system by the project members towards the end of the DIALANG Project all pointed in the same direction: if the test taker made a lot of guesses, the VSPT score could easily go down to zero. Consequently, institutions using DIALANG were instructed by the project members to advise their students to avoid guessing when taking the VSPT and to claim a word to be a real word only if they were sure about it. Most of the informants in the questionnaire survey, the results of which are reported in Huhta 2007a, 2007b and in this paper, were also given this advice.

The detailed study on the VSPT (Huhta 2007a) examined in more detail the mechanism that resulted in the underestimation of some learners’ vocabulary knowledge by the current version of the VSPT, and it also tried to estimate the proportion of test takers thus affected. Systematic trials with different patterns of responding to the VSPT confirmed that the main source of unreliable VSPT scores is the current scoring algorithm, which is over-sensitive to guessing: only a few false claims about words being real words when, in fact, they are non-words, result in a zero score (see Huhta 2007a, p. 52). Thus, guessing, which is a very sensible test-taking strategy for any other selected-response test that the learners are familiar with, does not work at all with the VSPT. It may be that the VSPT instructions contribute to the problem, as the wording deviates from the instructions in previous yes/no vocabulary tests by asking the learners to identify which words are real and which are not, rather than asking them to say which words they know vs. don’t know. The number of DIALANG test takers affected by unreliable VSPT scores is difficult to estimate precisely but a reanalysis of the

English pilot test data suggested that it may be close to 10% of the test takers (op. cit., p. 51).
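The sensitivity to guessing can be illustrated with the classic correction-for-guessing adjustment used in yes/no vocabulary testing, in which the hit rate on real words is discounted by the false-alarm rate on non-words. The sketch below uses this simple adjustment for illustration only; according to Huhta (2007a), the actual DIALANG scoring algorithm penalises false alarms even more heavily than this version does.

# Illustrative sketch of a correction-for-guessing score for a yes/no vocabulary test.
# This is the classic (hits minus false alarms) adjustment, NOT the exact DIALANG
# algorithm, which is reported to be even more sensitive to false alarms on non-words.

def corrected_score(hits: int, real_words: int, false_alarms: int, non_words: int) -> float:
    """Adjust the hit rate on real words by the false-alarm rate on non-words.

    Returns a proportion in [0, 1]; negative corrected values are clamped to 0,
    which is how a heavily guessing test taker can end up with a zero score.
    """
    h = hits / real_words          # proportion of real words claimed as known
    f = false_alarms / non_words   # proportion of non-words (wrongly) claimed as known
    if f >= 1:
        return 0.0
    return max(0.0, (h - f) / (1 - f))

# A careful test taker: high hit rate, almost no false alarms -> high score.
print(corrected_score(hits=40, real_words=50, false_alarms=1, non_words=25))   # ~0.79

# A heavy guesser who says 'yes' to about half of everything: the hit rate and
# false-alarm rate converge, and the corrected score collapses towards zero.
print(corrected_score(hits=30, real_words=50, false_alarms=14, non_words=25))  # ~0.09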

The other reliability issues of DIALANG about which we have at least some empirical evidence include the technical reliability of the IT system that underlies DIALANG assessment procedures and the reliability of the scoring of open-ended questions found in its language tests.

The technical reliability of service must be a top priority for any IT system that operates over the Internet. As was briefly described above, and elaborated in Alderson (2005), the DIALANG Project spent considerable time and resources throughout the project to ensure that the system would work reliably across different hardware and operating systems. Again, it is impossible to give precise figures, but based on the statistics on DIALANG tests taken and on the present author’s experience in advising users of the system (see Appendix 2 for details on the present author’s role in the DIALANG Project), it seems safe to say that the great majority of all DIALANG tests have been successfully completed, at least from the technical point of view. There are institutions that hardly ever face technical problems when hundreds or even thousands of their students take DIALANG tests.

However, not all learners are free from technical problems when using DIALANG. There are individual users and institutions that probably never managed to make DIALANG work due to some technical problem. Often the most insurmountable problems seemed to relate to firewalls that control and safeguard communication between a local system and the Internet (e.g., it turned out to be extremely difficult to make DIALANG work at the European Commission, the main funding body of the Project, because of the high security of their IT system). Certain firewall configurations appeared to make the use of a system such as DIALANG impossible. Then there are obviously users and sites that encounter sporadic breakdowns of service or some lesser technical problems. In addition, it should be mentioned that the whole DIALANG system was down because of hardware failures for periods of one to three months on at least two occasions since the end of the Project in 2004; the length of the two periods of closure was due to the minimal maintenance resources currently available for the system.

That technical issues are probably the most noticeable and serious of all types of problems that a computer-based assessment system can encounter is also indicated by the fact that the

great majority of all problems reported to the Project and to the present author by the test sites and individuals during the Project concerned technical issues. Feedback from sites around Europe indicated that technical problems did occur in certain places and at certain times, but it is impossible to say how DIALANG compares with other complex Internet-based systems in terms of technical reliability. When technical problems occur in DIALANG they are obviously very annoying as they may totally disrupt a learner’s testing session. The analysis of the users’ responses to the questionnaire survey by the present author showed that different types of technical problems were the most often mentioned problem, together with the problems with the VSPT (see Table 2 in Appendix 1 for details, and Section 3.3.2.3 below). It is thus clear that DIALANG users sometimes face technical problems that range from serious breakdowns of service to smaller hiccups. It is impossible to estimate the proportion of test sessions affected in this way, but such problems seem to be frequent enough to feature at the top of the list of problems mentioned by the learners surveyed.

Besides the present author, Floropoulou (2002a) also studied the technical aspects of DIALANG (navigation, interface, and a range of other matters, some of which were clearly technical while others were more related to content). She observed and interviewed six informants who took DIALANG tests at Lancaster University and found quite a few technical problems with the system, ranging from serious (two of the test takers experienced a system crash) to minor. What is clear from her study, although she did not specifically mention it, is that the average user is likely to take some time to learn to use DIALANG; the first-time user is very likely to encounter points in the test taking and navigation process where he or she is not quite sure what to do and may have to resort to a trial-and-error approach in order to figure out how to proceed. However, this is probably true for almost any computer programme, as very few are so intuitive that the user immediately understands what to do, especially when the typical session with the programme consists of quite a few rather different stages and steps, as in DIALANG. Neither Floropoulou’s nor the present author’s study informs us about how soon a typical user gets acquainted with navigating through the different stages of DIALANG, but it is probably safe to assume that after taking a couple of tests the user will have a fairly good grasp of the system.

Floropoulou’s detailed study revealed for example the following problems or issues:

- Some users experienced crashes or slow functioning;

- In the beginning, the users were somewhat confused about what to do; in particular, they took time figuring out how to move forward in the programme;
- The fact that the forward and skip-to-next-section buttons were next to each other caused problems (there is, however, a warning after you click on the skip button, so that you can return to the previous point if you pressed it accidentally);
- The immediate feedback pop-up screen caused problems for navigation;
- The existence of the VSPT was puzzling at first: some users wondered why they were given a vocabulary test when they had just selected to take a listening test, for example;
- Some users did not realize that they also had to mark the non-words in the VSPT by answering ‘no’ to them;
- The popping up of the additional character box in open-ended items confused some users;
- Adjusting the volume in listening items was not clear to everybody.

Floropoulou’s informants also mentioned some points that are not technical problems but rather aspects of the system, or choices made by the designers, that some users find surprising or objectionable. These included:

- The user cannot go back in the test to review and change their answers;
- The user can only listen to the audio recording once, rather than twice, in most listening items;
- The user has to answer yes / no to the self-assessment statements (there are no intermediate categories such as ‘sometimes’);
- The user has to answer yes / no to the VSPT items (there is no ‘I don’t know’ option);
- The test does not have a time limit;
- Items are not categorized in terms of rubrics or content but follow each other in an apparently random fashion;
- The exact number of remaining items is not indicated (although there is a blue bar that shows roughly how far you are in the test at any given moment; see below).

The users also liked many technical and content aspects of DIALANG, including:

- The font and the size of the letters;
- The visually clear icons;
- Colours were considered warm;
- The chance for many users to read the rubrics in their native language was considered very user-friendly;
- Feedback on the match between self-assessment and test performance was considered useful;
- Useful instructions;

- The pull-down menus, the radio buttons for selection and the ‘combo box’ were considered highly innovative;
- The blue bar at the top of the screen during the language test showing progress in the test was liked by the informants.

Floropoulou’s (2002a) study illustrates very well how even a very small-scale but detailed observation- and interview-based investigation can yield a considerable amount of useful information about the technical aspects, in particular, of a computer-based language test. While some of the things considered problems by the informants may not be so regarded by the designers of the system, it is clear that Floropoulou’s study identified many of the technical problems that other users of DIALANG reported to us between 2001 and 2004. Her study thus confirms the argument by Nielsen (2004), quoted in Huhta (submitted), that to find out about the most serious technical navigation and interface problems in an Internet-based system it is typically enough to study only about five users.

Finally, the questionnaire survey by the present author suggests that the scoring of the open-ended questions was sometimes unreliable for at least some users (see Appendix 1). The study suggests that this may not be a widespread problem, however, as only about three percent of the respondents to the survey of users mentioned this as a problem with DIALANG (see Table 2 in Appendix 1). However, the informants in Floropoulou’s (2002a) small-scale study also experienced this problem, so we cannot be sure what proportion of users actually suffer from it.

3.2.3 Reliability of self-assessment statements

The test–retest type of reliability is often quoted as a prototypical example when reliability is defined in the measurement literature. Interestingly, retesting with the same test is hardly ever done in practice because it is highly problematic to administer the same test to the same test takers on two occasions just to see if the results of the two rounds of testing agree. First, it is impractical as it takes time and it is not easy to convince the test takers and their teachers (if it is a school context) of the need to be subjected to so much testing for apparently no direct benefit to the participants. Second, it is possible that the test takers remember some of the test content and they may even have learnt something from the first administration of the test, which complicates the interpretation of the results. It may be that the most typical occasions where a test is actually administered twice to the same test takers

are studies that try to find out if learners have changed over time and where it has not been possible to develop truly parallel versions of the same test – in such cases the best approach may be to test the learners twice with the same instruments before and after the learning / teaching period whose effects one is interested in measuring (e.g. Huhta & Suontausta, 1996).

Interestingly, some of the problems inherent in the test–retest reliability of formal tests may not apply to self-assessment. For example, you do not ‘learn’ from the first administration of a typical self-assessment instrument in the same way as you might from a test measuring skills or knowledge. The practicalities and motivation involved in re-administering a self-assessment instrument may also be less problematic than with skills tests. Despite these considerations, the estimation of the test–retest reliability of SA instruments appears to be very rare, if not non-existent, in the measurement literature. Thus, it does not come as a surprise that there have been no studies into this type of reliability of the self-assessment instruments used in DIALANG.

It is easier to study the internal consistency of any measurement instrument because it can be done without administering the instrument more than once, so the calculation of indices such as Cronbach’s alpha is the most typical approach to estimating the reliability of tests. It should obviously be borne in mind that this is conceptually different from comparing the results of two administrations of the same test. Internal consistency of a test may in fact be much closer to construct validity than to reliability (when the latter is defined in terms of repeatability and consistency of measurement): an analysis of the dimensionality implied in internal consistency measures resembles a correlational, concurrent study of the relationships between the different tests of the same test battery, or between the test being validated and other tests measuring similar or different abilities.

There is one study into the internal consistency type of reliability of the DIALANG self-assessment statements, and it is based on the DIALANG pilot testing data (Alderson 2005, 100-101). Alderson’s analyses show that the internal consistency of the three sets of 18 SA statements chosen for the operational version of DIALANG was quite high: the Cronbach alpha for reading was .814, for writing .874, and for listening .824 (estimated alphas for hypothetical 45-statement versions of the SA instruments were .916–.945).
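Assuming that the estimates for the hypothetical 45-statement versions were obtained with the standard Spearman-Brown prophecy formula (a reasonable assumption, since that formula reproduces the reported range), the calculation can be sketched as follows; the code is illustrative only.

# Spearman-Brown prophecy formula: predicted reliability when a test is lengthened
# by a factor k. Applied here to the reported 18-statement alphas as a check.

def spearman_brown(alpha: float, k: float) -> float:
    """Predicted reliability of a test lengthened by factor k."""
    return (k * alpha) / (1 + (k - 1) * alpha)

k = 45 / 18  # 2.5 times as many self-assessment statements
for skill, alpha in {"reading": 0.814, "writing": 0.874, "listening": 0.824}.items():
    print(f"{skill}: alpha(18) = {alpha:.3f} -> predicted alpha(45) = {spearman_brown(alpha, k):.3f}")
# reading -> 0.916, writing -> 0.945, listening -> 0.921, i.e. the reported .916-.945 range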


3.3 Construct validity, authenticity, and interactiveness

Three aspects of test usefulness in the Bachman and Palmer framework – construct validity, authenticity, and interactiveness – are quite closely connected (see Figure 1). This is why the discussion of them below is intertwined, especially when the a priori theoretical validation of DIALANG is concerned. Largely the same documents and design procedures were used to ensure that the system, when it was ready, would have the characteristics that are captured in the concepts of construct validity, authenticity and interactiveness in the Bachman and Palmer framework. When it comes to analysing empirical evidence, it is somewhat easier to distinguish which type of evidence sheds light on each of the three aspects of usefulness, and the presentation of such evidence will be organised accordingly in the passages below.

The analysis of the construct validity, authenticity and interactiveness of DIALANG below is not limited to the language tests found in the system. Because DIALANG is not just a set of language tests but rather a more comprehensive assessment system that makes use of modern technology, it is necessary to consider how these aspects of test usefulness relate to such parts of the system as the self-assessment tasks, the whole range of feedback (some of which is very unusual in any testing system), and the computational and technological operations that underlie, for example, the delivery of the test tasks, the capturing of users’ responses, the scoring of the responses, the computation of the results, and the presentation of the different types of feedback.

According to Bachman and Palmer (1996, 21), construct validity is about the interpretations made on the basis of test results: To what extent are the interpretations justifiable and appropriate? (see also Figure 1 below). A construct is a specific term for the ability that a test intends to measure, for example, foreign language ability or an area of that ability. Construct validity thus means “the extent to which we can interpret a given test score as an indicator of the ability(ies), or construct(s), we want to measure” (op cit, p. 21). Several different kinds of evidence can be used to build a case for the construct validity of a particular interpretation of test scores, including content relevance and coverage, concurrent criterion relatedness, or predictive utility (op cit., p. 22).

Authenticity in the Bachman and Palmer framework relates to the relationship between test takers’ performance on the test tasks and tasks outside the test situation: What is the domain of generalization to which one wants to generalize the interpretations made from the test scores? The key consideration in determining the authenticity of a test is the correspondence between the test tasks and the tasks in the target language use domain (domain of generalization). The degree of a test’s authenticity can be evaluated by detailed analyses of the characteristics of the tasks in the test and in the target domain. Such characteristics include the setting, input, expected response, and the relationship between input and response. One of the reasons why it is important to study authenticity of a test is that it affects the test takers’ perceptions of the test and thus potentially their performance on it (op cit., p. 24).

[Figure 1 is a diagram of score interpretation: the test score is linked to inferences about language ability (the construct definition) through construct validity and to the domain of generalization through authenticity, while interactiveness links language ability and the characteristics of the test task.]

Figure 1. Relationships between construct validity, authenticity, and interactiveness in the Bachman & Palmer framework (1996, p. 22)

The third part of this group – interactiveness – can be defined with reference to both construct validity and authenticity. Interactiveness is about the extent to which the test tasks engage the test taker’s areas of language ability, personal characteristics, topical knowledge,

strategic competence, and affective schemata (Bachman & Palmer 1996, 22; 152). The key point is thus the interaction between the test taker and the task.

A few words are in order about the relationship between the three aspects of usefulness described above and certain aspects of construct validity in Messick’s validity framework. Interestingly, Messick’s content aspect appears to be divided between Bachman and Palmer’s construct validity, authenticity, and interactiveness. Content is part of the construct definition in their framework and one way of providing evidence for the construct validity of a test involves examining the relevance and coverage of test content (Bachman & Palmer 1996, 22). Content is also part of authenticity because content is obviously a key element of the input and output (response) of a task – be it a test task or a ‘real life’ task. And topical knowledge (i.e. content) is also an aspect of interactiveness, as one of the validity questions concerning interactiveness is to what degree the task engages the test taker’s topical knowledge in the same way as the target language task would (op cit., p. 152).

Messick’s substantive aspect of construct validity comes close in meaning to Bachman and Palmer’s interactiveness, as the former is about test takers’ responses and the processes underlying those responses, and the latter concerns the interaction between the test taker’s (cognitive and affective) characteristics and the task. Messick’s structural aspect of construct validity concerns the match between the scoring criteria and rubrics used in the test and the construct domain to be tested, which is clearly a construct validity question in the Bachman and Palmer framework (op cit., p. 151).

Finally, the generalizability aspect in Messick’s model, which relates to the extent to which the interpretations drawn from the test results generalize across different groups of test takers and across different settings and tasks, is close to what Bachman and Palmer include in their authenticity and domain of generalization. We should also remember that Messick includes reliability in generalizability, as was mentioned earlier, unlike Bachman and Palmer who consider it a separate element from the other aspects of usefulness.

3.3.1 Theoretical basis of the construct validity, authenticity, and interactiveness of DIALANG

DIALANG was the first major language assessment system that was systematically built on the Common European Framework of Reference (CEFR). Only the preliminary version of the CEFR (Council of Europe, 1995, 1997) was ready when the groundwork for the assessment system was done in the late 1990s, but that first version of the CEFR already contained the essential elements of the Framework whose final version was published several years later (Council of Europe, 2001). Thus, the main theoretical basis for the construct validity of DIALANG consists of the definitions found in the CEFR of what is involved in using and learning a language. The CEFR consists of two dimensions: the best known is the vertical one, i.e., the six-point scale of language proficiency that ranges from a beginning proficiency to a very good command of a language. In fact, the CEFR contains many such scales describing proficiency in a number of different, more precisely defined areas and communicative activities. Most of the scales found in the CEFR are not informed speculation, as most proficiency scales are; rather, they are based on solid empirical research by North (1995), which gives extra credibility to the theoretical arguments about the construct validity of DIALANG. The second, horizontal, dimension of the CEFR includes definitions of different conceptual categories of language use: domains of use, communicative activities, functions, topics, texts, and competences. However, as will be described below, the Project found the CEFR alone to be insufficient as the basis for defining the constructs of reading, listening, writing, structures and vocabulary, and it had to resort to a variety of other sources of information in order to construct the test specifications for DIALANG.

Some of the articles that comprise the present thesis include brief accounts of the basis and development of DIALANG, and provide theoretical and logical evidence for the construct validity, authenticity, and interactiveness of DIALANG. In particular, Huhta & Figueras (2004), Alderson & Huhta (2005), and Huhta (2008), but also the introductory parts of the other, more empirical articles give brief descriptions of the development of the system. However, the most comprehensive and detailed documents that pertain to the theoretical validation (cf. Weir, 1991, 2005) of DIALANG were the documents designed during the project to guide the development of the system. Since the articles written or co-written by the

present author cannot give a full picture of this aspect of the project, it is useful to elaborate here on the documents and procedures that underlie the development of DIALANG.

3.3.1.1 Diagnosis of L2 proficiency – what is it?

Before considering the theoretical arguments for the construct validity, authenticity, and interactiveness of DIALANG it is important to discuss the very notion of diagnosis of foreign language proficiency, because it was a major challenge to the Project due to the lack of attention paid to it by language testing researchers (Alderson 2005). Although diagnosis had received some attention from researchers working on the construct validation of certain large-scale language tests (e.g. Buck & Tatsuoka, 1998) or in certain areas of ‘alternative’ assessment (see Huhta 2008 for details), empirical research that could be relied on was scarce. DIALANG can thus be considered a pragmatic approach to tackling the problems in diagnosing foreign language proficiency in the absence of really useful information about the field. DIALANG has, however, turned out to provide important impetus to new research into diagnostic testing in the language testing community, thanks to the presentations and publications by several project members, notably by Alderson (2005), but also to all the other publications on DIALANG listed in the List of References at the end of this document. There is now clearly more research and development in progress that aims to understand language proficiency and learning better, and to diagnose it more appropriately; some of this interest manifests itself in a closer co-operation between second language acquisition researchers and language testers (this will be discussed in more detail in Section 3.4.4 on the scientific impact of DIALANG).

Part of the work carried out in the DIALANG Project and afterwards by project members on diagnosis involved obtaining a better understanding of the nature of diagnostic assessment and its relationship with other types and functions of assessment. A key characteristic of diagnostic assessment is that it aims to support learning, either by providing the learners themselves with information that can influence what they do, or by providing their teachers with information that can help them adjust teaching in order to improve their students’ language proficiency. Thus, diagnostic assessment must have a lot in common with other types of assessment that also aim at supporting learning. Huhta (2008) is an attempt by the present author to explore and clarify the relationship between diagnostic and other,

closely related types of assessment, and to try to define some of the characteristic features of diagnostic assessment.

Alderson (2005) contrasts and compares diagnostic testing with, for example, placement testing, with which it has often been associated in the language testing literature. Huhta (2008) reports on the intertwined history and development of diagnostic and formative assessment, and concludes that the two cannot clearly be distinguished. In fact, they have often been treated as almost synonymous in the general assessment literature. It seems possible, however, to distinguish formative / diagnostic assessment from proficiency / placement / summative assessment by making reference to the level of detail that they address (Figure 1 in Huhta 2008, page 473). Diagnostic / formative assessment typically focuses on the details of the content and skills to be learned and it also provides detailed information and feedback to learners and teachers. In contrast, the proficiency / placement / summative types of assessment are generally satisfied with getting an overall picture of the learners’ proficiency in order to award marks, grades, and certificates, or to place students on appropriate courses. In addition to the degree of detail involved, a second dimension that can be used in the analysis of the purposes of assessment is to see to what extent they derive their constructs and content from either (1) theories or other such general frameworks of language, or from (2) a specific curriculum or even a course or a textbook. This distinction helps us to distinguish between theory-related proficiency tests and curriculum-related summative assessment. Formative assessment is also typically curriculum-based, but, confusingly, diagnostic assessment can be either theory-based or curriculum-based, depending on which authority on (general) assessment you happen to read.

Viewed through the lens of this kind of analytical framework (Huhta 2008, 473), DIALANG turns out to be an unusual type of diagnostic test, which cannot easily be placed in the framework and compared with other types of tests. It is clear that DIALANG is based on a theory-like framework (i.e., the CEFR), rather than on any specific curriculum. It is less clear, however, which place DIALANG should occupy along the detailed vs. general axis, i.e., the vertical axis in Figure 1 (Huhta 2008, 473). In that framework, DIALANG is placed just above the middle point on that continuum, on the side of ‘less detailed content and feedback’, because much of what it does involves the measurement of, and feedback against, the 6-level CEFR scale, which can hardly be described as a very detailed approach to diagnosis. However, DIALANG also provides more detailed feedback on the sub-skills

measured by its tests, and importantly, on each individual test item. Thus, DIALANG has features that are characteristic of the ‘more detailed’ end of the continuum, which places it apart from the typically less detailed proficiency and placement tests.

3.3.1.2 Language tests

A key document for the construct validity and authenticity of DIALANG language tests is the DIALANG Assessment Framework (DAF) that defines which CEFR content categories (domains, activities, functions, etc.) are covered by the DIALANG tests (CALS & DIALANG Project, 1998a; see also Huhta et al. 2002; Alderson 2005; Alderson & Huhta 2005). In a way, the DAF is a summary of the most important categories of the CEFR. It differs from the CEFR in that some categories are excluded as not directly relevant to DIALANG (e.g., the domain of education) and in that it is supplemented by the elaborations on the various activities, functions, etc. that were found in the three earlier Council of Europe documents: the Threshold, Waystage, and Vantage level definitions (van Ek & Trim, 1991, 1999, 2000). These define in greater detail the prototypical content of the levels B1, A2, and B2, respectively, and they were included in the DAF in order to improve the validity (or construct validity and authenticity in the language of Bachman and Palmer) of the DIALANG assessment instruments. Checklists were designed for the item writing teams on the basis of the DAF categories to ensure that all relevant CEFR content was adequately represented in the items that they designed; the checklists also included categories derived from other sources than the CEFR (see below).

Because the CEFR and the Threshold, Waystage, and Vantage definitions were incorporated into the DAF, it could be argued that these Council of Europe documents are as essential to DIALANG as those documents developed by the project. In fact, the DAF did not reproduce all CEFR content categories but instructed the item writers to consult specific chapters in the CEFR for details.

Much of the construct validity and authenticity of DIALANG is thus based on the DAF. The key points for defining the constructs measured by DIALANG can be found in the sections of the CEFR that deal with communicative competences. The core linguistic and pragmatic competences defined in the CEFR are very familiar to language testers and applied linguists more generally, as they can be found in almost the same form in earlier, well-known

publications in the field (e.g. Bachman, 1990; Canale & Swain, 1980). The DAF specifies, however, that certain more general competences included in the CEFR are not the specific focus of DIALANG tests, although such competences are obviously present in some way in all language use, test-taking included.

The authenticity of DIALANG is obviously highly dependent on the successful capture in the DAF of the key CEFR (and other) categories pertaining to the situations in which the target language is used. DIALANG focuses on the use of language in the private and public domains and, at a rather general level, also in the occupational domain. It thus excludes situations which are typical of the educational domain and which take place in schools or other educational institutions. Very specific situations found in the occupational domain are not included in DIALANG either: in terms of content, there is thus only one version of DIALANG, which measures ‘general’ rather than job-specific language proficiency.

The main documents that guided the actual item writing were the five DIALANG Assessment Specifications (DAS), which detailed how the theoretical basis defined in the DAF and the CEFR was to be operationalized as concrete test items. Specifications were designed for all the five skills or areas of proficiency covered in DIALANG, and they defined in detail the sub-skills and sub-categories to be tapped by test items, as well as the format and other characteristics of the items (CALS & DIALANG Project, 1998b, 1998c, 1998d, 1998e, 1998f, 1998g). It should be noted that the Specifications made considerable use of other sources than the CEFR, as the CEFR is not a recipe for test development but a more general inventory of conceptual categories involved in language learning, teaching, and assessment.

In particular, the CEFR does not offer much support to the testing of vocabulary and structures, and thus, the specification of the categories for vocabulary and structural knowledge was drawn from other sources. Also, the elaboration of reading, listening, and writing skills was based on previous work on these areas not included in the CEFR (see Alderson 2005; Huhta & Figueras 2004 for details). The DAS not only complemented the obvious gaps in the CEFR, but better definitions for some categories described in the CEFR were also found in other literature on applied linguistics. For example, the definition of text forms and types was based on work by Werlich (1976, 1988) because it offered a more systematic basis than the CEFR for selecting texts for the reading tasks (Alderson 2005, 123).

However, the main problem with the CEFR as the basis for constructing tests was that it lacked detailed descriptions of the constructs of interest (reading, listening, etc.), especially from the cognitive point of view. Although the CEFR scales provide some help by describing at a fairly general level what the levels A1 to C2 mean in different areas of proficiency, they do not define the constructs of reading, listening, etc. at the level of specificity that would have allowed the DAS to define precisely how to write items, for example, for level B1. The item writers were asked to write items that covered a range of different levels, and they were also asked to estimate the CEFR level of the items they wrote, to ensure that all levels would be covered. However, based on earlier research (Alderson, 1990), it was expected that estimating the level of, say, reading and listening items would be difficult, which was why it was absolutely essential to pilot, calibrate, and standard set the items. Indeed, the comparison of the predicted and empirical item difficulties (i.e., CEFR levels) showed that the item writers were not very successful in predicting the actual level of the reading and listening items (Alderson 2005, 130 & 146).

It should be noted that the issue of using the CEFR for designing comprehension tests was later examined in greater depth in the Dutch Construct Project that developed a tool for linking the content of the test with CEFR categories (see Alderson et al. 2006 for details). At the time of the development of DIALANG, such tools were unavailable, however.

For defining writing in the DAS, extensive use was made of the model developed by Anneli Vähäpassi in the IEA International Study of Writing (Gorman, Purves, & Degenhart, 1988).

The only exception among the language tests in DIALANG was the Vocabulary Size Placement Test (VSPT), which was not based on the CEFR and which was not defined in the DAF and the DASes, except in passing when the entire test-taking process was described for the item writers in the specifications. The design of the VSPT was outsourced to Paul Meara, who had carried out a considerable amount of research on yes/no vocabulary tests similar to the VSPT. Meara designed the tests in cooperation with the DIALANG language teams, who checked the words and pseudo-words proposed by him. The VSPT is based on lists of verbs in the tested languages and it serves as a quick instrument for placing users of DIALANG on the most appropriate level of the reading, listening, writing, vocabulary or structures test that they chose (see Alderson 2005; Alderson & Huhta 2005; Huhta 2007a for more information about the design of the VSPT).


In addition to the Assessment Framework and Specifications, it was necessary to develop other, complementary documents to support further work on the test items, as they needed to be input into the item database, and an extra review of the items selected for piloting was also required. Also, the inputting and review of the translations of the texts included in the system needed specific guidelines to ensure their clarity and comparability across languages (see Fligelstone & Huhta, 2001; Huhta & Noijons 2002; Huhta 2002). Activities such as item review were necessary in any case, but the fact that DIALANG runs on computer software poses somewhat different demands for the inputting and review of test items and translations compared with paper-and-pencil testing systems, and these activities therefore needed specific guidance to succeed. Furthermore, the different components of the software were completed only after the initial phases of the Project, hence it was necessary to complement the DAS with these additional, software-specific guidelines once the exact characteristics of the software were known.

It is important to add that the various documents intended for item developers and reviewers and for translators were only one aspect of the test development process. The item writers could not be expected to design high-quality items just by reading the relevant documentation, no matter how clear and detailed it may be. Necessary aspects of item development were therefore the numerous training sessions and countless one-to-one consultations with item writers, translators, and reviewers via e-mail, phone, and in face-to-face meetings during the project. Such training and consultation were done by the members of the coordinating centre (Centre for Applied Language Studies) in Phase 1 of the Project and by the Test Development and Translation Coordinators in the later phases (see Appendix 2).

Besides the documents and procedures related to item writing and review, the validity of DIALANG is very much dependent on the piloting of test items (and self-assessment statements, see below), the analyses of the pilot test data that led to the construction of the tests with unique scoring rules for each (piloted) test, and the procedures to link the test results to the CEFR levels via standard setting. A detailed description of these aspects of the work is beyond the scope of the current thesis. Suffice it to say that the Project had to break new ground in these areas as well, as DIALANG was the first test to be linked with the CEFR levels and because most standard setting methods had been developed for pass/fail decisions rather than for placing cut-off points on a multi-level scale (see Kaftandjieva, Verhelst & Takala 1999; Kaftandjieva & Takala 2002; Alderson 2005). Some of the procedures developed in DIALANG turned out to be robust enough to contribute significantly to the Council of Europe’s Manual on linking examinations with the CEFR (Council of Europe, 2004a, 2004b, 2009; Figueras et al. 2005; Alderson et al. 2006).

The five DIALANG Assessment Specifications, together with the supplementary documents and procedures described above, formed an essential and more concrete level than the DAF in ensuring the construct validity and authenticity of DIALANG. They also contributed to the interactiveness of the system by laying out the context in which the test tasks were to engage the test takers’ language competences, topical knowledge, etc. in a way that closely resembles the way in which their competences are engaged in non-test situations. The DIALANG documents do not explicitly discuss interactiveness, as the Bachman and Palmer framework was not directly used in the test development work, but the issues at hand were obviously familiar to the Project. The DASes and the various review guidelines do, however, deal in several places with the relationships between tasks and language abilities in particular, and discuss how the two could be aligned and how construct-irrelevance could be avoided in designing tasks and items for specific skills, subskills, and areas of proficiency. The topics covered by the CEFR and DIALANG are rather varied but general in nature, and consequently, the specifications guided the item writers to select rather general topics (as defined in the CEFR and DAF) and to avoid highly technical and specific texts and topics that an average European adult would find unfamiliar.

It should be mentioned that there is an obvious contradiction between how DIALANG measures language proficiency and how language competences are defined in the CEFR, which may be an issue as regards the construct validity, authenticity, and interactiveness of DIALANG tests. DIALANG follows the traditional division of tests into reading, listening, and writing, and furthermore, into vocabulary and structures, whereas the model of language proficiency developed by Canale and Swain (1980), further refined by Bachman (1990), and incorporated in the CEFR, defines language abilities somewhat differently. It is difficult to say to what extent this discrepancy in format leads to a discrepancy in terms of authenticity and interactiveness. In general, it seems that DIALANG tests incorporate, for example, the ideas of sociolinguistic appropriacy and functional knowledge, which are central in the models of communicative competence by Canale & Swain and Bachman.

One area of mismatch seems obvious, however, namely integrated vs. skill-specific tasks, use, and processing of language. Each DIALANG test, and the tasks that comprise it, covers one major skill or area of language only; thus the tests do not engage users in the kind of interactiveness that involves integrating, say, listening and reading skills or listening and writing, except to a trivial extent when the learners have to produce very short replies to open-ended comprehension questions. Nor do the DIALANG tasks resemble the integrative tasks that are possible according to the CEFR and that certainly are common in everyday life – which is obviously an issue for the authenticity of the tasks and, thus, for the domain of generalization in the interpretation of DIALANG test results.

Although the integration of different tasks and skills is not unknown in language tests, it has nevertheless been quite rare in high-stakes examinations, for example. In this light, the recent revision of the TOEFL into the TOEFL iBT is a major event, as it involves a radical rethinking of language constructs and their operationalization in tests (Chapelle et al., 2008).

Authenticity and interactiveness appear to be somewhat difficult aspects of usefulness to apply to a diagnostic test, however. It should be clear from the above discussion that if target language use involves a lot of integration of skills, a skills-based language test may be found somewhat lacking in terms of authenticity and interactiveness. It is not surprising that a proficiency test such as the TOEFL adopts a much more integrated approach in its test design than has been the case previously – it is in fact somewhat surprising that it has taken this long for a major international testing system to break away from the traditional four-skills ‘straitjacket’. However, as Alderson (2005) argues, what is important for a proficiency test may be less relevant for a diagnostic test. In order to be able to provide the kind of detailed information about proficiency that is expected of it, a diagnostic test may have to be, at least partly, less communicative than most modern language tests are. Thus, it may be that authenticity and interactiveness – as defined by Bachman and Palmer – are aspects of usefulness in which the domain of generalization and the relationship between task characteristics and language abilities are much more complex in diagnostic tests than in many other types of tests.

3.3.1.3 Feedback and self-assessment

DIALANG is not only a battery of language tests, and there is, thus, more to the a priori construct validity, authenticity, and interactiveness of the system than what concerns the language tests themselves. Because DIALANG was intended to be a diagnostic system that helps its users to become aware of strengths and weaknesses in their proficiency and to plan further language studies, it was evident from the beginning that the system ought to provide learners with much more than just a single test score or mark. Diagnosing foreign language proficiency with the help of a large-scale, computerized system was quite novel. In fact, the whole area of diagnostic testing of L2 skills had not been studied much, which meant it was not obvious what kind of feedback the system should give and how. From the very beginning, self-assessment of language proficiency was considered one of the key parts of the system because of its central role in fostering learner autonomy and self-directedness (see Huhta et al. 2002; Huhta & Figueras 2004; Luoma 2004; Alderson 2005 for more information). A specific working group was set up to plan and design the feedback and self-assessment components for the DIALANG system (see Appendix 2).

The work on feedback and self-assessment became an important part of the a priori validation of DIALANG. Similarly to the DIALANG Assessment Framework and Assessment Specifications, the CEFR constituted an important theoretical basis for the development of self-assessment and feedback. The self-assessment instrument, in particular, was derived almost entirely and directly from the CEFR scale descriptors – and the comparison that users of DIALANG can make between their test results and self-assessments is based on the CEFR scale system. The CEFR was also used as the basis for the design of certain parts of the feedback, such as the descriptions of the CEFR levels that appear in the test results part of the system and in the ‘Advisory feedback’ (extended level descriptions), as described in Huhta & Figueras (2004; see also Alderson & Huhta 2005; Alderson 2005; see also Appendix C in the CEFR). However, some of the feedback available in DIALANG is not directly based on the CEFR but is grounded on specific test items (the review of answers to the items; information about the sub-skills tested by each item) or comes from other sources (the VSPT feedback; information about self-assessment; advice on how to make progress from one CEFR level to the next). Since the test items were developed with reference to the CEFR, feedback on individual items, too, could be seen to be linked with the Framework, at least indirectly.


In the piloting of DIALANG, two types of self-assessment were used. The first was a six-level scale that described the CEFR levels A1–C2 for a particular skill covered by DIALANG (i.e., reading, listening, writing). The second was a set of ‘can do’ statements related to the same skill as the overall self-assessment scale. Both SA instruments were derived directly from the CEFR scales, but some of the ‘can do’ statements were slightly modified to simplify the language for ordinary learners who may not have been familiar with some of the terms or expressions used in the original statements (see Huhta & Figueras 2004; Luoma 2004; both instruments can be found in 14 languages on the DIALANG Project website, which is accessible via the DIALANG website at www.dialang.org).

During the analyses of the pilot test results, the quality of individual self-assessment statements was examined, and the 18 best statements were selected to be included in the operational version of the assessment system. Analyses were also carried out on how the self-assessment statements could work as a pre-estimator of learners’ proficiency in the skill in question, because there were to be three levels of tests in DIALANG – easy, intermediate, and hard – and self-assessment was planned not only to provide feedback to the learners on the match between their self-assessments and test results but also to help the system place them at the most appropriate level of test. Thus, an algorithm had to be created, on the basis of the analyses, to estimate the learners’ likely level of proficiency from their pattern of responses to the SA statements; this also involved determining two cut-off points on the SA score scale for the automatic placement decisions that the system had to make. In addition, the result from the SA task had to be combined with the learner’s result from the Vocabulary Size Placement Test, if the learner took the VSPT, because the VSPT, too, was to be used for pre-estimation purposes (see Luoma 2004; Alderson 2005; Alderson & Huhta 2005, for more information on the development of the self-assessment for DIALANG).
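The placement logic described above can be illustrated with a short sketch. This is a minimal, hypothetical reconstruction only: the actual DIALANG cut-off values, the scoring of the SA statements, and the exact rule for combining the SA and VSPT results are not reproduced here, so the numbers and function names below are invented for illustration.

```python
# Minimal sketch, assuming hypothetical cut-off values; the real DIALANG
# cut-offs and combination rule differ and are not reproduced here.

def estimate_test_level(sa_score, vspt_band=None,
                        sa_cutoffs=(6, 13), vspt_cutoffs=(2, 4)):
    """Place a learner on an easy, intermediate, or hard test version.

    sa_score     -- number of endorsed 'can do' statements (0-18)
    vspt_band    -- banded VSPT result (e.g. 1-6), or None if the VSPT was skipped
    *_cutoffs    -- two cut-off points splitting each scale into three bands
    """
    def band(value, cutoffs):
        low, high = cutoffs
        if value < low:
            return "easy"
        if value < high:
            return "intermediate"
        return "hard"

    sa_level = band(sa_score, sa_cutoffs)
    if vspt_band is None:
        return sa_level                     # SA alone decides the placement

    vspt_level = band(vspt_band, vspt_cutoffs)
    # One simple way to combine the two pre-estimates: take the more cautious
    # (lower) placement when they disagree.
    order = ["easy", "intermediate", "hard"]
    return order[min(order.index(sa_level), order.index(vspt_level))]

# Example: many endorsed statements but a modest VSPT band
print(estimate_test_level(sa_score=15, vspt_band=3))    # -> intermediate
```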

For the operational DIALANG, only the more detailed self-assessment instruments based on 18 individual SA statements per skill were selected. The main reason for this was that two self-assessment tasks were considered unnecessary. Furthermore, some previous research suggested that the detailed SA might be easier for many learners as it describes more detailed instances of language use than the broader overall scales (see the review by Oskarsson 1984, 31). It could be mentioned, however, that the Project seriously considered changing the

content and functioning of the self-assessment system in the later stages of the project. The reason for this was that responding to the detailed SA statements does not allow learners to form a clear idea of where they stand on the CEFR scale in the skill tested – the learners cannot see a full CEFR scale and ponder which level on the scale best matches their own idea of their proficiency. In fact, a learner who takes DIALANG for the first time does not see a full six-point CEFR scale until reviewing his or her test result. The Project considered making the overall self-assessment based on the six-point scales the main SA instrument of the system – and perhaps even a compulsory part of the test-taking process – and making the detailed SA optional, for those who would be interested in more in-depth self-assessment activities. However, the need to allocate all programming resources to completing the missing vital elements of the system prevented any substantial changes to the way the self-assessment works in the current version of DIALANG.

The a priori construct validity, authenticity, and possibly also interactiveness of the feedback and self-assessment statements are to a considerable extent dependent on the definitions and scales found in the CEFR and on their operationalization in the DIALANG system – as is the case with the test items. Thus, much of the quality of the self-assessment statements and of the feedback that relates directly to the language tests (the main test result on the CEFR scale and the comparison of SA and test result, but also the extended CEFR level descriptions and perhaps the item-level feedback) depends on the quality of the CEFR. However, some of the feedback is not derived from the CEFR but from a range of other sources, and there the quality of the sources as well as the expertise of the persons making the selection (i.e. the Self-assessment and Feedback Working Group) is decisive.

It should be noted that a priori validation in terms of construct validity, authenticity and interactiveness is rather more straightforward in the case of self-assessment than feedback. Self-assessment instruments are very similar to language tests: in both of them the learner completes tasks or answers questions, and the outcome of both is a score, grade or level. Feedback is not, however, a unitary concept (see also Mory 1996, 2004). Some of it is inseparable from the tests – for example, the test result is feedback that is directly related to the test. Thus, whatever applies to the a priori validation and theoretical arguments for the three aspects of the usefulness of DIALANG presented above also applies to feedback on test results, and, as was just pointed out, to the feedback on self-assessment and its match with the test result. In addition to the test result (i.e. the CEFR level and its verbal description),

other DIALANG feedback that is directly related to the language tests includes the extended descriptions of the CEFR levels, which learners can study separately from the regular, short description that the users of DIALANG see when they view their test results. The immediate and post-test review of responses to individual test items is also such test-dependent feedback, as is the display of results categorized by sub-skills or sub-areas of reading, listening, writing, vocabulary, and structures that is part of the delayed, post-test item review.

DIALANG contains at least two types of feedback that may not lend themselves to be evaluated in exactly the same way as the tests and test-dependent feedback: advice to the test taker on how to improve his/her proficiency towards the next CEFR level in reading, listening, or writing, and the explanations of why a learner’s self-assessment may not match his/her test result.

In a certain way, these two types of feedback are nevertheless related to the results of the tests and self-assessments. The advice on making progress towards higher CEFR levels is language-, skill-, and level-specific. If you have taken a Danish reading comprehension test in DIALANG with the result of ‘C1’, and you wish to see what the ‘Advice’ section of the feedback menu contains, you will be taken to the screen that displays the extended description of what typical learners at your level (i.e. C1) can read, and you can compare that with a description of the two adjacent levels, B2 and C2. From that screen, you have direct access to the section of advice on language learning that is targeted at improving level C1 readers’ skills so that they could reach the next level, C2 (see Figure 2). Thus, the advice is level-specific, and the relevance (usefulness) of the advice is obviously dependent on the correctness of the reading test result. Some of the advice also mentions the language tested, but this does not mean that the advice is relevant only to that language – Danish in this example. It rather illustrates one of the ways in which DIALANG feedback was individualized for users.


Figure 2. Example of advice-type feedback in DIALANG: advice given to a user who has taken a Danish reading comprehension test, with English as the interface language, and who has received level C1 as the test result.

However, the Self-assessment and Feedback Working Group recognized the impossibility of categorizing advice in such a way that only certain advice would be given to those who had received A1 and who wanted to progress towards A2, whereas different advice would be given to those moving from A2 to B1, and so on. No existing theory on language learning could guide the linking of advice and proficiency levels, and thus experience of language educators, textbook writers, and whatever analyses existed of the ways in which different learners go about learning languages had to be used in deciding which advice to place at different points of the CEFR scale. Certainly, some of the advice makes sense to beginners only whereas only advanced learners can benefit from doing certain other things, so it is possible to rank the advice in very broad terms according to learners’ proficiency level. In the current version of DIALANG, however, the advice is displayed level by level and the user can only read one set at a time. This is not the ideal solution but it was technically and

navigationally the least problematic solution. Direct and easy access to the sets of advice for the other levels is provided to the users via buttons on the upper part of each advice screen (see Figure 2 above), which partly compensates for the difficulty the Project had in deciding, in a theoretically sound way, which advice to give to learners at different CEFR levels.

The feedback that gives DIALANG users information about possible reasons for a mismatch between self-assessments and test results is also related to test results, at least to an extent. These explanations are accessible to anybody taking DIALANG tests, but the likely readers are obviously those whose self-assessment feedback indicated a mismatch between their SA and test result. Thus, the relevance of the explanations is to an extent dependent on the validity of both the test results and the self-assessments. If either the test result or the result of the self-assessment is not accurate, feedback on the mismatch – or the match – between the two is not accurate either. However, some of the information in this section of the feedback goes beyond describing potential reasons for a mismatch and elaborates on, for example, what kind of information about one’s proficiency different kinds of language tests can or cannot provide. These texts, then, take a slightly different approach to raising learners’ awareness of language learning and proficiency from the other feedback in DIALANG.

As was the case with deciding where on the CEFR scale to place different pieces of advice on language learning, the lack of a sound theoretical basis and the scarcity of relevant research results hampered the design of the explanations for mismatches between SA and test results. Again, common sense, educators’ experience and whatever little research existed had to be relied on (e.g., McFarland & Miller 1994, and the studies reviewed in Oskarsson 1984).

To summarize the theoretical arguments for the construct validity, authenticity and interactiveness of the two types of DIALANG feedback discussed above, it could be said that they are partly based on the test and self-assessment results, which means that the quality of the tests and self-assessments partially determines their quality and usefulness. However, the fact that the content of the two sets of feedback (advice and reasons for SA / test mismatch) is not based on such developed theories or frameworks as the tests themselves obviously makes their a priori usefulness less certain. It is also less straightforward to conceptualize what construct validity, authenticity and interactiveness actually mean for these two types of feedback. Perhaps the best one can say about them is that if this feedback manages to capture,

for example, important metacognitive skills that relate to learning to learn and awareness about learning, they can be argued to possess construct validity and authenticity. And if the feedback can actually make learners engage their metacognitive skills when reading such feedback, and perhaps also afterwards in different learning contexts, then the feedback may be argued to be high in terms of its interactiveness.

3.3.1.4 Computer as a testing platform

As DIALANG runs on computer software, considerable time and resources had to be devoted to software design and programming (Alderson 2005). It is essential for a computerized assessment system to function in a way that makes the test-taking experience as smooth as possible so that, for example, the way in which the learner interacts with the test tasks is as ‘natural’ as possible. For example, it should be clear to the learner how he or she ought to respond to different kinds of items, and navigation in the system should be as easy and intuitive as possible. Thus, the design of the system as a whole, and in particular its interface, was a matter of great importance to the project, and several reports, guidelines and specifications were drawn up in this area to ensure that the software underlying the system, as well as its interface, was on a par with its content (e.g., Fligelstone & Treweek 2002; Fligelstone 2001; Scott 2001; Figueras et al. 2000).

Since DIALANG is a computer-based assessment system, the medium by which its users take the tests differs from the medium that is still the most familiar one to most test takers (i.e., paper and pencil). It is possible that computerized tests measure somewhat different constructs than paper-and-pencil tests (Chapelle & Douglas 2006, 12-15). The issue is particularly important with computerized tests that automatically rate speaking and writing performances, including answers to short-answer questions, because the algorithms developed to analyze test takers’ language do not use exactly the same criteria as a human rater would – or, more precisely, computers can rate performances by using only some of the criteria that human judges can potentially use. Since DIALANG does not analyze learners’ performances but only compares their responses with the list of acceptable answers that have been input for each individual test item, this main construct issue with computerized tests is less relevant in its case. However, it is still true that the use of DIALANG requires certain basic IT skills and some minimum level of familiarity with computer programmes – simply because it is a computer programme. It is also possible that computer anxiety and attitudes to

computers may affect the way the learners can demonstrate their language proficiency via a computerized test (McDonald 2002). Because DIALANG tests have some fill-in and short-answer items where the learners have to write one or more words in order to complete the task, the construct validity of such items is potentially an issue. It is possible that some correct answers get marked as incorrect because the list of acceptable answers in the scoring key in the system is not complete.
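As a rough illustration of the key-matching just described, the sketch below scores a short-answer response against a list of acceptable answers. The normalization steps (lower-casing, collapsing whitespace) are assumptions for the sake of the example; the actual DIALANG scoring rules are not reproduced here.

```python
def score_short_answer(response, acceptable_answers):
    """Return 1 if the normalized response matches any key entry, else 0."""
    def normalize(text):
        return " ".join(text.strip().lower().split())

    key = {normalize(a) for a in acceptable_answers}
    return 1 if normalize(response) in key else 0

# A correct answer that is missing from the key is scored 0 -- the construct
# validity risk mentioned above.
print(score_short_answer("Colour", ["colour", "color"]))    # -> 1
print(score_short_answer("hue", ["colour", "color"]))        # -> 0
```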

The main issue with a computerized test like DIALANG is probably the fact that the test tasks used in it may change the typical interaction between the learner and the task, even if the task on the computer screen looks exactly the same as on paper. In the language of the Bachman and Palmer framework, the interactiveness between the test takers’ characteristics and a computerized test may be different from interactiveness with a paper-and-pencil test. The likelihood of the interaction being very different is obviously greater with computer-based tests that use very innovative items that cannot be used in paper-and-pencil tests (Chapelle & Douglas 2006, 40). As the current version of DIALANG uses only very traditional item types, such a threat (if it is considered a threat in the first place) is quite small.

Despite the worries that computerized tests measure somewhat different constructs than paper-and-pencil tests and that the interaction between test takers’ characteristics and test tasks may be different, it was clear from the beginning of the project that the computer would be the only feasible platform for a large-scale diagnostic test such as DIALANG. Delivering feedback to large numbers of learners across many languages, skills, and sub-skills in a reasonable time and at reasonable cost would be unthinkable with a paper-based system. For the feedback to be effective, it needs to be delivered to the learners as soon as possible – and certainly not after several weeks or months, as is the case with international paper-based examinations. Thus, although there were obvious practical considerations that dictated the format of DIALANG, the purely construct-related considerations on how a diagnostic testing system delivering feedback should work also made the computer the obvious choice as the delivery platform (see Alderson 2005; Mory 1996, 2004). In addition, the Project’s intention was to use more innovative item types in DIALANG than those its first operational version uses, item types that would fully utilize the strengths of IT (Alderson 2005, 221-242, and http://www.lancs.ac.uk/fss/projects/linguistics/experimental/start.htm), precisely because certain new item types might be better for diagnostic purposes than, for example, the

traditional multiple-choice and gap-fill items. Furthermore, the computer is increasingly the default medium for almost anything in the modern world, at least in the developed countries, and the issue of the difference between computerized and paper-and-pencil tests must therefore be considered in a different light – it is all right for computerized tests to be different as long as their users can be familiarized with the specific skills and knowledge that taking them might require.

3.3.2 Empirical evidence about the construct validity, authenticity and interactiveness of DIALANG

This section will present empirical evidence mainly about the construct validity of DIALANG. However, the evidence discussed below is not purely about construct validity in the sense defined by Bachman and Palmer (1996), although the great majority of the evidence focuses on that aspect of usefulness. The discussion starts by making the point that the distinction between reliability and construct validity is not necessarily always clear and that the evidence about the reliability of DIALANG in fact contributes at least partly to its construct validity. The second reason why the discussion below is not only about construct validity is the intertwined nature of the construct validity, authenticity, and interactiveness aspects of usefulness, as was elaborated in Section 2 above (see also Figure 1). Furthermore, there is rather little direct empirical evidence about the authenticity and interactiveness of DIALANG, which also makes it sensible to discuss them together with the far more extensive evidence on construct validity. Thus, the implications of the various research findings for the authenticity and interactiveness of DIALANG will be mentioned below only when the studies appear to shed light not only on construct validity but also on these other closely related aspects of usefulness.

3.3.2.1 Evidence about reliability is also evidence about construct validity

Bachman and Palmer (1996) make a distinction between reliability and (construct) validity, as most measurement specialists do. The two concepts are closely related because reliability is generally considered a necessary but not sufficient prerequisite of validity. Also, the two are sometimes so intertwined that they cannot be separated easily. For example, it is not straightforward to decide if the accuracy of the placement procedures used in DIALANG is a matter of reliability or validity. The study of the Vocabulary Size Placement Test (Huhta

2007a) was presented in Section 3.1.2 as evidence of certain problems in the reliability of the placement procedure. However, if one considers the prototypical validity questions ‘Does the VSPT test what it is supposed to test?’ or ‘Are the inferences made from the VSPT results correct?’, the problems discovered in the VSPT might as well be considered validity problems. It was found that one of the reasons for unreliable, or invalid, VSPT scores was a ‘wrong’ test-taking strategy, namely guessing. Is that a reliability or a validity problem? Or perhaps the problem was that the interaction between that test and some test takers’ language proficiency did not work the way the test designers had planned. If this interpretation is the most accurate one, then at least some of the empirical evidence about the reliability of the VSPT is in fact (also) evidence about the interactiveness – or problems with it – of that part of DIALANG.

The point here is not to try to establish beyond reasonable doubt the differences or similarities between reliability and validity in the Bachman & Palmer framework but to note that since reliability is usually considered a necessary prerequisite for validity, all reliability evidence can also be considered one type of validity evidence. Thus, all the empirical evidence about the reliability of DIALANG presented above and detailed in Huhta & Alderson (2005) and Huhta (2007a & 2007b) also contributes to the validity of DIALANG.

3.3.2.2 Concurrent, criterion-related validity evidence

A very common approach to studying construct validity empirically is to examine how the test of interest relates to other tests that are known or believed to measure either very similar or very different properties (see e.g., Cumming & Berwick 1996; Chapelle 1999; Weir 2005). Tests that are supposed to measure very similar skills should produce results very similar to those of the test of interest, whereas tests measuring very different skills should yield results that are only remotely, if at all, related. The tests (i.e. external criteria or criterion measures) with which a particular test is compared may be part of the same test battery that was administered to test takers at the same time as the test of interest, or they may be truly external measures that the learners take either before or after the test being studied. A somewhat different type of use of external criteria is involved in predictive studies. In such research, investigators are interested in finding out whether the language test scores obtained by different groups of learners are in line with what is known about the proficiency of the groups

(e.g. the number of years they have studied the language tested) or whether the test results correctly place learners in different types or levels of courses.

There is some empirical evidence about the concurrent validity of DIALANG tests as concerns all three types of evidence above: the relationships (1) among different DIALANG tests, (2) between the DIALANG tests and external measures of proficiency, as well as (3) the ability of DIALANG tests to predict the proficiency levels of different, well-defined groups of learners.

A. Internal structure of DIALANG test batteries

The first type of concurrent validity evidence traditionally concerns the correlations between different tests in the same battery of tests, as they indicate to what extent the tests measure the same or different aspects of, for example, language proficiency. If the battery happens to include two or more tests that are intended to measure the same area of ability – for example reading – it can be expected that these reading tests correlate more strongly with each other than with, say, tests of speaking or listening abilities.

The interpretation of concurrent, correlational validity evidence requires at least two kinds of information to be meaningful: information about the constructs that the tests purport to measure and information about the test takers’ abilities. We need to have a fairly good idea of what constructs each of the tests in the test battery is likely to measure, on the basis of theory, previous research into the constructs in question, and the test specifications for the tests to be studied, to name the most obvious sources of a priori information about the relationships between the tests and constructs. In the case of DIALANG, a conscious effort was made to create tests that would tap five different areas of language proficiency: reading, listening, writing, structures, and vocabulary. Thus, we would expect the tests not to have perfect inter-correlations. If two tests were found to correlate with each other extremely strongly, one could question the need for, and the practicality of, testing that same skill with two tests instead of only one. On the other hand, previous research indicates that language skills tend to be correlated and that it is often possible to find one common ‘general’ language factor underlying batteries of language tests (see Skehan’s 1988 review). Thus, it could be expected that the DIALANG tests, too, would correlate with each other. Not to find

at least moderate correlations between the tests would in fact be very surprising and difficult to explain, and it would probably lead to a suspicion that something is wrong with the tests.

Secondly, we need to have a fairly good idea of the test takers’ relevant background characteristics, the most important of which is probably their level of proficiency. This is because previous research suggests that the relationship between language skills is likely to differ depending on the learner’s overall level of proficiency. Specifically, research into the structure of language proficiency in the 1970s and 1980s demonstrated that the proficiency of advanced learners tended to be more uniform than that of beginners (Skehan, 1988, 213). That is, it is common for advanced learners to have developed all or most of their language abilities to a fairly high level – it is somewhat uncommon to find otherwise advanced learners whose reading or speaking, for example, significantly lags behind the other skills. Their profile of language skills could thus be described as ‘even’. In contrast, it is not at all uncommon to encounter beginners or intermediate learners who have developed one or two areas of proficiency to a clearly higher level than the other areas. Probably the most important explanation for this is simply that beginning learners have had less time and fewer opportunities to learn and use the language than advanced learners, and thus their different individual needs and interests cause them to focus on certain types of language activities that involve some skills more than others. Some may develop good oral communication skills but may not need to write or read much in what they do. Others may need to read and study a lot but lack the opportunity or need to speak in the foreign language. All this results in what can be described as an ‘uneven’ or ‘peaked’ skill profile.

In order to interpret correlational evidence for the internal structure of the DIALANG tests, we thus need to know not only about the theoretical basis of the tests; it would also be useful to know as much as possible about the language learners whose test performances are used for concurrent validation purposes. It should be mentioned at this point that knowing about the test takers and their likely level of proficiency is also important for interpreting concurrent validity evidence that is based on comparing the tests with some external measures, such as other language tests, teacher grades or even self-assessments.

In the light of the above, it may thus be useful also to study the profiles of test results across the tests, in addition to the magnitude of the correlations between different DIALANG tests (test

result profiles and (rank-order) correlations between tests are obviously related but do not yield exactly the same information about learners’ proficiency across skills). This is obviously somewhat tentative because the design of the studies reported here varies, as do the learners studied in them, and we may not always know as much as we would like about the learners involved. However, since quite a lot of such evidence exists, it would be unwise to ignore it because it can potentially provide evidence for or against the interpretations we can draw from DIALANG results.

Should we assume that the levels in the numerous CEFR scales for different skills (e.g., reading, writing, speaking) or for different activities (e.g. writing notes and messages, listening to public announcements) are equally easy or difficult? Is writing a letter at level B1 as demanding as understanding interaction between native speakers at level B1?

It seems unlikely that full cognitive, social or linguistic correspondence could ever be established between such different activities or types of skills. Admittedly, the CEFR does not claim that such correspondence exists. What it does argue is that B1 in one language can be thought to correspond, in communicative terms, to B1 in another language. However, the fact that the same six main levels A1–C2 are used in the CEFR to describe proficiency in a range of very different linguistic skills and activities certainly implies that some rough equivalence exists between, say, B1 reading and B1 speaking. The CEFR ties performance across various skills together with the help of larger-scale contexts such as ‘everyday communication’ or ‘basic personal needs’. For example, it is characteristic of level B1 that the skills and activities typically involved at that level concern common, everyday situations related to work, travel, and everyday living in which fairly straightforward, non-complex communication is required. It is thus this context that helps one to define and understand how well different aspects of language should be mastered in order for the proficiency to be called B1.

The DIALANG Project neatly avoided the issue of equivalence of skills by designing different tests for reading, listening, writing, vocabulary, and structures, and by scaling and constructing these tests independently of each other. DIALANG aims to be diagnostic by providing feedback separately on the five areas of proficiency, and it does not provide overall results for a language across skills – unlike certification tests that may only report an overall grade. From that skill-specific point of view, it is not crucial to establish equivalent levels across the different skills tested. This was, then, the main reason why the pilot test results

were analyzed, and the test items calibrated, separately for each skill, in the languages where piloting was completed before the project came to an end.

On the other hand, one of the aims of DIALANG is to provide the learners with a profile of strengths and weaknesses, and one obvious profile that the learner can obtain from the system is his/her level of proficiency across the skills. The system does not, in fact, show a summary profile of results based on all the tests that the learner has taken, because the results are reported only one test at a time, but interested users are likely to write down their results and note whether there are lower results (i.e., weaknesses) and higher results (i.e., strengths) in different main skills, and may then plan their further language studies accordingly. Thus, DIALANG cannot totally ignore the question of the equivalence of levels across tests. In hindsight, it would have been interesting to run the calibration analyses twice, first skill by skill (as was done), and then to re-calibrate all piloted items across all five areas of proficiency together. The latter might have allowed the Project to link all five tests together with the help of a common scale. Obviously, such a joint analysis would have required decisions on issues such as what to do when slightly different items turn out to be misfitting depending on the type of analysis (joint vs. skill-specific). Overall, it is possible that a joint analysis of five different areas of proficiency would have resulted in problems because Rasch analyses assume unidimensionality, i.e., that all items test the same skill (see, however, Henning, 1992, who argues that it is possible to achieve psychometric unidimensionality even if the measured construct is multidimensional).

What can we expect to learn from the examination of the DIALANG score profiles discovered in various studies? Probably the most important finding would be if it could be established that one or more of the five tests in a particular language yield results that are systematically lower or higher than the results of the other tests, even with groups of learners who are likely to have rather even skill profiles. This would indicate either that the way the CEFR defines a particular skill (reading, writing, or listening in this case) is somehow different from the others or that there is something wrong in the way DIALANG converts test scores to CEFR levels. The phase of work that is most likely to cause a ‘wrong’ conversion between a test score and the CEFR level that is reported to the learner is the standard setting activities (see Alderson 2005 for details on standard setting). Standard setting largely depends on human judgement and is thus prone to bias no matter how carefully it is done. Different standard setting methods have also been shown to produce

different cut-off points and thus change the results reported to the test takers (see e.g. Cizek 2006; Kaftandjieva 2004). Particularly vulnerable parts of DIALANG when it comes to the linkage with the CEFR levels are the structures and vocabulary tests, because there was very little in the CEFR that could be used as a basis for constructing these tests and the descriptive structure and vocabulary scales used in the standard setting exercises for these skills (see Alderson 2005).
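How cut-off points turn an ability estimate into a reported CEFR level can be sketched as follows. The cut-off values below are invented thresholds, purely for illustration; the real DIALANG cut-offs vary by language and skill and result from the standard setting described above.

```python
from bisect import bisect_right

CEFR_LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]

def to_cefr(ability, cutoffs):
    """Map an IRT ability estimate to a CEFR level using five ascending cut-offs."""
    return CEFR_LEVELS[bisect_right(cutoffs, ability)]

example_cutoffs = [-2.0, -1.0, 0.0, 1.0, 2.2]   # hypothetical logit thresholds
print(to_cefr(0.4, example_cutoffs))            # -> B2
```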

On the whole, all DIALANG profiles should be interpreted with caution. However, if we notice that a DIALANG test profile is rather even when the test takers have mostly been advanced learners, this may provide some validity evidence for DIALANG because this would be consistent with the findings that advanced learners have fairly even skill profiles. Also, if we notice that beginners’ or intermediate learners’ DIALANG profiles are at least partly uneven, this would provide similar validity evidence. However, the discovery of an uneven profile with advanced learners or an even profile with beginners would be more complex to interpret because it is entirely possible to find groups of advanced learners with uneven profiles and beginners with even profiles – the tendencies found in language testing literature are only tendencies and may not apply to all groups of learners.

There are at least four studies that provide us with information about the internal structure of DIALANG tests, either in terms of correlations between the tests or in terms of score profiles across the tests, or both. The first is based on the English pilot test data reported in Alderson (2005) and Alderson and Huhta (2005). The others are based on using the operational version of DIALANG for research purposes at a Finnish polytechnic university (Jaatinen 2005) and across a range of Finnish and German educational institutions (the present author’s survey of DIALANG users reported in Huhta 2007a, 2007b, and this paper). Also, a German study on university students’ English proficiency yielded some relevant information (Peschel et al. 2006).

a. The DIALANG pilot test study

The first study that yielded empirical evidence about the internal structure of DIALANG tests was based on the pilot test data and concerned only the English tests. The results of the study are reported in detail in Alderson (2005) and more selectively in Alderson and Huhta (2005). Because of the design of the pilot booklets, the informants only took certain combinations of

skills (e.g. reading and grammar items or listening and vocabulary), which means that not all skills can be directly compared by computing correlation coefficients between them. However, because the test takers were randomly assigned to different pilot test booklets, it should be possible to compare the distribution of test scores across skills.

Table 3 shows that the correlations between those pairs of skills in the DIALANG pilot test data for English are quite strong but far from perfect. They are, thus, of the magnitude that could be expected from a test battery that includes tests that aim at tapping skills or areas of proficiency that can be theoretically and logically regarded as different skills. On the other hand, as is the case with most language test batteries, the tests are quite clearly associated with each other, which is in line with the idea of an underlying ‘general language’ competence that is often found in empirical research into language proficiency. These findings are thus in line with theoretical expectations and previous research. They also suggest that it is meaningful to provide separate diagnostic feedback on each of the skills tested in DIALANG because the tests appear to measure somewhat different aspects of proficiency.

Table 3. Correlations between different skills in the English pilot test data

            Structures   Vocabulary
Reading     .685         -
Listening   -            .650
Writing     .769         .795

Figures 3 and 4 display the distribution of the English pilot test results, as CEFR levels, across the five areas of proficiency. The previous studies (Alderson 2005; Alderson & Huhta 2005) do not report these, which is why the present author calculated them from the pilot test data for this paper, to show how the results were actually distributed across the CEFR levels in the five skills. The conversion of the pilot test takers’ scores (or IRT-based ability estimates, to be precise) to CEFR levels was done by applying two different sets of cut-off points for the English tests. As Alderson (2005, 61-78) describes, three different types of standard setting procedures were tried out during the development of DIALANG, and English was standard set twice. In addition, the programme developed at CITO by Norman

Verhelst to analyze the standard setting data also evolved over time and with experience, so that the cut-off points calculated from the same data were different, depending on the version of the programme.

Here, two different sets of cut-off points calculated for the English tests have been applied. Figure 3 presents the results of applying an older set of cut-off points, which are very close to the ones used by the current version of DIALANG, if not identical. Figure 4 displays the results of the pilot test data when a newer set of cut-off points is applied; these were calculated for the English tests during the detailed analyses of the pilot test data in 2003 but have not been implemented in the operational DIALANG.

[Stacked bar chart: percentage of test takers (0–100%) at each CEFR level A1–C2 in Reading, Listening, Writing, Structures, and Vocabulary.]

Figure 3. Results of the DIALANG English pilot tests with an older set of cut-off points applied

N: reading = 624, writing = 645, listening = 534, structures = 1084, vocabulary = 975

[Stacked bar chart: percentage of test takers (0–100%) at each CEFR level A1–C2 in Reading, Listening, Writing, Structures, and Vocabulary.]

Figure 4. Results of the DIALANG English pilot tests with the 2003 cut-off points applied

Let us examine what the application of the two different sets of cut-off points means for the internal (correlational) structure of the English tests in DIALANG. First, we can see that the older cut-off points are more lenient: when they are applied, there are many more test takers who are awarded C1 or C2 on the basis of their pilot test results (Figure 3) compared with the newer cut-offs (Figure 4). Second, we can notice that the shape of the score distribution changes more with some tests than others depending on which cut-off points are used. Listening is the test whose score distribution changes the least, which is also indicated by a very high (.928) rank-order correlation between the two ‘versions’ of the listening test in Figures 3 and 4. Writing also changes only a little (.912), although the number of C1-C2 recipients decreases considerably when one moves from the old to the new cut-offs. Vocabulary changes the most, with a correlation of only .781 between the results based on the two different sets of cut-offs. The reading results also change considerably if different cut-offs are applied (.822). The grammar results change, too, but not quite as much (.865).

Do different cut-off points change the correlations between the different tests? That is, does the evidence for the internal structure, and thus for the construct validity, of the DIALANG tests change if the cut-off points are changed? Table 4 examines this question. For the sake of comparison, the original Spearman rank-order correlation coefficients reported in Table 3 (and in Alderson 2005, and in Alderson & Huhta 2005) are included in this table. Because those correlations were calculated from the precise IRT-based ability values for each informant, they are obviously somewhat higher than the respective coefficients calculated from the converted CEFR-scaled results, which are based only on the coarser 6-point scale. What is of interest here is to compare the correlations for the two different sets of cut-off points, both of which refer to the CEFR scale.

Table 4. Changes in the correlations between different skills in the DIALANG English pilot test data depending on how they are calculated and which cut-off points are used (Alderson 2005, 188 & 205)

                                  Structures   Vocabulary
Reading
  IRT ability estimate            .685         -
  CEFR level (old cut-offs)       .673         -
  CEFR level (new cut-offs)       .562         -
Listening
  IRT ability estimate            -            .650
  CEFR level (old cut-offs)       -            .590
  CEFR level (new cut-offs)       -            .514
Writing
  IRT ability estimate            .769         .795
  CEFR level (old cut-offs)       .724         .723
  CEFR level (new cut-offs)       .669         .606

Table 4 displays a clear trend: for all the pairs of tests that can be calculated from the pilot test data, the intercorrelations are higher when the older cut-off points are used. The change in the magnitude of the correlations ranges from about .05 to .12, which is not a terribly big change, but it can nevertheless decrease the amount of shared variance between a pair of tests by as much as 15 percentage points. Overall, however, the correlations between the different DIALANG tests remain highly significant (at the .000 level) and moderately high irrespective of the cut-offs used, so that the overall conclusion about the construct validity of DIALANG presented above does not change.
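The comparison reported in Table 4 can be mimicked with a short sketch: the same ability estimates are converted to CEFR levels with two different sets of cut-offs, and the Spearman rank-order correlation between two tests is recomputed for each conversion. All numbers below are fabricated for illustration; the sketch only shows the mechanics, not the DIALANG data.

```python
from bisect import bisect_right
from scipy.stats import spearmanr

def to_level_index(ability, cutoffs):
    """Map an ability estimate to 0..5 (A1..C2) using five ascending cut-offs."""
    return bisect_right(cutoffs, ability)

old_cutoffs = [-2.2, -1.2, -0.2, 0.9, 2.0]    # more lenient (invented)
new_cutoffs = [-2.0, -1.0, 0.0, 1.1, 2.3]     # stricter (invented)

reading_theta = [-1.5, -0.3, 0.2, 0.8, 1.4, 2.1]   # invented ability estimates
grammar_theta = [-1.1, -0.6, 0.4, 0.5, 1.7, 2.4]

for label, cutoffs in (("old", old_cutoffs), ("new", new_cutoffs)):
    reading_levels = [to_level_index(t, cutoffs) for t in reading_theta]
    grammar_levels = [to_level_index(t, cutoffs) for t in grammar_theta]
    rho, _ = spearmanr(reading_levels, grammar_levels)
    print(f"{label} cut-offs: Spearman rho = {rho:.2f}")
```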


To further examine the potential evidence that the score distributions of the DIALANG pilot tests can offer, two extreme groups of learners were compared. The first group are beginning English learners, defined as those who reported having studied the language either less than one year or between one and two years (there were 44–107 such learners, depending on the skill). The second group are experienced or advanced learners, here defined as those who had studied English for at least nine years (254–427 learners per skill). As mentioned above, previous research indicates that beginners often have an uneven skill profile in the language they study, whereas the profile of more advanced learners can be expected to be more even.
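The extreme-group comparison behind Figures 5 and 6 amounts to cross-tabulating CEFR levels by skill within each group. The sketch below shows one way this could be done; the data frame and column names are assumptions for illustration, not the actual pilot data files.

```python
import pandas as pd

# Invented miniature data set standing in for the pilot test records
pilot = pd.DataFrame({
    "skill":         ["Reading", "Reading", "Listening", "Listening", "Writing", "Writing"],
    "years_studied": [1, 10, 2, 12, 1, 9],
    "cefr_level":    ["A2", "C1", "A1", "B2", "A2", "C1"],
})

groups = {
    "beginners": pilot[pilot["years_studied"] <= 2],
    "advanced":  pilot[pilot["years_studied"] >= 9],
}

for name, group in groups.items():
    # Percentage of test takers at each CEFR level, separately for each skill
    profile = (group.groupby("skill")["cefr_level"]
                    .value_counts(normalize=True)
                    .mul(100)
                    .round(1))
    print(name)
    print(profile)
```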

Figure 5 below presents the score profile of the beginners and Figure 6 shows the profile of the advanced learners in the DIALANG pilot test data (the old cut-off points are used here). Overall, the test results seem to be in line with the prediction above: the profile of the advanced learners is indeed quite even, with the slight exception that there are somewhat fewer low (A1-B1) results for structures and vocabulary. In contrast, the picture for the beginners is clearly more varied, and it appears that listening, in particular, differs from the other tests. Also, structures and vocabulary differ from the three other tests in their general distribution of results, and, like the advanced learners, the beginners, too, found these two areas of proficiency somewhat easier than the three skills of reading, writing, and listening.

Interestingly, the results of the advanced learners suggest that the English reading test may be somewhat more difficult than the other tests, as very few of even the most experienced learners in the pilot tests achieved level C2. The result may also suggest that the cut-off point for C2 is higher for reading than for the other tests. Counterintuitive as it may seem, if the cut-off points for the English reading tests are higher than for the other skills, this can in fact be caused by the relative easiness of the reading tests due to the lack of really difficult items. Alderson (2005, 130) notes that there is only one C1 and no C2 reading items in the English tests; some of the items intended to be really demanding reading items had to be discarded on the basis of the item analyses and detailed expert reviews. It is in fact the case that in order to reach C2 in reading in English in DIALANG, the learner needs to get every item correct even in the hardest test version. No wonder so few pilot test takers (or anybody else; see the other studies reported below) reached the highest level in reading in English.

[Stacked bar chart: percentage of beginning learners (0–100%) at each CEFR level A1–C2 in Reading, Listening, Writing, Structures, and Vocabulary.]

Figure 5. Score distribution of the beginning English learners in the DIALANG pilot tests (learners had studied the language for two years or less)

[Stacked bar chart: percentage of advanced learners (0–100%) at each CEFR level A1–C2 in Reading, Listening, Writing, Structures, and Vocabulary.]

Figure 6. Score distribution of the advanced English learners in the DIALANG pilot tests (learners had studied the language for at least nine years)

b. Present author’s survey study of DIALANG users

The second study yielding information about the intercorrelations of the DIALANG tests is the survey of over 550 DIALANG users by the present author (reported partly in Huhta 2007a, 2007b, and more fully in this paper). In this study, the learners took the operational versions of the language tests. The informants were asked to report the results they had obtained in the DIALANG tests they had taken by the time they replied to the questionnaire. For the correlational analyses, only those informants who had taken tests in a single language were selected, in order to guarantee the meaningfulness of the findings. About 100–150 learners per skill who had reported their test results and who had taken tests in only one language qualified for this analysis; the most common test language was English, but Finnish and French were also fairly common.

Table 5 shows the correlations between the operational DIALANG tests taken by the learners. The intercorrelations of the five main skills tests ranged from .449 to .690, and they are in the same range as those reported in Table 4 above for the CEFR-scaled results based on the old cut-off points (which are practically the same as those used in the operational version of DIALANG). The two studies are based on different groups of learners, which may explain the slight differences in the correlations found: in the Alderson study, they came from all over Europe, and even around the world, whereas the study by the present author was based mostly on Finnish and to a lesser extent German and international learners. The range of learners studied also varied (there were fewer beginners in the author’s study), and it included a somewhat different mixture of languages: not all informants in the survey had taken only English tests as in Alderson’s study. Table 5 shows that the lowest correlation (under .5) was found between listening and writing; on the whole, the correlations for listening were slightly lower than for the other skills. The highest correlation was found between the vocabulary and structures tests (.690).

Table 5. Correlations between DIALANG tests in the author’s survey of users of the operational system

              Reading   Listening   Writing   Vocabulary   Structures   VSPT (Banded)
Reading       1.000     .523        .653      .639         .568         .520
  N           274       112         108       152          133          172
Listening               1.000       .449      .550         .545         .321
  N                     163         95        103          97           93
Writing                             1.000     .594         .584         .488
  N                                 160       107          104          104
Vocabulary                                    1.000        .690         .591
  N                                           207          142          135
Structures                                                 1.000        .399
  N                                                        190          124
VSPT (Banded)                                                           1.000
  N                                                                     247

On the whole, the correlations found in this study tell the same story as those found by Alderson (2005) and Alderson and Huhta (2005): the tests in different skills correlate moderately with each other, as could be expected, but not to an extent that there would be any reason to doubt the usefulness of any of them in a diagnostic test such as DIALANG.
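To illustrate how a correlation matrix of this kind can be produced when each learner has taken only a subset of the tests, the following sketch computes pairwise Spearman correlations with pairwise deletion, so that each coefficient is based on a different number of learners, as in Table 5. This is only an illustration in Python with invented data, not the analysis script actually used for the survey.

# Illustrative sketch (invented data): a Spearman correlation matrix with
# pairwise deletion, so that each coefficient has its own N, as in Table 5.
import pandas as pd
from scipy.stats import spearmanr

# Hypothetical records: CEFR levels coded 1 (A1) to 6 (C2); None = test not taken.
results = pd.DataFrame({
    "Reading":    [4, 5, None, 3, 4, 5],
    "Listening":  [3, None, 4, 3, None, 5],
    "Vocabulary": [4, 5, 4, None, 3, 6],
})

skills = list(results.columns)
for i, a in enumerate(skills):
    for b in skills[i + 1:]:
        pair = results[[a, b]].dropna()          # pairwise deletion
        rho, p = spearmanr(pair[a], pair[b])     # rank-order correlation
        print(f"{a} vs {b}: rho = {rho:.3f}, p = {p:.3f}, n = {len(pair)}")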

The Vocabulary Size Placement Test differs from the other DIALANG tests in many ways, so it makes sense to examine its correlations separately. Table 5 shows that the VSPT result correlated significantly with all the main language tests. Not surprisingly, the highest correlation was with the vocabulary test (.591), although it may be somewhat surprising that the correlation was not stronger, since both are supposed to measure vocabulary knowledge. The problems with the reliability of the VSPT scores for some learners, as explained above, may partly explain this. The correlations between the VSPT and the reading and writing tests were close to .5. They are thus somewhat, but not dramatically, lower than the inter-correlations of the five main tests. The rather low correlations with the structures and listening tests (.399 and .321, respectively) may suggest that the VSPT measures a rather different type of language proficiency than these two tests.

Alderson (2005, 87) also studied the relationship between the different scoring options available for the VSPT and the other DIALANG tests. He found that the score resulting from

the present scoring mechanism used in the VSPT had the lowest correlations with the other DIALANG tests: around .40 for reading and structures, .54 for listening, and about .60 for writing and vocabulary. These are not very different from the ones reported for the VSPT in Table 5 above, especially if one bears in mind that Alderson could compute the correlations between the precise test scores, not between CEFR levels or VSPT bands, as was done in the author's study. Interestingly, Alderson (2005, 87) discovered that the simpler VSPT scores, in which the learner was not penalized for guessing, resulted in correlations of over .60 between the VSPT and reading, listening, and structures, and of over .70 with writing and vocabulary. Thus, the conclusion from the two studies is that in its current format the VSPT is not quite as closely related to performance on the other DIALANG tests, but that if the scoring were changed, it would appear to reach the same level of correlations as the other tests. As could be expected, it is somewhat more closely related to the vocabulary test than to the other tests; the fact that it also correlates well with the writing test may be due to the fact that several items in the (indirect) writing tests used in DIALANG are in fact vocabulary items.

The author's survey of DIALANG users also provides us with some data on the test takers' score profiles across the five skills. To make the findings comparable with the other studies, only those informants who had taken English tests (the most frequently chosen test language) and who had taken at least two different skills tests were included in the comparison. This restriction limited the number of test takers to a few dozen per test (reading = 55, listening = 39, writing = 41, structures = 55, and vocabulary = 44), which makes the results somewhat tentative.

Figure 7 shows the profile for English among the (mostly) Finnish and German university students surveyed. As can be seen, the average level of the learners is quite high, about B2 or close to C1, which would lead one to expect a fairly even performance across the skills. On the whole, this appears to be the case, although the profile is not completely uniform across the skills. Writing seems to be slightly more difficult (or the learners' writing skills lower) than the other skills for these learners, but the differences are not great. For the most part, these data appear to be in line with the findings of the DIALANG pilot test data for advanced learners (Figure 6 above) and thus provide evidence for the validity of the DIALANG tests as far as their standard setting and linking with the CEFR levels is concerned. The main difference between these results and those reported in Figure 6 appears to be that the writing

test seemed to be somewhat more difficult than the other tests for these learners, whereas that was not the case in the bigger pilot test data set.

[Stacked bar chart: distribution of CEFR levels A1-C2 across the Reading, Listening, Writing, Structures, and Vocabulary tests]

Figure 7. DIALANG English test results in the survey study by the present author

c. Satakunta Polytechnic study

The third study that provides evidence about the correlations between different DIALANG tests is a fairly large-scale study involving Finnish polytechnic students at the Satakunta Polytechnic (Jaatinen 2005). The Satakunta Polytechnic used DIALANG as a placement instrument to place students onto different levels of courses in both English and Swedish. At the same time, the institution also wanted to study the students' proficiency in these two languages with the help of an external, international measure (for more about the reasons for the study, see Section C-c and 3.4.5 below).

All new students entering the Satakunta Polytechnic in 2005 were administered the DIALANG reading, vocabulary, and structures tests in both English and Swedish (Jaatinen 2005). In addition, a range of different background information such as other language test results was gathered from the students – this will be reported on below when the relationship between DIALANG and external criteria is discussed.


On the basis of assumptions about the equivalence of language tests developed from the CEFR and about the shape of typical skill profiles of beginning vs. advanced language students, the following hypotheses can be formulated for the Satakunta study:

• The profile of proficiency in English should be more even than for Swedish, especially for the gymnasium background students, because English is the first foreign language for the great majority of the students entering the Polytechnic. Typically, the students have studied it for ten years, compared with 5 – 6 years of Swedish. • The profile of the gymnasium background students should be more even than that of the vocational background students because they have studied both languages longer (the difference may not be great, however) and, on average, they are academically stronger students in most subjects (this is one of the main reasons why they chose to enter the gymnasium, for which one must in fact apply).

An examination of the test results shown in Figures 8, 9, 10, and 11 indicates that:

• The profile for English among the gymnasium background students is rather even (Figure 8). The average proficiency of the learners is fairly high, around B2, as was predicted and as befits advanced learners. • In contrast, the same gymnasium students know Swedish at a clearly lower level (A2 on average), and their profile of mastery is rather uneven: the results of the reading test are clearly higher than the results of the vocabulary and structures tests (Figure 9). Although the distribution of levels for the last two tests is somewhat similar, there are clearly more A1 results in vocabulary than in structures. • A very similar difference between the English and Swedish profiles can be found among the vocational background students, although, predictably, they do not know the two languages as well as the gymnasium students (Figures 10 and 11). • The profiles of the gymnasium vs. vocational students do not appear to differ to any significant degree, for either English or Swedish. Obviously, the level of proficiency of the gymnasium students is higher in both languages for the reasons explained earlier. However, it appears that the difference between the two

groups is so small (about one CEFR level) that it does not cause a noticeable change in their profiles in the same language. The fact that both groups achieved at least B1 across the skills in English may indicate a kind of threshold after which it becomes more typical for different aspects of language to be mastered at roughly the same level. This contrasts with the command of Swedish in both groups, which remained at A2 or below. At these lower levels, it seems that the typical profile remains quite uneven.

To sum up the examination of the distribution of DIALANG results in the Satakunta Polytechnic study, it could be argued that the results support the interpretation that the DIALANG reading, vocabulary, and structures tests are fairly equivalent in their difficulty. This is indicated by the fact that the distribution of results across the CEFR levels is roughly the same when the students' average proficiency is intermediate or advanced (at or above B1), as is the case for English among these learners. In contrast, the profile is rather uneven for Swedish, which the same students mastered at a clearly lower level (A1 – A2).

[Stacked bar chart: distribution of CEFR levels A1-C2 across the Reading, Structures, and Vocabulary tests]

Figure 8. DIALANG English test results of the gymnasium background students

[Stacked bar chart: distribution of CEFR levels A1-C2 across the Reading, Structures, and Vocabulary tests]

Figure 9. DIALANG Swedish test results of the gymnasium background students

[Stacked bar chart: distribution of CEFR levels A1-C2 across the Reading, Structures, and Vocabulary tests]

Figure 10. DIALANG English test results of the vocational background students


[Stacked bar chart: distribution of CEFR levels A1-C2 across the Reading, Structures, and Vocabulary tests]

Figure 11. DIALANG Swedish test results of the vocational background students

d. HISBUS English proficiency testing project

Although the German HISBUS study (to be described in more detail in Section C-b) only used the DIALANG English reading and VSPT tests, it is of interest to compare the distribution of the reading scores found by Peschel et al. (2006) with the results of the English reading tests by Alderson (2005), Jaatinen (2005) and Huhta (this paper) reported above. The HISBUS study examined the English proficiency of German university students with a large, representative sample of over 3,000 students across Germany. The test results are similar to the other reported results in that the levels B1 and B2 are the most common ones (Figure 12). The proportion of B2 level results is, however, bigger in the HISBUS study, with about 75% of the students receiving that grade. The other noticeable difference is the very small proportion of C1 results (C2 is very hard to achieve in English reading, as noted earlier) compared with the other studies, but this may simply reflect the distribution of English reading ability among German university students. Overall, then, the HISBUS results

are not so different from the results obtained with intermediate level learners that they would cast doubt on the validity of the English reading tests in DIALANG, although they are in line with the suspicion expressed earlier that this test may be comparatively more demanding than the other English tests in DIALANG.

[Stacked bar chart: distribution of CEFR levels A1-C2 for the Reading test]

Figure 12. The score distribution of the English reading test in the HISBUS study on German university students

B. Internal structure of DIALANG self-assessment instruments

Some studies were carried out by DIALANG Project members at different stages of the piloting of DIALANG that examine the quality of the self-assessment instruments used in the system. Two of them also examined the relationship between the DIALANG self-assessment statements and the CEFR statements that they were derived from, which could be thought of as a concurrent validation study with an external criterion. Since the DIALANG self-assessment statements are so closely based on the CEFR, the line between a study of the internal structure of the SA instruments and a study of their relationship with another instrument is difficult to draw. As reported in e.g. Alderson (2005) and Huhta and Figueras (2004), the DIALANG self-assessment

statements were 'translated' from the CEFR by changing the 'can do' formulation into 'I can' and, for some statements, by slightly simplifying the statements or the terminology used in them. DIALANG claims that the self-assessment statements and the CEFR levels derived from users' replies to these statements are equivalent to the original CEFR descriptors and levels. This claim can be, and has been, studied empirically.

The first study, Kaftandjieva and Takala (2002), presented detailed quantitative evidence for the validity of the DIALANG self-assessment statements – and by extension, of the CEFR scales, too. The study was based on the Finnish translations of the statements (descriptors), which were sorted by 12 Finnish as L2 teachers onto the six CEFR levels. The researchers found that the judges carried out the sorting extremely consistently, no matter which of the several possible indices of inter-rater reliability was examined (op. cit., 112). The examination of the structure of the reading, writing, and listening scales indicated that "they can be successfully used as a framework for foreign language learning, teaching, and assessment" (op. cit., 127) and that the relationship between the DIALANG descriptors and the original CEFR descriptors was very close when it comes to placing them on the CEFR scale (op. cit., 108-110). It was also found, however, that the levels in the three scales (reading, writing, listening) were not of equal length, some being wider and others narrower. The level boundaries for, say, B1 and B2 were not at the same place on the three scales, either. The researchers could also identify some statements which were placed unexpectedly or on which the raters disagreed more than others.

Alderson's (2005, 102) findings are very similar to those of Kaftandjieva and Takala: he discovered a very high relationship between the DIALANG self-assessment statements in the pilot testing of English and the original CEFR levels of the statements. Overall, the Spearman correlation was .930; it was .928 for reading, .920 for writing, and .911 for listening.

Alderson (2005, 102) also studied the equivalence of the DIALANG self-assessment statements across different languages. This is a rare empirical study in the sense that it sheds light not only on the quality of the self-assessment statements but also on the quality of the translations. As was described earlier, considerable attention was paid to ensuring that the translations from English into the other thirteen languages would be as equivalent as possible. Alderson found that the intercorrelations of the difficulties of the 18 core self-assessment statements in the eight most common interface languages in the DIALANG pilot tests were

very high, mostly over .9 (see Table 8.3 in Alderson 2005, 102, for the reading SA statements). Alderson also correlated the logit values of the Finnish and English SA statements and found strong, over .9 correlations for the reading (.901), writing (.904), and listening (.979) statements.

Finally, both Luoma (2004) and Alderson (2005) analyzed the relationship between two kinds of self-assessment used at the piloting stage of DIALANG: (1) overall SA where the learners placed themselves directly onto one of the CEFR levels after reading brief descriptions of each of the six levels (for reading, listening or writing), and (2) detailed SA where the learners responded to a set of specific ‘I can do’ statements describing concrete activities in reading, listening or writing. The learner’s CEFR level based on the latter type of SA was calculated afterwards when the statements were calibrated.

Luoma (2004, 150-151) examined the relationship between overall and detailed self-assessment with the Finnish pilot test data from over 300 learners and with a sample of the English data from about 200 learners. Her study covered only listening. Luoma found strong but far from perfect correlations between the two kinds of self-assessment: .720 for English and .826 for Finnish. Alderson (2005, 106) also studied the correlations between the two self-assessments but with the full English pilot data. His finding for listening (n = 534) was rather close to the correlation found by Luoma: .667. The correlations for the other two skills studied were similar in range: .628 for reading (n = 624), and .667 for writing (n = 677).

The studies reviewed above indicate that the detailed self-assessment instruments using 'I can do' statements are highly consistent internally, have a very close relationship with the original CEFR statements from which they were derived, and also appear consistent across the languages. It is also clear, however, that while detailed and overall self-assessment correlate with each other, they are not the same. The operational version of DIALANG uses only detailed self-assessment, although the possibility of including both was considered at times in the Project.

C. Relationship of DIALANG tests with external measures of language proficiency

There are at least two types of external criteria against which the criterion-related or concurrent validity of DIALANG tests has been studied: learners’ self-assessments and certain other language tests or assessments that the informants have taken. Also, the performance on DIALANG tests of groups of learners who are known to differ from each other in terms of language proficiency can be used as concurrent (or predictive) validity evidence. It should also be noted that the DIALANG self-assessments are in need of external validation against other measures of language proficiency, such as tests. This evidence will be discussed below in Section D.

Studies that have used learners' self-assessments as external criteria against which DIALANG tests can be compared (and vice versa) include at least the following: (1) the studies by Luoma (2004), Alderson (2005), and Alderson and Huhta (2005) based on the DIALANG pilot test data for English, and (2) the German HISBUS study of university students' English proficiency.

Studies that have used other language tests or teachers' grades as external criteria include (1) the present author's survey of DIALANG users (reported in this paper), and (2) the Satakunta Polytechnic study into the English and Swedish proficiency of their students (Jaatinen 2005). Both of these studies, as well as Alderson's (2005) study, also include other external indicators of proficiency, which are perhaps less direct and precise but nevertheless potentially useful, such as information about the learners' language studies (e.g. length) and frequency of language use.

a. Self-assessments as concurrent validity evidence for DIALANG language tests

Learners' self-assessments are potentially a very useful external criterion against which language tests can be compared because self-assessment has access to information about the learner's proficiency that no language test has, such as the learner's entire experience of using the language in a potentially very wide range of contexts and for many different purposes (Huhta, 2003). The problems associated with self-assessment are also well known, however, probably the most serious of them being lack of experience in doing self-assessment (Oscarson, 1997; Ross, 1998).


Since DIALANG contains both self-assessment instruments and language tests, it is – perhaps somewhat paradoxically – possible to do external validation of DIALANG tests with the help of self-assessments that are internal to DIALANG. This is in fact what is reported in Luoma (2004), Alderson (2005) and Alderson and Huhta (2005).

The results published on the pilot testing of DIALANG tests and self-assessments are an example of concurrent validation of DIALANG tests with (DIALANG) self-assessments (Luoma 2004; Alderson 2005; Alderson & Huhta 2005). In fact, the tests were correlated with two types of self-assessment because the pilot tests were accompanied by both overall and detailed self-assessment instruments. Self-assessments were found to correlate moderately with the test scores. The correlations between overall self-assessments on the 6-point CEFR scale and the English pilot test results turned out to be .544 for reading, .580 for writing, and .474 for listening (Alderson & Huhta 2005, 319). The correlations between the detailed self-assessments and the pilot test scores were .487 for reading, .550 for writing, and .495 for listening (Alderson 2005, 107).

Luoma (2004) had earlier studied the same question with smaller samples but with two test languages, English (n = 175-223) and Finnish (n = 308-315), and in one skill, listening. In her study, the correlation between overall self-assessment and the raw listening score was .416 for English and .619 for Finnish. The correlations between detailed self-assessment of listening and the raw score from the listening test were .489 for English and .598 for Finnish (Luoma 2004, 150-151).

On the basis of his findings, Alderson (2005, 105) suggests that learners may be better at assessing their productive skills (i.e. writing in the case of DIALANG) than their comprehension, presumably because they normally get more feedback on their writing and speaking.

Yang's (2003) study of test-takers' reactions to DIALANG feedback provides probably the only direct empirical evidence there is about the correlation between the self-assessment and test results based on the operational version of DIALANG. The study was very small-scale as it was qualitative in orientation (n = 12), but the researcher collected her informants' VSPT scores (exact scores on the 1-1000 scale), their English reading test results and the

corresponding self-assessments of reading in English (Yang 2003, Appendix F). The present author calculated the correlations from those data; they are presented in Table 6 below.

Table 6. The Spearman rank order correlations between self-assessment and the English VSPT and reading test in DIALANG in Yang's study (Yang 2003)

                     VSPT      Self-assessment   Reading test
VSPT                 1.000        .204              .680
  sig.                            p = .526          p = .015
Self-assessment                  1.000              .678
  sig.                                              p = .015
Reading test                                       1.000

The small number of participants obviously makes the correlations tentative at best. However, self-assessment and the reading test correlated quite substantially (.678), whereas the correlation between SA and the VSPT was very low and non-significant. The two language tests – VSPT and reading – correlated with each other at the same level as SA and reading. In Yang's study, the numbers of matches (4) between SA and the reading test, overestimations (5), and underestimations (4) of proficiency were almost equal. None of the over- or underestimations was greater than one CEFR level, so on the whole, Yang's informants seemed to be reasonably good at estimating their own proficiency.
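To make the calculation concrete, the following sketch shows how Spearman rank-order correlations of the kind reported in Table 6 can be computed from per-learner triplets of VSPT score, self-assessed level, and reading test level. The data values are invented for illustration and are not Yang's figures.

# Illustrative sketch (invented data): Spearman correlations among VSPT score,
# self-assessed level, and reading test level, as in Table 6.
from scipy.stats import spearmanr

# Each tuple: (VSPT score on the 1-1000 scale, self-assessed level coded 1-6,
#              reading test level coded 1-6); all values are hypothetical.
learners = [
    (620, 4, 4), (480, 3, 3), (710, 5, 4), (350, 3, 2),
    (560, 4, 3), (800, 4, 5), (430, 2, 3), (660, 5, 5),
]

vspt, sa, reading = zip(*learners)

for name, x, y in [("VSPT vs SA", vspt, sa),
                   ("VSPT vs reading", vspt, reading),
                   ("SA vs reading", sa, reading)]:
    rho, p = spearmanr(x, y)
    print(f"{name}: rho = {rho:.3f}, p = {p:.3f}")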

There appear to be no other correlational studies on the correspondence between DIALANG self-assessments and DIALANG test results in the operational version of the system. The questionnaire survey by the present author of the users of the operational DIALANG did not collect the kind of information about the informants' self-assessed levels of proficiency that would allow correlational analyses between self-assessments and tests. (Note, however, that the survey study provided other types of empirical evidence about the relationships between DIALANG self-assessments and learner characteristics, as will be explained later in this paper.)

b. HISBUS English proficiency testing project

The German HISBUS study is the other empirical study that provides us with concurrent validity evidence for some DIALANG tests against learners’ self-assessed proficiency

(Peschel et al. 2006). The study did not use DIALANG self-assessments but a different SA instrument developed by the researchers for the purpose.

The study was conducted by HIS – Hochschul-Informations-System – which is a German non-profit organization that carries out research on the higher education sector in order to provide decision-makers in Germany with empirical information about higher education and its outcomes (see http://hisbus.his.de/ for more information). The 'BUS' part of the name HISBUS refers to the fact that the organization's studies typically cover a range of themes. HIS and the DIALANG Project came to an agreement in 2004 that the HISBUS study of the English proficiency of German university students would use the DIALANG English reading comprehension tests as its data gathering instrument and the VSPT as the placement tool for the reading tests.

The study in question was conducted in 2005 on a representative sample of 3,700 German university students (Peschel et al. 2006). The students self-rated their English proficiency in different skills (reading / comprehension, speaking, writing, and understanding of subject-specific texts) using a simple self-assessment scale developed by the German researchers (see the explanatory note under Table 7 below). They also took the DIALANG English VSPT and reading tests, delivered by the HIS computerized system, which simulated the functioning of the DIALANG system by placing students into an easy, intermediate or difficult reading test on the basis of the VSPT result. The scoring of both the VSPT and the reading test worked exactly as in DIALANG.

The report of the HISBUS study does not give the correlation coefficients between the students' self-assessments and the DIALANG reading and VSPT results, but the correspondence between the two sets of measures reported in Table 7, and in the more detailed tables provided in Peschel et al. (2006), indicates that the correlation must be quite substantial.

Thus, the HISBUS study provides concurrent validity evidence for the English reading and Vocabulary Size Placement Tests via the self-assessments done by the students who participated in the study (80% of the students in the sample completed all the instruments used in the study). Furthermore, the distribution of the reading and VSPT levels in this representative sample of German university students can also be broadly indicative of the

validity of these DIALANG tests, especially if the distribution makes sense in terms of what is known about the English proficiency of German students.

Table 7. German university students’ results in DIALANG English reading and VSPT tests and their self-assessments

DIALANG reading test
Test result   Percentage of students   Average SA of students at this level
              at this level            (= SA of reading / comprehension)
C2                  1                      1.21
C1                  4                      1.23
B2                 76                      1.83
B1                 17                      2.57
A2                  2                      2.69
A1                  0.3                    2.60

DIALANG Vocabulary Size Placement Test
Test result   Percentage of students   Average SA of students at this level
              at this level            (= SA of all skills)
Level 6            12                      1.22
Level 5            53                      1.81
Level 4            24                      2.42
Level 3             8                      2.88
Level 2             2                      2.75
Level 1             3                      2.96

Average SA (self-assessment) was calculated by the HIS project by quantifying the simple categories used in the self-assessment instruments in the following way: 1 = very good (sehr gut), 2 = good (gut), 3 = satisfactory (befriedigend), 4 = sufficient (ausreichend), 5 = unsatisfactory / none (mangelhaft / keine).
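As an illustration of the quantification described in the note above, the following sketch codes the German self-assessment categories numerically and averages them per DIALANG level, which is how the 'average SA' figures in Table 7 were obtained. The student records in the example are invented and do not come from the HISBUS data set.

# Illustrative sketch (invented data): quantify the German SA categories
# (1 = sehr gut ... 5 = mangelhaft/keine) and average them per DIALANG level.
import pandas as pd

sa_scale = {"sehr gut": 1, "gut": 2, "befriedigend": 3,
            "ausreichend": 4, "mangelhaft/keine": 5}

# Hypothetical student records: DIALANG reading level and self-assessed category.
students = pd.DataFrame({
    "reading_level":   ["B2", "B2", "B1", "C1", "B2", "B1", "A2"],
    "self_assessment": ["gut", "sehr gut", "befriedigend", "sehr gut",
                        "gut", "gut", "befriedigend"],
})

students["sa_numeric"] = students["self_assessment"].map(sa_scale)
print(students.groupby("reading_level")["sa_numeric"].mean())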

Table 7 summarises the key results of the German HIS study on university students' proficiency in English (Peschel et al. 2006, 33-36). The distribution of CEFR levels in English reading has a clear peak at level B2, with very few students at the extreme levels A1 and C2. The test results are in line with the students' self-assessments: for example, the average self-assessment of the students who were assigned to levels B2-C2 by the DIALANG system was either 'good' or 'very good', i.e., very close to 1 or 2 on the self-assessment scale presented in Table 7 above. The self-assessment of the students at the two highest reading levels, C1 and C2, is almost identical (1.21 vs. 1.23) and differs markedly from the average SA of those at B2 (1.83), which may suggest that the English reading test in DIALANG is better at distinguishing B2 learners from C1 learners than C1 learners from C2 learners. This is in fact entirely in line with some of the findings reported in Section A above on the internal structure and score profiles across the different DIALANG skills tests for English. The finding also makes sense in the light of what we know about the English reading tests and the lack of C1 and C2 level items, which means that the measurement of learners' proficiency at the highest levels cannot be as accurate as one would hope. Thus, this comparison with an

external measure – students' self-assessments – supports the evidence about the internal structure of the reading tests gained by examining the score distributions, and the a priori information that we have about the construction of the English tests.

The overall finding about German university students' reading ability in English appears very plausible given what is known about language education in Germany (university students typically come from the type of school which aims at high levels of achievement in all subjects) and about the important role that English plays in the everyday lives of most young people in Europe. The HIS report (Peschel et al. 2006) gives every indication that the researchers responsible for the study trusted the DIALANG test results to give an accurate picture of the language proficiency of the target group.

The only point that seems slightly odd in Table 7 is the self-assessment of reading / comprehension by those who got A1 and A2 in the reading test. It is difficult to say why those who got A1 rated their reading skills slightly higher than those who got A2. Perhaps these less advanced students found it difficult to assess their proficiency accurately with the help of the extremely simple scale (later, in Section D, we will present other findings that also indicate problems in beginning learners' self-assessments). Also, as was noted in the previous section on the internal structure of DIALANG, it is slightly surprising that only about five percent of the students surveyed achieved C1 or C2 in reading.

The results for the Vocabulary Size Placement Test follow a distribution that is slightly skewed towards the higher end of the scale. The VSPT levels or score bands are not related to the CEFR, so it is not known how the two scales relate to each other. The results of the HIS study seem to suggest that the top level of the VSPT scale might cover levels C1 and C2 on the CEFR scale, because the reading test peaks at the fourth level (i.e., B2) whereas the VSPT peaks at the fifth level. The second argument for this interpretation is that the average self-assessment of the students placed at the highest VSPT level is about the same as for those placed at C1 and C2 in reading (around 1.2). An implication of this is that the English VSPT may not be able to distinguish learners at the highest levels of proficiency (C1-C2), at least when the VSPT results are reported on the 6-point scale (a more precise score on the 0 – 1000 scale is also reported to the user, however).

The comparison of the DIALANG test results with the self-assessments also suggests that the VSPT scale might extend lower than the CEFR scale: the average self-assessment of those who were at A1 and A2 in reading was around 2.6, whereas for the two lowest VSPT levels self-assessment was 2.75 and 2.96, respectively. The fact that the self-assessments against which the test results are compared concern somewhat different skills – reading / comprehension in the case of the reading test and all skills in the case of the VSPT – obviously makes this reasoning somewhat uncertain. However, the shape of the distribution of the test takers at the three lowest levels in the reading test vs. the VSPT seems to support the conclusion that the lower end of the VSPT scale differs from the CEFR scale. Incidentally, the fact that more test takers were placed at the lowest VSPT level (3%) than at the second lowest (2%) is in line with the finding by Huhta (2007a) that the scoring mechanism of the VSPT places more test takers at the lowest level than it should. The issue with the VSPT scoring was known at the time of the HIS study, and a warning against guessing was therefore added to the VSPT instructions for the students participating in the study. Apparently not everybody paid attention to the warning, as indicated by the slight rise in the number of test takers placed at the lowest level, but on the other hand, the proportion of those assigned to an unduly low level would probably have been higher without the warning.

c. Satakunta Polytechnic study

The third empirical study shedding light on the concurrent, criterion-related validity of the DIALANG tests did not use self-assessments as external criteria but national examination results and teacher / school grades. The study was conducted at the Satakunta Polytechnic in Finland, which used DIALANG as a placement instrument to place students on different levels of courses in both English and Swedish (Jaatinen, 2005). Besides placement, the institution also wanted to examine their students' proficiency in these two languages with the help of an external, international measure (for more about the reasons for the study, see Section 3.4.5 below).

All new students entering the Satakunta Polytechnic in 2005 were administered the DIALANG reading, vocabulary, and structures tests in both English and Swedish (Jaatinen 2005). In addition, a range of different background information was gathered from the students, including their marks in the two languages in either the vocational school or in the gymnasium, depending on the route they entered the Polytechnic. The gymnasium

background students also provided their Matriculation examination grades for English and Swedish.

Overall, the study largely confirmed what could be expected on the basis of the students' backgrounds when it comes to the length and amount of language studies. On average, those who had studied the language for a shorter time got lower results and those who had studied it longer got better results. In practice this meant that the students coming from the vocational schools had significantly lower proficiency in both languages than those coming from the gymnasia, and that the results were lower for Swedish than for English in both groups. These results, then, provide very general positive evidence for the validity of these tests in the two DIALANG languages, as the tests were able to clearly distinguish two groups that should differ significantly in terms of their language proficiency.

Since the Satakunta Polytechnic study was a fairly large-scale study with about 1,000 students and since it includes three external measures of proficiency, it also provides valuable, and more precise, concurrent validity evidence for the English and Swedish tests in DIALANG. The best of the three external criteria against which the DIALANG test results can be compared is the Matriculation examination. It is a centralized, national examination, which is professionally developed and administered in standardized conditions across the country. The foreign language tests in the examination cover roughly the same areas of proficiency as DIALANG: both lack a speaking test but have a range of tests for the other main skills. Over half of the points in the Matriculation exam come from the reading and listening comprehension tests, one third from a writing test (a composition), and somewhat over ten percent from an integrated vocabulary and structures test.

In the Satakunta study, the Matriculation examination results were available only for the students who had completed the gymnasium (about 750), as the exam is the final test only for that level and type of schooling. Jaatinen (2005) does not report what proportion of their gymnasium background students entered the Polytechnic the same year that they had taken the Matriculation examination, but typically the majority, but not all, students enter tertiary education within a year or two after completing their gymnasium studies.

Table 8. Spearman rank-order correlations between the three DIALANG tests and the Matriculation Examination tests of English and Swedish

                              Matriculation examination grade
                              English (n = 744)    Swedish (n = 756)
DIALANG        Reading              .69                  .57
test result    Vocabulary           .63                  .60
               Structures           .68                  .51

Table 9. Spearman rank-order correlations between the average DIALANG result and the school grades

                          Overall DIALANG result (i.e. classification into 2 – 4 levels of courses)
                                   English                          Swedish
                          Vocational     Gymnasium         Vocational     Gymnasium
                          (2 levels)     (2 levels)        (4 levels)     (4 levels)
Vocational school grade      .38                              .27
(5-point scale)              p = .01                          p = .01
                             n = 235                          n = 220
Gymnasium school grade                     .44                               .61
(6-point scale)                            p = .01                           p = .01
                                           n = 699                           n = 712

Table 8 (based on Jaatinen 2005, 23-24) displays the rank order correlations between the three DIALANG tests and the Matriculation examination grades. The correlations of the DIALANG tests with the Matriculation examination are higher for English than for Swedish: between .60 and .70 for the former and .50 – .60 for the latter. Assuming that the Matriculation examination tests are more or less equivalent in terms of quality, the difference may arise from the (probably) higher accuracy of the DIALANG English tests, which had been properly piloted and calibrated, unlike the Swedish tests, whose piloting was not completed before the Project came to an end in 2004. Consequently, the scoring system of the Swedish tests and the linking of their scores with the CEFR levels are based on expert judgment only, which is a less reliable approach than the more empirically based calibration and linking of the English tests.

Table 9 summarizes the information on the relationship between the DIALANG test results and the two other external criteria, the vocational school grades in English and Swedish (for

the vocational background students) and the gymnasium grades in the two languages (for the gymnasium background students). The concurrent validity evidence from these comparisons is less straightforward than that from the comparison with the Matriculation examination, for at least two reasons. First, the DIALANG test results used in the analyses by Jaatinen (2005) are not the CEFR scale-based scores but a more indirect and more complex conversion of the results of the three DIALANG tests used in the study into levels of language courses onto which the students were placed on the basis of their overall result in the three tests (see Jaatinen 2005, 9, for details on how the course levels were defined). Secondly, the number of course levels was very small, either two or four, which probably artificially lowers the correlations between the criteria and DIALANG.

The correlations between the school grades and the DIALANG results (course levels) were clearly higher for the gymnasium background students. The correlation between the English grade in the gymnasium and the course level assigned to the student was over .60 but the corresponding correlation for Swedish was somewhat lower (.44), which is probably explained by the restriction in the variation in the DIALANG test results, as they were categorized into only two levels (or levels of courses).

The correlations between the DIALANG results and the school grades were rather modest for the vocational students, which may, at least partly, be explained by the fact that the English and Swedish studied in the vocational schools is not general language, as in the gymnasia, but language for specific purposes. Thus, the language grades in the vocational school may reflect a somewhat different kind of proficiency. Consequently, it may not be surprising that the English and Swedish grades given in the gymnasium relate more strongly to DIALANG, which is a test of general rather than specific language. Other possible reasons include differences in the reliability of teachers' grading of students in different types of institutions, but given the lack of research into this issue, this remains pure speculation.

d. Present author's survey study of DIALANG users

The author’s survey study included two types of external assessments that allow us to study the concurrent validity of the operational DIALANG tests: the Matriculation examination grades and language teacher’s grades in the gymnasium. The main problem with the Matriculation examination as an external criterion for this study is that the time interval

between taking the two tests was often quite considerable, and that the interval varied from learner to learner. The majority of informants were students in their early twenties who had had 1 – 3 years of studies at a university or a polytechnic, which means that about 1 – 4 years had passed since they had taken their school leaving examination. Some uncertainty in the results may also arise from the fact that the Matriculation examination grades are based on norm-referencing (the top five percent get the top grade, the next 15% get the second best grade, the next 20% get the third best grade, and so on), whereas DIALANG 'grades' are based on absolute, criterion-referenced scoring.
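The following sketch illustrates the norm-referenced logic described above by assigning a grade from a candidate's percentile rank. Only the first three quotas (5%, 15%, 20%) come from the text; the remaining quotas and the generic grade labels are hypothetical and serve only to make the example runnable.

# Illustrative sketch of norm-referenced grading: the top 5% of candidates get the
# best grade, the next 15% the second best, the next 20% the third best, and so on.
# Quotas beyond the first three, and the generic grade labels, are hypothetical.
def norm_referenced_grade(percentile_rank: float) -> str:
    """percentile_rank: 0 = weakest candidate, 100 = strongest."""
    quotas = [
        ("grade 7 (best)", 5),
        ("grade 6", 15),
        ("grade 5", 20),
        ("grade 4", 25),        # hypothetical
        ("grade 3", 20),        # hypothetical
        ("grade 2", 10),        # hypothetical
        ("grade 1 (fail)", 5),  # hypothetical
    ]
    share_from_top = 100.0 - percentile_rank
    cumulative = 0.0
    for grade, quota in quotas:
        cumulative += quota
        if share_from_top <= cumulative:
            return grade
    return quotas[-1][0]

# A candidate at the 97th percentile falls within the top 5% and gets the best grade.
print(norm_referenced_grade(97.0))   # grade 7 (best)
print(norm_referenced_grade(60.0))   # a middle grade under these quotas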

The second external criterion available for some of the informants in the survey study was the teacher grade. Almost fifty 17 – 18-year-old students in a Finnish gymnasium (upper-secondary school) participated in the study and took one or more DIALANG tests in English as part of their English studies. The students reported both their DIALANG test results and their latest English grade given by their regular language teacher. Since the students were still in the middle of their upper-secondary studies, they had not yet taken the Matriculation examination and, consequently, those grades were not available for them.

As was the case with the Matriculation examination grades, there are issues with the teacher grades, too, as an external criterion in a concurrent validation study. The teacher grades in Finland are mostly based on the student's language skills, but other factors, such as active participation in the lessons, can also be taken into account. The teacher is solely responsible for giving the grades, which are given on a scale ranging from 4 (failure) to 10 (excellent). Usually, teachers base their grading on tests that come with the textbooks, on tests that they design themselves, or on both. They can also make use of continuous assessment to complement the information obtained via formal tests. While the language teacher typically knows his or her students very well, after following their progress for at least 1-3 years, there is always some variation in how each teacher defines and operationalizes the assessment of language proficiency, and indeed in whether factors other than language skills affect the grades (see e.g. Black & Wiliam, 1998, on problems with teacher grades). The reliability of teacher grades may also vary from teacher to teacher, as, for example, they may or may not co-operate with colleagues when assessing and giving grades.

Table 10 reports the correlations between the DIALANG tests and the two external criteria described above. The correlations between the mean DIALANG test result (i.e., the average of

all the DIALANG tests taken by the learner) and the two external criteria were almost exactly the same (close to .520). When we look at the correlations for the individual DIALANG skills tests, they appear to be somewhat higher for the teacher grade than for the Matriculation examination result. Perhaps the school grade reflected the students' current ability level more closely than the Matriculation examination, which the learners had taken 1-4 years earlier. Since the number of students available for these analyses was so small, this difference cannot easily be interpreted as evidence for the quality, or lack of it, of either of the external criteria.

Table 10. Correlations between DIALANG tests and two external measures in the author’s survey study of DIALANG users

                              Matriculation examination     School grade in the test
                              grade in the test language    language (in gymnasium)
Mean DIALANG test result
across all skills                 .519                          .518
  N                               109                           47
Reading                           .544                          .627
  N                               66                            32
Listening                         .533                          .812
  N                               41                            8 (p = .014)
Writing                           .339                          .609
  N                               38 (p = .038)                 11 (p = .047)
Vocabulary                        .436                          .762
  N                               51                            24
Structures                        .329                          .701
  N                               53 (p = .016)                 20
VSPT score (Banded)               .316                          .550
  N                               82 (p = .004)                 40

The correlations in Table 10 are significant at the .001 level unless a different p value is indicated.

There was, however, an interesting tendency in the correlations between the DIALANG tests and the Matriculation exam grades (see Table 10). The highest correlations were found for the two comprehension tests, reading and listening: .544 and .533, respectively. The correlations with the teacher grades were even higher, .627 for reading and .812 for listening (only eight students took the listening test, so the latter correlation is only suggestive at best). Vocabulary had a somewhat lower correlation (.436) with the Matriculation examination result, followed by writing and structures (.339 and .329,

respectively). Given the slight emphasis on comprehension in the language tests of the Matriculation examination, it is perhaps not surprising that the comprehension tests correlated best with it. This finding, in fact, appears to lend some credibility to claims about the concurrent validity of the DIALANG reading and listening tests.

Concurrent validity evidence is available also for the Vocabulary Size Placement Test. The correlations between the VSPT results (as a band, on the six-point scale used in reporting the VSPT results to the test taker) and the Matriculation examination and school grades were both statistically significant, and roughly in the same range as the correlations reported above for the main DIALANG skills tests and the two external criteria.

D. Relationship of DIALANG self-assessments with other measures of language proficiency

Most of the concurrent validity evidence for DIALANG concerns the language tests used in the system. However, there is some evidence also for the self-assessment instruments, which are the other type of language proficiency measure implemented in DIALANG.

First, some of the studies reported above in Section 3.3.2.2 – C – a, in which DIALANG tests were compared with self-assessments, also shed light on the self-assessment instruments, provided that the SA used in the studies came from DIALANG (in the HISBUS study, for example, the SA was not based on DIALANG and thus cannot be used here).

Luoma (2004), Alderson (2005), and Alderson and Huhta (2005) are our main source of information as far as published research is concerned. They found that the correlations between the DIALANG self-assessment statements and the DIALANG language tests in the pilot testing data were moderate in magnitude and statistically significant: in Alderson's study, slightly over .5 for the overall self-assessments and slightly below .5, on average, for the detailed, statement-based self-assessments. Luoma (2004), however, found for English that the detailed SA statements correlated somewhat better with the test than the overall SA based directly on the 6-point CEFR scale. For Finnish, she found somewhat higher (around .6) correlations between SA and the listening test score, but there the overall SA correlated slightly better with the test than the detailed SA.


The correlations in Alderson's (2005; Alderson & Huhta 2005) study were slightly higher for writing than for reading and listening, which may suggest that learners typically receive more feedback on their productive skills and may thus be more accurate in their assessment of these skills. The correlations suggest that the DIALANG self-assessment instruments allow learners to evaluate their foreign language proficiency in a way that is at least to some extent in line with the way that the (DIALANG) language tests measure their proficiency. Since the correlations are far from perfect, however, it is clear that the self-assessments studied above also involve other things, which we can only guess at.

Apparently the only other published studies that can shed light on the relationship between the DIALANG self-assessment instruments and external measures of proficiency are two small-scale studies conducted by a group of French and Belgian researchers. They carried out two studies in the mid-2000s on learners' ability to self-assess: to do this, they compared the results of the Test d'Evaluation de Français (TEF) with the self-assessment of French used in DIALANG. The TEF is a test of French as a foreign language offered by the Chambre de Commerce et d'Industrie de Paris (see www.fda.ccip.fr), which reports test results on the CEFR scale of proficiency.

In the first study (Desroches et al., 2005), thirty-six Spanish-speaking test-takers took the reading and listening parts of the TEF and responded (on paper) to the operational DIALANG self-assessment statements for the two skills (18 statements per skill). The test-takers' responses to the SA statements were converted to the CEFR scale in a straightforward way, according to the number of SA statements estimated to be at each CEFR level. Thus, since three of the DIALANG reading self-assessment statements are thought to be A1 statements, a student was assigned to level A1 if he or she responded 'yes' to 0 – 3 of the statements, and to level A2 if he or she gave 4 – 5 positive responses (since there are two level A2 reading statements), and so on. Since the number of SA statements per CEFR level is slightly different for listening, the conversion to the CEFR levels is also slightly different. (See the DIALANG Project website at http://dialang.org/project/english/ProfInt/can_statements.htm and Appendix C in the Common European Framework for information on the CEFR level each self-assessment statement is estimated to belong to.)
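The conversion used by Desroches et al. can be illustrated with a short sketch that maps the number of 'yes' responses to a CEFR level by cumulative thresholds. Only the A1 and A2 statement counts are given in the text; the counts assumed here for the higher levels are hypothetical and serve only to make the example complete.

# Minimal sketch of the yes-count-to-level conversion described above. The A1 and
# A2 cut-offs (0-3 and 4-5 yeses) come from the text; the B1-C2 counts are
# hypothetical placeholders that sum to the 18 reading statements.
READING_STATEMENTS_PER_LEVEL = {
    "A1": 3,   # from the text: three A1 statements
    "A2": 2,   # from the text: two A2 statements
    "B1": 4,   # hypothetical
    "B2": 4,   # hypothetical
    "C1": 3,   # hypothetical
    "C2": 2,   # hypothetical
}

def level_from_yes_count(yes_count: int) -> str:
    """Map the number of 'yes' responses to a CEFR level by cumulative thresholds."""
    cumulative = 0
    level = "A1"
    for lvl, n_statements in READING_STATEMENTS_PER_LEVEL.items():
        cumulative += n_statements
        level = lvl
        if yes_count <= cumulative:
            return level
    return level

print(level_from_yes_count(2))   # A1 (0-3 yeses)
print(level_from_yes_count(5))   # A2 (4-5 yeses)
print(level_from_yes_count(18))  # highest level if all statements are endorsed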

Desroches et al. (2005) found a modest but significant correlation between the test-takers' results in the TEF listening test and the DIALANG self-assessment for listening (r = .38, p = .021), but for reading, the correlation was non-significant (r = .05, p = .765). Most of the test-takers assessed their proficiency considerably higher than their TEF result indicated. In reading, for example, almost all of the 15 test-takers who received the lowest level (A1) on the TEF test self-assessed themselves as high as B2 or C1. A similar tendency was found in listening: of the nine test-takers placed at A1 in the TEF listening test, one had self-assessed him/herself at A2, one at B1 and seven at B2. The self-assessments and test results of the intermediate and advanced learners did not differ as much as the beginners', which indicates that the beginners' gross overestimations were the source of the low correlations between SA and test results.

The second study (Demeuse et al., 2006) was based on the same TEF tests, but all the 36 reading and 44 listening SA statements developed for the piloting of DIALANG were used in this study (see the DIALANG Project website for the full list). This time a group of 18 learners that included native speakers of several different languages (Spanish, Portuguese, English, and German) was studied. Again, the correlation was higher for listening (r = .30) than for reading (r = .014), but this time neither of them was statistically significant.

Since both studies were carried out with relatively small numbers of learners – especially the latter study – the conclusions that can be drawn from them are somewhat uncertain. It is not easy, for example, to say what they tell us about the validity of the DIALANG self-assessment statements, except that the researchers' choice to use the DIALANG statements indicates that they considered them potentially valid instruments for the purpose. The different conversion of SA statements into CEFR levels in these studies may also have affected the results (in DIALANG the conversion is based on estimating the learner's self-assessed proficiency as a value on an IRT ability scale, which is then converted into a CEFR level by applying cut-off points derived from standard setting). Perhaps the most interesting conclusion from the two studies is that they show, once again, how difficult self-assessment can be for (apparently) untrained learners. They also suggest that self-assessment can be particularly difficult for learners at lower levels of proficiency.
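By way of contrast, the DIALANG-internal conversion mentioned in parentheses above can be sketched as follows: a self-assessment ability estimate on an IRT scale is mapped to a CEFR level by comparing it with cut-off points. The cut-off values in the sketch are invented placeholders, not DIALANG's actual standard-setting results.

# Minimal sketch: map an IRT ability estimate (theta) to a CEFR level via cut-offs.
# The cut-off values below are hypothetical placeholders.
HYPOTHETICAL_CUTOFFS = [(-2.0, "A1"), (-1.0, "A2"), (0.0, "B1"),
                        (1.0, "B2"), (2.0, "C1")]   # theta below the first value -> A1, etc.

def cefr_level_from_theta(theta: float) -> str:
    """Return the CEFR level whose band contains the ability estimate theta."""
    for cutoff, level in HYPOTHETICAL_CUTOFFS:
        if theta < cutoff:
            return level
    return "C2"   # above the highest cut-off

print(cefr_level_from_theta(-1.4))  # A2 with these placeholder cut-offs
print(cefr_level_from_theta(2.3))   # C2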

Since so little research has been carried out on how DIALANG self-assessments relate to other measures of proficiency, the present author carried out more detailed analyses on the

DIALANG English pilot data. The purpose of these analyses is to look behind the correlations between SA and test results that are reported in Alderson (2005) and Alderson & Huhta (2005). Of particular interest is to try to identify factors that are related to the match or mismatch between self-assessment and test results. The factors studied here are the level of language proficiency, mother tongue, and level of education. Mother tongue is a proxy for a culture and educational system that differ between countries and may affect how members of a particular culture and/or educational system approach self-assessment (e.g. familiarity with it, and the tendency to be confident vs. modest about one's skills). Logically, these analyses might also be presented under the topic of the internal structure of the self-assessment instruments; however, the line between internal and external validity evidence is difficult to draw in this case, and thus the findings can be discussed here without digressing too far from the main line of presentation.

Tables 11 A, B, and C below display the crosstabulations between the DIALANG English pilot test results (converted into CEFR levels) and self-assessments (made with the help of the detailed 'I can do' statements, also converted into CEFR levels). The tables allow us to examine, for example, whether the learners at lower levels of proficiency – according to the test results – tend to be more accurate in their self-assessments than more advanced learners, or vice versa.
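A crosstabulation of this kind, together with the proportions of exact matches, overestimations and underestimations, can be produced with a few lines of code. The sketch below uses invented (test level, self-assessed level) pairs rather than the actual pilot data.

# Illustrative sketch (invented data): crosstabulate test-based and self-assessed
# CEFR levels, then summarize the direction of any mismatch.
import pandas as pd

LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]

# Hypothetical (test level, self-assessed level) pairs for a handful of learners.
data = pd.DataFrame({
    "test": ["A2", "B1", "B1", "B2", "C1", "A1", "B2", "C1"],
    "self": ["B1", "B1", "B2", "B1", "B2", "B1", "B2", "C1"],
})

print(pd.crosstab(data["test"], data["self"]))   # rows: test result, columns: SA

# Code the levels numerically to measure the direction and size of any mismatch.
rank = {lvl: i for i, lvl in enumerate(LEVELS)}
diff = data["self"].map(rank) - data["test"].map(rank)
print("overestimated: ", (diff > 0).mean())
print("exact match:   ", (diff == 0).mean())
print("underestimated:", (diff < 0).mean())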

Overall, the learners tended to overestimate their reading and listening skills but slightly underestimate their writing skills (as was mentioned above, Alderson 2005 found that the correspondence between SA and test result was best for writing). In reading, the vast majority of learners at A1 to B1 overestimated their skill, often by two or more levels. The match was best at around B2, whereas the C1-C2 readers tended to slightly underestimate their skill.

For listening, the picture is somewhat different. Here, too, the beginners (A1-A2) tend to seriously overestimate their skill, but there is an equally clear tendency for the advanced C1-C2 learners to underestimate how well they understand spoken English. Intermediate B1-B2 learners mostly seemed to have a more accurate idea of their listening ability than the others.

For writing, the pattern is somewhat similar to that for listening. Again, the beginners, for the most part, tend to overestimate their skills. Interestingly, the vast majority of B1 learners self-assessed their writing ability at that very same level. B2 learners, however, underestimated their writing by exactly one level. The more advanced learners, too, underestimated their writing skills, but more seriously: the underestimation was at least two CEFR levels in most cases.

Table 11 A. Crosstabulation of test results and self-assessments for reading in the English pilot data

                          Self-assessed level (Reading, detailed SA)
                          A1    A2    B1    B2    C1    C2    Total
Reading test      A1       2    12    17    10     4     4      49
result            A2       5    21    22    38     7     0      93
                  B1       3    16    42   104    22     2     189
                  B2       1     8    11    90    39    11     160
                  C1       0     2     3    63    44    13     125
                  C2       0     0     0     0     5     3       8
Total # of learners       11    59    95   305   121    33     624

Table 11 B. Crosstabulation of test results and self-assessments for listening in the English pilot data

                          Self-assessed level (Listening, detailed SA)
                          A1    A2    B1    B2    C1    C2    Total
Listening test    A1       7    24    28    12     1     4      76
result            A2       4    14    23    13     2     1      57
                  B1       2    15    63    39    13     4     136
                  B2       0     3    46    63    23    10     145
                  C1       0     2    15    40    19     9      85
                  C2       0     0    10    10    12     3      35
Total # of learners       13    58   185   177    70    31     534

Table 11 C. Crosstabulation of test results and self-assessments for writing in the English pilot data

                          Self-assessed level (Writing, detailed SA)
                          A1    A2    B1    B2    C1    C2    Total
Writing test      A1       9    25    23     0     1     3      61
result            A2       4    23    46     1     0     3      77
                  B1       2    16   143     8    11     8     188
                  B2       1     5   130    11    30     8     185
                  C1       1     0    55    10    34     7     107
                  C2       0     0     6     2    13     5      26
Total # of learners       17    69   403    32    89    34     644

The second factor whose influence on the match between self-assessment and test results is studied here is the mother tongue of the pilot test takers – which stands for their background in terms of culture and educational system. Obviously, the first language is a somewhat uncertain proxy for culture, because not all speakers of the same language share the same culture. The match between a language and a culture varies depending on the language. For example, the vast majority of those speaking Finnish as L1 probably live and are educated in Finland. However, the same does not apply to languages such as French, German, and Spanish, although the information about the location of the pilot test sites suggests that the majority of the test takers with those L1s came from France, Germany, and Spain, respectively.

Table 12 displays rank order correlations between self-assessment and test result broken down by the mother tongue of the test takers; statistically non-significant correlation coefficients are in brackets. The table also shows the mean level of proficiency in the three tested skills (R = reading, L = listening, W = writing) in terms of the 6-point CEFR scale. This enables us to see if the test takers from a particular country / language were clearly different from the others in terms of proficiency (e.g. we can see that, on average, the speakers of Icelandic tended to be more proficient than the other pilot test takers; the mean of 4 equals B2). Languages with a total of more than 70 pilot test takers are included as separate entries in the table. The ‘other D languages’ refers to those languages of the DIALANG project that had fewer than 70 test takers and which are here grouped together. ‘Other languages’ means that the L1 of the test taker was not among the 14 DIALANG languages.
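
The rank order correlations in Tables 12 and 14 are Spearman coefficients computed separately for each subgroup. The following sketch illustrates one way to compute such group-wise coefficients; it is only an illustration under stated assumptions, not the original analysis. It assumes numerically coded CEFR levels (A1 = 1 … C2 = 6) in hypothetical columns test_level and sa_level, and a grouping column such as mother tongue or level of education.

    # Illustrative sketch; column names and the 5 % significance criterion are assumptions.
    import pandas as pd
    from scipy.stats import spearmanr

    def sa_test_correlations(df, group_col, test_col="test_level", sa_col="sa_level"):
        """Spearman rho between test and self-assessed levels for each group
        (e.g. first language or education), with sample size and significance."""
        rows = []
        for group, sub in df.dropna(subset=[test_col, sa_col]).groupby(group_col):
            rho, p = spearmanr(sub[test_col], sub[sa_col])
            rows.append({"group": group, "n": len(sub),
                         "rho": round(rho, 3), "significant": p < 0.05})
        return pd.DataFrame(rows)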

Table 13 shows the mean test results and self-assessments on the 6-point CEFR scale; thus a mean close to three indicates level B1 and a mean close to four indicates level B2. The comparisons suggest that the overestimation of reading happened in all language groups, and that sometimes the average overestimation was close to one CEFR level. In contrast, the general overestimation in listening was due to clear overestimation in certain groups only (the other languages, Spanish, and to an extent Danish), and there were three groups whose members actually underestimated their listening slightly. It was noted above that, in general, writing was underestimated, and that in particular the B2-C2 learners underestimated their writing ability (in English). Seen from the first language perspective, those who underestimated their writing skills the most were the Icelandic, Norwegian, and Dutch speakers.

Since the number of test takers is in many cases rather small, it is impossible to draw firm conclusions from the results displayed in Tables 12 and 13. However, the overall tendency seems to be that these pilot data do not support the hypothesis that learners coming from different cultures and language backgrounds differ in terms of their ability to do self-assessment. Most of the correlations do not vary very much across the language groups, at least for writing and reading. Listening shows more variation, however. The potentially interesting findings in Tables 12 and 13 that might deserve further study in the future include the following:

• Listening appeared to be the most difficult skill to self-assess, at least for some groups;
• Speakers of Finnish and Spanish (except for reading) had the best match between SA and test results;
• Speakers of Dutch found it difficult to self-assess listening, and the Spanish found it difficult to self-assess reading (in contrast to the other skills);
• Speakers of Spanish tended to overestimate their skills across the board, but only in reading did this result in a significant drop in the rank order correlation between test and self-assessment.

Table 12. Rank order correlations between test results and self-assessments in the English pilot test data across different first language groups

Mother tongue       mean             st. deviation     Correlation between test result and self-assessment
                    R    L    W      R    L    W       Reading          Listening        Writing
                                                       rho       n      rho       n      rho       n
Danish             3.5  3.6  4.1    1.1  1.3  1.4      .495     37      .392     35      .528     38
Dutch              3.6  3.8  3.7    0.8  1.1  1.0      .377     83     (.155)    59      .417    101
Finnish            3.2  3.4  2.9    1.3  1.4  1.3      .543    103      .493     77      .637     94
French             3.1  2.9  3.1    1.1  1.0  1.0      .408     41     (.327)    34      .512     54
German             3.8  3.9  3.9    1.2  1.3  1.2      .476    135      .408    136      .474    138
Icelandic          4.0  4.4  4.1    1.1  1.3  1.2     (.371)    13     (.026)    18      .534     23
Norwegian          3.9  3.6  4.0    1.0  1.6  1.1      .457     24     (.406)    23      .500     29
Spanish            2.7  2.4  3.0    1.1  1.1  1.3     (.239)    54      .701     32      .544     37
Other D language   3.5  3.3  3.5    1.3  1.6  1.2      .540     36     (.118)    26      .355     33
Other language     3.0  2.6  2.8    1.3  1.3  1.3      .487     98      .469     94      .510     98

Note: rho = rank order correlation between test result and self-assessment; statistically non-significant coefficients are in brackets.

Table 13. Comparison of the CEFR level of the test results and self-assessments across different language groups

Mother tongue        Reading          Listening        Writing
                     Test    SA       Test    SA       Test    SA
Danish               3.5     4.3      3.6     3.9      4.1     4.0
Dutch                3.6     3.9      3.8     3.6      3.7     3.1
Finnish              3.2     3.4      3.4     3.5      2.9     3.0
French               3.1     3.8      2.9     2.8      3.1     3.1
German               3.8     4.2      3.9     4.1      3.9     3.7
Icelandic            4.2     3.9      4.4     4.3      4.1     3.1
Norwegian            3.9     4.1      3.6     3.8      4.0     3.2
Spanish              2.7     3.8      2.4     2.9      3.0     3.4
Other D language     3.5     3.7      3.3     3.5      3.5     3.5
Other language       3.0     4.0      2.6     3.4      2.8     3.2

Thirdly, the effect of educational background on the accuracy of self-assessment is studied. The pilot test takers were divided into two groups: one comprised those who had reported that they had either university or non-university higher education, the other included all the others (primary, secondary, vocational, and other types of education). As Table 14 indicates, the learners with a higher education background had a clearly higher level of proficiency in all three skills, which may not be very surprising. The match between self-assessment and tested ability is slightly better for the more highly educated learners of English, as far as reading and writing are concerned. However, the reverse is true for listening. None of the differences is very large, so the conclusions must be very tentative. Perhaps the fact that reading and writing in English are particularly important in higher education might mean that learners at that level of education have a slightly more accurate idea of how well they can read and write in English. This hypothesis needs to be studied further in other contexts and with other datasets, however.

Table 14. Correlations between test results and self-assessments in the English pilot test data for learners with higher education vs. non-higher level of education

Education   mean             st. deviation     Correlation between test result and self-assessment
            R    L    W      R    L    W       Reading           Listening         Writing
                                               rho       n       rho       n       rho       n
Higher     3.6  3.5  3.7    1.2  1.4  1.3      .485     390      .481     375      .537     408
Lower      3.1  3.2  3.0    1.2  1.4  1.2      .439     234      .519     159      .500     237

3.3.2.3 Learners’ reactions to DIALANG, its tests, self-assessment instruments and feedback

Learners’ reactions to DIALANG form the second major type of empirical evidence for the construct validity, authenticity, and interactiveness of DIALANG. These are presented in the articles by the present author on the VSPT (Huhta 2007a) and on self-assessment (Huhta 2007b; Huhta, submitted), and in Appendix 1 of the present document. In addition to these studies, there has been very little systematic research into users’ attitudes or reactions to DIALANG. The only exceptions to this seem to be the studies by two MA students at Lancaster University (Floropoulou 2002; Yang 2003). In the following, the main findings of all these articles and studies will be summarised.

Traditionally, research based on test takers’ views and reactions has been classified as the study of the face validity of a test, and it has been considered less important than other types of validation. That view can be – and has been – challenged on different grounds. Also, what has been called face validity can in fact mean quite different things. From an ethical point of view, it has been argued that it is only fair to give the test takers a voice in, e.g., test design, as they are important stakeholders in the testing process and are affected by the test results (see e.g., McNamara & Roever 2006). Asking for test takers’ opinions is not only a matter of fairness, however, because it can provide insights into qualities of and problems with assessment instruments that could otherwise remain undetected (e.g. think-aloud studies; see e.g., Green 1998; Banerjee 2004). Gathering comments from users seems to be quite common in the piloting phases of language tests – and in the piloting of computer software, too. Whatever information is collected from test takers can be considered one kind of validity evidence. It is good practice in testing to use as many different types of evidence as possible to triangulate the quality of assessment procedures and to see where the evidence converges and where it does not, so not using input from learners would in fact be quite odd in test development and validation nowadays.

A. Present author’s survey study of DIALANG users

The following sections summarize the results of the present writer’s survey of over 550 users of DIALANG, which have been reported in Huhta 2007a (on the VSPT), Huhta 2007b (on self-assessments), and in Appendix 1 to this paper.

a. Users’ overall reactions to DIALANG

Appendix 1 gives an account of 557 users’ responses to several open-ended questions concerning the quality of DIALANG as a whole and the quality of its feedback. The content of the responses was analysed and categorised with the help of the Atlas-ti programme designed for analysing qualitative data, and the findings described in Appendix 1 are summarised below.

Although the learners studied were not randomly sampled, they nevertheless included a substantial number of students who were not taking DIALANG voluntarily but as part of their course requirements. The sample thus included both those who had chosen to try out DIALANG because of whatever advance information they had been given by their teacher or institution, as well as those who might not have bothered to try it out had they not been compelled to do so in order to pass their language course. Overall, then, it could be argued that the informants of the study probably represent a range of typical DIALANG users, at least in Finland and probably also in Germany, which are among the countries where DIALANG has been most popular. With this in mind, it is obviously encouraging to see the overall trend in the users’ responses: positive views on DIALANG clearly outnumber negative views, and the list of different positive features of the system is longer than the list of problems (see Appendix 1).

The benefit of DIALANG most often mentioned by these users was that it gives them a chance to see what their proficiency level is or, as many informants put it, to see how good their language skills are. Many learners valued the possibility of seeing where their problems were; the immediate feedback on individual items was thought to be particularly useful for this. The multi-sidedness and comprehensiveness of the system were also appreciated (see Appendix 1 for details). One gets the impression from these comments that, by and large, the system appears to have achieved one of its main aims, namely, that it provides feedback to its users on their level of proficiency, as well as on the strengths and weaknesses in their proficiency.

The most often mentioned specific problems with DIALANG concerned the Vocabulary Size Placement Test and a range of technical problems (Appendix 1). Huhta (2007a) elaborates on the issues concerning the VSPT, which seemed to divide users’ opinions more than any other part of the system. As described above, the VSPT problems were a combination of issues with reliability, validity, and interactiveness, but there were also users who felt the test content (i.e., isolated words) was problematic, thus implying some authenticity issues as well. Other problems mentioned by a substantial number of users concerned two major aspects of the system, namely test content and feedback. Since both are so extensive, it is not surprising to see them among the top five problem areas in DIALANG, as different users would find fault with different aspects of the test content or feedback. What stood out in these complaints were the limited range of different test items (or lack of different test versions) and the overly general and impersonal feedback provided by the system. None of the above criticisms came as a surprise because similar feedback had already been received from users over the years preceding the study. However, the fact that it converges with previous information, mainly from teachers and institutions, gives these learner data added credibility. It also confirms that while DIALANG has taken steps to provide as detailed and personal feedback as possible, there is still quite some way to go in developing truly diagnostic foreign language assessment systems that give highly personalized feedback.

b. Learners’ reactions to the VSPT

The problems with the reliability of the VSPT scores were already discussed above, so they are not repeated here. However, the learners also had a lot to say about the validity of the test (see Huhta 2007a for details) when they were specifically asked about the VSPT in the author’s survey (the questionnaire had sets of questions for all the different parts and types of feedback provided by DIALANG). Two inter-linked themes stood out in their comments. First, the VSPT divided people’s opinions, even if we disregard the criticism caused by the over-sensitive scoring mechanism. The informants’ views on the content and format of the VSPT also differed: some learners thought the VSPT was the best part of DIALANG while others found the scope of proficiency it was testing insufficient.

The second theme that can be discerned underlying many comments was the unfamiliarity of the VSPT as a test method and lack of knowledge about its function as a placement instrument in DIALANG. For example, it is difficult to say if the learners criticizing the VSPT for focusing only on isolated words did not know that its main role was to serve as a quick placement tool or whether they were questioning the validity of predicting a person’s general proficiency from such a limited sample of language. A few comments also suggested that the VSPT might function somewhat differently across different languages and proficiency levels, which would undoubtedly deserve to be studied more. Interestingly, analyses carried out by the Project during the development of the VSPT indicated that there were some differences between the two language versions of the VSPT that were studied, Finnish and English (Luoma & Jäppilä 2001). The discrimination of the real words used in the Finnish test was higher than the discrimination of the pseudo-words, but the reverse was true for the English test. Also, the relationship between the piloted 150-item versions of the VSPT and the shortened, 75-item operational version was somewhat different in the two languages.

It appears that while the construct validity of the VSPT (i.e., knowledge of words) was questioned by some test takers, the interaction between the VSPT as a test task and the language ability – and possibly other competences – made the VSPT stand out from the other DIALANG tests. Some users’ comments appeared to focus on precisely the interactiveness of the VSPT. Especially the users’ reports on being engaged in metacognitive activities when taking the VSPT (Huhta 2007a, 53) seem to indicate that a specific type of interaction took place between these learners’ cognitive system and the VSPT. For the most part, such comments were positive rather than negative. When we remember that the VSPT was the part of DIALANG that attracted by far the most enthusiastic comments from some test takers, one may wonder if the reason for such rave reviews was the VSPT’s unique type of interaction between the task and the cognitive system of the learner. That unusual interaction may also explain why other learners found the VSPT so off-putting: engaging in an activity that was so different from taking other language tests may have been a sufficient reason to become suspicious of it.

The VSPT deserves more studies, but in the short term, it is important to change the scoring algorithm of the present version of DIALANG. For example, the current scoring mechanism causes many problems for the gathering of fair and reliable data, not only on learners’ reactions to the test itself but also on the scores users receive from it, and possibly from the main DIALANG tests, if the placement procedure causes them to be given inappropriately easy or difficult language tests.

c. Learners’ reactions to self-assessment and self-assessment feedback

Self-assessment is an essential element of DIALANG and two of the empirical articles that are part of the present study (Huhta 2007b, 2008) focus on it. The two articles provide us with somewhat different kinds of evidence of the construct validity of self-assessment.

The first article, Huhta (2007b), is based on the author’s survey of over 550 users of DIALANG, and it reports on their reactions to the two types of feedback on self-assessment of language proficiency, i.e., a comparison of learners’ self-assessments and test results, and information about possible reasons for a mismatch between the two. In the study, the learners replied to questions about their use of this feedback and about its perceived interest and usefulness.

The article reports on how users’ reactions to the self-assessment feedback compared with their views on other types of feedback. It turned out that feedback on the match between self-assessment and test result was considered one of the most interesting and useful types of feedback on offer in DIALANG. Only the traditional types of test feedback, i.e. the test result and (immediate) feedback on individual test items, were regarded as better in this respect. The latter finding is in line with the informants’ free responses, according to which the best aspects of DIALANG included the chance to know one’s proficiency level and the immediate item feedback (see Appendix 1). The fact that self-assessment feedback was received quite positively is obviously encouraging, considering that DIALANG aimed to develop learners’ self-assessment skills, which are important for independent and life-long study of languages.

The survey study found out that both major national groups of learners included in the study, Finns and Germans, shared a positive attitude to self-assessment of language proficiency. However, attitudes to the usefulness of doing self-assessment and getting feedback on it may vary depending on the background and educational system of the learners, as was found out in an earlier, small scale study by Floropoulou (2002b), which will be described later in this paper.

In contrast to the findings of the survey study concerning feedback on the match between self-assessment and test result, learners’ reactions to the information about the reasons why self-assessment and test result may not match were more mixed. Overall, it was considered the least interesting and useful part of feedback, which may not be surprising as it is less relevant to many test takers: only those whose SA and test result do not match may be expected, on average, to spend much time studying it. It is also one of the least personalised parts of feedback in DIALANG. However, there were also users who found it interesting and useful to dwell on the information found in this feedback.

The informants’ unfamiliarity with feedback on self-assessment (both types) was a likely reason for the finding that these two types of feedback were the ones whose usefulness the learners felt least sure about (Huhta 2007b, 381). As they had never before received such feedback, they were often somewhat uncertain in their opinion, irrespective of whether the opinion was positive or negative.

Huhta’s (2007b) study also examined whether the users’ background (e.g., age, gender, proficiency) was related to their perceived interest in and usefulness of self-assessment feedback. Did learners of a particular age, gender, or level of proficiency regard SA feedback as more vs. less useful and interesting? The findings are clearest when it comes to gender: women tended to consider SA feedback more useful and interesting than men. There was a tendency for the youngest learners (those under 18 years of age) to react to the SA feedback more negatively than the (somewhat) older learners, but the rather small number of younger learners in the data makes this finding somewhat uncertain. Level of language proficiency or experience in learning languages was not related to the way these informants thought about the SA vs. test result feedback. However, the less proficient and less experienced learners tended to regard the more general information about SA and language proficiency as more useful than the more experienced or more proficient learners did.

In addition to the information on DIALANG users’ self-assessment reported above, the users’ replies to an overall question concerning the test result (CEFR level) are worth mentioning here because they relate to the informants’ idea of their proficiency. The survey questionnaire asked the informants to say, in their own words, whether they got the results in the DIALANG test(s) that they had expected. A total of 349 of the 557 informants replied to the question (one questionnaire version lacked this question). Of those who replied, 219 said the result(s) they received matched their expectations, 50 said their results were worse and 31 that they were better than expected. Another 31 either simply stated they did not get the expected results or that the results varied, and 18 replied something else, for example, that they did not expect anything in particular. Overall, thus, two thirds of those who replied stated that the DIALANG test results had matched their expectations and one third that they had not or that the results were mixed.

The second empirical article on self-assessment (Huhta, submitted) is a very different kind of study. It aims to increase our understanding of SA of language proficiency by examining the advantages of the self-assessment used in DIALANG. What are the advantages of the self-assessment used in DIALANG? How do certain groups, such as language education experts, structure the concepts related to these advantages? The study also explores the use of concept mapping and multidimensional scaling in the study of such concepts, with the aim of producing a perceptual map that displays how the dozens of different advantages of SA cluster in relation to each other.
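
The concept mapping approach referred to here follows a general logic of turning informants’ sorting data into a similarity matrix, scaling it into two dimensions, and clustering the resulting coordinates. The sketch below is a minimal illustration of that generic approach (in Python with scikit-learn), not the actual procedure or software used in the study; the data format and the number of clusters are assumptions.

    # Illustrative sketch of generic concept mapping; not the study's own procedure.
    import numpy as np
    from sklearn.manifold import MDS
    from sklearn.cluster import AgglomerativeClustering

    def perceptual_map(sorts, n_items, n_clusters=4):
        """sorts: one entry per informant, each a list of groups (sets of item indices)
        produced in the sorting exercise. Returns 2-D coordinates and cluster labels."""
        co = np.zeros((n_items, n_items))
        for groups in sorts:
            for group in groups:
                for i in group:
                    for j in group:
                        co[i, j] += 1
        co /= len(sorts)                 # proportion of informants sorting i and j together
        dissimilarity = 1.0 - co         # frequently co-sorted items end up close on the map
        np.fill_diagonal(dissimilarity, 0.0)
        mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
        coords = mds.fit_transform(dissimilarity)
        labels = AgglomerativeClustering(n_clusters=n_clusters).fit_predict(coords)
        return coords, labels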

Based on a sorting exercise by 15 language education specialists, it appears that four main clusters account for most of the advantages of DIALANG self-assessment. According to the importance judgments by the 15 informants, the most important of these was the fact that the system combines self-assessment and tests and that the learners have a chance to compare their SA and test results. The second major cluster turned out to relate to the general quality of SA in DIALANG, the SA statements, and the range of languages available. The third concerned the awareness-raising effects of the SA, and the fact that it is based on considerable, international expertise. Finally, the fourth cluster was a chain of inter-linked concepts related to the independence, motivation, novelty, computerization, and anonymity of DIALANG self-assessment. It is not possible to fully compare the advantages ascribed to DIALANG self-assessment by language learners and teachers because of the nature of the study, but the available data suggest that the teachers considered SA to have more varied types of advantages for learning than the learners did. This is based on comparing the advantages of DIALANG self-assessment mentioned by the teachers and learners from whom the list of advantages was solicited before the teachers (i.e. language education experts) conducted the sorting exercise. The finding is perhaps not entirely surprising, as teachers are language education experts who can be expected to have more detailed and comprehensive knowledge about self-assessment. The benefits of SA gathered from language learners were mostly related to the first two clusters, that is, to the comparison of SA and test result and to the general quality of SA in DIALANG.

In addition to the article on SA-related concept mapping (Huhta, submitted), the present author has also studied whether ordinary language learners conceptualise the advantages of DIALANG self-assessment differently from the language education experts. A group of eleven students majoring in non-language subjects at a university or other type of institution sorted the same advantages as the teachers. The map that resulted from the analysis of the learners’ sorting data is shown in Appendix 3. As can be seen, the non-language experts studied here did indeed sort the advantages somewhat differently from the experts, but, overall, the clusters are not entirely different. The first cluster is the same as with language teachers: the chance to compare SA and test result. The second cluster combined statements referring to the international, expert-based nature of the system and the fact that it was motivating (this is somewhat similar to teachers’ cluster three, but learners did not include awareness raising in this cluster). The third cluster related to awareness and reflection and to the fact that language proficiency consists of different areas, so this is similar to the other half of teachers’ cluster three. However, learners included statements concerning the quality of the SA statements in this cluster. The fourth cluster was about the general high quality of self-assessment in DIALANG (ease of responding, clarity, and also the choice of language in which to reply). The fifth major cluster in learners’ classification was very similar to teachers’ cluster four: a set of interlinked themes such as independence, flexibility, and anonymity. In contrast to the teachers, these learners placed motivation in a different cluster – cluster two, as mentioned above – thus connecting it with the international, expert-based nature of the system.

The concept mapping studies provide evidence about the construct validity, and perhaps also about the authenticity, of DIALANG self-assessment at a fairly general level. The main clusters identified in the study suggest a high quality of both the overall DIALANG system and the self-assessments (teachers’ cluster 2, but also some aspects of clusters 3 and 4). Also, the fact that some of the clusters (teachers’ cluster 1 and partly 3) relate to some of the purposes of DIALANG provides support for the argument that the system indeed meets those aims – particularly that the self-assessment applied in the system raises learners’ awareness. As far as the comparison of the experts’ (teachers) and non-experts’ (learners) classifications is concerned, the overall results were rather similar: the clusters identified by both groups were often the same, although there were some interesting differences (e.g. the different association of statements relating to motivation) that may be worth studying in more detail in the future.

d. Learners’ reactions to other DIALANG feedback

This last section on learners’ reactions to DIALANG focuses on the different types of feedback that the system offers to its users. The three preceding sections have already covered feedback either from a very general point of view (users’ overall reactions to DIALANG) or from a very specific angle (feedback related to the VSPT and self-assessment). Since feedback is such an important element of DIALANG, it makes sense to present the main results of the present author’s survey of the 557 learners as far as their reactions to the eight main parts of feedback are concerned. This thus complements the previous three sections and the two articles (Huhta 2007a, 2007b) that are based on the same survey but cover the VSPT and self-assessment only.

Huhta (2007b) includes two figures that are reproduced here, as they summarize in graphical form the users’ views of the eight types of feedback in DIALANG (Figures 13 and 14 below). The survey questionnaire included three Likert-scale questions that charted the learners’ opinions of the feedback. For each type of feedback these three questions were asked:

Do you usually read this part of DIALANG feedback?
How interesting is this part of feedback in your view?
How useful is this part of feedback for you?

The response options for the first question were: never – sometimes – usually/always, and for the other two they were: very – somewhat – only a little – not at all (interesting / useful). In addition, there was a ‘cannot say’ option in the last two questions. Respondents’ opinions were also tapped with open-ended questions; these were analysed in section ‘a’ above on users’ overall reactions to DIALANG and they are more fully elaborated in Appendix 1.

Figure 13 provides an overall picture of the users’ reactions to the eight types of feedback in DIALANG, and it combines the users’ responses to the last two questions (interest and usefulness). ‘Cannot say’ responses are excluded from the figure, and the vertical numerical scale on the left is based on quantifying the responses onto a 0 – 3 scale where ‘not at all’ equals zero and ‘very’ equals three. The figure shows that in the users’ view the best part of feedback was the traditional overall result of the test (bar number 1), which is called ‘Your level’ in DIALANG but which means the same as the grade, mark or total score in other tests. On average, its rating was over 2.5, which is close to the highest possible scale value, ‘very’ interesting / useful. The ratings for all the other parts of feedback were not as high, but the average ratings for almost all of them were close to 2, which corresponds to the option ‘somewhat’ interesting or useful on the scale.
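
The quantification described above can be expressed compactly. The sketch below shows, purely as an illustration and with hypothetical column names, how the verbal ratings can be mapped onto the 0 – 3 scale and how the interest and usefulness ratings can be averaged per feedback type, with ‘cannot say’ and missing answers left out.

    # Illustrative sketch; the questionnaire column naming convention is an assumption.
    import pandas as pd

    RATING_TO_SCORE = {"not at all": 0, "only a little": 1, "somewhat": 2, "very": 3}

    def combined_ratings(df, feedback_types):
        """Mean of the 0-3 interest and usefulness scores per feedback type;
        'cannot say' and missing responses map to NaN and are ignored."""
        means = {}
        for fb in feedback_types:
            interest = df[f"{fb}_interesting"].map(RATING_TO_SCORE)
            useful = df[f"{fb}_useful"].map(RATING_TO_SCORE)
            means[fb] = pd.concat([interest, useful]).mean()
        return pd.Series(means).sort_values(ascending=False)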

Figure 14 tells basically the same story as Figure 13, but it displays the distribution of users’ responses over the response options (except ‘cannot say’, which is excluded here). The main additional information that it provides compared with the other figure is that, with only two exceptions, the most common reaction to the different types of feedback was that it was ‘somewhat’ interesting or useful. The exceptions are the main test result, which received more ‘very interesting’ and ‘very useful’ ratings than any other part of feedback, and the information about self-assessment (reasons for possible mismatch), where somewhat negative ‘only a little’ type ratings were the most common response. (In Figure 14, very positive ratings are those where the mean of the ratings of interest and usefulness is close to three (i.e., ‘very’), somewhat positive ratings are close to two (‘somewhat’), somewhat negative ones close to one (‘only a little’), and very negative ones close to zero (‘not at all’).)

[Figure 13 is a bar chart with bars 1–8 on a 0–3 vertical scale. Legend: 1 = Test result (CEFR level); 2 = Immediate item feedback; 3 = Post-test item review; 4 = Feedback on match between self-assessment and test result; 5 = Information about self-assessment; 6 = Extended descriptions of levels; 7 = Advice on how to improve proficiency; 8 = Placement test result.]

Figure 13. Summary of the learners’ overall reaction to the different types of feedback in DIALANG (ratings of interest and usefulness combined in each bar)

[Figure 14 is a stacked bar chart (0–100 %) showing, for each type of feedback, the proportions of very positive, somewhat positive, somewhat negative, and very negative ratings. Legend: 1 = Test result (CEFR level); 2 = Immediate item review / feedback; 3 = Post-test item review / feedback; 4 = Self-assessment feedback (match between SA and test result); 5 = Information about self-assessment; 6 = Extended level descriptions; 7 = Advice on improving proficiency; 8 = Placement test result.]

Figure 14. Detailed breakdown of the users’ reactions to the different types of feedback (ratings of interest and usefulness combined in each bar)

Table 15 below displays a detailed breakdown of users’ responses to the three key Likert-scale questions on the use (i.e. reading) of the different types of feedback and on their attitudes towards them in terms of perceived interest and usefulness.

In the main, the table presents in numerical form the information displayed in Figures 13 and 14 above – at least as far as the ratings of interest and usefulness are concerned. The table, however, gives more information. First, it shows the users’ responses to the question about whether they actually read the feedback; this is likely to give us an idea of what feedback users of DIALANG typically pay attention to and what they do not read so often.

Table 15. Distribution of users’ responses to the three Likert-scale questions on reading the different types of feedback and on their perceived interest and usefulness

                                  Read in general?           How interesting?                           How useful?                               Strength
                                  Never Sometimes Usually/   Not    Only a  Somewhat  Very  Cannot      Not    Only a  Somewhat  Very  Cannot
                                                  always     at all little                  say         at all little                  say
Test result (CEFR level)            1       6       93        0       3       20       77     0           2       6       38      53     1          87
Immediate item review/feedback     18      21       61        5      13       28       50     4           6      10       31      49     4          86
Post-test item review/feedback     13      36       51        3      18       38       39     3           4      19       37      37     3          81
Self-assessment feedback
  (SA / test result match)          7      27       66        2      13       43       39     3           2      19       46      29     4          67
Information about
  self-assessment                  25      58       17        8      34       37       15     6           7      32       39      15     7          61
Extended level descriptions        17      54       29        4      20       47       25     4           4      22       47      22     5          83
Advice on improving proficiency    20      43       37        4      16       41       33     6           3      14       39      38     6          80
Placement test result               8      27       65        5      15       41       35     4           6      20       43      25     6          77

Figures are percentages of respondents. ‘Strength’ = percentage of those answering the additional certainty question who said they were quite certain of their usefulness rating (see the discussion below).

The informants’ responses to the question on reading the feedback give a partly different picture than the responses to the questions on interest and usefulness. Again, the test result dominates as the most often read part of feedback – and it is also rated as the most interesting and useful one, as was described above. However, the two other types of feedback that resemble the test result in that they are brief and focus on one or two overall results were also used very frequently, according to the informants, namely the feedback on self-assessment (the match vs. mismatch) and the VSPT result. The latter is in fact automatically displayed to the user who takes it, unlike all the other feedback that needs to be accessed via the feedback menu or by switching it on (the immediate item feedback). Thus, although only about a third of the informants rated self-assessment feedback and the VSPT result as ‘very’ interesting and useful, two thirds of them nevertheless said they read them usually or always. This is an interesting discrepancy that deserves further research. Perhaps many learners who do not necessarily consider these types of feedback useful just want to read them after each DIALANG test in case they might learn something from them. Also, the use of both types of item-level feedback was reported to be somewhat more common than their ratings of interest and usefulness might lead us to expect.

The remaining feedback (advice, extended descriptions of CEFR levels, reasons for mismatch between SA and test result) does not change much depending on the test result but stays more or less the same from one test session to the next. Thus, it can be expected that these types of feedback are not read after every test – and this is exactly what the informants’ responses to the question about reading the feedback tell us in Table 15.

An analysis of the number of informants who actually replied to the questions on reading the particular type of feedback supports their actual responses (this is not displayed in Table 15, however): the number of informants responding to the question on the test result was the highest (462, as not everybody replied to this question, especially if they had taken only one test), and the number was almost as high for the two types of item review and the VSPT, but only 300 – 320 for all the others. The number of informants who replied to the questions concerning their interest in the different types of feedback and their perception of its usefulness was, however, very high and ranged from 470 to 550 depending on the feedback (the learners could read about all the feedback and see screenshots of it in the questionnaire, so they could usually form an opinion of it even if they had not read the particular feedback, for whatever reason).

Another additional piece of information that Table 15 contains concerns the proportion of learners who were not sure how to judge the interest and usefulness of the different types of feedback (the ‘cannot say’ categories in Table 15). The final column in Table 15 also concerns informants’ uncertainty in forming an opinion when asked to rate the usefulness of each type of feedback. As explained in Huhta (2007b), a question was added to the last version of the questionnaire that asked the learners to say how strongly they felt that the feedback they had just rated in terms of usefulness was either useful or useless for them. These responses are available from only 130 respondents, but they nevertheless provide us with more precise information about the certainty of the learners’ views on the usefulness of feedback. The number in the ‘strength’ column indicates the percentage of those who responded to this question who said that they were quite certain in their opinion on the degree of usefulness of the particular type of feedback.

Again, the test result stands out as the type of feedback that the learners have a firm opinion on – typically a very positive one. Hardly anybody said they could not say whether that feedback was interesting or useful, and 87 % of those who estimated the certainty of their judgement of usefulness said they were quite certain in their view. The ‘cannot say’ and ‘strength’ responses do not, however, always give the same picture. Generally, not many respondents chose the ‘cannot say’ option, but when they did, the choices concerned the information about SA, the advice, the extended descriptions of levels, and the VSPT (4 – 7 % of the respondents chose the ‘cannot say’ option). The examination of the responses in the last column in Table 15, which indicates the certainty of the usefulness ratings, suggests that there were two types of feedback in DIALANG whose usefulness was difficult for a considerable proportion of DIALANG users to judge: both concerned self-assessment, and only 61 – 67 % were quite sure about their opinion of them.

The informants’ reactions to the feedback were triangulated in the questionnaire in several ways. As has been described above, the informants were asked to rate the different types of feedback with Likert-scale questions, and they were also asked to evaluate the feedback and other aspects of DIALANG by answering open-ended questions. In addition, the informants were asked to compare the different types of feedback by selecting the three most useful parts of feedback and by marking which one of these was the most useful, which the second most useful, and which the third most useful. Furthermore, they were asked to pick the three least useful types of feedback and rank them from 1 to 3 in the same way as the most useful feedback. A total of 393 informants selected the most useful feedback and 373 the least useful feedback (not all questionnaire versions contained these comparative questions).

Table 16. Ranking of feedback: the three most useful vs. the three least useful parts of feedback

                                 Selected as one of the three          Selected as one of the three
                                 most useful parts of feedback         least useful parts of feedback
                                 Most    2nd   3rd   Not               Least   2nd   3rd   Not
                                 useful              mentioned         useful              mentioned
Test result (CEFR level)           50     26    12      12                1      1     2      96
Immediate item review /            24     22    12      42               12      7     5      76
  feedback
Post-test item review /            10     19    11      60               10     11    14      65
  feedback
Self-assessment feedback            3     11    21      65                8     18    12      62
  (SA / test result match)
Information about                   1      0     5      94               30     24    17      30
  self-assessment
Extended level descriptions         2      7     8      83               11     18    13      58
Advice on improving                 6     10    12      72               10     12    14      64
  proficiency
Placement test result               4     10    14      72               15     11    18      56

Figures are percentages of the informants who ranked the feedback.

Table 16 summarizes the rankings of the most useful and least useful types of feedback in DIALANG. The comparisons of feedback support the information obtained with the other types of questions. Again, the winner was the test result, which was chosen as the most useful type of feedback by 50% of those who ranked the usefulness of feedback, and very few selected it as one of the three least useful parts of feedback. However, twelve percent did not choose it among the top three, which means that they considered it only ‘average’ among the eight forms of feedback in terms of usefulness. The usefulness of item review, especially the immediate item review, is more pronounced in these rankings than in the other types of questions reviewed above. 24 % chose the immediate item review as the most useful feedback, and only 42 % failed to include it among the top three. The post-test item review was the next most useful but was clearly behind the immediate review with 10 % of top ratings. Information on self-assessment and the extended CEFR level descriptions were the types of feedback least often chosen among the top three.

The selections of the three least useful types of feedback differ less from each other than the rankings of the three most useful ones. The exceptions to this are the test result, which was almost never chosen among the bottom three, and the information on SA, which 30% chose as the least useful feedback. However, another 30% did not choose the information on SA among the bottom three, which shows that it divided learners’ opinions somewhat.

Finally, two more specific findings related to the feedback are presented. The survey questionnaire also included two further questions that concerned feedback on the individual test items. The first asked the informant to state which item review – immediate or post-test – he or she preferred. A total of 338 learners replied to this question: 58% of them preferred the immediate item review, whereas only 30% considered the post-test item review better, and 9 % thought they were equally good. This is in line in particular with the ranking information presented in Table 16, which also indicates a clear preference for the immediate item review. Also the ratings of interest and usefulness in Table 15 discussed earlier display this tendency, especially when one looks at the percentages in the ‘very’ category for these two types of feedback. The overall conclusion from users’ opinions on the two types of item-level feedback, however, is that both of them are needed and that the possibility of choosing which one to use adds to the usefulness of DIALANG, because users’ preferences were somewhat divided, with some clearly preferring one and others the other. It appears that the preference for either immediate or post-test item review is related to the interactiveness aspect of usefulness. Whether one uses the immediate item feedback, in particular, appears to affect the way in which the test taker interacts with the test as a whole, at least, and possibly with each individual item. Some users commented that immediate item feedback disturbed and interrupted their test-taking process and even depressed them when they got items wrong. Others, however, specifically wanted to know right away whether they had replied correctly or not because they felt that was the optimal way for them to work their way through the test.

The second question reported here concerned the usefulness of the categorization of the test items into subskills (of reading, writing, and listening) or areas (of grammar and vocabulary) on the screen for the post-test item review. The question was included in only some questionnaire versions, so only 253 informants replied to it. A clear majority, 78 %, considered it ‘useful’, 6 % ‘somewhat useful’, and only 6 % ‘not very useful’ or ‘not useful’ (4 %).


B. Other studies on DIALANG users’ reactions

Floropoulou (2002b) investigated five Greek and five Chinese students’ attitudes to self-assessment and their reactions to DIALANG. Her findings indicate that the informants’ culture was clearly related to their opinions: while most of them tended to deem self-assessment a worthwhile activity, the Chinese were more positively disposed to DIALANG than the Greeks, who tended to be indifferent or even negative towards the system. She also found out that the self-assessment implemented in the system did not automatically help learners untrained in SA to assess themselves accurately, as the accuracy of their SA, both against the DIALANG test results and against other information about their proficiency, varied considerably. The informants had never really thought about self-assessment before the study, which made it difficult for some of them to do the SA task. Two of the ten students mentioned that it was difficult to make the binary yes/no decisions when answering the statements. Floropoulou argues that many users of DIALANG need support in using the system and doing self-assessment with it.

However, Floropoulou also reported that some informants appeared to change the way they viewed their language proficiency (in English, the test language) and that they could identify their strengths and weaknesses with the help of the programme. Thus, the system appeared to increase some users’ awareness and made them think about language proficiency, and also about the CEFR and how proficiency is conceptualized in it (all informants were completely unfamiliar with the CEFR before taking the test).

Floropoulou (2002, 31) asked the students to rank several aspects of DIALANG in terms of their helpfulness. The most helpful part was considered to be the ‘advisory feedback’, which consists of extended descriptions of the CEFR levels and advice on how to make progress towards the next level from the one obtained on the test (5 out of the 10 informants selected this part). Immediate feedback on items was considered the next most helpful feedback, followed by the explanations of why one’s SA and test result may not match, and the CEFR levels (apparently the test result expressed as a CEFR level). The self-assessment was considered somewhat less helpful than those already listed. By far the least helpful in the opinion of Floropoulou’s informants was the VSPT.

Finally, Floropoulou (2002b) also noticed that the students tended to compare DIALANG with the proficiency tests they had taken in the past, such as the TOEFL and the IELTS. Even though they were “well informed about the function of DIALANG i.e. self-evaluation, they seem to forget about it immediately after they get the results” (op. cit., p. 35).

In another study, Yang (2003) investigated how test takers use DIALANG feedback. In her study, twelve postgraduate students at Lancaster University took the English reading test in DIALANG. Yang then interviewed them and used pictures of the feedback screens to support their memory when discussing their reactions to feedback.

Yang attempts to explain learners’ reactions to feedback on the basis of their goals and motivation for learning English, and their views about feedback and language tests. Overall, she found out that most of the subjects reacted positively to DIALANG feedback. However, their goals appeared to have an effect on, for example, whether they were willing to make the effort to follow the advice on language learning that they could read in the advice section of the feedback. Students whose main aim was to get a degree from the university rather than to improve their English were less likely to take up the advice, as Yang found out after re-interviewing some of her informants two weeks after the test.

The majority of the subjects in Yang’s study failed to recognize the aims and functions of the DIALANG test (Yang 2003, 43). Their prior knowledge and beliefs, formed from such proficiency tests as the TOEFL and IELTS, prevented them from fully understanding the difference between DIALANG and other tests. Especially those learners who were doubtful about the test and its feedback tended to compare it with the international language certification examinations (see also Floropoulou 2002b above). For example, the role of the VSPT and self-assessment did not seem fully clear to them, nor did the need to have such information in the test as the explanations of the reasons for a possible mismatch between self-assessment and test result.

The informants seemed to consider that the main purpose of feedback is error correction, which is a very commonly held view (see e.g. Kulhavy, 1977). According to Yang (2003), this led them to pay special attention to such feedback as the item review and the references to weaknesses in learners’ proficiency (rather than to what they can do) in the extended descriptions of the CEFR levels found in the advisory feedback in DIALANG.


Yang also found out that the explanatory and advisory feedback made her informants reflect upon their learning process and helped them realize the factors involved in their self-assessment. The comparison of their current level with the next higher level in the extended level descriptions seemed to provide some learners with goals for learning, and the suggestions for improvement appeared to motivate them to make efforts by providing them with strategies with which to reach the goals. Yang argues that this reflection by the learners shows how the feedback can function as a catalyst for self-regulated learning, as argued by Butler and Winne (1995, 246). In the context of the present paper, Yang’s findings suggest that DIALANG feedback may enable DIALANG to meet one of its key aims, namely improving its users’ autonomous, self-directed learning of languages. This, then, also implies that the feedback serves to promote the kind of interactiveness between the DIALANG test tasks and the learners’ characteristics that the designers of the system hoped it would promote. Thus, these findings provide evidence for the interactiveness aspect of the usefulness of DIALANG.

Yang (2003, 37) also concluded that the VSPT feedback was the least effective type of feedback, because the majority of the subjects did not agree with the VSPT score. This finding is in line with the present author’s finding that the VSPT was considered the most problematic part of DIALANG in the survey of over 550 users of the system (Appendix 1 and Huhta 2007b).

3.3.2.4 Other empirical evidence about DIALANG feedback

An on-going project at CALS provides empirical evidence that is relevant to the validity of the ‘Information about self-assessment’, which is part of DIALANG feedback (referred to as ‘Explanatory feedback’ in Alderson 2005). That feedback is aimed at learners who assess their proficiency with the DIALANG self-assessment instrument but find out that their test result does not agree with their self-assessment. The feedback screen that provides the users with information about the match or mismatch between their SA and test result also provides them with direct access to ‘Information about self-assessment’, which is a set of inter-linked pages that describe why language learners’ self-assessment may not always match the outcome of language tests. The pages also contain more general information about language tests by briefly stating the main differences between DIALANG and most other Internet-based (typically small-scale and very one-sided) language tests, other large-scale international language tests, and the tests that language learners encounter at school. In addition, this section provides information about the more general differences between language tests and so-called real-life language use, which may also help one to understand why language test results may not match the learners’ expectations based on using the language in the ‘real world’. There was no single source for the content of this part of the feedback; rather, results from a range of different studies, as well as the experience of language teachers and learners, were combined to produce these explanations.

The project that is relevant to the discussion of the validity of feedback is called ToLP (Towards Future Literacy Pedagogies. Finnish 9th graders’ and teachers’ literacy practices in school and out-of-school contexts), which is an Academy of Finland funded (2006–2009) study carried out at the University of Jyväskylä, Centre for Applied Language Studies (see www.jyu.fi/tolp for more information). The project deals with the literacy practices of Finnish 9th graders (about 15-year-olds) and their L1 and L2 teachers in the comprehensive school, in both in-school and out-of-school contexts. The project conducted a survey of the 9th graders based on a statistically representative sample of students in that grade, and a small section of the questions explored the students’ views on which factors they thought had contributed to their idea of knowing a foreign language either well or not so well. Thus, the question concerned the basis of their self-assessment of foreign language proficiency.

Some of the factors presented to the learners for consideration were very similar to the points included in ‘Information about self-assessment’ in DIALANG, which is why the students’ answers inform us, for example, about the relevance of these points. Do young but experienced learners (typically, they had studied one foreign language for at least seven years and at least one other language for a shorter period) consider the points suggested in the DIALANG feedback on self-assessment to be factors that have affected their view of how well they know the foreign languages they have studied?

The precise question was this (translated from the original Finnish; see also Table 17):

How do you know that you know a foreign language well or poorly? What affects your own view of your language proficiency? Select one option in each row.


Table 17 reports the responses of the 9th graders studied in the TOLP survey. Probably not unexpectedly, the results appear to confirm the assumption made in the explanatory feedback on self-assessment in DIALANG that the experience and feedback that learners receive in formal language education (lower-secondary school in this instance) have a powerful effect on how good vs. poor they perceive their proficiency to be. In particular, school tests and examinations appear to have a clear impact on how learners perceive themselves, at least according to their own opinion. The vast majority of all respondents estimated that school-related factors had had at least some effect on their views of their proficiency. Interestingly, however, the single most important factor in this respect, according to the Finnish comprehensive school students, was how well they had actually managed to use the language for real-life purposes (point 2 in Table 17), so the school is clearly not the only influential factor.

Table 17. 9th graders’ view of what has affected their view of their foreign language proficiency in the TOLP survey

                                                        This has affected my view of my proficiency
                                                        a lot   somewhat   only a   not at   I cannot
                                                                           little   all      say
1. How easily I learn the language in question at        35        51         9       1         4
   school.
2. How I manage to use the language (e.g. when           57        32         8       1         2
   traveling, using the Internet, reading magazines).
3. What my teachers have said about my language          28        47        19       3         3
   proficiency.
4. What my friends and family have said about my         16        45        29       6         4
   language proficiency.
5. What the foreigners that I have met have said         23        28        18      14        18
   about my language proficiency.
6. How well I do on the examinations at school.          43        41        12       3         2
7. How well I know the language compared with the        18        36        29      11         6
   others in my class.

The figures in Table 17 are percentages of respondents who chose each response option (n = 1687).

Information about self-assessment in the DIALANG feedback section also suggests that some learners may compare themselves with other learners – at school or in the workplace, for example – and form their idea of themselves as good or poor users of the language based on that comparison. The importance of comparing oneself with others is also suggested by psychological research into personality (e.g., McFarland & Miller, 1994). The TOLP survey indicates that quite a few (youngish) language learners indeed say that comparison with other learners affects their view of their mastery of another language, although that factor was not as important as the other factors related to school or real-life use. Interestingly, a closer analysis of the responses suggests that boys were more likely to compare themselves with their schoolmates than girls – or that they admitted doing so more readily than the girls.

Finally, a study by Luoma and Tarnanen (2003) should be mentioned although it does not directly relate to the piloted or operational versions of DIALANG. For practical reasons, the Project had to limit the range of item types implemented in the system and leave out many potentially interesting and innovative tasks (Alderson 2005). However, the Project designed and published a set of ‘experimental’ items that illustrate alternative approaches to utilizing the computer or to providing feedback. Some of the experimental items demonstrate how speaking and writing could be assessed by using self-assessment, and Luoma and Tarnanen (2003) describe the rationale and development of one such item that tests writing of Finnish as L2 (see item 12 in the Experimental Item Types Demonstration Programme available at http://www.lancs.ac.uk/fss/projects/linguistics/experimental/start.htm ). The findings of the study suggested that the learners involved in it were able to self-assess their writing proficiency quite accurately with the help of the self-assessment task that was designed for the purpose.

3.4 Impact of DIALANG

According to Bachman and Palmer (1996, 30), the impact of test use operates at two levels: micro and macro. The micro level concerns individuals who are affected by the test, such as learners and their teachers, whereas the macro level relates to the educational system and society at large. The impact of assessment is often discussed in terms of how positive or negative it is, and the designers and administrators of assessments typically hope that the impact will be positive or that there will be no unintended consequences, especially negative ones.

Before analyzing the impact of DIALANG according to the Bachman and Palmer framework, it is important to consider certain factors that are relevant to the discussion of the impact of this particular language assessment system. A highly significant premise for any treatment of the effects that DIALANG has had is the fact that the system was not just another local testing system designed at the initiative of a few individuals or institutions. It was a high-profile venture – at least in the area of European language education – that came into being because of the political will of the Directorate-General for Education and Culture of the European Commission. As described by Alderson (2005, 36-37), the system was conceived of as one of the assessment systems serving the European skills card scheme (European Commission 1995, 35) proposed in the mid 1990s by the Commission. The skills card was planned to be a document that EU citizens could use to demonstrate all the different kinds of skills they had acquired either through formal education or in some other way. Obviously, such a scheme needed procedures to certify citizens’ skills in different areas of life, and plans were developed to design instruments for certifying the different skills. Since there were already several international examinations for certifying foreign language skills, it was not politically feasible to design a new pan-European examination system for foreign languages, and DIALANG therefore became a diagnostic rather than a certification system.

Because of the high political profile of DIALANG it was thoroughly reviewed and monitored by the European Commission during its development (see e.g., the evaluation report by Leney et al. 1998). From the beginning, DIALANG also attracted considerable attention among language education specialists in many European countries and beyond. The system was also presented in numerous seminars and conferences, and articles were written about it

in journals, language teachers’ magazines and even in newspapers, at least in some European countries. Naturally, DIALANG has also been promoted by the European Commission in their language policy documents (e.g., European Commission, 2003). Thus, it can be expected that DIALANG has had a reasonable impact on language learners, teachers and educational institutions in Europe in particular. What the impact has been like and how many people are likely to have been affected will be examined next.

3.4.1 Impact on individual learners

The individuals most directly affected by test use are in most cases the learners taking the test. Bachman and Palmer (1996, 31) argue that test takers can be affected by three aspects of the testing procedure:

1. The experience of taking and, in some cases, of preparing for the test;
2. The feedback they receive about their performance on the test; and
3. The decisions that may be made about them on the basis of their test scores.

The impact of DIALANG on individuals, institutions or society has not been systematically studied. The research reported in the articles comprising the present thesis, and the supplementary findings reported in this paper, shed some light on the effects of DIALANG on individual test takers, including how they felt about taking DIALANG tests. The article on the VSPT (Huhta 2007a), in particular, contains learners’ comments on how they felt when taking the vocabulary tests that form part of DIALANG. To summarize, the VSPT divided users’ opinions like no other part of the system. It inspired some of the most positive reactions to DIALANG: taking the VSPT was felt to be a very different and interesting experience compared with typical language tests. Some learners, however, did not like the experience at all because, for example, it seemed to them that the VSPT was testing something other than what they believed it should, or tested them in a way that they felt to be inappropriate – as is described in more detail in Huhta (2007a) and in previous sections of this paper.

There appears to be no published research on any preparation that language learners may do prior to taking DIALANG tests. This is perhaps not surprising because typically DIALANG is not – and should not be – used for high-stakes purposes, and there should be little need for

anyone to prepare for it, unless preparation means familiarizing learners with the technicalities of the computer-based system that underlies the assessment. However, DIALANG itself is sometimes used as a way to prepare for more high-stakes tests, as reported by some teachers at the Language Centre of the University of Jyväskylä during the collection of the survey data reported in Huhta 2007a and 2007b and in this paper. Those teachers considered it useful preparation for their end-of-course language tests to have their students take DIALANG tests in the skills covered by those achievement tests, even if the medium of delivery was different, i.e., computerized rather than paper-and-pencil.

The second aspect of impact on individuals defined by Bachman and Palmer (1996, 31) concerns the impact of the feedback that a test delivers. As far as DIALANG is concerned, the impact of its feedback is highly important because feedback is such an essential element of the whole system. The feedback is intended to affect the individual user in several ways, as explained at the beginning of this paper: he/she should receive information about the strengths and weaknesses in his/her proficiency, become more aware of language and learning, and be able to plan future language learning on the basis of the feedback.

Section 3.3.2.3-A-d (pages 106-114) above and Appendix 1 present a summary of learners’ comments on DIALANG collected in the survey of over 500 learners. These comments give us an idea of the impact of DIALANG feedback on typical users of the system. Users’ views on what is best in DIALANG clearly indicated that the feedback they received on their level of proficiency was the most important thing they got out of the system. Many also wrote that the more detailed feedback on errors in specific test items was useful. In fact, all types of feedback in DIALANG were considered interesting and useful by significant proportions of the learners who participated in the survey, although the popularity of the different types of feedback varied (see Huhta 2007b for details).

The third aspect of impact on individuals concerns the decisions made on the basis of the feedback. In the case of DIALANG, the most typical decision-makers are the learners themselves and the teachers who place students on their courses. Only rarely does somebody else, such as a researcher or an educational authority, use the information provided by DIALANG. It is somewhat difficult to know whether learners or language teachers are the most common users of DIALANG test results and other feedback. The system was initially designed with the individual learner in mind but it is clear that since its launch in 2001, an

ever-increasing number of institutions such as universities, adult education centres and language schools, and their teachers, have started using the system for placement purposes. In that type of use, the learners can obviously see their CEFR level, at the very least, when they view their test result, but the main user and decision-maker is the teacher or other representative of the institution responsible for organizing the placement procedure. On the whole, however, the fact that the learners themselves are often the only stakeholders making decisions on the basis of the feedback sets DIALANG apart from more high-stakes tests. Also, the type of decision is different: in DIALANG, the conclusions drawn from the feedback typically concern the overall stage and details of learning, or the way forward. In higher-stakes and teacher-led testing, the decisions generally concern such matters as acceptance or denial of entry into studies, or the awarding of grades and certificates (cf. Bachman and Palmer 1996, 32).

What we know of the impact of DIALANG feedback on individual learners who take the tests for other than placement purposes is, thus, that the vast majority of them appear to consider at least some of the feedback interesting and useful (see Section 3.3.2.3 and Appendix 1 for details). Forming a view on DIALANG feedback obviously required that the learners surveyed had at least read and/or used the feedback they received, but it should be made clear that neither the present author’s survey of users nor any other available information can tell us much more about the impact of the feedback. More specifically, we do not know directly and accurately what kind of decisions these learners made on the basis of the test results and other feedback they received from DIALANG. In the following, the kinds of decisions that learners can potentially make after considering DIALANG feedback are briefly outlined.

The ‘decisions’ or conclusions that learners might draw from DIALANG feedback are obviously related to the goals of the system. Learners are informed about their level of proficiency on the CEFR scale, and they also receive more detailed information about their proficiency, as well as information about their self-assessment. What conclusions learners draw from this information depends on how believable they find the results they receive. If the results seem implausible, the learners are likely to reject the information as useless. However, if the feedback appears believable, there are several ways in which the learners might want to use such information. Probably the most straightforward type of decision concerns feedback on individual items and points of language: if the learner notices

that he/she made a particular error, it is quite possible that he/she learns that point of language or avoids the same error in the future, and he/she may check the problem point later in a dictionary or a grammar book. Such actions were in fact specifically mentioned by some learners surveyed or interviewed by the present author at various points in the development of DIALANG.

It is more difficult to say what conclusions the learner draws from the more general feedback, such as that on the level of proficiency. The learner’s view of his/her proficiency may be confirmed or not, and this may, in a very general way, help him or her in setting goals for further learning. Perhaps the learner decides to try one or more of the pieces of advice for making progress towards the next CEFR level. Indeed, a few learners surveyed mentioned that they might want to try to do as DIALANG advised them in order to improve their proficiency, but whether they actually did so is not known.

One of the goals of DIALANG is to increase learners’ awareness of their own proficiency and of language proficiency, language tests, and language learning in general. How this relates to the decisions the user makes on the basis of the feedback is not straightforward. Probably the only thing we can say is that being more aware of language and learning helps one to make more justified and realistic decisions and conclusions about one’s language skills and about what to do next. Some learners’ comments in the survey study suggest that they had indeed become (more) aware of certain things after reading the feedback. This was also found by Yang (2003) in her study reported earlier. How this might affect language learning in the longer run is obviously beyond the scope of the cross-sectional studies reported in the current thesis. Only truly longitudinal research (see e.g., Ortega & Byrnes, 2008; Ortega & Iberri-Shea, 2005) could reveal how learners’ learning, beliefs, and other behaviour may have been affected by their use of DIALANG.

Yang’s (2003) small-scale study, in which she interviewed, via e-mail, some of her 12 informants two weeks after they had taken a DIALANG test, showed that some learners had indeed made an effort to do as advised by DIALANG but that others had not. Interestingly, her study also suggested that willingness to be affected by such feedback may be related to the learner’s goals and motivation for language learning.

3.4.2 Impact on language teachers, schools and other educational institutions

The studies reported in the articles that comprise the present author’s thesis focus on individual learners rather than language teachers and language-teaching institutions. However, in considering DIALANG’s impact on teachers and institutions, it is appropriate to review briefly what we know of this more general aspect of impact.

DIALANG is a battery of language tests that can operate independently of a language teacher; in fact, it was aimed at individual language learners rather than at groups studying under the supervision of a teacher. For this reason, there was no systematic effort to develop procedures and advice for language teachers on how to make use of DIALANG in their teaching. Nor was the work and research on teacher-based assessment (e.g. formative assessment) actively used in the development of the system, although it is clear that the goals of DIALANG and the goals of typical formative assessment are very similar (see Huhta 2008). During the pilot testing phase, some advice on how to make use of the pilot tests in teaching was placed on the Project website for the benefit of teachers who helped the Project by administering pilot tests to their students (see Information for Language Teachers at http://dialang.org/project/english/index.html ). Towards the end of the Project, some advice for language teachers on using the operational version of DIALANG was developed, but mainly due to lack of funding it was never translated and placed on the new DIALANG website (www.dialang.org ).

However, dozens if not hundreds of language teachers use DIALANG not only as a placement tool but also in a more versatile and integrated fashion on their courses. Such uses include at least preparation for other language tests and examinations, and the training of self-assessment. Some teachers involved in the present author’s survey study of DIALANG users also used the system as a way to increase their students’ awareness of language and learning – which is obviously very compatible with the goals of DIALANG. It is very useful for many learners that a language teacher is available to explain what DIALANG is and to help them understand the results and other feedback. The need for teacher support in using DIALANG clearly arose in Floropoulou’s (2002b) study, in which some of the learners found the very idea of doing self-assessment alien, for example. Also, some of Yang’s (2003) informants did not fully understand the purpose of certain parts of the system such as the VSPT, self-assessment and some of the feedback. In both studies the learners’ view and understanding of

DIALANG was strongly affected by their previous experience of certification examinations, which have a very different purpose from DIALANG. Also, some participants in the present author’s survey of DIALANG users mentioned that teacher support is needed in interpreting some of the information and in helping learners to draw conclusions for their own learning from the results. All this can obviously be seen to be at odds with a very common aim in language education, namely increasing learners’ independence and self-regulation, which is also one of DIALANG’s aims. However, not all learners are alike and some probably need a lot of teacher support, or, at least, they need it at certain points in their learning.

Taking into consideration all the information collected by the DIALANG Project and all the communications between the Project and hundreds of language teachers and institutions across Europe, it is clear that there was, and still is, widespread interest in DIALANG among the language teaching profession in Europe. Dozens, if not hundreds, of institutions have used DIALANG for placement purposes, often on a fairly large scale, i.e., with hundreds of students per year, and for several years. As already mentioned, DIALANG is also used in many institutions as a self-study component, as a way to practice language and test-taking for more high-stakes tests, and as a way to understand and practice self-assessment – and for several other purposes. Again, we lack precise information about the number of institutions and teachers using the system for these purposes.

Although we do not know how many institutions use or have used DIALANG, we have some information about why different institutions use the system and what they think about it, all of which sheds light on the impact of DIALANG on institutions. A project-internal survey of typical institutional users of DIALANG across Europe was carried out in 2005 by the then Project Manager Gillian McLaughlin. The survey covered 58 institutions, 49 of which had already used DIALANG. The institutions were mostly located in the countries that are known to use the system most frequently (cf. the statistics on tests taken and visitors to the website below), namely the Netherlands (14 institutions), Germany (9), Belgium (5), Italy (4), Switzerland, Finland, France, and the UK (3 each). Most of the institutions represented higher education (16 institutions), vocational training (13), and secondary education (12), but some also came from the private education sector or companies’ in-house training sections.

The institutions were asked to rate certain features of DIALANG on a 1-5 scale, where 1 indicated very weak and 5 excellent. The institutions’ mean ratings of the system were as follows:
• range of languages (4.64)
• clarity of interface, instructions, and navigation (4.27)
• content of the tests (4.18)
• quality of feedback and advice (4.07)
• reliability of the test results (3.89)
• ease of running DIALANG (3.73)
• ease of download and installation (3.71)
• efficiency of the system (3.58)

The purposes for which the institutions were using DIALANG – or were hoping to use it, if not yet doing so – included these: as a placement test (35 institutions), to monitor progress (31), as a pedagogical aid (29), as an entrance test (23), or, more rarely, as a recruitment test (6). Thirty-three institutions gave an estimate of how many students they expected to use DIALANG during the year when the survey was carried out. On average, each institution expected to have about 800 users, with a grand total of almost 26 000 students in those 33 institutions.

Overall, then, the institutions surveyed by McLaughlin had a very positive view of DIALANG, which of course provides a good basis for the system having an impact on what these institutions do, and how they do it, when it comes to testing the language proficiency of their students. If this sample generalizes to the institutions that use DIALANG, we can conclude that such institutions appear to value the unique range of languages offered by the system and, on the whole, they seem quite satisfied with its content and ease of use, as well as the technical aspects of the system. The number of students affected must be quite considerable (more on this below), and DIALANG appears to give language teaching institutions a useful tool that is typically used by a particular institution for more than one purpose. In particular, DIALANG seems to have replaced the earlier placement procedures in the institutions that have started to use it, and it offers them new options to support and monitor their students’ learning.

A further indication of the impact of DIALANG on language education in Europe, and on organizations that have an important role in European language education, is the Council of Europe’s use of DIALANG. It has already been mentioned that some of the procedures

developed in the DIALANG Project to link tests to the CEFR were later adopted in the Council’s Manual on linking examinations with the CEFR (and several members of the authoring group of the Manual and its Supplement were in fact former members of the Project: Figueras, Kaftandjieva, Takala, and Verhelst).

However, DIALANG contributed to the Council of Europe’s work in other ways, too. By far the most visible contribution is the inclusion of DIALANG scales in the new CEFR that was published in 2001 and has since then been translated into dozens of languages (i.e., Appendix C of the CEFR). Another DIALANG contribution is the inclusion of DIALANG reading and listening items for English, French, German and Spanish in the set of comprehension items published by the Council of Europe to illustrate the Common Framework levels (see http://www.coe.int/T/DG4/Portfolio/?L=E&M=/main_pages/illustrationse.html).

3.4.3 How many people are affected by DIALANG?

A way to get an idea of the number of people affected by DIALANG worldwide is to examine the publicly available statistics on the visitors to the DIALANG website (www.dialang.org ) and the project-internal information on the number of tests taken.

The current web-counter has been in place since October 4, 2005, and by mid-May 2009 it had recorded a total of about 570 000 unique visitors to the DIALANG website. This means an average of about 440 visitors per day over that period. In 2008 there were 158 238 unique visitors. Most visitors have come from the Netherlands (about 100 000, or 17% of all visitors), Germany (61 000 / 11%), France (55 000 / 10%), Finland (44 000 / 8%), Spain (43 000 / 7%), Denmark (37 000 / 7%), Poland (29 000 / 5%), Italy, Belgium, and Switzerland (each 23 000 / 4%), the Czech Republic (11 000 / 2%), Sweden, the UK, the USA, and Austria (each over 9 000 / 1.7%), and Norway (about 5 800 / 1%). All in all, there are visitors from practically all countries of the world.
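
As a quick arithmetic check of the daily-average figure above, the following minimal sketch (in Python) reproduces the calculation; the visitor total and the counter’s start date come from the text, while the end date (“mid-May 2009”) is approximated here as May 15, 2009.

    from datetime import date

    # Figures reported above: the web counter was installed on October 4, 2005,
    # and had recorded about 570 000 unique visitors by mid-May 2009.
    visitors = 570_000
    days = (date(2009, 5, 15) - date(2005, 10, 4)).days  # about 1 300 days

    # Average unique visitors per day; prints roughly 430, in line with the
    # figure of about 440 per day cited in the text.
    print(round(visitors / days))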

All 14 different language versions of the website are visited on a daily basis. The widely spoken languages English, French, German, and Spanish naturally receive most of the visitors, but the Dutch version of the site is also popular because of the widespread use of DIALANG in the Netherlands and, to a lesser extent, in Belgium. However, even the pages for the smallest

languages, such as Gaelic and Icelandic, are visited by at least some interested individuals on most days of the year. This, then, testifies to the need among European users for access to information about such diagnostic tools in a wide range of different languages.

McLaughlin’s project-internal report from February 2005 showed that there had been 174 565 visitors to the DIALANG website between February 2001, when the system prototype and the website were launched, and February 2005. There has thus been a steady increase in the number of people visiting the system website, as can be seen in Table 18. Unfortunately, the data for 2005 is incomplete and the available statistics do not allow us to determine the precise number of visitors in 2006 and 2007. It appears clear, however, that the number of visitors was rather modest in the first years of the system, although it rose steadily until 2004, when the system was officially launched (in March 2004). After that, the yearly number of visitors almost doubled to about 150 000 by 2006 and seems to have remained at that level for the past three years.

Table 18. The number of visitors to the DIALANG website (www.dialang.org) and of test sessions started

Visitors to the website:
2001: 11 247 | 2002: 18 332 | 2003: 37 074 | 2004: 87 031 | 2005: about 100 000 (?) | 2006 and 2007: about 130 000 - 150 000 per year | 2008: 158 238
(total visits between Feb 2001 and Feb 2005: 174 565; total visits between Oct 4, 2005 and Dec 31, 2008: 520 000)

Test sessions started:
2001: no info | 2002: no info | 2003 and 2004: no separate yearly figures (see note) | 2005 (Jan - Sept): 171 895 | 2006: probably about 200 000 | 2007: 203 000 | 2008: 218 000
(note: total sessions started between summer 2003 and Sept 2005: over 340 000)

According to McLaughlin (2005a), over 340 000 DIALANG test sessions were started between the summer of 2003 and September 2005, a figure which does not include almost two months in late 2004, when the system was down because of a technical failure. During the nine-month period between January and September 2005, a total of 171 895 tests were accessed.

Autumn months (September to November) are the busiest months of the year on the DIALANG website, followed by February / March. These coincide with the first months of the autumn and spring terms in the educational institutions that provide the largest number of users of the system.

The DIALANG website in fact comprises 14 language-specific sites. Of these, the English pages have been the most popular: about a third of the visitors visit these pages, followed by the German and French pages (about 15% each), and the Finnish and Dutch pages (9-10% each).

The total number of visitors to the DIALANG website is thus at least 700 000 so far, mostly from Europe. This means that a very large number of language learners and teachers are aware of the existence of the system. But how many learners actually take DIALANG tests?

As regards the number of test takers and the number of tests taken, the information is not as accurate and complete as for visitors to the website. Furthermore, that information is not publicly available, and thus we have to turn to some project-internal reports prepared soon after the development project came to an end at the end of 2004, and to more recent reports extracted from the system. The reason for the lack of accurate information is that users do not register to take DIALANG tests but can use the system anonymously. In addition, the only reports that can be extracted on the use of the DIALANG system concern the number of test sessions started on each day; information on the number of completed test sessions or the number of tests taken during each test session is not available. Thus, the statistics that we can access do not directly tell us how many learners use the system each year or each month, or how many separate tests are taken.

The number of test takers and tests can, however, be roughly estimated by putting together various pieces of information gathered during the DIALANG Project and in subsequent research studies. Visual examination of the log files by the project members indicated that in most cases tests that were started were also completed by the users of the system (e-mail from McLaughlin, 24/09/2004), which suggests that the typical test session results in the completion of at least one language test.

It is likely that there are fewer test takers than started testing sessions, however. Some sessions are aborted by the users – they may have just wanted to take a quick look at the system before deciding whether to take tests later. Other sessions end prematurely in a technical failure. A further reason why there must be fewer test takers than test sessions is that a user who wants to take several tests is unlikely to take them all in a single session, simply because of lack of time or fatigue. A typical DIALANG test takes a fair amount of time to complete, as there are 30 items in every test. It can be estimated that one test takes about 20-40 minutes, depending on the skill tested and whether the learner opts to take the VSPT and self-assessment before the actual language test. One session can, however, take even longer if the learner wants to study the feedback carefully. The results of the survey of over 550 users of DIALANG by the present author indicate that the users typically spent 5-10 minutes reading feedback after the first test they had taken and somewhat less after subsequent tests (Table 19 below).

Furthermore, some learners take part in more than one DIALANG test session because they want to focus on different languages or different skills at different times – or they want to re-take tests to try to get a better result or to see if they have made progress in the language(s) and skill(s) they study. The fact that DIALANG has only one set of language tests (although at three different levels) probably limits the number of re-test takers because in most cases a re-test taker is likely to get exactly the same test as before.

Table 19. The length of time the users took to read DIALANG feedback in the author’s survey study

How long the informants reported reading DIALANG feedback:

                          After the first test        In general
                          n        valid %            n        valid %
1-2 minutes               140      26.6               195      50.1
about 5-10 minutes        340      64.6               184      47.3
about 15-30 minutes       40       7.6                9        2.3
over 30 minutes           6        1.1                1        0.3
total                     526      100.0              389      100.0
missing                   31                          168


The informants in the author’s survey had taken, on average, two tests prior to responding to the questionnaire, and they intended to take a further four tests later. As part of that survey, the author observed four groups of learners of Finnish as a foreign language taking Finnish tests during a two-hour session; the learners could take as many tests as they wished during the time available to them. Typically, they completed three or four tests, but some (generally very proficient) learners managed to complete all five skills / areas available for the language. This suggests that the total number of tests that a typical DIALANG test taker might take varies between two and six.

Most language schools and other educational institutions that use DIALANG for placement and other purposes typically require their students to take more than one skills test. For example, the Satakunta Polytechnic (Jaatinen 2005) made its students take a total of six tests in two languages.

The discussion above suggests that the minimum number of test takers may be obtained by dividing the number of test sessions by three or four. This is based on the assumption that the average test taker wants to take six different tests and divides them between three sessions, i.e. he/she takes two tests in one session. (If we divide the number of sessions by four, we allow a quarter of the started test sessions to end in a technical failure or to represent just somebody’s quick peek at the system.)

To estimate a possible maximum number of test takers from the number of sessions, we divide the latter by two. This assumes that the average user takes only two tests, each in a separate session.

As noted above, there were over 340 000 DIALANG test sessions started between the summer of 2003 and September 2005 (McLaughlin, 2005), and a total of 171 895 tests were accessed during the nine-month period between January and September 2005. The busiest month was September with almost 39 000 tests, followed by March (almost 22 000), April (20 000), February (almost 20 000), and August (almost 19 000).

Applying the above rules to calculate the minimum and maximum number of learners taking DIALANG tests, we can estimate that between 40,000 and 85,000 learners used DIALANG during the first nine months of 2005, and that about 85,000-170,000 used it between 2003 and 2005.

The most recent system statistics available on the number of DIALANG test sessions started (Gavin Smith, personal communication, March 2009) show that there were slightly over 200,000 sessions in 2007 and almost 220,000 in 2008. It appears, then, that the yearly number of test sessions initiated has stabilized at around 200,000, which suggests that the number of learners who take DIALANG tests is about 50,000-100,000 per year, and the number of tests taken each year is somewhere between 100,000 and 600,000. Since the yearly number of test sessions appears to have been around 200,000 since 2004, the total number of learners who took DIALANG tests between 2004 and 2008 is around 250,000-500,000, and the number of tests taken around 500,000-3,000,000.
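
To make the estimation procedure above concrete, the following sketch (in Python) reproduces the figures quoted in this and the preceding paragraphs. The session counts are taken from the text; the function name and the way the test-count bracket is formed (fewest learners times two tests per learner, most learners times six tests per learner) are illustrative assumptions, not part of DIALANG or the original reports.

    def estimate_learners_and_tests(sessions):
        # Minimum learners: divide sessions by four (allows a quarter of sessions
        # to be aborted or to be quick peeks, and assumes heavy users who spread
        # about six tests over several sessions).
        min_learners = sessions // 4
        # Maximum learners: divide sessions by two (assumes light users who take
        # only a couple of tests each).
        max_learners = sessions // 2
        # Widest plausible bracket for the number of tests taken: fewest learners
        # taking two tests each, versus most learners taking six tests each.
        min_tests = min_learners * 2
        max_tests = max_learners * 6
        return min_learners, max_learners, min_tests, max_tests

    # Jan-Sept 2005: 171 895 sessions -> roughly 40 000 - 85 000 learners
    print(estimate_learners_and_tests(171_895))
    # A typical recent year: about 200 000 sessions -> roughly 50 000 - 100 000
    # learners and roughly 100 000 - 600 000 tests
    print(estimate_learners_and_tests(200_000))
    # 2004-2008 combined: about five years at 200 000 sessions per year ->
    # roughly 250 000 - 500 000 learners and 500 000 - 3 000 000 tests
    print(estimate_learners_and_tests(5 * 200_000))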

The comparison of the statistics in Table 18 on the number of visitors to the DIALANG website and the number of test sessions started shows an interesting pattern: there may be more people who actually take DIALANG tests than visit the website. This is most likely due to the fact that many educational institutions use the system for placement purposes and administer the tests to dozens or even hundreds of students each year who may never visit the website.

McLaughlin (2005) analyzed test data for certain selected periods in 2005 in detail to find out which languages and skills were tested and which interface languages were selected by the test takers. The periods thus analyzed were the whole month of February, a four-day period between September 15 and 19, and one day in March (2nd of March). For estimating the proportions of the different test languages, she had access to a larger data set, which covered at least 2004 but probably a somewhat longer (though unspecified) period. A summary of the information concerning the test and interface languages is presented in Table 20 below.

Table 20. The distribution of test languages and interface languages in DIALANG

Language tested | % of tests accessed in February 2005 (total number of tests accessed = 19 226) | % of tests accessed in 2004, exact dates unknown (total number of tests = 726 539) | Most common interface languages for each test language (in percentages)

English (EN)    | 51   | 50   | EN, FR, NL (over 20% each)
French (FR)     | 15   | 12   | FR (30%), DE (25%), NL (25%)
German (DE)     | 11   | 12   | DE (30%), NL (30%), FR (20%)
Spanish (ES)    | 5    | 5    | ES (20%), NL (20%), IT (15%), DE (10%), FR (10%)
Dutch (NL)      | 4    | 5    | NL (55%), FR (30%)
Swedish (SV)    | 4    | 5    | FI (75%), SV (20%)
Finnish (FI)    | 2    | 4    | FI (80%), EN (10%)
Italian (IT)    | 3    | 3    | IT (35%), DE (25%), FR (20%)
Danish (DA)     | 1    | 3    | DA (almost 70%), EN (10%)
Norwegian (NO)  | 0.40 | 1    | NO (50%), NL (10%), DE (10%)
Portuguese (PT) | 0.44 | 1    | FR (20%), DE (20%), ES (20%), PT (15%), IT (10%)
Greek (EL)      | 0.40 | 0.47 | EL, DE, EN, FR, NL (10-15% each)
Gaelic (GA)     | 0.18 | 0.24 | EN (50%), DE (15%), IT, FI (10% each)
Icelandic (IS)  | 0.49 | 0.23 | IS (40%), DE (40%)

Note: Unusually for the information that has typically been available from the DIALANG system, Table 20 refers to tests rather than test sessions. It may be that the information available to McLaughlin in fact indicated which test/skill the user first accessed in the session he/she started.

Table 20 shows that English is by far the most popular test language in DIALANG: half of all the tests taken are tests of English. French and German are the next two most common test languages, followed by Spanish, Dutch, Swedish, Finnish, Italian and Danish. The first four languages are also among the most widely studied languages in Europe, which obviously explains their popularity as DIALANG test languages. The fact that Dutch, Swedish, Finnish and Danish have reasonably many test takers relates to the fact that the Netherlands and Finland in particular, and also Denmark to some extent, are countries in which DIALANG has been promoted quite extensively. The test takers in these countries include a lot of immigrants and other learners of the majority language of the country.

The interface languages used by DIALANG test takers tell us more directly about the countries in which particular language tests are typically taken, because many, if not most users select their mother tongue as the interface language – if that is available – unless they

are very proficient in the language they are taking tests in. Thus, French as an interface language for, say, German, Spanish, or Dutch tests tells us that the test taker probably comes from France or from the French-speaking part of Belgium or Switzerland. Similarly, the fact that the interface language for most tests of Swedish is Finnish indicates that most of the Swedish tests in DIALANG were taken by Finns, at least during the period in 2004-2005, when these statistics were gathered.

The distribution of the language skills tested via DIALANG was studied by analyzing the 18 451 tests accessed during a period of about a month in early 2005 (McLaughlin 2005). The most often accessed skill was listening (4 967 times), followed by writing (3 525), vocabulary (3 515), reading (3 375), and structures (3 069). The usage of the system was also analyzed during a shorter, four-day period in September 2005, during which 5 265 tests were accessed. The ranking of the skills was almost the same as in the first, longer period of monitoring: listening (1 422), writing (1 237), reading (966), vocabulary (915), and structures (725).

The two periods analyzed above are fairly short, so it is difficult to say how accurate a picture they give of the skills typically tested in DIALANG. What is striking in the available data, however, is the interest shown by the users in taking listening tests. All the other skills or areas of language appear to be roughly equally popular, with structures perhaps being slightly less popular than the other aspects of language proficiency. The great interest shown in listening may be explained by the fact that the vast majority of the other free, Internet-based language tests and quizzes lack listening. Thus, having a chance to test one’s listening skills may be particularly attractive to learners and teachers who come across DIALANG.

On the whole, there is a lot in DIALANG that has the potential to have an impact on individual learners that differs from the impact of most, if not all, language tests they have taken before. Computer-based language tests covering all the major skills except speaking were not very common when DIALANG was introduced to the world in 2001. The most common language tests, or test-like exercises, that one could find on the Internet were – and still are – short sets of tasks focusing on grammar and vocabulary. Thus, the experience of taking a full-scale language test on reading and listening, in particular, must have been novel for most learners who took DIALANG for the first time (see the above discussion of the popularity of listening). Furthermore, the feedback from DIALANG was

likely to be unlike anything they had met before. First, language tests did not use the CEFR scale for reporting purposes in the early years of the 2000s – although this has now become much more common. Secondly, the range of feedback offered by DIALANG was, and remains, unique in language testing.

To sum up, it is clear that DIALANG has had some impact on at least hundreds of thousands of language learners (and teachers) across the world. It is possible that the number of people affected is more than a million. The countries where the system has had the strongest impact include the Netherlands, Germany, France, and Finland. There have probably been over 700,000 different visitors to the DIALANG website since early 2001, and several hundred thousand learners have taken DIALANG tests.

3.4.4 Scientific impact

DIALANG has had a significant effect on language testing research. Before DIALANG, systematic work on diagnostic testing and assessment of second and foreign language proficiency was quite rare. In fact, there was no universally accepted definition of diagnostic testing in L2 testing, and there were no well-developed diagnostic instruments, grounded in theories of L2 development, that could even remotely compare with those developed and regularly used for diagnosing development in first language proficiency (see Alderson 2005, 2007, and Huhta 2008, for a detailed discussion). The size and length of the DIALANG Project and its high profile in European language education made it much more visible than most other test development projects, and it thus attracted considerable attention as such. In addition, numerous presentations and demonstrations of the system, a fair number of articles in national and international publications on language education, and finally the book detailing the development and significance of DIALANG for language testing (i.e., Alderson 2005) all contributed to raising awareness of this almost totally neglected but potentially very important area of assessment.

Too little time has passed since the development of DIALANG for the overall picture to have changed: we still lack uniformly accepted definitions of diagnostic testing, we do not yet have theories to adequately guide the development of diagnostic instruments for L2, and we do not have properly developed and validated diagnostic tests and assessment procedures for L2

proficiency. However, a significant shift has occurred in language testing research in the sense that diagnostic testing has emerged as a new and interesting area, and the challenges of diagnostic testing and assessment are being tackled in several different ways (see e.g., Jang, 2005; Kunnan & Jang, forthcoming).

There is an increasing awareness of the need to improve our understanding of the construct of L2 proficiency in order to develop better procedures for diagnosing it (Alderson, 2005, 2007; Huhta 2008). It will probably take quite a long time and more work on diagnosis before a universally accepted definition of diagnostic testing in L2 is established, but work towards a better understanding of the concept is definitely under way. Experience from general education suggests that defining diagnosis clearly is a considerable challenge (Huhta 2008).

Some new research specifically aims to contribute to our understanding of how foreign language abilities develop, and thus also how to diagnose them, by combining the expertise of second language acquisition research and language testing, as endorsed already a decade ago by Bachman et al. (1998). A good example of this type of approach is the work done by members of the SLATE network (Second Language Acquisition and Language Testing in Europe), which consists of researchers from both fields who share an interest in better understanding the linguistic characteristics of the communicatively defined levels of the CEFR (see http://www.jyu.fi/hum/laitokset/kielet/cefling/en/desc , Section 3.2 for more information on SLATE).

Research on the relationship between CEFR levels and linguistic features is emerging in many European countries, partly related to the SLATE network, partly independent of it. A Finnish example is the CEFLING project, which focuses on analyzing Finnish and English writing by Finnish and immigrant adults and teenagers (see www.jyu.fi/cefling ) and partly uses language examination data for the purpose. A Norwegian project studies Norwegian teenagers’ written English (Hasselgreen & Moe 2006) and another examines young Norwegians’ development in reading and writing English (Hasselgreen 2008). A British study examined Spanish and Chinese learners’ writing in English with the help of data from the IELTS test (Banerjee et al. 2007). Another study in Britain focused on the development of speaking by Greek learners of English with reference to the CEFR levels (Davou 2008). Cambridge ESOL has embarked on a major research programme that aims at designing detailed descriptors of the grammar, vocabulary and functions needed at the different CEFR

levels; they use performance data from the Cambridge English examinations as the basis of their research (see Kurtes & Saville 2008, and www.englishprofile.org for more information).

It should be added that one aspect of the work on linking linguistic features with the communicatively defined CEFR levels in many of the above-mentioned projects is the study of task characteristics and the relationship between tasks and CEFR levels. Typical research questions include, for example: Can tasks be placed at specific CEFR levels? What is the effect of the task on the linguistic features that the learner is required to (or can) display, and how does that relate to the CEFR level? Do certain tasks allow the learners to demonstrate their proficiency at a certain CEFR level? Do other tasks prevent them from doing that (e.g. can an easy task prevent an advanced learner from displaying their true proficiency)?

A somewhat different initiative for understanding language proficiency in more detail is the Linguistic Correlates of Proficiency project at the University of Maryland, USA, which focuses on understanding and diagnosing advanced language proficiency in terms of the sounds, vocabulary, grammar, dialects, and registers that pose problems for English-speaking learners of languages that are critical to US national security, such as Arabic (see http://www.casl.umd.edu/work/ProjectDetail.cfm?project_id=159 for more information). This is an unusual motivation for attempting to diagnose L2 learning and proficiency, but not a unique one in the history of language teaching and testing (remember the role of the CIA, FBI, and the US government sections responsible for foreign relations in the development of speaking tests and rating scales in the 1950s and 60s).

In addition to these recent developments in understanding L2 learning and proficiency, the contribution of the Educational Testing Service (ETS) in the USA to L2 diagnosis should be examined more closely.

The ETS has been particularly active in research into the cognitive diagnosis of the different tests that they design, including the influential TOEFL English language examination. Their line of research has been running in parallel with the work on DIALANG, and the two are the longest and best-known programmes that address the issues in diagnosing foreign language proficiency. The main differences between the two lines of work include at least the following:

• The ETS has analyzed its existing language and other tests that are used for certification purposes in order to find out what diagnostic information could be extracted from them. So far, it has not developed tests for purely diagnostic purposes. In contrast, DIALANG was, from the very beginning, an attempt to create tests that provide diagnostic information to their users;
• The ETS has studied tests that tap a range of different skills (e.g., Graf 2008; Hartz & Roussos 2008), not only (foreign) language skills, although these, too, have been the focus of several studies (e.g., Freedle & Kostin 1996; Sheehan 1997; Buck & Tatsuoka 1998; Schmitt 1999; Qian & Schedl 2004; Kostin 2004; Jang 2005). In contrast, DIALANG is specifically a foreign language assessment system;
• The ETS has focused on the detailed analysis of test items and has tried to understand what cognitive and other (e.g. linguistic) features underlie performance on different items by using very advanced statistical procedures and very large data sets. DIALANG had a much broader view of what is included in diagnosis. Feedback on individual items and the subskills that are hypothesized to relate to each item is only one aspect of the diagnosis that DIALANG offers, and many other types of feedback are available in the system;
• A key aspect of DIALANG is that it relates much of its broader diagnostic feedback to the CEFR scale, which has not been part of the research on diagnosis at the ETS. The fact that the ETS has carried out retrospective linking studies with the CEFR (Tannenbaum & Wylie 2008) has more to do with the more general need for TOEFL scores to be interpretable with reference to the CEFR, because test takers in Europe have increasingly started to use the CEFR scale as the yardstick with which to interpret the results of language tests even if the tests originate outside Europe;
• DIALANG was mostly a development project with extremely tight timelines and limited resources for carrying out research into the details of the largely uncharted area of diagnosing FL proficiency. The research activities were mainly restricted to the linking of test scores and self-assessment statements with the CEFR, the analyses of the pilot test data, the design of scoring systems for the tests, and the development of standard setting procedures. In contrast, the ETS line of research has been much more like a real research programme (or at least part of larger research programmes) that has continued for more than a decade, as far as work on diagnosing different abilities is concerned.


It was stated above that the ETS has not designed tests specifically and only for diagnostic purposes. However, its work on understanding what information different test items might provide has clearly paved the way for expanding the feedback that the users of the ETS tests can get nowadays. The tests continue to serve the long-established purpose of providing users with certificates of the mastery of different skills and areas of knowledge, but there appears to be a new trend to provide the learners with more information about their performance than just the overall and sub-test scores. This is evident in the new TOEFL iBT system, where the test takers are provided with more detailed feedback in the form of performance descriptors that are very much like the ones found in the CEFR and in DIALANG (Wang, Eignor, & Enright, 2008, 308-309) and that describe what a typical learner in any given TOEFL iBT score range can do in English.

Given that two major international lines of work on diagnosing language proficiency exist, obvious questions arise. Were the two groups aware of each other’s work? Was there cross-fertilization between the two groups of researchers? In the context of the present discussion, it is of interest to ask: did DIALANG have any impact on how, for example, the TOEFL iBT was developed?

The decisions about the design of DIALANG were taken between 1997 and about 2000/2001. After that it was impossible to implement any major new features in the system due to the limited time and resources available to the Project. Thus, the DIALANG Project could only make use of work carried out by the ETS before the new millennium. The Project was aware of the research on the diagnostic potential of language test items by, e.g., Buck and Tatsuoka (1996; 1998), but the technical complexity of the approach and the time required to analyze test item content in such detail made it impractical for the Project to adopt the approach to diagnosing proficiency taken by Buck and his colleagues. However, the Project obtained permission from the ETS to use its questionnaire on computer familiarity and implemented it as part of the pilot testing procedure. The analyses of the responses to these questions were carried out within the Project and have not been reported anywhere. The overall conclusion of the analyses was that computer familiarity did not appear to be related to performance on the DIALANG pilot tests, at least not among the pilot test population.

The DIALANG Project could, thus, make only very limited use of the research carried out at the ETS. Has the ETS made use of DIALANG in any way? This is quite hard to answer for somebody who is an outsider to the ETS. In theory, it is quite possible that DIALANG has had some impact on the development of the TOEFL iBT – certainly, the fact that it provides its users with quite detailed descriptors of proficiency indicates at the very least that the thinking of the developers of the new TOEFL has proceeded along lines very similar to those of the creators of DIALANG. It is certain that the TOEFL development team knew of DIALANG because the system was presented in a symposium at the Language Testing Research Colloquium at St Louis, Missouri, in 2001. One of the editors of a recent volume on the validation of the TOEFL iBT (Chapelle et al., 2008) also wrote a review of DIALANG in the major language testing journal in 2006 (Chapelle, 2006). However, none of the articles in Chapelle et al. (2008) refer to any publications on DIALANG, nor do they refer to the CEFR or to the work on CEFR scales by North. The other ETS publications on diagnosing language proficiency listed above do not appear to refer to DIALANG either, probably because their approach to diagnosis has been so different from DIALANG’s, as was described earlier in this section.

3.4.5 Use of DIALANG as a research instrument

The more general scientific impact of DIALANG was discussed in the previous section. DIALANG has also played a somewhat different role in advancing science as far as language education is concerned: it has been used as a research tool in several studies in which the researchers have needed a way to gather information about learners’ proficiency. DIALANG has been considered sufficiently valid and reliable to serve as one data-gathering instrument, or sometimes the only one, in these studies.

The two biggest studies that have relied on DIALANG as their main research instrument are the German HISBUS study (Peschel et al. 2006) and the Finnish Satakunta Polytechnic study (Jaatinen 2005), both of which have already been described above. Here, they are described only very briefly, with some relevant additional information about the studies.

The HISBUS study of the English proficiency of German university students was based on DIALANG reading tests in combination with students’ self-assessments. Previously, the HIS

studies had used only self-assessment as the instrument to gather information on the students’ proficiency in English (and in the other languages that the students might know). The repeated HIS studies allow the German educational authorities to follow the development of proficiency, particularly in English and French, since 1994, when the study was first conducted.

The DIALANG reading test was clearly the main data-gathering instrument in the HISBUS study. In order to make the tests as dependable as possible, the researchers actually replicated the DIALANG testing procedure in their computerized, web-based test-delivery system. In co-operation with the DIALANG partnership, they programmed a mechanism that consisted of the Vocabulary Size Placement Test of English, which assigned each test taker to one of the three levels of the reading test, exactly as in DIALANG. The system copied the VSPT scoring algorithm, the scoring mechanism of the reading tests (including item-specific weights), and the cut-off points for the different CEFR levels. The results of the study as far as the English proficiency of the students is concerned were described in Section 3.2.2.2 – C above. Suffice it to say here that the results generally appear believable on the basis of what is known about the background of the students and by comparing the test results with their self-assessed proficiency.

The Satakunta Polytechnic in the south-western part of Finland also used DIALANG as a research instrument, but for somewhat more complex purposes than the German HISBUS study. On the one hand, the purpose was to find out about the level of proficiency in English and Swedish of the students entering the institution (Jaatinen, 2005); on the other hand, the educators who designed the study had another purpose in mind, namely to use DIALANG results to demonstrate how varied, and in many cases inadequate, their students’ language proficiency was with respect to the official target levels. They also hoped that the study would support their arguments for obtaining more resources for language education in their institution or, in certain cases, for lowering the official target levels.

The language teachers in the Satakunta Polytechnic (and in other such institutions) were, and apparently still are, faced with considerable problems when trying to bring a very heterogeneous group of students to the official level of proficiency set in English and Swedish for students graduating from the Polytechnic. Particularly problematic are the students (one quarter of all) who have only completed the compulsory education, i.e., the 9-year comprehensive school and typically also a 2–3-year vocational school, which means a total of 7 + 2–3 years of studies of English and only 3 + 2–3 years of Swedish. These students typically include many who have low motivation to study languages. The other group of students (three quarters) have completed, after finishing the comprehensive school, the three-year academically oriented upper secondary school (the gymnasium), which has more hours of English and Swedish than the vocational schools. The students who go to the gymnasium are also typically more academically talented and more motivated to study languages.

In order to make the administration of the Polytechnic realize that the students with a comprehensive or vocational school background really started their studies at the Polytechnic with such low levels of proficiency that it was practically impossible for many of them to reach the official target levels, the language teachers decided to test all new students with the DIALANG English and Swedish tests (Scheinin, personal communication, January 19, 2004).

The findings of the study were as the teachers expected – or feared (see Section 3.3.2.2 – C for details). The proficiency of the comprehensive school background students in both languages, but especially in Swedish, was quite low, and a sizable proportion of them were placed at the two lowest CEFR levels. In Swedish, for example, 50–80% of these students got A1 in the structures and vocabulary tests, which means that they have a long way to go to reach the official target of B1.

DIALANG has also been used as a way to estimate learners' CEFR levels in studies that aim at establishing how proficiency develops across the CEFR levels and what linguistic features characterize each level. Davou (2008) used the English DIALANG tests together with another language test and CEFR-based self-assessments to investigate how the speaking proficiency of Greek learners of English develops. Her specific interest was the formulaic sequences in spoken language, which affect how fluent and idiomatic the speaker is perceived to be. A similar developmental study is also using DIALANG to establish the CEFR level of the learners studied (Prodeau, personal communication, March 2009).


3.4.6 Reviews of DIALANG for other purposes

The creation of DIALANG was related to larger European Commission initiatives on education, training, and the mobility of the workforce, as explained above. Thus, it is probably natural that the EU has paid serious attention to DIALANG when European-wide language and language-related policies have been planned and implemented in the 2000s.

One of the most important activities that the European Commission has engaged in during the 2000s is the gathering of information about the key competences of EU citizens and the development of indicators (i.e. statistical indices) that give decision makers and other interested parties an overview of the range and/or quality of people's competences across Europe. Such key competences include, for example, literacy in the mother tongue, numeracy, ICT skills, and foreign language skills. The need for different indicators became more urgent because of the Lisbon Strategy, drawn up by the EU heads of state and government in 2000, which aims to make Europe more competitive (see http://ec.europa.eu/growthandjobs/key/index_en.htm for more information). A skilled population is seen as a key factor in maintaining and increasing the economic competitiveness of the EU, but there is fairly little reliable information about the current skill levels of the population of Europe. Foreign and second language skills are one of the areas where the lack of dependable information is almost total. Thus, the European Commission engaged in the development of an indicator for foreign language skills, and it was eventually decided that the indicator should be the percentage of learners at the end of lower secondary education (ISCED2 level) in Europe who are at different levels of the CEFR in a range of foreign languages. Thus, the data for the language indicator will be gathered from about 15–16-year-olds rather than adults, although the indicators of other skills may focus on adults.

The gathering of information about language proficiency for the purpose of an indicator – or for any other purpose, for that matter – requires the use of some type of assessment instrument, typically either formal tests or self-assessment tasks. To find out if existing language tests could be used as those measurement instruments, the European Commission commissioned reviews of European and other language testing systems. West (2003) analyzed three major international systems: DIALANG, the examinations belonging to the ALTE (Association of Language Testers in Europe) organization, and the mother tongue language tests used in the PISA (Programme for International Student Assessment) studies of the OECD. In addition, she reviewed a number of national language examinations. It is indeed an indication of the importance of DIALANG that it was one of the systems examined for potential use in this major European study. The review came to the conclusion that the best approach to collecting data for the indicator would be to design completely new tests, because none of the existing systems could meet all the criteria set for the instruments, as they had been designed for different purposes (West 2003). (For more information about the European Survey of Language Competences, or ESLC, which is the study being conducted in 2008–2011 to provide data for the European indicator of language skills, see www.surveylang.org .)

Although the Commission selected teenagers as the focus group for the development of the foreign language indicator, adults are obviously highly important, too, and information on and indicators of adults' competences are needed in Europe. Thus, in parallel to West's (2003) review of language tests for the European language indicator study, the Commission funded broader analyses of the options for testing adults' skills. Haahr et al. (2004) conducted an analysis to help the Commission define a strategy for the assessment of different skills of adults in the EU. Since the command of foreign languages is important for adults, language skills, too, were included in the analysis, and DIALANG was the only language assessment system reviewed in the report. The analysis came to the conclusion that DIALANG appeared to be valid and reliable enough for the purpose of surveying adults' foreign language skills but that a number of issues had to be solved if it were to be used as part of a standard household survey based on researchers visiting individuals in their homes. For example, the number of skills to be tested and the time it would take to test them reliably were among the main issues. The report outlined different scenarios, one of which was to conduct a separate survey of foreign language skills rather than to try to integrate it into a study of several other skills.

In 2005, the Commission commissioned another, more detailed analysis of the options for assessing the skills of the adult population in the European Union, focusing on comparing the use of self-reports (self-assessments) and formal tests, analysing the feasibility of using computer-based assessments, and the relevance of international qualification frameworks (Haahr & Hansen, 2006). The review covered a range of skills in addition to foreign language skills, such as literacy in the L1, numeracy, and ICT skills. The recommendations included the combined use of both tests and self-assessments because of their different but often complementary strengths and weaknesses as assessment instruments. The report also recommended the use of computerized measures whenever possible. DIALANG was reviewed in the report as an example of how an assessment system can be linked with an external, independent standard – the CEFR – and how tests and self-assessments can be combined in one system. At the time of writing, the Commission has apparently not yet decided on the development of instruments to assess the different skills of European adults, and it is thus not clear if and how adults' foreign language skills might be tested (the ESLC study, as mentioned above, concerns only teenagers at the ISCED2 level of education).

3.5 Practicality of DIALANG

According to Bachman and Palmer (1996), the final aspect of test usefulness – practicality – differs from the other test qualities in that it relates to the availability of resources and to decisions to use or develop a particular test – or not to develop it. A test is practical if the resources that it requires (time, money, work, personnel) do not exceed the available resources; a test that is too long, too expensive, or that requires special skills to administer or to take, is impractical. Practicality is a relative concept, however, and a test may be perfectly practical for one purpose but not for another. The most common tension between the different aspects of usefulness probably arises between practicality and the other aspects.

The a priori design of DIALANG paid considerable attention to the practicality of this new assessment system because the main purposes of the system were related to diagnosing and helping individual learners, including learners studying languages outside formal education and, thus, without access to a teacher to explain the feedback to them. Since it was clear almost from the beginning of the project that the system would be computerized and delivered via the Internet, the IT system within which the language assessment system would run needed to be as easy to use as possible. Taking the language tests and using the self-assessment instruments should be easy and straightforward. And, finally, the feedback should be comprehensible even to a lay person without theoretical knowledge about language and language learning. The fact that DIALANG has been used so much for other purposes, in particular placement on language courses by educational institutions, adds a new twist to the consideration of the practicality of the system. As it was not designed for that purpose, the system clearly lacks features that would be highly practical from the point of view of a language teacher or an institution testing whole groups of language learners.

Practicality has already been touched on in the discussion of the other aspects of test usefulness in the preceding chapters. The following briefly summarises these points area by area and elaborates on some of them where necessary. Within each area, the design ( a priori ) aspect of practicality is discussed first, followed immediately by the presentation of whatever empirical evidence exists for or against the practicality of DIALANG in that area.

3.5.1 Practicality of the computerized system underlying DIALANG

The decision to make DIALANG a computer-based assessment system rather than a paper-and-pencil system was arguably the most important decision in terms of practicality. Extracting information from learners' responses to language test tasks and delivering that information quickly and at reasonable cost to thousands of users in 14 languages would have been impossible with a traditional, paper-based assessment system. If a large-scale diagnostic system were to be developed, the only platform that made any sense was the computer. In this decision, then, the consideration of practicality was the unavoidable starting point – and thus the key aspect of usefulness, as it dictated the overall approach to test delivery. It should be added, however, that using the computer offers other considerable benefits when it comes to diagnosing performance (see Alderson 2005), so this practical decision did not necessarily detract from the other aspects of usefulness discussed above.

The practical benefits of using the computer and the Internet as the basis of DIALANG include at least the following:

1. Item writing, translation, and review

• Inputting items directly, via the Internet, into a common item database by item writers located in different countries saved time and costs (e.g. mailing costs). Errors in inputting could be minimised as the item writers themselves did this rather than, for example, a secretary or a student hired to do that 'mechanical' part of item writing.
• Inputting items with the help of common item templates is relatively easy and straightforward and automatically ensures the same layout for the items, so this approach was adopted in DIALANG.
• Item writers could review an input item immediately with the help of a Review Tool, which displayed the item exactly as it would appear to the learners taking the (pilot) test.
• Designated item reviewers could review items from their workplace or home, across Europe.
• Using an on-line translation system (Translation Input System, TIS), the translators could input the translations needed in the system (interface, instructions, help screens, warnings, button names) from their workplace or home, across Europe.
• Reviewers of translations could review the translations on-line via the TIS, from their workplace or home, across Europe.

2. Piloting and standard setting

• Piloting of the test items and self-assessment statements, and the gathering of background data from the pilot test takers, took place with the help of a programme called the Pilot Testing Tool. It was downloaded by the institutions organising piloting and installed on their computers. During pilot sessions, the Pilot Tool established contact via the Internet to the DIALANG database, retrieved one of the several pilot test versions that had been selected for the purpose, and administered it to the test taker. The responses were transmitted via the Internet to the DIALANG database. The whole process saved time and resources as the transmission of data and the recording of learners' responses were done automatically by the system (see the sketch following this list for an illustration of the data flow).
• The first type of standard setting used in DIALANG was partially done via the Pilot Tool. The experts doing the standard setting of the English test items could view the items on-line, as they would appear to the test takers, but they marked their judgments of the items on paper.
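The following minimal sketch illustrates the client–server data flow just described: a locally installed tool fetches a pre-selected pilot test version from a central database over the Internet and later transmits the learner's responses back. The URL and the payload structure are invented for illustration only; they are not the actual Pilot Testing Tool protocol.

import json
import urllib.request

BASE = "https://example.org/dialang-pilot"   # placeholder address, not a real endpoint

def fetch_pilot_version(language: str, skill: str) -> dict:
    """Retrieve one of the pre-selected pilot test versions from the central database."""
    with urllib.request.urlopen(f"{BASE}/version?lang={language}&skill={skill}") as response:
        return json.load(response)

def submit_responses(session_id: str, responses: dict) -> None:
    """Transmit the learner's responses back to the central database."""
    payload = json.dumps({"session": session_id, "responses": responses}).encode()
    request = urllib.request.Request(f"{BASE}/responses", data=payload,
                                     headers={"Content-Type": "application/json"})
    urllib.request.urlopen(request)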

3. Operational test delivery, provision of feedback

• The operational version of DIALANG operates like the Pilot Tool. The system is downloaded and installed on the computers on which it is to be used, and then the use of the system takes place via the Internet. The obvious practical benefits of such a system are that taking the tests is not tied to a particular place or time, as is the case with traditional paper-and-pencil tests, but the test taker is free to use DIALANG at home, for example, if he / she has a computer and a broadband connection to the Internet.

• The system interacts with the users by recording their choices (interface and test language, skill to be tested) and by delivering the selected tests to them. It records their answers, marks them against the scoring key on the fly, and then delivers the results – item by item and in total (as a CEFR level) – and the other parts of the feedback (a minimal illustration of this marking and feedback step follows this list). The practical benefit of this is that interaction and feedback are far more immediate than with most human-scored tests, in which the results are delivered days or even weeks after the test. The amount of feedback is flexible and depends entirely on the needs and interests of the user of DIALANG.
• Upgrading, bug-fixing, and other changes to the operational system can be done simply by changing either the system that resides on the DIALANG server at Lancaster University or – in major upgrades – the installation file that users then have to download and re-install from the DIALANG website at www.dialang.org . Such upgrades are thus very practical for the designers and maintainers of the system and also relatively easy for the users, at least compared with a system that depends on CD-ROMs, for example, rather than the Internet for delivery.
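A minimal sketch of the on-the-fly marking and feedback assembly described in the first bullet above is given below. The items, the scoring key, and the mapping from raw score to CEFR level are invented examples for a three-item demonstration test, not operational DIALANG content.

scoring_key = {"q1": "b", "q2": "a", "q3": "c"}   # hypothetical three-item test

def mark(answers: dict) -> dict:
    """Mark each answer against the key and assemble item-level and total feedback."""
    per_item = {item: answers.get(item) == key for item, key in scoring_key.items()}
    raw = sum(per_item.values())
    # Hypothetical mapping from raw score to a CEFR level for this demonstration test
    level = ["A1", "A2", "B1", "B2"][raw]
    return {"per_item": per_item, "raw_score": raw, "cefr_level": level}

print(mark({"q1": "b", "q2": "c", "q3": "c"}))
# {'per_item': {'q1': True, 'q2': False, 'q3': True}, 'raw_score': 2, 'cefr_level': 'B1'}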

A specific working group in the Project was assigned the task of designing the computer and software system underlying DIALANG: designing, programming, testing and documenting the software (e.g., the item authoring, review, piloting, test delivery, and translation tools), and selecting hardware and software for the workstations used by the programmers and for the servers used for storing and delivering the system (e.g., items, scoring algorithms, and translations) to users around the world (see also Alderson 2005, 38–41). The group included not only the programmers but also a coordinator who worked as an intermediary between them and the rest of the project coordinating team and who had the main responsibility for designing the specifications of the system for the programmers to implement. In the early stages, the working group also included other members of the Project who were experienced in applying IT to language testing or teaching. This group, then, had the main role in the a priori plans to make the DIALANG system (software, hardware, interface, navigation) as practical as possible. As was the case with the other strands of work in the Project, the plans were formulated in a number of documents and memos, only the most important of which were circulated among the non-IT members of the Project (e.g., Figueras et al., 2000; Fligelstone, 2001; Fligelstone & Treweek, 2002).


3.5.2 Practicality of the content (assessment instruments and feedback)

The practicality of the content of a computerised testing system is inevitably closely linked with the practicality of the delivery mechanism itself, i.e., the software. However, the test tasks, self-assessment instruments, and feedback should themselves be practical in the sense that they should be easy, clear and straightforward to take and to interpret.

The clarity and intelligibility of the software and, importantly, of such key aspects of content as the self-assessment instruments and feedback were tackled in two ways in the DIALANG Project: by translating them into all DIALANG languages and by using as simple language as possible.

Everything in DIALANG comes in 14 different languages, except the test items themselves. There were political reasons for this (see Alderson 2005 on the origins of the system), as one of the conditions for the project was that it covered all the official EU languages of the time, plus some languages with a special status in the particular Directorate General of the European Commission that funded the development. However, very practical considerations also underlay that language policy decision – as is the case with many other EU language policies – namely the aim to make the outcomes of the project as accessible to the citizens of the EU as possible by using a wide range of European languages. This political aim nicely coincided with other, project-internal considerations of practicality: it also ensured that very many learners can use DIALANG via an interface language that they master very well. Admittedly, the requirement to have everything translated into 14 languages also caused considerable practical problems in the design stages of the project, but from the point of view of the users the translations obviously increase the usability and overall usefulness of DIALANG very significantly.

The other main approach to increasing the intelligibility of the content of DIALANG was the use of as simple language as possible in the instructions, feedback and self-assessment. Huhta & Figueras (2004), Luoma (2004), and Alderson (2005) describe and illustrate how the Project derived its self-assessment statements from the CEFR scale descriptors. Although the original statements were already fairly clear, some of them were further simplified by splitting them into two and by using even simpler vocabulary and expressions. The use of straightforward language and the provision of illustrations and concrete examples were also important considerations in the construction of the feedback texts (e.g., advice on how to improve one's proficiency and the extended descriptions of the CEFR levels) because, as mentioned earlier, learners outside formal language education, and thus without teachers' support, should also be able to understand the feedback they receive from DIALANG.

No large-scale studies have been conducted specifically to find out about the practicality of DIALANG as a computer system or about the practicality of its content. We thus have to resort to reasoning and to what little empirical evidence we have about the practicality of DIALANG.

As reported above in the section on the reliability of DIALANG technology (itself an important aspect of practicality), Floropoulou (2002a) studied the technical aspects of DIALANG such as navigation and interface by observing and interviewing six informants. Her informants experienced certain navigation problems when using DIALANG for the first time. For example, they were unsure about how to proceed at certain points in the test taking process and they had difficulties in interpreting the meaning of some of the navigation buttons. They also may not have fully understood the role of such parts of the system as the Vocabulary Size Placement Test. On the other hand, however, they also liked the general layout of the interface and the fact that they could read the instructions in their native language (see Section 3.2.2 above).

The present author’s survey of over 550 users of DIALANG also yields some information about the practicality of DIALANG (see Appendix 1 for details). The survey questionnaire did not include specific questions about the interface and navigation, for example, since it focused on soliciting the informants’ views on the feedback. However, the questionnaire contained open questions about the best and worst features of the system, which allows us to see what proportion of users decided to comment on things that somehow relate to the practicality of the system.

Table 1 in Appendix 1 reports which features the informants considered to be best in DIALANG. The majority of the comments concerned content matters and different aspects of feedback, but 7% of the informants included clarity of instructions and information among the best aspects of the system and another 7% singled out ease of use. Concreteness and quick delivery of feedback, free choice of the time and place of test-taking, lack of fees and time limits, and freedom from the teacher were also mentioned by a few percent of the respondents.

Table 2 in Appendix 1 lists the most problematic features of DIALANG according to the informants. Technical problems were among the top three problem areas, as 11% of the respondents mentioned them. While technical issues cover a range of very different problems, most of the technical problems in a computerized system probably have something to do with practicality. The length of some aspects of the system, such as the tests, instructions or feedback, was considered a problem by 6% of the informants; 5% found fault with the interface or layout and 4% with the overall functioning of the system. Some also specifically mentioned navigation problems or problems with the instructions.

The two studies above and the feedback received by the present author when he worked in the Project indicate that there is clearly room for improvement in DIALANG as a computer system: the system does not always work, or it does not work for everyone. Installation of the programme can be difficult under certain circumstances (e.g. high-security computer labs). There are also some smaller but annoying technical issues with the functioning or display of certain items under certain operating systems. The navigation in such a complex system can be a challenge to some first-time users. However, as mentioned earlier, most complex computer programmes take more than one session before users become familiar enough with them to navigate without problems.

Some of the technical problems experienced by users of DIALANG are apparently no fault of the system itself but are caused either by the local network in the institutions using the system or by local or more general bottlenecks in their Internet connections. For example, the slow functioning of the programme mentioned by some users over the years is typically caused by an underperforming local network and/or the local Internet service provider. These problems seem to affect the listening tests the most, as the audio files are the biggest single elements that are downloaded from the DIALANG servers onto users' computers while test-taking is in progress. As an example of this, the present author was very recently informed of attempts by an educational institution in one of the Persian Gulf states to administer DIALANG listening tests to its students; the trials failed because the local network could not handle the simultaneous downloading of many audio files. (Group listening tests are successfully conducted in many other places and countries, however, where the speed and bandwidth of local connections are greater.)

Despite the technical practicality problems faced by some users of DIALANG, other users have found the system easy to use. Since we lack more comprehensive studies of the practicality of the system it is impossible to say how widespread or serious the technical and other practical problems that the users face are in reality. The fact that tens of thousands of learners use DIALANG every year speaks at least for a reasonable level of practicality of the system: The benefits (overall usefulness) of the system apparently outweigh whatever technical and other obstacles the institutional and individual users of the system face.

It is clear that some aspects of practicality, such as the amount of information (e.g. feedback) and the lack of time limits, can be viewed as either positive or negative features of the system. The lack of a time limit in DIALANG was considered a problem by Floropoulou's informants, whereas it was one of the good things about the system for some of the respondents in the present author's study. In the latter study, the considerable amount of information found in DIALANG divided users: while many appeared to value the chance to get a variety of different feedback, some thought there was too much of it. Possibly learners' different experiences and expectations as regards language tests affect their views on what is practical vs. impractical in a test (cf. Yang's 2003 findings about the influence of other tests on how learners perceived DIALANG).

Before concluding the discussion of the practicality of DIALANG, we should briefly consider whose view of practicality ought to be considered. The above discussion has focused on the practicality of the system for the language learners who use it. There are, however, at least two other stakeholder groups for whom practicality can be important: the people who design and maintain a system such as DIALANG, and the institutions (schools, universities, companies) that use it, in particular for placement purposes.

As was briefly mentioned above and in Alderson (2005), the construction of such a complex system as DIALANG caused considerable practical problems for the Project, both in terms of managing a large, multinational and multi-part project and in terms of designing an extensive, high-quality system with the limited time and resources available. However, DIALANG passed the crucial practicality test suggested by Bachman and Palmer (1996): despite the considerable practical challenges, the decision to develop DIALANG in the first place was made and the system was indeed successfully designed. It has also been successfully maintained since the end of the Project in 2004 despite the minimal resources available for maintenance in the DIALANG partnership and at Lancaster University, where the system database and server are located. This indicates that the basic system design underlying the content is quite robust despite the occasional technical problems encountered by some users. The challenge for the future is to find a permanent solution for the hosting, maintenance and further development of DIALANG, which has turned out to be the most difficult long-term problem that the developers of the system have encountered – to this day, this issue remains unresolved.

The practicality of DIALANG for language teaching institutions should also be considered. DIALANG was never designed to be used for placement purposes by institutions; it was intended for individual learners for individual diagnosis and feedback. Thus, DIALANG lacks many features that would make it a practical tool for institutions and for placement purposes. The test result is only reported to the user and it is not stored, nor can it be printed directly from the programme. It is not possible to retrieve the results for groups of learners as a single results file, for example, but the users must be asked to tell their result to the teacher supervising the testing session or they must be asked not to quit the programme before the teacher can come and collect each individual’s test result. There is no way the teacher or institution can automatically assign test takers to particular tests – they can only ask or tell the students to choose particular tests and count on their cooperation. Thus, DIALANG is clearly far from an optimal tool in terms of practicality for institutions.

With the above in mind, it is quite remarkable that DIALANG is used so much for placement purposes across Europe. The system appears to have enough practicality in it to be considered useful by many institutions: it is free of charge and it is quite easy to access once it has been installed on the institution's computers. Indeed, McLaughlin's (2005b) survey of institutions using DIALANG found that the institutions considered the clarity of the interface, instructions and navigation the second best aspect of the system. Downloading, installing and running the system were also regarded as relatively easy. However, the content of DIALANG is also important when institutions decide to use it. McLaughlin found that the range of languages available was the most important consideration for the institutions using or planning to use DIALANG, and that the content of the tests and the feedback were also important factors (see Section 3.4.2 above).

Finally, the practicality of DIALANG for some quite different purposes has also been considered. Because the European Commission funded the development of the system, it has evaluated the system for potential use in gathering information about the foreign language skills of EU citizens, either adults or students at school (West 2003; Haahr et al., 2004; Haahr & Hansen, 2006). All the reviews came to the conclusion that although DIALANG was apparently a high-quality system, it was not feasible (practical, economical) to use it in its current form for the purposes of gathering data for educational and policy-making purposes. Considerable, and expensive, changes to the system would have been required. This obviously does not diminish the practicality or usefulness of DIALANG for its intended purpose; the above reviews evaluated a range of other assessment systems and came to the conclusion that systems designed with very different goals in mind could not be used for those new purposes without considerable modifications. This is not surprising and simply testifies to the common sense and professionalism of the persons conducting the evaluation of DIALANG and the other assessment systems.

4. SUMMARY AND CONCLUSION

This paper has summarized and synthesized the content of the six articles on the DIALANG assessment system that comprise, together with this paper, the doctoral thesis of the present writer: Huhta & Figueras (2004), Huhta & Alderson (2005), Huhta (2007a), Huhta (2007b), Huhta (2008), and Huhta (submitted). The paper has also given a more comprehensive overview of what we know about the usefulness of DIALANG, based on studies by the present author and by many other researchers. The thesis as a whole has aimed to shed light on these broad questions:

• What do we know about the usefulness of DIALANG on the basis of the research and development work carried out and reported by the present author (both in the six articles and in this paper)?
• What do we know about the usefulness of DIALANG on the basis of research results, statistics on the users of the system, and other relevant information gathered by others working in the field of language education?
• What is the overall picture of the usefulness of DIALANG when we combine the two sources of information above?

The analysis of test usefulness in this paper was organised in terms of the framework for test usefulness proposed by Bachman and Palmer (1996), in which test usefulness is evaluated with reference to the purpose(s) of the test. According to Bachman and Palmer, several factors contribute to the usefulness of a test for its intended purpose(s): reliability, construct validity, authenticity, interactiveness, impact, and practicality. In principle, the better the quality of each of these aspects, the more useful the test is. However, there is usually a tension between the different factors, since maximising the quality of one is often detrimental to the quality of another. Thus, it is essential to consider the purpose of the assessment in order to decide which aspects of quality are the most crucial for any given purpose. In fact, the overall usefulness of the assessment is what really counts, and that cannot be evaluated without considering all the factors that contribute to it.

It would have been possible to apply some other framework to examine the quality or validity of DIALANG, such as the one developed by Messick. However, the application of the Bachman and Palmer framework of test usefulness to the analysis of the quality of DIALANG turned out to work quite well, and while it was not always straightforward to decide where in the framework to place each argument or piece of evidence, this was not an insurmountable problem. Furthermore, the framework is quite comprehensive and probably covers all the essential aspects of test quality. However, the approach also led to some repetition in the presentation of the arguments and evidence for the usefulness of DIALANG, as the same studies and sometimes even the very same evidence (or argument) contributed to more than one aspect of test usefulness.

The main purposes of DIALANG are to enable learners of foreign languages to find out (1) their level of proficiency in the skills they are interested in and (2) their profile of strengths and weaknesses across skills and sub-skills, (3) to increase their awareness of their proficiency and of language learning in general (including self-assessment), and (4) to help them take more control of their language learning. Also, (5) language teachers can use the above information, for example, to plan their teaching for the particular learners. Although not among its main intended purposes, DIALANG has also come to be used extensively by (6) language teaching institutions as a placement test for their courses and programmes. The evaluation of the usefulness of DIALANG should, then, consider all these interlinked purposes.

The discussion of the usefulness of DIALANG in this paper made use of both theoretical arguments and empirical evidence. Thus, in addition to the Bachman and Palmer (1996) framework of test usefulness, a broad division into a priori and a posteriori validation has been used to organize the present discussion. A priori validation includes all the non-empirical work and the theoretical, rational argumentation that test designers engage in during the development of the test, before it is piloted or used for real (Weir 1991, 1993, 2005; Messick 1989). A posteriori validation uses empirical data (both test scores and qualitative data) that are gathered when the test is actually administered to real language learners, either when it is piloted or when it has reached the operational stage – or, ideally, both. A well-grounded test design project relies on both a priori and a posteriori types of evidence about the quality of the test.

4.1 A priori theoretical arguments for the usefulness of DIALANG

The key factors in the theoretical arguments for the usefulness of DIALANG for its intended purposes are the documents that guided the construction of the assessment system. DIALANG was the first major language assessment system based on the Common European Framework of Reference (CEFR), which means that the CEFR is a key document in terms of the a priori validation of DIALANG. In order to define the content of the tests (a key consideration for the authenticity and construct validity of the tests), the DIALANG Project complemented the CEFR with its own Assessment Framework, which further detailed, e.g., the communicative activities to be covered in the tests.

However, since the CEFR had some clear gaps in it – in particular in the definitions of comprehension but also of texts – the Assessment Specifications that defined the actual type and content of the test items had to be supplemented with information from many other sources. Furthermore, it was necessary to design many additional, more focused documents for the development of such crucial aspects of the system as the underlying IT system, self-assessments, feedback and translations, as well as for the different phases of test design (piloting, analyses, standard setting). All these documents and developments were crucial for the reliability, construct validity, authenticity, interactiveness, and practicality aspects of test usefulness. Typically, the planning work and the design of documents and guidelines were carried out in Working Groups set up specifically to develop and carry out work in the different activity areas listed above.

The main challenges faced by the Project in the a priori stage included at least the following:
• The sheer size and length of the project (14 languages, over 20 partner institutions of varying experience, several strands of work with their own Working Groups, eight years of work) made communication, agreements, proper allocation and timing of resources, and monitoring of the work difficult;
• The lack of previous research on the diagnostic assessment of foreign language learning and proficiency, and even of commonly agreed definitions of diagnostic assessment of foreign language proficiency;
• The fact that DIALANG was the first assessment system to be based on the CEFR meant that the Project had to design from scratch all the procedures for using the CEFR as the basis for test content development and for linking the test results to it.

How persuasive are the a priori, theoretical arguments for the usefulness of DIALANG? Answering this question properly would require a thorough knowledge of the typical design phases of other large-scale language tests. Since no other large-scale diagnostic L2 tests exist, we cannot compare DIALANG with them. However, my experience in designing language proficiency tests (e.g. the National Certificates: Huhta & Takala, 1997; Huhta, 1997) and an examination of the typical validation procedures used in international language examinations (see e.g. Luoma 2001) suggest that the quality of the development of DIALANG is comparable with the quality of the development of these more high-stakes examinations. If anything, the Project was quite ambitious, which partly explains why it lasted so long; in particular, the procedures developed for making use of the CEFR, analyzing the pilot test data, and linking the test results to the CEFR scale involved cutting-edge research and development. The Project was fortunate in having a number of world-class experts in several key areas of development, which made such research and development possible. The fact that the Council of Europe made such extensive use of the products and procedures developed in the Project, as described above under the impact of DIALANG, testifies to the high quality of the work carried out by the Project.

The main a priori limitation of DIALANG derives from the lack of knowledge about diagnosing foreign language learning and proficiency. DIALANG incorporates a range of different approaches to diagnosis, but we still do not know very much about how effective and appropriate these approaches are, despite some empirical evidence summarized in this paper. Importantly, we do not know what other, more appropriate types of diagnoses of learners' proficiency could be used, or what the long-term effects of DIALANG are on its users. Only future research, for example along the lines of the on-going studies reported in Section 3.4.4 on DIALANG's scientific impact, can address this gap.

Another issue in the a priori design of DIALANG that should be mentioned is the obvious contradiction between how DIALANG measures language proficiency and how language competences are defined in the CEFR, which may be problematic as regards the construct validity, authenticity, and interactiveness of the DIALANG tests. DIALANG follows the traditional division of tests into reading, listening, and writing, and furthermore into vocabulary and structures, whereas the models of language proficiency developed by Canale and Swain (1980) and Bachman (1990), and included in the CEFR, define and categorize language abilities somewhat differently. This may not be a serious issue, however, as such aspects of proficiency as sociolinguistic appropriacy and functional abilities, which are essential elements in Canale and Swain's and Bachman's models of communicative competence, are in fact operationalized in the DIALANG tests.

4.2 A posteriori empirical evidence for the usefulness of DIALANG

No comprehensive study has been carried out on any of the aspects of test usefulness in the Bachman and Palmer framework, but we have several studies and other information that shed light on, in particular, the construct validity and impact of DIALANG. There is also some evidence on the authenticity, practicality and reliability of the system. The aspect of usefulness for which we do not have much evidence is interactiveness, that is, the extent to which the test tasks engage the test taker's areas of language ability, personal characteristics, topical knowledge, strategic competence, and affective schemata.

It is also important to bear in mind that most of the empirical evidence concerns only certain languages or aspects of the system. And even if roughly the same information is available for all or several of the 14 languages in the system, the usefulness of all its language tests cannot logically be equally high, simply because of the different states of readiness of the languages in the system. While the tests in some languages have been properly piloted and the results analyzed, other languages remain non-piloted, which means that the reliability and construct validity, in particular, of the 14 different language test systems of DIALANG are bound to differ. English is the most studied and most thoroughly analyzed DIALANG test language because the number of pilot test takers was largest for it (Alderson 2005) and because it is also the most popular test language in the operational system. This means that most of the studies using operational DIALANG in fact concern the English tests and the related self-assessments, VSPT, and feedback. The fact that the Danish, Gaelic, Greek, Icelandic, Norwegian, Portuguese, and Swedish tests are not based on empirically piloted data, due to the lack of pilot test takers, automatically means that their test results – and thus much of their feedback – are less reliable and probably also less valid than the results and feedback for the other languages. To alert users to this fact, the system warns them if they select one of these non-piloted languages as a test language.

There is evidence of a fair level of reliability of the English pilot tests in the sense of their internal consistency (Alderson & Huhta 2005; Alderson 2005), which suggests that the operational English tests in DIALANG Version 1 are also quite reliable. The same study also provides some evidence of a fair level of reliability of the standard setting for English and German. Evidence about the reliability of the VSPT, however, suggests that it is an unreliable vocabulary size test and an unreliable placement test if the learner resorts to too much guessing when taking the test (Huhta 2007a). This could be remedied by changing the scoring algorithm of the VSPT, if there were resources to implement such a change to the system.
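One common way to reduce the effect of guessing in a vocabulary test of this kind is to penalise the score for "yes" responses to items the learner cannot actually know, for instance pseudowords. Whether and how the VSPT scoring algorithm would be revised in this way is not specified here; the function below is only a hedged illustration of the general idea, with an invented test format and invented numbers.

def corrected_vocab_score(hits: int, real_words: int,
                          false_alarms: int, pseudowords: int) -> float:
    """Proportion of real words recognised, penalised by the false-alarm rate
    (one common correction-for-guessing approach; illustrative only)."""
    hit_rate = hits / real_words
    false_alarm_rate = false_alarms / pseudowords
    return max(0.0, hit_rate - false_alarm_rate)

# A learner who answers "yes" to every item gains nothing from guessing:
print(corrected_vocab_score(hits=50, real_words=50, false_alarms=25, pseudowords=25))  # 0.0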

The other reliability issues of DIALANG about which we have at least some empirical evidence include the technical reliability of the IT system that underlies the DIALANG assessment procedures and the reliability of the scoring of the open-ended questions found in its language tests. Feedback from users during the project, as well as some studies (Huhta, this paper; Floropoulou 2002a), indicates that a certain but unknown proportion of users experience problems with the technical reliability of DIALANG (e.g. breakdown or slowness of the service, or layout problems). The fact that a great number of learners use the system on a daily basis indicates that significant numbers of users do not experience serious technical problems with the reliability of the DIALANG service.

Finally, the questionnaire survey by the present author suggests that the scoring of the open-ended questions is sometimes unreliable for at least some users (see Appendix 1). Again, it appears that the vast majority of test takers and items are not affected by this problem, simply because open-ended items that may suffer from an inadequate scoring key are a clear minority of all the items in the system.

There is also some indication of a fairly high level of internal-consistency reliability of the DIALANG self-assessment statements (Alderson 2005).

There is some empirical evidence about the construct validity of DIALANG, particularly from the point of view of the concurrent validity of DIALANG: the relationships (1) among different DIALANG tests, (2) between the DIALANG tests and external measures of proficiency, and (3) the ability of DIALANG tests to distinguish between different, well-defined groups of learners.

The first type of concurrent validity evidence traditionally concerns the correlations between different tests in the same battery of tests, as they indicate to what extent the tests measure the same or different aspects of, for example, language proficiency.
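For illustration only: given per-learner scores on each DIALANG skill test in tabular form, the inter-test correlations discussed here can be computed directly. The data and column names below are invented.

import pandas as pd

# Invented scores for five learners on three DIALANG skill tests
scores = pd.DataFrame({
    "reading":   [55, 70, 40, 82, 63],
    "listening": [50, 72, 45, 80, 60],
    "writing":   [48, 68, 42, 78, 65],
})
print(scores.corr(method="pearson"))   # pairwise correlations between the tests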

The relationships between different DIALANG tests in several studies (Alderson 2005; Jaatinen 2005; Huhta, this paper) turned out to be quite strong but far from perfect. The correlations are of the magnitude that could be expected from a test battery that includes tests that tap different areas of proficiency that can be theoretically and logically regarded as different skills. On the other hand, as is the case with most language test batteries, the tests are quite clearly associated with each other, which is in line with the idea of an underlying ‘general language’ competence that is often found in empirical research into language proficiency. These findings are thus in line with theoretical expectations and previous research. They also suggest that it is meaningful to provide separate diagnostic feedback on each of the skills tested in DIALANG.

The VSPT in its current format is not quite as closely related to performance on the other DIALANG tests, but if the scoring were changed, it would probably correlate with the other tests at the same level as they correlate with each other. As could be expected, it is somewhat more closely related to the vocabulary test than to the other tests – the fact that it also correlates well with the writing test may be due to the fact that several items in the (indirect) writing tests used in DIALANG are in fact vocabulary items.

The results of some of the studies also suggest that the profile across different skills tended to be more even with advanced (and also intermediate) students than with beginners, which is also in line with the previous research results on the structure of foreign language proficiency. The studies also suggest that the English reading test may not produce results that are entirely in line with the other English tests, especially for the C1/C2 level learners.

Existing evidence also indicates that the DIALANG self-assessment instruments have a highly consistent internal structure but that overall and detailed self-assessments, while clearly related, produce somewhat different results.

There are at least two types of external criteria against which the criterion-related or concurrent validity of DIALANG tests has been studied: learners’ self-assessments and certain other language tests or teacher assessments of the informants. Also, the performance on DIALANG tests of groups of learners who are known to differ from each other in terms of language proficiency can be used as concurrent (or predictive) validity evidence.

The studies reviewed in this paper (the present author's studies; Alderson 2005; Jaatinen 2005; Peschel et al. 2006) indicate that DIALANG tests correlate moderately but significantly with such external criteria as nationwide school-leaving examinations, teacher grades, and self-assessments by learners. It is difficult to draw specific conclusions from these correlations other than that we should expect such measures to correlate to some extent if they and DIALANG are reasonably valid and reliable operationalizations of foreign language proficiency. To find that they indeed correlated significantly with each other is thus reassuring at a very general level. In the case of the Finnish Matriculation examination, however, the pattern of correlations may suggest something more specific: the highest correlations were found between the examination and the DIALANG reading and listening tests, which are also areas of emphasis in the Matriculation examination, and this appears to indicate that the DIALANG comprehension tests manage to tap specific aspects of comprehension that are separable from the other aspects of language proficiency.

Perhaps a more robust indicator of the construct validity of the DIALANG tests is the fact that the tests appear able to separate learners who are known (or strongly expected) to be at clearly different proficiency levels because they have studied the language for different lengths of time and/or clearly different numbers of hours. This was very clear in the studies on learners of English (and Swedish) in Finland: students who had only completed the comprehensive school achieved distinctly lower levels in DIALANG than students who had also completed the upper secondary school (gymnasium; see Jaatinen 2005 in particular). A similar relationship between the length of language studies and test results was also found in the DIALANG pilot test data for English.

The studies reviewed in this paper also shed light on language learners' ability to self-assess and on the validity of the DIALANG self-assessment instruments. Alderson (2005) and Alderson and Huhta (2005) found that the correlations between the DIALANG self-assessment statements and the DIALANG language tests in the pilot testing data were moderate in magnitude (about .5) and statistically significant. Alderson (2005) suggested that learners may be better at assessing their productive skills (i.e., writing in the case of DIALANG) than their comprehension, presumably because they normally get more feedback on their writing and speaking.

In contrast, the two small-scale studies with learners of French (Desroches et al. 2005; Demeuse et al. 2006) found only rather weak or non-existent correlations between DIALANG self-assessments and an external (French) examination. The small number of learners and the apparent inability of the beginners in the studies to self-assess their proficiency appear to be the likely reasons for these findings.

More detailed analyses of the English pilot data by the present author in this paper suggested that, overall, the learners tended to overestimate their reading and listening skills but slightly underestimate their writing skills. The pattern varied slightly depending on the skill in question, but in general a significant number of learners at A1 to A2 (sometimes also B1) overestimated their skill, often by two or more levels. The match was best at around B2, whereas the C1–C2 learners tended to underestimate their skills, especially in writing and listening.
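The kind of comparison reported above can be illustrated with a minimal sketch: the difference between a learner's self-assessed CEFR level and the test-based level, expressed in CEFR levels. The example data are invented.

CEFR = ["A1", "A2", "B1", "B2", "C1", "C2"]

def sa_test_gap(self_assessed: str, test_level: str) -> int:
    """Positive values indicate overestimation, negative values underestimation
    (difference expressed in CEFR levels; illustrative only)."""
    return CEFR.index(self_assessed) - CEFR.index(test_level)

print(sa_test_gap("B1", "A1"))   #  2 -> overestimation by two levels
print(sa_test_gap("B2", "C1"))   # -1 -> slight underestimation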

The English pilot test data gave only limited support to the hypothesis that learners coming from different cultures and language backgrounds differ in their ability to self-assess. Most of the correlations did not vary very much across the language groups, at least for writing and reading. Listening showed more variation, however, and some language groups appeared to overestimate their level (although they did so rather consistently, because the rank-order correlations were still quite robust). Learners with a university or other higher education did not differ very much in their ability to self-assess from learners with lower levels of education, except perhaps slightly in reading and writing, which are skills that learners in higher education practice a lot.

Learners' reactions to DIALANG have been studied by linguistics students at Lancaster University (Floropoulou 2002a, 2002b; Yang 2003) and by the present author in his survey of DIALANG users in Finland and Germany.

The overall reactions of the 557 learners to DIALANG in the present author's study were mostly positive. The most often mentioned benefit of DIALANG was that it gives learners a chance to find out what their proficiency level is. The possibility of seeing where (in which items) their problems were, as well as the multi-sidedness and comprehensiveness of the system, was also mentioned by many learners. The findings thus support the conclusion that the system has achieved one of its main aims – the provision of feedback to users on their level, as well as on the strengths and weaknesses in their proficiency.

The most often mentioned problems with DIALANG concerned the Vocabulary Size Placement Test and a range of technical problems. The problems with the VSPT related mostly to its reliability, and some learners had in fact noticed the main reason for the unreliable results – a test-taking strategy based on guessing. However, the learners also commented on the validity of the test (see Huhta 2007a for details). The content and format of the VSPT were controversial and clearly divided users' opinions: some learners thought the VSPT was the best part of DIALANG, while others found the scope of proficiency it tested insufficient. It was also clear that the users found the VSPT unfamiliar as a testing format, and many did not appear to fully understand its function as a placement tool. With this background of somewhat conflicting views on the VSPT in mind, we should also note the discrepancy between the reported frequency of reading the VSPT score feedback, which was quite high, and the clearly lower level of perceived interest in and usefulness of the VSPT score among the learners.

It appears, on the whole, that many learner comments on the VSPT referred to the interactiveness aspect of the usefulness of DIALANG (especially guessing but also the fact that it engaged some learners in metacognitive activities). In this respect the learners’ comments on the VSPT differ from their remarks on the other parts of DIALANG because only a few other reactions could be seen to relate to interaction between test tasks and learners’ skills and other characteristics.

Learners' reactions to DIALANG feedback showed that the traditional types of test result feedback – the CEFR level and the item-level results – were read most frequently and were also considered the most interesting and useful parts of the feedback. However, the feedback on the match between SA and test results, and the advice for improving proficiency, were also quite highly rated. The other types of feedback (the VSPT result, the elaborated CEFR level descriptions, and the information on SA) were also considered interesting and useful by a significant proportion of respondents, but the informants' opinions on them were somewhat divided. Different types of questions – Likert-scale assessments, open-ended questions, and rankings of feedback types – all produced basically the same results about the users' perceptions of DIALANG feedback.

Feedback on the match between self-assessment and test result was considered one of the most interesting and useful types of feedback by the informants in the survey; only the traditional types of test feedback were rated higher. The informants were obviously unfamiliar with feedback on SA, and many were unsure about its usefulness. Women tended to consider SA feedback slightly more useful and interesting than men. The youngest learners (those under 18) tended to react to the SA feedback more negatively than the (somewhat) older learners. Neither the level of language proficiency nor experience in learning languages was related to the informants' views on the comparative SA / test result feedback, but the less proficient and less experienced learners tended to regard the more general information about SA and language proficiency as more useful than the more experienced or more proficient learners did.

When the informants in the survey were asked whether their DIALANG results were what they expected them to be, two thirds of those who replied said yes. About a third felt they had received unexpectedly high, unexpectedly low, or mixed results, either when they compared the results with their self-assessed level in DIALANG or (probably more often) when they compared the descriptions accompanying their CEFR level results with what they felt they were able to do in the language. It was somewhat more common to get lower than expected test results, and some learners' elaborations suggested that low VSPT results were the cause of some such views.

The author also studied the learners' views on the advantages of self-assessment in DIALANG with the help of a sorting task that resulted in a perceptual map showing how groups of informants structure the concepts – in this case, the advantages of SA. Two groups did the sorting task: language education experts (language teachers) and non-experts (students specializing in non-language subjects). The concept maps of the two groups had several similarities. The connection between SA and test result – a unique feature of DIALANG – emerged as a clearly distinct and important cluster of advantages in both groups' sorting data. Other such clusters of concepts included the high general quality of both DIALANG and its SA instruments, the international, expertise-based nature of the SA system, and the increased awareness it creates of the nature and structure of language proficiency among its users. The motivating aspects of SA were, however, classified somewhat differently by the two groups.

Floropoulou's (2002b) study at Lancaster University indicated that the learners' culture was clearly related to their opinions on self-assessment and on DIALANG in general. While most considered self-assessment worthwhile, the Chinese students liked SA more than the Greek students did. She also found that some informants' view of their language proficiency changed during the study and that they could identify their strengths and weaknesses better. Yang (2003), also at Lancaster University, discovered that her informants' reactions to DIALANG feedback appeared to vary depending on their goals and motivation for learning the language, and on their views about feedback and language tests. Overall, she found that most of the subjects reacted positively to DIALANG feedback.

The large-scale Finnish survey of 15-year-old comprehensive school students also provided some support for the assumptions, incorporated in the DIALANG feedback, about the possible reasons for a mismatch between self-assessment and test results (Luukka et al. 2008). Real-life experience in using a foreign language was considered the most important factor in the formation of learners' ideas of themselves as good vs. poor language users, followed by their experience and feedback at school (e.g. from teachers and test results). Comparing one's skills with those of one's peers was also an important factor for certain users, particularly boys.

According to Bachman and Palmer (1996, 30), the impact of test use operates at two levels: micro and macro. The micro level concerns individuals who are affected by the test, such as learners and their teachers, whereas the macro level relates to the educational system and society at large.

The evidence we have about the impact of DIALANG on the test-taking experience and about the effect of the test feedback mostly comes from the studies on learners' reactions to DIALANG summarized above. The impact on the test-taking experience mostly concerns the VSPT, with which some users had problems but which others found an unusually interesting test-taking experience. The perceived impact of DIALANG feedback was overwhelmingly positive in the present author's survey study; as reported above, the test result feedback was considered the most useful, but the whole range of other feedback was also felt to be quite interesting and useful. The long-term effects of the feedback remain unknown, however, despite mostly anecdotal evidence of users taking on board some of the advice offered by DIALANG. Additionally, the information we have about the technical problems some users have experienced obviously relates to the impact on the test-taking experience.

The decisions based on DIALANG results can be divided into those made by the learners themselves and those made by somebody else, such as their teacher or institution. We do not possess much information about the decisions that individual learners make based on DIALANG feedback, other than that their mostly positive reactions to it are obviously a very encouraging starting point. The decisions made by language teachers and institutions most often relate to the use of DIALANG as a placement test or even as an entrance or recruitment test, but the system is also used for other purposes and types of decision, such as to practice self-assessment and test-taking (in preparation for other language tests) or to improve students' learning-to-learn skills.

The number of people affected by DIALANG worldwide can be estimated by examining the publicly available statistics on visitors to the DIALANG website (www.dialang.org) and the project-internal information on the number of tests taken.

The number of visitors to the DIALANG website was rather modest in the first years of the system, although it rose steadily from 2001 to 2004, when the system was officially launched (in March 2004). After that, the yearly number of visitors almost doubled to about 150 000, and it seems to have remained at that level ever since.

The yearly number of test sessions has stabilized at around 200 000, which suggests that the number of learners who take DIALANG tests is about 50 000 – 100 000 per year, and that the number of tests taken each year is somewhere between 100 000 and 600 000. Since the yearly number of test sessions appears to have been around 200 000 since 2004, the total number of learners who have taken DIALANG tests between 2004 and 2008 is probably around 250 000 – 500 000, and the number of tests taken around 500 000 – 3 000 000.
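
The arithmetic behind such estimates can be made explicit with a small sketch. The ratios of sessions per learner and of tests per session used below are illustrative assumptions chosen to be consistent with the ranges reported above; they are not figures taken from the DIALANG statistics.

# Reported figure: roughly 200 000 test sessions per year since 2004.
sessions_per_year = 200_000
years = 5  # 2004-2008

# Assumed ranges (low, high): sessions opened per learner and tests per session.
sessions_per_learner = (2, 4)
tests_per_session = (0.5, 3)

learners_per_year = (sessions_per_year / sessions_per_learner[1],
                     sessions_per_year / sessions_per_learner[0])
tests_per_year = (sessions_per_year * tests_per_session[0],
                  sessions_per_year * tests_per_session[1])

# Totals for 2004-2008, assuming mostly different learners from year to year.
print(f"Learners per year:  {learners_per_year[0]:,.0f} - {learners_per_year[1]:,.0f}")
print(f"Tests per year:     {tests_per_year[0]:,.0f} - {tests_per_year[1]:,.0f}")
print(f"Learners 2004-2008: {learners_per_year[0] * years:,.0f} - {learners_per_year[1] * years:,.0f}")
print(f"Tests 2004-2008:    {tests_per_year[0] * years:,.0f} - {tests_per_year[1] * years:,.0f}")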

English is by far the most popular test language in DIALANG: probably half of all the tests taken are tests of English. French and German come next, followed by Spanish, Dutch, Swedish, Finnish, Italian and Danish. The first four languages are also among the most widely studied languages in Europe. Most visitors to the DIALANG website come from the Netherlands, Germany, France, and Finland, where DIALANG has been promoted more than in most other countries. This also suggests that most DIALANG tests are taken in these countries.

Listening appears to be the most popular skill tested in DIALANG followed by reading, writing, and vocabulary. Structures tests seem to be somewhat less popular than the others. This probably reflects the somewhat different skill profile available in DIALANG compared with the other free, Internet-based language tests.

DIALANG has had a significant impact on language testing research. Before DIALANG, systematic work on diagnostic testing and assessment of second and foreign language proficiency was quite rare. Work on DIALANG has alerted the testing community to the fact that we actually know very little about diagnosing foreign language proficiency. The concrete development work and the operationalization of different ideas relating to diagnosis and feedback in DIALANG provide interesting starting points for future research.

Probably the most significant and promising line of research to emerge in the post-DIALANG era consists of the studies that combine language testing and second language acquisition research in order to understand, in particular, how proficiency develops across the CEFR levels and which linguistic features or combinations of features characterize these functionally defined levels. Another interesting strand of research into diagnosing language proficiency is the detailed examination of item characteristics and test takers' skills carried out at the Educational Testing Service.

DIALANG has also played a somewhat different role in advancing science as far as language education is concerned: it has been used as a research tool in several studies in which the researchers needed a reliable and valid way to gather information about learners' proficiency. The two biggest studies that have relied on DIALANG as their main research instrument were the German HISBUS survey of German university students' English proficiency (Peschel et al. 2006) and the Finnish study at Satakunta Polytechnic (Jaatinen 2005) on the English and Swedish proficiency of all new students entering that institution.

The high profile of DIALANG in the European official language policy arena has obviously helped it to achieve the considerable impact summarized above. Not surprisingly, then, DIALANG has also been reviewed for possible use and as a source for ideas in the planning stages of the European Commission studies of language proficiency of both students in secondary education and the general adult population in the EU.

According to Bachman and Palmer (1996), the final aspect of test usefulness – practicality – differs from the other test qualities in that it relates to the availability of resources and to decisions to use or develop a particular test, or not to do so. A test is practical if the resources that it requires (time, money, work, personnel) do not exceed the available resources. Practicality is a relative concept, however, and a test may be perfectly practical for one purpose but not for another. Practicality is often at odds with the other aspects of test usefulness.

The practicality of DIALANG for its users has not been studied extensively or specifically. The studies by the present author and by Floropoulou (2002a) provide a somewhat mixed picture: some users appear to consider DIALANG quite easy to operate and understand, while others have problems navigating the system (especially when using it for the first time) or find it too long and overwhelming. The different reactions probably relate to different purposes of using DIALANG and to different expectations of what it can deliver and of what language tests typically do and look like (see e.g. Yang 2003).

Overall usefulness of DIALANG

We do not have a balanced picture of all the aspects of test usefulness for DIALANG, especially when it comes to empirical evidence. The a priori theoretical arguments incorporated in the numerous design and planning documents and implemented in the procedures used for item writing, piloting, analyzing, translating, standard setting, and linking, as well as in the design of the IT system, cover all the aspects of usefulness in the Bachman and Palmer framework. In that sense, the usefulness of DIALANG is likely to be reasonably good for the purposes that concern individual learners and, to an extent, their teachers. The system was not designed for institution-wide use for placement purposes and thus, by design, it does not have many features that an institution would find useful.

It is possible that the a priori planning of DIALANG was strongest when it comes to the authenticity and practicality of the system. To the extent that the CEFR manages to capture the essential elements of language use, it could be argued that DIALANG has considerable authenticity because, overall, DIALANG test content was very systematically built on the CEFR and other Council of Europe definitions of language use and proficiency. DIALANG’s main claim to practicality is based on the careful design of the software and hardware system that underlies the assessment system and allows the users to navigate through the various stages of the assessment procedure and to receive all the feedback they choose to study.

The reliability, construct validity, and interactiveness of DIALANG were obviously covered in the design documents and – probably more importantly – in the review, statistical analysis and standard setting stages of the development process. For example, uniform item writing guidelines across the languages are one of the key means of ensuring that these qualities are achieved. However, as has been pointed out earlier, the project had to resort to many additional sources of information for the actual item writing and test design because the CEFR is not a test construction recipe. While such additions were necessary, they may also have had unpredictable consequences for the quality of the system. In particular, the interactiveness aspect of test usefulness appears to be difficult to capture and ensure at the design stages of a language test.

Impact is a characteristic of a test that is not directly addressed in the documents that typically guide test design, although achieving a positive impact is obviously one of the considerations behind many decisions made in any test development project. Certainly, meeting the purposes set for DIALANG – i.e., helping individual learners – was seen in the Project as the key to achieving any impact on language learners in Europe. For the most part, the a priori plans for impact were presented in the dissemination sections of the DIALANG Project's applications for funding to the Commission – and for the extensions of funding in the later stages of the Project. The main funding body of the Project clearly wanted it to have a significant impact on European foreign language education, but probably nobody had a very clear idea of what forms the impact would take, since what was to be developed was unlike anything that had previously been done on such a large scale. Thus, it is rather straightforward to describe the types of impact that DIALANG seems to have had so far, but it is much more difficult to evaluate to what extent it has achieved the aims and impact originally envisaged for it. It can nevertheless be said about the a priori impact plans that there were considerable grounds to expect the system to have at least some European-wide impact, because of the involvement of the European Commission in the Project and because of the information disseminated by the Project about the system, for example, in numerous meetings, seminars and conferences with potential users.

What does the available empirical evidence tell us about the usefulness of DIALANG? As pointed out earlier, there is no comprehensive study into all aspects of test usefulness. The best covered areas seem to be the impact and construct validity (in terms of correlational studies and group comparisons) of DIALANG. There is also some evidence about the authenticity and practicality of the system. Empirical evidence for the interactiveness of the system is, however, quite limited.

Direct evidence of reliability, construct validity, authenticity, interactiveness and practicality mostly indicates that DIALANG is reasonably useful for the purposes for which it was developed, although the amount of evidence varies considerably from language to language. There is also clear evidence of certain specific problems in the system, most notably in the VSPT and in some technical aspects of the system (these problems vary significantly depending on the individual learner or institution). It is, however, fairly difficult to be much more precise than to say that the available evidence suggests that DIALANG is ‘reasonably’ useful as far as the aspects of usefulness listed above are concerned. Perhaps the important thing to note at this point is the lack of evidence about truly serious problems or deficiencies with regard to these aspects – with the few exceptions mentioned above.


Perhaps the overall usefulness of DIALANG can best be evaluated by reference, first, to the inevitable tension between practicality and the other aspects of test usefulness, and, secondly, to the impact that the system has had. First, DIALANG took a long time and a considerable amount of money and work to develop. However, these obvious challenges to the practicality of the system did not prevent its creation, and we can thus say that the system passed the first major practicality hurdle that could very well have prevented its design and completion (see Bachman and Palmer’s definition of practicality). Despite certain technical impracticalities and challenges, substantial numbers of learners, teachers, institutions and organizations use DIALANG. Interestingly, many institutions use the system for placement purposes even though it was not developed for that purpose and the practical problems in obtaining the results for whole groups of test takers are considerable. In spite of the clumsiness of DIALANG for such purposes, it nevertheless enjoys considerable popularity as a placement tool across Europe, because these users apparently regard its overall usefulness, based on the construct validity, authenticity and other qualities of the system, as high enough to outweigh whatever impracticalities are involved in using it.

The second approach to estimating the overall usefulness of DIALANG is to consider the impact it has had. The survey by the present author of DIALANG users in Finland and Germany showed that a significant majority of the informants reacted positively to the system. The information they provided also shed light on the reasons for their positive views. Usually, the reasons given by them related to the stated aims of the system (e.g. helping them find out about strengths and weaknesses in their proficiency), which suggests that the impact of DIALANG at the level of individual learners is more or less what it was expected to be – at least for most of the learners surveyed.

Since studies such as the one carried out by the present author are not based on representative samples of users, they cannot tell us to what extent DIALANG is known and used in Europe and beyond, which is another potential indicator of a test’s impact. Fortunately, we have access to two important statistics: the number of visitors to the DIALANG website and the number of tests initiated in the system. These show that hundreds of thousands of learners have taken DIALANG tests or visited the website – and even more must have been informed about the system by their teachers and institutions. Such large-scale use indicates that considerable numbers of users find DIALANG useful enough for their needs and purposes, both individual and institutional. It is difficult to say whether the present number of people affected by DIALANG matches the unspecified expectations of the European Commission and/or the DIALANG partnership. Certainly, the potential target group of DIALANG – European language learners above the age of 16 or thereabouts – is vastly larger than the group who appear to have used DIALANG. Nevertheless, it appears that DIALANG has made a significant contribution to language education in Europe also at the practical level in tertiary and adult education, especially in certain countries such as the Netherlands, Germany, France, and Finland, but also elsewhere. Significant numbers of learners have kept using DIALANG even after the official end of the Project in 2004, and, importantly, the system keeps running with only a minimal level of maintenance.

In the longer term, however, the most important type of impact of DIALANG is one that was not anticipated in the early phases of the Project – certainly, it was not listed among the deliverables of the Project in the applications for funding. This is the scientific impact of DIALANG on applied linguistics via language testing research and development. Before DIALANG, the diagnosis of second and foreign language proficiency was an almost totally neglected area, and it was not even fully realized how underdeveloped the field was compared with, for example, the diagnosis of first language development and problems. After DIALANG, there are probably no serious language test developers or researchers who have not heard about DIALANG or who are not aware of its significance to the field, even if they do not personally work on diagnosing proficiency.

At the time of writing the present paper, the future of DIALANG as a freely available system for anybody who cares to use it is uncertain. Attempts to find a permanent home, maintainer and further developer have so far failed, in spite of interest shown, at times, by different organizations and commercial enterprises. The permanent closure of DIALANG is a real possibility – and it would be a great loss to many institutions and individuals who have found it useful – but its scientific impact alone has already earned it a place in the ‘hall of fame’ of language testing.

APPENDIX 1

Learners’ overall views on the best and worst aspects of DIALANG

The following is a description and analysis of the responses of the 557 DIALANG users who were surveyed by the author in 2004-06. The study used a fairly extensive questionnaire that elicited the users’ views on DIALANG in general and on all the different types of feedback on offer in the system. The results of the study were used in the three empirically based articles comprising the present study (Huhta 2007a, 2007b, submitted), in the present paper, and in a conference presentation by the author (Huhta 2006), but no comprehensive report on the survey has been published. This Appendix contains an account of the users’ responses to the key open-ended questions in the questionnaire concerning the overall quality of DIALANG. The information presented here provides background that should help the reader to contextualise the empirical studies reported in Huhta 2007a and 2007b, in particular, within the overall survey of DIALANG users, which has not been published to date, although it was referred to as Huhta (forthcoming) in the article on the VSPT. Appendices 1 and 3 also report some findings of the survey study that are not covered in any of the empirical articles; these findings are intended to complement the evaluation of the usefulness of DIALANG carried out above, in the main part of this synthesis document. In other words, Appendices 1 and 3 serve to complement the three empirical articles and the present synthesis paper.

Description of the main results on the overall quality of DIALANG

The following is an account of the 557 users’ responses to six key questions concerning the quality of DIALANG as a whole and the quality of its feedback. These questions were all open-ended, and the users’ responses were coded into a number of categories (see the text below for details on the categories used). The analysis is thus a straightforward content analysis that focuses on (1) the content of the responses, which determined the category or categories into which each response was coded, and (2) the frequency of responses in each category. The latter is reported as the percentage of the 557 respondents who mentioned something related to a particular category. In calculating the percentages, a user’s responses to any of the three questions were combined, because a learner could say something positive about DIALANG in a question focusing on problems, or vice versa. What was of interest was to find out what proportion of learners mentioned particular things, irrespective of the actual question in which the response was given. Some respondents mentioned more than one thing in a given question, in which case the respondent’s replies were included in all the relevant response categories (e.g., a respondent could say that self-assessment and the freedom of choosing the time of testing were good points about DIALANG, and his/her response would then be included in two categories, i.e., ‘self-assessment’ and ‘freedom / flexibility’).
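
Purely for illustration, the following sketch shows how such a tally could be computed once the responses have been coded. The respondent identifiers and category labels are invented examples; in the survey itself the coding was done manually, not automatically.

from collections import Counter

N_RESPONDENTS = 557

# Each respondent's replies to the three questions, combined and coded into one
# or more categories; a reply mentioning several things appears in several categories.
coded_responses = {
    "respondent_001": {"find out your level", "advice"},
    "respondent_002": {"self-assessment", "freedom / flexibility"},
    # ... one entry per respondent, 557 in total
}

counts = Counter()
for categories in coded_responses.values():
    counts.update(categories)  # each category is counted at most once per respondent

for category, n in counts.most_common():
    print(f"{category:35s} {n:4d}  {100 * n / N_RESPONDENTS:4.1f}%")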

The six key questions reported on here are:

(1) When you think about Dialang as a whole, what was best / most interesting / most useful about it?

(2) When you consider Dialang as a whole, what was worst / least useful about it, or something that still needs further development?

(3) Is there something that should be added to Dialang, or something that should be different? Other comments?

(4) Please describe why you think that those parts that you marked as the most useful in the above table are useful for you:

(5) Please describe why you think that those parts that you marked as the least useful in the above table are not useful for you:

(6) Have you suggestions on how we could improve Dialang feedback?

Questions 1-3 were included in all questionnaire versions. Also questions 4-6 were included in almost all versions except for a couple of groups of students who were presented with a shortened version of the questionnaire in the early stages of data collection. Questions 4 and 5 were preceded in the questionnaire by a table listing all types of DIALANG feedback; the respondents were asked to mark the three most useful (and the three least useful) types of feedback in the table (those rankings are not reported here, however, only the informants’ free responses to questions 4 and 5).

What was best about DIALANG?

Overall, the learners surveyed in the study had a very clear positive opinion of DIALANG. Almost all (95%) of those who replied to the questionnaire answered the question about the best features in DIALANG (question 1 above; i.e., 528 informants out of 557). Only 4 (0.7%) of those who replied to this question said there was nothing good about the system, which contrasts with the responses to question 2 about the negative aspects of DIALANG (see below for details).

The feature most often mentioned as the best thing in DIALANG was that the learners could find out their level of proficiency by using the system – about half of all respondents in the survey explicitly referred to the level in their responses. When we look at the feature that the learners mentioned first in their list of good things about DIALANG (or as the only feature, if they mentioned only one), this type of response stands out as the most frequent answer (23%). A further 26% of them replied by stating something quite similar, although using somewhat vaguer expressions such as “You can find out how good you are” or “how good your skills are”, or simply “You can test your skills”. Thus, the chance to test oneself and to find out about one’s proficiency, and in particular about one’s level of proficiency, was clearly the most important positive feature of DIALANG in the opinion of the users surveyed here.

The second group of features that a substantial number of users singled out as being best in DIALANG consisted of two closely related sets of replies. The first was a more general statement about the opportunity to see where one had errors or to see what the correct answers were (26% of all users). The other was the chance to see the item-level results immediately, i.e. the immediate item review function (23%). About half of the users thus appreciated the most detailed type of feedback on offer in the system, although it was seldom the first, or only, thing they mentioned.

The many-sidedness of the system (19%) and the whole, integrated concept behind the system (5%) were also mentioned quite frequently. The former refers to the variety of different languages, skills and tasks on offer to the user, whereas the latter seems to express the same idea but from a slightly different point of view.

The advice on how to improve one’s language learning was mentioned by almost one in five respondents (18%). Quite a few also stated in more general terms that the system helps them to improve or learn languages (12%) or that it enabled them to see their strengths and weaknesses so that they would know what needs improving (11%). It is sometimes difficult to distinguish between these three response categories, but the main point here is to show the main types of things that many of the users found good, useful and interesting in DIALANG rather than to make any finer distinctions between obviously closely related types of answers.

Table 1. The best features in DIALANG and its feedback

Features mentioned by users                                                        n     %    Mentioned first / only
You can find out your level                                                      283   51%    23%
You can find out how good you are / how good your proficiency is /
  You can test your skills                                                       145   26%    17%
You can see the correct answers / where you had errors                           145   26%     1%
Immediate feedback on items                                                      129   23%     5%
Many-sidedness of the system (languages, skills, tasks)                          103   19%    11%
You get advice                                                                   102   18%     2%
Self-assessment / self-assessment feedback                                        94   17%     5%
You can learn / practice (when doing Dialang tests)                               78   14%     3%
One or all of the Dialang tests or tasks themselves                               72   13%     6%
That you get feedback (in general)                                                71   13%     4%
Helps to learn / improve language (in general)                                    66   12%     0%
You can see your strengths and weaknesses / what to improve                       59   11%     1%
Clarity of instructions or information                                            38    7%     2%
Ease of use                                                                       36    7%     3%
Neutral / objective / realistic / real / correct / reliable information or result 35    6%     1%
It motivates / is encouraging / makes you believe in yourself                     32    6%     0%
The whole system / whole concept / new idea / comprehensive (integrated) approach 27    5%     2%
Concrete feedback / information                                                   20    4%     0%
International / common standard or yardstick for comparison                       14    3%     1%
Quick way to test / quick results                                                 11    2%     0%
You can monitor your progress with Dialang                                        11    2%     0%
Can be done any time / anywhere / by anybody / it’s freely available              10    2%     0%
The fact that it is computerised                                                   8    1%     0%
No time limit                                                                      7    1%     1%
Free of charge                                                                     6    1%     0%
Individual / personal feedback                                                     5    1%     0%
Can be done independently / without a teacher                                      4    1%     0%
Other reason                                                                     119   23%     5%


Interestingly, 14% of all respondents replied that they appreciated being able to actually learn or practice a language when taking and studying the DIALANG test tasks. Many test takers thus appeared to see DIALANG not only as a language test but also as a context and an opportunity for learning.

The DIALANG tests and test tasks themselves were also specifically mentioned as good parts of the system (13%). Of the tests, listening and the Vocabulary Size Placement Test were singled out as particularly good and interesting. When test tasks were referred to, adjectives such as interesting, realistic and varied were typically used.

Clarity and ease of use were both mentioned by 7% of respondents, which is somewhat reassuring, as much attention was paid to these qualities of the system in the design stages.

What was bad in DIALANG? What could be improved?

Next, the learners’ responses to two open-ended questions about the problems and areas for improvement in DIALANG are analysed. The two questions are treated together because they are related: there is some overlap in the wording of the questions, and many learners who reported some feature in the system to be problematic also identified that feature as something that needed improving. It should be noted, however, that quite a few learners replied only to one of these two questions, and some suggestions for improvement did not seem to mean that the learner regarded the particular point as a problem. For example, suggestions to add new languages to the system seem to imply a ‘problem’ in the system only indirectly.

It can be noted that whereas almost all respondents (94%) had something positive to say about DIALANG when directly asked, considerably fewer of them replied when they were asked to say what was bad or problematic in the system (65%) or what could be added to or changed in the system (42%). The learners’ positive attitude to DIALANG was also evident in the fact that 17 (or 3%) stated there was nothing bad about the system when asked about its problems, 26 (almost 5%) replied that nothing needed adding to or changing in the system, and a further 2% said they did not know what to reply to the question about how to improve the system. (For the sake of comparison, only 4 learners had replied to the question about DIALANG’s good sides by citing some problem with it.) Furthermore, the list of different advantages mentioned by the respondents was far longer than the corresponding list of problems.

The above is in line with the results of the analysis of the respondents’ Likert-scale responses: overall, the learners surveyed here regarded DIALANG in a positive way. Because the sample is not statistically representative, this result, although encouraging, cannot be confidently generalised to all users of the system in the two countries surveyed. However, since the main aim of this study was to gain a more detailed understanding of the ‘strengths and weaknesses’ of DIALANG and of the learners’ reactions to the feedback that the system provides, the present data inform us about the most salient benefits and problems of the system, in the learners’ opinion.

Two things stand out in the learners’ responses on problems and areas for improvement in DIALANG: content of the test or system, and the Vocabulary Size Placement Test. The VSPT was clearly viewed as problematic whereas test content was an area where learners did identify some problems but also suggested quite a few improvements.

Test content

Test content is a rather wide category, so a more detailed account of what it includes is in order. Of the 152 respondents (27% of all) who mentioned content problems or suggested improvements to content, about a quarter (36) wanted to have more test tasks available in the system. Many of these reported that when retaking particular tests, they were given exactly the same test tasks as before. This observation is quite accurate and reflects the state of the DIALANG system: only one suite of tests was developed for each language in the project that created the system. Thus, for example, the reading test of English in DIALANG has three versions: easy, intermediate, and difficult. There is only one easy, one intermediate and one difficult test version, and furthermore, there is extensive overlap in items between the easy and intermediate versions on the one hand, and between the intermediate and difficult versions on the other. This obviously severely limits the usefulness of DIALANG – in its present form, at least – for repeated testing, which some learners might want to do in order to monitor their progress.

Table 2. Worst aspects of DIALANG and suggestions for its improvement

                                               Worst in DIALANG                 Suggestions for improvement
Features mentioned by users                    n     %    % first / only        n     %    % first / only
Test content                                   85   15    12                    88   16    14
Vocabulary Size Placement Test                 81   15    13                    33    6     6
Technical problems                             63   11    11                    29    5     4
Some part / aspect of feedback                 54   10     8                    28    5     4
Length (of tests, feedback or instructions)    32    6     5                    15    3     3
Interface or layout                            25    5     4                    21    4     3
Functioning of the system                      23    4     3                    21    4     3
Self-assessment                                20    4     3                     9    2     2
Instructions                                   16    3     3                    15    3     2
Answers not accepted (scoring)                 17    3     3                     2    0     0
Navigation                                     14    3     1                    11    2     1
Other feature                                  13    2     2                     8    1     1

The percentages in Table 2 have been calculated on the basis of all 557 respondents.

Table 3. Worst aspects of DIALANG content and suggestions for its improvement

What is wrong with the content / how it could be improved      n    % of all 557 respondents
Test content in total                                         141   25
More tasks needed (same tasks repeated in the tests)           36    7
Test of writing                                                19    3
Difficulty level not right                                     14    3
Add new languages                                              10    2
No speaking / add speaking                                      9    2

The second most frequent complaint or suggestion for change concerned the test of writing (19 or 3%), which was criticised, not surprisingly, for not letting the user actually write very much. This is also a design feature of the system, and the Project was very much aware of the controversial nature of an attempt to test writing skills with tasks that can be scored automatically by a system that does not contain any built-in artificial intelligence capable of dealing with longer responses. However, since DIALANG was designed to be a diagnostic rather than, e.g., a certification test, the use of indirect writing tasks that can be scored in a rather mechanical way is defensible, at least until we know more about diagnosing foreign / second language writing (Alderson 2005; Pelgrum 2001). If the writing tasks can be designed to focus on particular aspects of writing, it may indeed be possible to use such tasks, even indirect ones, for diagnostic testing. It is also possible to design automatically scorable, indirect writing tasks that come quite close to simulating real-life writing tasks. An example of such a task would be a letter-writing task in which the test taker has to fill in one or more gaps in an otherwise complete letter shown on the screen. The gaps could target specific points in the letter that are essential in terms of, say, sociolinguistic appropriateness, such as the salutation and the closing phrase, or they could target certain vocabulary and grammar items. In any case, the decision to use indirect writing tasks in DIALANG resulted in many writing tasks closely resembling regular grammar, vocabulary or reading items.

There are in fact two somewhat different conclusions that can be drawn from the above finding. On the one hand, it was somewhat surprising that only 3% of the informants in the study singled writing out as an aspect of DIALANG in need of improvement. It may be that the average test taker is more or less happy with the way in which DIALANG tests writing, perhaps because the item types they encounter are familiar ones – multiple choice and gap-filling – and they may not easily see beyond that. It should be mentioned here that the justification for using indirect writing tasks in DIALANG is not available to the users on the DIALANG website or in the system help files. Thus, their apparent lack of criticism of the way writing is tested in the system is very unlikely to be based on an awareness and acceptance of the argument that such test methods may be defensible for diagnostic purposes. The second conclusion from the finding is that an explanation of the reasons why writing is tested the way it is might in fact be needed. At least some test takers were unhappy with the fact that they needed to write next to nothing in order to pass the writing tests. Clearly, these learners were judging DIALANG against the regular writing tasks they had encountered before and failed to see why DIALANG was using such an odd-looking approach to testing writing.

Some learners also felt the difficulty of the tasks was not right for them. Usually, those who mentioned this complained that the tests, or parts of the tests, were too easy for them. These are interesting comments because DIALANG attempts to provide the test taker with the level of test that best suits his/her skills (either the easy, intermediate, or difficult test version). As explained in, e.g., Alderson and Huhta (2005), DIALANG uses a Vocabulary Size Placement Test and self-assessment as a way to guide the user to the most appropriate level of test. Clearly, this does not succeed in all cases.

There are at least two major issues concerning the difficulty of the tests in DIALANG that the present author and the Project in general are aware of. The first has to do with the availability of items for all the CEFR levels tested, and the second concerns the accuracy of the placement procedures used in DIALANG. Let us first deal with the issue of the difficulty of the items – and, hence, the tests – available in DIALANG. Predicting the difficulty of comprehension items can be quite difficult; indeed, Alderson (2005) found that the item writers of English in DIALANG were often wrong in predicting the level of the items, although there was a positive overall correlation between the item writers’ estimates of the difficulty of the items and the empirical difficulty based on piloting the items with learners. Further analyses of the pilot tests and the results of the standard setting exercises within the Project both indicated quite clearly that, in general, the current DIALANG tests have very few C1 or C2 level items. Thus, even the most difficult versions of the three levels of tests available in DIALANG contain few items that really challenge the language abilities of advanced learners whose proficiency is at C1 or C2. Consequently, the tests may feel somewhat easy for such advanced learners, which may explain some of the comments on the general easiness of the tests.

It is, however, likely that the negative comments on tests being too easy stem from the failure of the placement procedure to assign the learner to the optimal test version. The most likely source of problems in this respect is the Vocabulary Size Placement Test (VSPT). The issue is thoroughly examined in one of the articles (Huhta 2007b) that comprise the present thesis, and it is also briefly summarised elsewhere in this document. Suffice it to say that there is indeed reason to believe that the placement procedure does not work for learners using a particular test-taking strategy for the VSPT, which results in their receiving far too low scores and thus being given too easy a test.

Among the suggestions for improvement, in particular, two further types of comments on test content are worth mentioning. Some learners wished there were more languages in addition to the 14 currently in the system (e.g. Russian, Latin, Chinese, Japanese, Arabic, and sign language). There is one obvious gap in DIALANG, namely the lack of speaking tests, and this was commented on by some learners, although one might perhaps have expected more than 9 of the 557 respondents to pay attention to this gap. It is possible that since the information about what is and is not available in DIALANG is easily accessible to the users on the website and in the test selection menu that they face in the early stages of taking DIALANG tests, they do not perceive this as a particular problem – at least it does not come as an (unpleasant) surprise to them. As far as these test takers are concerned, many of them were informed of the lack of speaking tests in the pre-test information session given by the present author or their teacher.

Vocabulary Size Placement Test

The Vocabulary Size Placement Test (VSPT) topped the list of problems together with test content. Both were mentioned by 15% of all respondents as one of the least good aspects of DIALANG. A full 13% mentioned the VSPT as the first, or the only, problem in their list of issues with the system, which was in fact higher than the corresponding figure for test content (see Table 2).

The reasons why so many users were critical of the VSPT are explained elsewhere in this article and in a separate article (Huhta 2007b).

Technical problems

After the content of the tests and the VSPT, the next most problematic aspect of DIALANG was the range of technical problems reported: a total of 11% (63) of the informants singled them out, and they were also quite common as the only or the first problem mentioned (11%).

The technical problems mentioned included such absolutely crucial issues as failures to install or start the programme when the users first attempted to do so at home, or the system crashing or hanging in the middle of a test session with the loss of all information accumulated up to that point. Other, less serious but still annoying problems included the failure of part of the item content to display on the user’s screen, which made answering the particular item more or less guesswork. Problems with audio recordings were also quite common (audibility in general and volume level), although some of these are in fact due to problems with the user’s computer settings rather than with the DIALANG software. In any case, they may indicate issues in the way audio recordings are presented in an assessment system intended for lay users of computer technology.

As with any other Internet-dependent computer programme, DIALANG has had problems with the reliability and speed of service across the Internet, as indicated by several users’ comments. Sometimes a network failure somewhere between the user’s computer and the DIALANG server(s) cut a test session short. Or, perhaps more commonly, users in certain areas or local networks experienced slowness in the delivery of test items and other content. Obviously, the listening tests with their big audio files are the first to suffer in such circumstances.

None of the technical problems listed by the informants of the present study are news to those who worked on the development of the system for years, myself included. In fact, the majority of the messages sent to the DIALANG project members by users of DIALANG across the world during the operational period of the system in 2001 – 2004 (and beyond that, in fact) concerned various technical questions or issues. For example, the problem of certain items not displaying properly on the computer screen in certain Windows XP systems was well known to the development team but turned out to be unsolvable (at the time, at least) because of incompatibility between Java, the programming language of DIALANG, and Windows XP. Years of experience in advising DIALANG users suggest that in the real world, technical problems are probably more common and more serious than the results of the current study indicate. The informants in this study had obviously managed to use DIALANG either in their educational institution or at home, so they had already overcome the first major hurdle – installation of the system. Also, the bandwidth of most educational institutions from which the informants came was sufficiently good, which probably prevented some of the problems due to slow connections from emerging.

Problems with feedback

DIALANG takes considerable pride in being the first large-scale language assessment system to provide such a wealth of different types of feedback and information to its users. With this background in mind, it is particularly interesting to examine the users’ reactions to the feedback and whether they found any problems or deficiencies in it. Below, the users’ comments on the following parts of the feedback are discussed:

• test result (CEFR level + description)
• review of individual test items either immediately or after the test
• self-assessment and test result feedback and comparison
• information about self-assessment (reasons why SA and test may not match)
• extended descriptions of the CEFR levels for reading, listening, and writing
• advice on how to improve reading, listening, and writing

Feedback from the VSPT is not covered here, as it is already discussed elsewhere and because the users often integrated the VSPT test and the feedback they got from it in their comments.

Ten percent (or 54) of the informants mentioned some problem with feedback, and although only 5% offered suggestions about its improvement, quite a few of the responses about problems with the feedback also contained advice to the developers of the system on how it might be improved.

Clearly the most common complaint, and conversely the most obvious direction for improvement, was the generality of the feedback: much of it was criticised for being too general and self-evident. This concerned, in particular, the test result (just a level; the user already knew his/her level), the SA and test comparison (too vague and general to be of use) and the advice (too self-evident). Thus, the most common wish was to get more detailed and more extensive feedback than what is on offer in the present version of DIALANG (about 9% of all respondents wanted this). The respondents also wanted to get more personal feedback (3%). More specific wishes included explanations for the different wrong answers (presumably in multiple choice items), not just information about the correct answer. In line with the demand for more detailed feedback, several users (2%) wished to get more detailed advice on how to improve their proficiency.

The review of individual test items and the user’s answers to the items divided the informants’ views to some extent. Depending on their preference, those who preferred to see the result of their answer right after replying to an item (immediate item review) considered the possibility of reviewing the items after the test was over unnecessary and not useful. Those who preferred the post-test item review, for their part, considered the immediate item review too intrusive and distracting.

To conclude, the intention of the designers of DIALANG was to offer a wide range of different kinds of feedback. Clearly, that has been a major reason for the generally very positive reception of DIALANG among the participants of this study, and more generally. However, as the users’ comments reported here indicate, there is still quite some way to go. Although there are aspects of the feedback that are somewhat personal and detailed, more personal and more detailed feedback is clearly a major area in which work is needed in the design of assessment systems such as DIALANG.

Table 4. Learners’ views on how DIALANG feedback could be improved

How feedback could be improved                         n    % of all 557 respondents
More detailed / more extensive feedback               50    9
More personal feedback                                16    3
Reasons for wrong answers                             17    3
Better / different functionality of feedback          12    2
Better / more detailed advice                         11    2
Access to feedback afterwards / after all tests        9    2
Easier access to advice                                5    1
Shorter feedback                                       4    1
Need teacher to interpret feedback                     3    1

Length

DIALANG is a fairly extensive system: taking more than one language test in it and studying all the feedback it offers takes a considerable amount of time. Thus, it was not surprising to see a fair number of critical comments on the length of the tests, the feedback, and, perhaps more surprisingly, the instructions in DIALANG (6% of the informants). The issue of length is interesting in that, as we just saw above, quite a few users wanted to get more feedback. Length thus appears to divide opinions to some extent. One’s reasons for taking a test such as DIALANG and one’s previous experience of taking tests, especially on the Internet, probably affect whether DIALANG is perceived as suitable or as too long in terms of the time it requires from the user.

DIALANG is a fairly extensive system for a good reason. First, it aims at being a reliable and valid measure of the language skills included in it: reading, writing, listening, structures, and vocabulary. This alone requires a certain minimum number of items in each test. Secondly, it aims at being a diagnostic language test. In order to diagnose, the test must attempt to cover different aspects or parts of the skill area in question (cf. Alderson 2005), and getting even a reasonably accurate picture of such sub-skills or sub-areas requires that each of them is tapped by enough items. Thus, truly diagnostic tests are bound to be long and extensive, probably much more so than the current version of DIALANG.

There is one use of DIALANG where its length may indeed be a handicap: placement. The use of DIALANG for placement purposes in educational institutions is probably the most common way of using it, at least when the number of test takers is considered. The fact, however, that it is so widely used for this purpose may indicate that institutions value an instrument that is available and dependable, even if it is longer than they would ideally like it to be.

Other issues and problems

The interface of DIALANG and the layout of the screens and items caused a negative reaction in some users (5%). The way the system functions in general was also an issue for some users: since a typical test-taking session consists of several parts, it was a challenge to the designers to make it clear to the users what the main steps were and how to proceed from one step to the next. As is the case with most computer programmes, a user needs a certain amount of practice with DIALANG to navigate confidently in the system.

Self-assessment was also something that was not to everybody’s taste, as about 4% of the respondents regarded it as one of the less good aspects of DIALANG. Some of them apparently did not like doing self-assessment at all; others found the particular approach used in DIALANG not useful for their purposes. Still others found something wrong in the instructions available in the system (3%) or in the scoring of open-ended questions. The latter is an issue that is very hard to tackle in a foolproof way in any automated scoring system that does not use an intelligent scoring / rating algorithm. In DIALANG, all acceptable answers to a short-answer item have to be ‘hard-wired’ into the system, i.e., all acceptable answer options (including misspelled but acceptable variants of the answers) have to be manually fed into the scoring part of the system. It is almost impossible to arrive at a 100% comprehensive scoring key for reading and listening items, in particular, even one based on extensive piloting, as was the case with some of the languages in DIALANG. Thus, inevitably, some users will find, if they use one of the item review functions offered by the system, that a (possibly) good answer to a particular item is not accepted by the system.
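
As a purely illustrative sketch of what such a ‘hard-wired’ scoring key amounts to, the following fragment scores a short-answer item against a list of accepted variants. The item identifier and the accepted answers are invented, and DIALANG’s actual matching rules may well differ (for instance, in how spelling variants are handled).

# Accepted answers, including tolerated misspellings, fed manually into the key.
ACCEPTED_ANSWERS = {
    "item_042": {"library", "librery", "the library"},
}

def score_short_answer(item_id: str, response: str) -> bool:
    """Return True if the normalised response matches a stored answer variant."""
    normalised = " ".join(response.strip().lower().split())
    return normalised in ACCEPTED_ANSWERS.get(item_id, set())

print(score_short_answer("item_042", "  Library "))  # True
print(score_short_answer("item_042", "bookshop"))    # False: a possibly acceptable
                                                     # answer not anticipated in the key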

Forty-three respondents (8% of all) stated that the feedback in DIALANG was good enough as it was, and a further 11 (2%) said they did not know how it might be improved.

APPENDIX 2

Own contribution in the two joint articles

The two joint articles included in my thesis (Huhta & Figueras 2004 and Alderson & Huhta 2005) both cover the development of DIALANG; the latter also reports some results of the piloting of the DIALANG tests and self-assessment instruments. Before giving an account of my contribution to the two articles it is thus important to specify my role and contribution in the DIALANG project that developed the assessment system.

Clearly, a project such as DIALANG is a joint enterprise of dozens of participants, and even the number of those who made significant contributions to the system over several years must be 10 – 12, including the three persons whose names appear as the authors of the two above-mentioned articles. All major components of the DIALANG system were developed over a considerable period of time, and they were modified and revised, sometimes several times (e.g., test items, wording of feedback), before incorporation into the operational system. Typically, one or two persons were responsible for designing the draft version of a component (a test item, a text for feedback or instruction), which was then reviewed and revised by several other project members both before and after empirical trialling (see Alderson 2005 for the most detailed available description of the development of, e.g., the tests, feedback and self-assessment components).

Although all work on DIALANG was more or less collaborative and no individual could make independent decisions on anything, it is nevertheless possible to describe in broad terms the areas in which an individual made the most significant contributions. In some cases it is also possible to say who was responsible for producing the first version of particular texts or other components of the system.

The following table lists the positions that I held in the project and describes my tasks and responsibilities in each role:


Years      Position and responsibilities

1996-97    Member of the project coordination team at the coordinating centre.
           Designing the assessment framework and participation in the design of the overall system.

1997-98    Member of work group.
           Participation in designing the self-assessment instruments and the feedback system as a member of the Work Group on Self-assessment and Feedback.

1999-2000  Coordinator of a sub-project.
           Coordination of the pilot testing of the system (23 pilot sites across Europe) as a member of the project coordination team in Jyväskylä.

2000-04    Joint coordinator of a sub-project.
           Coordination and quality control of item review and test development activities before and after the piloting of test items and self-assessment statements (14 test review teams across Europe; 2-3 reviewers per team).

2000-03    Coordinator of a sub-project.
           Coordination of translation activities (13 translation teams across Europe; 2-3 translators and reviewers per team).

2003       Coordinator.
           Coordination of programming activity.

2003-      Researcher and representative of CALS.
           Research on DIALANG. Participation in the maintenance and further development of the system as the partner representative of the University of Jyväskylä.

Below, I elaborate on the aspects of DIALANG to which I contributed most; I focus on those aspects that are covered in the articles included in my thesis.

In terms of amount of work, my biggest contribution to DIALANG was in the quality control of the test items between their initial drafting and piloting; this took place in stages, depending on the language, between 2000 and 2004. This included the coordination of the review and revision of the draft items (together with the Dutch colleague José Noijons from CITO) in 14 languages by 14 teams of reviewers, to ensure that as many items as possible would ‘survive’ the piloting and data analyses. I was the main author of extensive item review guidelines that aimed at ensuring adequate and comparable review of items across the languages. The fact that the results of the piloting reported in Alderson & Huhta (2005) and Alderson (2005) were quite good suggests that the preceding work on the items by the reviewers and their coordinators had been successful. Thus, work on the test items

contributed to the validity of the language tests in DIALANG, which is a prerequisite for the validity and potential usefulness of much of the feedback offered by the system.

Another important area that I worked on in DIALANG is the feedback. During the first phase of the project (1996-1999), I participated in a work group that had been created to design the self-assessment and feedback components of the system – hence its name, ‘Work Group on Self-assessment and Feedback’. As a member of the group, I obviously contributed to all work on these two areas, but it is probably fair to say that the points where I made the clearest individual contributions, by drafting the first version of texts or by taking the initiative, were the following:

(a) Creation of ‘Information about self-assessment’. The idea for this part of the feedback came from me, and the first version was also drafted by me.

(b) Drafting of the ‘Extended descriptions of levels’ and ‘Advice on how to improve’ for writing. The drafting of these components of feedback was divided between the work group members, and I had the task of producing the first version for the skill of writing.

(c) Contribution to the overall design of DIALANG feedback. This includes at least the addition of ‘Information about self-assessment’ to the feedback and the way the post-test item review / feedback works. It should be noted that ‘Information about self-assessment’ was not planned to be the only such general package of information in DIALANG. Similar information about, e.g., learning strategies was also planned to be included in the system, but mainly for practical reasons this plan was never completed.

In addition to my involvement in the design of feedback for DIALANG, this part of the system has continued to interest me, and much of my research on DIALANG after the launch of the beta version of the system in 2001 has been on the feedback that it gives. The purpose of this research has been twofold: a scientific interest in understanding how the feedback works and a more practical need to inform the future development of the system.

The third strand of my work on DIALANG concerned the coordination of the translations of the feedback texts and the instructions and help files used by the computer system underlying the testing system. The main concern there was to ensure that the translations were as

equivalent between the languages as possible, which can be seen as a way of making the system as comparable as possible across the different language versions. To achieve this, I co-authored with Steve Fligelstone (Lancaster University) translation guidelines for the 13 teams that translated the texts from the English original. The work also included systematic cross-language checks between the translations in the languages of which I had at least some knowledge.

The fourth type of contribution that I made concerned the planning of the overall system, first as a member of the coordinating centre (1996-1999) and then as a member of the team of area / sub-project coordinators, with test development and translations as my responsibility (2000-2004). During the first phase of the project, I drafted the first version of the overall DIALANG Assessment Framework (a document that specified which areas of content in the interim version of the Common European Framework were to be used as the basis for DIALANG) and the first version of the test specifications for grammar.

Own contribution to the writing of the two joint articles

In the article with Charles Alderson, my contribution was smaller than my co-author’s. It was nevertheless fairly considerable in the first sections of the article (introduction, description of the system, test development) and in the conclusions; in fact, we mostly wrote these sections jointly, and it is impossible to disentangle the contributions of the two authors. We often used pieces of text produced by us for other purposes as a starting point for drafting sections of the article. My co-author was just completing a book on DIALANG at the time of writing the article, so the empirical results of the piloting reported in the article are a selection of those he included in his book, and he was mainly responsible for that section. As I described above, my contribution to the development of the system, its tests and the gathering of the data reported in this article was, however, considerable.

This particular (joint) writing process makes it difficult to distinguish the contributions of the two authors in the above article, except for the reporting of the empirical results, which was clearly my colleague’s responsibility. In the article Huhta & Figueras (2004), by contrast, it is somewhat easier to draw the line between the two writers because we adopted a slightly different approach to writing the article. The two of us first drafted specific sections of the article, which we then joined together, and only then did we start editing the whole draft. On the whole, it appears that I contributed more to this article than my co-author, at least in terms of

length of text. Most of the text from page 66 (Using the CEF) to page 74 (up to Lessons learnt and the way forward) was drafted by me, except for the section on the European Language Portfolio on page 70. Obviously, in this article, too, there are many ideas and pieces of text that one writer contributed to the text initially drafted by the other.

APPENDIX 3 – Part 1

Concept map of the advantages of DIALANG self-assessment based on learners’ sorting (the non-modified output from the Permap multidimensional scaling programme)

APPENDIX 3 – Part 2

Concept map of the advantages of DIALANG self-assessment based on learners’ sorting (the ‘opened’ output from the Permap multidimensional scaling programme where the clusters have been opened to allow the identification of the advantages of self-assessment in DIALANG). For the meaning of the numbers in the map below, see the Appendix in Huhta (submitted).

APPENDIX 4

Summary of the content of the six articles that form the thesis

All the articles deal with DIALANG, which is a computerised assessment system that provides diagnostic feedback on the strengths and weaknesses in the learner’s language proficiency in 14 different languages. As a whole, the articles provide validity arguments and evidence for the system, and thus, they can be thought of as a set of linked validity studies.

Articles 1 and 2 give an account of the rationale, background and development of DIALANG, in particular of how the Common European Framework of Reference was used as the main basis for the system. They describe what steps were taken in the design phase of the system to ensure as high a quality as possible, a stage that some language testers (e.g. Weir 1993, 2005) refer to as a priori or theory-based validation. Article 5 also concerns the theoretical underpinnings of the system. Diagnostic testing, which is what DIALANG attempts to do, is a very messy concept, as discovered by Spolsky (1992) and Alderson (2005). Thus, in article 5 the author attempts to analyse and clarify the meaning of ‘diagnosis’ in educational measurement and to relate it to other, very similar types of testing. The article also reviews some of the literature on diagnostic assessment.

The other articles (including parts of article 2) report on different types of empirical research into DIALANG. Article 2 reports on the analyses of the piloting of the English tests of the system, and thus provides evidence of the reliability and validity of the language tests, which form the core of the system and on whose quality the validity of much of the feedback depends. Articles 3 and 4 focus on users’ reactions to DIALANG and the feedback it gives. Article 4 gives a brief overall account of reactions to all parts of the quite wide variety of feedback on offer in the system and then focuses on the users’ views on self-assessment feedback. Article 3 deals with the most controversial subsystem of DIALANG, namely the Vocabulary Size Placement Test, and analyses how the users’ comments shed light on certain problems in the system and, thus, on its validity. The last paper, article 6, analyses conceptions of self-assessment in more detail. Using methodology borrowed from other fields of science, the author analyses the benefits and problems associated with self-assessment and self-assessment feedback in DIALANG.

(1) Huhta, Ari & Figueras, Neus. 2004. Using the CEF to promote language learning through diagnostic testing. In K. Morrow (ed.) Insights from the Common European Framework. Oxford University Press. 65-76.

This article gives an overview of DIALANG and, in particular, how it makes use of the Common European Framework of Reference (CEFR) in its design. Self-assessment and the different parts of feedback are reviewed to see how they draw on the CEFR. A lot of the article is devoted to a description of the underpinnings of DIALANG with reference to pedagogy, life-long learning and autonomous, self-directed learning.

(2) Alderson, J. Charles & Huhta, Ari. 2005. The development of a suite of computer-based diagnostic tests based on the Common European Framework. Language Testing, 22, 301– 320.

The article gives a more detailed overview of the parts of the DIALANG system and of their development: item writing, piloting, and standard setting. The analysis of the pilot data and the IRT-based calibration of the test items (and self-assessment statements) are also described, and overall quantitative results of the piloting and calibration are given. Some empirical results of the standard-setting procedures are presented, such as inter-judge agreement and the relationship between the judgements and empirical item difficulty. In addition, the correlations between the different skill tests and between the tests and the self-assessments are reported.
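Purely to illustrate the kind of IRT calibration mentioned above, the sketch below fits a simple Rasch model to an invented dichotomous response matrix by joint maximum likelihood. DIALANG’s operational calibration used dedicated software and procedures not reproduced here; the data, sample size and estimation method in this sketch are assumptions for demonstration only.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(0)

# Invented 0/1 response matrix: 100 simulated learners x 10 pilot items.
true_theta = rng.normal(0.0, 1.0, 100)
true_b = np.linspace(-1.5, 1.5, 10)
X = (rng.random((100, 10)) < expit(true_theta[:, None] - true_b[None, :])).astype(float)
n_persons, n_items = X.shape

def neg_log_lik(params):
    """Negative log-likelihood of the Rasch model: P(correct) = logistic(theta - b)."""
    theta = params[:n_persons]
    b = params[n_persons:] - params[n_persons:].mean()  # centre difficulties for identification
    p = np.clip(expit(theta[:, None] - b[None, :]), 1e-9, 1 - 1e-9)
    return -(X * np.log(p) + (1 - X) * np.log(1 - p)).sum()

# Joint maximum likelihood estimation: a toy approach; operational calibrations
# use more refined estimators and much larger samples.
res = minimize(neg_log_lik, np.zeros(n_persons + n_items), method="L-BFGS-B")
b_hat = res.x[n_persons:] - res.x[n_persons:].mean()
print("Estimated item difficulties:", np.round(b_hat, 2))
```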

(3) Huhta, Ari 2007a. The Vocabulary Size Placement Test in DIALANG – why do users love and hate it? In Carlsen, Cecilie & Eli Moe (eds.) A Human Touch to Language Testing. Novus Publishers. 44-57.

The article focuses on qualitative evidence on the validity, and possible directions for improvement, of the Vocabulary Size Placement Test (VSPT) used in DIALANG to direct the user to the most appropriate level of test. The evidence consists of users’ comments in questionnaires and interviews on the VSPT. The VSPT is an example of an unusual test format: in it the user has to decide if the words presented to them are real or not. The study reports on the purpose and development of the VSPT and then presents the users’ reactions to the VSPT results and to the test format itself. The users’ reactions to this part of DIALANG were more extreme – both negative and positive – than to other aspects of the system. The main finding was that the scoring algorithm underlying the VSPT is not optimal for certain learners and results in an underestimation of their proficiency. Some quantitative analyses give support to this conclusion.
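As an illustration of why a yes/no vocabulary format of this kind can underestimate some learners, the sketch below applies one simple correction-for-guessing scheme (hit rate minus false-alarm rate). This is not the scoring algorithm actually used in the DIALANG VSPT, which is not reproduced here; the numbers and the formula are assumptions chosen only to show how ‘yes’ responses to invented words pull the score down.

```python
# Illustrative sketch of a yes/no vocabulary test score with a simple
# correction for guessing (hit rate minus false-alarm rate). NOT the DIALANG
# VSPT formula; the point is only to show how accepting pseudowords can
# heavily penalise a learner who guesses when unsure.

def vspt_like_score(yes_to_real: int, n_real: int,
                    yes_to_pseudo: int, n_pseudo: int) -> float:
    hit_rate = yes_to_real / n_real              # proportion of real words recognised
    false_alarm_rate = yes_to_pseudo / n_pseudo  # proportion of pseudowords accepted
    return max(0.0, hit_rate - false_alarm_rate)  # crude guessing correction

# A cautious learner who knows many words and accepts no pseudowords:
print(vspt_like_score(45, 50, 0, 25))   # 0.9
# A learner with similar knowledge who says "yes" liberally when unsure:
print(vspt_like_score(48, 50, 10, 25))  # 0.56 - heavily penalised for guessing
```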

(4) Huhta, Ari 2007b. Itsearviointi osana kielitestiä – tutkimus käyttäjien suhtautumisesta tietokoneen antamaan palautteeseen kielitaidon itsearvioinnista (Self-assessment as part of a language test – a study on user reactions to computerised feedback on self-assessment). In Salo, Olli-Pekka, Kalaja, Paula & Nikula, Tarja (eds.) AFinLA yearbook 2007. Publications of the Association Finlandaise de Linguistique Appliquée 65. 371-388.

The article reports on a study into language learners’ reactions to the feedback on self-assessment of language proficiency that they receive from the computerised DIALANG assessment system. The feedback includes (1) a comparison of the self-assessments with the test results and (2) information about possible reasons for a mismatch between the two. In the study, 557 learners replied to questions about, e.g., the interest and usefulness of the feedback. The article reports on how the users’ reactions to the new self-assessment feedback compared with other types of feedback, including traditional test feedback, i.e. the overall result and feedback on individual test items. Also reported are findings on how the users’ background (e.g. age, gender, proficiency) was related to their evaluation of the interest and usefulness of the feedback. The article concludes with a brief account of further research, making use of a particular concept-mapping methodology, into the justifications the learners gave for their views on assessment feedback.

(5) Huhta, Ari 2008. Diagnostic and formative assessment. In Spolsky, Bernard & Hult, Francis (eds.) Handbook of Educational Linguistics. Blackwell, 469-482.

There is considerable controversy in the literature as to what counts as diagnostic assessment. This article attempts to clarify the meanings of diagnosis and, in particular, its relationship to

formative assessment and other learning-oriented types of assessment. Work on both language testing and more general educational measurement is covered. The article also considers the position of DIALANG in relation to other diagnostic assessment and to other closely related types and purposes of assessment. In addition to treating the concepts and definitions of diagnostic and formative assessment, the article reviews research on several topics related to both types of assessment: the effects of assessment on learning, the ability to diagnose performance, the nature and practices of formative assessment, the diagnosis of comprehension skills, and the content and constructs of assessment.

(6) Huhta, Ari (submitted; now being revised) Concept-mapping – an approach to understanding self-assessment

The study aims at understanding the concept of self-assessment by analysing language learners’ and language teachers’ views of the pros and cons of self-assessment as it is applied in DIALANG. The data for the study come from the open-ended responses in the questionnaire study referred to in article 3 above; the questionnaire asked the respondents to explain why they thought the self-assessment and self-assessment feedback in DIALANG were either useful or useless for them. A similar set of pros and cons of DIALANG self-assessment was gathered from language teachers who use DIALANG in their teaching. The present study focuses on the advantages of self-assessment, which were analysed with the help of concept-mapping procedures used in business, health care and general education (Trochim 1989). Sixteen teachers and language majors independently sorted the pros into groups. Their groupings were then analysed with multidimensional scaling to identify the conceptual groups that emerge from the combined groupings. The outcome of the study is a concept map of the advantages of self-assessment in DIALANG that illustrates the dimensions of the pros and also how the different dimensions relate to each other in terms of closeness.
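As a rough illustration of the concept-mapping procedure described above, the sketch below builds a statement-by-statement co-occurrence matrix from a few invented card sorts and submits it to multidimensional scaling. The study itself used the Permap programme and real sorting data, so the statements, the sorts and the library choice here are assumptions for demonstration only.

```python
# Minimal sketch of concept-mapping analysis (cf. Trochim 1989): each sorter
# groups the statements, a co-occurrence matrix is built from all sorts, and
# multidimensional scaling places the statements on a two-dimensional map.

import numpy as np
from sklearn.manifold import MDS

statements = ["s1", "s2", "s3", "s4", "s5"]   # hypothetical 'pros of self-assessment'
sorts = [                                      # each sorter's grouping of the statements
    [{"s1", "s2"}, {"s3", "s4", "s5"}],
    [{"s1", "s2", "s3"}, {"s4", "s5"}],
]

n = len(statements)
co = np.zeros((n, n))
for sort in sorts:
    for group in sort:
        for i, a in enumerate(statements):
            for j, b in enumerate(statements):
                if a in group and b in group:
                    co[i, j] += 1              # counted together by this sorter

dissimilarity = len(sorts) - co                # more co-occurrence -> smaller distance
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(dissimilarity)
for label, (x, y) in zip(statements, coords):
    print(f"{label}: ({x:.2f}, {y:.2f})")      # coordinates for the concept map
```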

References

Alderson, J. C. (1990). Testing Reading Comprehension Skills (Part One). Reading in a Foreign Language, 6(2), 425-438. Alderson, J. C. (2000). Assessing Reading . Cambridge: Cambridge University Press. Alderson, J. C. (2005). Diagnosing Foreign Language Proficiency. The Interface between Learning and Assessment . New York: Continuum. Alderson, J. C. (2007). The challenge of (diagnostic) testing: Do we know what we are measuring? In J. Fox, M. Wesche, D. Bayliss, L. Cheng, C. E. Turner & C. Doe (Eds.), Language Testing Reconsidered (pp. 21-39). Ottawa: University of Ottawa Press. Alderson, J. C., Clapham, C., & Wall, D. (1995). Language Test Construction and Evaluation . Cambridge: Cambridge University Press. Alderson, J. C., Figueras, N., Kuijper, H., Nold, G., Takala, S., & Tardieu, C. (2006). Analysing Tests of Reading and Listening in Relation to the Common European Framework of Reference: The Experience of The Dutch CEFR Construct Project. Language Assessment Quarterly, 3, 3-30. Alderson, J. C., & Urquhart, A. (Eds.). (1984). Reading in a Foreign Language . London: Longman. Alderson, J. C., & Windeatt, S. (1991). Computers and innovation in language testing. In J. C. Alderson, & B. North (Eds.), Language Testing in the 1990s: The Communicative Legacy (pp. 226-236). London: Macmillan / Modern English Publications. Alderson, J. C., & Huhta, A. (2005). The development of a suite of computer-based diagnostic tests based on the Common European Framework. Language Testing, 22 (3), 301-320. Bachman, L. F. (1990). Fundamental Considerations in Language Testing . Oxford: Oxford University Press. Bachman, L. F., Cohen, A., & Long, M. (Eds.). (1998). Interfaces Between Second Language Acquisition and Language Testing Research . Cambridge: Cambridge University Press. Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice . Oxford: Oxford University Press. Banerjee, J. (2004). Qualitative analysis methods (Section D). In Council of Europe (Ed.), Reference Supplement to the Preliminary Pilot Version of the Manual for Relating Language Examinations to the Common European Framework for Languages: Learning, Teaching, Assessment (pp. 53). Strabourg: Council of Europe. Banerjee, J., Franceschina, F., & Smith, A. M. (2007). How do features of written language production interact at different performance levels? . Sitges, Spain: June 15-17, 2007. Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education: Principles, Policy & Practice, 5(1), 7. Buck, G., & Tatsuoka, K. (1998). Application of the rule-space procedure to language testing: examining attributes of a free response listening test. Language Testing, 15 (2), 119-157. Butler, D., & Winne, P. (1995). Feedback and self-regulated learning: a theoretical synthesis. Review of Educational Research, 65 (3), 245-281. CALS & DIALANG Project. (1998a). DIALANG - A New European System for Diagnostic Language Assessment. Assessment Framework. DIALANG Project internal document. CALS & DIALANG Project. (1998b). DIALANG Assessment Specifications for Listening Comprehension (Part 5). Version 6 , 29. DIALANG Project internal document.

202 CALS & DIALANG Project. (1998c). DIALANG Assessment Specifications for Reading Comprehension (Part 3). DIALANG Project internal document. CALS & DIALANG Project. (1998d). DIALANG Assessment Specifications for Structures / Grammar (Part 8). Version 6 , 31. DIALANG Project internal document. CALS & DIALANG Project. (1998e). DIALANG Assessment Specifications for Vocabulary (Part 7). Version 6 , 23 + Appendix (2 pages). DIALANG Project internal document. CALS & DIALANG Project. (1998f). DIALANG Assessment Specifications for Writing (Part 4). Version 6 , 31. DIALANG Project internal document. CALS & DIALANG Project. (1998g). DIALANG Assessment Specifications. Part 1 - 2: Introduction & DIALANG Assessment System. DIALANG Project internal document. Canale, M., & Swain, M. (1980). Theoretical bases of communicative approaches to second language teaching and testing. Applied Linguistics, 1(1), 1-47. Chapelle, C. (1999). Validity in language assessment. Annual Review of Applied Linguistics, Vol. 19. , 254-272. Chapelle, C. (2006). Test review. DIALANG: A diagnostic language test in 14 European languages. Language Testing, 23 (4), 544-550. Chapelle, C., & Douglas, D. (2006). Assessing Language through Computer Technology . Cambridge: Cambridge University Press. Chapelle, C., Enright, M., & Jamieson, J. (Eds.). (2008). Building a Validity Argument for the Test of English as a Foreign Language . New York: Routledge. Cizek, G. (2006). A Guide to Establishing and Evaluating Performance Standards on Tests Sage. Council of Europe. (1995). Common European Framework of reference for language learning and teaching. Draft 1 of a Framework proposal. Language learning for European citizenship. Strasbourg: Council of Europe. Council of Europe. (1997). Modern Languages: Learning, Teaching, Assessment. A Common European Framework of reference. Draft 2 of a Framework proposal. Strasbourg: Council of Europe. Council of Europe. (2001). The Common European Framework of Reference for Languages . Cambridge: Cambridge University Press. Council of Europe. (2004a). Reference Supplement to the Preliminary Pilot version of the Manual for Relating Language examinations to the Common European Framework of Reference for Languages: learning, teaching, assessment . Strasbourg: Council of Europe. Council of Europe. (2004b). Relating Language Examinations to the Common European Framework of Reference for Languages: Learning, Teaching, Assessment (CEF). Manual - Preliminary Pilot Version . Strasbourg: Council of Europe. Council of Europe. (2009). Relating Language Examinations to the Common European Framework of Reference for Languages: Learning, Teaching, Assessment (CEFR). A Manual . Strasbourg: Council of Europe. Cronbach, L. J. (1971). Test validation. In R. Thorndike (Ed.), Educational Measurement (2nd ed., pp. 443-507). Washington, DC: American Council on Education. Cumming, A. (1996). Introduction: The concept of validation in language testing. In A. Cumming, & R. Berwick (Eds.), Validation in Language Testing (pp. 1-14). Clevedon: Multilingual Matters. Cumming, A., & Berwick, R. (Eds.). (1996). Validation in Language Testing . Philadelphia: Multilingual Matters. Davou, M. (2008). Can Do": What Can They Do After All? . Athens, Greece: May 8-11, 2008.

203 Demeuse, M., Artus, F., Casanova, D., Crendal, A., Desroches, F., & Renaud, F. (2006). Evaluation et autoévaluation en français langue étrangère : approche empirique du Cadre européen commun de référence pour les langues (CECR) du Conseil de l’Europe à travers le TEF et l’échelle d’autoévaluation DIALANG. Actes du 19e colloque international de l’Association pour le Développement des Méthodologies d’Evaluation en Education (ADMEE-Europe), Luxembourg. Desroches, F., Casanova, D., Crendal, A., Renaud, F., Mahy, C., & Demeuse, M. (2005). Mise en équivalence des versions papier-crayon et électronique d’un test de français langue étrangère et mise en perspective des résultats par une procédure d’autoévaluation. Actes du 18e colloque international de l’Association pour le Développement des Méthodologies d’Evaluation en Education (ADMEE-Europe), Reims. Ek, J. v., & Trimm, J. (1991). Threshold Level 1990 . Cambridge: Cambridge University Press. Ek, J. v., & Trimm, J. (1999). Waystage 1990 (Rev. and corr. ed.). Cambridge: Cambridge University Press. Ek, J. v., & Trimm, J. (2000). Vantage . Cambridge: Cambridge University Press. European Commission. (1995). White Paper on Education and Training - Teaching and Learning - Towards the Learning Society No. COM(95/590) European Commission. (2003). Implementation of "Education & Training 2010" Work Programme - Improving Foreign Languages Learning European Commission DG Education and Culture. Figueras, N. (2001). Developing oral tests: Can we get closer to real life?. Unpublish PhD thesis. University of Barcelona. Figueras, N., Taalas, P., & Treweek, J. (2000). DIALANG 1 Interface Review - Proposals for Change. DIALANG Project internal document. Figueras, N., North, B., Takala, S., Verhelst, N., & Van Avermaet, P. (2005). Relating examinations to the Common European Framework: a manual. Language Testing, 22 (3), 261-279. Fligelstone, S., & Treweek, J. (2002). DIALANG Operational System development: Plan for Interface Overhaul. DIALANG Project internal document. Fligelstone, S. (2001). Working Document on Interface and related issues for the DIALANG system. DIALANG Project internal document. Fligelstone, S., & Huhta, A. (2001). Guide to DIALANG translations for DIALANG Version 1. DIALANG Project internal document. Floropoulou, C. (2002a). Investigating the Interface of DIALANG. Unpublished course paper, Department of Linguistics and Modern English Language, Lancaster University, Floropoulou, C. (2002b). Foreign language learners’ attitudes to self-assessment and DIALANG: A comparison between Greek and Chinese Learners of English . Unpublished MA thesis, Department of Linguistics and Modern English Language, Lancaster University, Freedle, R., & Kostin, I. (1996). The Prediction of TOEFL Listening Comprehension Item Difficulty for Minitalk Passages: Implications for Construct Validity. No. 56). Princeton, N.J.: Educational Testing Service. Fulcher, G. (2004). Deluded by Artifices? The Common European Framework and Harmonization. Language Assessment Quarterly 1 (4), 253-266. Gorman, T., Purves, A., & Degenhart, E. (Eds.). (1988). The IEA Study of Written Composition I: The International Writing Tasks and Scoring Scales . Oxford: Pergamon Press. Graf, E. A. (2008). Approaches to the Design of Diagnostic Item Models No. RR-08-07). Princeton, NJ: Educational Testing Service.

204 Green, A. (1998). Using Verbal Protocols in Language Testing Research: A Handbook . Cambridge: Cambridge University Press. Haahr, J. H., & Hansen, M. E. (2006). Adult Skills Assessment in Europe. Feasibility Study. Danish Technological Institute. Haahr, J. H., Shapiro, H., Sorensen, S., Stasz, C., Frinking, E., van't Hof, C., et al. (2004). Defining a Strategy for the Direct Assessment of Skills. Danish Technological Institute - RAND Europe - Oxford and Warwick Universities SKOPE. Hartz, S., & Roussos, L. (2008). The Fusion Model for Skills Diagnosis: Blending Theory With Practicality No. RR-08-71). Princeton, NJ: Educational Testing Service. Hasselgreen, A. (2004). Testing the Spoken English of Young Norwegians. A study of test validity and the role of 'smallwords' in contributing to pupils' fluency. Cambridge: Cambridge University Press. Hasselgreen, A. (2008). Assessing the Literacy of Children: A CEFR-linked Longitudinal Study (May 8-11, 2008 ed.) Athens, Greece. Hasselgreen, A., & Moe, E. (2006). Young learner's writing and the CEF - where practice tests theory . Kraków, Poland: May 19-21, 2006. Henning, G. (1992). Dimensionality and construct validity of language tests. Language Testing, 9(1), 1-11. Huhta, A. (1997). Yleisten kielitutkintojen arvosteluperusteet. [Principles of assessment in the National Certificates examinations]. In A. Huhta, A. Määttä, S. Takala & M. Tarnanen (Eds.), Yleiset kielitutkinnot ja kieltenopetus (National Certificates and language teaching) (pp. 21-43). Helsinki: National Board of Education. Huhta, A. (2002). Translation Input System (TIS) - Step-by-step guide for translators and reviewers. DIALANG Project internal document. Huhta, A. (2003). Itsearviointi kielitaidon arviointimuotona. [Self-assessment as a form of assessing language proficiency]. In A. Airola (Ed.), Kokemuksia suullisen kielitaidon arviointihankkeesta. [Experiences from a project on oral skills assessment]. (pp. 20-37). Joensuu: Pohjois-Karjalan ammattikorkeakoulu [North-Karelia Polytechnic]. Huhta, A. (2007a). The Vocabulary Size Placement Test in DIALANG – why do users love and hate it? In C. Carlsen, & E. Moe (Eds.), A Human Touch to Language Testing (pp. 44-57). Oslo: Novus Press. Huhta, A. (2007b). Itsearviointi osana kielitestiä – tutkimus käyttäjien suhtautumisesta tietokoneen antamaan palautteeseen kielitaidon itsearvioinnista [Self-assessment as part of a language test – a study of user reactions to computerised feedback on self- assessment]. In O. Salo, T. Nikula & P. Kalaja (Eds.), Kieli oppimisessa – Language in learning. AFinLA yearbook 2007. Publications of the Finnish Association of Applied Linguistics 65 (pp. 371-388). Jyväskylä: AFinLA. Huhta, A. (2008). Diagnostic and formative assessment. In B. Spolsky, & F. Hult (Eds.), Handbook of Educational Linguistics (pp. 469-482). Hoboken, NJ: Blackwell. Huhta, A. (submitted). Concept-mapping – an approach to understanding the concept of self- assessment. Huhta, A., & Figureas, N. (2004). Using the CEF to promote language learning through diagnostic testing. In K. Morrow (Ed.), Insights from the Common European Framework (pp. 65-76). Oxford: Oxford University Press. Huhta, A., & Noijons, J. (2002). Guidelines for the review of pilot test items. DIALANG Project internal document. Huhta, A., & Suontausta, T. (1996). Students’ Proficiency in L2. In D. Marsh, P. Oksman- Rinkinen & S. Takala (Eds.), Mainstream Bilingual Education in the Finnish Vocational Sector. (pp. 85-117). Helsinki: National Board of Education.

205 Huhta, A., & Takala, S. (1997). Yleiset kielitutkinnot kansainvälisessä kielentestauskäytännössä. [National Certificates examinations and the international testing practices]. In A. In Huhta, A. Määttä, S. Takala & M. Tarnanen (Eds.), Yleiset kielitutkinnot ja kieltenopetus (National Certificates and language teaching) (pp. 44-61). Helsinki: National Board of Education. Jaatinen, P. (2005). Ammattikorkeakouluopiskelijan kielitaito - harhaa vai hallintaa? [Language proficiency of students in a polytechnic university - illusions or mastery?] No. )/2005). Pori: Satakunta polytechnic. Jang, E. (2005). A validity narrative: Effects of reading skills diagnosis on teaching and learning in the context of Next Generation TOEFL. Unpublished PhD, University of Illinois, Urbana-Champaign. Kaftandjieva, F. (2004). Standard setting. Reference Supplement to the Preliminary Pilot version of the Manual for Relating Language examinations to the Common European Framework of Reference for Languages: learning, teaching, assessment (pp. 1-43). Strassbourg: Council of Europe. Kaftandjieva, F., & Takala, S. (2002). Council of Europe scales of language proficiency: a validation study. In J. C. Alderson (Ed.), Common European Framework of Reference for Languages: Learning, Teaching, Assessment. Case Studies (pp. 106-129). Strasbourg: Council of Europe. Kaftandjieva, F., Verhelst, N., & Takala, S. (1999). A Manual for Standard Setting Procedure. DIALANG Project internal document. Kane, M. (1992). An argument-based approach to validity. Psychological Bulletin, 112 , 527- 535. Kane, M. (2001). Current concerns in validity theory. Journal of Educational Measurement, 38 , 319-342. Kostin, I. (2004). Exploring Item Characteristics That Are Related to the Difficulty of TOEFL Dialogue Items. TOEFL Research Reports RR-79 . Princeton, N.J.: Educational Testing Service. Kulhavy, R. W. (1977). Feedback in Written Instruction. Review of Educational Research, 47 (2), 211-232. Kunnan, A. (1997). Connecting fairness with validation in language assessment. In A. Huhta, V. Kohonen, L. Kurki-Suonio & S. Luoma (Eds.), Current Developments and Alternatives in Language Assessment. Proceedings of LTRC 1996. (pp. 85-105). Jyväskylä: Unjiversity of Jyväskylä. Kunnan, A., & Jang, E. (forthcoming). Diagnostic feedback in language testing. In M. Long, & C. Doughty (Eds.), The Handbook of second and foreign language teaching () Blackwell Publishing. Kurtes, S., & Saville, N. (2008). The English Profile Programme - an overview. Cambridge ESOL: Research Notes, 33 , 2-4. Leney, T., Pollitt, A., & Poumay, M. (1998). Evaluation to the European Commission (DG22) – DIALANG by the Guildford Evaluation Services Ltd . Unpublished report. Luoma, S. (2001). What does your language test measure? Unpublished PhD thesis, University of Jyväskylä, Jyväskylä. Luoma, S. (2004). Self-assessment in DIALANG. An account of test development. In M. Milanovic & C. Weir (Eds.), European Language Testing in a Global Context. Proceedings of the ALTE Barcelona Conference, July 2001 . Cambridge: Cambridge University Press. (pp. 143-156). Luoma, S., & Jäppilä, T. (2001). Vocabulary Size Placement Test (VSPT) in the light of data. DIALANG Project internal document.

206 Luoma, S., & Tarnanen, M. (2003). Creating a self-rating instrument for second language writing: from idea to implementation. Language Testing 20 (4), 440-465. Macdonald, J. (2003). Assessing online collaborative learning: process and product. Computers & Education, 40 (4), 377-391. Mackiewicz, W. (1995). General Introduction to the Final Report of the SIGMA Scientific Committee on Languages . Berlin: European Language Council. McDonald, A. S. (2002). The impact of individual differences on the equivalence of computer-based and paper-and-pencil educational assessments. Computers & Education, 39 (3), 299-312. McFarland, C., & Miller, D. (1994). The framing of relative performance feedback: Seeing the glass as half empty or half full. Journal of Personality and Social Psychology, 66 (6), 1061-1073. McLaughlin, G. (2005a). Analysis of DIALANG system usage. DIALANG Project internal document. McLaughlin, G. (2005b). DIALANGE user survey. Overview of initial results. DIALANG Project internal document. McNamara, T., & Roever, C. (2006). Language Testing: The Social Dimension . Hoboken, NJ: Wiley-Blackwell. Messick, S. (1989). Validity. In R. Linn (Ed.), Educational Measurement (3rd ed., pp. 13- 103) American Council on Education & McMillan. Messick, S. (1996). Validity and washback in language testing. Language Testing, 13 (3), 241-256. Messick, S. (1995). Validity of Psychological Assessment: Validation of Inferences From Persons' Responses and Performances as Scientific Inquiry Into Score Meaning. American Psychologist, 50 (9), 741-749. Millman, J., & Greene, J. (1993). The specification and development of tests of achievement and ability. In R. Linn (Ed.), Educational Measurement (3rd edition ed., pp. 335-366). Phoenix, AZ: Oryx Press. Mislevy, R. J., Steinberg, L. S., & Almond, R. G. (2003). On the Structure of Educational Assessments. Measurement: Interdisciplinary Research and Perspectives, 1(1), 3-62. Mislevy, R., & Verhelst, N. (1990). Modeling item responses when different subjects employ different solution strategies. Psychometrika, 55 (2), 195-215. Mory, E. H. (1996). Feedback research. In D. H. Jonassen (Ed.), Handbook of Research for Educational Communications and Technology (pp. 919-956). New York: Simon & Schuster Macmillan. Mory, E. H. (2004). Feedback research revisited. In D. H. Jonassen (Ed.), Handbook of Research on Educational Communications and Technology (Second edition ed., pp. 745- 784). Mahwah, N.J.: Lawrence Erlbaum Associates. Nielsen, J. (2004). Card Sorting: How Many Users to Test. Retrieved October 13, 2008, from http://www.useit.com/alertbox/20040719.html Noijons, J., Verhelst, N., Kaftandjieva, F., Takala, S., Luoma, S., & Fligelstone, S. (2002). Instructions for Standard Setting (STSE) - DIALANG Phase 2. Unpublished manuscript. North, B. (1995). The Development of a Common Framework Scale of Language Proficiency Based on a Theory of Measurement. Unpublished PhD, Thames Valley University, Ortega, L., & Byrnes, H. (Eds.). (2008). The Longitudinal Study of Advanced L2 Capacitie s. London: Psychology Press. Ortega, L., & Iberri-Shea, G. (2005). Longitudinal Research in Second Language Acquisition: Recent Trends and Future Directions . Annual Review of Applied Linguistics, 2 5(-1), 26-45.

207 Oscarson, M. (1997). Self-assessment of foreign and second language proficiency. In C. Clapham, & D. Corson (Eds.), Encyclopedia of Language and Education. Volume 7. Language Testing and Assessmen t (pp. 175-188). Dordrecht: Kluwer Academic Publishers. Oskarsson, M. (1978). Approaches to Self-assessment in Foreign Language Learnin g. Oxford: Council of Europe & Pergamon Press. Oskarsson, M. (1984). Self-assessment of Foreign Language Skills: A Survey of Research and Development Work. Strasbourg: Council of Europe. Pelgrum, W. J. (2001). Obstacles to the integration of ICT in education: results from a worldwide educational assessment . Computers & Education, 3 7(2), 163-178. Peschel, J., Senger, U., & Willige, J. (2006). Fremdsprachenkenntnisse - Subjektive Einschätzung und objektiver Tes t No. 12)HIS Hochschu-Informations-System-GmbH Hannover. Retrieved from https://hisbus.his.de/hisbus/docs/HISBUS12_Fremdsprachenkenntnisse.pdf Purves, A., & Takala, S. (1982). An International Perspective on the Evaluation of Written Compositio n No. Evaluation in Education: An International Review Series, Vol 5, No 3) Qian, D. D., & Schedl, M. (2004). Evaluation of an in-depth vocabulary knowledge measure for assessing reading performance . Language Testing, 2 1(1), 28-52. Ross, S. (1998). Self-assessment in second language testing: a meta-analysis and analysis of experiential factors . Language Testing, 1 5(1), 1-20. Sajavaara, K. (1981). Contrastive Linguistics past and present and a communicative approach. In J. Fisiak (Ed.), Contrastive Linguistics and the Language Teache r (pp. 87- 120). Oxford: Pergamon Press. Sajavaara, K. (1992). Designing tests to match the needs of the workplace. In E. Shohamy, & A. Walton (Eds.), Language Assessment for Feedback: Testing and Other Strategies (pp. 123-144). Dubuque, IO: Kendall/Hunt. Schmitt, N. (1999). The relationship between TOEFL vocabulary items and meaning, association, collocation and word-class knowledge. Language Testing, 16 (2), 189-216. Scott, A. (2001). DIALANG review. DIALANG Project internal document. Segall, N., Doolen, T. L., & Porter, J. D. (2005). A usability comparison of PDA-based quizzes and paper-and-pencil quizzes. Computers & Education, 45 (4), 417-432. Sheehan, K. (1997). A Tree-Based Approach to Proficiency Scaling & Diagnostic Assessment No. RR-97-09). Princeton, NJ: Educational Testing Service. Skehan, P. (1988). Language testing. Part I (State of the art article). Language Teaching, 21 (4), 211-221. Spolsky, B. (1992). The gentle art of diagnostic testing revisited. In E. Shohamy, & R. Walton (Eds.), Language assessment for feedback: Testing and other strategies (pp. 29- 41). Dubuque, Iowa: Kendall/Hunt Publishing Company & National Foreign Language Center. Takala, S., & Kaftandjieva, F. (2000). Test fairness: a DIF analysis of an L2 vocabulary test. Language Testing, 17 (3), 323-340. Takala, S., & Sajavaara, K. (2003). Language policy and planning. Annual Review of Applied Linguistics, 20 (-1), 129-146. Tannenbaum, R., & Wylie, E. C. (2008). Linking English-Language Test Scores Onto the Common European Framework of Reference: An Application of Standard-Setting Methodology No. RR-08-34 / TOEFLiBT-05). Princeton, NJ: Educational Testing Service. Tarnanen, M. (2002). Arvioija valokiilassa: suomi toisena kielenä –kirjoittamisen arviointia . (The rater in focus: assessment of Finnish as L2 writing). Jyväskylä: Centre for Applied Language Studies, University of Jyväskylä.

208 Teasdale, A. & Leung, C. (2000). Teacher assessment and psychometric theory: A case of paradigm crossing? Language Testing 17 (2), 163-184. Toulmin, S. (1958). The Uses of Argument . Cambridge: Cambridge University Press. Toulmin, S. (2003). The Uses of Argument (updated edition) . Cambridge: Cambridge University Press. Verhelst, N., & Glas, C. (1993). A dynamic generalization of the Rasch model. Psychometrika, 58 (3), 395-415. Wang, L., Eignor, D., & Enright, M. K. (2008). A final analysis. In C. A. Chapelle, M. K. Enright & J. Jamieson (Eds.), Building a Validity Argument for the Test of English as a Foreign Language (pp. 259-318). New York: Routledge. Weir, C. (1991). Communicative Language Testing Prentice Hall. Weir, C. (1993). Understanding and Developing Language Tests . New York: Prentice Hall. Weir, C. (2005). Language Testing and Validation. An Evidence-Based Approach . Basignstoke: Palgrave Macmillan. Werlich, E. (1976). A Text Grammar of English . Heidelberg: Quelle & Meyer. Werlich, E. (1988). Students' Guide to Text Production . Bielefeld: Cornelsen Verlag. West, A. (2003). Language Skills Indicator. Unpublished report to the European Commission. Wolfe, E., W., & Smith, E. V., Jr. (2007). Understanding Rasch Measurement: Instrument Development Tools and Activities for Measure Validation Using Rasch Models: Part I- Instrument Development Tools. Journal of Applied Measurement, 8(1), 97-123. Yang, R. (2003). Investigating how test-takers use the DIALANG feedback. Unpublished MA thesis, Department of Linguistics and Modern English Language, Lancaster University, Ylönen, T. (11.12.2002). Dialang-testaus [Dialang testing]. Unpublished memo. Zhang, S., & Thompson, N. (2004). DIALANG: A Diagnostic Language Assessment System. Canadian Modern Language Review, 61 (2), 290-293.

SUMMARY (YHTEENVETO)

The dissertation presents a summary and analysis of the research findings and other evidence concerning the quality of the DIALANG system. DIALANG is a computer-based language assessment system in 14 languages, taken over the internet, which gives language learners feedback on the level of their language proficiency and on its strengths and weaknesses. In addition, the system makes it possible to practise self-assessment of language proficiency and to receive feedback on the correspondence between self-assessed proficiency and the proficiency demonstrated in the tests. DIALANG also gives its users advice on how they can develop their proficiency further.

DIALANG is based on the Common European Framework of Reference for Languages (CEFR). Its content is based on the content specifications of the CEFR, and its test results are reported on the six-level CEFR proficiency scale, which has by now become the most common way of describing second and foreign language proficiency in Europe in general and also in Finland. DIALANG contains tests in five skills: reading comprehension, listening comprehension, writing, vocabulary and structures.

The dissertation consists of six articles and an extensive synthesis, which not only summarises the content of the articles but also reviews all the studies on DIALANG known to the author. The studies and the evidence they provide on the quality of DIALANG are presented with the help of the model developed by Bachman and Palmer (1996) for analysing the usefulness of tests. In the model, usefulness is a superordinate concept that characterises the quality of a test in general. The usefulness of a test is always evaluated in relation to its purposes, which may include, for example, awarding grades, placing learners on courses, or finding out what stage learning has reached. The usefulness of a test comprises its reliability, construct validity, authenticity, interactiveness, impact and practicality. The better the quality of each of these components, the more useful the test is, in principle, as a whole.

The evidence concerning the quality of DIALANG is of two kinds: theoretical arguments and empirical evidence. Theoretical arguments are typically presented at the design stage of a test, before its use (a priori), in various documents such as test specifications or descriptions of the test construction and quality assurance processes. Empirical evidence, in turn, is collected when the test is piloted and/or when it is used for its actual, intended purpose, that is, after the test has been more or less completed (a posteriori; cf. Weir 1993, 2005).

DIALANG has several interrelated purposes. It gives an individual language learner information about the level of his or her proficiency and about its strengths and weaknesses. It develops the learner’s awareness of language proficiency, assessment and language learning; in addition, the system advises the learner on how proficiency can be developed further. DIALANG also helps language teachers who want information about their students’ proficiency, as well as institutions that need an instrument for placing students on language courses at different levels. The last of these was not among DIALANG’s original aims, but using the test as a placement test is apparently nowadays its most common use.

The available research evidence does not make it possible to obtain a very precise and balanced picture of every component of test usefulness or of all the test languages. Most information is available about the English tests, and a fair amount also about Finnish, German, French, Spanish, Italian and Dutch, which could be piloted during the project. Of the components of usefulness, the evidence tells us most about the impact and construct validity of DIALANG, but there is also a reasonable amount of information about its authenticity, reliability and practicality. Least is known about the interactiveness of the system, that is, the extent to which the cognitive processes required by the test correspond to the language processing required by language use outside the test.

In the light of the evidence, DIALANG appears to be quite a useful system. In most cases it seems able to fulfil the aims set for it as far as individual language learners and the information they obtain from the system are concerned. The author’s extensive survey of over 550 language learners in Finland and Germany showed that the learners reacted for the most part positively to the system and to the feedback it gives. Naturally, the evaluations of the usefulness of the different parts of the system and its feedback varied: the most useful feedback was considered to be the information about the level of proficiency on the CEFR scale, but the other parts of the feedback were also rated quite good (e.g. the feedback on the results of individual items, the advice on language learning, and also the feedback on self-assessment).

The most problematic part of DIALANG appears to be the vocabulary-based placement test, which is taken at the beginning of a test session and on the basis of which the learner is given the basic, intermediate or highest-level version of the test in the skill (e.g. reading comprehension) that he or she has chosen to take. The algorithm that calculates the placement test result appears to penalise guessing too heavily, which in some cases leads to too low a placement result and, through that, to the learner being given a test version that is too easy. As a computer program used over the internet, DIALANG suffers from time to time from various technical problems, and not all learners and institutions who would like to use it are necessarily able to do so, or they are occasionally troubled by such technical difficulties.

DIALANG is used surprisingly widely as an institutional placement test with which students are directed to language courses at a suitable level or, for example, to preparatory language studies. The system was not originally developed with the needs of institutions in mind but with those of individual learners. It therefore lacks a number of features that would be useful precisely for the placement tests used by institutions, but despite this many institutions use it regularly. DIALANG is used most in the Netherlands, Germany, Finland and France. The DIALANG tests and its website have had several hundred thousand users and visitors since its launch (first as a more limited version in 2001 and as a version containing all the languages in 2004), which means that it has had at least a reasonably broad impact on language education in Europe.

In the longer term, the greatest benefit of DIALANG is probably its impact on research in applied linguistics into the diagnosis of language proficiency and language learning. Thanks to DIALANG, applied linguists woke up to how poorly the problems of second and foreign language learning and their diagnosis are understood, especially in comparison with the diagnosis of first language learning and its problems, which is quite highly developed (Alderson 2005). This observation has already led to promising new openings in research on second and foreign language learning and its problems, and in the future we can expect increasingly precise knowledge about the diagnosis of language learning and better diagnostic tests.
