
Educational Testing in Hungary

Benő Csapó
Attila József University

Educational Measurement: Issues and Practice, Summer 1992

What has been the role of educational testing in Hungary? How are the political changes in the country expected to affect testing in the future?

Benő Csapó is affiliated with the Departments of Education and Psychology at Attila József University, Petőfi sgt. 30-34, H-6722 Szeged, Hungary. He specializes in educational measurement.

Today Hungary faces one of the greatest challenges in its history: after 40 years of limited freedom, the country has to find its own way of development. As one of the newly founded Central European democracies, Hungary is seeking its original traditions, trying to reconstruct its natural European relationships and to redefine its place in the modern world. This transitional process stimulates a great deal of discussion at every level of society, including general and specific questions of the educational system. Because the tension between decentralized control of the educational system and relatively high national standards cannot be handled without proper evaluation techniques, one of the crucial issues is what role evaluation and testing should play in the schools of the future.

This article reviews the development of educational testing in Hungary in the light of recent changes. The main evaluation tendencies are presented within a historical framework. The state of testing in the last decade, present work, and predictable future developments are also discussed. Although this article discusses the development of testing in a broad sense, including all kinds of systematic evaluation, the term testing is used in its narrow meaning: the use of formal measurement instruments to assess students. Other forms of oral or written exams are differentiated from it.

Historical Background

As a consequence of geographical location, Hungarian culture has traditionally been strongly influenced by the German-speaking countries. In certain historical periods and in some domains, French and, to a lesser extent, Anglo-Saxon influence was also detectable. Since the origin of modern formal education goes back to the Austro-Hungarian empire, there are many German and Austrian traces in the present Hungarian educational system, and its development is very similar to that of the educational systems in other Central European countries.

The end of World War I became a turning point in the history of Central Europe. The Austro-Hungarian empire disintegrated and Hungary became an independent state. Between the two wars, schools in Hungary were dominated by religious, conservative, and national values. Excellence and high standards were guiding principles. Emphasis on the national culture and on quality education was considered the means for maintaining or improving the position of the country within the region. The availability and quality of schooling were significantly improved in this period, but schooling was not provided at the same level for every young citizen; its distribution was highly uneven. A small portion of the population received a really outstanding education, while large masses were barely provided with elementary-level instruction. In order to preserve high quality schooling for a few students, the educational system became highly selective. The selection took place by means of several examinations. Both oral and written exams played a significant role, but testing was almost unknown and never used in practice.

Origins of Systematic Evaluation

The most prestigious, regulated, and organized examination was (and still is) the matura, which was taken at the end of high school in order to apply for entry to a university. The matura was first introduced in Prussia in 1788. In Hungary, the first matura examinations were held in 1851. Their basic function was to limit the number of applicants to the universities. After World War I, the class-selection feature of the matura became dominant: it was a precondition for entry into certain social classes. Since its beginnings, the matura has always been taken before a committee, and it has consisted of two parts: a written and an oral section. A regulation from 1884 defined five subjects for the matura, among them language arts, history, and mathematics. This arrangement remained more or less unchanged until World War II.

The two main trends in thinking about education during this era might be characterized by the works of Ernő Fináczy, who represented a Herbartian, theoretical-philosophical approach, and by László Nagy, who emphasized psychological aspects of education and the empirical study of children. Fináczy was involved in devising and improving the examination systems. He worked on stricter and more detailed rules for the matura, while Nagy carried out a large-scale survey using questionnaires on the effect of war on children's emotional development as early as 1916. One of L. Nagy's followers, Gábor Kemény (1934), wrote the first comprehensive book on educational evaluation in Hungary. In his work, based on previous empirical studies, he argued against the routine use of marks. He outlined the psychological principles of "proper evaluation," by which he meant a qualitative, verbal, motivating, and formative type of evaluation instead of the use of scales or numbers for grading.

The Post-War Period

From the point of view of the development of educational testing in Hungary, the post-war era can be divided into three major periods. Due to political overdetermination, the shift from one period to another roughly coincides with political events.

The short period of 1945-49 was the beginning of a pluralistic democracy. During the first years right after the war, representatives of reform pedagogy, liberal left-wing thinking, and humanistic approaches to schooling dominated the theoretical discussions. In practice, the founding of a new school system required great effort, and less attention was paid to such specific questions as evaluation or testing.

In 1949 the communist party took power, and this meant the beginning of the "dark fifties." Leading positions were occupied by party-backed apparatchiks, and educational theorists and researchers who did not accept the ideology of Marxism and Leninism were forced to leave the universities and research institutions or in some cases were even taken to prison. Soviet textbooks on education were translated and considered the only authentic sources of information. Selecting individuals by political or ideological principles became a general practice everywhere (including within school systems), and systematic evaluation procedures or tools without the "class concept" would have been disturbing. Testing, measurement, and experimentation in education, or even the empirical methods of research in social sciences, were declared to be tools of "bourgeois objectivism" and were banned by every possible means. The Stalinist system was broken by the revolution of 1956. After the revolution had been suppressed and its participants punished, a slow opening to the West started at the end of the decade.

The beginning of the third period can be set in the early sixties. State control was gradually restricted to ideologically sensitive fields like theory and history of education, while the politically neutral problems of instruction, including testing and measurement, were left to the experts. Thus, after a while there emerged two more or less independent groups, which did not encroach on each other's territory. One worked under the prevailing ideological umbrella dealing with test theory, and the other moved toward western-oriented empirical research. From the mid-seventies, empirical work received more financial support, and this probably marked the renaissance of educational research in Hungary.

Research Projects in Educational Testing

In educational testing two major research centers have developed, and during the last decades most educational evaluation programs have been carried out or directed by these two centers. At the beginning, both groups were formed by individuals sharing similar research interests, and they created their institutional framework step by step. Today both groups work as formal institutes. Their development can be characterized by some of their programs and publications.

The first group was founded by Árpád Kiss in the National Institute for Education (NIE, formerly Scientific Institute of Pedagogy). In the post-war period Kiss was the first to conceptualize the questions of measurement, and he organized the first large-scale testing of students in Hungary (Kiss, 1961). In his papers he reported on a survey that was started in 1958 and involved the assessment of 4th, 6th, and 8th grade elementary school students (Kiss, 1960/61). Under his leadership, a group in the NIE joined the International Association for the Evaluation of Educational Achievement (IEA) in 1968. Since then, Hungary has been participating in the major IEA studies (Science, Reading Comprehension, English, Second Mathematics Study, Written Composition, Second Science Study, Computers in Education, Pre-Primary Study, Reading-Literacy Study).

The IEA studies first put the achievements of the Hungarian educational system into an international context. This resulted in some interesting, sometimes surprising, findings and inspired further evaluation research and development projects in Hungary. For example, there was a large discrepancy between the mathematics and science scores on the one hand and the reading comprehension scores on the other. Hungarian students were usually ranked first in mathematics and science, but performed poorly on reading comprehension when compared on an international basis. Although the positive results received very little publicity for a long time, they helped educational experts to defend the real values of the educational system in several discussions with party bureaucrats. The high achievements have recently attracted the attention of researchers in highly industrialized countries, and cross-cultural research projects have been launched to find out how education can work so well in some fields while using such slender resources. On the other hand, the poor scores in reading highlighted the weaknesses in the teaching of reading. Because Hungarian is written phonetically, it is relatively easy to acquire basic reading skills, e.g., to read a word letter by letter, but little emphasis was put on higher text-processing and comprehension skills. (Concerning the evaluation of reading performance, see Kádár-Fülöp, 1985.)

Later Zoltán Báthory (1972, 1973) became the leader of this group, which under different names (Department of Curriculum Theory, Center for Evaluation) formed a department of the NIE. A network of County Institutes for Education was formed in the 19 counties of the country, and with their assistance the Center for Evaluation of the NIE carried out several nationwide assessments (Báthory & Kádár-Fülöp, 1985). A 1980 project, running parallel with the IEA data collection, aimed at providing proper feedback information on (a) the efficiency of the schools themselves, (b) specific problems of some content domains, and (c) relationships of social-economic status and school achievement (Báthory, 1983).

The so-called MONITOR program was launched in 1986, and it was the first step in a periodically repeated nationwide representative assessment project. It is planned that the program will be carried out with similarly composed samples of students in the same content domains to obtain data for a long-term analysis of educational progress (similar to the NAEP program in the U.S.). More than 12,000 students in four age groups were involved in the 1986 survey. Language arts and mathematics were the focal topics of the assessments. The results were published in professional journals but attracted little attention (Vári, 1989). The MONITOR surveys were repeated 5 years later, in 1991. The first report of results indicated a decline in reading comprehension and a stagnation in mathematics. The release of information at a press conference in January 1992 resulted in immediate and stormy discussions reported in the mass media.

The other research team was organized in the Department of Education at Attila József University (AJU) at Szeged by József Nagy in the early seventies. J. Nagy directed several projects that aimed at devising tests for the assessment of several components of knowledge, from the simplest elementary skills to the more complex thinking operations. First, he carried out a representative survey on the development of elementary counting skills (Nagy, 1971) and then with his coworkers devised a series of test batteries for the measurement of other basic skills (Nagy, 1973a; Orosz, 1974). The next project was the development of standardized knowledge tests for five school subjects in the 5th to 8th grade of elementary school (Nagy, 1972). Nagy developed a method of analyzing the thematic units or learning tasks of the textbooks. This analysis resulted in a complete list of basic statements on the teaching material that could be directly transferred into test items. The main characteristics of this test-construction method were similar to those of early domain-referenced testing, but the method used a norm-referenced type of scoring system (Nagy, 1973b). This work resulted in a series of 17 volumes, which presented the tests, the coding instructions, the achievement level of a representative sample for each item, and a standardized scoring system. The closing (18th) volume in this series summarized the theoretical output of the project based on the data of test analyses (Nagy, 1975). So far, this project has been the most comprehensive enterprise for developing tests in Hungary, as the complete series covered the knowledge of main subjects in the 5th to 8th grade.

As a result of the research traditions at AJU, the Examination Center for Basic Knowledge was established in the Department of Education at AJU in 1991. In this center, during the first year of its operation, a nationwide network of experts was organized, test batteries were devised, and pilot studies on testing were conducted. These steps can lead to the introduction of examination systems at turning points in public education.

The first item bank, for junior school word-problems, was also developed under Nagy's coauthorship (Csáki & Nagy, 1976). In the mid-seventies he developed a test battery for the assessment of school-entry competencies that was standardized on a large sample of over 10,000 pupils (Nagy, 1980, 1986). Around the beginning of the eighties, a new direction in testing was launched at AJU aimed at devising tools to assess developmental levels in operational thinking. Thinking operations were interpreted within the framework of Piagetian and Neo-Piagetian traditions, and batteries of paper-and-pencil tests were constructed and used to outline the developmental trends of these operations within the age range of 6 to 17 years (Nagy, 1987).

The most recent work of Nagy's team at AJU is concerned with devising diagnostic tests for the practical needs of schools (Vidákovich, 1987, 1990). These tests have been developed by several groups of experts under the coordination and with the theoretical support of a university research team. Depending on the availability of the necessary facilities, regional or county evaluation centers receive camera-ready copies of these tests and print them for use in schools. After administering the tests, teachers use them to grade their students, and then data are entered into computers at schools or the tests are sent to local or regional centers. Where computers are available, results are analyzed and discussed by local experts and groups of teachers in order to improve their own work. The collected data are used for sophisticated test analysis and for the improvement of the tests. This work offers feedback at several levels in the educational system and at the same time aims at disseminating information on test construction and computerized test analysis.

The Present State in Testing

The state of the art in educational testing in Hungary can be best characterized through the kind and depth of knowledge about testing within the various groups of educationalists and the testing practice at the different levels of education. Since testing (e.g., for selecting personnel for different positions) is not used as frequently in Hungary as in some highly industrialized societies, parents have little experience with it. Neither test construction nor any other information concerning testing has been part of teacher training; thus, practicing teachers have very limited theoretical knowledge about testing. They have used tests as they received them, and sometimes they have tried to make new ones using other tests as examples. There are very few manuals or technical guides (Báthory, 1973; Csapó, 1988; Joó, 1980; Nagy, 1972) available in Hungarian for those teachers who wish to construct their own measurement instruments, and this fact also limits the development of teacher-made tests.

There is a circle of educational researchers and professional test developers (at present about 20 persons within Hungary) who follow international literature and participate in cross-cultural investigations and who possess up-to-date knowledge that enables them to use the most recent methods and deal with the most sophisticated computer programs. There is a growing interest among practicing teachers in the problems of testing, which will increase the communication between the test experts and practitioners. However, the number of papers discussing theoretical questions in Hungarian educational or psychological journals (e.g., Csapó, 1987; Horváth, 1985) is very low, and the dissemination of information about testing is rather slow.

There are very few areas in the Hungarian educational system where usage of tests is a regular practice. Before entering school, children take school-readiness exams in which psychological tests or batteries devised specifically for this purpose are used (Nagy, 1986). In the first year of elementary school, the form of evaluation is mostly qualitative and verbal. From the second year to the universities, a marking system based on a 5-grade scale is almost exclusively used: 1 means a fail, 2 is the lowest acceptable level, and 5 is the best achievement. In elementary and secondary schools, students receive about four to eight marks per semester based on their oral presentations, short quizzes, and written work, e.g., solving tasks in math and science or written compositions in the language arts. These marks determine their year's result. In written work some teachers may use tests where available. These tests are often provided as a supplement to textbooks or as separate booklets. There are no definite national or local guidelines concerning the use of these tests; teachers are free to use them as they choose.

The only centrally organized testing process is part of the higher education entrance examination. On average only one-third of the applicants can enter the universities or colleges, so these exams are very competitive. The tests are developed by small groups of experts and are kept secret until the day of testing. Thus, these tests are never tried out and are of poor quality from the point of view of test construction and reliability. In higher education, tests are relatively rarely used. They are used mostly in language teaching, where foreign test banks are directly adaptable.

Directions for the Nineties

The most recent changes in the Hungarian educational system imply some new development in evaluation and testing. A new educational law is being developed and will probably be approved by the parliament in 1992. Its aim is to delineate what the legislative consequences of the change of the political system are for education and to give a legal framework for the evolution of education. Although several outlines of the planned law have already been discussed, both in professional and lay circles, the latest, fifth draft is probably not the last one ("Conception of the Law," 1992). Despite the active discussions, the main developmental tendencies have already emerged.

Instead of the totally regulated central curriculum, a more flexible national core curriculum is planned that leaves more autonomy for local and school-level decisions (Nagy & Szebenyi, 1990). The existing 8 + 4 (elementary + grammar school) and 8 + 3 (elementary + apprenticeship) structure of the school system might also be changed and replaced by other forms, e.g., 4 + 8, 10 + 2, or 6 + 6.

This greater variability in schools and teaching systems is expected to be controlled through an output-oriented evaluation system. This means that local authorities could choose the structure of schools and schools could choose the content and methods of teaching, but at certain ages, students would be expected to reach standards accepted nationwide. The achievement of students would be checked through centrally organized examinations, probably at the ages of 12, 14, 16, and 18. The exams at the ages of 12 and 14 would play mostly a diagnostic role and might help the transition between school levels. The age of 16 marks the end of compulsory education, and the exams planned at this point are to close compulsory schooling. Most of the current research efforts in educational evaluation are concentrated on developing the proper test batteries for these exams. The traditional matura is also expected to be renewed. The written exams will mostly be replaced by standardized tests.

References

Báthory, Z. (1972). Értékelés a pedagógiában [Evaluation in education]. Pedagógiai Szemle, 22, 212-222.
Báthory, Z. (1973). 7 standardizált tantárgyteszt [Seven standardized knowledge tests]. Budapest: Országos Pedagógiai Intézet.
Báthory, Z. (1983). Az iskolai nevelés néhány összetevőjének vizsgálata egy felmérés tükrében [A study of certain factors in school education]. Pedagógiai Szemle, 33, 135-185.
Báthory, Z., & Kádár-Fülöp, J. (Eds.). (1985). Educational evaluation studies in Hungary. Evaluation in Education: An International Review Series, Vol. 9.
"Conception of the law of public education." (1992). Budapest: Ministry of Culture and Education (version of January 14, mimeographed).
Csáki, I., & Nagy, J. (1976). Alsó tagozatos szöveges feladatbank [Item bank for junior school word-problems]. Szeged: Acta Paedagogica Series Specifica.
Csapó, B. (1987). A kritériumorientált értékelés [Criterion-referenced evaluation]. Magyar Pedagógia, 87, 247-266.
Csapó, B. (1988). A tanulói teljesítmények értékelésének méréses módszerei [Assessment methods for evaluating student achievement]. Budapest: MM. Vezetőképző és Továbbképző Intézet.
Horváth, G. (1985). Tesztelmélet: problémák és perspektívák [Test theory: Problems and perspectives]. Pszichológia, 5, 53-78.
Joó, A. (1980). A feladatkészítés kérdései [Questions of item writing]. Budapest: Országos Oktatástechnikai Központ.
Kádár-Fülöp, J. (1985). The CTD reading study. Evaluation in Education: An International Review Series, 9, 117-151.
Kemény, G. (1934). Iskolai értékelés és kiválasztás [Evaluation and selection at school]. Budapest: Merkantil-nyomda.
Kiss, Á. (1961). Docimológia, osztályozás, mérés [Docimology, classification, measurement]. In Pszichológiai tanulmányok, 3 (pp. 253-266). Budapest: Akadémiai Kiadó.
Kiss, Á. (1960/61). Iskolai tanulóink tudásszintjének vizsgálata, 1-4 rész [Examination of the knowledge of school children, parts 1 to 4]. Pedagógiai Szemle, 10, 194-206, 585-593, 775-784, and 11, 600-613.
Nagy, J. (1971). Az elemi számolási készségek [Elementary counting skills]. Budapest: Tankönyvkiadó.
Nagy, J. (1972). A témazáró tudásszintmérés gyakorlati kérdései [Practical problems of thematic knowledge assessment]. Budapest: Tankönyvkiadó.
Nagy, J. (1973a). Alapműveleti számolási készségek [Skills of basic counting operations]. Szeged: Acta Paedagogica Series Specifica.
Nagy, J. (1973b). A standard osztályzat [Standard grades]. Pedagógiai Szemle, 23, 225-234.
Nagy, J. (1975). A témazáró tesztek reliabilitása és validitása [Reliability and validity of knowledge tests]. Szeged: Acta Paedagogica Series Specifica.
Nagy, J. (1980). 5-6 éves gyermekeink iskolakészültsége [School readiness of 5-6-year-old Hungarian children]. Budapest: Akadémiai Kiadó.
Nagy, J. (1986). PREFER [A test battery for assessment of entrants' entry competences]. Budapest: Akadémiai Kiadó.
Nagy, J. (1987). A rendszerezési képesség kialakulása: Gondolkodási műveletek [The evolution of systematizing ability: Thought operations]. Budapest: Akadémiai Kiadó.
Nagy, J., & Szebenyi, P. (1990). Hungarian reform: Towards a curriculum for the 1990s. Curriculum Journal, 1, 247-254.
Orosz, S. (1974). A fogalmazástechnika mérésmetodikai problémái [Problems of assessing written compositions]. Budapest: Tankönyvkiadó.
Vári, P. (1989). A MONITOR '86 ismertetése [A review of MONITOR '86]. Pedagógiai Szemle, 39, 1123-1130.
Vidákovich, T. (1987). Innovatív célú diagnosztikus pedagógiai értékelés [Innovative diagnostic evaluation in education]. Budapest: Közoktatási Kutatások Titkársága.
Vidákovich, T. (1990). Diagnosztikus pedagógiai értékelés [Diagnostic educational evaluation]. Budapest: Akadémiai Kiadó.
