Foreign Language Annals VOL. 49, NO. 2 201

Listening and Reading Proficiency Levels of College Students

Erwin Tschirner Universität Leipzig

Abstract: This article examines listening and reading proficiency levels of U.S. college foreign language students at major milestones throughout their undergraduate career. Data were collected from more than 3,000 participants studying seven languages at 21 universities and colleges across the United States. The results show that while listening proficiency appears to develop more slowly, Advanced levels of reading proficiency appear to be attainable for college majors at graduation. The article examines the relationship between listening and reading proficiency and suggests reasons for the apparent disconnect between listening and reading, particularly for some languages and at lower proficiency levels.

Key words: all languages, proficiency, postsecondary/higher education

Introduction

Although the foreign language teaching profession has focused on oral proficiency for decades, reaching the Advanced Low (AL) level in oral proficiency has remained “an elusive endeavor” for many college graduates majoring in foreign languages, including prospective foreign language instructors (Brooks & Darhower, 2014, p. 593). According to Swender (2003, p. 523), only 47% of graduating majors at prestigious liberal arts colleges, many of whom spent an academic year abroad, were at the AL level or higher, 35% were Intermediate High (IH), and 18% were Intermediate Mid (IM) or lower. As Omaggio-Hadley (2001) noted, it may take up to 720 hours of instruction to reach the Advanced level of foreign language speaking proficiency, an amount of time that is not regularly available in an undergraduate program. Rifkin (2005, p. 12) calculated an average of 410 to 415 total hours of instruction necessary to complete an undergraduate major in French, German, Russian, or Spanish. In addition, reaching the Advanced level commonly requires spending a significant amount of time abroad, using the target language both in and outside of class (Fraga-Cañadas, 2010; Schulz, 2002; Sieloff-Magnan & Back, 2007).

Erwin Tschirner (PhD, University of California, Berkeley) is Gerhard Helbig Professor of German as a Foreign Language, Herder Institute, University of Leipzig, Germany.
Foreign Language Annals, Vol. 49, Iss. 2, pp. 201–223. © 2016 by American Council on the Teaching of Foreign Languages. DOI: 10.1111/flan.12198

While writing proficiency has received substantially less attention in language learning research (but see, e.g., Bernhardt, Molitoris, Romeo, Lin, & Valderrama, 2015), the ACTFL Writing Proficiency Test developed in 2001 allowed the profession to gauge the development of writing in addition to speaking proficiency. This emphasis on the productive modalities may be directly related to the fact that national foreign language instructor licensing requires achieving AL in speaking and writing (see, e.g., Glisan, 2013; Moeller, 2013; Tedick, 2013). As Bernhardt et al. (2015, p. 339) noted, writing proficiency levels tend to be a little higher than speaking ones, with proficiency levels of IM after one year of college foreign language instruction in English cognate languages such as French, German, and Spanish, and of Intermediate Low (IL) in noncognate languages such as Arabic, Japanese, and Russian. After two years, students may reach IH to AL in cognate languages and IM in noncognate languages.

In contrast, very little is known about the reading and listening proficiency levels of U.S. college students because until recently there were no standardized tests associated with the ACTFL reading and listening proficiency guidelines. With the publication of the ACTFL Proficiency Guidelines 2012 (ACTFL, 2012), the empirical validation of the reading and listening Guidelines (Clifford & Cox, 2013; Cox & Clifford, 2014), and the development of the ACTFL Reading (RPT) and Listening Proficiency Tests (LPT; ACTFL, 2013, 2014), it became possible to investigate and set benchmarks in reading and listening. This article describes the design and reports the results of such a study.

Review of Literature

Listening and Reading Proficiency Levels

Carroll (1967) has commonly been credited with having completed the first large-scale study investigating proficiency levels of graduating college foreign language majors. However, as Rifkin (2005, p. 4) pointed out, Carroll used the Modern Language Association (MLA) Foreign Language Proficiency Tests for Teachers and Advanced Students, a norm-referenced four-skills battery of tests. To establish correlations between the Interagency Language Roundtable (ILR) scale, a proficiency scale aligned with the ACTFL scale, and the MLA test, Carroll (1967, p. 133) conducted a separate study with 19 Russian, 30 Spanish, 39 French, and 39 German instructor trainees who were interviewed by teams from the Foreign Service Institute (FSI) and who completed the FSI reading and speaking tests and also completed the MLA test battery. The FSI speaking test scores were then correlated with the MLA speaking and listening results, and the FSI reading test scores were correlated with the MLA reading and writing results. Correlations were moderately strong for French and Spanish, with 0.67 for French listening and speaking, 0.63 for French writing, 0.71 for French reading, 0.66 for Spanish speaking, 0.73 for Spanish listening, 0.74 for Spanish reading, and 0.77 for Spanish writing (p. 145). Correlations were higher for German and Russian. Carroll’s finding that the average graduating senior was ILR 2+ in speaking and listening (equivalent to an ACTFL rating of Advanced Mid or High) and ILR 3 in reading and writing (equivalent to an ACTFL rating of Superior) can thus only be interpreted with extreme caution, particularly because these ratings were much higher than those obtained from subsequent studies that used ACTFL Oral Proficiency Interviews (OPIs; e.g., Brecht, Davidson, & Ginsberg, 1993; Magnan, 1986; I. Thompson, 1996; Tschirner & Heilenman, 1998).

Rifkin (2005) assessed the listening, reading, speaking, and writing proficiency of 353 college students attending the Middlebury Russian School, a 9-week summer immersion program, over the course of three summers (2001 to 2003). His
data yielded ratings for participants at the point of entry into the immersion program, i.e., levels achieved while enrolled in undergraduate programs elsewhere, as well as postprogram results. Data from the preimmersion interviews revealed that, after 150 hours of previous instruction (approximately one college year), students were Novice High (NH) in all four modalities. Proficiency levels were lowest for listening, followed by reading and speaking, and highest for writing. After 250 hours of previous instruction (approximately two college years), students were NH to IL in listening and reading and IL in speaking and writing. After 350 hours of instruction (approximately 3 years of college instruction), they were IL in listening and reading and Intermediate Mid (IM) in speaking and writing, and after 450 hours of instruction (approximately four college years), they were IM in listening and reading and IM to Intermediate High (IH) in speaking and writing. Only students who had more than 600 hours of instruction were rated IH in listening and reading and AL in speaking and writing (pp. 8–9). It is interesting to note that preimmersion students generally were weaker in the interpretive than in the productive modalities. Postimmersion listening and reading abilities, however, were often higher than speaking and writing ones. Rifkin also presented data from traditional classroom instruction at the University of Wisconsin, Madison, from 2000 to 2004, which yielded a mean of IL in listening and a mean of IM in speaking and reading after 4 years of instruction (p. 11).

Schmitt (2016), using the Standards-based Measurement of Proficiency (STAMP),¹ found that 45% of French students were rated NH in reading after three semesters of college study, 30% were IL, and 15% were IM. Students scored substantially higher in writing and speaking, with 75% achieving IL in speaking and 85% achieving IL and IM in writing. Fifty-eight percent of Spanish students were rated in the Novice range, 32% were IL, and 10% were IM in reading, while 93% were IL in writing and 43% were NH and 50% were IL in speaking (pp. 115–119).

Davin, Rempert, and Hammerand (2014) also used the STAMP to investigate the development of reading, speaking, and writing proficiency in secondary schools. The study included 3,881 students, most of whom studied Spanish (2,166), Chinese (1,058), and French (606) (Davin et al., 2014, pp. 250–251). While the instructional context certainly influenced the results, reading levels for Chinese were Novice Low (NL) for 86% of all students tested. The mean score was NL after the first 2 years of instruction, reached Novice Mid (NM) after 3 years, and stayed at NM after as many as 5 years of study. For French, the mean score remained at NM for the first 3 years and reached IL after 4 years of study. For Spanish, the mean score remained at NM for the first 2 years, reached NH after 3 years, and remained at NH after 4 years. Writing scores in Chinese reached IL after 4 years and IM after 5 years, and participants were rated IL in French and Spanish after 4 years. Mean speaking scores were IL in French after 3 and 4 years, in Spanish after 4 years, and in Chinese after 3, 4, and 5 years.

Davidson (2010) described the reading and listening proficiency gains of college students in study abroad programs of various durations (2-month, 4-month, and 9-month programs) in Russia from 1994 to 2009. He presented data from 1,234 reading assessments and 390 listening assessments at both the beginning and end of the study abroad period. Students beginning a 2-month program were required to have a minimum of 2 years of college Russian (or equivalent); all others were supposed to have a minimum of 3 years of college Russian (or equivalent). Both groups began the program at IH in reading and IM in listening. In reading, at the conclusion of the 2-month program students advanced to AL, 4-month students advanced to Advanced Mid (AM), and
9-month students advanced to Advanced High (AH); in listening, 2-month and 4-month students advanced only to IH, while only 9-month students crossed the Advanced-level threshold in substantial numbers. As Davidson reported, “null gain was the norm and principal outcome for 56% of summer students, 46% of semester students, and 15% of academic year students” for listening (p. 16).

Watson and Wolfel (2015), using the Defense Language Proficiency Test and the OPI, looked at listening, reading, and speaking proficiency gains of 279 third- and fourth-year students who completed semester-long study abroad programs in 15 different countries. All students had completed at least 2 years of college foreign language study or the equivalent before their study abroad period. Watson and Wolfel divided students into two groups: difficult languages (Arabic, Chinese, and Russian) and less-difficult languages (French, German, Portuguese, and Spanish; pp. 59–60). At the beginning of their study abroad period, i.e., after at least 2 years of college foreign language study, students scored 1.69 in reading and 1.79 in listening for difficult languages, i.e., somewhat below ILR 0+ (ACTFL NM to NH), and 4.04 in reading and 3.78 in listening for less-difficult languages, i.e., approximately ILR 1+ (ACTFL IM to IH).²

Taken together, the studies that are summarized above, based on ILR or ACTFL proficiency assessments, indicate that proficiency levels in the interpretive modalities generally appear to be lower than in the productive ones and are lowest for listening ability, especially in instructed foreign language learning.

The typical reading proficiency of foreign language students in the United States appears to be between NH and IL after four high school years (Davin et al., 2014) or three college semesters for closely related languages such as French or Spanish (Schmitt, 2016). Third- and fourth-year students seem to be IM to IH (1+) in both reading and listening in such languages (Watson & Wolfel, 2015). In more distant languages such as Russian, students appear to be NH to IL in reading and listening after 2 years of college instruction, IL after 3 years, and IM after 4 years (Rifkin, 2005). In intensive foreign language programs such as The Language Flagship (National Security Education Program), learners may reach IH in reading and IM in listening after 2 to 3 years of college instruction (Davidson, 2010). While preimmersion students may be stronger in their productive modalities, postimmersion students seem to be stronger in the interpretive modalities (Rifkin, 2005). Even in study abroad contexts, learners appear to make gains more easily in reading than in listening, with students in 2-month and 4-month study abroad programs lingering in the Intermediate range and only students who study abroad for longer periods, e.g., students in 9-month programs, advancing to AL (Davidson, 2010).

Correlations Between Modalities

Studies looking at correlations between modalities have suggested similar conclusions. Correlations between listening and reading are usually higher than correlations between any other modalities. Liao, Qu, and Morgan (2010) examined the relationships between listening, reading, speaking, and writing scores on the Test of English for International Communication (TOEIC). Their data were based on more than 12,000 examinees in Korea, Japan, and Taiwan. They found a significant and high correlation between listening and reading (r = 0.76) as well as significant and moderate relationships between listening and speaking (r = 0.66), reading and writing (r = 0.61), and speaking and writing (r = 0.62) (p. 13.4). In’nami and Koizumi (2012) found an even higher correlation (r = 0.87) between the listening and reading sections on the revised TOEIC 2006 for 569 Asian English learners (p. 145).

Bozorgian (2012) investigated the relationships between listening, reading, speaking, and writing scores of 1,800 Iranian students who took the International English Language Testing System (IELTS). He found that listening scores had the highest correlation with reading (r = 0.735), while listening score correlations with speaking (r = 0.654) and writing (r = 0.643) were slightly lower. In general, speaking and listening scores were considerably lower than reading and writing ones. On a scale from 1 (nonuser) to 9 (expert user), the listening and speaking means were 5.724 and 5.568, respectively, while the reading and writing means were 6.987 and 6.564, respectively.

Hirai (1999) compared reading and listening rates (words per minute read or listened to) of 56 Japanese English as a foreign language learners. She divided her learners into two groups: a high-proficiency group defined as having correctly answered at least 75% of multiple-choice content questions, and a low-proficiency group scoring lower than 75%. Hirai found that high-proficiency students had similar reading and listening rates (about 140 words per minute), whereas low-proficiency students had dramatically lower rates, with even lower rates in listening (54 words per minute) than in reading (61 words per minute). For high-proficiency students, there was a significant and very high correlation between listening and reading rates (r = 0.95); for low-proficiency students, the correlation was very low (r = 0.37). Hirai concluded from her data that listening and reading processes appeared to be highly interactive for high-proficiency students and very different from each other for low-proficiency students (pp. 375–378).

One reason for the differences between reading and listening proficiency, at least for some languages, may be orthographic depth (Katz & Frost, 1992). The orthographic depth hypothesis holds that reading in languages such as English or French, whose grapheme–phoneme relationships are opaque, requires more morphological (semantic) processing, while reading in languages such as German or Spanish, whose grapheme–phoneme relationships are transparent, requires more phonological processing (Frost, 2012). Goodwin, August, and Calderon (2015, p. 614) looked at how fourth-grade Spanish-speaking students in U.S. classrooms approached reading in Spanish and English and found that phonological decoding contributed to reading comprehension in Spanish, whereas only morphological awareness contributed to reading comprehension in English. Students learning languages with shallow orthographies such as Spanish may be able to relate words learned visually and aurally more readily, and they may even be able to transfer words learned in one modality to the other, whereas students learning languages with deep orthographies such as French may not be able to transfer what they have learned visually to listening.

In summary, the research presented above suggests the following hypotheses. While reading and listening proficiency appear to be highly correlated (Bozorgian, 2012; In’nami & Koizumi, 2012; Liao et al., 2010), listening proficiency levels in general appear to be considerably lower than reading proficiency levels and similar to the levels attained for speaking proficiency (Bozorgian, 2012). The correlation between reading and listening proficiency appears to increase as learners’ overall proficiency increases, with low correlations at low proficiency levels and very high correlations at high proficiency levels (Hirai, 1999). In addition, orthographic depth may play a role, with languages such as Spanish exhibiting higher correlations between listening and reading than languages such as English or French (Frost, 2012; Goodwin et al., 2015).

The studies summarized above provided snapshots of reading and listening proficiency levels of college or high school students for a few milestones in a student’s language learning career, e.g., after three or four college semesters or four high school years, for a limited number of languages,
and for a limited number of schools. Two of the studies with the largest number of tests focused on Russian. None of the studies used ACTFL proficiency assessments in listening and reading. Taken together, they seem to indicate that listening and reading proficiency is lower than speaking and writing proficiency in the first two to three college years, which appears to be counterintuitive. One would assume that the interpretive skills are generally better developed than the productive ones. Moreover, they seem to suggest that overall proficiency levels of college students, even in closely related languages such as Spanish and French, are relatively low, even after 3 to 4 years of college foreign language instruction, unless students spent a significant amount of time abroad. The present study sought to add to this knowledge base by using ACTFL proficiency assessments, by including a large number of students at a large number of colleges and universities, by focusing on all levels of instruction from the first to the fourth year, and by increasing the number of languages taken into account.

Research Questions

The main research questions of the present study included the following:

1. What levels of reading and listening proficiency can normally be found at major milestones in an undergraduate foreign language student’s career?
2. What is the relationship between reading and listening proficiency?

Methods

Participants

Students from 21 U.S. universities and colleges participated in the study. ACTFL RPTs and LPTs were administered to first-, second-, third-, and fourth-year students over a 12-month period (May 12, 2014–May 13, 2015). Tests taken by instructors were eliminated, as were incomplete tests.³ This left 3,321 RPTs and 2,932 LPTs that could be analyzed.

Due to the large number of colleges and participants and the need for anonymity of student participants, demographic data included the name of the institution, language, course number, test date, test length, and test results. Table 1 shows the number of tests that were administered at each participating institution. As shown in Table 1, students at Michigan State University contributed more than half of all RPTs and LPTs, and participants from three universities combined contributed 83.4% of all RPTs and 87.9% of all LPTs. The data therefore are representative primarily of students who attend large U.S. state universities.

Table 2 shows the number of RPTs per language and semester. As shown in Table 2, more than half of all students tested were students of Spanish. There were also sizable numbers of test takers in French and German, and fewer in Italian, Japanese, Portuguese, and Russian. However, participants were distributed unevenly across semester levels. For the analysis, fifth- and sixth-semester students were identified as third-year students for all languages except French and Spanish to yield adequate numbers for analysis at the combined level. A group of students at one university were returning missionaries who spent 2 years abroad before entering university. These students are treated separately.

Table 3 shows the number of LPTs per language and class level. Because of the discrepancy between Spanish and French on the one hand and all other languages on the other, the Results section herein focuses first on Spanish and French and then on the other languages. In addition, Spanish and French are also the languages used to address the research questions. For all other languages, mostly descriptive statistics are provided.

Instrument

The RPT and LPT are standardized tests for the global assessment of reading and

TABLE 1
Number and Percentage of RPTs and LPTs by Institution

Institution                                RPT N   RPT %   LPT N   LPT %
Michigan State University                  1,693      51   1,625    55.4
University of Utah                           569    17.1     494    16.8
University of Minnesota                      506    15.3     461    15.7
University of Wisconsin, Eau Claire           83     2.5       –       –
Georgia Southern University                   59     1.8      50     1.7
Hunter College                                53     1.6       9     0.3
University of Delaware                        50     1.5      46     1.6
Grand Valley State University                 34     1.0      34     1.2
University of California, Berkeley            33     1.0       –       –
Eastern Washington University                 31     0.9      33     1.1
Lee University                                27     0.8      27     0.9
North Carolina State University               27     0.8      25     0.9
Loras College                                 27     0.8      18     0.6
State University of New York, Plattsburgh     26     0.8      18     0.6
San Diego State University                    24     0.7      24     0.8
University of Maryland, College Park          23     0.7      14     0.5
Bowdoin College                               18     0.5       –       –
University of Southern California             18     0.5      24     0.8
Old Dominion University                       13     0.4      13     0.4
Yale University                                6     0.2       4     0.1
University of Pittsburgh                       –       –      13     0.4
Total                                      3,321     100   2,932     100

TABLE 2
Number of RPTs by Semester and Language

Language      2nd Sem.  3rd Sem.  4th Sem.  5th Sem.  6th Sem.  4th Year  2 Years Abroad  Total
French             120        86       215       166        62       124               –    773
German               –        22       178        75        18        38              11    342
Italian             73         8        24        17         2         7               –    131
Japanese            33         –        36         –         –        13              13     95
Portuguese          13         –        28         9         9         –               –     59
Russian              8        25        52         2        20        45               –    152
Spanish            242       222       338       432       208       242              85  1,769
Total              489       363       871       701       319       469             109  3,321

TABLE 3
Number of LPTs by Semester and Language

Language      2nd Sem.  3rd Sem.  4th Sem.  5th Sem.  6th Sem.  4th Year  2 Years Abroad  Total
French             111        88       203       141        89       121               –    753
German               1         5       151         3        27        31              11    229
Italian             37         8        23        12         7         7               –     94
Portuguese          12         –        26         7         9         –               –     54
Russian              8        25        51         1        19        40               –    144
Spanish            235       209       317       392       187       233              85  1,658
Total              404       335       771       556       338       432              96  2,932

listening ability in a language (ACTFL, 2013, 2014). They measure how well a person spontaneously reads or is able to listen to texts and discourse when presented with texts, discourse, and tasks as described in the ACTFL Proficiency Guidelines 2012. Each test consists of 10 to 25 reading texts or listening passages depending on the levels tested. There are five sublevels: IL, IM, AL, AM, and Superior (S). Each sublevel consists of five reading texts or listening passages accompanied by three tasks (items) with four multiple-choice responses, only one of which is correct. Test rubrics include genre, content area, rhetorical organization, reader/listener purpose, vocabulary, and for LPTs clarity of speech. Texts/passages and tasks align at each level: e.g., an Intermediate task requires understanding information that is contained in one sentence/utterance, whereas Advanced tasks require the ability to understand information that is spread out over several sentences/utterances. Tasks and multiple-choice responses are in the target language (Institute for Test Research and Test Development, 2013a, 2013b).

The RPT and LPT are timed tests with a total test time of 25 minutes per sublevel. Test takers usually take two to three sublevels at the same time. Two sublevels are rated together: either the two levels taken, or, if more than two levels were taken, the two highest levels that can be rated according to the specific algorithm of the test. The algorithm and cut points for each level were determined empirically. Because there are no Novice texts/passages and tasks, the Novice levels are determined according to how close the test taker is to the Intermediate level. The test is Internet-administered and computer-scored (Institute for Test Research and Test Development, 2013a, 2013b).

The sublevels that the test takers worked on were selected by the participating schools in consultation with the ACTFL. Most schools selected two-sublevel tests to fit a standard 50-minute class session. Because so little was known about the average proficiency levels of students having studied a particular language for a certain number of semesters, the tests selected were sometimes too difficult for weaker students, which resulted in a relatively large number of below-range (BR) ratings for some languages in some semesters, especially for listening. A BR rating is given when there is not enough evidence to assign the lowest rating of a particular two-sublevel test. In addition, some of the best students may have been at a higher sublevel than the top sublevel the test could verify.

Analyses

Length of study in semesters was determined by aligning course numbers across
the various institutions. Length of study in classroom hours was determined by calculating the number of previous contact hours students would have had if they had attended each class session and completed the required courses up to the current course in which they were enrolled at their particular institution. Then, the number of contact hours that students had completed in the course in which they were enrolled when the test was administered was added to the previous total, yielding a total number of classroom (instructional) hours. The total number of classroom hours was not calculated for the students who spent 2 years abroad.

Following Rifkin (2005) and others, test results were coded numerically as follows: NL = 1, NM = 2, NH = 3, IL = 4, IM = 5, and so on, up to S = 10. For a number of cases, it was not possible to assign an ACTFL rating because the assessment chosen by the instructor was above the ability level of the student. In such cases, BR was assigned. Table 4 shows the percentage of BR for RPTs and LPTs for all languages. The percentage of BR ratings was particularly high for Russian RPTs and LPTs, for Japanese RPTs, and for French and Portuguese LPTs. These numbers indicate that there was a discrepancy between the levels at which some teachers expected their students to perform and students’ actual proficiency levels. This was particularly noticeable on the LPT for all languages except Italian. On the RPT, instructors’ expectations seemed to be more accurate with respect to Romance languages than with German, Japanese, and Russian, and in listening, they were accurate only with respect to Italian. For the results section, BR was coded as 0. Note that this lowered the means slightly. However, eliminating BR ratings, and thus eliminating weak students, would greatly inflate mean proficiency levels in languages and semesters in which BR ratings accounted for a large number of the ratings.

TABLE 4
Percentage of BR Ratings per RPTs and LPTs per Language

Language       RPT    LPT
French         3.5   16.7
German        14.0   11.8
Italian        1.5    4.3
Japanese      14.7      –
Portuguese     3.4   14.8
Russian       15.8   22.9
Spanish        5.3    9.0
Total          6.4   11.8
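
The numeric coding described in the Analyses section can be illustrated with a minimal Python sketch. The codes NL = 1 through S = 10 and BR = 0 follow the text; the intermediate codes (IH = 6, AL = 7, AM = 8, AH = 9) are inferred from "and so on" and from pairs such as IH (6) and AL (7) cited in the Results. The function name `mean_rating` and the sample ratings are hypothetical and are not data from the study.

```python
from statistics import mean

# Numeric codes for ACTFL ratings as described in the text.
# BR (below range) is coded as 0, as in the Results section.
RATING_CODES = {
    "BR": 0, "NL": 1, "NM": 2, "NH": 3, "IL": 4, "IM": 5,
    "IH": 6, "AL": 7, "AM": 8, "AH": 9, "S": 10,
}


def mean_rating(ratings, include_br=True):
    """Return the mean numeric rating; optionally drop BR ratings first."""
    codes = [RATING_CODES[r] for r in ratings if include_br or r != "BR"]
    return mean(codes)


# Hypothetical group of ten test takers (illustration only).
sample = ["BR", "BR", "NM", "NH", "NH", "IL", "IL", "IM", "IM", "IH"]

with_br = mean_rating(sample)                       # BR counted as 0
without_br = mean_rating(sample, include_br=False)  # BR ratings dropped

print(f"mean with BR coded as 0: {with_br:.2f}")
print(f"mean with BR excluded:   {without_br:.2f}")
```

In this made-up sample, dropping the two BR ratings raises the mean from 3.2 to 4.0, which mirrors the concern that excluding BR ratings would inflate means wherever such ratings are frequent.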

Results (3). Starting in the fifth semester, Spanish students had considerably higher proficien- Reading and Listening Proficiency cies. Because there was a sharp drop for Levels of French and Spanish French students after the sixth semester Students at Major Milestones going into the fourth year, the difference In general, the reading and listening profi- between French and Spanish increased sub- ciency of French and Spanish students stantially, a difference of well over two sub- increased the longer they studied their re- levels, between IL (4) in French and IH (6) spective languages. Figure 1 shows the in Spanish. mean4 ACTFL reading and listening profi- Table 5 shows mean ACTFL ratings ciency ratings of students in their second, expressed numerically and standard devia- third, fourth, fifth, and sixth semesters, and tions for each language and semester or year in their fourth year of college foreign lan- for all RPTs and LPTs with the number of guage study. As Figure 1 shows, there were cases in parentheses. As Table 5 shows, large differences between reading and there was a steady increase in mean ability listening proficiency in both French and from the second semester to the sixth Spanish. In reading, French and Spanish semester. While Spanish also showed an students started at NH (3) and developed increase going into the fourth year, there in a similar fashion to AL (7) in the fourth was a slight drop in French reading and year, with French reading abilities slightly a considerable drop in French listening. above Spanish ones, except for the slight Listening proficiency levels were substan- drop for French students after the sixth tially lower than reading levels for both semester going into the fourth year. In lis- French and Spanish. 
While Spanish listen- tening, French and Spanish students started ing levels were approximately one sublevel at NM (2) and also developed similarly until lower than Spanish reading levels, French the fourth semester, when they reached NH listening ability started at a little more than

FIGURE 1 Mean Proficiency Ratings for French and Spanish RPTs and LPTs by Semester Foreign Language Annals VOL. 49, NO. 2 211

one sublevel lower than French reading ability, increased to two sublevels lower in the fourth semester and throughout the third year, and reached two and a half sublevels lower in the fourth year. Returning missionaries who spent 2 years in a Spanish-speaking country had slightly higher reading and considerably higher listening proficiencies than regular fourth-year Spanish students. It is interesting to note that the difference between modalities for these students was less than one half of one sublevel.

Table 6 shows the ACTFL sublevels that correspond to the mean numeric ratings in Table 5. The plus symbol is used to indicate that the mean numeric rating is between 0.35 and 0.74 higher than the base level. Mean numeric ratings of 0.75 and above are rounded up to the next higher level. As shown in Table 6, French and Spanish students were rated NH+, on average, in reading in the third semester, and IM in the fifth semester. They began moving into the IH and IH+ ranges in the sixth semester, on average, and reached IH+ and AL in the fourth year. Spanish students who spent 2 years in a Spanish-speaking country read at the same level as regular fourth-year students. In listening, Spanish students moved into the Intermediate range in the fifth semester and reached the IH level in the fourth year. French students, on average, seemed to find it very difficult to reach IM even in their fourth year. Spanish students who spent 2 years abroad were rated AL, on average, the same as in reading.

TABLE 5 Mean ACTFL Ratings and Standard Deviations for French and Spanish RPTs and LPTs With N in Parentheses
French RPT: 2nd Sem. 2.94, 1.27 (120); 3rd Sem. 3.63, 1.28 (86); 4th Sem. 4.52, 1.45 (215); 5th Sem. 5.29, 2.20 (166); 6th Sem. 6.65, 1.52 (62); 4th Year 6.56, 1.83 (124)
Spanish RPT: 2nd Sem. 3.11, 1.35 (242); 3rd Sem. 3.37, 1.43 (222); 4th Sem. 4.09, 1.64 (338); 5th Sem. 5.26, 2.45 (432); 6th Sem. 6.30, 2.16 (208); 4th Year 6.94, 1.87 (242); 2 Years Abroad 7.31, 1.31 (85)
French LPT: 2nd Sem. 1.83, 0.90 (111); 3rd Sem. 2.39, 1.16 (88); 4th Sem. 2.65, 1.63 (203); 5th Sem. 3.48, 2.51 (141); 6th Sem. 4.58, 2.70 (89); 4th Year 4.04, 2.69 (121)
Spanish LPT: 2nd Sem. 2.05, 1.17 (235); 3rd Sem. 2.52, 1.45 (209); 4th Sem. 2.83, 1.59 (317); 5th Sem. 4.39, 2.44 (392); 6th Sem. 5.13, 2.37 (187); 4th Year 6.00, 1.92 (233); 2 Years Abroad 7.08, 0.58 (85)

TABLE 6 Mean ACTFL Ratings for French and Spanish RPTs and LPTs
French RPT: 2nd Sem. NH; 3rd Sem. NH+; 4th Sem. IL+; 5th Sem. IM; 6th Sem. IH+; 4th Year IH+
Spanish RPT: 2nd Sem. NH; 3rd Sem. NH+; 4th Sem. IL; 5th Sem. IM; 6th Sem. IH; 4th Year AL; 2 Years Abroad AL
French LPT: 2nd Sem. NM; 3rd Sem. NM+; 4th Sem. NM+; 5th Sem. NH+; 6th Sem. IL+; 4th Year IL
Spanish LPT: 2nd Sem. NM; 3rd Sem. NM+; 4th Sem. NH; 5th Sem. IL+; 6th Sem. IM; 4th Year IH; 2 Years Abroad AL

To examine the effect of number of semesters on proficiency, separate ANOVAs were calculated by language (French and Spanish) and modality (reading and listening).

For Spanish reading (N = 1,684), the ANOVA found a significant effect of semester number on proficiency (F[5,1678] = 161.606, p < 0.001). Number of semesters explained 32.5% of the variance in proficiency (partial eta squared = 0.325). The Scheffé test (0.05) yielded five homogeneous subsets. Second and third semesters constituted one subset. Fourth, fifth, and sixth semesters and fourth year formed individual subsets. This indicates that there were no significant differences between second and third semesters, but there were significant differences between third, fourth, fifth, and sixth semesters and fourth year, representing a statistically significant mean progression from NH+ (third semester) to IL (fourth semester) to IM (fifth semester) to IH (sixth semester) and AL (fourth year).

For French reading (N = 773), the ANOVA found a significant effect of semester number on proficiency (F[5,767] = 85.086, p < 0.001). Number of semesters explained 35.7% of the variance in proficiency (partial eta squared = 0.357). The Scheffé test (0.05) yielded four homogeneous subsets. Second and third semesters formed subset 1. Fourth and fifth semesters formed individual subsets. Sixth semester and fourth year formed subset 4. There was a statistically significant mean progression from NH+ (third semester) to IL+ (fourth semester) to IM (fifth semester) and IH+ (sixth semester).

For Spanish listening (N = 1,573), the ANOVA found a significant effect of semester number on proficiency (F[5,1567] = 160.734, p < 0.001). Number of semesters explained 33.9% of the variance in proficiency (partial eta squared = 0.339). The Scheffé test (0.05) yielded five homogeneous subsets. Second and third semesters formed subset 1. Third and fourth semesters formed subset 2. Fifth and sixth semesters and fourth year formed three individual subsets. There was a statistically significant mean progression from NH (fourth semester) to IL+ (fifth semester) to IM (sixth semester) and IH (fourth year).

For French listening (N = 753), the ANOVA found a significant effect of semester number on proficiency (F[5,747] = 28.079, p < 0.001). Number of semesters explained only 15.8% of the variance in proficiency (partial eta squared = 0.158). The Scheffé test (0.05) yielded four homogeneous subsets. Second, third, and fourth semesters formed subset 1, fourth and fifth semesters formed subset 2, fifth semester and fourth year formed subset 3, and sixth semester and fourth year constituted subset 4. While there was a mean progression from NM (second semester) to NM+ (third and fourth semesters) to NH+ (fifth semester) and IL+ (sixth semester), the only statistically significant progression was from fifth to sixth semester. Note that the mean proficiency of fourth-year students was lower than the sixth-semester mean.

To get a better sense of the range of student proficiency levels at various milestones, Table 7 shows the range of ACTFL levels within one standard deviation of the mean. This represents the range of proficiency levels of the mid 68.2% of test takers. As Table 7 shows, the top 16% (84th percentile) of Spanish students read at IM or higher in the third and fourth semesters and at AL or higher in the fifth semester. In the sixth semester, they read at the AM+ level, as did students who returned from 2 years abroad. In the fourth year, they read at the AH level. The 84th percentile of French

students had similar abilities in reading as the Spanish students. In listening, the 84th percentile of Spanish students reached the AL level as early as the fifth semester, while the same percentile of French students did not reach it before the sixth semester and were below AL in the fourth year. Of particular significance is that even the 84th percentile of both French and Spanish students were at the IL/IL+ levels in listening as late as the fourth semester. Note that because there were so many BR ratings in French listening, the bottom 16% of French students were BR in the fifth and sixth semesters and in the fourth year. This discrepancy between reading and listening levels did not apply to students who had spent 2 years abroad (returning missionaries). Their listening levels were only marginally lower than their reading levels. Note that their standard deviations were also low, and particularly low in listening, which indicates that they formed a homogeneous group with very similar proficiency ratings.

TABLE 7 Range of ACTFL Proficiency Levels for French and Spanish RPTs and LPTs Within One Standard Deviation of the Mean
Columns: 2nd Sem., 3rd Sem., 4th Sem., 5th Sem., 6th Sem., 4th Year, 2 Years Abroad. Rows: French RPT, Spanish RPT, French LPT, Spanish LPT.

Reading and Listening Levels of German, Italian, Portuguese, and Russian Students at Major Milestones

There were substantially fewer test takers in German, Italian, Japanese, Portuguese, and Russian than there were in French and Spanish, and they were unevenly distributed across class levels. For a number of class levels, there were not enough test takers. To generate more substantial numbers, fifth and sixth semesters were combined into third year. Table 8 shows the ACTFL numerical ratings, the standard deviations, and the number of tests in parentheses in German, Italian, Japanese, Portuguese, and Russian for all semesters and years for which there were more than 10 tests. The table provides a heterogeneous picture, undoubtedly due to the often very small numbers and, for German, Japanese, and Russian as well as Portuguese listening, due to the relatively large number of BR ratings. In general, German, Italian, and Portuguese students were IL or IM in reading and NH

TABLE 8 Mean ACTFL Ratings and Standard Deviations for German, Italian, Japanese, Portuguese, and Russian RPT and LPT With N in Parentheses for Semesters and Years With More Than 10 Tests
Columns: 2nd Sem., 3rd Sem., 4th Sem., 3rd Year, 4th Year, 2 Years Abroad (only cells with more than 10 tests are filled, listed in column order)
German RPT: 4.00, 1.85 (22); 4.33, 1.45 (178); 4.11, 3.15 (93); 5.45, 2.29 (38); 6.64, 2.38 (11)
Italian RPT: 3.42, 1.17 (73); 4.12, 1.04 (24); 6.68, 1.94 (19)
Japanese RPT: 1.61, 1.03 (33); 3.03, 1.36 (36); 3.31, 2.75 (13)
Portuguese RPT: 3.54, 1.56 (13); 5.36, 1.97 (28); 6.89, 0.68 (18)
Russian RPT: 1.64, 1.04 (25); 3.13, 1.57 (52); 5.09, 2.14 (22); 3.69, 4.04 (45)
German LPT: 2.96, 1.72 (151); 4.53, 3.17 (30); 5.26, 2.28 (31); 7.45, 0.69 (11)
Italian LPT: 3.89, 1.08 (37); 4.04, 1.15 (23); 7.14, 3.29 (19)
Portuguese LPT: 2.50, 1.88 (12); 3.42, 2.04 (26); 5.13, 2.16 (16)
Russian LPT: 1.72, 1.21 (25); 2.88, 1.32 (51); 4.40, 2.87 (20); 1.70, 2.87 (40)

or IL in listening in their fourth semester. In the third year, students were IH and AL in reading and listening in the Romance languages except for Portuguese listening, where students continued to be IM, on average. As was the case with the Spanish students who spent 2 years abroad on a mission, returning missionaries who were in a German-speaking country were Advanced in listening. Note again the very low standard deviation, which indicates that they were a very homogeneous group.

Reading and Listening

Because most students took both reading and listening tests, particularly at the three universities that contributed 83% of the reading and 88% of the listening test takers (Michigan State University, University of Minnesota, and University of Utah), the relationship between reading and listening was examined to see if reading and listening proficiencies developed in tandem or separately. In addition, the number of classroom hours needed to reach particular proficiency levels is considered.

Table 9 shows the overall and language-specific descriptive statistics of proficiency levels expressed numerically for RPTs and LPTs. As Table 9 shows, the listening means were considerably lower than the reading means except for Italian, and to some extent for Russian. The difference in means was highest for French and Portuguese, 1.67 and 1.64, respectively, equivalent to more than one and a half sublevels, followed by Spanish and German, 0.92 and 0.87, respectively, equivalent to a little less than one sublevel. The difference in means was 0.47 in Russian, less than one half-sublevel, and it was close to 0 in Italian.

Table 10 shows the overall and language-specific correlation coefficients between ACTFL reading and listening proficiency ratings. Both parametric Pearson's r and a nonparametric Spearman's rho were used to compute these correlations. The effect size, R², was also computed. As Table 10 shows, all correlations were significant at p < 0.01. Effect sizes above 0.25 are considered to be large (Larson-Hall, 2010, p. 162). All effect sizes were larger than

TABLE 9 Overall and Language-Specific ACTFL Proficiency Ratings for RPTs and LPTs Including N, Minimum, Maximum, Mean, Standard Deviation, and Standard Error
Assessment  Language    N      Minimum  Maximum  Mean  SD    SE
RPT         All         2,452  0        10       4.74  2.15  0.043
RPT         French      638    0        9        4.77  1.94  0.077
RPT         German      187    0        8        4.53  1.70  0.124
RPT         Italian     62     1        8        4.23  1.60  0.204
RPT         Portuguese  53     0        8        5.43  1.72  0.236
RPT         Russian     107    0        10       3.08  2.34  0.226
RPT         Spanish     1,405  0        10       4.87  2.24  0.060
LPT         All         2,452  0        8        3.65  2.28  0.046
LPT         French      638    0        8        3.10  2.10  0.083
LPT         German      187    0        8        3.66  2.16  0.158
LPT         Italian     62     1        8        4.21  1.32  0.168
LPT         Portuguese  53     0        7        3.79  2.20  0.302
LPT         Russian     107    0        8        2.61  2.09  0.202
LPT         Spanish     1,405  0        8        3.95  2.35  0.063
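The reading-listening gaps quoted in the text can be re-derived from the Table 9 means. The following is a minimal sketch of that arithmetic; the dictionary layout and the function names `mean_gap` and `standard_error` are illustrative choices, not part of the study, and the standard error formula (SD divided by the square root of N) is the usual textbook definition, assumed here to be the one behind the SE column.

```python
import math

# (N, reading mean, reading SD, listening mean, listening SD), copied from Table 9.
TABLE_9 = {
    "French": (638, 4.77, 1.94, 3.10, 2.10),
    "German": (187, 4.53, 1.70, 3.66, 2.16),
    "Italian": (62, 4.23, 1.60, 4.21, 1.32),
    "Portuguese": (53, 5.43, 1.72, 3.79, 2.20),
    "Russian": (107, 3.08, 2.34, 2.61, 2.09),
    "Spanish": (1405, 4.87, 2.24, 3.95, 2.35),
}


def mean_gap(language):
    """Reading mean minus listening mean, in ACTFL sublevels (2 decimals)."""
    _, rpt_mean, _, lpt_mean, _ = TABLE_9[language]
    return round(rpt_mean - lpt_mean, 2)


def standard_error(sd, n):
    """Standard error of the mean, assuming SE = SD / sqrt(N)."""
    return sd / math.sqrt(n)


for language in TABLE_9:
    print(f"{language}: reading minus listening = {mean_gap(language):.2f}")
```

Run against the table, the gaps come out in the same order the text gives them: largest for French and Portuguese, about one sublevel for Spanish and German, under half a sublevel for Russian, and near zero for Italian.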

TABLE 10 Overall and Language-Specific Correlation Coefficients Between ACTFL Reading and Listening Proficiency Levels
Language       N      Spearman's rho  p     Pearson's r  R²    p
All Languages  2,452  0.695           0.01  0.669        0.45  0.01
French         638    0.619           0.01  0.614        0.38  0.01
German         187    0.591           0.01  0.545        0.30  0.01
Italian        62     0.760           0.01  0.783        0.61  0.01
Portuguese     53     0.731           0.01  0.729        0.53  0.01
Russian        107    0.709           0.01  0.746        0.56  0.01
Spanish        1,405  0.739           0.01  0.700        0.49  0.01
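The R² column in Table 10 appears to be the square of Pearson's r, rounded to two decimals. A short sketch of that check follows, together with the 0.25 threshold for a large effect cited from Larson-Hall (2010); the names `PEARSON_R`, `r_squared`, and `is_large_effect` are hypothetical helpers introduced only for illustration.

```python
# Pearson's r per language, copied from Table 10.
PEARSON_R = {
    "All Languages": 0.669,
    "French": 0.614,
    "German": 0.545,
    "Italian": 0.783,
    "Portuguese": 0.729,
    "Russian": 0.746,
    "Spanish": 0.700,
}

# Threshold for a "large" effect size, as cited in the text (Larson-Hall, 2010).
LARGE_EFFECT = 0.25


def r_squared(r):
    """Share of variance two measures have in common, rounded to 2 decimals."""
    return round(r * r, 2)


def is_large_effect(r):
    """True when the squared correlation exceeds the 0.25 threshold."""
    return r * r > LARGE_EFFECT


for language, r in PEARSON_R.items():
    print(f"{language}: r = {r:.3f}, R^2 = {r_squared(r):.2f}")
```

Under this reading, an R² of 0.49 for Spanish means that about half of the variance in listening ratings is shared with reading ratings.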

0.25, with Italian, Portuguese, Russian, and Spanish being the largest, explaining 49% and more of the variance between reading and listening.

To provide an additional perspective, Table 11 shows the differences in mean ACTFL ratings between reading and listening for class levels with 20 or more test takers. Means and standard deviations were calculated on the basis of all universities, not only the top three. As Table 11 shows, mean listening levels were generally lower than mean reading levels, with substantial differences between languages.

The difference was lowest in Italian. In the second semester, students' proficiency in listening was actually 0.47 ACTFL sublevels higher than in reading, and in the fourth semester, the difference was close to zero. For Portuguese, reading proficiency was 1.94 sublevels higher than listening proficiency in the fourth semester. Portuguese listening was still NH+, while Portuguese reading was IM+. For German, the difference in the fourth semester was 1.37, while it was only 0.19 in the fourth year. For Russian, there was essentially no difference in the third semester, while it was a low 0.25 in the fourth semester. The difference in the fourth year of 1.99 was an anomaly, because 72.5% of the listening ratings were BR.

In Spanish, students' listening proficiency was generally one sublevel lower than their reading proficiency, NM vs. NH in the second and third semester, NH vs. IL in the fourth semester, IL vs. IM in the fifth semester, IM vs. IH in the sixth semester, and IH vs. AL in the fourth year. Most significant, the difference between listening and reading was only 0.23 for students who spent 2 years in the target country.

In French, the differences between listening and reading were much more pronounced. In semesters two and three the differences were 1.11 and 1.24, respectively. In semesters four, five, and six, the differences were 1.87, 1.81, and 2.07. And in the fourth year, the difference was a whopping 2.52.

Further evidence for the gap between reading and listening proficiency for some languages was provided when the mean length of time to reach a particular ACTFL sublevel was considered. Table 12 shows the mean length of time and standard deviation needed to reach a particular ACTFL sublevel with the number of students in parentheses. Only cells in the table with at least 20 students are filled in. The numbers were not sufficient for Portuguese and Russian to compare reading and listening proficiency. For Japanese, only reading data were available and therefore were not included. As Table 12 shows, it took longer to reach a particular proficiency rating in listening than in reading for all languages and for all proficiency levels. To attain an

TABLE 11 Differences in Mean ACTFL Ratings Between Reading and Listening, Including Means and Standard Deviations With Number of Tests in Parentheses
French RPT: 2nd Sem. 2.94, 1.27 (120); 3rd Sem. 3.63, 1.28 (86); 4th Sem. 4.52, 1.45 (215); 5th Sem. 5.29, 2.19 (166); 6th Sem. 6.65, 1.52 (62); 4th Year 6.56, 1.83 (124)
French LPT: 2nd Sem. 1.83, 0.90 (111); 3rd Sem. 2.39, 1.16 (88); 4th Sem. 2.65, 1.63 (203); 5th Sem. 3.48, 2.51 (141); 6th Sem. 4.58, 2.70 (89); 4th Year 4.04, 2.68 (121)
Spanish RPT: 2nd Sem. 3.11, 1.35 (242); 3rd Sem. 3.27, 1.43 (222); 4th Sem. 4.09, 1.64 (338); 5th Sem. 5.26, 2.45 (432); 6th Sem. 6.30, 2.16 (208); 4th Year 6.94, 1.87 (242); 2 Years Abroad 7.31, 1.31 (85)
Spanish LPT: 2nd Sem. 2.05, 1.17 (235); 3rd Sem. 2.52, 1.45 (209); 4th Sem. 2.83, 1.59 (317); 5th Sem. 4.39, 2.44 (392); 6th Sem. 5.13, 2.37 (187); 4th Year 6.00, 1.92 (233); 2 Years Abroad 7.08, 0.58 (85)
German RPT: 4th Sem. 4.33, 1.45 (178); 4th Year 5.45, 2.29 (38)
German LPT: 4th Sem. 2.96, 1.72 (155); 4th Year 5.26, 2.28 (31)
Italian RPT: 2nd Sem. 3.42, 1.17 (73); 4th Sem. 4.13, 1.04 (24)
Italian LPT: 2nd Sem. 3.89, 1.08 (37); 4th Sem. 4.04, 1.15 (23)
Portuguese RPT: 4th Sem. 5.36, 1.97 (28)
Portuguese LPT: 4th Sem. 3.42, 2.04 (26)
Russian RPT: 3rd Sem. 1.64, 1.04 (25); 4th Sem. 3.13, 1.57 (52); 4th Year 3.69, 4.04 (45)
Russian LPT: 3rd Sem. 1.72, 1.21 (25); 4th Sem. 2.88, 1.32 (51); 4th Year 1.70, 2.87 (40)
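The sublevel labels used in the text (for example, Portuguese listening at NH+ and reading at IM+) follow the rule stated earlier in the article: a mean 0.35 to 0.74 above a base level gets a plus, and a mean 0.75 or more above it is rounded up. Below is a minimal sketch of that rule. The article only states explicitly that NM is coded 2; the rest of the numeric coding in `LEVELS` (0 = BR through 10 = S) is an assumption inferred from the tables, and `sublevel_label` is a hypothetical helper name.

```python
import math

# Assumed numeric coding: index = numeric rating. Only NM = 2 is stated in the
# article; the other positions are inferred and should be treated as a guess.
LEVELS = ["BR", "NL", "NM", "NH", "IL", "IM", "IH", "AL", "AM", "AH", "S"]


def sublevel_label(mean_rating):
    """Map a mean numeric rating to an ACTFL sublevel label with optional plus."""
    base = math.floor(mean_rating)
    # Round the fractional part to 2 decimals to avoid floating-point noise
    # at the 0.35 and 0.75 boundaries.
    fraction = round(mean_rating - base, 2)
    if fraction >= 0.75:
        return LEVELS[base + 1]      # rounded up to the next sublevel
    if fraction >= 0.35:
        return LEVELS[base] + "+"    # plus marks the upper part of the sublevel
    return LEVELS[base]


# Fourth-semester Portuguese means from Table 11.
print(sublevel_label(3.42), sublevel_label(5.36))
```

Applied to the Spanish rows of Table 11, the same function gives NM vs. NH in the second semester and IH vs. AL in the fourth year, matching the one-sublevel gap described in the text.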

NH rating, it took French students an additional 42 hours, for IL an additional 55 hours, and for IM an additional 59 hours. To attain an IL rating, it took German students an additional 15 hours and Italian students an additional 4 hours; for IM, German students needed an additional 42 hours, and Italian students an additional 25 hours. To attain an NH rating in listening proficiency, Spanish students needed an additional 33 hours, for IL an additional 27 hours, and for IM an additional 41 hours. To attain an IH rating in listening proficiency, it took French students an additional 23 hours and Spanish students an additional 30 hours; for AL, it took French students an additional 15 hours, German students an additional 34 hours, and Spanish students an additional 78 hours. For an AM in listening proficiency, it took Spanish students an additional 90 hours. While the actual numbers are an artifact of the present study, because of its assumption that classroom hours can be added up, the difference between hours needed to reach an equivalent proficiency level in listening and reading supports the finding that it took longer, in some instances quite a bit longer, to attain listening proficiency levels than to attain the same level in reading. Especially striking was the difference between listening and reading in French at NH, IL, and IM: It took students 25% more hours of instruction to reach NH, 30% to reach IL, and 26% to reach IM.
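The effect sizes reported for the four ANOVAs in this section can be recovered from each F statistic and its degrees of freedom. The sketch below assumes the standard identity for a one-way design, partial eta squared = (F × df1) / (F × df1 + df2); the article does not state how the values were computed, and `partial_eta_squared` is an illustrative name.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom.

    Uses the identity eta_p^2 = (F * df_effect) / (F * df_effect + df_error),
    which is assumed here, not taken from the article.
    """
    explained = f_value * df_effect
    return explained / (explained + df_error)


# F statistics and degrees of freedom as reported in the ANOVA results.
ANOVAS = {
    "Spanish reading": (161.606, 5, 1678),
    "French reading": (85.086, 5, 767),
    "Spanish listening": (160.734, 5, 1567),
    "French listening": (28.079, 5, 747),
}

for label, (f_value, df1, df2) in ANOVAS.items():
    print(f"{label}: partial eta squared = {partial_eta_squared(f_value, df1, df2):.3f}")
```

The contrast the text draws is visible directly: semester number accounts for roughly a third of the variance in three of the four analyses, but less than half of that in French listening.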

TABLE 12 Mean Length of Time (95% Trimmed Mean) and Standard Deviation, With Number of Students in Parentheses, Needed to Reach a Particular ACTFL Sublevel
Columns: NL, NM, NH, IL, IM, IH, AL, AM. Rows: French RPT, French LPT, German RPT, German LPT, Italian RPT, Italian LPT, Spanish RPT, Spanish LPT.

Discussion

The mean reading proficiency in English cognate languages such as French and Spanish was NH+ in the third semester, IL or IL+ in the fourth semester, IM in the fifth semester, IH or IH+ in the sixth semester, and IH+ or AL in the fourth year. These proficiency levels were considerably higher than the ones reported previously. If one looks at the top 84th percentile of students, they were even higher: Both French and Spanish students were IM in the third semester, AL+ as early as the fifth semester,

and AM or AM+ in the sixth semester. In the fourth year, they were AM+ in French and AH in Spanish. Despite the limited amount of data for Italian and Portuguese, their proficiency levels were similar to those reported for Spanish, while proficiency levels in German appeared to be similar to French.

Listening proficiency, however, seemed to be a different matter entirely. One of the most significant findings of the present study was the fact that listening proficiency was lower than reading proficiency at almost all levels of instruction and across all languages. The mean listening proficiency in French and Spanish was generally one sublevel lower than the mean reading proficiency in the second and third year, and it was two sublevels lower in the fourth year.

This may be a reflection of what the profession considers important in foreign language education. As a panel of experts noted in a recent accreditation procedure in which the present author took part, listening proficiency is not considered important enough for college credits to be given on the basis of an examination. While the literature faculty in foreign language departments may still focus on reading, the emphasis on input and listening comprehension that characterized the early years of the communicative competence revolution in the late 1970s and 1980s appears to have all but disappeared. The focus on speaking, and particularly on interpersonal speaking, ushered in by the proficiency movement in the late 1980s and 1990s and the fact that the first ACTFL assessment that was made available was the OPI may have contributed to the neglect of listening comprehension even at the lower levels of language instruction. This may explain the finding in the literature review: the productive skills appear to be stronger at the lower levels of foreign language instruction (Davin et al., 2014; Rifkin, 2005; Schmitt, 2016).

Listening proficiency was lower in French than in Spanish at all levels of instruction except in the second and third semesters. The discrepancy between listening and reading for French may be due to its deep orthography (Frost, 2012). Deep orthography languages have an opaque writing system, in which it is not easy to infer the pronunciation of a syllable or word by the way it is written. Words, therefore, need to be learned twice, as visual and aural entities, and words learned visually are not easily accessible when listening. Listening and reading proficiency, therefore, diverge, and gains made in reading may not translate to listening.

The lack of continued development of listening proficiency in French beyond the second year may have also been due to upper-division courses being taught in English. This, of course, may be exactly the kind of vicious circle that solidifies the status quo, when instruction is in English because students' listening proficiency is judged to be not sufficiently developed. Moreover, stagnating listening proficiency development as students enter upper-division courses may be a contributing factor to the often noted Advanced barrier in oral proficiency, a level that appears to be very difficult to attain for students who do not spend a significant amount of time in the target country (Davidson, 2010). As Rifkin (2005) found, postimmersion students showed greater ability in listening and reading, contrasting their stronger skills in speaking and writing prior to the immersion experience. Davidson (2010) found that attaining Advanced-level listening proficiency took the equivalent of a 9-month study abroad period.

Moreover, it may be important to distinguish between local and global comprehension, and between careful and expeditious reading and listening (Weir & Khalifa, 2008). While global and expeditious reading (and listening, for that matter) may be the kind that is most valued in literature classes, and even in lower-division language courses, it may be the local and careful reading with its attention to detail, including lexical and grammatical detail, that may be required to develop higher levels of proficiency.

This seems to call for a renewed effort to better understand the relationship between vocabulary breadth and depth and reading and listening proficiency, including the ability to parse incoming information grammatically.

A number of textual features have been identified as crucial to text comprehension, including vocabulary load, syntactic complexity, domain, and genre, among others (Bernhardt, 2011; Jeon & Yamashita, 2014; Vandergrift & Baker, 2015; Van Zeeland, 2014). While there may be some overarching principles that relate to all languages, many of these features may be language specific and thus may need to be examined language by language. More research is needed to examine the role of these factors at specific proficiency levels, e.g., the kinds of texts and tasks that allow students to acquire the relevant linguistic knowledge that is needed to advance to the next level, the relative importance of these factors for different languages, and their characteristics at particular proficiency levels. A related question is determining the contributions that direct lexical and grammatical study may have on the acquisition process.

Two final questions raised by this study include the role of listening proficiency in the development of speaking proficiency and the way that reading and listening tasks may propel proficiency development across modalities. It is interesting to consider if listening levels need to be higher than speaking levels in order for speaking proficiency to progress into the Advanced range.

While the data presented in this article are robust and substantive, especially for French and Spanish, it should be noted that around 85% of the data stem from three large state universities and therefore may be more representative of large state universities than of private universities or colleges and smaller institutions. Moreover, the absence of biographical data does not allow any assumptions to be made about the relationship between proficiency levels, length and type of instruction, and student background. Still, the data presented offer a first comprehensive picture of the reading and listening proficiency levels of U.S. college students at major milestones and raise a series of tantalizing questions, answers to which may reshape the way foreign languages, in particular reading and listening, are taught and learned.

Conclusion

The data presented in this article suggest that it may be time to rethink the role of foreign language education for academic purposes in the United States and the role of foreign language proficiency within a liberal arts curriculum. In particular, there seems to be increasing recognition of the need to develop principled approaches toward improving listening proficiency, and to a lesser extent reading proficiency, throughout the undergraduate foreign language experience. A focus on listening proficiency may not only help the profession succeed in providing students with useful, professional academic foreign language listening skills, but may even be key to developing professional speaking skills as well. While reading proficiency levels, especially for cognate languages, appear to be much higher than speaking levels and close to what one might expect at various points in a postsecondary program, more principled approaches to reading proficiency may allow learners to reach what Carroll (1967) thought he discovered 50 years ago, i.e., Advanced High and Superior levels.

Because the ACTFL proficiency guidelines afford a developmental perspective and describe what a learner can do at various proficiency levels and what he or she will be able to do at the next higher level, they provide a framework for secondary and postsecondary foreign language departments with respect to aligning goals and curricula across the learning experience and setting proficiency goals for major milestones in a student's career. The ACTFL frameworks and assessments for speaking and writing have already had a major impact

on curriculum and instruction. It is time to do the same for listening and reading.

Notes

1. The Standards-based Measure of Proficiency (STAMP) is a four-skills online proficiency test using adapted authentic and authentic-like materials. Its benchmark-level descriptions are claimed to be comparable to the ACTFL proficiency guidelines (Avant Assessment, 2012).

2. Watson and Wolfel (2015) provided raw data only (pp. 71–72). To calculate mean listening and reading proficiency, their raw data were coded as follows: ILR 0 = 1, ILR 0+ = 2, ILR 1 = 3, ILR 1+ = 4, etc. The ranges of these scores were as follows: 1.69 (range: 1, 7); 1.79 (range: 1, 7); 4.04 (range: 1, 7); 3.78 (range: 2, 7).

3. Instructors were identified by not being associated with a particular class level. Incomplete tests were tests for which no ACTFL rating could be assigned because students stopped after completing only a few items.

4. It is unclear what scale the ACTFL scale is. It is often considered ordinal, not interval, usually on the basis that it takes longer to move from one sublevel to the next at the higher than at the lower end of the scale, and because the presumed additional knowledge required to move to the next sublevel at the higher end of the scale is larger. However, this is also true of norm-referenced scales such as the one the Test of English as a Foreign Language is based on, where the amount of knowledge required to move, e.g., from a score of 110 to 120 must certainly be larger than that required to move from a score of 20 to 30. Furthermore, means and standard deviations have commonly been presented in studies dealing with the ACTFL scale (see G. L. Thompson, Cox, & Knapp, 2016, for a recent discussion), mirroring the practice in the social sciences in general, where means are regularly calculated for ordinal data that are based on an underlying scale (see, e.g., Larson-Hall, 2010, pp. 34–35).

Acknowledgments

Acknowledgments are gratefully made to the ACTFL, especially Elvira Swender, for initiating and supporting the study; LTI, especially Mohamed Diop, for generously supporting the research financially and administratively; the Language Flagship and the Language Proficiency Initiative universities (Michigan State University, the University of Minnesota, and the University of Utah) for sharing their data; and the instructors and students at all participating universities who generously gave their time. Acknowledgments are also gratefully made to the Foreign Language Annals editor, three anonymous reviewers, and the Institute for Test Research and Test Development, Leipzig, Germany, especially Fanny Bies and Olaf Bärenfänger, for helping with the data coding and data analysis.

References

ACTFL. (2012). ACTFL proficiency guidelines [Electronic version]. Retrieved February 25, 2016, from http://www.actfl.org/sites/default/files/pdfs/ACTFLProficiencyGuidelines2012_FINAL.pdf

ACTFL. (2013). ACTFL reading proficiency test (RPT). Familiarization manual and ACTFL proficiency guidelines 2012 - reading. Retrieved February 25, 2016, from http://www.languagetesting.com/wp-content/uploads/2015/02/ACTFL_FamManual_Reading_2015.pdf

ACTFL. (2014). ACTFL listening proficiency test (LPT). Familiarization manual and ACTFL proficiency guidelines 2012 - listening. Retrieved February 25, 2016, from http://www.languagetesting.com/wp-content/uploads/2015/02/ACTFL_FamManual_Listening_2015.pdf

Avant Assessment. (2012). Avant STAMP™ 4S (Standards-based Measurement of Proficiency - 4 Skills). Spanish technical report. Retrieved February 25, 2016, from http://avantassessment.com/docs/spanish-avant-stamp-technical-document.pdf

Bernhardt, E. (2011). Understanding advanced second language reading. New York: Routledge.

Bernhardt, E., Molitoris, J., Romeo, K., Lin, N., & Valderrama, P. (2015). Designing and sustaining a foreign language writing proficiency assessment program at the postsecondary level. Foreign Language Annals, 48, 329–349.

Bozorgian, H. (2012). The relationship between listening and other language skills in International English Language Testing System. Theory and Practice in Language Studies, 2, 657–663.

Brecht, R. D., Davidson, D. E., & Ginsberg, R. B. (1993). Predictors of foreign language gain during study abroad. Washington, DC: Occasional Papers of the National Foreign Language Center.

Brooks, F. B., & Darhower, M. A. (2014). It takes a department! A study of the culture of proficiency in three successful foreign language teacher education programs. Foreign Language Annals, 47, 592–613.

Carroll, J. B. (1967). Foreign language proficiency levels attained by language majors near graduation from college. Foreign Language Annals, 1, 131–151.

Clifford, R., & Cox, T. L. (2013). Empirical validation of reading proficiency guidelines. Foreign Language Annals, 46, 45–61.

Cox, T. L., & Clifford, R. (2014). Empirical validation of listening proficiency guidelines. Foreign Language Annals, 47, 379–403.

Davidson, D. E. (2010). Study abroad: When, how long, and with what results? New data from the Russian front. Foreign Language Annals, 43, 6–26.

Davin, K. J., Rempert, T. A., & Hammerand, A. A. (2014). Converting data to knowledge: One district's experience using large-scale proficiency assessment. Foreign Language Annals, 47, 241–260.

Fraga-Cañadas, C. P. (2010). Beyond the classroom: Maintaining and improving teachers' language proficiency. Foreign Language Annals, 43, 395–421.

Frost, R. (2012). Towards a universal model of reading. Behavioral and Brain Sciences, 35, 263–279.

Glisan, E. W. (2013). On keeping the target language in language teaching: A bottom-up effort to protect the public and students. Modern Language Journal, 97, 541–544.

Goodwin, A. P., August, D., & Calderon, M. (2015). Reading in multiple orthographies: Differences and similarities in reading in Spanish and English for English learners. Language Learning, 65, 596–630.

Hirai, A. (1999). The relationship between listening and reading rates of Japanese EFL learners. Modern Language Journal, 83, 367–384.

In'nami, Y., & Koizumi, R. (2012). Factor structure of the revised TOEIC(R) test: A multiple-sample analysis. Language Testing, 29, 131–152.

Institute for Test Research and Test Development. (2013a). Assessing evidence of validity of the ACTFL reading proficiency test (RPT). Retrieved February 26, 2016, from http://www.languagetesting.com/wp-content/uploads/2013/10/Technical-Report-ACTFL-RPT-for-publication.pdf

Institute for Test Research and Test Development. (2013b). Assessing evidence of validity of the ACTFL listening proficiency test (LPT). Retrieved February 26, 2016, from http://www.languagetesting.com/wp-content/uploads/2013/10/Technical-Report-ACTFL-LPT-2013-for-publication.pdf

Jeon, E. H., & Yamashita, J. (2014). L2 reading comprehension and its correlates: A meta-analysis. Language Learning, 64, 160–212.

Katz, L., & Frost, R. (1992). The reading process is different for different orthographies: The orthographic depth hypothesis. In R. Frost & L. Katz (Eds.), Orthography, phonology, morphology, and meaning: Advances in psychology (pp. 67–84). Oxford: North-Holland.

Larson-Hall, J. (2010). A guide to doing statistics in second language research using SPSS. New York: Routledge.

Liao, C., Qu, Y., & Morgan, R. (2010). The relationships of test scores measured by the TOEIC listening and reading test and TOEIC speaking and writing tests. Retrieved January 26, 2016, from http://www.ets.org/Media/Research/pdf/TC-10-13.pdf

Magnan, S. S. (1986). Assessing speaking proficiency in the undergraduate curriculum: Data from French. Foreign Language Annals, 19, 429–437.

Moeller, A. J. (2013). Advanced Low language proficiency: An achievable goal? Modern Language Journal, 97, 549–553.

Omaggio-Hadley, A. (2001). Teaching language in context (3rd ed.). Boston: Heinle & Heinle.

Rifkin, B. (2005). A ceiling effect in traditional classroom foreign language instruction: Data from Russian. Modern Language Journal, 89, 3–18.
Schmitt, E. (2016). Seat time versus proficiency: Assessment of language development in undergraduate students. In N. Mills & J. Norris (Eds.), AAUSC 2014 volume—Issues in language program direction: Innovation and accountability in language program evaluation (pp. 110–130). Boston: Cengage.
Schulz, R. A. (2002). Changing perspectives in foreign language education: Where do we come from? Where are we going? Foreign Language Annals, 35, 285–292.
Sieloff-Magnan, S., & Back, M. (2007). Social interaction and linguistic gain during study abroad. Foreign Language Annals, 40, 43–61.
Swender, E. (2003). Oral proficiency testing in the real world: Answers to frequently asked questions. Foreign Language Annals, 36, 520–526.
Tedick, D. J. (2013). Embracing proficiency and program standards and rising to the challenge: A response to Burke. Modern Language Journal, 97, 535–538.
Thompson, I. (1996). Assessing foreign language skills: Data from Russian. Modern Language Journal, 80, 47–65.
Thompson, G. L., Cox, T. L., & Knapp, N. (2016). Comparing the OPI and the OPIc: The effect of test method on oral proficiency scores and student preference. Foreign Language Annals, 49, 75–92.
Tschirner, E., & Heilenman, L. K. (1998). Reasonable expectations: Oral proficiency goals for intermediate-level students of German. Modern Language Journal, 82, 147–158.
Vandergrift, L., & Baker, S. (2015). Learner variables in second language listening comprehension: An exploratory path analysis. Language Learning, 65, 390–416.
Van Zeeland, H. (2014). Lexical inferencing in first and second language listening. Modern Language Journal, 98, 1006–1021.
Watson, J. R., & Wolfel, R. (2015). The intersection of language and culture in study abroad: Assessment and analysis of study abroad outcomes. Frontiers: The Interdisciplinary Journal of Study Abroad, 25, 57–72.
Weir, C., & Khalifa, H. (2008). A cognitive processing approach towards defining reading comprehension. Cambridge ESOL Research Notes, 31, 2–10.

Submitted January 26, 2016
Accepted March 1, 2016