Vocabulary Knowledge and Speaking Proficiency 1

Supplementary materials for Koizumi and In’nami (in press)

Koizumi, R., & In’nami, Y. (in press). Vocabulary knowledge and speaking proficiency among second language learners from novice to intermediate levels. Journal of Language Teaching and Research.

Table 1 Note. Table 1 did not include Uenishi (2005) because his study entered many affective and linguistic factors as independent variables in the regression and SEM analyses and reported only regression coefficients, which made his results difficult to interpret.

Current study

Vocabulary size and depth represent declarative knowledge (knowledge skills, according to the classification of De Jong et al., 2012, in press) and involve, for example, the form-meaning link, while speed represents procedural knowledge (processing skills, according to De Jong et al., 2012, in press), which enables learners to access and retrieve word form and meaning quickly (see Wood, 2010, for details). Following De Jong et al. (2012), we considered speaking proficiency an individual trait rather than a socially constructed one. We also followed Housen and Kuiken (2009) in regarding speaking proficiency as consisting primarily of fluency, accuracy, and syntactic complexity (SC), which are often abbreviated as CAF (complexity, accuracy, and fluency). They called for research into how CAF are associated with internal variables, such as learners’ language knowledge, and the current study responds to their call. Finally, our approach is similar to that of De Jong et al. (in press) in inspecting associations between cognitive fluency (reflected in vocabulary knowledge) and utterance fluency (reflected in fluency measures).

Study 1: Instruments

In the derivation test, 20 derivational suffixes were selected from Levels 2 to 4 in Bauer and Nation (1993). All words used as stimuli and answers were selected from among the 3,000 most frequent lemmas in the JACET8000. Word class information (i.e., noun, verb, or adjective) was provided in the prompt, which may have elicited metalinguistic knowledge as well as derivation knowledge.

In the vocabulary tests, relatively high frequency vocabulary was targeted because, as Milton et al. (2010) argued, words with higher frequency are used more often in speech than in written texts (Adolphs & Schmitt, 2003), and thus high frequency vocabulary should be tested when correlating with speaking.

Study 1: Procedures and Analyses

After answering the questions in the size test, test takers waited until the test administrator instructed them to move on to the derivation test. While answering the subsequent tests, test takers were not allowed to go back to the size test. These procedures were followed because some items in the size test could be answered using information in the other tests.

The speaking test had two versions with different orders of tasks. The order of one version was Tasks 1, 2, 3, 4, and 5, whereas that of the other was Tasks 1, 5, 4, 2, and 3. This was done to partially counterbalance order effects and decrease the chances of cheating by copying other test-takers’ utterances.

References

Adolphs, S., & Schmitt, N. (2003). Lexical coverage of spoken discourse. Applied Linguistics, 24, 425–438.
Albrechtsen, D., Haastrup, K., & Henriksen, B. (2008). Vocabulary and writing in a first and second language: Processes and development. New York: Palgrave Macmillan.
Anderson, R. C., & Freebody, P. (1981). Vocabulary knowledge. In J. T. Guthrie (Ed.), Comprehension and teaching: Research reviews (pp. 77–117). Newark, DE: International Reading Association.
Bauer, L., & Nation, P. (1993). Word families. International Journal of Lexicography, 6, 254–279.
Bonk, W. J. (2000). Second language lexical knowledge and listening comprehension. International Journal of Listening, 14, 14–31.
Council of Europe. (2001). Common European Framework of Reference for Languages: Learning, teaching, assessment. Cambridge University Press.
Crossley, S. A., Salsbury, T., McNamara, D. S., & Jarvis, S. (2011). What is lexical proficiency? Some answers from computational models of speech data. TESOL Quarterly, 45, 182–193. doi:10.5054/tq.2010.244019
Daller, H., Milton, J., & Treffers-Daller, J. (2007). Editors’ introduction: Conventions, terminology and an overview of the book. In H. Daller, J. Milton, & J. Treffers-Daller (Eds.), Modelling and assessing vocabulary knowledge (pp. 1–32). Cambridge University Press.
Henriksen, B. (1999). Three dimensions of vocabulary development. Studies in Second Language Acquisition, 21, 303–317.
Higgs, T. V., & Clifford, R. (1982). The push toward communication. In T. V. Higgs (Ed.), Curriculum, competence, and the foreign language teacher (pp. 57–79). Lincolnwood, IL: National Textbook.
Hulstijn, J. H. (2011). Language proficiency in native and nonnative speakers: An agenda for research and suggestions for second-language assessment. Language Assessment Quarterly, 8, 229–249. doi:10.1080/15434303.2011.565844
In’nami, Y., & Koizumi, R. (2011). Structural equation modeling in language testing and learning research: A review. Language Assessment Quarterly, 8, 250–276. doi:10.1080/15434303.2011.565844
Ishii, T., & Schmitt, N. (2009). Developing an integrated diagnostic test of vocabulary size and depth. RELC Journal, 40, 5–22. doi:10.1177/0033688208101452
Koizumi, R. (2011). Test-taking processes of the Lexical Organisation Test: Comparing it with the Word Associates Test. ARELE (Annual Review of English Language Education in Japan), 22, 153–168.
Koizumi, R., & In’nami, Y. (2012). Modeling fluency, accuracy, and syntactic complexity of speaking performance: A structural equation modeling approach. Manuscript submitted for publication.
Kormos, J., & Dénes, M. (2004). Exploring measures and perceptions of fluency in the speech of second language learners. System, 32, 145–164. doi:10.1016/j.system.2004.01.001
Laufer, B., & Nation, P. (1999). A vocabulary-size test of controlled productive ability. Language Testing, 16, 33–51. doi:10.1177/026553229901600103
Meara, P. (1996). The dimensions of lexical competence. In G. Brown, K. Malmkjaer, & J. Williams (Eds.), Performance & competence in second language acquisition (pp. 35–53). Cambridge University Press.
Meara, P., & Fitzpatrick, T. (2000). Lex 30: An improved method of assessing productive vocabulary in an L2. System, 28, 19–30. doi:10.1016/S0346-251X(99)00058-5
Mehnert, U. (1998). The effects of different length of time for planning on second language performance. Studies in Second Language Acquisition, 20, 83–108.
Moinzadeh, A., & Moslehpour, R. (2012). Depth and breadth of vocabulary knowledge: Which really matters in reading comprehension of Iranian EFL learners? Journal of Language Teaching and Research, 3, 1015–1026. doi:10.4304/jltr.3.5.1015-1026
Nation, I. S. P. (2001). Learning vocabulary in another language. Cambridge University Press.
Read, J. (2004). Plumbing the depths: How should the construct of vocabulary be defined? In P. Bogaards & B. Laufer (Eds.), Vocabulary in a second language: Selection, acquisition, and testing (pp. 209–227). Amsterdam, the Netherlands: John Benjamins.
Segalowitz, N. (2010). Cognitive bases of second language fluency. New York: Routledge.
Snellings, P., van Gelderen, A., & de Glopper, K. (2002). Lexical retrieval: An aspect of fluent second-language production that can be enhanced. Language Learning, 52, 723–754. doi:10.1111/1467-9922.00202
Uenishi, K. (2005). An empirical study on factors predicting the speaking ability of Japanese EFL learners. Unpublished Ph.D. dissertation, Hiroshima University, Japan.
Yuan, F., & Ellis, R. (2003). The effects of pre-task planning and on-line planning on fluency, complexity and accuracy in L2 monologic oral production. Applied Linguistics, 24, 1–27. doi:10.1093/applin/24.1.1

APPENDIX 1: TASKS IN THE SPEAKING TEST IN STUDY 1

The instructions were originally written in Japanese.

Task 1
Speaking type; Content: Description; Self-introduction
Instructions: Please introduce yourself to Ms. Smith. Please state your name and talk about your family and friends first. If you do not know what to say, you can talk about anything you want (e.g., your school and likes and dislikes).

Task 2
Speaking type; Content: Picture comparison; Comparing pictures on the left and the right
Instructions: Differences exist between the two pictures. Please locate these differences. Please talk about the marked objects first.

Task 3
Speaking type; Content: Picture description; Picture of washing dishes
Instructions: Describe the picture in as much detail as possible so that Ms. Smith, who is not looking at the picture, can understand what is in it. Please talk about the marked behaviors first.

Task 4
Speaking type; Content: Picture description; Picture of riding bicycles
Instructions: Same as Task 3

Task 5
Speaking type; Content: Picture comparison; Comparing Taro’s rooms before and after
Instructions: There are pictures above and below. Your brother (Jiro) is mischievous. While you were away at school, he scattered your belongings in your room. Describe what in the room has changed and how by saying, “something was something before, but now something is something else.”

APPENDIX 2: DESCRIPTIVE STATISTICS IN STUDY 1

                               M        SD   Minimum   Maximum  Skewness  Kurtosis
Size (k = 78)              29.21     10.21      8.00     54.00      0.22     –0.52
Derivation (k = 20)         7.92      3.67      0.00     19.00      0.07     –0.52
Antonym (k = 17)            6.72      2.92      0.00     14.00      0.14     –0.47
Collocation (k = 18)       10.63      2.84      2.00     18.00     –0.36      0.57
T1 Speed F                 49.52     18.18      8.00    106.67      0.30     –0.47
T1 Repair F                 5.02      4.97      0.00     22.67      1.12      0.84
T1 Accuracy                  .75       .20       .00      1.00      –.81       .76
T1 SC                       1.10      0.16      0.60      1.63      0.79      0.76
T2 Speed F                 39.64     17.29      9.33    102.67      0.66      0.36
T2 Repair F                 6.19      5.76      0.00     26.67      1.21      1.13
T2 Accuracy                  .26       .25       .00      1.00       .88       .41
T2 SC                       0.89      0.21      0.20      1.25     –1.82      2.50
T3 Speed F                 33.86     14.64      9.33     82.67      0.60      0.10
T3 Repair F                 5.45      5.55      0.00     29.33      1.40      1.94
T3 Accuracy                  .59       .28       .00      1.00      –.15      –.59
T3 SC                       1.07      0.21      0.33      2.00      1.58      6.26
T4 Speed F                 37.55     15.41      5.33     82.67      0.48     –0.26
T4 Repair F                 4.67      4.86      0.00     29.33      1.67      3.85
T4 Accuracy                  .54       .26       .00      1.00      –.35      –.45
T4 SC                       0.98      0.18      0.20      1.50     –1.30      4.59
T5 Speed F                 40.18     15.92     10.67     89.33      0.69      0.42
T5 Repair F                 6.45      5.66      0.00     28.00      1.08      1.22
T5 Accuracy                  .34       .28       .00      1.00       .46      –.51
T5 SC                       0.93      0.29      0.17      2.00      0.33      3.10

Note. T = Task. F = Fluency. SC = Syntactic complexity. These also apply to other appendixes.

APPENDIX 3: CORRELATIONS USED FOR MODELS 1 TO 4

                    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   18   19   20   21   22   23   24
1. Size           .77  .80  .66  .30  .15  .35  .19  .42  .31  .10  .36  .48  .18  .34  .12  .50  .26  .31  .28  .41  .29  .32  .22
2. Derivation      --  .68  .63  .25  .16  .19  .13  .37  .24  .04  .32  .44  .15  .27  .15  .41  .26  .25  .28  .34  .21  .19  .16
3. Antonym              --  .61  .32  .14  .26  .13  .43  .31  .07  .26  .47  .13  .32  .11  .53  .26  .30  .25  .44  .28  .27  .21
4. Collocation               --  .22  .17  .22  .13  .34  .23  .07  .29  .40  .15  .26  .09  .37  .21  .23  .22  .29  .20  .18  .15
5. T1 Speed F                     --  .28  .07  .26  .62  .36 –.12  .14  .62  .31  .19  .09  .69  .34  .11  .28  .64  .40  .15  .18
6. T1 Repair F                         -- –.02 –.06  .38  .43 –.02  .15  .35  .47 –.07 –.08  .43  .53 –.03  .06  .34  .45  .11  .16
7. T1 Accuracy                              --  .23  .16  .09  .22  .29  .20 –.02  .33  .08  .21  .02  .27  .19  .07  .10  .15  .14
8. T1 SC                                         --  .10  .08  .08  .13  .15  .03  .17  .08  .16  .05  .13  .13  .05  .05  .13  .10
9. T2 Speed F                                         --  .42 –.13  .30  .70  .41  .17  .12  .71  .40  .11  .28  .71  .46  .16  .27
10. T2 Repair F                                            -- –.01  .18  .46  .50  .11  .00  .45  .51  .11  .16  .48  .50  .19  .21
11. T2 Accuracy                                                 --  .02 –.12 –.07  .23  .00 –.07 –.13  .13 –.08 –.13 –.09  .01  .07
12. T2 SC                                                            --  .23  .08  .13  .09  .22  .05  .23  .20  .19  .15  .11  .18
13. T3 Speed F                                                            --  .32  .16  .09  .74  .43  .18  .25  .63  .40  .13  .24
14. T3 Repair F                                                                --  .00  .02  .34  .57  .00  .12  .33  .61  .07  .18
15. T3 Accuracy                                                                     --  .18  .22 –.02  .32  .26  .16  .12  .25  .04
16. T3 SC                                                                                --  .04  .04  .08  .17  .09  .01  .09  .02
17. T4 Speed F                                                                                --  .38  .23  .25  .69  .50  .25  .31
18. T4 Repair F                                                                                    -- –.02  .21  .38  .52  .12  .16
19. T4 Accuracy                                                                                         --  .36  .11  .15  .27  .10
20. T4 SC                                                                                                    --  .26  .25  .28  .23
21. T5 Speed F                                                                                                    --  .42  .13  .22
22. T5 Repair F                                                                                                        --  .15  .24
23. T5 Accuracy                                                                                                             --  .37
24. T5 SC                                                                                                                        --

Note. *p < .05: correlations from .14 to .17. **p < .01: correlations of .18 or more.
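The significance thresholds in the note to Appendix 3 can be approximately reproduced from the Study 1 sample size (N = 224, as given in Appendix 4). The following is a minimal sketch, not the authors' procedure: it assumes two-tailed tests and uses the Fisher z approximation rather than the exact t-based critical value, and the function name `critical_r` is hypothetical.

```python
from math import sqrt, tanh
from statistics import NormalDist


def critical_r(n: int, alpha: float) -> float:
    """Approximate two-tailed critical |r| for sample size n (Fisher z method)."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # standard normal quantile
    return tanh(z_crit / sqrt(n - 3))  # back-transform from the z scale to r


N = 224  # Study 1 sample size reported in Appendix 4 (assumed to apply here)
r_05 = critical_r(N, 0.05)
r_01 = critical_r(N, 0.01)
print(f"critical |r| at alpha = .05: {r_05:.3f}")
print(f"critical |r| at alpha = .01: {r_01:.3f}")
```

Under these assumptions the two thresholds fall slightly above .13 and slightly above .17, which agrees with the note once correlations are rounded to two decimals: .14 is the smallest value significant at .05, and .18 the smallest significant at .01.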

APPENDIX 4: MULTIVARIATE NORMALITY AND MODEL FITS IN STUDY 1 (N = 224)

                                  Mardia’s            Satorra-Bentler
                                  coefficient         scaled χ2 (df)    p        CFI     TLI     RMSEA [90% CI]      SRMR
Criteria                          nonnormal: > 5.00                     > .05    > .90   > 0.90  < 0.08              < .08
Models 1 & 2: Vocabulary          –0.63               4.20 (2)          .12      1.00    0.99    0.07 [0.00, 0.07]   .01
Model 3: Speaking                 9.25                262.37 (164)      < .01    .93     0.92    0.05 [0.04, 0.06]   .06
Model 4: Vocabulary and speaking  7.15                346.70 (242)      < .01    .95     0.94    0.04 [0.03, 0.05]   .06

Note. CFI = comparative fit index; TLI = Tucker-Lewis index; RMSEA = root mean square error of approximation; CI = confidence interval; SRMR = standardized root mean square residual. This also applies to Table 6.
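As a rough consistency check on Appendix 4, the RMSEA point estimates can be recomputed from the reported χ2, df, and N. This sketch assumes the common textbook formula sqrt(max(χ2 − df, 0) / (df × (N − 1))); the software used for the study may apply a slightly different formula to the Satorra-Bentler scaled statistic (for example, N instead of N − 1), so agreement to two decimals is all that should be expected. The function name `rmsea` is hypothetical.

```python
from math import sqrt


def rmsea(chi2: float, df: int, n: int) -> float:
    """RMSEA point estimate from a chi-square statistic (N - 1 version)."""
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


N = 224  # Study 1 sample size
# (scaled chi-square, df) copied from the table above
models = {
    "Models 1 & 2": (4.20, 2),
    "Model 3": (262.37, 164),
    "Model 4": (346.70, 242),
}
estimates = {name: rmsea(chi2, df, N) for name, (chi2, df) in models.items()}
for name, value in estimates.items():
    print(f"{name}: RMSEA = {value:.2f}")
```

With the tabled inputs, the recomputed values round to 0.07, 0.05, and 0.04, matching the RMSEA column.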

APPENDIX 5: Mplus Input for the Monte Carlo Analysis to Determine the Precision and Power of Parameters for Study 2

    TITLE:      EFA Three-FACTOR, NORMAL DATA, NO MISSING
    MONTECARLO: NAMES ARE X1-X8;
                NOBSERVATIONS = 87;   ! SAMPLE SIZE OF INTEREST
                NREPS = 1000;
                SEED = 53567;
    MODEL POPULATION:
                ! Factor loadings (first loading of each factor fixed at 1)
                Voc BY X1@1 X2*.34 X3*-11.97;
                Speak BY X4@1 X5*.83 X6*.11 X7*.01 X8*.05;
                ! Residual variances of the indicators
                X1*88.80; X2*12.34 X3*76348.76;
                X4*27.73; X5*28.87; X6*7.48; X7*0.08; X8*1.21;
                ! Residual variance of Speak and the structural path
                Speak*20.70;
                Speak ON Voc*.65;
    MODEL:
                Voc BY X1@1 X2*.34 X3*-11.97;
                Speak BY X4@1 X5*.83 X6*.11 X7*.01 X8*.05;
                X1*88.80; X2*12.34 X3*76348.76;
                X4*27.73; X5*28.87; X6*7.48; X7*0.08; X8*1.21;
                Speak*20.70;
                Speak ON Voc*.65;
    ANALYSIS:   ESTIMATOR = ML;
    OUTPUT:     TECH9;

APPENDIX 6: Mplus Output for the Monte Carlo Analysis to Determine the Precision and Power of Parameters for Study 2

Parameter     Population   Sample       SD of        Standard     Mean square  95%            Power
              parameter    parameters   sample       error of     error of     coverage       (% sig
                           averaged     parameters   sample       parameters                  coefficient)
                                                     parameters
Criteria                                                                       .91-.98 is OK  0.80 or more
VOC By
  X1          16.22        16.07        1.66         1.65         2.79         0.94           1.00
  X2          5.55         5.50         0.59         0.58         0.35         0.94           1.00
  X3          5.55         5.50         0.59         0.58         0.35         0.94           1.00
SPEAK By
  X4          1.00         1.00         0.00         0.00         0.00         1.00           0.00
  X5          0.83         0.83         0.08         0.08         0.01         0.95           1.00
  X6          0.11         0.11         0.03         0.03         0.00         0.95           0.98
  X7          0.01         0.01         0.00         0.00         0.00         0.94           0.94
  X8          0.05         0.05         0.01         0.01         0.00         0.95           0.99
SPEAK On
  VOC         10.52        10.43        1.15         1.13         1.32         0.95           1.00

Note. The column labels were partially changed from original Mplus outputs to enhance clarity. VOC By X1 refers to a path from the Vocabulary factor to the X1 variable.
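The two criteria stated in Appendix 6 (95% coverage between .91 and .98, power of 0.80 or more) can be applied to each estimated parameter mechanically. The sketch below is only an illustration: the coverage and power values are copied from the last two columns of the table, X4 is left out on the assumption that its loading was fixed rather than estimated (its standard error is 0.00), and the helper name `meets_criteria` is hypothetical.

```python
# (parameter, 95% coverage, power) copied from the table above.
# SPEAK BY X4 is omitted: its loading appears to be fixed, not estimated.
results = [
    ("VOC BY X1", 0.94, 1.00),
    ("VOC BY X2", 0.94, 1.00),
    ("VOC BY X3", 0.94, 1.00),
    ("SPEAK BY X5", 0.95, 1.00),
    ("SPEAK BY X6", 0.95, 0.98),
    ("SPEAK BY X7", 0.94, 0.94),
    ("SPEAK BY X8", 0.95, 0.99),
    ("SPEAK ON VOC", 0.95, 1.00),
]


def meets_criteria(coverage: float, power: float) -> bool:
    """Apply the table's criteria: coverage in [.91, .98] and power >= .80."""
    return 0.91 <= coverage <= 0.98 and power >= 0.80


flags = {name: meets_criteria(cov, pw) for name, cov, pw in results}
for name, ok in flags.items():
    print(f"{name}: {'meets' if ok else 'fails'} both criteria")
```

On the tabled values, every estimated parameter satisfies both criteria, which is what the sample size of 87 was being checked for.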

APPENDIX 7: ESTIMATES OF PATH COEFFICIENTS IN MODEL 4

Parameter                   B         Standard error  β       R2
Vocabulary -->
  Size                      1.00a     --              .94     .88
  Derivation                0.31*     0.02            .82     .67
  Antonym                   0.26*     0.01            .86     .73
  Collocation               0.21*     0.02            .72     .51
Vocabulary -->
  Speed fluency             0.83*     0.10            .57     .32
  Repair fluency            0.12*     0.03            .36     .13
  Accuracy                  0.07*     0.001           .63     .40
  SC                        0.003*    0.001           .66     .44
Speed fluency -->
  T1 Speed fluency          1.00a     --              .76     .58
  T2 Speed fluency          1.05*     0.08            .84     .70
  T3 Speed fluency          0.88*     0.07            .83     .69
  T4 Speed fluency          0.98*     0.07            .88     .77
  T5 Speed fluency          0.92*     0.07            .80     .64
Repair fluency -->
  T1 Repair fluency         1.00a     --              .65     .42
  T2 Repair fluency         1.22*     0.14            .68     .47
  T3 Repair fluency         1.28*     0.15            .74     .55
  T4 Repair fluency         1.12*     0.15            .75     .56
  T5 Repair fluency         1.31*     0.16            .75     .56
Accuracy -->
  T1 Accuracy               1.00a     --              .53     .28
  T2 Accuracy               0.51*     0.20            .22     .05
  T3 Accuracy               1.49*     0.25            .56     .32
  T4 Accuracy               1.42*     0.25            .56     .32
  T5 Accuracy               1.19*     0.25            .45     .20
SC -->
  T1 SC                     1.00a     --              .30     .09
  T2 SC                     1.90*     0.60            .43     .19
  T3 SC                     0.93*     0.43            .21     .04
  T4 SC                     2.02*     0.66            .53     .28
  T5 SC                     2.32*     0.78            .39     .15

Note. a = Fixed to 1.00 for scale identification. *p < .05. **p < .01. These also apply to other appendixes.
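When a variable is predicted by a single factor, its R2 in the standardized solution equals the squared standardized coefficient (β), so the last two columns of Appendix 7 can be cross-checked against each other. The sketch below uses the four vocabulary indicators as an example. The tolerance of 0.015 is an assumption chosen to absorb two-decimal rounding of both β and R2, not a value from the source.

```python
TOLERANCE = 0.015  # assumed allowance for two-decimal rounding of beta and R2

# (indicator, beta, R2) copied from the first block of the table above
rows = [
    ("Size", 0.94, 0.88),
    ("Derivation", 0.82, 0.67),
    ("Antonym", 0.86, 0.73),
    ("Collocation", 0.72, 0.51),
]

# Absolute gap between the squared standardized loading and the reported R2
gaps = {name: abs(beta ** 2 - r2) for name, beta, r2 in rows}
for name, gap in gaps.items():
    status = "consistent" if gap <= TOLERANCE else "check"
    print(f"{name}: |beta^2 - R2| = {gap:.4f} ({status})")
```

All four gaps fall within the rounding allowance, so the small mismatches (for example, .86 squared is about .74 against a reported .73) are what rounding alone would produce.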

APPENDIX 8: COVARIANCES BETWEEN MEASUREMENT ERRORS IN MODEL 4

Parameter                 Covariance  Standard error  r
D1 (Speed fluency) <-->
  D2 (Repair fluency)     20.46*      3.15            .60
  D3 (Accuracy)           –0.02       0.09            –.02
  D4 (SC)                 0.17*       0.07            .41
D2 (Repair fluency) <-->
  D3 (Accuracy)           –0.03       0.03            –.14
  D4 (SC)                 0.03*       0.02            .27
D3 (Accuracy) <-->
  D4 (SC)                 0.002*      0.001           .79

APPENDIX 9: Vocabulary tests used in Study 2

Note. The upper left column shows one item in the J8VST. The target word is mention, the meaning of which is provided in the L1 as “write or speak about something; refer to something briefly.” One correct option, three distractors, and one option labeled “I don’t know” are provided. The upper right column shows the LOT, in which test takers select, out of three links, the one with the strongest connection between two words; to select a link, test takers move the ball in the center to the slot on that link. The bottom left and right columns show the two tasks of the LEXATT. On the left, test takers press and hold the button in the bottom center, and the target word trip appears; they release the button when they recognize the word form and meaning. The response time, from when the test takers begin to push the button to when they release it, is measured. In the bottom right column, they choose one option out of two (trap and trip) to demonstrate their understanding of the word meaning.

APPENDIX 10: DESCRIPTIVE STATISTICS IN STUDY 2 (N = 87)

                                 M        SD   Minimum   Maximum  Skewness  Kurtosis
Size (k = 125)              100.26     18.76     42.00    123.00     –1.04      0.50
Depth (k = 50)               24.28      6.57      9.00     41.00      0.24     –0.23
Speed (k = 40)              755.08    337.71    274.00   2233.00      1.22      3.04
Processing efficiency        40.59     12.61     20.00     69.00      0.41     –0.77
Speed fluency                16.10     10.95      1.00     40.50      0.55     –0.75
Repair fluency                3.95      3.04      0.00     14.50      0.80      0.49
Accuracy                      0.62      0.32      0.00      1.00     –0.68     –0.52
SC                            2.43      1.27      1.00      8.00      1.60      3.80

APPENDIX 11: CORRELATIONS USED FOR MODELS 5 TO 7

                        Depth       Speed       Efficiency  Speed F     Repair F    Accuracy    SC
Size                    .74**       –.51**      .71**       .65**       .34**       .56**       .41**
Depth                   --          –.44**      .71**       .69**       .30**       .36**       .40**
Speed                               --          –.47**      –.49**      –.25*       –.43**      –.33**
Processing efficiency                           --          .80**       .38**       .43**       .46**
Speed fluency                                               --          .46**       .36**       .41**
Repair fluency                                                          --          .07         .12
Accuracy                                                                            --          .26*
SC                                                                                              --

APPENDIX 12: MULTIVARIATE NORMALITY AND MODEL FITS IN STUDY 2 (N = 87)

                                  Mardia’s            Satorra-Bentler
                                  coefficient         scaled χ2 (df)    p        CFI     TLI     RMSEA [90% CI]      SRMR
Criteria                          nonnormal: > 5.00                     > .05    > .90   > 0.90  < 0.08              < .08
Model 5: Vocabulary               4.51                --                --       --      --      --                  --
                                                      (Not computed because of df = 0)
Model 6: Speaking                 1.24                7.22 (5)          .20      .99     0.97    0.07 [0.00, 0.18]   .05
Model 7: Vocabulary and speaking  2.79                28.80 (19)        .05      .97     0.95    0.08 [0.00, 0.13]   .06
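The degrees of freedom in Appendix 12, including df = 0 for Model 5, follow from counting the distinct variances and covariances among the indicators and subtracting the free parameters. The sketch below assumes a covariance-structure-only count (no means), the first loading of each factor fixed to 1.00, and no correlated residuals; these parameter counts are inferred from the reported loadings and the Mplus input in Appendix 5, not stated in the source, and the function name `sem_df` is hypothetical.

```python
def sem_df(n_indicators: int, n_free_params: int) -> int:
    """Degrees of freedom: distinct variances/covariances minus free parameters."""
    n_moments = n_indicators * (n_indicators + 1) // 2
    return n_moments - n_free_params


# Model 5: one factor, three indicators
# 2 free loadings + 3 residual variances + 1 factor variance
df_model5 = sem_df(3, 2 + 3 + 1)

# Model 6: one factor, five indicators
# 4 free loadings + 5 residual variances + 1 factor variance
df_model6 = sem_df(5, 4 + 5 + 1)

# Model 7: two factors, eight indicators, one structural path
# 6 free loadings + 8 residual variances + Vocabulary variance
# + Speaking residual variance + 1 path (Speaking on Vocabulary)
df_model7 = sem_df(8, 6 + 8 + 1 + 1 + 1)

print(df_model5, df_model6, df_model7)
```

Under these assumptions the counts come to 0, 5, and 19, matching the df reported for Models 5 to 7. A one-factor model with three indicators is just-identified, so it reproduces the observed covariances exactly and leaves no degrees of freedom for a χ2 test.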

APPENDIX 13: ESTIMATES OF PATH COEFFICIENTS IN MODEL 7

Parameter                   B         Standard error  β       R2
Vocabulary -->
  Size                      1.00a     --              .87     .75
  Depth                     0.34*     .04             .85     .71
  Speed                     –11.97*   2.24            –.58    .33
Vocabulary -->
  Speaking                  0.65*     .07             .92     .84
Speaking -->
  Processing efficiency     1.00a     --              .91     .83
  Speed fluency             0.83*     .08             .87     .76
  Repair fluency            0.12*     .03             .43     .19
  Accuracy                  0.01*     .003            .48     .23
  SC                        0.06*     .01             .50     .25