Preprint version International Journal of Learner Corpus Research, accepted for publication 23 July 2021 [email protected]

Bert Le Bruyn and Magali Paquot (Eds.), Learner Corpus Research Meets Second Language Acquisition. Cambridge: Cambridge University Press, 2021. xiii + 275 pp. ISBN 9781108442299. [Cambridge Applied Linguistics Series]

Reviewed by Kevin McManus (Penn State University, USA)

Understanding the ways in which speakers use an additional language and how that ability emerges and changes over time constitutes a major focus of applied linguistics research to date. A dominant approach to investigating this question has involved studies of production, including studies that have documented how speakers use specific linguistic features (e.g., articles), multi-word combinations, as well as broader linguistic and/or discourse-level patterns. One particularly fruitful method for studying L2 usage has involved corpus linguistic analyses of learner corpora, defined as “systematic collections of authentic, continuous and contextualized language use (spoken or written) by L2 learners stored in electronic format” (Callies & Paquot, 2015, p. 1). Learner corpus research (LCR) thus holds considerable potential for informing and better grounding current conceptualizations of L2 learning as developed in the field of second language acquisition (SLA). However, as many commentators have noted, the fields of LCR and SLA have not always benefited from one another as much as they could (see Myles, 2005, 2015). By providing an up-to-date account of cutting-edge work at the intersection of LCR and SLA, the current volume shows how recent advances in LCR and SLA have brought these fields closer together to provide robust and innovative accounts of L2 use and development.

In this review, I provide a short summary of each chapter in the volume, followed by a concise evaluation of the volume as a whole and its contribution to the field.

Le Bruyn and Paquot’s introduction situates and provides a broad contextualization for the volume, noting three main topics addressed in the nine empirical studies: universal tendencies and crosslinguistic influence, proficiency and time, and corpus analysis and development. The editors note that the volume “provides a fair impression of how the fields of LCR and SLA are currently interacting” (p. 3), achieved by drawing on a broad range of corpora, theories of language and learning, and analytical approaches. The book concludes with commentaries from Sylviane Granger and Florence Myles on the volume’s contribution to LCR and SLA.

The first three empirical chapters focus on crosslinguistic influence. Ionin and Díez-Bedmar investigated article usage by Russian- and Spanish-speaking learners of English. Their study examined the extent to which the predictions formulated from prior experimental SLA work about article usage were borne out in LCR using the Cambridge Learner Corpus (see Cambridge University Press, 2021). In general, their comparisons indicate a similar patterning of results in both approaches: article usage was influenced by prior linguistic knowledge and proficiency. At the same time, the authors note that each approach brings its own specific contribution for studying article usage, thus cementing claims about the importance of methodological triangulation in research design.

Understanding crosslinguistic influence in terms of Present Perfect and Simple Past usage among German and Chinese speakers is the focus of Werner, Fuchs and Götz’s study. Their analysis examined “if/how learners from two different L1 backgrounds deviate from native usage” (p. 58) by comparing usage differences in a variety of corpora: for the L2 speakers, the Louvain International Database of Spoken English Interlanguage (LINDSEI, see Gilquin et al., 2010) and the International Corpus of Learner English (ICLE, see Granger et al., 2009); and for the L1 speakers, the Louvain Corpus of Native English Conversation (LOCNEC, see De Cock, 2004) and the Louvain Corpus of Native English Essays (LOCNESS, see Granger & Tyson, 1996). Consistent with previous research, results indicated both that L2 and L1 speakers use these tense forms in different ways and that increased L2 proficiency shapes usage towards target-like norms. The authors suggest that L1 transfer explanations do not account well for their findings.

In the last of the crosslinguistic influence chapters, Meriläinen examined embedded inversion and preposition omission in L2 English among learners from a variety of L1 backgrounds, comparing them with L1 speakers. The analysis used the ICLE corpus and the Corpus of Matriculation Examination compositions for L2 speakers and LOCNESS for L1 speakers. The author suggests that prior linguistic knowledge plays an important role in accounting for usage patterns in the corpus data.

Polio and Yoon offer a refreshing take on understandings and operationalizations of accuracy in L2 research. They investigated to what extent multi-word combinations can function as measures of accuracy in L2 writing, using the Corpus of Contemporary American English (COCA, see Davies, 2008) and a variety of L2 corpora (e.g., the MSU corpus, see Connor-Linton & Polio, 2014). In so doing, they propose new ways to think about accuracy, drawing on usage-based conceptualizations of language that move beyond judgements from data coders and L1 speakers. In addition, the authors present and discuss some of the ways that accuracy coding has the potential to be less labor-intensive by using corpus linguistic techniques.

The volume also includes three chapters that use longitudinal learner corpora to understand development and change. Paquot, Haets, and Gries investigated phraseological complexity development in L2 English using written data from the Longitudinal Database of Learner English project (LONGDALE, see Meunier et al., 2016). The study examined how L2 usage changed over time and to what extent L2 proficiency helped understand development. Their findings suggest a close relationship between general L2 proficiency and development in writing. The authors also draw attention to important task effects on L2 performance. In the case of writing, this includes the extent to which different essay prompts can elicit different types of responses (see also Verspoor et al.). Such task effects can influence conclusions and claims about L2 development in important ways.

Using oral and written data in French and Spanish from the Languages and Social Networks Abroad Project (LANGSNAP, see Mitchell et al., 2017), Tracy-Ventura, Huensch, and Mitchell investigated changes in L2 lexical diversity during study abroad and four years later. In contrast to much LCR research, this learner corpus includes considerable metadata to understand and contextualize usage. In their study, social network data were used to understand the extent to which learners continued to use and/or receive exposure to the L2. The findings indicated that L2 exposure provided an important explanation for changes in L2 lexical diversity.

Taking a longitudinal multiple case study approach, Verspoor, Lowie, and Wieling investigated L2 writing development over 23 weeks. The analyses examined lexical and syntactic changes over the course of the 23 weeks, with data collected each week to provide fine-grained data points for studying development. Their findings indicated improvement over time on a range of measures, but development was non-linear. Variation among individuals was also evident. The authors call for more multiple case study approaches to better understand the longitudinal trajectories of L2 development.

The last two empirical studies of the volume focus on methodology and research design. Wulff and Gries make a case for studying individual differences and variation in SLA using the MuPDAR(F) (multifactorial prediction and deviation analysis using regression/random forests) statistical technique (see Gries & Adelman, 2014; Gries & Deshors, 2014). This technique can address the limitations of frequency comparisons of overuse and underuse by focusing on probabilistic differences resulting from usage. In so doing, this method compares linguistic choices between speakers rather than counting the number of usage instances between speakers. The usefulness of this method for LCR and SLA was demonstrated by analyzing genitive alternation (of vs. ’s) among Chinese- and German-speaking learners of English (from ICLE) and L1 English speakers (from ICE). The results show that individual variation plays a major role in our understanding of usage. In line with Verspoor et al., while group-level analyses can be insightful, when used alone they provide a partial account only.

Bell, Collins, and Marsden offer a review of methodological issues associated with the creation of learner corpora and present analyses from pilot data involving school-aged children. Their analyses highlight the importance of piloting before finalizing research designs. Particularly relevant for both LCR and SLA research, this chapter notes some of the challenges present in eliciting data about L2 grammatical development (e.g., appropriacy of task design for different language abilities) as well as methodological difficulties in the preparation of data (e.g., transcription and coding, error annotation issues). This chapter thus provides a useful reminder and review of issues concerned with corpus building.

The final two chapters of the volume include commentaries on the volume’s empirical studies. Granger provides an LCR perspective, while Myles provides an SLA perspective. Together they review the ways in which the fields of LCR and SLA appear to be working together, note some challenges that seem to remain, and propose new avenues for future LCR and SLA research. For example, Granger notes how advances in corpus design have allowed LCR researchers to investigate questions about learning and development, especially with the development of longitudinal learner corpora (see Tracy-Ventura et al., Verspoor et al.). Focusing on the importance of context to understanding L2 usage, Myles provides a helpful reminder that “the importance of sound metadata cannot be over-emphasised” (p. 260). Rich metadata must be included in the design and construction of learner corpora. The perspectives from these experienced researchers provide great contextualization for the volume and draw important connections among the studies.

All in all, this book is a very timely contribution to the fields of LCR and SLA. The editors have brought together a diverse collection of studies, all informed by SLA research and all involving LCR, that make a strong case for the ways in which these fields are becoming closely connected. Even though there is a tendency towards cross-sectional analyses of L2 written English, the variety of corpora used is notable. The inclusion of longitudinal corpora of languages other than English is encouraging, as is the inclusion of spoken corpora. These additions to the types of corpora available to researchers will be helpful for advancing new claims about development that move beyond the specifics of English. The volume also draws attention to potential task effects on usage, showing very clearly that not all essay prompts elicit the same types of language. This is clearly an important issue to be considered in future work, especially longitudinal research that tracks usage over an extended period of time. Task differences that bias usage can affect our conclusions about usage, learning, and development in unintended ways.

A final reflection point is that, by and large, the learner corpora presented in this volume are not yet available to the wider research community (but see Tracy-Ventura et al.). For example, the ICLE and LINDSEI corpora are usable by other researchers after purchase only. As our field is beginning to reflect more on the availability, openness, and reproducibility of our research, it is important that the design and compilation of future learner corpora consider how the research data can be made usable by others (e.g., the CHILDES database, see MacWhinney, 2000).

Taken together, the volume thus provides readers with an impressive overview of the current state of LCR-SLA research. Of particular importance, the studies very clearly highlight how the combination of corpus-based and experimental approaches is needed to better understand L2 learning, inform L2 theory building, and revisit previous research findings.

References

Callies, M., & Paquot, M. (2015). Learner Corpus Research: An interdisciplinary field on the move. International Journal of Learner Corpus Research, 1(1), 1–6. https://doi.org/10.1075/ijlcr.1.1.00edi

Cambridge University Press. (2021). The Cambridge English Corpus. https://www.cambridge.org/us/cambridgeenglish/better-learning-insights/corpus

Connor-Linton, J., & Polio, C. (2014). Comparing perspectives on L2 writing: Multiple analyses of a common corpus. Journal of Second Language Writing, 26, 1–9. https://doi.org/10.1016/j.jslw.2014.09.002

Davies, M. (2008). The Corpus of Contemporary American English. http://www.americancorpus.org

De Cock, S. (2004). Preferred sequences of words in NS and NNS speech. Belgian Journal of English Language and Literatures, 2(1), 225–246.

Gilquin, G., De Cock, S., & Granger, S. (2010). The Louvain International Database of Spoken English Interlanguage [Handbook and CD-ROM]. Presses universitaires de Louvain.

Granger, S., Dagneaux, E., Meunier, F., & Paquot, M. (2009). The International Corpus of Learner English [Version 2, Handbook and CD-ROM]. Presses universitaires de Louvain.

Granger, S., & Tyson, S. (1996). Connector usage in the English essay writing of native and non-native EFL speakers of English. World Englishes, 15(1), 17–27. https://doi.org/10.1111/j.1467-971X.1996.tb00089.x

Gries, S. Th., & Adelman, A. S. (2014). Subject realization in Japanese conversation by native and non-native speakers: Exemplifying a new paradigm for learner corpus research. In J. Romero-Trillo (Ed.), Yearbook of Corpus Linguistics and Pragmatics 2014 (Vol. 2, pp. 35–54). Springer International Publishing. https://doi.org/10.1007/978-3-319-06007-1_3

Gries, S. Th., & Deshors, S. C. (2014). Using regressions to explore deviations between corpus data and a standard/target: Two suggestions. Corpora, 9(1), 109–136. https://doi.org/10.3366/cor.2014.0053

MacWhinney, B. (2000). The CHILDES Project: Tools for Analyzing Talk. Lawrence Erlbaum Associates.

Meunier, F., Castello, E., Ackerly, K., & Coccetta, F. (2016). Introduction to the LONGDALE project. In Studies in Learner Corpus Linguistics: Research and Applications for Foreign Language Teaching and Assessment (pp. 12–126). Peter Lang.

Mitchell, R., Tracy-Ventura, N., & McManus, K. (2017). Anglophone students abroad: Identity, social relationships and language learning. Routledge, Taylor & Francis Group.

Myles, F. (2005). Interlanguage corpora and second language acquisition research. Second Language Research, 21(4), 373–391. https://doi.org/10.1191/0267658305sr252oa

Myles, F. (2015). Second language acquisition theory and learner corpus research. In S. Granger, G. Gilquin, & F. Meunier (Eds.), The Cambridge Handbook of Learner Corpus Research (pp. 309–332). Cambridge University Press. https://doi.org/10.1017/CBO9781139649414.014
