A Dynamic Perspective on Second Language Acquisition
Caspi, Tal

IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document version below. Document Version Publisher's PDF, also known as Version of record

Publication date: 2010


Citation for published version (APA): Caspi, T. (2010). A Dynamic Perspective on Second Language Acquisition. s.n.


A Dynamic Perspective on Second Language Development

Tal Caspi


The work in this thesis has been carried out under the auspices of the School of Behavioral and Cognitive Neuroscience (BCN) and the Center for Language and Cognition Groningen (CLCG). Both are affiliated with the University of Groningen.

Groningen Dissertation in Linguistics 85
ISSN 0928-0030
ISBN 978-90-367-4526-0

© 2010 Tal Caspi


RIJKSUNIVERSITEIT GRONINGEN

A Dynamic Perspective on Second Language Development

Doctoral thesis

to obtain the degree of doctor in Arts at the Rijksuniversiteit Groningen, on the authority of the Rector Magnificus, dr. F. Zwarts, to be defended in public on Thursday 7 October 2010 at 16:15

by

Tal Caspi


Supervisor: Prof. dr. C.L.J. de Bot
Co-supervisor: Dr. W. M. Lowie

Assessment committee:
Prof. dr. D. Larsen-Freeman
Prof. dr. N. Schmitt
Prof. dr. P. van Geert


TABLE OF CONTENTS

CHAPTER 1 INTRODUCTION

CHAPTER 2 THE DYNAMICS OF L2 DEVELOPMENT

2.1 Background: Dynamic Systems Theory
2.1.1 Key concepts in DST
2.1.1.1 Nestedness and ongoing interaction
2.1.1.2 Iterative growth
2.1.1.3 Self-organization
2.1.1.4 Limited resources
2.1.1.5 Precursors and dependents
2.1.1.5.1 The island metaphor
2.1.1.6 The value of variation
2.1.2 Models of dynamic growth and interaction
2.1.3 Summary

CHAPTER 3 INVESTIGATING LANGUAGE DEVELOPMENT FROM A DYNAMIC PERSPECTIVE

3.1 Introduction

3.2 General study design
3.2.1 Data collection
3.2.2 Data description and exploration (variability analyses)
3.2.3 Model specification
3.2.4 Model fitting
3.2.5 Considering extensions

3.3 Methods of growth and variability analysis
3.3.1 Growth trajectory plots
3.3.2 De-trended data values (residuals)
3.3.3 Moving correlation
3.3.4 Smoothing by local regression: spline interpolation
3.3.5 Pitfalls of variability analyses

3.4 Modeling a dynamic system: general considerations
3.4.1 A basic growth model
3.4.2 Modeling connected growth
3.4.3 Precursor interactions: unidirectional support
3.4.4 Precursor interactions: unidirectional support and competition
3.4.5 Precursor interactions: bidirectional support by level; unidirectional competition by change
3.4.6 Aggregated support and competition

3.5 Summary

CHAPTER 4 A DYNAMIC PERSPECTIVE ON L2 VOCABULARY KNOWLEDGE

4.1 Introduction
4.1.1 Background: Overall vocabulary growth
4.1.2 Aspects of vocabulary knowledge
4.1.3 The receptive-productive gap
4.1.4 A dynamic perspective on vocabulary knowledge
4.1.5 Research questions and predictions

4.2 Methodology
4.2.1 Participants and procedures
4.2.2 Materials
4.2.2.1 The knowledge paradigm
4.2.2.2 Free production
4.2.2.3 Controlled production
4.2.2.4 Recall and recognition
4.2.2.5 Testing and scoring considerations

4.3 Results
4.3.1 The data at a glance
4.3.1.1 Linear trends
4.3.2 Correlations
4.3.2.1 The Portuguese speaker
4.3.2.2 The Mandarin speaker
4.3.2.3 The Vietnamese speaker
4.3.2.4 The Indonesian speaker
4.3.3 Summary of the correlation analyses
4.3.4 Linear regression analysis
4.3.5 De-trended values: residuals
4.3.6 Moving correlations
4.3.7 Summary of the variability analyses
4.3.8 A precursor model of vocabulary knowledge development
4.3.8.1 Model optimization
4.3.9 Model outcome
4.3.9.1 The Portuguese speaker
4.3.10 Generalizing the model
4.3.10.1 The Mandarin speaker
4.3.10.2 The Vietnamese speaker
4.3.10.3 The Indonesian speaker
4.3.11 Summary of the modeling procedures

4.4 Summary

4.5 Discussion

CHAPTER 5 DYNAMICS OF L2 WRITING DEVELOPMENT

5.1 Introduction
5.1.1 Complexity vs. accuracy
5.1.2 Lexical vs. syntactic development
5.1.3 A combined paradigm of the complexity-accuracy and lexical-syntactic distinctions
5.1.4 Research questions and predictions

5.2 Participants, materials and procedures
5.2.1 Lexical complexity indexes
5.2.2 Lexical accuracy indexes
5.2.3 Syntactic complexity indexes
5.2.4 Syntactic accuracy indexes

5.3 Results: data and variability analyses
5.3.1 Complexity vs. accuracy in the lexical dimension
5.3.2 Complexity vs. accuracy in the syntactic dimension
5.3.2.1 Representativeness of the within-syntax findings
5.3.3 Interactions within and between lexicon and syntax, across complexity measures
5.3.4 Interactions between general measures of complexity and accuracy in lexicon and syntax
5.3.5 Summary of the data analyses

5.4 Modeling complexity and accuracy in lexicon and syntax
5.4.1 The Portuguese speaker
5.4.1.1 Comparing correlation matrixes across the data, spline and model
5.4.1.2 Validating the interpretation of the data and variability analyses
5.4.2 The Mandarin speaker
5.4.3 The Vietnamese speaker
5.4.4 The Indonesian speaker
5.4.4.1 Testing a different precursor hierarchy

5.5 Summary

5.6 Discussion

CHAPTER 6 GENERAL SUMMARY AND DISCUSSION

6.1 Summary
6.1.1 Differences between the vocabulary knowledge and writing performance models

6.2 Discussion
6.2.1 Implications

REFERENCES

DUTCH SUMMARY


Chapter 1 Introduction

“The noise is the signal” (Landauer, 1998)

This frequently-quoted phrase is the title of a groundbreaking article by Rolf Landauer, one of the key figures in information physics. It summarizes the discovery that seemingly random fluctuations in physical measurements over time, which are excluded by the averaging inherent to many measurement tools, hold valuable information about interactions between particles. Although applied linguistics is undeniably a much “softer” discipline than physics in its subject matter and methodology, and thus deals with much “noisier” data, it has only recently begun to similarly consider the value of noise as indicative of underlying processes involved in language development. In the context of language development, noise is the nonlinearity and variability that accompany linguistic performance at any proficiency level. Language acquisition is markedly complex and elusive, involving numerous dimensions that develop at varied and often nonlinear rates, and a high degree of variation not only between individuals (inter-learner), but also within the performance of single learners over time (intra-learner) (de Bot, Lowie, & Verspoor, 2005). Inter-learner variability has been studied extensively, with numerous factors pinpointed as its determinants, although a large part of it remains unexplained (see R. Ellis, 1994 for a review). Investigating intra-learner variability in case studies of L2 acquisition, however, is a fairly new endeavor. In broad terms, language development can be described as constant change or flux (Larsen-Freeman & Cameron, 2008). This definition may be intuitively appealing to many L2 learners: as a nonnative (and none-too-frequent) Dutch speaker, I experience days in which even a rudimentary dialogue feels like a struggle, while on other days speaking Dutch seems much less demanding.
Certain aspects of my performance appear to stagnate or even deteriorate, for example verb conjugation or sentence word order, whereas others, like my vocabulary, seem to grow constantly. Improvement in one area often appears to be offset by decline in another. For instance, while my tense use may occasionally be quite accurate, another problem, such as using the wrong preposition, might reemerge. Even my relative forte of vocabulary knowledge is not entirely reliable: whereas on some days I struggle to retrieve fairly


frequent words, on others I surprise myself by uttering words that I had no idea I knew. These are, of course, strictly subjective anecdotes from a personal experience of L2 acquisition. Yet, as Larsen-Freeman and Cameron state, “everyone who has studied language acquisition knows that it is both systematic and variable” (2008, p. 21). This claim is expanded by Lowie, de Bot, and Verspoor: There is sufficient evidence that second language development goes in leaps and bounds with periods of stability and instability and identifiable stages in the nonlinear developmental pattern. During the process of language acquisition, periods of rapid acquisition are followed by periods of delayed acquisition or even attrition (2009, p. 127).

However, the fundamental nonlinearity of language development is often a mere footnote in published studies. Most of these studies aim to establish static, linear connections between aspects of performance and factors that are presumed to influence them, usually across groups of learners. The accessibility of statistical software has contributed to the widespread application of linear analyses, originally targeted at large populations, to what is essentially a highly individualized and variable process. Such analyses are based on measures of central tendency, which exclude variability as error or noise. While there is no doubt that the linear approach has led to invaluable discoveries in applied linguistics, it invariably overlooks the multidimensional complexity of language learning. In recent years, Dynamic Systems Theory (DST) has emerged as an alternative and complementary perspective to the cross-sectional and linear approach to applied linguistics. DST is “the most widely used, most successful, most thoroughly developed and understood descriptive framework in all of natural science” (van Gelder & Port, 1995, p. 328). It is concerned with describing change in complex systems over time in virtually any field, and has been applied to phenomena as varied as mineral crystallization, ecological equilibriums or epidemiological spread. DST defines and explores the ways in which processes unfold: how their developing components interact, and how these interactions yield unique and varied growth patterns. From this perspective, "far from simply reflecting noise in our measuring instruments or variability in low-level aspects of physiological maturation", variability patterns provide "a window onto the correlates and (by inference) the causes of developmental change" (Bates, Dale & Thal, 1995, p. 1). As a general and


ecologically-valid paradigm, DST offers a way of reconciling general growth trends with individuated variability patterns, and accounting for both as inherent aspects of development. The crossover of DST from the physical and natural sciences to ecological, developmental and cognitive psychology has revolutionized these fields, including their approach to language learning. Via cognitive psychology, DST has been extended to language acquisition, first of early L1 (van Geert, 1991), and later of L2 (Larsen-Freeman, 1997). The key principles of DST align with the noisy reality of language acquisition: constant interaction between co-developing aspects of knowledge and performance over time; shifts in these interactions that derive from the structure of language and the limited resources of learners and their environments; and ensuing nonlinear and nonparallel growth of various linguistic subsystems and their components (de Bot, Lowie & Verspoor, 2007). The two empirical studies in this dissertation investigate the applicability of these basic dynamic principles to two areas of second language acquisition (SLA). The first study focuses on vocabulary knowledge, consisting of four levels that range from least to most productive. The second concentrates on writing performance, as expressed in the complexity and accuracy of its lexical and syntactic dimensions. Both studies examine longitudinal data from four case studies of advanced L2 acquisition. They combine analyses of central trend with investigations of local variability, and complement their results with models based on generic dynamic equations. Together, the studies are intended to explore the potential of the dynamic approach in supplementing the existing body of research in SLA. This thesis begins with two background chapters. The first provides a theoretical overview of the key principles of DST and their applicability to L1 and L2 acquisition. 
The second explains the methodology associated with DST, namely longitudinal case study design, variability analyses and mathematical modeling of dynamic processes (Chapter 3). Chapters 4 and 5 describe the vocabulary and writing studies, respectively. The final chapter (Chapter 6) is a summary and discussion of the implications of the dynamic approach, its limitations and extensions.


Chapter 2 The dynamics of L2 development

What is time? A secret – insubstantial and omnipotent. A prerequisite of the external world, a motion intermingled and fused with bodies existing and moving in space. But would there be no time, if there were no motion? No motion, if there were no time? What a question! Is time a function of space? Or vice versa? Or are the two identical? An even bigger question! Time is active, by nature it is much like a verb, it both "ripens" and "brings forth." And what does it bring forth? Change! Now is not then, here is not there – for in both cases motion lies in between. But since we measure time by a circular motion closed in on itself, we could just as easily say that its motion and change are rest and stagnation – for the then is constantly repeated in the now, the there in the here (Thomas Mann, The Magic Mountain, 1924, p. 409).

The subtlety and complexity of cognition is found not at a time in elaborate static structures, but rather in time in the flux of change itself (van Gelder, 1999, p. 245).

In a nutshell, the message of this book is that the human brain is constantly in motion (Spivey, 2007, p. 7).

2.1 Background: Dynamic Systems Theory

The main aim of linguistic research is to describe mechanisms that underlie linguistic acts. In a broader framework, this objective is referred to as the structural approach to psychology (Spivey, 2007). It is predominantly pursued by “an empirical method sufficient to reduce observed behavior to underlying cognitive processes. This, in turn, requires that effects of cognitive processes at some level of description combine linearly to make up the whole of the behavior in question (plus random noise)” (van Orden, 2002, p. 1). Such linear data reduction assumes a direct and stable influence of cognitive linguistic constructs on behavioral variables, and that these variables in turn reflect a linear increase and a direct effect of these constructs. Consequently, even qualitative, longitudinal linguistic studies “appear to be driven less by developmental questions of magnitude and rate essential to the characterization of the range of normal variation in language development for specific populations and contexts, and more by the concern to establish statistical significance in the data” (Ortega, 2003, p. 513). Many longitudinal studies of L2 development are concerned with asserting universal acquisition orders of linguistic features, and thereby establishing differences between L1 and L2 acquisition. They thus encapsulate “cognitive psychology’s


symbolic, stage-based, information-processing approach” (Spivey, 2007, p. 48), which reduces language development to a series of discrete events. However, nothing in this world remains at standstill, including human development. Whether physiological, behavioral or cognitive, development is change over time, a motion from one state to the next. It is constant and continuous, regardless of the fact that, viewed through the prism of linear analyses, it may appear as a series of discontinuous transitions. Like any developmental phenomenon, language is determined by a multitude of mutually-influential processes that are virtually impossible to disentangle. Micro-level changes in word representation in individual minds are expressed in performance across various modes of communication, leading to macro-level changes in the abstract, shared entity of language. Similarly, the feedback effect of global language change affects the language use and knowledge of individuals (Spivey, 2007; Larsen-Freeman & Cameron, 2008). Thus constant change cannot be extricated from complex interaction between language components, which occurs at all calibrations and time scales. DST addresses these fundamental traits, indeed the very core, of language development. It is a process theory that poses questions about underlying interactions between language components and their developmental expression. The dynamic approach considers language development “a more inconclusive construct – a competence in self-organization within linguistic processes that may be common to other linguistic processes (and nature at large)” (van Orden, 2002, p. 3). It therefore explores language development by utilizing generic principles, applicable across various natural systems, as analytical and explanatory tools. 
The potential contribution of DST to applied linguistics lies in its incorporation of time into otherwise static models, and its description of development as arising from iterated interactions rather than fixed and predetermined external effects. The acknowledgement of this fluidity leads to an emphasis on individual developmental pathways as inseparable from the ongoing interactions within the complex architecture of language (Smith & Thelen, 1993). Such pathways are seen as manifestations of the overall process of development. Consequently, outliers are regarded as atypical expressions of prototypical processes. The dynamic approach therefore bridges the divide between generalizations about language development and frequently noted individual differences between learners, even given similar


characteristics and circumstances (de Bot, 2007, 2008). Such differences otherwise remain unexplained, or explained by numerous factors (Skehan, 1991). The literature on the dynamic nature of language structure, use and development, in both communities and individual users, can be roughly divided into two complementary branches: DST and complexity theory. As noted by Larsen-Freeman and Cameron (2008), both theories employ the terms dynamic systems, complex systems, adaptive systems and their various combinations interchangeably. Although complexity theory and DST are highly compatible in their approach to language development, overlapping to a great extent (Larsen-Freeman, 2007), there are nevertheless certain differences between the two approaches. Complexity theory, as its name suggests, frequently emphasizes the complex nature of language and its development, while DST emphasizes their dynamism, in the sense of identifying interactions between components and mathematically describing them. Since this thesis is concerned with this type of methodology, it usually uses the term dynamic, even when referencing complexity theory sources. The following sections elaborate on the key principles and terminology of the dynamic approach, while relating them to the central characteristics of language development.

2.1.1 Key concepts in DST

Predictably, the keywords in DST are “system” and “dynamic”. A system is commonly defined as a collection of “components that affect and change one another over time” (van Geert, 2003, p. 655). The interactions between these components give rise to the holistic entity of the system as “dynamically stable and self-maintaining” (van Geert, 1993, p. 268). Simple systems, such as a stoplight, are composed of a relatively small number of elements, whereas complex systems, such as a forest, are multi-componential and intricate ensembles of components. Interactions between these components, and therefore systemic development as a whole, are nonlinear and occur on all levels (Larsen-Freeman & Cameron, 2008). Dynamic systems are “complexes of parts or aspects which are all evolving in a continuous, simultaneous, and mutually determining fashion” (van Gelder & Port, 1995, p. 13). From this distinction, it can be seen that the main shortfall of the prevailing approach to language development is its reduction of a complex dynamic system to a simple one.


2.1.1.1 Nestedness and ongoing interaction

Components or processes of a complex dynamic system are themselves complex dynamic (sub)systems, for example the plant and animal populations of the forest, between and within which numerous interactions occur. As this example illustrates, dynamic systems are “often interlinked on all possible levels” (van Geert, 2003, p. 658). This quality is referred to as nestedness. It is easy to see how it pertains to language: various structural linguistic components (and corresponding developmental processes) are embedded in a complex and multilayered structure and interact across all time units (Spivey, 2007). As Lowie et al. state,

Language systems are complex sets of interacting variables at many different levels and sub-levels. Examples of levels are cultural, social, psychological and linguistic. Within each of these levels there are again many different sub-levels. For instance, within the linguistic sub-systems there is the sound system, the lexicon, the grammar and so on. These systems and their subsystems are interconnected (2009, p. 126).

Lowie et al. also point out that “the language system is nested within the cognitive system which is itself nested in the physical system of the body” (2009, p. 140). These nested systems are, in turn, embedded in a physical and cognitive environment comprised of numerous and similarly complex systems of individual bodies, minds, languages and their interactions within a variety of socio-cultural contexts. From a dynamic point of view, none of these systems and underlying processes are separable, as pointed out by Spivey (2007) in his account of language variation and change as a dynamic process. However, since it is impossible to simultaneously consider all of these nested levels, any account of language development must reduce this complexity by focusing on a restricted number of aspects. Because components of language are the dynamic product of developmental processes, which are in turn comprised of sub-processes on various time scales, the concept of nestedness is interlinked with the temporal or timescale aspect of development. For example, the forest ecosystem is comprised of soil, water, climatic conditions, flora and fauna. The same ecosystem can also be described as “a set of interconnected processes: processes of growth and decay, of feeding and digestion, of mating and breeding” (Larsen-Freeman & Cameron, 2008, p. 28). Each of these processes can generally be referred to as iterative growth.


2.1.1.2 Iterative growth

In the nested hierarchy of language, growth (whether in the form of increase or decline) arises from two factors: the passage of time and the preceding state of the system. The complex and nonlinear nature of a dynamic system implies that its prediction on the basis of time alone is impossible, but that

At any particular moment, the system is affected by whatever environmental inflow occurs at this particular time, and, equally importantly, by the system’s preceding change. This property turns the changes that the system undergoes into what is called an iterative process. An iterative process takes the output of its preceding state (that is, the change it underwent in the immediate preceding moment) as the input of its next stage (van Geert, 2003, p. 657).

Due to iteration, dynamic systems are extremely sensitive: minute differences in initial values can yield drastically different outcomes. The most well-known example of this sensitivity is the butterfly effect, first noted in weather systems (Lorenz, 1972). This effect is sudden, unpredictable and catastrophic change, arising from a small perturbation or initial state difference. In order to illustrate it, this chapter will briefly shift from the theoretical to the mathematical description of DST. The basic mathematical definition of a dynamic system is “a set of quantitative variables changing continually, concurrently, and interdependently over quantitative time in accordance with dynamical laws described by some set of equations” (van Gelder, 1999, p. 244). A very simple system, which can be depicted in a single equation, can behave in a complicated way, settling into one of two (or more) distinct configurations of value ranges. This quality is best illustrated by the basic logistic growth equation, which has been used to describe various developmental phenomena, including learning. The equation was discovered by Pierre Verhulst, who used it to model population dynamics on the basis of available resources (such as food). Below is the differential version of the equation (the difference version appears in the following chapter).

$$\frac{\Delta P}{\Delta t} = rP\left(1 - \frac{P}{K}\right)$$

Equation 1. The Verhulst formula

In the equation, P denotes the population size, t marks time, ∆ is change or difference, r is the growth rate (proportional increase in a single time unit) and K is the carrying capacity , which is “the final attainable growth level based on the resources available” (van Geert, 1995, p. 316). Early growth is expressed in the first


(linear/exponential) term of the equation, rP. This growth is curbed by the second, nonlinear term (1-P/K), which shrinks towards zero as the population size increases towards its carrying capacity. The resulting damping effect reduces the effective growth rate until the population ceases growing. This can be seen in the typical S-shape that the equation generates, which is prototypical for many developmental and learning phenomena (see section 3.4.1 for an application of this equation to lexical growth and a comparison of the S-curve to empirical data).
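The S-shaped trajectory described above can be reproduced with a few lines of Python. This is only a minimal sketch: it assumes a simple Euler step of size `dt` as the discretization of the Verhulst formula (the difference version used in the thesis appears in the following chapter and may differ), and the function name and parameter values are hypothetical choices made for illustration.

```python
def logistic_growth(p0, r, k, steps, dt=1.0):
    """Approximate dP/dt = r * P * (1 - P / K) with repeated Euler steps.

    p0 is the initial level, r the growth rate, k the carrying capacity.
    The Euler discretization is an assumption made for this sketch.
    """
    trajectory = [p0]
    for _ in range(steps):
        p = trajectory[-1]
        # Each new value is computed from the preceding one (iteration).
        trajectory.append(p + dt * r * p * (1 - p / k))
    return trajectory


# Hypothetical values: a small initial level growing towards K = 1.
curve = logistic_growth(p0=0.01, r=0.2, k=1.0, steps=100)
increments = [b - a for a, b in zip(curve, curve[1:])]
```

In this sketch the increments first grow (the near-exponential phase) and then shrink as the level approaches the carrying capacity, which is what gives the curve its S-shape.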

[Figure 1: Population (vertical axis) plotted against Iterations (horizontal axis)]

Figure 1. The logistic map: a bifurcation plot (based on van Geert, 1994)

Figure 1 includes 150 trajectories, generated by the logistic equation on the basis of iterated r values with a fixed and small difference between them. The onset value of each trajectory is an iteration of the final value of its predecessor (in other words, these are segments of a single continuous trajectory). This type of plot is known as the logistic map. What determines its behavior is the value of r. When r is between 0 and 1, P will eventually diminish (the population will become extinct). When r is between 1 and 2, the value of P stabilizes. The most interesting development occurs when r is between 3 and 4, at which point the population begins behaving chaotically, shifting into constant oscillation, as can be seen in the last 20 iterations in the plot. Although the change appears abrupt and discrete, it is in fact continuous. Despite the gradual and consistent change in r values, the logistic map shows an abrupt change between two distinct qualitative states or value ranges, which is known as a bifurcation. The value ranges that the system appears to prefer are referred to as attractors, while the values that the system appears to avoid are called repellors. The map thus illustrates how continuous, slight and gradual change in


initial values can generate dramatically different outcomes. These discrete alternatives may be interpreted as the result of separate mechanisms or events, rather than as arising from a gradual and continuous process. The logistic map also demonstrates the susceptibility of dynamic processes to small perturbations and their unpredictable results. Such a model is also the simplest example of sudden explosive change in a dynamic system, known as a catastrophe, and can be used to predict seemingly-abrupt transitions between two distinct states, such as animal flight or fight behavior. Van Geert (2003) related the sensitivity of dynamic systems to Piagetian learning theory, in which gradual minute growth in working memory capacity culminates in an acquisition explosion (in this particular case, of the conservation principle by infants). A similar effect has been repeatedly observed in early L1 lexicon, and is referred to as the “vocabulary burst” (Marchman & Bates, 1994; see elaborations in the following chapters). Other catastrophic qualities of early language development have been investigated by Ruhland (1998), who inspected variability patterns in case study data for various preceding signs of such “explosions”. In SLA, there is also evidence suggesting the operation of similar stochastic processes, since even given very similar learner characteristics and environmental conditions, there are still large individual differences between learners (de Bot, 2007, 2008).
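The qualitative regimes of the logistic map, and its sensitivity to initial values, can be explored with the following sketch. It assumes the standard textbook form of the map, x_next = r * x * (1 - x), with the population scaled so that the carrying capacity equals 1; this may not be the exact form used to generate Figure 1. The function name, starting values and the three r values are hypothetical choices made for illustration.

```python
def logistic_map(x0, r, steps):
    """Iterate x_next = r * x * (1 - x) and return every visited value.

    Assumes the population is scaled to a carrying capacity of 1.
    """
    values = [x0]
    for _ in range(steps):
        x = values[-1]
        values.append(r * x * (1 - x))
    return values


# One run per regime discussed in the text (r values are illustrative).
extinct = logistic_map(0.2, r=0.5, steps=200)    # r below 1: dies out
stable = logistic_map(0.2, r=1.5, steps=200)     # settles on one level
chaotic_a = logistic_map(0.2, r=3.9, steps=200)  # irregular oscillation

# Same r, but an initial value that differs by one part in a billion.
chaotic_b = logistic_map(0.2 + 1e-9, r=3.9, steps=200)
largest_gap = max(abs(a - b) for a, b in zip(chaotic_a, chaotic_b))
```

Comparing `chaotic_a` with `chaotic_b` shows the two runs drifting apart even though their starting points are practically identical, whereas the same tiny difference leaves the outcome unchanged in the extinction and stable regimes.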

2.1.1.3 Self-organization

The bifurcation plot illustrates the development of a dynamic system as a function of time and the interaction between its components, which, in the case of the Verhulst formula, are only the population size and available resources. However, neither time nor these interactions can directly explain the macro-level change in a dynamic system. In other words, change is nonlinear: growth is disproportionate to input since “the effect of a dynamic process differs from the sum of its parts” (van Geert, 2003, p. 657). In SLA, this nonlinearity is seen in the fact that change is not a simple derivative of the “resources invested” (Lowie et al., 2009, p. 240) – whether these are internal or external to the learner, or simply the passage of time. An increase in complexity or order independently of the direct effect of an external source, such as distinct attractor-repellor patterns that arise from internal systemic dynamics rather than external influences, is referred to as self-organization. As the name implies, this concept embodies the idea that an autonomous increase in the structure of a system is not governed or directed by a specific force other than the


internal dynamics of the system itself. Self-organization is a key feature in current cognitive theories (Smith & Thelen, 1993; Spivey, 2007; van Gelder, 1999), and therefore also pertains to language development. According to van Orden (2002) and van Geert (2003), it is likely that language development, like other natural phenomena involving growth over time, is a process of self-organization. Similar ideas have been expressed in recent applied linguistics publications (e.g., de Bot et al., 2005; Larsen- Freeman & Cameron, 2008. 1 The idea of self-organization is compatible with emergentist and usage-based conceptualizations of language as an entity that increases in complexity due to the process of its use (Ellis, 1998). In this context, van Geert also points out that since self-organization can account for the increasing complexification of language without assuming direct external influences, it has explanatory power that rivals the supposition of an innate and specialized language mechanism, which is based on the observation that input is too poor to account for child language (e.g., Chomsky, 1965). On a different scale of language development, self-organization can account for the of creoles from pidgins and of new languages from creoles (van Geert, 2003). As these examples illustrate, across the nested structure of dynamic systems, self-organization occurs on many levels “from only a little increment in the structure provided, to the building of very complex structures” (van Geert, 2003, p. 659). By coupling logistic equations and incorporating additional parameters in them, more complex phenomena that involve robust self-organization can be modeled. Such mathematical models capture not only the iterative growth depicted in the basic logistic equation, but also locally-occurring interactions between systemic components that give rise to macro-level self-organization when iterated (van Gelder & Port, 1995). 
Coupled growth equations have been used to simulate empirical developmental data, and are similarly applied in the studies included in this thesis (see sections 3.4.2-3.4.5 for examples and explanations of such equations). Like the Verhulst formula, the concept of coupled growth relies on the basic assumptions that growth in a natural system requires resources, and that these resources are invariably limited. The following sections elaborate on these two premises and their implications for language development.
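The idea of coupling logistic equations can be sketched as follows. This is a generic illustration under stated assumptions, not the equations applied in this thesis (for those, see sections 3.4.2-3.4.5): each grower follows logistic growth towards its carrying capacity, and an added term proportional to the level of the other grower acts as support when positive and as competition when negative. All names and parameter values are hypothetical.

```python
def grow(level, rate, capacity, coupling):
    """One step of logistic growth plus a coupling term from another grower.

    `coupling` is (coefficient * level of the other grower): positive values
    act as support, negative values as competition. Assumed form, for
    illustration only.
    """
    change = rate * level * (1.0 - level / capacity) + coupling * level
    return max(level + change, 0.0)  # a level cannot drop below zero


def simulate(steps=100, start=0.05, rate=0.2, capacity=1.0, a_on_b=0.0, b_on_a=0.0):
    """Iterate two growers, A and B, that may influence each other."""
    a, b = [start], [start]
    for _ in range(steps):
        a_now, b_now = a[-1], b[-1]
        a.append(grow(a_now, rate, capacity, b_on_a * b_now))
        b.append(grow(b_now, rate, capacity, a_on_b * a_now))
    return a, b


_, b_alone = simulate()                    # no interaction
_, b_supported = simulate(a_on_b=0.05)     # A supports B
_, b_hindered = simulate(a_on_b=-0.05)     # A competes with B
print(round(b_alone[-1], 3), round(b_supported[-1], 3), round(b_hindered[-1], 3))
```

In this sketch, the same grower settles at a different level depending solely on its relation to the other grower, even though its own growth rate and carrying capacity are unchanged.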

1 Beyond the individual level of language acquisition, the evolutionary emergence of language is also considered dynamic and self-organizational (Komarova & Nowak, 2001; Komarova, Nowak, & Niyogi, 2001; Nowak & Komarova, 2001).

2.1.1.4 Limited resources

Growth of any kind “requires resources to keep the process going” (van Geert, 1995, p. 314). Concerning learning, several types of relevant resources can be distinguished: internal, such as memory capacity; temporal, such as time spent on skill acquisition; informational, such as the amount of available knowledge; and energetic, such as effort and motivation (van Geert, 1995). In the context of language learning, resources encompass factors such as working memory, attention, and effort, all of which influence the acquisition process (Elman, 1995). These resources can be subdivided further. For instance, effort can be seen as comprised of elements such as instruction, learner motivation and attitude towards the target language, language exposure and usage, learner physiognomy or neurology, and interaction with native speakers (Lowie et al., 2009). In DST, all of these factors are collapsed under a general umbrella term for “all the available internal and external factors that enable the development of a dynamic system” (Lowie et al., 2009, p. 128), unless the focus of a particular study is on a specific resource. According to van Geert, “as far as the growth process is concerned, resources have two major properties. First, they are limited. Second, resources are interlinked” (1995, p. 315). An example of the first property is the limited capacity of working memory (Baddeley, 1990), or the time that needs to be allocated to one aspect of language at the expense of another (Skehan & Foster, 1997). These inherent limitations are expressed in the notion of carrying capacity, which is the maximal attainable level of a particular systemic component given the resources available for its growth. Clearly, since resources are limited, so is the carrying capacity, as depicted in Equation 1.
Resource limitations and their curbing effect on growth are not directly addressed by most language acquisition accounts, which may only emphasize capacities and positive growth potential. In contrast, from a dynamic perspective, resource limitations are not only a curbing factor on what would otherwise be exponential growth, but a crucial developmental “driving force” (van Geert, 2003, p. 656). This is because resource limitations determine the interactions between the co-developing systemic components that require them. However, this does not imply that these components will only compete for resources: certain skills may eventually support each other, once they reach a threshold value (van Geert, 1993). Thus, there may be seemingly paradoxical interactions of mutual competition and support between co-developing systemic components. Such components, known as growers, can be classified into a structural hierarchy of precursors and dependents.

2.1.1.5 Precursors and dependents

As section 2.1.1.2 has demonstrated, growth in dynamic systems is continuous even when it may appear to be step-wise. In this context, Larsen-Freeman points out that, in L2 development there are no discrete stages in which learners’ performance is invariant, although there are periods in which certain forms are dominant (…). There appears to be a need for the necessary building blocks to be in place in sufficient critical mass to move to a period where a different form dominates (2006b, p. 592).

The term critical mass has been used by studies of early L1 acquisition to refer to the fact that a certain amount of known words is a prerequisite for the emergence of grammar. It is compatible with the notion of precursors, which are growers that need to develop to a certain level in order to enable the development of other, dependent growers (cf., van Geert, 1991, 1995). Interactions across a hierarchy of precursor and dependent growers are generally referred to as the precursor model. Versions of this model have been identified in various areas of human development (Fischer, 1980; Fischer & Paré-Blagoev, 2000; Robinson & Mervis, 1998), including language (Bassano & van Geert, 2007; Robinson & Mervis, 1998; van Geert, 2003; Verspoor, Lowie, & van Dijk, 2008b). One of the clearest illustrations of the basic premises of the precursor model is van Geert’s (1993) island metaphor. This metaphor shows how natural variation reflects underlying precursor interactions in a hierarchy of species, given limitations on natural resources. Needless to say, island ecosystems and language development are far from identical. However, the metaphor serves as a simplified but useful demonstration of the relevance of the precursor model to language development.

2.1.1.5.1 The island metaphor

Several fundamental premises make cognitive and specifically language development conceptually analogous to a hypothetical island. Island environments, although widely varied, exhibit general similarities based on fundamental evolutionary principles. Likewise, language learners exhibit overall similarities despite their unique individual variation. One of the basic principles that pertain to all islands is a hierarchical order of species, in which dependent species rely on their precursors. For instance, a plant population precedes and is a precondition for that of insects, which depends on it. In turn, insects support plants by enabling their fertilization cycle. Higher in the hierarchy, both plants and insects precede and support certain birds, reptiles, and mammals, while the latter in turn precede and support a carnivore population. Similarly, language development shows universal acquisition orders of various dimensions, which, in generativist theory, are considered as independent of typology or the amount and quality of input. For instance, within the lexicon, nouns are seen as default predecessors to verbs, with several studies showing that a quantitative advantage of nouns over verbs is maintained in L1 and L2 acquisition across languages (Gentner, 2006).2 As this example shows, although precursor interactions have so far been indicated primarily in L1 acquisition, there is evidence for their applicability in L2 development. Accordingly, VanPatten, Williams, and Rott posit that “[L2] learners must achieve specific learning milestones, or complete developmental stages” (2004, p. 14), before acquiring new linguistic devices. This claim is in line with Larsen-Freeman’s assertion that despite its continuous and highly varied nature, the process of SLA relies on critical thresholds of various features (2006b). As Ellis also notes,

The basic idea here is that complex structures are built of prior structures: a new construction can only be acquired if the learner has already acquired the relevant representational building block or if they have sufficient working memory capacity, phonological short-term memory span, or other aspects of general language processing resource, to be able to use the structure (2004, p. 64).

2 Since nouns precede verbs across languages, this finding is not attributable to the complexity of verb conjugation, but to the structural qualities of language, namely the relative instability of verb meanings in comparison with (mostly concrete) nouns (Gentner, 2006).

Another universal principle which is applicable across islands is the survival of the fittest. For the most part, species that inhabit islands are “not directly imported from elsewhere” (van Geert, 1993, p. 269), but evolve locally from life forms that migrate through “functionally narrow channels that strongly confine the speed, order, and content of the migratory process” (p. 268), i.e., air or water. Therefore, although islands are embedded in an environment, their relative isolation implies that migration of species is restricted. Species which migrate more easily via these channels, such as seed plants, birds and insects, will dominate the colonization process of a newly formed island. Likewise, any type of human learning “takes place via the functionally limited channels of speech, nonverbal behavior, help, and so forth” (van Geert, 1993, p. 268). Like the metaphorical island, which is likely embedded in an archipelago, itself located near larger islands which are closer to the mainland (in fact, yet another island, of a larger scale), the language of individual learners is embedded in a nested complex structure. Migration of species from the mainland to larger islands, and from these to smaller, newly-formed and remote islands is akin to the influence of the language community on the language of an individual adult user, which is eventually transferred to a developing child. Van Geert further suggests that natural selection, the probabilistic process that determines the success of species migration, plays a similar role in skill acquisition. This view is adopted in studies that relate early vocabulary learning to word distribution: more frequent and more basic words “migrate” more easily to early L1 (McMurry, 2007). In L2 vocabulary acquisition, simulations based on probability matrices have indicated the applicability of similar principles (e.g., Meara, 1989, 1997). Apart from the universal principles of natural order, species migration, and survival of the fittest, each island ecosystem exhibits unique natural variation. This variation stems from the fact that survival and arrival of species are not the sole determinants of the evolutionary process; natural selection manifests not only in migration but also in the adaptation of colonizing species to their new environment, namely to the unique conditions in each island. More concretely, species adapt to the constraints of natural resources such as water, climate, topography or soil quality. Interactions between species are also affected by these limitations, and in turn become part of the “resources” that these species need to adapt to.
Likewise, any learned skill takes on new qualities in its learner, in accordance not only with the sequential and the structural properties of knowledge, but with “the constraint of limited material, spatiotemporal, energy and informational resources” (van Geert, 1993, p. 270). In the area of SLA, the unique configuration of these resources can be summed up as individual differences between learners. Ultimately, the island ecosystem maintains a dynamic balance, or attractor state, in which species are engaged in complex, mutually-preserving interactions. Although some of these interactions “are not literally supportive – think of a predator-prey relationship – the maintenance of all the species that populate the island depends on the specific ways in which they collaborate” (van Geert, 1993, p. 267). As mentioned, the general configuration of species, their order and interactions, and the resource limitations that determine these interactions is referred to as the precursor model. Similarly, the components of the language system in a given individual can be conceived of as hierarchical and mutually interdependent “cognitive species” that rely on and compete for the same limited resources, as well as engage in codependent interactions. These interactions result in a dynamic equilibrium, in which certain aspects of language are more advanced, whilst others are less so. In SLA, stagnation or decline in certain skills or aspects of knowledge is generally referred to as fossilization (Selinker, 1972; Selinker & Lakshman, 1992). In line with the dynamic approach, fossilization has been reconsidered as an attractor state to which the system repeatedly returns as a result of its structural hierarchy and resource limitations. As such, fossilization is not fixed but dynamic, yet its recurrence and relative stability can be accounted for (Larsen-Freeman, 2006a; MacWhinney, 2006; see chapter 5 for an elaboration on this view). In sum, DST regards an island ecosystem as the dynamically-stable (rather than fixed) expression of a natural hierarchical order and resource limitations that yield specific inter-species interactions and unique natural variation. The combination of these factors is referred to as the precursor model. Accordingly, the dynamic approach to language development interprets inter- and intra-learner variability in language acquisition as arising from precursor interactions between various components of language, given their general structural hierarchy and learner-specific resource limitations. Studies conducted from this perspective therefore place a strong emphasis on variability as key to uncovering these interactions. The following section elaborates on the value of variation in DST-oriented studies.

2.1.1.6 The value of variation

Idealization of complex systems has often involved the removing of ‘noise’ from data: for example, removing individual variation by averaging across samples. However, it may be precisely this variability that holds the key to how learning happens (Thelen & Smith, 1994, as cited in Larsen-Freeman & Cameron, 2008, p. 40).

DST-oriented studies aim to identify variability patterns around central developmental trends. Such patterns can be considered as manifestations of iterative cycles, also known as fractals, which are a form of self-organization that in turn arises from interactions between systemic components (de Bot et al., 2005; Verspoor, Lowie, & de Bot, 2008a; Verspoor et al., 2008b). There are various techniques for investigating variability, a few of which are used in the empirical studies in this thesis and are thus demonstrated in the following chapter. This section, however, is concerned with theoretical interpretations of variability, particularly within individual learner data. All levels of human behavior and learning, and specifically language development and use, are characterized by variation. Thus “nonlinear phenomena are ubiquitous in psycholinguistic performance (and human behavior at large)” (Van Orden, 2002, p. 3). Variability can be perceived on every time scale, with studies differentiating long-term variation, such as that observed over weeks or months, from short-term variation, observed daily or even during single measurement occasions. In L1, at the macro level of the learner lifespan, periods of rapid growth like the “vocabulary burst” alternate with periods of stagnation and even attrition, if the native language is not used over an extensive time period (de Bot, 2007). Yet even at the micro-level, down to a scale measured in milliseconds, alternations can be observed between rapid and strenuous word retrieval (de Bot & Lowie, 2010; Spivey, 2007). In comparison with L1, L2 exhibits a higher degree of variability. It is apparent in a greater fluctuation between acquisition and attrition, both of which are integral to L2 development (de Bot et al., 2005, 2007; Verspoor, de Bot, & Lowie, 2004). Aside from the time-scale distinction, studies of intra-learner variability further distinguish variation within- and between-components. The former refers to temporal variation patterns around the central growth trend of a single component of language, while the latter refers to the comparison of such patterns between several components (Bates et al., 1995). DST-oriented studies, regardless of the discipline in which they are conducted, place a high value on variation. Depending on the study, there are various interpretations of its empirical importance. These are related, but emphasize different roles or aspects of variability (such as evolutionary, mathematical, or developmental).
First, as demonstrated in the island metaphor, from a Darwinian perspective, variation is the unique result of specific conditions and inter-species interactions. It thus plays an instrumental role in environmental adaptation, as a contributing factor to change: without such variation, amoebas would remain the only species on the planet (Bertenthal, 1999). In catastrophe theory, increased variability is considered one of eight “catastrophe flags” indicating the onset of sudden developmental change (Gilmore, 1981). As the bifurcation plot in section 2.1.1.2 has shown, seemingly sudden transitions between the states of a dynamic system are not discrete, but rather stem from continuous, graded and ongoing change. Increased variability during such constant gradual growth may thus indicate an approaching qualitative shift between attractors. Similarly, in human development, increased variability has also been identified as preceding a qualitative phase shift or “jump”, since it marks a process of trial and error, or unstabilized learning (Thelen & Smith, 1998; van Dijk, 2003). From this perspective, variation is inherent to, and in fact enables, language change and evolution by altering the languages of both individuals and their communities (Larsen-Freeman, 1997). From a general dynamic point of view, data variations “are substantial, stable, and have their own developmental course” (Bates et al., 1995, p. 1). Accordingly, case studies have shown that variation itself changes meaningfully and systematically, and can therefore illuminate hidden aspects of development. For instance, early L1 acquisition can be conceived of as a series of changes in relative frequencies of strategies, leading to a decrease in less mature strategies, rather than an abrupt replacement of older strategies with new ones (Bassano & van Geert, 2007; van Dijk, 2003; van Geert & van Dijk, 2002). This approach has recently been adopted in studies of L2 development (Larsen-Freeman, 2006b, 2009; Spoelman & Verspoor, 2009; Verspoor et al., 2008b; see the following chapters for further reference to these studies). A dynamic system is by definition a conglomeration of interacting components within a nested and stratified structure. Therefore, variation in the development of each component is not only indicative of its own growth process, but is related to variation in the development of other components with which it interacts. In other words, if variation is regarded as essential to development, and internal interactions are considered a source of change in the system, it is intuitively appealing to inspect between-component variation rather than only within-component variation.
The former variation type will be regarded as an expression of internal interactions and hence of systemic self-organization. This approach to variability has been undertaken in several studies (Bates et al., 1995; Robinson & Mervis, 1998; Spoelman & Verspoor, 2009, in press; Verspoor et al., 2008b), among them those in the current thesis. So far, most of these studies have focused primarily on variability analyses rather than on simulations. However, interpretations of such analyses may be corroborated by mathematical dynamic models that depict the componential interactions inferred from them and evaluate their fit to the data (cf., Robinson & Mervis, 1998).
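One simple way of inspecting between-component variation over time is a correlation computed in a moving window across two developmental series. The following is a minimal sketch under stated assumptions: the window size, the function names, and both data series are made up for illustration, and the sketch is not the specific procedure used in the studies cited above.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def moving_correlation(x, y, window=5):
    """Correlation between x and y inside each sliding window of `window` points."""
    return [pearson(x[i:i + window], y[i:i + window])
            for i in range(len(x) - window + 1)]


# Made-up scores for two components over ten measurement occasions:
# they move together at first and in opposite directions later on.
component_a = [1, 2, 1, 3, 2, 4, 3, 5, 4, 6]
component_b = [2, 3, 2, 4, 3, 3, 4, 2, 3, 1]

correlations = moving_correlation(component_a, component_b)
print([round(c, 2) for c in correlations])
```

In data of this kind, a shift from positive to negative values might be read as a change from a supportive to a competitive relation between the two components, although such a reading is an interpretation that needs to be motivated theoretically.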

2.1.2 Models of dynamic growth and interaction

Aside from using variability analyses to observe complexity in a system and indicate shifting interactions between its components, a common application of the dynamic approach is to focus on “some particular aspect of cognition” and posit “an abstract [dynamical system] as a model of the processes involved” (van Gelder, 1999, p. 243). Such a system is described by coupled logistic equations. Across dynamic systems, “even a relatively small number of components easily results in a rather complicated web of relationships and mutual effects. The dynamics of such a web can only be understood by simulating its evolution under various conditions” (van Geert, 2003, p. 664). However, most models in linguistic research are schemes of (presumably stable) relations between constructs reflecting knowledge, or between these constructs and external influences, in line with the structural-linear approach. Such static models, in essence, abstract a phenomenon to the structural relations between its components; in contrast, a dynamic model is concerned with the execution of that phenomenon: the interactions between its key components and their ensuing growth patterns. Thus, static models serve as starting points for defining the parameters and configuration of dynamic models. Unlike static models, dynamic models enable iteration and can thereby test the effect of temporal change on the stable interactions posited by static models. Thus they offer a means of hypothesis or theory testing. The applicability of dynamic models is tested by iterating their equations and graphically displaying the outcome. Following this stage, “a close match between the behavior of the model and empirical data on the target phenomenon confirms the hypothesis that the target is itself dynamical in nature” (van Gelder, 1999, p. 243).
When such models successfully replicate empirical results, they can also increase the validity of the static model or theory that informed their initial structural configuration (see also van Geert & Steenbeek, 2005). The models in the current thesis are versions of the precursor model which have been adapted to various aspects of human development. These models are calculus-based, employing difference equations to describe change over time. Their extension to developmental data is based primarily on the work of Fischer (1980), and its numerous elaborations by van Geert (cf., 1991, 1993, 1994, 1995, 2003).
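The procedure of iterating a model and comparing its behavior with data can be sketched as follows. The sketch assumes a single logistic grower and uses the root mean squared error as an illustrative measure of fit; the "observations" are synthetic (generated by the model itself), and the names, parameter values, and the simple search over candidate growth rates are hypothetical choices rather than the fitting procedure used in this thesis.

```python
from math import sqrt


def logistic_series(initial, rate, capacity, steps):
    """Iterate a single logistic grower and return its levels over time."""
    levels = [initial]
    for _ in range(steps):
        current = levels[-1]
        levels.append(current + rate * current * (1.0 - current / capacity))
    return levels


def rmse(model, data):
    """Root mean squared error between model output and data."""
    return sqrt(sum((m - d) ** 2 for m, d in zip(model, data)) / len(data))


# Synthetic "observations", generated by the model itself purely for illustration.
observed = logistic_series(initial=0.05, rate=0.3, capacity=1.0, steps=20)

# Try several candidate growth rates and keep the one that fits best.
candidate_rates = [0.1, 0.2, 0.3, 0.4, 0.5]
errors = {rate: rmse(logistic_series(0.05, rate, 1.0, 20), observed)
          for rate in candidate_rates}
best_rate = min(errors, key=errors.get)

print(best_rate, errors[best_rate])
```

With real data the error would not reach zero, and the comparison would typically be complemented by visual inspection of the simulated and observed trajectories.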

2.1.3 Summary

This chapter has demonstrated the fundamentals of the dynamic approach to language development, by focusing on the main characteristics of dynamic systems and relating them to the key traits of language development. The advantage of incorporating the dynamic approach in SLA research is that it poses questions about language development that pertain to any naturally evolving phenomenon. These questions concern the nature of interactions between aspects of knowledge, the effect of change over time on these interactions and subsequently on growth, the hierarchy of various linguistic skills, and the role of variation in development (van Geert, 1993). Although a growing body of literature discusses DST in the context of specific facets of language development, many areas remain uncharted territory in that respect. The aim of the current study is to focus on growth over time, variability, and componential interactions in two areas of L2 development: vocabulary knowledge and writing performance. With regard to each domain, the study asks whether development over time is nonlinear, whether it is uniform across various dimensions, and whether it can arise from dynamic precursor interactions. The methodological procedures involved in investigating these questions are longitudinal data collection and both central trend and variability analyses, the outcomes of which are subsequently configured in mathematical dynamic models. While this chapter has presented the theoretical principles and implications of DST in language studies, the next chapter describes the general design and procedures involved in the empirical dynamic approach, including some methods of variability analyses and the basic equations involved in dynamic modeling. It is intended to supplement the empirical studies in chapters 4 and 5 with additional information; however, it is possible to proceed directly to these chapters, in which the relevant sections of the following one are cross-referenced.

Chapter 3 Investigating language development from a dynamic perspective

3.1 Introduction

As the previous chapter has shown, the dynamic approach diverges from the prevailing methodology in applied linguistics by focusing on research questions that concern individual pathways of growth and componential interaction rather than linear cause-and-effect. Therefore, before describing the two empirical studies that are the core of this dissertation, this chapter explains some basic methods involved in DST-oriented research. It begins by addressing the affinity of the dynamic approach with case study methodology, and then focuses on variability analyses and mathematical modeling. The chapter is rather technical and detailed in nature, because at this stage the empirical dynamic approach is rather a “do it yourself” methodology: there is no conveniently assembled PC package for dynamic analyses (although, rather exceptionally, the study described in Chapter 4 uses a model which has been preprogrammed by Paul van Geert). Moreover, there are as yet no clear guidelines for applying DST in linguistic studies (but see van Dijk, 2003; van Geert & van Dijk, 2002; van Geert, 1994). Therefore many of the practices associated with the empirical dynamic approach are rather novel and daunting, not just in terms of exact procedures, but also conceptually. While this dissertation aims for transparency in explaining the procedures that it applies, their descriptions in this chapter may be too abstract out of the context of an empirical study. Thus, as mentioned, this chapter may be cross-referenced from the following chapters, rather than read in its entirety.

3.2 General study design

A longitudinal and detailed case study design is highly suitable for investigating SLA from a dynamic perspective, because it facilitates inspection of development as temporal change, and enables the relation and comparison of such change across various levels of the data. Case studies are used extensively in educational and developmental psychology; yet despite the proximity of these fields to applied linguistics, they are nowhere near as prominent in its literature (Meara, 1995). As van Lier points out,

Case study methodology has been extremely influential in shaping the way we talk about education, yet it has been traditionally regarded as somewhat of a soft and weak approach when compared to studies that have been deemed more rigorous, randomized, or experimental in nature (2005, p. 195).

Singer and Willett (2003) suggest a framework for case study methodology, which has been adopted by the studies in this thesis. It is comprised of six basic stages:
• Data collection
• Data description
• Data exploration
• Model specification
• Model fitting (assessment and possible optimization)
• Considering extensions

The following subsections describe each stage in the context of the dynamic approach. Sections 3.3 and 3.4 then list the practical procedures that the stages entail.

3.2.1 Data collection

Case studies require longitudinal, systematic, and detailed data gathering. In choosing this data, “one can – indeed, one needs to – foreground a focal point, while allowing the background to continue on its dynamic trajectory” (Larsen-Freeman & Cameron, 2008, p. 234). In other words, suitable data need to be extracted from specific areas of language development, a process requiring the researcher to “determine the ecological circuit in which one is interested” (Larsen-Freeman & Cameron, 2008, p. 235). In the studies included in this dissertation, the focus is increasingly narrowed until it zooms in on specific parameters representing the phenomena of interest. For example, the writing study in Chapter 5 is concerned with text-level performance, within which it distinguishes the categories of complexity and accuracy, further specified in both the lexical and syntactic dimensions. In each of these categories, the study singles out specific representative measures.

3.2.2 Data description and exploration (variability analyses)

A dynamic standpoint “requires us to look for change and for processes that lead to change, rather than for static, unchanging entities. Furthermore, data are not cleaned up before analysis to get rid of inconvenient ‘noise’” (de Bot et al., 2007, p. 16). The second stage, data description, thus involves inspecting raw data as change trajectories over time. Plotting data values as a function of time visually displays the route of their development, including local dips and jumps. It also enables the comparison of developmental trajectories across various components of language (or across participants, cf., Larsen-Freeman, 2006b). Often, linear or other central trends are added to the data in order to visualize the overall direction of its growth. Variability around this trend can then be distinguished and inspected in further detail. This stage, data exploration, involves investigating the variability patterns in the data in a similar manner, as trajectories of change. Like the raw data, variability can be inspected in a single component of language, or compared across several components. The two procedures correspond with two emphases on the value of variability, as discussed in section 2.1.1.6. First, variability in the development of one component can be analyzed from a general evolutionary or developmental perspective. In such cases, increased variability is considered a predecessor to qualitative shifts, which are frequently misidentified as discrete developmental jumps rather than the outcomes of continuous change (Ruhland, 1998; van Dijk, 2003). From this angle, variability has a facilitative role in development as enabling systemic adaptation. Second, variability patterns can also be compared across systemic components. In this context, shifts in variability are seen as indicators of change in underlying systemic interactions. The interpretation of findings from the analyses described so far may then be incorporated in a mathematical model, which is also based on the structural ordering of the data, and is therefore theoretically motivated.
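The separation of a central trend from the variability around it can be sketched as follows. The sketch assumes a least-squares linear trend and summarizes variability as the standard deviation of the residuals inside a sliding window; the scores are made up (with a deliberately volatile middle stretch), and the function names and the window size are hypothetical choices rather than the specific techniques applied in this thesis.

```python
from statistics import mean, pstdev


def linear_trend(values):
    """Fitted values of a least-squares line through (0, v0), (1, v1), ..."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = mean(values)
    slope = (sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
             / sum((t - t_mean) ** 2 for t in range(n)))
    intercept = v_mean - slope * t_mean
    return [intercept + slope * t for t in range(n)]


def moving_spread(residuals, window=5):
    """Standard deviation of the residuals inside each sliding window."""
    return [pstdev(residuals[i:i + window])
            for i in range(len(residuals) - window + 1)]


# Made-up scores over 16 measurement occasions (illustration only).
scores = [2, 3, 2, 4, 3, 5, 9, 4, 10, 6, 11, 10, 11, 12, 11, 12]

trend = linear_trend(scores)
residuals = [score - fitted for score, fitted in zip(scores, trend)]
spread = moving_spread(residuals)

print([round(s, 2) for s in spread])
```

Plotting the resulting spread values against time would display variability itself as a trajectory of change, which can then be inspected for local increases or compared across components.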

3.2.3 Model specification A dynamic model is comprised of two parameter types. The first is order parameters, which define the number of its components (growers) and their hierarchy, that is which growers act as precursors, and which growers are dependent on the former. The second type is control parameters, which correspond with factors that influence the course of development in each grower. In other words, order parameters define the structure of the model, while control parameters are “those aspects that cause the process to behave as it does” (van Geert, 2003, p. 663). Control parameters can in turn be divided into property and relational parameters. Property parameters specify the initial value, growth rate, carrying capacity (optimal attainable value), any delay in onset, and in some cases the amount of random variation in each grower. Relational parameters specify the interactions between a given grower and other growers in the model. They can be further distinguished as support, competition, and a conditional threshold that enables the onset of growth in a dependent grower and thereby also the

25 CHAPTER 3

onset of competition and/or support towards it from other growers. This threshold value thus corresponds with the hierarchical order of the model, and relates to the role of the particular grower as a precursor. Any model of a given grower can incorporate these three parameter types as they refer to one or more other growers in the system, as well as interactions generated by these growers towards the specific grower. However, not all of these parameters need to be specified, and their inclusion in a model depends on the particular phenomena that it is intended to simulate. Both support and competition can be specified as either by level or by change interactions. By level means that their value changes in direct relation to the value of the grower that generates them; by change implies that their value is relative to the amount of change in the grower between the current and previous time points (see section 3.4 for the equations depicting these specifications). No phenomenon can be modeled without relying on theory, empirical findings, or both for defining the model parameters and setting their values. The model configuration is a means of hypothesis testing, since it evaluates both the interpretation of the data analyses and the structural theory that refers to it. In each of the two studies in this dissertation, which apply dynamic models to L2 vocabulary knowledge and writing performance, the model settings are informed by the relevant background literature and preceding findings. Section 3.4 introduces some dynamic models which have been previously applied to (predominantly) L1 data and serve as a basis for those used in the current studies.

3.2.4 Model fitting The fit of the model to the data can be evaluated visually, in which case it should be “convincingly similar to the outcomes of the real-world system” (Larsen-Freeman & Cameron, 2008, p. 41). Additionally, the model can be assessed by a parameter that denotes its goodness-of-fit. Some fit assessments are regression-based, comparing the linear regressions of the model to those of the data, correlating the model with the data values, or comparing the internal pattern of correlations in the model with that of the data. Others are based on iterative weighted least squares. The present studies incorporate both methods, as suggested in previous applications of dynamic models to developmental data (van Geert & Steenbeek, 2005). A model can also be fitted to data by optimization procedures, which arrive at the optimal configuration of its control parameter values while maintaining its original


structure as specified in the order parameters. The optimized parameter values can then be compared to the interpretation of the data analyses and the pertinent theory, while the model outcome is again compared with the data. Thus, the hypotheses derived from both theory and data interpretation, which have informed the model configuration, can be supported further, if the optimization shows that the best fit retains parameter values that are congruent with them. On the other hand, if the best fit is achieved by altering these parameters, these hypotheses will be refuted.
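As a minimal sketch of the optimization idea described above, the following Python fragment keeps the structure of a model fixed and searches for the value of one control parameter that minimizes the squared difference between model and data. It is not the procedure used in the studies: it assumes a plain (unweighted) sum of squared errors rather than iterative weighted least squares, a simple grid search, and the logistic growth function of section 3.4.1 as a stand-in model. All function names and values are hypothetical.

```python
def logistic(initial, rate, capacity, steps):
    """Iterate the difference form of the logistic equation."""
    values = [initial]
    for _ in range(steps):
        v = values[-1]
        values.append(v + v * rate * (1 - v / capacity))
    return values


def sum_squared_error(model, data):
    """Unweighted sum of squared differences between model and data."""
    return sum((m - d) ** 2 for m, d in zip(model, data))


def best_rate(data, initial, capacity, candidates):
    """Return the candidate growth rate whose model output fits the data best."""
    def error(rate):
        model = logistic(initial, rate, capacity, len(data) - 1)
        return sum_squared_error(model, data)
    return min(candidates, key=error)


# Artificial "data" generated with a known rate, so the search can be checked.
observed = logistic(initial=0.05, rate=0.3, capacity=1.0, steps=30)
fitted_rate = best_rate(observed, 0.05, 1.0, [0.1, 0.2, 0.3, 0.4])
```

If the fitted value agrees with the value expected from theory and from the data analyses, the hypothesis behind the configuration is supported; if the best fit requires a very different value, it is called into question.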

3.2.5 Considering extensions In summary, an empirical dynamic study comprises three main stages. First, the data description determines a basic structural hierarchy (number and order) of components. Second, the data exploration may indicate the nature of the interactions (relational control) within this hierarchy. Third, the model configuration and optimization may test both the primary and secondary hypotheses (concerning the order and relational control parameters), which are based on the theory of the field and on the two preceding procedures. These steps constitute a kind of “three-in-one” methodological design. In such a study, the level of detail in the results and the need to represent them visually call for interpretation and discussion that accompany the analyses rather than follow them retrospectively. At any of the three stages, additional research questions can be posed with regard to the nature of development and its underlying mechanisms. Finally, depending on the outcomes of this methodology, extensions to the data collection, the analyses or the model should be considered, as in any other empirical study. In the dynamic approach this last stage is particularly important, since findings based on longitudinal data from individual learners cannot be generalized to the overall population.

3.3 Methods of growth and variability analysis Several techniques have been used by dynamic-oriented studies to represent and inspect intra-individual growth and variability. Van Geert and van Dijk (2002) and van Dijk (2003) review a number of such measures, among them the moving range (min-max), which visualizes local variability peaks that precede developmental jumps. However, since the current studies investigate variability as indicative of systemic self-organization, the following sections describe only techniques that pertain to this role and are featured in these studies. These are trajectory plots, residual plots and moving correlations. Additionally, the studies use a technique of


local data smoothing called spline interpolation, which combines a certain amount of variability with a locally regressed trend.

3.3.1 Growth trajectory plots A trajectory plot displays change as a sequence of values plotted along the x-axis, which denotes time. Such a simple graphic presentation of data can convey meaningful information about the nature of its development. For example, a growth trajectory plot can illustrate the degree of consolidation of a newly-acquired linguistic feature, with the trajectory representing change in the amount of its usage. Adding a regression line to the plotted trajectory shows the overall direction (increase vs. decrease) of development, and visualizes the differences between the raw data values and their central trend. Each segment in a growth trajectory is characterized by its degree of change in relation to adjacent segments. The plotted trajectories show the patterns of change between observations. Trajectories are more informative when presented in conjunction with those of other data variables: plots of two or more trajectories can expose their interactions over time. If shifts in one trajectory tend to coincide with parallel or inverse shifts in the other, this may be interpreted as indicative of a supportive or competitive interaction between them, respectively. However, such an interpretation would be speculative, and needs to be reaffirmed by subsequent procedures. Figure 2 contains two growth trajectories, one of which shows a general linear increase, and the other a decrease. Both trajectories exhibit a high amount of variability around these trends. The patterns of this variability are mostly parallel, but peaks (outliers) in one index are usually accompanied by dips in the other (for example in weeks 6 and 10). The potential relatedness between the variability patterns of the two indexes can be inspected in further analyses, such as de-trending the data values and plotting their residuals.


[Figure 2 plot: x-axis, Weeks (1 to 36); y-axes, Clause/sentence ratio (0 to 7) and Subordination/clause ratio (0 to 0.7). Series: Clause/sentence ratio; Subordination/clause ratio.]

Figure 2. Growth trajectories of two indexes of L2 writing performance

3.3.2 De-trended data values (residuals) De-trending data allows for inspecting variability independently of the linear trend. The residuals, or de-trended values, are obtained by subtracting the linear trend from the data series. The trend value at each time point is calculated as the intercept of the entire data series plus its slope multiplied by the number of measurements up to that time point. The residuals can then be plotted and compared between two or more data variables. Residuals can also be correlated in static or moving correlations (see the following section). The advantages of de-trending data are that local divergences from the trend can be revealed and displayed as a function of time. Thus, even when the overall trend is robust (i.e., strong increase or decrease, manifested in steep slope values), the residuals can reveal temporal patterns in the departures from it. Another advantage is that residuals may reveal the concurrence of such local fluctuations with similar or inverse patterns in another index. In this way, residual plots can indicate local interactions which may be obscured by the overall trend when it is incorporated in the data. For instance, while the central trends of two variables may be similar, their residuals may exhibit inverse patterns, implying a potentially competitive interaction. Figure 3 depicts the residuals of two indexes, showing their relative patterns.
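The de-trending procedure can be sketched in a few lines of Python. This is an illustration only, assuming an ordinary least-squares trend and measurement occasions coded 1, 2, ..., n; the function names and the example series are hypothetical and do not come from the studies.

```python
def linear_trend(values):
    """Least-squares intercept and slope, with time coded 1..n."""
    n = len(values)
    times = range(1, n + 1)
    mean_t = sum(times) / n
    mean_y = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return intercept, slope


def residuals(values):
    """Subtract the trend value (intercept + slope * t) from each data value."""
    intercept, slope = linear_trend(values)
    return [y - (intercept + slope * t) for t, y in enumerate(values, start=1)]


# Made-up ratio values for six weekly measurements.
series = [2.0, 2.6, 2.9, 4.1, 4.4, 6.0]
detrended = residuals(series)
```

By construction the residuals of a least-squares trend sum to zero, so what remains in `detrended` is only the local departure from the overall direction of growth.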


[Figure 3 plot: x-axis, Weeks; y-axis, residual values (-0.15 to 0.2). Series: Complex word ratio; General word variation.]

Figure 3. Residuals of two indexes of L2 writing performance

3.3.3 Moving correlation A common procedure in L2 research is correlating two developing indexes. Such a static correlation coefficient can be supplemented with a moving correlation, which shows temporal changes in the coefficient values in a moving window of several observations. Each window overlaps with the preceding window on all but the first measurement value. For example, in Figure 4, the first window features the correlation coefficient value in weeks 1-5, the second includes the coefficient in weeks 2-6, and so forth. Thus the correlation can be viewed as a function of time (van Geert & van Dijk, 2002). The moving correlation in Figure 4, changing over a 36-week period divided into overlapping windows of 5 measurements, shows repeated alternations between strong positive and weak negative coefficient values. If the interaction between the two variables were only summarized as a single coefficient value, these shifts would be obscured. Moreover, if the correlation is not statistically significant, it would likely not be noted, whereas the moving correlation shows that there might be a systematic pattern that underlies this correlation. While this pattern may prevent the correlation from becoming sufficiently high to reach significance, it may nonetheless be informative with regard to temporal changes in the interaction between the two variables.
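The windowing scheme described above can be sketched as follows. This is a minimal illustration, assuming a Pearson coefficient and a window that advances by one measurement at a time; the function names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when one window has no variation
    return sxy / sqrt(sxx * syy)


def moving_correlation(xs, ys, window=5):
    """Correlation in overlapping windows: measurements 1-5, 2-6, 3-7, ..."""
    if len(xs) != len(ys):
        raise ValueError("both series need the same number of measurements")
    return [
        pearson(xs[start:start + window], ys[start:start + window])
        for start in range(len(xs) - window + 1)
    ]
```

With 36 weekly measurements and a window of 5, this yields 32 coefficient values, one per window, which can then be plotted as a function of time.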


[Figure 4 plot: x-axis, Windows (5 measurements); y-axis, Coefficient value (-1 to 1). Series: correlation between general word variation and complex word ratio.]

Figure 4. A moving correlation between two indexes of L2 writing performance

Like the trajectory and residual plots, moving correlation plots can include more than one correlation, allowing for a visual comparison of the temporal patterns in one correlation with those in another. It should be noted that all of these techniques (and other forms of variability analyses which are not included in this thesis) are complementary. Using them conjointly with linear analyses can achieve a fuller representation of the developmental phenomena.

3.3.4 Smoothing by local regression: spline interpolation Smoothing by local regression combines local trend with part of the data variability. It thus bridges central trend analyses like linear or polynomial regression with variability analyses. Local smoothing can be performed in several ways. The simplest and perhaps most common technique is a moving average, which is simply an average calculated in partially overlapping time (or measurement) windows of a fixed size, like the moving correlation technique. Although the moving average is very popular in longitudinal studies, another smoothing method, the spline function, is preferred in the current thesis. This is because in a moving average, the past (previous values) is more influential than the future (upcoming values). While this pitfall can be corrected by weighting, moving averages are also more susceptible to the influence of extreme values. Finally and perhaps obviously, averages do not always represent actual data values (there are no families with 2.5 children, for example); therefore they are a rather crude means of representing trend.3

3 At the other end of the spectrum of smoothing methods is a B-spline interpolation, which is highly sensitive to local values but requires complex recalculations that are computationally costly, and is therefore not featured in this dissertation.
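For comparison with the spline, the moving average mentioned in this section can be sketched as below. The sketch assumes an unweighted, trailing window (each smoothed value averages the current and the preceding measurements), which is one common variant and not necessarily the one intended in the literature cited; the function name and the example values are hypothetical.

```python
def trailing_moving_average(values, window=5):
    """Unweighted average over each window of consecutive measurements."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and the series length")
    return [
        sum(values[end - window:end]) / window
        for end in range(window, len(values) + 1)
    ]


# A single extreme value (10) affects every window that contains it.
smoothed = trailing_moving_average([1, 1, 1, 10, 1, 1, 1], window=3)
```

In this example the one outlier raises three consecutive smoothed values, which illustrates the susceptibility to extreme values noted above.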


The term spline refers to a wide class of piecewise polynomial functions used to minimize data roughness by interpolation. Because spline functions are segmented, they avoid the high oscillations produced by polynomial functions. The natural cubic spline, which is the most popular spline function, is termed as such because it provides the closest approximation to the curve produced by the original spline device used in ship building. In this function, the data is divided into three segments, for each of which the equation $r(t) = r_0 + at + bt^2 + ct^3$ (the cubic polynomial) is calculated with different parameter values. The intersections between these segments are called knots. Continuity between the polynomial functions is maintained by matching their derivatives at each knot. In the equation, t represents the time dimension, $r_0$ represents the rate (configured as the onset data value at point 0, and the last value of the preceding section in the data for each knot), and the parameters a, b, and c comprise the cubic polynomial. After each knot, a new cubic polynomial is calculated, with the cumulative number of data points subtracted from the time term. For example, in the second segment of the spline in Figure 5 (below), t is the current week number minus the total number of weeks in the preceding segments (the cumulative totals are 12 and 24 weeks at the end of the first and second segments, respectively). In the plot generated by this equation (see Figure 5), the rate sub-value $r_0$ of the first segment is the y-axis intercept. In this dissertation, spline functions are visually displayed in conjunction with optimized model outcomes (as well as with raw data values and their linear trends). This facilitates comparing the model plots to the data that they simulate, since optimized models cannot include random variation, and are therefore predictably smoother.


[Figure 5 plots: two panels, Data and Spline; x-axis, Weeks (1 to 35); y-axis, Ratio (0 to 1). Series in each panel: Recognition; Recall; Controlled production; Free production.]

Figure 5. Raw data and natural cubic spline for four vocabulary knowledge levels

3.3.5 Pitfalls of variability analyses Investigating variation poses several problems. The first is that estimations of variability, even more than those of central tendency, require substantial sample sizes (Bates et al., 1995). For reasons of labor intensity, such samples, specifically in language development studies, can only be extracted from a small number of case studies. Thus, case studies in general, and those involving variability analyses in particular, are usually investigations of single learners (e.g., van Dijk, 2003; see also Meara, 1995). A second problem concerns the choice of indexes for variability analyses. Averaging is inherent to many measures of language development, for instance to the widely-used index of mean length of utterance (MLU), as its name attests. The representation of variability may be compromised if it is investigated in such averaged data values. A plausible solution for this problem is employing indexes that are based on ratios, rather than averages.


A more general problem is the risk of “missing the forest for the trees” while closely monitoring variability in a restricted number of variables. As Ellis warns, “it’s not enough to highlight individual variability (…). We still have to explain the regularities. And if we find it difficult to credit these as innately given, then we have to come up with some viable alternative, for we know that input will not suffice” (2007, p. 23). However, when variability analyses are used to supplement rather than replace central trend analyses, the problem of over-emphasizing irregularities is minimized. Moreover, both types of data analyses can be used in conjunction with models to explain general growth patterns of linguistic data, given unspecified input (cf., Bassano & van Geert, 2007). Therefore, at least in terms of intention (if not always execution), the dynamic perspective on language development can be aligned with more established linguistic approaches and their corresponding methodologies. Variability analyses can provide valuable information about processes of language development, but usually require support by further procedures. Some studies use permutation techniques, such as a Monte Carlo simulation, to corroborate their interpretations of variability patterns as developmental indicators (cf., van Dijk, Verspoor, & Lowie, in press). As mentioned, such interpretations of variability analyses can also be supported (or contradicted) by mathematical models. The practical side of this approach, which is taken up by the current studies, is discussed in detail in the following sections.

3.4 Modeling a dynamic system: general considerations Dynamic models are mathematical descriptions of the parameters that determine the development of a system. The values attached to these parameters are not meaningful in themselves. For example, an initial growth rate of 0.01 does not imply that the phenomenon at hand actually grows at this precise pace in real life. Rather, these values are relative to each other, so that if the value of a given grower increases at double the rate of another, this fact is meaningful. However, it should be kept in mind that only basic models or artificial phenomena rely on a fixed growth rate. In dynamic models, the growth rate is an onset value, which changes iteratively as a function of the interactions between the growers (relational control parameters).4 This interaction in turn is a function of both growth and carrying capacity (implied by resource limitations). In general, the configuration of order parameters relies more heavily on

4 as well as any other (property) control parameters, such as developmental delay


theory, whereas that of control parameters is also based on outcomes of data analyses. Thus constructing a mathematical model – both the combination of mathematical operators, which denote the order parameters, and the configuration of their values, which denote the control parameters – draws on both theory and findings. This does not mean that the theory need always be established and consensual. In fact, models can act as an exploratory tool that can substantiate less popular theories and show which axiomatic ones can benefit from revision. The following sections explicate several generic dynamic models, which have been so far applied to (mostly) L1 data. These models serve as a basis for the models used in the studies described in the subsequent chapters.

3.4.1 A basic growth model Section 2.1.1.1 has shown how the logistic growth equation can replicate the typical sigmoid “learning curve”. This curve can be observed across numerous developmental phenomena, and is characterized by rapid initial growth, followed by a plateau. The logistic equation specifies growth as a joint product of the current value (growth level) of the grower and the available internal and external systemic resources. In other words, the equation combines exponential iterative growth with a delimiting factor, which renders it nonlinear. Equation 2 (below) is the difference form of the logistic equation as it describes the value $L_{t+\Delta t}$ of a grower L.

$$L_{t+\Delta t} = L_t + L_t \cdot r_{\Delta t} \cdot \left(1 - \frac{L_t}{K_t}\right)$$

Equation 2. A logistic growth function (after van Geert, 1994)

As previously mentioned, the original logistic equation, known as the Verhulst formula, was used to depict population dynamics on the basis of two central premises: that the rate of growth will always be proportional to the current value, and that it will also depend on the amount of available resources. Although these resources are not directly configured in the equation, they are implied through the K parameter, which signifies the carrying capacity. The growth rate, typically marked as r, is the proportional increase in one time unit. ∆ is the general difference symbol, which can also denote change, in this case change in time. Thus, the value $L_{t+\Delta t}$ is based on the preceding value $L_t$.
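To make the iterative character of the logistic equation concrete, the following Python sketch applies it step by step. It is an illustration only: it assumes a constant carrying capacity K and a time step of one unit, and the parameter values are arbitrary and, as with any such model, not meaningful in themselves.

```python
def logistic_growth(initial, rate, capacity, steps):
    """Iterate L[t+1] = L[t] + L[t] * r * (1 - L[t] / K) for a number of steps."""
    values = [initial]
    for _ in range(steps):
        current = values[-1]
        values.append(current + current * rate * (1 - current / capacity))
    return values


# Arbitrary illustrative settings: low onset value, moderate growth rate.
trajectory = logistic_growth(initial=0.01, rate=0.2, capacity=1.0, steps=100)
```

With a moderate growth rate such as the one used here, the values rise slowly at first, then rapidly, and finally level off near the carrying capacity, which is the S-shaped pattern described in the text. Higher rates are needed to produce the oscillating patterns.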


Van Geert (1991) used the difference form of the logistic growth equation to model the growth of early L1 lexicon. Figure 6 shows the growth curve obtained from observations of a child’s lexical production together with a logistic growth function.

[Figure 6 plot, titled "Lexical data vs. logistic model": x-axis, Observations (1 to 31); y-axis, Words produced (0 to 400). Series: Data; Model.]

Figure 6. Lexical data (derived from Dromi, 1986, as cited in van Geert, 1994) vs. a logistic model. Based on van Geert (1994)

Depending on the value of the growth rate r, this function can yield three distinct growth patterns. The first is an S-shaped increase towards a maximal value; the second is cyclic oscillations between various states (as shown in the bifurcation plot in Figure 1); and the third is irregular and seemingly unpredictable oscillation. What determines the growth rate and its consequent effect on the shape of development is not only the current value of the grower (that is, its distance from the carrying capacity), but also various influences that relate to the process at hand, which can be expressed in different extended versions of the logistic equation. In constructing these versions, the basic logistic function can be expanded by a range of terms that make the model more complex and specific, thus yielding a better fit. For example, van Geert (1993) points out that to successfully simulate real-life data, it is often necessary to add a feedback delay or resource oscillations as a further damping factor on the growth rate r, which determines the steepness of the S-shaped curve. Such delays or oscillations can be directly added to an equation of a single grower, or can be implied in coupling this equation with an equation that describes another grower.

With regard to the second option, all available resources are usually treated as a single and unchanging entity in dynamic models. However, as the previous chapter has discussed, the availability of resources is a function of interactions between co-developing components of the system, and no grower is isolated (although, of course, it may be addressed as such from theoretical or methodological considerations). Changes in the amount of resources available to a specific grower therefore reflect systemic self-organization in terms of internal interaction and subsequent reallocation of resources. From this point of view, resource oscillations are inevitable, since the limited nature of the resources implies that co-developing growers compete for their share of these resources, even while supporting each other in terms of conjoined development. Thus, instead of configuring the effect of resource oscillations as a random and periodic damping factor on the growth rate, it can be expressed by incorporating additional systemic components in the model, thereby turning it into a model of connected growers.

3.4.2 Modeling connected growth Dynamic modeling involves more than just fitting logistic growth equations to developmental trajectories of empirical data. By combining several developing components of a given phenomenon in a hierarchy of interacting levels, it can simulate the effect of interactions between connected growers. In practice, this combination necessitates configuring coupled logistic equations. Equation 3 describes two co-developing growers, with no specific hierarchy between them.

$$A_{n+1} = A_n \left(1 + r_A - \frac{r_A A_n}{K_A} + s_A B_n - c_A B_n\right)$$

$$B_{n+1} = B_n \left(1 + r_B - \frac{r_B B_n}{K_B} + s_B A_n - c_B A_n\right)$$

Equation 3. Coupled growers with support and competition by level (based on van Geert, 1995)

The growers, A and B, support each other’s growth, but also compete for resources. Therefore their interactions are defined as bidirectional. The equation parameters are K = carrying capacity; n = number of observations, which is equivalent to time t in the previous equations, with 1 replacing ∆ as a fixed unit of change; r = growth rate; c = competition; and s = support. The first part of each equation is equivalent to the logistic growth equation, while the second part specifies the damping factors on this logistic growth, which are the competition and support from the other grower towards the one described by the equation. Support from the precursor A to the dependent B ($s_B$) or from B to A ($s_A$) is multiplied by the current value of the grower


that generates it, meaning that it is by level. The contribution of support from another grower to the growth rate r of its counterpart is curbed or dampened by a competition parameter c. Competition is also by level, in direct relation to the value of the grower that generates it. Considering that the equations for both growers A and B are identical, and that there is no damping parameter on the onset of their growth or their interaction, iterating the equations would simply yield identical growth curves (given of course that the parameter values for each grower are equal). In other words, there is no hierarchy (order) between the two connected growers. As discussed in the preceding chapter, a generic connected growth model which is particularly suitable for simulating cognitive and linguistic growth is the precursor model. In a precursor interaction, the development of one grower is a prerequisite to that of another. Various versions of the precursor model can be set up, with conditional unidirectional or bidirectional support and/or competition, either by level or by change. The following section contains the equations for a basic precursor interaction, followed by examples of versions of this model used to simulate linguistic development in several areas.
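The coupled equations for two growers without hierarchy (Equation 3) can be iterated as in the sketch below. This is a minimal illustration in which the update rule is read off the equation as A[n+1] = A[n] * (1 + r_A - r_A * A[n] / K_A + s_A * B[n] - c_A * B[n]), and likewise for B; the function name and all parameter values are hypothetical.

```python
def coupled_growers(a0, b0, r_a, r_b, k_a, k_b, s_a, s_b, c_a, c_b, steps):
    """Two growers with bidirectional support (s) and competition (c), by level."""
    a, b = [a0], [b0]
    for _ in range(steps):
        an, bn = a[-1], b[-1]
        # Logistic part, plus support from and competition with the other grower.
        a.append(an * (1 + r_a - r_a * an / k_a + s_a * bn - c_a * bn))
        b.append(bn * (1 + r_b - r_b * bn / k_b + s_b * an - c_b * an))
    return a, b


# Equal (arbitrary) settings for both growers.
level_a, level_b = coupled_growers(
    a0=0.01, b0=0.01, r_a=0.1, r_b=0.1, k_a=1.0, k_b=1.0,
    s_a=0.05, s_b=0.05, c_a=0.02, c_b=0.02, steps=200,
)
```

Because the two equations are identical and the settings are equal, `level_a` and `level_b` contain the same values at every step, which is the lack of hierarchy noted in the text.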

3.4.3 Precursor interactions: unidirectional support A precursor interaction between growers can be specified as such by setting a threshold value that the precursor, i.e., the earlier-developing grower, needs to reach before it enables the development of its dependent. This development need not necessarily depend on explicit support from the precursor, but can simply ensue on its own as a result of resource reallocation (since the threshold value assumes that the precursor has ceased to tax these resources to the degree that they are unavailable for the dependent). However, usually some form of support from the precursor to the dependent is implied, since the notion of connected growth is related to the hierarchy of the system, in which lower “species” actively support the emergence of higher ones, as illustrated by the island metaphor (section 2.1.1.5.1).

Equation 4 configures two connected growers A and B. The notion of a precursor interaction, in which the development of a given grower is a prerequisite for that of another, can be added to a connected growth model by an additional parameter P, which is a binary on/off variable. P=0 if the current value of the first grower (the precursor) is lower than a specified threshold. Once that threshold is reached, then P=1. For example, if two-word utterances in early L1, denoted as B, can be generated


only once vocabulary, denoted as A, has reached a certain size, then P=1 when $A_n$ is equal to or higher than this prerequisite threshold value (Bassano & van Geert, 2007). Thus $P_B$ marks the onset of development in B (since the model does not include any feedback delay). Support from the precursor A to the dependent B is therefore in effect conditional, as well as by level (directly related to the value of A).

$$A_{n+1} = A_n \left(1 + r_A - \frac{r_A A_n}{K_A}\right)$$

$$B_{n+1} = B_n \left[1 + \left(r_B - \frac{r_B B_n}{K_B} + s_B A_n\right) P_B\right]$$

Equation 4. Two growers in a precursor interaction with unidirectional support by level

[Figure 7 plot: x-axis, Iterations (1 to 145); y-axis, Grower value (0 to 1). Series: Level A; Level B; Level C; Level D.]

Figure 7. Four growers in a precursor interaction with unidirectional support by level

Figure 7 shows the pattern that this type of precursor model would generate in a system of four connected growers, given equal growth rates and carrying capacities, with different conditional threshold values for the onset of support. The model is somewhat idealized, lacking the gradedness and variability that can often be seen in the emergence of developmental stages, which usually overlap. This is likely due to the fact that competition from the dependent is not included in the model. The following section demonstrates this extension, in a precursor model with unidirectional support (from the precursor to the dependent) and competition (from the dependent to the precursor).
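The two-grower version of this precursor model (Equation 4) can be sketched as follows. The sketch reads the equation as B[n+1] = B[n] * (1 + (r_B - r_B * B[n] / K_B + s_B * A[n]) * P_B), so that B stays at its initial value while the switch P_B is 0. The function name, the threshold and all other values are hypothetical and chosen only for illustration.

```python
def precursor_support(a0, b0, r_a, r_b, k_a, k_b, s_b, threshold, steps):
    """Precursor A grows logistically; dependent B starts once A reaches the threshold."""
    a, b = [a0], [b0]
    for _ in range(steps):
        an, bn = a[-1], b[-1]
        # Binary switch: 1 once the precursor is at or above the threshold.
        # A never declines in this model, so the switch stays on once reached.
        p_b = 1 if an >= threshold else 0
        a.append(an * (1 + r_a - r_a * an / k_a))
        b.append(bn * (1 + (r_b - r_b * bn / k_b + s_b * an) * p_b))
    return a, b


vocab, two_word = precursor_support(
    a0=0.01, b0=0.01, r_a=0.1, r_b=0.1, k_a=1.0, k_b=1.0,
    s_b=0.05, threshold=0.5, steps=150,
)
```

Plotting `vocab` and `two_word` against the iteration number gives two successive S-shaped curves, the second of which only begins to rise after the first has crossed the threshold.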


3.4.4 Precursor interactions: unidirectional support and competition

$$A_{n+1} = A_n \left[1 + \left(r_A - \frac{r_A A_n}{K_A} - c_A B_n\right)\right]$$

$$B_{n+1} = B_n \left[1 + \left(r_B - \frac{r_B B_n}{K_B} + s_B A_n\right) P_B\right]$$

Equation 5. Two growers in a precursor interaction with unidirectional (precursor to dependent) support and unidirectional (dependent to precursor) competition, both by level

Equation 5 incorporates competition as well as support: precursor A grows to a threshold value before enabling the emergence of B. A then supports B, while B competes with A by level. This equation is suitable for simulating interactions in which there is an eventual decline in the precursor due to increased competition from the dependent, as the value of the latter grows (in line with increased support from the precursor). In such a model, the dependent will ultimately overrun the precursor. The model can be used to capture the manner in which early strategies in child learning diminish over time, while more advanced strategies emerge, as suggested by Fischer (1980). It was also applied to the emergence of 1-, 2-, or 3-word utterances in early L1. In these data, the “holophrastic” one-word stage, the precursor to the other stages, is eventually replaced by the multi-word stage (Bassano & van Geert, 2007). This replacement would not normally take the shape of a smooth transition, in which dependents replace precursors sequentially. Rather, it is likely that the high amount of variation in developmental and linguistic data would yield a pattern of cyclic decline and recovery of earlier strategies, until their eventual disappearance. Figure 8 depicts such a model as it is extended to four equations describing growers (levels) A, B, C and D. Beginning at A, each grower is precursor to a dependent, which in turn is a precursor to the following grower. The competition increases in accordance with the value attained by each grower. The model produces a typical pattern: within a thousand iterations, level A declines to a minimum, level B also declines, although not to that extent, and levels C and D increase towards their carrying capacity (configured as 1 in this version).
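A two-grower version of Equation 5 can be sketched as follows. One assumption needs to be stated explicitly: the sketch keeps the switch P_B on once the precursor has reached the threshold, even if the precursor later declines below it. The text says only that P=1 "once that threshold is reached", so this latching behaviour is an interpretation. The function name and all values are hypothetical.

```python
def precursor_support_competition(a0, b0, r_a, r_b, k_a, k_b,
                                  s_b, c_a, threshold, steps):
    """A supports B after reaching the threshold; B competes with A by level."""
    a, b = [a0], [b0]
    p_b = 0
    for _ in range(steps):
        an, bn = a[-1], b[-1]
        # Assumption: the switch latches and does not turn off when A declines.
        if an >= threshold:
            p_b = 1
        a.append(an * (1 + (r_a - r_a * an / k_a - c_a * bn)))
        b.append(bn * (1 + (r_b - r_b * bn / k_b + s_b * an) * p_b))
    return a, b


precursor, dependent = precursor_support_competition(
    a0=0.01, b0=0.01, r_a=0.1, r_b=0.1, k_a=1.0, k_b=1.0,
    s_b=0.05, c_a=0.2, threshold=0.5, steps=500,
)
```

With these arbitrary settings the precursor first rises, then declines towards a minimum as the dependent grows, while the dependent approaches its carrying capacity. The cyclic decline and recovery mentioned in the text would require added variation, which this sketch does not include.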


[Figure 8: line graph of Level A, Level B, Level C and Level D; vertical axis from 0 to 1, horizontal axis iterations 1 to 973]

Figure 8. Four growers in a precursor interaction with unidirectional (precursor to dependent) support and unidirectional (dependent to precursor) competition, both by level

It is also possible to specify competition (or support) as increasing or decreasing in line with the local positive or negative growth in the grower that generates it (by change), rather than in accordance with the current value of this grower (by level). In this version of the precursor model, the precursor does not necessarily diminish over time. The next section addresses this possibility.

3.4.5 Precursor interactions: bidirectional support by level; unidirectional competition by change

In the preceding coupled grower equations, support and competition are by level, and thus simply linear functions of the previous value of the competitive or supportive grower. However, it is likely that there are developmental stages in which competition changes as a function of the growth process in the grower that generates it, reflecting the amount of resources invested in its local change between time units. This type of competition is referred to as competition by change. According to van Geert (2003), an example of such an interaction can be seen between early reading skills and vocabulary acquisition. While reading skills depend on prerequisite vocabulary knowledge, and while these skills in turn support the acquisition of new vocabulary, the process of reading skill acquisition may interfere with the process of acquiring new vocabulary items. Thus support between these two constructs is by level (and conditional on a threshold value when generated by the precursor, vocabulary knowledge, towards the dependent, reading skills). In contrast, competition from reading skills to vocabulary knowledge is by change, and depends on the amount of resources invested locally in reading skills acquisition. The following chapter readdresses this example in the context of L2 vocabulary development. Equation 6 depicts this version of the precursor model.

$$A_{n+1} = A_n \left[ 1 + r_A - \frac{r_A A_n}{K_A} + s_A B_n - c_A \left( B_n - B_{n-1} \right) \right]$$

$$B_{n+1} = B_n \left[ 1 + \left( r_B - \frac{r_B B_n}{K_B} + s_B A_n \right) P_B \right]$$

Equation 6. Two growers in a precursor interaction with bidirectional support by level, and unidirectional (dependent to precursor) competition by change
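The behaviour of Equation 6 can be explored by iterating the two coupled equations directly. The Python sketch below is a minimal illustration and not the program used in this thesis. The function name simulate_equation_6, all parameter values, the seed value of 0.01 and the threshold of 0.5 are assumptions chosen for demonstration only, and the threshold term is assumed to equal 1 once A has reached the threshold and 0 before that.

```python
def simulate_equation_6(steps=1000, r_a=0.1, r_b=0.1, k_a=1.0, k_b=1.0,
                        s_a=0.02, s_b=0.05, c_a=0.5, threshold=0.5,
                        seed_a=0.01, seed_b=0.01):
    """Iterate Equation 6 and return a list of (A, B) pairs, one per step.

    All default values are illustrative assumptions, not estimates from data.
    """
    a, b = seed_a, seed_b
    b_prev = seed_b  # B has not changed before the first step
    history = [(a, b)]
    for _ in range(steps):
        # Assumed threshold rule: B can grow only once A has reached the threshold.
        p_b = 1.0 if a >= threshold else 0.0
        # A: logistic growth, support by level from B, competition by change from B.
        a_next = a * (1 + r_a - r_a * a / k_a + s_a * b - c_a * (b - b_prev))
        # B: logistic growth plus support by level from A, gated by the threshold.
        b_next = b * (1 + (r_b - r_b * b / k_b + s_b * a) * p_b)
        b_prev = b
        a, b = a_next, b_next
        history.append((a, b))
    return history
```

Calling simulate_equation_6() returns the full trajectory, so that run[n] holds the values of A and B at iteration n. Under these assumed values, B remains at its seed value until A crosses the threshold, and both growers eventually settle somewhat above K = 1, because the by-level support term adds to each grower's growth rate.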

Depending on the values of the support and competition parameters, such a model can increase to a maximal value on all levels, or only on some. Figure 9 depicts an extension of this model to a set of four connected growers, which all ultimately increase to their carrying capacities.

[Figure 9: line graph of Level A, Level B, Level C and Level D; vertical axis from 0 to 1, horizontal axis iterations 1 to 973]

Figure 9. Four growers in a precursor interaction with bidirectional support by level and unidirectional (dependent to precursor) competition by change

An alternative to specifying weak or strong support and/or competition values in the model is to configure competition specifically as diminishing over time, as a function of both the change process and the value of the grower that generates it (in other words, by change and by level simultaneously). This is done by dividing the difference (change) between the present and the previous value of the grower by its current value (level), and multiplying the competition parameter value by the outcome. In this context, van Geert refers to two simultaneously competing growers M and L, and the idea that the attention and effort spent to learn a new skill, principle, or strategy is considerably greater at the earlier states than it is in the later ones. The reasoning behind this assumption is that once the learning task becomes more familiar, fewer resources will have to be invested to achieve a similar amount of progress. In this alternative version of the model, the competition factors $C_L$ and $C_M$ are multiplied by $(M_n - M_{n-1})/M_n$ and $(L_n - L_{n-1})/L_n$, respectively (1995, p. 322).

The coupled equations would thus take the following form:

$$A_{n+1} = A_n \left[ 1 + r_A - \frac{r_A A_n}{K_A} - \frac{c_A \left( B_n - B_{n-1} \right)}{B_n} + s_A B_n \right]$$

$$B_{n+1} = B_n \left[ 1 + \left( r_B - \frac{r_B B_n}{K_B} - \frac{c_B \left( A_n - A_{n-1} \right)}{A_n} + s_B A_n \right) P_B \right]$$

Equation 7. Connected growers in a precursor interaction with bidirectional support and bidirectional decreasing competition (based on van Geert, 1995, p. 323)

The revised model ensures that ultimately, all growers will reach their carrying capacities, regardless of their parameter values.
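The effect of dividing the change by the level can be illustrated with a short simulation. The following Python sketch iterates Equation 7 for two growers under stated assumptions: the function names, the parameter values, the seed values and the 0/1 threshold rule for the dependent are all hypothetical choices made for illustration, and the sketch does not test the claim for arbitrary parameter values.

```python
def relative_change(current, previous):
    """Change between two consecutive values, scaled by the current level."""
    return (current - previous) / current


def simulate_equation_7(steps=1000, r_a=0.1, r_b=0.1, k_a=1.0, k_b=1.0,
                        s_a=0.02, s_b=0.05, c_a=0.3, c_b=0.3,
                        threshold=0.5, seed_a=0.01, seed_b=0.01):
    """Iterate Equation 7 and return a list of (A, B) pairs, one per step.

    All default values are illustrative assumptions, not estimates from data.
    """
    a, b = seed_a, seed_b
    a_prev, b_prev = seed_a, seed_b  # no change before the first step
    history = [(a, b)]
    for _ in range(steps):
        # Assumed threshold rule: B can grow only once A has reached the threshold.
        p_b = 1.0 if a >= threshold else 0.0
        # Competition shrinks as the competing grower's level rises,
        # because the same absolute change is divided by a larger value.
        comp_on_a = c_a * relative_change(b, b_prev)
        comp_on_b = c_b * relative_change(a, a_prev)
        a_next = a * (1 + r_a - r_a * a / k_a - comp_on_a + s_a * b)
        b_next = b * (1 + (r_b - r_b * b / k_b - comp_on_b + s_b * a) * p_b)
        a_prev, b_prev = a, b
        a, b = a_next, b_next
        history.append((a, b))
    return history
```

For example, a rise from 0.1 to 0.2 gives a relative change of 0.5, whereas a rise from 0.9 to 1.0 gives only 0.1, so the same absolute gain generates far less competition once the grower is well established.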

3.4.6 Aggregated support and competition

Support and competition can be aggregated as a single influence on the growth rate of each grower. It is therefore possible to configure a general damping factor for each grower, which summarizes a particular interaction between this grower and another. Such damping factors can in turn be summarized in a matrix that contains all the combinations of growers in the model. For a given system, the interactions by level and those by change need to be summarized in separate matrices.

     R     A        B        C        D
A    rA    KA       dA(B)    dA(C)    dA(D)
B    rB    dB(A)    KB       dB(C)    dB(D)
C    rC    dC(A)    dC(B)    KC       dC(D)
D    rD    dD(A)    dD(B)    dD(C)    KD

Table 1. Aggregated influences on the growth of four connected growers (extracted from van Geert, 1993, p. 232)

In Table 1, as in the equations in this chapter, r denotes the growth rate and K denotes the carrying capacity of each grower. The general damping factor d is the sum of influences (support and/or competition) from one grower to another in the specific version of the model. On the basis of such a matrix, van Geert (2003) has developed a VBA-based Excel spreadsheet program that iterates combinations of coupled grower equations like those featured in this chapter, and can thus accommodate various versions of the precursor model. This program was used as a basis for the simulation of L2 vocabulary development described in the next chapter.
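To make the matrix idea concrete, the following Python sketch iterates any number of connected growers from a by-level damping matrix. It is not a reconstruction of van Geert's spreadsheet program. The function name iterate_growers and the way the damping factors enter the equation are assumptions: each factor is multiplied by the current level of the influencing grower and added to the growth term, in the manner of the by-level support terms in the preceding equations. Thresholds and by-change influences are left out for brevity.

```python
def iterate_growers(levels, rates, capacities, damping, steps):
    """Iterate connected growers and return the list of level vectors.

    damping[i][j] is the assumed aggregated by-level influence of grower j
    on grower i (positive = support, negative = competition). Diagonal
    entries are ignored, since Table 1 reserves the diagonal for K.
    """
    history = [list(levels)]
    n = len(levels)
    for _ in range(steps):
        current = history[-1]
        following = []
        for i in range(n):
            # Sum of by-level influences from all other growers on grower i.
            influence = sum(damping[i][j] * current[j]
                            for j in range(n) if j != i)
            growth = rates[i] - rates[i] * current[i] / capacities[i] + influence
            following.append(current[i] * (1 + growth))
        history.append(following)
    return history
```

A call such as iterate_growers([0.5, 0.5], [0.1, 0.1], [1.0, 1.0], [[0.0, 0.2], [-0.2, 0.0]], 1) describes two growers in which the second supports the first while the first competes with the second. With all damping factors set to zero, each grower simply follows its own logistic curve towards its carrying capacity.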


3.5 Summary

Following the introduction to the theoretical dynamic perspective on L2 development, this chapter has focused on the practical steps involved in applying this approach in an empirical study. It first discussed the general compatibility of case study methodology with the dynamic approach, and then detailed the steps involved in using such a design in the context of a DST-oriented study. Next, the chapter demonstrated some techniques of variability analysis, with a focus on revealing inter-componential interactions, which are applied in the current thesis. Finally, it showed how logistic growth equations can be used as a basis for modeling development, and how they can be adapted to simulate interactions between connected growers, particularly when such growers are defined as precursors and dependents. Models of growers in a precursor interaction depict a clear hierarchy, which should be observable in the data at least initially, when its components emerge in sequence. Such models should therefore be based on theory and recurrent empirical findings. The following chapters apply the dynamic perspective to longitudinal data in two areas of L2 development: vocabulary knowledge in the fourth chapter, and writing performance, expressed in the accuracy and complexity of lexicon and syntax, in the fifth. These studies follow the procedures outlined so far: longitudinal and detailed data collection, growth and variability analyses, and data simulations on the basis of the outcomes of these analyses and the background literature. While each study has its own set of domain-specific research questions, they can be collapsed into two general questions. The first question is whether variability analyses can reveal complex dynamic interactions in the data.
The second question is whether such interactions, when configured in a model depicting the data components as a hierarchy of connected growers, can adequately simulate the data and thereby reinforce the interpretations of the variability analyses. In other words, the two studies put the theoretical and empirical aspects of the dynamic approach to the test.


Chapter 4 A dynamic perspective on L2 vocabulary knowledge

4.1 Introduction

Vocabulary is the largest area of linguistic knowledge, and thus most affected by attrition, both in individuals and in language communities (Bardovi-Harlig & Stringer, 2010; Gross, 2004). Since words are the primary carriers of meaning (Vermeer, 2001), vocabulary provides the knowledge required for success in other areas of language proficiency (Laufer & Nation, 1999). L2 vocabulary size has been found to correlate closely with reading comprehension (Beglar & Hunt, 1999; Jiang, 2004; Laufer, 1992a, 1992b; Qian, 1999), as well as with writing ability (Astika, 1993; Laufer, 1998; Laufer & Nation, 1995; Linnarud, 1986). Not only vocabulary size, but also control over the multiple aspects of its knowledge is essential for achieving proficiency (Krashen, 1989; Laufer, 1992a, 1992b; Qian, 1999; Zareva, Schwanenflugel, & Nikolova, 2005). Thus, a growing number of studies focus on L2 vocabulary acquisition across various paradigms of knowledge. Among the many accounts of vocabulary knowledge, the distinction between receptive and productive skills is commonly accepted (see, for instance, Melka, 1997). The widely documented finding that L2 receptive vocabulary knowledge does not immediately or readily transfer to production is referred to as the receptive-productive gap (Laufer, 1998; Laufer & Paribakht, 1998; Schmitt & Meara, 1997). The development of vocabulary knowledge is complex, unstable and unpredictable. Yet despite the frequent application of case study methodology in L1 vocabulary research, and calls for its extension to L2 vocabulary studies (e.g., Henriksen, 1999; Meara, 1995), the bulk of the research in this area is cross-sectional, with few exceptions (cf., Fitzpatrick, Al-Qarni & Meara, 2008; Horst & Meara, 1999). Typically, studies of receptive and productive knowledge analyze differences between learner populations, or across different word frequencies, on single measurement occasions. Even longitudinal studies usually focus on pre- and post-treatment and group-level comparisons (cf., Schmitt & Meara, 1997). Consequently, not much is known about the development of the gap over time, and even its size and nature are subject to debate (e.g., Fan, 2000; Webb, 2008). No study so far has addressed the gap as a temporal and developmental phenomenon that may embody changing interactions between vocabulary knowledge levels. The current study focuses on the receptive-productive gap from this developmental and dynamic perspective. Rather than distinguishing two dichotomous categories of receptive and productive vocabulary, it investigates a hierarchy of four interconnected knowledge levels. This continuum has been adapted from two paradigms (Laufer & Nation, 1999; Laufer, Elder, Hill & Congdon, 2004). When combined, these paradigms achieve a fuller coverage of a range of least- to most-productive modalities consisting of word recognition, word recall, controlled production and free production. Recognition, in the context of the current study, is the ability to identify a given word form from several choices when supplied with its meaning. Recall is word retrieval from memory given meaning. Controlled production is the elicited production of a word in a given context from which it is omitted. Free production, the highest knowledge level, is spontaneous and unprompted use in writing. In order to inspect the development of this knowledge continuum, the study compiled data from four advanced learners during a 36-week period of immersion in academic English (L2) settings. The analyses of this data involve three steps, which correspond with the three main research questions of the study. The first step confirms the presence of the gap and its consistency over the study period; the second investigates nonlinear interactions between the four knowledge levels through variability analyses; the final step tests the hypothesis that development across the receptive-productive knowledge continuum can be explained by a dynamic precursor model.
This model is a blueprint of interactions between co-developing components in a dynamic system, which has been successfully applied to other areas of language development (van Geert, 1991, 1993; Robinson & Mervis, 1998), as described in the preceding chapters. This chapter begins with a theoretical background addressing the overall characteristics of vocabulary development in L1 and L2, key approaches to vocabulary knowledge, the receptive-productive distinction and the relevance of the dynamic approach to these issues. The background is followed by a specification of the research questions and hypotheses and the corresponding methodology. The subsequent results section is divided into three parts, in line with the three steps detailed above. Each part is followed by a brief interpretation, which is expanded in the final discussion section.


4.1.1 Background: Overall vocabulary growth

Vocabulary development in a native or foreign language is an ongoing and lifelong process. While L1 vocabulary acquisition is usually researched in the context of early speech or child literacy, there is no point at which it ceases. For instance, Singleton (1999) cites two studies (Carroll, 1971; Diller, 1971) which have shown that the L1 lexicon continues to increase in size until middle age and beyond. Similarly, while L2 vocabulary acquisition is usually investigated as the outcome of formal learning or intensive language immersion, virtually any situation enables its development at some level of knowledge (Singleton, 1999). To elaborate this claim, new words or new meanings and connotations, which are conjoined with pre-acquired words, are constantly acquired. Thus the lexicon is never stable, and is always in a state of flux (Elman, 1995). However, most studies investigate vocabulary growth in contexts where it is likely to be statistically significant or linearly linked with other factors (Meara, 1995). As described in the previous chapters, early L1 vocabulary development tends to follow a trajectory that consists of a slow onset, a rapid ascent, and a subsequent plateau. The ascent, occurring between the ages of 13-18 months, is known as the vocabulary burst (cf., Woodward, Markman, & Fitzsimmons, 1994). The general shape of early L1 vocabulary development can thus be depicted by a logistic S-shaped function, which produces a classic learning curve (van Geert, 1991, 1994). Questions surrounding this course of vocabulary development have been central to the linguistics field. Chomsky (1957, 1965, 1986) was the first to note the contrast between the relative lack of input that infants are exposed to and their vast creative abilities in language.
Although this paradox, known as the poverty of stimulus, initially concerned grammatical abilities, it has been extended to the lexicon: The rate of lexical acquisition in the child is so rapid and precise that one has to conclude ‘that the child somehow has the concepts available before experience with language and is basically learning labels for concepts that are already part of his or her [innate] conceptual apparatus’ (Chomsky, 1988, p. 28, as cited in Singleton, 1999, p. 42).5

5 Singleton (1999) goes on to list other prominent linguists and psychologists who maintain similar arguments with regard to early L1 development and specifically to vocabulary acquisition.

The nativist approach thus stipulates an intrinsic language learning mechanism that enables children to process large numbers of words in a short time, but atrophies and becomes defunct in puberty. This claim has been contested by various alternative explanations. For example, several studies have pinpointed the modified traits of adult child-directed speech, such as frequent naming, as enabling rapid infant vocabulary acquisition, while others have isolated the object identification phase, which coincides with the onset of early speech, as a key factor (see Singleton, 1999 for an overview). More recently, computational simulations have shown that the vocabulary spurt may be an inherent byproduct of the frequency-based statistical distribution of lexical items in all languages (McMurray, 2007). In L2, there is evidence for a similar vocabulary spurt. Based on Laufer (1991) and Meara (1997), Larsen-Freeman and Cameron state that “[L2 vocabulary learning] begins quite slowly; once a certain number of words are mastered, learning increases in rate until vocabulary size reaches some level that seems to serve the student well enough, and then the rate of learning goes down. Vocabulary size does not increase linearly over time. On a graph, the rate of vocabulary growth with proficiency appears as an S-shaped curve” (2008, p. 31).
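The S-shaped pattern of slow onset, acceleration and leveling off can be reproduced with a single logistic grower. The Python sketch below assumes the basic logistic form that remains when all support and competition terms are removed from the coupled equations in Chapter 3. The function name logistic_growth and the values suggested for the seed, growth rate and carrying capacity are illustrative assumptions and are not fitted to any vocabulary data.

```python
def logistic_growth(initial, rate, capacity, steps):
    """Return the levels of a single logistic grower over the given steps."""
    levels = [initial]
    for _ in range(steps):
        current = levels[-1]
        # Growth slows down as the level approaches the carrying capacity.
        levels.append(current * (1 + rate - rate * current / capacity))
    return levels
```

With logistic_growth(0.01, 0.2, 1.0, 100), for instance, the gain per step is small at first, largest around the middle of the curve, and small again as the level approaches the carrying capacity.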

The typical logistic growth of early L1 or L2 vocabulary (and indeed of many other knowledge types) is accompanied by a high degree of variability, which manifests in performance fluctuations. There is evidence for an inherent instability of vocabulary knowledge in both L1 and L2, at different time scales and knowledge levels. For example, Bloom (1974, as cited in Melka, 1997) showed that in children’s L1, certain words which were produced a day earlier could not be re-elicited in some cases, attributing this finding to the high context-dependency of early L1. While this evidence may be anecdotal and pertain only to a limited number of words, it nevertheless illustrates the variable nature of L1 lexical knowledge. Similarly, a case study of beginner-level L2 vocabulary acquisition reported decreased knowledge of some words, even during a short period of intensive learning (Fitzpatrick et al., 2008). On a much longer time scale, a cross-sectional study of L2 vocabulary noted decreased knowledge of words from certain frequencies in some learners, even after prolonged immersion in the target language (Schmitt & Meara, 1997). Recently, de Bot and Lowie (2010) conducted a longitudinal study of the stability of L1 and L2 vocabulary knowledge of a very advanced learner. They found that vocabulary knowledge, measured as reaction times to a simple naming task of a restricted set of items, showed similar fluctuations, expressed as weak correlations between measurements. This intra-learner variation was much higher in the L2 than in the L1.


However, a previous study of L1 learning post-attrition has shown that even L1 vocabulary knowledge is unstable, with the same words unsystematically recognized across observations (de Bot & Stoessel, 2000). The nonlinearity of L1 and L2 vocabulary development can be associated with the characteristics of dynamic systems (van Geert, 1994, 2008). As discussed in the previous chapters, the dynamic perspective considers variability, manifested in growth fluctuations, as predicting and enabling qualitative jumps (van Geert & van Dijk, 2002). Additional evidence for the dynamic characteristics of vocabulary knowledge, this time in a bilingual framework, comes from what Meara (2005) refers to as the Boulogne Ferry effect. The phenomenon is so named because it has been noted during ferry crossings of the English Channel. It entails an activation of a small set of dormant L2 words via exposure to their written form, which generates a dramatic iterative activation of a much larger number of words. Meara (2005) has simulated the effect as a simplified on-off activation network of a small number of interconnected items in a bilingual lexicon. In this lexicon, the initially dominant subset of L1 words remained relatively stable, while the initially dormant subset of L2 words was sensitive to perturbation. Thus, the activation of L1 items did not produce large or lasting effects, whereas activating even a small set of L2 items sometimes launched extensive iterative growth patterns. Such oscillations resulted in large L2 vocabulary gains, rendering the L2 lexicon temporarily dominant. When activation of L2 items ceased, the system returned to its original state of L1 dominance. Although not explicitly linked to the dynamic approach, Meara’s model resonates with the typical behavior of a dynamic system. On the one hand, it shows seemingly sudden and not entirely predictable growth as the outcome of iteration; on the other, it returns to settle in a steady attractor state.
These patterns are reminiscent of Lorenz’s Butterfly Effect, as illustrated in the bifurcation plot (see section 2.1.1.2). As noted in the previous chapters, the dynamic approach to variability also emphasizes its role as reflecting changes in internal interactions within the system. From this perspective, nonparallel variability patterns in early lexical growth have been interpreted as indicative of hierarchical interactions between different linguistic components, in line with the precursor model. Mathematical dynamic simulations have supported this notion, by showing that precursor interactions between early L1 vocabulary and syntax can replicate the S-curve of vocabulary growth (Robinson & Mervis, 1998; van Geert, 1993). These studies are readdressed in the following chapter, in the context of the lexical-syntactic codependence. Aside from the vocabulary spurt in infancy, another period in which a large amount of L1 vocabulary is attained also coincides with the rapid acquisition of other skills, namely those involved with literacy. At this stage, the connection of word form, meaning and sound with orthography continues to be established even while new words are acquired by means of the emerging reading skills. As described in section 3.4.5, van Geert (2003) maintains that the precursor model is also applicable in this context: reading skills depend on a threshold amount of known words, but while reading ability is established, the cognitive resources of the child are predominantly dedicated to it, causing a temporary decline in the acquisition rate of new vocabulary. Once a certain level of reading proficiency is reached, this competition weakens, eventually turning into a mutually supportive interaction. In line with these findings, the present study stipulates that the variability that accompanies L2 vocabulary development may also indicate precursor interactions between receptive and productive modalities of knowledge. The following section addresses the general construct of vocabulary knowledge, before proceeding to discuss the receptive-productive distinction in further detail.

4.1.2 Aspects of vocabulary knowledge

Vocabulary knowledge is multi-faceted (Schmitt, Schmitt, & Clapham, 2001), and “not an all-or-nothing proposition” (Melka, 1997, p. 87). While some studies, for instance Meara’s (2005) simulation of the Boulogne Ferry Effect or de Bot and Lowie’s (2010) exploration of the stability of vocabulary knowledge, treat it as a general construct by focusing on a single knowledge level, many studies address various aspects of vocabulary knowledge. Numerous definitions and research paradigms ensue from this approach. A prominent distinction separates the breadth and depth of vocabulary knowledge (Qian, 1999; Read, 2000). Breadth refers to size, or number of known vocabulary items, while depth refers to their degree of knowledge. Depth encompasses many knowledge paradigms, for example the degree of word familiarity (Paribakht & Wesche, 1996), or the degree to which associative word knowledge is native-like (e.g., Schmitt, 1998). In the current study, the focus is on the distinction between receptive and productive knowledge (cf., Laufer & Nation, 1999; Melka, 1997; Schmitt, 2010), which is widely recognized as “a bridging dimension between lexical competence and performance” (Henriksen, 1999, as cited in Zareva et al., 2005, p. 570). Due to its saliency in L2 learning and teaching, the receptive-productive distinction has “great ecological validity” (Schmitt, 2010). However, definitions of receptive and productive vocabulary knowledge vary greatly across studies, as do testing methods of both modalities (Melka, 1997). Vocabulary production is considered by some researchers as the ability to retrieve or recall a word (e.g., Horst & Meara, 1999; Read, 2000; Webb, 2008). Studies that maintain this view also vary in their definitions of recall, among which are translation ability, retrieval of word meaning or form in the target language, or associative knowledge. Other studies diverge from this view and consider vocabulary production as the ability to write the target word in context. This ability, in turn, may be elicited (controlled) or spontaneous (free) (Laufer & Nation, 1995). The controlled-free distinction is essential, “because not all learners who use infrequent vocabulary when forced to do so will also use it when left to their own selection of words” (Laufer, 1998, p. 257). As Singleton points out, “mastery of individual forms and meanings in isolation is absolutely no guarantee of a capacity to recognize or appropriately deploy the words in question in context” (1999, p. 51). This can be seen in the finding that even advanced L2 learners tend to avoid the use of infrequent vocabulary in writing, preferring earlier high-frequency vocabulary due to its smaller error risk (Ringbom, 1998). For example, a corpus study showed that 90% of words used in intermediate-advanced ESL writing belong to the most frequent 1,000 word list, a large part of which are function rather than content words. Native speaker writing, on the other hand, typically contains only 70% of the 1,000-word category, and even in native speech, this ratio is 80% (Cobb, 2003).
In addition to the lack of consistency in defining receptive and productive vocabulary knowledge, there is also disagreement on the nature of their distinction. While some studies consider the receptive/productive distinction as dichotomous (e.g., Fan, 2000; Fitzpatrick et al., 2008; Laufer, 1998), others extend it to several subcategories. One such paradigm differentiates recognition, recall, comprehension and use (Read, 2000). Another scheme specifies a four-level continuum that ranges from the ability to recognize the meaning of a given word from several distractors, to the ability to recall its meaning on the basis of form (Laufer et al., 2004). This paradigm forms a basis for the one used in the current study, and is readdressed in the following section. The debate extends to the definition of vocabulary knowledge in terms of storage and stability. While traditional paradigms of the mental lexicon describe a static, dictionary-like structure that exists independently of its use (cf., de Bot, 1992; Levelt, 1992), alternative accounts address the episodic, complex and interconnected nature of vocabulary knowledge as it develops over time (de Bot et al., 2005; de Bot & Lowie, 2010; Meara, 1989, 2001, 2005; Elman, 1995, 2004, 2009; Spivey, 2007). This dynamic knowledge consists of “highly context-sensitive, continuously-varied, and probabilistic” word representations which are “trajectories through mental space” (Elman, 1995, p. 199) rather than fixed lexical entries. In this view, different aspects of vocabulary knowledge (such as word recognition, comprehension or production) are graded processes which are affected by interactions that determine the ability to activate a word at a given knowledge level in a specific context. Words do not have fixed meanings, uses or contexts that are stored in the lexicon, but particular sets of meanings, associations, stylistics and so forth for any given circumstances at any specific time. Therefore, at any moment, the state of the lexicon is an iterative dynamic function that reflects its prior state (de Bot & Lowie, 2010; Elman, 1995, 2004, 2009). The instability of vocabulary knowledge is thus reinforced not only as a byproduct of learner performance, but as an intrinsic quality of the lexicon.6 This view is also compatible with a broader view of language development and use as inextricable from the structural characteristics of language (Beckner, Blythe, Bybee, Christiansen, Croft et al., 2009; Ellis & Larsen-Freeman, 2009). Invariably, it strongly contradicts the generativist/nativist approach and its emphasis on storage and innate mechanisms.
These theoretical innovations, however, are seldom reflected in empirical studies, particularly of L2 vocabulary. Rather, most studies exclude inter- and intra-learner variability, emphasizing the end-product and not the process of learning. Despite the theoretical, terminological and operational disagreements on vocabulary knowledge, there is consensus in the literature that production is more sophisticated than reception. Overall, the receptive-productive distinction refers to partial or graded knowledge, with reception representing a more incomplete way in which knowledge is “stored in the mental lexicon” (Melka, 1997, p. 85) in comparison with production. Whether accounts of such “storage” are more literal or emergentist/dynamic, they share the notion of different aspects of knowledge, some of which (in this case, receptive) are more elementary than others (in this case, productive).

6 For a detailed account of the dynamic approach to the lexicon, see Goldinger, 1998; van Orden & Goldinger, 1994.

4.1.3 The receptive-productive gap

For many L2 learners, the inability to produce receptively-known words is familiar and frustrating. While it is most readily apparent in speech, many written vocabulary studies have also noted that receptive knowledge does not necessarily transfer to production (Laufer, 1998; Laufer & Paribakht, 1998; Schmitt & Meara, 1997; Webb, 2008). What causes a word’s transition from receptive to productive knowledge is also debated. Some studies point to the degree of automaticity as an explanatory factor (cf., Meara, 1996), while others posit word familiarity (Melka, 1997) or frequency (e.g., Laufer & Paribakht, 1998). In general, there are two main and competing approaches towards the disparate development of receptive and productive knowledge. The first maintains that reception and production rely on different mental processes, related in turn to the declarative (“knowledge that”) and the procedural (“knowledge how”) memory systems (Ryle, 1949). This approach appears at different strengths, interpretations and emphases (Paradis, 2009; Robinson, 1989, 1993; Ullman, 2001a, 2001b). The second approach claims that although reception precedes production, the gap between them changes in accordance with linguistic or contextual factors. It considers the two aspects of knowledge as reliant on a single system, assuming “some kind of unitary language proficiency and interaction between receptive and productive skills in the process of learning a language” (Melka, 1997, p. 93). This assumption enables the investigation of vocabulary knowledge dimensions as interconnected components of a single system, rather than as isolated structures. Regardless of how receptive and productive knowledge are defined, investigated, or explained, there is strong evidence for the saliency of the gap between them, starting from the earliest stages of language development: Young children can understand forms well before they can produce them.
Infants under a year old, for example, understand some words for up to three or four months before they try to produce them; older children understand comparative word forms, for instance, long before they themselves can produce any, as well as novel-derived nouns before they themselves coin any (Clark, 1993, p. 245).

53 CHAPTER 4

Although the general agreement is that reception precedes production, the variability that accompanies the overall growth of vocabulary size (or breadth) can also be observed with regard to the receptive-productive distinction (or depth). For example, Melka cites two studies which have shown that, in early L1, children tend to produce many words before comprehending or even systematically recognizing them in adult speech (Keeney & Wolfe, 1972; Hagtvet, 1980, in Melka, 1997). Likewise, it has been observed that productive knowledge of certain L2 vocabulary items can temporarily exceed their receptive knowledge (Bloom, 1974, as cited in Melka, 1997). Tip-of-tongue (TOT) experiments have demonstrated that certain word forms cannot be retrieved even while their meaning is readily comprehended, thereby providing additional evidence for the high degree of fluctuation in vocabulary knowledge in general, and across the receptive-productive distinction in particular (Burke, MacKay, Worthley & Wade, 1991; Clark, 1993). In L2, some studies claim that the receptive-productive gap is predominant at beginner level and then gradually disappears as vocabulary knowledge develops. This observation is explained by the hypothesis that “deeper knowledge of words is the consequence of knowing more words, or that, conversely, the more words someone knows, the finer the networks and the deeper the word knowledge” (Vermeer, 2001, p. 222). For instance, a very early study, which operationalized reception as the recognition from a multiple choice of items and production as the translation ability of the target word, found that after five trimesters of L2 (German) classes, the gap was significantly smaller than in the first trimester, when receptive vocabulary size was nearly twice that of productive vocabulary. The researchers concluded that receptive vocabulary develops faster, but that production eventually catches up with it (Morgan & Oberdeck, 1930, as cited in Waring, 1999). 
A later study (Takala, 1984, as cited in Melka, 1997) employed a similar operationalization of receptive and productive knowledge of EFL in Finnish learners and corroborated Morgan and Oberdeck’s results. However, Melka (1997) has pointed out that a focus on high frequency words and lenient scoring, which did not control for cognates, may have biased the findings in both studies. In contrast, other studies did not report a similar change in the ratio of receptive to productive L2 vocabulary knowledge. Schmitt and Meara (1997) inspected vocabulary gains before and after a year of ESL immersion. They operationalized reception and production as recognition and recall, respectively, of

54 A dynamic perspective on L2 vocabulary knowledge

word associations and derivational and inflectional verb suffixes. They also measured overall vocabulary size with Nation’s (1990) Levels Test, and general proficiency by the TOEFL. Schmitt and Meara’s findings suggested that productive vocabulary does not reach the size of receptive vocabulary, but remains stable and 19-25% smaller. However, they also noted “wide variations in individual vocabulary learning” (1997, p. 33). For instance, some learners showed virtually no increase, or even decline, in certain word frequencies. Thus, Schmitt and Meara recommended that future studies inspect variability as well as group trends. Their report aligns with the dynamic and episodic account of the mental lexicon, since it shows that L2 vocabulary knowledge is not only unpredictable in its growth over time, but also with regard to its maintenance, even when it can be expected to increase or at least stabilize due to intensive language exposure.

Other studies, regardless of differences in operational definitions and assessment methods of reception and production, similarly report a high degree of variability and fluctuation in individual learners. For instance, Laufer (1994), who defined production as the proportion of high-to-low frequency words in advanced EFL writing, noted that the receptive-productive correlation varied highly over time not only between, but also within learners. In another study, Laufer (1998) contrasted three knowledge levels – reception, controlled production, and free production – in learner groups from two consecutive years of high school. She defined receptive knowledge as identification of meaning (recognition), measured by Nation’s Levels Test (1990); controlled production as elicited use in context, measured by the Controlled Productive Knowledge Test (Laufer & Nation, 1995, 1999); and free production as spontaneous use in written essays, measured as the ratio of specialized to frequent vocabulary. 
While the study found significant and positively correlated gains in recognition and controlled production, growth in controlled production was lower. Moreover, free production did not show a significant increase or correlation with recognition or controlled production in either learner group. The development of free production was therefore defined as “a plateau” (Laufer, 1998, p. 266). Thus, in contrast to the claim that the gap diminishes with time, Laufer concluded that it tends to increase, questioning whether this tendency was the result of instruction, or an expression of “the nature of lexical learning” (Laufer, 1998, p. 269).


In a follow-up study, Laufer and Paribakht (1998) compared the same three vocabulary knowledge levels across learners of varied proficiencies in EFL vs. ESL conditions. They found that in both conditions, the gap remained robust in individual learners, particularly with regard to free production. In fact, even between learners, the ESL condition had no positive impact on the size of the gap until a minimum of two years’ immersion. Moreover, counter-intuitively, the gap was generally larger in the ESL population. Similarly, a study of the pre- and post-treatment effects of an academic ESL writing course found “virtually no change in the total number of words used” (Shaw & Liu, 1998, p. 246), even while other aspects of writing showed improvement.

In a study focused exclusively on the gap, Fan (2000) tested 138 Chinese EFL learners in nine proficiency levels, defining reception as recognition of word meaning, in line with Nation’s (1990) Levels Test, and production as controlled (elicited) rather than free. Fan found no stable correlation between the two knowledge types and proficiency, reporting a lack of increase in both receptive and productive vocabulary knowledge. Moreover, unlike Laufer’s (1998) finding of a significant positive correlation between recognition and controlled production (yet not between these levels and free production), the correlation between recognition and controlled production was not significant in Fan’s study, leading to the conclusion that the magnitude of the gap may be both unrelated to L2 proficiency and “far from constant” (2000, p. 117). It should be considered, however, that the gap in this case refers only to recognition vs. controlled production. Since the size of the gap varied greatly between the nine proficiency levels, Fan also predicted that it would be even more pronounced in individual learners, thereby joining the recommendations for using case study methodology to reconfirm the existence of the gap and reveal its causes. 
In another study targeted expressly at the gap, Webb (2008) operationalized reception as the ability to translate from L2 (English) to L1 (Japanese), and production as L1 to L2 translation. He employed two scoring methods, sensitive and strict, which correspondingly included and excluded spelling errors, and tested the knowledge of words from three ranges of frequency. The sensitive scoring revealed a larger gap than the strict one, as well as large individual differences. However, in line with the TOT studies mentioned earlier (Burke et al., 1991), as well as with Schmitt and Meara (1997), Webb noted that for some learners production actually exceeded reception in some word frequencies. In other words, certain learners use some words


without having full knowledge of their meaning, like children imitating words heard out of context without fully comprehending them (Keeney & Wolfe, 1972; Hagtvet, 1980, both as cited in Melka, 1997). Nevertheless, Webb concluded that receptive vocabulary is generally larger than productive vocabulary and that the gap increases in lower frequency words. He also maintained that in general, the larger the receptive vocabulary, the bigger the chance that receptively-known items would also be known productively.

Webb’s claim is in line with a study by Laufer et al. (2004), who tested cross-sectional scores on a receptive-productive hierarchy of four knowledge levels: passive recognition, active recognition, passive recall, and active recall. Passive, in this context, refers to the elicitation of meaning when presented with form; active refers to the elicitation of form by meaning. Recognition entailed identifying the target word from a multiple choice of four items; while recall was tested by direct elicitation, with no prompts except the initial of the target word. The researchers reported significant correlations between the four levels, yet stressed the need to separate the four constructs and “report performance on at least three of the strength modalities separately, since (…) a learner may recognize a word like melt without necessarily being able to recall the word itself” (Laufer et al., 2004, p. 223). Like the aforementioned studies, Laufer et al. also recommended longitudinal investigations of their vocabulary knowledge paradigm, stating that “conclusions about vocabulary development cannot be drawn on the basis of data gathered at a single point in time” (2004, p. 224). Laufer et al.’s decision to distinguish conceptually close knowledge aspects is corroborated by a case study of L2 (Arabic) vocabulary development, which tracked daily retention of words learned at a pace of 15 per day, over a 20-day period. 
Following this period, learning ceased and retention was tested weekly for a month. The study contrasted receptive and productive knowledge, with reception defined as identification of the L1 translation of the target word from a multiple choice, and recall defined as direct translation of the target word to L1, with no prompts. The study reported a disparity between the two levels, expressed not only in a slower acquisition rate of recall, but also in its much faster decline (Fitzpatrick et al., 2008). This close-up on the intricacies of short-term L2 vocabulary acquisition shows that even knowledge levels that are relatively close on the receptive-productive


continuum can exhibit markedly different developmental paths. The discrepancy is notable, because it applies not only to a brief time span, but also to a small set of words. Fitzpatrick and her colleagues also noted that the gap was more robust in words that were acquired later in the learning period, which were evidently rehearsed less. Since a shorter period of exposure can also be assumed for lower-frequency vocabulary, this finding aligns with frequency effects on the size of the gap, as reported by previous studies (Laufer et al., 2004; Laufer & Paribakht, 1998). In addition to these analyses, Fitzpatrick et al. simulated their data with a probability matrix.7 This procedure successfully replicated the disparate patterns of decline in both knowledge levels, although it did not account for their daily fluctuation. The simulation generated developmental patterns that resembled inverse learning curves, and like the data, it showed that the more-productive level (recall) was more sensitive to input or learning cessation than recognition. Both the empirical data and the simulation appear compatible with the dynamic perspective on SLA, since they show the two knowledge levels as developing (or rather, declining) nonlinearly. The fact that this variability is particularly pronounced in recall, the higher and more-productive level, may also reflect a complex interaction between the two levels. Thus the effect generated by the matrix, like the Boulogne Ferry Effect simulation (Meara, 2005), is reminiscent of dynamic self-organization. The following section elaborates on the potential relevance of the dynamic approach for L2 vocabulary development in general and the receptive-productive gap in particular.

4.1.4 A dynamic perspective on vocabulary knowledge

Several recurrent findings, as described so far, suggest that the dynamic approach may be suitable for researching and explaining the receptive-productive gap. First, converging evidence from various studies, notwithstanding their divergent views on vocabulary knowledge, points to its nonlinear growth and to shifting and unstable interactions between its receptive and productive modalities. Second, there is also a justification for distinguishing various levels of receptive and productive knowledge, even when these are conceptually close, rather than solely dichotomous receptive/productive categories. This implies that vocabulary knowledge is intricate and stratified. Finally, there are indications that a focus on longitudinal growth and interaction between knowledge levels in individual learners may lead to a better

7 See Meara, 1989 for an elaboration on this technique.


understanding of vocabulary development, as the explanatory power of the cross-sectional and linear approach is rather limited in this case. The dynamic perspective appears relevant for describing and explaining the development of receptive-productive vocabulary knowledge, since it addresses both nonlinear growth and changing interactions over time as functions of general and ecologically-valid constraints, such as limitations on cognitive abilities. The dynamic approach does not require the presupposition of separate mechanisms, but can investigate nonlinear and nonparallel development as arising from interactions between co-developing constituents of a dynamic system within a specified hierarchical order. Accordingly, the current study postulates that the dynamic precursor model, which embodies the notions of inherent order and resource limitations, can be extended to the receptive-productive gap in L2 vocabulary. Vocabulary reception is an earlier skill than production, just as children’s one- or two-word utterances are earlier strategies in comparison with 3-word utterances, or lexicon is a predecessor of syntax. In the two latter areas, early skills were identified as conditional precursors of later skills. As chapters 2 and 3 have shown, mathematical dynamic models have been successfully used to simulate disparate growth and changing interactions between these linguistic precursors and their dependents (Bassano & van Geert, 2007; van Geert, 1993). Similarly, receptive knowledge can be considered as a precursor to production, and the process of transferring receptively-known vocabulary into productive use can be seen as competing with the continued acquisition of receptive vocabulary for limited systemic resources such as attention or memory. 
These assumptions do not negate reports of instances at which production of certain words may precede reception, since production may sometimes be based on incomplete knowledge (cf., Clark, 1993; Hagtvet, 1980, as cited in Melka, 1997). It is clear that while the interactions between the receptive-productive modalities may not be entirely predictable, a basic hierarchy can still be assumed: a word must be recognized, at least phonologically, before it is uttered, even when its meaning remains unknown. Thus, the main hypothesis of this study is that the dynamic approach, and specifically the precursor model, can account for the receptive-productive gap. The following section elaborates on the questions involved in investigating this hypothesis.
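The precursor relation described in this section can be illustrated with a minimal sketch of two coupled logistic growers. This is not the model fitted in the study: the update rule, the function name simulate_precursor, and every parameter value (growth rates, carrying capacities, support, competition, threshold) are assumptions chosen only to show how a precursor (reception) and its dependent (production) might interact under limited resources.

```python
def simulate_precursor(steps=36, rate_rec=0.3, rate_prod=0.25,
                       cap_rec=1.0, cap_prod=0.6, support=0.05,
                       competition=-0.03, threshold=0.4,
                       rec0=0.2, prod0=0.02):
    """Iterate two coupled logistic growers: a precursor and its dependent.

    All parameter values are hypothetical and serve illustration only.
    """
    rec, prod = [rec0], [prod0]
    for _ in range(steps - 1):
        a, b = rec[-1], prod[-1]
        # Precursor (reception): logistic growth toward its carrying capacity,
        # slowed by competition with the dependent for shared resources.
        next_a = a + rate_rec * a * (1 - a / cap_rec) + competition * a * b
        # Dependent (production): unchanged until the precursor reaches the
        # threshold, then grows logistically with support from the precursor.
        if a >= threshold:
            next_b = b + rate_prod * b * (1 - b / cap_prod) + support * a * b
        else:
            next_b = b
        rec.append(max(next_a, 0.0))
        prod.append(max(next_b, 0.0))
    return rec, prod


reception, production = simulate_precursor()
# The receptive-productive gap at each (hypothetical) weekly observation.
gap = [r - p for r, p in zip(reception, production)]
```

With these assumed values, production stays flat until reception passes the threshold, after which both levels grow at different rates and the gap between them changes over time rather than remaining constant.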


4.1.5 Research questions and predictions

This study has three main research questions. They pertain to a four-level continuum of receptive-productive modalities in an ascending order, which consists of word recognition, recall, controlled production and free production. The operationalization and longitudinal assessment of these levels are detailed in the Methodology section of this chapter.

1) The first question is whether the receptive-productive gap is stable, or whether it changes over time. This question is addressed by inspecting the course of development on all knowledge levels, with an emphasis on free production, as well as by correlating these levels with the number of observations.
2) The second question concerns the interactions between the continuum levels and their stability. It is investigated by correlating the data values, and by comparing the variability patterns and shifts in (moving) correlations between pairs of knowledge levels.
3) The third question is if the development of the receptive-productive continuum, and hence of the gap, can be accounted for and adequately simulated by a model based on dynamic precursor interactions. It is addressed by configuring the findings from the previous analyses in a model based on coupled logistic equations. The model parameter values are then optimized and its outcome is compared to the data, while the optimized values are in turn compared with the interpretation of the analyses.
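The moving correlations mentioned in the second question can be sketched as a Pearson coefficient computed over a sliding window of observations. The window size and the function names below are assumptions for illustration; the study's actual window and software are not specified in this section.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson's r for two equal-length sequences; None if either is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # correlation is undefined for a constant series
    return cov / sqrt(var_x * var_y)


def moving_correlation(xs, ys, window=5):
    """Correlate two series within each consecutive window of observations."""
    if len(xs) != len(ys):
        raise ValueError("series must have the same length")
    if window < 2 or window > len(xs):
        raise ValueError("window must be between 2 and the series length")
    return [pearson(xs[i:i + window], ys[i:i + window])
            for i in range(len(xs) - window + 1)]
```

A shift of the windowed coefficient from positive to negative values over time would be one way to observe a change from supportive to competitive interaction between two knowledge levels.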

On the basis of the background literature, the predictions regarding these questions were as follows. Concerning the first question, it was expected that although all knowledge levels would increase during the study period, their growth would be disparate. Therefore, the gap would continue to manifest, although its size may fluctuate over time. With regard to the second question, it was expected that the variability patterns around the central growth trends of the data would indicate complex precursor interactions between the lower, more-receptive levels of the continuum and the higher, more-productive levels. This would imply that a certain amount of known receptive vocabulary is prerequisite for production, and that interactions between lower and more-established receptive levels and higher and less-established productive levels would reflect simultaneous support and competition. The final prediction was that a dynamic model configuring these interactions would achieve a good fit to the data, while maintaining the premise of precursor interactions between the knowledge levels.


4.2 Methodology

4.2.1 Participants and procedures

Since the dynamic approach requires dense longitudinal data, and as studies of L2 vocabulary frequently recommend longitudinal designs (Fan, 2000; Fitzpatrick et al., 2008; Meara, 2005), the study encompasses four separate case studies. The participants in these studies were all female, aged 24, 26, 23 and 28. Their native languages were Portuguese, Mandarin Chinese, Indonesian and Vietnamese, respectively. Although these languages do not play an active role in the study, they are used in reference to the participants. The participants were enrolled in four different English-speaking Master’s degree programs at a university in the . Their studies entailed intensive exposure to spoken and predominantly written academic vocabulary. Because this was their first immersion period in a language other than their L1, it was assumed to entail rapid change in their L2 in general, and in their academic vocabulary knowledge in particular. This assumption is in line with reports of significant vocabulary gains through reading (cf., Horst, 2005).

Accordingly, the study focused on academic English vocabulary, as it is defined by the University Word List (UWL) (Xue & Nation, 1984) and the Academic Word List (AWL) (Coxhead, 2000). The UWL contains 808 words, divided into 11 frequency bands, while the AWL includes 570 word families in ten frequency bands. Both lists were compiled by extensive corpus analyses across various academic disciplines, and exclude specialized as well as general use vocabulary.8 The UWL has been shown to successfully discriminate between proficiency levels (Laufer & Nation, 1999); whereas familiarity with the AWL, when combined with knowledge of the most-frequent 2,000 English words, was found to be crucial for success in academic studies (Beglar & Hunt, 1999). Although these lists overlap to a degree, it was decided to include both of them in the study. 
This decision was motivated by the need to increase the size of the item database used to longitudinally assess the vocabulary knowledge of the participants, and thereby prevent practice effects caused by repeated assessment of a limited set of words. Moreover, the distribution of the lists across different word frequency levels minimizes potential frequency effects on the receptive-productive gap, as described in

8 The most common 2,000 words in the General Service List (West, 1953)


the literature (Laufer, 1998; Laufer & Paribakht, 1998; Schmitt & Meara, 1997; Webb, 2008). Knowledge of words derived from the combined lists was assessed during a 36-week period that began at the onset of the participants’ academic studies. In each week, the participants completed a test of their vocabulary knowledge at three levels: recognition, recall, and controlled production, and wrote an essay aimed at assessing free production. The Materials section elaborates on this receptive-productive vocabulary knowledge paradigm and its assessment.

4.2.2 Materials

4.2.2.1 The knowledge paradigm

As the literature review has shown, despite agreement on the validity of the receptive-productive distinction, there is still substantial disagreement on the abilities that best represent receptive and productive knowledge, as well as on how these abilities should be measured. Therefore it is necessary for vocabulary studies to define a knowledge paradigm and a corresponding assessment method. The paradigm in the current study is monolingual, meaning that L2 knowledge levels are not defined in relation to L1 translation abilities, but only within the L2 context (Laufer et al., 2004; Laufer & Goldstein, 2004). It encompasses four knowledge levels: recognition, recall, controlled production and free production. These levels can be located on a continuum that ranges from most-receptive to most-productive, and is based on a combination of knowledge paradigms specified by Laufer et al. (2004) and Laufer & Nation (1995).

(Passive recognition)* --- Active recognition --- (Passive recall)* --- Active recall --- Controlled production --- Free production
* Levels implied, but not explicitly featured in the current study

As mentioned, in this context the term passive refers to activation of meaning by form, while active refers to activation of form by meaning (Laufer et al., 2004). The present study excludes the passive dimensions, as pilot studies revealed that passive recognition was generally too easy for a target population of advanced learners. These studies also showed that passive recall, defined as non-prompted elicitation of word meaning following exposure to its form (Laufer et al., 2004), was difficult to assess, because participants often supplied partial or associative definitions. The three lower knowledge levels were measured by the Longitudinal Academic Vocabulary Test (LAVT) (Caspi & Lowie, 2010), which combines


adaptations of two validated vocabulary testing methods (Laufer et al., 2004; Laufer & Nation, 1995). The LAVT consists of three sections, testing controlled production, recall and recognition, in this sequence, which ranges from most- to least-productive. This descending-order design is aimed at minimizing potential practice effects caused by exposure to a word at a lower knowledge level, which might facilitate its elicitation at a higher level, in line with Laufer et al.’s (2004) recommendation. Each section of the LAVT tests 30 different words, derived from a 1,000-item database that combines the UWL and AWL. For each test version, words were chosen from all frequency levels of the database by spaced sampling, selecting items at predetermined intervals by starting from random points. Free production was assessed by comparing the vocabulary used in the participants’ essays with the combined academic word lists. The following subsection elaborates on the assessment of free production through these essays, before explaining how the three lower knowledge levels were measured by the LAVT.
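The spaced sampling procedure described above can be sketched as systematic sampling with a random starting point. The function name spaced_sample, the placeholder item labels, and the assumption that the database is ordered by frequency level are illustrative; the actual interval and ordering used for the LAVT are not given here.

```python
import random


def spaced_sample(items, n, rng=None):
    """Pick n items at a fixed interval, starting from a random offset."""
    if n <= 0 or n > len(items):
        raise ValueError("n must be between 1 and the number of items")
    rng = rng or random.Random()
    interval = len(items) // n       # predetermined spacing between picks
    start = rng.randrange(interval)  # random starting point within one interval
    return [items[start + i * interval] for i in range(n)]


# Hypothetical 1,000-item database, assumed to be ordered by frequency level.
database = [f"word_{i:04d}" for i in range(1000)]
section_items = spaced_sample(database, 30, random.Random(0))
```

Because the picks are evenly spaced across an ordered list, each test section draws on all frequency levels, while the random offset varies the selection between test versions.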

4.2.2.2 Free production

The highest level in the continuum, free production, which is the spontaneous usage of vocabulary in writing, was assessed in freely written essays on equivalent expository or narrative topics. These were derived from lists of TOEFL essay subjects, and concerned personal experiences and opinions on general cultural, economic, environmental or educational issues.9 To encourage creativity and motivation, the participants were given a choice of three topics each week, from which they were asked to select a single topic and write an essay of approximately (but no less than) 350 words. The assignment of writing tasks on equivalent topics and the requirement of a minimal word length were intended to eliminate task variation effects (Cumming, 1989) and text length effects (Bulté, Housen, Pierrard, & van Daele, 2008). Free vocabulary production is difficult to measure, since it obviously cannot be elicited. Therefore studies of this construct need to rely on a limited number of occurrences. This problem is predictably magnified when investigating free production of less-frequent and more-specialized vocabulary. Previous studies calculated written free production by contrasting the ratio of their target vocabulary with that of a more-frequent vocabulary category (e.g., Laufer, 1998). However, a

9 See for example www.ets.org/Media/Tests/TOEFL/pdf/989563wt.pdf


pilot study showed that this approach has several drawbacks. First, the most frequent words (as specified by Nation, 1990) include numerous function words. It is thus not surprising that their proportion in comparison with academic or specialized vocabulary would appear almost static, regardless of whether it is calculated on the basis of word tokens, types or families. Additionally, the assessment of any vocabulary knowledge level other than free production incorporates, by default, aspects of accuracy and complexity. This is because any testing method invariably includes accuracy and complexity as implicit scoring criteria. Accuracy is implied because any response to an item with a word other than the target would be rejected, even when that word may be contextually appropriate; complexity is implied since each test item targets a different word. Moreover, usually knowledge of words derived from the same family would not be assessed in the same test version. Therefore, it was necessary to incorporate the accuracy and complexity dimensions into the free production measure, in order to render all knowledge levels on the continuum comparable, at least theoretically. In accordance with these considerations, the current study used a revised formula to calculate free production: the ratio of correctly used academic (UWL and AWL) families to total academic word tokens, divided by the total number of correct content words, multiplied by the general family/token ratio. In this way, both complexity (family/token ratio) and accuracy (correct item ratio) were incorporated in the free production index. The revised formula was intended to yield a better representation and comparison of free production with the three lower levels in the continuum.10 The accuracy ratios of the target words and of the general content words were based on the researcher’s judgment. Although ideally, more than one rater would be employed in such a case, this was not feasible in the current study. 
However, given reports of low inter-rater agreement (Polio, 1997), it is quite likely that the researcher’s judgment would have differed from that of another rater. The judgment of a single rater, on the other hand, was expected to be consistent, in the sense that any errors which may not have been identified would be consistently rather than occasionally overlooked. Since the focus of the study is on development over time rather than on error types, this was not

10 The identification of the target (academic) words in the essays, as well as the content word ratio and family/token ratios were based on the tools in Tom Cobb’s website, namely TextLexCompare and Vocabprofile; see www.lextutor.ca.


considered as a major disruption. Consistency in error identification was ensured further by re-checking the texts that were written in the first half of the study period, once all texts had been checked. This was done in order to prevent possible changes in rater judgment that may have occurred during the process of reviewing the corpus from affecting the accuracy ratios (see also section 5.2). Despite the adjustment of its formula, the free production ratio was still not expected to reach that of the lower knowledge levels, given its different conceptualization and assessment method. Moreover, the free production of specialized vocabulary cannot in any case reach an optimal value of 100% when calculated as relative to other word types, unlike the lower knowledge levels. However, the focus of the study is not on the optimal value that free production may attain, but on the magnitude and consistency of the distance between its developmental trajectory and that of the other lower knowledge levels. In other words, while it can be expected that free production would be consistently lower than the other knowledge levels, the main interest of the current study is the temporal fluctuations in free production and their relation to those in the other levels, i.e., the stability of the gap over time as a function of the overall development and dynamics of the knowledge continuum.
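The revised free production formula described in this subsection can be written out as a short sketch. The grouping of terms below follows a literal reading of the verbal description (academic family/token ratio, divided by the number of correct content words, multiplied by the general family/token ratio) and is an assumption; the function and parameter names are hypothetical, and the counts in the usage line are placeholders rather than study data.

```python
def free_production_index(correct_academic_families, academic_tokens,
                          correct_content_words, total_families, total_tokens):
    """Free production index under one literal reading of the description.

    index = (correct academic families / academic tokens)
            / correct content words
            * (total families / total tokens)
    """
    if academic_tokens <= 0 or correct_content_words <= 0 or total_tokens <= 0:
        raise ValueError("token and content-word counts must be positive")
    academic_ratio = correct_academic_families / academic_tokens
    family_token_ratio = total_families / total_tokens  # complexity component
    return academic_ratio / correct_content_words * family_token_ratio


# Placeholder counts for a single essay (not taken from the study).
index = free_production_index(10, 20, 100, 150, 300)
```

Under this reading the index combines an accuracy component (correctly used items) with a complexity component (family/token ratios), which is why it cannot be expected to approach the 100% ceiling of the test-based levels.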

4.2.2.3 Controlled production

Controlled production was tested by the first section of the LAVT. This section is a cued gap-fill test eliciting words in context, which is based on the productive version of the Levels Test (PVLT) (Laufer & Nation, 1995, 1999). The PVLT has parallel versions, which were found to be highly reliable (Laufer & Nation, 1999). Adapting its design for the current study entailed including two sentences per test item, in which the target word is omitted from two different contexts. This was done because the same word can be used in varied contexts and with different meanings, and it is therefore often difficult to surmise whether a word is known or not from its use in a single context. Both sentences were extracted from non-academic sources such as online magazine editions, in order to ensure that the surrounding vocabulary of the target words is relatively familiar. If the target word was conjugated, the corresponding morpheme was included in order to prevent confusion, since the grammatical aspect of word knowledge is not the focus of the study. The following is an example of a controlled production item:


(Target word: norm)
One child per family is fast becoming the n______ in some countries;
Many immigrants find it hard to adjust to European cultural n______s.

4.2.2.4 Recall and recognition

The second and third parts of the LAVT – testing recall and recognition, sequentially – are based on the monolingual version of the Computer Adaptive Test of Size and Strength (CATSS) (Laufer & Goldstein, 2004; Laufer et al., 2004),11 which is comprised of four parts that correspond with a descending four-level continuum. These levels are active recall, passive recall, active recognition, and passive recognition, from which the current study has omitted the passive dimensions (as explained in section 4.2.2.1). Like the CATSS, the LAVT elicits recall by providing a dictionary definition of the target word and its initial as cues, and recognition by supplying the same definition and presenting the target word in a multiple choice of four options. The distractors alternate between words of a similar frequency12 and words of a similar affixation (prefix or suffix). The variance in distractor types was based on the premise that words of a similar frequency are often very different in form. Moreover, a search of the British National Corpus revealed that in some cases, words close in frequency to the target word may in fact be much more commonplace than the target word in spoken language. The following are examples of a recall item and of two recognition items, which correspond with the two types of distractors.

(Target word: guarantee)
To promise that something will be done or will happen
g______

(Target word: preliminary; distractors of a similar frequency)
Coming before a more important action or event, especially introducing or preparing for it
a) applicable b) preliminary c) immense d) extended

(Target word: converse; distractors of a similar affixation)
To talk with others
a) contain b) contend c) converse d) conscript

[11] The CATSS is designed to measure vocabulary knowledge size and strength simultaneously, diagnosing “vocabulary development” (Laufer et al., 2004, p. 202).
[12] Extracted from Kilgarriff’s lemmatized British National Corpus (BNC) frequency list; see www.kilgarriff.co.uk.

66 A dynamic perspective on L2 vocabulary knowledge

4.2.2.5 Testing and scoring considerations

The score for each level tested by the LAVT was calculated as the ratio of correct to total items. Prior to the study onset, the test items were trialed longitudinally by four native Mandarin speakers, who were students in English-speaking academic degree programs, as well as cross-sectionally in nonnative (L1 Dutch) academic English users. The longitudinal testing was intended to identify items that were potentially confusing or misleading; the cross-sectional testing was aimed at validating the equivalent form reliability of different test versions.

The longitudinal trial showed that in some cases synonyms that begin with the same initial as the target word may be elicited. Therefore, where necessary, additional letters were added to the initials of certain target words. Since the number of letters that can be added without supplying the answer is naturally limited, items were eliminated when they required more than three letters, or when the target word was relatively short. In some test items, the final letter of the target word was used as a cue.

In the cross-sectional trial, each participant completed two randomly generated LAVT versions, with 30 items in each test part. Half of the participants completed one pair of versions and the other half a different pair, so that four LAVT versions were tested for equivalence simultaneously. The result was a significant correlation (p<0.01) between the paired test versions, indicating that these randomly generated versions are indeed equivalent (see Table 2 below).

Test part               Pearson's r   Number of participants
Controlled production   0.775         27
Recall                  0.844         32
Recognition             0.733         31

Table 2. Correlations between two LAVT versions
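As a rough illustration of the equivalence check summarized in Table 2, the sketch below scores a test part as the ratio of correct to total items and computes Pearson's r between two test versions. The participant scores and the function names ratio_score and pearson_r are hypothetical and serve only as an example; they are not the study's data or software.

```python
from math import sqrt


def ratio_score(correct: int, total: int) -> float:
    """Score for one test part: correct items divided by total items."""
    return correct / total


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: correct answers out of 30 items for six participants,
# each of whom completed two versions of the same test part.
version_a = [ratio_score(c, 30) for c in (21, 25, 18, 27, 23, 15)]
version_b = [ratio_score(c, 30) for c in (20, 26, 17, 28, 21, 16)]

r = pearson_r(version_a, version_b)
print(f"Pearson's r between the two versions: {r:.3f}")
```

A high positive r between the paired versions is what the reliability check looks for; judging its significance would additionally require a test against the relevant sampling distribution, which this sketch leaves out.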

4.3 Results

Due to the extensive graphic representation required by the dynamic approach, most of the analyses are accompanied by an interpretation, with each subsection summarized. The subsequent Discussion section then recapitulates the findings, relating them to the research questions.

4.3.1 The data at a glance

Figure 10 contains scatter plots of the data for the four participants during the 36-week study period. In each plot, the y-axis represents the ratio of known vocabulary at each level, while the x-axis represents time, as the number of weeks elapsed from the study onset.

[Figure 10: four scatter-plot panels (the Portuguese, Mandarin, Vietnamese, and Indonesian speakers); series: recognition, recall, controlled production, free production; x-axis: weeks; y-axis: ratio (0 to 1)]

Figure 10. Raw data values for the four participants. The x-axes denote the number of weeks; the y-axes denote the ratio of known vocabulary

Overall, the plots show a high degree of variability in the data of all the participants, as manifested in the scattered appearance of the measurement values for each level. This is particularly noticeable in free production, the highest level, and least in recognition, the lowest. This observation makes sense intuitively, since free production is the least-established level in the continuum. As such, it would be expected to show increased variability, particularly from a dynamic perspective.

4.3.1.1 Linear trends

Due to the high amount of variability in the data, it is difficult to discern the overall shape of growth for each knowledge level from the raw values. Therefore, as an initial indication of the general growth trend, linear regressions were calculated and added to the data values. Although linear regressions are rather limited in describing development,
particularly in learning, since they eliminate variability and local trends (van Geert, 1991), they show the overall shape of growth (increase vs. decrease) in relation to time. As mentioned in the previous chapter, adding linear regressions to data trajectories can indicate the general direction of development, while inspecting variability from these central trends can expose patterns that, like the trends, may not be readily apparent in the raw data. Moreover, since most vocabulary studies employ linear analyses when comparing learner groups or pre- and post-learning effects, it makes sense to include this perspective in the current study before supplementing it with further procedures.
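The idea of adding a linear trend to a weekly series can be sketched as an ordinary least-squares fit of the scores on the week number. The data below are invented for illustration, and linear_trend is a hypothetical helper name rather than anything used in the study.

```python
def linear_trend(weeks: list[float], scores: list[float]) -> tuple[float, float]:
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(weeks)
    mean_w = sum(weeks) / n
    mean_s = sum(scores) / n
    sxy = sum((w - mean_w) * (s - mean_s) for w, s in zip(weeks, scores))
    sxx = sum((w - mean_w) ** 2 for w in weeks)
    slope = sxy / sxx
    intercept = mean_s - slope * mean_w
    return slope, intercept


# Hypothetical weekly ratios for one knowledge level (not the study's data).
weeks = [1, 2, 3, 4, 5, 6, 7, 8]
recall = [0.40, 0.46, 0.43, 0.52, 0.50, 0.58, 0.55, 0.63]

slope, intercept = linear_trend(weeks, recall)
direction = "increase" if slope > 0 else "decrease or no change"
print(f"slope = {slope:.4f} per week, intercept = {intercept:.3f} ({direction})")
```

The sign of the slope gives the overall direction of growth, while the week-to-week scatter around the fitted line is exactly what the trend line leaves out.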

[Figure 11: four panels (the Portuguese, Mandarin, Vietnamese, and Indonesian speakers) showing raw values with linear trend lines; series: recognition, recall, controlled production, free production; x-axis: weeks; y-axis: ratio (0 to 1)]

Figure 11. Raw data values and linear trends for all participants

The linear trend and raw data plots reveal general similarities across the four participants, despite the predictable differences between their longitudinal and individual datasets. First, as Figure 11 shows, recall and controlled production have very similar, and often parallel, linear growth trends in all cases. These levels show the highest linear increase, manifested in the steepest slope of the regression line. Moreover, the linear trends for these levels are also very similar, to the point of near-overlap in the Portuguese speaker data, and virtual convergence in the Vietnamese
speaker data. Recall and controlled production are likely closer to each other conceptually than the other knowledge levels, since the knowledge continuum is assembled from two different paradigms, one juxtaposing recognition and recall (Laufer et al., 2004), and the other contrasting controlled and free production (Laufer & Nation, 1995). Thus the recall-controlled production pair embodies the distinction between the ability to retrieve a word form given explicit meaning and the elicited ability to retrieve it given contextual or associative meaning (Laufer & Nation, 1999).

The second general similarity is that recognition, despite its high onset values in all datasets, still increases. This increase is predictably weaker, as these onset values are uniformly near-maximal. Similarly, free production also shows a general increase in all cases. However, its increase is not parallel to that of the other knowledge levels. This relatively low increase and non-parallel growth illustrate the receptive-productive gap as it has been defined by previous studies (e.g., Laufer, 1998; Laufer & Paribakht, 1998).

Third, the linear trend plots show that in general, the hierarchy of knowledge levels remains intact throughout the study period. Thus the receptive-productive gap, whether defined as free production vs. controlled production or recognition (e.g., Laufer & Paribakht, 1998), or even when defined as recall vs. recognition (e.g., Webb, 2008), remains robust for all participants when the central trends are inspected.

However, the data variability reveals less distinctive patterns. The raw values show a more complex and individuated picture of development for each participant. In some weeks, the continuum hierarchy is not maintained. Unsurprisingly, this is most evident between controlled production and recall, which as noted have very close developmental trajectories and growth trends.
For example, in the Portuguese speaker dataset, the value of controlled production is often equal to, or even surpasses, that of recall. More exceptionally, there are several instances in which free production temporarily surpasses controlled production or even recall, for instance, weeks 15 and 23 in the Vietnamese speaker dataset. Additionally, although the linear trends of recall and controlled production are clearly distinct from those of recognition in all cases, at certain observations in each dataset, recall and recognition have very similar and even identical values. Nevertheless, the value of recall does not exceed that of recognition at any point. There are even instances in which recognition, recall, and controlled production have very close values (week 36 in the Mandarin speaker data), or even
overlap (week 30 in the Vietnamese speaker plot). The free production value at those points is also very high, although it remains lower than the values of the other knowledge levels. Conversely, there are other instances when the values of recall, controlled production and free production are quite close, while that of recognition is incongruent (week 19 in the Vietnamese speaker plot). These instances illustrate a kind of temporary general “boost” of the entire knowledge continuum, or most of its levels. They also suggest that despite the predictably low value of free production, there are strong fluctuations in this level which are tied in with the internal dynamics of the continuum as a whole. Overall, fluctuations in the data variability indicate that interactions between the knowledge continuum levels may shift throughout the study period, despite maintaining a basic hierarchical order. Before proceeding to inspect these observations further in variability analyses, the data values for each knowledge level were correlated with the number of weeks, as well as with the other knowledge levels, in each case study. Following this procedure, a multiple linear regression was calculated for the dataset that showed the most robust correlations. These procedures were intended to determine the statistical significance of the surface interactions and linear growth trends of the data and assess the ability of linear procedures to account for growth across the knowledge continuum.

4.3.2 Correlations

Since individual growth data is unlikely to be normally distributed, Spearman’s rho, a nonparametric correlation coefficient, was calculated. The results of the correlation analyses are summarized in a table for each participant, followed by an interpretation.
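For readers unfamiliar with Spearman's rho, a minimal sketch follows: both series are converted to ranks (tied values share the average rank) and Pearson's r is computed on the ranks. The example series is invented, and the helper names average_ranks and spearman_rho are hypothetical.

```python
from math import sqrt


def average_ranks(values: list[float]) -> list[float]:
    """1-based ranks; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x: list[float], y: list[float]) -> float:
    """Spearman's rho: Pearson's r computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly ratios for one knowledge level (not the study's data).
weeks = [1, 2, 3, 4, 5, 6, 7, 8]
recognition = [0.80, 0.83, 0.83, 0.87, 0.85, 0.90, 0.93, 0.90]
print(f"rho (recognition-weeks) = {spearman_rho(weeks, recognition):.3f}")
```

Because only rank order enters the calculation, the coefficient captures monotonic association without assuming normally distributed scores.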

4.3.2.1 The Portuguese speaker

Correlation                         Coefficient value
Recognition-weeks                   .628 (p<0.01)
Recall-weeks                        .849 (p<0.01)
Controlled production-weeks         .769 (p<0.01)
Free production-weeks               .347 (p<0.05)
Recognition-recall                  .599 (p<0.01)
Recall-controlled production        .658 (p<0.01)
Controlled-free production          Not significant
Recognition-controlled production   .439 (p<0.05)
Recognition-free production         Not significant
Recall-free production              Not significant

Table 3. Correlations in the Portuguese speaker data

For the Portuguese speaker, there are strong positive and significant correlations between the three lower levels and time, as represented by the number of weeks. This indicates significant growth during the study period, and corresponds with the impression obtained from the linear trend plots. The free production-weeks correlation is also positive and significant, with a medium-low value. Among adjacent knowledge levels in the continuum, the correlations are high between recognition and recall and between recall and controlled production. However, the correlation between controlled and free production is not statistically significant. Among the non-adjacent knowledge levels, the correlation between recognition and controlled production is moderate and significant, while the free production-recall and free production-recognition correlations do not reach significance. In other words, while free production increases significantly over time, like the other levels (albeit not as strongly), it is not significantly correlated with any of these levels, thereby demonstrating the receptive-productive gap.

4.3.2.2 The Mandarin speaker

Correlation                         Coefficient value
Recognition-weeks                   Not significant
Recall-weeks                        .342 (p<0.05)
Controlled production-weeks         .408 (p<0.05)
Free production-weeks               .439 (p<0.01)
Recognition-recall                  .457 (p<0.05)
Recall-controlled production        .664 (p<0.01)
Controlled-free production          Not significant
Recognition-controlled production   .336 (p<0.05)
Recognition-free production         .309 (p=0.067)
Recall-free production              .341 (p<0.05)

Table 4. Correlations in the Mandarin speaker data

In this dataset, the correlations between the number of weeks and the knowledge levels are weak to moderate for recall, controlled production, and free production. Unlike in the Portuguese speaker data, the weeks-recognition correlation is not significant. However, recognition and recall have a moderate significant correlation. Recall also has a strong correlation with controlled production. Nevertheless, despite the significant growth of both controlled and free production, as expressed in their correlations with time, the correlation between these levels is not significant. This discrepancy is similar to the one noted in the Portuguese speaker dataset. The correlations between non-adjacent levels show that recognition and controlled production are weakly yet significantly correlated. Recall also correlates moderately with free production. Recognition and free production have a weak correlation with a tendency towards significance (.309 at p=0.067).

4.3.2.3 The Vietnamese speaker

Correlation                         Coefficient value
Recognition-weeks                   .354 (p<0.05)
Recall-weeks                        .409 (p<0.05)
Controlled production-weeks         .419 (p<0.05)
Free production-weeks               Not significant
Recognition-recall                  Not significant
Recall-controlled production        .739 (p<0.01)
Controlled-free production          Not significant
Recognition-controlled production   Not significant
Recognition-free production         Not significant
Recall-free production              Not significant

Table 5. Correlations in the Vietnamese speaker data

For this participant, the correlations between the number of weeks and the three lower knowledge levels are all positive and moderate. Although the weeks-free production correlation is weak and not significant, it shows a tendency towards significance (.291 at p=0.085). Among the between-level correlations, the only significant one is between recall and controlled production, which is also very high (.739 at p<0.01). None of the other correlations, whether between adjacent or non-adjacent knowledge levels, are significant. This finding shows that the course of development across different modalities of vocabulary knowledge is highly individual, as noted by previous studies that reported robust individual differences even when group-level findings showed a general significant increase (cf. Schmitt & Meara, 1997).

4.3.2.4 The Indonesian speaker

Correlation                         Coefficient value
Recognition-weeks                   .540 (p<0.01)
Recall-weeks                        .553 (p<0.01)
Controlled production-weeks         .518 (p<0.01)
Free production-weeks               Not significant
Recognition-recall                  .469 (p<0.01)
Recall-controlled production        .769 (p<0.01)
Controlled-free production          Not significant
Recognition-controlled production   .389 (p<0.05)
Recognition-free production         Not significant
Recall-free production              Not significant

Table 6. Correlations in the Indonesian speaker data

This participant’s data shows strong to moderate and significant positive correlations between the number of weeks and the first three knowledge levels, while the free production-weeks correlation is not significant. In other words, the first three levels increase significantly over time in this dataset, whereas free production does not. Concerning the interactions between adjacent levels, the recognition-recall correlation is moderate, while the recall-controlled production correlation is high. In contrast, the controlled-free production correlation is not significant. Among the non-adjacent knowledge levels, there is also a significant weak-moderate correlation between recognition and controlled production. Free production, again, is not significantly correlated with either recall or recognition.

4.3.3 Summary of the correlation analyses

First, it should be noted that all the correlations found, whether between the knowledge levels and time (represented by the number of weeks) or between the levels themselves, were positive regardless of their statistical significance, as indicated also by the positive growth trends of the data. In the context of the receptive-productive gap, this shows that despite temporary declines in specific observations, positive growth occurs on all levels, even when it is not parallel or significant. It also shows that surface interactions between the knowledge levels are all positive, at face value. If there are complex or shifting interactions between levels, these are not overt in the linear growth trends or correlations, and need to be revealed by further analyses of the data variability.

For all participants, there were significant moderate-strong correlations between recall and controlled production, and between these levels and the number of weeks. In the data of the Vietnamese and Indonesian speakers, the free production-weeks correlation was not significant, while all other levels correlated significantly with time at similar moderate values. In the Portuguese speaker study, the only one in which all correlations between the knowledge levels and weeks were significant, the free production-time correlation was the weakest. For the Mandarin speaker, the free production-weeks correlation was surprisingly the strongest of the knowledge level-weeks correlations, although its value was close to those of recall and controlled production with weeks. In this case, the recognition-weeks correlation was not significant. Taken together, despite the differences between the participants, these findings show a robust disparity between the overall growth of free production and that of the other knowledge levels. In this context, it should perhaps be re-emphasized that the study does not focus on comparing the participants on the basis of their various backgrounds, but on corroborating the existence of the gap and inspecting its nature in several longitudinal datasets, which act as cumulative evidence.

Regarding the correlations between the knowledge levels, the controlled production-recall correlation was significant in all cases, including that of the Vietnamese speaker, whose data showed no other significant between-level correlations. It was also the strongest between-level correlation in all cases. The recognition-recall and recognition-controlled production correlations were significant for three participants, the exception being the Vietnamese speaker. The controlled-free production and recognition-free production correlations were not significant in any dataset, while recall and free production correlated significantly only in the Mandarin speaker data. This again shows the disparity between free production and other knowledge levels, including controlled production.

In the Portuguese speaker data, all levels correlated significantly with time. This dataset also showed the highest coefficient values in comparison with the other participants. Yet, even in this case, neither the free-controlled production correlation nor the correlations between free production and recall or recognition were significant. This finding is particularly relevant, as it epitomizes the receptive-productive gap. Additionally, the Portuguese speaker data also exhibited stronger correlations between adjacent knowledge levels than between non-adjacent knowledge levels, reinforcing the notion of the knowledge continuum as a hierarchy, and strengthening the indication of precursor interactions across it. For these reasons, this case study was selected for further analyses.

4.3.4 Linear regression analysis

Even though all the knowledge levels in the Portuguese speaker data, including free production, increased significantly during the study period, none of the correlations between free production and the other knowledge levels reached significance. This finding embodies the receptive-productive gap. The positive and significant correlations between time and the knowledge levels in this dataset indicate significant growth over time, but do not express to what extent time acts as a predictor of this growth. Similarly, the significant correlations between the knowledge levels indicate the existence and strength of interaction, but do not indicate to what extent the growth
of each level can be predicted on the basis of its interaction with the other levels. Moreover, it is impossible to separate the predictive capacity of time from that of the inter-level interactions. To address these issues, a multiple linear regression was performed on this dataset.

[Figure 12: single plot of raw values with linear trend lines; series: recognition, recall, controlled production, free production; x-axis: weeks; y-axis: ratio (0 to 1)]

Figure 12. Data and linear trends for the Portuguese speaker case study

Figure 12 contains the data and linear trends for the Portuguese speaker case study. Although traditionally applied to population data that consists of independent measurements, linear regression is employed in this case to predict the growth of one variable from other co-developing variables. For this purpose, the measurements were treated as independent from each other. Since the main focus of this study is the receptive-productive gap, the multiple regression procedure specified free production as the dependent variable, while the number of weeks and the values of the other knowledge levels were defined as its predictors.

The regression was initially run with the enter method on two models. In the first, growth in free production is predicted only by time; in the second, it is predicted by the combination of time and the three lower knowledge levels. In addition, a backward removal method was used on a model predicting growth in free production on the basis of the number of weeks and the three knowledge levels. This procedure begins by incorporating all four variables (the number of weeks and the three lower knowledge levels) as parameters in a multiple linear regression formula, with an F probability of .05 as the criterion for entry. It then removes these parameters in sequence, with an F probability of .1 as the criterion for removal (both are the standard values used in stepwise backward elimination; Norusis, 2008).

The outcomes of the two models indicated that no linear interaction with a single variable or combination of variables can explain the growth in free production at a significant level of probability. Concerning the explanatory power of the three other knowledge levels, this is not surprising, since their correlations with free production were not statistically significant. Yet even for the number of weeks, which correlated positively and significantly with free production, the linear regression only showed a trend towards significance at F(1,34)=3.562 (p=0.068). Moreover, combining one or more of the other knowledge levels with time did not strengthen this trend, but rather slightly weakened it. It can therefore be concluded that, for this particular case study, growth in free production cannot be sufficiently explained by the linear effect of time alone, by the linear effect of one or more of the other knowledge levels, or by the combination of one or more of these levels with time.

The regression results indicate that in order to reveal factors that shape the development of the knowledge continuum, it may be necessary to inspect variability from the linear trends, and examine how changes in its patterns across the different knowledge levels are related. DST emphasizes variability as inherent to developmental processes. Therefore, from a dynamic perspective, variability analyses are considered an essential complement to central trend analyses (van Geert & van Dijk, 2002; Verspoor et al., 2008a).
The use of variability analyses does not imply that linear analyses are not useful in describing and visualizing general growth trends, but serves to illuminate micro-level, temporal interactions between data variables that may underlie overall growth.
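To make the reported statistic more concrete, the sketch below shows how an F value with (1, n-2) degrees of freedom arises when a single predictor (time) is regressed on n weekly observations; with 36 weeks this gives the (1, 34) degrees of freedom reported above. The data are invented, simple_regression_f is a hypothetical helper, and this is not a reconstruction of the statistical package output used in the study. The p-value is omitted because it requires the F distribution, which the Python standard library does not provide.

```python
def simple_regression_f(x: list[float], y: list[float]) -> tuple[float, int, int]:
    """F statistic and degrees of freedom for a one-predictor linear regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    syy = sum((b - mean_y) ** 2 for b in y)
    slope = sxy / sxx
    ss_regression = slope * sxy          # variance explained by the predictor
    ss_residual = syy - ss_regression    # variance left unexplained
    df_model, df_residual = 1, n - 2
    f_stat = (ss_regression / df_model) / (ss_residual / df_residual)
    return f_stat, df_model, df_residual


# Hypothetical 36-week free production series: slight growth plus fluctuation.
weeks = list(range(1, 37))
free_production = [0.30 + 0.003 * w + 0.04 * ((w * 7) % 5 - 2) for w in weeks]

f_stat, df_model, df_residual = simple_regression_f(weeks, free_production)
print(f"F({df_model},{df_residual}) = {f_stat:.3f}")
```

When fluctuations are large relative to the slope, as in this invented series, the residual sum of squares dominates and the F value stays small, which is the situation the regression results describe for free production.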

4.3.5 De-trended values: residuals

To investigate how interactions between the knowledge levels change as a function of time, the linear trends of the Portuguese speaker dataset were subtracted from the data to calculate its residuals, or “de-trended” data values, which were then plotted as a time series (see section 3.3.2 for an explanation of this technique).
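De-trending as described here amounts to fitting a straight line to each series and keeping what is left over. A minimal sketch, assuming an ordinary least-squares line and using invented data and the hypothetical helper name detrend:

```python
def detrend(weeks: list[float], scores: list[float]) -> list[float]:
    """Residuals of the scores after subtracting their least-squares linear trend."""
    n = len(weeks)
    mean_w = sum(weeks) / n
    mean_s = sum(scores) / n
    sxy = sum((w - mean_w) * (s - mean_s) for w, s in zip(weeks, scores))
    sxx = sum((w - mean_w) ** 2 for w in weeks)
    slope = sxy / sxx
    intercept = mean_s - slope * mean_w
    # residual = observed value minus the value predicted by the trend line
    return [s - (intercept + slope * w) for w, s in zip(weeks, scores)]


# Hypothetical weekly ratios for two adjacent levels (not the study's data).
weeks = [1, 2, 3, 4, 5, 6, 7, 8]
recognition = [0.80, 0.84, 0.82, 0.88, 0.85, 0.91, 0.92, 0.90]
recall = [0.50, 0.53, 0.57, 0.55, 0.62, 0.60, 0.66, 0.70]

for week, res_a, res_b in zip(weeks, detrend(weeks, recognition), detrend(weeks, recall)):
    pattern = "parallel" if res_a * res_b > 0 else "inverse or neutral"
    print(f"week {week}: {res_a:+.3f} {res_b:+.3f} {pattern}")
```

Labeling a week "parallel" when both residuals lie on the same side of their trend lines is only a crude stand-in for the visual comparison of paired residual plots.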

In Figure 13, the residuals of recognition and recall exhibit mainly parallel variability patterns, and few inverse ones, in which an increase above the linear trend in one measure is accompanied by a decrease in the other (e.g., in weeks 10-14). Recall and controlled production show both parallel patterns, such as in weeks 1-13 and 30-33, and inverse patterns. In the bottom plot, controlled and free production show, for the most part, inverse patterns, particularly in the second half of the study.

[Figure 13: three stacked panels of paired residuals (top: recognition and recall; middle: recall and controlled production; bottom: controlled production and free production); x-axis: weeks; y-axis: residual value]

Figure 13. Paired residuals for adjacent levels on the vocabulary knowledge continuum. The x-axes denote the number of weeks; the y-axes denote the value of the residuals

In line with findings from other areas of L1 and L2 development (cf. van Geert, 1991; Verspoor et al., 2008b), these residual patterns can be interpreted as
indicating a predominantly supportive interaction between recognition and recall, a weak-moderate competitive interaction between recall and controlled production, and a stronger competition between controlled and free production. However, at this point, this interpretation is merely speculative, and should be corroborated by further procedures. Although it is possible to gauge interactions between the knowledge levels by visually inspecting and comparing their residuals, inspecting changes in correlations between pairs of knowledge levels as a function of time provides a clearer representation of these interactions. Moving correlation plots (as presented in section 3.3.3), can reveal temporal patterns in the interactions implied in the data variability. Furthermore, they enable the comparison of such interactions between different combinations of paired knowledge levels.

4.3.6 Moving correlations

The moving window technique was used to plot between-level correlations across partially overlapping periods of five consecutive weeks. The plots serve to expose the patterns that underlie the previously described correlations between the knowledge levels, which may be obscured by a single coefficient value. The moving correlations showed that the interactions between the paired adjacent knowledge levels fluctuate in all cases (see the top left plot in Figure 14). For the most part, these fluctuations appear to be cyclical. The cycle states range from strong competition between the paired levels (with a coefficient value close to -1) to strong support (close to 1). Similar patterns have previously been interpreted as indicating complex precursor interactions (Verspoor et al., 2008b).

More specifically, the recognition-recall plot (top right plot in Figure 14) shows an initial strong shift from a high positive to a high negative value. From the approximate halfway point of the study period, these fluctuations are less pronounced, culminating in a high positive value. In the recall-controlled production plot (bottom left), the moving correlation shifts from an initial medium-high negative value to a medium-high positive value, fluctuates between low positive and medium-high negative values throughout most of the subsequent period, and rises to a high positive value, from which it drops to a negative value in the last weeks. This pattern is remarkable, because at the macro-level the interaction between recall and controlled production appears purely supportive, as expressed in both their near-identical linear trends and their strong and positive correlation, whereas the paired residuals of these indexes (in Figure 13) suggest some degree of competition. This disparity may indicate a precursor relationship, in which the development of recall precedes, and possibly somewhat delays, development in controlled production. The controlled-free production correlation (bottom right) starts off as negative, rises to a high positive value, and drops to a weak-neutral value, from which it drops further to a strong negative value until after the midpoint of the study. From there on, this coefficient rises sharply to a high positive value, and then gradually declines towards a weak negative correlation in the final weeks.
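The moving window technique can be sketched as follows: Spearman's rho is computed for weeks 1-5, then weeks 2-6, and so on, giving 32 coefficients for a 36-week series. The two series below are invented, and the helper names are hypothetical; windows in which one series is constant have no defined correlation and are returned as NaN, which is a choice made for this sketch only.

```python
import math


def average_ranks(values: list[float]) -> list[float]:
    """1-based ranks; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x: list[float], y: list[float]) -> float:
    """Pearson's r on ranks; NaN if either series has no variance."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return math.nan
    return cov / math.sqrt(var_x * var_y)


def moving_correlation(x: list[float], y: list[float], window: int = 5) -> list[float]:
    """Spearman's rho over each run of `window` consecutive observations."""
    return [
        spearman_rho(x[start:start + window], y[start:start + window])
        for start in range(len(x) - window + 1)
    ]


# Hypothetical 36-week series for two adjacent levels (not the study's data).
weeks = range(1, 37)
recall = [0.50 + 0.010 * w + 0.05 * math.sin(w) for w in weeks]
controlled = [0.40 + 0.010 * w + 0.05 * math.cos(w) for w in weeks]

coefficients = moving_correlation(recall, controlled, window=5)
print(len(coefficients), "windows; first window rho =", round(coefficients[0], 3))
```

Plotting the resulting list against the window number yields the kind of curve described above, in which stretches near 1 read as support and stretches near -1 as competition.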

[Figure 14: four panels of moving correlations (top left: all three pairs together; top right: recognition-recall; bottom left: recall-controlled production; bottom right: controlled production-free production); x-axis: moving window number; y-axis: Spearman's rho (-1 to 1)]

Figure 14. Moving correlations between pairs of adjacent knowledge levels in the Portuguese speaker data. The x-axes denote the number of the moving window (1 signifies the first window and thus weeks 1-5, 2 signifies weeks 2-6, and so forth); the y-axes denote a Spearman’s rho coefficient value

Like the paired residual plots in the previous section, these findings can be seen as suggesting an initially supportive interaction between recognition and recall, which becomes competitive as learning progresses, but then recovers and tends to fluctuate relatively little in comparison with the recall-controlled production and controlled-free production interactions. The recall-controlled production interaction can be seen as moderately competitive, fluctuating between moderate positive and negative values and culminating in a positive value. In the controlled-free production interaction, fluctuations between strong positive, strong negative, and again strong positive values indicate a more robust competition, particularly during the first half of the study, after which a shift towards a strong-moderate supportive interaction occurs, which ultimately declines to weak positive and neutral values and a final weak negative value.

Next, each moving correlation was plotted together with another moving correlation. This procedure makes it possible to compare shifts in one interaction between coupled knowledge levels with another simultaneous interaction.

[Figure 15 plots. Series: Recognition-Recall; Recall-Controlled production; Controlled production-Free production, plotted together and in pairs. x-axis: moving window number; y-axis: correlation coefficient, -1 to 1.]

Figure 15. Moving correlations between adjacent knowledge levels in the Portuguese speaker data, plotted together (top left), and in pairs

Figure 15 (above) shows that in general, shifts in a given interaction between paired knowledge levels appear to be accompanied by changes in another. In other words, whenever a moving correlation between a pair of levels shifts from a negative to a positive value, or vice versa, the opposite tends to occur in another correlation. However, there are exceptions, such as weeks 8-13 in the recognition-recall and controlled-free production plot (bottom left). The pairs of moving correlations show largely incongruent patterns – almost every alternation in each correlation is accompanied by an inverse change in another correlation. This finding is salient across all combinations of correlations. Thus the paired moving correlation plots strengthen the impression of complex interactions derived from the previous variability analyses.

4.3.7 Summary of the variability analyses

So far, the analyses revealed that the discrepancy between the lower knowledge levels and free production remained robust for all participants, despite gains in all knowledge levels over the study period. The variability analyses, focusing on the Portuguese speaker data, showed that variability is more pronounced in the higher and less-established knowledge levels, especially in free production. The residual and moving correlation plots strengthened the impression of complex interactions between the knowledge levels, which entail cycles of competition and support. If L2 vocabulary knowledge is considered from a dynamic perspective, it may be perceived as a hierarchy of precursors and dependents in which overall development is determined by interactions between co-developing levels. These interactions in turn result from limitations on resources available for systemic growth. In line with the data and variability analyses, the notion of the receptive-productive continuum as a hierarchy and the background literature, these interactions were interpreted as support between recognition and recall, weak-moderate competition between recall and controlled production, and strong competition between controlled and free production. At this point, this interpretation is tentative, and requires corroboration by a model of vocabulary knowledge as a system of connected growers in a hierarchy of precursors and dependents.

4.3.8 A precursor model of vocabulary knowledge development

In the initial stage of the modeling procedure, a general model of precursor interactions was used to simulate the Portuguese speaker data. The model was programmed in Excel VBA code by Paul van Geert, on the basis of a generic connected growth model devised by Fischer (1980, as cited in van Geert, 1994, 2003). The steps involved in the model configuration corresponded with the general procedures detailed in section 3.2.3. First, the order parameters were specified: the number and name of the growers, in line with the levels in the vocabulary knowledge continuum. Next, the relational control parameters were configured. The first relational control parameter is support, specifying the degree of support from each grower to another. The second is competition, specifying the degree to which the same grower competes with the other. The third parameter, precursor, determines a threshold value which is conditional for support from the first grower to its counterpart. In other words, the precursor parameter renders the grower that it refers to a precursor to the other grower, which is thereby defined as dependent, by specifying the value that the precursor needs to attain as a condition for support. For the first two relational parameters, support and competition, it is necessary to specify whether they are based on change in the value of the grower (by change) or on its current value (by level), as explained in section 3.2.3. Equation 8 describes the model version used in the current study. It depicts a precursor interaction between two growers, with conditional and unidirectional support (precursor to dependent) and unidirectional competition (dependent to precursor), both by change.

\[ A_{n+1} = A_n \left( 1 + r_A - \frac{r_A A_n}{K_A} - c_A (B_n - B_{n-1}) \right) \]

\[ B_{n+1} = B_n \left( 1 + r_B - \frac{r_B B_n}{K_B} + s_B (A_n - A_{n-1}) \, p_B \right) \]

Equation 8. Connected growers in a precursor interaction with unidirectional support and competition (both by change)

In the model, these equations were expanded to a set of four growers, in correspondence with the vocabulary knowledge levels. The equations were then iterated, generating four growth trajectories. As Equation 8 shows, in each iteration the value of each grower is the outcome of a logistic function based on its previous value and its interactions with its precursor and dependent growers, to which it acts as precursor (predictably, the first grower in the hierarchy has no precursor, while the last has no dependent; see also footnote 13). Within this general hierarchy, the parameter values were configured on the basis of the data analyses results in the following manner:
• a medium value of recognition is a conditional precursor to recall
• a medium value of recall is a conditional precursor to controlled production
• a high value of controlled production is a conditional precursor to free production

Once the respective thresholds are crossed, these interactions ensue:
• moderate support from recognition to recall
• weak support from recall to controlled production
• weak support from controlled production to free production

13 For an elaborate explanation of these equations, see for example van Geert, 1995.

Additionally, the unidirectional competition included in the model refers only to a weak competition from free production to controlled production. However, it is
likely that the interactions in the continuum are far more complex, and include also bidirectional support. For example, increased free or controlled production may contribute to the lower knowledge levels, as posited by the Output Hypothesis (Swain, 1985, 1995; see also de Bot, 1996). Nevertheless, the study opted to examine the fit of the model on the basis of its most general premise, namely the interactions between adjacent levels in the hierarchy. The competition from free to controlled production was added in line with the assumption that the highest level in the hierarchy would demand the highest amount of resources for its growth. However, this interaction was omitted in the subsequent optimization procedure (see the following section). All of these interactions are by change , meaning that their value depends on the growth, or change, in the grower that generates them between the current and previous time points. These interactions are also conditional on the precursor value. It is plausible that by level interactions, whose strength is determined by the current value of the grower that generates them, also occur within the knowledge continuum. However, such interactions were not configured at this stage, in order to first test the general applicability of precursor interactions based on local changes, that is on the process of change itself rather than its outcome (i.e., the current value). The model also included the property control parameters of initial growth rate, initial value, developmental delay, level of general support, and amount of random variation for each grower. Except for the initial values, specified as the data onset values, the other parameters were randomly configured as uniform values for all knowledge levels, in order to focus on the effect of the key interactions in the model. 
Since all knowledge levels were calculated as ratios, the general support level was set as equivalent to the carrying capacity of each grower (its highest attainable value given optimal resource allocation) at a uniform value of 1, the maximal ratio. For each nominal value configured in the model, the model program assigned a numerical value between 0 and 1 (for example, 0.8 to a high and 0.6 to a moderate value). As mentioned in section 3.2.3, these values are relative rather than absolute. The support value, for instance, is not the actual support, but rather relative to other values in the equation. The equations were then iterated 300 times, and their outcomes were plotted.
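The configuration described above can be approximated with a short Python sketch that iterates Equation 8 over a chain of four growers. This is an illustration under stated assumptions and not the original Excel VBA program: the function name, the initial values, the uniform growth rate, and the value 0.2 for "weak" interactions are hypothetical, whereas 0.6 (moderate) and 0.8 (high) follow the examples given in the text. Random variation is left out so that the output is reproducible, and the precursor condition is assumed to be met once the precursor is at or above its threshold.

```python
def simulate_precursor_chain(initial, rates, support, competition, thresholds,
                             capacity=1.0, steps=300):
    """Iterate Equation 8 for growers ordered from precursor to dependent.

    support[i]     : by-change support from grower i to grower i + 1
    competition[i] : by-change competition from grower i + 1 to grower i
    thresholds[i]  : value grower i must attain before it supports grower i + 1
    """
    n = len(initial)
    prev = list(initial)   # values at iteration n - 1
    curr = list(initial)   # values at iteration n
    history = [list(curr)]
    for _ in range(steps):
        nxt = []
        for i in range(n):
            # Logistic part: 1 + r - r * value / K.
            growth = 1 + rates[i] - rates[i] * curr[i] / capacity
            if i + 1 < n:
                # Competition generated by change in the dependent grower.
                growth -= competition[i] * (curr[i + 1] - prev[i + 1])
            if i > 0:
                # Support by change, conditional on the precursor threshold.
                p = 1 if curr[i - 1] >= thresholds[i - 1] else 0
                growth += support[i - 1] * (curr[i - 1] - prev[i - 1]) * p
            nxt.append(curr[i] * growth)
        prev, curr = curr, nxt
        history.append(list(curr))
    return history


# Order: recognition, recall, controlled production, free production.
history = simulate_precursor_chain(
    initial=[0.85, 0.25, 0.25, 0.15],   # hypothetical onset values
    rates=[0.05, 0.05, 0.05, 0.05],     # hypothetical uniform growth rate
    support=[0.6, 0.2, 0.2],            # moderate, weak, weak
    competition=[0.0, 0.0, 0.2],        # weak, free -> controlled production
    thresholds=[0.6, 0.6, 0.8],         # medium, medium, high
)
print([round(value, 3) for value in history[-1]])
```

Because all interactions in this version are by change, they vanish once the growers stop changing, so the long-run values are governed by the logistic terms alone.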

[Figure 16 plot. Series: Recognition; Recall; Controlled production; Free production. x-axis: Weeks (iterations); y-axis: Ratio of known vocabulary, 0 to 1.]

Figure 16. A complex growth model of vocabulary knowledge development for the Portuguese speaker case study

Due to the random variation configured in the model, its values are dispersed around central growth trends, with each recalculation generating slightly different outcomes. The model can be evaluated on the basis of two criteria: the general shape of growth over time and the interactions between knowledge levels. An adequate model should be able to replicate both of these data dimensions. Thus, validating the model does not rely only on comparing or correlating its outcome values with their equivalents in the data, but also on comparing the internal interactions within the model with their counterparts in the data. As Figure 16 shows, the model outcome is visually similar to the range of data values and their linear growth trends (as presented in sections 4.3.1 and 4.3.1.1). To confirm this impression, the model values were correlated with the data, treating the difference between the number of data measurements and the number of model iterations as missing values, and thus effectively correlating only the outcomes of the first 36 iterations with the data. Additionally, the internal correlations within the model were compared to those of the data. Table 7 summarizes these correlations. In this table, the ‘data-model’ column refers to the correlation between the first knowledge level in the pair, derived from the data, and the second level in the pair, which is derived from the model. The ‘model-data’ column refers to the same procedure in reverse: the correlation of a level derived from the model and its counterpart in the data.

CORRELATION                                   MODEL    DATA     DATA-MODEL        MODEL-DATA
Free production-Weeks                         .352*    .347*    _                 _
Controlled production-Weeks                   -.177    .769**   _                 _
Recall-Weeks                                  .432**   .849**   _                 _
Recognition-Weeks                             .656**   .628**   _                 _
Free production-Controlled production         .025     .188     -.128             .324 (p=0.054)
Controlled production-Recall                  -.160    .658**   .485**            -.111
Recall-Recognition                            .334*    .599**   .249              .283
Free production-Recall                        .261     .233     .258              .296
Free production-Recognition                   .170     .210     .279              .119
Controlled production-Recognition             -.110    .439*    .196              -.029
Free production-Free production               _        _        .299 (p=0.077)    _
Controlled production-Controlled production   _        _        -.030             _
Recall-Recall                                 _        _        .351*             _
Recognition-Recognition                       _        _        .330*             _

Table 7. The two left-hand columns: coefficients of correlations between levels and time, and between levels, within the model and the data, respectively; the two right-hand columns: coefficients for the correlations between the levels of the data and those of the model, and vice versa, respectively (* = p<0.05; ** = p<0.01)

With regard to the patterns of internal interaction, there are significant positive correlations (see the leftmost column) between three knowledge levels in the model, including free production, and time. These correlations are similar to those found between the data knowledge levels and time, as presented in section 4.3.2.1 and again in the second column from the left in Table 7. Moreover, despite the significant correlation of free production with time in both the data and the model, the correlations between the three lower levels and free production in the model are weak and not statistically significant, just as they are in the data. This indicates a replication of the nonlinear interactions of free production with the lower knowledge levels, as expressed in the receptive-productive gap. While general similarities between the model and the data are evident in their similar growth patterns, linear trends, and correlations, not all aspects of the data are replicated by the model. For instance, the correlation between the values of controlled production in the model and in the data is not significant, and neither is the correlation
between the respective values of free production (in the second right-hand column). Moreover, the correlation between controlled production and time in the model is not significant, unlike its equivalent in the data. Therefore, the subsequent step in the modeling procedures was an optimization of the control parameter values. The optimized model output was then compared with the data, and its parameter values were checked for their compatibility with the hypothesized interactions in the vocabulary knowledge continuum, as described in sections 4.3.3 and 4.3.7.

4.3.8.1 Model optimization

In order to optimize the model, two measures were taken. First, the model was simplified, so that it included only unidirectional or bottom-up interactions (precursor to dependent), eliminating the bidirectional interactions. In the current version, the latter interactions included only the competition from free production towards controlled production. Since controlled production was the level that was replicated least accurately by the model, this interaction was eliminated. Focusing only on unidirectional precursor-dependent interactions was intended to facilitate the comparison of the optimized values with the hypotheses that informed them, as based on the analyses, for instance on the interpretations of the moving correlations. These hypotheses refer to interactions, but not to their direction. While it is likely that top-down interactions exist in the data as well, the receptive-productive gap is usually referred to in terms of (a lack of) transfer from lower-receptive to higher-productive knowledge levels, rather than the opposite. Therefore, the revised model concentrated on this operational definition. Thus, in the optimized model version, there are three key interactions between the three combinations of precursor-dependent dyads, as specified by the relational control parameters. For these interactions, the support and competition parameters have been aggregated as neutral damping factors on growth, both by change and by level, as summarized in Table 1 (section 3.4.6). The revised equation refers to this aggregated parameter (the equivalent of d in Table 1) as s for support, simply because it is added to the growth rate. However, since this “support” parameter can take on a negative value, it can also express competition. 
The by level interactions were added to the model on the premise that emerging skills can support each other by the value that they attain, while competing locally through their growth processes, as shown in van Geert’s model of vocabulary acquisition and early reading skills (2003).

Additionally, the developmental delay and variation parameters were omitted (variation, which is by definition a random factor in a model, cannot in any case be optimized). The initial values remained identical to the data onset values.

\[ A_{n+1} = A_n \left( 1 + r_A - \frac{r_A A_n}{K_A} \right) \]

\[ B_{n+1} = B_n \left( 1 + r_B - \frac{r_B B_n}{K_B} + s_B A_n + s_B (A_n - A_{n-1}) \, p_B \right) \]

Equation 9. Revised model for optimization: connected growers in a precursor interaction with unidirectional “support” by change and “support” by level

The next step was assigning default values to all remaining control parameters, namely the precursor value (the conditional threshold determining the value of p as 0 or 1), the growth rates and the relational control parameters specifying the between-level interactions. These values were then optimized via the Simplex algorithm (Nelder & Mead, 1965). This optimization routine is a method of finding a local solution to a problem involving several parameters. It is based on minimizing the outcome of a function, which in this case is the sum of squared differences between the model and the data. The algorithm generates positions of trial matrices, derived from configuring different values in the function parameters, and extrapolates each outcome of the re-configured function at points arranged as a simplex (a shape with N+1 vertices in N dimensions, for example a triangle, which has 3 vertices in 2 dimensions). The exact shape of the projected simplex is determined by the number of optimized parameters (Mathews & Fink, 2004). The matrix of squared differences is recalculated up to 10,000 times until an optimal solution is reached, with a tolerance level of 0.00001 (meaning that minimizing the function by this or any larger number is the criterion for improvement; see footnote 14).

By optimizing the control parameter values, it was possible to assess the overall fit of the model to the data. This was done, first, by inspecting the sum of squared residuals between the model and the data (rendered sum of least squares via the optimization procedure). Second, the model outcome was visually compared with the data. Since the optimization excludes variation, the outcome was compared not only to the highly varied data, but also to its linear trends and cubic spline plots, which combine local trend and smoothed variability (as demonstrated in section 3.3.4). Finally, the goodness-of-fit was also judged by comparing the linear regression equations of the model with those of the data. Although linear regressions eliminate temporal or local changes in data variability, they express the receptive-productive gap as it has been identified by previous studies, and are therefore used in the current study as an additional means of model fit evaluation. As mentioned, the optimization can potentially reinforce the interpretations of the data and variability analyses as indicating precursor interactions within the knowledge continuum, while confirming the notion of the receptive-productive gap as emerging from these interactions. Thus, if the model achieves a good fit while retaining the hypothesized hierarchical order of precursors and dependents and their relational control parameter values, the two hypotheses would be upheld.

14 The simplex optimization procedure is available through the Solver AddIn for MSExcel, or through the program Poptools (www.cse.csiro.au/poptools/).
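The optimization step can be sketched in Python as follows. This is a basic textbook variant of the Nelder-Mead simplex routine (reflection, expansion, inside contraction, shrink), not the Solver or Poptools implementation used in the study. For brevity, the fitted model is a single logistic grower (the first line of Equation 9) and the target series is synthetic, so every name and number in the example is hypothetical. The stopping rule, which compares the best and worst function values against the tolerance, is likewise an assumption.

```python
def nelder_mead(f, start, step=0.1, tol=1e-5, max_iter=10_000):
    """Minimise f over a list of parameters with a basic Nelder-Mead simplex."""
    n = len(start)
    # Initial simplex: the start point plus one point offset along each axis.
    simplex = [list(start)]
    for i in range(n):
        point = list(start)
        point[i] += step
        simplex.append(point)
    scores = [f(p) for p in simplex]

    for _ in range(max_iter):
        # Order the N + 1 vertices from best (lowest f) to worst.
        order = sorted(range(n + 1), key=lambda i: scores[i])
        simplex = [simplex[i] for i in order]
        scores = [scores[i] for i in order]
        if scores[-1] - scores[0] < tol:
            break
        # Centroid of all vertices except the worst one.
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]
        worst = simplex[-1]

        def along(t):
            # Point on the line through the centroid and the worst vertex.
            return [c + t * (w - c) for c, w in zip(centroid, worst)]

        reflected = along(-1.0)
        f_ref = f(reflected)
        if f_ref < scores[0]:
            expanded = along(-2.0)
            f_exp = f(expanded)
            if f_exp < f_ref:
                simplex[-1], scores[-1] = expanded, f_exp
            else:
                simplex[-1], scores[-1] = reflected, f_ref
        elif f_ref < scores[-2]:
            simplex[-1], scores[-1] = reflected, f_ref
        else:
            contracted = along(0.5)
            f_con = f(contracted)
            if f_con < scores[-1]:
                simplex[-1], scores[-1] = contracted, f_con
            else:
                # Shrink every vertex halfway towards the best one.
                best = simplex[0]
                simplex = [best] + [[b + 0.5 * (x - b) for b, x in zip(best, p)]
                                    for p in simplex[1:]]
                scores = [scores[0]] + [f(p) for p in simplex[1:]]
    best_index = min(range(n + 1), key=lambda i: scores[i])
    return simplex[best_index], scores[best_index]


def logistic_series(rate, start, steps, capacity=1.0):
    """Single grower: x[n+1] = x[n] * (1 + r - r * x[n] / K)."""
    values = [start]
    for _ in range(steps - 1):
        x = values[-1]
        values.append(x * (1 + rate - rate * x / capacity))
    return values


def sum_of_squares(model, data):
    """Sum of squared differences between model and data values."""
    return sum((m - d) ** 2 for m, d in zip(model, data))


# Synthetic target: 36 weekly values generated with a known growth rate.
target = logistic_series(0.05, 0.2, 36)


def objective(params):
    return sum_of_squares(logistic_series(params[0], 0.2, 36), target)


fitted, fitted_sse = nelder_mead(objective, [0.01], step=0.01)
print(fitted, fitted_sse)
```

In the full model the parameter list would hold the growth rates, the by-change and by-level interaction values and the precursor thresholds, and the objective would sum the squared differences over all four knowledge levels.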

4.3.9 Model outcome

4.3.9.1 The Portuguese speaker

Figure 17 shows the optimized model outcome for the Portuguese speaker case study, together with the data values, their linear trends and cubic spline interpolations.

[Figure 17 plots. Panels: Data; Linear trends; Spline; Model. Series: Recognition; Recall; Controlled production; Free production. y-axis: 0 to 1.]

Figure 17. Data, linear trends, spline and optimized model (fit: 0.546191) for the Portuguese speaker case study

Despite including only unidirectional bottom-up interactions between consecutive knowledge levels, the optimized model shows a great degree of similarity to the data, particularly to its linear trends. This is manifested not only in the visual similarity of the model and data trends, but also in the fact that the fit value (sum of squared differences) between the model and the data is quite low, at a value of 0.546191. The fit can also be visualized by plotting each level in the model and the data separately. Since the optimized model cannot incorporate variability, it does not fully resemble the data or its spline (which includes only smoothed variability). Rather, the model generates an outcome similar to a combination of the spline and linear trends.

[Figure 18 plots. Panels: Recognition and Recognition model; Recall and Recall model; Controlled production and Controlled production model; Free production and Free production model. y-axis: 0 to 1.]

Figure 18. Data and optimized model levels for the Portuguese speaker case study

Moreover, the linear trends of the data and the model are quite similar, as can be seen in Table 8. Because the present study focuses explicitly on the development of free production in relation to the other continuum levels (i.e., the gap), the ability of the model to replicate the linear growth trend of this data level on the basis of the configured interactions is of particular relevance.

Level                   Data                    Model
Recognition             y = 0.0026x + 0.8805    y = 0.004x + 0.85
Recall                  y = 0.0136x + 0.229     y = 0.0128x + 0.2477
Controlled production   y = 0.0098x + 0.275     y = 0.0106x + 0.2672
Free production         y = 0.0035x + 0.1656    y = 0.004x + 0.1591

Table 8. Linear regression equations for the data and model of the Portuguese speaker case study (x=number of weeks lapsed; y=current ratio of known words)
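Regression equations of the form y = slope * x + intercept, such as those in the table above, can be obtained with an ordinary least-squares fit. The following Python sketch shows the calculation; the function name and the sample series are hypothetical and do not reproduce the data of the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-products of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical, noise-free ratios over 36 weeks (not study data).
weeks = list(range(1, 37))
ratios = [0.16 + 0.004 * week for week in weeks]
slope, intercept = fit_line(weeks, ratios)
print(f"y = {slope:.4f}x + {intercept:.4f}")
```

Applying the same function to a data series and to the corresponding model series gives two equations whose slopes and intercepts can be compared directly.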

Finally, a linear regression analysis shows that the free production values generated by the model are significant predictors of the free production values in the data, at F(1,34)=4.197 (p<0.05), confirming that the sum of squares value indeed reflects a very good fit. This procedure shows that the model acts as a better predictor of growth in free production than time alone, which only showed a tendency for significance when linearly regressed with the data in section 4.3.4, despite the significant correlation between free production and the number of weeks. The optimized parameter values, presented in Table 9, verify the basic hypothesis of a hierarchy of precursors and dependents. Within this continuum, support from precursors to dependents in connected grower pairs is instigated when the precursor reaches a threshold value. The optimized values show that the configuration of the model that yields an optimal fit consists of a negative value of the by change interactions and a positive value of the by level interactions from recall to controlled production and from controlled production to free production. This means that there are complex interactions, comprised of simultaneous competition by change and support by level, from these precursors to their dependents, i.e., from recall and controlled production, the more-receptive levels, to controlled production and free production, the higher and more-productive levels, respectively. In contrast, the interactions between recognition and recall are denoted by positive values, implying that recognition does not compete with recall.

Parameter                                           Value
Growth rate recognition                             0.057057
Growth rate recall                                  0.051112
Growth rate controlled production                   0.04063
Growth rate free production                         0.021853
Recognition to recall, by change                    0.000484
Recall to controlled production, by change          -0.00156
Controlled to free production, by change            -0.00279
Recognition to recall, by level                     0.152386
Recall to controlled production, by level           0.166983
Controlled to free production, by level             0.137332
Precursor value, recognition to recall              0.125696
Precursor value, recall to controlled production    0.055095
Precursor value, controlled to free production      0.141572

Table 9. Optimized parameters for the Portuguese speaker model

The optimized parameter values show how the data growth trends can be simulated as the result of precursor interactions incorporating simultaneous support and competition between adjacent growers. These interactions are generated by the less-productive (and more-established) levels towards the more-productive levels in the knowledge continuum. However, the lowest and most-established level, recognition, does not compete with recall. Both recall and controlled production compete by change with their respective dependents, controlled production and free production, even while in each of these precursor-dependent duos, the precursor supports the dependent by level. This simultaneous interaction is akin to van Geert’s (2003) precursor model of vocabulary and reading skills acquisition. In this model, emerging reading skills support vocabulary acquisition by level, since a higher level of reading enables the acquisition of new words. On the other hand, reading also initially competes with vocabulary acquisition by change, since in the process of learning to read, a higher pace of change (increased improvement) consumes more resources, thereby curbing the acquisition of other types of knowledge. Thus, competition and support interactions offset each other. It should be mentioned that in van Geert’s model, vocabulary knowledge is a conditional precursor to reading, whereas in the current model, the interactions are generated from the precursor to the dependent and not vice versa. However, the model serves to illustrate the by change/level distinction and the notion of simultaneous inverse interactions from one grower to another. Concerning the apparent lack of competition from recognition to recall, it may be hypothesized that at the L2 proficiency level of the participants in the current study, recognition is relatively established (as seen in its near-maximal values), and thus ceases to tax systemic resources to a degree that necessitates competition with another level. The optimization also shows that the threshold (precursor) value for the support of free production by its precursor controlled production is the highest of the threshold values. The threshold for support from recognition to recall is the second-highest. This finding may reflect the distance between the two pairs of knowledge levels, which unlike recall and controlled production are established as dichotomous categories in the literature. It also reinforces the notion that controlled production is not sufficiently established to stop competing with free production by change, even while supporting it by level. Overall, the modeling procedures show that the data of the main case study, which was considered as encapsulating the receptive-productive gap, can be simulated as emerging from simple precursor interactions iterated over time. The optimized model not only confirms the hypothesis of precursor interactions across the continuum hierarchy, but also corroborates predictions about the nature of these interactions, and thereby also supports the interpretation of the data and variability analyses.

4.3.10 Generalizing the model

While the Portuguese speaker was chosen as a main case study for the purpose of further analyses, since her data was considered as most representative of the receptive-productive gap, all participants were subjected to similar treatment in terms of L2 exposure via immersion. Their data showed comparable (though far from identical) growth trends and surface correlations. This suggested that the model may be generalizable to all case studies. By optimizing the model values with the procedure described in the preceding section, its fit and relational control parameter values could be assessed in relation to the other datasets.

4.3.10.1 The Mandarin speaker

[Figure 19 plots. Panels: Data; Linear trends; Spline; Model. Series: Recognition; Recall; Controlled production; Free production. y-axis: 0 to 1.]

Figure 19. Data, linear trends, spline and optimized model (fit: 0.664916) for the Mandarin speaker case study

Visually, the model and linear trend of this dataset are very similar. The sum of squared differences between the model and the data is fairly low, at a value of 0.664916. The linear regression equations for the model and the data, presented in Table 10, are also similar, as in the case of the Portuguese speaker model.

Level                   Data                    Model
Recognition             y = 0.0012x + 0.7947    y = 0.0013x + 0.7928
Recall                  y = 0.0032x + 0.5163    y = 0.0041x + 0.4809
Controlled production   y = 0.0053x + 0.4418    y = 0.0063x + 0.4136
Free production         y = 0.0041x + 0.1377    y = 0.0049x + 0.1027

Table 10. Linear regression equations for the data and model of the Mandarin speaker case study

Likewise, the optimized parameter values of the model for this case study show simultaneous support by level and competition by change from recall towards controlled production, and from controlled production towards free production. In contrast, the interactions from recognition to recall are only positive. The optimized values also show, as in the Portuguese speaker model, that the conditional precursor value from controlled production to free production is the highest of the three threshold values, followed by the precursor value from recognition to recall.

Parameter                                           Value
Growth rate recognition                             0.010027
Growth rate recall                                  0.013594
Growth rate controlled production                   0.026835
Growth rate free production                         0.033578
Recognition to recall, by change                    0.0009
Recall to controlled production, by change          -0.0012983
Controlled to free production, by change            -0.004794
Recognition to recall, by level                     0.152386
Recall to controlled production, by level           0.101745223
Controlled to free production, by level             0.3307042
Precursor value, recognition to recall              0.1863
Precursor value, recall to controlled production    0.141342629
Precursor value, controlled to free production      0.247487

Table 11. Optimized parameters for the Mandarin speaker model

4.3.10.2 The Vietnamese speaker

[Figure 20 plots. Panels: Data; Linear trends; Spline; Model. Series: Recognition; Recall; Controlled production; Free production. y-axis: 0 to 1.]

Figure 20. Data, linear trends, spline and optimized model (fit: 1.174487) for the Vietnamese speaker case study

At a sum of squares of 1.174487, the fit of the model to the Vietnamese speaker data is inferior to that of the two previous versions. However, when compared with the spline, the model still captures some key traits of the data, particularly the interplay between recall and controlled production, and the general curve of increase in recognition throughout the study period. The linear trend equations of the model are also generally different from those of the data, although the recall and free production equations are relatively similar.

Level                   Data                    Model
Recognition             y = 0.0021x + 0.8887    y = 0.0048x + 0.8543
Recall                  y = 0.0055x + 0.5425    y = 0.0045x + 0.5573
Controlled production   y = 0.005x + 0.5387     y = 0.0022x + 0.6069
Free production         y = 0.0043x + 0.2428    y = 0.0056x + 0.2152

Table 12. Linear regression equations for the data and model of the Vietnamese speaker case study

Table 13 (below) shows, again, that the interactions between recall and controlled production and controlled and free production entail simultaneous support and competition, and that the precursor value from controlled to free production is the highest of the threshold values, followed by that of recognition towards recall. However, although these interactions confirm the hypotheses in the sense that they are retained while producing the best fit, as in the previous model versions, this fit is nonetheless poor, as expressed both in the sum of squares value and the dissimilarities between the data and trend plots.

Parameter                                         Value
Growth rate recognition                           0.10191
Growth rate recall                                0.016518
Growth rate controlled production                 0.013223
Growth rate free production                       0.028248
Recognition to recall, by change                  0.0008
Recall to controlled production, by change        -0.002061107
Controlled to free production, by change          -0.003669
Recognition to recall, by level                   0.235
Recall to controlled production, by level         0.113592645
Controlled to free production, by level           0.1823496
Precursor value, recognition to recall            0.1146
Precursor value, recall to controlled production  0.089396726
Precursor value, controlled to free production    0.2851287

Table 13. Optimized parameters for the Vietnamese speaker model
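
The fit values quoted for each model version (for example 1.174487 for the Vietnamese speaker) are sums of squared model-data residuals. A minimal sketch of that measure follows; pooling the residuals over all four levels is an assumption here, and the numbers are invented for illustration rather than taken from the datasets:

```python
def sum_of_squares(data, model):
    """Sum of squared residuals, pooled over all levels and time points."""
    total = 0.0
    for level, observed in data.items():
        simulated = model[level]
        if len(observed) != len(simulated):
            raise ValueError(f"length mismatch for level {level!r}")
        total += sum((o - s) ** 2 for o, s in zip(observed, simulated))
    return total


# Invented scores for two levels over three measurement points.
data = {
    "recognition": [0.85, 0.90, 0.88],
    "free production": [0.20, 0.25, 0.22],
}
model = {
    "recognition": [0.86, 0.88, 0.90],
    "free production": [0.21, 0.23, 0.24],
}
fit = sum_of_squares(data, model)
```

A smaller value indicates a closer fit, which is why the Vietnamese speaker version is described as the weakest of the model versions.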

4.3.10.3 The Indonesian speaker

[Figure 21 panels: Data, Linear trends, Spline, Model. Each panel plots Recognition, Recall, Controlled production and Free production on a vertical axis from 0 to 1, against a horizontal axis with ticks from 1 to 34.]

Figure 21. Data, linear trends, spline and optimized model (fit: 0.603402) for the Indonesian speaker case study

The goodness-of-fit of this model is second to that of the Portuguese speaker model, as expressed in the sum of squares. Visually, this model also deviates strongly from the linear trends, and is more similar to the spline interpolations, particularly that of recognition. For the three lower levels, the model is fairly similar to the linear trends. This is a possible strength of the model, as there is more variability around the central growth trends of this dataset, and hence a better fit of the spline than the linear trends to the data. However, the model fails to capture the intertwined growth patterns of recall and controlled production, which can be seen in the spline plot (although they are inevitably omitted from the linear trends). The high amount of variation in this dataset, which can also be seen in the pronounced differences between its linear trend and spline interpolations, weakens the fit of the model. Nevertheless, the linear regression equations of the model are relatively similar to those of the data, although their slope values tend to be higher.

Level                   Data                    Model
Recognition             y = 0.0039x + 0.8269    y = 0.0069x + 0.7902
Recall                  y = 0.0063x + 0.5663    y = 0.0072x + 0.5414
Controlled production   y = 0.0055x + 0.483     y = 0.0067x + 0.4721
Free production         y = 0.0015x + 0.1448    y = 0.0031x + 0.1078

Table 14. Linear regression equations for the data and model of the Indonesian speaker case study

The optimized parameter values of the model show the two characteristics observed in the previous versions, namely simultaneous support by level and competition by change from recall to controlled production and from controlled production to free production, and a higher precursor value for the controlled-free production interaction than for other interactions (followed by the precursor value from recognition to recall).

Parameter                                         Value
Growth rate recognition                           0.10191
Growth rate recall                                0.016518
Growth rate controlled production                 0.013223
Growth rate free production                       0.028248
Recognition to recall, by change                  0.0006
Recall to controlled production, by change        -0.00298
Controlled to free production, by change          -0.00292
Recognition to recall, by level                   0.1669
Recall to controlled production, by level         0.161935
Controlled to free production, by level           0.045509
Precursor value, recognition to recall            0.1164
Precursor value, recall to controlled production  0.077642
Precursor value, controlled to free production    0.152186

Table 15. Optimized parameters for the Indonesian speaker model
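
To make the roles of these parameters concrete, the following sketch shows one way a chain of four connected growers could use them. The update rule is an assumption made for illustration: a logistic grower whose growth rate is raised by the level of its precursor, and lowered or raised by the change in its precursor, once the precursor has passed its threshold value. The carrying capacity of 1, the onset values (taken from the intercepts of the data trend lines in Table 14) and the 36 steps are likewise assumptions, so the output is not expected to reproduce the fitted trajectories in Figure 21.

```python
# Parameter values copied from Table 15 (Indonesian speaker model).
LEVELS = ["recognition", "recall", "controlled production", "free production"]
GROWTH = [0.10191, 0.016518, 0.013223, 0.028248]
# Interaction parameters from each level to the next higher level.
BY_CHANGE = [0.0006, -0.00298, -0.00292]
BY_LEVEL = [0.1669, 0.161935, 0.045509]
PRECURSOR = [0.1164, 0.077642, 0.152186]


def simulate(initial, steps, capacity=1.0):
    """Iterate four coupled logistic growers (assumed update rule)."""
    history = [list(initial)]
    for _ in range(steps):
        current = history[-1]
        previous = history[-2] if len(history) > 1 else current
        following = []
        for i, level in enumerate(current):
            rate = GROWTH[i]
            # The precursor acts only once it has reached its threshold.
            if i > 0 and current[i - 1] >= PRECURSOR[i - 1]:
                rate += BY_LEVEL[i - 1] * current[i - 1]
                rate += BY_CHANGE[i - 1] * (current[i - 1] - previous[i - 1])
            following.append(level + rate * level * (1.0 - level / capacity))
        history.append(following)
    return history


# Illustrative onset values: intercepts of the data trend lines in Table 14.
onset = [0.8269, 0.5663, 0.483, 0.1448]
trajectory = simulate(onset, steps=36)
```

In this sketch a positive by-level value expresses support, while a negative by-change value expresses competition: growth in the dependent level slows whenever its precursor is itself growing.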

4.3.11 Summary of the modeling procedures

The data from the four case studies were simulated by a model of connected growers based on precursor interactions. A preliminary version of this model was used to simulate the Portuguese speaker data, which was considered as most typical of the receptive-productive gap. The model was based on the premise of precursor interactions from adjacent lower (more-receptive) to higher (more-productive) knowledge levels, and configured the onset data values as its starting point. A 300-step iteration of the model generated growth patterns that resembled the general data trends. However, the model did not replicate all of the surface level interactions in the dataset, as expressed in its correlation matrix. A subsequent procedure of optimizing the control parameters of the model via the Simplex algorithm achieved a good fit on all levels, evident both in a relatively
small sum of squared model-data residuals and in the visual appearance of the model. Since the optimization procedure eliminates variation, which is a random parameter, the model was visually compared with the linear trends and the spline plots of the data. Moreover, although linear analyses do not capture temporal changes in variability and interactions in the data, they pertain to the receptive-productive gap as it has been described by preceding studies. Therefore, linear regression equations were used as an additional measure of model fit. These procedures showed strong similarities between the main case study data and the model. Applying the same fit assessment procedures to optimized model versions for the three other case studies yielded similar outcomes, with the exception of the Vietnamese speaker version. However, the Portuguese speaker model produced the best fit, possibly because this case study was the most representative of the receptive-productive gap. The optimization procedure achieved not just a replication of the data trends on the basis of precursor interactions, but also a reinforcement of the interpretation of the variability analyses as revealing simultaneous support and competition between adjacent levels of the knowledge continuum. First, the optimized values showed that the best model fit was obtained through precursor interactions from more-receptive to more-productive levels. Second, in the recall-controlled production and controlled production-free production dyads, the precursor-dependent interactions were support by level and competition by change, in all the versions of the model. By applying the basic premises of the precursor model – unidirectional and simultaneous support and competition between hierarchically ordered connected growers – to the onset values of the empirical data, these simulations demonstrated the explanatory power of the model with regard to development in the receptive-productive vocabulary knowledge continuum. 
This can be considered as additional evidence for the notion of the receptive-productive gap, as well as for the conceptualization of vocabulary knowledge as a dynamic system.
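
The optimization step summarized above can be illustrated with a small example. The sketch below assumes that "Simplex" refers to the Nelder-Mead downhill simplex method, and it fits only a single logistic grower to synthetic data; the function names, starting values and tolerances are all illustrative assumptions rather than the procedure actually used for the four-level model.

```python
def logistic_series(rate, capacity, start, steps):
    """Iterate a single logistic grower and return its trajectory."""
    values = [start]
    for _ in range(steps):
        level = values[-1]
        values.append(level + rate * level * (1.0 - level / capacity))
    return values


def nelder_mead(objective, start, step=0.05, tol=1e-14, max_iter=5000):
    """Minimise `objective` with a basic Nelder-Mead downhill simplex."""
    n = len(start)
    simplex = [list(start)]
    for i in range(n):
        vertex = list(start)
        vertex[i] += step
        simplex.append(vertex)

    for _ in range(max_iter):
        simplex.sort(key=objective)
        best, worst = simplex[0], simplex[-1]
        if objective(worst) - objective(best) < tol:
            break
        centroid = [sum(v[i] for v in simplex[:-1]) / n for i in range(n)]

        def along(t):
            # Point on the line through the worst vertex and the centroid.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        if objective(reflected) < objective(best):
            simplex[-1] = min(along(2.0), reflected, key=objective)
        elif objective(reflected) < objective(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = along(-0.5)
            if objective(contracted) < objective(worst):
                simplex[-1] = contracted
            else:
                # Shrink every vertex halfway towards the best one.
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, v)]
                    for v in simplex[1:]
                ]
    return min(simplex, key=objective)


# Synthetic target generated with known parameters (illustration only).
target = logistic_series(0.1, 0.9, 0.2, 100)


def fit_error(params):
    """Sum of squared residuals between the target and a candidate model."""
    rate, capacity = params
    if not 0.0 < rate < 1.0 or capacity <= 0.0:
        return float("inf")  # keep the search inside a sensible region
    simulated = logistic_series(rate, capacity, 0.2, 100)
    return sum((t - s) * (t - s) for t, s in zip(target, simulated))


best_rate, best_capacity = nelder_mead(fit_error, [0.2, 1.0])
```

Because the objective here is the sum of squared residuals, the parameter set returned is the one with the best fit in the sense used throughout this section.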

4.4 Summary

The primary objective of this study was to investigate whether the dynamic approach can reveal internal interactions between levels of receptive and productive vocabulary knowledge. The dynamic perspective was previously applied to L2 data by observing and contrasting temporal changes in variability patterns across different aspects of language (see for instance Larsen-Freeman, 2006b). In the current study, this
approach was extended beyond variability analyses, by a simulation of development in a dynamic growth model. The added value of using simulations is that the hypothesized dynamic interactions deduced from data and variability analyses can be tested.

The focus of the study was the nonlinear interaction between receptive and productive L2 vocabulary knowledge levels, known as the receptive-productive gap (e.g., Laufer, 1998). The literature on receptive and productive vocabulary knowledge differs in terms of operational definitions and assessment methods (Schmitt, 2010). Despite this lack of uniformity, the gap was frequently mentioned in studies that contrasted more- and less-productive skills, such as word recall and recognition (Fitzpatrick et al., 2008), or recognition, controlled and free production (Laufer, 1998). While not always expressed at the group level, the gap has been frequently noted in individual learners (Laufer et al., 2004; Schmitt & Meara, 1997). Across such learners, it varies widely (Laufer, 1994; Fan, 2000). Moreover, there is evidence that the gap is consistent across L2 proficiency levels (Melka, 1997; Fan, 2000; Vermeer, 2001); on the other hand, there is also evidence that it decreases with increased L2 proficiency (Webb, 2008).

The present study traced development in a continuum of four knowledge levels of L2 vocabulary: recognition, recall, controlled production and free production. These levels range from least- to most-productive, and were defined and assessed in line with several previous studies that distinguished recognition from recall (Laufer et al., 2004; Laufer & Goldstein, 2005), and controlled production from free production (Laufer & Nation, 1995, 1999). The target vocabulary in this study was specified as academic English.
Free production was defined as spontaneous, accurate and complex (i.e., varied) usage of this vocabulary in written texts; controlled production as its elicited and cued use in context; recall as cued recollection of word form in response to meaning; and recognition as cued identification of word form from multiple-choice options, again in response to meaning. These levels were assessed weekly in four advanced learners during a 36-week immersion period in an academic English-speaking environment.

The first research question of the study concerned the existence and consistency of the gap over time. It was addressed by plotting the four datasets as developmental trajectories. This procedure showed growth on all knowledge levels, which was accompanied by a high degree of variability. Variability was particularly
pronounced in free production, confirming the existence of the gap in the data and its maintenance over the learning period despite local fluctuations. This observation also illustrated the DST principle which associates increased variability with development of a newly-established skill or knowledge component (van Dijk, 2003; Verspoor et al., 2008a). A second procedure addressing this question was analyses of correlations between the knowledge levels, and between these knowledge levels and time. The correlations showed that despite an overall increase during the study period, expressed in positive (although not always statistically significant) level-time correlations in all case studies, the gap between free production and the three lower levels remained robust, supporting previous accounts (Fan, 2000; Laufer, 1998; Laufer & Paribakht, 1998; Schmitt & Meara, 1997). The study then focused on the Portuguese speaker dataset, which displayed robust positive correlations between all levels and time, including free production, indicating significant growth. Despite this fact, free production was not found to correlate significantly with the three lower knowledge levels in this dataset. Thus, this case study was considered as typical of the receptive-productive gap. Linear regressions showed that growth in free production could not be sufficiently explained by the passage of time alone, or by its linear combination with one or more of the lower knowledge levels. This procedure thus addressed the second research question of the study, which concerned the nature of the interactions between the knowledge levels. The hypothesis pertaining to this question was that these interactions would be nonlinear and complex, reflecting complex patterns of competition and support within a hierarchical order of precursors and dependents. The same question was addressed further by variability analyses, which were again focused on the Portuguese speaker data. 
These analyses contrasted the residuals and moving correlations of pairs of consecutive knowledge levels in the continuum. The outcomes of these procedures suggested shifts between competition and support, which have previously been associated with the precursor model (Verspoor et al., 2008b). These shifts were deduced from fluctuating and inverse patterns in the residuals of adjacent knowledge levels, as well as in their moving correlations. The inverse patterns were most pronounced in the controlled-free production interaction, followed by the recall-controlled production interaction. The stronger fluctuations in the interactions between the higher and less-established knowledge levels were
considered as further indication of the applicability of the precursor model to the knowledge continuum, since the model specifies sequences of emergence that depend on conditional threshold values, with stronger competition between higher and less-established skills (Fischer, 1980; van Geert, 1991).

These interpretations of the variability analyses led to the third research question: whether a model based on dynamic precursor interactions can adequately simulate development across the knowledge continuum, and thereby account for the receptive-productive gap. The question also concerned the interpretation of the variability analyses and, more generally, the emphasis on variability in dynamic-oriented SLA studies (de Bot et al., 2005, 2007; Verspoor et al., 2004, 2008a). To address this final question, a model based on coupled logistic equations, which depict precursor interactions, was used to simulate the data of the main case study. The model consisted of a four-level hierarchy of precursors and dependents, in accordance with the knowledge continuum. In line with the variability analyses results, unidirectional (precursor to dependent) interactions between these levels were configured as more competitive between higher and more-productive levels, and less so between the lower levels. While the model replicated the general growth patterns of the data, it did not fit its actual values or all of its surface interactions.

In the last stage of the study, this model was optimized. The optimized model outcome showed that precursor interactions of simultaneous support (by level) and competition (by change), specifically from recall to controlled production and from controlled production to free production, can generate very similar growth patterns to those of the data. The visual comparison, the sum of squares, and a comparison of the linear trend equations of the model and the data showed a good model fit.
Moreover, the optimized parameter values demonstrated that despite a fairly simple configuration, based only on bottom-up interactions between consecutive knowledge levels (precursor to dependent), the model successfully replicated the data growth trends. These values also corroborated the results of the variability analyses. Applications of optimized versions of the same model to the three other case studies supported the findings from the main case study. These applications indicated that the precursor model may be applicable to various learners (albeit with comparable proficiency levels and learning contexts). However, the best model-data fit remained that of the Portuguese speaker dataset, perhaps because it best embodied
the nonlinear development and interaction that are associated with the receptive-productive gap.

Together with the data and variability analyses, the model validated both the notion of vocabulary as a continuum of interdependent and hierarchical receptive-productive knowledge levels (Laufer & Nation, 1995, 1999; Laufer et al., 2004), and the idea that competition is stronger between higher, less-established growers in a dynamic system (van Geert, 1994, 2003). More broadly, the modeling procedures provided additional evidence for the applicability of the dynamic approach to linguistic research, that is, that general principles pertaining to natural, dynamic systems can be used to simulate linguistic data and explain its development, as claimed by previous studies (Bassano & van Geert, 2007; Robinson & Mervis, 1998; van Geert, 2008; Verspoor et al., 2008b).
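
The moving correlations used in the variability analyses summarized in this section can be sketched as a Pearson correlation computed over a window that slides along two series. The window size and the two short series below are illustrative assumptions; the thesis does not state them in this section.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a flat window
    return sxy / sqrt(sxx * syy)


def moving_correlation(xs, ys, window=5):
    """Correlation of xs and ys inside each window of consecutive points."""
    return [
        pearson(xs[i:i + window], ys[i:i + window])
        for i in range(len(xs) - window + 1)
    ]


# Invented series: the two levels first rise together, then diverge.
lower_level = [1, 2, 3, 4, 5, 6]
higher_level = [2, 4, 6, 5, 4, 3]
moving = moving_correlation(lower_level, higher_level, window=3)
```

A shift from positive to negative values along the resulting sequence is the kind of pattern interpreted above as a change from support to competition between adjacent levels.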

4.5 Discussion

The present study has implications and limitations in several respects. Pedagogically, it reaffirms previous findings that suggest a lack of linear progression in vocabulary knowledge and especially in productive use, even given intensive exposure to the target language (e.g., Laufer & Paribakht, 1998; Schmitt & Meara, 1997). The study thus adds to the evidence that receptive vocabulary knowledge, however sophisticated and advanced, cannot be expected to transfer smoothly into spontaneous production, even from a relatively high knowledge level such as controlled production. Since complex, varied and correct vocabulary knowledge is essential for other linguistic abilities (Beglar & Hunt, 1999; Jiang, 2004; Laufer 1992a, 1992b), and for overall proficiency (Krashen, 1989; Laufer & Nation, 1999; Qian, 1999; Zareva et al., 2005), free production should be targeted directly rather than expected to improve as a byproduct of receptive knowledge growth alone. At the same time, temporal fluctuation of vocabulary knowledge, as previously noted by many researchers (Bardovi-Harlig & Stringer, 2010; Gross, 2004), should be expected.

Empirically, the study strengthens the case for longitudinal case study methodology in researching L2 vocabulary development (Henriksen, 1999; Horst & Meara, 1999; Fan, 2000). Within this framework, it corroborates the dynamic and complex approach that focuses on changes over short time periods and on the empirical value of variability (Larsen-Freeman, 2006b; van Geert & van Dijk, 2002; Verspoor et al., 2004).

Theoretically, the study shows that the disparate development of receptive vs. productive vocabulary knowledge need not necessarily be accounted for by separate mechanisms such as declarative and procedural memory (e.g., Robinson, 1989, 1993). Rather, vocabulary knowledge can be treated as a single system, within which interactions can be explored (Melka, 1997). In this single-system approach, lexical knowledge is frequently considered as stored, stratified and stable (e.g., Levelt, 1992). Yet there is ample evidence pointing to the instability of vocabulary knowledge, for instance its high context-dependency (de Bot & Lowie, 2010; Elman, 1995; van Orden & Goldinger, 1994), its short- and long-term fluctuation (Fitzpatrick et al., 2008; Schmitt & Meara, 1997, respectively), and the fact that meaning can occasionally be activated without access to form, as seen in TOT experiments (Burke et al., 1991; Clark, 1993). Similarly, the current study also indicates a high degree of fluctuation, both in vocabulary knowledge growth over time, and in the hierarchy of the knowledge continuum, in which higher levels may temporarily exceed lower levels.

Yet its presupposition of a distinct hierarchy of knowledge modalities diverges to a degree from the current dynamic accounts of the lexicon. These accounts posit a mental lexicon that consists of state-space activation patterns rather than fixed entities, negating not only the notion of word entries in the lexicon, as suggested by Levelt, but also of distinct knowledge modalities. From this perspective, the context-sensitivity and instability of word meaning and knowledge imply that accessibility can never be completely predictable, even in the L1 (de Bot & Lowie, 2010; Elman, 1995; Spivey, 2007). It can be concluded that the current study, which is based on the measurement of fixed constructs over time and their configuration in mathematical models, needed to assume distinct modalities of vocabulary knowledge.
Thus the foundations of its design are not in complete congruence with the theoretical dynamic approach to the lexicon. Yet this study joins a body of evidence supporting these dynamic accounts, just as Meara’s models of vocabulary knowledge, though not explicitly associated with DST and assuming discrete categories (for example L1 vs. L2), show how patterns of dynamic activation can emerge in a single system (Meara, 1989, 2001, 2005). Moreover, the precursor model and other dynamic equations can account for growth even when it is nonlinear, yet cannot account for most of the variability around this growth. This variability may well reflect ongoing processes of language
change within the individual, in word meaning, adaptation, and activation, which are inseparable from language use and language structure (in this case, a rather amorphous and unstable “structure”) (Beckner et al., 2009; de Bot & Lowie, 2010; Ellis & Larsen-Freeman, 2009; Spivey, 2007). It is quite conceivable that the gradedness or fuzziness of word meaning and knowledge, implied in dynamic accounts of the lexicon, is responsible for some of the variability observed in the current study (and perhaps also in its predecessors). However, the study is limited in its ability to address the implications of these accounts. Rather, it is focused on generic interaction patterns that pertain to diverse dynamic systems, with recent advancements in theories of the lexicon remaining beyond its scope.

Although the results of the study are promising, the study has several drawbacks. First, since the model is based on data derived from advanced learners, it does not include the onset of development, to which the precursor model originally pertains. Additional data from different stages of L2 vocabulary development are necessary to establish the explanatory power of the precursor model more firmly in this area. Second, any model is by definition an abstraction, the degree of which is subject to debate. Therefore, any model is a gross simplification of reality, as is in fact any attempt at research. Third, case study results cannot be directly generalized to larger populations. However, as noted by several researchers in the current context (Fan, 2000; Webb, 2008), cross-sectional results are also often inapplicable to individual cases.

It should be considered that the vocabulary knowledge paradigm in this study (and indeed many other studies) embodies the subcategories of complexity and accuracy, which are inherent to any testing method. The study has attempted to incorporate these dimensions into its measurement of free production. This operationalization is still not optimal.
Measuring free production might be inherently less reliable than measuring more-receptive knowledge levels, as by definition, free production cannot be elicited. Yet the fact that the participants did produce academic vocabulary freely in their (non-academic) writing shows that receptive knowledge of vocabulary items does transfer, to some extent, into production even when not expressly demanded by context. Thus it should be kept in mind that the receptive-productive gap is not merely a lack of transfer and use, but rather the nonlinear and variable nature of this use (Laufer, 1998; Melka, 1997; Schmitt, 2010; Webb, 2008).

Free vocabulary production as it is defined in this study not only incorporates the complexity and accuracy dimensions but also, predictably, does not occur only in conjunction with receptive vocabulary knowledge. Rather, it takes place at the text level of writing, as a complement to the (morpho)syntactic dimension. Thus using complex (sophisticated and varied) and accurate (correct) vocabulary in writing entails further interactions. Accordingly, the following chapter is concerned with extending the application of the dynamic approach to the subcategories of complexity and accuracy in the lexical dimension of L2 writing, and their counterparts within the syntactic dimension.

It is clear that the merit of the longitudinal and dynamic approach to SLA is found in its linkage with theory formation and different types of empirical data. Hopefully, this study will serve as an initial step in a collaborative and iterative process of documenting and understanding L2 vocabulary acquisition as a dynamic, temporal process.

Chapter 5 Dynamics of L2 writing development

5.1 Introduction

As the previous chapters have shown, the dynamic approach to SLA is beneficial in demonstrating and explicating the effect of the complex interactions between aspects of language development that underlie systemic self-organization (de Bot et al., 2007; Larsen-Freeman, 1997). In the context of language development, self-organization stems from two factors. The first factor is the structural qualities of language: the hierarchical order of linguistic knowledge specifies some components as prerequisite precursors, which conditionally support other dependent components and thereby enable their development. The second factor is the limited cognitive and environmental resources available to the learner, which yield competitive interactions within the precursor-dependent hierarchy of linguistic knowledge. This combination of conditional, competitive and supportive interactions within a structural order, given resource limitations, is referred to as the precursor model (van Geert, 1991).

In this chapter, the dynamic approach and specifically the precursor model are applied to two common distinctions made across the text-level of L2 writing performance: the distinction of complexity vs. accuracy, and of lexicon vs. syntax. These two distinctions are merged into a single paradigm, in which complexity and accuracy are addressed separately in the lexical and syntactic dimensions.

Writing performance encompasses numerous dimensions, which have been addressed by a wide array of studies (Cumming, 2001). Text-level indexes of L2 writing performance are frequently validated on the basis of their ability to discriminate between teaching programs, proficiency levels or tasks (cf., Larsen-Freeman, 1978; Larsen-Freeman & Strom, 1977). However, development is neither linear nor uniform across many of these measures (Larsen-Freeman, 2009; Young, 1995).
According to some researchers, cross-linguistic (L1 to L2) influences make it unlikely that a single measure would serve as a uniformly-valid developmental indicator across all L2 writers (Odlin, 1989, Ringbom, 1987, both as cited in Cumming & Mellow, 1996). Therefore, L2 writing is often explored from a “multi-faceted, rather than unified, perspective” (Cumming, 2001, p. 10), by surveying
several indexes simultaneously as representing one or more categories of performance (Cumming & Mellow, 1996). This approach is taken up by the current study, in conjunction with the dynamic approach and its emphasis on nested, complex interactions.

The distinction between complexity and accuracy is very common in SLA research. The basic definitions of these constructs, which inform the current study, are of complexity as the use of a wide array of vocabulary items and syntactic structures (Lennon, 1990), and accuracy as error-free production (Foster & Skehan, 1996; Lennon, 1990). Several studies have shown that complexity and accuracy develop nonlinearly and asynchronously in spoken and written L2 (Cumming & Mellow, 1996; Foster & Skehan, 1996; Skehan & Foster, 1997; Wolfe-Quintero, Hae-Young & Inagaki, 1998). Complexity and accuracy can be further specified within the lexical or syntactic dimensions of writing (cf., Cumming & Mellow, 1996). Lexicon and grammar[15] show similar developmental discrepancies in both speech and writing, beginning at the earliest stages of L1 acquisition (Woodward et al., 1994; Marchman & Bates, 1994), and manifesting in L2 writing as well (Verspoor et al., 2008b). Many studies, however, treat complexity and particularly accuracy as general constructs at the global text level (cf. Ishikawa, 1995), and do not differentiate these categories further across the lexical-syntactic divide. As MacWhinney notes, accounts of (in)accuracy in language production share the common drawback of “treat[ing] language ability as a single undifferentiated whole” (2006, p. 148).

The asynchronous developmental paths of complexity and accuracy, and those of lexicon and syntax in particular, are explained by two competing accounts. These accounts are readdressed in detail in the subsequent sections of this chapter in conjunction with each distinction.
The first is the dual-systems model (Ullman, 2001a, 2001b), which is affiliated with the traditional generativist approach (Chomsky, 1957, 1965; Pinker, 1991, 1994). This model relates the disparate development of lexicon and syntax to their reliance on two distinct types of knowledge, declarative and procedural, respectively. It is therefore also known as the DP-model (Ullman, 2001b, 2004). Declarative knowledge is based on rote memory, or representation (“knowing what”), while procedural knowledge is based on memory for rules (“knowing how”)

15 This study is focused primarily on the syntactic dimension of writing, in juxtaposition with the lexical dimension. However, when reviewing studies that refer to syntactic development as part of grammar, it reverts to the term used by these studies.

108 Dynamics of L2 writing development

(Ryle, 1949). The same dual mechanisms have been aligned with the nonparallel development of complexity and accuracy, with complexity associated with declarative knowledge, and accuracy with procedural knowledge (Bialystok, 1982, 1994). The second and alternative account of the complexity-accuracy and lexical- syntactic distinctions is the single-system model, which is associated with cognitive and usage-based theories. It specifies a single mechanism as responsible for all aspects of linguistic knowledge. This mechanism is automatization, which involves the establishment of memory chunks assembled from robust linguistic patterns in fragmented phonemes, syllables, words and so forth (Ellis, 1996). Automatization is thus seen as linking complexity and accuracy in language performance, or in other words as transferring linguistic features from complex to accurate use, since it frees cognitive resources for higher- and more-sophisticated applications by conjoining linguistic elements. In turn, automatization relies on cognitive resources that are inherently limited, such as working memory span (Baddeley, 1990) or attentional capacities (Skehan, 2009; Skehan & Foster, 2001). Thus the single-system account maintains that disparate development across both the complexity-accuracy and the lexical-syntactic distinctions arises from a competition between each pair of categories (and indeed between all developing skills) for the limited capacities of learners and their environments (Bates & MacWhinney, 1989; Ellis, 1996; Foster & Skehan, 1996; Skehan, 2009; Skehan & Foster, 1997, 2001). The claim that resource limitations lead to competitive interactions between co-developing linguistic components, as posited by the single-system model, is congruent with the general premises of DST. The dynamic approach has been applied to both the complexity-accuracy and lexical-syntactic distinctions in several empirical studies. 
Two longitudinal studies of complexity and accuracy in L2 writing have identified disparate growth and variability patterns in these categories (Larsen-Freeman, 2006b, 2009; Spoelman & Verspoor, 2009, in press). Other dynamically oriented studies have addressed the lexical-syntactic distinction in spoken early L1 by simulating the nonparallel development of lexicon and syntax with mathematical precursor models (van Geert, 1991, 1993; Robinson & Mervis, 1998). Building on these studies, Verspoor et al. (2008b) have extended the dynamic perspective to the lexical and syntactic dimensions of written L2. They noted inverse variability patterns across indexes denoting these dimensions, relating them to dynamic precursor interactions.

109 CHAPTER 5

The working assumption of the current study is that the complexity-accuracy and lexical-syntactic distinctions can be combined in a single paradigm of L2 writing performance. This assumption is based on the association of both distinctions with dynamic self-organization (Larsen-Freeman, 2006b; Spoelman & Verspoor, 2009, in press; Verspoor et al., 2008b), as well as on the fact that both the dual-systems and single-system accounts pertain to both distinctions. The proposed paradigm is hierarchical, consisting of lexical complexity, lexical accuracy, syntactic complexity and syntactic accuracy, in this order. Lexicon is posited as a precursor to syntax, and within both the lexical and syntactic dimensions, complexity is assumed to be a precursor to accuracy. These expectations are based on previous studies that have shown that vocabulary acquisition is an essential predecessor of syntactic development (Marchman & Bates, 1994; Robinson & Mervis, 1998).16 In both the lexical and syntactic dimensions, complexity is considered a prerequisite for accuracy, since without the acquisition and use of varied and increasingly complex lexical and (morpho)syntactic structures, correct performance cannot be achieved (Ellis, 1996; McLaughlin, 1990). Aside from these conditional support interactions, the present study hypothesizes a simultaneous competition for the limited resources of learners and their environments between each precursor-dependent dyad in the hierarchy, in line with the precursor model (Fischer, 1980; van Geert, 1993) and the single-system account (Ellis, 1996; Foster & Skehan, 1996). The study has two main research questions. The first is whether the four categories of writing performance exhibit nonlinear and alternating growth and variability patterns that are indicative of precursor interactions as specified above.
This question is addressed by variability analyses across combinations of indexes that denote these categories in longitudinal L2 data. The second question is whether the interpretation of these variability patterns as expressing the nature of precursor interactions (i.e., as indicating predominantly support or competition between precursor-dependent dyads during the study period) can be corroborated. The question is approached by configuring the interactions gleaned from the analyses as relational control parameters in a mathematical precursor model aimed at simulating the data. The chapter begins by addressing the complexity-accuracy distinction and the competing dual-systems and single-system accounts that refer to it, as well as the

16 More recent Chomskian theory (cf. Chomsky, 1995) also hypothesizes that syntactic development begins by drawing on an existing lexicon, thereby in fact positing lexicon as a type of conditional precursor to syntax.


dynamic perspective on this topic. The following section similarly discusses the lexical-syntactic distinction. Due to the related theoretical perspectives on the two topics, there is some degree of overlap between the two sections. However, since the bulk of the literature addresses these distinctions separately, the study reviews them accordingly. Both distinctions are then discussed conjointly as components of a single paradigm. The next section specifies the research questions and methodology of the study, including the indexes used to denote each category. The results section is divided into data and variability analyses, followed by the modeling procedures. Since this section is rather extensive, the results are accompanied by interpretations and summaries. The chapter ends with a discussion of the implications and limitations of the study.

5.1.1 Complexity vs. accuracy

As mentioned earlier, complexity is broadly defined as the use of a variety of linguistic structures, and accuracy as the correctness of use (Lennon, 1990). According to Wolfe-Quintero et al., complexity “reveals the scope of expanding or restructured second language knowledge”, whereas accuracy “shows the conformity of second language knowledge to target language norms” (1998, p. 4). The complexity-accuracy distinction assumes “different subsystems of language” which “may develop differently and respond to instruction differently” (Pienemann, Johnston, & Brindley, 1988, as cited in Cumming & Mellow, 1996, p. 73). The nature of these subsystems has been subject to avid debate. In conjunction with the similar controversy on the lexical-syntactic distinction, this debate constitutes a part of a greater dispute that concerns L1 vs. L2 acquisition and the conflict between innate and cognitive approaches to language development. The dual-systems account of the complexity-accuracy disparity claims that declarative knowledge is responsible for complexity, while procedural knowledge, which reflects knowledge accessibility, underlies accuracy (Bialystok, 1982, 1994). The two knowledge types are in turn attributed to memory types that occur in different brain localities (Chomsky, 1995; Pinker, 1994; Ullman, 2001a, 2001b). Versions of the dual-systems model are held with varying degrees of strength and conviction. For instance, Chomsky (1995) maintains that domain-specific mechanisms are responsible for the generation of the various dimensions of language. Conversely, Ullman (2004) posits that the two knowledge types are much more


general and thus responsible for many other skills, including certain aspects of animal behavior. The different versions of the dual-systems model also diverge in their view on the exclusivity of the posited mechanisms in relation to different linguistic constructs. In this respect, Ullman (2004) does not claim that any given aspect of linguistic knowledge is generated solely by one memory type or another, while Pinker (1994) maintains a stricter view (see the following section, as well as Green, 2003, for an extensive review). In the current study, the different versions of the dual-systems model are not distinguished, since its aim is not to refute or confirm the dual-systems account, but to approach the complexity-accuracy and lexical-syntactic discrepancies from an alternative and dynamic angle. Pinker (1991) frequently refers to a U-shaped error pattern in early L1 past-tense verb production as evidence in favor of the dual-systems approach. This pattern theoretically supports the assumption of distinct mechanisms, since it shows stages in early verb production. According to this interpretation, at an early stage, in which procedural knowledge is not yet available, children rely on declarative memory for regular (i.e., rule-based) verb conjugation. Therefore, initially both regular and irregular verbs are learned (correctly) by rote. At a later stage, verb conjugation begins to rely on procedural knowledge, and thus regular verb endings are overgeneralized to irregular verbs, resulting in increased error. Eventually, verb knowledge stabilizes, with regular verbs relying on procedural knowledge and irregular verbs on declarative memory, and thus overall error in verb inflection diminishes.
Pinker’s (1991) separation of words and rules relies on this differential development of regular and irregular verb inflections, stating that irregular verb forms are stored in memory as lexical items, whereas regular verbs are conjugated by rules.17 In Ullman’s view, these patterns in early L1 verb error imply that declarative knowledge does not always transfer into procedural knowledge, since the two mechanisms compete as dual routes. Because declarative knowledge is assumed to be earlier and more-established, often input is not processed as procedural even once procedural memory has developed, since the declarative route is faster (Ullman, 2004). This elaboration on the dual-systems model necessitates the positing of a third mechanism as a conjoining factor that can explain how some items which are initially

17 The words/rules model is strongly contested by connectionist theory (see Plunkett & Marchman, 1991).


processed declaratively (i.e., regular verbs) eventually transfer into procedural memory and are then processed on the basis of rules. Thus, elaborations on dual-systems theory have aimed to account for the eventual link between the memory types, in other words for the transfer of knowledge from declarative to procedural memory, by positing a third mechanism. This mechanism is referred to as automatization, and in turn relies on a third memory type: working memory (Anderson, 1982, 1993). This account is general rather than language-specific, referring to all types of knowledge. It posits declarative knowledge as a predecessor to procedural knowledge, with working memory and the ensuing increase in automatization as an intermediary factor that conjoins the two memory types (Anderson, 1976, 1982, 1993). Thus, the complexity-accuracy interface is attributed to the Power Law of practice, which states that accuracy in performance will increase with automatization over time (Anderson, 1982; Rosenbloom & Newell, 1987; Anderson, 1993; McLaughlin & Heredia, 1996). Contrary to this stipulation, several studies have suggested that the interaction between L2 complexity and accuracy in written L2 (as well as between these and other aspects of linguistic knowledge, such as fluency) is nonlinear and non-uniform across learners (e.g., Larsen-Freeman & Strom, 1977) and particularly within learners, remaining disparate over time (Larsen-Freeman, 2006b; Skehan, 2009). There is evidence that L2 accuracy develops nonlinearly – decreasing in beginners, increasing in intermediate learners, and again decreasing in advanced learners (MacKay, 1982; Foster & Skehan, 1996). For example, a study of university-level learners found that “the number of error-free T-units decreased from low-intermediate to intermediate learners, and then increased for the advanced learners” (Sharma, 1980, as cited in Wolfe-Quintero et al., 1998, p. 39) (note that increased error is inverse to increased accuracy).18 This finding, however, was contradicted by another study (Tomita, 1990, as cited in Wolfe-Quintero et al., 1998, p. 39), which reported increased accuracy in lower level learners. Referring to both studies, Wolfe-Quintero and her colleagues state:

The U-shape commonly associated with accuracy in language development has been identified for particular phenomena such as past-tense morphology, whereas the error-free T-unit measure is more global in scope and may follow

18 The error-free T-unit (which is a main clause plus all its subordinates) is a general ratio of all error types per T-unit. For several reasons, listed in section 5.2, neither the T-unit nor the error-free T-unit ratio is used in the current study.


a different path. A decrease in error-free T-units might also correlate with an increase in complexity, although this comparison has not been investigated directly by anyone (Wolfe-Quintero et al., 1998, p. 39).

The nonparallel development of complexity and accuracy, and the fact that even when the DP-model is assumed in this context, it still relies on automatization, which is in turn based on the inherently limited capacity of working memory (Baddeley, 1990; Skehan, 1998), lead to a consideration of a single-system model of complexity and accuracy. In general, single-system accounts explain shifts in L2 accuracy, such as the U-shaped pattern of increase, decline and subsequent increase, by a competition or tradeoff interaction between complexity and accuracy (Skehan & Foster, 1997; Mehnert, 1998).19 Like the DP-model, single-system accounts mostly pertain to complexity and accuracy within the syntactic dimension of L2. The tradeoff interaction that they identify is in turn allocated to limitations on various cognitive capacities (VanPatten, 1990). These limitations are addressed by theories such as the limited capacity attentional system (Skehan, 1998; Skehan & Foster, 2001) and the multiple resources attention model (Robinson, 2005). Both theories state that increased cognitive complexity takes a toll on accuracy in linguistic performance, particularly when time constraints are imposed. Complexity and accuracy are frequently inspected in conjunction with fluency (cf., Larsen-Freeman, 2006b, 2009; Skehan & Foster 1997; Wolfe-Quintero et al., 1998). It should be noted that the present study does not address fluency, since it relates to the time dimension, i.e., processing speed, which is in turn operationalized as the number of pauses, repetitions, silences, reformulations and so forth. Most of these factors cannot be readily assessed in the context of written language, unless as the maximal text length produced under time restrictions.20 Moreover, the empirical dynamic perspective as it is applied in the current study necessitates the establishment of a clear initial order (precursors and dependents) between linguistic categories or components.
Complexity and accuracy exhibit a more distinctive hierarchical order

19 The tradeoff effects relate mostly to spoken production, and are particularly evident in low-proficiency learners. However, at least with regard to accuracy, there is contradictory evidence of a decrease that accompanies higher proficiency levels. This finding was explained by the need for low-proficiency learners to focus on form vs. meaning, or in other words, on declarative vs. procedural forms of knowledge (Robinson, 2003).

20 Other researchers operationalize written fluency in various ways, most of them involving sentence or text length, which are more pertinent to beginner-level or low proficiency learners.


between them than towards fluency, in the sense that accuracy cannot be achieved without some degree of complexity. Discussing several studies that juxtapose complexity and accuracy, as well as fluency, across various spoken L2 tasks and conditions, Skehan and Foster (1997) conclude that

Evidence for trade-off effects is very strong. Learners required to complete tasks seem unable to prioritize equally the three performance aspects of fluency, accuracy and complexity. Achieving more highly in one seems mostly to be at the expense of doing well on the others, with competition between complexity and accuracy particularly evident (Skehan & Foster, 1997, pp. 207-208).

For example, Foster and Skehan (1996) found that planning time positively affected L2 complexity (measured as the amount of subordination per written or spoken text), but not accuracy (measured as error-free clauses). Addressing these findings, Wolfe-Quintero et al. maintain that a “complex interconnection” between several areas of L2 performance is part of the process of a language learner attempting to encode many aspects of language at the same time, at many linguistic levels. The competition for attention and resources allows only so much information to be assimilated, automatized, or restructured at a time (1998, p. 5).

As noted, the notion of restricted resources as determining competitive interactions is highly compatible with the DST principle of resource limitations as determinants of interactions between systemic components, i.e., self-organization. In combination with the idea of a structured hierarchy, this generic scheme is known as the precursor model. So far, two studies have explicitly applied the dynamic perspective to complexity and accuracy in L2 writing. Although these studies differed in their operationalization and measurement of these constructs, their findings congruently point at complex dynamic interactions across the complexity-accuracy distinction. Larsen-Freeman (2006b, 2009) traced the development of fluency, grammatical complexity, lexical complexity and overall accuracy in sets of four written and spoken ESL narratives, which were produced by five learners. Measures denoting these categories showed nonlinear and nonparallel growth and variability patterns, interpreted as manifesting complex and competitive dynamic interactions between complexity, accuracy, and fluency. Similarly, Spoelman and Verspoor (2009, in press) found that written morphosyntactic complexity and accuracy indexes in L2


Finnish correlated very weakly, and that a moving correlation between these indexes exhibited robust fluctuations between negative and positive coefficient values. A parallel finding, namely a very weak negative correlation and a strongly oscillating moving correlation between indexes representing aspects of writing (lexicon and syntax), was previously interpreted as expressing complex precursor interactions (Verspoor et al., 2008b). Nevertheless, Spoelman and Verspoor did not explicitly associate the complexity-accuracy interaction in their data with the precursor model, but suggested tradeoff interactions given limited resources, in line with Foster and Skehan (1996). While many studies have found that complexity, in general, improves over time (Bardovi-Harlig, 1997; Ishikawa, 1995; Mellow & Cumming, 1994), there is evidence that this improvement is nonparallel across lexicon and syntax, in both L1 and L2 (Larsen-Freeman, 2006b; Tomasello, 2000, 2003). Concerning accuracy, Larsen-Freeman (2006a) has pointed out the inherent nonlinearity of this general construct, suggesting a dynamic view of fossilization, the general term for consistent or recurrent lack of L2 accuracy (Selinker, 1972; Selinker & Lakshman, 1992). Refuting the traditional description of fossilization as a static end-state, Larsen-Freeman defined it as a dynamic, localized and temporal process (2006a). MacWhinney (2006) has also noted that fossilization is not an across-the-board phenomenon, but differentially localized in various areas of L2. He conceptualized fossilization as an attractor state to which subsystems of language repeatedly return, as a result of competition for resources with other language components or extra-linguistic factors.21
While L2 accuracy, as in the case of fossilization, is often considered as developing nonlinearly, Ortega and Byrnes pointed out that It has seldom been noted that L2 development as a whole is also unevenly paced, with different rates of change (in accuracy, fluency, and repertoire of choices or so-called complexity) for different aspects of the system of language and different time distributions of stability and (gradual or sudden) change for different phenomena, different learners, and different contexts of use (2008, p. 287).

Therefore, viewing complexity and accuracy as aggregated constructs across the various dimensions of language, even in a longitudinal and developmental study,

21 While these accounts may appear conflicted, distinct attractors or preferred systemic configurations can in fact result from ongoing dynamic change, as the bifurcation plot in section 2.1.1.2 has shown. In other words, “[just] because systems are dynamic, it does not mean that everything will be in constant flux” (Larsen-Freeman, 2006a, p. 192).


may lead to a loss of valuable information. The complexity-accuracy distinction appears to be inextricable from developmental discrepancies between various dimensions of language, particularly between lexicon and syntax. The following section addresses the lexical-syntactic distinction, before proceeding to discuss both distinctions in a combined paradigm of L2 writing performance.
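The moving correlation referred to above in connection with Spoelman and Verspoor (2009, in press) and Verspoor et al. (2008b) can be illustrated with a minimal sketch. It assumes a plain Pearson coefficient computed over a sliding window of consecutive measurement points; the function names and the window size are hypothetical and are not taken from the cited studies.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences (NaN if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


def moving_correlation(xs, ys, window=5):
    """Correlate two index series inside each sliding window of `window` points.

    `window` is an illustrative assumption, not a value from the cited studies.
    """
    if len(xs) != len(ys):
        raise ValueError("series must have equal length")
    if window < 2 or window > len(xs):
        raise ValueError("window must be between 2 and the series length")
    return [
        pearson(xs[i:i + window], ys[i:i + window])
        for i in range(len(xs) - window + 1)
    ]
```

A series of window coefficients that swings between positive and negative values, while the overall coefficient stays close to zero, is the kind of pattern that the studies above interpret as a shifting interaction between two indexes.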

5.1.2 Lexical vs. syntactic development

Lexicon and morphosyntax are, predictably, strongly interdependent, with prior lexical knowledge prerequisite to the emergence of grammar (Bates & Goodman, 1999; Tomasello, 2003). However, the literature, regardless of theoretical orientation, near-unanimously considers lexicon and grammar as separate entities, whose disparate development ensues from the earliest stages of L1 acquisition (cf., Bates et al., 1995; Chomsky, 1957, 1965; van Geert, 1993). This acknowledgement stems from ample evidence for nonlinear associations between the two areas (Marchman & Bates, 1994), which manifest in “temporal asynchronies” (Bates et al., 1995, p. 115). Thus, the single best estimate of grammatical status at 28 months (right in the heart of the “grammar burst”) is total vocabulary size at 20 months (measured right in the middle of the “vocabulary burst”) (Bates et al., 1988, in Bates & Goodman, 2001, p. 138).

As with regard to the complexity-accuracy distinction, the dual-systems approach associates lexicon and syntax with declarative and procedural knowledge, respectively (Pinker, 1994; Pinker & Ullman, 2002). This model assumes that the lexicon is a memorized “mental dictionary” (Pinker, 1994, p. 478), in juxtaposition with grammar, which is considered as rule-based and computational (Ullman, 2001a, 2001b). As mentioned in the previous section, the DP-model regards irregular verb forms as declaratively stored. Psycholinguistic experiments which have shown that irregular verbs are primed by frequency effects, unlike regular verbs, have been interpreted as evidence for the differential processing of rule-based grammar. For example, Ullman (1999) found significant correlations between judgments of irregular verb inflections as correct or incorrect and the frequency of these forms, yet no such correlations for regular verbs. Additionally, neuroimaging studies of brain activation in normal and impaired participants have provided some findings considered as compelling evidence for the applicability of the DP-model to lexicon and syntax. These studies have associated experimental output in lexicon and syntax – again


operationalized as irregular and regular verb forms, respectively – with different brain localities (Ullman, 2001a, 2001b). However, as Ullman himself acknowledges, “even the relatively simple approach of studying regular and irregular morphology has failed to yield entirely consistent results” (2001a, p. 724). First, no single brain area is activated while other areas are deactivated, but rather several areas are associated with both constructs (Ullman, 2004). Moreover, the reliance on the irregular-regular verb distinction in operationalizing declarative and procedural knowledge, and lexicon and syntax via these knowledge types, is rather problematic. This is demonstrated by connectionist simulations of verb inflection performance as based solely on declarative knowledge, which have shown that it is unnecessary to assume dual mechanisms in order to explain the nonparallel development of accuracy across regular and irregular verbs (Ellis & Schmidt, 1997). Further counter-evidence for the DP-model comes from a study comparing irregular and regular verb production in normally developing and specific language impaired children, whose procedural memory is presumed to be impaired. This study strongly refuted the allocation of irregular and regular verbs to the two knowledge types (Marchman, Wulfeck, & Ellis-Weismer, 1999). The regular-irregular verb distinction is not the only weakness of the DP-model. Its wholesale allocation of lexicon to declarative knowledge is also arguable. As mentioned in the previous chapter, certain modalities of lexical knowledge can also be considered as procedural or rule-based. These modalities refer to “what the learner knows compared to what the learner can do” (Nation, 2001, p. 378). In line with this view, the ability to focus on the communicative role of vocabulary production and the achievement of accuracy in collocation have been associated with procedural knowledge (Marco, 1998, in Nation, 2001; Robinson, 1989, 1993, 2003).
Several other studies have shown that lexical acquisition does not rely exclusively on declarative memory (see Green, 2003). Due to such “loopholes” in the dual-systems model, and considering both the interdependence and asynchronous development of early lexicon and syntax, the “simple two strand account must give way to a more complex, dynamic account of associations and dissociations over time” (Bates et al., 1995, p. 13) between lexicon and syntax. This alternative account has several connectionist, emergentist, usage- based and dynamic versions, all of which specify a single learning mechanism for all aspects of language (Bates & Goodman, 1999; Bates & MacWhinney, 1987; Ellis,


1996, 1998; Elman, Bates, Plunkett, Johnson, & Karmiloff-Smith, 1996; MacWhinney, 1987).22 Regarding the disparate development of lexicon and syntax, there are two types of explanations affiliated with the single-system model. Some researchers propose a critical mass hypothesis. This approach, while attributing lexical and syntactic development to a single mechanism, specifies a certain amount of vocabulary knowledge as prerequisite for the emergence of syntax (Marchman & Bates, 1994). Other theories point to the relative conceptual or functional difficulty of grammatical features in comparison with lexical features as key to the later onset of syntax (see Bates et al., 1995 for a review). A combination of these two ideas, the first stressing the dependence of syntax on lexicon (critical mass), the second emphasizing the structural-hierarchical properties of linguistic knowledge (difficulty), clearly resonates with the precursor model (van Geert, 1991, 1993). This is because the precursor model combines the principle of interdependence with that of a hierarchical order, both of which operate under the general constraint of resource limitations. In this model, competition and support between co-developing components of a dynamic system can be accounted for as simultaneous and offsetting influences on growth. Accordingly, van Geert (1993) has successfully employed generic equations of precursor interactions in simulations of early lexical and syntactic L1 development. Following in his footsteps, Robinson and Mervis (1998) plotted the developmental trajectories of indexes denoting early L1 lexical complexity (measured as cumulative word count, equal to vocabulary size) and syntactic complexity (measured in the mean length of utterance, abbreviated as MLU, and in plural use). They noted that the correlation between vocabulary size and MLU shifted dramatically from a weak to a near maximal value once MLU “has begun to increase in earnest” (p. 369) at age 21 months.
Vocabulary size and plural use showed an even clearer move from an initially negative to a positive correlation. To test the possibility that their findings “hint at a dynamic reorganization of the relations between the two linguistic subsystems” (p. 370), Robinson and Mervis configured them in a model of lexicon and syntax as connected growers in precursor interactions, based on van Geert (1991). The model specified lexical acquisition as a precursor to the dependents MLU or plural use, with lexicon initially competing with morphosyntax, but supporting it once reaching a threshold

22 Or, in some cases, a general learning mechanism for all types of knowledge, in the same way that declarative and procedural knowledge are considered as non-domain specific by some theorists (cf., Ullman 2001a, 2001b)


value. The model also incorporated competition from the dependent towards the precursor, which diminished as the dependent grew closer to its carrying capacity. A good fit of the model to the data led Robinson and Mervis to conclude that the general precursor function is suitable for modeling “a rapidly learned skill that once mastered, no longer requires high levels of cognitive resources to maintain productive use” (1998, p. 317). They also noted the compatibility of their modified precursor model with the critical mass hypothesis (Marchman & Bates, 1994). More generally, the simulation outcome negated the need for presupposing two separate mechanisms as responsible for lexical and syntactic development, as posited by dual-systems theory. Recently, Verspoor et al. (2008b) have applied the dynamic approach to written lexical and syntactic indexes in a case study of advanced L2 (English) writing. They found alternating growth and variability patterns across these indexes, which they interpreted as indicating complex interactions between the lexical and syntactic dimensions and as related to the precursor model. However, the application of the precursor model to lexicon and syntax in L2 writing has not yet been tested in a dynamic simulation.
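The interactions described above (a precursor that competes with its dependent until it reaches a threshold and supports it thereafter, and a dependent whose competition with the precursor fades as it approaches its carrying capacity) can be made concrete with a minimal sketch. The sketch assumes two coupled discrete logistic growers; the equations, function name and all parameter values are illustrative assumptions and are not the equations or values used by van Geert (1991, 1993) or Robinson and Mervis (1998).

```python
def simulate_precursor(steps=200, rate_a=0.1, rate_b=0.1, k_a=1.0, k_b=1.0,
                       threshold=0.3, support=0.05, competition=0.05,
                       a0=0.01, b0=0.01):
    """Iterate a precursor grower `a` (e.g., lexicon) and a dependent `b` (e.g., syntax).

    All names and default values are hypothetical and chosen for illustration only.
    Returns a list of (a, b) levels, one pair per time step including the start.
    """
    a, b = a0, b0
    trajectory = [(a, b)]
    for _ in range(steps):
        # Precursor -> dependent: competition below the threshold, support from it onward.
        influence = support if a >= threshold else -competition
        # Dependent -> precursor: competition that vanishes as b nears its carrying capacity.
        back_pressure = competition * max(0.0, 1.0 - b / k_b)
        next_a = a + a * (rate_a * (1.0 - a / k_a) - back_pressure * b)
        next_b = b + b * (rate_b * (1.0 - b / k_b) + influence * a)
        # Levels are kept non-negative; support may lift b above k_b.
        a, b = max(next_a, 0.0), max(next_b, 0.0)
        trajectory.append((a, b))
    return trajectory
```

Under these assumptions, setting the threshold above the level the precursor can reach leaves the dependent under continuous competition, so comparing the two runs isolates the effect of conditional support on the dependent's final level.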

5.1.3 A combined paradigm of the complexity-accuracy and lexical-syntactic distinctions

So far, the complexity-accuracy and lexical-syntactic distinctions have been treated as two separate dichotomies by both the dual-systems and single-system models, as well as by the dynamic approach. Yet, as the previous sections have shown, there is a great degree of overlap between the different theoretical perspectives that address both distinctions. Therefore, the current study proposes merging them into a single paradigm of L2 writing performance. Accordingly, this section discusses complexity and accuracy in the context of the lexical and syntactic dimensions of writing, beginning with the former. In the lexical dimension of L2 writing, studies often focus on complexity rather than accuracy. Even studies that distinguish between lexical and syntactic complexity tend to treat accuracy as an aggregated construct across both domains, by using global measures such as the ratio of error-free T-units, clauses, or words (e.g., Ishikawa, 1995). Few studies focus on lexical accuracy as a stand-alone construct (Arnaud, 1992; Bardovi-Harlig & Bofman, 1989; Engber, 1995; Harley & King, 1989). This is despite indications that lexical error is more prevalent in nonnative production than syntactic error (Bardovi-Harlig & Bofman, 1989; Lennon, 1996;


Nesselhauf, 2003), and that lexical accuracy influences proficiency ratings by native speakers more than syntactic complexity (Bardovi-Harlig & Bofman, 1989; Engber, 1995). Lexical complexity is a combination of word variation and sophistication, whereas lexical accuracy is determined by the contextual appropriateness of use (Engber, 1995; Laufer & Hulstijn, 2001). Therefore, lexical complexity and accuracy, when measured in the same production units such as the total content word count, may overlap to a degree. However, this does not imply that improved lexical complexity is synonymous with improved accuracy, since diverse or non-frequent vocabulary may not always be used correctly. For example, a study of lexical accuracy in verb usage found that despite increased control over meaning, the accuracy of advanced learners in collocation or contextual use remained poor (Lennon, 1996). This finding illustrates the fact that lexical error may not necessarily reflect only a lack of appropriate or sophisticated vocabulary, but also a lack of know-how and experience in its contextual or collocational application (Bulté et al., 2008). Thus the constructs of lexical complexity and accuracy should be considered separately. However, it is evident that without lexical complexity, lexical accuracy cannot be attained. Similarly, while there is evidence for nonlinear and disparate development across various measures of syntactic complexity and accuracy (Bardovi-Harlig, 1997; Ortega, 2003), there is also a concurrent dependency between complexity and accuracy in the syntactic dimension (Foster & Skehan, 1996). In other words, without complex (diverse and sophisticated) syntactic features, accuracy in syntax cannot be reached. However, avoidance of complex forms, expressed in the overgeneralization of less advanced ones, may not necessarily result in obvious syntactic error in a similar way that a lack of complexity manifests in a lack of lexical accuracy (Ringbom, 1998).
For example, when coordinating conjunctions are extensively used in place of subordination, the result is not necessarily erroneous as such, but often a general impression of a lack of proficiency or sophistication. Thus the interdependency between complexity and accuracy and the role of complexity as a conditional precursor to accuracy may be less explicit and robust in the syntactic dimension than in the lexical one.

121 CHAPTER 5

To sum up, the literature on L2 writing development suggests that the complexity-accuracy and lexical-syntactic distinctions are not two separate and discrete dichotomies. There are indications of hierarchical and complex interactions between complexity and accuracy, in which complexity precedes accuracy and is a condition for it, but also competes with it for resources. This hierarchy is further embedded within a lexical-syntactic order, in which lexicon precedes syntax and is a conditional supporter of its growth, even while the two dimensions compete for resources in a similar manner. Such a nested and hierarchical structure is, in essence, a description of a dynamic system (van Geert, 1995), and is in line with theoretical dynamic accounts of L2 acquisition (de Bot et al., 2005; Larsen-Freeman, 2007; Larsen-Freeman & Cameron, 2008). Therefore, DST may offer an integrated theoretical and empirical framework for its understanding. Accordingly, the current study suggests that dynamic precursor interactions can account for development across both distinctions in L2 writing. Its main goals are to investigate whether there are nonlinear interactions between lexical and syntactic indexes of complexity and accuracy in L2 writing development, and whether these can be explained by a model specifying lexicon as a precursor to syntax, and complexity as a precursor to accuracy within both the lexical and syntactic dimensions. The following section details the research questions that arise from these objectives.

5.1.4 Research questions and predictions

As specified in the introduction, the two main questions of the study correspond with two main methodological procedures, namely variability analyses and modeling. The first question asks if there is evidence for complex interactions that combine support and competition between complexity and accuracy in the lexical and syntactic dimensions of written L2 development, and across these dimensions. The second question asks whether a model based on precursor interactions can simulate the developmental patterns of the data categories on the basis of the answer to the first question, thereby corroborating the interpretations of these variability analyses. The first question can be divided into four subsidiary questions, listed below in the order in which they are addressed. The first two sub-questions concern the complexity-accuracy divide across the lexical and syntactic dimensions, respectively:

1a) How do complexity and accuracy measures interact within the lexical dimension of L2 writing?


1b) How do complexity and accuracy measures interact within the syntactic dimension?

The third question addresses the general lexical-syntactic interaction:

1c) How do the lexical and syntactic dimensions interact across general indexes of complexity?

These questions lead to a broader question, which is used as a basis for configuring the model:

1d) How do complexity and accuracy interact across lexicon and syntax, in indexes denoting lexical complexity, lexical accuracy, syntactic complexity, and syntactic accuracy?

These questions are investigated by detailed data and variability analyses, and lead to the second main question, which is again divided in two. Its first part concerns the order parameters of the model, or in other words, its presumed hierarchy:

2a) Can a precursor model replicate the surface growth and interaction patterns of the data, on the basis of a hierarchical order that specifies lexicon as precursor to syntax, and complexity as precursor to accuracy within both the lexical and syntactic dimensions?

This question is intertwined with the final question, which focuses on the role of variability as indicative of underlying dynamic interactions. It pertains to the relational control parameters of the model, which specify the value of the interactions within its hierarchical order:

2b) Are the variability analyses outcomes indicative of the interactions between the four writing categories in the precursor model?

In practical terms, the last two questions are approached by configuring a precursor model that consists of a hierarchy of four connected growers: lexical complexity, lexical accuracy, syntactic complexity and syntactic accuracy. The model is optimized and iterated, and its outcome is compared with the growth and interaction patterns in the data, thereby responding to question 2a. The optimized values of the interactions between the model levels are compared with the interpretation of the variability analyses, thereby responding to the final question.


5.2 Participants, materials and procedures

The study used the corpus of L2 essays which was presented in section 4.2. The corpus included four sets of 36 essays each, corresponding with four participants and the number of weeks in the study period, as described in the previous chapter. The essays were written on a weekly basis, starting at the onset of the participants’ immersion in English-speaking academic degree programs. For reasons of space and labor intensity, it was not feasible to conduct a full analysis of all case studies. For the same reasons, focusing on a single case study is standard procedure in longitudinal studies. The current study therefore concentrated on the 24-year-old native Portuguese speaker who was also the main participant in the study presented in Chapter 4. This participant was chosen in order to maintain continuity with the previous chapter, so that the two studies may eventually be consolidated. In order to substantiate the findings from the main case study, many of the analyses of her data are systematically accompanied by equivalent analyses of the other case studies. The abundance of indexes employed in L2 writing research implies that each aspect of L2 writing can be conceptualized and assessed in numerous ways. The present study applies a multidimensional approach, by examining not only four indexes as representative of the four categories of lexical and syntactic accuracy and complexity, but also several indexes within each category. The reason for this choice is that certain measures are reliable only at the group level. Moreover, the applicability of writing performance indexes tends to vary greatly across individuals (Cumming & Mellow, 1996; Wolfe-Quintero et al., 1998). Accuracy measures in general and lexical accuracy in particular are highly susceptible to individual interpretation. Thus, studies report strong inter-rater disagreement on such measures (cf. Polio, 1997).
In the current study, errors were rated only by the researcher. At the end of the rating procedure, the first half of the data was reviewed and re-rated to ensure that change over time in rater perceptions would not interfere with the results. Although it is certainly probable that the researcher’s judgment would differ from that of another rater, the idea was that it would be consistent, with any biases being systematic. This compromise, stemming from practical considerations, was considered plausible since the focus of the study is ultimately on change over time rather than on error types in themselves. Although ideally, raters would not be informed about the timing of the text writing, to avoid a


bias towards perceiving improvement in later texts, this was not considered necessary in the current study, since it focuses on interaction between indexes rather than on increase over time. Moreover, since many indexes were scored, it was considered unlikely that the rater would unconsciously apply criteria that would prejudice judgment towards increased growth. The general error analysis strategy of the study was identification in relation to correct structures, which applies the minimal correction needed (Burt & Kiparsky, 1972, as cited in James, 1998). In line with Faerch (1978, as cited in James, 1998), who suggested that error analysis is more complete as performance analysis, error rates were computed as ratios of erroneous use vs. the total number of required contexts, rather than only instances of usage. For example, the correct article use ratio is based on the number of correctly used articles out of the total number of instances in which articles are required. Such error ratios are converted to (positive) accuracy indexes by their subtraction from the maximal correct usage ratio of 1. The key considerations in choosing indexes to denote the four categories were as follows. First, that these indexes have been used and affirmed as adequate developmental measures by previous studies. Second, that the indexes are pure, meaning that they measure only the category that they pertain to. Thus, lexical indexes should not incorporate a syntactic dimension, and vice versa; likewise for the complexity-accuracy distinction. This consideration is the reason that general indexes, such as the error-free T-unit, were not included in the study. Finally, indexes based on very small initial values in relation to text length (such as the ratio of passive aspect use) were also avoided, in line with previous recommendations (Cumming & Mellow, 1996; Larsen-Freeman & Long, 1999).
The following sections detail the specific considerations in choosing indexes for each of the writing performance categories, and the particular indexes used.
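The conversion from an error ratio to a positive accuracy index, as described above, can be illustrated with a minimal sketch. The function name and the counts in the example are hypothetical and are not taken from the study data; the only element drawn from the text is the computation itself (errors divided by required contexts, subtracted from 1).

```python
def accuracy_index(errors: int, required_contexts: int) -> float:
    """Return 1 minus the ratio of errors to required contexts.

    The denominator is the number of contexts in which the form is
    required, not merely the number of times the form was used.
    """
    if required_contexts <= 0:
        raise ValueError("required_contexts must be positive")
    if not 0 <= errors <= required_contexts:
        raise ValueError("errors must lie between 0 and required_contexts")
    return 1 - errors / required_contexts


# Hypothetical essay: articles are required in 40 places, 6 are wrong or missing.
article_accuracy = accuracy_index(errors=6, required_contexts=40)
print(round(article_accuracy, 3))
```

Counting required contexts rather than attempted uses means that an omitted article lowers the index just as a wrongly chosen one does.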

5.2.1 Lexical complexity indexes

Bulté et al. (2008) specify three types of lexical complexity measures: variation-based (e.g., type-token ratio, abbreviated as TTR; or lexical density, the content to total word ratio); frequency-based (e.g., words derived from frequency or specialized lists); and formula-based (combining both types). Although TTR or lexical density are frequently used measures of lexical complexity, pilots for the current study have indicated that they do not adequately capture development in the writing of advanced


ESL learners. Therefore, the study adopted two versions of these popular measures, which are based on the number of lexical tokens (content words). The first version is variation-based, and the second is frequency-based.

1) Complex word ratio, defined as the number of words consisting of three or more syllables out of the total content word count. This measure is a variation on the sophisticated word ratio (with sophisticated words defined as less-frequent or academic vocabulary), which has been used in several studies (e.g., Hyltenstam, 2002; Laufer, 1994).

2) General word variation, calculated not as a simple type-token ratio but as the ratio of content word types to the total content words (Engber, 1995). In this way, this index excludes most function words (for example, auxiliaries, pronouns, or prepositions), which have a grammatical rather than a lexical function.23
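The two lexical complexity indexes defined above can be sketched as follows. This is an illustration only: it assumes that the content words of an essay have already been identified and annotated with syllable counts (by hand or by some other tool), and the function names and the toy word list are hypothetical.

```python
def complex_word_ratio(content_tokens):
    """Share of content-word tokens that have three or more syllables.

    content_tokens is a list of (word, syllable_count) pairs.
    """
    complex_count = sum(1 for _, syllables in content_tokens if syllables >= 3)
    return complex_count / len(content_tokens)


def content_word_variation(content_tokens):
    """Content-word types divided by content-word tokens."""
    types = {word.lower() for word, _ in content_tokens}
    return len(types) / len(content_tokens)


# Hypothetical content words from one short essay, function words removed.
tokens = [
    ("language", 2), ("development", 4), ("requires", 2), ("language", 2),
    ("experience", 4), ("vocabulary", 5), ("grows", 1), ("slowly", 2),
]
print(complex_word_ratio(tokens))      # 3 of 8 tokens are complex
print(content_word_variation(tokens))  # 7 types over 8 tokens
```

Both ratios share the same denominator, the content-word token count, which is what allows them to be compared on a common basis.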

5.2.2 Lexical accuracy indexes

Lexical accuracy was measured as a general index of correct lexical use ratio, reflecting the ratio of general lexical error per lexical (content) word. In line with Engber’s (1995) operationalization, this error count incorporated errors in word choice, collocation (including compounds), and derivation, but excluded morphological verb errors, which are considered as indicators of syntactic accuracy. The study also excluded spelling errors from any error count.

5.2.3 Syntactic complexity indexes

The literature on syntactic complexity usually equates this construct with an elaboration of patterns. Accounts of this elaboration vary, but generally specify that lower proficiency learners will use simple or coordinated clauses, while more advanced learners will use subordination. Thus, according to Wolfe-Quintero and her colleagues (1998), the best way of measuring syntactic complexity is to avoid length measurements, such as T-unit length or average sentence length, and instead use counts of dependent clauses or subordinating conjunctions (see also Ishikawa, 1995). They suggest the clause per T-unit ratio as the best cross-sectional developmental index. However, there are claims that this index is more suitable for child language assessments, and that ratios of clause coordination should be preferred (Bardovi-Harlig, 1992). As a compromise, the present study adopted versions of both measures as indicators of syntactic complexity:

1) Clause/sentence ratio, referred to as sentence complexity by Ishikawa (1995). This measure is similar to the coordination index, which divides the

23 The inclusion of function words in the TTR or any version thereof invariably renders these indexes lower than they are for content words alone, and reduces their sensitivity as developmental measures.


number of independent clauses by the number of combined (independent+dependent) clauses (Bardovi-Harlig, 1992). Both indexes were shown to be adequate developmental measures (Wolfe-Quintero et al., 1998).

2) The ratio of subordinating conjunctions to the total clause number, with subordinating conjunctions including adverbial and relative conjunctions as well. All of these devices reflect more sophisticated clause combination than simple coordination, which is overused in some nonnative writing (Yuuping, 2003). Because of the relatively short text lengths in this study, collapsing the different subordination types also increases the number of occurrences that the ratio is based on.
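As a rough illustration of the two general syntactic complexity indexes listed above, the sketch below computes them from per-essay counts. The container class, its field names, and the counts are hypothetical; it assumes that sentences, clauses, and subordinating conjunctions (with adverbial and relative conjunctions collapsed into one count) have already been tallied by the rater.

```python
from dataclasses import dataclass


@dataclass
class EssayCounts:
    """Hand-tallied counts for one essay (hypothetical container)."""
    sentences: int
    clauses: int
    subordinators: int  # subordinating, adverbial, and relative conjunctions


def clause_sentence_ratio(counts: EssayCounts) -> float:
    """Average number of clauses per sentence."""
    return counts.clauses / counts.sentences


def subordination_clause_ratio(counts: EssayCounts) -> float:
    """Subordinating conjunctions per clause."""
    return counts.subordinators / counts.clauses


# Hypothetical essay: 12 sentences, 30 clauses, 9 subordinating conjunctions.
essay = EssayCounts(sentences=12, clauses=30, subordinators=9)
print(clause_sentence_ratio(essay))
print(subordination_clause_ratio(essay))
```

The two indexes can diverge: adding coordinated clauses raises the clause/sentence ratio while lowering the subordination/clause ratio.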

In addition to these general measures, syntactic complexity (as well as syntactic accuracy, as detailed in the next section) was denoted by more-local indexes. This was done because the categories of syntactic complexity and accuracy are more detailed and varied than their lexical equivalents, suggesting that many indexes may converge in overall syntactic development. Because such indexes are based on small numerical values of structures which may not appear in every essay, they were not analyzed in isolation, but tallied up. The procedure used for this aggregation is a Pythagorean distance sum (section 5.3.2.1 includes this procedure and its explanation). It was used to summarize development in all of the indexes denoting syntactic complexity, and in all those denoting syntactic accuracy, separately for each category. Comparing these aggregations was intended to supplement the comparison of the more-general syntactic complexity and accuracy measures. The following indexes were included in the Pythagorean sum for syntactic complexity:

3) Passive use per clause, used in line with Kameen (1979).

4) Pronoun use per tokens, based on Evola, Mamer and Lentz (1980).

5) Adjective use per lexical words, based on McClure (1991).

6) Modal verbs per total verbs, based on Ringbom (1998), and Biber, Conrad, and Leech (2002). Modal verb use can also be viewed as lexical or pragmatic. In this case, it was considered from the sentence complexity perspective (cf. Cooper, 1976).

5.2.4 Syntactic accuracy indexes

Many popular syntactic accuracy measures are very broad, for instance the error-free T-unit (cf. Larsen-Freeman, 1978; Larsen-Freeman & Strom, 1977). The current study avoided such measures, because they do not necessarily reflect short-term change (Wolfe-Quintero et al., 1998). For example, the error-free T-unit does not take into account that some T-units may contain multiple errors, even when others remain error free. Additionally, the number of T-units itself is a complexity measure;


therefore the error-free T-unit is not a pure accuracy measure. Another reason for its unsuitability to the current study is its lack of distinction between lexical and syntactic error. Syntactic accuracy indexes can be based on general or specific error measures (in certain speech parts or syntactic structures, such as prepositions or conjunctions). The current study favored a compromise between local and more general measures:

1) The ratio of correct word order per clause (based on Kroll, 1990). This index is also in line with the observation that a recurrent error in nonnative writing is a lack of effective ordering or sequential coherence (James, 1998).

Additional indexes were not analyzed separately, but included in a Pythagorean summary and contrasted with the equivalent summary of the syntactic complexity measures. These indexes were considered as complementary to the main syntactic accuracy index, since they pertain to more local syntactic features:

2) Correct use of subordinating conjunction per required context, based on Evola et al. (1980).

3) Correct pronouns per required context, based on Evola et al. (1980), Kroll (1990), and Polio (1997).

4) Correct determiner use per required context, including quantifiers, based on Kroll (1990) and Polio (1997).

5) Correct tense use per verb, based on Arnaud (1992) and Kroll (1990). In line with Kroll (1990) and Polio (1997), this index excluded verb formation errors such as omission of auxiliaries and of the infinitive “to”, and confusion of gerunds and infinitives.

6) Correct agreement per verb. This index was based on Cumming and Mellow (1996), Kroll (1990), and Polio (1997).

7) Correct articles per required context, based on Cumming and Mellow (1996), Kroll (1990), and Polio (1997).

8) Correct prepositions per required context.

9) Correct use of coordinating and subordinating conjunctions per required context, based on Evola et al. (1980).

10) Correct part-of-speech use, reflecting speech part confusion errors, for example adjective use in place of adverb or noun use instead of adjective, as operationally defined by Kroll (1990). This measure also incorporates the ratios of missing and redundant parts of speech.

5.3 Results: data and variability analyses

The data analyses proceeded as detailed in section 3.2. First, the development of indexes representing combinations of the different categories was traced as a function of time (represented on the x-axis as the number of weeks since the study onset).


These trajectories were plotted in pairs, which were compared visually.24 The data plots were supplemented by variability analyses: in line with the claims made by previous dynamic-oriented studies, the present study assumed that inspecting and comparing residual patterns between indexes is potentially indicative of underlying shifts in systemic self-organization. For the same reason, the correlations between the indexes representing the four categories,25 which were used to summarize the surface interactions, were complemented with moving correlation plots. While moving correlations invariably exclude some variability, they show the temporal changes that underlie the aggregated interactions expressed in the correlation coefficients, and thus supplement the impression derived from the data and residual plots.
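A moving correlation of the kind described above can be sketched in a few lines. The sketch assumes a window that advances one measurement at a time and uses Spearman's rho computed as the Pearson correlation of average ranks; the function names and the two short series are hypothetical, and the thesis does not state how its own plots were computed, so this is one plausible reading rather than the author's procedure.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one series is constant
    return cov / (var_x * var_y) ** 0.5


def moving_spearman(x, y, window=5):
    """Spearman's rho in each window of consecutive measurements."""
    if len(x) != len(y):
        raise ValueError("series must have equal length")
    return [
        spearman_rho(x[s:s + window], y[s:s + window])
        for s in range(len(x) - window + 1)
    ]


# Hypothetical weekly values for two indexes (seven weeks, three windows).
index_a = [1, 2, 3, 4, 5, 6, 7]
index_b = [2, 4, 6, 8, 10, 9, 7]
rhos = moving_spearman(index_a, index_b, window=5)
print([round(r, 2) for r in rhos])
```

Under this reading, a series of 36 weekly measurements yields 32 windows of 5 measurements each, so each plotted point summarizes only a short local stretch of the data.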

5.3.1 Complexity vs. accuracy in the lexical dimension

The first step in the analyses involved comparing complexity and accuracy within the lexical dimension of writing, in the data of the main case study. Figures 22 and 23 show the data and residuals, respectively, of the general word variation and the correct lexical use ratios in the Portuguese speaker data.

[Plot omitted. X-axis: weeks (1-35). Y-axes: correct lexical use; lexical variation.]

Figure 22. Data values for general word variation and correct lexical use in the Portuguese speaker data

24 Because the focus is on relative changes in growth and variability rather than on absolute values, scales are frequently adjusted for the different indexes, with plots including two y-axes. Thus these are not identical in all plots.

25 These correlations were calculated as Spearman’s rho coefficient of nonparametric correlation, since developmental data cannot be assumed to be normally distributed.


[Plot omitted. X-axis: weeks (1-36). Y-axis: residuals (-0.15 to 0.15). Series: general word variation; correct lexical use.]

Figure 23. Residuals (on the y-axis) for general word variation and correct lexical use in the Portuguese speaker data

Figure 22 shows that the growth trajectories of the general word variation and correct lexical use are predominantly parallel, particularly in the first and last thirds of the study period. Between these periods, these trajectories exhibit some alternating growth patterns. These patterns are expressed in a weak negative correlation between the two indexes, which is not statistically significant. The residual plot in Figure 23 similarly shows mostly parallel variability patterns and several alternating shifts between weeks 18-21 and in week 31. In these shifts, a movement above the trend in one index is accompanied by a movement below it in the other and vice versa, for example in week 19. However, these fluctuations are not robust. Figure 24 shows the interaction between the general word variation and correct lexical use as a moving correlation in a window of 5 measurements, for all case studies. The plot for the Portuguese speaker (top left) shows that the lexical complexity-accuracy correlation shifts between weak positive and weak negative coefficient values, with one peak towards a strong positive value in the 21st window. Together with the impression obtained from the data and residual plots, this indicates that while there may be some degree of competition between complexity and accuracy in the written lexicon of this participant, it is nonetheless weak. This interpretation is preliminary at this stage, and should be inspected further.


[Plots omitted. Four panels: the Portuguese speaker (top left), the Vietnamese speaker (top right), the Mandarin speaker (bottom left), the Indonesian speaker (bottom right). Y-axes: -1 to 1.]

Figure 24. Moving correlations (in 5-measurement windows, denoted on the x-axes) between the general word variation and correct lexical use ratios of the four case studies. The y-axes represent the coefficient value (Spearman’s rho)

In the Vietnamese speaker data (top right), the lexical complexity-accuracy correlation begins at a high positive value, and then fluctuates between high positive and weak positive or negative values, with two more peaks of a high positive correlation. For the Mandarin speaker (bottom left), the coefficient initially rises to a medium-high positive value and subsequently fluctuates between low-neutral and strong negative values, with two positive peaks towards the end. This pattern is similar to that in the Portuguese speaker data, yet more pronounced. Finally, in the Indonesian speaker data (bottom right), the correlation fluctuates between initial neutral (near zero) and weak-moderate positive and negative values. While differences between these participants are predictable, particularly on a variability measure, the overall patterns are not dissimilar, particularly across the datasets of the Portuguese, Mandarin and Indonesian speakers.

5.3.2 Complexity vs. accuracy in the syntactic dimension

This section inspects how syntactic complexity and accuracy interact across general syntactic measures. The top plot in Figure 25 (below) shows the developmental trajectories of two ratios representing syntactic complexity and


accuracy at the clause level: subordination use per clause and correct word order per clause. In the first half of the study, these trajectories mostly alternate, but this pattern changes in the second half, in which they are mostly parallel. However, the correlation between these data values does not express this change towards parallel growth, and is negative (-0.38, p<0.05). In contrast, the residuals for these indexes (in the bottom plot) show distinctive parallel patterns. These patterns manifest in a positive correlation between these residuals, which is not significant but shows a trend towards significance (0.31, p=0.061). This is despite the fact that, unlike the lexical complexity and accuracy indexes in the previous section, subordination use and word order are not expressly related. While the subordination/clause ratio residuals are mostly negative, and those of the correct word order ratio mainly positive, an increase or decrease in the residuals of one index is frequently mirrored by a simultaneous, and occasionally even identical, shift in the other.

[Plots omitted. Top: subordination/clause ratio and correct word order per clause, by week. Bottom: residuals of the same two indexes, by week.]

Figure 25. Growth trajectories (top) and residuals (bottom) of the subordination/clause and correct word order per clause ratios in the Portuguese speaker data


To supplement the impression derived from these plots and their correlations, a moving correlation between the subordination/clause and correct word order ratio was plotted for this dataset, as well as for the other case studies (Figure 26). The moving correlation for the Portuguese speaker data (top left) shows mostly positive values, strengthening the impression derived from the parallel variability patterns in its residual plot (unlike the shift in the raw data trajectories from inverse to parallel patterns, and the accompanying negative and significant correlations of these values). The Vietnamese speaker plot is relatively similar, with peaks of high positive correlations, and several fluctuations towards weak negative values. However, the syntactic complexity-accuracy interaction in the two other case studies appears quite different. In the Mandarin speaker data, it shifts from positive to negative values, whereas in the case of the Indonesian speaker, it fluctuates between initial high negative, moderate negative, and moderate-high positive values.

[Plots omitted. Four panels: the Portuguese speaker, the Vietnamese speaker, the Mandarin speaker, the Indonesian speaker. Y-axes: -1 to 1.]

Figure 26. Moving correlations (in 5-measurement windows) between the subordination/clause and correct word order per clause ratios of the four case studies

To sum up, in the main case study, the residuals and moving correlation of these general syntactic complexity and accuracy indexes exhibit mostly parallel patterns, unlike the raw data. However, in the moving correlation plots of the three other case studies, different patterns can be discerned, particularly in the case of the Mandarin


and Indonesian speakers, illustrating that despite general similarities, there is inherent individual variation in interactions between writing performance categories. Such inter-individual variation is mentioned by many cross-sectional studies, and is key to the dynamic approach (Larsen-Freeman & Cameron, 2008; van Dijk, 2003; van Geert & van Dijk, 2002; Verspoor et al., 2008b). It should be acknowledged, even while overall similarities are generalized.

5.3.2.1 Representativeness of the within-syntax findings

Syntactic complexity or accuracy can be denoted by numerous and nonequivalent measures. This is in contrast with lexical complexity and accuracy, which are often aggregated across different speech parts (as in the present study). Therefore, interpreting the interaction between syntactic complexity and accuracy measures in isolation and extending this interpretation to the overall interaction between syntactic complexity and accuracy may lead to overgeneralization. To corroborate the preceding analyses of the Portuguese speaker data and confirm that they are indeed representative of the written syntactic complexity and accuracy dimensions in general, development in several indexes that represent each dimension was summarized as a sum of differences representing the overall distance between consecutive observations.26 This scaling method is similar to aggregating coordinate-based directions, and is based on the Pythagorean theorem $a^2 + b^2 = c^2$, which holds for any right-angled triangle. It is used to summarize opinion questionnaires or compute color and hue distances. The Pythagorean sums were then plotted as functions of time and compared to the data trajectory and residual plots of the syntactic complexity and accuracy indexes presented so far.
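One plausible reading of the Pythagorean sum described above is the Euclidean distance between the index values of consecutive weeks, taken across all indexes of a category. The sketch below follows that reading; the text does not give the exact formula, so the function name, the absence of any rescaling of the indexes, and the toy values are all assumptions.

```python
import math


def pythagorean_steps(observations):
    """Distance between each pair of consecutive observations.

    observations is a list of per-week lists, each holding one value per
    index. For weeks t-1 and t the result is
    sqrt(sum_i (x[t][i] - x[t-1][i]) ** 2).
    """
    steps = []
    for previous, current in zip(observations, observations[1:]):
        if len(previous) != len(current):
            raise ValueError("every week needs the same number of indexes")
        squared = sum((c - p) ** 2 for p, c in zip(previous, current))
        steps.append(math.sqrt(squared))
    return steps


# Hypothetical values of two indexes over three weeks.
weeks = [
    [0.0, 0.0],
    [0.3, 0.4],
    [0.3, 0.4],
]
distances = pythagorean_steps(weeks)
print([round(d, 3) for d in distances])
```

Because the differences are squared, the resulting trajectory reflects the size of the week-to-week change across indexes, not its direction.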

26 Distance, in this case, does not refer to a physical quality, but to a difference between constructs such as performances on various test items, preferences, self-rating and so forth. For a simple explanation of the Pythagorean sum of differences, see http://betterexplained.com/articles/measure-any-distance-with-the-pythagorean-theorem/


[Plot omitted. X-axis: weeks (1-31). Y-axes: syntactic complexity; syntactic accuracy.]

Figure 27. Pythagorean sums of the syntactic complexity and syntactic accuracy dimensions in the Portuguese speaker data. The y-axes denote differences between consecutive measurement values, with separate scales for each category

Figure 27 shows the Pythagorean sums of all the syntactic complexity and accuracy indexes extracted from the Portuguese speaker data. Each trajectory shows the relative difference between the total value of all these indexes in relation to the previous week. As listed in sections 5.2.3 and 5.2.4, the syntactic complexity indexes summarized in the Pythagorean trajectory are the ratios of subordination/clause, clause/sentence, modal verbs/tokens, pronouns per clause, passive use per clause, and use of adjectives. The syntactic accuracy indexes are the ratios of correctly used (unconfused) speech parts per clause, verb agreement, tense use per verb, coordinating and subordinating conjunctions per clause, prepositions, pronouns, determiners, and articles, all of which were calculated as correct use per required context. The Pythagorean plot demonstrates that the predominantly parallel patterns observed in the data and particularly the residuals of the syntactic complexity and accuracy measures in section 5.3.2 (Figure 25) are indeed representative of the overall interaction between these categories in the main case study.

5.3.3 Interactions within and between lexicon and syntax, across complexity measures

This section targets the third research question: whether the complexification of written lexicon comes at a cost to that of syntax, and vice versa. In practical terms, it asks if different indexes denoting complexity within lexicon and syntax have similar developmental paths, and whether increased lexical complexity is typically


accompanied by increased syntactic complexity. Figure 28 (below) shows paired trajectories of two complexity measures in the lexical and syntactic dimensions, respectively. Within each dimension, the two measures mark different conceptualizations of complexity. In lexicon, the ratio of complex words (words of three or more syllables) is contrasted with the general word variation. The two indexes are not synonymous, since an increase in word variation need not necessarily be accompanied by a parallel increase in the complex word ratio, and vice versa. In syntax, the ratio of subordinating conjunctions per clause is contrasted with the clause/sentence ratio. Again, while the number of overall clauses may increase, they might be conjoined by coordination, which is a less sophisticated device than subordination; therefore, the development of the two indexes need not necessarily overlap. The two lexical complexity indexes show mainly parallel growth, whereas the syntactic complexity indexes show parallel growth in the first half of the study, followed by alternating patterns. These patterns can be seen in the correlations of the two index pairs. The coefficient for the correlation between the lexical complexity indexes is 0.32 (p<0.05); while for the syntactic indexes, it is -0.35 (p<0.05). Although the correlations invariably collapse the data variability, they confirm that the surface interaction between the lexical complexity indexes is mostly supportive, while the opposite holds for the interaction between the syntactic measures.

[Figure 28 plots: left panel, subordination/clause ratio and clause/sentence ratio; right panel, general word variation and complex word ratio; x-axes: weeks 1-36.]

Figure 28. Two paired measures of syntactic (left) and lexical (right) complexity indexes in the Portuguese speaker data

However, the indexes within each pair do not show identical trajectories. Moreover, while the linear trend of the complex word ratio increases, that of the general word variation decreases. Similarly, the linear trend for the clause/sentence ratio increases, while that of the subordination/clause ratio decreases. The next step juxtaposes complexity in the lexical and syntactic dimensions by plotting the development of the lexical and syntactic indexes in two combinations (Figure 29, below): the linearly increasing measures of the complex word and clause/sentence ratios (top left; see footnote 27); and the linearly decreasing measures of the general word variation and the subordination/clause ratio (top right).

[Figure 29 plots: top panels, complex word ratio with clause/sentence ratio (divided by 10), and general word variation with subordination/clause ratio; bottom panels, clause/sentence ratio with complex word ratio, and subordination/clause ratio with general word variation; x-axes: weeks 1-34.]

Figure 29. Two pairs of lexical and syntactic complexity indexes in the Portuguese speaker data with their linear trends (those of the syntactic indexes in the main y-axes)

Figure 29 shows that, regardless of the direction of their linear trends, both combinations of lexical and syntactic complexity measures exhibit largely alternating growth patterns, in which an increase in one index is accompanied by a decrease in the other. This impression is supported by the correlations between the data values. The correlation between the complex word and clause/sentence ratios is negative (-0.265), although not statistically significant. The correlation between the general word variation and subordination/clause ratio is also negative (-0.292) and closer to significance, at p=0.084. These correlations show that, in general, each pair of lexical-syntactic complexity indexes is negatively related. The alternating growth patterns between lexical complexity and syntactic complexity, seen in the data plots and correlations, are more robust in the residuals of these index pairs. Indeed, correlating these residuals yields stronger negative correlations than correlating the raw data values.

27 In order to simplify the visual display, the clause/sentence ratio values were divided in this case by 10, a procedure which has no effect on their fluctuation in relation to the complex word ratio, and allows plotting their data and residual values on the same scale as the other indexes. Since the lexical indexes change on much smaller scales, their linear trends were plotted separately from those of the syntactic indexes (on the secondary y-axes).

[Figure 30 plots: top panel, residuals of the clause/sentence ratio and complex word ratio; bottom panel, residuals of the subordination/clause ratio and general word variation; x-axes: weeks 1-36.]

Figure 30. Residuals of paired lexical and syntactic complexity indexes in the Portuguese speaker data, with the lexical indexes plotted on the secondary (right-hand) y-axes

Figure 30 shows the residuals of the same combinations of lexical and syntactic complexity indexes. The plots show robust patterns of alternating variability, which are confirmed in their correlations. Unlike in the data values, the strongest indication of competition in the residuals is expressed in the correlation between the complex word and clause/sentence ratios (-0.39 at p<0.05). The correlation between the residuals of the general word variation and subordination/clause ratios is also negative (-0.211), but not statistically significant. However, the residual plots show that alternating patterns are also discernible in this dyad, although these may occur when the residuals of both indexes are negative (for example, in weeks 9-16, see the bottom plot). Overall, the residual plots suggest competition within each of the two pairs of lexical-syntactic complexity indexes, as observed in the data trajectories.

Figure 31 includes the moving correlation between lexical complexity and syntactic complexity in the Portuguese speaker data, together with the same correlation in the other case studies. The plots depict the interaction between the general word variation and subordination/clause ratio, because of the altogether stronger competition expressed in their data values (in contrast with the residuals, in which the complex word and clause/sentence ratios showed stronger competition). Another reason that these two indexes were selected for further analyses was that the general word variation is based on a higher number of occurrences than the complex word ratio, whereas the subordination/clause ratio reflects a higher order of complexity than the clause/sentence ratio, since it encompasses both clause conjunction and subordination.
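The contrast drawn in this section between correlations of raw data values and correlations of residuals can be illustrated with a minimal Python sketch. It assumes that residuals are deviations from a least-squares linear trend over the measurement weeks; the detrending procedure actually used in the study is described elsewhere and may differ. The function names and the weekly values are hypothetical and invented purely for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    den = sqrt(sxx * syy)
    return sxy / den if den else float("nan")

def linear_residuals(y):
    """Residuals of y around its least-squares linear trend over weeks 1..n.

    Assumption: a straight-line trend; other detrending choices are possible.
    """
    n = len(y)
    t = list(range(1, n + 1))
    mt, my = sum(t) / n, sum(y) / n
    slope = (sum((a - mt) * (b - my) for a, b in zip(t, y))
             / sum((a - mt) ** 2 for a in t))
    intercept = my - slope * mt
    return [b - (intercept + slope * a) for a, b in zip(t, y)]

# Hypothetical weekly values (not taken from the study's data)
complex_word_ratio = [0.10, 0.14, 0.11, 0.16, 0.13, 0.18, 0.15, 0.20]
clause_sentence_ratio = [2.4, 2.0, 2.6, 2.1, 2.8, 2.2, 2.9, 2.4]

# Correlation of the raw values versus correlation of the detrended residuals
r_raw = pearson(complex_word_ratio, clause_sentence_ratio)
r_res = pearson(linear_residuals(complex_word_ratio),
                linear_residuals(clause_sentence_ratio))
print(f"raw: {r_raw:.3f}  residuals: {r_res:.3f}")
```

In this invented example both series rise overall, so the shared trend partly masks their week-to-week alternation; removing the trend makes the negative relation between the residuals more visible.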

[Figure 31 plots: four panels (the Portuguese speaker, the Vietnamese speaker, the Mandarin speaker, the Indonesian speaker), each showing a moving correlation on a scale from -1 to 1 over the study weeks.]

Figure 31. Moving correlations (in 5-measurement windows) between the general word variation and subordination/clause ratios in the four case studies

These plots show how, with the exception of the Vietnamese speaker, the correlation between lexical complexity and syntactic complexity fluctuates between positive and negative values in the case studies. Such alternations have been interpreted as reflecting competition for limited resources, since oscillations between positive and negative interactions can indicate the support-competition cycles implied by the precursor model (cf. Verspoor et al., 2008b). Since the residual plot for the main case study also showed predominantly alternating patterns, the impression of a complex interaction, consisting predominantly of competition, is strengthened for this dataset.
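A moving correlation of the kind plotted in Figure 31 can be sketched as follows. This is an illustration only: it assumes that each window consists of five consecutive measurements and that the window advances one measurement at a time; how windows are aligned to weeks in the plots is not specified here. The function names and the data are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    den = sqrt(sxx * syy)
    # A window with no variation in one series has no defined correlation
    return sxy / den if den else float("nan")

def moving_correlation(x, y, window=5):
    """Correlation of x and y within each run of `window` consecutive points."""
    if len(x) != len(y):
        raise ValueError("series must have equal length")
    return [pearson(x[i:i + window], y[i:i + window])
            for i in range(len(x) - window + 1)]

# Hypothetical weekly values (not taken from the study's data)
word_variation = [0.50, 0.52, 0.49, 0.55, 0.53, 0.51, 0.56, 0.54, 0.58, 0.57]
subordination = [0.30, 0.27, 0.32, 0.25, 0.28, 0.31, 0.24, 0.29, 0.22, 0.26]

for start, r in enumerate(moving_correlation(word_variation, subordination), 1):
    print(f"window starting at measurement {start}: r = {r:.2f}")
```

With n measurements and a window of five, the procedure yields n - 4 coefficients, which is why a moving-correlation plot is shorter than the data trajectory it summarizes.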

5.3.4 Interactions between general measures of complexity and accuracy in lexicon and syntax

The final step in the analyses addresses the question of how complexity and accuracy interact not only within, but also between lexicon and syntax. In other words, it merges the complexity-accuracy and lexicon-syntax distinctions, inspecting how lexical complexity and accuracy interact with their syntactic counterparts. Figure 32 shows the trajectories for the general word variation, the correct lexical use, the subordination/clause ratio, and the ratio of correct word order per clause in the main case study data. These indexes denote lexical complexity, lexical accuracy, syntactic complexity and syntactic accuracy, respectively, and are referred to as such in the analyses.

[Figure 32 plot: ratios (0-1) by week (1-35) for lexical variation, correct lexical use, subordination/clause ratio, and correct word order per clause.]

Figure 32. Data trajectories of the four indexes denoting lexical and syntactic complexity and accuracy in the Portuguese speaker data

Figure 33 shows all the combinations of these trajectories as lexical-syntactic pairs.


[Figure 33 plots: four panels pairing correct lexical use with subordination/clause ratio (top left) and with correct word order (top right), and general word variation with subordination/clause ratio (bottom left) and with correct word order (bottom right); x-axes: weeks 1-36.]

Figure 33. Data trajectories for pairs of lexical and syntactic measures in the Portuguese speaker data. Lexical indexes are plotted on the main y-axes; syntactic indexes are plotted on the secondary (right-hand) y-axes

The plots in Figure 33 show predominantly alternating trajectories: while a pair of indexes may exhibit a similar central growth trend (overall increase or decrease), local fluctuations are frequently inverse. This tendency is more pronounced in the data residuals, plotted in Figure 34. The residual plots show robust inverse patterns across the lexical-syntactic index pairs, in which an increase above the trend in a lexical index is often accompanied by a decrease in a syntactic index, and vice versa. These patterns are particularly salient in the first half of the correct lexical use and subordination/clause ratio plot (top left), and least salient in the correct lexical use and correct word order plot (top right), which shows relatively more parallel patterns than the other combinations. The two other plots, while including some parallel patterns, also show mostly inverse ones. Figure 35 shows the same index combinations as moving correlations. Rather than referring to the indexes themselves, it notes the categories that they represent.


[Figure 34 plots: four residual panels pairing correct lexical use with subordination/clause ratio (top left) and with correct word order (top right), and general word variation with subordination/clause ratio (bottom left) and with correct word order (bottom right); x-axes: weeks 1-36.]

Figure 34. Residual plots for combinations of lexical and syntactic indexes in the Portuguese speaker data

[Figure 35 plots: four moving-correlation panels on a scale from -1 to 1: lexical accuracy-syntactic complexity (top left), lexical accuracy-syntactic accuracy (top right), lexical complexity-syntactic complexity (bottom left), lexical complexity-syntactic accuracy (bottom right).]

Figure 35. Moving correlations (in 5 measurement windows) between pairs of lexical and syntactic accuracy and complexity indexes in the Portuguese speaker data


In line with the patterns observed in the residual plots, the strongest alternations between negative and positive correlation coefficient values occur in the lexical accuracy-syntactic complexity dyad (top left). In contrast, the correlation between the two accuracy categories (top right) is principally weak and positive. The correlation between lexical complexity and syntactic complexity (bottom left), which was discussed in section 5.3.3, shows similar shifts to those in the lexical accuracy-syntactic complexity interaction, but they are less pronounced, alternating between high negative and weak-to-moderate positive values. Finally, the interaction between lexical complexity and syntactic accuracy (bottom right) shows relatively fewer shifts and remains mostly positive.

Judging from these moving correlations and the previous analyses of lexical-syntactic index pairs, the strongest indication for a precursor interaction between lexical and syntactic categories in the main case study can be discerned in the lexical accuracy-syntactic complexity dyad. This interaction shifts from a high negative value at the beginning of the study to a high positive value at its end. The shift entails an intermediary oscillation between smaller positive and negative values.

The moving correlations for the other case studies resemble those of the Portuguese speaker by fluctuating between positive and negative values, although, due to individual variation, their fluctuations are predictably not parallel across the datasets. In the Mandarin speaker data (Figure 36), the strongest shift is again in the lexical accuracy-syntactic complexity plot (top left). As in the Portuguese speaker data, this moving correlation shifts from an initial negative to a strong positive value; it then drops again in the last weeks to a strong negative value, and appears to recover towards the end of the study. Her lexical accuracy-syntactic accuracy correlation plot (top right), while showing robust fluctuation, shifts between strong positive and weak negative values, as also observed in the Portuguese speaker data.


[Figure 36 plots: four moving-correlation panels on a scale from -1 to 1: lexical accuracy-syntactic complexity (top left), lexical accuracy-syntactic accuracy (top right), lexical complexity-syntactic complexity (bottom left), lexical complexity-syntactic accuracy (bottom right).]

Figure 36. Moving correlations (in 5 measurement windows) for paired lexical and syntactic accuracy and complexity indexes in the Mandarin speaker data

In the Vietnamese speaker data (Figure 37), the moving correlation between the lexical accuracy and syntactic complexity categories also shows the most distinct fluctuations. However, unlike the two previous cases, it does not begin at a negative value, but rather shifts from a moderate positive to weaker positive and negative values, culminating at a strong positive value. The other interactions expressed by the moving correlations tend to be positive, although in two cases they culminate at negative values. Generally, in comparison with the two previous participants, the moving correlations in this dataset show less pronounced fluctuation.


[Figure 37 plots: four moving-correlation panels on a scale from -1 to 1: lexical accuracy-syntactic complexity (top left), lexical accuracy-syntactic accuracy (top right), lexical complexity-syntactic complexity (bottom left), lexical complexity-syntactic accuracy (bottom right).]

Figure 37. Moving correlations (in 5 measurement windows) for data values of paired lexical and syntactic accuracy and complexity indexes in the Vietnamese speaker data

Finally, the moving correlations for the Indonesian speaker data show robust oscillations (Figure 38). Although these differ in their timing from those in the moving correlations of the other case studies, they are again most evident in the lexical accuracy-syntactic complexity correlation and the lexical complexity-syntactic complexity correlation, as in the first two datasets. However, in contrast with the other case studies, the lexical accuracy-syntactic accuracy correlation (top right) begins at a moderate negative value, increases to a positive value throughout most of the study period, and steeply declines to a negative value towards its end. Similarly, the lexical complexity-syntactic accuracy correlation (bottom right) changes from high positive initial values to predominantly negative end values.


[Figure 38 plots: four moving-correlation panels: lexical accuracy-syntactic complexity (top left), lexical accuracy-syntactic accuracy (top right), lexical complexity-syntactic complexity (bottom left), lexical complexity-syntactic accuracy (bottom right).]

Figure 38. Moving correlations (in 5 measurement windows) for paired lexical and syntactic accuracy and complexity indexes in the Indonesian speaker data

To sum up, the moving correlation analyses showed general patterns of fluctuations in interactions across all combinations of lexical and syntactic categories, for all participants. Particularly strong shifts were apparent in the lexical accuracy-syntactic complexity interaction in the Portuguese, Mandarin and Indonesian speaker datasets. In the case of the Portuguese and Mandarin speakers, the correlation between lexical accuracy and syntactic complexity started at a negative value and ended at a positive one; for the Indonesian speaker, fluctuations in this interaction were still robust, but the shift was from an initial positive to a final negative correlation. This interaction also showed the most distinctive oscillations in the Vietnamese speaker data in comparison with the other category combinations, albeit to a lesser degree.

Previous studies have interpreted similar oscillations in moving correlations as indicative of dynamic precursor interactions (Verspoor et al., 2008b). In the final procedure in this study, this interpretation is tested by configuring the four writing performance categories (lexical complexity, lexical accuracy, syntactic complexity and syntactic accuracy) as parameters in a precursor model. The following section summarizes the data analyses and the key interactions relevant to the modeling procedure.


5.3.5 Summary of the data analyses

The first part of the results section looked at the development of L2 writing indexes, selected on the basis of previous studies. It focused primarily on a single case study, but complemented its main findings with additional data from the three other participants. However, L2 development, including that of writing performance, is highly individuated. Presenting the different case studies was therefore aimed not at simple cross-learner comparison, but at gauging what similarities, despite inevitable differences between learners, can be revealed and generalized in a model of the data as emerging from precursor interactions.

The key interactions that are relevant to the modeling procedure are between lexical complexity and accuracy, lexical accuracy and syntactic complexity, and syntactic complexity and accuracy. These interactions are essential to the model, because they represent a hierarchy in which lexicon precedes syntax in general, and complexity precedes accuracy in each of these dimensions. In other words, this paradigm embeds the complexity-accuracy distinction within the general lexical-syntactic divide. Since these interactions were presented in detail in the preceding sections, they are recapitulated here only briefly, in conjunction with the main case study.

The complexity-accuracy interaction within the lexical dimension was identified as a weak competition, since the data and variability analyses showed that while alternating patterns could be discerned between indexes denoting these categories, these patterns were not robust. In the lexical accuracy-syntactic complexity interaction, competition appears to be much stronger; this interaction was recognized as the most distinctively competitive of all combinations of lexical and syntactic categories in all case studies. Finally, the complexity-accuracy interaction within the syntactic dimension was considered supportive, since it showed mostly parallel growth and variability patterns, as well as positive correlations, and thus little evidence for competition.

5.4 Modeling complexity and accuracy in lexicon and syntax

The proposed model of the current data consists of a precursor hierarchy of four levels: lexical complexity, lexical accuracy, syntactic complexity and syntactic accuracy. This order, while based on previous and current findings, is also intuitively appealing: lexicon can be considered a prerequisite building block for syntax, and complexity as enabling accuracy. The hypothesized interactions across the hierarchy, based on the interpretation of the analyses, are a weak competition between lexical complexity and lexical accuracy; a strong competition between lexical accuracy and syntactic complexity; and support between syntactic complexity and syntactic accuracy. Thus the model tests both the general hypothesis that development across these categories of L2 writing can be explained by precursor interactions from complexity to accuracy and from lexicon to syntax, and the more specific interpretations of the analyses.

The model iterates coupled logistic equations that configure interactions between adjacent levels in a hierarchy (van Geert 1991, 1993, 1994, 1995, 2003). It is similar to the connected grower model used in the vocabulary knowledge study in the previous chapter, with some adjustments. These adjustments were due to the fact that the dynamics of the writing performance data were considered more complex than those of the vocabulary knowledge continuum, which was simulated by a model containing only bottom-up interactions (from precursors to dependents). The present model additionally incorporates backward interactions from the dependents to the precursors. The configuration of these interactions was motivated by the assumption that the construct of writing performance at an advanced proficiency involves interactions between pre-acquired skills, and not relatively or partially novel knowledge (as assumed in the previous study). This assumption was based on the advanced proficiency levels of the participants, but also on the appearance of the data. Unlike the general increase across all levels of the vocabulary knowledge data (seen in section 4.3.1.1), which implied an active acquisition process, the writing performance data generally show more curbed growth, and even decrease (cf. Figure 29, section 5.3.3).

Moreover, although the data categories in the current study are considered as ordered in a hierarchy of precursors and dependents, the expression of this hierarchy in the indexes denoting these categories is less distinctive than in the vocabulary knowledge continuum, since these indexes do not necessarily pertain to the same linguistic features. Therefore, the resources invested in each writing category at any given moment were expected to have less local impact on the other categories; rather, a more global effect of each category on the other categories was predicted, in line with its current value. Accordingly, both forward and backward interactions were configured as by level, that is, as relative to the current value of the grower that generates them.


Like the (optimized) version of the vocabulary knowledge model, the current model included a single aggregated damping factor, or influence, from each grower towards another. This parameter is denoted as support, however it can also take on a negative value, reflecting competition (as depicted in Table 1, section 3.4.6). Equation 10 describes the current model.

$$A_{n+1} = A_n \left(1 + r_A - \frac{r_A A_n}{K_A} + s_A B_n\right)$$

$$B_{n+1} = B_n \left(1 + \left(r_B - \frac{r_B B_n}{K_B} + s_B A_n\right) P_B\right)$$

Equation 10. Two growers in a precursor interaction with bidirectional “support” by level

Table 16 (below) lists the model parameters. Their equivalents in Equation 10 appear in brackets. Although the equation depicts only two growers A and B, the same equation also applies to the subsequent levels in the model, by incorporating corresponding relational control parameters that describe their respective feed-forward and feedback interactions.

| Parameter | Definition |
|---|---|
| rate_a (equivalent to r_A) | Growth rate of grower A (lexical complexity in this case) |
| rate_b (r_B) | Growth rate of grower B (lexical accuracy) |
| rate_c | Growth rate of grower C (syntactic complexity) |
| rate_d | Growth rate of grower D (syntactic accuracy) |
| support_a_b (s_B) | Level of support from A to B |
| support_b_a (s_A) | Level of support from B to A |
| support_b_c | Level of support from B to C |
| support_c_b | Level of support from C to B |
| support_c_d | Level of support from C to D |
| support_d_c | Level of support from D to C |
| A_c | Threshold value for onset of precursor interaction from A to B (if A_n ≥ A_c, then P=1; otherwise P=0) |
| B_c | Threshold value for onset of precursor interaction from B to C (if B_n ≥ B_c, then P=1; otherwise P=0) |
| C_c | Threshold value for onset of precursor interaction from C to D |
| iniA (A_n) | Initial value for grower A (data based) |
| iniB (B_n) | Initial value for grower B (data based) |
| iniC | Initial value for grower C (data based) |
| iniD | Initial value for grower D (data based) |
| K_a (K_A) | Carrying capacity of grower A (set to 1) |
| K_b (K_B) | Carrying capacity of grower B (set to 1) |
| K_c | Carrying capacity of grower C (set to 1) |
| K_d | Carrying capacity of grower D (set to 1) |

Table 16. Parameter names and definitions in the model
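To make the iteration concrete, the two-grower case of Equation 10 can be sketched in Python. This is a sketch under stated assumptions, not the implementation used in the study: it assumes one reading of Equation 10, spelled out in the code comments, in which the threshold switch P_B gates the whole growth term of B. The function name `simulate_two_growers` and all numerical values in the usage example are hypothetical; the argument names follow Table 16.

```python
def simulate_two_growers(ini_a, ini_b, rate_a, rate_b,
                         support_a_b, support_b_a, a_c,
                         k_a=1.0, k_b=1.0, steps=36):
    """Iterate two coupled logistic growers with bidirectional support by level.

    Assumed reading of Equation 10:
      A[n+1] = A[n] * (1 + r_A - r_A*A[n]/K_A + s_A*B[n])
      B[n+1] = B[n] * (1 + (r_B - r_B*B[n]/K_B + s_B*A[n]) * P_B)
    where P_B = 1 once A[n] >= A_c and 0 otherwise.
    support_a_b is s_B (influence of A on B); support_b_a is s_A.
    A negative support value represents competition.
    """
    a, b = [ini_a], [ini_b]
    for _ in range(steps - 1):
        a_n, b_n = a[-1], b[-1]
        p_b = 1.0 if a_n >= a_c else 0.0  # precursor threshold switch
        a.append(a_n * (1 + rate_a - rate_a * a_n / k_a + support_b_a * b_n))
        b.append(b_n * (1 + (rate_b - rate_b * b_n / k_b
                             + support_a_b * a_n) * p_b))
    return a, b

# Hypothetical parameter values, chosen only to show the mechanics
level_a, level_b = simulate_two_growers(
    ini_a=0.5, ini_b=0.6, rate_a=0.05, rate_b=0.05,
    support_a_b=-0.07, support_b_a=0.04, a_c=0.0)
print(round(level_a[-1], 3), round(level_b[-1], 3))
```

Because each support term is multiplied by the current value of the grower that generates it, the strength of the influence varies over time while its sign stays fixed, which is what "by level" refers to in the text.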


For all case studies, the model is based on the same four-level precursor hierarchy. However, due to the predictable inter-individual differences in the data, it is impossible to generalize a single model version to all participants. In other words, the numerical values assigned to the control parameters, which determine the interactions between the hierarchy of growers as well as their growth rates, differ for each participant and therefore need to be optimized separately in each case. The optimization is performed by the Simplex procedure, as detailed in section 4.3.8.1.

5.4.1 The Portuguese speaker

Figure 39 presents the data, linear trends, cubic spline interpolation and optimized model outcome for the Portuguese speaker case study. Inspecting these plots together shows that the model resembles a combination between the linear and spline trends (see section 3.3.4 for an explanation of the spline smoothing technique).

[Figure 39 plots: four panels (Data, Linear trends, Model, Spline), each plotting general word variation, correct lexical use, subordination/clause ratio and correct word order per clause (Levels A-D in the model panel) on a 0-1 scale over weeks 1-34.]

Figure 39. Data, linear trends, spline and optimized model (fit: 0.0234) for the Portuguese speaker case study

A very small sum of squared residuals between the model and the data reflects a near-optimal fit: all the model levels are correlated with the data categories at near-maximal coefficient values (p<0.01). The correlation between the general word variation in the data and the equivalent model level (A) was .930; for the correct lexical use ratio (and Level B) it was .831; for the subordination/clause ratio (and C), .920; and for the ratio of correct word order per clause (and D), .943. Because of these very high correlations with the data, it was not necessary to correlate the model with the spline or with the linear trend values, although it is easier to compare the model visually with these plots than with the raw data.

5.4.1.1 Comparing correlation matrixes across the data, spline and model

Aside from correlating the model with the data or spline values, another type of correlation analysis can be performed in order to confirm the goodness-of-fit of the model. A comparison of the correlation matrix between the various categories in the data or the spline and the parallel correlation matrix between the levels of the model can show not only that the model replicates the raw or smoothed data (as already seen in the sum of squares and plots), but also that it replicates the surface level interactions of the data, as expressed in its correlations.

The correlation matrixes in the data, spline and model outcome show very similar patterns, particularly those of the spline and the model. The data correlations are similar to those within the spline and model, but predictably lower, due to the higher variability of the data. Therefore contrasting the correlation matrix of the spline to that of the model facilitates a clearer comparison. The smoothed values in the spline incorporate part of the data variability, together with a local trend, whereas the model excludes variability altogether, since the optimization procedure cannot incorporate random factors. For this reason, the coefficient values in the model correlation matrix are higher than in that of the spline (in which in turn they are higher than in the data matrix).

In these matrixes, the four indexes are abbreviated as GenWdVar (general word variation); CorrLexUse (correct lexical use); SubClauseRatio (subordination/clause ratio); and CorrWO (correct word order ratio).


Table 17. Correlation matrix for the spline of the Portuguese speaker case study

Table 18. Correlation matrix for the model of the Portuguese speaker case study

The similar correlation matrixes of the model and the spline should not be confused with the correlations between the data and the model values, which were presented in the previous section, since the latter only reflect the resemblance of the model outcome to the data. The internal correlation matrixes, however, also reflect the ability of the model to generate similar surface interactions to those found in the raw or smoothed data.

5.4.1.2 Validating the interpretation of the data and variability analyses

As discussed in sections 5.3.5 and 5.4, the pertinent interactions for the purpose of simulating the data are between lexical complexity and accuracy, lexical accuracy and syntactic complexity, and syntactic complexity and accuracy. Based on the variability analyses, these interactions were posited as follows:

| Between-category interaction | Inferred relation |
|---|---|
| Lexical complexity-Lexical accuracy | Weak competition |
| Lexical accuracy-Syntactic complexity | Strong competition |
| Syntactic complexity-Syntactic accuracy | No competition (support) |

Table 19. Summarized between-category interactions as inferred from the variability analyses of the Portuguese speaker case study

Table 20 includes the optimized values of the control parameters for the four growers in the Portuguese speaker model. The carrying capacity (K) and initial value (ini) parameters are omitted from the table, because K is set to the default value of 1, and the initial values are configured as the corresponding data onset values. The threshold parameters are also excluded, since they were configured with a uniform value. This step was taken in order to focus the optimization procedure on the role of the bidirectional interactions in determining the general shape of the data growth.

The rate parameter is an initial value, which changes iteratively after the first step in the model as a function of the damping effect of the relational control parameters. In other words, although the rate parameter differs across the growers in the model, the actual determinants of the rate are the relational control parameters that are added to or subtracted from it. Thus the optimization procedure focuses on aggregated bidirectional interactions, which appear in the table as tallied between each pair of adjacent levels in the hierarchy. These interaction values are also initial values, which change as a function of the value of the growers that generate them (by level). Because the relational control parameters are multiplied by the previous value of the grower (e.g., A_n), which is always positive, their sign cannot change from negative to positive or vice versa, unlike that of the growth rate r.

| Parameter | Value |
|---|---|
| rate_a | 0.005219 |
| rate_b | 0.060632 |
| rate_c | -0.06972 |
| rate_d | 0.110176 |
| AtoB | -0.07386 |
| BtoA | 0.04244 |
| BtoC | -0.34834 |
| CtoB | -0.05067 |
| CtoD | 0.065989 |
| DtoC | 0.5988 |

Table 20. Aggregated parameter values for the model of the Portuguese speaker data


The parameter values confirm the hypothesis that informed the model setup, namely a basic order that ranges from lexical complexity, to lexical accuracy, to syntactic complexity and finally to syntactic accuracy. In this hierarchy, the development of each precursor is conditional for that of its dependent. Although the exact threshold is not addressed in this study, the basic precursor model structure is upheld. Additionally, the hypothesized interactions of competition between lexical accuracy and syntactic complexity and between complexity and accuracy within the lexical dimension, but not within the syntactic dimension, are also confirmed by the model. The aggregated value of the interactions between levels A and B (lexical complexity and accuracy), i.e., AtoB and BtoA, is negative. Likewise, the overall interaction between levels B and C (lexical accuracy and syntactic complexity), as expressed in the sum of BtoC and CtoB, is also negative. Of the two aggregated bidirectional interactions, the lexical accuracy-syntactic complexity interaction has a stronger negative value, indicating a stronger competition between these levels than between lexical complexity and lexical accuracy. This indication is in line with the interpretation of the analyses. In contrast, the value of the aggregated interaction between levels C and D – simulating syntactic complexity and accuracy – is positive. This value is also in congruence with the interpretation of the variability analyses. Moreover, comparing the internal correlations between the data residuals (Table 21) with the optimized values of the relational control parameters in Table 20 and the verbal descriptions of these interactions in Table 19 further supports the interpretation of the variability patterns.
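The aggregation described in this paragraph can be checked with a short worked example. The sketch below sums the forward and backward support values reported in Table 20 for each pair of adjacent levels and labels the sign of the result; the dictionary layout and the helper `relation` are illustrative choices and not part of the original procedure.

```python
# Relational control parameters as reported in Table 20
params = {
    "AtoB": -0.07386, "BtoA": 0.04244,
    "BtoC": -0.34834, "CtoB": -0.05067,
    "CtoD": 0.065989, "DtoC": 0.5988,
}

# Adjacent levels in the hierarchy and their forward/backward parameters
pairs = {
    "lexical complexity-lexical accuracy": ("AtoB", "BtoA"),
    "lexical accuracy-syntactic complexity": ("BtoC", "CtoB"),
    "syntactic complexity-syntactic accuracy": ("CtoD", "DtoC"),
}

def relation(value):
    """Label an aggregated interaction by its sign (illustrative helper)."""
    return "competition" if value < 0 else "support"

# Sum the two directions of each interaction
aggregated = {name: params[fwd] + params[back]
              for name, (fwd, back) in pairs.items()}

for name, value in aggregated.items():
    print(f"{name}: {value:+.5f} ({relation(value)})")
```

The sums are approximately -0.031 for the lexical complexity-lexical accuracy pair, -0.399 for the lexical accuracy-syntactic complexity pair, and +0.665 for the syntactic complexity-syntactic accuracy pair, which matches the ordering of weak competition, strong competition and support described in the text.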

Table 21. Correlation matrix for the data residuals (variability patterns) of the Portuguese speaker case study

As seen in the comparison of the raw data and the residual plots (for example, in section 5.3.2), the surface interactions between the data categories, reflected in between-index correlations, are not always similar to the interactions between the residuals of these categories. The notion that these interactions are generated by underlying precursor interactions, which remain for the most part obscure unless implied in variability patterns (e.g., Verspoor et al., 2008b), is corroborated by these comparisons. Thus, in general, the modeling procedures confirm the treatment of variability patterns as indicative of underlying internal interactions that give rise to overall development, as posited by studies conducted from the dynamic perspective (Steenbeek & van Geert, 2005; van Geert & van Dijk, 2002; Verspoor et al., 2008b).
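The contrast drawn here between surface correlations and correlations between residuals can be illustrated with a small sketch. It assumes that residuals are taken around a least-squares linear trend, which may differ from the detrending used in the thesis, and the two series (`index_1`, `index_2`) are hypothetical values constructed for the example rather than study data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def linear_residuals(ys):
    """Residuals of ys around its least-squares line over 0, 1, 2, ..."""
    n = len(ys)
    ts = range(n)
    mean_t, mean_y = (n - 1) / 2, sum(ys) / n
    slope = (sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
             / sum((t - mean_t) ** 2 for t in ts))
    return [y - (mean_y + slope * (t - mean_t)) for t, y in zip(ts, ys)]


# Hypothetical series: both rise over time, but wobble in opposite directions.
wobble = [0.5 if t % 2 == 0 else -0.5 for t in range(10)]
index_1 = [t + w for t, w in zip(range(10), wobble)]
index_2 = [t - w for t, w in zip(range(10), wobble)]

raw_r = pearson(index_1, index_2)
residual_r = pearson(linear_residuals(index_1), linear_residuals(index_2))
print(f"raw: {raw_r:.2f}, residuals: {residual_r:.2f}")
```

In this constructed case the shared upward trend produces a strongly positive raw correlation, while the residuals are strongly negatively correlated. This is the kind of divergence between surface and underlying interactions that the paragraph above describes.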

5.4.2 The Mandarin speaker

[Figure 40: four panels (Data, Linear trends, Spline, Model), each plotting general word variation, correct lexical use, subordination/clause ratio and correct word order on a 0 to 1 scale against measurement points 1 to 34]

Figure 40. Data, linear trends, spline and model (fit: 0.0807) for the Mandarin speaker case study

For the Mandarin speaker case study, the model fit is also very good, as expressed in a very small sum of squares. While the model did not correlate significantly with the data values, with the exception of one level (general word variation: .351 at p<0.05), it correlated significantly with three levels in the spline. The model-spline correlations were .609 (p<0.01) for general word variation; .561 (p<0.01) for correct lexical use; the spline-model correlation for the subordination/clause ratio did not reach significance (.242); and the correlation for the correct word order per clause was .516 (p<0.01). It should be noted that the fit of both the linear regression and the spline to the data was weaker than that of the model, reflecting a high amount of data variability in this case, with only one significant correlation (general word variation, .406 at p<0.05), and a near-significant correlation (correct lexical use, .331 at p=0.063) between the spline and the data. The linear regression also showed only one significant correlation with the data (general word variation, .345 at p<0.05). Considering the weaker fits of the spline interpolation and linear trends to the data, which indicate a relatively high amount of data variability, the model performs rather well, albeit in terms of fitting the spline rather than the data values. The aggregated model parameter values were as follows:

Parameter    Value
rate_a       0.141169
rate_b       -0.20825
rate_c       -0.01042
rate_d       -0.17628
AtoB         -0.26902
BtoA         -0.18801
BtoC         -1.31695
CtoB         0.0516982
CtoD         0.098731
DtoC         1.17776979

Table 22. Aggregated parameter values for the model of the Mandarin speaker data

As in the Portuguese speaker model version, the optimized values for control parameters in this model support the key predictions of the study. The first prediction, positing a fundamental precursor hierarchy between the writing performance categories, is confirmed by the model fit to the data trends. The second prediction, concerning the nature of these interactions as consisting of competition between complexity and accuracy within lexicon and between lexicon and syntax, with the stronger competition occurring in the latter dyad, is also supported. Although the CtoB interaction is positive, at a weak value, the BtoC interaction is negative at a much stronger value. Added up, these interactions yield an overall stronger negative value than the aggregated AtoB and BtoA interactions. Finally, the optimized parameter values also resemble those of the Portuguese speaker by showing that the syntactic complexity and accuracy indexes exhibit a supportive rather than a competitive interaction, as expressed in the positive values of the CtoD and DtoC interactions.

5.4.3 The Vietnamese speaker

[Figure 41: four panels (Data, Linear trends, Spline, Model), each plotting general word variation, correct lexical use, subordination/clause ratio and correct word order on a 0 to 1 scale against weeks 1 to 34]

Figure 41. Data, linear trends, spline and model (fit: 0.683) for the Vietnamese speaker case study

Figure 41 shows how, particularly for the Vietnamese speaker data, the spline captures the growth trends much more adequately than the linear regression. The linear trends exclude the decline in correct word order following the first weeks of the study, and its subsequent recovery, as well as the fluctuations in the correct lexical use. The model, however, also fails to capture these changes, and for the most part resembles the linear trends rather than the spline. It simulates the increase in the subordination/clause ratio as low and almost nonexistent. Additionally, although the model simulates the two accuracy indexes as close and intertwined trajectories, the fluctuations in these trajectories do not resemble those in the data and spline plots. The model also depicts the general word variation at a constantly higher value than both the linear trend and spline plots (with the exception of its onset value). In this, it does replicate the fact that throughout the study there are points of conjuncture or overlap between the values of the general word variation and those of the accuracy measures, as seen also in the spline. However, this trait is over-generalized in the model. Thus, this model version had the worst fit, and accordingly showed no significant correlations with the data or spline values. Nevertheless, the model still captured the general shape of the data: the relative values of each index within the data, and the increase in correct lexical use towards the middle of the study period.

Parameter    Value
rate_a       0.786866
rate_b       -0.6515
rate_c       -0.0155
rate_d       0.08264
AtoB         -1.76141
BtoA         -1.718823
BtoC         1.661494
CtoB         -0.4129749
CtoD         -0.51557
DtoC         -1.808441

Table 23. Aggregated parameter values for the model of the Vietnamese speaker data

While maintaining the basic configuration of a precursor hierarchy, the optimized values for this model version diverge from those in the previous versions and from the predictions made on the basis of the variability analyses. The main difference is that the interaction between levels C and D (representing syntactic complexity and accuracy) is negative, like the interaction between levels A and B (lexical complexity and accuracy), while the aggregated interaction values between B and C are positive. In this sense, the model does not verify a positive or supportive interaction between complexity and accuracy in the syntactic dimension, or an overall competition between lexicon and syntax. It does however support the notion of competition between complexity and accuracy within the lexical dimension.

5.4.4 The Indonesian speaker

[Figure 42: four panels (Data, Linear trends, Spline, Model), each plotting general word variation, correct lexical use, subordination/clause ratio and correct word order on a 0 to 1 scale against measurement points 1 to 35]

Figure 42. Data, linear trends, spline and model (fit: 0.06079) for the Indonesian speaker case study

For this participant, the model fit is fairly good, as demonstrated by the low sum of squares. However, the model correlates highly and significantly with the data only on one level (subordination/clause ratio, .541 at p<0.01). Regarding the model-spline correlations, the correlation was high for the general word variation (.557 at p<0.01) and very high for the subordination/clause ratio (.882 at p<0.01), yet the two other levels in the model did not correlate significantly with the spline values. The spline-data fit, as manifested in correlations between the spline and data values, was high for the first three indexes (general word variation: .463; correct lexical use: .592; subordination/clause ratio: .484; all at p<0.01). However, the correct word order ratio in the spline and its equivalent in the data were not significantly correlated. Additionally, the linear regression was not significantly correlated with the data on any of the levels except the subordination/clause ratio (.537 at p<0.01). Since the spline fits the data on three of its levels, the weaker fit of this model version could not be attributed only to high data variability (versus its absence from the model). Therefore, the fit of this model is considered somewhat weaker than that of the Portuguese and Mandarin speakers, although it is still fairly good in terms of replicating the overall growth trends of the data (as seen in the plots and the low sum of squares).
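The two fit criteria used throughout this section, a sum of squares and the correlation between model and target values, can be sketched as follows. The sketch assumes that the reported fit value is a plain sum of squared differences; the exact computation in the thesis (for instance, whether it is summed over all four levels or normalized) is not stated here. The function names and the `model` and `data` series are hypothetical.

```python
import math


def sum_of_squares(model, target):
    """Sum of squared differences between two equally long series."""
    return sum((m - t) ** 2 for m, t in zip(model, target))


def correlation(model, target):
    """Pearson correlation between two equally long series."""
    n = len(model)
    mean_m, mean_t = sum(model) / n, sum(target) / n
    cov = sum((m - mean_m) * (t - mean_t) for m, t in zip(model, target))
    spread = math.sqrt(sum((m - mean_m) ** 2 for m in model)
                       * sum((t - mean_t) ** 2 for t in target))
    return cov / spread


# Hypothetical values: a smooth model against noisier observations.
model = [0.50, 0.55, 0.60, 0.65, 0.70, 0.75]
data = [0.48, 0.60, 0.55, 0.70, 0.66, 0.80]
print(sum_of_squares(model, data), correlation(model, data))
```

The two measures answer different questions: the sum of squares reflects how close the values are, while the correlation reflects whether the two series rise and fall together. This is why a model can show a low sum of squares and still fail to correlate significantly with highly variable data.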

Parameter    Value
rate_a       0.123748
rate_b       -0.08895
rate_c       -0.0123
rate_d       -0.02939
AtoB         -0.29616
BtoA         -0.335589
BtoC         -0.05511
CtoB         0.035072
CtoD         0.022746
DtoC         0.124017

Table 24. Aggregated parameter values for the model of the Indonesian speaker data

As in the Portuguese and Mandarin speaker models, the optimized parameter values in Table 24 are in line with the general hypotheses: a predominantly competitive interaction between lexical complexity and lexical accuracy (A and B), and between lexical accuracy and syntactic complexity (B and C), and a supportive interaction between syntactic complexity and syntactic accuracy (C and D). However, in this model version, the lexical complexity-accuracy competition is stronger than the bidirectional competition between lexical accuracy and syntactic complexity, in contrast with the prediction. It should be noted that, in comparison with the other participants, the Indonesian speaker data showed more negative values in the moving correlation between lexical complexity and lexical accuracy than in the moving correlation between lexical accuracy and syntactic complexity (see Figure 24 in section 5.3.1 and Figure 38 in section 5.3.4). This might indicate a stronger competition within the lexical dimension of this particular case study than between the lexical and syntactic dimensions. Such a difference in interactions would naturally not align with the hypotheses concerning the relational control parameters, which were formulated on the basis of the Portuguese speaker data.
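The aggregated interactions discussed for the four case studies can be recomputed directly from the relational parameters in Tables 20, 22, 23 and 24. The short sketch below follows the text in treating each aggregated interaction as the sum of the two directed parameters between adjacent levels; the names `TABLES` and `aggregate` are introduced only for this illustration.

```python
# Relational control parameters copied from Tables 20, 22, 23 and 24.
TABLES = {
    "Portuguese": {"AtoB": -0.07386, "BtoA": 0.04244, "BtoC": -0.34834,
                   "CtoB": -0.05067, "CtoD": 0.065989, "DtoC": 0.5988},
    "Mandarin": {"AtoB": -0.26902, "BtoA": -0.18801, "BtoC": -1.31695,
                 "CtoB": 0.0516982, "CtoD": 0.098731, "DtoC": 1.17776979},
    "Vietnamese": {"AtoB": -1.76141, "BtoA": -1.718823, "BtoC": 1.661494,
                   "CtoB": -0.4129749, "CtoD": -0.51557, "DtoC": -1.808441},
    "Indonesian": {"AtoB": -0.29616, "BtoA": -0.335589, "BtoC": -0.05511,
                   "CtoB": 0.035072, "CtoD": 0.022746, "DtoC": 0.124017},
}


def aggregate(params):
    """Sum the two directed parameters of each adjacent pair of levels."""
    return {
        "A-B": params["AtoB"] + params["BtoA"],
        "B-C": params["BtoC"] + params["CtoB"],
        "C-D": params["CtoD"] + params["DtoC"],
    }


for speaker, params in TABLES.items():
    sums = aggregate(params)
    print(speaker, {pair: round(total, 4) for pair, total in sums.items()})
```

Under this reading, the Portuguese and Mandarin values give a negative A-B sum, a more strongly negative B-C sum and a positive C-D sum. The Indonesian values keep the same signs but with the A-B sum more negative than the B-C sum, and the Vietnamese values give negative A-B and C-D sums with a positive B-C sum, in agreement with the descriptions in the text.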

5.4.4.1 Testing a different precursor hierarchy

The overall good fit of the model raises the concern that it may be at least partially due to the effectiveness of the optimization procedure. This suspicion is partially cleared by the fact that the model fit was not equally good for all case studies, the Vietnamese speaker being the clear exception. However, it can also be countered by configuring the same categories in a model based on a different precursor hierarchy. Thus, the onset values of the four categories in the Portuguese speaker data were configured as initial values in a model specifying syntactic complexity (subordination/clause ratio) as the first precursor (A), followed by lexical accuracy (correct lexical use), lexical complexity (general word variation), and syntactic accuracy (the correct word order ratio).

[Figure 43: four panels (Data, Linear trends, Spline, Model), plotting lexical (general word) variation, correct lexical use, subordination/clause ratio and correct word order per clause on a 0 to 1 scale against weeks 1 to 34; the Model panel labels its series Level A to Level D]

Figure 43. Data, linear trends, spline and optimized model, based on a different precursor hierarchy, for the Portuguese speaker case study

Figure 43 shows the model outcome, together with the data, its linear trends and spline. It is evident that the model differs greatly from the data and its trends. In fact, the only similarity can be seen in the accuracy indexes, correct lexical use (Level B) and correct word order (Level D), which retain some resemblance to the data. This might be because in the ‘scrambled’ hierarchy, as in the original model and the hypotheses that inform its configuration, lexical accuracy still precedes syntactic accuracy. In the original model, these two parameters are linked through syntactic complexity, whereas in this version, they are linked via the general word variation, representing lexical complexity, which is specified as dependent upon lexical accuracy and as precursor to syntactic accuracy. It is possible that because the original precedence of lexical accuracy to syntactic accuracy is maintained, as well as their link via another category, the general shape of these categories does not differ from the data as much as that of the other levels.

5.5 Summary

Recent developmental studies of complexity and accuracy (Spoelman & Verspoor, 2009, in press) and of the lexical and syntactic dimensions (Verspoor et al., 2008b) of L2 writing performance have revealed alternating variability patterns across these divides, associating these findings with dynamic precursor interactions. Following these studies, the current study has stipulated that indexes representing lexical and syntactic complexity and accuracy in L2 writing would show growth and variability patterns that indicate complex precursor interactions, in which the growth of one level offsets that of another. However, a basic premise of the precursor model is not only competition between co-developing components, but also mutual support. Therefore, the first main prediction of the present study was that data trajectories, and particularly temporal variability patterns, would exhibit shifts that suggest support as well as competition. The second key hypothesis was that a model of hierarchical precursor interactions, in which lexicon precedes syntax and complexity precedes accuracy within each of these domains, would effectively simulate the data.

The complexity-accuracy and lexical-syntactic distinctions have both been explained by two competing accounts: the dual-systems model, associated with generativist theory, and the single-system model, propagated by cognitive linguistics. The first model attributes the developmental asynchronies between complexity and accuracy and between lexicon and syntax to separate knowledge types; the second claims that their competition for cognitive resources yields nonparallel development. Yet so far, complexity and accuracy and lexicon and syntax have not been systematically combined into a single paradigm by either approach. Likewise, the dynamic perspective has also addressed each distinction separately.
The aim of the present study was to merge both distinctions into a single paradigm of L2 writing performance, to which it applied the dynamic perspective. The theoretical motivation for this aim relied on the fact that both the dual-systems and the single-system model refer to the two distinctions as arising from the same mechanisms. Moreover, within each distinction, there is a logical internal order:
complexity, the use of varied and sophisticated linguistic features, needs to precede accuracy in order to enable it; likewise, lexicon must precede syntax to enable the conjoining of words into syntactic units. For the purpose of investigating this combined paradigm of L2 writing performance, the study traced the longitudinal development and interaction of four categories: lexical complexity, lexical accuracy, syntactic complexity and syntactic accuracy. These constructs were represented by indexes extracted from essays written weekly by four advanced EFL learners with diverse language backgrounds, who followed different English-speaking academic programs during the data collection period. The development of these indexes was contrasted by inspecting their growth trajectories and variability patterns in various combinations. Although the analyses were mostly focused on one case study, findings were also re-addressed in the data of the three other participants. Due to predictable inter-individual variability, these findings were frequently nonparallel, but they allowed general similarities to be identified across the different datasets. The first step in the analyses concentrated on the lexical dimension, contrasting complexity and accuracy in two general measures: general word variation and correct lexical use. It revealed shifts between parallel and oscillating variability patterns in these indexes, indicating only a weak negative interaction. The finding that increased lexical complexity, as manifested in word variation, does not necessarily translate to increased accuracy in lexical use is congruent with previous cross-sectional findings (Lennon, 1996). It also supports the general notion of tradeoff or competitive interactions between complexity and accuracy (Skehan & Foster, 1997; Wolfe-Quintero et al., 1998). The study then proceeded to inspect two general measures of syntactic complexity and accuracy: the ratios of subordination per clause and word order per clause.
In the main case study data, growth and variability patterns were mostly parallel across these indexes. Similar parallel patterns were observed in aggregations of more local measures of syntactic complexity and syntactic accuracy. These findings were contradictory to those reported by previous studies, although it should be noted that these studies referred to spoken L2 (Foster & Skehan, 1997; Skehan & Foster, 1996). Next, the lexical and syntactic dimensions were juxtaposed in two pairs of lexical complexity and syntactic complexity indexes: the complex word and clause/sentence ratios, and the general word variation and subordination/clause ratios.

This procedure revealed patterns of alternating growth across these paired lexical and syntactic indexes, which were similar to patterns previously identified as manifestations of precursor interactions between lexicon and syntax (Verspoor et al., 2008b). Finally, growth trajectories and variability patterns were contrasted between four indexes – general word variation, correct lexical use, subordination/clause ratio and correct word order per clause – which denote complexity and accuracy within the lexical and syntactic dimensions. This procedure revealed inverse patterns across combinations of lexical and syntactic indexes, which were particularly salient between lexical accuracy and syntactic complexity. The study then evaluated the theoretical explanation of alternating variability patterns as expressions of competitive precursor interactions, and the explanatory potential of the precursor model with regard to development across these writing categories. This was done by using a mathematical model to simulate the data. The model specified lexical complexity as a precursor to lexical accuracy, which is in turn a precursor to syntactic complexity, itself a precursor to syntactic accuracy. Between these precursor-dependent dyads, there are bidirectional interactions by level. In line with the data and variability analyses findings, the overall lexical complexity-accuracy interaction was posited as weakly competitive; the interaction between lexical accuracy and syntactic complexity as strongly competitive; and the syntactic complexity-accuracy interaction as mainly supportive. These hypotheses were also in line with previous findings that identify lexicon as a precursor to syntax and the two dimensions as distinctly competitive in early L1 (Robinson & Mervis, 1998).
Moreover, as indicated in section 5.1.3, the role of lexical complexity as a conditional precursor to lexical accuracy may be stronger than that of syntactic complexity with respect to syntactic accuracy. This is because lexical accuracy relies more explicitly on lexical complexity, in the sense that the former is determined by correct collocation and context-appropriateness, both of which require the knowledge of varied and sophisticated vocabulary. Conversely, avoidance of complex syntactic forms does not necessarily result in readily-apparent syntactic error. In other words, it is possible that lexical complexity and accuracy need to manifest in unison and thus exhibit a stronger competition for resources, whereas syntactic complexity and accuracy might not compete as distinctively.

The optimized model versions achieved a very good fit with the data, in particular for the main case study, the Portuguese speaker. For two other case studies,
the Mandarin and Indonesian speakers, there was also a relatively good fit, especially visually. However, the optimized model failed to achieve a good fit with the Vietnamese speaker data. For the Portuguese speaker, the optimized model values confirmed the hypotheses concerning the general interactions between the levels in the precursor hierarchy, as derived from the variability analyses. For the Mandarin and Indonesian speakers, these hypotheses were partially confirmed. In the case of the Vietnamese speaker, the values of the relational control parameters were mostly incongruent with the predictions. This may be due to the distinction of this participant, who was generally more proficient, and slightly older than the others. Nevertheless, since L2 writing development is highly varied and individuated, the fact that the general premises of the model yielded a good fit in three out of four cases should be considered as evidence for its applicability to this type of data. For all the model versions in this study, the data simulation was based primarily on the precursor hierarchy (order parameters) and the values assigned to its internal interactions (relational control parameters). This is because the carrying capacity (K) was uniformly set to a value of 1, the threshold (precursor) values were configured as a uniform value, and the initial values were based on the empirical onset values. Thus, the support of the hypotheses via the model is strengthened: if the order parameters, K, or the threshold values were not uniform across the different versions, but rather optimized to fit the different datasets, it is likely that better model fits would have been achieved. However, these would not show how the data growth patterns emerge from the hypothesized hierarchy and interactions with the same degree of confidence.

5.6 Discussion

Through the data analysis and modeling procedures, this study has been able to support its key hypotheses. First, that the dimensions of lexical complexity, lexical accuracy, syntactic complexity and syntactic accuracy are interconnected in a hierarchical order of precursors and dependents, as suggested by previous studies (Robinson & Mervis, 1998; Verspoor et al., 2008b). Second, that the nature of the interactions between these dimensions can be deduced from variability patterns in the data (Larsen-Freeman, 2006b; Spoelman & Verspoor, 2009, in press; Verspoor et al., 2008). Regarding these interactions, the study postulated nonparallel patterns between complexity and accuracy that may be due to competitive trade-off relations (Mehnert,
1998; Skehan & Foster, 1997). These in turn stem from the inherent limitations on various aspects of cognitive capacities (Baddeley, 1990; Robinson, 2005; Skehan, 1998; Skehan & Foster, 2001; VanPatten, 1990), as well as from the structural order of complexity as a predecessor to accuracy (Ellis, 1996; McLaughlin, 1990). Concerning the lexical and syntactic dimensions, the study hypothesized similar nonlinear and nonparallel developmental patterns, likewise stemming from a combination of cognitive capacity limitations and structural constraints, with lexicon acting as a prerequisite precursor for syntactic growth (Marchman & Bates, 1994; Robinson & Mervis, 1998; van Geert 1991, 1993; Verspoor et al., 2008b). More specifically, the study conceived of a four-category hierarchy in which complexity is a conditional precursor for accuracy in both the lexical and syntactic dimensions, and lexicon is a general precursor of syntax. These hypotheses do not explicitly refute the declarative/procedural knowledge distinction, or its claims concerning the complexity-accuracy or lexical-syntactic developmental disparities (e.g., Pinker & Ullman, 2002). However, their confirmation via the modeling procedure shows how disparate development of these dimensions can arise without requiring the stipulation of separate mechanisms (e.g., Bates et al., 1995; Marchman & Bates, 1994). Thus, the precursor model can be regarded as an explanatory link between the two distinctions – complexity vs. accuracy, and lexicon vs. syntax – which shows how their interactions unfold and change over time in line with global principles of dynamic development. Discussing the debate between the dual-systems and single-system models, Mitchell and Myles (2004) have observed that it is perhaps more useful to consider these approaches as a continuum that ranges from complete separation to complete unity of mechanisms, since both approaches are comprised of accounts of varied strengths. 
The dynamic approach, rather than being aligned with a specific theoretical stance, can potentially bridge the generativist-cognitive chasm. It focuses not on the specification of the number and nature of mechanisms involved in the development of various linguistic categories, but on how such categories or components of knowledge, whose number is in principle infinite – such as in this case complexity and accuracy in lexicon and syntax – interact and co-influence each other. Whether or not these components can be allocated to distinctly separate processing units is a moot point from the dynamic perspective, since even dual-systems accounts need to posit an intermediary mechanism as connecting the two systems (Anderson, 1993; Ullman, 2004). Moreover, whether one or more mechanisms are responsible for linguistic
knowledge, and whether or not it or they are language-specific, components of language (or any mechanisms they are allocated to) must invariably rely on limited cognitive, motivational, educational and environmental resources. The limitations on these resources determine the course of development, which obeys the same generic principles that pertain to virtually every natural phenomenon (van Geert, 1993). There is no reason to assume that while the biological and physiological mechanisms of the human organism are shaped by these simple dynamic principles, their cognitive or linguistic subsystems are not (Spivey, 2007). In sum, the dynamic approach to SLA poses a different set of questions, which concern dynamic rather than static interactions and unified rather than fragmented accounts (de Bot, 2007). By addressing the hypotheses in the study, some more general premises are supported. These concern the issue of nonparallel growth across aspects of proficiency (Young, 1995), and the nonlinear properties of error, particularly in L2 (Foster & Skehan, 1996; Larsen-Freeman, 2006a; MacKay, 1982; MacWhinney, 2006), as well as the value of variability for understanding of processes in L1 and L2 development (de Bot et al., 2005, 2007; Larsen-Freeman, 1997, 2006a, 2006b, 2007; Larsen-Freeman & Cameron, 2008; van Geert & van Dijk, 2002). With regard to this role of variability, the study distinguishes two types of interactions in the empirical data. The first type is surface interactions, evident in linear trends and correlations. The second type is underlying interactions, exposed by variability analyses and used to configure the model. Previous studies applying the dynamic approach to longitudinal L2 writing development have interpreted variability patterns as manifesting such otherwise-obscure interactions, relating them to the precursor model, but have not tested these interpretations in a simulation (Spoelman & Verspoor, 2009, in press; Verspoor et al., 2008b). 
The current study supports these interpretations of variability as a valuable manifestation of internal systemic dynamics by two related outcomes of the modeling procedure. First, the optimized model values, while producing outcomes that fit the data, match the interactions discerned from its variability patterns. Second, the model outcome fits the data in terms of reproducing its surface interactions, as well as its growth trends and values. This is particularly evident in the main case study. Although the model in the current study inspects the development of writing performance as arising from hierarchical dynamic interactions, the study does not
imply that these interactions are the only variables in the hypothetical “equation” of writing development, or even in the development of the specific categories or indexes included in the study. The literature on L2 writing performance specifies many external influences that come into play in determining its development. Among these influences are unintentional within-task planning (Yuan & Ellis, 2003), length of preparation time (Mehnert, 1998), learner strategies (Ortega, 2005), type of planning and task type (Foster & Skehan, 1996; Skehan & Foster, 1997). In the current model, all of these potential influences are collapsed under the umbrella term of resources (expressed in the carrying capacity). The model can demonstrate the effect of internal interaction on performance given general unspecified limited resources, but invariably assumes that these resources are unchanging. A dynamic-oriented L2 writing study focusing on the effect of one or more of the factors that constitute these resources would require a revised model setup. Moreover, any precursor model is inherently hierarchical, and thus not suitable for simulating data that does not display a clear and logical structural order. While modeling additional dimensions of text-level writing performance (and other levels of writing) is beyond the scope of this study, it is very likely that the growth and interaction of other dimensions is an inseparable and hidden aspect of the patterns recorded in this study. The unaccounted variability in the data may reflect changes in available resources, additional influences, external factors, or the inherent fuzziness of linguistic categories. Another possible drawback of the precursor model is that it was originally conceived to simulate beginner-level development, which starts at seed value (Fischer, 1980; van Geert, 1991). 
This is obviously not the case concerning the participants in the current study, and thus the model simplifies the interactions between the writing categories (for example, by configuring uniform threshold precursor and carrying capacity values, and by including only interactions between adjacent levels in the precursor hierarchy). Nevertheless, the model hopefully illustrates the potential of the dynamic approach for illuminating the dynamics that underlie this specific domain of L2 development, of which we may only be intuitively aware. As this and numerous previous studies have demonstrated, L2 development, and writing development specifically, varies greatly across individuals (cf., Cumming, 2001). An iterative mathematical model can incorporate some of this individuation, but not all of it, as the failure to achieve a good fit with one of the case studies showed.

Although the study is not comparative, it is quite possible that a model tailored for each case study in terms of relational control parameters, obtained from more thorough analyses of each dataset, would achieve improved results. Despite the growing recognition of individual differences in psychological and linguistic research, cross-sectional studies are dominant in SLA research. Such studies have been limited in enabling insight into temporal acquisition processes beyond the averaged group level at fixed time points. Yet they are invaluable in many respects, among them in highlighting directions for future longitudinal studies that can explore their generalizations in individual learners. While language acquisition always occurs as a temporal and singular process, individual learners are always embedded in populations. Achieving a balance between the cross-sectional perspective and the longitudinal approach calls for combining group analyses and case studies, and central trends and variability within the two study types, as complementary rather than mutually exclusive sources of information. Hopefully, this small-scale and individualized study has managed to supplement the broad picture derived from the cross-sectional and generalized studies that provided it with information and motivation.

Chapter 6 General summary and discussion

6.1 Summary

This PhD project is concerned with applying the principles of Dynamic Systems Theory (DST) in empirical research of L2 development. DST is an interdisciplinary descriptive framework of development in natural systems. Across such systems, it identifies simple and universal principles of growth and interaction, which lead to increasingly, and in essence infinitely, complex outcomes. In the last decades, DST has had profound implications for the study of human development, including that of language (van Gelder, 1999). The dynamic approach considers language as an ensemble of interacting elements, which in turn interact with the environment on all levels. The environment, in this case, is comprised of the internal resources of language users and the external resources of their surroundings. Language, whether as a developing structure in individuals, or as a shared entity in communities, is not a closed and entropic system, and does not settle into stasis unless it has become extinct (Larsen-Freeman, 2006a; Spivey, 2007). Macro-level language change results from micro-level changes in the language systems of individual users. Within these systems, linguistic knowledge is similarly comprised of nested subsystems, which are in turn compounded of similar nested components. This nested hierarchy can also be seen in the processes involved – the process of language change and variation is comprised of myriad processes of individual language use and acquisition, which are in turn compounded from the co-development of knowledge aspects, and so forth towards minute detail at every level of language knowledge and time scale, from millennia to nanoseconds (Spivey, 2007). All of these developmental processes are simultaneous and ongoing. They reflect, on one hand, a codependency of developing elements within the structural hierarchy of language, in which certain structures are conditional for the development of higher, dependent structures. On the other hand, these processes also incorporate a seemingly-paradoxical competition for the limited external and internal resources available to the developing system.

This generic paradigm of interaction under the constraints of structural order and resource limitations is known as the precursor model. It is pertinent to an array of natural and developmental phenomena (Fischer, 1980; van Geert, 1991). When language is viewed from this perspective, intra-learner variability in performance and what Larsen-Freeman calls “indeterminacy in users’ intuitions” (2006a, p. 195) can be explained as meaningful expressions of shifts in precursor interactions, rather than as mere measurement error. In this view of variation, the dynamic approach diverges considerably from the predominant approach to applied linguistics, which focuses on linear and statistically-significant group effects. There are numerous versions of the basic precursor model scheme, depending on the phenomena at hand. Many of these can be simulated by simple coupled logistic equations. In the domain of language development, such simulations have been applied to several aspects of L1 acquisition, among them the nonlinear and non-uniform growth of early lexicon and syntax (Robinson & Mervis, 1998; van Geert, 1991), and the sequential progression of early speech from single- to multiword utterances (Bassano & van Geert, 2007). The studies in this thesis apply the empirical dynamic approach, which combines variability analyses and simulations based on the precursor model, to two areas of L2 development. The first area is vocabulary knowledge across a continuum of receptive-productive modalities; the second is text-level writing performance, reflected in the complexity and accuracy of the lexical and syntactic dimensions of writing. The studies have several features in common. First, both apply the dynamic perspective to combinations of two dichotomous paradigms in their respective areas of interest. In each study, the two sets of distinctions are merged into four-level continuums. In the vocabulary study, two paradigms of knowledge – word recognition vs. recall, and controlled production vs. free production – are collapsed into a single continuum of receptive-productive knowledge modalities. In the writing performance study, the distinction between writing complexity and accuracy and the distinction between the lexical and syntactic dimensions of writing are merged into a hierarchy of four categories: lexical complexity, lexical accuracy, syntactic complexity and syntactic accuracy. In each study, the sequencing of categories in the hierarchy was informed by previous findings as well as by common sense considerations. The two studies utilize elaborate longitudinal data from four case studies of advanced ESL learning in immersion conditions. They inspect the central data growth
trends and the variability patterns around them, juxtaposing pairs of indexes, in order to reveal patterns that suggest complex precursor interactions. The studies then employ models of precursor interactions to simulate the data, based on van Geert’s (1991, 1993, 1994, 2003) extensive work on mathematical descriptions of various precursor interactions. Each version “consists, first of all, of a specification of how the components involved in the system affect one another in terms of resource functions. (…) second, it specifies the initial conditions of each component, and third, the eventual conditional dependencies among the components” (van Geert, 2003, p. 664). This blueprint consists of a hierarchy of elements, across which there is a conditional threshold value for support from a component that emerges earlier, i.e., the precursor, towards its dependent, which in turn tends to compete with the precursor. To this blueprint, variations can be added, such as unidirectional or bidirectional competition or support from the dependent to the precursor, further conditional thresholds, developmental delays, and so forth. In the two studies, the internal hierarchy of the precursor models, described by order parameters, was determined in accordance with the background literature on the respective area of SLA and findings from the preceding analyses. The interactions within this hierarchical order were then specified on the basis of the analyses outcomes. The studies distinguished two types of interactions in the data. The first is surface interactions, evident in correlations between the data components. The second is underlying or hidden interactions, revealed by variability around the central growth trends of the data and temporal shifts in the data correlations. This latter type of interaction was used to configure the relational control parameters of the model, which specify the interactions between its hierarchical levels. 
The two studies hypothesized that these underlying interactions, when iterated in the models, would yield growth trends and surface interactions that match those of the data. This hypothesis was in line with previous studies of L2 writing development, which have identified alternating variability patterns as manifesting complex or competitive interactions and associated these interactions with the precursor model, but have not verified their interpretations in simulations (Larsen-Freeman, 2006b; Spoelman & Verspoor, 2009, in press; Verspoor et al., 2008b). In both studies, the hypothesis that the data variability is a meaningful manifestation of internal systemic dynamics was supported by two related outcomes
of the modeling procedure. First, when the relational control parameters of the models were optimized in order to obtain the best model fit to the data, their values matched the interpretations of the variability analyses results. Second, the optimized model outcomes fitted the data not only in terms of matching its growth trends, but also in terms of reproducing its correlation matrix, i.e., surface interactions. This was particularly evident in the main case study, a native Portuguese speaker, whose data provided the basis for the deduction of those interactions. While the studies were not intended to compare the four participants, it is possible that the models would have achieved better fits to the three other case studies if similar detailed analyses of each dataset were conducted, informing the configuration of their respective model version. Due to time constraints, this could not be carried out as part of the current project. However, the fact that the general precursor hierarchy, as well as the overall interactions within it, could be extended from the main case study to the others quite successfully in both studies (with the exception of one participant, a Vietnamese speaker) is a strong statement in favor of the generalizability of the precursor model, at least to learners in similar circumstances and of a similar proficiency. The issue of generalizability is readdressed in the Discussion section of this chapter.
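
The precursor blueprint summarized in this section can be made concrete with a small simulation. The following Python sketch is an illustration only, not the model used in the two studies. It chains four coupled logistic growers, lets each level grow only after its precursor has passed a threshold, and adds support from the precursor and competition from the dependent between adjacent levels, with uniform values throughout. The function names, the parameter values and the exact form of the support and competition terms are assumptions made for the example.

```python
def simulate_precursor_chain(steps=300, n_levels=4, rate=0.1, capacity=1.0,
                             threshold=0.3, support=0.02, competition=0.02,
                             seed_value=0.01):
    """Iterate a chain of coupled logistic growers; return one row per step.

    All parameter values are illustrative assumptions, not fitted values.
    """
    levels = [seed_value] * n_levels
    history = [list(levels)]
    for _ in range(steps):
        updated = []
        for i, value in enumerate(levels):
            # A level grows only once its precursor has passed the threshold.
            unlocked = i == 0 or levels[i - 1] >= threshold
            change = 0.0
            if unlocked:
                # Logistic growth limited by the carrying capacity (resources).
                change += rate * value * (1.0 - value / capacity)
                if i > 0:
                    # Support from the precursor, proportional to its level.
                    change += support * levels[i - 1] * value
            # Competition from the dependent, once this level is high enough
            # for the dependent to have started growing.
            if i + 1 < n_levels and value >= threshold:
                change -= competition * levels[i + 1] * value
            updated.append(max(0.0, value + change))
        levels = updated
        history.append(list(levels))
    return history


def first_step_above(history, level, cutoff):
    """Return the first step at which a level reaches the cutoff, else None."""
    for step, row in enumerate(history):
        if row[level] >= cutoff:
            return step
    return None
```

Calling simulate_precursor_chain() and then first_step_above(history, level, 0.5) for each level gives the step at which that level first reaches half of the carrying capacity. With these illustrative defaults the levels take off one after another, in the order of the hierarchy, while a dependent level stays at its seed value until its precursor crosses the threshold.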

6.1.1 Differences between the vocabulary knowledge and writing performance models

So far, the models in the two studies were described as uniform constructs, according to the basic principles that informed them. However, there are certain differences between the two model versions. The vocabulary knowledge model configured only bottom-up interactions, from more-receptive to more-productive knowledge levels. Backward interaction, although assumed, was omitted from this model. This was done since the phenomenon that it aimed to capture, the receptive-productive gap, is defined as a unidirectional lack of transfer from receptive to productive knowledge (e.g., Melka, 1997). Thus, although it is very likely that the interaction between receptive and productive vocabulary knowledge is more complex than feed-forward alone (cf., Clark, 1993), it was decided to simplify the model in order to test the applicability of its most basic premises to the issue at hand: the lack (or rather, nonlinearity) of transfer from lower to higher vocabulary knowledge levels. Conversely, the performance categories included in the writing study – lexical complexity, lexical accuracy, syntactic complexity and syntactic accuracy – were not
as distinctively hierarchical as the vocabulary knowledge paradigm. The operational definitions of these categories involved indexes denoting pre-acquired skills, such as the ratios of subordination use or correct word order per clause. Moreover, these indexes, unlike the vocabulary knowledge levels, which are all related to the same set of words, were not necessarily associated with the same linguistic structures. Thus the model pertaining to this data incorporated feedback interactions (dependent to precursor) besides the basic feed-forward ones. This rendered the model more complex, and indeed achieved a better fit to the data (manifested in smaller sums of squared residuals) than the vocabulary knowledge model. Another difference between the model versions was that the interactions in the vocabulary knowledge model were both by change and by level, while those in the writing performance model were by level only. This was in line with the idea that actively-acquired skills, as depicted in the vocabulary study, may exhibit simultaneous support and competition, which would be related to the current level of these skills and to the effort spent on their learning process, respectively. On the other hand, it was hypothesized that since the writing performance categories were pre-acquired (albeit not stable) forms of knowledge, their interactions would be directly related to their values rather than to a process of acquisition. The differences between the model versions demonstrate that despite the apparent simplicity of dynamic modeling, it allows for the incorporation of numerous considerations, which should be theoretically and empirically motivated. Since both studies yielded a good model fit, as well as a confirmation of the interpretation of their preceding variability analyses as manifesting dynamic interactions, they can be considered fruitful.
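
The distinction between interactions by level and by change can be illustrated with a single update step. The Python sketch below is a hedged example under stated assumptions, not the equations of either study: support is taken to be proportional to the current level of the precursor, and competition to the precursor's most recent change, standing in for the effort spent on learning it. The function name, parameter names and values are hypothetical.

```python
def dependent_step(dependent, precursor, precursor_previous,
                   rate=0.1, capacity=1.0,
                   support_by_level=0.05, competition_by_change=0.5):
    """Advance a dependent grower by one step (illustrative assumptions only)."""
    # Logistic growth limited by the carrying capacity.
    logistic = rate * dependent * (1.0 - dependent / capacity)
    # Interaction by level: support scales with the precursor's current value.
    by_level = support_by_level * precursor * dependent
    # Interaction by change: competition scales with how much the precursor
    # has just grown, i.e. with the resources spent on acquiring it.
    by_change = -competition_by_change * (precursor - precursor_previous) * dependent
    return max(0.0, dependent + logistic + by_level + by_change)
```

Under these assumptions, a dependent at a given level gains less in a step during which its precursor is still growing than in a step during which the precursor is stable, because the by-change term vanishes when the precursor does not change. Setting competition_by_change to zero leaves a model with interactions by level only.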
The topics which they addressed – vocabulary knowledge and complexity and accuracy in the lexical and syntactic dimensions of writing – have lent themselves fairly easily to dynamic modeling. This is because these areas have relatively straightforward and intuitively appealing orders (but see de Bot & Lowie, 2010; Elman, 1995, 2009), which are well-supported by previous findings and theory. Despite conceptual and operational disagreements between various studies of L2 vocabulary knowledge, there is a general agreement that receptive vocabulary knowledge precedes and is prerequisite for productive knowledge (but see Clark, 1993). Likewise, within writing performance, the precedence of complexity to accuracy and of lexicon to syntax is assumed by most researchers, with ample
supportive evidence (Anderson, 1993; Marchman & Bates, 1994), even while its causality is strongly debated. The two present studies thus illustrate how, in the dynamic approach, “descriptions and explanations of patterns of behavior derive from ‘robust and typical properties of the system’” (Kauffman, 1995, p. 19, as cited in Larsen-Freeman & Cameron, 2008, p. 26). Such properties are often widely documented in the relevant literature, in light of which the dynamic approach should be applied. The studies in this dissertation have hopefully illustrated the potential contribution of DST to applied linguistics: its ability to reconcile disparate and apparently conflicting accounts while merging fragmented areas of language development and extending static theory to time-dependent development. The following section discusses this contribution and its limitations from a broader perspective.

6.2 Discussion

As the present studies have demonstrated, DST is a universal rather than domain-specific paradigm. Because of its generic nature, DST can be applied to an area only when that area has a relevant empirical and theoretical tradition. In applied linguistics, theories are predominantly derived from cross-sectional studies that make use of linear analyses such as product-moment correlations. Such studies are invariably restricted in what they can portray, in the same way that structural theories of language development are limited in comparison with process theories and cannot be viewed as describing development, unless in its crudest form. The dynamic approach and dynamic models in particular are valuable in supplementing static models and linear and cross-sectional empirical results. Spivey, discussing the complementarity of dynamic modeling and experimental studies, claims that [experimentation] is like a microscope providing a particular two-dimensional perspective, or sketch, of the full three-dimensional structure of perception and cognition. These two-dimensional peeks into the actual system of interest provide useful pictures of its function, but they lack the full volumetric feeling provided by a simulation (2007, p. 82).

Variability analyses of longitudinal data provide a bridge between linear and cross-sectional analyses and abstract dynamic modeling, because they enable a glimpse into the interactions that underlie the surface structure of the data. Following
variability analyses with dynamic modeling can potentially reveal these underpinnings of the data surface structure, by confirming the interpretations of such analyses. Yet even at its highest level (which is, needless to say, considerably more sophisticated than any model included in this thesis) mathematical simulation is restricted by the limitations of human abstraction. While mathematics (and its extension in computation) is the most accurate means available for describing real life phenomena, it still imposes restrictions on an infinitely complex reality. Likewise, dynamic models may not share the reductive pitfalls of linear analyses when describing development, but they have their own set of limitations. As the renowned physicist Landauer discovered, the noise reveals the underlying interactions that constitute the averaged signal (1998). Rather than obscuring information, the noise is the information; therefore its elimination is reductive. In the context of linear analyses, noise is a divergence from the central trends. However, noise reduction also takes place in dynamic studies, driven by the need to establish a clear hierarchy in their data. This call for order might be considered a disadvantage: when the data does not present a discernible hierarchy, precursor models are less likely to achieve an adequate fit (although it is still possible to depict dynamic interactions on the basis of connected grower models that are not bounded by precursor interactions). In this, the empirical dynamic approach appears somewhat conflicting with certain aspects of the theory that motivates it. To use the current studies as an example, a more complete account of development in the domains which they address would include additional variables that do not necessarily comply with the hierarchies depicted in the respective models. DST emphasizes the nested and complex nature of language and the myriad cross-interactions within it; yet in this respect, the current studies (and indeed any empirical study, dynamically-oriented or otherwise) are lacking, since they focus on easily discernible and quantifiable features of the data. Thus there appears to be an inherent paradox in empirically applying the dynamic approach. Theoretically, the dynamic perspective attends to practically infinite complexity. Empirically, data measurements require a reduction of this complexity to simple fragments, and stochastic dynamic modeling requires a focus on markedly ordered phenomena, collapsing such measurements further into several routes that comprise a relatively simple hierarchy. The empirical dynamic approach
thus not only simplifies the developmental phenomena that it investigates, but also its own theoretical fundamentals. Addressing this apparent discrepancy, Larsen-Freeman asks “how is it possible to respect the interconnectedness of all things and still conduct practical investigations?”, and replies: the answer, I believe, lies in finding the optimal interconnected units of analysis depending on what we are seeking to explain. Even if everything is connected to everything else, this should not lead to a paralyzing holism (2007, p. 37).

In other words, the conflict between the theoretical and empirical dynamic approach may be resolved when we consider that, precisely through reduction of data to a restricted set of parameters in a dynamic model, it is possible to show how its complexity arises. To give an example from the current study, performing correlation or factor analysis across the numerous indexes denoting each of the four writing categories (all but one of which were omitted from the model, which includes only one index per category) results in a dense web of interactions that eludes the perception of structure. Inspecting variability in the context of a limited number of components is potentially informative; attempting to document all co-developing components, even linearly, is missing the trees for the forest, to paraphrase Ellis (2007). Nevertheless, the current studies have several drawbacks that relate to these reductive constraints. First, vocabulary knowledge and writing performance are treated as distinct areas by the bulk of the literature, with studies usually focusing on either one or the other, including those in this dissertation. However, even without regarding language as a complex dynamic system in which all aspects are interconnected, there is a strong interdependence between these areas (Laufer, 1998; Laufer & Nation, 1995). Taken together, the link between the two studies in this thesis is clear. Complexity and accuracy are embedded in the vocabulary knowledge continuum, whereas vocabulary is always used in conjunction with syntactic structures. A more sophisticated model is needed in order to capture the two dimensions in interaction. Such a model needs to rely on more cohesive theory, which does not view the constructs of vocabulary knowledge and writing performance in isolation. Yet at present, such a unified theory is unavailable. 
Even dual- or single-system accounts discuss the dichotomies between complexity and accuracy and between lexicon and syntax separately, despite maintaining that the same mechanisms
account for both disparities (an approach which, as noted in Chapter 5, results in internal contradictions in the dual-systems account). Thus, due to both methodological and theoretical “resource limitations”, and in order to compare its outcome with the research traditions in the vocabulary and writing performance areas, the current study did not converge these two subsystems of L2. Such an investigation remains as a future pursuit. Likewise, it is very likely that zooming in on the finer detail of these areas, for example on the subcomponents of writing development indexes, or on interactions between different measures of each category (e.g., syntactic complexity), would reveal more intricate dynamic processes and mechanisms. These have already been suggested by previous studies that revealed nonlinear interactions across syntactic complexity (Bardovi-Harlig, 1997, 2000; Ortega, 2003) or lexical complexity indexes (Bulté et al., 2008; McClure, 1993). The scope limitations of the current study echo a general built-in limitation of researching human and particularly language development. Components or subsystems are singled out for analysis, and regardless of how thorough and inclusive the research may strive to be, it is virtually impossible to account for all influences and cross-influences on development. As Thelen and Smith state, in reference to the innate-environmental acrimony in developmental psychology: There is no plan. We posit that development, change, is caused by the interacting influences of heterogeneous components (…). These are not encapsulated modules; indeed, development happens, behavior is fluid and adaptively intelligent because everything else affects everything else (1994, p. 338).²⁸

28 “Plan” and “goal” should not be confused: a learner may have a goal when learning a language, but their language development does not.

Recent dynamic accounts of SLA stress the inseparability of context from learning, or of interaction from uptake (cf., Ellis & Larsen-Freeman, 2009), while an earlier application of the dynamic approach to this area has emphasized interactions within the multilingual system (Herdina & Jessner, 2002). However, as Spivey (2007) notes, it is possible to zoom in on the language acquisition process at different scales. By focusing on a particular skill (i.e., writing), at a particular level (i.e., textual) and on specific dimensions of that level, from which individual features are extracted as representative, general growth and interaction patterns can still be observed and tested. Following this line of thought, the current research has chosen to focus only on
processes that occur within the boundaries of its phenomena of interest. It has therefore grouped all influences under the nonspecific classification of (limited) resources. Investigating the specific effects of various cross-linguistic influences in a dynamic framework is a further recommendation for future studies. However, if a model based on iterating a small set of data-internal interactions has achieved a good fit to the data, as in the current studies, this does not imply that the factors which have been omitted from it are dismissible. It simply means that within the particular nested subsystem, growth over time involves constant interaction between these connected components. On other levels in the system, other interactions take place on different scales and time frames. In other words, even without accounting for, nor explicitly negating, the effect of factors beyond a given specific language subsystem, their influences remain inextricable from the internal dynamics exposed by a given study. This simultaneous, complex contribution of numerous factors and integrated co-influences to SLA is increasingly recognized by applied linguistics: What seems to be emerging is that there are numerous factors that guide second language acquisition. They can be investigated in isolation and their significance can be determined, but they should also be investigated as interacting and converging factors to truly see how they operate in the learning of a second language (Gass, 2004, p. 88).

The current research acknowledges those numerous influences, but investigates a common denominator of ecological-dynamic principles across learners and areas of language. In the studies, given some form of unspecified environmental support and input – assumed to be relatively constant due to the academic immersion conditions – this flow of unspecified influences from beyond the system would form a part of the general resources, which determine the carrying capacity of its components. Thus, growth is bound to occur as a function of the availability of resources, but its course and shape are determined by the internal interactions, which in turn are a function of the limitations of the same resources. This does not imply that change in a given external factor would have no impact, but that the internal dynamics of the system would determine the eventual effect of such change, rather than its direct external influence. From this vantage point, the separation of the learner, with his or her unique circumstances and characteristics, such as L1, from the learning is redundant. The self-organization patterns of the system, as identified in dynamic

analyses and tested by dynamic simulations, already incorporate all of these influences, and can adequately account for development without specifying (or negating) their contribution (de Bot, 2007; de Bot et al., 2005). Thus, although the reduction of the learning process into several hierarchical parameters may appear restrictive, it can be considered as a gateway towards understanding the general patterns that underlie unique and varied development. In a paraphrase on van Geert’s (1993) island metaphor, each learner, or the interlanguage thereof, is an island, within which unique variation emerges as a result of uniform principles: natural order and the resource limitations. While such concurrence of universal principles and unique variation may seem at odds, discrete instability and variability of performance need not be seen as a threat to the notion of systemacity in (inter)language. Thus, if we view as complex dynamic systems, then the question of whether an interlanguage is systematic or variable no longer arises, and we can concentrate on how to find the systematic patterns in variability (Larsen- Freeman & Cameron, 2008, p. 21).

It should also be accepted that any effect in development has numerous and often inseparable causes. Therefore, competing explanations of SLA may reflect the complexity of causality in L2 development, rather than a simple dichotomy of correct vs. misguided theories. Since myriad influences operate on the process of language acquisition, there will likely always be conflicting explanations of this process. In other words, “there will never be ‘a’ theory of SLA. Instead, most likely there will be multiple theories and models that account for different aspects of SLA” (VanPatten et al., 2004, p. 20).

6.2.1 Implications

The outcomes of longitudinal studies which incorporate a high degree of variability cannot be generalized to other learners. However, the dynamic approach relinquishes the attempt to predict development. Rather, it aims to describe development retroactively (Larsen-Freeman & Cameron, 2008). Even in retrospect, DST can explain only a part of the variability in L2 development. Indeed, not every type of variability can be accounted for, or in fact need be. As Lowie et al. claim: Once we appreciate the dynamics of language development in general, we can also accept that not all humans will show the same [learning] behavior even under seemingly similar circumstances (2009, p. 128).

Therefore, the pedagogical implication of the dynamic approach is that L2 development is to a large degree unpredictable, at least when it comes to linear cause and effect. While the inherent variability of longitudinal studies implies that their results cannot be extended to the learner population, what can be generalized is the fact that growth trajectories are highly individuated. If we can discern the key interacting factors, we may be able to guide the process to some extent, although this is not guaranteed. For instance, if learners are aware that their receptive vocabulary knowledge does not transfer readily into production, they may choose to explicitly focus on production. If they are further aware of the underlying interactions between receptive and productive knowledge, they may elect to temporarily neglect the acquisition of new structures and focus on producing previously acquired ones. However, if their receptive vocabulary knowledge is relatively small, they may need to expand it before increased production is instigated. Likewise, an awareness of the complexity-accuracy tradeoff in performance may reduce some of the expectations and disappointments involved in the process of SLA, and lead to some practical measures, such as focusing on the required dimension or strengthening its precursor.

Chapter 5 has noted that the current study is inconclusive with regard to the dispute between the dual-systems and single-system accounts of language development. This might be surprising, in light of the fact that the study shows that it is not necessary to assume dual mechanisms in order to account for disparate growth patterns in dichotomous categories of writing performance. The reason for this inconclusiveness lies in the fact that the study nevertheless assumes the categorical distinctions that give rise to the dual-systems models. If there are two (or more) distinct areas in language development, such as lexicon and syntax, which develop incongruently, it is foreseeable that they will be treated as the outcome of different mechanisms. In fact, even when a single-system model is posited (cf. Marchman & Bates, 1994), the arguments that support it rely on juxtaposing the growth of two dichotomous categories, and ignoring any overlap or fuzziness between them. For instance, it is necessary to decide which part of a lexical item has a grammatical function before observing the development of lexicon and syntax separately, regardless of how this development is explained. Likewise, while receptive-productive modalities of vocabulary knowledge are recognized as a continuum rather than discrete categories, they are still researched as the latter by studies that aim to understand the transition between them in actual behavior (e.g., Laufer et al., 2004).


In the case of the current study, dynamic models require categorization in order to simulate different “species”, and cannot incorporate fuzzy logic. Recent theoretical advancements in the dynamic approach strongly challenge this type of categorization, including the lexical-syntactic distinction (Elman, 2009). With regard to the lexicon, such theories dispute the distinction of knowledge levels. Instead, they suggest that word knowledge is “soft assembled” in context from activation patterns encompassing all possible state-space trajectories of representation (knowledge types, meaning, associations, pragmatic considerations, phonological and orthographical patterns and so on), rather than categorically stored in a mental depository (de Bot & Lowie, 2010; Elman, 1995, 2009). This reconsideration of categorization is referred to as The Continuity of Mind (Spivey, 2007), in rejoinder to Fodor’s highly influential book The Modularity of Mind (1983), which has shaped the category-based research paradigm of cognitive psychology. In this vein, Larsen-Freeman and Cameron criticize the “dualistic thinking” (2008, p. 9) of categorizations such as Chomsky’s (1965) performance-competence distinction. They express the concern that such distinctions “obscure insights into the nature of language and its learning rather than facilitate them” (Larsen-Freeman & Cameron, 2008, p. 9). However, the problem is that just as it is impossible to empirically investigate development without some degree of reduction, as mentioned earlier, it is also very difficult to theoretically discuss such development when categories are amorphous. To illustrate this constraint, consider a hypothetical study that investigates walking and running abilities. It is likely that a multitude of cross-influences expressed in processes of minute and continuous change determine the dynamic transition between these seemingly distinct attractor states of human movement. 
Nevertheless, if a dynamic model were to depict walking skills as precursors of running skills (which the former invariably are), the two constructs would need to be addressed as mutually exclusive, even when the precise instant at which walking becomes running is impossible to ascertain. Thus, while recognizing the immense significance of the innovations in the theoretical dynamic approach, the present studies nonetheless assume discrete categories of linguistic knowledge and performance. However, it should be kept in mind that their findings may well embody the elusive non-categorical nature of linguistic knowledge, since some (or most) of the variability from the growth trends remains unaccounted for by the dynamic models in these studies, as mentioned earlier.

Aside from the practical empirical constraints that lead to the assumption of categorical distinctions in these studies, it should be acknowledged that such distinctions, like words in the human mind, arise from the saliency of certain patterns. From a theoretical-philosophical dynamic perspective, the competence-performance distinction is obsolete, yet from the language user’s perspective it is a valid and recurrent experience, as studies of the receptive-productive gap attest. The fact that this gap, or any other prevailing dichotomy, can be regarded as a developmental phenomenon which arises from dynamic interaction over time, rather than as a fixed and linear product, does not diminish its relevance to describing the reality of language acquisition. Thus, categorizations in language structure or acquisition processes are not necessarily invalid, but rather reflect some inherent quality of the process at hand, fuzzy and ambiguous as it may be. Following this line of thought, it is quite possible that the lack of agreement in applied linguistics, not just on the causality of various phenomena, but even on their definition, reflects the fuzzy and impalpable nature of these phenomena. In this context, while reviewing the interface between the generativist and cognitive approaches to the lexical-syntactic distinction, van Hout, Hulk and Kuiken conclude that, in SLA research, “a whole series of contrasts keeps returning: symbolic learning vs. connectionist learning, L1 vs. L2 acquisition, procedural vs. declarative knowledge, structure vs. process, competence vs. performance” (2003, pp. 222-223). The empirical dynamic approach does not supply answers to those contrasts. It does, however, provide a means of transcending them by posing a different set of questions. 
These questions concern not categorization and linear alignment with discrete mechanisms, but development. In other words, “the why and the how cannot always be clearly separated” (de Bot, 2007, as cited in Lowie et al., 2009, p. 138). While some form of categorization is inevitable when we talk about language, the dynamic approach can contribute to our understanding of its origins.

This dissertation has tried to reconcile the theoretical advances in the dynamic approach to SLA with the practice of conducting research. Its precursors have been some inspiring studies (Larsen-Freeman, 2006b; Verspoor et al., 2008; Spoelman & Verspoor, 2009, in press), which have utilized variability analyses to explore SLA dynamics. In attempting to expand this empirical framework, the current project has


likely overlooked some essential aspect of L2 development. It may be impossible to account for, incorporate into a study design, or indeed even grasp all of the interacting elements involved in this developmental process. But we will keep on trying.


References

Anderson, J. R. (1976). Language, memory, and thought. Hillsdale, NJ: Erlbaum Associates.

Anderson, J. R. (1982). Acquisition of cognitive skill. Psychological Review, 89, 369-406.

Anderson, J. R. (1993). Rules of the mind. Hillsdale, NJ: Lawrence Erlbaum Associates.

Arnaud, P. L. J. (1992). Objective lexical and grammatical characteristics of L2 written compositions and the validity of separate-component tests. In P. L. J. Arnaud & H. Bejoint (Eds.), Vocabulary and applied linguistics (pp. 133-145). London: Macmillan.

Astika, G. (1993). Analytical assessments of foreign language students' writing. RELC Journal, 24, 61-71.

Baddeley, A. D. (1990). Human memory: Theory and practice. Hove: Lawrence Erlbaum Associates.

Bardovi-Harlig, K. (1992). A second look at T-unit analysis: Reconsidering the sentence. TESOL Quarterly, 26, 390-395.

Bardovi-Harlig, K. (1997). Another piece of the puzzle: The emergence of the present perfect. Language Learning, 51, 215-264.

Bardovi-Harlig, K., & Bofman, T. (1989). Attainment of syntactic and morphological accuracy by advanced language learners. Studies in Second Language Acquisition, 11, 17-34.

Bardovi-Harlig, K., & Stringer, D. (2010). Variables in second language attrition. Studies in Second Language Acquisition, 32, 1-45.

Bassano, D., & van Geert, P. (2007). Modeling continuity and discontinuity in utterance length: A quantitative approach to changes, transitions and intra-individual variability in early grammatical development. Developmental Science, 10, 588-612.

Bates, E., Dale, P. S., & Thal, D. (1995). Individual differences and their implications for theories of language development. In P. Fletcher & B. MacWhinney (Eds.), The handbook of child language (pp. 96-151). Oxford: Blackwell Publishing.

Bates, E., & Goodman, J. C. (1999). On the emergence of grammar from lexicon. In B. MacWhinney (Ed.), The emergence of language (pp. 29-80). Mahwah, NJ: Lawrence Erlbaum Associates.

Bates, E., & MacWhinney, B. (1987). Competition, variation, and language learning. In B. MacWhinney (Ed.), Mechanisms of language acquisition (pp. 157-194). Hillsdale, NJ: Lawrence Erlbaum Associates.

Bates, E., & MacWhinney, B. (1989). Functionalism and the competition model. In B. MacWhinney & E. Bates (Eds.), The crosslinguistic study of sentence processing (pp. 3-73). New York: Cambridge University Press.


Beckner, C., Blythe, R., Bybee, J., Christiansen, M. H., Croft, W., Ellis, N. C., … Schoenemann, T. (2009). Language is a complex adaptive system: position paper. Language Learning, 59(Suppl. 1), 1-26.

Beglar, D., & Hunt, A. (1999). Revising and validating the 2000 Word Level and University Word Level vocabulary tests. Language Testing, 16, 131-162.

Bertenthal, B. I. (1999). Variation and selection in the development of perception and action. In G. Savelsbergh, H. van der Maas, & P. van Geert (Eds.), Non-linear developmental processes (pp. 105-121). Amsterdam: Koninklijke Nederlandse Academie van Wetenschappen.

Biber, D., Conrad, S., & Leech, G. (2002). Longman student grammar of spoken and written English. Harlow: Longman.

Bulté, B., Housen, A., Pierrard, P., & van Daele, S. (2008). Investigating lexical proficiency development over time - the case of Dutch-speaking learners of French in Brussels. French Language Studies, 18, 277-298.

Burke, D. M., MacKay, D. G., Worthley, J. S., & Wade, E. (1991). On the tip of the tongue: What causes word finding failures in young and older adults? Journal of Memory and Language, 30, 542-579.

Bialystok, E. (1982). On the relationship between knowing and using linguistic forms. Applied Linguistics, 3, 181-206.

Bialystok, E. (1994). Analysis and control in the development of second language proficiency. Studies in Second Language Acquisition, 16, 157-168.

Caspi, T., & Lowie, W. M. (2010). A dynamic perspective on academic English L2 lexical development. In R. Chacón-Beltrán, C. Abello-Contesse, M. d. M. Torreblanca-López, & M. D. López-Jiménez (Eds.), Further insights into non-native vocabulary teaching and learning. Bristol: Multilingual Matters.

Cobb, T. (2003). Analyzing late interlanguage with learner corpora: Quebec replications of three European studies. The Canadian Modern Language Review, 59, 393-423.

Chomsky, N. (1957). Syntactic structures. The Hague: Mouton.

Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge, MA: The MIT Press.

Chomsky, N. (1986). Knowledge of language: Its nature, origins and use. New York: Praeger.

Chomsky, N. (1995). The minimalist program. Cambridge, MA: The MIT Press.

Clark, E. V. (1993). The lexicon in acquisition. Cambridge: Cambridge University Press.

Cooper, T. C. (1976). Measuring written syntactic patterns of second language learning in German. The Journal of Educational Research, 69, 176-183.

Coxhead, A. (2000). A new academic word list. TESOL Quarterly, 34, 213-238.

Cumming, A. H. (1989). Writing expertise and second language proficiency. Language Learning, 39, 81-141.


Cumming, A. H. (2001). Learning to write in a second language: Two decades of research. International Journal of English Studies, 1, 1-23.

Cumming, A. H., & Mellow, D. (1996). An investigation into the validity of written indicators of second language proficiency. In A. H. Cumming & R. C. Berwick (Eds.), Validation in language testing (pp. 72-93). Clevedon: Multilingual Matters.

de Bot, K. (1992). A bilingual production model: Levelt's 'speaking' model adapted. Applied Linguistics, 13, 1-24.

de Bot, K. (1996). The psycholinguistics of the output hypothesis. Language Learning, 46, 529-555.

de Bot, K. (2007). Dynamic systems theory, life-span development and language attrition. In B. Köpke, M. S. Schmid, M. Keijzer, & S. Dostert (Eds.), Language attrition: Theoretical perspectives (pp. 53-68). Amsterdam/Philadelphia: John Benjamins.

de Bot, K. (2008). Introduction: Second language development as a dynamic process. The Modern Language Journal, 92, 166-178.

de Bot, K., & Lowie, W. M. (2010). On the stability of representations in the multilingual lexicon. In M. Pütz & L. Sicola (Eds.), Inside the learner's mind: Cognitive processing and second language acquisition (pp. 117-134). Amsterdam/Philadelphia: John Benjamins.

de Bot, K., Lowie, W. M., & Verspoor, M. H. (2005). Second language acquisition: An advanced resource book. Amsterdam/Philadelphia: John Benjamins.

de Bot, K., Lowie, W. M., & Verspoor, M. H. (2007). A Dynamic Systems Theory approach to second language acquisition. Bilingualism: Language and Cognition, 10, 7-21.

de Bot, K., & Stoessel, K. (2000). In search of yesterday's words: Reactivating a long forgotten language. Applied Linguistics, 21, 364-388.

Ellis, N. C. (1996). Sequencing in SLA: Phonological memory, chunking, and points of order. Studies in Second Language Acquisition, 18, 91-126.

Ellis, N. C. (1998). Emergentism, connectionism, and language learning. Language Learning, 48, 631-664.

Ellis, N. C. (2004). The processes of second language acquisition. In B. VanPatten, J. Williams, S. Rott, & M. Overstreet (Eds.), Form-meaning connections in second language acquisition (pp. 49-76). Mahwah, NJ: Lawrence Erlbaum Associates.

Ellis, N. C. (2007). Dynamic systems and SLA: The wood and the trees. Bilingualism: Language and Cognition, 10, 23-25.

Ellis, N. C., & Larsen-Freeman, D. (2009). Constructing a second language. Language Learning, 59(Suppl. 1), 90-125.

Ellis, N. C., & Schmidt, R. (1997). Morphology and longer distance dependencies: Laboratory research illuminating the A in SLA. Studies in Second Language Acquisition, 19, 145-171.

Ellis, R. (1994). The study of second language acquisition. Oxford: Oxford University Press.


Elman, J. L. (1995). Language as a dynamical system. In R. F. Port & T. J. van Gelder (Eds.), Mind as motion: Dynamical perspectives on behavior and cognition (pp. 195-226). Cambridge, MA: The MIT Press.

Elman, J. L. (2004). An alternative view of the mental lexicon. Trends in Cognitive Sciences, 8, 301-306.

Elman, J. L. (2009). On the meaning of words and dinosaur bones: Lexical knowledge without a lexicon. Cognitive Science, 33, 547-582.

Elman, J. L., Bates, E., Plunkett, K., Johnson, M., & Karmiloff-Smith, A. (1996). Rethinking innateness: A connectionist perspective on development. Cambridge, MA: The MIT Press.

Engber, C. A. (1995). The relationship of lexical proficiency to the quality of ESL compositions. Journal of Second Language Writing, 4, 139-156.

Evola, J., Mamer, E., & Lentz, B. (1980). Discrete point versus global scoring for cohesive devices. In J. W. Oller & K. Perkins (Eds.), Research in language testing (pp. 177-181). Rowley, MA: Newbury House.

Fan, M. (2000). How big is the gap and how to narrow it? An investigation into the active and passive knowledge of L2 learners. RELC Journal, 31, 105-119.

Fischer, K. W. (1980). A theory of cognitive development: The control and construction of hierarchies of skills. Psychological Review, 87, 477-531.

Fischer, K. W., & Paré-Blagoev, J. (2000). From individual differences to dynamic pathways of development. Child Development, 71, 850-853.

Fitzpatrick, T., Al-Qarni, I., & Meara, P. (2008). Intensive vocabulary learning: a case study. Language Learning Journal, 36, 239-248.

Fodor, J. A. (1983). The modularity of mind: An essay on faculty psychology. Cambridge, MA: The MIT Press.

Foster, P., & Skehan, P. (1996). The influence of planning and task type on second language performance. Studies in Second Language Acquisition, 18, 299-323.

Gass, S. (2004). Context and SLA. In B. VanPatten, J. Williams, S. Rott, & M. Overstreet (Eds.), Form-meaning connections in second language acquisition (pp. 77-90). Mahwah, NJ: Lawrence Erlbaum Associates.

Gentner, D. (2006). Why verbs are hard to learn. In K. Hirsh-Pasek & R. Golinkoff (Eds.), Action meets word: How children learn verbs (pp. 544-564). Oxford: Oxford University Press.

Gilmore, R. (1981). Catastrophe theory for scientists and engineers. New York: John Wiley & Sons.

Goldinger, S. D. (1998). Echoes of echoes? An episodic theory of lexical access. Psychological Review, 105, 251-279.

Green, D. W. (2003). Neural basis of lexicon and grammar in L2 acquisition. In R. van Hout, A. Hulk, F. Kuiken, & R. Towell (Eds.), The lexicon-syntax interface in second language acquisition (pp. 199-218). Amsterdam/Philadelphia: John Benjamins.


Gross, S. (2004). A modest proposal: Explaining language attrition in the context of contact linguistics. In M. S. Schmid, B. Köpke, M. Keijzer, & L. Weilemar (Eds.), First language attrition: Interdisciplinary perspectives on methodological issues (pp. 281-297). Amsterdam/Philadelphia: John Benjamins.

Harley, B., & King, M. L. (1989). Verb lexis in the written compositions of young L2 learners. Studies in Second Language Acquisition, 11, 415-439.

Henriksen, B. (1999). Three dimensions of vocabulary acquisition. Studies in Second Language Acquisition, 21, 303-317.

Herdina, P., & Jessner, U. (2002). A dynamic model of multilingualism. Clevedon: Multilingual Matters.

Horst, M. (2005). Learning L2 vocabulary through extensive reading: a measurement study. The Canadian Modern Language Review, 61, 355-382.

Horst, M., & Meara, P. (1999). Test of a model for predicting second language lexical growth through reading. The Canadian Modern Language Review, 56, 308-328.

Hyltenstam, K. (2002). Non-native features of near-native speakers: on the ultimate attainment of childhood L2 learners. In R. J. Harris (Ed.), Cognitive processing in bilinguals (pp. 351-367). Amsterdam: North Holland.

Ishikawa, S. (1995). Objective measurement of low-proficiency ESL narrative writing. Journal of Second Language Writing, 4, 51-70.

James, C. (1998). Errors in language learning and use: Exploring error analysis. London: Longman.

Jiang, N. (2004). Semantic transfer and development in adult L2 vocabulary acquisition. In P. Bogaards & B. Laufer (Eds.), Vocabulary in a second language (pp. 101-126). Amsterdam/Philadelphia: John Benjamins.

Kameen, P. T. (1979). Syntactic skill and ESL writing quality. In C. Yorio, K. Perkins, & J. Schachter (Eds.), On TESOL '79: The learner in focus (pp. 343-364). Washington, DC: TESOL.

Komarova, N. L., & Nowak, M. A. (2001). The evolutionary dynamics of the lexical matrix. Bulletin of Mathematical Biology, 63, 451-484.

Komarova, N. L., Nowak, M. A., & Niyogi, P. (2001). The evolutionary dynamics of grammar acquisition. Journal of Theoretical Biology, 209, 43-59.

Krashen, S. (1989). We acquire vocabulary and spelling by reading: Additional evidence for the input hypothesis. The Modern Language Journal, 73, 440-464.

Kroll, B. (1990). What does time buy? ESL student performance on home versus class compositions. In B. Kroll (Ed.), Second language writing: Research insights for the classroom (pp. 140-154). Cambridge: Cambridge University Press.

Landauer, R. (1998). Condensed matter physics: The noise is the signal. Nature, 392, 658-659.

Larsen-Freeman, D. (1978). An ESL index of development. TESOL Quarterly, 12, 439-448.


Larsen-Freeman, D. (1997). Chaos/complexity science and second language acquisition. Applied Linguistics, 18, 141-165.

Larsen-Freeman, D. (2006a). Second language acquisition and the issue of fossilization: There is no end, and there is no state. In Z. Han & T. Odlin (Eds.), Studies of fossilization in second language acquisition (pp. 189-200). Clevedon: Multilingual Matters.

Larsen-Freeman, D. (2006b). The emergence of complexity, fluency and accuracy in the oral and written production of five Chinese learners of English. Applied Linguistics, 27, 590-619.

Larsen-Freeman, D. (2007). On the complementarity of Chaos/Complexity Theory and Dynamic Systems Theory in understanding the second language acquisition process. Bilingualism: Language and Cognition, 10, 35-37.

Larsen-Freeman, D. (2009). Adjusting expectations: The study of complexity, accuracy, and fluency in second language acquisition. Applied Linguistics, 30, 579-589.

Larsen-Freeman, D., & Cameron, L. (2008). Complex systems and applied linguistics. Oxford: Oxford University Press.

Larsen-Freeman, D., & Long, M. (1999). An introduction to second language acquisition research. London & New York: Longman.

Larsen-Freeman, D., & Strom, V. (1977). The construction of a second language acquisition index of development. Language Learning, 27, 123-134.

Laufer, B. (1992a). How much lexis is necessary for reading comprehension? In H. Bejoint & P. L. J. Arnaud (Eds.), Vocabulary and applied linguistics (pp. 126-132). Basingstoke: Macmillan.

Laufer, B. (1992b). Reading in a foreign language: How does L2 lexical knowledge interact with the learner's general academic ability? Journal of Research in Reading, 15, 126-132.

Laufer, B. (1994). The lexical profile of second language writing: Does it change over time? RELC Journal, 25, 21-33.

Laufer, B. (1998). The development of passive and active vocabulary in a second language: Same or different? Applied Linguistics, 19, 255-271.

Laufer, B., Elder, C., Hill, K., & Congdon, P. (2004). Size and strength: Do we need both to measure vocabulary knowledge? Language Testing, 21, 202-226.

Laufer, B., & Goldstein, Z. (2004). Testing vocabulary knowledge: Size, strength, and computer adaptiveness. Language Learning, 54, 399-436.

Laufer, B., & Hulstijn, J. (2001). Incidental vocabulary acquisition in a second language: The construct of task-induced involvement. Applied Linguistics, 22, 1-26.

Laufer, B., & Nation, I. S. P. (1995). Vocabulary size and use: Lexical richness in L2 written production. Applied Linguistics, 16, 307-322.

Laufer, B., & Nation, I. S. P. (1999). A vocabulary-size test of controlled productive ability. Language Testing, 16, 33-51.


Laufer, B., & Paribakht, T. S. (1998). The relationship between passive and active vocabularies: Effects of language learning context. Language Learning, 48, 365-391.

Lennon, P. (1990). Investigating fluency in EFL: A quantitative approach. Language Learning, 40, 417.

Lennon, P. (1996). Getting 'easy' verbs wrong at the advanced level. Applied Linguistics, 34, 23-36.

Levelt, W. J. M. (1992). Accessing words in speech production: Stages, processes and representation. Cognition, 42, 1-22.

Linnarud, M. (1986). Lexis in composition: A performance analysis of Swedish learners' written English. Malmö: Liber Förlag.

Lorenz, E. (1972). Predictability: Does the flap of a butterfly's wing in Brazil set off a tornado in Texas? Paper presented at the annual meeting of the American Association for the Advancement of Science. Washington, DC.

Lowie, W. M., Verspoor, M. H., & de Bot, K. (2009). A dynamic view of second language development across the lifespan. In K. de Bot & R. Schrauf (Eds.), Language development over the lifespan (pp. 125-146). New York/London: Routledge.

MacKay, D. G. (1982). The problem of flexibility, fluency, and speed-accuracy tradeoff in skilled behavior. Psychological Review, 89, 483-506.

MacWhinney, B. (1987). The competition model. In B. MacWhinney (Ed.), Mechanisms of language acquisition (pp. 249-308). Hillsdale, NJ: Lawrence Erlbaum Publishers.

MacWhinney, B. (2006). Emergent fossilization. In Z. Han & T. Odlin (Eds.), Studies of fossilization in second language acquisition (pp. 134-156). Clevedon: Multilingual Matters.

Mann, T. (1996). The Magic Mountain (J. E. Woods, Trans.). New York: Knopf Doubleday Publishing Group. (Original work published 1924).

Marchman, V. A., & Bates, E. (1994). Continuity in lexical and morphological development. Journal of Child Language, 21, 339-366.

Marchman, V. A., Wulfeck, B., & Ellis-Weismer, S. (1999). Morphological productivity in children with normal language and SLI: A study of the English past tense. Journal of Speech, Language, and Hearing Research, 42, 206-219.

Mathews, J. H., & Fink, K. D. (2004). Numerical methods using Matlab (4th ed.). Upper Saddle River, NJ: Prentice-Hall.

McClure, E. (1991). A comparison of lexical strategies in L1 and L2 written English narratives. Pragmatics and Language Learning, 2, 141-154.

McLaughlin, B. (1990). Restructuring. Applied Linguistics, 11, 113-128.

McLaughlin, B., & Heredia, R. (1996). Information processing approaches to research on second language acquisition and use. In W. C. Ritchie & T. K. Bhatia (Eds.), Handbook of second language acquisition (pp. 228-231). San Diego, CA: Academic Press.

McMurray, B. (2007). Defusing the childhood vocabulary explosion. Science, 317, 631.


Meara, P. (1989). Matrix models of vocabulary acquisition. AILA Review, 66-74.

Meara, P. (1995). Single-subject studies of lexical acquisition. Second Language Research, 11, i-iii.

Meara, P. (1996). The vocabulary knowledge framework. Vocabulary Acquisition Research Group. Retrieved from: http://www.swan.ac.uk/cals/calsres/vlibrary/pm96d.htm

Meara, P. (1997). Towards a new approach to modeling vocabulary acquisition. In N. Schmitt & M. McCarthy (Eds.), Vocabulary: Description, acquisition and pedagogy (pp. 109-121). Cambridge: Cambridge University Press.

Meara, P. (2001). The mathematics of vocabularies. In M. Gill, A. Johnson, L. Koski, R. Sell, & B. Wårvik (Eds.), Language, learning, literature: Studies presented to Håkan Ringbom (pp. 151-167). Åbo: Åbo Academy.

Meara, P. (2005). Reactivating a dormant vocabulary. EUROSLA Yearbook, 5, 269-280.

Mehnert, U. (1998). The effects of different lengths of time for planning on second language performance. Studies in Second Language Acquisition, 20, 52-83.

Melka, F. (1997). Receptive versus productive aspects of vocabulary. In N. Schmitt & M. McCarthy (Eds.), Vocabulary: Description, acquisition and pedagogy (pp. 84-102). New York: Cambridge University Press.

Mellow, J. D., & Cumming, A. H. (1994). Concord in interlanguage: Efficiency or priming? Applied Linguistics, 15, 442-473.

Mitchell, R., & Myles, F. (2004). Second language learning theories. Oxford: Oxford University Press.

Nation, I. S. P. (1990). Learning and teaching vocabulary. New York: Newbury House.

Nation, I. S. P. (2001). Learning vocabulary in another language. Cambridge: Cambridge University Press.

Nation, I. S. P. & Laufer, B. (1999). A vocabulary-size test of controlled productive ability. Language Testing, 16, 33-51.

Nelder, J. A., & Mead, R. (1965). A simplex method for function minimization. Computer Journal, 7, 313.

Nesselhauf, N. (2003). The use of collocations by advanced learners of English and some implications for teaching. Applied Linguistics, 24, 223-242.

Norusis, M. J. (2008). SPSS 16.0 guide to data analysis. Upper Saddle River, NJ: Prentice-Hall.

Nowak, M. A., & Komarova, N. L. (2001). Towards an evolutionary theory of language. Trends in Cognitive Sciences, 5, 288-295.

Ortega, L. (2003). Syntactic complexity measures and their relationship to L2 proficiency: A research synthesis of college-level L2 writing. Applied Linguistics, 24, 492-518.


Ortega, L. (2005). What do learners plan? Learner-driven attention to form during pre-task planning. In R. Ellis (Ed.), Planning and task performance in a second language (pp. 77-109). Amsterdam/Philadelphia: John Benjamins.

Ortega, L., & Byrnes, H. (2008). The longitudinal study of advanced L2 capacities. New York: Routledge.

Paradis, M. (2009). Declarative and procedural determinants of second languages. Amsterdam/Philadelphia: John Benjamins.

Paribakht, T. S., & Wesche, M. (1996). Enhancing vocabulary acquisition through reading: A hierarchy of text-related exercise types. The Canadian Modern Language Review, 52, 155-178.

Pinker, S. (1991). Rules of language. Science, 253, 530-535.

Pinker, S. (1994). The language instinct. New York: William Morrow.

Pinker, S., & Ullman, M. T. (2002). The past tense debate: The past and future of the past tense. Trends in Cognitive Sciences, 6, 456-463.

Plunkett, K., & Marchman, V. A. (1991). U-shaped learning and frequency effects in a multilayered perceptron: Implications for child language acquisition. Cognition, 38, 43-102.

Polio, C. (1997). Measures of linguistic accuracy in second language writing research. Language Learning, 47, 101-143.

Qian, D. D. (1999). Assessing the roles of depth and breadth of vocabulary knowledge in reading comprehension. The Canadian Modern Language Review, 56, 283-307.

Read, J. (2000). Assessing vocabulary. Cambridge: Cambridge University Press.

Ringbom, H. (1998). Vocabulary frequencies in advanced learner English: A cross linguistic approach. In S. Granger (Ed.), Learner English on computer (pp. 41-52). New York: Longman.

Robinson, B. F., & Mervis, C. B. (1998). Disentangling early language development: Modeling lexical and grammatical acquisition using an extension of case-study methodology. Developmental Psychology, 34, 363-375.

Robinson, P. (1989). A rich view of lexical competence. ELT Journal, 43, 274-282.

Robinson, P. (1993). Procedural and declarative knowledge in vocabulary learning. In T. Huckin, M. Haynes, & J. Coady (Eds.), Second language reading and vocabulary (pp. 229-262).

Robinson, P. (2003). Attention and memory. In C. Doughty & M. H. Long (Eds.), Handbook of second language acquisition (pp. 631-678). Oxford: Blackwell.

Robinson, P. (2005). Cognitive complexity and task sequencing: Studies in a componential framework for second language task design. International Review of Applied Linguistics, 43, 1-32.


Rosenbloom, P., & Newell, A. (1987). Learning by chunking: A production system model of practice. In D. Klahr, P. Langley, & R. Neches (Eds.), Production system models of learning and development (pp. 221-286). Cambridge, MA: The MIT Press.

Ruhland, H. G. (1998). Going the distance: A non-linear approach to change in language development (Doctoral dissertation, Rijksuniversiteit Groningen, 1998). Retrieved from http://dissertations.ub.rug.nl/FILES/faculties/ppsw/1998/h.g.ruhland/titlecon.pdf

Ryle, G. (1949). The concept of mind. Oxford: Oxford University Press.

Schmitt, N. (1998). Tracking the incremental acquisition of second language vocabulary: A longitudinal study. Language Learning, 48, 281-317.

Schmitt, N. (2010). Researching vocabulary: A vocabulary research manual. Basingstoke: Palgrave Macmillan.

Schmitt, N., & Meara, P. (1997). Researching vocabulary through a word knowledge framework. Studies in Second Language Acquisition, 19, 17-36.

Schmitt, N., Schmitt, D., & Clapham, C. (2001). Developing and exploring behaviour of two new versions of the Vocabulary Levels Test. Language Testing, 18, 55-88.

Selinker, L. (1972). Interlanguage. International Review of Applied Linguistics, 10, 209-231.

Selinker, L., & Lakshmanan, U. (1992). Language transfer and fossilization: The “Multiple Effects Principle”. In S. Gass & L. Selinker (Eds.), Language transfer in language learning (pp. 197-216). Amsterdam/Philadelphia: John Benjamins.

Shaw, P., & Liu, E. T. K. (1998). What develops in the development of second language writing? Applied Linguistics, 19, 225-254.

Singer, J. D., & Willett, J. B. (2003). Applied longitudinal data analysis. New York: Oxford University Press.

Singleton, D. (1999). Exploring the second language mental lexicon. Cambridge: Cambridge University Press.

Skehan, P. (1991). Individual differences in second language learning. Studies in Second Language Acquisition, 13, 275-298.

Skehan, P. (1998). A cognitive approach to language learning. Oxford: Oxford University Press.

Skehan, P. (2009). Modelling second language performance: integrating complexity, accuracy, fluency, and lexis. Applied Linguistics, 30, 510-532.

Skehan, P., & Foster, P. (1997). Task types and task processing conditions as influences on foreign language performance. Language Teaching Research, 1, 185-211.

Skehan, P., & Foster, P. (2001). Cognition and tasks. In P. Robinson (Ed.), Cognition and second language instruction (pp. 183-205). Cambridge: Cambridge University Press.

Smith, L. B., & Thelen, E. (1993). Can dynamic systems theory be usefully applied in areas other than motor development? In L. B. Smith & E. Thelen (Eds.), A dynamic systems approach to development: Applications (pp. 152-170). Cambridge, MA: The MIT Press.

Spivey, M. (2007). The continuity of mind. New York: Oxford University Press.

Spoelman, M., & Verspoor, M. (2009). De ontwikkeling van schrijfvaardigheid in het Fins als vreemde taal: een dynamisch perspectief. TTWiA, 81.

Spoelman, M., & Verspoor, M. (in press). Dynamic patterns in development of accuracy and complexity: A longitudinal case study in the acquisition of Finnish. Applied Linguistics. Retrieved from: http://applij.oxfordjournals.org/cgi/content/abstract/amq001

Swain, M. (1985). Communicative competence: Some roles of comprehensible input and comprehensible output in its development. In S. M. Gass & C. G. Madden (Eds.), Input in second language acquisition (pp. 235-253). Rowley, MA: Newbury House.

Swain, M. (1995). Three functions of output in second language learning. In G. Cook & B. Seidlhofer (Eds.), Principle and practice in applied linguistics: Studies in honor of H. G. Widdowson (pp. 125-144). Oxford: Oxford University Press.

Thelen, E., & Smith, L. B. (1994). A dynamic systems approach to the development of cognition and action. Cambridge, MA: The MIT Press.

Thelen, E., & Smith, L. B. (1998). Dynamic systems theories. In W. Damon & R. M. Lerner (Eds.), Handbook of child psychology (5th ed.) (pp. 563-634). New York: Wiley.

Ullman, M. T. (1999). Acceptability ratings of regular and irregular past-tense forms: Evidence for a dual-systems model of language from word frequency and phonological neighborhood effects. Language and Cognitive Processes, 14, 47-67.

Ullman, M. T. (2001a). A neurocognitive perspective on language: The declarative/procedural model. Nature Reviews: Neuroscience, 2, 717-726.

Ullman, M. T. (2001b). The declarative/procedural model of lexicon and grammar. Journal of Psycholinguistic Research, 30, 37-69.

Ullman, M. T. (2004). Contributions of memory circuits to language: the declarative/procedural model. Cognition, 92, 270.

van Dijk, M. (2003). Child language cuts capers: Variability and ambiguity in early child development (Doctoral dissertation, Rijksuniversiteit Groningen, 2003). Retrieved from http://dissertations.ub.rug.nl/faculties/ppsw/2004/m.w.g.van.dijk/?pLanguage=en&pFullItemRecord=ON

van Dijk, M., Verspoor, M., & Lowie, W. M. (in press). Variability analyses in language development. In M. Verspoor, W. M. Lowie, & K. de Bot (Eds.), A dynamic approach to second language development: Methods and techniques. Amsterdam/Philadelphia: John Benjamins.

van Geert, P. (1991). A dynamic systems theory model of cognitive and language growth. Psychological Review, 98, 3-53.

van Geert, P. (1993). A dynamic systems model of cognitive growth: Competition and support under limited resource conditions. In L. B. Smith & E. Thelen (Eds.), A dynamic systems approach to development: Applications (pp. 265-332). Cambridge, MA: The MIT Press.

van Geert, P. (1994). Dynamic systems of development: Change between complexity and chaos. London: Harvester Wheatsheaf.

van Geert, P. (1995). Growth dynamics in development. In R. F. Port & T. J. van Gelder (Eds.), Mind as motion: Explorations in the dynamics of cognition (pp. 313-338). Cambridge, MA: The MIT Press.

van Geert, P. (2003). Dynamic systems approaches and modeling of developmental processes. In J. Valsiner & K. J. Connolly (Eds.), Handbook of developmental psychology (pp. 640-672). London: Sage.

van Geert, P., & Steenbeek, H. (2005). Explaining after by before: Basic aspects of the dynamic systems approach to the study of development. Developmental Review, 25, 408-442.

van Geert, P., & van Dijk, M. (2002). Focus on variability: New tools to study intra-individual variability in developmental data. Infant Behavior and Development, 25, 340-374.

van Gelder, T. J. (1999). Dynamic approaches to cognition. In R. Wilson & F. Keil (Eds.), The MIT encyclopedia of cognitive sciences (pp. 244-246). Cambridge, MA: The MIT Press.

van Gelder, T. J., & Port, R. F. (1995). It's about time: An overview of the dynamical approach to cognition. In R. F. Port & T. J. van Gelder (Eds.), Mind as motion: Explorations in the dynamics of cognition (pp. 1-44). Cambridge, MA: The MIT Press.

van Hout, R., Hulk, A., & Kuiken, F. (2003). The interface: Concluding remarks. In R. van Hout, A. Hulk, F. Kuiken, & R. Towell (Eds.), The lexicon-syntax interface in second language acquisition (pp. 219-226). Amsterdam/Philadelphia: John Benjamins.

van Lier, L. (2005). Case study. In E. Hinkel (Ed.), Handbook of research in second language teaching and learning (pp. 195-208). Mahwah, NJ: Lawrence Erlbaum Associates.

van Orden, G. C. (2002). Nonlinear dynamics and psycholinguistics. Ecological Psychology, 14, 1-4.

van Orden, G. C., & Goldinger, S. D. (1994). Interdependence of form and function in cognitive systems explains perception of printed words. Journal of Experimental Psychology: Human Perception and Performance, 20, 1269-1291.

VanPatten, B. (1990). Attending to form and content in the input: An experiment in consciousness. Studies in Second Language Acquisition, 12, 287-301.

VanPatten, B., Williams, J., & Rott, S. (2004). Form-meaning connections in second language acquisition. In B. VanPatten, J. Williams, S. Rott, & M. Overstreet (Eds.), Form-meaning connections in second language acquisition (pp. 1-26). Mahwah, NJ: Lawrence Erlbaum Associates.

Vermeer, A. (2001). Breadth and depth of vocabulary in relation to L1-L2 acquisition and frequency of input. Applied Psycholinguistics, 22, 217-234.

Verspoor, M., de Bot, K., & Lowie, W. (2004). Dynamic systems theory and variation: A case study in L2 writing. In H. Aertsen, M. Hannay, & R. Lyall (Eds.), Words in their places: A festschrift for J. Lachlan Mackenzie (pp. 407-421). Amsterdam: Free University Press.

Verspoor, M. H., Lowie, W. M., & de Bot, K. (2008). Input and second language development from a dynamic perspective. In T. Piske & M. Young-Scholten (Eds.), Input matters in SLA (pp. 62-80). Bristol: Multilingual Matters.

Verspoor, M. H., Lowie, W. M., & van Dijk, M. (2008). Variability in L2 development from a dynamic systems perspective. The Modern Language Journal, 92, 214-231.

Waring, R. (1999). Tasks for assessing second language receptive and productive vocabulary (Doctoral dissertation, University of Wales, 1999). Retrieved from http://www.robwaring.org/papers/papers.html

Webb, S. (2008). Receptive and productive vocabulary sizes of L2 learners. Studies in Second Language Acquisition, 30, 79-95.

West, M. (1953). A general service list of English words. London: Longman.

Wolfe-Quintero, K., Hae-Young, K., & Inagaki, S. (1998). Second language development in writing: Measures of fluency, accuracy, and complexity. Honolulu: University of Hawai’i Press.

Woodward, A. L., Markman, E. M., & Fitzsimmons, C. M. (1994). Rapid word learning in 13- and 18-month-olds. Developmental Psychology, 30, 533-566.

Xue, G., & Nation, I. S. P. (1984). A university word list. Language Learning and Communication, 3, 215-229.

Young, R. (1995). Discontinuous interlanguage development and its implications for oral proficiency rating scales. Applied Language Learning, 6, 13-26.

Yuan, F., & Ellis, R. (2003). The effects of pre-task planning and on-line planning on fluency, complexity and accuracy in L2 oral production. Applied Linguistics, 24, 1-27.

Yuuping, C. (2003). An analysis of the grammatical problems in Chinese EFL students' expository writing: A systemic functional perspective. Research papers, Singapore tertiary English teachers society (STETS), 1-10.

Zareva, A., Schwanenflugel, P., & Nikolova, Y. (2005). Relationship between lexical competence and language proficiency. Studies in Second Language Acquisition, 27, 567-595.

Dutch summary (Nederlandse samenvatting)

The aim of this thesis is to apply the principles of Dynamic Systems Theory (DST) to empirical research on second language (L2) development. DST is an interdisciplinary theory of development in natural systems. It identifies simple and universal principles of growth and interaction in these systems that can lead to infinitely complex outcomes. In recent decades, DST has had a major influence on research into cognitive development, including language development (van Gelder, 1999). In the DST approach, language is seen as a cluster of components, each of which interacts with its environment on several levels. This environment consists both of the internal resources of language users and of external resources, which are governed by the setting in which the language user operates. Language, seen as a developing system within individuals or as a system shared by individuals in a community, is an open system that remains subject to change until it becomes extinct (Larsen-Freeman, 2006a; Spivey, 2007). Language changes at the macro level of the language community arise from changes at the micro level of the language systems of individual language users. The linguistic knowledge in these individual systems consists of subsystems that build on one another, which in turn are made up of similar subsystems. This hierarchy of coupled systems is also reflected in the associated processes: the process of language change and language variation consists of numerous individual processes of language use and language acquisition. These processes in turn consist of the simultaneous development of aspects of knowledge in all subdomains of linguistic knowledge and on all time scales, from millennia down to nanoseconds (Spivey, 2007). All these developmental processes run simultaneously and change continuously.
On the one hand, these processes show a mutual dependence between developing elements in the structural hierarchy of language, in which certain higher structures cannot develop without other structures. On the other hand, they also involve a seemingly paradoxical rivalry over the external and internal resources that are available to the developing system only in limited supply. This general idea of an interaction within structural constraints and limited resources is called the precursor model. Precursor interactions play an important role in many natural and developmental phenomena (Fischer, 1980; van Geert, 1991). Viewing language from this perspective means that intra-individual variation in the performance of language learners can no longer be regarded as simple measurement error, but must be seen as a meaningful expression of changes in precursor interactions. In this approach to variation, the DST perspective thus differs clearly from the dominant perspective in applied linguistics, in which linear relations and statistically significant group effects are central.

Depending on the phenomena under investigation, there are different versions of the basic precursor model. Many of these models can be simulated with simple coupled logistic equations. For language development, such simulations have previously been applied to various aspects of first language acquisition, including nonlinear and non-uniform growth of early vocabulary and early syntax (Robinson & Mervis, 1998; van Geert, 1991), and the sequential progression from one-word to multi-word utterances in early language (Bassano & van Geert, 2007).

The analyses in this thesis apply an empirical-dynamic approach to two areas of L2 development, combining analyses of variability with simulations based on the precursor model. The first area is word knowledge, with a particular focus on development within the continuum of receptive and productive vocabulary.
The second area is writing ability, with a particular focus on the combination of complexity and accuracy in the lexical and syntactic dimensions of the written product.

The studies in this thesis share several characteristics. First, both analyses apply the dynamic perspective to a combination of two paradigms that are dichotomous in their respective fields of research. In each study, two sets of distinct continua are unified into four levels. In the first study, two paradigms of word knowledge (recognition vs. recall of words, and controlled vs. free production) are unified into a single continuum of receptive-productive knowledge. In the second study, the distinction between the complexity and accuracy of the written product on the one hand, and between the lexical and syntactic dimensions of writing on the other, is unified into a hierarchy of four categories: lexical complexity, lexical accuracy, syntactic complexity, and syntactic accuracy. In both studies, the order of the categories in the hierarchy is based on earlier findings and on reasoned considerations.

Both studies use detailed longitudinal data from four case studies of advanced learners of English as a second language in an immersion setting. The studies examine the central growth trends in the data and the variability patterns of these data by comparing pairs of indices. In this way, patterns are identified that may point to complex precursor interactions. Models of precursor interactions are then used to simulate the data. These models are based on van Geert's (1991, 1993, 1994, 2003) extensive work on mathematical descriptions of different types of precursor interactions. Each version of such a model specifies the way in which the components of the system influence one another, the initial state of each component, and any conditional dependencies between the components (van Geert, 2003, p. 664). In general, these models contain a hierarchy of elements, each with a threshold value that quantifies the condition under which one component (the precursor) supports the emergence of the dependent component (the dependent), while the dependent component in turn is usually in competition with the precursor. Variations can be added to this basic pattern, such as unidirectional or bidirectional competition, support from the dependent component for the precursor, other conditional thresholds, or delays in development.
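The basic precursor pattern described above (a threshold, support from the precursor, and competition from the dependent) can be illustrated with two coupled logistic equations. The Python sketch below is a simplified illustration in the spirit of van Geert's growth models, not the model used in this thesis. The function name, the parameter names, and all parameter values are assumptions chosen for illustration only.

```python
def simulate_precursor(steps=200, rate_p=0.1, rate_d=0.1,
                       cap_p=1.0, cap_d=1.0, threshold=0.5,
                       support=0.05, competition=0.02,
                       init_p=0.01, init_d=0.01):
    """Iterate two coupled logistic growers: a precursor and a dependent.

    All names and default values are hypothetical, chosen for illustration.
    """
    precursor, dependent = init_p, init_d
    precursor_series, dependent_series = [precursor], [dependent]
    for _ in range(steps):
        # Precursor: logistic growth, reduced by competition from the dependent.
        next_p = (precursor
                  + rate_p * precursor * (1 - precursor / cap_p)
                  - competition * precursor * dependent)
        if precursor >= threshold:
            # Dependent: logistic growth plus support from the precursor,
            # active only once the precursor has reached the threshold.
            next_d = (dependent
                      + rate_d * dependent * (1 - dependent / cap_d)
                      + support * precursor * dependent)
        else:
            next_d = dependent  # below the threshold: no growth yet
        # Levels cannot become negative.
        precursor, dependent = max(next_p, 0.0), max(next_d, 0.0)
        precursor_series.append(precursor)
        dependent_series.append(dependent)
    return precursor_series, dependent_series


precursor_series, dependent_series = simulate_precursor()
print(round(precursor_series[-1], 3), round(dependent_series[-1], 3))
```

In this sketch the dependent stays at its initial level until the precursor crosses the threshold. Setting `support` or `competition` to zero switches off the corresponding interaction, which is one way to compare unidirectional and bidirectional variants of the pattern.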
The internal hierarchy in the precursor models is expressed in parameters. In the present studies, these are determined on the basis of background literature on the relevant area of second language acquisition and on the findings of the preceding analyses. The interactions within the hierarchy are then specified on the basis of the outcomes of the (variability) analyses. The studies distinguish two kinds of interactions in the data. The first kind consists of surface interactions, which are visible in correlations between the components in the data. The second kind consists of hidden or underlying interactions. These are expressed in the variability around the central growth trends in the data and in temporary changes in the data correlations. This second type of interaction is used to configure the relational control parameters, which specify the interactions between the hierarchical levels of the model.

The hypothesis in the two studies is that if these underlying interactions are replicated, they will produce growth trends and surface interactions in the models that match those in the real data. This hypothesis is in line with earlier studies of L2 writing development, which show that shifting variability patterns express complex or competitive interactions. Those studies related such interactions to the precursor model, but did not verify these interpretations with simulations (Larsen-Freeman, 2006b; Spoelman & Verspoor, 2009, in press; Verspoor et al., 2008b).

The two present studies confirm the hypothesis that variability in the data is a relevant expression of internal system dynamics. This was demonstrated by the results of the simulations. The first result shows that when the relational control parameters of the models are optimized to obtain the best fit with the data, their values correspond to the interpretations of the results of the variability analyses. Second, there is a fit not only between the output of the optimized model and the growth trends, but also between the output of the model and the various correlations in the data, and thus the surface interactions.

So far, the models of the two studies have been described in the same way, in accordance with the basic principles of dynamic simulations. There are, however, differences between the models of the two studies.
The word knowledge model captured only bottom-up interactions, from more receptive to more productive levels of knowledge. Backward interaction was assumed but was not built into the model. The reason was that the phenomenon the model set out to simulate, the gap between receptive and productive knowledge, is defined as a unidirectional lack of transfer from receptive to productive knowledge (e.g., Melka, 1997). Although it is very likely that the interaction between receptive and productive word knowledge is more complex than feed-forward alone (see Clark, 1993), it was decided to keep the model simple and to test only the applicability of the most basic assumptions: the absence of transfer, or the existence of nonlinear transfer, from lower to higher levels of word knowledge.

The study of writing ability was different. Here the categories (lexical complexity, lexical accuracy, syntactic complexity, and syntactic accuracy) were not as clearly hierarchical as in the word knowledge study. The operational definitions of these categories included indices of skills that had already been acquired, such as the relative use of subordinating conjunctions or the proportion of sentences with correct word order. Moreover, unlike the levels of word knowledge, these indices were not automatically associated with the same linguistic structures. For this reason, this model includes feedback interactions (from the dependent component to the precursor) in addition to basic feed-forward interactions. This made the model more complex and led to a better fit with the data than the word knowledge model.

Another difference between the models of the two studies was that the interactions in the word knowledge model were defined per change and per level, whereas the interactions in the writing model were defined per level only. This is consistent with the idea that actively acquired skills, such as those described in the word knowledge study, may show both mutual support and competition, which would be related to the level of these skills and to the effort invested in acquiring them. By contrast, the hypothesis was that because the writing categories are already acquired (and perhaps even stable) forms of knowledge, their interactions would be related directly to their level and not to the acquisition process. The differences between the versions of the models show that dynamic modeling, despite its apparent simplicity, makes it possible to accommodate different considerations within a model.
Because both studies resulted in a good fit between the model and the data, and also confirmed the interpretation of the preceding variability analyses as an expression of dynamic interactions, they can be considered successful. The subjects of the simulations (word knowledge, and lexical and syntactic complexity and accuracy in writing) were both well suited to dynamic modeling. This is due to their relatively simple and intuitively convincing hierarchies (but see de Bot & Lowie, 2010; Elman, 1995, 2009), which are widely supported by earlier findings and theories. Despite conceptual and operational differences between studies of L2 word knowledge, there is general agreement that receptive word knowledge precedes productive knowledge and is a precondition for it (but see Clark, 1993). Similarly, most researchers assume that complexity precedes accuracy and that lexical knowledge precedes syntactic knowledge. Although the causality involved is still strongly disputed, much evidence has been put forward for these orders (Anderson, 1993; Marchman & Bates, 1994).

The two present studies show how, in a dynamic approach, descriptions and explanations of behavioral patterns arise from robust and typical properties of the system (Kauffman, 1995, p. 19, as cited in Larsen-Freeman & Cameron, 2008, p. 26). These properties have often been documented in the relevant literature and form the basis for applying a dynamic approach. The studies in this thesis have hopefully shown how DST can contribute to existing insights in applied linguistics. DST makes it possible to reconcile disparate and seemingly contradictory explanations while at the same time unifying separate subdomains of language development. Perhaps most importantly, DST can extend an originally static theory into a process-oriented approach to language development.
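The summary treats temporary changes in the correlations between components as a sign of underlying interactions. One common way to make such changes visible is a moving-window correlation, sketched below in Python. This is an illustrative assumption about how such an analysis could be set up, not a description of the procedure used in the thesis. The function names and the window size are hypothetical, and the two short series are made-up toy data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences (nan if one is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


def moving_correlation(xs, ys, window=5):
    """Correlation of xs and ys within each consecutive window of measurements."""
    if len(xs) != len(ys):
        raise ValueError("series must have the same length")
    return [pearson(xs[i:i + window], ys[i:i + window])
            for i in range(len(xs) - window + 1)]


# Toy data only: two indices that first rise together and then diverge.
index_a = [1, 2, 3, 4, 5, 6, 7, 8]
index_b = [2, 4, 6, 8, 7, 5, 3, 1]
print(moving_correlation(index_a, index_b, window=4))
```

A correlation computed over the whole series can hide a shift of this kind, whereas the windowed values change sign when the relation between the two indices moves from mutual growth to divergence.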

Groningen dissertations in linguistics (GRODIL)

1. Henriëtte de Swart (1991). Adverbs of Quantification: A Generalized Quantifier Approach.
2. Eric Hoekstra (1991). Licensing Conditions on Phrase Structure.
3. Dicky Gilbers (1992). Phonological Networks. A Theory of Segment Representation.
4. Helen de Hoop (1992). Case Configuration and Noun Phrase Interpretation.
5. Gosse Bouma (1993). Nonmonotonicity and Categorial Unification Grammar.
6. Peter I. Blok (1993). The Interpretation of Focus.
7. Roelien Bastiaanse (1993). Studies in Aphasia.
8. Bert Bos (1993). Rapid User Interface Development with the Script Language Gist.
9. Wim Kosmeijer (1993). Barriers and Licensing.
10. Jan-Wouter Zwart (1993). Dutch Syntax: A Minimalist Approach.
11. Mark Kas (1993). Essays on Boolean Functions and Negative Polarity.
12. Ton van der Wouden (1994). Negative Contexts.
13. Joop Houtman (1994). Coordination and Constituency: A Study in Categorial Grammar.
14. Petra Hendriks (1995). Comparatives and Categorial Grammar.
15. Maarten de Wind (1995). Inversion in French.
16. Jelly Julia de Jong (1996). The Case of Bound Pronouns in Peripheral Romance.
17. Sjoukje van der Wal (1996). Negative Polarity Items and Negation: Tandem Acquisition.
18. Anastasia Giannakidou (1997). The Landscape of Polarity Items.
19. Karen Lattewitz (1997). Adjacency in Dutch and German.
20. Edith Kaan (1997). Processing Subject-Object Ambiguities in Dutch.
21. Henny Klein (1997). Adverbs of Degree in Dutch.
22. Leonie Bosveld-de Smet (1998). On Mass and Plural Quantification: The case of French ‘des’/‘du’-NPs.
23. Rita Landeweerd (1998). Discourse Semantics of Perspective and Temporal Structure.
24. Mettina Veenstra (1998). Formalizing the Minimalist Program.
25. Roel Jonkers (1998). Comprehension and Production of Verbs in Aphasic Speakers.
26. Erik F. Tjong Kim Sang (1998). Machine Learning of Phonotactics.
27. Paulien Rijkhoek (1998). On Degree Phrases and Result Clauses.
28. Jan de Jong (1999). Specific Language Impairment in Dutch: Inflectional Morphology and Argument Structure.
29. H. Wee (1999). Definite Focus.
30. Eun-Hee Lee (2000). Dynamic and Stative Information in Temporal Reasoning: Korean Tense and Aspect in Discourse.
31. Ivilin P. Stoianov (2001). Connectionist Lexical Processing.
32. Klarien van der Linde (2001). Sonority Substitutions.
33. Monique Lamers (2001). Sentence Processing: Using Syntactic, Semantic, and Thematic Information.
34. Shalom Zuckerman (2001). The Acquisition of "Optional" Movement.
35. Rob Koeling (2001). Dialogue-Based Disambiguation: Using Dialogue Status to Improve Speech Understanding.
36. Esther Ruigendijk (2002). Case assignment in Agrammatism: A Cross-Linguistic Study.
37. Tony Mullen (2002). An Investigation into Compositional Features and Feature Merging for Maximum Entropy-Based Parse Selection.
38. Nanette Bienfait (2002). Grammatica-onderwijs aan allochtone jongeren.
39. Dirk-Bart den Ouden (2002). Phonology in Aphasia: Syllables and segments in level-specific deficits.
40. Rienk Withaar (2002). The Role of the Phonological Loop in Sentence Comprehension.
41. Kim Sauter (2002). Transfer and Access to Universal Grammar in Adult Second Language Acquisition.
42. Laura Sabourin (2003). Grammatical Gender and Second Language Processing: An ERP Study.
43. Hein van Schie (2003). Visual Semantics.
44. Lilia Schürcks-Grozeva (2003). Binding and Bulgarian.
45. Stasinos Konstantopoulos (2003). Using ILP to Learn Local Linguistic Structures.
46. Wilbert Heeringa (2004). Measuring Dialect Pronunciation Differences Using Levenshtein Distance.
47. Wouter Jansen (2004). Laryngeal Contrast and Phonetic Voicing: A Laboratory Phonology.
48. Judith Rispens (2004). Syntactic and Phonological Processing in Developmental Dyslexia.
49. Danielle Bougaïré (2004). L'approche communicative des campagnes de sensibilisation en santé publique au Burkina Faso: Les cas de la planification familiale, du sida et de l'excision.
50. Tanja Gaustad (2004). Linguistic Knowledge and Word Sense Disambiguation.
51. Susanne Schoof (2004). An HPSG Account of Nonfinite Verbal Complements in Latin.
52. M. Begoña Villada Moirón (2005). Data-Driven Identification of Fixed Expressions and their Modifiability.
53. Robbert Prins (2005). Finite-State Pre-Processing for Natural Language Analysis.
54. Leonoor van der Beek (2005). Topics in Corpus-Based Dutch Syntax.
55. Keiko Yoshioka (2005). Linguistic and Gestural Introduction and Tracking of Referents in L1 and L2 Discourse.
56. Sible Andringa (2005). Form-Focused Instruction and the Development of Second Language Proficiency.
57. Joanneke Prenger (2005). Taal telt! Een onderzoek naar de rol van taalvaardigheid en tekstbegrip in het realistisch wiskundeonderwijs.
58. Neslihan Kansu-Yetkiner (2006). Blood, Shame and Fear: Self-Presentation Strategies of Turkish Women’s Talk about their Health and Sexuality.
59. Mónika Z. Zempléni (2006). Functional Imaging of the Hemispheric Contribution to Language Processing.
60. Maartje Schreuder (2006). Prosodic Processes in Language and Music.
61. Hidetoshi Shiraishi (2006). Topics in Nivkh Phonology.
62. Tamás Biró (2006). Finding the Right Words: Implementing Optimality Theory with Simulated Annealing.
63. Dieuwke de Goede (2006). Verbs in Spoken Sentence Processing: Unraveling the Activation Pattern of the Matrix Verb.
64. Eleonora Rossi (2007). Clitic Production in Italian Agrammatism.
65. Holger Hopp (2007). Ultimate Attainment at the Interfaces in Second Language Acquisition: Grammar and Processing.
66. Gerlof Bouma (2008). Starting a Sentence in Dutch: A Corpus Study of Subject- and Object-Fronting.
67. Julia Klitsch (2008). Open Your Eyes and Listen Carefully. Auditory and Audiovisual Speech Perception and the McGurk Effect in Dutch Speakers with and without Aphasia.
68. Janneke ter Beek (2008). Restructuring and Infinitival Complements in Dutch.
69. Jori Mur (2008). Off-line Answer Extraction for Question Answering.
70. Lonneke van der Plas (2008). Automatic Lexico-Semantic Acquisition for Question Answering.
71. Arjen Versloot (2008). Mechanisms of Language Change: Vowel Reduction in 15th Century West Frisian.
72. Ismail Fahmi (2009). Automatic Term and Relation Extraction for Medical Question Answering System.
73. Tuba Yarbay Duman (2009). Turkish Agrammatic Aphasia: Word Order, Time Reference and Case.
74. Maria Trofimova (2009). Case Assignment by Prepositions in Russian Aphasia.
75. Rasmus Steinkrauss (2009). Frequency and Function in WH Question Acquisition. A Usage-Based Case Study of German L1 Acquisition.
76. Marjolein Deunk (2009). Discourse Practices in Preschool. Young Children’s Participation in Everyday Classroom Activities.
77. Sake Jager (2009). Towards ICT-Integrated Language Learning: Developing an Implementation Framework in Terms of Pedagogy, Technology and Environment.
78. Francisco Dellatorre Borges (2010). Parse Selection with Support Vector Machines.
79. Geoffrey Andogah (2010). Geographically Constrained Information Retrieval.
80. Jacqueline van Kruiningen (2010). Onderwijsontwerp als conversatie. Probleemoplossing in interprofessioneel overleg.
81. Robert G. Shackleton (2010). Quantitative Assessment of English-American Speech Relationships.
82. Tim Van de Cruys (2010). Mining for Meaning: The Extraction of Lexico-semantic Knowledge from Text.
83. Therese Leinonen (2010). An Acoustic Analysis of Vowel Pronunciation in Swedish Dialects.
84. Erik-Jan Smits (2010). Acquiring Quantification. How Children Use Semantics and Pragmatics to Constrain Meaning.
85. Tal Caspi (2010). A Dynamic Perspective on Second Language Development.

GRODIL Center for Language and Cognition Groningen (CLCG) P.O. Box 716 9700 AS Groningen The Netherlands
