Bootstrapping into Filler-Gap: An Acquisition Story
Marten van Schijndel, Micha Elsner
The Ohio State University
{vanschm,melsner}@ling.ohio-state.edu

Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, pages 1084–1093, Baltimore, Maryland, USA, June 23-25 2014. © 2014 Association for Computational Linguistics

Abstract

Analyses of filler-gap dependencies usually involve complex syntactic rules or heuristics; however, recent results suggest that filler-gap comprehension begins earlier than seemingly simpler constructions such as ditransitives or passives. Therefore, this work models filler-gap acquisition as a byproduct of learning word orderings (e.g. SVO vs OSV), which must be done at a very young age anyway in order to extract meaning from language. Specifically, this model, trained on part-of-speech tags, represents the preferred locations of semantic roles relative to a verb as Gaussian mixtures over real numbers.

This approach learns role assignment in filler-gap constructions in a manner consistent with current developmental findings and is extremely robust to initialization variance. Additionally, this model is shown to be able to account for a characteristic error made by learners during this period (A and B gorped interpreted as A gorped B).

1 Introduction

The phenomenon of filler-gap, where the argument of a predicate appears outside its canonical position in the phrase structure (e.g. [the apple]_i that the boy ate t_i or [what]_i did the boy eat t_i), has long been an object of study for syntacticians (Ross, 1967) due to its apparent processing complexity. Such complexity is due, in part, to the arbitrary length of the dependency between a filler and its gap (e.g. [the apple]_i that Mary said the boy ate t_i).

Recent studies indicate that comprehension of filler-gap constructions begins around 15 months (Seidl et al., 2003; Gagliardi et al., 2014). This finding raises the question of how such a complex phenomenon could be acquired so early, since children at that age do not yet have a very advanced grasp of language (e.g. ditransitives do not seem to be generalized until at least 31 months; Goldberg et al., 2004; Bello, 2012). This work shows that filler-gap comprehension in English may be acquired through learning word orderings rather than relying on hierarchical syntactic knowledge.

Age     13mo    15mo    20mo    25mo
Wh-S    No      Yes     Yes     Yes
Wh-O    No      (Yes)   Yes     Yes
1-1                     Yes     No

Figure 1: The developmental timeline of subject (Wh-S) and object (Wh-O) wh-clause extraction comprehension suggested by experimental results (Seidl et al., 2003; Gagliardi et al., 2014). Parentheses indicate weak comprehension. The final row shows the timeline of 1-1 role bias errors (Naigles, 1990; Gertner and Fisher, 2012). Missing nodes denote a lack of studies.

This work describes a cognitive model of the developmental timecourse of filler-gap comprehension with the goal of setting a lower bound on the modeling assumptions necessary for an ideal learner to display filler-gap comprehension. In particular, the model described in this paper takes chunked child-directed speech as input and learns orderings over semantic roles. These orderings then permit the model to successfully resolve filler-gap dependencies.¹ Further, the model presented here is also shown to initially reflect an idiosyncratic role assignment error observed in development (e.g. A and B kradded interpreted as A kradded B; Gertner and Fisher, 2012), though after training, the model is able to avoid the error. As such, this work may be said to model a learner from 15 months to between 25 and 30 months.

¹ This model does not explicitly learn gap positions, but rather assigns thematic roles to arguments based on where those arguments are expected to manifest. This approach to filler-gap comprehension is supported by findings that show people do not actually link fillers to gap positions but instead link the filler to a verb with missing arguments (Pickering and Barry, 1991).

2 Background

The developmental timeline during which children acquire the ability to process filler-gap constructions is not well understood. Language comprehension precedes production, and the developmental literature on the acquisition of filler-gap constructions is sparsely populated due to difficulties in designing experiments to test filler-gap comprehension in preverbal infants. Older studies typically looked at verbal children and the mistakes they make to gain insight into the acquisition process (de Villiers and Roeper, 1995).

Recent studies, however, indicate that filler-gap comprehension likely begins earlier than production (Seidl et al., 2003; Gagliardi and Lidz, 2010; Gagliardi et al., 2014). Therefore, studies of verbal children are probably actually testing the acquisition of production mechanisms (planning, motor skills, greater facility with lexical access, etc.) rather than the acquisition of filler-gap. Note that these may be related, since filler-gap could introduce greater processing load which could overwhelm the child's fragile production capacity (Phillips, 2010).

Seidl et al. (2003) showed that children are able to process wh-extractions from subject position (e.g. [who]_i t_i ate pie) as young as 15 months while similar extractions from object position (e.g. [what]_i did the boy eat t_i) remain unparseable until around 20 months of age.² This line of investigation has been reopened and expanded by Gagliardi et al. (2014), whose results suggest that the experimental methodology employed by Seidl et al. (2003) was flawed in that it presumed infants have ideal performance mechanisms. By providing more trials of each condition and controlling for the pragmatic felicity of test statements, Gagliardi et al. (2014) provide evidence that 15-month-old infants can process wh-extractions from both subject and object positions. Object extractions are more difficult to comprehend than subject extractions, however, perhaps due to additional processing load in object extractions (Gibson, 1998; Phillips, 2010). Similarly, Gagliardi and Lidz (2010) show that relativized extractions with a wh-relativizer (e.g. find [the boy]_i who t_i ate the apple) are easier to comprehend than relativized extractions with that as the relativizer (e.g. find [the boy]_i that t_i ate the apple).

² Since the wh-phrase is in the same (or a very similar) position as the original subject when the wh-phrase takes subject position, it is not clear that these constructions are true extractions (Culicover, 2013); however, this paper will continue to refer to them as such for ease of exposition.

Yuan et al. (2012) demonstrate that 19-month-olds use their knowledge of nouns to learn both verbs and their associated argument structure. In their study, infants were shown video of a person talking on a phone using a nonce verb with either one or two nouns (e.g. Mary kradded Susan). Under the assumption that infants look longer at things that correspond to their understanding of a prompt, the infants were then shown two images that potentially depicted the described action: one picture where two actors acted independently (reflecting an intransitive proposition) and one picture where one actor acted on the other (reflecting a transitive proposition).³ Even though the infants had no extralinguistic knowledge about the verb, they consistently treated the verb as transitive if two nouns were present and intransitive if only one noun was present.

³ There were two actors in each image to avoid biasing the infants to look at the image with more actors.

Similarly, Gertner and Fisher (2012) show that intransitive phrases with conjoined subjects (e.g. John and Mary gorped) are given a transitive interpretation (i.e. John gorped Mary) at 21 months (henceforth termed '1-1 role bias'), though this effect is no longer present at 25 months (Naigles, 1990). This finding suggests both that learners will ignore canonical structure in favor of using all possible arguments and that children have a bias to assign a unique semantic role to each argument. It is important to note, however, that cross-linguistically children do not seem to generalize beyond two arguments until after at least 31 months of age (Goldberg et al., 2004; Bello, 2012), so a predicate occurring with three nouns would still likely be interpreted as merely transitive rather than ditransitive.

Computational modeling provides a way to test the computational level of processing (Marr, 1982). That is, given the input (child-directed speech, adult-directed speech, and environmental experiences), it is possible to probe the computational processes that result in the observed output. However, previous computational models of grammar induction (Klein and Manning, 2004), including infant grammar induction (Kwiatkowski et al., 2012), have not addressed filler-gap comprehension.⁴

⁴ As one reviewer notes, Joshi et al. (1990) and subsequent work show that filler-gap phenomena can be formally captured by mildly context-sensitive grammar formalisms; these have the virtue of scaling up to adult grammar, but due to their complexity, do not seem to have been described as models of early acquisition.

The closest work to that presented here is the work on BabySRL (Connor et al., 2008; Connor et al., 2009; Connor et al., 2010). BabySRL is a computational model of semantic role acquisition using a similar set of assumptions to the current work. BabySRL learns weights over ordering constraints (e.g. preverbal, second noun, etc.) to acquire semantic role labelling while still exhibiting 1-1 role bias. However, no analysis has evaluated the abil-

Susan   said   John   gave   girl   book
-3      -2     -1     0      1      2

Table 1: An example of a chunked sentence (Susan said John gave the girl a red book) with the sentence positions labelled.

        µ      σ      π
G_SC    -1     0.5    .999
G_SN    -1     3      .001
G_OC    1      0.5    .999
G_ON    1      3      .001
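To make the two tables above concrete, the following is a minimal sketch of how verb-relative positions and Gaussian mixtures over those positions could be used to score role placements. It is an illustration under stated assumptions, not the training or decoding procedure of the model: the reading of the S/O subscripts as subject/object and C/N as canonical/non-canonical, the choice of gave as the anchoring verb, and the helper names (relative_positions, role_density, best_role) are all hypothetical.

```python
import math

# (mean, standard deviation, prior) for each Gaussian, copied from the
# mu / sigma / pi values listed above. Grouping them into a "subject" and an
# "object" mixture is an assumption made for this sketch.
PARAMS = {
    "subject": [(-1.0, 0.5, 0.999), (-1.0, 3.0, 0.001)],  # G_SC, G_SN
    "object": [(1.0, 0.5, 0.999), (1.0, 3.0, 0.001)],     # G_OC, G_ON
}


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def role_density(role, position):
    """Mixture density of a role at a verb-relative position."""
    return sum(pi * gaussian_pdf(position, mu, sigma)
               for mu, sigma, pi in PARAMS[role])


def relative_positions(chunks, verb_index):
    """Label each chunk with its offset from the verb (the verb is 0)."""
    return [(chunk, i - verb_index) for i, chunk in enumerate(chunks)]


def best_role(position):
    """Role whose mixture gives the position the highest density."""
    return max(PARAMS, key=lambda role: role_density(role, position))


if __name__ == "__main__":
    chunks = ["Susan", "said", "John", "gave", "girl", "book"]
    for chunk, pos in relative_positions(chunks, verb_index=3):
        if pos != 0:
            print(chunk, pos, best_role(pos))
```

With these initial values, nearly all of the probability mass sits in the narrow components at -1 and +1, while the wide, low-prior components leave a small amount of density for arguments that appear farther from the verb. The sketch scores each position in isolation and does not distinguish nouns from other chunks.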