A Roadmap for Reverse-Engineering the Infant Language-Learner∗

Total Page:16

File Type:pdf, Size:1020Kb

A Roadmap for Reverse-Engineering the Infant Language-Learner∗ Cognitive science in the era of artificial intelligence: A roadmap for reverse-engineering the infant language-learner∗ Emmanuel Dupoux EHESS, ENS, PSL Research University, CNRS, INRIA, Paris, France [email protected], www.syntheticlearner.net Abstract • Machine/humans comparison should be run on a benchmark of psycholinguistic tests. Spectacular progress in the information processing sci- ences (machine learning, wearable sensors) promises to revo- 1 Introduction lutionize the study of cognitive development. Here, we anal- yse the conditions under which ’reverse engineering’ lan- In recent years, artificial intelligence (AI) has been hit- guage development, i.e., building an effective system that ting the headlines with impressive achievements at matching mimics infant’s achievements, can contribute to our scien- or even beating humans in complex cognitive tasks (play- tific understanding of early language development. We argue ing go or video games: Mnih et al., 2015; Silver et al., that, on the computational side, it is important to move from 2016; processing speech and natural language: Amodei et toy problems to the full complexity of the learning situation, al., 2016; Ferrucci, 2012; recognizing objects and faces: He, and take as input as faithful reconstructions of the sensory Zhang, Ren, & Sun, 2015; Lu & Tang, 2014) and promising signals available to infants as possible. On the data side, a revolution in manufacturing processes and human society accessible but privacy-preserving repositories of home data at large. These successes show that with statistical learning have to be setup. On the psycholinguistic side, specific tests techniques, powerful computers and large amounts of data, have to be constructed to benchmark humans and machines it is possible to mimic important components of human cog- at different linguistic levels. We discuss the feasibility of this nition. Shockingly, some of these achievements have been approach and present an overview of current results. reached by throwing out some of the classical theories in lin- guistics and psychology, and by training relatively unstruc- Keywords tured neural network systems on large amounts of data. What does it tell us about the underlying psychological and/or neu- Artificial intelligence, speech, psycholinguistics, compu- ral processes that are used by humans to solve these tasks? tational modeling, corpus analysis, early language acquisi- Can AI provide us with scientific insights about human learn- tion, infant development, language bootstrapping, machine ing and processing? learning. Here, we argue that developmental psychology and in par- ticular, the study of language acquisition is one area where, Highlights indeed, AI and machine learning advances can be transfor- • Key theoretical puzzles of infant language develop- mational, provided that the involved fields make significant ment are still unsolved. adjustments in their practices in order to adopt what we call the reverse engineering approach. Specifically: arXiv:1607.08723v4 [cs.CL] 14 Feb 2018 • A roadmap for reverse engineering infant language learning using AI is proposed. The reverse engineering approach to the study of infant language acquisition consists in con- • AI algorithms should use realistic input (little or no structing scalable computational systems that supervision, raw data). can, when fed with realistic input data, mimic language acquisition as it is observed in infants. • Realistic input should be obtained by large scale recording in ecological environments. The three italicised terms will be discussed at length in subsequent sections of the paper. For now, only an intuitive understanding of these terms will suffice. The idea of us- * Author’s reprint of Dupoux, E. (2018). Cognitive sci- ing machine learning or AI techniques as a means to study ence in the era of artificial intelligence: A roadmap for reverse- child’s language learning is actually not new (to name a few: engineering the infant language learner. Cognition, 173, 43-59, doi: Kelley, 1967; Anderson, 1975; Berwick, 1985; Rumelhart 10.1016/j.cognition.2017.11.008. & McClelland, 1987; Langley & Carbonell, 1987) although 2 E. DUPOUX relatively few studies have concentrated on the early phases Third, the human language capacity can be viewed as a of language learning (see Brent, 1996b, for a pioneering col- finite computational system with the ability to generate a lection of essays). What is new, however, is that whereas (virtual) infinity of utterances. This turns into a learnabil- previous AI approaches were limited to proofs of principle ity problem for infants: on the basis of finite evidence, they on toy or miniature languages, modern AI techniques have have to induce the (virtual) infinity corresponding to their scaled up so much that end-to-end language processing sys- language. As has been discussed since Aristotle, such induc- tems working with real inputs are now deployed commer- tion problems do not have a generally valid solution. There- cially. This paper examines whether and how such unprece- fore, language is simultaneously a human-specific biological dented change in scale could be put to use to address linger- trait, a highly variable cultural production, and an apparently ing scientific questions in the field of language development. intractable learning problem. The structure of the paper is as follows: In Section 2, we Despite these complexities, most infants spontaneously present two deep scientific puzzles that large scale modeling learn their native(s) language(s) in a matter of a few years of approaches could in principle address: solving the bootstrap- immersion in a linguistic environment. The more we know ping problem, accounting for developmental trajectories. In about this simple fact, the more puzzling it appears. Specifi- Section 3, we review past theoretical and modeling work, cally, we outline two deep scientific puzzles that a reverse en- showing that these puzzles have not, so far, received an ad- gineering approach could, in principle help to solve: solving equate answer. In Section 4, we argue that to answer them the bootstrapping problem and accounting for developmental with reverse engineering, three requirements have to be ad- trajectories. The first puzzle relates to the ultimate outcome dressed: (1) modeling should be computationally scalable, of language learning: the so-called stable state, defined here (2) it should be done on realistic data, (3) model performance as the stabilized language competence in the adult. The sec- should be compared with that of humans. In Section 5, re- ond puzzle relates to what we know of the intermediate steps cent progress in AI is reviewed in light of these three require- in the acquisition process, and their variations as a function ments. In Section 6, we assess the feasibility of the reverse of language input.1 engineering approach and lay out the road map that has to be followed to reach its objectives 2.1 Solving the bootstrapping problem , and we conclude in Section 7. The stable state is the operational knowledge which en- ables adults to process a virtual infinity of utterances in their 2 Two deep puzzles of early language development native language. The most articulated description of this Language development is a theoretically important sub- stable state has been offered by theoretical linguistics; it is field within the study of human cognitive development for viewed as a grammar comprising several components: pho- the following three reasons: netics, phonology, morphology, syntax, semantics, pragmat- First, the linguistic system is uniquely complex: mastering ics. a language implies mastering a combinatorial sound system The bootstrapping problem arises from the fact these dif- (phonetics and phonology), an open ended morphologically ferent components appear interdependent from a learning structured lexicon, and a compositional syntax and seman- point of view. For instance, the phoneme inventory of a lan- tics (e.g., Jackendoff, 1997). No other animal communica- guage is defined through pairs of words that differ minimally tion system uses such a complex multilayered organization in sounds (e.g., "light" vs "right"). This would suggest that to (Hauser, Chomsky, & Fitch, 2002). On this basis, it has learn phonemes, infants need to first learn words. However, been claimed that humans have evolved (or acquired through from a processing viewpoint, words are recognized through a mutation) an innately specified computational architecture their phonological constituents (e.g., Cutler, 2012), suggest- to process language (see Chomsky, 1965; Steedman, 2014). ing that infants should learn phonemes before words. Sim- Second, the overt manifestations of this system are ex- ilar paradoxical co-dependency issues have been noted be- tremely variable across languages and cultures. Language tween other linguistic levels (for instance, syntax and seman- can be expressed through the oral or manual modality. In tics: Pinker, 1987, prosody and syntax: Morgan & Demuth, the oral modality, some languages use only 3 vowels, other 1996). In other words, in order to learn any one component more than 20. Consonants inventories vary from 6 to more of the language competence, many others need to be learned than 100. Words can be mostly composed of a single sylla- first, creating apparent circularities. ble (as in Chinese) or long strings of stems and affixes (as 1The two puzzles are not independent as they are two facets of in Turkish). Semantic roles can be identified through fixed the
Recommended publications
  • 1. Introduction
    University of Groningen Specific language impairment in Dutch de Jong, Jan IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document version below. Document Version Publisher's PDF, also known as Version of record Publication date: 1999 Link to publication in University of Groningen/UMCG research database Citation for published version (APA): de Jong, J. (1999). Specific language impairment in Dutch: inflectional morphology and argument structure. s.n. Copyright Other than for strictly personal use, it is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license (like Creative Commons). Take-down policy If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim. Downloaded from the University of Groningen/UMCG research database (Pure): http://www.rug.nl/research/portal. For technical reasons the number of authors shown on this cover page is limited to 10 maximum. Download date: 25-09-2021 Specific Language Impairment in Dutch: Inflectional Morphology and Argument Structure Jan de Jong Copyright ©1999 by Jan de Jong Printed by Print Partners Ipskamp, Enschede Groningen Dissertations in Linguistics 28 ISSN 0928-0030 Specific Language Impairment in Dutch: Inflectional Morphology and Argument Structure Proefschrift ter verkrijging van het doctoraat in de letteren aan de Rijksuniversiteit Groningen op gezag van de Rector Magnificus, dr.
    [Show full text]
  • Bootstrap” Into Syntax?*
    Cognirion. 45 (1992) 77-100 What sort of innate structure is needed to “bootstrap” into syntax?* Martin D.S. Braine Department of Psychology, New York University, 6 Washington Place. 8th Floor, New York. NY 10003, LISA Received March 5. 1990, final revision accepted March 13, 1992 Abstract Braine, M.D.S., 1992. What sort of innate structure is needed to “bootstrap” into syntax? Cognition, 45: 77-100 The paper starts from Pinker’s theory of the acquisition of phrase structure; it shows that it is possible to drop all the assumptions about innate syntactic structure from this theory. These assumptions can be replaced by assumptions about the basic structure of semantic representation available at the outset of language acquisition, without penalizing the acquisition of basic phrase structure rules. Essentially, the role played by X-bar theory in Pinker’s model would be played by the (presumably innate) structure of the language of thought in the revised parallel model. Bootstrapping and semantic assimilation theories are shown to be formally very similar,. though making different primitive assumptions. In their primitives, semantic assimilation theories have the advantage that they can offer an account of the origin of syntactic categories instead of postulating them as primitive. Ways of improving on the semantic assimilation version of Pinker’s theory are considered, including a way of deriving the NP-VP constituent division that appears to have a better fit than Pinker’s to evidence on language variation. Correspondence to: Martin Braine, Department of Psychology, New York University, 6 Washing- ton Place. 8th Floor, New York, NY 10003.
    [Show full text]
  • Introduction
    Published in M. Bowerman and P. Brown (Eds.), 2008 Crosslinguistic Perspectives on Argument Structure: Implications for Learnability, p. 1-26. Majwah, NJ: Lawrence Erlbaum CHAPTER 1 Introduction Melissa Bowerman and Penelope Brown Max Planck Institute for Psycholinguistics Verbs are the glue that holds clauses together. As elements that encode events, verbs are associated with a core set of semantic participants mat take part in the event. Some of a verb's semantic participants, although not necessarily all, are mapped to roles that are syntactically relevant in the clause, such as subject or direct object; these are the arguments of the verb. For example, in John kicked the ball, 'John' and 'the ball' are semantic participants of the verb kick, and they are also its core syntactic arguments—the subject and the direct object, respectively. Another semantic participant, 'foot', is also understood, but it is not an argument; rather, it is incorpo- rated directly into the meaning of the verb. The array of participants associated with verbs and other predicates, and how these participants are mapped to syntax, are the focus of the Study of ARGUMENT STRUCTURE. At one time, argument structure meant little more than the number of arguments appearing with a verb, for example, one for an intransitive verb, two for a transitive verb. But argument structure has by now taken on a central theoretical position in the study of both language structure and language development. In linguistics, ar- gument structure is seen as a critical interface between the lexical semantic properties of verbs and the morphosyntactic properties of the clauses in which they appear (e.g., Grimshaw, 1990; Goldberg, 1995; Hale & Keyser, 1993; Levin & Rappaport Hovav, 1995; Jackendoff, 1990).
    [Show full text]
  • Computational Modeling of Human Language Acquisition Copyright © 2011 by Morgan & Claypool
    SeriesSeries ISSN: ISSN: 1947-4040 1947-4040 ALISHAHI ALISHAHI ALISHAHI SSYNTHESISYNTHESIS LLECTURESECTURES ONON MM MorganMorgan& & ClaypoolClaypool PublishersPublishers HHUMANUMAN L LANGUAGEANGUAGE T TECHNOLOGIESECHNOLOGIES &&CC SeriesSeries Editor: Editor: Graeme Graeme Hirst, Hirst, University University of of Toronto Toronto ComputationalComputational Modeling Modeling of of Human Human Language Language Acquisition Acquisition COMPUTATIONAL MODELING OF HUMAN LANGUAGE ACQUISITION COMPUTATIONAL MODELING COMPUTATIONAL OF MODELING HUMAN OF LANGUAGE HUMAN ACQUISITION LANGUAGE ACQUISITION AfraAfra Alishahi, Alishahi, University University of of Saarlandes Saarlandes ComputationalComputational ModelingModeling HumanHuman language language acquisition acquisition has has been been studied studied for for centuries, centuries, but but using using computational computational modeling modeling for for such such studiesstudiesstudies is isis a a arelatively relativelyrelatively recent recentrecent trend. trend.trend. However, However,However, computational computationalcomputational approaches approachesapproaches to toto language languagelanguage learning learninglearning have havehave become becomebecome ofof HumanHuman LanguageLanguage increasinglyincreasinglyincreasingly popular, popular,popular, mainly mainlymainly due duedue to toto advances advancesadvances in inin developing developingdeveloping machine machinemachine learning learninglearning techniques, techniques,techniques, and andand the thethe availability availabilityavailability
    [Show full text]
  • Bootstrapping Into Filler-Gap: an Acquisition Story
    Bootstrapping into Filler-Gap: An Acquisition Story Marten van Schijndel Micha Elsner The Ohio State University vanschm,melsner @ling.ohio-state.edu { } Abstract Age 13mo 15mo 20mo 25mo Wh-S No Yes Yes Yes Analyses of filler-gap dependencies usu- ally involve complex syntactic rules or (Yes) heuristics; however recent results suggest Wh-O No Yes Yes that filler-gap comprehension begins ear- lier than seemingly simpler constructions 1-1 Yes No such as ditransitives or passives. Therefore, this work models filler-gap acquisition as a byproduct of learning word orderings (e.g. Figure 1: The developmental timeline of subject SVO vs OSV), which must be done at a (Wh-S) and object (Wh-O) wh-clause extraction very young age anyway in order to extract comprehension suggested by experimental results meaning from language. Specifically, this (Seidl et al., 2003; Gagliardi et al., 2014). Paren- model, trained on part-of-speech tags, rep- theses indicate weak comprehension. The final row resents the preferred locations of semantic shows the timeline of 1-1 role bias errors (Naigles, roles relative to a verb as Gaussian mix- 1990; Gertner and Fisher, 2012). Missing nodes de- tures over real numbers. note a lack of studies. This approach learns role assignment in filler-gap constructions in a manner con- sistent with current developmental findings acquired through learning word orderings rather and is extremely robust to initialization than relying on hierarchical syntactic knowledge. variance. Additionally, this model is shown to be able to account for a characteristic er- This work describes a cognitive model of the de- ror made by learners during this period (A velopmental timecourse of filler-gap comprehension and B gorped interpreted as A gorped B).
    [Show full text]
  • The Labeling Problem in Syntactic Bootstrapping Main Clause Syntax in the Acquisition of Propositional Attitude Verbs∗
    The labeling problem in syntactic bootstrapping Main clause syntax in the acquisition of propositional attitude verbs∗ Aaron Steven White Valentine Hacquard Jeffrey Lidz Johns Hopkins University University of Maryland University of Maryland Abstract In English, the distinction between belief verbs, such as think, and desire verbs, such as want, is tracked by the tense of those verbs’ subordinate clauses. This suggests that subordi- nate clause tense might be a useful cue for learning these verbs’ meanings via syntactic boot- strapping. However, the correlation between tense and the belief v. desire distinction is not cross-linguistically robust; yet these verbs’ acquisition profile is similar cross-linguistically. Our proposal in this chapter is that, instead of using concrete cues like subordinate clause tense, learners may utilize more abstract syntactic cues that must be tuned to the syntactic distinctions present in a particular language. We present computational modeling evidence supporting the viability of this proposal. 1 Introduction Syntactic bootstrapping encompasses a family of approaches to verb learning wherein learners use the syntactic contexts a verb is found in to infer its meaning (Landau and Gleitman, 1985; Gleitman, 1990). Any such approach must solve two problems. First, it must specify how learners cluster verbs—i.e. figure out that some set of verbs shares some meaning component— according to their syntactic distributions. For instance, an English learner might cluster verbs based on whether they embed tensed subordinate clauses (1) or whether they take a noun phrase (2). (1) a. John fthinks, believes, knowsg that Mary is happy. b. *John fwants, needs, ordersg that Mary is happy.
    [Show full text]
  • Information and Incrementality in Syntactic Bootstrapping
    ABSTRACT Title of dissertation: INFORMATION AND INCREMENTALITY IN SYNTACTIC BOOTSTRAPPING Aaron Steven White, Doctor of Philosophy, 2015 Dissertation directed by: Professor Valentine Hacquard Department of Linguistics Some words are harder to learn than others. For instance, action verbs like run and hit are learned earlier than propositional attitude verbs like think and want. One reason think and want might be learned later is that, whereas we can see and hear running and hitting, we can't see or hear thinking and wanting. Children nevertheless learn these verbs, so a route other than the senses must exist. There is mounting evidence that this route involves, in large part, inferences based on the distribution of syntactic contexts a propositional attitude verb occurs in|a process known as syntactic bootstrapping. This fact makes the domain of propositional attitude verbs a prime proving ground for models of syntactic bootstrapping. With this in mind, this dissertation has two goals: on the one hand, it aims to construct a computational model of syntactic bootstrapping; on the other, it aims to use this model to investigate the limits on the amount of information about propo- sitional attitude verb meanings that can be gleaned from syntactic distributions. I show throughout the dissertation that these goals are mutually supportive. In Chapter 1, I set out the main problems that drive the investigation. In Chapters 2 and 3, I use both psycholinguistic experiments and computational mod- eling to establish that there is a significant amount of semantic information carried in both participants' syntactic acceptability judgments and syntactic distributions in corpora.
    [Show full text]
  • Bootstrapping Language Acquisition 1
    BOOTSTRAPPING LANGUAGE ACQUISITION 1 Bootstrapping Language Acquisition Omri Abend and Tom Kwiatkowski and Nathaniel J. Smith and Sharon Goldwater and Mark Steedman Institute of Language, Cognition and Computation School of Informatics University of Edinburgh 1 Abstract The semantic bootstrapping hypothesis proposes that children acquire their native language through exposure to sentences of the language paired with structured rep- resentations of their meaning, whose component substructures can be associated with words and syntactic structures used to express these concepts. The child’s task is then to learn a language-specific grammar and lexicon based on (proba- bly contextually ambiguous, possibly somewhat noisy) pairs of sentences and their meaning representations (logical forms). Starting from these assumptions, we develop a Bayesian probabilistic account of semantically bootstrapped first-language acquisition in the child, based on tech- niques from computational parsing and interpretation of unrestricted text. Our learner jointly models (a) word learning: the mapping between components of the given sentential meaning and lexical words (or phrases) of the language, and (b) syntax learning: the projection of lexical elements onto sentences by universal construction-free syntactic rules. Using an incremental learning algorithm, we ap- ply the model to a dataset of real syntactically complex child-directed utterances and (pseudo) logical forms, the latter including contextually plausible but irrelevant distractors. Taking the Eve section of the CHILDES corpus as input, the model simulates several well-documented phenomena from the developmental literature. In particular, the model exhibits syntactic bootstrapping effects (in which previ- ously learned constructions facilitate the learning of novel words), sudden jumps in learning without explicit parameter setting, acceleration of word-learning (the “vocabulary spurt”), an initial bias favoring the learning of nouns over verbs, and one-shot learning of words and their meanings.
    [Show full text]
  • 185B/209B Computational Linguistics: Fragments
    185b/209b Computational Linguistics: Fragments Edward Stabler* 2005-01-05 We will study the basic properties of human linguistic abilities by looking first at much simpler languages, 'language fragments', with particular attention to our ability to learn the semantic values of expressions and the role this could have in 'bootstrapping' the language learner. * Thanks to many contributions from the 2005 class: Sarah Churng, Eric Fiala, Jeff Heinz, Ben Keil, Ananda Lima, Everett Mettler, Steve Wedig, Christine Yusi. Contents 1 Bootstrapping: first questions 1 2 The simplest languages 7 3 The simplest languages: concept learning 15 4 The simplest languages: concept learning 2 25 5 The simplest languages: concept learning 3 33 6 The simplest languages: recognizing the symbols 45 7 The simplest languages: loose ends 67 8 Subject-Predicate languages with every 75 9 Subject-Predicate languages with every, part 2 81 10 Languages with every and relative clauses 89 11 Languages with every and relative clauses: grammar 97 12 Languages with every and relative clauses: conclusion 109 13 Languages with some 115 14 Learning languages with some,every 125 15 Learning languages with some,every, part 2 135 16 Learning quantifiers 143 17 The syllogistic fragment 153 18 Inferences in the syllogistic fragment 165 19 Learning the syllogistic fragment 175 20 Some next steps 189 Stabler, 2005-01-05 ii - ! . / / 01-01232 &-441 0-+-. $ --2- /* &$ .$ 3043.5 -+3. $.$ $ 6 / 7 % $ $ $ .8 8.% $ !
    [Show full text]
  • Bootstrapping Language Acquisition Q ⇑ Omri Abend ,1, Tom Kwiatkowski 2, Nathaniel J
    Cognition 164 (2017) 116–143 Contents lists available at ScienceDirect Cognition journal homepage: www.elsevier.com/locate/COGNIT Bootstrapping language acquisition q ⇑ Omri Abend ,1, Tom Kwiatkowski 2, Nathaniel J. Smith 3, Sharon Goldwater, Mark Steedman Informatics, University of Edinburgh, United Kingdom article info abstract Article history: The semantic bootstrapping hypothesis proposes that children acquire their native language through Received 10 February 2016 exposure to sentences of the language paired with structured representations of their meaning, whose Revised 1 November 2016 component substructures can be associated with words and syntactic structures used to express these Accepted 22 February 2017 concepts. The child’s task is then to learn a language-specific grammar and lexicon based on (probably contextually ambiguous, possibly somewhat noisy) pairs of sentences and their meaning representations Parts of this work were previously (logical forms). presented at EMNLP 2010, EACL 2012, and Starting from these assumptions, we develop a Bayesian probabilistic account of semantically boot- appeared in Kwiatkowski’s doctoral strapped first-language acquisition in the child, based on techniques from computational parsing and dissertation interpretation of unrestricted text. Our learner jointly models (a) word learning: the mapping between components of the given sentential meaning and lexical words (or phrases) of the language, and (b) syn- Keywords: tax learning: the projection of lexical elements onto sentences by universal construction-free syntactic Language acquisition rules. Using an incremental learning algorithm, we apply the model to a dataset of real syntactically com- Syntactic bootstrapping plex child-directed utterances and (pseudo) logical forms, the latter including contextually plausible but Semantic bootstrapping Computational modeling irrelevant distractors.
    [Show full text]
  • Bootstrapping the Syntactic Bootstrapper: Probabilistic Labeling of Prosodic Phrases
    Language Acquisition ISSN: 1048-9223 (Print) 1532-7817 (Online) Journal homepage: http://www.tandfonline.com/loi/hlac20 Bootstrapping the Syntactic Bootstrapper: Probabilistic Labeling of Prosodic Phrases Ariel Gutman, Isabelle Dautriche, Benoît Crabbé & Anne Christophe To cite this article: Ariel Gutman, Isabelle Dautriche, Benoît Crabbé & Anne Christophe (2015) Bootstrapping the Syntactic Bootstrapper: Probabilistic Labeling of Prosodic Phrases, Language Acquisition, 22:3, 285-309, DOI: 10.1080/10489223.2014.971956 To link to this article: http://dx.doi.org/10.1080/10489223.2014.971956 Accepted author version posted online: 06 Oct 2014. Published online: 15 Dec 2014. Submit your article to this journal Article views: 357 View related articles View Crossmark data Full Terms & Conditions of access and use can be found at http://www.tandfonline.com/action/journalInformation?journalCode=hlac20 Download by: [68.231.213.37] Date: 25 November 2015, At: 14:52 Language Acquisition, 22: 285–309, 2015 Copyright © Taylor & Francis Group, LLC ISSN: 1048-9223 print / 1532-7817 online DOI: 10.1080/10489223.2014.971956 Bootstrapping the Syntactic Bootstrapper: Probabilistic Labeling of Prosodic Phrases Ariel Gutman University of Konstanz Isabelle Dautriche Ecole Normale Supérieure, PSL Research University, CNRS, EHESS Benoît Crabbé Université Paris Diderot, Sorbonne Paris Cité, INRIA, IUF Anne Christophe Ecole Normale Supérieure, PSL Research University, CNRS, EHESS The syntactic bootstrapping hypothesis proposes that syntactic structure provides children with cues for learning the meaning of novel words. In this article, we address the question of how children might start acquiring some aspects of syntax before they possess a sizeable lexicon. The study presents two models of early syntax acquisition that rest on three major assumptions grounded in the infant literature: First, infants have access to phrasal prosody; second, they pay attention to words situated at the edges of prosodic boundaries; third, they know the meaning of a handful of words.
    [Show full text]
  • How Could a Child Use Verb Syntax to Learn Verb Semantics? *
    Lingua 92 (1994) 377410. North-Holland 377 How could a child use verb syntax to learn verb semantics? * Steven Pinker Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, EIO-016, Cambridge, MA 02139, USA I examine Gleitman’s (1990) arguments that children rely on a verb’s syntactic subcategoriza- tion frames to learn its meaning (e.g., they learn that see means ‘perceive visually’ because it can appear with a direct object, a clausal complement, or a directional phrase). First, Gleitman argues that the verbs cannot be learned by observing the situations in which they are used, because many verbs refer to overlapping situations, and because parents do not invariably use a verb when its perceptual correlates are present. I suggest that these arguments speak only against a narrow associationist view in which the child is sensitive to the temporal contiguity of sensory features and spoken verb. If the child can hypothesize structured semantic representations corresponding to what parents are likely to be referring to, and can refine such representations across multiple situations, the objections are blunted; indeed, Gleitman’s theory requires such a learning process despite her objections to it. Second, Gleitman suggests that there is enough information in a verb’s subcategorization frames to predict its meaning ‘quite closely’. Evaluating this argument requires distinguishing a verb’s root plus its semantic content (what She boiled the water shares with The water boiled and does not share with She broke the glass), and a verb frame plus its semantic perspective (what She boiled the water shares with She broke the glass and does not share with The water boiled).
    [Show full text]