A Roadmap for Reverse-Engineering the Infant Language-Learner∗

Cognitive science in the era of artificial intelligence: A roadmap for reverse-engineering the infant language-learner∗ Emmanuel Dupoux EHESS, ENS, PSL Research University, CNRS, INRIA, Paris, France [email protected], www.syntheticlearner.net Abstract • Machine/humans comparison should be run on a benchmark of psycholinguistic tests. Spectacular progress in the information processing sci- ences (machine learning, wearable sensors) promises to revo- 1 Introduction lutionize the study of cognitive development. Here, we anal- yse the conditions under which ’reverse engineering’ lan- In recent years, artificial intelligence (AI) has been hit- guage development, i.e., building an effective system that ting the headlines with impressive achievements at matching mimics infant’s achievements, can contribute to our scien- or even beating humans in complex cognitive tasks (play- tific understanding of early language development. We argue ing go or video games: Mnih et al., 2015; Silver et al., that, on the computational side, it is important to move from 2016; processing speech and natural language: Amodei et toy problems to the full complexity of the learning situation, al., 2016; Ferrucci, 2012; recognizing objects and faces: He, and take as input as faithful reconstructions of the sensory Zhang, Ren, & Sun, 2015; Lu & Tang, 2014) and promising signals available to infants as possible. On the data side, a revolution in manufacturing processes and human society accessible but privacy-preserving repositories of home data at large. These successes show that with statistical learning have to be setup. On the psycholinguistic side, specific tests techniques, powerful computers and large amounts of data, have to be constructed to benchmark humans and machines it is possible to mimic important components of human cog- at different linguistic levels. We discuss the feasibility of this nition. Shockingly, some of these achievements have been approach and present an overview of current results. reached by throwing out some of the classical theories in linguistics and psychology, and by training relatively unstruc- Keywords tured neural network systems on large amounts of data. What does it tell us about the underlying psychological and/or neu- Artificial intelligence, speech, psycholinguistics, compu- ral processes that are used by humans to solve these tasks? tational modeling, corpus analysis, early language acquisi- Can AI provide us with scientific insights about human learn- tion, infant development, language bootstrapping, machine ing and processing? learning. Here, we argue that developmental psychology and in par- ticular, the study of language acquisition is one area where, Highlights indeed, AI and machine learning advances can be transfor- • Key theoretical puzzles of infant language develop- mational, provided that the involved fields make significant ment are still unsolved. adjustments in their practices in order to adopt what we call the reverse engineering approach. Specifically: arXiv:1607.08723v4 [cs.CL] 14 Feb 2018 • A roadmap for reverse engineering infant language learning using AI is proposed. The reverse engineering approach to the study of infant language acquisition consists in con- • AI algorithms should use realistic input (little or no structing scalable computational systems that supervision, raw data). can, when fed with realistic input data, mimic language acquisition as it is observed in infants. • Realistic input should be obtained by large scale recording in ecological environments. The three italicised terms will be discussed at length in subsequent sections of the paper. For now, only an intuitive understanding of these terms will suffice. The idea of us- * Author’s reprint of Dupoux, E. (2018). Cognitive sci- ing machine learning or AI techniques as a means to study ence in the era of artificial intelligence: A roadmap for reverse- child’s language learning is actually not new (to name a few: engineering the infant language learner. Cognition, 173, 43-59, doi: Kelley, 1967; Anderson, 1975; Berwick, 1985; Rumelhart 10.1016/j.cognition.2017.11.008. & McClelland, 1987; Langley & Carbonell, 1987) although 2 E. DUPOUX relatively few studies have concentrated on the early phases Third, the human language capacity can be viewed as a of language learning (see Brent, 1996b, for a pioneering col- finite computational system with the ability to generate a lection of essays). What is new, however, is that whereas (virtual) infinity of utterances. This turns into a learnabil- previous AI approaches were limited to proofs of principle ity problem for infants: on the basis of finite evidence, they on toy or miniature languages, modern AI techniques have have to induce the (virtual) infinity corresponding to their scaled up so much that end-to-end language processing sys- language. As has been discussed since Aristotle, such induc- tems working with real inputs are now deployed commer- tion problems do not have a generally valid solution. There- cially. This paper examines whether and how such unprece- fore, language is simultaneously a human-specific biological dented change in scale could be put to use to address linger- trait, a highly variable cultural production, and an apparently ing scientific questions in the field of language development. intractable learning problem. The structure of the paper is as follows: In Section 2, we Despite these complexities, most infants spontaneously present two deep scientific puzzles that large scale modeling learn their native(s) language(s) in a matter of a few years of approaches could in principle address: solving the bootstrap- immersion in a linguistic environment. The more we know ping problem, accounting for developmental trajectories. In about this simple fact, the more puzzling it appears. Specifi- Section 3, we review past theoretical and modeling work, cally, we outline two deep scientific puzzles that a reverse en- showing that these puzzles have not, so far, received an ad- gineering approach could, in principle help to solve: solving equate answer. In Section 4, we argue that to answer them the bootstrapping problem and accounting for developmental with reverse engineering, three requirements have to be ad- trajectories. The first puzzle relates to the ultimate outcome dressed: (1) modeling should be computationally scalable, of language learning: the so-called stable state, defined here (2) it should be done on realistic data, (3) model performance as the stabilized language competence in the adult. The sec- should be compared with that of humans. In Section 5, re- ond puzzle relates to what we know of the intermediate steps cent progress in AI is reviewed in light of these three require- in the acquisition process, and their variations as a function ments. In Section 6, we assess the feasibility of the reverse of language input.1 engineering approach and lay out the road map that has to be followed to reach its objectives 2.1 Solving the bootstrapping problem , and we conclude in Section 7. The stable state is the operational knowledge which en- ables adults to process a virtual infinity of utterances in their 2 Two deep puzzles of early language development native language. The most articulated description of this Language development is a theoretically important sub- stable state has been offered by theoretical linguistics; it is field within the study of human cognitive development for viewed as a grammar comprising several components: pho- the following three reasons: netics, phonology, morphology, syntax, semantics, pragmat- First, the linguistic system is uniquely complex: mastering ics. a language implies mastering a combinatorial sound system The bootstrapping problem arises from the fact these dif- (phonetics and phonology), an open ended morphologically ferent components appear interdependent from a learning structured lexicon, and a compositional syntax and seman- point of view. For instance, the phoneme inventory of a lan- tics (e.g., Jackendoff, 1997). No other animal communica- guage is defined through pairs of words that differ minimally tion system uses such a complex multilayered organization in sounds (e.g., "light" vs "right"). This would suggest that to (Hauser, Chomsky, & Fitch, 2002). On this basis, it has learn phonemes, infants need to first learn words. However, been claimed that humans have evolved (or acquired through from a processing viewpoint, words are recognized through a mutation) an innately specified computational architecture their phonological constituents (e.g., Cutler, 2012), suggest- to process language (see Chomsky, 1965; Steedman, 2014). ing that infants should learn phonemes before words. Sim- Second, the overt manifestations of this system are ex- ilar paradoxical co-dependency issues have been noted be- tremely variable across languages and cultures. Language tween other linguistic levels (for instance, syntax and seman- can be expressed through the oral or manual modality. In tics: Pinker, 1987, prosody and syntax: Morgan & Demuth, the oral modality, some languages use only 3 vowels, other 1996). In other words, in order to learn any one component more than 20. Consonants inventories vary from 6 to more of the language competence, many others need to be learned than 100. Words can be mostly composed of a single sylla- first, creating apparent circularities. ble (as in Chinese) or long strings of stems and affixes (as 1The two puzzles are not independent as they are two facets of in Turkish). Semantic roles can be identified through fixed the

A Roadmap for Reverse-Engineering the Infant Language-Learner∗

1. Introduction

Bootstrap” Into Syntax?*

Introduction

Computational Modeling of Human Language Acquisition Copyright © 2011 by Morgan & Claypool

Bootstrapping Into Filler-Gap: an Acquisition Story

The Labeling Problem in Syntactic Bootstrapping Main Clause Syntax in the Acquisition of Propositional Attitude Verbs∗

Information and Incrementality in Syntactic Bootstrapping

Bootstrapping Language Acquisition 1

185B/209B Computational Linguistics: Fragments

Bootstrapping Language Acquisition Q ⇑ Omri Abend ,1, Tom Kwiatkowski 2, Nathaniel J

Bootstrapping the Syntactic Bootstrapper: Probabilistic Labeling of Prosodic Phrases

How Could a Child Use Verb Syntax to Learn Verb Semantics? *