A Developmental Model of Syntax Acquisition in the Construction Grammar Framework with Cross-Linguistic Validation in English and Japanese
Peter Ford Dominey
Sequential Cognition and Language Group
Institut des Sciences Cognitives, CNRS
69675 Bron CEDEX, France
[email protected]

Toshio Inui
Graduate School of Informatics
Kyoto University
Yoshida-honmachi, Sakyo-ku, 606-8501 Kyoto, Japan
[email protected]

Abstract

The current research demonstrates a system, inspired by cognitive neuroscience and developmental psychology, that learns to construct mappings between the grammatical structure of sentences and the structure of their meaning representations. Sentence to meaning mappings are learned and stored as grammatical constructions. These are stored in and retrieved from a construction inventory based on the constellation of closed class items that uniquely identifies each construction. These learned mappings allow the system to process natural language sentences in order to reconstruct complex internal representations of the meanings these sentences describe. The system demonstrates error-free performance and systematic generalization for a rich subset of English constructions that includes complex hierarchical grammatical structure, and generalizes systematically to new sentences of the learned construction categories. Further testing demonstrates (1) the capability to accommodate a significantly extended set of constructions, and (2) extension to Japanese, a free word order language that is structurally quite different from English, thus demonstrating the extensibility of the structure mapping model.

1 Introduction

The nativist perspective on the problem of language acquisition holds that the <sentence, meaning> data to which the child is exposed is highly indeterminate, and underspecifies the mapping to be learned. This “poverty of the stimulus” is a central argument for the existence of a genetically specified universal grammar, such that language acquisition consists of configuring the UG for the appropriate target language (Chomsky 1995). In this framework, once a given parameter is set, its use should apply to new constructions in a generalized, generative manner.

An alternative functionalist perspective holds that learning plays a much more central role in language acquisition. The infant develops an inventory of grammatical constructions as mappings from form to meaning (Goldberg 1995). These constructions are initially rather fixed and specific, and later become generalized into the more abstract compositional form employed by the adult (Tomasello 1999, 2003). In this context, construction of the relation between perceptual and cognitive representations and grammatical form plays a central role in learning language (e.g. Feldman et al. 1990, 1996; Langacker 1991; Mandler 1999; Talmy 1998).

These issues of learnability and innateness have provided a rich motivation for simulation studies that have taken a number of different forms. Elman (1990) demonstrated that recurrent networks are sensitive to predictable structure in grammatical sequences. Subsequent studies of grammar induction demonstrate how syntactic structure can be recovered from sentences (e.g. Stolcke & Omohundro 1994). From the “grounding of language in meaning” perspective (e.g. Feldman et al. 1990, 1996; Langacker 1991; Goldberg 1995), Chang & Maia (2001) exploited the relations between action representation and simple verb frames in a construction grammar approach. In an effort to consider more complex grammatical forms, Miikkulainen (1996) demonstrated a system that learned the mapping between relative phrase constructions and multiple event representations, based on the use of a stack for maintaining state information during the processing of the next embedded clause in a recursive manner.

In a more generalized approach, Dominey (2000) exploited the regularity that sentence to meaning mapping is encoded in all languages by word order and grammatical marking (bound or free) (Bates et al. 1982). That model was based on the functional neurophysiology of cognitive sequence and language processing and an associated neural network model that has been demonstrated to simulate interesting aspects of infant (Dominey & Ramus 2000) and adult language processing (Dominey et al. 2003).

2 Structure mapping for language learning

The mapping of sentence form onto meaning (Goldberg 1995) takes place at two distinct levels in the current model: words are associated with individual components of event descriptions, and grammatical structure is associated with functional roles within scene events. The first level has been addressed by Siskind (1996), Roy & Pentland (2002) and Steels (2001), and we treat it here in a relatively simple but effective manner. Our principal interest lies in the second level, the mapping between scene and sentence structure. Equations 1-7 implement the model depicted in Figure 1, and are derived from a neurophysiologically motivated model of sensorimotor sequence learning (Dominey et al. 2003).

2.1 Word Meaning

Equation (1) describes the associative memory, WordToReferent, that links word vectors in the OpenClassArray (OCA) with their referent vectors in the SceneEventArray (SEA)¹. In the initial learning phases there is no influence of syntactic knowledge, and the word-referent associations are stored in the WordToReferent matrix (Eqn 1) by associating every word with every referent in the current scene (α = 1), exploiting the cross-situational regularity (Siskind 1996) that a given word will have a higher coincidence with the referent to which it refers than with other referents. This initial word learning contributes to learning the mapping between sentence and scene structure (Eqns. 4, 5 & 6 below). Then, knowledge of the syntactic structure, encoded in SentenceToScene, can be used to identify the appropriate referent (in the SEA) for a given word (in the OCA), corresponding to a zero value of α in Eqn. 1. In this “syntactic bootstrapping” for the new word “gugle,” for example, syntactic knowledge of the Agent-Event-Object structure of the sentence “John pushed the gugle” can be used to assign “gugle” to the object of push.

WordToReferent(i,j) = WordToReferent(i,j) + OCA(k,i) * SEA(m,j) * max(α, SentenceToScene(m,k))   (1)

¹ In Eqn 1, the index k = 1 to 6, corresponding to the maximum number of words in the open class array (OCA); index m = 1 to 6, corresponding to the maximum number of elements in the scene event array (SEA); indices i and j = 1 to 25, corresponding to the word and scene item vector sizes, respectively.
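To make the Eqn 1 update concrete, the following NumPy sketch applies the association rule to every open class word / scene element pair, using the array sizes given in the footnote. The function and variable names, the explicit double loop over word and scene-element positions, and the random binary vectors in the usage lines are illustrative assumptions; this is a minimal sketch of the cross-situational update, not the authors' implementation.

    import numpy as np

    def update_word_to_referent(word_to_referent, oca, sea, sentence_to_scene, alpha):
        # One application of Eqn 1: strengthen the association between each open
        # class word vector OCA(k,:) and each scene referent vector SEA(m,:),
        # gated by max(alpha, SentenceToScene(m,k)).
        #   word_to_referent  : (25, 25) associative matrix, indexed (i, j)
        #   oca               : (6, 25) open class word vectors, rows indexed by k
        #   sea               : (6, 25) scene element vectors, rows indexed by m
        #   sentence_to_scene : (6, 6) current syntactic mapping, indexed (m, k)
        #   alpha             : 1.0 for purely cross-situational learning,
        #                       0.0 once syntax selects the referent for each word
        for k in range(oca.shape[0]):
            for m in range(sea.shape[0]):
                gate = max(alpha, sentence_to_scene[m, k])
                # Outer product gives OCA(k,i) * SEA(m,j) for all (i, j).
                word_to_referent += gate * np.outer(oca[k], sea[m])
        return word_to_referent

    # Usage with the dimensions of the footnote and no syntactic knowledge yet.
    rng = np.random.default_rng(0)
    oca = rng.integers(0, 2, size=(6, 25)).astype(float)
    sea = rng.integers(0, 2, size=(6, 25)).astype(float)
    w2r = update_word_to_referent(np.zeros((25, 25)), oca, sea,
                                  np.zeros((6, 6)), alpha=1.0)

With alpha = 1 the gate is always open, so every word is associated with every referent in the current scene; with alpha = 0 only the pairings licensed by the current SentenceToScene mapping are reinforced, which corresponds to the “syntactic bootstrapping” case described above.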
2.2 Open vs Closed Class Word Categories

Our approach is based on the cross-linguistic observation that open class words (e.g. nouns, verbs, adjectives and adverbs) are assigned to their thematic roles based on word order and/or grammatical function words or morphemes (Bates et al. 1982). Newborn infants are sensitive to the perceptual properties that distinguish these two categories (Shi et al. 1999), and in adults, these categories are processed by dissociable neurophysiological systems (Brown et al. 1999). Similarly, artificial neural networks can also learn to make this function/content distinction (Morgan et al. 1996). Thus, for the speech input that is provided to the learning model, open and closed class words are directed to separate processing streams that preserve their order and identity, as indicated in Figure 2.

Figure 1. Structure-Mapping Architecture. 1. Lexical categorization. 2. Open class words in the Open Class Array are translated to Predicted Referents in the PRA via the WordToReferent mapping. 3. PRA elements are mapped onto their roles in the SceneEventArray by the SentenceToScene mapping, specific to each sentence type. 4. This mapping is retrieved from the Construction Inventory, via the ConstructionIndex that encodes the closed class words that characterize each grammatical construction type.

2.3 Mapping Sentence to Meaning

Meanings are encoded in an event predicate-argument representation corresponding to the SceneEventArray in Figure 1 (e.g. push(triangle, block) for “The triangle pushed the block”). The sentence to meaning mapping can then be characterized in the following successive steps. First, words in the Open Class Array are decoded into their corresponding scene referents (via the WordToReferent mapping) to yield the Predicted Referents Array (PRA), which contains the translated words while preserving their original order from the OCA (Eqn 2).

PRA(k,j) = Σ_{i=1}^{n} OCA(k,i) * WordToReferent(i,j)   (2)
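Because the sum in Eqn 2 runs over the word-vector dimension, the entire Predicted Referents Array can be computed as a single matrix product. The short NumPy sketch below assumes the dimensions given in the footnote to Eqn 1 (a 6 x 25 OCA and a 25 x 25 WordToReferent matrix); the variable names are illustrative, not the authors' implementation.

    import numpy as np

    # Eqn 2 as a matrix product: PRA(k,j) = sum_i OCA(k,i) * WordToReferent(i,j)
    oca = np.zeros((6, 25))                  # open class word vectors, one per row k
    word_to_referent = np.zeros((25, 25))    # associative matrix learned via Eqn 1
    pra = oca @ word_to_referent             # Predicted Referents Array, shape (6, 25)

Row k of pra is the predicted referent vector for the open class word in position k, so the original word order of the OCA is preserved, as described above.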
Next, each sentence type will correspond to a specific form to meaning mapping between the PRA and the SEA, encoded in the SentenceToScene array. The problem will be to retrieve, for each sentence type, the appropriate corresponding SentenceToScene mapping. To solve this problem, we recall that each sentence type will have a unique constellation of closed class words and/or bound morphemes (Bates et al. 1982).

The current mapping, SentenceToSceneCurrent, is extracted by associating PRA referents to scene elements (in SEA), as in Eqn 4. The resulting SentenceToSceneCurrent encodes the correspondence between word order (that is preserved in the PRA, Eqn 2) and thematic roles in the SEA. Note that the quality of SentenceToSceneCurrent will depend on the quality of acquired word meanings in WordToReferent. Thus, syntactic learning requires a minimum baseline of semantic knowledge.

SentenceToSceneCurrent(m,k) = Σ_{i=1}^{n} PRA(k,i) * SEA(m,i)   (4)

Given the SentenceToSceneCurrent mapping for the current sentence, we can now associate it in the ConstructionInventory with the corresponding function word configuration or