Analogical Modeling: an Update

Total Page:16

File Type:pdf, Size:1020Kb

Analogical Modeling: an Update Analogical Modeling: An Update David Eddington and Deryle Lonsdale Department of Linguistics & English Language Brigham Young University Provo, UT, USA 84602 {eddington,lonz}@byu.edu Abstract accessed. A search is conducted for the stored ex- emplars that are most similar to the one whose be- Analogical modeling is a supervised havior is being predicted. The behavior of highly exemplar-based approach that has been similar stored entities generally predicts the behav- widely applied to predict linguistic behavior. ior of the one in question, although less similar ones The paradigm has been well documented in have a small chance of applying as well. the linguistics and cognition literature, but AM has been implemented in a succession of is less well known to the machine learning computer implementations that allow users to spec- community. This paper sets out some of the ify an exemplar base and then to test unseen in- basics of the approach, including a simpli- stances of language behavior against this accumu- fied example of the fundamental algorithm’s lated store to predict (and quantify) the relevant and operation. It then surveys some of the recent possible outcome(s). analogical modeling language applications, and sketches how the computational system A comprehensive examination of the system is has been enhanced lately to offer users beyond the scope of this paper1. Instead, an at- increased flexibility and processing power. tempt will be made to situate AM within the space of Some comparisons and contrasts are drawn exemplar-based modeling systems and, to a certain between analogical modeling and other extent, within the larger context of machine learning language modeling and machine learning systems and cognitive modeling systems in general. approaches. The paper concludes with a This paper begins by giving an informal overview discussion of ongoing issues that still con- of the algorithm used to perform AM. To accomplish front developers and users of the analogical this it traces in schematic form an early but typi- modeling framework. cal application of the AM approach: Spanish gen- der guessing. The subsequent section discusses AM 1 Introduction with respect to other types of modeling systems, and Analogical modeling (AM) is an exemplar-based gives an overview of the types of linguistic phenom- modeling approach designed to predict linguistic be- ena modeled by AM. The paper then discusses sev- havior on the basis of stored memory tokens (Sk- eral items of recent and ongoing work with the AM ousen, 1989; Skousen, 1992; Skousen, 1995; Sk- program. The conclusion mentions possible areas ousen, 1998). It is founded on the premise that for future improvement and investigation. previous linguistic experience is stored in the men- tal lexicon. When the need arises to determine 1For further information including a bibliography some linguistic behavior (pronunciation, morpho- and software downloads see the project webpage at: logical relationship, word, etc.), the lexicon itself is http://humanities.byu.edu/am Subcontexts Members of the Subcontext Pointers # of Disagreements pan none none 0 ¯pan plan M plan M > plan M 0 p¯an none none 0 pa¯n paz F, par M paz F > paz F 2 par M > par M *paz F > par M *par M > paz F ¯pa¯n cal F cal F > cal F 0 p¯an¯ tren M, ron M tren M > tren M 0 ron M > ron M tren M > ron M ron M > tren M p¯a¯n none none 0 p¯a¯n¯ rey M rey M > rey M 0 Table 1: Subcontexts and their disagreements. 2 The AM algorithm are marked with asterisks in Table 1. A disagree- ment occurs when words that are equally similar to Perhaps the best way to understand the AM algo- the given context, exhibit different behaviors, in this rithm is with a concrete illustration. Predictions are case, different genders. The number of disagree- always made in terms of specific exemplars, there- ments is determined by pairing all members of a sub- fore, for the purposes of the example, the following context with every other member, including itself, seven monosyllabic Spanish nouns and their corre- 2 by means of unidirectional pointers, and counting sponding gender will be considered : the number of times the members of the pair have paz F different behaviors. In this example, the only sub- plan M context containing any disagreement is pa¯n. tren M cal F Subcontexts are then arranged into more compre- ron M hensive groups called supracontexts as shown in Ta- par M ble 2. In Table 2, a hyphen indicates a wildcard. rey M In the subcontextual analysis a tally of the all The task will be to predict the gender of pan of the subcontextual disagreements is made. The ’bread’ on the basis of these seven items as the supracontextual analysis consists of analyzing all of set of exemplars. The three variables used are the the words that appear in a given supracontext, and phoneme or phoneme cluster of the onset, nucleus, again tallying disagreements (Table 3). and rhyme. First, all possible exemplar items are as- In the supracontexutal analysis, words that have signed to a series of subcontexts which are defined more than one variable in common with pan appear in terms of the given context pan. This yields eight in more that one supracontext. This is AM’s way of subcontexts (Table 1). A bar over a variable indi- allowing the gender of words which are more similar cates that any value of the variable except the barred to pan to influence the gender assignment of pan to value is permitted. a greater extent. By assigning members to subcontexts it is pos- The purpose of AM’s algorithm is to determine sible to determine disagreements. Disagreements which members of the exemplar base are most likely 2This example is a simplification; see (Eddington, 2002) for to affect the gender assignment of pan, and also to a complete AM treatment of this phenomenon. calculate the extent of analogical influence exerted. Supracontext Subcontexts in Supracontext # Subcontextual Disagreements p a n pan 0 p a – pan, *pa¯n 2 p – n pan, p¯an 0 – a n pan, ¯pan 0 p – – pan, p¯an, *pa¯n, p¯a¯n 2 – a – pan, ¯pan, *pa¯n, ¯pa¯n 2 – – n pan, ¯pan, p¯an, ¯p¯an 0 – – – pan, ¯pan, p¯an, *pa¯n, ¯pa¯n, ¯p¯an, p¯a¯n, ¯p¯a¯n 2 Table 2: Subcontextual analysis. This is accomplished by calculating heterogeneity. the influence of the analogical set on the behavior Heterogeneity is determined by comparing the num- of the given context. One is to assign the most fre- ber of disagreements in the supracontextual and sub- quently occurring behavior to the given context (se- contextual analyses. If there are more disagreements lection by plurality). Of the 18 pointers in the set, in the supracontextual analysis, the supracontext is 14 point to masculine. Therefore, selection by plu- heterogenous, and its members are eliminated from rality would assign masculine gender to pan. The consideration as possible analogs. If the number of other method (random selection) involves randomly disagreements does not increase, the supracontext is selecting a pointer, and assigning the behavior of the homogenous. word indicated by the pointer to the given context. Words belonging to homogenous supracontexts In this case, the probability of masculine gender as- comprise the analogical set (Table 4). In the example signment would be 77.78% (14/18). under consideration, disagreements increase in the supracontexts – – –, and – a –. Therefore, their mem- 3 Situating AM bers are eliminated from consideration. Cal, and rey The AM theoretical construct has been implemented appear exclusively in these heterogenous supracon- computationally through several generations of ap- texts. As a result, they do not form part of the ana- plication programs. Pascal code for the first version logical set. The word plan is also a member of both was listed as an appendix in (Skousen, 1989). A – – –, and – a –, however, it is also a member of the subsequent version was implemented in C. In recent homogenous supracontexts – a n, and – – n, so it will years the system was reimplemented as a Perl mod- still be available to influence pan. ule; extensive details on how to use the current ver- It should not be surprising that rey would be elim- sion are available in (Skousen et al., 2002). inated through heterogeneity; it has no phonemes in The system assumes a priori labeled input in- common with pan. However, consider the words ron stances, and thus only works in supervised learn- and cal. Both share only one feature with pan, yet ing mode. Each exemplar must be represented as heterogeneity eliminates only cal, and not ron. This a fixed-length vector comprised of features that may is due to the fact that ron appears in the supracontext or may not eventually be relevant to the phenomenon – – n, and all of the members of that supracontext being modeled. The features themselves may be are masculine, therefore, there is no disagreement. of fixed or variable length, as long as they are ap- Cal F, on the other hand, competes with masculine propriately delimited. An outcome is specified for words in all of the supracontexts in which it appears. each exemplar instance vector. Vector features are The analogical set contains all of the exemplar purely symbolic and nominal, not admitting contin- items that can possibly influence the gender assign- uous values. When the use of continuous-valued ment of pan. There are two methods for calculating features is needed (e.g. integers or real numbers), Supracontexts Words in Pointers # of Subcontextual Supracontext Disagreements p a n none none 0 p a – par M, paz F paz F > paz F 2 par M > par M *paz F > par M *par M > paz F p – n none none 0 – a n plan M plan M > plan M 0 p – – par M, paz F paz F > paz F 2 par M > par M *paz F > par M *par M > paz F – a – plan M, cal F, par M, paz F plan M > plan M 6 *plan M > cal F plan M > par M *plan M > paz F cal F > cal F *cal F > plan M *cal F > par M cal F > paz F *paz F > plan M *paz F > par M (not all shown; all the rest involve agreement) – – n plan M, tren M, ron M (not shown) 0 – – – plan M, tren M, ron M, cal F, (not shown) 12 paz F, par M, rey M Table 3: Supracontextual analysis.
Recommended publications
  • Rules Vs. Analogy in English Past Tenses: a Computational/Experimental Study Adam Albrighta,*, Bruce Hayesb,*
    Cognition 90 (2003) 119–161 www.elsevier.com/locate/COGNIT Rules vs. analogy in English past tenses: a computational/experimental study Adam Albrighta,*, Bruce Hayesb,* aDepartment of Linguistics, University of California, Santa Cruz, Santa Cruz, CA 95064-1077, USA bDepartment of Linguistics, University of California, Los Angeles, Los Angeles, CA 90095-1543, USA Received 7 December 2001; revised 11 November 2002; accepted 30 June 2003 Abstract Are morphological patterns learned in the form of rules? Some models deny this, attributing all morphology to analogical mechanisms. The dual mechanism model (Pinker, S., & Prince, A. (1998). On language and connectionism: analysis of a parallel distributed processing model of language acquisition. Cognition, 28, 73–193) posits that speakers do internalize rules, but that these rules are few and cover only regular processes; the remaining patterns are attributed to analogy. This article advocates a third approach, which uses multiple stochastic rules and no analogy. We propose a model that employs inductive learning to discover multiple rules, and assigns them confidence scores based on their performance in the lexicon. Our model is supported over the two alternatives by new “wug test” data on English past tenses, which show that participant ratings of novel pasts depend on the phonological shape of the stem, both for irregulars and, surprisingly, also for regulars. The latter observation cannot be explained under the dual mechanism approach, which derives all regulars with a single rule. To evaluate the alternative hypothesis that all morphology is analogical, we implemented a purely analogical model, which evaluates novel pasts based solely on their similarity to existing verbs.
    [Show full text]
  • Analogical Modeling: an Exemplar-Based Approach to Language"
    <DOCINFO AUTHOR "" TITLE "Analogical Modeling: An exemplar-based approach to language" SUBJECT "HCP, Volume 10" KEYWORDS "" SIZE HEIGHT "220" WIDTH "150" VOFFSET "4"> Analogical Modeling human cognitive processing is a forum for interdisciplinary research on the nature and organization of the cognitive systems and processes involved in speaking and understanding natural language (including sign language), and their relationship to other domains of human cognition, including general conceptual or knowledge systems and processes (the language and thought issue), and other perceptual or behavioral systems such as vision and non- verbal behavior (e.g. gesture). ‘Cognition’ should be taken broadly, not only including the domain of rationality, but also dimensions such as emotion and the unconscious. The series is open to any type of approach to the above questions (methodologically and theoretically) and to research from any discipline, including (but not restricted to) different branches of psychology, artificial intelligence and computer science, cognitive anthropology, linguistics, philosophy and neuroscience. It takes a special interest in research crossing the boundaries of these disciplines. Editors Marcelo Dascal, Tel Aviv University Raymond W. Gibbs, University of California at Santa Cruz Jan Nuyts, University of Antwerp Editorial address Jan Nuyts, University of Antwerp, Dept. of Linguistics (GER), Universiteitsplein 1, B 2610 Wilrijk, Belgium. E-mail: [email protected] Editorial Advisory Board Melissa Bowerman, Nijmegen; Wallace Chafe, Santa Barbara, CA; Philip R. Cohen, Portland, OR; Antonio Damasio, Iowa City, IA; Morton Ann Gernsbacher, Madison, WI; David McNeill, Chicago, IL; Eric Pederson, Eugene, OR; François Recanati, Paris; Sally Rice, Edmonton, Alberta; Benny Shanon, Jerusalem; Lokendra Shastri, Berkeley, CA; Dan Slobin, Berkeley, CA; Paul Thagard, Waterloo, Ontario Volume 10 Analogical Modeling: An exemplar-based approach to language Edited by Royal Skousen, Deryle Lonsdale and Dilworth B.
    [Show full text]
  • Analogy in Grammar: Form and Acquisition
    Analogy in Grammar: Form and Acquisition Department of Linguistics Max Planck Institute for Evolutionary Anthropology Leipzig, $$ –$% September $##& Workshop programme Friday, September 22 W I ": –": Jim Blevins (Cambridge) & Juliette Blevins ( <=;-:A6 ) A L C T V ": –: Andrew Garrett (Berkeley) Paradigmatic heterogeneity : –: : –: Andrew Wedel (Arizona) Multi-level selection and the tension between phonological and morphological regularity : –: Greg Stump & Rafael Finkel (Kentucky) Principal parts and degrees of paradigmatic transparency : –: Moderated discussion: Juliette Blevins ( <=;-:A6 ) : –: A S G : –: Keith Johnson (Berkeley) Analogy as exemplar resonance: Extension of a view of sensory memory to higher linguistic categories : –: John Goldsmith (Chicago) Learning morphological patterns in language : –: : –: Susanne Gahl (Chicago) The sound of syntax: Probabilities and structure in pronunciation variation : – : Farrell Ackerman ( @8?9 ), Sharon Rose ( @8?9 ) & Rob Malouf ( ?9?@ ) Patterns of relatedness in complex morphological systems : –!: Moderated discussion: Colin Bannard ( <=;-:A6 ) ": – B Saturday, September 23 A L A ": –": LouAnn Gerken (Arizona) Linguistic generalization by human infants ": –: Andrea Krott (Birmingham) Banana shoes and bear tables: Children’s processing and interpretation of noun-noun compounds : –: : –: Mike Tomasello ( - ) Analogy in the acquisition of constructions : –: Rens Bod (St. Andrews) Acquisition of syntax by analogy: Computation of new utterances out
    [Show full text]
  • Review of Dieter Wanner's the Power of Analogy: an Essay on Historical Linguistics
    Portland State University PDXScholar World Languages and Literatures Faculty Publications and Presentations World Languages and Literatures 2007 Review of Dieter Wanner's The Power of Analogy: An Essay on Historical Linguistics Eva Nuń ez-Mẽ ndeź Portland State University, [email protected] Follow this and additional works at: https://pdxscholar.library.pdx.edu/wll_fac Part of the Comparative and Historical Linguistics Commons, and the First and Second Language Acquisition Commons Let us know how access to this document benefits ou.y Citation Details Núñez-Méndez, Eva. "Review of Dieter Wanner's The Power of Analogy: An Essay on Historical Linguistics," Linguistica e Filologia, 25 (2007), p. 266-267. This Book Review is brought to you for free and open access. It has been accepted for inclusion in World Languages and Literatures Faculty Publications and Presentations by an authorized administrator of PDXScholar. Please contact us if we can make this document more accessible: [email protected]. Linguistica e Filologia 25 (2007) WANNER, Dieter, The Power of Analogy: An Essay on Historical Lin- guistics, Mouton de Gruyter, New York 2006, pp. xiv + 330, ISBN 978- 3-11-018873-8. Following the division predicated in the Saussurean dichotomy between synchrony and diachrony, this book starts by arguing that this antinomy between the formal and the historical should be relegated to the periphery. Combining diachronic with synchronic linguistic thought, Wanner proposes two adoptions: first, a restricted theoretical base in the form of Concrete Minimalism, and second analogical assimilations as formulated in Analogical Modeling. These two perspectives provide the basis for redirecting the theory in a cognitive direction and focus on the shape of linguistics material and the impact of the historical components of language.
    [Show full text]
  • Chapter 10 Analogy Behrens Final
    To appear in: Marianne Hundt, Sandra Mollin and Simone Pfenninger (eds.) (2016). The Changing English Language: Psycholinguistic Perspectives. Cambridge: Cambridge University Press. 10 The role of analogy in language acquisition Heike Behrens1 10.1 Introduction AnalogiCal reasoning is a powerful processing meChanism that allows us to disCover similarities, form Categories and extend them to new Categories. SinCe similarities Can be detected at the concrete and the relational levels, analogy is in principle unbounded. The possibility of multiple mapping makes analogy a very powerful meChanism, but leads to the problem that it is hard to prediCt whiCh analogies will aCtually be drawn. This indeterminaCy results in the faCt that analogy is often invoked as an explanandum in many studies in linguistiCs, language acquisition, and the study of language change, but that the underlying proCesses are hardly ever explained. When cheCking the index of current textbooks on cognitive linguistics and language acquisition, the keyword “analogy” is almost Completely absent, and the concept is typiCally evoked ad hoc to explain a Certain phenomenon. In Contrast, in work on language Change, different types of analogical change have been identified and have beCome teChniCal terms to refer to speCifiC phenomena, such as analogical levelling when irregular forms beCome regularized or analogical extension when new items beCome part of a Category (see Bybee 2010: 66-69, for a historiCal review of the use of the term in language Change and grammatiCalization see Traugott & Trousdale 2014: 37-38). In this paper, I will discuss the possible effects of analogical reasoning for linguistic Category formation from an emergentist and usage-based perspeCtive, and then characterize the interaction of different processes of analogical reasoning in language development, with a speCial foCus on regular/irregular morphology and argument structure.
    [Show full text]
  • Paradigm Uniformity and Analogy: the Capitalistic Versus Militaristic Debate1
    International Journal of IJES English Studies UNIVERSITY OF MURCIA www.um.es/ijes Paradigm Uniformity and Analogy: 1 The Capitalistic versus Militaristic Debate * DAVID EDDINGTON Brigham Young University ABSTRACT In American English, /t/ in capitalistic is generally flapped while in militaristic it is not due to the influence of capi[ɾ]al and mili[tʰ]ary. This is called Paradigm Uniformity or PU (Steriade, 2000). Riehl (2003) presents evidence to refute PU which when reanalyzed supports PU. PU is thought to work in tandem with a rule of allophonic distribution, the nature of which is debated. An approach is suggested that eliminates the need for the rule versus PU dichotomy; allophonic distribution is carried out by analogy to stored items in the mental lexicon. Therefore, the influence of the pronunciation of capital on capitalistic is determined in the same way as the pronunciation of /t/ in monomorphemic words such as Mediterranean is. A number of analogical computer simulations provide evidence to support this notion. KEYWORDS: analogy, flapping, tapping, American English, phoneme /t/, Analogical Modeling of Language, allophonic distribution, Paradigm Uniformity. * Address for correspondence: David Eddington. Brigham Young University. Department of Linguistics and English Language. 4064 JFSB. Provo, UT 84602. Phone: (801) 422-7452. Fax: (801) 422-0906. e-mail: [email protected] © Servicio de Publicaciones. Universidad de Murcia. All rights reserved. IJES, vol. 6 (2), 2006, pp. 1-18 2 David Eddington I. INTRODUCTION In traditional approaches to phonology, all surface forms are generated from abstract underlying forms. However, the pre-generative idea that surface forms can influence other surface forms, (e.g.
    [Show full text]
  • The Evolution of Case Grammar
    The evolution of case grammar Remi van Trijp language Computational Models of Language science press Evolution 4 Computational Models of Language Evolution Editors: Luc Steels, Remi van Trijp In this series: 1. Steels, Luc. The Talking Heads Experiment: Origins of words and meanings. 2. Vogt, Paul. How mobile robots can self-organize a vocabulary. 3. Bleys, Joris. Language strategies for the domain of colour. 4. van Trijp, Remi. The evolution of case grammar. 5. Spranger, Michael. The evolution of grounded spatial language. ISSN: 2364-7809 The evolution of case grammar Remi van Trijp language science press Remi van Trijp. 2017. The evolution of case grammar (Computational Models of Language Evolution 4). Berlin: Language Science Press. This title can be downloaded at: http://langsci-press.org/catalog/book/52 © 2017, Remi van Trijp Published under the Creative Commons Attribution 4.0 Licence (CC BY 4.0): http://creativecommons.org/licenses/by/4.0/ ISBN: 978-3-944675-45-9 (Digital) 978-3-944675-84-8 (Hardcover) 978-3-944675-85-5 (Softcover) ISSN: 2364-7809 DOI:10.17169/langsci.b52.182 Cover and concept of design: Ulrike Harbort Typesetting: Sebastian Nordhoff, Felix Kopecky, Remi van Trijp Proofreading: Benjamin Brosig, Marijana Janjic, Felix Kopecky Fonts: Linux Libertine, Arimo, DejaVu Sans Mono Typesetting software:Ǝ X LATEX Language Science Press Habelschwerdter Allee 45 14195 Berlin, Germany langsci-press.org Storage and cataloguing done by FU Berlin Language Science Press has no responsibility for the persistence or accuracy of URLs for external or third-party Internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.
    [Show full text]
  • Spanish Velar-Insertion and Analogy
    Spanish Velar-insertion and Analogy: A Usage-based Diachronic Analysis DISSERTATION Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University By Steven Richard Fondow, B.A, M.A Graduate Program in Spanish and Portuguese The Ohio State University 2010 Dissertation Committee: Dr. Dieter Wanner, Advisor Dr. Brian Joseph Dr. Terrell Morgan Dr. Wayne Redenbarger Copyright by Steven Richard Fondow 2010 Abstract The theory of Analogical and Exemplar Modeling (AEM) suggests renewed discussion of the formalization of analogy and its possible incorporation in linguistic theory. AEM is a usage-based model founded upon Exemplar Modeling (Bybee 2007, Pierrehumbert 2001) that utilizes several principles of the Analogical Modeling of Language (Skousen 1992, 1995, 2002, Wanner 2005, 2006a), including the ‘homogeneous supracontext’ of the Analogical Model (AM), frequency effects and ‘random-selection’, while also highlighting the speaker’s central and ‘immanent’ role in language (Wanner 2006a, 2006b). Within AEM, analogy is considered a cognitive means of organizing linguistic information. The relationship between input and stored exemplars is established according to potentially any and all salient similarities, linguistic or otherwise. At the same time, this conceptualization of analogy may result in language change as a result of such similarities or variables, as they may be used in the formation of an AM for the input. Crucially, the inflectional paradigm is argued to be a possible variable since it is a higher-order unit of linguistic structure within AEM. This investigation analyzes the analogical process of Spanish velar-insertion according to AEM.
    [Show full text]
  • Modeling Analogy As Probabilistic Grammar∗
    Modeling analogy as probabilistic grammar∗ Adam Albright MIT March 2008 1 Introduction Formal implemented models of analogy face two opposing challenges. On the one hand, they must be powerful and flexible enough to handle gradient and probabilistic data. This requires an ability to notice statistical regularities at many different levels of generality, and in many cases, to adjudicate between multiple conflicting patterns by assessing the relative strength of each, and to generalize them to novel items based on their relative strength. At the same time, when we examine evidence from language change, child errors, psycholinguistic experiments, we find that only a small fraction of the logically possible analogical inferences are actually attested. Therefore, an adequate model of analogy must also be restrictive enough to explain why speakers generalize certain statistical properties of the data and not others. Moreover, in the ideal case, restrictions on possible analogies should follow from intrinsic properties of the architecture of the model, and not need to be stipulated post hoc. Current computational models of analogical inference in language are still rather rudimentary, and we are certainly nowhere near possessing a model that captures not only the statistical abilities of speakers, but also their preferences and limitations.1 Nonetheless, the past two decades have seen some key advances. Work in frameworks such as neural networks (Rumelhart and McClelland 1987; MacWhinney and Leinbach 1991; Daugherty and Seidenberg 1994, and much subsequent work) and Analogical Modeling of Language (AML; Skousen 1989) have focused primarily on the first challenge, tackling the gradience of ∗Thanks to Jim Blevins, Bruce Hayes, Donca Steriade, participants of the Analogy in Grammar Workshop (Leipzig, 22–23 September 2006), and especially to two anonymous reviewers, for helpful comments and suggestions; all remaining errors are, of course, my own.
    [Show full text]
  • From Exemplar to Grammar: a Probabilistic Analogy-Based Model of Language Learning
    Cognitive Science 33 (2009) 752–793 Copyright Ó 2009 Cognitive Science Society, Inc. All rights reserved. ISSN: 0364-0213 print / 1551-6709 online DOI: 10.1111/j.1551-6709.2009.01031.x From Exemplar to Grammar: A Probabilistic Analogy-Based Model of Language Learning Rens Bod Institute for Logic, Language and Computation, University of Amsterdam Received 19 November 2007; received in revised form 24 September 2008; accepted 25 September 2008 Abstract While rules and exemplars are usually viewed as opposites, this paper argues that they form end points of the same distribution. By representing both rules and exemplars as (partial) trees, we can take into account the fluid middle ground between the two extremes. This insight is the starting point for a new theory of language learning that is based on the following idea: If a language learner does not know which phrase-structure trees should be assigned to initial sentences, s ⁄ he allows (implicitly) for all possible trees and lets linguistic experience decide which is the ‘‘best’’ tree for each sentence. The best tree is obtained by maximizing ‘‘structural analogy’’ between a sentence and previous sen- tences, which is formalized by the most probable shortest combination of subtrees from all trees of previous sentences. Corpus-based experiments with this model on the Penn Treebank and the Childes database indicate that it can learn both exemplar-based and rule-based aspects of language, ranging from phrasal verbs to auxiliary fronting. By having learned the syntactic structures of sentences, we have also learned the grammar implicit in these structures, which can in turn be used to produce new sentences.
    [Show full text]
  • Learning and Learnability in Phonology
    Learning and Learnability in Phonology The MIT Faculty has made this article openly available. Please share how this access benefits you. Your story matters. Citation Albright, Adam and Hayes, Bruce. 2011. "Learning and Learnability in Phonology." Handbook of Phonological Theory. As Published http://dx.doi.org/10.1002/9781444343069.ch20 Publisher Wiley-Blackwell Version Author's final manuscript Citable link https://hdl.handle.net/1721.1/128534 Terms of Use Creative Commons Attribution-Noncommercial-Share Alike Detailed Terms http://creativecommons.org/licenses/by-nc-sa/4.0/ Learning and learnability in phonology Adam Albright, MIT Bruce Hayes, UCLA To appear in John Goldsmith, Jason Riggle, and Alan Yu, eds. Handbook of Phonological Theory. Blackwell/Wiley. 1. Chapter content A central scientific problem in phonology is how children rapidly and accurately acquire the intricate structures and patterns seen in the phonology of their native language. The solution to this problem lies in part in the discovery of the right formal theory of phonology, but another crucial element is the development of theories of learning, often in the form of machine- implemented models that attempt to mimic human childrens’ ability. This chapter is a survey of work in this area. 2. Defining the problem Before we can develop a theory of how children learn phonological systems, we must first characterize the knowledge that is to be acquired. Traditionally, phonological analyses have focused on describing the set of attested words, developing rules or constraints that distinguish sequences that occur from those that do not. Although such analyses have proven extremely valuable in developing a set of theoretical tools for capturing phonologically relevant distinctions, it is risky to assume that human learners internalize every pattern that can be described by the theory.
    [Show full text]
  • The Analogical Speaker Or Grammar Put in Its Place René-Joseph Lavie
    The Analogical Speaker or grammar put in its place René-Joseph Lavie To cite this version: René-Joseph Lavie. The Analogical Speaker or grammar put in its place. Linguistique. Université de Nanterre - Paris X, 2003. Français. tel-00144458v1 HAL Id: tel-00144458 https://tel.archives-ouvertes.fr/tel-00144458v1 Submitted on 3 May 2007 (v1), last revised 5 May 2014 (v2) HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. UNIVERSITE DE PARIS 10 NANTERRE, FRANCE Département des Sciences du Language Laboratoire: MoDyCo, UMR 7114 Ecole doctorale: Connaissance et Culture René Joseph Lavie The Analogical Speaker or grammar put in its place Doctoral dissertation for the grade of Doctor in Language Sciences defended publicly on november 18th, 2003 Translated by the author with contributions of Josh Parker Original title: Le Locuteur Analogique ou la grammaire mise à sa place Jury: Sylvain Auroux Research Director, CNRS; Ecole Normale Supérieure, Lyon Marcel Cori, Professor, Université de Paris 10 Nanterre Jean-Gabriel Ganascia , Professor, Université Pierre et Marie Curie (Paris 6) Bernard Laks, Professor, Université de Paris 10 Nanterre Bernard Victorri, Research Director, CNRS, Laboratoire LATTICE Director: Bernard Laks 2004 02 01 Content INTRODUCTION .....................................................................................................................................
    [Show full text]