From Latin to Romance: Computational Modeling of Syncretism
Tyler Lau∗, Maria Polinsky†, and Jake Seaton†
∗Department of Linguistics, University of California at Berkeley, USA
†Department of Linguistics, Harvard University, USA

Overview

• What factors in Late Latin led to the heavy reshaping of the nominal system?
• What minimal information does a connectionist model need to predict syncretism in the correct direction?
• Analogy is driven by factors such as frequency, markedness, and morpheme length (Kuryłowicz 1947, Bybee 1985, Albright 2008).
• Changes in Romance have been attributed not only to sound change, but also to contact.
• We aim to use a connectionist simulation of generational learning, providing minimal phonological and semantic information, and see whether the changes that are actually attested in Romance can be reproduced.

Latin Declension System

              I          II         IIIa       IIIb          IV       V
  Root        silva-     anno-      color-     igni-         lacu-    fide-
  Gloss       ‘forest’   ‘year’     ‘color’    ‘fire’        ‘lake’   ‘faith’
  Sg. Nom.    silva      annus      color      ignis         lacus    fidēs
  Sg. Gen.    silvae     annī       colōris    ignis         lacūs    fideī
  Sg. Acc.    silvam     annum      colōrem    ignem         lacum    fidem
  Pl. Nom.    silvae     annī       colōrēs    ignēs         lacūs    fidēs
  Pl. Gen.    silvārum   annōrum    colōrum    ignium        lacūm    fidērum
  Pl. Acc.    silvās     annōs      colōrēs    ignīs/ignēs   lacūs    fidēs

Figure 1: The Latin Declension Classes

Latin, Romance, and Comments

  Declension
    Latin:    I, II, III, IV, V
    Romance:  I, II, III
    Comments: Declensions IV and V merge to III and I early.

  Case
    Latin:    nom, acc, gen, dat, abl, voc
    Romance:  nom, acc, gen, voc
    Comments: In most of Romance, everything merges to the accusative. The vocative falls out of use as well. Nominative survives in early Spanish and French. It is also possible that Eastern Romance plurals come from nominative plurals. Genitive and dative survive only for feminine nouns in Romanian; vocative survives only for masculine.

  Gender (diagram categories: m, f; m.sg, n.sg, f.sg, n.pl, f.pl)
    Comments: In most of Romance, neuters mostly enter the masculine singular class due to similar phonology. Due to similar phonology with the feminine plural and the fact that neuter plurals can be interpreted as collectives, some neuters become feminine. Romanian has an ambigeneric neuter gender: neuter singulars take masculine morphology while neuter plurals take feminine morphology.

Structure of the Connectionist Model

Corpus
• 454 Latin Vulgate nouns in 6 forms (3 cases × 2 numbers) = 2724 total forms
• # times each form introduced = log(freq(form)) × relative freq(case & num combination)
• In training, the input of each token (phonology and semantics) and the expected output are given to the model, and the hidden layer adjusts connection weights accordingly.

Example input: /kanis/ ‘dog’
• Phonology: phonemes k, a, ... are entered as feature values, e.g. [-vd] [+C] ... coded as -1 1 ..., and [+vd] [-C] ... coded as 1 -1 ...
• Semantics: [-human], coded as 0 0 0 0 0 0 0 0

Phonology (396 nodes = 6 × 6 × 11)
• Each word maximally 6 syllables
• Each syllable maximally 6 phonemes (CCVVCC)
• Each phoneme coded for 11 features

Human Semantics (8 nodes)
Nouns coded as:
• Male human: first 4 nodes activated
• Female human: final 4 nodes activated
• Non-human: 0 nodes activated

Hidden Layer (30 nodes)
• Learning takes place in the hidden layer (Goldsmith & O’Brien 1995).
• Hidden nodes use information from input and output to adjust connection weights between the input layer and hidden layer and between the hidden layer and output layer.

Output Layer (13 nodes)
• Diagram labels: Gender, Case, Number
• Example: 1 0 0 0 0 1 0 0 1 0 0 1 0 (M, III, nom sg)

Two parameters to be toggled:
1) CASE HIERARCHY
   No: All cases coded as equidistant
       nom = (1,0,0)  acc = (0,1,0)  gen = (0,0,1)
   Yes: acc equidistant between nom and gen
       nom = (1,0,0)  acc = (1,1,0)  gen = (1,1,1)
2) GENITIVE DROP
   No: All three cases available as output
       Possible outputs: nom, acc, gen
   Yes: gen output becomes unavailable
       Possible outputs: nom, acc

Generational Learning
• The output set of features becomes the new expected output for the following generation.
• Iterated for 15 generations.

Results

Figure 2 (panel: NSG to MSG; x-axis: Generation, 1 to 15; y-axis: % of NSG to MSG): With genitive drop, neuter singulars bifurcate: they either merge with masculines or draw masculines to their class (see Figure 6).

Figure 3 (panel: NPL to FSG; x-axis: Generation, 1 to 15; y-axis: % of NPL to FSG): With genitive drop, neuter plurals almost consistently migrate to the feminine singular class due to phonological similarity alone.

Figure 4 (panel: NPL to FPL; x-axis: Generation, 1 to 15; y-axis: % of NPL to FPL): Comparing with Figure 3, it is notable that in those trials where neuter plurals do not migrate to the feminine singular class, they migrate to the feminine plural.

Figure 5 (panel: Forms Remaining at End of Simulation (w/ Hierarchy); x-axis: Form; y-axis: Proportion of Trials): With case hierarchy in play, the accusative is very robust and only the genitive plural survives in some trials. Without it, the genitive (both singular and plural) survives to a greater extent.

Figure 6 (panel: Distribution of Genders by End of Simulation; legend: Gender n, m, f; x-axis: Trial; y-axis: Percentage): In most of the trials, the neuter falls out. In the cases where it is more robust, masculine nouns migrate to the neuter class and the total proportion of masculine+neuter nouns remains approximately equal.

Figure 7 (panel: Distribution of Cases by End of Simulation; legend: Case acc, nom, gen; x-axis: Trial; y-axis: Percentage): With case hierarchy taken into consideration, the accusative becomes the dominant case in most trials and the only case in almost half of them. The genitive survives in hardly more than 10% of trials (without case hierarchy, the genitive remains in over 50% of trials).

Discussion

• With phonology, frequency, & human semantics:
    • Declensions IV & V fall out in every simulation
• With case hierarchy added, final forms converge more ...
    • Genitive singular drops out completely
    • Genitive plural hardly survives (the only example in history is the oblique 3PL pronoun: Fr. leur, It. loro)
    • Forms remaining in ≥90% of simulations:
        • -am > -a: F.SG ending in all Romance (> -e in Fr.)
        • -um > -u: M.SG ending in all of Romance (> -o in Sp., It. etc.)
        • -em > -e: SG ending for M/F nouns in all of Romance
    • Forms remaining in 25-90% of simulations:
        • -∅: SG ending for M/F nouns in all of Romance
        • -ēs: PL ending in western Romance, maybe > -i in eastern
        • -ōs: M.PL ending in western Romance, maybe > -i in eastern
        • -ās: F.PL ending in western Romance, maybe > -e in eastern
        • M/F.NOM.SG -us & -as: in E. Romance, final -s falls out; in W. Romance, nom. persists in older Sp. & Fr.
    • Accusative is the most robust form; historically, it acts as the base form in most of modern Romance. Plurals in some languages may be from the nominative (D’hulst 2006).
• With genitive dropped, two notable outcomes for neuter:
    1. N.SG > M.SG | N.PL > F.SG (most of Romance)
    2. M.SG > N.SG | N.PL > F.PL (Romanian system)
• Taking into account these minimal factors, the simulation offers a rather accurate history of syncretism and of the trends that occurred on the way to modern Romance.

References

Albright, A. 2008. Explaining Universal Tendencies and Language Particulars in Analogical Change. In Linguistic Universals and Language Change, 144-181.
Bybee, J. 1985. Morphology: A Study of the Relation between Meaning and Form.
D’hulst, Y. 2006. Romance Plurals. Lingua 116, 1303-1329.
Goldsmith, J. & O’Brien, J. 1995. Grammar within a Neural Net. In The Reality of Linguistic Rules, 95-113.
Herman, J. 1967. Le Latin Vulgaire.
Kuryłowicz, J. 1947. The Nature of the so-called Analogical Processes. Diachronica 12 (1), 113-145.

Acknowledgements

Many thanks to Kevin Ryan, James Kirby, Andrew Garrett, Terry Regier, Mairi McLaughlin, and Yang Xu for comments and guidance, to Ezra Van Everbroeck for providing the code for the simulation in Polinsky and Van Everbroeck (2003), and to Edwin Ko for consultation on data visualization.

Contact Information

• Tyler Lau: [email protected]
• Maria Polinsky: [email protected]
• Jake Seaton: [email protected]
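Illustrative sketch of the model coding

The coding schemes described under Structure of the Connectionist Model can be illustrated with a short Python sketch. This is not the simulation code used for the poster (that code is credited in the Acknowledgements); it is a minimal illustration under stated assumptions. All function and variable names are hypothetical, "activated" is assumed to mean a node value of 1, the logarithm is assumed to be the natural log (the poster does not give the base), and `learn` is a placeholder for one generation of network training.

```python
import math

# Input layer sizes as stated on the poster.
N_SYLLABLES, N_PHONEMES, N_FEATURES = 6, 6, 11
PHONOLOGY_NODES = N_SYLLABLES * N_PHONEMES * N_FEATURES  # 396 nodes
SEMANTICS_NODES = 8


def human_semantics(category):
    """8-node human-semantics vector (assumption: 'activated' means 1)."""
    if category == "male":
        return [1] * 4 + [0] * 4   # first 4 nodes activated
    if category == "female":
        return [0] * 4 + [1] * 4   # final 4 nodes activated
    if category == "nonhuman":
        return [0] * SEMANTICS_NODES  # no nodes activated
    raise ValueError(f"unknown category: {category!r}")


# Case codings for the CASE HIERARCHY parameter, copied from the poster.
CASE_CODES = {
    False: {"nom": (1, 0, 0), "acc": (0, 1, 0), "gen": (0, 0, 1)},
    True: {"nom": (1, 0, 0), "acc": (1, 1, 0), "gen": (1, 1, 1)},
}


def case_code(case, hierarchy, genitive_drop=False):
    """Return the 3-node case code; gen is unavailable under GENITIVE DROP."""
    if genitive_drop and case == "gen":
        raise ValueError("genitive output is unavailable with genitive drop")
    return CASE_CODES[hierarchy][case]


def hamming(a, b):
    """Number of positions where two codes differ (one simple distance)."""
    return sum(x != y for x, y in zip(a, b))


def presentations(form_freq, combo_rel_freq):
    """log(freq(form)) x relative freq(case & number combination).

    Assumption: natural log; the poster does not specify the base.
    """
    return math.log(form_freq) * combo_rel_freq


def iterate_generations(targets, learn, n_generations=15):
    """Feed each generation's output to the next as its expected output.

    `learn` is a hypothetical callable standing in for training a network
    on `targets` and reading off the outputs it produces.
    """
    history = [targets]
    for _ in range(n_generations):
        targets = learn(targets)
        history.append(targets)
    return history


if __name__ == "__main__":
    pairs = [("nom", "acc"), ("acc", "gen"), ("nom", "gen")]
    for hierarchy in (False, True):
        dists = {p: hamming(case_code(p[0], hierarchy),
                            case_code(p[1], hierarchy)) for p in pairs}
        print("hierarchy" if hierarchy else "no hierarchy", dists)
```

Measured this way, the three cases are pairwise equally distant without the hierarchy (every pair differs in two positions), while with the hierarchy the accusative is one step from both the nominative and the genitive, which are two steps apart. Hamming distance is only one possible way to read "equidistant"; the poster does not name a metric.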