Stop the Morphological Cycle, I Want to Get Off: Modeling the Development of Fusion
Total Page:16
File Type:pdf, Size:1020Kb
Proceedings of the Society for Computation in Linguistics Volume 3 Article 40 2020 Stop the Morphological Cycle, I Want to Get Off: Modeling the Development of Fusion Micha Elsner The Ohio State University, [email protected] Martha B. Johnson The Ohio State University, [email protected] Stephanie Antetomaso The Ohio State University, [email protected] Andrea D. Sims The Ohio State University, [email protected] Follow this and additional works at: https://scholarworks.umass.edu/scil Part of the Computational Linguistics Commons, and the Morphology Commons Recommended Citation Elsner, Micha; Johnson, Martha B.; Antetomaso, Stephanie; and Sims, Andrea D. (2020) "Stop the Morphological Cycle, I Want to Get Off: Modeling the Development of Fusion," Proceedings of the Society for Computation in Linguistics: Vol. 3 , Article 40. DOI: https://doi.org/10.7275/33wn-nb03 Available at: https://scholarworks.umass.edu/scil/vol3/iss1/40 This Paper is brought to you for free and open access by ScholarWorks@UMass Amherst. It has been accepted for inclusion in Proceedings of the Society for Computation in Linguistics by an authorized editor of ScholarWorks@UMass Amherst. For more information, please contact [email protected]. Stop the Morphological Cycle, I Want to Get Off: Modeling the Development of Fusion Micha Elsner, Martha Booker Johnson, Stephanie Antetomaso, Andrea D. Sims Department of Linguistics The Ohio State University elsner.14,johnson.6713,antetomaso.2,[email protected] Abstract Along with this taxonomic distinction comes a historical origin story, sometimes called the mor- Historical linguists observe that many fusional phological cycle (Hock and Joseph, 1996)[183]. (unsegmentable) morphological structures de- Through processes of phonological reduction, in- veloped from agglutinative (segmentable) pre- dependent function words become attached to decessors. Such changes may result when content words as agglutinative inflections. Fur- learners fail to acquire a phonological alterna- tion, and instead, “chunk” the altered versions ther phonological reduction or sound changes blur of morphemes and memorize them as under- the boundaries between morphemes, leading to lying representations. We present a Bayesian fusion. Finally, affixes may become so non- model of this process, which learns which transparent that their association with MSPs is lost morphosyntactic properties are chunked to- (demorphologization) at which point new function gether, what their underlying representations words may be recruited to replace them, beginning are, and what phonological processes apply the cycle anew. to them. In simulations using artificial data, we provide quantitative support to two claims Morphological change is more various and about agglutinative and fusional structures: more complicated than this simple story suggests, that variably-realized morphological markers and this cycle isn’t the only way in which fusion discourage fusion from developing, but that can arise (Grunthal¨ , 2007; Igartua, 2015; Karim, stress-based vowel reduction encourages it. 2019). However, it is one way that has been ob- served. In this paper we focus on the role of 1 Introduction phonological processes in the transition between agglutination and fusion. Morphological reanaly- While modern typologists reject the wholesale cat- sis often results from an interaction between the egorization of languages as isolating, agglutina- phonology of a language and the learning mecha- tive or fusional (Haspelmath, 2009), they still rec- nism. Specifically in this context, morphemes are ognize a distinction between morphological struc- most likely to fuse if the environments in which tures which can be easily segmented and those they occur, and the phonological processes trig- which cannot (Plank, 1999). In ones with mor- gered by those environments, are vulnerable to re- phological fusion (or cumulation), multiple mor- analysis, which is to say, to mis-learning. The 1 phosyntactic properties (MSPs) are realized by question becomes: which kinds of phonological a single morph with no immediately segmentable processes are likely to make morphological con- 2 pieces. For instance, Turkish tarla-lar-ı and structions vulnerable to reanalysis, and which are Old English feld-a both indicate ‘field-PL.ACC’ not? (Plank, 1999), but the Old English suffix cannot In order to test the role that phonological pro- be further analyzed whereas the Turkish word has cesses play in making agglutinative structures vul- separate number and case morphemes. nerable to reanalysis, we provide a formal learn- 3 1We use morphosyntactic category to refer to sets of prop- ing model for morphological systems whose in- erties; cross-linguistically common categories are TENSE, ternal representations clearly distinguish between PERSON, NUMBER, etc., and morphosyntactic properties are agglutination and fusion. The model extends Cot- PRESENT, PAST, etc. 2Following practice in morphology, we use the term 3Code and data at github.com/melsner/ morph to refer to (only) the form part of a morpheme. scil2019-fusion. 412 Proceedings of the Society for Computation in Linguistics (SCiL) 2020, pages 412-422. New Orleans, Louisiana, January 2-5, 2020 terell et al. (2015), learning a Bayesian model ysis is evidenced by the fact that in Aeolic di- which maps from sets of MSPs to surface forms in alects, speakers extended the athematic ending - three steps: selection of a morphological template, mi to verbs that did not historically have it, giving, concatenation of underlying forms, and phonol- e.g., f´ıli-mi ‘love-1SG.PRS’ where filo˜ is expected ogy. We validate the model by testing on a se- etymologically. The fact that -mi was extended as ries of artificial languages. The model recovers a single unit indicates that it had undergone fusion. the expected analyses for prototypically aggluti- The reanalysis of the Greek suffixes was thus native or fusional languages; for languages which driven by sound changes that introduced phono- can be analyzed in either way, we demonstrate logical alternations, and in the process introduced in the first study that those with variably-realized ambiguity regarding the morphological structure. morphological markers (i.e. ones that are some- In the wake of these changes, speakers were faced times present, sometimes absent) are less likely with an analytic choice, e.g.: is there one 1SG mor- to be learned as fusional. In a second study, we pheme -m plus a phonological rule, or different show that languages with stress-based vowel re- 1SG endings -mi and -n that also express tense? duction are more likely to be learned as fusional. The extent to which sound change leads agglu- Our model thus provides quantitative support for tinative structures to be reanalyzed as fusional has previous observations that languages with large recently been questioned (Haspelmath, 2018).5 proportions of agglutinative structures also fre- Nonetheless, this kind of ambiguity between anal- quently have large numbers of variably-realized yses at different levels of representation is often morphs (Plank, 1999) and vowel harmony rather a driver of language change (Bybee, 1999) and than stress-based reduction (Zingler, 2018). phonological reduction of agglutinative structures is widely cited as a source of fusionality (By- 2 Related work bee, 1997; Igartua, 2015, among others). Just as phonological rules and categories can arise when Indo-European Ancient Greek low-level phonetic processes like assimilation are PRS AOR PRS AOR 1SG *-m-i *-m d´ıdo-mi¯ ed´ o-n¯ reanalyzed as phonological, so fusion can ap- 2SG *-s-i *-s d´ıdo-s¯ ed´ o-s¯ pear when the effects of phonological process are 3SG *-t-i *-t d´ıdo-si¯ ed´ o¯ “baked in” to the morphological representations. Table 1: Partial set of Indo-European and Ancient Bybee (2002) summarizes the idea (with reference Greek (‘give’) person-number forms in present indica- mostly to syntax) with the catchphrase: “Items tive and aorist that are used together fuse together.” Both Heath (1998) and Zingler (2018) point We begin with a concrete example of the kind out the implication that agglutinative construc- of morphological change we are describing. In tions must have “barriers”— typological features 6 some Indo-European (IE) athematic verbs, per- which prevent them from becoming fusional. son and number were expressed cumulatively but Zingler makes a specific proposal, that fixed (lexi- tense was realized via a separate morpheme: -i cal) stress systems tend to encourage fusion, while for present active indicative and zero for aorist ac- vowel harmony discourages it. This builds on a tive indicative (Table 1). (These endings are re- typological observation: the kinds of phonolog- constructed for IE but attested in Sanskrit.) How- ical alternations that occur in agglutinative and ever, sound changes between IE and Proto-Greek fusional systems tend to differ, “... with vowel obscured the unity of the person-number morphs harmony tending to imply agglutination” (Plank, 7 across present and aorist. For example, word-final 1999)[310]. Zingler argues that fixed stress leads [m] turned into [n] as a result of sound change, re- 5In fact, Haspelmath states (pp107-8) that “...we do not sulting in different 1SG forms in Ancient Greek.4 know how it is that robust inflectional patterns with cumula- tive and suppletive affixes arise”. Our paper offers a partial These changes led speakers to reanalyze the for- answer. merly separate morphemes as fused (Brian Joseph, 6The argument of Heath