Estimating Grammeme Redundancy by Measuring Their Importance for Syntactic Parser Performance

Aleksandrs Berdicevskis

UiT The Arctic University of Norway, Department of Language and Linguistics
[email protected]

Abstract

Redundancy is an important psycholinguistic concept which is often used for explanations of language change, but is notoriously difficult to operationalize and measure. Assuming that the reconstruction of a syntactic structure by a parser can be used as a rough model of the understanding of a sentence by a human hearer, I propose a method for estimating redundancy. The key idea is to compare the performance of a parser on a given treebank before and after artificially removing all information about a certain grammeme from the morphological annotation. The change in performance can be used as an estimate of the redundancy of the grammeme. I perform an experiment, applying MaltParser to an Old Church Slavonic treebank to estimate grammeme redundancy in Proto-Slavic. The results show that those Old Church Slavonic grammemes within the case, number and tense categories that were estimated as most redundant are those that disappeared in modern Russian. Moreover, redundancy estimates serve as a good predictor of case grammeme frequencies in modern Russian. The small sizes of the samples do not allow definitive conclusions for number and tense.

1 Introduction

Explanations of historical language change often involve the concept of redundancy, especially grammatical (morphological) redundancy.

One important example is a family of recent theories about linguistic complexity (Sampson et al., 2009), including those known under the labels “sociolinguistic typology” (Trudgill, 2011) and “Linguistic Niche Hypothesis” (Lupyan and Dale, 2010). The key idea behind these theories is that certain sociocultural factors, such as a large population size or a large share of adult learners in the population, can facilitate morphological simplification, i.e. increase the likelihood that the language will lose some morphological features, which are often described as “complex” and “redundant”.

It is, however, often difficult to determine (and provide empirical evidence in favour of the chosen decision) whether a certain feature is indeed redundant, or to what extent it is redundant and to what extent it is functional. Some conclusions can be drawn from indirect evidence, e.g. typological (cf. Dahl’s (2004) notion of cross-linguistically dispensable phenomena). For modern languages, redundancy can be studied and measured by means of psycholinguistic experiments (e.g. Caballero and Kapatsinski, 2014), but this approach is not applicable to older language stages and extinct languages.

I propose a computational method to estimate the functionality (and, conversely, redundancy) of a grammeme (that is, a value of a grammatical/morphological category) that can potentially work for any language for which written sources are available or can be collected.

I describe the philosophy behind the proposed method and its relevance to cognitive aspects of language evolution in section 2. Section 3 provides the necessary background for a particular instance of language change that will be used as a case study. Section 4 describes how the experiment was performed, and section 5 provides the results. Section 6 discusses possible interpretations of the results, and section 7 concludes.

2 Using parsers to measure morphological redundancy

In the most general terms, morphological redundancy can be described as follows: if a message contains certain morphological markers that are not necessary to understand the message fully and correctly, then these markers can be considered (at least to some extent) redundant.

The problem with operationalizing this intuition is that it is unclear how to model understanding (that is, the reconstruction of the semantic structure) of a message by human beings.

In the method I propose, syntactic structure is taken as a proxy for semantic structure, and the reconstruction of syntactic structure by an automatic parser is taken as a model of how a human hearer understands the meaning.

The assumption that these processes have enough in common to make the model adequate is bold, but not unwarranted. It is generally agreed that a correct interpretation of syntactic structure is necessary to understand the meaning of a message, and that humans use morphological cues to reconstruct syntactic structure. Parsers, obviously, do the latter, too. Crucially, the model does not require the assumption that parsers necessarily process the information in exactly the same way as humans. It is enough that they, using the same input, can approximate the output (i.e. syntactic structures) well enough, and modern parsers usually can. Furthermore, parsers also rely heavily on morphological information, not unlike humans.

The key idea is then to take a morphologically tagged treebank of the language in question and parse it with an efficient parser, artificially removing morphological features (either grammemes or categories) one by one. Changes in the parser’s performance caused by the removal of a feature can serve as a measure of its redundancy. In other words, if the removal of a feature causes a significant decrease in parsing accuracy, the feature can be considered important for extracting syntactic information and thus functional. If, however, the decrease is small (or absent), the feature can be considered redundant.

Obviously, it is not guaranteed that this approach will provide an exact and comprehensive measure of morphological redundancy; there are numerous potential sources of noise and errors. We can, however, expect that at least some real redundancy will be captured. The method can then be applied to make rough estimates and thus be useful, for instance, in large-scale typological studies, in language change studies, or in any studies aiming at understanding why languages need (or do not need) redundancy. Understanding that, in turn, will help to reveal the cognitive biases that influence language learning.

It has been shown by means of computational modelling and laboratory experiments that strong biases which influence the course of language change can stem from weak individual cognitive biases, amplified by iterated learning over generations (Kirby et al., 2007; Reali and Griffiths, 2009; Smith and Wonnacott, 2010) and communication within populations (Fay and Ellison, 2013). Thus, if it is shown that there is a diachronic bias towards eliminating redundant grammemes, it will be possible to hypothesize that this bias stems from individual speakers' preference to avoid overloading their speech with excessive complexity.

Importantly for diachronic studies, the method can be applied to extinct languages, provided that large enough treebanks exist.

In the following sections, I will exemplify the method by applying it to a particular case of language change (Proto-Slavic → Contemporary Standard Russian). I also use the case study to test whether the resulting redundancy estimates are plausible. Following a common assumption that more redundant grammemes are in general more likely to be lost (Kiparsky 1982: 88–99, see also references above), and that Russian has been under considerable pressure to shed excessive complexity (see section 3), I make the prediction that the grammemes that did disappear were on average more redundant than those that were kept, and that the “remove-and-reparse” method should be able to capture the difference.

In order to be explicit about the assumptions behind the current study and its limitations, I want to highlight that the study attempts to test two independent hypotheses at once: first, that redundant grammemes are more likely to disappear or become less frequent; second, that parsing is an adequate model of human language perception, since what is redundant for a parser is redundant for a human as well. This can be problematic, since we do not really know whether either of these hypotheses is true.

Let us look at the experiment from the following perspective: if it turns out that there is a strong correlation between the importance of a grammeme for parser performance and grammeme survivability, then this fact has to be explained. A plausible explanation which fits well with existing linguistic theories would be the one outlined above in the form of the two hypotheses: under certain sociocultural conditions speakers tend to abandon redundant grammemes; grammemes that are not important for the parser are redundant. If there is no correlation, however, this absence would not tell us whether both hypotheses are false or only one of them (and which one) is.

In addition to the main prediction, I make a secondary one: assuming that more redundant grammemes will tend to become less frequent, and more functional grammemes will tend to become more frequent, we can expect that the functionality of grammemes in Proto-Slavic should serve as a good predictor of their frequency in modern Russian. I will test this prediction as well, though the possibilities for this test offered by the current study are limited. In addition, the prediction itself relies on stronger assumptions (redundancy is not necessarily the only, nor even the most important, predictor of frequency).
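As a minimal sketch of this procedure (not the code used in the study), the loop below computes such a functionality score from the per-merger LAS deltas; the helper names merge_grammeme, train_and_parse and las are hypothetical placeholders for the concrete steps that section 4 describes.

```python
# Schematic sketch of the remove-and-reparse loop described in this section
# (helper names are hypothetical; section 4 spells out the concrete steps).

def functionality_scores(grammemes, treebank, baseline_las,
                         merge_grammeme, train_and_parse, las):
    """Functionality of a grammeme = sum of LAS drops over all its mergers."""
    scores = {}
    for source in grammemes:                      # e.g. ("s", "p", "d") for number
        delta_sum = 0.0
        for target in grammemes:
            if target == source:
                continue
            merged = merge_grammeme(treebank, source, target)
            parsed = train_and_parse(merged)      # same split, same parser settings
            delta_sum += baseline_las - las(parsed)
        scores[source] = delta_sum                # higher = more functional, less redundant
    return scores
```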

3 From Proto-Slavic to Russian

In this section, I briefly describe the relevant morphological changes that occurred in the period from Proto-Slavic (alias Common Slavic, a reconstructed protolanguage that existed approx. from the 5th to the 9th centuries AD) to Contemporary Standard Russian (CSR). Old Church Slavonic is used as a proxy for Proto-Slavic (see section 4.1).

CSR has been chosen for the pilot study for the following reasons. First, Russian is the largest Slavic language, with a total of about 166 million speakers (Lewis et al., 2015). Second, its contact with other languages has been quite intense. Bentz and Winter (2013) use 42% as an estimate for the ratio of L2 speakers to the number of all speakers of CSR (their absolute estimate is 110 million). According to the linguistic complexity theories cited in section 1, these factors make the pressure towards simplification stronger, i.e. redundant morphological features are more likely to be lost.

Russian has not lost any Proto-Slavic morphological category completely, though many have been very significantly restructured. Some grammemes, however, did disappear.

Proto-Slavic had seven nominal cases: nominative, accusative, genitive, dative, instrumental, locative and vocative. Russian has preserved the former six, but lost the vocative and is now using the nominative in its place. It should be noted that some scholars do not consider the vocative a real case (Andersen, 2012: 139–143). In addition, the vocative was relatively infrequent, and often coincided with the nominative already in Proto-Slavic. Still, there is a clear distinction between Proto-Slavic (where a separate obligatory vocative form existed) and CSR (where there is no such form). The fact that CSR developed several novel marginal cases, including the so-called “new vocative”, does not affect the general picture in any relevant way.

Proto-Slavic had three numbers: singular, dual and plural, of which the dual is not present in CSR: the plural is used instead (the dual, however, left visible traces in the morphosyntax of the numerals and in the formation of plural forms).

Proto-Slavic had five basic verbal tenses: present (also called non-past), aorist, imperfect, perfect and pluperfect.[1] The perfect and pluperfect were analytical forms, consisting of resp. present and imperfect[2] forms of an auxiliary ('be') and a so-called resultative participle. Later, the aorist, imperfect and pluperfect went out of use, while the former perfect gradually lost the auxiliary verb. As a result, in CSR the only means to express the indicative past tense is the former resultative, which has lost most of its participial features and is treated on a par with other finite forms. In the current study, I will consider four morphologically distinct tenses: present, aorist, imperfect and resultative. The label “resultative” will cover all uses of the resultative participle, both in the perfect and pluperfect, both with and without an auxiliary. Non-indicative verbal forms (except for the resultative) will be ignored (i.e. the present and past tense of participles, imperatives, infinitives and the subjunctive). To sum up: we will focus on the four tenses listed above, of which two (aorist and imperfect) disappeared, replaced by the resultative.

Finally, a Proto-Slavic verbal grammeme called the supine also disappeared, but it will be ignored in the current study, partly since its frequency in Old Church Slavonic is very low, partly since it is not entirely clear what category it belongs to.

[1] The verb 'be' also has a separate synthetic future tense, which is ignored here.
[2] Sometimes also aorist or perfect.

4 Materials and methods

4.1 Language data

The oldest Slavic manuscripts were written in Old Church Slavonic (OCS), a literary language based on a South Slavic dialect of late Proto-Slavic. OCS is not a direct precursor of CSR (nor of any other modern Slavic language), but it is the best available proxy for Proto-Slavic, and is commonly used in this role.

4.2 Treebank and parser

I extracted OCS data from the Tromsø Old Russian and OCS Treebank (TOROT, https://nestor.uit.no/), limiting myself to one document, the Codex Marianus, which has been thoroughly proofread and submitted to comprehensive consistency checks (Berdicevskis and Eckhoff, 2015).

The Codex Marianus is dated to the beginning of the 11th century. The TOROT file contains 6350 annotated sentences.

The TOROT is a dependency treebank with morphological and syntactic annotation in the PROIEL scheme (Haug, 2010; Haug et al., 2009). For the purposes of the experiment, I converted the native PROIEL format to the CONLL format (see Table 1).

For the parsing experiments I used MaltParser (Nivre et al., 2007), version 1.8.1 (http://www.maltparser.org/). The Codex Marianus was split into a training set (the first 80% of sentences) and a test set (the last 20% of sentences). The parser was optimized on the training set using MaltOptimizer (Ballesteros and Nivre, 2012), version 1.0.3 (http://nil.fdi.ucm.es/maltoptimizer/index.html). Optimization had been performed before any grammemes were merged or any morphological information was deleted (see section 4.3).

Parsing the TOROT with MaltParser faces several difficulties. First, the PROIEL scheme uses secondary dependencies – for external subjects in control and raising structures, and also to indicate shared arguments and identity. Since MaltParser cannot handle secondary dependencies, all this information was omitted. Second, the PROIEL scheme also systematically uses empty verb and conjunction nodes to account for ellipsis, gapping and asyndetic coordination. Since MaltParser cannot insert empty nodes, they were explicitly marked in both the training and test sets (with form and lemma having the value empty; part-of-speech marked as resp. verb or conjunction; and morphological features having the value INFLn 'non-inflecting', see Table 1, token 14).

The LAS (labelled attachment score) for parsing the test set was 0.783. Parsing took place before merging grammemes, but after removing person and gender information from verbs (see section 4.3).
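A minimal sketch of this setup is given below (it is not the original experiment code): an 80/20 split by sentence and a LAS computation over the head and dependency-relation columns of the CoNLL data shown in Table 1. The MaltParser command lines in the closing comment follow its standard learn/parse usage and are given only as an assumed invocation.

```python
# Minimal sketch: 80/20 split by sentence and LAS evaluation for
# CoNLL-formatted data laid out as in Table 1 (head = column 7, relation = column 8).

def read_sentences(path):
    """Return a list of sentences, each a list of tab-separated token lines."""
    sentences, current = [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if line:
                current.append(line)
            elif current:
                sentences.append(current)
                current = []
    if current:
        sentences.append(current)
    return sentences

def split_80_20(sentences):
    cut = int(len(sentences) * 0.8)
    return sentences[:cut], sentences[cut:]   # first 80% train, last 20% test

def las(gold_sentences, parsed_sentences):
    """Labelled attachment score: share of tokens with correct head AND relation."""
    correct = total = 0
    for gold, parsed in zip(gold_sentences, parsed_sentences):
        for g_line, p_line in zip(gold, parsed):
            g_cols, p_cols = g_line.split("\t"), p_line.split("\t")
            total += 1
            if g_cols[6] == p_cols[6] and g_cols[7] == p_cols[7]:
                correct += 1
    return correct / total

# Training and parsing themselves are delegated to MaltParser, roughly:
#   java -jar maltparser-1.8.1.jar -c ocs_model -i train.conll -m learn
#   java -jar maltparser-1.8.1.jar -c ocs_model -i test.conll -o parsed.conll -m parse
```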

4.3 Merging grammemes

When linguists say that a grammeme disappeared, they usually mean that the grammeme merged with another one, or that another grammeme expanded its functions, replacing the one that disappeared. As described in section 3, the disappearances that occurred in the (pre)history of Russian were actually mergers: vocative > nominative; dual > plural; aorist and imperfect > resultative.

I will illustrate how I model grammeme mergers using the example of the number category. The category has three values: singular, plural and dual; their absolute frequencies in the Codex Marianus are resp. 28004, 10321 and 942. Every grammeme is consecutively merged with the other grammemes in the same category. When, for instance, the s>p merger takes place, the string NUMBs in the FEATURE column (see Table 1) is replaced with NUMBp (see below about the number of occurrences that are replaced). After that, the original values are restored, and the s>d merger follows: NUMBs is replaced with NUMBd. Later, the p>s, p>d, d>s and d>p mergers take place in the same way.

After every merger, the Codex Marianus is split into the same training and test sets, and parsed anew, using the same optimization settings. The difference between the original LAS and the resulting LAS (the delta) shows how strongly the merger affected parser performance. For every grammeme, the sum of the deltas for all its mergers (for s, that would be the sum of the deltas for the mergers s>p and s>d) is taken as a measure of its functionality, or non-redundancy. The higher this number is, the more important the grammeme is for the parser, and the less redundant it is.

The frequency of grammemes can vary greatly, as the number category illustrates. It can be expected that if we always merge all the occurrences of every grammeme, then the deltas will tend to be higher for more frequent grammemes, because a larger number of occurrences is affected. On the one hand, frequency is an important objective property of any linguistic item, and it is legitimate to take it into account when estimating redundancy and functionality. On the other hand, very high frequencies can skew the results, making the functionality estimate a mere correlate of frequency, which is undesirable. In order to test whether redundancy/functionality is a useful measure, we need to disentangle it from potential confounding factors. To address this issue, the experiment was run in two conditions, which differ only in how many occurrences of the source grammeme are replaced (a sketch of the merging step is given below).

In condition 1, all occurrences of every grammeme are merged (that is, the s>d merger results in 28946 NUMBd strings and 0 NUMBs strings, while the d>s merger results in 28946 NUMBs strings and 0 NUMBd strings). It is reasonable to expect that this condition will have a bias for more frequent grammemes: they will get higher functionality scores.
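A minimal sketch of the merging step, assuming feature strings formatted as in Table 1 (e.g. NUMBs, NUMBd); the function name merge_grammeme and the n_replace parameter are illustrative choices rather than the original code, with n_replace=None corresponding to condition 1 and a fixed sample size to condition 2 (introduced after Table 1).

```python
import random

# Illustrative sketch of a merger such as s>d on the FEATURE column
# (column 6 in Table 1): replace the source string (e.g. "NUMBs") with the
# target string (e.g. "NUMBd"), either in all occurrences (condition 1)
# or in a fixed-size random sample of occurrences (condition 2).

def merge_grammeme(conll_lines, source, target, n_replace=None, seed=0):
    hits = [i for i, line in enumerate(conll_lines)
            if line and source in line.split("\t")[5]]
    if n_replace is not None:                       # condition 2
        rng = random.Random(seed)
        hits = rng.sample(hits, min(n_replace, len(hits)))
    merged = list(conll_lines)
    for i in hits:                                  # condition 1 replaces every hit
        cols = merged[i].split("\t")
        cols[5] = cols[5].replace(source, target)
        merged[i] = "\t".join(cols)
    return merged

# Condition 1, s>d merger: every NUMBs becomes NUMBd.
#   merged = merge_grammeme(lines, "NUMBs", "NUMBd")
# Condition 2, s>d merger: only 942 randomly chosen NUMBs tokens are changed,
# and the run is repeated on 10 random samples (seed = 0..9), averaging the deltas.
```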

1   i          i          C  C-  INFLn                              10  aux   and
2   aŝe        aŝe        G  G-  INFLn                              10  adv   if
3   kʺto       kʺto       P  Px  NUMBs|GENDq|CASEn                   4  sub   anyone
4   poimetʺ    pojati     V  V-  NUMBs|TENSp|MOODi|VOICa             2  pred  forces
5   tja        tja        P  Pp  PERS2|NUMBs|GENDq|CASEa             4  obj   you
6   po         po         R  R-  INFLn                               4  adv   by
7   silě       sila       N  Nb  NUMBs|GENDf|CASEd                   6  obl   force
8   popʹriŝe   popʹriŝe   N  Nb  NUMBs|GENDn|CASEa                  14  adv   mile
9   edino      edino      M  Ma  NUMBs|GENDn|CASEa                   8  atr   one
10  idi        iti        V  V-  PERS2|NUMBs|TENSp|MOODm|VOICa       0  pred  go
11  sʺ         sʺ         R  R-  INFLn                              10  obl   with
12  nimʹ       i          P  Pp  PERS3|NUMBs|GENDm|CASEi            11  obl   him
13  dʹvě       dʺva       M  Ma  NUMBd|GENDn|CASEa                  10  adv   two
14  empty      empty      V  V-  INFLn                               4  xobj  (go)

Table 1. Example sentence (Matthew 5:41, 'If anyone forces you to go one mile, go with them two miles') from the Codex Marianus in the PROIEL scheme and CONLL format. OCS words are transliterated using the ISO 9 system (with some simplifications). Columns: 1 = token ID; 2 = form; 3 = lemma; 4 = coarse-grained POS tag; 5 = fine-grained POS tag; 6 = features; 7 = head; 8 = dependency relation. For the reader's convenience, an English gloss is given for every form (the last column). Note the absence of the PERS3 feature for token 4. While it had originally been there, it was removed in order to facilitate the mergers of indicative and participial forms (see main text). It is, however, kept for those verb forms which will not be affected by any mergers (e.g. token 10, which is in the imperative).

In condition 2, the number of merged occurrences is constant for all grammemes in the category, and equal to the frequency of the least frequent grammeme. For number, that would be the dual with its frequency of 942. Here, the s>d merger results in 1884 NUMBd strings (942 original + 942 merged) and 27062 NUMBs strings (28004 original - 942 merged), while the d>s merger results in 28946 NUMBs strings (28004 original + 942 merged) and 0 NUMBd strings (942 original - 942 merged). This condition can potentially create a bias for less frequent grammemes: while the absolute number of the affected occurrences is always the same, their share in the total occurrences of the grammeme that is being merged can be very different. The d>s merger, for instance, empties the dual grammeme fully, while the s>d merger removes only a small share of the singular occurrences. This potential bias can, however, be expected to be weaker than the reverse bias in condition 1, and the results can then be expected to be more reliable.

The occurrences to be merged are selected randomly. Since the resulting change in parser performance may depend on the sample of selected occurrences, the process is repeated 10 times on 10 random samples, and the average of the 10 functionalities is taken as the final measure.

Note that in both conditions, mergers always affect two grammemes: the source (i.e. the one that is being merged) and the target one. However, I consider only the former effect and ignore the latter: for instance, the change of LAS after the s>d merger is added to the functionality of s, but not of d. Technically, it is possible to take into account the respective delta when calculating the functionality of d, too, but it is not quite clear whether this is theoretically justified. The rationale behind adding the delta to the functionality of s is that s has been (partially) removed, and we are investigating how this removal affected the possibility to restore the syntactic information. No instances of the target value, however, have been removed, and while the target grammeme has been somewhat changed by its expansion, it is not clear how to interpret this change. Besides, I assume that the influence of the expansion of the target grammeme is small (compared to that of the removal of the source one) and ignore it in the current study.

Case is processed in exactly the same way as number (each case is consecutively merged with the six others), but tense presents an additional substantial problem. Remember that the present, imperfect and aorist are typical finite forms, which means that they have the features person, number, tense, mood (the value is always indicative) and voice, while the resultative is a participle (the mood[6] value is always participle), and does not have the feature person, but does have the features gender, case and strength.[7] By the OCS period, however, the resultative had already lost most of its original participial properties, and case is always nominative, while strength is always strong. The problem is that when we merge, for instance, the present with the resultative, we have a feature mismatch: the present has one extra feature (person) that the resultative never has, but lacks the three other features (gender, case, strength); in addition, the mood feature is different. Obviously, the merger in the other direction faces the inverse obstacle.

I solve this problem in the following way. Since there is no means to reconstruct information about person when merging the resultative into the three indicative tenses, and no means to reconstruct information about gender when merging in the other direction, I remove the person and gender features from all relevant verbal forms. This is done prior to any other operations. The initial LAS (0.783) is calculated after this removal. Without it, the LAS would have been 0.785. When a resultative > {present | aorist | imperfect} merger occurs, information about case and strength is removed, and mood is changed from p to i. When a merger in the other direction occurs, information about case and strength is added (resp. n and s), and mood is changed from i to p. While these changes are pretty artificial, they do ensure that we perform a full merger that affects all relevant properties of a grammeme, and not only changes its label.

5 Results

Results of the experiment for both conditions are presented in Table 2. Grammemes within each category are first sorted in descending order by their functionality in condition 2 (which is supposed to be the more reliable measure), then by their functionality in condition 1.

Zero values for the vocative in columns 3 and 4 do not mean that merging the vocative with other cases never affects parser performance at all, but that the changes are negligibly small, represented as 0 after rounding to three decimal places. Negative functionality values (for number grammemes) mean that merging this grammeme with others on average leads to an increase of the LAS, not a decrease. These results can be interpreted in the same way as positive and zero values: lower functionality (which in this case means a larger increase in parsing accuracy) implies higher redundancy (so high that its removal facilitates the restoration of the syntactic structure instead of inhibiting it).

Absolute frequencies of every grammeme are provided for OCS (the Codex Marianus) and CSR. The CSR frequencies were calculated using the manually disambiguated part (≈6 million words) of the Russian National Corpus (RNC, http://ruscorpora.ru/). While it is known that ranking the CSR grammemes by frequency may sometimes provide different results depending on the chosen corpus (Kopotev 2008), the general picture can be assumed to be adequate and stable, since the RNC is a relatively large and well-balanced corpus.

[6] The mood category in the PROIEL scheme for OCS has broader coverage than the traditional mood category. It has the grammemes indicative, imperative, subjunctive, infinitive, participle, gerund and supine (i.e. it covers both mood and finiteness).
[7] Strength here refers to the distinction between long and short forms of Slavic adjectives and participles, remotely similar to the Germanic distinction between weak and strong adjectives.

6 Discussion

As can be seen, in both conditions the vocative gets identified as the most redundant case. This fits nicely with the fact that CSR lost it, while preserving the other six cases.

Category   Grammeme   Functionality   Functionality   Frequency   Frequency
                      (condition 1)   (condition 2)   (OCS)       (CSR)
CASE       n           0.039           0.009           9812       1026131
           g           0.017           0.008           4470        731435
           a           0.017           0.006           7657        539768
           d           0.006           0.004           3694        180131
           l           0.008           0.001           1671        265701
           i           0.005           0.001           1050        271531
           v           0               0                400             0
NUMBER     s          -0.004           0               28004      2861455
           p          -0.004          -0.001           10321       886420
           d          -0.002          -0.002             942            0
TENSE      s           0.009           0.009             199       458820
           p           0.009           0.001            4452       231946
           a           0.007           0.001            3772            0
           i           0.003           0.001            1121            0

Table 2. Results of the merging experiment for the two conditions.

Moreover, most modern Indo-European languages have lost the original Proto-Indo-European vocative. Most Slavic languages, however, have retained it. Outliers here are Bulgarian and Macedonian, which have lost all the cases but the vocative. These two Slavic languages, however, are exceptional in many respects (possibly due to the influence of the Balkan Sprachbund).

Importantly, the functionality ranking of cases does not seem to be a mere reflection of their frequency ranking in OCS. In condition 1, the genitive and the accusative[9] have the same functionality (while the accusative is noticeably more frequent), and the dative is less functional than the locative (while being more frequent). In condition 2, the genitive is more functional than the accusative, despite its lower frequency.

As regards the second prediction, functionality scores do turn out to be a good predictor of CSR frequency. The Pearson correlation coefficients[10] are 0.96 (p < 0.001) in condition 1, and 0.92 (p = 0.004) in condition 2. Importantly, in both conditions functionality is a better predictor than plain OCS frequency: the Pearson coefficient for the OCS and CSR frequencies is 0.86 (p = 0.012). Absolute differences between the functionality of cases are larger in condition 1, which can probably be explained by a frequency effect.

For number, the situation is different. In condition 2, the singular gets the highest functionality score and the dual the lowest, which again fits with the historical development of the Slavic languages: all except Slovene and Sorbian have lost the dual form (the same holds for most other Indo-European languages). In condition 1, however, the results are the opposite: the dual is the most functional grammeme, while the singular and the plural are the most redundant ones. Functionality is a poor predictor of CSR frequency in condition 1 (r = -0.73, p = 0.471). It is better correlated (though still insignificant) in condition 2 (r = 0.98, p = 0.14), but loses out to OCS frequency (r = 1, p = 0.026). The extremely small sample size, however, makes the Pearson test unreliable.

Within the tense category, the resultative is at the most functional end of the scale, while the aorist and the imperfect are at the least functional end in both conditions. The absolute values, however, differ, as does the position of the present: in condition 1, it has the same value as the resultative (slightly higher than the aorist), whereas in condition 2, its functionality is equal to that of the aorist and the imperfect. Importantly, the least frequent tense (the resultative) gets the highest functionality score in both conditions. For tense, OCS frequency is the worst predictor of CSR frequency (r = -0.39, p = 0.611). Functionality has larger coefficients and smaller p-values, though they do not reach significance (in condition 1 r = -0.74, p = 0.259; in condition 2 r = -0.87, p = 0.132). Again, the small sample size prevents any definitive conclusions.

[9] Both in OCS and CSR the accusative case of some animate nouns is identical to the genitive. In the TOROT, these genitive-accusatives are annotated as genitives. For consistency's sake, I coded them as genitives when calculating the CSR frequencies as well.
[10] It can be questioned whether it is legitimate to use the Pearson product-moment correlation, or whether a non-parametric method like the Spearman rank correlation should be preferred. Given that the data are on the interval scale and that they meet the Shapiro-Wilk normality criterion, I opt for Pearson.
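As a minimal sketch (not part of the original analysis pipeline), the case correlation reported above can be reproduced from the Table 2 values with scipy.stats.pearsonr:

```python
# Recomputing the case-grammeme correlation from Table 2
# (condition 1 functionality vs. CSR frequency); expected r ≈ 0.96, p < 0.001.
from scipy.stats import pearsonr

functionality_c1 = [0.039, 0.017, 0.017, 0.006, 0.008, 0.005, 0.0]   # n g a d l i v
freq_csr = [1026131, 731435, 539768, 180131, 265701, 271531, 0]

r, p = pearsonr(functionality_c1, freq_csr)
print(f"condition 1 functionality vs. CSR frequency: r = {r:.2f}, p = {p:.4f}")
```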

It is not quite clear why the present scores so low in condition 2: it is frequent enough, it has survived in all Slavic languages, and it can be expected to be quite functional. This may be a consequence of the complicated corrections that were performed to compensate for the morphological mismatch between participial and indicative forms (see section 4.3).

It is remarkable that the two tenses that get the lowest scores in both conditions are those that have disappeared in CSR: the aorist and the imperfect. They have not survived in the other Slavic languages either, with the exception of Bulgarian, Macedonian and partly Bosnian-Serbo-Croatian, where their use is restricted to certain genres and dialects (Dahl 2000: 101). The decline of the imperfect usually happens before the decline of the aorist in Slavic languages (including the East Slavic group, to which CSR belongs), and, remarkably, the imperfect gets the lower functionality score in condition 1.

The difference between the scores of the most and the least functional grammemes is largest for case and lowest for number in both conditions. This fits with the functionality values of the categories themselves, measured in a separate experiment, where the changes of LAS were measured after deleting all information about a particular category (for instance, removing all strings NUMBs, NUMBd and NUMBp from the FEATURE column). Case turned out to be the most functional category (0.030), which is unsurprising, given that cases are typically assumed to mark the syntactic role of an argument in a sentence, and hence can be expected to be crucial for the reconstruction of the syntactic structure. Tense got second place (0.014) and all other categories scored noticeably lower, from 0.004 to 0 (for number the value is 0.003). This difference can account for the contradictory results that the two conditions return for number: given that the total functionality of the category (from the parser's perspective) is relatively small, the proposed method can be less sensitive to real performance changes caused by mergers and more vulnerable to random fluctuations.

7 Conclusion

While the results vary across categories and conditions, the general trend is quite clear: grammemes that did disappear in the course of language history tend to get the lowest functionality scores in the present case study; in other words, the main prediction holds. If we follow the assumption that the most redundant morphological features tend to disappear first, especially under conditions that facilitate morphological simplification (see section 1), then the results confirm the validity of the proposed method.

The secondary prediction holds for case grammemes, where functionality allows us to make better predictions about the frequencies that the grammemes will have after almost a thousand years than plain frequency does. It does not hold for number and tense, but the small sample sizes (i.e. the number of grammemes within a given category) can be the reason.

The fact that the functionality scores for case correlated with the CSR frequencies suggests that the method can predict grammeme development, at least in some cases. It seems to be able to capture the “functional potential” of a grammeme, which can influence its frequency in the future: the lower it is, the more likely the frequency is to decrease. However, given the small differences in correlation coefficients, the small number of datapoints and the problematic situation with number and tense, the support for this hypothesis at the moment is rather weak.

It is not quite clear which of the two conditions gives better predictions. It is possible that the best way to calculate functionality is to combine the results of both conditions in some way. The method should be tested on larger language samples in order to solve this and other potential issues and to find its strengths and limitations. One immediate development of this study would be to take into account all modern Slavic languages to find out how likely a given Proto-Slavic grammeme (or category) was to disappear or to stay. Intermediate language stages (Old Russian, Old Bulgarian etc.) can, of course, also be considered. Given that some amount of noise (for instance, peculiarities of a specific treebank, a specific document or a chosen parser) will always affect the performance of the method, larger language samples can also lead to more stable and more interpretable results.

Looking from another perspective, this study is an attempt to model how human speakers process linguistic information and which features are least informative for them. While the processing itself is not expected to be entirely isomorphic to what happens in a human mind (and the model in general is somewhat of a black box, unless we use a fully deterministic parser), the output gives us some information about human cognition and existing learning and usage biases.

The method can be applied not only to language change or older stages of a language, but also to modern languages, and the results can be tested against existing psycholinguistic or typological evidence about redundancy.

Obviously, it is necessary to test how robust the results are with respect to the choice of the parser, the annotation scheme, the merging procedures and the languages.

The results can have some practical value, too, as they provide information about which features are most and least useful for parsers.

Acknowledgments

I am grateful to Hanne Eckhoff, Laura Janda and three anonymous reviewers for their valuable comments, and to Ilya German for technical assistance. This work has been supported by the Norwegian Research Council grant 222506.

References

Henning Andersen. 2012. The New Russian Vocative: Synchrony, Diachrony, Typology. Scando-Slavica, 58(1):122–167.

Miguel Ballesteros and Joakim Nivre. 2012. MaltOptimizer: A System for MaltParser Optimization. Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC), Istanbul, Turkey, 23–27 May 2012. European Language Resources Association.

Christian Bentz and Bodo Winter. 2013. Languages with More Second Language Learners Tend to Lose Nominal Case. Language Dynamics and Change, 3:1–27.

Aleksandrs Berdicevskis and Hanne Eckhoff. 2015. Automatic identification of shared arguments in verbal coordinations. Computational linguistics and intellectual technologies. Papers from the annual international conference "Dialogue", 14:33–43.

Gabriela Caballero and Vsevolod Kapatsinski. 2014. Perceptual functionality of morphological redundancy in Choguita Rarámuri (Tarahumara). Language, Cognition and Neuroscience, DOI: 10.1080/23273798.2014.940983.

Östen Dahl (ed.). 2000. Tense and Aspect in the Languages of Europe. Mouton de Gruyter, Berlin, Germany.

Östen Dahl. 2004. The growth and maintenance of linguistic complexity. John Benjamins, Amsterdam, The Netherlands.

Nicolas Fay and T. Mark Ellison. 2013. The cultural evolution of human communication systems in different sized populations: usability trumps learnability. PLoS ONE 8(8):e71781.

Dag Haug. 2010. PROIEL guidelines for annotation. http://folk.uio.no/daghaug/syntactic_guidelines.pdf

Dag Haug, Marius Jøhndal, Hanne Eckhoff, Eirik Welo, Mari Hertzenberg and Angelika Müth. 2009. Computational and Linguistic Issues in Designing a Syntactically Annotated Parallel Corpus of Indo-European Languages. Traitement Automatique des Langues 50(2):17–45.

Simon Kirby, Mike Dowman and Thomas L. Griffiths. 2007. Innateness and culture in the evolution of language. Proceedings of the National Academy of Sciences 104(12):5241–5245.

Mikhail Kopotev. 2008. K postroeniju chastotnoj grammatiki russkogo jazyka: padezhnaja sistema po korpusnym dannym. Slavica Helsingiensia 34:136–151.

M. Paul Lewis, Gary F. Simons and Charles D. Fennig (eds.). 2015. Ethnologue: Languages of the World, Eighteenth edition. SIL International, Dallas, Texas. Online version: http://www.ethnologue.com.

Gary Lupyan and Rick Dale. 2010. Language structure is partly determined by social structure. PLoS ONE 5(1):e8559.

Daniel Nettle. 2012. Social scale and structural complexity in human languages. Phil. Trans. R. Soc. B 367:1829–1836.

Joakim Nivre, Johan Hall, Jens Nilsson, Atanas Chanev, Gülşen Eryigit, Sandra Kübler, Svetoslav Marinov and Erwin Marsi. 2007. MaltParser: A language-independent system for data-driven dependency parsing. Natural Language Engineering 13(2):95–135.

Florencia Reali and Thomas L. Griffiths. 2009. The evolution of frequency distributions: Relating regularization to inductive biases through iterated learning. Cognition 111:317–328.

Geoffrey Sampson, David Gil and Peter Trudgill (eds.). 2009. Language complexity as an evolving variable. Oxford University Press, Oxford, UK.

Kenny Smith and Elizabeth Wonnacott. 2010. Eliminating unpredictable variation through iterated learning. Cognition 116:444–449.

Peter Trudgill. 2011. Sociolinguistic typology: social determinants of linguistic complexity. Oxford University Press, Oxford, UK.
