Corpus evidence for prototype-driven alternations: the case of German weak nouns
(Draft of August 20, 2014)
Roland Schäfer, Freie Universität Berlin¹

In this paper, I present a large study of so-called “weak” nouns in German based on the 9.1 billion token DECOW2012 web corpus. The weak nouns form a small class of masculine nouns with prototypical semantic and phonotactic features as well as case and number affixes which are unusual within the German system of noun inflection. These weak nouns have a certain tendency to be assimilated to the dominant “strong” inflectional pattern. I quantify the strength of this tendency and model it as an effect of the presence or absence of the prototypical features in the nouns as well as paradigmatic (case) features and token frequency. While staying neutral with respect to specific theories, this analysis provides overall evidence for usage-based theories, and it shows that even low-frequency alternations which are typical of non-standard language can be examined in corpus studies, provided that very large (and therefore necessarily web-derived) corpora are available.

Keywords: prototype theory, alternations, German, noun inflection, web corpora

¹ I would like to thank the following people (in alphabetical order) for comments, discussion, and suggestions: Götz Keydana, Ulrike Sayatz, Christian Zimmer. I also thank the participants of the workshop “Usage-Based Approaches to Morphology” at the Annual Conference of the German Linguistic Society (DGfS) 2013. All remaining errors and inadequacies are mine, and I intend to keep them.

1 Overview

On July 23, 2012, the German online newspaper Spiegel Online published an article about a controversial statement made by Philipp Rösler, the former leader of the German liberal democratic party.
In that article, there is a comment from a fellow party member quoted in a headline as (1a), but repeated in the text body as (1b).²

(1) a. Auf welchem Planeten lebt er?
       on which planet lives he
       On which planet does he live?
    b. Auf welchem Planet lebt er?
       on which planet lives he

In (1a), the “weak” masculine noun Planet (‘planet’) takes the inflectional marker -en in the dative, but it does not in (1b). The form in (1b) represents a non-standard alternation, because dative singular forms without a suffix are characteristic of the dominant “strong” masculine inflection pattern. While it is impossible to find out which variant was originally uttered in this case, many native speakers would agree that dropping the -en in this case might be stylistically dispreferred in written standard German, but that it is not unusual in colloquial spoken German, and that it is far from full ungrammaticality. Examples of an accusative and a genitive of a weak noun (Mensch, ‘man, human’) which inflect according to the strong pattern (accusative Mensch instead of Mensch-en and genitive Mensch-es instead of Mensch-en) can be found in (2) and (3).³,⁴

(2) Gibt es einen Mensch, der stetig wächst?
    gives it a man who constantly grows
    Is there any man who grows constantly?

(3) Das Leben eines Mensches wird zu politischen Zwecken aufs Spiel gesetzt.
    the life of.a man becomes for political reasons on.the game put
    The life of a man is put at risk for political reasons.

² http://www.spiegel.de/politik/deutschland/philipp-roesler-empoert-fdp-freunde-mit-griechenland-aeusserung-a-845980.html

The weak nouns form a small class of just over 450 masculine nouns (compared to thousands of strong masculine nouns), and they have remarkable prototypical semantic and phonotactic properties (Köpcke [1995]), for example human denotation or non-final accent (cf. Section 2).
In addition, their case and number forms are quite remarkable within the German declension system in that they use -en as a non-nominative-singular marker (Thieroff [2003]). All forms except for the nominative singular take the suffix -en (see Section 2). Not surprisingly, given the low type frequency of the weak nouns and their non-canonical inflection, some of them have been fully assimilated to the strong inflectional pattern (Wurzel [1985], Joeres [1996]). Additionally, nouns which are still predominantly weak have alternative strong forms as exemplified in (1) – (3).

³ http://www.flegel-g.de/wachstum-wachstum.html
⁴ http://www.mumia.de/doc/aktuell/991201ai00.html

No extensive corpus study which shows whether the presence or the absence of the prototypical semantic and phonotactic features of weak nouns influences the alternation strength has been presented. By “alternation strength”, I mean the probability that a weak noun occurs in a strong form. The lack of corpus evidence was, in my view, mostly due to the lack of adequate corpora, and the present paper aims to remedy this situation by using a very large web corpus of German for a data-driven study.

I pursue two major goals in this paper: a theoretical and a methodological one. On the theoretical side, I consider this study to contribute evidence for the adequacy of usage-based models of inflectional morphology, without making strong claims in favor of a specific version of, for example, Construction Morphology such as advocated in Booij (2010), who is not very much concerned with inflection anyway. A prototype or schema approach as proposed by Köpcke (1988, 1995, 1998) suggests itself, not only because it was applied to German nominal inflection within those publications. The advantage of a schema approach is that it specifically allows for central and non-central membership.
Lexical items less strongly associated with a certain schema can be assumed to disassociate from that schema and associate with another one more readily than strongly associated items. In the case at hand, I suggest that such a theory predicts that the alternation strength of highly prototypical weak nouns should be lower than that of less prototypical weak nouns. My corpus study not only confirms my hypothesis but also quantifies the strength of the influence of the different semantic, phonotactic, and morphological properties.

It is a difficult question whether and how such corpus data can be interpreted in “cognitive” terms. I cautiously subscribe to a view of corpus linguistics as “psychologically informed (cognitively-inspired) usage-based linguistics” (Gries [2010:334]). In this vein, the inspiration for the present study comes from the cognitive schema-based theories by Köpcke. Also, by producing results from corpus data which are highly compatible with those theories, the study contributes evidence to support the assumption that reflexes of cognitive representations can indeed be measured in corpus data.

As for the second (methodological) goal, I demonstrate that data-driven corpus studies of rare event alternations are possible, even if these alternations predominantly occur in unedited colloquial written language, provided that corpora of the appropriate size and nature are available. I suggest web corpora in the region of 10 billion tokens as the appropriate source of data. I consider this methodological goal to be as important as the theoretical one.

I now briefly describe the system of noun inflection in German as well as the position of the weak nouns within that system in Section 2. Section 3 presents the corpus study including the statistical results. Section 4 summarizes the findings.
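
The notion of alternation strength introduced in this section can be illustrated with a minimal Python sketch. It estimates the probability that a weak noun occurs in a strong form as a simple relative frequency per lemma and case. This is only an illustration, not the statistical model used in the study; the function names and the observation list are hypothetical, and the counts are placeholder values, not figures from DECOW2012.

```python
from collections import Counter


def alternation_strength(strong: int, weak: int) -> float:
    """Relative frequency of strong forms among all observed forms."""
    total = strong + weak
    if total == 0:
        raise ValueError("no observations for this noun and case")
    return strong / total


# Hypothetical placeholder observations: (lemma, case, inflection pattern).
# These are NOT corpus counts; they only make the sketch runnable.
observations = [
    ("Planet", "dat", "weak"),    # e.g. "auf welchem Planeten"
    ("Planet", "dat", "strong"),  # e.g. "auf welchem Planet"
    ("Mensch", "acc", "weak"),
    ("Mensch", "acc", "weak"),
    ("Mensch", "acc", "weak"),
    ("Mensch", "acc", "strong"),  # e.g. "einen Mensch"
]

counts = Counter(observations)


def strength_for(lemma: str, case: str) -> float:
    """Alternation strength for one lemma in one case."""
    return alternation_strength(
        counts[(lemma, case, "strong")],
        counts[(lemma, case, "weak")],
    )


print(strength_for("Planet", "dat"))
print(strength_for("Mensch", "acc"))
```

A per-lemma, per-case proportion like this is only the raw quantity of interest. Relating it to prototypical features, case, and token frequency requires a proper statistical model.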
2 German weak nouns

2.1 German noun inflection

In this section, I briefly describe noun inflection in current German, mostly for readers who are unfamiliar with German. I follow the overviews in Augst (1979), Eisenberg (2000, 2013), and Schäfer (2014, submitted).

German has a three-way gender distinction (neuter, masculine, feminine), with four cases (nominative, accusative, dative, genitive) and a two-way number distinction (singular, plural).⁵ Leaving only the weak nouns aside (cf. Section 2.2), case inflection in nouns reduces to two simple rules: The dative plural takes -(e)n (after the plural marker) whenever it is phonotactically possible, and the genitive singular of masculine and neuter nouns always takes -(e)s.⁶

⁵ Many syncretisms can be observed in the case system, but if we take the whole range of case-marked parts-of-speech (nouns, adjectives, pronouns, and determiners), non-ambiguous NP forms can be found for each of the four cases.

Furthermore, grammatical gender and inflection are tightly coupled inasmuch as a noun’s gender largely predetermines its plural inflection. Leaving aside the weak nouns and some other small classes once again, the general masculine and neuter (strong) plural marker is -e (with or without stem umlaut) as in Stuhl/Stühl-e (‘chair/s’) and Gurt/Gurt-e (‘belt/s’). The general feminine plural marker is -en as in Nadel/Nadel-n (‘needle/s’). Sometimes, the prototypical masculine and neuter plural suffixes occur with feminine nouns and vice versa. Examples are masculine Staat-en (‘states’) or feminine Wänd-e (‘walls’). The class of weak nouns also takes the atypical masculine plural marker -en. I now turn to the additional peculiarities of weak nouns in Section 2.2.

2.2 Weak nouns

In addition to taking the rare masculine plural -en, weak nouns deviate from the otherwise strict pattern of case marking in that they take the suffix -en in all singular forms except for the nominative.
In other words, -en becomes a non-nominative-singular marker instead of a simple plural marker. In this section, I briefly summarize the relevant previous work on weak nouns, especially regarding their special status within the German declension system.

⁶ In inflectional affixes, orthographic <e> always corresponds phonologically to schwa.
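
The difference between the two singular paradigms described in Sections 2.1 and 2.2 can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a full model of German declension. The function names are hypothetical. The strong genitive suffix is passed in by the caller because the choice between -es and -s is not covered here. The handling of nouns ending in -e (which only add -n) is an assumption added for the sketch.

```python
def weak_singular(nom_sg: str) -> dict[str, str]:
    """Singular forms under the weak pattern: -en everywhere except the nominative."""
    # Assumption: nouns already ending in -e only add -n (Kunde -> Kunden).
    suffix = "n" if nom_sg.endswith("e") else "en"
    oblique = nom_sg + suffix
    return {"nom": nom_sg, "acc": oblique, "dat": oblique, "gen": oblique}


def strong_singular(nom_sg: str, gen_suffix: str = "es") -> dict[str, str]:
    """Singular forms under the strong pattern: only the genitive takes -(e)s."""
    # Assumption: the caller chooses between "es" and "s" for the genitive.
    return {
        "nom": nom_sg,
        "acc": nom_sg,
        "dat": nom_sg,
        "gen": nom_sg + gen_suffix,
    }


for noun in ("Mensch", "Planet"):
    print(noun, weak_singular(noun), strong_singular(noun))
```

For Mensch, the weak accusative Menschen and the strong variants Mensch (accusative) and Mensches (genitive) correspond to the forms discussed in connection with (2) and (3).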