
Russ Linguist (2012) 36:91–119
DOI 10.1007/s11185-011-9083-x

Morphosyntactic variation and syntactic constructions in Czech nominal declension: corpus frequency and native-speaker judgments

Морфосинтаксическая вариативность и синтаксические конструкции в склонении чешских существительных: частотность в корпусе и оценки носителей языка

Neil Bermel · Luděk Knittl

Published online: 3 January 2012
© Springer Science+Business Media B.V. 2012

Abstract Data from the Czech National Corpus and a large-scale survey of acceptability judgments are used to investigate the scope of morphosyntactic variation in two cases (genitive singular and locative singular) of a Czech declension pattern. The syntactic construction in which a form is found is shown to have a significant interaction with its frequency in the corpus and with its acceptability rating. We conclude that the pattern of acceptability preferences lends support to the entrenchment hypothesis and in general to emergentist approaches to language.

Summary This article examines the relationship between data from the Czech National Corpus and a large-scale survey of acceptability judgments. The aim of the work is to examine the scope of morphosyntactic variation in two Czech cases (genitive singular and locative singular). According to the results of our analysis, the syntactic construction in which a given form occurs interacts closely with its frequency in the corpus and with its acceptability rating. The overall pattern of acceptability ratings thus supports the hypothesis of the 'entrenchment' of more frequent forms and is broadly consistent with so-called 'emergentist' approaches to language, i.e. approaches according to which linguistic structures are created in the course of language acquisition.

N. Bermel · L. Knittl
Department of Russian and Slavonic Studies, School of Modern Languages and Linguistics, University of Sheffield, Sheffield, UK
e-mail: n.bermel@sheffield.ac.uk

1 Introduction¹

Czech nominal declension patterns present a formidable amount of variation. Not only do some nouns exhibit variation between patterns, but within patterns there is often a choice of desinential morphs. The entire system is thus a fertile area for considering how we study variation, and how frequency information from large-scale annotated text databases (corpora) relates to the acceptability of forms for native speakers.²

The question is an important one, because we now possess a wealth of corpus data. In Czech, which was the first Slavic language to have a large tagged corpus, scholars have been mining corpora for information about the relative proportions of competing forms.

In an earlier paper (Bermel and Knittl forthcoming), we used the results of an acceptability study to examine the overall relationship between corpus data and acceptability judgments, arguing in the process that there is a significant relationship between them, but that only certain kinds of corpus data allow us to generalize about the ratings native speakers are likely to give one or another form.

The current contribution attempts to answer several questions about the influence of syntactic and non-linguistic factors on acceptability judgments:

1. What do corpus data tell us about the scope of morphosyntactic variation in two Czech cases (genitive singular and locative singular), where this variation seems to be persistent and is not predictable from simple rules?
2. Does the syntactic construction in which a form appears play a role in influencing people’s evaluation of variant forms?
3. If so, what model best explains the distribution of forms in the syntactic environments studied?

2 Czech morphosyntax

Modern Czech inherits the full complement of Slavic nominal paradigms.
A degree of opacity in the assignment of nouns to particular paradigms, at least from a synchronic perspective (due to phonological changes over the last thousand years), is characteristic of Czech, as is significant variation within paradigms, where we find that the case endings in use depend on the word chosen.

¹ Data collection and analysis for this article was funded under British Academy research grant SG-50275. The authors are grateful to Ewa Dąbrowska, Dagmar Divjak, Jean Russell and Marcin Szczerbiński for their assistance and advice at various stages during this project.

² Both terms, ‘acceptability’ and ‘grammaticality’, are used in the literature, as Schütze (1996) notes. ‘Grammaticality’ implies that the judgment is grounded in syntactic wellformedness (Featherston 2005, 673–674; Bader and Häussler 2009, 4–6), and it tends to imply a binary state of affairs (grammatical/ungrammatical), although some scholars take pains to stress that there is a gradient from grammaticality to ungrammaticality (e.g. Kempen and Harbusch 2008). Because in this particular study we wish to avoid the presumption that speakers are necessarily judging a matter of grammar as opposed to one of usage, we follow studies such as McKoon and Macfarland (2000) in preferring ‘acceptability’.

The variation we are examining falls within the so-called masculine hard inanimate pattern. This paradigm arises, as elsewhere in Slavic, through the merger of the early Slavic u-stem and o-stem paradigms, which had distinct endings in a number of cases. The merger meant that successor classes had variant endings at their disposal in certain cases, as detailed in Brown (2007); in Czech the endings from the smaller, more peripheral u-stem class have in some instances spread and become productive markers of the successor classes (masculine hard inanimate and masculine hard animate).
The shape of the paradigm can be seen in Table 1:

Table 1 The masculine hard inanimate paradigm in Czech (pattern hrad ‘castle’)

| Case | Endings | Primary pattern | Alternate pattern 1 | Alternate pattern 2 |
|---|---|---|---|---|
| Nom. sg. | Ø | hotel-Ø | | |
| Gen. sg. | -u/-a | hotel-u | svět-a ‘world’ | jazyk-u, jazyk-a ‘language’ |
| Dat. sg. | -u | hotel-u | | |
| Acc. sg. | Ø/-a | hotel-Ø | šlofík-a ‘nap’ | buřt-Ø, buřt-a ‘wiener’ |
| Voc. sg. | -e/-u | hotel-e | zámk-u ‘stately home’ | |
| Loc. sg. | -u/-ě | hotel-u | ovčín-ě ‘sheepfold’ | hrad-u, hrad-ě ‘castle’ |
| Inst. sg. | -em | hotel-em | | |

In the contemporary language, assignment of certain case endings is conditioned by phonemic environment or membership in a lexical subclass. For example, use of the -u ending in the vocative is conditioned by a stem ending in a velar consonant (hotel-e but zámk-u). In the accusative, the appearance of a so-called ‘facultative animate’ form has been linked to groups of semantically similar nouns (foods, dances, card games, drinks, cars), to expressivity or to foreignness (see e.g. Šulc 2001).

In the genitive and locative singular, however, we find a different sort of distribution. In both cases, the ending found with the vast majority of nouns is -u. There is a much smaller group of nouns that exclusively use the old o-stem endings -a and -ě respectively. A third group of nouns shows variation between the two endings available for the paradigm, cf. (1):

(1) Genitive and locative singular endings in the hrad paradigm in the SYN2005 corpus
a) Genitive singular in -u only: hrad-u
b) Genitive singular in -a only: svět-a
c) Genitive singular in either -a or -u: jazyk-a/jazyk-u
d) Locative singular in -u only: hotel-u
e) Locative singular in -ě only: ovčín-ě
f) Locative singular in either -ě or -u: hrad-ě/hrad-u³

³ This problem is arguably an artifact of convention, i.e. of the fact that we accept the traditional assignment of all these nouns to a single paradigm, and perhaps the problem would disappear if this conventional unit were abolished in favor of smaller units within which the availability of endings was clearly marked. Bermel (2010, 138–139) used corpus data to explore one alternative: designating a series of subparadigms, based on which ending or endings are found with nouns in the genitive, accusative and locative singular. Such an approach yields a confusing 15 different subparadigms. For any subparadigms where multiple endings are found, the proportions of one ending to another can run from 1:99 all the way to 99:1, casting doubt on the uniform character of the subparadigm. The alternative explanation thus does not yield any further clarity of description.

Descriptions of Czech grammar rarely dwell on this sort of variation. It is mentioned in prescriptive manuals, but most of these are, at best, based on previous handbooks and dictionaries.⁴ These in turn are based either on intuition or on material from excerpt files, which contain a mixture of citations from newspapers and literary sources over the past century. Three articles summarize the explanations found in such manuals: Cummins (1995), Rusínová (1992) and Sedláček (1982).

As for attempts to ground these findings in competence or performance, three further pre-corpus studies stand out. Klimeš (1953) used excerption from books, while Kasal (1992) took soundings into native speakers’ usage and Bermel (1993) asked native speakers to evaluate variant forms. Later studies (Bermel 2004; Štícha 2009) have looked at the extent of variation in the Czech National Corpus.⁵

Grammars of Czech frequently suggest that there are syntactic and semantic factors conditioning the choice of forms in the genitive and locative singular. The data for this are not cited; assertions tend to be handed down from one manual to the next. The 1986 Academy Grammar Mluvnice češtiny (Petr 1986, 305) gives the following descriptions:

In the gen. sg. the declension formant is -u or -a, in places the doublet -u/-a.
In the contemporary standard language the majority of inanimate masculine nouns have the formant -u. Disproportionately fewer inanimate masculine nouns have the ending -a (characteristic of the hard subtype of animate nouns) or the doublet -u/-a. The distribution of forms depends on various factors: word-formational, syntactic, semantic, sometimes on the phonemic composition of the word as well. In morphophonological terms the formants -u and -a are restricted to the environment following a non-soft consonant (after soft consonants, as a rule, only -e appears).
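
The three-way grouping of genitive singular forms in (1), with nouns taking -u only, -a only, or both, can be sketched as a simple classification over per-lemma token counts. What follows is a minimal illustration and not the authors' procedure: the function names `classify` and `share` are hypothetical, and the counts are invented placeholders, not figures from SYN2005.

```python
# Minimal sketch: group lemmas by which competing endings are attested.
# All counts below are HYPOTHETICAL placeholders, not corpus figures.

GEN_ENDINGS = ("-u", "-a")  # the two competing genitive singular endings


def classify(counts, endings=GEN_ENDINGS):
    """Return '<ending> only', 'both', or 'unattested' for one lemma."""
    attested = [e for e in endings if counts.get(e, 0) > 0]
    if len(attested) == 2:
        return "both"
    if len(attested) == 1:
        return f"{attested[0]} only"
    return "unattested"


def share(counts, ending):
    """Proportion of a lemma's tokens that carry the given ending."""
    total = sum(counts.values())
    return counts.get(ending, 0) / total if total else 0.0


# Invented token counts per lemma (illustration only).
genitive_counts = {
    "hrad": {"-u": 120, "-a": 0},
    "svět": {"-u": 0, "-a": 300},
    "jazyk": {"-u": 45, "-a": 55},
}

groups = {lemma: classify(c) for lemma, c in genitive_counts.items()}
for lemma, group in groups.items():
    print(lemma, group, round(share(genitive_counts[lemma], "-u"), 2))
```

A binary attested/unattested split like this hides exactly the issue raised in footnote 3: within the "both" group the proportion returned by `share` can fall anywhere between the extremes, so the label alone says little about how a noun actually behaves.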