A New Language for Constraint Grammar: Estonian∗

Kaili Müürisep
Institute of Cybernetics, Tallinn Technical University
12168 Tallinn, Estonia

Tiina Puolakainen
Institute of Estonian Language
10136 Tallinn, Estonia

Kadri Muischnek
Department of Estonian and Finno-Ugric Linguistics, University of Tartu
50409 Tartu, Estonia

Mare Koit and Tiit Roosmaa and Heli Uibo
Institute of Computer Science, University of Tartu
50409 Tartu, Estonia
{koit,roosmaa,heli u}@ut.ee

∗ This work was supported by the Estonian Science Foundation under the grants No. 3314 and No. 4605.

Abstract

The Constraint Grammar of Estonian presented in the paper is the first attempt at automatic syntactic analysis of Estonian. The grammar consists of 1,240 morphological disambiguation rules, 47 clause boundary detection rules, 180 morphosyntactic mapping rules and 1,118 syntactic constraints. The rules have been devised using a training corpus of 20,300 words and have been tested on a benchmark corpus of 10,000 words. As the result of tests, 86.6% of words become morphologically unambiguous, and the error rate of the morphological disambiguator is 1.8%. The results of the full analysis demonstrate an ambiguity rate of 83% and an error rate of 3.5%.

1 Introduction

The Estonian language is a Finno-Ugric language that has a rich structure of declensional and conjugational forms, and also a relatively free order of sentence constituents. In these respects it differs considerably from English. There are 14 cases in Estonian, but due to the free word order it is difficult to determine the syntactic functions of these cases. Furthermore, there is no grammatical gender. Person agreement (1st, 2nd, and 3rd person in singular and plural) is common in finite verbs in all forms and tenses. The majority of grammatical categories are implemented by means of morphology.

Estonian is characterised by a wide extent and variety of grammatical homonymy that makes the automatic analysis of Estonian a difficult task. In the case of English, the main difficulty lies in determining the correct part of speech. The same problem exists also in Estonian, but the number of choices is much greater due to the richness of forms.

The ratio of ambiguous words varies greatly from language to language: for example, in English, Swedish, and Finnish the ratio of words with multiple morphological interpretations is 40%, over 60%, and 11%, respectively (Karlsson et al. 95). In Estonian literary texts more than 45% of words are ambiguous. Estonian, unlike Germanic languages, is not subject-centered. There are a number of non-elliptical sentences in Estonian with no subject.

Before we started our project, an automatic morphological analyzer for Estonian had already been created (Kaalep 96). It was our task to elaborate a grammar suitable for the automatic syntactic analysis of Estonian, and to compile the program for the syntactic analysis. To accomplish this task, we had to choose a suitable grammar model for Estonian, and analyze the available Estonian texts, primarily from the Corpus of Written Estonian Texts (Hennoste et al. 98), in order to use the established regularities for wording the rules and writing the syntactic parser.

Our grammar has been composed on the formalism of the Constraint Grammar (Karlsson et al. 95). The main idea of the Constraint Grammar (CG) is that it determines the surface-level syntactic analysis of the text, which has gone through prior morphological analysis. The process of syntactic analysis consists of three stages: morphological disambiguation, identification of clause boundaries, and identification of syntactic functions of words.
The underlying principle in determining both the morphological interpretation and the syntactic functions is the same: first all the possible labels are attached to words, and then the ones that do not fit the context are removed by applying special rules called constraints. Constraint Grammar consists of hand-written rules, which by checking the context decide whether an interpretation is correct or has to be removed.

The Constraint Grammar parser of Estonian exists as two separate programs: the morphological disambiguator (Puolakainen 01) and the syntactic analyzer in a narrower sense (Müürisep 00). The basic differences of our grammar from the standard one are the following:

• the assumed clause boundaries are also used;
• the referenced context conditions can be represented in two ways: the appropriate cohort is searched either up to the very end of the possible context, or it is searched up to the first appropriate elements/filler;
• it is possible to remove morphological interpretations during the syntactic analysis.

The next sections of the paper will provide an overview of the Constraint Grammar of Estonian (EstCG) and problems that cropped up in its creation. We think that other researchers who set the aim of elaborating an automatic syntactic analysis of a language can benefit from our experience, especially if the language is different from English, possesses rich morphology and/or free word order.

2 Motives for Selecting Constraint Grammar Formalism

According to Constraint Grammar, after the morphological analysis of a sentence the following steps are performed: morphological disambiguation, determination of sentence-internal clause boundaries, adding of syntactic tags, and finally, syntactic disambiguation.

As an example, let us consider the morphologically analyzed Estonian sentence "Aknas kustus tuli" (The light went out in the window):

Aknas (window)
    aken+s // S com sg in //
kustus (go out)
    kustu+s // V main indic impf ps3 sg ps af #Intr //
tuli (light)
    tule+i // V main indic impf ps3 sg ps af #Intr // (came)
    tuli+0 // S com sg nom // (the light)
$.

Let us note that the word forms in this sentence can be sequenced in 3! = 6 different ways, and all the resulting sentences will be correct and understandable for a native speaker of Estonian due to the free word order.

In analyzing this sentence, the correct interpretation of the word form tuli is found by applying the following constraint: remove the finite form of the verb from the cohort (in the present case tule+i // V main indic impf ps3 sg ps af #Intr //) if a given word is immediately preceded by a finite form of verb which is the only interpretation of the word form (in the present case kustu+s // V main indic impf ps3 sg ps af #Intr //).

After having added the syntactic tags we have the sentence in the following form:

Aknas
    aken+s // S com sg in **CLB // @ADVL @<NN @NN>
kustus
    kustu+s // V main indic impf ps3 sg ps af #Intr // @+FMV
tuli
    tuli+0 // S com sg nom // @SUBJ @OBJ @ADVL @NN> @<NN
$.

In this sentence, the noun tuli in the nominative case singular may be either the subject (@SUBJ), the object (@OBJ), adverbial (@ADVL), premodifying attribute (@NN>), or postmodifying attribute (@<NN).

During the last stage, syntactic constraints are applied to the words; these remove the syntactic tags unsuitable for the context:

Aknas
    aken+s // S com sg in **CLB // @ADVL
kustus
    kustu+s // V main indic impf ps3 sg ps af #Intr // @+FMV
tuli
    tuli+0 // S com sg nom // @SUBJ
$.

The word form aknas was analyzed as an adverbial, the word form tuli was analyzed as the subject, and the verb kustus received the tag of a finite predicate.

In 1995, when we launched preparatory activities for the automatic syntactic parsing of Estonian, the Constraint Grammar was beyond doubt the most efficient grammar model for morphological disambiguation. The syntactic description of the CG was not as deep as in the case of other rule-based grammar models, but the CG output contained far fewer mistakes. CG has maintained that leading position from its introduction to the present day.

3 Method of Elaborating Rules

To elaborate the morphological disambiguation constraints, we established the more frequent groups of ambiguities. We found both the more frequent ambiguous word forms and the ambiguous grammatical categories (the past participle with the interpretation of the adjective either in singular or plural, the noun or the verb; noun in the nominative, genitive and partitive case; noun in the genitive, partitive and aditive; adverb and adjective in the ablative case, etc.) (Puolakainen 01). This frequency table indicated which phenomena needed to be handled first of all. For each case, samples were collected from text corpora, and upon these observations tentative rules were compiled.

[...] and write new rules that would reduce the remaining ambiguities. The easiest to compile were the rules establishing the complements of quantifiers and adpositions. For example, a word in the genitive case is a complement to a postposition if the postposition is immediately next to it and it requires the genitive case.

Among the attribute rules, the simplest are those seeking whether in the left or right context there exists at all a word they may complement, as well as numerous rules checking the agreement or non-agreement. A number of rules are clearly of heuristic nature: the rule might not be 100% true but its proficiency rate is very high, compared to the number of errors. Several rules have been compiled solely on the statistical information. While observing the word order in the sentence, it became obvious that such combinations as 1) object in the nominative or genitive case – predicate – subject in the nominative case, or 2) object in the nominative or genitive case – subject in the nominative case – predicate occur very rarely.

We tried to group the rules in such a way that the most reliable ones or those that cause least errors are in the main part of the grammar; [...]