Proceedings of the VII Nereus International Workshop : "Clitic Doubling and Other Issues of the Syntax/Semantic Interface I

Arbeitspapier Nr. 128

Proceedings of the VII Nereus International Workshop: “Clitic Doubling and other issues of the syntax/semantic interface in Romance DPs”

Susann Fischer & Mario Navarro (eds.)

Konstanzer Online-Publikations-System (KOPS)
URL: http://nbn-resolving.de/urn:nbn:de:bsz:352-0-372560

Fachbereich Sprachwissenschaft der Universität Konstanz
Fach 185, D-78457 Konstanz, Germany

Konstanz, November 2016
Nominal fee (Schutzgebühr): € 3.50

Secretariat of the Fachbereich Sprachwissenschaft: Ms. Tania Simeoni, Fach 185, D-78457 Konstanz, Tel. 07531/88-2465

Table of Contents

Susann Fischer & Mario Navarro
Preface

Artemis Alexiadou
DP internal (clitic) Doubling

Elena Anagnostopoulou
Clitic Doubling and Object Agreement

Klaus von Heusinger, Diego Romero & Georg A. Kaiser
Differential Object Marking in Spanish ditransitive constructions. An empirical approach

Mihaela Marchis Moreno & Carolina Petersen
On locality effects in Romance: the role of clitic doubling

Mario Navarro & Mareike Neuhaus
Clitic Doubling restrictions in Leísta Spanish

Teresa Parodi
Formal features and vulnerable domains in L2 acquisition and an outlook on language contact

Natascha Pomino
On the clitic status of the plural marker in phonic French

Elisabeth Stark
Nominal morphology and semantics – Where’s gender (and ‘partitive articles’) in Gallo-Romance?

Preface

The phenomenon of clitic doubling is known to be especially interesting with respect to the Romance languages. As its name suggests, clitic doubling involves the doubling of a verbal argument by a clitic pronoun inside the same propositional structure. From a generative perspective, it was initially investigated with a focus on its properties as exhibited in those Romance languages where it is attested. Thus Jaeggli (1982), who was the first to notice its theoretical importance, describes it for River Plate Spanish (spoken in Argentina, Uruguay, and Paraguay).

Over the years, different factors that make clitic doubling possible, likely or even obligatory have been studied. Grammatical factors such as pronominal vs. non-pronominal, accusative vs. dative, and the occurrence vs. non-occurrence of different object marking, together with semantic and pragmatic factors such as animacy, specificity or definiteness, have been held responsible for its occurrence and distribution.

This volume is a collection of papers given at the workshop “Clitic Doubling and other issues of the syntax/semantic interface in Romance DPs” held at the University of Hamburg in November 2014.
https://www.slm.uni-hamburg.de/romanistik/personen/fischer/downloads/clitic-doubling.pdf
The workshop was a joint event organized by NEREUS (Research Network for Referential Categories in Spanish and other Romance languages) and the DFG project “Clitic Doubling across Romance”. The papers of this volume deal with different aspects of the clitic doubling construction and related issues, such as its semantic, pragmatic and morphosyntactic properties across the Romance languages and beyond, thereby contributing to the understanding of the nature of the cross-linguistic variation, as well as the micro-variation observed within.

We would like to thank all contributors and participants of the workshop for their interest and committed engagement. The quality of the papers and the passionate discussions made the workshop a very inspiring event. We would like to acknowledge DFG grant (FI 875/3-1) and the University of Hamburg for financial support of this workshop. Special thanks go to Sarah Jobus for preparing the manuscript and to Georg Kaiser for his generous help with all editorial and technical matters.

Hamburg, September 2016
Susann Fischer
Mario Navarro

In: Fischer, Susann & Mario Navarro (eds.), Proceedings of the VII Nereus International Workshop: “Clitic Doubling and other issues of the syntax/semantic interface in Romance DPs”. Arbeitspapier 128. Fachbereich Sprachwissenschaft, Universität Konstanz 2016, 1-10.

DP internal (clitic) Doubling

Artemis Alexiadou

1. Introduction*

While (clitic) doubling phenomena are relatively well described and (perhaps) well understood in the clausal domain, they remain rather understudied in the nominal domain. In this short paper, I focus on the properties of possessor (clitic) doubling from a comparative perspective by looking at French, Greek, Mauritian Creole (MC), and German. The types of possessor doubling I am interested in here are illustrated in examples (1)-(4).
In French, (1), as well as in Greek, (2), a postnominal possessive strong pronoun can be doubled:

(1) son amie à lui                                   French
    his friend to him
    (Cardinaletti 1998)

(2) To vivlio mu emena den pulithike katholu.        Greek
    the book CL.1SG me.GEN.STR not sold at all
    ‘My book was not sold at all.’ (Giusti & Stavrou 2008)

Neither Standard French nor Greek allows for clitic doubling of full possessor DPs in the nominal domain. As we will see in the next sections, the languages differ as far as their verbal doubling patterns are concerned: Greek, but not French, allows clitic doubling of full DPs. By contrast, Mauritian Creole (henceforth MC) (3a) and dialects of German (4) allow possessor doubling with a prenominal DP possessor. This is not possible in French, the lexifier language for MC, as we see in (3b):

(3) (a) Za so liv
        John his book
        ‘John’s book’ (Syea 2007)
    (b) *Jean son livre
        Jean his book

(4) dem Hans sein Haus
    the Hans.DAT his house

As in the case of doubling in the verbal domain, the first question is how one can distinguish between true doubling and instances of dislocation. As we will see, all the above are instances of true doubling. The next question concerns the difference between Greek and French. While clitic doubling in the French DP has identical properties to clitic doubling in the French verbal domain, this is not the case in Greek. Specifically, full DP possessor doubling is excluded. Nevertheless, the pronominal doubling illustrated above has identical properties to its verbal counterpart. The final question relates to the cross-linguistic distribution of doubling: what are the properties that characterize its distribution and how can we account for the cross-linguistic differences observed?
As can be seen from the data in (1)-(4), the doubling patterns differ: we have full DP possessors in MC and German preceding the possessor clitic, while we have post-nominal possessors following the clitic in French and Greek. I will argue that two ingredients seem to be relevant to understand doubling patterns: (i) a clitic possessor should be able to realize D°, and (ii) a full possessor DP should be able to occupy Spec,DP, which has A-properties. When both conditions are met, as in, e.g., German or MC, the two co-occur in the DP layer. Moreover, in languages such as French, MC or German the possessor clitic realizes D°, but this is not the case in Greek, where the possessor clitic is an enclitic to an XP. Furthermore, Greek Spec,DP is an A’-position, thus making the Greek DP parallel to CP, and the MC/German DP parallel to TP.

The paper is structured as follows: in sections 2 to 4, I will show that the examples in (1)-(4) are indeed instances of doubling. In section 5, I will turn to an analysis of the properties that regulate the crosslinguistic distribution of doubling. These relate to the properties of Spec,DP across languages and the types of possessor clitics available in a language.

* I would like to thank Elena Anagnostopoulou, Susann Fischer, Terje Lohndal, Mario Navarro, and the participants of the Nereus workshop on clitic doubling in Hamburg in November 2014 for their comments.

2. Possessor doubling in French

Cardinaletti (1998) discusses in detail the properties of the French pattern in (1). On the basis of several criteria, she convincingly concludes that French does indeed have possessor doubling involving strong possessive pronouns, that is, the same type of doubling observed in the verbal domain. Let me briefly summarize her arguments here. All French data here are from Cardinaletti’s (op.cit.) paper.
A first piece of evidence comes from the observation that the restrictions on possessive doubling are the same as those on personal pronouns: the doubled element can only be a pronoun, exactly as in the verbal domain, which disallows doubling of DPs. This is shown in the data in (5) and (6):

(5) (a) son livre à lui
    (b) *son livre à Jean
        his book to him / Jean

(6) (a) Il m'a vu moi.
        he me has seen me
    (b) *Il l'a vu Jean.
        he him has seen Jean

Second, on a par with clitic pronouns, French possessives license floating