Can LSTM Learn to Capture Agreement? the Case of Basque

Total Page:16

File Type:pdf, Size:1020Kb

Can LSTM Learn to Capture Agreement? the Case of Basque Can LSTM Learn to Capture Agreement? The Case of Basque Shauli Ravfogel1 and Francis M. Tyers2;3 and Yoav Goldberg1;4 1 Computer Science Department, Bar Ilan University 2 School of Linguistics, Higher School of Economics 3 Department of Linguistics, Indiana University 4 Allen Institute for Artificial Intelligence fshauli.ravfogel, [email protected], [email protected] Abstract the ability of RNNs to capture syntactic struc- ture requires a use of established benchmarks. A Sequential neural networks models are pow- common approach is the use of an annotated cor- erful tools in a variety of Natural Language Processing (NLP) tasks. The sequential nature pus to learn an explicit syntax-oriented task, such of these models raises the questions: to what as parsing or shallow parsing (Dyer et al., 2015; extent can these models implicitly learn hier- Kiperwasser and Goldberg, 2016; Dozat and Man- archical structures typical to human language, ning, 2016) . While such an approach does evalu- and what kind of grammatical phenomena can ate the ability of the model to learn syntax, it has they acquire? several drawbacks. First, the annotation process We focus on the task of agreement predic- relies on human experts and is thus demanding tion in Basque, as a case study for a task in term of resources. Second, by its very nature, that requires implicit understanding of sen- training a model on such a corpus evaluates it on tence structure and the acquisition of a com- a human-dictated notion of grammatical structure, plex but consistent morphological system. An- and is tightly coupled to a linguistic theory. Lastly, alyzing experimental results from two syntac- tic prediction tasks – verb number prediction the supervised training process on such a corpus and suffix recovery – we find that sequential provides the network with explicit grammatical la- models perform worse on agreement predic- bels (e.g. a parse tree). While this is sometimes tion in Basque than one might expect on the desirable, in some instances we would like to eval- basis of a previous agreement prediction work uate the ability of the model to implicitly acquire in English. Tentative findings based on diag- hierarchical representations. nostic classifiers suggest the network makes use of local heuristics as a proxy for the hier- Alternatively, one can train language model archical structure of the sentence. We propose (LM) (Graves, 2013;J ozefowicz´ et al., 2016; the Basque agreement prediction task as chal- Melis et al., 2017; Yogatama et al., 2018) to model lenging benchmark for models that attempt to the probability distribution of a language, and use learn regularities in human language. common measures for quality such as perplexity as an indication of the model’s ability to capture 1 Introduction regularities in language. While this approach does In recent years, recurrent neural network (RNN) not suffer from the above discussed drawbacks, models have emerged as a powerful architec- it conflates syntactical capacity with other factors ture for a variety of NLP tasks (Goldberg, such as world knowledge and frequency of lexi- 2017). In particular, gated versions, such as Long cal items. Furthermore, the LM task does not pro- Short-Term Networks (LSTMs) (Hochreiter and vide one clear answer: one cannot be “right” or Schmidhuber, 1997) and Gated Recurrent Units “wrong” in language modeling, only softly worse (GRU) (Cho et al., 2014; Chung et al., 2014) or better than other systems. achieve state-of-the-art results in tasks such as lan- A different approach is testing the model on a guage modeling, parsing, and machine translation. grammatical task that does not require an exten- RNNs were shown to be able to capture long- sive grammatical annotation, but is yet indicative term dependencies and statistical regularities in in- of syntax comprehension. Specifically, previous put sequences (Karpathy et al., 2015; Linzen et al., works (Linzen et al., 2016; Bernardy and Lap- 2016; Shi et al., 2016; Jurafsky et al., 2018; Gu- pin, 2017; Gulordava et al., 2018) used the task lordava et al., 2018). An adequate evaluation of of predicting agreement, which requires detecting 98 Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, pages 98–107 Brussels, Belgium, November 1, 2018. c 2018 Association for Computational Linguistics hierarchal relations between sentence constituents. the network relies, at least to a certain degree, on Labeled data for such a task requires only the col- local cues. Bernardy and Lappin(2017) evaluated lection of sentences that exhibit agreement from agreement prediction on a larger dataset, and ar- an unannotated corpora. However, those works gued that a large vocabulary aids the learning of have focused on relatively small set of languages: structural patterns. Gulordava et al.(2018) fo- several Indo-European languages and a Semitic cused on the ability of LM’s to capture agreement language (Hebrew). As we show, drawing con- as a marker of syntactic ability, and used nonsen- clusions on the model’s abilities from a relatively sical sentences to control for semantic clues. They small subset of languages can be misleading. have shown positive results in four languages, as In this work, we test agreement prediction in well as some similarities between their models’ a substantially different language, Basque, which performance and human judgment of grammati- is a language with ergative–absolutive alignment, cality. rich morphology, relatively free word order, and polypersonal agreement (see Section3). We pro- 3 Properties of the Basque Language pose two tasks, verb-number prediction (Section Basque agreement patterns are ostensibly more 6) and suffix prediction (Section7), and show that complex and very different from those of English. agreement prediction in Basque is indeed harder In particular, nouns inflect for case, and the verb for RNNs. We thus propose Basque agreement as agrees with all of its core arguments. How well a challenging benchmark for the ability of models can a RNN learn such agreement patterns? to capture regularities in human language. We first outline key properties of Basque rele- vant to this work. We have used the following two 2 Background and Previous Work grammars written in English for reference (Laka, To shed light on the question of hierarchical struc- 1996; de Rijk, 2007). ture learning, a previous work on English (Linzen Morphological marking of case and number on et al., 2016) has focused on subject-verb agree- NPs The grammatical role of noun phrases is ex- ment: The form of third-person present-tense plicitly marked by nuclear case suffixes that attach verbs in English is dependent upon the number after the determiner in a noun phrase — this is typ- of their subject (“They walk” vs. “She walks”). ically the last element in the phrase. Agreement prediction is an interesting case study The nuclear cases are the ergative (ERG), the for implicit learning of the tree structure of the in- absolutive (ABS) and the dative (DAT).1 In ad- put, as once the arguments of each present-tense dition to case, the same suffixes also encode for verb in the sentence are found and their grammat- number (singular or plural) as seen in Table1. ical relation to the verb is established, predicting the verb form is straightforward. Ergative-absolutive case system Unlike En- Linzen et al.(2016) tested different variants of glish and most other Indo-European languages the agreement prediction task: categorical pre- that have nominative–accusative morphosyntactic diction of the verb form based on the left con- alignment in which the single argument of intran- text; grammatical assessment of the validity of the sitive verbs and the agent of transitive verbs be- agreement present in a given sentence; and lan- have similarly to each other (“subjects”) but dif- guage modeling. Since in many cases the verb ferently from the object of transitive verbs, Basque form can be predicted according to number of the has ergative–absolutive alignment. This means preceding noun, they focused on agreement attrac- that the “subject” of an intransitive verb and the tors: sentences in which the preceding nouns have “object” of a transitive verbs behave similarly to the opposite number of the grammatical subject. each other and receive the absolutive case, while Their model achieved very good overall perfor- the “subject” of a transitive verb receives the erga- mance in the first two tasks of number prediction tive case. To illustrate the difference, while in and grammatical judgment, while in the third task English we say “she sleeps” and “she sees them” of language modeling, weak supervision did not (treating she the same in both sentences), in an suffice to learn structural dependencies. With re- 1Additional cases encode different aspects of the role of gard to the presence of agreement attractors, they the noun phrase in the sentence. For example, local cases have shown the performance decays with their indicate aspects such as destination and place of occurrence, possessive/genitive cases indicate possession, causal cases in- number, to the point of worse-than-random accu- dicate causation, etc. In this work we focus only on the three racy in the presence of 4 attractors; this suggests mentioned. 99 Suffix Forms The person sees the trees. Case Function Sg Pl No det (4) Zuhaitz-ak pertson-ak Absolutive S, O -a -ak - tree-SG.ERG person-PL.ABS Ergative A -ak -ek -(e)k ikusten ditu Dative IO -ari -ei -(r)i seeing it-is-them The tree sees the people. Table 1: Basque case and their corresponding determined nuclear case suffixes. Note the case syncretism, resulting in Word-order and Polypersonal Agreement structural ambiguity between the plural absolutive and the Basque is often said to have a SOV word order, ergative singular. Under function, S refers to the single ar- gument of a prototypical intransitive verb, O refers to the although the rules governing word order are most patient-like argument of a prototypical transitive verb, rather complex, and word order is dependent and A refers to the most agent-like argument of a prototyp- on the focus and topic of the sentence.
Recommended publications
  • Lara Mantovan Nominal Modification in Italian Sign Language Sign Languages and Deaf Communities
    Lara Mantovan Nominal Modification in Italian Sign Language Sign Languages and Deaf Communities Editors Annika Herrmann, Markus Steinbach, Ulrike Zeshan Editorial board Carlo Geraci, Rachel McKee, Victoria Nyst, Sibaji Panda, Marianne Rossi Stumpf, Felix Sze, Sandra Wood Volume 8 Lara Mantovan Nominal Modification in Italian Sign Language ISHARA PRESS ISBN 978-1-5015-1343-5 e-ISBN (PDF) 978-1-5015-0485-3 e-ISBN (EPUB) 978-1-5015-0481-5 ISSN 2192-516X e-ISSN 2192-5178 Library of Congress Cataloging-in-Publication Data A CIP catalog record for this book has been applied for at the Library of Congress. Bibliographic information published by the Deutsche Nationalbibliothek The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de. © 2017 Walter de Gruyter Inc., Boston/Berlin and Ishara Press, Preston, UK Printing and binding: CPI books GmbH, Leck ♾ Printed on acid-free paper Printed in Germany www.degruyter.com E Pluribus Unum (uncertain origin, attributed to Virgilio, Moretum, v. 103) Acknowledgements This book is a revised version of my 2015 dissertation which was approved for the PhD degree in Linguistics at Ca’ Foscari University of Venice. When I first plunged into the world of academic research, almost five years ago, I would never have imagined it was possible to achieve such an important milestone. Being so close to finalizing this book, I would like to look back briefly and remember and thank all the people who showed me the way, supported me, and encouraged me to grow both academically and personally.
    [Show full text]
  • Proceedings of the LFG13 Conference
    Proceedings of LFG13 Miriam Butt and Tracy Holloway King (Editors) 2013 CSLI Publications http://csli-publications.stanford.edu/ Contents 1 Dedication 4 2 Editors’ Note 5 3 Yasir Alotaibi, Muhammad Alzaidi, Maris Camilleri, Shaimaa ElSadek and Louisa Sadler: Psychological Predicates and Verbal Complemen- tation in Arabic 6 4 I Wayan Arka: Nominal Aspect in Marori 27 5 Doug Arnold and Louisa Sadler: Displaced Dependent Constructions 48 6 Daniele Artoni and Marco Magnani: LFG Contributions in Second Language Acquisition Research: The Development of Case in Russian L2 69 7 Oleg Belyaev: Optimal Agreement at M-structure: Person in Dargwa 90 8 Ansu Berg, Rigardt Pretorius and Laurette Pretorius: The Represen- tation of Setswana Double Objects in LFG 111 9 Tina Bögel: A Prosodic Resolution of German Case Ambiguities 131 10 Kersti Börjars and John Payne: Dimensions of Variation in the Ex- pression of Functional Features: Modelling Definiteness in LFG 152 11 George Aaron Broadwell: An Emphatic Auxiliary Construction for Emotions in Copala Triqui 171 12 Özlem Çetinoglu,˘ Sina Zarrieß and Jonas Kuhn: Dependency-based Sentence Simplification for Increasing Deep LFG Parsing Coverage 191 13 Elizabeth Christie: Result XPs and the Argument-Adjunct Distinction212 14 Cheikh Bamba Dione: Valency Change and Complex Predicates in Wolof: An LFG Account 232 15 Lachlan Duncan: Non-verbal Predicates in K’ichee’ Mayan: An LFG Approach 253 16 Dag Haug: Partial Control and the Semantics of Anaphoric Control in LFG 274 17 Annette Hautli-Janisz: Moving Right Along:
    [Show full text]
  • Mighty Morpheme Tagging Rangers
    in Proceedings of NAACL-HLT 2019 Background ❖ It's hard to make crosslinguistic comparisons of RNN syntactic performance (e.g., on subject-verb agreement prediction) ➢ Languages differ in multiple typological properties ➢ Cannot hold training data constant across languages Proposal: generate synthetic data to devise a controlled experimental paradigm for studying the interaction of the inductive bias of a neural architecture with particular typological properties. Setup ❖ Data: English Penn Treebank sentences converted to Universal Dependencies scheme Example of a dependency parse tree ONE in Proceedings of NAACL-HLT 2019 Setup ❖ Identify all verb arguments with nsubj, nsubjpass, dobj and record plurality (HOW? manually?) Example of a dependency parse tree Setup ❖ Generate synthetic data by appending novel morphemes to the verb arguments identified to inflect them for argument role and number Setup ❖ Generate synthetic data by appending novel morphemes to the verb arguments identified to inflect them for argument role and number No explanation or motivation given for how the novel morphemes were developed, nor an explicit mention that they're novel! Might length matter? Typological properties ❖ Does jointly predicting object and subject plurality improve overall performance? ➢ Generate data with polypersonal agreement ❖ Do RNNs have inductive biases favoring certain word orders over others? ➢ Generate data with different word orders ❖ Does overt case marking influence agreement prediction? ➢ Generate data with different case marking systems ■ unambiguous, syncretic, argument marking Examples of synthetic data Task ❖ Predict a verb's subject and object plurality features. Input: synthetically-inflected sentence Output: one category prediction each for subject & object subject: [singular, plural] object: [singular, plural, none] (if no object) (It's NOT CLEAR in the paper WHAT the actual prediction task is / what the actual output space is.
    [Show full text]
  • Multi-Lingual Dependency Parsing: Word Representation and Joint
    Multi-Lingual Dependency Parsing : Word Representation and Joint Training for Syntactic Analysis Mathieu Dehouck To cite this version: Mathieu Dehouck. Multi-Lingual Dependency Parsing : Word Representation and Joint Training for Syntactic Analysis. Computer Science [cs]. Université de lille, 2019. English. tel-02197615 HAL Id: tel-02197615 https://tel.archives-ouvertes.fr/tel-02197615 Submitted on 30 Jul 2019 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Centre de Recherche en Informatique, Signal et Automatique de Lille École Doctorale Sciences Pour L’Ingénieur Thèse de Doctorat Spécialité : Informatique et Applications préparée au sein de l’équipe Magnet, du laboratoire Cristal et du centre de recherche Inria Lille - Nord Europe financée par l’Université de Lille Mathieu Dehouck Multi-Lingual Dependency Parsing : Word Representation and Joint Training for Syntactic Analysis Parsing en Dépendances Multilingue : Représentation de Mots et Apprentissage Joint pour l’Analyse Syntaxique sous la direction de Dr. Marc TOMMASI et l’encadrement de Dr. Pascal DENIS Soutenue publiquement à Villeneuve d’Ascq, le 20 mai 2019 devant le jury composé de: Mme Sandra KÜBLER Indiana University Bloomington Rapportrice M. Alexis NASR Université d’Aix Marseille Rapporteur Mme Hélène TOUZET CNRS Présidente du jury M.
    [Show full text]
  • Journal of South Asian Languages and Linguistics 2(2)
    JSALL 2021; aop Netra P. Paudyal and John Peterson* How one language became four: the impact of different contact-scenarios between “Sadani” and the tribal languages of Jharkhand https://doi.org/10.1515/jsall-2021-2028 Published online May 4, 2021 Abstract: Four Indo-Aryan linguistic varieties are spoken in the state of Jharkhand in eastern central India, Sadri/Nagpuri, Khortha, Kurmali and Panchparganiya, which are considered by most linguists to be dialects of other, larger languages of the region, such as Bhojpuri, Magahi and Maithili, although their speakers consider them to be four distinct but closely related languages, collectively referred to as “Sadani”. In the present paper, we first make use of the program COG by the Summer Institute of Linguistics (SIL) to show that these four varieties do indeed form a distinct, compact genealogical group within the Magadhan language group of Indo- Aryan. We then go on to argue that the traditional classification of these languages as dialects of other languages appears to be based on morphosyntactic differences between these four languages and similarities with their larger neighbors such as Bhojpuri and Magahi, differences which have arisen due to the different contact situations in which they are found. Keywords: Khortha; Kudmali; language contact; Sadani; Sadri 1 Introduction While the first official language of the state of Jharkhand in eastern central India is Hindi, over 96% of the state population speaks a local tribal or regional language as their first (L1) or second language (L2) on a daily basis, and only 3.7% of the people speak Hindi as their first language (JTWRI 2013:4–5).
    [Show full text]
  • English for Practical Purposes 9
    ENGLISH FOR PRACTICAL PURPOSES 9 CONTENTS Chapter 1: Introduction of English Grammar Chapter 2: Sentence Chapter 3: Noun Chapter 4: Verb Chapter 5: Pronoun Chapter 6: Adjective Chapter 7: Adverb Chapter 8: Preposition Chapter 9: Conjunction Chapter 10: Punctuation Chapter 11: Tenses Chapter 12: Voice Chapter 1 Introduction to English grammar English grammar is the body of rules that describe the structure of expressions in the English language. This includes the structure of words, phrases, clauses and sentences. There are historical, social, and regional variations of English. Divergences from the grammardescribed here occur in some dialects of English. This article describes a generalized present-dayStandard English, the form of speech found in types of public discourse including broadcasting,education, entertainment, government, and news reporting, including both formal and informal speech. There are certain differences in grammar between the standard forms of British English, American English and Australian English, although these are inconspicuous compared with the lexical andpronunciation differences. Word classes and phrases There are eight word classes, or parts of speech, that are distinguished in English: nouns, determiners, pronouns, verbs, adjectives,adverbs, prepositions, and conjunctions. (Determiners, traditionally classified along with adjectives, have not always been regarded as a separate part of speech.) Interjections are another word class, but these are not described here as they do not form part of theclause and sentence structure of the language. Nouns, verbs, adjectives, and adverbs form open classes – word classes that readily accept new members, such as the nouncelebutante (a celebrity who frequents the fashion circles), similar relatively new words. The others are regarded as closed classes.
    [Show full text]
  • The Origins of Personal Agreement Clitics in Caucasian Albanian and Udi
    The Origins of Personal Agreement Clitics in Caucasian Albanian and Udi Wolfgang Schulze, Munich / Banská Bystrica 1. Introduction Udi, belonging to the Southeast Caucasian (Lezgian) language family (Eastern Samur branch), represents one of the best studied minority languages of this family (see Schulze (in press) for a more comprehensive survey on the history of Udi linguistics). From a typological point of view, Udi has found much interest because of its system of so-called floating agreement markers that is said to be unique among the autochthonous languages of the Eastern Caucasus. In the present paper, dedicated to the jubilee with whom I had the honor to discuss over times issues of Caucasian Albanian and Udi grammar, I want to present some new thoughts on the origins of Udi and Caucasian Albanian patterns of personal agreement. The issue has become a hotspot not only in the linguistics of East Caucasian, but also in general linguistics due to the study by Alice Harris (Harris 2002) that has served as a starting point for several theory-driven proposals to interpret these patterns (e.g. Crysmann 2000, Luís & Spencer 2006). Most of these studies are based on the analyses and hypotheses put forward by Harris (2002) and do not offer new data or new arguments concerning the history and motivation of agreement constructions in Udi. Moreover, Harris' analysis and hypotheses could not yet include data stemming the Mount Sinai palimpsests that contain texts written in Caucasian Albanian (~ 600 AD). Jost Gippert and the author of the present article who had edited these palimpsests in collaboration with Zaza Aleksidze and Jean-Pierre Mahé (Gippert et al.
    [Show full text]
  • A Grammar of Kunbarlang
    A Grammar of Kunbarlang Ivan Kapitonov ORCID: 0000-0002-1603-6265 Submitted in total fulfilment of the requirements ofthe degree of Doctor of Philosophy School of Languages and Linguistics The University of Melbourne July 2019 Copyright 2019 Ivan Kapitonov This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. To view a copy of this license, visit http://creativecommons. org/licenses/by-nc-nd/3.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA. Abstract This thesis is a comprehensive description of Kunbarlang, an Aboriginal language from northern Australia. The description and analysis are based on my original field work, as well as build on the preceding body of work by other scholars. Between 2015 and 2018 I have done field work in Warruwi (South Goulburn Island), Maningrida, and Darwin. The data elicited in those trips and the recordings of narratives andsemi- spontaneous conversation constitute the foundation of the present grammar. However, I was fortunate in that I was not working from scratch. Carolin Coleman did foundational work on Kunbarlang in central-western Arnhem Land from 1981, which resulted in the first grammar of the language (Coleman 1982). In her subsequent work in the area in the 1990’s, she carried on with lexicographic research in Kunbarlang, Mawng and Maningrida languages. More recently, Dr. Aung Si (Universität zu Köln), Dr. Isabel O’Keeffe (University of Sydney), and Dr. Ruth Singer (University of Melbourne / Australian National University) made a number of recordings of Kunbarlang speakers at Maningrida, Warruwi, Minjilang and Darwin.
    [Show full text]
  • Natural Language Processing and the Mohawk Language Creating a Finite State Morphological Parser of Mohawk Formal Nouns
    Natural Language Processing and the Mohawk Language Creating a finite state morphological parser of Mohawk formal nouns Alicia Alexandra Assini MSc candidate in Multilingual Computing and Localisation 2012/2013, CSIS Advised by Richard Sutcliffe, PhD 11/10/2013 Presented in this thesis is the design, implementation and evaluation of a finite state morphological parser for Mohawk formal nouns. Utilizing the finite state morphology software designed by Beesely and Karttunen (2003) along with three of the most comprehensive grammars for Mohawk, one from each of the major dialectal regions, a lexicon for a finite state system was created that incorporated a structure I created from cross-referencing the three sources. Since there was no formal coursework in the program providing instruction in computer programming or morphology, these skills were self- taught. In addition to the parser, a taxonomy of Mohawk prefixes and suffixes was developed for the finite state system. The challenges facing the development of Natural Language Processing tools and language-learning technologies for the Mohawk language, as a polysynthetic language are representative of the problems that other related languages in this important group experience. The research shown here demonstrates that using finite state morphology techniques that the Mohawk language’s formal nouns can be successfully described and parsed and that with further study and review that it is possible a finite state system could be expanded to include all of Mohawk nouns, verbs, and particles. This work is looking to provide greater access to the Mohawk language for the speakers and learners of the language by use in an e-dictionary and also to support projects for language tools development for other polysynthetic and under-resourced languages.
    [Show full text]
  • Comparative Typology of the English and Azerbaijani Languages
    DUNYAMIN YUNUSOV LEYLA KHANBUTAYEVA COMPARATIVE TYPOLOGY OF THE ENGLISH AND AZERBAIJANI LANGUAGES BAKU – 2008 MINISTRY OF EDUCATION OF AZERBAIJAN REPUBLIC AZERBAIJAN UNIVERSITY OF LANGUAGES DUNYAMIN YUNUSOV LEYLA KHANBUTAYEVA COMPARATIVE TYPOLOGY OF THE ENGLISH AND AZERBAIJANI LANGUAGES By the order of the Ministry of Education of Azerbaijan Republic the stamp was given (order № _________) BAKU – “MUTARJIM” – 2008 1 It was approved by the department of “European Languages and Literature” of “Scientific Methodical Council” of the Ministry of Education of Azerbaijan Republic, December 8, 2007 (record of evidence №16). REVIEWERS: Doctor of philology, professor A.A.Abdullayev Candidate of philology, assistant professor A.R.Huseynov EDITOR: Candidate of philology, assistant professor E.I.Hajiyev COMPUTER DESIGNER: Sadagat Yusivofa Copyright All rights reserved. No part of this publication may be reproduced, transmitted, stored in a retrieval system without the written permission of the authors. 2 PREFACE “Comparative Typology of the English and Azerbaijani languages” is a new kind of book in our Republic. In writing it, the authors have assumed that studying comparative typology, for the overseas student, makes most sense if one starts with the question “How can I learn this subject?” In preparing this book, care has been taken to bring the text of the book up to date and to introduce the reader to some outstanding problems of modern linguistics. One of these concerns the relations between two non-kindred English and Azerbaijani languages, on the one hand and main levels and processes of the development of languages, linguistic differentiation and integration on the other. Recent discussion of this problem has also immediate connection with the treatment of the notion of “comparativeness”.
    [Show full text]
  • Morphosyntactic Restructuring in Child Heritage Georgian
    234 Heritage Language Journal, 17(2) https://doi.org/10.46538/hlj.17.2.6 August, 2020 Morphosyntactic Restructuring in Child Heritage Georgian Cass Lowry and LeeAnn Stover The Graduate Center, City University of New York ABSTRACT This study investigates morphosyntactic restructuring in Heritage Georgian, a highly agglutinative language with polypersonal agreement. Child heritage speakers of Georgian (n = 26, age 3-16) completed a Frog Story narrative task and a lexical proficiency task in Georgian. Heritage speaker narratives were compared to narratives produced by age-matched peers living in Georgia (n = 30, age 5-14) and Georgian children and young adults who moved to the United States during childhood (n = 7, age 9–24). Heritage Georgian speakers produced more instances of non-standard nominal case marking and non-standard verbal subject agreement than their homeland peers. Individual morphosyntactic divergence was predicted by lexical score, but not by oral fluency or age. Patterns of divergence in the nominal domain included overuse of the default case (nominative) as well as over-extension of non-default cases (ergative, dative). In the verbal domain, person agreement was more consistently marked than number. Subject agreement exhibited more divergence from the baseline than object agreement, contrary to previous evidence from similar heritage languages (e.g., Heritage Hindi, Montrul et al., 2012). Results indicate that morphosyntactic production in child Heritage Georgian generally displays the same divergences as adult heritage-language grammars, but language-specific differences also underscore the need for continued documentation of lesser-studied heritage languages. KEYWORDS: child heritage speakers, Georgian, morphosyntax, case marking, verbal agreement, phi-features, lexical knowledge 1.
    [Show full text]
  • Reciprocal Markers in Adyghe, Their Relations and Interactions
    Reciprocal markers in Adyghe, their relations and interactions Alexander Letuchiy Russian State University for Humanities, Moscow 1. Introduction 1.1. Adyghe 1.2. Sources of data 1.3. Overwiew. Means of expressing the reciprocal meaning 2. Grammatical notes 2.1. Introductory 2.2. Nominal categories: case, number, possession 2.3. Tense and aspect 2.4. Verb classes 2.5. Agreement 2.6. Locative preverbs 2.7. Meanings of the reflexive prefix ze- 2.8. Other means of valency derivation 2.9. Compatibility of derivational markers 3. Morphological (prefixed) reciprocals 3.1. Subject-oriented reciprocals (intransitive and, rarely, transitive) 3.1.1. “Canonical” reciprocal (intransitive) 3.1.1.1. Reciprocals derived from two-place transitive verbs 3.1.1.1.1. With the prefix zere- 3.1.1.1.2. Prefix ze- instead of zere- 3.1.1.1.3. Prefix zere-gъe instead of zere- 3.1.1.2. Reciprocals derived from two-place transitive verbs (prefix ze-) 3.1.1.2.1. From common (non-inverse) intransitives 3.1.1.2.2. From intransitive inverse verbs 3.1.2. “Indirect” (transitive) reciprocals (prefix ze-) 3.1.3. “Possessive” reciprocals (transitive) 3.1.3.1. With the prefix zere- 3.1.3.2. With the prefix zere-gъe 3.1.3.3. With possessive plural prefixes on nominals 3.2. Object-oriented reciprocals (transitive; prefix ze-) 3.2.1. Spatial reciprocals 3.2.1.1. Reciprocals of joining 3.2.1.2. Reciprocals of separating 3.2.1.3. Object-oriented reciprocals without non-reciprocal correlates 3.2.2. Non-spatial reciprocals 3.2.3.
    [Show full text]