Studying the Inductive Biases of RNNs with Synthetic Variations of Natural Languages

Accepted as a long paper in NAACL 2019

arXiv:1903.06400v2 [cs.CL] 26 Mar 2019

Shauli Ravfogel¹ Yoav Goldberg¹,² Tal Linzen³
¹Computer Science Department, Bar Ilan University
²Allen Institute for Artificial Intelligence
³Department of Cognitive Science, Johns Hopkins University
shauli.ravfogel, [email protected], [email protected]

Abstract

How do typological properties such as word order and morphological case marking affect the ability of neural sequence models to acquire the syntax of a language? Cross-linguistic comparisons of RNNs' syntactic performance (e.g., on subject-verb agreement prediction) are complicated by the fact that any two languages differ in multiple typological properties, as well as by differences in training corpus. We propose a paradigm that addresses these issues: we create synthetic versions of English, which differ from English in one or more typological parameters, and generate corpora for those languages based on a parsed English corpus. We report a series of experiments in which RNNs were trained to predict agreement features for verbs in each of those synthetic languages. Among other findings, (1) performance was higher in subject-verb-object order (as in English) than in subject-object-verb order (as in Japanese), suggesting that RNNs have a recency bias; (2) predicting agreement with both subject and object (polypersonal agreement) improves over predicting each separately, suggesting that underlying syntactic knowledge transfers across the two tasks; and (3) overt morphological case makes agreement prediction significantly easier, regardless of word order.

1 Introduction

The strong performance of recurrent neural networks (RNNs) in applied natural language processing tasks has motivated an array of studies that have investigated their ability to acquire natural language syntax without syntactic annotations; these studies have identified both strengths (Linzen et al., 2016; Giulianelli et al., 2018; Gulordava et al., 2018; Kuncoro et al., 2018; van Schijndel and Linzen, 2018; Wilcox et al., 2018) and limitations (Chowdhury and Zamparelli, 2018; Marvin and Linzen, 2018; Wilcox et al., 2018).

Most of the work so far has focused on English, a language with a specific word order and relatively poor morphology. Do the typological properties of a language affect the ability of RNNs to learn its syntactic regularities? Recent studies suggest that they might. Gulordava et al. (2018) evaluated language models on agreement prediction in English, Russian, Italian and Hebrew, and found worse performance on English than the other languages. In the other direction, a study on agreement prediction in Basque showed substantially worse average-case performance than reported for English (Ravfogel et al., 2018).

Existing cross-linguistic comparisons are difficult to interpret, however. Models were inevitably trained on a different corpus for each language. The constructions tested can differ across languages (Gulordava et al., 2018). Perhaps most importantly, any two natural languages differ in a number of typological dimensions, such as morphological richness, word order, or explicit case marking. This paper proposes a controlled experimental paradigm for studying the interaction of the inductive bias of a neural architecture with particular typological properties. Given a parsed corpus for a particular natural language (English, in our experiments), we generate corpora for synthetic languages that differ from the original language in one or more typological parameters (Chomsky, 1981), following Wang and Eisner (2016). In a synthetic version of English with a subject-object-verb order, for example, sentence (1a) would be transformed into (1b):

(1) a. The man eats the apples.
    b. The man the apples eats.

Original
    they say the broker took them out for lunch frequently .
    (subject; verb; object)

Polypersonal agreement
    they saykon the broker tookkarker them out for lunch frequently .
    (kon: plural subject; kar: singular subject; ker: plural object)

Word order variation
    SVO   they say the broker took out frequently them for lunch .
    SOV   they the broker them took out frequently for lunch say.
    VOS   say took out frequently them the broker for lunch they.
    VSO   say they took out frequently the broker them for lunch .
    OSV   them the broker took out frequently for lunch they say.
    OVS   them took out frequently the broker for lunch say they.

Case systems
    Unambiguous        theykon saykon the brokerkar tookkarker theyker out for lunch frequently .
                       (kon: plural subject; kar: singular subject; ker: plural object)
    Syncretic          theykon saykon the brokerkar tookkarkar theykar out for lunch frequently .
                       (kon: plural subject; kar: plural object/singular subject)
    Argument marking   theyker sayker the brokerkin tookkerkin theyker out for lunch frequently .
                       (ker: plural argument; kin: singular argument)

Figure 1: The sentences generated in our synthetic languages based on an original English sentence. All verbs in the experiments reported in the paper carried subject and object agreement suffixes as in the polypersonal agreement experiment; we omitted these suffixes from the word order variation examples in the table for ease of reading.
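As a rough illustration of the reordering in example (1) and in the word order rows of Figure 1, the following Python sketch permutes the three core elements of a single clause. It is a toy under stated assumptions, not the authors' procedure: the paper manipulates dependency trees of whole sentences, whereas here the subject, verb and object phrases are assumed to be pre-segmented, and the function name `reorder` is hypothetical.

```python
# Toy illustration only: the paper reorders dependency trees, whereas this
# sketch permutes three pre-segmented phrases of a single clause.

def reorder(subject, verb, obj, order):
    """Arrange the three core elements according to `order`, e.g. "SOV"."""
    if sorted(order) != ["O", "S", "V"]:
        raise ValueError(f"order must be a permutation of S, V, O: {order!r}")
    phrases = {"S": subject, "V": verb, "O": obj}
    return " ".join(phrases[symbol] for symbol in order)


# The six orders listed in Figure 1, applied to example (1a).
ORDERS = ["SVO", "SOV", "VOS", "VSO", "OSV", "OVS"]
variants = {o: reorder("the man", "eats", "the apples", o) for o in ORDERS}
```

Here `variants["SOV"]` corresponds to (1b). Embedded clauses, adjuncts and punctuation, which the tree-based generation has to handle, are deliberately left out.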
We then train a model to predict the agreement features of the verb; in the present paper, we focus on predicting the plurality of the subject and the object (that is, whether they are singular or plural). The subject plurality prediction problem for (1b), for example, can be formulated as follows:

(2) The man the apples ⟨singular/plural subject?⟩.

We illustrate the potential of this approach in a series of case studies. We first experiment with polypersonal agreement, in which the verb agrees with both the subject and the object (§3). We then manipulate the order of the subject, the object and the verb (§4), and experiment with overt morphological case (§5). For a preview of our synthetic languages, see Figure 1.

2 Setup

Synthetic language generation We used an expert-annotated corpus, to avoid potential confounds between the typological parameters we manipulated and possible parse errors in an automatically parsed corpus. As our starting point, we took the English Penn Treebank (Marcus et al., 1993), converted to the Universal Dependencies scheme (Nivre et al., 2016) using the Stanford converter (Schuster and Manning, 2016). We then manipulated the tree representations of the sentences in the corpus to generate parametrically modified English corpora, varying in case systems, agreement patterns, and order of core elements. For each parametric version of English, we recorded the verb-argument relations within each sentence, and created a labeled dataset. We exposed our models to sentences from which one of the verbs was omitted, and trained them to predict the plurality of the arguments of the unseen verb. The following paragraph describes the process of collecting verb-argument relations; a detailed discussion of the parametric generation process for agreement marking, word order and case marking is given in the corresponding sections. We have made our synthetic language generation code publicly available.1

1 https://github.com/Shaul1321/rnn_typology

Argument collection We created a labeled agreement prediction dataset by first collecting verb-argument relations from the parsed corpus. We collected nouns, proper nouns, pronouns, adjectives, cardinal numbers and relative pronouns connected to a verb (identified by its part-of-speech tag) with an nsubj, nsubjpass or dobj dependency edge, and recorded the plurality of those arguments. Verbs that were the head of a clausal complement without a subject (xcomp dependencies) were excluded. We recorded the plurality of the dependents of the verb regardless of whether the tense and person of the verb condition agreement in English (that is, not only in third-person present-tense verbs). For relative pronouns that function as subjects or objects, we recorded the plurality of their referent; for instance, in the phrase Treasury bonds, which pay lower interest rates, we considered the verb pay to have a plural subject.

Prediction task We experimented both with prediction of one of the arguments of the verb (subject or object), and with a joint setting in which the model predicted both arguments of each [...]

[...] embedding and embeddings of the character n-grams that made up the word.

The model (including the embedding layer) was trained end-to-end using the Adam optimizer (Kingma and Ba, 2014). For each of the experiments described in the paper, we trained four models with different random initializations; we report averaged results alongside standard deviations.

3 Polypersonal agreement

In languages with polypersonal agreement, verbs agree not only with their subject (as in English), but also with their direct object. Consider the following Basque example:

(4) Kutxazain-ek      bezeroa-ri         liburu-ak      eman   dizkiote
    cashier-PL.ERG    customer-SG.DAT    book-PL.ABS    gave   they-them-to-her/him
    The cashiers gave the books to the customer.

Information about the grammatical role of certain constituents in the sentence may disambiguate the function of others; most trivially, if a word is the subject of a given verb, it cannot simultaneously be its object. The goal of the present experiment is to determine whether jointly predicting both object and subject plurality improves the overall performance of the model.

Prediction task    Subject accuracy    Object accuracy    Object recall
Subject            94.7 ± 0.3          -                  -
Object             -                   88.9 ± 0.26        81.8 ± 1.4
Joint              95.7 ± 0.23         90.0 ± 0.1         85.4 ± 2.3

Table 1: Results of the polypersonal agreement experiments. "Joint" refers to multitask prediction of subject and object plurality.

                   Singular    Plural
Subject            -kar        -kon
Object             -kin        -ker
Indirect Object    -ken        -kre

Table 2: Case suffixes used in the experiments. Verbs are marked by a concatenation of the suffixes of their corresponding arguments.
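The argument collection step, the suffix scheme of Table 2 and the omitted-verb prediction setup can be sketched together in a few lines of Python. This is a minimal sketch under stated assumptions and not the released generation code: the flat token tuples, the hand-written dependency annotation of the example sentence, the `<verb>` placeholder and all function names are hypothetical, and the restriction of arguments to particular parts of speech is omitted.

```python
# Hypothetical toy format: (form, POS tag, head index, relation, number).
# Token indices are 1-based; head 0 denotes the root.
SUFFIXES = {  # Table 2, subject and object rows
    ("subj", "sg"): "kar", ("subj", "pl"): "kon",
    ("obj", "sg"): "kin", ("obj", "pl"): "ker",
}
ROLE_OF = {"nsubj": "subj", "nsubjpass": "subj", "dobj": "obj"}


def collect_arguments(tokens):
    """Map each verb index to the plurality of its subject and object."""
    args = {}
    for index, (_form, pos, _head, rel, _num) in enumerate(tokens, start=1):
        # Verbs are identified by POS tag; xcomp heads are excluded.
        if pos.startswith("VB") and rel != "xcomp":
            args[index] = {}
    for _form, _pos, head, rel, num in tokens:
        if rel in ROLE_OF and head in args:
            args[head][ROLE_OF[rel]] = num
    return args


def mark_verbs(tokens, args):
    """Append subject, then object, agreement suffixes to every verb."""
    words = []
    for index, (form, *_rest) in enumerate(tokens, start=1):
        for role in ("subj", "obj"):
            if role in args.get(index, {}):
                form += SUFFIXES[(role, args[index][role])]
        words.append(form)
    return words


def prediction_instance(tokens, args, verb_index):
    """Hide one verb and return (input words, plurality labels)."""
    words = mark_verbs(tokens, args)
    words[verb_index - 1] = "<verb>"  # placeholder token is an assumption
    return words, args[verb_index]


# The sentence from Figure 1, with an assumed dependency annotation.
tokens = [
    ("they", "PRP", 2, "nsubj", "pl"), ("say", "VBP", 0, "root", None),
    ("the", "DT", 4, "det", None), ("broker", "NN", 5, "nsubj", "sg"),
    ("took", "VBD", 2, "ccomp", None), ("them", "PRP", 5, "dobj", "pl"),
    ("out", "RP", 5, "compound:prt", None), ("for", "IN", 9, "case", None),
    ("lunch", "NN", 5, "nmod", "sg"), ("frequently", "RB", 5, "advmod", None),
    (".", ".", 2, "punct", None),
]
args = collect_arguments(tokens)
marked = " ".join(mark_verbs(tokens, args))
inputs, labels = prediction_instance(tokens, args, 5)
```

With this annotation, `marked` reproduces the polypersonal agreement row of Figure 1, and `labels` holds the singular-subject and plural-object targets for the hidden verb took. In the joint setting both labels would be predicted together; in the single-task settings only one of them would be used.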