The Areal Typology of Western Middle and South America


The areal typology of western Middle and South America: towards a comprehensive view

***pre-publication version, do not cite without permission***

Matthias Urban
DFG Center for Advanced Studies “Words, Bones, Genes, Tools,” University of Tübingen
Rümelinstr. 19-23
72070 Tübingen, Germany
Email: [email protected]

Hugo Reyes-Centeno
DFG Center for Advanced Studies “Words, Bones, Genes, Tools,” University of Tübingen
Rümelinstr. 19-23
72070 Tübingen, Germany
Email: [email protected]

Kate Bellamy
Leiden University Centre for Linguistics
Leiden University
Postbus 9515
2300RA Leiden, The Netherlands
Email: [email protected]

Matthias Pache
Department of Anthropology of the Americas, University of Bonn
Oxfordstr. 15
53111 Bonn, Germany
Email: [email protected]

Against a multidisciplinary background, this contribution explores the areal typology of western Middle and South America. Based on a new language sample and a typological questionnaire that is specifically designed to bring some of the poorly documented and extinct languages into the debate, we explore the areal distribution of 77 linguistic traits in 44 languages. While one of the goals of the present article is to provide a general up-to-date view of areal patterning of these traits on a large scale, we also explore a number of specific questions in more detail. In particular, we address the relationship between known language areas like Mesoamerica and the Central Andes with their respective peripheries, the possibility of detecting an areal-typological signal that predates the rise of these linguistic areas, and, finally, the question of linguistic convergence along the Pacific coast.
We find that, while the languages of the Mesoamerican periphery are rather diffuse typologically, the structural profiles of the Central Andean languages are embedded organically into a more general cluster of Andean typological affinity that changes continuously as one moves through geographical space. In different ways, the typological properties of the peripheral languages may reflect a situation that goes back to time depths greater than that of the emergence of the Mesoamerican and Central Andean linguistic areas. Finally, while we can confirm typological affinities with Mesoamerica for some languages of coastal South America, we do not find support for large-scale linguistic convergence on the Pacific coast.

Keywords: typology, areal linguistics, language contact, Mesoamerican languages, Andean languages

1 Introduction¹

Linguistic structures in the Americas have been explored from an areal point of view with renewed interest in recent years (e.g. Adelaar 2012; Campbell 2012; Epps forthcoming; Michael et al. 2014; contributions to O’Connor and Muysken 2014; Urban forthcoming a; Valenzuela 2015). In light of the awesome genealogical and structural diversity of language in the Americas, this is a rich and challenging field of research in its own right. However, areality in linguistic structures does not arise without a corresponding sociolinguistic background, which is in turn part of a more general sociocultural setting. Depending on the relative ease with which particular linguistic properties can be transferred from one language to another in different situations of language contact (itself a subject of considerable theoretical interest, cf. e.g. Nichols 2003), areal-typological similarities can usually be assumed to reflect periods of interaction between speakers of the languages involved at some point in the past. Areal typology is therefore one of the possible points of access to the linguistic prehistory of the Americas.
Indeed, a number of the areal-typological studies mentioned above have explicitly sought to bring the linguistic evidence to bear on questions of prehistoric developments in (parts of) the Americas. Our study is no different in this regard. We focus on the exploration of a large subregion of the Americas, namely the western parts of Middle America (which we define as that part of the main American landmass from Mexico in the north to Panama in the south) and South America, with a particular focus on the Pacific coastal regions.² The choice of this area of investigation is motivated by evidence from outside linguistics, which we discuss in Section 2.1. At the same time, different parts of this region have already been explored from the point of view of areal linguistics and are known to host a number of convergence areas. We discuss extant work from linguistics that is relevant to our study in Section 2.2, where we seek to provide an updated comprehensive broad-scale view of the areal distribution of linguistic structures in this part of the Americas. This is the overarching goal of the present paper. In light of the massive loss of linguistic diversity that set in with European contact, attaining this goal requires us to also pay attention to the extinct and poorly documented languages that were once spoken in this region, even though the available material is not optimal and significant gaps in documentation remain. We discuss past and present linguistic diversity in our study region in more detail in Section 3.1. This goal of including a maximum of linguistic diversity also determines, to a significant extent, the design of the study, in particular the questionnaire. We explain the design of our study further in Section 3.2. The subtitle, which alludes to a “comprehensive view,” is thus to be read in two ways: first, it refers to the broad-scale scope of this study, the geographic scope of which is larger than that of most recent contributions.
Second, we also aim at comprehensiveness in another sense, namely that we aim to be more comprehensive than previous studies in bringing extinct and poorly documented languages into the debate under a unified and general framework of analysis.

In addition, we seek to explore more specific questions that appear to require further consideration in line with our literature review in Section 2.2. These include the relationship between the languages of two known linguistic areas within our geographical scope, Mesoamerica and the Central Andes, and those on the respective peripheries. Regarding the Central Andes, we focus in particular on the northern and southern peripheries. A loosely related question pertains to the possibility of areal-typological affinities in the respective areas that could be older than the rise of these language areas. Our inclusive study design is well suited to explore this question. Finally, a third question we aim to explore in more detail concerns the possibility of linguistic contacts among Pacific peoples from Mesoamerica to the Central Andes. Here, we do so with a focus on the possibility of the diffusion of phonological and grammatical traits rather than lexical similarities, which are evaluated in Urban (forthcoming c; see also Bellamy 2018). We explain these questions in Section 2.3.

¹ Work on this article was supported by the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007-2013) / ERC grant agreement n° 295918, the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Project Nos. UR 310/1-1 and FOR 2237. We thank Willem F.H. Adelaar, former members of the MesAndLin(g)k project in Leiden, and two anonymous reviewers for commenting on earlier drafts of this article.

² In contrast to these geographically defined terms, we use the term Mesoamerica to refer to a certain culture area and the associated linguistic area within Middle America.
2 Contextualization

2.1 Interdisciplinary Background

This section provides a brief sketch of aspects of the pre-Columbian Americas that appear particularly relevant as background to the areal-typological study of this paper. Topics treated in this section are accordingly highly selective. In addition, while some points of contact with the (historical) linguistics of the Americas are already interwoven here, a detailed review of linguistic areality in Middle America and western (Pacific and Andean) South America itself is provided separately in Section 2.2.

The human presence in the Americas dates back at least 15,000 years before present, when groups of people entered the continents from eastern Asia via the Beringian land bridge (Goebel et al. 2008; Braje et al. 2017). Taking into account the existence of very early archaeological sites in the southernmost parts of the double continent, such as Monte Verde (southern Chile), it appears that these groups dispersed rapidly across the then still unpeopled continents. The route or routes taken by these first colonizers are still not clear. There is, however, evidence from mitochondrial DNA that supports early migrations by a coastal route that are possibly related to a first entry by a Pacific route (Perego et al. 2009; Bodner et al. 2012; Llamas et al. 2016). People may have used watercraft and relied on the rich maritime resources of the Pacific for subsistence (Erlandson et al. 2007). At any rate, a maritime Pacific orientation is deeply entrenched in the Americas, as can be inferred from early coastal sites such as the aforementioned Monte Verde site or the Quebrada Jaguay site on Peru’s South Coast.

Linguistic data is evaluated by different methodologies as far as very early population movements into the Americas are concerned.
Greenberg’s (1987) attempt to reduce American linguistic diversity to just three language families, corresponding to three waves of migration of which the largest “Amerind” family would correspond to the first entry, is generally rejected because of methodological deficiencies. Accordingly, the state of the art still counts around 150 linguistic families in the Americas (Campbell 1997); progress in reducing this number is still being made, but slowly. The picture of extreme linguistic diversity coupled with the assumed short human presence is often taken as evidence for social conditions in the first phase after settlement that would promote linguistic diversification (e.g. Muysken et al. 2014: 303, though see Nettle 1999) or for the assumption that human presence in the Americas must be much older than commonly accepted (Nichols 1990).