A Basic Language Technology Toolkit for Quechua

A thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy to the Faculty of Arts of the University of Zurich by Annette Rios

Accepted in the Autumn Term 2015 on the Recommendation of the Doctoral Committee:
Prof. Dr. Martin Volk (main advisor)
Prof. Dr. Balthasar Bickel

Zurich, 2015

Abstract

In this thesis, we describe the development of several natural language processing tools and resources for the Andean language Cuzco Quechua as part of the SQUOIA project at the University of Zurich. The main focus of this work lies on the implementation of a machine translation system for the language pair Spanish-Cuzco Quechua. Since the target language Quechua is not only a non-mainstream language in the field of computational linguistics, but also typologically quite different from the source language Spanish, several rather unusual problems became evident, and we had to find solutions in order to deal with them. Therefore, the first part of this thesis presents monolingual tools and resources that are not directly related to machine translation, but are nevertheless indispensable.

The main contributions of this thesis are as follows:

  • We built a hybrid machine translation system that can translate Spanish text into Cuzco Quechua. The core system is a classical rule-based transfer engine; however, several statistical modules are included for tasks that cannot be resolved reliably with rules.
  • We implemented a text normalization pipeline that automatically rewrites Quechua texts in different orthographies or dialects to the official Peruvian standard orthography. This includes a tool for the morphological analysis of Quechua words that achieves high coverage. Furthermore, we also created a slightly adapted version that can be used as a spell checker back-end, in combination with a plug-in for the open-source productivity suite LibreOffice/OpenOffice.
  • We built a Quechua dependency treebank of about 2000 annotated sentences that not only provided training data for some of the translation modules, but also served as a source of verification, since it allows us to observe the distribution of certain syntactic and morphological structures. Furthermore, we trained a statistical parser on the treebank and thus now have a complete pipeline to morphologically analyze, disambiguate and then parse Quechua texts.

All resources and tools are freely available from the project's website (https://github.com/ariosquoia/squoia). Apart from the scientific interest in developing tools and applications for a language that is typologically distant from the mainstream languages in computational linguistics, we hope that the various resources presented in this thesis will be useful not only for language learners and linguists, but also to Quechua speakers who want to use modern technology in their native language.

Acknowledgements

Above all, I would like to thank my supervisor Martin Volk for his support and guidance during the four years of this project. I am also very grateful for the continued assistance, endless discussions and many laughs with my fellow researcher in the SQUOIA project, Anne Göhring. I would also like to thank the members of the doctoral committee, Balthasar Bickel and Paul Heggarty, who provided a detailed review with many suggestions for improvement.

Moreover, I wish to thank the people in Peru who made this work possible:

  • Richard Castro Mamani for the collaboration on the spell checkers, the management and organization of the evaluation of the MT system and the translations for the treebank
  • Roger Gonzalo Segura for the syntactic annotation and the numerous discussions about Quechua syntax
  • César Morante Luna for translations, corrections and filling the gaps of the bilingual dictionary of the MT system
  • Virginia Mamani Mamani and Irma Álvarez Ccoscco for the contribution of the translations of the treebank texts
  • Juan Cruz Tello for providing contacts and general support of the project

Furthermore, I would like to thank all my colleagues at the Institute of Computational Linguistics, especially Simon Clematide for his help with the finite-state tools, and my fellow PhD students Magdalena Plamada, for general advice on MT-related issues, Don Tuggener, for ideas and discussions about coreference resolution to deal with Quechua switch-reference, and Johannes Graën, for the technical support with the web-related parts of this thesis, as well as my former colleague Rico Sennrich for his valuable tips and tricks concerning the machine learning parts of the Spanish-Quechua translation system.

I would also like to thank my family, especially Naira and my mother Susanne, for their patience and support during these past four years.

Most importantly, I am grateful for the financial support provided by the Swiss National Science Foundation under grants 100015_132219 and 100015_149841.
Contents

Abstract
Acknowledgements
Contents
List of Figures
List of Tables
Abbreviations

1 Introduction
  1.1 Overview
  1.2 The Quechua Language Family
    1.2.1 Distribution of Quechua Languages
  1.3 NLP for Quechua
  1.4 The SQUOIA Project
  1.5 Research Questions
  1.6 Thesis Outline

I Monolingual Quechua Resources

2 Quechua Morphology
  2.1 Introduction
  2.2 Orthographic Variation
  2.3 Morphological Analysis
    2.3.1 Finite-State Networks
    2.3.2 Finite-State Analysis for Quechua
  2.4 Morphological Disambiguation and Text Normalization
    2.4.1 Model 1: Disambiguation of Ambiguous Roots
    2.4.2 Model 2: Disambiguation of Nominalizing and Verbalizing Suffixes
    2.4.3 Model 3: Disambiguation of Verbal Morphology
    2.4.4 Model 4: Disambiguation of Independent Suffixes
    2.4.5 Performance of the Four Models
    2.4.6 Evaluation
  2.5 Spell Checking
  2.6 Summary

3 Quechua Treebank
  3.1 Introduction
  3.2 Corpus
  3.3 Quechua Dependency Annotation Scheme
    3.3.1 Case Suffixes
    3.3.2 Elision of Copula
    3.3.3 Coordination
    3.3.4 Focus
    3.3.5 Relative Clauses
    3.3.6 Internally Headed Relative Clauses
    3.3.7 Embedded Clauses
  3.4 Annotation Process
  3.5 Parsing Quechua Sentences
    3.5.1 Conversion PML to CoNLL
    3.5.2 Parsing and Preliminary Evaluation
  3.6 Summary

II Bilingual Spanish-Quechua Resources

4 Word-Aligned Parallel Text: Bilingwis Spanish-Quechua
  4.1 Introduction
  4.2 Spanish-Quechua Bilingwis
  4.3 Summary

5 Hybrid Machine Translation Spanish-Quechua
  5.1 Introduction
  5.2 Analysis of Spanish Input
  5.3 Verb Form Disambiguation
    5.3.1 Relative Clauses
      5.3.1.1 Relative Clause Disambiguation with Machine Learning
      5.3.1.2 Training Data
      5.3.1.3 Features
      5.3.1.4 Evaluation
      5.3.1.5 Relative Clauses with no Direct Correspondence
    5.3.2 Coreference Resolution
    5.3.3 Disambiguation of Subordinated Clauses
      5.3.3.1 Disambiguation of Subordinated Clauses with Machine Learning
        5.3.3.1.1 Training Data
        5.3.3.1.2 Features
        5.3.3.1.3 Classification
      5.3.3.2 Rule-based Translation System with Machine Learning Verb Disambiguation
      5.3.3.3 Evaluation
        5.3.3.3.1 Whole Verb Disambiguation Pipeline
        5.3.3.3.2 Additional Verb Disambiguation Module
  5.4 Lexical Transfer
  5.5 Morphological Disambiguation
  5.6 Syntactic Transfer and Generation
  5.7 Ranking and Morphological Generation
  5.8 Discourse: Modeling Topic and Focus
    5.8.1 Discourse Morphology and Information Structure in Quechua
    5.8.2 Modeling Information Structure for Machine Translation
  5.9 Evaluation of the Machine Translation Output
    5.9.1 Setting
    5.9.2 Results
  5.10 Summary

6 Conclusions
  6.1 Recapitulation and Contributions
  6.2 Discussion and Research Questions
  6.3 Outlook
    6.3.1 Morphology Tools
    6.3.2 Treebank