A Full-Scale Test of the Language Farming Dispersal Hypothesis
Recommended publications
-
Toward the Reconstruction of Proto-Algonquian-Wakashan. Part 3: the Algonquian-Wakashan 110-Item Wordlist
Sergei L. Nikolaev Institute of Slavic Studies of the Russian Academy of Sciences (Moscow/Novosibirsk) Toward the reconstruction of Proto-Algonquian-Wakashan. Part 3: The Algonquian-Wakashan 110-item wordlist In the third part of my complex study of the historical relations between several language families of North America and the Nivkh language in the Far East, I present an annotated demonstration of the comparative data that was used in the lexicostatistical calculations to determine the branching and approximate glottochronological dating of Proto-Algonquian-Wakashan and its offspring; because of volume considerations, this data could not be included in the previous two parts of the present work and has to be presented autonomously. Additionally, several new Proto-Algonquian-Wakashan and Proto-Nivkh-Algonquian roots have been set up in this part of the study. Lexicostatistical calculations have been conducted for the following languages: the reconstructed Proto-North Wakashan (approximately dated to ca. 800 AD) and modern or historically attested variants of Nootka (Nuuchahnulth), Amur Nivkh, Sakhalin Nivkh, Western Abenaki, Miami-Peoria, Fort Severn Cree, Wiyot, and Yurok. Keywords: Algonquian-Wakashan languages, Nivkh-Algonquian languages, Algic languages, Wakashan languages, Chimakuan-Wakashan languages, Nivkh language, historical phonology, comparative dictionary, lexicostatistics. The classification and preliminary glottochronological dating of Algonquian-Wakashan currently remain the same as presented in Nikolaev 2015a, Fig. 1. That scheme was generated based on the lexicostatistical analysis of 110-item basic word lists for one reconstructed (Proto-Northern Wakashan, ca. 800 A.D.) and several modern Algonquian-Wakashan languages, performed with the aid of StarLing software. -
The Status of the Least Documented Language Families in the World
Vol. 4 (2010), pp. 177-212 http://nflrc.hawaii.edu/ldc/ http://hdl.handle.net/10125/4478 The status of the least documented language families in the world Harald Hammarström Radboud Universiteit, Nijmegen and Max Planck Institute for Evolutionary Anthropology, Leipzig This paper aims to list all known language families that are not yet extinct and all of whose member languages are very poorly documented, i.e., less than a sketch grammar’s worth of data has been collected. It explains what constitutes a valid family, what amount and kinds of documentary data are sufficient, when a language is considered extinct, and more. It is hoped that the survey will be useful in setting priorities for documentation fieldwork, in particular for those documentation efforts whose underlying goal is to understand linguistic diversity. 1. Introduction. There are several legitimate reasons for pursuing language documentation (cf. Krauss 2007 for a fuller discussion). Perhaps the most important reason is for the benefit of the speaker community itself (see Voort 2007 for some clear examples). Another reason is that it contributes to linguistic theory: if we understand the limits and distribution of diversity of the world’s languages, we can formulate and provide evidence for statements about the nature of language (Brenzinger 2007; Hyman 2003; Evans 2009; Harrison 2007). From the latter perspective, it is especially interesting to document languages that are the most divergent from ones that are well-documented—in other words, those that belong to unrelated families. I have conducted a survey of the documentation of the language families of the world, and in this paper, I will list the least-documented ones. -
Special Issue 2012 Part I ISSN: 0023-1959
Language & Linguistics in Melanesia Special Issue 2012 Part I ISSN: 0023-1959 Journal of the Linguistic Society of Papua New Guinea Harald Hammarström & Wilco van den Heuvel (eds.) History, contact and classification of Papuan languages Part One THE KEUW ISOLATE: PRELIMINARY MATERIALS AND CLASSIFICATION David Kamholz University of California, Berkeley ABSTRACT Keuw is a poorly documented language spoken by fewer than 100 people in southeast Cenderawasih Bay. This paper gives some preliminary data on Keuw, based on a few days of field work. Keuw is phonologically similar to Lakes Plain languages: it lacks contrastive nasals and has at least two tones. Basic word order is SOV. Lexical comparison with surrounding languages suggests that Keuw is best classified as an isolate. Keywords: Keuw, Kehu, Papua, Indonesia, New Guinea, isolate, description, classification 1 INTRODUCTION Keuw (ISO 639-3 khh, also known as Kehu or Keu) is a poorly documented language spoken by a small ethnic group of the same name in southeast Cenderawasih Bay. Keuw territory is located in a swampy lowland plain along the Poronai river in Wapoga district, Nabire regency, Papua province, Indonesia (see Figures 1 and 2). The Keuw are reported to be in occasional contact with the Burate (bti, East Geelvink Bay family), who live downstream near the mouth of the Poronai (village: Totoberi), and the Dao/Maniwo (daz, Paniai Lakes family), who live upstream in the highlands (village: Taumi). There is no known contact with other nearby ethnic groups, which include a group of Waropen (wrp, Austronesian) who live just south of the Burate (village: Samanui), and the Auye (auu, Paniai Lakes family), who live in the highlands along the Siriwo river. -
Hyman Uconn Accent PLAR
UC Berkeley Phonology Lab Annual Report (2010) Do All Languages Have Word Accent? Or: What’s so great about being universal? Larry M. Hyman University of California, Berkeley Draft of Paper presented at the Conference on Word Accent: Theoretical and Typological Issues, University of Connecticut, April 30, 2010 • Comments Welcome The purpose of this paper is to address the question: Do all languages have word accent? By use of the term “word accent” I will first consider the traditional notion of “stress accent” at the word level, as so extensively studied within the metrical literature. However, I will also ask whether certain other phenomena which privilege a single syllable per word should also be identified at some higher level of abstraction as “word accent”, whether on a par with stress, or different. There of course have been claims that all languages have word accent, e.g. most recently: “A considerable number (probably the majority, and according to me: all) of the world's languages display a phenomenon known as word stress.” (van der Hulst 2009:1) On the other hand, a number of scholars have asserted that specific languages lack word stress. This includes certain tone languages in Africa, but also languages without tone. In Bella Coola there is “no phonemically significant phenomena of stress or pitch associated with syllables or words.... When two or more syllabics occur in a word or sentence, one can clearly hear different degrees of articulatory force. But these relative stresses in a sequence of acoustic syllables do not remain constant in repetitions of the utterance.” (Newman 1947:132). -
Expanding the JHU Bible Corpus for Machine Translation of the Indigenous Languages of North America
Expanding the JHU Bible Corpus for Machine Translation of the Indigenous Languages of North America Garrett Nicolai, Edith Coates, Ming Zhang and Miikka Silfverberg Department of Linguistics University of British Columbia Vancouver, Canada Abstract We present an extension to the JHU Bible corpus, collecting and normalizing more than thirty Bible translations in thirty Indigenous languages of North America. These exhibit a wide variety of interesting syntactic and morphological phenomena that are understudied in the computational community. Neural translation experiments demonstrate significant gains obtained through cross-lingual, many-to-many translation, with improvements of up to 8.4 BLEU over monolingual models for extremely low-resource languages. 1 Introduction In 2019, Johns Hopkins University collated a corpus of [...] ence of the language communities. We demonstrate the usefulness of our Indigenous parallel corpus by building multilingual neural machine translation systems for North American Indigenous languages. Multilingual training is shown to be beneficial especially for the most resource-poor languages in our corpus which lack complete Bible translations. 2 Corpus Construction The Bible is perhaps unique as a parallel text. Partial translations exist in more languages than any other text (Mayer and Cysouw, 2014). Furthermore, for nearly 500 years, the Bible has had a canonical hierarchical structure - the Bible is made up of 66 books, each of which contains a number of chapters, which are, in turn, broken down into verses. Each verse corresponds to a short segment – often no more than a sentence. -
The Languages of Melanesia: Quantifying the Level of Coverage
Language Documentation & Conservation Special Publication No. 5 (December 2012) Melanesian Languages on the Edge of Asia: Challenges for the 21st Century, ed. by Nicholas Evans and Marian Klamer, pp. 13–33 http://nflrc.hawaii.edu/ldc/sp05/ http://hdl.handle.net/10125/4559 The languages of Melanesia: Quantifying the level of coverage Harald Hammarström Max Planck Institute for Evolutionary Anthropology Sebastian Nordhoff Max Planck Institute for Evolutionary Anthropology The present paper assesses the state of grammatical description of the languages of the Melanesian region based on a database of semi-automatically annotated aggregated bibliographical references. 150 years of language description in Melanesia has produced at least some grammatical information for almost half of the languages of Melanesia, almost evenly spread among coastal/non-coastal, Austronesian/non-Austronesian and isolates/large families. Nevertheless, only 15.4% of these languages have a grammar and another 18.7% have a grammar sketch. Compared to Eurasia, Africa and the Americas, the Papua-Austronesian region is the region with the largest number of poorly documented languages and the largest proportion of poorly documented languages. We conclude with some discussion and remarks on the documentational challenge and its future prospects. 1. INTRODUCTION. We will take Melanesia to be the sub-region of Oceania extending from the Arafura Sea and Western Pacific in the west to Fiji in the east – see the map in figure 1. This region is home to no fewer than 1347 (1315 living + 32 recently extinct) attested indigenous languages as per the language/dialect divisions of Lewis (2009), with small adjustments and adding attested extinct languages given in table 1. -
WILC Vol9 Mcallister Optimized.Pdf
Workpapers in Indonesian Languages and Cultures Volume 9 - Irian Jaya Margaret Hartzler, LaLani Wood, Editors Cenderawasih University and The Summer Institute of Linguistics in cooperation with The Department of Education and Culture Printed 1991 Jayapura, Irian Jaya, Indonesia Copies of this publication may be obtained from Summer Institute of Linguistics P.O. Box 1800 Jayapura, Irian Jaya 99018 Indonesia Microfiche copies of this and other publications of the Summer Institute of Linguistics may be obtained from Academic Book Center Summer Institute of Linguistics 7500 West Camp Wisdom Road Dallas, TX 75236 U.S.A. ISBN 979-8132-734 Foreword I warmly welcome the publication of Workpapers in Indonesian Languages and Cultures, Volume 9 - Irian Jaya. This publication is evidence of the progress and success achieved by the cooperative project between Cenderawasih University and the Summer Institute of Linguistics, Irian Jaya. The book is also a concrete expression of the participation of SIL members in assisting the development of society in general and of the rural communities of Irian Jaya in particular. Besides offering a range of scholarly information on the regional languages and the cultures of the local ethnic groups, the book also reveals a small part of our nation's cultural wealth found in Irian Jaya. It is hoped that this publication will encourage other writers to contribute knowledge that is useful for future generations and for the advancement of science. -
33 Contact and North American Languages
33 Contact and North American Languages MARIANNE MITHUN Languages indigenous to the Americas offer some good opportunities for investigating effects of contact in shaping grammar. Well over 2000 languages are known to have been spoken at the time of first contacts with Europeans. They are not a monolithic group: they fall into nearly 200 distinct genetic units. Yet against this backdrop of genetic diversity, waves of typological similarities suggest pervasive, longstanding multilingualism. Of particular interest are similarities of a type that might seem unborrowable, patterns of abstract structure without shared substance. The Americas do show the kinds of contact effects common elsewhere in the world. There are some strong linguistic areas, on the Northwest Coast, in California, in the Southeast, and in the Pueblo Southwest of North America; in Mesoamerica; and in Amazonia in South America (Bright 1973; Sherzer 1973; Haas 1976; Campbell, Kaufman, & Stark 1986; Thompson & Kinkade 1990; Silverstein 1996; Campbell 1997; Mithun 1999; Beck 2000; Aikhenvald 2002; Jany 2007). Numerous additional linguistic areas and subareas of varying sizes and strengths have also been identified. In some cases all domains of language have been affected by contact. In some, effects are primarily lexical. But in many, there is surprisingly little shared vocabulary in contrast with pervasive structural parallelism. The focus here will be on some especially deeply entrenched structures. It has often been noted that morphological structure is highly resistant to the influence of contact. Morphological similarities have even been proposed as better indicators of deep genetic relationship than the traditional comparative method. -
California Indian Languages
CALIFORNIA INDIAN LANGUAGES
VICTOR GOLLA
UNIVERSITY OF CALIFORNIA PRESS Berkeley Los Angeles London
CONTENTS
PREFACE
PHONETIC ORTHOGRAPHY
PART ONE Introduction: Defining California as a Sociolinguistic Area
1.1 Diversity
1.2 Tribelet and Language
1.3 Symbolic Function of California Languages
1.4 Languages and Migration
1.5 Multilingualism
1.6 Language Families and Phyla
PART TWO History of Study
Before Linguistics
2.1 Earliest Attestations
2.2 Jesuit Missionaries in Baja California
2.3 Franciscans in Alta California
2.4 Visitors and Collectors, 1780-1880
Linguistic Scholarship
2.5 Early Research Linguistics, 1865-1900
2.6 The Kroeber Era, 1900 to World War II
2.7 Independent Scholars, 1900-1940
2.8 Structural Linguists
2.9 The
PART THREE Languages and Language Families
Algic Languages
3.1 California Algic Languages (Ritwan)
3.2 Wiyot
3.3 Yurok
Athabaskan (Na-Dene) Languages
3.4 The Pacific Coast Athabaskan Languages
3.5 Lower Columbia Athabaskan (Kwalhioqua-Tlatskanai)
3.6 Oregon Athabaskan Languages
3.7 California Athabaskan Languages
Hokan Languages
3.8 The Hokan Phylum
3.9 Karuk
3.10 Chimariko
3.11 Shastan Languages
3.12 Palaihnihan Languages
3.13 Yana
3.14 Washo
3.15 Pomo Languages
3.16 Esselen
3.17 Salinan
3.18 Yuman Languages
3.19 Cochimi and the Cochimi-Yuman Relationship
3.20 Seri
Penutian Languages
-
[.35 **Natural Language Processing Class Here Computational Linguistics See Manual at 006.35 Vs
Dewey Decimal Classification
006
.35 **Natural language processing
Class here computational linguistics
See Manual at 006.35 vs. 410.285
*Use notation 019 from Table 1 as modified at 004.019
400 Language
400 *‡Language
Class here interdisciplinary works on language and literature
For literature, see 800; for rhetoric, see 808. For the language of a specific discipline or subject, see the discipline or subject, plus notation 014 from Table 1, e.g., language of science 501.4
(Option A: To give local emphasis or a shorter number to a specific language, class in 410, where full instructions appear)
(Option B: To give local emphasis or a shorter number to a specific language, place before 420 through use of a letter or other symbol. Full instructions appear under 420–490)
SUMMARY
401–409 Standard subdivisions and bilingualism
410 Linguistics
420 English and Old English (Anglo-Saxon)
430 German and related languages
440 French and related Romance languages
450 Italian, Dalmatian, Romanian, Rhaetian, Sardinian, Corsican
460 Spanish, Portuguese, Galician
470 Latin and related Italic languages
480 Classical Greek and related Hellenic languages
490 Other languages
401 *‡Philosophy and theory
See Manual at 401 vs. 121.68, 149.94, 410.1
.3 *‡International languages
Class here universal languages; general
-
Survey of the World's Languages
Survey of the world’s languages The languages of the world can be divided into a number of families of related languages, possibly grouped into larger stocks, plus a residue of isolates, languages that appear not to be genetically related to any other known languages, languages that form one-member families on their own. The number of families, stocks, and isolates is hotly disputed. The disagreements centre around differences of opinion as to what constitutes a family or stock, as well as the acceptable criteria and methods for establishing them. Linguists are sometimes divided into lumpers and splitters according to whether they lump many languages together into large stocks, or divide them into numerous smaller family groups. Merritt Ruhlen is an extreme lumper: in his classification of the world’s languages (1991) he identifies just nineteen language families or stocks, and five isolates. More towards the splitting end is Ethnologue, the 18th edition of which identifies some 141 top-level genetic groupings. In addition, it distinguishes 1 constructed language, 88 creoles, 137 or 138 deaf sign languages (the figures differ in different places, and this category actually includes alternate sign languages — see also website for Chapter 12), 75 language isolates, 21 mixed languages, 13 pidgins, and 51 unclassified languages. Even so, in terms of what has actually been established by application of the comparative method, the Ethnologue system is wildly lumping! Some families, for instance Austronesian and Indo-European, are well established, and few serious doubts exist as to their genetic unity. Others are quite contentious. Both Ruhlen (1991) and Ethnologue identify an Australian family, although there is as yet no firm evidence that the languages of the continent are all genetically related. -
Language Index
Cambridge University Press 978-0-521-86573-9 - Endangered Languages: An Introduction Sarah G. Thomason Index
Language index
ǁGana, 106, 110
Abenaki, Eastern, 96, 176; Western, 96, 176
Aboriginal English (Australia), 121
Aboriginal languages (Australia), 9, 17, 31, 56, 58, 62, 106, 110, 132, 133
Afro-Asiatic languages, 49, 175, 180, 194, 198
Ainu, 10
Akkadian, 1, 42, 43, 177, 194
Albanian, 28, 40, 66, 185; Arbëresh Albanian, 28, 40; Arvanitika Albanian, 28, 40, 66, 72
Aleut, 50–52, 104, 155, 183, 188; see also Bering Aleut, Mednyj Aleut
Algic languages, 97, 109, 176; see also Algonquian languages, Ritwan languages.
Algonquian languages, 57, 62, 95–97, 101, 104, 108, 109, 162, 166, 176, 187, 191
Ambonese Malay, see Malay.
Anatolian languages, 43
Apache, 177
Apachean languages, 177
Arabic, 1, 19, 22, 37, 43, 49, 63, 65, 101, 194; Classical Arabic, 8, 103
Arapaho, 78, 96, 176; Northern Arapaho, 73, 90, 92
Arawakan languages, 81
Arbëresh, see Albanian.
Armenian, 185
Bininj, 31, 40, 78
Bitterroot Salish, see Salish-Pend d’Oreille.
Blackfoot, 74, 78, 90, 96, 176
Bosnian, 87
Brahui, 63
Breton, 25, 39, 179
Bulgarian, 194
Buryat, 19
Bushman, see San.
Cacaopera, 45
Carrier, 31, 41, 170, 174
Catalan, 192
Caucasian languages, 148
Celtic languages, 25, 46, 179, 183, 185
Central Torres Strait, 9
Chadic languages, 175
Chantyal, 31, 40
Chatino, Zenzontepec, 192
Chehalis, Lower, see Lower Chehalis.
Chehalis, Upper, see Upper Chehalis.
Cheyenne, 96, 176
Chinese languages (“dialects”), 22, 35, 37, 41, 48, 69, 101, 196; Mandarin, 35; Wu, 118; see also Putonghua.
Chinook, Clackamas, see Clackamas Chinook.