Automatic Understanding of Unwritten Languages

Total Page:16

File Type:pdf, Size:1020Kb

Automatic Understanding of Unwritten Languages Automatic Understanding of Unwritten Languages A thesis presented by Oliver Adams to School of Computing and Information Systems in total fulfillment of the requirements for the degree of Doctor of Philosophy The University of Melbourne Melbourne, Australia December 2017 Declaration This is to certify that: (i) the thesis comprises only my original work towards the PhD except where indi- cated in the Citations to Previously Published Work; (ii) due acknowledgement has been made in the text to all other material used; (iii) the thesis is fewer than 100,000 words in length, exclusive of tables, maps, bibliographies and appendices. Signed: Date: ©2017 - Oliver Adams All rights reserved. Thesis advisor(s) Author Steven Bird Oliver Adams Trevor Cohn Graham Neubig Automatic Understanding of Unwritten Languages Abstract Many of the world’s languages are falling out of use without a written record and minimal linguistic documentation. Language documentation is a slow process and there are an insufficient number of linguists working to ensure the world’s languages are documented before they die out. This thesis addresses automatic understanding of unwritten languages in order to perform tasks such as phonemic transcription and bilingual lexicon induction. The automation of such tasks promises to improve the leverage of field linguists and ultimately speed up the language documentation process. Modelling endangered languages is challenging due to the nature of the available data, which is typically not written text but limited quantities of recorded speech. Manually annotated information in the form of lexicons and grammars is typically also limited. Since the languages are spoken, the most efficient way of sourcing data is to collect speech in the language. Most speakers of endangered languages are bilin- gual or multilingual, so acquiring spoken translations works to the strength of the speakers. Key approaches described in this thesis make use of bilingual data, in par- ticular translated speech, which consists of segments of endangered language speech paired with translations in a larger language. Such data is important for relating the source language speech with a larger language. Additionally, the application of monolingual phoneme transcription is also explored, since it has direct applicability in more traditional phonemic transcription workflows. The overarching question is this: what can be automatically learnt about the languages with the data we have available, and how can this help automate language documentation? Abstract iv We first consider translation modelling of accurate phoneme transcriptions. This assumption allows us to investigate the feasibility of phoneme–word translation and the effectiveness of inferring bilingual lexical items from such data in isolation from confounding acoustic factors. A second investigation explores how bilingual lexicons can be used to improve language models, which are crucial components of speech recognition and machine translation systems. In a third set of experiments we re- move the assumption of accurate transcriptions and investigate operating in the face of acoustic uncertainty. Experiments in this space demonstrate that translated speech can improve automatic phoneme transcription even without a prior translation model. Finally, we make a step towards further generalisability, exploring acoustic modelling in resource-scarce environments without a lexicon or language model. In particular, we assess the use of automatic phoneme and tone transcription on Yongning Na, a threatened tonal language spoken in south-west China. Beyond quantitative investi- gation, we report on the use of this method in linguistic documentation of Na. Its effectiveness has led to its incorporation into the language documentation workflow for Na. Citations to Previously Published Work The work in this thesis is entirely original except where explicitly stated otherwise. Large portions of Chapter 3 have appeared in the following paper: Oliver Adams, Graham Neubig, Trevor Cohn, Steven Bird (2015) Induc- ing bilingual lexicons from small quantities of sentence-aligned phonemic transcriptions, in Proceedings of the 12th International Workshop on Spo- ken Language Technologies (IWSLT), Da Nang, Vietnam. pp. 248–255. Large portions of Chapter 4 have appeared in the following paper: Oliver Adams, Adam Makarucha, Graham Neubig, Steven Bird, Trevor Cohn (2017) Cross-lingual word embeddings for low-resource language modeling, in Proceedings of the 15th Conference of the European Chap- ter of the Association for Computational Linguistics (EACL), Valencia, Spain. Large portions of Chapter 5 have appeared in the following papers: Oliver Adams, Graham Neubig, Trevor Cohn, Steven Bird (2016) Learn- ing a translation model from word lattices, in 17th Annual Conference of the International Speech Communication Association (INTERSPEECH 2016), San Francisco, California, USA. Oliver Adams, Graham Neubig, Trevor Cohn, Steven Bird, Quoc Do Truong, Satoshi Nakamura (2016) Learning a lexicon and translation model from phoneme lattices, in Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP), Austin, Texas, USA. pp. 2377–2382. Large portions of Chapter 6 have appeared in the following papers: Abstract vi Oliver Adams, Trevor Cohn, Graham Neubig, Alexis Michaud (2017) Phonemic transcription of low-resource tonal languages, in Proceedings of the Australasian Language Technology Association Workshop 2017 (ALTA), Brisbane, Australia. pp. 53–60. Oliver Adams, Trevor Cohn, Graham Neubig, Hilaria Cruz, Steven Bird, Alexis Michaud (2018) Evaluating phonemic transcription of low-resource tonal languages for language documentation in Proceedings of LREC 2018: 11th edition of the Language Resources and Evaluation Conference, Miyazaki, Japan. (To appear). Contents Title Page .................................... i Abstract ..................................... iii Citations to Previously Published Work ................... v Table of Contents ................................ vii List of Figures ................................. x List of Tables .................................. xii Acknowledgements ............................... xiii 1 Introduction 1 1.1 Motivation ................................. 2 1.1.1 Language Extinction and Documentation ........... 2 1.1.2 The Changing Nature of Language Documentation ...... 3 1.1.3 A Different Kind of Data .................... 5 1.1.4 Why Pursue Automatic Understanding of Unwritten Languages? 8 1.2 Aim and Scope .............................. 10 1.2.1 Research Questions ........................ 11 1.3 Overview of the Contributions ...................... 12 2 Background 15 2.1 Data Acquisition ............................. 16 2.2 Modelling Translation .......................... 17 2.2.1 Traditional Word Alignment ................... 18 2.2.2 What is the Right Granularity for Translation Units? ..... 19 2.2.3 Data Sparsity in Machine Translation ............. 23 2.2.4 Unsupervised Word Segmentation ................ 25 2.2.5 Bilingual Lexicon Induction ................... 26 2.2.6 Cross-Lingual Word Embeddings ................ 28 2.2.7 Language Modelling ....................... 30 2.3 Automatic Speech Recognition ...................... 33 2.3.1 Traditional Speech Recognition ................. 33 2.3.2 End-to-End Speech Recognition ................. 35 2.3.3 Unsupervised Speech Modelling ................. 36 vii Contents viii 2.3.4 Low-Resource and Multilingual Speech Recognition ...... 37 2.3.5 Tonal Speech Recognition .................... 40 2.4 Machine Translation Meets Speech Recognition ............ 41 2.4.1 Speech-to-Speech Translation .................. 41 2.4.2 Computer-Aided Translation .................. 42 2.4.3 Translation Modelling of Speech ................. 43 3 Translation Modelling of Phonemes 46 3.1 Introduction ................................ 46 3.2 Phoneme-Based Machine Translation .................. 47 3.2.1 Alignment Approaches ...................... 49 3.2.2 Experimental Setup ....................... 52 3.2.3 Results and Discussion ...................... 53 3.3 Phoneme-Based Bilingual Lexicon Induction .............. 55 3.3.1 Translation Models ........................ 57 3.3.2 Experimental Setup ....................... 59 3.3.3 Quantitative Evaluation ..................... 62 3.3.4 Qualitative Evaluation ...................... 67 3.4 Discussion ................................. 69 3.4.1 Evaluation Issues ......................... 70 3.4.2 Reconsidering the Value of Bilingual Lexicon Induction .... 71 4 Cross-Lingual Low-Resource Language Modelling 73 4.1 Introduction ................................ 73 4.2 Resilience of Cross-Lingual Word Embeddings ............. 76 4.2.1 Experimental Setup ....................... 77 4.2.2 Results .............................. 78 4.3 Pre-training Language Models ...................... 79 4.3.1 Experimental Setup ....................... 80 4.3.2 English Results .......................... 81 4.3.3 Other Target Languages ..................... 83 4.4 First Steps in an Under-Resourced Language .............. 84 4.4.1 Experimental Setup ....................... 85 4.4.2 Results and Discussion ...................... 87 4.5 Discussion ................................. 90 4.5.1 Future Work on Na Language Modelling ............ 91 4.5.2 Beyond Na ............................ 92 5 Harnessing Translations for Improved Phoneme Transcription 94 5.1 Introduction ................................ 94 5.2 Phrase Alignment
Recommended publications
  • Tone in Yongning Na: Lexical Tones and Morphotonology Alexis Michaud
    Tone in Yongning Na: lexical tones and morphotonology Alexis Michaud To cite this version: Alexis Michaud. Tone in Yongning Na: lexical tones and morphotonology. Language Science Press, 13, 2017, Studies in Diversity Linguistics, Haspelmath, Martin, 978-3-946234-86-9. 10.5281/zen- odo.439004. halshs-01094049v6 HAL Id: halshs-01094049 https://halshs.archives-ouvertes.fr/halshs-01094049v6 Submitted on 26 Apr 2017 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Distributed under a Creative Commons Attribution| 4.0 International License Tone in Yongning Na Lexical tones and morphotonology Alexis Michaud language Studies in Diversity Linguistics 13 science press Studies in Diversity Linguistics Chief Editor: Martin Haspelmath Consulting Editors: Fernando Zúñiga, Peter Arkadiev, Ruth Singer, Pilar Valen zuela In this series: 1. Handschuh, Corinna. A typology of marked-S languages. 2. Rießler, Michael. Adjective attribution. 3. Klamer, Marian (ed.). The Alor-Pantar languages: History and typology. 4. Berghäll, Liisa. A grammar of Mauwake (Papua New Guinea). 5. Wilbur, Joshua. A grammar of Pite Saami. 6. Dahl, Östen. Grammaticalization in the North: Noun phrase morphosyntax in Scandinavian vernaculars. 7. Schackow, Diana. A grammar of Yakkha. 8.
    [Show full text]
  • 第四届中国西南地区汉藏语国际研讨会program and Abstract Book
    FOURTH WORKSHOP ON SINO-TIBETAN LANGUAGES OF SOUTHWEST CHINA 第四届中国西南地区汉藏语国际研讨会 UNIVERSITY OF WASHINGTON, SEATTLE SEPTEMBER 8-10, 2016 PROGRAM AND ABSTRACT BOOK Table of Contents General Information & Special Thanks to Our Sponsors ......................................................... 3 Program Synoptic Schedule ..................................................................................................................... 4 Thursday, September 8 .............................................................................................................. 5 Friday, September 9 ................................................................................................................... 6 Saturday, September 10 ............................................................................................................. 7 Abstracts (in presentation order) Scott DeLancey, Reconstructing Hierarchical Argument Indexation in Trans-Himalayan .... 8 James A. Matisoff, Lahu in the 21st century: vocabulary enrichment and orthographical issues ........................................................................................................................................ 10 Guillaume Jacques, The life cycle of multiple indexation and bipartite verbs in Sino-Tibetan ................................................................................................................................................. 11 Jackson T.-S. SUN and Qianzi TIAN, Argument Indexation patterns in Horpa languages: a major Rgyalrongic subgroup ..................................................................................................
    [Show full text]
  • Review of Tone in Yongning Na: Lexical Tones and Morphotonology by Alexis Michaud
    Journal of the Southeast Asian Linguistics Society JSEALS Vol. 10.2 (2017): xxx-xxxiv ISSN: 1836-6821, DOI: http://hdl.handle.net/10524/52412 University of Hawaiʼi Press REVIEW OF TONE IN YONGNING NA: LEXICAL TONES AND MORPHOTONOLOGY BY ALEXIS MICHAUD Mark J. Alves Montgomery College [email protected] Tone in Yongning Na: Lexical Tones and Morphotonology (Berlin: Language Science Press, Studies in Diversity Linguistics 13, 2017, ISBN 978-3-946234-87-6 (hardcover)), by Alexis Michaud, describes both lexical tones and phonological tonal rules in the Naish (ISO 639-3 naic1235) Yongning Na language (ISO 639-3 nru). Michaud claims that his aim is a “detailed description of a level-tone system,” making this work a significant contribution as even among tonal Bantu language studies, as the author notes, it is rare to see entire books devoted to description of such systems. For this undertaking, Michaud has spent a tremendous amount of time over the last decade doing fieldwork on, intensively analyzing various aspects of, and writing about the phonology of Yongning Na, a Sino-Tibetan language spoken by some 47,000 people living in an area bordering Sichuan and Yunnan provinces of southwest China. Over that extended period of research, he has authored some two dozen linguistics articles on Na (including several co-authored works, mostly written in English but also in French and Chinese), a Na-English-Chinese dictionary, numerous Na language recordings efficiently laid out to facilitate study by others (in the online PanGloss collection), and now this massive 600-page volume on the Na language.
    [Show full text]
  • The Neolithic Ofsouthern China-Origin, Development, and Dispersal
    The Neolithic ofSouthern China-Origin, Development, and Dispersal ZHANG CHI AND HSIAO-CHUN HUNG INTRODUCTION SANDWICHED BETWEEN THE YELLOW RIVER and Mainland Southeast Asia, southern China1 lies centrally within eastern Asia. This geographical area can be divided into three geomorphological terrains: the middle and lower Yangtze allu­ vial plain, the Lingnan (southern Nanling Mountains)-Fujian region,2 and the Yungui Plateau3 (Fig. 1). During the past 30 years, abundant archaeological dis­ coveries have stimulated a rethinking of the role ofsouthern China in the prehis­ tory of China and Southeast Asia. This article aims to outline briefly the Neolithic cultural developments in the middle and lower Yangtze alluvial plain, to discuss cultural influences over adjacent regions and, most importantly, to examine the issue of southward population dispersal during this time period. First, we give an overview of some significant prehistoric discoveries in south­ ern China. With the discovery of Hemudu in the mid-1970s as the divide, the history of archaeology in this region can be divided into two phases. The first phase (c. 1920s-1970s) involved extensive discovery, when archaeologists un­ earthed Pleistocene human remains at Yuanmou, Ziyang, Liujiang, Maba, and Changyang, and Palaeolithic industries in many caves. The major Neolithic cul­ tures, including Daxi, Qujialing, Shijiahe, Majiabang, Songze, Liangzhu, and Beiyinyangying in the middle and lower Yangtze, and several shell midden sites in Lingnan, were also discovered in this phase. During the systematic research phase (1970s to the present), ongoing major ex­ cavation at many sites contributed significantly to our understanding of prehis­ toric southern China. Additional early human remains at Wushan, Jianshi, Yun­ xian, Nanjing, and Hexian were recovered together with Palaeolithic assemblages from Yuanmou, the Baise basin, Jianshi Longgu cave, Hanzhong, the Li and Yuan valleys, Dadong and Jigongshan.
    [Show full text]
  • In and out of Suriname Caribbean Series
    In and Out of Suriname Caribbean Series Series Editors Rosemarijn Hoefte (Royal Netherlands Institute of Southeast Asian and Caribbean Studies) Gert Oostindie (Royal Netherlands Institute of Southeast Asian and Caribbean Studies) Editorial Board J. Michael Dash (New York University) Ada Ferrer (New York University) Richard Price (em. College of William & Mary) Kate Ramsey (University of Miami) VOLUME 34 The titles published in this series are listed at brill.com/cs In and Out of Suriname Language, Mobility and Identity Edited by Eithne B. Carlin, Isabelle Léglise, Bettina Migge, and Paul B. Tjon Sie Fat LEIDEN | BOSTON This is an open access title distributed under the terms of the Creative Commons Attribution-Noncommercial 3.0 Unported (CC-BY-NC 3.0) License, which permits any non-commercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited. The realization of this publication was made possible by the support of KITLV (Royal Netherlands Institute of Southeast Asian and Caribbean Studies). Cover illustration: On the road. Photo by Isabelle Léglise. This publication has been typeset in the multilingual “Brill” typeface. With over 5,100 characters covering Latin, IPA, Greek, and Cyrillic, this typeface is especially suitable for use in the humanities. For more information, please see www.brill.com/brill-typeface issn 0921-9781 isbn 978-90-04-28011-3 (hardback) isbn 978-90-04-28012-0 (e-book) Copyright 2015 by the Editors and Authors. This work is published by Koninklijke Brill NV. Koninklijke Brill NV incorporates the imprints Brill, Brill Nijhoff and Hotei Publishing. Koninklijke Brill NV reserves the right to protect the publication against unauthorized use and to authorize dissemination by means of offprints, legitimate photocopies, microform editions, reprints, translations, and secondary information sources, such as abstracting and indexing services including databases.
    [Show full text]
  • The Qiangic Subgroup from an Areal Perspective: a Case Study of Languages of Muli
    LANGUAGE AND LINGUISTICS 13.1:133-170, 2012 2012-0-013-001-000299-1 The Qiangic Subgroup from an Areal Perspective: A Case Study of Languages of Muli Katia Chirkova CRLAO, CNRS In this paper, I study the empirical validity of the hypothesis of “Qiangic” as a subgroup of Sino-Tibetan, that is, the hypothesis of a common origin of thirteen little-studied languages of South-West China. This study is based on ongoing work on four Qiangic languages spoken in one locality (Muli Tibetan Autonomous County, Sichuan), and seen in the context of languages of the neighboring genetic subgroups (Yi, Na, Tibetan, Sinitic). Preliminary results of documentation work cast doubt on the validity of Qiangic as a genetic unit, and suggest instead that features presently seen as probative of the membership in this subgroup are rather the result of diffusion across genetic boundaries. I furthermore argue that the four local languages currently labeled Qiangic are highly distinct and not likely to be closely genetically related. Subsequently, I discuss Qiangic as an areal grouping in terms of its defining characteristics, as well as possible hypotheses pertaining to the genetic affiliation of its member languages currently labeled Qiangic. I conclude with some reflections on the issue of subgrouping in the Qiangic context and in Sino-Tibetan at large. Key words: Qiangic, classification, areal linguistics, Sino-Tibetan 1. Introduction This paper examines the empirical validity of the Qiangic subgrouping hypothesis, as studied in the framework of the project “What defines Qiang-ness: Towards a This is a reworked version of a paper presented at the International Symposium on Sino- Tibetan Comparative Studies in the 21st Century, held at the Institute of Linguistics, Academia Sinica, Taipei, Taiwan on June 24-25, 2010.
    [Show full text]
  • 1 Tonal Alternations in the Pumi Verbal System1 Guillaume
    Author manuscript, published in "Language and Linguistics 12, 2 (2011) 359-392" Tonal alternations in the Pumi verbal system1 Guillaume Jacques, CNRS-INALCO Summary: This paper presents two hitherto unnoticed sets of alternations in the tonal systems of Northern Pumi. First, in some verbs, rising tone in the citation form alternates with falling tone in the perfective. Second, the verb “to go”, when combined with other verbs, presents tonal alternations which are not found elsewhere in the verbal system. A sample text is added in the appendix to illustrate the tonal phenomena discussed in the article. Keywords: Pumi, Tonal alternation, Directional prefixes halshs-00605901, version 1 - 4 Jul 2011 1 I wish to thank David Bradley, Katia Chirkova, Henriette Daudey, Randy LaPolla, Cédric Patin, Alexis Michaud, Thomas Pellard, Melanie Viljoen, Elizabeth Zeitoun and three anonymous reviewers for comments and corrections on this paper. I am responsible for any remaining error. I am also indebted to my Mudiqing consultant Mr. Cao and to my Shuiluo Pumi consultant Ngag-dbang [ŋawṍ] 昂翁 who unfortunately passed away in late 2009, a few months after my field trip. This paper was completed during my stay at the Research Centre for Linguistics Typology, La Trobe University. Fieldwork in 2008 and 2009 was funded by the ANR (Agence Nationale de la Recherche) project PASQi (What defines Qiang-ness: Towards a phylogenetic assessment of the Southern Qiangic languages of Muli 07-JCJC-0063). 1 1. Introduction Like most Qiangic languages (apart from Northern Qiang and Japhug Rgyalrong), Pumi dialects2 have tonal contrasts. On monosyllables, either two tone categories (Lu 2001:21, Matisoff 1997) or three (Ding 2001, 2003, Lu 2001: 97) are found depending on the dialect.
    [Show full text]
  • Integrating Automatic Transcription Into the Language
    Integrating automatic transcription into the language documentation workflow: Experiments with Na data and the Persephone toolkit Alexis Michaud, Oliver Adams, Trevor Cohn, Graham Neubig, Séverine Guillaume To cite this version: Alexis Michaud, Oliver Adams, Trevor Cohn, Graham Neubig, Séverine Guillaume. Integrating au- tomatic transcription into the language documentation workflow: Experiments with Na data and the Persephone toolkit. Language Documentation & Conservation, University of Hawaii Press 2018, 12, pp.393-429. halshs-01841979v2 HAL Id: halshs-01841979 https://halshs.archives-ouvertes.fr/halshs-01841979v2 Submitted on 6 Sep 2018 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Distributed under a Creative Commons Attribution - NonCommercial - ShareAlike| 4.0 International License Vol. 12 (2018), pp. 393–429 http://nflrc.hawaii.edu/ldc http://hdl.handle.net/10125/24793 Revised Version Received: 5 July 2018 Integrating Automatic Transcription into the Language Documentation Workflow: Experiments with Na Data and the Persephone Toolkit Alexis
    [Show full text]
  • Nàxī 納西 Language / Naish Languages.” In: Rint Sybesma, Wolfgang Behr, Zev Handel & C.T
    Nàxī language / Naish languages Alexis Michaud, Limin He, Yaoping Zhong To cite this version: Alexis Michaud, Limin He, Yaoping Zhong. Nàxī language / Naish languages. Encyclopedia of Chi- nese Language and Linguistics, 3, Brill, pp.144-157, 2017, 10.1163/2210-7363_ecll_COM_00000247. halshs-00793649v2 HAL Id: halshs-00793649 https://halshs.archives-ouvertes.fr/halshs-00793649v2 Submitted on 4 Jan 2017 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Distributed under a Creative Commons Attribution| 4.0 International License References: MICHAUD Alexis, HE Limin & ZHONG Yaoping (2017). “Nàxī 納西 language / Naish languages.” In: Rint Sybesma, Wolfgang Behr, Zev Handel & C.T. James Huang (eds.), Encyclopedia of Chinese Language and Linguistics. Leiden: Brill. Volume 3, pages 144-157. Nàxī 納西 language / Naish languages Alexis Michaud 米可 (CNRS-LACITO), with two co-authors for the section on Naxi pictographs: Hé Lìmín 和力民 (Dongba Culture Research Institute, Lijiang) and Zhōng Yàopíng 钟耀萍 (Chinese Academy of Social Sciences, Beijing) 1. General 1.1. Definition and genealogical position ‘Naxi’ (Nàxī 納西) is commonly used in Chinese scholarship as a cover term for a range of language varieties related to the Naxi language as spoken in Lìjiāng 麗江, in Northwestern Yunnan.
    [Show full text]
  • Das Sprachsystem Der Naxi
    Das Sprachsystem der Naxi Inauguraldissertation zur Erlangung des Doktorgrades der Philosophischen Fakultät der Universität Passau vorgelegt von Zhijun Wu Passau Juli. 2016 Abkürzungsverzeichnis ..................................................................................................................... iii Abbildungsverzeichnis ........................................................................................................................ v 1. Einleitung ............................................................................................................................................ 1 1.1. Die Naxi .................................................................................................................................... 1 1.2. Historischer Überblick ............................................................................................................ 5 1.3. Aufbau der Arbeit .................................................................................................................... 8 1.4. Feldforschungen ..................................................................................................................... 9 2. Die Sprache ...................................................................................................................................... 12 2.1. Klassifikation .......................................................................................................................... 12 2.2. Unterteilung ..........................................................................................................................
    [Show full text]
  • Investigating Differential Case Marking in Sümi, a Language Of
    INVESTIGATING DIFFERENTIAL CASE MARKING IN SÜMI, A LANGUAGE OF NAGALAND, USING LANGUAGE DOCUMENTATION AND EXPERIMENTAL METHODS by AMOS BENJAMIN LENG YUAN TEO A DISSERTATION Presented to the Department of Linguistics and the Graduate School of the University of Oregon in partial fulfillment of the requirements for the degree of Doctor of Philosophy September 2019 DISSERTATION APPROVAL PAGE Student: Amos Benjamin Leng Yuan Teo Title: Investigating Differential Case Marking in Sümi, a Language of Nagaland, Using Language Documentation and Experimental Methods This dissertation has been accepted and approved in partial fulfillment of the requirements for the Doctor of Philosophy degree in the Linguistics by: Scott DeLancey Co-Chairperson Melissa Baese-Berk Co-Chairperson Spike Gildea Core Member Kaori Idemaru Institutional Representative and Janet Woodruff-Borden Vice Provost and Dean of the Graduate School Original approval signatures are on file with the University of Oregon Graduate School. Degree awarded September 2019 ii © 2019 Amos Benjamin Leng Yuan Teo This work is licensed under a Creative Commons CC-BY-NC-ND. iii DISSERTATION ABSTRACT Amos Benjamin Teo Doctor of Philosophy Department of Linguistics September 2019 Title: Investigating Differential Case Marking in Sümi, a Language of Nagaland, Using Language Documentation and Experimental Methods One goal in linguistics is to model how speakers use natural language to convey different kinds of information. In theories of grammar, two kinds of information: “who is doing what (and to whom)”, the technical term for which is case or case role; and pragmatic information about “what is important”, have been assumed to be expressed by different means within a language.
    [Show full text]
  • China Delight
    Friendly Planet Travel China Delight China OVERVIEW Introduction These days, it's quite jarring to walk around parts of old Beijing. Although old grannies can still be seen pushing cabbages in rickety wooden carts amidst huddles of men playing chess, it's not uncommon to see them all suddenly scurry to the side to make way for a brand-new BMW luxury sedan squeezing through the narrow hutong (a traditional Beijing alleyway). The same could be said of the longtang-style alleys of Sichuan or a bustling marketplace in Sichuan. Modern China is a land of paradox, and it's becoming increasingly so in this era of unprecedented socioeconomic change. Relentless change—seen so clearly in projects like the Yangtze River dam and the relocation of thousands of people—has been an elemental part of China's modern character. Violent revolutions in the 20th century, burgeoning population growth (China is now the world's most populous country by far) and economic prosperity (brought about by a recent openness to the outside world) have almost made that change inevitable. China's cities are being transformed—Beijing and Shanghai are probably the most dynamic cities in the world right now. And the country's political position in the world is rising: The 2008 Olympics were awarded to Beijing, despite widespread concern about how the government treats its people. China has always been one of the most attractive travel destinations in the world, partly because so much history exists alongside the new, partly because it is still so unknown to outsiders. The country and its people remain a mystery.
    [Show full text]