Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects, Pages 1–14, Osaka, Japan, December 12 2016
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Romanian Language and Its Dialects
Social Sciences ROMANIAN LANGUAGE AND ITS DIALECTS Ana-Maria DUDĂU1 ABSTRACT: THE ROMANIAN LANGUAGE, THE CONTINUANCE OF THE LATIN LANGUAGE SPOKEN IN THE EASTERN PARTS OF THE FORMER ROMAN EMPIRE, COMES WITH ITS FOUR DIALECTS: DACO- ROMANIAN, AROMANIAN, MEGLENO-ROMANIAN AND ISTRO-ROMANIAN TO COMPLETE THE EUROPEAN LINGUISTIC PALETTE. THE ROMANIAN LINGUISTS HAVE ALWAYS SHOWN A PERMANENT CONCERN FOR BOTH THE IDENTITY AND THE STATUS OF THE ROMANIAN LANGUAGE AND ITS DIALECTS, THUS SUPPORTING THE EXISTENCE OF THE ETHNIC, LINGUISTIC AND CULTURAL PARTICULARITIES OF THE MINORITIES AND REJECTING, FIRMLY, ANY ATTEMPT TO ASSIMILATE THEM BY FORCE KEYWORDS: MULTILINGUALISM, DIALECT, ASSIMILATION, OFFICIAL LANGUAGE, SPOKEN LANGUAGE. The Romanian language - the only Romance language in Eastern Europe - is an "island" of Latinity in a mainly "Slavic sea" - including its dialects from the south of the Danube – Aromanian, Megleno-Romanian and Istro-Romanian. Multilingualism is defined narrowly as the alternative use of several languages; widely, it is use of several alternative language systems, regardless of their status: different languages, dialects of the same language or even varieties of the same idiom, being a natural consequence of linguistic contact. Multilingualism is an Europe value and a shared commitment, with particular importance for initial education, lifelong learning, employment, justice, freedom and security. Romanian language, with its four dialects - Daco-Romanian, Aromanian, Megleno- Romanian and Istro-Romanian – is the continuance of the Latin language spoken in the eastern parts of the former Roman Empire. Together with the Dalmatian language (now extinct) and central and southern Italian dialects, is part of the Apenino-Balkan group of Romance languages, different from theAlpine–Pyrenean group2. -
Christians and Jews in Muslim Societies
Arabic and its Alternatives Christians and Jews in Muslim Societies Editorial Board Phillip Ackerman-Lieberman (Vanderbilt University, Nashville, USA) Bernard Heyberger (EHESS, Paris, France) VOLUME 5 The titles published in this series are listed at brill.com/cjms Arabic and its Alternatives Religious Minorities and Their Languages in the Emerging Nation States of the Middle East (1920–1950) Edited by Heleen Murre-van den Berg Karène Sanchez Summerer Tijmen C. Baarda LEIDEN | BOSTON Cover illustration: Assyrian School of Mosul, 1920s–1930s; courtesy Dr. Robin Beth Shamuel, Iraq. This is an open access title distributed under the terms of the CC BY-NC 4.0 license, which permits any non-commercial use, distribution, and reproduction in any medium, provided no alterations are made and the original author(s) and source are credited. Further information and the complete license text can be found at https://creativecommons.org/licenses/by-nc/4.0/ The terms of the CC license apply only to the original material. The use of material from other sources (indicated by a reference) such as diagrams, illustrations, photos and text samples may require further permission from the respective copyright holder. Library of Congress Cataloging-in-Publication Data Names: Murre-van den Berg, H. L. (Hendrika Lena), 1964– illustrator. | Sanchez-Summerer, Karene, editor. | Baarda, Tijmen C., editor. Title: Arabic and its alternatives : religious minorities and their languages in the emerging nation states of the Middle East (1920–1950) / edited by Heleen Murre-van den Berg, Karène Sanchez, Tijmen C. Baarda. Description: Leiden ; Boston : Brill, 2020. | Series: Christians and Jews in Muslim societies, 2212–5523 ; vol. -
After Serbo-Croatian: the Narcissism of Small Difference1
polish 3()’ 171 10 sociological review ISSN 1231 – 1413 After Serbo-Croatian: The Narcissism of Small Difference1 Snježana Kordić. Jezik i nacjonalizam [Language and Nationalism] (Rotulus / Universitas Series). Zagreb, Croatia: Durieux. 2010. ISBN 978-953-188-311-5. Keywords: Croatian, kroatistik, language politics, nationalism, Serbo-Croatian Kordić’s book Jezik i nacjonalizam [Language and Nationalism] is a study of lan- guage politics or political sociolinguistics. Language being such a burning political issue in Yugoslavia after the adoption in 1974 of a truly federal constitution. In her extensive monograph, written in Croatian (or Latin script-based Croato-Serbian?), Kordić usefully summarizes today’s state of the linguistic and popular discourse on language and nationalism as it obtains in Croatia, amplified with some comparative examples drawn from Bosnia, Montenegro and Serbia. These four out of the seven post-Yugoslav states (the other three being Kosovo, Macedonia and Slovenia) parti- tioned among themselves Yugoslavia’s main official language, Serbo-Croatian (or, in the intra-Yugoslav parlance, ‘Serbo-Croatian or Croato-Serbian’), thus reinventing it anew as the four separate national languages of Bosnian, Croatian, Montenegrin and Serbian. The first two are written in the Latin alphabet; Montenegrin is written both in this alphabet and in Cyrillic; Serbian is officially in Cyrillic, but is in practice also written in Latin characters. The monograph is divided into three parts. The first and shortest one, Linguis- tic Purism (Jezični purizam), sets out the theoretical (and also ideological) position adopted by Kordić. Building on this theoretical framework, she conducts her anal- ysis and discussion in the two further sections, The Pluricentric Standard Language (Policentrični standardni jezik) and the final more far-ranging one, Nation, Identity, Culture and History (Nacija, identitet, kultura, povijest). -
Wikipedia, the Free Encyclopedia 03-11-09 12:04
Tea - Wikipedia, the free encyclopedia 03-11-09 12:04 Tea From Wikipedia, the free encyclopedia Tea is the agricultural product of the leaves, leaf buds, and internodes of the Camellia sinensis plant, prepared and cured by various methods. "Tea" also refers to the aromatic beverage prepared from the cured leaves by combination with hot or boiling water,[1] and is the common name for the Camellia sinensis plant itself. After water, tea is the most widely-consumed beverage in the world.[2] It has a cooling, slightly bitter, astringent flavour which many enjoy.[3] The four types of tea most commonly found on the market are black tea, oolong tea, green tea and white tea,[4] all of which can be made from the same bushes, processed differently, and in the case of fine white tea grown differently. Pu-erh tea, a post-fermented tea, is also often classified as amongst the most popular types of tea.[5] Green Tea leaves in a Chinese The term "herbal tea" usually refers to an infusion or tisane of gaiwan. leaves, flowers, fruit, herbs or other plant material that contains no Camellia sinensis.[6] The term "red tea" either refers to an infusion made from the South African rooibos plant, also containing no Camellia sinensis, or, in Chinese, Korean, Japanese and other East Asian languages, refers to black tea. Contents 1 Traditional Chinese Tea Cultivation and Technologies 2 Processing and classification A tea bush. 3 Blending and additives 4 Content 5 Origin and history 5.1 Origin myths 5.2 China 5.3 Japan 5.4 Korea 5.5 Taiwan 5.6 Thailand 5.7 Vietnam 5.8 Tea spreads to the world 5.9 United Kingdom Plantation workers picking tea in 5.10 United States of America Tanzania. -
(Daco-Romanian and Aromanian) in Verbs of Latin Origin
CONJUGATION CHANGES IN THE EVOLUTION OF ROMANIAN (DACO-ROMANIAN AND AROMANIAN) IN VERBS OF LATIN ORIGIN AIDA TODI1, MANUELA NEVACI2 Abstract. Having as starting point for research on the change of conjugation of Latin to the Romance languages, the paper aims to present the situation of these changes in Romanian: Daco-Romanian (that of the old Romanian texts) and Aromanian dialect (which does not have a literary standard). Keywords: conjugation changes; Romance languages; Romanian language (Daco-Romanian and Aromanian). Conjugation changes are a characteristic feature of the Romance verb system. In some Romance languages (Spanish, Catalan, Portuguese, Sardinian) verbs going from one conjugation to another has caused the reduction of the four conjugations inherited from Latin to three inflection classes: in Spanish and Portuguese the 2nd conjugation extended (verbs with stressed theme vowel): véndere > Sp., Pg. vender; cúrere > Sp., Pg. correr; in Catalan 3rd conjugation verbs assimilated the 2nd conjugation ones, a phenomenon occurring in Sardinian as well: Catal. ventre, Srd. biere; Additionally, in Spanish and Portuguese the 4th conjugation also becomes strong, assimilating 3rd conjugation verbs: petĕre > Sp., Pg. petir, ungĕre> Sp., Pg. ungir, iungĕre> Sp., Pg. ungir (Lausberg 1988: 259). Lausberg includes Aromanian together with Spanish, Catalan and Sardinian, where the four conjugations were reduced to three, mentioning that 3rd conjugation verbs switched to the 2nd conjugation3. The process of switching from one conjugation to another is frequent from as early as vulgar Latin. Grammars experience changes such as: augĕre > augēre; ardĕre> ardēre; fervĕre > fervēre; mulgĕre> mulgēre; respondĕre> respondēre; sorbĕre> sorbēre; torquĕre > torquēre; tondĕre >tondēre (Densusianu 1961: 103, 1 “Ovidius” University, Constanţa, [email protected]. -
Saudi Dialects: Are They Endangered?
Academic Research Publishing Group English Literature and Language Review ISSN(e): 2412-1703, ISSN(p): 2413-8827 Vol. 2, No. 12, pp: 131-141, 2016 URL: http://arpgweb.com/?ic=journal&journal=9&info=aims Saudi Dialects: Are They Endangered? Salih Alzahrani Taif University, Saudi Arabia Abstract: Krauss, among others, claims that languages will face death in the coming centuries (Krauss, 1992). Austin (2010a) lists 7,000 languages as existing and spoken in the world today. Krauss estimates that this figure could come down to 600. That is, most the world's languages are endangered. Therefore, an endangered language is a language that loses her speakers within a few generations. According to Dorian (1981), there is what is called ―tip‖ in language endangerment. He argues that a language's decline can start slowly but suddenly goes through a rapid decline towards the extinction. Thus, languages must be protected at much earlier stage. Arabic dialects such as Zahrani Spoken Arabic (ZSA), and Faifi Spoken Arabic (henceforth, FSA), which are spoken in the southern region of Saudi Arabia, have not been studied, yet. Few people speak these dialects, among many other dialects in the same region. However, the problem is that most these dialects' native speakers are moving to other regions in Saudi Arabia where they use other different dialects. Therefore, are these dialects endangered? What other factors may cause its endangerment? Have they been documented before? What shall we do? This paper discusses three main different points regarding this issue: language and endangerment, languages documentation and description and Arabic language and its family, giving a brief history of Saudi dialects comparing their situation with the whole existing dialects. -
East European Languages
Latest Trends on Cultural Heritage and Tourism SEMANTIC CONCORDANCE IN THE SOUTH- EAST EUROPEAN LANGUAGES ANCUłA NEGREA, AGNES ERICH, ZLATE STEFANIA Faculty of Humanities Valahia University of Târgoviste Address: Street Lt. Stancu Ion, No. 35, Targoviste COUNTRY ROMANIA [email protected]; [email protected] Abstract: - The common terms of the southeast European languages that present semantic convergences [1], belong to a less explored domain, that of comparative linguistics. “The reason why the words have certain significations and not others is not found in the linguistic system or in the human biological or mental structures … it is the societies that, through their diverse history, determine the significations, and the semantics is found at the crossroads between the historical complexity of the reality it studies and the historical complexity of the culture reflecting on this reality… its object is the result and the condition of the historical character, thanks to the relations of mutual conditioning that relate the signification to the historic events of each human community.”[2] Key-Words: Balkan linguistics, terminology of the word “house”: Bg. kăšta, Srb., Cr. kuča, Maced. kuk’a, slov. koča < old sl. kotia meaning “what is hidden, what is sheltered”, Rom. casa < Lat. casa; Ngr. σπητη < Lat. hospitium; Alb. shtepi. 1 Introduction NeoGreek, and Albanese, present unanimous The “mentality” common to the Balkan world, concrdances for the terms having the meaning: having manifestations in the language, is the - “house”, “dwelling” and also historical product of the cohabitation in similar - ”room”, “place to live”, though these forms of civilization and culture. terms have different origins: Kristian SANDFELD, in his fundamental work - Bg. -
The Culture of Wikipedia
Good Faith Collaboration: The Culture of Wikipedia Good Faith Collaboration The Culture of Wikipedia Joseph Michael Reagle Jr. Foreword by Lawrence Lessig The MIT Press, Cambridge, MA. Web edition, Copyright © 2011 by Joseph Michael Reagle Jr. CC-NC-SA 3.0 Purchase at Amazon.com | Barnes and Noble | IndieBound | MIT Press Wikipedia's style of collaborative production has been lauded, lambasted, and satirized. Despite unease over its implications for the character (and quality) of knowledge, Wikipedia has brought us closer than ever to a realization of the centuries-old Author Bio & Research Blog pursuit of a universal encyclopedia. Good Faith Collaboration: The Culture of Wikipedia is a rich ethnographic portrayal of Wikipedia's historical roots, collaborative culture, and much debated legacy. Foreword Preface to the Web Edition Praise for Good Faith Collaboration Preface Extended Table of Contents "Reagle offers a compelling case that Wikipedia's most fascinating and unprecedented aspect isn't the encyclopedia itself — rather, it's the collaborative culture that underpins it: brawling, self-reflexive, funny, serious, and full-tilt committed to the 1. Nazis and Norms project, even if it means setting aside personal differences. Reagle's position as a scholar and a member of the community 2. The Pursuit of the Universal makes him uniquely situated to describe this culture." —Cory Doctorow , Boing Boing Encyclopedia "Reagle provides ample data regarding the everyday practices and cultural norms of the community which collaborates to 3. Good Faith Collaboration produce Wikipedia. His rich research and nuanced appreciation of the complexities of cultural digital media research are 4. The Puzzle of Openness well presented. -
The Production of Lexical Tone in Croatian
The production of lexical tone in Croatian Inauguraldissertation zur Erlangung des Grades eines Doktors der Philosophie im Fachbereich Sprach- und Kulturwissenschaften der Johann Wolfgang Goethe-Universität zu Frankfurt am Main vorgelegt von Jevgenij Zintchenko Jurlina aus Kiew 2018 (Einreichungsjahr) 2019 (Erscheinungsjahr) 1. Gutacher: Prof. Dr. Henning Reetz 2. Gutachter: Prof. Dr. Sven Grawunder Tag der mündlichen Prüfung: 01.11.2018 ABSTRACT Jevgenij Zintchenko Jurlina: The production of lexical tone in Croatian (Under the direction of Prof. Dr. Henning Reetz and Prof. Dr. Sven Grawunder) This dissertation is an investigation of pitch accent, or lexical tone, in standard Croatian. The first chapter presents an in-depth overview of the history of the Croatian language, its relationship to Serbo-Croatian, its dialect groups and pronunciation variants, and general phonology. The second chapter explains the difference between various types of prosodic prominence and describes systems of pitch accent in various languages from different parts of the world: Yucatec Maya, Lithuanian and Limburgian. Following is a detailed account of the history of tone in Serbo-Croatian and Croatian, the specifics of its tonal system, intonational phonology and finally, a review of the most prominent phonetic investigations of tone in that language. The focal point of this dissertation is a production experiment, in which ten native speakers of Croatian from the region of Slavonia were recorded. The material recorded included a diverse selection of monosyllabic, bisyllabic, trisyllabic and quadrisyllabic words, containing all four accents of standard Croatian: short falling, long falling, short rising and long rising. Each target word was spoken in initial, medial and final positions of natural Croatian sentences. -
Emerged from Antiquity As an All-Jewish Possession, Together with Is Interesting
7+ Yiddish in the Framework of OtherJewish Languages Yiddish in the Framework of OtherJewish Languages there discoverable threads extending lrom these three linguistic groups "Arabic" as a native tongue amongJews (in z.rr.I it will become to the ancient Parsic? These questions have not yet been touched by cle ar why it is more appropriate to spe ak of a separate Jewish language scholarship. with Arabic stock, which.may be called Yahudic) is current among a 2.ro The sunset of Targumic as the spoken language of a major much larger group. On the eve of World War II the number of Yahudic Jewish community came with the rise of the Arabs (z.r.r). A survey of speakers was estimated at about seven hundred thousand. Of course, we the linguistic condition of the Jews up to the Arab period is therefore in have no statistics on the Gaonic period, but by no means can the current place. figure give us any idea of the proportion and the dynamics of Yahudic The frontal attack of Hellenism on Jewish culture failed; but at least in former years. By virtue of the Arab conquests, Yahudic was firmly it was historical drama on a large scaie, and visible signs olJaphet's established in Yemen, Babylonia, Palestine, and all of North Africa, beauty remained in the tents of Shem, to use a stock phrase so popular from Egypt to the Atlantic; even Sicily and southern Italy, which as a in the Haskalah period. Nor will we leave Persian out of consideration rule should be included in the Yavanic culture area (z.I 2 ), were at times in the overall picture ofJewish subcultures, although the phenomenon considerably influenced by North Africa. -
Aleksey A. ROMANCHUK ROMANIAN a CINSTI in the LIGHT of SOME
2021, Volumul XXIX REVISTA DE ETNOLOGIE ȘI CULTUROLOGIE E-ISSN: 2537-6152 101 Aleksey A. ROMANCHUK ROMANIAN A CINSTI IN THE LIGHT OF SOME ROMANIAN-SLAVIC CONTACTS1 https://doi.org/10.52603/rec.2021.29.14 Rezumat определенное отражение следы обозначенного поздне- Cuvântul românesc a cinsti în lumina unor contacte праславянского диалекта (диалектов). В частности, к româno-slave таким следам, возможно, стоит отнести как украинские Pornind de la comparația dintre cuvântul ucrainean диалектные чандрий, шандрий, чендрий, так и диалект- частувати ‚a trata’ și românescul a cinsti‚ a trata (cu vin), ное (зафиксировано в украинском говоре с. Булэешть) / a bea vin, se consideră un grup de împrumuturi slave cu мон|золетеи/ ‘мусолить; впустую теребить’. /n/ epentetic în limba română, căreia îi aparține și a cin- Ключевые слова: славяне, румыны, лексические sti. Interpretând corpul de fapte disponibil, putem presu- заимствования, этническая история, украинские диа- pune că convergență semantică dintre cuvintele честь и лекты, Молдова, Буковина. угощение a apărut în perioada slavică timpurie. Cuvântul românesc a cinsti este un argument important pentru da- Summary tarea timpurie a apariției acestei convergențe semantice. Romanian a cinsti in the light of some Așadar, cuvântul ucrainean частувати, ca și cuvântul po- Romanian-Slavic contacts lonez częstowac, apar independent unul de celălalt, la fel A group of Slavic loanwords with epenthetic /n/ in the ca și de cuvântul românesc a cinsti. Cuvântul românesc a Romanian language, to which a cinsti belongs, is consid- cinsti, ca și, în general, grupul menționat de împrumuturi ered. Interpreting the existing set of facts, the author sup- slave cu /n/ epentetic în limba română, reprezintă un rezul- poses that the semantic convergence between the Slav- tat al contactelor timpurii ale limbii române cu un dialect ic честь ‘honour’ and угощение ‘treat’ appeared as far back (dialecte) slav vechi, pentru care era caracteristică tendința as the Late Slavic period. -
Language, Ideology and Politics in Croatia
Language, Ideology and Politics in Croatia M at e k a p o v i ć University of Zagreb, Department of Linguistics, Faculty of Humanities and Social Sciences, Ivana Lučića 3, HR – 10 000 Zagreb, [email protected] SCN IV/2 [2011], 45–56 Izhajajoč deloma iz osnovnih tez svoje pred kratkim izšle knjige Čiji je jezik (Čigav je jezik?) avtor podaja pregled zapletenega odnosa med jezikom, ideologijo in politiko na Hrvaškem v preteklih dveh desetletjih, vključno z novimi primeri in razčlembami. Razprava se osredotoča na vprašanja, povezana s Hrvaško, ki so lahko zanimiva za tuje slaviste in jezikoslovce, medtem ko se knjiga (v hrvaščini) ukvarja s problemi jezika, politike, ideologije in družbenega jeziko- slovja na splošno. Based in part on his recent book Čiji je jezik? (Who does Language Belong to?), the author reviews the intricate relation of language, ideology, and politics in Croatia in the last 20 years, including new examples and analyses. The article emphasizes problems related to Croatia specifically, which might be of interest to foreign Slavists and linguists, while the monograph (in Croatian) deals with the prob- lems of language, society, politics, ideology, and sociolinguistics in general. Ključne besede: jezikovna politika, jezikovno načrtovanje, purizem, hrvaški jezik, jezik v nekdanji Jugoslaviji Key words: language politics, language planning, purism, Croatian language, language in former Yugoslavia Introduction1 The aim of this article is to provide a general and brief overview of some problems concerning the intricate relation of language, ideology, and politics in Croatia in the last 20 years. The bulk of the article consists of some of the 1 I would like to thank Marko Kapović for reading the first draft of the article carefully.