Directl: a Language Independent Approach to Transliteration
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Assessment of Options for Handling Full Unicode Character Encodings in MARC21 a Study for the Library of Congress
1 Assessment of Options for Handling Full Unicode Character Encodings in MARC21 A Study for the Library of Congress Part 1: New Scripts Jack Cain Senior Consultant Trylus Computing, Toronto 1 Purpose This assessment intends to study the issues and make recommendations on the possible expansion of the character set repertoire for bibliographic records in MARC21 format. 1.1 “Encoding Scheme” vs. “Repertoire” An encoding scheme contains codes by which characters are represented in computer memory. These codes are organized according to a certain methodology called an encoding scheme. The list of all characters so encoded is referred to as the “repertoire” of characters in the given encoding schemes. For example, ASCII is one encoding scheme, perhaps the one best known to the average non-technical person in North America. “A”, “B”, & “C” are three characters in the repertoire of this encoding scheme. These three characters are assigned encodings 41, 42 & 43 in ASCII (expressed here in hexadecimal). 1.2 MARC8 "MARC8" is the term commonly used to refer both to the encoding scheme and its repertoire as used in MARC records up to 1998. The ‘8’ refers to the fact that, unlike Unicode which is a multi-byte per character code set, the MARC8 encoding scheme is principally made up of multiple one byte tables in which each character is encoded using a single 8 bit byte. (It also includes the EACC set which actually uses fixed length 3 bytes per character.) (For details on MARC8 and its specifications see: http://www.loc.gov/marc/.) MARC8 was introduced around 1968 and was initially limited to essentially Latin script only. -
Phonetics and Phonology in Russian Unstressed Vowel Reduction: a Study in Hyperarticulation
Phonetics and Phonology in Russian Unstressed Vowel Reduction: A Study in Hyperarticulation Jonathan Barnes Boston University (Short title: Hyperarticulating Russian Unstressed Vowels) Jonathan Barnes Department of Modern Foreign Languages and Literatures Boston University 621 Commonwealth Avenue, Room 119 Boston, MA 02215 Tel: (617) 353-6222 Fax: (617) 353-4641 [email protected] Abstract: Unstressed vowel reduction figures centrally in recent literature on the phonetics-phonology interface, in part owing to the possibility of a causal relationship between a phonetic process, duration-dependent undershoot, and the phonological neutralizations observed in systems of unstressed vocalism. Of particular interest in this light has been Russian, traditionally described as exhibiting two distinct phonological reduction patterns, differing both in degree and distribution. This study uses hyperarticulation to investigate the relationship between phonetic duration and reduction in Russian, concluding that these two reduction patterns differ not in degree, but in the level of representation at which they apply. These results are shown to have important consequences not just for theories of vowel reduction, but for other problems in the phonetics-phonology interface as well, incomplete neutralization in particular. Introduction Unstressed vowel reduction has been a subject of intense interest in recent debate concerning the nature of the phonetics-phonology interface. This is the case at least in part due to the existence of two seemingly analogous processes bearing this name, one typically called phonetic, and the other phonological. Phonological unstressed vowel reduction is a phenomenon whereby a given language's full vowel inventory can be realized only in lexically stressed syllables, while in unstressed syllables some number of neutralizations of contrast take place, with the result that only a subset of the inventory is realized on the surface. -
Galio Uma, Maƙerin Azurfa Maƙeri (M): Ni Ina Da Shekara Yanzu…
LANGUAGE PROGRAM, 232 BAY STATE ROAD, BOSTON, MA 02215 [email protected] www.bu.edu/Africa/alp 617.353.3673 Interview 2: Galio Uma, Maƙerin Azurfa Maƙeri (M): Ni ina da shekara yanzu…, ina da vers …alal misali vers mille neuf cent quatre vingt et quatre (1984) gare mu parce que wajenmu babu hopital lokacin da anka haihe ni ban san shekaruna ba, gaskiya. Ihehehe! Makaranta, ban yi makaranta ba sabo da an sa ni makaranta sai daga baya wajenmu babana bai da hali, mammana bai da hali sai mun biya. Akwai nisa tsakaninmu da inda muna karatu. Kilometir kama sha biyar kullum ina tahiya da ƙahwa. To, ha na gaji wani lokaci mu hau jaki, wani lokaci mu hau raƙumi, to wani lokaci ma babu hali, babu karatu. Amman na yi shekara shidda ina karatu ko biyat. Amman ban ji daɗi ba da ban yi karatu ba. Sabo da shi ne daɗin duniya. Mm. To mu Buzaye ne muna yin buzanci. Ana ce muna Tuareg. Mais Tuareg ɗin ma ba guda ba ne. Mu artisans ne zan ce miki. Can ainahinmu ma ba mu da wani aiki sai ƙira. Duka mutanena suna can Abala. Nan ga ni dai na nan ga. Ka gani ina da mata, ina da yaro. Akwai babana, akwai mammana. Ina da ƙanne huɗu, macce biyu da namiji biyu. Suna duka cikin gida. Duka kuma kama ni ne chef de famille. Duka ni ne ina aider ɗinsu. Sabo da babana yana da shekara saba’in da biyar. Vers. Mammana yana da shekara shi ma hamsin da biyar. -
Writing As Aesthetic in Modern and Contemporary Japanese-Language Literature
At the Intersection of Script and Literature: Writing as Aesthetic in Modern and Contemporary Japanese-language Literature Christopher J Lowy A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy University of Washington 2021 Reading Committee: Edward Mack, Chair Davinder Bhowmik Zev Handel Jeffrey Todd Knight Program Authorized to Offer Degree: Asian Languages and Literature ©Copyright 2021 Christopher J Lowy University of Washington Abstract At the Intersection of Script and Literature: Writing as Aesthetic in Modern and Contemporary Japanese-language Literature Christopher J Lowy Chair of the Supervisory Committee: Edward Mack Department of Asian Languages and Literature This dissertation examines the dynamic relationship between written language and literary fiction in modern and contemporary Japanese-language literature. I analyze how script and narration come together to function as a site of expression, and how they connect to questions of visuality, textuality, and materiality. Informed by work from the field of textual humanities, my project brings together new philological approaches to visual aspects of text in literature written in the Japanese script. Because research in English on the visual textuality of Japanese-language literature is scant, my work serves as a fundamental first-step in creating a new area of critical interest by establishing key terms and a general theoretical framework from which to approach the topic. Chapter One establishes the scope of my project and the vocabulary necessary for an analysis of script relative to narrative content; Chapter Two looks at one author’s relationship with written language; and Chapters Three and Four apply the concepts explored in Chapter One to a variety of modern and contemporary literary texts where script plays a central role. -
Japanese Language and Culture
NIHONGO History of Japanese Language Many linguistic experts have found that there is no specific evidence linking Japanese to a single family of language. The most prominent theory says that it stems from the Altaic family(Korean, Mongolian, Tungusic, Turkish) The transition from old Japanese to Modern Japanese took place from about the 12th century to the 16th century. Sentence Structure Japanese: Tanaka-san ga piza o tabemasu. (Subject) (Object) (Verb) 田中さんが ピザを 食べます。 English: Mr. Tanaka eats a pizza. (Subject) (Verb) (Object) Where is the subject? I go to Tokyo. Japanese translation: (私が)東京に行きます。 [Watashi ga] Toukyou ni ikimasu. (Lit. Going to Tokyo.) “I” or “We” are often omitted. Hiragana, Katakana & Kanji Three types of characters are used in Japanese: Hiragana, Katakana & Kanji(Chinese characters). Mr. Tanaka goes to Canada: 田中さんはカナダに行きます [kanji][hiragana][kataka na][hiragana][kanji] [hiragana]b Two Speech Styles Distal-Style: Semi-Polite style, can be used to anyone other than family members/close friends. Direct-Style: Casual & blunt, can be used among family members and friends. In-Group/Out-Group Semi-Polite Style for Out-Group/Strangers I/We Direct-Style for Me/Us Polite Expressions Distal-Style: 1. Regular Speech 2. Ikimasu(he/I go) Honorific Speech 3. Irasshaimasu(he goes) Humble Speech Mairimasu(I/We go) Siblings: Age Matters Older Brother & Older Sister Ani & Ane 兄 と 姉 Younger Brother & Younger Sister Otooto & Imooto 弟 と 妹 My Family/Your Family My father: chichi父 Your father: otoosan My mother: haha母 お父さん My older brother: ani Your mother: okaasan お母さん Your older brother: oniisanお兄 兄 さん My older sister: ane姉 Your older sister: oneesan My younger brother: お姉さ otooto弟 ん Your younger brother: My younger sister: otootosan弟さん imooto妹 Your younger sister: imootosan 妹さん Boy Speech & Girl Speech blunt polite I/Me = watashi, boku, ore, I/Me = watashi, washi watakushi I am going = Boku iku.僕行 I am going = Watashi iku く。 wa. -
Directions Active Employees (At 31.12
Publishing details At a glance Sales and net income for the year in € m Level of internationality Visitors: 47.9 % Exhibitors: 73.4 % 500 50 400 40 Editors-in-chief Print production 300 30 Dominique Ewert Messe Frankfurt Medien Klaus Münster-Müller und Service GmbH 200 20 Publishing Services Annual Report 2014 100 10 Editors 2010 2011 2012 2013 2014 Markus Quint (production editor) Print l from Germany attending Messe Frankfurt events at the Frankfurt venue l Sales l Net income for the year l from outside Germany attending Messe Frankfurt events at the Frankfurt venue Antje Breuer-Seifi Druckhaus Becker GmbH Claudia Lehning-Berge Dieselstraße 9 2014 The Messe Frankfurt corporate group conceives, plans and hosts trade fairs and exhibitions in Germany and abroad. Sarah Stanzel 64372 Ober-Ramstadt Messe Frankfurt The parent company and its subsidiaries offer a well-coordinated service package for national and international Gabriele Wehrl Germany Annual Report customers, exhibitors and visitors. Responsibility for content in accordance Paper Corporate group 2 with the German press laws Cover: Hello Fat Matt 1.1 350 g/m in € m* 2010 2011 2012 2013 2014 Iris Jeglitza-Moshage Inside pages: Arctic the Volume 150 g/m2 Sales 448 467 537 545 554 Photographs Print run Personnel expenses 102 106 120 123 131 Pietro Sutera Photography (p. 3) 3,000 in two editions Depreciation, amortisation and write-downs 59 59 61 56 52 Rüdiger Nehmzow (p. 6–11, 36–39) (German and English) Earnings before taxes on income 42 34 36 49 47 Iwan Baan (p. 19) EBITDA 109 99 102 108 102 BRCK (p. -
Seize the Ъ: Linguistic and Social Change in Russian Orthographic Reform Eugenia Sokolskaya
Seize the Ъ: Linguistic and Social Change in Russian Orthographic Reform Eugenia Sokolskaya It would be convenient for linguists and language students if the written form of any language were a neutral and direct representation of the sounds emitted during speech. Instead, writing systems tend to lag behind linguistic change, retaining old spellings or morphological features instead of faithfully representing a language's phonetics. For better or worse, sometimes the writing can even cause a feedback loop, causing speakers to hypercorrect in imitation of an archaic spelling. The written and spoken forms of a language exist in a complex and fluid relationship, influenced to a large extent by history, accident, misconception, and even politics.1 While in many language communities these two forms – if writing exists – are allowed to develop and influence each other in relative freedom, with some informal commentary, in some cases a willful political leader or group may step in to attempt to intentionally reform the writing system. Such reforms are a risky venture, liable to anger proponents of historic accuracy and adherence to tradition, as well as to render most if not all of the population temporarily illiterate. With such high stakes, it is no wonder that orthographic reforms are not often attempted in the course of a language's history. The Russian language has undergone two sharply defined orthographic reforms, formulated as official government policy. The first, which we will from here on call the Petrine reform, was initiated in 1710 by Peter the Great. It defined a print alphabet for secular use, distancing the writing from the Church with its South Slavic lithurgical language. -
Introduction to Japanese Computational Linguistics Francis Bond and Timothy Baldwin
1 Introduction to Japanese Computational Linguistics Francis Bond and Timothy Baldwin The purpose of this chapter is to provide a brief introduction to the Japanese language, and natural language processing (NLP) research on Japanese. For a more complete but accessible description of the Japanese language, we refer the reader to Shibatani (1990), Backhouse (1993), Tsujimura (2006), Yamaguchi (2007), and Iwasaki (2013). 1 A Basic Introduction to the Japanese Language Japanese is the official language of Japan, and belongs to the Japanese language family (Gordon, Jr., 2005).1 The first-language speaker pop- ulation of Japanese is around 120 million, based almost exclusively in Japan. The official version of Japanese, e.g. used in official settings andby the media, is called hyōjuNgo “standard language”, but Japanese also has a large number of distinctive regional dialects. Other than lexical distinctions, common features distinguishing Japanese dialects are case markers, discourse connectives and verb endings (Kokuritsu Kokugo Kenkyujyo, 1989–2006). 1There are a number of other languages in the Japanese language family of Ryukyuan type, spoken in the islands of Okinawa. Other languages native to Japan are Ainu (an isolated language spoken in northern Japan, and now almost extinct: Shibatani (1990)) and Japanese Sign Language. Readings in Japanese Natural Language Processing. Francis Bond, Timothy Baldwin, Kentaro Inui, Shun Ishizaki, Hiroshi Nakagawa and Akira Shimazu (eds.). Copyright © 2016, CSLI Publications. 1 Preview 2 / Francis Bond and Timothy Baldwin 2 The Sound System Japanese has a relatively simple sound system, made up of 5 vowel phonemes (/a/,2 /i/, /u/, /e/ and /o/), 9 unvoiced consonant phonemes (/k/, /s/,3 /t/,4 /n/, /h/,5 /m/, /j/, /ó/ and /w/), 4 voiced conso- nants (/g/, /z/,6 /d/ 7 and /b/), and one semi-voiced consonant (/p/). -
Zive Naj Vsi Narodi K I Hrepene Docakat Dan D E Koder Sonce Hodi P Repir I Z Sveta Bo Pregnan D E Rojak P Rost Bo Vsak N E Vrag
12. letno srečanje Združenja za slovansko jezikoslovje Povzetki prispevkov 12th Slavic Linguistics Society Annual Meeting Book of Abstracts XII ежегодная конференция Общества славянского языкознания Тезисы докладов Ljubljana, 21.–24. 9. 2017 v / / Zive naj vsi narodi / / / K i hrepene docakatv dan , / / / D e koder sonce hodi , / / P repir/ iz sveta bo/ pregnan , / D e rojak / P rost bo vsak , / / / N e vrag, le sosed bo mejak. 12. letno srečanje Združenja za slovansko jezikoslovje: Povzetki prispevkov 12th Slavic Linguistics Society Annual Meeting: Book of Abstracts XII ежегодная конференция Общества славянского языкознания: Тезисы докладов ISBN: 978-961-05-0027-8 Urednika / Editors / Редакторы: Luka Repanšek, Matej Šekli Recenzenti / Peer-reviewers / Рецензенты: Aleksandra Derganc, Marko Hladnik, Gašper Ilc, Zenaida Karavdić, Simona Kranjc, Domen Krvina, Nina Ledinek, Frančiška Lipovšek, Franc Marušič, Tatjana Marvin, Petra Mišmaš, Matic Pavlič, Анастасия Ильинична Плотникова / Anastasiia Plotnikova, Luka Repanšek, Михаил Николаевич Саенко / Mikhail Saenko, Дмитрий Владимирович Сичинава / Dmitri Sitchinava, Vera Smole, Mojca Smolej, Marko Snoj, Petra Stankovska, Andrej Stopar, Saška Štumberger, Hotimir Tivadar, Mitja Trojar, Mladen Uhlik, Mojca Žagar Karer, Rok Žaucer, Andreja Žele, Sašo Živanović Vodja konference / Conference leadership / Руководитель конференции: Matej Šekli Organizacijski odbor konference / Conference board / Оргкомитет конференции: Branka Kalenić Ramšak, Domen Krvina, Alenka Lap, Oto Luthar, Tatjana Marvin, Mojca -
Title: Exploring Japanese Language: from East Asia to Eurasia
Eurasia Foundation International Lectures, Fall 2020 Semester “The Construction and Transformation of East Asiaology” Lecture Series (12) Title: Exploring Japanese Language: From East Asia to Eurasia For the 12th Eurasia Foundation International Lectures, we invite Professor Shun-Yi Chen, from the Center for Japanese Studies at Chinese Culture University, to share his research results on exploring Japanese language. Professor Chen’s speech consist of two parts: the first part explores the pronunciation of Chinese characters in Japanese (Japanese Kanji) and its relations with the composition of words and the second part discusses the similarity between Japanese and Hebrew. Professor Chen indicates that the pronunciation of Japanese Kanji is different depending on when they were introduced to Japan. There are different types including Go-on (呉音, “sounds from the Wu region”), Kan-on (漢音, “Han sound”), and Tō-on (唐音, “Tang sound”). Kan-on refers to sounds of Kanji from Chang’an in Tang dynasty which were introduced with Japanese missions to Tang and Sui China. Go-on refers to sounds of Kanji from Southern China which were introduced via the Korean Peninsula in the 5th and 6th century. Tō-on refers to sounds of Kanji introduced after Song dynasty, while So-on were introduced in Yuan, Ming Dynasty. Most Tō-on and So-on were Buddhist terms. Here are some examples of different sounds: 行政 Gyōsei (ぎょうせい) is Go-on; 銀行 Ginkō (ぎんこう) is Kan-on. Professor Chen also points out that since the pronunciation of Kanji were introduced from China, they must follow the rule of the pronunciation in Chinese. -
Abin Yi Idan Kana Da COVID-19
Abin yi idan kana da COVID-19 KOYI YADDA ZAKA KULA DA KANKA DA KUMA WADANSU A GIDA. Menene Alamomin COVID-19? • Akwai alamomi iri-iri masu yawa, fara daga masu sauki zuwa masu tsanani. Wadansu mutane ba su da wadansu alamomin cutar. • Mafi yawan alamomi sun haɗa da zazzaɓi ko sanyi, tari, gajeruwar numfashi ko wahalar numfashi, gajiya, ciwon jijiyoyi ko jiki, ciwon kai, rashin dandano ko jin wari, ciwon makogwaro ko yoyon hanci, tashin zuciya ko amai, da gudawa. Wanene ke da Hadari don Me ya kamata in yi idan ina da alamomin COVID-19? Tsananin ciwo daga COVID-19? • Tsaya a gida! Kada ka bar gida sai dai don • Tsakanin manya, haɗarin yin gwaji don COVID-19 da sauran muhimman mummunan rashin lafiya yana abubuwan kula da lafiya ko bukatun ƙaruwa tare da shekaru, tsofaffi yau-da-kullum, kamar su kayan masarufi, na cikin haɗari mafi girma. idan wani ba zai iya samar maka su ba. Kada ka je wurin aiki, ko da kuwa kana • Mutane daga wadansu launin ma’aikaci mai muhimmanci. fatar da kabilu (hade da Bakake, ’yan Latin Amurka da ’Yan asali) • Yi magana da mai baka kulawar lafiya! saboda tsarin lafiya da rashin Yi amfani da tarho ko telemedicine idan zai yiwu. daidaiton zamantakewa. • Yi gwaji! Idan mai baka kulawa baya bayar • Mutanen na kowane shekaru yin gwaji, ziyarci nyc.gov/covidtest ko kira waɗanda suke da yanayin 311 don neman wurin gwaji kusa da kai. boyayyen rashin lafiya, kamar: Wurare dayawa Na bayar da gwajin kyauta. Ciwon daji • Kira 911 a yanayin gaggawa! Idan kana Daɗɗaɗen Ciwon koda da matsalar numfashi, zafi ko matsin lamba Daɗɗaɗen ciwon huhu a kirjinka, ka rikice ko baka iya zama farke, Ciwon mantuwa da sauran kana da leɓuna masu ruwan bula ko fuska, cututtukan jijiyoyin jiki ko wadansu yanayin gaggawa, jeka asibiti Ciwon suga ko kira 911 nan take. -
How to Deal with the Elements of Russian Origin
研究論文 How to Deal with the Elements of Russian Origin: Developments of Orthographic Reforms for the Kyrgyz Language1) ロシア語的要素をどのように扱うか ― キルギス語正書法改革の展開 ― 小田桐 奈 美 Nami Odagiri 本研究は、ロシア語的要素(ロシア語起源の借用語、キリル文字、ロシア語起源の音)に 着目し、ソ連時代およびソ連崩壊以降のキルギス語の正書法改革において、このロシア語的 要素が「外来要素」として認識され排除されてきたのか、それとも何らかの形で受容されて きたのかという問題を検討するものである。本研究によって、特に独立以降のキルギス語の 正書法は、単純にロシア語的要素を排除するわけでも積極的に受け入れるわけでもなく、非 常に複雑に入り組んだものであり、かつ曖昧さを包含するものであったことが明らかになっ た。 キーワード キルギス語、ロシア語、正書法改革、旧ソ連地域 1 . Introduction After the dissolution of the Soviet Union, the titular languages of each ex-Soviet state were promoted as the “state language” 2) and positioned as symbols of national integration. On the basis of the results from previous studies regarding this topic (e.g., Landau and Kellner- Heinkele 2001; Pavlenko 2008), the author emphasizes the general tendency of each states’ language influence to grow, while the role of the Russian language, which enjoyed the highest prestige during the Soviet era, is shrinking, although its influence has not been completely excluded 3). However, many previous studies are general or comparative involving two or more states of the ex-Soviet region, and details of the relational dynamics between the state and Russian language in each state have not yet been entirely explored. Moreover, as previous studies were 69 外国語学部紀要 第 12 号(2015 年 3 月) mainly concerned with the legal and social status of language, such linguistic issues as ortho- graphic and alphabet reforms in the post-Soviet era have not been entirely discussed4). This study, therefore, focuses on the elements of Russian origin in the Kyrgyz language (e.g., the Cyrillic alphabet, loanwords, and sounds of Russian origin). Additionally, it addresses issues concerning orthographic reforms for the Kyrgyz language.