A Comparison of Entity Matching Methods Between English and Japanese Katakana
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Fungsi Ateji Dalam Lirik Lagu Pada Album Marginal #4 the Best 「Star Cluster 2」 Produksi Rejet
PARAMASASTRA Vol. 6 No. 1 - Maret 2019 p-ISSN 2355-4126 e-ISSN 2527-8754 http://journal.unesa.ac.id/index.php/paramasastra FUNGSI ATEJI DALAM LIRIK LAGU PADA ALBUM MARGINAL #4 THE BEST 「STAR CLUSTER 2」 PRODUKSI REJET Meisha Putri M.R., Agus Budi Cahyono Universitas Brawijaya, [email protected] Universitas Brawijaya, [email protected] ABSTRACT This article aimed to describe why furigana in Japanese songs often found different furigana actually with kanji below it. Data uses the album MARGINAL # 4 THE BEST 「STAR CLUSTER 2」 REJET Production. This study uses qualitative descriptive to examine the type of ateji based on Lewis's theory (2010) and its function based on the theory of Jakobson (1960). Based on analysis, writer find more contrastive ateji than denotive ateji. Fatigue function is found more than other functions. The metalingual function is found on all data. Keywords: Ateji, Furigana, semantic PENDAHULUAN Huruf bahasa Jepang dibagi menjadi 4 yang digunakan sehari-hari. Adapun huruf tersebut adalah Kanji, Hiragana, Katakana dan Romaji. Pada penulisan huruf Kanji kadang diikuti dengan furigana yang merupakan bantuan cara baca serta memaknai kanji itu sendiri karena huruf kanji kadang mempunyai cara baca yang berbeda. Selain pembubuhan dengan furigana ada juga dengan ateji. Furigana itu murni sebagai cara baca dan makna aslinya, maka ateji adalah bantuan cara baca yang dilekatkan untuk menambahkan lapisan ide maupun makna di dalam kanji itu sendiri. Ateji merupakan penulisan bahasa Jepang yang tidak mengikuti cara baca jion (cara baca kanji China) dan jikun (cara baca kanji Jepang) ataupun jigi (makna asli) bahasa Jepang tersebut (Shirose, 2012: 103). -
Uhm Phd 9506222 R.Pdf
INFORMATION TO USERS This manuscript has been reproduced from the microfilm master. UM! films the text directly from the original or copy submitted. Thus, some thesis and dissertation copies are in typewriter face, while others may be from any type of computer printer. The quality of this reproduction is dependent UJWD the quality of the copy submitted. Broken or indistinct print, colored or poor quality illustrations and photographs, print bleedthrough, substandard margins, and improper alignment can adverselyaffect reproduction. In the unlikely event that the author did not send UMI a complete manuscript and there are missing pages, these will be noted. Also, if unauthorized copyright material had to be removed, a note will indicate the deletion. Oversize materials (e.g., maps, drawings, charts) are reproduced by sectioning the original, beginning at the upper left-band comer and continuing from left to right in equal sections with small overlaps. Each original is also photographed in one exposure and is included in reduced form at the back of the book. Photographs included in the original manuscript have been reproduced xerographically in this copy. Higher quality 6" x 9" black and white photographic prints are available for any photographs or illustrations appearing in this copy for an additional charge. Contact UMI directly to order. U·M·I University Microfilms tnternauonat A Bell & Howell tntorrnatron Company 300 North Zeeb Road. Ann Arbor. M148106-1346 USA 313/761-4700 800:521·0600 Order Number 9506222 The linguistic and psycholinguistic nature of kanji: Do kanji represent and trigger only meanings? Matsunaga, Sachiko, Ph.D. University of Hawaii, 1994 Copyright @1994 by Matsunaga, Sachiko. -
Assessment of Options for Handling Full Unicode Character Encodings in MARC21 a Study for the Library of Congress
1 Assessment of Options for Handling Full Unicode Character Encodings in MARC21 A Study for the Library of Congress Part 1: New Scripts Jack Cain Senior Consultant Trylus Computing, Toronto 1 Purpose This assessment intends to study the issues and make recommendations on the possible expansion of the character set repertoire for bibliographic records in MARC21 format. 1.1 “Encoding Scheme” vs. “Repertoire” An encoding scheme contains codes by which characters are represented in computer memory. These codes are organized according to a certain methodology called an encoding scheme. The list of all characters so encoded is referred to as the “repertoire” of characters in the given encoding schemes. For example, ASCII is one encoding scheme, perhaps the one best known to the average non-technical person in North America. “A”, “B”, & “C” are three characters in the repertoire of this encoding scheme. These three characters are assigned encodings 41, 42 & 43 in ASCII (expressed here in hexadecimal). 1.2 MARC8 "MARC8" is the term commonly used to refer both to the encoding scheme and its repertoire as used in MARC records up to 1998. The ‘8’ refers to the fact that, unlike Unicode which is a multi-byte per character code set, the MARC8 encoding scheme is principally made up of multiple one byte tables in which each character is encoded using a single 8 bit byte. (It also includes the EACC set which actually uses fixed length 3 bytes per character.) (For details on MARC8 and its specifications see: http://www.loc.gov/marc/.) MARC8 was introduced around 1968 and was initially limited to essentially Latin script only. -
Halfwidth and Fullwidth Forms Range: FF00–FFEF
Halfwidth and Fullwidth Forms Range: FF00–FFEF This file contains an excerpt from the character code tables and list of character names for The Unicode Standard, Version 14.0 This file may be changed at any time without notice to reflect errata or other updates to the Unicode Standard. See https://www.unicode.org/errata/ for an up-to-date list of errata. See https://www.unicode.org/charts/ for access to a complete list of the latest character code charts. See https://www.unicode.org/charts/PDF/Unicode-14.0/ for charts showing only the characters added in Unicode 14.0. See https://www.unicode.org/Public/14.0.0/charts/ for a complete archived file of character code charts for Unicode 14.0. Disclaimer These charts are provided as the online reference to the character contents of the Unicode Standard, Version 14.0 but do not provide all the information needed to fully support individual scripts using the Unicode Standard. For a complete understanding of the use of the characters contained in this file, please consult the appropriate sections of The Unicode Standard, Version 14.0, online at https://www.unicode.org/versions/Unicode14.0.0/, as well as Unicode Standard Annexes #9, #11, #14, #15, #24, #29, #31, #34, #38, #41, #42, #44, #45, and #50, the other Unicode Technical Reports and Standards, and the Unicode Character Database, which are available online. See https://www.unicode.org/ucd/ and https://www.unicode.org/reports/ A thorough understanding of the information contained in these additional sources is required for a successful implementation. -
Pictograms: Purely Pictorial Symbols
toponymy course 10. Writing systems Ferjan Ormeling This chapter shows the semantic, phonetic and graphic aspects of language. It traces the development of the graphic aspects from Sumeria 3500BC, from logographic or ideographic via syllabic into alphabetic scripts resulting in a number of script families. It gives examples of the various scripts in maps (see underlined script names) and finally deals with combination of scripts on maps. 03/07/2011 1 Meaning, sound and looks What is the name? - Is it what it means? - Is it what it sounds? - Is it what it looks? 03/07/2011 2 Meaning: the semantic aspect As long as a name is what it means, both its sound and its looks (spelling) are only relevant as far as they support the meaning. The name ‘Nederland’ is wat it means to our neighbours: ‘Low country’. So they translate it for instance into ‘Pays- Bas’ (French) or Netherlands (English) or Niederlande (German), even though that does neither sound nor look like ‘Nederland’. To Romans and Italians, however, the names Qart Hadasht and Neapolis had never been what they meant (‘new towns’). So they became what they sounded: Carthago (to the Roman ear) and Napoli. 03/07/2011 3 Sound: the phonetic aspect As soon as a name is what it sounds, its meaning has become irrelevant. Its looks (spelling) then may or may not be adapted to its sound. The dominance of the phonetic aspect to a name (the ‘oral tradition’) allows the name to degenerate graphically. Eventually, the name may be adapted semantically to its perceived sound, through a process called ‘popular etymology’. -
The Origin of the Alphabet: an Examination of the Goldwasser Hypothesis
Colless, Brian E. The origin of the alphabet: an examination of the Goldwasser hypothesis Antiguo Oriente: Cuadernos del Centro de Estudios de Historia del Antiguo Oriente Vol. 12, 2014 Este documento está disponible en la Biblioteca Digital de la Universidad Católica Argentina, repositorio institucional desarrollado por la Biblioteca Central “San Benito Abad”. Su objetivo es difundir y preservar la producción intelectual de la Institución. La Biblioteca posee la autorización del autor para su divulgación en línea. Cómo citar el documento: Colless, Brian E. “The origin of the alphabet : an examination of the Goldwasser hypothesis” [en línea], Antiguo Oriente : Cuadernos del Centro de Estudios de Historia del Antiguo Oriente 12 (2014). Disponible en: http://bibliotecadigital.uca.edu.ar/repositorio/revistas/origin-alphabet-goldwasser-hypothesis.pdf [Fecha de consulta:..........] . 03 Colless - Alphabet_Antiguo Oriente 09/06/2015 10:22 a.m. Página 71 THE ORIGIN OF THE ALPHABET: AN EXAMINATION OF THE GOLDWASSER HYPOTHESIS BRIAN E. COLLESS [email protected] Massey University Palmerston North, New Zealand Summary: The Origin of the Alphabet Since 2006 the discussion of the origin of the Semitic alphabet has been given an impetus through a hypothesis propagated by Orly Goldwasser: the alphabet was allegedly invented in the 19th century BCE by illiterate Semitic workers in the Egyptian turquoise mines of Sinai; they saw the picturesque Egyptian inscriptions on the site and borrowed a number of the hieroglyphs to write their own language, using a supposedly new method which is now known by the technical term acrophony. The main weakness of the theory is that it ignores the West Semitic acrophonic syllabary, which already existed, and contained most of the letters of the alphabet. -
A STUDY of WRITING Oi.Uchicago.Edu Oi.Uchicago.Edu /MAAM^MA
oi.uchicago.edu A STUDY OF WRITING oi.uchicago.edu oi.uchicago.edu /MAAM^MA. A STUDY OF "*?• ,fii WRITING REVISED EDITION I. J. GELB Phoenix Books THE UNIVERSITY OF CHICAGO PRESS oi.uchicago.edu This book is also available in a clothbound edition from THE UNIVERSITY OF CHICAGO PRESS TO THE MOKSTADS THE UNIVERSITY OF CHICAGO PRESS, CHICAGO & LONDON The University of Toronto Press, Toronto 5, Canada Copyright 1952 in the International Copyright Union. All rights reserved. Published 1952. Second Edition 1963. First Phoenix Impression 1963. Printed in the United States of America oi.uchicago.edu PREFACE HE book contains twelve chapters, but it can be broken up structurally into five parts. First, the place of writing among the various systems of human inter communication is discussed. This is followed by four Tchapters devoted to the descriptive and comparative treatment of the various types of writing in the world. The sixth chapter deals with the evolution of writing from the earliest stages of picture writing to a full alphabet. The next four chapters deal with general problems, such as the future of writing and the relationship of writing to speech, art, and religion. Of the two final chapters, one contains the first attempt to establish a full terminology of writing, the other an extensive bibliography. The aim of this study is to lay a foundation for a new science of writing which might be called grammatology. While the general histories of writing treat individual writings mainly from a descriptive-historical point of view, the new science attempts to establish general principles governing the use and evolution of writing on a comparative-typological basis. -
Japanese Input Method Independent of Applications
JAPANESE INPUT METHOD INDEPENDENT OF APPLICATIONS By Takahide Honma, Hiroyoshi Baba, and Kuniaki Takizawa ABSTRACT The Japanese input method is a complex procedure involving preediting operations. An application that accepts Japanese from an input device must have three systems for the input method: a keybinding system, a manipulator for preediting, and a kana-to-kanji conversion system. Various keybinding systems and manipulators accelerate input operations. Our implementation separates an application from the Japanese input method in three layers. An application can use a front-end input processor to perform all operations including I/O. An application can use the henkan (conversion) module and implement I/O operation itself. An application can execute all operations except keybinding, which is handled by an input method library. INTRODUCTION In this paper, we first present an overview of the technical environment of the Japanese input method implementation. Based on this overview, we briefly describe the critical engineering issues for conversion of Digital's products for the Japanese user. Our most critical engineering issue was the reduction of similar (but slightly different) work to localize products. Another issue was to satisfy customers' requests for the ability to use the many input styles familiar to them. We describe our approach to the development of a Japanese input method that overcomes these issues by separating the input method from an application in three layers. OVERVIEW OF THE JAPANESE INPUT METHOD In this section, we describe Japanese input and string manipulation from the perspective of both the user and the application. Based on these descriptions, we present a brief overview of reengineering a product for Japanese users and a summary of the industry's complex techniques developed for Japanese input methods. -
A Study of Loan Color Terms Collocation in Modern Japanese
A Study of Loan Color Terms Collocation in Modern Japanese Anna V. Bordilovskaya ([email protected]) Graduate School of Humanities, Kobe University, 1-1 Rokkodai-machi, Nada-ku, Kobe 657-8501 JAPAN Abstract English loanwords in Japanese have been a topic of various studies by both native and foreign linguists for The Japanese lexicon consists of Japanese-origin words about 100 years. (WAGO), Chinese-origin words (KANGO) and words Some researchers are more interested in the assimilation borrowed from English and other European languages processes of loanwords (Kay, 1995; Irwin, 2011), other (GAIRAIGO). The acquisition of words from three sources linguists focus on semantic changes (Daulton, 2008), third results in the abundance of near synonyms without any clear rules when a particular synonym should be used. mainly study sociolinguistic background and functions Loveday has hypothesized that WAGO/KANGO and (Loveday, 1986, 1996). GAIRAIGO concrete nouns are used to address similar At present, the number of GAIRAIGO is increasing phenomena of Japanese and Western origins, respectively. rapidly and loanwords penetrate into different spheres of This is referred as Hypothesis of Foreign vs. Native life. Dictionaries (Katakanago Jiten Consaizu (The Concise Dichotomy (HFND). However, the matter of abstract nouns, Dictionary of Katakana Words), etc.) in most cases do not adjectivals and their collocations remains unstudied. In contrast to the previous studies, based on questionnaires, our state any clear differences in the meaning and usage for the approach stems from statistical analysis of corpus data. Our abovementioned near synonyms. results illuminate a distinguishable bias in the structure of On the other hand, the experience of studying and collocations – nouns and adjectivals of the same origin tend communicating in Japanese shows that it is not possible to to appear together more often than the ones of the different substitute WAGO/KANGO and GAIRAIGO near synonyms origins. -
Japanese Typing Tutorial (For Mac): コンピュータで日本語
JPN_Typing_Tutorial_Mac.doc Japanese Typing Tutorial (for Mac): コンピュータで日本語 1. Enable Japanese language features 1) In the Microsoft Office 2004 folder on your hard disk, open the Additional Tools folder, and then open the Microsoft Language Register folder. 2) Drag the Microsoft Word icon on top of the Microsoft Language Register utility. 3) In the Select the language to enable for pop-up menu, click Japanese. Note After running the Microsoft Language Register program, you should select the Japanese keyboard layout provided with the Apple OS, or a third-party keyboard layout of your choice. To do so, open System Preferences, click International, and then click the Input Menu tab. 4) In the menu bar at the top of the screen do you see American flag icon or “A” in the black box? Click the icon, and choose [あ] Hiragana. 2. How to Type ひらがな 2.1 Romaji Typing System When you type Japanese, you have to use Romanization because HIRAGANA or KATAKANA is not listed on your keyboard. But the Romanization you are supposed to use for the word processor is not quite the same as the one you learned in class. You have to interpret Romanization as a mere substitution for the written syllables, HIRAGANA. You first have to consider how the word you want to type is written in HIRAGANA and then transcribe each HIRAGANA character into corresponding romaji. After typing in the Romanization, hit ‘Enter’ key (do not hit ‘Space’ key before hitting ‘Enter’ key, that will convert HIRAGNA into Kanji). Hit space key after each word if you want to have a space. -
Section 18.1, Han
The Unicode® Standard Version 13.0 – Core Specification To learn about the latest version of the Unicode Standard, see http://www.unicode.org/versions/latest/. Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trade- mark claim, the designations have been printed with initial capital letters or in all capitals. Unicode and the Unicode Logo are registered trademarks of Unicode, Inc., in the United States and other countries. The authors and publisher have taken care in the preparation of this specification, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein. The Unicode Character Database and other files are provided as-is by Unicode, Inc. No claims are made as to fitness for any particular purpose. No warranties of any kind are expressed or implied. The recipient agrees to determine applicability of information provided. © 2020 Unicode, Inc. All rights reserved. This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction. For information regarding permissions, inquire at http://www.unicode.org/reporting.html. For information about the Unicode terms of use, please see http://www.unicode.org/copyright.html. The Unicode Standard / the Unicode Consortium; edited by the Unicode Consortium. — Version 13.0. Includes index. ISBN 978-1-936213-26-9 (http://www.unicode.org/versions/Unicode13.0.0/) 1. -
The Notion of Syllable Across History, Theories and Analysis
The Notion of Syllable across History, Theories and Analysis The Notion of Syllable across History, Theories and Analysis Edited by Domenico Russo The Notion of Syllable across History, Theories and Analysis Edited by Domenico Russo This book first published 2015 Cambridge Scholars Publishing Lady Stephenson Library, Newcastle upon Tyne, NE6 2PA, UK British Library Cataloguing in Publication Data A catalogue record for this book is available from the British Library Copyright © 2015 by Domenico Russo and contributors All rights for this book reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior permission of the copyright owner. ISBN (10): 1-4438-8054-X ISBN (13): 978-1-4438-8054-1 CONTENTS Preface ........................................................................................................ ix Domenico Russo Part I. The Dawn of the Syllable Chapter One ................................................................................................. 2 The Syllable in a Syntagmatic and Paradigmatic Perspective: The Cuneiform Writing in the II Millennium B.C. in Near East and Anatolian Paola Cotticelli Kurras Chapter Two .............................................................................................. 33 Syllables and Syllabaries: Evidence from Two Aegean Syllabic Scripts Carlo Consani Chapter Three ...........................................................................................