Chinese Character Decomposition for Neural MT with Multi-Word Expressions


Lifeng Han¹, Gareth J. F. Jones¹, Alan F. Smeaton² and Paolo Bolzoni
¹ ADAPT Research Centre, ² Insight Centre for Data Analytics
School of Computing, Dublin City University, Dublin, Ireland

Abstract

Chinese character decomposition has been used as a feature to enhance Machine Translation (MT) models, combining radicals into character and word level models. Recent work has investigated ideograph or stroke level embedding. However, questions remain about the different decomposition levels of Chinese character representations, radical and strokes, best suited for MT. To investigate the impact of Chinese decomposition embedding in detail, i.e., radical, stroke, and intermediate levels, and how well these decompositions represent the meaning of the original character sequences, we carry out analysis with both automated and human evaluation of MT. Furthermore, we investigate if the combination of decomposed Multiword Expressions (MWEs) can enhance model learning. MWE integration into MT has seen more than a decade of exploration. However, decomposed MWEs have not previously been explored.

1 Introduction

Neural Machine Translation (NMT) (Cho et al., 2014; Johnson et al., 2016; Vaswani et al., 2017; Lample and Conneau, 2019) has recently replaced Statistical Machine Translation (SMT) (Brown et al., 1993; Och and Ney, 2003; Chiang, 2005; Koehn, 2010) as the state-of-the-art for Machine Translation (MT). However, research questions still remain, such as how to deal with out-of-vocabulary (OOV) words, how best to integrate linguistic knowledge and how best to correctly translate multi-word expressions (MWEs) (Sag et al., 2002; Moreau et al., 2018; Han et al., 2020a). For OOV word translation for European languages, substantial improvements have been made in terms of rare and unseen words by incorporating sub-word knowledge using Byte Pair Encoding (BPE) (Sennrich et al., 2016). However, such methods cannot be directly applied to Chinese, Japanese and other ideographic languages.

Integrating sub-character level information, such as Chinese ideographs and radicals, as learning knowledge has been used to enhance features in NMT systems (Han and Kuang, 2018; Zhang and Matsumoto, 2018; Zhang and Komachi, 2018). Han and Kuang (2018), for example, explain that the meaning of some unseen or low frequency Chinese characters can be estimated and translated using radicals decomposed from the Chinese characters, as long as the learning model can acquire knowledge of these radicals within the training corpus.

Chinese characters often include two pieces of information, with semantics encoded within radicals and a phonetic part. The phonetic part is related to the pronunciation of the overall character, either the same or similar. For instance, Chinese characters with the two-stroke radical 刂 (tí dāo páng) ordinarily relate to knife in meaning, such as the Chinese character 劍 (jiàn, sword) and the multi-character expression 鋒利 (fēnglì, sharp). The radical 刂 (tí dāo páng) preserves the meaning of knife because it is a variation of a drawing of a knife evolving from the original bronze inscription (Fig. 4 in Appendices).

Not only can the radical part of a character be decomposed into smaller fragments of strokes but the phonetic part can also be decomposed. Thus there are often several levels of decomposition that can be applied to Chinese characters by combining different levels of decomposition of each part of the Chinese character. As one example, Figure 1 shows the three decomposition levels from our model and the full stroke form of the above mentioned characters 劍 (jiàn) and 鋒 (fēng). To date, little work has been carried out to investigate the full potential of these alternative levels of decomposition of Chinese characters for the purpose of Machine Translation (MT).

In this work, we investigate Chinese character decomposition, and another area related to Chinese characters, namely Chinese MWEs. We first investigate translation at increasing levels of decomposition of Chinese characters using underlying radicals, as well as the additional Chinese character strokes (corresponding to ever-smaller units), breaking down characters into component parts as this is likely to reduce the number of unknown words. Then, in order to better deal with MWEs which have a common occurrence in general contexts (Sag et al., 2002), and working in the opposite direction in terms of meaning representation, we investigate translating larger units of Chinese text, with the aim of restricting translation of larger groups of Chinese characters that should be translated together as one unit. In addition to investigating the effects of decomposing characters we simultaneously apply methods of incorporating MWEs into translation. MWEs can appear in Chinese in a range of ways, such as fixed (or semi-fixed) expressions, metaphor, idiomatic phrases, and institutional, personal or location names, amongst others.

In summary, in this paper, we investigate: (i) the degree to which Chinese radical and stroke sequences represent the original word and character sequences that they are composed of; (ii) the difference in performance achieved by each decomposition level; (iii) the effect of radical and stroke representations in MWEs for MT. Furthermore, we offer:

• an open-source suite of Chinese character decomposition extraction tools;
• a Chinese ↔ English MWE corpus where Chinese characters have been decomposed, available at radical4mt¹.

¹ https://github.com/poethan/MWE4MT

The rest of this paper is organized as follows: Section 2 provides details of related work in character and radical related MT; Sections 3 and 4 introduce our Chinese decomposition procedure into radical and strokes, and our experimental design; Section 5 provides details of our evaluations from both automatic and human perspectives; Section 6 describes conclusions and plans for future work.

2 Related Work

Chinese character decomposition has been explored recently for MT. For instance, Han and Kuang (2018) and Zhang and Matsumoto (2018) considered radical embeddings as additional features for Chinese → English and Japanese ↔ Chinese NMT. Han and Kuang (2018) tested a range of encoding models including word+character, word+radical, and word+character+radical. This final setting with word+character+radical achieved the best performance on a standard NIST² MT evaluation data set for Chinese → English. Furthermore, Zhang and Matsumoto (2018) applied radical embeddings as additional features to character level LSTM-based NMT on Japanese → Chinese translation. None of the aforementioned work has however investigated the performance of decomposed character sequences and the effects of varied decomposition degrees in combination with MWEs.

² https://www.nist.gov/programs-projects/machine-translation

Subsequently, Zhang and Komachi (2018) developed bidirectional English ↔ Japanese, English ↔ Chinese and Chinese ↔ Japanese NMT with word, character, ideograph (the phonetics and semantics parts of characters are separated) and stroke levels, with experiments showing that the ideograph level was best for ZH→EN MT, while the stroke level was best for JP→EN MT. Although their ideograph and stroke level setting replaced the original character and word sequences, there was no investigation of intermediate decomposition performance, and they only used BLEU score for automated evaluation with no human assessment involved. This gives us inspiration to explore the performance of intermediate level embedding between ideograph and strokes for the MT task.

3 Chinese Character Decomposition

In this section, we introduce a character decomposition approach and the extraction tools which we apply in this work (code will be publicly available).

We utilize the open source IDS dictionary³ which was derived from the CHISE (CHaracter Information Service Environment) project⁴. It comprises 88,940 Chinese characters from CJK (Chinese, Japanese, Korean script) Unified Ideographs and the corresponding decomposition sequences of each character. Most characters are decomposed as a single sequence, but characters can have up to four possible decomposed representations. The reason for this is that the character can come from different resources, such as Chinese Hanzi (G, H, T for Mainland, Hong Kong, and Taiwan), Japanese Kanji (J), Korean Hanja (K), and Vietnamese ChuNom (V), etc. Even though they have the same root of Hanzi, the historical development of languages and writing systems in different territories has resulted in certain degrees of variation in their

³ https://github.com/cjkvi/cjkvi-ids
⁴ http://www.chise.org/

Figure 1: Examples of the decomposition of Chinese characters.

             | 劍 (jiàn) | 鋒 (fēng)
Level-1:     | 僉 (phonetic, qiān) ⺉ (semantic, knife) | ⾦ (semantic, metal) 夆 (phonetic, féng)
Level-2:     | 亼吅从 ⺉ | ⼈王丷 夂丰
Level-3:     | ⼈⼀⼝⼝⼈⼈ ⺉ | ⼈⼀⼟丷 夂三⼁
…            | … | …
Full-stroke: | ⼃㇏⼀⼁�⼀⼁�⼀⼃㇏⼃㇏ ⼁⼅ | ⼃㇏⼀⼀⼁⼂㇀⼀ ㇀㇇㇏⼀⼀⼀⼁

Figure 2: Character examples from IDS dictionary; the grey parts of decomposition graphs represent the construction structure of the character.

Character | Decomposition | Decomposition
丽 (lì) | ⿱⼀⿰⿵⼌⼂⿵⼌⼂ [G] | ⿰⿱⼀⿵⼌⼂⿱⼀⿵⼌⼂ [T]
具 (jù) | ⿱⿴且⼀八 [GTKV] | ⿳⽬⼀八 [J]
函 (hán) | ⿶⼐⿻了⿱丷八 [GTV] | ⿶⼐⿻丂⿱丷八 [JK]
勇 (yǒng) | ⿱甬⼒ [GTV] | ⿱⿱龴⽥⼒ [JK]

Character construction: ⿱: up-down, ⿰: left-right, ⿵⿶⿴: inside-outside, ⿻: embedded
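To make the IDS entries of Figure 2 more concrete, the following is a minimal Python sketch of how such entries could be processed. It is not the authors' extraction tool. It assumes only what the figure shows: each entry is a prefix-notation sequence built from the Unicode Ideographic Description Characters (U+2FF0 to U+2FFB), optionally followed by a source tag such as [JK]. The function names (`split_tag`, `parse_ids`, `leaves`, `expand`) and the two-entry `TOY_TABLE` are hypothetical and introduced purely for illustration; the toy entries are hand-written to mirror part of Figure 1 and are not copied from the dictionary.

```python
import re

# Ideographic Description Characters U+2FF0..U+2FFB. Two of them
# (U+2FF2, U+2FF3) combine three components; the rest combine two.
IDC_ARITY = {chr(cp): 2 for cp in range(0x2FF0, 0x2FFC)}
IDC_ARITY["\u2ff2"] = 3
IDC_ARITY["\u2ff3"] = 3

# Source tags such as [G], [JK] or [GTKV] trailing an IDS entry.
TAG_RE = re.compile(r"\[([A-Z]+)\]$")


def split_tag(entry):
    """Split 'IDS[JK]' into ('IDS', 'JK'); the tag is '' if absent."""
    match = TAG_RE.search(entry)
    if match is None:
        return entry, ""
    return entry[:match.start()], match.group(1)


def parse_ids(ids):
    """Parse a prefix-notation IDS string into a nested tuple tree."""
    def parse(pos):
        if pos >= len(ids):
            raise ValueError("truncated IDS: " + ids)
        ch = ids[pos]
        pos += 1
        if ch not in IDC_ARITY:
            return ch, pos  # leaf component
        children = []
        for _ in range(IDC_ARITY[ch]):
            child, pos = parse(pos)
            children.append(child)
        return (ch, *children), pos

    tree, end = parse(0)
    if end != len(ids):
        raise ValueError("trailing characters in IDS: " + ids)
    return tree


def leaves(tree):
    """Return the leaf components of a parsed tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    result = []
    for child in tree[1:]:
        result.extend(leaves(child))
    return result


def expand(text, table, depth):
    """Replace each character by its leaf components, `depth` times.

    Characters without an entry in `table` are kept unchanged.
    """
    for _ in range(depth):
        out = []
        for ch in text:
            ids = table.get(ch)
            out.extend(leaves(parse_ids(ids)) if ids else [ch])
        text = "".join(out)
    return text


# One entry copied from Figure 2 (the [JK] variant of 勇).
YONG_JK = "⿱⿱龴⽥⼒[JK]"

# Hypothetical table, hand-written to mirror part of Figure 1.
FENG = "鋒"
TOY_TABLE = {FENG: "⿰金夆", "夆": "⿱夂丰"}
```

Under these assumptions, `split_tag(YONG_JK)` separates the sequence from its source tag, `leaves(parse_ids(...))` yields the three components of the [JK] variant, and `expand(FENG, TOY_TABLE, 2)` applies two rounds of decomposition in the spirit of Level-1 and Level-2 in Figure 1. Because the toy table has no entry for 金, that component stays unexpanded, whereas Figure 1 decomposes it further.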