An Double Hidden HMM and an CRF for Segmentation Tasks with Pinyin's Finals


Huixing Jiang, Zhe Dong
Center for Intelligence Science and Technology
Beijing University of Posts and Telecommunications
Beijing, China

Abstract

We participated in the open tracks and closed tracks on four corpora of the Chinese word segmentation tasks in the CIPS-SIGHAN-2010 Bake-offs. In our experiments, we used Chinese inner phonology information in all tracks. For the open tracks, we proposed a double hidden layers' HMM (DHHMM) in which Chinese inner phonology information was used as one hidden layer and the BIO tags as another hidden layer. N-best results were first generated by the DHHMM, and the best one was then selected by a new lexical statistic measure. For the closed tracks, we used a CRF model in which the Chinese inner phonology information was used as features.

1 Introduction

The Chinese language has many characteristics not possessed by other languages. An obvious one is that written Chinese text does not have explicit word boundaries as Western languages do. Word segmentation has therefore become very significant for Chinese information processing, and is usually considered the first step of any further processing. Identifying words has been a basic task for many researchers who have devoted themselves to Chinese text processing.

The biggest characteristic of the Chinese language is its trinity of sound, form and meaning (Pan, 2002). Hanyu Pinyin is the form of sound for Chinese text, and the Chinese phonology information is explicitly expressed by Pinyin, which is the inner feature of Chinese characters. It naturally contributes to the identification of Out-Of-Vocabulary (OOV) words.

In our work, Chinese phonology information is used as a basic feature of Chinese characters in all models. For the open tracks, we propose a new double hidden layers' HMM in which phonology information is built in as a hidden layer, and a new lexical association measure is proposed to deal with the OOV and domain adaptation problems. For the closed tracks, a CRF model is used, combined with Chinese inner phonology information. We used the CRF++ package, version 0.43, by Taku Kudo¹.

In the rest of this paper, we first introduce Chinese phonology in Section 2. The models used in our tasks are presented in Section 3. The experiments and results are described in Section 4. Finally, we give the conclusions and discuss prospects for future work.

¹ http://crfpp.sourceforge.net/

2 Chinese Phonology

Hanyu Pinyin is the form of sound for Chinese text, and the Chinese phonology information is explicitly expressed by Pinyin. It is currently the most commonly used romanization system for Standard Mandarin. Hanyu means the Chinese language, and Pinyin means "phonetics", or more literally, "spelling sound" or "spelled sound" (wikipedia, 2010). The system has been employed to teach Mandarin as a home language or as a second language in China, Malaysia, Singapore and elsewhere. Pinyin has also become the most widely used input method for Chinese characters on computers and other devices.

The romanization system was developed by a government committee in the People's Republic of China and approved by the Chinese government on February 11, 1958. The International Organization for Standardization adopted pinyin as the international standard in 1982, and since then it has been adopted by many other organizations (wikipedia, 2010). In this system, Pinyin is composed of initials (pinyin: shengmu), finals (pinyin: yunmu) and tones (pinyin: shengdiao), instead of the consonants and vowels used in European languages. For example, the Pinyin of "中" is "zhong1", composed of "zh", "ong" and "1", where "zh" is the initial, "ong" is the final and "1" is the tone.

Every language has its rhythm and rhyme, and Chinese is no exception. The rhythm system is the driving force arising from the unconscious habits of language (Edward, 1921). Pinyin's finals contribute to the Chinese rhythm system, which is the basic assumption on which our research is based.

3 Algorithms

Generally, the task of segmentation can be viewed as a sequence labeling problem. We first define a tag set TS = {B, I, E, S}, shown in Table 1.

Table 1: The tag set used in this paper.

Label   Explanation
B       beginning character of a word
I       inner character of a word
E       end character of a word
S       a single character as a word

For the piece "是 英 国 前 王 妃 戴 安 娜" of the example described in the experiments section, the TS tags are first assigned to it, giving "是/S 英/B 国/E 前/S 王/B 妃/E 戴/B 安/I 娜/E". The tags are then combined sequentially to obtain the final result "是 英国 前 王妃 戴安娜".

In this section, a novel HMM solution is first presented for the open tracks. Then the CRF solution for the closed tracks is introduced.

3.1 Double hidden layers' HMM

Let $X = x_1 x_2 \ldots x_T$ be a given Chinese sentence, where $x_i$, $i = 1, \ldots, T$, is a Chinese character. Suppose that we can give each Chinese character $x_i$ a Pinyin's final $y_i$, and that the label sequence of $X$ is $S = s_1 s_2 \ldots s_T$, where $s_i \in TS$ is the tag of $x_i$. What we want to find is an optimal tag sequence $S^*$, defined in (1).

$$S^* = \arg\max_S P(S, Y \mid X) = \arg\max_S P(X \mid S, Y)\, P(S, Y) \tag{1}$$

The model is described in Fig. 1. For a given string of Chinese characters, one hidden layer is the label sequence $S$, the other hidden layer is the Pinyin's finals sequence $Y$, and the observation layer is the given string of Chinese characters $X$.

Figure 1: Double Hidden Markov Model

For the transition probability, a second-order Markov model is used to estimate the probability of the double hidden sequences, as described in (2).

$$P(S, Y) = \prod_t p(s_t, y_t \mid s_{t-1}, y_{t-1}) \tag{2}$$

For the emission probability, we keep the first-order Markov assumption, as shown in (3).

$$P(X \mid S, Y) = \prod_t p(x_t \mid s_t, y_t) \tag{3}$$

3.1.1 N-best results

Based on the work of (Jiang, 2010), a word lattice is first built. In the second step, the backward A* algorithm is used to find the top N results, instead of the backward Viterbi algorithm, which finds only the top one. The backward A* search algorithm is described as follows (Wang, 2002; Och, 2001).

3.1.2 Reranking with a new lexical statistic measure

Given two random Chinese characters X and Y, assume that they appear in an aligned region of the corpus. The distribution of the two characters can be depicted by the 2 by 2 contingency table shown in Fig. 2 (Chang, 2002).

Figure 2: A 2 by 2 contingency table

In Fig. 2, a is the number of times X and Y co-occur; b is the number of cases in which X occurs but Y does not; c is the number of cases in which X does not occur but Y does; d is the number of cases in which neither X nor Y occurs. The log-likelihood ratio is calculated by (4).

$$LLR(x, y) = 2\left(a \log\frac{a \cdot N}{(a+b)(a+c)} + b \log\frac{b \cdot N}{(a+b)(b+d)} + c \log\frac{c \cdot N}{(c+d)(a+c)} + d \log\frac{d \cdot N}{(c+d)(b+d)}\right) \tag{4}$$

The N-best results described in Section 3.1.1 can be re-ranked by (5).

$$S^* = \arg\min_S \left(score_h(S) + \frac{\lambda}{K}\sum_{k=1}^{K} LLR(x_k, y_k)\right) \tag{5}$$

where $score_h$ is the negative log value of $P(S, Y \mid X)$, $K$ is the number of breaks in $X$, $x_k$ is the Chinese character to the left of the $k$-th break and $y_k$ is the Chinese character to its right. $\lambda$ is the regulatory factor (in our experiments $\lambda = 0.45$).

A bigger value of $LLR(x_k, y_k)$ means a stronger ability in combining of the two characters $x_k$ and …

…tation (Lafferty, 2001; Zhao, 2006). In the closed tracks of this paper, we also use it.

3.2.1 Feature templates

We adopted two main kinds of features: n-gram features and Pinyin's finals features. The n-gram feature set is quite orthodox, namely C-2, C-1, C0, C1, C2, C-2C-1, C-1C0, C0C1, C1C2. The Pinyin's finals feature set has the same structure as the n-gram feature set. Both are described in Table 2.

Table 2: Feature templates

Templates                       Category
C-2, C-1, C0, C1, C2            N-gram: Unigram
C-2C-1, C-1C0, C0C1, C1C2       N-gram: Bigram
P-2, P-1, P0, P1, P2            Phonetic: Unigram
P-2P-1, P-1P0, P0P1, P1P2       Phonetic: Bigram

4 Experiments and Results

4.1 Dataset

We build a basic word dictionary for the DHHMM and a Pinyin's finals dictionary for both the DHHMM and the CRF from The Grammatical Knowledge-base of Contemporary Chinese (Yu, 2001). For the finals dictionary, we give each Chinese character a final extracted from its Pinyin. When a character is a polyphone, we simply combine all of its finals into one. For example, "中{ong}", "差{a&ai&i}".

The training corpus (5,769 KB) we used is the Labeled Corpus provided by the organizer. We first add the Pinyin's finals to each Chinese character in it, and then train the parameters of the DHHMM and CRF models on it.

The test corpus contains four domains: Literature (A), Computer (B), Medicine (C) and Finance (D).
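The construction of the finals dictionary described in this dataset section can be illustrated with a short sketch. It is only a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy lexicon, the treatment of "y" and "w" as initials, and the first-seen ordering of a polyphone's finals are all hypothetical choices made here. It strips the tone digit and the initial from each tone-numbered reading and joins the distinct finals of a polyphone with "&".

```python
# Pinyin initials (shengmu). Two-letter initials come first so "zh" wins over "z".
# Treating "y" and "w" as initials is a simplification assumed for this sketch.
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")


def extract_final(syllable):
    """Return the final (yunmu) of a tone-numbered syllable such as 'zhong1'."""
    base = syllable.rstrip("012345")  # drop the trailing tone digit
    for initial in INITIALS:
        if base.startswith(initial):
            return base[len(initial):]
    return base  # zero-initial syllable such as 'an4'


def combined_final(readings):
    """Join the distinct finals of all readings with '&', in first-seen order."""
    finals = []
    for reading in readings:
        final = extract_final(reading)
        if final not in finals:
            finals.append(final)
    return "&".join(finals)


def annotate(sentence, readings_by_char):
    """Pair each character with its combined final; unknown characters get ''."""
    return [(ch, combined_final(readings_by_char.get(ch, []))) for ch in sentence]


# Hypothetical toy lexicon. The paper derives its dictionary from
# The Grammatical Knowledge-base of Contemporary Chinese instead.
TOY_READINGS = {
    "中": ["zhong1", "zhong4"],
    "差": ["cha1", "cha4", "chai1", "ci1"],
    "国": ["guo2"],
}

if __name__ == "__main__":
    for char, final in annotate("中国差", TOY_READINGS):
        print(f"{char}{{{final}}}")
```

With this toy lexicon, the two polyphones from the text come out as "中{ong}" and "差{a&ai&i}". Collapsing a polyphone's finals into a single symbol keeps one hidden value (or one feature value) per character, at the cost of not disambiguating the reading in context.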