Character-Level Dependencies in Chinese: Usefulness and Learning

Total Page:16

File Type:pdf, Size:1020Kb

Character-Level Dependencies in Chinese: Usefulness and Learning Character-Level Dependencies in Chinese: Usefulness and Learning Hai Zhao Department of Chinese, Translation and Linguistics City University of Hong Kong Tat Chee Avenue, Kowloon, Hong Kong, China [email protected] Abstract brings the argument about how to determine the word-hood in Chinese. Linguists’ views about We investigate the possibility of exploit- what is a Chinese word diverge so greatly that ing character-based dependency for Chi- multiple word segmentation standards have been nese information processing. As Chinese proposed for computational linguistics tasks since text is made up of character sequences the first Bakeoff (Bakeoff-1, or Bakeoff-2003)3 rather than word sequences, word in Chi- (Sproat and Emerson, 2003). nese is not so natural a concept as in En- Up to Bakeoff-4, seven word segmentation stan- glish, nor is word easy to be defined with- dards have been proposed. However, this does not out argument for such a language. There- effectively solve the open problem what a Chi- fore we propose a character-level depen- nese word should exactly be but raises another is- dency scheme to represent primary lin- sue: what a segmentation standard should be se- guistic relationships within a Chinese sen- lected for the successive application. As word tence. The usefulness of character depen- often plays a basic role for the further language dencies are verified through two special- processing, if it cannot be determined in a uni- ized dependency parsing tasks. The first fied way, then all successive tasks will be affected is to handle trivial character dependencies more or less. that are equally transformed from tradi- Motivated by dependency representation for tional word boundaries. The second fur- syntactic parsing since (Collins, 1999) that has thermore considers the case that annotated been drawn more and more interests in recent internal character dependencies inside a years, we suggest that character-level dependen- word are involved. Both of these results cies can be adopted to alleviate this difficulty in from character-level dependency parsing Chinese processing. If we regard traditional word are positive. This study provides an alter- boundary as a linear representation for neighbored native way to formularize basic character- characters, then character-level dependencies can and word-level representation for Chinese. provide a way to represent non-linear relations be- tween non-neighbored characters. To show that 1 Introduction character dependencies can be useful, we develop In many human languages, word can be naturally a parsing scheme for the related learning task and identified from writing. However, this is not the demonstrate its effectiveness. case for Chinese, for Chinese is born to be written The rest of the paper is organized as fol- in character1 sequence rather than word sequence, lows. The next section shows the drawbacks of namely, no natural separators such as blanks ex- the current word boundary representation through ist between words. As word does not appear in some language examples. Section 3 describes a natural way as most European languages2, it a character-level dependency parsing scheme for traditional word segmentation task and reports its 1 Character here stands for various tokens occurring in evaluation results. Section 4 verifies the useful- a naturally written Chinese text, including Chinese charac- ter(hanzi), punctuation, and foreign letters. However, Chi- ness of annotated character dependencies inside a nese characters often cover the most part. word. Section 5 looks into a few issues concern- 2Even in European languages, a naive but necessary method to properly define word is to list them all by hand. 3First International Chinese Word Segmentation Bakeoff, Thank the first anonymous reviewer who points this fact. available at http://www.sighan.org/bakeoff2003. Proceedings of the 12th Conference of the European Chapter of the ACL, pages 879–887, Athens, Greece, 30 March – 3 April 2009. c 2009 Association for Computational Linguistics 879 ing the role of character dependencies. Section 6 This example shows that there will be in a concludes the paper. dilemma to perform segmentation over these char- acters. If a segmentation position locates before Ê 2 To Segment or Not: That Is the ‘Ò’(three) or ‘ ’(five), then this will make them Question meaningless or losing its original meaning at least because either of these two characters should log- Though most words can be unambiguously de- Ï ically follow the substring ‘´ ’ (week) to con- fined in Chinese text, some word boundaries are ÏÒ struct the expected word ‘´ ’(Wednesday) or not so easily determined. We show such three ex- Ï Ê ‘´ ’ (Friday). Otherwise, to make all the amples as the following. above five characters as a word will have to ig- The first example is from the MSRA segmented nore all these logical dependent relations among corpus of Bakeoff-2 (Bakeoff-2005) (Emerson, these characters and segment it later for a proper 2005): tackling as the above first example. Ü » ® ù ® ì ÊÆ é Ç ¬ ¬ è / / / / All these examples suggest that dependencies \| Ý ¼ Ð / / / exist between discontinuous characters, and word boundary representation is insufficient to handle a / piece of / “ / Beijing City Beijing Opera these cases. This motivates us to introduce char- OK Sodality / member / entrance / ticket / ” acter dependencies. As the guideline of MSRA standard requires any 3 Character-Level Dependency Parsing organization’s full name as a word, many long words in this form are frequently encountered. Character dependency is proposed as an alterna- Though this type of ‘words’ may be regarded as an tive to word boundary. The idea itself is extremely effective unit to some extent, some smaller mean- simple, character dependencies inside sequence ingful constituents can be still identified inside are annotated or formally defined in the similar them. Some researchers argue that these should way that syntactic dependencies over words are be seen as phrases rather than words. In fact, e.g., usually annotated. a machine translation system will have to segment We will initially develop a character-level de- this type of words into some smaller units for a pendency parsing scheme in this section. Es- proper translation. pecially, we show character dependencies, even The second example is from the PKU corpus of those trivial ones that are equally transformed Bakeoff-2, from pre-defined word boundaries, can be effec- tively captured in a parsing way. Á 7 Àê ¸ ¥ / / / 3.1 Formularization China / in / South Africa / embassy Using a character-level dependency representa- (the Chinese embassy in South Africa) tion, we first show how a word segmentation task can be transformed into a dependency parsing This example demonstrates how researchers can problem. Since word segmentation is traditionally also feel inconvenient if an organization name is formularized as an unlabeled character chunking segmented into pieces. Though the word ‘ task since (Xue, 2003), only unlabeled dependen- À ê ¸’(embassy) is right after ‘ ’(South Africa) cies are concerned in the transformation. There are in the above phrase, the embassy does not belong many ways to transform chunks in a sequence into to South Africa but China, and it is only located in dependency representation. However, for the sake South Africa. of simplicity, only well-formed and projective out- The third example is an abbreviation that makes put sequences are considered for our processing. use of the characteristics of Chinese characters. Borrowing the notation from (Nivre and Nils- son, 2005), an unlabeled dependency graph is for- Ï è Ò Ê ´ / / / mally defined as follows: Week / one / three / five An unlabeled dependency graph for a string (Monday, Wednesday and Friday) of cliques (i.e., words and characters) W = 880 3.2 Shift-reduce Parsing According to (McDonald and Nivre, 2007), all data-driven models for dependency parsing that Figure 1: Two character dependency schemes have been proposed in recent years can be de- scribed as either graph-based or transition-based. Since both dependency schemes that we construct w1...w is an unlabeled directed graph D = n for parsing are well-formed and projective, the lat- (W, A), where ter is chosen as the parsing framework for the sake (a) W is the set of ordered nodes, i.e. clique of efficiency. In detail, a shift-reduce method is tokens in the input string, ordered by a adopted as in (Nivre, 2003). linear precedence relation <, The method is step-wise and a classifier is used (b) A is a set of unlabeled arcs (wi, wj), to make a parsing decision step by step. In each 4 where wi, wj ∈ W , step, the classifier checks a clique pair , namely, TOP, the top of a stack that consists of the pro- If (w , w ) ∈ A, w is called the head of w i j i j cessed cliques, and, INPUT, the first clique in the and w a dependent of w . Traditionally, the no- j i unprocessed sequence, to determine if a dependent tation w → w means (w , w ) ∈ A; w →∗ i j i j i relation should be established between them. Be- w denotes the reflexive and transitive closure of j sides two arc-building actions, a shift action and a the (unlabeled) arc relation. We assume that the reduce action are also defined, as follows, designed dependency structure satisfies the fol- lowing common constraints in existing literature Left-arc: Add an arc from INPUT to TOP and (Nivre, 2006). pop the stack. Right-arc: Add an arc from TOP to INPUT and (1) D is weakly connected, that is, the cor- push INPUT onto the stack. responding undirected graph is connected. Reduce: Pop TOP from the stack. (CONNECTEDNESS) Shift: Push INPUT onto the stack. (2) The graph D is acyclic, i.e., if wi → wj then In this work, we adopt a left-to-right arc-eager ∗ not wj → wi.
Recommended publications
  • Cantonese Vs. Mandarin: a Summary
    Cantonese vs. Mandarin: A summary JMFT October 21, 2015 This short essay is intended to summarise the similarities and differences between Cantonese and Mandarin. 1 Introduction The large geographical area that is referred to as `China'1 is home to many languages and dialects. Most of these languages are related, and fall under the umbrella term Hanyu (¡£), a term which is usually translated as `Chinese' and spoken of as though it were a unified language. In fact, there are hundreds of dialects and varieties of Chinese, which are not mutually intelligible. With 910 million speakers worldwide2, Mandarin is by far the most common dialect of Chinese. `Mandarin' or `guanhua' originally referred to the language of the mandarins, the government bureaucrats who were based in Beijing. This language was based on the Bejing dialect of Chinese. It was promoted by the Qing dynasty (1644{1912) and later the People's Republic (1949{) as the country's lingua franca, as part of efforts by these governments to establish political unity. Mandarin is now used by most people in China and Taiwan. 3 Mandarin itself consists of many subvarities which are not mutually intelligible. Cantonese (Yuetyu (£) is named after the city Canton, whose name is now transliterated as Guangdong. It is spoken in Hong Kong and Macau (with a combined population of around 8 million), and, owing to these cities' former colonial status, by many overseas Chinese. In the rest of China, Cantonese is relatively rare, but it is still sometimes spoken in Guangzhou. 2 History and etymology It is interesting to note that the Cantonese name for Cantonese, Yuetyu, means `language of the Yuet people'.
    [Show full text]
  • Assessment of the Transition from Pinyin to Hanzi in University Level Introductory Chinese Language Courses
    Old Dominion University ODU Digital Commons OTS Master's Level Projects & Papers STEM Education & Professional Studies 12-2014 Assessment of the Transition from Pinyin to Hanzi in University Level Introductory Chinese Language Courses Ronnie Hess Old Dominion University Follow this and additional works at: https://digitalcommons.odu.edu/ots_masters_projects Part of the Education Commons Recommended Citation Hess, Ronnie, "Assessment of the Transition from Pinyin to Hanzi in University Level Introductory Chinese Language Courses" (2014). OTS Master's Level Projects & Papers. 3. https://digitalcommons.odu.edu/ots_masters_projects/3 This Master's Project is brought to you for free and open access by the STEM Education & Professional Studies at ODU Digital Commons. It has been accepted for inclusion in OTS Master's Level Projects & Papers by an authorized administrator of ODU Digital Commons. For more information, please contact [email protected]. ASSESSMENT OF THE TRANSITION FROM PINYIN TO HANZI IN UNIVERSITY LEVEL INTRODUCTORY CHINESE LANGUAGE COURSES A Research Study Presented to the Graduate Faculty of The Department of STEM Education and Professional Studies At Old Dominion University In Partial Fulfillment of the Requirement for the Degree Master of Science in Instructional Design and Technology By Ronnie Hess December, 2014 i SIGNATURE PAGE This research paper was prepared by Ronnie Alan Hess II under the direction of Dr. John Ritz for SEPS 636, Problems in Occupational and Technical Studies. It was submitted as partial fulfillment of the requirements for the Master of Science degree. Approved By: ___________________________________ Date: ____________ Dr. John Ritz Research Paper Advisor ii ACKNOWLEDGEMENTS I would like to thank Dr.
    [Show full text]
  • A Comparative Analysis of the Simplification of Chinese Characters in Japan and China
    CONTRASTING APPROACHES TO CHINESE CHARACTER REFORM: A COMPARATIVE ANALYSIS OF THE SIMPLIFICATION OF CHINESE CHARACTERS IN JAPAN AND CHINA A THESIS SUBMITTED TO THE GRADUATE DIVISION OF THE UNIVERSITY OF HAWAI‘I AT MĀNOA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF ARTS IN ASIAN STUDIES AUGUST 2012 By Kei Imafuku Thesis Committee: Alexander Vovin, Chairperson Robert Huey Dina Rudolph Yoshimi ACKNOWLEDGEMENTS I would like to express deep gratitude to Alexander Vovin, Robert Huey, and Dina R. Yoshimi for their Japanese and Chinese expertise and kind encouragement throughout the writing of this thesis. Their guidance, as well as the support of the Center for Japanese Studies, School of Pacific and Asian Studies, and the East-West Center, has been invaluable. i ABSTRACT Due to the complexity and number of Chinese characters used in Chinese and Japanese, some characters were the target of simplification reforms. However, Japanese and Chinese simplifications frequently differed, resulting in the existence of multiple forms of the same character being used in different places. This study investigates the differences between the Japanese and Chinese simplifications and the effects of the simplification techniques implemented by each side. The more conservative Japanese simplifications were achieved by instating simpler historical character variants while the more radical Chinese simplifications were achieved primarily through the use of whole cursive script forms and phonetic simplification techniques. These techniques, however, have been criticized for their detrimental effects on character recognition, semantic and phonetic clarity, and consistency – issues less present with the Japanese approach. By comparing the Japanese and Chinese simplification techniques, this study seeks to determine the characteristics of more effective, less controversial Chinese character simplifications.
    [Show full text]
  • Reading Depends on Writing, in Chinese
    Reading depends on writing, in Chinese Li Hai Tan*, John A. Spinks†, Guinevere F. Eden‡, Charles A. Perfetti§, and Wai Ting Siok*¶ *Department of Linguistics, and †Vice Chancellor’s Office, University of Hong Kong, Pokfulam Road, Hong Kong, China; ‡Georgetown University Medical Center, Washington, DC 20057; and §Learning Research and Development Center, University of Pittsburgh, Pittsburgh, PA 12560 Communicated by Robert Desimone, National Institutes of Health, Bethesda, MD, April 28, 2005 (received for review January 3, 2005) Language development entails four fundamental and interactive this argument are two characteristics of the Chinese language: (i) abilities: listening, speaking, reading, and writing. Over the past Spoken Chinese is highly homophonic, with a single syllable four decades, a large body of evidence has indicated that reading shared by many words, and (ii) the writing system encodes these acquisition is strongly associated with a child’s listening skills, homophonic syllables in its major graphic unit, the character. particularly the child’s sensitivity to phonological structures of Thus, when learning to read, a Chinese child is confronted with spoken language. Furthermore, it has been hypothesized that the the fact that a large number of written characters correspond to close relationship between reading and listening is manifested the same syllable (as depicted in Fig. 1A), and phonological universally across languages and that behavioral remediation information is insufficient to access semantics of a printed using strategies addressing phonological awareness alleviates character. reading difficulties in dyslexics. The prevailing view of the central In addition to these system-level (language and writing system) role of phonological awareness in reading development is largely factors, Chinese writing presents some script-level features that based on studies using Western (alphabetic) languages, which are distinguish it visually from alphabetic systems.
    [Show full text]
  • LANGUAGE CONTACT and AREAL DIFFUSION in SINITIC LANGUAGES (Pre-Publication Version)
    LANGUAGE CONTACT AND AREAL DIFFUSION IN SINITIC LANGUAGES (pre-publication version) Hilary Chappell This analysis includes a description of language contact phenomena such as stratification, hybridization and convergence for Sinitic languages. It also presents typologically unusual grammatical features for Sinitic such as double patient constructions, negative existential constructions and agentive adversative passives, while tracing the development of complementizers and diminutives and demarcating the extent of their use across Sinitic and the Sinospheric zone. Both these kinds of data are then used to explore the issue of the adequacy of the comparative method to model linguistic relationships inside and outside of the Sinitic family. It is argued that any adequate explanation of language family formation and development needs to take into account these different kinds of evidence (or counter-evidence) in modeling genetic relationships. In §1 the application of the comparative method to Chinese is reviewed, closely followed by a brief description of the typological features of Sinitic languages in §2. The main body of this chapter is contained in two final sections: §3 discusses three main outcomes of language contact, while §4 investigates morphosyntactic features that evoke either the North-South divide in Sinitic or areal diffusion of certain features in Southeast and East Asia as opposed to grammaticalization pathways that are crosslinguistically common.i 1. The comparative method and reconstruction of Sinitic In Chinese historical
    [Show full text]
  • The Challenge of Chinese Character Acquisition
    University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln Faculty Publications: Department of Teaching, Department of Teaching, Learning and Teacher Learning and Teacher Education Education 2017 The hC allenge of Chinese Character Acquisition: Leveraging Multimodality in Overcoming a Centuries-Old Problem Justin Olmanson University of Nebraska at Lincoln, [email protected] Xianquan Chrystal Liu University of Nebraska - Lincoln, [email protected] Follow this and additional works at: http://digitalcommons.unl.edu/teachlearnfacpub Part of the Bilingual, Multilingual, and Multicultural Education Commons, Chinese Studies Commons, Curriculum and Instruction Commons, Instructional Media Design Commons, Language and Literacy Education Commons, Online and Distance Education Commons, and the Teacher Education and Professional Development Commons Olmanson, Justin and Liu, Xianquan Chrystal, "The hC allenge of Chinese Character Acquisition: Leveraging Multimodality in Overcoming a Centuries-Old Problem" (2017). Faculty Publications: Department of Teaching, Learning and Teacher Education. 239. http://digitalcommons.unl.edu/teachlearnfacpub/239 This Article is brought to you for free and open access by the Department of Teaching, Learning and Teacher Education at DigitalCommons@University of Nebraska - Lincoln. It has been accepted for inclusion in Faculty Publications: Department of Teaching, Learning and Teacher Education by an authorized administrator of DigitalCommons@University of Nebraska - Lincoln. Volume 4 (2017)
    [Show full text]
  • Section 18.1, Han
    The Unicode® Standard Version 13.0 – Core Specification To learn about the latest version of the Unicode Standard, see http://www.unicode.org/versions/latest/. Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trade- mark claim, the designations have been printed with initial capital letters or in all capitals. Unicode and the Unicode Logo are registered trademarks of Unicode, Inc., in the United States and other countries. The authors and publisher have taken care in the preparation of this specification, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein. The Unicode Character Database and other files are provided as-is by Unicode, Inc. No claims are made as to fitness for any particular purpose. No warranties of any kind are expressed or implied. The recipient agrees to determine applicability of information provided. © 2020 Unicode, Inc. All rights reserved. This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction. For information regarding permissions, inquire at http://www.unicode.org/reporting.html. For information about the Unicode terms of use, please see http://www.unicode.org/copyright.html. The Unicode Standard / the Unicode Consortium; edited by the Unicode Consortium. — Version 13.0. Includes index. ISBN 978-1-936213-26-9 (http://www.unicode.org/versions/Unicode13.0.0/) 1.
    [Show full text]
  • Pinyin and Chinese Children's Phonological Awareness
    Pinyin and Chinese Children’s Phonological Awareness by Xintian Du A thesis submitted in conformity with the requirements For the degree of Master of Arts Department of Curriculum, Teaching and Learning Ontario Institute of Studies in Education University of Toronto @Copyright by Xintian Du 2010 ABSTRACT Pinyin and Chinese Children’s Phonological Awareness Master of Arts 2010 Xintian Du Department of Curriculum, Teaching and Learning University of Toronto This paper critically reviewed the literature on the relationships between Pinyin and Chinese bilingual and monolingual children’s phonological awareness (PA) and identified areas of research worth of further investigation. As the Chinese Phonetic Alphabet providing pronunciation of the universal Chinese characters, Pinyin facilitates children’s early reading development. What research has found in English is that PA is a reliable indicator of later reading success and meta-linguistic training improves PA. In Chinese, a non-alphabetic language, there is also evidence that PA predicts reading in Chinese, which confirms the universality of PA’s role. However, research shows the uniqueness of each language: tonal awareness is stronger indicator in Chinese while phonemic awareness is stronger indicator in English. Moreover, Pinyin, the meta-linguistic training, has been found to improve PA in Chinese and reading in Chinese and possibly facilitate the cross-language transfer of PA from Chinese to English and vice versa. ii ACKNOWLEDGEMENTS I am heartily thankful to my supervisors Becky Chen and Normand Labrie, whose guidance and support from the initial to the final level enabled me to develop a thorough understanding of the subject and eventually complete the thesis paper.
    [Show full text]
  • Chinese Character) Writing Skill
    THE GRAPHABET AND BUJIAN APPROACH AT ACQUIRING HANZI (CHINESE CHARACTER) WRITING SKILL CHEN-YA HUANG University of Hong Kong Abstract Learning how to write orthographically correct Hanzi (otherwise known as the Chinese character) is a major hurdle facing students studying Chinese. The difficulty arises from the visual complexity of Hanzi, the opaqueness, i.e. diminished correspondence of sound to orthography, and the traditional method of learning Hanzi, which is monopolized by rote repetitive copying, excessive demand on memory, and lacking of any means of creating an auditory memory of the structural organization of individual Hanzi. As a result, the novice student has to invest a great deal of time and effort trying to master Hanzi, and is often deterred from continuing study. Yet, despite the seriousness of the problem, very little research has been carried out on its solution. This paper proposes a new approach at improving the ability to write Hanzi, through understanding Hanzi as strings of subunits stacked in two-dimensional space, and composed from 21 high frequency recurring shapes herein called graphabets. The combined use of graphabets and bujian can provide a means of creating an auditory memory of the structural organiza- tion and significantly decrease the memory load through chunking, as well as facilitating the use of computer feedback for learning purpose. Keywords: Chinese character, character writing, orthography, teaching method, second language learn- ing. 1. WRITING HANZI, A MAJOR HURDLE IN LEARNING CHINESE The Chinese language is considered one of the hardest languages to learn by peo- ple who are not native to the language. As a result, there is a very high drop-out rate for Chinese as foreign language (CFL) students.
    [Show full text]
  • Writing Taiwanese: the Development of Modern Written Taiwanese
    SINO-PLATONIC PAPERS Number 89 January, 1999 Writing Taiwanese: The Development of Modern Written Taiwanese by Alvin Lin Victor H. Mair, Editor Sino-Platonic Papers Department of East Asian Languages and Civilizations University of Pennsylvania Philadelphia, PA 19104-6305 USA [email protected] www.sino-platonic.org SINO-PLATONIC PAPERS is an occasional series edited by Victor H. Mair. The purpose of the series is to make available to specialists and the interested public the results of research that, because of its unconventional or controversial nature, might otherwise go unpublished. The editor actively encourages younger, not yet well established, scholars and independent authors to submit manuscripts for consideration. Contributions in any of the major scholarly languages of the world, including Romanized Modern Standard Mandarin (MSM) and Japanese, are acceptable. In special circumstances, papers written in one of the Sinitic topolects (fangyan) may be considered for publication. Although the chief focus of Sino-Platonic Papers is on the intercultural relations of China with other peoples, challenging and creative studies on a wide variety of philological subjects will be entertained. This series is not the place for safe, sober, and stodgy presentations. Sino-Platonic Papers prefers lively work that, while taking reasonable risks to advance the field, capitalizes on brilliant new insights into the development of civilization. The only style-sheet we honor is that of consistency. Where possible, we prefer the usages of the Journal of Asian Studies. Sinographs (hanzi, also called tetragraphs [fangkuaizi]) and other unusual symbols should be kept to an absolute minimum. Sino-Platonic Papers emphasizes substance over form.
    [Show full text]
  • An Overview of Chinese Writing
    Lesson Title: Chinese Writing Class and Grade level(s): 7th Grade Exploring Languages and Cultures Class Goals and Objectives The student will be able to: The student will be able to: 1. Explain the differences between written and spoken Chinese. 2. Explain how the current Chinese writing system was created. 3. Identify and define simple Chinese characters. Time required/class periods needed: 1 class period = 48 minutes Primary source bibliography The Chinese language http://afe.easia.columbia.edu/tps/1000bce.htm#language Pictures of the terracotta warriors of Qin Shi Huangdi and the Yellow River Chinese newspaper Map of China Other resources used http://ce.linedict.com/#/cnen/home http://www.mandarintools.com http://www.creaders.net/ http://kids.asiasociety.org/ http://chinapage.com/ Required materials/supplies Projector and laptop PowerPoint presentation created by David Beal Internet connection Copies of a Chinese newspaper Handouts for students Small group flashcards to go with internet activity: man, king, ox, sky, star, sun, water, road house, city Large flashcards with Chinese characters (teacher’s choice) Vocabulary Tonal Qin Shi Huangdi Pictographs Transliteration Romanization Pinyin Wade-Giles Procedure As an anticipatory set, the teacher will hand out a copy of a Chinese newspaper printed from the internet. http://www.creaders.net/ 1. The teacher will ask the students to identify the language. The teacher will ask what it says and if anyone can read it. 2. The teacher will hold a class discussion about the differences between learning German, French and Spanish and learning Chinese. 3. The teacher will distribute the handouts for note-taking.
    [Show full text]
  • David Li-Wei Chen Handbook of Taiwanese Romanization
    DAVID LI-WEI CHEN HANDBOOK OF TAIWANESE ROMANIZATION DAVID LI-WEI CHEN CONTENTS PREFACE v HOW TO USE THIS BOOK 1 TAIWANESE PHONICS AND PEHOEJI 5 白話字(POJ) ROMANIZATION TAIWANESE TONES AND TONE SANDHI 23 SOME RULES FOR TAIWANESE ROMANIZATION 43 VERNACULAR 白 AND LITERARY 文 FORMS 53 FOR SAME CHINESE CHARACTERS CHIANG-CH旧漳州 AND CHOAN-CH旧泉州 63 DIALECTS WORDS DERIVED FROM TAIWANESE 65 AND HOKKIEN WORDS BORROWED FROM OTHER 69 LANGUAGES TAILO 台羅 ROMANIZATION 73 BODMAN ROMANIZATION 75 DAIGHI TONGIONG PINGIM 85 台語通用拼音ROMANIZATION TONGIONG TAIWANESE DICTIONARY 91 通用台語字典ROMANIZATION COMPARATIVE TABLES OF TAIWANESE 97 ROMANIZATION AND TAIWANESE PHONETIC SYMBOLS (TPS) CONTENTS • P(^i-5e-jT 白話字(POJ) 99 • Tai-uan Lo-ma-jT Phing-im Hong-an 115 台灣羅馬字拼音方案(Tailo) • Bodman Romanization 131 • Daighi Tongiong PTngim 147 台語通用拼音(DT) • Tongiong Taiwanese Dictionary 163 通用台語字典 TAIWANESE COMPUTING IN POJ AND TAILO 179 • Chinese Character Input and Keyboards 183 • TaigIME臺語輸入法設定 185 • FHL Taigi-Hakka IME 189 信聖愛台語客語輸入法3.1.0版 • 羅漢跤Lohankha台語輸入法 193 • Exercise A. Practice Typing a Self­ 195 Introduction in 白話字 P^h-Oe-jT Romanization. • Exercise B. Practice Typing a Self­ 203 Introduction in 台羅 Tai-l6 Romanization. MENGDIAN 萌典 ONLINE DICTIONARY AND 211 THESAURUS BIBLIOGRAPHY PREFACE There are those who believe that Taiwanese and related Hokkien dialects are just spoken and not written, and can only be passed down orally from one generation to the next. Historically, this was the case with most Non-Mandarin Chinese languages. Grammatical literacy in Chinese characters was primarily through Classical Chinese until the early 1900's. Romanization in Hokkien began in the early 1600's with the work of Spanish and later English missionaries with Hokkien-speaking Chinese communities in the Philippines and Malaysia.
    [Show full text]