arXiv:2106.00400v1 [cs.CL] 1 Jun 2021

SHUOWEN-JIEZI: Linguistically Informed Tokenizers For Chinese Language Model Pretraining

Chenglei Si¹∗, Zhengyan Zhang²,³,⁴∗, Yingfa Chen²,³,⁴∗, Fanchao Qi²,³,⁴, Xiaozhi Wang²,³,⁴, Zhiyuan Liu²,³,⁴†, Maosong Sun²,³,⁴
¹ University of Maryland, College Park, MD, USA
² Department of Computer Science and Technology, Tsinghua University, Beijing, China
³ Institute for Artificial Intelligence, Tsinghua University, Beijing, China
⁴ State Key Lab on Intelligent Technology and Systems, Tsinghua University, Beijing, China
[email protected], [email protected]
∗ Equal contribution
† Corresponding author email: [email protected]

Abstract

Conventional tokenization methods for Chinese pretrained language models (PLMs) treat each character as an indivisible token (Devlin et al., 2019), which ignores the characteristics of the Chinese writing system. In this work, we comprehensively study the influences of three main factors on Chinese tokenization for PLMs: pronunciation, glyph (i.e., shape) and word boundary. Correspondingly, we propose three kinds of tokenizers: 1) SHUOWEN (说文, meaning Talk Word), the pronunciation-based tokenizers; 2) JIEZI (解字, meaning Solve Character), the glyph-based tokenizers; 3) Word segmented tokenizers, the tokenizers with Chinese word segmentation. To empirically compare the effectiveness of the studied tokenizers, we pretrain BERT-style language models with them and evaluate the models on various downstream NLU tasks. We find that SHUOWEN and JIEZI tokenizers can generally outperform conventional single-character tokenizers, while Chinese word segmentation shows no benefit as a preprocessing step. Moreover, the proposed SHUOWEN and JIEZI tokenizers exhibit significantly better robustness in handling noisy texts. The code and pretrained models will be publicly released to facilitate linguistically informed Chinese NLP.¹

¹ Please refer to the Appendix A.1 for the historical meaning of SHUOWEN-JIEZI.

1 Introduction

Large-scale Transformer-based pretrained language models (PLMs) (Devlin et al., 2019; Lan et al., 2020; Clark et al., 2020; Ma et al., 2020; He et al., 2021, inter alia) have achieved great success in recent years and attracted wide research interest, in which the tokenization plays a fundamental role.

Unfortunately, current tokenization methods are developed primarily for English (Bostrom and Durrett, 2020). Almost all the current PLMs adopt the sub-word tokenization method originating from machine translation, such as the Byte-Pair Encoding (Sennrich et al., 2016), WordPiece (Schuster and Nakajima, 2012; Devlin et al., 2019) and SentencePiece based on the unigram language model (Kudo and Richardson, 2018). While the idea of sub-word tokenization is intuitive and effective for morphologically rich synthetic languages, it is not the case for Chinese.

We believe that it is crucial to develop tailored techniques for the languages beyond English because there can be huge differences between different languages (Bender, 2019, #BenderRule). Towards this end, we devote this work to analysing three unique linguistic characteristics of Chinese (writing system) compared to English: 1) The Chinese writing system is morphemic (Hill, 2016), which means the Chinese characters poorly reflect the pronunciation, so that conventional character-based tokenization misses much more phonological information. 2) Modern Chinese words basically do not undergo morphological alternations (Packard, 2000), thus rendering sub-word tokenization inapplicable. However, Chinese characters are mainly logograms, which means their glyphs, the composition of strokes and radicals, also contain rich semantic information (Wu et al., 2019). 3) In Chinese writing, there is no natural word boundary like the space in English writing. Although it is possible to inject word boundaries via Chinese Word Segmentation (CWS), there is no study on how this works for Chinese PLMs.

Targeting the three factors, we then explore three corresponding tokenization strategies: 1) A pronunciation-based tokenizer family called SHUOWEN, which first romanizes the Chinese characters based on their pronunciations, and then constructs the vocabulary with the romanized scripts using the unigram language model (Kudo and Richardson, 2018). 2) A glyph-based tokenizer family called JIEZI, which decomposes characters into combinations of Chinese strokes or radicals, and then constructs the vocabulary with the stroke or radical sequences using the unigram language model. 3) A word segmented tokenizer family, which first uses a Chinese word segmenter to segment Chinese texts into words, and then constructs the vocabulary with the segmented word sequences using the unigram language model.

We pretrain BERT-style PLMs using the proposed tokenizers from scratch and evaluate the resultant models on various downstream tasks. Through comprehensive evaluation on ten Chinese NLU tasks, we find that our pronunciation-based (SHUOWEN) and glyph-based (JIEZI) tokenizers outperform conventional single-character tokenizers in most tasks. Furthermore, as they have the unique advantage of learning the meanings of complex characters through the composition of simpler sub-characters, they are naturally more robust in handling noisy input. Surprisingly, we find that Chinese Word Segmentation (CWS) has no benefit for Chinese language model pretraining.

Our work suggests that linguistically informed techniques based on the characteristics of different languages need more attention. We will release the code, pretrained models, and the SHUOWEN-JIEZI tokenizers to serve as a better foundation for future research on Chinese PLM.

2 Related Work

Chinese PLM. Several previous works have explored techniques to improve Chinese language model pretraining. Zhu (2020) and Zhang et al. (2021) expanded BERT vocabulary with Chinese words apart from the single characters and incorporated them in the pretraining objectives. Xiao et al. (2021) and Cui et al. (2019) considered coarse-grained information through masking n-grams and whole words during the masked language modeling pretraining. Diao et al. (2020) incorporated word-level information via superimposing the character and word embeddings. Lai et al. (2021) incorporated Chinese word lattice structures in the pretraining.

Linguistically Informed Techniques for Chinese. CWS is a common preprocessing step for Chinese NLP tasks (Li and Sun, 2009). Meng et al. (2019) empirically analysed whether CWS is helpful for downstream Chinese NLP tasks before the PLM era and found that in many cases the answer is negative. We examine the impact of CWS for PLM instead. Wu et al. (2019) incorporated glyph information of Chinese characters through adding extra encoders to encode the images of Chinese characters and then combining them with the character embeddings. We do not intend to fuse in additional information from sources like images, but instead, all of our proposed tokenization methods are drop-in replacements to the existing single-character tokenizers, without adding any extra layers or parameters. Tan et al. (2018) explored converting Chinese text into Wubi sequences that represent character glyph information for the task of machine translation.

3 Method

In this section, we introduce our proposed tokenization methods.

3.1 SHUOWEN: Pronunciation-based Tokenizers

The Chinese writing system is morphemic (Hill, 2016) and barely conveys phonological information. However, the pronunciation of Chinese characters also reveals semantic patterns (Duanmu, 2007) and has long been widely used as input methods in China (e.g., pinyin). In order to capture such information, we propose a pronunciation-based tokenizer named SHUOWEN.

On raw Chinese input texts (e.g., 魑魅魍魉), SHUOWEN performs the following steps:

1. Romanize the text using Chinese transliteration systems. In this work, we explore two different transliteration methods: pinyin and zhuyin (i.e., bopomofo). Pinyin uses the Latin alphabet and four² different tones (¯, ´, ˇ, `) to romanize pronunciations of characters, e.g., 魑魅魍魉 → Chi¯ Mei` Wangˇ Liangˇ. On the other hand, zhuyin uses a set of self-invented characters and the same four tones to romanize the characters, e.g., 魑魅魍魉 → ㄔ ㄇㄟ` ㄨㄤˇ ㄌㄧㄤˇ. Note that in zhuyin, the first tone mark (¯) is usually omitted.

2. Insert special separation symbols (+) after each character's romanized sequence, e.g., Chi¯+Mei`+Wangˇ+Liangˇ+, ㄔ+ㄇㄟ`+ㄨㄤˇ+ㄌㄧㄤˇ+. This prevents cases where romanized sequences of different characters are mixed together, especially when there are no tone markers to split them in zhuyin.

3. Different Chinese characters often have the same pronunciation. For disambiguation, we append different indices after the romanized sequences for the homophonic characters, allowing a biunique mapping between each Chinese character and [...]

² The light tone is sometimes considered as the fifth tone but we omit it for simplicity.

[...] characters based on the standard stroke orders⁴, e.g., 魑 → pszhshpzznnhpnzsszshn; 魅 → pszhshpzznhhspn. To convert into radical sequences, we adopt three existing glyph-based Chinese input methods: Wubi, Zhengma, Cangjie. These methods group strokes together in different ways to form radicals or stroke combinations, and then represent characters with them. We use the Latin alphabet to represent these radicals or stroke combinations, e.g., 魑魅魍魉 → Wubi: rqcc rqci rqcn rqcw; Zhengma: njlz njbk njld njoo; Cangjie:
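
The three SHUOWEN preprocessing steps of Section 3.1 (romanize, append a homophone index, insert the + separator) can be illustrated with a minimal Python sketch. It is not the authors' released code. The pronunciation table `TOY_PINYIN`, the extra homophone 妹, the function names, and the index format (a number appended directly after the tone-marked syllable) are all assumptions made for illustration, since the source text breaks off before stating what each character is mapped to.

```python
from collections import defaultdict

# Toy pronunciation table (an assumption for illustration only): pinyin with
# tone marks for the running example, plus 妹 as a homophone of 魅.
TOY_PINYIN = {
    "魑": "chī",
    "魅": "mèi",
    "魍": "wǎng",
    "魉": "liǎng",
    "妹": "mèi",
}

SEPARATOR = "+"  # separation symbol from step 2


def build_homophone_indices(table):
    """Give each character an index that is unique among its homophones."""
    groups = defaultdict(list)
    for char in sorted(table):  # sorted so the numbering is deterministic
        groups[table[char]].append(char)
    return {
        char: index
        for chars in groups.values()
        for index, char in enumerate(chars)
    }


def encode(text, table, indices):
    """Steps 1-3: romanize, append the homophone index, add the separator."""
    return "".join(
        f"{table[char]}{indices[char]}{SEPARATOR}" for char in text
    )


def decode(encoded, table, indices):
    """Invert encode(); possible because each unit maps to one character."""
    reverse = {f"{table[c]}{indices[c]}": c for c in table}
    units = encoded.split(SEPARATOR)[:-1]  # drop the empty piece after the last "+"
    return "".join(reverse[unit] for unit in units)


INDICES = build_homophone_indices(TOY_PINYIN)
ENCODED = encode("魑魅魍魉", TOY_PINYIN, INDICES)
```

In this sketch 魅 and 妹 share the syllable mèi but receive different indices, so the encoded string can be decoded back to the original characters. In the pipeline described above, text encoded this way would then be passed to the unigram language model to construct the vocabulary.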