Chinese Information Processing
UNLV Retrospective Theses & Dissertations
1-1-1995

Chinese information processing
Yucheng Liu
University of Nevada, Las Vegas

Follow this and additional works at: https://digitalscholarship.unlv.edu/rtds

Repository Citation
Liu, Yucheng, "Chinese information processing" (1995). UNLV Retrospective Theses & Dissertations. 544. http://dx.doi.org/10.25669/azdz-qsik

This Thesis is protected by copyright and/or related rights. It has been brought to you by Digital Scholarship@UNLV with permission from the rights-holder(s). You are free to use this Thesis in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s) directly, unless additional rights are indicated by a Creative Commons license in the record and/or on the work itself. This Thesis has been accepted for inclusion in UNLV Retrospective Theses & Dissertations by an authorized administrator of Digital Scholarship@UNLV.

Chinese Information Processing
by Yucheng Liu

A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science

Department of Computer Science
University of Nevada, Las Vegas
December 1995

UMI Number: 1377645
UMI Microform 1377645. Copyright 1996, by UMI Company. All rights reserved. This microform edition is protected against unauthorized copying under Title 17, United States Code.
UMI, 300 North Zeeb Road, Ann Arbor, MI 48103

The thesis of Yucheng Liu for the degree of Master of Science in Computer Science is approved.

Chairperson, Tanichi Kanai, Ph.D.
Examining Committee Member, Thomas A. Nartker, Ph.D.
Examining Committee Member, Kia Makki, Ph.D.
Graduate Faculty Representative, Xin Li, Ph.D.
Interim Dean of the Graduate College, Cheryl L. Bowles, Ed.D.

© 1995 Yucheng Liu
All Rights Reserved

ABSTRACT

A survey of the field of Chinese information processing is provided. It covers the following areas: the Chinese writing system, several popular Chinese encoding schemes and code conversions, Chinese keyboard entry methods, Chinese fonts, Chinese operating systems, basic Chinese computing techniques and applications.
Contents

1 Overview of Chinese Information Processing
  1.1 Introduction
  1.2 Background of Chinese Language
  1.3 Basic Concepts and Terminology
    1.3.1 Software Related Concepts
    1.3.2 Chinese Computing Based Concepts
    1.3.3 Encoding Standards

2 The Chinese Writing System
  2.1 Roman Characters
  2.2 Symbols and Punctuation
  2.3 Hanzi Character
    2.3.1 The Structure of Hanzi
    2.3.2 Pronunciation
  2.4 Chinese Typefaces, Type Styles and Type Sizes
    2.4.1 Typefaces
    2.4.2 Type Styles
    2.4.3 Type Sizes
  2.5 Chinese Text Style

3 Chinese Character Set Standards and Encoding Methods
  3.1 Chinese Character Set Standards
  3.2 Chinese Encoding Methods
    3.2.1 GB 2312-80 Encoding
    3.2.2 HZ Encoding
    3.2.3 Big-5 Encoding
    3.2.4 CNS 11643 Encoding
    3.2.5 CCCII
    3.2.6 International Encoding Methods

4 Chinese Input
  4.1 Big Keyboard Input Method
  4.2 Small Keyboard Input Method
    4.2.1 Input by Pronunciation
    4.2.2 Input by Structure
    4.2.3 Input by Encoding Value
    4.2.4 Input by Other Criteria
  4.3 Optical Character Recognition
    4.3.1 Online/Offline Handwritten and Handprinted CCR
    4.3.2 Printed Chinese Character Recognition
  4.4 Audio Input Method
  4.5 Chinese Character Dictionaries

5 Chinese Output
  5.1 Chinese Fonts
    5.1.1 Bitmap Font
    5.1.2 Vector Fonts
    5.1.3 Outline Fonts
    5.1.4 Evaluation
  5.2 Font Generation Methods
    5.2.1 Vector Pattern
    5.2.2 Dot Matrix Pattern
    5.2.3 Evaluation
  5.3 Font Storage
  5.4 Printer Output
  5.5 Screen Output
  5.6 Chinese Terminal

6 Chinese Information Processing Techniques
  6.1 Chinese Operating Systems
    6.1.1 Chinese Operating Systems on MS-DOS
    6.1.2 Chinese Operating Systems on MS-Windows
    6.1.3 Chinese Operating Systems on Unix
    6.1.4 Chinese Operating Systems on Macintosh
  6.2 Code Conversion
    6.2.1 GB <-> HZ Conversion