Tru64 UNIX Technical Reference for Using Chinese Features

August 2000

This guide provides the Chinese-specific information and describes the Chinese features supported on the Compaq Tru64 UNIX system.

Software Version: Tru64 UNIX Version 5.1 or higher

Compaq Computer Corporation Houston, Texas

i © 2000 Compaq Computer Corporation

COMPAQ and the Compaq logo Registered in U.S. Patent and Trademark Office. Tru64 is a trademark of Compaq Information Technologies Group, L.P.

Microsoft, Windows, and Windows NT are trademarks of Microsoft Corporation. Motif, OSF/1, UNIX, and X/Open are trademarks of The Open Group. All other product names mentioned herein may be trademarks or registered trademarks of their respective companies.

Confidential computer software. Valid license from Compaq required for possession, use, or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license.

Compaq shall not be liable for technical or editorial errors or omissions contained herein. The information in this publication is subject to change without notice and is provided "as is" without warranty of any kind. The entire risk arising out of the use of this information remains with recipient. In no event shall Compaq be liable for any direct, consequential, incidental, special, punitive, or other damages whatsoever (including without limitation, damages for loss of business profits, business interruption or loss of business information), even if Compaq has been advised of the possibility of such damages. The foregoing shall apply regardless of the negligence or other fault of either party regardless of whether such liability sounds in contract, negligence, tort, or any other theory of legal liability, and notwithstanding any failure of essential purpose of any limited remedy.

The limited warranties for Compaq products are exclusively set forth in the documentation accompanying such products. Nothing herein should be construed as constituting a further or additional warranty. Table of Contents

Preface

1 Character Sets...... 1–1 1.1 CNS 11643 ...... 1–1 1.2 DTSCS ...... 1–4 1.3 Big–5...... 1–5 1.4 GB2312–80 ...... 1–6 1.5 Extended GB ...... 1–7 1.6 ...... 1–7 1.7 ISO/IEC 10646 ...... 1–7

2 Codesets and Codeset Conversion ...... 2–1 2.1 DEC Hanyu ...... 2–1 2.1.1 ASCII CODE ...... 2–2 2.1.2 CNS 11643 Code ...... 2–2 2.1.3 DTSCS Code ...... 2–3 2.1.4 User–Defined Characters ...... 2–4 2.2 Taiwanese EUC...... 2–5 2.3 Big–5...... 2–7 2.4 DEC Hanzi ...... 2–8 2.5 Shift Big–5 ...... 2–10 2.6 Telecode...... 2–11 2.6.1 Plane 1 ...... 2–12 2.6.2 Plane 2 Character Encoding ...... 2–12 2.7 UCS–4/UCS−2 ...... 2–12

iii 2.8 UTF–8...... 2–13 2.9 Codeset Conversion...... 2–13 2.9.1 Default Conversion String...... 2–14 2.9.2 One-to-Many Conversion...... 2–15 2.9.3 User–Defined Character Mappings ...... 2–16 2.10 Codeset for Peripheral Devices...... 2–16

3 Locales 3–1

4 Local Language Devices ...... 4–1 4.1 Terminals ...... 4–1 4.2 Printers...... 4–1

5 Fonts 5–1 5.1 DECwindows Fonts ...... 5–1 5.1.1 XLFD Font Names...... 5–2 5.1.2 Bitmap Font Samples...... 5–4 5.1.3 Font Encodings...... 5–7 5.1.4 Specifying Fonts in DECwindows Applications ...... 5–11 5.2 Outline Fonts...... 5–11 5.2.1 XLFD Font Names of Chinese Outline Fonts...... 5–12

6 Keyboards...... 6–1

7 Input Methods...... 7–1 1.1 Activating and Deactivating Chinese Input Methods...... 7–2 1.1.1 Character–Cell Terminal Applications...... 7–2 1.1.2 DECwindows Motif Applications ...... 7–2 1.1.3 CDE Applications...... 7–4 1.2 Switching Input Method ...... 7–5 1.3 Motif Interface Input Method ...... 7–7 1.3.1 Input Areas...... 7–7 1.3.2 Interaction Styles...... 7–7 1.3.2.1 Root Window Interaction...... 7–7 1.3.2.2 Off-the-Spot Interaction...... 7–9 1.3.3 Input Server Operations ...... 7–10 1.1.4 Options Menu ...... 7–10 1.1.4.1 Vertical Layout...... 7–11 iv 1.1.4.2 Horizontal Layout...... 7–11 1.1.4.3 Select Phrase Input Class ...... 7–11 1.1.4.4 User Phrase Database...... 7–11 1.1.4.5 System Phrase Database...... 7–12 1.1.4.6 Current Window ...... 7–12 1.1.4.7 Input Method Customization...... 7–14 1.1.4.8 Help...... 7–18 1.1.4.9 Quit ...... 7–18 1.1.5 Saving Your New Settings ...... 7–18 1.4 Alphabetic Input Methods...... 7–19 1.5 Tsang–Chi Input Method ...... 7–19 1.1.1 Tsang–Chi Root Radicals...... 7–19 1.1.2 Tsang–Chi Code Generation ...... 7–24 1.1.1.1 General Rules ...... 7–24 1.1.1.2 Connected Characters ...... 7–25 1.1.1.3 Composite Characters...... 7–26 1.1.1.4 Exceptional Characters ...... 7–28 1.1.3 Invoking Tsang–Chi Input Method...... 7–30 1.1.4 Multiple Candidates...... 7–30 1.1.5 Repeat Character Input...... 7–31 1.1.6 Error Handling...... 7–32 1.6 Quick Tsang–Chi Input Method...... 7–32 1.1.1 Quick Tsang–Chi Code Generation...... 7–32 1.1.2 Invoking Quick Tsang–Chi Input Method ...... 7–32 1.1.3 Entering Quick Tsang–Chi Code...... 7–32 1.1.4 Multiple Candidates...... 7–33 1.1.5 Repeat Character Input...... 7–33 1.1.6 Error Handling...... 7–33 1.7 Phonetic Input Method...... 7–33 1.1.1 Phonetic Symbol Categories...... 7–33 1.1.2 Phonetic Code Generation...... 7–35 1.1.3 Invoking Phonetic Input Method ...... 7–35 1.1.4 Entering Phonetic Code...... 7–36 1.1.5 Multiple Candidates...... 7–36 1.1.6 Repeat Character Input...... 7–37 1.1.7 Error Handling...... 7–37 1.8 Internal Code Input Method...... 7–37 1.1.1 Input Procedure...... 7–37 1.1.2 Repeat Character Input...... 7–38 1.1.3 Error Handling...... 7–38 1.9 Phrase Input Method...... 7–38 1.1.1 Input Procedure...... 7–39 1.1.2 Error Handling...... 7–40

v 1.1.2.1 Condition 1...... 7–40 1.1.1.2 Condition 2...... 7–40 1.10 Symbol Input in Dxhanyuim...... 7–41 1.1.1 Invoking Symbol Input Mode ...... 7–41 1.1.2 Rules of Symbol Input ...... 7–41 1.1.3 Entering Symbol Code...... 7–43 1.1.4 Multiple Candidates...... 7–44 1.1.5 Error Handling...... 7–44 1.11 Input of User–Defined Characters in DECwindows Motif...... 7–44 1.12 5–Stroke Input Method ...... 7–46 1.1.1 Input Mechanism ...... 7–46 1.1.1.1 Single Character Input...... 7–47 1.1.1.2 Input of Terms...... 7–47 1.1.2 Procedure...... 7–49 1.1.3 Multiple Candidates...... 7–49 1.1.4 The Association Mode ...... 7–50 1.1.5 Wildcard Key ...... 7–50 1.13 5–Shape Input Method...... 7–51 1.1.1 Distribution of Radicals ...... 7–51 1.1.2 Decomposition of ...... 7–54 1.1.3 Distinction Code...... 7–55 1.1.4 Principles of Character Decomposition ...... 7–56 1.1.5 Input Mechanism ...... 7–56 1.1.5.1 Single Character Input...... 7–57 1.1.1.2 Term Input ...... 7–58 1.1.6 Procedure...... 7–59 1.1.7 Multiple Candidates...... 7–59 1.1.8 Association Mode ...... 7–59 1.1.9 Simple Code Characters...... 7–59 1.1.9.1 Frequently-Used Characters (Level 1 Simple Code) ...... 7–59 1.1.1.2 Level 2 Simple Code ...... 7–60 1.1.1.3 Level 3 Simple Code ...... 7–61 1.1.10 Wildcard Key...... 7–61 1.1.11 Entering 5–Shape Code Through the Numeric Keypad ...... 7–61 1.14 Pin–Yin Input Method ...... 7–61 1.14.1 Input Mechanism ...... 7–62 1.1.2 Procedure...... 7–62 1.1.3 Multiple Candidates...... 7–63 1.1.4 Association Mode ...... 7–63 1.1.5 Multiple Phonetic Representations...... 7–63 1.1.6 Radical Characters...... 7–63 1.15 Qu–Wei Input Method...... 7–64 1.1.1 DEC GB2312 Characters ...... 7–64 vi 1.1.2 Extended GB Characters...... 7–64 1.16 Telex Code Input Method ...... 7–64 1.17 Symbol Input in Dxhanziim ...... 7–65 1.18 Conversion Between Input Servers and DECwindows Motif Applications ...... 7–65

8 Chinese Printing Support...... 8–1 8.1 Supported Printers ...... 8–1 8.1.1 Text Printers ...... 8–1 8.1.2 PostScript Printers...... 8–1 8.2 8.2 Print File Formats ...... 8–1 8.3 Printing Features...... 8–2 8.3.1 Font Embedding...... 8–2 8.3.2 Font Faulting...... 8–2 8.3.3 Software On–Demand Font Loading ...... 8–3 8.3.4 Codeset Conversion ...... 8–3 8.3.5 Outline Fonts ...... 8–4 8.4 Commands and Daemons...... 8–4 8.4.1 Country-Specific Options to the lpr Command...... 8–4 8.4.2 PostScript Font Management Utility (pfsetup)...... 8–5 8.4.3 Font-Faulting Daemon (ffd) ...... 8–7 8.4.4 PrintServer Printing Command wwlpspr ...... 8–8 8.5 Chinese Printing Setup...... 8–8 8.5.1 Dot Matrix Printers ...... 8–8 8.5.2 DEClaser 1152...... 8–9 8.5.3 DEClaser 5100...... 8–11 8.5.4 PrintServer 17 ...... 8–13 8.5.5 Generic PostScript Printers...... 8–14

9 Other Chinese Features ...... 9–1 9.1 Phrase Support in the VT382–D...... 9–1 9.1.1 Creating a Phrase Definition File ...... 9–1 9.1.2 Syntax of Phrase Definitions...... 9–1 9.1.3 Phrase Downloading ...... 9–3 9.2 Sorting Utility...... 9–4 9.2.1 asort Utility...... 9–5 9.2.2 Multiple Collating Sequences...... 9–5 9.2.3 Depth–First Against Breadth–First ...... 9–6 9.2.4 User–Defined Characters ...... 9–6 9.3 Hanyu and Hanzi DECterm ...... 9–6 9.3.1 Creating a Hanyu or Hanzi DECterm ...... 9–6 9.3.2 Customizing DECterm...... 9–7

vii 9.3.3 Font Sizes...... 9–7 9.3.4 Terminal ID...... 9–7 9.3.5 Interaction Style...... 9–7 9.3.6 Input Server ...... 9–7 9.3.7 Copying Information...... 9–8 9.3.8 Default Character Set...... 9–8 9.3.9 Chinese Character Input/Output...... 9–8 9.3.10 Reconnecting the Input Server ...... 9–8 9.3.11 VT382–D and VT382–C Terminal Functions...... 9–8 9.4 Phrase Conversion ...... 9–10 9.5 Special Characters in nroff ...... 9–10

Figures Figure 1–1: CNS 11643 Character Planes...... 1–2 Figure 1–2: CNS 11643 First Character Planes...... 1–3 Figure 1–3: CNS 11643 Second Character Plane ...... 1–3 Figure 1–4: EDPC Recommended Character Set ...... 1–5 Figure 1–5: GB2312-80 Character Set...... 1–6 Figure 2–1: DEC Hanyu Encoding of CNS 11643 Planes ...... 2–2 Figure 2–2: Code Space for CNS 11643 in DEC Hanyu ...... 2–3 Figure 2–3: DEC Hanyu Encoding of DTSCS Characters...... 2–4 Figure 2–4: Code Space for DTSCS in DEC Hanyu...... 2–4 Figure 2–5: Encoding of Taiwanese EUC...... 2–6 Figure 2–6: Code Space for Big-5 ...... 2–8 Figure 2–7: DEC Hanzi Character Encoding ...... 2–9 Figure 2–8: GB2312-80 and Extended GB Code Space ...... 2–10 Figure 3–1: Chinese Language Names...... 3–3 Figure 5–1: Sung Font Sample...... 5–5 Figure 5–2: Hei Font Sample...... 5–5 Figure 5–3: Songti Font Sample ...... 5–6 Figure 5–4:Heiti Font Sample...... 5–6 Figure 5–5: Fangsongti Font Sample ...... 5–7 Figure 5–6: Kaiti Font Sample...... 5–7 Figure 5–7: CNS 11643-1986 Font Encoding Scheme ...... 5–8 Figure 5–8: DTSCS Font Encoding Scheme ...... 5–9 Figure 5–9: GB2312-80 Font Encoding Schemes...... 5–10 Figure 6–1: LK201-D Keyboard Layout ...... 6–2 Figure 6–2: LK401-D Keyboard Layout ...... 6–2 Figure 6–3: LK201-C Keyboard Layout ...... 6–3 Figure 6–4: LK401-C Keyboard Layout ...... 6–3 viii Figure 6–5: Numeric Keypad for 5-Stroke Input Method ...... 6–4 Figure 7–1: Chinese Root Window Interaction Style ...... 7–8 Figure 7–2: Chinese Input Window Icon...... 7–8 Figure 7–3: Off-the-Spot Interaction Style...... 7–9 Figure 7–4: Customization of Invocation Key Sequences in dxhanyuim...... 7–17 Figure 7–5: Customization of Invocation Key Sequences in dxhanziim...... 7–17 Figure 7–6: Full Form Alphabet Input Method...... 7–19 Figure 7–7: Invocation of the Tsang–Chi Input Method...... 7–30 Figure 7–8: Entering a Tsang–Chi Radical ...... 7–30 Figure 7–9: Multiple Candidates...... 7–31 Figure 7–10: The Quick Tsang–Chi Input Method...... 7–32 Figure 7–11: Entering a Quick Tsang–Chi Code...... 7–33 Figure 7–12: Entering the Phonetic Symbols for “ ”...... 7–35 Figure 7–13: Input of “ ” Using the Internal Code Input Method...... 7–38 Figure 7–14: Entering a Phrase Code...... 7–39 Figure 7–15: Converting a Phrase Code to a Phrase...... 7–39 Figure 7–16: Distribution of Radicals ...... 7–52 Figure 7–17: 5-Shape Radical Keys...... 7–53 Figure 7–18: Distinction Code for the 5-Shape Input Method ...... 7–55 Figure 8–1: Two-Channel Communication of the Font-Faulting Mechanism ...... 8–9

Tables Table 1–1: Characters Defined in CNS 11643-1986...... 1–2 Table 1–2: Characters Defined in CNS 11643-1992...... 1–4 Table 1–3: Mapping of EDPC Recommended Character Set to CNS 11643-1992...... 1–5 Table 2–1: CNS 11643 Code Range in DEC Hanyu...... 2–2 Table 2–2: UDC Code Range in DEC Hanyu...... 2–5 Table 2–3: Big-5 Code Range...... 2–7 Table 2–4: Big-5 User-Defined Spaces...... 2–7 Table 2–5: Big–5 to Shift Big–5 Mappings...... 2–11 Table 2–6: Chinese Codeset Conversion...... 2–14 Table 2–7: Codeset Names and Associated Strings...... 2–14 Table 2–8: Mapping Between Big–5 and DEC Hanyu User-Defined Characters...... 2–16 Table 2–9: Feasible Chinese Codesets for Applications, Terminals, and Printers...... 2–17 Table 3–1: Chinese Locales...... 3–1 Table 4–1: Chinese Print Filters...... 4–2 Table 5–1: Traditional Chinese Screen Fonts...... 5–1 Table 5–2: Simplified Chinese Screen Fonts...... 5–2 Table 5–3: XLFD of Miscellaneous Chinese Screen Fonts...... 5–4 Table 5–4: Chinese DECwindows Font Encodings ...... 5–8 Table 5–5: Font Encoding Conversion...... 5–9

ix Table 5–6: Chinese DECwindows Font Encodings ...... 5–10 Table 5–7: GR to GL Font Encoding Conversion ...... 5–10 Table 5–8: Traditional Chinese Default Fonts...... 5–11 Table 5–9: Simplified Chinese Default Fonts ...... 5–11 Table 7–1: Key Sequences that Invoke Chinese Input Method...... 7–5 Table 7–2: Key Sequences Used to Select Traditional Chinese Input Method...... 7–6 Table 7–3: Key Sequences Used to Select Simplified Chinese input Method...... 7–6 Table 7–4: Window Input Areas...... 7–7 Table 7–5: Modifier State Customization ...... 7–16 Table 7–6: Tsang–Chi root Radicals Classification...... 7–20 Table 7–7: Quick Reference Table of the Tsang–Chi Root Radicals...... 7–23 Table 7–8: Composition Form Characters...... 7–24 Table 7–9: Connected Form Characters...... 7–24 Table 7–10: Examples of Connected Character Decomposition...... 7–25 Table 7–11: Composite Character Decomposition ...... 7–27 Table 7–12: Compound Characters...... 7–28 Table 7–13: Difficult Characters ...... 7–29 Table 7–14: Special Characters ...... 7–30 Table 7–15: Meaning of Arrow Characters...... 7–31 Table 7–16: Phonetic Symbols ...... 7–34 Table 7–17: Examples of Phonetic Input ...... 7–35 Table 7–18: Phonetic Symbols with Different Termination Keys...... 7–36 Table 7–19: Stroke Categories...... 7–46 Table 7–20: 5-Stroke Code...... 7–46 Table 7–21: Input of Single Characters with the 5-Stroke Input Method...... 7–47 Table 7–22: Input Terms with the 5-Stroke Input Method...... 7–48 Table 7–23: Shape Code...... 7–52 Table 7–24: Entering Basic Strokes...... 7–57 Table 7–25: Input of Terms with the 5-Shape Input Method...... 7–58 Table 7–26: Pin-Yin Tone Marks ...... 7–62 Table 9–1: Phrase definitions ...... 9–2 Table 9–2: Traditional Chinese Sorting Methods...... 9–4 Table 9–3: Simplified Chinese Sorting Methods...... 9–4

x Preface

This guide provides Chinese-specific information, such as character sets and locales, for end users and programmers who want to use and develop internationalized applications in Chinese locales on the Compaq Tru64 UNIX operating system. Details of the Chinese features are also documented in this guide. Intended Audience This guide is for new and experienced end users and programmers who are interested in the Chinese variant of the Compaq Tru64 UNIX operating system. Structure of this Guide This guide consists of nine chapters:

Chapter 1 Describes the Chinese character sets supported in the Compaq Tru64 UNIX operating system software. Chapter 2 Describes the Chinese codesets and the conversion among different codesets. Chapter 3 Describes the Chinese locales. Chapter 4 Describes the local hardware devices which support the Chinese locales. Chapter 5 Provides information on Chinese fonts. Chapter 6 Provides information on Chinese keyboards. Chapter 7 Describes how to input Chinese characters. Chapter 8 Introduces the Chinese printing support. Chapter 9 Provides descriptions of other Chinese features.

xi Related Documents Writing Software for the International Market Programming for the World: A Guide to Internationalization, Sandra Martin O’Donnell, Prentice Hall, 1994 OSF/Motif User’s Guide Revision 1.2, Open Software Foundation, Prentice Hall, Englewood Cliffs, New Jersey 07632 OSF/Motif Style Guide Revision 1.2, Open Software Foundation, Prentice Hall, Englewood Cliffs, New Jersey 07632 X Window System, Third Edition, Robert W. Scheifler and James Gettys, Digital Press Programmer's Supplement for Release 5 of the X Window System, Version 11, David Flanagan, O’Reilly & Associates, Inc. The Unicode Standard, Version 2.0, The Unicode Consortium, Addison Wesley, Reading, MA, 1996 Information Technology-Universal Multiple-Octet Coded Character Set, ISO/IEC 10646: 1993 Chinese Code for Data Communication

xii Conventions The following typographical conventions are used in this manual: % A percent sign represents the C shell system prompt. A $ dollar sign represents the system prompt for the Bourne and Korn shell. # A number sign represents the superuser prompt. % cat Boldface type in interactive examples indicates typed user input. File Italic (slanted) type indicates variable values, placeholders, and function argument names. [ | ] In syntax definitions, brackets indicate items that are { | } optional and braces indicate items that are required. Vertical bars separating items inside brackets or braces indicate that you choose one item from among those listed. . . . In syntax definitions, a horizontal ellipsis indicates that the preceding item can be repeated one or more times. cat(1) A cross-reference to a reference page includes the appropriate section number in parentheses. For example, cat(1) indicates that you can find information on the cat command in Section 1 of the reference pages. [RETURN] In an example, a key name enclosed in a box indicates that you press that key. Ctrl/x This symbol indicates that you hold down the first named key while pressing the key or mouse button that follows the slash. In examples, this key combination is enclosed in a box (for example [Ctrl/C]).

xiii

1 Character Sets

The Compaq Tru64 UNIX operating system software supports the following Chinese character sets: • CNS 11643 • DTSCS • Big-5 • GB2312-80 • Extended GB • Unicode • ISO/IEC 10646 For traditional Chinese characters the CNS 11643 and Big-5 character sets are commonly used. The GB2312-80 character set is commonly used for Simplified Chinese characters. The Unicode and ISO/IEC 10646 character sets are common to both traditional and Simplified Chinese. 1.1 CNS 11643 The CNS (Chinese National Standard) 11643 character set standard was published by the National Bureau of Standards of Taiwan in 1986 and was updated in 1992. It was also called "Standard Interchange Code for Generally-used Chinese Character" (SICGCC). CNS 11643 provides 16 character planes for defining Chinese characters. Each character plane is divided into 94 rows and each row has 94 columns. Altogether, a total number of 8,836 characters can be accommodated in each plane. Character planes 1-11 are reserved for defining standard Chinese characters while character planes 12-16 are user-defined areas.

Tru64 UNIX Technical Reference for Using Chinese Features 1–1 Figure 1–1: CNS 11643 Character Planes

The original CNS 11643 standard, published in 1986, defines certain groups of characters only on the first and second character planes. Table 1–1 shows these groups of characters.

Table 1–1: Characters Defined in CNS 11643-1986

Character Plane Character Type Number of Characters Plane 1 Special characters 651 Control characters 33 Frequently-used characters 5,401 Plane 2 Less frequently-used characters 7,650

Figure 1–2 and Figure 1–3 illustrate the positions of these characters in the first and second character planes.

1–2 Tru64 UNIX Technical Reference for Using Chinese Features Figure 1–2: CNS 11643 First Character Planes

Figure 1–3: CNS 11643 Second Character Plane

As the CNS11643-1986 character set was not rich enough to meet most of the application requirements, such as names and addresses, the information industry in Taiwan requested to expand the character set. In 1991, the Bureau of National Standard formed a team to study how to expand CNS 11643. On August 4, 1992, the Bureau of National Standard published the revised CNS 11643 - Chinese Standard Interchange Code (CSIC)

Tru64 UNIX Technical Reference for Using Chinese Features 1–3 The revised CNS 11643, called CNS 11643-1992, defined 651 special characters, 33 control characters and 48,027 Chinese characters, as shown in Table 1–2.

Table 1–2: Characters Defined in CNS 11643-1992

Character Plane Character Type Number of Characters Plane 1 Special characters 651 Control characters 33 Frequently-used characters 5,401 Plane 2 Less frequently-used characters 7,650 Plane 3 Rarely-used characters (EDPC Part I) 6,148 Plane 4 Used for residency system, ISO 2nd 7,298 edition DIS 10646 Han characters, 171 EDPC Part II Characters Plane 5 Rarely-used characters (Based on the 8,603 Ministry of Education publications) Plane 6 Variants based on the Ministry of 6,388 Education publications (<=14 strokes) Plane 7 Variants based on the Ministry of 6,539 Education publications (>14 strokes)

Since the number of characters defined in CNS 11643-1992 is far greater than those required for general use, the revised CNS 11643 is called "Chinese Standard Interchange Code (CSIC)".

______Note ______In this release, the new characters added to CNS 11643-1992 are not supported. Only the characters defined in CNS 11643-1986 and DTSCS (which will be described in the next section) are supported. ______

1.2 DTSCS In addition to CNS 11643, the Compaq Tru64 UNIX operating system supports the DIGITAL Taiwan Supplemental Character Set (DTSCS). Currently, only the EDPC Recommended Character Set, which defines a total of 6,319 characters, is included in DTSCS. EDPC Recommended Character Set was first published by the Electronic Data Processing Center of Executive Yuen in June, 1988.

1–4 Tru64 UNIX Technical Reference for Using Chinese Features Figure 1–4: EDPC Recommended Character Set

As a de facto standard, computer vendors support the EDPC Recommended Character Set and assign it to CNS 11643 character plane 14. In the revised CNS 11643-1992, the 6,319 characters in the EDPC Recommended Character Set are assigned to the third and fourth character planes of CNS 11643, as shown in Table 1–3.

Table 1–3: Mapping of EDPC Recommended Character Set to CNS 11643-1992

EDPC Characters Character Plane Number of Characters Part I Plane 3 6,148 Part II Plane 4 171 1.3 Big–5 The Big-5 character set, though not a national standard, is commonly used by the Taiwan information industry, particularly in the PC and workstation market. Big-5 character set was designed to meet the requirements of five major software vendors in Taiwan. Since its publication, much software and hardware, and many peripheral devices have been developed to support Big-5. Big-5 is very similar to the first two planes of CNS 11643-1992. The frequently-used Chinese characters (5,401) defined in the two character sets are exactly the same except that their positions in the code table are different. For the less frequently-used Chinese characters, Big-5 defines two more characters in addition to the 7,650 characters defined in

Tru64 UNIX Technical Reference for Using Chinese Features 1–5 the second character plane of CNS 11643, and their positions in the code table are different. 1.4 GB2312–80 The GB2312-80 character set is a standard published by the State Bureau of Standardization of the People’s Republic of China (PRC) in 1980 and put in force in May, 1981. GB2312-80 defines 7,445 characters, including 6,763 Chinese characters: • Graphic symbols 682 graphic symbols are defined and placed in rows 1-9. • Level 1 characters Those are 3,755 frequently-used characters placed in rows 16-55. • Level 2 characters Those are 3,008 less frequently-used characters placed in rows 56-87. See Figure 1–5. The GB2312-80 code table is divided into 94 rows (Qu), numbered from 1 to 94. Each row has 94 columns (Wei), also numbered from 1 to 94.

Figure 1–5: GB2312-80 Character Set

1–6 Tru64 UNIX Technical Reference for Using Chinese Features 1.5 Extended GB The extended GB character set provides 8,836 (94 x 94) code points for defining user- defined characters. The 8,836 code points are divided into two regions: • User-Defined Area — Spans rows 1-87 and provides 8,178 code positions. • User-Defined (reserved) Area — Spans rows 88-94 and provides 658 code positions. This area is where users define special and frequently-used user-defined characters. The extended GB code table is similar to the GB2312 code table. It is divided into 94 rows and each row has 94 columns. 1.6 Unicode The Unicode Standard, Version 3.0 specifies a universal character set (UCS) that contains definitions for 49,194 characters and includes a Private Use Area for vendor-defined or user-defined characters. The main features of this character set are: • All characters are treated as 16-bit units. • Each 16-bit unit has an abstract character identity. • Certain sequences of 16-bit characters in a text stream are transformed into other characters, called composed characters. • All characters have properties, such as base, numeric, spacing, combination, and directionality. The Unicode standard provides rules for ordering characters with different properties so that parsing of character sequences is unambiguous. • The relationship between Unicode characters and the glyphs in the native language script that users see, type, or print is not necessarily one-to-one. A glyph may be mapped to a single abstract character or a composed character. Conversely, more than one glyph can be mapped to a character. • The ISO 8859-1 character set occupies the first 256 code positions (and the ASCII character set the first 128 positions) of the UCS. 1.7 ISO/IEC 10646 The ISO/IEC 10646 standard, which is specified in Information Technology-Universal Multiple-Octet Coded Character Set, ISO/IEC 10646, allows characters to be specified as either 32-bit units or like Unicode, as 16-bit units. In their 32-bit form, the 16-bit character values in Unicode are zero-extended through a second 16-bit unit to conform to ISO/IEC 10646.

Tru64 UNIX Technical Reference for Using Chinese Features 1–7

2 Codesets and Codeset Conversion

The Compaq Tru64 UNIX operating system fully supports the following Chinese codesets by including locales and codeset conversion support: • DEC Hanyu • Taiwanese EUC (Extended UNIX Code) • Big-5 • DEC Hanzi • UTF-8 It also provides codeset conversion support for the following codesets: • Telecode • Shift Big-5 • UCS-2 • UCS-4 2.1 DEC Hanyu The DEC Hanyu codeset, denoted by dechanyu, consists of the following character sets: • ASCII • CNS11643, the first and second character planes • DTSCS • User-Defined Characters

Tru64 UNIX Technical Reference for Using Chinese Features 2–1 DEC Hanyu uses a combination of single-byte, two-byte, and four-byte data to represent ASCII characters, symbols, or ideographic characters. 2.1.1 ASCII CODE All ASCII characters can be represented in the form of single-byte 7-bit data in DEC Hanyu. That is, the most significant bit (MSB) of ASCII characters is always set off. 2.1.12.1.2 CNS 11643 Code Each CNS 11643 character is represented by a two-byte code in DEC Hanyu, which complies with the CNS 11643 standard. The MSB of the first byte is always set on while that of the second byte can be on for the first character plane or off for the second character plane. See Figure 2–1.

Figure 2–1: DEC Hanyu Encoding of CNS 11643 Planes

The first byte of a CNS 11643 code determines the row number of the character, while the second byte determines its column number. Table 2–1 illustrates the code range of a CNS 11643 code.

Table 2–1: CNS 11643 Code Range in DEC Hanyu

Character Plane 1st Byte (hex) 2nd Byte (hex) Plane 1 A1 to FE A1 to FE Plane 2 A1 to FE 21 to 7E

The following formulas illustrate the code of a CNS 11643 character in relation to its row and column numbers. CNS 11643 plane 1 character: First byte = A0 + row number Second byte = A0 + column number CNS 11643 plane 2 character: First byte = A0 + row number Second byte = 20 + column number

2–2 Tru64 UNIX Technical Reference for Using Chinese Features For example, if a character is positioned at the first column of the 36th row on CNS 11643 plane 1, its encoding value is calculated as follows: First byte = A0 (hex) + 36 = C4 (hex) Second byte = A0 (hex) + 01 = A1 (hex) Its encoded value is C4A1. Similarly, if a character is positioned at the first column of the 36th row on CNS 11643 plane 2, its encoding value is calculated as follows: First byte = A0 (hex) + 36 = C4 (hex) Second byte = 20 (hex) + 01 = 21 (hex) Its encoded value is C421. Figure 2–2 illustrates the division of a two-byte code space and the position of CNS 11643 characters.

Figure 2–2: Code Space for CNS 11643 in DEC Hanyu

2.1.12.1.3 DTSCS Code Each DTSCS character is represented by a four-byte code in DEC Hanyu. The first two bytes are the leading codes, namely 0xC2 0xCB, which are used as a designator sequence for the DTSCS character set. The MSB of the third and fourth bytes is set on for the EDPC Recommended Character Set.

DIGITAL UNIX Technical Reference for Using Chinese Features 2–3 Figure 2–3: DEC Hanyu Encoding of DTSCS Characters

Figure 2–4 illustrates the 4-byte code space and the position of DTSCS characters.

Figure 2–4: Code Space for DTSCS in DEC Hanyu

2.1.12.1.4 User–Defined Characters In addition to the CNS11643 and the DTSCS character sets described above, DEC Hanyu provides 3,587 positions for user-defined characters (UDC). The positions for UDCs are those unused (but not reserved) code points on the CNS 11643 first and second character planes. Therefore, the encoding of UDC is exactly the same as that of CNS 11643 except that they occupy different regions, as shown in Table 2–2.

2–4 Tru64 UNIX Technical Reference for Using Chinese Features Table 2–2: UDC Code Range in DEC Hanyu

Character Plane Number of UDC Code Range Plane 1 145 FDCC - FEFE Plane 1 2,256 AAA1 - C1FE Plane 2 1,186 F245 - FE7E 2.2 Taiwanese EUC Taiwanese EUC (extended UNIX code), denoted as eucTW, is another codeset to support CNS 11643. The design of Taiwanese EUC allows the 16 character planes of CNS 11643 to be encoded in a unified way. A stream of data encoded in Taiwanese EUC can contain characters defined in ASCII and the 16 character planes. Figure 2–5 illustrates the encoding of Taiwanese EUC.

DIGITAL UNIX Technical Reference for Using Chinese Features 2–5 Figure 2–5: Encoding of Taiwanese EUC

Taiwanese EUC uses the Single-Shift 2 control character (SS2) and an additional byte to specify a character plane. The only exception is the first plane, which does not require leading codes. Instead, two bytes specify a character’s position on the first plane. The first byte determines its row number, while the second determines its column number. The MSBs of the two bytes are set on.

2–6 Tru64 UNIX Technical Reference for Using Chinese Features In this release, only the characters defined in the first and second planes of CNS 11643 and those in the EDPC Recommended Character Set that have been remapped into the third and fourth character planes of the revised CNS 11643-1992 are supported in Taiwanese EUC. Other characters that were added to the CNS 11643-1992 standard are not supported. 2.3 Big–5 The Big-5 codeset, denoted as , is the only codeset that supports the Big-5 character set. The encoding of the Big-5 codeset is similar to that of CNS 11643 in DEC Hanyu. Each Big-5 character is represented by a two-byte code which complies with the Big-5 standard. The MSB of the first byte is always set on while that of the second byte can be set on or off. The Big-5 code range is defined as shown in Table 2–3.

Table 2–3: Big-5 Code Range

Character Number of Characters Code Range Special symbols 408 A140-A3BF Level 1 characters 5,401 A440-C67E Level 2 characters 7,652 C940-F9D5

In addition to the code points for special symbols and Chinese characters shown in Table 2–3, three areas are set aside for user-defined spaces. Some vendors in Taiwan support user-defined characters in the code ranges shown in Table 2–4.

Table 2–4: Big-5 User-Defined Spaces

Character Number of Characters Code Range Level 1 user-defined space 785 FA40-FEFE Level 2 user-defined space 2,983 8E40-A0FE Level 3 user-defined space 2,041 8140-8DFE

DIGITAL UNIX Technical Reference for Using Chinese Features 2–7 The valid ranges of the two bytes are:

Byte Valid Ranges First byte 81-FE Second byte 40-7E and A1-FE

Figure 2–6 illustrates the encoding of the Big-5 codeset in a two-byte code space.

Figure 2–6: Code Space for Big-5

2.4 DEC Hanzi The ASCII, GB2312-80 and extended GB character sets are combined to form the DEC Hanzi codeset. DEC Hanzi, denoted as dechanzi, uses a two-byte data representation for symbols and ideographic characters defined in the GB2312-80 character set. To differentiate GB2312- 80 codes from ASCII codes, MSB of the first byte is always set on while that of the second byte is on for GB2312-80 and off for extended GB as shown in Figure 2–7.

2–8 Tru64 UNIX Technical Reference for Using Chinese Features Figure 2–7: DEC Hanzi Character Encoding

The first byte of a two-byte code determines its row number, while the second byte determines its column number. The following formulas illustrate the code of a GB2312-80 character or an extended GB character in relation to its row and column numbers. GB2312-80 character: First byte = A0 + row number Second byte = A0 + column number Extended GB character: First byte = A0 + row number Second byte = 20 + column number For example, if a character is positioned at the first column of the 16th row on the GB2312-80 code plane, its encoding value is calculated as follows: First byte = A0 (hex) + 16 = B0 (hex) Second byte = A0 (hex) + 01 = A1 (hex) Its encoded value is B0A1.

DIGITAL UNIX Technical Reference for Using Chinese Features 2–9 Similarly, if a character is positioned at the first column of the 16th row on the extended GB code plane, its encoding value is calculated as follows: First byte = A0 (hex) + 16 = B0 (hex) Second byte = 20 (hex) + 01 = 21 (hex) Its encoded value is B021. Figure 2–8 illustrates the division of a two-byte code space and the position of the Chinese character sets.

Figure 2–8: GB2312-80 and Extended GB Code Space

2.5 Shift Big–5 The Shift Big-5 codeset, denoted as sbig5, is a variant of the Big-5 codeset. The difference between the two is that the second byte of some Big-5 characters is mapped to other values to form Shift Big-5 characters. Table 2–5 illustrates the mappings of Big-5 characters to Shift Big-5 characters.

2–10 Tru64 UNIX Technical Reference for Using Chinese Features Table 2–5: Big–5 to Shift Big–5 Mappings

Big-5 (Second Byte) Shift Big-5 (Second Byte) 40 30 5B 31 5C 32 5D 33 5E 34 5F 35 60 36 7B 37 7C 38 7D 39 7E 9F

The Shift Big-5 codeset can be used in codeset conversion and terminal display. Refer to Section 2.9 for details. 2.6 Telecode The Telecode codeset (called Mitac Telex in early versions of the operating system), denoted as telecode, consists of 2 character planes. Each character plane has 8836 character positions. In plane 1, standard characters occupy positions 0001 to 8045; the remaining 791 positions are for user-defined characters. In plane 2, standard characters occupy positions 0001 to 8489; the remaining 346 positions are for user-defined characters. Telecode uses 2-byte values to represent characters on both planes.

______Note ______For information about the character sets encoded by Telecode, refer to Chinese Code For Data Communication. ______

Telecode can be used in codeset conversion and terminal display. Refer to Section 2.9 for further details.

DIGITAL UNIX Technical Reference for Using Chinese Features 2–11 2.1.12.6.1 Plane 1 Character Encoding To differentiate plane 1 code from plane 2 code, MSB is set on in both bytes of a plane 1 character code. You can use the following formula to calculate the value of a plane 1 character from its position on the plane: First byte = M + 161 Second byte = N + 161 - M x 94 In this formula, N is the position of the character and M = N / 94. For example, if a character is at position 2502 on plane 1, its encoded value is BBDB, which is calculated as follows: N = 2502, M = 2502/94 = 26 First byte = 26 + 161 = 187 (or, BB (hex)) Second byte = 2502 + 161 - 26 x 94 = 219 (or, DB (hex)) 2.1.12.6.2 Plane 2 Character Encoding To differentiate plane 2 code from plane 1 code, the MSB of the first byte is set on and that of the second byte is set off for each plane 2 character code. You can use the following formula to calculate the value of a plane 2 character from its position: First byte = M + 161 Second byte = N + 33 - M x 94 In this formula, N is the position of the character on the plane and M = N / 94. For example, if a character is at position 2502 on plane 2, its encoded value is BB5B, which is calculated as follows: N = 2502, M = 2502/94 = 26 First byte = 26 + 161 = 187 (or, BB (hex)) Second byte = 2502 + 33 - 26 x 94 = 91 (or, 5B, hex)) 2.7 UCS–4/UCS−2 The UCS codeset is a standard character encoding for the universal character set (UCS) specified in Unicode and ISO/IEC 10646. There are two encoding schemes for UCS. An implementation that parses in 16-bit units (2 octets) is known as UCS-2. This is the canonical Unicode encoding in wide use on personal computers. An implementation that parses in 32-bit units (4 octets) is known as UCS-4. This is the canonical ISO/IEC 10646 encoding that is in use on systems that can support larger data unit size.

2–12 Tru64 UNIX Technical Reference for Using Chinese Features On Compaq Tru64 UNIX, UCS-2 and UCS-4 can be used in codeset conversion. In addition, UCS-4 is used as an internal process code for some locales. For codeset conversion, see Section 2.9. For locale variants, see Chapter 3. 2.8 UTF–8 The Unicode and ISO/IEC 10646 standards define transformation formats for the UCS. The following UCS transformation formats (UTFs) exist mainly to transform UCS values into sequences of bytes for handling by various byte-oriented protocols: • UTF-8 is the standard method for transforming UCS-encoded data into a sequence of 8-bit bytes and ensuring interchange transparency for characters in C0 code positions (0 to 31), the SPACE (32) character, and the DEL (127) character. • UTF-7 is the standard interchange format for environments that strip the eighth bit from each byte. • UTF-16 is a transformation format that allows systems that are limited to processing of 16-bit units (specified by UCS-2 encoding) to support the extended character definition space that is included in UCS-4. The current version of Compaq Tru64 UNIX supports UTF-8. UTF-8 can be used in codeset conversion and in locales. For codeset conversion, see Section 2.9. For locale variants, see Chapter 3. 2.9 Codeset Conversion Users may sometimes need to convert files from one codeset to another. The iconv utility provided by Compaq Tru64 UNIX is used to convert the encoding of characters in one codeset to another and write the results to standard output. Table 2–6 shows the pairs of Chinese codeset converters that are provided.

DIGITAL UNIX Technical Reference for Using Chinese Features 2–13 Table 2–6: Chinese Codeset Conversion

DEC Taiwanese Shift Tele- DEC Hanyu EUC Big-5 Big-5 code Hanzi UCS-4 UCS-2 UTF-8 DEC Hanyu – Y Y NY YYYY Taiwanese Y – Y YY YYYY EUC BIG-5 YY – YY YYYY Shift Big-5 NY Y– N NNNN Telecode YY YN– NNNN DEC Hanzi YY YNN– YYY UCS-4 YY YNNY– YY UCS-2 YY YNNNY-Y UTF-8 YY YNNYYY–

For example, the following command converts a DEC Hanyu file to Big-5: % iconv -f dechanyu -t big5 Table 2–7 shows the string names you can use as the parameters of the iconv utility.

Table 2–7: Codeset Names and Associated Strings

Codeset String DEC Hanyu dechanyu Taiwanese EUC eucTW Big-5 big5 Shift Big-5 sbig5 Telecode Telecode DEC Hanzi dechanzi Universal Codeset (4 octet form) UCS-4 Universal Codeset (2-octet form) UCS-2 Universal Transfer Format UTF-8

2.1.12.9.1 Default Conversion String When converting from one codeset to another, characters in the source codeset that have no corresponding code point in the destination codeset will not be converted. By default, the characters that cannot be converted are skipped and have no representation in the converted output.

2–14 Tru64 UNIX Technical Reference for Using Chinese Features You can control this behavior by using the ICONV_DEFSTR environment variable to define a default string to replace those unconvertible characters. If you specify a numeric value for this environment variable, the corresponding character value will be used. The ICONV_DEFSTR environment variable affects all Chinese iconv converters. You can also use the "ICONV_DEFSTR__" environment variable to control specific codeset conversion. For example, to convert a DEC Hanyu input file to DEC Hanzi with unconvertible characters converted to "?", you would enter the following commands: %setenv ICONV_DEFSTR_dehanyu_dechanzi "?" %iconv -f dechanyu -t dechanzi hanzi_input > hanyu_output

For codeset converters that end in UCS-2, UCS-4, or UTF-8, you can use the "U+XXXX" notation to specify the default character for conversion failure fallback.

______Note ______During cut-and-paste operations, those traditional Chinese characters that cannot be converted to Simplified Chinese characters are shown as default characters in the applications. ______

2.1.12.9.2 One-to-Many Conversion When converting from the DEC Hanzi codeset to other Chinese codesets, one Simplified Chinese character may be mapped to multiple traditional Chinese characters. By default, the iconv utility picks up only the most likely candidate from a list of possible choices. You can control the behavior of the iconv utility with the ICONV_ACTION environment variable. The ICONV_ACTION environment variable determines how the iconv utility behaves when there are one-to-many mappings. The possible values are: • batch –The most likely or preferred candidate will be picked up. This is the default. • conv_all – All possible choices are generated within the brackets "{" and "}" so that you can edit the converted file manually and determine which one should be used. • conv_all_nosym – All characters except symbols (for example, punctuation marks) are handled in the same manner as conv_all.

DIGITAL UNIX Technical Reference for Using Chinese Features 2–15 ______Note ______During cut-and-paste operations, the batch mode is always used for those nonunique characters. ______

______Note ______The ICONV_ACTION environment variable applies to Simplified Chinese to traditional Chinese conversion only and has no effect on UCS-4 and UTF-8 converters. ______

2.1.12.9.3 User–Defined Character Mappings Some user-defined characters in the Big-5 codeset have predefined mappings to user- defined spaces in DEC Hanyu. These mappings are the same as those supported by Pathworks/Hanyu. Table 2–8 shows this mapping.

Table 2–8: Mapping Between Big–5 and DEC Hanyu User-Defined Characters

DEC Hanyu Big-5 Code Size F321 - FB41 FA40 - FEFE 785 FB42 - FEFE 8E40 - 905C 343 AAA1 - C1FE 905D - 9EB8 2256

These predefined user-defined character mappings are supported by both the iconv methods and the terminal driver. Because some user-defined characters do not have predefined mappings, Compaq recommends that you use only those user-defined characters that have predefined mappings.

2.10 Codeset for Peripheral Devices The Compaq Tru64 UNIX software provides a mechanism for you to use to configure your system to run applications with peripherals, such as terminals and printers, that support different codesets. You can specify the codesets for the applications, terminals, and printers independently, as shown in Table 2–9. The Compaq Tru64 UNIX software automatically does the necessary codeset conversion.

2–16 Tru64 UNIX Technical Reference for Using Chinese Features Table 2–9: Feasible Chinese Codesets for Applications, Terminals, and Printers

Application Code Terminal Code Printer Code DEC Hanyu DEC Hanyu DEC Hanyu Taiwanese EUC Taiwanese EUC Taiwanese EUC Big-5 Big-5 Big-5 DEC Hanzi DEC Hanzi DEC Hanzi Shift Big-5 Telecode UTF-8 UTF-8

______Note ______Chinese DECterm software supports only DEC Hanyu, Big5, or DEC Hanzi as its terminal code. You must activate the stty drive and set tcode to dechanyu when running in a Taiwanese EUC locale. For example: % stty adec tcode dechanyu The dxterm does not support UTF-8 as a terminal code. Use dtterm when UTF-8 is required for a terminal code. ______

For details about setting up codesets for terminals and printers, see Writing Software for the International Market.

DIGITAL UNIX Technical Reference for Using Chinese Features 2–17

3 Locales

The Compaq Tru64 UNIX operating system supports different Chinese locales for different countries and areas. These include Taiwan, People’s Republic of China (PRC), and Hong Kong. Table 3–1 shows the valid Chinese locales with different countries, codesets, and collating sequences.

______Note ______zh_TW is an alias of zh_TW.eucTW, and zh_CN is an alias of zh_CN.dechanzi. ______

Table 3–1: Chinese Locales

Codeset Locale Collation Sequence DEC Hanyu zh_TW Internal code zh_TW.dechanyu Internal code zh_TW@ucs4 Internal code zh_TW.dechanyu@ucs4 Internal code zh_TW.dechanyu@radical Radical zh_TW.dechanyu@radical@ucs4 Radical zh_TW.dechanyu@stroke Stroke zh_TW.dechanyu@stroke@ucs4 Stroke zh_TW.dechanyu@chuyin Chuyin (Phonetic) zh_TW.dechanyu@chuyin@ucs4 Chuyin (Phonetic) zh_HK.dechanyu Internal code zh_HK.dechanyu@ucs4 Internal code

Tru64 UNIX Technical Reference for Using Chinese Features 3–1 Table 3-1: (Cont.) Chinese Locales

Codeset Locale Collation Sequence Taiwanese EUC zh_TW.eucTW Internal code zh_TW.eucTW@ucs4 Internal code zh_TW.eucTW@radical Radical zh_TW.eucTW@radical@ucs4 Radical zh_TW.eucTW@stroke Stroke zh_TW.eucTW@stroke@ucs4 Stroke zh_TW.eucTW@chuyin Chuyin (Phonetic) zh_TW.eucTW@chuyin@ucs4 Chuyin (Phonetic) zh_HK.eucTW Internal code zh_HK.eucTW@ucs4 Internal code

Big-5 zh_TW.big5 Internal code zh_TW.big5@radical Radical zh_TW.big5@stroke Stroke zh_TW.big5@chuyin Chuyin (Phonetic) zh_HK.big5 Internal code

DEC Hanzi zh_CN Internal code zh_CN.dechanzi Internal code zh_CN@ucs4 Internal code zh_CN.dechanzi@ucs4 Internal code zh_CN.dechanzi@radical Radical zh_CN.dechanzi@radical ucs4 Radical zh_CN.dechanzi@stroke Stroke zh_CN.dechanzi@stroke@ucs4 Stroke zh_CN.dechanzi@ Pinyin (Phonetic) zh_CN.dechanzi@pinyin@ucs4 Pinyin (Phonetic) zh_HK.dechanzi Internal code zh_HK.dechanzi@ucs4 Internal code

UTF-8 zh_CN.UTF-8 zh_CN.GBK zh_TW.UTF-8 zh_TW.dechanyu zh_HK.UTF-8 zh_HK.dechanyu

Locales that support the same country and codeset are basically the same. The radical, stroke, pinyin, and chuyin modifiers after the at (@) sign specify different criterion for collation and sorting. Moreover, the characters defined in character set standards have collating precedence over user-defined characters, which, in turn, have precedence over undefined or reserved characters.

3–2 Tru64 UNIX Technical Reference for Using Chinese Features The ucs4 modifier indicates that UCS-4 is used as the internal processing code. The classification information, however, is not provided for the full set of UCS-4 characters, but only for the corresponding language. If you are using DECwindows Motif, you can select the locale through the Language Menu of Session Manager. If you are using CDE, you can select the locale using the language menu on the CDE login screen. The applicable locales are shown in Table 3–1.

Figure 3–1: Chinese Language Names

Locale Language Name zh_TW Chinese Taiwan zh_TW.dechanyu Chinese Taiwan (DEC Hanyu) zh_TW.eucTW Chinese Taiwan (EUC) zh_TW.big5 Chinese Taiwan (Big5) zh_TW.UTF-8 Chinese Taiwan (Unicode) zh_HK.dechanyu Chinese Hong Kong (DEC Hanyu) zh_HK.dechanzi Chinese Hong Kong (DEC Hanzi) zh_HK.eucTW Chinese Hong Kong (EUC) zh_HK.big5 Chinese Hong Kong (Big5) zh_HK.UTF-8 Chinese Hong Kong (Unicode) zh_CN Chinese China zh_CN.dechanzi Chinese China (DEC Hanzi) zh_CN.UTF-8 Chinese China (Unicode)

______Note ______For DECwindows Motif and CDE, the locale modifier is ignored. ______

DIGITAL UNIX Technical Reference for Using Chinese Features 3–3

4 Local Language Devices

4.1 Terminals The Compaq Tru64 UNIX operating system supports the VT382-D and VT382-C Chinese terminals. The VT382-D terminal is for traditional Chinese and the VT382-C terminal is for Simplified Chinese. Hanyu DECterm and Hanzi DECterm are the emulation of the VT382-D Chinese terminal and Simplified Chinese terminal respectively, which provide compatible functionalities for running Chinese character-cell terminal applications. For details of Hanyu DECterm and Hanzi DECterm, see Chapter 9, Other Chinese Features. You can also use dtterm running in a Chinese locale to display Chinese in a character- cell application. 4.2 Printers The Compaq Tru64 UNIX operating system supports the following dot matrix Chinese printers: • CP382-D (traditional Chinese) • LA88-C (Simplified Chinese) • LA380-CB (Simplified Chinese) The following PostScript printers can be configured for traditional or Simplified Chinese: • DEClaser 1152 • DEClaser 5100 with font disk (LN09X-HD) • PrintServer 17 • All PostScript level 2 printers

Tru64 UNIX Technical Reference for Using Chinese Features 4–1 The print filters listed in Table 4–1 are provided to support these Chinese printers.

Table 4–1: Chinese Print Filters

Filter Name Printer Name cp382dof CP382-D printer la88cof LA88-C printer la380cbof LA380-CB printer dl1152wrof DEClaser 1152 dl5100wrof DEClaser 5100 lpsof PrintServer 17 wwpsof Level 2 PostScript printers

______Note ______To use PrintServer 17, the PrintServer Software Version 5.0 or later for Compaq Tru64 UNIX is also required. ______

For details about setting up Chinese printer queues, see Chapter 8, Chinese Printing Support.

4–2 Tru64 UNIX Technical Reference for Using Chinese Features 5 Fonts

5.1 DECwindows Fonts The Compaq Tru64 UNIX operating system provides Chinese DECwindows fonts in various sizes and typefaces for 75 dpi (dots-per-inch) display devices. Table 5–1 lists the screen fonts for traditional Chinese.

Table 5–1: Traditional Chinese Screen Fonts

Typefaces Glyph Size Bounding Box Remarks Screen 15 x 16 16 x 18 Mandatory font 22 x 22 24 x 24 Mandatory font Sung 22 x 22 24 x 24 Optional font 30 x 30 32 x 32 Optional font Hei 15 x 16 16 x 16 Optional font 22 x 22 24 x 24 Optional font

There are two sets of DECwindows fonts, one for CNS 11643-1986 and one for DTSCS. Table 5–2 lists the screen fonts for Simplified Chinese.

Tru64 UNIX Technical Reference for Using Chinese Features 5–1 Table 5–2: Simplified Chinese Screen Fonts

Typefaces Glyph Size Bounding Box Remarks Screen 15 x 16* 16 x 18 Mandatory font, defined in GB5199.1-85 22 x 22* 24 x 24 Mandatory font Songti 15 x 16* 16 x 16 Optional font, defined in GB5199.1-85 22 x 22* 24 x 24 Optional font 32 x 32* 34 x 34 Optional font, defined in GB6345.1-86 Heiti 15 x 16 16 x 16 Optional font 22 x 22* 24 x 24 Optional font 32 x 32* 34 x 34 Optional font, defined in GB12036-89 Fangsongti 22 x 22* 24 x 24 Optional font 32 x 32* 34 x 34 Optional font, defined in GB12034-89 Kaiti 22 x 22* 24 x 24 Optional font 32 x 32* 34 x 34 Optional font, defined in GB12035-89 *The fonts marked with an asterisk are supplied by China Standard Technology Development Corporation (CSTDC) of People’s Republic of China. In addition to these Chinese fonts, several miscellaneous screen fonts are provided for use in Hanyu and Hanzi DECterm, and Motif toolkit. The mandatory fonts are available after you install the Chinese language support from the worldwide language support software. Other optional fonts are available only if you install the optional Chinese font subsets. If you do not find the optional fonts on your system, please contact your system administrator. No 100 dpi Chinese fonts are provided in the kit. To allow you to use the Chinese fonts on 100 dpi display devices, a font alias file is provided to map the 75 dpi font names to 100 dpi font names. 2.1.15.1.1 XLFD Font Names You must specify the DECwindows font names in X Logical Font Description (XLFD) format in your application programs or in the application resource files. You can specify wildcards for any fields in the font names. You can use the following font names for both 75 dpi or 100 dpi display devices. If you want to state the display resolution explicitly, you can specify 75 or 100 in the X- and Y- resolution fields, that is, the second and third asterisks in the following XLFD names.

5–2 Tru64 UNIX Technical Reference for Using Chinese Features • Traditional Chinese Screen family font names in XLFD format: CNS 11643-1986 Fonts

-ADECW-Screen-Medium-R-Normal--*-180-*-*-M-160-DEC.CNS11643.1986-2 -ADECW-Screen-Medium-R-Normal--*-240-*-*-M-240-DEC.CNS11643.1986-2 DTSCS Fonts

-ADECW-Screen-Medium-R-Normal--*-180-*-*-M-160-DEC.DTSCS.1990-2 -ADECW-Screen-Medium-R-Normal--*-240-*-*-M-240-DEC.DTSCS.1990-2 • Traditional Chinese Sung family font names in XLFD format: CNS 11643-1986 Fonts

-ADECW-Sung-Medium-R-Normal--*-240-*-*-M-240-DEC.CNS11643.1986-2 -ADECW-Sung-Medium-R-Normal--*-320-*-*-M-320-DEC.CNS11643.1986-2 DTSCS Fonts

-ADECW-Sung-Medium-R-Normal--*-240-*-*-M-240-DEC.DTSCS.1990-2 -ADECW-Sung-Medium-R-Normal--*-320-*-*-M-320-DEC.DTSCS.1990-2 • Traditional Chinese Hei family font names in XLFD format: CNS 11643-1986 Fonts

-ADECW-Hei-Medium-R-Normal--*-160-*-*-M-160-DEC.CNS11643.1986-2 -ADECW-Hei-Medium-R-Normal--*-240-*-*-M-240-DEC.CNS11643.1986-2 DTSCS Fonts

-ADECW-Hei-Medium-R-Normal--*-160-*-*-M-160-DEC.DTSCS.1990-2 -ADECW-Hei-Medium-R-Normal--*-240-*-*-M-240-DEC.DTSCS.1990-2 • Simplified Chinese Screen family font names in XLFD format:

-ADECW-Screen-Medium-R-Normal--*-180-*-*-M-160-GB2312.1980-1 -ADECW-Screen-Medium-R-Normal--*-240-*-*-M-240-GB2312.1980-1 • Simplified Chinese Songti family font names in XLFD format:

-ADECW-Songti-Medium-R-Normal--*-160-*-*-M-160-GB2312.1980-1 -ADECW-Songti-Medium-R-Normal--*-240-*-*-M-240-GB2312.1980-1 -ADECW-Songti-Medium-R-Normal--*-340-*-*-M-340-GB2312.1980-1

DIGITAL UNIX Technical Reference for Using Chinese Features 5–3 • Simplified Chinese Heiti family font names in XLFD format:

-ADECW-Heiti-Medium-R-Normal--*-160-*-*-M-160-GB2312.1980-1 -ADECW-Heiti-Medium-R-Normal--*-240-*-*-M-240-GB2312.1980-1 -ADECW-Heiti-Medium-R-Normal--*-340-*-*-M-340-GB2312.1980-1 • Simplified Chinese Fangsongti family font names in XLFD format:

-ADECW-Fangsongti-Medium-R-Normal--*-240-*-*-M-240-GB2312.1980-1 -ADECW-Fangsongti-Medium-R-Normal--*-340-*-*-M-340-GB2312.1980-1 • Simplified Chinese Kaiti family font names in XLFD format:

-ADECW-Kaiti-Medium-R-Normal--*-240-*-*-M-240-GB2312.1980-1 -ADECW-Kaiti-Medium-R-Normal--*-340-*-*-M-340-GB2312.1980-1 Table 5–3 shows the font names, in XLFD format, of several miscellaneous Chinese screen fonts.

Table 5–3: XLFD of Miscellaneous Chinese Screen Fonts

XLFD Font Name Character Set -ADECW-Screen-Medium-R-Normal--*-180-*-*-M-80-iso8859-1 ISO Latin-1 -ADECW-Screen-Medium-R-Normal--*-180-*-*-M-80-DEC-DECctrl DEC Display Control -ADECW-Screen-Medium-R-Normal--*-180-*-*-M-80-DEC-DECsuppl DEC Supplemental -ADECW-Screen-Medium-R-Normal--*-180-*-*-M-80-DEC-DECtech DEC Technical -ADECW-Screen-Medium-R-Normal--*-180-*-*-M-80-DEC-DRCS DEC DRCS -ADECW-Screen-Medium-R-Normal--*-240-*-*-M-120-iso8859-1 ISO Latin-1 -ADECW-Screen-Medium-R-Normal--*-240-*-*-M-120-DEC-DECctrl DEC Display Control -ADECW-Screen-Medium-R-Normal--*-240-*-*-M-120-DEC-DECsuppl DEC Supplemental -ADECW-Screen-Medium-R-Normal--*-240-*-*-M-120-DEC-DECtech DEC Technical -ADECW-Screen-Medium-R-Normal--*-240-*-*-M-120-DEC-DRCS DEC DRCS

2.1.15.1.2 Bitmap Font Samples Figure 5–1 through Figure 5–6 illustrate samples of Chinese fonts.

5–4 Tru64 UNIX Technical Reference for Using Chinese Features Figure 5–1: Sung Font Sample

Figure 5–2: Hei Font Sample

DIGITAL UNIX Technical Reference for Using Chinese Features 5–5 Figure 5–3: Songti Font Sample

Figure 5–4:Heiti Font Sample

5–6 Tru64 UNIX Technical Reference for Using Chinese Features Figure 5–5: Fangsongti Font Sample

Figure 5–6: Kaiti Font Sample

2.1.15.1.3 Font Encodings The X Consortium registers names for font encodings that are used in XLFDs. However, no names currently are registered for CNS 11643 and DTSCS. Therefore, they are currently supported as Compaq private encodings as shown in Table 5–4.

DIGITAL UNIX Technical Reference for Using Chinese Features 5–7 Table 5–4: Chinese DECwindows Font Encodings

Character Set Character Set Registry CNS 11643-1986 DEC.CNS11643.1986-2 DTSCS DEC.DTSCS.1990-2

Since the X Window System provides only basic Xlib functions for handling 8-bit and 16- bit characters, the four-byte data representation of DTSCS is trimmed to remove the two leading bytes, C2 CB, to form a two-byte encoding. DECwindows applications should either preprocess the four-byte data and then handle them with the low level Xlib functions or handle Chinese strings with the internationalized text drawing functions provided by X11R6 Xlib or Motif Toolkit. Figure 5–7 and Figure 5–8 illustrate these two encoding schemes.

Figure 5–7: CNS 11643-1986 Font Encoding Scheme

5–8 Tru64 UNIX Technical Reference for Using Chinese Features Figure 5–8: DTSCS Font Encoding Scheme

Vendors may adopt different encoding schemes or even different character sets to produce their fonts. The fonts supplied by Compaq are all in the encoding schemes defined in this section. To allow you to run applications on third-party workstations on which different font encodings are installed, the Compaq Tru64 UNIX implementation of X11R6 Xlib supports the conversion of encodings during text display. Table 5–5 shows these encoding conversions.

Table 5–5: Font Encoding Conversion

Character Set Convert From Convert To Taiwanese EUC euctw-1 (plane 1) dec.cns11643.1986-2 euctw-2 (plane 2) dec.cns11643.1986-2 euctw-3 (plane 3) dec.dtscs.1990-2 euctw-4 (plane 4) dec.dtscs.1990-2 Big-5 big5-0 dec.cns11643.1986-2

For Simplified Chinese, the X Window System defines two encodings for the GB2312-80 character set as shown in Table 5–6.

DIGITAL UNIX Technical Reference for Using Chinese Features 5–9 Table 5–6: Chinese DECwindows Font Encodings

Encoding Character Set Registry GL GB2312.1980-0 GR GB2312.1980-1

Figure 5–9: GB2312-80 Font Encoding Schemes

The Chinese DECwindows fonts supplied by Compaq are all in GR encoding. To allow you to run applications on third-party workstations on which only GL-encoded fonts are installed, the Compaq Tru64 UNIX implementation of X11R6 Xlib supports the conversion of GR encoding to GL encoding for text drawing and measurement, as shown in Table 5–7.

Table 5–7: GR to GL Font Encoding Conversion

Convert From Convert To gb2312.1980-1 gb2312.1980-0

For details, see Writing Software for the International Market.

5–10 Tru64 UNIX Technical Reference for Using Chinese Features 2.1.15.1.4 Specifying Fonts in DECwindows Applications Table 5–8 and Table 5–9 show the default fonts used in the Motif Toolkit.

Table 5–8: Traditional Chinese Default Fonts

XLFD Font Name Character Set -ADECW-Screen-Medium-R-Normal--*-180-*-*-M-80-iso8859-1 ISO8859-1 -ADECW-Screen-Medium-R-Normal--*-180-*-*-M-160- DEC.CNS11643.1986-2 DEC.CNS11643.1986-2 -ADECW-Screen-Medium-R-Normal--*-180-*-*-M-160- DEC.CNS11643.1986-2- DEC.CNS11643.1986-2-UDC UDC -ADECW-Screen-Medium-R-Normal--*-180-*-*-M-160- DEC.DTSCS.1990-2 DEC.DTSCS.1990-2 -ADECW-Screen-Medium-R-Normal--*-180-*-*-*-*-* Fontset

Table 5–9: Simplified Chinese Default Fonts

XLFD Font Name Character Set -ADECW-Screen-Medium-R-Normal--*-180-*-*-M-80-iso8859-1 ISO8859-1 -ADECW-Screen-Medium-R-Normal--*-180-*-*-M-160-GB2312.1980-1 GB2312.1980-1 -ADECW-Screen-Medium-R-Normal--*-180-*-*-M-160-GB2312.1980- GB2312.1980-UDC UDC -ADECW-Screen-Medium-R-Normal--*-180-*-*-*-*-* Fontset

To override the default fonts of a traditional Chinese DECwindows application, you should specify the ISO Latin-1, DTSCS, and CNS11643 (UDC) fonts as well as the Chinese fontset when creating widget instances. For a Simplified Chinese DECwindows application, you should specify the ISO Latin-1, GB2312-80, and extended GB (UDC) fonts as well as the Chinese fontset when creating widget instances. For details, see Writing Software for the International Market. 5.2 Outline Fonts The Compaq Tru64 UNIX software provides the following traditional and Simplified Chinese outline fonts for printing on PostScript printers and for display through Level II Display PostScript extension. For traditional Chinese: • Sung-Light-CNS11643

DIGITAL UNIX Technical Reference for Using Chinese Features 5–11 • Hei-Light-CNS11643 For Simplified Chinese: • XiSong-GB2312-80 • Hei-GB2312-80 The encoding of this font is the same as that illustrated in Figure 5–7 and Figure 5–9. These Chinese outline fonts have the following uses: • Printing on PostScript printers. For details see Chapter 8, Chinese Printing Support. • Displaying through the R6 X Windows System Type 1 rasterizer. To use these outline fonts, add the $I18NPATH/usr/lib/X11/fonts/TChinesePS and $I18NPATH/usr/lib/X11/fonts/SChinesePS directories to your font path with the following command: % xset +fp $I18NPATH/usr/lib/X11/fonts/TChinesePS, $I18NPATH/usr/lib/X11/fonts/SchinesePS This is done automatically when the outline fonts are installed. • Displaying through Display PostScript. You can view PostScript files with Chinese characters using the CDA Viewer or through the Display PostScript extension. 2.1.15.2.1 XLFD Font Names of Chinese Outline Fonts To use the Chinese outline fonts through the Type 1 rasterizer, you can specify the font names in XLFD (X Logical Font Description) format in your application programs or in the application resource files, just like ordinary DECwindows bitmap fonts. To specify the XLFD font name of an outline font, you should replace the fields currently marked with 0 (zero) with the following information: • Field 1 — The font height in number of dots. An asterisk (*) usually is entered in this field. • Field 2 — The font height in point size. For example, you can enter 240 to specify a 24 point font. • Fields 3 and 4 — The X- and Y-resolution. They usually have the value of 75 or 100. • Field 5 — The average font width in point size. An asterisk (*) usually is put in this field. For example, if you want to use a 48 point font of the Sung-Light-CNS11643 family for a 100 dpi display device, you would specify: -dyna-sung-medium-r-normal--*-480-100-100-m-*-CNS11643.1986

5–12 Tru64 UNIX Technical Reference for Using Chinese Features 6 Keyboards

The Compaq Tru64 UNIX operating system supports the following Chinese keyboard types: • LK201 • LK401 • PCXAL There are some variants for these keyboards in different countries. For example, in Taiwan, the LK201-D variant has additional symbols printed on the keycaps for different Chinese input methods. The figures in this chapter show some of these Chinese keyboard layouts. You can find online copies of these figures at the location specified. These figures are in .ddif format.

Tru64 UNIX Technical Reference for Using Chinese Features 6–1 Figure 6–1: LK201-D Keyboard Layout

Required Keymap: us_lk201re Location of file: /usr/lib/cda/hanyu-lk201d-100.ddif

Figure 6–2: LK401-D Keyboard Layout

Required Keymap us_lk401aa Location of file: /usr/lib/cda/hanyu-lk401d-100.ddif

6–2 Tru64 UNIX Technical Reference for Using Chinese Features Figure 6–3: LK201-C Keyboard Layout

Required Keymap: us_lk201re Location of file: /usr/lib/cda/hanzi-lk201c-100.ddif

Figure 6–4: LK401-C Keyboard Layout

Required Keymap us_lk401aa Location of file: /usr/lib/cda/hanzi-lk401c-100.ddif

Tru64 UNIX Technical Reference for Using Chinese Features 6–3 Figure 6–5: Numeric Keypad for 5-Stroke Input Method

Location of file: /usr/lib/cda/hanzi-keypad-100.ddif

6–4 Tru64 UNIX Technical Reference for Using Chinese Features 7 Input Methods

This chapter describes the input methods for entering Chinese characters. There are two groups of input methods, one for traditional Chinese and the other for Simplified Chinese. Traditional Chinese has the following input methods: • Full-form alphabets • Tsang-Chi • Quick Tsang-Chi, also known as Easy • Phonetic • Internal Code • Phrase input method • Symbol Input Simplified Chinese has the following input methods: • 5-Stroke • 5-Shape • Pin-Yin, or Phonetic • Qu-Wei or Row-Column in GB2312-80 • Telex Code • Phrase Input

Tru64 UNIX Technical Reference for Using Chinese Features 7–1 This chapter also describes: • The mechanism for activating and deactivating Chinese input methods • The mechanism for switching input methods • The DECwindows Motif interface and its customization 7.1 Activating and Deactivating Chinese Input Methods This section describes how to activate and deactivate the Chinese input methods for character-cell terminal applications, DECwindows Motif applications, and CDE applications. 2.1.17.1.1 Character–Cell Terminal Applications For character-cell terminal applications, traditional and Simplified Chinese input methods are implemented by the firmware of the VT382-D traditional Chinese terminal and VT382-C Simplified Chinese terminal respectively, or incorporated in the terminal emulation software, such as Hanyu DECterm for the former and Hanzi DECterm for the latter. Applications do not need to provide their own support for Chinese input They can rely on the terminal or emulation software to provide the input method services. Hanyu DECterm and Hanzi DECterm are considered as part of the DECwindows Motif applications. Therefore, the activating and deactivating method follows those of DECwindows Motif, discussed in Section 7.1.2. On VT382-D or VT382-C terminals, you select the input mode using the Compose key, which is located on the lower-left side of the main keyboard. On the Chinese version of LK201 or LK401 keyboard (that is, LK201-C, LK201-D, LK401-C or LK401-D), the Compose key is labeled as 0 . For details, see Chapter 6. Once the Chinese input mode is activated, the firmware of the terminal or the input methods incorporated in DECterm automatically compose Chinese characters and return the input data as appropriate. 2.1.17.1.2 DECwindows Motif Applications For DECwindows Motif applications, Chinese input methods are implemented in the form of independent processes called input servers. These Chinese input servers are X client processes that can work on a standard X server provided the X server has the required Chinese fonts installed. This means that the Chinese input server can run on any system which can access your X display device, including the device itself. Although the codesets returned by traditional and Simplified Chinese input servers are fixed, the Compaq Tru64 UNIX software allows you to connect applications to the input servers using any valid Chinese locale. The Compaq Tru64 UNIX software provides the required codeset conversion.

7–2 Tru64 UNIX Technical Reference for Using Chinese Features The traditional Chinese input server provided with the Compaq Tru64 UNIX software is interoperable with all existing DECwindows Motif /Hanyu platforms, including OpenVMS DECwindows Motif /Hanyu and UWS/Hanyu. The Simplified Chinese input server provided with the Compaq Tru64 UNIX software is interoperable with all existing DECwindows Motif /Hanzi platforms, including OpenVMS DECwindows Motif /Hanzi and UWS/Hanzi. Both input servers also provide input method services to the R6 X library (Xlib) supported by the Compaq Tru64 UNIX software. You can write internationalized applications using the standard R6 application programming interface and communicate with this input server. For details about developing internationalized software with X11R6, see Writing Software for the International Market. Before you can input Chinese data, you must start the Chinese input server on your workstation or any system on your network that can be accessed by your workstation. English and Chinese user interfaces are provided, so be sure to set the correct session language before starting the input server. There are several ways to start the Chinese input server: • Using the Session Manager You can start the Chinese input server after logging in to a session by selecting Hanyu IM or Hanzi IM from the Applications menu of the Session Manager, just like starting an ordinary DECwindows application. • Automatic Startup of the Input Server If you start up your session in one of the traditional Chinese locales, by default, the Hanyu IM menu item is added to the Session Manager’s Automatic Startup list. Hanzi IM is the default for Simplified Chinese. When you log in, the input server starts automatically. If you do not want to auto-start the input server, you can remove this item from the Automatic Startup list by using the Session Manager’s Customize menu.

______Note ______Applications which are started before Hanyu IM or Hanzi IM cannot connect to the input server. Therefore, Hanyu IM or Hanzi IM should be the first item on the Automatic Startup list. ______

Tru64 UNIX Technical Reference for Using Chinese Features 7–3 • Using a Command You can start the input server on a workstation that you are using by entering one of the following commands: % /usr/bin/X11/dxhanyuim & % /usr/bin/X11/dxhanziim & You can start the input server on a remote system for Hanyu IM by entering the following command on that system: % setenv DISPLAY % /usr/bin/X11/dxhanyuim & For Hanzi IM, enter the following commands: % setenv DISPLAY % /usr/bin/X11/dxhanziim & In these examples, is the display name of your workstation. After you invoke the Chinese input server, the DECwindows Motif applications which have been internationalized to support Chinese can communicate with it to provide input method services. 2.1.17.1.3 CDE Applications For CDE applications, Chinese input methods are implemented by input servers. Before you can input Chinese data, you must start the Chinese input server. There are two ways to start the Chinese input servers in CDE: • Automatic Startup of the Input Server If Chinese is selected on the CDE login menu, the Chinese input server starts automatically. When you log in, the following script runs: /usr/dt/config/Xsession.d/0020.dtims The value of the DTSTARTIMS environment variable determines whether the script will automatically start the specified Chinese input server. • Using a Command You can enter one of the following commands to start the input server on a workstation you are using: % /usr/bin/X11/dxhanyuim & % /usr/bin/X11/dxhanziim &

7–4 Tru64 UNIX Technical Reference for Using Chinese Features You can enter one of the following commands on a remote system to start the input server on that system for Hanyu IM: % setenv DISPLAY % /usr/bin/X11/dxhanyuim &

For Hanzi IM, enter the following command: % setenv DISPLAY % /usr/bin/X11/dxhanziim & In these examples, is the display name of your workstation. After you invoke the Chinese input server, the CDE applications which are internationalized to support Chinese can communicate with it to provide input method services. 7.2 Switching Input Method Table 7–1 shows the key sequences for toggling between the English and Chinese input modes.

Table 7–1: Key Sequences that Invoke Chinese Input Method

Terminal or Keyboard Type Default Key Sequence VT382-D [Compose] VT382-C DECwindows Motif LK201 [Compose/Space] LK401 [Compose] PCXAL [Alt/Space]

______Note ______You can use the input server options menu to customize the key sequences used to invoke the Chinese input method. ______

Table 7–2 and Table 7–3 show the key sequences for selecting a specific traditional Chinese or Simplified Chinese input method once you are in the Chinese input mode.

Tru64 UNIX Technical Reference for Using Chinese Features 7–5 Table 7–2: Key Sequences Used to Select Traditional Chinese Input Method

Input Method Default Key Sequence Full-Form Alphabets [Shift/Space] Tsang—Chi [F6] Quick Tsang—Chi [F7] Internal Code [F8] Phrase [F9] Phonetic [F10] Symbol Input [Z] (in Tsang—Chi or Quick Tsang—Chi mode)

Table 7–3: Key Sequences Used to Select Simplified Chinese input Method

Input Method Default Key Sequence Phrase Input [F5] 5-Stroke [F6] Qu-Wei [F7] Pin-Yin [F8] Telex Code [F9] 5-Shape [F10]

______Note ______You can use the Options menu to customize the key sequences used to select the input method. ______

In standard Motif, the function key [F10] is defined as the accelerator of the pull-down menu bar. In DECwindows Motif, the menu accelerator by default is [Ctrl/F10] so that [F10] can be used to invoke the Phonetic Input Method. To make your DECwindows Motif applications Motif compliant, insert the following line in your $HOME/.Xdefaults file: *menuAccelerator: Ctrl F10:

7–6 Tru64 UNIX Technical Reference for Using Chinese Features ______Note ______If you change [F10] to the menu accelerator, you cannot invoke the Phonetic Input Method unless you change the invocation key to another key sequence. ______

7.3 Motif Interface Input Method You can interact with the Chinese input server through a Motif-style user-interface. This interface allows an input method to provide feedback about the data being edited, to help you compose a character, list choices for selection, provide options for customizing the input server, and so on. 2.1.17.3.1 Input Areas The X Input Method specification defines the three input areas shown in Table 7–4.

Table 7–4: Window Input Areas

Region Description Auxiliary area An option menu helps you customize the Chinese input methods and the input method window. Status area Critical information about the internal state of the Chinese input methods is displayed in this area. Pre-edit area The intermediate text that is being composed is displayed in this area, which also displays a list of valid candidates for the input key sequences.

2.1.17.3.2 Interaction Styles The use of the input areas depends on the interaction style (or pre-edit style) selected for the application. The Chinese input server supports two interaction styles: • Root Window Interaction • Off-the-Spot Interaction 7.3.2.1 Root Window Interaction You can choose the root window interaction style if you want to display the pre-edit data in an input window which is separate from the application window. You can scale and move the input window to meet your preferences. If you want to free up more screen space, you can iconize the input method window. You can also choose to display pre-edit data in vertical or horizontal layout.

Tru64 UNIX Technical Reference for Using Chinese Features 7–7 Figure 7–1: Chinese Root Window Interaction Style

Figure 7–2: Chinese Input Window Icon

You can continue to input Chinese characters through a Chinese application window when the input window is iconized. The input state is displayed on the icon title, which is updated according to the input mode and the input focus. If you want to see the pre-edit data, you can double click the icon to redisplay the input window.

7–8 Tru64 UNIX Technical Reference for Using Chinese Features 7.3.2.2 Off-the-Spot Interaction If you want to display the pre-edit data in a fixed location of the application window, you can choose the off-the-spot interaction style. With this interaction style, the Chinese input server creates the input window at the bottom of the application window. You need not refer to the root window and you can iconize it to save screen area.

Figure 7–3: Off-the-Spot Interaction Style

You can specify the priority of the interaction styles of DECwindows Motif or CDE applications by specifying the VendorShell resource, XmNpreeditType. By default, the resource value is overthespot,offthespot,root,onthespot. This list is in priority order. The first style is used if available in an input method, else the second, and so on. There are two ways to choose your preferred interaction style: • Use the -xrm command line option to specify the resource value when you start an application. For example, the following command starts CDE Calendar Manager with the root window interaction style: % dtcm -xrm ’*preeditType: root,offthespot’ & To start CDE Calendar Manager with the off-the-spot interactive style, you can enter: % dtcm -xrm ’*preeditType: offthespot,root’ & • For DECwindows Motif, use the Session Manager’s Options menu in XDM: — From the Session Manager's Options menu, select Input Method... — In the popup Input Method Options window, click on the appropriate pre-edit style button

Tru64 UNIX Technical Reference for Using Chinese Features 7–9 The XmNpreeditType resource is set to a priority list beginning with the pre-edit style that you have chosen. After you select your preferred interaction style, the applications you invoke start up with the new setting. • For CDE, you can invoke the dtimsstart application to change the input method and the input style of the current input method.

______Note ______Some applications, such as DECterm, may provide their own user interface to handle interaction styles. Those mechanisms may override the methods described here. ______

2.1.17.3.3 Input Server Operations When you start a Chinese input server, no application is connected to it and the title bar indicates that there is no connection. When an internationalized application starts in a Chinese locale, the string is displayed in the status area for dxhanyuim, and is displayed for dxhanziim, indicating that the application is connected to the Chinese input server and the mode is English. If you invoke a Chinese input method, the input state displayed in the status area and the title bar is updated accordingly. If you change the input focus to a noninternationalized application window, the title bar of the input window changes to indicate there is no connection. The input server can maintain an individual state of composition for different input contexts or application windows. In addition, under the root window interaction style, each application window can be associated with its own attributes, such as font size, font style, layout, input window size, and position. You can set the input focus to an application window and then compose a Chinese character or customize the input window. The input server stores information about the composing state and input window attributes so that next time this application window gets the input focus, the original composing state and attributes are restored. 2.1.17.3.4 Options Menu The auxiliary area of the input window provides an Options menu that you use to customize the input server. You can click on the Options button to access the customization pull-down menu. The menu provides these options for traditional Chinese: • Vertical Layout • Horizontal Layout • Select Phrase Input Class

7–10 Tru64 UNIX Technical Reference for Using Chinese Features • User Phrase Database • System Phrase Database • Current Window • Input Method Customization • Help • Quit 7.3.4.1 Vertical Layout You can choose the Vertical Layout option only if the current layout is horizontal. When you choose this option, the input window and the layout of its contexts immediately display in a vertical orientation. The vertical input window remains at the same origin. 7.3.4.2 Horizontal Layout You can choose this option only if the current layout is vertical. When you choose this option, the input window and the layout of its context immediately display in a horizontal orientation. The horizontal input window remains at the same origin. 7.3.4.3 Select Phrase Input Class You can use the Select Phrase option to customize the phrase input mode. DECwindows Motif shares the phrase databases that are created and managed by the Compaq Tru64 UNIX Phrase Utility. After you create a phrase database and define your phrases, both character-cell terminal applications and DECwindows Motif applications can use the data for phrase input. In order to use the phrase databases, the LANG environment variable must be set to reflect the required codeset, that is, zh_TW.dechanyu. For details about the Compaq Tru64 UNIX Phrase Utility and phrase definition file, see Writing Software for the International Market. The Select Phrase Input Class option allows you to focus on a particular class of phrases during phrase input. When you choose this option, a dialog box pops up and you can select the phrase class that you want to use. To select all classes, you choose the * option. If you do this, the phrase input method searches all classes of phrase definitions for the phrase code that you entered. 7.3.4.4 User Phrase Database The Phrase Input method allows you to access two phrase definition databases: the system phrase database and the user phrase database. The former is for public access by all users using your system. It should be created and modified by your system administrator. You can also create and maintain your own private phrase database for storing your frequently used phrases. This is called the user phrase database.

Tru64 UNIX Technical Reference for Using Chinese Features 7–11 ______Note ______The databases that you can access are the ones available on the system on which you start your Chinese input server. ______

For details about creating a phrase database, see Writing Software for the International Market. If you choose this option, you will access your private user phrase database. 7.3.4.5 System Phrase Database If you choose the System Phrase Database option, you will access the system phrase database. 7.3.4.6 Current Window The Current Window option allows you to customize the attributes of a specific application window.

______Note ______The Current Window option is available only if you chose root window as your interaction style and you focused to an internationalized application input area. Otherwise, this option is greyed out. If you choose the off-the-spot interaction style, the application determines the attributes. ______

When you choose this option, a dialog box pops up and the following options are displayed: • Font Size You can choose the font size for displaying pre-edit data. Click on either the Big Font or Small Font toggle buttons. • Font Typeface You can choose the font typeface to be used in the input window. To choose the font typeface in a traditional Chinese input server, click on one of the following toggle buttons: — Hei — Sung — Screen

7–12 Tru64 UNIX Technical Reference for Using Chinese Features To choose the font typeface in a Simplified Chinese input server, click on one of the following toggle buttons: — Heiti — Songti — Kaiti — FangSongti — Screen You can define a typeface that does not exist in the options list in the Chinese input server resource file. This typeface will be displayed beside the Other: label in the customization window. • Line Spacing The Chinese input server can display pre-edit data on more than one text line. Usually, this happens when a list of items is displayed for your selection. You can use the Line Spacing option to specify the spacing between text lines in pixels. To adjust the line spacing, drag the Line Spacing slider or move the pointer to the desired position on the slider and click MB1. • Foreground and Background Colors You can customize the foreground and background colors of the input window. For monochrome display, the following options are provided: — Dark Text, Light Background — Light Text, Dark Background For color display, you can choose from a palette of colors to design a visually pleasing input window. To customize the foreground or background color, you should first select the color that you want to change by clicking one of the following toggle buttons: — Input Window Foreground Color — Input Window Background Color A color mixing window pops up in which you can mix the color using the three sliders, which represent the intensities of the primary colors. The modified color is displayed in the right half of the color box while the left half displays the original color.

Tru64 UNIX Technical Reference for Using Chinese Features 7–13 7.3.4.7 Input Method Customization There are several customizable attributes which apply to all input windows. In the traditional Chinese input server, they are: • Default Input Method • Bell Volume • EDPC support • Invocate Key In the Simplified Chinese input server, they are: • Default Input Method • Bell Volume • Display 5-Shape Radicals • Invocate Key When you choose the Input Method Customization option, a dialog box pops in which you can customize the attributes. • Default Input Method When the Chinese input server is activated, the default Chinese input method is set to the one you choose via this option. The input methods you can choose in the traditional Chinese input server are: — Tsang–chi — Quick Tsang–chi — Phonetic — Internal code The input methods you can choose in the Simplified Chinese input server are: — 5-Stroke — Row-Column — Pin-Yin — Telex Code — 5-Shape • Bell Volume

7–14 Tru64 UNIX Technical Reference for Using Chinese Features When you make an error while composing a Chinese character, the bell rings to alert you. To adjust the bell volume, drag the Bell Volume slider or move the pointer to the desired position on the slider and click MB1. • EDPC support (in dxhanyuim only) The traditional Chinese input server supports the input of both CNS 11643 and DTSCS (that is, EDPC) characters. However, you can choose to disable the input of EDPC characters so that the data that you can enter contains only CNS 11643 characters. This option is useful if you need to prepare Chinese data and interchange it with systems supporting only CNS 11643. To enable or disable the input of EDPC characters, click on the EDPC Characters Input button. • Display 5-shape Radicals (in dxhanziim only) The Simplified Chinese input server supports the display of 5-shape radicals when you choose the 5-stroke input method. The radicals are displayed after each candidate in the candidate list during the pre-edit. • Invocate Key The key sequences for invoking and switching Chinese input methods are set by default. You can change these default key sequences to meet your personal preference or working style. This option allows you to customize the following key sequences for both dxhanyuim and dxhanziim: — Start Input Method — End Input Method — Phrase Input — Invoke Next Input Method The following choices are for dxhanyuim only: — Start Full Form — End Full Form — Tsang–chi — Quick Tsang–chi — Internal Code — Phonetic

Tru64 UNIX Technical Reference for Using Chinese Features 7–15 The following choices are for dxhanziim only: — 5-Stroke — Row-Column — Pin-Yin — Telex Code — 5-Shape In the bottom part of the dialog box is an easy-to-use interface where you can customize a key sequence. You can select a trigger key and toggle the on/off state of the Ctrl, Alt, and Shift modifiers. The trigger keys that you can choose include NoSymbol, [F1] - [F20], [Space], [Return], [Compose] and [A]- [Z]. If you choose NoSymbol, no invocation sequence will be provided for the selected action. For each modifier key, you can select the on/off state with the toggle buttons shown in Table 7–5.

Table 7–5: Modifier State Customization

Modifier On State Off State Ctrl Ctrl ~Ctrl Shift Shift ~Shift Alt Alt ~Alt

The tilde (~) sign means that you should not press that modifier key when invoking the action. In addition to the on/off state, you can also deselect both of the states for a modifier key so that neither state is selected. To do this, click the toggle button which is currently set on. If you deselect a modifier, the input server will accept the invocation key with or without holding the modifier key. When an invocation key sequence is selected, the state of the toggle switches and the trigger key displayed at the bottom of the dialog box is updated to reflect the current value and the label at the bottom left-hand side changes.

7–16 Tru64 UNIX Technical Reference for Using Chinese Features Figure 7–4: Customization of Invocation Key Sequences in dxhanyuim

Figure 7–5: Customization of Invocation Key Sequences in dxhanziim

Tru64 UNIX Technical Reference for Using Chinese Features 7–17 For example, if you want to change the End Input Method key sequence to [Ctrl/Space], select the Ctrl, ~Alt and ~Shift buttons. To reduce the number of keys required for selecting input methods, set the hot keys for all the input methods to "NoSymbol", and define a key sequence for the Choose Next Input Method option. This method releases the [F6] to [F10] function keys for use by other DECwindows applications. To switch the input method, press the key sequence for Choose Next Input Method and the Chinese input server will cycle through all supported input methods. For example, if you use the Tsang–Chi input method and you want to switch to the internal code input method, press the hot key twice. 7.3.4.8 Help The Help option provides the following menu items that you use to display help for the Chinese input server: • Context-Sensitive Help • Overview • Using Help • Product Information 7.3.4.9 Quit Use the Quit option to terminate the input server. If you select this option, a dialog box pops up and asks if you really want to exit. 2.1.17.3.5 Saving Your New Settings All attributes that you customize with the Current Window and Input Method Customization menus can be saved into a resource file in your login directory. Each customization window provides the following options: • Save Settings as Defaults

Saves all current attributes as default values. These attributes are saved to a private resource file in your login directory; DXhanziim if you are using dxhanziim, and .DXhanyuim if you are using dxhanyuim. • Restore system setting

Restores all system default attributes

7–18 Tru64 UNIX Technical Reference for Using Chinese Features 7.4 Alphabetic Input Methods There are two alphabetic input methods available under the English mode: • Half Form Alphabet • Full Form Alphabet The Half Form Alphabet method allows you to enter uppercase and lowercase English characters, numerals, and symbols marked on the keyboard. Full Form Alphabet allows you to enter 2-byte alphabets, numerals, and symbols defined in the Chinese character sets. To invoke the Full Form Alphabet input method, press [Shift/Space]. The string (full form) is displayed in the status area, as shown in Figure 7–6. Once the prompt appears, all characters that you enter at the keyboard are sent as 2-byte characters. To exit the Full Form Alphabet input method, press [Shift/Space] again.

Figure 7–6: Full Form Alphabet Input Method `

7.5 Tsang–Chi Input Method To understand the Tsang–Chi input method, you must understand the concepts of Tsang– Chi root radicals, auxiliary forms, and character-splitting methods. 2.1.17.5.1 Tsang–Chi Root Radicals The Tsang–Chi input method is based on the concept of root radicals. The input method requires a Chinese character to be broken down into various root radicals according to its shape. Altogether 24 Tsang–Chi root radicals have been defined with which almost all existing Chinese characters can be composed. The root radicals are divided into four groups and assigned to the alphabet keys [A] - [Y] (with the exception of the [X] key) on the main keyboard. Table 7–6 a-d illustrate the classification of the root radicals, their corresponding English keys, their auxiliary forms, and the way that they are derived. Table 7–7 is a shorter table for quick reference.

Tru64 UNIX Technical Reference for Using Chinese Features 7–19 Table 7–6: Tsang–Chi root Radicals Classification a. Philosophical:

7–20 Tru64 UNIX Technical Reference for Using Chinese Features b. Stroke:

c. Human Body:

Tru64 UNIX Technical Reference for Using Chinese Features 7–21 d. Form:

7–22 Tru64 UNIX Technical Reference for Using Chinese Features Table 7–7: Quick Reference Table of the Tsang–Chi Root Radicals

Tru64 UNIX Technical Reference for Using Chinese Features 7–23 2.1.17.5.2 Tsang–Chi Code Generation To input a Chinese character using the Tsang–Chi input method, its Tsang–Chi code should be generated based on character decomposition. Most Chinese characters can be divided into two categories: the composite form and the connected form. The composite form can be split into a character head and a character body while the connected form cannot.

Table 7–8: Composition Form Characters

Composite Form Examples Left-right form Top-bottom form Inclusion form

Table 7–9: Connected Form Characters

Connected Form Examples Connected form

7.5.2.1 General Rules The general rules for generating Tsang–Chi codes are: • The character category must be composite or connected. • The code according to the writing order is usually one of the following:

• From outside to inside

• From top to bottom

• From left to right • The number of radicals is up to 5 for composite characters, where the character head can be decomposed into at most 2 radicals and the character body can be decomposed into at most 3 radicals. Connected characters can be decomposed into no more than 4 radicals. • If more than one Tsang–Chi code exists for a character, use the one with fewer radicals. • If several Tsang–Chi codes have the same number of radicals, use the one which better represents the character.

7–24 Tru64 UNIX Technical Reference for Using Chinese Features 7.5.2.2 Connected Characters Connected characters are those which cannot be split due to the existence of crossed or connected strokes, such as , , . Each character can be input by entering at most 4 radicals. If more than 4 radicals can be derived, the first 3 radicals and the last can be taken to generate the Tsang–Chi code. Table 7–10 illustrates some examples of decomposing connected characters.

Table 7–10: Examples of Connected Character Decomposition

Tru64 UNIX Technical Reference for Using Chinese Features 7–25 7.5.2.3 Composite Characters Composite characters are those that can be split from top to bottom, left to right, and outside to inside, such as , , . You can decompose the character head into 1 to 2 radicals. If more than 2 radicals are generated, take the first and the last radicals. The character body can be decomposed into 1 to 3 radicals. If it is made up of 3 or fewer radicals, you should enter all the radicals. If it is made up of more than 3 radicals which are connected, enter the first two radicals and the last. If the character body is itself a composite character, you can further decompose the character body into subhead and subbody. The radicals you should enter are the first and last radicals of the subhead, and the last radical of the subbody. Table 7–11 illustrates examples of decomposing composite characters. In the "Shape" column, a solid square represents a character head while a square represents a radical of the character body.

7–26 Tru64 UNIX Technical Reference for Using Chinese Features Table 7–11: Composite Character Decomposition

Tru64 UNIX Technical Reference for Using Chinese Features 7–27 7.5.2.4 Exceptional Characters Approximately 95% of Chinese characters can be decomposed according to the rules described in Section 7.5.2.3. The remaining 5% are exceptional characters that need to be entered in different ways. The exceptional characters can be divided into the following groups: • Compound Characters The Tsang–Chi input method has defined 9 compound characters. A compound character can be a connected character, or the character head or body of a composite character. In any case, compound characters must be represented by their first and last radicals.

Table 7–12: Compound Characters

• Difficult Characters Difficult Characters are those which are difficult to decompose in the Tsang–Chi method. Usually, these characters are composed of some special root radicals which are neither the Tsang–Chi root radicals nor their auxiliary forms. Using the Tsang– Chi input method you can press the [X] key (which is labeled with and therefore

7–28 Tru64 UNIX Technical Reference for Using Chinese Features will be referred to as the [ ] in this document) to access the special root radicals and use them to compose difficult characters. The rules of decomposing difficult characters are: — If it is easy to identify the first and the last radicals, you can enter the first radical, the [ ] key, and the last radical. For example, can be decomposed into (HXH). — If the first radical is easy to identify while the others are difficult, enter the first radical and then press the [ ] key for the rest. For example, can be decomposed into (YX), and can be decomposed into (HX). — Never use the [ ] key for the first radical.

Table 7–13: Difficult Characters

• Special Characters Some special characters are composed by superimposing the root radicals !, - and on other strokes or radicals. To keep the decomposed radicals as simple as possible, take the root radicals first, before the rest of the character is entered.

Tru64 UNIX Technical Reference for Using Chinese Features 7–29 Table 7–14: Special Characters

2.1.17.5.3 Invoking Tsang–Chi Input Method When you invoke the Tsang–Chi input method, the Chinese string is displayed in the status area, as shown in Figure 7–7.

Figure 7–7: Invocation of the Tsang–Chi Input Method

The radicals that you enter with the Tsang-Chi input method are displayed in the pre-edit area, as shown in Figure 7–8. To correct the data, press the Delete key and reenter the correct radical. Alternatively, you can press the key (that is, F6 on a standard LK201 or LK401 keyboard) to erase all radicals in the pre-edit buffer. If fewer radicals are required to input a character, you can press the Return key or Space bar to signal the end of input.

Figure 7–8: Entering a Tsang–Chi Radical

2.1.17.5.4 Multiple Candidates If there is exactly one character represented by a Tsang–Chi code, the character is sent directly to the application. Sometimes, multiple candidates for a Tsang–Chi code are

7–30 Tru64 UNIX Technical Reference for Using Chinese Features available for selection when the code represents more than one Chinese character. In this case, the candidates are displayed in the pre-edit area in the following order: • Most frequently-used • Less frequently-used • Seldom used The pre-edit area can display up to 9 candidates at a time.

Figure 7–9: Multiple Candidates 2! !!!5! !!!8! !!

The numbers 1, 4, and 7 divide the 9 characters into 3 groups so that you can easily select the desired candidate. To select a character that is displayed in the pre-edit area, press the corresponding numeric key on the main keyboard. When there are more than 9 candidates for selection, the indicators, - and are displayed in the pre-edit area. Table 7–15 lists the indicators and their definitions.

Table 7–15: Meaning of Arrow Characters

Indicator Definition The current row is the first row and you can press [Space] or [⇒] to move to the next row. The current row is somewhere between the first and the last row. You can press: • [Space] or [⇒] — move to the next row • [⇐] — move to the previous row • [⇑] — move to the first row

The current row is the last row and you can press [⇐] to move to the previous row or [⇑] to the first row.

If you enter another Tsang–Chi code without selecting a candidate, the first candidate in the list is sent to the application. If you do not want to select any candidate, but want to clear the Tsang–Chi code, press the Return key or the key (that is, F6). 2.1.17.5.5 Repeat Character Input If you want to repeat input of the same character, press the equals (=) key.

Tru64 UNIX Technical Reference for Using Chinese Features 7–31 2.1.17.5.6 Error Handling If you input incorrect data, the bell will ring. If no character is generated after you enter a Tsang–Chi code, this indicates that there is no character for the code. The radicals entered remain in the pre-edit buffer. To handle the error, you can do one of the following: 1. Press the Delete key to erase the radicals, one at a time, and then reenter the correct radicals. 2. Press the Return key or the key to erase all radicals in the pre-edit buffer, and then reenter the correct radicals. 3. Enter new radicals without pressing the Return key. The radicals in the pre-edit buffer are replaced by the newly-entered radicals. 7.6 Quick Tsang–Chi Input Method The Quick Tsang–Chi input method, also known as the Quick input method or the so- called Easy input method, is a variant of the Tsang–Chi input method and follows the same principles and rules for decomposing characters into radicals. However, the process for entering radicals is simplified and requires only the first and the last radicals. For example, the character is decomposed in the Tsang–Chi input method into , , , . The Quick Tsang–Chi input method, in this case, requires input of only and . This section discusses the mechanism of the Quick Tsang–Chi input method. For details about character decomposition, see Section 7.5, Tsang–Chi Input Method. 2.1.17.6.1 Quick Tsang–Chi Code Generation As in the Tsang–Chi input method, the character decomposition in Quick Tsang-Chi is based on whether a character is of the composite form or the connected form. However, the Quick Tsang–Chi input method requires only the first and the last radicals regardless of the number of radicals obtained. 2.1.17.6.2 Invoking Quick Tsang–Chi Input Method When you invoke the Quick Tsang–Chi input method, the Chinese string !is displayed in the status area, as shown in Figure 7–10.

Figure 7–10: The Quick Tsang–Chi Input Method

2.1.17.6.3 Entering Quick Tsang–Chi Code The radical that you enter is displayed in the pre-edit area, as shown in Figure 7–11. To correct the data, press the Delete key and reenter the correct radical. Alternatively, you

7–32 Tru64 UNIX Technical Reference for Using Chinese Features can press the key (that is, F7 on a standard LK201 or LK401 keyboard) to erase all radicals in the pre-edit buffer. If only one radical is required to input a character, press the Return key or Space bar to signal the end of input.

Figure 7–11: Entering a Quick Tsang–Chi Code

2.1.17.6.4 Multiple Candidates If exactly one character is represented by a Quick Tsang–Chi code, the character is sent directly to the application. Frequently, multiple candidates for a Quick Tsang–Chi code are available for selection when the Quick Tsang–Chi code represents more than one Chinese character. In this case, the candidates are displayed in the pre-edit area. If you do not want to select any candidate, but want to clear the Tsang–Chi code, press the Return or the key (that is, F7). 2.1.17.6.5 Repeat Character Input If you want to repeat the input of the same character, press the equals (=) key. 2.1.17.6.6 Error Handling If you input incorrect data, the bell will ring. If no character is generated after you enter a Quick Tsang–Chi code, this indicates that there is no character for the code. The radicals entered remain in the pre-edit buffer. To handle the error, you can do one of the following: • Press the Delete key to erase the radicals, one at a time, and then reenter the correct radicals. • Press the Return key or the key to erase all radicals in the pre-edit buffer, and then reenter the correct radicals. • Enter new radicals without pressing the Return key. The radicals in the pre-edit buffer are replaced by the newly entered radicals. 7.7 Phonetic Input Method The Phonetic input method is based on Chinese phonetic symbols (bopomofo) that represent the pronunciation of Chinese characters. 2.1.17.7.1 Phonetic Symbol Categories Phonetic symbols can be divided into 3 categories: consonants, vowels, and tone marks. There are 21 consonants, 16 vowels, and 5 tone marks. The 5 tone marks for Chinese pronunciation are the first, the second, the third, the fourth and the light tones. Chinese

Tru64 UNIX Technical Reference for Using Chinese Features 7–33 phonetic symbols are assigned to the alphanumeric keys on the main keyboard. Table 7–16 summaries all consonants, vowels, and tone marks.

Table 7–16: Phonetic Symbols

7–34 Tru64 UNIX Technical Reference for Using Chinese Features ______Note ______The vowels , -and are also called semi-vowels. ______

2.1.17.7.2 Phonetic Code Generation The pronunciation of a Chinese character is composed of consonants, vowels, and tone marks. Therefore, a phonetic code can be generated according to the following rules: • The phonetic symbols are entered in the following order:

- Consonant

- Vowel

- Tone marks • A phonetic representation must have at least one consonant or one vowel. • The end of input must be indicated with a tone mark or a Return.

Table 7–17: Examples of Phonetic Input

Code Format Phonetic Symbols Characters Consonant + vowel + tone mark Consonant + [vowel] + tone mark -! Vowel + tone mark -! Consonant + semivowel + vowel + tone , mark Semivowel + vowel + tone mark

2.1.17.7.3 Invoking Phonetic Input Method When you invoke the Phonetic input method, the Chinese string is displayed in the status area, as shown in Figure 7–12. The example in Figure 7–12 shows how to input the character by entering the phonetic symbols (RU6) at the main keyboard.

Figure 7–12: Entering the Phonetic Symbols for [ ] 1 4 7

Tru64 UNIX Technical Reference for Using Chinese Features 7–35 2.1.17.7.4 Entering Phonetic Code The phonetic symbols that you enter are displayed in the pre-edit area. To correct the data, press the Delete key and reenter the correct symbol. Alternatively, you can press the [ ] key (that is, F10 on a standard LK201 or LK401 keyboard) to erase all phonetic symbols in the pre-edit buffer. You can press various termination keys to signal that you are done entering the phonetic symbols. Table 7–18 shows the results of entering phonetic symbols with different termination keys.

Table 7–18: Phonetic Symbols with Different Termination Keys

Tone Key Description Example 1st Space Lists characters with the same Type - then press the Space bar, consonant or vowel in the order of ///- the first, second, third, fourth, and ///-! light tone marks. ///-! /// will be listed. 2nd 6 Lists characters of the second tone. Type - /// will be listed. 3rd 3 Lists characters of the third tone. Type - /// ! will be listed. 4th 4 Lists characters of the fourth tone. Type - /// ! will be listed. Light 7 Lists characters of the light tone. Type -!! /// ! will be listed None Return Characters corresponding to the Type -, then press the Return phonetic symbols are displayed key, characters corresponding to the according to the order of tone order marks. If only a consonant is entered before pressing Return, characters corresponding to any ///- valid combinations of this ///- /// will be consonant and other vowels are listed. also displayed.

2.1.17.7.5 Multiple Candidates If a phonetic string matches exactly one character, the character is sent directly to the application. Frequently, a phonetic string matches multiple Chinese characters. In this case, the candidates are displayed in the pre-edit area.

7–36 Tru64 UNIX Technical Reference for Using Chinese Features If you do not want to select any candidate, but want to clear the phonetic code, press the Return key or the key (that is, F10). 2.1.17.7.6 Repeat Character Input If you want to repeat the input of the same character, press the equals (=) key. 2.1.17.7.7 Error Handling If you input incorrect data, the bell will ring. If no character is generated after you enter a phonetic code, this indicates that there is no character for the code. The phonetic symbols entered remain in the pre-edit buffer. To handle the error, you can do one of the following: • Press the Delete key to erase the radicals, one at a time, and then re-enter the correct radicals. • Press the Return key or the key to erase all phonetic symbols in the pre-edit buffer, and then reenter the correct phonetic symbols. • Enter new phonetic symbols without pressing the Return key. The phonetic symbols in the pre-edit buffer are replaced by the newly entered symbols. 7.8 Internal Code Input Method Each character in DEC Hanyu has been assigned a unique internal code, just like the ID number of a company employee. For a complete list of the characters and their internal codes, see The DEC Chinese Code Book (Part Number EK-VT38D-CB-001).

______Note ______In this Compaq Tru64 UNIX release, the Internal Code input method supports only the DEC Hanyu internal code. The internal codes for Taiwanese EUC and Big-5 are not supported. Even if you set the locale to one of the Taiwanese EUC or Big-5 locales, this input method still requires you to specify the DEC Hanyu internal code. ______

2.1.17.8.1 Input Procedure When you invoke the Internal Code input method, the Chinese string is displayed in the status area. To enter an internal code, you can optionally enter a character set number to be followed by a four-digit hexadecimal number which specifies the position of the character with respect to the character set. The character set number can be 1 for the CNS 11643-1986 character set, or 2 for the DTSCS character set.

Tru64 UNIX Technical Reference for Using Chinese Features 7–37 If you omit the character set number and enter only the four-digit hexadecimal code, you must press the Return key or the Space bar to signal the end of input. If you enter the character set number and the four-digit hexadecimal code, the respective character is sent automatically without pressing the Return key. Figure 7–13 shows the input of using the Internal Code input method.

Figure 7–13: Input of Using the Internal Code Input Method ` 1CBF

The internal code that you enter is displayed in the pre-edit area. To correct the data, press the Delete key and re-enter the correct code. Alternatively, you can press the key (that is, F8 on a standard LK201 or LK401 keyboard) to erase all characters in the pre-edit buffer. Since internal codes are unique for any symbols or Chinese character in DEC Hanyu, there are not multiple candidates for an internal code. 2.1.17.8.2 Repeat Character Input If you want to repeat the input of the same character, press the equals (=) key. 2.1.17.8.3 Error Handling If you input incorrect data, the bell will ring. If no character is generated after you enter an internal code, this indicates that there is no character for the code. The internal code entered remains in the pre-edit buffer. To handle the error, you can do one of the following: • Press the Delete key to erase the characters representing the internal code, one at a time, and then reenter the correct internal code. • Press the Return key or the key to erase all characters in the pre-edit buffer, and then reenter the correct internal code. • Enter a new internal code without pressing the Return key. The characters in the pre- edit buffer are replaced by the newly entered internal code. 7.9 Phrase Input Method The Phrase input method is a mechanism designed to facilitate the input of frequently used phrases. You can define your own frequently used phrases by preparing your own phrase database. Each phrase is identified by a phrase code. To input a phrase, you enter its phrase code and then convert it to the respective phrase.

7–38 Tru64 UNIX Technical Reference for Using Chinese Features The Compaq Tru64 UNIX operating system provides a Phrase Utility that you use to create phrase databases. For details, see Writing Software for the International Market. In addition, the firmware of the VT382-D traditional Chinese terminal is designed to allow you to download phrases into its built-in memory. For DECwindows Motif applications, you do not need to download phrase definitions because the Chinese input servers can directly access the phrase databases. In addition, you can select the phrase database being used. 2.1.17.9.1 Input Procedure When you invoke the Phrase input method, the Chinese string is displayed in the status area of dxhanyuim, and in the status area of dxhanziim. Figure 7–14 shows an example of entering a phrase in dxhanyuim by specifying its phrase code. Figure 7–15 shows an example of converting the phrase code to the phrase.

Figure 7–14: Entering a Phrase Code

0BNCBTTBEPS

74-!Divoh!Tibo!O/!Se/-!Tfd/!3-!Ubjqfj

` ASIA

Figure 7–15: Converting a Phrase Code to a Phrase

! 0BNCBTTBEPS ! !74-!Divoh!Tibo!O/!Se/-!Tfd/!3-!Ubjqfj

! 0BTJB!XPSME!QMB[B`!!\ ^

The phrase code that you enter is displayed in the pre-edit area, as shown in Figure 7-14. To correct the data, press the Delete key and reenter the correct code. Alternatively, you can press the key (that is, F9 on a standard LK201 or LK401 keyboard) to erase all characters in the pre-edit buffer. The phrase code can consist of at most 8 characters. If it has fewer than 8 characters, press the Return key or Space bar to signal the end of input. If it has exactly 8 characters, the respective phrase is sent automatically without having to press the Return key after you enter the last character.

Tru64 UNIX Technical Reference for Using Chinese Features 7–39 The Phrase input method is different from other input methods in the sense that once a phrase is entered, the input mode switches back to the original input mode from which the Phrase input method was invoked. To input another phrase, press the key again and then enter another phrase code. 2.1.17.9.2 Error Handling If the data that you input is incorrect, the bell will ring. 7.9.2.1 Condition 1 If the message (Requested phrase does not exist) is displayed when you enter a phrase code, you can check the following: • Check to be sure that the phrase is in the phrase definition file. • If you are using a VT382-D terminal:

- Check to be sure that the phrase definition file containing the phrase definition has been downloaded to the VT382-D Chinese terminal.

- Check to see if the terminal power supply has been interrupted since you last loaded the phrase definition file.

- Check to see if you have reset the VT382-D terminal. The Recall and Default operations provided by the VT382-D setup menu clear the phrase definitions which have been downloaded. • If you are using a Chinese input server:

- Check to be sure that the phrase definition file containing the phrase definition has been selected by the Chinese input server.

- Check to see if the LANG environment variable has been set up correctly before starting the Chinese input server. If the phrase has been defined but it does not exist in the in-memory phrase database, reload or reselect the phrase definition file. 7.9.2.2 Condition 2 Errors can occur in the phrase definition file. If you cannot solve the problem using the procedures in Section 7.9.2.1, you can do one of the following: • Check the syntax of the definition statements in the phrase definition file. A phrase cannot be downloaded if the syntax of its definition is incorrect.

7–40 Tru64 UNIX Technical Reference for Using Chinese Features • If you are using a VT382-D terminal, a maximum of 100 phrases can be downloaded. If the number of phrases in the phrase definition file exceeds this limit, the definitions beyond the limit are not downloaded. 7.10 Symbol Input in Dxhanyuim The Symbol input method is a more intuitive way to input two-byte symbols, such as punctuation marks, table and mathematical symbols, foreign characters, phonetic symbols, and traditional Chinese control characters. You can invoke the Symbol input method only from the Tsang–Chi or the Quick Tsang– Chi input method. You can use it to input more than 600 symbols, including all full form alphabets (A -Z, a-z), full form numerals (0-9), full form symbols (such as - -! ///), phonetic elements (such as -! -! -! ), Chinese radicals and special symbols (such as -! -! *. 2.1.17.10.1 Invoking Symbol Input Mode To enter the Symbol input mode, press the Z key when invoking Tsang–Chi or the Quick Tsang–Chi input method. When you invoke the Symbol input mode, the Chinese string is displayed in the status area. 2.1.17.10.2 Rules of Symbol Input Use the following rules when entering symbols: • If a symbol exists on the keyboard, input it the same way you enter single-byte characters.

Symbol Input Code Key Sequence ! # $ A [A] A [a]

• Symbols of similar form or meaning have the same key sequence.

Symbol Input Code Description ( vertical parenthesis ) vertical parenthesis ( horizontal parenthesis ) horizontal parenthesis

Tru64 UNIX Technical Reference for Using Chinese Features 7–41 • If a symbol does not exist on the keyboard, decompose it according to its shape.

Symbol Key Sequence [=][/] [O][+] [S][S] [K][G]

• For graphical symbols, press the hyphen (-) key and a number within the range 1-8 to specify horizontal symbols or press the vertical bar key ( | ) and a number within the range 1-7 to specify vertical symbols.

Symbol Key Sequence Symbol Key Sequence [-][1] [|][1] [-][2] [|][2] [-][3] [|][3] [-][8] [|][7]

• To enter symbols for constructing a table or diagram, press the T key and an alphabet key specifying the direction.

Symbol Key Sequence [T][Z] [T][X] [T][C] - [T][A] [T][S] ,

Q W E A S D Z X C

• To enter arrows of various directions, press the A key and an alphabet key specifying the direction.

7–42 Tru64 UNIX Technical Reference for Using Chinese Features Symbol Key Sequence [A][Z] [A][X] [A][C] [A][A]

Q E W

A D Z X C

• Press the P key to indicate the input of phonetic symbols. To specify the required symbol, press the key with the respective phonetic symbol marked on the keyboard. See Chapter 6 for the keymap of phonetic symbols.

Symbol Key Sequence [P][1] [P][Q] [P][A] [P][Z] [P][2]

• Press the G key (or g for the lowercase) to input Greek letters and then enter the first two letters of the Greek character’s name. Use uppercase or lowercase letters as appropriate.

Symbol Key Sequence [G][A][L] [G][B][E] [g][a][l] [g][b][e] [G][E][P]

2.1.17.10.3 Entering Symbol Code You can input a symbol by entering its symbol code and then pressing the Return key or Space bar to signal the end of input.

Tru64 UNIX Technical Reference for Using Chinese Features 7–43 The following example describes how to enter the full form alphabet ; 1. Press the Z key to invoke the Symbol input mode.

2. Press the A key to enter the first alphabet of the Symbol code.

B

2.1.17.10.4 Multiple Candidates If a symbol code matches exactly one character, the character is sent directly to the application. However, sometimes a symbol code matches multiple candidates. In this case, the candidates are displayed in the pre-edit area. To continue the preceding example: 1. In the preceding example, press the Return key to signal the end of input.

B !2!

2. Press the 1 key to select _

!!!!!!!

______Note ______After the symbol is input, the input mode switches back to the original input mode. ______

2.1.17.10.5 Error Handling If no symbol is input after you enter a symbol code, the symbol code is invalid and the input mode switches back to the original input mode. You can press the Z key to invoke the Symbol input mode again and reenter the correct symbol code. 7.11 Input of User–Defined Characters in DECwindows Motif The Compaq Tru64 UNIX software supports the input of user-defined characters (UDC) through the Tsang–Chi and Quick Tsang–Chi input methods. When you invoke the Chinese input server, the following Tsang–Chi dictionaries are loaded: • The private UDC Tsang–Chi dictionary that you create • The system wide UDC Tsang–Chi dictionary

7–44 Tru64 UNIX Technical Reference for Using Chinese Features • The system default Tsang–Chi dictionary The Tsang–Chi and Quick Tsang–Chi input methods display candidates, which match the input key sequence you enter, in this order. You can use the cedit utility to define the input sequences of user-defined characters and then use the cgen command with the -iks option to produce a UDC Tsang–Chi dictionary for the Chinese input server. You should also define the corresponding font glyphs (both 18X16 and 24X24 size fonts). After you define the user-defined characters and their corresponding information in the system or user database, restart the input server to reread the database. For details, see Writing Software for the International Market or the cgen reference pages. By default, the Chinese input server puts your private UDC Tsang–Chi dictionary in the ~/.iks/dwimdb-dechanyu directory You can use the USER_UCD_DICT environment variable to override this default. The system wide UDC Tsang–Chi dictionary is, by default, placed in the /var/i18n/iks/dwimdb-dechanyu directory. If the dictionary is located elsewhere on your system, you use the SYSTEM_UDC_DICT variable to override the default location. If you are a system administrator, you can install a UDC Tsang–Chi dictionary by placing it at the default location and setting proper access mode for system-wide use. The system default Tsang–Chi dictionary, which is shipped with the Compaq Tru64 UNIX system, is placed in the /usr/i18n/hanyu/hanyu_tsangchi.dic directory. If it is installed at a different location, you can use the HANYU_SYSTEM_DICT environment variable to specify the location.

______Note ______In this Compaq Tru64 UNIX release, the Chinese input server supports UDC input key sequences only in the DEC Hanyu codeset.

To display UDC in DECwindows Motif applications or the Chinese input server, UDC fonts should be available on your system. For instance, the Chinese input server will display UDC using the fonts -adecw-screen-*--18-*-cns11643.1986-udc or -adecw- screen-*--24-*-cns11643.1986-udc.

If you define your own UDC fonts, you can override the system UDC fonts by adding the directory in which the UDC fonts are located to the font path. For example, if your private UDC fonts are located in the ~/fonts directory, you can enter the following command to update the font path: %xset +fp ~/fonts

Tru64 UNIX Technical Reference for Using Chinese Features 7–45 Use the cedit utility to define UDC font glyphs (both 18X16 and 24X24 size fonts) and the cgen utility to generate fonts for use in DECwindows Motif. For details, see Writing Software for the International Market. ______

7.12 5–Stroke Input Method The 5-Stroke input method makes use of the basic strokes to construct the Chinese characters. A stroke is a segment of continuous line or curve that constitutes a Chinese character. Table 7–19 describes the five categories of strokes.

Table 7–19: Stroke Categories

Category Description Horizontal strokes or Horizontal lines and left-to-right ticks

Vertical strokes or à Vertical lines and right-to-left hooks Slash or à Slanting lines and curves drawn towards lower-left Backslash or à Dots , slanting lines, and curves drawn towards lower- right Zip-zap curves or à Including different types of joints or corners which can be drawn in single continuous strokes

Using the 5-Stroke input method you can input single Chinese characters and Chinese terms. Approximately 5,000 terms can be input using this method. 2.1.17.12.1 Input Mechanism To input a Chinese character using the 5-Stroke input method, enter its 5-stroke code through the numeric keypad according to the writing order. Table 7–20 shows the codes representing the five categories of strokes.

Table 7–20: 5-Stroke Code

Stroke Code Key 1[KP1] 2[KP2] 3[KP3] 4[KP4] 5[KP5]

7–46 Tru64 UNIX Technical Reference for Using Chinese Features Figure 6–5 shows the numeric keypad layout for entering 5-Stroke codes. The following are the general rules of writing order for Chinese characters: • From top to bottom • From left to right • From outside to inside • Writing radical inside before drawing the last stroke of the outside radical (for example, the last stroke of à ) 7.12.1.1 Single Character Input If a Chinese character is composed of exactly five strokes, enter the strokes according to the writing order. If it is composed of fewer than five strokes, press the KP0 key to signal the end of input. If it is composed of more than five strokes, enter the first four strokes and the last stroke. Table 7–21 shows some examples of using the 5-Stroke input method.

Table 7–21: Input of Single Characters with the 5-Stroke Input Method

No. of 5-Stroke Character Strokes Write Order Code Key Sequence 5 Ã Ã  Ã 35112 [KP3][KP5][KP1][KP1][KP2] 4 Ã Ã Ã 12510 [KP1][KP2][KP5][KP1][KP0] 9 Ã Ã Ã  43254 [KP4][KP3][KP2][KP5][KP4]

If you are uncertain about the type of strokes or the writing order of strokes, press the wildcard key (KP6) in place of the strokes. For details, see Section 7.12.5, Wildcard Key. 7.12.1.2 Input of Terms The 5-Stroke input method can be used to input terms. A maximum of eight strokes is required for entering terms. Press the [KP7] key before inputting strokes. The [KP7] key signals the system that subsequent key strokes are for composing a term instead of a character. The number of strokes input for each character in a term depends on the number of characters composing the term as shown in Table 7–22.

Tru64 UNIX Technical Reference for Using Chinese Features 7–47 Table 7–22: Input Terms with the 5-Stroke Input Method

Number of Characters in Term Strokes to be Input 2 First 4 strokes of each character 3 First 2 strokes of the first two characters and the first 4 strokes of the last character 4 First 2 strokes of each character > 4 First 2 strokes of the first three characters and the last character

If a character is composed of fewer strokes than required, press the KP6 key for the outstanding strokes. If this character is the last character, press the KP0 key to signal the end of input.

Number of No. of Req’d 5-Stroke Term Characters Characters Strokes Strokes Code Key Sequence 2 5 4 1221 [KP7][KP1][KP2] 9 4 2512 [KP2][KP1][KP2] [KP5][KP1][KP2] 2 3 4 121(6) [KP7][KP1][KP2] 2 4 34(0) [KP1][KP6][KP3] [KP4][KP0] 3 4 2 45 [KP7][KP4][KP5] 14 2 31 [KP3][KP1][KP1] 6 4 1234 [KP2][KP3][KP4] 3 9 2 25 [KP7][KP2][KP5] 12 2 12 [KP1][KP2][KP1] 1 4 1(0) [KP0] 4 6 2 13 [KP7][KP1][KP3] 8 2 32 ]KP3][KP2][KP3] 4 2 34 [KP4][KP5][KP1] 5 2 51 4 1 2 1(6) [KP7][KP1][KP6] 4 2 34 [KP3][KP4][KP4] 4 2 43 [KP3][KP1][KP1] 2 2 11 7 4 2 25 [KP7][KP2][KP5] 6 2 32 [KP3][KP2][KP3] 2 2 34 [KP4][KP2][KP5] 8 2 25

7–48 Tru64 UNIX Technical Reference for Using Chinese Features 2.1.17.12.2 Procedure When you invoke the 5-Stroke input method, the string is displayed in the status area. When you input the 5-Stroke code, all the Chinese characters that match the code sequence are displayed in the 26th line of the terminal or the pre-edit area of the display. Each character is associated with a number and you can select the character immediately using the corresponding numeric key on the main keyboard. For example, if you press the KP1 key, all characters starting with a horizontal stroke are displayed in the pre-edit area. :1 1 DEF 2 FBN 3 GOI 4 GFK 5 GHI

If you press the KP2 key, all Chinese characters with starting 5-Stroke code 12 are displayed in the pre-edit area. Ã )%1Ã *+,Ã 6.'Ã 601Ã 6*'ÃÃÃÃÃ

You can use the Delete key to correct the 5-Stroke code. When you delete digits in the 5- Stroke code, the characters displayed in the pre-edit area are changed to match the remaining 5-Stroke code. For example, press the Delete key once: Ã '()Ã )%1Ã *2,Ã *).Ã *+,ÃÃÃÃÃÃÃÃ

Press the Delete key once more: 

Alternatively, you can press the Return key without selecting any character. The pre-edit area is refreshed and you can enter a new 5-Stroke code. 2.1.17.12.3 Multiple Candidates Because the same 5-Stroke code can represent more than one Chinese character, sometimes, multiple candidates are available for selection when you enter a 5-Stroke code. For example: B ÃÃ 601Ã 683<Ã 68&4Ã 6-;;Ã 6<01Ã

To select a character that is displayed in the pre-edit area, press the corresponding numeric key on the main keyboard. In the example, press numeric key 3 to select . If the desired character is displayed at the first position (in this case, the character), you do not need to press the 1 key to select . Enter another 5-Stroke code and the character will automatically be selected. This character is called the default character.

Tru64 UNIX Technical Reference for Using Chinese Features 7–49 If there are too many candidates for selection, the candidates are displayed in two or more rows and a plus (+) sign is displayed to signal this. Press the Space bar or KP9 key to display the next row of characters. If you want to see the previous row of characters, press [Shift/Space] or the KP8 key.

2.1.17.12.4 The Association Mode To enhance input efficiency, the Chinese input method maintains a list of built-in phrases in memory. If you select a character from the candidate list (See Section 7.12.3, Multiple Candidates) and there is a built-in phrase whose first character matches this character, Association Mode is automatically activated. For example: B

à à Ã

The string is displayed in the status area. You can select one of the built-in phrases in Association Mode. After you select the phrase, you automatically exit Association Mode. If there are many built-in phrases, they are displayed in two or more rows and a plus (+) sign is displayed to signal this. Press the Space bar or KP9 key to view the next row of phrases. To move backwards, press [Shift/Space] or the KP8 key. If you do not want to use any of the associated phrases, enter another 5-Stroke code to automatically exit the Association Mode.

______Notes______Association Mode is only active during single character input. It can not be activated during term input.

Association Mode is activated only if you press a numeric key to select a single character. If you do not press the 1 key to select the default character, the Association Mode is not activated though the default character is entered. ______

2.1.17.12.5 Wildcard Key The KP6 key is a wildcard key, like an asterisk in a file specification, which can replace any strokes about which you are uncertain. For example: Ã ÃÃ Ã " Ã Ã Ã

7–50 Tru64 UNIX Technical Reference for Using Chinese Features This example assumes you are uncertain about the second stroke, and so press KP4, KP6, KP5, KP5, KP1. Another example: Ã ÃÃ Ã Ã " Ã " This example assumes you are uncertain about the third and the fourth strokes, and so press KP3, KP2, KP6, KP6, KP0. 7.13 5–Shape Input Method The 5-Shape input method is a high performance input method which composes root radicals to form Chinese characters. Using the 5-Shape input method, you can enter one to four root radicals to input thousands of Chinese characters and terms. In addition to character input, the 5-Shape input method also supports term input. The 5- Shape input method supports approximately 5,000 terms, each of which can be defined by four radicals.

______Note ______When using a VT382-C terminal with SoftODL enabled, you can use a user-defined key sequence in the 5-Shape input method to input a user-defined character (UDC). This assumes the key sequence does not apply to other existing characters. Use cedit to create UDCs and their corresponding input key sequences. ______

2.1.17.13.1 Distribution of Radicals There are 130 root radicals defined on 25 keys on the main keyboard. (See

Tru64 UNIX Technical Reference for Using Chinese Features 7–51 Figure 6–3 and Figure 6–4 for the layout of LK201-C and LK401-C keyboards.) The 25 keys are divided into five groups, each containing five keys, as shown in Table 7–23.

Table 7–23: Shape Code

Key 1 Key 2 Key 3 Key 4 Key 5 Group 1 [11]/[G] [12]/[F] [13]/[D] [14]/[S] [15]/[A] Group 2 [21]/[H] [22]/[J] [23]/[K] [24]/[L] [25]/[M] Group 3 [31]/[T] [32]/[R] [33]/[E] [34]/[W] [35]/[Q] Group 4 [41]/[Y] [42]/[U] [43]/[I] [44]/[O] [45]/[P] Group 5 [51]/[N] [52]/[B] [53]/[V] [54]/[C] [55]/[X]

Figure 7–16: Distribution of Radicals

7–52 Tru64 UNIX Technical Reference for Using Chinese Features Figure 7–17: 5-Shape Radical Keys

Tru64 UNIX Technical Reference for Using Chinese Features 7–53 Figure 7–16 and Figure 7–17 show that the distribution of radicals has three characteristics: 1. The root radicals and the Chinese key name are similar in shape. For example:

The key name of Key [11] is while its radicals include , and so on. 2. The code for the first stroke of a root radical is the same as the group number, while the code for the second stroke is the same as the key number. For example:

The first stroke of the radicals on key [41] such as à and is , which is the same as the group number. The second stroke is  which is the same as the key number. 3. The key number denotes the number of strokes for constructing the root radicals. That is, the key number equals the number of strokes of the radicals, for example:

The group number of is 1. Therefore, is located on key [11], is located on key [12], and is located on key [13]. 2.1.17.13.2 Decomposition of Chinese Characters The 5-Shape input method decomposes Chinese characters according to their root radicals. The decomposition can be categorized as follows: • Left-Right Type (Type 1)

The character can be split into a left half and a right half, such as  , and . • Upper-Lower Type (Type 2)

The character can be split into an upper half and a lower half, such as  , and . • Miscellaneous Type (Type 3)

The character does not have a clear separation of left-right or upper-lower. Examples include  , and .

The miscellaneous type can be further divided into individual type and inclusion type. Examples of individual type include à Ãà à à  , and . Inclusion type refers to those characters with one of their root radicals totally or partially included in another root radical, such as à à , and .

7–54 Tru64 UNIX Technical Reference for Using Chinese Features 2.1.17.13.3 Distinction Code If a character is composed of fewer than four root radicals, you must enter a code which represents the decomposition type and the type of the last stroke, which is called the Distinction code.

Figure 7–18: Distinction Code for the 5-Shape Input Method

For example, the codes for à and are all [43][14] . To distinguish among them, you must enter the distinction code.

Character Last Stroke Decomposition Distinction Code (2) Left-Right (1) [21] (1) Left-Right (1) [11] (4) Left-Right (1) [41]

Tru64 UNIX Technical Reference for Using Chinese Features 7–55 ______Notes______The root radicals available on the main keyboard do not require a distinction code. Although the characters which are key names (such as ) and radical names (such as ) are composed of fewer than four radicals, no distinction code is required. For example, can be decomposed into and 

For those characters which do not have a unique writing order, such as , , , , and , you can use zip-zap stroke code (5) for the last stroke. For example, can be decomposed into , and , and you may add the distinction code [51].

For those inclusion type characters, such as , , , , and , you should take the last stroke of the interior radical. For example, the distinction code for is [23] because the last stroke is [2].

Single dots located near a radical, such as the point in and , belong to decomposition type 3. For example, the distinction code of is [43]. ______

2.1.17.13.4 Principles of Character Decomposition The following four principles can be used for decomposing Chinese characters: • Writing order. For example, should be decomposed into à , and instead of , à  • Priority for larger radicals. Among the different ways of decomposition, choose the one which results in larger radicals. For example, should be decomposed into and instead of and .

Another interpretation of this principle is that you should choose the decomposition method that creates fewer radicals. • Separated instead of cross. If a character can be decomposed as a separated structure or a cross structure, choose the separated structure. For example, should be decomposed as and instead of , and . • Intuitive. 2.1.17.13.5 Input Mechanism The 5-Shape input method allows for single-character input and term input.

7–56 Tru64 UNIX Technical Reference for Using Chinese Features 7.13.5.1 Single Character Input Each Chinese character can be composed of one to four root radicals. If the character is available on the keycap as a keyname, press the key four times. For example: ÃÃSress [41][41][41][41] If the character is a root radical, press the key for the radical and enter the first, second, and last strokes. If the character is defined with fewer than four codes, press the Space bar to signal the end of input, for example:

Decomposed Number of Character Radicals Radicals Key Sequence à à à 4 [14][11][21][11] à à 3 [12][11][31][Space]

There is one exception for the five basic strokes. The characters , , ,  and are single-stroke characters and they require two codes for input. To avoid multiple candidates for the same code, the 5-Shape input method requires adding two more codes [24] [24]. Table 7–24 shows the key sequences for entering these basic strokes.

Table 7–24: Entering Basic Strokes

Basic Stroke Key Sequence [11][11][24][24] [21][21][24][24] [31][31][24][24] [41][41][24][24] [51][51][24][24]

If the character does not appear on the keycap, follow the decomposition principles described in the previous section. If the character is composed of exactly four root radicals, enter the root radicals according to the writing order. If it is composed of more than four root radicals, enter the first, second, third, and last root radicals. If it is composed of less than four radicals, enter the radicals together with the distinction code. If the number of codes is still less than four, press the Space bar to signal the end of input.

Tru64 UNIX Technical Reference for Using Chinese Features 7–57 For example:

Decomposed Distinction Character Radicals Code Key Sequence , , , - [11][54][12][22] Ã Ã Ã - [14][35][35][32] º Ã [41] [43][54][41][Space]

7.13.5.2 Term Input Approximately 5,000 built-in terms are defined in the Chinese input methods. You can use the 5-Shape input method to input multiple character terms by entering four root radicals as shown in Table 7–25.

Table 7–25: Input of Terms with the 5-Shape Input Method

Number of Characters in Term Root Radicals to be Entered 2 The first two root radicals of each character 3 The first root radicals of the first two characters and the first two root radicals of the last character 4 The first root radical of each character > 4 The first root radicals of the first three characters and the last character

Number. of Number of Req’d Root Root Term Characters Character Radicals Radical 5-Shape Code 2 2 , [55][54][43][41] 2 , 3 1 [41][31][14][25] 1 2 Ã 4 1 [54][51][13][42] 1 1 1 5 1 [22][52][41][14] 1 1 1

7–58 Tru64 UNIX Technical Reference for Using Chinese Features 2.1.17.13.6 Procedure When the 5-Shape input method is invoked, the string is displayed in the status area. You can enter a 5-Shape code through the main keyboard and the data is displayed in the pre-edit area. When a complete 5-Shape code is entered, the character that matches the code is sent. For example: B NPKM

To correct the input data, press the Delete or Return key to erase the data. If there is no valid character for the 5-Shape code, the bell rings to signal an error and you can enter another code. 2.1.17.13.7 Multiple Candidates If input codes match multiple candidates, valid candidates are displayed in the pre-edit area. EQKBÃ %1+Ã %1+Ã %1+Ã %1+

The mechanism for selecting candidates is similar to that for the 5-Stroke input method. For details, see Section 7.12, 5–Stroke Input Method. 2.1.17.13.8 Association Mode As with the 5-Stroke input method, Association Mode is activated automatically when you select a character from the candidate list. For details, see Section 7.12, 5–Stroke Input Method. 2.1.17.13.9 Simple Code Characters There are three levels of simple code characters; Frequently-Used Characters (Level 1), Level 2, and Level 3. 7.13.9.1 Frequently-Used Characters (Level 1 Simple Code) Among the 6,000 Chinese characters defined in GB2312-80, there are 25 very frequently- used characters. These are defined on the keys [11] to [55]. To input these characters, press the key associated with the character (such as [11], [25]) and then press the Space bar.

Tru64 UNIX Technical Reference for Using Chinese Features 7–59 The 25 frequently-used characters are:

[*1] [*2] [*3] [*4] [*5] [1*] [2*] [3*] [4*] [5*]

7.13.9.2 Level 2 Simple Code The 5-Shape input method has about 600 Level 2 Simple Code Characters. To input a Level 2 Simple Code Character, enter the first two radicals and then press the Space bar. For example, the 5-Shape code for is [13][31][52][21] (that is, DTBH). But since it is a Level 2 Simple Code Character, you can press [13][31] and the Space bar to input the character. The Level 2 Simple Code Characters include:

7–60 Tru64 UNIX Technical Reference for Using Chinese Features 7.13.9.3 Level 3 Simple Code Approximately 4,000 Chinese characters defined in GB2312-80 are classified as Level 3 Simple Code Characters. To input these characters, enter the first three radicals and then press the Space bar. 2.1.17.13.10 Wildcard Key The Z key on the main keyboard is the wildcard key for the 5-Shape input method. When inputting a single character, the Z key can replace any root radicals about which you are uncertain. For example, if you are not sure whether should be decomposed into , ,  and (TVFH) or , ,  and (TVHF), you can enter [31][53][Z][Z] and all characters with starting root radicals [31] and [53] will be displayed for your selection. For example: WY]]BÃ 79(<Ã 79)+Ã 7975Ã 795&ÃÃÃÃÃÃÃÃÃÃÃ

2.1.17.13.11 Entering 5–Shape Code Through the Numeric Keypad The 25 keys on the main keyboard are usually used to enter 5-Shape code. But you can also enter 5-Shape codes through the numeric keypad. For instance, the 5-shape code of is [11][[25][34][43] (GMWI). To enter , you can press [KP1][KP1][KP2][KP5][KP3][KP4][KP4][KP3] on the numeric keypad. The function of the [KP0] key is similar to the function of the Space bar. To enter the wildcard code through the numeric keypad, press the KP6 key twice. 7.14 Pin–Yin Input Method The 6,000 Chinese characters defined in GB2312-80 are sorted according to the Pin-Yin (phonetic) representation. Each character has its own Pin-Yin representation that you can

Tru64 UNIX Technical Reference for Using Chinese Features 7–61 use to input the character. To use this input method, you should have basic knowledge of Putonghua (Mandarin) and Chinese pronunciation. The Pin-Yin input method requires each character to be entered by using 1 to 6 Roman characters and one tone mark (-, /, v, \, or o). Only single characters can be entered using the Pin-Yin input method. Term input is not supported. But you can activate Association Mode when selecting a character. 2.1.17.14.1 Input Mechanism Each Chinese character has a phonetic representation that can be specified in a sequence of syllables. To enter the pronunciation, enter the syllables through the main keyboard (keys A - Z). To enter the tone marks, press the key with preassigned tone marks as defined in Table 7–26.

Table 7–26: Pin-Yin Tone Marks

Tone Mark Symbol Key Label First tone - ; Second tone / , Third tone V \ Fourth tone \ [ Light tone O ]

For example, the phonetic representation of is "ri". Press [R][I]. 2.1.17.14.2 Procedure When you invoke the Pin-Yin input method, the string is displayed in the status area. When you enter the phonetic representation, the data is displayed in the pre- edit area. All characters that match the representation are displayed. For example, when you press the A key, all characters with phonetic representations starting with "a" are displayed in the pre-edit area: DBÃ .6.*Ã .8++Ã .%6.Ã 4%6.Ã

When you press the N key, all characters with phonetic representations starting with "an" are displayed: an_ 1 AFPV 2 SPVG 3 DGT 4 DJNG

If the phonetic representation is invalid, the bell rings. You can correct the data by using the Delete key or press Return to clear the data.

7–62 Tru64 UNIX Technical Reference for Using Chinese Features ______Note ______If the phonetic representation has only one character, you must press the 1 key to select the character. ______

2.1.17.14.3 Multiple Candidates If a Pin-Yin string matches multiple candidates, the candidates are displayed in the pre- edit area. To select one, press the corresponding numeric key. For details, see Section 7.12, 5–Stroke Input Method. To reduce the number of candidates, you can enter a tone mark after the phonetic representation. The input method displays only those characters with the specified tone. For example, the phonetic representation of is "niu". Eight characters are displayed for selection after you press [N][I][U]: QLXBÃ 91)*Ã 5+.Ã 51)*Ã 41)*Ã

If you press the / key after [N][I][U], there will be only one candidate for selection: niu2_ 1 RHK

______Note ______If you enter a tone mark after entering a phonetic representation, you do not need to press the1 key to select the character in the first position. The first candidate is selected automatically when you enter another phonetic representation. ______

2.1.17.14.4 Association Mode Similar to the 5-Stroke input method, Association Mode is activated automatically when you select a character from the candidate list. For details, see Section 7.12, 5–Stroke Input Method. 2.1.17.14.5 Multiple Phonetic Representations It is possible to have multiple phonetic representations for the same character. For example, can be pronounced as "diao" and "tiao". Therefore, you can enter either [D][I][A][O] or [T][I][A][O] for . 2.1.17.14.6 Radical Characters Approximately 20 radical characters are defined in GB2312-80. To input those characters using the Pin-Yin input method, input the phonetic representation of the first character of the radical name. For example, the radical name of is "cao zi tou" . You would enter [C][A][O].

Tru64 UNIX Technical Reference for Using Chinese Features 7–63 7.15 Qu–Wei Input Method The Qu-Wei code is the representation of Qu (row) and Wei (column) in the DEC Hanzi codeset. For example, is placed in the first column of the sixteenth row. Its Qu- Wei code is 1601. To input , press [1][6][0][1]. Since each Qu-Wei code unambiguously defines one Chinese character, the Qu-Wei input method does not support term input and Association mode. When you invoke the Qu-Wei input method, the string is displayed in the status area. To input a Qu-Wei code, enter the code by using the numeric keys (0 - 9) on the main keyboard. 2.1.17.15.1 DEC GB2312 Characters is defined in the DEC GB2312 character set and its Qu-Wei code is 1601. To input , enter [0][1][6][0][1] or [1][6][0][1] from the main keyboard, and press the Space bar to complete the input. 2.1.17.15.2 Extended GB Characters The Qu-Wei codes of characters in the extended GB character set start with "1". For example, if you press [1][1][6][0][1], the user-defined character in the first column of the sixteenth row is entered. If you make a mistake during input, you can correct the data by pressing the Delete key to erase the last key stroke, or the Return key to clear the entire code. 7.16 Telex Code Input Method Telex Code is a four-digit code used to define: • Single Chinese characters • Terms • Non-Chinese symbols It is based on the Standard Telex Code published by the People’s Republic of China. The Telex Code input method is simple and straight-forward. You can input characters, terms, or symbols by entering their corresponding Telex Code from the keyboard. When you invoke the Telex Code input method, the string is displayed in the status area. Use the numeric keys 0 - 9 on the main keyboard to enter the codes for characters, terms, or symbols:

7–64 Tru64 UNIX Technical Reference for Using Chinese Features Character/String Type Telex Code Key Sequence Character 0039 [0][0][3][9] Term 9916 [9][9][1][6] Symbol 9949 [9][9][4][9]

7.17 Symbol Input in Dxhanziim When you press [Ctrl/0] in the 5-Stroke, 5-Shape, or Pin-Yin mode, the string is displayed in the status area. All letters and numerals that you enter are converted to two- byte letters and numerals. To return to the original input mode, press [Ctrl/0] again. 7.18 Conversion Between Input Servers and DECwindows Motif Applications If you need to work with both Simplified and traditional Chinese data, you can use DECwindows Motif to enter Simplified Chinese data with the traditional Chinese input server, and vice versa. To set up the environment for entering Simplified Chinese data with the traditional Chinese input server, you should: 1. Set the session language to Chinese China. 2. Set the XMODIFIERS environment variable to select the Chinese input server: %setenv XMODIFIERS @im=DECTW Hereafter, applications start in Simplified Chinese and are connected to the traditional Chinese input server. You can enter Simplified Chinese data using the input methods provided by the Chinese input server, such as the Tsang–Chi input method. If you do not set this modifier and both Simplified and traditional Chinese input servers are running, a Chinese application connects to the first input server that is started.

Tru64 UNIX Technical Reference for Using Chinese Features 7–65 ______Note ______The data in the pre-edit area of the Chinese input server is displayed in Simplified Chinese.

Those Chinese characters that cannot be converted to Simplified Chinese characters are represented by a reverse question mark. If two or more user-defined characters have the same input key sequence, two or more reverse question marks might appear near the end of the line. In that case, you should try each one to find the character you want.

Use the UDC Manager (cedit) to define the user-defined characters and their corresponding input key sequences. Since a VT382-D terminal is a DEC Hanyu terminal, you can create an input key sequence file only under the DEC Hanyu locale. Generation of an input key sequence file under the Taiwanese EUC and BIG-5 locales is not supported in this release. ______

To set up the environment for entering traditional Chinese data with the Simplified Chinese input server: 1. Set the session language to Chinese Taiwan. 2. Set the XMODIFIERS environment variable to select the Simplified Chinese input server: %setenv XMODIFIERS @im=DECCN Hereafter, applications start in Chinese and are connected to the Simplified Chinese input server. You can enter Chinese data using the input methods provided by the Simplified Chinese input server, such as the 5-Shape and the 5-Stroke input methods.

7–66 Tru64 UNIX Technical Reference for Using Chinese Features 8 Chinese Printing Support

This chapter introduces the Chinese printing support provided by the Compaq Tru64 UNIX operating system. It describes the supported printers, print file formats, features, and maintenance procedures for supporting Chinese printing. 8.1 Supported Printers The Compaq Tru64 UNIX software supports text and PostScript printers. 2.1.18.1.1 Text Printers The Compaq Tru64 UNIX software supports text printers with built-in Chinese fonts. 2.1.18.1.2 PostScript Printers The Compaq Tru64 UNIX software supports Chinese printing on PostScript printers in two ways: • Based on the built-in or downloaded fonts installed in printers • Based on the font-faulting mechanism, which is explained in Section 8.3, Printing Features • Based on the font-embedding mechanism For details about the supported printer types and print filters, see Chapter 4, Local Language Devices. 8.2 8.2 Print File Formats The Compaq Tru64 UNIX software supports printing mixed ASCII and Chinese characters in the following print file formats: • Plain text files on text printers and PostScript printers

Tru64 UNIX Technical Reference for Using Chinese Features 8–1 • Files with the nroff control sequences (for printing with underline, superscript, subscript, and bold attributes) on text printers and PostScript printers • PostScript files on PostScript printers The print filters for PostScript printers can automatically detect the format of a print file and convert it to the proper format for printing. 8.3 Printing Features The Compaq Tru64 UNIX software supports the following printing features: • Font embedding • Font faulting • Software on-demand fond loading (SoftODL) • Codeset conversion • Outline fonts 8.3.1 Font Embedding Font embedding refers to a mechanism where all the necessary font data is embedded within the printer input file. It is not necessary for printers to have resident fonts. This mechanism is supported only on PostScript printers that support PostScript level 2 or level 1 with multibyte font extension. Outline fonts are used for the embedding, if available. Otherwise, lower quality bitmap fonts are used. The font-embedding mechanism is useful since few printers have Chinese resident fonts. However, there must be enough memory inside the printer to hold the font data or the print job might fail. The amount of memory required depends on the nature of the print job. It is recommended that at least 4 MB or more printer memory be available for the embedding mechanism. Because of the large amount of data transferred to the printer, it is also necessary to use a high speed link between the host computer and the printer. Using the serial port for printing is too slow for the font-embedding mechanism. See Writing Software for the International Market and the wwpsof(8) reference page for more information about how to use the font embedding mechanism. 2.1.18.3.2 Font Faulting Font faulting is a mechanism used to handle the large memory required by fonts for some codesets, particularly multibyte codesets for Asian languages. Using font faulting, font information is stored on either: • The secondary storage of a supporting host machine, called a font-faulting server, or

8–2 Tru64 UNIX Technical Reference for Using Chinese Features • An internal font disk The font information is loaded into the printer on demand, thus conserving printer memory. Font faulting is often essential for multibyte ideographic fonts because the memory required to store a single font can exceed the memory capacity of many printers. Specialized local language printers, such as Japanese printers, do not require font faulting because the local language fonts reside on the printer. However, other printers require a mechanism to load these fonts as needed for different parts of the same print job. The font-faulting mechanism is very useful for a desktop printing environment, in which a large number of different single-byte fonts may be required. In this case, simultaneously storing all the fonts in memory reduces the available memory, and therefore speed, of the printer. Also, it is possible that the number of required fonts is so large that they cannot all be in memory at the same time. Font faulting for multibyte fonts is done on a per character (or per glyph) basis because these fonts support extremely large numbers of characters. Font faulting for single-byte fonts is done on a per font basis. Single-byte fonts are small and relatively simple, so loading the whole font is more efficient. The font-faulting mechanism can be used with the following printers: • DEClaser 1152 • DEClaser 5100 • PrintServer 17 See Section 8.5, Chinese Printing Setup for information about configuring these printers. 2.1.18.3.3 Software On–Demand Font Loading Software On-Demand Font Loading (SoftODL) is a mechanism through which a terminal or a bitmap printer downloads the relevant bitmap font information for a user-defined character (UDC) at the time the character needs to be displayed or printed. The Chinese bitmap printers that support this feature include: • CP382D controller (for traditional Chinese) • LA88-C (for Simplified Chinese) • LA380-CB (for Simplified Chinese) 2.1.18.3.4 Codeset Conversion The Compaq Tru64 UNIX software includes a codeset conversion mechanism used to print text files that have a codeset different from the one used by the printer. For printers with built-in or downloaded Chinese fonts, the codeset of the printer should be defined to match

Tru64 UNIX Technical Reference for Using Chinese Features 8–3 the codeset of the built-in fonts. For printers using the font-faulting mechanism, the codeset of the printer should be defined to match the codeset of the font to be loaded. For printers using the font-embedding mechanism, the codeset of the printer should not be defined. 2.1.18.3.5 Outline Fonts The Compaq Tru64 UNIX software provides a large set of outline fonts for printing files in various languages. Depending on how many local language support subsets are installed on your system, more than 150 outline fonts may be available. There are four sets of Chinese outline fonts, two for traditional Chinese and two for Simplified Chinese. These fonts are: • Sung-Light-CNS11643 • Hei-Light-CNS11643 • Hei-GB2312-80Xi • XiSong-GB2312-80 Those fonts with the CNS11643 extension are traditional Chinese fonts encoded in the DEC Hanyu codeset, with glyphs for plane 1 and plane 2 characters. Those fonts with the GB2312-80 extension are Simplified Chinese fonts encoded in DEC Hanzi. 8.4 Commands and Daemons Before you can utilize the printing features supported by the Compaq Tru64 UNIX software, there are some commands and daemons that you should understand. This section discusses these commands and daemons in detail and the following section illustrates how they are used for configuration. 2.1.18.4.1 Country-Specific Options to the lpr Command In addition to the usual options to the lpr command, the -A option is used to pass country-specific parameters. You can use ya to set the parameters to the -A option in the /etc/printcap file. For example, you can specify the parameters using the -A option to the lpr command as: % lpr -A "flocale=zh_TW.big5 font=Sung-Light- CNS11643 plocale=zh_TW.dechanyu" You can define the same set of parameters in the /etc/printcap, file as: :ya="flocale=zh_TW.big5 font=Sung-Light- CNS11643 plocale=zh_TW.dechanyu":\ The parameters supplied with the -A option to the lpr command override the corresponding default values in the /etc/printcap file.

8–4 Tru64 UNIX Technical Reference for Using Chinese Features The following parameters are applicable to Chinese printing: • flocale= Specifies the locale for the source text file. The printer filters use this locale to validate the characters inside the source text file. If this value is not set properly, the text is interpreted using the current locale. In Chinese printing, this value is particularly important in order for the lpr command to correctly interpret the characters. Moreover, if the plocale option is also set, the lpr command performs codeset conversion for the source text file. • plocale= Specifies the locale for the printer. If the printer has built-in fonts, the plocale value should match the codeset of the built-in fonts. If the printer employs the font faulting mechanism, the plocale value should match the font used to print the text file. • font= Specifies the font name for printing the source text files on a PostScript printer. This parameter is used for printing text files only, as PostScript files are already tagged with the required font name. • odldb= Specifies the path of the SoftODL database files. This parameter is used to override the system default SoftODL database path, hence allowing users to access their own SoftODL database. • odlstyle= Specifies what SoftODL font style and size to use. The value is of the form