ISO/IEC International Standard 10646-1
ISO/IEC 10646:2003/Amd.5:2008 (E)

Information technology – Universal Multiple-Octet Coded Character Set (UCS) – AMENDMENT 5: Tai Tham, Tai Viet, Avestan, Egyptian Hieroglyphs, CJK Unified Ideographs Extension C, and other characters

© ISO/IEC 2008 – All rights reserved

Page 2, Clause 4 Terms and definitions

Replace the Code table entry with:

  Code chart
  A rectangular array showing the representation of coded characters allocated to the octets in a code.

Remove the ‘Detailed code table’ entry.

Replace all further occurrences of ‘code table’ in the text of the standard with ‘code chart’.

Page 13, Clause 17 Structure of the code tables and lists (now renamed ‘Structure of the code charts and lists’)

Insert the following note after the first paragraph.

  NOTE – Clause 34 also includes additional information on characters clarifying some feature of a character, such as its naming or usage, or its associated graphic symbol.

Page 14, Sub-clause 20.3 Format characters

Insert the following entries into the list of format characters:

  1A60 TAI THAM SIGN SAKOT

Page 15, Sub-clause 20.4 Variation selectors

Replace the second note with:

  NOTE 2 – This version of the standard incorporates by reference the variation sequences listed in version 2007-12-14 of the Ideographic Variation Database, as described at http://www.unicode.org/ivd/data/2007-12-14/.

Page 20, Sub-clause 26.1 Hangul syllable composition method

Replace the three parenthetical notations in the second sentence of the first paragraph with:

  (syllable-initial characters or initial consonants)
  (syllable-peak characters or medial vowels)
  (syllable-final characters or final consonants)

Page 20, Clause 27 Source references for CJK Ideographs

In the list after the second paragraph, insert the following entry:

  Hanzi M sources,

In the third paragraph, replace “(G, H, T, J, K, KP, V, and U)” with “(G, H, M, T, J, K, KP, V, and U)”.

Page 21, Sub-clause 27.1 Source references for CJK Unified Ideographs

Replace the last sentence of the second paragraph with the following:

  The current full set of CJK Unified Ideographs is represented by the collection 385 CJK UNIFIED IDEOGRAPHS-2008 (See annex A.1).

Insert the following sources in the Hanzi G list:

  G_GH   Gudai Hanyu Cidian (古代汉语词典)
  G_GJZ  Commercial Press Ideographs (商务印书馆用字)
  G_XC   Xiandai Hanyu Cidian (现代汉语词典)
  G_CYY  Chinese Academy of Surveying and Mapping Ideographs (中国测绘科学院用字)
  G_ZFY  Hanyu Fangyan Dacidian (汉语方言大辞典)
  G_ZJW  Yinzhou Jinwen Jicheng Yinde (殷周金文集成引得)

After the list for Hanzi H source, add the following text:

  The Hanzi M source is
  MAC    Macao Information System Character Set (澳門資訊系統字集)

Insert the following sources in the Hanzi T sources:

  TC     TCA-CNS 11643-1992 12th plane
  TD     TCA-CNS 11643-1992 13th plane
  TE     TCA-CNS 11643-1992 14th plane

Insert the following source in the Kanji J sources:

  JK     Japanese KOKUJI Collection
  J_ARIB Association of Radio Industries and Businesses (ARIB) ARIB STD-B24 Version 5.1, March 14 2007

Insert the following source in the Hanja K sources:

  K5     Korean IRG Hanja Character Set 5th Edition: 2001

Replace the description of the KP1 entry in the Hanja KP sources with the following:

  KP1    KPS 10721:2000 and KPS 10721:2003

Insert the following source in the ChuNom V sources:

  V4     Dictionary on Nom 2006, Dictionary on Nom of Tay ethnic 2006, Lookup Table for Nom in the South 1994

Insert the following source in the Unicode U sources:

  UTC    The Unicode Technical Report #45, U-source Ideographs

In the paragraph starting with ‘The content linked to is’, replace the ‘12-lines header’ with ‘13-lines header’.

In the description of the linked content, the descriptions from the 2nd to the 9th field are replaced by the following:

  2nd field: Hanzi G sources (G0-hhhh), (G1-hhhh), (G3-hhhh), (G5-hhhh), (G7-hhhh), (GS-hhhh), (G8-hhhh), (G9-hhhh), (GE-hhhh), (G_KXdddddd), (G_HZ), (G_HZddddd), (G_CY), (G_CH), (G_CHdddddd), (G_HC), (G_HCdddddd), (G_BK), (G_BKdddddd), (G_FZ), (G_FZddddd), (G_4K), (G_GHdddddd), (G_GJZddddd), (G_XCdddddd), (G_CYYddddd), (G_ZFYddddd), or (G_ZJWddddd).
  3rd field: Hanzi T sources (T1-hhhh), (T2-hhhh), (T3-hhhh), (T4-hhhh), (T5-hhhh), (T6-hhhh), (T7-hhhh), (TC-hhhh), (TD-hhhh), (TE-hhhh), or (TF-hhhh).
  4th field: Kanji J sources (J0-hhhh), (J1-hhhh), (J3-hhhh), (J3A-hhhh), (J4-hhhh), (JA-hhhh), or (JK-ddddd), or (J_ARIB-hhhh).
  5th field: Hanja K sources (K0-hhhh), (K1-hhhh), (K2-hhhh), (K3-hhhh), (K4-hhhh), or (K5-hhhh).
  6th field: ChuNom V sources (V0-hhhh), (V1-hhhh), (V2-hhhh), (V3-hhhh), or (V4-hhhh).
  7th field: Hanzi H source (H-hhhh).
  8th field: Hanja KP sources (KP0-hhhh) or (KP1-hhhh).
  9th field: Unicode U sources (U0-hhhh) or (UTCddddd).

In the description of the linked content, add a 10th field as follows:

  10th field: Hanzi M source (MACddddd).

In the paragraph starting with ‘The format definition uses’, replace the first sentence as follows:

  The format definition uses ‘d’ as a decimal unit and ‘h’ as a hexadecimal unit.

Provide a new source reference file with format changes as specified above and content updated as of resolutions WG2 M51.9, M51.10, M51.11, and M52.2. In addition to these changes, the CJK G_KX sources references for 3C08, 3DD7, and 21CED are replaced by G_HX because the KX entries for those characters are virtual and therefore cannot be used as source references.

(The following text is identical to ISO/IEC 10646, except for the renumbered note, but is linked to the new file.)

  Click on this highlighted text to access the reference file.

  NOTE 5 – The content is also available as a separate viewable file in the same file directory as this document. The file is named: “CJKU_SR.txt”.

Page 22, Clause 27 Source references for CJK Ideographs

Insert a new sub-clause 27.3 and update the following sub-clause number and its references.

27.3 Source reference presentation for SIP CJK Unified Ideographs

In the SIP code charts, CJK Unified Ideographs Extension B are arranged in a manner similar to non ideographs and their presentation does not include source reference information. CJK Unified Ideographs Extension C uses a different format:

  [Chart entry example. Column labels as extracted: Ucode, C, J, K, U, V, G, M, T. Sample entry: 2AB65 with source references G_ZFY00619, TC-3248, V4-4876.]

The leftmost column of any entry shows the code position in ISO/IEC 10646. Each of the other columns shows the graphic symbol for the character and its coded representation in the source standard also identified in the chart entry.

Editor’s note: Some CJK Unified Ideographs from Extension C have KP sources but are not shown in these charts for lack of fonts. This, along with the format, will be fixed in the next edition of this standard. In addition, the KP source information is already available in the source reference file.

Page 22, Sub-clause 27.3 Source references for CJK Compatibility Ideographs

Provide a new source reference file with format changes as specified above and content updated as of resolution WG2 M51.10.

(The following text is identical to ISO/IEC 10646, but is linked to the new file.)

  Click on this highlighted text to access the reference file.

  NOTE – The content is also available as a separate viewable file in the same file directory as this document. The file is named: “CJKC_SR.txt”.

Page 28, Clause 34, Code Tables and list of character names (now renamed ‘Code charts and list of character names’)

Replace first paragraph by following.

Detailed code charts and list of character names for the BMP, SMP, SIP, and SSP are shown on the following pages.

Code charts are arranged by blocks which may span several pages.

Each code chart is followed by a corresponding character names list, except the CJK UNIFIED IDEOGRAPHS blocks and the HANGUL SYLLABLES blocks.

34.1 Code chart

[...] coded representation is indicated in the left margin while the remaining upper digits are indicated in the top margin. The full coded representation for each character is also indicated under each representative graphic symbol.

34.2 Character names list

The character names lists contain both normative and informative information. The following information items are normative:

  Character code position,
  Associated character name,
  Character alias (one preceded by ‘※’) which is a unique and stable alternate name for a character.

  NOTE – Characters are given a normative character alias in certain cases where there is a defect in the character name. They do not replace the character name, but rather allow users to formally refer to the character without requiring the use of a defective name. These aliases follow the same syntax as character names.

All other information is informative and may contain:

  Graphic symbol associated with the character.
  Subheads grouping various subsets of a given block. For example, the LATIN-1 SUPPLEMENT block contain “Latin-1 punctuation and symbols”, “Letters”, and “Mathematical operator”.
  Explanatory text describing context for a subhead or a whole block.
  Informative aliases preceded by ‘=’ indicate alternate names for characters.
  Cross references, preceded by ‘→’ indicates a related character of interest.
  Information about languages, preceded by ‘•’ indicates a non exhaustive list of languages using that character. For bicameral scripts, the information is only provided for the lower case form of the character.
  Case mappings, also preceded by ‘•’, only when it cannot be derived simply from the names.
  Other information about a character, also preceded by ‘•’, describing name peculiarity, historical consideration, or any noteworthy aspect of a character.
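As a rough illustration of the annotation markers listed in sub-clause 34.2, the following Python sketch classifies a single annotation by its leading marker. The function name, the category labels, and the assumption that each annotation sits on its own line beginning with its marker are illustrative only and are not defined by the standard.

```python
# Hypothetical helper (not part of ISO/IEC 10646): maps the leading marker
# of a names-list annotation to the kind of information that sub-clause
# 34.2 associates with that marker.
MARKERS = {
    "\u203B": "normative alias",                         # ※
    "=": "informative alias",
    "\u2192": "cross reference",                         # →
    "\u2022": "language, case mapping, or other note",   # •
}


def classify_annotation(line: str) -> tuple[str, str]:
    """Return (kind, remaining text) for one annotation line."""
    text = line.strip()
    if not text:
        raise ValueError("empty annotation line")
    kind = MARKERS.get(text[0])
    if kind is None:
        # No known marker: treat it as a subhead or explanatory text.
        return ("unmarked text", text)
    return (kind, text[1:].strip())
```

Because ‘•’ introduces three different kinds of informative annotation, the marker alone cannot tell them apart; the sketch reports them under one combined label rather than guessing.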
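The field notation used for the source reference file in sub-clause 27.1 (‘d’ as a decimal unit, ‘h’ as a hexadecimal unit) can be read as a small pattern language. The Python sketch below turns one field template, such as TC-hhhh or G_ZFYddddd, into a regular expression and checks a value against it. The helper names are hypothetical, and two assumptions are made that the amendment text does not state: hexadecimal digits are upper case, and every lower-case ‘d’ or ‘h’ in a template is a placeholder while all other characters are literal.

```python
import re


def template_to_regex(template: str) -> re.Pattern:
    """Build a regex from a source-reference template (hypothetical helper)."""
    parts = []
    for ch in template:
        if ch == "d":
            parts.append("[0-9]")        # one decimal unit
        elif ch == "h":
            parts.append("[0-9A-F]")     # one hexadecimal unit (upper case assumed)
        else:
            parts.append(re.escape(ch))  # literal prefix character
    return re.compile("".join(parts))


def matches(template: str, value: str) -> bool:
    """True if the whole value conforms to the template."""
    return template_to_regex(template).fullmatch(value) is not None
```

Checked against the sample chart entry for 2AB65 shown in sub-clause 27.3, `matches("G_ZFYddddd", "G_ZFY00619")`, `matches("TC-hhhh", "TC-3248")`, and `matches("V4-hhhh", "V4-4876")` all hold under these assumptions.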