Universal Multiple-Octet Coded Character Set (UCS) —
ISO/IEC JTC1 SC2/WG2 N2845 all
Final Proposed Draft Amendment (FPDAM) 1
ISO/IEC 10646:2003/Amd.1:2004 (E)
© ISO/IEC 2004 – All rights reserved

Information technology — Universal Multiple-Octet Coded Character Set (UCS) — AMENDMENT 1: Glagolitic, Coptic, Georgian and other characters

Page 1, Clause 1 Scope

In the note, update the Unicode Standard version from 4.0 to 4.1.

Page 2, Clause 3 Normative references

Update the reference to the Unicode Bidirectional Algorithm and the Unicode Normalization Forms as follows:

Unicode Standard Annex, UAX#9, The Unicode Bidirectional Algorithm, Version 4.1.0, [date TBD].

Unicode Standard Annex, UAX#15, Unicode Normalization Forms, Version 4.1.0, [date TBD].

Page 2, Clause 4 Terms and definitions

Insert the following text as sub-clause 4.1 and Note; update all following sub-clause numbers accordingly.

4.1 Base character
A graphic character that does not graphically combine with preceding characters.
NOTE – Most graphic characters are base characters. This sense of graphic combination does not preclude the presentation of base characters from adopting different contextual forms or from participating in ligatures.

Page 3, Clause 4 Terms and definitions

Insert the following text as sub-clause 4.21 and update all following sub-clause numbers accordingly.

4.21 Format character
A character whose primary function is to affect the layout or processing of characters around them. It generally does not have a visible representation of its own.

In the definition of Graphic character (formerly sub-clause 4.20, now 4.22), insert “or a format character” after “control function”.

Page 14, Clause 19 Characters in bidirectional context

Add ‘Mirrored’ before ‘Character’ in clause title and replace the text of the clause by the following:

A class of character has special significance in the context of bidirectional text. The interpretation and rendering of any of these characters depend on the state related to the symmetric swapping characters (see clause F.2.2) and on the direction of the character being rendered that are in effect at the point in the CC-data-element where the coded representation of the character appears. The list of these characters is provided in Annex E.1.
NOTE – That list also represents all characters which have the ‘Bidi Mirrored’ property in the Unicode Standard Version 4.1.

For example, if the character ACTIVATE SYMMETRIC SWAPPING occurs and if the direction of the character is from right to left, the character shall be interpreted as if the term LEFT or RIGHT in its name had been replaced by the term RIGHT or LEFT, respectively.

This character mirroring is not limited to paired characters and shall be applied to all characters belonging to that class.

Page 14, Sub-clause 20.1 Space characters

Insert the following entries into the character list:

1680 OGHAM SPACE MARK
180E MONGOLIAN VOWEL SEPARATOR
202F NARROW NO-BREAK SPACE
205F MEDIUM MATHEMATICAL SPACE

Page 14, Sub-clause 20.3 Alternate Format Characters

Rename the sub-clause from “Alternate Format Characters” to “Format characters”.

Replace the first paragraph by:

The following characters are format characters.

Add the following entries to the character list:

0603 ARABIC SIGN SAFHA
10A3F KHAROSHTHI VIRAMA
1D159 MUSICAL SYMBOL NULL NOTEHEAD
1D173 MUSICAL SYMBOL BEGIN BEAM
1D174 MUSICAL SYMBOL END BEAM
1D175 MUSICAL SYMBOL BEGIN TIE
1D176 MUSICAL SYMBOL END TIE
1D177 MUSICAL SYMBOL BEGIN SLUR
1D178 MUSICAL SYMBOL END SLUR
1D179 MUSICAL SYMBOL BEGIN PHRASE
1D17A MUSICAL SYMBOL END PHRASE

Remove sub-clause 20.5 and renumber 20.6 accordingly.

Page 15, Sub-clause 20.4 Variation selectors

Replace the paragraph after the first note by the following text:

Variation selectors following other base characters and any non-base characters have no effect on the selection of the graphic symbol for that character.

Page 20, Clause 26 Special features of individual scripts

Rename the clause to “Special features of individual scripts and symbol repertoires”.

Add a sub-clause “26.2 Byzantine musical symbols” identical to sub-clause U.1 which is therefore deleted.

Page 21, Sub-clause 27.1 Source references for CJK Unified Ideographs

In the enumeration of the Hanzi G sources insert the following after the G8 source:

G9 GB18030-2000

In the enumeration of the Kanji J sources insert the following after the J3 source:

J3A JIS X 0213:2004 level-3

After the enumeration of the ChuNom V sources add the following text and note:

The Unicode U source is:
U0 The Unicode Standard 4.0-2003
NOTE 2 – Even if source references get updated, the source reference information is not updated. The updated source references may only identify characters not previously covered by the older version.

In the following paragraph, change “11-lines header” to “12-lines header”.

In the file description, add a (G9-hhhh) value to the Hanzi G sources syntax and add a (J3A-hhhh) value to the Kanji J sources syntax.

In the file description, add a 9th field as follows:

• 9th field: Unicode U source (U0-hhhh).

After the paragraph describing the format “The format definition uses … appear as shown.”, add a new note:

NOTE 3 – The original source references in the Hanja K4 source (PKS 5700-3:1998) are described using a single decimal index. For better consistency with the other sources, those indexes are converted into hexadecimal values in the source reference file. Unlike the other hexadecimal values, they do not decompose in row, column values.

Provide a new source reference file including the U source information, with changes according to resolutions WG2 M44.7, M44.8, M44.9, M45.8, M45.9 and M45.17 (the following text is identical to the first edition of ISO/IEC 10646:2003, but is linked to the new file):

Click on this highlighted text to access the reference file.
NOTE 2 – The content is also available as a separate viewable file in the same file directory as this document. The file is named: “CJKU_SR.txt”.

Page 22, Sub-clause 27.2 Source reference presentation for BMP Unified Ideographs.

In the last paragraph, replace the following sentence:

The second line shows the coded representation in decimal notation which comprises two digits for section number followed by two digits for position number.

by:

When non empty, the second line shows the coded representation in decimal notation which comprises two digits for section number followed by two digits for position number except for the K4 source where it shows the original decimal source as a single 4-digit value. Hanzi H source characters are identified in the G column using a ‘H-’ prefix.

Page 22, Sub-clause 27.3 Source references for CJK Compatibility Ideographs

Provide a new source reference file with changes according to the resolutions WG2 M44.8, M45.17 and M45.18 (the following text is identical to the first edition of ISO/IEC 10646:2003, but is linked to the new file):

Click on this highlighted text to access the reference file.
NOTE – The content is also available as a separate viewable file in the same file directory as this document. The file is named: “CJKC_SR.txt”.
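The source reference syntax described for sub-clauses 27.1 and 27.2 (a source tag, a hyphen, then a hexadecimal value, as in G9-hhhh, J3A-hhhh or U0-hhhh) can be illustrated with a minimal Python sketch. It is not part of the amendment. The function names and the sample values are hypothetical, and the sketch assumes each value is already isolated as a single token; the field layout of the reference files themselves is not reproduced here. The K4 branch follows NOTE 3: the hexadecimal value is a plain index that is shown as a 4-digit decimal number, whereas other values are split into a row byte and a column byte. How those bytes map to the decimal section and position numbers is not stated in this text, so the sketch stops at the byte split.

```python
def parse_source_ref(token):
    """Split a token such as 'G9-4E2D' into (source tag, integer value)."""
    tag, sep, hex_part = token.rpartition("-")
    if not sep or not tag or not hex_part:
        raise ValueError(f"not a source reference: {token!r}")
    return tag, int(hex_part, 16)  # int() rejects non-hexadecimal digits


def row_column(token):
    """Decompose a non-K4 value into its (row byte, column byte) pair."""
    tag, value = parse_source_ref(token)
    if tag == "K4":
        # NOTE 3: K4 values do not decompose into row and column.
        raise ValueError("K4 values are plain indexes")
    return divmod(value, 0x100)


def k4_decimal_index(token):
    """Recover the original decimal index of a K4 value as 4 digits."""
    tag, value = parse_source_ref(token)
    if tag != "K4":
        raise ValueError(f"not a K4 reference: {token!r}")
    return f"{value:04d}"


# Hypothetical sample values, chosen only to exercise the functions.
print(parse_source_ref("U0-4E2D"))
print(row_column("J3A-2E21"))
print(k4_decimal_index("K4-0123"))
```

Keeping the K4 case in its own function mirrors the distinction drawn in NOTE 3 and avoids applying the row and column split to a value that is only an index.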
Page 26, Clause 29 Structure of the Basic Multilingual Plane

At the bottom of Figure 4, add a new note before the existing note which is renamed NOTE 2:

NOTE 1 – New Tai Lue is also known as Xishuang Banna Dai.

Page 30-1348 Clause 33, Code Tables and list of character names

1. Corrigenda and modifications to existing tables

Insert the additional character glyphs and names at the indicated positions in the tables given below, the character names replacing the existing entries which read “(This position shall not be used)”. The table numbers are affected by the insertion of new tables (see below) preceding these modified tables. (The table numbers corresponding to the 1st edition of ISO/IEC 10646:2003 are mentioned in parenthesis.)

Plane 00

Table 6 - Row 01-2: Latin Extended B (6)
Table 9 - Row 03: Combining Diacritical Marks (9)
Table 10 - Row 03: Greek and Coptic (10)
Table 12 - Row 04: Cyrillic (12)
Table 15 - Row 05: Hebrew (15)
Table 16 - Row 06: Arabic (16)
Table 21 - Row 09: Devanagari (20)
Table 22 - Row 09: Bengali (21)
Table 26 - Row 0B: Tamil (25)
Table 34 - Row 0F: Tibetan (32)

Table 203 - Row D6: Mathematical Alphanumeric Symbols (183)
Table 256 - Row 1E: CJK Unified Ideographs Extension B

These tables contained new characters and names at the following code positions:

0237-0241, 0358-035C, 03FC-03FF, 04F6-04F7, 05A2, 05BA, 05C5-05C6, 060B, 061E, 0659-065E, 097D, 09CE, 0BB6, 0BE6, 0FD0-0FD1, 10F9-10FA, 10FC, 1207, 1247, 1287, 12AF, 12CF, 12EF, 130F, 131F, 1347, 135F-1360, 1D6C-1D7F, 2055-2056, 2058-205E, 2090-2094, 20B2-20B5, 20EB, 213C, 214C, 23D1-23DB, 2618, 267E-267F, 2692-269C, 26A2-26B1, 27C0-27C6, 2B0E-2B13, 327E, FA70-FAD9, 1D6A4-1D6A5

and updated graphic symbols at the following code positions:

031A, 05AA, 03E2-03EF, 3396, 1D301-1D303, 21E45

EDITOR’S NOTE – At page 38, character glyphs for 1250-125D and 1260-1261 are missing but are not new to this amendment. Furthermore, at page 87 the names for 2E1C and 2E1D contain ‘PARAPHRASIS’ instead of ‘PARAPHRASE’. These minor issues will be corrected in the final amendment and this note will be removed.

In the CJK Unified Ideographs code table, insert the following new characters at 9FA6-9FBB:

Row/Cell    C              J        K        V
Hex code    G- Hanzi -T    Kanji    Hanja    ChuNom
159/166