ISO/IEC 10646:2017

BS ISO/IEC 10646:2017

BSI Standards Publication

Information technology — Universal Coded Character Set (UCS)

BRITISH STANDARD

National foreword

This British Standard is the UK implementation of ISO/IEC 10646:2017. The UK participation in its preparation was entrusted to Technical Committee IST/5, Programming languages, their environments and system software interfaces. A list of organizations represented on this committee can be obtained on request to its secretary.

This publication does not purport to include all the necessary provisions of a contract. Users are responsible for its correct application.

© The British Standards Institution 2018
Published by BSI Standards Limited 2018
ISBN 978 0 580 90707 4
ICS 35.040.10

Compliance with a British Standard cannot confer immunity from legal obligations.

This British Standard was published under the authority of the Standards Policy and Strategy Committee on 31 August 2018.

Amendments/corrigenda issued since publication
Date / Text affected

INTERNATIONAL STANDARD ISO/IEC 10646
Fifth edition 2017-12

Information technology — Universal Coded Character Set (UCS)
Technologies de l'information — Jeu universel de caractères codés (JUC)

Reference number ISO/IEC 10646:2017(E)
© ISO/IEC 2017

COPYRIGHT PROTECTED DOCUMENT

© ISO/IEC 2017, Published in Switzerland

All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of the requester.

ISO copyright office
Ch. de Blandonnet 8 • CP 401
CH-1214 Vernier, Geneva, Switzerland
Tel. +41 22 749 01 11
Fax +41 22 749 09 47
[email protected]
www.iso.org

Contents

Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Conformance
4.1 General
4.2 Conformance of information interchange
4.3 Conformance of devices
5 General structure of the UCS
6 Basic structure and nomenclature
6.1 Structure
6.2 Coding of characters
6.3 Types of code points
6.4 Naming of characters
6.5 Short identifiers for code points (UIDs)
6.6 UCS Sequence Identifiers
6.7 Octet sequence identifiers
7 Revision and updating of the UCS
8 Subsets
8.1 General
8.2 Limited subset
8.3 Selected subset
9 UCS encoding forms
9.1 General
9.2 UTF-8
9.3 UTF-16
9.4 UTF-32 (UCS-4)
10 UCS Encoding schemes
10.1 General
10.2 UTF-8
10.3 UTF-16BE
10.4 UTF-16LE
10.5 UTF-16
10.6 UTF-32BE
10.7 UTF-32LE
10.8 UTF-32
11 Use of control functions with the UCS
12 Declaration of identification of features
12.1 Purpose and context of identification
12.2 Identification of a UCS encoding scheme
12.3 Identification of subsets of graphic characters
12.4 Identification of control function set
12.5 Identification of the coding system of ISO/IEC 2022
13 Structure of the code charts and lists
14 Block and collection names
14.1 Block names
14.2 Collection names
15 Mirrored characters in bidirectional context
15.1 Mirrored characters
15.2 Directionality of bidirectional text
16 Special characters
16.1 General
16.2 Space characters
16.3 Currency symbols
16.4 Format characters
16.5 Ideographic description characters
16.6 Variation selectors and variation sequences
17 Presentation forms of characters
18 Compatibility characters
19 Order of characters
20 Combining characters
20.1 Order of combining characters
20.2 Combining class and canonical ordering
20.3 Appearance in code charts
20.4 Alternate coded representations
20.5 Multiple combining characters
20.6 Collections containing combining characters
20.7 Combining Grapheme Joiner
21 Normalization forms
22 Special features of individual scripts and symbol repertoires
22.1 Hangul syllable composition method
22.2 Features of scripts used in India and some other South Asian countries
22.3 Byzantine musical symbols
22.4 Source references for pictographic symbols
23 Source references for CJK ideographs
23.1 List of source references
23.2 Source references file for CJK ideographs
23.3 Source reference presentation for CJK Unified ideographs
23.4 Source references presentation for CJK Compatibility ideographs
24 Source references for Tangut ideographs
24.1 List of source references
24.2 Source reference file for Tangut ideographs
24.3 Source reference presentation for Tangut ideographs
25 Source references for Nüshu characters
25.1 List of source references
25.2 Source reference file for Nüshu characters
26 Character names and annotations
26.1 Entity names
26.2 Name formation
26.3 Single name
26.4 Name immutability
26.5 Name uniqueness
26.6 Character names for CJK ideographs
26.7 Character names for Tangut ideographs
26.8 Character names for Nüshu characters
26.9 Character names for Hangul syllables
27 Named UCS Sequence Identifiers
28 Structure of the Basic Multilingual Plane
29 Structure of the Supplementary Multilingual Plane for scripts and symbols (SMP)
30 Structure of the Supplementary Ideographic Plane (SIP)
31 Structure of the Tertiary Ideographic Plane (TIP)
32 Structure of the Supplementary Special-purpose Plane (SSP)
33 Code charts and lists of character names
33.1 General
33.2 Code chart
33.3 Character names list
33.4 Summary of standardized variation sequences
33.5 Code charts and lists of character names
Annex A (normative) Collections of graphic characters for subsets
A.1 Collections of coded graphic characters
A.2 Blocks lists
A.3 Fixed collections of the whole UCS (except Unicode collections)
A.4 CJK collections
A.5 Other collections
A.6 Unicode collections
Annex B (normative) List of combining characters
Annex C (normative) Transformation format for planes 01 to 10 of the UCS (UTF-16)
Annex D (normative) UCS Transformation Format 8 (UTF-8)
Annex E (normative) Mirrored characters in bidirectional context
Annex F (informative) Format characters
F.1 General format characters
F.2 Script-specific format characters
F.3 Interlinear annotation characters
F.4 Subtending format characters
F.5 Shorthand format characters
F.6 Invisible mathematical operators
F.7 Western musical symbols
F.8 Language tagging using Tag characters
Annex G (informative) Alphabetically sorted list of character names
Annex H (informative) The use of “signatures” to identify UCS
Annex I (informative) Ideographic description characters
I.1 General
I.2 Syntax of an ideographic description sequence
I.3 Individual definitions of the ideographic description characters
Annex J (informative) Recommendation for combined receiving/originating devices with internal storage
Annex K (informative) Notations of octet value representations
Annex L (informative) Character naming guidelines
Annex M (informative) Sources of characters
Annex N (informative) External references to character repertoires
N.1 Methods of reference to character repertoires and their coding
N.2 Identification of ASN.1 character abstract syntaxes
N.3 Identification of ASN.1 character transfer syntaxes
Annex P (informative) Additional information on CJK Unified ideographs
Annex Q (informative) Code mapping table for Hangul syllables
Annex R (informative) Names of Hangul syllables
Annex S (informative) Procedure for the unification and arrangement of CJK ideographs
S.1 Unification procedure
S.2 Arrangement procedure
S.3 Source separation examples
S.4 Non-unification examples
Annex T (informative) Language tagging using Tag characters
Annex U (informative) Characters in identifiers

Foreword

ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form the specialized system for worldwide standardization.

ST.36 Page: 3.36.1

International Standard ISO/IEC 10646

Unicode and Code Page Support

Assessment of Options for Handling Full Unicode Character Encodings in MARC21: A Study for the Library of Congress

CJKV Unified Ideographs Extension C

Hong Kong Supplementary Character Set – 2016 (Draft)

Netscape: Roadmap to Plane 2 (SIP) of ISO/IEC 10646 and Unicode

IRG N2153: IRG Principles and Procedures, Version 8 (Confirmed), 2016-10-20

Character and String Representation

Section 18.1, Han

Character Properties 4

Proper Display of Bidirectional Structured Text