International Standard @ 5428
Recommended publications
-
ISO/IEC JTC1/SC2/WG2 N 2005 Date: 1999-05-29
ISO INTERNATIONAL ORGANIZATION FOR STANDARDIZATION ORGANISATION INTERNATIONALE DE NORMALISATION ISO/IEC JTC1/SC2/WG2 Universal Multiple-Octet Coded Character Set (UCS) ISO/IEC JTC1/SC2/WG2 N 2005 Date: 1999-05-29 TITLE: ISO/IEC 10646-1 Second Edition text, Draft 2 SOURCE: Bruce Paterson, project editor STATUS: Working paper of JTC1/SC2/WG2 ACTION: For review and comment by WG2 DISTRIBUTION: Members of JTC1/SC2/WG2 1. Scope This paper provides a second draft of the text sections of the Second Edition of ISO/IEC 10646-1. It replaces the previous paper WG2 N 1796 (1998-06-01). This draft text includes: - Clauses 1 to 27 (replacing the previous clauses 1 to 26), - Annexes A to R (replacing the previous Annexes A to T), and is attached here as “Draft 2 for ISO/IEC 10646-1 : 1999” (pages ii & 1 to 77). Published and Draft Amendments up to Amd.31 (Tibetan extended), Technical Corrigenda nos. 1, 2, and 3, and editorial corrigenda approved by WG2 up to 1999-03-15, have been applied to the text. The draft does not include: - character glyph tables and name tables (these will be provided in a separate WG2 document from AFII), - the alphabetically sorted list of character names in Annex E (now Annex G), - markings to show the differences from the previous draft. A separate WG2 paper will give the editorial corrigenda applied to this text since N 1796. The editorial corrigenda are as agreed at WG2 meetings #34 to #36. Editorial corrigenda applicable to the character glyph tables and name tables, as listed in N1796 pages 2 to 5, have already been applied to the draft character tables prepared by AFII. -
ISO/IEC JTC 1/SC 2 N 4355
ISO/IEC JTC 1/SC 2 N 4355 ISO/IEC JTC 1/SC 2 Coded character sets Secretariat: JISC (Japan) Document type: Secretariat Report Title: Secretariat Report to the 19th Plenary Meeting of ISO/IEC JTC 1/SC 2, Colombo, Sri Lanka, 2014-09-30, 10-03 Status: This document is circulated to the SC 2 members for information at the 19th Plenary Meeting to be held in Colombo, Sri Lanka. Date of document: 2014-09-08 Source: SC 2 Secretariat Expected action: INFO No. of pages: 8 Email of secretary: [email protected] Committee URL: http://isotc.iso.org/livelink/livelink/open/jtc1sc2 Secretariat Report to the 19th Plenary Meeting of ISO/IEC JTC 1/SC 2, Colombo, Sri Lanka, 2014-09-30, 10-03 Part I: Administration 1. Title Coded Character Sets 2. Scope Standardization of graphic character sets and their characteristics, including string ordering, associated control functions, their coded representation for information interchange and code extension techniques. Excluded: audio and picture coding. 3. Chairman and Secretariat Chairman: Prof. Yoshiki MIKAMI (re-appointed at the 2013 JTC 1 Perros-Guirec Plenary Meeting) Secretariat: JISC - Japan Attn.: Ayuko Nagasawa, IPSJ/ITSCJ 308-3, Kikai Shinko Kaikan Bldg. 3-5-8, Shibakoen, Minato-ku, Tokyo 105-0011 JAPAN TEL: +81 3 3431 2808/ FAX: +81 3 3431 6493 E-mail: [email protected] 4. Membership 4.1. P - Members (28) Austria (ASI); Canada (SCC); China (SAC); Egypt (EOS); Finland (SFS); France (AFNOR); Germany (DIN); Greece(ELOT); Hungary (MSZT); Iceland (IST); India (BIS); Indonesia (BSN); Ireland (NSAI); Japan (JISC); Korea, Democratic People's Republic (CSK); Korea, Republic of (KATS) ; Lithuania (LST); Mongolia (MASM); Norway (SN); Poland (PKN); Russian Federation (GOST R); Serbia (ISS) ; Sri Lanka (SLSI) ;Thailand (TISI); Tunisia (INNORPI); USA (ANSI) ; Ukraine (DSSU); United Kingdom (BSI) P-member NB added since the 18th plenary: none, removed: Romania (ASRO); Sweden (SIS) 4.2. -
International Character Code Standard for the BE2
CMU-ITC-87-091 International Character Code Standard for the BE2 June 18, 1987 Tomas Centerlind Information Technology Center (ITC) Carnegie Mellon University 1. Major problems with foreign languages All European languages have a set of unique characters, even Great Britain with its Pound sign. Most of these characters are static and do not change whether they are at the end or in the middle of a word. The Greek sigma, however, is an example of a character that changes shape depending on its position. If we move on to non-Roman alphabets like Arabic, they have a more complex structure. A basic rule is that certain characters are written together if they follow each other but not otherwise. This complicates representation and requires look-ahead. In addition, many of these languages are written leftwards or downwards. Taken together, these properties make it very difficult to integrate them with Roman languages. Many guidelines have to be established to solve these problems, and before any coding can be done a set of standards must be selected or defined. In this paper I intend to gather everything I can think of that must be considered before selecting standards. Only the basic level of the implementation will be designed, so routines that, for example, display complex languages like Arabic must be written by the user. A basic method that displays Roman script correctly will be supported. 1. Standards 1.1 Existing standards Following is a list of currently existing and used standards. 1.1.1 ASCII, ISO646 The ASCII standard, which is in use almost everywhere, will probably have to remain a basic part of the system, and without any doubt it must be possible to read and write ASCII coded documents in the foreseeable future. -
Modern Greek Dialects
Modern Greek dialects A preliminary classification* Peter Trudgill Fribourg University Although there are many works on individual Modern Greek dialects, there are very few overall descriptions, classifications, or cartographical representations of Greek dialects available in the literature. This paper discusses some possible reasons for these lacunae, having to do with dialect methodology, and Greek history and geography. It then moves on to employ the work of Kontossopoulos and Newton in an attempt to arrive at a more detailed classification of Greek dialects than has hitherto been attempted, using a small number of phonological criteria, and to provide a map, based on this classification, of the overall geographical configuration of Greek dialects. Keywords: Modern Greek dialects, dialectology, traditional dialects, dialect cartography 1. Introduction Tzitzilis (2000, 2001) divides the history of the study of Greek dialects into three chronological phases. First, there was work on individual dialects with a historical linguistic orientation focussing mainly on phonological features. (We can note that some of this early work, such as that by Psicharis and Hadzidakis, was from time to time coloured by linguistic-ideological preferences related to the diglossic situation.) The second period saw the development of structural dialectology focussing not only on phonology but also on the lexicon. Thirdly, he cites the move into generative dialectology signalled by Newton’s pioneering book (1972). 
As also pointed out by Sifianou (Forthcoming), however, Tzitzilis indicates that there has been very little research on social variation (Sella 1994 is essentially a discussion of registers and argots only), or on syntax, and no linguistic atlases at all except for the one produced for Crete by Kontossopoulos (1988). -
International Register of Coded Character Sets to Be Used with Escape Sequences for Information Interchange in Data Processing
INTERNATIONAL REGISTER OF CODED CHARACTER SETS TO BE USED WITH ESCAPE SEQUENCES 1 Introduction 1.1 General This document is the ISO International Register of Coded Character Sets To Be Used With Escape Sequences for information interchange in data processing. It is compiled in accordance with the provisions of ISO/IEC 2022, "Code Extension Technique" and of ISO 2375 "Procedure for Registration of Escape Sequences". This International Register contains coded character sets which have been registered in accordance with procedures given in ISO 2375. Its purpose is to identify widely used coded character sets and associate with each a unique escape sequence by means of which it can be designated according to ISO/IEC 2022 and ISO/IEC 4873. The publication of this International Register should promote compatibility in international information interchange and avoid duplication of effort in developing application-oriented coded character sets. Registration provides an identification for a coded character set but implies nothing about its status; it may or may not be part of a standard of an international, national or a corporate body. However, if such a standard is published subsequently to the registration, it would be appropriate for the escape sequence identifying the character set to be specified in the standard. If it is desired to register a set, application should be made to the Registration Authority through an appropriate Sponsoring Authority as specified in ISO 2375. Any character set can be a candidate for registration if it meets the requirements of ISO 2375. The Registration Authority ascertains that the proposals received are formally in accordance with this International Standard, technically in accordance with ISO/IEC 2022, and, where applicable, with ISO/IEC 646 and ISO/IEC 4873, and meet the presentation practice of the Registration Authority. -
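The designation mechanism described in the register excerpt above can be seen in a short Python sketch. It relies on Python's built-in `iso2022_jp` codec as one readily available ISO/IEC 2022-style implementation. That choice is an assumption made for illustration and is not taken from the register text. The helper name `find_designations` and the two-entry `KNOWN` table are hypothetical. The two escape sequences shown (ESC ( B for ASCII, ESC $ B for JIS X 0208) are the ones this codec emits.

```python
# Minimal sketch: locate escape sequences that designate a coded character set.
# KNOWN is a deliberately tiny, illustrative table, not the full register.
KNOWN = {
    b"\x1b(B": "ASCII (G0)",
    b"\x1b$B": "JIS X 0208 (G0)",
}

def find_designations(data: bytes) -> list[tuple[int, str]]:
    """Return (offset, set name) for each known escape sequence in data."""
    found = []
    i = 0
    while i < len(data):
        for seq, name in KNOWN.items():
            if data.startswith(seq, i):
                found.append((i, name))
                i += len(seq) - 1  # skip the rest of the escape sequence
                break
        i += 1
    return found

# The codec switches to JIS X 0208 for the kana and back to ASCII afterwards.
encoded = "aあb".encode("iso2022_jp")
designations = find_designations(encoded)
```

Each designation tells the receiver which registered set the following bytes belong to, which is the role the register assigns to a unique escape sequence per coded character set.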
Kodowanie Tekstów Polskich W Systemach Komputerowych∗
Encoding of Polish texts in computer systems (Kodowanie tekstów polskich w systemach komputerowych)∗ Janusz S. Bień† 4 January 1999‡ 1 Introduction This article presents the basic concepts of text encoding and discusses all currently binding Polish standards on the subject, as well as the most important international standards. 2 Basic concepts of text encoding 2.1 Classification of texts We treat the notion of text as primitive and will not define it here; we do, however, distinguish two basic types of texts: physical texts and electronic texts. As the characteristic feature of electronic texts ∗Paper presented at the conference Multimedia w nauczaniu języka rodzimego jako obcego (Szkoła Języka i Kultury Polskiej, Uniwersytet Śląski, Katowice, 7-8 December 1998) and published, slightly abridged, in the quarterly Postscriptum no. 27–29 (autumn 1998 to spring 1999), pp. 4-27 (ISSN 1427–0501). With the consent of the organizers and the editors, this full text of the paper was available at ftp://ftp.mimuw.edu.pl/pub/polszczyzna/ogonki/katow98.*, and is now available at http://www.mimuw.edu.pl/~jsbien/publ/Kodtp98/. †Dr hab. J. S. Bień, prof. UW, works in the Department of Formal Linguistics (Katedra Lingwistyki Formalnej) of the University of Warsaw. From 1998 to 2003 he headed the Department of Computer Applications (Zakład Zastosowań Informatycznych) of the Institute of Oriental Studies, University of Warsaw. At the Institute of Informatics of the University of Warsaw, where he worked until June 1998, he gave, among others, monographic lectures entitled Wybrane standardy przetwarzania tekstu (Selected text processing standards). In 1992–1993 he was a member of the Standardization Problem Committee for Informatics, in 1997-1999 of Standardization Problem Committee no. 242 for Information and Documentation, and in 1999-2001 of Standardization Problem Committee no. 170 for Informatics Terminology and Information Encoding of the Polish Committee for Standardization (Polski Komitet Normalizacyjny). E-mail address: [email protected] (or [email protected]). -
Network Working Group D. Spinellis Request for Comments: 1947 SENA S.A
Network Working Group D. Spinellis Request for Comments: 1947 SENA S.A. Category: Informational May 1996 Greek Character Encoding for Electronic Mail Messages Status of This Memo This memo provides information for the Internet community. This memo does not specify an Internet standard of any kind. Distribution of this memo is unlimited. Overview and Rational This document describes a standard encoding for electronic mail [RFC822] containing Greek text and provides implementation guidelines. The standard is based on MIME [RFC1521] and the ISO 8859-7 character encoding. Although the implementation of this standard is straightforward several non-standard but "functional" - though unlikely to inter-operate - alternatives are in common use. For this reason we highlight common implementation and mail user agent setup errors. Description In order to transfer Greek text via electronic mail the text is first translated into the ISO 8859-7 character set, and then encoded using either the Base64 (preferable for text that is mainly Greek) or the Quoted-Printable (justifiable in cases where some Greek words appear inside predominately Latin text) method, as defined in MIME. The following table provides most common Greek encodings (see also [RFC1345]):
0646 37 M7 51 MC 23 69 LG L1 G7 GO GC 28 97 Description
0386 ea a2 86 cd 71 86 b6 Capital alpha with acute
0388 eb b8 8d ce 72 8d b8 Capital epsilon with acute
0389 ec b9 8f d7 73 8f b9 Capital eta with acute
038a ed ba 90 d8 75 90 ba Capital iota
-
The Unicode Standard, Version 3.0, Issued by the Unicode Consortium and Published by Addison-Wesley
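The two-step procedure in the RFC 1947 excerpt above (translate to ISO 8859-7, then apply Base64 or Quoted-Printable) can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not the RFC's reference implementation. The function name `encode_greek_body`, its `method` parameter, and the sample word are hypothetical. MIME header generation is left out.

```python
import base64
import quopri

def encode_greek_body(text: str, method: str = "base64") -> str:
    """Transcode text to ISO 8859-7, then apply a MIME transfer encoding."""
    raw = text.encode("iso8859_7")  # step 1: translate into ISO 8859-7
    if method == "base64":
        # Preferable, per the excerpt, for text that is mainly Greek.
        return base64.b64encode(raw).decode("ascii")
    if method == "quoted-printable":
        # Suits mostly Latin text that contains a few Greek words.
        return quopri.encodestring(raw).decode("ascii")
    raise ValueError(f"unsupported method: {method}")

sample = "Καλημέρα"  # hypothetical sample text
as_base64 = encode_greek_body(sample, "base64")
as_qp = encode_greek_body(sample, "quoted-printable")
```

A receiver reverses the steps: undo the transfer encoding, then decode the bytes as ISO 8859-7.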
The Unicode Standard Version 3.0 The Unicode Consortium ADDISON–WESLEY An Imprint of Addison Wesley Longman, Inc. Reading, Massachusetts · Harlow, England · Menlo Park, California Berkeley, California · Don Mills, Ontario · Sydney Bonn · Amsterdam · Tokyo · Mexico City Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and Addison-Wesley was aware of a trademark claim, the designations have been printed in initial capital letters. However, not all words in initial capital letters are trademark designations. The authors and publisher have taken care in preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein. The Unicode Character Database and other files are provided as-is by Unicode®, Inc. No claims are made as to fitness for any particular purpose. No warranties of any kind are expressed or implied. The recipient agrees to determine applicability of information provided. If these files have been purchased on computer-readable media, the sole remedy for any claim will be exchange of defective media within ninety days of receipt. Dai Kan-Wa Jiten used as the source of reference Kanji codes was written by Tetsuji Morohashi and published by Taishukan Shoten. ISBN 0-201-61633-5 Copyright © 1991-2000 by Unicode, Inc. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of the publisher or Unicode, Inc. -
ISO/IEC JTC 1/SC 2 N 3566 Date: 2001-10-22
ISO/IEC JTC 1/SC 2 N 3566 DATE: 2001-10-22 ISO/IEC JTC 1/SC 2 Coded Character Sets Secretariat: Japan (JISC) DOC. TYPE Activity Report TITLE Report of SC 2/WG 3 to the SC 2 Plenary in Singapore, 2001-10-18/19 SOURCE Mr. Evangelos Melagrakis, WG 3 Convener PROJECT This document was reviewed at the Eleventh Plenary Meeting of SC 2 STATUS held in Singapore, 2001-10-18/19. ACTION ID FYI DUE DATE P, O and L Members of ISO/IEC JTC 1/SC 2 ; ISO/IEC JTC 1 DISTRIBUTION Secretariat; ISO/IEC ITTF ACCESS LEVEL Open ISSUE NO. 126 NAME 02n3566.pdf FILE SIZE (KB) PAGES 7 Secretariat ISO/IEC JTC 1/SC 2 - IPSJ/ITSCJ *(Information Processing Society of Japan/Information Technology Standards Commission of Japan) Room 308-3, Kikai-Shinko-Kaikan Bldg., 3-5-8, Shiba-Koen, Minato-ku, Tokyo 105-0011 Japan *Standard Organization Accredited by JISC Telephone: +81-3-3431-2808; Facsimile: +81-3-3431-6493; E-mail: [email protected] ISO/IEC JTC 1/SC 2 N 3566 ISO/IEC JTC 1/SC 2/WG 3 N 515 Date : 2001-10-15 ISO/IEC JTC 1/SC 2/WG 3 7-bit and 8-bit codes and their extension SECRETARIAT : ELOT DOC TYPE : Convener’s Report TITLE : Report of SC2/WG3 to the SC2 Plenary in Singapore SOURCE : E.Melagrakis, ISO JTC 1/SC 2/WG 3 Convenor PROJECT: STATUS : ACTION ID : FYI DUE DATE : ---- P, O and L Members of ISO/IEC JTC 1/SC 2 DISTRIBUTION : MEDIUM : NO OF PAGES : 5+1 Contact 1: Secretariat ISO/IEC JTC 1/SC 2/WG 3 ELOT Mrs K.Velli (acting) Acharnon 313, 111 45 Kato Patissia, ATHENS – GREECE Tel: +30 1 21 20 307 Fax : +30 1 22 86 219 E-mail : [email protected] Contact 2 : Convenor ISO/IEC JTC 1/SC 2/WG 3 Mr E.Melagrakis Acharnon 313, 111 45 Kato Patissia, ATHENS – GREECE Tel: +30 1 21 20 301 Fax : +30 1 22 86 219 E-mail: [email protected] ISO/IEC JTC 1/SC 2/WG 3 N 515 Report of SC2/WG3 to the SC2 Plenary in Singapore General Information: Since the last SC2 plenary, WG3 has had one meeting, in London, (September 1998). -
Teaching the Ancients to Type: Better Unicode Text Entry for Ancient Greek
Teaching the Ancients to Type: Better Unicode Text Entry for Ancient Greek Steven Tammen Acknowledgements This project would not have been possible without the support of the University of Georgia’s Center for Undergraduate Research Opportunities (CURO), and the support of my research mentor, Dr. Benjamin M. Wolkow. I would also like to thank all of the Greek faculty and students that completed this project’s research survey. The data from this survey was useful in guiding this project’s progression, and also in motivating its completion. Other people care! Hooray! Finally, I wish to thank my family for all their support and encouragement. I have no doubt talked about keyboards and keyboard layouts enough over the years to drive any group of normal individuals over the edge. But they put up with me nonetheless. Contents 1 About this project 1.1 What is this project? 1.2 Why this project? 2 Project goals and features 2.1 Sane defaults combined with ease of use: the principle of least astonishment 2.2 Letter placements that make sense 2.3 Greek letter placements 2.4 Diacritic and punctuation placements that make sense 2.5 Greek diacritic and punctuation placements 2.6 Intuitive diacritic and backspacing behavior 2.7 Minimal interference with normal computer use 3 Efficient typing practice and Greek language learning 3.1 Repetition in typing 3.2 Repetition in learning 3.3 Typing, language learning, and frequent words 3.4 Some specific examples -
Uniedit Conversions
UNIEDIT MULTILINGUAL TEXT EDITOR USER’S GUIDE HUMANITIES COMPUTING FACILITY DUKE UNIVERSITY Copyright Information © COPYRIGHT 1998 BY THE HUMANITIES COMPUTING FACILITY, DUKE UNIVERSITY. ALL RIGHTS RESERVED. NO PART OF THIS PUBLICATION MAY BE REPRODUCED, TRANSMITTED, OR TRANSCRIBED, STORED INTO A RETRIEVAL SYSTEM OR TRANSLATED INTO ANY LANGUAGE OR COMPUTER LANGUAGE, IN ANY FORM OR BY ANY MEANS, ELECTRONIC, MECHANICAL, MAGNETIC, OPTICAL, CHEMICAL, MANUAL OR OTHERWISE, WITHOUT THE PRIOR WRITTEN CONSENT OF DUKE UNIVERSITY. THE HUMANITIES COMPUTING FACILITY RESERVES THE RIGHT TO REVISE THIS PUBLICATION AND TO MAKE CHANGES FROM TIME TO TIME IN THE CONTENT OF THIS PUBLICATION WITHOUT OBLIGATION OF THE HUMANITIES COMPUTING FACILITY TO NOTIFY ANY PERSON OF SUCH REVISION. MICROSOFT IS A REGISTERED TRADEMARK OF MICROSOFT CORPORATION; VIDEO FOR WINDOWS, WINDOWS 3.1, WINDOWS NT AND MULTIMEDIA MOVIE PLAYER ARE TRADEMARKS OF MICROSOFT CORPORATION. IBM IS A REGISTERED TRADEMARK OF INTERNATIONAL BUSINESS MACHINES CORPORATION AND PC-AT, PS/2, AND M-MOTION ARE TRADEMARKS OF INTERNATIONAL BUSINESS MACHINES. UNICODE IS A REGISTERED TRADEMARK OF THE UNICODE CONSORTIUM. SEND YOUR COMPLETED UNIEDIT REGISTRATION FORM, OR DIRECT ANY QUESTIONS, SUGGESTIONS, PRODUCT ORDERS, AND COMMENTS TO: HUMANITIES COMPUTING FACILITY 319 NORTH BUILDING BOX 90269 DUKE UNIVERSITY DURHAM, NC 27708-0269 USA PHONE: (919) 660-3190 FAX: (919) 660-3191 E-MAIL: [email protected] Table of Contents COPYRIGHT INFORMATION -
Test the Extended LGR Font Encoding Definitions
Test the extended LGR font encoding definitions The file lgrxenc.def allows convenient typesetting of Greek letters with diacritics. It works independently of the babel package. Symbols See the source file lgrenc-test.tex for the macros used to produce the symbols. Generic text symbols Latin: + - = < > – — { [ () ] } \ | % % ␣ LGR: + - = < > –— { [ () ] } ∖ | ‰ (Per-mille symbol is missing in LGR.) Quotes: «a» {a}, ‘a’ ‘α’, “a” “a” (double quotes wrong with Kerkis fonts), ‹a› „a” Single guillemets and base-quotes are missing in LGR. Ligature break up: AY fi AU ϊ ↦→ AY fi AvU "vi Spacing accent chars: ^a ^a ^i ~a ~va ~vi a ˘vα ˘vι ¯a ¯vα ¯vι ¨a "va "vi ´a 'va 'vi `a `va `vi Symbols for SI-units: 5 µm, 5 kW; 5 mm, 5 kW Letter schwa and Euro symbol: ə, € Some symbol definitions expect a Latin font: © ® ™ SS (uppercase of ß) Babel’s lgrenc.def defines them with \latintext, however this macro is not guaranteed to be defined, so it should not be used in a font encoding definition file. Instead, the textcomp.sty package should be used to provide the symbols for all font encodings (the sharp s (ß) is used in German text that cannot be set with LGR anyway). Greek alphabet Greek letters via Latin transcription in LGR font encoding: ABGDEZHJIKLMNXOPRSTUFQYW abgdezhjiklmnxoprcctufqyw Additional Greek symbols ϟ koppa, ϙ archaic koppa, Ϙ archaic Koppa, ϛ stigma, ϛ stigma variant, Ϛ Stigma (Sigma-Tau-Ligature in CB-fonts), ϡ sampi, Ϡ Sampi, ϝ digamma, Ã Digamma, ʹ Dexia keraia, ͵ Aristeri keraia, Variant symbols for pi (휛), kappa (no TeX symbol available), rho (휚), and theta (휗) are missing in LGR.