1 Working Draft for Supporting Blueberry Revision of XML
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
The Ogham-Runes and El-Mushajjar
c L ite atu e Vo l x a t n t r n o . o R So . u P R e i t ed m he T a s . 1 1 87 " p r f ro y f r r , , r , THE OGHAM - RUNES AND EL - MUSHAJJAR A D STU Y . BY RICH A R D B URTO N F . , e ad J an uar 22 (R y , PART I . The O ham-Run es g . e n u IN tr ating this first portio of my s bj ect, the - I of i Ogham Runes , have made free use the mater als r John collected by Dr . Cha les Graves , Prof. Rhys , and other students, ending it with my own work in the Orkney Islands . i The Ogham character, the fair wr ting of ' Babel - loth ancient Irish literature , is called the , ’ Bethluis Bethlm snion e or , from its initial lett rs, like “ ” Gree co- oe Al hab e t a an d the Ph nician p , the Arabo “ ” Ab ad fl d H ebrew j . It may brie y be describe as f b ormed y straight or curved strokes , of various lengths , disposed either perpendicularly or obliquely to an angle of the substa nce upon which the letters n . were i cised , punched, or rubbed In monuments supposed to be more modern , the letters were traced , b T - N E E - A HE OGHAM RU S AND L M USH JJ A R . n not on the edge , but upon the face of the recipie t f n l o t sur ace ; the latter was origi al y wo d , s aves and tablets ; then stone, rude or worked ; and , lastly, metal , Th . -
Assessment of Options for Handling Full Unicode Character Encodings in MARC21 a Study for the Library of Congress
1 Assessment of Options for Handling Full Unicode Character Encodings in MARC21 A Study for the Library of Congress Part 1: New Scripts Jack Cain Senior Consultant Trylus Computing, Toronto 1 Purpose This assessment intends to study the issues and make recommendations on the possible expansion of the character set repertoire for bibliographic records in MARC21 format. 1.1 “Encoding Scheme” vs. “Repertoire” An encoding scheme contains codes by which characters are represented in computer memory. These codes are organized according to a certain methodology called an encoding scheme. The list of all characters so encoded is referred to as the “repertoire” of characters in the given encoding schemes. For example, ASCII is one encoding scheme, perhaps the one best known to the average non-technical person in North America. “A”, “B”, & “C” are three characters in the repertoire of this encoding scheme. These three characters are assigned encodings 41, 42 & 43 in ASCII (expressed here in hexadecimal). 1.2 MARC8 "MARC8" is the term commonly used to refer both to the encoding scheme and its repertoire as used in MARC records up to 1998. The ‘8’ refers to the fact that, unlike Unicode which is a multi-byte per character code set, the MARC8 encoding scheme is principally made up of multiple one byte tables in which each character is encoded using a single 8 bit byte. (It also includes the EACC set which actually uses fixed length 3 bytes per character.) (For details on MARC8 and its specifications see: http://www.loc.gov/marc/.) MARC8 was introduced around 1968 and was initially limited to essentially Latin script only. -
ISO/IEC JTC1/SC2/WG2 N 2029 Date: 1999-05-29
ISO INTERNATIONAL ORGANIZATION FOR STANDARDIZATION ORGANISATION INTERNATIONALE DE NORMALISATION --------------------------------------------------------------------------------------- ISO/IEC JTC1/SC2/WG2 Universal Multiple-Octet Coded Character Set (UCS) -------------------------------------------------------------------------------- ISO/IEC JTC1/SC2/WG2 N 2029 Date: 1999-05-29 TITLE: Repertoire additions for ISO/IEC 10646-1 - Cumulative List No.9 SOURCE: Bruce Paterson, project editor STATUS: Standing Document, replacing WG2 N 1936 ACTION: For review and confirmation by WG2 DISTRIBUTION: Members of JTC1/SC2/WG2 INTRODUCTION This working paper contains the accumulated list of additions to the repertoire of ISO/IEC 10646-1 agreed by WG2, up to meeting no.36 (Fukuoka). A summary of all allocations within the BMP is given in Annex 1. A list of additional Collections, Blocks, and character tables is given in Annex 2. All additions are assigned to a Character Category, in accordance with clause II of the document "Principles and Procedures for Allocation of New Characters and Scripts" WG2 N 1502. The column Cat. in the table below shows the category (A to G) assigned by WG2. An entry P in this column indicates that the characters are provisionally accepted by WG2. WG2 Cat. No of Code Character(s) Source Current Ballot mtg.res chars position(s) doc. ref. end date NEW/EXTENDED SCRIPTS 27.14 A 11172 AC00-D7A3 Hangul syllables (revision) N1158 AMD 5 - 28.2 A 31 0591-05AF Hebrew cantillation marks N1217 AMD 7 - +05C4 28.5 A 174 0F00-0FB9 Tibetan N1238 AMD 6 - 31.4 A 346 1200-137F Ethiopic N1420 AMD10 - 31.6 A 623 1400-167F Canadian Aboriginal Syllabics N1441 AMD11 - 31.7 B1 85 13A0-13AF Cherokee N1172 AMD12 - & N1362 32.14 A 6582 3400-4DBF CJK Unified Ideograph Exten. -
Greek Alphabet ( ) Ελληνικ¿ Γρ¿Μματα
Greek alphabet and pronunciation 9/27/05 12:01 AM Writing systems: abjads | alphabets | syllabic alphabets | syllabaries | complex scripts undeciphered scripts | alternative scripts | your con-scripts | A-Z index Greek alphabet (ελληνικ¿ γρ¿μματα) Origin The Greek alphabet has been in continuous use for the past 2,750 years or so since about 750 BC. It was developed from the Canaanite/Phoenician alphabet and the order and names of the letters are derived from Phoenician. The original Canaanite meanings of the letter names was lost when the alphabet was adapted for Greek. For example, alpha comes for the Canaanite aleph (ox) and beta from beth (house). At first, there were a number of different versions of the alphabet used in various different Greek cities. These local alphabets, known as epichoric, can be divided into three groups: green, blue and red. The blue group developed into the modern Greek alphabet, while the red group developed into the Etruscan alphabet, other alphabets of ancient Italy and eventually the Latin alphabet. By the early 4th century BC, the epichoric alphabets were replaced by the eastern Ionic alphabet. The capital letters of the modern Greek alphabet are almost identical to those of the Ionic alphabet. The minuscule or lower case letters first appeared sometime after 800 AD and developed from the Byzantine minuscule script, which developed from cursive writing. Notable features Originally written horizontal lines either from right to left or alternating from right to left and left to right (boustophedon). Around 500 BC the direction of writing changed to horizontal lines running from left to right. -
Pronunciation
PRONUNCIATION Guide to the Romanized version of quotations from the Guru Granth Saheb. A. Consonants Gurmukhi letter Roman Word in Roman Word in Gurmukhi Meaning Letter letters using the letters using the relevant letter relevant letter from from the second the first column column S s Sabh sB All H h Het ihq Affection K k Krodh kroD Anger K kh Khayl Kyl Play G g Guru gurU Teacher G gh Ghar Gr House | ng Ngyani / gyani i|AwnI / igAwnI Possessing divine knowledge C c Cor cor Thief C ch Chaata Cwqw Umbrella j j Jahaaj jhwj Ship J jh Jhaaroo JwVU Broom \ ny Sunyi su\I Quiet t t Tap t`p Jump T th Thag Tg Robber f d Dar fr Fear F dh Dholak Folk Drum x n Hun hux Now q t Tan qn Body Q th Thuk Quk Sputum d d Den idn Day D dh Dhan Dn Wealth n n Net inq Everyday p p Peta ipqw Father P f Fal Pl Fruit b b Ben ibn Without B bh Bhagat Bgq Saint m m Man mn Mind X y Yam Xm Messenger of death r r Roti rotI Bread l l Loha lohw Iron v v Vasai vsY Dwell V r Koora kUVw Rubbish (n) in brackets, and (g) in brackets after the consonant 'n' both indicate a nasalised sound - Eg. 'Tu(n)' meaning 'you'; 'saibhan(g)' meaning 'by himself'. All consonants in Punjabi / Gurmukhi are sounded - Eg. 'pai-r' meaning 'foot' where the final 'r' is sounded. 3 Copyright Material: Gurmukh Singh of Raub, Pahang, Malaysia B. -
K 03-UP-004 Insular Io02(A)
By Bernard Wailes TOP: Seventh century A.D., peoples of Ireland and Britain, with places and areas that are mentioned in the text. BOTTOM: The Ogham stone now in St. Declan’s Cathedral at Ardmore, County Waterford, Ireland. Ogham, or Ogam, was a form of cipher writing based on the Latin alphabet and preserving the earliest-known form of the Irish language. Most Ogham inscriptions are commemorative (e.g., de•fin•ing X son of Y) and occur on stone pillars (as here) or on boulders. They date probably from the fourth to seventh centuries A.D. who arrived in the fifth century, occupied the southeast. The British (p-Celtic speakers; see “Celtic Languages”) formed a (kel´tik) series of kingdoms down the western side of Britain and over- seas in Brittany. The q-Celtic speaking Irish were established not only in Ireland but also in northwest Britain, a fifth- THE CASE OF THE INSULAR CELTS century settlement that eventually expanded to become the kingdom of Scotland. (The term Scot was used interchange- ably with Irish for centuries, but was eventually used to describe only the Irish in northern Britain.) North and east of the Scots, the Picts occupied the rest of northern Britain. We know from written evidence that the Picts interacted extensively with their neighbors, but we know little of their n decades past, archaeologists several are spoken to this day. language, for they left no texts. After their incorporation into in search of clues to the ori- Moreover, since the seventh cen- the kingdom of Scotland in the ninth century, they appear to i gin of ethnic groups like the tury A.D. -
Inventory of Romanization Tools
Inventory of Romanization Tools Standards Intellectual Management Office Library and Archives Canad Ottawa 2006 Inventory of Romanization Tools page 1 Language Script Romanization system for an English Romanization system for a French Alternate Romanization system catalogue catalogue Amharic Ethiopic ALA-LC 1997 BGN/PCGN 1967 UNGEGN 1967 (I/17). http://www.eki.ee/wgrs/rom1_am.pdf Arabic Arabic ALA-LC 1997 ISO 233:1984.Transliteration of Arabic BGN/PCGN 1956 characters into Latin characters NLC COPIES: BS 4280:1968. Transliteration of Arabic characters NL Stacks - TA368 I58 fol. no. 00233 1984 E DMG 1936 NL Stacks - TA368 I58 fol. no. DIN-31635, 1982 00233 1984 E - Copy 2 I.G.N. System 1973 (also called Variant B of the Amended Beirut System) ISO 233-2:1993. Transliteration of Arabic characters into Latin characters -- Part 2: Lebanon national system 1963 Arabic language -- Simplified transliteration Morocco national system 1932 Royal Jordanian Geographic Centre (RJGC) System Survey of Egypt System (SES) UNGEGN 1972 (II/8). http://www.eki.ee/wgrs/rom1_ar.pdf Update, April 2004: http://www.eki.ee/wgrs/ung22str.pdf Armenian Armenian ALA-LC 1997 ISO 9985:1996. Transliteration of BGN/PCGN 1981 Armenian characters into Latin characters Hübschmann-Meillet. Assamese Bengali ALA-LC 1997 ISO 15919:2001. Transliteration of Hunterian System Devanagari and related Indic scripts into Latin characters UNGEGN 1977 (III/12). http://www.eki.ee/wgrs/rom1_as.pdf 14/08/2006 Inventory of Romanization Tools page 2 Language Script Romanization system for an English Romanization system for a French Alternate Romanization system catalogue catalogue Azerbaijani Arabic, Cyrillic ALA-LC 1997 ISO 233:1984.Transliteration of Arabic characters into Latin characters. -
A Message from Afar: Fact Sheet 3 (PDF)
3 What’s the Message? Here’s your signal The detection of a signal from another world would be a most remarkable moment in human history. However, if we detect such a signal, is it just a beacon from their technology, without any content, or does it contain information or even a message? Does it resemble sound, or is it like interstellar e-mail? Can we ever understand such a message? This appears to be a tremendous challenge, given that we still have many scripts from our own antiquity that remain undeciphered, despite many serious attempts, over hundreds of years. – And we know far more about humanity than about extra-terrestrial intelligence… We are facing all the complexities involved in understanding and glimpsing the intellect of the author, while the world’s expectations demand immediacy of information. So, where do we begin? Structure and language Information stands out from randomness, it is based on structures. The problem goal we face, after we detect a signal, is to first separate out those information-carrying signals from other phenomena, without being able to engage in a dialogue, and then to learn something about the structure of their content in the passing. This means that we need a suitable filter. We need a way of separating out the interesting stuff; we need a language detector. While identifying the location of origin of a candidate signal can rule out human making, its content could involve a vast array of possible structures, some of which may be beyond our knowledge or imagination; however, for identifying intelligence that shares any pattern with our way of processing and transmitting information, the collective knowledge and examples here on Earth are a good starting point. -
A STUDY of WRITING Oi.Uchicago.Edu Oi.Uchicago.Edu /MAAM^MA
oi.uchicago.edu A STUDY OF WRITING oi.uchicago.edu oi.uchicago.edu /MAAM^MA. A STUDY OF "*?• ,fii WRITING REVISED EDITION I. J. GELB Phoenix Books THE UNIVERSITY OF CHICAGO PRESS oi.uchicago.edu This book is also available in a clothbound edition from THE UNIVERSITY OF CHICAGO PRESS TO THE MOKSTADS THE UNIVERSITY OF CHICAGO PRESS, CHICAGO & LONDON The University of Toronto Press, Toronto 5, Canada Copyright 1952 in the International Copyright Union. All rights reserved. Published 1952. Second Edition 1963. First Phoenix Impression 1963. Printed in the United States of America oi.uchicago.edu PREFACE HE book contains twelve chapters, but it can be broken up structurally into five parts. First, the place of writing among the various systems of human inter communication is discussed. This is followed by four Tchapters devoted to the descriptive and comparative treatment of the various types of writing in the world. The sixth chapter deals with the evolution of writing from the earliest stages of picture writing to a full alphabet. The next four chapters deal with general problems, such as the future of writing and the relationship of writing to speech, art, and religion. Of the two final chapters, one contains the first attempt to establish a full terminology of writing, the other an extensive bibliography. The aim of this study is to lay a foundation for a new science of writing which might be called grammatology. While the general histories of writing treat individual writings mainly from a descriptive-historical point of view, the new science attempts to establish general principles governing the use and evolution of writing on a comparative-typological basis. -
Section 9.2, Arabic, Section 9.3, Syriac and Section 9.5, Man- Daic
The Unicode® Standard Version 12.0 – Core Specification To learn about the latest version of the Unicode Standard, see http://www.unicode.org/versions/latest/. Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trade- mark claim, the designations have been printed with initial capital letters or in all capitals. Unicode and the Unicode Logo are registered trademarks of Unicode, Inc., in the United States and other countries. The authors and publisher have taken care in the preparation of this specification, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein. The Unicode Character Database and other files are provided as-is by Unicode, Inc. No claims are made as to fitness for any particular purpose. No warranties of any kind are expressed or implied. The recipient agrees to determine applicability of information provided. © 2019 Unicode, Inc. All rights reserved. This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction. For information regarding permissions, inquire at http://www.unicode.org/reporting.html. For information about the Unicode terms of use, please see http://www.unicode.org/copyright.html. The Unicode Standard / the Unicode Consortium; edited by the Unicode Consortium. — Version 12.0. Includes index. ISBN 978-1-936213-22-1 (http://www.unicode.org/versions/Unicode12.0.0/) 1. -
The Unicode Standard, Version 6.2 Copyright © 1991–2012 Unicode, Inc
The Unicode Standard Version 6.2 – Core Specification To learn about the latest version of the Unicode Standard, see http://www.unicode.org/versions/latest/. Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trade- mark claim, the designations have been printed with initial capital letters or in all capitals. Unicode and the Unicode Logo are registered trademarks of Unicode, Inc., in the United States and other countries. The authors and publisher have taken care in the preparation of this specification, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein. The Unicode Character Database and other files are provided as-is by Unicode, Inc. No claims are made as to fitness for any particular purpose. No warranties of any kind are expressed or implied. The recipient agrees to determine applicability of information provided. Copyright © 1991–2012 Unicode, Inc. All rights reserved. This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction. For information regarding permissions, inquire at http://www.unicode.org/reporting.html. For information about the Unicode terms of use, please see http://www.unicode.org/copyright.html. The Unicode Standard / the Unicode Consortium ; edited by Julie D. Allen ... [et al.]. — Version 6.2. -
(RSEP) Request October 16, 2017 Registry Operator INFIBEAM INCORPORATION LIMITED 9Th Floor
Registry Services Evaluation Policy (RSEP) Request October 16, 2017 Registry Operator INFIBEAM INCORPORATION LIMITED 9th Floor, A-Wing Gopal Palace, NehruNagar Ahmedabad, Gujarat 380015 Request Details Case Number: 00874461 This service request should be used to submit a Registry Services Evaluation Policy (RSEP) request. An RSEP is required to add, modify or remove Registry Services for a TLD. More information about the process is available at https://www.icann.org/resources/pages/rsep-2014- 02-19-en Complete the information requested below. All answers marked with a red asterisk are required. Click the Save button to save your work and click the Submit button to submit to ICANN. PROPOSED SERVICE 1. Name of Proposed Service Removal of IDN Languages for .OOO 2. Technical description of Proposed Service. If additional information needs to be considered, attach one PDF file Infibeam Incorporation Limited (“infibeam”) the Registry Operator for the .OOO TLD, intends to change its Registry Service Provider for the .OOO TLD to CentralNic Limited. Accordingly, Infibeam seeks to remove the following IDN languages from Exhibit A of the .OOO New gTLD Registry Agreement: - Armenian script - Avestan script - Azerbaijani language - Balinese script - Bamum script - Batak script - Belarusian language - Bengali script - Bopomofo script - Brahmi script - Buginese script - Buhid script - Bulgarian language - Canadian Aboriginal script - Carian script - Cham script - Cherokee script - Coptic script - Croatian language - Cuneiform script - Devanagari script