ISO/IEC JTC1/SC2/WG2 N4268R L2/12-168R
Total Page:16
File Type:pdf, Size:1020Kb
Recommended publications
Managing Order Relations in MlBibTEX
Managing order relations in MlBibTEX. Jean-Michel Hufflen, LIFC (EA CNRS 4157), University of Franche-Comté, 16, route de Gray, 25030 BESANÇON CEDEX, France. hufflen (at) lifc dot univ-fcomte dot fr, http://lifc.univ-fcomte.fr/~hufflen. Abstract: Lexicographical order relations used within dictionaries are language-dependent. First, we describe the problems inherent in automatic generation of multilingual bibliographies. Second, we explain how these problems are handled within MlBibTEX. To add or update an order relation for a particular natural language, we have to program in Scheme, but we show that MlBibTEX’s environment eases this task as far as possible. Keywords: Lexicographical order relations, dictionaries, bibliographies, collation algorithm, Unicode, MlBibTEX, Scheme. Abstract (translated from the Polish): Lexicographic order in dictionaries depends on the language. First, we discuss the problems that arise in the automatic generation of multilingual bibliographies. Next, we explain how they are handled in MlBibTEX. Adding or updating the sorting rules for a particular natural language is done with a program written in Scheme. We show how much the MlBibTEX environment eases this task. Keywords (translated from the Polish): Lexicographic sorting rules, dictionaries, bibliographies, lexicographic sorting algorithms, Unicode, MlBibTEX, Scheme. 0 Introduction: Looking for a word in a dictionary or for a name in a phone book is a common task. We get used to the lexicographic order over a long time. More precisely, we get used to our own lexicographic order, because it belongs to our cultural background. … these items w.r.t. the alphabetical order of authors’ names. But the bst language of bibliography styles [14] only provides a SORT function [13, Table 13.7] suitable for the English language, the commands for accents and other diacritical signs being ignored by …
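The excerpt above says that a sort suited to English ignores accents and other diacritical signs, and that order relations are language-dependent. The following is a minimal Python sketch of that effect. It is not MlBibTEX's Scheme code; the helper names `strip_accents` and `english_style_key` and the sample names are assumptions made for illustration.

```python
import unicodedata

def strip_accents(text: str) -> str:
    """Drop combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def english_style_key(name: str) -> str:
    # Hypothetical key: case-folded and accent-free, roughly what an
    # English-only sort sees once diacritics are ignored.
    return strip_accents(name).casefold()

names = ["Zola", "Éluard", "Eco", "Ålund"]

# Plain code-point order pushes every accented capital after "Z".
by_code_point = sorted(names)

# Accent-free order files "Ålund" under A and "Éluard" under E.
by_english_key = sorted(names, key=english_style_key)

print(by_code_point)
print(by_english_key)
```

Neither order is right for every language. In Swedish, for example, Å is a separate letter that sorts after Z, so the accent-free key misplaces "Ålund" for Swedish readers. That is the kind of per-language rule the abstract says has to be programmed.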
Investigation of English Language Contact-Induced Features in Hungarian Cardiology Discharge Reports and Language Attitudes of Physicians and Patients
University of Szeged, Faculty of Arts, PhD School in Linguistics, PhD Program in English Applied Linguistics. Investigation of English language contact-induced features in Hungarian cardiology discharge reports and language attitudes of physicians and patients. Summary of PhD Dissertation. Csilla Keresztes. Supervisor: Anna Fenyvesi, PhD, associate professor. Szeged, 2010. 1. Introduction: Since the 1950s English has become not just an important language in the field of medicine, but the predominant language of health sciences. The aim of this study is to describe a field, namely, a subregister of the Hungarian language of medicine, to reveal the English contact-induced features in this specific purpose language, and to investigate the attitude of various discourse communities affected by it towards the English language. The impact of some major European languages, among them the English language, on Hungarian and its lexicon has already been investigated; however, it has been looked at mainly from a puristic aspect so far, and little sociolinguistic or contact linguistic research has been done in the field yet. This research is focused on only one field of medicine, cardiology, which was selected for a closer investigation, on the one hand, as it is a technologically sophisticated, professionalized, institutionalized, and highly invasive medical discipline. On the other hand, heart diseases are the leading causes of death in several countries of the world including Hungary. Numerous studies have been published on medical English, but studies on medical Hungarian are limited in number, and very little has been published on the language of cardiology. Hospital discharge reports (or summaries) are written documents prepared when the patient is discharged from a health institution after receiving management.
ISO Basic Latin Alphabet
ISO basic Latin alphabet. The ISO basic Latin alphabet is a Latin-script alphabet and consists of two sets of 26 letters, codified in various national and international standards[1] and used widely in international communication. The two sets contain the following 26 letters each:[1][2] Uppercase Latin alphabet: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z. Lowercase Latin alphabet: a b c d e f g h i j k l m n o p q r s t u v w x y z. History: By the 1960s it became apparent to the computer and telecommunications industries in the First World that a non-proprietary method of encoding characters was needed. The International Organization for Standardization (ISO) encapsulated the Latin script in their (ISO/IEC 646) 7-bit character-encoding standard. To achieve widespread acceptance, this encapsulation was based on popular usage. The standard was based on the already published American Standard Code for Information Interchange, better known as ASCII, which included in the character set the 26 × 2 letters of the English alphabet. Later standards issued by the ISO, for example ISO/IEC 8859 (8-bit character encoding) and ISO/IEC 10646 (Unicode Latin), have continued to define the 26 × 2 letters of the English alphabet as the basic Latin script with extensions to handle other letters in other languages.[1] Terminology: Name for Unicode block that contains all letters: The Unicode block that contains the alphabet is called "C0 Controls and Basic Latin".
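The excerpt states that the 26 × 2 letters were part of the 7-bit ASCII set and are still the basic Latin letters in later standards. As a small hedged illustration, the Python sketch below checks those properties using only the standard library. The variable names are chosen for this example and do not come from any of the standards mentioned.

```python
import string
import unicodedata

upper = string.ascii_uppercase  # the 26 uppercase letters A..Z
lower = string.ascii_lowercase  # the 26 lowercase letters a..z

# Every one of the 52 letters fits in 7 bits (code points below 0x80).
all_seven_bit = all(ord(ch) < 0x80 for ch in upper + lower)

# Each set occupies one contiguous run of code points.
upper_range = (ord(upper[0]), ord(upper[-1]))  # 0x41 to 0x5A
lower_range = (ord(lower[0]), ord(lower[-1]))  # 0x61 to 0x7A

# Upper and lower case of the same letter differ by a fixed offset.
case_offset = ord("a") - ord("A")

print(all_seven_bit, upper_range, lower_range, case_offset)
print(unicodedata.name("A"), "/", unicodedata.name("a"))
```

The fixed offset of 32 between the two runs means the cases differ in a single bit, which is why case conversion of these letters is cheap in ASCII-derived encodings.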
ISO/IEC JTC1/SC2/WG2 N4120 2011-07-05
ISO/IEC JTC1/SC2/WG2 N4120 2011-07-05. Universal Multiple-Octet Coded Character Set. International Organization for Standardization / Organisation Internationale de Normalisation / Международная организация по стандартизации. Doc Type: Working Group Document. Title: Response to the Ad-hoc Report N4110 about the Rovas scripts. Source: Gábor Hosszú (Hungarian National Body). Status: National Body Contribution. Action: For consideration by JTC1/SC2/WG2. Date: 2011-07-05. This document gives the evaluation by the Hungarian Standards Institution (Hungarian National Body) of the report of the ad-hoc committee on Hungarian, which met in Helsinki on 2011-06-08. Please send any response regarding this document to Gábor Hosszú (email: [email protected]). Contents: 1. Agreement; 2. Disagreement; 2.1. Naming of the script: barrier to the encoding; 2.2. Refused, but necessary Szekely-Hungarian Rovas characters; 2.3. Names of the characters
Early-Alphabets-3.Pdf
Early Alphabets. Outline: Alphabetic characteristics; Cretan Pictographs; Hieroglyphics; The Phoenician Alphabet; The Greek Alphabet; The Latin Alphabet; Summary. GDT-101 / HISTORY OF GRAPHIC DESIGN / EARLY ALPHABETS. Alphabetic characteristics (3,000 BCE): Basic building blocks of written language. Early visual language systems were disparate and decentralized (3,000 BCE): Protowriting, cuneiform, hieroglyphs and Far Eastern writing all functioned differently. Rebuses, ideographs, logograms, and syllabaries. HIEROGLYPHICS REPRESENTING THE REBUS PRINCIPLE · BEE & LEAF · SEA & SUN · BELIEF AND SEASON. PETROGLYPHIC PICTOGRAMS AND IDEOGRAPHS · CIRCA 200 BCE · UTAH, UNITED STATES. LUWIAN LOGOGRAMS · CIRCA 1400 AND 1200 BCE · TURKEY. OLD PERSIAN SYLLABARY · 600 BCE. Alphabetic structure marked an enormous societal leap (3,000 BCE): Power was reserved for those who could read and write. What is an alphabet? Definition: An alphabet is a set of visual symbols or characters used to represent the elementary sounds of a spoken language. –PM. They can be connected and combined to make visual configurations signifying sounds, syllables, and words uttered by the human mouth.
The Origins of the Alphabet
1 ORIGINS OF THE ALPHABET. The origins of the alphabet begin within the shadowy realms of prehistory. At some time within that shadowed prehistory, humankind began to communicate visually. Motivated by the need to communicate facts about the environment around them, humans made simple drawings of everyday objects such as people, animals, weapons and so forth. These object drawings are called pictographs. … disadvantages to this system: not only were the symbols complex, but their numbers tally into the tens of thousands, making learning difficult and writing slow. An early culture that was inspired to innovate a simpler, more efficient writing system was the Phoenicians. The Phoenicians were a wide-flung trading people of great vitality, with complex business transactions that required accurate record keeping. It was through their need for … the development of the alphabet, for it was through the auspices of Greeks and later through their cultural admirers, the Romans, that the alphabet was to finally take on a distinct resemblance to the modern Western alphabet. When the Romans adopted the Greek alphabet, along with other aspects of Greek culture, they continued the development and usage of the alphabet. … letters developed as a natural result of the use of the flexible reed pen for writing manuscripts. Lowercase letters for the most part require fewer strokes for their formation, allowing the scribes to fit more letters in a line of type. The combination of speed and space conservation was important to monks writing lengthy manuscripts on expensive parchment, and the use of lowercase letters was widely adopted in a relatively short time.
Old Cyrillic in Unicode*
Old Cyrillic in Unicode* Ivan A Derzhanski Institute for Mathematics and Computer Science, Bulgarian Academy of Sciences [email protected] The current version of the Unicode Standard acknowledges the existence of a pre-modern version of the Cyrillic script, but its support thereof is limited to assigning code points to several obsolete letters. Meanwhile mediæval Cyrillic manuscripts and some early printed books feature a plethora of letter shapes, ligatures, diacritic and punctuation marks that want proper representation. (In addition, contemporary editions of mediæval texts employ a variety of annotation signs.) As generally with scripts that predate printing, an obvious problem is the abundance of functional, chronological, regional and decorative variant shapes, the precise details of whose distribution are often unknown. The present contents of the block will need to be interpreted with Old Cyrillic in mind, and decisions to be made as to which remaining characters should be implemented via Unicode’s mechanism of variation selection, as ligatures in the typeface, or as code points in the Private space or the standard Cyrillic block. I discuss the initial stage of this work. The Unicode Standard (Unicode 4.0.1) makes a controversial statement: The historical form of the Cyrillic alphabet is treated as a font style variation of modern Cyrillic because the historical forms are relatively close to the modern appearance, and because some of them are still in modern use in languages other than Russian (for example, U+0406 “I” CYRILLIC CAPITAL LETTER I is used in modern Ukrainian and Byelorussian). Some of the letters in this range were used in modern typefaces in Russian and Bulgarian.
Sounds to Graphemes Guide
David Newman – Speech-Language Pathologist. Sounds to Graphemes Guide. A Friendly Reminder: © David Newmonic Language Resources 2015 - 2018. This program and all its contents are intellectual property. No part of this publication may be stored in a retrieval system, transmitted or reproduced in any way, including but not limited to digital copying and printing without the prior agreement and written permission of the author. However, I do give permission for class teachers or speech-language pathologists to print and copy individual worksheets for student use. Table of Contents: Sounds to Graphemes Guide - Introduction; Sounds to Grapheme Guide - Meanings; Pre-Test Assessment; Reading Miscue Analysis Symbols; Intervention Ideas; Reading Intervention Example; 44 Phonemes Charts; Consonant Sound Charts and Sound Stimulation
Hungarian Place Name Pronunciation for English-Speakers
Hungarian Place Name Pronunciation for English-speakers: The Hungarian language is part of the Uralic language family, which also includes Finnish and Estonian. The name Uralic derives from the hypothesized homeland of the languages in the vicinity of the Ural Mountains. More specifically, Hungarian is an Ugric language, with its closest relatives believed to be Khanty and Mansi, two obscure languages each spoken by around 10,000 people in western Siberia. The Hungarian language contains several sounds that don’t exist in English, so this pronunciation guide for map locations is approximate. Organized by Chip Saltsman, Hungarian pronunciation kindly provided by Scott Moore. A QUICK GUIDE TO PRONOUNCING HUNGARIAN: There are a few things you need to know before even trying to pronounce any Hungarian words: 1) All Hungarian words are stressed on the first syllable. In the list below, the stressed syllable will be shown in bold. This can be tricky to do if you aren’t used to it. Try it with the words ‘American Express’ in English. Normally these words are both stressed on the second syllable, i.e. uh-mair-uh-kuhn ek-spres (US pronunciation) or uh-me-ri-kuhn ik-spres (UK pronunciation). Try to say it like this: Uh-mair-uh-kuhn Ik-spres. Difficult, isn’t it? 2) Hungarian is highly phonetic. That means that any given written letter is always pronounced the same way. English is not very phonetic, especially with names. For example, Worcestershire is pronounced: wuh-stuh-shu. 3) The Hungarian alphabet has 40 letters (or 44 in the extended version)! As there aren’t enough symbols in the Latin alphabet, they’ve had to be creative in the use of: a.
Alphabets, Letters and Diacritics in European Languages (As They Appear in Geography)
Vigleik Leira (Norway): [email protected] Alphabets, Letters and Diacritics in European Languages (as they appear in Geography). To the best of my knowledge English seems to be the only language which makes use of a "clean" Latin alphabet, i.e. there is no use of diacritics or special letters of any kind. All the other languages based on Latin letters employ, to a larger or lesser degree, some diacritics and/or some special letters. The survey below is purely literal. It has nothing to say on the pronunciation of the different letters. Information on the phonetic/phonemic values of the graphic entities must be sought elsewhere, in language specific descriptions. The 26 letters a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z may be considered the standard European alphabet. In this article the word diacritic is used with this meaning: any sign placed above, through or below a standard letter (among the 26 given above); disregarding the cases where the resulting letter (e.g. å in Norwegian) is considered an ordinary letter in the alphabet of the language where it is used. Albanian. The alphabet (36 letters): a, b, c, ç, d, dh, e, ë, f, g, gj, h, i, j, k, l, ll, m, n, nj, o, p, q, r, rr, s, sh, t, th, u, v, x, xh, y, z, zh. Missing standard letter: w. Letters with diacritics: ç, ë. Sequences treated as one letter: dh, gj, ll, rr, sh, th, xh, zh.
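The Albanian entry above lists 36 letters, eight of which are two-character sequences treated as one letter. The Python sketch below shows one way such an alphabet could be used to split a word into letters and to sort by alphabet position. The greedy two-character matching and the function names `split_letters` and `collation_key` are assumptions made for this example. Greedy matching can mis-split a word in which two separate letters happen to stand next to each other, and the input is assumed to use precomposed ç and ë.

```python
# The 36 letters exactly as listed in the excerpt, in alphabet order.
ALBANIAN_ALPHABET = [
    "a", "b", "c", "ç", "d", "dh", "e", "ë", "f", "g", "gj", "h",
    "i", "j", "k", "l", "ll", "m", "n", "nj", "o", "p", "q", "r",
    "rr", "s", "sh", "t", "th", "u", "v", "x", "xh", "y", "z", "zh",
]
RANK = {letter: index for index, letter in enumerate(ALBANIAN_ALPHABET)}

def split_letters(word: str) -> list[str]:
    """Split a word into alphabet letters, preferring two-character letters."""
    word = word.lower()
    letters = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if len(pair) == 2 and pair in RANK:
            letters.append(pair)  # a sequence such as "dh" counts as one letter
            i += 2
        else:
            letters.append(word[i])
            i += 1
    return letters

def collation_key(word: str) -> list[int]:
    # Characters outside the alphabet (for example "w") rank after all letters.
    return [RANK.get(letter, len(RANK)) for letter in split_letters(word)]

words = ["dhe", "dy", "çaj", "cak"]
print(split_letters("shqip"))
print(sorted(words))                     # code-point order
print(sorted(words, key=collation_key))  # alphabet order from the excerpt
```

With this key, "dy" sorts before "dhe" because d precedes dh in the listed alphabet, and "çaj" lands directly after the c words instead of after every unaccented word.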
Proposal for Generation Panel for Latin Script Label Generation Ruleset for the Root Zone
Generation Panel for Latin Script Label Generation Ruleset for the Root Zone. Proposal for Generation Panel for Latin Script Label Generation Ruleset for the Root Zone. Table of Contents: 1. General Information; 1.1 Use of Latin Script characters in domain names; 1.2 Target Script for the Proposed Generation Panel; 1.2.1 Diacritics; 1.3 Countries with significant user communities using Latin script; 2. Proposed Initial Composition of the Panel and Relationship with Past Work or Working Groups; 3. Work Plan; 3.1 Suggested Timeline with Significant Milestones; 3.2 Sources for funding travel and logistics; 3.3 Need for ICANN provided advisors; 4. References. 1. General Information: The Latin script or Roman script is a major writing system of the world today, and the most widely used in terms of number of languages and number of speakers, with circa 70% of the world’s readers and writers making use of this script (Wikipedia). Historically, it is derived from the Greek alphabet, as is the Cyrillic script. The Greek alphabet is in turn derived from the Phoenician alphabet which dates to the mid-11th century BC and is itself based on older scripts. This explains why Latin, Cyrillic and Greek share some letters, which may become relevant to the ruleset in the form of cross-script variants. The Latin alphabet itself originated in Italy in the 7th Century BC. The original alphabet contained 21 upper case only letters: A, B, C, D, E, F, Z, H, I, K, L, M, N, O, P, Q, R, S, T, V and X.
ISO/IEC JTC1/SC2/WG2 N3483 L2/08-268
ISO/IEC JTC1/SC2/WG2 N3483 L2/08-268 2008-08-04 Universal Multiple-Octet Coded Character Set International Organization for Standardization Organisation Internationale de Normalisation Международная организация по стандартизации Doc Type: Working Group Document Title: Preliminary proposal for encoding the Old Hungarian script in the UCS Source: Michael Everson and André Szabolcs Szelp Status: Individual Contribution Action: For consideration by JTC1/SC2/WG2 and UTC Date: 2008-08-04 This document replaces N2134 (1999-10-02), N1638 (1997-09-18), and contains the proposal summary form. 1. Introduction. The Old Hungarian script is a runiform script used to write the Hungarian language. In Hungarian it is called rovásírás ‘incised script’, from rovás ‘incision’ and írás ‘writing, script’. Various sources call it “Old Hungarian” or “Hungarian Runic” where runic refers to the script's runiform character and does not indicate direct descent from the Germanic runes (though Old Hungarian and the Fuþark are distant cousins). Old Hungarian is thought to derive ultimately from the Old Turkic script used in Central Asia, and appears to have been brought by the Székely Magyars to what is now Hungary in 895 CE. Owing to its link with the Old Turkic script, Old Hungarian must have been developed around the 8th century CE; it is first mentioned in a written account in the late 13th century. The first surviving alphabetical listing dates to about 1483. Short inscriptions are attested from the 12–13th centuries; some inscriptions are said to have been written as early as the 10th century, though there is no consensus on the accuracy of this dating.