TUGboat, Volume 38 (2017), No. 3, page 373

Testing indexes: testidx.sty

Nicola L. C. Talbot

Abstract

The testidx package [10] provides a simple method of generating a test document with a multi-paged index for testing purposes. The dummy text and the index produced are designed to replicate problems commonly encountered in real documents.

The words and phrases indexed cover the basic Latin set A(a), ..., Z(z) and some extended Latin characters, such as Ø(ø), Æ(æ), Œ(œ), Å(å), Þ(þ) and Ł(ł), to test the indexing application’s ability to sort according to various Latin alphabets (such as Swedish or Icelandic). Version 1.1 also includes some words starting with digraphs, Dd(dd), Dz(dz), Ff(ff), IJ(ij), Ll(ll), Ly(ly), Ng(ng), and a trigraph Dzs(dzs), to test alphabets where these are considered separate letters (such as Welsh, Dutch or Hungarian).

There are also some numbers and symbols indexed that don’t have a natural word order.

1 Introduction

There are a number of problems that can occur when generating an index using LaTeX. These may relate to the index style (\printindex), the way the indexing information is written to an external file (\index) or the way the indexing application (such as xindy or makeindex) performs. A large document may have a complicated and slow build process, which can be frustrating when making minor adjustments to the index layout. The testidx package provides a way to create a test document that can be used to enhance the required style. Section 5 shows how the sample text can be extended to include tests for other languages or scripts.

The simplest test document is:

    \documentclass{article}
    \usepackage{makeidx}
    \usepackage{testidx}
    \makeindex
    \begin{document}
    \testidx
    \printindex
    \end{document}

Version 1.1 of testidx comes with the supplementary package testidx-glossaries, which uses the interface provided by the glossaries package [9] instead of testing \index and \printindex. In this case, the simplest test document is:

    \documentclass{article}
    \usepackage{testidx-glossaries}
    \tstidxmakegloss
    \begin{document}
    \testidx
    \tstidxprintglossaries
    \end{document}

2 Intentional issues

The dummy text is designed to introduce issues that your style or build process may need to guard against. These allow you to test the document style, the way the indexing information is written to the external file, and the way the indexing application processes that information.

2.1 Stylistic issues

The style issues are those which need addressing through the use of LaTeX code within the document itself, or in the class or package that deals with the index style, or within a style file or module used by the indexing application which controls the LaTeX code that’s written to the output file. The test document should load the appropriate document class and indexing package to match your real document.

2.1.1 Page breaking

There are enough entries for the index to span multiple pages. If you have letter group headings in your index style there’s a good chance that there will be at least one instance of a page or column break occurring between a heading and the first entry of that letter group. There’s also a chance that a break will occur between a main entry and the first of its sub-entries.

This does, of course, depend on the font and page dimensions. You may need to adjust the geometry to cause an unwanted break before experimenting with adjusting the style to prohibit it.

2.1.2 Headers and footers

Since the index spans multiple pages, it’s possible to test the headers and footers for the first page of the index as well as subsequent even and odd pages. This is useful if the header or footer content needs to vary and you need to check that this is done correctly.

2.1.3 Line breaking

The index contains a mixture of single words, compound words, phrases, names, places and titles. This means that some of the entries are quite wide, which can cause line breaking problems in narrow columns.

2.1.4 Whatsits

Some of the entries are indexed immediately before the term, for example

    \index{page break}page break

and some are indexed immediately after the term, for example

    paragraph\index{paragraph}

The whatsit introduced by \index can cause problems. This is most noticeable in an example equation where the indexing interferes with the limits of a summation. In practice, the \index would need to be moved to a more suitable location, but the example provided by the dummy text helps to highlight the problem.

2.2 Index recording issues

The way that indexing typically works is to write the entry data (using \index) to an external file that’s then input and processed by the indexing application. This write operation can sometimes go wrong, causing incorrect information to be written to the external file. (There’s no test for incorrect syntax within the argument of \index. It’s assumed you know how to correctly index entries. The tests here are for the underlying operation of \index.)

The glossaries package uses a similar method but instead of using \index, the file write instruction is internally performed by commands like \gls and \glsadd.

2.2.1 Page breaking

The dummy text has some long paragraphs with indexing scattered about them. This increases the chance of a page break occurring mid-paragraph (although again it depends on the font and page dimensions). TeX’s asynchronous output routine can cause page numbers to go awry, and this provides a useful way to check that the page number is written correctly to the external file.

2.2.2 Extended Latin characters

The indexed entries include terms that contain non-ASCII letters (either through accent commands like \' or using UTF-8 characters). The UTF-8 encoding isn’t an issue for the modern XeLaTeX or LuaLaTeX engines, but it is a problem for the older LaTeX formats. If your engine doesn’t natively support UTF-8 and you have characters outside the basic Latin set, then this is something that needs to be tested. The testidx package has four modes to test this, depending on your set-up.

1. ASCII with LaTeX commands stripped. (‘Bare ASCII’)

   This mode is triggered through the use of LaTeX with testidx’s stripaccents option and without the inputenc package [2]. This option is the default, so the first test example above is in this mode. This mode emulates doing, for example:

       \index{elite@\'elite}

   So if this is the way that you’re indexing words in your real document, this is the mode you need in the test document.

2. Unmodified ASCII. (‘ASCII Accents’)

   This mode is triggered by running LaTeX with testidx’s nostripaccents option and without the inputenc package. This mode emulates doing

       \index{\'elite@\'elite}

   Use this mode if in your real document you are simply doing, for example, \index{\'elite}.

3. Active UTF-8.

   This mode is triggered if the inputenc package with the utf8 option is loaded and testidx is loaded afterwards with the nosanitize option. This emulates doing

       \index{élite}

   Since the inputenc package makes the first octet of a UTF-8 character active, this causes the entry to be expanded as it’s written to the index file, so that it appears as:

       \IeC {\'e}lite

   Use this mode if in your real document you are doing, for example, \index{élite} and you want to test how it’s written to the index file. (This mode is the default when using XeLaTeX or LuaLaTeX, but the characters aren’t active, so it’s much the same as the next mode.)

4. Sanitized UTF-8.

   The three modes listed above are for emulating different \index usage. This last mode really belongs in the next section as it’s provided for testing the indexing application’s UTF-8 support, but is included in this list for completeness.

   This mode is triggered if inputenc is loaded with the utf8 option and testidx is loaded afterwards with the sanitize option. This emulates doing

       \def\word{élite}\@onelevel@sanitize\word
       \expandafter\index\expandafter{\word}

   The sanitization isn’t applied to any remaining content of \index, such as the encap. For example,

       \index{ð|see{eth (ð)}}

   is implemented such that only the ð part before the encap is sanitized, so this would end up written to the index file as

       ð|see{eth (\IeC {\dh })}

   (testidx doesn’t modify the \index command, but uses the \expandafter approach where the control sequence has a combination of sanitized and non-sanitized content.)

There’s no support for other encodings.

2.3 Indexing application issues

An indexing application typically reads the external file created by LaTeX (the input file that contains data, discussed in the previous section) and produces another file (the output file that contains typesetting instructions) which can then be read by LaTeX (through commands like \printindex). The terminology here is a little confusing as the input file from the indexing application’s point of view is an output [...]

[...] One rule may ignore those characters (for example, ‘vice-president’ < ‘viceroy’ < ‘vice versa’) whereas another rule may treat a space as coming before a hyphen (for example, ‘vice versa’ < ‘vice-president’ < ‘viceroy’).

2.3.4 Numbers

The test entries include some numbers (2, 10, 16, 42, 100).
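
As a rough illustration of why these particular numbers are worth testing, the following hand-written sketch indexes them directly with plain \index, rather than through testidx's dummy text. It is an assumption-laden stand-in for the package's own entries, not a copy of them. A purely character-by-character comparison would order the entries 10, 100, 16, 2, 42, whereas a numeric comparison gives 2, 10, 16, 42, 100, so the printed index shows which behaviour your indexing application and its options produce.

```latex
% Hypothetical minimal test, not part of testidx itself:
% index the five sample numbers and inspect their order in the index.
\documentclass{article}
\usepackage{makeidx}
\makeindex
\begin{document}
Sample numbers:
2\index{2}, 10\index{10}, 16\index{16}, 42\index{42}, 100\index{100}.

% String order would be 10, 100, 16, 2, 42;
% numeric order would be 2, 10, 16, 42, 100.
\printindex
\end{document}
```

Run LaTeX, then the indexing application, then LaTeX again, and compare the printed order against the two orderings noted in the comments.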