Text Encoding

Language and Computers
Prologue: Encoding Language

L245
(Based on Dickinson, Brew, & Meurers (2013))

Indiana University
Spring 2016


Language and Computers

Computers have a variety of applications involving language:

- textual searching
- grammar correction
- automatic translation
- question answering
- plagiarism detection
- ...


Language and Computers – where to start?

- If we want to do anything with language, we need a way to represent language.
- We can interact with the computer in several ways:
  - write or read text
  - speak or listen to speech
- Computer has to have some way to represent
  - text
  - speech


Outline

Writing systems
Encoding written language
Spoken language
Relating written and spoken language
Language modeling


Writing systems used for human languages

What is writing?

“a system of more or less permanent marks used to represent an utterance in such a way that it can be recovered more or less exactly without the intervention of the utterer.”
(Peter T. Daniels, The World’s Writing Systems)

Different types of writing systems are used:

- Alphabetic
- Syllabic
- Logographic

Much of the information on writing systems and the graphics used are taken from the great site http://www.omniglot.com.
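
To make the idea of "representing text" from the slides above concrete, here is a minimal sketch in Python. It assumes nothing beyond the standard library; the helper name code_points is made up for this illustration and does not come from the slides. It shows that a string is stored as a sequence of numbers, one per character.

```python
def code_points(text: str) -> list[int]:
    """Return the numeric code (Unicode code point) of each character."""
    return [ord(ch) for ch in text]


sample = "ABC"

# One integer per character.
print(code_points(sample))

# For plain ASCII letters, the stored bytes are the same integers.
print(list(sample.encode("ascii")))
```

For characters outside ASCII, the call to encode("ascii") would raise an error, which is one reason the outline goes on to cover Unicode.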


Alphabetic systems

Alphabets (phonemic alphabets)

- represent all sounds, i.e., consonants and vowels
- Examples: Etruscan, Latin, Korean, Cyrillic, Runic, International Phonetic Alphabet

Abjads (consonant alphabets)

- represent consonants only (sometimes plus selected vowels; vowel diacritics generally available)
- Examples: Arabic, Aramaic, Hebrew


Alphabet example: Fraser

An alphabet used to write Lisu, a Tibeto-Burman language spoken by about 657,000 people in Burma, India, Thailand and in the Chinese provinces of Yunnan and Sichuan.

(from: http://www.omniglot.com/writing/fraser.htm)


Abjad example: Phoenician

An abjad used to write Phoenician, created between the 18th and 17th centuries BC; assumed to be the forerunner of the Greek and Hebrew alphabet.

(from: http://www.omniglot.com/writing/phoenician.htm)


A note on the letter-sound correspondence

- Alphabets use letters to encode sounds (consonants, vowels).
- But the correspondence between spelling and pronunciation in many languages is quite complex, i.e., not a simple one-to-one correspondence.
- Example: English
  - same spelling – different sounds: ough: ought, cough, tough, through, though, hiccough
  - silent letters: knee, knight, knife, debt, psychology, mortgage
  - one letter – multiple sounds: exit, use
  - multiple letters – one sound: the, revolution
  - alternate spellings: jail or gaol; but seagh is not a possible spelling of chef (despite sure, dead, laugh)


More examples for non-transparent letter-sound correspondences

French

(1) a. Versailles → [veRsai]
    b. été, étais, était, étaient → [ete]

Irish

(2) a. samhradh (summer) → [sauruh]
    b. scríobhaim (I write) → [shgri:m]

What is the notation used within the []?


The International Phonetic Alphabet (IPA)

- Several special alphabets for representing sounds have been developed, the best known being the International Phonetic Alphabet (IPA).
- The phonetic symbols are unambiguous:
  - designed so that each speech sound gets its own symbol,
  - eliminating the need for
    - multiple symbols used to represent simple sounds
    - one symbol being used for multiple sounds.
- Interactive example chart: http://web.uvic.ca/ling/resources/ipa/charts/IPAlab/IPAlab.htm


Syllabic systems

Syllabaries

- writing systems with separate symbols for each syllable of a language
- Examples: Cherokee, Ethiopic, Cypriot, Ojibwe, Hiragana (Japanese)
  (cf. also: http://www.omniglot.com/writing/syllabaries.htm)

Abugidas (Alphasyllabaries)

- writing systems organized into families
- symbols represent a consonant with a vowel, but the vowel can be changed by adding a diacritic (= a symbol added to the letter).
- Examples: Balinese, Javanese, Tamil, Thai, Tagalog
  (cf. also: http://www.omniglot.com/writing/syllabic.htm)


Syllabary example: Cypriot

The Cypriot syllabary or Cypro-Minoan writing is thought to have developed from the Linear A script of Crete, though its exact origins are not known. It was used from about 1500 to 300 BC.

(from: http://www.omniglot.com/writing/cypriot.htm)


Abugida example: Lao

Script developed in the 14th century to write the Lao language, based on an early version of the Thai script, which was developed from the Old
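
The abugida principle described under "Syllabic systems" (a consonant symbol whose vowel is changed by an added diacritic) can be seen in how such scripts are stored on a computer. The following is a minimal sketch, not taken from the slides. It assumes Tamil, one of the listed examples, as encoded in Unicode, and uses Python's standard unicodedata module; the helper name describe is made up for this illustration.

```python
import unicodedata


def describe(text: str) -> list[tuple[str, str]]:
    """Pair each character's code point with its official Unicode name."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


# Assumption for illustration: Tamil consonant KA, first on its own,
# then followed by the dependent vowel sign I.
ka = "\u0B95"
ki = "\u0B95\u0BBF"

for syllable in (ka, ki):
    print(syllable, describe(syllable))
```

On screen the second string is drawn as a single syllable, but it is stored as two units: the consonant letter and a separate vowel sign. This mirrors the "letter plus diacritic" structure of an abugida.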