View Full Issue

Total Page:16

File Type:pdf, Size:1020Kb

View Full Issue THE SCIENCE AND INFORMATION ORGANIZATION www.thesai.org | [email protected] IJACSA Special Issue on Selected Papers from Third international symposium on Automatic Amazigh processing (SITACAM’ 13) Associate Editors Dr. Zuqing Zhu Service Provider Technology Group of Cisco Systems, San Jose Domain of Research: Research and development of wideband access routers for hybrid fibre-coaxial (HFC) cable networks and passive optical networks (PON) Dr. Ka Lok Man Department of Computer Science and Software Engineering at the Xi'an Jiaotong- Liverpool University, China Domain of Research: Design, analysis and tools for integrated circuits and systems; formal methods; process algebras; real-time, hybrid systems and physical cyber systems; communication and wireless sensor networks. Dr. Sasan Adibi Technical Staff Member of Advanced Research, Research In Motion (RIM), Canada Domain of Research: Security of wireless systems, Quality of Service (QoS), Ad-Hoc Networks, e-Health and m-Health (Mobile Health) Dr. Sikha Bagui Associate Professor in the Department of Computer Science at the University of West Florida, Domain of Research: Database and Data Mining. Dr. T. V. Prasad Dean, Lingaya's University, India Domain of Research: Bioinformatics, Natural Language Processing, Image Processing, Expert Systems, Robotics Dr. Bremananth R Research Fellow, Nanyang Technological University, Singapore Domain of Research: Acoustic Holography, Pattern Recognition, Computer Vision, Image Processing, Biometrics, Multimedia and Soft Computing (i) www.ijacsa.thesai.org IJACSA Special Issue on Selected Papers from Third international symposium on Automatic Amazigh processing (SITACAM’ 13) CONTENTS Paper 1: Application of Data Mining Tools for Recognition of Tifinagh Characters Authors: M. OUJAOURA, R. EL AYACHI, O. BENCHAREF, Y. CHIHAB, B. JARMOUNI PAGE 1 – 4 Paper 2: Annotation and research of pedagogical documents in a platform of e-learning based on Semantic Web Authors: S. BOUKIL, C. DAOUI, B. BOUIKHALENE, M.FAKIR PAGE 5 – 10 Paper 3: Hierarchical Algorithm for Hidden Markov Model Authors: SANAA CHAFIK, DAOUI CHERKI PAGE 11 – 14 Paper 4: Review of Color Image Segmentation Authors: Abderrahmane ELBALAOUI, M.FAKIR, N.IDRISSI, A.MARBOHA PAGE 15 – 21 Paper 5: Invariant Descriptors and Classifiers Combination for Recognition of Isolated Printed Tifinagh Characters Authors: M. OUJAOURA, R. EL AYACHI, B. MINAOUI, M. FAKIR and B. BOUIKHALENE, O. BENCHAREF PAGE 22 – 28 Paper 6: Handwritten Tifinagh Text Recognition Using Fuzzy K-NN and Bi-gram Language Model Authors: Said Gounane, Mohammad Fakir, Belaid Bouikhalen PAGE 29 – 32 Paper 7: Performance evaluation of ad hoc routing protocols in VANETs Authors: Mohammed ERRITALI, Bouabid El Ouahidi PAGE 33 – 40 Paper 8: Recognition of Amazigh characters using SURF & GIST descriptors Authors: H. Moudni, M. Er-rouidi, M. Oujaoura, O. Bencharef PAGE 41 – 44 Paper 9: Printed Arabic Character Classification Using Cadre of Level Feature Extraction Technique Authors: S.Nouri, M.Fakir PAGE 45 – 48 (ii) www.ijacsa.thesai.org IJACSA Special Issue on Selected Papers from Third international symposium on Automatic Amazigh processing (SITACAM’ 13) Application of Data Mining Tools for Recognition of Tifinagh Characters M. OUJAOURA, R. EL AYACHI O. BENCHAREF, Y. CHIHAB B. JARMOUNI Computer Science Department Computer Science Department Computer Science Department Faculty of Science and Technology Higher School of Technology Faculty of Science Sultan Moulay Slimane University Cadi Ayyad University Mohamed V University Béni Mellal, Morocco Essaouira, Morocco Rabat, Morocco. Abstract—The majority of Tifinagh OCR presented in the literature does not exceed the scope of simulation software such as Matlab. In this work, the objective is to compare the classification data mining tool for Tifinagh character recognition. This comparison is performed in a working environment using an Oracle database and Oracle Data Mining tools (ODM) to determine the algorithms that gives the best Recognition rates (rate / time). Keywords—OCR; Data Mining; Classification; Recognition; Fig. 1. Tifinagh characters – IRCAM. Tifinagh; geodesic descriptors; Zernike Moments; CART; AdaBoost; KNN; SVM; RNA; ANFIS The Tifinagh alphabet has several characters that can be obtained from others by a simple rotation, which makes I. INTRODUCTION invariant descriptors commonly used less effective. For this The Optical Character Recognition (OCR) is a rapidly reason, we used a combination of Geodesic descriptors [5] and expanding field in several areas where the text is the working Zernike moments [6]. basis. In general, a character recognition system consists of several phases [1, 2, 3, 4, 8]. The extraction is a phase that A. Geodesic descriptors focuses on the release of attributes from an image. In this A geodesic descriptor is the shortest path between two article, the geodesic descriptors and Zernike moments are two points along the spatial deformation of the surface. In the case approaches used to calculate the parameters. The effectiveness of binary images; we used a Shumfer simplification which of the system is based on the results given by the classification comprises those operations: phase using data mining tools, which is the purpose of this document. Calculate the number of pixels traveled between the two points; The rest of the paper is organized as follows. The Section 2 Penalize the transition between horizontal and vertical pixels discusses the first primordial task of any recognition system. by 1 and moving diagonally by 1.5; It’s the features extraction problems in addition to a brief formulation for geodesic descriptors and Zernike moments as Choose the optimal path. features extraction methods. The Section 3 is reserved for the We consider the preliminary processing that consists of two second important task which is the classification problems standard processes: (i) the noise elimination and (ii) the using some classifiers based on several algorithms like ANFIS, extremities detection (Fig. 2). ANN, SVM, CART, KNN and AdaBoost. Finally, the Section 1) Extremities detection 4 presents the experimental results for the recognition system. In order to identify the extremities, we use an algorithm II. FEATURES EXTRACTION that runs through the character contour and detects the nearest points to the corners of the image. Tifinagh is the set of alphabets used by the Amazigh population. The Royal Institute of Amazigh Culture (IRCAM) has normalized the Tifinagh alphabet of thirty-three characters as shown in Fig. 1. 1 | P a g e www.ijacsa.thesai.org IJACSA Special Issue on Selected Papers from Third international symposium on Automatic Amazigh processing (SITACAM’ 13) inside the unit circle as x2 + y2 1. The form of such (a) (b) polynomials is [7]: * Vnmx, y Vnm, Rnm().exp( jm) Where: n: a positive or null integer; (d) (c) m: an integer such that | m | n ; r: length of the vector from the origin to the pixel (x, y); Fig. 2. Example of the extremities of three Tifinagh characters. θ: angle between the vector x and p; 2) Geodesic descriptors calculation Rnm: radial polynomial. We called "geodesic descriptors" the distances between the four detected extremities of the character divided by their V * (x, y): complex polynomial projection of f (x, y) on the Euclidean distances. We set: space of complex polynomials. Such polynomials are orthogonal since: DlM(xy): Geodesic distance between x and y; dxy: Euclidean distance between x and y; V * x, y .V (x, y)dxdy 0 or 1 a, b, c and d: Detected extremities of each character. nm pq And, we call: x2 y2 1 st 1 metric descriptor: D1 DlM ab da b The geometrical Zernike moments are the projection of the 2nd metric descriptor: D Dl ac d function f (x, y) describing an image on a space of orthogonal 2 M a c polynomials generated by: rd 3 metric descriptor: D3 DlM ad da d 4th metric descriptor: D Dl bc d n 1 * 4 M b c Anm f (x, y).Vnm(,)dxdy th x y 5 metric descriptor: D5 DlM bd db d th 6 metric descriptor: D6 DlM cd dc d For identification of the image, the Zernike moments modules are used: To ensure resistance to scale change of the proposed descriptors, we divided each geodesic path by the corresponding Euclidean distances. 2 2 Anm Re (Anm) Im (Anm) III. CLASSIFICATION The choice of the classifier is primordial. It is the decision element in a pattern recognition system. In this context, we compared the performance of six Data Mining algorithms. A. CART Algorithm CART (Classification And Regression Tree) builds a strictly binary decision tree with exactly two branches for each decision node. The algorithm partitions or divides recursively the training set using the principle of "divide and conquer" [9]. B. KNN (k-nearest neighbor) Fig. 3. Example of Geodesic descriptors calculation. The k-nearest neighbors algorithm (kNN) [10] is a learning B. Zernike moments method based instances. To estimate the associated output with a new input x, the method of k nearest neighbors is taken into The Zernike moments are a series of calculations that account (with the same way) the k training samples whose converts an image into vectors with real components entrance is nearest to the new input x, according to distance representing moments A . ij measurement to be defined. By definition, the geometrical moments of a function f(x, y) C. SVM (Support Vector Machines) is the projection of this function on the space of polynomials generated by xp yq where (p, q)N2. Zernike introduced a set Support Vector Machines (SVM) [11] are a class of of complex polynomials which form an orthonormal basis learning algorithms that can be applied to any problem that 2 | P a g e www.ijacsa.thesai.org IJACSA Special Issue on Selected Papers from Third international symposium on Automatic Amazigh processing (SITACAM’ 13) involves a phenomenon f that produces output y=f(x) from a TABLE I. OBTAINED RESULTS: RCOGNITION RATES AND EXECUTION set of input x and wherein the goal is to find f from the TIME observation of a number of couples input/output.
Recommended publications
  • UAX #44: Unicode Character Database File:///D:/Uniweb-L2/Incoming/08249-Tr44-3D1.Html
    UAX #44: Unicode Character Database file:///D:/Uniweb-L2/Incoming/08249-tr44-3d1.html Technical Reports L2/08-249 Working Draft for Proposed Update Unicode Standard Annex #44 UNICODE CHARACTER DATABASE Version Unicode 5.2 draft 1 Authors Mark Davis ([email protected]) and Ken Whistler ([email protected]) Date 2008-7-03 This Version http://www.unicode.org/reports/tr44/tr44-3.html Previous http://www.unicode.org/reports/tr44/tr44-2.html Version Latest Version http://www.unicode.org/reports/tr44/ Revision 3 Summary This annex consolidates information documenting the Unicode Character Database. Status This is a draft document which may be updated, replaced, or superseded by other documents at any time. Publication does not imply endorsement by the Unicode Consortium. This is not a stable document; it is inappropriate to cite this document as other than a work in progress. A Unicode Standard Annex (UAX) forms an integral part of the Unicode Standard, but is published online as a separate document. The Unicode Standard may require conformance to normative content in a Unicode Standard Annex, if so specified in the Conformance chapter of that version of the Unicode Standard. The version number of a UAX document corresponds to the version of the Unicode Standard of which it forms a part. Please submit corrigenda and other comments with the online reporting form [Feedback]. Related information that is useful in understanding this annex is found in Unicode Standard Annex #41, “Common References for Unicode Standard Annexes.” For the latest version of the Unicode Standard, see [Unicode]. For a list of current Unicode Technical Reports, see [Reports].
    [Show full text]
  • Tungumál, Letur Og Einkenni Hópa
    Tungumál, letur og einkenni hópa Er letur ómissandi í baráttu hópa við ríkjandi öfl? Frá Tifinagh og rúnaristum til Pixação Þorleifur Kamban Þrastarson Lokaritgerð til BA-prófs Listaháskóli Íslands Hönnunar- og arkitektúrdeild Desember 2016 Í þessari ritgerð er reynt að rökstyðja þá fullyrðingu að letur sé mikilvægt og geti jafnvel undir vissum kringumstæðum verið eitt mikilvægasta vopnið í baráttu hópa fyrir tilveru sinni, sjálfsmynd og stað í samfélagi. Með því að líta á þrjú ólík dæmi, Tifinagh, rúnaletur og Pixação, hvert frá sínum stað, menningarheimi og tímabili er ætlunin að sýna hvernig saga leturs og týpógrafíu hefur samtvinnast og mótast af samfélagi manna og haldist í hendur við einkenni þjóða og hópa fólks sem samsama sig á einn eða annan hátt. Einkenni hópa myndast oft sem andsvar við ytri öflum sem ógna menningu, auði eða tilverurétti hópsins. Hópar nota mismunandi leturtýpur til þess að tjá sig, tengjast og miðla upplýsingum. Það skiptir ekki eingöngu máli hvað þú skrifar heldur hvernig, með hvaða aðferðum og á hvaða efni. Skilaboðin eru fólgin í letrinu sjálfu en ekki innihaldi letursins. Letur er útlit upplýsingakerfis okkar og hefur notkun ritmáls og leturs aldrei verið meiri í heiminum sem og læsi. Ritmál og letur eru algjörlega samofnir hlutir og ekki hægt að slíta annað frá öðru. Ekki er hægt að koma frá sér ritmáli nema í letri og þessi tvö hugtök flækjast því oft saman. Í ljósi athugana á þessum þremur dæmum í ritgerðinni dreg ég þá ályktun að letur spilar og hefur spilað mikilvægt hlutverk í einkennum þjóða og hópa. Letur getur, ásamt tungumálinu, stuðlað að því að viðhalda, skapa eða eyðileggja menningu og menningarlegar tenginga Tungumál, letur og einkenni hópa Er letur ómissandi í baráttu hópa við ríkjandi öfl? Frá Tifinagh og rúnaristum til Pixação Þorleifur Kamban Þrastarson Lokaritgerð til BA-prófs í Grafískri hönnun Leiðbeinandi: Óli Gneisti Sóleyjarson Grafísk hönnun Hönnunar- og arkitektúrdeild Desember 2016 Ritgerð þessi er 6 eininga lokaritgerð til BA-prófs í Grafískri hönnun.
    [Show full text]
  • A Translation of the Malia Altar Stone
    MATEC Web of Conferences 125, 05018 (2017) DOI: 10.1051/ matecconf/201712505018 CSCC 2017 A Translation of the Malia Altar Stone Peter Z. Revesz1,a 1 Department of Computer Science, University of Nebraska-Lincoln, Lincoln, NE, 68588, USA Abstract. This paper presents a translation of the Malia Altar Stone inscription (CHIC 328), which is one of the longest known Cretan Hieroglyph inscriptions. The translation uses a synoptic transliteration to several scripts that are related to the Malia Altar Stone script. The synoptic transliteration strengthens the derived phonetic values and allows avoiding certain errors that would result from reliance on just a single transliteration. The synoptic transliteration is similar to a multiple alignment of related genomes in bioinformatics in order to derive the genetic sequence of a putative common ancestor of all the aligned genomes. 1 Introduction symbols. These attempts so far were not successful in deciphering the later two scripts. Cretan Hieroglyph is a writing system that existed in Using ideas and methods from bioinformatics, eastern Crete c. 2100 – 1700 BC [13, 14, 25]. The full Revesz [20] analyzed the evolutionary relationships decipherment of Cretan Hieroglyphs requires a consistent within the Cretan script family, which includes the translation of all known Cretan Hieroglyph texts not just following scripts: Cretan Hieroglyph, Linear A, Linear B the translation of some examples. In particular, many [6], Cypriot, Greek, Phoenician, South Arabic, Old authors have suggested translations for the Phaistos Disk, Hungarian [9, 10], which is also called rovásírás in the most famous and longest Cretan Hieroglyph Hungarian and also written sometimes as Rovas in inscription, but in general they were unable to show that English language publications, and Tifinagh.
    [Show full text]
  • Documenting Endangered Alphabets
    Documenting endangered alphabets Industry Focus Tim Brookes Three years ago, acting on a notion so whimsi- cal I assumed it was a kind of presenile monoma- nia, I began carving endangered alphabets. The Tdisclaimers start right away. I’m not a linguist, an anthropologist, a cultural historian or even a woodworker. I’m a writer — but I had recently started carving signs for friends and family, and I stumbled on Omniglot.com, an online encyclo- pedia of the world’s writing systems, and several things had struck me forcibly. For a start, even though the world has more than 6,000 Figure 1: Tifinagh. languages (some of which will be extinct even by the time this article goes to press), it has fewer than 100 scripts, and perhaps a passing them on as a series of items for consideration and dis- third of those are endangered. cussion. For example, what does a written language — any writ- Working with a set of gouges and a paintbrush, I started to ten language — look like? The Endangered Alphabets highlight document as many of these scripts as I could find, creating three this question in a number of interesting ways. As the forces of exhibitions and several dozen individual pieces that depicted globalism erode scripts such as these, the number of people who words, phrases, sentences or poems in Syriac, Bugis, Baybayin, can write them dwindles, and the range of examples of each Samaritan, Makassarese, Balinese, Javanese, Batak, Sui, Nom, script is reduced. My carvings may well be the only examples Cherokee, Inuktitut, Glagolitic, Vai, Bassa Vah, Tai Dam, Pahauh of, say, Samaritan script or Tifinagh that my visitors ever see.
    [Show full text]
  • A Message from Afar: Fact Sheet 3 (PDF)
    3 What’s the Message? Here’s your signal The detection of a signal from another world would be a most remarkable moment in human history. However, if we detect such a signal, is it just a beacon from their technology, without any content, or does it contain information or even a message? Does it resemble sound, or is it like interstellar e-mail? Can we ever understand such a message? This appears to be a tremendous challenge, given that we still have many scripts from our own antiquity that remain undeciphered, despite many serious attempts, over hundreds of years. – And we know far more about humanity than about extra-terrestrial intelligence… We are facing all the complexities involved in understanding and glimpsing the intellect of the author, while the world’s expectations demand immediacy of information. So, where do we begin? Structure and language Information stands out from randomness, it is based on structures. The problem goal we face, after we detect a signal, is to first separate out those information-carrying signals from other phenomena, without being able to engage in a dialogue, and then to learn something about the structure of their content in the passing. This means that we need a suitable filter. We need a way of separating out the interesting stuff; we need a language detector. While identifying the location of origin of a candidate signal can rule out human making, its content could involve a vast array of possible structures, some of which may be beyond our knowledge or imagination; however, for identifying intelligence that shares any pattern with our way of processing and transmitting information, the collective knowledge and examples here on Earth are a good starting point.
    [Show full text]
  • Recognition of Baybayin Symbols (Ancient Pre-Colonial Philippine Writing System) Using Image Processing
    ISSN 2278-3091 Volume 9, No.1, January – February 2020 Mark Jovic A. Daday et al., International Journal of Advanced Trends in Computer Science and Engineering, 9(1), January – February 2020, 594 – 598 International Journal of Advanced Trends in Computer Science and Engineering Available Online at http://www.warse.org/IJATCSE/static/pdf/file/ijatcse83912020.pdf https://doi.org/10.30534/ijatcse/2020/83912020 Recognition of Baybayin Symbols (Ancient Pre-Colonial Philippine Writing System) using Image Processing Mark Jovic A. Daday1, Arnel C. Fajardo2, Ruji P. Medina3 1 Technological Institute of the Philippines, Quezon City, Philippines, [email protected] 2 School of Computer Science Manuel L. Quezon University, Diliman, Quezon City, Philippines, [email protected] 3 Technological Institute of the Philippines, Quezon City, Philippines, [email protected] because of the recognition problems encountered. Due to its ABSTRACT problem faces throughout recognition, computer is incapable to take out the features correctly when scanning them in [4]. The goal of this paper is to accomplish an Optical Character Recognition (OCR) that gives an extremely contribution to Alternatively, unusual modern methods for a several the advancement of technology in terms of image recognition concealed datasets of text and images were faced into more in Machine Learning. The researcher introduces the experiments, that directs to a compilation of multiple type of Feed-Forward Neural Network with Dropout Method font and unusual ruin degree in [5]. The goal of this paper is to (FFNNDM) and Convolutional Neural Network with Dropout accomplish an OCR system for the Baybayin handwriting Method (CNNDM) for the recognition of the Baybayin symbols or so called (“Alibata) an ancient and national symbols.
    [Show full text]
  • Language Ideologies in Morocco Sybil Bullock Connecticut College, [email protected]
    CORE Metadata, citation and similar papers at core.ac.uk Provided by DigitalCommons@Connecticut College Connecticut College Digital Commons @ Connecticut College Anthropology Department Honors Papers Anthropology Department 2014 Language Ideologies in Morocco Sybil Bullock Connecticut College, [email protected] Follow this and additional works at: http://digitalcommons.conncoll.edu/anthrohp Recommended Citation Bullock, Sybil, "Language Ideologies in Morocco" (2014). Anthropology Department Honors Papers. Paper 11. http://digitalcommons.conncoll.edu/anthrohp/11 This Honors Paper is brought to you for free and open access by the Anthropology Department at Digital Commons @ Connecticut College. It has been accepted for inclusion in Anthropology Department Honors Papers by an authorized administrator of Digital Commons @ Connecticut College. For more information, please contact [email protected]. The views expressed in this paper are solely those of the author. Language Ideologies in Morocco Sybil Bullock 2014 Honors Thesis Anthropology Department Connecticut College Thesis Advisor: Petko Ivanov First Reader: Christopher Steiner Second Reader: Jeffrey Cole Table of Contents Dedication………………………………………………………………………………….….…3 Acknowledgments………………………………………………………………………….…….4 Abstract…………………………………………………………………………………….…….5 Chapter 1: Introduction…………………………………………………………………………..6 Chapter 2: The Role of Language in Nation-Building…………………………………………..10 Chapter 3: The Myth of Monolingualism………………………………………………….……18 Chapter 4: Language or Dialect?...................................................................................................26
    [Show full text]
  • Association Relative À La Télévision Européenne G.E.I.E. Gtld: .Arte Status: ICANN Review Status Date: 2015-10-27 17:43:58 Print Date: 2015-10-27 17:44:13
    ICANN Registry Request Service Ticket ID: S1K9W-8K6J2 Registry Name: Association Relative à la Télévision Européenne G.E.I.E. gTLD: .arte Status: ICANN Review Status Date: 2015-10-27 17:43:58 Print Date: 2015-10-27 17:44:13 Proposed Service Name of Proposed Service: Technical description of Proposed Service: ARTE registry operator, Association Relative à la Télévision Européenne G.E.I.E. is applying to add IDN domain registration services. ARTE IDN domain name registration services will be fully compliant with IDNA 2008, as well as ICANN's Guidelines for implementation of IDNs. The language tables are submitted separately via the GDD portal together with IDN policies. ARTE is a brand TLD, as defined by the Specification 13, and as such only the registry and its affiliates are eligible to register .ARTE domain names. The full list of languages/scripts is: Azerbaijani language Belarusian language Bulgarian language Chinese language Croatian language French language Greek, Modern language Japanese language Korean language Kurdish language Macedonian language Moldavian language Polish language Russian language Serbian language Spanish language Swedish language Ukrainian language Arabic script Armenian script Avestan script Balinese script Bamum script Batak script Page 1 ICANN Registry Request Service Ticket ID: S1K9W-8K6J2 Registry Name: Association Relative à la Télévision Européenne G.E.I.E. gTLD: .arte Status: ICANN Review Status Date: 2015-10-27 17:43:58 Print Date: 2015-10-27 17:44:13 Bengali script Bopomofo script Brahmi script Buginese
    [Show full text]
  • The Cretan Script Family Includes the Carian Alphabet
    University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln CSE Journal Articles Computer Science and Engineering, Department of 2017 The rC etan Script Family Includes the Carian Alphabet Peter Z. Revesz University of Nebraska-Lincoln, [email protected] Follow this and additional works at: https://digitalcommons.unl.edu/csearticles Revesz, Peter Z., "The rC etan Script Family Includes the Carian Alphabet" (2017). CSE Journal Articles. 196. https://digitalcommons.unl.edu/csearticles/196 This Article is brought to you for free and open access by the Computer Science and Engineering, Department of at DigitalCommons@University of Nebraska - Lincoln. It has been accepted for inclusion in CSE Journal Articles by an authorized administrator of DigitalCommons@University of Nebraska - Lincoln. MATEC Web of Conferences 125, 05019 (2017) DOI: 10.1051/ matecconf/201712505019 CSCC 2017 The Cretan Script Family Includes the Carian Alphabet Peter Z. Revesz1,a 1 Department of Computer Science, University of Nebraska-Lincoln, Lincoln, NE, 68588, USA Abstract. The Cretan Script Family is a set of related writing systems that have a putative origin in Crete. Recently, Revesz [11] identified the Cretan Hieroglyphs, Linear A, Linear B, the Cypriot syllabary, and the Greek, Old Hungarian, Phoenician, South Arabic and Tifinagh alphabets as members of this script family and using bioinformatics algorithms gave a hypothetical evolutionary tree for their development and presented a map for their likely spread in the Mediterranean and Black Sea areas. The evolutionary tree and the map indicated some unknown writing system in western Anatolia to be the common origin of the Cypriot syllabary and the Old Hungarian alphabet.
    [Show full text]
  • Encoding Diversity for All the World's Languages
    Encoding Diversity for All the World’s Languages The Script Encoding Initiative (Universal Scripts Project) Michael Everson, Evertype Westport, Co. Mayo, Ireland Bamako, Mali • 6 May 2005 1. Current State of the Unicode Standard • Unicode 4.1 defines over 97,000 characters 1. Current State of the Unicode Standard: New Script Additions Unicode 4.1 (31 March 2005): For Unicode 5.0 (2006): Buginese N’Ko Coptic Balinese Glagolitic Phags-pa New Tai Lue Phoenician Nuskhuri (extends Georgian) Syloti Nagri Cuneiform Tifinagh Kharoshthi Old Persian Cuneiform 1. Current State of the Unicode Standard • Unicode 4.1 defines over 97,000 characters • Unicode covers over 50 scripts (many of which are used for languages with over 5 million speakers) 1. Current State of the Unicode Standard • Unicode 4.1 defines over 97,000 characters • Unicode covers over 50 scripts (often used for languages with over 5 million speakers) • Unicode enables millions of users worldwide to view web pages, send e-mails, converse in chat-rooms, and share text documents in their native script 1. Current State of the Unicode Standard • Unicode 4.1 defines over 97,000 characters • Unicode covers over 50 scripts (often used for languages with over 5 million speakers) • Unicode enables millions of users worldwide to view web pages, send e-mails, converse in chat- rooms, and share text documents in their native script • Unicode is widely supported by current fonts and operating systems, but… Over 80 scripts are missing! Missing Modern Minority Scripts India, Nepal, Southeast Asia China:
    [Show full text]
  • (RSEP) Request October 16, 2017 Registry Operator INFIBEAM INCORPORATION LIMITED 9Th Floor
    Registry Services Evaluation Policy (RSEP) Request October 16, 2017 Registry Operator INFIBEAM INCORPORATION LIMITED 9th Floor, A-Wing Gopal Palace, NehruNagar Ahmedabad, Gujarat 380015 Request Details Case Number: 00874461 This service request should be used to submit a Registry Services Evaluation Policy (RSEP) request. An RSEP is required to add, modify or remove Registry Services for a TLD. More information about the process is available at https://www.icann.org/resources/pages/rsep-2014- 02-19-en Complete the information requested below. All answers marked with a red asterisk are required. Click the Save button to save your work and click the Submit button to submit to ICANN. PROPOSED SERVICE 1. Name of Proposed Service Removal of IDN Languages for .OOO 2. Technical description of Proposed Service. If additional information needs to be considered, attach one PDF file Infibeam Incorporation Limited (“infibeam”) the Registry Operator for the .OOO TLD, intends to change its Registry Service Provider for the .OOO TLD to CentralNic Limited. Accordingly, Infibeam seeks to remove the following IDN languages from Exhibit A of the .OOO New gTLD Registry Agreement: - Armenian script - Avestan script - Azerbaijani language - Balinese script - Bamum script - Batak script - Belarusian language - Bengali script - Bopomofo script - Brahmi script - Buginese script - Buhid script - Bulgarian language - Canadian Aboriginal script - Carian script - Cham script - Cherokee script - Coptic script - Croatian language - Cuneiform script - Devanagari script
    [Show full text]
  • Bioinformatics Evolutionary Tree Algorithms Reveal the History of the Cretan Script Family
    INTERNATIONAL JOURNAL OF APPLIED MATHEMATICS AND INFORMATICS Volume 10, 2016 Bioinformatics Evolutionary Tree Algorithms Reveal the History of the Cretan Script Family Peter Z. Revesz syllabary, whose similarity with Linear A was noted by Evans. Abstract— This paper shows that Crete is the likely origin of a The Phoenician alphabet [28] was a major influence on the family of related scripts that includes the Cretan Hieroglyph, Linear development of many other alphabets due to the Phoenicians’ A, Linear B and Cypriot syllabaries and the Greek, Phoenician, Old widespread commercial influence in the Mediterranean area. Hungarian, South Arabic and Tifinagh alphabets. The paper develops The Phoenician and the South Arabic [30] alphabets are a novel similarity measure between pairs of script symbols. The similarity measure is used as an aid to develop a comparison table of assumed to derive from the Proto-Sinaitic alphabet, which the nine scripts. The paper presents a method to translate comparison originated in the Sinai Peninsula sometime between the th th tables into DNA encodings, thereby enabling the use of mid-19 and mid-16 century BC [29]. Phoenician represents bioinformatics algorithms that construct hypothetical evolutionary the northern branch, while South Arabic represents the trees. Applying the method to the nine scripts yields a script southern branch of Proto-Sinaitic. evolutionary tree with two main branches. The first branch is The classical Greek alphabet from about 800 BC had a composed of Cretan Hieroglyph, Cypriot, Linear A, Linear B, Old Hungarian and Tifinagh, while the second branch is composed of major influence for many other European alphabets.
    [Show full text]