Fo'ohkkjr @Tdil

Total Page:16

File Type:pdf, Size:1020Kb

Fo'ohkkjr @Tdil fo'oHkkjr @tdil 3. Development of OHWR System for Gurmukhi R.K. Sharma, Palki Sharma, Narpinder Sharma, Harjeet Singh, Karun Verma, Ravinder Kumar, Rajesh Kumar, School of Mathematics & Computer Applications, Thapar University, Patiala, Punjab (INDIA) Abstract Gurmukhi and Roman numerals. Handwritten character recognition is a Postprocessing of classes has also been complex task owing to various writing styles performed in this work in order to refine the of different individuals. A number of authors recognition process results. In this have worked on the problem of handwritten postprocessing, the sequence of classes is Gurmukhi OHWR character recognition. They have, in analyzed; overwritten strokes are identified general, used structural and statistical and resolved. A heuristic based algorithm features in their work. Handwritten has been developed to form Gurmukhi character recognition systems have also aksharas from the identified classes. been proposed for Gurmukhi script by some 1. Introduction authors. This work presents a system to This is a well-established fact that recognize online handwritten Gurmukhi handwritten character recognition is a characters, Gurmukhi numerals, Roman complex task. This is due to different numerals and special characters. A handwriting styles of individuals, and also the sufficiently large annotated database has cursiveness in their handwriting. Handwritten been created in this work for the strokes used character recognition is divided into two in writing these Gurmukhi symbols. The categories: offline and online. In offline recognition engines developed in this work handwritten character recognition, data are are based on 100 samples (Engine 1) and 22 scanned images taken from a prewritten text, samples (Engines 2, 3 and 4) of each class usually on a sheet of paper. In online collected from different writers. These handwriting recognition systems, data are samples have been preprocessed and captured while writing with the help of a annotated. LibSVM classifier has been used special pen and an electronic surface. These in this work for classification purpose. The systems generally include the phases of system presented in this work could attain the preprocessing, features extraction and highest recognition accuracy of 96.8% at 5- classification. Feature extraction is a very fold cross validation for Gurmukhi significant phase in a character recognition characters, of 90.0% at 4-fold cross system, which is used to decide for the validation for Gurmukhi numerals, of 90.2% relevant shape contained in the character. The at 5-fold cross validation for Gurmukhi performance of a character recognition numerals and Roman numerals, of 84.0% at system largely depends on the features, which 4-fold cross validation for special characters, have been extracted. 55.
Recommended publications
  • Roman Numerals
    History of Numbers 1c. I can distinguish between an additive and positional system, and convert between Roman and Hindu-Arabic numbers. Roman Numerals The numeric system represented by Roman numerals originated in ancient Rome (753 BC–476 AD) and remained the usual way of writing numbers throughout Europe well into the Late Middle Ages. By the 11th century, the more efJicient Hindu–Arabic numerals had been introduced into Europe by way of Arab traders. Roman numerals, however, remained in commo use well into the 14th and 15th centuries, even in accounting and other business records (where the actual calculations would have been made using an abacus). Roman numerals are still used today, in certain contexts. See: Modern Uses of Roman Numerals Numbers in this system are represented by combinations of letters from the Latin alphabet. Roman numerals, as used today, are based on seven symbols: The numbers 1 to 10 are expressed in Roman numerals as: I, II, III, IV, V, VI, VII, VIII, IX, X. This an additive system. Numbers are formed by combining symbols and adding together their values. For example, III is three (three ones) and XIII is thirteen (a ten plus three ones). Because each symbol (I, V, X ...) has a Jixed value rather than representing multiples of ten, one hundred and so on (according to the numeral's position) there is no need for “place holding” zeros, as in numbers like 207 or 1066. Using Roman numerals, those numbers are written as CCVII (two hundreds, plus a ive and two ones) and MLXVI (a thousand plus a ifty plus a ten, a ive and a one).
    [Show full text]
  • Bibliography
    Bibliography Many books were read and researched in the compilation of Binford, L. R, 1983, Working at Archaeology. Academic Press, The Encyclopedic Dictionary of Archaeology: New York. Binford, L. R, and Binford, S. R (eds.), 1968, New Perspectives in American Museum of Natural History, 1993, The First Humans. Archaeology. Aldine, Chicago. HarperSanFrancisco, San Francisco. Braidwood, R 1.,1960, Archaeologists and What They Do. Franklin American Museum of Natural History, 1993, People of the Stone Watts, New York. Age. HarperSanFrancisco, San Francisco. Branigan, Keith (ed.), 1982, The Atlas ofArchaeology. St. Martin's, American Museum of Natural History, 1994, New World and Pacific New York. Civilizations. HarperSanFrancisco, San Francisco. Bray, w., and Tump, D., 1972, Penguin Dictionary ofArchaeology. American Museum of Natural History, 1994, Old World Civiliza­ Penguin, New York. tions. HarperSanFrancisco, San Francisco. Brennan, L., 1973, Beginner's Guide to Archaeology. Stackpole Ashmore, w., and Sharer, R. J., 1988, Discovering Our Past: A Brief Books, Harrisburg, PA. Introduction to Archaeology. Mayfield, Mountain View, CA. Broderick, M., and Morton, A. A., 1924, A Concise Dictionary of Atkinson, R J. C., 1985, Field Archaeology, 2d ed. Hyperion, New Egyptian Archaeology. Ares Publishers, Chicago. York. Brothwell, D., 1963, Digging Up Bones: The Excavation, Treatment Bacon, E. (ed.), 1976, The Great Archaeologists. Bobbs-Merrill, and Study ofHuman Skeletal Remains. British Museum, London. New York. Brothwell, D., and Higgs, E. (eds.), 1969, Science in Archaeology, Bahn, P., 1993, Collins Dictionary of Archaeology. ABC-CLIO, 2d ed. Thames and Hudson, London. Santa Barbara, CA. Budge, E. A. Wallis, 1929, The Rosetta Stone. Dover, New York. Bahn, P.
    [Show full text]
  • Shahmukhi to Gurmukhi Transliteration System: a Corpus Based Approach
    Shahmukhi to Gurmukhi Transliteration System: A Corpus based Approach Tejinder Singh Saini1 and Gurpreet Singh Lehal2 1 Advanced Centre for Technical Development of Punjabi Language, Literature & Culture, Punjabi University, Patiala 147 002, Punjab, India [email protected] http://www.advancedcentrepunjabi.org 2 Department of Computer Science, Punjabi University, Patiala 147 002, Punjab, India [email protected] Abstract. This research paper describes a corpus based transliteration system for Punjabi language. The existence of two scripts for Punjabi language has created a script barrier between the Punjabi literature written in India and in Pakistan. This research project has developed a new system for the first time of its kind for Shahmukhi script of Punjabi language. The proposed system for Shahmukhi to Gurmukhi transliteration has been implemented with various research techniques based on language corpus. The corpus analysis program has been run on both Shahmukhi and Gurmukhi corpora for generating statistical data for different types like character, word and n-gram frequencies. This statistical analysis is used in different phases of transliteration. Potentially, all members of the substantial Punjabi community will benefit vastly from this transliteration system. 1 Introduction One of the great challenges before Information Technology is to overcome language barriers dividing the mankind so that everyone can communicate with everyone else on the planet in real time. South Asia is one of those unique parts of the world where a single language is written in different scripts. This is the case, for example, with Punjabi language spoken by tens of millions of people but written in Indian East Punjab (20 million) in Gurmukhi script (a left to right script based on Devanagari) and in Pakistani West Punjab (80 million), written in Shahmukhi script (a right to left script based on Arabic), and by a growing number of Punjabis (2 million) in the EU and the US in the Roman script.
    [Show full text]
  • Bana Braille Codes Update 2007
    BANA BRAILLE CODES UPDATE 2007 Developed Under the Sponsorship of the BRAILLE AUTHORITY OF NORTH AMERICA Effective Date: January 1, 2008 BANA MEMBERS American Council of the Blind American Foundation for the Blind American Printing House for the Blind Associated Services for the Blind and Visually Impaired Association for Education and Rehabilitation of the Blind and Visually Impaired Braille Institute of America California Transcribers and Educators of the Visually Handicapped Canadian Association of Educational Resource Centres for Alternate Format Materials The Clovernook Center for the Blind and Visually Impaired CNIB (Canadian National Institute for the Blind) National Braille Association National Braille Press National Federation of the Blind National Library Service for the Blind and Physically Handicapped, Library of Congress Royal New Zealand Foundation of the Blind. Associate Member Publications Committee Susan Christensen, Chairperson Judy Dixon, Board Liaison Bob Brasher Warren Figueiredo Sandy Smith Joanna E. Venneri Copyright © by the Braille Authority of North America. This material may be duplicated but not altered. This document is available for download in various formats from www.brailleauthority.org. 2 TABLE OF CONTENTS INTRODUCTION ENGLISH BRAILLE, AMERICAN EDITION, REVISED 2002 ....... L1 Table of Changes.................................................................. L2 Definition of Braille ............................................................... L3 Rule I: Punctuation Signs .....................................................L13
    [Show full text]
  • The Gentics of Civilization: an Empirical Classification of Civilizations Based on Writing Systems
    Comparative Civilizations Review Volume 49 Number 49 Fall 2003 Article 3 10-1-2003 The Gentics of Civilization: An Empirical Classification of Civilizations Based on Writing Systems Bosworth, Andrew Bosworth Universidad Jose Vasconcelos, Oaxaca, Mexico Follow this and additional works at: https://scholarsarchive.byu.edu/ccr Recommended Citation Bosworth, Bosworth, Andrew (2003) "The Gentics of Civilization: An Empirical Classification of Civilizations Based on Writing Systems," Comparative Civilizations Review: Vol. 49 : No. 49 , Article 3. Available at: https://scholarsarchive.byu.edu/ccr/vol49/iss49/3 This Article is brought to you for free and open access by the Journals at BYU ScholarsArchive. It has been accepted for inclusion in Comparative Civilizations Review by an authorized editor of BYU ScholarsArchive. For more information, please contact [email protected], [email protected]. Bosworth: The Gentics of Civilization: An Empirical Classification of Civil 9 THE GENETICS OF CIVILIZATION: AN EMPIRICAL CLASSIFICATION OF CIVILIZATIONS BASED ON WRITING SYSTEMS ANDREW BOSWORTH UNIVERSIDAD JOSE VASCONCELOS OAXACA, MEXICO Part I: Cultural DNA Introduction Writing is the DNA of civilization. Writing permits for the organi- zation of large populations, professional armies, and the passing of complex information across generations. Just as DNA transmits biolog- ical memory, so does writing transmit cultural memory. DNA and writ- ing project information into the future and contain, in their physical structure, imprinted knowledge.
    [Show full text]
  • Arabic Numeral
    CHAPTER 4 Number Representation and Calculation Copyright © 2015, 2011, 2007 Pearson Education, Inc. Section 4.4, Slide 1 4.4 Looking Back at Early Numeration Systems Copyright © 2015, 2011, 2007 Pearson Education, Inc. Section 4.4, Slide 2 Objectives 1. Understand and use the Egyptian system. 2. Understand and use the Roman system. 3. Understand and use the traditional Chinese system. 4. Understand and use the Ionic Greek system. Copyright © 2015, 2011, 2007 Pearson Education, Inc. Section 4.4, Slide 3 The Egyptian Numeration System The Egyptians used the oldest numeration system called hieroglyphic notation. Copyright © 2015, 2011, 2007 Pearson Education, Inc. Section 4.4, Slide 4 Example: Using the Egyptian Numeration System Write the following numeral as a Hindu-Arabic numeral: Solution: Using the table, find the value of each of the Egyptian numerals. Then add them. 1,000,000 + 10,000 + 10,000 + 10 + 10 + 10 + 1 + 1 + 1 + 1 = 1,020,034 Copyright © 2015, 2011, 2007 Pearson Education, Inc. Section 4.4, Slide 5 Example: Using the Egyptian Numeration System Write 1752 as an Egyptian numeral. Solution: First break down the Hindu-Arabic numeral into quantities that match the Egyptian numerals: 1752 = 1000 + 700 + 50 + 2 = 1000 + 100 + 100 + 100 + 100 + 100 + 100 + 100 + 10 + 10 + 10 + 10 + 10 + 1 + 1 Now use the table to find the Egyptian symbol that matches each quantity. Thus, 1752 can be expressed as Copyright © 2015, 2011, 2007 Pearson Education, Inc. Section 4.4, Slide 6 The Roman Numeration System Roman I V X L C D M Numeral Hindu- 1 5 10 50 100 500 1000 Arabic Numeral The Roman numerals were used until the eighteenth century and are still commonly used today for outlining, on clocks, and in numbering some pages in books.
    [Show full text]
  • Extracts from the All India Census Reports on Literacy
    CENSUS OF INOlA 1971 CENSUS CENTENARY MONOGRAPH NO. 9 EXTRACTS FROM THE ALL INnJA CENSUS REPORTS ON LITER ACY hy D. Natarajan OFFICE OF THE REGIS1RAR GENERAL, INDIA MINISTRY OF HOME AFFAIRS NEW DELHI CONTENTS PAGES PREFACE INTRODUCTION i-v CENSUS OF INDIA---iS71-72 Secretary of State for India---Memorandum on General Department IX76. CENSUS OF INDIA- ) XX) 3-~ The Statistics of Instruction;;. CENSUS OF INDIA lX91 9--22 The Distribution of the Population by literacy. CENSUS OF INDIA- -1 0 01 23-43 CENSUS OF INDIA---19J I 45-61 Education: Introductory Remarks CENSUS OF INDIA- --1921 61-80 Literacy CENSUS OF INDIA-1931 81-111 Literacy CENSUS OF INDIA -1941 113-110 Literacy CENSUS OF fNDIA--1961 117 - -J IX literacy Maps ud Diagrams 1. Diagram showing the number of persons per 1,000 in each province who are. liter.ter '. _. 26 2. Map showing the prevale~ce of education amongst malcs. ... 28 3. Diagram showing the number per 1.000 of eacb main religion who are literate. 30 4. Diagram showing the number of persons per mille in each province, etc., who are literate. 441 In addition to the series of monographs mentioned above, a mono­ graph entitled 'Indian Censuses Through a Hundred Years' has been prepared. This monograph deals with the organisational aspects of the Indian Census. The Indian Census covers the largest population-China which has a larger population has not taken a regular Census so far-and is a major administra­ tive undertaking. The success of the CenSllS is due to the detailed and proper planning and their prompt execution.
    [Show full text]
  • Characters for Classical Latin
    Characters for Classical Latin David J. Perry version 13, 2 July 2020 Introduction The purpose of this document is to identify all characters of interest to those who work with Classical Latin, no matter how rare. Epigraphers will want many of these, but I want to collect any character that is needed in any context. Those that are already available in Unicode will be so identified; those that may be available can be debated; and those that are clearly absent and should be proposed can be proposed; and those that are so rare as to be unencodable will be known. If you have any suggestions for additional characters or reactions to the suggestions made here, please email me at [email protected] . No matter how rare, let’s get all possible characters on this list. Version 6 of this document has been updated to reflect the many characters of interest to Latinists encoded as of Unicode version 13.0. Characters are indicated by their Unicode value, a hexadecimal number, and their name printed IN SMALL CAPITALS. Unicode values may be preceded by U+ to set them off from surrounding text. Combining diacritics are printed over a dotted cir- cle ◌ to show that they are intended to be used over a base character. For more basic information about Unicode, see the website of The Unicode Consortium, http://www.unicode.org/ or my book cited below. Please note that abbreviations constructed with lines above or through existing let- ters are not considered separate characters except in unusual circumstances, nor are the space-saving ligatures found in Latin inscriptions unless they have a unique grammatical or phonemic function (which they normally don’t).
    [Show full text]
  • UEB Guidelines for Technical Material
    Guidelines for Technical Material Unified English Braille Guidelines for Technical Material This version updated October 2008 ii Last updated October 2008 iii About this Document This document has been produced by the Maths Focus Group, a subgroup of the UEB Rules Committee within the International Council on English Braille (ICEB). At the ICEB General Assembly in April 2008 it was agreed that the document should be released for use internationally, and that feedback should be gathered with a view to a producing a new edition prior to the 2012 General Assembly. The purpose of this document is to give transcribers enough information and examples to produce Maths, Science and Computer notation in Unified English Braille. This document is available in the following file formats: pdf, doc or brf. These files can be sourced through the ICEB representatives on your local Braille Authorities. Please send feedback on this document to ICEB, again through the Braille Authority in your own country. Last updated October 2008 iv Guidelines for Technical Material 1 General Principles..............................................................................................1 1.1 Spacing .......................................................................................................1 1.2 Underlying rules for numbers and letters.....................................................2 1.3 Print Symbols ..............................................................................................3 1.4 Format.........................................................................................................3
    [Show full text]
  • Why Romans Sometimes Wrote 8 As VIII, and Sometimes As IIX: a Possible Explanation
    International Mathematical Forum, Vol. 16, 2021, no. 2, 95 - 99 HIKARI Ltd, www.m-hikari.com https://doi.org/10.12988/imf.2021.912243 Why Romans Sometimes Wrote 8 as VIII, and Sometimes as IIX: A Possible Explanation Olga Kosheleva 1 and Vladik Kreinovich 2 1 Department of Teacher Education 2 Department of Computer Science University of Texas at El Paso 500 W. University El Paso, TX 79968, USA This article is distributed under the Creative Commons by-nc-nd Attribution License. Copyright c 2021 Hikari Ltd. Abstract Most of us are familiar with Roman numerals and with the standard way of describing numbers in the form of these numerals. What many people do not realize is that the actual ancient Romans often deviated from these rules. For example, instead of always writing the number 8 as VIII, i.e., 5 + 3, they sometimes wrote it as IIX, i.e., as 10 − 2. Some of such differences can be explained: e.g., the unusual way of writing 98 as IIC, i.e., as 100 − 2, can be explained by the fact that the Latin word for 98 literally means \two from hundred". However, other differences are not easy to explain { e.g., why Romans sometimes wrote 8 as VIII and sometimes as IIX. In this paper, we provide a possible explanation for this variation. Mathematics Subject Classification: 01A35 Keywords: Roman numerals, history of mathematics 1 Formulation of the Problem Roman numerals as we know them. Most people are familiar with Roman numerals. There, 1 is I, 5 is V, 10 is X, 50 is L, 100 is C, 500 is D, 1000 is 96 O.
    [Show full text]
  • Translating Roman Numerals
    Maggie’s Activity Pack Name __________________________ Date ___________________________ Translating Roman Numerals Look at a grandfather clock. You may see numerals that look like this: III or XI. Numerals like 3 and 5 are called Arabic numerals. Those written like this, X or VII, are called Roman numerals. Long ago the Romans needed to keep track of their trade. They probably first used simple marks. But, it likely was hard to keep making marks, I I I I I. So they started to use different symbols. Look at the chart below to see the Roman symbols for different numbers. Arabic Numeral Roman Numeral A century is 100 years. A millennium 1 I is 1000 years. These English words 5 V come from Latin. Centum means 100. 10 X Mille means 1000. This helps you to 50 L remember the C stands for 100 and M 100 C stands for 1000. 500 D 1000 M Did you notice there is not a zero in the Roman numeral system? Writing Different Numerals Where you put a symbol is very important. If a symbol is written to the right, it means you add it. XI means 11. But if a symbol is written to the left, it means you subtract that symbol. For example, IX means 9. Look at these numerals and their meanings: VI means 6 LX means 60 CX means 110 IV means 4 XL means 40 XC means 90 © Maggie's Earth Adventures, LLC 2007. Teachers may reproduce for classroom use. “Translating” Roman Numerals Now you can “translate” these Roman numerals into Arabic numerals.
    [Show full text]
  • Miramo Mmcomposer Reference Guide
    Miramo® mmComposer mmComposer Reference Guide VERSION 1.5 Copyright © 2014–2018 Datazone Ltd. All rights reserved. Miramo®, mmChart™, mmComposer™ and fmComposer™ are trademarks of Datazone Ltd. All other trademarks are the property of their respective owners. Readers of this documentation should note that its contents are intended for guidance only, and do not constitute formal offers or undertakings. ‘License Agreement’ This software, called Miramo, is licensed for use by the user subject to the terms of a License Agreement between the user and Datazone Ltd. Use of this software outside the terms of this license agreement is strictly prohibited. Unless agreed otherwise, this License Agreement grants a non-exclusive, non-transferable license to use the software programs and related document- ation in this package (collectively referred to as Miramo) on licensed computers only. Any attempted sublicense, assignment, rental, sale or other transfer of the software or the rights or obligations of the License Agreement without prior written con- sent of Datazone shall be void. In the case of a Miramo Development License, it shall be used to develop applications only and no attempt shall be made to remove the associated watermark included in output documents by any method. The documentation accompanying this software must not be copied or re-distributed to any third-party in either printed, photocopied, scanned or electronic form. The software and documentation are copyrighted. Unless otherwise agreed in writ- ing, copies of the software may be made only for backup and archival purposes. Unauthorized copying, reverse engineering, decompiling, disassembling, and creating derivative works based on the software are prohibited.
    [Show full text]