Multistage Hybrid Arabic/Indian Numeral OCR System

Total Page:16

File Type:pdf, Size:1020Kb

Multistage Hybrid Arabic/Indian Numeral OCR System (IJCSIS) International Journal of Computer Science and Information Security, Vol. 8, No. 1, 2010 Multistage Hybrid Arabic/Indian Numeral OCR System Yasser M. Alginaih, Ph.D., P.Eng. IEEE Member Abdul Ahad Siddiqi, Ph.D., Member IEEE & PEC Dept. of Computer Science Dept. of Computer Science Taibah University Taibah University Madinah, Kingdom of Saudi Arabia Madinah, Kingdom of Saudi Arabia [email protected] [email protected] Abstract— The use of OCR in postal services is not yet numeral OCR systems for Postal services have been used universal and there are still many countries that process in some countries, but still there are problems in such mail sorting manually. Automated Arabic/Indian numeral systems, stemming from the fact that machines are unable Optical Character Recognition (OCR) systems for Postal to read the crucial information needed to distribute the services are being used in some countries, but still there are mail efficiently. Historically, most civilizations have errors during the mail sorting process, thus causing a different symbols that convey numerical values, but the reduction in efficiency. The need to investigate fast and Arabic version is the simplest and most widely efficient recognition algorithms/systems is important so as to correctly read the postal codes from mail addresses and to acceptable. In most Middle Eastern countries both the eliminate any errors during the mail sorting stage. The Arabic (0,1,2,3,4,5,6,7,8,9) and Indian ۷,۸,۹) numerals are used. The objective of,٤,٥,٦,objective of this study is to recognize printed numerical (۰,۱,۲,۳ postal codes from mail addresses. The proposed system is a this work is to develop a numeral Arabic/Indian OCR multistage hybrid system which consists of three different system to recognize postal codes from mail letters feature extraction methods, i.e., binary, zoning, and fuzzy processed in the Middle Eastern countries. A brief history features, and three different classifiers, i.e., Hamming Nets, on the development of postal services is qouted from [1]. Euclidean Distance, and Fuzzy Neural Network Classifiers. “The broad development of mechanization in postal The proposed system, systematically compares the operations was not applied until the mid-1950s. The performance of each of these methods, and ensures that the numerals are recognized correctly. Comprehensive results translation from mechanization to automation of the U.S. provide a very high recognition rate, outperforming the Postal Services (USPS) started in 1982, when the first other known developed methods in literature. optical character reader was installed in Los Angeles. The introduction of computers revolutionized the postal industry, and since then, the pace of change has Keywords-component; Hamming Net; Euclidean Distance; accelerated dramatically [1].” Fuzzy Neural Network; Feature Extration; Arabic/Indian Numerals In the 1980s, the first OCRs were confined to reading the Zip Code. In the 1990s they expanded their I. INTRODUCTION capabilities to reading the entire address, and in 1996, the Optical Character Recognition (OCR) is the electronic Remote Computer Reader (RCR) for the USPS could translation of images of printed or handwritten text into recognize about 35% of machine printed and 2% of machine-editable text format; such images are captured handwritten letter mail pieces. Today, modern systems through a scanner or a digital camera. The research work can recognize 93% of machine-printed and about 88% of in OCR encompasses many different areas, such as handwritten letter mail. Due to this progress in pattern recognition, machine vision, artificial intelligence, recognition technology the most important factor in the and digital image processing. OCR has been used in efficiency of mail sorting equipment is the reduction of many areas, e.g., postal services, banks, libraries, cost in mail processing. Therefore, a decade intensive museums to convert historical scripts into digital formats, investment in automated sorting technology, resulted in automatic text entry, information retrieval, etc. high recognition rates of machine-printed and handwritten addresses delivered by state-of- the-art systems [1 – 2] The objective of this work is to develop a numerical OCR system for postal codes. Automatic Arabic/Indian 9 http://sites.google.com/site/ijcsis/ ISSN 1947-5500 (IJCSIS) International Journal of Computer Science and Information Security, Vol. 8, No. 1, 2010 According to the postal addressing standards [3], a been implemented. Many OCR systems are available in standardized mail address is one that is fully spelled out the market, which are multi font and multilingual. and abbreviated by using the postal services standard Moreover, most of these systems provide high recognition abbreviations. The standard requires that the mail rate for printed characters. The recognition rate is addressed to countries outside of the USA must have the between 95% - 100%, depending on the quality of the address typed or printed in Roman capital letters and scanned images, fed into the systems, and the application Arabic numerals. The complete address must include the it is used for [9]. The Kingdom of Saudi Arabia has also name of addressee, house number with street address or initiated its efforts in deploying the latest technology of box number/zip code, city, province, and country. automatic mail sorting. It is reported in [10], that Saudi Examples of postal addresses used in the Middle East are Post has installed an advanced Postal Automation System, given in table 1. working with a new GEO-data based postal code system, an Automatic Letter Sorting Machine, and an OCR for TABLE 1: Examples of postal addresses used in the Middle East simultaneous reading of Arabic and English addresses. It comprises components for automatic forwarding, Address with Arabic numerals Address with Indian Numerals sequencing, and coding ﺍﻟﺴﻴﺪ ﻣﺤﻤﺪ ﻋﻠﻲ Mr. Ibrahim Mohammad ﺹ. ﺏ: ٥۲۱۰٦ P.O. Box 56577 [In his in-depth research study, Fujisawa, in [11 ﺍﻟﺮﻳﺎﺽ: RIYADH 11564 ۱۲۳٤٥ reports on the key technical developments for Kanji ﺍﻟﻤﻤﻠﻜﺔ ﺍﻟﻌﺮﺑﻴﺔ ﺍﻟﺴﻌﻮﺩﻳﺔ SAUDI ARABIA (Chinese character) recognition in Japan. Palumbo and Standards are being developed to make it easy to Srihari [12] described a Hand Written Address integrate newer technologies into available components Interpretation (HWAI) system, and reported a throughput instead of replacing such components, which is very rate of 12 letters per second. An Indian postal automation costly; such standards are the OCR/Video Coding based on recognition of pin-code and city name, proposed Systems (VCS) developed by the European Committee by Roy et al in [13] uses Artificial Neural Networks for for standardization. The OCR/VCS enables postal the classification of English and Bangla postal zip codes. operators to work with different suppliers on needed In their system they used three classifiers for the replacements or extensions of sub-systems without recognition. The first classifier deals with 16-class incurring significant engineering cost [1] [4]. problem (because of shape similarity the number is reduced from 20) for simultaneous recognition of Bangla Many research articles are available in the field of and English numerals. The other two classifiers are for automation of postal systems. Several systems have been recognition of Bangla and English numerals, individually. developed for address reading, such as in USA [5], UK Ming Su et al. [14], developed an OCR system, where the [6], Japan [7], Canada [8], etc. But very few countries in goal was to accomplish the automatic mail sorting of the Middle East use automated mail-processing systems. Chinese postal system by the integration of a mechanized This is due to the absence of organized mailing address sorting machine, computer vision, and the development of systems, thus current processing is done in post offices OCR. El-Emami and Usher [15] tried to recognize postal which are limited and use only P.O. boxes. Canada Post address words, after segmenting these into letters. A is processing 2.8 billion letter mail pieces annually structural analysis method was used for selecting features through 61 Multi-line Optical Character Readers of Arabic characters. On the other hand, U.Pal et.al., (MLOCRs) in 17 letter sorting Centers. The MLOCR – [16], argues that under three-language formula, the Year 2000 has an error rate of 1.5% for machine print destination address block of postal document of an Indian reading only, and the MLOCR/RCR – Year 2003 has an state is generally written in three languages: English, error rate of 1.7% which is for print/script reading. Most Hindi and the State official language. Because of inter- of these low read errors are on handwritten addresses and mixing of these scripts in postal address writings, it is on outgoing foreign mail [9]. very difficult to identify the script by which a pin-code is written. In their work, they proposed a tri-lingual The postal automation systems, developed so far, are (English, Hindi and Bangla) 6-digit full pin-code string capable of distinguishing the city/country names, post and recognition, and obtained 99.01% reliability from their zip codes on handwritten machine-printed standard style proposed system whereas error and rejection rates were envelopes. In these systems, the identification of the 0.83% and 15.27%, respectively. In regards to postal addresses is achieved by implementing an address recognizing the Arabic numerals, Sameh et.al. [17], recognition strategy that consists of a number of stages, described a technique for the recognition of optical off- including pre-processing, address block location, address line handwritten Arabic (Indian) numerals using Hidden segmentation, character recognition, and contextual post Markov Models (HMM). Features that measure the image processing. The academic research in this area has characteristics at local, intermediate, and large scales provided many algorithms and techniques, which have were applied. Gradient, structural, and concavity features 10 http://sites.google.com/site/ijcsis/ ISSN 1947-5500 (IJCSIS) International Journal of Computer Science and Information Security, Vol.
Recommended publications
  • Lara Mantovan Nominal Modification in Italian Sign Language Sign Languages and Deaf Communities
    Lara Mantovan Nominal Modification in Italian Sign Language Sign Languages and Deaf Communities Editors Annika Herrmann, Markus Steinbach, Ulrike Zeshan Editorial board Carlo Geraci, Rachel McKee, Victoria Nyst, Sibaji Panda, Marianne Rossi Stumpf, Felix Sze, Sandra Wood Volume 8 Lara Mantovan Nominal Modification in Italian Sign Language ISHARA PRESS ISBN 978-1-5015-1343-5 e-ISBN (PDF) 978-1-5015-0485-3 e-ISBN (EPUB) 978-1-5015-0481-5 ISSN 2192-516X e-ISSN 2192-5178 Library of Congress Cataloging-in-Publication Data A CIP catalog record for this book has been applied for at the Library of Congress. Bibliographic information published by the Deutsche Nationalbibliothek The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de. © 2017 Walter de Gruyter Inc., Boston/Berlin and Ishara Press, Preston, UK Printing and binding: CPI books GmbH, Leck ♾ Printed on acid-free paper Printed in Germany www.degruyter.com E Pluribus Unum (uncertain origin, attributed to Virgilio, Moretum, v. 103) Acknowledgements This book is a revised version of my 2015 dissertation which was approved for the PhD degree in Linguistics at Ca’ Foscari University of Venice. When I first plunged into the world of academic research, almost five years ago, I would never have imagined it was possible to achieve such an important milestone. Being so close to finalizing this book, I would like to look back briefly and remember and thank all the people who showed me the way, supported me, and encouraged me to grow both academically and personally.
    [Show full text]
  • Eng-Numerals
    English Numerals The A.B.C. of 1.2.3 2014 © Madmolf - AFPI - ANGLAIS SOREM GROUP- USSAC 2014 Numerals •English number words include : ‣ Numerals ‣ Words derived from them 2014 © Madmolf - AFPI - ANGLAIS SOREM GROUP- USSAC 2014 Cardinal numbers 0 zero 10 ten 1 one 11 eleven 2 two 12 twelve 20 twenty 3 three 13 thirteen 30 thirty 4 four 14 fourteen 40 forty 5 five 15 fifteen 50 fifty 6 six 16 sixteen 60 sixty 7 seven 17 seventeen 70 seventy 8 eight 18 eighteen 80 eighty 9 nine 19 nineteen 90 ninety 2014 © Madmolf - AFPI - ANGLAIS SOREM GROUP- USSAC 2014 Cardinal numbers • For numbers ranging from 21 to 99 ‣ Write the number as two words separated by a hyphen 21 twenty-one 25 twenty-five 32 thirty-two 58 fifty-eight 64 sixty-four 79 seventy-nine 83 eighty-three 99 ninety-nine 2014 © Madmolf - AFPI - ANGLAIS SOREM GROUP- USSAC 2014 Cardinal numbers • Hundreds ‣ The word hundred remains in its singular form regardless of the number preceding it ‣ One may say «hundreds of people sang» ‣ Or «hundreds of cranes fly above Ussac» 100 one hundred 200 two hundred … … 900 nine hundred • And so too are the thousands... 2014 © Madmolf - AFPI - ANGLAIS SOREM GROUP- USSAC 2014 Cardinal numbers • Thousands 1 one thousand 2 two thousand 10 ten thousand 11 eleven thousand 20 twenty thousand 21 twenty-one thousand 30 thirty thousand 85 eighty-five thousand 100 one hundred thousand nine hundred and ninety-nine (British English) 999 nine hundred ninety-nine thousand (American English) 1,000,000 one million 2014 © Madmolf - AFPI - ANGLAIS SOREM GROUP- USSAC 2014 Cardinal
    [Show full text]
  • MUL TILANGUAGE TRANSLATION of NUMERALS by PROLOG Bengt
    MUL TILANGUAGE TRANSLATION OF NUMERALS BY PROLOG Bengt Sigurd INTRODUCTION The present paper describes a program (Nummer.pro), which has been developed for PC (MSDOS) in PDC prolog (Prolog Development Center, DK-2605 Broendby). It translates numerals (English, Burmese, Georgian, French and Japanese) and is based on a general theory of numerals, which presumably can be applied to all numeral systems in the world, although with varying success - related to number of exceptions. Languages will show different types of deviations from the general model (cf below). The general model accounting for the structure of numerals is formalized by the following rule: Numer al --> ... (U G3) (U G2) (U Gl) (U GO) In this generative rule U isa unit numeral and G0,Gl,G2,etc. are group numerals. The English numeral "two hundred" consists of the unit (U) numeral "two" and the group (G) word "hundred", which is G2 in the formula. There are no other constituents in this case. An expression within parentheses is called a group constituent. The meaning (value) of such a group constituent is the meaning of the unit word multiplied by the meaning of the group word, in this case 2*100=200, as expressed in the ordinary (decimal) mathematical system. A numeral may consist of one or several such group constituents. The numeral "three hundred and thirty three" consists of three group constituents (disregarding "and"): (three hundred) (thir ty) (three) 117 Some of the irregularities hinted at above show up in this example. The unit numeral for 3 has different forms (the allomorphs three/thir) in its different occurrences and it seems to be more natural to treat thirty, and the numeral thirteen, as complexes not to be further analyzed synchronically (only etymologically).
    [Show full text]
  • Designing More Effective 9 Segment Display for Bengali and English Digits
    Engineering International, Volume 3, No 2 (2015) ISSN 2409-3629 Prefix 10.18034 Designing More Effective 9 Segment Display for Bengali and English Digits Mohammad Badrul Alam Miah*, Md. Habibur Rahman, Md. Nazrul Islam Department of Information & Communication Technology, MBSTU, Tangail, BANGLADESH *Corresponding Contact: Email: [email protected] ABSTRACT Seven-segment display is well-known for displaying the English numerals form 0-9. In this paper 9-segment display for both Bengali and English digits have been proposed. Our proposed 9-segment display is more effective than the previously proposed 10-segment, 11-segment, 16-segment and 8-segment display for both Bengali and English as well as 9-segment, 10-segment and 18-segment display Bengali digits. It is an improvement of previously proposed segment display for both Bengali and English digits. Key words 7-segment display, 9-segment display, Boolean function, Bengali and English digits 12/15/2015 Source of Support: Technical University of Mombasa , No Conflict of Interest: Declared This article is is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. Attribution-NonCommercial (CC BY-NC) license lets others remix, tweak, and build upon work non-commercially, and although the new works must also acknowledge & be non-commercial. INTRODUCTION Display with finite number of segment for each numeric character is preferred to dot matrix displays because the former saves both in memory space and cost. Seven segment display is commonly used for the display of English numerals. Both Bengali and English digits are represented using dot matrix, so the cost of display increases due to storage space, a large number of digits, power loss and design complexity.
    [Show full text]
  • A Unique 10 Segment Display for Bengali Numerals
    A Unique 10 Segment Display for Bengali Numerals Md. Abul Kalam Azad, Rezwana Sharmeen, Shabbir Ahmad and S. M. Kamruzzaman† Department of Computer Science & Engineering, International Islamic University Chittagong, Chittagong, Bangladesh. † Department of Computer Science & Engineering, Manarat International University, Dhaka, Bangladesh. {azadarif, r_sharmin_79, bappi_51, smk_iiuc}@yahoo.com Abstract Segmented display is widely used for efficient display of III. THE PROPOSED 10 SEGMENT alphanumeric characters. English numerals are dis- DISPLAY played by 7 segment and 16 segment display. The seg- Our proposed uniform 10 segment display model is ment size is uniform in this two display architecture. shown in Figure 1. Each segment is of uniform size. Display architecture using 8, 10, 11, 18 segments have The segments are non-overlapping and labeled with been proposed for Bengali numerals 0...9 yet no display specific alphabet. architecture is designed using segments of uniform size and uniform power consumption. In this paper we have a proposed a uniform 10 segment architecture for Bengali h numerals. This segment architecture uses segments of f b uniform size and no bent segment is used. gi Keywords: 10 segment display, Bengali numerals, segmented display, combination vector. e c I. INTRODUCTION j Different display architectures using finite number of d segments have been proposed for displaying Bengali numerals. 3X8 dot matrix system was used earlier for Figure 1: Proposed 10 Segment Display displaying Bengali numerals but it requires large num- bers of dots to be manipulated thereby increasing the IV. REPRESENTATION OF BENGALI memory storage, cost and design complexity. The 7 NUMERALS WITH THE PROPOSED 10 segment display is very common in displaying English SEGMENT DISPLAY numerals and a 16 segment display is widely used for The following figure, Figure 2 shows the representation displaying English Alphanumeric characters.
    [Show full text]
  • An Introduction to Linguistic Typology
    An Introduction to Linguistic Typology An Introduction to Linguistic Typology Viveka Velupillai University of Giessen John Benjamins Publishing Company Amsterdam / Philadelphia TM The paper used in this publication meets the minimum requirements of 8 the American National Standard for Information Sciences – Permanence of Paper for Printed Library Materials, ansi z39.48-1984. Library of Congress Cataloging-in-Publication Data An introduction to linguistic typology / Viveka Velupillai. â. p cm. â Includes bibliographical references and index. 1. Typology (Linguistics) 2. Linguistic universals. I. Title. P204.V45 â 2012 415--dc23 2012020909 isbn 978 90 272 1198 9 (Hb; alk. paper) isbn 978 90 272 1199 6 (Pb; alk. paper) isbn 978 90 272 7350 5 (Eb) © 2012 – John Benjamins B.V. No part of this book may be reproduced in any form, by print, photoprint, microfilm, or any other means, without written permission from the publisher. John Benjamins Publishing Company • P.O. Box 36224 • 1020 me Amsterdam • The Netherlands John Benjamins North America • P.O. Box 27519 • Philadelphia PA 19118-0519 • USA V. Velupillai: Introduction to Typology NON-PUBLIC VERSION: PLEASE DO NOT CITE OR DISSEMINATE!! ForFor AlTô VelaVela anchoranchor and and inspiration inspiration 2 Table of contents Acknowledgements xv Abbreviations xvii Abbreviations for sign language names xx Database acronyms xxi Languages cited in chapter 1 xxii 1. Introduction 1 1.1 Fast forward from the past to the present 1 1.2 The purpose of this book 3 1.3 Conventions 5 1.3.1 Some remarks on the languages cited in this book 5 1.3.2 Some remarks on the examples in this book 8 1.4 The structure of this book 10 1.5 Keywords 12 1.6 Exercises 12 Languages cited in chapter 2 14 2.
    [Show full text]
  • List of Numbers
    List of numbers This is a list of articles aboutnumbers (not about numerals). Contents Rational numbers Natural numbers Powers of ten (scientific notation) Integers Notable integers Named numbers Prime numbers Highly composite numbers Perfect numbers Cardinal numbers Small numbers English names for powers of 10 SI prefixes for powers of 10 Fractional numbers Irrational and suspected irrational numbers Algebraic numbers Transcendental numbers Suspected transcendentals Numbers not known with high precision Hypercomplex numbers Algebraic complex numbers Other hypercomplex numbers Transfinite numbers Numbers representing measured quantities Numbers representing physical quantities Numbers without specific values See also Notes Further reading External links Rational numbers A rational number is any number that can be expressed as the quotient or fraction p/q of two integers, a numerator p and a non-zero denominator q.[1] Since q may be equal to 1, every integer is a rational number. The set of all rational numbers, often referred to as "the rationals", the field of rationals or the field of rational numbers is usually denoted by a boldface Q (or blackboard bold , Unicode ℚ);[2] it was thus denoted in 1895 byGiuseppe Peano after quoziente, Italian for "quotient". Natural numbers Natural numbers are those used for counting (as in "there are six (6) coins on the table") and ordering (as in "this is the third (3rd) largest city in the country"). In common language, words used for counting are "cardinal numbers" and words used for ordering are
    [Show full text]
  • Major Trends in Theoretical and Applied Linguistics Volume 1 Versita Discipline: Language, Literature
    Edited by: Nikolaos Lavidas Thomaï Alexiou Areti-Maria Sougari Major Trends in Theoretical and Applied Linguistics Volume 1 Versita Discipline: Language, Literature Managing Editor: Anna Borowska Language Editor: Edgar Joycey Major Trends in Theoretical and Applied Linguistics: Selected Papers from the 20th International Symposium on Theoretical and Applied Linguistics (April 1-3, 2011) / Edited by: Nikolaos Lavidas, Thomaï Alexiou & Areti-Maria Sougari. Published by Versita, Versita Ltd, 78 York Street, London W1H 1DP, Great Britain. This work is licensed under the Creative Commons Attribution-NonCommercial- NoDerivs 3.0 license, which means that the text may be used for non-commercial purposes, provided credit is given to the author. Copyright © 2013 Selection and editorial matter: Nikolaos Lavidas, ThomaÏ Alexiou, Areti-Maria Sougari; individual contributors, their contributions. ISBN (paperback): 978-83-7656-074-8 ISBN (hardcover): 978-83-7656-075-5 ISBN (for electronic copy): 978-83-7656-076-2 Managing Editor: Anna Borowska Language Editor: Edgar Joycey www.versita.com Cover illustration: © Istockphoto.com/skvoor Major Trends in Theoretical and Applied Linguistics contains 80 papers on Theoretical and Applied Linguistics by prominent and young researchers, representing a large variety of topics, dealing with virtually all domains and frameworks of modern Linguistics. These papers were originally presented at the 20th International Symposium on Theoretical and Applied Linguistics at the Aristotle University of Thessaloniki in April
    [Show full text]
  • Rapid Information Gain Explains Cross-Linguistic Tendencies In
    Rapid information gain explains cross-linguistic tendencies in numeral ordering Emmy Liu ([email protected]) Yang Xu ([email protected]) Department of Computer Science Cognitive Science Program University of Toronto Abstract tend to switch order over time (Berg & Neubauer, 2014). This One previously unexplained observation about numeral sys- preference of having the larger constituent number expressed tems is the shared tendency in numeral expressions: Numer- before the smaller constituent number is prevalent in numer- als greater than 20 often have the larger constituent number als for the range above 20 but less prominent for smaller num- expressed before the smaller constituent number (e.g., twenty- four as opposed to four-twenty in English), and systems that bers (Calude & Verkerk, 2016). Here we ask what principles originally adopt the reverse order of expression (e.g., four- might account for this shared tendency in numeral ordering. and-twenty in Old English) tend to switch order over time. To This problem has been discussed by Greenberg in his explore these phenomena, we propose the view of Rapid In- formation Gain and contrast it with the established theory of cross-linguistic generalization about the design of recursive Uniform Information Density. We compare the two theories numerals (Greenberg, 1978). Recursive numeral systems rep- k in their ability to explain the shared tendency in the ordering resent numbers based on the canonical expression x1n +:::+ of numeral expressions around 20. We find that Rapid Infor- mation Gain accounts for empirical patterns better than the al- xkn + y. Here n is called the base and the values of xi’s and y ternative theory, suggesting that there is an emphasis on infor- are in the range of 1 to the base (Comrie, 2013).
    [Show full text]
  • History of Natural Numbers and the Status of Zero
    Natural Numbers In mathematics, the natural numbers are the ordinary whole numbers used for counting ("there are 6 coins on the table") and ordering ("this is the 3rd largest city in the country"). These purposes are related to the linguistic notions of cardinal and ordinal numbers, respectively (see English numerals). A later notion is that of a nominal number, which is used only for naming. Properties of the natural numbers related to divisibility, such as the distribution of prime numbers, are studied in number theory. Problems concerning counting and ordering, such as partition enumeration, are studied in combinatorics. There is no universal agreement about whether to include zero in the set of natural numbers: some define the natural numbers to be the positive integers {1, 2, 3, ...}, while for others the term designates the non-negative integers {0, 1, 2, 3, ...}. The former definition is the traditional one, with the latter definition first appearing in the 19th century. Some authors use the term "natural number" to exclude zero and "whole number" to include it; others use "whole number" in a way that excludes zero, or in a way that includes both zero and the negative integers. History of natural numbers and the status of zero The natural numbers had their origins in the words used to count things, beginning with the number 1. The first major advance in abstraction was the use of numerals to represent numbers. This allowed systems to be developed for recording large numbers. The ancient Egyptians developed a powerful system of numerals with distinct hieroglyphs for 1, 10, and all the powers of 10 up to over one million.
    [Show full text]
  • Universal Numeric Segment Display for Indian Scheduled Languages
    International Journal of Computer Trends and Technology- volume2Issue2- 2011 Universal Numeric Segment Display for Indian Scheduled Languages: an Architectural View Partha Pratim Ray Surendra Institute of Engineering and Management Siliguri, Darjeeling-734009, West Bengal, India Abstract— India is country of several hundred different languages. Though twenty two languages have only been devised II. RELATED WORK as scheduled to the Eighth Schedule of Indian Constitution in So far 7-segment display is used for English digits, those 2007. But as there is yet no proposed compact display are Latin digits and 10-segment display is proposed for architecture to display all the scheduled language numerals at a Bengali digits [2, 3]. [4] proposed an 8-segment display is time, this paper proposes a uniform display architecture to display all twenty two different language digits with higher English and Bengali digits. Literature [5] proposed a 12- accuracy and simplicity by using a 17-segment display, which is segment display for Arabic digits. [6] described an 18- an improvement over the 16-segment display. segmented display for twelve international languages. Another report, [7] presented a 31-segmented display for Bengali Keywords— 17-segment display, Indian scheduled language, characters. [8] proposed a 26-segment display for Bengali numerals, English. characters. But these segment architectures are different to each other and we cannot use the same architecture for I. INTRODUCTION uniform display of multiple languages, that’s why I have India is country of multiple languages. Most Indians speak proposed 17-segment display as a uniform architecture to a language belonging either to the Indo-European (74%), the display multiple numerals at a time.
    [Show full text]
  • TYPOLOGICAL PARAMETERS for the LEXICO-SYNT ACTIC STRUCTURES of HIGH-DIGIT NUMERICAL Expressionsl
    TYPOLOGICAL PARAMETERS FOR THE LEXICO-SYNT ACTIC STRUCTURES OF HIGH-DIGIT NUMERICAL EXPRESSIONSl ChengLuo Brock University ABSTRACT The objective of this paper is to explore the lexico-syntactic structure of high-digit numerical expressions (henceforth HDNEs), normally with 4 digits and more, and to propose a set of typological parameters to ac- count for crosslinguistic variations in lexicalization sites. The two pa- rameters of pivot/ anchor and increment are proposed to describe and account for lexicalization sites of HDNEs, with language-specific value- setting explaining crosslinguistic variation. Support for this analysis includes psycholinguistic evidence from second language acquisition. 1. INTRODUCTION This paper examines the crosslinguistic lexico-syntactic structure of high-digit numerical expressions (HDNEs) in order to provide a principled account for their lexicalization sites, whose crosslinguistic variations are demonstrated in italics in Table 1: Table 1: Sample of crosslinguistic variation in HDNE lexicalization sites English Chinese Kannada Lugwere 1,000 (one)thousand (yi)qian ondu-siivira lu-kurni 10,000 ten thousand (yi)wan haththu-savira mu-tulo 100,000 (one)hundred thousand shiwan ondu-Iaksha mi-tuloi-kurni 1,000,000 (one)million (yi)bai wan haththu-laksha ka-kairi 10,000,000 tenmillion (yi)qianwan ondu-kati bu-kairii-kumi 100,000,000 (one)hundred million (yi)yi haththu-k6ti bu-kairichi-kumi 1,000,000,000 (one)billion shi yi kata8arikakama As Chomsky (1980: 248f) suggests, the way in which human beings de- velop the number system may 'shed light on deep and fundamental charac- 1 An earlier version of this work was presented at the 1997 Mid-America Linguistic Conference at Columbus, Missouri, at which I benefited from the remarks and suggestions from the audience.
    [Show full text]