Online Bengali Handwritten Numerals Recognition Using Deep Autoencoders

Arghya Pal*, B. K. Khonglah†, S. Mandal†, Himakshi Choudhury†, S. R. M. Prasanna†, H. L. Rufiner‡ and Vineeth N Balasubramanian*

* Department of Computer Science and Engineering, Indian Institute of Technology Hyderabad, Telangana 502285
† Department of Electronics and Electrical Engineering, Indian Institute of Technology Guwahati, Guwahati-781039
‡ Facultad de Ingeniería y Ciencias Hídricas, Universidad Nacional del Litoral, CONICET, Ciudad Universitaria, Paraje El Pozo, S3000 Santa Fe, Argentina

E-mail: fcs15resch11001, [email protected], fbanriskhem, [email protected], lrufi[email protected]

sinc(i) Research Center for Signals, Systems and Computational Intelligence (fich.unl.edu.ar/sinc)
Arghya Pal, B. K. Khonglah, S. Mandal, Himakshi Choudhury, R. M. Prasanna, H. L. Rufiner & Vineeth N. Balasubramanian; "Online Bengali Handwritten Numerals Recognition Using Deep Autoencoders", Proc. of the 22nd National Conference on Communications (NCC 2016), Mar. 2016.
978-1-5090-2361-5/16/$31.00 © 2016 IEEE

Abstract: This work describes the development of a recognizer for online handwritten isolated Bengali numerals using a Deep Autoencoder (DA) based on the Multilayer Perceptron (MLP) [1]. Autoencoders capture class-specific information, and the deep version uses many hidden layers and a final classification layer to accomplish this. The DA based on MLP is trained with the MLP training approach. Different configurations of the DA are examined to find the best DA classifier. An optimization technique is then adopted to reduce the overall weight space of the DA based on MLP, which in turn makes it suitable for a real-time application. The performance of the DA based system is compared with systems constructed using the Hidden Markov Model (HMM) and the Support Vector Machine (SVM). The confusion matrices of the DA, HMM and SVM systems are analyzed in order to build a hybrid numeral recognizer. The hybrid system is found to perform better than each of the individual systems: the average recognition performances of the DA, HMM and SVM systems are 97.74%, 97.5% and 98.14%, respectively, and the hybrid system gives a performance of 99.18%.

I. INTRODUCTION

Autoencoders are machine learning tools whose output vector has the same dimension as the input vector and which aim to reconstruct the input with minimum reconstruction error. If the number of hidden layers is more than one, the autoencoder is considered to be deep. An autoencoder network based on the Restricted Boltzmann Machine (RBM), proposed in [2], is used to form a Deep Network. This work is motivated by [1], where a Deep Autoencoder (DA) based on Multilayer Perceptrons (MLP), trained through the backpropagation algorithm, was explored for the task of speech emotion recognition. Good initial weights of the DA were obtained using a method of pre-training, which was shown to give promising results. It is worth mentioning that although the functional forms of the RBM and the autoencoder are quite similar, their training procedures and interpretations are quite different. Both of them can nevertheless be extended to form a deep network that performs several classification and regression tasks. In this work we explore the use of the DA (from now onwards we use DA to mean Deep Autoencoder based on MLP) for the task of online handwritten Bengali numeral recognition. This work finds its place in various domains such as online form filling applications, census data collection, banking operations, number dialing systems and many more.

Some of the works that recognize online handwritten characters from different perspectives include HMM based handwriting recognition systems for Bangla [3], Tamil [4], Telugu [5] and Assamese [6], and SVM based systems for Tamil [7] and Devanagari [8]. There are reported results on handwritten Bengali characters using MLPs [9], [10], but those works mainly focus on offline recognition, which is not comparable with the present study of online recognition.

A central goal of this work is to classify online handwritten Bengali numerals by using the DA based on MLP. By adding a final classification layer, the DA was converted into a Deep Classifier (DC). The final classification layer works mainly on a "winner takes all" philosophy. The best configuration of the DC was chosen according to the validation set performance. The present work differs from [1] in two ways. First, we have used an approximation of standard backpropagation known as the "Marquardt Backpropagation Algorithm" (MBP) [11] to obtain a higher guarantee of convergence. For the rest of the paper, backpropagation will imply MBP and not simple backpropagation. Secondly, inspired by the work in [12], we further applied an optimization technique to the weight space of the deep network to reduce its size, yielding a network with a good recognition time that makes it feasible for a real-time application. The average recognition rate of the DA was then compared with HMM and SVM, and the class-wise performances of the DA, HMM and SVM systems were analyzed to build a hybrid system using algorithm combination, so as to further improve the recognition rate. A similar kind of work can be found in [13], where HMM and SVM were combined using test set performance. The main difference is that our work uses validation set performance to combine those state-of-the-art systems.

The paper is organized as follows: Section II describes the data collection and feature extraction method. The online Bengali handwritten recognition system using the DA classifier is described in Section III. Section IV explores a technique to reduce the DA network. Section V explores the use of other state-of-the-art systems such as HMM and SVM for online Bengali handwritten recognition. The hybrid system is described in Section VI. The summary and conclusion are given in Section VII.

Fig. 1. Last row represents basic Bengali numerals (one-to-one relation with Indo-Arabic numerals)

Fig. 2. Autoencoder with input, first hidden layer and reconstruction of input; i.e. i + h1 + reconstruction of i

II. DATA COLLECTION AND FEATURE EXTRACTION

Having a one-to-one relation with the Indo-Arabic numeral set, the Bengali numeral set consists of ten characters (Fig. 1). In the present study, we considered recognition of the isolated Bengali numerals shown in Fig. 1. Raw data was collected from 200 native Bengali writers using an open source toolkit provided by C-DAC Pune on a Lenovo ThinkPad PC in different sessions, without putting any constraint on the writers. A mixture of frequent writers, non-frequent writers, and writers from different age groups and professions was included to ensure robustness of the system in a real-world situation. At each session each writer gave ten samples of each numeric character, which made a total count of 30000 raw samples per class. From the collected raw data, 20000 samples of each numeric character were reserved for training, and 5000 samples of each numeric character were reserved for each of testing and validation, to keep a clear segregation between the training, test and validation sets.

Due to the large variations in shape, size, writing style and position in the unconstrained raw data (i.e. temporally ordered sequences of pen coordinates (x, y)), preprocessing techniques such as normalization, smoothing, removal of duplicate points and re-sampling [13] were applied to obtain a fixed-length feature vector. In general, preprocessing was employed to mitigate intra-class variations and to accentuate inter-class variations. During preprocessing, the first and last points of each raw sample were left unchanged as they contain vital information [13]. Preprocessing of each numeral therefore gave 60 equidistant points x_p, y_p, where p = 1 to 60, comprising the pen-down point, the pen-up point and 58 other equidistant meaningful points. The preprocessed coordinates x_p, y_p, along with the first derivatives (i.e. x'_p and y'_p) and second derivatives (i.e. x''_p and y''_p) of each ...

Fig. 3. Autoencoder with first hidden layer, second hidden layer and reconstruction of first hidden layer; i.e. h1 + h2 + reconstruction of h1

III. ONLINE NUMERAL RECOGNITION USING DEEP AUTOENCODERS (DA)

The Deep Autoencoder (DA) network is a feed-forward, fully connected network consisting of an input layer i, two hidden layers h1 and h2, and one output layer o. The six feature sets x_p, y_p, x'_p, y'_p, x''_p and y''_p, each having sixty equidistant points obtained from feature extraction, are unrolled to form a 360-dimensional feature vector, which is the input to the DA. Keeping the input and output layers fixed, the number of units in each hidden layer was varied in order to find the best classifier. Different from the greedy layer-wise training algorithm in [14], the strategy taken by [1] finds good initial weights by learning one layer of features at a time, known as pre-training the DA. In brief, the main steps proposed in [1] are as follows:
• Step 1: To initialize the weights corresponding to the links between the input i and the first hidden layer h1 (i.e. i + h1), an autoencoder with i + h1 + reconstruction of i was used. Different hidden layer sizes were tested (see Table II) in order to find the best autoencoder, i.e. the one that gave the minimum Sum of Squared Errors (SSE) for the validation dataset.
• Step 2: The coder part, i + h1, was taken from the Step 1 networks that gave the minimum SSE on the training dataset (see Fig. 2). The training and validation sets were passed through it to generate its output.
• Step 3: The output of the previous step was the input to the next autoencoder, i.e. h1 + h2 + reconstruction of h1. This autoencoder initialized the link weights between the first hidden layer h1 and the second hidden layer h2.
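The layer-wise pre-training in Steps 1 to 3 can be illustrated with a small sketch. It is not the authors' implementation: it swaps in plain per-sample gradient descent where the paper uses MBP, selects the first autoencoder by validation SSE only, and uses 6-dimensional random vectors as a stand-in for the 360-dimensional feature vectors. The function names (`train_autoencoder`, `pretrain`), the hidden layer sizes, the learning rate and the epoch count are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W, b):
    """Sigmoid layer: one activation per row of W."""
    return [sigmoid(sum(w * v for w, v in zip(row, x)) + bj)
            for row, bj in zip(W, b)]


def train_autoencoder(data, n_hidden, epochs=300, lr=0.5, seed=0):
    """Fit input -> hidden -> reconstruction by per-sample gradient descent
    on squared error (plain backpropagation, NOT the paper's MBP).
    Returns (encoder, decoder), each a (weights, biases) pair."""
    rng = random.Random(seed)
    n_in = len(data[0])
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b = [0.0] * n_hidden
    V = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    c = [0.0] * n_in
    for _ in range(epochs):
        for x in data:
            h = forward(x, W, b)
            r = forward(h, V, c)
            # Error signals for the reconstruction layer and the hidden layer.
            d_out = [(rk - xk) * rk * (1.0 - rk) for rk, xk in zip(r, x)]
            d_hid = [hj * (1.0 - hj) * sum(d_out[k] * V[k][j] for k in range(n_in))
                     for j, hj in enumerate(h)]
            for k in range(n_in):
                V[k] = [v - lr * d_out[k] * hj for v, hj in zip(V[k], h)]
                c[k] -= lr * d_out[k]
            for j in range(n_hidden):
                W[j] = [w - lr * d_hid[j] * xi for w, xi in zip(W[j], x)]
                b[j] -= lr * d_hid[j]
    return (W, b), (V, c)


def sse(data, encoder, decoder):
    """Sum of squared reconstruction errors over a dataset."""
    total = 0.0
    for x in data:
        r = forward(forward(x, *encoder), *decoder)
        total += sum((rk - xk) ** 2 for rk, xk in zip(r, x))
    return total


def pretrain(train, valid, h1_sizes, h2_size):
    # Step 1: try several h1 sizes; keep the autoencoder with the lowest
    # validation SSE (assumption: validation SSE is the only criterion here).
    best = min((train_autoencoder(train, n) for n in h1_sizes),
               key=lambda ae: sse(valid, *ae))
    enc1 = best[0]
    # Step 2: keep only the coder part (i + h1) and pass the data through it.
    h1_train = [forward(x, *enc1) for x in train]
    # Step 3: train the next autoencoder (h1 + h2 + reconstruction of h1)
    # on those outputs; its coder part initializes the h1 -> h2 weights.
    enc2, _ = train_autoencoder(h1_train, h2_size)
    return enc1, enc2


# Toy data: 6-dimensional vectors standing in for the 360-dimensional features.
rng = random.Random(1)
train = [[rng.random() for _ in range(6)] for _ in range(20)]
valid = [[rng.random() for _ in range(6)] for _ in range(10)]
enc1, enc2 = pretrain(train, valid, h1_sizes=(3, 4), h2_size=2)
print("h1 units:", len(enc1[0]), "h2 units:", len(enc2[0]))
```

The two returned coder parts would then serve as the initial weights of the i + h1 + h2 stack, to which a classification layer is added before fine-tuning on the full network.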