Reconocimiento De Escritura Lecture 4/5 --- Layout Analysis

Total Page:16

File Type:pdf, Size:1020Kb

Reconocimiento De Escritura Lecture 4/5 --- Layout Analysis Reconocimiento de Escritura Lecture 4/5 | Layout Analysis Daniel Keysers Jan/Feb-2008 Keysers: RES-08 1 Jan/Feb-2008 Outline Detection of Geometric Primitives The Hough-Transform RAST Document Layout Analysis Introduction Algorithms for Layout Analysis A `New' Algorithm: Whitespace Cuts Evaluation of Layout Analyis Statistical Layout Analysis OCR OCR - Introduction OCR fonts Tesseract Sources of OCR Errors Keysers: RES-08 2 Jan/Feb-2008 Outline Detection of Geometric Primitives The Hough-Transform RAST Document Layout Analysis Introduction Algorithms for Layout Analysis A `New' Algorithm: Whitespace Cuts Evaluation of Layout Analyis Statistical Layout Analysis OCR OCR - Introduction OCR fonts Tesseract Sources of OCR Errors Keysers: RES-08 3 Jan/Feb-2008 Detection of Geometric Primitives some geometric entities important for DIA: I text lines I whitespace rectangles (background in documents) Keysers: RES-08 4 Jan/Feb-2008 Outline Detection of Geometric Primitives The Hough-Transform RAST Document Layout Analysis Introduction Algorithms for Layout Analysis A `New' Algorithm: Whitespace Cuts Evaluation of Layout Analyis Statistical Layout Analysis OCR OCR - Introduction OCR fonts Tesseract Sources of OCR Errors Keysers: RES-08 5 Jan/Feb-2008 Hough-Transform for Line Detection Assume we are given a set of points (xn; yn) in the image plane. For all points on a line we must have yn = a0 + a1xn If we want to determine the line, each point implies a constraint yn 1 a1 = − a0 xn xn Keysers: RES-08 6 Jan/Feb-2008 Hough-Transform for Line Detection The space spanned by the model parameters a0 and a1 is called model space, parameter space, or Hough space. Its implementation is called accumulator array. http://www.rob.cs.tu-bs.de/content/04-teaching/ 06-interactive/Hough.html Keysers: RES-08 7 Jan/Feb-2008 Hough-Transform for Line Detection I points in data space on straight line ! lines in parameter space meet in one point I many points ! reliable estimate of the two parameters of the line I line in data space ! point in parameter space I This transformation from the data space to the parameter space via a model equation is called the Hough transform. possible improvement: include orientation information at each point Keysers: RES-08 8 Jan/Feb-2008 Hough-Transform for Line Detection problem with slope-intercept-parameterization: slope of line may become infinite better parameterization: angle of line with respect to the x-axis and distance from the coordinate center x sin φ + y cos φ = d φ 2 [0; 2π); d > 0 d = x sin φ + y cos φ http://www.rob.cs.tu-bs.de/content/04-teaching/ 06-interactive/HNF.html Keysers: RES-08 9 Jan/Feb-2008 Extensions of the Hough Transform Generally, the Hough Transform is applicable to any parameterized curve or shape. However, the size of the accumulator grows exponentially with the number of parameters, such that it is usually only feasible for few parameters. Example: circle detection | iris detection Keysers: RES-08 10 Jan/Feb-2008 Iris Detection Keysers: RES-08 11 Jan/Feb-2008 Generalized Hough Transform Keysers: RES-08 12 Jan/Feb-2008 Hough-Transform drawbacks of Hough transform: I computational effort, lots of storage for accumulator I discretization of model space: I bin size too small ! solutions lost I bin size too large ! solutions too inaccurate history: I slope-intercept variant patented in 1962 by Hough I angle-distance variant published 1972 by Duda and Hart I generalization to arbitrary shapes 1981 by Ballard Keysers: RES-08 13 Jan/Feb-2008 Generalized Hough Transform? B. Leibe and B. Schiele: Interleaved Object Categorization and Segmentation. Proc. British Machine Vision Conference, Norwich, UK, 2003. Keysers: RES-08 14 Jan/Feb-2008 Outline Detection of Geometric Primitives The Hough-Transform RAST Document Layout Analysis Introduction Algorithms for Layout Analysis A `New' Algorithm: Whitespace Cuts Evaluation of Layout Analyis Statistical Layout Analysis OCR OCR - Introduction OCR fonts Tesseract Sources of OCR Errors Keysers: RES-08 15 Jan/Feb-2008 RAST I Recognition by Adaptive Subdivision of Transformation Space I branch-and-bound search in parameter space w.r.t. a given quality function I related to adaptive Hough transform I easily adaptable to new problems Keysers: RES-08 16 Jan/Feb-2008 RAST: Algorithm I algorithm I keep priority queue of parameter regions I use upper bound of quality estimate for region I on best region: I stop? I output? I split and insert new regions I use interval arithmetic to determine bounds Keysers: RES-08 17 Jan/Feb-2008 RAST: Example Subdivision Keysers: RES-08 18 Jan/Feb-2008 Outline Detection of Geometric Primitives The Hough-Transform RAST Document Layout Analysis Introduction Algorithms for Layout Analysis A `New' Algorithm: Whitespace Cuts Evaluation of Layout Analyis Statistical Layout Analysis OCR OCR - Introduction OCR fonts Tesseract Sources of OCR Errors Keysers: RES-08 19 Jan/Feb-2008 Outline Detection of Geometric Primitives The Hough-Transform RAST Document Layout Analysis Introduction Algorithms for Layout Analysis A `New' Algorithm: Whitespace Cuts Evaluation of Layout Analyis Statistical Layout Analysis OCR OCR - Introduction OCR fonts Tesseract Sources of OCR Errors Keysers: RES-08 20 Jan/Feb-2008 Introduction Page segmentation I divides the document image into homogeneous zones, each consisting of only one physical layout structure (e.g. text, graphics, or pictures) I an important basic step of document analysis systems I many algorithms proposed over last two decades I comparative performance evaluation? See also: R. Cattoni, T. Coianiz, S. Messelodi, C.M. Modena, \Geometric layout analysis techniques for document image understanding: a review," IRST, Trento, Italy, TR 9703-09, 1998. Acknowledgment: This lecture is largely based on work by Faisal Shafait. Keysers: RES-08 21 Jan/Feb-2008 Introduction \Mostly-text pages are separated into columns, paragraph-blocks, text-lines, word, and characters. OCR converts the individual word or character images into a character code like ASCII or Unicode. But, there is much more to a text document than a string of symbols. Additional, and sometimes essential, information is conveyed by long-established conventions of layout and format, choice of type size and typeface, italics and boldface, and the two-dimensional arrangement of tables and formulas. To capture the whole meaning of a document, DIA must extract and interpret all of this subtle encoding." (Nagy 2000) Keysers: RES-08 22 Jan/Feb-2008 Introduction Keysers: RES-08 23 Jan/Feb-2008 Page Segmentation Approaches area based text-line based ! ! I sensitive to spacing (e.g. I insensitive to spacing, fails on large-font titles works on pages even with and headlines) minimum spacing I still requires text-line I text-lines directly used for extraction OCR Keysers: RES-08 24 Jan/Feb-2008 Area-Based Approach Document Whitespace Whitespace Page Image Cover Evaluation Segments I principled approach I hard to implement I evaluation function not very document-centric I oversegments (too many vertical cuts) Keysers: RES-08 25 Jan/Feb-2008 Text-Line-Based Approach Find Document Extract Column Image Text-Lines Separators I globally optimal whitespace cover computation I very accurate column model I works very well on clean text I uses straight (non-curled) text-line model I excludes noise from detected text-lines Keysers: RES-08 26 Jan/Feb-2008 Outline Detection of Geometric Primitives The Hough-Transform RAST Document Layout Analysis Introduction Algorithms for Layout Analysis A `New' Algorithm: Whitespace Cuts Evaluation of Layout Analyis Statistical Layout Analysis OCR OCR - Introduction OCR fonts Tesseract Sources of OCR Errors Keysers: RES-08 27 Jan/Feb-2008 Overview of Algorithms Page segmentation algorithms: I X-Y cut by Nagy et al. I Smearing by Wong et al. I Whitespace analysis by Baird I Docstrum by O'Gorman I Voronoi-diagram by Kise et al. I Constrained text-line finding by Breuel I Whitespace Cuts by Shafait et al. (\Dummy"-algorithm for comparison: return whole page as one segment) Keysers: RES-08 28 Jan/Feb-2008 Segmentation Results of an Example Image Whitespace Whitespace Cuts Text-Line Smearing Voronoi Docstrum X-Y cut Keysers: RES-08 29 Jan/Feb-2008 Recursive X-Y Cuts Algorithm G. Nagy, S. Seth, and M. Viswanathan, \A prototype document image analysis system for technical journals," Computer, vol. 7, no. 25, pp. 10{22, 1992. I tree-based top-down algorithm. I root = entire document page I leaves = final segmentation I recursively split each node into two or more smaller rectangular zones I at each step compute horizontal and vertical projection profiles I if a valley in the profile is low and wide enough, split Keysers: RES-08 30 Jan/Feb-2008 Run-length Smearing Algorithm K. Y. Wong, R. G. Casey, and F. M. Wahl, \Document analysis system," IBM Journal of Research and Development, vol. 26, no. 6, pp. 647{656, 1982. I operate on binary image separately in both dimensions I white pixel runs are changed to black if shorter than a threshold I first along x, then along y, then AND-ing the two images, then x-direction again with smaller threshold I effect: link connected components that are separated by only few pixels of background (words, lines, but not blocks) I connected component analysis determines zones I classify as text-block, if height is no too large and mean run-lengths are not too large ! essentially morphological operations Keysers: RES-08 31 Jan/Feb-2008 Run-length Smearing Algorithm Keysers: RES-08 32 Jan/Feb-2008 Whitespace H. S. Baird, \Background structure in document images," in Document Image Analysis, H. Bunke, P. Wang, and H. S. Baird, Eds., World Scientific, Singapore, 1994, pp. 17{34. I find a set of maximal white rectangles (called covers) whose union completely covers the background I covers sorted with respect to the sort key K(c) p K(c) = area(c) ∗ W (jlog2 (height(c)=width(c))j) weighting function W (:) assigns higher weight to tall and long rectangles because they are supposed to be meaningful separators of text blocks I add covers to form a background cover until sort key is less than a threshold I uncovered area are blocks (not necessarily rectangular) Keysers: RES-08 33 Jan/Feb-2008 Docstrum L.
Recommended publications
  • OCR Pwds and Assistive Qatari Using OCR Issue No
    Arabic Optical State of the Smart Character Art in Arabic Apps for Recognition OCR PWDs and Assistive Qatari using OCR Issue no. 15 Technology Research Nafath Efforts Page 04 Page 07 Page 27 Machine Learning, Deep Learning and OCR Revitalizing Technology Arabic Optical Character Recognition (OCR) Technology at Qatar National Library Overview of Arabic OCR and Related Applications www.mada.org.qa Nafath About AboutIssue 15 Content Mada Nafath3 Page Nafath aims to be a key information 04 Arabic Optical Character resource for disseminating the facts about Recognition and Assistive Mada Center is a private institution for public benefit, which latest trends and innovation in the field of Technology was founded in 2010 as an initiative that aims at promoting ICT Accessibility. It is published in English digital inclusion and building a technology-based community and Arabic languages on a quarterly basis 07 State of the Art in Arabic OCR that meets the needs of persons with functional limitations and intends to be a window of information Qatari Research Efforts (PFLs) – persons with disabilities (PWDs) and the elderly in to the world, highlighting the pioneering Qatar. Mada today is the world’s Center of Excellence in digital work done in our field to meet the growing access in Arabic. Overview of Arabic demands of ICT Accessibility and Assistive 11 OCR and Related Through strategic partnerships, the center works to Technology products and services in Qatar Applications enable the education, culture and community sectors and the Arab region. through ICT to achieve an inclusive community and educational system. The Center achieves its goals 14 Examples of Optical by building partners’ capabilities and supporting the Character Recognition Tools development and accreditation of digital platforms in accordance with international standards of digital access.
    [Show full text]
  • An Accuracy Examination of OCR Tools
    International Journal of Innovative Technology and Exploring Engineering (IJITEE) ISSN: 2278-3075, Volume-8, Issue-9S4, July 2019 An Accuracy Examination of OCR Tools Jayesh Majumdar, Richa Gupta texts, pen computing, developing technologies for assisting Abstract—In this research paper, the authors have aimed to do a the visually impaired, making electronic images searchable comparative study of optical character recognition using of hard copies, defeating or evaluating the robustness of different open source OCR tools. Optical character recognition CAPTCHA. (OCR) method has been used in extracting the text from images. OCR has various applications which include extracting text from any document or image or involves just for reading and processing the text available in digital form. The accuracy of OCR can be dependent on text segmentation and pre-processing algorithms. Sometimes it is difficult to retrieve text from the image because of different size, style, orientation, a complex background of image etc. From vehicle number plate the authors tried to extract vehicle number by using various OCR tools like Tesseract, GOCR, Ocrad and Tensor flow. The authors in this research paper have tried to diagnose the best possible method for optical character recognition and have provided with a comparative analysis of their accuracy. Keywords— OCR tools; Orcad; GOCR; Tensorflow; Tesseract; I. INTRODUCTION Optical character recognition is a method with which text in images of handwritten documents, scripts, passport documents, invoices, vehicle number plate, bank statements, Fig.1: Functioning of OCR [2] computerized receipts, business cards, mail, printouts of static-data, any appropriate documentation or any II. OCR PROCDURE AND PROCESSING computerized receipts, business cards, mail, printouts of To improve the probability of successful processing of an static-data, any appropriate documentation or any picture image, the input image is often ‘pre-processed’; it may be with text in it gets processed and the text in the picture is de-skewed or despeckled.
    [Show full text]
  • Text Classification and Layout Analysis for Document Reassembling
    Text Classification and Layout Analysis for Document Reassembling DISSERTATION submitted in partial fulfillment of the requirements for the degree of Doktor der technischen Wissenschaften by Markus Diem Registration Number 0226595 to the Faculty of Informatics at the Vienna University of Technology Advisor: Ao.Univ.Prof. Dipl.-Ing. Dr.techn. Robert Sablatnig The dissertation has been reviewed by: (Ao.Univ.Prof. Dipl.-Ing. (Basilis G. Gatos, PhD) Dr.techn. Robert Sablatnig) Wien, March 10, 2014 (Markus Diem) Technische Universit¨atWien A-1040 Wien Karlsplatz 13 Tel. +43-1-58801-0 www.tuwien.ac.at Erkl¨arung zur Verfassung der Arbeit Markus Diem Mollardgasse 22/19, 1060 Wien Hiermit erkl¨are ich, dass ich diese Arbeit selbst¨andig verfasst habe, dass ich die ver- wendeten Quellen und Hilfsmittel vollst¨andig angegeben habe und dass ich die Stellen der Arbeit { einschließlich Tabellen, Karten und Abbildungen {, die anderen Werken oder dem Internet im Wortlaut oder dem Sinn nach entnommen sind, auf jeden Fall unter Angabe der Quelle als Entlehnung kenntlich gemacht habe. (Ort, Datum) (Unterschrift Verfasser) i Danksagung Ich m¨ochte mich an dieser Stelle bei all jenen Menschen bedanken, die mich begleitet und unterst¨utzthaben. Sie haben wesentlich dazu beigetragen, dass ich diese Arbeit schreiben konnte. F¨urdie fachliche Unterst¨utzungm¨ochte ich mich bei allen Arbeitskollegen am Cvl bedanken, die in vielen Diskussionen meine Idee der Computer Vision geformt haben. Speziell m¨ochte ich mich bei Melanie Gau, Michael H¨odlmoser,Martin Kampel, Rainer Planinc, Michael Reiter, Sebastian Zambanini und Andreas Zweng bedanken. Ein Dank gilt auch Stefan, der mich nie zu seiner Masterparty eingeladen hat, Florian, dessen zum Teil absurde Ideen den B¨uroalltagauffrischen und Fabian, der mich schon beim Studium f¨urseine nicht immer absurden Ideen begeisterte.
    [Show full text]
  • Comparison of Visual and Logical Character Segmentation in Tesseract OCR Language Data for Indic Writing Scripts
    Comparison of Visual and Logical Character Segmentation in Tesseract OCR Language Data for Indic Writing Scripts Jennifer Biggs National Security & Intelligence, Surveillance & Reconnaissance Division Defence Science and Technology Group Edinburgh, South Australia {[email protected]} interfaces, online services, training and training Abstract data preparation, and additional language data. Originally developed for recognition of Eng- Language data for the Tesseract OCR lish text, Smith (2007), Smith et al (2009) and system currently supports recognition of Smith (2014) provide overviews of the Tesseract a number of languages written in Indic system during the process of development and writing scripts. An initial study is de- internationalization. Currently, Tesseract v3.02 scribed to create comparable data for release, v3.03 candidate release and v3.04 devel- Tesseract training and evaluation based opment versions are available, and the tesseract- on two approaches to character segmen- ocr project supports recognition of over 60 lan- tation of Indic scripts; logical vs. visual. guages. Results indicate further investigation of Languages that use Indic scripts are found visual based character segmentation lan- throughout South Asia, Southeast Asia, and parts guage data for Tesseract may be warrant- of Central and East Asia. Indic scripts descend ed. from the Brāhmī script of ancient India, and are broadly divided into North and South. With some 1 Introduction exceptions, South Indic scripts are very rounded, while North Indic scripts are less rounded. North The Tesseract Optical Character Recognition Indic scripts typically incorporate a horizontal (OCR) engine originally developed by Hewlett- bar grouping letters. Packard between 1984 and 1994 was one of the This paper describes an initial study investi- top 3 engines in the 1995 UNLV Accuracy test gating alternate approaches to segmenting char- as “HP Labs OCR” (Rice et al 1995).
    [Show full text]
  • Design a Fast 3D Scanner Using a Laser Line
    REPUBLIC OF TURKEY FIRAT UNIVERSITY GRADUATE SCHOOL OF NATURAL AND APPLIED SCIENCE AN ANDROID BASED RECEIPT TRACKER SYSTEM USING OPTICAL CHARACTER RECOGNITION KAREZ ABDULWAHHAB HAMAD Master Thesis Department: Software Engineering Supervisor: Asst. Prof. Dr. Mehmet KAYA JULY – 2017 ACKNOWLEDGEMENTS First, thanks to ALLAH, the Almighty, for granting me the well and strength, with which this master thesis was accomplished; it will be the first step to propose much more great scientific researches. I would like to acknowledge my thankfulness and appreciation to my supervisor Asst. Prof. Dr. Mehmet KAYA for his guidance, assistance encouragement, wisdom suggestions, and valuable advice that made the completion of the present master thesis possible. Last but not the least; I want to express my special thankfulness to my lovely parents, and special gratitude to all members of my family and friends. Special thanks to my lovely uncle Assoc. Prof. Dr. Yadgar Rasool, who helped me and encouraged me a lot during my study. II TABLE OF CONTENTS Page No ACKNOWLEDGEMENTS ............................................................................................... II TABLE OF CONTENTS ................................................................................................. III ABSTRACT ....................................................................................................................... VI ÖZET ................................................................................................................................ VII
    [Show full text]
  • Mathematical Expression Detection and Segmentation in Document Images
    Mathematical Expression Detection and Segmentation in Document Images Jacob R. Bruce Thesis submitted to the faculty of the Virginia Polytechnic Institute and State University in partial fulfillment of the requirements for the degree of Master of Science In Computer Engineering A. Lynn Abbott, Committee Chair Jianhua J. Xuan, Co-Chair Michael S. Hsiao February 12, 2014 Blacksburg, VA Keywords: document layout analysis, optical character recognition, mathematical expression detection and segmentation, document image, type- specific layout analysis Mathematical Expression Detection and Segmentation in Document Images Jacob R. Bruce Abstract Various document layout analysis techniques are employed in order to enhance the accuracy of optical character recognition (OCR) in document images. Type-specific document layout analysis involves localizing and segmenting specific zones in an image so that they may be recognized by specialized OCR modules. Zones of interest include titles, headers/footers, paragraphs, images, mathematical expressions, chemical equations, musical notations, tables, circuit diagrams, among others. False positive/negative detections, oversegmentations, and undersegmentations made during the detection and segmentation stage will confuse a specialized OCR system and thus may result in garbled, incoherent output. In this work a mathematical expression detection and segmentation (MEDS) module is implemented and then thoroughly evaluated. The module is fully integrated with the open source OCR software, Tesseract, and is designed to function as a component of it. Evaluation is carried out on freely available public domain images so that future and existing techniques may be objectively compared. Acknowledgments I would like to thank Dr. Abbott for his helpful insights. I'd also like to thank my lab-mates Sherin Aly, Sherin Ghannam, Ahmed Ibrahim, Mahmoud Sobhy, and Amira Youssef for keeping me company, teaching me a thing or two about their rich culture, and letting me try some great foods.
    [Show full text]
  • Gradu04243.Pdf
    Paperilomakkeesta tietomalliin Kalle Malin Tampereen yliopisto Tietojenkäsittelytieteiden laitos Tietojenkäsittelyoppi Pro gradu -tutkielma Ohjaaja: Erkki Mäkinen Toukokuu 2010 i Tampereen yliopisto Tietojenkäsittelytieteiden laitos Tietojenkäsittelyoppi Kalle Malin: Paperilomakkeesta tietomalliin Pro gradu -tutkielma, 61 sivua, 3 liitesivua Toukokuu 2010 Tässä tutkimuksessa käsitellään paperilomakkeiden digitalisointiin liittyvää kokonaisprosessia yleisellä tasolla. Prosessiin tutustutaan tarkastelemalla eri osa-alueiden toimintoja ja laitteita kokonaisjärjestelmän vaatimusten näkökul- masta. Tarkastelu aloitetaan paperilomakkeiden skannaamisesta ja lopetetaan kerättyjen tietojen tallentamiseen tietomalliin. Lisäksi luodaan silmäys markki- noilla oleviin valmisratkaisuihin, jotka sisältävät prosessin kannalta oleelliset toiminnot. Avainsanat ja -sanonnat: lomake, skannaus, lomakerakenne, lomakemalli, OCR, OFR, tietomalli. ii Lyhenteet ADRT = Adaptive Document Recoginition Technology API = Application Programming Interface BAG = Block Adjacency Graph DIR = Document Image Recognition dpi= Dots Per Inch ICR = Intelligent Character Recognition IFPS = Intelligent Forms Processing System IR = Information Retrieval IRM = Image and Records Management IWR = Intelligent Word Recognition NAS = Network Attached Storage OCR = Optical Character Recognition OFR = Optical Form Recognition OHR = Optical Handwriting Recognition OMR = Optical Mark Recognition PDF = Portable Document Format SAN = Storage Area Networks SDK = Software Development Kit SLM
    [Show full text]
  • Integral Estimation in Quantum Physics
    INTEGRAL ESTIMATION IN QUANTUM PHYSICS by Jane Doe A dissertation submitted to the faculty of The University of Utah in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Mathematical Physics Department of Mathematics The University of Utah May 2016 Copyright c Jane Doe 2016 All Rights Reserved The University of Utah Graduate School STATEMENT OF DISSERTATION APPROVAL The dissertation of Jane Doe has been approved by the following supervisory committee members: Cornelius L´anczos , Chair(s) 17 Feb 2016 Date Approved Hans Bethe , Member 17 Feb 2016 Date Approved Niels Bohr , Member 17 Feb 2016 Date Approved Max Born , Member 17 Feb 2016 Date Approved Paul A. M. Dirac , Member 17 Feb 2016 Date Approved by Petrus Marcus Aurelius Featherstone-Hough , Chair/Dean of the Department/College/School of Mathematics and by Alice B. Toklas , Dean of The Graduate School. ABSTRACT Blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah. Blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah. Blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah. Blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah. Blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah. Blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah. Blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah. Blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah.
    [Show full text]
  • JETIR Research Journal
    © 2019 JETIR June 2019, Volume 6, Issue 6 www.jetir.org (ISSN-2349-5162) IMAGE TEXT CONVERSION FROM REGIONAL LANGUAGE TO SPEECH/TEXT IN LOCAL LANGUAGE 1Sayee Tale,2Manali Umbarkar,3Vishwajit Singh Javriya,4Mamta Wanjre 1Student,2Student,3Student,4Assistant Professor 1Electronics and Telecommunication, 1AISSMS, Institute of Information Technology, Pune, India. Abstract: The drivers who drive in other states are unable to recognize the languages on the sign board. So this project helps them to understand the signs in different language and also they will be able to listen it through speaker. This paper describes the working of two module image processing module and voice processing module. This is done majorly using Raspberry Pi using the technology OCR (optical character recognition) technique. This system is constituted by raspberry Pi, camera, speaker, audio playback module. So the system will help in decreasing the accidents causes due to wrong sign recognition. Keywords: Raspberry pi model B+, Tesseract OCR, camera module, recording and playback module,switch. 1. Introduction: In today’s world life is too important and one cannot loose it simply in accidents. The accident rates in today’s world are increasing day by day. The last data says that 78% accidents were because of driver’s fault. There are many faults of drivers and one of them is that they are unable to read the signs and instructions written on boards when they drove into other states. Though the instruction are for them only but they are not able to make it. So keeping this thing in mind we have proposed a system which will help them to understand the sign boards written in regional language.
    [Show full text]
  • Optical Character Recognition As a Cloud Service in Azure Architecture
    International Journal of Computer Applications (0975 – 8887) Volume 146 – No.13, July 2016 Optical Character Recognition as a Cloud Service in Azure Architecture Onyejegbu L. N. Ikechukwu O. A. Department of Computer Science, Department of Computer Science, University of Port Harcourt. University of Port Harcourt. Port Harcourt. Rivers State. Nigeria. Port Harcourt. Rivers State. Nigeria ABSTRACT The dawn of digital computers has made it unavoidable that Cloud computing and Optical Character Recognition (OCR) everything processed by the digital computer must be technology has become an extremely attractive area of processed in digital form. For example, most famous libraries research over the last few years. Sometimes it is difficult to in the world like the Boston public library has over 6 million retrieve text from the image because of different size, style, books, for public consumption, inescapably has to change all orientation and complex background of image. There is need its paper-based books into digital documents in order that they to convert paper books and documents into text. OCR is still could be stored on a storage drive. It can also be projected that imperfect as it occasionally mis-recognizes letters and falsely every year over 200 million books are being published[22], identifies scanned text, leading to misspellings and linguistics many of which are disseminated and published on papers [6]. errors in the OCR output text. A cloud based Optical Thus, for several document-input jobs, Opitcal Character Character Recognition Technology was used. This was Reconigtion is the utmost economical and swift process powered on Microsoft Windows Azure in form of a Web obtainable.
    [Show full text]
  • Representation of Web Based Graphics and Equations for the Visually Impaired
    TH NATIONAL ENGINEERING CONFERENCE 2012, 18 ERU SYMPOSIUM, FACULTY OF ENGINEERING, UNIVERSITY OF MORATUWA, SRI LANKA Representation of Web based Graphics and Equations for the Visually Impaired C.L.R. Gunawardhana, H.M.M. Hasanthika, T.D.G.Piyasena,S.P.D.P.Pathirana, S. Fernando, A.S. Perera, U. Kohomban Abstract With the extensive growth of technology, it is becoming prominent in making learning more interactive and effective. Due to the use of Internet based resources in the learning process, the visually impaired community faces difficulties. In this research we are focusing on developing an e-Learning solution that can be accessible by both normal and visually impaired users. Accessibility to tactile graphics is an important requirement for visually impaired people. Recurrent expenditure of the printers which support graphic printing such as thermal embossers is beyond the budget for most developing countries which cannot afford such a cost for printing images. Currently most of the books printed using normal text Braille printers ignore images in documents and convert only the textual part. Printing images and equations using normal text Braille printers is a main research area in the project. Mathematical content in a forum and simple images such as maps in a course page need to be made affordable using the normal text Braille printer, as these functionalities are not available in current Braille converters. The authors came up with an effective solution for the above problems and the solution is presented in this paper. 1 1. Introduction In order to images accessible to visually impaired people the images should be converted into a tactile In this research our main focus is to make e-Learning format.
    [Show full text]
  • Manual Archivista 2009/I
    Manual Archivista 2009/I c 18th January 2009 by Archivista GmbH, CH-8118 Pfaffhausen Web pages: www.archivista.ch Contents I Introduction 8 4.4 Accessing the manual . 26 4.5 Login WebClient . 26 1 Introduction 9 4.6 Scanning and entering keywords . 26 1.1 Welcome to Archivista . 9 4.7 Rotating pages . 26 1.2 Notes on the manual . 9 4.8 Title search . 27 1.3 Our address . 9 4.9 Full text search . 27 1.4 Previous versions . 9 4.10 Login WebAdmin . 27 1.5 Licensing . 12 4.11 Adding users . 27 4.12 Adding/deleting fields . 27 2 First Steps 18 4.13 Editing the input/search mask . 27 2.1 Introduction . 18 4.14 Activating SSH . 27 2.2 The digital archive . 18 4.15 Activating VNC . 27 2.3 Database, server and client . 18 4.16 Enabling print server (CUPS) . 28 2.4 Tables, records and fields . 19 4.17 Password, Unlock & Restart OCR . 28 2.5 Archivista and working method . 19 4.18 Activating HTTPS . 28 2.6 Tips for archiving . 20 2.7 Archive, pages and documents . 20 5 Tutorial RichClient 29 2.8 The Archivista document . 20 5.1 Archivista in 90 Seconds . 29 5.2 Adding Documents . 29 3 Installation 22 5.3 Search . 29 3.1 ArchivistaBox . 22 5.4 Extended Functions . 30 3.2 Virtual (Box) . 22 5.5 Users & Fields . 31 3.3 OpenSource (Box) . 22 5.6 Databases, fields and barcodes . 31 3.4 OpenSource (Windows) . 22 III ArchivistaBox 33 II Tutorials 25 6 Introduction 34 4 Introduction 26 6.1 Advantages .
    [Show full text]