
INTELLI 2012 : The First International Conference on Intelligent Systems and Applications
Copyright (c) IARIA, 2012. ISBN: 978-1-61208-224-0

Smart Implementation of Text Recognition (OCR) for Smart Mobile Devices

Ondrej Krejcar
Department of Information Technologies, Faculty of Informatics and Management, University of Hradec Kralove, Hradec Kralove, Czech Republic
[email protected]

Abstract – The paper deals with the development of a mobile application for capturing digital photographs and their subsequent processing by OCR (Optical Character Recognition) technologies. The developed solution adds to an existing Smart Device the capability of a virtual keyboard, to which recognized text can be transferred for further work in an SMS or a text editor. Because of the limitations of mobile devices, it mainly targets short text sections (internet references, complex addresses, etc.). The emphasis is on simple, fast and intuitive work with a mobile device. The practical realization is verified on several Smart Devices with Windows Mobile OS.

Keywords – OCR; Smart Device; Windows Mobile; Image Processing; Virtual Keyboard

I. INTRODUCTION

Smart Phones, such as cell phones and PDAs (Personal Digital Assistants), especially MDAs (Mobile Digital Assistants), are a phenomenon nowadays. The share of cell phone users over 16 years old in the Czech Republic climbed to 91% in 2009. For the population aged 16 to 54 the share is 98% [1]. A great boom in cell phones and their performance was caused by the use of an OS (Operating System) such as Symbian, Android or Windows Mobile. Many of these devices have large color touch-screen displays and fast 32-bit CPUs. Moreover, a GSM module is usually integrated within a standard PDA together with a WiFi module. The result is the merging of cell phones and PDAs into Smart Phones. Efficient 32-bit CPUs make it possible to develop computationally powerful applications.

The primary input system of these devices is a keyboard, either in a classic “physical” design or in the form of a virtual keyboard on the display in the case of a touch screen. These types of keyboards provide a comfortable method of entering information. Nevertheless, typing is approximately 4x slower than on computer keyboards [2]. This typing speed may be insufficient if we would like to use a Smart Device as a tool for fast information recording (e.g. copying a business card or parts of a text). The commonly integrated CCD chips (in many PDAs, and more precisely in cell phones, the cheaper CMOS sensors are used) enable taking photographs or recording a video sequence. This is therefore a convenient and instant way of capturing information. Moreover, if the information is time-limited (e.g. it must be returned within a certain time limit or it is displayed only for a short period), then it is the only method.

Nevertheless, sometimes there is a need to process the captured text further. Retyping the text from these images is time-consuming. Furthermore, if the text has to be retyped on the PDA itself, frequent switching between the application displaying the image and the text editor must be taken into account. In these cases the use of OCR (Optical Character Recognition) technology is the best solution. The first mobile OCR application was released to the market as early as 2002 [3]. Certain factors complicate the use of OCR on a PDA, mostly stemming from the low quality of images acquired by the CCM (Compact Camera Module: a module with an integrated CCD (Charge Coupled Device) or CMOS (Complementary Metal Oxide Semiconductor) sensor, simple optics and electronics). Finally, it should be noted that the usual image source for OCR applications is a scanner.

A PDA equipped with OCR can be used in many ways. If the user notices a URL in a printed document, he can visit it by taking a picture, which then opens the link in a browser. Similarly, after a picture of a business card is taken, the data on the card are saved into contacts, etc.

The problem addressed in this paper is the development of a mobile OCR application for current Smart Phones on the Windows Mobile platform. Such an application is needed to solve the problems mentioned above, with the goal of developing a virtual keyboard with an embedded OCR engine.

First, existing solutions are evaluated in Section II.

II. EXISTING OCR ENGINES FOR MOBILE DEVICES

The accuracy of OCR depends mainly on the quality of the source material being recognized. The most common use of OCR, on scanned documents, achieves quite satisfactory results. Using OCR on a PDA with a CCM as the data source for the recognizer brings a number of problems [4], especially:

- Relatively low computational performance (usually 1/10 of PC performance)
- Low quality of images for OCR (generally meaning low resolution, blurring, background noise, anomalies caused by compression, etc.)
- Tilt (perspective deformation), skew and rotation
- Uneven lighting and shadows

Mainly because of these complications, OCR on PDAs is limited to small parts of text. The insufficient quality of the acquired images is thus compensated by the larger proportion of the overall resolution occupied by each symbol. Existing applications are good examples, because they usually specialize in business card scanning.

A. Existing mobile applications

1) Nokia Multiscanner

Nokia Multiscanner [5] is a freeware application designed for cell phones with Symbian OS. The application supports taking a picture and then sending it through MMS, Bluetooth or infrared. It is possible to convert the image into text and save it, and a particular area can be selected by dragging. Another possibility is to send the image for business card recognition. This option automatically recognizes the contact details on the business card and fills them in for adding a new contact. The OCR engine supports post-processing on the basis of language dictionaries (a technology for replacing recognized words with words from a dictionary according to their relevance), including the Czech language.

However, this solution does not support a real virtual keyboard or clipboard Copy/Paste features. Also, only Symbian OS is supported.

2) CameraDictionary OCR for Moto

An application [6] for cell phones with Android, Symbian and Windows Mobile systems. It works by recognizing captured text and immediately translating it into another language. Although the captured text can only be in Chinese or English, translation is available into several other languages. Furthermore, it allows text to be captured with subsequent marking of the translated text, or a so-called “Video” mode during which a cursor appears on the screen. The text below the cursor is immediately translated.

However, the main disadvantages are the price and the need for an internet connection during use.

3) CamCard - Business Card Reader

CamCard [7] is an application specialized in reading business cards. It targets cell phones running Android, iOS (the OS of iPhone cell phones) or Windows Mobile, as well as BlackBerry phones. Furthermore, CamCard is an extensively automated business card reader with detection of rotation and language.

4) Babel Reader-LE

Babel Reader-LE [8] is a version of Babel Reader for Windows Mobile distributed as freeware. It enables capturing an image and then storing it in the form of text. Babel Reader-LE is a very simple application. Nevertheless, the captured image can be adjusted before the actual recognition, e.g. by background noise removal.

As with the Nokia solution, clipboard and keyboard options are not available.

B. Problems of Existing Mobile OCR Solutions

Nokia Multiscanner is the closest application to the one we needed. However, it is designed only for Symbian OS. CameraDictionary OCR and CamCard are commercial applications which are very specialized and not free. Finally, the last mentioned application, Babel Reader, was designed only for text recognition. The selection of OCR applications for cell phones is significantly limited, and a broader OCR application that would work as an alternative to a virtual keyboard is still missing.

These reasons led us to develop a new application, which is described in this article. We expect the solution to fill a gap in the current market.

C. Selection of OCR engine

Due to the extent of this application, it is planned to use an existing OCR engine. The following engines were chosen as the most suitable:

- Tesseract OCR [9] – an OCR engine developed by HP from 1985 until 1995. Nowadays, it is being improved by Google. It is offered in C/C++.
- Ocrad [10] – another open-source OCR engine. One of its main advantages is the automatic transformation of the input image. It does not perform post-processing on the basis of language dictionaries. It is written in C/C++.
- Puma.NET [11] – an engine for use in C# projects with the .NET framework.
- ABBYY Mobile OCR Engine [12] – a commercial engine used here only for comparison of results. It is not available to end users and was tested through the ABBYY FineReader Online service.

A script in PHP was created to allow an objective comparison of recognition accuracy. This script is outside the scope of this article and therefore is not described here. The accuracy of the match is calculated by the following formula. The Greek letter ω represents the number of symbols in the reference text and ω_err is the number of errors (substituted, missing, or additional symbols).
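
The error count described above (substituted, missing, or additional symbols) corresponds to what is commonly called the edit (Levenshtein) distance between the reference text and the recognized text. The following Python sketch is only an illustration and is not the PHP script used by the author. The function names `count_errors` and `accuracy_percent`, the example URL, and in particular the accuracy expression 100 · (ω − ω_err) / ω are assumptions made here for illustration.

```python
def count_errors(reference: str, recognized: str) -> int:
    """Return omega_err: the minimum number of substituted, missing, or
    additional symbols separating `recognized` from `reference`
    (the Levenshtein distance). Hypothetical helper, not the paper's script."""
    # previous[j] = errors between an empty reference prefix and recognized[:j]
    previous = list(range(len(recognized) + 1))
    for i, ref_ch in enumerate(reference, start=1):
        current = [i]  # recognized prefix is empty: i reference symbols are missing
        for j, rec_ch in enumerate(recognized, start=1):
            substituted = previous[j - 1] + (ref_ch != rec_ch)
            missing = previous[j] + 1        # reference symbol absent from OCR output
            additional = current[j - 1] + 1  # extra symbol present in OCR output
            current.append(min(substituted, missing, additional))
        previous = current
    return previous[-1]


def accuracy_percent(reference: str, recognized: str) -> float:
    """Assumed accuracy measure: 100 * (omega - omega_err) / omega."""
    omega = len(reference)  # number of symbols in the reference text
    if omega == 0:
        raise ValueError("reference text must not be empty")
    omega_err = count_errors(reference, recognized)
    return 100.0 * (omega - omega_err) / omega


if __name__ == "__main__":
    # Hypothetical example: the OCR engine reads the letter "l" as the digit "1".
    reference_text = "http://example.com"
    ocr_output = "http://examp1e.com"
    print(count_errors(reference_text, ocr_output))
    print(round(accuracy_percent(reference_text, ocr_output), 2))
```

For the example pair, `count_errors` finds a single substitution. Under this assumed definition the value can become negative when the recognized text contains more errors than the reference has symbols, so a real comparison script might clamp it at zero.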