Android Optical Character Recognition
Total Page:16
File Type:pdf, Size:1020Kb
Imperial Journal of Interdisciplinary Research (IJIR) Vol-3, Issue-4, 2017 ISSN: 2454-1362, http://www.onlinejournal.in Android Optical Character Recognition 1 2 3 Shital Malhar , Manasi Gosavi , Pooja Lad 1,2,3 Dilkap Research Institute of Engineering and Management Studies, Neral. (Mumbai University) operating system which can run on every mobile Abstract - The next generation open operating device and not for their specific mobile devices systems are not on desktops or mainframes but on itself. This enables them to reach as many people as the small mobile devices people carry every day. possible. This application is useful for native The openness of these new environments leads to Tourists and Travellers who possess Android Smart new applications and markets and enables greater phones. The main objective of this application is to integration. Every day a Smartphone user may look help tourist to travel easily and freely without any for a new application dedicated for his need. difficulty. The proposed application also provides Android makes it easier for consumers to get and currency convertor, a real market value. use new content and applications on their Smart phones. The Proposed project presents an 1.1 Problem Definition extremely on-demand, fast and user friendly Android Application. This application is useful for The Project presents an extremely on-demand, fast native Tourists and Travellers who possess Android and user friendly Android Application. This Smart phones. One of the feature in application is it application is useful for native Tourists and enables Travellers and Tourists to easily capture Travelers who possess Android Smart phones. the native country language Books pages, Using mobile phones to obtain information is not signboards, banners and hotel menus, currency only quick, but also more convenient shortcut to convertor. The built-in OCR converts the text improve people's lives. The Proposed Android embedded in the captured image into Unicode text Travel Mate Application enables Tourists and format. It also provides translation facility so that Travelers so that he/she can: Tourists can translate the Native Language OCR the native country language Books pages, Unicode text into their own country language. Signboards, Banners and hotel menus etc. (English) There is no remote computing overhead because Translate the Recognized English text into one of the application has built in OCR suite as well as 22 languages (10 National Languages: Hindi, Image processing suite both installed in the Marathi, Arabic, Bengali, Gujarati, Kannada, Android device. The main objective of this Punjabi, Tamil, Telugu, Urdu.12 International application is to help tourist to travel/navigate Languages: Bulgarian, Chinese, French, German, easily and freely in travelling place without any Greek, Italian, Japanese, Latin, Nepali, Portuguese, difficulty. Spanish, Thai.)Copy, paste and share the Translated Text as well as Recognized Text. Get 1. Introduction the current currency status of that country as well as all the currency conversion. The combination of the smart phone and the Internet service is the trend of the future 1.2 Scope information development and software applications. Mobile phones are the most Applications of OCR include to recognize the commonly used communication tools. The English text from images by just clicking the image motivation of the application is that it enables the from camera and to translate that recognized users to get English text translated, as ease as a English text to 22 National & International button click. The camera captures the English Text languages, which will be helpful in many situations Image and returns the English Text and then one such as in hotels, museums, signboards, notice, can translate the English text result in 22 warning, instructions, etc. User can see the languages. Optical character recognition (OCR) has currency conversion of the destination city come out with the new open and comprehensive currency from its residence country. platform for mobile devices called Android. Android includes an operating system, middleware, user-interface and applications. By now mobile internet usage has taken over Desktop internet usage. Google’s approach is to develop an Imperial Journal of Interdisciplinary Research (IJIR) Page 788 Imperial Journal of Interdisciplinary Research (IJIR) Vol-3, Issue-4, 2017 ISSN: 2454-1362, http://www.onlinejournal.in 2 Review of Literature 2.3 Features of Android 2.1 Android The Android continues to lead the Smart Phone market because of having a great number of Android is a software stack for mobile devices features. Android allows one to access core mobile which means a reference to a set of system device functionality through standard API available programs or a set of application programs that form open-source. One can combine information from a complete system. Android is an operating system the web with data on the phone such as contacts or based on Linux with a Java programming interface. geographic location to create new user It provides tools, e.g. a compiler, debugger and a experiences. Android does not differentiate device emulator as well as its own Java Virtual between the phones basic and third party machine (Dalvik Virtual Machine - DVM). To applications even the dialler or home screen can be simplify development Google provides the Android replaced. The SDK contains what one need to build Development Tools (ADT) for Eclipse. Every and run Android applications, including a true Android application runs in its own process and device emulator and advanced debugging tools. under its own user id which is generated Thus provides fast and easy development. It automatically by the Android system during enables reuse and replacement of components. deployment. Therefore the application is isolated Dalvik virtual machine is optimized for mobile from other running applications and a misbehaving devices. Integrated browser based on the open application cannot easily harm other Android source Web Kit engine. Optimized graphics applications. powered by a custom 2D graphics library; 3D graphics based on the OpenGL ES 1.0 specification 2.2 Android Architecture (hardware acceleration optional). SQLite is used for structured data storage. Media support for The Android Architecture layers are common audio, video, and still image formats as follows: (MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, GIF). Android supports GSM Telephony (hardware dependent), Bluetooth, EDGE, 3G, and Wi-Fi (hardware dependent), Camera, GPS, compass, and accelerometer (hardware dependent) etc. Android possesses a rich development environment including a device emulator, tools for debugging, memory and performance profiling, and a plug in for the Eclipse IDE. Fig-1: Android Architecture 2.4 Tesseract OCR Engine Android comes with a set of built-in core applications including an email client, SMS Today, Tesseract is considered one of the most program, calendar, maps, browser, contacts, and accurate open source OCR engines available. others which are written using Java. By providing Tesseract OCR Engine was one of the best 3 an open development platform, Android offers engines in 1995 UNLV Accuracy Test. Between developers the ability to build extremely rich and 1995 and 2006 however; there was little activity in innovative applications. Developers are free to take Tesseract, until it was open sourced by HP and advantage of the device hardware, access location UNLV in 2005. It was again re-released to the open information, run background services, set alarms, source community in August of 2006 by Google add notifications to the status bar, and much, much [1]. Tesseract has ability to train for newer more. Developers have full access to the same language and scripts as well [2]. A complete framework APIs used by the core applications. The overview of Tesseract OCR engine can be found in application architecture is designed to simplify the [3]. While Tesseract was originally developed for reuse of components; any application can publish English, it has since been extended to recognize its capabilities and any other application may then French, Italian, Catalan, Czech, Danish, Polish, make use of those capabilities (subject to security Bulgarian, Russian, Greek, Korean, Spanish, constraints enforced by the framework). Japanese, Dutch, Chinese, Indonesian, Swedish, German, Thai, Arabic, and Hindi etc [4]. Training Android includes a set of C/C++ libraries used by the Tesseract OCR Engine for Hindi language various components of the Android system. These requires in-depth knowledge of Devanagari script capabilities are exposed to developers through the in order to collect the character set. Android application framework. Imperial Journal of Interdisciplinary Research (IJIR) Page 789 Imperial Journal of Interdisciplinary Research (IJIR) Vol-3, Issue-4, 2017 ISSN: 2454-1362, http://www.onlinejournal.in 2.4.1 Components of Tesseract OCR Engine user registrations, recommended places to be modified. We will be using php language for it. The application consists of four major components And there will be 1 client side, where user can described below: access the functionality of application, which will be in android. In camera capture module the user is allowed to resize the camera capture box by touching the box 3.1 Client and server role corners on the screen so as to capture the only concerned text image from signboard, banner and The client–server characteristic describes the book pages [1].