Bachelor’s thesis

Czech Technical University in Prague

Faculty of Electrical Engineering, Department of Cybernetics

Mobile Application for Recognition of Japanese Writing System

Jan Zdeněk

Program: Open Informatics, Field: Computer and Information Science

May 2014    Supervisor: prof. Ing. Pavel Zahradník, CSc.

Acknowledgement

First of all, I would like to express my gratitude to my supervisor, prof. Ing. Pavel Zahradník, CSc., for giving me the opportunity to work on this project. I appreciate his guidance and comments on my work and I would like to thank him for giving me the freedom to explore on my own. I would like to acknowledge the help of my friend Jakub Černý who gave me his opinion on the thesis and provided me with helpful comments. Finally, I want to thank my parents for their love, encouragement and constant support throughout my studies.

Declaration / Prohlášení

I declare that I worked out the presented thesis independently and that I quoted all used sources of information in accordance with the Methodical instructions about ethical principles for writing an academic thesis.

In Prague, May 23, 2014 ...... Jan Zdeněk

Prohlašuji, že jsem předloženou práci vypracoval samostatně a že jsem uvedl veškeré použité informační zdroje v souladu s Metodickým pokynem o dodržování etických principů při přípravě vysokoškolských závěrečných prací.

V Praze dne 23. května 2014 ...... Jan Zdeněk

Abstract / Abstrakt

The objective of this work was to implement and compare various methods which can be used for optical character recognition (OCR) of the characters used in the Japanese language, and to create a mobile application which can recognize characters in an image captured by the camera of a device and present the user with a translation of the words into English. The recognition engine has been trained on a database of 2,908 unique characters. Using moment-based shape normalization, the Histogram of Oriented Gradients for feature extraction, and a quadratic discriminant function for classification, a recognition rate of 99.9% has been achieved on isolated characters in standard fonts used in print publications, and the engine outperformed some of the free OCR engines for Japanese. An initial alpha version of a mobile application for Japanese OCR for the Android operating system has been created, and the best-performing engine has been successfully implemented therein. The author intends to continue the development of the application and expand the range of its possible use. The OCR engine could also be used in other applications requiring Japanese OCR.

Keywords: Pattern Recognition; OCR

Cílem práce bylo implementovat a otestovat různé metody použitelné pro optické rozpoznávání znaků (OCR) používaných v japonštině a vytvořit mobilní aplikaci, která bude rozpoznávat znaky v obraze zachyceném fotoaparátem zařízení a poskytne uživateli překlad textu do angličtiny. Systém pro rozpoznávání byl natrénován na databázi obsahující 2908 unikátních znaků. Za použití momentové normalizace pro normalizaci tvaru znaku, histogramu orientovaných gradientů pro extrakci příznaků a kvadratické diskriminační funkce pro klasifikaci bylo dosaženo úspěšnosti rozpoznání 99,9% na izolovaných znacích ve standardním fontu používaném v tištěných publikacích. Systém byl úspěšnější než některé existující volně šířené systémy pro OCR japonštiny. Byla vytvořena počáteční alfa verze aplikace pro operační systém Android a úspěšně v ní byla implementována verze systému dosahující nejlepších výsledků. Autor chce nadále pokračovat ve vývoji aplikace a rozšířit spektrum její použitelnosti. OCR systém aplikace by mohl být použit i v dalších aplikacích vyžadujících OCR japonštiny.

Klíčová slova: Rozpoznávání; optické rozpoznávání znaků

Contents

List of Figures
List of Tables
1 Introduction
  1.1 State of the Art and the Goal
  1.2 Related Work
  1.3 Japanese Writing System
  1.4 Thesis Structure
2 Dataset Generation
  2.1 Supported Characters
  2.2 Character Image Generation
3 Preprocessing and Feature Extraction
  3.1 Grayscale Normalization
  3.2 Shape Normalization
    3.2.1 Aspect Ratio Adaptive Normalization
    3.2.2 Linear Normalization
    3.2.3 Moment Normalization
    3.2.4 Bi-moment Normalization
    3.2.5 Smoothing
  3.3 Feature Extraction
    3.3.1 Zonal Line Density Method
    3.3.2 Histogram of Oriented Gradients in Local Regions
4 Learning and Classification
  4.1 Dimensionality Reduction
    4.1.1 Principal Component Analysis
    4.1.2 Linear Discriminant Analysis
  4.2 Classification Methods
    4.2.1 k-Nearest Neighbors
    4.2.2 Linear Discriminant Function
    4.2.3 Quadratic Discriminant Function
  4.3 Results
5 Text Extraction
  5.1 Binarization
  5.2 Black on White / White on Black Detection
  5.3 Skew Correction
  5.4 Character Segmentation
    5.4.1 Projection Profile Analysis
6 Implementation
  6.1 Training
  6.2 Comparison
  6.3 Application
7 Conclusion
  7.1 Summary
  7.2 Future Work
References
A Abbreviations and Symbols
  A.1 Abbreviations
  A.2 Symbols
B CD Content
C Application Manual
D Dataset Format
E Matlab Demo for OCR

List of Figures

1.1 Examples of hiragana, katakana, and kanji characters
3.1 Examples of grayscale normalization
3.2 Examples of shape normalization
3.3 Example of image smoothing using a Gaussian filter
3.4 Example of a character composed of multiple existing kanji
3.5 Visualization of gradient direction and gradient strengths
3.6 Division of a block during calculation of HOG
4.1 Example of a k-NN classifier with 3 classes
4.2 Recognition error rates of tested shape normalization methods with varying sizes of feature vectors. Tested on standard print fonts.
4.3 Recognition error rates of tested shape normalization methods with varying sizes of feature vectors. Tested on fonts resembling handwriting.
4.4 Recognition error rates using different classification methods with varying sizes of feature vectors. Tested on standard print fonts.
4.5 Recognition error rates using different classification methods with varying sizes of feature vectors. Tested on fonts resembling handwriting.
4.6 Recognition error rates using different methods for feature dimensionality reduction with varying sizes of feature vectors. Tested on standard print fonts.
4.7 Recognition error rates using different methods for feature dimensionality reduction with varying sizes of feature vectors. Tested on fonts resembling handwriting.
4.8 Recognition error rates using different methods for feature extraction. Tested on standard print fonts.
4.9 Recognition error rates using different methods for feature extraction. Tested on fonts resembling handwriting.
4.10 Examples of correctly classified characters
4.11 Examples of incorrectly classified characters
5.1 Diagram of the steps from image acquisition to character classification
5.2 Example of binarization using Otsu’s thresholding method
5.3 Radon transform mechanism
5.4 Example of Radon transform on a correctly aligned line of text
5.5 Proposed method for skew correction using Radon transform
5.6 Example of correctly split multi-character projection block
5.7 Example of correctly merged projection mark blocks
5.8 Example of incorrectly merged projection blocks of two hiragana characters


5.9 An example of a correctly segmented line of printed text
6.1 Screenshot of the Android application
C.1 Screenshot of the Android application with description of elements
E.1 Screenshot of the demo application in Matlab

List of Tables

4.1 Minimum recognition error rate of different classification methods using moment normalization and LDA for dimensionality reduction
4.2 Percentage of correctly recognized characters by the tested OCR systems
5.1 Percentage of characters successfully extracted from text

Chapter 1 Introduction

1.1 State of the Art and the Goal

Optical Character Recognition (OCR) is the process of converting an image containing text into machine-readable information representing the characters in the image. OCR is one of the oldest fields of computer vision and pattern recognition, and the first ideas giving birth to OCR date back to the late 1920s [1]. The first attempts at OCR were realized in the 1950s after the rise of modern computers, and the approach that was applied involved optical and mechanical matching of the character being recognized against character templates. Rays of light were cast at the character and their reflection was intercepted by photo-sensitive detectors. The acquired analog values were digitized and compared with the templates. At first, OCR systems were limited to recognition of characters written in one specific font, but systems that could recognize characters in multiple fonts eventually started appearing [2], and as more time passed, the range of usability expanded and it became possible to recognize handwritten characters to a certain extent, too. Contemporary OCR systems work with digital images, use various advanced pattern recognition techniques, and achieve very high recognition rates even on handwritten characters that are difficult to read. A lot of approaches to OCR have been tried out over the years, but all of them have something in common, and that is the presence of two essential components:

• Feature extractor, which extracts information about the character from an image. The extraction methods can be divided into two categories:
  – Template matching, where the image is compared with character templates, usually using image pixels as features.
  – Structural analysis, where various attributes of the image are used as features, such as corner positions, stroke orientations, character holes, etc. This approach is more complex, but it is also more robust than simple template matching.

• Classifier, which uses the extracted information (features) to classify the character in the image as one of the characters in the set it can recognize.

There are many factors that make OCR a difficult task, due to which OCR is still an active field of research in pattern recognition despite the remarkable performance of the best systems. The most significant ones are:

• Similarity of characters. When two characters have a similar shape, the features extracted from them are also likely to be similar, and the characters are more prone to being classified incorrectly. The more characters the OCR system is supposed to recognize, the smaller the differences between characters get and the harder their recognition becomes.


• Variability of typography. The text may be printed in a wide variety of fonts or it may be handwritten. The spacing and the size may also differ.

• Segmentation. Individual characters may be connected with each other, and they first need to be correctly segmented before they can be classified. Even if the characters in the text are not connected, they have to be correctly separated. Most characters in the Latin alphabet are single connected components, but many characters in other scripts do not have this property and may consist of several connected components.

• Image quality. There can be stains, artifacts, and other defects in the image. The lighting conditions can be bad and the image resolution can be low. All of this makes recognition more difficult.

Today, there are a lot of OCR systems¹ available, both commercial and free, and they are used in applications across many fields. OCR can be used, for example, for ID card checking, license plate recognition, making editable text versions of books and other printed documents, assistance for visually impaired people, etc.

OCR for a language which uses a very complex writing system, such as Japanese, is an even more demanding task than OCR for a language using the Latin alphabet, due to the large difference in the number of characters. The Japanese writing system uses Chinese characters - usually called kanji (lit. Chinese character) when referring to their use in Japanese - imported from China, and two syllabaries developed from kanji. It is impossible to specify an exact number of characters used in the Japanese language, but approximately 3,000 characters are commonly used [3]. Despite the complexity of the task, a lot of research has been done over the years in the field of OCR, and nowadays one can find OCR engines supporting Japanese in commercial applications, scanners, and other machines. Various research institutes compete with their leading-edge engines for optical recognition of Chinese characters in competitions organized by the International Conference on Document Analysis and Recognition (ICDAR) [4], [5]; the best engine has achieved a recognition accuracy of 94.77% on individual, hardly legible handwritten Chinese characters from the GB2312-80 character set² consisting of 3,755 Chinese characters, and 88.76% on handwritten natural Chinese text. However, it goes without saying that such engines often have high computing power and memory requirements, which limits their range of possible use.

With the increasing computing power of mobile devices, it has become possible to create a mobile application that recognizes Chinese characters and characters from the Japanese syllabaries in real time. However, only a few attempts have been made to bring OCR for Japanese to the field of mobile applications. While there are a few mobile OCR applications supporting Japanese available for download on the Play Store³ of the Android operating system and the App Store⁴ of the iOS operating system, their performance leaves a lot to be desired in most cases. Some of the applications, mostly those built for multi-language support, lack the desired recognition accuracy and their usability is very limited. Other applications only support recognition of text in a standard font used in printed publications and cannot handle text printed in more ornate fonts, fonts that resemble handwriting, or real handwritten text. As the Japanese language uses a complex script that takes years for a person to learn, a well-performing application which can recognize and translate Japanese text can be very

¹ Comparison of optical character recognition software - https://en.wikipedia.org/wiki/Comparison_of_optical_character_recognition_software
² GB2312 character set - https://en.wikipedia.org/wiki/GB_2312
³ Application distribution center for Android maintained by Google.
⁴ Application distribution center for iOS maintained by Apple.

useful not only for tourists traveling to Japan, but also for students of Japanese all over the world. Looking up a word in a dictionary takes a certain amount of time when one does not recognize the characters used in the word, because one cannot type the word on a keyboard without knowing how to read it, nor can one find it in a regular printed Japanese-English dictionary. One has to analyze the characters and find the individual characters in a kanji dictionary by means of common patterns⁵ appearing in the characters, and only then does it become possible to look up the word in an electronic dictionary. Another way is to draw the characters on a device supporting online recognition of characters used in the Japanese language, which is performed by analyzing how the character has been drawn, particularly the order and direction of the individual strokes. Both approaches are rather time-consuming, and being able to scan the unknown word using a mobile device would save a lot of time in the long run.

The goal of this work is to test and compare various methods applicable to OCR of characters used in the Japanese language and to create a mobile application which will be able to scan Japanese text using the camera of a mobile device and recognize the characters therein. Subsequently, the application will display the scanned word or words on its screen and provide the user with its reading - the way the word is pronounced in Japanese - and a translation into English.

1.2 Related Work

There are several mobile applications for OCR of Japanese text for Android and iOS. Mobile OCR by Smart Mobile Software⁶, OCR by Angeldroid Studio⁷, and OCR Instantly Free by TheSimplest.Net⁸ are OCR applications for Android with multi-language support. Mobile OCR can only recognize text in images or photos saved on the device. OCR offers support for Japanese, but its description admits that the performance is poor. OCR Instantly Free also works only on saved images and admits that its performance on Japanese is lacking. In any case, none of these three applications provides translation into another language; they only convert an image with text into machine-readable information about the characters in the text. Japanese Text/Kanji OCR by .works⁹ is an application focusing on recognition and translation of Japanese text, but according to its developers it is in the beta phase, and no update has been made since June 2013, which indicates that development may be on hold. The recognition takes a lot of time and the accuracy is low. Another application for Japanese OCR with built-in translation, Kanji Yomi by inda3¹⁰, is available on Android. It requires an internet connection to perform recognition, because it uses the services of the WeOCR project, which allows it to perform recognition on its servers using the free OCR engine for Japanese, NHocr, introduced below. However, the latest update was made in September 2010 and the ratings and reviews of the application are very mixed.

The situation on the App Store for iOS is much brighter. There are two applications for Japanese OCR with subsequent translation of the recognized text whose performance is outstanding. Japan Goggles by LucSens Oy¹¹ does not seem to be updated anymore - the latest update was made in May 2011 - but its performance on printed text with good contrast is very good, and its description says it can recognize over 3,000 characters.

⁵ Kanji can be looked up using so-called radicals, which are common patterns appearing in multiple kanji characters.
⁶ Mobile OCR - https://play.google.com/store/apps/details?id=com.smartmobilesoftware.mobileocrfree
⁷ OCR - https://play.google.com/store/apps/details?id=app.angeldroid.ocr
⁸ OCR Instantly Free - https://play.google.com/store/apps/details?id=com.thesimplest.ocr
⁹ Japanese Text/Kanji OCR - https://play.google.com/store/apps/details?id=space.works.jocrfree
¹⁰ Kanji Yomi - https://play.google.com/store/apps/details?id=jp.ne.biglobe.inda3.kanjiyomi
¹¹ Japan Goggles - http://japangoggles.lucsens.com/, https://itunes.apple.com/us/app/japan-goggles/id397724055?mt=8

A newcomer OCR application on the App Store is Yomiwa by Vivien Seguy¹², whose first version was released in 2013. It is frequently updated with improved performance and its recognition capabilities are very good. Neither Japan Goggles nor Yomiwa requires an internet connection to perform character recognition, and both applications have received positive ratings.

There are some mobile phones on the Japanese market which have built-in OCR systems, but their performance is limited [6]. Since these phones can only be used in Japan and are aimed at Japanese users, they only recognize a character or a compound of characters and provide the user with its reading together with an explanation of the meaning of the word in Japanese.

Besides the mobile applications for Japanese OCR mentioned above, there are also free OCR engines available which can be used for the development of OCR applications. They either support multiple languages or are made specifically for recognition of Japanese text. Two of them will be used for OCR engine performance comparison in this work.

• The first one is NHocr¹³, an OCR engine designed for recognition of machine-printed Japanese characters. According to the author, NHocr may be the first open-source Japanese OCR engine. Its development began in 2008 and the author describes it as a product of his weekend programming. NHocr uses the Peripheral Local Moment (P-LM) for feature extraction, proposed by Hori et al. in the late 1990s [7].

• The second one is Tesseract¹⁴, an open-source OCR engine sponsored by Google. It was dormant between 1995 and 2006, but it has been largely improved since then and has become one of the best known free OCR engines. It offers multi-language support, provided it is used with data trained for the required language, and many datafiles with trained data are available for download on the Tesseract website, including a trained datafile for Japanese.

1.3 Japanese Writing System

The modern Japanese writing system consists of a combination of three scripts: Chinese characters, usually called kanji, and two syllabaries, called hiragana and katakana. The Japanese started adopting Chinese characters, along with other cultural aspects, in the 4th century [8] because they lacked a writing system of their own. At first, Chinese characters were used for writing in Classical Chinese, and only later was a system for writing Japanese using Chinese characters developed. The system, called man’yōgana [8], assigned to kanji a phonetic value derived from their Chinese readings instead of their semantic meanings. The modern syllabaries hiragana and katakana are simplifications of man’yōgana [9]. A lot of new words and concepts which the Japanese language had no equivalent for came to Japan from China, and thus many words entered Japanese directly with pronunciations similar to the original Chinese ones. The reading derived from Chinese is known as on’yomi (lit. sound reading) and words using this reading are classified as Sino-Japanese. However, the native Japanese language included a lot of words that were equivalent in meaning to kanji borrowed from Chinese, and literate people started using kanji to represent these words. This reading is referred to as kun’yomi and is used together with on’yomi in modern Japanese. A kanji can have no, one, or multiple kun’yomi and on’yomi readings, and the reading depends on

¹² Yomiwa - http://www.yomiwa.net/, https://itunes.apple.com/us/app/yomiwa-japanese-camera-translator/id670931120?mt=8
¹³ https://code.google.com/p/nhocr/
¹⁴ https://code.google.com/p/tesseract-ocr/


Figure 1.1: Examples of hiragana (left), katakana (middle), and kanji (right) characters.

the word in which it is used. For example, the character 通 is read tō in the word 通る (tōru, to go by), kayo in the word 通う (kayou, to attend (school, etc.)), dō in the word 裏通り (uradōri, side street), and tsū in the word 通貨 (tsūka, currency). In modern Japanese, kanji are usually used for words that carry the content of the text:

• nouns, e.g. 女 (onna, woman), 音楽 (ongaku, music).

• stems of verbs and adjectives, e.g. 話 in 話す (hanasu, to speak), 詳 in 詳しい (kuwashii, detailed).

• personal and place names, e.g. 佐藤 (Satō), 京都 (Kyōto). However, certain names are written in hiragana, katakana, or a combination of syllabaries and kanji.

The roots of the modern hiragana and katakana syllabaries go back to the 9th century. Both syllabaries have undergone changes over time and were finally codified in 1900 [9]. Hiragana consists of 48 unique characters, 2 of which are obsolete in contemporary Japanese. The use of diacritic marks and digraphs (syllables written with two unique characters) extends this set to 94 syllables. Hiragana is usually used for these script elements:

• native or naturalized Japanese words which lack a kanji or whose kanji equivalent is not in common use.

• okurigana (送り仮名) - grammatical patterns for inflectional endings of verbs and adjectives, e.g. える in 変える (kaeru, to change) and its corresponding past tense form えた in 変えた (kaeta, changed), or い in 薄い (usui, thin) and its past tense form かった in 薄かった (usukatta, was thin).

• joshi (助詞) - grammatical particles marking sentence topics, subjects, and objects, having a purpose similar to prepositions in English, e.g. を in 彼を (kare wo, him) to mark the sentence object.

• furigana (振り仮名) - a phonetic representation of kanji placed above or beside the kanji characters, usually used in books for children and non-native speakers, or when a kanji character is uncommon or its intended meaning is unusual, e.g. ろうそく written over 蝋燭 (rōsoku, candle).

Katakana comprises the same number of characters as hiragana, that is 48 unique characters, 2 of which are obsolete and 1 of which is uncommon in modern Japanese. Similarly to hiragana, katakana is extended by diacritic marks and digraphs to represent 94 distinct syllables. Katakana is commonly used for:

• transliteration of foreign words and names, e.g. パソコン (pasokon, personal computer), カリフォルニア (kariforunia, California).


• names of animals and plants which have no equivalent in kanji or whose equivalent is not in common use, e.g. イルカ (iruka, dolphin).

• emphasis on a word, similar to capitalization in the Latin alphabet.

• phonomimes (words mimicking actual sounds), e.g. ケロケロ (kerokero, ribbit - a frog sound), phenomimes (words depicting non-auditory senses), e.g. キラキラ (kirakira, being sparkling), and psychomimes (words describing psychological states), e.g. ワクワク (wakuwaku, being excited).

Modern Japanese also uses the Latin alphabet and Arabic numerals for abbreviations, foreign names and phrases, or to invoke a foreign flavor. However, since the aim of the proposed application is to help the user understand a Japanese word which they cannot read, the Latin alphabet and Arabic numerals are not crucially important and are not included in the set of supported characters for the testing of the optical character recognition methods which will be introduced in the following chapters.

1.4 Thesis Structure

• Chapter 1, Introduction, gives a brief review of the history of OCR and explains the complexity of the Japanese writing system.

• Chapter 2, Dataset Generation, describes the creation of the training data and the set of characters which are to be supported and recognizable by the OCR engine.

• Chapter 3, Preprocessing and Feature Extraction, describes the transformations which an image of a character undergoes before it is used for training or before it is classified. Extraction of features from the image is explained as well.

• Chapter 4, Learning and Classification, introduces the procedures and classification methods which have been tested and compared, and presents the results that have been achieved.

• Chapter 5, Text Extraction, explains the methods for extraction of pure text from a captured picture which have been tested and used.

• Chapter 6, Implementation, gives information about the implementation of the mobile application and the execution of training and testing of the various methods introduced in other chapters.

• Chapter 7, Conclusion, recaps and summarizes what has been performed and accomplished, and what needs to be done in the future.

Chapter 2 Dataset Generation

A program for optical character recognition needs a classifier which will be able to perform recognition and classification of individual characters.

Definition 1. Classification in a classifier is executed by a classification function Y. It is a function that assigns a class label yi to an input feature vector xi.

In order to create a classifier, it is necessary to generate a dataset which will be used for its supervised learning.

Definition 2. Supervised learning is a task where a classification function Y is derived from labeled training data T. The training data consists of training samples, and each sample is a pair of an input object xi and its desired output value yi.

• Input: T = {(x1, y1), ..., (xn, yn)}

• Output: Y

The dataset will consist of pairs of images of individual characters and their hex code numbers. Only the characters in the dataset used for training of a classifier can be recognized; it is impossible for a classifier to recognize something that it does not know. This chapter describes the creation of such a dataset.

2.1 Supported Characters

To make a dataset, samples of each character that the classifier should recognize have to be acquired. It is clear that the classifier has to be able to recognize every character of the hiragana and katakana syllabaries, except the obsolete ones. However, it is impossible to state an exact number of kanji used in modern Japanese, since there is a large number of kanji carrying a meaning in Japanese which are not in common use. The Japanese Ministry of Education, Culture, Sports, Science and Technology maintains a list of kanji which students are expected to understand after 12 years of education, called jōyō kanji (lit. kanji for everyday use) [10]. The current list comprises 2,136 characters, which is 191 more than prior to the latest update issued in 2010¹. However, one can easily encounter a character that is not on the list of jōyō kanji, and some characters absent from this list can actually be seen more often than certain characters included on it. Therefore, creating a list of kanji which should be supported by an optical recognition application is, in itself, a task which requires a certain amount of effort.

¹ 196 new characters were added and 5 characters were removed from the list. https://en.wikipedia.org/wiki/J%C5%8Dy%C5%8D_kanji


Using only the kanji in the jōyō kanji list would cover most kanji one can see on an everyday basis, but it would be far from exhaustive. The Japanese encoding standard JIS X 0208² features two lists of kanji. There are 2,965 characters in the level 1 set, which includes the more common kanji, and another 3,390 characters in the level 2 set, containing the less frequent ones. Although the level 2 set consists mostly of kanji which are not in common use, there are quite a few characters in it which are not exactly uncommon. Therefore, opting for the level 1 set, which is what some researchers engaged in Japanese OCR do for experiments [11], would not be the best option for an application whose goal is to be practical to use. In the end, I have decided to use the list of the 2,500 most frequent kanji in newspapers [12], derived from the KANJIDIC file [13], as the basis for the list of supported characters. It takes into account only usage in newspapers, which makes it biased towards the kanji likely to be found in newspapers; however, it provides a much better and more practical foundation than the list of jōyō kanji or the level 1 set of the JIS X 0208 standard. The list was then extended by the jōyō kanji which are not listed among the 2,500 most frequent kanji in newspapers, adding up to 2,599 characters. After that, I proceeded to a manual selection from the jinmeiyō kanji list (lit. kanji for use in personal names) and from kanji which do not appear on the jinmeiyō kanji list, the jōyō kanji list, or the list of the 2,500 most frequent kanji in newspapers, but often appear in literature, specialized texts, and other content where less common kanji do appear. The finalized list consists of 73 hiragana characters with and without diacritic marks, 73 katakana characters with and without diacritic marks, and 2,762 kanji characters, summing up to 2,908 characters in total.

2.2 Character Image Generation

It was essential to acquire images of all characters that are to be supported in order to prepare the training data. 82 free digital font files were used for this purpose, and bitmap images of individual characters were extracted from these files by means of the Batik Java library³ for manipulation of scalable vector graphics. The extracted images are grayscale, and all further processing is performed in 8-bit grayscale color depth to prevent the undesirable effect of aliasing of character strokes, which would have a negative influence on recognition performance because feature extraction in this work is based on stroke direction. With aliasing, the stroke directions would be distorted and the extracted information would not be sufficient. The final training dataset consists of pairs of character images and Unicode hex code numbers that specify which class the character belongs to, i.e., which character is in the image.

² https://en.wikipedia.org/wiki/JIS_X_0208
³ Apache™ Batik SVG Toolkit - http://xmlgraphics.apache.org/batik/
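The extraction itself was done with the Batik Java toolkit; purely as an illustration of the idea, the following is a minimal Python sketch that rasterizes one character from one font file with Pillow. The font path, canvas size, and glyph centering are assumptions of this sketch, not parameters taken from the thesis.

```python
from PIL import Image, ImageDraw, ImageFont

def render_character(char: str, font_path: str, canvas: int = 128) -> Image.Image:
    """Render one character as an 8-bit grayscale bitmap (dark glyph on white background)."""
    font = ImageFont.truetype(font_path, size=int(canvas * 0.8))
    img = Image.new("L", (canvas, canvas), color=255)     # "L" = 8-bit grayscale
    draw = ImageDraw.Draw(img)
    left, top, right, bottom = draw.textbbox((0, 0), char, font=font)
    x = (canvas - (right - left)) // 2 - left             # center the glyph bounding box
    y = (canvas - (bottom - top)) // 2 - top
    draw.text((x, y), char, fill=0, font=font)
    return img

# Hypothetical usage: render 忠 (U+5FE0) from one of the font files.
# render_character("忠", "fonts/SomeJapaneseFont.ttf").save("u5fe0.png")
```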

Chapter 3 Preprocessing and Feature Extraction

Every image in the dataset of character images is processed to acquire a feature vector which can be used for training of a classification model. The image processing consists of several phases. First, the images are normalized on the gray scale so that their pixel intensities span the whole color space. Next, the shape of the character in the image is normalized in order to reduce differences between various images of the same character, which will make their feature vectors closer to each other in the feature space and improve separability of each class. Gaussian filtering is performed after the shape normalization to smooth rough and jagged stroke edges. Finally, a feature vector which will be further used for training of a classifier is extracted from the image. The same process is applied when extracting a feature vector from an image of a character which is to be classified. Each phase of the process is described in detail in the following sections.

3.1 Grayscale Normalization

The grayscale image is normalized so that the intensity values of its pixels range from 0 to 255 and utilize the whole 8-bit space [14], thus enhancing the contrast between the text and the background. Linear normalization is applied to achieve this. Let A(m, n) be an input image of size M × N, with m = 1, 2, ..., M and n = 1, 2, ..., N. The output image B(m, n) is defined as:

B(m, n) = 255 · (A(m, n) − Amin) / (Amax − Amin)    (3.1)

Figure 3.1: Examples of grayscale normalization performed on different images.
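A minimal NumPy sketch of the linear grayscale normalization in (3.1); the guard for a constant image is an addition of this sketch, since the formula itself assumes Amax ≠ Amin.

```python
import numpy as np

def normalize_grayscale(a: np.ndarray) -> np.ndarray:
    """Stretch pixel intensities linearly to the full 0-255 range, eq. (3.1)."""
    a = a.astype(np.float64)
    a_min, a_max = a.min(), a.max()
    if a_max == a_min:                        # constant image: nothing to stretch
        return np.zeros_like(a, dtype=np.uint8)
    b = 255.0 * (a - a_min) / (a_max - a_min)
    return b.round().astype(np.uint8)
```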

3.2 Shape Normalization

Shape normalization is performed by mapping the input image on an image plane of a given size. Let A(x, y) be the input image and B(x, y) the normalized image. For 1D normalization

methods, each coordinate (x, y) in the input is mapped to a coordinate (x′, y′) in the output by a pair of mapping functions:

x′ = f(x)
y′ = g(y)

This is called forward mapping. Alternatively, it is possible to use backward mapping and map each coordinate (x′, y′) in the normalized output image to a coordinate (x, y) in the input image:

x = f⁻¹(x′)
y = g⁻¹(y′)

I have implemented and tested three normalization methods: linear normalization (LN), moment normalization (MN), and bi-moment normalization (BMN). MN and BMN for Chinese characters were first described by Liu et al. [15].

3.2.1 Aspect Ratio Adaptive Normalization

The final image for feature extraction will be square; however, in order to preserve the original aspect ratio of a character up to some extent and reduce the deformation of oblong characters, aspect ratio adaptive normalization (ARAN) [16] is adopted to calculate the output width and height of the characters for shape normalization. Let W1 be the width of the input image and H1 be its height. The aspect ratio R1 is defined as:

R1 = H1 / W1   if H1 < W1,
R1 = W1 / H1   otherwise.

If W1 is greater than H1, the output width W2 is fixed to the desired size and the output height is computed; otherwise, the output height H2 is fixed. The aspect ratio R2 of the output image and the size of the remaining side are calculated by:

R2 = sqrt( sin( (π/2) · R1 ) )
H2 = W2 · R2   if H1 < W1,
W2 = H2 · R2   otherwise.    (3.2)
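A short sketch of ARAN per (3.2). The fixed side of the output image is passed as `out_size`, a parameter of this sketch; the thesis only states that the longer side is set to the desired size.

```python
import numpy as np

def aran_output_size(w1: int, h1: int, out_size: int = 64) -> tuple[int, int]:
    """Return (W2, H2) of the normalized image according to ARAN, eq. (3.2)."""
    r1 = min(w1, h1) / max(w1, h1)               # input aspect ratio R1 <= 1
    r2 = np.sqrt(np.sin(np.pi / 2.0 * r1))       # output aspect ratio R2
    if h1 < w1:                                  # width is the longer side: fix W2
        return out_size, max(1, round(out_size * r2))
    return max(1, round(out_size * r2)), out_size   # otherwise fix H2
```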

3.2.2 Linear Normalization

The first normalization method I have implemented is linear normalization [17]. Linear normalization aligns the boundaries of the input image to the boundaries of the output image. It linearly shrinks or extends the image of size W1 × H1 in both dimensions to match the desired size W2 × H2. Let A be the input image and B the normalized output image. The forward mapping functions which map each pixel A(x, y) in the input image to a pixel B(x′, y′) in the normalized image are then defined as:

x′ = (W2 / W1) · x
y′ = (H2 / H1) · y    (3.3)


Since the mapping described in (3.3) is not guaranteed to be surjective, backward mapping is performed instead to make sure that each pixel in the normalized image receives a value from the input and no blank space is generated. Each pixel B(x′, y′) in the normalized output image is mapped to a pixel A(x, y) in the input image and receives its value. The mapping functions are as follows:

x = (W1 / W2) · x′
y = (H1 / H2) · y′    (3.4)
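A sketch of linear normalization using the backward mapping of (3.4); nearest-neighbour sampling is an assumption of this sketch, since the text does not specify how non-integer source coordinates are handled.

```python
import numpy as np

def linear_normalize(a: np.ndarray, w2: int, h2: int) -> np.ndarray:
    """Map every output pixel back to the input image, eq. (3.4)."""
    h1, w1 = a.shape
    src_x = np.clip((w1 / w2 * np.arange(w2)).astype(int), 0, w1 - 1)   # x = (W1/W2) x'
    src_y = np.clip((h1 / h2 * np.arange(h2)).astype(int), 0, h1 - 1)   # y = (H1/H2) y'
    return a[np.ix_(src_y, src_x)]
```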

3.2.3 Moment Normalization

Since the strokes of a character are sometimes, particularly in handwriting, not evenly spread but accumulated in one part of the character, the centroid of the character strokes is shifted away from the geometric center of the character, the visual information about the character gets condensed in a smaller area, and the shape deviates from the standard shape of the character used in print publications. Stretching the condensed part of the character would spread the visual information and make the shape resemble the standard shape more closely. Linear normalization cannot do that, because it can only change the shape linearly, which means that it cannot stretch one part of the character more than the rest. Therefore, a non-linear method is needed to perform this task.

The moment normalization method [15] aligns the centroid¹ of the input image (cx, cy) with the geometric center of the normalized output image (x′c, y′c), defined as (x′c, y′c) = (W2/2, H2/2). The width W1 and height H1 of the input image are replaced in the calculations by a modified width δx and height δy, derived from the second-order 1D central moments µ20 and µ02:

δx = 4 · sqrt(µ20)
δy = 4 · sqrt(µ02)

Consequently, the boundaries of the input image, (0, W1) and (0, H1), are shifted to (cx − δx/2, cx + δx/2) and (cy − δy/2, cy + δy/2). The definition of the central moment [18] is given by:

µij = Σ_{x=1}^{X} Σ_{y=1}^{Y} (x − cx)^i (y − cy)^j f(x, y)    (3.5)

In this case, the function f(x, y) denotes the pixel intensity A(x, y) at position [x, y] in the image. Using the definition to calculate the central moment is not the most efficient method to do so. It is possible to calculate central moments from several raw moments, which are easier to calculate. Raw moments Mij for discrete 2D functions are calculated by:

Mij = Σ_{x=1}^{X} Σ_{y=1}^{Y} x^i y^j f(x, y)    (3.6)

The centroid (cx, cy) of an image can be calculated by the following formulas, where M00 is the

¹ Centroid refers to the centroid of character stroke pixels here.

sum of pixel intensities over the whole image:

cx = ( Σ_{x=1}^{X} Σ_{y=1}^{Y} x · A(x, y) ) / ( Σ_{x=1}^{X} Σ_{y=1}^{Y} A(x, y) ) = M10 / M00
cy = ( Σ_{x=1}^{X} Σ_{y=1}^{Y} y · A(x, y) ) / ( Σ_{x=1}^{X} Σ_{y=1}^{Y} A(x, y) ) = M01 / M00    (3.7)

Starting with the definition of the second order central moment, one can derive a formula consisting solely of raw moments.

µ20 = Σ_{x=1}^{X} Σ_{y=1}^{Y} (x − cx)² (y − cy)⁰ A(x, y)
    = Σ_{x=1}^{X} Σ_{y=1}^{Y} (x − cx)² A(x, y)
    = Σ_{x=1}^{X} Σ_{y=1}^{Y} (x² − 2x·cx + cx²) A(x, y)
    = Σ_{x=1}^{X} Σ_{y=1}^{Y} x² A(x, y) − Σ_{x=1}^{X} Σ_{y=1}^{Y} cx (2x − cx) A(x, y)
    = M20 − cx Σ_{x=1}^{X} Σ_{y=1}^{Y} (x + x − cx) A(x, y)
    = M20 − cx ( Σ_{x=1}^{X} Σ_{y=1}^{Y} x · A(x, y) + Σ_{x=1}^{X} Σ_{y=1}^{Y} (x − cx) A(x, y) )
    = M20 − cx ( Σ_{x=1}^{X} Σ_{y=1}^{Y} x · A(x, y) + 0 )
    = M20 − cx M10    (3.8)

The derivation of the final formula for the second-order central moment µ02 using the raw moments is analogous, only the x coordinates and the centroid component cx are replaced with y and cy, respectively. The coordinate mapping for moment normalization is then calculated by:

x′ = (W2 / δx) · (x − cx) + x′c
y′ = (H2 / δy) · (y − cy) + y′c    (3.9)

For the same reason as in linear normalization, backward mapping is used to set a value to each pixel of the normalized image. Coordinates (x, y) in the original image will be calculated by:

x = δx · (x′ − x′c) / W2 + cx
y = δy · (y′ − y′c) / H2 + cy    (3.10)

The coordinates (x, y) range from (cx − δx/2, cy − δy/2) to (cx + δx/2, cy + δy/2), as the boundaries of the input image have been shifted. To retrieve a pixel from the input image, it is necessary to calculate the original coordinates. Let (xo, yo) be the original coordinates in the range from (0, 0) to (W1, H1), and (xs, ys) be the shifted coordinates in the range from (cx − δx/2, cy − δy/2) to (cx + δx/2, cy + δy/2). It is obvious that f(cx − δx/2) = 0, f(cx + δx/2) = W1, and also f(cx) = cx. Using the


Lagrange polynomial for interpolation, the coordinate xo is calculated by:

xo = Lx(xs) = f(x1) · ((xs − x2)/(x1 − x2)) · ((xs − x3)/(x1 − x3)) + f(x2) · ((xs − x1)/(x2 − x1)) · ((xs − x3)/(x2 − x3)) + f(x3) · ((xs − x1)/(x3 − x1)) · ((xs − x2)/(x3 − x2))

   = cx · ((xs − (cx − δx/2))/(cx − (cx − δx/2))) · ((xs − (cx + δx/2))/(cx − (cx + δx/2))) + W1 · ((xs − (cx − δx/2))/((cx + δx/2) − (cx − δx/2))) · ((xs − cx)/((cx + δx/2) − cx))

   = 2W1 · (xs² − 2xs·cx + cx² + (δx/2)(xs − cx)) / δx² − 4cx · (xs² + cx² − δx²/4 − 2xs·cx) / δx²    (3.11)

Similarly, one can derive the function for coordinate y:

yo = Ly(ys) = 2H1 · (ys² − 2ys·cy + cy² + (δy/2)(ys − cy)) / δy² − 4cy · (ys² + cy² − δy²/4 − 2ys·cy) / δy²    (3.12)
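The moment normalization chain (3.7)-(3.12), condensed into one NumPy sketch. Two details are assumptions of the sketch rather than statements of the text: the intensities are inverted so that dark stroke pixels carry the weight of the moments (in line with footnote 1), and the central moments are divided by the total mass M00 before taking δx = 4·sqrt(µ20), so that δx spans four standard deviations of the stroke distribution as in Liu et al. [15].

```python
import numpy as np

def moment_normalize(a: np.ndarray, w2: int, h2: int) -> np.ndarray:
    """Centroid to the center, size from second-order moments, backward quadratic mapping."""
    h1, w1 = a.shape
    f = 255.0 - a.astype(np.float64)                    # assumption: dark strokes carry the weight
    y, x = np.mgrid[1.0:h1 + 1, 1.0:w1 + 1]

    m00 = f.sum()
    cx, cy = (x * f).sum() / m00, (y * f).sum() / m00   # eq. (3.7)
    mu20 = (x * x * f).sum() - cx * (x * f).sum()       # eq. (3.8): M20 - cx*M10
    mu02 = (y * y * f).sum() - cy * (y * f).sum()
    dx = 4.0 * np.sqrt(mu20 / m00)                      # normalized by M00 (see lead-in)
    dy = 4.0 * np.sqrt(mu02 / m00)

    # Backward mapping: output pixel -> shifted coordinate (3.10) -> original coordinate (3.11)/(3.12).
    xs = dx * (np.arange(1, w2 + 1) - w2 / 2.0) / w2 + cx
    ys = dy * (np.arange(1, h2 + 1) - h2 / 2.0) / h2 + cy
    tx, ty = xs - cx, ys - cy
    xo = 2.0 * w1 * tx * (tx + dx / 2.0) / dx**2 - 4.0 * cx * (tx**2 - dx**2 / 4.0) / dx**2
    yo = 2.0 * h1 * ty * (ty + dy / 2.0) / dy**2 - 4.0 * cy * (ty**2 - dy**2 / 4.0) / dy**2

    xi = np.clip(np.round(xo).astype(int) - 1, 0, w1 - 1)   # back to 0-based indices
    yi = np.clip(np.round(yo).astype(int) - 1, 0, h1 - 1)
    return a[np.ix_(yi, xi)]
```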

3.2.4 Bi-moment Normalization

In a similar fashion to moment normalization, the bi-moment normalization [15] method aligns the centroid of the image to the geometric center of the output image. Whilst the moment normalization method calculates one central moment for the x-coordinate and one for the y-coordinate and resets the image boundaries so that the centroid is in the middle of the computed interval, the bi-moment normalization method utilizes four central moments, two for each coordinate. For the x-coordinate, one central moment is calculated in the part of the image left of the centroid, and the other one in the part on the right. Similarly, for the y-coordinate, one central moment is calculated in the part below the centroid, and the other one above the centroid. The width and height are not treated as symmetrical around the centroid.

µx⁻ = Σ_{x<cx} Σ_{y=1}^{Y} (x − cx)² A(x, y),   µx⁺ = Σ_{x≥cx} Σ_{y=1}^{Y} (x − cx)² A(x, y),
µy⁻ = Σ_{y<cy} Σ_{x=1}^{X} (y − cy)² A(x, y),   µy⁺ = Σ_{y≥cy} Σ_{x=1}^{X} (y − cy)² A(x, y).    (3.13)

The image boundaries (0, W1) and (0, H1) are shifted to (cx − δx⁻, cx + δx⁺) and (cy − δy⁻, cy + δy⁺), where

δx⁻ = 2 · sqrt(µx⁻),   δx⁺ = 2 · sqrt(µx⁺),   δy⁻ = 2 · sqrt(µy⁻),   δy⁺ = 2 · sqrt(µy⁺).

Mapping is performed identically to moment normalization, as shown in equation (3.10). However, as the centroid of the image is not in the middle of the interval as in the moment method, the mapping from the output image to the bi-moment coordinates is not linear; thus it is necessary to perform interpolation twice, first to map the output image coordinates to the bi-moment coordinate system and then to map the bi-moment coordinates to the original input image coordinates.
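For comparison with the moment method, a sketch of the one-sided quantities of bi-moment normalization, eq. (3.13). As in the previous sketch, the intensities are inverted so that stroke pixels carry the weight, and each one-sided moment is normalized by the mass of its half of the image before δ is taken; the text leaves that normalization implicit, so treat it as an assumption.

```python
import numpy as np

def bimoment_bounds(a: np.ndarray):
    """Return (cx, cy, dx_minus, dx_plus, dy_minus, dy_plus) from eq. (3.13)."""
    h1, w1 = a.shape
    f = 255.0 - a.astype(np.float64)                 # assumption: dark strokes carry the weight
    y, x = np.mgrid[1.0:h1 + 1, 1.0:w1 + 1]
    m00 = f.sum()
    cx, cy = (x * f).sum() / m00, (y * f).sum() / m00

    def one_sided(coord, c, mask):
        """Second-order moment of one half of the image, normalized by that half's mass."""
        return (((coord - c) ** 2 * f)[mask]).sum() / f[mask].sum()

    mux_m, mux_p = one_sided(x, cx, x < cx), one_sided(x, cx, x >= cx)
    muy_m, muy_p = one_sided(y, cy, y < cy), one_sided(y, cy, y >= cy)
    # Boundaries become (cx - dx_minus, cx + dx_plus) and (cy - dy_minus, cy + dy_plus).
    return cx, cy, 2 * np.sqrt(mux_m), 2 * np.sqrt(mux_p), 2 * np.sqrt(muy_m), 2 * np.sqrt(muy_p)
```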


In Figure 3.2, you can see examples of shape normalization by means of linear, moment, and bi-moment normalization methods. The first row shows the original trimmed characters in a square box, the second row contains the same characters after linear normalization, the third row are shape-normalized characters using moment normalization, and characters in the fourth row have been normalized by the bi-moment method. The first example is very similar in all four cases, because it uses a standard font which you could find for example in books and newspapers. The second example shows that moment-based normalization has a huge advantage over linear normalization in case that there remains some undesirable noise in the image. Small noise or a blot cannot alter the centroid of the image very much, therefore the centroid in the original image is shifted heavily to the right side and is located approximately in the middle of the actual character. Moment and bi-moment normalization transforms the image so that the centroid is in the geometric center of the output image. Both linear and moment-based normalization methods are global transformations and have a desirable property that the shape of the character is not heavily deformed by the transformation, which could happen if a local transformation method was used instead [15].

Figure 3.2: Examples of shape normalization performed on different images. First row: cropped original images fit into a square; Second row: images after linear normalization; Third row: images after moment normalization; Fourth row: images after bi-moment normalization.

3.2.5 Smoothing

Gaussian blur is applied after shape normalization in order to smooth character strokes and alleviate the effect of ruggedness, which could be present in the original image of a character or might be caused by shape normalization. The filter is applied by convolving the image with a Gaussian function [19], [20]. The one-dimensional Gaussian function without the normalization

constant,

G(x) = exp( −x² / (2σ²) )    (3.14)

is used to fill the kernel for convolution. The size of the kernel is set to ⌈6σ + 1⌉ (⌈·⌉ is the ceiling function), where σ denotes the standard deviation of the Gaussian distribution, and the values are normalized so that their sum is unity. The size of the kernel is chosen by the aforementioned rule so that the kernel coefficients make up most of the sum of coefficients over the Gaussian function. In other words, increasing the size of the kernel would have virtually no effect on the result, because the coefficients in the added area would be small enough to be considered zero, as the following example shows. Let σ be 1. Then the 1D kernel is

0.004 0.054 0.242 0.399 0.242 0.054 0.004

A 2D Gaussian blur is a separable filter, which means it can be expressed as a product of two vectors [21]. Therefore, it can be applied to a 2D image as two independent one-dimensional operations: the image is first blurred horizontally and then vertically (or vice versa). Using this approach for filtering is more efficient in computational terms, as it runs in O(Wker · Wimg · Himg) + O(Hker · Wimg · Himg) time, while the two-dimensional approach takes O(Wker · Hker · Wimg · Himg) time.
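A sketch of the separable smoothing described above: a 1D kernel of size ⌈6σ + 1⌉ filled from (3.14) and normalized to unit sum, applied once along the rows and once along the columns. Forcing the kernel size to be odd is an addition of this sketch so that the kernel stays centered.

```python
import numpy as np

def gaussian_kernel_1d(sigma: float) -> np.ndarray:
    """1D Gaussian kernel of size ceil(6*sigma + 1), normalized so its values sum to 1."""
    size = int(np.ceil(6.0 * sigma + 1.0))
    if size % 2 == 0:
        size += 1                                    # keep the kernel centered (sketch-only)
    x = np.arange(size) - size // 2
    k = np.exp(-x**2 / (2.0 * sigma**2))             # eq. (3.14), without the constant
    return k / k.sum()

def gaussian_blur(img: np.ndarray, sigma: float) -> np.ndarray:
    """Separable Gaussian blur: one 1D convolution per axis."""
    k = gaussian_kernel_1d(sigma)
    out = np.apply_along_axis(lambda row: np.convolve(row, k, mode="same"), 1, img.astype(float))
    out = np.apply_along_axis(lambda col: np.convolve(col, k, mode="same"), 0, out)
    return out
```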


Figure 3.3: Example of an image before applying a Gaussian filter of σ = 2 on the left and after filtering on the right.

3.3 Feature Extraction

Most kanji are composed of several parts which may be an existing kanji themselves (see Figure 3.4), or which at least reappear in multiple characters. Patterns that appear frequently are called radicals, and they have been traditionally used for finding kanji characters in dictionaries. The six most common radicals make up as much as 25% of jōyō kanji². It might be possible to make use of the fact that most kanji consist of several patterns and turn the process of recognizing a complex character into a task of recognizing several simpler patterns. However, it would be necessary to correctly divide the character into several distinct patterns, which would be a demanding task because there is no rule as to how many common patterns can be found in a kanji and how to divide the kanji to extract them. Therefore, it sounds more reasonable to adopt a more conservative statistical approach.

Figure 3.4: An example of a character consisting of multiple existing kanji. The character 忠 - meaning loyalty, fidelity - is composed of two existing kanji characters: 中 - meaning center, inside - and 心 - meaning heart, mind.

² https://en.wikipedia.org/wiki/Table_of_Japanese_kanji_radicals


3.3.1 Zonal Line Density Method

The first algorithm for feature extraction which I have implemented and tried out - building on the introduction to OCR approaches in a work by Line Eikvil [2] - is a zonal approach in which the character is divided into several blocks and features are calculated in each of these blocks separately. The algorithm calculates the density of pixel lines and outputs these measurements as a feature vector that can be used for training of a classifier and subsequently for classification of new samples. The process is described below:

(1) Resize the grayscale image to 120 × 120 pixels and add 4 pixel large margins on each side.

(2) Normalize the pixel intensities such that their mean is 0 and they have unit variance. Let A(x, y) be the input matrix and B(x, y) the output matrix. The output B(x, y) is defined as:

B(x, y) = (A(x, y) − Ā) / sqrt( (1 / (MN − 1)) · Σ_{m=1}^{M} Σ_{n=1}^{N} (A(m, n) − Ā)² )    (3.15)

(3) Divide the image into 64 blocks of 16 × 16 pixels in size, that is 8 blocks in a row and 8 blocks in a column.

(4) Sum up the pixel values in 1 × 16 pixel columns in every block and divide it by 16. This will yield 1,024 values which will be used as a feature vector. Let aij be a pixel in a block whose value ranges from 0.0 to 1.0.

vj = (1/16) · Σ_{i=1}^{16} aij,   j = 1, 2, ..., 16    (3.16)

(5) Sum up the pixel values in every row and every column of the whole 128 × 128 pixel large image and divide it by 128. This will yield another 256 values for a feature vector, adding up to 1280-dimensional feature vector.

cj = (1/128) · Σ_{i=1}^{128} aij,   j = 1, 2, ..., 128
ri = (1/128) · Σ_{j=1}^{128} aij,   i = 1, 2, ..., 128    (3.17)

(6) Scale the values in the feature vector so that they range from 0.0 to 1.0.

The above algorithm is simple, and while its results are not exactly poor, it does not reach the desired recognition rate, as will be shown in the Results section.
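A sketch of the zonal line density extractor from steps (1)-(6) above. The nearest-neighbour resize and the white-background padding value are simplifications of this sketch; the thesis does not state which resampling method or border value was used.

```python
import numpy as np

def nn_resize(a: np.ndarray, size: int) -> np.ndarray:
    """Nearest-neighbour resize to size x size (stand-in for a proper resampler)."""
    h, w = a.shape
    yi = (np.arange(size) * h / size).astype(int)
    xi = (np.arange(size) * w / size).astype(int)
    return a[np.ix_(yi, xi)]

def zonal_line_density_features(a: np.ndarray) -> np.ndarray:
    """Steps (1)-(6): 64 blocks x 16 column means + 128 row means + 128 column means."""
    img = nn_resize(a.astype(np.float64), 120)
    img = np.pad(img, 4, mode="constant", constant_values=img.max())   # 4-pixel margins -> 128 x 128
    img = (img - img.mean()) / img.std(ddof=1)                         # eq. (3.15)

    feats = []
    for by in range(8):                                 # 8 x 8 blocks of 16 x 16 pixels
        for bx in range(8):
            block = img[by * 16:(by + 1) * 16, bx * 16:(bx + 1) * 16]
            feats.extend(block.mean(axis=0))            # eq. (3.16): mean of every 1 x 16 column
    feats.extend(img.mean(axis=1))                      # eq. (3.17): row means of the whole image
    feats.extend(img.mean(axis=0))                      #             column means of the whole image

    v = np.asarray(feats)                               # 1024 + 128 + 128 = 1280 values
    return (v - v.min()) / (v.max() - v.min())          # step (6): rescale to [0, 1]
```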

3.3.2 Histogram of Oriented Gradients in Local Regions

A more sophisticated algorithm, suggested by Dong et al. [22], uses edge detection and a histogram of oriented gradients (HOG) to analyze the stroke characteristics. The algorithm is a modification of a method proposed by Kimura et al. [23] in 1997, which uses a histogram of chain codes in local regions and works on binary images. By applying a similar idea on grayscale


images using oriented gradients instead of chain codes, Dong et al. were able to achieve better results.

Figure 3.5: Visualized gradient strength of an image of the character ゼ (left) and the gradient direction of the same image (right).

This algorithm has been proven to yield very good results on various writing systems [24]. In this work, the algorithm is slightly modified in that the pixel intensities are normalized so that they have unit variance and their mean is 0, which provided slightly better results than the approach in the original version of the algorithm, where the pixel intensities are scaled so that they range from 0 to 1. The feature extraction process is as follows:

(1) Resize the grayscale image to 64 × 64 pixels and center it in a 80 × 80 pixel large square by adding 8 pixel margins on each side. This is performed in order to utilize the peripheral strokes as much as the strokes closer to the middle.

(2) Normalize the pixel intensities such that their mean is 0 and they have unit variance (3.15).

(3) Calculate the image gradient strengths and gradient directions in each pixel using the Roberts cross operator for edge detection [25]. First, the original image is convolved with the kernels:

[ 1   0 ]        [ 0   1 ]
[ 0  −1 ]   ,    [ −1  0 ]

Let A(x, y) be a pixel in the original image and G1(x, y), G2(x, y) be the values produced by convolving the original pixel with the first and second kernel, respectively. The gradient strength ∇ is then defined by the following equation:

∇A(x, y) = sqrt( G1² + G2² )    (3.18)

The gradient direction θ is defined as:

θA(x, y) = atan2 (G2(x, y),G1(x, y)) (3.19)

(4) Divide the image into 16 × 16 pixel large overlapping blocks such that there are 9 blocks in each row and column. Each block overlaps exactly half of each of its adjacent blocks. There will be 81 blocks in total.


(5) Divide each block into 4 subareas such that subarea S1 is 4 × 4 pixels in the middle of the block, subarea S2 is 8 × 8 pixels in the middle exclusive of S1, subarea S3 is 12 × 12 pixels exclusive of S1 and S2, and subarea S4 is the whole 16 × 16 pixel block exclusive of S1, S2, and S3. See Figure 3.6.

(6) In each block, sum up the gradient strengths over 32 quantized gradient directions in each subarea S1, S2, S3, S4, creating a histogram of oriented gradients. Use a mask vector (4, 3, 2, 1) for the subareas to eliminate the difference in area size of each subarea S1, S2, S3, S4. The result will be 9 × 9 32-dimensional vectors xk.

ai = ∇A(x, y),   A(x, y) ∈ S1 ∪ S2 ∪ S3 ∪ S4
vi = θA(x, y),   A(x, y) ∈ S1 ∪ S2 ∪ S3 ∪ S4

xk = Σ_{vi ∈ I} ( 4 · Σ_{ai ∈ S1} ai + 3 · Σ_{ai ∈ S2} ai + 2 · Σ_{ai ∈ S3} ai + Σ_{ai ∈ S4} ai )    (3.20)

I = ⟨ −180 + (k − 1) · (360/32) ; −180 + k · (360/32) )
k = 1, 2, ..., 32

(7) Downsample the 32-dimensional vectors to obtain 16-dimensional vectors using binomial filtering with mask vector (1, 4, 6, 4, 1). Join the generated 16-dimensional vectors into one large 1296-dimensional feature vector (9 × 9 × 16).

(8) Raise each element of the feature vector to the power of 0.4 so that the distribution is closer to a Gaussian distribution [26].

(9) Rescale the values of the elements in the feature vector so that they range from 0.0 to 1.0.

Figure 3.6: Division of a block into 4 subareas.
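A condensed sketch of the HOG descriptor from steps (1)-(9): Roberts cross gradients, 9 × 9 half-overlapping 16-pixel blocks, subarea weights 4/3/2/1, 32 direction bins downsampled to 16 with the (1, 4, 6, 4, 1) binomial mask, the power transform by 0.4, and rescaling to [0, 1]. The nearest-neighbour resize, the padding value, the circular treatment of the angle bins during smoothing, and the zero-padded last gradient row/column are simplifications of this sketch, not details taken from the thesis.

```python
import numpy as np

def hog_features(a: np.ndarray) -> np.ndarray:
    """Steps (1)-(9): HOG of Roberts cross gradients in 81 overlapping blocks -> 1296 features."""
    h, w0 = a.shape
    yi = (np.arange(64) * h / 64).astype(int)                  # step (1): resize to 64 x 64 ...
    xi = (np.arange(64) * w0 / 64).astype(int)
    img = a.astype(np.float64)[np.ix_(yi, xi)]
    img = np.pad(img, 8, mode="constant", constant_values=img.max())   # ... and pad to 80 x 80
    img = (img - img.mean()) / img.std(ddof=1)                 # step (2), eq. (3.15)

    g1 = np.zeros_like(img)                                    # step (3): Roberts cross
    g2 = np.zeros_like(img)
    g1[:-1, :-1] = img[:-1, :-1] - img[1:, 1:]                 # kernel [[1, 0], [0, -1]]
    g2[:-1, :-1] = img[:-1, 1:] - img[1:, :-1]                 # kernel [[0, 1], [-1, 0]]
    strength = np.hypot(g1, g2)                                # eq. (3.18)
    direction = np.degrees(np.arctan2(g2, g1))                 # eq. (3.19)
    bins = np.clip(((direction + 180.0) * 32.0 / 360.0).astype(int), 0, 31)

    weight = np.ones((16, 16))                                 # step (5): subareas S4..S1
    weight[2:14, 2:14], weight[4:12, 4:12], weight[6:10, 6:10] = 2, 3, 4

    binom = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
    feats = []
    for by in range(9):                                        # step (4): 9 x 9 blocks, stride 8
        for bx in range(9):
            ys, xs = slice(by * 8, by * 8 + 16), slice(bx * 8, bx * 8 + 16)
            s = strength[ys, xs] * weight                      # step (6), eq. (3.20)
            hist = np.bincount(bins[ys, xs].ravel(), weights=s.ravel(), minlength=32)
            wrapped = np.concatenate([hist[-2:], hist, hist[:2]])        # circular angle bins
            smoothed = np.convolve(wrapped, binom, mode="valid")         # step (7): binomial filter
            feats.extend(smoothed[::2])                        # keep every other bin -> 16 values
    v = np.asarray(feats) ** 0.4                               # step (8)
    return (v - v.min()) / (v.max() - v.min())                 # step (9): 9 * 9 * 16 = 1296 values
```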

Chapter 4 Learning and Classification

With feature vectors extracted from the images as described in the previous chapter, it is possible to train a classifier for character recognition. Before creating a classifier, an algorithm to reduce the size of the feature vectors and improve the performance is applied.

Various combinations of algorithms for reduction of feature vector dimensionality and methods for classification have been tested on the character set described in Chapter 2, Dataset Generation. Each class is represented by 82 training samples.

4.1 Dimensionality Reduction

The feature vectors acquired in the feature extraction are high-dimensional; however, the majority of the useful information stored in the feature vectors can be represented in a space of much lower dimensionality. Dimensionality reduction algorithms are used to map the data from one space to a lower-dimensional space, which results in lower space and computational complexity, and alleviates the negative effect of the phenomenon called "the curse of dimensionality" [27], which in terms of machine learning refers to the reduced effectiveness of an algorithm due to the reduced contrast of distance differences caused by irrelevant dimensions in a high-dimensional space. Most of the engines for OCR of Chinese characters participating in competitions use LDA for dimensionality reduction, and some engines use modified versions of LDA, such as heteroscedastic LDA [4], [5].

4.1.1 Principal Component Analysis

Principal Component Analysis (PCA) [28] is a linear transformation which transforms the data and represents them in a new coordinate system whose first basis vector, the principal component, has the highest variance of the data, and the variance of each succeeding principal component is as high as possible under the condition that it is orthogonal to all preceding principal components. The objective is to represent the data in a space L of lower dimension than M while minimizing the approximation error and preserving as much information as possible. Let N be the number of observations and M be the number of variables describing one observation. The data are arranged in an M × N matrix X, where each observation is a column of the matrix. Each variable is normalized so that the sum of each row in the matrix is zero, which is accomplished by subtracting the mean of each row from every value in the corresponding row.

µm = (1/N) · Σ_{n=1}^{N} Xm,n,   m = 1, ..., M
X̄ = X − µ    (4.1)


Let us compute the empirical sample covariance matrix Σ from the dataset matrix,

\Sigma = \bar{X}\bar{X}^T \qquad (4.2)

The matrix Σ is guaranteed to be positive semi-definite, because for any a ∈ R^M

a^T \Sigma a = a^T (\bar{X}\bar{X}^T) a = (\bar{X}^T a)^T (\bar{X}^T a) \geq 0,

which means that it also has real and non-negative eigenvalues. Let us compute the eigenvalues λ_i and eigenvectors w_i of the matrix Σ. Next, the eigenvalues are normalized so that their sum is unity and the eigenvectors are normalized so that their length is unity, that is ∥w_i∥_2 = 1. Then, the eigenvalues and their corresponding eigenvectors are sorted from the largest to the smallest one. The first eigenvector holds the largest amount of information about the data. As many eigenvectors are kept as are needed to preserve the desired percentage of information so that the allowed loss of information is not exceeded. The percentage loss of information ϵ after keeping the first L eigenvectors is

\epsilon = 1 - \sum_{i=1}^{L} \lambda_i \qquad (4.3)

Let W be the matrix of size M × L with the L first eigenvectors sorted by their corresponding eigenvalues and X̄ be the matrix of size M × N with the N centered observations in columns. The reduced representation of the observations, Y, is a linear combination of the principal components and is calculated by

Y = W^T \bar{X} \qquad (4.4)

PCA ignores the class distribution of samples in the dataset and only tries to maximize the variance of each principal component.
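As an illustration of equations 4.3 and 4.4, the following Java sketch selects the number of principal components to keep for a given allowed loss ϵ and projects centered data onto the chosen eigenvectors. It assumes the eigen decomposition itself has already been computed by a numerical library and that the eigenvalues are normalized and sorted in descending order; all identifiers are illustrative.

// Sketch of the PCA reduction step: choosing the number of components L
// (equation 4.3) and projecting centered data onto the first L eigenvectors
// (equation 4.4).
public final class PcaReduction {

    // Returns the smallest L such that 1 - (sum of the first L eigenvalues) <= epsilon.
    public static int selectDimension(double[] sortedNormalizedEigenvalues, double epsilon) {
        double kept = 0;
        for (int l = 0; l < sortedNormalizedEigenvalues.length; l++) {
            kept += sortedNormalizedEigenvalues[l];
            if (1.0 - kept <= epsilon) {
                return l + 1;
            }
        }
        return sortedNormalizedEigenvalues.length;
    }

    // Computes Y = W^T * Xbar, where W is M x L (eigenvectors in columns)
    // and Xbar is the M x N matrix of centered observations.
    public static double[][] project(double[][] w, double[][] xBar) {
        int m = w.length, l = w[0].length, n = xBar[0].length;
        double[][] y = new double[l][n];
        for (int i = 0; i < l; i++) {
            for (int j = 0; j < n; j++) {
                double sum = 0;
                for (int k = 0; k < m; k++) {
                    sum += w[k][i] * xBar[k][j];
                }
                y[i][j] = sum;
            }
        }
        return y;
    }
}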

4.1.2 Linear Discriminant Analysis

Linear Discriminant Analysis (LDA) is a method that finds a linear combination of features which best explains the data; unlike PCA, it tries to model the differences between classes and separate them. LDA for cases with more than two classes seeks a projection matrix W such that the classes in the projection Y,

Y = W^T X, \qquad (4.5)

are separated as well as possible [29], which means that samples from the same class should be projected close to each other and samples from different classes should be as far from each other as possible. Let us define the within-class scatter matrix S_W,

S_W = \sum_{i=1}^{K} \sum_{x \in k_i} (x - \mu_i)(x - \mu_i)^T, \qquad (4.6)

where x is a column of the matrix X, K is the number of classes, x ∈ k_i means that the sample belongs to the i-th class,

\mu_i = \frac{1}{N_i} \sum_{x \in k_i} x, \qquad (4.7)


and N_i gives the number of samples in the i-th class. Let us also define the between-class scatter matrix S_B,

S_B = \sum_{i=1}^{K} N_i (\mu_i - \mu)(\mu_i - \mu)^T, \qquad (4.8)

where

\mu = \frac{1}{N} \sum_{\forall x} x. \qquad (4.9)

The projection that will best discriminate between individual classes can be found by maximizing Fisher's linear discriminant, which maximizes the scatter between classes and minimizes the scatter within classes, given by

J(W) = \frac{|W^T S_B W|}{|W^T S_W W|}. \qquad (4.10)

After differentiating and setting the derivative equal to zero, the problem requires the solution of

S_W^{-1} S_B W = J(W)\, W \;\Rightarrow\; \left| S_W^{-1} S_B - J(W)\, I \right| = 0, \qquad (4.11)

where J(W) is a scalar and is also an eigenvalue of S_W^{-1} S_B. The best projection vectors of the projection matrix W are, therefore, the eigenvectors which correspond to the largest eigenvalues of S_W^{-1} S_B. The eigenvector w_1 corresponding to the largest eigenvalue λ_1 = J(w_1) separates the classes in the projection best of all eigenvectors. The smaller the eigenvalue, the less its eigenvector can separate the classes in the data. That means that it is possible to keep in the projection matrix W only a certain number of eigenvectors which best describe the data, which results in a projection of X into a lower-dimensional space while separating the individual classes as well as possible.
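The following Java sketch illustrates how the scatter matrices S_W and S_B from equations 4.6 and 4.8 could be accumulated from labeled feature vectors; solving the eigenvalue problem for S_W^{-1} S_B is left to a numerical library. The identifiers are illustrative and the code is only a sketch of the computation described above.

// Sketch of the LDA scatter matrices S_W and S_B (equations 4.6 and 4.8).
// samples[i] is the i-th feature vector (length M) and labels[i] its class
// index in 0..K-1.
public final class LdaScatter {

    // Returns {S_W, S_B}.
    public static double[][][] scatterMatrices(double[][] samples, int[] labels, int numClasses) {
        int n = samples.length, m = samples[0].length;
        double[][] classMean = new double[numClasses][m];
        double[] globalMean = new double[m];
        int[] counts = new int[numClasses];

        // class means and global mean
        for (int i = 0; i < n; i++) {
            counts[labels[i]]++;
            for (int d = 0; d < m; d++) {
                classMean[labels[i]][d] += samples[i][d];
                globalMean[d] += samples[i][d];
            }
        }
        for (int k = 0; k < numClasses; k++)
            for (int d = 0; d < m; d++) classMean[k][d] /= counts[k];
        for (int d = 0; d < m; d++) globalMean[d] /= n;

        double[][] sw = new double[m][m];
        double[][] sb = new double[m][m];
        for (int i = 0; i < n; i++) {            // within-class scatter
            double[] diff = new double[m];
            for (int d = 0; d < m; d++) diff[d] = samples[i][d] - classMean[labels[i]][d];
            for (int a = 0; a < m; a++)
                for (int b = 0; b < m; b++) sw[a][b] += diff[a] * diff[b];
        }
        for (int k = 0; k < numClasses; k++) {   // between-class scatter
            double[] diff = new double[m];
            for (int d = 0; d < m; d++) diff[d] = classMean[k][d] - globalMean[d];
            for (int a = 0; a < m; a++)
                for (int b = 0; b < m; b++) sb[a][b] += counts[k] * diff[a] * diff[b];
        }
        return new double[][][]{sw, sb};
    }
}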

4.2 Classification Methods

Each training example in the training data is a pair consisting of a feature vector extracted from an image and its desired Unicode value output. A supervised learning algorithm is needed to analyze the training data and infer a function therefrom, which can be used to classify new examples whose Unicode value is not known. When choosing an algorithm, it is important to bear in mind where the classification model is going to be used and what kind of limits are imposed on it. Since the classification model in this case will be used in a mobile application, a light-weight model which does not need a large amount of data and is fast enough for real-time recognition is needed. A method which stores all or a large portion of the training samples, such as k-nearest neighbors, cannot be used in the application (k-NN was used for comparison purposes only). A discriminant function has turned out to be a suitable method: it achieves a high recognition rate, allows fast classification, and the size of the data necessary for classification is around 10 megabytes (MB). Most systems for recognition of Chinese characters seem to use the modified quadratic discriminant function (MQDF) proposed by Kimura et al. [30] in 1987. In recent years, artificial neural networks and the support vector machine (SVM) classifier have also been increasingly used. However, training a regular SVM, or a neural network, for a task with a large number of classes requires a lot of time, which was not suitable in this case due to the limited computing power of my working environment. The size of the classification models used in leading-edge OCR engines which participate in international competitions appears to

be a few hundred megabytes on average, with a classifier produced by Fujitsu presented at the ICDAR 2013 Chinese Handwriting Recognition Competition [4] reaching the size of 2,460 MB. Classifiers of this size are not well suited for use in contemporary mobile devices, even though the restrictions on data size, as well as on computing power, have been growing weaker in recent years. The following subsections introduce the classifiers used for testing.

4.2.1 k-Nearest Neighbors

k-Nearest Neighbor (k-NN) is an algorithm which selects the k training samples t_i from the training data T that are closest in the feature space to a new sample and uses them for classification [28]. The distance can be calculated by various metrics, such as the Euclidean distance or the Manhattan distance, and the performance of the classifier depends on the selection of an appropriate metric. k-NN does not have a regular training phase; training consists only of storing the feature vectors and labels of the training samples. All computation is performed in the classification phase. The Euclidean distance is used here and it is calculated by

d_e(\mathbf{x}, \mathbf{t}) = \sqrt{\sum_{i=1}^{N} (x_i - t_i)^2}, \qquad (4.12)

where t is a sample in the training data, x is the unlabeled data point which is to be classified, and N is the length of the feature vector. The k training samples with the shortest computed distance from the new data point are selected, and the class which is the most frequent among the selected samples is used to classify the new data point. Let Y be a function assigning class labels to feature vectors. The class Y(x) of a new data point is derived by:

Y(x) = mode{l(t1), . . . , l(tk)} (4.13)

This method needs a supplementary decision rule to determine the class of a new sample in case of a tie. One possible rule is that among the tied classes, the class whose training sample is closest to the feature vector of the new sample receives the final extra vote and is selected as the output. In other words, if two classes are tied, the one whose member is closer to the new sample wins.

Figure 4.1: An example of k-NN classifier with 3 classes. The number k used is 3. The classifier finds 3 samples closest to the new sample which has to be classified. The class which is prevalent among the 3 selected samples is chosen as the class the new sample belongs to. In this example, the new sample is classified as an orange circle.
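A minimal Java sketch of the k-NN rule with Euclidean distance and the tie-breaking rule described above (the tied class whose member is closest to the new sample wins) might look as follows; the identifiers are illustrative.

// Sketch of k-NN classification. trainData[i] is a stored feature vector,
// trainLabels[i] its class, x the new sample, k the number of neighbors.
import java.util.HashMap;
import java.util.Map;

public final class KnnClassifier {

    public static int classify(double[][] trainData, int[] trainLabels, double[] x, int k) {
        int n = trainData.length;
        double[] dist = new double[n];
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            double s = 0;
            for (int d = 0; d < x.length; d++) {
                double diff = x[d] - trainData[i][d];
                s += diff * diff;
            }
            dist[i] = Math.sqrt(s);     // Euclidean distance (equation 4.12)
            order[i] = i;
        }
        java.util.Arrays.sort(order, (a, b) -> Double.compare(dist[a], dist[b]));

        // vote among the k nearest samples
        Map<Integer, Integer> votes = new HashMap<>();
        for (int j = 0; j < k && j < n; j++) {
            votes.merge(trainLabels[order[j]], 1, Integer::sum);
        }
        int best = -1, bestVotes = -1;
        for (int j = 0; j < k && j < n; j++) {   // iterate from nearest to farthest
            int label = trainLabels[order[j]];
            int v = votes.get(label);
            if (v > bestVotes) {                 // ties are resolved in favor of the class
                bestVotes = v;                   // whose member appears first, i.e. is closer
                best = label;
            }
        }
        return best;
    }
}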


4.2.2 Linear Discriminant Function

This and the following subsection introduce the linear discriminant function (LDF) [31] for classification and its quadratic version. The training data is represented by a matrix T in which each column is the feature vector of one training sample. Let us begin with Bayes' theorem. We want to maximize the posterior probability that a sample belongs to class k. This method is called maximum a posteriori estimation (MAP). The posterior probability is given by

P(k|\mathbf{x}) = \frac{P(\mathbf{x}|k)\, P(k)}{\sum_{i=1}^{K} P(\mathbf{x}|k_i)\, P(k_i)}, \qquad (4.14)

where x is the feature vector of a new sample that we want to classify, K is the number of classes, P(k) is the prior probability of class k, and P(x|k) is the probability density function of class k. The classification function Y is defined as

\mathcal{Y}(\mathbf{x}) = \arg\max_{k=1,2,\ldots,K} P(k|\mathbf{x}). \qquad (4.15)

The classifier assumes that each class in the training data has a Gaussian distribution. Let N be the number of training samples and M be the number of variables in a feature vector. The multivariate Gaussian density of a class is given by

P(\mathbf{x}|k) = \frac{1}{\left((2\pi)^M |\Sigma_k|\right)^{1/2}} \exp\left(-\frac{1}{2}(\mathbf{x}-\mu_k)^T \Sigma_k^{-1} (\mathbf{x}-\mu_k)\right), \qquad (4.16)

where |Σ_k| is the determinant of Σ_k and µ_k are the class means defined as

\mu_k = \frac{1}{N_k} \sum_{n=1}^{N} C_{kn}\, \mathbf{t}_n, \qquad (4.17)

t_n is the feature vector of one sample from the training data T, N_k is the number of samples of class k, and C_{kn} is an element of a membership matrix C of size K × N with

C_{kn} = \begin{cases} 1 & \text{if the } n\text{-th sample belongs to class } k \\ 0 & \text{otherwise} \end{cases} \qquad (4.18)

Matrix Σk is the within-group covariance matrix of class k. We suppose that the covariance matrix is identical for all classes and it can be calculated by

\Sigma = \frac{1}{N - K} \sum_{n=1}^{N} \sum_{k=1}^{K} C_{kn} (\mathbf{x}_n - \mu_k)(\mathbf{x}_n - \mu_k)^T \qquad (4.19)

The prior probability of all classes is considered to be equal, P(k) = 1/K, because the frequency of characters undoubtedly differs depending on the setting in which the recognition is performed, and it would be difficult to determine the prior probabilities in a way that would make them plausible. Under equal prior probabilities, the outcome can be determined by maximizing the Gaussian density function. The probability that a class k will have a member x is the same as the likelihood that x belongs to class k, P(x|k) = L(k|x), hereinafter referred to as the likelihood function. Since the prior probabilities are omitted and only the likelihood function remains, MAP becomes maximum likelihood estimation (MLE). Instead of the likelihood function, it is

possible and more convenient to operate with the natural logarithm of the likelihood function, because the logarithm is a monotonically increasing function and the logarithm of a function reaches its maximum at the same point as the function itself. The classifying function becomes a linear discriminant

\log L(k|\mathbf{x}) = -\frac{1}{2}\log|\Sigma| - \frac{M}{2}\log 2\pi - \frac{1}{2}(\mathbf{x}-\mu_k)^T \Sigma^{-1} (\mathbf{x}-\mu_k) \qquad (4.20)

The first and the second term on the right side are identical for all classes, which means they cannot affect the result and can be omitted. The constant 1/2 before the third term does not have any effect, either. The function can, therefore, be reduced to the following form:

\mathcal{Y}(\mathbf{x}) = \arg\max_{k=1,2,\ldots,K} \log L(k|\mathbf{x}) \;\equiv\; \arg\max_{k=1,2,\ldots,K} \left( -(\mathbf{x}-\mu_k)^T \Sigma^{-1} (\mathbf{x}-\mu_k) \right), \qquad (4.21)

The third term in equation 4.20 above is the squared Mahalanobis distance between the class k and a new unclassified sample x. Therefore, another possible way to interpret this classification function is that it tries to minimize the Mahalanobis distance, because minimizing a function is equivalent to maximizing its additive inverse.

d_m(\mathbf{x}, k) = (\mathbf{x} - \mu_k)^T \Sigma^{-1} (\mathbf{x} - \mu_k) \qquad (4.22)

\mathcal{Y}(\mathbf{x}) = \arg\min_{k=1,2,\ldots,K} d_m(\mathbf{x}, k) \qquad (4.23)
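A sketch of LDF classification as the minimization of the Mahalanobis distance (equations 4.22 and 4.23) is shown below. It assumes that the class means and the inverse of the pooled covariance matrix have been precomputed during training; the identifiers are illustrative.

// Sketch of LDF classification by minimizing the Mahalanobis distance.
// sigmaInv is the precomputed inverse of the pooled covariance matrix and
// classMeans[k] is the mean feature vector of class k.
public final class LdfClassifier {

    // (x - mu_k)^T Sigma^{-1} (x - mu_k)
    public static double mahalanobis(double[] x, double[] mu, double[][] sigmaInv) {
        int m = x.length;
        double[] diff = new double[m];
        for (int i = 0; i < m; i++) diff[i] = x[i] - mu[i];
        double result = 0;
        for (int i = 0; i < m; i++) {
            double row = 0;
            for (int j = 0; j < m; j++) row += sigmaInv[i][j] * diff[j];
            result += diff[i] * row;
        }
        return result;
    }

    // Returns the index of the class with the smallest Mahalanobis distance.
    public static int classify(double[] x, double[][] classMeans, double[][] sigmaInv) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int k = 0; k < classMeans.length; k++) {
            double d = mahalanobis(x, classMeans[k], sigmaInv);
            if (d < bestDist) {
                bestDist = d;
                best = k;
            }
        }
        return best;
    }
}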

4.2.3 Quadratic Discriminant Function

The quadratic discriminant function (QDF) [31] is derived in the same way as LDF, but the difference is that each class has its own covariance matrix, which is given by

\Sigma_k = \frac{1}{N_k - 1} \sum_{n=1}^{N} C_{kn} (\mathbf{x}_n - \mu_k)(\mathbf{x}_n - \mu_k)^T \qquad (4.24)

Since each class has its own covariance matrix, it also has its own log |Σ_k|, and the first term in the discriminant function

\log L(k|\mathbf{x}) = -\frac{1}{2}\log|\Sigma_k| - \frac{M}{2}\log 2\pi - \frac{1}{2}(\mathbf{x}-\mu_k)^T \Sigma_k^{-1} (\mathbf{x}-\mu_k) \qquad (4.25)

cannot be left out. After omitting the second term, canceling the common constant, and reverting the maximization to a minimization, the final function for classification is

\mathcal{Y}(\mathbf{x}) = \arg\min_{k=1,2,\ldots,K} \left( \log|\Sigma_k| + (\mathbf{x}-\mu_k)^T \Sigma_k^{-1} (\mathbf{x}-\mu_k) \right) \qquad (4.26)
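The QDF decision rule of equation 4.26 differs only in the per-class covariance matrices and the additional log-determinant term, as the following sketch shows; it reuses the Mahalanobis helper from the previous sketch and assumes the per-class inverse covariance matrices and log-determinants were precomputed during training. The identifiers are illustrative.

// Sketch of QDF classification (equation 4.26). sigmaInv[k] and logDet[k]
// are the precomputed inverse covariance matrix and log-determinant of class k.
public final class QdfClassifier {

    public static int classify(double[] x, double[][] classMeans,
                               double[][][] sigmaInv, double[] logDet) {
        int best = 0;
        double bestScore = Double.POSITIVE_INFINITY;
        for (int k = 0; k < classMeans.length; k++) {
            double score = logDet[k]
                    + LdfClassifier.mahalanobis(x, classMeans[k], sigmaInv[k]);
            if (score < bestScore) {
                bestScore = score;
                best = k;
            }
        }
        return best;
    }
}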

4.3 Results

The following figures display the performance of the introduced methods. The tests are performed on the set of 2,908 characters described in the previous chapters. LDA is used for dimensionality reduction, HOG for feature extraction, and QDF for classification unless stated otherwise. Samples of two fonts are always set apart for testing and not used in the training phase, so that the classifier cannot benefit from training on the data which it has to classify

afterwards. This process is repeated four times, each time with a different pair of fonts, and the final result is the average of the four error rates. The error rate is calculated as

\text{Error} = \frac{\text{Number of incorrect classifications}}{\text{Number of all classified characters}}. \qquad (4.27)

Figures 4.2 and 4.3 show the performance of recognition using different techniques for normalization of character shape. LDF was used as a classifier for this test. It is evident that the effect of shape normalization on characters in a standard font used in print publications is negligible. It is not surprising, because the examples of shape normalization presented in the previous chapter showed that the shape of a character in a standard print font after any of the normalization methods does not differ very much from the original. However, non-linear shape normalization displays its strength on handwritten characters, where both moment and bi-moment normalization outperform linear normalization and recognition with no shape normalization whatsoever. Using bi-moment normalization instead of moment normalization does not seem to result in better performance.

Figure 4.2: Recognition error rates of tested shape normalization methods with varying sizes of feature vectors. Tested on standard print fonts.

Figure 4.3: Recognition error rates of tested shape normalization methods with varying sizes of feature vectors. Tested on fonts resembling handwriting.

Figures 4.4 and 4.5 present the results of recognition by different classifiers. Moment normalization is used for the normalization of character shape in this test. There is virtually no difference between the linear and quadratic discriminant functions on characters in standard print font, but the quadratic version outperforms the linear one on characters whose shape deviates more from the standard form. 3-NN achieves the best recognition rate on handwritten characters, but it requires much more time for classification than LDF and QDF. The difference between the accuracy of 3-NN and QDF is too insignificant to justify the use of 3-NN even if the space requirements of the classifier did not have to be considered.


Figure 4.4: Recognition error rates using different classification methods with varying sizes of feature vectors. Tested on standard print fonts.

Figure 4.5: Recognition error rates using different classification methods with varying sizes of feature vectors. Tested on fonts resembling handwriting.

Figures 4.6 and 4.7 tell us what happens if PCA is used instead of LDA for reduction of feature space dimensionality. The result is not surprising. Since the projection of features into a lower-dimensional space in PCA ignores any differences between classes, the contrast with the performance of LDA is striking.

Figure 4.6: Recognition error rates using different methods for feature dimensionality reduction with varying sizes of feature vectors. Tested on standard print fonts.

Figure 4.7: Recognition error rates using different methods for feature dimensionality reduction with varying sizes of feature vectors. Tested on fonts resembling handwriting.

Finally, Figures 4.8 and 4.9 show the comparison of feature extraction methods. The method using HOG is, as expected, superior in both cases. The Zonal Line Density method I have designed can recognize characters with quite good precision when the characters are in a standard print font. However, since it is not a translation-invariant method and is therefore largely dependent on the position of pixels in the image, its recognition accuracy drops drastically on characters whose shape deviates from the standard template. The other method, which uses HOG, is translation-invariant to a certain extent, because the gradient magnitudes and directions are the same regardless of the position of the strokes; if there is a stroke in one of the 16 × 16 pixel blocks of the image, the stroke can be anywhere in the block and the extracted features will be alike.


Figure 4.8: Recognition error rates using different methods for feature extraction. Tested on standard print fonts.

Figure 4.9: Recognition error rates using different methods for feature extraction. Tested on fonts resembling handwriting.

Table 4.1 presents the minimum recognition error achieved by all three tested classifiers, using the moment method for shape normalization. The feature vectors in all cases had about 200 dimensions.

LDF QDF 3-NN

Standard font 0.07% 0.09% 0.15%

Neat handwriting font 0.89% 0.62% 0.52%

Brush-writing font 13.94% 10.91% 10.56%

Table 4.1: Minimum recognition error rate of different classification methods using moment normaliza- tion and LDA for dimensionality reduction.

It is virtually impossible to achieve 100% correct recognition even on characters in a standard print font. One of the reasons is the lack of difference between the hiragana characters へ (he), べ (be) and ぺ (pe) and their katakana counterparts ヘ (he), ベ (be) and ペ (pe).


(a) 曙 (b) 惚 (c) 黙 (d) 飛 (e) 傷

(f) 香 (g) 猫 (h) 玄 (i) 鬱 (j) 孫

(k) 鎌 (l) 盧 (m) 烈 (n) 駿 (o) 秒

Figure 4.10: Examples of characters correctly classified by my system.

(a) 嗚 → 鳴 (b) 傷 → 楊 (c) 入 → 人 (d) 燦 → 傑

(e) 駻 → 解 (f) 禰 → 祇 (g) 覊 → 覇 (h) 辟 → 砕

Figure 4.11: Examples of characters which were incorrectly classified by my engine. The correct character is shown on the left side and the incorrect classification is shown on the right side in the captions under the images.

Table 4.2 shows the comparison of the performance of my engine, NHocr, and Tesseract. I have used moment normalization, LDA for dimensionality reduction, and the QDF classifier for this test. A set of 160 isolated characters in a standard print font, fonts resembling handwriting, and a font resembling brush writing was used for testing. None of the samples used for testing had been used for training of my classifier, so as not to give it an advantage. The precision of recognition is calculated by dividing the number of correctly classified characters by the number of all characters used in the test. Both my system and NHocr had no trouble recognizing characters in a regular print font and neat handwriting. The difference in performance appeared on fonts which deviate a lot from the standard form of characters used in print, which is not surprising because NHocr is not suited for recognition of handwritten characters according to its creator. The poor performance of Tesseract OCR was surprising. It had trouble recognizing characters even in a regular print font. However, it is impossible to draw any conclusions as to why the performance was so poor, because there is no information about how the classification data available on the Tesseract website was trained.


Tesseract NHocr My engine

Recognition precision 35.0% 86.3% 98.8%

Table 4.2: Percentage of correctly recognized characters by the tested OCR systems.

Chapter 5 Text Extraction

If one wants to perform optical character recognition in an application that is meant for real-life practical use, it is necessary to extract the text from an image taken by a camera and split the text into individual characters. First of all, a choice had to be made as to what kind of approach to text extraction would be adopted. One option was extraction with assistance, where the user specifies what part of the image they are interested in and want the application to recognize. Another was extraction without any assistance, which means that an algorithm has to find out whether there is any text in the image and return a bitmap with only the text extracted from the whole image as its output. The former is easier to implement and also more suitable for the purpose of this work: if the user is interested only in a part of the text, there is no point in extracting and performing OCR on the whole text captured by the camera, as it would produce extra information that the user is not interested in and impair the whole user experience. Text detection without assistance might be useful in some cases too, but the user can always specify the part of the image they are interested in if they are given the means to do so; without such means, they could never select only a part of the captured text. Therefore, it was decided that the user of the application would draw a rectangle on the screen of the device; only the section of the image bounded by the rectangle is processed and forwarded to the OCR engine of the application, and the rest of the image is discarded. There are two major categories of localization of text in an image and extraction of it therefrom:

• Localization and extraction of text from printed documents - Locating and extracting text from a document is a task which has been studied for a long time, and now there are methods which achieve precision of over 99% [32]. One of the ways to extract text from a document is by binarization of the scanned image.

• Localization and extraction of text from a scene in real world - While localization of text in printed documents can be considered a solved task, localization of text in a scene from the real world is still an open problem which requires much more sophisticated techniques. There are two main categories of methods: methods using sliding windows, and methods using region grouping [33].

In the end, only a simple method for text extraction based on binarization of the image has been implemented in the application. The idea is that the text and the background will be binarized as inverse colors, either black text on a white background or white text on a black background. It does not try to locate what appears to be text in the image, as more advanced methods do. The following assumptions are made for the extraction process.


Figure 5.1: Diagram of the steps from image acquisition to character classification.

• The text is in contrast with the background. If the text is of the same color as the background, the extraction will fail. The ideal case is when there is a large difference between the luminance of the text color and the luminance of the background color.

• The background is not complex and does not consist of light and dark colors.

If an image does not meet these criteria, it is unlikely that the text will be successfully extracted from it. The following sections describe all methods for text extraction and character segmentation which have been tested and used.

5.1 Binarization

First, the 32-bit image is reduced to 24-bit depth by dropping the alpha channel with transparency information, which is irrelevant here. Then, the 24-bit image is transformed into an 8-bit grayscale image by the luminance-preserving method [34], which is computed as a weighted sum of the intensity of the red, green, and blue channels of the original image:

Gray = 0.2126 × Red + 0.7152 × Green + 0.0722 × Blue (5.1)

The resulting grayscale intensity values of individual pixels range from 0 to 255. With the grayscale conversion complete, the binarization process follows. All pixels whose grayscale intensity values are higher than a certain threshold are transformed into white pixels, and all pixels whose intensities are lower are assigned as black pixels. Otsu's thresholding method [35] has been used to find the optimum threshold value that separates the foreground from the background. The algorithm works with a grayscale image and separates its gray-level histogram into two classes - black and white - so that the weighted between-class variance is maximal. The between-class variance is defined as

\sigma_B^2 = \omega_0\, \omega_1\, (\mu_1 - \mu_0)^2, \qquad (5.2)




Figure 5.2: Example of binarization using Otsu’s thresholding method. (a) and (c) are original images. (b) and (d) are the binary image results.

where ω_0 and ω_1 are the percentages of foreground and background pixels, given by

\omega_0 = \sum_{i=1}^{k} p_i, \qquad \omega_1 = \sum_{i=k+1}^{L} p_i, \qquad (5.3)

where L is the total number of gray levels, k is a threshold between the two classes, and p_i is the percentage of pixels at gray level i, which means that

\sum_{i=1}^{L} p_i = 1, \qquad p_i \geq 0. \qquad (5.4)

The values µ_0 and µ_1 are the class mean levels given by

\mu_0 = \frac{\sum_{i=1}^{k} i\, p_i}{\omega_0}, \qquad \mu_1 = \frac{\sum_{i=k+1}^{L} i\, p_i}{\omega_1}. \qquad (5.5)

The optimum threshold value that maximizes σ_B² is found by performing a sequential search over the histogram levels. Each gray level is selected as the threshold parameter and its corresponding σ_B² is calculated. The gray level for which σ_B² is maximal is chosen as the best candidate for the threshold. Should the area of maximum σ_B² stretch over two or more neighboring levels, their rounded average is selected as the best candidate.
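A sketch of the sequential search described above is given below; it takes a 256-bin gray-level histogram, evaluates the between-class variance of equation 5.2 for every candidate threshold, and returns the level with the maximum variance (the plateau-averaging refinement mentioned above is omitted for brevity). The identifiers are illustrative.

// Sketch of Otsu's threshold selection over a 256-bin grayscale histogram.
public final class OtsuThreshold {

    public static int findThreshold(int[] histogram) {
        long total = 0;
        double weightedSum = 0;
        for (int i = 0; i < histogram.length; i++) {
            total += histogram[i];
            weightedSum += (double) i * histogram[i];
        }

        double sum0 = 0;          // weighted sum of the first class
        long count0 = 0;          // pixel count of the first class
        double bestVariance = -1;
        int bestThreshold = 0;

        for (int k = 0; k < histogram.length; k++) {
            count0 += histogram[k];
            if (count0 == 0) continue;
            long count1 = total - count0;
            if (count1 == 0) break;

            sum0 += (double) k * histogram[k];
            double mu0 = sum0 / count0;
            double mu1 = (weightedSum - sum0) / count1;
            double w0 = (double) count0 / total;
            double w1 = (double) count1 / total;

            double variance = w0 * w1 * (mu1 - mu0) * (mu1 - mu0);  // equation 5.2
            if (variance > bestVariance) {
                bestVariance = variance;
                bestThreshold = k;
            }
        }
        return bestThreshold;
    }
}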

5.2 Black on White / White on Black Detection

After performing the binarization, it is necessary to determine whether the text is black on white background or vice versa. I have designed the following algorithm for this purpose.


Assuming that the text is horizontal, it searches for a column with the largest number of white pixels and a column with the largest number of black pixels. If the largest number of white pixels found is larger than the largest number of black pixels, the background is considered to be white. Otherwise, the background is believed to be black.

Input: BinaryImage
whiteMax ← 0; blackMax ← 0;
for i ← 1 to number of columns do
    colWhite ← 0; colBlack ← 0;
    for j ← 1 to number of rows do
        if BinaryImage[i][j] is black then
            colBlack ← colBlack + 1;
        else
            colWhite ← colWhite + 1;
        end
    end
    if colWhite > whiteMax then whiteMax ← colWhite; end
    if colBlack > blackMax then blackMax ← colBlack; end
end
if whiteMax > blackMax then
    the background is white;
else
    the background is black;
end

Assuming that there is some blank space beside the text, it is not necessary to go through all columns of the picture; it is enough to analyze the first column. If the first column consists of white pixels, the background is considered to be white. Otherwise, the background is black and the colors need to be swapped, because the gradient directions in feature extraction would otherwise have the opposite orientation.

5.3 Skew Correction

The deviation of the text baseline from the horizontal axis - in the case of horizontal text - is called skew, and it is possible to correct it by rotating the text. The algorithm for character segmentation, as well as the character recognition process, is likely to fail if the text is significantly skewed. Therefore, it is desirable to detect the skew angle of the text and rotate the text so that it is aligned with the horizontal axis. The Radon transform method proposed by Aithal et al. [36] is applied to detect the skew angle. The Radon transform computes projections of an image matrix along specified directions [21]. The original version of the algorithm can be explained by supposing that there


is a source of parallel beams that are emitted towards the image matrix and their projection is registered by a sensor line behind the image matrix, which is perpendicular to the beams. This figurative explanation is used on the Matlab Documentation Center [37]. In my implementation, the source of beams is stationary and the image matrix rotates around its center; however, identical results can be achieved. The projection is computed under 180 different angles, with a 1° step between each angle.

Figure 5.3: Radon transform mechanism on the left and an example of one projection on the right. Figures taken from [37].

Input: OriginalImage
Output: R
R ← new int[180][hypot(W_OriginalImage, H_OriginalImage)];
for i ← 1 to 180 do
    RotatedImage ← rotate OriginalImage by i degrees;
    R[i] ← sum up columns of RotatedImage;
end

Each projection in the Radon transform is stored in a matrix R of size 180 × diag_img, where diag_img is the length of the diagonal of the image. The main idea of this method is that the highest intensity value in the projections will be in the column which corresponds to the skew angle. The reason is that pixels of text projected along its skew angle are projected onto the smallest number of bins. Let the intensity of a projection be the number of pixels collected on a single bin. The skew angle of the text is determined by the following process, in which the angle with the highest intensity is found.


Input: R
maximum ← 0; skewAngle ← 0;
for i ← 1 to 180 do
    for j ← 1 to diag_img do
        if R[i][j] > maximum then
            maximum ← R[i][j];
            skewAngle ← i;
        end
    end
end
skewAngle ← -(skewAngle - 90);

The image used for the Radon transform should be reduced in size in order to make the computation faster. The time necessary for the computation grows quadratically with increasing size of the image.

Figure 5.4: An example of Radon transform on a correctly aligned line of text.

The performance of the Radon transform for skew correction is very good on a long line of text. However, when the text consists only of a few characters, this approach tends to fail and does not produce exact results. The reason for the drop in performance is that the vertical projection of the text does not differ very much from projections in other directions, because the number of pixels in a short line is not large. As a result, a projection in a different direction might have a higher intensity than the projection in the direction of the skew angle. A possible solution to this problem might be not to look for the highest intensity, but for the longest sequence of bins with zero intensity in a projection. The projection line has the same length for all examined angles, which means that when the baseline of the text is perpendicular to the projection line, the area of bins onto which nothing is projected is the largest, as can be seen in Figure 5.5. If there are at least two characters of aspect ratio close to 1:1 in the image, the rotation of the text is determined correctly in most cases. However, if there is only one character in the image, not even this method can determine the rotation correctly. This simple method is also likely to fail if something other than the text, for example residual noise, is present in the image, but since the whole character extraction process assumes that the background has been cleaned by binarization, this problem does not have to be considered here.
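A sketch of the longest-zero-run variant described above might look as follows; radon[i] is assumed to hold the projection computed for the i-th angle, and the final conversion follows the same angle convention as the pseudocode above. The identifiers are illustrative.

// Sketch of skew estimation by the longest run of zero-intensity bins.
// radon[angle][bin] is the projection intensity for the given angle.
public final class SkewByZeroRun {

    public static int estimateSkewAngle(int[][] radon) {
        int bestAngle = 0;
        int longestRun = -1;
        for (int angle = 0; angle < radon.length; angle++) {
            int run = 0, maxRun = 0;
            for (int bin = 0; bin < radon[angle].length; bin++) {
                run = (radon[angle][bin] == 0) ? run + 1 : 0;   // extend or reset the run
                maxRun = Math.max(maxRun, run);
            }
            if (maxRun > longestRun) {      // keep the angle with the longest empty stretch
                longestRun = maxRun;
                bestAngle = angle;
            }
        }
        return -(bestAngle - 90);           // same conversion as in the pseudocode above
    }
}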


Figure 5.5: Radon transform of the image on the right. Blue denotes zero intensity, red denotes high intensity. The location of the longest zero-projection sequence in the Radon transform corresponds to the skew angle of the text and can be found in the 91st column of the Radon transform matrix. The point of the highest intensity is found in the 107th column, so the skew angle estimated from it differs from the correct one.

5.4 Character Segmentation

Unlike the ISO basic Latin alphabet, in which all characters printed in a standard font, except lowercase 'i' and 'j', are single connected components, Chinese characters and characters from the Japanese syllabaries may consist of several unconnected components, which makes extraction of individual characters from text a more complex task. If only characters printed in a standard font were considered, in which individual characters do not overlap each other and the heights of the characters are almost equal to their widths, an approach based on aspect ratio might be adopted. However, a method using only aspect ratios to extract individual characters will fail if the characters are handwritten or printed in a non-standard font, where the heights are likely to differ from the widths of the characters, and two or more characters might be connected at some point or overlap each other. A modified version of the algorithm presented by Chen et al. [38] has been used to perform the character segmentation in this work. It makes use of a projection of the text onto a line, which is analyzed to derive several features that are then used to segment the individual characters.

5.4.1 Projection Profile Analysis

Segmentation of horizontal text written from left to right will be described here. However, a very similar process can be used to extract individual characters from horizontal text written from right to left and from vertical text as well. The projection of horizontal text is acquired by the following process.


Input: BinaryImage
Output: projection
for i ← 1 to number of columns in the image do
    columnSum ← 0;
    for j ← 1 to number of rows in the image do
        /* assumes black text on white background */
        columnSum ← columnSum + 255 - BinaryImage[i][j];
    end
    projection[i] ← columnSum;
end

Calculating Projection Block Features

1. Block width Wi - The width of a projection block.

2. Block height Hi - The height of a projection block.

3. Block gap Gij - The distance between two neighboring projection blocks.

4. Average block width avgW_B = \frac{1}{N}\sum_{i=1}^{N} W_i - The average width of a projection block.

5. Average character width avgW_C = \frac{1}{M}\sum_{i=1}^{M} W_i, \; \forall W_i \in W : W_i > avgW_B - The average width of a projection block which is wider than the average block width and therefore likely to be a single character and not just a part of one.

6. Average block height avgH_B = \frac{1}{N}\sum_{i=1}^{N} H_i - The average height of a projection block.

N is the total number of projection blocks and M is the number of projection blocks that are wider than avgW_B. Each projection block is categorized into one of the following four groups based on the profile features described above. The classification of blocks into the four groups follows exactly the same rules as presented by Chen et al. [38]; a short sketch of these rules follows the list.

1. Mark: The width of the projection block is smaller than avgWC ×Tmark, where Tmark = 0.3.

2. Half-character: The width of the projection block is larger or equal to avgWC × Tmark and smaller than avgWC × Thalf , where Thalf = 0.7.

3. Single character: The width of the projection block is larger or equal to avgWC × Thalf and smaller than avgWC × Tsingle, where Tsingle = 1.5. The block is likely to be a single character.

4. Multi-character: The width of the projection block is larger or equal to avgWC × Tsingle. The block is probably a compound of two or more characters.
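A compact Java sketch of this categorization, using the thresholds quoted above, is shown below; the identifiers are illustrative.

// Sketch of the four-way block categorization with the thresholds from
// Chen et al. [38] as quoted in the text. Widths are in pixels.
public final class BlockCategory {

    public enum Category { MARK, HALF_CHARACTER, SINGLE_CHARACTER, MULTI_CHARACTER }

    private static final double T_MARK = 0.3;
    private static final double T_HALF = 0.7;
    private static final double T_SINGLE = 1.5;

    public static Category categorize(double blockWidth, double avgWc) {
        if (blockWidth < avgWc * T_MARK)   return Category.MARK;
        if (blockWidth < avgWc * T_HALF)   return Category.HALF_CHARACTER;
        if (blockWidth < avgWc * T_SINGLE) return Category.SINGLE_CHARACTER;
        return Category.MULTI_CHARACTER;
    }
}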


Figure 5.6: An example of a multi-character block correctly split into two blocks with kanji 媒 and 体.

Splitting Multi-character Blocks

The blocks that are classified as multi-character blocks are likely to consist of two or more overlapping characters and need to be split. However, there is no easy way to correctly tell where to split the block in two. Common characteristics of kanji and the Japanese syllabaries suggest that it is reasonable to assume that a suitable splitting point will be somewhere near the point that is avgW_C distant from the side of the block. Therefore, a candidate area for a splitting point, ranging from a distance of 0.55 × avgW_C from the side of the block to 1.45 × avgW_C, is defined. Subsequently, the following process is performed:

Divide the candidate area into 9 sections. Section 1 is the closest to the side of the block and section 5 contains the point avgW_C distant from the side;
Find the minimum min_i of the projection histogram in each of the 9 sections;
Find the maximum max_H of the projection histogram in the whole candidate area;
if the minimum in section 4, 5, or 6 is lower than max_H / 4 then
    the splitting point is the minimum of min_i, where i ∈ {4, 5, 6};
else
    the vector of minimums min_i is multiplied element-wise by the vector (3.0, 2.5, 2.0, 1.5, 1.0, 1.5, 2.0, 2.5, 3.0);
    the minimum of the modified min_i, where i ∈ {1, 2, ..., 9}, is selected as the splitting point;
end

The algorithm tries to split the block at a point near avgW_C where the density of character stroke pixels is minimal, which is likely to occur at the side of a character. When all multi-character blocks are split, another two features are calculated: the estimate of the average gap between two characters (avgG_B) and the estimate of the average gap between two parts of one character (avgG_W).

1. avgG_B = \frac{1}{N}\sum_{i=1}^{N} G_{ij}, where G_{ij} is a gap between projection blocks B_i and B_j, B_i is a multi-character, single character, or half-character block, B_j is of any of the four categories, and N is the total number of gaps meeting these conditions.

2. avgG_W = \frac{1}{M}\sum_{i=1}^{M} G_{ij}, where G_{ij} is smaller than avgG_B and is a gap between projection blocks B_i and B_j, B_i is a half-character or a mark, B_j is , and M is the number of gaps meeting these conditions.

Merging Mark Blocks

The blocks classified as marks are very unlikely to be single characters and most of them need to be merged with a neighboring block. Let B_i, B_j, and B_k be three neighboring blocks


and B_j be classified as a mark block. If the gap G_{ij} between blocks B_i and B_j is smaller than the gap G_{jk} between B_j and B_k, and G_{ij} is also smaller than the average within-character gap avgG_W, B_j is merged with B_i. If G_{jk} is smaller than G_{ij} and avgG_W, B_j is merged with B_k. If a mark does not meet either of the above criteria and is not merged with a neighboring block, its height is compared with the average block height avgH_B. If the height of the mark block is larger than half of avgH_B, it is unlikely to be a mark or residual background noise, and it ought to be merged with a neighboring block if one of its neighbors is a mark block or a half-character block. Should both neighbors belong to one of the two categories, a mark block is given higher priority for merging. If the two neighboring blocks are both marks or both half-characters, the currently examined block is merged with the one that is closer to it. When all mark blocks have been processed, the average gaps avgG_B and avgG_W are recalculated.

/* B_i, B_j, and B_k are neighboring blocks */
foreach block B_j ∈ mark blocks do
    markConnected ← 0;
    if G_ij < G_jk and G_ij < avgG_W then
        merge B_i and B_j; markConnected ← 1;
    else if G_jk < G_ij and G_jk < avgG_W then
        merge B_j and B_k; markConnected ← 1;
    end
    if markConnected = 0 and H_j > avgH_B / 2 then
        if B_i is mark and B_k is mark then
            if G_ij < G_jk then merge B_i and B_j; else merge B_j and B_k; end
        else if B_i is mark then
            merge B_i and B_j;
        else if B_k is mark then
            merge B_j and B_k;
        else if B_i is half-character and B_k is half-character then
            if G_ij < G_jk then merge B_i and B_j; else merge B_j and B_k; end
        else if B_i is half-character then
            merge B_i and B_j;
        else if B_k is half-character then
            merge B_j and B_k;
        end
    end
end


Figure 5.7: An example of mark blocks correctly merged into hiragana character り.

Merging Half-character Blocks

The last phase of the segmentation deals with half-character blocks. There is a high chance that a half-character block is a part of a character and should be merged with another block; however, it can also be a character on its own whose width is simply smaller than the average. Therefore, it is impossible to easily tell whether a half-character should be merged with another block or not. The information about the gap sizes around the half-character block is used in this decision. If the gap between a half-character block and one of its neighboring blocks is smaller than the average within-character gap avgG_W, or smaller than 2 × avgG_W in case the neighbor is a half-character itself, the half-character block may be merged with its neighbor and the possibility is explored further. The block resulting from the merge should not be significantly wider than the average width of a character, so a limit for the resulting width is set. Should the merge result exceed the limit, the merge is not performed and the half-character block is considered to be a single character. The limit was set to 1.2 × avgW_C. The following pseudocode describes the process.

/* B_i, B_j, and B_k are neighboring blocks */
foreach block B_j ∈ half-character blocks do
    if G_ij < G_jk then
        if (B_i is a single character block and G_ij < avgG_W) or (B_i is a half-character block and G_ij < 2 × avgG_W) then
            if W_i + G_ij + W_j ≤ 1.2 × avgW_C then
                merge B_i and B_j;
            end
        end
    else
        if (B_k is a single character block and G_jk < avgG_W) or (B_k is a half-character block and G_jk < 2 × avgG_W) then
            if W_j + G_jk + W_k ≤ 1.2 × avgW_C then
                merge B_j and B_k;
            end
        end
    end
end

After performing the last phase of the segmentation, each block is treated as a single character which can be processed in order to extract its feature vector, which is subsequently used for classification of the character using one of the classifying methods introduced in the previous chapter.


Figure 5.8: An example extracted from a longer line of text where hiragana characters と and し, which were both classified as half-character blocks, were incorrectly merged into one block due to their small widths and proximity.

Figure 5.9: An example of correctly segmented line of printed text.

Successful extraction
Handwritten text: 277/315 (87.9%)
Printed text: 260/262 (99.2%)

Table 5.1: Percentage of individual characters successfully extracted from text.

Table 5.1 shows the accuracy of the segmentation algorithm on printed text and handwritten text. The accuracy is calculated as the ratio between correctly extracted characters and the total number of characters. A segmentation mistake where two characters are merged into one therefore counts as two errors. The cases where the algorithm failed on printed characters were situations in which there were only a few characters in the picture, so the information about the characteristics of the text was limited and not sufficient to derive a correct decision. Therefore, it might be a good idea to abandon this method when only a few projection blocks are detected and adopt an aspect ratio approach instead, which would attempt to join neighboring blocks whose width/height aspect ratio is significantly lower than 1 and cut blocks whose aspect ratio is significantly higher than 1.

Chapter 6 Implementation

6.1 Training

The training phase and all testing of my OCR engine were performed in Matlab. The following toolboxes and applications, which are not part of the official Matlab distribution, were used for training or testing of different approaches in Matlab:

• Statistical Pattern Recognition Toolbox for Matlab - A toolbox with implementations of various methods used in pattern recognition and machine learning, created and maintained by Vojtěch Franc and Václav Hlaváč. http://cmp.felk.cvut.cz/cmp/software/stprtool

• Feature Analysis - Code by Iftekhar Tanveer, http://www.mathworks.com/matlabcentral/fileexchange/26895-feature-analysis. Its LDA implementation was used for the dimensionality reduction step.

6.2 Comparison

The following tools were used for comparison of my OCR engine with other free OCR engines for Japanese language:

• Tess4J - Java wrapper for the Tesseract OCR API. http://tess4j.sourceforge.net/

• WeOCR Project website - Web-based online OCR services with various OCR engines, including NHocr. http://weocr.ocrgrid.org/

6.3 Application

An alpha version of a mobile application for the Android OS has been created. The application is written entirely in the Java language, and the minimum Android API requirement for the application is API 11¹, that is, Android 3.0 Honeycomb. The application makes use of the built-in camera of a device to scan the text which the user wants the application to recognize and translate. The user sees the camera preview on the screen of the device and has to draw a rectangle over the part of the image where the text to be analyzed is located. After the user makes a selection, the specified section of the image is forwarded and processed further. The pure text is extracted by the methods described in Chapter 5, but skew correction is

¹ Android API levels: http://developer.android.com/guide/topics/manifest/uses-sdk-element.html

not used in the current implementation of the application². The extracted text is then sent to the OCR engine. The optimal combination of preprocessing, feature extraction, and classifier as presented in Chapters 3 and 4 is used. To acquire a sharp and clean image of the text, the text has to be in the focus of the camera. By default, the camera of a device does not try to focus on any object on its own, but it is possible to change its behavior so that it handles focusing automatically. Continuous picture auto focus mode (FOCUS_MODE_CONTINUOUS_PICTURE)³, which is an API level 14 parameter, is used on devices that support this feature to make the camera perform auto-focusing. Continuous video focus mode (FOCUS_MODE_CONTINUOUS_VIDEO) is used on devices supporting only API 13 and lower. Continuous picture focus mode is preferable as it is more dynamic, while the video focus mode results in smoother changes in focus, according to the official Android documentation. It is possible to use the flashlight integrated in a mobile device to improve the legibility of text under bad lighting conditions. The flashlight can be turned on and off by setting the parameters of the camera.

The original idea was that the application would periodically scan the view of the camera and provide the user with recognition results which would be automatically updated if the user moved the device so that the selection rectangle drawn on the screen specified a different part of the text, thus different characters. However, it has proven to be too much of a burden for the device to take high-resolution pictures one by one in a loop as fast as it can, which was roughly one picture per half a second. The device would run out of memory despite attempts to have the pictures which were no longer necessary recycled by the garbage collector in Java. Therefore, the current implementation requires the user to draw a selection rectangle on the screen and press a button which makes the camera take one picture. Subsequently, the section of the picture defined by the selection rectangle is forwarded to the OCR engine of the application, and the result of the recognition is shown on the screen when the classification process is over. The picture taken by the camera is not stored anywhere and is recycled when the OCR engine no longer needs it.

Figure 6.1: Screenshot of the Android application.

Mathematical operations are performed using the ElPsy Java library which I have made for this purpose. The library is small and does not feature many functions, but it provides all the functions needed by the application that are not in the standard Java Math library. The JMDict (Japanese-Multilingual Dictionary)⁴, a popular electronic Japanese dictionary file, is used to translate Japanese words into English. Since JMDict is distributed only as a single XML file, it has been exported into a SQLite database using the script available at http://repo.or.cz/w/jblite.git.
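The following Java sketch shows how an application could configure the focus mode and the torch through the legacy android.hardware.Camera API mentioned above. It is a simplified illustration under the stated assumptions, not the exact code of the application, and the class and method names other than the Android constants are illustrative.

// Sketch: selecting continuous autofocus and toggling the torch with the
// legacy android.hardware.Camera API (the API available at the time of writing).
import android.hardware.Camera;
import android.os.Build;

public final class CameraConfig {

    // Picks the best available focus mode: continuous picture focus on API 14+,
    // continuous video focus as a fallback on older devices.
    public static void enableContinuousFocus(Camera camera) {
        Camera.Parameters params = camera.getParameters();
        java.util.List<String> modes = params.getSupportedFocusModes();
        if (modes == null) return;
        if (Build.VERSION.SDK_INT >= 14
                && modes.contains(Camera.Parameters.FOCUS_MODE_CONTINUOUS_PICTURE)) {
            params.setFocusMode(Camera.Parameters.FOCUS_MODE_CONTINUOUS_PICTURE);
        } else if (modes.contains(Camera.Parameters.FOCUS_MODE_CONTINUOUS_VIDEO)) {
            params.setFocusMode(Camera.Parameters.FOCUS_MODE_CONTINUOUS_VIDEO);
        }
        camera.setParameters(params);
    }

    // Turns the torch (flashlight) on or off to improve legibility in bad light.
    public static void setTorch(Camera camera, boolean on) {
        Camera.Parameters params = camera.getParameters();
        params.setFlashMode(on ? Camera.Parameters.FLASH_MODE_TORCH
                               : Camera.Parameters.FLASH_MODE_OFF);
        camera.setParameters(params);
    }
}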

² Skew correction is used in the Matlab demo application presented in Appendix E.
³ Android Documentation - Camera.Parameters: http://developer.android.com/reference/android/hardware/Camera.Parameters.html
⁴ The JMDict Project - http://www.edrdg.org/jmdict/j_jmdict.html

The program tries to match sequences of characters recognized by the OCR engine with dictionary entries in the database. The application assumes that the sequence is not very long, so it starts with the whole sequence of characters and shortens it one character at a time from the end until a match in the database, a word or a phrase, is found. The subsequence of characters found in the dictionary is removed from the whole sequence and the same process is repeated on the rest of the sequence, until the whole sequence is processed and translated. After that, all entries found in the database are displayed on the screen. Access to the database is rather slow and needs to be optimized.

A Sony Ericsson Xperia Ray with Android 4.0.4 Ice Cream Sandwich (API Level 15) has been used for testing. The device has a 1 GHz single-core CPU and dual-channel 333 MHz LPDDR2 memory. The resolution used by the camera is 2048 × 1536 pixels. The time between the moment when the user presses the button for recognition and the moment when the picture from the camera is ready for further processing is approximately 1,200 milliseconds, and there is currently no way to speed up this phase as it is hardware-dependent. The time necessary to retrieve a selection of size 170 × 140 pixels from the whole picture is about 140 milliseconds. Extracting a feature vector from a picture of the aforementioned size takes roughly 750 milliseconds. Classification of a feature vector of 170 dimensions requires roughly another 350 milliseconds.
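A sketch of the greedy longest-match lookup described above is given below; the Dictionary interface and its contains method stand in for the actual SQLite query on the JMDict database and are purely illustrative.

// Sketch of the greedy longest-match dictionary lookup described above.
import java.util.ArrayList;
import java.util.List;

public final class DictionaryMatcher {

    public interface Dictionary {
        boolean contains(String candidate);   // true if the word/phrase has an entry
    }

    // Splits the recognized character sequence into the longest dictionary
    // entries, scanning from the start and shortening from the end on a miss.
    public static List<String> match(String recognized, Dictionary dict) {
        List<String> entries = new ArrayList<>();
        int start = 0;
        while (start < recognized.length()) {
            int end = recognized.length();
            // shorten the candidate one character at a time until it matches;
            // an unmatched single character is kept so the loop always advances
            while (end > start + 1 && !dict.contains(recognized.substring(start, end))) {
                end--;
            }
            entries.add(recognized.substring(start, end));
            start = end;  // continue with the rest of the sequence
        }
        return entries;
    }
}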

Chapter 7 Conclusion

7.1 Summary

Several methods for optical character recognition of the Japanese writing system have been tested and compared, and the optimal method has been implemented in a mobile application for devices with the Android OS. The application is in the alpha phase, has not been given a name yet, and needs further development. Several essential improvements should be made in the near future so that the application can be released to the public.

The results show that a non-linear method for shape normalization results in better performance on characters which deviate from the standard form of the character. This is an expected result, because non-linear normalization methods focus on the parts of the character that need to be transformed in order to get closer to the standard shape of the character used in print publications. However, there was no significant difference in performance between the two tested non-linear methods, the moment and bi-moment methods.

The created OCR engine was tested using a character set of 2,908 characters defined specifically for this application. The achieved recognition rate on isolated characters in standard fonts used, for instance, in books and newspapers was 99.91% when using the moment method for shape normalization of the character, HOG for feature extraction, LDA for dimensionality reduction of the feature vectors, and QDF for classification. The optimal feature vector size was approximately 150-180 dimensions. The recognition rate dropped to 99.38% on fonts imitating easily legible handwriting. The engine has outperformed NHocr, a free OCR engine for the Japanese language which is used by a handful of applications, including some for mobile devices.

Text extraction from the image is the weakest point of the system and requires attention the most in the near future. Extraction of the text is performed only by using the difference in contrast between the text and the background, which poses restrictions on the range of usability of the system. Thankfully, in many cases in which a user might be interested in the meaning of text, such as books, signs, and product labels, the text usually is in significant contrast with the background; most often it is black text on a white or whitish background.

Projection profile analysis is used for character segmentation and has displayed good performance on longer lines of text. When the text for recognition is only one or a few characters long, there is not enough information for the algorithm to learn the properties of the text and the performance is not optimal. Implementing a different approach for such cases, e.g. an approach using the aspect ratio of characters, is necessary.

The high accuracy of recognition of isolated characters is a good sign that the application might be used in a wide range of situations once the other parts of the system, text extraction in particular, are improved. It would also be possible to use the created OCR engine in other applications that require Japanese OCR.


7.2 Future Work

Extraction of Text from Background
The performance of the application is largely dependent on whether the text is extracted correctly from the image. Currently, only extraction from a simple background is possible and extraction from more complex backgrounds will fail, so the extraction process is what needs the most attention now. Improving text extraction would expand the scope of use of the application and make it possible to recognize text even in situations where it is more difficult to separate the text from the background.

Vertical Text and Horizontal Text with Text Flow from Right to Left
The method for segmentation of characters has been tested on horizontal text with the flow of text from left to right, which is also the only text flow supported in the current implementation. Since Japanese is traditionally written vertically from top to bottom with columns ordered from right to left, and both the horizontal and vertical flows are used today - the vertical text flow being prevalent in paper publications and the horizontal flow being prevalent in handwriting - it is necessary for the application to support both text flows. In addition, one may encounter cases in which the text is horizontal but flows from right to left. It is not a common text flow and it was mostly used in the past in cases where it was necessary to write in a horizontal line, for example on horizontal wooden signs. However, as it is a text flow which is not a foreign concept for the Japanese language, its support is desirable in the future too.

Skew and Slant Correction
A method for skew correction to align the text baseline with the horizontal axis for horizontal text has been implemented and tested, and it showed good results on longer lines of text. The performance on text which consists of one or a few characters, however, is not satisfactory, and modifications need to be made if skew correction is to be used in the application. Text can also be slanted, which means that the direction of the left and right bounding lines of every character in the text deviates from the direction perpendicular to the text baseline. If a text is significantly slanted, the segmentation and character recognition algorithms are likely to fail. For that reason, slant correction would also increase the performance and operational scope of the application.

Speed Optimization
The code has been written with processing speed in mind; however, there is space for further improvement and the improvement is necessary. A possible way to increase the speed of the OCR phase might be to write the code responsible for OCR in C++, so that the demanding computations run faster, and wrap it in a Java shell. The dictionary database and access to it have to be optimized and their speed improved. A new SQLite database should be created from the dictionary XML file, omitting the parts unnecessary for the application and organizing the database so as to minimize redundancy.

Support for Alphanumeric Characters
I would like to include support for recognition of characters from the basic Latin alphabet and Arabic numerals in the future, as there are phrases and acronyms in Japanese using these literals alone or in combination with kanji, hiragana, and katakana characters. While a user would be able to read a sequence of Latin letters and Arabic numerals, they might not understand their meaning in Japanese, and the application would not be able to find the phrase in the dictionary as it could not recognize the characters.


Czech Technical University in Prague
Faculty of Electrical Engineering

Department of Cybernetics

BACHELOR PROJECT ASSIGNMENT

Student: Jan Zdeněk

Study programme: Open Informatics

Specialisation: Computer and Information Science

Title of Bachelor Project: Mobile Application for Recognition of Japanese Writing System

Guidelines:

The task is to create a mobile application for optical recognition of Japanese writing system, i.e. Chinese characters used in Japanese, and Japanese syllabaries.

The application is supposed to recognize the characters in a given sign or text and provide a translation into English.

Bibliography/Sources:
[1] Christopher M. Bishop: Pattern Recognition and Machine Learning. 2006. ISBN: 978-0-387-31073-2.
[2] Daniel Lopresti, Jianying Hu, Ramanujan Kashi (Eds.): Document Analysis Systems V. 5th International Workshop, DAS 2002, Princeton, NJ, USA, August 19–21, 2002, Proceedings. ISBN: 978-3-540-44068-0 (Print), 978-3-540-45869-2 (Online).
[3] Proceedings of the 7th International Conference on Document Analysis and Recognition. August 2003, Edinburgh, Scotland, UK. IEEE. ISBN: 0-7695-1960-1.
[4] International Conference on Pattern Recognition, Proceedings. 20th International Conference on Pattern Recognition, ICPR 2010, Istanbul, Turkey, 23-26 August 2010. ISBN: 978-1-4244-7542-1.

Bachelor Project Supervisor: prof. Ing. Pavel Zahradník, CSc.

Valid until: the end of the summer semester of academic year 2014/2015

L.S.

doc. Dr. Ing. Jan Kybic, Head of Department
prof. Ing. Pavel Ripka, CSc., Dean

Prague, January 10, 2014

České vysoké učení technické v Praze
Fakulta elektrotechnická

Katedra kybernetiky

ZADÁNÍ BAKALÁŘSKÉ PRÁCE

Student: Jan Zdeněk

Studijní program: Otevřená informatika (bakalářský)

Obor: Informatika a počítačové vědy

Název tématu: Mobilní aplikace pro rozpoznávání japonského písma

Pokyny pro vypracování:

Úkolem je vytvořit aplikaci pro rozpoznávání japonského písma, tj. čínských znaků používaných v japonštině, a japonských slabičných abeced.

Aplikace by měla rozpoznat nápis a nabídnout pro něj anglický překlad.

Seznam odborné literatury:
[1] Christopher M. Bishop: Pattern Recognition and Machine Learning. 2006. ISBN: 978-0-387-31073-2.
[2] Daniel Lopresti, Jianying Hu, Ramanujan Kashi (Eds.): Document Analysis Systems V. 5th International Workshop, DAS 2002, Princeton, NJ, USA, August 19–21, 2002, Proceedings. ISBN: 978-3-540-44068-0 (Print), 978-3-540-45869-2 (Online).
[3] Proceedings of the 7th International Conference on Document Analysis and Recognition. August 2003, Edinburgh, Scotland, UK. IEEE. ISBN: 0-7695-1960-1.
[4] International Conference on Pattern Recognition, Proceedings. 20th International Conference on Pattern Recognition, ICPR 2010, Istanbul, Turkey, 23-26 August 2010. ISBN: 978-1-4244-7542-1.

Vedoucí bakalářské práce: prof. Ing. Pavel Zahradník, CSc.

Platnost zadání: do konce letního semestru 2014/2015

L.S.

doc. Dr. Ing. Jan Kybic, vedoucí katedry
prof. Ing. Pavel Ripka, CSc., děkan

V Praze dne 10. 1. 2014

Chapter A Abbreviations and Symbols

A.1 Abbreviations

1D One-dimensional.
2D Two-dimensional.
API Application Programming Interface.
ARAN Aspect Ratio Adaptive Normalization.
HOG Histogram of Oriented Gradients.
ICDAR International Conference on Document Analysis and Recognition.
ISO International Organization for Standardization.
JIS Japanese Industrial Standard.
k-NN k-Nearest Neighbors.
LDA Linear Discriminant Analysis.
LDF Linear Discriminant Function.
MAP Maximum A Posteriori Probability.
MLE Maximum Likelihood Estimation.
MQDF Modified Quadratic Discriminant Function.
NN Neural Network.
OCR Optical Character Recognition.
PCA Principal Component Analysis.
QDF Quadratic Discriminant Function.
SVM Support Vector Machine.


A.2 Symbols

C Class membership matrix.
cx X-axis component of centroid.
cy Y-axis component of centroid.
de Euclidean distance.
dm Mahalanobis distance.
L Lagrange polynomial.
L Likelihood function.
O Asymptotic computational complexity.
P(·) Prior probability.
P(· | ·) Conditional probability.
R Aspect ratio.
R Radon transform.
T Training data as feature vectors in matrix form.
T Training data.
ti Feature vector of a sample from the training data.
w Eigenvector.
x Coordinate on the x-axis.
xc X-axis component of geometric center.
y Coordinate on the y-axis.
yc Y-axis component of geometric center.
Y Estimated classification.
θ Gradient direction.
λ Eigenvalue.
µ Mean.
µ Mean vector.
µij Central moment of image.
π Number pi, the ratio of a circle's circumference to its diameter.
σ Standard deviation.
σ2 Variance.
Σ Empirical sample covariance matrix.
∇ Gradient strength.

Chapter B CD Content

• Android Application - A project with the Android application built with the Android Software Development Kit1, which can be installed and run on a mobile device with Android OS. The application is called Japanese OCR for the time being and is located in the AppTest folder. The mathematical operations of the application require the ElPsy Java library included in the AppTest libs folder.

• Datasets - Image datasets obtained by processing PNG image files of individual characters. There are four datasets in the folder. A different method for normalization of character shape was used for each of them. BMN denotes bi-moment normalization, MN moment normalization, LN linear normalization, and NN means that no shape normalization method was used.

• Image Samples

– Images for testing - This folder contains images with short text lines or individual characters which may be used for testing in the Matlab demo application.
– Samples from the training dataset - This folder contains a selection of images which were used for training of the classifiers. The image size is 500 × 500 pixels in most cases. The whole dataset used for training is not included on the CD as it is 7.32 GB in size.

• Font Processing - Java NetBeans project with scripts for converting TrueType font files into SVG files, and subsequently into PNG image files. It uses the Java Batik library. The folder also contains text files with lists of Unicode hex codes for various character sets, including the 2,908-character set used for the OCR engine.

• Matlab files - Matlab scripts used for this work. images2mat.m lets you create a dataset from image files. get_feats_all.m extracts and stores feature vectors from images in a specified dataset. training.m trains a classifier from a selected dataset. The third-party tools mentioned in the Implementation chapter are necessary to run the scripts and can be downloaded from the provided links. A minimal sketch of the assumed order in which the scripts are run follows this list.

• Matlab OCR Demo - All files necessary to run the Matlab demo are included in this folder. The demo can be launched by opening the file called demo_for_ocr.m.

• PDF - This thesis in PDF format, together with the XeLaTeX source file used to create the PDF and all figures and tables presented in the thesis.

1 Android SDK - http://developer.android.com/sdk/index.html


• Tess4J - Java wrapper for Tesseract OCR used for testing. 'Tess4J/src/net/sourceforge/tess4j/example/' contains the source file created for the tests.
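As referenced in the Matlab files item above, the following minimal sketch shows the assumed order in which the three scripts are run. The order is taken from the descriptions in the list; how each script is parameterized (paths, dataset selection) is configured inside the files themselves and is not shown here.

% Assumed overall workflow with the supplied Matlab scripts; paths and
% options are set inside the individual files and must be adjusted there first.
run('images2mat.m');     % build a Character_Data dataset from the PNG character images
run('get_feats_all.m');  % extract and store feature vectors for the chosen dataset
run('training.m');       % train a classifier from the selected dataset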

Chapter C Application Manual

The interface and the controls of the application are very simple. The main screen of the application shows the camera preview. The region of interest for recognition can be specified by drawing a rectangle on the screen of the device with a finger. When you are satisfied with your selection, you can press the OCR start button located at the top of the screen. After that, the OCR result is displayed on the screen in a blue box, together with its reading in hiragana and its English translation. The settings menu contains only the About page and can be accessed via the settings button of the device.

1 - Flash button. Pressing this button turns the flashlight of the device on and off.
2 - Start OCR button. OCR and translation are performed after pressing this button.
3 - Camera zoom-in button. Pressing this button increases the zoom of the camera.
4 - Camera zoom-out button. Pressing this button decreases the zoom of the camera.
5 - Recognized word in kanji.
6 - Translation of the recognized word into English.
7 - Reading of the recognized word in hiragana.
8 - A word selected for recognition by a red rectangle drawn on the screen.

Figure C.1: Screenshot of the Android application with description of elements.

Chapter D Dataset Format

There are four datasets included on the attached CD - one processed by linear shape normal- ization, one by moment normalization, one by bi-moment normalization, and one without shape normalization of characters. Each dataset is a simple Matlab structure named Character_Data where each element represents a picture of a character. Each element has three fields:

• char_image - 80 × 80 pixel pre-processed image of the character. The image is shape-normalized, smoothed, and shrunk to the needed size.

• label - Unicode hex code number of the character.

• font - Font number denoting which of the used fonts this character image belongs to.

235,256 image files with pictures of individual characters were used to create the datasets. The image files follow the naming convention fXX_YYYY.png, where XX denotes the font number (1-83) and YYYY is the Unicode hex code of the character. Font number 56 was deemed unsuitable and was not used in the end. Only a selection of the used image files is included on the CD because the entire set of images is 7.32 GB in size.
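As an illustration, the following minimal Matlab sketch loads one of the datasets and inspects a single element of the Character_Data structure. The .mat file name is an assumption (use whichever dataset file from the Datasets folder you want), and the label field is assumed here to hold the code point as a number; if it is stored as a hex string instead, it can be printed directly.

% Minimal sketch: inspect one sample of a dataset (file name is an assumption).
load('Character_Data_MN.mat');          % loads the structure array Character_Data
sample = Character_Data(1);             % first character sample
imshow(sample.char_image);              % 80 x 80 shape-normalized, smoothed character image
fprintf('character U+%04X, font %d\n', sample.label, sample.font);
% Reconstruct the name of the source image file (fXX_YYYY.png convention);
% zero-padding of the font number is an assumption:
srcName = sprintf('f%02d_%04x.png', sample.font, sample.label);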

Chapter E Matlab Demo for OCR

I have made a demo application in Matlab which performs OCR on image files. It uses the optimal engine presented in the thesis. The demo application can be launched by running the demo_for_ocr.m file in the Matlab OCR Demo folder on the attached CD. There are also image samples included on the CD which can be used to try the application out and see the performance of the OCR engine. When the checkbox is not checked, the application only performs single character recognition regardless of the input. When it is checked, it can recognize a line of text and it also uses the proposed modification of the Radon transform method for skew correction.

1 - Filepath of the selected image.
2 - Opens the file dialog.
3 - Choose between recognition of a single character and recognition of a line with multiple characters.
4 - Click to start recognition.
5 - Displayed result of the recognition.
6 - Unicode hex codes of the recognized characters.
7 - The displayed image on which recognition was performed.


Figure E.1: Screenshot of the demo application in Matlab.
