Character Recognition in the Image Using Deep Learning
Total Page:16
File Type:pdf, Size:1020Kb
ALGERIAN REPUBLIC DEMOCRATIC AND POPULAR MINISTRY OF HIGH EDUCATION AND SCIENTIFIC RESEARCHES University of Mohamed Khider Biskra Faculty of Exact Science and Science of Life and Nature Department of Computer Science Dissertation For Master Degree Graduation In Computer Science Field Computer Graphics & Computer Vision Title: Character Recognition In The Image Using Deep Learning Realized by: Barkat Ishak Defended the 06/07/2019, in front of the jury composed of: Academic year 2018-2019 Dedication This research work is dedicated to: My beloved mother, my first teacher who always had faith in me. My respected father whose support helped me to reach my goals. My lovely sisters Sara, Nafla and Takia. My dear brothers Immed and mehdi for their encouragement. To all my uncles and aunties. Last but not least, to my best friends and colleagues Amir, Mostapha, Seddik, Hazem, Hatem, Lotfi, Yasmine, Roufaida and Khaoula, And to all those whom I love and cherish. 1 Acknowledgement Before all, my deep and sincere praise to Allah the almighty for providing me with strength and patience to accomplish this research work. My thanks go to my supervisor Zerari Abd el Moumen for his valuable guidance. Special thanks would go to the members of the jury namely, Mokhtari Bilel, Chighoub Rabia for their efforts to evaluate my research work. I also express my gratitude to my classmate Halimi Roufaida for her tips through this work. 2 Abstract The human brain as complex product of evolutionary success, has acquired and is em- powered by a sophisticated learning intuition which consists of the broader memory and the creativeness of imagination, both grant flexibility and an extreme potential of pattern recogni- tion and acquisition, that’s aside, a computer is only suitable for information retrieval and is as good as the program it runs, which makes it’s window of usage very limited to fewer tasks, but is that the apex of computer science? What if computers were able to learn and recognize unknown patterns without the constant teaching and bypass first-hand programming?. Key-words: Deep learning, Image processing, Character recognition, Convolution neural networks. 3 Contents 1 Recognition in the digital image 12 1.1 Introduction . 12 1.2 Image processing . 12 1.2.1 Photo-metric property of the human visual system . 12 1.2.1.1 Human eye . 12 1.2.1.2 Color vision . 14 1.2.2 Digital Camera . 17 1.2.2.1 Sampling and aliasing . 19 1.2.2.2 Color cameras . 20 1.2.2.3 Compression . 20 1.2.3 Feature extraction . 20 1.2.3.1 The Uses of feature extraction . 21 1.2.3.2 Practical Uses of Feature Extraction . 21 1.2.3.3 Feature extraction methods . 21 1.2.4 Image Segmentation . 22 1.3 Recognition in the image . 23 1.3.1 Implementation of the image recognition Algorithms . 23 1.3.1.1 The uses of images recognition . 24 1.3.1.2 Recognition in images problems . 26 1.4 Characters recognition in images . 27 1.4.1 Definition . 27 1.4.2 Characters recognition types . 27 1.4.3 Character recognition Techniques . 27 1.4.3.1 Pre-processing . 28 1.4.3.2 Character recognition . 28 1.4.3.3 Post-processing . 29 1.4.3.4 Application-specific optimizations . 29 1.4.4 Applications . 30 1.5 Conclusion . 30 2 Deep Learning 31 2.1 Introduction . 31 4 CONTENTS 5 2.2 Neural networks . 31 2.2.1 Definition . 31 2.2.2 Neural Networks Architectures . 31 2.2.2.1 Single-Layer Feedforward Networks . 32 2.2.2.2 Multilayer Feedforward Networks . 32 2.2.2.3 Recurrent Networks . 33 2.2.3 Automatic learning types . 34 2.2.3.1 Supervised learning . 34 2.2.3.2 Unsupervised learning . 34 2.2.3.3 Reinforcement learning . 35 2.2.3.4 Semi-supervised learning . 35 2.3 Deep Learning . 36 2.3.1 Definition . 36 2.3.2 Deep Learning Architectures . 36 2.3.2.1 Unsupervised Pretrained Networks (UPNs) . 36 2.3.2.2 Recurrent Neural Networks (RNNs) . 38 2.3.2.3 Recursive Neural Networks (RNNs) . 39 2.3.2.4 Convolutional Neural Networks (CNNs) . 39 2.3.3 Deep Learning Application Domain . 46 2.3.3.1 Bioinformatics . 46 2.3.3.2 Health . 46 2.3.3.3 Robotics . 46 2.3.3.4 Visual and vocal recognition . 47 2.3.4 The challenges of deep learning . 47 2.3.4.1 The quantity of data used . 47 2.3.4.2 The learning time too expensive . 47 2.3.4.3 Optimization of the hyper-parameter . 47 2.3.4.4 Difficulty in understanding how information arrives . 47 2.3.4.5 Deep learning is sensitive . 48 2.3.5 Related works . 48 2.3.5.1 Image colorization . 48 2.3.5.2 Scene Labelling . 48 2.3.5.3 Image Classification . 48 2.3.5.4 Document Analysis . 49 2.3.5.5 Optical Character Recognition . 49 2.3.5.6 Text Classification . 49 2.4 Conclusion . 49 3 Design, implementation and results 50 3.1 Introduction . 50 3.2 System Design . 50 CONTENTS 6 3.2.1 Methodology . 50 3.2.2 Global system design . 50 3.2.3 Detailed system design . 52 3.2.3.1 Back-end . 52 3.2.3.2 Front-end . 62 3.3 Implementation . 63 3.3.1 Environments and developing tools . 63 3.3.2 Developing the Back-end . 70 3.3.2.1 Loading datasets . 70 3.3.2.2 Pre-processing the loaded datasets . 72 3.3.2.3 Training the model . 74 3.3.3 Developing the Front-end . 76 3.4 Results . 80 3.4.1 Arabic handwritten character Results . 80 3.4.2 Handwritten Latin characters Results . 81 3.4.3 Optic Latin characters results . 83 3.5 Discussion of results and comparison . 84 3.6 Limitation . 85 3.7 Conclusion . 85 List of Tables 1.1 Table of spectral or near-spectral colors [9] . 18 3.1 Breakdown of the number of available training and testing samples in the NIST special database 19 using the original training and testing splits [103]. 54 3.2 Structure and organization of the EMNIST datasets[103]. 54 3.3 Fonts, Sizes, Orientation used in dataset generator . 57 3.4 Example of categorical data [106]. 59 3.5 Example of a Dummy encoding [106]. 59 3.6 Example of a Dataset with undefined values . 60 3.7 Arabic handwritten character tests . 80 3.8 Handwritten Latin characters tests . 81 3.9 Results Comparison . 84 7 List of Figures 1.1 The Human Eye [2]. 13 1.2 Vision Diagram [2]. 13 1.3 CIE 1931 color space chromaticity diagram[6] . 16 1.4 Additive color mixing: combining red and green yields yellow; combining all three primary colors together yields white [4]. 16 1.5 Subtractive color mixing: combining yellow and magenta yields red; combining all three primary colors together yields black [7]. 17 1.6 Image sensing pipeline, showing the various sources of noise as well as typical digital post-processing steps [1]. 19 1.7 One-dimensional signal [1]. 20 1.8 Image compressed with JPEG at three quality settings [1]. ..