Investigation on Algorithm for Handwritten Gujarati OCR

Ph.D. Synopsis Submitted To Technological University For the Degree of Doctor of Philosophy in Electronics and Communication Engineering

By Mikita R. Gandhi
Enrollment No: 149997111007

Supervisor: Dr. Vishvjit Thakar, Associate Professor and Head, Information and Communication Technology, Sankalchand Patel University, Visnagar.

Co-Supervisor: Dr. Hetal N. Patel, Professor & Head, Electronics and Communication Department, A.D.Patel Institute of Technology, New V. V. Nagar.

Table of Contents

1. Title of the Thesis and Abstract
2. Brief description on the state of the art of the research topic
3. Objective and Scope of the work
4. Original contribution by the thesis
5. Methodology of Research, Results / Comparisons
6. Achievements with respect to objectives
7. Conclusions
8. List of publication arising from the thesis
9. References

1. Title of the Thesis and Abstract

1.1 Title of the Thesis: Investigation on Algorithm for Handwritten Gujarati OCR

1.2 Abstract: Optical Character Recognition (OCR) is receiving growing attention because it lets computers learn and recognize regional languages, which opens up many new applications. Machine-printed characters can be recognized accurately, and such systems are already commercialized and in routine use. Recognition of handwritten characters is much harder, and methods for recognizing handwritten documents are still a subject of active research. No single algorithm works for all Indian languages, because each language has its own features and restrictions. In Gujarat state, Gujarati is the language of commerce, and most communication in government offices, schools and the private sector is done in Gujarati. A handwritten Gujarati OCR system was developed for reading handwritten amounts on cheques, automatically reading marks from answer sheets, and a learning application for education. The research work focuses mainly on implementing a robust algorithm for handwritten Gujarati OCR.

KNN and SVM classifiers were applied to several feature extraction methods: pixel count ratio, object gradient, geometry, profile, local binary pattern, center-symmetric local binary pattern and wavelet transform. Hybrid feature extraction methods were also used to increase character recognition performance. As a further novel approach, features were extracted automatically using deep learning and given to an SVM for handwritten character classification. To increase the recognition rate, a pretrained deep neural network (AlexNet) was used, and three applications were implemented: handwritten Gujarati numeral to speech conversion, character to speech conversion, and automatic handwritten marks recognition.

KNN, SVM and deep neural networks give recognition accuracies of 98.14%, 98.72% and 99.30% for numerals; 92.37%, 92.21% and 97.65% for characters; and 92.64%, 92.93% and 97.73% for combined numerals and characters, respectively.


2. Brief description on the state of the art of the research topic

As the world moves closer to the concept of the “paperless office,” more and more communication and storage of documents is performed digitally. Documents and files that were once stored physically on paper are now being converted into electronic form to allow quicker additions, searches and modifications, which also prolongs the life of such documents. Advances in character recognition have largely been limited to English, for both printed and handwritten text. Character recognition for Indian languages can help authors, novelists and many other people to recognize Indian characters, and even to extract text from old heritage documents. Very little research exists on handwritten character recognition for Indian languages in general, and for Gujarati in particular. In Gujarat state, all government agency documents are written in Gujarati. Software is available for printed Gujarati OCR, but recognition of handwritten characters is still a challenging task. The basic block diagram of an OCR system is shown in Figure 2.1. It has five major stages: preprocessing, segmentation, representation, training and recognition, and post-processing.

Figure 2.1 Basic block diagram

Preprocessing makes the raw data usable for the later stages of character analysis through steps such as smoothing, sharpening, binarizing the image, removing the background and extracting the required information. Segmentation splits the document into separate characters: the document is first segmented into lines, the lines into words, and the words into the individual characters used by the classifier. In the representation stage, a set of features is extracted to distinguish one class of images from another. Classifiers such as KNN, SVM, neural networks and deep learning are used for training and recognition.

Work on Gujarati OCR was initiated by Sameer Antani et al. [1] on printed text. KNN and Hamming distance classifiers were applied to 15 characters with 30 samples each and gave 67% and 41.33% accuracy, respectively. Using template matching and wavelet transform coefficients [2], Shah S. K. et al. attained 72.30% accuracy for printed Gujarati OCR. Ankit K. Sharma et al. [3] worked on the zoning method and, using a multilayer feed-forward neural network classifier, achieved 95.92% accuracy for handwritten Gujarati numerals, and Archana Vyas et al. [4] obtained 96.99% accuracy using KNN. Using a hybrid feature space method and an SVM classifier, A. Desai [5] recognized forty handwritten Gujarati characters with 86.66% accuracy. The zonal boundary was successfully detected by Jignesh Dholakia et al. [6] using the zoning method. Swital J. Macwan et al. [7] applied the discrete wavelet transform to handwritten Gujarati and obtained 89.46% accuracy. V. A. Naik et al. used different structural and statistical features for recognition of handwritten numerals and obtained 95% accuracy [8], and Dinesh Satange et al. obtained 90% accuracy using a Multi-Layer Perceptron [9]. Ashutosh Aggarwal et al. [10] worked on gradient-based feature extraction with an SVM classifier for Hindi handwritten character recognition. LBP features were used for Bangla digit recognition in 2015 [11] and achieved 96.7% accuracy using a KNN classifier; the same LBP features were applied to Persian/Arabic handwritten digit recognition [12]. Sekhar Mandal et al. [13] proposed an algorithm for machine-printed character recognition in Bangla using the two-dimensional wavelet transform and gradient information. Saleem Pasha et al. addressed handwritten recognition for Kannada using statistical features and the wavelet transform [14]. A two-stage CNN was used by Shibaprasad Sen et al. [15] for online Bengali handwritten character recognition and reached 99.40% accuracy. Akm Ashiquzzaman et al. worked on a CNN architecture with 10 different layers for Arabic handwritten digits and achieved 97.4% accuracy [16].
Chaouki Boufenar et al. presented three deep learning approaches for handwritten Arabic character recognition: (i) training from scratch, (ii) transfer learning and (iii) fine-tuning [17]. Table 2.1 shows the Gujarati numerals and characters used for the research work.

Numerals:
૦ ૧ ૨ ૩ ૪ ૫ ૬ ૭ ૮ ૯

Characters:
ક ખ ગ ઘ ચ છ જ ઝ ટ ઠ ડ ઢ ણ ત થ દ ધ ન પ ફ
બ ભ મ ય ર લ વ શ ષ સ હ ળ ક્ષ જ્ઞ શ્ર અ ઋ ઈ ઉ ઊ

Table 2.1 Gujarati Numerals and Characters


3. Objective and Scope of work

3.1 Objective:
• To develop a feature-based algorithm for handwritten Gujarati OCR that recognizes numerals and characters.
• To design an Optical Character Recognition system for handwritten Gujarati numerals.
• To design an Optical Character Recognition system for handwritten Gujarati characters.
• To design an Optical Character Recognition system for combined handwritten Gujarati numerals and characters.

3.2 Scope of work:
• The research work is useful for automatic detection of amounts written in Gujarati on bank cheques, marks written on answer sheets, and Gujarati numeral and character learning applications.
• Handwritten Gujarati numeral and character to speech conversion.
• The implemented algorithms can be useful for recognition of Gujarati text modifiers.

4. Original contribution by the thesis
• Different feature extraction methods were developed, along with a database of handwritten Gujarati numerals and characters.
• Three classification methods were used for recognition: KNN, SVM and deep learning.
• Hybrid features were combined with the three classification methods listed above.
• The transfer learning approach of deep learning was used for better accuracy.
• Three applications were developed:
  o Gujarati handwritten numeral to speech conversion
  o Gujarati handwritten character to speech conversion
  o Automatic handwritten marks recognition

5. Methodology of research, results and comparisons

No standard database is available for handwritten Gujarati OCR, so a database was created for this research. It contains 5000 samples of numerals and 10,000 samples of characters, collected from people of different ages.


5.1 Feature Extraction Method:

Method 1: Pixel Count Ratio

The image is resized to a 64 × 64 matrix and some morphological operations are applied to it. The image is then divided into an 8 × 8 grid of zones, giving 64 zones in total. From each zone, the ratio of the number of white pixels to the number of black pixels is taken as a feature, so 64 features are generated. Figure 5.1 shows the image of the Gujarati numeral ‘1’ and its 64 zones.

Figure 5.1 Pixel count ratio
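The zoning step of Method 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the image is a list of rows with 1 for white and 0 for black, the function name `pixel_count_ratio_features` is hypothetical, and dividing by `max(black, 1)` for a zone with no black pixels is an assumption, since the synopsis does not say how that case is handled.

```python
def pixel_count_ratio_features(image, zones_per_side=8):
    """Return one white/black pixel-count ratio per zone (64 for an 8 x 8 grid)."""
    size = len(image)
    zone = size // zones_per_side  # 8 pixels per zone side for a 64 x 64 image
    features = []
    for zr in range(zones_per_side):
        for zc in range(zones_per_side):
            white = sum(
                image[r][c]
                for r in range(zr * zone, (zr + 1) * zone)
                for c in range(zc * zone, (zc + 1) * zone)
            )
            black = zone * zone - white
            # Assumption: avoid division by zero when a zone has no black pixels.
            features.append(white / max(black, 1))
    return features


# Toy 64 x 64 image: a vertical white stroke in columns 28..35 on a black background.
image = [[1 if 28 <= c < 36 else 0 for c in range(64)] for _ in range(64)]
features = pixel_count_ratio_features(image)
```

In this toy image the stroke covers half of each zone in grid columns 3 and 4, so those zones have a ratio of 1.0 and all other zones have a ratio of 0.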

Method 2: Object Gradient

The gradient magnitude and gradient direction are used as features. The image is first divided into 9 sub-images. Figure 5.2 shows the image of the Gujarati numeral ‘0’ and its 9 zones. For each sub-image, the gradient magnitude and gradient direction are computed. The range of directions is split into 30° spans, and each span is given its own code, so 12 codes are assigned per sub-image. The gradient direction at each white pixel is then checked to see which span it falls in, and the corresponding code is assigned to that pixel. In total, 12 × 9 = 108 features are obtained from a single image.

Figure 5.2 Object Gradient
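The direction coding of Method 2 could be sketched as follows. This covers only the direction codes, and it rests on several assumptions that are not in the synopsis: the gradient is taken with central differences, the feature for each code is the count of white pixels that received it, and the name `gradient_direction_features` is hypothetical.

```python
import math


def gradient_direction_features(image, grid=3, bins=12):
    """Count, per sub-image, how many white pixels fall in each 30-degree span."""
    height, width = len(image), len(image[0])
    span = 360.0 / bins  # 30 degrees per code when bins == 12
    features = [0] * (grid * grid * bins)
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            if image[r][c] == 0:
                continue  # only white (object) pixels are coded
            # Assumption: central differences as the gradient operator.
            gx = image[r][c + 1] - image[r][c - 1]
            gy = image[r + 1][c] - image[r - 1][c]
            if gx == 0 and gy == 0:
                continue  # flat neighbourhood, no direction to code
            angle = math.degrees(math.atan2(gy, gx)) % 360.0
            code = int(angle // span) % bins
            zone = (r * grid // height) * grid + (c * grid // width)
            features[zone * bins + code] += 1
    return features


# Toy 9 x 9 image with a white 3 x 3 square in the central sub-image.
image = [[1 if 3 <= r <= 5 and 3 <= c <= 5 else 0 for c in range(9)] for r in range(9)]
features = gradient_direction_features(image)
```

With 9 sub-images and 12 codes, the vector has the 108 entries described above; for the toy image, only the central sub-image contributes counts.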


Method 3: Object Geometry

In this method [18], the geometry of the object is used as features: horizontal lines, vertical lines, right diagonal and left diagonal lines, area and Euler number. For feature extraction the image is divided into 9 sub-images. In each sub-image, the starting point and intersection points are found first, and then the line segments are counted in each direction: horizontal, vertical, right diagonal and left diagonal. The first four features of each sub-image are the values for these line counts, computed by equation (1).

Value = 1 - ((number of lines / 10) * 2) …………… (1)

The next four features are computed from the lengths of the lines. If a particular line type is not present, the value is set to -1; otherwise the normalized length is used as the feature value. The last feature is the area of the sub-image. This gives 9 features per sub-image × 9 sub-images = 81 features, and the Euler number of the image is used as one more feature. Hence 82 features are computed in total.

Figure 5.3 Object Geometry

Figure 5.3 shows one of the sub-images of digit ‘0’. The 9 features are calculated as below:
• No. of segments: 3
  o No. of horizontal lines: 0
  o No. of vertical lines: 0
  o No. of right diagonal lines: 2
  o No. of left diagonal lines: 1
• Value is calculated by: Value = 1 - ((number of lines / 10) * 2)
• So the first four features are:
  o Value of horizontal lines: 1
  o Value of vertical lines: 1
  o Value of right diagonal lines: 0.6
  o Value of left diagonal lines: 0.8
• Next 4 features are calculated as: Length = (total no. of pixels in a particular direction) / (total no. of all pixels belonging to skeleton)
• Total no. of all pixels belonging to skeleton: 20
• If there are no pixels in a particular line, then consider length = -1
  o Normalized length of all horizontal lines: -1
  o Normalized length of all vertical lines: -1
  o Normalized length of all right diagonal lines: 12/20 = 0.60
  o Normalized length of all left diagonal lines: 5/20 = 0.25
• The 9th feature from each zone is computed as:
  o Normalized area of the skeleton = (total no. of all pixels belonging to skeleton) / (size of sub-image)
  o Normalized area of the skeleton = 20/289 = 0.0692
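The worked example above can be reproduced with a short Python sketch. The helper names `line_count_value` and `normalized_length` are hypothetical; the line counts, pixel counts and the sub-image size of 289 are the ones given in the example.

```python
def line_count_value(num_lines):
    """Equation (1): value = 1 - ((number of lines / 10) * 2)."""
    return 1 - (num_lines / 10) * 2


def normalized_length(direction_pixels, skeleton_pixels):
    """Length feature; -1 marks a direction that has no line in the sub-image."""
    if direction_pixels == 0:
        return -1
    return direction_pixels / skeleton_pixels


directions = ["horizontal", "vertical", "right_diagonal", "left_diagonal"]
line_counts = {"horizontal": 0, "vertical": 0, "right_diagonal": 2, "left_diagonal": 1}
pixel_counts = {"horizontal": 0, "vertical": 0, "right_diagonal": 12, "left_diagonal": 5}
skeleton_pixels = 20
sub_image_size = 289

# Nine features for this sub-image: 4 line values, 4 normalized lengths, 1 area.
features = (
    [line_count_value(line_counts[d]) for d in directions]
    + [normalized_length(pixel_counts[d], skeleton_pixels) for d in directions]
    + [skeleton_pixels / sub_image_size]
)
```

Repeating this for all 9 sub-images and appending the Euler number of the whole image gives the 82 features described for Method 3.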

Method 4: Character Profile

The horizontal, vertical, right diagonal and left diagonal profiles of the object are used as features. The image is resized to 50 × 50, so a total of 298 features are calculated: 50 horizontal + 50 vertical + 99 right diagonal + 99 left diagonal profile values. Figure 5.4 shows the character profile for the numeral ‘1’.

Figure 5.4 Character Profile
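The feature count of Method 4 can be illustrated with a sketch that treats each profile as a projection, that is, the number of object pixels along each row, column and diagonal. That interpretation, the choice of which diagonal direction is called "right" or "left", and the name `profile_features` are assumptions; the synopsis only gives the number of values per profile.

```python
def profile_features(image):
    """Return horizontal, vertical, right-diagonal and left-diagonal projections."""
    n = len(image)
    horizontal = [sum(row) for row in image]
    vertical = [sum(image[r][c] for r in range(n)) for c in range(n)]
    right_diagonal = [0] * (2 * n - 1)  # cells with constant r + c
    left_diagonal = [0] * (2 * n - 1)   # cells with constant c - r
    for r in range(n):
        for c in range(n):
            right_diagonal[r + c] += image[r][c]
            left_diagonal[c - r + n - 1] += image[r][c]
    return horizontal + vertical + right_diagonal + left_diagonal


# Toy 50 x 50 image containing only the main diagonal.
image = [[1 if r == c else 0 for c in range(50)] for r in range(50)]
features = profile_features(image)
```

A 50 × 50 image has 2 × 50 - 1 = 99 diagonals in each direction, which is where the 99 + 99 terms in the feature count come from.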


Method 5: Local Binary Pattern

Local Binary Patterns (LBP) are widely used as a feature extraction method in face, fingerprint and texture recognition. LBP operates on image pixels and replaces each pixel value with a code. Each central pixel is compared with its eight neighboring pixels: if the neighboring pixel has a smaller value than the central pixel, 0 is assigned to that neighbor, otherwise 1 is assigned. Taking the top-left neighbor as the first bit and moving clockwise generates an eight-bit binary code. The central pixel value is replaced by the decimal value of that binary code, and the histogram of these decimal values is used as features. Figure 5.5 shows LBP code generation. The generated binary code is 11000010, which is equivalent to 194 in decimal. To implement rotation-invariant features and reduce the size of the feature vector, Uniform LBP is used.

Figure 5.5 (a) Input image (b) The LBP (8,1) operator (c) LBP coded block

A local binary pattern is called uniform if its uniformity measure, the number of 0/1 transitions in the circular bit pattern, is at most 2. For example, the patterns 00000000 (0 transitions), 01110000 (2 transitions) and 11001111 (2 transitions) are uniform, and the patterns 11001001 (4 transitions) and 01010011 (6 transitions) are not uniform. In uniform LBP mapping there is a separate output label for each uniform pattern, and all the non-uniform patterns are assigned to a single label. With 58 uniform patterns and 1 label for the non-uniform patterns, a 59-point feature vector is obtained. To obtain Uniform LBP features in this research work, the binary image is first converted into a gray image, and the image is then divided into blocks of various sizes. For example, with a block size of 12 × 12, 16 blocks are generated and 59 features are obtained from each block, so the total is 16 × 59 + 59 features from the whole image = 1003 features.
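The LBP code and the uniformity test can be checked with a small Python sketch. The 3 × 3 block values are made up so that they give the code 11000010 quoted in the text (the actual values in Figure 5.5 are not reproduced here), and the function names are hypothetical.

```python
def lbp_code(block):
    """8-bit LBP code of a 3 x 3 block: top-left neighbour first, then clockwise."""
    center = block[1][1]
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for r, c in order:
        bit = 0 if block[r][c] < center else 1  # smaller neighbour -> 0, else 1
        code = (code << 1) | bit
    return code


def transitions(code, bits=8):
    """Number of 0/1 changes when the bit pattern is read circularly."""
    count = 0
    for i in range(bits):
        a = (code >> i) & 1
        b = (code >> ((i + 1) % bits)) & 1
        count += int(a != b)
    return count


def is_uniform(code):
    return transitions(code) <= 2


# Made-up block whose neighbours give the pattern 11000010.
block = [[6, 7, 1],
         [0, 5, 2],
         [9, 4, 3]]
code = lbp_code(block)

uniform_patterns = [p for p in range(256) if is_uniform(p)]
histogram_bins = len(uniform_patterns) + 1   # one extra bin for all non-uniform codes
total_features = 16 * histogram_bins + histogram_bins
```

Enumerating all 256 codes confirms the 58 uniform patterns, and the last line repeats the 16 × 59 + 59 count for the 12 × 12 block example.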

Method 6: Center Symmetric Local Binary Pattern

Center Symmetric Local Binary Pattern (CSLBP) is an extension of LBP in which the differences between opposite pixel values are taken. If the difference is greater than a threshold value, bit 1 is assigned, otherwise bit 0. The four opposite pairs give a 4-bit code, so the CSLBP histogram has 16 points.


Figure 5.6 shows the generation of the CSLBP code. Consider a threshold value of 8. The difference between one pair of opposite pixel values is 80 - 70 = 10, and 10 is greater than 8, so the bit is 1. The bits for the other opposite pairs are computed in the same way and the binary code is generated. Here, the binary code is 1011 and its decimal equivalent is 11.

Figure 5.6 Computation of CSLBP (8,1) with threshold = 8

As with LBP, CSLBP features are obtained by converting the image into gray scale and then dividing it into blocks. The total number of features for a 12 × 12 block size is 16 blocks × 16 features + 16 features of the whole image = 272.
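A minimal Python sketch of the CSLBP code follows. Only the pair 80 and 70 and the threshold 8 come from the text; the other block values are made up to give the code 1011, and the neighbor ordering (clockwise from top-left, first pair as the most significant bit) and the name `cslbp_code` are assumptions.

```python
def cslbp_code(block, threshold):
    """4-bit CSLBP code from the four centre-symmetric neighbour pairs."""
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    n = [block[r][c] for r, c in order]
    code = 0
    for i in range(4):
        # Compare each neighbour with the one directly opposite it.
        bit = 1 if (n[i] - n[i + 4]) > threshold else 0
        code = (code << 1) | bit
    return code


# Illustrative block: top-left 80 is opposite bottom-right 70, as in the text.
block = [[80, 60, 90],
         [40, 50, 95],
         [50, 58, 70]]
code = cslbp_code(block, threshold=8)

histogram_bins = 2 ** 4                        # 16 possible codes
total_features = 16 * histogram_bins + histogram_bins
```

Because only four comparisons are made per pixel, the histogram has 16 bins instead of the 59 used for Uniform LBP, which is why the feature vector shrinks from 1003 to 272 values.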

Method 7: Wavelet Transform

The wavelet transform is commonly used for representing and analyzing images. An image is represented by a two-dimensional matrix; the wavelet transform is applied first row-wise and then column-wise, so the final image is divided into four sub-bands, [LL, HL, LH, HH], which give the approximation, horizontal detail, vertical detail and diagonal detail of the image, respectively. Figure 5.7 shows the first-level wavelet transform. The approximation coefficients are used as features in this research work. The number of features is 256, 64 and 16 for levels 2, 3 and 4, respectively.

Figure 5.7 2-D Wavelet Transform
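The feature counts for the different levels can be illustrated with a sketch that keeps only the approximation (LL) band at each level. The sketch assumes a Haar wavelet and a 64 × 64 input, neither of which is stated for this method in the synopsis, and the function names are hypothetical.

```python
def haar_approximation(image):
    """One level of the 2-D Haar transform, keeping only the LL sub-band."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 2.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def approximation_features(image, level):
    """Apply the transform `level` times and flatten the final LL band."""
    approx = image
    for _ in range(level):
        approx = haar_approximation(approx)
    return [value for row in approx for value in row]


# Toy 64 x 64 image of ones.
image = [[1] * 64 for _ in range(64)]
features_level_2 = approximation_features(image, 2)
features_level_3 = approximation_features(image, 3)
features_level_4 = approximation_features(image, 4)
```

Each level halves both dimensions, so a 64 × 64 image gives 16 × 16 = 256, 8 × 8 = 64 and 4 × 4 = 16 approximation coefficients at levels 2, 3 and 4.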


5.2 Classification Method

Classification is the task of assigning an object to one of a set of predetermined classes. Five-fold cross-validation is used to evaluate the classifiers.

5.2.1 K-Nearest Neighbors Classifier
• In pattern recognition, KNN is a method for classifying objects based on the closest training examples in the feature space.
• KNN is a type of lazy learning, where the function is only approximated locally and all computation is deferred until classification.
• It is among the simplest of all machine learning algorithms: an object is classified by a majority vote of its neighbors, with the object being assigned to the class most common amongst its k nearest neighbors.
• k is a positive integer, typically small.
• If k = 1, then the object is simply assigned to the class of its nearest neighbor.
• KNN assumes that the data lie in a feature space.
• The data points can be scalars or multidimensional vectors. Since the points are in a feature space, they have a notion of distance.

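• Given an m-by-n data matrix X, which is treated as m (1-by-n) row vectors $x_1, x_2, \ldots, x_m$, the various distances between the vectors $x_s$ and $x_t$ are defined as follows:
  o Euclidean distance: $d_{st}^{2} = (x_s - x_t)(x_s - x_t)'$
  o City block metric: $d_{st} = \sum_{j=1}^{n} \lvert x_{sj} - x_{tj} \rvert$
  o Cosine distance: $d_{st} = 1 - \dfrac{x_s x_t'}{\sqrt{(x_s x_s')(x_t x_t')}}$
  o Correlation distance: $d_{st} = 1 - \dfrac{(x_s - \bar{x}_s)(x_t - \bar{x}_t)'}{\sqrt{(x_s - \bar{x}_s)(x_s - \bar{x}_s)'}\,\sqrt{(x_t - \bar{x}_t)(x_t - \bar{x}_t)'}}$, where $\bar{x}_s = \frac{1}{n}\sum_{j=1}^{n} x_{sj}$ and $\bar{x}_t = \frac{1}{n}\sum_{j=1}^{n} x_{tj}$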
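The four distances and the majority vote can be sketched in plain Python. This is an illustration on made-up two-dimensional points, not the implementation used for the reported results; all function names and the toy training set are hypothetical.

```python
import math
from collections import Counter


def euclidean(xs, xt):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xs, xt)))


def city_block(xs, xt):
    return sum(abs(a - b) for a, b in zip(xs, xt))


def cosine_distance(xs, xt):
    dot = sum(a * b for a, b in zip(xs, xt))
    norm = math.sqrt(sum(a * a for a in xs) * sum(b * b for b in xt))
    return 1 - dot / norm


def correlation_distance(xs, xt):
    # Cosine distance between the mean-centred vectors.
    mean_s, mean_t = sum(xs) / len(xs), sum(xt) / len(xt)
    return cosine_distance([a - mean_s for a in xs], [b - mean_t for b in xt])


def knn_predict(train_x, train_y, query, k=3, distance=euclidean):
    """Majority vote among the k training samples closest to the query."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy training set with two well-separated classes.
train_x = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = ["0", "0", "0", "1", "1", "1"]
prediction_a = knn_predict(train_x, train_y, [0.5, 0.5], k=3)
prediction_b = knn_predict(train_x, train_y, [5.5, 5.5], k=3, distance=city_block)
```

Swapping the `distance` argument is enough to compare metrics on the same feature vectors, which mirrors how the distance choice enters the classifier.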


Table 5.1 shows the accuracy for numerals, characters, and combined numerals and characters for the different feature extraction methods using the KNN classifier.

Feature Extraction Method                  Numerals   Characters   Mix
Pixel Count Intensity                      97.18%     78.12%       80.65%
Gradient based                             98.14%     89.06%       89.81%
Object Geometry based                      90.02%     67.25%       71.38%
Object Profile                             95.82%     76.71%       79.19%
Local Binary Pattern                       97.92%     88.12%       88.97%
Center Symmetric Local Binary Patterns     97.92%     87.65%       89.38%
Wavelet transform                          97.86%     81.40%       84.02%

Table 5.1 KNN Classifier Accuracy for Different Feature Extraction Method

Figure 5.8 shows the accuracy of the hybrid feature extraction methods. By concatenating CSLBP and gradient features, character recognition accuracy reaches 92.37%.

[Bar chart: Hybrid Feature Extraction Methods. x-axis: Features Extraction Method (Local Binary Pattern + Gradient, Center Symmetric Local Binary Patterns + Gradient, Wavelet Transform + Gradient); y-axis: % Accuracy]

Figure 5.8 Hybrid feature extraction method for characters

Figure 5.9 shows the accuracy of the hybrid feature extraction methods for mixed numerals and characters. By concatenating CSLBP and gradient features, the recognition accuracy reaches 92.64%.


[Bar chart: Hybrid Feature Extraction Methods. x-axis: Features Extraction Method (Local Binary Pattern + Gradient, Center Symmetric Local Binary Patterns + Gradient, Wavelet Transform + Gradient); y-axis: % Accuracy]

Figure 5.9 Hybrid feature extraction method for mixed numerals and characters

5.2.2 Support Vector Machine
• The support vector machine (SVM) works well when little is known in advance about the data.
• It works with unstructured and semi-structured data such as images, text and trees.
• The kernel trick is the main strength of SVM. With a suitable kernel function, complex non-linear problems can be handled.
• In contrast to neural networks, SVM training does not get trapped in local optima.
• It scales well to high-dimensional data.
• SVM often gives better results than ANN.
• SVM requires good kernel selection and a large dataset.
• SVM takes a longer training time than other classifiers.
• Common kernels:
  o Linear: $K(x,z) = x^{T}z$
  o Quadratic: $K(x,z) = (1 + x^{T}z)^{2}$
  o Polynomial: $K(x,z) = (1 + x^{T}z)^{d}$
  o RBF: $K(x,z) = \exp\left(-\lVert x - z \rVert^{2}\right)$
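The kernels listed above are simple functions of two feature vectors, as the following Python sketch shows. The function names are hypothetical, and the RBF kernel is written exactly as in the list, without a separate width parameter.

```python
import math


def dot(x, z):
    return sum(a * b for a, b in zip(x, z))


def linear_kernel(x, z):
    return dot(x, z)


def polynomial_kernel(x, z, d=2):
    # d = 2 gives the quadratic kernel from the list above.
    return (1 + dot(x, z)) ** d


def rbf_kernel(x, z):
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance)


x, z = [1, 2], [3, 4]
k_linear = linear_kernel(x, z)            # 1*3 + 2*4
k_quadratic = polynomial_kernel(x, z)     # (1 + 11) ** 2
k_cubic = polynomial_kernel(x, z, d=3)    # (1 + 11) ** 3
k_rbf_same = rbf_kernel(x, x)             # identical vectors
```

The RBF value is 1 for identical vectors and decays toward 0 as the vectors move apart, which is what lets it act as a similarity measure.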


Table 5.2 shows the accuracy for numerals, characters, and combined numerals and characters for the different feature extraction methods using the SVM classifier.

Feature Extraction Method                  Numerals   Characters   Mix
Pixel Count Intensity                      96.90%     76.28%       81.05%
Gradient based                             98.72%     89.57%       92.10%
Object Geometry based                      90.70%     67.59%       70.56%
Object Profile                             93.50%     64.45%       70.08%
Local Binary Pattern                       97.50%     85.99%       88.26%
Center Symmetric Local Binary Patterns     95.82%     83.3%        86.21%
Wavelet transform                          97.40%     84.93%       86.68%

Table 5.2 SVM Classifier Accuracy for Different Feature Extraction Method

Figure 5.10 shows the accuracy of the hybrid feature extraction methods. By concatenating wavelet and gradient features, character recognition accuracy reaches 92.21%.

[Bar chart: Hybrid Feature Extraction Methods. x-axis: Features Extraction Method (Local Binary Pattern + Gradient, Center Symmetric Local Binary Patterns + Gradient, Wavelet Transform + Gradient); y-axis: % Accuracy]

Figure 5.10 Hybrid feature extraction method for characters

Figure 5.11 shows the accuracy of the hybrid feature extraction methods for mixed numerals and characters. By concatenating wavelet and gradient features, the recognition accuracy reaches 92.93%.


[Bar chart: Hybrid Feature Extraction Methods. x-axis: Features Extraction Method (Local Binary Pattern + Gradient, Center Symmetric Local Binary Patterns + Gradient, Wavelet Transform + Gradient); y-axis: % Accuracy]

Figure 5.11 Hybrid feature extraction method for mixed numerals and characters

5.3 Deep Learning:
• In deep learning, a computer model learns to perform classification tasks directly from images, text, or sound.
• Deep learning models can achieve state-of-the-art accuracy, sometimes exceeding human-level performance.
• Models are trained by using a large set of labeled data and neural network architectures that contain many layers.
• Deep learning requires substantial computing power. High-performance GPUs have a parallel architecture that is efficient for deep learning.
• Most deep learning methods use neural network architectures, which is why deep learning models are often referred to as deep neural networks.

5.3.1 Convolutional Neural Network (CNN)
• The CNN is one of the most popular algorithms for deep learning with images. It is composed of an input layer, an output layer, and many hidden layers in between.
• A CNN usually consists of:
  o Convolution Layer: The convolutional layer is used to extract features from the image. Different filters with different weights are applied to the input, and as a result two-dimensional feature maps are generated.
  o Pooling Layer or Sub Sampling: The pooling layer operates on each feature map. It can decrease the spatial dimensions but cannot decrease the depth of the feature map. It reduces the number of parameters and the amount of computation in the network.
  o Fully Connected Layer (Classification): The fully connected layer maps the features obtained by the previous layers to the output classes.

Figure 5.12 Architecture of CNN

Figure 5.13 Implementation of CNN


Figure 5.12 shows the CNN architecture, which has two convolutional layers, two pooling layers and one fully connected layer. It also contains two ReLU layers, which introduce non-linearity by replacing all negative values with zero. A Softmax layer is used to highlight the largest value and suppress values that are significantly below the maximum. The implementation of the CNN is shown in Figure 5.12. The center blocks of the image show the sequence of layers, the left side shows the size of the feature map of each layer, and the right side shows the number of feature maps with their size. The output of each layer is shown in Figure 5.13. Table 5.3 shows the accuracy for numerals and characters for input sizes 64 × 64 and 96 × 96.
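The effect of the ReLU, pooling and Softmax layers can be seen on small arrays with the following Python sketch. It is not the network used for the reported results: the function names are hypothetical, and 2 × 2 max pooling is an assumption, since the synopsis does not state the pooling type or window size.

```python
import math


def relu(feature_map):
    """Replace every negative value with zero."""
    return [[max(0.0, value) for value in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """Halve both spatial dimensions by keeping the largest value of each 2 x 2 window."""
    rows, cols = len(feature_map) // 2, len(feature_map[0]) // 2
    return [
        [
            max(feature_map[2 * r][2 * c], feature_map[2 * r][2 * c + 1],
                feature_map[2 * r + 1][2 * c], feature_map[2 * r + 1][2 * c + 1])
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    largest = max(scores)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


activated = relu([[-1, 2], [3, -4]])
pooled = max_pool_2x2([[1, 2, 5, 6],
                       [3, 4, 7, 8],
                       [9, 10, 13, 14],
                       [11, 12, 15, 16]])
probabilities = softmax([1.0, 2.0, 5.0])
```

The largest score dominates the Softmax output while the others are pushed toward zero, which is the "highlight and suppress" behaviour described above.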

CNN Layers                No. of Filters   % Accuracy for Numerals     % Accuracy for Characters
                                           64 × 64      96 × 96        64 × 64      96 × 96
Convolutional layer 1     20               80.7         94.4           70.15        74.15
Convolutional layer 2     40
Convolutional layer 1     40               88.25        95.1           72           76.25
Convolutional layer 2     80

Table 5.3 Proposed CNN Architecture Accuracy for Numerals and Characters

5.3.2 Pretrained Network Approach

A pretrained network is a network that has already been trained on a large standard dataset for a problem similar to the one we want to solve. It already knows how to extract informative and powerful features. More than a million images are used to train this type of network, and its output is classified into approximately 1000 classes.

Figure 5.14 Pretrained Network Approach


Pretrained networks such as alexnet, vgg16, vgg19, googlenet, resnet18, resnet50 and shufflenet can be used for a new task either purely for feature extraction or with the transfer learning approach, as shown in Figure 5.14. AlexNet is used as the pretrained network in this work. Calling alexnet returns a pretrained AlexNet model, which contains 25 layers. The ImageNet database was used to train this model, and it classifies images into 1000 classes such as different animals, mouse, pencils, cup and ambulance.

A. Pretrained network as feature extractor

In this approach, the AlexNet model is used for feature extraction, and the extracted features are given to an SVM classifier for training and testing. Features are extracted from layer 20 of AlexNet, which is the layer ‘fc7’. The recognition accuracy for the numerals, characters and mixed databases is shown in Table 5.4.

B. Transfer Learning Approach

In this approach, all the layers of the pretrained AlexNet are used except the last three. The new task is carried out by replacing those last three layers with a fully connected layer, a Softmax layer and a classification output layer. The recognition accuracy for the numerals, characters and mixed databases for this approach is shown in Table 5.4.

Database      SVM Classification Approach   Transfer Learning Approach
Numerals      97.40%                        99.30%
Characters    86.50%                        97.65%
Mixed         89.33%                        97.73%

Table 5.4 Pretrained Network Approach

6. Achievements with respect to objectives
• A database of handwritten Gujarati numerals and characters has been created, containing 5000 samples of numerals and 10,000 samples of characters.
• Handwritten Gujarati numerals (10 classes in total) are recognized with 98.14% accuracy using the gradient feature extraction method and a KNN classifier, 98.72% accuracy using gradient features and an SVM classifier, and 99.30% accuracy using the transfer learning approach in deep learning.
• Handwritten Gujarati characters (40 classes in total) are recognized with 92.37% accuracy using the CSLBP + gradient hybrid feature extraction method and a KNN classifier, 92.21% accuracy using wavelet + gradient hybrid features and an SVM classifier, and 97.65% accuracy using the transfer learning approach in deep learning.


• Handwritten Gujarati numerals and characters (48 classes in total) are recognized with 92.64% accuracy using the CSLBP + gradient hybrid feature extraction method and a KNN classifier, 92.93% accuracy using wavelet + gradient hybrid features and an SVM classifier, and 97.73% accuracy using the transfer learning approach in deep learning.
• Three applications are implemented:
  1) Handwritten Gujarati numerals to speech conversion.
  2) Handwritten Gujarati characters to speech conversion.
  3) Automatic handwritten marks recognition.

7. Conclusion:
• The handwritten Gujarati numeral recognition algorithm was successfully developed using a large number (5000) of test images, with an accuracy of 99.30%.
• The handwritten Gujarati character recognition algorithm was successfully developed using a large number (10,000) of test images, with an accuracy of 97.65%.
• The handwritten Gujarati numeral and character recognition algorithm was successfully developed using a large number (15,000) of test images, with an accuracy of 97.73%.


8. List of publications

1) Mikita Gandhi, V. K. Thakar, H. N. Patel, “Handwritten Gujarati Numeral Recognition using wavelet Transform”, Journal of Applied Science and Computation (JASC), Volume VI, Issue IV, 2019.
2) Mikita Gandhi, V. K. Thakar, H. N. Patel, “Gujarati Handwritten Character Recognition Using Convolutional Neural Network”, Journal of Emerging Technologies and Innovative Research (JETIR), Volume VI, Issue V, May 2019.


References

1) Antani S, Agnihotri L, “Gujarati character recognition”, In: Proceedings of fifth international conference on document analysis and recognition, 1999 (ICDAR’99), pp 418–421.
2) Shah SK, Sharma A, “Design and implementation of optical character recognition system to recognize Gujarati script using template matching”, J. Inst Eng () Electron Telecommunication Eng., 2006.
3) Ankit K. Sharma, Dipak M. Adhyaru, Tanish H. Zaveri, Priyank B Thakkar, “Comparative analysis of zoning based methods for Gujarati handwritten numeral recognition”, 5th Nirma University International Conference on Engineering (NUiCONE), IEEE, 2015.
4) Vyas, A. N., Goswami, M. M., “Classification of hand written Gujarati numerals”, IEEE transactions on pattern analysis and machine intelligence, pp. 1231-1237, 2015.
5) A. Desai, “Support vector machine for identification of handwritten Gujarati alphabets using hybrid feature space”, CSIT, Springer, January 2015.
6) Dholakia J, Negi A, Rama Mohan S, “Zone identification in the printed Gujarati text”, In: Proceedings of the eight international conference on document analysis and recognition, 2005 (ICDAR’05).
7) Swital J. Macwan, Archana N. Vyas, “Classification of Offline Gujarati Handwritten Characters”, International Conference on Advances in Computing, Communications and Informatics (ICACCI), 2015.
8) Dr. Dinesh Satange, Dr. P E Ajmire, Fozia I. Khandwani, “Offline Handwritten Gujrati Numeral Recognition Using MLP Classifier”, International Journal of Novel Research and Development, Volume 3, Issue 8, August 2018.
9) Sekhar Mandal, Sanjib Sur, Avishek Dan, “Handwritten Bangla Character Recognition in Machine-printed Forms using Gradient Information and Haar Wavelet”, 2011 International Conference on Image Information Processing.
10) Ashutosh Aggarwal, Rajneesh Rani, Renu Dhir, “Handwritten Character Recognition Using Gradient Features”, International Journal of Advanced Research in Computer Science and Software Engineering (ISSN: 2277-128X), Vol. 2, Issue 5, pp. 85-90, May 2012.
11) T. Hassan, H. Khan, “Handwritten Bangla Numeral Recognition using Local Binary Pattern”, 2nd Int'l Conf. on Electrical Engineering and Information & Communication Technology (ICEEICT), 2015.


12) M. Pietikäinen, A. Hadid, G. Zhao, T. Ahonen, "Local Binary Patterns for Still Images", Chapter 2 in Computer Vision Using Local Binary Patterns, Computational Imaging and Vision 40, Springer-Verlag London Limited, pp. 13-47, 2011.
13) Sekhar Mandal, Sanjib Sur, Avishek Dan, "Handwritten Bangla Character Recognition in Machine-printed Forms using Gradient Information and Haar Wavelet", International Conference on Image Information Processing, 2011.
14) Saleem Pasha, M. C. Padma, "Handwritten Kannada Character Recognition using Wavelet Transform and Structural Features", International Conference on Emerging Research in Electronics, Computer Science and Technology, 2015.
15) Sen S., Shaoo D., Paul S., Sarkar R., Roy K., "Online Handwritten Bangla Character Recognition Using CNN: A Deep Learning Approach", In: Bhateja V., Coello Coello C., Satapathy S., Pattnaik P. (eds) Intelligent Engineering Informatics, Advances in Intelligent Systems and Computing, Vol. 695, Springer, Singapore, 2018.
16) Alom M. Z., Sidike P., Taha T. M., Asari V. K., "Handwritten Bangla Digit Recognition Using Deep Learning", Neural Processing Letters, Vol. 45, pp. 703-725, 2017.
17) C. Boufenar, A. Kerboua, M. Batouche, "Investigation on deep learning for off-line handwritten Arabic character recognition", Cogn. Syst. Res., 2017.
18) M. Blumenstein, B. K. Verma and H. Basli, "A Novel Feature Extraction Technique for the Recognition of Segmented Handwritten Characters", 7th International Conference on Document Analysis and Recognition (ICDAR '03), Edinburgh, Scotland, pp. 137-141, 2003.
19) Patel C. N., Desai A. A., "Segmentation of text lines into words for Gujarati handwritten text", In: Proceedings of International Conference on Signal and Image Processing, 2010 (ICSIP'10), IEEE Xplore, 15-17.
20) Dapeng Tao, Xu Lin, Lianwen Jin, "Principal Component 2-D Long Short-Term Memory for Font Recognition on Single Chinese Characters", IEEE Transactions on Cybernetics, Vol. 46, No. 3, March 2016.
21) Devendra K. Sahu and C. V. Jawahar, "Unsupervised Feature Learning for Optical Character Recognition", In: Proceedings of the 13th International Conference on Document Analysis and Recognition (ICDAR'15), 2015.
22) Mohamed Dahi, Noura A. Semary, and Mohiy M. Hadhoud, "Primitive Printed Arabic Optical Character Recognition using Statistical Features", IEEE Seventh International Conference on Intelligent Computing and Information Systems (ICICIS'15), 2015.
23) Tanzila Saba, "Language Independent Rule Based Classification of Printed & Handwritten Text", IEEE International Conference on Evolving and Adaptive Intelligent Systems (EAIS), December 2015.
24) Abdeljalil Gattal, "Segmentation-Verification Based on Fuzzy Integral for Connected Handwritten Digit Recognition", IEEE transaction on Image Processing Theory, Tools and Applications, 2015.
25) Patel C. N., Desai A. A., "Gujarati handwritten character recognition using hybrid method based on binary tree-classifier and k-nearest neighbour", Int J Eng Res Technology, II(6), pp. 2337-2345, 2013.
26) D. Bradley and G. Roth, "Adaptive thresholding using the integral image", Journal of Graphics Tools, Vol. 12, No. 2, pp. 13-21, Jun 2007.
27) Wojciech Bieniecki, Szymon Grabowski and Wojciech Rozenberg, "Image Preprocessing for Improving OCR Accuracy", International Conference on Perspective Technologies and Methods in MEMS Design (MEMSTECH 2007), pp. 75-80, 23-26 May 2007.
28) Luis R. Blando, Junichi Kanai, and Thomas A. Nartker, "Prediction of OCR Accuracy Using Simple Image Features", IEEE Proceedings of the Third International Conference on Document Analysis and Recognition, Vol. 1, pp. 319-322, 14-16 Aug 1995.
29) Chinmay Chinara, Nishant Nath, Subhajeet Mishra, "A Novel Approach to Skew-Detection and Correction of English Alphabets for OCR", IEEE Student Conference on Research and Development (SCOReD), pp. 241-244, 5-6 Dec. 2012.
30) Xiaoling Fu, Yazhuo Xu, Lijing Tong, "Document Image Skew Adjusting Based on the Feedback Information Recognized By OCR", IEEE 3rd International Conference on Communication Software and Networks (ICCSN), pp. 376-378, 27-29 May 2011.
31) E. Kavallieratou, N. Fakotakis and G. Kokkinakis, "Handwritten Character Recognition based on Structural Characteristics", IEEE 16th International Conference on Pattern Recognition, Proceedings, Vol. 3, pp. 139-142, 2002.
32) Hanchuan Peng, Fuhui Long, Zheru Chi, "Document Image Recognition Based on Template Matching of Component Block Projections", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 25, No. 9, pp. 1188-1192, September 2003.

33) Pepe Siy, C. S. Chen, "Fuzzy Logic for Handwritten Numeral Character Recognition", IEEE Transactions on Systems, Man and Cybernetics, Vol. 4, No. 6, pp. 570-575.
34) Salvador España-Boquera, Maria Jose Castro-Bleda, Jorge Gorbe-Moya, and Francisco Zamora-Martinez, "Improving Offline Handwritten Text Recognition with Hybrid HMM/ANN Models", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 33, No. 4, pp. 767-779, April 2011.
35) D. Bradley and G. Roth, "Adaptive thresholding using the integral image", Journal of Graphics Tools, Vol. 12, No. 2, pp. 13-21, Jun 2007.
36) Koga, M., "Camera-based Kanji OCR for mobile-phones: practical issues", Eighth IEEE International Conference on Document Analysis and Recognition, Vol. 2, pp. 635-639, 29 Aug.-1 Sept. 2005.
37) Lund, W. B., "Error Correction with In-domain Training across Multiple OCR System Outputs", IEEE International Conference on Document Analysis and Recognition (ICDAR), pp. 658-662, 18-21 Sept. 2011.
38) Bhattacharya, U., "Handwritten Numeral Databases of Indian Scripts and Multistage Recognition of Mixed Numerals", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 31, No. 3, pp. 444-457, March 2009.
39) Kavallieratou, E., "New algorithms for skewing correction and slant removal on word-level [OCR]", The 6th IEEE International Conference on Electronics, Circuits and Systems, Vol. 2, pp. 1159-1162, 5-8 Sep 1999.
40) Nikhil Pai, Vijaykumar S. Kolkure, "Optical Character Recognition: An Encompassing Review", International Journal of Research in Engineering and Technology, Vol. 04, Issue 01, Jan 2015.
41) J. Mantas, "An overview of character recognition methodologies", Pattern Recognition, Vol. 19, No. 6, pp. 425-430, 1986.
42) Réjean Plamondon and Sargur N. Srihari, "On-Line and Off-Line Handwriting Character Recognition: A Comprehensive Survey", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22, No. 1, January 2000.
43) Amritha Sampath, Tripti C, Govindaru V, "Freeman code based online handwritten character recognition for Malayalam using back propagation neural networks", International Journal on Advanced Computing, Vol. 3, No. 4, pp. 51-58, July 2012.

44) Pradeep, E. Shrinivasan and S. Himavathi, "Diagonal Based Feature Extraction for Handwritten Alphabets Recognition System Using Neural Network", International Journal of Computer Science & Information Technology (IJCSIT), Vol. 3, No. 1, Feb 2011.
45) Om Prakash Sharma, M. K. Ghose, Krishna Bikram Shah, "An Improved Zone Based Hybrid Feature Extraction Model for Handwritten Alphabets Recognition Using Euler Number", International Journal of Soft Computing and Engineering (ISSN: 2231-2307), Vol. 2, Issue 2, pp. 504-508, May 2012.
46) He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 770-778).
47) Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (pp. 1097-1105).
48) R. Gonzalez, E. Woods, Digital Image Processing, 3rd edition, Prentice Hall.
49) www.mathswork.com
