Available online at www.sciencedirect.com ScienceDirect

Procedia Computer Science 89 ( 2016 ) 836 – 844

Twelfth International Multi-Conference on Information Processing-2016 (IMCIP-2016) Multimodal Recognition Framework: An Accurate and Powerful Nandinagari Handwritten Character Recognition Model Prathima Guruprasada,∗ and Jharna Majumdarb

aUniversity of Mysore (Research Centre: Nitte Research and Education Academy) bNitte Meenakshi Institute of Technology, Bangalore, , India

Abstract Effective and efficient recognition of ancient Nandinagari Handwritten manuscripts depend on the identification of the key interest points on the images which are invariant to Scale, rotation, translation and illumination. Good literature for finding specific types of key interest points using single modality is available for scripts. However, finding the best points and its descriptors is challenging and could vary depending on the type of handwritten character. Its choice is also dependent on the tradeoff between precision of recognition and speed of recognition and the volume of handwritten characters in question. Thus for a recognition system to be effective and efficient, we need to adopt a multimodal approach in which the different modalities are used in a collaborative manner. In this paper, instead of separately treating the different recognition models and their algorithms, we focus on applying them at a phase where their recognition is optimal. A varied set of Handwritten Nandinagari characters are selected with different image formats, scale, rotation, translation and illumination. Various models of feature extraction techniques are applied of which SURF is chosen to be very performance efficient with good accuracy. A dissimilarity ratio matrix is computed out of the maximum number of matched features of query to test image and vice versa to denote the magnitude of dissimilarity between the images in the data test updated with the new query image. The result is processed through agglomerative clustering to achieve powerful and accurate clusters belonging to designated set of images with 99% percent recognition. © 2016 The Authors. Published by Elsevier B.V. Peer-review© 2016 The Authors. under Published responsibility by Elsevier of organizing B.V. This is committee an open access of thearticle Twelfth under the International CC BY-NC-ND Multi-Conference license on Information (http://creativecommons.org/licenses/by-nc-nd/4.0/). Processing-2016 (IMCIP-2016). Peer-review under responsibility of organizing committee of the Organizing Committee of IMCIP-2016 Keywords: Feature Extraction; Handwritten Text Recognition; Hierarchical Clustering; Pattern Classification; Principal Component Analysis; Scale Invariant Feature Transform; Speeded Up Robust Feature.

1. Introduction

Nandinagari is earlier version of . These scripts have rich heritage and cover vast areas of philosophy, politics, Management, science, arts, culture and religion. In the recent years, there has been a great improvement in research related to the recognition of printed and handwritten Devanagari scripts. But this is the first attempt to recognize handwritten Nandinagari script. They pose significant challenges due to its stroke variety and proximity in appearance. Handwritten Nandinagari characters can come with different illumination, sizes, orientation and occlusion. Orientation could be negative or positive orientation while occlusion include challenges in identifying the characters.

∗ Corresponding author. Tel.: +91 -9449153569. -mail address: [email protected]

1877-0509 © 2016 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of organizing committee of the Organizing Committee of IMCIP-2016 doi: 10.1016/j.procs.2016.06.069 Prathima Guruprasad and Jharna Majumdar / Procedia Computer Science 89 ( 2016 ) 836 – 844 837

The effects of Occlusion could be due to a spray of water droplets on the script, stains of oil, explosion of object pixels, circular waves on image, convulsion, blurring, noise or other forms. A dataset of over 1000 Handwritten Nandinagari characters are created for this project to aid research since standard data sets of similar types are not available for this work. The objective is to derive features of the image that are invariant to scale, orientation, illumination and occlusion. These invariant features derived are used to form an intelligent database that help to visualize the data.

2. Related Work

This is the first of its kind for handwritten Nandinagari characters.Hierarchical Face clustering using SIFT1–3 is a popular algorithm4 proposed for clustering face images found in video sequences. This novel method uses hierarchical average linkage clustering algorithm, which yields the clustering result. SURF4–6 is also another popular algorithm for extracting key features of an image. These two methods provide scale and rotation invariant interest points. They even outperform previously proposed schemes with respect to repeatability, distinctiveness, and robustness. RANSAC7,9 algorithm is used as a transformation model for matching. The key advantages of both these approaches are: • They can be extended to other newer and unexplored versions of character variations. • There is very less need to force normalization of characters before recognition unlike other approaches such as neural networks. • No or lesser need to pre filter the images which adds to less processing time.

3. Multi Models

SIFT starts with the scale space to obtain the potential regions of interest (ROI) points. Document image processing uses this approach widely8–10. ROI points are rejected if they have a low contrast or if they were located on an edge in order to have stable interest points. These interest points are scale invariant. We then assign orientation to these interest points which make them rotation invariant. This increases efficiency and robustness of the SIFT algorithm. SURF method is similar to SIFT approach but relies on integral images for image convolutions to reduce computation time, builds the scale space in increasing size order. From the scale space, potential interest points are obtained. Using Hessian matrix, stable points are selected which are scale invariant. For these points, different orientations are assigned to make them rotation invariant. Integral images are used for speed and only 64 dimensions are used reducing the time for feature computation and matching. SURF is less computationally demanding than SIFT. Other models such as SUSAN, FAST, and HARRIS are not as effective and efficient as compared to the above. The selection of the model is hence crucial for our framework and after rigorous testing, we choose SURF for extraction and matching in Fig. 1, steps 1 through 3 below:

4. Methodology

In the first step we setup a new Image database with 1040 handwritten Nandinagari character image datasets. This had all the 52 characters of Nandinagari script with four size variations. Within each size, we had samples with 5 varieties involving different rotations and translation/image format. These images (A) are acquired from a scanner/digital camera or any other suitable digital input device manually and converted into Grey scale images. SURF features are extracted and values are stored in the database. In the second step, a query image (B) is provided, converted to grey scale and similar features are extracted and stored in the database. In the third step, every image in training set (A) is matched with any corresponding features in test image (B) and the maximum match if stored in the database. A total of 4608 such combinations of matches are evaluated for a 48 character sample base and 2304 values having the Maximum of the to and fro combinations are stored in the database as shown in Table 1. 838 Prathima Guruprasad and Jharna Majumdar / Procedia Computer Science 89 ( 2016 ) 836 – 844

Fig. 1. MultiModal Recognition Framework Architecture.

Table 1. Match Features.

Source Image A PointMatch A → B Query Image B PointMatch B → A 11D50−A0000−37.jpg 57 of 57 11D50−A0000−37.jpg 57 of 57 11D50−A0000−37.png 56 of 58 11D50−A0000−37.jpg 53 of 57 11D50−A0000−37−135.jpg 23 of 49 11D50−A0000−37.jpg 23 of 57 11D50−A0000−37−180.jpg 41 of 59 11D50−A0000−37.jpg 37 of 57 11D50−A0000−37−45.jpg 22 of 57 11D50−A0000−37.jpg 22 of 57 11D50−A0000−37−90.jpg 46 of 62 11D50−A0000−37.jpg 40 of 57

In Table 1, PointMatch A → B indicates that out of 59 SURF points in A(11D50−A0000−37−180.jpg), 41 are matching to B(11D50−A0000−37.jpg) while PointMatch B → A indicates the only 37 matching points in B out of 57 SURF points in A. In the fourth step, these matches are transformed to a dissimilarity ratio matrix between the two compared images using the formula (1):

DRji = DRij = 100 (1 − Mij/ min (Ki, Kj)) (1)

where Mij is the maximum number of key point matches found between the pairs (, Aj), (Aj, Ai). Ki and Kj are the number of key points found in Ai and Aj respectively. DRij AC (0,100) and a high DRij value indicate large dissimilarity between handwritten Nandinagari characters. In the fifth step, this data is fed to a hierarchical clustering method which in turn transforms into a sequence of nested partitions. These values are captured for training set images and indicated in Table 3.

5. Result Analysis

The results are obtained for various stages of character recognition. As shown below Fig. 2, Fig. 3 and Fig. 4, the results are analysed using Robust feature transform techniques. The samples of images of different size 256 × 256, 384 × 384, 512 × 512, 640 × 640, different orientation angles 0◦,45◦,90◦, 135◦, 180◦ and translated image of size Prathima Guruprasad and Jharna Majumdar / Procedia Computer Science 89 ( 2016 ) 836 – 844 839

384 × 384 are taken. Figure 4 shows a sample of the same. Figure 5 shows the Nine clusters that get clearly identified when hierarchical clustering is applied.

5.1 SURF results (384 × 384)

Fig. 2. SURF Interest Points for Nandinagari Characters.

5.2 SIFT results

Fig. 3. SIFT Interest Points for Nandinagari Characters.

Fig. 4. List of Characters in Nandinagari along with Comparison of SURF and SIFT Interest Points for Nandinagari Characters. 840 Prathima Guruprasad and Jharna Majumdar / Procedia Computer Science 89 ( 2016 ) 836 – 844

Table 2. Metrics for Comparison of Handwritten Nandinagari Character Images using SIFT and SURF Method.

Fig. 5. List of NINE Clusters after Applying Hierarchical Clustering to Training Set.

Table 2 indicates the metrics for comparison of Handwritten Nandinagari character images with SIFT and SURF methods. For every feature transform type, Average displacement indicates the distance computed using RANSAC. The Interest points indicate the number of points for interest for each detector type. The Matches field indicates the number of points with feature descriptors that are identical across the images in the row. The Time Taken indicates the time taken for the query image to retrieve the matches in milli seconds. This time is less as compared to SIFT method since only half (64) feature descriptors are required to be computed. From the above, we hence choose SURF as the Model for feature extraction and computed the dissimilarity matrix as shown in Table 3. Figure 6 shows two query images used for testing. The first query image was a vowel which had a 300X300 size and a negative 20 degree orientation was named 11D50−A0000−37-min20-300.jpg. The second query image was a consonant had a 90 degree positive orientation of size 400 × 400 was named 11D74−BA000−37-90-400.jpg. The dissimilarity matrix is found and hierarchical clustering is applied. They automatically get grouped into the respective green boxed cluster and highlighted with an inner red box as shown below. Prathima Guruprasad and Jharna Majumdar / Procedia Computer Science 89 ( 2016 ) 836 – 844 841

Table 3. Dissimilarity Matrix for Handwritten Nandinagari Character Training Images (Dsm1 through Dsm48). 842 Prathima Guruprasad and Jharna Majumdar / Procedia Computer Science 89 ( 2016 ) 836 – 844

Fig. 6. List of NINE Clusters after Applying Hierarchical Clustering to Test Set.

a. Dissimilarity Matrix for letter A and its variants (test image-11D74−BA000−37-90-400.jpg)

Fig. 7. Dissimilarity Matrix for Letter A and its Variants Shows the Dip for Matching Letters Only. Prathima Guruprasad and Jharna Majumdar / Procedia Computer Science 89 ( 2016 ) 836 – 844 843 b. Dissimilarity matrix for and its variants (training images only) is shown below. Note its minimum value for dsm indices 29 to 33

Fig. 8. Dissimilarity Matrix for Letter La and its Variants Shows the Dip for Matching Letter Only.

Fig. 9. PCA Applied to Dissimilarity Matrix and then Subjected to Hierarchical Clustering.

6. Conclusions

This paper presents a novel approach to recognizing the Nandinagari characters. Multiple models are deployed and compared to get the best feature extractor which is invariant to scale, rotation and translation. A dissimilarity 844 Prathima Guruprasad and Jharna Majumdar / Procedia Computer Science 89 ( 2016 ) 836 – 844

matrix formed fed to hierarchical clustering algorithm clearly separates the 9 sample characters with over 48 variants indicating very high recognition accuracy. PCA is applied to verify if the results obtained are in order as shown in Fig. 8. The cluster number is as indicated in Fig. 4. This also classifies into 9 distinctly different clusters as expected.This work can be extended to the entire result set to obtain effective and efficient results.

References

[1] D. G. Lowe, Distinctive Image Features from Scale-invariant Key Points, IJCV, vol. 60(2), pp. 91–110, (2004). [2] Ravi Shekhar and C.V. Jawahar, Word Image Retrieval using Bag of Visual Words, IEEE 10th IAPR International Workshop on Document Analysis Systems, DOI 10.1109/DAS.2012.96, (2012). [3] Ives Rey-Otero, Jean-Michel Morel and Mauricio Delbarcio, An Analysis of Scale-space Sampling in SIFT, ICIP, (2014). [4] P. Antonopoulos, Hierarchical Face Clustering using SIFT Image Features, IEEE Symposium on Computational Intelligence in Image and Signal Processing, CIISP, (2007). [5] Baofeng Zhang, Yingkui Jiao Zhijun , Yongchen Li and Junchao Zhu, An Efficient Image Matching Method using Speed Up Robust Features, IEEE International Conference on Mechatronics and Automation, (2014). [6] Dong Hui and Han Dian Yuan, Research of Image Matching Algorithm based on SURF Features, IEEE International Conference on Computer Science and Information Processing (CSIP), (2012). [7] K. M. Goh, S. A. R. Abu-Bakar and M. M. Mokji et al., Implementation and Application of Orientation Correction on SIFT and SURF Matching, International Journal on Advanced Soft Computputing Applications, vol. 5(3), (2013). [8] Martin A. Fischler and Robert C. Bolles, Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography, In Journal on Communications of the ACM, ACM Press, NY, vol. 24(6), pp. 381–395, (1981). [9] S. Setlur and V. Govindaraju (editors), Guide to OCR for Indic Scripts, in Springer, (2009). [10] K. P. Sankar, V. Ambati, L. Pratha and C. V. Jawahar, Digitizing a Million Books: Challenges for Document Analysis in DAS, pp. 425–436, (2006).