International Journal of Computer Science and Network (IJCSN) Volume 1, Issue 6, December 2012 www.ijcsn.org ISSN 2277-5420

Hand Gesture Recognition using Neural Network

1 Rajesh Mapari, 2 Dr. Govind Kharat

1 Dept of Electronics and Telecommunication Engineering, Anuradha Engineering College, Chikhli, Maharashtra-443201, India

2 Principal, Sharadchandra Pawar College of Engineering, Otur, Maharashtra-443201, India

Abstract

This paper presents a simple method to recognize sign gestures using features such as the number of peaks and valleys in an image together with their positions. Sign language is mainly employed by deaf-mute people to communicate with each other through gestures and vision. We extract the skin region representing the hand from an image using the L*a*b* color space. Every hand gesture is cropped from the image so that the hand is placed at the center of the image, for ease of finding features. The system requires the hand to be properly aligned to the camera, but it does not need any special color markers, gloves, or wearable sensors. The experimental results show a 100% recognition rate on both the training and testing data sets.

Keywords: Gesture recognition, boundary tracing, segmentation, peaks & valleys.

1. Introduction

The ultimate aim of our research is to enable communication between speech-impaired (i.e. deaf-dumb) people and common people who do not understand sign language. This may work as a translator [10] to convert sign language into text or spoken words. Our work has explored a modified way of recognizing signs using peaks and valleys, with the added feature of the position of the fingers in the image.

There have been many approaches to recognizing signs using data gloves [11], [12] or colored gloves [15] worn by the signer to derive features from a gesture or posture. Ravikiran J. et al. proposed a method of recognizing signs using the number of fingers opened in a gesture representing an alphabet of American Sign Language [1]. Iwan Njoto Sandjaja et al. proposed a modification of color-coded gloves which uses fewer colors than the color-coded gloves of previous research to recognize Filipino Sign Language [2]. Jianjie Zhang et al. proposed a new complexion model to extract hand regions under a variety of lighting conditions [3]. V. Radha et al. developed a threshold-based segmentation process which helps to build a better vision-based sign recognition system [4]. Ryszard S. Choras proposed a method for identification of persons based on the shape of the hand, and a second method for recognizing gestures and signs executed by hands using geometrical and Radon transform (RT) features [5]. Salma Begum and Md. Hasanuzzaman proposed a system which uses a PCA (Principal Component Analysis) based pattern matching method for recognition of signs [6]. Yang Quan, Peng Jinye, and Li Yulong proposed a novel vision-based SVM [8] classifier for sign language recognition [7]. Vision-based sign language recognition systems use many image features, such as area and DCT coefficients, with a Neural Network [9] or HMM [14], [16] as the classifier.

2. Proposed Methodology

In this paper we present an efficient and accurate technique for sign detection. Our method has five phases of processing, viz. image cropping, resizing, marking and counting of peaks and valleys, dividing the image into sixteen parts, and finding the positions of the peaks and valleys, as shown in Figure 1.

Input Image → Image Cropping and Resizing → Marking and counting peaks and valleys → Dividing image into sixteen parts and finding positions of peaks and valleys → Training neural network with parameters and recognizing sign

Fig. 1 Block Diagram of Sign Detection

The authors have collected data from 20 persons (students of an engineering college) who were given a little training on how to perform the signs. For acquiring images we used a camera of 1.3M pixels (interpolated 12M pixels still image resolution).

In the first phase we read the image and crop it, maintaining the height-width ratio of the hand portion only. Later the hand portion is resized to 256*256 to extract features.

2.1 Cropping input image

We first convert the RGB image to the L*a*b* color space to separate the intensity information into a single plane of the image, and then calculate the local range in each layer. The second and third layers (the intensity images) are converted to black-and-white images according to a threshold value for each layer. The two images are then multiplied to get one result image. In the result image, 4-connected components are labeled. Properties of each labeled region are measured using its bounding box to make structures; the structures are converted to a cell array, and the cell array of matrices is converted to a single matrix. From this matrix the hand portion is marked by drawing a square box on the original RGB image.

Fig. 2 Image of hand with red box marked

If the width (W) of the hand portion is more than its height (H), then the cropping is of size W*W; otherwise it is of size H*H. This way the cropping operation is performed, and the hand portion comes to the center of the image.

Fig. 3 Resized Image
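As a concrete illustration of the square-cropping rule above (W*W if the hand is wider than tall, else H*H), here is a minimal numpy sketch. It assumes a binary hand mask has already been obtained from the segmentation step; the function name is illustrative, not the authors'.

```python
import numpy as np

def square_crop(mask):
    """Crop a binary hand mask to a square window per the W*W / H*H rule,
    keeping the hand centered in the square."""
    ys, xs = np.nonzero(mask)
    top, bottom = ys.min(), ys.max()
    left, right = xs.min(), xs.max()
    h, w = bottom - top + 1, right - left + 1
    side = max(h, w)                      # W*W if W > H, else H*H
    out = np.zeros((side, side), dtype=mask.dtype)
    # Paste the hand's bounding box into the center of the square.
    y0, x0 = side // 2 - h // 2, side // 2 - w // 2
    out[y0:y0 + h, x0:x0 + w] = mask[top:bottom + 1, left:right + 1]
    return out
```

The square output can then be rescaled to 256*256 without distorting the hand's aspect ratio, which is the point of cropping to a square first.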

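The Gaussian filtering step of Section 2.2 (kernel size [8 8], sigma 2) can be sketched in numpy as follows; `filter2d` is a naive illustrative correlation, not the authors' implementation.

```python
import numpy as np

def gaussian_kernel(size=8, sigma=2.0):
    """Build a normalized size*size Gaussian kernel (here [8 8], sigma 2)."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return k / k.sum()

def filter2d(img, kernel):
    """Naive 'same'-size 2-D correlation with zero padding."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.empty(img.shape, dtype=float)
    H, W = img.shape
    for i in range(H):
        for j in range(W):
            out[i, j] = (padded[i:i + kh, j:j + kw] * kernel).sum()
    return out
```

Because the kernel is normalized to sum to one, flat regions of the grayscale image keep their intensity while fine noise along the hand silhouette is blurred away.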

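The paper does not name the exact morphological operations used to obtain the boundary image, so as an assumption we sketch one common realization: the boundary is the mask minus its 4-connected erosion.

```python
import numpy as np

def boundary_image(mask):
    """Boundary = mask minus its erosion (4-connected neighbours),
    a numpy sketch of turning the smoothed silhouette into a thin boundary."""
    m = mask.astype(bool)
    eroded = m.copy()
    # A pixel survives erosion only if all four neighbours are set.
    eroded[1:, :] &= m[:-1, :]
    eroded[:-1, :] &= m[1:, :]
    eroded[:, 1:] &= m[:, :-1]
    eroded[:, :-1] &= m[:, 1:]
    return (m & ~eroded).astype(np.uint8)
```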
Fig. 6 Boundary Image

2.4 Peaks and valleys detection

After getting the boundary image, we first find the boundary tracing points: where to start and where to stop finding peaks and valleys. For this we find the maximum value of x at which a white pixel exists. We call this point opti_x and then find the corresponding value of y. The starting point in the x direction is taken as 0.80*opti_x, and from this x value we find the y coordinate of the starting point.

Fig. 7 Tracing Starting & Ending Point of Hand Image

This is our starting point for tracing the boundary; the ending point is the starting point's y position plus one, i.e. the next row of the starting point at which a white pixel exists.

Condition I: We start with UP=1. We first travel toward the top and check whether a white pixel exists. If it exists, we continue in the same way; if not, we check the top-left and top-right. We keep searching upward until we get no pixel at the top, top-left, or top-right. Condition I is demonstrated in Figure 8.

Fig. 8 Condition I (binary pixel grid illustrating the upward trace along the boundary)

Condition II: If we get no pixel, we search on the right side of the existing pixel; if a pixel exists, we follow it in the same way until we get no pixel on the right side, and then we follow Condition I again. If Conditions I and II are not satisfied, it means we would have to search downward; here we mark a peak, as shown in Figure 9.

Fig. 9 Condition II (binary pixel grid illustrating the sideways search at the top of a trace)

If Conditions I and II are not satisfied, we then search on the down side by setting DN=1.

Condition III: We start with DN=1. We first travel down and check whether a white pixel exists. If it exists, we continue in the same way; if not, we check the down-left and down-right. We keep searching downward until we get no pixel at the down, down-left, or down-right position.

Fig. 10 Condition III (binary pixel grid illustrating the downward trace)

Condition IV: If we get no pixel, we search on the right side of the existing pixel; if a pixel exists, we follow it in the same way until we get no pixel on the right side, and then we follow Condition III.

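The UP/DN state machine of Conditions I-IV can be summarized by a simplified sketch: instead of pixel-by-pixel tracing, it scans an already-ordered contour and marks a peak where the trace stops going up and a valley where it stops going down. This is an analogue of the procedure above, not a literal reimplementation.

```python
def peaks_and_valleys(heights):
    """Scan an ordered boundary trace and record direction flips.
    Travelling up then down marks a peak (fingertip); travelling down
    then up marks a valley (gap between fingers).
    `heights` are heights along the trace (larger = higher).
    Returns (peak_indices, valley_indices)."""
    peaks, valleys = [], []
    up = heights[1] > heights[0]          # initial state: UP=1 or DN=1
    for i in range(1, len(heights) - 1):
        if up and heights[i + 1] < heights[i]:
            peaks.append(i)               # cannot go further up: mark peak
            up = False
        elif not up and heights[i + 1] > heights[i]:
            valleys.append(i)             # cannot go further down: mark valley
            up = True
    return peaks, valleys
```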


If in Condition IV there is no pixel on the right side, we search on the left side of the existing pixel; if a pixel exists, we follow it in the same way until we get no pixel on the left side, and then we follow Condition III.

Fig. 11 Condition IV (binary pixel grid illustrating the sideways search at the bottom of a trace)

If Conditions III and IV are not satisfied, it means we would have to search on the top side; here we mark a valley. After marking a valley we again start from Condition I. In this way we keep on tracing peaks and valleys until we reach the stop point, as shown in Figure 12.

Fig. 12 Marking of Peaks and Valleys

2.5 Feature Extraction

The image is then divided into 16 parts, each of size 64*64, named A1, A2, ..., A16. We then count the number of peaks and the number of valleys in the image, as shown in Figure 13.

Fig. 13 Image divided in 16 parts

From the divided image we find further parameters, such as the part in which the highest peak has been detected and the areas occupied by peaks and valleys.

3. Recognition of sign using Neural Network

Using these parameters a neural network is trained. For training we have collected a database of 20 persons for the signs shown in Figure 14.

Fig. 14 American Sign Language Gestures (A, B, D, F, J, K, L, V, W, Y)

The Support Vector Machine (SVM) is used for classification. The parameters that we have set are as follows:

Data for training: 100%
Data for testing: 20%
Input PEs: 50
Output PEs: 10
Exemplars: 180
Hidden layers: 0
Step size: 0.01
Epochs: 1000
Termination (incremental): 0.0001
No. of runs: 3

The results for the training and testing data sets are shown in Table 1 and Table 2.

Table 1: Result on Training Data set

Output/Desired    A    B    D    F    J    K    L    V    W    Y
A                18    0    0    0    0    0    0    0    0    0
B                 0   19    0    0    0    0    0    0    0    0
D                 0    0   18    0    0    0    0    0    0    0
F                 0    0    0   18    0    0    0    0    0    0
J                 0    0    0    0   18    0    0    0    0    0
K                 0    0    0    0    0   17    0    0    0    0
L                 0    0    0    0    0    0   19    0    0    0
V                 0    0    0    0    0    0    0   18    0    0
W                 0    0    0    0    0    0    0    0   17    0
Y                 0    0    0    0    0    0    0    0    0   18
Result (%)      100  100  100  100  100  100  100  100  100  100

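The division of the 256*256 image into 16 parts and the counting of peaks and valleys per part, described above, can be sketched as follows; the row-major A1..A16 numbering is our assumption, since the paper does not state the ordering.

```python
def part_index(x, y, img_size=256, grid=4):
    """Map a pixel (x, y) to its part A1..A16 (row-major 4x4 grid,
    each cell 64*64 for a 256*256 image). Returns 1..16."""
    cell = img_size // grid                      # 64
    return (y // cell) * grid + (x // cell) + 1

def position_features(peaks, valleys, img_size=256, grid=4):
    """Count how many peaks and valleys fall in each of the 16 parts."""
    n = grid * grid
    peak_counts = [0] * n
    valley_counts = [0] * n
    for x, y in peaks:
        peak_counts[part_index(x, y, img_size, grid) - 1] += 1
    for x, y in valleys:
        valley_counts[part_index(x, y, img_size, grid) - 1] += 1
    return peak_counts, valley_counts
```

The two 16-element count vectors, together with counts and highest-peak position, form the kind of feature vector the classifier is trained on.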

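As a sketch of a classifier matching the listed settings (no hidden layer, 50 input PEs, 10 output PEs, step size 0.01, 1000 epochs), the following trains a single-layer softmax network by gradient descent. This is our interpretation of the parameter list, not the authors' exact tool or training routine.

```python
import numpy as np

def train_single_layer(X, labels, n_out=10, step=0.01, epochs=1000, seed=0):
    """Single-layer (no hidden layer) softmax classifier trained by
    full-batch gradient descent. Returns weights and bias."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.normal(scale=0.01, size=(d, n_out))
    b = np.zeros(n_out)
    Y = np.eye(n_out)[labels]                  # one-hot targets
    for _ in range(epochs):
        z = X @ W + b
        z -= z.max(axis=1, keepdims=True)      # numerical stability
        p = np.exp(z)
        p /= p.sum(axis=1, keepdims=True)
        grad = (p - Y) / n                     # cross-entropy gradient
        W -= step * X.T @ grad
        b -= step * grad.sum(axis=0)
    return W, b

def predict(X, W, b):
    return np.argmax(X @ W + b, axis=1)
```

With one output PE per sign, the predicted class is simply the output with the largest activation, which is how a confusion matrix like Tables 1 and 2 would be filled in.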
Table 2: Result on Testing Data set

Output/Desired    A    B    D    F    J    K    L    V    W    Y
A                 2    0    0    0    0    0    0    0    0    0
B                 0    1    0    0    0    0    0    0    0    0
D                 0    0    2    0    0    0    0    0    0    0
F                 0    0    0    2    0    0    0    0    0    0
J                 0    0    0    0    2    0    0    0    0    0
K                 0    0    0    0    0    3    0    0    0    0
L                 0    0    0    0    0    0    1    0    0    0
V                 0    0    0    0    0    0    0    2    0    0
W                 0    0    0    0    0    0    0    0    3    0
Y                 0    0    0    0    0    0    0    0    0    2
Result (%)      100  100  100  100  100  100  100  100  100  100

4. Conclusion

The peak and valley detection algorithm is simple and easy to implement for recognizing signs belonging to American Sign Language. For recognition we have extracted simple features from the images, and the network is trained using a Support Vector Machine. The accuracy obtained in this work is 100%, as only a few signs have been considered here for recognition. In future work the authors will try to recognize all signs of American Sign Language, including dynamic signs which involve hand motion, and design a system which will convert signs into text or spoken words.

References

[1] Ravikiran J. et al., "Finger Detection for Sign Language Recognition", Proceedings of the International MultiConference of Engineers and Computer Scientists, 2009, Vol. 1.
[2] Iwan Njoto Sandjaja, Nelson Marcos, "Sign Language Number Recognition", Proceedings of the 5th International Joint Conference on INC, IMS and IDC, 2009, pp. 1503-1508.
[3] Jianjie Zhang, Hao Lin, Mingguo Zhao, "A Fast Algorithm for Hand Gesture Recognition Using Relief", Proceedings of the 6th International Conference on Fuzzy Systems and Knowledge Discovery, 2009, Vol. 1, pp. 8-12.
[4] V. Radha, "Threshold based Segmentation using median filter for Sign language recognition system", Proceedings of the World Congress on Nature & Biologically Inspired Computing, 2009, pp. 1394-1399.
[5] Ryszard S. Choras, "Hand Shape and Hand Gesture Recognition", IEEE Symposium on Industrial Electronics and Applications, October 4-6, 2009, pp. 145-149.
[6] Salma Begum, Md. Hasanuzzaman, "Computer Vision-based Bangladeshi Sign Language Recognition System", Proceedings of the 12th International Conference on Computer and Information Technology, 21-23 Dec. 2009, pp. 414-419.
[7] Yang Quan, Peng Jinye, Li Yulong, "Recognition Based on Gray-Level Co-Occurrence Matrix and Other Multi-features Fusion", 4th IEEE Conference on Industrial Electronics and Applications, 2009, pp. 1569-1572.
[8] Yang Quan, Peng Jinye, "Chinese Sign Language Recognition for a Vision-Based Multi-features Classifier", International Symposium on Computer Science and Computational Technology, 2008, pp. 194-197.
[9] Paulraj M. P. et al., "Extraction of Head and Hand Gesture Features for Recognition of Sign Language", International Conference on Electronic Design, 2008, pp. 1-6.
[10] Rini Akmeliawati et al., "Real-Time Malaysian Sign Language Translation using Colour Segmentation and Neural Network", Instrumentation and Measurement Technology Conference Proceedings, 2007, pp. 1-6.
[11] Nilanjan Dey, Anamitra Bardhan Roy, Moumita Pal, Achintya Das, "FCM Based Blood Vessel Segmentation Method for Retinal Images", IJCSN, Vol. 1, Issue 3, 2012.
[12] Tan Tian Swee et al., "Wireless Data Gloves Malay Sign Language Recognition System", 6th International Conference on Information, Communications & Signal Processing, 2007, pp. 1-4.
[13] Maryam Pahlevanzadeh, Mansour Vafadoost, Majid Shahnazi, "Sign Language Recognition", 9th International Symposium on Signal Processing and Its Applications, 2007, pp. 1-4.
[14] M. Mohandes, S. I. Quadri, M. Deriche, "Sign Language Recognition an Image-Based Approach", 21st International Conference on Advanced Information Networking and Applications Workshops, 2007, pp. 272-276.
[15] Qi Wang et al., "Viewpoint Invariant Sign Language Recognition", 18th International Conference on Pattern Recognition, 2005, pp. 456-459.
[16] Eun-Jung Holden, Gareth Lee, Robyn Owens, "Automatic Recognition of Colloquial Australian Sign Language", Proceedings of the IEEE Workshop on Motion and Video Computing, 2005, pp. 183-188.
[17] Tan Tian Swee et al., "Malay Sign Language Gesture Recognition System", International Conference on Intelligent and Advanced Systems, 2007, pp. 982-985.
