Finger Spelled Sign Language Translator for Deaf and Speech Impaired People in Srilanka Using Convolutional Neural Network
13th International Research Conference, General Sir John Kotelawala Defence University, Computing Sessions
Paper ID: 566

HKK Perera#, DMR Kulasekara, Asela Gunasekara
Department of Computer Science, General Sir John Kotelawala Defence University, Sri Lanka.
#[email protected]

Abstract: Sign language is a visual language used by people with speech and hearing disabilities for communication in their daily conversations. It is an entirely visual means of communication with its own native grammar. This paper presents an approach whose main objective is to transliterate 24 static alphabet signs and the number signs of Sri Lankan Sign Language into machine-readable English text in a real-time environment. Because Sri Lanka has its own native sign language, deaf signers become uncomfortable when expressing their ideas to hearing people, which is why this system is proposed. Artificial Neural Networks (ANN) and Support Vector Machines (SVM) are the technologies used in the proposed system. In the first phase, pre-processing operations are applied to the signed input gestures. In the next phase, various region properties of the pre-processed gesture images are computed. In the final phase, based on the properties calculated in the earlier phase, the signed gesture is transliterated into text and voice. The proposed model is developed using Python and Python libraries such as OpenCV, Keras, and Pickle.

Keywords: Artificial Neural Networks, Static Gestures, Gesture Recognition, Support Vector Machines, Gesture Classification

Introduction
Nowadays technology has advanced to help people with any kind of disability: there are IoT devices and robots to help people in almost every situation. People who suffer from hearing impairments, however, still struggle daily to communicate their ideas to others. Like oral language, sign language has a lexicon and a "phonetics" (Saldaña González et al., 2018): rather than vocalized sounds it has articulated signs, and its "phonology" is built not from phonemes but from components of various natures that achieve the same distinguishing function through the visual structure of words. It also has its own syntax, semantics, and pragmatics. Being characteristic of each nation and culture, and not universal, it allows its users to portray all of the reality that surrounds them and what they see, feel, or think.

To fill this gap, a Video Relay Service (VRS) (Gebre et al., 2014) is currently used in Sri Lanka: a human interpreter translates the hand signs into voice and the voice back into signs. The problem is that communication through an interpreter is rather slow and offers no privacy. Much research has been carried out to solve this problem in Sri Lanka, but Sri Lankan sign language is not static, meaning that the same word can be signed in many ways; just as speakers have different voices, signers have different signing styles (Gebre et al., 2014). The camera also introduces scale, translation, and rotation errors, which add noise (Saldaña González et al., 2018). Since this system is driven by a camera, environmental conditions affect it as well, whereas some systems are unaffected by environmental conditions because they are built around a wearable device.
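As a rough illustration of the pre-processing phase described in the abstract, the following is a minimal OpenCV sketch that crops a hand region from a webcam frame, masks skin-coloured pixels, and resizes the result for a classifier. It is a sketch under stated assumptions: the region of interest, the HSV thresholds, and the target size are illustrative values, not the configuration actually used in this work.

import cv2
import numpy as np

# Illustrative hand region and HSV skin-colour bounds (assumed values).
ROI = (100, 100, 300, 300)  # x, y, width, height of the hand window
LOWER_SKIN = np.array([0, 30, 60], dtype=np.uint8)
UPPER_SKIN = np.array([20, 150, 255], dtype=np.uint8)

def preprocess_frame(frame, size=(64, 64)):
    """Crop the hand ROI, keep skin-coloured pixels, and resize for classification."""
    x, y, w, h = ROI
    hand = frame[y:y + h, x:x + w]
    hsv = cv2.cvtColor(hand, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, LOWER_SKIN, UPPER_SKIN)
    mask = cv2.GaussianBlur(mask, (5, 5), 0)  # suppress camera noise
    return cv2.resize(mask, size) / 255.0     # normalised single-channel image

if __name__ == "__main__":
    cap = cv2.VideoCapture(0)  # default webcam
    ok, frame = cap.read()
    if ok:
        print("pre-processed gesture shape:", preprocess_frame(frame).shape)
    cap.release()

Fixed thresholds such as these would have to be tuned to lighting conditions, which is exactly the environmental sensitivity of camera-based systems noted above.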
Sri Lankan Sign Language (SSL) consists of 56 characters and currently has more than 1,000 signs overall, a number that grows day by day because people tend to form new signs for the words they use frequently. According to the results of the 2012 census of housing, there are a total of 569,910 people affected by disabilities in Sri Lanka, including people with both hearing and speech impairments. Because of their inability to exchange messages with hearing people, they have lost the chance to live a normal life. This paper suggests a method to remove that barrier and reduce the unfairness between the two groups (Garcia and Viesca, n.d.). The system is proposed to address the following shortcomings observed in previous systems:
• Environmental concerns.
• Occlusion (detecting signs that are made from the reference box).
• Coarticulation (signs being affected by the sign made before or after them).

The proposed system identifies only manual signs, that is, signs made with the hands (fingers). Taking American Sign Language (ASL) as a reference, its dictionary contains 7,154 sign words, many of which look visually alike even when the words themselves are not remotely related, which can lead to miscommunication; for example, the signs for 'Chocolate' and 'Cleveland' are similar, and it is very hard to recover once a conversation has been garbled (Mahesh Kumar et al., 2018). Communication via signing is a significant application of gesture recognition, and sign language recognition follows two distinct approaches:
1. Glove-based approach
2. Vision-based approach

There are mainly two gesture types in Sinhala sign language (Punchimudiyanse and Meegama, 2017). The first is the conversational type, which has a set of sign gestures for common words and phrases that are produced dynamically for ease of use. The second uses the fingerspelling alphabet to encode, letter by letter, Sinhala words that do not have their own signs. For example, the word "Janaki/ජානකි" is spelled as J + A + N + A + K + I; converted to SSL this becomes ජ් + අ + න් + අ + ක් + ඉ.

Figure 1: Sinhala Finger Spelling Word
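The letter-by-letter assembly illustrated by the Janaki example above can be sketched in a few lines of Python. The mapping below is a hypothetical excerpt covering only the letters needed for this example, not the full SSL fingerspelling alphabet, and the label names are assumptions about what a gesture classifier would output.

# Hypothetical excerpt of a fingerspelling-label to Sinhala grapheme table.
LETTER_TO_SINHALA = {
    "J": "ජ්",
    "A": "අ",
    "N": "න්",
    "K": "ක්",
    "I": "ඉ",
}

def assemble_word(letter_labels):
    """Join per-gesture letter predictions into a Sinhala grapheme sequence."""
    return " + ".join(LETTER_TO_SINHALA[label] for label in letter_labels)

# One label per recognised static gesture:
print(assemble_word(["J", "A", "N", "A", "K", "I"]))
# -> ජ් + අ + න් + අ + ක් + ඉ

In a complete system the grapheme sequence would still have to be composed into standard Sinhala orthography (ජානකි), a step not shown here.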
Regarding numbers in SSL, the basic numerals that hearing people use remain valid as they are, and special number patterns are available for:
• Numbers from 0 to 19.
• Signs from 20 to 90.
• Signs from 100 to 900.
• Signs for 1,000, 100,000, one million, and one billion.

The proposed system will help deaf and speech-impaired individuals by capturing their sign-based message through a portable camera and then converting it into Sinhala text. Consequently, much like ordinary speech, it will be simple for them to talk to any individual anywhere and to complete their work easily, and the hardships they experience when trying to hold a conversation will be reduced.

This article is organized as follows: Section II presents related work on sign language recognition, Section III the methods and approach, Section IV the design, Section V the results and evaluation, and finally Section VI discusses the conclusion and future work.

Related Works
Many types of research have been conducted in the field of sign language recognition using various novel approaches, such as the skin filtering technique, the use of gloves, the use of the Microsoft Kinect sensor for hand tracking, the fingertip search method, the Hu moments method, Convolutional Neural Networks (CNN), shape descriptors, the grid-based feature extraction technique, and Artificial Neural Networks (ANN). Sign language recognition can be mainly categorized into two parts.

A. Manual Signs
Manual signs are the signs made with the hands. They can again be categorized into two main parts.

1) Vision-Based Recognition: Image processing algorithms are used in a vision-based system to recognize and track the hand signs and facial expressions of the signer. This approach is easier on the signer, since there is no need to wear any additional equipment. However, there are accuracy issues associated with image processing algorithms, and these issues are yet to be resolved. There are two distinct methodologies in vision-based sign language recognition:
1. 3D model based
2. Appearance based

3D model-based techniques use 3D data about key components of the body parts. From this data, several significant parameters, such as palm position and joint angles, can be obtained. The methodology uses volumetric models, skeletal models, or a mix of the two. The volumetric technique is better suited to the computer animation field and to computer vision; it is highly computationally intensive, and frameworks for live recognition are still to be developed (Escudeiro et al., 2014).

Appearance-based methods use images as their information sources. Typical templates are deformable 2D templates of parts of the human body, especially the hands; matching focuses on the outline of an object against these so-called deformable templates. The proposed system is an appearance-based system. Some algorithms based on vision-based recognition are the following.

Support Vector Machine (SVM): The American Sign Language recognition system proposed by Shivashankara (Department of Computer Science & Engineering, Sri Jayachamarajendra College of Engineering, Mysuru, India et al., 2018) uses SVM for both numbers and letters; letters were recognized with an accuracy of 97.5% and numbers with an accuracy of 100%. Data was collected from a freely available ASL dataset.

Artificial Neural Networks (ANN): A recognition and classification system for Spanish sign language (Saldaña González et al., 2018) was built using a glove, and the same project was also carried out with artificial neural networks, giving an accuracy of 90% for the ANN and a user-independent error rate of 2.61%. The same paper presented a system using a glove and an ANN that was 97% user independent.

Convolutional Neural Networks (CNN): A real-time ASL recognition system with CNN (Garcia and Viesca, n.d.) achieves over 90% accuracy without using any motion-tracking gloves. It also uses heuristics, and the letter classification is done with a ConvNet. The dataset consisted of 65,000 images of 150 x 150 px, with two versions of each image: a colour image and a depth image. (A rough sketch of a CNN classifier of this kind is given at the end of this section.)

SURF and Hu Moments: Paper (Rekha ... that's over 90%.
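To illustrate the kind of ConvNet letter classifier referred to above, and the Keras-based modelling mentioned in the abstract, the following is a minimal sketch of a small convolutional network for static gesture images. The 150 x 150 input size follows the dataset description above, while the layer sizes, the 24-class output, and the training call are illustrative assumptions rather than the architecture actually used in this work.

from tensorflow.keras import layers, models

NUM_CLASSES = 24  # 24 static alphabet signs; digit signs could be added as extra classes

def build_gesture_cnn(input_shape=(150, 150, 1), num_classes=NUM_CLASSES):
    """Small CNN for classifying single-channel static gesture images."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),  # reduce overfitting on a small gesture dataset
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

if __name__ == "__main__":
    model = build_gesture_cnn()
    model.summary()
    # Training would use pre-processed gesture images and integer labels, e.g.:
    # model.fit(train_images, train_labels, validation_split=0.1, epochs=20)

At inference time such a network would be applied to each pre-processed camera frame, and auxiliary objects such as the label-to-letter mapping could be persisted with Pickle, in line with the libraries listed in the abstract.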