
International Journal For Technological Research In Engineering Volume 2, Issue 12, August-2015 ISSN (Online): 2347 - 4718

CONVERSION OF PUNJABI TEXT TO IPA USING PHONETIC SYMBOLS

Samandeep Kaur¹, Er. Charanjiv Singh²
Department of Computer Engineering

Abstract: IPA stands for International Phonetic Alphabet. Normally, letters have different sounds in different languages, so readers find it difficult to pronounce the sounds of various languages. IPA provides a standardized way to put sounds into symbols so that any word can be read in the same way anywhere around the globe. It has the further advantage of making spelling clearer and more understandable. This paper presents the steps followed for converting a given Punjabi text to its corresponding IPA and the algorithm used for it. Developing an accurate text-to-IPA system is difficult for two reasons: first, Punjabi is a tonal language; second, its historical circumstances have led to the use of words from multiple languages and of non-native phonemes. Here the main focus is on the text-to-IPA conversion process. It is, in fact, a system that translates text into IPA, which is the primary stage of text-to-sound conversion. The whole procedure of converting the given Punjabi text to IPA takes a great deal of time and effort.

Keywords: Phonetics, IPA, Phonemes, Language Transliteration, Vowels, Consonants.

I. INTRODUCTION
This modern era is the computer era. Everyone wishes that modern computer systems would behave like humans and be user friendly. With the growth of computing machines, their applications in day-to-day routine are increasing [2]. This paper presents one of the modern techniques of text-to-IPA conversion. Text-to-IPA conversion is useful because it helps people with reading: those who find it difficult to pronounce the words of a particular language can read the words of that language in a precise manner. In this paper we present the linguistic features of Punjabi text and state its phonetic representation. Through knowledge of the language's phonology and orthography, we could develop a better and more promising Punjabi text-to-IPA conversion. This research involves computer-based conversion of a character or word from one language or script to another without losing its phonological characteristics.

II. PHONETICS
Phonetics is the science used to draw general laws about sounds and their production. Speech sounds can be examined according to the stages of transmission of the speech signal from a speaker to the listener. The study of sound production by human organs such as the larynx, velum, tongue and lips is the type of phonetics known as articulatory phonetics. There is also a difference between phonology and phonetics: phonetics deals with all types of sound production by human beings, whereas phonology is concerned with the sound production of a particular language only. Two types of phonemes are usually distinguished: segmental and suprasegmental. Segmental phonemes can be identified, either physically or auditorily, in the stream of speech. Vowels and consonants are the segmental phonemes of a language, and they constitute its alphabet. Suprasegmental phonemes cannot exist independently because they exist only together with the segmental phonemes.

III. RELATED WORK
Sheilly Padda, Rupinderdeep Kaur and Nidhi [2] proposed the steps for converting Punjabi text into its corresponding IPA. The conversion process is based on [...]. Kamaldeep and Goyal [3] addressed the problem of transliterating Punjabi to English using a rule-based approach. The technique demonstrated transliteration from Punjabi to English for common names and achieved an accuracy of 93.22%. Dhore et al. [7] addressed machine transliteration of named entities for the Hindi-English language pair, where a named entity given in Hindi needs to be transliterated into English, using CRF as a statistical probability tool and n-grams as the feature set. They achieved an accuracy of 85.79%. The convertor available for Hindi to IPA actually converts the Hindi text into Roman script; a true IPA conversion is not available for this language.

IV. INTERNATIONAL PHONETIC ASSOCIATION (IPA)
The International Phonetic Alphabet, maintained by the International Phonetic Association, is widely used for the transcription of English and many other languages. IPA offers a set of symbols and general guidelines for their use. It is a standardized system of pronunciation (phonetic) symbols. The guiding principle by which the Association chooses the symbols for its alphabet and decides how and where they should be used is:

One sound = one symbol

There should be a one-to-one correspondence between a speech sound and the symbol used to represent it. A symbol should always represent the same sound, regardless of the language being transcribed, and a sound should always be represented by the same symbol.

IPA [1] is used by:

www.ijtre.com Copyright 2015.All rights reserved. 3180


• Lexicographers
• Foreign language students and teachers
• Linguists
• Speech language pathologists
• Singers
• Actors
• Constructed language creators, and
• Translators

The IPA charts for consonants, vowels and suprasegmental phonemes [1] are as follows:

Fig. 1 Chart of Phonemes

V. PROPOSED METHODOLOGY
Till now there is only one Punjabi to IPA convertor, and it converts the written text into Roman text rather than into IPA. Hence this research converts the written text into IPA, a standardized notation adopted all over the globe, so that people belonging to any region of the world can read Punjabi written in Gurumukhi script without even knowing Gurumukhi.

Algorithm for IPA conversion:
1. Enter the text.
2. Split the text into words.
3. Compare the word/character with the database.
4. If found, go to step 5; otherwise go to step 6.
5. Replace the word with the IPA symbols provided in the database.
6. Split the word into characters and then replace each character with its IPA symbol.

The flowchart shown below describes all the steps of conversion from Punjabi text to IPA. Using this procedure, we have prepared the IPA for around five lakh (5,00,000) words and stored them in the database. The program, when run on the 'MY ECLIPSE' platform, picks the words from the back end; if a word is not stored in the database, the program maps it character by character. Character-by-character mapping is done only for those words which are not contained in the database. The accuracy of the proposed technique is then compared with that of the traditional approach.

Fig. 2 Flowchart of Text to IPA Convertor

Conversion of Punjabi into IPA in Excel:

Fig. 3 (a), (b) Conversion of Punjabi to IPA in Excel
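The lookup-then-fallback algorithm of Section V can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the program described in the paper: the names `WORD_TO_IPA`, `CHAR_TO_IPA`, `word_to_ipa` and `text_to_ipa` are hypothetical, the two tables are tiny stand-ins for the word database and the character table, and their IPA values are simplified (tone and inherent vowels are not handled).

```python
# Hypothetical, tiny stand-in for the word database (step 5 of the algorithm).
WORD_TO_IPA = {
    "ਪਾਣੀ": "paːɳiː",
}

# Hypothetical, simplified character table for the fallback (step 6).
CHAR_TO_IPA = {
    "ਨ": "n",
    "ਮ": "m",
    "ਕ": "k",
    "ਲ": "l",
    "ਾ": "aː",
    "ੀ": "iː",
}


def word_to_ipa(word):
    """Convert one word: database lookup first, character mapping otherwise."""
    # Steps 3-5: if the whole word is in the database, use its stored IPA.
    if word in WORD_TO_IPA:
        return WORD_TO_IPA[word]
    # Step 6: split into characters and map each one.
    # Assumption: characters missing from the table are passed through unchanged.
    return "".join(CHAR_TO_IPA.get(ch, ch) for ch in word)


def text_to_ipa(text):
    """Steps 1-2: take the input text, split it into words, convert each word."""
    return " ".join(word_to_ipa(word) for word in text.split())


if __name__ == "__main__":
    # The first word is found in the database; the second falls back to
    # character-by-character mapping.
    print(text_to_ipa("ਪਾਣੀ ਨਾਮ"))
```

Keeping the word table and the character table separate mirrors the two branches of the flowchart: whole-word entries can record pronunciations that a purely character-level mapping would miss.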

Conversion of a Punjabi paragraph (Notepad) into IPA:

Fig. 4 Two paragraphs of Punjabi text converted into IPA

VI. CONCLUSION AND FUTURE SCOPE
IPA is a standardized notation for the pronunciation of words all around the globe. Here we have manually prepared the IPA for around 5,00,000 words and stored them in the 'MY ECLIPSE' database. Any paragraph containing Punjabi text can be selected for the conversion process. If all the words in the paragraph match, word by word, with the database, the whole paragraph is changed into its corresponding IPA. If any word in the paragraph does not match the words stored in the database, that word is converted character by character, and the final output is generated as an IPA paragraph.

The work done so far has been on the conversion of text into Roman script, whereas this research presents a modified and standardized method for converting the text into IPA. Various algorithms have been developed for other local languages, but this work has not been done for the Punjabi language. After converting the text to IPA, the accuracy of the proposed approach is compared with that of the traditional approach of text-to-Roman conversion. In future, an online convertor for Punjabi text to IPA may be designed so that Punjabi becomes easy to speak for people all over the world. The IPA convertor may help people belonging to any region of the world read Punjabi written in Gurumukhi script without even knowing Gurumukhi.

In this research paper, we have described how a given Punjabi text can be converted into IPA using phonetics. The phonetic nature of the language was surveyed. Punjabi is a highly tonal language, so the conversion process was highly challenging. We have tried to put almost all Punjabi words in the database. However, because Punjabi is a tonal language and different regions use different words, this is a time-consuming process that requires considerable effort.
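The paper does not state how the accuracy of a conversion is measured. One common choice, assumed here purely for illustration, is word-level exact-match accuracy: the percentage of output words that agree with a manually prepared reference transcription. The function name `word_accuracy` and the sample strings below are hypothetical and are not results from this work.

```python
def word_accuracy(predicted, reference):
    """Return the percentage of positions where predicted == reference.

    Assumption: both lists are aligned word by word and have equal length.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        return 0.0
    correct = sum(p == r for p, r in zip(predicted, reference))
    return 100.0 * correct / len(reference)


if __name__ == "__main__":
    # Made-up sample: three of the four words match the reference.
    predicted = ["paːɳiː", "naːm", "kal", "lal"]
    reference = ["paːɳiː", "naːm", "kal", "laːl"]
    print(word_accuracy(predicted, reference))
```

The same function could be applied to the output of the IPA convertor and to the output of a Roman-script convertor, each against its own reference, to put the two approaches on a comparable footing.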

REFERENCES
[1] https://en.wikipedia.org/wiki/International_Phonetic_Alphabet
[2] http://www.ijerd.com/paper/vol1-issue5/B0150811.pdf
[3] http://csjournals.com/IJCSC/PDF2-2/Article%2045.pdf
[4] http://www.punjabiheritage.com/punjabi%20language.html
[5] http://www.gurmukhi.org
[6] http://www.sikhs.org/gurmukhi
[7] http://www.ijcaonline.org/archives/volume48/number23/7522-0624
[8] https://en.wikipedia.org/wiki/Help:IPA_for_Assamese
[9] http://gate2home.com/Malayalam-Keyboard/Translate#lang=en&t=ദഗഹബൂീ
[10] http://anunaadam.appspot.com/transcription
[11] https://en.wikipedia.org/wiki/Assamese_alphabet
[12] https://en.wikipedia.org/wiki/Help:IPA_for_Tamil
[13] http://teflpedia.com/Phonetic_symbol
