Named Entity Recognition System for Kashmiri Language

Amir Bashir Malik (I), Khushboo Bansal (II)
(I) Student, M.Tech; (II) Assistant Professor
(I, II) Dept. of CSE, Desh Bhagat University, Mandi Gobindgarh, Punjab, India

International Journal of Advanced Research in Computer Science & Technology (IJARCST 2015), Vol. 3, Issue 2 (Apr. - Jun. 2015)
ISSN : 2347 - 8446 (Online), ISSN : 2347 - 9817 (Print)

Abstract
Named Entity Recognition (NER) is a task that finds person names, location names, organization names, places, dates, times, etc. in text and classifies them into predefined categories. NER plays a major role in various Natural Language Processing (NLP) fields such as Information Extraction, Machine Translation and Question Answering. Unfortunately, Kashmiri, which is a resource-scarce language, has not yet been taken into account. This paper describes the problems of NER in the context of the Kashmiri language and provides relevant solutions.

Keywords
Named Entity, Named Entity Recognition, Natural Language Processing, Kashmiri language text.

I. Introduction
The term Named Entity (NE) emerged during the sixth Message Understanding Conference (MUC-6, 1995). Named Entity Recognition (NER), also known as entity identification, is a subtask of information extraction (IE). NER extracts and classifies the true named entities in text. NER systems are widely used in different NLP tasks and in many commercial applications on the internet, such as search engines. NER is the process of searching a text to detect entities and classify them into predefined classes such as names of persons, organizations and locations, dates, times, designations, measures, abbreviations and brands.

Construction of a NER system becomes challenging if proper resources are not available. Gazetteer lists are often used for the development of NER systems. In many resource-poor languages like Kashmiri, gazetteer lists of proper size are not available, but sometimes relevant lists are available in English.

Among Indian languages, Kashmiri is a widely spoken language in the northern part of India. The current number of its speakers is around four million. Kashmiri is also spoken by Kashmiris settled in other parts of India and in other countries. Kashmiri belongs to the Dardic sub-group of the Indo-Aryan group of languages.

NER based approaches are shown in Fig. 1.

[Figure: tree diagram. Recoverable node labels: NER, ORGANIZATION, MICROSOFT, APPLE, NOKIA, HUMAN, FOOD, ENTITY, COMMUNITY, MUSLIM, HINDU, SIKH, CURRENCY, NUMERIC, DATE, TIME, PERCENT, OTHERS]
Fig. 1 : A Named Entity recognition split into more specific Named Entities

NEs and NE tags are defined with examples in this paper. For example, consider the English sentence:

Micromax had launched its first smartphone on Dec 19, 2014.

After performing named entity recognition on this sentence, the result is as follows.

(1) "Micromax" represents an organization, "Dec 19, 2014" represents a date, "smartphone" represents an entity, and "had launched its" and "on" represent others.

The named entities may be of any type, such as those given in Table 1.

Table 1: Different named entities

S.No   NE Tag    Definition (example)
01     ORG       Name of organization (Micromax)
02     PER       Name of person (Amir)
03     COUNTRY   Name of country (India)
04     OTHER     Not a named entity

II. Literature Survey
1. Amarappa and Sathyanarayana, 2012, came up with a paper on "Named Entity Recognition and Classification (NERC) in Kannada language", which built semi-automatic statistical machine learning NLP models based on noun taggers using a Hidden Markov Model (HMM). The challenges and issues they list for the Kannada language are:
   1. No capitalization
   2. High phonetic characteristic of Brahmi script
   3. Non-availability of large gazetteer lists
   4. Lack of standardization and spelling
   5. Number of frequently used words (common nouns)
Their proposed NER system for Kannada receives an unannotated text file containing the Kannada document, recognizes the NEs and generates an annotated text document file. Further, the output of the NERC system is subjected to a suitable cryptographic algorithm to secure the structured corpus. They came up with 13 noun taggers for NER, such as person name (NNP), location name (NNL), organization name (NNO), etc. HMM is a supervised learning technique and a statistical model with a generalized learning method. It is used to develop NER systems of symbolic, statistical, connectionist and hybrid natures.

2. Kaur and Vishal Gupta, 2012, built a "NER for Punjabi" using rule based and list look up approaches. Punjabi is also a highly agglutinative and inflected language, which leads to linguistic problems. In the rule based approach, the system is taught to identify NEs by writing rules manually for all NE features. The most common words are removed from the database, and then a list look up approach is used with the gazetteer lists to classify the identified NEs. Their system achieved an F-measure of 85.88%.

3. Prakash Hiremath and Shambhavi B. R., 2014, describe NER as a subtask of information extraction that seeks to locate and classify the elements in some text into pre-defined categories. NER finds its application in NLP tasks like machine translation, question-answering systems and automatic summarization. The approaches to NER are rule based, statistics based or a combination of both. Their paper presents a survey of these approaches for the identification of Named Entities (NE) in Indian languages.

4. UmrinderPal Singh and Vishal Goyal, 2014, built a "NER for Urdu" using rule based approaches. Their paper describes the problems of NER in the context of the Urdu language and provides relevant solutions. The system is developed to tag thirteen different Named Entities (NE): twelve NEs proposed by IJCNLP-08, and Izaafats.

III. Issues in Kashmiri NER System
• Non-availability of resources.
• Language resources are a must for any approach, whether it is rule based or statistical. There is no large gazetteer or annotated data available for the Kashmiri language. Kashmiri is written from right to left.
• One major issue with the Kashmiri language is that it requires language experts.
• Training and testing for the Kashmiri language is a difficult task for a person who is not a Kashmiri language expert.
• No Kashmiri language conversion in Google Translate.
• No inbuilt knowledge base.

IV. Conclusion and Future Work
In this work, a method for extracting named entities from data of various domains has been presented, which is useful in the identification and classification of names. Work on Kashmiri NER is very complex due to the nature of the Kashmiri language, which has free word order, and due to the lack of research work on Kashmiri text.

References
[01] Joel, N. (2008) Learning NER from Wikipedia.
[1] Pramod Kumar Gupta and Sunita Arora (2009) "An Approach for Named Entity Recognition System for Hindi: An Experimental Study". In Proceedings of ASCNT, CDAC, Noida, India, pp. 103 - 108.
[02] Darvinder Kaur, Vishal Gupta, "A survey of Named Entity Recognition in English and other Indian Languages", IJCSI International Journal of Computer Science Issues, Vol. 7, Issue 6, November 2010.
[03] Riaz K. "Rule-based named entity recognition in Urdu". In Proceedings of the Named Entities Workshop, pp. 126-135, 2010.
[04] Vishal Gupta, Gurpreet Singh Lehal, "Named Entity Recognition for Punjabi Language Text Summarization". International Journal of Computer Applications (0975 - 8887), Volume 33, No. 3, November 2011.
[05] Kamaldeep Kaur, Vishal Gupta. "Name Entity Recognition for Punjabi Language". International Journal of Computer Science and Information Technology & Security (IJCSITS), ISSN: 2249-9555, Vol. 2, No. 3, June 2012.
[06] Yungwei Ding, Hsinhsi Chen and Shihchung Tsai, "Named entity extraction for information retrieval". Proc. of HLT-NAACL.
[07] http://en.wikipedia.org/wiki/Urdu Accessed on March 2012
[08] www.bbc.co.uk/urdu/ Accessed on March-May 2012
[09] Pallavi, Dr. Anitha S Pillai. "Named Entity Recognition for Indian Languages: A Survey". International Journal of Engineering and Advanced Technology (IJEAT), ISSN: 2277 128X, Volume 3, November 2013.
[10] Surya Bahadur Bam, Tej Bahadur Shahi, "Named Entity Recognition for Nepali Text Using Support Vector Machines". Intelligent Information Management, published March 2014 in SciRes.
[11] Navneet Kaur Aulakh, Er. Yadwinder Kaur. "Review Paper on Name Entity Recognition of Machine Translation". International Journal of Advanced Research in Computer Science and Software Engineering, ISSN: 2277 128X, Volume 4, April 2014.
[12] Prakash Hiremath, Shambhavi B. R. "Approaches to Named Entity Recognition in Indian Languages". International Journal of Engineering and Advanced Technology (IJEAT), ISSN: 2249 - 8958, Volume-3, August 2014.
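As an illustration only, the list look up idea from the Literature Survey can be applied to the example sentence from the Introduction ("Micromax had launched its first smartphone on Dec 19, 2014."). The sketch below is not the system described in this paper. The gazetteer entries, the date pattern, the DATE tag and the function names `tag_sentence` and `_tag_words` are all assumptions made for the example, and only the ORG, PER, COUNTRY and OTHER tags come from Table 1.

```python
import re

# Hypothetical toy gazetteer (assumption): lower-cased surface form -> Table 1 tag.
GAZETTEER = {
    "micromax": "ORG",
    "amir": "PER",
    "india": "COUNTRY",
}

# Assumed pattern for dates written like "Dec 19, 2014"; DATE is not a Table 1 tag.
DATE_PATTERN = re.compile(
    r"\b(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\.?\s+\d{1,2},\s+\d{4}\b"
)


def _tag_words(text):
    """Look each word up in the gazetteer; words not listed are tagged OTHER."""
    return [
        (word, GAZETTEER.get(word.lower(), "OTHER"))
        for word in re.findall(r"\w+", text)
    ]


def tag_sentence(sentence):
    """Return (text, tag) pairs in sentence order, keeping each date as one unit."""
    tagged = []
    pos = 0
    for match in DATE_PATTERN.finditer(sentence):
        # Tag the plain words before the date, then the date itself.
        tagged.extend(_tag_words(sentence[pos:match.start()]))
        tagged.append((match.group(), "DATE"))
        pos = match.end()
    tagged.extend(_tag_words(sentence[pos:]))
    return tagged


if __name__ == "__main__":
    example = "Micromax had launched its first smartphone on Dec 19, 2014."
    for text, tag in tag_sentence(example):
        print(f"{text}\t{tag}")
```

With this toy gazetteer, "Micromax" is tagged ORG and "Dec 19, 2014" is tagged DATE, while every other word, including "smartphone", falls back to OTHER because it is not listed. The sketch therefore depends entirely on the coverage of the list, which is the gazetteer-size problem raised in Section III.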