Volume 33 (1) November 2018

A Peer-reviewed Journal of Linguistic Society of Nepal

Nepalese Linguistics
Volume 33 (1) November 2018

Editor-in-Chief
Kamal Poudel

Editors
Ram Raj Lohani
Dr. Tikaram Poudel

Office bearers for 2018-2020
President: Bhim Narayan Regmi
Vice President: Krishna Prasad Chalise
General Secretary: Dr. Karnakhar Khatiwada
Secretary (Office): Dr. Ambika Regmi
Secretary (General): Dr. Tara Mani Rai
Treasurer: Ekku Maya Pun
Member: Dr. Narayan Prasad Sharma
Member: Dr. Ramesh Kumar Limbu
Member: Dr. Laxmi Raj Pandit
Member: Pratigya Regmi
Member: Shankar Subedi

Nepalese Linguistics is a peer-reviewed journal published by Linguistic Society of Nepal (LSN). LSN publishes articles related to the scientific study of languages, especially from Nepal. The authors are solely responsible for the views expressed in their articles.

Published by: Linguistic Society of Nepal, Kirtipur, Kathmandu, Nepal
Copies: 300
© Linguistic Society of Nepal
ISSN 0259-1006
Price: NC 400/- (Nepal), IC 350/- (India), US$ 10/-

The publication of this volume was supported by Nepal Academy.

Editorial

Linguistic Society of Nepal, since its inception in 1979, has been involved in preserving and promoting the languages of the Himalayan region through activities such as organizing conferences and workshops and issuing publications. As all our esteemed readers know, the journal Nepalese Linguistics is one of the major initiatives of the Society. The Board of Editors feels immense pleasure in bringing out Volume 33.1 of Nepalese Linguistics on the eve of the 39th International Annual Conference of Linguistic Society of Nepal. The Society decided to peer-review the articles from this issue onward in order to ensure the quality of the journal. We created a pool of more than thirty reviewers drawn from the various areas of linguistics.

We are grateful for the contributions of the reviewers, who accepted our requests and reviewed the articles within a short period of time. We sincerely acknowledge their support. Similarly, we thank the authors for submitting their articles and revising them to address the comments made by the reviewers. The consent of the writers and reviewers has encouraged us to make a new beginning for the journal. We are aware that the time allocated to the writers and reviewers was not sufficient, and we assure you all that the editorial board will begin its work in good time for future issues.

In spite of our sincere effort, we could not accommodate all the articles submitted to the editorial board. The articles that the authors revised following the reviewers’ suggestions reached us at the very last moment, and some are yet to be received. Similarly, we are still waiting for feedback on some articles. Owing to the limitation of time, these articles could not be included in this issue. The board will proceed immediately with the publication of the second issue of this volume.

The current issue comprises eight articles and the keynote address delivered at the 38th Conference of the Society. These articles cover five themes, namely morphosyntax, sociolinguistics, computational linguistics, acoustic phonetics and language planning.

We acknowledge the cooperation of the executive members of Linguistic Society of Nepal, authors, reviewers and other individuals and organizations involved in the publication of the journal. The Board specially recognises the support extended by Mr. Krishna Prasad Chalise in the preparation of this issue. Finally, we look forward to constructive feedback from our esteemed readers.
Board of Editors

CONTENTS

DEVELOPING CLASSIFICATION-BASED NAMED ENTITY RECOGNIZERS (NER) FOR SAMBALPURI AND ODIA APPLYING SUPPORT VECTOR MACHINES (SVM)
    Pitambar Behera and Sharmin Muzaffar
INDEFINITE PRONOUNS IN ASSAMESE
    Pushpa Renu Bhattacharyya
ASPIRATION IN NEPALI
    Krishna Prasad Chalise
ECHO-WORD FORMATION IN BANGLA
    Kuntala Ghosh Dastidar
CASE MARKING IN NUBRI
    Cathryn Donohue
LANGUAGE SHIFT IN NEWAR: A CASE STUDY IN THE KATHMANDU VALLEY
    Bhim Lal Gautam
DETERMINING OFFICIAL LANGUAGES IN THE FEDERAL STATES IN NEPAL
    Dan Raj Regmi
A PRELIMINARY STUDY OF MNAR
    Ruth Rymbai and Arvind Kumar Rawat
PRONOMINALISATION IN SOUTH ASIAN LANGUAGES: OF PEOPLE AND THEIR ACTIONS [Keynote speech delivered at the 38th Annual Conference of LSN]
    Tanmoy Bhattacharya

Pitambar Behera is affiliated to Jawaharlal Nehru University. The author’s email address is <[email protected]>.
Sharmin Muzaffar is affiliated to Aligarh Muslim University. The author’s email address is <[email protected]>.
Pushpa Renu Bhattacharyya is affiliated to Tezpur University, India. The author’s email address is <[email protected]>.
Krishna Prasad Chalise is affiliated to Tribhuvan University, Nepal. The author’s email address is <[email protected]>.
Kuntala Ghosh Dastidar is affiliated to University of Calcutta, India. The author’s email address is <[email protected]>.
Cathryn Donohue is affiliated to Hongkong University, Hongkong. The author’s email address is <[email protected]>.
Bhim Lal Gautam is affiliated to Central Department of Linguistics, Tribhuvan University, Nepal. The author’s email address is <[email protected]>.
Dan Raj Regmi is affiliated to Central Department of Linguistics, Tribhuvan University, Nepal. The author’s email address is <[email protected]>.
Ruth Rymbai is affiliated to North Eastern Hill University, Shillong, India. The author’s email address is <[email protected]>.
Arvind Kumar Rawat is affiliated to North Eastern Hill University.
The author’s email address is <[email protected]>.
Tanmoy Bhattacharya is affiliated to University of Delhi, India. The author’s email address is <[email protected]>.

DEVELOPING CLASSIFICATION-BASED NAMED ENTITY RECOGNIZERS (NER) FOR SAMBALPURI AND ODIA APPLYING SUPPORT VECTOR MACHINES (SVM)

Pitambar Behera and Sharmin Muzaffar

This paper demonstrates the development of Named Entity Recognizers (NER) applying Support Vector Machines (SVM) for Sambalpuri and Odia. The Sambalpuri corpus amounts to 112k word tokens, of which 5,887 are named entities. For Odia, the 250k ILCI corpus has been used, of which 18,447 tokens are named entities. The Sambalpuri recognizer achieves 96.72% accuracy, whereas the Odia recognizer achieves 98.10%.

Keywords: NER, Sambalpuri, NLP, Odia, SVM, Machine Learning, Indo-Aryan languages, Information Retrieval, Natural Language Processing.

1 Overview

Named entity recognition (NER) is one of the applications of Natural Language Processing and is considered a subtask of information retrieval. NER is the process of detecting Named Entities (NEs) in a document and categorizing them into named entity classes such as the names of organizations, persons, locations, sports, rivers, cities, countries, quantities, etc. For English, a great deal of work on recognizing named entities has been accomplished. By contrast, comparable progress has not been achieved in detecting named entities in Indian languages. India is home to 22 official languages along with endangered, lesser-known and less-studied languages. NER is still considered an emerging area of research in the field of NLP in the context of Indian languages.

There are various applications of NER such as Information Extraction, Question Answering, Information Retrieval, Automatic Summarization, Machine Translation, etc. Named entities can be identified by performing computation on a given natural language through rule-based or statistical approaches. The task of identifying, extracting and retrieving the necessary information can be made faster if we are already acquainted with the nature, type and functions of named entities. Therefore, NER is the process of detecting, classifying and [...] classes with the application of any of the NER-based approaches.

1.1 Approaches to named entity recognition

There are basically two broad approaches employed in the recognition of named entities (Nayan et al., 2008; Sasidhar et al., 2011; Saha, 2008): the rule-based approach and the machine learning based approach (Kaur and Gupta, 2012; Kaur and Gupta, 2010; Srivastava et al., 2011).

1.1.1 Rule-based approach

This approach covers the list lookup approach and the linguistic approach. In the former, gazetteers comprising lists for the different named entity classes are exploited, and a simple lookup or search operation is conducted in order to detect whether a word belongs to a named entity class or not. If a particular word belongs to a named entity class, a named entity label, as specified in the annotation schema, is allotted to that word on the basis of the class to which it belongs. In the linguistic approach, on the other hand, a linguist is entrusted with formulating heuristic linguistic rules so that the named entities can be identified, classified and extracted easily (Ekbal and Bandyopadhyay, 2010; Gupta and Lehal, 2011). The formulated rules are language dependent and cannot be applied to identify named entities in any other language (Kaur and Gupta, 2012). Therefore, a data-driven statistical approach became indispensable.

1.1.2 Statistical approach

This approach is motivated by machine learning theories and algorithms, for instance, Hidden Markov Models (HMM), Maximum Entropy Markov Model (MEMM), Conditional Random Field (CRF), Support Vector Machines [...]
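
The list lookup approach described in section 1.1.1 can be illustrated with a minimal sketch. It is not the authors' implementation: the gazetteer entries, the class names (PERSON, LOCATION, ORGANIZATION), the "O" label for unmatched tokens and the helper names lookup_label and tag_tokens are all assumptions made for illustration, and they do not reproduce the annotation schema or the word lists used in the paper.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy gazetteers. The class names and entries are illustrative
# assumptions, not the annotation schema or lists used in the paper.
GAZETTEERS: Dict[str, Set[str]] = {
    "PERSON": {"pitambar", "sharmin"},
    "LOCATION": {"odisha", "sambalpur", "kathmandu"},
    "ORGANIZATION": {"ilci"},
}

OUTSIDE = "O"  # assumed label for tokens found in no gazetteer


def lookup_label(token: str) -> str:
    """Return the first named entity class whose list contains the token."""
    key = token.lower()
    for label, entries in GAZETTEERS.items():
        if key in entries:
            return label
    return OUTSIDE


def tag_tokens(tokens: List[str]) -> List[Tuple[str, str]]:
    """Pair every token with the label found by the gazetteer lookup."""
    return [(token, lookup_label(token)) for token in tokens]


if __name__ == "__main__":
    for token, label in tag_tokens(["Sharmin", "visited", "Sambalpur"]):
        print(token, label)
```

A single-token lookup of this kind cannot handle multi-word names, and it cannot resolve a word that appears in more than one list. Limitations like these are consistent with the move toward data-driven approaches that the text describes.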
