Sanskrit Kāraka Analyzer for Machine Translation Sudhir Kumar Mishra

Sanskrit K āraka Analyzer for Machine Translation Thesis submitted to Jawaharlal Nehru University In partial fulfillment of the requirements For the award of the Degree of DOCTOR OF PHILOSOPHY by Sudhir Kumar Mishra Special Centre for Sanskrit Studies Jawaharlal Nehru University New Delhi–110067 INDIA 2007 ÌuÉÍvÉ· xÉÇxM×üiÉ AkrÉrÉlÉ MåülSì SPECIAL CENTRE FOR SANSKRIT STUDIES eÉuÉÉWûUsÉÉsÉ lÉåWûÃ ÌuÉµÉÌuÉ±ÉsÉrÉ JAWAHARLAL NEHRU UNIVERSITY lÉD ÌSssÉÏÌSssÉÏ----110067 New Delhi-110067 July 20, 2007 D E C L A R A T I O N I declare that thesis entitled “Sanskrit K āraka Analyzer for Machine Translation” submitted by me for the award of the degree of Doctor of Philosophy is an original research work and has not been previously submitted for any other degree or diploma in any other institution/university. (Sudhir K Mishra) ÌuÉÍvÉ· xÉÇxM×üiÉ AkrÉrÉlÉ MåülSì SPECIAL CENTRE FOR SANSKRIT STUDIES eÉuÉÉWûUsÉÉsÉ lÉåWûÃ ÌuÉµÉÌuÉ±ÉsÉrÉ JAWAHARLAL NEHRU UNIVERSITY lÉD ÌSsÌSssÉÏsÉÏsÉÏsÉÏ----110067110067 NEW DELHI-110067 July 20, 2007 C E R T I F I C A T E This thesis entitled “Sanskrit K āraka Analyzer For Machine Translation” submitted by Sudhir K Mishra to Special Centre for Sanskrit Studies, Jawaharlal Nehru University, New Delhi-110067 , for the award of the degree of Doctor of Philosophy, is an original work and has not been submitted so far, in part or full, for any other degree or diploma of any University. This may be placed before the examiners for evaluation and for award of the degree of Doctor of Philosophy. Dr. Girish Nath Jha Dr. C. Upender Rao [Supervisor] [Chairperson] Á Á eÉrÉliÉÏ qÉXçaÉsÉÉ MüÉsÉÏ pÉSìMüÉsÉÏ MümÉÉÍsÉlÉÏ | SÒaÉÉï ¤ÉqÉÉ ÍvÉuÉÉ kÉÉ§ÉÏ xuÉÉWûÉ xuÉkÉÉ lÉqÉÉåÅxiÉÑ iÉå || §ÉÉÌWû qÉÉÇ SåÌuÉ SÒwmÉëå¤rÉå vÉ§ÉÔhÉÉÇ pÉrÉuÉÍkÉïÌlÉ | mÉëÉcrÉÉÇ U¤ÉiÉÑ qÉÉqÉælSìÏ AÉalÉårrÉÉqÉÎalÉSåuÉiÉÉ || AÎeÉiÉÉ uÉÉqÉmÉÉµÉåï iÉÑ SÍ¤ÉhÉå cÉÉmÉUÉÎeÉiÉÉ | ÍvÉZÉÉqÉÑ±ÉåÌiÉlÉÏ U¤ÉåSÒqÉÉ qÉÔÎklÉï urÉuÉÎxjÉiÉÉ || lÉÉÍxÉMüÉrÉÉÇ xÉÑaÉlkÉÉ cÉ E¨ÉUÉå¸å cÉ cÉÍcÉïMüÉ | AkÉUå cÉÉqÉ×iÉMüsÉÉ ÎeÉÀûÉrÉÉÇ cÉ xÉUxuÉiÉÏ || vÉÑ¢üÇ oÉë¼ÉÍhÉ qÉå U¤ÉåcNûÉrÉÉÇ Nû§ÉåµÉUÏ iÉjÉÉ | AWûÇMüÉUÇ qÉlÉÉå oÉÑÌ®Ç U¤ÉålqÉå kÉqÉïkÉÉËUhÉÏ || AÉrÉÔ U¤ÉiÉÑ uÉÉUÉWûÏ kÉqÉïÇ U¤ÉiÉÑ uÉæwhÉuÉÏ | rÉvÉÈ MüÐÌiÉïÇ cÉ sÉ¤qÉÏÇ cÉ kÉlÉÇ ÌuÉ±ÉÇ cÉ cÉÌ¢ühÉÏ || Á ÌuÉvÉÑ®¥ÉÉlÉSåWûÉrÉ Ì§ÉuÉåSÏÌSurÉcÉ¤ÉÑwÉå | ´ÉårÉÈmÉëÉÎmiÉÌlÉÍqÉ¨ÉÉrÉ lÉqÉÈ xÉÉåqÉÉkÉïkÉÉËUhÉå || Contents Acknowledgements i Abbreviations iv Transliteration scheme used vii INTRODUCTION 01 CHAPTER – I 6-42 Indian Language Technology and Computational Sanskrit 1.-1 Introduction 6 1.1 Survey of IL technology R & D 6 1.2 IL-IL MT Systems in India 16 1.3 Major Approaches to MT 36 1.4 Role of Sanskrit in translation among Indian Languages (IL) 37 1.5 Kāraka and its usefulness for IL-IL translation 38 CHAPTER – II 43-67 Centrality of Verb in P āini’s Kāraka System 2.0 Sanskrit sentence 43 2.1 Survey of Literature 44 2.2 Sanskrit verb and its identification 44 2.2.1 Definition of dh ātu 45 2.2.2 Classification of dh ātus 47 2.2.3 Concept of lak āra 49 2.3 Structure of Sentence 50 2.4 TGL id module 53 2.5 Derivational Verb Identification 58 2.6 Prefix Identification 65 2.7 Verb identification for Kāraka 66 CHAPTER – III 68-100 The Kāraka System of P āini 3.0 Definition of Kāraka 68 3.1 Types of Kāraka 69 3.1.1 Kart k āraka 70 3.1.2 Karman k āraka 75 3.1.3 Kara a kāraka 79 3.1.4 Sa prad āna k āraka 80 3.1.5 Ap ādāna k āraka 83 3.1.6 Adhikara a k āraka 87 3.1.7 Optional kāraka 89 3.2 Karmapravacn īya 91 CHAPTER – IV 101-140 The Vibhakti System of P āini 4.0 Vibhakti 101 4.1.1 Pratham ā vibhakti 101 4.1.2 Optional pratham ā vibhakti 102 4.1.3 Dvit īyā vibhakti 103 4.1.4 Optional dvit īyā vibhakti 105 4.1.5 Ttīyā vibhakti 106 4.1.6 Optional ttīyā vibhakti 108 4.1.7 Caturth ī vibhakti 110 4.1.8 Optional caturth ī vibhakti 111 4.1.9 Pa –cami vibhakti 112 4.1.10 Optional pa –cami vibhakti 114 4.1.11 a hī vibhakti 114 4.1.12 Optional a hī vibhakti 116 4.1.13 Saptam ī vibhakti 117 4.1.14 Optional saptam ī vibhakti 120 4.2 Kāraka - vibhakti mapping 120 4.2.1 Kart k āraka and vibhakti 121 4.2.2 Karman k āraka and vibhakti 125 4.2.3 Kara a k āraka and vibhakti 135 4.2.4 Samprad āna k āraka and vibhakti 136 4.2.5 Ap ādāna k āraka and vibhakti 136 4.2.6 Adhikara a k āraka and vibhakti 137 4.2.7 vibhakti sūtras related to Vedic texts 140 CHAPTER – V 141-154 Kāraka Analysis System (KAS) 5.0 Introduction 141 5.1 KAS – Architecture 142 5.2 The algorithm 142 5.3 KAS platform 143 5.4 KAS modules 144 5.4.1 Preprocessor 144 5.4.2 Verb Analyzer 144 5.4.2.1 upasarga dh ātus 145 5.4.2.2 ijanta dh ātus 145 5.4.2.3 nāma-dh ātus 146 5.4.2.4.1 Lexical Resource for ti anta analyzer 146 5.4.2.4.2 Process of ti anta identification 147 5.4.2.4.3 ti anta information for kāraka analysis 147 5.4.2.4.3.1 ga a information 147 5.4.2.4.3.2 artha information 148 5.4.2.4.3.3 karmaka information 149 5.4.3 avyaya 149 5.4.4 karmapravacan īya 150 5.4.5 pratyay ānta pada 150 5.4.6 anyapada 151 5.4.7 subanta Analyzer 152 5.5 Conclusion 154 CONCLUSION 155-158 APPENDICES 159-164 • Sample of source code • Sample of lexical resources • Test corpora • Screen shots • Web application CD BIBLIOGRAPHY 165-176 Acknowledgement First of all, I would like to give my deep regards and deliver thanks to the divine power called God for enabling me to finish this work. I am greatly indebted to Dr. Girish Nath Jha , under whose guidance and supervision I was able to do my research in the emergning area of multidisciplinary research called Computational Linguistics Dr. Jha provided me continuous help and encouragement despite his hectic schedule. Without his able guidance and inspiration it would have been extremely difficult for the present research to complete. I have presented a lot of papers with Dr. Jha in different national and international conferences/ seminars. In fact, whatever I know in this field, it is an outcome of his effort. His invaluable intellectual inputs, scholarly guidance, insightful suggestions, affectionate encouragement, fatherly admonishment and bountiful patience made this work see the light of the day. I am particularly grateful to Prof. Kapil Kapoor , the guru of my guru, for teaching research methodology to me and for being a constant support whenever I needed. I am deeply grateful to my teachers from the Special Centre for Sanskrit Studies (Prof. Shashiprabha Kumar, Dr. Hari Ram Mishra, Dr. C. Upendra Rao, Dr. Rajnish Mishra, Dr. Santosh Kumar Shukla and Dr. Ram Nath Jha) and Prof. V. N. Jha for initiating me in the discipline of Sanskrit (Grammar, Literature, Linguistics and Philosophy) and helping me out whenever there was a need for it. Prof. Rama Nath Sharma (Department of Hawaiian and Indo-pacific Languages and Literature, University of Hawaii,) helped me in the linguistic approaches to Sanskrit Language. He also solved many research related problems from time to time through his readings and valuable suggestions. I had an opportunity to learn kāraka prakara a of Pata –jali’s Mah ābh āya by Prof. George Cardona (School of Arts and Sciences, Department of Linguistics, University of Pennsylvania, USA) at J.N.U. when he was visiting Sanskrit Centre in 2005. Prof. Cardona also provided research related valuable materials. Dr. Amba Pradeep Kulkarni (Senior Consultant, Satyam Computer Services Ltd., Hyderbad) and Dr. Deepti Mishra Sharma (Language Technologies Research Centre, IIIT, Hyderabad) thoroughly discussed with me my topic and suggested techniques of solving this problem. I am extremely grateful to Dr. H. R. Mishra and Mrs. Dr. Divya Mishra for their unfailing help and encouragement on both levels - personal and academics. After completing my M.A. in Sanskrit from University of Allahabad, Dr. Goparaju Rama (Principal, Ganganatha Jha Kendriya Sanskrit Vidyapeetha, Allahabad) gave me an opportunity to present my another work on Ny āya and Vai śeśika Philosphy before a body of scholars He has published it in the journal of Ganganatha Jha Kendriya Sanskrit Vidyapeetha. Dr. Kishor Nath Jha (Ganganatha Jha Kendriya Sanskrit Vidyapeetha, Allahabad) helped me in writing the above paper. I have no words to express my gratitude to all the above mentioned people and their valuable help. My deep acknowledgement for the emotional and moral support provided by my grand-mother Mrs. Chandrawati Mishra , my parents Shri U.S. Mishra and Smt. Lalati Mishra . My elder brother Mr. Sunil Kumar Mishra has been an inspiration and role model in many ways for me. He has been not only my elder brother but also a good friend and has been just like my parents. Still, this work would not have been completed without consistent emotional and academic support by him. I am feeling very happy and share a lot of love with my younger sisters Miss Susmita Mishra , Mrs. Renu Tiwari , Smt. Rashmi Mishra (Bhabhi ji ). Among my friends Dr. Devendra Singh (JNU), Dr. R. Chandrashekar (Brown University), Subhash (CDAC, Kolakata), L.K.Vimal, Dr. Satyapal Sharma (BHU), Dr. Girish Singh and Dr. Sudheer P Singh (DU) were always there to share my feelings and emotions.

Sanskrit Kāraka Analyzer for Machine Translation Sudhir Kumar Mishra

Why Are Sanskrit Play Titles Strange?

Olga Tribulato Ancient Greek Verb-Initial Compounds

Loukota 2019

To Be Read by Dr Malhar Kulkarni, IIT Bombay)

ED132833.Pdf

Completed Research, Studies, and Instructional Materials for Language Development Under the National Defense Education Act of 1958, Title VI, Section 602. a Bibliography, List No. 6

Modeling the Pāṇinian System of Sanskrit Grammar

Introduction*

HISTORICAL SYNTAX the Syntax of Sanskrit Compounds John J

Lesson 193 LESSON 20 (Vi‚軻 P¹-Ha P鼓ini Has

Anantaratnaprabhava

Modeling the Pāṇinian System of Sanskrit Grammar