Syntactic Knowledge Based Framework for Resolving Reflexive and Distributive Anaphors in Urdu Discourse
By JAMAL ABDUL NASIR
Registration No. 1079-D-83

A thesis submitted in partial fulfillment of the requirements for the degree of Ph.D. in Computer Science

INSTITUTE OF COMPUTING AND INFORMATION TECHNOLOGY
GOMAL UNIVERSITY
DERA ISMAIL KHAN, KPK, PAKISTAN
September, 2020

Dedicated to Humanity

List of Contents

Student’s Declaration
List of Tables
List of Figures
List of Illustrations
List of Abbreviations
List of Appendices
Acknowledgement
Abstract

Chapter 1: Introduction
  1.1 Overview
  1.2 Terminology
  1.3 Anaphora Resolution
  1.4 Aim and Objectives
  1.5 Trends and Challenges
  1.6 Reflexive and distributive anaphora in Urdu
  1.7 Key Contributions
  1.8 Significance of the Study
  1.9 Thesis Organization
  1.10 Summary

Chapter 2: Literature Review
  2.1 Overview
  2.2 Factors in Anaphora Resolution
    2.2.1 Constraints
    2.2.2 Preferences
  2.3 Early AR systems
  2.4 Modern Anaphora Resolution Systems
  2.5 Machine Learning and Statistics based AR System
  2.6 AR for Urdu and Indian Languages
  2.7 Summary

Chapter 3: Reflexive and Distributive Pronouns
  3.1 Overview
  3.2 Noun Cases in Urdu
    3.2.1 Nominative case
    3.2.2 Ergative case
    3.2.3 Accusative case
    3.2.4 Dative case
    3.2.5 Instrumental case
    3.2.6 Genitive case
    3.2.7 Locative case
    3.2.8 Vocative case
    3.2.9 Oblique case
  3.3 Reflexive pronoun in Urdu (ضمائر معکوس)
  3.4 Exploring Reflexive pronoun (ضمائر معکوس)
    3.4.1 Possessive Reflexive pronouns
      3.4.1.1 Possessive reflexive pronoun preceded by a noun or a pronoun
      3.4.1.2 Possessive reflexive pronoun preceded by ergative case
      3.4.1.3 Possessive reflexive pronoun preceded by an adverb and a noun/pronoun
      3.4.1.4 Possessive reflexive pronoun preceded by a dative case
      3.4.1.5 The connector “اور” (Aur) between two possessive reflexive pronouns
    3.4.2 Non Possessive or Emphatic Reflexive pronoun “خود”
      3.4.2.1 Emphatic Reflexive AD “خود” compound with a noun or personal pronoun
      3.4.2.2 Emphatic Reflexive pronoun “خود” preceded by ergative case
      3.4.2.3 Emphatic reflexive pronoun “خود” preceded by a dative case
      3.4.2.4 Emphatic reflexive pronoun “خود” preceded by an adverb and a noun/pronoun
      3.4.2.5 Emphatic reflexive pronoun “خود” preceded by “بذات” and noun/pronoun
    3.4.3 Possessive and non-possessive reflexive pronouns Together
    3.4.4 Distributive Reflexive pronoun
  3.5 Exploring Distributive Pronouns (ضمائر تقسیم)
    3.5.1 Distributive Pronoun ہر ایک (Har Ek)
      3.5.1.1 Group Reference
      3.5.1.2 Role of verb
      3.5.1.3 Using properties/attributes/complements to identify referent
      3.5.1.4 Pronoun ہر ایک (Har Ek) followed by Noun/Noun Phrase
      3.5.1.5 Topicalized structure
      3.5.1.6 Referent in Oblique form
      3.5.1.7 Pronoun ہر ایک (Har Ek) referring to many entities of same category separated by comma
    3.5.2 Distributive Pronoun کوئی ایک (Koi Ek)
      3.5.2.1 Group formed by number
      3.5.2.2 Group formed by a plural
      3.5.2.3 Noun after the pronoun کوئی ایک (Koi Ek)
      3.5.2.4 Personal Pronoun before pronoun کوئی ایک (Koi Ek)
    3.5.3 Distributive Pronoun کوئی بھی (Koi Bhi)
      3.5.3.1 Distributive pronoun کوئی بھی (Koi Bhi) in a negative sentence
      3.5.3.2 Distributive pronoun کوئی بھی (Koi Bhi) in a positive sentence
    3.5.4 Distributive pronoun کئی ایک (Kai Ek)
      3.5.4.1 Reference to preceding group
      3.5.4.2 Genitive or Possessive case
  3.6 Summary

Chapter 4: Proposed Framework for Reflexive And Distributive Anaphora Resolution (RADAR)
  4.1 Introduction to RADAR
  4.2 General framework for anaphora resolution
  4.3 Architecture of RADAR
  4.4 Framework for resolving reflexive anaphora
  4.5 Workflow of framework for reflexive anaphora
    4.5.1 Reflexive Identifier
    4.5.2 Noun Phrase Extractor
    4.5.3 PR case Identifier
    4.5.4 NPR case Identifier
    4.5.5 PR Rules
    4.5.6 NPR Rules
    4.5.7 Anaphora Resolution
    4.5.8 RA
    4.5.9 Noun
    4.5.10 PP
    4.5.11 Noun Cases
  4.6 Framework for resolving distributive anaphora
  4.7 Workflow of framework for distributive anaphora
    4.7.1 Distributive Anaphor Identifier
    4.7.2 Mark Groups
    4.7.3 Verb Identification
    4.7.4 Distributive Anaphora Resolution
    4.7.5 Resolve and mark case
    4.7.6 Processes and resources for Distributive anaphora
      4.7.6.1 QW ()
      4.7.6.2 Nmbr ()
      4.7.6.3 SP ()
      4.7.6.4 Verb ()
      4.7.6.5 Attrib ()
      4.7.6.6 Polarity ()
      4.7.6.7 NCase ()
      4.7.6.8 MVerb ()
      4.7.6.9 DA
      4.7.6.10 Plurals
      4.7.6.11 Numbers
      4.7.6.12 Quantifier
      4.7.6.13 Verbs
      4.7.6.14 Attribute
      4.7.6.15 Category
      4.7.6.16 PP
      4.7.6.17 Oblique
      4.7.6.18 Adverb
      4.7.6.19 NounCase
  4.8 Summary

Chapter 5: Evaluation and Results
  5.1 Evaluation overview
  5.2 Evaluation of RADAR
  5.3 Overall Results
  5.4 Evaluation of RADAR for Reflexive anaphora
    5.4.1 Analysis of possessive reflexive pronoun
    5.4.2 Analysis of Non-possessive reflexive pronoun
  5.5 Evaluation of RADAR for Distributive anaphora
    5.5.1 Analysis of possessive reflexive pronoun
      5.5.1.1 Distributive pronoun ہر ایک (Har Ek) with various entities
      5.5.1.2 Distributive pronoun کوئی ایک (Koi Ek) with different entities
      5.5.1.3 Distributive anaphor کوئی بھی (Koi Bhi)
      5.5.1.4 Distributive anaphor کئی ایک (Kai Ek)
  5.6 Conclusion
  5.7 Future Work

Chapter 6: References

Student’s Declaration

I, Jamal Abdul Nasir, do hereby state that my Ph.D. thesis titled “Syntactic Knowledge Based Framework for Resolving Reflexive and Distributive Anaphors in Urdu Discourse” is my own work and has not been submitted previously by me for taking any degree from Gomal University, Dera Ismail Khan or anywhere else in the country/world.
I understand the zero tolerance policy of the HEC and Gomal University, Dera Ismail Khan towards plagiarism. Therefore, I declare that no portion of my thesis has been plagiarized and any material used as reference is properly cited.

I undertake that if I am found guilty of any formal plagiarism in the above titled thesis, even after the award of the Ph.D. degree, the university reserves the right to withdraw/revoke my Ph.D. degree, and that HEC has the right to publish my name on the website on which the names of students who submitted plagiarized work are placed.

Name of Student: Jamal Abdul Nasir Signature_____________ Date___________
Name of Supervisor: Dr. Zia Ud Din Signature_____________ Date__________

List of Tables

Table No  Description
2.1  Summary of related work in Indian languages
3.1  Singular-Plural
3.2  Noun Cases
3.3  Resolution rules for reflexive anaphora
3.4  Resolution rules for distributive anaphora
5.1  Results of Reflexive and Distributive Anaphora Resolution
5.2  Reflexive Pronoun Individually
5.3  Possessive Reflexive Pronoun with various entities
5.4  Non-Possessive Reflexive Pronoun with various entities
5.5  Distributive Anaphors individually
5.6  Distributive pronoun ہر ایک (Har Ek) with various entities
5.7  Distributive pronoun کوئی ایک (Koi Ek) with different entities
5.8  Distributive pronoun کوئی بھی (Koi Bhi) with different entities
5.9  Distributive pronoun کئی ایک (Kai Ek) with different entities

List of Figures

Figure No  Description
1.1  Knowledge required for Anaphora Resolution
3.1  Possessive reflexive pronoun preceded by noun
3.2  Possessive reflexive pronoun with ergative case
3.3  Possessive reflexive pronoun preceded by adverb and noun
3.4  Two possessive reflexive