Semantics Driven Human-Machine

Computation Framework for Linked Islamic Knowledge Engineering

by

Amna Basharat

(Under the Direction of Khaled Rasheed and I. Budak Arpinar)

Abstract

Formalized knowledge engineering activities, including semantic annotation and linked data management tasks, in specialized domains suffer from a considerable knowledge acquisition bottleneck, owing to the lack of availability of experts and the inefficacy of automated approaches. Human Computation & Crowdsourcing (HC&C) methods advocate leveraging human intelligence and processing power to solve problems that are still difficult to solve computationally. Contextualized to the domain of Islamic Knowledge, this research investigates the synergistic interplay of these HC&C methods and the semantic web, and proposes a semantics driven human-machine computation framework for knowledge engineering in specialized and knowledge intensive domains. The overall objective is to augment the process of automated knowledge extraction and text mining methods using a hybrid approach that combines the collective intelligence of the crowds with that of experts to facilitate activities in formalized knowledge engineering, thus overcoming the so-called knowledge acquisition bottleneck.

As part of this framework, we design and implement formal and scalable knowledge acquisition workflows through the application of a semantics driven crowdsourcing methodology and its specialized derivative, called learnersourcing. We evaluate these methods and workflows for a range of knowledge engineering tasks, including thematic classification, thematic disambiguation, thematic annotation and contextual interlinking, for two primary Islamic texts, namely the Qur’an and the books of Prophetic narrations called the Hadith. This is done at various levels of granularity, including atomic and composite task workflows, that existing research fails to address. We leverage primarily upon students and learners engaging in typical knowledge seeking and learning scenarios. The chosen method ensures annotation reliability by introducing an ‘expert sourcing’ workflow tightly integrated within the system; quantitative measures of ensuring annotation quality are thus woven into the very fabric of the human computation framework. The results of our evaluation demonstrate that our proposed methods are robust and are capable of generating high quality and reliable annotations, while significantly reducing the need for expert contributions.

Index words: Qur’an, Quran, Hadith, Semantic, Islam, Islamic, Crowdsourcing, Learnersourcing, Expert-sourcing, Human Computation, Thematic Annotation, Thematic Disambiguation

Semantics Driven Human-Machine

Computation Framework for Linked Islamic Knowledge Engineering

by

Amna Basharat

B.E. (SE), National University of Science and Technology, 2005
M.Sc., Brunel University, 2007

A Dissertation Submitted to the Graduate Faculty of The University of Georgia in Partial Fulfillment of the Requirements for the Degree

Doctor of Philosophy

Athens, Georgia

2016

© 2016 Amna Basharat
All Rights Reserved

Semantics Driven Human-Machine

Computation Framework for Linked Islamic Knowledge Engineering

by

Amna Basharat

Approved:

Major Professors: Khaled Rasheed, I. Budak Arpinar

Committee: Krys Kochut, Thiab Taha

Electronic Version Approved:

Suzanne Barbour
Dean of the Graduate School
The University of Georgia
December 2016

Semantics Driven Human-Machine Computation Framework for Linked Islamic Knowledge Engineering

Amna Basharat

December 2016

Inspiration

In the name of Allah, the most merciful, the most beneficent.

And those who strive for Us - We will surely guide them to Our ways. And indeed, Allah is with the doers of good. — [Al-Quran, Al-Ankaboot, 29:69]

May Allah accept this humble effort, purely for His sake, and make this a source of elevating my ranks in this world and hereafter, along with all those who have contributed in making this effort possible, foremost amongst whom are my parents, teachers, siblings, family and friends.

Dedicated to Ammi and Abbu Ji, with all my love

Acknowledgments

All thanks and praise be to Almighty Allah, the Lord of the Worlds.

It is only due to the grace and blessings of Almighty Allah (God), that I have been blessed with the poise, the will and the perseverance to see this milestone to an end. It was only a dream when I came. Only Allah has made its realization possible. After the prayers and the support of all my loved ones, I am deeply beholden to my advisor, Dr. Khaled Rasheed, who placed a great deal of confidence and trust

in my ideas and abilities. He gave me both independence of thought and decision making, with the perfect balance of guidance and empathy, to help me realize my dream goals that I was most passionate about. When I look back at my years spent at UGA, Dr. Rasheed stands out as someone who played the most crucial role towards my successes and growth as an individual, and towards achieving my research vision. I will always continue to derive inspiration and learning from my time spent under his fatherly supervision. I would like to convey my immense gratitude to my co-advisor, Dr. I. Budak Arpinar, who also played a crucial role in driving my research objective and path. I am extremely grateful for the valuable suggestions, unprecedented guidance, appreciation and encouragement I received from him. He always encouraged me and challenged me to think with high level impact in mind and it certainly helped me explore avenues that I would not have been able to otherwise. I would also like to thank my committee members Dr. Krys Kochut and Dr. Thiab Taha, for their valuable time and feedback towards refining my research objectives and direction. My years of learning at UGA will have a profound impact on my life to come. I would like to express thanks to my colleagues Usman Nisar, Pan Huang, Kyle Krafka and Delaram Yazdansepas for their support and help. I would also like to thank the support staff at CS UGA, especially Suzi, Lafarrah, Timothy and Melody for their help and facilitation in all matters. I would also like to express my gratitude to my Fulbright advisor, Brian Diffley, for his support always. I would like to thank the IIE and USEFP and all those who

enabled this scholarship opportunity and helped me see through the completion of this degree. I would like to express my special gratitude to Ms. Rita Akhtar (Executive Director, USEFP) and those along her side, and Ms. Angela P. Aggeler (US Embassy, Pakistan), who helped me at a difficult stage in my program and ensured everything that would help me reach a successful completion. I am extremely grateful to Dr. Aftab Maroof and Dr. Arshad Ali Shahid (FAST-NUCES, Islamabad) for facilitating, guiding and mentoring me in this time. My very special thanks to my Bhaiii who took care of me all these years, played the role of ‘choti ammi’ and ‘chota daddy’, and who has always placed great pride and confidence in me and also put up with my three year old acts. This PhD would simply not have been the same without you. It makes me really proud to have you as my brother and I am grateful for all the valuable life lessons I have learnt from you, bro! I would also like to say special thanks to my little sister, An’amta, who specially played a huge role in ensuring ease and comfort for the last few months of my PhD, when I needed all the more support and help. Her presence was indeed a source of Barakah in my time and efforts, Alhamdulillah. A shout out to Aapi’s little brothers Hebban and Zahir. Zahir, thank you for the great company, the forest walks, the great food and not to forget the endless questions and conversations. Hebban, thank you for all the wish lists and for always making sure that my luggage is loaded on my trips home. I am extremely grateful, from the bottom of my heart, especially to so many members of my extended family, my khalas, mamus, cousins and others who always

load me with duas, love and pampering throughout my life. My cousins Murrium and Umar have especially been a huge support for my travel trips throughout this period of five years. I feel extremely fortunate and blessed to have loving friends, who may not have been physically present, yet who stood by me, supported me, cheered me on, and helped me in every possible manner they could, with their efforts, duas and foremost their love and encouragement. Mahwash, Humaira, Tahreem, Sana, Amina, Zainab Abaid, Zainab Jamil and the Baqais to name a few, I owe it to all of you. So many others who always loaded me with duas and well wishes, may Allah reward you with the very best in both worlds. During these years, Allah also brought some special people into my life, as a special gift and mercy. Eternal friends, whom I feel so blessed to have in my life. My two pseudo moms are truly exceptional. I thank you both for all the joy, positivity and enlightenment that you bring, for all the duas, mentoring and support, and for making my life and the process of my PhD in particular all the more special and cherished. I also would like to express great admiration and love for my friend, SNG. She has been a role model, an inspiration and someone I look up to a lot, and has been a mentor for me in this PhD journey. I cannot forget all my friends here in Athens and the Atlanta community, both within and outside UGA, who made my stay comfortable and helped me in every possible way they could. I would specially like to mention Amena and Ayesha for always being there.

I also acknowledge the efforts of my ex-students. Muhammad Shoaib was always a great source of help and support in the initial stages of my PhD. May Allah reward all of his efforts in the best manner. Also, Kamran Ali, thank you for your guidance and help at the stage I needed it most. Bushra Abro, you may have considered yourself only a student; however, I learnt a lot from our discussions and conversations, and having you together with me as part of this journey means a lot. Thank you for sharing this passion with me, and walking together with me towards achieving the vision. This dissertation would not have been possible without the efforts of many nameless ‘crowds’ and ‘learners’ and ‘experts’ who participated in the tasks and made this research an astounding success. In particular, the students at Alhuda International, Alhuda E-Campus and those who are part of the Alhuda Apps group played a significant role in making this study truly worthwhile. In particular, I would like to thank Dr. Farhat Hashmi (CEO, Alhuda International), Dr. Kanwal Kaisser (Director, HB Alhuda Int.), Sr. Nabila Salimi (Alhuda International), Sr. Rifqah Asghar (Alhuda International), Ms. Ayesha Basheer (Lecturer, IIU and Islamic Online University), Ms. Asma Batool Naqvi (HOD DCS&SE, FBAS, IIUI), Dr. Shermeen Nizami and Sr. Doha Hijab (Alhuda E-Campus) and Sr. Shazia Nawaz (Alhuda US), for facilitating the research study. I am also thankful to the authors of sunnah.com for providing us with the valuable data source to carry out this research. I would also like to thank Dr. Adel Amer for his time, feedback, guidance and expert contributions. Many thanks to Matib Ahmed, Khalifa Al-Jeddah, Omar

Iqbal and others, who are too many to be named, for their contributions. Special thanks to Sr. Taimiyyah Zubair, Br. Imran Haq and Sr. Mariam Wasti (Alhuda Canada) for meeting with me during my trip to Canada. I would also like to thank Hisham Al-Hadi for taking the time to discuss my research, for endorsing it, for facilitating the study, and for the positive encouragement. There are others who prayed and wished well, without even me knowing. May Allah reward them all with the best and highest rewards in both worlds. Ameen. I must say this PhD journey has been truly unique. It started with a passion, a desire to do work for a purpose, unprecedented and unparalleled. It was made possible by many, but foremost the following: My dearest and most beloved parents, for whom no words of love and gratitude can ever be enough. They instilled in me the drive, the determination and the vision to always excel, seek higher and give my best at everything I do. They have truly empowered me with the best, in every way. They place more pride in me and my abilities than my own self. It was my father’s dream to see me complete a PhD from the US. Allah blessed me by enabling me to fulfill my dream through his dream. There are no words of gratitude that can suffice for all the unconditional love, care, support, duas and selfless sacrifices my parents have made for me and my siblings throughout their lives. Their hard work and poise towards providing us with the very best has been unprecedented. May Allah enable me to become the best Sadqa-e-Jariah for them and may Allah bless them infinitely in this world and the next with best rewards and elevate them to highest ranks.

My PhD is special because Allah enabled me to work on something that I was truly passionate about. Allah SWT greatly blessed me by enlightening my heart and life with the Qur’an. The foremost source of this inspiration is my dearest and most beloved teacher, Dr. Farhat Hashmi, whom Allah made the source of bringing this enlightenment and guidance into my life. I cannot thank her enough for all her love, duas and support, for placing immense trust and faith in me, and for facilitating me to the maximum possible extent. May Allah honor her and reward her in the best manner in both worlds, and all those who are alongside her, especially those who encouraged, supported, facilitated and participated in my work. May Allah grant her infinite blessings and rewards for all her support, mentoring and role modeling. Last but not least, I had always believed that Allah appoints special angels to look after us. This dissertation would be incomplete without the mention of one special angel who was sent my way during the span of this PhD, through whom, I truly believe, the experience of not only this PhD but my entire life has been transformed. Words would not do justice to describe this transformational & enriching experience. I cannot thank you enough for standing by me, for believing in me, for having so much faith & confidence in me, for virtually living and traveling with me, & for owning me (& also my thesis) like your own. Words are neither enough to express my gratitude, nor the bliss and happiness experienced from the jewels of love, care, support, duas, appreciation & motivation you have showered upon me. May Allah preserve this relation eternally (for His sake alone) & guide our way to the highest levels in Jannah together. Ameen.

Contents

Acknowledgments

List of Figures

List of Tables

1 Introduction
1.1 Research Context and Motivation
1.2 Research Novelty and Contributions
1.3 Dissertation Organization

2 A Vision and Conceptual Framework for Linked Islamic Knowledge
2.1 The Vision of Linked Islamic Knowledge
2.2 The LOD Potential for Islamic Knowledge
2.3 A Generic, High-Level Conceptual Framework For Linked Open Islamic Knowledge
2.4 The Need and the Role for Human Computation and Crowdsourcing
2.5 Conclusions

3 Human Computation and Crowdsourcing meet the Semantic Web: A Survey
3.1 Introduction
3.2 Background
3.3 Application of Human Computation Techniques to Semantic Web Research
3.4 A Collective Intelligence Genome for the Semantic Web
3.5 Crowdsourcing based Workflow Systems for the Semantic Web
3.6 Application of Semantic Web Techniques to Facilitate Human Computation
3.7 Discussion and Analysis
3.8 Conclusions

4 Text Mining Verse Similarity for Multi-lingual Representations of the Qur’an
4.1 Introduction
4.2 Verse-to-Verse Similarity Computation Framework for the Qur’an
4.3 Experimentation and Results
4.4 Discussion and Analysis
4.5 Conclusion

5 Crowdsourcing Framework for Linked Data Management
5.1 Introduction
5.2 Crowd-based Task and Workflow Management Architecture for the LOD
5.3 Experimental Evaluation
5.4 Comparison with Related Works
5.5 Beyond CrowdLink: Using CrowdSourcing in the LOD
5.6 Conclusions

6 A Semantic Framework for Thematic Annotations of the Qur’an Leveraging Crowds and Experts
6.1 Introduction
6.2 Problem Definition: Thematic Annotation in the Qur’an
6.3 Hybrid Architecture for Harnessing Crowd and Expert Annotations
6.4 Experimental Evaluation
6.5 Conclusions

7 Learnersourcing Thematic and Inter-Contextual Annotations from Islamic Texts
7.1 Introduction
7.2 Background and Related Work
7.3 Research Approach: Semantics Driven Learnersourcing Methodology
7.4 Dynamic Learnersourcing Workflow Design for Thematic Linking and Annotation of Hadith
7.5 Experimentation and Evaluation
7.6 Discussion and Conclusions

8 Semantic Hadith: A Framework for Publishing Linked Hadith Data
8.1 Introduction
8.2 Ontology for Semantic Hadith
8.3 Link Modeling and Design Issues
8.4 Linked Data Publishing Framework and Implementation
8.5 Prospective Applications
8.6 Related Work
8.7 Conclusions and Future Work

9 Conclusion
9.1 Research Summary and Contributions
9.2 Research Significance and Potential Impact
9.3 Future Research

Bibliography

List of Figures

1.1 Challenges in Knowledge Engineering Processes

2.1 The Macro Structure of Linked Islamic Knowledge Landscape
2.2 High level conceptual framework for Linked Islamic Knowledge Generation, Publishing and Consumption

3.1 Ontology Development Process (simplified adaptation from [153])
3.2 Classification of Tasks in the Process (as summarized in [186])
3.3 Classification of Tasks in the Process of Ontology Development (as summarized in [186])
3.4 The Typical Task Lifecycle in a Crowdsourcing Platform
3.5 Classification of Human Computation Tasks and Genres in Semantic Web (Ontology Engineering)
3.6 Classification of Human Computation Tasks and Genres in Semantic Web (Linked Data Management)
3.7 Main Stages in using the uComp Plugin from [5]

4.1 Qur’anic Verse Similarity Computation framework
4.2 Results for all representations (Similarity Range vs. No of Verse Pairs)

5.1 High-level Architecture of the Semantics-Enriched Workflow and Task Management Engine
5.2 The Process of Task Specification
5.3 The Process of Task Generation and Publishing
5.4 The Process of Task Review and Results Retrieval
5.5 The Process of Updating the Ontology from the Processed Results
5.6 A Human Intelligent Task (HIT) in AMT
5.7 Evaluation Results for Eleven Task Groups
5.8 Confidence based Analysis for Entity Disambiguation Tasks (Grouped by TaskID)
5.9 Confidence based analysis for Entity Disambiguation tasks (Grouped by WorkerID)

6.1 A segment of the thematic annotation knowledge model populated via the crowd
6.2 Hybrid Architecture for Harnessing Crowd and Expert Annotations
6.3 An Example of Learner tasks introduced in the learning scenarios (Thematic Disambiguation)
6.4 Human-Machine Computation Workflow for Disambiguation Task
6.5 An Example of Learner task (Thematic Annotation)
6.6 Human-Machine Computation Workflow for Annotation Task
6.7 Task Design for Thematic Annotation of Qur’anic verses
6.8 Decision Model used for Approving the Tasks based on Crowd Submissions and marking the ones to be sent to the Experts as Reviewable
6.9 Distribution of Number and percentage of Disambiguation Task from Crowd and Expert Sourcing stage
6.10 Distribution of Number and percentage of Annotation Task from Crowd and Expert Sourcing stage

7.1 A typical path of knowledge synthesis followed by Qur’anic learners
7.2 Overview of the Learnersourcing Methodology
7.3 Workflow Design
7.4 A snapshot of the look up hadith task, divided in three dynamic steps
7.5 A snapshot of the annotate Hadith task, divided in two dynamic steps
7.6 Decision Metrics for Task 1
7.7 Task Lifecycle
7.8 Number of tasks that reached agreement for different number of submissions
7.9 Performance patterns for two separate learners for Task 1

8.1 A snapshot of a typical Qur’anic Commentary
8.2 A Sample Hadith Snapshot
8.3 High Level Design of Semantic Hadith Ontology
8.4 Conceptual Design model for Hadith-Hadith Relationship
8.5 Conceptual Design model for Hadith-Verse Relationship
8.6 Conceptual Design model for Verse to Hadith Relationship based on Scholarly Commentaries
8.7 A view of the proposed and available Linked Data Cloud for Islamic Knowledge Sources
8.8 Linked Data Generation and Publishing Framework for Semantic Hadith

List of Tables

2.1 Summary of Related Works towards Islamic LOD

3.1 Broad Classification of Human Computation Tasks in the Semantic Web (Ontology Engineering) and the Respective Genres
3.2 Broad Classification of Human Computation Tasks in Semantic Web (Linked Data Management) and Respective Genres
3.3 Overview of the Collective Intelligence Genome
3.4 Application of Collective Intelligence Genome to Semantic Web Research
3.5 Threads and Dimensions of Analysis
3.6 Comparative Analysis of Crowdsourcing Research in the Semantic Web

4.1 Dataset representations and samples
4.2 Similarity Classes for Analysis
4.3 Precision and Recall Measures
4.4 Analysis of false positive case (Q-ENG)

5.1 A Sample Task Specification (Review Type)
5.2 Sample Task Data Input Generated from the Query Processor
5.3 Sample Task Assignment Properties
5.4 Eleven Task Groups Used in the Experimental Evaluation (G: Group; P: Participants; R: Responses)
5.5 Attributes and Answers in a Result file for GetStateNameforRiver Task
5.6 Candidate Entities from ACM and DBLP for Entity Disambiguation

6.1 Notation used in the Decision Analytics Process
6.2 Agreement and Confidence Thresholds for Crowd and Expert Output Decision Analytics
6.3 Results for Disambiguation and Annotation Tasks for both stages
6.4 Crowd Submission Scoring for Disambiguation Task
6.5 Crowd Submission Scoring for Annotation Task
6.6 Results showing the number of submissions as evaluated based on the system and the expert responses
6.7 Comparative Analysis of Crowd and Expert Tasks (Annotation Tasks)

7.1 Results for All Tasks
7.2 Learner Submission Classification for all tasks

8.1 Entity Statistics in the Semantic Hadith Dataset (Sunnah.com)
8.2 Link Statistics in the Semantic Hadith Dataset

Chapter 1

Introduction

Indeed, We guided him to the way, be he grateful or be he ungrateful. — [Al-Quran, Al-Insaan, 76:3]

Chapter Overview¹

This chapter provides an overview of the dissertation by laying out the research context and describing the motivation behind this research. The research novelty and contributions are highlighted, and a dissertation roadmap is provided to facilitate the reader.

¹ Partially appears as: Amna Basharat. "Semantics Driven Human-Machine Computation Framework for Linked Islamic Knowledge Engineering", pages 793-802. Springer International Publishing, 2016.

1.1 Research Context and Motivation

One of the major challenges hindering the successful application of ontology-based approaches to data organization and integration in specialized domains is the so-called 'knowledge acquisition bottleneck' [162]: the large amount of time and money needed to develop and maintain formal ontologies. This also includes ontology population and semantic annotation using well established vocabularies. This research is primarily motivated by the need to overcome this inherent knowledge acquisition bottleneck in creating semantic content for semantic applications. We have established how this is particularly true for knowledge intensive domains such as the domain of Islamic Knowledge, which has failed to capitalize on the promised potential of the semantic web and linked data technology; standardized web-scale integration of the available knowledge resources is currently not facilitated at a large scale [29]. To date, only one dataset in the domain exists on the Linked Open Data (LOD) cloud [168].

To understand the knowledge acquisition bottleneck encountered in this domain (and others), consider the knowledge engineering processes illustrated in Figure 1.1. Existing methods for semantic annotation and linked knowledge generation are either (a) computationally driven, employing text mining and information extraction methods, or (b) expert driven (such as conceptual modelling, annotation and validation). While computational methods may assist in large-scale knowledge acquisition, the lack of formalized and agreed upon knowledge models and the sensitivity of the knowledge at hand, primarily obtained from unstructured and multilingual data, make the knowledge engineering process far from trivial. The Islamic domain suffers a great deal from a lack of suitable training data and gold standards, which only adds to the challenge of ensuring the reliability and scalability of these methods. Expert driven methods involve subject specialists; however, these are often not scalable in time or cost. It is no wonder that the efforts towards the vision of standardization and formalization of Islamic Knowledge proposed by [18] have remained futile.

Figure 1.1: Challenges in Knowledge Engineering Processes

To address these and similar challenges, researchers have recognized that the realization of semantic and linked data technologies will require not only computation but also significant human contribution [186], [182]. Humans are simply considered indispensable [54] for the semantic web to realize its full potential. Emerging research advocates the use of Human Computation & Crowdsourcing (HC&C) methods to leverage human processing power and harness the 'collective intelligence' and the 'wisdom of the crowds' by engaging large numbers of online contributors to accomplish tasks that cannot yet be automated. There has been growing interest in using crowdsourcing methods to support semantic web and linked open data research by providing means to efficiently create research relevant data and potentially solve the bottleneck of knowledge experts and annotators needed for the large-scale deployment of semantic web and linked data technologies. A recent framework called CrowdTruth, proposed by [100], is a step forward that recognizes the challenges in gathering distributed data in crowdsourcing.

Within the realm of (semantic web based) knowledge engineering tasks, several levels of complexity may be encountered. Some tasks are simple, while others are more knowledge intensive. While some may reasonably be amenable to computational approaches, others need domain specific expert annotations and judgements. Therefore, not all tasks are fit for general purpose crowdsourcing. This is specifically true for the domain of Islamic knowledge, owing to the specialized nature of the learning needs presented by diverse users, and the heterogeneous, multilingual knowledge resources. Therefore, we emphasize that specialized domains such as the one that forms the basis of this research, i.e. the Islamic knowledge domain, need more than just faceless crowdsourcing. An emerging research paradigm recognizes this in the form of nichesourcing [49], [149] or expert-sourcing [158], as a natural step in the evolution of crowdsourcing to address the need of solving complex and knowledge intensive tasks, and as a means to improve the quality of crowdsourced input.

1.2 Research Novelty and Contributions

We believe this to be the first of its kind research that attempts to formally model, interlink and synthesize historical and multi-lingual religious texts using novel human computation methods and workflows. Through this research, we make the following primary contributions. With respect to the domain of Semantic Web and Human Computation, we successfully leverage human computation methods, including crowdsourcing and learnersourcing, to overcome the knowledge acquisition bottleneck in knowledge engineering tasks:

• We design and implement a novel framework that interleaves human and machine intelligence, driven by ontologies, to facilitate knowledge acquisition, semantic annotation and linked data management tasks. Our framework utilizes the notion of semantics based task profiles and workflows to augment computational approaches of knowledge extraction with crowd contributions. Our crowdsourcing framework is fully generic and tested for various linked data management tasks such as link verification, entity disambiguation and knowledge acquisition (Chapter 5).

• We developed a semantics based task specification model that allows for composing automated knowledge acquisition workflows combining machine computation with crowdsourced annotations at two levels. At the lower level, simple and less knowledge intensive tasks are crowdsourced using the Amazon Mechanical Turk platform. At the higher level, for more knowledge intensive tasks, skilled workers and experts are engaged through a custom web portal designed with a citizen science approach. We demonstrate the effectiveness of this hybrid model for several key knowledge engineering tasks, such as thematic disambiguation and thematic annotation, in the Qur’an. Results of our study demonstrate that we can obtain high quality annotations reliably through our proposed technique (Chapter 6).

• We further enhanced our crowdsourcing model and implemented a specialized form of crowdsourcing, called learnersourcing, which allows tasks to be adaptively delegated to learners or students engaged in learning scenarios pertinent to the domain. The learnersourcing methodology is validated by implementing a composite task based knowledge acquisition workflow. The study shows promising results, providing highly reliable and accurate annotations for thematic and inter-contextual interlinking of Hadith data. The reliability of the method is ensured by tightly integrating expert sourcing workflows within the system. A key highlight of this model is the ability to reduce the number of tasks requiring experts through the application of adaptive task delegation thresholds for the learners (Chapter 7).

With respect to the domain of Islamic Knowledge, we undertook case studies in order to validate the design and implementation of the framework, resulting in several contributions:

• We propose a vision, a methodology and a conceptual framework for addressing knowledge formalization, acquisition and interlinking challenges in the Islamic domain (Chapter 2).

• We designed and implemented a similarity computation framework for interlinking entities such as verses in the Qur’an or hadith. This text mining framework based on similarity computation serves as the foundation for providing candidate seed input to the crowdsourcing and learnersourcing workflows (Chapter 4); an illustrative sketch of such a similarity computation follows this list.

• We design and implement knowledge acquisition, interlinking and publishing workflows for this domain, which are entirely reusable and customizable and are expected to be used in the future for obtaining semantic annotations from a range of knowledge sources (Chapters 6 and 7).

• We propose knowledge models and vocabularies for formally modeling and interlinking the two primary sources of the Islamic domain, namely the Qur’an and the Hadith, which form the basis of all other texts. We also expand the linked data cloud by providing a new data source called ‘Semantic Hadith’ and creating richer knowledge links with available data sources (Chapter 8).
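To illustrate the kind of machine computation that seeds these workflows, the following is a minimal sketch of candidate verse-pair generation via TF-IDF cosine similarity, as referenced in the similarity computation contribution above. The vectorizer choice, the threshold and the sample data are illustrative assumptions, not the exact implementation described in Chapter 4.

```python
# Minimal sketch: generating candidate verse pairs by TF-IDF cosine
# similarity. Threshold, vectorizer settings and sample data are
# illustrative, not the exact Chapter 4 implementation.
from itertools import combinations

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def candidate_verse_pairs(verses, threshold=0.5):
    """Return (i, j, score) for verse pairs whose similarity meets the
    threshold; such pairs become seed input for human verification in the
    crowdsourcing and learnersourcing workflows."""
    vectors = TfidfVectorizer(stop_words="english").fit_transform(verses)
    sims = cosine_similarity(vectors)
    return [(i, j, float(sims[i, j]))
            for i, j in combinations(range(len(verses)), 2)
            if sims[i, j] >= threshold]


# Hypothetical sample translations, for illustration only.
sample = [
    "Indeed, We guided him to the way, be he grateful or be he ungrateful.",
    "And We guided them to the straight way.",
    "Indeed, Allah is with the doers of good.",
]
for i, j, score in candidate_verse_pairs(sample, threshold=0.1):
    print(f"verses {i} and {j}: similarity {score:.2f}")
```

In the actual pipeline, only pairs above a tuned threshold would be forwarded to human contributors, which keeps the volume of crowd and expert tasks manageable.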

1.3 Dissertation Organization

The rest of the dissertation is divided into the following chapters:

Chapter 2: A Vision and Conceptual Framework for Linked Open Islamic Knowledge
Chapter 3: Human Computation and Crowdsourcing meet the Semantic Web: A Survey
Chapter 4: Text Mining Verse Similarity for Multi-lingual Representations of the Qur’an
Chapter 5: Crowdsourcing Framework for Linked Data Management
Chapter 6: A Semantic Framework for Thematic Annotations of the Qur’an Leveraging Crowds and Experts
Chapter 7: Learnersourcing Thematic and Inter-Contextual Annotations from Islamic Texts
Chapter 8: Semantic Hadith: A Framework for Publishing Linked Hadith Data
Chapter 9: Conclusions

Chapter 2

A Vision and Conceptual Framework for Linked Islamic Knowledge

Indeed, those who have said, "Our Lord is Allah" and then remained on a right course - the angels will descend upon them, [saying], "Do not fear and do not grieve but receive good tidings of Paradise, which you were promised." — [Al-Quran, Fussilat, 41:30]

Chapter Overview¹

This chapter presents the emerging vision of, and the need for, Linked Open Islamic Knowledge, along with its requirements and foundations. The important challenges and limitations of the existing research landscape are highlighted in an attempt to analyze why it fails to meet the requirements of the vision at hand. A conceptual framework that would functionally fulfill the Islamic LOD vision is presented. A brief insight is also provided into the need for and the role of Human Computation & Crowdsourcing (HC&C) methods and how they may facilitate the realization of the proposed vision.

2.1 The Vision of Linked Islamic Knowledge

The emerging technologies of recent years have greatly revolutionized the ways we interact with knowledge. There is an increasing need to search for new ways of modeling, standardizing, aggregating, linking, publishing, visualizing and presenting knowledge for Islamic and religious knowledge providers and seekers, to engage, facilitate and educate them. So far, web-scale integration of Islamic knowledge resources is not facilitated, mainly due to the lack of adoption of shared principles, datasets and schemas. This research, therefore, aims to investigate how Linked Open Data (LOD) technologies can solve the problem of information integration and provide new ways of teaching and learning Islamic knowledge.

¹ This chapter is published as: Basharat, A., Rasheed, K., Arpinar, I.B. "A Conceptual Framework for Linked Open Islamic Knowledge", In: International Journal on Islamic Applications in Computer Science And Technology - IJASAT (2016). Vol 4, No 2. 16-25.

The linked data approach has emerged as the de facto standard for sharing data on the web. The term "linked data" refers to a set of best practices for publishing and connecting structured data on the web [37]. The linked data design issues provide guidelines on how to use standardized web technologies to set data-level links between data from different sources [89]. Increased interest in the LOD has been seen in various sectors, e.g. education [53], [155], scientific research [16], libraries [82], [95], government [56], [90], [166], cultural heritage [63] and many others; however, the religious sector has yet to capitalize on the power of linked open data.

2.1.1 Motivation for an LOD Approach for Islamic Knowledge

Research in computational informatics applied to Islamic knowledge has primarily centered around morphological annotation of the Qur’an [60], [59], ontology modeling of the Qur’an [7], [21], [65], [211], [212], and Arabic natural language processing [65]. The LOD take-up in the area of Islamic knowledge has been extremely limited; to date, only one recent study has attempted to publish a small dataset pertaining to Qur’anic knowledge [168]. Thus, LOD opens the opportunity to address information integration and interoperability issues in the field. According to the assessment of this research, the suitability of the LOD paradigm to the domain of Islamic knowledge stems from several groundings, foremost of which is a great need for standardization, common knowledge models and vocabularies in the domain of religious learning in general and Islamic

knowledge in particular. In addition, there is not only great potential for generating ‘links’ in the distributed content that has been made available over the years with the advent of technology and the internet, there is also a dire need to establish these links to better facilitate relevant and timely knowledge discovery, aggregation and efficient retrieval by knowledge seekers and educators alike. The LOD vision promotes open and interoperable access to knowledge, along with support for multi-lingual content, both of which are highly suited to the needs of Islamic knowledge.

2.1.2 Contributions of the Chapter

This chapter makes the following contributions:

• It provides insights into understanding the vision, the context, the need and potential for Linked Open Islamic Knowledge; it explains the proposed Macro-structure of linked Islamic knowledge and provides the classification and the nature of links at the knowledge level (Section 2.2).

• It provides an understanding of the requirements and foundations for achieving this vision by proposing a high-level conceptual framework that would need to be developed; it also delineates the key challenges encountered in the realization of such a framework (Section 2.2.2).

• It examines the role of Human Computation and how it may be used in overcoming the key challenges associated with resolving the knowledge acquisition bottleneck in order to achieve the LOD vision for Islamic knowledge (Section 2.4).

2.2 The LOD Potential for Islamic Knowledge

The major challenge for the Islamic knowledge domain is to start adopting LOD principles and vocabularies while leveraging existing data available on the web. However, in order to understand how, and if at all, the LOD paradigm is suitable to this domain, it is important to understand the structure of Islamic knowledge, some unique requirements of the knowledge providers and seekers in the domain, and the nature of potential links at the knowledge level.

2.2.1 Understanding the Macro Structure of Islamic LOD

Figure 2.1 shows a broad overview of the macro structure of the Islamic knowledge landscape that has evolved since the revelation of the Qur’an. As shown in Figure 2.1, the Qur’an is the primary book of knowledge at the center of all Islamic knowledge. It forms the foundation for anyone wanting to learn Islam, be they Muslim or non-Muslim. The second most important source is the books of Hadith (narrations of the Prophet Muhammad), which are also considered a primary source of knowledge. Together, these two are the most important sources of knowledge. At the next level are the scholarly books of Qur’anic commentaries and exegesis, which contain interpretations and explanations of the verses in the Qur’an based on scholarly interpretation. Scholars over the years have written

and compiled thousands of books on Qur’anic exegesis. Some are more prominent than others based on the merit of scholarly descent, caliber and the authenticity of the material cited. These books heavily rely on the books of Hadith and the Qur’an itself for explaining different aspects of the verses. At the next level are classical books, written in the early periods, which rely heavily on these commentaries, the books of Hadith and the Qur’an. A detailed classification and discussion of Qur’anic explanation is provided by Philips [154]. Although new research, literature and educational content has continued to be generated at a high pace, and over the recent years much of it has been made available in digital format, it remains firmly rooted in its links to primary, secondary and tertiary sources of knowledge. Over the recent years, several structured data repositories and multimedia content in the form of audio and video lectures from contemporary scholars have also been made available, which contain references to these classical and primary sources of reference. The content is therefore rich in ‘potential links’, which existing repositories and applications fail to capture. Existing Islamic knowledge sources and applications are rarely interoperable, and do not encourage automated discovery, recommendation and synthesis. As a result, teachers, students and self-learners spend too much time looking for resources, or they spend too little and may end up making decisions based on incomplete information. The user experience of a knowledge seeker or an educator in Islam can be greatly improved if the system runs on a backend supported with linked open data for the representation of Islamic knowledge. In this research, an effort is made to examine how the LOD can be adopted to model Islamic knowledge to address some major challenges and needs that arise for students, teachers, researchers and scholars of this domain.

Figure 2.1: The Macro Structure of Linked Islamic Knowledge Landscape

2.2.2 The Need and Challenge for Linking Islamic Knowledge

It is important to understand why linking of relevant knowledge sources is important and particularly challenging in the domain of Islamic knowledge. The Qur’an is the primary book of knowledge, yet it is not possible to understand the Qur’an in isolation; Qur’anic understanding is incomplete without referring to the Hadith and books of exegesis. Cross-linking of resources will enable learners to

cross-navigate knowledge sources at various levels of granularity to ensure a more efficient learning experience. Similarly, contemporary literature must be backed by credible primary sources; however, there is no standardized means by which this cross-linking may be made possible. Often, when referring to a particular opinion, the chain of narrators and scholars is important and must be traceable in order to ensure the authenticity of the knowledge being quoted; however, oftentimes this is not accurately cited in books and becomes a challenge for learners to verify. Another level of complexity is the difference in schools of thought, which makes knowledge selection and reliability assessment difficult. While religious practices need to be backed by primary sources, there is little or no support for cross comparison because knowledge sources are neither synthesized nor linked in such a manner. These are just some of the unique requirements of a knowledge seeker or an educator of knowledge in Islam, and they imply a strong need for linking the relevant knowledge sources, thus making their access, retrieval and analysis easier. The most challenging aspect that emerges as a result is that fulfilling these needs through purely computational approaches is not an easy task. Human intelligence plays an indispensable role in capturing and modeling the knowledge in the right manner.

2.2.3 The Nature of Potential Links

In Figure 2.1, where the macro knowledge structure for the Islamic domain is shown, potential links may be created at various levels, depending on the level of granularity chosen for modeling the knowledge units. This research proposes the classification of these knowledge units and their respective links at two levels: the macro level and the micro level.

Macro-Level Links

In the proposed Islamic LOD based knowledge framework, macro-level units are considered at a level of granularity equivalent to a verse in the primary source, i.e. the Qur’an. There may be verses in the Qur’an that link to other verses. There is also evidence in the scholarly interpretations of inter-chapter relationships. Different verses help to explain one another; in fact, the primary principle of Qur’anic exegesis states that the Qur’an itself provides the best explanation of the Qur’an [154]. Therefore, the most crucial links would tend to be the Verse-to-Verse links. Scholars over the centuries have attempted to extract these links, which are captured in their commentaries. On a similar note, the next most important level of links to be captured is Verse-to-Hadith relations. Qur’anic interpretation heavily relies on the narrations of the Prophet Muhammad, called Hadith. Modeling these formally using computational knowledge models is not a trivial task; however, if done, it would greatly facilitate Qur’anic scholarly pursuits. It is also important to consider a passage (a group of verses) as an important macro-level entity when modeling relationships between different entities at varying levels of granularity. Qur’anic verses often cannot be interpreted in isolation, and need to be closely studied in combination with the context and the surrounding verses they appear with. On a similar note, other higher-level constructs such as passages, sections and sub-sections that belong to some text may become part of some relation that

is essential to capturing the link. To date, to the best of the background review that this research has undertaken, no formal knowledge model is known to have captured links and relations at this level.

Micro-Level Links

Since the language of the Qur’an and Hadith is Arabic, much of the classical literature is also in the Arabic language. Arabic is a rich language, with complex morphology and semantics [65]. There is no bound to the word-to-word relations that may exist in these texts. Different words have common roots, which appear differently with different meanings in different contexts (polysemy). Similarly, words with related meanings may not have common roots. There may be words linked due to their morphology. Related words may appear in the same verse, the same surah (chapter) or across chapters. Capturing links at this level would constitute what has been coined in this research as micro-level links. In addition to words and roots, oftentimes portions of large verses are used in literature and quoted as a significant phrase or segment. These would also be classified as micro-level links.

2.2.4 Related Work towards Islamic LOD

There have been some recent works that have attempted to create a foundation for the Islamic LOD. Amongst these, noteworthy mentions are Semantic Quran [169] and Quran Ontology [78]. The recent work of the authors of this research demonstrates the modeling and implementation of macro-level links, namely Verse-Hadith links, in the work entitled Semantic Hadith [30]. A summary of these works, the types of links they capture and the data sources they have relied on is given in Table 2.1.

Table 2.1: Summary of Related Works towards Islamic LOD

| Research Work | Types of Links | Original Data Source(s) |
|---|---|---|
| Semantic Quran [169] | Micro-Level Links; Entity Links with DBPedia | Quran Corpus (quran.com); Tanzil.net |
| Quran Ontology [78] | Macro-Level Links: Verse-Verse Relations; Verse-Topic Relations | Quran Corpus (quran.com); Qurany [2]; Semantic Quran [169]; QurSim [167] |
| Semantic Hadith [30] | Macro-Level Links: Verse-Hadith Relations; Hadith-Verse Relations; Hadith-Hadith Relations | Sunnah.com; Quran Ontology [78]; Semantic Quran [169] |

2.3 A Generic, High-Level Conceptual Framework For Linked Open Islamic Knowledge

As related research in other domains suggests, linked data technologies can help to integrate the work of disperse institutions producing diverse linked data [155]. Over the years, a number of open applications and digital repositories have been made available containing a wealth of classic Islamic literature, including the Qur’an, books of Hadith, Fiqh, Tafseer, Aqeedah and many others. Realizing the vision of linked Islamic knowledge primarily requires a linked data approach on Islamic knowledge repositories and establishing a framework for bringing this knowledge into a more interoperable, integrated form for sharing, connecting and discovering knowledge. Linked open datasets need to be created by harvesting information from isolated datasets. By using semantically interoperable linked data, these materials and repositories could be cross-connected to other repositories and portals, thereby allowing a variety of information systems to disseminate knowledge across platforms. This would be crucial in facilitating students, teachers, researchers and general knowledge seekers of Islam in particular and religion in general. Figure 2.2 shows the high-level conceptual framework that would need to be developed in order to achieve this vision.

Figure 2.2: High level conceptual framework for Linked Islamic Knowledge Generation, Publishing and Consumption

Functional Stages of an Islamic LOD Framework

The functional aspects of this framework are briefly outlined below.

• Identification and Selection of Data Sources: This stage would involve the identification and selection of appropriate, heterogeneous data sources to determine the scope of the content. Data sources may be structured or unstructured. Any existing semantic models or potential reusable entities may be identified.

• Conceptual Knowledge Modeling and Vocabulary Generation: To accomplish this, vocabularies in the form of ontologies about the core knowledge content would need to be devised, or reused if available, as well as metadata schemas for knowledge representation appropriate for the data sources. A key research challenge will be accomplishing the creation of semantically unambiguous metadata as automatically and accurately as possible.

• Data Extraction and Annotation: This would involve annotating entities within the selected texts; further information about these entities may be retrieved via links from existing repositories, e.g. applications built using the proposed system in this research could automatically suggest further information about an entity from DBPedia or Freebase, and related educational materials from other institutes.

• Link Generation and Verification: By semantic data aggregation, knowledge based links will be generated, e.g. Qur’anic verses would potentially be linked with other verses, related Hadith and extracts in the books of tafseer, enabling better knowledge navigation and understanding while undertaking serious study of the Qur’an.

• Generation of RDF Data: This stage would be one of the key stages prior to publishing the data as linked data and would involve conversion and generation of RDF compatible data. A linked dataset will convert existing knowledge into RDF triples using pre-defined vocabularies, and this can be done for Islamic knowledge at various levels. When done across various datasets, matching across RDF triples will result in linking that will become the means to offer a richer knowledge experience (an illustrative sketch of the generation, publishing and consumption steps follows this list).

• Publishing Linked Data: Linked data will be published over the LOD cloud, and linked across the existing data sets. The data will be discoverable via SPARQL query endpoints.

• Linked Data Verification: A crucial requirement is verifying the accuracy of the published data and ensuring the authenticity and soundness of not only the knowledge itself but also the links generated. For this purpose, complete reliance on automated means is not possible given the sensitivity of Islamic knowledge; human intelligence will need to be relied upon. Means to achieve this would need to be devised, such as specialized crowdsourcing experiments or collaborative annotation and verification.

• Linked Data Consumption: The potential utilization of the generated linked data will need to be shown via some applications and visualizations.

This would be crucial towards validating the success of the proposed vision of Islamic LOD.
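To make the RDF generation, publishing and consumption stages above more concrete, the following is a minimal sketch using Python's rdflib library. The example.org namespace, the Verse and Hadith class names, the hasRelatedHadith property and the hadith number are hypothetical placeholders; the vocabularies actually proposed by this research are developed in Chapter 8.

```python
# Minimal sketch of RDF generation, Turtle publishing and SPARQL consumption
# for a macro-level Verse-to-Hadith link. Namespace, classes and the
# hasRelatedHadith property are hypothetical placeholders (see Chapter 8 for
# the actual ontology).
from rdflib import Graph, Literal, Namespace, RDF

ISLAM = Namespace("http://example.org/islamic-lod/")  # illustrative namespace

g = Graph()
g.bind("islam", ISLAM)

verse = ISLAM["quran/2/255"]    # a verse (chapter 2, verse 255)
hadith = ISLAM["bukhari/2311"]  # a hadith; the number is illustrative

g.add((verse, RDF.type, ISLAM.Verse))
g.add((hadith, RDF.type, ISLAM.Hadith))
g.add((verse, ISLAM.hasRelatedHadith, hadith))  # the macro-level link
g.add((hadith, ISLAM.collection, Literal("Sahih al-Bukhari")))

# Publishing: serialize the triples as Turtle for the linked data cloud.
print(g.serialize(format="turtle"))

# Consumption: retrieve hadith linked to a given verse, as a SPARQL
# endpoint over the published data would allow.
results = g.query("""
    PREFIX islam: <http://example.org/islamic-lod/>
    SELECT ?hadith WHERE {
        <http://example.org/islamic-lod/quran/2/255>
            islam:hasRelatedHadith ?hadith .
    }
""")
for row in results:
    print(row.hadith)
```

Matching such triples across datasets, for example against the Semantic Quran dataset, is what would yield the cross-dataset links described in the stages above.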

Existing Challenges

The domain of Islamic knowledge has failed to capitalize on the promised potential of the semantic web and linked data technology. Established research has recognized that ontology-based approaches to data organization and integration in specialized domains have produced significant successes [147]. One of the major challenges hindering such approaches from successful application to the Web at large is the so-called 'knowledge acquisition bottleneck' [62], [160], that is, the large amount of time and money needed to develop and maintain the formal ontologies. This also includes ontology population and semantic annotation using well-established vocabularies. Existing methods for semantic annotation and linked data knowledge generation are either (a) computationally driven, employing text mining and information extraction methods, or (b) expert driven. A lack of suitable training data and gold standards makes the reliability and scalability of computational methods challenging. Expert driven methods involve subject specialists; however, these are often not scalable in time or cost.

2.4 The Need and the Role for Human Computation and Crowdsourcing

To address the knowledge acquisition bottleneck and the other challenges described earlier, researchers have recognized that the realization of semantic and linked data technologies will require not only computation but also significant human contribution [186]. Emerging research advocates the use of Human Computation & Crowdsourcing (HC&C) methods to leverage human processing power to solve problems that are still difficult to solve using computers alone, and seeks to harness the 'collective intelligence' and the 'wisdom of the crowds' by engaging large numbers of online contributors to accomplish tasks that cannot yet be automated [97], [200]. HC&C methods are being proposed as well-suited to support semantic web and linked open data research by providing means to efficiently create research relevant data and potentially solve the bottleneck of knowledge experts and annotators needed for the large-scale deployment of semantic web and linked data technologies [163]. The conceptual framework shown in Figure 2.2 highlights the element of human contribution and intelligence needed in a few important stages, such as conceptual knowledge modeling, vocabulary generation, semantic annotation, link generation and verification. Recognizing this need for human intelligence and contribution is fundamental to the motivation behind the core focus of this research, i.e. human computation based knowledge engineering. The overall objective is to augment the process of automated knowledge extraction and text mining methods

using a hybrid approach that combines the collective intelligence of the crowds with that of experts to facilitate activities in formalized knowledge engineering, thus overcoming the so-called knowledge acquisition bottleneck often encountered. As part of this research, it is proposed to build a hybrid methodology of ontology and linked data engineering that combines computational and human computation approaches, and to demonstrate the feasibility of this approach when applied to Islamic knowledge. A preliminary prototype version of a basic crowdsourcing workflow, in the context of linked data management only, has recently been published in our research entitled CrowdLink [26], [27]. Follow-up ongoing studies by the authors focus on case studies that show the amenability of crowdsourcing methods and variations such as learnersourcing for Islamic knowledge. Tasks that have been subject to crowdsourcing include thematic disambiguation and thematic annotation. Tasks that have been subject to learnersourcing include: (1) thematic disambiguation, (2) thematic annotation, (3) thematic classification and (4) relation identification. Details may be found in [31] and [25] respectively. In order to facilitate the crowdsourcing and learnersourcing methods, a semantics driven human-machine computation framework has been proposed [27]. This framework provides a workflow mechanism that allows contributions for the above-mentioned tasks to be obtained from a range of users such as students, teachers and experts. The workflow delegates tasks according to the difficulty level of the task and the skill levels of the users. Simpler tasks are delegated to regular users or crowd workers and their responses are analyzed. Agreement analytics are

25 applied to filter those responses, which fail to reach a consensus. These responses are then validated by experts using the same workflow framework. Another feature of the framework is tight integration of knowledge extraction and retrieval tasks, tightly embedded in the human computation workflow, which attempt to reduce the human effort. Also, the output of the tasks which are approved, are published as ontologies and linked data. According to the authors, such a framework, which combines human and machine intelligence, is indispensable for achieving the large scale LOD vision for Islamic knowledge outlined in this chapter.
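To make this delegation-and-consensus logic concrete, the following is a minimal illustrative sketch in Python. It is not the framework's actual implementation; the names (Task, AGREEMENT_THRESHOLD) and the 0.7 cutoff are assumptions made purely for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field

AGREEMENT_THRESHOLD = 0.7  # illustrative cutoff; a real deployment would tune this per task type

@dataclass
class Task:
    question: str
    difficulty: str               # e.g. "simple" -> crowd/learners, "hard" -> experts
    responses: list = field(default_factory=list)

def delegate(task: Task) -> str:
    """Route a task to a worker pool based on its difficulty level."""
    return "experts" if task.difficulty == "hard" else "crowd"

def agreement(task: Task) -> float:
    """Fraction of responses that agree with the majority answer."""
    if not task.responses:
        return 0.0
    _, top_count = Counter(task.responses).most_common(1)[0]
    return top_count / len(task.responses)

def needs_expert_validation(task: Task) -> bool:
    """Responses that fail to reach consensus are escalated to experts."""
    return agreement(task) < AGREEMENT_THRESHOLD

t = Task("Which theme does this verse belong to?", "simple",
         responses=["mercy", "mercy", "patience"])
print(delegate(t), needs_expert_validation(t))  # crowd True  (agreement 2/3 < 0.7)
```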

2.5 Conclusions

This chapter presented the emerging vision and the promised potential of Linked Open Data for Islamic knowledge. It has been established that the diverse and distributed body of Islamic knowledge resources presents a huge potential for linking, at both the macro and micro levels, and for publishing this knowledge as linked open data. This would bring significant benefits to students, teachers, researchers, and knowledge seekers and educators in general, and would pave the way for new kinds of learning applications. However, achieving this vision is a mammoth task and, given the sensitivity of the knowledge at hand, it will not be possible without human intervention. For a scalable approach, this research suggests integrating and automating the role and contribution of humans in knowledge engineering and linked data management processes through the use of human computation and crowdsourcing methods. While the existing research provides evidence of the promise of these methods, adding this dimension to the LOD framework will bring its own set of challenges. However, these very challenges will open doors to various prospective and promising research directions to be undertaken by the Islamic informatics research community at large.

Chapter 3

Human Computation and Crowdsourcing meet the Semantic Web: A Survey

Say, "Indeed, my prayer, my rites of sacrifice, my living and my dying are for Allah , Lord of the worlds. — [Al-Quran, Al-Anaam,6:162]

Chapter Overview

In this chapter, we present a comprehensive survey of the intersection of the semantic web and the human computation paradigm. We adopt a twofold approach towards understanding this intersection. As the primary focus, we analyze how the semantic web domain has adopted the dimensions of human computation to solve its inherent problems. We present an in-depth analysis of the need for human computation in semantic web tasks such as ontology engineering and linked data management. We provide a 'collective intelligence genome' adapted for the semantic web as a means to analyze the threads of composing semantic web applications using human computation methods. As a secondary contribution, we also analyze existing research efforts through which the human computation domain has been better served with the use of semantic technologies. We present a comprehensive view of the promises and challenges offered by the successful synergy of semantic web and human computation. In conclusion, we discuss several key outstanding challenges and propose some open research directions.

(To appear: Basharat, A., I. Budak Arpinar, and Khaled Rasheed, "Human Computation and Crowdsourcing meet the Semantic Web," Special Issue on Human Computation and Crowdsourcing (HC&C) in the Context of the Semantic Web, Semantic Web Journal, 2017.)

3.1 Introduction

3.1.1 Research Context

After more than a decade of semantic web research, researchers remain challenged by the large scale adoption of semantic technologies. Semantic technologies have been deployed in the context of a wide range of information management tasks for which machine-driven algorithmic techniques aiming at full automation do not reach a level of accuracy and reliability sufficient to ensure usable systems. Researchers have started augmenting automatic techniques with human computation capabilities in an effort to solve these inherent problems. Semantic web visionaries have put forth a number of ideas for the road ahead for the success of the semantic web. The notion of 'The Global Brain Semantic Web' [34] - a semantic web interleaving a large number of human and machine computations - has come to be seen as a vision with great potential to overcome some of the issues of the current semantic web. This idea of interleaved human-machine computation has already resulted in successful systems that are able to solve problems in ways unthinkable for either humans or computers alone. The domains of human computation, collective intelligence, social computing and crowdsourcing have all contributed to this successful synergy of humans and machines, and contribute to the constantly evolving metaphor of the 'Global Brain' [91]. Much like the concept of the programming needed for the 'Global Brain' [35], the 'Global Brain Semantic Web' will need new strands of programming, workflows and challenges to be accomplished.

The challenge for the semantic web community is to rethink the original semantic web vision, which was largely built on the vision of computers populating the web of machines [34]. Researchers have recognized the need for human intelligence in the process of semantic content creation [186], which forms the backbone of any semantic application. The entrance barrier for many semantic applications is said to be high, given the dependence on expertise in knowledge engineering, logics and more. In short, the semantic web lacks sufficient user involvement in various respects. Humans are simply considered indispensable [54] for the semantic web to realize its full potential. Realizing the potential that human computation, collective intelligence and related fields such as crowdsourcing and social computation have to offer, semantic web researchers have attempted to effectively take up this synergy to solve the bottleneck of human experts and the needed human contribution in semantic web development processes. Semantic web research can be seen as experiencing a shift from being predominantly expert driven to embracing the larger community of users involved in the semantic content creation process. Early efforts that led to the evolution of this approach include myOntology [187] and inPho [147]. Two major genres of research have emerged in the last few years in the attempt to bring human computation methods to the semantic web: 1) Mechanized Labour and 2) Games with a Purpose for the Semantic Web. In this chapter, we not only take a detailed look at the challenges leading to the adoption of human computation methods for the semantic web, we also provide a comprehensive coverage of the approaches in these genres. On a parallel note, human computation systems can also potentially benefit from the promises offered by the semantic web. The next generation of human computation systems is being envisioned to go beyond the platforms they were built on, offering data reusability in ways unintended by their creators [54]. The semantic web may be seen as a means of providing better user continuity and platform consistency across human computation systems. While the potential of such a synergy is clearly evident, effectively realizing the synergy of the semantic web and human computation will bear its own set of challenges. Both the semantic web and human computation seem to have a long way to go before being fully able to reap the benefits promised by the intersection with the other [54].

3.1.2 Contributions

The first contribution of this chapter is to provide a review of the challenges that the semantic web domain has faced, especially in terms of the need for human intervention. Secondly, we analyze the intersection of the semantic web and the human computation paradigm. We adopt a twofold approach towards understanding this intersection. As the primary focus, we analyze how the semantic web domain has adopted the dimensions of human computation to solve its inherent problems. We present a review of studies and applications spanning the two most common genres in human computation, namely Games With A Purpose (GWAP) and Micro-Task Crowdsourcing. We also adopt a classification scheme in the form of a 'collective intelligence genome' and apply it to some of the key studies and approaches that combine semantic web, human computation and crowdsourcing. This genome is adapted from the original collective intelligence genome proposed by [127]. We apply it specifically to the context of the semantic web with the aim of providing useful insights for analyzing the various threads that constitute the design of studies for creating a semantic web driven by human computation. These threads are considered useful for possible further investigation to be taken up by researchers. At the same time, the genome is also meant to serve as a means of analyzing the strengths and weaknesses of existing research. Recent research in crowdsourcing and the semantic web has also seen the emergence of workflow systems designed to meet the need for a generic framework for automating human-machine computation workflows. We undertake a comparative analysis of a few of the most prominent studies to this end and highlight the essential dimensions constituting these workflow systems. We give special mention to these systems because they seem closest to the directions we expect this emerging research domain to take. As a secondary contribution, we also analyze existing research efforts through which the human computation domain has been better served with the use of semantic technologies. We also present a comprehensive view of the promises and challenges offered by the successful synergy of semantic web and human computation. We hope that this review will serve as a basis for exploring newer threads of synergy between semantic web and human computation research, resulting in the creation of better applications and approaches that advance both domains.

3.2 Background

In this section, we provide a short introduction to some aspects of the theoretical foundations of human computation and crowdsourcing and of the semantic web, and define concepts that will be used throughout the chapter. We also examine the need for human contribution in the semantic web domain, which provides the grounds for a successful confluence of human computation and the semantic web.

3.2.1 Human Computation and Collective Intelligence: Some Theoretical Foundations

Researchers consider human computation similar to but not synonymous with terms such as collective intelligence, crowdsourcing and social computing. This distinction has been argued in more detail by Quinn and Bederson [156]. While subtle differences may be present, it is evident that these domains are closely knit with one another.

What is Human Computation?

The most widely adopted definition of human computation in state-of-the-art research is inspired by von Ahn's dissertation titled "Human Computation" [200]. That thesis defines the term as "...a paradigm for utilizing human processing power to solve problems that computers cannot yet solve". From the body of work that self-identifies as human computation, Quinn and Bederson [156] present the consensus that emerges as to what constitutes human computation: (a) the problems fit the general paradigm of computation, and as such might someday be solvable by computers, and (b) the human participation is directed by the computational system or process.

Classification of Human Computation Systems

With a plethora of contributions from several research domains coming together under the umbrella of human computation, researchers have attempted to distinguish and classify human computation systems, highlighting the distinctions and similarities based on several factors such as motivation, quality control, aggregation, process order and task request cardinality [156].

What is Collective Intelligence? Collective intelligence is an encompassing term that broadly refers to groups of individuals doing things collectively that seem intelligent [127, 128]. This idea of loosely organized groups of people accomplishing more than what individuals can alone has been garnering a lot of interest within the research community.

What is Social Computing? Technologies such as blogs, wikis, and online communities are examples of social computing [156]. The scope is broad, but always includes humans in a social role where communication is mediated by technology. The purpose is not usually to perform a computation.

What is Crowdsourcing? Crowdsourcing is a term used for a range of activities that can take different forms, as examined by the authors of [64]. The term was originally coined by Jeff Howe [97, 98]. Essentially, a crowdsourcing system engages a multitude of humans to help solve a wide variety of problems. It is worth distinguishing between crowdsourcing systems (platforms) and crowdsourcing applications. The former refers to systems such as Amazon Mechanical Turk (www.mturk.com), TurKit [121], CrowdFlower (www.crowdflower.com) and CloudCrowd, to name a few, that help build crowdsourcing applications in various domains. The latter is a more encompassing term that includes any application or system incorporating an element of crowdsourcing. Detailed surveys may be consulted for further details [58, 126, 216].

Figure 3.1: Ontology Development Process (simplified adaptation from [153])

While the most obvious forms of crowdsourcing systems engage a crowd of users to explicitly collaborate to build a meaningful and useful artifact, other less obvious forms employ implicit means to gather user contributions, such as the 'Games With A Purpose' (GWAP) paradigm, which includes many examples where the users collaborate implicitly. This distinction is provided in the classification by Doan et al. [58]. There are several fundamental issues in crowdsourcing; the most prominent of these are the nature of tasks that can be crowdsourced, the reliability of crowdsourcing, and crowdsourcing workflows. According to Doan et al. [58], crowdsourcing systems face four key challenges: 1) how to recruit contributors, 2) what they can do, 3) how to combine their contributions, and 4) how to manage abuse. In addition, the balance between openness and quality also needs to be ensured. A number of surveys on existing crowdsourcing systems exist [58, 126, 216], which may be referred to for an in-depth analysis of the issues involved.

3.2.2 Semantic Web Preliminaries

Tim Berners-Lee envisioned a 'semantic web' capable of providing automated information access based on machine-processable semantics of data and on heuristics that utilize this metadata: a new kind of web, driven by explicit representation of the semantics of the data, accompanied by formalized knowledge models and vocabularies in the form of ontologies, to enable services to operate at a greater level of quality [66]. The semantic web was aimed at solving the problems that AI researchers originally faced in knowledge acquisition, engineering, modeling, representation and reasoning. Given that the web enables a workforce of millions of knowledge 'acquisitioners', working nearly for free, to provide vast amounts of information and knowledge, the semantic web, as the pioneers of semantic web research claimed, was envisioned as the enabling force to build a brain of and for mankind [66], in contrast to traditional AI researchers working to rebuild some aspects of the brain.

Ontologies and Ontology Engineering

Ontologies form the backbone of the semantic web. An ontology is an explicit, formal specification of a shared conceptualization of a domain of interest [75, 76]. This famous definition by Gruber implies that an ontology should be machine-readable (formal specification) while at the same time being accepted by a group or community (shared). Most ontologies are also considered to be restricted to a given domain of interest and therefore model concepts and relations that are relevant to a particular task or application domain [42]. Ontologies are said to formalize the intensional aspects of a domain (also known as the TBox), whereas the extensional part is provided by a knowledge base that contains assertions about instances of concepts and relations as defined by the ontology (also known as the ABox).
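The TBox/ABox distinction can be made concrete with a small sketch. The following illustrative example uses the Python rdflib library; the example.org namespace and the domain terms (Scholar, Book, authorOf) are invented here purely for illustration.

```python
from rdflib import Graph, Namespace, RDF, RDFS

EX = Namespace("http://example.org/onto#")  # hypothetical namespace
g = Graph()
g.bind("ex", EX)

# TBox: intensional schema -- the concepts and relations of the domain
g.add((EX.Scholar, RDF.type, RDFS.Class))
g.add((EX.Book, RDF.type, RDFS.Class))
g.add((EX.authorOf, RDF.type, RDF.Property))
g.add((EX.authorOf, RDFS.domain, EX.Scholar))
g.add((EX.authorOf, RDFS.range, EX.Book))

# ABox: extensional assertions about concrete instances
g.add((EX.Bukhari, RDF.type, EX.Scholar))
g.add((EX.SahihBukhari, RDF.type, EX.Book))
g.add((EX.Bukhari, EX.authorOf, EX.SahihBukhari))

print(g.serialize(format="turtle"))
```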

Understanding the Ontology Development and Knowledge Engineering Processes

Figure 3.1 gives an overview of the activities involved in the ontology development process, adapted from [153]. This process clearly distinguishes between three key groups of activities, namely Management, Development and Support.

Ontology Management: This refers primarily to the activities of scheduling, controlling and quality assurance. Scheduling involves activities that pertain to the coordination and management of an ontology development project, such as resource and time management. Controlling deals with ensuring that the scheduled tasks are accomplished as planned. Quality assurance evaluates the quality of the outcomes of each activity, in particular of the developed ontology.

Ontology Development: This is further divided into three phases: pre-development, development and post-development. In the pre-development phase, an environmental study investigates the intended usage and the competencies of the envisioned ontology, and a feasibility study ensures that the ontology can actually be built within the time and resources assigned to the project. The development phase includes specification, which defines the usage, users and scope of the ontology. This is usually followed by conceptualization, in which domain knowledge is structured into meaningful models, followed by formalization and implementation. During formalization, the conceptual model is formalized to prepare the implementation in a given ontology language. In the post-development phase, the ontology is subjected to use, updates and maintenance as required.

Ontology Support: This covers a broad range of activities which are considered most crucial and which span many areas of semantic content creation and maintenance. No particular order is prescribed for these activities. Typical support activities include knowledge acquisition (specification of the knowledge needed for the ontology), learning (automatically creating an ontology or a part of it), evaluation (analyzing the quality of the developed ontology), integration (reusing other ontologies to build a new ontology), documentation (providing a detailed description of the ontology and of the main activities and tasks of the development process), merging (producing a new ontology by combining existing ontologies), configuration management (tracking versions of the ontology) and alignment (establishing relations among related or complementary ontologies). Researchers have primarily focused on the development and support activities because they are specific to the scope of semantic content creation.

Linked Open Data

Linked Open Data (LOD) has recently emerged as 'a set of best practices for publishing and connecting structured data on the Web' [38]. The linked data initiative claims to provide two primary advantages from exposing data as linked data: data discovery and data integration at web scale, in a uniform and homogeneous manner. Linked open data is based on two simple ideas [89]: first, to employ RDF to publish structured data sources; second, to explicitly interlink different data sources. A tremendous amount of data has been published on the Web according to the linked data principles [38, 105]. LOD has generated remarkable threads of discussion among research communities. The number of participants in the LOD cloud has been growing significantly, and the publishing, consumption and reuse of structured data on the web have been altered dramatically. Managing data in such a highly distributed and heterogeneous environment is a critical issue and hence has opened up many research opportunities. We examine the issues that require special input from humans in the next sub-section, making LOD management a key area subject to human computation.

3.2.3 Human Contribution in the Process of Semantic Content Creation

We focus on two strands of semantic content creation: 1) ontology development and 2) linked data management. We comprehensively analyze and summarize the need for human intelligence and contribution in both contexts.

Need for Human Contribution in the Ontology Engineering Process

In order to highlight the role of human intelligence in semantic web research, Siorpaes and Simperl [186] surveyed methodologies, methods and tools covering various activities in the ontology engineering lifecycle, in order to learn about the types of processes that knowledge and ontology engineering projects conform to, and the extent to which, and reasons why, they might rely on human intervention. We summarize their findings here, with an overview shown in Figure 3.2 as derived from their analysis. They analyze and classify human contribution in three key stages of ontology engineering described in Section 3.2.2: ontology development, semantic annotation and ontology support. In the figure, the broad task categories are classified according to the level of automation and the extent of human contribution usually required by these activities.

Ontology Development: A further detailed breakdown of ontology development tasks is given in Figure 3.3. It is obvious that nearly all tasks require human intervention, while only a few have automation support. Conceptual modeling, one of the most essential tasks in ontology development, is essentially a human-driven task. Only very few tasks can be automated, and the final modeling decision is always taken by human actors. Automation support is partly possible for collecting relevant terms or for detecting properties of concepts based on semi-structured knowledge corpora such as folksonomies.

Semantic Annotation: Automation plays a greater role in the annotation of several types of media; text annotation, for example, leverages natural language processing techniques and has been investigated in a plethora of research. However, human supervision is needed to ensure reliability. The annotation of multimedia content is more challenging, whereby only low level semantic content may be automatically derived; much of the high level semantics is derived using human effort. The annotation of web services remains very much a manual task.

Ontology Learning: Although supported by tools that may be automatically executed, ontology learning requires human effort in terms of input and feedback. A variety of tools are available, ranging from fully automatic, to semi-automatic, to those that rely heavily on user intervention throughout the process.

Ontology Alignment: This is another activity that is said to be specific in nature and far from a stage where any successful generic methodologies have been seen. While there are numerous tools to match, merge, integrate, map and align heterogeneous ontologies, very few tools can be run fully automatically. Most tools require a human in the loop to give feedback or input suggestions to the system. Some even depend on user-defined rules to carry out the mappings.

Ontology Evaluation: This is a broad topic which is nevertheless primarily driven by human effort. Automatic support is available to cross-check and validate structural features of taxonomies, and some tools are designed to facilitate humans in carrying out the evaluation. The notion of an automated ontology evaluation process generic enough to be applied across domains is hardly feasible, given the depth of knowledge needed, which is difficult to capture in tools that run on their own.

Need for Human Contribution in Linked Data Management

Linked data, fueled primarily by the semantic web vision, is considered to be one of the most important technological developments in data management in the last decade [3]. However, the seamless consumption and integration of linked open data is challenged by several quality issues and problems that the linked data paradigm is facing. As researchers remark, many of these quality issues cannot be fixed automatically and instead require manual human effort. Emerging research is establishing and highlighting the need to combine human and computational intelligence instead of relying on fully automated approaches. Simperl et al. [179] argue that several tasks in the context of linked open data are inherently human driven, given their highly contextual and often knowledge intensive nature, which imposes considerable challenges on algorithmic approaches. They provide a broad classification of tasks derived from the study of the architecture for applications that consume linked data, as suggested in [89].

Figure 3.2: Classification of Tasks in the Ontology Engineering Process (as summarized in [186])

[3], [4] and [26, 27] also provide insights into task management for linked data from the human intelligence perspective. We summarize below the broad groups of tasks that could be subject to semi-automation and human computation, based on what has been identified by existing studies and what is relevant to the linked data architecture.

Figure 3.3: Classification of Tasks in the Process of Ontology Development (as summarized in [186])

Linked Data Annotation and Production:

• Identity Resolution: This involves creating owl:sameAs links between different entities, either by comparison of metadata or by the investigation of links on the human web (a sketch illustrating this and the following link type appears after this list).

• Entity Linking: This may involve linking two entities using a new relation between them. This is different from identity resolution since the new link

may not always be an owl:sameAs relation.

• Entity Classification: Unlike traditional semantic web ontologies, the LOD tends to emphasize the relationships and links between entities rather than the classification of entities. Since the vocabularies used are generic, it is difficult to infer a classification using established reasoning methods and techniques. This type of classification therefore becomes dependent on human contribution.

• Ordering or Ranking Facts: Often linked open data facts need to be ranked depending on the needs of querying and browsing the data. This is again a task that is difficult to automate and would require human intervention.

• Generating New Facts: Linked data suffers greatly from incomplete information. For instance, a common need is labelling in terms of multilinguality. Providing non-English labels (translation) is a task that not only benefits from human contribution but is indispensable in certain situations.

• Creating Links with the LOD: Applications consuming the LOD will often attempt to create links with the entities in the LOD datasets. This too cannot be automated entirely and often requires human input.
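To make the distinction between identity resolution and entity linking concrete, the following minimal sketch uses the Python rdflib library; the example.org URIs and the complements relation are hypothetical and serve only as illustration.

```python
from rdflib import Graph, Namespace, URIRef
from rdflib.namespace import OWL

g = Graph()
dbr = Namespace("http://dbpedia.org/resource/")
rel = Namespace("http://example.org/rel#")  # hypothetical relation vocabulary

# Identity resolution: assert that two URIs denote the same real-world entity
g.add((URIRef("http://example.org/id/Quran"), OWL.sameAs, dbr.Quran))

# Entity linking: a new, non-identity relation between two distinct entities
g.add((URIRef("http://example.org/id/Hadith"),
       rel.complements,
       URIRef("http://example.org/id/Quran")))

print(g.serialize(format="turtle"))
```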

Quality Assessment of Linked Data: This includes the following:

• Validation of Linked Data: Many aspects of linked data need manual validation in terms of completion, correctness and provenance.

• Meta-data Completion: At times the extraction process used to create linked data results in the incomplete extraction of entities from the source datasets, resulting in incomplete meta-data information, which needs to be manually inspected, identified and corrected.

• Meta-Data Correction: It is also possible that the meta-data information is extracted incorrectly. This is difficult to detect automatically and requires human intervention for identification and correction.

• Link Evaluation: Links on the LOD often need to be manually inspected for correctness and verification, even though automated tools may help in identifying some problem areas. At times a link, though valid, does not show any valid content for the linked entity. Such problems are hard to identify and correct in an automated manner.

Query Processing: Since any form of consumption of linked open data requires some form of query processing, primarily carried out using the SPARQL language, it is natural that this task will involve human intelligence at various levels.
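For reference, the snippet below illustrates the kind of SPARQL consumption referred to here, issued from Python via the SPARQLWrapper library against the public DBpedia endpoint; the specific query and resource are illustrative only.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Public DBpedia endpoint; any SPARQL endpoint would do.
sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setReturnFormat(JSON)
sparql.setQuery("""
    SELECT ?label WHERE {
        <http://dbpedia.org/resource/Quran> rdfs:label ?label .
        FILTER (lang(?label) = "en")
    }
""")

# Each binding is one query solution; here, the English label of the resource.
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["label"]["value"])
```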

We will revisit these tasks, with details of how they have been subjected to human computation methods, in particular crowdsourcing, in Section 3.3.2.

3.2.4 Confluence of Semantic Web and Human Computation

After more than a decade of semantic web research, researchers remain challenged by the large scale adoption of semantic technologies. The question of why and where the element of human computation is needed in the semantic web has been answered in the previous sub-section. Simperl et al. [182] argue that human computation methods such as paid micro-tasks and games with a purpose could be used to advance the state of the art in knowledge engineering research in general and ontology engineering in particular. They examine the literature to note how human computation has received great attention from the semantic web research community, resulting in a number of systems that tackle essentially human driven tasks requiring expertise, such as ontology classification, annotation, labelling, ontology alignment and the annotation of different types of media, to name a few. Looking into the reasons for the barriers to the adoption of semantic technologies, one finds a considerable lack of useful semantic content, as this content cannot be created automatically but requires, to a significant degree, human contribution. Ironically, there is not enough interest among humans in creating semantic content. Abraham Bernstein, in his recent vision towards the 'Global Brain Semantic Web', highlights important differences between human computers and traditional computers [34]. He lists three major differences: motivational diversity, cognitive diversity and error diversity. These diversities give rise to a range of issues when including people in the semantic web.

Research clearly indicates that combining human computation and the semantic web is of mutual benefit to both domains, i.e., the benefits are not uni-directional; the two domains are complementary to one another, although the benefits to one may have superseded those to the other in the current landscape of research. More research is evident on how the semantic web domain has benefitted from human computation; real examples of systems whereby the semantic web has driven human computation systems towards a better future are yet to be seen. The future of human computation and the semantic web holds great promise according to DiFranzo and Hendler [54], who claim, while remarking on the mutual promise that the two fields offer, that human computation can be used to help curate the semantic web data in the linked open data cloud, while the semantic web can be used to provide better user continuity and platform consistency across human computation systems. However, many challenges and questions still remain. Leveraging the best of human computation and semantic web technologies is greatly needed to bring forth this next generation.

3.3 Application of Human Computation Techniques to Semantic Web Research

In this section, we first provide an overview of the human computation genres that have been applied to semantic web research. We then present a classification of the broad tasks and applications in the semantic web domain according to these genres. We also review tool support for closely integrating human computation into semantic web processes.

3.3.1 Human Computation Genres for the Semantic Web

Applications of human computation techniques to the semantic web domain fall primarily under two genres: 1) Games with a Purpose (GWAP) and 2) Mechanized Labor or Micro-task Crowdsourcing. We present a brief overview of both approaches and then present a classification of existing approaches according to the two genres.

Semantic GWAPs: Semantic Games With A Purpose

The philosophy that GWAPs capitalize upon is simple yet elegantly effective: tasks that are difficult for computers but can be tackled easily by humans are hidden behind entertaining games. The games are designed to engage regular users, not just experts. While playing such games, users are usually unaware of how they are indirectly helping build knowledge bases and create annotations useful for further computation and for use in algorithms. This is a significant step in combining human and machine intelligence. Siorpaes and Hepp [183] adopted Luis von Ahn's 'games with a purpose' [199, 201, 202] paradigm for creating the next generation of the semantic web. They proposed the idea of tapping into the wisdom of the crowds by providing motivation in terms of fun and intellectual challenge. According to von Ahn, the key principle in GWAPs is the idea that people are not interested in solving a computational problem; rather, they want to be engaged in something entertaining. The primary objective of semantic GWAPs is massive semantic content generation, which entails massive user participation as well. Pe-Than et al. [152] review human computation games in general and provide a typology consisting of 12 dimensions and strategies employed by the games. Simko and Bielikova [173] present a classification of semantics acquisition games' purposes, ranging from multimedia metadata acquisition, through text annotation and building common knowledge bases, to ontology creation. Simko and his colleagues have a series of works which may be referred to with respect to semantics discovery and acquisition using human computation games [171, 204]. Recent work by Simperl et al. [176] has proposed an ontology learning approach using games with a purpose.

Micro-Task Crowdsourcing

Several recent research prototypes have attempted to use micro-task crowdsourcing for solving semantic web tasks. Some of the most common platforms used for crowdsourcing include Amazon Mechanical Turk (www.mturk.com, referred to alternatively as AMT or MTurk) and CrowdFlower (www.crowdflower.com), amongst others. MTurk (and others) provides a platform where users (requesters) can post a given problem (task) that other users (turkers) can solve. To do so, the requester designs the problem at a suitable degree of granularity, resulting in a number of so-called Human Intelligence Tasks (HITs), which can be tackled independently by multiple turkers in return for a financial reward.
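Publishing a HIT is typically done programmatically. The following is a hedged sketch using the AWS boto3 MTurk client against the requester sandbox; the title, reward and the term_relatedness.xml question file are hypothetical placeholders, not values used in any of the surveyed studies.

```python
import boto3

# Sandbox endpoint, so experiments do not spend real money.
mturk = boto3.client(
    "mturk",
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
    region_name="us-east-1",
)

# QuestionForm XML describing the task UI (contents omitted here).
question_xml = open("term_relatedness.xml").read()

hit = mturk.create_hit(
    Title="Are these two ontology terms equivalent?",
    Description="Judge whether two concept labels refer to the same thing.",
    Reward="0.05",                    # USD per assignment
    MaxAssignments=5,                 # redundant answers enable majority voting
    AssignmentDurationInSeconds=300,
    LifetimeInSeconds=86400,
    Question=question_xml,
)
print(hit["HIT"]["HITId"])
```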

Figure 3.4: The Typical Task Lifecycle in a Crowdsourcing Platform

The complete execution of a task (HIT) lifecycle on any particular crowdsourcing system involves several general stages, as shown in Figure 3.4: (i) a task is defined, with input parameters and the needed data; (ii) the task is published on the crowdsourcing platform; (iii) the task is searched for or discovered by the crowdworkers (via search or direct notification); (iv) the crowdworkers perform and submit the tasks; (v) the submitted tasks then go through a review process; (vi) the submissions are either accepted or rejected; and (vii) the accepted submissions are further processed for retrieving and synthesizing relevant results. There may be a number of additional configuration parameters when tasks are published, such as the number of assignments, the time to completion for each HIT or restrictions on the profiles of the workers. Once the tasks are completed and submitted, the results are collected and aggregated, and quality assurance measures are applied depending on the nature of the task design. The evaluation may be carried out in a number of ways. A common practice is to allow assignment of the same task to multiple workers; the results may then be aggregated using majority voting or more sophisticated techniques, such as a probability distribution, or by taking into account some estimate of the expertise and skills of the workers. An important conceptual distinction has to be made between the terms micro-tasks and complex tasks, as done by Luz et al. [126]. Micro-tasks are atomic operations, whereas complex tasks are sets of micro-tasks (e.g., some workflow) with a specific purpose. In this section, we adopt a generic view of crowdsourcing inclusive of both types of tasks; we dedicate a separate discussion to systems that use some form of workflow-based approach as the basis of their crowdsourcing model in Section 3.5. There are several factors known from existing research which constrain the applicability of a micro-task approach; according to [179] there are three key factors:

Decomposability: The tasks need to be decomposable into appropriate levels of granularity which can be executed independently. This also needs to be done according to the platform upon which the task is being executed, e.g., AMT. If the granularity of the task is higher than the level at which it can be atomically published, then additional workflows need to be generated and managed, such as the approach defined in [115].

Verifiability: The performance of the workers needs to be measurable. This entails the need for methods and techniques for the quality assessment of the collected results. In addition, mapping to the input from different task assignments is also needed. Open-ended tasks, such as those requiring the definition of a term or a translation, require means to deal with iteration, such as the one discussed in [121, 123].

Expertise: The domain of the tasks ought to be familiar to the workers. However, some knowledge intensive tasks may require additional expertise which general purpose platforms such as AMT are not expected to provide.
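The skill-weighted aggregation mentioned above can be illustrated with a short sketch; the weighting scheme and the worker scores below are invented for illustration and are not prescribed by any of the surveyed systems.

```python
from collections import defaultdict

def weighted_vote(answers, worker_weight):
    """Aggregate redundant answers, weighting each vote by an estimate of worker skill."""
    scores = defaultdict(float)
    for worker, answer in answers:
        scores[answer] += worker_weight.get(worker, 1.0)  # default weight for unknown workers
    return max(scores, key=scores.get)

answers = [("w1", "classA"), ("w2", "classB"), ("w3", "classB")]
weights = {"w1": 2.5, "w2": 1.0, "w3": 1.0}   # w1 has a strong track record
print(weighted_vote(answers, weights))        # "classA": one expert outweighs two novices
```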

Hybrid Genres: Combining GWAPs and Micro-Task Crowdsourcing

There is evidence of research that combines the two human computation genres in an attempt to capitalize on the relative benefits of both approaches. Thaler et al. [194] compared these two prominent techniques to evaluate which approach is better with respect to costs and benefits. They used the OntoPronto GWAP and replicated the study by employing Amazon Mechanical Turk with a similar approach. Their experimentation showed the feasibility of both approaches in accomplishing conceptual modeling tasks; however, they concluded the micro-task approach to be superior in terms of both development and problem solving, and there are tradeoffs to both approaches. There is also evidence of a hybrid-genre workflow [160] that seeks to combine the two genres in a single workflow, with some promising results. However, the hybrid approach may not be feasible in all kinds of scenarios and needs further investigation. Another example of a hybrid genre is the combination of a 'contest' and a paid micro-task approach used by Acosta et al. [3]. The contest targets experts and linked data enthusiasts, whereas the paid micro-tasks target faceless crowds on Amazon Mechanical Turk. The study is an interesting contribution showing that the two approaches not only complement one another but may be optimally combined to enhance linked data curation processes.

Task Category | Task Description | Genre | Source

Ontology Engineering: Ontology Creation/Ontology Alignment

Concept and Instance Identification | Decide if a given entity forms a class or an instance | GWAP | OntoPronto [183], OntoGame [184]
Concept and Instance Identification | Given a set of attribute descriptions, identify the class | GWAP | GuessWhat?! [130]
Concept and Instance Identification | Validate class labels: evaluate if the class name fits the description | GWAP | GuessWhat?! [130]
Specification of Term Relatedness | Check whether two terms (usually ontology concepts) are equal: select from term pairs | MTurk | InPhO [62]
Specification of Term Relatedness | Select from a set of terms | GWAP | SpotTheLink [191, 192]
Specification of Term Relatedness | Voting on terms | GWAP | LittleGameSearch [174]
Specification of Term Relatedness | Provide related terms | GWAP | FreeAssociation [196]
Verification of Relation Correctness | Equivalence: verification of a valid or invalid relation | CrowdFlower; MTurk | CrowdMap [164]; Conference v2.0 [46]
Verification of Relation Correctness | Subsumption | MTurk; CrowdFlower | Noy et al. [148]; CrowdMap [164]
Specification of Relation Type (select an appropriate relation from the set of given relation types) | Equivalence | CrowdFlower; GWAP | CrowdMap [164]; SpotTheLink [191, 192]; OntoPronto [183]
Specification of Relation Type | Subsumption | MTurk; CrowdFlower; GWAP | InPhO [62]; CrowdMap [164]; SpotTheLink [191, 192]; Categorilla [196]; OntoPronto [183]
Specification of Relation Type | Instantiation (specify types): instance | GWAP | Categorilla [196]
Specification of Relation Type | Disjointness | GWAP | OntoPronto [183]
Specification of Relation Type | InstanceOf | GWAP | OntoPronto [183]
Specification of Relation Type | Spatial, purpose, oppositeOf, isRelatedTo | GWAP | Verbosity [203]
Specification of Relation Type | Generic and domain specific relations | GWAP | ClimateQuiz [165]

Ontology Engineering: Relevant for Ontology Learning/Automatic Extraction of Ontologies

Verification of Domain Relevance | Verify if the given term is relevant to the domain | GWAP | OntoGalaxy [114]
Annotation of Multimedia | Select from given choices and annotate a video | GWAP; CrowdFlower, MTurk | OntoTube [183]; CrowdTruth [100]
Extraction of Text Annotations | Medical text annotation | GWAP | Dr. Detective [61]
Extraction of Text Annotations | Generic text annotation | CrowdFlower, MTurk | CrowdTruth [100]
Annotating Web Content | eBay offerings | GWAP | OntoBay [183]
Domain Specific Vocabulary and Relation Building | Collect goals and attributes | GWAP | Common Consensus [119]
Domain Specific Vocabulary and Relation Building | Collect terms and relationships | GWAP | TermBlaster [172]

Table 3.1: Broad Classification of Human Computation Tasks in the Semantic Web (Ontology Engineering) and the Respective Genres

3.3.2 Classification of Human Computation Tasks and Genres in the Semantic Web

A broad classification of tasks subject to human computation in the semantic web domain, with their respective genres, is given in Figures 3.5 and 3.6, and a detailed classification of related research and applications is presented in Tables 3.1 and 3.2. The broad classification is adapted from the reviews presented in [5, 54, 182, 186] and from a study of the systems and applications listed. The tasks are broadly grouped into the categories of ontology engineering and linked data management. The classification of the studies shows that the role of GWAPs in ontology engineering tasks is significantly greater than that of mechanized labor, and that there is considerable room for more research.

Ontology Engineering

Research in ontology engineering not only recognizes and establishes the need for human intelligence and contribution [186]; there have also been considerable efforts to develop ontologies in a collaborative and community driven manner. A detailed review may be found in [178]. CrowdMap [164] attempts to solve ontology alignment tasks on CrowdFlower using a workflow framework designed for crowdsourcing such tasks. Mortensen et al. [140, 141, 142, 143, 144] and Noy and colleagues [148] present a series of works on ontology validation and quality assurance in the domain of biomedical ontologies. This work serves to establish strong grounds for the feasibility of crowdsourcing in both generic and specialized domains. A broad overview of these tasks is presented in Figure 3.5.

A detailed classification of ontology engineering tasks using human computation is presented in Table 3.1.

• Concept and Instance Identification/Collection: One of the most fundamental tasks in ontology engineering is the process of identifying and collecting concepts and instances. OntoGame [184] and OntoPronto [183] adopt a game based approach to decide whether a given entity forms a class or an instance. Players are presented with a Wikipedia page of an entity and have to judge whether it denotes a class or an instance, and then relate it to the most specific concept of the PROTON ontology. GuessWhat?! [130] includes a similar feature, whereby, using a GWAP, the player is supposed to identify the class from given attributes and a description. A related task also incorporated by GuessWhat?! [130] is that of validating class labels, in which the user is required to evaluate whether the class name fits the description.

• Specification of Term Relatedness: In this type of task, crowd workers check whether two terms, which are usually concepts in an ontology, are equal. InPhO [62] is an example of such a study; it employs MTurk for this task, whereby workers are given a choice of term pairs and select the pair that best represents equality. Workers may be required to select from a set of given terms (SpotTheLink [191, 192]) or vote on terms (LittleGameSearch [174]). In some cases, workers or players provide related terms (FreeAssociation [196]).

• Verification of Relation Correctness: In this task, crowd workers or players are usually presented with a pair of terms and a relation between these terms, and they are required to determine whether the relation is a valid or applicable one. The most common of these relation types are equivalence (CrowdMap [164], Conference v2.0 [46]) and subsumption (Noy et al. [148], CrowdMap [164]).

• Specification of Relation Type: This task focuses on the selection of an appropriate relation from a set of given relation types. The most common relation types that have been crowdsourced so far are equivalence (CrowdMap [164], SpotTheLink [191, 192], OntoPronto [183]), subsumption (CrowdMap [164], InPhO [62]), instantiation (Categorilla [196]), disjointness (OntoPronto [183]) and instanceOf (OntoPronto [183]). There are other examples such as spatial, purpose, oppositeOf and isRelatedTo provided by Verbosity [203]. There are also efforts to crowdsource domain specific relations, such as ClimateQuiz [165].

• Verification of Domain Relevance: Some tasks require the crowdworkers to verify whether a given term is relevant to a specific domain, as in OntoGalaxy [114].

• Annotation of Text and Multimedia: Several studies include the annotation of text (CrowdTruth [100]) and multimedia (OntoTube [183], CrowdTruth [100]).

• Annotation of Web Content: Another aspect of semantic annotation is the annotation of web content. OntoBay [183], for instance, gets annotations for eBay offerings.

Figure 3.5: Classification of Human Computation Tasks and Genres in Semantic Web (Ontology Engineering)

• Domain Specific Vocabulary and Relation Building: There is also evidence of studies that work on domain specific relation and vocabulary building. Common Consensus [119], for instance, collects goals and attributes, while TermBlaster [172] collects terms and the relationships between them.

It is obvious from Table 3.1 and Figure 3.5 that a wide range of tasks have been subjected to human contribution using both mechanized labor and GWAPs. While the classification is not meant to be exhaustive, it shows representative studies in most broad categories of ontology engineering tasks reported in the literature. It is also evident that, interestingly, much of ontology engineering has so far been tackled using the GWAP approach, implying that there is considerable potential and room for mechanized labor to be taken up further towards a holistic ontology engineering approach.

Task Category | Task Description | Genre | Source

Linked Data Management: Linked Data Annotation and Production

Identity Resolution | Mapping entity links | Crowd (MTurk) | Simperl et al. [179]
Identity Resolution | Entity/link disambiguation | Crowd (MTurk) | CrowdLink [26, 27]
Entity Linking | Matching entities to the most similar ones by picking links (matching URIs) | MTurk | ZenCrowd [51, 52]
Entity Classification | Classification | Crowd (MTurk) | Simperl et al. [179]
Ranking Facts in LOD | Ordering | Crowd (MTurk) | Simperl et al. [179]
Ranking Facts in LOD | Play a Jeopardy-like quiz game to rank facts | GWAP | RISQ! [210]
Generating New Facts | Translation | Crowd (MTurk) | Simperl et al. [179]
Generating New Facts | Collaborative image annotation | GWAP | SeaFish [193]
Generating New Facts | Create new facts | Crowd (MTurk) | CrowdLink [26, 27]
Generating New Facts | Linguistic linked open data production: play a word game and add word senses and translations | mGWAP | WordBucket [159]
Generating New Facts | Generating ground truth by answering questions in a quiz | GWAP | WhoKnows?Movies! [195]
Create Links with LOD Concepts | Link photographs with LOD concepts | mGWAP | UrbanMatch [44, 45]

Linked Data Management: Quality Assessment of Linked Data

Validation of Linked Data | Validating links | Crowd (MTurk) | CrowdLink [26, 27]
Validation of Linked Data | Review concepts in DBPedia while reviewing restaurants | App | Taste It! Try It! [185]
Validation of Linked Data | Link verification | GWAP | VeriLinks [117]
Metadata Correction | Metadata correction | Crowd (MTurk) | Simperl et al. [179]
Metadata Correction | Incorrect data type | Contest (TCM Tool) & Crowd (MTurk) | Acosta et al. [3]
Metadata Completion | Incomplete object value | Contest (TCM Tool) & Crowd (MTurk) | Acosta et al. [3]
Link Evaluation | Incorrect links | Contest (TCM Tool) & Crowd (MTurk) | Acosta et al. [3]
Link Evaluation | Link verification | MTurk | CrowdLink [26, 27]
Link Evaluation | Evaluating LOD heuristics using a quiz | GWAP | WhoKnows? [205]
Query Processing | Query processing | MTurk | CrowdSPARQL [4, 180]

Table 3.2: Broad Classification of Human Computation Tasks in Semantic Web (Linked Data Management) and Respective Genres

Linked Data Management

The Linked Data initiative has come to be seen as a significant step towards the large scale adoption of the semantic web. However, linked data management, which includes linked data creation, updating, querying, retrieval, validation and quality assessment, presents its own challenges. Linked data researchers have also capitalized on the benefits of crowdsourcing, GWAPs and human computation to solve several of these challenges [185]. Simperl, Wolger et al. [181] provide a survey of various data interlinking tools in order to highlight aspects of the interlinking process that crucially rely on human contributions, and explain how these aspects could be subject to a semantically enabled human computation architecture that can be set up by extending interlinking platforms such as Silk with direct interfaces to popular micro-task platforms such as Amazon's Mechanical Turk. Linked Data Annotation and Production Tasks:

• Identity Resolution: This involves asking the crowd to help create owl:sameAs links between different entities, either by comparison of metadata or by the investigation of links on the human web. Simperl et al. [179] use MTurk to map entity links. A similar approach to entity disambiguation has been adopted by CrowdLink [26, 27], whereby the crowd disambiguates researchers' DBLP names and links using the information provided.

Figure 3.6: Classification of Human Computation Tasks and Genres in Semantic Web (Linked Data Management)

• Entity Linking: This task requires the crowd to link two entities using a new relation between them. This is different from identity resolution since the new link may not always be an owl:sameAs relation. For instance, ZenCrowd [51, 52] utilizes MTurk to involve the crowd in matching entities to the most similar ones by picking links (matching URIs).

• Entity Classification: Simperl et al. [179] propose the approach of entity classification, which involves classifying entities according to pre-determined vocabularies or ontologies.

• Ordering or Ranking Facts: Often linked open data facts need to be ranked depending on the needs of querying and browsing the data, as proposed by Simperl et al. [179]. RISQ! [210] uses a GWAP approach to rank facts by asking players to play a Jeopardy-like quiz game.

• Generating New Facts: Linked data suffers greatly from incomplete information; for example, a common need is labelling in terms of multilinguality. Providing non-English labels (translation) is something human contribution can benefit from, as proposed by Simperl et al. [179]. SeaFish [193] uses a GWAP approach for creating annotations for images in a collaborative manner. CrowdLink [26, 27] provides means to create or acquire new knowledge and facts for LOD sources using MTurk. WordBucket [159] uses a mobile GWAP approach and attempts to create linked data by asking players to play word games, thereby adding word senses and translations. WhoKnows?Movies! [195] also uses a GWAP approach to generate ground truth by requiring players to answer questions in a quiz.

• Creating Links with the LOD: Applications consuming the LOD will often attempt to create links with the entities in the LOD datasets. UrbanMatch [44, 45], for instance, uses a mobile GWAP approach to link photographs with LOD concepts.

Quality Assessment of Linked Data: When it comes to quality assessment, whether of the semantic web in general or linked open data in particular, the fitness-for-use principle [109] demands the involvement of humans to ensure such quality.

• Validation of Linked Data: Many aspects of linked data need manual validation in terms of completion, correctness and provenance. In this respect, CrowdLink [26, 27] provides a crowdsourcing based workflow for verifying entity links from LOD sources. Taste It! Try It! [185] is an application that aims to get users to review concepts in DBPedia while reviewing restaurants. VeriLinks [117] is another example that uses a GWAP approach for link verification. A relevant issue in linked data curation using human computation is that of data provenance and version control on linked data resources. A collaborative approach to curating linked data has been proposed by Knuth et al. [111]; at the core of their approach lies the design and usage of the PatchR ontology [110], which allows patch requests, including provenance information, to be described. They extended WhoKnows? [205] to embed the publishing of provenance information within a GWAP. Markovic et al. examine the role of provenance in social computation [134] and demonstrate the importance and management of provenance for crowdsourced disruption reports [134]. The proposed SC-PROV vocabulary, a provenance vocabulary for social computation [135], is an important step in managing provenance information and thus contributes to enhancing transparency in human computation systems. This is crucial for enabling decision making about the reliability of participants and the quality of the generated solutions.

• Meta-Data Correction: For the identification and correction of incorrect meta-data, Simperl et al. [179] and CrowdLink [26, 27] provide MTurk based tasks which can easily be posted to the crowd.

• Meta-data Completion: Acosta et al. [3] provide means to verify and complete incomplete meta-data information, which needs to be manually inspected, identified and corrected. They provide both an MTurk approach and a custom tool called TripleCheckMate.

• Link Evaluation: Acosta et al. [3] use their tool TripleCheckMate to identify incorrect links and also use MTurk to do the same. CrowdLink [26, 27] also provides means for link verification using its crowdsourcing framework for linked data management. WhoKnows? [205] uses a GWAP approach for evaluating LOD heuristics using a quiz. (A sketch of a simple automated pre-filter that complements such human link checks follows this list.)
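The following is a minimal sketch of such an automated pre-filter, assuming the Python requests library; it merely dereferences a URI and flags dead or content-less targets for human review, leaving the semantic verification to people.

```python
import requests

def flag_for_review(uri: str, timeout: int = 10) -> bool:
    """Cheap automated pre-filter: dereference a linked data URI and flag
    candidates that humans should inspect (dead links, empty responses)."""
    try:
        resp = requests.get(uri, headers={"Accept": "application/rdf+xml"},
                            timeout=timeout, allow_redirects=True)
    except requests.RequestException:
        return True                    # unreachable: definitely needs review
    if resp.status_code != 200 or not resp.content:
        return True                    # broken or content-less target
    return False                       # looks alive; humans still verify meaning

print(flag_for_review("http://dbpedia.org/resource/Quran"))
```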

Query Processing Tasks: CrowdSPARQL [4, 180] is an approach proposed for carrying out a number of linked data management tasks using SPARQL query processing, delegating part of the triples in the query to the crowd using MTurk. This is indeed a novel approach. There are similar successful approaches in the database community, such as CrowdDB [69].

3.3.3 Automated Tool Support for Human Computation based Semantic Web Tasks

Most applications described in Sections 3.3.1 and 3.3.2 are primarily independent, external systems or applications. There are relatively few efforts that aim to combine and closely integrate crowdsourcing into ontology engineering practices. We briefly review such efforts from both the ontology engineering and LOD management perspectives.

Automated Tool support for crowdsourcing Ontology Engineering Tasks: To re- duce the complexity of ontology construction, the process is often bootstrapped by re-using existing or automatically derived ontologies. Ontology learning methods are often used to automatically extract ontologies from structured and unstruc- tured sources. Crowdsourcing is therefore seen as a useful method for seeking human contribution in various ontology engineering tasks. However, despite the usefulness, the process of acquiring crowdsourced inputs brings its own baggage. Hanika et al, [81] on remarking the need for automated tool based support, high- light the high upfront investments(understanding techniques, task design) in set- ting up crowdsourcing for ontology engineers. Researchers in the domain of Natural Language Processing (NLP), have adopted crowdsourcing techniques [161]. A recent effort has successfully engineereed a

crowdsourcing plugin, the GATE Crowdsourcing plugin [39], as a new component in the popular GATE NLP system; it allows the seamless integration of crowdsourcing tasks into larger NLP workflows from within GATE's user interface. Noy et al. [148] have also envisioned a similar tool to support ontology engineering tasks. Inspired by these ideas and efforts, the foremost effort to this end is the uComp Protégé plugin, which aims to closely integrate typical crowdsourcing tasks into ontology engineering work from within the Protégé ontology editing environment [5, 80, 81]. The plugin allows a number of ontology engineering tasks to be crowdsourced. The main stages involved in using the plugin are shown in Figure 3.7. The efficiency of the plugin is compared to manual efforts in terms of time and cost reductions, while ensuring good data quality. The findings indicate that in scenarios where automatically extracted ontologies are verified and pruned, the use of the plugin significantly reduces the time spent by the ontology engineer and leads to important cost reductions without a loss of quality with respect to the manual process.

Figure 3.7: Main Stages in using the uComp Plugin from [5]

Automated Tool support for crowdsourcing Linked Data Tasks: Automated tool support for crowd-based linked data management has been proposed in the form of

TripleCheckMate, a tool for crowdsourcing the quality assessment of linked data [113]. The tool has been successfully used to crowdsource the assessment of the accuracy, relevancy, representational consistency and interlinking quality dimensions (a subset of those described in [219]) applied to DBPedia.

3.4 A Collective Intelligence Genome for the Semantic Web

Having analyzed the classification of broad semantic web engineering tasks and the respective human computation genres, we now present the collective intelligence genome for the semantic web, which aims to highlight the essential threads that constitute the design considerations for conceiving an application that successfully integrates human computation.

3.4.1 Overview

Malone et al., in their seminal work, present a user's guide to the building blocks of Collective Intelligence (CI) [127, 128]. They try to answer the fundamental question of how to get the crowds to do what needs to be done. They claim that combining or recombining the right genes, or building blocks, results in the formulation of the kind of system suited to the needs of collective intelligence. We adopt this genome and apply it to existing research, limiting our focus primarily to those studies that attempt to apply collective intelligence, human computation or crowdsourcing of some sort in the context of semantic web

research. We analyze how well these studies comply with this genome and identify possible gaps, limitations and opportunities in this light.

3.4.2 Key Driving Questions

The key questions that drive the design of the CI genome are: What is being done? Who is doing it? Why are they doing it? And how is it being done? We provide an in-depth analysis of the forms these genes may take within semantic web research. We use the 16 genes of the original genome as our basis and map them to existing semantic web research. An overview is presented in Table 3.3.

What is being done?

The first question that needs to be answered for any particular activity, according to this genome model, is: What is being done? At the most obvious level of granularity, this is the task to be performed. We consider the two genes proposed in the original genome: Create: The actors in the system create a new artifact. It could be a piece of knowledge such as a concept, relation or classification. Decide: This translates to an evaluation or selection task performed over certain alternatives, such as selecting the most appropriate relation from a given set. The original genome model suggests that completing an overall goal involves at least one of each of the create and decide genes. A create gene will generally be followed by a decide gene in order

to make a decision about which items are to be kept. And the decide gene usually needs a create gene to generate the choices being considered.

Table 3.3: Overview of the Collective Intelligence Genome

Question      Gene                 Manifestation
What          Create               Creation of a new artifact
              Decide               Evaluation or selection task
Who           Crowd                Large group that performs a task
              Expert               Someone with specialized knowledge of the domain
              Hierarchy            Someone in authority
Why           Money                Financial gain as motivation for doing work
              Love                 Enjoyment, fun, competition (from doing the task)
              Glory                Reputation and sense of community
How - Create  Collection           Independent tasks
              Contest              One entry in the collection considered best and rewarded
              Collaboration        Interdependent tasks; no way to divide the activity into smaller ones
How - Decide  Group Decision       Includes voting, averaging, consensus, prediction
              Individual Decision  Includes markets and social networks

Who undertakes the activity?

The next critical question to be answered is: Who undertakes the activity? Hierarchy or Management: Someone in authority makes a decision about who performs a task.

Crowd: Under this gene, anyone who is part of a large group and chooses to do so may take up a task. This is a central gene in most collective intelligence systems. Expert: This gene is not part of the original genome; we have included it as a distinct gene. We consider it different from the hierarchy or management gene because the distinction is important, especially within semantic web research. The semantic web research community recognizes the reliance on experts for certain domain-specific tasks. The choice of who performs which task is a critical one. Crowds are usually chosen to benefit from the skills and resources of a larger group of people at a much larger scale. Experts are usually hard to find. Crowds are usually suited to tasks that do not require any particular skills or domain-specific knowledge. For the crowd gene to work, the task must be reasonably decomposable and presented to the crowd in a feasible manner. At the same time, the results must be feasibly aggregable. In addition, mechanisms to prevent spam and sabotage are indispensable. For those tasks where either the task is not decomposable enough or the reliability of the task performance becomes questionable, experts are usually relied upon. Traditionally, much semantic content creation in semantic web research has been performed by experts. Only recently have researchers started experimenting with, and reporting findings about, the tasks that are amenable to crowdsourcing and human computation.

Why is the task being performed?

The question of who is incomplete without the question of why. Why would someone perform a given task? What motivates them to participate? What incentives are provided to them? In a simplified sense, the three broad motivations are Money, Love and Glory. Siorpaes and Simperl [186] proposed incentive-based measures for sparking motivation for human contribution in semantic web research. Their breakdown of incentive schemes is more or less encompassed by the Money, Love and Glory model. Money: It has long been known that financial gain is one of the primary motivations for most players in markets and traditional organizations. Love: This is manifested in various forms. Users enjoy engaging in certain tasks; in particular, the feeling of community and socialization gives them a sense of belonging and purpose. This may include the fun and competition incentives suggested by Siorpaes and Simperl [186]. Reciprocity may also be said to fall into this category, whereby contributors receive an immediate or long-term benefit upon performing a certain task. Glory: This includes reputation and sense of community. Recognition and acknowledgement are primary contributors. The genes may influence contributions in several ways. Love and glory, when put into play as motivators, are known to reduce costs, though not always. Money and glory together can bring results faster. The right combination, especially within semantic web research, remains to be experimented upon; however, this consideration is crucial.

How is the task being performed?

This is an important question and a key determinant in driving collective intelligence in any system: How is it being done? A key deciding factor when answering this question is the nature of decision making that follows crowd participation. Are the contributions and decisions independent, or are there strong interdependencies between the contributions? The four types of How genes are: Collection, Collaboration, Individual Decision, and Group Decision. For Create tasks, Collection and Collaboration are relevant, whereas for Decide tasks Individual or Group Decisions are relevant. Collection: The tasks performed are independent of each other. This gene is only useful when the overall activity can be divided into mutually independent tasks. If this is not the case, collaboration is needed. Contest: This is a subtype of the collection gene. One of the entries is declared best, or the participant with the best performance may be rewarded with a prize or some other recognition. Collaboration: This occurs when members of a crowd work together and interdependencies exist between their contributions. It is usually needed when there is no way of dividing a large activity into smaller independent subsets. Group Decision: This is useful when each participant has to make the same decision. Variants of this mode include voting, consensus, averaging and prediction. Individual Decision: This is employed when widespread agreement is not needed; decisions need not be made by all. Variants of individual decisions

include markets and social networks. Markets may not be relevant for semantic web research; however, the domain of the social semantic web has emerged as a result of social networks contributing towards semantic web research.
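As an illustration of how the genome can be operationalized, the following minimal Python sketch represents a system's genome profile as a simple record; the field values follow the gene vocabulary above, and the two example rows are taken from Table 3.4. The class and field names are our own, not part of the original genome model.

    from dataclasses import dataclass

    @dataclass
    class GenomeProfile:
        system: str
        what: str        # the create/decide activity being performed
        who: str         # Crowd, Expert, Hierarchy, or a genre such as GWAP
        why: str         # Money, Love, and/or Glory
        how_create: str  # Collection, Contest, or Collaboration
        how_decide: str  # group or individual decision variant

    profiles = [
        GenomeProfile("ZenCrowd", "Decide: pick links matching entities",
                      "Crowd (MTurk)", "Money", "Collection", "Prediction"),
        GenomeProfile("OntoPronto", "Decide: class vs. instance",
                      "GWAP", "Love/Glory", "Collection", "Consensus"),
    ]

    # Cross-analysis of the literature then reduces to simple filters,
    # e.g., all systems that aggregate decisions by consensus:
    consensus = [p.system for p in profiles if p.how_decide == "Consensus"]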

3.4.3 Application of the Collective Intelligence Genome to Semantic Web Research

Table 3.4 shows the application of the collective intelligence genome to selected works from those presented in Tables 3.1 and 3.2. Comparing the genes of the original genome with the mapping conducted on actual case studies from semantic web research shows a good spread of research across both genres, i.e., GWAP and mechanized labor, with GWAP dominating. The mapping of collective intelligence provides some interesting insights. Although the coverage of studies presented in Table 3.4 is in no way exhaustive, it gives reasonable insight into the trends to date. While a range of semantic web tasks have been experimented with, the create and decide modes are fairly consistently collection and consensus, respectively. This reflects that most tasks are of a simple and independent nature, and leaves room for more studies to experiment further with the other genes presented in the genome. One limitation of the collective intelligence genome when applied to semantic web research and tasks, especially those carried out using the GWAP genre, is that the task as presented to the user is often different from what is intended, i.e., from how the users' responses will actually contribute towards the ultimate

objective of semantic content creation. The genome does not, as of now, reflect this aspect in a comprehensive sense; however, it may be revised to include such a thread, specialized for semantic web research or for GWAPs. Nevertheless, the threads of the collective intelligence genome provide a useful means to cross-analyze the research in the domain.

3.5 Crowdsourcing based Workflow Systems for the Semantic Web

In this section, we review and compare some of the most notable studies that adopt a crowdsourcing-based model, or mechanized labor, for solving semantic web tasks. In particular, we include those studies that adopt some form of workflow-based approach in their design. We provide an overview of the different threads we use for comparative analysis and then provide a detailed analysis along the dimensions highlighted in these threads.

3.5.1 Threads for Comparative Analysis

The parameters used for the comparison are the key threads of analysis, which provide essential insights into the design and configuration of the key state-of-the-art studies. The parameters are explained in Table 3.5, along with the associated dimensions used later for the analysis. The key parameters we consider when comparing these systems are the nature of task support, task design, workflow and data representation, worker interaction and engagement strategy,

worker performance, data aggregation, and data quality and reliability.

3.5.2 Comparative Analysis using Ontology Engineering Tasks

Table 3.6 presents a comparative analysis of research studies primarily focusing on core tasks in ontology engineering or linked data management using a microtask-based crowdsourcing genre. We focus our analysis on five primary studies: CrowdMap [164], Noy and colleagues [148] & Mortensen et al. [140, 141, 142, 143, 144], ZenCrowd [51], CrowdLink [26, 27], and CrowdTruth [100]. We chose these systems because they present the state of the art at the confluence of semantic web and human computation research. The essential elements in the design of these systems provide the key ingredients for the design of similar applications in the near future. CrowdMap [164] presents a workflow model for crowdsourcing mappings between ontologies. Works from Noy and colleagues also aim to solve ontology verification tasks using crowdsourcing. CrowdLink [26, 27] and ZenCrowd are similar efforts that focus on linked data management tasks; much of their emphasis lies in automating tasks for microtask crowdsourcing. There is considerable need for combining human computation with machine computation. CrowdTruth [100] presents a framework for harnessing disagreement in gathering annotated data. It achieves a human computation workflow through a pipeline of four processes: 1) machine processing of media, 2) reusable task templates to collect human input, 3) application of disagreement metrics, and 4) result presentation. A similar workflow would be beneficial for engineering semantic web tasks.

We present a comparative analysis of these crowdsourcing-based workflow systems below, using the threads and dimensions given in Table 3.5.

Nature of Task Support

The semantic web tasks that have been undertaken are many and varied. CrowdMap: CrowdMap primarily focuses on ontology alignment tasks and, within those, only class-level (concept-level) alignments are undertaken; instance- and property-level alignments are ignored. Noy: The tasks undertaken in this study primarily concern validation or verification, chiefly hierarchy verification, i.e., verifying a superclass-subclass relationship. ZenCrowd: ZenCrowd focuses on extracting entities from HTML pages and linking them to appropriate entities in the LOD cloud. Candidate links are generated using automated extractors, and some automated decision making is applied using probabilistic models to reduce the candidate mappings that need verification by the crowd. In terms of crowdsourced tasks, therefore, only validation is obtained. CrowdLink: The CrowdLink architecture proposes to support a number of linked data management tasks, from acquiring missing facts and information to verification and validation. Support for structural and schema-level tasks is also mentioned. Experimentation is primarily carried out for LOD verification, knowledge acquisition and entity or link disambiguation.

CrowdTruth: CrowdTruth provides a wide range of annotation tasks for a range of input media, such as text, multimedia and images. While the focus is on obtaining annotations and no pure ontology engineering methodology is incorporated in the framework, the crowdsourcing framework is generic and mature enough to be extended to incorporate one.

The systems compared are CrowdMap [164]; Noy [148] & Mortensen et al. [140, 141, 142, 143, 144]; ZenCrowd [51, 52]; CrowdLink [26, 27]; and CrowdTruth [100].

Task Category
  CrowdMap: Ontology Engineering
  Noy: Ontology Engineering
  ZenCrowd: Linked Data Management
  CrowdLink: Linked Data Management
  CrowdTruth: neither pure Ontology Engineering nor pure Linked Data Management

Tasks
  CrowdMap: Ontology Alignment
  Noy: Ontology Verification
  ZenCrowd: Entity Linking
  CrowdLink: Schema- and Instance-Level Tasks
  CrowdTruth: (Semantic) Annotation (support for text, image and multimedia)

Task Granularity
  CrowdMap: Validation & Identification (concepts only; no properties or instances)
  Noy: Hierarchy Verification (verify a superclass-subclass relationship)
  ZenCrowd: Verify Entity Links
  CrowdLink: Create New Knowledge; Review Knowledge; Entity Disambiguation
  CrowdTruth: Text Annotation (Span and Relation Extraction); Event Extraction; Video Annotation

Domain of Study
  CrowdMap: Conference Ontology (OAEI)
  Noy: Biomedical Ontologies; comparison of worker performance on upper vs. application ontologies (BWW, SUMO, WordNet, CARO)
  ZenCrowd: News Articles; evaluations done for DBLP, GeoCities and US Census LOD datasets
  CrowdLink: Domain Independent
  CrowdTruth: Domain Independent; evaluations done for Medical Relation Extraction and Newspaper Event Extraction

Task Representation
  CrowdMap: Simple. Noy: NA. ZenCrowd: Static. CrowdLink: Dynamic. CrowdTruth: Dynamic.

Task Template
  CrowdMap: Simple (limited to a pairs generator)
  Noy: NA
  ZenCrowd: NA
  CrowdLink: templates for Entity Disambiguation; Create New Knowledge; Review Knowledge
  CrowdTruth: templates for Factor Span Correction; Relation Identification; Relation and Direction Identification; Event Identification; Event Location, Participants and Time Identification

Reusable Task Template
  CrowdMap: Yes. Noy: NA. ZenCrowd: Yes. CrowdLink: Yes. CrowdTruth: Yes.

Use of Task Composition/Decomposition
  CrowdMap, Noy, ZenCrowd, CrowdLink: NA. CrowdTruth: composite tasks, made up of atomic tasks.

Workflow Representation (Automated vs. Static)
  CrowdMap: Automated. Noy: NA. ZenCrowd: Static. CrowdLink: Automated. CrowdTruth: Automated.

Human and Machine Computation Combined in the Workflow
  CrowdMap: Yes (candidate mappings are automatically generated)
  Noy: No (a conceptual workflow is described in [144])
  ZenCrowd: Yes (microtasks are automated by querying LOD sources)
  CrowdLink: Yes (uses SPARQL queries to generate input to the microtasks)
  CrowdTruth: Yes

Particular Data Format
  CrowdMap: NA. Noy: NA. ZenCrowd: NA.
  CrowdLink: SPARQL queries as part of the input workflow; ontology-driven task profiles
  CrowdTruth: customizable job configuration and settings API

Worker Engagement/Interaction
  CrowdMap: CrowdFlower; seven alignment questions packed into one HIT to facilitate worker assignment and resource optimization
  Noy: Crowd (MTurk); qualification test (8 out of 12 to qualify); $50 award to the student with the best result; question formulation with positive and negative polarities (the additional cognitive overload of negative-polarity questions reduces performance)
  ZenCrowd: Crowd (MTurk)
  CrowdLink: Crowd (MTurk)
  CrowdTruth: CrowdFlower and Crowd (MTurk); presenting knowledge to users and engaging workers

Worker Performance
  CrowdMap: workers only. ZenCrowd, CrowdLink, CrowdTruth: workers.
  Noy: Turkers vs. students; Turkers vs. domain experts; benchmark

Data Aggregation
  CrowdMap: CrowdFlower aggregation measures; precision and recall for evaluation
  Noy: Bayesian inference model [141]
  ZenCrowd: agreement vote; precision and recall for evaluation
  CrowdLink: majority worker agreement; worker-confidence-based measure
  CrowdTruth: sophisticated disagreement metrics; utilization of annotation vectors; worker- and unit-level metrics; annotation metrics

Data Visualization
  CrowdMap, Noy, ZenCrowd, CrowdLink: NA. CrowdTruth: an interface to support data and results visualization.

Data Quality and Reliability Measures, Spam Prevention
  CrowdMap: a golden-unit question included in each HIT (of 7 questions each) for evaluating worker performance, detecting spam and validation
  Noy: redundancy in responses; random responses (e.g., 23 identical answers out of 28) lead to disqualification; 90% approval rating from other requesters required; question answering made time consuming by placing all 28 questions on the same page
  ZenCrowd: spam detection using 3 consecutive random answers
  CrowdLink: simple aggregation based on majority votes; no spam prevention; controlled test settings used
  CrowdTruth: harnesses worker disagreement using annotation vectors and media unit vectors; specialized worker, annotation and unit metrics

Table 3.6: Comparative Analysis of Crowdsourcing Research in the Semantic Web

Task Design

CrowdMap: The CrowdMap system defines two types of microtasks: 1) validation microtasks and 2) identification microtasks. A validation microtask presents a complete relationship between two classes and requires workers to specify whether or not they agree with the relationship. An identification task asks workers to identify a particular relationship between the source and target classes. The task includes the contextual information available for both classes. Noy: The study focuses on the verification task, specifically hierarchy verification; therefore, only a single task type is used. Questions of the same type were placed on the same page for quality control purposes. Experimental variations included questions with or without term definitions as contextual information. ZenCrowd: The study creates a single task for each entity that needs to be

evaluated by the crowd. Some textual content is included alongside the entity as contextual information. The workers in turn select the URIs that match the entity; the task at hand is therefore primarily a selection task. CrowdLink: CrowdLink provides a number of task templates; e.g., new or missing triple information on the LOD may be sought from the crowd. Similarly, verification or validation tasks may be created using the available templates. A unique feature of the task design is that the input is dynamically obtained from LOD sources using a SPARQL query, which may optionally be attached to the task input parameters. Most task templates include a 'confidence level' question that asks workers to report their confidence in their responses, as a means of quality control. CrowdTruth: CrowdTruth provides use cases illustrating some 14 distinct annotation templates across three content modalities (text, image, video) and three domains (medical, news, culture). The framework's design provides its template collection as a continuously extensible library of annotation task templates, which can be reused and adapted for new data and use cases. The implementation of CrowdTruth does not pose restrictions on the creation of new templates. Some of the key tasks include span correction, relation identification, relation direction identification, event and type identification, and event location, participants and time identification. In addition, for images and videos, annotation includes object identification or event identification. Some additional discussion points: It must be noted that most of these systems support only atomic tasks. Only CrowdTruth has some level of support for

composite tasks. Most tasks are simple and do not require any iteration.
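A minimal sketch of a reusable verification-task template in the spirit of CrowdLink follows: a SPARQL query supplies the task input and a confidence question is attached for quality control. The template fields and the render function are illustrative assumptions, not CrowdLink's actual specification.

    # Sketch: a reusable triple-verification template whose input rows
    # come from a SPARQL query over a LOD source.
    TRIPLE_VERIFICATION = {
        "input_query": "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5",
        "question": "Is the statement '{s} {p} {o}' correct? (yes/no)",
        "confidence": "How confident are you in your answer (1 = low, 5 = high)?",
    }

    def render_task(template, binding):
        # Instantiate one microtask (HIT) from the template and one
        # result row of the input query.
        return {
            "question": template["question"].format(**binding),
            "confidence": template["confidence"],
        }

    hit = render_task(TRIPLE_VERIFICATION,
                      {"s": "ex:Quran", "p": "ex:language", "o": "Classical Arabic"})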

Workflow and Data Representation

CrowdMap: The CrowdMap workflow architecture includes components for generating candidate pairs from a source ontology (Pairs Generator), a MicroTask Generator, and a MicroTask Publisher for publishing the tasks. The other end of the workflow deals with reading the results, processing, alignments and results evaluation. The functionality is meant to be generic enough to be integrated into existing ontology alignment environments. Noy: While [148] does not mention the use of any formal workflow model, a conceptual workflow model is proposed in the authors' parallel study [140, 144]. The workflow includes processes for entity selection from a source ontology, task generation, optimization, spam filtering and response aggregation. An additional process accounts for some form of context provisioning from external sources. ZenCrowd: ZenCrowd leverages algorithmic and human workers working together to produce high-quality results. The ZenCrowd architecture takes HTML pages as input and enriches them using entity extractors, which extract textual entities, linking those entities to the Linked Open Data cloud. To do this it uses algorithmic matchers to generate candidate mappings between the entities and the LOD links. It then sends these to a decision engine, which forwards selected candidates to a microtask manager that creates tasks and publishes them to the crowdsourcing platform. The decision engine evaluates the crowd responses

using a probabilistic network, and the results are used to decide the most appropriate links for the entities. The output is pages with entities linked to their respective URIs on the LOD cloud. CrowdLink: The CrowdLink architecture is similar to CrowdMap and ZenCrowd with regard to task publishing and results retrieval. CrowdLink uses reusable task profiles and task specifications for a wider range of tasks, and optionally uses SPARQL queries to obtain input from LOD sources. The architecture is equally capable of using any SPARQL-compatible ontology source. The workflow management engine in the CrowdLink architecture includes a task publishing workflow manager and a task review workflow manager, responsible for publishing and reviewing tasks respectively. A specific feature of the task review workflow is the update mechanism, which directly updates the source ontology based on the evaluated results and specified acceptance thresholds. This is done using the SPARQL Update protocol and makes the architecture more convenient to incorporate into an ontology engineering methodology. CrowdTruth: The CrowdTruth workflow framework is the most sophisticated and extensive of the systems presented thus far. The CrowdTruth architecture utilizes a range of pre-processing techniques for a wide variety of input media, such as text and multimedia collections. A number of customizable job templates are supported for a range of tasks. The data post-processing applies a number of metrics for the analysis of the obtained results, and extensive data analytics are applied. The very emphasis of the CrowdTruth methodology is to harness disagreement in crowd responses. The CrowdTruth framework is perhaps the only one of those

presented that allows for composite tasks, such that complex tasks are broken into smaller ones. Discussion Points: Most of these studies combine human and machine computation at some level. However, if we restrict ourselves strictly to combining human and machine computation in cooperative and iterative workflows, such that human tasks are adaptively generated on the fly, this is still something yet to be seen, and it presents a promising direction for research.
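The following minimal Python sketch illustrates what such a cooperative workflow could look like: an automated matcher scores candidate links, and only the uncertain ones become crowd microtasks, generated on the fly. The thresholds and function names are hypothetical and are not taken from any one of the systems above.

    # Sketch: machine computation handles confident cases; human
    # computation is invoked adaptively for the uncertain middle band.
    def hybrid_verification(candidates, machine_score, crowd_says_valid):
        accepted, rejected = [], []
        for link in candidates:
            score = machine_score(link)
            if score >= 0.9:              # confident match: accept automatically
                accepted.append(link)
            elif score <= 0.2:            # confident non-match: reject automatically
                rejected.append(link)
            elif crowd_says_valid(link):  # uncertain: publish a microtask
                accepted.append(link)
            else:
                rejected.append(link)
        return accepted, rejected

    # Example: a flat machine score pushes every link to the crowd stub.
    links = ["ex:A owl:sameAs ex:B", "ex:C owl:sameAs ex:D"]
    ok, bad = hybrid_verification(links, machine_score=lambda l: 0.5,
                                  crowd_says_valid=lambda l: True)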

Worker Interaction, Engagement Strategy and Performance

CrowdMap: CrowdMap uses the CrowdFlower system to engage workers. Researchers agree that worker responses can depend on the nature of the task design, and that a task design optimal for the problem being solved is always crucial. CrowdMap attempts the strategy of packing seven alignment questions into one HIT, or job, to facilitate worker assignment and resource optimization. Noy: Noy and colleagues experiment with a number of setups to establish the effectiveness of crowdsourcing methods for ontology verification without compromising quality. They compare the performance of Turkers vs. students and of Turkers vs. domain experts, relying primarily on MTurk for engaging the workers. The specific research questions that the authors attempt to address are crucial, in particular determining the suitability of crowdsourcing for the verification of ontologies in specialized domains such as biomedicine. In addition, the authors attempt to determine whether worker performance changes depending on how specific or generic the ontology is. Their results suggest

that crowdsourcing is indeed feasible for specialized domains. ZenCrowd: MTurk is used as the platform for engaging the workers. CrowdLink: MTurk is used as the platform for engaging the workers. Only sandbox testing is conducted; no payments are used. CrowdTruth: CrowdTruth uses both the CrowdFlower and MTurk platforms for engaging the workers.

Data Aggregation, Quality and Reliability

CrowdMap: CrowdMap relies on CrowdFlower aggregation methods and uses precision and recall measures for evaluating the results. Noy: The primary aggregation measure used is majority consensus. In one version of the experiments, the authors utilize a Bayesian inference model for verification [141]. A number of experimental settings are tried by the authors for quality control. The control parameters include qualification questions, question design, redundancy and disqualification, and spam identification. Qualification questions were employed; however, according to [141], these did not affect the results much. The authors introduce redundancy in responses and disqualify a worker's responses if 23 out of 28 answers are identical. In addition, a 90% approval rating is enforced. The question answering process is also deliberately made time consuming to ensure that only serious participants record their responses. These are some of the ways in which spam prevention has been attempted; however, as the authors note, these tests alone are not sufficient. ZenCrowd: The agreement voting technique is employed in this study. Five

different workers are required to select from among the proposed mappings, and the URIs with at least two votes are selected as valid links. A technique of blacklisting bad workers (spammers) is used: if a worker randomly and rapidly selects links, this is considered noise in the system and the worker is blacklisted. Three consecutive bad answers in the training phase are considered enough to identify the worker as a spammer. CrowdLink: The experiments are done under controlled settings; the spam element is therefore not taken into consideration. The task design includes a confidence measure, obtained from the workers, indicating their confidence in their responses. Simple agreement-based aggregation is done. CrowdTruth: CrowdTruth has an extensive data analytics component, with several annotation (disagreement) metrics in use [14, 15, 188]. The CrowdTruth framework aggregates the annotations of multiple workers across different media units (e.g., text, image and video) using annotation vectors and MediaUnit vectors. The annotation vector records the responses of the workers, and similarity is computed using the cosine similarity measure. The MediaUnit vector accounts for all worker submissions on a unit for a given task. Analysis of ambiguity to prevent spam is done across each system component, i.e., worker, annotation and unit, resulting in 'Worker Metrics', 'Annotation Metrics' and 'Unit Metrics'. Worker metrics measure disagreement at the level of the worker in order to distinguish spam from high-quality responses; measures such as worker-unit disagreement, worker-worker disagreement, and average annotations per unit are used. The unit metrics are used to determine the clarity

of the input unit given to the crowd; to identify ambiguous units, measures such as unit annotation score and unit clarity are used. The annotation metrics aim to measure the quality of pre-defined annotation types, in an attempt to distinguish disagreement resulting from low-quality workers from disagreement resulting from a badly designed task. The specific measures in this category are annotation similarity, annotation ambiguity, annotation clarity and annotation frequency. Together these metrics provide an in-depth perspective for improving crowdsourcing task design and the analysis of results. CrowdTruth is the only framework that also makes use of data visualization of the aggregated annotation metrics.
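To make the vector-based aggregation concrete, the sketch below builds binary annotation vectors over a closed set of answer options, sums them into a unit vector, and derives a simple worker-unit disagreement score from cosine similarity. This follows our reading of the metrics described above and is not CrowdTruth's reference implementation; the option set and worker responses are invented for illustration.

    import numpy as np

    OPTIONS = ["treats", "causes", "prevents", "none"]  # example relation set

    def annotation_vector(chosen):
        # One binary slot per answer option.
        return np.array([1.0 if o in chosen else 0.0 for o in OPTIONS])

    def cosine(u, v):
        denom = np.linalg.norm(u) * np.linalg.norm(v)
        return float(u @ v / denom) if denom else 0.0

    workers = {"w1": {"treats"}, "w2": {"treats", "prevents"}, "w3": {"causes"}}
    vectors = {w: annotation_vector(a) for w, a in workers.items()}
    unit_vector = sum(vectors.values())  # all submissions on this media unit

    # Worker-unit disagreement: 1 minus the cosine between a worker's
    # vector and the unit vector with that worker's contribution removed.
    disagreement = {w: 1.0 - cosine(v, unit_vector - v)
                    for w, v in vectors.items()}
    # w3 disagrees with everyone here and receives the highest score.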

3.5.3 Discussion

As Table 3.6 illustrates, the tasks that have been subjected to crowdsourcing in recent studies are at a fairly low level of granularity. There is little evidence of task composition or decomposition, since most tasks are fairly simple. Either simple task representations have been used, or there is no evidence of any specific task representation being devised. Similarly, it is not evident that any well-defined task templates have been developed for ontology engineering tasks. Another significant limitation is the lack of formal workflow engineering: the workflows so far have minimal formal representation and are simplistic in nature. There is considerable need for more effort in this direction.

3.6 Application of Semantic Web Techniques to Facilitate Human Computation

Research in this direction, where semantic web techniques are applied to improve the state of human computation systems, is scarcer. The most notable work in this domain is the generation of human-computer micro-task workflows using domain ontologies by Luz et al. [125]. Considerable interest has emerged in the resolution of complex tasks that require the cooperation of human and machine participants. The approach proposed by Luz et al. [125] consists of a semi-automatic generation environment for human-computer micro-task workflows from domain ontologies. The process relies on the domain expertise of the requester to supervise the automatic interpretation of the domain ontology. According to the authors, the unstructured nature of micro-tasks in terms of domain representation makes it difficult (i) for task requesters not familiar with the crowdsourcing platform to build complex micro-task workflows and (ii) to include machine workers in the workflow execution process. As the authors claim, the structure and semantics of the domain ontology provide a common ground for understanding and enhance human-computer cooperation. The study provides an interesting perspective on the use of ontologies for workflow generation and may inspire future studies. Another example of utilizing ontologies to facilitate human computation systems is provenance management. The vocabulary SC-PROV, a provenance vocabulary for social computation [135], is an important step in managing

provenance information and thus contributes to enhancing the transparency of human computation systems. This is crucial for enabling decision making about the reliability of participants and the quality of the generated solutions. Baillie et al. [20], while attempting to solve the issue of quality reasoning for the semantic web, also highlight the importance of obtaining provenance information for crowdsourced knowledge, followed by enhancement measures to ensure quality.
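As a concrete illustration of embedding provenance in crowdsourced responses, the short sketch below attaches standard W3C PROV-O statements to a crowd-produced annotation using the rdflib Python library. The example URIs are hypothetical, and SC-PROV's social-computation extensions are not shown; only core PROV terms are used.

    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import RDF, XSD

    PROV = Namespace("http://www.w3.org/ns/prov#")
    EX = Namespace("http://example.org/")

    g = Graph()
    g.bind("prov", PROV)

    annotation = EX["annotation/42"]   # a crowdsourced assertion
    worker = EX["worker/w17"]          # the contributing participant

    g.add((annotation, RDF.type, PROV.Entity))
    g.add((worker, RDF.type, PROV.Agent))
    g.add((annotation, PROV.wasAttributedTo, worker))
    g.add((annotation, PROV.generatedAtTime,
           Literal("2016-01-15T10:30:00", datatype=XSD.dateTime)))

    # Downstream quality reasoning can now filter assertions by who
    # produced them and when.
    print(g.serialize(format="turtle"))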

3.7 Discussion and Analysis

In this chapter we set out to examine the application of human computation and crowdsourcing methods to the semantic web domain. In particular, we first analyzed and presented the need for human contribution in semantic content creation. We dealt with this in two specific categories, i.e., ontology engineering and linked open data management. We also examined the key human computation genres that have been applied to semantic web tasks, provided an in-depth classification of these tasks according to the examined genres, and gave brief insights into automated tool support for human computation tasks in the semantic web. Another key contribution of the chapter is the collective intelligence genome applied to semantic web case studies. The genome helps analyze the essential threads of composition that require thought and effort when conceiving a human computation driven application. We also carried out a dedicated discussion focusing on those state-of-the-art systems that have attempted to design and implement workflow systems for driving semantic web tasks using

human computation methods, since we strongly feel the future will see more such systems, with more mature and well-grounded design principles. Before concluding the chapter, we highlight some of the key outstanding challenges that present areas for further research.

3.7.1 Key Outstanding Challenges

Different genres of human computation present different challenges. We briefly discuss the challenges faced by semantic GWAPs and focus more on the challenges faced in the domain of mechanized labor or human computation systems in general.

Challenges in Semantic GWAPs

Siorpaes and Hepp [183] identify some key challenges faced when creating semantic GWAPs, including task identification in semantic content creation, designing game scenarios, designing a usable and attractive interface, identifying reusable bodies of knowledge, preventing cheating, avoiding pitfalls (e.g., unintentional agreement on wrong choices), fostering user participation, deriving semantic data, efficient distribution of labor, and scalability and performance, to name a few. Some of these overlap with the other genres of human computation; however, semantic web games have their own particular challenges, such as game design, game scenarios and user engagement.

Methodological Considerations

While the domain of crowdsourcing has matured over the years, its intersection with ontology engineering has yet to see a formal methodology take shape. The authors of [5] highlight the need for methodology and best practices, moving away from isolated approaches.

Integrated Tool Support for Human Computation Workflows

A significant limitation is the lack of integrated tool support for semantic-web-specific human computation and crowdsourcing. Recent work by Hanika, Wohlgenannt and colleagues has addressed this limitation [5, 80, 81] by providing a plugin for the popular ontology development tool Protégé. This is a crucial advancement towards integrating crowdsourcing into ontology engineering practices. However, this is only the beginning, and there is much room for further research, including human computation workflows in ontology engineering, the quality of crowdsourcing results, and the large-scale application and usability of such plugins. The results of these tools may vary greatly between tasks (depending on the type of task and the difficulty of the domain). They also show sensitivity to when the task is crowdsourced and to the responses obtained from the available workers [5]. Most applications are single-purpose and standalone. Further success demands integrated tool support, in particular with existing semantic web tools, to realize the full potential of crowdsourcing in particular and human computation in general.

Kondreddi et al. [112] have proposed methods to combine information extraction approaches with human computation for knowledge acquisition, and claim to reduce the cost of human computation effort. Such efforts are claimed to be 'game changers' in the new generation of systems that combine the power of humans and machines.

Aggregating Answers

One of the key issues faced when obtaining inputs from a large crowd is aggregating answers in a quality-preserving manner. Lopez et al. [124] argue that merging and ranking answers obtained from different sources in the semantic web domain, akin to the wisdom of the crowds, tends to produce higher precision. There is a need for more well-defined quality measures, especially in specialized domains.

Adequately Utilizing Crowdsourced Knowledge and Annotations

Oftentimes it becomes a challenge in itself to utilize crowdsourced knowledge, especially in tasks where domain-specific annotations are needed. Workflows and techniques emphasizing such adequate utilization therefore need to be devised. As an example, 3DSA [213] is a system that utilizes rule-based reasoning to classify crowdsourced annotations.

Design of (Optimal) Semantically Engineered Workflows

Whereas a combination of human and computational intelligence is often likely to yield superior results [177], the design of an optimal workflow depends on various

dimensions, such as the type of task and the degree to which it can be automated, the amount of data for which the task is expected to be (repeatedly) executed, and the (estimated) availability of a capable workforce potentially interested in engaging with the task [108]. Luz et al. [126] highlight the need for a structured and semantically enriched representation of the micro-task workflow and data to allow for better integration of human computation and machine computation efforts. The use of ontologies as a means of representing task profiles is also suggested by [26, 27]. One such effort, representing micro-tasks using ontologies and their dependencies through the specification of their domain (input, output and context), has been presented by [125]. More efforts to this end are expected to emerge. Combining human and algorithmic computation needs more work.

Motivating Users and the Role of Incentives

It has yet to be seen what type of incentives, platforms, games, rules, systems and architectures will work best for human computation on the semantic web [183]. In order to benefit from network effects and positive externalities, end-user semantic content authoring technology needs to offer an appropriate set of incentives to stimulate and reward user participation, resulting in massive production of useful semantic content [185].

Dealing with Diversity

Dealing with Motivational, Cognitive and Error Diversity: Because people are involved, programming the ’Global Brain’ is deeply different from programming traditional computers [34].

Harnessing Crowds vs. Experts: Choosing the Right Crowd

Crowdsourcing simple (independent) vs. complex (interdependent) tasks presents a challenge, especially to semantic web researchers. Research evidence suggests a strong correlation between the nature of a task, its difficulty, and the skill and knowledge required, and the accuracy obtained from crowd workers [149, 150]. Although crowdsourcing is promising, for domain-specific tasks it needs to be established which task is suitable for which kind of crowd. The ideas of nichesourcing [49], expert sourcing [5] and expert finding [40] are being proposed for knowledge-intensive tasks. These ideas bring their own challenges, such as adequate task distribution and quality assurance. Some efforts that tackle skill-based task matching or task recommendation, such as [71], [217] and [218], may provide useful insights in this regard.

Compromise between Quality and Complexity of Tasks

Simperl, Wolger et al. [181] highlight that the area of human computation in semantic web engineering, and especially in linked data management, is in its early stages, and more research remains to be done in order to achieve the overall vision. They highlight the constraints on the complexity of tasks that can be

feasibly undertaken and on the domains for which knowledge can be reliably collected from a mass audience, when a mass user base is targeted with tasks and tools designed for experts. This introduces the need for specific evaluation and quality assurance mechanisms. Want et al. [206] also highlight the importance of data complexity and task specificity for appropriate task design and management.

Quality Measurement

Quality management for crowdsourcing is critical. Techniques for accurately estimating the quality of workers are pivotal for the success of crowdsourcing research [103]. Developing common grounds for quality assessment is crucial when obtaining semantic annotations from the crowd [101], where annotator diversity can be harnessed. Aroyo and Welty present a series of works on harnessing disagreement in crowdsourcing gold standards [13, 14, 15].

Constraints to Microtask Approach

Decomposability, verifiability and expertise are three crucial factors that constrain microtask design [179]. Techniques that achieve a suitable compromise are needed to see the research through to a state of maturity. Several parameters can have an impact on the approach used and the resulting quality, such as the effect of context on worker performance, the effect of more domain-specific tasks on the accuracy of the crowdsourcing model, improving worker questions by fine-tuning them, and optimizing the number of questions issued to the user [164].

Scalability

Scalability is needed in both reading and writing semantic data. It is yet to be seen what a truly web-scale semantic human computation system will look like [54].

3.7.2 Open Research Issues for Semantics Driven Human Computation Systems

While the view of the challenges presented in the preceding section itself sheds light on current and open research directions for incorporating human computation methods into semantic web processes, we also, on a parallel note, list a few directions that may be taken up in further research towards augmenting semantic content to better design and facilitate human computation systems:

Design of New Vocabularies

New ontologies and vocabularies will need to be developed to help manage and link these human computation systems together [54].

Semantic User Management

Users can easily sign on to new systems and have their points and reputation follow them [54].

Meta-data and Provenance Management

The ideas of trust, reliability and quality assurance for crowdsourced contributions demand that provenance information be embedded within workflows and responses and be included in the relevant analysis [54]. Recent propositions by Markovic and colleagues highlight the provenance perspective [131, 132, 133, 134, 135] when dealing with the crowd and linked data. Maintaining provenance information is critical, and adequate measures for managing provenance therefore become a key challenge.

New Programming Models and Frameworks

Similar to the idea of programming the 'Global Brain', programming the 'Global Brain Semantic Web' [34] demands programming languages and frameworks such as the one presented in [138]. Exploring iterative and parallel human computation processes [122] is another prospective idea for further research.

Design of Better User Interfaces

New visualizations and usable designs are needed for better user engagement.

State Space in Human Computation Systems

According to Difranzo and Hendler [54], the 'state space' of a human computation system is the collection of knowledge, artifacts and skills of both the human users and the computer system that helps define the state, or current stage at a given time, of the human computation system. This state space can often be very

messy, disjointed, incomplete or inconsistent. The semantic web could provide a common platform and medium for representing knowledge that persists despite the asynchronous behaviors of the human participants. More research is needed to explore how this could work, how best to represent this knowledge, and what advantages it could bring to future human computation systems.

3.7.3 Some Reflections

Most current semantic web research has focused primarily on studying the feasibility and amenability of applying crowdsourcing and human computation techniques to semantic web tasks. Much work remains to be done. Carletti et al. [43] highlight an essential challenge concerning the separation that occurs between crowdsourced activities and organizational workflows. Although the discussion is in the context of the digital humanities, the semantic web landscape appears to face a similar challenge. While there has been significant interest and contribution on these fronts, it remains to be established how much of the crowdsourced contribution has made a real impact. Crowdsourcing has also been shown to contribute to innovation [6]; however, the key consideration has to be the goals of the crowdsourcing activity [6]. The interleaving of human, machine and semantics even has the potential to overcome some of the issues currently surrounding Big Data [146].

3.8 Conclusions

In this chapter we presented a review of the challenges that the semantic web domain has faced, especially in terms of the need for human intervention, and in this light analyzed the intersection of the semantic web and the human computation paradigm. A two-fold approach towards understanding this intersection provided interesting insights. Both the semantic web and human computation have the potential to benefit from each other. While there seems to be more interest from the semantic web community in benefiting from the human computation paradigm, there is evidence of the latter being enhanced by the former as well. The comprehensive view of the promises and challenges offered by a successful synergy of the semantic web and human computation, as discussed in this chapter, indicates that this is a thriving research direction that is expected to grow further. We hope that this review will serve as a basis for exploring newer threads of synergy between semantic web and human computation research, resulting in the creation of better applications and approaches that advance both domains.

Table 3.4: Application of the Collective Intelligence Genome to Semantic Web Research

GuessWhat [130]
  What: Domain Ontology Creation from Linked Data. Task: evaluate whether the class name fits the description (Decide).
  Who: GWAP. Why: Love/Glory. How: C-Collection; D-Consensus (majority voting).

Verbosity [203]
  What: not pure Ontology Engineering or Semantic Web (a GWAP). Task: select an appropriate relation from the set of given relation types (Decide).
  Who: GWAP. Why: Love/Glory. How: C-Collection; D-Consensus.

OntoPronto [184]
  What: Conceptual Modelling; Concept and Instance Collection. Task: decide whether a given entity forms a class or an instance (Decide).
  Who: GWAP. Why: Love/Glory. How: C-Collection; D-Consensus.

Virtual Pet and Rapport [116]
  What: Concept Collection. Task: concept collection (Create).
  Who: GWAP. Why: Love/Glory. How: C-Collaboration.

SpotTheLink [191, 192]
  What: Ontology Alignment; Entity Linking. Task: select from a set of terms and specify a relation (Decide).
  Who: GWAP. Why: Love/Glory. How: C-Collection; D-Consensus.

VeriLinks [117]
  What: Entity Linking. Task: validation of linked data (Decide).
  Who: Crowd (MTurk). Why: Money. How: C-Collection; D-Consensus.

UrbanMatch [44, 45, 133]
  What: Alignment and Interlinking. Task: link photographs with LOD concepts.
  Who: Mobile Crowd. Why: Love/Glory. How: C-Collection; D-Consensus.

CrowdSourced InPhO [62]
  What: Conceptual Modelling (conceptual hierarchy formulation). Task: select relations; select from term pairs (Decide).
  Who: Crowd (MTurk). Why: Money. How: C-Collection; D-Consensus.

Noy [148]; Mortensen et al. [140, 141, 142, 143, 144]
  What: Ontology Verification. Task: hierarchy verification (Decide).
  Who: Crowd (MTurk). Why: Money. How: C-Collection; D-Consensus.

CrowdMap [164]
  What: Ontology Alignment (equivalence, subsumptions). Task: mappings between ontologies; validation (validate a given mapping) and identification (select between different types of given relations).
  Who: Crowd (CrowdFlower). Why: Money. How: C-Collection; D-Consensus.

Conference v2.0 [46]
  What: Ontology Alignment (benchmarks). Task: given two concept descriptions, validate whether they are similar (Decide).
  Who: Crowd (MTurk). Why: Money. How: C-Collection; D-Consensus.

CrowdSPARQL [4, 180]
  What: Conceptual Modelling; Ontology Classification; Entity Resolution. Task: uses SPARQL query processing tasks as the means of crowdsourcing.
  Who: Crowd. Why: Money. How: C-Collection; D-NA.

ZenCrowd [51, 52]
  What: Entity Linking. Task: picking links in order to match similar entities (Decide).
  Who: Crowd (MTurk). Why: Money. How: C-Collection; D-Prediction.

CrowdLink [26, 27]
  What: Linked Data Management Tasks. Task: supply missing LOD knowledge (Create); validate LOD links (Decide).
  Who: Crowd (MTurk). Why: Money/Glory. How: C-Collection; D-Consensus.

Dr. Detective [61]
  What: Extraction of Text Annotations. Task: medical text annotation, to obtain ground truth (Create).
  Who: GWAP. Why: Love/Glory. How: C-Collection; D-Consensus.

CrowdTruth [100]
  What: Semantic Annotation Tasks. Task: text, image and video annotation (Create).
  Who: CrowdFlower; Crowd (MTurk). Why: Money. How: C-Collection; D-CrowdTruth metrics.

Acosta et al. [3]
  What: Quality Assessment of Linked Data. Task: identify incorrect or incomplete object values; incorrect data and links (Decide).
  Who: Contest (TCM Tool) & Crowd (MTurk). Why: Money/Love/Glory. How: C-Collection; D-Consensus.

uComp [5]
  What: Ontology Engineering Tasks. Task: a whole range of ontology engineering tasks via Protégé.
  Who: Crowd (MTurk). Why: Money. How: C-Collection; D-Consensus, Authority.

Table 3.5: Threads and Dimensions of Analysis

Nature of Task Support
  a. What specific semantic web tasks have been experimented with.

Task Design (Task Composition, Representation and Reusability)
  a. What methods of task representation are used.
  b. What methods of task decomposition are used.
  c. What kinds of task templates and UIs have been used.
  d. The extent of reusability of task designs and templates.

Workflow and Data Representation
  a. How human and machine computation are combined into a workflow.
  b. What the common data formats are in semantic web research when applied to human computation.

Worker Interaction & Engagement Strategy
  a. What strategies have been used for presenting semantic web knowledge to users in an easy manner.
  b. What strategies have been used to engage workers.
  c. How these strategies may be classified.
  d. Which have been used in semantic web research and which have not.

Worker Performance
  a. How workers have performed.
  b. What factors worker performance depends on.
  c. What performance measures are used.

Data Aggregation
  a. How the data and results are aggregated.
  b. What methods are used for aggregating data.
  c. What the challenges and prospects are in this direction.
  d. How these methods may be classified.

Data Quality and Reliability
  a. How data quality and reliability are ensured.
  b. What methods are used.
  c. What challenges have been faced so far.
  d. How the data quality and reliability measures may be classified.

Chapter 4

Text Mining Verse Similarity for Multi-lingual Representations of the Qur’an

Allah has sent down the best statement: a consistent Book wherein is reiteration. The skins shiver therefrom of those who fear their Lord; then their skins and their hearts relax at the remembrance of Allah. That is the guidance of Allah by which He guides whom He wills. And one whom Allah leaves astray - for him there is no guide. — [Al-Quran, Az-Zumar,39:23]

Chapter Overview

In this chapter, we explore the problem of text similarity in the context of multilingual representations of the Qur'an. In particular, we use Arabic and English datasets of the Qur'an for a comparative study and analysis of several similarity measures applied across different representations of the Qur'anic verses. We provide useful insights into the impact of using different similarity measures applied to different features across different representations, and into the linguistic characteristics of similar texts. The similarity computation framework developed here is ultimately intended to create candidate links between verses, or other texts, to be fed to the crowdsourcing workflows devised and presented in later chapters of this dissertation.

4.1 Introduction

The Qur'an, considered as a concise dataset, consists of fewer than 80,000 words, sequenced in 114 chapters (Surahs) and 6,236 verses (Ayahs) [17]. The original data format was spoken Classical Arabic. We treat the problem of computing similarity between the verses of the Qur'an as a special case of computing document similarity, a widely studied subject in the literature. Document similarity measures are often utilized for automatic text classification and clustering [8, 99]. We treat a verse somewhat like a document. However, some verses may

1 This chapter is published as: Basharat, A., D. Yasdansepas, and K. Rasheed. "Comparative Study of Verse Similarity for Multi-lingual Representations of the Qur'an." Proceedings of the International Conference on Artificial Intelligence (ICAI), 2015.

be as short as a single word or a few words. The longest verse in the Qur'an does not span more than a single page in the standard manuscript. Researchers recognize that determining whether a pair of documents is similar or different is not always clear-cut and is often context dependent [99]. The problem at hand bears resemblance to short text classification and clustering [189, 197]. We are not only interested in analyzing verse similarity using the original Arabic script, but also take considerable interest in undertaking this similarity study within and across different languages. The Qur'an is one of the most widely translated texts, translated into numerous languages. A recent open linked dataset called 'Semantic Quran' has been published [168], with more than 48 known translations obtained from Tanzil.2 Research in other domains has revealed interesting findings when cross-language similarity studies were undertaken [67]. The problem of computing similarity between the verses has not been widely studied, despite the fact that the Qur'anic text has attracted much attention in recent years from computational researchers, particularly in artificial intelligence and natural language processing. Studies have been conducted using various Arabic corpora, e.g., [70, 107]. Sharaf et al. [167] are among the pioneers to have conducted similarity and relatedness studies on the Qur'anic text. Their QurSim corpus is marked with relatedness information judged by a human domain expert. The authors highlight the challenges posed by the variation in the degree of relatedness between texts. While there are instances where lexical matching between the terms is evident, the authors

2 http://tanzil.net/

believe that the majority of the related pairs require deeper, more semantic analysis and domain-specific world knowledge in order to relate the two texts in the pair. For a more detailed discussion of relatedness and the extent of its broadness, we refer the reader to the original work of Sharaf and his colleagues. Our objective here is to use the Qur’an’s text to compare and contrast the different classes of similarity that may result from using different representations of the text and different similarity measures. We believe that a benchmarking study towards standardizing similarity measures for the Qur’anic text would pave the way for the ongoing efforts to standardize the Qur’anic knowledge map, as described by Atwell and colleagues [17, 18]. We intend to use this study as the basis for developing semantic networks of similarity and relatedness, not only between verses of the Qur’an but also extending to other contextually relevant texts, which are considered indispensable for developing a comprehensive and coherent understanding of the Qur’anic verses. In this chapter, we evaluate a variety of document similarity measures using multi-lingual, heterogeneous representations of the Qur’an. To do this, we develop a similarity computation framework for the verses in the Qur’an (Section 4.2). We experiment with four dataset representations, four similarity measures and three feature representations, and provide comparative insights into the results obtained (Sections 4.3 and 4.4). We also shed light on how we plan to extend this study further (Section 4.5).

Figure 4.1: Qur’anic Verse Similarity Computation Framework

4.2 Verse-to-Verse Similarity Computation Framework for the Qur’an

In this section, we present the design of our framework, which facilitates the evaluation of similarity between the verses of the Qur’an using the available datasets. The framework is generic enough to be readily used with other texts.

4.2.1 Datasets and their Characteristics

We have created a custom database for the experimentation, which consists of the Qur’anic text in its original Arabic script in two forms: first, with diacritics (i.e., case markings), and second, without diacritics, i.e., plain, simple and clean Arabic script. In addition, the database contains translations in several languages. We used Yusuf Ali’s English translation of the Qur’an for this study. The last type of data representation we used was a dataset generated from the Arabic roots (stems). A summary of the datasets used is shown in Table 4.1.

Table 4.1: Dataset representations and samples

Abbreviation   Description                                   Example
Q-DIAC         Arabic text with diacritics (case markings)   [Arabic script with diacritics]
Q-NODIAC       Arabic text without diacritics                [plain Arabic script]
Q-ROOTS        Arabic roots dataset                          [Arabic root forms]
Q-ENG          English text                                  "...and never have I been in my supplication to You, my Lord, unhappy."

4.2.2 Preprocessing of Verses

We used the raw data for each dataset without any linguistic pre-processing except tokenization. We created a workflow mechanism in which an input configuration encodes the data representation, the feature selection method, and the similarity measure to be used for an experiment. The Verse-Set Generator, as shown in Figure 4.1, uses the Query Processor to generate the dataset for experimentation from the database and prepares the appropriate input representation to be used in later stages.
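To make this workflow concrete, the following is a minimal Python sketch of the input-configuration mechanism just described. The names (ExperimentConfig, generate_verse_set, TOY_DB) are illustrative stand-ins for the actual Verse-Set Generator and Query Processor, not the system’s own API.

from dataclasses import dataclass

@dataclass
class ExperimentConfig:
    dataset: str     # "Q-DIAC", "Q-NODIAC", "Q-ROOTS", or "Q-ENG"
    weighting: str   # "B" (Boolean), "F" (frequency), or "TF" (TF-IDF)
    similarity: str  # "C" (cosine), "E" (Euclidean), "J" (Jaccard), "P" (Pearson)

# Stand-in for the custom database: representation name -> verse texts.
TOY_DB = {
    "Q-ENG": ["praise be to god", "praise be to the lord of the worlds"],
}

def generate_verse_set(config):
    """Stand-in for the Verse-Set Generator: fetch the chosen representation
    and tokenize each verse (tokenization is the only preprocessing applied)."""
    return [text.split() for text in TOY_DB[config.dataset]]

config = ExperimentConfig(dataset="Q-ENG", weighting="TF", similarity="C")
verses = generate_verse_set(config)   # [['praise', 'be', ...], ...]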

4.2.3 Feature Selection and Verse Representation

In order to adequately compare and contrast the different representations and the multi-lingual texts, we adopt a vector space model as a means of verse representation for our experiments. For this preliminary analysis we did not use any specific pre-processing techniques such as stemming. We relied on the traditional term-vector approach, which is the conventional approach adopted in most text mining applications.

We consider V = {v_1, v_2, ..., v_n} to denote the set of verses in the Qur’an, and T = {t_1, t_2, ..., t_m} the set of distinct terms occurring in V. We represent each verse v in the Qur’an data set as an m-dimensional vector \bar{v} = (a_1, a_2, ..., a_m), where a_i is the weight of the ith feature in the vector \bar{v}. The framework we developed primarily operates on a term-verse matrix (tv-matrix), generated using the Verse Vector Set Generator. It is a compilation of all the verse vectors into one matrix, in which the columns represent the terms (the vocabulary of the dataset) and the rows represent the verses. First the vocabulary is compiled from the verses, and the frequency of terms across all verses is recorded. Then the tv-matrix is built in verse order, with the elements of the matrix calculated according to one of three term-weighting methods: Boolean (B), Frequency (F) or TF-IDF (TF). In traditional document similarity studies, the tf-idf method of term weighting

is most recommended. However, the Qur’an is considered a text with several unique characteristics: Arabic linguists consider it a profound piece of text in which each letter and word is relevant. We therefore choose to experiment with different term-weighting measures to analyze their impact on the similarity computation. The feature selector is responsible for applying the feature selection method specified in the input configuration. The verse modeler then applies this selection to each verse before passing it to the Verse Vector Set Generator, which generates the tv-matrix. We start with the simplest measure, i.e., the Boolean: the weight of the term t_i has the value 1 if the term is present in the verse and 0 otherwise. Equation (4.1) shows the term weighting using the term frequency.

a_i = tf(v, t_i)    (4.1)

Equation (4.2) shows the term weighting using the term-frequency, inverse document frequency (tf-idf) approach. In this scheme, N_v is the total number of verses in the dataset, tf_i is the number of occurrences of term t_i in the verse, and vf_i is the verse frequency of t_i, i.e., the number of verses in which t_i occurs.

a_i = tf_i \times \log(N_v / vf_i)    (4.2)
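The following sketch illustrates how a tv-matrix with the three term-weighting schemes can be built from tokenized verses, following Equations (4.1) and (4.2); the function and variable names are hypothetical, not the system’s own.

import math
from collections import Counter

def build_tv_matrix(verses, weighting):
    """Build the term-verse matrix: one row per verse, one column per
    vocabulary term, weighted as Boolean, frequency (Eq. 4.1) or TF-IDF (Eq. 4.2)."""
    vocab = sorted({t for v in verses for t in v})
    n_v = len(verses)
    # vf: verse frequency of each term (number of verses containing it)
    vf = Counter(t for v in verses for t in set(v))
    matrix = []
    for verse in verses:
        tf = Counter(verse)
        row = []
        for term in vocab:
            if weighting == "B":       # Boolean: presence/absence
                row.append(1.0 if tf[term] else 0.0)
            elif weighting == "F":     # raw term frequency, Eq. (4.1)
                row.append(float(tf[term]))
            else:                      # TF-IDF, Eq. (4.2)
                row.append(tf[term] * math.log(n_v / vf[term]))
        matrix.append(row)
    return vocab, matrix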

4.2.4 Similarity Computation

With the verses represented as vectors, we measure the degree of similarity between two verses as the correlation between their corresponding vectors, using the similarity computation module. The measures described below reflect the degree of closeness or separation of the target verses. Choosing an appropriate measure of similarity is crucial, and we believe that understanding the effects of different similarity measures when applied to the Qur’an is of great importance in helping to choose the best one. We adopt four of the measures described in [190] and [99] and map them to the problem of computing similarity for the Qur’anic verse vectors. The output of this stage is a Verse-Verse Similarity matrix, which provides a similarity measure on a scale of 0 to 1 for each verse pair in the Qur’an.

Euclidean Distance

Given the two verses v_a and v_b, represented by their term vectors \bar{v}_a and \bar{v}_b respectively, the Euclidean distance of the two verses is defined in Equation (4.3).

D_E(\bar{v}_a, \bar{v}_b) = \left( \sum_{t=1}^{m} |w_{t,a} - w_{t,b}|^2 \right)^{1/2}    (4.3)

where the term set is T = {t_1, t_2, ..., t_m}. As mentioned earlier, we use different weighting measures; therefore w_{t,a} may be tfidf(v_a, t), tf(v_a, t) or tb(v_a, t).

Cosine Similarity

Cosine similarity is one of the most popular similarity measures applied to text documents. Notable studies that report the use of this measure include [99] and [190]. Given the two verses v_a and v_b, represented by their term vectors \bar{v}_a and \bar{v}_b respectively, their cosine similarity is given by Equation (4.4).

S_C(\bar{v}_a, \bar{v}_b) = \frac{\bar{v}_a \cdot \bar{v}_b}{|\bar{v}_a| \times |\bar{v}_b|}    (4.4)

where \bar{v}_a and \bar{v}_b are m-dimensional vectors over the term set T = {t_1, t_2, ..., t_m}. Each dimension represents a term with its weight in the verse, which is non-negative. The cosine similarity is therefore bounded between 0 and 1.

Jaccard Similarity

The Jaccard coefficient measures similarity as the intersection divided by the union of the entities. For the verses, the Jaccard coefficient computes the ratio between the dot product and the sum of the squared norms minus the dot product of the given verse vectors. The definition is given in Equation (4.5).

S_J(\bar{v}_a, \bar{v}_b) = \frac{\bar{v}_a \cdot \bar{v}_b}{|\bar{v}_a|^2 + |\bar{v}_b|^2 - \bar{v}_a \cdot \bar{v}_b}    (4.5)

The Jaccard coefficient is a similarity measure ranging between 0 and 1: it is 1 when \bar{v}_a = \bar{v}_b, i.e., the verses are the same, and 0 when v_a and v_b are disjoint, i.e., the verses share no terms.

Pearson Correlation Coefficient

Pearson’s correlation coefficient is another measure of the extent to which two vectors are related. There are different forms of this coefficient; given the term set T = {t_1, t_2, ..., t_m}, a commonly used form is given in Equation (4.6).

S_P(\bar{v}_a, \bar{v}_b) = \frac{m \sum_{t=1}^{m} w_{t,a} w_{t,b} - TF_a \times TF_b}{\sqrt{\left(m \sum_{t=1}^{m} w_{t,a}^2 - TF_a^2\right)\left(m \sum_{t=1}^{m} w_{t,b}^2 - TF_b^2\right)}}    (4.6)

where TF_a = \sum_{t=1}^{m} w_{t,a} and TF_b = \sum_{t=1}^{m} w_{t,b}.

This is also a similarity measure; when \bar{v}_a = \bar{v}_b, its value is 1.
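A direct transcription of Equations (4.3)–(4.6) into Python might look as follows. Note that Euclidean is a distance rather than a similarity; how it is mapped onto the 0–1 scale of the Verse-Verse Similarity matrix is not prescribed in this chapter, so the conversion noted in the final comment is an assumption.

import math

def euclidean(a, b):        # Eq. (4.3) -- a distance, not a similarity
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine(a, b):           # Eq. (4.4)
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def jaccard(a, b):          # Eq. (4.5), the extended form for weighted vectors
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - dot
    return dot / denom if denom else 0.0

def pearson(a, b):          # Eq. (4.6)
    m = len(a)
    sa, sb = sum(a), sum(b)
    num = m * sum(x * y for x, y in zip(a, b)) - sa * sb
    den = math.sqrt((m * sum(x * x for x in a) - sa * sa)
                    * (m * sum(y * y for y in b) - sb * sb))
    return num / den if den else 0.0

# To place Euclidean on the same 0-1 scale as the others, a monotone
# conversion such as 1 / (1 + distance) may be applied (an assumption).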

4.2.5 Similarity Analysis

We devised the similarity classes shown in Table 4.2 for the purpose of analysis. We automated several validation measures for determining the results, and we use the Verse-Verse Similarity matrix for computing the similarity statistics reported in Section 4.3.

Table 4.2: Similarity Classes for Analysis

Class         Description                        Similarity Range
Identical     Identical verses                   similarity value = 1
High          Almost identical, near identical   similarity value > 0.9
Medium-High   Identifiable similarity            similarity value between 0.75 and 0.9
Low           Minimal semantic similarity        similarity value < 0.75
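A trivial helper expressing the thresholds of Table 4.2; the handling of exact boundary values (e.g., whether 0.75 is inclusive) is our assumption, since the table does not specify it.

def similarity_class(s):
    """Map a verse-pair similarity value to the classes of Table 4.2."""
    if s == 1.0:
        return "Identical"
    if s > 0.9:
        return "High"
    if s >= 0.75:
        return "Medium-High"
    return "Low"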

4.3 Experimentation and Results

4.3.1 Evaluation Measures

For the sake of quantification, we evaluate accuracy on the Identical similarity class using the evaluation measures shown in Equations (4.7)–(4.9).

P (Precision) = \frac{TP}{TP + FP}    (4.7)

R (Recall) = \frac{TP}{TP + FN}    (4.8)

F1 (F-Measure) = \frac{2 \times P \times R}{P + R}    (4.9)

For the sake of analysis, we limit our quantification to verse pairs with similarity value 1. We treat a verse pair that is truly identical and is assigned a similarity value of 1 as a true positive (TP); a pair assigned a similarity value of 1 that is not actually identical is a false positive (FP); and a truly identical pair whose similarity value is less than 1 is a false negative (FN). The reason for limiting this analysis is that establishing the precise similarity between two verses has not been done in a standardized manner, and there are no existing benchmark datasets available. The only available dataset, QurSim [167], employs a different approach: it classifies verse pairs according to their relatedness, using subjective human judgement, based on verse pairs compiled from the famous Qur’anic exegesis of Ibn Kathir. For our study, we rely on pure measures of distance, as described earlier in this section. Therefore, a similarity benchmark against which evaluation may be carried out cannot be established; the only verse pairs we are able to consider as ground truth are the identical ones.
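The evaluation just described can be sketched as follows, taking the set of truly identical verse pairs as the only ground truth; the function name and data layout are illustrative.

def evaluate_identical(sim, gold_identical):
    """sim: dict mapping a verse pair (i, j) to its similarity value.
    gold_identical: set of pairs known to be identical (the only ground truth)."""
    predicted = {pair for pair, value in sim.items() if value == 1.0}
    tp = len(predicted & gold_identical)
    fp = len(predicted - gold_identical)
    fn = len(gold_identical - predicted)
    p = tp / (tp + fp) if tp + fp else 0.0          # Eq. (4.7)
    r = tp / (tp + fn) if tp + fn else 0.0          # Eq. (4.8)
    f1 = 2 * p * r / (p + r) if p + r else 0.0      # Eq. (4.9)
    return p, r, f1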

4.3.2 Experiments

The experiments are applied to every combination of the four dataset representations, the four similarity measures and the three term-weighting methods. We therefore had 4 × 4 × 3 = 48 experiments in total. An abbreviation scheme is used to denote the experiments: each experiment name indicates the dataset used, the term weighting applied and the distance measure. For example, Q-DIAC-C-B indicates that Boolean (B) term weighting and the Cosine (C) similarity measure are used on the Q-DIAC dataset, which is the Qur’an representation in Arabic with diacritics (case markings). The experimental configurations are summarized in Table 4.3.
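The experiment grid and naming scheme can be enumerated mechanically, as in this small sketch (variable names are ours):

from itertools import product

DATASETS = ["Q-DIAC", "Q-NODIAC", "Q-ROOTS", "Q-ENG"]
SIMILARITIES = ["C", "E", "J", "P"]     # Cosine, Euclidean, Jaccard, Pearson
WEIGHTINGS = ["B", "TFIDF", "F"]        # Boolean, TF-IDF, frequency

# 4 datasets x 4 measures x 3 weightings = 48 experiments, e.g. "Q-DIAC-C-B"
experiments = ["-".join(combo) for combo in product(DATASETS, SIMILARITIES, WEIGHTINGS)]
assert len(experiments) == 48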

Table 4.3: Precision and Recall Measures

Q-NODIAC
      C-B    C-TFIDF  C-F    E-B    E-TFIDF  E-F    J-B    J-TFIDF  J-F    P-B    P-TFIDF  P-F
TP    775    775      775    775    775      775    775    775      775    775    775      775
FP    2      2        1      2      1        1      2      2        2      2      1        2
FN    0      0        0      0      0        0      0      0        0      0      0        0
P     0.997  0.997    0.999  0.997  0.999    0.999  0.997  0.997    0.997  0.997  0.999    0.997
R     1.000  1.000    1.000  1.000  1.000    1.000  1.000  1.000    1.000  1.000  1.000    1.000
F     0.999  0.999    0.999  0.999  0.999    0.999  0.999  0.999    0.999  0.999  0.999    0.999

Q-DIAC
      C-B    C-TFIDF  C-F    E-B    E-TFIDF  E-F    J-B    J-TFIDF  J-F    P-B    P-TFIDF  P-F
TP    737    737      737    737    737      737    737    737      737    737    737      737
FP    3      3        2      3      2        2      3      2        2      3      2        2
FN    38     38       38     38     38       38     38     38       38     38     38       38
P     0.996  0.996    0.997  0.996  0.997    0.997  0.996  0.997    0.997  0.996  0.997    0.997
R     0.951  0.951    0.951  0.951  0.951    0.951  0.951  0.951    0.951  0.951  0.951    0.951
F     0.973  0.973    0.974  0.973  0.974    0.974  0.973  0.974    0.974  0.973  0.974    0.974

Q-ENG
      C-B    C-TFIDF  C-F    E-B    E-TFIDF  E-F    J-B    J-TFIDF  J-F    P-B    P-TFIDF  P-F
TP    661    661      660    661    660      660    661    661      660    661    660      661
FP    12     12       10     12     10       10     12     12       10     12     10       12
FN    114    114      115    114    115      115    114    114      115    114    115      114
P     0.982  0.982    0.985  0.982  0.985    0.985  0.982  0.982    0.985  0.982  0.985    0.982
R     0.853  0.853    0.852  0.853  0.852    0.852  0.853  0.853    0.852  0.853  0.852    0.853
F     0.913  0.913    0.913  0.913  0.913    0.913  0.913  0.913    0.913  0.913  0.913    0.913

Q-ROOTS
      C-B    C-TFIDF  C-F    E-B    E-TFIDF  E-F    J-B    J-TFIDF  J-F    P-B    P-TFIDF  P-F
TP    738    738      738    775    775      775    738    738      738    738    738      738
FP    106    98       100    259    249      249    106    96       96     106    98       100
FN    37     37       37     0      0        0      37     37       37     37     37       37
P     0.874  0.883    0.881  0.750  0.757    0.757  0.874  0.885    0.885  0.874  0.883    0.881
R     0.952  0.952    0.952  1.000  1.000    1.000  0.952  0.952    0.952  0.952  0.952    0.952
F     0.912  0.916    0.915  0.857  0.862    0.862  0.912  0.917    0.917  0.912  0.916    0.915

4.3.3 Results and Analysis for Identical Verses

The results for the 48 experiments are shown in Table 4.3. We chose the F1 measure in order to provide a general measure of the performance of the similarity configurations. We highlight the performance of each dataset below and then provide an overall comparison:

Q-NODIAC Representation

The Q-NODIAC representation provides the most accurate results when dealing with the Identical similarity class, with perfect recall and near-perfect precision, and therefore an F1 measure close to 1. The best result is obtained using the C-F, E-TFIDF, E-F, and P-TFIDF combinations (indicated by a single verse pair classified as a false positive). All other combinations also perform almost as well; the difference is indicated by the number of false positives (FPs), in this case 2 verse pairs. The two FPs are the verse pairs [23:83, 27:68] and [22:62, 31:30]3. Looking into the reasons for misclassification, we notice that the former pair includes a word that occurs twice in verse 23:83 but only once in 27:68. This is correctly captured by some of the combinations but missed by others, such as those involving the Boolean term-weighting measure. The latter verse pair is a unique case in which both verses contain the same words but in a different order; it is perhaps the only one of its kind in the Qur’an.

3 23:83 indicates Chapter 23, Verse 83 in the Qur’an.

Q-DIAC Representation

The Q-DIAC representation performs considerably close to Q-NODIAC in terms of precision; however, its recall suffers due to the strikingly large number of verse pairs classified as false negatives (FNs). The verse pairs classified as FPs are the same as those in Q-NODIAC, apart from one additional verse pair, [30:52, 27:81]. The reason for this pair being misclassified is the difference in the orthography of the words, which is not normalized in the original dataset. By manual inspection we analyzed the verse pairs in the FN category and discovered that, although the verses are actually similar, the orthography of the words, preserved with the diacritics as per the original manuscript, causes them to appear different. For these experiments we preserved this in the original dataset. However, some of these orthographical variations may be removed without loss of any phonetic or linguistic character of the word; doing so, we expect the Q-DIAC results to closely follow those of the Q-NODIAC dataset.

Q-ENG Representation

The Q-ENG representation, amongst all the approaches which use the raw Qur’anic representation, performs with a relatively low recall of around 85%. There is very little difference between the methods, as indicated by the F1 measure, which comes out the same for all. This representation reports the greatest number of false negatives (FNs), as indicated in Table 4.3. This is a significant indicator, especially when it comes to the identical verses: an FN implies that a verse pair which was expected to be classified as identical fails to be. This has implications regarding the available translations, as it is clear that identical verses occurring at different places within the Qur’an are being translated differently. To verify this, we examined a few sample cases. For example, the verse وَلَقَدْ يَسَّرْنَا الْقُرْآنَ لِلذِّكْرِ فَهَلْ مِن مُّدَّكِرٍ is repeated in 54:17 and 54:22, yet its translation differs slightly (by one word) in the dataset we used for our experimentation. The precision of the translation is a subjective issue: in verses 54:17 and 54:22 the context may cause the initial particle وَ to be rendered as ’and’ or ’but’. In another translation, however, the two verses are translated identically. Analyzing the FPs, we discovered that in some cases the English translation is the same while the Arabic terms used are different. This is significant when analyzing similar verses. Arabic is a morphologically rich language, and the language of the Qur’an in particular is considered precise: the slightest variation or alteration in the arrangement or morphological manifestation of a word implies something significant, which the translation often fails to capture. For example, in Table 4.4 two verses are compared using the original text. The Arabic text clearly distinguishes سَاحِر (7:112) from سَحَّار (26:37); the two words carry different connotations, which is not captured by the English translation. These cases confirm this aspect of the translations available for the Qur’an, and this can be used as a good measure for assessing the quality of translations. Another case is that of verse 37:80 [إِنَّا كَذَٰلِكَ نَجْزِي الْمُحْسِنِينَ] and verse 37:110 [كَذَٰلِكَ نَجْزِي الْمُحْسِنِينَ]. The word إِنَّا, well known in Arabic as carrying a particle of strong emphasis, is present in the former verse but absent in the latter; the English translation does not capture this difference in emphasis, present in the Arabic speech, in the context of these two verses. Although for this study we used only one English translation of the Qur’an, our framework can prove useful in analyzing the quality of translations. An examination of these cases in other translations revealed translations that clearly distinguish between near-identical verses. This is a reflection of the quality of the deliberation and effort that has gone into the translation process, and it may thus be measured to some extent.

Table 4.4: Analysis of false positive case (Q-ENG)

Verse   Verse Text (English)                                Verse Text (Arabic)
7:112   And bring up to thee all (our) sorcerers well versed.   يَأْتُوكَ بِكُلِّ سَاحِرٍ عَلِيمٍ
26:37   And bring up to thee all (our) sorcerers well versed.   يَأْتُوكَ بِكُلِّ سَحَّارٍ عَلِيمٍ

Q-ROOTS Representation

For the Q-ROOTS dataset, recall is high but precision is quite low compared to the other datasets. In terms of the overall F1 measure, the Q-ROOTS and Q-ENG datasets perform close to one another; however, recall is much higher for Q-ROOTS than for Q-ENG. In terms of the similarity measures, the pattern is not as consistent as with the other datasets. The C, J and P similarity measures perform closely, whereas the Euclidean measure has a much lower F1 measure in comparison. Interestingly, the recall for Euclidean comes out as 1, while its precision is very low compared to the others. An investigation into this revealed some interesting results. An analysis of the FN cases showed that all 37 verse pairs are special verses occurring at the beginning of some chapters: they are compositions of letters which are said to have no known roots or meanings. The Q-ROOTS dataset therefore does not assign these terms any weight, leaving the corresponding verse vectors empty. The Euclidean measure nevertheless counts such pairs as equivalent, because the raw distance between two empty vectors is zero; this also explains the rather high number of FPs obtained with the Euclidean measure compared to the other three. We manually inspected the verse pairs in this category to verify this. If we discount those verses, the measures would report equivalent recall; a future improvement could disregard those verses, or make a special case for them in the other similarity measures, in which case all other methods would also report perfect recall. For the initial experimentation, however, we preserved this result, because it highlights the unique nature of those verses. Another interesting observation is that Q-ROOTS returns the highest number of verse pairs with similarity value 1, which is indicative of high semantic similarity captured at the root level.

Overall Comparative Analysis

In general, we can conclude that the Q-NODIAC representation provides the most accurate results when dealing with the high similarity class. In terms of the similarity methods used, the P-TFIDF configuration (the Pearson method with TFIDF as the weighting measure) and the C-TFIDF configuration (the cosine method with TFIDF as the weighting measure) perform closely and comparably. The analysis of the similarity class in which the verse pairs are identical is not sufficient to provide a reasonable estimate of the precision of the method that would perform best overall. We therefore also look into some of the verse pairs that fall in the next similarity range.

4.3.4 Results and Analysis for Verses with High Similarity

While the similarity methods perform similarly for the identical cases, this is not the case for the other similarity ranges, as shown in Figure 4.2. An investigation into the verse pairs retrieved with similarity values greater than 0.9 and less than 1 reveals the numbers shown in Figure 4.2. The Pearson method returns the largest number of verse pairs in that range. This is an important finding, as the results are meaningful: verse pairs with reasonably high similarity, which are expected to fall in this range, do not fall there when using Cosine, Jaccard or Euclidean. We manually inspected some verses which bear high semantic similarity. The Q-ROOTS dataset returned a large number of FPs, some of which are actually verses differing by only a single literal. When we looked into the similarity values of such pairs and compared these values across the different experimental configurations, we observed that for a highly similar verse pair the Pearson method returns the highest similarity values, whereas the values returned by the other methods are much lower. This pattern is reflected in Figure 4.2, which displays the number of verse pairs

Figure 4.2: Results for all representations (Similarity Range vs. No. of Verse Pairs)

in the similarity ranges from 0.75 to 1. As an example, we took the verse pair [94:5, 94:6], which differs by a single literal. For both Q-DIAC and Q-NODIAC, the values returned by Cosine, Jaccard and Euclidean are lower than 0.75, whereas the highest values returned by the Pearson method are above 0.95. Therefore it may be concluded that Pearson performs best. As for the weighting measures, TFIDF with Pearson returns the highest value. However, the performance of the term-weighting measures may not be generalized from these instances alone and would require further investigation.

4.4 Discussion and Analysis

Our study, as outlined in this chapter, set out to quantify similarity between text segments, an important step in authorship attribution, corpus comparison, information retrieval, text mining and other fields. We attempted to provide an essential foundation by providing a means to choose an effective text similarity measuring scheme, especially across cross-lingual texts of the same origin. We concluded that for the identical verses the best results are obtained using the Q-NODIAC dataset, with all similarity measures performing comparably well, while the Q-ROOTS dataset shows relatively low precision. Challenges in measuring text similarity vs. text relatedness: It is worth mentioning that the notions of both text similarity and text relatedness are of strong importance when studied in the context of the Qur’anic scripture. The vector space model (VSM), or bag-of-words approach, imposes certain limitations and has several implications: the order of words is ignored, which may be of value; the issue of synonymy is ignored; and polysemy is not addressed, because the VSM only considers word forms. These limitations have been highlighted in previous research such as [70]. However, we feel that the Qur’an is a text that deserves considerable attention when it comes to similarity at different levels. The results from this study provide sound grounds to build upon; we plan to enhance the proposed similarity measures with stem-based methods to support more refined similarity classes, and a future comparative study is planned to this end.

4.5 Conclusion

In this chapter, we investigated the application of various similarity measures to heterogeneous representations of the Qur’an. The results focused on an analysis of the identical verses, highlighting some interesting findings. As a natural next step, we aim to obtain highly reliable similarity measures for non-identical verses, especially in the higher similarity ranges, in order to establish ground truth for verse similarity in the Qur’an. We believe our study will prove to be a useful step in this direction. We aim to involve users and experts through established crowdsourcing approaches, and we hope to devise an improved, hybrid similarity measure for establishing a more precise estimate of semantic similarity. We also plan to extend our work to include other translations of the Qur’an and carry out a comparative analysis, as our study revealed findings that can help assess the quality of translations and help standardize the existing translations.

Chapter 5

Crowdsourcing Framework for Linked Data Management

If Allah should aid you, no one can overcome you; but if He should forsake you, who is there that can aid you after Him? And upon Allah let the believers rely. — [Al-Quran, Ale-Imran,3:160]

Chapter Overview1

1 This chapter is published as: Basharat, A., Arpinar, I. B., Dastgheib, S., Kursuncu, U., Kochut, K., and Dogdu, E. “Semantically Enriched Task and Workflow Automation in Crowdsourcing for Linked Data Management”, International Journal of Semantic Computing, Vol. 8, No. 4 (2014) 1-25.

This chapter presents the design and development of a semantically enriched task management and workflow generation mechanism applied to ontology verification, more specifically in the domain of linked data. Verification of ontologies, particularly in the domain of linked data, is a challenging and complicated task: it requires human expert knowledge on the one hand and is time consuming for humans on the other. The rationale for our proposition is that semantics and automated workflows will yield better workflow management for crowd-based tasks and can thereby help solve more complex computational problems, especially in big data sets such as the LOD. We devise a task management, workflow specification and automation mechanism that aims to make this process more efficient. This framework forms the foundation of the crowdsourcing module utilized in the case studies presented in Chapter 6.

5.1 Introduction

This research focuses on crowdsourcing, which employs a large number of human beings rather than, or in addition to, software or algorithms as information processing units, since humans outperform software in certain information-centric tasks, ranging from labeling or segmenting images to handling more complex tasks such as finding the same entities in different data sets. Crowdsourcing is task-oriented, and hence the specification and management of tasks play a critical role. We propose the design and development of a semantically enriched task management and workflow generation mechanism applied to the domain of ontology verification, more specifically in the domain of Linked Open Data (LOD) [36, 47]. Verification of ontologies, particularly in the domain of the LOD, is a challenging and complicated task: it requires human expert knowledge on the one hand and is time consuming for humans on the other. The rationale for our proposition is that semantics and automated workflows will yield better workflow management for crowd-based tasks and can thereby help solve more complex computational problems, especially in big data sets such as the LOD. We devise a task management, workflow specification and automation mechanism that aims to make this process more efficient. The work presented in this chapter is an extension of our recent work [23].

5.1.1 Research Context

Recently, crowdsourcing has emerged as a major data collection and problem-solving paradigm on the web. Many well-known applications and tools, including Wikipedia, YouTube and Flickr, employ various crowdsourcing techniques. The database community also shows increasing interest in this growing field: Berkeley’s CrowdDB [68], Stanford’s sCOOP [151], and MIT’s Qurk [129] are some examples of crowdsourcing-based query processing systems. Many crowdsourcing applications are built using private or public platforms, of which Amazon Mechanical Turk (AMT) [1] and CrowdFlower [11] are the most prominent. In our review of the crowdsourcing literature, we have observed the need for better means of automatically generating task-based workflows and their data and control-flow interdependencies. In addition, there is currently very limited support for well-enriched and semantically enhanced definitions of tasks and worker profiles. We use this as our motivation by contextualizing the need for semantically enriched task and workflow automation to the domain of LOD management.

In a parallel stream to crowdsourcing, Linked Data garners significant attention from academia and industry [36, 47]. There has been recent and tangible growth in RDF and OWL triples published on the (semantic) web in accordance with the Linked Data principles and best practices [93, 104]. In 2007, the W3C LOD project began publishing legacy web corpora under Linked Data principles. This resulted in rich datasets, most prominently the DBpedia corpus extracted from semi-structured Wikipedia articles. Thereafter, Linked Data adoption spread to various corporate entities, with the BBC, Thomson Reuters, and the New York Times joining the effort and exposing information as Linked Data. More recently, various governmental agencies have begun disseminating public corpora as Linked Data, beginning with commitments from the US and UK governments and spreading to various other governmental bodies. As of September 2011, the LOD included 295 data sets consisting of over 31 billion RDF triples, interlinked by around 504 million RDF links.

Despite the LOD’s sheer size, [93] and [94] report varying levels of quality in LOD data sets. In our opinion, the most important quality issue is the accuracy of links within and across the LOD data sets: if some of these links are inaccurate or missing, the main objective of the LOD is defeated. For example, the owl:sameAs relation is used to relate two co-referent resources that describe the same real-world entity. In [79], the authors take an initial set of 58 million owl:sameAs triples extracted from 1,202 LOD domains and employ four human judges to manually inspect 500 links sampled from the full corpus. Their experiments found that only approximately 51% of owl:sameAs relations were deemed correct. [93] reports a more optimistic result in terms of the accuracy of owl:sameAs links in the LOD.

5.1.2 Motivating Examples

Our motivation for this work is two-fold: (i) enabling more systematic task and workflow specification and management using (semantic) workflows, and (ii) providing more efficient means of Linked Data management for the LOD using crowdsourcing. Task management and specification is a critical component currently lacking in crowdsourcing platforms such as Amazon Mechanical Turk (AMT). For instance, using AMT to devise a task for the verification of links in the LOD is neither efficient nor scalable. Hence, we are motivated to investigate how crowdsourcing may be semantically enriched to make the process of task management more efficient, in particular for tasks pertinent to data management on the LOD.

Linked Data acquisition and verification is a challenging task which requires human knowledge, and the massive growth of the LOD has made this process even harder. For example, verifying the screenwriters of movies in a movie ontology from the LOD involves human knowledge in the domain of movies and the human capability to distinguish and disambiguate entities. We believe that such verification tasks can greatly benefit from the idea of crowdsourcing. Ideally, a machine-based algorithm would identify any cases of entity ambiguity; however, only a human can verify the correctness of some disambiguations based on the available data and on background knowledge. It is an extremely difficult task to check every triple out of the billions of triples in the LOD cloud. High-value links within a specific domain of particular interest need special attention; e.g., mutations causing certain diseases in the biomedical domain would need verification by domain experts. We outline a few more motivating examples below as a potential basis for our investigation.

Geographical Locations

As an initial example motivating our research, consider the following geographical relation in the LOD: the city of Paris is referenced in a number of different LOD data sets, ranging from OpenCyc to the New York Times. [79] finds that (1) dbpedia:Paris is asserted to be sameAs both (2) cyc:CityOfParisFrance and (3) cyc:ParisDepartmentFrance (and five other URIs). Yet OpenCyc explicitly states that (1) and (3) are distinct. In our opinion, crowdsourcing empowered with human-machine workflows can facilitate verification of the accuracy of such geographic links (and beyond), in addition to the incorporation of new owl:sameAs links. In this way, the LOD can achieve its full potential as the core of the semantic web. The LOD also has the potential to accelerate the widespread adoption and creation of the semantic web, and to become a meta-index for the traditional web (e.g., web documents).

Publications

As another example, consider two large-scale data sets on Computer Science publications, namely DBLP and CiteSeer, which are in the LOD cloud. The true potential of these two data sets can only be achieved if they are properly linked through common authors, publications, etc. A fully automated approach (such as Silk [198]) can cross-link the same entities with reasonable accuracy unless the reconciliation decisions are very complex. Consider an author named "F.S. Arpinar" in DBLP and "Sena Nural" in CiteSeer. Do they represent the same computer scientist? Their names look very different syntactically, their homepages and e-mails are not listed in both data sets, and their papers and co-authors have no commonalities. In this situation, a crowdworker can do better than a fully automated approach by analyzing surrounding entities in the LOD and other web sources to capture the affiliations, co-authors, etc. of these two persons. The worker can also use background knowledge (if relevant) to recognize that both names are Turkish and that Sena is a female name. This may provide the insight that the last names might correspond to names before and after a marriage. In fact, these two names belong to the same person, and in our opinion only a human can offer a conclusive reconciliation decision in this case.

Movies

If you search for the movie "Ice Age" on the web, you will not end up with a single result. A German movie "Ice Age" was made in Germany in 1975 by "Peter Zadek", while the "Ice Age" computer-animated series has been produced since 2002 by different directors. For example, "Chris Wedge" directed, and "Carlos Saldanha" co-directed, the first one in 2002, and later "Saldanha" made "Ice Age: The Meltdown" in 2006. We want to verify that every "Ice Age" movie is associated with the correct director in the Linked Movie Data Base (LinkedMDB) [22], an open semantic web database for movies.

5.1.3 Main contributions

We believe that our work is the first to promote the use of automated tasks and workflows in crowdsourcing for the purpose of linked data management. We propose a generic and reusable mechanism for crowdsourced task specification, generation, publishing, reviews and updates for the purpose of linked data creation and verification [Section 5.2]. The components of our task and workflow management are generic enough to be adopted for other task classifications with minor changes to the workflow. Further, we have introduced the notion of ontology-based task specifications and worker profiles for the purpose of improved task selection and workflow generation [Section 5.2.4]. Workflow management also benefits from our approach, since traditional workflow systems usually omit web-scale crowdsourcing tasks and their inclusion as computational units (i.e., services) in complex workflows. In addition, our approach facilitates improving link quality (i.e., triples) in the LOD by employing crowdworkers for link addition (e.g., addition of owl:sameAs links) and verification (e.g., checking the accuracy and correctness of triples surrounding a particular resource) [Section 5.3]. We also provide insights into how our approach compares and contrasts with existing systems and approaches [Section 5.4]. Finally, we share some envisioned extensions to our current work, particularly in the context of how crowdsourcing may benefit the domains of query processing and text mining [Section 5.5].

5.2 Crowd-based Task and Workflow Management Architecture for the LOD

Fig. 5.1 shows the high-level architecture of our crowdsourcing-based task specification and workflow management engine, which facilitates the building of LOD management workflows by ontology experts and knowledge engineers.

5.2.1 Architecture Overview

The complete execution of a task lifecycle using any particular crowdsourcing system involves several general stages: (i) a task is defined, with input parameters and the needed data; (ii) the task is published over the crowdsourcing platform; (iii) the task is searched for or discovered by the crowdworkers (via search or direct notification); (iv) the crowdworkers perform and submit the task(s); (v) the submitted tasks then go through a review process; (vi) the submissions are either accepted or rejected; (vii) the accepted

Figure 5.1: High-level Architecture of the Semantics-Enriched Workflow and Task Management Engine

submissions are further processed for retrieving and synthesizing relevant results. We have devised the key component of our architecture as the task specification and management workflow engine, which in turn consists of several components detailed in the subsections below. The most significant of these are (1) the Task Publishing workflow manager and (2) the Task Reviewing workflow manager. These two sub-components correspond to the life-cycle stages of the tasks performed via a crowdsourcing system. Fig. 5.1 also shows the different stages and components involved in the task management and workflow engine in detail.

Figure 5.2: The Process of Task Specification

5.2.2 Task Generation and Publishing Workflow

The key objective achieved through this engine is the ability to generate dynamic tasks and worker workflows for any provided ontology or LOD data source, based on rules specified by the ontology expert. The process of task specification is shown in Fig. 5.2. The specification process results in the creation of two elements that play a key role in the execution of the task life cycle: (1) a task specification, and (2) a query file. The task specification is the definition that includes all the essential parameters needed for successful completion; more details are included in Section 5.2.4. A sample task specification is shown in Table 5.1. The tasks are classified according to various task parameters such as Task Eligibility, User Type, TaskType, Task ID, Date of Creation, Date of Modification, Created By, Modified By, System Verification Status, Review Status, Review Approval Status and Reviewed By. Note that only the most essential parameters are specified in Table 5.1.

Figure 5.3: The Process of Task Generation and Publishing

We created a task generator, which dynamically generates tasks based on the provided task specifications, resulting in the creation of all the necessary elements (primarily the .input, .question and .properties files), which are then used to publish AMT-compatible HITs (Human Intelligence Tasks) dynamically and seamlessly on the AMT platform using the task publishing workflow executer. The process of task generation is detailed in Fig. 5.3. We provide the task generator with task specifications that have been created from pre-existing task profiles annotated by a human expert. Along with the task specifications, certain question templates are also provided. The question templates correspond to those supported by the AMT platform; however, we customize those templates to include parameters supporting the automated generation of the question files needed for publishing HITs on AMT. One of the key elements specified in the task specification is the data source. There is an optional query component to the task specification in which a SPARQL

Table 5.1: A Sample Task Specification (Review Type)

Task Profile Parameter   Parameter Value
TaskID                   T0001
TaskName                 ReviewMovieDirector
TaskType                 Review
TaskDescription          Please review if the movie director name mentioned below is correct for the given movie
TaskQueryTemplate        ./queries/query2.txt
TaskQueryLimit           10
TaskUpdateTemplate       Null
TaskDataSourceType       endpoint
TaskDataSource           http://data.linkedmdb.org/sparql

query may be attached to specify the precise elements of the LOD to be verified. The data to be fed as input to the task may be queried directly from a live LOD source using a SPARQL endpoint. We elaborate on this using the movie example from Section 5.1.2. In our movie example, we have used the Linked Movie Data Base (LinkedMDB), a movie data source on the LOD with a SPARQL endpoint (http://www.linkedmdb.org/snorql/). The query shown in Listing 5.1 is formulated to verify whether the movie director name is correct for a given movie released on a given date. A reference to this query file is specified in the task specification. When this is provided to the Task Generator, along with the query file, the data file needed for processing HITs on AMT is automatically generated using the ontology query engine and the file generator included in the Task Generator module. Each task to be published over AMT has an associated data file (.input file). The resulting data file contains the data obtained by querying the ontology or the live LOD source specified in the task specification. Table 5.2 shows the data file generated using the query shown in Listing 5.1. The number of data variables (number of rows) in the data file corresponds to the limit specified in the query file; it may therefore be controlled by altering that parameter. The number of HITs generated on AMT for each task group depends on this number. For the purpose of our experimentation, we kept this number at 10 in all cases. Along with the data files, the task generator automatically creates other files essential for task execution over AMT. These include the property files for task parameter specification over AMT (e.g., task title, amount of the reward for completing the task, assignment duration, number of assignments, etc.), and the question file in XML format, which includes the final question (task) template supplied to the crowdworkers via AMT. As seen in Fig. 5.6, we ask not only for the crowdworkers’ verifications but also for their confidence level. A snippet of the parameters in the property file corresponding to the task specification in Table 5.1 is shown in Table 5.3. Note that the Serial serves as a key that helps identify the different HITs belonging to each task group (a task group refers to many HITs of the same task). The values presented here are for the sake of illustration; the actual encoding format for both the task specifications and the property file differs. The task scheduler manages the publishing and review of tasks and takes care of task dependencies. The task publisher uses the AMT API and the required data,

Table 5.2: Sample Task Data Input Generated from the Query Processor

Serial (HITid)   MovieTitle                        MovieDirector
1                "My Wrongs 8245 - 8249 and 117"   "Chris Morris"
2                "Four Lions"                      "Chris Morris"
3                "The Greatest Show on Earth"      "Cecil B. DeMille"
4                "The Cheat"                       "Cecil B. DeMille"
5                "The Ten Commandments"            "Cecil B. DeMille"
6                "Samson and Delilah"              "Cecil B. DeMille"
7                "Reap the Wild Wind"              "Cecil B. DeMille"
8                "King of Kings"                   "Cecil B. DeMille"
9                "The Squaw Man"                   "Cecil B. DeMille"
10               "Male and Female"                 "Cecil B. DeMille"

question and property files generated by the task generator to publish the tasks as HITs on AMT. A success file is generated if this process succeeds; this file includes all the HIT identifiers, which are supplied to the crowdworkers. The HITs are also discoverable over the AMT search. However, for the purpose of evaluation, we supplied these accessible HIT links to the prospective crowdworkers.
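As a rough illustration of the file generator’s role, the sketch below emits a .properties file whose keys mirror Table 5.3. The exact key names and encoding used by the actual system differ, as noted above, so this layout is an assumption.

def write_properties(spec, path):
    """Write a .properties file for one task group. The keys mirror
    Table 5.3; the actual encoding used by the system differs."""
    keys = ["annotation", "title", "keywords", "reward",
            "hitlifetime", "description", "assignments", "assignmentduration"]
    with open(path, "w", encoding="utf-8") as f:
        for key in keys:
            f.write("{}: {}\n".format(key, spec[key]))

write_properties({
    "annotation": "${Serial}",
    "title": "ReviewMovieDirector",
    "keywords": "Review, Movies, Director",
    "reward": "0.00",
    "hitlifetime": 25920000,       # duration the HIT remains active on AMT
    "description": "Review if the movie director name is correct for the given movie",
    "assignments": 20,
    "assignmentduration": 240000,  # time allowed to perform a HIT
}, "ReviewMovieDirector.properties")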

Listing 5.1: Query Attached with Task Specification

SELECT ?MovieTitle ?MovieDirector ?MovieDate
WHERE {
    ?movie_uri rdf:type movie:film;
               dc:title ?MovieTitle;
               dc:date ?MovieDate;
               movie:director ?director.
    ?director movie:director_name ?MovieDirector.
}
LIMIT 10

Table 5.3: Sample Task Assignment Properties

Assignment Property   Property Value
Annotation (HITid)    ${Serial}
Title                 ReviewMovieDirector
Keywords              Review, Movies, Director
Reward                0.00
HitLifeTime           25920000 [the time duration for which the HIT remains active on AMT]
Description           Please review if the movie director name mentioned below is correct for the given movie
Assignments           20
AssignmentDuration    240000 [the time allowed to perform a HIT]
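A sketch of how the query in Listing 5.1 could be executed against the LinkedMDB endpoint (the endpoint URL is taken from Table 5.1) to produce the tab-separated .input file. The choice of the SPARQLWrapper library is ours, not the dissertation’s, and the PREFIX declarations with their namespace URIs are our assumptions, since the listing omits them.

import csv
from SPARQLWrapper import SPARQLWrapper, JSON  # assumed library choice

# Listing 5.1 with explicit PREFIX declarations added (URIs assumed).
QUERY = """
PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dc:    <http://purl.org/dc/terms/>
PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
SELECT ?MovieTitle ?MovieDirector ?MovieDate
WHERE {
    ?movie_uri rdf:type movie:film ;
               dc:title ?MovieTitle ;
               dc:date ?MovieDate ;
               movie:director ?director .
    ?director movie:director_name ?MovieDirector .
}
LIMIT 10
"""

endpoint = SPARQLWrapper("http://data.linkedmdb.org/sparql")
endpoint.setQuery(QUERY)
endpoint.setReturnFormat(JSON)
results = endpoint.query().convert()

# Emit the tab-separated .input file consumed by the HIT publisher.
with open("ReviewMovieDirector.input", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f, delimiter="\t")
    writer.writerow(["Serial", "MovieTitle", "MovieDirector"])
    for serial, row in enumerate(results["results"]["bindings"], start=1):
        writer.writerow([serial, row["MovieTitle"]["value"], row["MovieDirector"]["value"]])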

5.2.3 Task Reviewing and Results Processing

Once the tasks have been submitted by the crowdworkers, the results are retrieved and processed, and the retrieved results are updated into the ontology (if needed). Fig. 5.4 shows the process of task reviewing and retrieval. The task scheduler once again schedules the review workflow. The review processor module is responsible for the analysis of HIT result submissions. The key decision made during this stage is whether to accept or reject a submission by the crowdworkers: submissions may either be accepted by default or rejected because they are missing or vague. Using the HIT success files from the task generation workflow,

HIT results are retrieved and analyzed. For the initial evaluation, all HIT submissions from crowdworkers are accepted by default and no worker qualifications are enforced, since the experimentation is intended for a closed worker environment over the AMT Sandbox. Once the results are retrieved, they are processed using the Result Processor. Since the same tasks are assigned to multiple workers, multiple submissions are possible. A confidence-weighted measure based on the majority of votes is used to decide the final outcome for each task. Tasks for which the minimum confidence threshold is not met are withheld for further review or another round of evaluation. For the approved tasks, the results are provided to the ontology population engine, which uses the SPARQL Update protocol to update the ontology, as shown in Fig. 5.5. For consistency and verification, a separate snapshot of the ontology is maintained. As part of the update process, certain changes may also be reflected in the task profiles or task specifications. This includes updating status parameters such as the task completion status, so that any subsequent task in a task sequence may be triggered or incomplete tasks may be rescheduled. Status also needs to be maintained for those elements of the LOD which have been reviewed.
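The text does not spell out the exact aggregation rule beyond a confidence-weighted majority; one plausible reading is sketched below, where each worker’s vote is weighted by the stated confidence and the task is withheld if the winner’s share falls below a threshold.

from collections import defaultdict

def decide(submissions, threshold=0.5):
    """submissions: list of (answer, confidence) pairs from the crowdworkers.
    Returns the confidence-weighted majority answer, or None if the winner's
    share does not reach the threshold (the task is then withheld for review)."""
    votes = defaultdict(float)
    for answer, confidence in submissions:
        votes[answer] += confidence
    total = sum(votes.values())
    if not total:
        return None
    winner = max(votes, key=votes.get)
    return winner if votes[winner] / total >= threshold else None

# e.g. decide([("yes", 0.9), ("yes", 0.6), ("no", 0.8)]) -> "yes"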

5.2.4 Ontology-based Task Profiles

One of the key aspects of the CrowdLink platform is the provision of ontological repositories for maintaining user and task profiles, their classifications and assignments. Task and user classification is carried out using rules and reasoning based on the knowledge defined in ontologies, for better workflow modeling and task recommendation, coupled with notifications and recommendations to workers upon

Figure 5.4: The Process of Task Review and Results Retrieval

Figure 5.5: The Process of Updating the Ontology from the Processed Results

relevant task and workflow availability. In order to create dynamic workflows, we have created an ontology-driven mechanism for task specification. Tasks are broadly classified as verification and creation tasks, where the former refers to tasks that require the crowdworkers to verify the correctness of existing data in the LOD. For example, crowdworkers need to decide if the director of the "Ice Age" movie made in 1975 is "Carlos Saldanha" (which is not correct). The creation tasks refer to tasks which require the crowdworkers to enter new knowledge; for example, entering the correct director name (e.g., "Peter Zadek") or adding the location in which a given movie was made (e.g., "Germany"). Task specifications are created at different levels of granularity, and more task specifications can be made available. A task profile repository (shown in Fig. 5.1) includes generic schema definitions (based on OWL/RDF ontologies) from which task specifications are instantiated. Task specifications are instances that become input to the task generation engine. The tasks are classified according to some key task categories in the domain of LOD management, including, but not limited to:

• Knowledge acquisition (acquiring missing facts and information),

• Knowledge verification (for correctness, consistency, and relevance – verify provenance information or any other verification parameter),

• Ontology evolution (accommodating structural changes and schema-level changes), and

• Entity disambiguation (identifying ambiguous links).

Our system provides templates for a given task description or information request. Tasks are not limited to simple identification assignments, but extend to more complex assignments such as entity disambiguation. Currently, we provide specifications for both instance-level and schema-level tasks. Further details of some of the tasks used in the experiments for the purpose of evaluation are presented in the Evaluation Section (Section 5.3.1). Task specifications can either be generated in an automated manner based on predefined task templates, or new task specifications can be created by ontology experts. In addition, a task specification can be annotated with relevant user profiles that may include task-specific requirements such as the worker qualification, expertise or experience required to perform the task.

5.2.5 Cooperative Workflows

We have also provided the provision for cooperative workflows, which require different tasks to be performed in succession by machines and humans; tasks such as entity disambiguation represent this class. For acquiring new owl:sameAs links between entities, tools such as Silk [198] can be used to identify candidate entities for potential owl:sameAs links. We can define a threshold for identifying those links, discovered by external link discovery tools such as Silk, that may need verification from the crowdworkers; we then present these to the crowd and verify the results. Another example of a cooperative workflow is one where there is missing data in the LOD and the crowd is engaged in a new knowledge creation task to fill in this missing data. Once this goes through the task management lifecycle and is updated into the source, it becomes a candidate for verification. Such a workflow may be iteratively defined using the available task profiles, whereby the respective task specifications are instantiated from these profiles at the appropriate stages. The definition of such cooperative workflows requires either task sequences or dependency rules to be defined. Tasks may be published in parallel if they have no dependencies on other tasks; those which require the completion of other tasks have a dependency rule stated in their task specification. The role of the task scheduler is to resolve these dependencies and publish the relevant tasks at the relevant times, and the task expiry determines when to evaluate the status of completed tasks. Any such rules may be defined in the rule library, which is maintained separately from the task profiles.
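A minimal sketch of how the task scheduler could resolve such dependencies using a topological order; the task names and the use of Python’s graphlib are illustrative assumptions, not the system’s implementation.

from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical dependency rules: task -> the set of tasks it depends on.
dependencies = {
    "CreateMissingDirector": set(),
    "VerifyCreatedDirector": {"CreateMissingDirector"},
    "UpdateOntology":        {"VerifyCreatedDirector"},
}

sorter = TopologicalSorter(dependencies)
sorter.prepare()
while sorter.is_active():
    ready = list(sorter.get_ready())   # tasks whose dependencies are complete
    # Independent tasks in `ready` can be published in parallel; each is
    # marked done once its results have been reviewed and accepted.
    for task in ready:
        sorter.done(task)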

5.3 Experimental Evaluation

In order to assess the feasibility of our approach, we have conducted an experimental evaluation using AMT. Note that this experimentation is limited to triple creation and verification tasks in the LOD.

5.3.1 Details of Experimentation

Task Classifications used for Evaluation

In our experiments, we classify HITs into two broad categories and then into finer subcategories, as follows. All HITs belong to either the verification or the creation category: verification tasks use crowdsourcing for the verification of existing triples in the LOD, while creation tasks facilitate introducing new knowledge (i.e., new triples) into the LOD. Verification tasks can be further classified as (1) property verification, (2) entity verification, (3) schema-level verification, and (4) owl:sameAs verification. In these tasks, workers check the validity of an existing property connecting two entities, or they check whether one or both entities are valid for a given property in the LOD. In addition to entity-level verification, concepts and properties at the schema level can be scrutinized by human computation for their validity as well. Also, the use of owl:sameAs enriches the LOD by declaratively supporting distributed semantic data integration at the instance level; primarily, it links the same real-world entities that have different identifiers within a single data set or across different data sets. As mentioned earlier, empirical studies conducted on owl:sameAs report various irregularities in its usage in the LOD [55, 79, 93]. Therefore, we consider verification of owl:sameAs relations an important task in our experimental study. Creation tasks in the LOD can be classified similarly: (1) property creation, (2) entity creation, (3) schema-level creation, and (4) owl:sameAs creation. In this category, crowdworkers can introduce new properties and entities as well as schema-level elements in their HITs. Again, we treat owl:sameAs as a separate category because of its importance in cross-linking various data sets in the LOD. An example of the creation tasks involves disambiguating entities across multiple data sets in the LOD and establishing new owl:sameAs relations between these entities. Fig. 5.6 illustrates the disambiguation of two researchers from the ACM and DBLP data sets, respectively. Note that, since we are not authorized to introduce new triples into the existing data sets in the LOD, the new triples are stored externally as a temporary

Figure 5.6: A Human Intelligent Task (HIT) in AMT

In addition, we define tasks in both categories as open tasks, which means that the crowdworkers are allowed to use their own background knowledge as well as (reliable) web-based resources for triple verification and creation. Note that the workers also indicate their confidence level in completing assigned tasks, as illustrated in Fig. 5.6. These confidence scores are used for consolidation of worker responses at a later stage. If conflicts (e.g., a tie in a verification task) cannot be resolved automatically, they can be forwarded to a superior level of workers who have better expertise and experience in the creation or verification tasks in a particular domain.
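For illustration, the sketch below shows how such a HIT could be published programmatically against the AMT Sandbox using the present-day boto3 MTurk client (our system used the AMT web service API available at the time); the title, reward, and question file are placeholders.

import boto3

# Sandbox endpoint: no payments are required, matching our evaluation setup.
mturk = boto3.client(
    "mturk",
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
    region_name="us-east-1",
)

# question.xml is assumed to hold a QuestionForm asking the worker to verify
# an owl:sameAs link and to report a confidence level (Very Low .. Very High).
with open("question.xml") as f:
    question_xml = f.read()

hit = mturk.create_hit(
    Title="Verify an owl:sameAs link between two entities",
    Description="Check whether two URIs denote the same real-world entity.",
    Keywords="linked data, verification, sameAs",
    Reward="0.05",
    MaxAssignments=5,                 # several workers answer the same task
    AssignmentDurationInSeconds=600,
    LifetimeInSeconds=86400,
    Question=question_xml,
)
print(hit["HIT"]["HITId"])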

Results

In our evaluation, we had 100 sample queries (for Groups 1 – 10, as shown in Table 5.4), which involve mainly knowledge acquisition and verification tasks for datasets from the US Census data and the Linked Movie Database. Table 5.4 also lists the various task groups, where each group contains ten individual tasks. We recruited ten graduate students across the UGA campus for completing these tasks (except task group #11 on entity disambiguation). It is worth mentioning here that crowdsourcing generally involves a large group of possibly unknown and even untrustworthy humans over the web, often using incentives such as micropayments or "fun". However, for the purpose of early experimentation and evaluation we restricted our user group to a selected number of participants only. Ideally, the approach should scale to a large number of crowdworkers, which we aim to do in future evaluations. Note that we have used the AMT Sandbox, which does not require any payments, for this evaluation. The number of responses collected during this evaluation is also given in Table 5.4.

5.3.2 Measure of Quality

In order to evaluate the effectiveness of our methods, we use the traditional metrics of precision and recall. More specifically, we utilize the definitions of these metrics and adapt them for triple creation and verification tasks. We compare, for each triple, the crowd-provided subject, object or predicate against the Gold Standard (GS) (or the ground truth), which provides matching/non-matching information for each triple. Specifically, we compute (P)recision and (R)ecall, which are defined

Table 5.4: Eleven Task Groups Used in the Experimental Evaluation (G: Group; P: Participants; R: Responses)

G#   HIT Group                #P   #R    #Responses/Participant
1    ReviewRiverNames         10    19    1.900
2    AcquireRiverNames         9    69    7.667
3    ReviewRiverLength         7    61    8.714
4    AcquireRiverLength        7    51    7.286
5    ReviewCounties            6    60   10.000
6    GetCountySchools          5    33    6.600
7    ReviewMovieInfoLink       4    33    8.250
8    ReviewMovieDirector       7    68    9.714
9    ReviewHeadquarterName     5    47    9.400
10   GetStateNameforRiver      6    53    8.833
11   EntityDisambiguation     20   127    6.350

as follows. We consider as true positives (tp) all cases where both the GS and the crowdworkers select a triple element (i.e., subject, object or predicate) (correct result). False positives (fp) are the cases where the crowdworkers select an element which is not considered correct by the GS (unexpected result), and false negatives (fn) are the cases where the crowdworkers do not select an element that is correct in the GS (missing result). Then, Precision is defined as P = tp/(tp+fp), and Recall as R = tp/(tp+fn). Fig. 5.7 illustrates the precision and recall for the eleven task groups listed in Table 5.4. Note that each group contains ten tasks (100 tasks in total, excluding the EntityDisambiguation group), and precision and recall are averaged over the tasks in each group. The precision is generally good, above 80% in four task groups. The recall is also above 80% in all task groups. However, the precision is low in certain task groups such as #2 (32.5%), which asks for the

Figure 5.7: Evaluation Results for Eleven Task Groups

number of public high schools in a particular county. The number of schools can be listed slightly differently on the county web page and elsewhere, and most likely the crowdworkers had difficulty judging the most up-to-date information. Note that no other existing systems are exactly comparable to our proposed system, as CrowdLink encompasses the following two primary aspects of linked data management: (i) linked data creation, and (ii) linked data verification. CROWDMAP [163], a system that focuses only on the entity disambiguation aspect of linked data verification, was published recently. We plan to conduct a comparative study of CrowdLink and CROWDMAP on the entity disambiguation aspect in the near future.
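The per-group metrics above reduce to a few set operations; a minimal sketch, assuming crowd answers and Gold Standard answers are available as sets, follows.

def precision_recall(responses, gold):
    """Compute precision and recall for one task, per the definitions above.
    responses: set of triple elements selected by the crowd
    gold:      set of triple elements in the Gold Standard"""
    tp = len(responses & gold)   # correct results
    fp = len(responses - gold)   # unexpected results
    fn = len(gold - responses)   # missing results
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# A GS with several correct answers (e.g. "TX, CA") lowers recall when the
# crowd supplies only one of them.
print(precision_recall({"CA"}, {"TX", "CA"}))  # -> (1.0, 0.5)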

5.3.3 Analysis

After execution of the experiments, the collected result files contain various information about tasks such as HITid, title, deadline of the task, reviewStatus, workerid, answers, and confidence scores. The HITid allows the system to provide a task to a worker only once. Since the AMT is a platform open to the public, people can search for a job that is suitable to their qualifications. For this reason, each answer needs to be reviewed and approved to be considered a valid answer, to ensure it is supplied by a qualified person with a confidence attribute. Workerid is another important attribute, because data processing filters answers based on workerid for analysis purposes. A portion of the result files generated automatically by our system is presented in Table 5.5, which contains the attributes mentioned above and the Gold Standard for each task, added at the analysis stage. Ten sample answers for different tasks are presented in the table, and HITid and Workerid uniquely identify each response, with its confidence level, in a result file. In Table 5.5, we observe that the Gold Standard may have more than one correct answer, such as "TX, CA" for the first task, while most of the others have only one. Therefore, there is always a possibility that workers will not be able to give all the correct answers in the Gold Standard. There is also a likelihood that for some tasks no one will be able to give the correct answer, particularly the tasks with only one correct answer, and those situations give us false negatives. False negatives affect the recall value, reducing it. In our experiments, the recall values of most of the tasks

Table 5.5: Attributes and Answers in a Result File for the GetStateNameforRiver Task

HITid   Workerid   Confidence   Answer         Gold Standard (GS)
id1     id1        High         Texas          TX, CA
id2     id2        Very High    Ohio           California
id3     id3        Very High    California     California
id4     id4        High         California     California
id5     id5        Very High    California     California
id6     id6        High         Pennsylvania   PA, MD, CA
id7     id7        Neutral      California     California
id8     id8        High         Oregon         Oregon
id9     id9        Very High    California     California
id10    id10       Very High    California     California

are 1, and the others are above 80%, which shows that our approach of accomplishing LOD tasks using crowdworkers has provided promising results. Based upon the answers collected for the set of ten task groups and the entity disambiguation task, we see a consensus among workers on the answer(s) in the Gold Standard. From Fig. 5.7 above, we can see the precision values for each task group; only two of them are below 60% and four of them are above 80%, indicating that workers have a common ground in their answers. For the task groups with lower precision, we believe that these tasks were accomplished by workers who did not have confidence in their answers. Therefore, the precision values, including those currently low, should improve further once we account for the confidence level of the worker in the computation.
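One way to fold the workers' self-reported confidence into the consolidation, as suggested above, is sketched below; the five-point numeric mapping is an assumption consistent with the levels used in our HITs.

from collections import defaultdict

# Assumed mapping of self-reported confidence levels to numeric weights.
CONFIDENCE = {"Very Low": 1, "Low": 2, "Neutral": 3, "High": 4, "Very High": 5}

def consolidate(submissions):
    """Pick the answer with the highest confidence-weighted vote mass.
    submissions: list of (answer, confidence_label) pairs for one task."""
    scores = defaultdict(float)
    for answer, confidence in submissions:
        scores[answer] += CONFIDENCE[confidence]
    return max(scores, key=scores.get)

# Example in the spirit of Table 5.5: a lone dissenting answer is outvoted.
subs = [("California", "Very High"), ("California", "High"), ("Ohio", "Neutral")]
print(consolidate(subs))  # -> California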

5.3.4 Confidence-based Analysis for Entity Disambiguation Tasks

In the 11th task group, we published several entity disambiguation tasks from the LOD domain. Our objective was to reconcile the same real-world entities, which can be represented through different identifiers (i.e., URIs) not only within a single data set, but also across multiple data sets. For experimentation, we used the ACM and DBLP data sets, which contain large-scale data about researchers, their publications and their co-authors. We restricted ourselves to the domain of Computer Science. A sample set of candidate entities extracted from the ACM and DBLP datasets is given in Table 5.6. Due to space constraints, the corresponding entity, publication and co-author URIs cannot be depicted in the table. In this experimentation, 20 Computer Science undergraduate students were asked to identify if the given pairs of candidate entities represented the same researcher. Their confidence scores were also solicited as part of their responses using the AMT. The AMT HITs also contained the URIs of the surrounding entities, such as co-authors and publications in the LOD. Also, the tasks were designed in a manner that allowed multiple crowdworkers to submit assignments for the same task. We utilized this multitude of responses, weighted by their confidence scores, to determine an overall measure of accuracy for the disambiguation tasks. As indicated in Fig. 5.7, the overall precision and recall are comparable to the other task groups in our experimentation. In general, the more difficult queries would be expected to have lower precision. An interesting result is obtained when we

grouped the results by TaskIDs and WorkerIDs along with not only the obtained precision and recall values, but also the average confidence measure reported by the workers, as shown in Fig. 5.8 and Fig. 5.9. The confidence measure was quantified by mapping the range (0–1) to levels Very Low (0) to Very High (1). The overall confidence measure was averaged over all responses. We had 126 submissions from 20 participants for this entity disambiguation group alone. Once we group these submissions by their TaskIDs, as shown in Fig. 5.8, a strong correlation can be observed between the precision and the confidence values. The lowest precision comes from the first task (TaskID = 1), in which the crowdworkers were asked if the "E. Rodney Canfield" entity in the DBLP dataset actually represents the same person as the "Rod Canfield" entity, also in DBLP. Almost half of the responses were not correct. This is possibly because both entities were lacking other relations like university affiliations, co-workers, and publications, while most of the other tasks did have those relations in the original data provided to the workers. On the other hand, participants provided answers with a precision of 0.757 for a task about resolving the ambiguity between the "Krys J. Kochut" entity in ACM and "Andrzej Kochut" in DBLP. We believe that a reason for this high precision value is that the relations for both entities were provided, making it easier to determine whether the two entities were representing the same person or not. Another analysis is conducted by grouping the data according to WorkerIDs, as shown in Fig. 5.9. It is interesting to note that almost 70% of the workers have precision values equal to or more than 70%, which is very promising indeed. Once

Figure 5.8: Confidence-based Analysis for Entity Disambiguation Tasks (Grouped by TaskID)

again, the overall correlation between the task precision and the worker confidence measure is obvious. A deeper analysis of the submissions of those workers with lower precision suggests one of two explanations: (1) the number of submissions by these workers is considerably lower than those of others, or (2) the confidence level indicated by these workers is not high. Also, the submissions with low precision tend to belong to those cases, such as "Walter D. Potter" in ACM and "Walter Potter" in DBLP, where additional relations for the entities were not provided to the workers. As mentioned before, the workers were therefore possibly not able to find out whether those entities represent the same person or not. This is indicated by the correlation between the precision and the confidence measure. In most cases, the confidence measure is indicative of the precision. However, some workers (such as 2, 15 and 19) have lower precision despite indicating high

Figure 5.9: Confidence-based Analysis for Entity Disambiguation Tasks (Grouped by WorkerID)

Table 5.6: Candidate Entities from ACM and DBLP for Entity Disambiguation.

Candidate Entity 1 (ACM)   Candidate Entity 2 (DBLP)
John A. Miller             John Miller
Krys J. Kochut             Andrzej Kochut
Walter D. Potter           Walter Potter
Amit P. Sheth              Amit Sheth
Lakshmi Ramaswamy          Lakshmish Ramaswamy
E. Rodney Canfield         Rod Canfield
Michael A. Covington       Michael Covington
Hamid R. Arabnia           H. R. Arabnia

confidence.

5.3.5 Discussion (Outcome and Results)

Through the work presented in this chapter, we have achieved the design and development of a fully functional semantically enriched task management and workflow generation engine geared to provide LOD management tasks through

crowdsourcing using the AMT. We have accomplished the automatic definition of tasks, the corresponding publishing of HITs using the AMT web service API, the handling of their interdependencies, and their workflow automation and integration. At the heart of the architecture are ontological specifications and vocabularies (i.e., task profiles) to enable the definition of these workflows. In order to conduct proof-of-concept testing, sample workflows from different data sets in the LOD are used; US Census data and the Linked Movie Database are used to validate the functionality of the task engine. Different data access formats are supported, such as a file or a live SPARQL endpoint. The generation of a completely generic workflow is validated, which can support any number of SPARQL variables and any number of HITs. The workflow engine is also suitable for designing human-machine cooperative workflows for tasks such as entity disambiguation. An important question is whether the sheer number of RDF triples in the LOD can all be reviewed and verified by crowdworkers. Obviously this would require a huge investment in terms of human effort. A more reasonable path is a hybrid approach, which utilizes a combination of Machine and Human Intelligent Tasks in verification workflows. Only mission-critical links that are below a predetermined confidence threshold need to be scrutinized by the crowdworkers. What constitutes a mission-critical link and what is a good confidence threshold are still open research problems. For example, we consider owl:sameAs relations, which cross-link multiple data sets, as mission-critical links in the LOD. We plan to conduct more experimentation using this hybrid approach in the future. Furthermore, certain verification tasks can be more suitable for human judgement.

Typically, links that originate from a multitude of complex information can be better verified using human intellect. This is partly due to the inability of automated information fusion techniques to gather relevant information from disparate information sources and glean new knowledge. Consider establishing new owl:sameAs links between researchers in the ACM and DBLP data sets. This type of entity disambiguation task requires verifying not only surrounding entities, such as the co-authors and publications of candidate entities in the LOD, but also their affiliations, friends, etc. from other web-based resources. This type of complex verification task seems to be more suitable for crowdworkers. However, simpler verification tasks, such as verifying the number of public schools in a particular county, can be easily handled by other tools, which can employ rather simple data extraction techniques. Note that we still use some of these simple tasks in our evaluation, since our research focus here is not on distinguishing tasks that are more suitable for crowdsourcing. High-value (i.e., mission-critical) links, which can be inspected by CrowdLink, also exist in biomedical ontologies. For example, LinkedLifeData is a platform for semantic data integration of many biomedical sources, including UniProt, PubMed, and EntrezGene. The platform uses an extension of the RDF model that is able to track the provenance of each individual fact in the repository and update the information. It is also possible to execute SPARQL queries on these RDF resources. CrowdLink can be used in the verification of LinkedLifeData's mission-critical links, many of which use UniProt accession/entry identifiers. These identifiers are also used to cross-link various biomedical data sets in LinkedLifeData

and the LOD. A possible experiment could focus on the identification of "bad links" between the mentioned biomedical ontologies. VoID is an RDF Schema vocabulary for expressing metadata about RDF data sets. One type of such metadata is the information about links among different RDF data sets, such as the ones in the LOD [10]. In many instances, a SPARQL endpoint for querying VoID is also provided. It can facilitate querying the LOD as one big data set instead of using individual SPARQL endpoints. CrowdLink can also use VoID descriptions for link verification, and can also contribute in terms of crowd-based link creation using VoID.
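As a sketch of this last idea, the query below pulls owl:sameAs linksets out of a VoID description so that each can be turned into verification HITs; the endpoint URL is a placeholder, and only standard VoID terms are assumed.

from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "http://example.org/void/sparql"  # placeholder VoID endpoint

QUERY = """
PREFIX void: <http://rdfs.org/ns/void#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
SELECT ?linkset ?source ?target ?count WHERE {
  ?linkset a void:Linkset ;
           void:linkPredicate owl:sameAs ;
           void:subjectsTarget ?source ;
           void:objectsTarget ?target ;
           void:triples ?count .
}
"""

sparql = SPARQLWrapper(ENDPOINT)
sparql.setQuery(QUERY)
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    # Each owl:sameAs linkset is a candidate source of verification HITs.
    print(row["linkset"]["value"], row["count"]["value"])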

5.4 Comparison with Related Works

As stated earlier, crowdsourcing has been garnering increasing attention in the research community. A survey of the crowdsourcing literature, categorized according to applications, algorithms, performance and datasets, can be found in [214]. Another survey, focusing on the challenges of how to recruit workers, what they can do, how to combine their contributions, and how to manage abuse, is presented in [57]. There have been a number of other studies that look at various performance metrics and statistics of crowdsourcing systems [92]. Some recent research prototypes use crowdsourcing for data management and query processing [120, 170]. For example, Berkeley's CrowdDB tries to answer SQL queries using the AMT [68]. Qurk also uses the AMT for relational query processing and presents various query execution and optimization challenges, and discusses

potential solutions [129]. The sCOOP project views a crowdsourcing service as another database, where facts are computed by human processors [151]. It treats this service at the same level as extensional data, and processes a declarative query that combines information from both. A system closer to ours facilitates query processing in the LOD [175]. However, as mentioned before, our system differs from these systems through the use of semantics, and through the use of well-defined workflows for an improved human computation experience. Task management and specification, as well as task interdependencies, are not well accounted for in the existing crowdsourcing systems. Moreover, there is no provision for workflow modeling and automation in these systems. One of the most distinctive features of our system is that it provides an adaptive mechanism for LOD management tasks through dynamic provisioning of task dependencies and dependency workflows. On a related note, [48] emphasizes the idea of crowdservicing and the inclusion of machine tasks in addition to human tasks. This idea shows some parallels with our work on human-machine workflows. In terms of infrastructure, the AMT [1] is among the leading platforms for implementing crowdsourcing-based applications on the web. It provides the infrastructure, connectivity and payment mechanisms that enable crowdworkers to perform paid work to answer queries. Although we have used the AMT to test our workflow processing, it is worth mentioning that the AMT has some limitations. The AMT does not address human-machine collaboration; only human subjects can be employed for query processing. In addition, the AMT only allows simple task descriptions through its API. For example, one cannot describe a complex crowdsourcing

workflow to the system. Although we only mention the deficiencies of the AMT here, other similar systems such as CrowdFlower and Microtask suffer from the same or similar limitations [48, 214]. [11] also summarizes the limitations of current crowdsourcing platforms. Furthermore, limited existing work investigates how to support workers in selecting tasks on crowdsourcing platforms easily and effectively. One approach utilizes the past task preferences and performance of a worker to produce a list of available tasks ordered by best match with the worker at task selection time [215]. Amongst the most recent works in crowdsourcing, ZenCrowd identifies entities from natural language text and automatically connects them to the LOD cloud [50]. It also uses crowdsourcing to improve the quality of the links by dynamically generating micro-tasks on the AMT. In a similar study, CrowdER studies the problem of entity resolution using crowdsourcing. It describes how machine-only approaches fall short on quality, while crowdsourcing-only approaches are too slow and expensive. Thus, it proposes to use a hybrid human-machine workflow on the AMT to address this problem [207]. In another research effort similar to ours, CROWDMAP proposes a model to acquire human contributions via microtask crowdsourcing for ontology alignment [163]. In a parallel stream, biocuration, the activity of organizing, representing and making biological information accessible to both humans and computers, has become an essential part of biological discovery and biomedical research [96]. Extracting, tagging with controlled vocabularies, and representing data from the literature are some of the most important and time-consuming tasks in biocuration.

Typically, biocurators read the full text of articles and transfer the essential information reported in them into a database. Our project can contribute to these biocuration efforts: since the LOD contains various biological ontologies, our approach can help in the generation of new knowledge in terms of RDF or OWL triples from the biological literature, such as PubMed documents.

5.5 Beyond CrowdLink: Using Crowdsourcing in the LOD

5.5.1 Crowdsourcing and Query Processing

In the context of the CrowdLink project, we envision studying beyond the link (i.e., triple) creation and verification aspects and investigating the querying of the LOD using crowdsourcing. In this regard, we plan to extend our approach to support real-time querying on the LOD, where measuring response time will be relevant. In this way, users will be able to compose SPARQL queries on the LOD, and receive responses composed of both LOD data sets and answers from the crowdworkers. For this purpose, the term CrowdSPARQL can be coined. In a complementary approach, some parts of the LOD can be designated as crowd triples, or parts of triples (i.e., subject, object or predicate). In this case, only the portions of the queries that deal with the crowd triples or the triple parts will require crowdsourcing, since they either require human computation or may not be readily available in the LOD. Furthermore, crowd querying is typically done using natural language question answering.

This may require the development of a mechanism for mapping SPARQL queries to natural language queries and vice versa.
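As a rough illustration of the CrowdSPARQL idea, the sketch below partitions the triple patterns of a query into machine-answerable patterns and patterns that touch designated crowd triples; the predicate list is purely hypothetical.

# Hypothetical predicates whose values are designated as crowd triples.
CROWD_PREDICATES = {"ex:numberOfSchools", "ex:riverLength"}

def partition_patterns(triple_patterns):
    """Split a query's triple patterns into LOD-answerable and crowd parts."""
    machine, crowd = [], []
    for s, p, o in triple_patterns:
        (crowd if p in CROWD_PREDICATES else machine).append((s, p, o))
    return machine, crowd

patterns = [("?county", "rdfs:label", "?name"),
            ("?county", "ex:numberOfSchools", "?n")]
machine, crowd = partition_patterns(patterns)
# Machine parts go to the SPARQL endpoint; crowd parts become HITs whose
# answers are joined back into the overall query result.
print(machine, crowd)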

5.5.2 Crowdsourcing and Text-mining

Many tasks involving text mining are still best accomplished by humans, and we believe that crowdsourcing can leverage collective human abilities in such tasks. As an example, let us consider text-mining tasks in the biomedical literature. Researchers in this area enjoy access to the full text of articles published in numerous journals with online access. PubMed Central, created by the U.S. National Institutes of Health's National Library of Medicine, currently offers over 3.1 million full-text scientific articles, all free of charge. This vast volume of biomedical knowledge has spurred rapid progress in text-mining activities, including the extraction of entities and events, as well as relationships among them. For example, text mining has been used to extract information about genes and their synonyms, proteins, mutations, and protein-protein interactions, and for discovering associations between diseases and genes, drug discovery, or discovering pathway networks directly from the literature. In all of these text-mining tasks, the best systems have been able to achieve a moderate level of accuracy in terms of precision and recall. Nevertheless, processing millions of full-text articles may extract a vast amount of data potentially of interest to biomedical researchers. However, scientists typically access information from well-established and trusted sources, such as GenBank [33], UniProt [12], Reactome [106], or Diseasome [74], and rely on the accuracy of the data. Consequently, including automatically extracted data in such databases would

require a time-consuming curation process (many biomedical databases curate their data). Crowdsourcing could speed up the curation of data automatically extracted by various text-mining tasks. Of course, such curation activities would require biomedical experts, and if the highest-level researchers are not readily available, we could rely on a large number of well-trained graduate students. The benefits of a text-mining system coupled with a crowdsourcing platform would be felt immediately.

5.6 Conclusions

In conclusion, our work presents a contribution to linked data management using crowdsourcing. Our system is set apart from other existing systems primarily through the use of semantics-based tasks and workflows for an improved human computation experience. We demonstrate the usefulness of our approach by providing means for automatically generating and managing task workflows pertaining to various LOD link addition, verification and entity disambiguation tasks on the AMT. Experiments reveal that our volunteer workers perform the assigned tasks with high accuracy. We aim to extend our work with more rigorous experimentation and evaluation using a larger number of (potentially paid) crowdworkers and tasks for LOD creation and verification.

Chapter 6

A Semantic Framework for Thematic Annotations of the Qur’an Leveraging Crowds and Experts

And I did not create the jinn and mankind except to worship Me. — [Al-Quran, Adh-Dhaariyaat, 51:56]

Chapter Overview

In this chapter, we investigate an effective methodology that utilizes the right balance of human and machine computation. We augment traditional ontology engineering approaches with human contribution through established crowdsourcing and human computation methods. We have designed and developed a semantics driven framework for obtaining thematic annotations from historical and religious knowledge sources. Our framework proposes the use of expertise driven task and crowd profiles for crowdsourcing ontology engineering tasks at varying levels of granularity and knowledge intensiveness. A semantics based task specification model allows for composing automated knowledge acquisition workflows that combine machine computation with crowdsourced annotations at two levels. At the lower level, simple and less knowledge intensive tasks are crowdsourced using the Amazon Mechanical Turk platform. At the higher level, for more knowledge intensive tasks, skilled workers and experts are engaged through a custom web application. We demonstrate the effectiveness of this hybrid model for several key knowledge engineering tasks, such as thematic disambiguation and thematic annotation, in the Qur'an.

This chapter is partially published as: Basharat, A., Rasheed, K., Arpinar, I.B., "Harnessing Crowds and Experts for Semantic Annotation of the Qur'an", Proceedings of the ISWC 2016 Posters & Demonstrations Track co-located with the 15th International Semantic Web Conference (ISWC 2016), CEUR-WS, Vol-1690, p. 96; and Basharat, A., Arpinar, I.B., Rasheed, K., "Leveraging Crowdsourcing for the Thematic Annotation of the Qur'an", in Proceedings of the 25th International Conference Companion on World Wide Web, pp. 13-14, International World Wide Web Conferences Steering Committee, 2016.

6.1 Introduction

Efforts towards semantic annotation and knowledge engineering in specialized and knowledge intensive domains continue to present challenges to semantic web researchers. The Qur'an is one of the most widely read and studied books. Its original script is in the Arabic language, which is rich in both its morphology and semantics. Thematic annotation of religious texts, in particular the classical sources of knowledge in the Islamic domain in the Arabic language, has not received much attention, partly owing to the time and knowledge constraints required for such an annotation process. As part of this research, we undertook the task of thematic disambiguation and annotation (as part of formal and standardized knowledge modelling and ontology engineering activities) of the Qur'anic verses by augmenting traditional information extraction and text mining techniques with crowdsourcing methods. The need for crowdsourcing stems from the fact that for Qur'anic knowledge, a high level of accuracy and reliability is desired, given the sensitivity of the knowledge at hand. Pure computational approaches fail to meet this standard and the contribution of human experts is indispensable. However, finding experts is greatly time consuming, and this makes the process of obtaining semantic annotations non-scalable. Motivated by the success of human computation and crowdsourcing methods, we therefore attempt to investigate the usefulness of these approaches for knowledge intensive and specialized domains, such as the one encompassed by the Qur'an.

The novel aspect of this research is that we augment the traditional model of crowdsourcing with the application of specialized human computation methods such as nichesourcing ([49], [149]) in an attempt to scale this process of annotation while maintaining reliability and ensuring quality control and validation. Nichesourcing, or expertsourcing, extends the idea of engaging skilled and knowledgeable persons in place of faceless crowds for human driven tasks. We employ nichesourcing as a means of augmenting traditional crowdsourcing methods rather than as an alternative. In this chapter, we present our methodology for the implementation of a hybrid approach engaging crowds and experts, and present the results of our study focussing on two knowledge intensive tasks: the thematic disambiguation and annotation of Qur'anic verses using the Arabic script. We determined that several tasks are rather knowledge intensive and require domain expertise. In this case, knowledge of Qur'anic Arabic is considered imperative, since the annotation of the Arabic verses requires understanding the context of a given Arabic word in the verse.

6.1.1 Research Questions

Emerging semantic web research, realizing the importance of human intelligence and contribution, has been leveraging various models of human computation and crowdsourcing to effectively overcome the inherent knowledge acquisition bottleneck in the large scale engineering of semantic knowledge. However, an effective crowdsourcing methodology that utilizes the right balance of human and machine

computation, while engaging the appropriate crowd, especially for knowledge intensive tasks in specialized domains, is still a matter that demands further insight and investigation. To address this, we have designed and developed a semantics driven framework for obtaining thematic annotations from historical and religious knowledge sources. The proposed framework is validated through contextualized application to the domain of Islamic and religious texts, where the overall vision is to provide means to enable efficient and reliable knowledge discovery for religious knowledge seekers, for a more meaningful knowledge seeking and learning experience. Some of the key research questions that will be addressed as part of the process are as follows:

RQ-1: What is the amenability of crowdsourcing ontology engineering tasks in knowledge intensive domains? Can the tasks of thematic disambiguation, semantic annotation and thematic interlinking be reliably crowdsourced in specialized and knowledge intensive domains such as Islamic knowledge?

RQ-2: Is there a significant performance gap between experts and crowd workers when performing such tasks?

RQ-3: Can the contributions from crowd workers and experts be reliably and efficiently combined for the purpose of knowledge engineering, using semantics driven, hybrid and iterative workflows? What methodological considerations would enable such an effective synergy?

170 6.1.2 Contributions

Our work is the first of its kind that leverages semantic annotations for generating an iterative workflow of machine computation, crowd workers and experts for the purpose of ontology engineering tasks. We introduce a conceptual methodology for generating task and workflow specifications, in the form of semantic annotations, for a given ontology schema. This is done for two tasks, namely thematic disambiguation and thematic annotation in the Qur'an. We also present the design and implementation of a hybrid framework to harness crowd and expert contributions, illustrating the execution of iterative workflows. We evaluate our methodology and framework based on their application to these two task types.

6.2 Problem Definition: Thematic Annotation in the Qur'an

As part of this research, we aimed to augment the available thematic hierarchies of the Qur'an to include annotations for the various themes at the sub-verse level, a level deeper than what existing thematic hierarchies provide. The motivation for this comes from the diversity of thematic coverage that Qur'anic verses provide at an individual and collective level. The Qur'anic verses span different lengths; while some verses may be as short as a word or a few words, others may span half a page or

an entire page of a standard-sized book. This inspires the need for increasing the level of granularity at which the thematic assertions for each verse are classified. We take our initial hierarchy from the QuranyTopics data source2, which contains a hierarchy of themes hand-crafted from a classical source. This is one of the only data sources that provides an authentic, concept driven, thematic classification of the Qur'anic verses. However, not only is the thematic classification in this resource limited to the verse level, its coverage of the concepts is also not exhaustive. As part of this research, not only do we propose sub-verse level annotation, we also make the distinction between explicit and implicit assertions of a theme, as shown by a segment of the ontology schema designed for obtaining thematic annotations in Figure 6.1. Explicit assertions are amenable to being obtained through text mining and NLP techniques, for example, through the direct occurrence of a word or a phrase that directly indicates the manifestation of a particular theme. However, even when a word or a phrase explicitly appears in a verse, in the Arabic language, it may manifest different meanings in different contextual settings. When modelling such thematic assertions at the sub-verse level, disambiguation by a human expert becomes indispensable. To add to the challenge, themes may also exhibit themselves implicitly, whereby the same theme may be manifested using not just a mere difference of word morphology, but rather a difference in expression or rhetoric. It is extremely difficult to extract such implicit thematic assertions via automated techniques; therefore, it becomes imperative that human contribution be sought. Take for example the concept of God. In the Qur'an, various names are

2http://quranytopics.appspot.com

Figure 6.1: A segment of the thematic annotation knowledge model populated via the crowd

given to God (referred to as Allah in the Qur'an), based on His attributes and qualities. One such name given to Him is Al-Baseer (The All-Seeing). This phrase explicitly occurs in the Qur'an several times; however, in each occurrence it may or may not refer to Allah. Therefore, it becomes imperative that even the seemingly explicit assertions of a particular theme be disambiguated as positive or negative occurrences and be annotated accordingly.

Figure 6.2: Hybrid Architecture for Harnessing Crowd and Expert Annotations

6.3 Hybrid Architecture for Harnessing Crowd and Expert Annotations

We designed and implemented a hybrid workflow architecture that connects a crowdsourcing framework with an expertsourcing application, as shown in Figure 6.2.

6.3.1 Crowdsourcing Stage

We design a task management engine that is responsible for generating tasks, and for retrieving and aggregating results. The tasks are published on the Amazon

Mechanical Turk (AMT)3 platform. A complete workflow management system is implemented (a derivative of a workflow model for Linked Data Management presented in [28]), which includes means for generating dynamic tasks from a range of task profiles. An ontology schema, such as the one given in Figure 6.1, guides the semantic annotation process. This serves as input to the Task Generation and Design stage, which creates an annotation or disambiguation task to be crowdsourced based on the nature of the entity, the relation, or both, as specified in the task specification. The relevant task input is generated by retrieving relevant candidate verses from the available data sources, such as the Semantic Qur'an [168] dataset or quranontology4.
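A minimal sketch of this retrieval step is given below, querying a Semantic Qur'an style endpoint for verses containing a candidate phrase; the endpoint URL and the property name are illustrative assumptions, not the dataset's actual vocabulary.

from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "http://example.org/semanticquran/sparql"  # placeholder endpoint

def candidate_verses(phrase):
    """Retrieve verses whose Arabic text contains a candidate phrase."""
    sparql = SPARQLWrapper(ENDPOINT)
    sparql.setQuery(f"""
        PREFIX ex: <http://example.org/quran#>
        SELECT ?verse ?text WHERE {{
          ?verse ex:arabicText ?text .          # hypothetical property
          FILTER(CONTAINS(STR(?text), "{phrase}"))
        }}""")
    sparql.setReturnFormat(JSON)
    rows = sparql.query().convert()["results"]["bindings"]
    return [(r["verse"]["value"], r["text"]["value"]) for r in rows]

# Each returned verse becomes the input of a disambiguation task asking
# whether the highlighted phrase is a true occurrence of the given theme.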

6.3.2 Task and Workflow Design

We introduce two key workflows and use cases in the context of which the system has been developed and tested. The AMT crowd performs the thematic disambiguation and annotation tasks. Both tasks are based on the Arabic script of the Qur'an, therefore requiring the crowd workers to be familiar with the Arabic language of the Qur'an.

Thematic Disambiguation Task

In the Arabic language, there may be many words, or derivatives of words, that originate from a different 'root', yet exhibit the same meaning. It is also possible for the same word to have contrastingly different meanings depending on the context it occurs in.

3http://www.mturk.com
4http://quranontology.com

Figure 6.3: An Example of Learner tasks introduced in the learning scenarios (Thematic Disambiguation)

Therefore, given the existence of a particular phrase or word in a verse, it needs to be disambiguated whether the word refers to the presented theme or not. A sample task of this nature is shown in Figure 6.3. Figure 6.4 shows what a typical disambiguation task workflow in the domain looks like. For the disambiguation task, a question is presented to the crowd, which includes a verse along with a highlighted candidate explicit assertion for the given theme; the crowd responds by declaring this assertion either positive or negative, determining whether the occurrence is a true occurrence of the given theme.

Figure 6.4: Human-Machine Computation Workflow for Disambiguation Task

Thematic Annotation Task

This task is used to annotate portions of a verse or a narration that may implicitly refer to a given theme. Figure 6.6 shows what a typical annotation task workflow in the domain looks like. The annotation tasks require deeper knowledge and understanding of the Arabic text. The crowd determines whether the given verse contains any implicit reference to the given theme. If their response is positive, they are also required to provide the portion of the verse (a meaningful phrase or a word) that implies the presence of the theme. A sample template for this is shown in Figure 6.5.

6.3.3 Expertsourcing Stage

For this purpose, we designed a custom web application to engage with experts. A REST API connects the crowdsourcing task management engine with the expertsourcing application. The tasks are sent to the remote application, and experts are notified when the tasks become available. The experts also see the candidate

Figure 6.5: An Example of Learner task (Thematic Annotation)

responses collected from the crowdsourcing stage. The experts have the option either to choose from the available annotations (collected during the crowdsourcing stage) or to provide their own if they do not agree with any of them. An example is shown in Figure 6.7. We present the same task to three experts to analyze the annotation agreements. The approved and validated annotations are passed on for ontology population and linked with existing data sources.
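The handoff between the two stages can be pictured as a single REST call; the endpoint and payload fields below are hypothetical stand-ins for our internal API.

import requests

EXPERT_API = "https://example.org/expertsourcing/api/tasks"  # placeholder

def send_for_expert_review(task_id, verse_uri, theme, crowd_annotations):
    """Post a reviewable task, with the crowd's candidate responses attached,
    so that up to three experts can confirm one or supply their own."""
    payload = {
        "taskId": task_id,
        "verse": verse_uri,
        "theme": theme,
        "candidateAnnotations": crowd_annotations,
        "requiredExperts": 3,
    }
    response = requests.post(EXPERT_API, json=payload, timeout=10)
    response.raise_for_status()
    return response.json()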

6.3.4 Decision Analytics

We collect and aggregate the responses based on statistical measures of aggregation. Weighted confidence measures and thresholds are applied. Based on this aggregation, the completed tasks are marked as either Approved or Reviewable.

Figure 6.6: Human-Machine Computation Workflow for Annotation Task

A high confidence and aggregation threshold is applied for the approved tasks. This process of decision analytics results in identifying the candidate tasks for the expertsourcing stage. The tasks marked as reviewable, i.e., those which fail to meet the agreement thresholds, are sent off for expert annotation. The following mechanisms are employed in the decision analytics stage:

Agreement Analytics for the Crowdsourcing Stage

Figure 6.8 shows the decision analytics process implemented for approving the tasks that reach agreement based on the crowd submissions. Table 6.1 shows the notation used in the process. Each task may be performed by up to 5 crowd workers. The Inter-Annotator Agreement is calculated using the decision analytic algorithm illustrated in Figure

6.8. N_A denotes the number of annotations or task submissions obtained for a given task. If this number is more than the minimum response threshold, T_MIN, then the response agreement, C_RA, is calculated for each possible response for the

Table 6.1: Notation used in the Decision Analytics Process

Notation         Meaning
T_RA             Response Agreement Threshold
T_AA             Annotation Agreement Threshold
T_MIN            Minimum Response Threshold
T_C              Confidence Threshold
T_UA             Unique Annotation Threshold
RA-Approved      Task approved by Response Agreement
AA-Approved      Task approved by both response and annotation agreement
nRA-Reviewable   Task marked as Reviewable with no Response Agreement
nAA-Reviewable   Task reached response agreement but not annotation agreement
LC-Reviewable    Task marked as Reviewable with response agreement but low confidence
MA-Reviewable    Task marked as Reviewable with response agreement and annotation agreement but with more than the expected number of unique annotations

Figure 6.7: Task Design for Thematic Annotation of Qur'anic verses

given task.

C_{RA} = \sum_{i=1}^{m} R_i \qquad (6.1)

where R_i denotes the i-th response for the given task. So for the two task types, disambiguation and annotation, there are two possible responses, Yes or

No. C_RA is calculated for each of these responses. If either of these exceeds the response agreement threshold, T_RA, then the response confidence measure, C_RC, is calculated. If neither value of C_RA meets T_RA, the desired agreement has not been reached and the task is marked as Reviewable.

Figure 6.8: Decision Model used for Approving the Tasks based on Crowd Submissions and marking the ones to be sent to the Experts as Reviewable

C_RC is calculated as follows:

C_{RC} = \frac{\sum_{i=1}^{m} S(C_i)}{m} \qquad (6.2)

Equation 6.2 gives the average confidence indicated by the crowdworkers in the responses that they provide. S(C_i) represents the function that assigns a numeric scale to the confidence category selected by the worker. For instance, S(Very High) is assigned the numeric value 5, and so on.

If C_RC is less than the confidence threshold, T_C, then the task is marked as Reviewable.

In the case of the disambiguation task, the analytic cycle ends here. The task is marked Approved if C_RC meets T_C.

In the case of the annotation task, the same happens if the agreed response, R_A, is

No. However, if R_A is Yes, then an annotation must be provided by the crowd worker, and the inter-annotation agreement, C_AA, must be determined. This is calculated by creating an m×m matrix, M, where m is the number of annotations obtained.

The similarity between two annotation vectors, a_i and a_j, denoted M_{i,j}(ā_i, ā_j), is measured using the Jaccard coefficient. The Jaccard coefficient measures similarity as the intersection divided by the union of the entities. For the annotation vectors, the Jaccard coefficient computes the ratio between the dot product and the sum of the squared norms minus the dot product of the given annotation vectors. The definition is given in Equation (6.3), similar to Equation 4.5.

M_{i,j}(\bar{a}_i, \bar{a}_j) = \frac{\bar{a}_i \cdot \bar{a}_j}{|\bar{a}_i|^2 + |\bar{a}_j|^2 - \bar{a}_i \cdot \bar{a}_j} \qquad (6.3)

The Jaccard coefficient is a similarity measure and ranges between 0 and 1.

It is 1 when a_i = a_j, meaning the annotations are the same, and 0 when a_i and a_j are disjoint, meaning the annotations are completely different.

C_{AA} = \max\left(\sum_{i=1}^{m} M_{i,j}\right) \qquad (6.4)

If C_AA meets the annotation agreement threshold, T_AA, then the task is marked as Approved; otherwise it is marked as Reviewable.
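Putting Equations 6.1 through 6.4 together, a condensed Python sketch of the decision procedure follows; the default thresholds echo Table 6.2 (they vary by task type), the data layout is illustrative, and annotations are treated as token sets so that the Jaccard coefficient takes its set form.

CONFIDENCE_SCALE = {"Very Low": 1, "Low": 2, "Neutral": 3, "High": 4, "Very High": 5}

def jaccard(a, b):
    """Jaccard similarity between two annotation token sets (Eq. 6.3)."""
    return len(a & b) / len(a | b) if a | b else 1.0

def decide(submissions, t_min=5, t_ra=4, t_c=3, t_aa=3):
    """Return 'Approved' or 'Reviewable' for one task.
    submissions: list of (response, confidence_label, annotation_tokens)."""
    if len(submissions) < t_min:
        return "Incomplete"
    for resp in {s[0] for s in submissions}:
        agreeing = [s for s in submissions if s[0] == resp]
        if len(agreeing) < t_ra:                  # C_RA below T_RA (Eq. 6.1)
            continue
        c_rc = sum(CONFIDENCE_SCALE[c] for _, c, _ in agreeing) / len(agreeing)
        if c_rc < t_c:                            # C_RC below T_C (Eq. 6.2)
            return "Reviewable"
        if resp == "No":                          # no annotation is required
            return "Approved"
        anns = [a for _, _, a in agreeing]
        c_aa = max(sum(jaccard(ai, aj) for ai in anns) for aj in anns)  # Eq. 6.4
        return "Approved" if c_aa >= t_aa else "Reviewable"
    return "Reviewable"                           # no response agreement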

Role of the Measure of Confidence

As a form of quality measure, the crowd is also required to provide a confidence level (ranging from Very High to Very Low) to indicate their confidence in their response. In other words, this reflects the workers' own confidence in the correctness of the response they provided. A task will only be approved if the average confidence measure of the aggregated responses exceeds the confidence threshold. Otherwise, the task is marked for review and will be sent to the experts.

Agreement Analytics for the Expertsourcing Stage

The process of agreement analytics in the expertsourcing stage is similar to the one in the crowdsourcing stage. Each reviewable task may be performed by up to 3 experts. Once the three submissions are received, the task status is marked complete and the agreement analytics are performed similarly to those described for the crowdsourcing stage. In the case of experts, the confidence measure is not sought. The agreement and annotation thresholds are also different, as explained below.

6.3.5 Agreement and Confidence Thresholds

Table 6.2 shows the thresholds used for establishing agreement analytics in the crowdsourcing and the expertsourcing stages. Note that the thresholds depend on the task at hand. For the disambiguation task, the response agreement threshold is selected as 4, whereas the confidence threshold is 3. The confidence is the average confidence for

the majority response. If we assume that the probability of someone accidentally giving a wrong response is 25%, then with 5 maximum responses and an agreement threshold of 4, we ensure at least 96% accuracy. For the annotation task, the annotation agreement threshold is selected as 3. The reason for the lower threshold is that we take annotation agreement at distance 0. And if the number of unique annotations reaches 3 or more, we mark the task as reviewable even if the agreement is reached. For the expert tasks, since there are only 3 submissions, we choose 3 as the response agreement threshold for the disambiguation tasks. For the annotation task, we allow 2 as the threshold. This is done because we assume that the probability of an expert giving a wrong response would be 10%, unlike that of the crowd. We analyze the impact of these thresholds in the results and analysis section.
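One plausible reading of the 4-out-of-5 bound, under the stated assumption of independent worker errors at rate 0.25, is to bound the probability that a wrong answer clears the agreement bar; the derivation below is our own reconstruction, not a calculation given explicitly in the text:

P(\text{wrong approval}) \leq \binom{5}{4}(0.25)^4(0.75) + (0.25)^5 \approx 0.016

so the implied accuracy is at least 1 - 0.016 ≈ 98%, comfortably above the stated 96% lower bound.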

Table 6.2: Agreement and Confidence Thresholds for Crowd and Expert Output Decision Analytics

Contributor   Task Type        No of Responses   Threshold   Confidence
Crowd         Disambiguation   5                 4           3
Crowd         Disambiguation   4                 4           3
Crowd         Annotation       5                 3           NA
Expert        Disambiguation   3                 3           NA
Expert        Annotation       3                 2           NA

Table 6.3: Results for Disambiguation and Annotation Tasks for both stages

             Crowd Tasks                   Expert Tasks
Task         Disambiguation   Annotation   Disambiguation   Annotation
Approved     1267             477          34               96
Reviewable   40               107          6                11
Total        1307             584          40               107

6.4 Experimental Evaluation

6.4.1 Details of Experimentation

MTurk Setup

The experimental setup assigned each task to 5 crowd workers. Any task with fewer than 5 submissions was ignored.

Expertsourcing Setup

For the reviewable crowd tasks that were sent to the experts, 3 experts were assigned to each task. Up to 7 experts were invited to participate in the task review process. The system for this phase worked on an invite-only basis.

6.4.2 Results and Discussion

Table 6.3 shows the results obtained. We only included in the results those tasks that reached the required response threshold. Any tasks that obtained fewer responses than the desired threshold were marked as incomplete and were not included in the results.

Figure 6.9: Distribution of number and percentage of Disambiguation Tasks from the Crowd and Expert Sourcing stages

6.4.3 Results from the Crowdsourcing Stage

Figure 6.9 and Figure 6.10 further illustrate the results obtained in the crowdsourcing and expertsourcing stages of the disambiguation and the annotation tasks respectively.

Results for Disambiguation Task

As discussed in Section 6.3.4, the disambiguation task may have two possible responses: a Yes to indicate a positive response and a No to indicate a negative one. As illustrated in Figure 6.9, 68% of the tasks were approved with a positive

Figure 6.10: Distribution of number and percentage of Annotation Tasks from the Crowd and Expert Sourcing stages

response, while 29% were approved with a negative response. Only 40 tasks, which account for 3% of the total, were sent to the experts for review.

Results for the Annotation Task

As discussed in Section 6.3.4, the annotation task requires an annotation from the given verse to be provided as a response after indicating a positive or a negative response. As illustrated in Figure 6.10, 50% of the tasks were approved with a negative response, while 32% were approved with a positive response. This also indicates that these 32% of tasks contained annotations from the crowd that reached

an inter-annotator agreement. 107 tasks, which account for about 18% of the total, were sent to the experts for review.

6.4.4 Results from the Expertsourcing Stage

Expertsourced Disambiguation Tasks

Of these 40 tasks, there were only 6 tasks that received one response differing from the majority response. If a 2-out-of-3 majority response agreement were taken amongst the experts, all 40 tasks would reach an E-Approved status, i.e., approved by the experts. An additional administrative round of evaluation was done for the 6 tasks that did not achieve a consensus amongst the three experts. The majority was in fact correct. The reasons that may have led to the disagreement include: 1) failure to interpret the question, and 2) accidental error.

Expertsourced Annotation Tasks

Of these 107 tasks, there were only 11 tasks that did not reach a well-defined agreement even after the expertsourcing stage and needed further administrative or super review. For the annotation tasks, the majority threshold was decided as 2 out of 3 to reach an agreement. Any cases that had only two or three distinct annotations were the ones that clearly did not reach an agreement. For these 11 tasks, the annotations were finalized based on super review.

The tasks which agreed on a negative response but did not have a 3-out-of-3 consensus went through a super review for the sake of quality control. Of these, there were 2 tasks which agreed on a false response. An interesting observation is that the 11 tasks which were super-reviewed were based on only 4 distinct themes. This indicates that some themes may be more likely than others to need deeper knowledge and understanding. From the results, it can be concluded that our system is indeed very robust to error and is capable of obtaining high quality annotations. The cases where there is arguably or apparently a disagreement are those where consensus cannot be reached even after the expert reviews. These cases indicate that some annotations are a matter of personal taste and judgement and cannot be concretized as others can be. Most of these annotations cannot be classified as wrong, nor as better than the others, based on an automated choice.

6.4.5 Scoring of Crowd Submissions

One of the important stages following the task analytics stage is to score and mark the original crowd submissions as either Accepted or Rejected. If a given crowd submission matches the majority agreed response, it is marked as RA-Accepted; otherwise it is marked as RA-Rejected. In the case of annotation tasks, there is additional scoring based on annotation agreement. If the annotation in the crowd submission matches the agreed upon annotation (i.e., the inter-annotation distance is 0), then the submission is marked as AA-Accepted. For the submissions that make up the reviewable tasks, the scoring of the crowd

Table 6.4: Crowd Submission Scoring for Disambiguation Task

Task State   Submission State   Count
Approved     RA-Accepted        6134
Approved     RA-Rejected        140
Reviewable   E-RA-Accepted      117
Reviewable   E-RA-Rejected      76

submissions is done based on the agreements obtained from the expertsourcing stage. This is indicated by E-RA-Accepted or E-RA-Rejected to indicate the status of the response, and E-AA-Accepted or E-AA-Rejected to indicate a match with the agreed annotation. Table 6.4 shows the results for the disambiguation task and Table 6.5 shows the results for the annotation task. As can be noted, the two combinations RA-Rejected-AA-Accepted and E-RA-Rejected-AA-Accepted indicate that the response did not match the majority or the expert response, while the annotation did. This is an interesting combination. However, the small count indicates that this may be accidental error.
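A small sketch of this scoring step is given below; the state labels follow Tables 6.4 and 6.5, with the E- prefix for expert-reviewed tasks assumed to be added by the caller.

def score_submission(submission, agreed_response, agreed_annotation=None):
    """Mark one crowd submission against the agreed task outcome.
    submission: (response, annotation_tokens) for a single worker."""
    response, annotation = submission
    states = ["RA-Accepted" if response == agreed_response else "RA-Rejected"]
    if agreed_annotation is not None and annotation is not None:
        # Inter-annotation distance 0 means the annotations are identical.
        match = annotation == agreed_annotation
        states.append("AA-Accepted" if match else "AA-Rejected")
    return "-".join(states)

print(score_submission(("Yes", {"phrase"}), "Yes", {"phrase"}))
# -> RA-Accepted-AA-Accepted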

6.4.6 Measure of Quality

Since we are dealing with tasks which do not have a pre-established ground truth, determining the quality of the crowdsourced tasks presents a challenge. The methodology adopted for measuring the quality of the crowdsourced tasks is as follows. From the tasks that were approved, 100 tasks from each category were selected at random and were presented to a qualified expert using the expertsourcing

Table 6.5: Crowd Submission Scoring for Annotation Task

Task State   Submission State              Count
Approved     RA-Accepted-AA-Accepted       2296
Approved     RA-Accepted-AA-Rejected       85
Approved     RA-Rejected-AA-Accepted       9
Approved     RA-Rejected-AA-Rejected       89
Reviewable   E-RA-Accepted                 126
Reviewable   E-RA-Rejected                 60
Reviewable   E-RA-Accepted-E-AA-Accepted   155
Reviewable   E-RA-Accepted-E-AA-Rejected   158
Reviewable   E-RA-Rejected-E-AA-Accepted   3
Reviewable   E-RA-Rejected-E-AA-Rejected   25

application. The expert's response was then matched against the crowd submissions and the agreed response determined by our system. Table 6.6 shows the number of task submissions evaluated using the system and expert responses. It can be seen that for the disambiguation task, there was 100% agreement between the expert response and the system response. For the annotation task, disagreement occurred only for one task's five assignments, where the expert selected an annotation that was only slightly different from the one agreed upon by the system. This is a clear indicator of the robustness and accuracy of the system. It also indicates that the choice of thresholds for our agreement analytics module was suitable, and that the crowd actually meets a competency threshold that is much higher.

Table 6.6: Results showing the number of submissions as evaluated based on the system and the expert responses

                    Disambiguation Task                  Annotation Task
Submission Status   System Response   Expert Response   System Response   Expert Response
RA-Accepted         484               484               488               488
RA-Rejected         12                12                21                21
AA-Accepted         NA                NA                4                 1
AA-Rejected         NA                NA                1                 4
Total               496               496               514               514

6.4.7 Comparative Analysis: Expert and Crowd Contributions

Table 6.7 shows the state of tasks after the decision analytics of both the crowdsourcing and the expertsourcing stages. It is interesting to note that almost 50% of the annotations obtained from the experts actually agree with those of the crowd. This carries an important implication: by increasing the number of contributions from the crowd, a reasonable agreement may be reached. This provided the basis and the motivation for the study presented in Chapter 7.

Table 6.7: Comparative Analysis of Crowd and Expert Tasks (Annotation Tasks)

Task State (Crowd Stage)   Review State (Expert Stage)   No of Tasks   No of Tasks that Agree on the Same Annotation
Reviewable                 E-Approved                    24            0
Reviewable                 E-AA-Approved                 72            49
Reviewable                 E-NAA-Reviewable              11            8
                           Total                         107           57

6.4.8 Discussion

The results of our exploratory study provide interesting insights into the application of human computation methods to knowledge intensive tasks. Our task design involved the thematic disambiguation and annotation of the Qur’anic verses based on the original Arabic script. For the disambiguation task, 99% of tasks were able to reach an agreement by combining contributions of crowds and experts; only 4% of tasks needed expert contributions. For the annotation tasks, 10% of tasks did not reach an agreement even with both crowd and expert contributions. An investigation into these indicates that closed agreement is not always possible for open-ended annotation tasks, and response aggregation needs to account for this. Our knowledge acquisition and review workflow, which elicits expert annotations for the tasks that exhibit significant variation in annotations from the crowd contributors, presents a promising method. We utilize annotation agreement and distance analytics to route the appropriate tasks to experts. Our results suggest that such a hybrid approach indeed makes for a more scalable, yet accurate and reliable annotation process, whereby appropriate annotation tasks are assigned to different annotator groups according to their skill sets.

6.5 Conclusions

The results of the study presented strongly suggest that the crowd can significantly assist in scaling knowledge engineering activities such as knowledge formalization and semantic annotation with reasonable reliability for several disambiguation

and annotation tasks. Our efforts have also demonstrated that augmenting the workflow to obtain expert validations for the reviewable tasks (through our own custom web framework) ensures that high quality annotations may be obtained and ambiguous cases may be resolved. Our study also inspires the investigation of an adaptive measure for obtaining crowd annotations that reach a substantial agreement without the need to go to the experts. To this end, we plan to experiment towards achieving the right balance of crowd and expert tasks in the studies that follow (as done in Chapter 7). We plan to increase the size of the study and experiment with a range of other task designs of varying complexity. The study clearly shows the potential benefit of crowdsourcing, augmented with expertsourcing workflows, that can be harnessed by knowledge intensive and expertise driven domains.

Chapter 7

Learnersourcing Thematic and Inter-Contextual Annotations from Islamic Texts

... And whoever fears Allah - He will make for him a way out. And will provide for him from where he does not expect. And whoever relies upon Allah - then He is sufficient for him. Indeed, Allah will accomplish His purpose. Allah has already set for everything a [decreed] extent. — [Al-Quran, At-Talaaq,65:2-3]

Chapter Overview

In this chapter, we introduce an approach for obtaining semantic annotations in specialized and knowledge intensive domains. In particular, we consider the case of classical and historic Islamic texts, primarily the Qur’an and the books of Prophetic narrations called the Hadith. We propose formal and scalable knowledge acquisition workflows for thematic classification, annotation and interlinking for these texts; this is done at various levels of granularity that existing research fails to address. We design a composite ’semantics driven learnersourcing’ workflow, which primarily leverages students engaging in typical knowledge seeking and learning scenarios, and embeds semantic annotation tasks within them. Tightly integrated within the workflow is an ’expert-sourcing’ component that ensures annotation quality and reliability. Therefore, quantitative measures of ensuring annotation quality are woven into the very fabric of learnersourcing.

7.1 Introduction

In recent years, there has been growing interest in the particular method of learnersourcing [72, 73, 118, 139, 209]. Learnersourcing is seen as a specialized derivative of crowdsourcing [209], whereby learners are incentivized to perform a task based on their desire to learn.
1 Partially appeared in: Basharat, Amna. "Learnersourcing Thematic and Inter-Contextual Annotations from Islamic Texts." In Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems, pp. 92-97. ACM, 2016. Full version to appear: Basharat, A., Rasheed, K., Arpinar, I.B., Semantics-Driven Learnersourcing Workflows for Thematic and Inter-Contextual Annotation of Islamic Texts, In The 35th Annual ACM Conference on Human Factors in Computing Systems (SIGCHI'17).

This is unlike traditional crowdsourcing, in which a crowd worker is paid to perform a micro-task. Learnersourcing approaches claim the possibility of attaining higher quality responses than traditional crowdsourcing, given that learners are more intrinsically motivated to learn the given content and may also benefit from the task itself [73, 209]. Motivated by these proposed benefits of learnersourcing, in this research we explore how it can be applied to generating thematic classifications, annotations and interlinking of diverse knowledge sources. We apply the learnersourcing methodology in the particular context of the Qur’an, which is one of the most widely studied books in the world (Muslims consider it their Holy Book, the spoken word of God, Allah). The study of the Qur’an is undertaken at various levels, from formal to literal. Formal Qur’anic education is based on the centuries old, classical sources of knowledge, which primarily include the Qur’an and the Prophetic Narrations termed the Hadith, compiled in numerous classical books of Hadith. These sources use classical Arabic scripts, which require interpretation and deep understanding, even for native speakers of modern Arabic. We believe that the domain of Qur’anic studies is ideally suited to benefit from learnersourcing rather than crowdsourcing, given the specific nature of the content. There is always a strong desire among individuals across the world to learn these texts, which we believe can be potentially harnessed. The broader aim of this research is to facilitate the process of both formal and informal learning and research of the Qur’anic texts by semantically modeling and interlinking the various knowledge sources (through formalized knowledge models, e.g. ontologies)

Figure 7.1: A typical path of knowledge synthesis followed by Qur’anic learners

that are involved in developing a deep and thorough understanding of the texts, thus making relevant knowledge sources more discoverable and accessible to the learners. However, within the realm of (semantic web based) knowledge engineering tasks, such as semantic annotation and data interlinking, several levels of complexity may be encountered. Some tasks are simple, while others are more knowledge intensive. While some may reasonably be amenable to computational approaches, others need domain specific expert annotations and judgements. Therefore, not all tasks are fit for general purpose crowdsourcing. This is specifically true for the domain of Islamic knowledge, owing to the specialized nature of the learning needs presented by diverse users, and the heterogeneous, multilingual knowledge resources. Emerging research also recognizes nichesourcing [49, 149] or expertsourcing [158],

as a natural step in the evolution of crowdsourcing to address the need of solving complex and knowledge intensive tasks, and as a means to improve the quality of crowdsourced input. We therefore propose augmenting learnersourcing tasks with expertsourcing tasks within automated knowledge acquisition workflows to not only scale the process of semantic annotation and interlinking, but also ensure high annotation quality and reliability. We leverage learners, with varied skills and background knowledge, to scale this process. We present the design and implementation of a composite learnersourcing workflow which is composed of two levels. At the first level, the workflow is divided into three stages: Lookup, Search and Verify-Annotate. Each stage in turn consists of an interleaved sub-workflow of learner and expert tasks. We introduce expertsourcing tasks, tightly integrated within the learnersourcing tasks, for validating learner contributions and ensuring high annotation quality and reliability. Quantitative measures of ensuring annotation quality are woven into the very fabric of learnersourcing. Our workflow design is driven by semantics based task specifications and learner profiles, based on which we devise a mechanism for adaptively selecting the adequate number of learner contributions in order to find the right balance of learners and experts, maximizing learner input and minimizing expert time and contributions. We also introduce the concept of a dynamic task lifecycle manager that manages the tasks’ transition through the learnersourcing and expertsourcing stages. We demonstrate the effectiveness of our workflow through a prototype implementation of the system tested with up to 30 learners. We present the results of this

study and show that we can generate high quality annotations by engaging learners, while reducing the need for contributions from experts. Our workflow design is intended to reduce cognitive overload by decomposing complex tasks into atomic tasks of manageable complexity. Different learners with different skill levels can engage by contributing a little, even if they cannot manage the whole task on their own. We perform an analysis of learners’ performance and also reflect on the learnability aspects derived from the process.

7.2 Background and Related Work

We first present some details about the context for which the study is designed. We then present related work in the areas of formalized knowledge modeling for the Islamic domain and at the intersection of semantic web and human computation methods, which forms the basis of this research.

7.2.1 Research Context

In order to understand the concept of inter-contextual and thematic interlinking, and to appreciate the need for it, it is extremely important to understand the principles of Qur’anic understanding and learning that are adopted at Islamic institutes across the world. These are summarized here as derived from [154] and illustrated in Figure 7.1. Linguistic Understanding: Understanding and explanation of the linguistic meanings (of the Arabic text).

Explanation of the Verse in the light of other verses: This is the foremost necessary condition for establishing the most correct understanding of a given verse, called the Tafseer (Exegesis) of the Qur’an by the Qur’an. Understanding the verse and its context using the Hadith (Prophetic Narrations): This is the second most important principle of Qur’anic understanding. It involves understanding the historical context of the revelation of the verses and is often derived from the narrations of the Prophet (which are recorded in the books of Hadith). Also, the explanations of various terms and methodologies are given only in the Hadith, and it is not possible for a student of knowledge to understand them without understanding the related Hadith. The foremost contribution of this chapter is to use the learnersourcing methodology, a specialized form of crowdsourcing, with a custom developed web application, to gather annotations, classifications and contextual relationships between the Qur’anic verses, themes and the Hadith. The scope of this study is limited to Verse to Hadith links (inter-contextual links between the Qur’anic verses and Hadith), shown by the link between Verse B and Hadith Z in Figure 7.1. Capturing these links, primarily from scholarly interpretations, is by far the most knowledge intensive and complex task, since the process may not be automated. A key challenge in this task is that the relations must not be based on the mere understanding of students or teachers, but rather attributed to a well known authority mentioned in the books of scholarly works.

7.2.2 Formalized Knowledge Modeling for the Islamic Domain

The challenge of capturing Qur’anic knowledge formally has also been noted by Atwell and colleagues [18]. The learners of Qur’anic knowledge face considerable problems in retrieving and synthesizing the most relevant knowledge from a plethora of disparate knowledge sources, given that there are neither any formal means of thematically and contextually interlinking them, nor any specialized tools or applications designed around the principles outlined earlier. The most commonly available web-based applications for the Qur’an and Hadith provide browsing of the text and available translations, listening to its recitation, word to word meanings, Arabic word roots and morphology. Some detailed works of exegesis are also available. However, the use of context specific knowledge discovery tools is limited, given that there is very little or no formalized thematic classification and linking. Thematic annotation of the Qur’anic verses and Hadith is a non-trivial task. The best known annotations of the Qur’an focus only on its morphological aspects [59]. A recent open linked dataset called ’Semantic Quran’ has been published [168], with more than 48 known translations obtained from Tanzil2. Purely computational means such as text mining and machine learning techniques fail to provide such classifications, because the nature of the classifications and links is highly sensitive and knowledge intensive. Therefore, human contribution is considered imperative. We believe our work is the first of its kind to propose formal modeling and acquisition of thematic links between the Qur’an and the Hadith.
2 http://tanzil.net/


7.2.3 Interplay of Semantic Web and Human Computation

One of the major challenges hindering the successful application of knowledge driven and ontology-based approaches to data organization and integration in specialized domains is the so-called ’knowledge acquisition bottleneck’ [162] - that is, the large amount of time and money needed to develop and maintain formal knowledge models and ontologies. This also includes ontology population and semantic annotation using well established vocabularies. To overcome this bottleneck, semantic web researchers have recognized that the realization of semantic and linked data technologies will require not only computation but also significant human contribution [186], [182]. Humans are simply considered indispensable [54] for the semantic web to realize its full potential. There has been growing interest in using crowdsourcing and human computation methods to support semantic web and linked open data research, by providing means to efficiently create research relevant data and potentially solve the bottleneck of knowledge experts and annotators needed for the large-scale deployment of semantic web and linked data technologies. Studies such as Inel et al. [100], Noy et al. [148], Sarasua et al. [164] and others have attempted to solve the bottlenecks of human experts and the needed human contribution in semantic web development (ontology engineering) processes such as semantic annotation and data interlinking. Some early efforts that led to the evolution of this approach include Ontogame [184] and inPho [147]. Two major genres of research may be seen emerging in

the last few years, in an attempt to bring human computation methods to the semantic web: 1) Mechanized Labour and 2) Games with a Purpose for the Semantic Web. Several recent research prototypes have attempted to use micro-task crowdsourcing for solving semantic web tasks, e.g. ontology engineering [148], [164] and linked data management [180]. Recent work by Hanika, Wohlgenannt and colleagues [5, 81] has attempted to provide tool support for integrating crowdsourcing into ontology engineering processes via a plugin for the popular ontology development tool Protégé. Other approaches such as [183] and [173] adopted Luis von Ahn’s "games with a purpose" [202] paradigm for creating the next generation of the semantic web. The idea is to tap into the wisdom of the crowds by providing motivation in terms of fun and intellectual challenge. Research also suggests the use of ontologies to improve the human computation process [125], which provides the motivation for proposing a semantics driven model of human computation in this research.

7.2.4 Learnersourcing and Crowdsourcing Workflows

The success of crowdsourcing workflows [102] in general and learnersourcing workflows [209] in particular has motivated this research to harness the power of human computation by embedding it within learning scenarios in the Islamic domain. Researchers recognize that much work in the real world is not amenable to crowdsourcing because of the difficulty of decomposing tasks into small, independent units [77]. We also recognize this problem in our research and propose the decomposition of complex tasks into manageable units, leveraging the skills of different types of learners.

Figure 7.2: Overview of the Learnersourcing Methodology


7.3 Research Approach: Semantics Driven Learnersourcing Methodology

7.3.1 Approach for Harnessing Students’ Learning

When students are engaged in Qur’anic studies, they are asked to produce an analysis or a synthesis of the learning material for a given theme or subject. In doing this, they naturally follow the learning or knowledge discovery path outlined in Figure 7.1. We aim to channel this natural learning effort into tasks that capture this knowledge and store it in structured and formalized knowledge models. We also make use of Natural Language Processing (NLP), Information Retrieval (IR) and Text Mining methods, and the available data sources, to facilitate this process of knowledge discovery.

We aim to harness not only the learners on online and distance learning portals, but also those in more traditional classroom environments, and introduce the notion of electronic assessments and assignments if such a system is not already in place. In the case of online learning environments, where some form of e-learning assessment methodology is in place, we plan to augment it using the methodology described in this chapter. For the initial study presented in this chapter, we implement a prototype system of our own, called Qur’anic Informatics3, which is used to introduce supplementary assessments and assignments into the learning methodology of both online and traditional classrooms.

7.3.2 Overview of the Learnersourcing Methodology

The high-level view of the learnersourcing methodology we developed is shown in Figure 7.2. Conceptual knowledge models (ontologies) are used to drive the task and workflow specifications. A typical learnersourcing workflow in our system refers to a series of machine and human tasks interleaved together. We introduce the notion of task profiles and learner profiles, which allows for introducing different tasks in different learning scenarios. It also facilitates leveraging different learner types and levels, according to knowledge background and skills. Learners are given qualification profiles for this purpose. Another key feature of the workflow is response aggregation, annotation agreement and validation. Tasks which do not meet the desired agreement thresholds are passed on to advanced learners or experts (teachers or scholars), who validate and review the tasks.

3 www.alasmaalhusna.islamicinformatics.org

We also introduce interleaved machine tasks based on NLP and text mining methods where appropriate. This has been done to enhance the user contribution by providing some seed input. We use two available datasets for this purpose4. We utilize an available thematic hierarchy5 for providing some seed themes to the learners and for selecting themes pertaining to a given verse, as illustrated in Figure 7.3. The last stage of the method includes publishing the validated annotations as linked and reusable knowledge, for which we aim to use established semantic web and linked data6 standards.
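
To make the notions of task and learner profiles concrete, the following is a minimal sketch under assumed names (the actual specifications are ontology driven and are not reproduced in the text):

```python
from dataclasses import dataclass, field
from typing import Optional, Set

@dataclass
class TaskProfile:
    """Per-task-type parameters (illustrative field names)."""
    task_type: str                      # e.g. "lookup", "search", "annotate"
    min_submissions: int                # T_MIN: submissions before analytics run
    max_submissions: int                # T_MAX: cap before expert review
    expert_reviews: int = 1             # expert contributions sought if needed
    required_qualification: Optional[str] = None

@dataclass
class LearnerProfile:
    """Qualification profile used to route tasks to suitable learner types."""
    learner_id: str
    level: str                          # e.g. "beginner" or "advanced"
    qualifications: Set[str] = field(default_factory=set)

def may_attempt(task: TaskProfile, learner: LearnerProfile) -> bool:
    """A learner may attempt a task if they hold its required qualification."""
    return (task.required_qualification is None
            or task.required_qualification in learner.qualifications)
```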

7.4 Dynamic Learnersourcing Workflow Design for Thematic Linking and Annotation of Hadith

We design our workflow for this study in the light of a case study focused on thematic and inter-contextual linking of the Qur’an and the Hadith. Before describing the workflow design in detail, we outline some factors that make Hadith interlinking challenging, and describe some formative design activities that guided the final workflow and task design.

7.4.1 Challenges in Workflow and Task Design

Interlinking the Qur’anic verses and the Hadith is a non-trivial task. Basharat et al. [30] highlight some of these challenges and propose a linked data vocabulary for

4 www.quranontology.com, corpus.quran.com
5 http://quranytopics.appspot.com
6 http://linkeddata.org

modeling these links. We summarize some factors that make this process challenging. There are a number of well known Hadith repositories available which provide browsing and searching of the Hadith collections, sunnah.com and dorar.net being the most prominent ones. Most of the classical knowledge sources do not use a standardized numbering scheme for the Hadith; therefore, it is difficult to link them to the modern Hadith repositories. This is unlike the Qur’anic verses, which have a standardized numbering scheme. There are multiple sources of the Hadith, which may have different levels of authenticity - a matter of discussion beyond the scope of this research. Despite the fact that most Hadith collections have now been classified into authenticity categories, the mapping of this classification to the sources that cite them is only possible if the Hadith are extracted and linked in a formalized manner. In addition, the Hadith are of varying length, and oftentimes the commentator or tafsir scholar will only quote a part of the Hadith or make a passing reference to it, making it extremely difficult to trace the original Hadith being cited. Moreover, several Hadith may have common portions of narrations, making it all the more challenging to identify which exact Hadith is being quoted or referred to. To overcome some of these challenges, we propose automated knowledge acquisition workflows for acquiring these annotations and links.
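
To see why partial quotations and shared narration portions are hard to resolve automatically, consider a deliberately naive substring search (a sketch only; `corpus` is a hypothetical list of (hadith_id, matn_text) pairs):

```python
def candidate_hadith(quoted_fragment, corpus):
    """Return every Hadith whose text contains the quoted fragment. Because
    several Hadith can share common narration portions, this routinely yields
    multiple candidates rather than the single intended source, which is why
    human judgement is needed in the workflow."""
    return [hadith_id for hadith_id, matn in corpus if quoted_fragment in matn]
```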

7.4.2 Formative Design Activities

The main high-level task identified to be learnersourced as part of this case study was identifying a Hadith, or a portion of it, referenced by a scholar in a book of

Figure 7.3: Workflow Design

exegesis or scholarly commentary, linking it to a standard Hadith repository (in this case sunnah.com), and annotating the part of the Hadith which is referenced by the scholar, if the entire Hadith has not been referenced. This can prove to be a time consuming task. The typical steps involved in identifying, searching and retrieving/selecting a Hadith by a researcher are as follows:

• If the Hadith number is available and cited in the source, along with the book number, the Hadith is looked up and verified.

• If the Hadith number is not available, a search is performed using some of the words from the Hadith. Several Hadith may be returned as a result of a single search; the researcher then tries to match each against the original text and retrieves the link.

• The researcher notes the link and the reference of the correct Hadith. In case there are multiple Hadith, a note is made for the different sources.

We designed a first prototype of the task as a single task with several detailed steps. It was shared with up to 10 participants to obtain feedback on usability and ease. The participants reported that the task comprised too many steps, one of which proved quite challenging and time consuming: searching for the Hadith and locating the correct one. One participant suggested using an alternate method to locate the Hadith, i.e. using the Hadith number. While this method may not always be applicable, we decided that for the first pilot we would limit our study to capturing only numbered references, so as to simplify the task to some degree and also be able to measure task performance.

7.4.3 Composite Workflow Design Decisions

Based on the feedback from the formative design phase, we felt a strong need to decompose the task into constituent units that impose less cognitive overload and incur less time and effort to complete. We conceived and designed a composite task workflow. The level of granularity of its constituent sub-tasks and workflows was a design decision that involved several formative stages. A goal in designing these assignment tasks was to ensure that the input or response soliciting method aligns with the learners’ natural path of learning, with which they are familiar, and is in line with their natural enquiry. We designed and executed the workflow in two distinct phases. The results and the design feedback from Phase 1 were used to iterate and make design modifications for Phase 2. For all tasks, the learners are guided via step by step instructions, and tutorials are provided for any task which may require additional steps, such as searching the

Hadith from the online Hadith repository or locating the Hadith reference from the document that is linked.

7.4.4 Three-Staged Composite Workflow Design - Phase 1

The main workflow is shown in Figure 7.3. The workflow is designed as a composite workflow consisting of three sub-stages. Each stage in turn consists of learner, machine and expert tasks interleaved together. For each stage, we describe how the learner tasks are generated, the task interface design, the decision metrics, the profile parameters, and when and how expert reviews are sought.

Stage 1 - Lookup and Identify Hadith Reference

Task Generation: This task takes as input a set of documents which may or may not contain Hadith references relevant to the theme of the document. For this case study we took 97 themes and generated the task instances using these as seed input. Similar themes have been used in [32]. Task Design: The task is divided into three steps. The first step requires the learner to open the linked document, with relevance to a theme, and look up any Hadith reference. The first part of the task interface design is a simple Yes or No question, asking the user to identify whether a Hadith reference is found in the given document. Only if the user answers Yes is the second part of the task dynamically generated, asking the user to identify the source book of the reference. A snapshot of this task is shown in Figure 7.4. The user is required to enter a numeric reference to the Hadith and is also required to specify if this is the only

Hadith reference present in the document. It is worth mentioning that in a huge body of knowledge and classical literature, the texts do not contain numbered Hadith references, or the references are in a format which is not easily extractable via machine learning approaches. The workflow is meant to be extended to include such instances in the future. The nature of the references contained in these documents could be one of the following: (1) a numbered reference from Sahih Al-Bukhari or Sahih Al-Muslim, two of the most prominent, well-acknowledged and widely accepted Hadith collections of the highest authority; (2) references that are attributed as ’Agreed Upon by both Bukhari and Muslim’; (3) references from other books. For the first phase of the case study we focused on the first two types of references. This was done for three primary reasons. First, the references are of higher authority and are more relevant for capturing. Second, the references have a more standardized numbering across the collection and are easily locatable. Last, but not least, more learners are familiar with these books than with the other books. For the first two reference types, the user selects from closed-ended options and provides the numeric reference in the next step. For the third reference type, the response is open-ended: the user is required to provide the book name and number as given in the document. Task Life Cycle and Decision Metrics: Each task is given a minimum and a maximum response threshold, TMIN and TMAX respectively. TMIN indicates the number of unique submissions the task must receive before it undergoes any evaluation, analytics or expert review. Once the minimum number of submissions

have been received, agreement analytics is performed on the submissions. This has two parts: aggregating the responses to obtain the Response Agreement (RA) and the Annotation Agreement (AA). Response agreement involves determining whether an agreement is reached, based on a probabilistic Response Agreement Threshold, TRA.

If this threshold is met with the obtained number of responses, then annotation agreement is determined. However, if the response agreement threshold is not met, the minimum response threshold is increased; that is, more submissions are sought for the task. This process is repeated until the maximum response threshold, TMAX, is reached. Once this happens and no agreement has been reached, the task is marked for expert review. For annotation agreement analytics, a similar decision process is applied. An annotation agreement threshold, TAA, is specified.

If this is met, the task is considered accepted and complete, and the annotation is used to create an instance of the next task in the workflow, i.e. Search Hadith. However, if an agreement cannot be reached from amongst the submissions and the maximum assignment threshold is met, the task is marked as reviewable and is considered open for expert review. This process is illustrated in Figure 7.6. Task Profile Parameters: The task profile parameters specify the minimum and the maximum number of submissions required for task completion. They also specify any specific qualification profile to be possessed by the learners, and how many expert contributions are sought for the task, if needed. For this task, the minimum submission threshold is set as 3 and the maximum as 7. Only 1 expert review is required, if needed.
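
The decision loop just described can be sketched as follows; this is a simplification that treats the probabilistic agreement thresholds as plain majority counts, with the threshold values taken from the text and from Section 7.4.6 (helper names are ours, not the system’s):

```python
from collections import Counter

def most_common_count(values):
    """Count of the most frequent value among the submissions received."""
    return Counter(values).most_common(1)[0][1] if values else 0

def run_lookup_task(submissions, t_min=3, t_max=7):
    """Sketch of the Stage 1 decision loop (cf. Figure 7.6). `submissions` is
    assumed to be a list of (response, annotation) pairs in arrival order;
    the real system gathers these asynchronously."""
    while True:
        received = submissions[:t_min]
        threshold = 3 if t_min <= 5 else 4   # T_RA = T_AA (see Section 7.4.6)
        responses = [r for r, _ in received]
        annotations = [a for _, a in received]
        if most_common_count(responses) >= threshold:        # Response Agreement
            if most_common_count(annotations) >= threshold:  # Annotation Agreement
                return "Accepted"   # agreed annotation seeds the Search Hadith task
        if t_min >= t_max:
            return "Reviewable"     # routed to a single expert review
        t_min += 1                  # adaptively solicit one more submission
```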

Figure 7.4: A snapshot of the Lookup Hadith task, divided into three dynamic steps

Expert Review: An expert review for this task is required if the maximum assignment threshold is reached and an agreement has not been reached. The experts are invited to use the application using a special invite code and their profile. Both the learners and the experts use the same task interface. However, when an expert logs in, he or she is only presented with the tasks that are marked as reviewable. An expert is able to see the submissions from all the learners and may either select one of these as the correct response or provide one of their own.

Stage 2 - Search Hadith from Hadith Repository

Task Generation: At least one instance of this task is generated automatically for each task in Stage 1 that reaches completion with the acceptance criteria met through the learnersourcing stage. It is also possible that two instances of this task are generated as a result of one task completing with acceptance in the first stage. This case occurs when the reference belongs to the ’Agreed Upon’ source, i.e. the reference occurs in both the collections of Al-Bukhari and Al-Muslim. Both annotations are then used to create the task instances for this stage. Task Design: This task takes as input the agreed upon Hadith reference in the form of the book name and number. It requires the learners to search for the Hadith on sunnah.com, identify the shareable Hadith URL, and provide it within the given space. Once they do this, our system crawls the URL to obtain the Hadith and presents it to the learner. They are then required to confirm that the Hadith matches the reference they were required to look for. In the event that the Hadith reference is not found, the learners respond by selecting the option that they were unable to find the Hadith. Task (Life Cycle) Decision Metrics and Task Profile Parameters: We employ automated verification for a part of this task. We crawl the URL provided by the learner and retrieve the Hadith number and text from this URL. If the retrieved number matches the original reference number which the learner was asked to search for, the task is accepted automatically. Therefore, the minimum submission threshold is kept as 1. If the submission is not accepted for any reason, then the task’s submission threshold is increased. This process repeats until the maximum

submission threshold is reached; in this case, it is kept as 3. If a task remains unaccepted when this maximum threshold is reached, the task is marked as Reviewable. Once the task is accepted, this leads to the creation of an instance of the Stage 3 task, which requires the learners to annotate the portion of the Hadith referenced in the text. Expert Review: The expert review for this task is required once the task reaches the reviewable state. As with Stage 1, the expert performs the same task as that of the learner. Only one expert’s contribution is considered to suffice for this task.
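
The automated check for this stage might look like the following sketch; the sunnah.com URL pattern is an assumption on our part (e.g. https://sunnah.com/bukhari:660), and the actual crawler also extracts the Hadith text for use in the Stage 3 task:

```python
import re
import urllib.request

def verify_search_submission(url, expected_collection, expected_number):
    """Confirm that a learner-provided sunnah.com link points at the
    referenced Hadith: parse the collection and number out of the URL,
    compare them with the agreed reference, and fetch the page to make
    sure it exists."""
    match = re.search(r"sunnah\.com/([a-z]+)[:/](\d+)", url)
    if not match:
        return False
    collection, number = match.group(1), int(match.group(2))
    if collection != expected_collection or number != expected_number:
        return False
    with urllib.request.urlopen(url) as response:
        return response.status == 200
```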

Stage 3 - Verify and Annotate Part of the Hadith

Task Generation: This task is generated automatically for each task in Stage 2 that reaches completion with the acceptance criteria met through the learnersourcing stage. Task Design: This task presents a Hadith to the learner, along with a link to the document which is expected to contain the entire Hadith or a portion of it. The task requires the user to select one of three options: 1) the entire Hadith text is referenced in the chapter; 2) only part of the Hadith is referenced; or 3) the Hadith is not referenced. If the learner selects option 2), a textbox is displayed requiring the learner to copy the referenced portion of the Hadith (in Arabic) from the provided text into the given field. The first portion of the task is shown in Figure 7.5. Task (Life Cycle) Decision Metrics and Profile Parameters: The decision met-

Figure 7.5: A snapshot of the Annotate Hadith task, divided into two dynamic steps

rics process for this task is similar to that of Task 1. There are 3 submissions required for each task. Once this minimum submission threshold is met, response and annotation agreement analytics are performed. If the task is accepted, there is no followup except that the state of the task is recorded as completed. If, however, agreement is not reached, then more submissions are sought until the maximum submission threshold is reached. If the task is still not accepted, it is marked for review. Expert Review: The threshold for expert review for this task is likewise kept at 1. Since there is a reference document which contains the actual portion of

the text, more than one expert is not required. However, there may be annotation tasks where it is possible to get different responses; therefore, there is provision for this threshold to be increased.

7.4.5 Task Design Modification - Phase 2

The following are some of the important design iterations and modifications that were performed in Phase 2. For the Stage 1 task, the results of the tasks from Phase 1 indicated that references of the ’other’ type never reached an agreement. This was expected, as the input type was open ended. In Phase 2, we provided discrete options to the learners, where they could select the book name and then input the numbers. Another important addition in this phase was the creation of followup tasks based on the accepted tasks from Phase 1. If the agreed task indicated that more references were included, then a followup task was created to obtain the second reference. A limitation is that the current workflow does not handle more than two references in the same document; however, it is scalable and generic enough to be customized to handle them. There was no change in the Stage 2 task for Phase 2. For Stage 3, based on the reviewable tasks from Phase 1, it was noted that for some Hadith there was variation between the words in the given chapter and those of the Hadith itself. This is not a discrepancy but rather a limitation of the question design, since such a scenario may be expected. To cater for this limitation, a fourth option was added to the question’s answer options this time. This was to provide better clarity and

Figure 7.6: Decision Metrics for Task 1

prevent assumptions. Another variation in the question’s answer options was made by making the answer option clearer and more detailed for the case where the learner selects the whole Hadith as being present in the given text. It was noted that some learners in Phase 1 assumed that this excludes the chain of narrators, while others assumed it includes it and selected the second option. While neither is an incorrect assumption, we made the modification to ensure clarity and concreteness in the responses.

7.4.6 Task Life Cycle and Decision Metrics

Figure 7.7 shows the different states a task and its submissions may be in. When a task is created, it is in the Available state. This task is available for learners to attempt and submit. Once a learner submits an attempt, a submission is

entered in the state A-Submitted. Once the minimum threshold is met for the submissions, the task goes through the analytics stage. It is marked as either Accepted, or Reviewable if the maximum threshold has been reached. If there is no response agreement, the task is marked as nRA-Available, and if response agreement is reached while no annotation agreement is reached, then the task is marked nAA-Available. On a similar note, if the task reaches a reviewable state without a response agreement, it is marked nRA-Reviewable, and in the case it does not reach annotation agreement, it is marked as nAA-Reviewable. A reviewable task will either receive an administrative/controlled review to resolve the disagreement or, if deemed suitable, will be sent to an expert. If more than one expert contribution is sought, then similar agreement metrics are established and the task reviewed accordingly. Once the tasks reach agreement, or are reviewed, the original learner submissions are marked as Accepted or Rejected based on whether they match the majority agreed response or that provided by the expert. Figure 7.6 shows the decision metrics process for Task 1. A unique feature of this decision metric is how the number of available task slots is increased dynamically by applying an adaptive threshold that allows for reaching an agreement within a reasonable margin of error. This model also allows for reducing the number of tasks that will require expert review. If no response agreement or annotation agreement is reached with the minimum response threshold, this threshold is then adaptively changed based on TMIN. If TMIN <= 5, TRA and TAA are set as 3. For TMIN = 6 or 7, this threshold is increased to 4.
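
The adaptive threshold and the state transitions just described can be summarized in a small sketch (a simplification: the full lifecycle manager also handles the Available/A-Submitted bookkeeping and the administrative review path):

```python
def agreement_threshold(t_min):
    """Adaptive threshold from the text: with up to 5 requested submissions,
    3 matching responses suffice (T_RA = T_AA = 3); with 6 or 7, 4 are needed."""
    return 3 if t_min <= 5 else 4

def next_task_state(reached_max, response_agreed, annotation_agreed):
    """Map the outcome of agreement analytics onto the states of Figure 7.7."""
    if response_agreed and annotation_agreed:
        return "Accepted"
    if not response_agreed:
        return "nRA-Reviewable" if reached_max else "nRA-Available"
    return "nAA-Reviewable" if reached_max else "nAA-Available"
```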

Figure 7.7: Task Lifecycle

7.5 Experimentation and Evaluation

We generated Stage 1 tasks for 97 themes in the Qur’an. We invited learners by an open call for participation. Up to 30 learners performed the tasks until they reached completion. For measuring the quality of the proposed system, we performed four substantially different kinds of evaluations: 1) Does the system generate good quality annotations? 2) What is the impact of applying dynamic agreement thresholds, and does the system benefit from this method? 3) How are the learner submissions scored in the absence of ground truth, and what is the impact of learners’ scores on annotation quality? 4) Does the system facilitate the learning process?

Table 7.1: Results for All Tasks

                  Lookup          Search          Annotate
Task State        No      %       No      %       No      %
RA-Accepted       45      36.9    89      100     46      50.0
RA-AA-Accepted    60      49.2    NA      -       33      35.9
nRA-Reviewable    1       0.8     0       0       2       2.1
nAA-Reviewable    16      13.1    NA      -       11      11.9
Total             122     100     89      100     92      100


7.5.1 Results for Task Acceptance and Annotation Quality

Table 7.1 illustrates the status of the tasks at the end of both Phase 1 and Phase 2 of the study. For the Lookup stage, the number of tasks fully accepted, including those with and without annotations, was about 86%. Only 1 task did not reach a response agreement. There were 16 tasks that required a review of annotations. Of these, 8 tasks were in Phase 1, and all of these lacked the closed-ended answer options of the other tasks; they were re-run in the second phase. Stage 2 tasks had a 100% acceptance rate. For Stage 3, the number of tasks fully accepted, including those with and without annotations, was again about 86%. Only 2 tasks did not reach a response agreement. There were 11 tasks that required a review of annotations. All of these tasks were resolved administratively and none needed an expert review. An administrative review lets a super user view all responses from learners and select an appropriate response

when agreement is not reached due to minor differences in annotations. To evaluate the quality of annotations generated by our system, we compared the learner generated annotations and responses using administrative reviews. It was found that all the responses generated via inter-learner agreement were indeed correct. In addition, the ones marked reviewable were also of high quality. On most occasions the disagreement was a result of minor bias or assumption. It is worth mentioning that for open-ended annotations, the annotations had to match by 100%. This implies a rather rigid criterion, as there are cases where strict agreement may not always be reached, as indicated by the tasks marked as reviewable. In future, we plan to experiment with inter-annotation distance.
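
As one illustration of what the planned inter-annotation distance could look like (our illustrative choice, not the dissertation’s method), a relaxed criterion might accept near-matches using a similarity ratio instead of exact equality:

```python
from difflib import SequenceMatcher

def annotations_agree(a, b, min_similarity=0.9):
    """Treat two annotation spans as agreeing when their similarity exceeds a
    cutoff, rather than requiring the character-for-character match used in
    this study. Both SequenceMatcher and the 0.9 cutoff are assumptions."""
    return SequenceMatcher(None, a, b).ratio() >= min_similarity
```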

7.5.2 Analysis of the Impact of Dynamic Agreement Thresholds

One of the aspects we analyzed was whether the designed learnersourcing workflow benefitted from the adaptive threshold mechanism for selecting the number of learner submissions for each task. Figure 7.8 presents the results for the tasks in Stages 1 and 3, where the minimum submission threshold was 3 and the maximum submission threshold was 7. It is apparent that agreement was easier and quicker to reach for the Lookup task. For both tasks we can conclude that we are able to reduce expert or administrative engagement by at least 12% for the Lookup task and by at least 28% for the verification and annotation task.

Figure 7.8: Number of tasks that reached agreement for different numbers of submissions

7.5.3 Scoring and Analysis of Learner Submissions

Once the tasks are Accepted or Reviewed, quality analytics, as shown in Figure 7.6, is performed, as part of which learner submissions are marked as accepted or rejected. For tasks that reach an agreement and involve no annotation, a submission that matches the agreed response is marked RA-Accepted. If, in addition, a submission contains an annotation, then its classification is extended with AA-Accepted or AA-Rejected, depending on whether it corresponds with the agreed response. E-Submitted reflects that the submission was administrative or submitted by an expert. Table 7.2 shows the classification of the learner submissions.

Table 7.2: Learner Submission Classification for all tasks

                            Lookup              Search   Annotate
Submission State            P1    P2    Total   Total    P1    P2    Total
E-Submitted                 0     9     9       NA       8     5     13
RA-Accepted                 132   0     132     94       118   21    139
RA-Rejected                 0     0     0       3        13    0     13
RA-Accepted-AA-Accepted     124   73    197     NA       96    58    154
RA-Accepted-AA-Rejected     2     40    42      NA       32    5     37
RA-Rejected-AA-Accepted     7     5     12      NA       0     11    11
RA-Rejected-AA-Rejected     6     17    23      NA       35    5     40
Re-Runnable                 56    0     56      NA       NA    NA    NA
Total                       327   144   471     97       302   105   407

For Stage 1, only 5% of task submissions are fully rejected, indicating that a high confidence rate is met. Partial rejections account for only 11%. It must be noted that the cases where annotations are rejected are determined fully by the decision metric algorithm: the annotations must fully agree mutually, and no credit is given for partial agreement, given the concrete and closed ended nature of the task. We also compare the classifications in Phase 1 and Phase 2 for this task. A surprising result is the large number of RA-Accepted-AA-Rejected cases. The lack of RA-Accepted cases in Phase 2 is expected, since this case only arises when no Hadith reference is found in the text; in Phase 2 all tasks were followups of Phase 1 tasks and were therefore bound to contain a reference. For Stage 2, only 3 tasks were reviewable, and these were resolved administratively: the source of Hadith data contained some inconsistencies or differences

in the numbering scheme that caused these tasks to be marked as reviewable. For Stage 3, 10% of task submissions are fully rejected, indicating that a reasonably high confidence rate is met. Partial rejections account for 12%. It must be noted that of the cases where annotations are rejected by the decision metric algorithm, many are actually partially correct. It is difficult to apply a concrete acceptance threshold for open ended annotations; acceptance reflects only complete agreement, i.e. a distance of 0 between annotations. The submissions which correspond to the reviewable tasks are administratively marked accepted. We also compare the classifications in Phase 1 and Phase 2. It can be seen that there are no RA-Rejected cases in Phase 2, indicating that the response options provided more clarity to the learners. Also worth noting is that the percentage of cases where both the response and the annotation are rejected is reduced.

7.5.4 Analysis of Learners’ Performance

We also analyzed some learners’ individual patterns of performance over time, in order to learn aspects of learnability. Figure 7.9 shows two such patterns for the same task. We assign a numerical score to each learner submission classification: a score of 1 is given if the learner’s response is accepted, 0.5 if it is partially accepted, 0 if there was no decision, and -1 if it was fully rejected. Learner X’s pattern shows that after scoring -1 for 7 consecutive tasks, the learner starts to perform correctly on the task for most subsequent ones. This pattern was also seen for a few other learners.
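
The numeric mapping used for these performance curves can be written directly from the text (the outcome labels are assumed names):

```python
SCORE = {
    "accepted": 1.0,    # response (and annotation) accepted
    "partial": 0.5,     # partially accepted, e.g. response accepted, annotation rejected
    "undecided": 0.0,   # no decision reached for the task
    "rejected": -1.0,   # fully rejected
}

def performance_curve(outcomes):
    """Map a learner's chronological submission outcomes to the plotted scores."""
    return [SCORE[o] for o in outcomes]
```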

This type of analysis gave us the interesting insight that the system does in fact contribute to learnability, and the learners are able to learn and improve performance over time. Feedback from some of the learners confirmed this. Some learners reported that they either misunderstood or misinterpreted the question and answered wrongly, only to later realize the error in subsequent questions. Some asked whether they could go back and change their responses to previous questions; however, the system’s current design does not provide this provision. Some learners reported that they simply rushed through the responses initially and later realized they had answered some questions wrongly. This performance analysis and feedback from the learners also reflects that imposing inter-learner agreement makes our system robust to accidental and unintentional errors.

Figure 7.9: Performance patterns for two separate learners for Task 1

The analysis of learner Y’s performance provides further interesting insights. This pattern is peculiar because it shows high quality responses for the first 71 responses, whereas the last few tasks show partial or complete rejection. Further

analysis of this pattern revealed that although the responses were for the same task type, the tasks with lower performance were of the third, more complex reference type. We aim to further formalize these patterns of learner performance, in order to develop adequate measures of learner performance scoring and profiling. These learner and score profiles can play a significant role in further quality control. Also, learners who consistently perform well on a given task may be advanced as super users for that task type. We aim to implement task routing and recommendation based on learners’ performance and skill profiles. We believe that this will be crucial to the success of the system when it scales and other task types and workflows are included.

7.6 Discussion and Conclusions

From the results of the study presented in this chapter, we demonstrated that the adaptive process of increasing learner submissions to achieve inter-learner agreement, for generating thematic annotations, helps scale up the process of annotation without loss of quality. Our results clearly show that the system is robust and that the competency of the learners falls within the estimated error margin from which the submission thresholds were generated. The learners performed better than the expected competency in all cases. Depending on the difficulty of the task, the skills of the learners and the availability of the experts, these thresholds can be customized appropriately. Additional measures of quality control can be applied by reviewing the submissions of any workers with a consistent pattern of re-

jected submissions. We plan to infer deeper measures of quality through further administrative reviews. We were also able to derive important design implications from the iterative design and implementation of task workflows. We learnt that the task design, the clarity of instructions, and the answer options greatly impact the outcome. A complex task with many steps is likely to impose cognitive overload on the learners, resulting in reduced engagement and poor performance. This was handled by decomposing a complex task into independent units whose outputs were later combined into the meaningful interconnections and linkages originally sought. Tasks with closed ended and discrete outputs reached agreement more easily, without administrative or expert reviews, and their submissions were easier to classify. However, the open ended tasks required more administrative reviews, in particular for determining the correctness of the annotations provided by the learners. Such annotations are not necessarily wrong; rather, they imply a certain bias or assumption on the part of the learner based on prior knowledge and experience. Increasing the clarity of the question and answer options may help in reducing this bias, so that the impact of any prior assumptions involved while attempting the tasks is reduced to a minimum, if not completely eliminated. In future, we plan to conduct qualitative user studies to assess the resulting enhancement in the learning process.

Chapter 8

Semantic Hadith: A Framework for Publishing Linked Hadith Data

There has certainly been for you in the Messenger of Allah an excellent pattern for anyone whose hope is in Allah and the Last Day and [who] remembers Allah often. — [Al-Quran, Al-Ahzaab,33:21]

Chapter Overview

This chapter presents ongoing design and development efforts to semantically model and publish the Hadith, which holds a primary position as the next most important knowledge source after the Qur’an. We present the design of the linked data vocabulary for publishing these narrations as linked data, and also delineate the mechanism for linking these narrations with the verses of the Qur’an. We establish how the links between the Hadith and the Qur’anic verses may be captured and published using this vocabulary, as derived from the secondary and tertiary sources of knowledge. We present detailed insights into the potential, the design considerations and the use cases of publishing this wealth of knowledge as linked data.

8.1 Introduction

The vast amount of Islamic creed and legislation derives from and is based primarily on the two most fundamental sources of Islam: namely the Qur’an and the Sunnah (way of life) of the Prophet Muhammad. The latter is contained within the vast body of Hadith literature [88]. Formally, the Hadith is defined as the (recorded) narrations of the sayings and deeds of the Prophet Muhammad. Our research is primarily motivated to overcome the inherent knowledge acquisition bottleneck in creating semantic content in semantic applications.
1 This chapter is published as: Basharat, A., Abro, Bushra, I. Budak Arpinar, Rasheed, Khaled, "Semantic Hadith: Leveraging Linked Data Opportunities for Islamic Knowledge", In Proceedings of the Workshop on Linked Data on the Web, co-located with the 25th International World Wide Web Conference (WWW 2016), CEUR-WP, Vol-1593, p. 14.

Figure 8.1: A snapshot of a typical Qur’anic Commentary

We have established how this is particularly true for knowledge intensive domains such as the domain of Islamic knowledge, which has failed to capitalize on the promised potential of the semantic web and linked data technology; standardized web-scale integration of the available knowledge resources is currently not facilitated at a large scale [29].

8.1.1 Background Context and Motivation

Importance of Hadith

To understand the importance of Hadith, the principles of Qur’anic understanding and the science of tafseer, or exegesis, must be considered. The verses of the Qur’an cannot be understood in isolation. The Hadith are used to illustrate the historical context, the reasons for revelation, and elaborations of essential concepts that may not be directly evident. This important principle has been adopted by scholars across centuries to write scholarly commentaries and explanations. In fact, it is a necessary condition for producing an accurate tafseer of the Qur’an, as explained in detail by Philips [154]. To illustrate this principle, consider Figure 8.1, a snapshot derived from QuranComplex2, the official manuscript, with a translation and a commentary, provided by the Kingdom of Saudi Arabia. The snapshot shows two verses from the first chapter of the Qur’an. The translation is annotated with a commentary (given in the footnotes in this case) in order to provide additional details where important. It is worth noticing that the most authentic and reliable commentaries draw knowledge from the sources of Hadith. In the case of this snapshot, verse 2 contains an annotation which provides an elaboration based on an authentic Hadith from one of the many collections of Hadith, called Sahih Bukhari, which is known to be the most authentic and reliable Hadith collection.
2 http://qurancomplex.gov.sa/Quran/Targama/Targama.asp

Motivation: Potential for Knowledge Formalization and Linking

There are hundreds, if not thousands, of Qur’anic commentaries produced over the last few centuries, in various languages, that draw upon and rely heavily on the Hadith sources to provide an interpretation of the Qur’anic verses. Given this fact, the potential for knowledge formalization and linking is not only evident, it cannot be overemphasized. Formally modeling this wealth of knowledge and these links would enable new ways of research, knowledge discovery and synthesis - the very motivation for this research. However, realizing this vision across the plethora of Islamic resources is a mammoth task. We present some of the key challenges below.

Challenges in Interlinking Islamic Knowledge Sources

There have been some recent efforts to publish Islamic knowledge as linked data on the Linked Open Data (LOD) cloud. These efforts primarily focus on the Qur’an. The two datasets that we consider in our research and attempt to link with our Semantic Hadith research are SemanticQuran3 [169] and QuranOntology4 [78]. However, there are no known publicly available sources of data or vocabularies published as linked data for the Hadith. There are a number of well known Hadith repositories available which provide browsing and searching of the Hadith collections, sunnah.com and dorar.net being the most prominent ones. We review some of the state of the art in computational approaches applied to Hadith texts in Section 8.6. Here, we would like to emphasize that
3 http://datahub.io/dataset/semanticquran
4 http://www.quranontology.com

Figure 8.2: A Sample Hadith Snapshot

interlinking the Qur’anic verses and the Hadith is a non-trivial task. We summarize some factors that make this extremely challenging. Most of the classical sources do not use a standardized numbering scheme for the Hadith. This is contrary to the Qur’anic verses, which have a standardized numbering scheme. There are multiple sources of the Hadith, which may have different levels of authenticity - a matter of discussion beyond the scope of this paper. Despite the fact that most Hadith collections have now been classified into authenticity categories, the mapping of this classification to the sources that cite them is only possible if the Hadith are extracted and linked in a formalized manner. In addition,

the Hadith are of varying length, and oftentimes the commentator or tafsir scholar will quote only a part of the Hadith or make a passing reference to it, making it extremely difficult to trace the original Hadith being cited. Moreover, several Hadith may have common portions of narration, which makes it all the more challenging to identify which exact Hadith is being quoted or referred to. We believe that a knowledge formalization and linking mechanism, based on the linked data standards, is the way forward for addressing these challenges.

8.1.2 Contributions of this Chapter

In this chapter, we make the following contributions:

• We provide a first-of-its-kind linked data model, called Semantic Hadith, for publishing Hadith as Linked Data and for linking it with other key knowledge sources in the Islamic domain, primarily the Qur’an.

• We present a classification of the various levels of links that may potentially be established between the Hadith, the Qur’an and other data sets on the linked data cloud. This classification spans various levels of granularity. We highlight the linking challenges and design issues for each one and present potential modeling solutions.

• We provide a knowledge extraction, linking and publishing framework that may be reused for publishing similar knowledge and linking it with the existing linked data cloud. We present our preliminary implementation of this framework.

8.2 Ontology for Semantic Hadith

We first present an illustration of the structure of the Hadith, and then detail the design of the ontology for Semantic Hadith.

8.2.1 Hadith Structure

Figure 8.2 shows a sample Hadith taken from sunnah.com (http://sunnah.com/bukhari/1). A given Hadith has two main parts: the actual narration or content portion of the Hadith is called the Matan, and the chain of narrators (reporters) through whom the narration has been transmitted and then recorded is traditionally known as the Sanad, or simply the chain of narrators. The Sanad is a chronological chain of narrators, each mentioning the one from whom he heard the Hadith, all the way to the prime narrator of the Matan, followed by the Matan itself [157]. The Sanad plays the most important role in determining the authenticity of the Hadith, which is the most crucial indicator scholars resort to when deciding whether to accept or reject a Hadith.
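To make this two-part structure concrete, the following minimal Python sketch models a Hadith as a Matan together with an ordered Sanad; the class and field names are ours, chosen purely for illustration:

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class Narrator:
        name: str
        level: int   # position in the chain; higher levels are closer to the prime narrator

    @dataclass
    class Hadith:
        hadith_id: str
        sanad: List[Narrator]   # the chronological chain of narrators
        matan: str              # the actual narration text

        def prime_narrator(self) -> Narrator:
            # The narrator to whom the Matan is attributed (the end of the chain)
            return max(self.sanad, key=lambda n: n.level)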

8.2.2 Ontology Schema

Figure 8.3 shows the conceptual model for publishing Hadith data on the LOD cloud. Here we summarize the key entities and relations that we chose to include

Figure 8.3: High Level Design of Semantic Hadith Ontology

in the conceptual design model of the Semantic Hadith ontology schema.

• Hadith: This is the central entity in the domain model. Since there has never been a standardized numbering scheme for the Hadith, alternate numbering schemes may be encountered; therefore, provision is made to record alternate numberings.

• Matn: This is primarily a textual entity, which contains the main narration of the Hadith, without the chain of narrators (the Sanad).

• Narrator: A Narrator is essentially a Person, with the special role of a narrator of the Hadith. One narrator may have many Hadith attributed to him or her. If a narrator is the root narrator of a Hadith, then the Hadith is attributedTo him/her; this is shown by the relation between the Hadith and Narrator. Notice that in Figure 8.2, the English translation does not provide the entire NarratorChain; rather, it only provides the name of the narrator to whom the Hadith is attributed. This is not the case for the Arabic (original) version of the Hadith, which usually contains the entire chain of narrators. The chain is often omitted in the books to simplify the hadith text for the reader and make it more meaningful and relevant. However, the NarratorChain is considered indispensable for determining the validity and authenticity of the Hadith, especially if no other validation source is mentioned.

• Sanad (NarratorChain): This is an entity which contains a reference to a Narrator entity and a level, which indicates the sequence of the narrator in the chain. The same narrator may appear in many chains.

• HadithClass: This indicates the authenticity level of the Hadith. The authenticity classes are detailed in [157].

• HadithChapter, HadithBook and HadithCollection: These are entities meant for the structural organization of the Hadith. A Hadith is part of a Chapter, which usually contains thematically co-related collections of Hadith. Chapters are collected into Books, and Books are compiled into a Collection or Volume.

8.2.3 Vocabulary Design

We choose the hvoc prefix for the SemanticHadith vocabulary, as in the domain model. We also ensure reuse of well-established linked data vocabularies such as FOAF (http://xmlns.com/foaf/spec/) [41], SKOS (https://www.w3.org/2004/02/skos/) [136], and Dublin Core (http://dublincore.org/) [208]. We also provide equivalence relations where applicable; some of the most relevant equivalence relations are with the bibo ontology (http://bibliontology.com).
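As an illustration of how the vocabulary may be used, the following sketch builds a small graph with Python’s rdflib; the namespace IRIs and property names are illustrative assumptions, not the published vocabulary:

    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import DCTERMS, RDF

    # Assumed namespace IRIs, for illustration only
    HVOC = Namespace("http://example.org/semantichadith/vocab#")
    DATA = Namespace("http://example.org/semantichadith/resource/")

    g = Graph()
    g.bind("hvoc", HVOC)
    g.bind("dcterms", DCTERMS)

    hadith = DATA["bukhari/1"]
    g.add((hadith, RDF.type, HVOC.Hadith))
    g.add((hadith, HVOC.hadithNo, Literal(1)))
    g.add((hadith, HVOC.alternateHadithNo, Literal("1:1:1")))  # provision for alternate numbering
    g.add((hadith, DCTERMS.isPartOf, DATA["bukhari/book/1"]))  # reuse of Dublin Core

    print(g.serialize(format="turtle"))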

8.3 Link Modeling and Design Issues

One of the most important constituents in the design of Semantic Hadith is the aspect of facilitating the interlinking of knowledge at various levels. We have earlier described the Macro-Structure for Islamic Knowledge in [29]. We distinguish between the nature of links based on the level of granularity at which they are modeled. A Macro-Level Link is considered to be one where the source entity is either at the level of a Verse in the Qur’an or a Hadith in a Hadith Collection. If a link is established for a group of Verses or Hadith, then it is also considered to be at the Macro-level. A Micro-level link is at the sub-verse, sub-Hadith, word or phrase level. For the scope of this chapter, we detail only the most essential types of Macro-level links.

Figure 8.4: Conceptual Design model for Hadith-Hadith Relationship

8.3.1 Hadith-to-Hadith Links

An essential type of link to be established is one whereby one Hadith is linked or related to another Hadith. This could be done for Hadith which are part of the same collection, or between Hadith that are part of different collections. These relations may be of the following primary types: 1) two Hadith may be considered related if they have the same Sanad; 2) two Hadith may be considered related if they have the same Matan. Note that two Hadith may occur in the same collection, in two different chapters,

under different thematic categorizations; however, they may be enumerated or numbered differently. Therefore, by asserting these Hadith as similar/related or identical, we aim to make these links explicit. Oftentimes, the same Hadith may be made part of a different collection, and asserting an identity link therefore becomes crucial. This is illustrated in Figure 8.4. To handle the annotations between two Hadith, we define an entity called HadithRelation, for which the source and destination represent the two ends of the relation. The relation will often have a common Theme. The RelationType indicates whether the two Hadith are similar, indicated by Identity as the RelationType, or whether one Hadith elaborates another, indicated by Elaboration, and so forth. These relation types are not exhaustive and may be iteratively refined.
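A minimal sketch of asserting such a reified relation, reusing the illustrative (assumed) namespaces from the previous sketch:

    from rdflib import BNode, Graph, Namespace
    from rdflib.namespace import RDF

    HVOC = Namespace("http://example.org/semantichadith/vocab#")   # assumed IRIs
    DATA = Namespace("http://example.org/semantichadith/resource/")

    g = Graph()
    rel = BNode()  # the HadithRelation entity reifying the link
    g.add((rel, RDF.type, HVOC.HadithRelation))
    g.add((rel, HVOC.source, DATA["bukhari/52"]))        # hypothetical Hadith identifiers
    g.add((rel, HVOC.destination, DATA["muslim/103"]))
    g.add((rel, HVOC.relationType, HVOC.Identity))       # or HVOC.Elaboration, etc.
    g.add((rel, HVOC.hasTheme, DATA["theme/fasting"]))   # the common Theme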

8.3.2 Linking the Qur’an and Hadith

One of the most significant aspects of linking the Hadith dataset is linking it with the verses in the existing Qur’an datasets. We distinguish two types of relationships that may occur between the Qur’an and the Hadith: 1) there may be Verses, all or part of which may be cited or quoted in a Hadith - this is the most direct kind of relation that exists between a Hadith and a Verse; 2) the other relations are those that can only be derived from scholarly commentaries. The design and modeling issues for both these types are delineated below.

Figure 8.5: Conceptual Design model for Hadith-Verse Relationship

Verse to Hadith Links based on Direct Citations

A direct link between a Hadith and a Verse is characterized as one whereby a Hadith contains within its main body a complete verse or a meaningful portion of it. This is modeled in Figure 8.5. A Citation entity is created, which is a specific reference to a relation with its source as a Hadith and its destination as the Verse, indicating that it is the Hadith that encapsulates the Verse. It is considered important that we characterize the CitationType as either Complete, Partial or In-Direct. A Complete citation includes the entire verse in the body of the Hadith, quoted as such. A Partial citation may only contain part of the Verse in the body of the Hadith. To indicate this, the

SubVerse entity is introduced, which identifies the part of the Verse citedIn the Hadith. This is indicated by the relation characterizedBy. It is important to annotate and capture the sub-verse, since different portions of the same verse may be linked to different Hadith.
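Following the same illustrative conventions, a Partial citation linking a Hadith to the cited portion of a Verse might be asserted as follows (again, all IRIs and names are assumptions, not the published schema):

    from rdflib import BNode, Graph, Namespace
    from rdflib.namespace import RDF

    HVOC = Namespace("http://example.org/semantichadith/vocab#")     # assumed IRIs
    HDATA = Namespace("http://example.org/semantichadith/resource/")
    QDATA = Namespace("http://example.org/semanticquran/resource/")

    g = Graph()
    cite = BNode()  # the Citation entity
    g.add((cite, RDF.type, HVOC.Citation))
    g.add((cite, HVOC.source, HDATA["bukhari/4583"]))      # the citing Hadith
    g.add((cite, HVOC.destination, QDATA["verse/2/255"]))  # the cited Verse
    g.add((cite, HVOC.citationType, HVOC.Partial))
    subverse = BNode()                                     # SubVerse: the cited portion
    g.add((subverse, RDF.type, HVOC.SubVerse))
    g.add((cite, HVOC.characterizedBy, subverse))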

Verse to Hadith Links based on Scholarly Commentaries

Another important type of link to be established between the Hadith and the Qur’anic verses is shown in the model conceptualized in Figure 8.6. This is based on the earlier motivation, provided on the basis of Figure 8.1. In this type of relation, we create an entity Verse-Hadith-Relation. In this case, the source is a Verse and the destination is a Hadith, because the Hadith will always be used to elaborate or provide the context for the verse in any given commentary or book of exegesis. The RelationType may be provided. In this relation type, the most important aspect is establishing the source of the authority of the relation. This is established by the relation uponAuthorityOf with a Scholar and the relation establishedIn with a Book. The Book is naturally authoredBy the Scholar to whom the relation is attributed.

8.3.3 Linking Hadith with other Datasources

We aim to provide the provision of linking Semantic Hadith with other available datasources in the LOD cloud. We present a high-level view of the linked cloud model for Islamic knowledge in Figure 8.7. We also mention those datasources which, although not directly available on the LOD cloud, present potential for linking.

Figure 8.6: Conceptual Design model for Verse to Hadith Relationship based on Scholarly Commentaries

Linking with Existing Datasources in the LOD Cloud

The two available datasources to which Semantic Hadith is linked are the QuranOntology and SemanticQuran. Semantic Quran links itself to DBpedia (http://dbpedia.org) and Wiktionary (http://wiktionary.dbpedia.org/). Links will be established between entities in the Hadith dataset and those in these two datasets to begin with; for this, information extraction will be carried out. There are some important datasources which are not directly part of the linked data cloud but have been made available through QuranOntology and SemanticQuran. These are shown in Figure 8.7, namely:

QuranyTopics (http://quranytopics.appspot.com), QuranCorpus (corpus.quran.com), and Tanzil (tanzil.net). One of the essential linking aspects will be to thematically map the QuranyTopics to the HadithTopics.

Linking with other sources

There are other datasources that we plan to link with in the future, such as a scholars database from a source like MuslimScholarsDatabase (http://muslimscholars.info) or eNarrator (a Hadith Isnad ontology) [157], [19]. The major limitation is that these sources are not currently available in a linked data format. However, they present huge potential for linking.

8.4 Linked Data Publishing Framework and Implementation

In order for Semantic Hadith to become a de facto standard and an integral part of the emerging Semantic Web and the LOD cloud for the Islamic knowledge domain, we also aim to provide a reusable framework for publishing available Hadith-based knowledge sources as linked data. This is shown in Figure 8.8. As elaborated in Section 8.1.1, there are multiple hadith repositories available. This reusable framework will therefore enable multiple hadith publishers not only to expose their data, but also to establish equivalence links with other repositories. This would

be essential towards realizing the vision of linked Islamic knowledge as presented in [29].

8.4.1 Overview of the Framework

The key stages of the framework shown in Figure 8.8 are: 1) Data Selection, where the data source is selected; 2) Vocabulary Design and Selection, where conceptual and formal knowledge modelling is carried out; 3) Knowledge Extraction, where the process of information and knowledge extraction is carried out; 4) RDF Generation, where the extracted knowledge is converted into the RDF format; 5) Publishing, Linking and Validation, where the converted RDF data is made available via a SPARQL endpoint; and 6) Consumption, the last stage, where the dataset, now available as linked data, may be consumed by applications.
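Read as a pipeline, the stages compose sequentially. The sketch below is a purely illustrative skeleton of such an orchestration; the function names and signatures are ours, not the framework’s actual API:

    from rdflib import Graph

    def select_data(source_id: str) -> str:
        """1) Data Selection: obtain the raw dump for the chosen source."""
        return f"raw dump for {source_id}"

    def extract_knowledge(raw: str) -> list:
        """3) Knowledge Extraction, guided by the vocabulary designed in stage 2."""
        return []

    def generate_rdf(facts: list) -> Graph:
        """4) RDF Generation: convert extracted facts into triples."""
        return Graph()

    def publish_and_link(graph: Graph) -> None:
        """5) Publishing, Linking and Validation: serialize for exposure via an endpoint."""
        graph.serialize(destination="dataset.ttl", format="turtle")

    def run_pipeline(source_id: str) -> Graph:
        graph = generate_rdf(extract_knowledge(select_data(source_id)))
        publish_and_link(graph)
        return graph  # 6) Consumption: ready for use by applications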

8.4.2 Implementation Details

We provide some key details of the ongoing implementation process, covering the dataset used for publishing as linked data and the knowledge extraction and linking mechanisms. We summarize some key results and also highlight some challenges and limitations faced in the implementation process.

Data Sources

As the first Hadith repository to be annotated using the Semantic Hadith model, we have taken the data of Sunnah.com, which is a structured data repository of some of the most well-known and authentic collections of Hadith. The foremost

Figure 8.7: A view of the proposed and available Linked Data Cloud for Islamic Knowledge Sources

collections are those of Sahih Bukhari and Sahih Muslim. Altogether, there are 11 collections in this dataset, with over 25,000 Hadith.

Knowledge Extraction and Linking

For the initial implementation, we focused on extracting some of the key relations explained earlier. We extracted Verse-Hadith links from the QComplex commentary (http://qurancomplex.gov.sa/Quran/Targama/Targama.asp). This is one of the only datasources through which we were able to extract numbered hadith references that could be automatically mapped to the hadith collections available to us. An example of such a reference is shown in Figure 8.1. A pattern extraction module was designed to parse the contents of the commentary. The content of the verses, the translation and the footnotes were segmented. The mapping between the verses and the corresponding footnotes was easy, given the direct correlation. Pattern matching was then applied to extract the collection name, volume number and hadith number, which were then mapped to the numbers in our hadith collection. This can be challenging at times, because not all hadith collections use the same numbering convention; in such cases, it is non-trivial to map a hadith citation to the corresponding hadith in the repository, and human intervention is required for validation. We were able to obtain and validate some 300 verse-to-hadith relations. Since the commentary is not a detailed one - comments are only sparingly included as footnotes to the verse translations - this number was expected to be small.
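A minimal sketch of such pattern matching follows; the regular expression and the footnote wording are illustrative assumptions, since the actual commentary footnotes vary in form:

    import re

    # Illustrative pattern for footnotes like "Sahih Al-Bukhari, Vol. 6, Hadith No. 4583"
    REF_PATTERN = re.compile(
        r"(?P<collection>Sahih Al-Bukhari|Sahih Muslim)"
        r".{0,20}?Vol\.?\s*(?P<volume>\d+)"
        r".{0,20}?No\.?\s*(?P<hadith>\d+)",
        re.IGNORECASE,
    )

    def extract_hadith_refs(footnote: str):
        """Return (collection, volume, hadith number) tuples found in a footnote."""
        return [(m["collection"], int(m["volume"]), int(m["hadith"]))
                for m in REF_PATTERN.finditer(footnote)]

    print(extract_hadith_refs(
        "See the narration in Sahih Al-Bukhari, Vol. 6, Hadith No. 4583."))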

We also performed text mining on the Arabic text of the hadith data to obtain the Hadith-Verse citations described in Section 8.3.2. For this, we developed a verse-extraction component, which treats the task as a sub-string matching problem in order to detect complete or partial verses that may be cited in a given hadith. This is non-trivial for several reasons. Different verses span different lengths in the Qur’an: while some may be as long as an entire page of a standard book, others may be as short as one or two words. Therefore, determining whether a verse is actually being quoted or cited in a hadith requires further validation. Even applying a threshold relative to the length of the verse is not an optimal solution. Setting a substantial minimal length was considered, but this may not guarantee comprehensive coverage. For the first prototype, only 1,325 expert-validated links were asserted. In the sunnah.com data, these links may be found as hyperlinks to the verses on the site quran.com. In addition, similarity computation algorithms were devised to extract Hadith-Hadith similarity relations. The 4,973 relations listed in Table 8.2 are strongly similar Hadith that have at least 60% of their text in common. However, the challenge with this approach is that it cannot distinguish whether the similarity is in the Sanad, the Matan or both. The more meaningful similarities of interest are in the Matan of the hadith. In future experiments, we aim to segment the Sanad and the Matan and extract the respective similarity relations. While the similarity threshold for the current approach only took into consideration the common substring, we plan to conduct experiments with more meaningful similarity measures such as Cosine, Jaccard and the Pearson correlation coefficient, as done in our

work for Qur’anic verses [24].
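To illustrate the two text-mining steps, the sketch below detects a candidate verse citation by simple substring matching and scores Hadith-Hadith similarity with a word-set Jaccard measure; the thresholds and tokenization are simplified assumptions, not the implemented algorithm:

    def cites_verse(hadith_text: str, verse_text: str, min_len: int = 15) -> bool:
        """Candidate Hadith-Verse citation: the verse occurs verbatim in the hadith.
        Very short verses remain ambiguous and need expert validation."""
        return len(verse_text) >= min_len and verse_text in hadith_text

    def jaccard(a: str, b: str) -> float:
        """Word-set Jaccard similarity between two hadith texts."""
        wa, wb = set(a.split()), set(b.split())
        return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

    def strongly_similar(a: str, b: str, threshold: float = 0.6) -> bool:
        # Analogous to the 60% common-text criterion behind Table 8.2, although
        # the implemented criterion was based on common substrings, not word sets.
        return jaccard(a, b) >= threshold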

Results

Based on the dataset and experiments carried out, we summarize the dataset and link statistics in Tables 8.1 and 8.2. Table 8.1 summarizes the statistics for some of the key entities present in the dataset. Table 8.2 provides the raw counts for the candidate relations extracted under the different categories mentioned. It must be noted, however, that the relations are not yet classified according to the parameters mentioned in the design. It is also worth mentioning that some of these relations may actually be symmetric.

Table 8.1: Entity Statistics in the Semantic Hadith Dataset (Sunnah.com)

Entity                    Count
No of Collections         11
No of Books               311
No of Hadith (Arabic)     25,934
No of Hadith (English)    18,040
No of Chapters            8,968

Table 8.2: Link Statistics in the Semantic Hadith Dataset

Link Type                            Count
Hadith-Hadith Relation               4,973
Hadith-Verse Relation (Citations)    1,325
Verse-Hadith Relation (Scholarly)    313

Figure 8.8: Linked Data Generation and Publishing Framework for Semantic Hadith

8.4.3 Existing Limitations and Proposed Solutions

Some of the key limitations we face are with respect to link extraction and validation. There is an obvious lack of structured knowledge sources with well-marked citations. The Verse-Hadith links are therefore extremely difficult to extract using purely computational means; human contribution is a must. For this purpose, we intend to pursue a crowdsourcing approach, based on our prior work [28]. We intend to use crowdsourcing and human computation methods not only for knowledge acquisition, but also for knowledge validation. In fact, we believe a hybrid human-machine computation methodology to be the only viable

means of fulfilling the vision for linked Islamic knowledge at scale while ensuring the desired reliability and authenticity.

8.5 Prospective Applications

The most significant benefit of realizing the linked data vision for Islamic knowledge sources will be in enabling semantics driven, distributed knowledge search and retrieval. Most current applications in the Islamic domain provide only limited provision for semantic and conceptual search and retrieval beyond traditional keyword based searches over a single repository. With the Semantic Hadith model, first-of-their-kind tools become possible that allow Qur’an and Hadith repositories to be queried and searched in a federated manner. The following listing illustrates a federated query between the Semantic Hadith and Semantic Quran datasets. Given that a Verse-Hadith relation exists within the Semantic Hadith dataset, this query retrieves the Arabic and English texts for the respective verse.

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX hvoc: <...>      # SemanticHadith vocabulary IRI (elided in the original)
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX qvoc: <...>      # SemanticQuran vocabulary IRI (elided in the original)

SELECT ?hadith_text ?surahNo ?verseNo ?ayahText ?ayahEng
WHERE {
  ?verse hvoc:isRelatedTo ?hadith ;
         hvoc:verseNo ?verseNo ;
         hvoc:surahNo ?surahNo .
  ?hadith hvoc:hadithId ?hId ;
          hvoc:hadithText ?hadith_text .

  SERVICE <...> {       # SemanticQuran endpoint IRI (elided in the original)
    ?s qvoc:chapterIndex ?surahNo ;
       qvoc:verseIndex ?verseNo ;
       rdfs:label ?ayahText ;
       rdfs:label ?ayahEng .
    FILTER (lang(?ayahEng) = "en" && lang(?ayahText) = "ar")
  }
}
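Such a query can also be executed programmatically; a minimal sketch using the Python SPARQLWrapper library follows, where the endpoint URL is a placeholder, since no public endpoint address is given here:

    from SPARQLWrapper import SPARQLWrapper, JSON

    query_text = """
    # paste the federated query shown above here
    SELECT ?hadith_text ?verseNo WHERE { ?hadith ?p ?o } LIMIT 1
    """

    # Placeholder endpoint; substitute the actual Semantic Hadith SPARQL endpoint.
    sparql = SPARQLWrapper("http://example.org/semantichadith/sparql")
    sparql.setQuery(query_text)
    sparql.setReturnFormat(JSON)

    results = sparql.query().convert()
    for row in results["results"]["bindings"]:
        print(row)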

This could be taken further by adding another level of federation and querying the themes of the verse from the QuranOntology.

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX hvoc: <...>      # SemanticHadith vocabulary IRI (elided in the original)
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX qur: <...>       # QuranOntology vocabulary IRI (elided in the original)

SELECT ?hadith_text ?surahNo ?verseNo ?tname
WHERE {
  ?verse hvoc:isRelatedTo ?hadith ;
         hvoc:verseNo ?verseNo ;
         hvoc:surahNo ?surahNo .
  ?hadith hvoc:hadithId ?hId ;
          hvoc:hadithText ?hadith_text .

  SERVICE <...> {       # QuranOntology endpoint IRI (elided in the original)
    ?verse qur:DiscussTopic ?t .
    ?t rdfs:label ?tname .
    FILTER (LANGMATCHES(LANG(?tname), "ar"))
  }
}

This could be further enhanced by automated interlinking with other available datasources on the linked data cloud, as envisioned in Figure 8.7. For instance, once the available hadith are annotated with the events, places or people they mention, they may be linked to the corresponding entities in DBpedia. This would enable richer knowledge discovery and retrieval for a range of applications. We expect that, using this model, more hadith and Qur’anic exegesis repositories, which also rely on and heavily cite the hadith sources, will be published in

the linked data format. This will enable the design and development of enhanced learning tools for the Islamic domain, which will provide efficient and personalized access to primary sources of knowledge while ensuring reliability and authenticity. Given that these tools will give better access to meaningfully interlinked knowledge, less effort will be required to find resources and access knowledge beyond books. More content, both classical and contemporary, will become discoverable.

8.6 Related Work

The linked data approach has emerged as the de facto standard for sharing data on the web. It provides a set of best practices for publishing and connecting structured data on the web [37]. The linked data design issues provide guidelines on how to use standardized web technologies to set data-level links between data from different sources [89]. Increased interest in the LOD has been seen in various sectors, e.g. education [53], [155], scientific research [16], libraries [137], [95], government [56], [90], [166], cultural heritage [63] and many others; however, the religious sector has yet to capitalize on the power of linked open data. Research in computational informatics applied to Islamic knowledge has primarily centered around morphological annotation of the Qur’an [60], [59], ontology modeling of the Qur’an [7], [21], [65], [211], [212], and Arabic natural language processing [65]. LOD take-up in the area of Islamic knowledge has been extremely limited. As mentioned earlier, there have been some recent efforts to publish Islamic knowledge as linked data on the Linked Open

Data (LOD) cloud. These efforts primarily focus on the Qur’an. The two datasets that we consider in our research and attempt to link with our Semantic Hadith research are SemanticQuran [169] and QuranOntology [78]. Much of the work in the Hadith sciences has focused on automating the extraction of the chain of narrators; some works in this regard include [85], [87], [83], [84], [86]. There are also works on mining the hadith for indexing and classification [9], [145]. Some recent efforts have attempted to model the hadith as semantic ontologies [19], [157]; however, these efforts have focused on annotating the different constituents of the hadith, and none of these datasources are available as open source. Our work is the first of its kind to propose a linked data based model for linking the hadith with the Qur’an. This linked knowledge forms a vital backbone for better integration and discovery of knowledge sources.

8.7 Conclusions and Future Work

In this chapter we presented the design and development of our Semantic Hadith framework, which aims to provide the foundation for semantically interlinking the most important Islamic knowledge sources using the linked data standards. We presented the design of the Semantic Hadith ontology and explained the nature of links with other data sources. The implementation still needs to mature. The validation of the links and extracted knowledge is a huge challenge we are looking

into. We are investigating crowdsourcing models for knowledge acquisition and validation at scale.

Chapter 9

Conclusion

And We have certainly made the Qur’an easy for remembrance, so is there any who will remember? — [Al-Quran, Al-Qamar,54:17]

Chapter Overview

This chapter provides a summary of the work presented in this dissertation and includes a review of the main contributions and conclusive remarks about the research conducted. The scope and directions for future research are also presented.

9.1 Research Summary and Contributions

We believe that this is the first research of its kind that attempts to formally model, interlink and synthesize historical and multi-lingual religious texts using novel human computation methods and workflows. The primary contribution of this research is towards methods that allow for collecting and processing large-scale semantic annotations from the crowd, learners and experts. We also introduce semantics driven interaction techniques to augment annotation and knowledge discovery scenarios for the workers. The method also ensures reliability by introducing ’expert sourcing’ workflows tightly integrated within the system. Therefore, quantitative measures of ensuring data quality are woven into the very fabric of crowdsourcing and learnersourcing. The contributions are made possible by uniquely extending and combining the following family of techniques: crowdsourcing, to aggregate small contributions into meaningful artifacts, and social computing, to motivate participation. We also employ semantics driven text analysis, annotation and linking, and techniques such as text mining and linked open data processing to complement learner input. In addition, we draw inspiration from learning science to inform the design of the learnersourcing tasks in order to make the study pedagogically meaningful. We summarize our key contributions as presented in this dissertation:

• In Chapter 2, we provided insights into understanding the vision, the context, the need and the potential for Linked Open Islamic Knowledge; we presented the proposed Macro-structure of linked Islamic knowledge and provided the classification and the nature of links at the knowledge level. We also provided an understanding of the requirements for achieving this vision, and of the required foundations, by proposing a high-level conceptual framework that would need to be developed to achieve this vision. We also examined the role of Human Computation and how it may be used in overcoming the key challenges associated with resolving the knowledge acquisition bottleneck in order to achieve the LOD vision for Islamic knowledge.

• In Chapter 3, we presented a comprehensive survey of the intersection of the semantic web and the human computation paradigm. We adopted a twofold approach towards understanding this intersection. As the primary focus, we analyzed how the semantic web domain has adopted the dimensions of human computation to solve its inherent problems. We presented an in-depth analysis of the need for human computation in semantic web tasks such as ontology engineering and linked data management. We provided a ‘collective intelligence genome’ adapted for the semantic web as a means to analyze the threads of composing semantic web applications using human computation methods. As a secondary contribution, we also analyzed existing research efforts through which the human computation domain has been better served with the use of semantic technologies. We presented a comprehensive view of the promises and challenges offered by the successful synergy of semantic web and human computation. We also discussed several key outstanding challenges and proposed some open research directions.

• In Chapter 4, we evaluated a variety of document similarity measures using multi-lingual, heterogeneous representations of the Qur’an. To do this, we developed a similarity computation framework for the verses of the Qur’an. We experimented with four different dataset representations, four similarity measures and three different feature representations, and provided comparative insights into the results obtained. Our study proved useful for providing seed candidate relations for initiating further crowdsourcing and learnersourcing studies.

• In Chapter 5, we presented the design and development of a semantically enriched task management and workflow generation mechanism applied to ontology verification, more specifically in the domain of linked data. We devised a task management, workflow specification and automation mechanism that aims to make this process more efficient. This framework forms the foundation of the crowdsourcing module which was utilized in the case studies presented in Chapter 6.

• In Chapter 6, we augmented traditional ontology engineering approaches with human contribution through established crowdsourcing and human computation methods. We designed and developed a semantics driven framework for obtaining thematic annotations from historical and religious knowledge sources. Our framework proposes the use of expertise driven task and crowd profiles for crowdsourcing ontology engineering tasks at varying levels of granularity and knowledge intensiveness. A semantics based task specification model allows for composing automated knowledge acquisition workflows that combine machine computation with crowdsourced annotations at two levels. At the lower level, simple and less knowledge intensive tasks are crowdsourced using the Amazon Mechanical Turk platform. At the higher level, for more knowledge intensive tasks, skilled workers and experts are engaged through a custom web application. We demonstrated the effectiveness of this hybrid model for several key knowledge engineering tasks, such as thematic disambiguation and thematic annotation, in the Qur’an.

• In Chapter 7, we leveraged learners with varied skills and background knowledge to scale the process of semantic annotation and interlinking. We presented the design and implementation of a composite learnersourcing workflow composed of two levels. We introduced expertsourcing tasks, tightly integrated within the learnersourcing tasks, for validating learner contributions and ensuring high annotation quality and reliability. Our workflow design is driven by semantics based task specifications and learner profiles, based on which we devise a mechanism for adaptively selecting the adequate number of learner contributions in order to find the right balance of learners and experts, maximizing learner input while minimizing expert time and contributions. We also introduced the concept of a dynamic task lifecycle manager that manages the tasks’ transition through the learnersourcing and expertsourcing stages. We demonstrated the effectiveness of our workflow through a prototype implementation of the system tested with up to 30 learners. We presented the results of this study and showed that we can generate high quality annotations by engaging learners, while reducing the need for contributions from experts. Different learners with different skill levels can engage by contributing a little, even if they cannot manage the whole task on their own. We performed an analysis of learners’ performance and also reflected on the learnability aspects derived from the process.

• In Chapter 8, we provided a first-of-its-kind linked data model, called Semantic Hadith, for publishing Hadith as Linked Data and for linking it with other key knowledge sources in the Islamic domain, primarily the Qur’an. We also presented a classification of the various levels of links that may potentially be established between the Hadith, the Qur’an and other data sets on the linked data cloud. This classification spans various levels of granularity. We highlighted the linking challenges and design issues for each one and presented potential modeling solutions. We also provided a knowledge extraction, linking and publishing framework that may be reused for publishing similar knowledge and linking it with the existing linked data cloud. We presented our preliminary implementation of this framework.

9.2 Research Significance and Potential Impact

In this dissertation, we proposed a semantics driven framework for combining human and machine computation for the purpose of knowledge engineering. The framework utilizes semantics based task, workflow and worker profiles, and allows for the design and execution of iterative and dynamic workflows based on these profiles. The problem this research tackles - engineering formalized knowledge models for an enhanced knowledge seeking experience in the Islamic knowledge domain - is a high-impact yet non-trivial one. The knowledge at hand is sensitive, and ensuring the credibility and authenticity of knowledge sources is challenging. We demonstrated that a hybrid approach, whereby contributions from crowds and experts, selected on the basis of skills and knowledge background, are combined with automated approaches, can result in robust and efficient semantic annotation in specialized domains. From the outcome of the learnersourcing methodology, we envision a number of novel applications leveraging the knowledge models that result from the study.

Semantic Knowledge Retrieval and Search: The most significant benefit will be towards enabling semantics driven knowledge search and retrieval.

Enhanced Knowledge Discovery and Learning Tools: This study is one step forward towards enabling enhanced learning tools for the Islamic domain, which will provide efficient and personalized access to primary sources of knowledge, ensuring reliability and authenticity.

9.3 Future Research

We highlight some areas of future research that emerge from the work undertaken as part of this research. This is not meant to be an exhaustive list; several directions have been mentioned in each individual chapter.

• Our initial studies show high quality, robust and favorable results for thematic disambiguation and annotation tasks. We plan to increase the size of the study and experiment with a range of other task designs of varying complexity.

• We also plan to undertake an iterative design approach to revise not only the design of the learner task scenarios, but also the design of the interface and the interaction. We will explore the interplay of better learning methodologies in combination with quizzes, assignments and perhaps group activities for better entailments of thematic classifications.

• While we foresee the investigation into the learning benefits resulting from the learnersourcing efforts as one of the primary objectives towards which future efforts will be directed, a secondary and parallel analysis is planned of the spiritual and religious effects of the learning achieved, in addition to the pedagogical ones. This would give interesting insights into aspects of ’techno-spirituality’.

• We also envision making the framework generalizable to other knowledge intensive domains.

• There is also a need to publish more data sources as LOD for Islamic knowledge in order to expand the linked data cloud. On a similar note, applications that consume this knowledge also need to be developed.

Say, "In the bounty of Allah and in His mercy - in that let them rejoice; it is better than what they accumulate." — [Al-Quran, Yunus,10:58]

Bibliography

[1] URL https://www.mturk.com/mturk/welcome.

[2] Noorhan Hassan Abbas. Quran’search for a concept’tool and website. Un- published thesis, University of Leeds, 2009.

[3] M. Acosta, A. Zaveri, E. Simperl, D. Kontokostas, S. Auer, and J. Lehmann. Crowdsourcing linked data quality assessment. Semantic Web - ISWC 2013, Part II, 8219:260–276, 2013. ISSN 0302-9743.

[4] Maribel Acosta, Elena Simperl, Fabian Flöck, and Barry Norton. A sparql engine for crowdsourcing query processing using microtasks–technical report. Institute AIFB, KIT, Karlsruhe, 2012.

[5] Benjamin Adams, Gerhard Wohlgenannt, Marta Sabou, Florian Hanika, et al. Crowd-based ontology engineering with the ucomp protégé plugin. Semantic Web, (Preprint):1–20.

[6] Tanja Aitamurto, Aija Leiponen, and Richard Tee. The promise of idea crowdsourcing–benefits, contexts, limitations. Whitepaper for Nokia Ideas Project, 2011.

[7] Hend S Al-Khalifa, Maha Al-Yahya, Alia Bahanshal, Iman Al-Odah, and Nawal Al-Helwah. An approach to compare two ontological models for representing quranic words. In Proceedings of the 12th International Conference on Information Integration and Web-based Applications and Services, pages 674–678. ACM. ISBN 1450304214.

[8] Mohammed Al Qady and Amr Kandil. Automatic clustering of construction project documents based on textual similarity. Automation in Construction, 42:36–49, 2014.

[9] Kawther A Aldhlan and Akram M Zeki. Datamining and islamic knowledge extraction: alhadith as a knowledge resource. In Information and Communication Technology for the Muslim World (ICT4M), 2010 International Conference on, pages 21–25. IEEE, 2010.

[10] K. Alexander, R. Cyganiak, M. Hausenblas, and J. Zhao. Describing Linked Datasets - On the Design and Usage of VoiD, the "Vocabulary of Interlinked Datasets". In Linked Data on the Web Workshop, Spain, 2009.

[11] O. Alonso. Perspectives on Infrastructure for Crowdsourcing. Workshop on Crowdsourcing for Search and Data Mining, Hong Kong, China, 2011.

[12] UniProt Consortium and others. The universal protein resource (UniProt). Nucleic Acids Research, Oxford Univ Press, 2008.

[13] Lora Aroyo and Chris Welty. Harnessing disagreement in crowdsourcing a

relation extraction gold standard. Technical report, Tech. Rep. RC25371 (WAT1304-058), IBM Research, 2013.

[14] Lora Aroyo and Chris Welty. Measuring crowd truth for medical relation extraction. 2013 AAAI Fall Symposium Series, 2013.

[15] Lora Aroyo and Chris Welty. Crowd truth: Harnessing disagreement in crowdsourcing a relation extraction gold standard. WebSci2013. ACM, 2013.

[16] T. K. Attwood, D. B. Kell, P. McDermott, J. Marsh, S. R. Pettifer, and D. Thorne. Utopia documents: linking scholarly literature with research data. Bioinformatics, 26(18):i568–74, 2010. URL http://www.ncbi.nlm.nih.gov/pubmed/20823323.

[17] Eric Atwell, Nizar Habash, Bill Louw, Bayan Abu Shawar, Tony McEnery, Wajdi Zaghouani, and Mahmoud El-Haj. Understanding the quran: A new grand challenge for computer science and artificial intelligence. ACM-BCS Visions of Computer Science 2010, 2010.

[18] Eric Atwell, Claire Brierley, Kais Dukes, Majdi Sawalha, and Abdul-Baquee Sharaf. An artificial intelligence approach to arabic and islamic content on the internet. Proceedings of NITS 3rd National Information Technology Symposium, 2011.

[19] Aqil Azmi and Nawaf Bin Badia. itree-automating the construction of the narration tree of hadiths (prophetic traditions). pages 1–7, 2010.

[20] Chris Baillie, Peter Edwards, and Edoardo Pignotti. Quality reasoning in the semantic web. In The Semantic Web–ISWC 2012, pages 383–390. Springer, 2012.

[21] Sumayya Baqai, Amna Basharat, Hira Khalid, Amna Hassan, and Shehneela Zafar. Leveraging semantic web technologies for standardized knowledge modeling and retrieval from the holy qur’an and religious texts. In Proceedings of the 7th International Conference on Frontiers of Information Technology, page 42. ACM. ISBN 1605586420.

[22] Linked Movie Data Base (LinkedMDB). URL http://linkedmdb.org/.

[23] A. Basharat, I. B. Arpinar, S. Dastgheib, U. Kursuncu, K. Kochut, and E. Dogdu. CrowdLink: Crowdsourcing for large-scale linked data management. In Semantic Computing (ICSC), 2014 IEEE International Conference on, pages 227–234. IEEE, 2014.

[24] A Basharat, D Yasdansepas, and K Rasheed. Comparative study of verse similarity for multi-lingual representations of the qur’an. In Proc. on the Int. Conference on Artificial Intelligence (ICAI), pages 336–343, 2015.

[25] Amna Basharat. Learnersourcing thematic and inter-contextual annotations from islamic texts. In Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems, pages 92–97. ACM, 2016.

[26] Amna Basharat, I. Budak Arpinar, Shima Dastgheib, Ugur Kursuncu, Krys Kochut, and Erdogan Dogdu. Crowdlink: Crowdsourcing for large-scale linked data management. In Semantic Computing (ICSC), 2014 IEEE International Conference on, pages 227–234. IEEE, 2014.

[27] Amna Basharat, I Budak Arpinar, Shima Dastgheib, Ugur Kursuncu, Krys Kochut, and Erdogan Dogdu. Semantically enriched task and workflow au- tomation in crowdsourcing for linked data management. International Jour- nal of Semantic Computing, 8(04):415–439, 2014.

[28] Amna Basharat, I Budak Arpinar, Shima Dastgheib, Ugur Kursuncu, Krys Kochut, and Erdogan Dogdu. Semantically enriched task and workflow au- tomation in crowdsourcing for linked data management. International Jour- nal of Semantic Computing, 8(04):415–439, 2014.

[29] Amna Basharat, Khaled Rasheed, and I. Budak Arpinar. Towards linked open islamic knowledge using human computation and crowdsourcing. In Proceedings of the International Conference on Islamic Applications in Computer Science And Technology, 2015.

[30] Amna Basharat, Bushra Abro, I Budak Arpinar, and Khaled Rasheed. Semantic hadith: Leveraging linked data opportunities for islamic knowledge. In Proceedings of the 9th International Workshop on Linked Data on the Web (LDOW2016) [held in conjunction with WWW2016], volume 1593, pages 1–10. CEUR-WS.org, 2016.

[31] Amna Basharat, I. Budak Arpinar, and Khaled Rasheed. Leveraging crowd- sourcing for the thematic annotation of the qur’an. In Proceedings of the 25th International Conference Companion on World Wide Web, pages 13– 14. International World Wide Web Conferences Steering Committee, 2016.

[32] Amna Basharat, I Budak Arpinar, and Khaled Rasheed. Leveraging crowd- sourcing for the thematic annotation of the qur’an. In Proceedings of the 25th International Conference Companion on World Wide Web, pages 13– 14. International World Wide Web Conferences Steering Committee, 2016.

[33] D. A. Benson, M. Cavanaugh, K. Clark, I. Karsch-Mizrachi, D. J. Lipman, J. Ostell, and E. W. Sayers. GenBank. Nucleic Acids Research, Oxford Univ Press, 2012.

[34] Abraham Bernstein. The global brain semantic web - interleaving human-machine knowledge and computation. In ISWC2012 Workshop on What will the Semantic Web Look Like 10 Years From Now?, pages 1–6, Boston, MA, 2012.

[35] Abraham Bernstein, Mark Klein, and Thomas W Malone. Programming the global brain. Communications of the ACM, 55(5):41–43, 2012. ISSN 0001-0782.

[36] C. Bizer, T. Heath, and T. Berners-Lee. Linked data - the story so far. International Journal on Semantic Web and Information Systems, 5(3), 2009.

[37] C. Bizer, T. Heath, and T. Berners-Lee. Linked data - the story so far. International Journal on Semantic Web and Information Systems, 5(3):1–22, 2009. ISSN 1552-6283.

[38] Christian Bizer, Tom Heath, and Tim Berners-Lee. Linked data-the story so far. Semantic Services, Interoperability and Web Applications: Emerging Concepts, pages 205–227, 2009.

[39] Kalina Bontcheva, Ian Roberts, Leon Derczynski, and Dominic Rout. The gate crowdsourcing plugin: Crowdsourcing annotated corpora made easy. In Proc. EACL, 2014.

[40] Alessandro Bozzon, Marco Brambilla, Stefano Ceri, Matteo Silvestri, and Giuliano Vesci. Choosing the right crowd: expert finding in social networks. In Proceedings of the 16th International Conference on Extending Database Technology, pages 637–648. ACM, 2013.

[41] Dan Brickley and Libby Miller. Foaf vocabulary specification 0.98. Namespace document, 9, 2012.

[42] Paul Buitelaar, Philipp Cimiano, and Bernardo Magnini. Ontology learning from text: methods, evaluation and applications, volume 123. IOS Press, 2005.

[43] L. Carletti, G. Giannachi, and D. McAuley. Digital humanities and crowd- sourcing: An exploration. Museums and the Web, Portland, Oregon., 2013.

[44] Irene Celino, Simone Contessa, Marta Corubolo, Daniele Dell’Aglio, Emanuele Della Valle, Stefano Fumeo, and Thorsten Krüger. Linking smart cities datasets with human computation–the case of UrbanMatch. In The Semantic Web–ISWC 2012, pages 34–49. Springer, 2012.

[45] Irene Celino, Simone Contessa, Marta Corubolo, Daniele Dell’Aglio, Emanuele Della Valle, Stefano Fumeo, Thorsten Krüger, and Thorsten Krüger. UrbanMatch-linking and improving smart cities data. In LDOW, 2012.

[46] Michelle Cheatham and Pascal Hitzler. Conference v2. 0: An uncertain version of the OAEI conference benchmark. In The Semantic Web–ISWC 2014, pages 33–48. Springer, 2014.

[47] Linked Data - Connect Distributed Data across the Web. URL http://linkeddata.org/.

[48] J. Davis. From Crowdsourcing to Crowdservicing. IEEE Internet Computing, 2011.

[49] Victor De Boer, Michiel Hildebrand, Lora Aroyo, Pieter De Leenheer, Chris Dijkshoorn, Binyam Tesfa, and Guus Schreiber. Nichesourcing: Harnessing the power of crowds of experts. In Knowledge Engineering and Knowledge Management, pages 16–20. Springer, 2012.

[50] G. Demartini, D. E. Difallah, and P. Cudré-Mauroux. ZenCrowd: Leveraging probabilistic reasoning and crowdsourcing techniques for large-scale entity linking. In WWW 2012, 2012.

[51] Gianluca Demartini, Djellel Eddine Difallah, and Philippe Cudré-Mauroux. ZenCrowd: leveraging probabilistic reasoning and crowdsourcing techniques for large-scale entity linking. In Proceedings of the 21st international confer- ence on World Wide Web, pages 469–478. ACM, 2012.

[52] Gianluca Demartini, Djellel Eddine Difallah, and Philippe Cudré-Mauroux. Large-scale linked data integration using probabilistic reasoning and crowd- sourcing. The VLDB Journal, 22(5):665–687, Oct 2013. ISSN 1066-8888.

[53] S. Dietze, S. Sanchez-Alonso, H. Ebner, H. Q. Yu, D. Giordano, I. Marenzi, and B. P. Nunes. Interlinking educational resources and the web of data: a survey of challenges and approaches. Program: Electronic Library and Information Systems, 47(1):60–91, 2013. ISSN 0033-0337.

[54] Dominic DiFranzo and James Hendler. The semantic web and the next generation of human computation. In Handbook of Human Computation, pages 523–530. Springer, 2013. ISBN 1461488052.

[55] L. Ding, J. Shinavier, T. Finin, and D. McGuinness. owl:sameAs and Linked Data: An Empirical Study. Web Science Conference, Raleigh, NC, 2010.

[56] L. Ding, T. Lebo, J. S. Erickson, D. DiFranzo, G. T. Williams, X. Li, J. Michaelis, A. Graves, J. G. Zheng, Z. Shangguan, J. Flores, D. L. McGuinness, and J. A. Hendler. TWC LOGD: A portal for linked open government data ecosystems. Journal of Web Semantics, 9(3):325–333, 2011. ISSN 1570-8268.

[57] A. Doan, R. Ramakrishnan, and A. Halevy. Crowdsourcing systems on the world-wide web. CACM, 54:86, 2011.

[58] Anhai Doan, Raghu Ramakrishnan, and Alon Y Halevy. Crowdsourcing systems on the world-wide web. Communications of the ACM, 54(4):86–96, 2011. ISSN 0001-0782.

[59] Kais Dukes and Nizar Habash. Morphological annotation of quranic arabic.

[60] Kais Dukes, Eric Atwell, and Abdul-Baquee M Sharaf. Syntactic annotation guidelines for the quranic arabic dependency treebank. In LREC.

[61] Anca Dumitrache, Lora Aroyo, Chris Welty, Robert-Jan Sips, and Anthony Levas. Dr. Detective: combining gamification techniques and crowdsourcing to create a gold standard in medical text. In Proceedings of the 1st International Conference on Crowdsourcing the Semantic Web - Volume 1030, pages 16–31. CEUR-WS.org, 2013.

[62] Kai Eckert, Mathias Niepert, Christof Niemann, Cameron Buckner, Colin Allen, and Heiner Stuckenschmidt. Crowdsourcing the assembly of concept hierarchies. Proceedings of the 10th annual joint conference on Digital libraries - JCDL ’10, page 139, 2010. ISSN 9781450300858.

[63] Jeff Edelstein, Lola Galla, Carolyn Li-Madeo, Julia Marden, Alison Rhone- mus, and Noreen Y Whysel. Linked open data for cultural heritage. 2013.

[64] E. Estelles-Arolas and F. Gonzalez-Ladron-de Guevara. Towards an integrated crowdsourcing definition. Journal of Information Science, 38(2):189–200, Apr 2012. ISSN 0165-5515.

[65] Ali Farghaly and Khaled Shaalan. Arabic natural language processing: Chal- lenges and solutions. ACM Transactions on Asian Language Information Processing, 8:1–22, 2009. ISSN 15300226. doi: 10.1145/1644879.1644881.

[66] Dieter Fensel and Mark A Musen. The semantic web: a brain for humankind. IEEE Intelligent systems, 16(2):24–25, 2001. ISSN 1094-7167.

[67] Richard S Forsyth and Serge Sharoff. Document dissimilarity within and across languages: A benchmarking study. Literary and Linguistic Computing, 29(1):6–22, 2014.

[68] M. Franklin, D. Kossmann, and T. Kraska. CrowdDB: Answering queries with crowdsourcing. In Proceedings of SIGMOD, 2011.

[69] Michael J Franklin, Donald Kossmann, Tim Kraska, Sukriti Ramesh, and Reynold Xin. Crowddb: answering queries with crowdsourcing. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of data, pages 61–72. ACM, 2011.

[70] Hanane Froud, Abdelmonaim Lachkar, and Said Alaoui Ouatik. A comparative study of root-based and stem-based approaches for measuring the similarity between arabic words for arabic text mining applications. arXiv preprint arXiv:1212.3634, 2012.

[71] David Geiger and Martin Schader. Personalized task recommendation in crowdsourcing information systems - current state of the art. Decision Support Systems, 65:3–16, 2014.

[72] Elena L Glassman and Robert C Miller. Leveraging learners for teaching programming and hardware design at scale. In Proceedings of the 19th ACM Conference on Computer Supported Cooperative Work and Social Computing Companion, pages 37–40. ACM, 2016.

[73] Elena L Glassman, Aaron Lin, Carrie J Cai, and Robert C Miller. Learnersourcing personalized hints. In Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing, pages 1626–1636. ACM, 2016.

[74] Kwang-Il Goh, Michael E. Cusick, David Valle, Barton Childs, Marc Vidal, and Albert-László Barabási. The human disease network. Proceedings of the National Academy of Sciences National Acad Sciences, 104(21):8685–8690, 2007.

[75] Thomas R Gruber. A translation approach to portable ontology specifications. Knowledge Acquisition, 5(2):199–220, 1993.

[76] Thomas R Gruber. Toward principles for the design of ontologies used for knowledge sharing? International journal of human-computer studies, 43 (5):907–928, 1995.

[77] Nathan Hahn, Joseph Chang, Ji Eun Kim, and Aniket Kittur. The knowledge accelerator: Big picture thinking in small pieces. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, pages 2258–2270. ACM, 2016.

[78] Aimad Hakkoum and Said Raghay. Ontological approach for semantic mod- eling and querying the qur’an. In Proceedings of the International Conference on Islamic Applications in Computer Science And Technology, 2015.

[79] H. Halpin, P. Hayes, J. McCusker, D. McGuinness, and H. Thompson. When owl:sameAs Isn’t the Same: An Analysis of Identity in Linked Data. Intl. Semantic Web Conference, 2010.

[80] Florian Hanika, Gerhard Wohlgenannt, and Marta Sabou. The uComp protégé plugin for crowdsourcing ontology validation. In Proceedings of the 2014 International Conference on Posters & Demonstrations Track - Volume 1272, pages 253–256. CEUR-WS.org, 2014.

[81] Florian Hanika, Gerhard Wohlgenannt, and Marta Sabou. The uComp protégé plugin: Crowdsourcing enabled ontology engineering. In Knowledge Engineering and Knowledge Management, pages 181–196. Springer International Publishing, 2014. ISBN 3319137034.

[82] Jan Hannemann and Jürgen Kett. Linked data for libraries. In Proc of the world library and information congress of the Int’l Federation of Library Associations and Institutions (IFLA).

[83] Fouzi Harrag. Text mining approach for knowledge extraction in sahih al-bukhari. Computers in Human Behavior, 30:558–566, 2014. ISSN 07475632. doi: 10.1016/j.chb.2013.06.035.

[84] Fouzi Harrag and Aboubekeur Hamdi-Cherif. Uml modeling of text mining in arabic language and application to the prophetic traditions hadiths. pages 11–20, 2007.

[85] Fouzi Harrag, Aboubekeur Hamdi-Cherif, and Eyas El-Qawasmeh. Information retrieval architecture for hadith text mining. Journal of Digital Information Management, 6(6), 2008. ISSN 0972-7272.

[86] Fouzi Harrag, Aboubekeur Hamdi-Cherif, and Eyas El-Qawasmeh. Vector space model for arabic information retrieval - application to hadith indexing. In Applications of Digital Information and Web Technologies, 2008. ICADIWT 2008. First International Conference on the, pages 107–112. IEEE, 2008.

[87] Fouzi Harrag, Eyas El-Qawasmeh, and Abdul Malik Salman Al-Salman. Extracting Named Entities from Prophetic Narration Texts (Hadith), pages 289–297. Springer, 2011. ISBN 3642221904.

[88] Suhaib Hasan. An introduction to the science of Hadith. Al-Quran Society, 1994.

[89] Tom Heath and Christian Bizer. Linked data: Evolving the web into a global data space. Synthesis lectures on the semantic web: theory and technology, 1(1):1–136, 2011.

[90] James Hendler, Jeanne Holm, Chris Musialek, and George Thomas. US government linked open data: semantic.data.gov. IEEE Intelligent Systems, 27(3):25–31, 2012. ISSN 1541-1672.

[91] Francis Heylighen. Conceptions of a global brain: an historical review. Evolution: Cosmic, Biological, and Social, eds. Grinin, LE, Carneiro, RL, Korotayev AV, Spier F, pages 274–289, 2011.

[92] M. Hirth, T. Hoßfeld, and P. Tran-Gia. Anatomy of a Crowdsourcing Platform - Using the Example of Microworkers.com. 5th Intl. Conf. on Innovative Mobile and Internet Services in Ubiquitous Computing, 2011.

[93] A. Hogan, J. Umbrich, A. Harth, R. Cyganiak, A. Polleres, and S. Decker. An Empirical Survey of Linked Data Conformance. Journal of Web Semantics, 2012.

[94] A. Hogan, A. Zimmermann, J. Umbrich, A. Polleres, and S. Decker. Scalable and Distributed Methods for Entity Matching, Consolidation and Disambiguation over Linked Data Corpora. Journal of Web Semantics, 2012.

[95] Lynne C. Howarth. FRBR and linked data: Connecting FRBR and linked data. Cataloging and Classification Quarterly, 50(5-7):763–776, 2012. ISSN 0163-9374.

[96] D. Howe, M. Costanzo, P. Fey, et al. Big data: The future of biocuration. Nature, 455:47–50, 2008.

[97] J Howe. The rise of crowdsourcing. Wired magazine, pages 1–5, 2006.

[98] Jeff Howe. Crowdsourcing: How the power of the crowd is driving the future of business. Random House, 2008. ISBN 1905211112.

[99] Anna Huang. Similarity measures for text document clustering. In Proceedings of the sixth new zealand computer science research student conference (NZCSRSC2008), Christchurch, New Zealand, pages 49–56, 2008.

[100] Oana Inel et al. CrowdTruth: Machine-human computation framework for harnessing disagreement in gathering annotated data. In The Semantic Web - ISWC 2014, pages 486–504. Springer, 2014.

[101] Oana Inel, Lora Aroyo, Chris Welty, and Robert-Jan Sips. Domain-independent quality measures for crowd truth disagreement. Detection, Representation, and Exploitation of Events in the Semantic Web, page 2, 2013.

[102] Oana Inel et al. CrowdTruth: machine-human computation framework for harnessing disagreement in gathering annotated data. In The Semantic Web–ISWC 2014, pages 486–504. Springer, 2014.

[103] Panagiotis G. Ipeirotis, Foster Provost, and Jing Wang. Quality management on Amazon Mechanical Turk. In Proceedings of the ACM SIGKDD Workshop on Human Computation, pages 64–67. ACM, 2010. ISBN 145030222X.

[104] P. Jain, P. Hitzler, P. Yeh, K. Verma, and A. Sheth. Linked data is merely more data. In AAAI Spring Symposium "Linked Data Meets AI", Linked AI, 2010.

[105] Prateek Jain, Pascal Hitzler, Peter Z Yeh, Kunal Verma, and Amit P Sheth. Linked data is merely more data. In AAAI Spring Symposium: linked data meets artificial intelligence, volume 11, 2010.

[106] G. Joshi-Tope, M. Gillespie, I. Vastrik, P. D'Eustachio, E. Schmidt, B. de Bono, B. Jassal, G. R. Gopinath, G. R. Wu, L. Matthews, et al. Reactome: a knowledgebase of biological pathways. Nucleic Acids Research, Oxford Univ Press, 2005.

[107] Mohammad S. Khorsheed and Abdulmohsen O. Al-Thubaity. Comparative evaluation of text classification techniques using a large diverse Arabic dataset. Language Resources and Evaluation, 47(2):513–538, 2013.

[108] A. Kittur, E. H. Chi, and B. Suh. Crowdsourcing user studies with Mechanical Turk. CHI 2008: 26th Annual CHI Conference on Human Factors in Computing Systems, Vols 1 and 2, Conference Proceedings, pages 453–456, 2008. ISBN 9781605580111.

[109] Shirlee-ann Knight and Janice M. Burn. Developing a framework for assessing information quality on the world wide web. Informing Science: International Journal of an Emerging Transdiscipline, 8(5):159–172, 2005. Informing Science Institute.

[110] Magnus Knuth and Harald Sack. Data cleansing consolidation with PatchR. In The Semantic Web: ESWC 2014 Satellite Events, pages 231–235. Springer, 2014.

[111] Magnus Knuth, Johannes Hercher, and Harald Sack. Collaboratively patching linked data. CoRR, abs/1204.2715, 2012. URL http://arxiv.org/abs/1204.2715.

[112] Sarath Kumar Kondreddi, Peter Triantafillou, and Gerhard Weikum. Combining information extraction and human computing for crowdsourced knowledge acquisition. In Data Engineering (ICDE), 2014 IEEE 30th International Conference on, pages 988–999. IEEE, 2014.

[113] D. Kontokostas, A. Zaveri, S. Auer, and J. Lehmann. TripleCheckMate: A tool for crowdsourcing the quality assessment of linked data. Knowledge Engineering and the Semantic Web (KESW 2013), 394:265–272, 2013. ISSN 1865-0929.

[114] Markus Krause, Aneta Takhtamysheva, Marion Wittstock, and Rainer Malaka. Frontiers of a paradigm: Exploring human computation with digital games. In Proceedings of the ACM SIGKDD Workshop on Human Computation, HCOMP '10, pages 22–25, New York, NY, USA, 2010. ACM. ISBN 978-1-4503-0222-7.

[115] Anand P. Kulkarni, Matthew Can, and Bjoern Hartmann. Turkomatic: automatic recursive task and workflow design for Mechanical Turk. In CHI'11 Extended Abstracts on Human Factors in Computing Systems, pages 2053–2058. ACM, 2011.

[116] Yen-ling Kuo, Jong-Chuan Lee, Kai-yang Chiang, Rex Wang, Edward Shen, Cheng-wei Chan, and Jane Yung-jen Hsu. Community-based game design: experiments on social games for commonsense data collection. In Proceedings of the ACM SIGKDD Workshop on Human Computation, pages 15–22. ACM, 2009. ISBN 1605586722.

[117] Jens Lehmann, Tri Quan Nguyen, and Timofey Ermilov. Can we create better links by playing games? In Semantic Computing (ICSC), 2013 IEEE Seventh International Conference on, pages 322–329. IEEE, 2013.

[118] Shang-Wen Daniel Li and Piotr Mitros. Learnersourced recommendations for remediation. In Advanced Learning Technologies (ICALT), 2015 IEEE 15th International Conference on, pages 411–412. IEEE, 2015.

[119] Henry Lieberman, Dustin Smith, and Alea Teeters. Common consensus: a web-based game for collecting commonsense goals. In ACM Workshop on Common Sense for Intelligent Interfaces, 2007.

[120] H. Lin, J. Davis, and Y. Zhou. Ontological services using crowdsourcing. In 21st Australian Conference on Information Systems, 2010.

[121] Greg Little, Lydia B. Chilton, Max Goldman, and Robert C. Miller. TurKit: tools for iterative tasks on Mechanical Turk. In Proceedings of the ACM SIGKDD Workshop on Human Computation, pages 29–30. ACM, 2009.

[122] Greg Little, Lydia B. Chilton, Max Goldman, and Robert C. Miller. Exploring iterative and parallel human computation processes. In Proceedings of the ACM SIGKDD Workshop on Human Computation, HCOMP '10, pages 68–76, New York, NY, USA, 2010. ACM. ISBN 978-1-4503-0222-7.

[123] Greg Little, Lydia B. Chilton, Max Goldman, and Robert C. Miller. TurKit: human computation algorithms on Mechanical Turk. In Proceedings of the 23rd Annual ACM Symposium on User Interface Software and Technology, pages 57–66. ACM, 2010.

[124] Vanessa Lopez, Andriy Nikolov, Miriam Fernandez, Marta Sabou, Victoria Uren, and Enrico Motta. Merging and ranking answers in the semantic web: The wisdom of crowds. In The semantic web, pages 135–152. Springer, 2009.

[125] Nuno Luz, Nuno Silva, and Paulo Novais. Generating human-computer micro-task workflows from domain ontologies. In Human-Computer Interaction. Theories, Methods, and Tools, pages 98–109. Springer, 2014.

[126] Nuno Luz, Nuno Silva, and Paulo Novais. A survey of task-oriented crowdsourcing. Artificial Intelligence Review, pages 1–27, 2014.

[127] Thomas W. Malone, Robert Laubacher, and Chrysanthos Dellarocas. Harnessing crowds: Mapping the genome of collective intelligence. MIT Sloan Research Paper 4732-09, 2009.

[128] Thomas W Malone, Robert Laubacher, and Chrysanthos Dellarocas. The collective intelligence genome. IEEE Engineering Management Review, 38 (3):38, 2010. ISSN 0360-8581.

[129] A. Marcus, E. Wu, S. Madden, and R. Miller. Crowdsourced databases: Query processing with people. In The Biennial Conference on Innovative Data Systems Research (CIDR), 2011.

[130] Thomas Markotschi and Johanna Völker. GuessWhat?! – human intelligence for mining linked data. In Proceedings of the 1st Workshop on Knowledge Injection into and Extraction from Linked Data (KIELD). CEUR-WS.org, 2010.

[131] Milan Markovic, Peter Edwards, David Corsar, and Jeff Z Pan. The crowd and the web of linked data: A provenance perspective. In AAAI Spring Symposium: Wisdom of the Crowd, 2012.

[132] Milan Markovic, Peter Edwards, David Corsar, and Jeff Z Pan. Demo: Managing the provenance of crowdsourced disruption reports. In Provenance and Annotation of Data and Processes, pages 209–213. Springer, 2012. ISBN 3642342213.

[133] Milan Markovic, Peter Edwards, and David Corsar. Utilising provenance to enhance social computation. In The Semantic Web–ISWC 2013, pages 440–447. Springer, 2013. ISBN 3642413374.

[134] Milan Markovic, Peter Edwards, and David Corsar. A role for provenance in social computation. In Proceedings of the 1st International Conference on Crowdsourcing the Semantic Web - Volume 1030, CrowdSem'13, pages 93–95, Aachen, Germany, 2013. CEUR-WS.org.

[135] Milan Markovic, Peter Edwards, and David Corsar. SC-PROV: A provenance vocabulary for social computation. In Provenance and Annotation of Data and Processes, pages 285–287. Springer, 2014.

[136] Alistair Miles, Brian Matthews, Michael Wilson, and Dan Brickley. SKOS Core: Simple knowledge organisation for the web. In Proceedings of the 2005 International Conference on Dublin Core and Metadata Applications: Vocabularies in Practice, DCMI '05, pages 1:1–1:9. Dublin Core Metadata Initiative, 2005. ISBN 8489315442, 9788489315440. URL http://dl.acm.org/citation.cfm?id=1383465.1383467.

[137] Eric Miller and Micheline Westfall. Linked data and libraries. The Serials Librarian, 60(1-4):17–22, 2011.

[138] Patrick Minder and Abraham Bernstein. CrowdLang: A programming language for the systematic exploration of human computation systems. In Social Informatics, pages 124–137. Springer, 2012. ISBN 3642353851.

[139] Piotr Mitros. Learnersourcing of complex assessments. In Proceedings of the Second (2015) ACM Conference on Learning@Scale, pages 317–320. ACM, 2015.

[140] J. M. Mortensen. Crowdsourcing ontology verification. Semantic Web – ISWC 2013, Part II, pages 448–455, 2013. ISSN 0302-9743.

[141] J. M. Mortensen, M. A. Musen, and N. F. Noy. Crowdsourcing the verification of relationships in biomedical ontologies. AMIA Annu Symp Proc, 2013:1020–9, 2013. ISSN 1942-597X (Electronic), 1559-4076 (Linking).

[142] Jonathan M Mortensen, Paul R Alexander, Mark A Musen, and Natalya F Noy. Crowdsourcing ontology verification.

[143] Jonathan M. Mortensen, Mark A. Musen, and Natalya F. Noy. Developing crowdsourced ontology engineering tasks: An iterative process. In Proceedings of the 1st International Conference on Crowdsourcing the Semantic Web - Volume 1030, CrowdSem'13, pages 79–88, Aachen, Germany, 2013. CEUR-WS.org.

[144] Jonathan M. Mortensen, Mark A. Musen, and Natalya F. Noy. Ontology quality assurance with the crowd. In First AAAI Conference on Human Computation and Crowdsourcing - Works in Progress and Demonstration Abstracts, pages 54–55, 2013.

[145] Mohammed Naji Al-Kabi, Ghassan Kanaan, Riyad Al-Shalabi, Saja I. Al-Sinjilawi, and Ronza S. Al-Mustafa. Al-Hadith text classifier. Journal of Applied Sciences, 5:584–587, 2005.

[146] Tom Narock and Pascal Hitzler. Crowdsourcing semantics for big data in geoscience applications. In 2013 AAAI Fall Symposium Series, 2013.

[147] Mathias Niepert, Cameron Buckner, and Colin Allen. Working the crowd: Design principles and early lessons from the social-semantic web. In Proceedings of the Workshop on Web 3.0: Merging Semantic Web and Social Web at ACM Hypertext, 2009.

[148] Natalya F. Noy, Jonathan Mortensen, Mark A. Musen, and Paul R. Alexander. Mechanical Turk as an ontology engineer?: Using microtasks as a component of an ontology-engineering workflow. In Proceedings of the 5th Annual ACM Web Science Conference, WebSci '13, pages 262–271, New York, NY, USA, 2013. ACM. ISBN 978-1-4503-1889-1.

[149] Jasper Oosterman, Alessandro Bozzon, Geert-Jan Houben, et al. Crowd vs. experts: nichesourcing for knowledge intensive tasks in cultural heritage. pages 567–568. International WWW Conferences Steering Committee, 2014.

[150] Jasper Oosterman, Archana Nottamkandath, Chris Dijkshoorn, Alessandro Bozzon, Geert-Jan Houben, and Lora Aroyo. Crowdsourcing knowledge-intensive tasks in cultural heritage. In Proceedings of the 2014 ACM Conference on Web Science, WebSci '14, pages 267–268, New York, NY, USA, 2014. ACM. ISBN 978-1-4503-2622-3.

[151] A. Parameswaran and N. Polyzotis. Answering queries using humans, algorithms and databases. In Conference on Innovative Data Systems Research (CIDR), 2011.

[152] Ei Pa Pa Pe-Than, Dion Hoe-Lian Goh, and Chei Sian Lee. A typology of human computation games: an analysis and a review of current games. Behaviour & Information Technology, 34(8):809–824, 2015.

[153] A. Gómez-Pérez, M. Fernández-López, and O. Corcho. Ontological Engineering: with Examples from the Areas of Knowledge Management, e-Commerce and the Semantic Web. Springer, 2004.

[154] Abu Ameenah Bilal Philips. Usool at-Tafseer: The Methodology of Qur’aanic Explanation. AS Noordeen, 2002.

[155] Nelson Piedra, Edmundo Tovar, Ricardo Colomo-Palacios, Jorge Lopez-Vargas, and Janneth Alexandra Chicaiza. Consuming and producing linked open data: the case of OpenCourseWare. Program: electronic library and information systems, 48(1):16–40, 2014. ISSN 0033-0337. doi: 10.1108/prog-07-2012-0045.

[156] Alexander J. Quinn and Benjamin B. Bederson. Human computation: A survey and taxonomy of a growing field. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI'11, pages 1403–1412, New York, NY, USA, 2011. ACM. ISBN 978-1-4503-0228-9.

[157] Rebhi S. Baraka and Yehya M. Dalloul. Building Hadith ontology to support the authenticity of Isnad. International Journal on Islamic Applications in Computer Science And Technology, 2(1):25–39, 2014.

[158] Daniela Retelny, Sébastien Robaszkiewicz, Alexandra To, Walter S. Lasecki, Jay Patel, Negar Rahmati, Tulsee Doshi, Melissa Valentine, and Michael S. Bernstein. Expert crowdsourcing with flash teams. In Proceedings of the 27th Annual ACM Symposium on User Interface Software and Technology, pages 75–85. ACM, 2014.

[159] Antonio J. Roa-Valverde. Combining gamification, crowdsourcing and semantics for leveraging linguistic open data. In Proceedings of the Linked Learning meets LinkedUp Workshop: Learning and Education with the Web of Data (LILE 2014). CEUR-WS.org, 2014.

[160] M. Sabou, A. Scharl, and M. Föls. Crowdsourced knowledge acquisition: Towards hybrid-genre workflows. International Journal on Semantic Web and Information Systems, 9(3):14–41, 2013. ISSN 1552-6283.

[161] Marta Sabou, Kalina Bontcheva, and Arno Scharl. Crowdsourcing research opportunities: lessons from natural language processing. In Proceedings of the 12th International Conference on Knowledge Management and Knowledge Technologies, page 17. ACM, 2012.

[162] Marta Sabou, Arno Scharl, and Michael Föls. Crowdsourced knowledge acquisition: Towards hybrid-genre workflows. International Journal on Semantic Web and Information Systems, 9(3):14–41, 2013.

[163] C. Sarasua, E. Simperl, and N. F. Noy. CrowdMap: Crowdsourcing ontology alignment with microtasks. In ISWC 2012, 2012.

[164] Cristina Sarasua, Elena Simperl, and Natalya F. Noy. CrowdMap: Crowdsourcing ontology alignment with microtasks. In The Semantic Web–ISWC 2012, pages 525–541. Springer, 2012.

[165] Arno Scharl, Marta Sabou, and Michael Föls. Climate Quiz: A web application for eliciting and validating knowledge from social networks. In Proceedings of the 18th Brazilian Symposium on Multimedia and the Web, WebMedia '12, pages 189–192, New York, NY, USA, 2012. ACM. ISBN 978-1-4503-1706-1.

[166] Nigel Shadbolt, Kieron O'Hara, Tim Berners-Lee, Nicholas Gibbins, Hugh Glaser, and Wendy Hall. Linked open government data: Lessons from data.gov.uk. IEEE Intelligent Systems, 27(3):16–24, 2012. ISSN 1541-1672.

[167] Abdul-Baquee M. Sharaf and Eric Atwell. QurSim: A corpus for evaluation of relatedness in short texts. In LREC, pages 2295–2302, 2012.

[168] Mohamed Ahmed Sherif and Axel-Cyrille Ngonga Ngomo. Semantic Quran – a multilingual resource for natural-language processing. Semantic Web, 6(4):339–345, 2015.

[169] Mohamed Ahmed Sherif and Axel-Cyrille Ngonga Ngomo. Semantic quran: A multilingual resource for natural-language processing. Semantic Web, 2014. ISSN 1570-0844.

[170] A. Sheth. Computing for human experience. URL http://wiki.knoesis.org/index.php/Computing_For_Human_Experience.

[171] Jakub Šimko. Harnessing manpower for creating semantics. Information Sciences & Technologies: Bulletin of the ACM Slovakia, 5(3), 2013.

[172] Jakub Šimko and Mária Bieliková. Little search game: Lightweight domain modeling. In Semantic Acquisition Games, pages 51–65. Springer, 2014.

[173] Jakub Šimko and Mária Bieliková. Semantic acquisition games. pages 35–50, 2014. ISBN 978-3-319-06114-6.

[174] Jakub Šimko, Michal Tvarozek, and Mária Bieliková. Little search game: Term network acquisition via a human computation game. In Proceedings of the 22nd ACM Conference on Hypertext and Hypermedia, HT '11, pages 57–62, New York, NY, USA, 2011. ACM. ISBN 978-1-4503-0256-2.

[175] E. Simperl, M. Acosta, and B. Norton. A semantically enabled architecture for crowdsourced linked data management. In Proc. of the 1st Intl. Workshop on Crowdsourcing, 2012.

[176] Elena Simperl, Stephan Wölger, Stefan Thaler, and Katharina Siorpaes. Perspectives on Ontology Learning, volume 18, chapter Learning ontologies via games with a purpose. IOS Press, 2014.

[177] Elena Simperl. Crowdsourcing semantic data management: Challenges and opportunities. In Proceedings of the 2nd International Conference on Web Intelligence, Mining and Semantics, WIMS '12, pages 1:1–1:3, New York, NY, USA, 2012. ACM. ISBN 978-1-4503-0915-8.

[178] Elena Simperl and Markus Luczak-Rösch. Collaborative ontology engineering: a survey. The Knowledge Engineering Review, 29(01):101–131, 2014. Cambridge Univ Press.

[179] Elena Simperl, Barry Norton, and Denny Vrandecic. Crowdsourcing tasks in linked data management. In Proceedings of the 2nd Workshop on Consuming Linked Data, COLD 2011 (co-located with the 10th International Semantic Web Conference ISWC), volume 782. CEUR-WS.org, 2011.

[180] Elena Simperl, Maribel Acosta, and Barry Norton. A semantically enabled architecture for crowdsourced linked data management. In CrowdSearch, pages 9–14. Citeseer, 2012.

[181] Elena Simperl, Stephan Wölger, Stefan Thaler, Barry Norton, and Tobias Bürger. Combining human and computation intelligence: the case of data interlinking tools. International Journal of Metadata, Semantics and Ontologies, 7(2):77–92, 2012. ISSN 1744-2621.

[182] Elena Simperl, Maribel Acosta, and Fabian Flöck. Knowledge engineering via human computation. In Handbook of Human Computation, pages 131–151. Springer, 2013.

[183] Katharina Siorpaes and Martin Hepp. Games with a purpose for the semantic web. IEEE Intelligent Systems, 23(3):50–60, 2008. ISSN 1541-1672.

[184] Katharina Siorpaes and Martin Hepp. OntoGame: Weaving the semantic web by online games. Springer, 2008. ISBN 3540682333.

[185] Katharina Siorpaes and Elena Simperl. Incentives, motivation, participation, games: Human computation for linked data. In Proceedings of the Workshop on Linked Data in the Future Internet at the Future Internet Assembly, volume 700. CEUR-WS.org, 2010.

[186] Katharina Siorpaes and Elena Simperl. Human intelligence in the process of semantic content creation. World Wide Web, 13(1-2):33–59, 2010. ISSN 1386-145X.

[187] Katharina Siorpaes, Martin Hepp, Andreas Klotz, and Michael Hackl. myOntology: tapping the wisdom of crowds for building ontologies. Citeseer, 2008.

[188] Guillermo Soberón, Lora Aroyo, Chris Welty, Oana Inel, Hui Lin, and Manfred Overmeen. Measuring crowd truth: Disagreement metrics combined with worker behavior filters. In Proc. of CrowdSem 2013 Workshop, ISWC, 2013.

[189] Ge Song, Yunming Ye, Xiaolin Du, Xiaohui Huang, and Shifu Bie. Short text classification: A survey. Journal of Multimedia, 9(5):635–643, 2014.

[190] Alexander Strehl, Joydeep Ghosh, and Raymond Mooney. Impact of similarity measures on web-page clustering. In Workshop on Artificial Intelligence for Web Search (AAAI 2000), pages 58–64, 2000.

[191] Stefan Thaler, Elena Simperl, and Katharina Siorpaes. SpotTheLink: playful alignment of ontologies. In Proceedings of the 2011 ACM Symposium on Applied Computing, pages 1711–1712. ACM, 2011. ISBN 1450301134.

[192] Stefan Thaler, Elena Paslaru Bontas Simperl, and Katharina Siorpaes. SpotTheLink: A game for ontology alignment. Wissensmanagement, 182:246–253, 2011.

[193] Stefan Thaler, Katharina Siorpaes, David Mear, Elena Simperl, and Carl Goodman. SeaFish: a game for collaborative and visual image annotation and interlinking. In The Semantic Web: Research and Applications, pages 466–470. Springer, 2011.

[194] Stefan Thaler, Elena Simperl, and Stephan Wolger. An experiment in comparing human-computation techniques. IEEE Internet Computing, 16(5):52–58, 2012. ISSN 1089-7801.

[195] Andreas Thalhammer, Magnus Knuth, and Harald Sack. Evaluating entity summarization using a game-based ground truth. In The Semantic Web–ISWC 2012, pages 350–361. Springer, 2012. ISBN 3642351727.

[196] David Vickrey, Aaron Bronzan, William Choi, Aman Kumar, Jason Turner-Maier, Arthur Wang, and Daphne Koller. Online word games for semantic data collection. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP '08, pages 533–542, Stroudsburg, PA, USA, 2008. Association for Computational Linguistics.

[197] Duc-Thuan Vo and Cheol-Young Ock. Learning to classify short text from scientific documents using topic models with various types of knowledge. Expert Systems with Applications, 42(3):1684–1698, 2015.

[198] J. Volz, C. Bizer, M. Gaedke, and G. Kobilarov. Silk – a link discovery framework for the web of data. 2, 2009.

[199] Luis Von Ahn. Games with a purpose. Computer, 39(6):92–94, 2006. ISSN 0018-9162.

[200] Luis Von Ahn. Human computation. In Design Automation Conference, 2009. DAC'09. 46th ACM/IEEE, pages 418–419. IEEE, 2009. ISBN 1605584975.

[201] Luis Von Ahn and Laura Dabbish. Labeling images with a computer game. In Proceedings of the SIGCHI conference on Human factors in computing systems, pages 319–326. ACM, 2004. ISBN 1581137028.

[202] Luis Von Ahn and Laura Dabbish. Designing games with a purpose. Communications of the ACM, 51(8):58–67, 2008. ISSN 0001-0782.

[203] Luis von Ahn, Mihir Kedia, and Manuel Blum. Verbosity: A game for collecting common-sense facts. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI ’06, pages 75–78, New York, NY, USA, 2006. ACM. ISBN 1-59593-372-7.

[204] Jakub Šimko, Michal Tvarožek, and Mária Bieliková. Semantics discovery via human computation games. International Journal on Semantic Web & Information Systems, 7(3):23–45, July 2011. ISSN 1552-6283.

[205] Jörg Waitelonis, Nadine Ludwig, Magnus Knuth, and Harald Sack. WhoKnows? Evaluating linked data heuristics with a quiz that cleans up DBpedia. Interactive Technology and Smart Education, 8(4):236–248, 2011. Emerald Group Publishing Limited.

[206] Aobo Wang, Cong Duy Vu Hoang, and Min-Yen Kan. Perspectives on crowdsourcing annotations for natural language processing. Language Resources and Evaluation, 47(1):9–31, 2013.

[207] J. Wang, T. Kraska, M. Franklin, and J. Feng. CrowdER: Crowdsourcing entity resolution. Proceedings of the VLDB Endowment, 5(10), 2012.

[208] Stuart Weibel, John Kunze, Carl Lagoze, and Misha Wolf. Dublin core metadata for resource discovery. Technical report, 1998.

[209] Sarah Weir, Juho Kim, Krzysztof Z. Gajos, and Robert C. Miller. Learnersourcing subgoal labels for how-to videos. In Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing, pages 405–416. ACM, 2015.

[210] Lina Wolf, Magnus Knuth, Johannes Osterhoff, and Harald Sack. RISQ! Renowned Individuals Semantic Quiz: A jeopardy like quiz game for ranking facts. In Proceedings of the 7th International Conference on Semantic Systems, I-Semantics'11, pages 71–78, New York, NY, USA, 2011. ACM. ISBN 978-1-4503-0621-8.

[211] Aliyu Rufai Yauri, Rabiah Abdul Kadir, Azreen Azman, and Masrah Azrifah Azmi Murad. Quranic-based concepts: Verse relations extraction using Manchester OWL syntax. In Information Retrieval and Knowledge Management (CAMP), 2012 International Conference on, pages 317–321. IEEE. ISBN 1467310913.

[212] Aliyu Rufai Yauri, Rabiah Abdul Kadir, Azreen Azman, and Masrah Azrifah Azmi Murad. Quranic verse extraction based on concepts using OWL-DL ontology. Research Journal of Applied Sciences, Engineering and Technology, 6(23):4492–4498, 2013.

[213] Chih-Hao Yu, Tudor Groza, and Jane Hunter. Reasoning on crowd-sourced semantic annotations to facilitate cataloguing of 3d artefacts in the cultural heritage domain. In The Semantic Web–ISWC 2013, pages 228–243. Springer, 2013. ISBN 3642413374.

[214] M. Yuen et al. A survey of crowdsourcing systems. In IEEE Intl. Conf. on Social Computing, 2011.

[215] M. Yuen, I. King, and K. Leung. Task matching in crowdsourcing. In Proc. of the 4th IEEE Intl. Conf. on Internet of Things, and Cyber, Physical and Social Computing (CPSCom '11), 2011.

[216] Man-Ching Yuen, Irwin King, and Kwong-Sak Leung. A survey of crowdsourcing systems. In Privacy, Security, Risk and Trust (PASSAT) and 2011 IEEE Third International Conference on Social Computing (SocialCom), 2011 IEEE Third International Conference on, pages 766–773. IEEE, 2011.

[217] Man-Ching Yuen, Irwin King, and Kwong-Sak Leung. Task matching in crowdsourcing. In Internet of Things (iThings/CPSCom), 2011 International Conference on and 4th International Conference on Cyber, Physical and Social Computing, pages 409–412. IEEE, 2011.

[218] Man-Ching Yuen, Irwin King, and Kwong-Sak Leung. TaskRec: A task recommendation framework in crowdsourcing systems. Neural Processing Letters, pages 1–16, 2014.

[219] Amrapali Zaveri, Anisa Rula, Andrea Maurino, Ricardo Pietrobon, Jens Lehmann, Sören Auer, and Pascal Hitzler. Quality assessment methodologies for linked open data. Semantic Web Journal, 7(1):63–93, 2015.

Say, "O Allah, Owner of Sovereignty, You give sovereignty to whom You will and You take sovereignty away from whom You will. You honor whom You will and You humble whom You will. In Your hand is [all] good. Indeed, You are over all things competent. You cause the night to enter the day, and You cause the day to enter the night; and You bring the living out of the dead, and You bring the dead out of the living. And You give provision to whom You will without account." — [Al-Quran, Ale-Imraan, 3:26-27]