Intelligent Support for Exploration of Data Graphs
Total Page:16
File Type:pdf, Size:1020Kb
Intelligent Support for Exploration of Data Graphs Marwan Ahmad Talal Al-Tawil Submitted in accordance with the requirements for the degree of Doctor of Philosophy The University of Leeds School of Computing October, 2017 - ii - Dedication To my family - iii - Declaration The candidate confirms that the work submitted is his/her own, except where work which has formed part of jointly-authored publications has been included. The contribution of the candidate and the other authors to this work has been explicitly indicated below. The candidate confirms that appropriate credit has been given within the thesis where reference has been made to the work of others. Some of the work in this thesis has been published prior to thesis submission. The exploratory user study in Chapter 3 has been presented in: Al-Tawil, M., Thakker, D., Dimitrova, V.: Nudging to Expand User’s Domain Knowledge while Exploring Linked Data. In: Proceedings of Intelligent Exploration of Semantic Data @ International Semantic Web Conference (2014). The KADG algorithms in Chapter 4 have been presented in: Al-Tawil, M., Dimitrova, V., Thakker, D., Bennett, B.: Identifying Knowledge Anchors in a Data Graph. In: Proceedings of the 27th ACM Conference on Hypertext and Social Media (2016). Part of research description in Chapters 1 and 3 has been described in: Al-Tawil, M.: Intelligent Nudging to Support Interactive Exploration of Big Data Graphs, in: HT 2016 – Doctoral Consortium. 27th ACM Conf. In Hypertext and Social. Media (2016). Part of the work for identifying KADG in L4All in Chapter 4 has been presented in: Poulovassilis, A., Al-Tawil, M., Frosini, R., Dimartino, M., Dimitrova, V.: Combining Flexible Queries and Knowledge Anchors to facilitate the exploration of Knowledge Graphs. In: Proceedings of Intelligent Exploration of Semantic Data @ International Semantic Web Conference (2016). The evaluation of KADG algorithms in Chapter 5 has been presented in: Al-Tawil, M., Dimitrova, V., Thakker, D., Alexandra Poulovassilis.: Evaluating Knowledge Anchors in Data Graphs against Basic Level Objects. In: Proceedings of International conference on Web Engineering (2017). Part of the future research in Chapter 8 has been presented in: Al-Tawil, M., Dimitrova, V., Thakker, D.: Using Basic Level Concepts in a Linked Data Graph to Detect User's Domain Familiarity. In: Proceedings of User Modelling Adaptation and Personalisation conference (2015). - iv - This copy has been supplied on the understanding that it is copyright material and that no quotation from the thesis may be published without proper acknowledgement. The right of Marwan Al-Tawil to be identified as Author of this work has been asserted by him in accordance with the Copyright, Designs and Patents Act 1988. © 2017 The University of Leeds and Marwan Ahmad Talal Al-Tawil - v - Acknowledgements First and foremost, I praise my God Allah who gave me the health, strength and courage to complete this work. My heartfelt appreciation and deepest gratitude goes to my supervisor Dr. Vania Dimitrova as without her constant support and encouragement this research would not have been accomplished. Her guidance through the years of my study gave me the confidence to pursue my studies and research. I would like also to thank my Co-supervisors Dr. Dhavalkumar Thakker and Dr. Brandon Bennett for their dedicated efforts and support during my study. I have no words to describe my gratitude for the support I received from my family. Your encouragement and compassion are precious in all aspects of my life. Special thanks to my parents who made me the way I am. Also, I would like to express my gratitude to my wife and my daughter for their love and support during my journey. Acknowledgement goes to the University of Jordan who granted my scholarship to complete this research. - vi - Abstract This research investigates how to support a user’s exploration through data graphs generated from semantic databases in a way leading to expanding the user’s domain knowledge. To be effective, approaches to facilitate exploration of data graphs should take into account the utility from a user’s point of view. Our work focuses on knowledge utility – how useful exploration paths through a data graph are for expanding the user’s knowledge. The main goal of this research is to design an intelligent support mechanism to direct the user to ‘good’ exploration paths through big data graphs for knowledge expansion. We propose a new exploration support mechanism underpinned by the subsumption theory for meaningful learning, which postulates that new knowledge is grasped by starting from familiar concepts in the graph which serve as knowledge anchors from where links to new knowledge are made. A core algorithmic component for adapting the subsumption theory for generating exploration paths is the automatic identification of Knowledge Anchors in a Data Graph (KADG). Several metrics for identifying KADG and the corresponding algorithms for implementation have been developed and evaluated against human cognitive structures. A subsumption algorithm which utilises KADG for generating exploration paths for knowledge expansion is presented and evaluated in the context of a semantic data browser in a musical instrument domain. The resultant exploration paths are evaluated in a controlled user study to examine whether they increase the users’ knowledge as compared to free exploration. The findings show that exploration paths using knowledge anchors and subsumption lead to significantly higher increase in the users’ conceptual knowledge. The approach can be adopted in applications providing data graph exploration to facilitate learning and sensemaking of layman users who are not fully familiar with the domain presented in the data graph. - vii - Table of Contents Dedication .............................................................................................................................. ii Acknowledgements ................................................................................................................v Abstract ................................................................................................................................ vi Table of Contents ................................................................................................................ vii Abbreviations ....................................................................................................................... xi List of Figures ..................................................................................................................... xii List of Tables ...................................................................................................................... xiv Chapter 1 Introduction ............................................................................................................1 Chapter 2 Related Work .........................................................................................................8 2.1 Introduction .............................................................................................................8 2.2 Background .............................................................................................................9 2.2.1 The Web Graph ...........................................................................................9 2.2.2 Linked Data ...............................................................................................11 2.2.3 Ontology ....................................................................................................13 2.2.4 SPARQL Query Language .......................................................................15 2.2.5 Graph Databases and Triple Stores ...........................................................16 2.2.6 Semantic Web Applications ......................................................................18 2.2.6.1 Semantic Data Browsers ..............................................................19 2.2.6.2 Semantic Data Search Engines .....................................................21 2.3 Exploration of Data Graphs ...................................................................................23 2.3.1 Visualisation-based Approaches ...............................................................23 2.3.2 Text-based Approaches .............................................................................25 2.4 Identifying Key Entities in Data Graphs ................................................................27 2.4.1 Ontology Summarisation Approaches ......................................................28 2.4.2 Formal Concept Analysis Based Approaches ...........................................29 2.4.3 Approaches that adopt Basic Level Objects ..............................................30 2.5 Generating Paths in Data Graphs ...........................................................................32 2.6 Data Exploration Evaluation Approaches ..............................................................35 2.7 Theories .................................................................................................................36 2.7.1 Bloom’s Taxonomy for Approximating Knowledge Utility .....................36 2.7.2 Underpinning Theoretical Model for Data Graph Exploration .................39 2.7.2.1 Schema Theory .............................................................................39 2.7.2.2 Subsumption Theory for Meaningful Learning ............................40 - viii - 2.7.2.3