Using Knowledge Anchors to Facilitate User Exploration of Data Graphs
Marwan Al-Tawil a,b,*, Vania Dimitrova b, Dhavalkumar Thakker c
a King Abdullah II School of Information Technology, University of Jordan, Amman, Jordan
b School of Computing, University of Leeds, Leeds, UK
c Faculty of Engineering and Informatics, University of Bradford, Bradford, UK

Abstract. This paper investigates how to facilitate users’ exploration through data graphs for knowledge expansion. Our work focuses on knowledge utility – increasing users’ domain knowledge while exploring a data graph. We introduce a novel exploration support mechanism underpinned by the subsumption theory of meaningful learning, which postulates that new knowledge is grasped by starting from familiar concepts in the graph; these serve as knowledge anchors from which links to new knowledge are made. A core algorithmic component for operationalising the subsumption theory of meaningful learning to generate exploration paths for knowledge expansion is the automatic identification of knowledge anchors in a data graph (KADG). We present several metrics for identifying KADG, which are evaluated against familiar concepts in human cognitive structures. A subsumption algorithm that utilises KADG for generating exploration paths for knowledge expansion is presented and applied in the context of a Semantic data browser in a music domain. The resultant exploration paths are evaluated in a task-driven experimental user study and compared to free data graph exploration. The findings show that exploration paths based on subsumption and using knowledge anchors lead to a significantly higher increase in the users’ conceptual knowledge and better usability than free exploration of data graphs.
The work opens a new avenue in semantic data exploration which investigates the link between learning and knowledge exploration. This extends the value of exploration and enables broader applications of data graphs in systems where the end users are not experts in the specific domain.

Keywords: Data graphs, knowledge utility, data exploration, meaningful learning, knowledge anchors, exploration paths.

* Corresponding author (E-mail: [email protected]).

1. Introduction

In recent years, RDF linked data graphs have become widely available on the Web and are being adopted in a range of user-facing applications offering search and exploration tasks. In contrast to regular search, where the user has a specific need in mind and an idea of the expected search result [1], exploratory search is open-ended and requires a significant amount of exploration [2], has an unclear information need [3], and is used to conduct learning and investigative tasks [4]. Examples range from exploring resources in a new domain (as in academic research tasks) to browsing through large information spaces with many options (such as exploring job opportunities, travel and accommodation offers, videos, or music). Often, the users have no (or limited) familiarity with the specific domain. When users are novices to a domain, their cognitive structures about that domain are unlikely to match the complex knowledge structures of a data graph that represents the domain. This can have a negative impact on the exploration experience and effectiveness, as users may be unable to formulate appropriate knowledge retrieval queries (users do not know what they do not know [3]). Moreover, users can face an overwhelming number of exploration options and may not be able to identify which exploration paths are most useful; this can lead to confusion, high cognitive load, frustration and a feeling of being lost.

To overcome these challenges, appropriate ways to facilitate users’ exploration through data graphs are required. Research on exploration of data graphs has come a long way from initial works on presenting linked data in visual or textual forms [5,6]. Recent studies on data graph exploration have brought together research from related areas - Semantic Web, personalisation, adaptive hypermedia, and human-computer interaction - with the aim of reducing users’ cognitive load and providing support for knowledge exploration and discovery [7–9]. Several attempts have developed support for layman users, i.e. novices in the domain. Examples include: personalising the exploration path to the user’s interests [10], presenting RDF patterns to give an overview of the domain [11], or providing graph visualisations to support navigation [12]. However, existing work on facilitating users’ exploration through data graphs has addressed mainly investigative tasks, omitting important exploratory search tasks linked to supporting learning.

The exploration of a data graph (if properly assisted) can lead to an increase in the user’s knowledge. This is similar to learning through search - an emerging research area in information retrieval [13,14], which argues that “searching for data on the Web should be considered an area in its own right for future research in the context of search as a learning activity” [15]. In the context of data graphs, learning while searching/exploring has not been studied. The closest to learning is research on tools for exploration of interlinked open educational resources [16]. However, this is a very specific context, and does not consider the generic context of learning while exploring data graphs in any domain. This generic learning context is addressed here.

The work presented in this paper opens a new avenue that studies learning through data graph exploration. It addresses a key challenge: how to support people who are not domain experts to explore data graphs in a way that can lead to expanding their domain knowledge. We investigate how to build automated ways for navigating through data graphs in order to add a new value to the exploration, which we call ‘knowledge utility’ - expanding one’s domain knowledge while exploring a data graph¹. Our earlier work showed that when exploring data graphs in unfamiliar or partially familiar domains, users serendipitously learn new things that they were unaware of [17–19]. However, not all exploration paths are beneficial for knowledge expansion: paths may not bring new knowledge to the user, leading to boredom, or may bring too many unfamiliar things so that the user becomes confused and overwhelmed [18].

¹ We follow definitions of utility (i.e. reducing users’ cognitive load in knowledge retrieval) and usability (i.e. supporting users’ sense making and information exploration and discovery) [103].

The key contribution of this paper is a novel computational approach for generating exploration paths that can lead to expanding users’ domain knowledge. Our approach operationalises Ausubel’s subsumption theory for meaningful learning [20], which postulates that human cognitive structures are hierarchically organised with respect to levels of abstraction, generality, and inclusiveness of concepts; hence, familiar and inclusive entities are used as knowledge anchors to subsume new knowledge. Consequently, our approach to generating exploration paths includes: (i) computational methods for identifying knowledge anchors in a data graph (KADG); and (ii) algorithms for generating exploration paths by utilising the identified knowledge anchors.

To find possible knowledge anchors in a data graph, we utilise Rosch’s notion of Basic Level Objects (BLO) [21]. According to this notion, familiar category objects (e.g. the musical instrument Guitar) are at a level of abstraction called the basic level, where the category’s members (e.g. Folk Guitar, Classical Guitar) share attributes (e.g. both have a neck and a bridge) that are not shared by members (e.g. Grand Piano, Upright Piano) of another category at the same level of abstraction, such as Piano. We have adapted metrics from formal concept analysis for detecting knowledge anchors in data graphs. The KADG metrics are applied on an existing data graph, and the output is evaluated against human Basic Level Objects in a Data Graph (BLODG) derived via free-naming tasks.

To generate exploration paths, we first identify the closest knowledge anchor to be used as a starting point, from where we use subsumption to find a set of transition narratives to form a path that can expand the user’s knowledge. The effectiveness of our novel exploration approach is evaluated in a study with a Semantic data browser in the Music domain. Subsumption-based exploration paths are compared to free data graph exploration. The results show that when users followed the suggested exploration paths, the increase in their knowledge was significantly higher and the usability was better.

The paper is structured as follows. Section 2 positions the work in the relevant literature and presents relevant theories. Section 3 presents experimental and theoretical foundations, including the application context for algorithm validation and the theoretical underpinning. Section 4 provides preliminaries with key definitions, followed by a formal description of metrics for identifying KADG (Section 5). Section 6 describes an experimental study to validate the KADG metrics against human BLODG. Section 7 describes a subsumption algorithm for generating exploration

linked data graphs is presented in [22]. Shneiderman’s seminal work on visual information seeking (overview first, zoom and filter, then details-on-demand) [30] is used to evaluate the usability and utility of these approaches and focuses on: (i) how well these approaches generate a summary of data; (ii) how well they focus on finding relevant and important data; and (iii) what visualisation techniques are used. In [31], the authors utilise a cartographic metaphor and visual information seeking principles to offer an overview of data based on instance types, and then automatically generate SPARQL queries based on search and interaction with a map. The focus of their work is the entry point of the vast data graph the users have to explore.