Exploratory Search on Topics Through Different Perspectives with Dbpedia Nicolas Marie, Fabien Gandon, Alain Giboin, Emilie Palagi
Total Page:16
File Type:pdf, Size:1020Kb
Exploratory search on topics through different perspectives with DBpedia Nicolas Marie, Fabien Gandon, Alain Giboin, Emilie Palagi To cite this version: Nicolas Marie, Fabien Gandon, Alain Giboin, Emilie Palagi. Exploratory search on topics through different perspectives with DBpedia. SEMANTICS, Sep 2014, Leipzig, Germany. hal-01057031 HAL Id: hal-01057031 https://hal.inria.fr/hal-01057031 Submitted on 21 Aug 2014 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Exploratory search on topics through different perspectives with DBpedia Nicolas MARIE Fabien Gandon, Alain Giboin, Alcatel-Lucent Bell labs Émilie Palagi INRIA Sophia-Antipolis INRIA Sophia-Antipolis France France [email protected] fi[email protected] ABSTRACT term learning is employed in the broad sense and can con- A promising scenario for combining linked data and search cern educational, professional and personal contexts. During is exploratory search. During exploratory search, the search exploratory search the search objective is ill-defined, favor- objective is ill-defined favorable to discovery. A common able to discovery. Actual popular search engines are evolv- limit of the existing linked data based exploratory search ing in the right direction but are still not supporting ex- systems is that they constrain the exploration through sin- ploratory search efficiently today. This is notably due to gle results selection and ranking schemes. The users can not their keyword-search paradigm and lack of assistance dur- influence the results to reveal specific aspects of knowledge ing the results consultation. First it is impossible to cap- that interest them. The models and algorithms we propose ture complex information needs in few keywords, especially unveil such knowledge nuances by allowing the exploration for vague ones. Second the users have to synthesize an im- of topics through several perspectives. The users adjust im- portant amount of information without support from the portant computation parameters through three operations system. They have to rely on their own search strategies that help retrieving desired exploration perspectives: spec- leading to a considerable cognitive load. It increases the ification of interest criteria about the topic explored, con- information integration work needed to understand and to trolled randomness injection to reveal unexpected knowl- use the information collected: ”the human user’s brain is edge and choice of the processed knowledge source(s). This the fundamental platform for information integration” [3]. paper describes the corresponding models, algorithms and There is a need to complete the actual widely-used solu- the Discovery Hub implementation. It focuses on the three tions by designing and popularizing systems optimized for mentioned perspective-operations and their evaluations. exploratory search tasks. This challenge requires contribu- tions from several research fields including in particular in- formation retrieval and interaction design1. Categories and Subject Descriptors Linked data [1] and DBpedia [8] have been extensively [Graph Theory]: Graph algorithms; [HCI design and described in the scientific literature and respectively corre- evaluation methods]: User studies spond to an approach for publishing and linking data on the web and its application to data extracted from Wikipedia2. The improvement of search through incorporation of seman- General Terms tics is referred to as semantic search. Semantic search has Algorithms, experimentation been the subject of numerous researches targeting a wide range of search objectives [14]. Today important initiatives emerge from major players including the knowledge graphs- Keywords based search engines functionalities (based on Bing Satori3, Exploratory search, Multi-perspectives exploration, DBpe- Google4 and Yahoo Knowledge Graphs [2]), Apple Siri5 and dia, Discovery engine, Discovery Hub the Facebook Graph Search6. Such technologies and func- tionalities open the door to a better support of exploratory search tasks (”explore your search”, ”help you research a 1. INTRODUCTION topic more in depth than before”4). Even if some of these As stated by White in [16]: ”search is only a partially graphs are partially built from open data sources they are solved problem”. Exploratory search refers to cognitively not publicly accessible and are consequently not part of the consuming search tasks like learning or investigation [9]. The linked open data cloud. Permission to make digital or hard copies of all or part of this work for Linked data are promising for supporting exploratory search. personal or classroom use is granted without fee provided that copies are not Their richness and structured aspect allow proposing new al- made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components 1https://sites.google.com/site/hcirworkshop/ of this work owned by others than ACM must be honored. Abstracting with 2http://www.wikipedia.org/ credit is permitted. To copy otherwise, or republish, to post on servers or to 3http://www.bing.com/blogs/site blogs/b/search/archive/ redistribute to lists, requires prior specific permission and/or a fee. Request 2013/03/21/satorii.aspx permissions from [email protected]. 4google.com/insidesearch/features/search/knowledge.html SEM ’14, September 04 - 05 2014, Leipzig, AA, Germany 5http://www.apple.com/ios/siri/ Copyright 2014 ACM 978-1-4503-2927-9/14/09?$15.00. 6 http://dx.doi.org/10.1145/2660517.2660518. https://www.facebook.com/about/graphsearch gorithms and interaction models optimized for exploration setting could be improved. The systems presented hereafter purposes. The research in this field still faces several chal- are summarized in Table 1. We observed that the following lenges. One limit is that existing linked data based systems limits are recurrent: often offer only one exploration perspective i.e. the users can not or hardly influence the queries’ results in a direction of • The results selection process is generally fixed. interest. This paper supports (1) the idea that a plurality of • The ranking scheme is generally fixed. relevant exploration perspectives can be offered to the users • The data source(s) used to process the query is fixed. starting from a topic of interest, (2) that some linked data datasets constitute a valuable source of knowledge for such LED (Lookup Explore Discover) [11] is an exploratory multi-perspectives exploratory search. Indeed, the objects search system that suggests related query-terms starting described in linked data datasets can be rich, complex and from a user initial query. It implements the DBpedia Ranker approached in many manners. For example, a user can be algorithm for such recommendations. Seevl8 is a music dis- interested in a painter (e.g. Claude Monet or Mary Cassat) covery platform that offers DBpedia-based artists’ recom- in many ways: works, epoch, movement, entourage, social or mendations thanks to the DBrec algorithm [13]. Yovisto9 is political contexts and more. The user may also be interested an academic video platform offering related-queries sugges- by basic information or by unexpected and unusual ones de- tions computed with a set of heuristics on the German and pending on his actual knowledge about the painter. He may English DBpedia chapters [15]. These 3 applications do not also want to explore the topic through a specific culture or allow the users to influence the results retrieved. area e.g. impressionism in American or French culture. A Starting from a resource of interest Aemoo10 visually presents single interest can be explored through many perspectives its direct neighborhood filtered with semantic-based pat- corresponding to different knowledge nuances. In the graph terns called Encyclopedic Knowledge Patterns (EKPs) [12]. context of linked data these perspectives correspond to dif- EKPs are selections of the most informative classes regard- ferent non exclusive sets of objects and relations that are ing a specific class: ”the most relevant types of things that informative on a topic regarding specific aspects. people use for describing other things”. For instance the DB- In our proposition such exploration perspectives are ob- pedia Actor class EKP includes the classes Actor, City, Film, tained by increasing the results relevance thanks to users’ TelevisionShow, etc. Aemoo proposes a ”curiosity” function query refinement/personalization and provoking discoveries which displays the queried resource neighborhood through by injecting randomness during the results’ computation. an inverted EKP filtering. This function offers an explo- Such topics are not new in the general field of information ration perspective that aims to unveil unexpected knowl- retrieval but, to the best of our knowledge, no approaches edge. The MORE movie recommender11 [11] is based on a were formalized, implemented and evaluated in the context semantic adaptation of the Vector Space Model and allows of linked data based