Lookup, Explore, Discover: How Dbpedia Can Improve Your Web Search
Total Page:16
File Type:pdf, Size:1020Kb
Undefined 0 (2010) 1{0 1 IOS Press Lookup, Explore, Discover: how DBpedia can improve your Web search Roberto Mirizzi a, Azzurra Ragone a Tommaso Di Noia a and Eugenio Di Sciascio a a Politecnico di Bari, via Re David 200, 7015, Bari { Italy E-mail: [email protected], [email protected], [email protected], [email protected] Abstract. The power of search is with no doubt one of the main aspects for the success of the Web. Differently from traditional searches in a library, search engines for digital libraries (as the Web is) allow to return results that are considerably effective. Nevertheless, current search engines are no more able to really help the user in all those tasks that go under the umbrella of exploratory search, where the user needs not only to perform a lookup operation but also to discover, understand and learn novel contents on complex topics while searching. Similar problems arise also with search exploiting classical keyword-based tagging techniques. In the paper we present LED, a web based system that aims to improve Web search by enabling users to properly explore knowledge associated to a query/keyword. We rely on DBpedia to explore the semantics of keywords and to guide the user during the exploration of search results. 1. Introduction happens when the user has not a clear idea of what she is looking for and (possibly) she needs some Thanks to the social paradigm of the Web 2.0 suggestions in the exploration process. The above new tools and technologies emerged to allow the mentioned tools enrich the page of search results user to interact with on-line resources. By now, we from classical search engines, e.g., Google, with a are very familiar with the notion of annotating a tag cloud suggesting relevant keywords, so that the web page as well as of tagging system, tag cloud, user may expand her query by adding new terms tag browsing just to cite a few. Surfing the wave of just clicking on tags in the cloud. the social Web, several web search tools have been From the users perspective, a very interesting developed to help users to navigate through search aspect related to the generation of the tag cloud results in a fast, simpler and efficient way combin- in the page of search results is the combination ing traditional search facilities with the tag cloud of lookup search with exploratory browsing. As exploration. Such tools are usually available as pointed out by Marchionini in [8] there are mainly plug-ins (add-ons) for the most common browsers two different ways of search strategies: lookup and and the most relevant are currently: DeeperWeb1, exploratory search. The presentation of contex- Search Cloudlet 2, SenseBot - Search Results Sum- tualized tag clouds combined with search results marizer 3. The main aim of these plug-ins is to makes possible to combine the two search strate- guide users in refining queries with terms which gies. As a matter of fact, after a lookup in a search are somehow related with the ones they are look- engine the user may explore the results space. ing thus exploring the results space. This usually Basically, all the above cited tools exploit text retrieval techniques to suggest new tags looking for relevant keywords in the snippets of search en- 1http://www.deeperweb.com 2http://www.getcloudlet.com gines results. Terms in the tag cloud are the most 3http://www.sensebot.net frequent ones in the result snippets. None of these 0000-0000/10/$00.00 c 2010 { IOS Press and the authors. All rights reserved 2R. Mirizzi, A. Ragone, T. Di Noia, E. Di Sciascio / Lookup, Explore, Discover: how DBpedia can improve your Web search tools take into account the meaning/semantics as- { a hybrid approach to rank resources in a RDF sociated to tags. In this paper we present LED dataset (such as DBpedia): our system ex- (acronym of Lookup Explore Discover), a tool for ploits both the graph nature of a RDF dataset exploratory search exploiting the knowledge en- and text-based techniques used in IR; coded in the Web of data for the tag cloud gener- { a RESTful JSON API that can be accessed ation. Although LED shares with the above men- from any environment that allows HTTP re- tioned tools all the main functionalities such as quest, useful both for building innovative web add/remove tags in order to refine the query, the applications and for comparison with different way it generates the tag cloud is completely dif- algorithms. ferent: The remainder of this paper is structured as fol- { tags are semantically related to the query and lows: in Section 2 we present in detail the LED in- not to the (partial) content of search results; terface. Section 3 is focused on the back-end of the { each generated tag is associated to a RDF re- system. Section 4 shows how LED backend can be source coming from the Web of Data (in our used as a semantic tagging system. In Section 5 we case we mainly rely on DBpedia [2]) with its present the LED RESTful API. A discussion and a associate semantics; comparison of related work is done in Section 6. { tags are semantically related with each other; Conclusion and future work close the paper. Therefore, while for the other tools the main aim is to give to the user a \visual summary" of the 2. LED: Lookup Explore Discover search results, in LED the terms included in the tag cloud are the most relevant w.r.t. the typed In this section we introduce LED, a web-based keyword(s), not the most relevant w.r.t. the re- tool that helps users in query refinement on trieved documents. Also for LED, the main pur- search engines and exploratory knowledge search pose is to suggest keywords similar to the one(s) in DBpedia. The front-end of the system is avail- representing the query in order to allow users able at http://sisinflab.poliba.it/led/ and to refine it and narrow down the search results currently is focused on search in the IT domain. adding/removing new tags from the cloud. In LED In a nutshell, the web-interface is composed by the different dimension of tags in the cloud does three different areas. The first one (marked with not reflect the number of occurrences of the key- (1) in Fig. 1) is mainly a keyword lookup: here the words in the search results but the semantic simi- user types in some characters to obtain an auto- larity of each tags w.r.t. the searched keyword(s). complete list of related keywords. The second area LED wants to favour a serendipitous approach [6], (marked with (2)) in Fig. 1 consists of a tag cloud that is the fortuitous discovery of something which (or a set of tag clouds) populated with automati- is not already included in the search results and cally generated tags that are semantically related that the user did not know to exist, but that could with the keyword entered by the user. Each of be as well of interest for her. them corresponds to a DBpedia resource. In fact, The innovative aspects of LED are: the displayed tag is the value of the rdf:label of the corresponding resource. The user may also { a novel approach to exploratory browsing that search for multiple keywords thanks to a tab-based exploits the Semantic Web: in this first ver- structure of the area devoted to search results. As sion of the tool we focus on DBpedia and shown in Fig. 1, one tab for each searched keyword present an intuitive way to explore this knowl- is open (RDFa and Microformat, respectively) and edge base; one with all the keywords (RDFa Microformat). { a web application that help users to refine The tag cloud allows the user to discover new queries for search engines with the aid of se- knowledge that is hidden in a traditional search mantically generated suggestions; engine. Thanks to LED it is also possible to explore { a meta-search engine and news collector that knowledge, browsing through suggested tags, and integrates results coming from different exter- refine the query by adding more specific keywords. nal sources in a single platform; Finally, the third areas of the interface (marked R. Mirizzi, A. Ragone, T. Di Noia, E. Di Sciascio / Lookup, Explore, Discover: how DBpedia can improve your Web search3 Fig. 1. The LED interface, available online at http://sisinflab.poliba.it/led/. with (3) in Fig. 1) acts as a meta-search engine: [8]. Therefore, using additional external knowledge it displays the results coming from the three most to refine the queries can be a good way to narrow popular search engines (i.e., Google, Yahoo! and down the search results and improve the user ex- Bing), the social network and microblogging ser- perience, maximizing the number of possibly rele- vice Twitter, the news aggregator Google News, vant retrieved objects. In [11] the authors describe Flickr and Google Images, according to the query the main ideas behind a simple lookup search on a formulated in the first section and refined in the traditional search engine and a deeper (semantic- second section. guided) exploratory search: a search engine allows to find what you are looking for and already know, 2.1. Lookup while LED (and more generally exploratory search activities) allows to explore and discover what you The entry point to LED is a text input field. Sim- probably did not know to exist.