Semantic Desktop 2.0: The Gnowsis Experience

Leo Sauermann¹, Gunnar Aastrand Grimnes¹, Malte Kiesel¹, Christiaan Fluit³, Heiko Maus¹, Dominik Heim², Danish Nadeem⁴, Benjamin Horak², and Andreas Dengel¹,²

¹ Knowledge Management Department, German Research Center for Artificial Intelligence (DFKI GmbH), Kaiserslautern, Germany, {firstname.surname}@dfki.de
² Knowledge-Based Systems Group, Department of Computer Science, University of Kaiserslautern, Germany
³ Aduna BV, Amersfoort, The Netherlands, [email protected]
⁴ University of Osnabrueck, Germany, [email protected]

Abstract. In this paper we present lessons learned from building a Semantic Desktop system, the gnowsis beta. On desktop computers, semantic software has to provide stable services and has to reflect the personal view of the user. Our approach to ontologies, the Personal Information Model (PIMO), allows the creation of tagging services like del.icio.us on the desktop. A semantic wiki allows further annotations. Continuous evaluations of the system helped to improve it. These results were created in the EPOS research project and are available in the open source projects Aperture, kaukoluwiki, and gnowsis, and will be continued in the Nepomuk project. By using these components, other developers can create new desktop applications the web 2.0 way.

1 Introduction

A characteristic of human nature is to collect. In the information age we have moved from the basic collection of food, books and paintings to collecting websites, documents, e-mails, ideas, tasks and sometimes arguments and facts. We gather information and store it on our desktop computers, but once stored, the satisfaction of possessing something is soon distorted by the task of finding information in our personal data swamp [9]. In this paper we present parts of the gnowsis semantic desktop framework, a tool for personal information management (PIM). In addition to providing an interface for managing your personal data, it also provides interfaces through which other applications can access this data, acting as a central hub for semantic information on the desktop. The described gnowsis system is a prototype of a Semantic Desktop [17], aiming to integrate desktop applications and the data managed on desktop computers using semantic web technology. Previous work published about Semantic Desktop applications [5,12,15] showed that this approach is promising for supporting users in finding and reminding information, and in working with information in new ways.

The architecture was improved during the last years, taking inspiration from the current advances and popularity of the web 2.0 (see Section 3). The new architecture and new ontology policies for gnowsis version 0.9 are described in Section 2. In Section 3 we discuss what semantic desktop applications can learn from the web 2.0; in particular, we discuss how semantic wikis provide an ideal lightweight solution for ad-hoc creation of metadata, and how such semantic wikis and tagging were integrated into gnowsis. In Section 4, a short description of the information integration framework Aperture is given. A summary of our evaluation efforts and lessons learned is given in Section 5, indicating best practices and other remarks on practical semantic web engineering.
The system has been deployed in several evaluation settings, gathering user feedback to further improve it, and an active community is now building around the gnowsis semantic desktop. As the system is now part of the EU-funded Integrated Project Nepomuk (http://nepomuk.semanticdesktop.org), which develops a comprehensive solution for a Social Semantic Desktop, we expect that the service oriented approach to the semantic desktop will have more impact in the future, which is the conclusion of this paper.

2 The Gnowsis Approach

Gnowsis is a semantic desktop with a strong focus on extensibility and integration. The goal of gnowsis is to enhance existing desktop applications and the desktop operating system with Semantic Web features. The primary use for such a system is Personal Information Management (PIM), technically realized by representing the user's data in RDF. Although the technology used is the same, communication, collaboration, and the integration with the global semantic web are not addressed by the gnowsis system. The gnowsis project was created in 2003 in Leo Sauermann's diploma thesis [14] and continued in the DFKI research project EPOS (http://www.dfki.uni-kl.de/epos) [7].

Gnowsis can be coarsely split into two parts: gnowsis-server, which does all the data processing, storage and interaction with native applications; and the graphical user interface (GUI) part, currently implemented as a Swing GUI and some web interfaces (see the gnowsis web page, http://www.gnowsis.opendfki.de/, for illustrative screenshots). The interface between the server and the GUI is clearly specified, making it easy to develop alternative interfaces. It is also possible to run gnowsis-server standalone, without a GUI. Gnowsis uses a Service Oriented Architecture (SOA), where each component defines a certain interface; once the server has started a component, that interface is available as an XML-RPC service (http://www.xmlrpc.com/) to be used by other applications. For more detail refer to Section 3.

External applications like Microsoft Outlook or Mozilla Thunderbird are integrated via Aperture data-sources (see Section 4); their data is imported and mirrored in gnowsis. Some new features were also added to these applications using plugins; for example, in Thunderbird users can relate e-mails to concepts within their personal ontology (see Section 3.1).

The whole gnowsis framework is free software, published under a BSD-compatible license. It is implemented in Java to be platform-independent and reuses well-known tools like Jena, Sesame, Servlets, Swing, and XML-RPC. Gnowsis can be downloaded from http://www.gnowsis.org/.

Fig. 1. The Gnowsis Architecture

The Gnowsis Server. The architecture of the gnowsis-server service is shown in Figure 1. Its central component is naturally an RDF storage repository. Gnowsis uses four different stores for this purpose. The PIMO store handles the information in the user's Personal Information Model (see Section 2.1), the resource store handles the data crawled from Aperture data-sources (see Section 4), the configuration store handles the data about available data-sources, log-levels, crawl-intervals, etc., and finally, the service store handles data created by various gnowsis modules, such as user profiling data or metadata for the crawling of data-sources.
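To make the four-store design concrete, the following is a minimal sketch, not actual gnowsis code, of how one on-disk repository per store could be opened with the Sesame 2 native SAIL that gnowsis uses (as described below). The directory layout under ~/.gnowsis and the store names are illustrative assumptions.

import java.io.File;
import org.openrdf.repository.Repository;
import org.openrdf.repository.sail.SailRepository;
import org.openrdf.sail.nativerdf.NativeStore;

public class StoreSetup {
    public static void main(String[] args) throws Exception {
        // One separate native (on-disk) repository per store; the
        // directory names below are hypothetical, not gnowsis defaults.
        String[] stores = {"pimo", "resource", "config", "service"};
        for (String name : stores) {
            File dataDir = new File(System.getProperty("user.home"),
                    ".gnowsis/" + name);
            Repository repo = new SailRepository(new NativeStore(dataDir));
            repo.initialize();  // opens or creates the store on disk
            System.out.println("Opened store: " + name);
            repo.shutDown();
        }
    }
}

Keeping each store in its own repository means, for example, that an inferencing SAIL can be stacked on the PIMO repository alone, leaving the much larger resource store untouched.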
Separating the PIMO store from the resource store was an important decision for the gnowsis architecture, and it was made for several reasons. The resource store is inherently chaotic, since it mirrors the structure of the user's applications (consider your email inbox), whereas the thoughts (e.g., concepts and relations) can be structured separately in the PIMO. Another reason was efficiency: while a user's PIMO may contain a few thousand instances for a very frequent user, it is not uncommon for people to have an archive of 100,000 emails. By separating the two we can save time and resources by performing inference only on the PIMO store. We also note that a similar approach was taken in many other projects, for instance in the topic maps community, where topics and occurrences are separated [13]. A discussion of the cognitive justification for such a split can be found in Section 2.1.

The storage modules in gnowsis are currently based on Sesame 2 [2] and use Sesame's native Storage And Inference Layer (SAIL) to store the data on disk. In previous gnowsis versions we used MySQL in combination with Jena as triple store, but this forced users to install the database server on their desktops, and the performance of fulltext-searching in MySQL/Jena was not satisfactory. By using Sesame 2 with the embedded native SAIL we simplified the installation significantly.

In addition to the raw RDF, the PIMO and resource stores use an additional SAIL layer which utilizes Lucene to index the text of RDF literals, providing extremely fast full-text search capabilities on our RDF stores. Lucene indexing operates on the level of documents. Our LuceneSail has two modes for mapping RDF to logical documents: one is used with Aperture and will index each Aperture data-object (for example files, webpages, emails, etc.) as a document. The other mode does not require Aperture; instead, one Lucene document is created for each named RDF resource in the store. The resulting Lucene index can be accessed either explicitly through Java code, or by using special predicates when querying the RDF store. Figure 2 shows an example SPARQL query for PIMO documents containing the word "rome". The LuceneSail will rewrite this query to access the Lucene index and remove the special predicates from the triple patterns. This method for full-text querying of RDF stores is equivalent to the method used in the Aduna MetaData server and Aduna AutoFocus [6].

SELECT ?X
WHERE {
  ?X rdf:type pimo:Document ;
     lucene:matches ?Q .
  ?Q lucene:query "rome" .
}

Fig. 2. A query using special predicates for full-text searching

2.1 Open Source, Open Standards, Open Minds

A key to success for a future semantic desktop is making it extendable. Partitioning the framework into services and making it possible to add new services is one way to support this. Another is exposing all programming interfaces based on the XML-RPC standard, allowing new applications to use the data and services of gnowsis. Open standards are important to help create interoperable and affordable solutions for everybody. They also promote competition by setting up a technical playing field that is level to all market players.
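To illustrate what such an open interface buys a third-party developer, here is a minimal sketch of an external Java application calling a gnowsis XML-RPC service with the Apache XML-RPC client library. The endpoint URL (port and path) and the method name "search.fulltext" are hypothetical assumptions for illustration; the actual service names are defined by the individual gnowsis components.

import java.net.URL;
import org.apache.xmlrpc.client.XmlRpcClient;
import org.apache.xmlrpc.client.XmlRpcClientConfigImpl;

public class GnowsisClient {
    public static void main(String[] args) throws Exception {
        XmlRpcClientConfigImpl config = new XmlRpcClientConfigImpl();
        // Hypothetical local endpoint; the real port and path depend on
        // the gnowsis-server configuration.
        config.setServerURL(new URL("http://localhost:9993/xmlrpc"));
        XmlRpcClient client = new XmlRpcClient();
        client.setConfig(config);
        // Hypothetical service method: full-text search over the stores.
        Object result = client.execute("search.fulltext",
                new Object[]{"rome"});
        System.out.println("Result: " + result);
    }
}

Because XML-RPC clients exist for virtually every language, the same call could just as easily be made from Python, JavaScript, or a shell script, which is precisely the point of exposing the interfaces through an open standard.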