Ontology Metadata for Ontology Reuse Elena Simperl
Total Page:16
File Type:pdf, Size:1020Kb
Ontology Metadata for Ontology Reuse 197 Ontology Metadata for Ontology Reuse Elena Simperl* Karlsruhe Institute of Technology, Institute AIFB, Englerstr. 11, Building 11.40, 76128 Karlsruhe, Germany E-mail: [email protected] *Corresponding author Cristina Sarasua Vicomtech-IK4 - Visual Interaction Communication Technologies, Paseo Mikeletegi 57, Parque Tecnologico´ Miramon, 20009 San Sebastian,´ Spain E-mail: [email protected] Rachanee Ungrangsi Shinawatra University, 99 Moo 10 Bangtoey Samkok Pathumthani, 12160, Thailand E-mail: [email protected] Tobias Burger¨ Capgemini, Carl-Wery-Str. 42, 81739 Munich, Germany E-mail: [email protected] Abstract: Most ontologies are poorly documented, thus not being optimally accessible to potential users and ontology reuse services. A first step towards ontology documentation is the development of an explicit schema for the systematic and comprehensive description of ontologies. To provide an informed background for such a schema, we performed a survey of the most recent ontology engineering technology, and the types of information for describing ontologies. We introduce an ontology metadata schema, which as part of the OMV standard, was evaluated through professional reviews. A second step is the provision of automatic techniques to acquire such documentation. For this purpose we have developed the OMEGA (Ontology MEtadata GenerAtion) algorithm, which automatically generates metadata about arbitrary ontologies on the Web and is available as Web application and REST Web service. It was evaluated with very promising results, providing thus a core building block for the realization of fully-fledged ontology location services. Keywords: ontology, metadata, reuse, schema, havervesting, extraction, generation, documentation, repositories, search engine Reference to this paper should be made as follows: Elena Simperl, Cristina Sarasua, Rachanee Ungrangsi. and Tobias Burger.¨ (2011) ‘Ontology Metadata for Ontology Reuse’, Int. J. of Metadata, Semantics and Ontologies, Vol. 1, Nos. 1/1, pp.11–111. 198 Simperl, E. et al. 1 Introduction functionality is negatively influenced by the absence of reuse- relevant information about ontologies in a machine-readable, As semantic technologies mature, an increasing number structured form. A first step towards the alleviation of this of private and public sector organizations are developing situation is the development of an explicit schema for the ontologies to capture their knowledge about a domain of systematic and comprehensive description of ontologies so interest in a machine-understandable way. These ontologies as to enable more effective access, management and usage are expected to act as communication interfaces between of them across the Web. To provide an informed background humans and machines. They are commonly agreed, shared for such a schema, we survey the most recent ontology conceptualizations through which humans express the engineering technology, including ontology repositories, intended meaning of the types of things they talk about in Semantic Web search engines, and ontology documentation a specific domain; at the same time they form the basis for tools, and the types of information they provide to describe the resolution of interoperability issues between IT systems. ontologies. We then introduce an ontology metadata schema, Consequently, it is understandable that the question of how which is not only aligned with the related formats implicitly to ensure the reusability of ontologies beyond the context in or explicitly used by state-of-the-art technology and tools, but which they have been initially developed has often been put in also combines and extends the best elements of these formats relation to the impact semantic technologies might potentially in order to maximize the expressivity and usefulness of the have in various business sectors. This question can be unifying result. As part of the OMV (Ontology Metadata addressed from multiple viewpoints, ranging from knowledge Vocabulary) standard, this metadata schema was evaluated representation languages over ontology reuse technology through professional reviews, and is supported by ontology to community and organizational processes (Simperl 2009, location services such as Oyster1 and Watson.2 A second Simperl & Burger¨ 2010, Pinto & Martins 2000). step towards more informatively documented ontologies is While finding a solution to the problematic trade-off the development and usage of automatic techniques to acquire between application setting-motivated usability and global such documentation in order to scale to the number of reusability is often considered an art more than a science, ontologies constantly being built in various sectors. For the latter can be significantly improved by resorting to this purpose we have developed the OMEGA (Ontology several design principles, which have proved their relevance MEtadata GenerAtion) algorithm, which is available as Web across several fields in computer science confronted with application and REST Web service. The algorithm follows similar challenges: modularity and decomposition along a heuristic approach to automatically generate a wide range abstraction and functionality levels, standards compliance, of metadata information about arbitrary ontologies available careful documentation of the development process, as well on the Web or indexed by Semantic Web search engines. as up-to-date, meaningfully organized artifact repositories. It was evaluated in terms of coverage, precision, recall, and Recent research achievements in the Semantic Web area overall user-perceived quality in several studies with very offer a feasible basis for many of these principles to be promising results. The evaluation revealed that automatic put into practice. The usage of knowledge representation ontology metadata generation is feasible to a much larger standards in conjunction with the ubiquitous, URI-based extent than it is done in state-of-the-art Semantic Web search access of Semantic Web resources are a fair starting point engines if the wealth of knowledge resources freely available to facilitate the cross-application usage of ontologies (Bojars on the Web is taken into account in the process, making et al. 2008, Gracia et al. 2010, Heath & Bizer 2011). OMEGA a core building block for the realization of fully- Semantic Web search engines and ontology repositories fledged ontology location services. play an equally important role in this context, as they In a nutshell, the contribution of this work is twofold. provide potential ontology users with the query, search and First, we provide a comprehensive overview of the types browse support they need to identify ontologies potentially of information used in state-of-the-art ontology engineering relevant to their own needs (Fikes & Farquhar 1999, environments to describe ontologies, identifying potential Ding & Fensel 2001, Hartmann et al. 2009, Baclawski & limitations at the level of ontology metadata in current Schneider 2009, Noy et al. 2008). The work presented in ontology repositories, Semantic Web search engines, and this paper provides the theoretical and technical foundations ontology documentation tools. Second, we investigate how to allow ontology engineering environments, including the ontology metadata could be generated algorithmically in two types of systems just mentioned, to solve one of order to prevent the metadata acquisition bottleneck with their currently most stringent limitation: the lack of useful which ontology location tools are typically confronted. metadata describing ontologies; this metadata is an important ingredient for the implementation of useful retrieval and The remainder of this article is organized as follows: navigation functionality. First we review current state-of-the-art ontology engineering Most ontologies, independently of their provenance and technology with respect to ontology metadata they support location, are poorly documented, while additional descriptive and the means to acquire it (Sections 2 to 4). In particular, information is spread across the Web in various forms, we introduce OMV (Ontology Metadata Vocabulary), which thus not being optimally accessible to potential users and is used by our approach for automatic metadata acquisition. ontology reuse services. Available services in this area In Sections 5 and 6 we present the OMEGA algorithm and offer a basic set of search, browse and navigation features its implementation. Its evaluation and main findings are the to the ontological resources administrated. However, their presented in Section 7. We conclude with a short discussion Ontology Metadata for Ontology Reuse 199 of the limitations of the current approach, and sketch the ontologies, as well as the degree of documentation offered by planned future work in Section 8. different ontology environements and tools. 2.1 General-purpose metadata schemas 2 Metadata schemas used to document ontologies Metadata is often defined as data about data. A metadata record generally consists of a set of attributes or elements As a first step of the ontology reuse process the describing a resource. Metadata can be created manually ontology engineering team typically resorts to conventional or automatically. It can be integrated into the actual or Semantic Web-oriented search engines, and browses resource or stored separately in a metadata database. Several repositories of ontological resources in order to build a list of metadata schemas and vocabularies,