Dimensions of Formality: A Case Study for MKM in Software Engineering


Andrea Kohlhase (1), Michael Kohlhase (2), and Christoph Lange (2)

(1) German Research Center for Artificial Intelligence (DFKI), [email protected]
(2) Computer Science, Jacobs University Bremen, {m.kohlhase,ch.lange}@jacobs-university.de

Abstract. We study the formalization process of a collection of documents created for a Software Engineering project from an MKM perspective. We analyze how document and collection markup formats can cope with an open-ended, multi-dimensional space of primary and secondary classifications and relations. We show that RDFa-based extensions of MKM formats by flexible "metadata" relations referencing specific vocabularies are well suited to encode and operationalize this. This explicated knowledge can be used for enriching interactive document browsing, for enabling multi-dimensional metadata queries over documents and collections, and for exporting Linked Data to the Semantic Web, thus enabling further reuse.

1 Introduction

The field of Mathematical Knowledge Management (MKM) tries to model mathematical objects and their relations, their creation and publication processes, and their management requirements. In [CF09, 237 ff.] Carette and Farmer analyzed "six major lenses through which researchers view MKM": the document, library, formal, digital, interactive, and the process lens. Quite obviously, there is a tension between the formal aspects "library", "formal", "digital" – related to machine use of mathematical knowledge – and the informal ones "document", "interactive", "process" – related to human use.

We encountered and dealt with all of these aspects in an extended case study in Software Engineering. The goal of this study was to create a document-oriented formalized process for Software Engineering, in which MKM techniques are used to bridge the gap between informally stated user requirements and formal verification. The object of the study was a safety component for autonomous mobile service robots, developed and certified as SIL-3 standard compliant (see [FHL+08]) in the course of the 3-year project "Sicherungskomponente für Autonome Mobile Systeme (SAMS)" at the German Research Center for Artificial Intelligence (DFKI). Certification required the software development to follow the V-Model (figure 1) and to be based on a verification of certain safety properties in the proof checker Isabelle [NPW02]. The V-Model mandates, e.g., that relevant document fragments ("objects") get justified and linked to the corresponding objects in other members of the document collection in a successive refinement process (the arms of the 'V' from the upper left over the bottom to the upper right, and between the arms, in figure 1).

[Fig. 1. The V-Model of Software Engineering]

System development under this regime results in a highly interconnected collection of design documents, certification documents, code, formal specifications, and formal proofs. This collection of documents (we call it "SAMSDocs" [SAM09]) forms the basis of a case study in the context of the FormalSafe project [For08] at DFKI Bremen, where it serves as a basis for research on machine-supported change management, information retrieval, and document interaction.
The idea of using SAMSDocs as a FormalSafe case study was based on the assumption that a collection created under strong formalization pressure would be easier to semantify than a regular collection.

In this paper we report on — and draw conclusions from — the formalization of the LaTeX documents of SAMSDocs, particularly the inherent multi-dimensionality of the explicated structures (see section 2). The consequences for possible markup approaches are explored in section 3. Section 4 discusses added-value services enabled by multi-dimensional structured representations, and section 5 concludes the paper.

2 Dimensions of Formality in SAMSDocs

We use the SAMSDocs corpus as a case study for what kind of implicit knowledge can be formalized. The structure of a collection – induced e.g. by the V-Model process – has a strong influence on the formality of and within the documents. At the same time, the structure itself is an object of formalization, which makes the structural cues inscribed in the documents explicit.

We used the STEX system [Koh08], a semantic extension of LaTeX, in order to publish the documents both as high-quality human-readable PDF and as formal, machine-processable OMDoc [Koh10], an XML format, via LaTeXML [SKG+10]. Mathematical, structural relations have a privileged status in STEX, and its command sequence/environment syntax is analogous to the native element and attribute names in OMDoc. STEX supports the collection view, since it allows marking up a theory/imports structure as a collection-level structure. Unfortunately, it soon became apparent that this logic-inspired structure was too rigid for the intended stepwise knowledge explication. For example, STEX's generated, convenient symbol macros could not be used without first chunking document fragments into theories. Fortunately, STEX not only offers the OMDoc 1.2 core features, it also allows for the OMDoc 1.3 scheme of metadata via RDFa [ABMP08] annotations (see [LK09,Koh10]). This enables formalization on a much broader basis, since we can add pre-formal markup in the formalization process; we speak of (semantic) preloading. Such extension packages for STEX can thus serve as a development platform for metadata vocabularies for specific structured representations, mirroring the metadata extensibility of OMDoc. This is additionally and particularly useful as complex RDFa annotations, which involve long and unintuitive URIs, can be hidden under simple command sequences for preloading. (Footnote 3: We are currently developing an STEX-based markup infrastructure for defining metadata vocabularies together with a module-based inheritance mechanism for custom metadata macros. This would alleviate the need for package extensions and allow metadata extensibility inside the STEX format.)

[Fig. 2. The Formalization Workflow with STEX-SD]

Figure 2 shows an example of a concrete formalization workflow, where the original tabular TeX environment contains a list of symbols for document states with their definitions, e.g. "i.B." for "in Bearbeitung [in progress]". STEX enables the person doing the structural explication to create collection-specific, layout-independent commands that produce both the PDF and the OMDoc output; in our case we call this extension package STEX-SD. For instance, we replaced the use of the tabular environment in figure 2 by a project-specific SDTab-def environment together with a list of SDdef commands.
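To make this concrete, a preloaded version of the definition table from figure 2 might look roughly as follows in STEX-SD. The exact argument structure of SDTab-def and SDdef is not spelled out in this excerpt, so the signatures shown here (a symbol name, the abbreviation printed in the PDF table, and the defining text) are assumptions for illustration only.

    % hypothetical sketch of STEX-SD preloading; the actual macro
    % signatures used in SAMSDocs may differ
    \begin{SDTab-def}[id=zustand-doc]{Document states}
      % each row produces a PDF table row plus an OMDoc symbol and
      % definition element (here: the symbol "zustand-doc-iB")
      \SDdef{iB}{i.B.}{in Bearbeitung [in progress]}
      % ... further document states preloaded analogously
    \end{SDTab-def}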
The inspiration for the STEX-SD extension was a direct consequence of [KKL10].

As we were interested in the kind of explicated, previously implicit knowledge available for future (re)use, we analyzed the semantic preloading and found distinct dimensions of formality, which we present in the following.

Preloading the Project Structure: A specific constraint of the SAMS project was that the PDF form of the preloaded STEX documents was not supposed to differ from the originals. Here, we extended STEX-SD with syntactic sugar for project-wide layout structures, such as the definition table in figure 2, as this construct was used throughout SAMSDocs. In more detail, we introduced the SDdef command, which translates to an OMDoc symbol element named "zustand-doc-iB" and a corresponding definition element. Moreover, the new environment SDTab-def groups all respective SDdef entries into a PDF table and into an OMDoc theory. Such macro support eased the project-specific markup process and made it much more efficient.

Preloading the Document Structure: Spotting objects, i.e., identifying document fragments as autonomous objects, was a major part of the formalization process. Once spotted, fragments were first preloaded with a referenceable ID, turning them into objects, and then classified and interrelated. For example, in the PDF screenshot in figure 2 we can easily spot the symbol "i.B." and its definition "in Bearbeitung" in the row of the definition table. The row as a document fragment turns into an object in the STEX screenshot as it is attributed the ID "zustand-doc-iB", categorized as a "definition", and implicitly specified via STEX-SD to be a symbol and its definition. Once all relevant objects for a concept were gathered, the relating and chunking process started. Relating was attempted with light markup, using commands like \SDreferences created for STEX-SD, or with STEX's symbol macro feature. The latter could only be used if the respective module/theory structure for the collection was in place, that is, if chunking had partially been done.

Explicating the document structure is not always obvious, since many of the documents contain recaps or previews of material that is normatively introduced in other documents or parts, in order to make documents self-contained or to enhance the narrative structure. Consider for example figures 3 and 4, which are actually clippings from the detailed specification "Konzept-Bremsmodell.pdf". Note the use of s resp. sG, both pointing in fig. 3 to the braking distance function for straight-ahead driving (which is obvious from the local context), whereas in fig. 4 s represents the general arc length function of a circle, which is in principle different from the braking distance, but coincides with it here.

[Fig. 3. s is Braking Distance?]
[Fig. 4. Yet another Braking Distance s?]

Preloading the Collection Structure: Many concepts occur in several of the documents of the SAMSDocs collection. Such occurrences are related, but they are not occurrences of the same object. For example, an object was introduced as a high-level concept in the contract, then it was specified in another document, refined in a detailed specification, implemented in the code, reviewed at some stage, and so on, until it was finally described in the manual.
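As a rough illustration of the relating step described under "Preloading the Document Structure" above, a fragment of a later document might be preloaded so that it points back to the object introduced in figure 2. The argument structure of \SDreferences and the module markup shown here are not given in the excerpt; they are assumptions for illustration only, and the actual STEX/STEX-SD syntax used in SAMSDocs may differ.

    % hypothetical sketch: light relational markup in a later document,
    % referring back to the previously spotted object "zustand-doc-iB"
    \begin{module}[id=pruefplan]  % chunking: this fragment becomes its own theory
      All objects that are still marked as
      \SDreferences{zustand-doc-iB}{i.B.}  % relate this occurrence to the definition object
      have to be re-inspected before the review.
    \end{module}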