Scalable Methods to Analyze Semantic Web Data

Total Pages: 16

File Type: PDF, Size: 1020 KB

UNIVERSITAT JAUME I
Departament de Llenguatges i Sistemes Informàtics

Scalable Methods to Analyze Semantic Web Data

Ph.D. dissertation
Victoria Nebot Romero
Supervisor: Dr. Rafael Berlanga Llavori
Castellón, December 2012

To my family. To Rafa.

Abstract

Semantic Web data is currently heavily used as a data representation format in scientific communities, social networks, business companies, news portals and other domains. The emergence and availability of Semantic Web data demand new methods and tools to efficiently analyze such data and take advantage of the underlying semantics. Although some applications already make use of Semantic Web data, advanced analytical tools are still lacking, preventing users from exploiting the attached semantics. The main objective of this dissertation is to provide a formal framework that enables the multidimensional analysis of Semantic Web data in a scalable and efficient manner. The success of multidimensional analysis techniques applied to large volumes of structured data in the context of business intelligence, especially for data warehousing and OLAP applications, has prompted us to investigate the application of such techniques to Semantic Web data, whose nature is semi-structured and which contains implicit knowledge.

Multidimensionality is based on the fact/dimension dichotomy. Data are modeled in terms of facts, which are the analytical metrics, and dimensions, which are the different analysis perspectives and are usually hierarchically organized. We believe that the construction of a multidimensional view of Semantic Web data, driven by the semantics encoded in the data themselves and the user's requirements, empowers and enriches the analysis process in a unique manner, as it brings about new analytical capabilities not possible before.
Aggregations and display operations typical of multidimensional analysis tools, such as changing the granularity level of the displayed data or adding a new analysis perspective to the data, are performed based on the semantic relations encoded in the data. This is possible thanks to the mapping of the data to a conceptual multidimensional space.

We base our research on the hypothesis that Semantic Web data is an emerging knowledge resource worth exploiting, and that the knowledge encoded in Semantic Web data can be leveraged to perform an efficient, scalable and full-fledged multidimensional analysis. Scalability is achieved by two means. On the one hand, we provide an indexing model over ontologies that manages implicit knowledge in a compact format, so that operations requiring reasoning can be efficiently solved using the indexes. On the other hand, we have developed several scalable modularization techniques that build upon these indexes and allow us to extract and work only with the ontological subsets of interest.

The indexing and modularization methods assist in the analytical tasks by enabling the extraction of multidimensional data from semantic sources based on the user query. That is, the methods make the extraction of facts and dimensions from Semantic Web data efficient and scalable. However, identifying facts, dimensions, measures and well-shaped dimension hierarchies from the graph structure that underlies Semantic Web data is a big challenge, due to the mismatch between that graph model and the multidimensional model. Therefore, the notions of fact and dimension are revisited in the Semantic Web context, and both facts and dimensions are defined from a logical viewpoint. Facts are formally defined as multidimensional points quantified by measure values, all of which are logically reachable from the subject of analysis defined by the user.
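As a toy illustration of facts as multidimensional points quantified by measures, the following sketch groups RDF-style triples by subject and aggregates a measure over the chosen dimensions. All predicate and entity names are invented for the example; this is not the dissertation's actual implementation.

```python
from collections import defaultdict

# Toy RDF-style triples (subject, predicate, object); names are illustrative.
triples = [
    ("paper1", "hasVenue", "ISWC"), ("paper1", "hasYear", 2004), ("paper1", "citations", 120),
    ("paper2", "hasVenue", "ISWC"), ("paper2", "hasYear", 2004), ("paper2", "citations", 80),
    ("paper3", "hasVenue", "VLDB"), ("paper3", "hasYear", 2004), ("paper3", "citations", 60),
]

DIMENSIONS = ("hasVenue", "hasYear")  # analysis perspectives
MEASURE = "citations"                 # analytical metric

def extract_facts(triples):
    """Group triples by subject and emit (multidimensional point, measure) facts."""
    by_subject = defaultdict(dict)
    for s, p, o in triples:
        by_subject[s][p] = o
    return [
        (tuple(props[d] for d in DIMENSIONS), props[MEASURE])
        for props in by_subject.values()
    ]

def aggregate(facts):
    """Sum the measure for every multidimensional point."""
    cube = defaultdict(int)
    for point, value in facts:
        cube[point] += value
    return dict(cube)

print(aggregate(extract_facts(triples)))
# → {('ISWC', 2004): 200, ('VLDB', 2004): 60}
```

In the framework described here this extraction is driven by indexes and the user's query rather than a fixed schema, but the fact/dimension/measure shape of the result is the same.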
To this end, we use the notion of aggregation path, which is less restrictive than the traditional multidimensional constraints usually imposed between facts and dimensions, and we define different subgroups of interest for analysis. We also check the summarizability of the extracted facts in order to produce correctly aggregated results.

Dimensions are defined as directed acyclic graphs composed of nodes that are semantically related and adhere to the conceptual specification of the user. That is, nodes are sub-concepts of the dimension type and the edges are subsumption relations. Two alternative methods allow us to reshape the extracted dimension hierarchies to favor aggregation while preserving the semantics of the dimension values as much as possible.

The flexibility introduced in the discovery of facts and dimensions allows us to analyze data with complex relations, which escape the traditional multidimensional model and could not otherwise be analyzed. Moreover, instead of building a huge, one-time data warehouse, our method for fact and dimension extraction is based on several indexes and precomputed data, which allows us to efficiently materialize only the facts and dimensions required by an analytical query. This provides up-to-date, dynamic and customized results to the user.

The experimental evaluation performed on each of the components of the framework demonstrates that the proposed analysis framework scales to large ontological resources. Likewise, the developed use cases show the potential of the multidimensional analysis of Semantic Web data.

Summary

Introduction

The Web has become one of the greatest sources of knowledge, at the scientific and business levels as well as in everyday life. Initially, the Web was conceived as a resource for consulting information with more or less static content, consisting of a series of linked documents, typically rich in text.
The first great evolution of the Web led to Web 2.0, whose aim is to create a more dynamic Web that makes it easier for users to interact and share information. Web 2.0 technologies allow users to create and manage content on the Web. Part of the success of Web 2.0 is due to the use of semi-structured languages such as XML for information exchange, which give some structure to the content. However, XML technologies have proven insufficient for efficiently managing the avalanche of information published on the Web, since XML can structure a document syntactically but not its content. This has driven the evolution of the Web towards a more intelligent Web: the Semantic Web, or Web 3.0. The goal of the Semantic Web is to describe the content and information present on the Web using logical formalisms, so that it can be "understood" and processed automatically by machines. Although this objective is very ambitious, suitable technologies for the semantic description of content are already available, such as RDF and OWL. In addition, initiatives such as Linked Open Data (LOD) promote the publication and linking of data in semantic format on the Web. Such has been the success of these initiatives that organizations from very diverse fields have decided to publish their data on the Web in semantic format.
Among the organizations that have joined this initiative we can mention: various governments, such as those of the United Kingdom or the United States, which have published large amounts of data on topics as diverse and interesting as the ozone levels of different cities or the financial situation of companies; media organizations such as the BBC, which makes available to users data about its television and radio programs linked to other knowledge resources; and several scientific communities in the biomedical field, such as Bio2RDF, DailyMed, Diseasome or DrugBank, among many others, which have published large amounts of data about the human genome, drugs together with their chemical composition and use, diseases and their genetic relations, etc. The biomedical domain is especially complex, which is why the conceptualization of these data can bring about important advances.

The existence of this kind of semantically annotated data opens new research areas for the development of applications that exploit the knowledge implicit in the data. However, the exploitation of semantically annotated data is quite complex, due to its underlying graph structure and its implicit semantics, which require reasoning techniques that are usually computationally expensive.

Objectives

This thesis proposes a series of methods for the scalable and efficient analysis of semantically annotated data. The success of multidimensional analysis techniques applied to large amounts of structured data, mainly data warehousing and OLAP techniques, has led us to consider the application of such techniques to semantically annotated data.
In this way, in a hospital where patient information is semantically annotated and linked with biomedical data, a physician could perform a multidimensional analysis at the conceptual level to study the impact of administering certain drugs to patients with a given type of disease. That is, the physician could choose the study variables, or dimensions, such as the type of disease, the sex of the patient, or the type of drug administered, and study their impact on different indicators
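The hospital scenario, combined with the subsumption-based dimensions described in the abstract, can be sketched as a conceptual roll-up: patient counts are aggregated at every level of an is-a hierarchy of diseases. The disease names, the hierarchy, and the counting policy below are invented for illustration; this is not the dissertation's actual algorithm.

```python
from collections import defaultdict

# Hypothetical subsumption (is-a) edges: sub-concept -> super-concepts.
# A DAG rather than a tree: "Viral Pneumonia" has two parents.
IS_A = {
    "Flu": ["Viral Disease"],
    "Viral Pneumonia": ["Viral Disease", "Respiratory Disease"],
    "Asthma": ["Respiratory Disease"],
    "Viral Disease": ["Disease"],
    "Respiratory Disease": ["Disease"],
}

# Number of patients diagnosed with each leaf concept.
diagnoses = {"Flu": 10, "Viral Pneumonia": 4, "Asthma": 7}

def roll_up(diagnoses):
    """Propagate each count to every ancestor concept, counting a patient
    once per ancestor even when the DAG offers several upward paths."""
    totals = defaultdict(int)

    def visit(concept, value, seen):
        if concept in seen:  # already counted at this ancestor
            return
        seen.add(concept)
        totals[concept] += value
        for parent in IS_A.get(concept, []):
            visit(parent, value, seen)

    for concept, value in diagnoses.items():
        visit(concept, value, set())
    return dict(totals)

print(roll_up(diagnoses))
# → e.g. "Disease" totals 21, "Respiratory Disease" 11
```

The `seen` guard avoids double counting a patient whose disease reaches an ancestor through two paths, which is precisely the summarizability concern the abstract raises for dimension hierarchies extracted from graph data.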
Recommended publications
  • The Semantic Web - ISWC 2004
    Lecture Notes in Computer Science 3298. The Semantic Web - ISWC 2004: Third International Semantic Web Conference, Hiroshima, Japan, November 7-11, 2004, Proceedings. Edited by Sheila A. McIlraith, Dimitris Plexousakis, Frank van Harmelen. Springer, 2004. Paperback, xxxv + 844 pp. ISBN 978-3-540-23798-3. Preface: The 3rd International Semantic Web Conference (ISWC 2004) was held November 7-11, 2004 in Hiroshima, Japan. If it is true what the proverb says: "Once by accident, twice by habit, three times by tradition," then this third ISWC did indeed firmly establish a tradition. After the overwhelming interest in last year's conference at Sanibel Island, Florida, this year's conference showed that the Semantic Web is not just a one-day wonder, but has established itself firmly on the research agenda. At a time when special interest meetings with a Semantic Web theme are springing up at major conferences in numerous areas (ACL, VLDB, ECAI, AAAI, ECML, WWW, to name but a few), the ISWC series has established itself as the primary venue for Semantic Web research. Response to the call for papers for the conference continued to be strong.
  • The Development of CCAM: The New French Coding System of Clinical Procedures (Trombert-Paviot, Rector, Baud, Zanstra et al.)
    HIMJ: Reviewed articles. The development of CCAM: The new French coding system of clinical procedures. Beatrice Trombert-Paviot, Alan Rector, Robert Baud, Pieter Zanstra, Caroline Martin, Egbert van der Haring, Lucienne Clavel and Jean Marie Rodrigues. Abstract: A new French coding system of clinical procedures, the Classification Commune des Actes Médicaux (CCAM), was developed at the turn of the millennium (between 1996 and 2001). Two methodologies were used: a traditional domain-experts consensus method, and an artificial-intelligence-based semantic representation. An economic evaluation of clinical procedures was also undertaken for the rating for fee-for-service payment. We present the methodologies used and stress how the European Union research project, 'European Consortium, Generalised Architecture for Languages, Encyclopaedias and Nomenclatures in Medicine' (GALEN), facilitated the sharing and maintaining of consistent medical knowledge. This country case study highlights the significant cost to individual countries of developing their own classifications in isolation. It also demonstrates the benefits of contributing to international efforts such as GALEN that enable harmonisation, yet still allow for diversity. Key words: artificial intelligence; coding system; clinical procedures; terminology; GALEN. Introduction: Following an Act passed in the National Assembly and issued in January 1993, the Ministry
  • Ontologies and the Semantic Web
    Ontologies and the Semantic Web Ian Horrocks <[email protected]> Information Management Group School of Computer Science University of Manchester The Semantic Web Today’s Web • Distributed hypertext/hypermedia • Information accessed via (keyword based) search and browse • Browser tools render information for human consumption What is the Semantic Web? • Web was “invented” by Tim Berners-Lee (amongst others), a physicist working at CERN • His vision of the Web was much more ambitious than the reality of the existing (syntactic) Web: “… a set of connected applications … forming a consistent logical web of data …” “… an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation …” • This vision of the Web has become known as the Semantic Web Hard Work using “Syntactic Web” Find images of Peter Patel-Schneider, Frank van Harmelen and Alan Rector… Rev. Alan M. Gates, Associate Rector of the Church of the Holy Spirit, Lake Forest, Illinois Impossible (?) using “Syntactic Web” • Complex queries involving background knowledge – Find information about “animals that use sonar but are neither bats nor dolphins” , e.g., Barn Owl • Locating information in data repositories – Travel enquiries – Prices of goods and services – Results of human genome experiments • Finding and using “web services” – Given a DNA sequence, identify its genes, determine the proteins they can produce, and hence the biological processes they control • Delegating complex tasks to web “agents” – Book me a holiday next weekend somewhere warm, not too far away, and where they speak either French or English What is the Problem? Consider a typical web page: • Markup consists of: – rendering information (e.g., font size and colour) – Hyper-links to related content • Semantic content is accessible to humans, but not (easily) to computers… What is the (Proposed) Solution? • Add semantic annotations to web resources Dr.
  • Knowledge Driven Software and “Fractal Tailoring”: Ontologies in Development Environments for Clinical Systems
    Knowledge Driven Software and “Fractal Tailoring”: Ontologies in development environments for clinical systems Alan RECTOR School of Computer Science, University of Manchester, UK Abstract. Ontologies have been highly successful in applications involving annotation and data fusion. However, ontologies as the core of “Knowledge Driven Architectures” have not achieved the same influence as “Model Driven Architectures”, despite the fact that many biomedical applications require features that seem achievable only via ontological technologies – composition of descriptions, automatic classification and inference, and management of combinatorial explosions in many contexts. Our group adopted Knowledge Driven Architectures based on ontologies to address these problems in the early 1990s. In this paper we discuss first the use cases and requirements and then some of the requirements for more effective use of Knowledge Driven Architectures today: clearer separation of language and formal ontology, integration with contingent knowledge, richer and better distinguished annotations, higher order representations, integration with data models, and improved auxiliary structures to allow easy access and browsing by users. Keywords. Ontology based systems, Model driven architecture, OWL Citation. Rector A; Knowledge driven software and “fractal tailoring”: Ontologies in development environments for clinical systems. 2010; Formal Ontology in Information Systems (FOIS 2010): IOS Press; 17-30. 1. Introduction In the past fifteen years, the notion of “ontology” has gone from being an obscure “O” word to almost a synonym for “good”. Everybody thinks, and is often told, that they need ontologies. At the same time ontologies have, on the one hand, become dissociated from other forms of knowledge representation and, on the other hand, been taken by some to be all of knowledge representation.
  • Analysing Syntactic Regularities and Irregularities in SNOMED-CT (Eleni Mikroyannidi, Robert Stevens, Luigi Iannone and Alan Rector)
    Mikroyannidi et al. Journal of Biomedical Semantics 2012, 3:8 http://www.jbiomedsem.com/content/3/1/8 JOURNAL OF BIOMEDICAL SEMANTICS RESEARCH Open Access Analysing Syntactic Regularities and Irregularities in SNOMED-CT Eleni Mikroyannidi*, Robert Stevens, Luigi Iannone and Alan Rector Abstract Motivation: In this paper we demonstrate the usage of RIO; a framework for detecting syntactic regularities using cluster analysis of the entities in the signature of an ontology. Quality assurance in ontologies is vital for their use in real applications, as well as a complex and difficult task. It is also important to have such methods and tools when the ontology lacks documentation and the user cannot consult the ontology developers to understand its construction. One aspect of quality assurance is checking how well an ontology complies with established ‘coding standards’; is the ontology regular in how descriptions of different types of entities are axiomatised? Is there a similar way to describe them and are there any corner cases that are not covered by a pattern? Detection of regularities and irregularities in axiom patterns should provide ontology authors and quality inspectors with a level of abstraction such that compliance to coding standards can be automated. However, there is a lack of such reverse ontology engineering methods and tools. Results: RIO framework allows regularities to be detected in an OWL ontology, i.e. repetitive structures in the axioms of an ontology. We describe the use of standard machine learning approaches to make clusters of similar entities and generalise over their axioms to find regularities. This abstraction allows matches to, and deviations from, an ontology’s patterns to be shown.
  • Structured Methods of Information Management for Medical Records (PhD thesis, University of Manchester, Faculty of Medicine)
    Structured Methods of Information Management for Medical Records. A thesis submitted to the University of Manchester for the degree of Doctor of Philosophy in the Faculty of Medicine, 1993. William Anthony Nowlan.
    Table of contents (excerpt): List of Figures; Abstract; Declaration; Acknowledgements; The Author.
    1 Purpose and Introduction: 1.1 The purpose of this thesis; 1.2 Motivations for the work (1.2.1 Research on clinical information systems: PEN&PAD; 1.2.2 Characteristics of clinical information systems); 1.3 What is to be represented: the shared medical model (1.3.1 An example: thinking about 'stroke'; 1.3.2 Medical limits on what can be modelled); 1.4 Scope of the work in this thesis: the limits on context and modelling; 1.5 Organisation of the thesis.
    2 Coding and Classification Schemes: 2.1 The changing task: the move from populations to patients; 2.2 Current methods: nomenclatures and classifications (2.2.1 Nomenclatures; 2.2.2 Classifications; 2.2.3 Classification rules); 2.3 An exemplar classification: the International Classification of Diseases (ICD) (2.3.1 History of the International Classification of Diseases; 2.3.2 The organisation of ICD-9; 2.3.3 Classification principles within ICD-9; 2.3.4 Goal-orientation; 2.3.5 The use of the classification structure: abstraction; 2.3.6 Extending the coverage of a classification and the 'combinatorial explosion'); 2.4 Multiaxial nomenclatures and classifications: SNOMED (2.4.1 Organisation of SNOMED; 2.4.2 SNOMED coding of terms; 2.4.3 Sense and nonsense in SNOMED); 2.5 The structure within SNOMED fascicles (2.5.1 Abstraction in SNOMED; 2.5.2 Relationships between topographical concepts in the SNOMED topography fascicle); 2.6 SNOMED enhancements.
  • Semantic Mapping of Clinical Model Data to Biomedical Terminologies to Facilitate Interoperability
    Semantic Mapping of Clinical Model Data to Biomedical Terminologies to Facilitate Interoperability. A thesis submitted to the University of Manchester for the degree of Doctor of Philosophy in the Faculty of Engineering and Physical Sciences, 2008. By Rahil Qamar, School of Computer Science.
    Contents (excerpt): Abstract; Declaration; Copyright; Dedication; Acknowledgments.
    1 Introduction: 1.1 The task of semantic interoperability; 1.2 The Research Agenda (1.2.1 Research Objective; 1.2.2 Research Aim; 1.2.3 Research Question; 1.2.4 Research Contributions; 1.2.5 Significance of the Research; 1.2.6 The Research Programme); 1.3 Vocabulary and Notations; 1.4 Thesis Outline; 1.5 List of Publications.
    2 Clinical Data Models and Terminologies: 2.1 Why is the medical domain particularly complex?; 2.2 Problems facing interoperability standards; 2.3 Benefits of standardised data in health care; 2.4 Clinical Data Models (2.4.1 Need for Structured Clinical Data Models; 2.4.2 Examples of Data Models: 2.4.2.1 The openEHR Archetype Models, 2.4.2.2 HL7 CDA documents; 2.4.3 History of openEHR Archetypes: 2.4.3.1 Archetype Modeling Features, 2.4.3.2 Purpose of Archetypes, 2.4.3.3 Reasons for selecting Archetypes, 2.4.3.4 Term Binding in Archetypes); 2.5 The openEHR Information Model (2.5.1 The openEHR ENTRY submodel: 2.5.1.1 The Observation subclass
  • An Introduction to OWL
    An Introduction to OWL Sean Bechhofer School of Computer Science University of Manchester, UK http://www.cs.manchester.ac.uk OWL: Web Ontology Language • OWL is an ontology language designed for the Semantic Web – It provides a rich collection of operators for forming concept descriptions – It is a W3C standard, promoting interoperation and sharing between applications – It has been designed to be compatible with existing web standards • In this talk, we’ll see some of the motivation behind OWL and some details of the language Introduction to the Semantic Web Tutorial Towards a Semantic Web • The Web was made possible through established standards – TCP/IP for transporting bits down a wire – HTTP & HTML for transporting and rendering hyperlinked text • Applications able to exploit this common infrastructure – Result is the WWW as we know it • 1st generation web mostly handwritten HTML pages • 2nd generation (current) web often machine generated/active – Both intended for direct human processing/interaction • In next generation web, resources should be more accessible to automated processes – To be achieved via semantic markup – Metadata annotations that describe content/function Introduction to the Semantic Web Tutorial What’s the Problem? • Consider a typical web page • Markup consists of: – rendering information (e.g., font size and colour) – Hyper-links to related content • Semantic content is accessible to humans but not (easily) to computers… • Requires (at least) NL understanding Introduction to the Semantic Web Tutorial A Semantic
  • OWL 2 Web Ontology Language Primer (Second Edition) W3C Recommendation 11 December 2012
    OWL 2 Web Ontology Language Primer (Second Edition) W3C Recommendation 11 December 2012 OWL 2 Web Ontology Language Primer (Second Edition) W3C Recommendation 11 December 2012 This version: http://www.w3.org/TR/2012/REC-owl2-primer-20121211/ Latest version (series 2): http://www.w3.org/TR/owl2-primer/ Latest Recommendation: http://www.w3.org/TR/owl-primer Previous version: http://www.w3.org/TR/2012/PER-owl2-primer-20121018/ Editors: Pascal Hitzler, Wright State University Markus Krötzsch, University of Oxford Bijan Parsia, University of Manchester Peter F. Patel-Schneider, Nuance Communications Sebastian Rudolph, FZI Research Center for Information Technology Please refer to the errata for this document, which may include some normative corrections. A color-coded version of this document showing changes made since the previous version is also available. This document is also available in these non-normative formats: PDF version. See also translations. Copyright © 2012 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply. Abstract The OWL 2 Web Ontology Language, informally OWL 2, is an ontology language for the Semantic Web with formally defined meaning. OWL 2 ontologies provide classes, properties, individuals, and data values and are stored as Semantic Web documents. OWL 2 ontologies can be used along with information written in RDF, and OWL 2 ontologies themselves are primarily exchanged as RDF documents. The OWL 2 Document Overview describes the overall state of OWL 2, and should be read before other OWL 2 documents. This primer provides an approachable introduction to OWL 2, including orientation for those coming from other disciplines, a running example showing how OWL 2 can be used to represent first simple information and then more complex information, how OWL 2 manages ontologies, and finally the distinctions between the various sublanguages of OWL 2.
  • Proceedings Ontology Matching 2009
    Ontology Matching OM-2009 Papers from the ISWC Workshop Introduction Ontology matching is a key interoperability enabler for the Semantic Web, as well as a useful tactic in some classical data integration tasks. It takes the ontologies as input and determines as output an alignment, that is, a set of correspondences between the semantically related entities of those ontologies. These correspondences can be used for various tasks, such as ontology merging and data translation. Thus, matching ontologies enables the knowledge and data expressed in the matched ontologies to interoperate. The workshop has three goals: • To bring together leaders from academia, industry and user institutions to assess how academic advances are addressing real-world requirements. The workshop will strive to improve academic awareness of industrial and final user needs, and therefore, direct research towards those needs. Simul- taneously, the workshop will serve to inform industry and user represen- tatives about existing research efforts that may meet their requirements. The workshop will also investigate how the ontology matching technology is going to evolve. • To conduct an extensive and rigorous evaluation of ontology matching approaches through the OAEI (Ontology Alignment Evaluation Initiative) 2009 campaign, http://oaei.ontologymatching.org/2009. This year's OAEI campaign introduces two new tracks about oriented alignments and about instance matching (a timely topic for the linked data community). Therefore, the ontology matching evaluation initiative itself will provide a solid ground for discussion of how well the current approaches are meeting business needs. • To examine similarities and differences from database schema matching, which has received decades of attention but is just beginning to transition to mainstream tools.
  • A Practical Introduction to Protégé OWL
    A Practical Introduction to Protégé OWL. Session 1: Primitive Classes. Nick Drummond, Matthew Horridge, Olivier Dameron, Alan Rector, Hai Wang. © 2006, The University of Manchester. Overview (morning): Tutorial Aims; OWL Language Overview; Language constructs; Primitive Pizzas; Creating a class hierarchy; Basic relations; Basic Reasoning; Class consistency; Your first defined class; Q&A. Overview (afternoon): Formal Semantics of OWL; Harder but more fun; Advanced Reasoning; Defined Classes; Using a reasoner for computing a classification; Common Mistakes; Q&A. Aims of this morning: make OWL (DL) more approachable; get you used to the tool; give you a taste for the afternoon session. What is OWL? OWL is the Web Ontology Language; it's part of the Semantic Web framework; it's a standard. OWL has explicit formal semantics and can therefore be used to capture knowledge in a machine-interpretable way. OWL helps us describe something, rather than just name it: Class (BlueThing) does not mean anything, whereas Class (BlueThing complete owl:Thing restriction (hasColour someValuesFrom (Blue))) has an agreed meaning to any program accepting OWL semantics. What is the Semantic Web? A vision of a computer-understandable web: distributed knowledge and data in reusable form; XML, RDF(S), OWL just