© 2014 Nine Points Solutions, LLC

Tool for Mapping Tabular Data to an Ontology, A Work-In-Progress

2 Jun 2014

Andrea Westerinen

Agenda
• Problem
• Approach
  − Background, iRINGTools
  − “Templates”
  − Mapping infrastructure
• Working Example
  − Cruise scenario

Problem
• Ontologies are valuable for:
  − Data integration/mapping/merging
  − Knowledge engineering and enhancement (via reasoning)
• But ontologies and their encodings are foreign and intimidating to most users and developers
  − Ontologies are often referred to as the “o” word
• And data is typically stored and understood in tabular form, or accessed via RESTful services
  − Tabular forms: relational data or spreadsheets
  − RESTful interfaces: XML or JSON
• “Never the twain shall meet”?

Approach
• Capture and present the important concepts from an ontology
  − Using templates
    • A “tabular form” consistent with ontology design patterns
• Provide an implementation to:
  − Present templates (which define the “necessary” data)
  − Define mappings from data in relational stores, spreadsheets and XML/JSON exchanges to the template data
    • Not mapping a complete database or assuming an uber-spreadsheet, but mapping small sets of domain- or problem-specific concepts
  − Support a SPARQL endpoint, providing the mapped data on request
• Have an existence proof of the approach in iRINGTools
  − For ISO 15926


Background – ISO 15926
• http://en.wikipedia.org/wiki/ISO_15926
• Designed for data integration and exchange regarding the lifecycle of an “installation” and its components
  − A “babel fish” (from “The Hitchhiker’s Guide to the Galaxy”) for project information
• Established for the process industry
  − Large projects with many participants, built and maintained over long periods of time
• The technology is useful outside the process industry, provided a vocabulary of reference data is available

Background – iRINGTools
• Created within the iRING User Group
  − iRING is an acronym for “ISO 15926 Realtime Interoperability Network Grid”
• http://iringug.org/wiki/index.php?title=IRINGTools#iRINGTools_Resources
  − Deployable implementation of ISO 15926 to:
    • Browse and extend ISO 15926 reference data
    • Map an application schema to the reference data
    • Transform an application's data into an ISO 15926 representation
    • Exchange data with other iRING implementations
• Two packages:
  − iRINGTools-Adapter (for mapping, configuration and reference data management; C# based, requires IIS)
  − iRINGTools-Core (basic directory, reference data checking and exchange services; Java based, requires an application server, e.g. Apache Tomcat)
• iRINGTools services and applications can be evaluated at http://www.iringsandbox.org/

Background – iRINGTools Information Flow

Using Reference Data Libraries (RDLs) based on Part 4

Background – iRINGTools Editor

[Figure: iRINGTools editor screenshot, annotated “or spreadsheet, or REST API” and “template details”]

Templates
• Starting from existing ontologies (if any), competency questions, requirements, …
  − Define or extend the ontology(ies)
  − Define templates based on the ontology(ies)
• Templates:
  − One or more ontology classes and one or more data and/or object properties
  − ~OWL-Full, since no reasoning is performed at the template level
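A template as described above can be pictured as a small record of classes and properties that flattens into a "tabular form". The following is only a minimal sketch under that reading: the names `Template`, `TemplateProperty`, `columns` and the example vessel template are hypothetical and are not taken from the tool itself.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TemplateProperty:
    """A data or object property exposed by a template (hypothetical structure)."""
    name: str                  # property name taken from the ontology
    domain: str                # ontology class the property is attached to
    target: str                # class name (object property) or datatype (data property)
    is_object_property: bool = False


@dataclass
class Template:
    """One or more ontology classes plus one or more properties."""
    name: str
    classes: list[str]
    properties: list[TemplateProperty] = field(default_factory=list)

    def validate(self) -> None:
        # The slide requires at least one class and at least one property.
        if not self.classes:
            raise ValueError(f"template {self.name!r} needs at least one class")
        if not self.properties:
            raise ValueError(f"template {self.name!r} needs at least one property")
        # Every property must be attached to a class the template contains.
        for prop in self.properties:
            if prop.domain not in self.classes:
                raise ValueError(f"{prop.name!r} has unknown domain {prop.domain!r}")

    def columns(self) -> list[str]:
        """The 'tabular form': one column per property, qualified by its class."""
        return [f"{p.domain}.{p.name}" for p in self.properties]


# Illustrative template; the class and property names are assumptions.
vessel = Template(
    name="VesselTemplate",
    classes=["Vessel"],
    properties=[TemplateProperty("hasName", "Vessel", "xsd:string")],
)
vessel.validate()
print(vessel.columns())
```

No reasoning is attempted here, in line with the slide: the template only lists what data is "necessary", and validation is limited to structural checks.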

Template Ontology

Mappings
• Using a similar interface to iRINGTools …
• Record:
  − Templates mapped -> classes and properties available when a SPARQL query is received
    • Data can come from a combination of sources: DBs, spreadsheets and/or RESTful URIs
  − Information to access the data source(s)
    • DB name/location, or spreadsheet path and file name, or RESTful URI
    • Access/login/password details
  − Mapping details
    • Class from DB table(s), spreadsheet(s), and/or RESTful resource(s)
    • Property from DB column(s), spreadsheet page(s)/column(s), or RESTful GET statements + date, string or mathematical manipulations
      − Cross-source access of data for a single property is a future work item
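The recorded mapping information listed above (source access details, class mapping, property mapping plus an optional manipulation) could be held in records like the ones below. This is a sketch under stated assumptions, not the tool's actual data model: `SourceInfo`, `PropertyMapping`, `ClassMapping`, `apply_mapping`, the column names and the feet-to-metres conversion are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SourceInfo:
    kind: str                        # "db", "spreadsheet" or "rest"
    location: str                    # DB name/location, spreadsheet path, or RESTful URI
    username: Optional[str] = None
    password: Optional[str] = None   # a real tool would not keep this in plain text


@dataclass
class PropertyMapping:
    property_name: str               # template property being filled
    column: str                      # DB column, spreadsheet column, or field of a GET result
    transform: Optional[Callable[[str], object]] = None  # date/string/math manipulation


@dataclass
class ClassMapping:
    class_name: str                  # template class being populated
    source: SourceInfo
    table: str                       # DB table, spreadsheet page, or REST resource
    key_column: str                  # column used to mint an identifier per individual
    properties: list[PropertyMapping]


def apply_mapping(mapping: ClassMapping, rows: list[dict]) -> list[tuple]:
    """Turn source rows (dicts keyed by column name) into (subject, predicate, object) triples."""
    triples = []
    for row in rows:
        subject = f"{mapping.class_name}/{row[mapping.key_column]}"
        triples.append((subject, "rdf:type", mapping.class_name))
        for pm in mapping.properties:
            raw = row.get(pm.column)
            if raw is None or raw == "":
                continue  # nothing to map for this row
            value = pm.transform(raw) if pm.transform else raw
            triples.append((subject, pm.property_name, value))
    return triples


# Illustrative data only; the sheet, columns and values are made up.
mapping = ClassMapping(
    class_name="Vessel",
    source=SourceInfo(kind="spreadsheet", location="vessels.xlsx"),
    table="Sheet1",
    key_column="Id",
    properties=[
        PropertyMapping("hasName", "Ship"),
        PropertyMapping("lengthInMetres", "Length_ft",
                        transform=lambda v: round(float(v) * 0.3048, 1)),
    ],
)
triples = apply_mapping(mapping, [{"Id": "EV1", "Ship": "Example Vessel", "Length_ft": "274"}])
for triple in triples:
    print(triple)
```

Each property here reads from a single source, which matches the slide's note that cross-source access for one property is left as future work.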

Technology Infrastructure and Reuse
• Mapping interface prototyped using:
  − Bootstrap (front-end framework for web development)
  − Jersey (JAX-RS implementation supporting XML or JSON)
• Mapped data published to a Stardog triple store and periodically updated
  − Could re-map on every query, or trigger updates based on DB or file changes
• Access APIs:
  − Hibernate (or MyBatis) for DB access
  − Apache POI for spreadsheet access (Java API for accessing MS Office file formats)
  − JSR 353 (JSON-P) for JSON
  − JSR 206 (JAXP) for XML
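The alternative mentioned above, triggering an update when a source file changes rather than on a fixed schedule, might look roughly like this. It is a sketch in Python rather than the Java stack named on the slide, and `FileChangeTrigger`, its `poll` method and the `remap` callback are hypothetical names; a real deployment would republish to the triple store inside the callback.

```python
import os
import tempfile


class FileChangeTrigger:
    """Calls `remap(path)` whenever the file's modification time has changed."""

    def __init__(self, path, remap):
        self.path = path
        self.remap = remap
        self._last_seen = None  # modification time at the last re-map

    def poll(self) -> bool:
        stamp = os.stat(self.path).st_mtime_ns
        if stamp == self._last_seen:
            return False        # source unchanged, keep the published data
        self._last_seen = stamp
        self.remap(self.path)   # e.g. re-run the mapping and republish
        return True


calls = []
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cruises.csv")   # made-up file name
    with open(path, "w") as f:
        f.write("vessel,port\n")
    trigger = FileChangeTrigger(path, calls.append)
    first = trigger.poll()    # first sighting counts as a change
    second = trigger.poll()   # nothing changed since the last poll
    later = os.stat(path).st_mtime_ns + 10_000_000_000
    os.utime(path, ns=(later, later))         # simulate an edit 10 s later
    third = trigger.poll()
print(first, second, third)
```

Polling modification times is only one option; re-mapping on every query trades freshness for query latency, and database sources would need their own change signal.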

Example – Semantic Trajectories

From the presentation “Towards ontology patterns for ocean science repository integration” by Pascal Hitzler

Example – Ocean Science Cruise Ontology

From the presentation “Towards ontology patterns for ocean science repository integration” by Pascal Hitzler

Example – Templates
• Trajectory template:
  − containsClass – MovingObject and Position
    • minCardinality 1, maxCardinality 1
  − containsProperty:
    • minCardinality 1, maxCardinality 1 for all
    • MovingObject – inverse of isTraversedBy – Segment –
      − startsFrom – Fix – atTime – Instant (or shortcut dateTime datatype)
      − startsFrom – Fix – hasLocation – Position
      − endsAt – Fix – atTime – Instant (or shortcut dateTime datatype)
      − endsAt – Fix – hasLocation – Position
• Cruise template:
  − Includes the trajectory template
    • Refines MovingObject class to be Vessel
    • Refines Position class to be Port
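The trajectory template and its refinement into the cruise template can be sketched as data, with each template property written as a chain of (property, target class) steps. The class and property names come from the slide; everything else is an assumption made for illustration, including the `ChainTemplate` and `PathProperty` types, the `refine` method, and the `^` prefix used to mark the inverse of `isTraversedBy`.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PathProperty:
    """A template property expressed as a chain of ontology properties and classes."""
    start: str                              # class the chain starts from
    steps: tuple[tuple[str, str], ...]      # (property, target class) pairs
    min_cardinality: int = 1
    max_cardinality: int = 1


@dataclass(frozen=True)
class ChainTemplate:
    name: str
    classes: tuple[str, ...]
    properties: tuple[PathProperty, ...]

    def refine(self, name: str, renames: dict[str, str]) -> "ChainTemplate":
        """Derive a template whose classes are specialised, e.g. Position -> Port."""
        def swap(cls: str) -> str:
            return renames.get(cls, cls)

        props = tuple(
            replace(p, start=swap(p.start),
                    steps=tuple((prop, swap(target)) for prop, target in p.steps))
            for p in self.properties
        )
        return ChainTemplate(name, tuple(swap(c) for c in self.classes), props)


def chain(edge: str, leaf_property: str, leaf_class: str) -> PathProperty:
    # MovingObject - inverse of isTraversedBy - Segment - <edge> - Fix - <leaf>
    return PathProperty(
        "MovingObject",
        (("^isTraversedBy", "Segment"), (edge, "Fix"), (leaf_property, leaf_class)),
    )


trajectory = ChainTemplate(
    "Trajectory",
    ("MovingObject", "Position"),
    tuple(
        chain(edge, prop, cls)
        for edge in ("startsFrom", "endsAt")
        for prop, cls in (("atTime", "Instant"), ("hasLocation", "Position"))
    ),
)

# The cruise template includes the trajectory template and narrows two classes.
cruise = trajectory.refine("Cruise", {"MovingObject": "Vessel", "Position": "Port"})
for p in cruise.properties:
    print(p.start, *[f"{prop} {cls}" for prop, cls in p.steps])
```

Keeping the templates immutable means the refinement returns a new template and leaves the trajectory template available for other specialisations.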


Example – Mapping

[Figure annotations: “Vessel and Identifying Properties”, “Port and Identifying Properties”, “Arrival and Departure Times”]

• Result:
  − Creation of Vessel and Port individuals from spreadsheet data
  − Auto-creation of Segment and Fix individuals, and the appropriate object properties, from property chains
  − Creation of (OWL-Time) Instant individuals from the arrival and departure strings (where available)
    • Or conversion of the strings to dateTime values
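The result described above can be illustrated by expanding one spreadsheet row into individuals and object properties. This sketch takes the shortcut route of converting the time strings to dateTime values instead of creating Instant individuals. The column names, identifier scheme, date format and the `expand_row` and `parse_time` helpers are assumptions, not details from the slides.

```python
from datetime import datetime


def parse_time(text, fmt="%m/%d/%Y %H:%M"):
    """Convert a spreadsheet date string to an ISO 8601 dateTime string, or None if blank."""
    text = (text or "").strip()
    if not text:
        return None
    return datetime.strptime(text, fmt).isoformat()


def expand_row(row: dict, index: int) -> list[tuple]:
    """Create Vessel/Port individuals and auto-create the Segment and Fix between them."""
    vessel = f"vessel/{row['vessel']}"
    segment = f"segment/{index}"
    triples = [
        (vessel, "rdf:type", "Vessel"),
        (segment, "rdf:type", "Segment"),
        (segment, "isTraversedBy", vessel),
    ]
    for edge, port_col, time_col in (
        ("startsFrom", "departure_port", "departure_time"),
        ("endsAt", "arrival_port", "arrival_time"),
    ):
        fix = f"fix/{index}/{edge}"
        port = f"port/{row[port_col]}"
        triples += [
            (fix, "rdf:type", "Fix"),
            (segment, edge, fix),
            (port, "rdf:type", "Port"),
            (fix, "hasLocation", port),
        ]
        when = parse_time(row.get(time_col))
        if when is not None:              # times are mapped only where available
            triples.append((fix, "atTime", when))
    return triples


# Made-up row; a real mapping would read it from the spreadsheet.
row = {
    "vessel": "ExampleVessel",
    "departure_port": "PortA",
    "departure_time": "06/02/2014 08:30",
    "arrival_port": "PortB",
    "arrival_time": "",
}
triples = expand_row(row, 0)
for triple in triples:
    print(triple)
```

The user only maps the vessel, port and time columns; the Segment and Fix individuals are generated because the template's property chains pass through them.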
