Semantic search and reporting implementation on .15926 platform

Victor Agroskin
10.05.2012

About the .15926 project

• TechInvestLab.ru – Moscow-based strategy, organization and IT architecture consultancy
• Software platform for ontology programming – allows business users to perform advanced tasks with data using only domain-specific terms, patterns and metaphors
• .15926 public releases:
  – Browser, March 2011
  – Editor, December 2011
  – TabLan modeling methodology, March 2012
  – SearchLan, May 2012 (planned)
• Russian-speaking community of approx. 100 friends

Ontology Programming Platform

• The long road to the business user starts from rather complex things
• Programming, modeling, ontologizing – different names for one activity
  – Mapping and compiling are the same
• Ontology-related computations based on a general-purpose multi-paradigm language
  – Not a logic one!
• Domain Specific Languages (DSLs) – defining higher abstraction language layers and domain-specific constructs
  – From triples to instances to templates to patterns…

Language Workbench IDE

• The goal – to have a product of Language Workbench class for ontology work
• Fully integrated DSL development – definitions, libraries, editors
• Turing-complete mapping environment to any schema (conceptual or proprietary CAD/PLM)
• Seamless integration with outside data sources – tables, databases, XML
• Python implementation:
  – Core functionality to work with various triple representations of ISO 15926 type and template instances
  – Core support of SPARQL querying, optimized for work with federated endpoints as unreliable infrastructure
  – Plug-in architecture for data analysis and transformation (mappings, searches, verification, reasoning, etc.)
  – Optimized for ISO 15926 data structure searches (indexing, substring filtering, etc.)

[Diagram: .15926 platform architecture. Labels: SearchLan; Tables; Interface; Table Reader; OIM Writer; TabLan Mapping; OIM Mapping Definition; .15926 Builder; .15926 Scanner; Editor Interface; Library; Template; ISO 15926 Data; 15926-2 classes, relationships, individuals; 15926-7 template definitions, template instances; 15926-6; .15926 Core; 15926-7 template constructor; RDF/OWL Files; RDF/OWL Files & SPARQL; 15926-2,7 OWL Definitions; iRING, Part 8, PCA RDL/JORD, Part 4; SPARQL Endpoint]

Engineering Data Domain

• Big Data
  – PCA RDL – more than 3 mil. triples and set to grow
  – … but it is just reusable reference data!
• vs. data reuse
  – Do not throw away intermediary files, but learn to work with them
  – Distributed semantic networks with many-layered semantics predefined by engineering knowledge
• Mappings across several ontologies
• Specialized semantic tools required for:
  – Data modeling
  – Mappings
  – Reasoning
  – Search

ISO 15926 (Meta) Languages

[Diagram: languages grouped as Conceptual, Graphical, Data and Query. Labels: Patterns; Part 7 Templates; Part 8 RDF/OWL Conventions; Part 2 Type Instances; Part 2,7 Instance Diagrams; EXPRESS; EXPRESS-G; OWL; RDF; SPARQL; XML]

+ Engineering Languages

[Diagram: the same language map extended with engineering languages. Added labels: Engineering Information Patterns; Drawings & Specialty Diagrams; CAD/PLM Formats; Natural Language]

Filling the Gaps

[Diagram: the same language map with TabLan.15926 and SearchLan.15926 added]

SearchLan.15926

• Query language for RDF graphs restricted to ISO 15926 content (Part 2 type instances and template instances)
• Built over SPARQL
• Integrated queries for 15926-8 specific data and metadata (annotation properties)
• Available on the .15926 platform to plug-ins and in the user interface
• Extensible with standard Python functions
• High-level logic available for language extension
• Configurable for specific presets: collections of interrelated data sources (files and endpoints) with namespace conventions, template libraries and metadata annotations

Name Queries

@find(label=contains('UOM'))
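One way to read the name query above is as a filter built from small predicate factories. The sketch below is a minimal in-memory illustration under that assumption; `contains`, `icontains` and `find` here are hypothetical stand-ins written for this example, not the .15926 implementation, and the sample ids and labels are invented.

```python
# Hypothetical stand-ins for SearchLan predicates; not the .15926 implementation.

def contains(fragment):
    """Return a case-sensitive substring predicate."""
    return lambda value: fragment in value


def icontains(fragment):
    """Return a case-insensitive substring predicate."""
    needle = fragment.lower()
    return lambda value: needle in value.lower()


def find(items, **conditions):
    """Return ids of items whose fields satisfy every condition.

    A condition is either a predicate (callable) or a literal compared with ==.
    """
    def matches(item):
        for field, cond in conditions.items():
            value = item.get(field)
            if value is None:
                return False
            ok = cond(value) if callable(cond) else value == cond
            if not ok:
                return False
        return True

    return [item["id"] for item in items if matches(item)]


# Toy reference data; ids and labels are invented for illustration.
ITEMS = [
    {"id": "R1", "label": "UOM SYMBOL"},
    {"id": "R2", "label": "uom name"},
    {"id": "R3", "label": "PUMP"},
]

name_hits = find(ITEMS, label=contains("UOM"))
name_hits_ci = find(ITEMS, label=icontains("uom"))
print(name_hits, name_hits_ci)
```

Because the predicates are plain Python callables, a user-defined function with the same one-argument shape could be passed in the same position.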

Part 2 Type Queries

@find(id=R1, type=part2.ClassOfClassOfInformationRepresentation)

Part 2 Relationship Queries

@find(type=part2.Classification, hasClassifier=R5, hasClassified=out)
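Since SearchLan is described as built over SPARQL, a relationship query like the one above could plausibly be turned into a basic graph pattern. The following is only a sketch of that idea: the function `to_sparql`, the `OUT` marker and the `example.org` namespaces are assumptions made for illustration, and the real query generation in .15926 may look quite different.

```python
# Hypothetical translation sketch; SearchLan's real query generation may differ.

OUT = object()  # stand-in for SearchLan's `out` marker

# Placeholder namespaces (assumptions, not the official ISO 15926 URIs),
# except for the standard rdf namespace.
PREFIXES = {
    "part2": "http://example.org/part2#",
    "rdl": "http://example.org/rdl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


def to_sparql(type_name, **roles):
    """Build a SELECT query for one relationship type.

    Roles bound to OUT become projected variables; other roles are
    treated as identifiers in the `rdl` namespace.
    """
    selected = []
    patterns = ["?rel rdf:type part2:%s ." % type_name]
    for role, value in roles.items():
        if value is OUT:
            var = "?" + role
            selected.append(var)
            patterns.append("?rel part2:%s %s ." % (role, var))
        else:
            patterns.append("?rel part2:%s rdl:%s ." % (role, value))
    header = "\n".join("PREFIX %s: <%s>" % item for item in PREFIXES.items())
    projection = " ".join(selected or ["?rel"])
    body = "\n  ".join(patterns)
    return "%s\nSELECT %s\nWHERE {\n  %s\n}" % (header, projection, body)


query = to_sparql("Classification", hasClassifier="R5", hasClassified=OUT)
print(query)
```

The design choice shown is that the `out` marker decides which role ends up in the SELECT clause, while fixed identifiers become constants in the pattern.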

Template Queries

@find(type=p7tpl.DescriptionByInformationObject, hasRepresented=out, hasPattern=find(label=icontains('snip')))
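The template query above nests one `find` inside another. A simple way to picture this is that the inner query yields a set of identifiers and the outer query keeps the template instances whose role value falls in that set. The sketch below assumes exactly that reading; `find_labels`, `find_templates`, `OUT` and all sample data are hypothetical and exist only for this example.

```python
# Hypothetical in-memory evaluation of a nested query; not the SearchLan engine.

OUT = object()  # stand-in for the `out` marker: "return the value of this role"

# Toy labels and template instances; role names follow the slide, data is invented.
LABELS = {"P1": "Snip pattern", "P2": "Weld pattern"}

TEMPLATES = [
    {"type": "DescriptionByInformationObject", "hasRepresented": "R1", "hasPattern": "P1"},
    {"type": "DescriptionByInformationObject", "hasRepresented": "R2", "hasPattern": "P2"},
]


def find_labels(fragment):
    """Inner query: ids whose label contains the fragment, ignoring case."""
    needle = fragment.lower()
    return {ident for ident, label in LABELS.items() if needle in label.lower()}


def find_templates(type_name, **roles):
    """Outer query: match roles against id sets, collect roles marked OUT."""
    results = []
    for inst in TEMPLATES:
        if inst["type"] != type_name:
            continue
        constrained = ((r, ids) for r, ids in roles.items() if ids is not OUT)
        if all(inst[r] in ids for r, ids in constrained):
            results.extend(inst[r] for r, v in roles.items() if v is OUT)
    return results


described = find_templates(
    "DescriptionByInformationObject",
    hasRepresented=OUT,
    hasPattern=find_labels("snip"),
)
print(described)
```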

Reference Data Verification (1)

@find(type=part2.Classification,
      hasClassified=find(type=part2.any.ClassOfRelationship),
      hasClassifier=find(type=part2.any.ClassOfClassOfIndividual))

Reference Data Verification (2)

@find(type=part2.Classification,
      hasClassified=find(type=part2.any.ClassOfIndividual),
      hasClassifier=find(type=part2.any.ClassOfClassOfRelationship))
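The two verification queries appear to look for classifications whose two ends sit on different branches of the type hierarchy, for example a class of relationship classified by a class of class of individual. The sketch below illustrates such a check on toy data. It assumes that `part2.any.X` means "X or any of its subtypes"; the hierarchy fragment, the helper names and the sample records are all invented for illustration and not taken from the platform.

```python
# Hypothetical verification sketch on toy data; not the .15926 implementation.

# Toy fragment of a type hierarchy (child -> parent), for illustration only.
SUPERTYPE = {
    "ClassOfClassification": "ClassOfRelationship",
}

# Invented reference data: id -> Part 2 type name.
TYPES = {
    "R10": "ClassOfClassification",
    "R11": "ClassOfClassOfIndividual",
    "R12": "ClassOfClassOfRelationship",
    "R13": "ClassOfIndividual",
}

CLASSIFICATIONS = [
    {"id": "C1", "hasClassified": "R10", "hasClassifier": "R11"},  # mismatched
    {"id": "C2", "hasClassified": "R10", "hasClassifier": "R12"},  # consistent
    {"id": "C3", "hasClassified": "R13", "hasClassifier": "R11"},  # consistent
]


def is_a(type_name, ancestor):
    """True if type_name equals ancestor or descends from it in SUPERTYPE."""
    while type_name is not None:
        if type_name == ancestor:
            return True
        type_name = SUPERTYPE.get(type_name)
    return False


def mismatched(classified_type, classifier_type):
    """Ids of classifications pairing the two given type families."""
    return [
        c["id"]
        for c in CLASSIFICATIONS
        if is_a(TYPES[c["hasClassified"]], classified_type)
        and is_a(TYPES[c["hasClassifier"]], classifier_type)
    ]


check_1 = mismatched("ClassOfRelationship", "ClassOfClassOfIndividual")
check_2 = mismatched("ClassOfIndividual", "ClassOfClassOfRelationship")
print(check_1, check_2)
```

An empty result for a check means no suspicious classification of that kind was found in the data set.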

Template “Contraction” Query

@find(type=part2.ClassOfIndirectProperty,
      hasClassOfPossessor=out,
      hasPropertySpace=find(type=part2.Classification,
                            hasClassifier=out,
                            hasClassified=find(type=part2.PropertyQuantification,
                                               hasInput=out,
                                               hasResult=find(type=part2.RealNumber))))

Template “Contraction” Results

Patterns (iRING version)

[Diagram labels: PLANT AREA; Functional Area; ClassifiedArrangementOfIndividual; COMPOSITION; P0002; AREA CODE; ClassifiedClassOfIdentification; IDENTIFICATION]

@find(type=p7tpl.ClassifiedArrangementOfIndividual,
      hasPart=find(id=uri('http://company.com/project/data#R7554677677')),
      hasWhole=out,
      hasContext=find(label=icontains('plant area composition')))

@find(type=p7tpl.ClassifiedClassOfIdentification,
      hasRepresented=R1,
      valPattern=out,
      hasContext=find(label=icontains('area code')))

Object Information Models

• Extracting ISO 15926 sub-graphs (not RDF!) and presenting them in a user interface in a compact form
• Partial definition:

    oim_settings = [
        dict(category="classified by", type=part2.Classification,
             hasClassified=_this, hasClassifier=_other),
        dict(category="classifies", type=part2.Classification,
             hasClassified=_other, hasClassifier=_this),
        dict(category="is specialization of", type=part2.Specialization,
             hasSubclass=_this, hasSuperclass=_other),
        dict(category="is generalization of", type=part2.Specialization,
             hasSubclass=_other, hasSuperclass=_this),
        dict(category="is identified by", type=part2.ClassOfIdentification,
             hasRepresented=_this, hasPattern=_other),
    ]
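The partial `oim_settings` definition above pairs a display category with a relationship type and marks which role holds the object in focus (`_this`) and which holds the related object (`_other`). A minimal sketch of how such settings could drive a compact view is given below. It is an assumption about the mechanism rather than the platform's code: the string markers, `build_oim` and the sample relationships are hypothetical.

```python
# Hypothetical sketch of applying OIM settings to toy data; not the .15926 code.

THIS, OTHER = "_this", "_other"  # stand-ins for the _this / _other markers

OIM_SETTINGS = [
    dict(category="classified by", type="Classification",
         hasClassified=THIS, hasClassifier=OTHER),
    dict(category="classifies", type="Classification",
         hasClassified=OTHER, hasClassifier=THIS),
    dict(category="is specialization of", type="Specialization",
         hasSubclass=THIS, hasSuperclass=OTHER),
]

# Invented relationship records using the role names from the settings.
RELATIONSHIPS = [
    {"type": "Classification", "hasClassified": "R1", "hasClassifier": "R5"},
    {"type": "Specialization", "hasSubclass": "R1", "hasSuperclass": "R2"},
    {"type": "Classification", "hasClassified": "R9", "hasClassifier": "R1"},
]


def build_oim(focus):
    """Group objects related to `focus` under the categories in OIM_SETTINGS."""
    view = {}
    for setting in OIM_SETTINGS:
        # Locate the role holding the focus object and the role holding its partner.
        this_role = next(k for k, v in setting.items() if v == THIS)
        other_role = next(k for k, v in setting.items() if v == OTHER)
        for rel in RELATIONSHIPS:
            if rel["type"] == setting["type"] and rel.get(this_role) == focus:
                view.setdefault(setting["category"], []).append(rel[other_role])
    return view


oim = build_oim("R1")
print(oim)
```

Keeping the settings as plain data means new categories can be added without touching the function that builds the view.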

OIM Enhanced with Query

Roadmap

• Open plug-in specification
• DSL Workbench IDE
• Client-driven feature sets:
  – Presets for data sources – servers, files, namespaces, metadata, etc.
  – Readers, Writers and Mappings
  – Template expansion
  – Data verifiers and reasoners
• Opening the source code for partners
• Python based – for foreseeable future

[Diagram: extended .15926 platform architecture. Labels: SearchLan; XML Files; Databases; Tables; CAD/CAM/PLM; Interface; XML, SQL, CAD/PLM API Reader; Table Reader; OIM Writer; XML, SQL, CAD/PLM API Writer; .15926 Builder; .15926 Scanner; Editor Interface; Library; Template; 15926-2 classes, relationships, individuals; 15926-7 template definitions, template instances; 15926-6 metadata; Template Expansion; 15926-7 template constructor; RDF/OWL Files; RDF/OWL Files & SPARQL; 15926-2,7 OWL Definitions; iRING, Part 8, PCA RDL/JORD, Part 4; SPARQL Endpoint]

Thank you!

Anatoly Levenchuk
http://ailev.ru (Rus)
http://levenchuk.com (Eng)
[email protected]

Victor Agroskin [email protected]

Freeware .15926 Editor available “as is” for evaluation and tests at http://techinvestlab.ru/dot15926Editor
Feedback and comments: [email protected]
http://community.livejournal.com/dot15926/

TechInvestLab.ru
+7 (495) 748-5388

Elephant icon by Martin Berube is used for .15926 software according to terms at http://www.iconarchive.com/show/animal-icons-by-martin-berube/elephant-icon.html