SHACL Overview

September 7, 2017

© Copyright 2017 TopQuadrant Inc.

Agenda

. Introducing SHACL – 15 min
  – What is it?
  – A brief history of SHACL
  – Why is it important?
. Demo: creating and using SHACL shapes – 10-15 min
. Deeper dives – 15 min
  – Additional features (but still in SHACL Core)
  – SHACL and OWL
  – SHACL and SPIN
. Summary and Q&A

Introducing SHACL

What is SHACL?

. SHApes Constraint Language / W3C Recommendation
. Language for defining the “shape” of RDF data (+++)
  – Simply put, a schema language for RDF
. Validating RDF graphs against a set of shapes
  – Shapes are defined in an RDF graph called the “Shapes Graph”
  – The graph to be validated is called the “Data Graph”

– What about (+++)?

Motivation for SHACL

. Until SHACL, no official W3C Recommendation for validating RDF
  – Since 2001?
. But RDF Schema / OWL are schema languages...
  – Or NOT?
  – #OWA, #CWA, #UNA => inferencing, not validation
. Semi-official specifications have been created (W3C member submissions)
  – SPIN
  – Resource Shapes
  – ShEx
. Other standalone / ad-hoc solutions have also evolved
  – Most on top of SPARQL & RDFS/OWL, e.g. Stardog DB

OWA: Open World Assumption; CWA: Closed World Assumption; UNA: Unique Name Assumption

What does SHACL do?

. Ensures conformance of RDF data to a defined schema (primary goal)

What else SHACL can be used for

. Extended goals include, but are not limited to:
  – interface building
  – data structure communication
  – code generation
  – data integration
  – rule-based inferencing
  – ...

A little bit of history … Key Inputs to SHACL

. SHACL Core Vocabulary and Syntax: inspired mostly by IBM Resource Shapes
  – (with many additions along the way to address requirements)
. Overall Architecture and Extension Mechanism: inspired mostly by SPIN
  – (some input from RDFUnit)
. Compact syntax: inspired by ShEx
. Rules: inspired mostly by SPIN

A picture is worth a million words …

Example

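# Prefixes assumed by all examples in this deck; the ex: namespace is illustrative.
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.org/ns#> .

ex:PersonShape a sh:NodeShape ;
    sh:targetClass ex:Person ;
    sh:pattern "^http://mydomain.com/person/" ;
    sh:property [                      # _:b1
        sh:path ex:firstName ;
        sh:minCount 1 ;
        sh:datatype xsd:string ;
    ] ;
    sh:property [                      # _:b2
        sh:path ex:lastName ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
        sh:datatype xsd:string ;
        sh:maxLength 20 ;
    ] .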
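To make the shape above concrete, here is a small data graph it could be validated against. This is only a sketch: the `ex:` namespace and all three person resources are made up for illustration and do not come from the presentation.

```turtle
@prefix ex: <http://example.org/ns#> .   # assumed namespace, not from the slides

# Conforms: IRI matches the pattern, one string first name, one short last name.
<http://mydomain.com/person/1> a ex:Person ;
    ex:firstName "Ada" ;
    ex:lastName  "Lovelace" .

# Violates sh:pattern (wrong IRI prefix) and sh:maxCount 1 on ex:lastName.
<http://other.example/person/2> a ex:Person ;
    ex:firstName "Bob" ;
    ex:lastName  "Smith", "Smythe" .

# Violates sh:minCount 1 on ex:firstName and sh:maxLength 20 on ex:lastName.
<http://mydomain.com/person/3> a ex:Person ;
    ex:lastName "A-surname-longer-than-twenty-characters" .
```

All three nodes become focus nodes because they are instances of ex:Person, the class named by sh:targetClass.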

Example

Explicit declaration is recommended, but SHACL processors will recognize shapes even without type statements.

ex:PersonShape a sh:NodeShape ;        # Node Shape
    sh:targetClass ex:Person ;
    sh:pattern "^http://mydomain.com/person/" ;
    sh:property [                      # _:b1, Property Shape
        sh:path ex:firstName ;
        sh:minCount 1 ;
        sh:datatype xsd:string ;
    ] ;
    sh:property ex:Person-lastName .

ex:Person-lastName                     # Property Shape
    sh:path ex:lastName ;
    sh:minCount 1 ;
    sh:maxCount 1 ;
    sh:datatype xsd:string ;
    sh:maxLength 20 .

. Node shapes are used to:
  – Specify constraints on the target nodes
  – Group property shapes

Demo: creating and using SHACL shapes – within TopBraid Enterprise Data Governance

SHACL – Deeper Dives

If you know OWL: FAMILIAR things you can do using SHACL

. Specify cardinalities for a property when used with a member of a class
  – Can also do qualified cardinalities (owl:someValuesFrom = min 1 QCR)
  – Closed world meaning
. Specify a range of values for a property when used with a member of a class
  – Similar to owl:allValuesFrom, but closed world
. Combine restrictions (shapes) using logical operators
  – “and” is assumed by default
  – or, not, and xone are available

SHACL and OWL COMPARED http://spinrdf.org/shacl-and-owl.html

If you know OWL: Some NEW things you can do using SHACL

. Use a large pre-built vocabulary for restricting property values
  – min/max value, regex, node-kind
. Restrict a property value based on the value of another property
. Not be limited to direct property values – can use paths just like in SPARQL
. Restrict a resource itself
  – Node-kind, URI, closed shape (with ignore list)
. De-activate shapes – useful for re-use and testing
. Define such restrictions (constraints) not just for members of a class, but also for a specific resource or some other grouping of resources
. Extend – declaratively define your own constraint types (components)
. Error messages, some UI generation support, etc.

Variation on the Example - 1

ex:PersonShape a sh:NodeShape ;
    sh:targetSubjectsOf ex:SSN ;
    sh:pattern "^http://mydomain.com/person/" ;
    sh:property [                      # _:b1
        sh:path ex:firstName ;
        sh:minCount 1 ;
        sh:datatype xsd:string ;
    ] ;
    sh:property [                      # _:b2
        sh:path ex:lastName ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
        sh:datatype xsd:string ;
        sh:maxLength 20 ;
    ] .

Variation on the Example - 2

ex:PersonShape a sh:NodeShape ;
    sh:targetSubjectsOf ex:SSN ;
    sh:pattern "^http://mydomain.com/person/" ;
    sh:property [                      # _:b1
        sh:path ex:firstName ;
        sh:minCount 1 ;
        sh:datatype xsd:string ;
    ] ;
    sh:property [                      # _:b2
        sh:path ex:lastName ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
        sh:datatype xsd:string ;
        sh:maxLength 20 ;
    ] ;
    sh:property [
        sh:path rdf:type ;
        sh:or ( [ sh:hasValue ex:Customer ] [ sh:hasValue ex:Person ] )
    ] .
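Several of the capabilities listed for OWL users (property-pair comparisons, SPARQL-like paths, qualified cardinalities, de-activation, targeting a specific resource, custom messages) can be sketched in one shape. The constraint vocabulary is SHACL Core, but every `ex:` name below, including the shape itself, is hypothetical and chosen only for illustration.

```turtle
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix ex: <http://example.org/ns#> .   # assumed namespace, not from the slides

ex:EmployeeShape a sh:NodeShape ;
    sh:targetNode ex:JohnDoe ;                 # a specific resource, not a class
    sh:property [
        sh:path ex:startDate ;
        sh:lessThan ex:endDate ;               # compare against another property
        sh:message "Start date must precede end date." ;
    ] ;
    sh:property [
        sh:path ( ex:address ex:postalCode ) ; # sequence path, as in SPARQL
        sh:minCount 1 ;
        sh:pattern "^[0-9]{5}$" ;
    ] ;
    sh:property [
        sh:path ex:age ;
        sh:nodeKind sh:Literal ;
        sh:minInclusive 0 ;
        sh:maxInclusive 150 ;
        sh:severity sh:Warning ;               # report as a warning, not a violation
    ] ;
    sh:property [
        sh:path ex:parent ;
        sh:qualifiedValueShape [ sh:class ex:Person ] ;
        sh:qualifiedMinCount 1 ;               # comparable to owl:someValuesFrom
    ] ;
    sh:property [
        sh:path ex:nickname ;
        sh:maxCount 1 ;
        sh:deactivated true ;                  # skipped during validation
    ] .
```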

If you know SPIN: FAMILIAR things you can do with SHACL

. Attach a constraint to a class
  – As with SPIN, it will be in effect for all (transitive) class members
. Create your own vocabulary of constraint types
. Define error messages to display when a constraint is violated
. Attach inference rules to a class

FROM SPIN TO SHACL http://spinrdf.org/spin-shacl.html

If you know SPIN: Some NEW things you can do with SHACL

. Use a large standard pre-built vocabulary of constraints
  – SPIN had a smaller number of pre-defined constraint components/templates
. De-activate shapes – useful for re-use and testing
. Define constraints not just for members of a class, but also for a specific resource or some other grouping of resources
. Make use of some additional properties for UI generation support, etc.
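Creating your own vocabulary of constraint types, which SPIN users know as templates, is done in SHACL by declaring a constraint component. The sketch below uses the SHACL-SPARQL extension rather than SHACL Core and is modeled on the language-tag example in the SHACL specification; the component name, the `ex:lang` parameter, and the shape that uses it are illustrative assumptions.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.org/ns#> .   # assumed namespace, not from the slides

# Declares a new constraint type with one parameter, ex:lang.
ex:LanguageConstraintComponent a sh:ConstraintComponent ;
    sh:parameter [
        sh:path ex:lang ;
        sh:datatype xsd:string ;
        sh:minLength 2 ;
    ] ;
    sh:validator [
        a sh:SPARQLAskValidator ;
        sh:message "Value must be a literal with language {$lang}" ;
        # $value is each value node; $lang is bound from the ex:lang parameter.
        sh:ask """ASK { FILTER (isLiteral($value) && langMatches(lang($value), $lang)) }"""
    ] .

# Uses the new constraint type like any built-in one.
ex:GermanLabelShape a sh:NodeShape ;
    sh:targetClass ex:Country ;
    sh:property [
        sh:path ex:germanLabel ;
        ex:lang "de" ;
    ] .
```

Once declared, the parameter property behaves like a built-in constraint property: any shape that states a value for ex:lang activates the component.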

Variation on the Example - 3

ex:Person a sh:NodeShape ;
    a rdfs:Class ;
    sh:pattern "^http://mydomain.com/person/" ;
    sh:closed true ;
    sh:ignoredProperties ( rdf:type ) ;
    sh:property [                      # _:b1
        sh:path ex:firstName ;
        sh:minCount 1 ;
        sh:datatype xsd:string ;
    ] ;
    sh:property [                      # _:b2
        sh:path ex:lastName ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
        sh:datatype xsd:string ;
        sh:maxLength 20 ;
    ] .
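As an illustration of what a closed shape rejects, the sketch below pairs a data node carrying one undeclared property with the kind of validation report a processor could produce for it. The person resource, the `ex:nickname` property, and the `ex:` namespace are hypothetical; the report terms are standard SHACL, though real processors may add messages or other details.

```turtle
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix ex: <http://example.org/ns#> .   # assumed namespace, not from the slides

# Data graph: ex:nickname is neither declared by a property shape nor ignored.
<http://mydomain.com/person/42> a ex:Person ;
    ex:firstName "Ada" ;
    ex:lastName  "Lovelace" ;
    ex:nickname  "Countess" .

# Validation report: one result for the triple the closed shape does not allow.
[] a sh:ValidationReport ;
    sh:conforms false ;
    sh:result [
        a sh:ValidationResult ;
        sh:resultSeverity sh:Violation ;
        sh:focusNode <http://mydomain.com/person/42> ;
        sh:resultPath ex:nickname ;
        sh:value "Countess" ;
        sh:sourceConstraintComponent sh:ClosedConstraintComponent ;
        sh:sourceShape ex:Person ;
    ] .
```

Because ex:Person is both a class and a node shape, its instances are targeted implicitly, with no sh:targetClass statement needed.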

Validation Report Vocabulary

. sh:conforms – true if no validation results were produced
. sh:result/sh:ValidationResult
. sh:focusNode – identifies a node that produced the results, i.e., a node that has problems
. sh:value – identifies what value is incorrect
. sh:resultPath – identifies how the incorrect value is connected to the focus node
. sh:sourceShape – what shape has been violated
. sh:sourceConstraintComponent – what constraint component has been violated
. sh:detail – further details
. sh:resultMessage – tools may use this to return helpful messages to the users
. sh:resultSeverity

Compact Syntax Example

Turtle:

ex:PersonShape a sh:NodeShape ;
    sh:targetClass ex:Person ;
    sh:closed true ;
    sh:ignoredProperties ( rdf:type ) ;
    sh:pattern "^http://mydomain.com/person/" ;
    sh:property [
        sh:path ex:firstName ;
        sh:minCount 1 ;
        sh:datatype xsd:string ;
    ] ;
    sh:property [
        sh:path ex:lastName ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
        sh:datatype xsd:string ;
        sh:maxLength 20 ;
    ] .

Compact syntax:

ex:PersonShape -> ex:Person {
    closed=true ignoredProperties=[rdf:type]
    pattern="^http://mydomain.com/person/" ;
    ex:firstName xsd:string [1..*] ;
    ex:lastName xsd:string [1..1] maxLength=20 ;
}

. Targets are identified first, using “->”
. Statements about the target nodes themselves follow directly after the “{”, are separated by spaces, and end with “;”
. Inline property shapes start with the name of the property (or a SPARQL property path)

Summary and Q&A

Resources for learning SHACL

. SHACL Community Group https://www.w3.org/community/shacl/

. SHACL W3C Wiki https://www.w3.org/2014/data-shapes/wiki/Main_Page
  – Links to implementations, WG deliverables, meeting minutes, …
  – Some historic info – not so useful anymore

. TQ’s SHACL page - http://www.topquadrant.com/technology/shacl/
  – Tutorials, articles, presentations, for example:
  – AN OVERVIEW OF SHACL FEATURES AND SPECIFICATIONS
  – USING SHACL DATA CONSTRAINTS IN THE TOPBRAID WEB PRODUCTS EVN AND EDG
  – HOW TO DEFINE CONSTRAINTS ON RDF:LISTS USING SHACL
  – HOW TO USE TOPBRAID AS A DATA VALIDATION SERVER

. TopQuadrant is conducting a full day tutorial at SEMANTiCS 2017: SHACL (Shapes Constraint Language) – Introduction and Implementation, Thursday, September 14, 2017

Why is SHACL important?

. SHACL is a significant addition to the semantic standards stack
  – It addresses a critical need for enterprises looking to ensure their organization's data quality
  – It can support enterprise applications
. From LinkedIn blogs:
  – Irene Polikoff, Now You Can Finally Validate your RDF …: “It delivers much needed capabilities for ensuring data quality in enterprise solutions.”
  – Jan Voskuill, CEO, Taxonic, Breaking News: At Last, SHACL …: “… SHACL is a significant step towards making more viable and useable in practical situations. … SHACL will be a game changer in data governance and Big Data.”
. Allotrope Foundation is using SHACL as a major component of its Framework for
  – The third component of the Allotrope Framework is a collection of data models using SHACL
  – These include ways of defining and constraining how the semantic content is packaged in the data description layer of the Allotrope Data Format (ADF)
  – bio-itworld.com/2017/08/25/allotrope-foundation-a-framework-for-knowledge-management.aspx

… Questions? We are presenting this webinar again
