<<

Using OWL/RDFS for Building Applications Xavier Lopez, Ph.D., Director, Oracle Server Technologies Zhe Wu, Ph.D., Consultant Member, Oracle Server Technologies Agenda

• Introduction to Semantic Web and Oracle 11g Semantic Technologies • What is semantic web? • Business use cases • Capabilities overview • Architecture/Query/Store//Java • Scalability and performance • Web ontology languages overview • RDFS/OWL • Using RDFS/OWL in your semantic applications • Summary Introduction to Semantic Web and Business Use Cases Semantic Data Management Characteristics

• Discovery of data relationships across… • Structured data (, apps, web services) • Unstructured data (email, office documents) Multi-data types (graphs, spatial, text, sensors) • Text Mining & Web Mining infrastructure • Terabytes of structured & unstructured data • Queries are not defined in advance • Schemas are continuously evolving • Associate more meaning (context) to enterprise data to enable its (re)use across applications • Allow sharing and reuse of enterprise and web data. • Built on open, industry W3C standards: • SQL, XML, RDF, OWL, SPARQL Case Study: National Intelligence

Ontology Engineering Modeling Process

Information RDF/OWL Extraction Processed OWL Categorization, Feature/ Document Ontologies term Extraction Web Resources Collection Domain Specific

News, Email, RSS

SQL/SPARQL Query

Content Mgmt. Systems

Explore

Browsing, Presentation, Reporting, Visualization, Query Analyst Data Integration Platform in Health Informatics Enterprise Information Consumers (EICs)

Access Patient Workforce Clinical Care Business Analytics Management Intelligence

Run-Time Model Virtual Deploy Relate Integration Server (Semantic Knowledge base) Model Physical

Access LIS CIS HTB HIS Data Integration Platform in Health Informatics Enterprise Information Consumers (EICs)

Access Patient Workforce Clinical Care Business Analytics Management Intelligence

Run-Time Metadata Model Virtual Deploy Relate Integration Server (Semantic Knowledge base) Model Physical

Access LIS CIS HTB HIS Semantic Data Management Workflow

Edit & Load, Query Applications & Transaction Systems Transform & Inference Analysis

Unstructured Content •Entity • RDF/OWL Data • Graph Extraction & Management Visualization Transform • SQL & SPARQL • Link Analysis Query RSS, email •Ontology • Statistical Engineering • Inferencing Analysis •Categorization • Semantic Rules • Faceted Search

Other Data •Custom • Scalability & • Pattern Discovery Formats Scripting Security • Text Mining

Data Partners Partners Sources Oracle 11g RDF/OWL Graph Data Management

Key Capabilities:

• Oracle 11g is the leading • Native RDF graph data store commercial database with native Load / • Manages billions of triples RDF/OWL data management Storage • Fast batch, bulk and incremental load • Scalable & secure platform for wide-range of semantic • SQL: SEM_Match applications Query • SPARQL: via Jena plug-in • Readily scales to ultra-large • Ontology assisted query of RDBMS data repositories (+1 billion) • Choice of SQL or SPARQL query • Leverages Oracle Partitioning • Forward chaining model Reasoning • RDFS++ OWL, OWL Prime and Advanced Compression. RAC • User defined rule base supported • Growing ecosystem of 3rd party tools partners Partners Integrated Tools and Solution Providers: Commitment to W3C Semantic Standards

• Our implementation entirely based on W3C standards (RDF, RDFS, OWL) • SPARQL support through Jena • Members of following W3C Web Semantic Activities: • W3C Data Access Working Group (DAWG) • W3C OWL Working group • W3C Semantic Web Education & Outreach (SWEO) • W3C Health Care & Life Sciences Interest Group (HCLS) • W3C Multimedia Incubator group • W3C Semantic Web Rules Language (SWRL)

11 Oracle 11g Semantic Technologies

12 The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle. Capabilities Overview INFER QUERY Query RDF/ Ontology- RDF/S User OWL data Assisted defined and Query of OWL rules ontologies Enterprise Data

Incr. DML

Batch- • RDF/OWL Load Relational data data

STORE • Ontologies & Bulk- rule bases Load

14 Store Semantic Data

• Native graph data store in Oracle database • Implemented using relational tables/views • Optimized for semantic data • Scales to very large datasets • No limits to amount of data that can be stored • Stored along with other relational data • Leverages decades of experience • Can be combined with other relational data • Business Data • XML • Location • Images, Video

15 Store Semantic Data: APIs

• Incremental DMLs (small number of changes) Recommended loading method • Insert for very small • Delete number of triples • GraphOracleSem.add, delete

• Batch loader • BatchImport • OracleBulkUpdateHandler.addInBatch(…)

• Bulk loader (large number of changes) Recommended loading method • sem_apis.bulk_load_from_staging_table(…) for very large • OracleBulkUpdateHandler.addInBulk(…) number of triples

16 Infer Semantic Data

• Native inferencing in the database for • RDF, RDFS, and a rich subset of OWL semantics (OWLSIF, OWLPRIME, RDFS++) • User-defined rules

• Forward chaining. • New relationships/triples are inferred and stored ahead of query time • Removes on-the-fly reasoning and results in fast query times

• Proof generation • Show one deduction path

17 OWL Subsets Supported

• RDFS++ • RDFS plus owl:sameAs and owl:InverseFunctionalProperty • OWLSIF (OWL with IF semantics) • Based on Dr. Horst’s pD* vocabulary¹ • OWLPrime • rdfs:subClassOf, subPropertyOf, domain, range • owl:TransitiveProperty, SymmetricProperty, FunctionalProperty, InverseFunctionalProperty, inverseOf • owl:sameAs, differentFrom OWL DL • owl:disjointWith, complementOf, OWL Lite • owl:hasValue, allValuesFrom, someValuesFrom OWLPrime • owl:equivalentClass, equivalentProperty • Jointly determined with domain experts, customers and partners

18

1 Completeness, decidability and complexity of entailment for RDF Schema and a semantic extension involving the OWL vocabulary Infer Semantic Data: APIs

• SEM_APIS.CREATE_ENTAILMENT( Typical Usage: • Index_name • First load RDF/OWL data • sem_models(‘GraphTBox’, ‘GraphABox’, …), • Call create_entailment to Recommended • sem_rulebases(‘OWLPrime’), generate inferred graph API • passes, • Query both original graph and for inference • Inf_components, inferred data • Options Inferred graph contains only new ) triples! Saves time & • Use “PROOF=T” to generate inference proof resources

• SEM_APIS.VALIDATE_ENTAILMENT( • sem_models((‘GraphTBox’, ‘GraphABox’, …), Typical Usage: • sem_rulebases(‘OWLPrime’), • First load RDF/OWL data • Criteria, • Call create_entailment to • Max_conflicts, generate inferred graph • Options • Call validate_entailment to find inconsistencies ) • Java API: GraphOracleSem.performInference()

19 Query Semantic Data

• Choice of SQL or SPARQL • SPARQL-like graph queries can be embedded in SQL • Key advantages • Graph queries can be integrated with enterprise relational data • Graph queries can be enhanced with relational operators. • E.g. replace, substr, concatenation, to_number, …

• Jena Adaptor for Oracle can be used, includes a full SPARQL API • Oracle plans to natively support SPARQL

20 Query Semantic Data: APIs

• Graph query using SEM_MATCH select g.s, t.frequency from table(sem_match ( -- query graph + relational

'(?s )', sem_models('nci1'), null, null, null)) g, terms t where substr(g.s, instr(g.s,'#',-1)+1)= t.subject; -- using a predicate to tie

select o from table(sem_match ( ‘ -- query original graph + inferred ( ?o)', Important for sem_models('nci'), sem_rulebases('owlprime'), null, null)); multi-model query performance

select o from table(sem_match ( ‘ -- query multi-graphs + inferred ( ?o)', sem_models('nci', 'gene'), sem_rulebases('owlprime'), null, null, null, 'ALLOW_DUP=T’));

• Graph query using Jena Adaptor

21 http://www.oracle.com/technology/obe/11gr1_db/datamgmt/nci_semantic_network/nci_Semantics_les01.htm OBE Query Semantic Data: Semantic Operators • Scalable, efficient SQL operators to perform ontology- assisted query against enterprise relational data

Query: “Find all entries in diagnosis column ID DIAGNOSIS that are related to ‘Upper_Extremity_Fracture’”

Patients 1 Hand_Fracture Syntactic query against relational table will not work! diagnosis 2 Rheumatoid_Arthritis table SELECT p_id, diagnosis FROM Patients  Zero Matches! WHERE diagnosis = ‘Upper_Extremity_Fracture; Traditional Syntactic query against relational data New against relational data (while consulting ontology)

SELECT p_id, diagnosis Upper_Extremity_Fracture FROM Patients rdfs:subClassOf WHERE SEM_RELATED ( diagnosis, Arm_Fracture ‘rdfs:subClassOf’, rdfs:subClassOf rdfs:subClassOf ‘Upper_Extremity_Fracture’, ‘Medical_ontology’) = 1; Forearm_Fracture Elbow_Fracture Hand_Fracture rdfs:subClassOf Finger_Fracture

22 Query Semantic Data: Semantic Operators • Scalable, efficient SQL operators to perform ontology- assisted query against enterprise relational data

Query: “Find all entries in diagnosis column ID DIAGNOSIS that are related to ‘Upper_Extremity_Fracture’”

Patients 1 Hand_Fracture Syntactic query against relational table will not work! diagnosis 2 Rheumatoid_Arthritis table SELECT p_id, diagnosis FROM Patients  Zero Matches! WHERE diagnosis = ‘Upper_Extremity_Fracture; Traditional Syntactic query against relational data New Semantic query against relational data (while consulting ontology)

SELECT p_id, diagnosis Upper_Extremity_Fracture FROM Patients rdfs:subClassOf WHERE SEM_RELATED ( diagnosis, Arm_Fracture ‘rdfs:subClassOf’, rdfs:subClassOf rdfs:subClassOf ‘Upper_Extremity_Fracture’, ‘Medical_ontology’‘Medical_ontology’) = = 1) 1; Forearm_Fracture Elbow_Fracture Hand_Fracture AND SEM_DISTANCE() <= 2; rdfs:subClassOf Finger_Fracture

22 Java APIs: Jena Adaptor • Implements Jena’s Graph/Model/BulkUpdateHandler/ … APIs • “Proxy” like design • Data not cached in memory for scalability • SPARQL query converted into SQL and executed inside DB • A SPARQL with just conjunctive patterns is converted into a single SEM_MATCH query • Usage example String queryString = " PREFIX : " + "SELECT ?person WHERE { ?person foaf:firstName \"Julie\" . } ";

QueryExecution qexec = QueryExecutionFactory.create( QueryFactory.create(queryString), modelOracleSem);

23 http://www.oracle.com/technology/tech/semantic_technologies/documentation/jenaadaptor2_readme.pdf Java APIs: Jena Adaptor (2)

• Allows various data loading • Bulk/Batch/Incremental load RDF or OWL (in N3, RDF/XML, N- TRIPLE etc.) with strict syntax verification and long literal support • Integrates Oracle 11gR1 RDF/OWL with Top Braid Composer • Makes integration with external reasoners easier • E.g. PelletInfGraph can be created on top of GraphOracleSem

• 2nd version released on OTN (Aug 2008)

24 11gR1 RDF/OWL Usage Flow

• Create an application table • create table app_table(triple sdo_rdf_triple_s); After inference is done, what will happen if • Create a semantic model - New assertions are added to the graph • exec sem_apis.create_sem_model(‘family’, • Inferred data becomes incomplete. Existing ’app_table’,’triple’); inferred data will be reused if create_entailment API invoked again. Faster • Load data than rebuild. • Use DML, Bulk loader, or Batch loader - Existing assertions are removed from the graph • insert into app_table (triple) values(1, sdo_rdf_triple_s(‘family', • Inferred data becomes invalid. Existing ‘’, inferred data will not be reused if the ‘’, create_entailment API is invoked again. ‘’)); • Collect statistics using exec sem_apis.analyze_model(‘family’); Important for • Run inference performance • exec sem_apis.create_entailment(‘family_idx’,sem_models(‘family’), sem_rulebases(‘owlprime’)); ! • Collect statistics using exec sem_apis.analyze_rules_index(‘family_idx’); • Query both original model and inferred data select p, o from table(sem_match('( ?p ?o)', sem_models(‘family'), sem_rulebases(‘owlprime’), null, null));

25 Scalability and Performance

26 Bulk Loader Performance on Desktop PC

Ontology Time Space (in GB) size bulk-load Sql*loader RDF RDF Total: App Staging API[1] time range Model: Values: Data Table: Table: Time low[2] Data Data Index Data[4] Data[5] high[3] Indexes Indexes LUBM50 8 min 1min 0.14 0.11 0.25 0.14 0.32 6.9 million 4.3min 0.48 0.12 0.60 LUBM1000 3hr 25min 19min 2.75 2.30 5.05 2.77 6.39 138 million 1h 26m 9.32 2.33 11.65 LUBM8000 30hr 43min 2h 35m 21.98 18.62 40.60 22.10 51.66 1,106 million 11h 32m 74.15 19.45 93.60 UniProt (old) 4hr 40min 30m 4.06 1.44 5.50 4.04 7.69 207 million 1h 55m 13.86 2.18 16.04

• Results collected on a single CPU PC (3GHz), 4GB RAM, 7200rpm SATA 3.0Gbps, 32 bit Linux. RDBMS 11.1.0.6 • Empty network is assumed

[1] Uses flags=>' VALUES_TABLE_INDEX_REBUILD ' [2] Less time for minimal syntax check. [3] More time is needed when RDF values used in N-Triple file are checked for correctness. [4] Application table has table compression enabled.[5] Staging table has table compression enabled.

27 Inference Performance on Desktop PC

Ontology RDFS OWLPrime OWLPrime + (size) Pellet on TBox #Triples Time #Triples Time #Triples Time (after duplicate inferred inferred inferred elimination) (millions) (millions) (millions) LUBM50 2.75 12min 14s 3.05 8m 01s 3.25 8min 21s 6.6 million LUBM1000 55.09 7h 19min 61.25 7hr 0min 65.25 7h 12m 133.6 million UniProt 3.4 24min 06s 50.8 3hr 1min NA NA 20 million

OWLPrime Inference (with Pellet on TBox) As a reference (not a comparison) 500 375 250 2.52k triples/s BigOWLIM loads, , and stores 125 (2GB RAM, P4 3.0GHz,) 0 - LUBM50 in 11 minutes (JAVA 6, -Xmx192 ) ¹ 6.49k triples/s 50 500 1000 - LUBM1000 in 11h 20min (JAVA 5, -Xmx1600 )¹ Inference Time (minutes) Number of universities Note: Our inference time does not include loading time! Also, set of rules is different.

• Results collected on a single CPU PC (3GHz), 2GB RAM (1.4G dedicate to DB), Multiple Disks over NFS

28 1 http://www.ontotext.com/owlim/OWLIMPres.pdf, Oct 2007 Inference Performance on Desktop PC 1 Billion Triples OWLPrime + Pellet on TBox 2GB Physical Memory 4GB Physical Memory

Ontology #Triples inferred Time #Triples inferred Time (size) (millions) (millions)

LUBM 8000 521.7 88.5hr 521.7 56.7hr 1.068 billion+ triples

OWLPrime Inference Time-Memory Tradeoff 100 80 60 40 20 0 Inference Time (hrs) 2GB 4GB Size of Physical Memory

• Results collected on a single CPU PC (3GHz), 2~4GB RAM, 2 Local 7200 RPM SATA Disks • Check inference best practice document on OTN for more details ¹

29 1 http://www.oracle.com/technology/tech/semantic_technologies/pdf/semantic_infer_bestprac_wp.pdf Inference Performance on Desktop PC 1 Billion Triples • Using better I/O by adding two commodity disks OWLPrime + Pellet on TBox 2 SATA 7200RPM DISKS, 4 SATA 7200RPM DISKS Ontology 4GB RAM (ASM), 4GB RAM

(size) #Triples inferred Time #Triples inferred Time (millions) (millions) LUBM 8000 521.7 56.7hr 521.7 40.5hr 1.068 billion+ triples

OWLPrime Inference Comparison 60 40 20 0

Inference Time (hrs) 2 4 Number of Physical Disks

• Using the same setup, UniProt (956.9 million triples) OWLPrime inference takes 2hr 29 minutes to generate 23.3 million new triples

30 Inference Performance on Server • Using server-class machine with much better I/O (flash- based storage), CPU and memory OWLPrime + Pellet on TBox Single CPU, 2GB RAM 16 Cores, NAND based flash Ontology 2 SATA 7200RPM DISKS, storage, 32GB RAM, (size) 32 bit Linux 64 bit Linux

#Triples inferred Time #Triples inferred Time (millions) (millions)

LUBM1000 65.25 7.2hr 65.25 1.22 hr 133.6 million triples

OWLPrime Inference Comparison 8

6

4

2

Inference Time (hrs) 0 Commodity Server Box Linux (with NAND Box hard disc)

31 Inference Performance on Server Going Parallel • Server class machine: 16 Cores, NAND based flash storage, 32GB RAM, 64 bit Linux • LUBM1000 benchmark. One execution of owl:someValuesFrom rule generates 12.99 million triples.

OWLPrime Inference for 1 execution of owl:someValuesFrom rule

300 274.13 200 100 144.35 0 DOP=1 DOP=4 Degree of Parallelism Inference Time (seconds)

• Parallel inference is planned post 11gR1

32 Query Performance on Desktop PC

Ontology LUBM50 LUBM Benchmark Queries

6.8 million & Q1 Q2 Q3 Q4 Q5 Q6 Q7 3+ million inferred

# answers 4 130 6 34 719 393730 59

Complete? OWLPrime Y Y Y Y Y N N Time 0.09 0.8 0.28 6.1 0.4 36.82 1.05 (sec)

OWLPrime # answers 4 130 6 34 719 519842 67 + Pellet on Complete? Y Y Y Y Y Y Y TBox Time 0.09 0.79 0.28 9.5 0.4 43.65 1.13 (sec)

• Results collected on a single CPU PC (3GHz), 2GB RAM (1.4G dedicate to DB), Multiple Disks over NFS

33 Query Performance on Desktop PC (2)

Ontology LUBM Benchmark Queries LUBM50

Q8 Q9 Q10 Q11 Q12 Q13 Q14 6.8 million & 3+ million inferred # answers 5916 6538 0 224 0 228 393730

Complete? N N N Y N Y Y OWLPrime Time 2.76 2.08 0.18 0.02 0.01 0.39 31.5 (sec) OWLPrime # answers 7790 13639 4 224 15 228 393730 + Pellet on Complete? Y Y Y Y Y Y Y TBox Time 3.9 3.47 0.37 0.02 0.03 0.39 39.4 (sec)

• Results collected on a single CPU PC (3GHz), 2GB RAM (1.4G dedicate to DB), Multiple Disks over NFS

34 Query Performance on Server Going Parallel • Server class machine: 16 Cores, NAND based flash storage, 32GB RAM, 64 bit Linux • LUBM query benchmark (LUBM1000).

Query Performance Comparison 100 80 60 DOP=1 40 DOP=4 20 0 Time (seconds) 1 2 3 4 5 6 7 8 9 11 10 12 13 14 LUBM Benchmark Query Overview

36 Basic Elements of RDF

• Instances E.g. :John, :MovieXYZ, :PurchaseOrder432

• Classes • Class represents a group/category/categorization of instances E.g. :John rdf:type :Student

• Properties • Linking data together E.g. :John :brother :Mary, :John :hasAge “33”^^xsd:integer.

37 RDF Schema (RDFS) • Core language constructs • rdfs:subClassOf • :A rdfs:subClassOf :B  instance of A is also instance of B • rdfs:subPropertyOf (property transfer) • :p1 rdfs:subPropertyOf :p2, :a :p1 :b  :a :p2 :b • :firstAuthor rdfs:subPropertyOf :Author • skos:prefLabel rdfs:subPropertyOf rdfs:label • rdfs:domain and rdfs:range (specify how a property can be used) • :p1 rdfs:domain :D, :a :p1 :b  :a rdf:type :D • :p2 rdfs:range :R, :a :p2 :b  :b rdf:type :R • E.g. :performSurgeryOn rdfs:domain :Surgeon :performSurgeryOn rdfs:range :Patient • rdfs:label, seeAlso, isDefinedBy, … • :Jack rdfs:seeAlso http://…/Jack_Blog

38 (OWL) • More expressive compared to RDFS • Property related constructs • owl:inverseOf • E.g. :write owl:inverseOf :authoredBy • owl:SymmetricProperty • :relatedTo rdf:type owl:SymmetricProperty • foaf:knows is not defined as a symmetric property! • owl:TransitiveProperty • :partOf rdf:type owl:TransitiveProperty. • skos:broader rdf:type owl:TransitiveProperty • owl:equivalentProperty • owl:FunctionalProperty • :hasBirthMother rdf:type owl:FunctionalProperty • owl:InverseFunctionalProperty • foaf:mbox rdf:type owl:InverseFunctionalProperty • Instances (owl:sameAs, owl:differentFrom)

39 OWL • Class related constructs • owl:equivalentClass • owl:disjointWith • :Boys owl:disjointWith :Girls • owl:complementOf • :Boys owl:complementOf :Non_Boys • owl:unionOf, owl:intersectionOf, owl:oneOf • owl:Restriction is used to define a class whose members have certain restrictions w.r.t a property • owl:someValuesFrom • owl:allValuesFrom • owl:hasValue

40 OWL • Class related constructs • owl:equivalentClass • owl:disjointWith • :Boys owl:disjointWith :Girls • owl:complementOf • :Boys owl:complementOf :Non_Boys owl:someValuesFrom • owl:unionOf, owl:intersectionOf, owl:oneOf • owl:Restriction is used to define a class whose members •:ApprovedPurchaseOrder owl:equivalentClass have certain restrictions w.r.t a [property a owl:Restriction ; • owl:someValuesFrom owl:onProperty :approvedBy ; • owl:allValuesFrom owl:someValuesFrom :Manager ] • owl:hasValue :PO1 :approvedBy :managerXyz :managerXyz rdf:type :Manager  :PO1 rdf:type :ApprovedPurchaseOrder

41 OWL • Class related constructs • owl:equivalentClass • owl:disjointWith • :Boys owl:disjointWith :Girls • owl:complementOf owl:allValuesFrom • :Boys owl:complementOf :Non_Boys • owl:unionOf, owl:intersectionOf, owl:oneOf •:Vegetarian rdfs:subClassOf • owl:Restriction is used to [ adefine owl:Restriction a class ; whose members have certain restrictions w.r.t owl:onProperty a property :eats ; • owl:someValuesFrom owl:allValuesFrom :VegetarianFood ] • owl:allValuesFrom • owl:hasValue :Jen rdf:type :Vegetarian . :Jen :eats :Marzipan .  :Marzipan rdf:type :VegetarianFood .

We SHOULD not use :eats rdfs:range :VegetarianFood

42 OWL • Class related constructs • owl:equivalentClass • owl:disjointWith • :Boys owl:disjointWith :Girls • owl:complementOf • :Boys owl:complementOf :Non_Boys owl:hasValue • owl:unionOf, owl:intersectionOf, owl:oneOf • owl:Restriction is used to define a class whose members •:HighPriorityItem owl:equivalentClass have certain restrictions w.r.t a property [ a owl:Restriction ; • owl:someValuesFrom owl:onProperty :hasPriority ; • owl:allValuesFrom owl:hasValue :High ] • owl:hasValue :Item1 rdf:type :HighPriorityItem .  :Item1 :hasPriority :High

43 OWL • Class related constructs • restrictions constrain the number of distinct individuals that can associate with a class instance via a particular property • owl:minCardinality • owl:maxCardinality • owl:cardinality E.g. To express that a basketball game has at least 2 players :BasketBallGame rdfs:subClassOf [ a owl:Restriction; owl:onProperty :hasPlayer; owl:minCardinality 2 ] • Others • DatatypeProperty, AnnotationProperty, OntologyProperty,…

44 Using RDFS/OWL in your semantic applications

45 Using RDFS

• Approximate set intersection with C ⊆ A ∩ B C rdfs:subClassOf A C rdfs:subClassOf B • Approximate property intersection with p ⊆ r ∩ s p rdfs:subPropertyOf r p rdfs:subPropertyOf s • Approximate set union with A ∪ B ⊆ C A rdfs:subClassOf C B rdfs:subClassOf C • Approximate property union with r ∪ s ⊆ p r rdfs:subPropertyOf p s rdfs:subPropertyOf p • Term reconciliation • Property transfer using :author rdfs:subPropertyOf dc:creator. • Equivalent classes using: :Analyst rdfs:subClassOf :Researcher . :Researcher rdfs:subClassOf :Analyst .

46 Book: Semantic Web for the working ontologist Effective Modeling in RDFS and OWL Using RDFS

• Approximate set intersection with C ⊆ A ∩ B C rdfs:subClassOf A C rdfs:subClassOf B • Approximate property intersection with p ⊆ r ∩ s p rdfs:subPropertyOf r p rdfs:subPropertyOf s Term Reconciliation • Approximate set •unionDifferent with property A ∪ B ⊆ Cnames used A rdfs:subClassOf C •One library uses term library1:borrows B rdfs:subClassOf C • Approximate property•Another union librarywith ruses ∪ s term ⊆ p library2:checkOut r rdfs:subPropertyOf p Solution: s rdfs:subPropertyOf p •library1:borrows rdfs:subPropertyOf library2:checkOut • Term reconciliation •library2:checkOut rdfs:subPropertyOf library1:borrows • Property transfer using :author rdfs:subPropertyOf dc:creator. • Equivalent classes using: :Analyst rdfs:subClassOf :Researcher . :Researcher rdfs:subClassOf :Analyst .

47 Book: Semantic Web for the working ontologist Effective Modeling in RDFS and OWL Using RDFS

• Approximate set intersection with C ⊆ A ∩ B C rdfs:subClassOf A C rdfs:subClassOf B • Approximate property intersection with p ⊆ r ∩ s p rdfs:subPropertyOf r p rdfs:subPropertyOf s Term Reconciliation • Approximate set union with•Different A ∪ B ⊆ class C names used A rdfs:subClassOf C B rdfs:subClassOf C •One application uses term :Researcher • Approximate property union with•Another r ∪ applications ⊆ p uses term :Analyst r rdfs:subPropertyOf p Solution: s rdfs:subPropertyOf p •:Researcher rdfs:subClassOf :Analyst • Term reconciliation •:Analyst rdfs:subClassOf :Researcher • Property transfer using :author rdfs:subPropertyOf dc:creator. • Equivalent classes using: :Analyst rdfs:subClassOf :Researcher . :Researcher rdfs:subClassOf :Analyst .

48 Book: Semantic Web for the working ontologist Effective Modeling in RDFS and OWL Using OWL

• Merge Information using simple constructs • Using functional property :hasBirthMother rdf:type owl:FunctionalProperty :John :hasBirthMother :Mary :Jonh :hasBirthMother :Jill  :Mary owl:sameAs :Jill  all information about :Mary and :Jill will be merged • Using inverse functional property :directManagerOf rdf:type owl:InverseFunctionalProperty :Jack :directManagerOf :Eddie :CoolJack :directManagerOf :Eddie  :Jack owl:sameAs :CoolJack  all information about :Jack and :CoolJack will be merged • Using owl:sameAs directly owl:sameAs .

49 Using OWL

• Use Classes as property values :BookXyz dc:subject :TigerSubject . :TigerSubject rdf:type :Subject . :TigerSubject rdfs:seeAlso :Tiger . :Tiger rdfs:subClassOf :Animal . • Representing is_part_of relationship is_part_of a owl:TransitiveProperty, owl:ObjectProperty ; rdfs:domain Item; rdfs:range Item; owl:inverseOf has_part ;

• minCardinality 0 can be used to denote “may” or “optional” relationship. • No specific semantics though.

50 1 http://www.cs.man.ac.uk/~rector/swbp/simple-part-whole/simple-part-whole-relations-v0-2. 2 http://www.w3.org/TR/swbp-classes-as-values/ Common Pitfalls

• Domain/range have intersection semantics :eats :range :VegetarianFood . :eats :range :Meat :eats :range (:VegetarianFood ∩ :Meat)

• owl:complementOf is strong Is :Male owl:complementOf :Female?

• Transitive properties should not be functional or inverse functional. It is legal but does not make sense :relatedTo rdf:type owl:TransitiveProperty . :relatedTo rdf:type owl:FunctionalProperty . :John :relatedTo :Jack . :Jack :relatedTo :Mary :Jack owl:sameAs :Mary

51 Common Pitfalls

• owl:allValuesFrom does not imply existence :Carnivore owl:equivalentClass [ a owl:Restriction; owl:onProperty :eats; owl:allValuesFrom :Meat] :John rdf:type Carnivore does not imply :John :eats [ a :Meat]

• owl:allValuesFrom and Open World Assumption (OWA) :John :eats :Steak :Steak rdf:type :Meat does not imply :John rdf:type :Carnivore

• owl:min/maxCardinality and OWA and absence of Unique Name Assumption (UNA)

52 Summary

• 11g provides a solid, efficient, and scalable semantic data management • Store • Query (graph query and ontology-assisted query) • Inference • Java APIs make application building easier • RDFS/OWL are powerful knowledge modeling languages. • Be careful with consequence of OWA and absence of UNA • Future direction • Even better performance • Richer semantics • More application/tools integrations

53 For More Information

http://search.oracle.com

semantic technologies

Please visit our demo booth

54 Semantics at OOW 2008 - Sessions

Date/Time Title Location Sunday, Sept 21 Get the Most from Oracle Database 11g Semantic Marriott 1:15-2:15 p.m. Technology: Best Practices Salon 12/13

Monday, Sept 22 Using RDFS/OWL for Building Semantic Web Marriott 1:00-2:00 p.m. Applications Salon 12/13 Oracle Database 11g Production Case Study: Teranode Marriott 4:00-5:00 p.m. and Pfizer Semantic/Relational Approach to Science Salon 12/13 Collaboration

Tuesday, Sept 23 Oracle Database 11g Production Case Study: Eli Lilly Marriott 5:30-6:30 p.m. and Company—Semantic Technologies from a DBA’s Salon 01 Point of View

•DEMOgrounds - Database – Moscone West •Oracle Semantic Database Technologies - Pod L14 Appendix

56 Infer Semantic Data: OWL subsets

57 Support Semantics beyond OWLPrime (1) • Option1: add user-defined rules • Both 10gR2 RDF and 11g RDF/OWL supports user-defined rules in this form (filter is supported) Antecedents Consequents ?x :parentOf ?y .  ?z :uncleOf ?y ?z :brotherOf ?x . • E.g. to support core semantics of owl:intersectionOf

female astronaut 1.  :FemaleAstronaut rdfs:subClassOf :Female 2.  :FemaleAstronaut rdfs:subClassOf  :Astronaut 3. ?x rdf:type :Female . ?x rdf:type :Astronaut .  x rdf:type :FemaleAstronaut

58 Support Semantics beyond OWLPrime (2)

• Option2: Separation of schema (TBox) and instance (ABox) reasoning through Jena Adaptor • TBox tends to be small in size • Generate a class subsumption tree using complete DL reasoners (like Pellet, KAON2, Fact++, Racer, etc) • ABox can be arbitrarily large • Use Oracle OWL to infer new knowledge based on the class subsumption tree from TBox

TBox & DL Complete class TBox reasoner tree OWL Prime

ABox

59 What is Semantic Web?

• Basic Technologies • URI • Uniform resource identifier • RDF • Resource description framework • RDFS • RDF Schema

Standards • OWL based • Web ontology language

60