Using OWL/RDFS for Building Semantic Web Applications
Total Page:16
File Type:pdf, Size:1020Kb
Using OWL/RDFS for Building Semantic Web Applications Xavier Lopez, Ph.D., Director, Oracle Server Technologies Zhe Wu, Ph.D., Consultant Member, Oracle Server Technologies Agenda • Introduction to Semantic Web and Oracle 11g Semantic Technologies • What is semantic web? • Business use cases • Capabilities overview • Architecture/Query/Store/Inference/Java APIs • Scalability and performance • Web ontology languages overview • RDFS/OWL • Using RDFS/OWL in your semantic applications • Summary Introduction to Semantic Web and Business Use Cases Semantic Data Management Characteristics • Discovery of data relationships across… • Structured data (database, apps, web services) • Unstructured data (email, office documents) Multi-data types (graphs, spatial, text, sensors) • Text Mining & Web Mining infrastructure • Terabytes of structured & unstructured data • Queries are not defined in advance • Schemas are continuously evolving • Associate more meaning (context) to enterprise data to enable its (re)use across applications • Allow sharing and reuse of enterprise and web data. • Built on open, industry W3C standards: • SQL, XML, RDF, OWL, SPARQL Case Study: National Intelligence Ontology Engineering Modeling Process Information RDF/OWL Extraction Processed OWL Categorization, Feature/ Document Ontologies term Extraction Web Resources Collection Domain Specific Knowledge Base News, Email, RSS SQL/SPARQL Query Content Mgmt. Systems Explore Browsing, Presentation, Reporting, Visualization, Query Analyst Data Integration Platform in Health Informatics Enterprise Information Consumers (EICs) Access Patient Workforce Clinical Care Business Analytics Management Intelligence Run-Time Metadata Model Virtual Deploy Relate Integration Server (Semantic Knowledge base) Model Physical Access LIS CIS HTB HIS Data Integration Platform in Health Informatics Enterprise Information Consumers (EICs) Access Patient Workforce Clinical Care Business Analytics Management Intelligence Run-Time Metadata Model Virtual Deploy Relate Integration Server (Semantic Knowledge base) Model Physical Access LIS CIS HTB HIS Semantic Data Management Workflow Edit & Load, Query Applications & Transaction Systems Transform & Inference Analysis Unstructured Content •Entity • RDF/OWL Data • Graph Extraction & Management Visualization Transform • SQL & SPARQL • Link Analysis Query RSS, email •Ontology • Statistical Engineering • Inferencing Analysis •Categorization • Semantic Rules • Faceted Search Other Data •Custom • Scalability & • Pattern Discovery Formats Scripting Security • Text Mining Data Partners Partners Sources Oracle 11g RDF/OWL Graph Data Management Key Capabilities: • Oracle 11g is the leading • Native RDF graph data store commercial database with native Load / • Manages billions of triples RDF/OWL data management Storage • Fast batch, bulk and incremental load • Scalable & secure platform for wide-range of semantic • SQL: SEM_Match applications Query • SPARQL: via Jena plug-in • Readily scales to ultra-large • Ontology assisted query of RDBMS data repositories (+1 billion) • Choice of SQL or SPARQL query • Leverages Oracle Partitioning • Forward chaining model Reasoning • RDFS++ OWL, OWL Prime and Advanced Compression. RAC • User defined rule base supported • Growing ecosystem of 3rd party tools partners Semantic Technology Partners Integrated Tools and Solution Providers: Commitment to W3C Semantic Standards • Our implementation entirely based on W3C standards (RDF, RDFS, OWL) • SPARQL support through Jena • Members of following W3C Web Semantic Activities: • W3C Data Access Working Group (DAWG) • W3C OWL Working group • W3C Semantic Web Education & Outreach (SWEO) • W3C Health Care & Life Sciences Interest Group (HCLS) • W3C Multimedia Semantics Incubator group • W3C Semantic Web Rules Language (SWRL) 11 Oracle 11g Semantic Technologies 12 The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle. Capabilities Overview INFER QUERY Query RDF/ Ontology- RDF/S User OWL data Assisted defined and Query of OWL rules ontologies Enterprise Data Incr. DML Batch- • RDF/OWL Load Relational data data STORE • Ontologies & Bulk- rule bases Load 14 Store Semantic Data • Native graph data store in Oracle database • Implemented using relational tables/views • Optimized for semantic data • Scales to very large datasets • No limits to amount of data that can be stored • Stored along with other relational data • Leverages decades of experience • Can be combined with other relational data • Business Data • XML • Location • Images, Video 15 Store Semantic Data: APIs • Incremental DMLs (small number of changes) Recommended loading method • Insert for very small • Delete number of triples • GraphOracleSem.add, delete • Batch loader • BatchImport • OracleBulkUpdateHandler.addInBatch(…) • Bulk loader (large number of changes) Recommended loading method • sem_apis.bulk_load_from_staging_table(…) for very large • OracleBulkUpdateHandler.addInBulk(…) number of triples 16 Infer Semantic Data • Native inferencing in the database for • RDF, RDFS, and a rich subset of OWL semantics (OWLSIF, OWLPRIME, RDFS++) • User-defined rules • Forward chaining. • New relationships/triples are inferred and stored ahead of query time • Removes on-the-fly reasoning and results in fast query times • Proof generation • Show one deduction path 17 OWL Subsets Supported • RDFS++ • RDFS plus owl:sameAs and owl:InverseFunctionalProperty • OWLSIF (OWL with IF semantics) • Based on Dr. Horst’s pD* vocabulary¹ • OWLPrime • rdfs:subClassOf, subPropertyOf, domain, range • owl:TransitiveProperty, SymmetricProperty, FunctionalProperty, InverseFunctionalProperty, inverseOf • owl:sameAs, differentFrom OWL DL • owl:disjointWith, complementOf, OWL Lite • owl:hasValue, allValuesFrom, someValuesFrom OWLPrime • owl:equivalentClass, equivalentProperty • Jointly determined with domain experts, customers and partners 18 1 Completeness, decidability and complexity of entailment for RDF Schema and a semantic extension involving the OWL vocabulary Infer Semantic Data: APIs • SEM_APIS.CREATE_ENTAILMENT( Typical Usage: • Index_name • First load RDF/OWL data • sem_models(‘GraphTBox’, ‘GraphABox’, …), • Call create_entailment to Recommended • sem_rulebases(‘OWLPrime’), generate inferred graph API • passes, • Query both original graph and for inference • Inf_components, inferred data • Options Inferred graph contains only new ) triples! Saves time & • Use “PROOF=T” to generate inference proof resources • SEM_APIS.VALIDATE_ENTAILMENT( • sem_models((‘GraphTBox’, ‘GraphABox’, …), Typical Usage: • sem_rulebases(‘OWLPrime’), • First load RDF/OWL data • Criteria, • Call create_entailment to • Max_conflicts, generate inferred graph • Options • Call validate_entailment to find inconsistencies ) • Java API: GraphOracleSem.performInference() 19 Query Semantic Data • Choice of SQL or SPARQL • SPARQL-like graph queries can be embedded in SQL • Key advantages • Graph queries can be integrated with enterprise relational data • Graph queries can be enhanced with relational operators. • E.g. replace, substr, concatenation, to_number, … • Jena Adaptor for Oracle can be used, includes a full SPARQL API • Oracle plans to natively support SPARQL 20 Query Semantic Data: APIs • Graph query using SEM_MATCH select g.s, t.frequency from table(sem_match ( -- query graph + relational '(?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class>)', sem_models('nci1'), null, null, null)) g, terms t where substr(g.s, instr(g.s,'#',-1)+1)= t.subject; -- using a predicate to tie select o from table(sem_match ( ‘ -- query original graph + inferred (<http://www.mindswap.org/2003/nciOncology.owl#Finger_Fracture> <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?o)', Important for sem_models('nci'), sem_rulebases('owlprime'), null, null)); multi-model query performance select o from table(sem_match ( ‘ -- query multi-graphs + inferred (<http://www.mindswap.org/2003/nciOncology.owl#Finger_Fracture> <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?o)', sem_models('nci', 'gene'), sem_rulebases('owlprime'), null, null, null, 'ALLOW_DUP=T’)); • Graph query using Jena Adaptor 21 http://www.oracle.com/technology/obe/11gr1_db/datamgmt/nci_semantic_network/nci_Semantics_les01.htm OBE Query Semantic Data: Semantic Operators • Scalable, efficient SQL operators to perform ontology- assisted query against enterprise relational data Query: “Find all entries in diagnosis column ID DIAGNOSIS that are related to ‘Upper_Extremity_Fracture’” Patients 1 Hand_Fracture Syntactic query against relational table will not work! diagnosis 2 Rheumatoid_Arthritis table SELECT p_id, diagnosis FROM Patients Zero Matches! WHERE diagnosis = ‘Upper_Extremity_Fracture; Traditional Syntactic query against relational data New Semantic query against relational data (while consulting ontology) SELECT p_id, diagnosis Upper_Extremity_Fracture FROM Patients rdfs:subClassOf WHERE SEM_RELATED ( diagnosis, Arm_Fracture ‘rdfs:subClassOf’, rdfs:subClassOf rdfs:subClassOf ‘Upper_Extremity_Fracture’, ‘Medical_ontology’) = 1; Forearm_Fracture Elbow_Fracture Hand_Fracture rdfs:subClassOf Finger_Fracture 22 Query