<<

TECHNOLOGIES

THE

ΑΝΑΓΝΩΣΤΟΠΟΥΛΟΣ ΙΩΑΝΝΗΣ

INTERNET TECHNOLOGIES History of the Semantic Web

• Web was “invented” by Tim Berners-Lee (amongst others), a physicist working at CERN • TBL’s original vision of the Web was much more ambitious than the reality of the existing (syntactic) Web:

“... a goal of the Web was that, if the interaction between person and could be so intuitive that the machine-readable information space gave an accurate representation of the state of people's thoughts, interactions, and work patterns, then machine analysis could become a very powerful management tool, seeing patterns in our work and facilitating our working together through the typical problems which beset the management of large organizations.”

• TBL (and others) have since been working towards realising this vision, which has become known as the Semantic Web

1 INTERNET TECHNOLOGIES Where we are Today: the Syntactic Web

INTERNET TECHNOLOGIES Difficulties in the Syntactic Web…

• Complex queries involving background knowledge – Find information about “animals that use sonar but are not either bats or dolphins” • Locating information in data repositories – Travel enquiries – Prices of goods and services – Results of human genome experiments • Assigning complex tasks to web “agents” – Book me a holiday next weekend somewhere warm, not too far away, and where they speak French or English

2 INTERNET TECHNOLOGIES What is the Problem?

• Consider a typical web page:

• Markup consists of: – rendering information (e.g., font size and colour) – Hyper-links to related content • Semantic content is accessible to humans but not (easily) to computers…

INTERNET TECHNOLOGIES Need to Add “

• External agreement on meaning of annotations – E.g., http://www.lub.lu.se/cgi-bin/nmdc.pl • Agree on the meaning of a set of annotation tags – Problems with this approach • Inflexible • Limited number of things can be expressed • Use Ontologies to specify meaning of annotations – Ontologies provide a vocabulary of terms – New terms can be formed by combining existing ones

3 INTERNET TECHNOLOGIES Structure of an Ontology

Ontologies typically have two distinct components:

• Names for important concepts in a knowledge domain – Elephant is a concept whose members are a kind of animal – Herbivore is a concept whose members are exactly those animals who eat only plants or parts of plants – Adult_Elephant is a concept whose members are exactly those elephants whose age is greater than 20 years

• Background knowledge/constraints on the domain – Adult_Elephants weigh at least 2,000 kg – All Elephants are either African_Elephants or Indian_Elephants – No individual can be both a Herbivore and a Carnivore

INTERNET TECHNOLOGIES Thesaurus vs. Ontology

Controlled Vocabulary Terms: Metal working machinery, equipment and supplies, metal-cutting machinery, metal-turning Ontology equipment, metal-milling equipment, milling insert, turning insert, etc. Concepts Relations: use, used-for, broader-term, narrower- Logical-Conceptual term, related-term Semantics (Strong)

Thesaurus Real (& Possible) Terms World Referents Term ‘Semantic’ Relations: Semantics Logical Concepts Entities: Metal working machinery, equipment and z Equivalent = (Weak) supplies, metal-cutting machinery, metal-turning Semantic Relations: equipment, metal-milling equipment, milling insert, z Used For (Synonym) turning insert, etc. z Subclass Of UF Relations: subclass-of; instance-of; part-of; has- geometry; performs, used-on;etc. Properties: geometry; material; length; operation; z Part Of z Broader Term BT UN/SPSC-code; ISO-code; etc. Values: 1; 2; 3; “2.5 inches”; “85-degree-diamond”; z Arbitrary Relations z Narrower Term NT “231716”; “boring”; “drilling”; etc. Axioms/Rules: If milling-insert(X) & operation(Y) & z Meta-Properties on material(Z)=HG_Steel & performs(X, Y, Z), then z Related Term RT has-geometry(X, 85-degree-diamond). Relations

4 INTERNET TECHNOLOGIES Ontology Spectrum: One View

strong semantics Modal Logic First Order Logic Logical Theory Is Disjoint Subclass of with transitivity DAML+OIL, OWL property UML Conceptual Model Is Subclass of RDF/S Semantic Interoperability XTM Extended ER

Thesaurus Has Narrower Meaning Than ER DB Schemas, XML Schema Structural Interoperability

Taxonomy Is Sub-Classification of Relational Model, XML Syntactic Interoperability weak semantics

INTERNET TECHNOLOGIES Ontology Representation Levels

Level Example Constructs Knowledge Class, Relation, Instance, Representation (KR) Function, Attribute, Property, Constraint, Axiom, Language (Ontology Rule Language) Level: Meta Level to the Ontology Concept Level Ontology Concept Person, Location, Event, (OC) Level: Parent, Hammer, River, Object Level to the KR FinancialTransaction, Language Level, BuyingAHouse, Automobile, Meta Level to the TravelPlanning, etc. Instance Level Ontology Instance Harry X. Landsford III, Ralph (OI) Level: Waldo Emerson, Person560234, Object Level to the PurchaseOrderTransactionEve Ontology Concept nt6117090, 1995-96 V-6 Ford Level Taurus 244/4.0 Aerostar Automatic with Block Casting # 95TM-AB and Head Casting 95TM

5 INTERNET TECHNOLOGIES A Semantic Web — First Steps

Make web resources more accessible to automated processes

• Extend existing markup with semantic markup – annotations that describe content/function of web accessible resources

• A prerequisite is a standard – Need to agree common syntax before we can share semantics – Syntactic web based on standards such as HTTP and HTML

INTERNET TECHNOLOGIES

What Problems Do Ontologies & the Semantic Web Help Solve?

• Heterogeneous problem – Different organizational units, Service Needers/Providers have radically different – Different syntactically: what’s the format? – Different structurally: how are they structured? – Different semantically: what do they mean? – They all speak different languages (access, description, schemas, meaning)

• Relevant document retrieval/question-answering problem – What is the meaning of your query? – What is the meaning of documents that would satisfy your query? – Can you obtain only meaningful, relevant documents?

6 INTERNET TECHNOLOGIES

RDF and RDFS

• RDF stands for Resource Description Framework

• It is a W3C candidate recommendation (http://www.w3.org/RDF) • RDF is graphical formalism ( + XML syntax + semantics) – for representing metadata – for describing the semantics of information in a machine- accessible way • RDFS extends RDF with “schema vocabulary”, e.g.: – Class, Property – type, subClassOf, subPropertyOf – range, domain

INTERNET TECHNOLOGIES Building Blocks in RDF

Semantic Web Metadata URI

Data about data – labeling and structuring Universal Resource Identifier – information in a document an universal and unique name for any resource http://www.something.com/one

7 INTERNET TECHNOLOGIES The RDF statements • Statements are triples: • Can be represented as a graph:

hasColleague Ian Uli

• Statements describe properties of resources • A resource is any object that can be pointed to by a URI: – a document, a picture, a paragraph on the Web; – a book in the library, a real person – isbn://5031-4444-3333 – … • Properties themselves are also resources (URIs)

INTERNET TECHNOLOGIES URIs

• URI = Uniform Resource Identifier • URLs (Uniform Resource Locators) are a particular type of URI, used for resources that can be accessed on the WWW (e.g., web pages) • In RDF, URIs typically look like “normal” URLs, often with fragment identifiers to point at specific parts of a document: – http://www.somedomain.com/some/path/to/file#fragmentID

8 INTERNET TECHNOLOGIES

Linking Statements • Each triple (Subject, Predicate, Object) represents an RDF statement • Subject Æ Resource (R) • Predicate Æ Property (P) • Object Æ Resource (R)

Object(R) Subject(R) hasColleague Ian John hasColleague hasHomePage(P) hasColleague(P)

Carole Subject(R) http://www.cs.mam.ac.uk/~johnsmith Object(L)

INTERNET TECHNOLOGIES What RDF/S provide?

• RDFS enables you to make simple, generic statements about your Web object classes, properties • RDF enables you to make specific statements about your Web object instances (of those classes, properties)

• A set of RDF statements can be viewed in 3 ways: – A set of triples: consider them as rows/tuples in a database – A directed graph: consider them as a complex, navigatable data structure

9 INTERNET TECHNOLOGIES The RDF Data Model

ƒ Simple but powerful data model for the description of resources and the creation of metadata ƒ Consists of three core concepts: ¾Resource ¾Property ¾Statement + Class (in RDF Schema) ƒ Similar to other modeling approaches (e.g. object-oriented modeling), but property-centric, not class-centric

INTERNET TECHNOLOGIES RDF Syntax • RDF has an XML syntax that has a specific meaning: • Every Description element describes a resource • Every attribute or nested element inside a Description is a property of that Resource • We can refer to resources by using URIs

http://www.cs.mam.ac.uk/~johnsmith

10 INTERNET TECHNOLOGIES RDF Schema (RDFS)

RDF Schema allows you to define vocabulary terms and the relations between those terms – it gives “extra meaning” to particular RDF predicates and resources – this “extra meaning”, or semantics, specifies how a term should be interpreted

RDF Schema terms (just a few examples): •Class •Property •type •subClassOf •range •domain

INTERNET TECHNOLOGIES RDFS Examples • These terms are the RDF Schema building blocks (constructors) used to create vocabularies:

• Properties can themselves have properties

11 INTERNET TECHNOLOGIES Problems with RDFS But… • RDFS too weak to describe resources in sufficient detail – No localised range and domain constraints • Can’t say that the range of hasChild is person when applied to persons and elephant when applied to elephants – No transitive, inverse or symmetrical properties • Can’t say that isPartOf is a transitive property, that hasPart is the inverse of isPartOf or that touches is symmetrical

INTERNET TECHNOLOGIES Web Ontology Language Requirements

Desirable features identified for Web Ontology Language:

• Extends existing Web standards – Such as XML, RDF, RDFS • Easy to understand and use – Should be based on familiar KR idioms • Formally specified • Of “adequate” expressive power • Possible to provide automated reasoning support

12 INTERNET TECHNOLOGIES From RDF to OWL • Two languages developed to satisfy above requirements – OIL: developed by group of (largely) European researchers (several from EU OntoKnowledge project) – DAML-ONT: developed by group of (largely) US researchers (in DARPA DAML programme) • Efforts merged to produce DAML+OIL – Development was carried out by “Joint EU/US Committee on Agent Markup Languages” – Extends (“DL subset” of) RDF

• DAML+OIL submitted to W3C as basis for standardisation – WebOnt group developed OWL language based on DAML+OIL – OWL language now a W3C Candidate Recommendation – Will soon become Proposed Recommendation

INTERNET TECHNOLOGIES What OWL provide

• OWL enables you to make complex, generic statements about your Web object classes, properties • OWL’s instances are expressed as RDF statements • OWL has 3 dialects/layers, increasingly more complex: – OWL-Lite, OWL-DL, OWL-Full • OWL enables you to map among ontologies: – Import one ontology into another: all things that are true in the imported ontology will thereby be true in the importing ontology

13 INTERNET TECHNOLOGIES OWL Language Levels*

Language Description Level OWL Full The complete OWL. For example, a class can be considered both as a collection of instances (individuals) and an instance (individual) itself. OWL DL Slightly constrained OWL. Properties cannot be (description individuals, for example. More expressive logic) cardinality constraints. OWL Lite A simpler language but one that is more expressive than RDF/S. Simple cardinality constraints only (0 or 1).

*Daconta, Obrst, Smith, 2003; cf. also OWL docs at http://www.w3.org/2001/sw/WebOnt/

INTERNET TECHNOLOGIES OWL FULL

• OWL Full extends OWL DL by permitting classes to be treated simultaneously as both collections and individuals (instances) • Also, a given datatypeProperty can be specified as being inverseFunctional, thus enabling, for example, the specification of a string as a unique key

**Clyde is an elephant. mammal species Elephant is a species. Therefore, Clyde is a subclass_of ×× instance_of species. WRONG! Elephant (class) Elephant (instance) Clyde is an elephant. instance_of Elephant is a mammal. Same label used for “elephant as a Therefore, Clyde is a subclass_of mammal” & “elephant as an Clyde mammal. instance_of species” RIGHT!

14 INTERNET TECHNOLOGIES Example: Inference and Proof

Proof Using Inference Rules

Given: If parentOf is a subProperty of motherOf, Deduction A method of and Mary is the mother of Bill, then Mary reasoning by which one infers is the parentOf Bill a conclusion from a set of sentences by employing the parentOf is a subProperty of motherOf axioms and rules of inference for a given logical system. Mary is the motherOf Bill

Infer: Mary is the parentOf Bill

Given... And... Can conclude...

parentOf Mary Mary subProperty motherOf parentOf

motherOf Bill Bill

A simple inferencing example from “Why use OWL?” by Adam Pease, http://www.xfront.com/why-use-owl.html

INTERNET TECHNOLOGIES Semantic Mark-up Languages

Semantic Web Ontology Languages

Ontologies provide semantics Interoperability achieved through languages based on XML eg: DAML+OIL, Dublin Core, OWL

15 INTERNET TECHNOLOGIES Semantic Web - Timetable

Trusted Web Resources

DAML+OIL Shared Terminology OWL machine ³´machine 2010

XML Self Describing RDF Documents 2000

HTTP Foundation of Web today 1990 HTML Human ³´Machine

SGML Document Exchange Format 1985 Hy Time

INTERNET TECHNOLOGIES

Challenges for the Semantic Web

ƒ Design and evolution of ontologies ¾ Reuse existing ontologies: for technical documentation, thesauri (e.g. wordnet) product classifications (e.g. uncefact) ¾ Ontologies through community process ¾ ƒ Enriching data with semantics ¾ Data and text mining ¾ Image understanding ¾ Computer linguistics ¾ Through community process

16 INTERNET TECHNOLOGIES

Challenges for the Semantic Web

ƒ Scalability of methods ¾Querying ¾Distributed reasoning ƒ Deployment and acceptance ¾Trust ¾Security ¾Transparency of reasoning ¾Adequacy: Is XML + RDF + OWL the hammer for every nail?

INTERNET TECHNOLOGIES References

Specifications • Web Ontology Language (OWL) Abstract Syntax and Semantics W3C Working Draft 8 November 2002 http://www.w3.org/TR/2002/WD-owl-semantics-20021108 • Resource Description Framework (RDF) Model and Syntax Specification W3C Proposed Recommendation 05 January 1999 http://www.w3.org/TR/PR-rdf-syntax/ • RDF/XML Syntax Specification (Revised) W3C Working Draft 8 November 2002 http://www.w3.org/TR/2002/WD-rdf-syntax-grammar-20021108 • RDF Semantics W3C Working Draft 12 November 2002 http://www.w3.org/TR/2002/WD-rdf-mt-20021112/ • Resource Description Framework (RDF): Concepts and Abstract Syntax W3C Working Draft 08 November 2002 http://www.w3.org/TR/2002/WD-rdf-concepts-20021108/ • RDF Vocabulary Description Language 1.0: RDF Schema by W3C http://www.w3.org/TR/rdf-schema/

17 INTERNET TECHNOLOGIES References

Links • W3C http://www.w3.org/2001/sw/ • Professor http://www.cs.umd.edu/~hendler/ • Dr. Jeff Heflin http://www.cse.lehigh.edu/~heflin/ • OntoWeb http://ontoweb.aifb.uni-karlsruhe.de/ An introduction to RDF - http://www- 106.ibm.com/developerworks/library/w-rdf/ • Introduction to RDF - http://www.w3schools.com/rdf/rdf_intro.asp • Aristotle (384-322 BCE.) Overview http://www.utm.edu/research/iep/a/aristotl.htm • Timeline, Smith, B. 2002, Ontology and Information Systems, Available: [URL:http://ontology.buffalo.edu/ontology(PIC).pdf] • Ontologies in Computer Science, Gruber, T. R. 1993, 'Towards Principles for the Design of Ontologies Used for Knowledge Sharing', International Journal of Human-Computer Studies, vol. 43, no. 5-6 [URL:http://www.cise.ufl.edu/~jhammer/classes/6930/XML- FA02/papers/gruber93ontology.pdf] • Ontology bibliography, http://glotta.ntua.gr/nlp/StateoftheArt/Ontologies/

INTERNET TECHNOLOGIES References Scientific articles • Semantic Web Road map by Tim Berners-Lee http://www.w3.org/DesignIssues/Semantic.html • The Semantic Web by Tim Berners-Lee http://www.w3.org/2002/Talks/04-sweb/Overview.html • Using W3C XML Schema by Eric van der Vlist http://www.xml.com/pub/a/2000/11/29/schemas/part1.html • Querying XML Documents by Paul Cotton and Jonathan Robie http://www.w3.org/2002/01/xquery-unicode.pdf • Primer: Getting into RDF & Semantic Web using N3 by W3C http://www.w3.org/2000/10/swap/Primer • Agents and the Semantic Web by James Hendler http://dlib.computer.org/ex/books/ex2001/pdf/x2030.pdf • The Future of the Semantic Web by Christian Ohlms • Ontologies in the Semantic Web, Lassila, O., van Harmelen, F., Horrocks, I., Hendler, J. & McGuinness, D. L. 2000, 'The semantic Web and its languages', Intelligent Systems, IEEE [see also IEEE Expert], vol. 15, no. 6, pp. 67-73

18