<<

XML Standards and Topic Maps

Erik Wilde

16.7.2001 XML Metadata Standards and Topic Maps 1

Outline

what is XML? z a syntax (not a model!) what is the behind XML? z XML Set (basically, trees...) what can be described with XML? z describing the content syntactically (schemas) z describing the content abstractly (metadata) XML metadata is outside of XML documents ISO Topic Maps z a "schema language" for meta data

16.7.2001 XML Metadata Standards and Topic Maps 2

1 Extensible Markup Language

standardized by the W3C in February 1998 a subset (aka profile) of SGML (ISO 8879) coming from a document world z data are documents defined in syntax z no abstract data model problems in many real-world scenarios z how to compare XML documents z attribute order, white space, namespace prefixes, ... z how to search for data within documents z query languages operate on abstract data models z often data are not documents

16.7.2001 XML Metadata Standards and Topic Maps 3

Why XML at all?

because it's simple z easily understandable, human-readable because of the available tools z it's easy to find (free) XML because of improved z all others do it! z easy to interface with other XML applications because it's versatile z the data model behind XML is very versatile

16.7.2001 XML Metadata Standards and Topic Maps 4

2 XML Information Set

several XML applications need a data model z style sheets for XML (CSS, XSL) z interfaces to programming languages (DOM) z XML transformation languages (XSLT) z XML fragment (XPointer) z XML query languages (XQuery) XML does not have a real data model z implicitly defined, but not authoritatively XML Information Set (XML Infoset) z describes a set of information items z each XML document is a set of such items

16.7.2001 XML Metadata Standards and Topic Maps 5

XML Infoset Essentials

only Namespace-compliant XML allowed! so what's in the Infoset? z elements z attributes z Namespace declarations and prefixes z comments z processing instructions and what's not in the Infoset? z whitespace within element tags z the order of attributes within element tags z any information about the DTD

16.7.2001 XML Metadata Standards and Topic Maps 6

3 XML Schema Languages

XML represents structured Information z XML Infoset defines the data model (trees) z XML 1.0 defines a character-based syntax XML 1.0 also defines DTDs z element types and their content models z attributes and their data types every XML application has to support DTDs z the only globally accepted schema language z almost 20 years old z many drawbacks for non-document scenarios

16.7.2001 XML Metadata Standards and Topic Maps 7

XML Schema

developed because of user demand z B2B scenarios need better data types z needs better structuring XML Schema W3C standard since 5/2001 z implementations available z rapid adoption is very likely Part I defines structuring mechanisms z element types may be derived from each other Part II defines a data type vocabulary z a set of application-oriented simple types

16.7.2001 XML Metadata Standards and Topic Maps 8

4 Schemas and Metadata

XML resources may contain any type of data z documents (as originally intended by SGML) z order forms (as is common in B2B scenarios) z generic things such as RPC requests and responses z SOAP and XML RPC are two popular variants z or even information about other resources XML metadata describes data resources z not necessarily XML data (eg, image descriptions) z not necessarily attached to the resources z making comments on other people's resources z metadata is also data (ie, structured information) z XML metadata needs schema definitions

16.7.2001 XML Metadata Standards and Topic Maps 9

XMLizing the World...

should everything be XML? z structured data would be an appropriate target z but what about GIF, JPEG, MPEG, ... ? everything should be described using XML z descriptions of resources are metadata z metadata is structured data z metadata should be in XML so there must be an XML z TimBL's favorite: Resource Description Framework z coming from ISO standardization: Topic Maps

16.7.2001 XML Metadata Standards and Topic Maps 10

5 Resource Description Framework

RDF starts with a data model z and defines an XML syntax for representation everything in RDF can be represented by a graph with nodes and arcs z each node is a resource z each arc represents a property z properties and resources are named with URIs describes the whole Web and beyond z anything which can be named with a URI z which is almost anything (phone, tv-channels, ...) RDF graphs describe logical assertions

16.7.2001 XML Metadata Standards and Topic Maps 11

RDF Metadata

RessourceRessource

RDFRDF RDFRDF SchemaSchema DescriptionDescription (Ontology)(Ontology)

RDFRDF SchemaSchema && SyntaxSyntax

16.7.2001 XML Metadata Standards and Topic Maps 12

6 RDF-based Email Description

16.7.2001 XML Metadata Standards and Topic Maps 13

But ... what is it good for?

ask questions about the email z who sent me mail on a particular topic? z get me all the mail from Fred Smith z who where the people who I mailed with on Friday? join the email graphs with other ones z address z home pages z browser history z organizational affiliations

16.7.2001 XML Metadata Standards and Topic Maps 14

7 Topic Maps

Topics are "things of interest" z loosely defined, widely usable z each Topic has name(s) and/or occurrence(s) z Topics have "topic types" (which are Topics...) Associations are used to connect Topics z the have an "association type" (which is a Topic...) z Topics references in Associations have an "association role type" (which are Topics...) Topic Occurrences point to resources z anything addressable by a name (URI) z described by an "occurrence role type" (a Topic...)

16.7.2001 XML Metadata Standards and Topic Maps 15

A Simple

16.7.2001 XML Metadata Standards and Topic Maps 16

8 The Whole Picture

16.7.2001 XML Metadata Standards and Topic Maps 17

Topic Map View (I)

16.7.2001 XML Metadata Standards and Topic Maps 18

9 Topic Map View (II)

16.7.2001 XML Metadata Standards and Topic Maps 19

Topic Map View (III)

16.7.2001 XML Metadata Standards and Topic Maps 20

10 Comparison

RDF Topic Maps standardized by W3C standardized by ISO explicit data model defined by syntax properties have data types associations aren't constrained inherently distributed centralized separates schema from eveything (almost...) is a topic, instance (resource description) there are no types

16.7.2001 XML Metadata Standards and Topic Maps 21

What is missing?

for Topic Maps only z a clean way to separate schemas and instances z a constraint language for topic associations z a way to distribute Topic Maps for RDF only z a unified data model with XML Schema for both approaches z tools for creating and managing metadata z a for actually using metadata z support from a wide range of vendors & users z an approach for achieving vocabulary consensus z smart ways to handle distribution

16.7.2001 XML Metadata Standards and Topic Maps 22

11 XLinkbase System Architecture

HTML X.500 CMS HTTP

Server Browser LDAP EJB Pool EJB

EJB Search XQuery HTTP

Engine Server XML EJB EJB XLink XSLT EJB Browser

JDBC EJB EJB EJB Servlet Engine XLinkbase RDBMS J2EE (EJB/JMS/JTS) Servlet Client exchanging XML messages

XLinkbase Server HTTP Server

16.7.2001 XML Metadata Standards and Topic Maps 23

XLinkbase Status

where the implementation is going z currently concentrating on EJB environment z hard to keep up with commercial engines z case study with simpler model & implementation z case study for generating DHTML links where the concept is going z proof of concept with the case study z 1 or 2 DAs dealing with Topic Map distribution z looking into data model improvements z constraint language for associations z schema/instance separation or separability

16.7.2001 XML Metadata Standards and Topic Maps 24

12