XML Metadata Standards and Topic Maps Outline
Total Page:16
File Type:pdf, Size:1020Kb
XML Metadata Standards and Topic Maps Erik Wilde 16.7.2001 XML Metadata Standards and Topic Maps 1 Outline what is XML? z a syntax (not a data model!) what is the data model behind XML? z XML Information Set (basically, trees...) what can be described with XML? z describing the content syntactically (schemas) z describing the content abstractly (metadata) XML metadata is outside of XML documents ISO Topic Maps z a "schema language" for meta data 16.7.2001 XML Metadata Standards and Topic Maps 2 1 Extensible Markup Language standardized by the W3C in February 1998 a subset (aka profile) of SGML (ISO 8879) coming from a document world z data are documents defined in syntax z no abstract data model problems in many real-world scenarios z how to compare XML documents z attribute order, white space, namespace prefixes, ... z how to search for data within documents z query languages operate on abstract data models z often data are not documents 16.7.2001 XML Metadata Standards and Topic Maps 3 Why XML at all? because it's simple z easily understandable, human-readable because of the available tools z it's easy to find (free) XML software because of improved interoperability z all others do it! z easy to interface with other XML applications because it's versatile z the data model behind XML is very versatile 16.7.2001 XML Metadata Standards and Topic Maps 4 2 XML Information Set several XML applications need a data model z style sheets for XML (CSS, XSL) z interfaces to programming languages (DOM) z XML transformation languages (XSLT) z XML fragment identifiers (XPointer) z XML query languages (XQuery) XML does not have a real data model z implicitly defined, but not authoritatively XML Information Set (XML Infoset) z describes a set of information items z each XML document is a set of such items 16.7.2001 XML Metadata Standards and Topic Maps 5 XML Infoset Essentials only Namespace-compliant XML allowed! so what's in the Infoset? z elements z attributes z Namespace declarations and prefixes z comments z processing instructions and what's not in the Infoset? z whitespace within element tags z the order of attributes within element tags z any information about the DTD 16.7.2001 XML Metadata Standards and Topic Maps 6 3 XML Schema Languages XML represents structured Information z XML Infoset defines the data model (trees) z XML 1.0 defines a character-based syntax XML 1.0 also defines DTDs z element types and their content models z attributes and their data types every XML application has to support DTDs z the only globally accepted schema language z almost 20 years old z many drawbacks for non-document scenarios 16.7.2001 XML Metadata Standards and Topic Maps 7 XML Schema developed because of user demand z B2B scenarios need better data types z data modeling needs better structuring XML Schema W3C standard since 5/2001 z implementations available z rapid adoption is very likely Part I defines structuring mechanisms z element types may be derived from each other Part II defines a data type vocabulary z a set of application-oriented simple types 16.7.2001 XML Metadata Standards and Topic Maps 8 4 Schemas and Metadata XML resources may contain any type of data z documents (as originally intended by SGML) z order forms (as is common in B2B scenarios) z generic things such as RPC requests and responses z SOAP and XML RPC are two popular variants z or even information about other resources XML metadata describes data resources z not necessarily XML data (eg, image descriptions) z not necessarily attached to the resources z making comments on other people's resources z metadata is also data (ie, structured information) z XML metadata needs schema definitions 16.7.2001 XML Metadata Standards and Topic Maps 9 XMLizing the World... should everything be XML? z structured data would be an appropriate target z but what about GIF, JPEG, MPEG, ... ? everything should be described using XML z descriptions of resources are metadata z metadata is structured data z metadata should be in XML so there must be an XML metadata standard z TimBL's favorite: Resource Description Framework z coming from ISO standardization: Topic Maps 16.7.2001 XML Metadata Standards and Topic Maps 10 5 Resource Description Framework RDF starts with a data model z and defines an XML syntax for representation everything in RDF can be represented by a graph with nodes and arcs z each node is a resource z each arc represents a property z properties and resources are named with URIs describes the whole Web and beyond z anything which can be named with a URI z which is almost anything (phone, tv-channels, ...) RDF graphs describe logical assertions 16.7.2001 XML Metadata Standards and Topic Maps 11 RDF Metadata RessourceRessource RDFRDF RDFRDF SchemaSchema DescriptionDescription (Ontology)(Ontology) RDFRDF SchemaSchema && SyntaxSyntax 16.7.2001 XML Metadata Standards and Topic Maps 12 6 RDF-based Email Description 16.7.2001 XML Metadata Standards and Topic Maps 13 But ... what is it good for? ask questions about the email z who sent me mail on a particular topic? z get me all the mail from Fred Smith z who where the people who I mailed with on Friday? join the email graphs with other ones z address books z home pages z browser history z organizational affiliations 16.7.2001 XML Metadata Standards and Topic Maps 14 7 Topic Maps Topics are "things of interest" z loosely defined, widely usable z each Topic has name(s) and/or occurrence(s) z Topics have "topic types" (which are Topics...) Associations are used to connect Topics z the have an "association type" (which is a Topic...) z Topics references in Associations have an "association role type" (which are Topics...) Topic Occurrences point to resources z anything addressable by a name (URI) z described by an "occurrence role type" (a Topic...) 16.7.2001 XML Metadata Standards and Topic Maps 15 A Simple Topic Map 16.7.2001 XML Metadata Standards and Topic Maps 16 8 The Whole Picture 16.7.2001 XML Metadata Standards and Topic Maps 17 Topic Map View (I) 16.7.2001 XML Metadata Standards and Topic Maps 18 9 Topic Map View (II) 16.7.2001 XML Metadata Standards and Topic Maps 19 Topic Map View (III) 16.7.2001 XML Metadata Standards and Topic Maps 20 10 Comparison RDF Topic Maps standardized by W3C standardized by ISO explicit data model defined by syntax properties have data types associations aren't constrained inherently distributed centralized separates schema from eveything (almost...) is a topic, instance (resource description) there are no types 16.7.2001 XML Metadata Standards and Topic Maps 21 What is missing? for Topic Maps only z a clean way to separate schemas and instances z a constraint language for topic associations z a way to distribute Topic Maps for RDF only z a unified data model with XML Schema for both approaches z tools for creating and managing metadata z a query language for actually using metadata z support from a wide range of vendors & users z an approach for achieving vocabulary consensus z smart ways to handle distribution 16.7.2001 XML Metadata Standards and Topic Maps 22 11 XLinkbase System Architecture HTML X.500 CMS HTTP Server Browser LDAP EJB Pool EJB EJB Search XQuery HTTP Engine Server XML EJB EJB XLink XSLT EJB Browser JDBC EJB EJB EJB Servlet Engine XLinkbase RDBMS J2EE (EJB/JMS/JTS) Servlet Client exchanging XML messages XLinkbase Server HTTP Server 16.7.2001 XML Metadata Standards and Topic Maps 23 XLinkbase Status where the implementation is going z currently concentrating on EJB environment z hard to keep up with commercial engines z case study with simpler model & implementation z case study for generating DHTML links where the concept is going z proof of concept with the case study z 1 or 2 DAs dealing with Topic Map distribution z looking into data model improvements z constraint language for associations z schema/instance separation or separability 16.7.2001 XML Metadata Standards and Topic Maps 24 12.