Chapter 2 An HTML Example Structured Web Documents in XML
Nonmonotonic Reasoning: Context- Dependent Reasoning
by V. Marek and Grigoris Antoniou M. TruszczynskiFrank van Harmelen Springer 1993
ISBN 0387976892
1 Chapter 2 A Semantic Web Primer 2 Chapter 2 A Semantic Web Primer
The Same Example in XML HTML versus XML: Similarities
and )
3 Chapter 2 A Semantic Web Primer 4 Chapter 2 A Semantic Web Primer Problems with Automated Interpretation of HTML Documents HTML vs XML: Structural Information
An intelligent agent trying to retrieve the names ò HTML documents do not contain structural of the authors of the book information: pieces of the document and their ò Authors‘ names could appear immediately relationships. after the title ò XML more easily accessible to machines because œ Every piece of information is described. ò or immediately after the word by œ Relations are also defined through the nesting structure. ò Are there two authors? œ E.g., the
5 Chapter 2 A Semantic Web Primer 6 Chapter 2 A Semantic Web Primer
HTML vs XML: Structural Information (2) HTML vs XML: Formatting
ò A machine processing the XML document ò The HTML representation provides more would be able to deduce that than the XML representation: œ the author element refers to the enclosing book œ The formatting of the document is also described element ò he main use of an HTML document is to œ rather than by proximity considerations display information: it must define formatting ò XML allows the definition of constraints on ò XML: separation of content from display values œ same information can be displayed in different œ E.g. a year must be a number of four digits ways
7 Chapter 2 A Semantic Web Primer 8 Chapter 2 A Semantic Web Primer HTML vs XML: Another Example HTML vs XML: Different Use of Tags
ò In HTML ò In both HTML docs same tags
Relationship matter-energy
ò In XML completely different E = M þ c2 ò In XML9 Chapter 2 A Semantic Web Primer 10 Chapter 2 A Semantic Web Primer
XML Vocabularies Lecture Outline
ò Web applications must agree on common 1. Introduction vocabularies to communicate and collaborate 2. Detailed Description of XML ò Communities and business sectors are 3. Structuring defining their specialized vocabularies a) DTDs œ mathematics (MathML) b) XML Schema œ bioinformatics (BSML) 4. Namespaces œ human resources (HRML) 5. Accessing, querying XML documents: XPath œ … 6. Transformations: XSLT
11 Chapter 2 A Semantic Web Primer 12 Chapter 2 A Semantic Web Primer The XML Language Prolog of an XML Document
An XML document consists of The prolog consists of ò a prolog ò an XML declaration and ò a number of elements ò an optional reference to external structuring documents ò an optional epilog (not discussed)
13 Chapter 2 A Semantic Web Primer 14 Chapter 2 A Semantic Web Primer
XML Elements XML Elements (2)
ò The —things“ the XML document talks about ò Tag names can be chosen almost freely. œ E.g. books, authors, publishers ò The first character must be a letter, an ò An element consists of: underscore, or a colon œ an opening tag ò No name may begin with the string —xml“ in œ the content any combination of cases œ a closing tag œ E.g. —Xml“, —xML“
15 Chapter 2 A Semantic Web Primer 16 Chapter 2 A Semantic Web Primer Content of XML Elements XML Attributes
ò Content may be text, or other elements, or nothing ò An empty element is not necessarily meaningless
ò If there is no content, then the element is called
17 Chapter 2 A Semantic Web Primer 18 Chapter 2 A Semantic Web Primer
XML Attributes: An Example The Same Example without Attributes
19 Chapter 2 A Semantic Web Primer 20 Chapter 2 A Semantic Web Primer XML Elements vs Attributes Further Components of XML Docs
ò Attributes can be replaced by elements ò Comments ò When to use elements and when attributes is œ A piece of text that is to be ignored by parser a matter of taste œ ò But attributes cannot be nested ò Processing Instructions (PIs) œ Define procedural attachments œ css"?>
21 Chapter 2 A Semantic Web Primer 22 Chapter 2 A Semantic Web Primer
The Tree Model of XML Documents: Well-Formed XML Documents An Example
ò
23 Chapter 2 A Semantic Web Primer 24 Chapter 2 A Semantic Web Primer The Tree Model of XML Documents: An Example (2) The Tree Model of XML Docs
ò The tree representation of an XML document is an ordered labeled tree: œ There is exactly one root œ There are no cycles œ Each non-root node has exactly one parent œ Each node has a label. œ The order of elements is important œ … but the order of attributes is not important
25 Chapter 2 A Semantic Web Primer 26 Chapter 2 A Semantic Web Primer
Lecture Outline Structuring XML Documents
1. Introduction ò Define all the element and attribute names 2. Detailed Description of XML that may be used 3. Structuring ò Define the structure a) DTDs œ what values an attribute may take b) XML Schema œ which elements may or must occur within other 4. Namespaces elements, etc. 5. Accessing, querying XML documents: XPath ò If such structuring information exists, the 6. Transformations: XSLT document can be validated
27 Chapter 2 A Semantic Web Primer 28 Chapter 2 A Semantic Web Primer Structuring XML Dcuments (2) DTD: Element Type Definition
ò An XML document is valid if
29 Chapter 2 A Semantic Web Primer 30 Chapter 2 A Semantic Web Primer
DTD: Disjunction in Element Type The Meaning of the DTD Definitions
ò The element types lecturer, name, and phone may ò We express that a lecturer element contains be used in the document either a name element or a phone element as ò A lecturer element contains a name element and a follows: phone element, in that order (sequence) ò A name element and a phone element may have any content ò A lecturer element contains a name element ò In DTDs, #PCDATA is the only atomic type for and a phone element in any order. elements
31 Chapter 2 A Semantic Web Primer 32 Chapter 2 A Semantic Web Primer Example of an XML Element The Corresponding DTD
33 Chapter 2 A Semantic Web Primer 34 Chapter 2 A Semantic Web Primer
Comments on the DTD Comments on the DTD (2)
ò The item element type is defined to be empty ò In addition to defining elements, we define ò + (after item) is a cardinality operator: attributes œ ?: appears zero times or once ò This is done in an attribute list containing: œ *: appears zero or more times œ Name of the element type to which the list applies œ +: appears one or more times œ A list of triplets of attribute name, attribute type, œ No cardinality operator means exactly once and value type ò Attribute name: A name that may be used in an XML document using a DTD
35 Chapter 2 A Semantic Web Primer 36 Chapter 2 A Semantic Web Primer DTD: Attribute Types DTD: Attribute Value Types
ò Similar to predefined data types, but limited selection ò #REQUIRED ò The most important types are œ Attribute must appear in every occurrence of the element type in the XML document œ CDATA, a string (sequence of characters) œ ID, a name that is unique across the entire XML document ò #IMPLIED œ IDREF, a reference to another element with an ID attribute œ The appearance of the attribute is optional carrying the same value as the IDREF attribute ò #FIXED "value" œ IDREFS, a series of IDREFs œ Every element must have this attribute œ (v1| . . . |vn), an enumeration of all possible values ò "value" ò Limitations: no dates, number ranges etc. œ This specifies the default value for the attribute
37 Chapter 2 A Semantic Web Primer 38 Chapter 2 A Semantic Web Primer
Referencing with IDREF and IDREFS An XML Document Respecting the DTD
39 Chapter 2 A Semantic Web Primer 40 Chapter 2 A Semantic Web Primer A DTD for an Email Element A DTD for an Email Element (2)
address CDATA #REQUIRED> address CDATA #REQUIRED> encoding (mime|binhex) "mime" file CDATA #REQUIRED>
41 Chapter 2 A Semantic Web Primer 42 Chapter 2 A Semantic Web Primer
Interesting Parts of the DTD Interesting Parts of the DTD (2)
ò A head element contains (in that order): ò A body element contains œ a from element œ a text element œ at least one to element œ possibly followed by a number of attachment œ zero or more cc elements elements œ a subject element ò The encoding attribute of an attachment ò In from, to, and cc elements element must have either the value —mime“ œ the name attribute is not required or —binhex“ œ the address attribute is always required œ —mime“ is the default value
43 Chapter 2 A Semantic Web Primer 44 Chapter 2 A Semantic Web Primer Remarks on DTDs Lecture Outline
ò A DTD can be interpreted as an Extended 1. Introduction Backus-Naur Form (EBNF) 2. Detailed Description of XML œ 3. Structuring œ is equivalent to email ::= head body a) DTDs ò Recursive definitions possible in DTDs b) XML Schema œ 5. Accessing, querying XML documents: XPath 6. Transformations: XSLT
45 Chapter 2 A Semantic Web Primer 46 Chapter 2 A Semantic Web Primer
XML Schema XML Schema (2)
ò Significantly richer language for defining the ò An XML schema is an element with an structure of XML documents opening tag like ò Tts syntax is based on XML itself
47 Chapter 2 A Semantic Web Primer 48 Chapter 2 A Semantic Web Primer Element Types Attribute Types
49 Chapter 2 A Semantic Web Primer 50 Chapter 2 A Semantic Web Primer
Data Types Data Types (2)
ò There is a variety of built-in data types ò Complex data types are defined from already œ Numerical data types: integer, Short etc. existing data types by defining some attributes (if any) and using: œ String types: string, ID, IDREF, CDATA etc. œ sequence, a sequence of existing data type œ Date and time data types: time, Month etc. elements (order is important) ò There are also user-defined data types œ all, a collection of elements that must appear œ simple data types, which cannot use elements or (order is not important) attributes œ choice, a collection of elements, of which one will be chosen œ complex data types, which can use these
51 Chapter 2 A Semantic Web Primer 52 Chapter 2 A Semantic Web Primer A Data Type Example Data Type Extension
53 Chapter 2 A Semantic Web Primer 54 Chapter 2 A Semantic Web Primer
Resulting Data Type Data Type Extension (2)
55 Chapter 2 A Semantic Web Primer 56 Chapter 2 A Semantic Web Primer Data Type Restriction Example of Data Type Restriction
ò An existing data type may be restricted by adding
57 Chapter 2 A Semantic Web Primer 58 Chapter 2 A Semantic Web Primer
Restriction of Simple Data Types Data Type Restriction: Enumeration
59 Chapter 2 A Semantic Web Primer 60 Chapter 2 A Semantic Web Primer XML Schema: The Email Example XML Schema: The Email Example (2)
61 Chapter 2 A Semantic Web Primer 62 Chapter 2 A Semantic Web Primer
XML Schema: The Email Example (3) Lecture Outline
63 Chapter 2 A Semantic Web Primer 64 Chapter 2 A Semantic Web Primer Namespaces An Example
65 Chapter 2 A Semantic Web Primer 66 Chapter 2 A Semantic Web Primer
Namespace Declarations Lecture Outline
ò Namespaces are declared within an element 1. Introduction and can be used in that element and any of 2. Detailed Description of XML its children (elements and attributes) 3. Structuring ò A namespace declaration has the form: a) DTDs œ xmlns:prefix="location" b) XML Schema œ location is the address of the DTD or schema 4. Namespaces ò If a prefix is not specified: xmlns="location" 5. Accessing, querying XML documents: XPath then the location is used by default 6. Transformations: XSLT
67 Chapter 2 A Semantic Web Primer 68 Chapter 2 A Semantic Web Primer Addressing and Querying XML Documents XPath
ò In relational databases, parts of a database ò XPath is core for XML query languages can be selected and retrieved using SQL ò Language for addressing parts of an XML œ Same necessary for XML documents document. œ Query languages: XQuery, XQL, XML-QL œ It operates on the tree data model of XML ò The central concept of XML query languages is a path expression œ It has a non-XML syntax œ Specifies how a node or a set of nodes, in the tree representation of the XML document can be reached
69 Chapter 2 A Semantic Web Primer 70 Chapter 2 A Semantic Web Primer
Types of Path Expressions An XML Example
71 Chapter 2 A Semantic Web Primer 72 Chapter 2 A Semantic Web Primer Examples of Path Expressions in Tree Representation XPath
ò Address all author elements /library/author
ò Addresses all author elements that are children of the library element node, which resides immediately below the root ò /t1/.../tn, where each ti+1 is a child node of ti, is a path through the tree representation
73 Chapter 2 A Semantic Web Primer 74 Chapter 2 A Semantic Web Primer
Examples of Path Expressions in Examples of Path Expressions in XPath (2) XPath (3)
ò Address all author elements ò Address the location attribute nodes within //author library element nodes ò Here // says that we should consider all /library/@location elements in the document and check ò The symbol @ is used to denote attribute whether they are of type author nodes ò This path expression addresses all author elements anywhere in the document
75 Chapter 2 A Semantic Web Primer 76 Chapter 2 A Semantic Web Primer Examples of Path Expressions in Examples of Path Expressions in XPath (4) XPath (5)
ò Address all title attribute nodes within book ò Address all books with title —Artificial Intelligence“ elements anywhere in the document, which /book[@title="Artificial Intelligence"] have the value —Artificial Intelligence“ ò Test within square brackets: a filter expression //book/@title="Artificial Intelligence" œ It restricts the set of addressed nodes. ò Difference with query 4. œ Query 5 addresses book elements, the title of which satisfies a certain condition. œ Query 4 collects title attribute nodes of book elements
77 Chapter 2 A Semantic Web Primer 78 Chapter 2 A Semantic Web Primer
Tree Representation of Query 4 Tree Representation of Query 5
79 Chapter 2 A Semantic Web Primer 80 Chapter 2 A Semantic Web Primer Examples of Path Expressions in XPath (6) General Form of Path Expressions
ò Address the first author element node in the XML ò A path expression consists of a series of document steps, separated by slashes //author[1] ò Address the last book element within the first ò A step consists of author element node in the document œ An axis specifier, //author[1]/book[last()] œ A node test, and ò Address all book element nodes without a title œ An optional predicate attribute //book[not @title]
81 Chapter 2 A Semantic Web Primer 82 Chapter 2 A Semantic Web Primer
General Form of Path Expressions (2) General Form of Path Expressions (3)
ò An axis specifier determines the tree ò A node test specifies which nodes to relationship between the nodes to be address addressed and the context node œ The most common node tests are element names œ E.g. parent, ancestor, child (the default), sibling, œ E.g., * addresses all element nodes attribute node œ comment() addresses all comment nodes œ // is such an axis specifier: descendant or self
83 Chapter 2 A Semantic Web Primer 84 Chapter 2 A Semantic Web Primer General Form of Path Expressions (4) Lecture Outline
ò Predicates (or filter expressions) are 1. Introduction optional and are used to refine the set of 2. Detailed Description of XML addressed nodes 3. Structuring œ E.g., the expression [1] selects the first node a) DTDs œ [position()=last()] selects the last node b) XML Schema œ [position() mod 2 =0] selects the even nodes 4. Namespaces ò XPath has a more complicated full syntax. 5. Accessing, querying XML documents: XPath œ We have only presented the abbreviated syntax 6. Transformations: XSLT
85 Chapter 2 A Semantic Web Primer 86 Chapter 2 A Semantic Web Primer
Displaying XML Documents Style Sheets
87 Chapter 2 A Semantic Web Primer 88 Chapter 2 A Semantic Web Primer XSL Transformations (XSLT) XSLT (2)
ò XSLT specifies rules with which an input XML ò Move data and metadata from one XML document is transformed to representation to another œ another XML document ò XSLT is chosen when applications that use different œ an HTML document DTDs or schemas need to communicate œ plain text ò XSLT can be used for machine processing of content ò The output document may use the same DTD or without any regard to displaying the information for schema, or a completely different vocabulary people to read. ò XSLT can be used independently of the formatting ò In the following we use XSLT only to display XML language documents
89 Chapter 2 A Semantic Web Primer 90 Chapter 2 A Semantic Web Primer
XSLT Transformation into HTML Style Sheet Output
Grigoris Antoniou
University of Bremen
91 Chapter 2 A Semantic Web Primer 92 Chapter 2 A Semantic Web Primer Observations About XSLT A Template
ò XSLT documents are XML documents œ XSLT resides on top of XML
placeholders for content to be inserted ...
ò xsl:value-of retrieves the value of an ... element and copies it into the output document œ It places some content into the template
93 Chapter 2 A Semantic Web Primer 94 Chapter 2 A Semantic Web Primer
Auxiliary Templates Example of an Auxiliary Template
95 Chapter 2 A Semantic Web Primer 96 Chapter 2 A Semantic Web Primer Example of an Auxiliary Template (2) Example of an Auxiliary Template (3)
select="affiliation"/>
Email:
97 Chapter 2 A Semantic Web Primer 98 Chapter 2 A Semantic Web Primer
Multiple Authors Output Explanation of the Example
ò xsl:apply-templates element causes all children of
Grigoris Antoniou
path expression Affiliation: University of Bremenœ E.g., if the current template applies to /, then the element Email: [email protected] xsl:apply-templates applies to the root element
œ I.e. the authors element (/ is located above the root
David Billington
element) Affiliation: Griffith Universityœ If the current context node is the authors element, then the Email: [email protected] element xsl:apply-templates select="author" causes the
template for the author elements to be applied to all author children of the authors element
99 Chapter 2 A Semantic Web Primer 100 Chapter 2 A Semantic Web Primer Explanation of the Example (2) Processing XML Attributes
ò It is good practice to define a template for ò Suppose we wish to transform to itself the element: each element type in the document
101 Chapter 2 A Semantic Web Primer 102 Chapter 2 A Semantic Web Primer
Transforming an XML Document to Processing XML Attributes (2) Another
ò Not well-formed because tags are not allowed within the values of attributes ò We wish to add attribute values into template
103 Chapter 2 A Semantic Web Primer 104 Chapter 2 A Semantic Web Primer Transforming an XML Document to Transforming an XML Document to Another (2) Another (3)
105 Chapter 2 A Semantic Web Primer 106 Chapter 2 A Semantic Web Primer
Points for Discussion in Subsequent Summary Chapters
ò XML is a metalanguage that allows users to ò The nesting of tags does not have standard meaning define markup ò The semantics of XML documents is not accessible ò XML separates content and structure from to machines, only to people formatting ò Collaboration and exchange are supported if there is ò XML is the de facto standard for the underlying shared understanding of the vocabulary representation and exchange of structured ò XML is well-suited for close collaboration, where information on the Web domain- or community-based vocabularies are used œ It is not so well-suited for global communication. ò XML is supported by query languages
107 Chapter 2 A Semantic Web Primer 108 Chapter 2 A Semantic Web Primer