<<
Home , XML

Chapter 2 An HTML Example Structured Web Documents in XML

Nonmonotonic Reasoning: Context- Dependent Reasoning

by V. Marek and Grigoris Antoniou M. Truszczynski
Frank van Harmelen Springer 1993
ISBN 0387976892

1 Chapter 2 Primer 2 Chapter 2 A Semantic Web Primer

The Same Example in XML HTML versus XML: Similarities

ò Both use tags (e.g.

and ) Nonmonotonic Reasoning: Context- Dependent Reasoning ò Tags may be nested (tags within tags) V. Marek ò Human users can read and interpret both M. Truszczynski HTML and XML representations quite easily Springer … But how about machines? 1993 0387976892

3 Chapter 2 A Semantic Web Primer 4 Chapter 2 A Semantic Web Primer Problems with Automated Interpretation of HTML Documents HTML vs XML: Structural Information

An intelligent agent trying to retrieve the names ò HTML documents do not contain structural of the authors of the book information: pieces of the document and their ò Authors‘ names could appear immediately relationships. after the title ò XML more easily accessible to machines because œ Every piece of information is described. ò or immediately after the word by œ Relations are also defined through the nesting structure. ò Are there two authors? œ E.g., the tags appear within the tags, so they describe properties of the particular book. ò Or just one, called —V. Marek and M. Truszczynski“?

5 Chapter 2 A Semantic Web Primer 6 Chapter 2 A Semantic Web Primer

HTML vs XML: Structural Information (2) HTML vs XML: Formatting

ò A machine processing the XML document ò The HTML representation provides more would be able to deduce that than the XML representation: œ the author element refers to the enclosing book œ The formatting of the document is also described element ò he main use of an HTML document is to œ rather than by proximity considerations display information: it must define formatting ò XML allows the definition of constraints on ò XML: separation of content from display values œ same information can be displayed in different œ E.g. a year must be a number of four digits ways

7 Chapter 2 A Semantic Web Primer 8 Chapter 2 A Semantic Web Primer HTML vs XML: Another Example HTML vs XML: Different Use of Tags

ò In HTML ò In both HTML docs same tags

Relationship matter-energy

ò In XML completely different E = M þ c2 ò In XML ò HTML tags define display: color, lists … Relationship matter ò XML tags not fixed: user definable tags energy E ò XML meta : language for M þ c2 defining markup languages

9 Chapter 2 A Semantic Web Primer 10 Chapter 2 A Semantic Web Primer

XML Vocabularies Lecture Outline

ò Web applications must agree on common 1. Introduction vocabularies to communicate and collaborate 2. Detailed Description of XML ò Communities and business sectors are 3. Structuring defining their specialized vocabularies a) DTDs œ mathematics (MathML) b) XML Schema œ bioinformatics (BSML) 4. œ human resources (HRML) 5. Accessing, querying XML documents: XPath œ … 6. Transformations: XSLT

11 Chapter 2 A Semantic Web Primer 12 Chapter 2 A Semantic Web Primer The XML Language of an XML Document

An XML document consists of The prolog consists of ò a prolog ò an XML declaration and ò a number of elements ò an optional reference to external structuring documents ò an optional epilog (not discussed)

13 Chapter 2 A Semantic Web Primer 14 Chapter 2 A Semantic Web Primer

XML Elements XML Elements (2)

ò The —things“ the XML document talks about ò Tag names can be chosen almost freely. œ E.g. books, authors, publishers ò The first character must be a letter, an ò An element consists of: underscore, or a colon œ an opening tag ò No name may begin with the string —xml“ in œ the content any combination of cases œ a closing tag œ E.g. —Xml“, —xML“ David Billington

15 Chapter 2 A Semantic Web Primer 16 Chapter 2 A Semantic Web Primer Content of XML Elements XML Attributes

ò Content may be text, or other elements, or nothing ò An empty element is not necessarily meaningless David Billington œ It may have some properties in terms of attributes +61 Þ 7 Þ 3875 507 ò An attribute is a name-value pair inside the opening tag of an element

ò If there is no content, then the element is called for

17 Chapter 2 A Semantic Web Primer 18 Chapter 2 A Semantic Web Primer

XML Attributes: An Example The Same Example without Attributes

date="October 15, 2002"> 23456 John Smith October 15, 2002 a528 1 c817 3

19 Chapter 2 A Semantic Web Primer 20 Chapter 2 A Semantic Web Primer XML Elements vs Attributes Further Components of XML Docs

ò Attributes can be replaced by elements ò Comments ò When to use elements and when attributes is œ A piece of text that is to be ignored by parser a matter of taste œ ò But attributes cannot be nested ò Processing Instructions (PIs) œ Define procedural attachments œ css"?>

21 Chapter 2 A Semantic Web Primer 22 Chapter 2 A Semantic Web Primer

The Model of XML Documents: Well-Formed XML Documents An Example

ò Syntactically correct documents ò Some syntactic rules: œ Only one outermost element (called root element) corresponding closing tag Where is your draft? œ Tags may not overlap. The following is forbidden: ò Lee Hong Grigoris, where is the draft of the paper you promised me last week? œ Attributes within an element have unique names œ Element and tag names must be permissible

23 Chapter 2 A Semantic Web Primer 24 Chapter 2 A Semantic Web Primer The Tree Model of XML Documents: An Example (2) The Tree Model of XML Docs

ò The tree representation of an XML document is an ordered labeled tree: œ There is exactly one root œ There are no cycles œ Each non-root has exactly one parent œ Each node has a label. œ The order of elements is important œ … but the order of attributes is not important

25 Chapter 2 A Semantic Web Primer 26 Chapter 2 A Semantic Web Primer

Lecture Outline Structuring XML Documents

1. Introduction ò Define all the element and attribute names 2. Detailed Description of XML that may be used 3. Structuring ò Define the structure a) DTDs œ what values an attribute may take b) XML Schema œ which elements may or must occur within other 4. Namespaces elements, etc. 5. Accessing, querying XML documents: XPath ò If such structuring information exists, the 6. Transformations: XSLT document can be validated

27 Chapter 2 A Semantic Web Primer 28 Chapter 2 A Semantic Web Primer Structuring XML Dcuments (2) DTD: Element Type Definition

ò An XML document is valid if œ it is well-formed David Billington œ respects the structuring information it uses +61 Þ 7 Þ 3875 507 ò There are two ways of defining the structure of XML documents: DTD for above element (and all lecturer elements): œ DTDs (the older and more restricted way) œ XML Schema (offers extended possibilities)

29 Chapter 2 A Semantic Web Primer 30 Chapter 2 A Semantic Web Primer

DTD: Disjunction in Element Type The Meaning of the DTD Definitions

ò The element types lecturer, name, and phone may ò We express that a lecturer element contains be used in the document either a name element or a phone element as ò A lecturer element contains a name element and a follows: phone element, in that order (sequence) ò A name element and a phone element may have any content ò A lecturer element contains a name element ò In DTDs, #PCDATA is the only atomic type for and a phone element in any order. elements

31 Chapter 2 A Semantic Web Primer 32 Chapter 2 A Semantic Web Primer Example of an XML Element The Corresponding DTD

date CDATA #REQUIRED> comments CDATA #IMPLIED>

33 Chapter 2 A Semantic Web Primer 34 Chapter 2 A Semantic Web Primer

Comments on the DTD Comments on the DTD (2)

ò The item element type is defined to be empty ò In addition to defining elements, we define ò + (after item) is a cardinality operator: attributes œ ?: appears zero times or once ò This is done in an attribute list containing: œ *: appears zero or more times œ Name of the element type to which the list applies œ +: appears one or more times œ A list of triplets of attribute name, attribute type, œ No cardinality operator means exactly once and value type ò Attribute name: A name that may be used in an XML document using a DTD

35 Chapter 2 A Semantic Web Primer 36 Chapter 2 A Semantic Web Primer DTD: Attribute Types DTD: Attribute Value Types

ò Similar to predefined types, but limited selection ò #REQUIRED ò The most important types are œ Attribute must appear in every occurrence of the element type in the XML document œ CDATA, a string (sequence of characters) œ ID, a name that is unique across the entire XML document ò #IMPLIED œ IDREF, a reference to another element with an ID attribute œ The appearance of the attribute is optional carrying the same value as the IDREF attribute ò #FIXED "value" œ IDREFS, a series of IDREFs œ Every element must have this attribute œ (v1| . . . |vn), an enumeration of all possible values ò "value" ò Limitations: no dates, number ranges etc. œ This specifies the default value for the attribute

37 Chapter 2 A Semantic Web Primer 38 Chapter 2 A Semantic Web Primer

Referencing with IDREF and IDREFS An XML Document Respecting the DTD

Bob Marley Bridget Jones mother IDREF #IMPLIED father IDREF #IMPLIED Mary Poppins children IDREFS #IMPLIED> Peter Marley

39 Chapter 2 A Semantic Web Primer 40 Chapter 2 A Semantic Web Primer A DTD for an Email Element A DTD for an Email Element (2)

address CDATA #REQUIRED> address CDATA #REQUIRED> encoding (mime|binhex) "mime" file CDATA #REQUIRED>

41 Chapter 2 A Semantic Web Primer 42 Chapter 2 A Semantic Web Primer

Interesting Parts of the DTD Interesting Parts of the DTD (2)

ò A head element contains (in that order): ò A body element contains œ a from element œ a text element œ at least one to element œ possibly followed by a number of attachment œ zero or more cc elements elements œ a subject element ò The encoding attribute of an attachment ò In from, to, and cc elements element must have either the value —mime“ œ the name attribute is not required or —binhex“ œ the address attribute is always required œ —mime“ is the default value

43 Chapter 2 A Semantic Web Primer 44 Chapter 2 A Semantic Web Primer Remarks on DTDs Lecture Outline

ò A DTD can be interpreted as an Extended 1. Introduction Backus-Naur Form (EBNF) 2. Detailed Description of XML œ 3. Structuring œ is equivalent to email ::= head body a) DTDs ò Recursive definitions possible in DTDs b) XML Schema œ 5. Accessing, querying XML documents: XPath 6. Transformations: XSLT

45 Chapter 2 A Semantic Web Primer 46 Chapter 2 A Semantic Web Primer

XML Schema XML Schema (2)

ò Significantly richer language for defining the ò An XML schema is an element with an structure of XML documents opening tag like ò Tts syntax is based on XML itself œ Expand or delete already existent schemas ò Structure of schema elements ò Sophisticated set of data types, compared to DTDs (which only supports strings) œ Element and attribute types using data types

47 Chapter 2 A Semantic Web Primer 48 Chapter 2 A Semantic Web Primer Element Types Attribute Types

maxOccurs="1"/> < attribute name="speaks" type="Language" use="default" value="en"/> Cardinality constraints: ò Existence: use="x", where x may be ò minOccurs="x" (default value 1) optional or required ò maxOccurs="x" (default value 1) ò Default value: use="x" value="...", where x may be default or fixed ò Generalizations of *,?,+ offered by DTDs

49 Chapter 2 A Semantic Web Primer 50 Chapter 2 A Semantic Web Primer

Data Types Data Types (2)

ò There is a variety of built-in data types ò Complex data types are defined from already œ Numerical data types: integer, Short etc. existing data types by defining some attributes (if any) and using: œ String types: string, ID, IDREF, CDATA etc. œ sequence, a sequence of existing œ Date and time data types: time, Month etc. elements (order is important) ò There are also user-defined data types œ all, a collection of elements that must appear œ simple data types, which cannot use elements or (order is not important) attributes œ choice, a collection of elements, of which one will be chosen œ complex data types, which can use these

51 Chapter 2 A Semantic Web Primer 52 Chapter 2 A Semantic Web Primer A Data Type Example Data Type Extension

ò Already existing data types can be extended by new elements or attributes. Example: minOccurs="0" maxOccurs="1"/>

53 Chapter 2 A Semantic Web Primer 54 Chapter 2 A Semantic Web Primer

Resulting Data Type Data Type Extension (2)

ò A hierarchical relationship exists between the œ Instances of the extended type are also instances of the original type neither less information, nor information of the wrong type

55 Chapter 2 A Semantic Web Primer 56 Chapter 2 A Semantic Web Primer Data Type Restriction Example of Data Type Restriction

ò An existing data type may be restricted by adding constraints on certain values ò Restriction is not the opposite from extension ò The following hierarchical relationship still holds: œ Instances of the restricted type are also instances of the œ They satisfy at least the constraints of the original type

57 Chapter 2 A Semantic Web Primer 58 Chapter 2 A Semantic Web Primer

Restriction of Simple Data Types Data Type Restriction: Enumeration

59 Chapter 2 A Semantic Web Primer 60 Chapter 2 A Semantic Web Primer XML Schema: The Email Example XML Schema: The Email Example (2)

minOccurs="1" maxOccurs="unbounded"/>

61 Chapter 2 A Semantic Web Primer 62 Chapter 2 A Semantic Web Primer

XML Schema: The Email Example (3) Lecture Outline

1. Introduction 3. Structuring b) XML Schema 4. Namespaces 5. Accessing, querying XML documents: XPath ò Similar for bodyType 6. Transformations: XSLT

63 Chapter 2 A Semantic Web Primer 64 Chapter 2 A Semantic Web Primer Namespaces An Example

ò Since each structuring document was

65 Chapter 2 A Semantic Web Primer 66 Chapter 2 A Semantic Web Primer

Namespace Declarations Lecture Outline

ò Namespaces are declared within an element 1. Introduction and can be used in that element and any of 2. Detailed Description of XML its children (elements and attributes) 3. Structuring ò A declaration has the form: a) DTDs œ xmlns:prefix="location" b) XML Schema œ location is the address of the DTD or schema 4. Namespaces ò If a prefix is not specified: xmlns="location" 5. Accessing, querying XML documents: XPath then the location is used by default 6. Transformations: XSLT

67 Chapter 2 A Semantic Web Primer 68 Chapter 2 A Semantic Web Primer Addressing and Querying XML Documents XPath

ò In relational , parts of a ò XPath is core for XML query languages can be selected and retrieved using SQL ò Language for addressing parts of an XML œ Same necessary for XML documents document. œ Query languages: XQuery, XQL, XML-QL œ It operates on the tree of XML ò The central concept of XML query languages is a path expression œ It has a non-XML syntax œ Specifies how a node or a set of nodes, in the tree representation of the XML document can be reached

69 Chapter 2 A Semantic Web Primer 70 Chapter 2 A Semantic Web Primer

Types of Path Expressions An XML Example

ò Absolute (starting at the root of the tree) œ Syntactically they begin with the symbol / œ It refers to the root of the document (situated one level above the root element of the document) ò Relative to a context node

71 Chapter 2 A Semantic Web Primer 72 Chapter 2 A Semantic Web Primer Examples of Path Expressions in Tree Representation XPath

ò Address all author elements /library/author

ò Addresses all author elements that are children of the library element node, which resides immediately below the root ò /t1/.../tn, where each ti+1 is a child node of ti, is a path through the tree representation

73 Chapter 2 A Semantic Web Primer 74 Chapter 2 A Semantic Web Primer

Examples of Path Expressions in Examples of Path Expressions in XPath (2) XPath (3)

ò Address all author elements ò Address the location attribute nodes within //author library element nodes ò Here // says that we should consider all /library/@location elements in the document and check ò The symbol @ is used to denote attribute whether they are of type author nodes ò This path expression addresses all author elements anywhere in the document

75 Chapter 2 A Semantic Web Primer 76 Chapter 2 A Semantic Web Primer Examples of Path Expressions in Examples of Path Expressions in XPath (4) XPath (5)

ò Address all title attribute nodes within book ò Address all books with title —Artificial Intelligence“ elements anywhere in the document, which /book[@title="Artificial Intelligence"] have the value —Artificial Intelligence“ ò Test within square brackets: a filter expression //book/@title="Artificial Intelligence" œ It restricts the set of addressed nodes. ò Difference with query 4. œ Query 5 addresses book elements, the title of which satisfies a certain condition. œ Query 4 collects title attribute nodes of book elements

77 Chapter 2 A Semantic Web Primer 78 Chapter 2 A Semantic Web Primer

Tree Representation of Query 4 Tree Representation of Query 5

79 Chapter 2 A Semantic Web Primer 80 Chapter 2 A Semantic Web Primer Examples of Path Expressions in XPath (6) General Form of Path Expressions

ò Address the first author element node in the XML ò A path expression consists of a series of document steps, separated by slashes //author[1] ò Address the last book element within the first ò A step consists of author element node in the document œ An axis specifier, //author[1]/book[last()] œ A node test, and ò Address all book element nodes without a title œ An optional predicate attribute //book[not @title]

81 Chapter 2 A Semantic Web Primer 82 Chapter 2 A Semantic Web Primer

General Form of Path Expressions (2) General Form of Path Expressions (3)

ò An axis specifier determines the tree ò A node test specifies which nodes to relationship between the nodes to be address addressed and the context node œ The most common node tests are element names œ E.g. parent, ancestor, child (the default), sibling, œ E.g., * addresses all element nodes attribute node œ comment() addresses all comment nodes œ // is such an axis specifier: descendant or self

83 Chapter 2 A Semantic Web Primer 84 Chapter 2 A Semantic Web Primer General Form of Path Expressions (4) Lecture Outline

ò Predicates (or filter expressions) are 1. Introduction optional and are used to refine the set of 2. Detailed Description of XML addressed nodes 3. Structuring œ E.g., the expression [1] selects the first node a) DTDs œ [position()=last()] selects the last node b) XML Schema œ [position() mod 2 =0] selects the even nodes 4. Namespaces ò XPath has a more complicated full syntax. 5. Accessing, querying XML documents: XPath œ We have only presented the abbreviated syntax 6. Transformations: XSLT

85 Chapter 2 A Semantic Web Primer 86 Chapter 2 A Semantic Web Primer

Displaying XML Documents Style Sheets

ò Style sheets can be written in various Grigoris Antoniou languages University of Bremen œ E.g. CSS2 (cascading style sheets level 2) [email protected] œ XSL (extensible stylesheet language) ò XSL includes may be displayed in different ways: Grigoris Antoniou Grigoris Antoniou œ a transformation language (XSLT) University of Bremen University of Bremen œ a formatting language [email protected] [email protected] œ Both are XML applications

87 Chapter 2 A Semantic Web Primer 88 Chapter 2 A Semantic Web Primer XSL Transformations (XSLT) XSLT (2)

ò XSLT specifies rules with which an input XML ò Move data and from one XML document is transformed to representation to another œ another XML document ò XSLT is chosen when applications that use different œ an HTML document DTDs or schemas need to communicate œ ò XSLT can be used for machine processing of content ò The output document may use the same DTD or without any regard to displaying the information for schema, or a completely different vocabulary people to read. ò XSLT can be used independently of the formatting ò In the following we use XSLT only to display XML language documents

89 Chapter 2 A Semantic Web Primer 90 Chapter 2 A Semantic Web Primer

XSLT Transformation into HTML Style Sheet Output

An author An author
Grigoris Antoniou

University of Bremen
[email protected]

91 Chapter 2 A Semantic Web Primer 92 Chapter 2 A Semantic Web Primer Observations About XSLT A Template

ò XSLT documents are XML documents œ XSLT resides on top of XML An author ò The XSLT document defines a template œ In this case an HTML document, with some ...
placeholders for content to be inserted ...
ò xsl:value-of retrieves the value of an ... element and copies it into the output document œ It places some content into the template

93 Chapter 2 A Semantic Web Primer 94 Chapter 2 A Semantic Web Primer

Auxiliary Templates Example of an Auxiliary Template

ò We have an XML document with details of several authors Grigoris Antoniou University of Bremen ò It is a waste of effort to treat each author [email protected] element separately ò In such cases, a special template is defined for David Billington author elements, which is used by the main Griffith University template [email protected]

95 Chapter 2 A Semantic Web Primer 96 Chapter 2 A Semantic Web Primer Example of an Auxiliary Template (2) Example of an Auxiliary Template (3)

Authors

select="affiliation"/>
Email:

97 Chapter 2 A Semantic Web Primer 98 Chapter 2 A Semantic Web Primer

Multiple Authors Output Explanation of the Example

ò xsl:apply-templates element causes all children of Authors the context node to be matched against the selected

Grigoris Antoniou

path expression Affiliation: University of Bremen
œ E.g., if the current template applies to /, then the element Email: [email protected] xsl:apply-templates applies to the root element

œ I.e. the authors element (/ is located above the root

David Billington

element) Affiliation: Griffith University
œ If the current context node is the authors element, then the Email: [email protected] element xsl:apply-templates select="author" causes the

template for the author elements to be applied to all author children of the authors element

99 Chapter 2 A Semantic Web Primer 100 Chapter 2 A Semantic Web Primer Explanation of the Example (2) Processing XML Attributes

ò It is good practice to define a template for ò Suppose we wish to transform to itself the element: each element type in the document œ Even if no specific processing is applied to certain elements, the xsl:apply-templates element ò Wrong solution: should be used œ E.g. authors " leaves of the tree, and all templates are lastname=""/> applied

101 Chapter 2 A Semantic Web Primer 102 Chapter 2 A Semantic Web Primer

Transforming an XML Document to Processing XML Attributes (2) Another

ò Not well-formed because tags are not allowed within the values of attributes ò We wish to add attribute values into template

103 Chapter 2 A Semantic Web Primer 104 Chapter 2 A Semantic Web Primer Transforming an XML Document to Transforming an XML Document to Another (2) Another (3)

105 Chapter 2 A Semantic Web Primer 106 Chapter 2 A Semantic Web Primer

Points for Discussion in Subsequent Summary Chapters

ò XML is a metalanguage that allows users to ò The nesting of tags does not have standard meaning define markup ò The of XML documents is not accessible ò XML separates content and structure from to machines, only to people formatting ò Collaboration and exchange are supported if there is ò XML is the for the underlying shared understanding of the vocabulary representation and exchange of structured ò XML is well-suited for close collaboration, where information on the Web domain- or community-based vocabularies are used œ It is not so well-suited for global communication. ò XML is supported by query languages

107 Chapter 2 A Semantic Web Primer 108 Chapter 2 A Semantic Web Primer