<<

Objectives

An Introduction to XML and Web Technologies ƒ How XML generalizes relational ƒ The XQuery language Querying XML Documents ƒ How XML may be supported in databases with XQuery

Anders Møller & Michael I. Schwartzbach © 2006 Addison-Wesley

An Introduction to XML and Web Technologies 2

XQuery 1.0 From Relations to Trees

ƒ XML documents naturally generalize relations ƒ XQuery is the corresponding generalization of SQL

An Introduction to XML and Web Technologies 3 An Introduction to XML and Web Technologies 4

1 Only Some Trees are Relations Trees Are Not Relations

ƒ They have height two ƒ Not all trees satisfy the previous characterization ƒ The root has an unbounded number of children ƒ Trees are ordered, while both rows and columns ƒ All nodes in the second layer (records) have a of tables may be permuted without changing the fixed number of child nodes (fields) meaning of the data

An Introduction to XML and Web Technologies 5 An Introduction to XML and Web Technologies 6

A Student Database A More Natural Model (1/2)

Joe Average 21 Biology

An Introduction to XML and Web Technologies 7 An Introduction to XML and Web Technologies 8

2 A More Natural Model (2/2) Usage Scenario: Data-Oriented

ƒ We want to carry over the kinds of queries that we Jack Doe 18 performed in the original relational model Physics XML Science

An Introduction to XML and Web Technologies 9 An Introduction to XML and Web Technologies 10

Usage Scenario: Document-Oriented Usage Scenario: Programming

ƒ Queries could be used ƒ Queries could be used to automatically generate • to retrieve parts of documents documentation • to provide dynamic indexes • to perform context-sensitive searching • to generate new documents as combinations of existing documents

An Introduction to XML and Web Technologies 11 An Introduction to XML and Web Technologies 12

3 Usage Scenario: Hybrid XQuery Design Requirements

ƒ Queries could be used to data mine hybrid data, ƒ Must have at least one XML syntax and at least such as patient records one human-readable syntax ƒ Must be declarative ƒ Must be namespace aware ƒ Must coordinate with XML Schema ƒ Must support simple and complex datatypes ƒ Must combine information from multiple documents ƒ Must be able to transform and create XML trees

An Introduction to XML and Web Technologies 13 An Introduction to XML and Web Technologies 14

Relationship to XPath Relationship to XSLT

ƒ XQuery 1.0 is a strict superset of XPath 2.0 ƒ XQuery and XSLT are both domain-specific ƒ Every XPath 2.0 expression is directly an XQuery languages for combining and transforming XML 1.0 expression (a query) data from multiple sources ƒ The extra expressive power is the ability to ƒ They are vastly different in design, partly for • join information from different sources and historical reasons • generate new XML fragments ƒ XQuery is designed from scratch, XSLT is an intellectual descendant of CSS ƒ Technically, they may emulate each other

An Introduction to XML and Web Technologies 15 An Introduction to XML and Web Technologies 16

4 XQuery More From the Prolog

ƒ Like XPath expressions, XQuery expressions are declare default element namespace URI; evaluated relatively to a context declare default function namespace URI; import schema at URI; ƒ This is explicitly provided by a prolog declare namespace NCName = URI; ƒ Settings define various parameters for the XQuery processor language, such as:

xquery version "1.0"; declare xmlspace preserve; declare xmlspace strip;

An Introduction to XML and Web Technologies 17 An Introduction to XML and Web Technologies 18

Implicit Declarations XPath Expressions

declare namespace = ƒ XPath expressions are also XQuery expressions "http://www.w3.org/XML/1998/namespace"; declare namespace xs = ƒ The XQuery prolog gives the required static "http://www.w3.org/2001/XMLSchema"; context declare namespace xsi = "http://www.w3.org/2001/XMLSchema-instance"; ƒ The initial context node, position, and size are declare namespace fn = undefined "http://www.w3.org/2005/11/xpath-functions"; declare namespace xdt = "http://www.w3.org/2005/11/xpath-datatypes"; declare namespace local = "http://www.w3.org/2005/11/xquery-local-functions";

An Introduction to XML and Web Technologies 19 An Introduction to XML and Web Technologies 20

5 Datatype Expressions XML Expressions

ƒ Same atomic values as XPath 2.0 ƒ XQuery expressions may compute new XML ƒ Also lots of primitive simple values: nodes xs:string("XML is fun") ƒ Expressions may denote element, character data, xs:boolean("true") xs:decimal("3.1415") comment, and nodes xs:float("6.02214199E23") xs:dateTime("1999-05-31T13:20:00-05:00") ƒ Each node is created with a unique node identity xs:time("13:20:00-05:00") ƒ Constructors may be either direct or computed xs:date("1999-05-31") xs:gYearMonth("1999-05") xs:gYear("1999") xs:hexBinary("48656c6c6f0a") xs:base64Binary("SGVsbG8K") xs:anyURI("http://www.brics.dk/ixwt/") xs:QName("rcp:recipe")

An Introduction to XML and Web Technologies 21 An Introduction to XML and Web Technologies 22

Direct Constructors Namespaces in Constructors (1/3)

ƒ Uses the standard XML syntax declare default element namespace "http://businesscard.org"; ƒ The expression John Doe CEO, Widget Inc. baz [email protected] (202) 555-1414 ƒ evaluates to the given XML fragment ƒ Note that is ƒ evaluates to false

An Introduction to XML and Web Technologies 23 An Introduction to XML and Web Technologies 24

6 Namespaces in Constructors (2/3) Namespaces in Constructors (3/3)

declare namespace b = "http://businesscard.org"; John Doe John Doe CEO, Widget Inc. CEO, Widget Inc. [email protected] [email protected] (202) 555-1414 (202) 555-1414

An Introduction to XML and Web Technologies 25 An Introduction to XML and Web Technologies 26

Enclosed Expressions Explicit Constructors

1 2 3 4 5 {1, 2, 3, 4, 5} John Doe CEO, Widget Inc. {1, "2", 3, 4, 5} [email protected] {1 to 5} (202) 555-1414 1 {1+1} {" "} {"3"} {" "} {4 to 5}

element card { namespace { "http://businesscard.org" }, element name { text { "John Doe" } }, element title { text { "CEO, Widget Inc." } } , element email { text { "[email protected]" } }, element phone { text { "(202) 555-1414" } }, element logo { attribute uri { "widget.gif" } } }

An Introduction to XML and Web Technologies 27 An Introduction to XML and Web Technologies 28

7 Computed QNames Biliingual Business Cards

element { if ($lang="Danish") then "kort" else "card" } { element { "card" } { namespace { "http://businesscard.org" }, namespace { "http://businesscard.org" }, element { if ($lang="Danish") then "navn" else "name" } element { "name" } { text { "John Doe" } }, { text { "John Doe" } }, element { "title" } { text { "CEO, Widget Inc." } }, element { if ($lang="Danish") then "titel" else "title" } element { "email" } { text { "[email protected]" } }, { text { "CEO, Widget Inc." } }, element { "phone" } { text { "(202) 555-1414" } }, element { "email" } element { "logo" } { { text { "[email protected]" } }, attribute { "uri" } { "widget.gif" } element { if ($lang="Danish") then "telefon" else "phone"} } { text { "(202) 456-1414" } }, } element { "logo" } { attribute { "uri" } { "widget.gif" } } }

An Introduction to XML and Web Technologies 29 An Introduction to XML and Web Technologies 30

FLWOR Expressions The Difference Between For and Let (1/4)

ƒ Used for general queries: for $x in (1, 2, 3, 4) let $y := ("a", "b", "c") return ($x, $y) { for $s in fn:doc("students.xml")//student let $m := $s/major 1, a, b, c, 2, a, b, c, 3, a, b, c, 4, a, b, c where fn:count($m) ge 2 order by $s/@id return { $s/name/text() } }

An Introduction to XML and Web Technologies 31 An Introduction to XML and Web Technologies 32

8 The Difference Between For and Let (2/4) The Difference Between For and Let (3/4)

let $x in (1, 2, 3, 4) for $x in (1, 2, 3, 4) for $y := ("a", "b", "c") for $y in ("a", "b", "c") return ($x, $y) return ($x, $y)

1, 2, 3, 4, a, 1, 2, 3, 4, b, 1, 2, 3, 4, c 1, a, 1, b, 1, c, 2, a, 2, b, 2, c, 3, a, 3, b, 3, c, 4, a, 4, b, 4, c

An Introduction to XML and Web Technologies 33 An Introduction to XML and Web Technologies 34

The Difference Between For and Let (4/4) Computing Joins

let $x := (1, 2, 3, 4) ƒ What recipes can we (sort of) make? let $y := ("a", "b", "c") return ($x, $y) declare namespace rcp = "http://www.brics.dk/ixwt/recipes"; for $r in fn:doc("recipes.xml")//rcp:recipe for $i in $r//rcp:ingredient/@name 1, 2, 3, 4, a, b, c for $s in fn:doc("fridge.xml")//stuff[text()=$i] return $r/rcp:title/text()

eggs olive oil ketchup unrecognizable moldy thing

An Introduction to XML and Web Technologies 35 An Introduction to XML and Web Technologies 36

9 Inverting a Relation Sorting the Results

declare namespace rcp = "http://www.brics.dk/ixwt/recipes"; declare namespace rcp = "http://www.brics.dk/ixwt/recipes"; { for $i in distinct-values( { for $i in distinct-values( fn:doc("recipes.xml")//rcp:ingredient/@name fn:doc("recipes.xml")//rcp:ingredient/@name ) ) order by $i return return { for $r in fn:doc("recipes.xml")//rcp:recipe { for $r in fn:doc("recipes.xml")//rcp:recipe where $r//rcp:ingredient[@name=$i] where $r//rcp:ingredient[@name=$i] return $r/rcp:title/text() order by $r/rcp:title/text() } return $r/rcp:title/text() } } }

An Introduction to XML and Web Technologies 37 An Introduction to XML and Web Technologies 38

A More Complicated Sorting Using Functions

declare function local:grade($g) { for $s in document("students.xml")//student if ($g="A") then 4.0 else if ($g="A-") then 3.7 order by else if ($g="B+") then 3.3 else if ($g="B") then 3.0 fn:count($s/results/result[fn:contains(@grade,"A")]) descending, else if ($g="B-") then 2.7 else if ($g="C+") then 2.3 fn:count($s/major) descending, else if ($g="C") then 2.0 else if ($g="C-") then 1.7 xs:integer($s/age/text()) ascending else if ($g="D+") then 1.3 else if ($g="D") then 1.0 return $s/name/text() else if ($g="D-") then 0.7 else 0 };

declare function local:gpa($s) { fn:avg(for $g in $s/results/result/@grade return local:grade($g)) };

{ for $s in fn:doc("students.xml")//student return }

An Introduction to XML and Web Technologies 39 An Introduction to XML and Web Technologies 40

10 A Height Function A Textual Outline

declare function local:height($x) { Cailles en Sarcophages if (fn:empty($x/*)) then 1 pastry else fn:max(for $y in $x/* return local:height($y))+1 chilled unsalted butter flour }; salt ice water filling baked chicken marinated chicken small chickens, cut up Herbes de Provence dry white wine orange juice minced garlic truffle oil ...

An Introduction to XML and Web Technologies 41 An Introduction to XML and Web Technologies 42

Computing Textual Outlines Sequence Types

declare namespace rcp = "http://www.brics.dk/ixwt/recipes"; 2 instance of xs:integer declare function local:ingredients($i,$p) { 2 instance of item() fn:string-join( 2 instance of xs:integer? for $j in $i/rcp:ingredient () instance of empty() return fn:string-join(($p,$j/@name," () instance of xs:integer* ",local:ingredients($j,fn:concat($p," "))),""),"") (1,2,3,4) instance of xs:integer* }; (1,2,3,4) instance of xs:integer+ instance of item() declare function local:recipes($r) { instance of node() fn:concat($r/rcp:title/text()," instance of element() ",local:ingredients($r," ")) instance of element(foo) }; instance of element(foo) /@bar instance of attribute() fn:string-join( /@bar instance of attribute(bar) for $r in fn:doc("recipes.xml")//rcp:recipe[5] fn:doc("recipes.xml")//rcp:ingredient instance of element()+ return local:recipes($r),"" fn:doc("recipes.xml")//rcp:ingredient ) instance of element(rcp:ingredient)+

An Introduction to XML and Web Technologies 43 An Introduction to XML and Web Technologies 44

11 An Untyped Function A Default Typed Function

declare function local:grade($g) { declare function local:grade($g as item()*) as item()* { if ($g="A") then 4.0 else if ($g="A-") then 3.7 if ($g="A") then 4.0 else if ($g="A-") then 3.7 else if ($g="B+") then 3.3 else if ($g="B") then 3.0 else if ($g="B+") then 3.3 else if ($g="B") then 3.0 else if ($g="B-") then 2.7 else if ($g="C+") then 2.3 else if ($g="B-") then 2.7 else if ($g="C+") then 2.3 else if ($g="C") then 2.0 else if ($g="C-") then 1.7 else if ($g="C") then 2.0 else if ($g="C-") then 1.7 else if ($g="D+") then 1.3 else if ($g="D") then 1.0 else if ($g="D+") then 1.3 else if ($g="D") then 1.0 else if ($g="D-") then 0.7 else 0 else if ($g="D-") then 0.7 else 0 }; };

An Introduction to XML and Web Technologies 45 An Introduction to XML and Web Technologies 46

A Precisely Typed Function Another Typed Function

declare function local:grade($g as xs:string) as xs:decimal { declare function local:grades($s as element(students)) if ($g="A") then 4.0 else if ($g="A-") then 3.7 as attribute(grade)* { else if ($g="B+") then 3.3 else if ($g="B") then 3.0 $s/student/results/result/@grade else if ($g="B-") then 2.7 else if ($g="C+") then 2.3 }; else if ($g="C") then 2.0 else if ($g="C-") then 1.7 else if ($g="D+") then 1.3 else if ($g="D") then 1.0 else if ($g="D-") then 0.7 else 0 };

An Introduction to XML and Web Technologies 47 An Introduction to XML and Web Technologies 48

12 Runtime Type Checks Built-In Functions Have Signatures

ƒ Type annotations are checked during runtime fn:contains($x as xs:string?, $y as xs:string?) as xs:boolean ƒ A runtime type error is provoked when • an actual argument value does not match the declared type op:union($x as node()*, $y as node()*) as node()* • a function result value does not match the declared type • a valued assigned to a variable does not match the declared type

An Introduction to XML and Web Technologies 49 An Introduction to XML and Web Technologies 50

XQueryX XML Databases

for $t in fn:doc("recipes.xml")/rcp:collection/rcp:recipe/rcp:title ƒ How can XML and databases be merged? return $t

ƒ Several different approaches: xsi:schemaLocation="http://www.w3.org/2003/12/XQueryX xqueryx.xsd">child xqx:nodeName> •extract XML views of relations rcp:title rcp:collection • use SQL to generate XML child • shred XML into relational databases t rcp:recipe xsi:type="xqx:pathExpr"> t doc child recipes.xml

An Introduction to XML and Web Technologies 51 An Introduction to XML and Web Technologies 52

13 The Student Database Again Automatic XML Views (1/2)

An Introduction to XML and Web Technologies 53 An Introduction to XML and Web Technologies 54

Automatic XML Views (2/2) Programmable Views

xmlelement(name, "Students", select xmlelement(name, 100026 "record", Joe Average xmlattributes(s.id, s.name, s.age)) from Students 21 ) 100078 xmlelement(name, "Students", Jack Doe select xmlelement(name, 18 "record", xmlforest(s.id, s.name, s.age)) from Students )

An Introduction to XML and Web Technologies 55 An Introduction to XML and Web Technologies 56

14 XML Shredding From XQuery to SQL

ƒ Each element type is represented by a relation ƒ Any XML document can be faithfully represented ƒ Each element node is assigned a unique key in ƒ This takes advantage of the existing database document order implementation ƒ Each element node contains the key of its parent ƒ Queries must now be phrased in ordinary SQL ƒ The possible attributes are represented as fields, rather than XQuery where absent attributes have the null value ƒ But an automatic translation is possible ƒ Contents consisting of a single character data //rcp:ingredient[@name="butter"]/@amount node is inlined as a field select ingredient.amount from ingredient where ingredient.name="butter"

An Introduction to XML and Web Technologies 57 An Introduction to XML and Web Technologies 58

Summary Essential Online Resources

ƒ XML trees generalize relational tables ƒ http://www.w3.org/TR/xquery/ ƒ XQuery similarly generalizes SQL ƒ http://www.galaxquery.org/ ƒ http://www.w3.org/XML/Query/ ƒ XQuery and XSLT have roughly the same expressive power ƒ But they are suited for different application domains: data-centric vs. document-centric

An Introduction to XML and Web Technologies 59 An Introduction to XML and Web Technologies 60

15