16.12.2009

8. Updates + XSLT

XML Databases
8. Updates + XSLT, 16.12.09
Silke Eckstein, Andreas Kupfer
Institut für Informationssysteme
Technische Universität Braunschweig
http://www.ifis.cs.tu-bs.de

8.1 Introduction
8.2 Full document replacement
8.3 XQuery Update Facility
8.4 XSLT & the XSLTRANSFORM function
8.5 Overview
8.6 References

XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig 2

8.1 Introduction

• Three general techniques for modifying XML documents:
  – Full document replacement
    • Replace existing document with an updated one
  – XQuery Update Facility
    • Standardized extension to XQuery
    • Modify, insert or delete individual elements and attributes within an XML document
  – Extensible Stylesheet Language Transformation (XSLT)
    • Apply a style sheet to an XML document
    • Use XSLTRANSFORM function to do this in SQL statements

[NK09]

8.2 Full document replacement

• Replacing a full XML document
  – Use regular SQL UPDATE statement to replace a full XML document in a table with a new document
    • treats XML document as a "black box"
    • application needs to provide new document
  – UPDATE statement needs to select a single row
    • predicate on the relational columns of the table
    • predicates on an XML element value
    • predicates on an XML attribute value
    • predicates on XML and relational values
  – New documents can be provided via parameter markers

• Example
    UPDATE customer
    SET info = '
      Larry Trotter
      5 Rosewood
      ...
      416-555-1358'
    WHERE cid = 1000;

[NK09]

• Using parameter markers or host variables
  – to provide the new XML document:
      UPDATE customer SET info = ? WHERE cid = 1000
      UPDATE customer SET info = :hvar WHERE cid = 1000
  – ... and to provide the relational value:
      UPDATE customer SET info = ? WHERE cid = ?
      UPDATE customer SET info = :hvar WHERE cid = :hvar2

• Replacing an existing XML document with a NULL value
  – removes the document from the row without deleting the row:
      UPDATE customer SET info = NULL WHERE cid = 1000

• 2 more examples:
    UPDATE customer
    SET info = ?
    WHERE XMLEXISTS('$INFO/customerinfo[name = "Larry Trotter"]')
    AND cid = 1000;

    UPDATE customer
    SET info = ?
    WHERE XMLEXISTS('$INFO/customerinfo/phone[type = "work" and text()="416-555-1358"]');
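The parameter-marker pattern for full document replacement can be tried out with any SQL engine. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for DB2. This is an assumption made for illustration: SQLite has no XML column type, so `info` is plain TEXT here, and XMLEXISTS predicates are not available. The helper name `replace_document` and the sample documents are hypothetical.

```python
import sqlite3

def replace_document(conn, cid, new_doc):
    """Full document replacement: both the document and the key are parameter markers."""
    cur = conn.execute("UPDATE customer SET info = ? WHERE cid = ?", (new_doc, cid))
    return cur.rowcount  # number of rows whose document was replaced

# In-memory table standing in for the customer table (info is TEXT, not XML).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cid INTEGER PRIMARY KEY, info TEXT)")
conn.execute(
    "INSERT INTO customer VALUES (?, ?)",
    (1000, "<customerinfo><name>Old Name</name></customerinfo>"),
)

# The application has to supply the complete new document.
new_doc = "<customerinfo><name>Larry Trotter</name></customerinfo>"
replaced = replace_document(conn, 1000, new_doc)
stored = conn.execute("SELECT info FROM customer WHERE cid = 1000").fetchone()[0]

# Replacing the document with NULL removes the document but keeps the row.
conn.execute("UPDATE customer SET info = NULL WHERE cid = ?", (1000,))
after_null = conn.execute("SELECT cid, info FROM customer").fetchall()
```

The database never looks inside the document in this sketch, which mirrors the "black box" treatment: the whole value is swapped regardless of how small the actual change is.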

[NK09]

8.3 XQuery Update Facility

• XQuery Update Facility
  – Standardized extension to XQuery
  – Allows modifying, inserting or deleting individual elements or attributes within an XML document
  – Makes updating easier and performs better than full document replacement
  – Allows nodes to be modified in the following ways:
    • Replace the value of a node
    • Replace a node with a new one
    • Insert a new node (at a specific location)
    • Delete a node
    • Rename a node
    • Modify multiple nodes in a document in a single statement
    • Update multiple documents in a single statement


• XQuery Update Facility: New XQuery expressions
    ExprSingle ::= FLWORExpr
                 | QuantifiedExpr
                 | TypeswitchExpr
                 | IfExpr
                 | InsertExpr
                 | DeleteExpr
                 | RenameExpr
                 | ReplaceExpr
                 | TransformExpr
                 | OrExpr
  – Syntax and examples taken from the W3C web site.
  – N.B. Updating expressions (insert, delete, rename, replace) lead to a loss of type/validation information at the affected nodes. Such information may be recovered by revalidation.

• Node insertion
  – An insert expression is an updating expression that inserts copies of zero or more nodes into a designated position with respect to a target node.
  Syntax
    InsertExpr ::= "insert" ("node" | "nodes") SourceExpr InsertExprTargetChoice TargetExpr
    InsertExprTargetChoice ::= (("as" ("first" | "last"))? "into") | "after" | "before"
    SourceExpr ::= ExprSingle
    TargetExpr ::= ExprSingle

[Scholl07]

• Node insertion: Examples
  Insert a year element after the publisher of the first book.
    insert node <year>2005</year>
      after fn:doc("bib.xml")/books/book[1]/publisher
  Navigating by means of several bound variables, insert a new police report into the list of police reports for a particular accident.
    insert node $new-police-report
      as last into fn:doc("insurance.xml")/policies
        /policy[id = $pid]
        /driver[license = $license]
        /accident[date = $accdate]
        /police-reports

• Node deletion
  – A delete expression deletes zero or more nodes from an XDM instance.
  – The keywords node and nodes may be used interchangeably, regardless of how many nodes are actually deleted.
  Syntax
    DeleteExpr ::= "delete" ("node" | "nodes") TargetExpr
    TargetExpr ::= ExprSingle
  Delete the last author of the first book in a given bibliography.
    delete node fn:doc("bib.xml")/books/book[1]/author[last()]
  Delete all email messages that are more than 365 days old.
    delete nodes /email/message
      [fn:current-date() - date > xs:dayTimeDuration("P365D")]
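For readers without an XQuery Update processor at hand, the effect of "insert node ... after" and "delete node" can be imitated on an in-memory tree. The following is a minimal sketch using Python's xml.etree.ElementTree. The sample bibliography and the helper names `insert_after` and `delete_node` are assumptions for illustration. Unlike the XQuery Update Facility, these helpers modify the tree immediately instead of collecting updates first.

```python
import xml.etree.ElementTree as ET

# Hypothetical bibliography; the element names follow the bib.xml examples.
BIB = (
    "<books>"
    "<book><title>A</title><author>X</author><author>Y</author>"
    "<publisher>P1</publisher></book>"
    "<book><title>B</title><author>Z</author><publisher>P2</publisher></book>"
    "</books>"
)

def insert_after(parent, target, new_node):
    """Imitates: insert node new_node after target (target is a child of parent)."""
    index = list(parent).index(target)
    parent.insert(index + 1, new_node)

def delete_node(parent, target):
    """Imitates: delete node target (target is a child of parent)."""
    parent.remove(target)

root = ET.fromstring(BIB)
first_book = root.find("book[1]")

# insert node <year>2005</year> after .../book[1]/publisher
year = ET.Element("year")
year.text = "2005"
insert_after(first_book, first_book.find("publisher"), year)

# delete node .../book[1]/author[last()]
delete_node(first_book, first_book.find("author[last()]"))

tags = [child.tag for child in first_book]
```

ElementTree has no parent pointers, so both helpers take the parent explicitly; in XQuery the target expression alone identifies the position.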

[Scholl07]

• Node replacement
  Syntax
    ReplaceExpr ::= "replace" ("value" "of")? "node" TargetExpr "with" ExprSingle
    TargetExpr ::= ExprSingle
  – Replace takes two forms, depending on whether value of is specified:
    • If value of is not specified, a replace expression replaces one node with a new sequence of zero or more nodes. The replacement nodes occupy the position in the node hierarchy that was formerly occupied by the node that was replaced.
      – Hence, an attribute node can be replaced only by zero or more attribute nodes, and an element, text, comment, or processing instruction node can be replaced only by zero or more element, text, comment, or processing instruction nodes.
    • If value of is specified, a replace expression is used to modify the value of a node while preserving its node identity.

• Node replacement: Examples
  Replace the publisher of the first book with the publisher of the second book.
    replace node fn:doc("bib.xml")/books/book[1]/publisher
      with fn:doc("bib.xml")/books/book[2]/publisher
  Increase the price of the first book by ten percent.
    replace value of node fn:doc("bib.xml")/books/book[1]/price
      with fn:doc("bib.xml")/books/book[1]/price * 1.1

[Scholl07]

• Renaming nodes
  – A rename expression replaces the name property of a data model node with a new QName.
  Syntax
    RenameExpr ::= "rename" "node" TargetExpr "as" NewNameExpr
  Rename the first author element of the first book to principal-author.
    rename node fn:doc("bib.xml")/books/book[1]/author[1]
      as "principal-author"
  Rename the first author element of the first book to the QName that is the value of the variable $newname.
    rename node fn:doc("bib.xml")/books/book[1]/author[1]
      as $newname

• Renaming is local!
  – The effects of a rename expression are limited to its target node; descendants are not affected. Global change of names or namespaces needs explicit iteration.
  Example (Change all QNames from prefix abc to xyz and new namespace URI http://xyz/ns for node $root and its descendants.)
    for $node in $root//abc:*
    let $localName := fn:local-name($node),
        $newQName := fn:concat("xyz:", $localName)
    return (
      rename node $node as fn:QName("http://xyz/ns", $newQName),
      for $attr in $node/@abc:*
      let $attrLocalName := fn:local-name($attr),
          $attrNewQName := fn:concat("xyz:", $attrLocalName)
      return
        rename node $attr as fn:QName("http://xyz/ns", $attrNewQName)
    )
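The locality of renaming can be observed on any tree API. The following is a minimal sketch using Python's xml.etree.ElementTree, where a qualified name is written as `{namespace-uri}local-name`. The namespace URI bound to the prefix abc, the sample document, and the helper names `rename` and `rename_namespace` are assumptions for illustration. Note that this sketch also visits the root element itself, not only its descendants.

```python
import xml.etree.ElementTree as ET

OLD_NS = "http://abc/ns"  # hypothetical URI for the prefix abc
NEW_NS = "http://xyz/ns"

DOC = f'<abc:a xmlns:abc="{OLD_NS}" abc:id="1"><abc:b><abc:c/></abc:b></abc:a>'

def rename(node, new_tag):
    """Renames only the target node; descendants keep their names."""
    node.tag = new_tag

def rename_namespace(root, old_ns, new_ns):
    """Explicit iteration over root and all descendants (elements and attributes)."""
    old_prefix = "{" + old_ns + "}"
    new_prefix = "{" + new_ns + "}"
    for node in root.iter():
        if node.tag.startswith(old_prefix):
            node.tag = new_prefix + node.tag[len(old_prefix):]
        for name in list(node.attrib):
            if name.startswith(old_prefix):
                node.attrib[new_prefix + name[len(old_prefix):]] = node.attrib.pop(name)

# Local rename: only the root changes its namespace.
local = ET.fromstring(DOC)
rename(local, "{" + NEW_NS + "}a")
local_tags = [n.tag for n in local.iter()]

# Global rename: every element and attribute is visited explicitly.
full = ET.fromstring(DOC)
rename_namespace(full, OLD_NS, NEW_NS)
full_tags = [n.tag for n in full.iter()]
```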

[Scholl07]

• Node transformation
  – ... creates modified copies of existing nodes. Each copied node obtains a new node identity. The resulting XDM instance can contain both newly created and previously existing nodes. Node transformation is a non-updating expression, since it does not modify existing nodes!
  Syntax
    TransformExpr ::= "copy" "$"VarName ":=" ExprSingle
                      ("," "$"VarName ":=" ExprSingle)*
                      "modify" ExprSingle
                      "return" ExprSingle
  – Idea:
    1. Bind variables of copy clause (non-updating expressions),
    2. update copies (only!) as per modify clause,
    3. construct result by return (copied/modified and/or other nodes).

• Node transformation: Examples
  Return a sequence consisting of all employee elements that have Java as a skill, excluding their salary child-elements.
    for $e in //employee[skill = "Java"]
    return
      copy $je := $e
      modify delete node $je/salary
      return $je
  Copy a node, modify copy, then return original and modified copy.
    let $oldx := /a/b/x
    return
      copy $newx := $oldx
      modify (rename node $newx as "newx",
              replace value of node $newx with $newx * 2)
      return ($oldx, $newx)
  – N.B. Underlying persistent data not changed by these examples!
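The three steps of the copy/modify/return idea map directly onto a deep copy followed by in-place edits of the copy. The following is a minimal sketch using Python's copy module and xml.etree.ElementTree. The staff document and the function name `java_employees_without_salary` are assumptions for illustration.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical employee data.
STAFF = (
    "<staff>"
    "<employee><name>Ann</name><skill>Java</skill><salary>100</salary></employee>"
    "<employee><name>Bob</name><skill>SQL</skill><salary>90</salary></employee>"
    "</staff>"
)

def java_employees_without_salary(root):
    """Returns modified copies; the tree under root is left untouched."""
    result = []
    for e in root.findall(".//employee"):
        if any(s.text == "Java" for s in e.findall("skill")):
            je = copy.deepcopy(e)              # 1. copy $je := $e
            for salary in je.findall("salary"):
                je.remove(salary)              # 2. modify delete node $je/salary
            result.append(je)                  # 3. return $je
    return result

root = ET.fromstring(STAFF)
copies = java_employees_without_salary(root)
```

The original employee element still has its salary child afterwards, which corresponds to the remark that the underlying data is not changed.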

[Scholl07]

• On the semantics of the XQuery Update Facility
  – Formally specifying the exact semantics of the XQuery UF is non-trivial for several reasons:
    • Formal update semantics are always a lot more involved than retrieval semantics.
    • Updates and bulk operations do not go together well (cf. SQL set-oriented updates).
    • XUF uses a notion of "snapshots" and "pending update lists" to work around some of the subtleties.
    • The details are beyond the scope of this lecture.
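The snapshot idea can be illustrated with a toy model: update requests are first collected on a list while all expressions are evaluated against the unchanged data, and only afterwards are they applied. The following is a greatly simplified sketch in Python. The function names `replace_value` and `apply_updates` and the sample document are assumptions for illustration; the actual specification defines several kinds of update primitives, compatibility checks, and an application order that are not modelled here.

```python
import xml.etree.ElementTree as ET

def replace_value(pul, node, new_value):
    """Evaluate now, apply later: record the request on the pending update list."""
    pul.append((node, new_value))

def apply_updates(pul):
    """Apply all recorded requests after evaluation has finished."""
    for node, new_value in pul:
        node.text = new_value

root = ET.fromstring("<r><a>1</a><b>2</b></r>")
a, b = root.find("a"), root.find("b")

pul = []                       # pending update list
replace_value(pul, a, b.text)  # reads b = "2" from the snapshot
replace_value(pul, b, a.text)  # still reads a = "1": nothing has been applied yet
apply_updates(pul)
```

Because both values are read before either update is applied, the two values end up swapped. Applying each update immediately would instead leave both nodes with the value "2".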

[Scholl07]

8.4 XSLT – Intro

• XSL Languages
  – It started with XSL and ended up with XSLT, XPath and XSL-FO
• It started with XSL
  – XSL stands for EXtensible Stylesheet Language.
  – The World Wide Web Consortium (W3C) started to develop XSL because there was a need for an XML-based Stylesheet Language.

• CSS = Style Sheets for HTML
  – HTML uses predefined tags, and the meaning of each tag is well understood.
  – The <table> tag in HTML defines a table - and a browser knows how to display it.
  – Adding styles to HTML elements is simple. Telling a browser to display an element in a special font or color is easy with CSS.


• XSL = Style Sheets for XML
  – XML does not use predefined tags (we can use any tag-names we like), and therefore the meaning of each tag is not well understood.
  – A <table> tag could mean an HTML table, a piece of furniture, or something else - and a browser does not know how to display it.
  – XSL describes how the XML document should be displayed!

• XSL - More Than a Style Sheet Language
  – XSL consists of three parts:
    • XSLT - a language for transforming XML documents
    • XPath - a language for navigating in XML documents
    • XSL-FO - a language for formatting XML documents


8.4 XSLT

• XSLT
  – Extensible Stylesheet Language – Transformations
  – A language to describe transformations from source to target tree structures (= XML documents)
  – A transformation in XSLT
    • Is described by a well-formed XML document called stylesheet
    • Can use elements of the XSLT namespace as well as of other namespaces
    • Contains template rules to execute the transformation

• XSLT processor
  [Figure: the XML document is read as the source tree and the XSLT stylesheet as the stylesheet tree; the transformation process produces the result tree, which is written out as the result document. In short: XML file + XSLT stylesheet → XSLT processor → XML file.]


• Template rules
  – A rule consists of a pattern and a template.
  – The pattern is compared to the nodes of the source document tree.
  – The template can be instantiated to create a part of the target tree. It can contain elements of the XSLT namespace which are instructions to create fragments.

• XSLT processing model
  – By processing a list of source nodes, fragments of the target tree can be created.
  – The list starts with the root node only.
  – A node is processed
    • By selecting the best matching pattern from all rules (resolving any conflicts).
    • The template of the best matching rule is instantiated with the current node as context node.
  – A template usually contains instructions to select further source tree nodes for processing.
  – Recursively repeat the selection of matching rules, instantiation and selection of new source nodes until the list is empty.
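The processing model can be mimicked in a few lines to make the recursion visible. The following is a toy sketch in Python, not an XSLT implementation. Patterns are reduced to plain element names, conflict resolution is replaced by a dictionary lookup, the output is a list of strings instead of a result tree, and the fallback rule only recurses into child elements (it does not copy text). All names (`transform`, `apply_templates`, the sample menu document) are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

def transform(root, rules):
    """rules maps an element name (a very simple pattern) to a template function."""

    def default_rule(node, apply_templates):
        # Simplified fallback rule: just continue with the child elements.
        return apply_templates(list(node))

    def apply_templates(nodes):
        out = []
        for node in nodes:                                 # process the node list
            template = rules.get(node.tag, default_rule)   # select the matching rule
            out.extend(template(node, apply_templates))    # instantiate the template
        return out

    return apply_templates([root])                         # the list starts with the root

def section_template(node, apply_templates):
    # A template that emits output and selects further source nodes.
    return ["Section:"] + apply_templates(node.findall("dish"))

def dish_template(node, apply_templates):
    return ["* " + node.text]

DOC = "<menu><section><dish>Soup</dish><dish>Salad</dish></section><note>x</note></menu>"
rules = {"section": section_template, "dish": dish_template}
output = transform(ET.fromstring(DOC), rules)
```

The menu and note elements have no rule of their own, so they are handled by the fallback; the note contributes nothing because it has no child elements.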


• Structure of a stylesheet
  – Elements and attributes with the XSLT namespace must be recognized by the XSLT processor

• Top level elements
    name = { qname }
    priority = { number }
    mode = { qname }


• A pattern specifies a set of conditions on a node
  – Uses a set of alternative (|-separated) address paths in the child and attribute axis.
  – The use of '/' and '//', 'id' and 'key' functions is possible.
  – Pattern predicates ('[…]') can use all XPath expressions.

• Multiple matching patterns
  – If multiple patterns match a node, the conflict is resolved by priorities (cf. priority attribute)
    • Imported rules have a lower priority than rules of the primary stylesheet
    • Alternatives are processed as if each alternative is defined by a single rule
    • ChildOrAttributeAxisSpecifier::QName patterns have priority 0
    • ChildOrAttributeAxisSpecifier::NCName:* patterns have priority -0.25
    • ChildOrAttributeAxisSpecifier::NodeTest patterns have priority -0.5
    • All other patterns have priority 0.5


• XSLT contains default rules
  – Process the document recursively
  – But have lower priority than rules in the stylesheet
  – Example:

• Rules
  – Can be named and be called in templates of other rules
  – Can have parameters which can be passed along on their invocation; default values can be defined
  – The mode attribute allows a rule to be processed multiple times and with different results
  – If the template is invoked directly with xsl:call-template or xsl:apply-templates, the filter attributes (match, mode, priority or name) are not processed


• Templates
  – Can contain literal elements (non-XSLT elements)
  – If the rule is selected, the template can construct fragments of the result tree.
  – Processing depends on the context.
  – Default behaviour is to write all elements which are not in the XSLT namespace to the result tree.
  – Must be valid XML.
  – Can contain instructions.

• Instructions to process nodes recursively
  – Without the select attribute, all children of the context node are processed
  – Select can be an (XPath) expression to select nodes
    • Could result in non-terminating recursion!


• Instructions to create a node
  – Name attribute is required, but can be calculated
  – Other create instructions are similar
    • xsl:attribute, xsl:attribute-set, xsl:text (to create a text/leaf node with whitespace), xsl:processing-instruction, xsl:comment

• Instructions for flow control
    test = { }
  – Test expression is evaluated and the result is cast to a boolean. If it is true, the template will be instantiated


• Instructions for flow control
• Multiple choice ("if-then-else" / "switch")
  – If multiple xsl:when elements are true, only the first one is processed (no "break" needed)
  – If no xsl:when element is true and there is no xsl:otherwise, no content is created

• Repetition
  – The template is instantiated for each node selected by the node set expression. On instantiation the current node becomes the context node and all selected nodes are the node list.
  – If there is no explicit sort statement, the nodes are processed in document order.


• "Calculation" of output text
  – The selected object is cast to a string value and is inserted as content of the instantiated text node.

• Other statements for sorting, numbering, ...
  – see http://www.w3.org/TR/xslt
• Some advice
  – Denomination "variable" is misleading!
  – Context node is changed by for-each!


[Example: XML source document and transformation output; only these fragments survive]
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> 1 Super Pizza
  4 Summary about 1 Pizzeria s


8.5 Overview

Introduction and Basics
  1. Introduction
  2. XML Basics
  3. Schema Definition
  4. XML Processing
Querying XML
  5. XPath & SQL/XML Queries
  6. XQuery Data Model
  7. XQuery
XML Updates
  8. XML Updates & XSLT
Producing XML
  9. Mapping relational data to XML
Storing XML
  10. XML storage
  11. Relational XML storage
  12. Storage Optimization
Systems
  13. Technology Overview


8.6 References

• "Database-Supported XML Processors", [Gru08]
  – Th. Grust
  – Lecture, Uni Tübingen, WS 08/09
• "XML und Datenbanken", [Tür08]
  – Can Türker
  – Lecture, University of Zürich, 2008
• DB2 pureXML CookBook [NK09]
  – Matthias Nicola and Pav Kumar-Chatterjee
  – IBMPress, 2009, ISBN 9780138150471

Questions, Ideas, Comments

• Now, or ...
• Room: IZ 232
• Office hour: Tuesday, 12:30 – 13:30, or by appointment
• Email: [email protected]
