Breaking the Relational Limit with pureXML in DB2 for z/OS

Guogen Zhang, IBM
August 10, 2012
Session 11787

Agenda
• Introduction and Overview of new XML features in DB2 10
• Tips and hints of using XML in DB2
    • Generating XML test data
    • Enforcing XML schema conformance automatically
    • Sub-document Update
    • CHECK DATA for XML
    • Binary XML format for performance
    • SQL PL Stored Procs and UDFs
    • Data movement, distribution and gathering
    • Simulating a queue
    • Search for documents without a certain element or attribute with indexes
    • Case-insensitive search with XML indexes
    • String pattern matching using regular expressions
    • Flexible parameter passing, simulating arrays and structures
    • Design decision making for XML storage and hybrid storage using triggers
    • Consuming Web services
    • End-user customizable application framework
• Summary

Relational Limit and Pain points
• Static schema
• Add or drop columns
• Normal form: no arrays, nested structures
• Schema evolution: from a single value to multiple values
• New demand:
    • Schema less: free to change
    • Flexible structures
    • XML data

DB2 XML Positioning
[Figure: positioning chart with Schema (Fixed to Flexible/Complex) on one axis and Queries (Predefined to Ad hoc) on the other. Labels: Event/Stream, Search Engine (MapReduce), XML, IMS, IDMS, DB2, Relational.]
• Breaking relational limit
• For application developers, new flexible data model and powerful language to deal with new requirements.
    • Higher productivity
    • Shorter time to market
• For DBAs, use the familiar tools managing XML objects.
    • Manageability for XML data
• z platform: unparalleled reliability, availability, scalability, serviceability, and security.
Rule of thumb:
• No need to query: use LOB/VARCHAR/VARBINARY
• Extensive queries over regular data: use relational
• Some queries on complex or flexible data: use XML

Common XML Usage Scenarios
• Managing XML documents
• Object persistence
• Sparse attributes
• Front-end: bridge to back-end when full normalization is overkill
• Flexible structures: many scenarios
• Flexible parameter passing
• Web services
• Rapid prototyping
• Event logging
• End-user customizable applications/generators

Key XML Features in DB2 10
• XML schema association with XML columns (a.k.a. XML column type modifier) and automatic schema validation for insert/update
• XML schema validation using z/OS XML System Services, 100% zIIP/zAAP redirectable (retrofit to V9)
• Native XML Date and Time support and XML index support
• Basic XML sub-document update
• XML support in SQL PL STP/UDF
• Performance enhancements, including binary XML between client and server
• CHECK DATA for XML, LOAD/UNLOAD
• Multi-versioning for concurrency control
• Post-GA: XQuery support (PM47617, PM47618)

A Few Words about Multi-versioning format
• In DB2 10 NFM, with the base table in UTS, XML will be in multi-versioning (MV) format
• MV format is required to support many new features for XML:
    • XMLMODIFY
    • Currently Committed Read
    • SELECT FROM OLD TABLE for XML
    • No XML locking for readers
    • Temporal data ("AS OF")
• No automatic conversion from non-MV format to MV format (unfortunately)

XQuery Support
• Basic XQuery (PM47617, PM47618)
    • FLWOR expression
    • XQuery constructors
    • If-then-else expression
    • More XQuery functions and operators
    • Control of namespaces and whitespace in constructed XML
• Best used in XMLQuery, replacing complex SQL/XML XMLTABLE-Constructors-XMLAGG in DB2 9
• No top-level XQuery

FLWOR expression
The slide places a FLWOR expression next to the corresponding SELECT statement.

FLWOR:

    for $x in …, $y in …
    let $z := …
    where $x/c = $y/d
    order by $x/c2
    return …

Select stmt:

    WITH y(d1, d2) AS (…)
    Select …
    From T1 x, y
    Where x.c1 = y.d1
    Group by … Having …
    Order By x.c2

XQuery Showcases – Basic FLWOR Constructs
• FLWOR - For, Let, Where, Order By, Return
• Query all items purchased in a purchase order, and order the items in ascending order of total cost

XQuery:

    SELECT XMLQUERY(
      'for $i at $pos in $po/purchaseOrder/items/item
       let $p := $i/USPrice,
           $q := $i/quantity,
           $r := $i/productName
       where xs:decimal( $p ) > 0 and xs:integer( $q ) > 0
       order by $p * $q
       return fn:concat("Item ", $pos , " productName: ", $r ,
                        " price: US$", $p , " quantity: ", $q )
      '
      PASSING PODOC AS "po")
    FROM PURCHASE_ORDER
    WHERE POID = 1;

Input xml doc:

    <purchaseOrder orderDate="2011-05-18">
      <shipTo country="US">
        <name>Alice Smith</name>
        <street>..</street><city>..</city><state>..</state><zip>..</zip>
      </shipTo>
      <billTo country="US"> ..</billTo>
      <items>
        <item partNum="872-AA">
          <productName>Lawnmower </productName>
          <quantity> 1</quantity>
          <USPrice>149.99 </USPrice>
          <shipDate>2011-05-20</shipDate>
        </item>
        <item partNum="926-AA">..</item>
        <item partNum="945-ZG">..</item>
      </items>
    </purchaseOrder>

Output xml doc:

    <?xml version="1.0" encoding="IBM037"?>
    Item 2 productName: Baby Monitor price: US$ 39.98 quantity: 2
    Item 1 productName: Lawnmower price: US$ 149.99 quantity: 1
    Item 3 productName: Sapphire Bracelet price: US$ 178.99 quantity: 2

Conditional expression
• if (<TestExpression>) then (<Expression>) else (<Expression>)
• Query all items purchased in a purchase order, and calculate the shipping cost for each item

XQuery:

    SELECT XMLQUERY(
      'let $s := $po/purchaseOrder/shipTo
       let $b := $s/@country="US" and $s/state="California"
       for $i in $po/purchaseOrder/items/item
       return (
         if ($b)
         then fn:concat( $i/productName , " : shipping cost US$", 8)
         else fn:concat( $i/productName , " : shipping cost US$", 12)
       )
      '
      PASSING PODOC AS "po")
    FROM PURCHASE_ORDER
    WHERE POID = 1;

Input xml doc:

    <purchaseOrder orderDate="2011-05-18">
      <shipTo country="US">
        <name>Alice Smith</name>
        <street>..</street>…<state>California</state><zip>..</zip>
      </shipTo>
      <billTo country="US"> ..</billTo>
      <items>
        <item partNum="872-AA">
          <productName>Lawnmower </productName> ...
        </item>
        <item partNum=..><productName>Baby Monitor </productName>..
        </item>
        <item partNum=..><productName>Sapphire Bracelet </productName>..
        </item>
      </items>
    </purchaseOrder>

Output xml doc:

    <?xml version="1.0" encoding="IBM037"?>
    Lawnmower : shipping cost US$8
    Baby Monitor : shipping cost US$8
    Sapphire Bracelet : shipping cost US$8

Value Comparison and Node Comparison
• Value comparisons compare two atomic values - eq, ne, lt, le, gt, ge
• Node comparisons compare two nodes - is, <<, >>
• Check if the product Lawnmower has the part number 872-AA

XQuery:

    SELECT XMLQUERY(
      'if ($po/purchaseOrder/items/item[@partNum eq "872-AA"]
           is
           $po/purchaseOrder/items/item[productName eq "Lawnmower"])
       then "partNum 872-AA matches the product Lawnmower"
       else "partNum 872-AA DOES NOT match the product Lawnmower"
      '
      PASSING PODOC AS "po")
    FROM PURCHASE_ORDER
    WHERE POID = 1;

Input xml doc:

    <purchaseOrder orderDate="2011-05-18">
      <shipTo country="US">
        <name>Alice Smith</name>
        <street>..</street>…<state>California</state><zip>..</zip>
      </shipTo>
      <billTo country="US"> ..</billTo>
      <items>
        <item partNum="872-AA">
          <productName>Lawnmower</productName> ...
        </item>
        ..
      </items>
    </purchaseOrder>

Output xml doc:

    <?xml version="1.0" encoding="IBM037"?>
    partNum 872-AA matches the product Lawnmower

Castable and fn:avg()
• Castable expressions test whether a value can be cast to a specific data type: Expression castable as TargetType?
• fn:avg (sequence-expression)
• Calculate the average cost of each item

XQuery:

    SELECT XMLQUERY(
      'let $cost := (
         for $i in $po/purchaseOrder/items/item
         let $p := if ($i/USPrice castable as xs:decimal)
                   then xs:decimal($i/USPrice)
                   else 0.0
         let $q := if ($i/quantity castable as xs:integer)
                   then xs:integer($i/quantity)
                   else 0
         return $p * $q )
       return fn:round( fn:avg($cost))
      '
      PASSING PODOC AS "po")
    FROM PURCHASE_ORDER
    WHERE POID = 1;

Input xml doc:

    <purchaseOrder orderDate="2011-05-18">
      <shipTo country="US"> ..
      </shipTo>
      <billTo country="US"> ..</billTo>
      <items>
        <item partNum="872-AA"> ..<quantity> 1</quantity>
          <USPrice>149.99 </USPrice>
        </item>
        <item partNum="926-AA"> ..<quantity> 2</quantity>
          <USPrice>39.98 </USPrice>
        </item>
        <item partNum="945-ZG"> ..<quantity> 2</quantity>
          <USPrice>178.99 </USPrice>
        </item>
      </items>
    </purchaseOrder>

Calculation: (149.99 * 1 + 39.98 * 2 + 178.99 * 2) / 3 = 195.97666667

Output xml doc:

    <?xml version="1.0" encoding="IBM037"?>196

XQuery Constructors
• Reconstruct all items NOT shipped yet in a purchase order

XQuery:

    declare boundary-space strip;
    declare copy-namespaces no-preserve, inherit;
    <shipping orderDate="{$po/purchaseOrder/@orderDate}">
      <shipTo>{ let $s := $po/purchaseOrder/shipTo
                return fn:concat($s/state, " ", $s/zip, " ", $s/@country)
      }
      </shipTo>
      {for $i in $po/purchaseOrder/items/item
       where fn:not($i/shipDate castable as xs:date) or
             xs:date($i/shipDate) > fn:current-date()
       return <item partNum="{$i/@partNum}">
                <partName>{$i/productName/text()}</partName>
                { $i/quantity , $i/USPrice }
              </item>
      }
    </shipping>

The query is passed to XMLQUERY as before:

    SELECT XMLQUERY('…' PASSING PO AS "po")
    FROM PURCHASEORDERS
    WHERE …

Input xml doc:

    <purchaseOrder orderDate="2011-05-18">
      <shipTo country="US"> ..
        <street>..</street><city>..</city><state>..</state><zip>..</zip>
      </shipTo>
      <billTo country="US"> ..</billTo>
      <items>
        <item partNum="872-AA"> ..
          <shipDate>2011-05-20</shipDate>
        </item>
        <item partNum="926-AA">..
          <shipDate>2011-05-22</shipDate>
        </item>
        <item partNum="945-ZG">
          <productName>Sapphire Bracelet</productName>
          <quantity>2</quantity>
          <USPrice>178.99</USPrice>
          <comment>Not shipped</comment>
        </item>
      </items>
    </purchaseOrder>

Output xml doc:

    <shipping orderDate="2011-05-18">
      <shipTo>California 95123 US</shipTo>
      <item partNum="945-ZG">
        <partName>Sapphire Bracelet</partName>
        <quantity>2</quantity>
        <USPrice>178.99</USPrice>
      </item>
    </shipping>

Tricks in Generating XML test data
• Generate from relational data using constructors
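
Generating XML from relational data with constructors could look roughly like the sketch below. It is an illustration only, not the query from the original session: the tables ORDERS and ORDER_ITEMS and all of their column names are hypothetical, chosen to mirror the purchase order documents shown earlier. XMLELEMENT, XMLATTRIBUTES, XMLFOREST, and XMLAGG are the SQL/XML publishing functions; the join with GROUP BY produces one document per order.

```sql
-- Assumed (hypothetical) source tables:
--   ORDERS(POID, ORDER_DATE)
--   ORDER_ITEMS(POID, PART_NUM, PRODUCT_NAME, QUANTITY, US_PRICE)
SELECT O.POID,
       XMLELEMENT(NAME "purchaseOrder",
         -- Column value becomes the orderDate attribute
         XMLATTRIBUTES(O.ORDER_DATE AS "orderDate"),
         XMLELEMENT(NAME "items",
           -- XMLAGG gathers one <item> per joined row into a sequence
           XMLAGG(
             XMLELEMENT(NAME "item",
               XMLATTRIBUTES(I.PART_NUM AS "partNum"),
               -- XMLFOREST emits one child element per listed column
               XMLFOREST(I.PRODUCT_NAME AS "productName",
                         I.QUANTITY     AS "quantity",
                         I.US_PRICE     AS "USPrice"))
             ORDER BY I.PART_NUM))) AS PODOC
FROM ORDERS O
JOIN ORDER_ITEMS I
  ON I.POID = O.POID
GROUP BY O.POID, O.ORDER_DATE;
```

Because the rows come from existing relational tables, the volume and variety of the generated documents follow the relational test data already at hand.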
