Unleash SQL Power to Your XML Data


Session: G06
Unleash SQL Power to your XML Data
Matthias Nicola
IBM Silicon Valley Lab
May 20, 2008 • 10:30 a.m. – 11:30 a.m.
Platform: DB2 for Linux, UNIX, Windows and DB2 for z/OS

SQL is no longer the purely relational language that it used to be. The latest SQL standard defines an XML data type and new functions for querying hierarchical XML data. This is supported by DB2's pureXML functionality on z/OS and on Linux, UNIX, and Windows. Apart from efficient storage and indexing, pureXML allows you to search and extract XML values through the new SQL functions. This also enables you to join XML and relational data in a single query, and to convert one to the other as needed. Additionally, you can apply the full power of all the good old relational SQL features to your XML data. This session introduces you to querying XML data in the DB2 9 family and presents tips & tricks for getting the best of both worlds when querying XML and relational data.

Matthias Nicola is a Senior Software Engineer for DB2 pureXML at IBM's Silicon Valley Lab. His work focuses on all aspects of XML in DB2, including XQuery, SQL/XML, indexing and performance. Matthias works closely with the DB2 development teams as well as with customers and business partners who are using XML, assisting them in the design, implementation, and optimization of XML solutions. Prior to joining IBM, Matthias worked on data warehousing performance for Informix Software. He also worked in research and industry projects on distributed and replicated databases. He received his doctorate in computer science from the Technical University of Aachen, Germany.

Key Points
• Learn the basics of XPath and how to use XPath embedded in SQL to extract XML document fragments or to express predicates on XML data.
• Learn how to query XML data in DB2, using the new SQL functions XMLEXISTS, XMLQUERY, and XMLTABLE.
• Learn how to retrieve XML data in relational format, and vice versa.
• Learn how to define relational views over XML to make your XML data available to your existing SQL applications.
• Learn fundamental guidelines for writing efficient queries, and for integrating XML and relational data.

• XPath, and how to embed XPath in SQL
• New SQL functions: XMLQUERY, XMLTABLE, XMLEXISTS
• Retrieve XML data in relational format, and vice versa
• Apply SQL functions to XML data
• Best practices for writing hybrid XML/relational queries

Agenda
• Brief Recap: DB2 9 pureXML
• XPath Expressions
• Combining SQL and XPath
• SQL/XML functions XMLQUERY and XMLEXISTS
• From XML to SQL Types: XMLCAST
• Applying SQL functions to XML Data
• Returning XML in relational format: XMLTABLE
• Summary

XML Storage: Old and New (DB2 V8 vs. DB2 9)
• Unstructured storage (XML as text): XML documents are kept in a CLOB column, i.e. XML as text. Selected elements/attributes are extracted into index side tables or indexes. Any sub-document level access requires XML parsing, which is slow.
• XML shredding (XML → relational): a fixed mapping turns XML documents into regular relational tables. The mapping prevents XML schema changes, and is often too complex. XML reconstruction is slow.
• DB2 9 pureXML (XML as XML): XML documents are stored in an XML column. Maximum flexibility and performance.

The XML Extender in DB2 V8 allows XML storage in CLOB (or Varchar) columns, but performance is often not adequate due to XML parsing at query time. Another option in DB2 V8 was to shred XML data to relational tables based on a fixed mapping. This can work well if the XML structure is relatively simple and doesn’t change over time. Hence, this is still possible in DB2 9 and beyond.
However, in many real-world scenarios the required mapping is very complex, and it may take dozens (sometimes hundreds) of tables to represent the XML data in relational format. In such cases the shredding is very expensive, and reconstructing the XML documents requires multi-way joins, which often perform poorly. Also, if the XML format changes, the mapping and the underlying relational schema need to be adjusted accordingly, which is often a very complex and expensive task. DB2 9 solves these problems. DB2 9 stores XML in a parsed format, which avoids XML parsing at query time. DB2 9 also does not require a fixed mapping to store XML data. Documents of different form and shape can be stored in a single column of type XML.

The XML Data Type & pureXML Storage
• Tables can contain relational and/or XML columns
• Relational columns stored in tabular format
• XML columns stored in a parsed hierarchical format
• No XML parsing for query evaluation → high performance

create table dept (deptID char(8),...,deptdoc xml);

[Figure: a row of the dept table with deptID “PR27” and a deptdoc value <dept> … <emp>…</emp> … </dept>; DB2 storage holds both the relational data and the XML data.]

The new data type “XML” can be used to define one or more XML columns in a DB2 table. The table can also have traditional relational columns, but that’s optional. XML and relational data are stored differently, but closely linked. They are stored differently because relational data is flat and best stored in rows and columns, while XML is nested and best stored in a tree format. For more details, see: Matthias Nicola and Bert Van der Linden. “Native XML Support in DB2 Universal Database,” Proceedings of the 31st Annual VLDB, 2005.
(http://www.vldb2005.org/program/paper/thu/p1164-nicola.pdf)

Efficient Document Tree Storage

<dept>
  <employee id="901">
    <name>John Doe</name>
    <phone>408 555 1212</phone>
    <office>344</office>
  </employee>
  <employee id="902">
    <name>Peter Pan</name>
    <phone>408 555 9918</phone>
    <office>216</office>
  </employee>
</dept>

• XML text represented as document tree
• Tags encoded as integers
• Reduces storage
• Fast comparisons & navigation

[Figure: the document tree (dept with two employee nodes, each carrying id, name, phone, and office) drawn twice: once with tag names, and once with the tag names replaced by integers (dept = 14, employee = 4, id = 7, name = 1, phone = 6, office = 3).]

This shows a textual XML document (top left) and the hierarchical representation for efficient storage. One of multiple optimizations is that DB2 encodes XML tag names as unique integer values. This is invisible to the application but makes XML processing inside DB2 much faster. For more details, see: http://www.vldb2005.org/program/paper/thu/p1164-nicola.pdf

Relevant XML Standards
• SQL/XML: http://www.sqlx.org
• Expressions, XQuery 1.0: http://www.w3.org/TR/xquery
• Expressions, XPath 2.0: http://www.w3.org/TR/xpath20/
• Functions & Operators: http://www.w3.org/TR/xquery-operators/
• XQuery 1.0 and XPath 2.0 Data Model: http://www.w3.org/TR/query-datamodel/

The basic language to query XML data is XPath. It’s very powerful. XQuery is an extension of XPath, and XPath is a subset of XQuery. XQuery adds expressions and functions to XPath to allow more complicated queries. Both XPath and XQuery are based on the same data model, which is called “XQuery 1.0 and XPath 2.0 Data Model”. Basically, this data model defines how each XML document is actually a tree of element and attribute nodes. Queries expressed in XQuery or XPath traverse these trees, evaluate predicates, and retrieve XML values. XQuery and XPath have been standardized by the W3C. The SQL standard has been enhanced to allow embedding of XQuery or XPath in SQL statements.
You will see later in this presentation how that works.

Options to query XML data in DB2 9
• SQL: plain SQL, allows full doc retrieval
• SQL/XML XPath: XPath embedded in SQL
• SQL/XML XQuery: XQuery embedded in SQL
• XQuery: XQuery as a stand-alone language
• XQuery SQL: SQL embedded in XQuery

These are the 5 languages (or combinations of languages) that the DB2 family offers.

[The same list is shown again with support columns for DB2 LUW and DB2 z/OS.]

All of these options are supported in DB2 9 for Linux, UNIX, Windows. DB2 9 for z/OS does not support XQuery. However, you will see that XPath embedded in SQL is already a very powerful combination that's fully sufficient for many XML applications.

In this presentation we'll focus on SQL/XML with XPath, which applies to both DB2 9 for Linux, UNIX, Windows and DB2 9 for z/OS.

XPath Concepts

<dept bldg="101">
  <employee id="901">
    <name>John Doe</name>
    <phone>408 555 1212</phone>
    <office>344</office>
  </employee>
  <employee id="902">
    <name>Peter Pan</name>
    <phone>408 555 9918</phone>
    <office>216</office>
  </employee>
</dept>

Each node has a path:
/
/dept
/dept/employee
/dept/employee/@id
/dept/employee/name
/dept/employee/phone
/dept/employee/phone/text()
(...)

[Figure: the same document drawn as a tree: dept with attribute bldg=101 and two employee nodes, each with id, name, phone, and office.]

XPath is based on the fact that every XML document is a tree of element and attribute nodes.
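
The paths listed on the XPath Concepts slide can be tried from SQL. The following is a minimal sketch, not taken from the presentation: it assumes a two-column version of the dept table (the original DDL elides the other columns), loads the sample document from the slide, and uses the XMLQUERY and XMLEXISTS functions named in the agenda. The variable name d and the choice of paths are illustrative assumptions.

```sql
-- Assumption: simplified form of the slide's dept table; the columns
-- elided in the original DDL are left out here.
CREATE TABLE dept (deptID CHAR(8), deptdoc XML);

-- Sample document from the XPath Concepts slide. XMLPARSE makes the
-- conversion from a character string to the XML type explicit.
INSERT INTO dept (deptID, deptdoc) VALUES (
  'PR27',
  XMLPARSE(DOCUMENT
    '<dept bldg="101">
       <employee id="901"><name>John Doe</name><phone>408 555 1212</phone><office>344</office></employee>
       <employee id="902"><name>Peter Pan</name><phone>408 555 9918</phone><office>216</office></employee>
     </dept>')
);

-- XMLQUERY returns the nodes selected by a path from each row's document.
-- XMLEXISTS keeps only the rows whose document satisfies the XPath predicate.
SELECT deptID,
       XMLQUERY('$d/dept/employee/name' PASSING deptdoc AS "d") AS names,
       XMLQUERY('$d/dept/employee/phone/text()' PASSING deptdoc AS "d") AS phones
FROM dept
WHERE XMLEXISTS('$d/dept/employee[@id = "902"]' PASSING deptdoc AS "d");
```

In this sketch the XMLEXISTS predicate selects rows, not nodes inside a document: the row qualifies because one employee has id 902, while the path in the first XMLQUERY still selects the name elements of both employees in that document.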
