IBM Information Management

DB2 Version 9 – the Viper Release

LUW A next generation hybrid data server

Holger Seubert, DB2 Information Management Development, IBM Laboratory Boeblingen
122nd Database Roundtable (Datenbankstammtisch) at HTW Dresden, Department of Computer Science/Mathematics
13 December 2006

© 2006 IBM Corporation IBM Software Group - Information Management

"There are 68 patents alone in Viper, and it involved 750 developers over five years," Bob Picciano, VP WW Information Management Sales said.

"This is something no one else has and will take years to get here."

Explore it yourself for free with DB2 Express-C 9:

→ http://www-306.ibm.com/software/data/db2/express/

There's a lot of innovation in Viper. Let’s go and explore …

DB2 9 – The next generation hybrid data server

Agenda – Start the hybrid engine

• < Overview />
• pureXML Storage
• XML Indexes
• XQuery & SQL/XML support
• XML Query Execution
• XML Schema support (XSR)
• Utilities, Tools & APIs
• Summary

History of XML: How it all began …

The “Standard Generalized Markup Language” (SGML) is a metalanguage in which you can define markup languages for documents. SGML was originally designed for sharing machine-readable documents. It has also been used extensively in the printing and publishing industries.

The Extensible Markup Language (XML) is a general-purpose markup language for creating special-purpose markup languages, capable of describing many different kinds of data. It is a simplified subset of SGML.

“HyperText Markup Language” (HTML) is a markup language (an application of SGML) designed for the creation of web pages and other information viewable in a browser. HTML is used to structure information and can be used to describe the appearance and semantics of a document.

Timeline (figure): the milestones shown are "the relational data model becomes popular" (beginning of the 1980s), SGML standardization (ISO), HTML 1st version, XML 1.0 W3C Recommendation, XML 1.0 2nd Edition, and XML 1.0 3rd Edition & XML 1.1. The axis is marked 1983, 1993, 1998, 2000, 2004 and 2005.

… and where we are today – the hype

Figure: the same timeline, now crowded with XML-related technologies: UDDI, SOAP, WSDL, XUpdate, XLink, XPath, XPointer, XML Schema, XQuery, XSLT, XML Infoset, Windows Installer XML, Ajax, RSS, XSL-FO, CSS, XHTML, XForms, SQL/XML, JAX-RPC, SAX, DOM, native XML databases, and XML-enabled databases.

Please note: the order of the technologies mentioned above does not exactly reflect their order of invention/appearance.

The evolution to a new database technology

Last generation database technology needed data types like CLOB or text to store XML data:

✗ no chance to query XML directly on the database tier
✗ XML is read as strings and passed to the “middle tier”, which then runs the queries against the XML data

Next generation database technology interacts directly with the XML data:

✓ the new XML data type stores XML natively inside the database
✓ run queries against XML with XQuery or XPath
✓ embed XQuery statements directly in SQL statements
✓ special XML indexes boost performance
✓ assign a schema to XML data to ensure that the data is valid

A New Model is Emerging – a hybrid system

XML Developer: “I see a sophisticated XML repository that also supports SQL.”
SQL Developer: “I see a sophisticated RDBMS that also supports XML.”

What both share (figure): Familiar Programming Models, Mature Services, Familiar Tooling, Optimized Storage Models, Optimized Performance & Scale

Agenda – Start the hybrid engine

• < pureXML Storage />

pureXML in DB2 9

• Standards compliant & driving the standards
  – XML, XQuery, SQL/XML, XML Schema …
• 100% integrated in DB2
  – leveraging performance, scalability, reliability, availability …
• 100% integrated with SQL
  – XML is a new SQL type
  – Access relational and XML data in the same statement
• 100% integrated with application APIs
  – JDBC, ODBC, .NET, embedded SQL, PHP

pureXML in DB2 9

• What does pureXML® support mean?
  – Storage, compiler, optimizer, indexing, tools, utilities, APIs, …
    → XML capabilities in all DB2 components
• pureXML® Storage
  – XML is stored in parsed, annotated DOM-like trees
  – The XQuery Data Model is persisted
    → NOT shredded, NOT stored as LOB
  – XML data is formatted to buffered data pages (LOB pages are not buffered!)
  – XML data can be placed in a separate table space
    → Shared with the LOB data of that table
  – New XDA data object on disk (for the new data type)
• Customer benefits
  – Faster navigation and queries
  – Simpler indexing
  – Natural XML user paradigm

Figure: the XDA object

Integration of XML & Relational Capabilities

• Native XML data type (server & client side)
  – not VARCHAR, not CLOB, not object-relational!
• XML capabilities in all DB2 components
• Applications can combine XML & relational data

Figure (DB2 hybrid data server): a customer client application talks through the DB2 client to the server, using SQL/XML via the relational interface and XQuery via the XML interface. Both interfaces feed one compiler and one DB2 engine on top of DB2 storage, which holds relational and XML data.

The XML Data Type

• A column of type XML can hold one well-formed XML document for every row of the table:
    create table dept (deptID char(8), …, deptdoc xml);
• XML and relational columns are stored differently:
  – Relational columns are stored in relational format (tables)
  – XML values are stored natively in the XQuery Data Model
  – A descriptor pointing to the XML storage is stored in the row
• No length is associated with the XML data type, so there is no declared limit on the size of an XML document (the client-server protocol limits document size to 2 GB at the moment)
• Parse-once paradigm: no XML parsing at query time!

Figure: table dept with the columns deptID, …, deptdoc; the row “PR27” points to its XML value in DB2 storage.

DB2 XML Storage – XML to the core

• Document is stored in a parsed, hierarchical representation
  – This is similar to a DOM representation of the XML Infoset
  – IBM’s version of open-source Xerces is used
  – The XQuery Data Model is persisted
• All XML nodes are type annotated, according to the XQuery specification (W3C)
  – XML Schema types if validated
  – Default types otherwise
• All data is stored in UTF-8
  – Regardless of the document encoding
  – Regardless of the locale
  – Regardless of the code page of the database

XML is stored intact, with full DBMS knowledge of the document's internal structure.
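The "all data is stored in UTF-8" rule can be pictured with a small Python analogy. This is not DB2 code: it uses the standard library's xml.etree.ElementTree, and the sample document and variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: a document that arrives encoded as ISO-8859-1.
# "\u00fc" is the letter u with umlaut, a single byte 0xFC in ISO-8859-1.
incoming = ('<?xml version="1.0" encoding="ISO-8859-1"?>'
            '<name>J\u00fcrgen</name>').encode("iso-8859-1")

# Parse once; the parser honours the encoding declared in the document.
root = ET.fromstring(incoming)

# "Store" the parsed value as UTF-8, regardless of the original encoding.
stored = ET.tostring(root, encoding="utf-8")

print(incoming)   # contains the single byte \xfc
print(stored)     # contains the two-byte UTF-8 sequence \xc3\xbc
```

The point of the sketch is only that the encoding of the incoming document stops mattering once the document has been parsed.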

DB2 XML Storage – XML to the core

Information stored with every node:

• Name (e.g. element name, encoded as StringID from the string table)
• A nodeID
• Type of node (e.g. element, attribute, etc.)
• Namespace
• Namespace prefix
• Data type
• Pointer to parent
• Array of child pointers
• For text/attribute nodes: the data itself
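The per-node information listed above can be modelled roughly as follows. This is a minimal Python sketch, not DB2's on-disk format: the class names, the dotted nodeID format and the type labels are assumptions made for illustration, and namespaces and mixed content are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional
import xml.etree.ElementTree as ET


class StringTable:
    """Hypothetical string table: maps each name to a small integer StringID."""

    def __init__(self) -> None:
        self.ids: Dict[str, int] = {}
        self.names: List[str] = []

    def intern(self, name: str) -> int:
        if name not in self.ids:
            self.ids[name] = len(self.names)
            self.names.append(name)
        return self.ids[name]


@dataclass(eq=False)
class Node:
    node_id: str                 # position-based ID such as "1.2.1" (assumed format)
    kind: str                    # "element", "attribute" or "text"
    name_id: Optional[int]       # StringID of the name; None for text nodes
    data_type: str               # placeholder type annotation
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list)
    value: Optional[str] = None  # only text and attribute nodes carry data


def build(elem: ET.Element, table: StringTable, node_id: str = "1",
          parent: Optional[Node] = None) -> Node:
    """Turn a parsed element into a tree of Node records."""
    node = Node(node_id, "element", table.intern(elem.tag), "untyped", parent)
    position = 0
    for attr_name, attr_value in elem.attrib.items():
        position += 1
        node.children.append(Node(f"{node_id}.{position}", "attribute",
                                  table.intern(attr_name), "untypedAtomic",
                                  node, value=attr_value))
    if elem.text and elem.text.strip():
        position += 1
        node.children.append(Node(f"{node_id}.{position}", "text", None,
                                  "untypedAtomic", node, value=elem.text.strip()))
    for child in elem:
        position += 1
        node.children.append(build(child, table, f"{node_id}.{position}", node))
    return node


table = StringTable()
root = build(ET.fromstring(
    '<dept bldg="101"><employee id="901"><name>John Doe</name></employee></dept>'),
    table)
employee = root.children[1]
print(table.names)                                # each name is stored once
print(employee.node_id, employee.parent is root)  # parent and child links
```

Storing names as StringIDs keeps the node records small, and the parent and child links are what make navigation possible without re-parsing.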

DB2 XML storage – XML to the core

• Node hierarchy of an XML document is stored on DB2 pages
  – Large documents are split into pages/regions
• Nodes are physically connected
  – Query performance
• Regions are logically connected
  – The regions index is a system component

Figure: a large XML document is split into regions stored on separate pages, tied together by the regions index.

New System Indexes

Entries in SYSCAT.INDEXES with the following INDEXTYPE:

• XRGN: XML Region Index
  – Created once for a table with XML column(s)
  – Maps logical pointers to XML data pages
• XPTH: XML Path Index
  – Created for each XML column
  – Holds a local subset of the global path/pathID mapping information (path table)
  – Can be used for wildcard resolution
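What a path/pathID mapping and "wildcard resolution" amount to can be sketched in a few lines of Python. The functions below are hypothetical helpers written for this illustration. They do not reflect how DB2 implements the XPTH index, and the sample document is made up.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterator, List


def paths_of(elem: ET.Element, prefix: str = "") -> Iterator[str]:
    """Yield every distinct root-to-node path of a document."""
    path = f"{prefix}/{elem.tag}"
    yield path
    for attr_name in elem.attrib:
        yield f"{path}/@{attr_name}"
    for child in elem:
        yield from paths_of(child, path)


path_ids: Dict[str, int] = {}   # path -> pathID, shared by all documents


def register(document: str) -> None:
    """Add the paths of one document; known paths keep their pathID."""
    for path in paths_of(ET.fromstring(document)):
        path_ids.setdefault(path, len(path_ids) + 1)


def resolve(last_step: str) -> List[str]:
    """Resolve a pattern of the form //last_step to the concrete paths."""
    return sorted(p for p in path_ids if p.rsplit("/", 1)[1] == last_step)


register('<dept bldg="101"><employee id="901"><name>John Doe</name>'
         '<phone>408 555 1212</phone></employee></dept>')
print(path_ids)
print(resolve("name"))   # //name resolves to /dept/employee/name
```

With such a table, a query containing // can be rewritten into the few concrete paths that actually occur, instead of scanning every document.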

DB2 XML Storage

Figure labels: DAT object (table rows with the columns ID, …, DEPTDOC and the values PR27, PR28, ACC, …), INX object (region index and path index, with the paths /dept, /dept/employee, /dept/employee/@id, …), XDA object.

Agenda – Start the hybrid engine

• < XML Indexes />

XML Indexes for High Query Performance

• Index elements and attributes inside the document
• Uses an XML pattern expression to index paths and values in XML documents stored in a single XML column
• Specify the type to index
  – Should be the same as the type used in the query
  – The query /Person[Age = 5] needs a numeric index on Age
• 0, 1 or multiple index entries per document

create table t1 (docID int, XMLDoc xml);
create index AgeIndex on t1(XMLDoc)
  generate key using xmlpattern '/Person/Age' as sql double;

NOTE: Declaration & use of namespace prefixes are supported (not shown above)
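Why a document can contribute 0, 1 or multiple index entries becomes visible in a small Python model of the AgeIndex above. This is an illustrative sketch only: the rows of t1 are made up, and skipping non-numeric values is an assumption of the sketch.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

# Hypothetical rows of t1: docID -> XMLDoc
t1: Dict[int, str] = {
    1: "<Person><Name>Ann</Name><Age>5</Age></Person>",
    2: "<Person><Name>Bob</Name><Age>41</Age><Age>42</Age></Person>",
    3: "<Person><Name>Cid</Name></Person>",
}


def keys_for(document: str) -> List[float]:
    """Keys generated for xmlpattern '/Person/Age' as sql double."""
    root = ET.fromstring(document)
    if root.tag != "Person":
        return []
    keys = []
    for age in root.findall("Age"):
        try:
            keys.append(float(age.text))
        except (TypeError, ValueError):
            pass  # sketch assumption: non-numeric values produce no key
    return keys


# The index is a sorted list of (key, docID) entries.
age_index: List[Tuple[float, int]] = sorted(
    (key, doc_id) for doc_id, doc in t1.items() for key in keys_for(doc))


def lookup(age: float) -> List[int]:
    """Answer /Person[Age = age] from the index (linear scan for brevity)."""
    return [doc_id for key, doc_id in age_index if key == age]


print(age_index)   # document 1: one entry, document 2: two, document 3: none
print(lookup(5))
```

Because the keys are stored as numbers, a numeric predicate such as Age = 5 can be answered from the index. A string-typed index would hold '5' and could not serve the same comparison.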

XML Index DDL

CREATE [UNIQUE] INDEX index-name
  ON table-name (xml-column-name)
  GENERATE KEY USING XMLPATTERN xmlpattern
  AS SQL { VARCHAR (integer) | VARCHAR HASHED | DOUBLE | DATE | TIMESTAMP }

xmlpattern = XPath without predicates, using only the child axis (/) and the descendant-or-self axis (//)

xmlpattern: one or more steps, each introduced by / or // and followed by element-tag, *, @attribute-tag, @* or text()

XML Index Examples

create table dept(deptID char(8) primary key, deptdoc xml);

• create unique index idx1 on dept(deptdoc) generate key using xmlpattern '/dept/@bldg' as sql double;
• create unique index idx2 on dept(deptdoc) generate key using xmlpattern '/dept/employee/@id' as sql double;
• create index idx3 on dept(deptdoc) generate key using xmlpattern '/dept/employee/name' as sql varchar(35);

Sample document (figure): a dept document with the employee John Doe (phone 408 555 1212, office 344) and the employee Peter Pan (phone 408 555 9918, office 216).


XML Index Wizard (DB2 Control Center)

Create a value index on XML elements or XML attributes by right-clicking in the document structure


XML Index Wizard (DB2 Control Center)

A pop-up menu shows possibilities to create XML value index on selected XML node

The Big Indexing Picture

Figure: an SQL table with an XML column (relational column 1, relational column 2, XML column) and the indexes around it:

• Relational index and index on XML column: created by users to improve performance during queries on XML documents
• XML regions index: logical mapping of the regions in an XML document, used to retrieve the document data from XML storage (.XDA file)
• XML column paths index: maps paths to path IDs for each XML column; a subset of the paths stored in the global catalog path table

A full text index for XML

• XML is less like traditional data stored in a database
• Applications on XML documents often rely on a full text index
• DB2 9 offers both
  – Traditional-behaving database indexes
  – Full-text indexing
• The existing Net Search Extender is used for the full text index
  – XML aware: limit search to specific elements or attributes
  – Proximity searches
  – Wildcard searches
  – and a lot more …

Agenda – Start the hybrid engine

• < XQuery & SQL/XML support />

XQuery and SQL/XML

• DB2 treats both SQL and XQuery as primary query languages (hybrid system)
• SQL and XQuery operate independently on their respective data models
• DB2 also allows relational and XML data to be combined and correlated

Two ways to query XML data:

This section:
– Querying XML data with XQuery
– Optional: SQL embedded in XQuery

Next section:
– Querying XML data with SQL
– Optional: XQuery embedded in SQL

What is XQuery?

A query language designed for XML data … and supported in DB2 9.

XQuery builds on (figure):

• Expressions: www.w3.org/TR/xquery
• XPath 2.0: www.w3.org/TR/xpath20/
• XML Schema: www.w3.org/XML/Schema
• Functions & Operators: www.w3.org/TR/xquery-operators/
• XQuery 1.0 & XPath 2.0 Data Model: www.w3.org/TR/query-datamodel/

The FLWOR Expression

• FOR: iterates through a sequence, binding a variable to each item
• LET: binds a variable to a sequence
• WHERE: eliminates items of the iteration
• ORDER: reorders items of the iteration
• RETURN: constructs query results

for $movie in db2-fn:xmlcolumn('MOVIES.DOC')
let $actors := $movie//actor
where $movie/duration > 90
order by $movie/@year
return <movie>{$movie/title, $actors}</movie>

Sample result (figure): the title Chicago with the actors Renee Zellweger, Richard Gere and Catherine Zeta-Jones
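The five FLWOR clauses map onto familiar loop constructs. The Python sketch below mirrors the movie query step by step. It is an analogy, not an XQuery engine: the sample documents (including the year and duration values) are made up, and the year is compared numerically for simplicity.

```python
import xml.etree.ElementTree as ET

# Made-up sample documents standing in for the rows of the movies column.
docs = [
    '<movie year="2002"><title>Chicago</title><duration>113</duration>'
    '<cast><actor>Renee Zellweger</actor><actor>Richard Gere</actor>'
    '<actor>Catherine Zeta-Jones</actor></cast></movie>',
    '<movie year="1999"><title>Short Film</title><duration>45</duration>'
    '<cast><actor>Some Actor</actor></cast></movie>',
    '<movie year="1985"><title>Long Film</title><duration>120</duration>'
    '<cast><actor>Another Actor</actor></cast></movie>',
]

tuples = []
for text in docs:                                 # FOR: iterate, bind $movie
    movie = ET.fromstring(text)
    actors = movie.findall(".//actor")            # LET: bind $actors to a sequence
    if float(movie.findtext("duration")) > 90:    # WHERE: drop non-matching items
        tuples.append((movie, actors))

tuples.sort(key=lambda t: int(t[0].get("year")))  # ORDER BY $movie/@year

results = [                                       # RETURN: construct the result
    (movie.findtext("title"), [a.text for a in actors])
    for movie, actors in tuples
]

for title, names in results:
    print(title, names)
```

Note that LET binds the whole sequence of actors once per movie, whereas FOR produces one iteration per movie.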

Which data does XQuery (as a primary language) work on?

• All XML data is in XML-typed columns in tables
• The XQuery standard defines a “collection” function
  – Very abstract, implementation dependent
• DB2 XQuery uses 2 XQuery functions to get data:
  – db2-fn:xmlcolumn()
  – db2-fn:sqlquery()

Querying XML data – with XQuery

1) Identifying XML data by column: db2-fn:xmlcolumn()

• for $d in db2-fn:xmlcolumn('DEPT.DEPTDOC')/dept/employee … operates on the entire XML column (the column name string is case-sensitive)

2) Identifying XML data via a select statement: db2-fn:sqlquery()
   Leverage predicates/indexes on relational columns:

• for $d in db2-fn:sqlquery("select deptdoc from dept")/dept/employee … entire column
• for $d in db2-fn:sqlquery("select deptdoc from dept where deptID = 'PR27'") … single document
• for $d in db2-fn:sqlquery("select deptdoc from dept where contains(deptdoc, SECTION(/dept/employee/) 'John')=1") … some documents

Querying XML data – with XQuery

This query returns all customerinfo elements in documents in the CUSTOMER.INFO column where the value of the attribute Cid is greater than 1000.

Prefix each XQuery query with the keyword ‘XQuery’ to tell the DB2 parser to use the XQuery language.


Querying XML data – with XQuery Visual XQuery Builder integrated in DB2 Developer Workbench (Eclipse IDE)

XQuery and SQL/XML

• DB2 treats both SQL and XQuery as primary query languages (hybrid system)
• SQL and XQuery operate independently on their respective data models
• DB2 also allows relational and XML data to be combined and correlated

Two ways to query XML data:

Last section:
– Querying XML data with XQuery
– Optional: SQL embedded in XQuery

This section:
– Querying XML data with SQL
– Optional: XQuery embedded in SQL

Querying XML data – with SQL/XML

SQL/XML publishing functions (since DB2 V8.2)

Several functions are available to enable XML values to be constructed, or published, from SQL values.

Function        Description                                                         Type
XMLELEMENT      generates an XML element                                            Scalar
XMLATTRIBUTES   used within XMLELEMENT, specifies attributes                        Scalar
XMLFOREST       produces a forest of XML elements from SQL values                   Scalar
XMLCONCAT       concatenates a variable number of XML values                        Scalar
XMLNAMESPACES   produces a namespace declaration                                    Scalar
XMLAGG          groups or aggregates XML data                                       Aggregate
XML2CLOB        converts the XML data type into a CLOB                              Cast
XMLSERIALIZE    converts the XML data type to serialized XML as char/varchar/clob/blob   Cast

Querying XML data – new SQL/XML functions in DB2 9

XMLPARSE: parses character/BLOB data and produces an XML value.
  INSERT INTO T1(XMLDOC) VALUES (XMLPARSE(DOCUMENT 'some XML doc'))

XMLVALIDATE: validates an XML value against an XML schema and type-annotates the XML value.
  INSERT INTO T1(XMLDOC) VALUES (XMLVALIDATE(XMLPARSE(DOCUMENT '...') ACCORDING TO XMLSCHEMA ID ibm.invoice))

XMLEXISTS: determines if an XQuery returns a result (i.e. a non-empty sequence of one or more items).
  SELECT ID FROM T1 WHERE XMLEXISTS('$d/dept[@bldg = 101]' PASSING xmldoc AS "d")

XMLQUERY: executes an XQuery and returns the result sequence.
  SELECT ID, XMLQUERY('for $i in $d/dept let $j := $i//name return $j' PASSING xmldoc AS "d") FROM T1

XMLTABLE: executes an XQuery and returns the result sequence as a relational table (if possible). Refer to the following slides.

Querying XML data – with SQL/XML

Examples

-- Create a new table with an XML data type column
CREATE TABLE dept(deptID char(8) primary key, deptdoc xml)

-- Plain SQL to get full XML document(s)
SELECT deptID, deptdoc FROM dept WHERE deptID = 'PR37'

-- SQL with embedded XPath or XQuery expression
SELECT deptID,
       XMLQUERY('for $i in $d/dept let $j := $i//name return $j' passing deptdoc as "d")
FROM dept
WHERE deptID LIKE 'PR%'
  AND XMLEXISTS('$d/dept[@bldg = 101]' passing deptdoc as "d")
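The division of labour in the last statement (XMLEXISTS filters rows in the WHERE clause, XMLQUERY builds a value for each remaining row) can be mimicked in Python. This is a rough analogy with made-up rows and hypothetical helper functions, not a model of DB2's query processing.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows of the dept table: (deptID, deptdoc)
dept = [
    ("PR27", '<dept bldg="101"><employee id="901"><name>John Doe</name></employee>'
             '<employee id="902"><name>Peter Pan</name></employee></dept>'),
    ("PR28", '<dept bldg="102"><employee id="903"><name>Mary Roe</name></employee></dept>'),
    ("AC01", '<dept bldg="101"><employee id="904"><name>Ann Poe</name></employee></dept>'),
]


def xml_exists(doc: ET.Element) -> bool:
    """Plays the role of XMLEXISTS('$d/dept[@bldg = 101]')."""
    bldg = doc.get("bldg")
    return doc.tag == "dept" and bldg is not None and float(bldg) == 101


def xml_query(doc: ET.Element) -> list:
    """Plays the role of XMLQUERY('for $i in $d/dept let $j := $i//name return $j')."""
    return [n.text for n in doc.iter("name")] if doc.tag == "dept" else []


result = []
for dept_id, text in dept:
    doc = ET.fromstring(text)
    if dept_id.startswith("PR") and xml_exists(doc):   # WHERE clause filters rows
        result.append((dept_id, xml_query(doc)))       # SELECT list builds values

print(result)
```

The relational predicate (deptID LIKE 'PR%') and the XML predicate are evaluated together, which is the "combine and correlate" idea of the hybrid engine.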

Querying XML data – with SQL/XML

XMLTABLE() generates a table from XML data

SELECT X.*
FROM dept,
     XMLTABLE('$d/dept/employee' passing deptdoc as "d"
       COLUMNS
         "empID"     INTEGER     PATH '@id',
         "firstname" VARCHAR(30) PATH 'name/first',
         "lastname"  VARCHAR(30) PATH 'name/last',
         "office"    INTEGER     PATH 'office') AS "X"

Input (figure): a dept document with the employees John Doe (office 344) and Peter Pan (office 216).

Result:

empID  firstname  lastname  office
901    John       Doe       344
902    Peter      Pan       216
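The row and column expressions of XMLTABLE can be read as "one row per node of the row expression, one cell per column path". The Python sketch below reproduces the result table above. The markup of the sample document is reconstructed from the column paths and is therefore an assumption, as are the helper names.

```python
import xml.etree.ElementTree as ET

# Assumed shape of the dept document, derived from the PATH expressions.
deptdoc = ET.fromstring(
    '<dept bldg="101">'
    '<employee id="901"><name><first>John</first><last>Doe</last></name>'
    '<office>344</office></employee>'
    '<employee id="902"><name><first>Peter</first><last>Pan</last></name>'
    '<office>216</office></employee>'
    '</dept>')

# (column name, cast, path relative to the row node), as in the COLUMNS clause.
columns = [
    ("empID", int, "@id"),
    ("firstname", str, "name/first"),
    ("lastname", str, "name/last"),
    ("office", int, "office"),
]


def cell(row_node, cast, path):
    """Evaluate one column path against a row node; a missing value gives None."""
    if path.startswith("@"):
        raw = row_node.get(path[1:])
    else:
        raw = row_node.findtext(path)
    return None if raw is None else cast(raw)


# Row expression '$d/dept/employee': one output row per employee element.
rows = [tuple(cell(emp, cast, path) for _, cast, path in columns)
        for emp in deptdoc.findall("employee")]

print([name for name, _, _ in columns])
for row in rows:
    print(row)
```

A column whose path finds nothing yields an empty cell (None here) rather than dropping the row, which is why the sketch treats missing values separately from casting.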


Querying XML data – with SQL/XML Visual SQL Builder integrated in DB2 Developer Workbench (Eclipse IDE)


Querying XML data – with SQL/XML

Graphically create SQL and SQL/XML queries with the support of an Expression Builder

XML Data Modification

• The Data Manipulation Language (DML) only supports full document replace (no XUpdate standard yet):
    update dept set deptdoc = ? where …
• DB2 provides a stored procedure for sub-document level updates:
  – Value updates of text nodes or attributes
  – Replace elements or document subtrees
  – Delete any node or subtree
  – Insert (append) any element or subtree
  – Document to update: identified by SQL or XQuery
  – New values or elements can be static, or produced on the fly by SQL or XQuery
  – One or multiple updates in 1 stored procedure call
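The three kinds of sub-document change listed above (replace, append, delete) are easy to picture on an in-memory tree. The Python sketch below applies them with the standard library and then serializes the whole document, which corresponds to the full document replace that plain DML offers. It does not show the interface of the DB2 stored procedure; the helper names and the sample data are made up.

```python
import xml.etree.ElementTree as ET

deptdoc = ET.fromstring(
    '<dept>'
    '<employee id="301"><name>Ann Poe</name><phone>408-000-0000</phone></employee>'
    '<employee id="302"><name>Bob Roe</name><phone>408-111-1111</phone></employee>'
    '</dept>')


def replace_text(doc, path, new_value):
    """Value update: set the text of every node matched by path."""
    matches = doc.findall(path)
    for node in matches:
        node.text = new_value
    return len(matches)


def append_element(doc, parent_path, tag, text):
    """Insert (append) a new child element under every matched parent."""
    parents = doc.findall(parent_path)
    for parent in parents:
        ET.SubElement(parent, tag).text = text
    return len(parents)


def delete_elements(doc, parent_path, tag):
    """Delete the child elements named tag under every matched parent."""
    removed = 0
    for parent in doc.findall(parent_path):
        for child in parent.findall(tag):
            parent.remove(child)
            removed += 1
    return removed


# "Update the phone number of employee 301", plus one append and one delete.
replace_text(deptdoc, "employee[@id='301']/phone", "408-463-4963")
append_element(deptdoc, "employee[@id='301']", "office", "344")
delete_elements(deptdoc, "employee[@id='302']", "phone")

# The modified document would then be written back as a whole.
new_document = ET.tostring(deptdoc, encoding="unicode")
print(new_document)
```

Doing several changes against one parsed tree and writing the document back once is the client-side counterpart of "one or multiple updates in 1 stored procedure call".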

XML Data Modification

“Update the phone number of employee 301”

Call DB2XMLFUNCTIONS.XMLUPDATE (
  '(…)',
  'Select deptdoc from dept where deptid=1006',
  '',?,?);

• First argument: the update document. It states the type of update (action = replace | append | delete), what to update, and the new value, here the static value 408-463-4963. It can hold 1 or more updates.
• Second argument: which doc to update.

XML Data Modification

“Update the phone number of employee 301”

Call DB2XMLFUNCTIONS.XMLUPDATE (
  '(…)',
  'Select deptdoc from dept where deptid=1006',
  '',?,?);

• Here the new value is not static but produced by an XQuery (using = XQUERY | SQL):
    for $i in db2-fn:xmlcolumn('T.col')/Phone where $i/change/emp/@id = 301 return $i/phone

Agenda – Start the hybrid engine

• < XML Query Execution />

DB2 Query Operators (Explain)

• Base access methods: TBSCAN, IXSCAN, FETCH
• Joins: NLJOIN, MSJOIN, HSJOIN
• Aggregation: GRPBY
• Temping: TEMP
• Sorting: SORT
• Index ANDing, dynamic bitmap indexing: IXAND
• Index ORing, list prefetch: RIDSCN
• New: XML scan and navigation: XSCAN
• New: XML index access: XISCAN
• New: XML index ANDing: XANDOR
• Table queues (xTQ)

Extended hybrid optimizer

XSCAN – XML Scan Operator
(Tom Eliaz, Matthias Nicola, IBM SVL)

Query: /customerinfo[name="Matt Foreman" and phone="905-555-4789"]

Sample document (figure): customerinfo with the name Matt Foreman and the phone 905-555-4789

No index: XSCAN = XML document scan

• Navigates 1 document at a time
• Evaluates the expression /customerinfo[…]
• Returns XML nodes that satisfy the expression
• Takes input via sideways passing NLJOIN

Access plan:

        RETURN
          |
        NLJOIN
          |
        /-+-\
  TBSCAN     XSCAN
    |
  TABLE: MNICOLA.MYTEST
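The plan above reads as a nested loop: the table scan delivers one row at a time, and for each row the XML scan navigates that row's document. A minimal Python sketch of this control flow follows. The table contents are made up, the function names merely echo the operator names, and ET.fromstring only stands in for navigating the stored tree (per the parse-once paradigm, DB2 does not parse at query time).

```python
import xml.etree.ElementTree as ET

# Hypothetical rows (row ID, XML document) standing in for MNICOLA.MYTEST.
mytest = [
    (1, "<customerinfo><name>Matt Foreman</name>"
        "<phone>905-555-4789</phone></customerinfo>"),
    (2, "<customerinfo><name>Matt Foreman</name>"
        "<phone>416-555-0000</phone></customerinfo>"),
    (3, "<customerinfo><name>Kathy Smith</name>"
        "<phone>905-555-4789</phone></customerinfo>"),
]


def tbscan(table):
    """Outer input: deliver the rows one by one."""
    yield from table


def xscan(document):
    """Navigate one document and return the nodes that satisfy
    /customerinfo[name="Matt Foreman" and phone="905-555-4789"]."""
    root = ET.fromstring(document)
    if (root.tag == "customerinfo"
            and any(n.text == "Matt Foreman" for n in root.findall("name"))
            and any(p.text == "905-555-4789" for p in root.findall("phone"))):
        return [root]
    return []


def nljoin(table):
    """Nested loop join: each outer row is passed sideways to the XML scan."""
    for row_id, document in tbscan(table):
        for node in xscan(document):
            yield row_id, node


matches = [row_id for row_id, _ in nljoin(mytest)]
print(matches)
```

Without an index every document has to be visited, so the cost grows with the number of rows. That is the motivation for the index-based operators on the following slides.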

XISCAN – XML Index Scan Operator

Query: /customerinfo[name="Matt Foreman" and phone="905-555-4789"]

Sample document (figure): customerinfo with the name Matt Foreman and the phone 905-555-4789

1 index, on name

Find matching rows efficiently using XML indexes

• Evaluates the expression /customerinfo[name="Matt Foreman"]
• A VARCHAR HASHED index may produce false positives, which are eliminated by XSCAN
• Only for value comparisons, not for “structural” predicates (element existence)

Access plan:

            RETURN
              |
            NLJOIN
              |
            /-+-\
       FETCH     XSCAN
         |
     /---+---\
  RIDSCN    TABLE: MNICOLA.MYTEST
    |
  SORT
    |
  XISCAN

XANDOR – Pivot XML ANDing (and ORing)

Query: /customerinfo[name="Matt Foreman" and phone="905-555-4789"]

2 indexes, on name and phone

Efficient XML index ANDing using a pivot algorithm

• Combines the results of 2 or more XISCANs
• Only for equality predicates without wildcards; the traditional IXAND is used otherwise

Access plan:

            RETURN
              |
            NLJOIN
              |
            /-+-\
       FETCH     XSCAN
         |
     /---+---\
  RIDSCN    TABLE: MNICOLA.MYTEST
    |
  SORT
    |
  XANDOR
    |
  /-+-\
XISCAN  XISCAN

Agenda – Start the hybrid engine

• < XML Schema support (XSR) />

DB2 XML Schema Repository (XSR)

• Database needs a schema repository
  – Stable & high-performance access to schemas for validation at XML insert/update time
  – Support for XML Schema management
• DB2 XML Schema Repository (XSR)
  – XML Schemas are registered
    • Consistent set of .xsd documents
  – Registered schema identification
    • A SQL 2-part name
    • The URL the schema is externally known as (e.g. used in schemaLocation attributes)
    • The "primary namespace"
  – Also used by shred
    • Stores the annotated schema
    • Internal formats to make shredding efficient
  – Also DTDs and external entities
    • Used for entity reference resolution and defaults
    • NOT used for validation


Register a Schema – via DB2 Control Center

Already registered XML Schema documents

Register new XML Schema via wizard.


Register a new Schema – via DB2 Developer Workbench

Invoke Schema registration wizard

Browse registered Schemas

XMLVALIDATE

create table dept(deptID char(8) primary key, deptdoc xml);

Validation is optional.

The schema location found in the document can be overridden by referencing a schema from DB2’s schema repository:

insert into dept(deptID, deptdoc) values (?, xmlvalidate(?))

insert into dept(deptID, deptdoc) values (?, xmlvalidate(? according to xmlschema id ibm.invoice))

insert into dept(deptID, deptdoc) values (?, xmlvalidate(? according to xmlschema uri 'http://my.world.com'))

Schema evolution with DB2 9

• There is no agreement on how to evolve schemas because the general problem is very complex
• Applications do it anyway because there are point solutions
• Enable schema evolution (don't prevent it)
• The DB2 XML Schema Repository is very flexible
  – Register conflicting schemas
  – Register schemas with the same namespace
  – Register schemas with the same URL

Shredding into relational tables

• There are still reasons to shred XML:
  – Co-existence with legacy applications
  – Relational processing is faster than XML
  – Analytics/cubes work over non-XML data
• Mapping from XML to relational:
  – Annotate the XML schema
  – Register XML schemas in the schema repository
  – Shred via CLP commands or stored procedure calls

Annotation example: (figure)

• Replaces XML Extender shred (XML collection)
  – Faster; using XML Schema


Define XML mapping rules – DB2 Developer Workbench

Invoke Annotated XSD mapping editor


Define XML mapping rules – DB2 Developer Workbench

Graphically define mapping rules from XML to a relational schema

Agenda – Start the hybrid engine

• < Utilities, Tools & APIs />

XML Utilities & Tools

Enhancements for the new XML data type

• XML Import & Export
• XML Runstats
• XML type support in stored procedures
• XML type supported by HADR replication
• Control Center extensions (e.g. index creation wizard)
• DB2 Developer Workbench
• and more…

XML API enhancements

• New XML type support added to APIs:
  – JDBC, .NET, ODBC/CLI, Embedded SQL
• SQL/XML supported by all APIs
• XQuery supported by all APIs
  – The result sequence will be treated as a result set
  – Each item will be treated as a row

Agenda – Start the hybrid engine

• < Summary />

Summary

• Standards compliant & driving the standards
  – XML, XQuery, SQL/XML, XML Schema …
• 100% integrated in DB2
  – leveraging performance, scalability, reliability, availability …
• 100% integrated with SQL
  – XML is a new SQL type
  – Access relational and XML data in the same statement
• 100% integrated with application APIs
  – JDBC, ODBC, .NET, embedded SQL

Summary

• Flexibility, because that is what XML is all about …
  – any document, any schema, not just the ones that are mapped to relational tables
• pureXML storage
  – XML is parsed and stored hierarchically
  – shredded: using annotated schema
  – CLOB/BLOB
• Sophisticated XML indexing
• Broad XQuery support
  – both embedded in SQL and as a primary language
• Supports digital signatures
  – signatures can be validated on retrieved documents

Users will see

1. New XML data type for columns
   create table s1.t1 (c1 int, c2 xml)

2. Language bindings for the new XML type
   Java, .NET, C, COBOL, Embedded SQL

3. New XML indexes
   create index ix1 on t1(c2) generate key using xmlpattern '/dept/emp/@empno' as sql double

4. An XML Schema/DTD repository

5. Support for XQuery as a primary language, as well as:
   – Support for SQL within XQuery
   – Support for XQuery within SQL
   – Support for new SQL/XML functions

6. Performance, scale, and everything else expected from a DBMS

References

DB2 9 on the net: www.ibm.com/db2/viper

Articles @ IBM developerWorks: http://www-128.ibm.com/developerworks/db2/

Thank you for your attention

Holger Seubert, Software Engineer, DB2 Information Management