xquery.txt Fri Apr 05 18:27:33 2019 1 Notes on XML and XQuery in Relational

Owen Kaser, March 22, 2016. Updated April 5, 2019. Some code fragments are untested!

As usual, the idea is to give you a taste of some things, without exploring all details, but letting you do some simple things.

There are likely a fair number of errors in this, though I have tested many code fragments.

Corrections appreciated.

Prerequisites ------

Your textbook gives a quick introduction to XML and XQuery. Plus it mentions that XML can be stored and manipulated in relational databases. Make sure you understand what it says.

We’ll look at this with a few more details.

Mini Review ------

XML documents and fragments contain nested elements (delimited by tags). Elements can have attributes.

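    <classroster>
      <student snum="12345">
        <sname> Owen Kaser </sname>
        <gpa> 1.1 </gpa>
        <friends>
          <friend> <student snum="6789"/> </friend>
          <friend> <student snum="1111"> <gpa> 2.2 </gpa> </student> </friend>
        </friends>
      </student>
      <student snum="33333">
        <sname> Newo Resak </sname>
        <age> 25 </age>
      </student>
    </classroster>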

XML Review ------

In the previous example, we have

    elements: classroster, student, sname, gpa, age, friends, friend
    tags: the angle-bracketed markers that open and close each element
    attributes: snum
    text: eg Newo Resak

There is a "family tree" type relationship between elements: the classroster element is the parent of two student elements. We speak of parents, children, siblings, ancestors, descendants, etc. Topmost element is called the "root".

We are not told whether there is an applicable schema: we don’t know whether snum must always be an integer, whether the text for age could be "really old" instead of "25", etc.
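To make the family-tree vocabulary concrete, here is a minimal Python sketch using the standard xml.etree.ElementTree module. The XML string is the running example as reconstructed from the JSON rendering later in these notes, so treat its exact markup as an assumption. ElementTree keeps no parent pointers, so the sketch builds its own child-to-parent map.

```python
import xml.etree.ElementTree as ET

# Running example, reconstructed from the JSON version in these notes
# (an assumption: whitespace inside elements has been trimmed).
ROSTER = """
<classroster>
  <student snum="12345">
    <sname>Owen Kaser</sname>
    <gpa>1.1</gpa>
    <friends>
      <friend><student snum="6789"/></friend>
      <friend><student snum="1111"><gpa>2.2</gpa></student></friend>
    </friends>
  </student>
  <student snum="33333">
    <sname>Newo Resak</sname>
    <age>25</age>
  </student>
</classroster>
"""

root = ET.fromstring(ROSTER)
print(root.tag)                                  # the root element

# Children of the root: the two top-level student elements.
top_students = list(root)
print([s.get("snum") for s in top_students])

# Map every element to its parent, then walk upward to list ancestors.
parent_of = {child: parent for parent in root.iter() for child in parent}
node = root.find(".//friend/student")            # first student inside a friend
ancestors = []
while node in parent_of:
    node = parent_of[node]
    ancestors.append(node.tag)
print(ancestors)
```

Nothing here checks a schema: the age text could be any string, which is exactly the point made above.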

XPath [http://www.w3schools.com/xsl/xpath_intro.asp] ------

The W3C defines an XPath language that you can use to locate elements within a bunch of XML. It is important in other XML technologies, including XQuery.

XPath lets you write expressions that return a "node" or a set of nodes. "Nodes" include elements, attributes, text and 4 other kinds.

Example path in the example XML doc:

    /classroster/student[1]//gpa

Start at the root, which must be a classroster node. Of its student element children, take the first. Any gpa elements that are descendants of that student are returned.

In this case, just two: the gpa elements containing 1.1 and 2.2.
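If you want to try this path without an XQuery processor, Python's xml.etree.ElementTree understands a limited subset of XPath. A minimal sketch follows; the XML is reconstructed from the JSON version in these notes (an assumption), and because ElementTree paths are evaluated relative to an element, the leading /classroster is replaced by starting at the root element.

```python
import xml.etree.ElementTree as ET

# Running example (reconstructed; exact original markup is an assumption).
ROSTER = """
<classroster>
  <student snum="12345">
    <sname>Owen Kaser</sname>
    <gpa>1.1</gpa>
    <friends>
      <friend><student snum="6789"/></friend>
      <friend><student snum="1111"><gpa>2.2</gpa></student></friend>
    </friends>
  </student>
  <student snum="33333">
    <sname>Newo Resak</sname>
    <age>25</age>
  </student>
</classroster>
"""

root = ET.fromstring(ROSTER)

# /classroster/student[1]//gpa, written relative to the classroster root:
# first student child, then every gpa descendant of it.
gpas = [g.text for g in root.findall("student[1]//gpa")]
print(gpas)
```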

More XPath examples ------

    //gpa
        gives the set of all gpa elements anywhere. (In this example, there are two.)

    //@snum
        gives the set of all snum attributes anywhere.

    //student//friend//@snum
        gives only those snum attributes somewhere inside a friend element, somewhere inside a student element.
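These three paths can be mimicked in Python as a rough check. This is only a sketch: ElementTree's XPath subset cannot select attribute nodes, so the sketch selects the elements that carry snum and then reads the attribute. The XML is reconstructed from the JSON version in these notes (an assumption).

```python
import xml.etree.ElementTree as ET

# Running example (reconstructed; exact original markup is an assumption).
ROSTER = """
<classroster>
  <student snum="12345">
    <sname>Owen Kaser</sname>
    <gpa>1.1</gpa>
    <friends>
      <friend><student snum="6789"/></friend>
      <friend><student snum="1111"><gpa>2.2</gpa></student></friend>
    </friends>
  </student>
  <student snum="33333">
    <sname>Newo Resak</sname>
    <age>25</age>
  </student>
</classroster>
"""

root = ET.fromstring(ROSTER)

# //gpa : every gpa element, at any depth.
all_gpa = [g.text for g in root.findall(".//gpa")]

# //@snum : every snum attribute, read off whichever element carries it.
all_snum = [e.get("snum") for e in root.iter() if "snum" in e.attrib]

# //student//friend//@snum : snum attributes on elements below a friend
# that is itself below a student.
friend_snum = [e.get("snum")
               for e in root.findall(".//student//friend//*[@snum]")]

print(all_gpa, all_snum, friend_snum)
```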

Wildcards ------

    *        matches any element node
    @*       matches any attribute node
    node()   matches any node
    text()   matches any text node

** Possibly broken note: example queries seem to put node() and text() inside [] for some reason.

eg //friends//@* will return the same thing as //friends//@snum in our example, because we don’t have any other attributes.

Axes ------

//, / and @ are shorthands for longer, axis-based steps.

Axis names:

    child        (the default, used with /)
    attribute    (the @ prefix)
    descendant   (roughly what // gives)
    ancestor
    preceding    (everything before here)
    following    (everything after here)

Eg: descendant::friends/descendant::*/attribute::snum gives the same result as //friends//@snum in our example.

If the current node is the first student, then following::* is the second student, together with the sname and age elements inside it (all that is left).
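ElementTree has no axis support, but the following axis is easy to imitate by hand: take everything later in document order, minus the current node's own descendants. The helper name following is made up for this sketch, and the XML is reconstructed from the JSON version in these notes (an assumption). Note that the axis picks up the second student and also the elements nested inside it.

```python
import xml.etree.ElementTree as ET

# Running example (reconstructed; exact original markup is an assumption).
ROSTER = """
<classroster>
  <student snum="12345">
    <sname>Owen Kaser</sname>
    <gpa>1.1</gpa>
    <friends>
      <friend><student snum="6789"/></friend>
      <friend><student snum="1111"><gpa>2.2</gpa></student></friend>
    </friends>
  </student>
  <student snum="33333">
    <sname>Newo Resak</sname>
    <age>25</age>
  </student>
</classroster>
"""


def following(root, node):
    """Hypothetical helper: elements after `node` in document order,
    excluding the descendants of `node` (like XPath's following::*)."""
    order = list(root.iter())                    # document order
    inside = {id(e) for e in node.iter()}        # node plus its descendants
    start = order.index(node)
    return [e for e in order[start + 1:] if id(e) not in inside]


root = ET.fromstring(ROSTER)
first_student = root.find("student[1]")
tags = [e.tag for e in following(root, first_student)]
print(tags)
```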

Predicates ------

Sometimes, you want to provide tests. Use [], eg

    /classroster/student[@snum>10]/gpa

(Both relevant snums, 12345 and 33333, pass the test.)

    //student[gpa > 1.0]
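ElementTree's XPath subset has no numeric comparisons, so a Python sketch of these two predicates has to do the tests in ordinary code. The helper gpa_above is made up for illustration, and the XML is reconstructed from the JSON version in these notes (an assumption).

```python
import xml.etree.ElementTree as ET

# Running example (reconstructed; exact original markup is an assumption).
ROSTER = """
<classroster>
  <student snum="12345">
    <sname>Owen Kaser</sname>
    <gpa>1.1</gpa>
    <friends>
      <friend><student snum="6789"/></friend>
      <friend><student snum="1111"><gpa>2.2</gpa></student></friend>
    </friends>
  </student>
  <student snum="33333">
    <sname>Newo Resak</sname>
    <age>25</age>
  </student>
</classroster>
"""


def gpa_above(student, limit):
    """Hypothetical helper: true if the student has a gpa child above limit."""
    text = student.findtext("gpa")               # None when there is no gpa
    return text is not None and float(text) > limit


root = ET.fromstring(ROSTER)

# /classroster/student[@snum>10]/gpa : both top-level students pass the
# test, but only the first one actually has a gpa child.
gpas_for_big_snum = [g.text
                     for s in root.findall("student")
                     if int(s.get("snum")) > 10
                     for g in s.findall("gpa")]

# //student[gpa > 1.0] : students at any depth with a gpa child above 1.0.
students_above_1 = [s.get("snum")
                    for s in root.iter("student")
                    if gpa_above(s, 1.0)]

print(gpas_for_big_snum, students_above_1)
```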

Merging node sets ------

The | operator can form a union, eg

    //gpa | //friends
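A union of node sets keeps each node once. A rough Python equivalent, as a sketch only (ElementTree has no | operator, so the two result lists are merged by hand and reported in document order; the XML is reconstructed from the JSON version in these notes, an assumption):

```python
import xml.etree.ElementTree as ET

# Running example (reconstructed; exact original markup is an assumption).
ROSTER = """
<classroster>
  <student snum="12345">
    <sname>Owen Kaser</sname>
    <gpa>1.1</gpa>
    <friends>
      <friend><student snum="6789"/></friend>
      <friend><student snum="1111"><gpa>2.2</gpa></student></friend>
    </friends>
  </student>
  <student snum="33333">
    <sname>Newo Resak</sname>
    <age>25</age>
  </student>
</classroster>
"""

root = ET.fromstring(ROSTER)

gpas = root.findall(".//gpa")
friends = root.findall(".//friends")

# //gpa | //friends : merge, drop duplicates, keep document order.
members = {id(e) for e in gpas + friends}
union = [e for e in root.iter() if id(e) in members]
print([e.tag for e in union])
```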

XPath functions (http://www.w3.org/TR/xpath-functions/) ------

There are MANY useful functions, including

    doc(URL) - read an XML node from a file or URL, eg doc("myfile.xml")
    count(item list), avg(number list) - count, average; also max, min, sum
    filter, fold-left, fold-right, for-each - higher-order functions, beyond our scope
    has-children() - boolean
    head(list of items) - gives first item
    tail(list of items) - all but head
    insert-before(list of items)
    number() - item as a double
    subsequence(list, startpos, length)
    string-length, upper-case
    op:to(low,high), or use the "to" operator: 1 to 5 generates the sequence 1,2,3,4,5
    op:is-same-node, or use the "is" operator
    zillions of date/time functions

XQuery ------

XQuery is supposed to be "the SQL of XML".

Another W3C technology. Uses XPath heavily. All XPath functions listed earlier can be used.

A single XPath expression can be an XQuery

FLWOR expressions: For-Let-Where-OrderBy-Return optional sections, in that order (well, Return is mandatory, and either For or Let)

Variables have a $ prefix and are case sensitive.
Strings are delimited with ' or ".
(: This is a comment :)

Comparisons: = vs eq ------

Suppose XPR is an XPath expression.

XPR eq 5 requires that the XPath set have exactly one value in it. Is it equal to 5?

XPR = 5 can handle XPath sets with several values; if any of them are 5, you’re good

Also, eq would require xsd:integer for XPR but without a schema to say otherwise, XPR might have incompatible type untypedAtomic. But = can handle this.

Similarly for > and gt, <= and le, etc.
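The difference between the two comparison styles can be imitated in a few lines of Python. This is a simplified model, not XQuery's full rules (for instance, it treats an empty sequence as an error for eq). The function names general_eq and value_eq are made up for this sketch, and Python strings stand in for untypedAtomic values.

```python
def general_eq(items, value):
    """Mimic `XPR = value`: true if ANY item, cast to a number, equals value."""
    return any(float(item) == value for item in items)


def value_eq(items, value):
    """Mimic `XPR eq value`: exactly one item, and it must already be typed."""
    if len(items) != 1:
        raise TypeError("eq needs exactly one item, got %d" % len(items))
    (item,) = items
    if isinstance(item, str):
        # Untyped text stands in for untypedAtomic here.
        raise TypeError("eq will not compare untyped text with a number")
    return item == value


gpas = ["1.1", "2.2"]              # untyped text, as read from schema-less XML
print(general_eq(gpas, 2.2))       # one of the two items matches

# eq fails on two items, fails on untyped text, succeeds on one typed number.
for attempt in (gpas, ["2.2"], [2.2]):
    try:
        print(attempt, value_eq(attempt, 2.2))
    except TypeError as err:
        print(attempt, "error:", err)
```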

XQuery: for ------

    for VAR in EXPRESSION

where EXPRESSION is an XPath expression or a function giving a sequence.

    eg: for $foo in doc("myfile.xml")//friend
    eg: for $foo in (3 to 10)

XQuery: nested for ------

    for $foo in (1 to 5), $bar in (2 to 6)

Essentially, the $bar loop is nested inside the $foo loop.

XQuery: let ------

Let allows an assignment, and goes with the := operator:

    let $midpoint := ($start + $finish) div 2

XQuery: where ------

The where clause provides additional filtering, beyond what you may have done with predicates in the XPath statement.

If you have "nested fors", this can be extra useful.

    for $x in (1 to 5), $y in (1 to 5)
    let $mid := ($x + $y) div 2
    where $x gt $y
    return { $mid }
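If FLWOR feels unfamiliar, it may help to see the same for/let/where/return logic as a Python list comprehension. This is only an analogy: Python's range(1, 6) plays the role of (1 to 5), and / plays the role of div.

```python
# for $x in (1 to 5), $y in (1 to 5)   -> two nested loops, $x outermost
# let $mid := ($x + $y) div 2          -> the expression being collected
# where $x gt $y                       -> the trailing if
# return $mid                          -> the comprehension's result
mids = [(x + y) / 2
        for x in range(1, 6)
        for y in range(1, 6)
        if x > y]

print(len(mids))   # number of (x, y) pairs with x > y
print(mids[:3])    # results for (2,1), (3,1), (3,2)
```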

XQuery: order by ------

Order by does what you expect...

order by $x/gpa

XQuery: return ------

Each ordered result then produces an XML fragment.

You can use an XPath expression eg return $x/gpa

Or you can generate some XML elements with varying stuff based on your variables. Note the use of {}

eg return {$x}

There is an if-then-else expression, plus all those lovely XPath functions.

Defining your own functions ------

You can even define your own functions, though that’s getting too fancy for us.

For instance

    (: user-defined functions normally live in a namespace such as local: :)
    declare function local:foo($param1 as node()?,
                               $param2 as xs:anyAtomicType)
        as xs:anyAtomicType*
    {
      if ($param1) then fn:data($param1) else $param2
    };

XQuery and XML in Oracle 12c ------

See docs.oracle.com/database/121/ADXDB/xdb_xquery.htm

The SQL standard has a part called SQL/XML that standardizes the use of XML in SQL databases.

Oracle SQL uses a data type XMLTYPE whereas SQL/XML says it should have been called XML.

You can have columns of xmltype in your tables.

Optionally, tell Oracle how to store the XML columns and do things to help with indexing the XML.

eg
    create table t (x xmltype);

Getting some XML into a table ------

There are a variety of ways to get XML.

One is to use xmltype as a constructor, to convert a string into XML.

    insert into t values (xmltype('<classroster>
        <student snum="12345">
          <sname> Owen Kaser </sname>
          <gpa> 1.1 </gpa>
          ....stuff omitted...
      </classroster>'));

Table t now has one row.

Getting XML out of a table ------

Supposedly, you can process the whole table as a hunk of XML. In a table with an XMLType column and nothing else, Oracle docs suggest you can retrieve the "pseudocolumn" OBJECT_VALUE, eg

    select object_value from t;

Note that it didn’t work for me. (OBJECT_VALUE likely applies only to tables created as "create table t of xmltype"; with our table t, a plain "select x from t" retrieves the XML.)

Getting XML fragments out ------

You can process individual fragments of XML from a table using "XMLQuery".

It takes an XQuery and passes in a value (t.x below) for processing:

    select xmlquery('for $i in //student
                     where $i/gpa > 1.5
                     order by $i/gpa
                     return $i/@snum'
                    passing x returning content)
    from t;

This returns 1111. The "passing" and "returning content" keywords are mandatory with xmlquery.
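To see why the answer is 1111, the same for/where/order by/return steps can be traced in Python. This is a sketch of the query's logic only, not of how Oracle evaluates it, and the XML is reconstructed from the JSON version in these notes (an assumption).

```python
import xml.etree.ElementTree as ET

# Running example (reconstructed; exact original markup is an assumption).
ROSTER = """
<classroster>
  <student snum="12345">
    <sname>Owen Kaser</sname>
    <gpa>1.1</gpa>
    <friends>
      <friend><student snum="6789"/></friend>
      <friend><student snum="1111"><gpa>2.2</gpa></student></friend>
    </friends>
  </student>
  <student snum="33333">
    <sname>Newo Resak</sname>
    <age>25</age>
  </student>
</classroster>
"""

root = ET.fromstring(ROSTER)

# for $i in //student  where $i/gpa > 1.5
matches = [s for s in root.iter("student")
           if s.findtext("gpa") is not None
           and float(s.findtext("gpa")) > 1.5]

# order by $i/gpa
matches.sort(key=lambda s: float(s.findtext("gpa")))

# return $i/@snum
result = [s.get("snum") for s in matches]
print(result)
```

Only the student nested inside the second friend element has a gpa child above 1.5; the first top-level student has 1.1, and the others have no gpa at all.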

"Passing as" ------

You can associate a table attribute with a variable using "passing as". Below, variable $d is set to the chunk of XML in attribute x.

    select xmlquery('for $i in $d//student
                     where $i/gpa > 1.5
                     order by $i/gpa
                     return $i/@snum'
                    passing x as "d" returning content)
    from t;

This returns 1111, as before.

XMLExists and XMLcast ------

XMLExists is useful for Boolean tests. It returns true if an XQuery expression has any results.

XMLCast can turn XML into a string, discarding tags:

    select xmlcast(x as varchar(1000))
    from t
    where xmlexists('$y//gpa' passing x as "y");

This returns a varchar version of the text portions of our data, since there was at least one gpa element.

    Owen Kaser 1.1 2.2 Newo Resak 25

XMLTable ------

XMLTable converts the result of an XQuery into a relational table.

Suppose that our usual table t had its usual XML column x, but also had a VARCHAR column y:

    select y, xx.stuNum, xx.gpaCol
    from t,
         xmltable('/classroster//student'
                  passing t.x
                  columns stuNum int    path '@snum',
                          gpaCol number path 'gpa') xx
    where xx.gpaCol > 1.2;

For each row of t, t.x will be turned into a little table with a row for each student element in t.x, along with the student’s id number and gpa (which might be null). The where clause then keeps only the rows whose gpa exceeds 1.2.
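The flattening that XMLTable performs can be pictured with a short Python sketch. It models only the idea, not Oracle's implementation: table t is a list of (y, x) pairs, the y value "section A" is hypothetical, None stands in for SQL NULL, and the XML is reconstructed from the JSON version in these notes (an assumption).

```python
import xml.etree.ElementTree as ET

# Running example (reconstructed; exact original markup is an assumption).
ROSTER = """
<classroster>
  <student snum="12345">
    <sname>Owen Kaser</sname>
    <gpa>1.1</gpa>
    <friends>
      <friend><student snum="6789"/></friend>
      <friend><student snum="1111"><gpa>2.2</gpa></student></friend>
    </friends>
  </student>
  <student snum="33333">
    <sname>Newo Resak</sname>
    <age>25</age>
  </student>
</classroster>
"""

# Table t with columns (y, x); the y value is made up for illustration.
t = [("section A", ROSTER)]

rows = []
for y, x in t:
    doc = ET.fromstring(x)
    for s in doc.iter("student"):                # /classroster//student
        gpa_text = s.findtext("gpa")             # path 'gpa'; may be missing
        gpa_col = float(gpa_text) if gpa_text is not None else None
        stu_num = int(s.get("snum"))             # path '@snum'
        rows.append((y, stu_num, gpa_col))

# where xx.gpaCol > 1.2 : a NULL gpa never satisfies the comparison.
selected = [r for r in rows if r[2] is not None and r[2] > 1.2]

print(len(rows))
print(selected)
```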

Manufacturing XML yourself ------

Use XMLElement and XMLAttributes as follows:

    select xmlelement("foo",
             xmlelement("contact",
               xmlattributes(e.employeeid as "bar")))
    from employee_t e;

to produce one toplevel foo element per employee, each containing a contact element whose bar attribute holds that employee’s employeeid.

XMLAgg ------

If you want to put the various rows into a single XML item, use XMLAgg as follows:

    select xmlelement("foo",
             (xmlagg(
                xmlelement("contact",
                  xmlattributes(e.employeeid as "bar")))))
    from employee_t e;

which gives a single foo element containing one contact element per employee.
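The difference between one element per row and a single aggregated element can be sketched in Python with ElementTree. The employee ids 101 and 102 are hypothetical stand-ins for the rows of employee_t, and the exact text Oracle prints may differ from what ElementTree serializes.

```python
import xml.etree.ElementTree as ET

employee_ids = [101, 102]      # hypothetical rows of employee_t


def contact(emp_id):
    """Build <contact bar="..."/> for one employee (helper for this sketch)."""
    return ET.Element("contact", {"bar": str(emp_id)})


# Like xmlelement("foo", xmlelement("contact", ...)): one foo per row.
per_row = []
for emp_id in employee_ids:
    foo = ET.Element("foo")
    foo.append(contact(emp_id))
    per_row.append(ET.tostring(foo, encoding="unicode"))

# Like xmlelement("foo", xmlagg(...)): one foo holding every contact.
agg = ET.Element("foo")
for emp_id in employee_ids:
    agg.append(contact(emp_id))
aggregated = ET.tostring(agg, encoding="unicode")

for text in per_row:
    print(text)
print(aggregated)
```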

Summary ------

Conventional relational databases like Oracle can handle semi-structured XML documents.

XPath and XQuery give many opportunities to work with the XML information

(But native XML databases might be more efficient if all you want to do is process lots of XML.)

XML’s alternative: JSON ------

XML is rather verbose. It has a lot of technology built around it (eg, XQuery) and libraries for XML processing are available for all major languages.

It has been widely used for information exchange. But its verbosity hurts.

JSON (Javascript Object Notation) now challenges XML for information interchange. It’s simpler.

JSON’s essentially a dump of a Javascript data structure. So if the JSON is received by Javascript code (eg running in a browser for a web application), it is more easily dealt with than XML.

JSON example (our running example in JSON) ------

Thanks to http://www.utilities-online.info/xmltojson/

    {
      "classroster": {
        "student": [
          {
            "-snum": "12345",
            "sname": " Owen Kaser ",
            "gpa": " 1.1 ",
            "friends": {
              "friend": [
                { "student": { "-snum": "6789" } },
                { "student": { "-snum": "1111", "gpa": " 2.2 " } }
              ]
            }
          },
          {
            "-snum": "33333",
            "sname": " Newo Resak ",
            "age": " 25 "
          }
        ]
      }
    }

Ease of Use ------

Javascript code easily reads JSON into an in-memory object and processes it there, natively.
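The same convenience shows up in most languages, not only Javascript. As a minimal sketch, here is Python (standing in for Javascript) loading the running example with the standard json module and then navigating it with ordinary dictionary and list indexing. The JSON text is the example from these notes, with an outer opening brace assumed.

```python
import json

# The running example in JSON (outer opening brace assumed).
ROSTER_JSON = """
{"classroster": {"student": [
    {"-snum": "12345", "sname": " Owen Kaser ", "gpa": " 1.1 ",
     "friends": {"friend": [
         {"student": {"-snum": "6789"}},
         {"student": {"-snum": "1111", "gpa": " 2.2 "}}]}},
    {"-snum": "33333", "sname": " Newo Resak ", "age": " 25 "}]}}
"""

data = json.loads(ROSTER_JSON)          # one call gives nested dicts and lists

students = data["classroster"]["student"]
names = [s["sname"].strip() for s in students]

# Friends of the first student: plain indexing, no parser API needed.
friend_snums = [f["student"]["-snum"]
                for f in students[0]["friends"]["friend"]]

first_gpa = float(students[0]["gpa"])   # values arrive as strings here

print(names, friend_snums, first_gpa)
```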

Something similar using XML requires more complex processing: a "DOM parser" builds a data structure that requires tedious API calls to process.

Though XML is a pain to process this way, it has nifty standardized tools like XQuery that enable higher-level processing. The JSON world wasn’t as rich.

But XQuery 3.1 (we use earlier versions) has support for converting JSON to/from XML.

REST web services tend to prefer JSON over XML. SOAP web services require XML.

SOAP can do some things that REST can’t. But most people don’t need to do these complicated things.

SOAP/XML vs REST/JSON example ------

From keithba.net/simplicity-and-utility-or-why-soap-lost

    POST /customers HTTP/1.1
    Host: www.example.org
    Content-Type: application/soap+xml; charset=utf-8

    [SOAP envelope carrying the customer ID 43456]

for SOAP, vs for REST

    GET /customers/43456 HTTP/1.1
    Host: www.example.org

SOAP/XML vs REST/JSON example, 2 ------

Data returned, first for SOAP

    HTTP/1.1 200 OK
    Content-Type: application/soap+xml; charset=utf-8

    [SOAP envelope carrying the customer name Foobar Quux, inc]

then for REST

    HTTP/1.1 200 OK
    Content-Type: application/json; charset=utf-8

    {'Customer': 'Foobar Quux, inc'}