XML Query (XQuery) Support in Oracle 10g Release 2

An Oracle White Paper May 2005


Introduction
Note on Future Compatibility
The Structure of XQuery
Database or Mid-Tier?
SQL and XQuery
Differences between SQL and XQuery
Using XQuery in the Database
Querying XML Documents in the Oracle XML DB Repository
Full-Text Searches of XML
Querying a Relational Table or View as if it were XML Data
Querying Native XMLType data in Oracle XML DB
Querying Across Databases and Filesystems, or Websites
Some Interesting Cases
Type Checking
Namespaces
Performance and Scalability: Co-processor vs. Native Compilation
Storage Optimization
Intra-Query Optimization
Inter-Query Optimization
Beyond Co-Processor: Oracle's Native XQuery Implementation
Areas of Limited Support
Implementation-specific choices
Implementation Departures from the XQuery Standard
Support for XQuery Functions and Operators
Benefits of XQuery
Conciseness and Simplicity
Heterogeneous Queries
Leveraging the XML Data Model
XML Construction
XQuery vs. XSLT
The Database XQuery Environment
Top Level Query Support through SQL*Plus
XDK
XQuery in the Mid-Tier
XQJ
Query Pushdown
Using XQuery in the Mid-Tier
Querying a RSS Feed
Oracle XML Query Services
Conclusion


INTRODUCTION

XQuery makes its debut in Oracle Database 10g Release 2.

XML Query (XQuery) is a standards effort undertaken by the World Wide Web Consortium (W3C) to enable data to be extracted from XML documents. XQuery is designed to work with the XML data model and to be a comprehensive query language for data expressed in XML, just as SQL has been the query language for much of the world's structured data expressed as relational tables, and as keyword searches have powered much of the information access on the Internet. XQuery is also useful for constructing XML data and documents as the result of query expressions.

As XQuery nears finalization, the IT community has started investigating the business uses of XML and determining what value XQuery might provide. As the innovation leader in commercial database technology, Oracle debuts in Oracle Database 10g Release 2 a full-featured native XQuery engine, integrated with the traditional Oracle database server, to help organizations explore their XQuery needs.

With Oracle Database 9i Release 2, we introduced a native XML store, XML DB, integrated with and part of the traditional Oracle Database. XML DB provides high-fidelity storage and retrieval of XML documents, and is useful for certain classes of data-management tasks:

• Content Management: standards-based life-cycle management of content expressed in XML (e.g. technical manuals, multi-media messages, legal statutes, regulatory filings)
• Exchange and Storage: generation of template-based business documents expressed as XML (e.g. purchase orders, bills, invoices, reports) to exchange between applications, and storage of such documents natively in the database
• Data Integration: querying across different types of information assets, such as database records, files, web servers, and news feeds
With 10g Release 1 and Release 2, the rapidly maturing XML DB technologies included with the Oracle Database are being used for a variety of applications which incorporate the above tasks, such as:

• Product Documentation and Technical Manuals
• Product Specification and Design Documents
• Web Site Content
• News and Information Services
• Legislative Documents
• Contracts
• Financial Reporting
• Document Publishing

From the perspective of developers building such applications, the initial use of XML as a storage format is broadening into the need to query the stored XML. The common reasons typically cited for using XQuery to query XML are:

• Conciseness and simplicity: it is estimated that accessing XML data through XQuery results in one-fifth the code of XSLT-oriented approaches and one-twentieth the code of DOM-based approaches.
• Heterogeneous queries: since XML can map to a large number of different data models, one could search databases, files, and web services simultaneously using XQuery.
• XML data model: XQuery is based on XML Schema. The XML Schema-based data model is best suited for representing variable, unpredictable, and irregular (i.e. 'semi-structured') data. When querying such data, XQuery is an obvious choice.
• XML construction: XQuery can construct XML as the result of evaluating query expressions, in many cases more expressively and efficiently than XSLT.

Note on Future Compatibility

Since XQuery is not yet a final W3C recommendation, the current support is not guaranteed to be backwards-compatible.

At the time of Oracle Database 10g Release 2, XQuery is in the final stages of becoming a World Wide Web Consortium (W3C) recommendation. Oracle Database 10g Release 2 supports the April 2005 version of the XQuery language specifications. The final standard might differ from these late-stage versions. Oracle will continue to track the evolution of the XQuery standard until it becomes a recommendation. During this period, Oracle does not guarantee backward compatibility of your XQuery code. After the XQuery standard becomes a recommendation, Oracle will produce a release of Oracle Database that is compatible with the recommendation. From that point on, standard Oracle policies will apply to the Oracle XQuery implementation.

The Structure of XQuery

As a brief introduction to XQuery, let us assume we are operating on a simple document named emp.xml, which records employee names and salaries in ename and salary attributes.

This document could be stored as a file, or generated as virtual XML document out of an Oracle database. To get the names of those employees with salary > 80000, the following XQuery fragment can be used:

for $i in doc("emp.xml")/empset
let $j := 80000
where $i/@salary > $j
return $i/@ename

The result is the following attribute node:

JONES

Let us look at the basic structure of XQuery. The language consists of the following major constructs.

For -- Analogous to the FROM clause of a SQL SELECT statement, the For clause lets you iterate across a sequence of values.

Let -- Analogous to the SQL SET statement, the Let clause lets you define variables and assign them in turn during iteration through a For clause.

Where -- Analogous to the WHERE clause of a SQL SELECT statement, the Where clause provides a set of conditions that filter or limit the initial selection in a For clause.

Return -- Analogous to the SQL RETURN statement, the XQuery Return clause creates output in a custom formatting language. The output does not necessarily have to be XML, although it is optimized to produce XML.

Order-by -- Analogous to ORDER BY in SQL, Order-by provides the ordering constraints on a sequence.

Namespaces -- XQuery's declare namespace prolog declaration associates a namespace URI with a prefix, which is crucial for indicating scoping rules on functionality. This does not have an obvious SQL analog.

Functions and Operators -- Analogous to SQL functions and operators, XQuery comes with certain built-in functions and operators to perform, for example, common numeric (fn:abs, fn:ceiling) and string (fn:compare, fn:string-join) operations (see footnote 1). In addition, analogous to SQL user-defined functions or PL/SQL stored procedures, you can create user-defined functions in XQuery, and these can be called inline in an XQuery expression.

XPath -- Most XPath 2.0 functions are supported in XQuery, as is the tree-walking model used to navigate XML node hierarchies (though of course some data sources might not support all aspects of XPath 2.0 because of the types of data involved).

Most XQueries have a For-Let-Where-Return or For-Let-Where-Order By-Return construction. As a result we often use the acronym FLWR or FLWOR to represent XQuery constructs.
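To make the roles of the clauses concrete, the following is a minimal sketch in Python (standard library only) that mimics the For, Let, Where, Order-by, and Return steps over a small in-memory document. It illustrates the order in which the clauses apply and is not how Oracle evaluates XQuery. The sample data, the emp element name, and the high_earners helper are all assumptions introduced for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for emp.xml: the element layout and values are assumed.
EMP_XML = """
<empset>
  <emp ename="SMITH" salary="60000"/>
  <emp ename="JONES" salary="95000"/>
  <emp ename="ADAMS" salary="72000"/>
</empset>
"""

def high_earners(xml_text, threshold=80000):
    """Mimic a FLWOR expression over the employee document."""
    root = ET.fromstring(xml_text)
    limit = threshold                                   # Let: bind a variable
    matches = [
        emp for emp in root.findall("emp")              # For: iterate a sequence
        if float(emp.get("salary")) > limit             # Where: filter
    ]
    matches.sort(key=lambda emp: emp.get("ename"))      # Order-by: sort the survivors
    return [emp.get("ename") for emp in matches]        # Return: shape the output

print(high_earners(EMP_XML))          # names with salary above 80000
print(high_earners(EMP_XML, 70000))   # a lower threshold admits more rows
```

The sort is applied before the output is shaped, which mirrors the position of Order-by ahead of Return in a FLWOR expression.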

DATABASE OR MID-TIER?

Oracle supports XQuery running in both database and mid-tier. Where you want to issue XQueries against large amounts of persistent data (whether relational or XML, whether in tables or in files) the database XQuery engine is recommended. Where you need to query small amounts of metadata, transient messages, or configuration information, or where you want to issue XQueries against a large number of heterogeneous sources, the mid-tier XQuery engine is recommended.

Since databases already have a rich (SQL) query execution capability, it is often surmised that XQuery is needed only in the middle tier. Certainly, when you want to integrate heterogeneous data sources, a middle-tier approach is feasible in that you have to address non-database sources. In practice, however, one typically sees that an organization's important data lives best in databases: relationally, as content in CM systems, or as persistent messages (email etc.). The advantages of transactional consistency, high availability, cluster-based scale-out, secure backup and recovery, and auditing are not available if your data is left in files. The evolutionary arrow of content and message management does lead firmly towards databases.

Further, one of the major objections to storing anything other than structured data in a database was that databases could only perform relational algebra; with a file-oriented XML Repository inside Oracle, combined with support for the XML data model, XQuery, etc., Oracle has become the leading platform for storing all your data. Operational data and content, then, will be stored in databases, modeled as XML or SQL, and queried by whichever metaphor is more appropriate. This 'duality' is at the heart of Oracle XML DB.

However, there are still cases where relatively small amounts of metadata, configuration files, or in-memory cached data have to be queried in the middle tier. Oracle provides both a database and a middle-tier implementation of XQuery, and the latter covers all the lightweight query requirements of application servers. The middle-tier implementation is briefly discussed in a separate section below. Figure 1 shows the two XQuery engines.

1 See http://www.w3.org/TR/xquery-operators for the available functions and operators.

Figure 1: XMLQuery: Database & Mid-Tier

SQL AND XQUERY

SQL and XQuery are not competing technologies, but complementary ones. In addition to being the industry's best implementation of SQL, Oracle will also be the leading implementation of XQuery.

Oracle's XQuery support complements the existing industry-leading SQL support in the Oracle database. Existing relational applications will continue to use SQL, and Oracle will remain the industry's best implementation of SQL. New applications based on XML will use XQuery, and Oracle will be the industry's best implementation of XQuery as well.

Oracle will support XQuery both in the database and in the mid-tier. This whitepaper primarily discusses XQuery support in the database; in addition, the same degree of XQuery support will be available as a Java XQuery engine, downloadable from the Oracle Technology Network website http://otn.oracle.com, or available with the Oracle Application Server 10g. Developers can use the mid-tier engine when they want to query non-database sources. Where the data to be queried comes from an Oracle database, we support an intelligent 'query pushdown' from the mid-tier to the database when possible. The XQuery implementation in the database is well integrated with, and part of, the SQL execution engine, which means you can also mix and match XQuery and SQL.

SQL and XQuery are not competing technologies. They are best described as complementary technologies. Let us explore this in the context of an example, a document that describes a 'project' in an organization. Our project is titled "Oracle 10gR1 to 10gR2 Migration for Project Management Database". It was created May 01 2005 by Ed Said, and it describes the plan for migrating the company-wide Project Management Database from Oracle 10gR1 to 10gR2. This is what might be called a natural-language, or text version:

Oracle 10gR1 to 10gR2 Migration for Project Management Database

Created on May 01 2005 by Ed Said
Revised on May 21, 2005 by Dan Wang
Approved on May 23 by John Pilger

We are planning to migrate the company-wide Project Management database from Oracle 10g R1 to Oracle 10gR2. Along with the software migration, the hardware will be migrated from Unix SMP to a Linux RAC.

In the first part of the project, we will procure the Linux hardware and install Oracle 10gR2 with Real Application Clusters. The performance and scalability of this system will be evaluated.

Once the new installation is deemed stable, we will set up replication between the existing system and the new one, and let both run in parallel for a week. A suite of tests will be run against the new installation.

Finally, we will decommission the old system and cut over the project management client applications to the 10gR2 system.

The XML version of this document might be:

Oracle 10gR1 to 10gR2 Migration for Project Management Database 20050501 Ed Said Creation

20050521 Dan Wang Modification

20050523 John Pilger Approval

We are planning to migrate the company-wide Project Management database from Oracle 10g R1 to Oracle 10gR2. Along with the software migration, the hardware will be migrated from Unix SMP to a Linux RAC. In the first part of the project, we will procure the Linux hardware and install Oracle 10gR2 with Real Application Clusters. The performance and scalability of this system will be evaluated. Once the new installation is deemed stable, we will set up replication between the existing system and the new one, and let both run in parallel for a week. A suite of tests will be run against the new installation. Finally, we will decommission the old system and cut over the project management client applications to the 10gR2 system.

What does the XML representation give us that the natural-language-like text version does not? XML gives us:

• Distinct, named logical chunks (elements), which make it easier to describe the data and the type of each field.
• Hierarchical, named structure (sub-elements), which makes it easier to express the data. For example, each action has a date, the name of who performed that action, and the type of action (Creation/Modification/Approval).
• Document order (except for attributes). Breaking the information down into logical chunks (elements) is useful, but in some cases, and especially with passages of text, those logical chunks are only meaningful when you know their order.
• Flexibility. Many people quote "flexibility" as one of their chief reasons for preferring an XML representation of data. However, there is a point beyond which flexibility can be counterproductive. After all, the informal natural-language representation of data is the most flexible one, and that is exactly why we are looking at alternatives.

It is also possible to represent our project document in terms of the SQL data model:


Table 1: Project Summaries

Project No   Budget   Cost Center No   Summary
1278         1000     A01              Oracle 10g R1 to 10gR2 Migration for Project Management Database ….

Table 2: Project Actions

Project No   Name          Action Date   Action Type
1278         Ed Said       2005-05-01    Creation
1278         Dan Wang      2005-05-21    Modification
1278         John Pilger   2005-05-23    Approval

Table 3: Project Bodies

Project No   Para order   Paragraph
1278         1            We are planning to migrate …
1278         2            In the first part of the project, …
1278         3            Once the new installation is deemed stable …
1278         4            Finally, we will decommission …
etc.

This is a fairly naïve representation. We have tried neither to fully normalize the data for efficient disk storage, nor to denormalize it for performance. What does the SQL representation give us that the informal, natural-language description does not? Some of the benefits of the SQL representation are:

• Distinct, named data fields (elements). An element (or cell) is represented by a row/column intersection, named by schema.table.column-name, which makes it easier to express and query the data concisely and precisely.
• Relations between data fields, defined by the table structure and enforced by database referential integrity mechanisms.
• A set-based algebra over the data and the relations, which allows complex queries with well-defined, standard characteristics.
• Stored Procedures, Triggers, Constraints. With SQL, it is easy to code business rules and data integrity rules into the data, something XML does not handle well.

There are two things that were on the XML list that are missing from the SQL list. First, data represented in SQL does not have a default ordering (except in Object-Relational constructs such as varrays). SQL was designed to handle structured data, where order is rarely important; but as you can see from the Project Bodies table, it is not hard to introduce an ordering when it is necessary. Second, people tend to think the SQL representation is less flexible than XML. In some cases this is true; in others, flexibility could be counterproductive. If you allowed anyone to add data in any form, it might reduce the value of the data.

You might choose the SQL representation for your data so that you can manage and query it with tools that are robust, mature, and readily available. You might already own a suite of tools that work well with a SQL representation, and you probably already have the skills in-house to work with SQL. Many applications need to publish data (make data available) to a third-party tool or application. SQL has been around for several decades, and there is a wealth of available tools, applications and skills. On the other hand, you might choose the XML representation because you need to publish data to (or consume data from) a web service, a browser, or an integration application, or manage the data as a 'document.'

Importantly, the SQL representation shown above does not take advantage of any of the relatively recent advances in Object-Relational modelling of hierarchical structures, such as variable-length arrays (for modelling sequences of data elements) or nested tables (for modelling unordered sets of data elements). However, with the gradual introduction of Object-Relational constructs in the SQL standards, the same data we saw represented in natural language and in XML can also be represented in SQL.

SQL:2003 is the first version of SQL that includes a built-in XML datatype (XMLType in Oracle), plus publishing functions (XMLElement, XMLForest, XMLAgg, and others) to create XML directly from relational data in SQL queries. The XML-related functionality in SQL, starting at SQL:2003, is known collectively as SQL/XML (see footnote 2).

2 The SQL/XML standard is ISO/IEC 9075-14:2005(E), Information technology – Database languages – SQL – Part 14: XML-Related Specifications (SQL/XML). As part of the SQL standard, it is aligned with SQL:2003. It is being developed under the auspices of these two standards bodies:
1. ISO/IEC JTC1/SC32 ("International Organization for Standardization and International Electrotechnical Commission Joint Technical Committee 1, Information technology, Subcommittee 32, Data Management and Interchange").
2. INCITS Technical Committee H2 ("INCITS" stands for "International Committee for Information Technology Standards"). INCITS is an Accredited Standards Development Organization operating under the policies and procedures of ANSI, the American National Standards Institute. Committee H2 is the committee responsible for SQL and SQL/MM.

Do not confuse this standard with Microsoft's SQLXML (without the /), which means something different.

The SQL/XML standard has brought together the SQL and XML worlds, and is now making XQuery interoperable with SQL.

Oracle, starting with Oracle9i, supports SQL/XML plus some additional extensions: the functions extract, extractValue, and existsNode, which query and extract XML data. The functions extract and extractValue take as arguments an XML document and an XPath expression, and return the result of evaluating the XPath expression against the XML document. extract returns a sequence of XML nodes; extractValue returns a single value. existsNode accepts the same arguments as extract and returns 1 or 0 depending on whether the XPath matches any node in the document. It is typically used in the WHERE clause of a SQL query. Using these functions, you can construct XML data from relational data, query relational data as if it were XML, and construct relational data from XML data. With XMLType storing data 'as XML', we can conceptualize our project document as:

Project No   Body
1278         Oracle 10gR1 to 10gR2 Migration for Project Management Database 20050501 Ed Said Creation 20050521 Dan Wang Modification 20050523 John Pilger Approval We are planning to migrate the company-wide Project Management database from Oracle 10g R1 to Oracle 10gR2. Along with the software migration, the hardware will be migrated from Unix SMP to a Linux RAC. In the first part of the project, we will procure the Linux hardware and install Oracle 10gR2 with Real Application Clusters. The performance and scalability of this system will be evaluated. Once the new installation is deemed stable, we will set up replication between the existing system and the new one, and let both run in parallel for a week. A suite of tests will be run against the new installation. Finally, we will decommission the old system and cut over the project management client applications to the 10gR2 system.

This SQL/XML standardization process is ongoing. Please refer to http://www.sqlx.org for the latest information about XMLQuery and XMLTable.

In our naïve SQL-92 representation (i.e. without any SQL/XML), it is only possible to treat the project body as a block of text. You can return the whole body as text, or perform certain string operations if you model the body as a varchar. However, you cannot 'step into' the structure of the XML, because SQL-92 knows nothing about the XML structure or content. If you want to use SQL to query inside the XML data in the Body column, you need the SQL/XML extensions, such as XMLType. If we model the table above as

CREATE TABLE PROJECTS_XML (
  ProjectNo number primary key,
  Body XMLType
)

Then we can issue SQL/XML queries such as:

SELECT extract(Body, '//summary') AS summary,
       extractValue(value(x), '/action-desc/name') AS Approver
FROM projects_xml,
     table(xmlsequence(extract(Body, '/project/actions/action-desc'))) x
WHERE extractValue(value(x), '/action-desc/type') = 'Approval'

This returns the summaries of approved projects, together with the name of each approver. As we see, the SQL/XML mechanism for expressing a path inside an XML fragment is XPath. XPath is good at expressing positions and conditions in an XML document, but it is not as fully functional a query language as XQuery. XQuery has a number of advantages over XPath:

• The Let clause makes XQuery queries simpler and easier to read by letting you identify a point in the XML structure and refer to it later in the query with a variable. There is no equivalent in XPath.
• The Return clause allows you to construct new elements that contain the results of XQuery expressions. This means you can return XML in any shape you want. There is no element construction in XPath.
• You can express a join (a query across more than one collection of XML documents) in XQuery, but not in XPath.
• You can specify an ordering for the result of an XQuery with Order By, while XPath results are always in document order.
• The result of an XQuery can contain not just XML nodes but also scalar values, i.e. mixed models are supported.
• The XQuery data model allows type expressions and typing rules to be applied at compile time (statically), reducing run-time errors and the overhead of run-time error checking, i.e. strong typing is supported.

The XQuery specification says very little about input data, output data, or the host language. All these things are purposely left "implementation-defined"; you could say that XQuery is by definition "context-free". These examples could query data stored in a database, or in a file system. They could run in a SQL engine, in a Java program, or both (a Java engine that talks to a database via, e.g., JDBC).

When you are running XQuery inside a database, an obvious extension is an XMLQuery function, one that takes in an XQuery rather than an XPath. This is currently under consideration for the revision of SQL/XML to be part of the proposed SQL:2005. As we have seen, alongside SQL:2003 SQL/XML, Oracle provides the functions extract, extractValue, and existsNode to query and extract XML data. The corresponding functions in SQL:2005 are expected to be XMLQuery, XMLExists and XMLCast.

Oracle Database 10g Release 2 introduces such an XMLQuery function in SQL, which takes in an XQuery FLWOR expression, as well as additional optional arguments, and runs the XQuery inside Oracle. Several of our examples below will illustrate the XMLQuery SQL function. You can also perform 'top level' XQueries from tools like SQL*Plus by issuing the xquery directive.

With SQL/XML, SQL and XQuery become complementary, rather than competing, technologies. If you need to use XQuery to query your data (perhaps queries are coming in from some application that only speaks XQuery), that data can be stored in a database as an XML type, and you can ask the SQL engine to run XQueries on it via the XMLQuery operator. And we have shown that if you want to use SQL to query your data (perhaps you have tools and available skills for SQL but not for XQuery), then you can use the SQL/XML operators with XPath to do the bits that need to look "inside" an XML structure. We will look at some of the use cases of XQuery in the 'Using XQuery in the Database' section below.
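The different return shapes of extract, extractValue, and existsNode described earlier (a node sequence, a single value, and a 1/0 flag) can be sketched outside the database. The following Python fragment, using only the standard library, is an analogy rather than Oracle's implementation. The project, actions, action-desc, name, type, and summary element names come from the paths in the SQL/XML query above; the document layout around them and the helper names extract_nodes, extract_value, exists_node, and approvers are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical project document; only the element names used in the
# SQL/XML query's paths are taken from the text, the rest is assumed.
PROJECT_XML = """
<project>
  <summary>Oracle 10gR1 to 10gR2 Migration for Project Management Database</summary>
  <actions>
    <action-desc><name>Ed Said</name><type>Creation</type></action-desc>
    <action-desc><name>Dan Wang</name><type>Modification</type></action-desc>
    <action-desc><name>John Pilger</name><type>Approval</type></action-desc>
  </actions>
</project>
"""

def extract_nodes(node, path):
    """Like extract: return every node the path matches (possibly none)."""
    return node.findall(path)

def extract_value(node, path):
    """Like extractValue: return one scalar; this sketch rejects multiple matches."""
    nodes = node.findall(path)
    if len(nodes) > 1:
        raise ValueError("path matched more than one node")
    return nodes[0].text if nodes else None

def exists_node(node, path):
    """Like existsNode: 1 if the path matches anything, otherwise 0."""
    return 1 if node.find(path) is not None else 0

def approvers(doc):
    """Names attached to actions of type Approval, as in the SQL/XML query."""
    return [
        extract_value(action, "name")
        for action in extract_nodes(doc, "actions/action-desc")
        if extract_value(action, "type") == "Approval"
    ]

doc = ET.fromstring(PROJECT_XML)
print(extract_value(doc, "summary"))
print(approvers(doc))
print(exists_node(doc, "actions/action-desc[type='Approval']"))
```

Paths here are relative to the root project element, which is how ElementTree resolves them; the SQL/XML query instead writes absolute paths such as /project/actions/action-desc.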

Differences between SQL and XQuery

While they are now interoperable and complementary, SQL and XQuery do have differences related to algebra, model, and comprehensiveness.

While SQL and XQuery are quite complementary, there are several important differences between them. In general, these differences mean that for highly structured relational data, SQL will be the most natural algebra and developers will likely not get value from using XQuery over such data; conversely, for variable, semi-structured data, XQuery will likely be the syntax of choice. The important differences between the two algebras are:

• The type systems are different; the XQuery type system is XML Schema, not SQL.
• XQuery supports namespace-aware name resolution.
• SQL has many extensions (related to analytical processing, data warehousing, full-text search, multimedia management and so on) which are not present in XQuery 1.0.
• XQuery 1.0 has no insert/update/delete capabilities, as SQL does.

USING XQUERY IN THE DATABASE

Query native XML or relational data, as well as XML file content, and also URLs that return data as XML.

There are four typical ways to use XQuery in the database:

• Querying XML Documents in the Oracle XML DB Repository. Increasingly, Content Management applications seek to migrate their content stores from file systems to databases. Databases provide transactional robustness, superior scalability, and, with XQuery, a natural way to query documents stored in Oracle. This usage of XQuery typically dovetails with XCM applications of Oracle XML DB.
• Querying a Relational Table or View as if it were XML data. XML must frequently be generated from relational data for purposes of Exchange. In Oracle 9i Release 2 and Oracle 10g Release 1, we provided SQL/XML operators to help generate XML from relational data. With Oracle Database 10g Release 2, you can also use the Oracle XQuery function ora: to create an XML view over the relational data, on the fly.
• Querying native XMLType data in Oracle XML DB. When you have incoming data as XML that must be stored, it is increasingly important to store it as XML, i.e. without losing any ordering, namespace, or other information that is not supported in the basic relational model. Oracle XML DB provides a native XMLType to store XML with DOM-fidelity or whitespace-fidelity, depending on your needs (see footnote 3). For use cases involving XML Storage, XQueries need to operate on natively stored XMLType data.
• Querying across Databases and Filesystems, or Websites. In certain cases, data in the database may refer to content stored in a file system, perhaps through a URL. Where you control all the data, it will always help to consolidate the file content into the XML DB Repository, converting this case to the first case (Querying XML Documents in the Oracle XML DB Repository). However, in some cases you may not have enough control over the data to move it into your database, as when dealing with a URL on a partner site. Oracle's XQuery implementation can do the following: (i) call out from the database, (ii) read in a file using the doc function, and thereafter (iii) perform a query across both data in the database and a document read in from an external source. We call this use case the Data Integration case.

Let us look at each of these cases in more detail.

Querying XML Documents in the Oracle XML DB Repository

Oracle XML DB uses the term 'resource' to describe content (documents, messages, metadata) stored in the Oracle XML DB Repository. Let us create two 'resources' in the Repository to serve as the data being queried for the first example. Our resources are (i) a project summaries document, which carries summary information about the projects, and (ii) a document that describes the cost centers with which individual projects are associated. These resources can be thought of as documents managed as 'XML files' in Oracle XML DB. Internally, Oracle builds hierarchical indexes to traverse the directory structure of these documents efficiently, and content-based indexes (full-text, B-Tree, XPath) can also be built depending on the desired access.

3 See the Oracle Database 10g Release 2 XML DB Technical Whitepaper for more information.

DECLARE
  res BOOLEAN;
  doc1xmlstring VARCHAR2(3000) := ' Project Summaries

Oracle 10gR1 to 10gR2 Migration for Project Management Database Web Service Implementation for Self Service Vacation Reporting Common API Framework XQJ Api Redesign for Projects Salamander and Phoenix ';

costcentersxmlstring VARCHAR2(500):= ' ';

BEGIN
  res := DBMS_XDB.createResource('/public/doc1.xml', doc1xmlstring);
  res := DBMS_XDB.createResource('/public/costcenters.xml', costcentersxmlstring);
END;
/

You can also create such resources non-programmatically by typing the text in any text editor or XML instance creation tool, and using the XML DB protocols (FTP, HTTP, WebDAV) to load them into the Repository. A simple way of using the protocols from a Windows desktop is to use a popular WebDAV file browser like Windows Explorer® to drag and drop the files into the XML DB Repository. Figure 2 shows the Repository viewed via WebDAV using Windows Explorer.

Figure 2: The Oracle XML DB Repository Viewed from WebDAV

In Oracle XML DB, the functions doc and collection return file and folder resources in the repository, respectively. We thus use the XQuery function doc to obtain a repository file that contains XML data, and then bind XQuery variables to parts of that data using for and let clauses:

SELECT XMLQuery(
         'for $p in doc("/public/doc1.xml")/doc1/body/project
          let $c := doc("/public/costcenters.xml")//costcenter
                      [@costcenterno = $p/@costcenterno]/@name
          where $p/@budget > 4000
          order by $p/@projectno
          return '
         RETURNING CONTENT)
FROM DUAL;

Note the FLWOR construct in the query above. The result for this query is:

XMLQUERY('FOR$PINDOC("/PUBLIC/DOC1.XML")/DOC1/BODY/PROJECTLET$C:=DOC("/PUBLIC/CO
--------------------------------------------------------------------------------

1 row selected.
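The for, let, where, order by and return steps of the query above can also be traced by hand outside the database. The following is only a sketch using the XPath API that ships with the JDK, not Oracle's implementation: the class name FlworSketch, the sample attribute values and the "projectno -> name" output format are assumptions made for illustration. Only the element and attribute names are taken from the query.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class FlworSketch {
    // Hypothetical sample data; the attribute values are invented for this sketch.
    static final String PROJECTS =
        "<doc1><body>"
      + "<project projectno=\"P2\" costcenterno=\"C05\" budget=\"9000\"/>"
      + "<project projectno=\"P1\" costcenterno=\"A01\" budget=\"5000\"/>"
      + "<project projectno=\"P3\" costcenterno=\"12F\" budget=\"1000\"/>"
      + "</body></doc1>";
    static final String COSTCENTERS =
        "<costcenters>"
      + "<costcenter costcenterno=\"A01\" name=\"Database Administration\"/>"
      + "<costcenter costcenterno=\"C05\" name=\"Application Development\"/>"
      + "<costcenter costcenterno=\"12F\" name=\"IT Architecture\"/>"
      + "</costcenters>";

    static List<String> run() {
        try {
            XPath xp = XPathFactory.newInstance().newXPath();
            // for $p in doc(...)/doc1/body/project
            NodeList projects = (NodeList) xp.evaluate("/doc1/body/project",
                    new InputSource(new StringReader(PROJECTS)), XPathConstants.NODESET);
            List<Element> kept = new ArrayList<>();
            for (int i = 0; i < projects.getLength(); i++) {
                Element p = (Element) projects.item(i);
                // where $p/@budget > 4000
                if (Double.parseDouble(p.getAttribute("budget")) > 4000) {
                    kept.add(p);
                }
            }
            // order by $p/@projectno
            kept.sort(Comparator.comparing((Element p) -> p.getAttribute("projectno")));
            List<String> out = new ArrayList<>();
            for (Element p : kept) {
                // let $c := doc(...)//costcenter[@costcenterno = $p/@costcenterno]/@name
                String lookup = "string(//costcenter[@costcenterno='"
                        + p.getAttribute("costcenterno") + "']/@name)";
                String c = xp.evaluate(lookup, new InputSource(new StringReader(COSTCENTERS)));
                // return: one line per surviving project (hypothetical output shape)
                out.add(p.getAttribute("projectno") + " -> " + c);
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The hand-written loop, filter and sort are what the single FLWOR expression states declaratively; in the database the same steps are left to the optimizer.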

Oracle SQL*Plus also supports the XQuery command, which allows you to type in FLWOR expressions without the surrounding SQL operators. Below, we show an equivalent query to the one above, which returns the same result.

SQLPLUS > xquery for $p in doc("/public/doc1.xml")/doc1/body/project,
                     $c in doc("/public/costcenters.xml")//costcenter
                 where $p/@costcenterno = $c/@costcenterno
                   and $p/@budget > 4000
                 order by $p/@projectno
                 return '

Full-Text Searches of XML

With Oracle Database 10g Release 2, you can go beyond some of the current limitations of XQuery in searching full-text.

Full-text keyword searches are also often important for document-oriented XML. Unfortunately, XQuery 1.0 does not support full-text searching of XML documents. A related standards effort, XQuery 1.0 and XPath 2.0 Full-Text [4], is still in the early stages of formulation. While we wait for these standards to emerge, Oracle adds a function, ora:contains, to enable full-text keyword searches as part of XQueries.

[4] See http://www.w3.org/TR/2005/WD-xquery-full-text-20050404/ for more information on the proposal for a language to extend XQuery 1.0 and XPath 2.0 with Full-Text Search capabilities.

The ora:contains function can leverage an Oracle Text index on the XML documents if one has been created. Otherwise, ora:contains is evaluated functionally. The example below queries for cost centers that deal with projects whose summary contains one of the keywords ‘Migration’ or ‘Web Service’:

SQLPLUS> xquery for $p in doc("/public/doc1.xml")/doc1/body/project,
                    $c in doc("/public/costcenters.xml")//costcenter
                where $p/@costcenterno = $c/@costcenterno
                  and ora:contains($p/summary,''Migration or "web service"'') > 0
                order by $p/@projectno
                return

This returns:

XMLQUERY('FOR$PINDOC("/PUBLIC/DOC1.XML")/DOC1/BODY/PROJECT,$CINDOC("/PUBLIC/COST
--------------------------------------------------------------------------------

2 item(s) selected.

Hypothetically, once the XQuery standards evolve to include Full-Text, the ora:contains function in the above fragment may be replaceable by, say,

ftcontains "Migration" || "web service"

assuming ftcontains is the name of the standard XQuery Full-Text operator. In other words, full-text queries might change in syntax, but are expected to work functionally in the same way as ora:contains.

The Oracle database has long provided full-text search as part of the Oracle Text contains function, and the related operators WITHIN, HASPATH, and INPATH. The Oracle Text contains provides advanced search functionality, such as scoring, which is not part of ora:contains at present. The ora:contains function is designed to provide an XQuery-oriented full-text grammar, while the SQL contains is a more general full-text search mechanism described by the Oracle Text BNF grammar.

The example below finds purchaseOrders that contain the keyword electric anywhere in the document, have some item with a USPrice equal to "148.95", and also contain 10 in the purchaseOrder attribute orderDate. It scores the results and orders by score.

SELECT score(10), id
FROM purchase_orders
WHERE contains(doc, 'electric AND
                     HASPATH (/purchaseOrder//item/USPrice="148.95") AND
                     10 INPATH (/purchaseOrder/@orderDate)', 10) > 0
ORDER BY score(10) DESC;

Unlike ora:contains , which can be functionally invoked, the Oracle Text contains requires an Oracle Text index to be present. For invoking advanced full-text search functionality, beyond the likely scope of XQuery Full-Text, please refer to the Oracle Text whitepapers and documentation.

Querying a Relational Table or View as if it were XML Data

The second example uses the relational tables regions and countries in the standard sample schema hr available with Oracle Database 10g Release 2. Querying across relational data is very common for Application-to-Application data Exchange scenarios. While SQL will usually be more expressive and intuitive for querying relational data, in certain cases you may want to use XQuery, for instance if you have to feed the results to an XML consumer or a web service.

SQL> desc regions
 Name           Null?     Type
 -------------- --------- ------------
 REGION_ID      NOT NULL  NUMBER
 REGION_NAME              VARCHAR2(25)

SQL> desc countries
 Name           Null?     Type
 -------------- --------- ------------
 COUNTRY_ID     NOT NULL  CHAR(2)
 COUNTRY_NAME             VARCHAR2(40)
 REGION_ID                NUMBER

The XQuery that joins these two tables using ora:view, followed by its result, is:

SELECT XMLQuery('for $i in ora:view("REGIONS"),
                     $j in ora:view("COUNTRIES")
                 where $i/ROW/REGION_ID = $j/ROW/REGION_ID
                   and $i/ROW/REGION_NAME = "Asia"
                 return $j'
                RETURNING CONTENT) AS asian_countries
FROM DUAL;

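ASIAN_COUNTRIES
--------------------------------------------------------------------------------
<ROW><COUNTRY_ID>AU</COUNTRY_ID><COUNTRY_NAME>Australia</COUNTRY_NAME><REGION_ID>3</REGION_ID></ROW>
<ROW><COUNTRY_ID>CN</COUNTRY_ID><COUNTRY_NAME>China</COUNTRY_NAME><REGION_ID>3</REGION_ID></ROW>
<ROW><COUNTRY_ID>HK</COUNTRY_ID><COUNTRY_NAME>HongKong</COUNTRY_NAME><REGION_ID>3</REGION_ID></ROW>
<ROW><COUNTRY_ID>IN</COUNTRY_ID><COUNTRY_NAME>India</COUNTRY_NAME><REGION_ID>3</REGION_ID></ROW>
<ROW><COUNTRY_ID>JP</COUNTRY_ID><COUNTRY_NAME>Japan</COUNTRY_NAME><REGION_ID>3</REGION_ID></ROW>
<ROW><COUNTRY_ID>SG</COUNTRY_ID><COUNTRY_NAME>Singapore</COUNTRY_NAME><REGION_ID>3</REGION_ID></ROW>

1 row selected.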

If we look at an execution plan for such a query, we see that it is the same as a relational plan, i.e. the XQuery has been compiled exactly into a relational query, since in this case we are querying relational data. In other cases, where, say, native XMLType storage exists for XML data, you will see plans optimized to access the XML directly from XMLType columns.

Querying Native XMLType data in Oracle XML DB

XMLType data stored natively in Oracle XML DB can be based on an XML Schema, or not. For data based on XML Schema, such as purchaseorder in the oe sample schema, you would use the function XMLTable and the PASSING clause to provide the purchaseorder table as the context item for the XQuery execution. The pseudocolumn COLUMN_VALUE of the resulting virtual table holds a constructed element, A10po, which contains the Reference information for those purchase orders whose CostCenter element has value A10 and whose User element has value SMCCAIN. The query performs a join between the virtual table and database table purchaseorder.

SELECT xtab.COLUMN_VALUE
FROM purchaseorder,
     XMLTable('for $i in /PurchaseOrder
               where $i/CostCenter eq "A10"
                 and $i/User eq "SMCCAIN"
               return '
              PASSING OBJECT_VALUE) xtab;

The returned result is:

COLUMN_VALUE
--------------------------------------------------------------------------------

You can also use XQuery to query non-schema-based XMLType data. Further examples are provided in Oracle Database 10g Release2 documentation.

Querying Across Databases and Filesystems, or Websites

Let us consider a case where a database contains information that references other content stored externally, perhaps in a file system. Some queries need to address both the internal and external content. This illustrates the integrative aspects of XQuery: increasingly, XML is an über model that can be used to map a variety of other models, and thus XQuery can be used to ‘join’ them against each other.

Let us assume the project summaries information in the earlier example is stored in a file, and accessible via a web server at http://localhost:80/public/doc1.xml. The contents of this file are (as before):

Project Summaries

Oracle 10gR1 to 10gR2 Migration for Project Management Database

Web Service Implementation for Self Service Vacation Reporting Common API Framework XQJ Api Redesign for Projects Salamander and Phoenix

The cost-center information, however, is now modeled as a relational table:

SQL> desc costcenters
 Name           Null?     Type
 -------------- --------- ------------
 COSTCENTERNO             VARCHAR2(3)
 NAME                     VARCHAR2(30)

We insert the same data as before, so we have:

SQL> select * from costcenters;

COS NAME
--- ------------------------------
A01 Database Administration
C05 Application Development
12F IT Architecture

To query across relational data in Oracle as well as file content accessible through HTTP, we can write the following XQuery. Note the use of ora:view() to access the relational table, and httpuritype() to access the file.

SELECT XMLQuery(
         'for $i in ora:view("COSTCENTERS")/ROW,
              $j in $ext/doc1/body/project
          where $i/COSTCENTERNO eq $j/@costcenterno
          return $j'
         PASSING xmlparse(document
                   httpuritype('http://localhost:80/public/doc1.xml').getclob())
                 AS "ext"
         RETURNING CONTENT)
FROM DUAL;

The result is:

XMLQUERY('FOR$IINORA:VIEW("COSTCENTERS")/ROW,$JIN$EXT/DOC1/BODY/PROJECTWHERE$I/C
--------------------------------------------------------------------------------

Oracle 10gR1 to 10gR2 Migration for Project Management Database Web Service Implementation for Self Service Vacation Reporting Common API Framework XQJ Api Redesign for Projects Salamander and Phoenix


Since the external document has to be parsed into XML upon access, clearly this approach is suitable only for occasional access to small documents. If you need to frequently and efficiently query large XML files, you have two possible approaches. If you have control over the content, the best approach is to store these files in the Oracle XML DB Repository where, while maintaining the file-abstraction and while continuing to access the documents as files, you can also index them as XML and query them efficiently. This is the use-case described in the Querying XML Documents in the Oracle XML DB Repository section above, and is represented in Figure 3.

Figure 3: Moving File Content inside Oracle

If you do not control the content, or if it is not possible to move the files, then the best option is to crawl them using Oracle Ultra Search, and build a Full-Text/XML/Metadata attribute index using Oracle Text. The Full-Text index supports XPath-based section searching, as described above in the ‘Full-Text Searches of XML’ subsection. This is represented in Figure 4.


Figure 4: Using Oracle Text for XML Searching

In addition, you could use the Mid-Tier XQuery engine, as shown in Figure 6.

Some Interesting Cases

Type Checking

Oracle XML DB performs static (that is, compile-time) type-checking of XQuery expressions. It also performs dynamic (runtime) type-checking. In general, you want to perform as much of the type checking as possible while compiling the query, since run-time errors are expensive and difficult to recover from. For example, the XQuery static type-checking finds a mismatch between an XPath expression and its target XML schema-based data. The element CostCenter of purchaseorder in the oe schema is misspelt here as costcenter (remember XQuery and XPath are case-sensitive):

SELECT xtab.poref, xtab.usr, xtab.requestor
FROM purchaseorder,
     XMLTable('for $i in /PurchaseOrder
               where $i/costcenter eq "A10"
               return $i'
              PASSING OBJECT_VALUE
              COLUMNS poref     VARCHAR2(20) PATH 'Reference',
                      usr       VARCHAR2(20) PATH 'User' DEFAULT 'Unknown',
                      requestor VARCHAR2(20) PATH 'Requestor') xtab;

FROM purchaseorder,
*
ERROR at line 2:
ORA-19276: XP0005 - XPath step specifies an invalid element/attribute name: (costcenter)
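For contrast, a processor with no schema knowledge has no way to flag such a misspelling: the path simply selects nothing. The sketch below shows this with the XPath 1.0 engine bundled with the JDK, which is not Oracle's engine. The class name CaseSensitivitySketch and the one-line purchase order are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class CaseSensitivitySketch {
    // Hypothetical, heavily simplified purchase order.
    static final String PO =
        "<PurchaseOrder><Reference>R-1</Reference><CostCenter>A10</CostCenter></PurchaseOrder>";

    // Counts the nodes selected by the given path; no schema is consulted.
    static int count(String path) {
        try {
            XPath xp = XPathFactory.newInstance().newXPath();
            Double n = (Double) xp.evaluate("count(" + path + ")",
                    new InputSource(new StringReader(PO)), XPathConstants.NUMBER);
            return n.intValue();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("/PurchaseOrder/CostCenter")); // correct spelling
        System.out.println(count("/PurchaseOrder/costcenter")); // misspelt: silently selects nothing
    }
}
```

With schema-based data, the static check turns this silent empty result into the compile-time error shown above.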

Namespaces

You can use the XQuery declare namespace declaration in the prolog of an XQuery expression to define a namespace prefix. You can also use declare default namespace to establish the namespace as the default namespace for the expression. However, an XQuery expression’s namespace declaration has no effect outside the expression. To declare a namespace prefix for use outside of the XQuery expression, as in an XMLTable expression, you can use the XMLNAMESPACES clause. This clause also covers the XQuery expression argument to XMLTable, eliminating the need for a separate declaration in the XQuery prolog.

Suppose we have foldered into the Oracle XML DB Repository the following XML document, under /public/empsns.xml:

We can invoke a namespace-based XQuery which, say, returns the ename and empno attributes of emp as XMLTable columns, as below:

SELECT *
FROM XMLTable(XMLNAMESPACES('http://emp.com' AS "e"),
              'for $i in doc("/public/empsns.xml")
               return $i/e:emps/e:emp'
              COLUMNS name VARCHAR2(6) PATH '@ename',
                      id   NUMBER      PATH '@empno');


This should produce:

NAME           ID
------ ----------
John            1
Jack            2
Jill            3

3 rows selected.
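The same two ideas, a prefix bound outside the path expression and a row expression whose result feeds one path per column, can be sketched with the JDK's XPath API. This is an illustration only, not how XMLTable is implemented. The class name NamespaceRowsSketch and the sample document are assumptions; the document is a stand-in built from the element names, attribute names and result rows shown above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class NamespaceRowsSketch {
    // Hypothetical stand-in for /public/empsns.xml.
    static final String EMPS =
        "<emps xmlns=\"http://emp.com\">"
      + "<emp empno=\"1\" ename=\"John\"/>"
      + "<emp empno=\"2\" ename=\"Jack\"/>"
      + "<emp empno=\"3\" ename=\"Jill\"/>"
      + "</emps>";

    static List<String> rows() {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // required for prefixed steps to match
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(EMPS)));

            XPath xp = XPathFactory.newInstance().newXPath();
            // Plays the role of XMLNAMESPACES('http://emp.com' AS "e").
            xp.setNamespaceContext(new NamespaceContext() {
                @Override public String getNamespaceURI(String prefix) {
                    return "e".equals(prefix) ? "http://emp.com" : XMLConstants.NULL_NS_URI;
                }
                @Override public String getPrefix(String uri) {
                    return "http://emp.com".equals(uri) ? "e" : null;
                }
                @Override public Iterator<String> getPrefixes(String uri) {
                    String p = getPrefix(uri);
                    List<String> all = (p == null)
                            ? Collections.<String>emptyList()
                            : Collections.singletonList(p);
                    return all.iterator();
                }
            });

            // Row expression: one node per emp element.
            NodeList emps = (NodeList) xp.evaluate("/e:emps/e:emp", doc, XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < emps.getLength(); i++) {
                // Column paths, evaluated relative to the current row node.
                String name = xp.evaluate("@ename", emps.item(i));
                String id = xp.evaluate("@empno", emps.item(i));
                out.add(name + " " + id);
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        rows().forEach(System.out::println);
    }
}
```

Note that the prefix e is chosen by the query, not by the document: the sample document uses a default namespace, and the match is made on the namespace URI.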

PERFORMANCE AND SCALABILITY: CO-PROCESSOR VS. NATIVE COMPILATION

The XQuery engine in the database is not a bolted-on XQuery co-processor, but natively compiled into the database kernel.

Supporting XQuery imposes new challenges on the Oracle database engine. A straightforward approach, referred to as the co-processor approach, is to simply embed an off-the-shelf XQuery processor and treat the XQuery-related functions as a black box, sending the queries over to the embedded processor and getting the results. Although this approach is conceptually clean and easy to implement, it does not leverage the full potential of Oracle as a query-optimization and execution engine and suffers from intrinsic performance limitations. The major reasons for these limitations are described below.

Storage Optimization

If the XQuery processor is completely opaque to the rest of the database execution engine, it may not be able to take advantage of the physical storage of the XML input data. In many cases, the XML data passed as input to the XQuery processor may have to be constructed by the SQL processor at run time, even though the underlying XML data may already be stored in tables or defined as an XMLType view over relational data.

Intra-Query Optimization

Having a separate processor distinct from the SQL engine may prevent the use of standard Oracle query optimization functionality such as constant folding, view merging, sub-query optimizations, distributed query processing, parallel query, partition pruning and common sub-expression elimination within the XQuery functions. There is little point in re-implementing such functionality as part of a separate XQuery processor.

Inter-Query Optimization

Furthermore, even if the embedded XQuery processor is able to optimize the single XQuery passed to it, it may not be able to optimize the XQuery in the global context of, say, an original SQL statement that invokes the XMLQuery function (as one of several clauses). A SQL statement can invoke multiple XMLQuery() functions, and the output of one XMLQuery() can become the input of another XMLQuery() in the same SQL statement. This occurs naturally in the presence of views in an Oracle database, where one view may use XMLQuery() to query the result of another view, which in turn uses XMLQuery() to query other XMLType tables, views or columns. In the case of the XMLTable construct, the output of the XMLQuery() computing the row XML value is passed as input to multiple invocations of XMLQuery() that compute each column value.

Beyond Co-Processor: Oracle's Native XQuery Implementation

Oracle compiles XQueries into low-level database execution structures and query sub-blocks.

The Oracle Database 10g Release 2 native XQuery implementation goes deeper than the co-processor approach by directly generating low-level database query execution structures and query sub-blocks, which are then amenable to optimization by the underlying database optimizer and efficiently executable by the Oracle execution engine. This approach enables us to tightly integrate XQuery and SQL/XML support within the database kernel and delivers performance that is orders of magnitude faster than the co-processor approach. It also enables us to use standard indexes that are present on the underlying data and to apply relational performance optimization techniques, such as parallel query and partitioning, to XML queries.

As SQL is a compiled language, it makes sense to do the static type analysis of the XQuery for each XMLQuery() and XMLTable invocation during SQL compilation. As a result, XQueries can then be compiled into a set of sub-query blocks and operators that can be algebraically optimized in the context of the global SQL statement. This approach works well within the view-expansion-and-merge techniques in traditional Oracle databases, as it enables the pushdown of predicates and optimizes the result by eliminating unnecessary intermediate materializations of XML values.

There are a few XQuery expressions that cannot be rewritten to sub-query blocks and operators. Oracle evaluates these XQuery expressions using an XQuery interpreter, or evaluation engine (which itself has been compiled into the database). The treatment of all XQuery expressions, whether natively compiled or interpreted, is transparent to you: you will never need to change your code in any way to take advantage of available XQuery optimizations.

The architecture diagram below shows Oracle’s XQuery engines. Let us first examine the database XQuery engine (the Mid-Tier XQuery Engine is described in a section below). XQueries sent to the database engine come in to the SQL compiler. The compiler picks out the XMLQuery and XMLTable constructs, and sends them to the native XQuery processing modules in the SQL engine, which perform parsing, compilation, type checking and normalization. This process results in the creation of internal query-time structures which are fed to the optimizer and run-time engine, where they are combined with any other SQL query-time structures from the original query. This is represented in Figure 5.


Figure 5: XQuery Database Architecture

AREAS OF LIMITED SUPPORT

As noted earlier, Oracle Database 10g Release 2 supports the April 2005 draft definition of the XQuery language. Because the language definition is an ongoing process, some areas of the language that are not yet firmly established are as yet unsupported in Oracle Database 10g Release 2, or supported in a limited manner. In this release of the database, we have refrained from supporting those aspects of XQuery that are not firm, in order to limit the impact of change for developers. Oracle participates vigorously in the definition of the XQuery language, is committed to full XQuery support, and will continue to remain at the forefront of XQuery development. Developers should be aware of three areas of limited support.

Implementation-specific choices

The XQuery specification states that each of the following aspects of language processing is to be defined by the implementation.
• Implicit time zone support: In Oracle XML DB, the implicit time zone is always assumed to be Z, and timestamps with missing time zones are automatically converted to UTC.
• Invalid XPath expressions: Whenever it can determine at query compile time that an XPath expression is invalid, Oracle XML DB raises an error; it does not return the empty sequence.
• Ordering mode: In Oracle XML DB, the default ordering mode is unordered. You can use an ordering mode declaration in the prolog of an XQuery expression to set the ordering mode for the expression.

Implementation Departures from the XQuery Standard

• Collations: Oracle XML DB uses the SQL collation rules, based on the current settings for parameters NLS_LANG, NLS_COMP, and NLS_SORT. You can add a default collation clause to a query to force the standard XQuery collation rules.
• Boundary condition differences: Where the SQL behavior differs from the standard XQuery behavior, Oracle XML DB generally uses the SQL behavior for XQuery. Examples include boundary cases in doc, collection, contains, mod, empty string and NULL. Please refer to the documentation for details.
We expect to reconcile these implementation departures as the XQuery standard approaches greater firmness in these aspects.

Support for XQuery Functions and Operators

Oracle supports all of the XQuery functions and operators included in the April 2005 XQuery 1.0 and XPath 2.0 Functions and Operators specification, with the following exceptions. There is no support as yet for the following:
• XQuery regular-expression (regex) functions. Use the Oracle extensions for regex operations instead.
• Implicit time zones, when using functions that involve durations, dates, or times. (See "Implementation-specific choices" above.)
• Values of type xs:IDREF or xs:IDREFS, in string functions that involve nodes.
Please refer to the Oracle Database 10g Release 2 documentation for more information.

BENEFITS OF XQUERY

As we have mentioned earlier, there are three major advantages to using XQuery for querying XML data:

Conciseness and Simplicity

Some of the brevity and conciseness exhibited by XQuery comes from the ability to use XPath expressions, in particular the // construct. The // construct allows navigation to a node’s children, its children’s children, and so on to all its descendants. For example, let us consider the case of an automobile manufacturer who gets vehicle defect reports (collected from accident scenes, mandated tests etc.) from various jurisdictions (state, local government, police etc.), all marked up as XML. Some jurisdictions report the vehicle make at the top as a standalone node, some in the middle, i.e. there is no strict schema to the defect reports and the data is semi-structured. In order to find all the documents which have a VehicleMake node somewhere in them and return the Year for that node, you can issue a simple path expression like:

doc('defect_reports.xml')//VehicleMake//Year
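A small sketch makes the descendant navigation concrete. It uses the XPath engine bundled with the JDK rather than Oracle's XQuery engine, and the class name DescendantAxisSketch and the three sample reports are assumptions invented for illustration: VehicleMake sits at a different depth in each report, and one report has a Year that is not under any VehicleMake.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class DescendantAxisSketch {
    // Hypothetical semi-structured defect reports.
    static final String REPORTS =
        "<reports>"
      + "<report><VehicleMake name=\"A\"><Year>2003</Year></VehicleMake></report>"
      + "<report><scene><details><VehicleMake name=\"B\">"
      + "<model><Year>2001</Year></model></VehicleMake></details></scene></report>"
      + "<report><Year>1999</Year></report>"
      + "</reports>";

    // Returns the text of every Year found anywhere below any VehicleMake.
    static List<String> years() {
        try {
            XPath xp = XPathFactory.newInstance().newXPath();
            NodeList hits = (NodeList) xp.evaluate("//VehicleMake//Year",
                    new InputSource(new StringReader(REPORTS)), XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                out.add(hits.item(i).getTextContent());
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(years());
    }
}
```

The Year elements under the two VehicleMake nodes are selected regardless of nesting depth, while the 1999 entry is skipped because it has no VehicleMake ancestor.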

It would be far more tedious in relational SQL to model such a variable, semi-structured schema and create a normalized data model capable of addressing the above information need. Consider the following XQuery:

select xmlquery('for $i in /project, $j in $i//paragraph
                 where exists($i/actions/action-desc)
                   and contains($j, "Oracle")
                 return {count($i/actions/action-desc)} {$j}'
                passing body returning content)
from projects_xml

The equivalent SQL is not as concise:

Select (select xmlagg(
                 xmlelement("OracleProject",
                   Xmlattributes("$i".summary as "summary"),
                   Xmlelement("NumActions",
                     (select count(*)
                      from table(xmlsequence(
                             extract("$i".Body, '/project/actions/action-desc')))
                     )
                   ),
                   value("$j")
                 )
               )
        from table(xmlsequence(extract("$i".Body, '//paragraph'))) "$j"
        where existsnode(value("$j"), '/paragraph[contains(text(),"Oracle")]') = 1
          and existsnode(value("$i"), '/project/actions/action-desc') = 1
       )

In general, if your data model is XML, XQuery will likely be a more concise and natural syntax for expressing queries; if your data model is relational, or the sources of your data are relational, then SQL will be a cleaner syntax.

Heterogeneous Queries

As we have seen before, you may want to query heterogeneously across databases and filesystems. We can query across relational data (accessed via ora:view) and file data (accessed by httpuritype) in the same XQuery:

select xmlquery(
         'for $i in ora:view("COSTCENTERS")/ROW,
              $j in $ext/doc1/body/project
          where $i/COSTCENTERNO eq $j/@costcenterno
          return $j'
         passing xmlparse(document
                   httpuritype('http://localhost:80/public/doc1.xml').getclob())
                 as "ext"
         returning content)
from dual;

The xmlparse can be avoided by storing the file-data in the XML DB Repository and building an index. If you cannot move the data into Oracle, then you could crawl it and build a full-text index with XML section searching, as noted before.

Leveraging the XML Data Model

XQuery uses the XML Schema data model. This means that XQuery seamlessly uses the notions of elements, attributes, composition, namespaces and so on in XML Schema. In most cases, we can map the SQL object-relational constructs to XML, but clearly the ‘fit’ is better for XQuery.

XML Construction

XQuery is also an attractive way to construct XML. We illustrate this in the example below, which also illustrates the use of nested FLWOR expressions. The query below accesses the relational table warehouses, which is in sample database schema oe, and the relational table locations, which is in sample database schema hr. To run this example as user oe, you must first connect as user hr and grant the SELECT permission to user oe:

CONNECT HR/HR
GRANT SELECT ON LOCATIONS TO OE;
CONNECT OE/OE

SELECT XMLQuery(
         'for $i in ora:view("OE", "WAREHOUSES")/ROW
          return {for $j in ora:view("HR", "LOCATIONS")/ROW
                  where $j/LOCATION_ID eq $i/LOCATION_ID
                  return ($j/STREET_ADDRESS, $j/CITY, $j/STATE_PROVINCE)} '
         RETURNING CONTENT)
FROM DUAL;

This returns:

XMLQUERY('FOR$IINORA:VIEW("OE","WAREHOUSES")/ROWRETURN
--------------------------------------------------------------------------------
2014 Jabberwocky Rd Southlake Texas
2011 Interiors Blvd South San Francisco California
1298 Vileparle (E) Bombay Maharashtra

1 row selected.

We see that XQuery can be used to cleanly specify how XML should be constructed, in this case out of relational tables.
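To appreciate what the nested FLWOR expression saves, here is roughly what the same construction looks like when written by hand. This is a sketch under stated assumptions, not Oracle code: the class name ConstructionSketch, the element names Warehouses and Warehouse, the id attribute and the row ids are all hypothetical; only the column names and the address values are taken from the example above. It uses the DOM API that ships with the JDK.

```java
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class ConstructionSketch {
    // Hypothetical rows: {WAREHOUSE_ID, LOCATION_ID}.
    static final List<String[]> WAREHOUSES = Arrays.asList(
            new String[] {"1", "1400"},
            new String[] {"2", "1500"});
    // Hypothetical rows: {LOCATION_ID, STREET_ADDRESS, CITY, STATE_PROVINCE}.
    static final List<String[]> LOCATIONS = Arrays.asList(
            new String[] {"1400", "2014 Jabberwocky Rd", "Southlake", "Texas"},
            new String[] {"1500", "2011 Interiors Blvd", "South San Francisco", "California"});
    static final String[] LOCATION_COLUMNS = {"STREET_ADDRESS", "CITY", "STATE_PROVINCE"};

    static Document build() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("Warehouses");
            doc.appendChild(root);
            // Outer loop: for $i in the warehouse rows.
            for (String[] w : WAREHOUSES) {
                Element wh = doc.createElement("Warehouse");
                wh.setAttribute("id", w[0]);
                // Inner loop: for $j in the location rows, where the ids match.
                for (String[] l : LOCATIONS) {
                    if (!l[0].equals(w[1])) {
                        continue;
                    }
                    // Inner return: one child element per selected column.
                    for (int c = 0; c < LOCATION_COLUMNS.length; c++) {
                        Element col = doc.createElement(LOCATION_COLUMNS[c]);
                        col.setTextContent(l[c + 1]);
                        wh.appendChild(col);
                    }
                }
                root.appendChild(wh);
            }
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = build();
        System.out.println(doc.getElementsByTagName("Warehouse").getLength() + " warehouses");
        System.out.println(doc.getElementsByTagName("CITY").item(0).getTextContent());
    }
}
```

The explicit node creation and the nested-loop join are exactly the parts that the XQuery element constructor and the where clause express declaratively.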

XQuery vs. XSLT

XML is also often constructed using XSLT processors. Constructing a new document based on a source document may be thought of either as a transformation from the source to a result (XSLT), or as building a new document based on a query on the source (XQuery). Oracle provides a native XSLT processor as part of Oracle XML DB. XQuery will be more efficient for XML construction where optimizations can be done dynamically, i.e. based on an understanding of the specific data one is constructing: what the type is, how the data flows through expressions, where the data comes from, and where it is going. An example might be path expressions which have to be sorted; dataflow analysis is easier to do in XQuery (using expressions) than in XSLT (using templates).

THE DATABASE XQUERY ENVIRONMENT

The database XQuery engine is integrated with the rest of the database environment.

In addition to the database execution runtime, XQuery support in Oracle Database 10g Release 2 is well integrated with the rest of the Oracle environment. We look at some of the important areas that surround the XQuery processing engine: SQL*Plus and the XML Developer’s Kit (XDK).

Top Level Query Support through SQL*Plus

A SQL*Plus command, XQUERY, is also provided, which lets you enter XQuery expressions directly into SQL*Plus. In effect, this command turns SQL*Plus into an XQuery command-line interpreter. Top-level query support means that it is possible to go virtually straight into XQuery execution without ‘going through’ SQL.

SQLPLUS > xquery for $p in doc("/public/doc1.xml")/doc1/body/project,
                     $c in doc("/public/costcenters.xml")//costcenter
                 where $p/@costcenterno = $c/@costcenterno
                   and $p/@budget > 4000
                 order by $p/@projectno
                 return '

XDK

The XML Developer’s Kit available with Oracle Database 10g Release 2 includes support for the XPath 2.0 and XQuery 1.0 Functions and Operators in Java. This enables programmatic manipulation of XML data using these standards.

XQUERY IN THE MID-TIER

For the first time, you can perform ad-hoc queries in pure Java in the mid-tier.

In addition to the database XQuery implementation, Oracle also provides a Java mid-tier-based XQuery implementation that will be available for download from the Oracle Technology Network, and also with Oracle Application Server 10g. While the Java XQuery engine is not part of Oracle Database 10g Release 2, we describe it here for the sake of completeness.

XQJ

The XQuery API for Java (XQJ) provides Java bindings for XQuery. Enterprise mid-tier-based applications built on the J2SE and J2EE platforms often require invocation of and integration with functionality that is not part of XQuery, such as DOM, SAX, StAX, Web Services and so on. XQJ is currently under development as JSR 225 under the Java Community Process. XQJ is intended to let XQuery programmers have access to the J2SE and J2EE platforms. XQJ allows a Java program to connect to XML data sources, prepare and issue XQueries, and process the results as XML. What JDBC is to SQL, XQJ is to XQuery.

A series of XQJ invocations might look like the Java code fragment below. The XQuery expression contains the document function to access the data source. You can also bind a statement to specific data sources at run time.

import oracle.xquery.*;

XQueryContext ctx = new XQueryContext();

XQueryPreparedStatement xq = ctx.prepareStatement(
    "for $i in document('emp.xml')/empset "
  + "let $j := 80000 "
  + "where $i/@salary > $j "
  + "return $i/@ename");

XQueryResultSet rset = xq.executeQuery();

while (rset.next())
    rset.getNode().print(System.out);

Query Pushdown

In cases where the data being queried comes from an Oracle database source, the mid-tier Java XQuery engine intelligently pushes down the query to the database XQuery engine, enabling the mid-tier to scale better while processing XQueries. The architecture diagram in Figure 6 summarizes Oracle’s support of XQuery in the mid-tier. The Java run time for XQuery is wrapped by the XQJ driver. The XQuery Java engine is also integrated with database drivers, such as the Oracle JDBC drivers, through which query pushdown can be performed where necessary. The XQuery Java engine can also be used in conjunction with various connectivity mechanisms that materialize XML.

Using XQuery in the Mid-Tier

Apart from using XQJ as a container instead of, say, SQL*Plus, invoking XQuery in the Mid-Tier is no different from invoking it in the database. For example, you can save a query to a file, read it in as a stream to XQJ, and run the query.

Figure 6: XQuery Mid-Tier Architecture

Let’s say the file sample.xquery contains:

We can read in this query as a stream and pass it to XQJ:

import java.io.FileReader;
import java.io.Reader;
import oracle.xquery.*;

XQueryContext ctx = new XQueryContext();
Reader strm = new FileReader("sample.xquery");

XQueryPreparedStatement xq = ctx.prepareStatement(strm);

XQueryResultSet rset = xq.executeQuery();

while (rset.next())
    rset.getNode().print(System.out);

Querying an RSS Feed

In a similar vein, you could also query an XML RSS feed. (Really Simple Syndication, or RSS, is a format for syndicating news and the content of news-like sites such as web logs, and also for publishing web-site content with richer metadata than HTML.) Any XML source that you reach through a URL, including dynamic pages generated by applications, can be queried in this manner.

import oracle.xquery.*;

XQueryPreparedStatement xq = ctx.prepareStatement(
    "for $i in $k//item where contains ($i/title ,\"Film\") return $i' "
  + "passing xmlparse(document "
  + "httpuritype('http://www.rediff.com/rss/inrss.xml').getclob()) "
  + "as \"k\" returning content");

XQueryResultSet rset = xq.executeQuery();

while (rset.next())
    rset.getNode().print(System.out);

This returns:

Filmmaker Ismail Merchant dies
http://www.rediff.com/rss/redirect.php?url=http://www.rediff.com/movies/2005/may/25ismail.htm
The filmmaker was 68.
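The filter itself does not depend on any Oracle API. The sketch below applies the same title test with the XPath engine bundled with the JDK, against an inline feed so that it runs without network access. The class name RssFilterSketch, the feed layout and the second item are assumptions made for illustration; only the first item's title and description come from the result above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class RssFilterSketch {
    // Hypothetical inline feed standing in for the remote RSS document.
    static final String FEED =
        "<rss version=\"2.0\"><channel><title>Hypothetical feed</title>"
      + "<item><title>Filmmaker Ismail Merchant dies</title>"
      + "<description>The filmmaker was 68.</description></item>"
      + "<item><title>Markets close higher</title>"
      + "<description>Hypothetical second item.</description></item>"
      + "</channel></rss>";

    // Returns the titles of all items whose title contains the keyword (case-sensitive).
    // The keyword is spliced into the path, so it must not contain a single quote.
    static List<String> titles(String keyword) {
        try {
            XPath xp = XPathFactory.newInstance().newXPath();
            NodeList hits = (NodeList) xp.evaluate(
                    "//item[contains(title, '" + keyword + "')]/title",
                    new InputSource(new StringReader(FEED)), XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                out.add(hits.item(i).getTextContent());
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(titles("Film"));
    }
}
```

To query a live feed, the inline string would be replaced by a stream opened on the feed URL, which is the part that httpuritype and xmlparse handle in the database.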

Note that exactly the same XQuery would work for the database XQuery engine as well. If your database is behind a firewall, you may need to set the proxy server for the httpuritype call:

begin
  utl_http.set_proxy(proxy => 'proxy.mydomain.com', no_proxy_domains => 'mydomain.com');
end;
/

For advanced mid-tier-based connectivity to different types of data sources, Oracle XML Query Services provides additional services and connectivity around the mid-tier XQuery engine and XQJ. This is discussed below.

Oracle XML Query Services

Oracle XML Query Services (earlier called Oracle XML Data Synthesis, or XDS; see footnote 5) provides the J2EE® Connector Architecture and related connectivity for the XQuery mid-tier engine, as a generic service that takes care of information source access and information aggregation. Oracle XML Query Services (XQS) provides an easy-to-use declarative framework to plug in various sources of data and dynamically query across them. Oracle XQS is a middle-tier J2EE-based solution that employs standard APIs and technologies, including Java Database Connectivity (JDBC), J2EE Connector Architecture (J2CA), and Web services, for data access, and the mid-tier XQuery engine for querying.

Oracle XML Query Services is currently available as a download from http://otn.oracle.com, and is also part of Oracle Application Server. It allows you to specify data sources in a declarative manner, for example in a configuration file. When querying non-XML data sources, you can also declaratively plug in translation functions for XML conversion. The underlying Oracle XQS infrastructure is responsible for invoking the appropriate application programming interfaces (APIs) to get the information, processing it according to the user-specified XQuery, and returning the consolidated result. Because the XML is synthesized live from the data sources and queried directly, it always reflects current, real-time information.

5 See http://www.oracle.com/technology/oramag/oracle/05-mar/o25xml.html for some examples of Oracle XDS.
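To make the declarative plug-in idea concrete, here is a minimal sketch using only the JDK. It is not the XQS API, which this paper does not show: every name below is hypothetical, the source registry is a plain map instead of a configuration file, the translation function converts two-column CSV to XML, and the query is XPath 1.0 via javax.xml.xpath, not XQuery. Field values are assumed to contain no XML markup characters, since the sketch does no escaping.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class DeclarativeSourcesDemo {

    /** Translation function for a non-XML source: "name,price" lines become <row> elements. */
    static String csvToXml(String csv) {
        StringBuilder sb = new StringBuilder("<rows>");
        for (String line : csv.split("\n")) {
            String[] f = line.split(",");
            sb.append("<row><name>").append(f[0]).append("</name><price>")
              .append(f[1]).append("</price></row>");
        }
        return sb.append("</rows>").toString();
    }

    /** Fetches every registered source at call time and wraps each in <source name='...'>. */
    static String synthesize(Map<String, Supplier<String>> sources) {
        StringBuilder sb = new StringBuilder("<sources>");
        sources.forEach((name, fetch) -> sb.append("<source name='").append(name)
                .append("'>").append(fetch.get()).append("</source>"));
        return sb.append("</sources>").toString();
    }

    /** Runs one XPath query across the freshly synthesized document; returns its string value. */
    static String query(Map<String, Supplier<String>> sources, String xpath) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(synthesize(sources))));
        return XPathFactory.newInstance().newXPath().evaluate(xpath, doc);
    }

    public static void main(String[] args) throws Exception {
        // The "declaration": source names mapped to how their data is obtained (made-up data).
        Map<String, Supplier<String>> sources = new LinkedHashMap<>();
        sources.put("catalog", () -> "<rows><row><name>pen</name><price>2</price></row></rows>");
        sources.put("priceFeed", () -> csvToXml("ink,5\npaper,3"));
        System.out.println(query(sources, "//source[@name='priceFeed']//row[price > 4]/name"));
    }
}
```

Because each supplier is invoked only when a query runs, the synthesized XML reflects the state of the sources at query time, which mirrors the "always current" property described above.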

CONCLUSION

Oracle Database 10g Release 2 is Oracle’s largest, most innovative release ever, and one of its key innovations is support for XQuery.

Oracle Database 10g Release 2 introduces a native XQuery processing engine built into the database kernel and well integrated with the database environment. Oracle generally evaluates XQuery expressions by compiling them into the same underlying structures as SQL queries. Queries are optimized using both relational-database and XQuery-specific optimization technologies.

With Oracle’s XQuery support, integrated with SQL in the database and also available standalone as a Java engine for the middle tier, you can:

• Search with any language - XQuery/XPath/SQL/XSL/Full-Text
• Search anywhere - mid-tier or back end
• Search anything - files, relational content, messages, web services, etc.
• ... and search everything - structured tables, semi-structured XML, unstructured documents
• Search any size - a native, scalable implementation
• Search any time - an unbreakable solution, built on Oracle’s robust enterprise platform.


XML Query (XQuery) Support in Oracle Database 10g Release 2
May 2005
Author: Sandeepan Banerjee
Contributing Authors: Muralidhar Krishnaprasad, Steve Buxton, Vikas Arora, Zhen Hua Liu, Drew Adams, Rohan Angrish, Geoff Lee, Mark Drake, Susan Kotsovolos, Vishu Krishnamurthy, Eric Sedlar, Nipun Agarwal, Ravi Murthy, Daniela Florescu, Jim Melton, Nirav Chanchani

Oracle Corporation World Headquarters 500 Oracle Parkway Redwood Shores, CA 94065 U.S.A.

Worldwide Inquiries: Phone: +1.650.506.7000 Fax: +1.650.506.7200 www.oracle.com

Copyright © 2005, Oracle. All rights reserved. This document is provided for information purposes only and the contents hereof are subject to change without notice. This document is not warranted to be error-free, nor subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or fitness for a particular purpose. We specifically disclaim any liability with respect to this document and no contractual obligations are formed either directly or indirectly by this document. This document may not be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without our prior written permission. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners.