XPath Processing in a Nutshell∗

Georg Gottlob, Christoph Koch, and Reinhard Pichler

Database and Artificial Intelligence Group, Technische Universität Wien, A-1040 Vienna, Austria
{gottlob, koch}@dbai.tuwien.ac.at, [email protected]

Abstract

We provide a concise yet complete formal definition of the semantics of XPath 1 and summarize efficient algorithms for processing queries in this language. Our presentation is intended both for the reader who is looking for a short but comprehensive formal account of XPath and for the software developer in need of material that facilitates the rapid implementation of XPath engines.

1 Introduction

XPath [10] is a practical language for selecting nodes from XML document trees. Its importance stems from its suitability as an XML query language per se and from its being at the core of several other XML-related technologies, such as XSLT, XPointer, and XQuery.

Previous semantics definitions of XPath have either been overly lengthy [10, 12] or were restricted to fragments of the language (e.g. [6, 7]). In this paper, we provide a formal semantics definition of full XPath, intended for the implementor or the reader who values a concise, formal, and functional description. As an extension to [3], we also provide a brief but complete definition of the XPath data model.

Given a query language, the prototypical algorithmic problem is query evaluation, and its hardness needs to be formally studied. The query evaluation problem was not addressed for full XPath in any depth until recently [3, 2, 4].¹ We summarize the main ideas of that work in this paper, with an emphasis on usefulness to a wide audience.

Since XPath and related technologies will be tested in ever-growing deployment scenarios, implementations of XPath engines need to scale well both with respect to the size of the XML data and with respect to the growing size and intricacy of the queries. (Database theoreticians refer to this as combined complexity.)

We present main-memory XPath processing algorithms that run in time guaranteed to be polynomial in the size of the complete input, i.e. query and data. As experimentally verified in [3], where these and other results were first presented, contemporary XPath engines do not have this property, and require time exponential in the size of queries in the worst case.

We present polynomial-time XPath evaluation algorithms in two flavors, bottom-up and top-down.² Both algorithms have the same worst-case polynomial-time bounds. The former approach comes with a clear and simple intuition for why our algorithms are polynomial, while the latter computes considerably fewer useless intermediate results and consequently performs much better in practice.

The structure of this paper is as follows. Section 2 introduces XPath axes. Section 3 presents the data model of XPath and auxiliary functions used throughout the paper. Section 4 defines the semantics of XPath in a concise way. Section 5 contains the bottom-up semantics definition and algorithm. Section 6 presents the modifications needed to obtain a top-down algorithm. We conclude with Section 7.

∗ This work was supported by the Austrian Science Fund (FWF) under project No. Z29-INF. All methods and algorithms presented in this paper are covered by a pending patent. Further resources, updates, and possible corrections are available at http://www.xmltaskforce.com.

2 XPath Axes

In XPath, an XML document is viewed as an unranked (i.e., nodes may have a variable number of children), ordered, and labeled tree. Before we make the data model used by XPath precise (which distinguishes between several types of tree nodes) in Section 3, we introduce the main mode of navigation in document trees employed by XPath, namely axes, in the abstract, ignoring types. We will point out how to deal with different node types in Section 3. All of the artifacts of this and the next section are defined in the context of a given XML document. Given a document tree, let $dom$ be the set of its nodes, and let us use the two functions
$$\mathit{firstchild},\ \mathit{nextsibling} : dom \to dom$$

to represent its structure.³ "firstchild" returns the first child of a node (if there are any children, i.e., the node is not a leaf), and otherwise "null". Let $n_1, \ldots, n_k$ be the children of some node in document order. Then, $\mathit{nextsibling}(n_i) = n_{i+1}$, i.e., "nextsibling" returns the neighboring node to the right, if it exists, and "null" otherwise (if $i = k$). We define the functions $\mathit{firstchild}^{-1}$ and $\mathit{nextsibling}^{-1}$ as the inverses of the former two functions, where "null" is returned if no inverse exists for a given node. Where appropriate, we will use binary relations of the same name instead of the functions. ($\{\langle x, f(x)\rangle \mid x \in dom,\ f(x) \neq \text{null}\}$ is the binary relation for function $f$.)

¹ The novelty of our results is a bit surprising, as considerable effort has been made in the past to understand the XPath query containment problem (e.g. [1, 5, 8]), which, with its relevance to query optimization, is only a means to the end of efficient query evaluation.

² By this we refer to the mode of traversal of the expression tree of the query while computing the query result.

³ Actually, "firstchild" and "nextsibling" are part of the XML Document Object Model (DOM).

The axes self, child, parent, descendant, ancestor, descendant-or-self, ancestor-or-self, following, preceding, following-sibling, and preceding-sibling are binary relations $\chi \subseteq dom \times dom$. Let $\mathit{self} := \{\langle x, x\rangle \mid x \in dom\}$. The other axes are defined in terms of our "primitive" relations "firstchild" and "nextsibling" as shown in Table 1 (cf. [10]). $R_1.R_2$, $R_1 \cup R_2$, and $R_1^{*}$ denote the concatenation, union, and reflexive and transitive closure, respectively, of binary relations $R_1$ and $R_2$. Let $E(\chi)$ denote the regular expression defining $\chi$ in Table 1. It is important to observe that some axes are defined in terms of other axes, but that these definitions are acyclic.

$$
\begin{array}{lcl}
\mathit{child} & := & \mathit{firstchild}.\mathit{nextsibling}^{*} \\
\mathit{parent} & := & (\mathit{nextsibling}^{-1})^{*}.\mathit{firstchild}^{-1} \\
\mathit{descendant} & := & \mathit{firstchild}.(\mathit{firstchild} \cup \mathit{nextsibling})^{*} \\
\mathit{ancestor} & := & (\mathit{firstchild}^{-1} \cup \mathit{nextsibling}^{-1})^{*}.\mathit{firstchild}^{-1} \\
\text{descendant-or-self} & := & \mathit{descendant} \cup \mathit{self} \\
\text{ancestor-or-self} & := & \mathit{ancestor} \cup \mathit{self} \\
\mathit{following} & := & \text{ancestor-or-self}.\mathit{nextsibling}.\mathit{nextsibling}^{*}.\text{descendant-or-self} \\
\mathit{preceding} & := & \text{ancestor-or-self}.\mathit{nextsibling}^{-1}.(\mathit{nextsibling}^{-1})^{*}.\text{descendant-or-self} \\
\text{following-sibling} & := & \mathit{nextsibling}.\mathit{nextsibling}^{*} \\
\text{preceding-sibling} & := & (\mathit{nextsibling}^{-1})^{*}.\mathit{nextsibling}^{-1}
\end{array}
$$

Table 1: Axis definitions in terms of "primitive" tree relations "firstchild", "nextsibling", and their inverses.

Definition 2.1 (Axis Function) Let $\chi$ denote an XPath axis relation. We define the function $\chi : 2^{dom} \to 2^{dom}$ as $\chi(X_0) = \{x \mid \exists x_0 \in X_0 : x_0 \chi x\}$ (and thus overload the relation name $\chi$), where $X_0 \subseteq dom$ is a set of nodes.

Algorithm 2.2 (Axis Evaluation)
Input: A set of nodes $S$ and an axis $\chi$
Output: $\chi(S)$
Method: $eval_{\chi}(S)$

function $eval_{\mathit{self}}(S) := S$.
function $eval_{\chi}(S) := eval_{E(\chi)}(S)$.
function $eval_{e_1.e_2}(S) := eval_{e_2}(eval_{e_1}(S))$.
function $eval_{R}(S) := \{R(x) \mid x \in S\}$.
function $eval_{\chi_1 \cup \chi_2}(S) := eval_{\chi_1}(S) \cup eval_{\chi_2}(S)$.
function $eval_{(R_1 \cup \ldots \cup R_n)^{*}}(S)$
begin
  $S' := S$; /* $S'$ is represented as a list */
  while there is a next element $x$ in $S'$ do
    append $\{R_i(x) \mid 1 \le i \le n,\ R_i(x) \neq \text{null},\ R_i(x) \notin S'\}$ to $S'$;
  return $S'$;
end;

where $S \subseteq dom$ is a set of nodes of an XML document, $e_1$ and $e_2$ are regular expressions, $R, R_1, \ldots, R_n$ are primitive relations, $\chi_1$ and $\chi_2$ are axes, and $\chi$ is an axis other than "self".

Clearly, some axes could have been defined in a simpler way in Table 1 (e.g., ancestor equals $\mathit{parent}.\mathit{parent}^{*}$). However, the definitions, which use only a limited form of regular expressions, allow $\chi(S)$ to be computed in a very simple way, as evidenced by Algorithm 2.2.

The function $eval_{(R_1 \cup \ldots \cup R_n)^{*}}$ essentially computes graph reachability (not transitive closure). It can be implemented to run in linear time in terms of the data in a straightforward manner; (non)membership in $S'$ is checked in constant time using a direct-access version of $S'$ maintained in parallel to its list representation (naively, this could be an array of bits, one for each member of $dom$, telling which nodes are in $S'$).

Lemma 2.3 Let $S \subseteq dom$ be a set of nodes of an XML document and $\chi$ be an axis. Then, (1) $\chi(S) = eval_{\chi}(S)$ and (2) Algorithm 2.2 runs in time $O(|dom|)$.

3 Data Model

Let $dom$ be the set of nodes in the document tree as introduced in the previous section. Each node is of one of seven types, namely root, element, text, comment, attribute, namespace, and processing instruction. As in DOM [9], the root node of the document is the only one of type "root", and is the parent of the document element node of the XML document. The main type of non-terminal node is "element"; the other node types are self-explanatory (cf. [10]). Nodes of all types besides "text" and "comment" have a name associated with them.

A node test is an expression of the form $\tau()$ (where $\tau$ is a node type or the wildcard "node", matching any type) or $\tau(n)$ (where $n$ is a node name and $\tau$ is a type whose nodes have a name). $\tau(*)$ is equivalent to $\tau()$. We define a function $T$ which maps each node test to the subset of $dom$ that satisfies it. For instance, $T(\text{node}()) = dom$ and $T(\text{attribute}(\text{href}))$ returns all attribute nodes labeled "href".

Example 3.1 Consider a document consisting of six nodes: the root node $r$, which is the parent of the document element node $a$ (labeled "a"), and its four children $b_1, \ldots, b_4$ (labeled "b"). We have $dom = \{r, a, b_1, \ldots, b_4\}$, $\mathit{firstchild} = \{\langle r, a\rangle, \langle a, b_1\rangle\}$, $\mathit{nextsibling} = \{\langle b_1, b_2\rangle, \langle b_2, b_3\rangle, \langle b_3, b_4\rangle\}$, $T(\text{root}()) = \{r\}$, $T(\text{element}()) = \{a, b_1, \ldots, b_4\}$, $T(\text{element}(a)) = \{a\}$, and $T(\text{element}(b)) = \{b_1, \ldots, b_4\}$.

Now, XPath axes differ from the abstract, untyped axes of Section 2 in that there are special child axes "attribute" and "namespace" which filter out all resulting nodes that are not of type attribute or namespace, respectively. In turn, all other XPath axis functions remove nodes of these two types from their results. We can express this formally as

$$\text{attribute}(S) := \text{child}(S) \cap T(\text{attribute}())$$
$$\text{namespace}(S) := \text{child}(S) \cap T(\text{namespace}())$$

and for all other XPath axes $\chi$ (let $\chi_0$ be the abstract axis of the same name),

$$\chi(S) := \chi_0(S) - (T(\text{attribute}()) \cup T(\text{namespace}())).$$

Node tests that occur explicitly in XPath queries must not use the types "root", "attribute", or "namespace".⁴ In XPath, axis applications $\chi$ and node tests $t$ always come in location step expressions of the form $\chi{::}t$. The node test $n$ (where $n$ is a node name or the wildcard *) is a shortcut for $\tau(n)$, where $\tau$ is the principal node type of $\chi$. For the axis attribute, the principal node type is attribute, for namespace it is namespace, and for all other axes, it is element. For example, child::a is short for child::element(a) and child::* is short for child::element(*).

⁴ These node tests are also redundant with '/' and the "attribute" and "namespace" axes.

Note that for a set of nodes $S$ and a typed axis $\chi$, $\chi(S)$ can be computed in linear time, just as for the untyped axes of Section 2.

Let […] doc. Moreover, given a node $x$ and a set of nodes $S$ with $x \in S$, let $idx_{\chi}(x, S)$ be the index of $x$ in $S$ w.r.t. […] in that list.

The function $\mathit{strval} : dom \to \mathit{string}$ returns the string value of a node, for the precise definition of which we refer to [10]. Notably, the string value of an element or root node $x$ is the concatenation of the string values of the descendant text nodes $\mathit{descendant}(\{x\}) \cap T(\text{text}())$ visited in document order. The functions to_string and to_number convert a number to a string and a string to a number, respectively, according to the rules specified in [10].

This concludes our discussion of the XPath data model, which is complete except for some details related to namespaces. This topic is mostly orthogonal to our discussion, and extending our framework to also handle namespaces (without a penalty with respect to efficiency bounds) is an easy exercise.⁵

⁵ To be consistent, we also will not discuss the "local-name", "namespace-uri", and "name" core library functions [10]. Note that names used in node tests may be of the form NCName:*, which matches all names from a given namespace named NCName.

$P[[\chi{::}t[e_1] \cdots [e_m]]](x) :=$
begin
  $S := \{y \mid x \chi y,\ y \in T(t)\}$;
  for $1 \le i \le m$ (in ascending order) do
    $S := \{y \in S \mid [[e_i]](y, idx_{\chi}(y, S), |S|) = \text{true}\}$;
  return $S$;
end;

$$P[[\pi_1 | \pi_2]](x) := P[[\pi_1]](x) \cup P[[\pi_2]](x)$$
$$P[[/\pi]](x) := P[[\pi]](\text{root})$$
$$P[[\pi_1/\pi_2]](x) := \bigcup_{y \in P[[\pi_1]](x)} P[[\pi_2]](y)$$

Figure 1: Standard semantics of location paths.

4 Semantics of XPath

In this section, we present a concise definition of the semantics of XPath 1 [10]. We assume the syntax of this language to be known, and adhere to its unabbreviated form [10]. We use a normal form syntax of XPath, which is obtained by the following rewrite rules, applied initially: […] the input variable binding. […]}, $EqOp \in \{=, \neq\}$, and $GtOp \in \{\le, <, \ge, >\}$. By slight abuse of notation, we identify these arithmetic and relational operations with their symbols in the remainder of this paper. However, it should be clear whether we refer to the operation or its symbol at any point. By $\pi, \pi_1, \pi_2, \ldots$ we denote location paths.

Definition 4.1 (Semantics of XPath) Each XPath expression returns a value of one of the following four types: number, node set, string, and boolean (abbreviated num, nset, str, and bool, respectively). Let $T$ be an expression type and the semantics $[[e]] : C \to T$ of XPath expression $e$ be defined as follows.

$$[[\pi]](\langle x, k, n\rangle) := P[[\pi]](x)$$
$$[[\text{position}()]](\langle x, k, n\rangle) := k$$
$$[[\text{last}()]](\langle x, k, n\rangle) := n$$

For all other kinds of expressions $e = Op(e_1, \ldots, e_m)$ mapping a context $\vec{c}$ to a value of type $T$,

$$[[Op(e_1, \ldots, e_m)]](\vec{c}) := F[[Op]]([[e_1]](\vec{c}), \ldots, [[e_m]](\vec{c})),$$

where $F[[Op]] : T_1 \times \ldots \times T_m \to T$ is called the effective semantics function of $Op$. The function $P$ is defined in Figure 1 and the effective semantics function $F$ is defined in Table 2.

$$
\begin{array}{ll}
\textbf{Expr. } E : \textbf{Operator Signature} & \textbf{Semantics } F[[E]] \\
F[[\text{constant number } v : \to num]]() & v \\
F[[ArithOp : num \times num \to num]](v_1, v_2) & v_1\ ArithOp\ v_2 \\
F[[\text{count} : nset \to num]](S) & |S| \\
F[[\text{sum} : nset \to num]](S) & \sum_{n \in S} \text{to\_number}(\mathit{strval}(n)) \\
F[[\text{id} : nset \to nset]](S) & \bigcup_{n \in S} F[[\text{id}]](\mathit{strval}(n)) \\
F[[\text{id} : str \to nset]](s) & \text{deref\_ids}(s) \\
F[[\text{constant string } s : \to str]]() & s \\
F[[\text{and} : bool \times bool \to bool]](b_1, b_2) & b_1 \wedge b_2 \\
F[[\text{or} : bool \times bool \to bool]](b_1, b_2) & b_1 \vee b_2 \\
F[[\text{not} : bool \to bool]](b) & \neg b \\
F[[\text{true}() : \to bool]]() & \text{true} \\
F[[\text{false}() : \to bool]]() & \text{false} \\
F[[RelOp : nset \times nset \to bool]](S_1, S_2) & \exists n_1 \in S_1, n_2 \in S_2 : \mathit{strval}(n_1)\ RelOp\ \mathit{strval}(n_2) \\
F[[RelOp : nset \times num \to bool]](S, v) & \exists n \in S : \text{to\_number}(\mathit{strval}(n))\ RelOp\ v \\
F[[RelOp : nset \times str \to bool]](S, s) & \exists n \in S : \mathit{strval}(n)\ RelOp\ s \\
F[[RelOp : nset \times bool \to bool]](S, b) & F[[\text{boolean}]](S)\ RelOp\ b \\
F[[EqOp : bool \times (str \cup num \cup bool) \to bool]](b, x) & b\ EqOp\ F[[\text{boolean}]](x) \\
F[[EqOp : num \times (str \cup num) \to bool]](v, x) & v\ EqOp\ F[[\text{number}]](x) \\
F[[EqOp : str \times str \to bool]](s_1, s_2) & s_1\ EqOp\ s_2 \\
F[[GtOp : (str \cup num \cup bool) \times (str \cup num \cup bool) \to bool]](x_1, x_2) & F[[\text{number}]](x_1)\ GtOp\ F[[\text{number}]](x_2) \\
F[[\text{string} : num \to str]](v) & \text{to\_string}(v) \\
F[[\text{string} : nset \to str]](S) & \text{if } S = \emptyset \text{ then “” else } \mathit{strval}(\text{first} \ldots
\end{array}
$$

Table 2: Effective semantics functions. […]

To save space, we at times re-use function definitions in Table 2 to define others. However, our definitions are not circular and the indirections can be eliminated by a constant number of unfolding steps. Moreover, for lack of space, we define neither the number operations floor, ceiling, and round nor the string operations concat, starts-with, contains, substring-before, substring-after, substring (two versions), string-length, normalize-space, translate, and lang in Table 2, but it is very easy to obtain these definitions from the XPath 1 Recommendation [10].

The compatibility of our semantics definition (modulo the assumptions made in this paper to simplify the data model) with [10] can easily be verified by inspection of the latter document.

It is instructive to look at the definition of $P[[\pi_1/\pi_2]]$ in Figure 1 in more detail. It is easy to see that, if this semantics definition is followed rigorously to obtain an analogous functional implementation, query evaluation using this implementation requires time exponential in the size of the queries. All systems evaluated in [3] fell into this (or a very similar) trap.

5 Bottom-up Evaluation of XPath

In this section, we present a bottom-up semantics and algorithm for evaluating XPath queries in polynomial […]

$$
\begin{array}{ll}
[\ldots] & [\ldots]\ {}_{y \in Y}\ \langle y, k_2, n_2, z\rangle \in E_{\uparrow}[[\pi_2]]\} \\
\text{location path } \pi_1 \mid \pi_2 : nset \times nset \to nset & E_{\uparrow}[[\pi_1]] \cup E_{\uparrow}[[\pi_2]] \\
\text{position}() : \to num & \{\langle x, k, n, k\rangle \mid \langle x, k, n\rangle \in C\} \\
\text{last}() : \to num & \{\langle x, k, n, n\rangle \mid \langle x, k, n\rangle \in C\}
\end{array}
$$

Table 4: Expression relations for location paths, position(), and last().

[…] for the remaining kinds of XPath expressions.

Now, for each expression $e$ and each $\langle x, k, n\rangle \in C$, there is exactly one $v$ s.t. $\langle x, k, n, v\rangle \in E_{\uparrow}[[e]]$.

Theorem 5.2 Let $e$ be an arbitrary XPath expression. Then, for context node $x$, position $k$, and size $n$, the value of $e$ is $v$, where $v$ is the unique value such that $\langle x, k, n, v\rangle \in E_{\uparrow}[[e]]$.

The main principle that we propose at this point to obtain an XPath evaluation algorithm with polynomial-time complexity is the notion of a context-value table (i.e., a relation for each expression, as discussed above).

Context-value Table Principle. Given an expression $e$ that occurs in the input query, the context-value table of $e$ specifies all valid combinations of contexts $\vec{c}$ and values $v$, such that $e$ evaluates to $v$ in context $\vec{c}$. Such a table for expression $e$ is obtained by first computing the context-value tables of the direct subexpressions of $e$ and subsequently combining them into the context-value table for $e$. Given that the size of each of the context-value tables has a polynomial bound and each of the combination steps can be effected in polynomial time (all of which we can assure in the following), query evaluation in total under our principle also has a polynomial time bound.⁶

⁶ The number of expressions to be considered is fixed with the parse tree of a given query.

[…];
return $E_{\uparrow}[[\text{root}(\text{Tree}(Q))]]$.

Example 5.4 Consider the document of Example 3.1. We want to evaluate the XPath query $Q$, which reads as

descendant::b/following-sibling::*[position() != last()]

over the input context $\langle a, 1, 1\rangle$. We illustrate how this evaluation can be done using Algorithm 5.3: First of all, we have to set up the parse tree

Q: E1/E2
  E1: descendant::b
  E2: E3[E4]
    E3: following-sibling::*
    E4: E5 != E6
      E5: position()
      E6: last()

of $Q$ with its 6 proper subexpressions $E_1, \ldots, E_6$. Then we compute the context-value tables of the leaf nodes $E_1$, $E_3$, $E_5$ and $E_6$ in the parse tree, and from the latter two the table for $E_4$. By combining $E_3$ and $E_4$, we obtain $E_2$, which is in turn needed for computing $Q$. The tables for $E_1$, $E_2$, $E_3$ and $Q$ are shown in Figure 2.⁷ Moreover,

$$E_{\uparrow}[[E_5]] = \{\langle x, k, n, k\rangle \mid \langle x, k, n\rangle \in C\}$$
$$E_{\uparrow}[[E_6]] = \{\langle x, k, n, n\rangle \mid \langle x, k, n\rangle \in C\}$$
$$E_{\uparrow}[[E_4]] = \{\langle x, k, n, k \neq n\rangle \mid \langle x, k, n\rangle \in C\}$$

⁷ The $k$ and $n$ columns have been omitted. Full tables are obtained by computing the cartesian product of each table with $\{\langle k, n\rangle \mid 1 \le k \le n \le |dom|\}$.

$$
\begin{array}{ll}
E_{\uparrow}[[E_1]] & \\
x & val \\
r & \{b_1, b_2, b_3, b_4\} \\
a & \{b_1, b_2, b_3, b_4\}
\end{array}
\qquad
\begin{array}{ll}
E_{\uparrow}[[E_2]] & \\
x & val \\
b_1 & \{b_2, b_3\} \\
b_2 & \{b_3\}
\end{array}
\qquad
\begin{array}{ll}
E_{\uparrow}[[E_3]] & \\
x & val \\
b_1 & \{b_2, b_3, b_4\} \\
b_2 & \{b_3, b_4\} \\
b_3 & \{b_4\}
\end{array}
\qquad
\begin{array}{ll}
E_{\uparrow}[[Q]] & \\
x & val \\
r & \{b_2, b_3\} \\
a & \{b_2, b_3\}
\end{array}
$$

Figure 2: Context-value tables of Example 5.4.

The most interesting step is the computation of $E_{\uparrow}[[E_2]]$ from the tables for $E_3$ and $E_4$. For instance, consider $\langle b_1, k, n, \{b_2, b_3, b_4\}\rangle \in E_{\uparrow}[[E_3]]$. $b_2$ is the first, $b_3$ the second, and $b_4$ the third of the three siblings following $b_1$. Thus, only for $b_2$ and $b_3$ is the condition $E_4$ (requiring that the position in the set $\{b_2, b_3, b_4\}$ is different from the size of the set, three) satisfied. Thus, we obtain the tuple $\langle b_1, k, n, \{b_2, b_3\}\rangle$, which we add to $E_{\uparrow}[[E_2]]$. We can read out the final result $\{b_2, b_3\}$ from the context-value table of $Q$.

$S_{\downarrow}[[\chi{::}t[e_1] \cdots [e_m]]](X_1, \ldots, X_k) :=$
begin
  $S := \{\langle x, y\rangle \mid x \in \bigcup_{i=1}^{k} X_i,\ x \chi y, \text{ and } y \in T(t)\}$;
  for each $1 \le i \le m$ (in ascending order) do
  begin
    fix some order $\vec{S} = \langle\langle x_1, y_1\rangle, \ldots, \langle x_l, y_l\rangle\rangle$ for $S$;
    $\langle r_1, \ldots, r_l\rangle := E_{\downarrow}[[e_i]](t_1, \ldots, t_l)$
      where $t_j = \langle y_j, idx_{\chi}(y_j, S_j), |S_j|\rangle$ and $S_j := \{z \mid \langle x_j, z\rangle \in S\}$;
    $S := \{\langle x_j, y_j\rangle \mid r_j \text{ is true}\}$;
  end;
  for each $1 \le i \le k$ do
    $R_i := \{y \mid \langle x, y\rangle \in S,\ x \in X_i\}$;
  return $\langle R_1, \ldots, R_k\rangle$;
end;

$$S_{\downarrow}[[/\pi]](X_1, \ldots, X_k) := S_{\downarrow}[[\pi]](\underbrace{\{\text{root}\}, \ldots, \{\text{root}\}}_{k \text{ times}})$$
$$S_{\downarrow}[[\pi_1/\pi_2]](X_1, \ldots, X_k) := S_{\downarrow}[[\pi_2]](S_{\downarrow}[[\pi_1]](X_1, \ldots, X_k))$$
$$S_{\downarrow}[[\pi_1 \mid \pi_2]](X_1, \ldots, X_k) := S_{\downarrow}[[\pi_1]](X_1, \ldots, X_k) \cup^{\langle\rangle} S_{\downarrow}[[\pi_2]](X_1, \ldots, X_k)$$

Figure 3: Top-down evaluation of location paths.

Theorem 5.5 XPath can be evaluated bottom-up in polynomial time (combined complexity).

(For a proof sketch of this result see [3].)
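To make the context-value table principle concrete, the following is a minimal Python sketch that replays Example 5.4 on the six-node document of Example 3.1. It is an illustration under simplifying assumptions, not the authors' implementation: the dictionary encoding of "firstchild" and "nextsibling", the helper names (`step`, `closure`, `in_doc_order`) and the table variables are all hypothetical, only the two axes needed by the query are covered, and the k and n columns are dropped for node-set valued tables as in Figure 2. The `closure` helper follows the work-list idea of Algorithm 2.2.

```python
# Document of Example 3.1: root r, document element a, children b1..b4.
FIRSTCHILD = {"r": "a", "a": "b1"}
NEXTSIBLING = {"b1": "b2", "b2": "b3", "b3": "b4"}
LABEL = {"a": "a", "b1": "b", "b2": "b", "b3": "b", "b4": "b"}  # element names
DOM = ["r", "a", "b1", "b2", "b3", "b4"]  # listed in document order


def step(rel, nodes):
    """Apply one primitive relation to a node set (eval_R in Algorithm 2.2)."""
    return {rel[x] for x in nodes if x in rel}


def closure(rels, nodes):
    """Reflexive-transitive closure of a union of primitive relations,
    computed as graph reachability with a work list (Algorithm 2.2)."""
    seen, todo = set(nodes), list(nodes)
    for x in todo:  # the list grows while it is being traversed
        for rel in rels:
            y = rel.get(x)
            if y is not None and y not in seen:
                seen.add(y)
                todo.append(y)
    return seen


def descendant(nodes):
    # Table 1: firstchild.(firstchild U nextsibling)*
    return closure([FIRSTCHILD, NEXTSIBLING], step(FIRSTCHILD, nodes))


def following_sibling(nodes):
    # Table 1: nextsibling.nextsibling*
    return closure([NEXTSIBLING], step(NEXTSIBLING, nodes))


def in_doc_order(nodes):
    return sorted(nodes, key=DOM.index)


# Context-value tables, one per subexpression, filled bottom-up for ALL nodes.
E1 = {x: {y for y in descendant({x}) if LABEL.get(y) == "b"} for x in DOM}
E3 = {x: {y for y in following_sibling({x}) if y in LABEL} for x in DOM}
# E4 = (position() != last()) depends only on k and n.
E4 = {(k, n): k != n for n in range(1, len(DOM) + 1) for k in range(1, n + 1)}

# E2 = E3[E4]: filter each entry of E3 by looking up the E4 table.
E2 = {}
for x, ys in E3.items():
    ordered = in_doc_order(ys)  # following-sibling is a forward axis
    E2[x] = {y for k, y in enumerate(ordered, start=1) if E4[(k, len(ordered))]}

# Q = E1/E2: join the two tables on the intermediate node.
Q = {x: set().union(*(E2[y] for y in E1[x])) for x in DOM}

print(sorted(Q["a"]))  # ['b2', 'b3']
```

Each table is computed once and has at most one row per context, which is the source of the polynomial bound; the sketch also shows the drawback discussed next, since rows such as those for b4 are computed although the query never needs them.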

6 Top-down Evaluation of XPath

In the previous section, we obtained a bottom-up semantics definition which led to a polynomial-time query evaluation algorithm for XPath. Despite this favorable complexity bound, this algorithm is still not practical, as usually many irrelevant intermediate results are computed to fill the context-value tables which are not used later on. Next, building on the context-value table principle of Section 5, we develop a top-down algorithm based on vector computation for which the favorable (worst-case) complexity bound carries over but in which the computation of a large number of irrelevant results is avoided.

Given an $m$-ary operation $Op : D^m \to D$, its vectorized version $Op^{\langle\rangle} : (D^k)^m \to D^k$ is defined as

$$Op^{\langle\rangle}(\langle x_{1,1}, \ldots, x_{1,k}\rangle, \ldots, \langle x_{m,1}, \ldots, x_{m,k}\rangle) := \langle Op(x_{1,1}, \ldots, x_{m,1}), \ldots, Op(x_{1,k}, \ldots, x_{m,k})\rangle$$

For instance, $\langle X_1, \ldots, X_k\rangle \cup^{\langle\rangle} \langle Y_1, \ldots, Y_k\rangle := \langle X_1 \cup Y_1, \ldots, X_k \cup Y_k\rangle$. Let

$$S_{\downarrow} : \text{LocationPath} \to \text{List}(2^{dom}) \to \text{List}(2^{dom})$$

be the auxiliary semantics function for location paths defined in Figure 3. We basically distinguish the same cases (related to location paths) as for the bottom-up semantics $E_{\uparrow}[[\pi]]$. Given a location path $\pi$ and a list $\langle X_1, \ldots, X_k\rangle$ of node sets, $S_{\downarrow}$ determines a list $\langle Y_1, \ldots, Y_k\rangle$ of node sets, s.t. for every $i \in \{1, \ldots, k\}$, the nodes reachable from the context nodes in $X_i$ via the location path $\pi$ are precisely the nodes in $Y_i$. $S_{\downarrow}[[\pi]]$ can be obtained from the relations $E_{\uparrow}[[\pi]]$ as follows. A node $y$ is in $Y_i$ iff there is an $x \in X_i$ and some $p$, $s$ such that $\langle x, p, s, y\rangle \in E_{\uparrow}[[\pi]]$.

Definition 6.1 The semantics function $E_{\downarrow}$ for arbitrary XPath expressions is of the following type:

$$E_{\downarrow} : \text{XPathExpression} \to \text{List}(C) \to \text{List}(\text{XPathType})$$

Given an XPath expression $e$ and a list $(\vec{c}_1, \ldots, \vec{c}_l)$ of contexts, $E_{\downarrow}$ determines a list $\langle r_1, \ldots, r_l\rangle$ of results of one of the XPath types number, string, boolean, or node set. $E_{\downarrow}$ is defined as

$$E_{\downarrow}[[\pi]](\langle x_1, k_1, n_1\rangle, \ldots, \langle x_l, k_l, n_l\rangle) := S_{\downarrow}[[\pi]](\{x_1\}, \ldots, \{x_l\})$$
$$E_{\downarrow}[[\text{position}()]](\langle x_1, k_1, n_1\rangle, \ldots, \langle x_l, k_l, n_l\rangle) := \langle k_1, \ldots, k_l\rangle$$
$$E_{\downarrow}[[\text{last}()]](\langle x_1, k_1, n_1\rangle, \ldots, \langle x_l, k_l, n_l\rangle) := \langle n_1, \ldots, n_l\rangle$$

and

$$E_{\downarrow}[[Op(e_1, \ldots, e_m)]](\vec{c}_1, \ldots, \vec{c}_l) := F[[Op]]^{\langle\rangle}(E_{\downarrow}[[e_1]](\vec{c}_1, \ldots, \vec{c}_l), \ldots, E_{\downarrow}[[e_m]](\vec{c}_1, \ldots, \vec{c}_l))$$

for the remaining kinds of expressions.

Example 6.2 Given the query $Q$, data, and context $\langle a, 1, 1\rangle$ of Example 5.4, we evaluate $Q$ as $E_{\downarrow}[[Q]](\langle a, 1, 1\rangle) = S_{\downarrow}[[E_2]](S_{\downarrow}[[\text{descendant::b}]](\{a\}))$, where $S_{\downarrow}[[\text{descendant::b}]](\{a\}) = \langle\{b_1, b_2, b_3, b_4\}\rangle$. To compute $S_{\downarrow}[[E_2]](\langle\{b_1, b_2, b_3, b_4\}\rangle)$, we proceed as described in the algorithm for location steps in Figure 3. We initially obtain the set

$$S = \{\langle b_1, b_2\rangle, \langle b_1, b_3\rangle, \langle b_1, b_4\rangle, \langle b_2, b_3\rangle, \langle b_2, b_4\rangle, \langle b_3, b_4\rangle\}$$

and the list of contexts $\vec{t} = \langle\langle b_2, 1, 3\rangle, \langle b_3, 2, 3\rangle, \langle b_4, 3, 3\rangle, \langle b_3, 1, 2\rangle, \langle b_4, 2, 2\rangle, \langle b_4, 1, 1\rangle\rangle$. The check of condition $E_4$ returns the filter

$$\vec{r} = \langle \text{true}, \text{true}, \text{false}, \text{true}, \text{false}, \text{false}\rangle,$$

which is applied to $S$ to obtain

$$S = \{\langle b_1, b_2\rangle, \langle b_1, b_3\rangle, \langle b_2, b_3\rangle\}.$$

Thus, the query returns $\langle\{b_2, b_3\}\rangle$.

The correctness of the top-down semantics follows immediately from the corresponding result in the bottom-up case and from the definition of $S_{\downarrow}$ and $E_{\downarrow}$.

Theorem 6.3 (Correctness of $E_{\downarrow}$) Let $e$ be an arbitrary XPath expression. Then, $\langle v_1, \ldots, v_l\rangle = E_{\downarrow}[[e]](\vec{c}_1, \ldots, \vec{c}_l)$ iff $\langle \vec{c}_1, v_1\rangle, \ldots, \langle \vec{c}_l, v_l\rangle \in E_{\uparrow}[[e]]$.

$S_{\downarrow}$ and $E_{\downarrow}$ can be immediately transformed into function definitions in a top-down algorithm. We thus have to define one evaluation function for each case of the definition of $S_{\downarrow}$ and $E_{\downarrow}$, respectively. The functions corresponding to the various cases of $S_{\downarrow}$ have a location path and a list of node sets of variable length $(X_1, \ldots, X_k)$ as input parameters and return a list $(R_1, \ldots, R_k)$ of node sets of the same length as result. Likewise, the functions corresponding to $E_{\downarrow}$ take an arbitrary XPath expression and a list of contexts as input and return a list of XPath values (which can be of type num, str, bool or nset). Moreover, the recursions in the definition of $S_{\downarrow}$ and $E_{\downarrow}$ correspond to recursive function calls of the respective evaluation functions. Analogously to Theorem 5.5, we get

Theorem 6.4 The immediate functional implementation of $E_{\downarrow}$ evaluates XPath queries in polynomial time (combined complexity).

Finally, note that using arguments relating the top-down method of this section with (join) optimization techniques in relational databases, one may argue that the context-value table principle is also the basis of the polynomial-time bound of Theorem 6.4.

7 Conclusions

The algorithms presented in this paper empower XPath engines to deal efficiently with very sophisticated queries while scaling up to large documents. In [3, 4], in addition to an experimental verification of our claim that current XPath engines evaluate queries in exponential time only, we have also presented fragments of XPath which contain most of the XPath queries likely to be used in practice and can be evaluated particularly efficiently, using methods that are beyond the scope of this paper.

We have made a main-memory implementation of the top-down algorithm of Section 6. Resources such as this implementation and publications will be made accessible at http://www.xmltaskforce.com.

Acknowledgments

We thank J. Siméon for pointing out to us that the mapping from XPath to the XML Query Algebra, which will be the preferred semantics definition for XPath 2, in a direct functional implementation also leads to exponential-time query processing on XPath 1 (which is a fragment of XPath 2).

References

[1] A. Deutsch and V. Tannen. "Containment and Integrity Constraints for XPath". In Proc. KRDB 2001, CEUR Workshop Proceedings 45, 2001.

[2] G. Gottlob and C. Koch. "Monadic Queries over Tree-Structured Data". In Proceedings of the 17th Annual IEEE Symposium on Logic in Computer Science (LICS 2002), pages 189–202, Copenhagen, Denmark, July 2002.

[3] G. Gottlob, C. Koch, and R. Pichler. "Efficient Algorithms for Processing XPath Queries". In Proceedings of the 28th International Conference on Very Large Data Bases (VLDB'02), Hong Kong, China, Aug. 2002.

[4] G. Gottlob, C. Koch, and R. Pichler. "XPath Query Evaluation: Improving Time and Space Efficiency". In Proceedings of the 19th IEEE International Conference on Data Engineering (ICDE'03), Bangalore, India, Mar. 2003. To appear.

[5] G. Miklau and D. Suciu. "Containment and Equivalence for an XPath Fragment". In Proceedings of the 21st ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS'02), pages 65–76, Madison, Wisconsin, 2002.

[6] P. Wadler. "Two Semantics for XPath", 2000. Draft paper available at http://www.research.avayalabs.com/user/wadler/.

[7] P. Wadler. "A Formal Semantics of Patterns in XSLT". In Markup Technologies, Philadelphia, December 1999. Revised version in Markup Languages, MIT Press, June 2001.

[8] P. T. Wood. "On the Equivalence of XML Patterns". In Proc. 1st International Conference on Computational Logic (CL 2000), LNCS 1861, pages 1152–1166, London, UK, July 2000. Springer-Verlag.

[9] World Wide Web Consortium. DOM Specification. http://www.w3c.org/DOM/.

[10] World Wide Web Consortium. XML Path Language (XPath) Recommendation. http://www.w3c.org/TR/xpath/, Nov. 1999.

[11] World Wide Web Consortium. "Extensible Markup Language (XML) 1.0 (Second Edition)", Oct. 2000. http://www.w3.org/TR/REC-xml.

[12] World Wide Web Consortium. "XQuery 1.0 and XPath 2.0 Formal Semantics". W3C Working Draft, Aug. 16th 2002. http://www.w3.org/TR/query-algebra/.
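As a companion to Example 6.2, the vectorized location step of Figure 3 can be sketched in Python as follows. This is a hedged illustration rather than the implementation mentioned in Section 7: the function `location_step`, the dictionary encoding of the document, and the restriction to forward axes with at most one predicate are assumptions made to keep the example short. The point it tries to show is that the predicate is evaluated once on a whole list of contexts, and that only nodes actually reached from the input node sets are ever considered.

```python
# Six-node document of Example 3.1; children are listed in document order.
CHILDREN = {"r": ["a"], "a": ["b1", "b2", "b3", "b4"]}
PARENT = {c: p for p, cs in CHILDREN.items() for c in cs}
LABEL = {"a": "a", "b1": "b", "b2": "b", "b3": "b", "b4": "b"}  # element names
DOC_ORDER = ["r", "a", "b1", "b2", "b3", "b4"]


def descendant(x):
    """Descendants of x, returned in document order."""
    out = []
    for c in CHILDREN.get(x, []):
        out.append(c)
        out.extend(descendant(c))
    return out


def following_sibling(x):
    """Following siblings of x, returned in document order."""
    sibs = CHILDREN.get(PARENT.get(x), [])
    return sibs[sibs.index(x) + 1:] if x in sibs else []


def location_step(axis, test, predicate, node_sets):
    """Vectorized location step in the spirit of Figure 3
    (forward axes only, zero or one predicate)."""
    starts = sorted({x for xs in node_sets for x in xs}, key=DOC_ORDER.index)
    pairs = [(x, y) for x in starts for y in axis(x) if test(y)]
    if predicate is not None:
        contexts = []
        for x, y in pairs:
            group = [z for (w, z) in pairs if w == x]  # S_j, in document order
            contexts.append((y, group.index(y) + 1, len(group)))
        keep = predicate(contexts)  # one call for the whole list of contexts
        pairs = [p for p, r in zip(pairs, keep) if r]
    return [{y for (x, y) in pairs if x in xs} for xs in node_sets]


def position_ne_last(contexts):
    """E4 = position() != last(), evaluated on a list of contexts."""
    return [k != n for (_, k, n) in contexts]


def is_b(y):
    return LABEL.get(y) == "b"


def is_element(y):
    return y in LABEL


# Q = descendant::b/following-sibling::*[position() != last()] from node a.
step1 = location_step(descendant, is_b, None, [{"a"}])
result = location_step(following_sibling, is_element, position_ne_last, step1)
print([sorted(s) for s in result])  # [['b2', 'b3']]
```

In this run the predicate receives exactly the six contexts listed in Example 6.2, and nothing is computed for the context nodes r or a beyond the first step, in contrast to the bottom-up tables of Figure 2.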