WIN: An Efficient Data Placement Strategy for Parallel XML

Nan Tang† Guoren Wang‡ Jeffrey Xu Yu† Kam-Fai Wong† Ge Yu‡

† Department of SE & EM, The Chinese University of Hong Kong, Hong Kong, China   ‡ College of Information Science and Engineering, Northeastern University, Shenyang, China   {ntang, yu, kfwong}@se.cuhk.edu.hk   {wanggr, yuge}@mail.neu.edu.cn

Abstract

The basic idea behind parallel systems is to perform operations in parallel to reduce the response time and improve the system throughput. Data placement is a key factor in the overall performance of parallel systems. XML is semistructured data, and traditional data placement strategies cannot serve it well. In this paper, we present the concept of the intermediary node INode and propose a novel workload-aware data placement strategy, WIN, to effectively decluster XML data and obtain high intra-query parallelism. Extensive experiments show that the speedup and scaleup performance of WIN outperforms the previous strategies.

1 Introduction

XML has become the de facto standard for data representation and exchange over the Internet. Nevertheless, with the increasing size of XML documents and the growing complexity of evaluating XML queries, the query processing performance of existing centralized environments reaches a limit. Parallelism is a viable solution to this problem.

In parallel DBMSs, data placement has a significant impact on the overall system performance. Several earlier studies have shown that data placement is closely relevant to the performance of a shared-nothing parallel architecture, and many classical data placement strategies have been proposed in the literature. [2, 11] proposed many single-dimension and multi-dimension data placement strategies for structured data in RDBMSs. In OODBMSs, in contrast, objects can be arbitrarily referenced by one another, so the relationship among objects is topologically a graph, and most object-placement strategies are graph-based. Distinct from both, XML employs a tree structure, and conventional data placement strategies in RDBMSs and OODBMSs cannot serve XML data well. Semistructured data involve nested structures, and there is much intricacy and redundancy when representing XML as relations; data placement strategies for RDBMSs cannot, therefore, solve the data placement problem of XML data well. Although graph algorithms in OODBMSs can be applied to tree partitioning problems, the key relationships between tree nodes, such as ancestor-descendant or sibling, cannot be clearly denoted in a graph, so data placement strategies for OODBMSs cannot adapt well to XML data either. Furthermore, path-based XML query languages such as XPath and XQuery add to the hardness and intricacy of the tree data placement problem. Data placement strategies in both parallel relational and object-oriented database systems are helpful for the data placement of XML documents, but they cannot solve the problem that XML faces. Efficient data placement strategies for XML DBMSs are, therefore, meaningful and challenging.

[13] proposed two data placement strategies for the XML data tree: NSNRR and PSPIB. NSNRR is designed to exploit pipeline query parallelism: elements having different tag names are allocated to different sites in a round-robin way. Previous experiments show that when the document is small NSNRR has good speedup and scaleup performance, but as the document grows, both the speedup and scaleup performance of NSNRR degrade badly. PSPIB is a data placement strategy based on path instances. A path instance is informally defined as a sequence of node identifiers from the root to a leaf node, while a path schema is an abstraction of path instances. The main idea of PSPIB is to decluster the path instances having the same path schema evenly across all sites.

Because PSPIB and WIN share the similar idea of partitioning the XML data tree into sub-trees, we compare WIN with PSPIB in the performance evaluation. We give a simple example of the data distribution of PSPIB to make it more concrete. Given the lightweight XML data tree in Fig. 1, Fig. 2 shows the data placement in the case of two sites. Solid circles represent the elements duplicated at both sites; real circles represent the elements distributed to site 1; dotted circles represent the elements distributed to site 2.

Figure 1. A Lightweight XML Data Tree

Figure 2. Data Placement of PSPIB
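To make the PSPIB idea concrete, here is a minimal sketch of this kind of path-instance declustering. It is an illustration under our own simplified representation, not the algorithm of [13]: a path instance is modelled as a tuple of node identifiers from the root to a leaf, tag_of maps a node identifier to its tag name, and the data structures are assumptions of this example.

    # Sketch: decluster path instances with the same path schema round-robin.
    from collections import defaultdict

    def pspib_decluster(path_instances, tag_of, num_sites):
        """Group path instances by path schema, then deal each group out evenly."""
        by_schema = defaultdict(list)
        for inst in path_instances:                      # inst: (n1, n2, ..., nk)
            schema = tuple(tag_of[n] for n in inst)      # e.g. ('Root', 'a1', ...) -> tags
            by_schema[schema].append(inst)
        assignment = defaultdict(list)                   # site number -> path instances
        for schema, insts in by_schema.items():
            for i, inst in enumerate(insts):
                assignment[i % num_sites].append(inst)
        return assignment

    # Usage (two sites, as in Fig. 2):
    # assignment = pspib_decluster(instances, tag_of, num_sites=2)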

In this paper, we present the concept of the intermediary node INode. Considering that determining the best data placement strategy in parallel DBMSs is an NP-complete problem [9], a novel workload-aware data placement strategy, WIN, is proposed to find a sub-optimal INode set for partitioning the XML data tree. This strategy is not aimed at special queries; nevertheless, as will be shown, it suits each query expression and gives good intra-query parallelism. Based on an XML database benchmark, XMark [10], comprehensive experimental results show that our new data placement strategy WIN has better speedup and scaleup performance than previous work.

The rest of this paper is organized as follows. Section 2 gives the concept of INode and the basic idea for data placement. Section 3 provides a detailed statement of the novel data placement strategy WIN. Section 4 shows experimental results. Our conclusions and future work are contained in Section 5.

2 Data Placement using INode

Observing a common XML data tree, we find that the number of upper-level nodes is always small, and yet these nodes are the crux of many XML query expressions. Therefore, we should duplicate these nodes at all sites; the tiny storage space consumption can trade off between high intra-query parallelism and low communication overhead. We then partition the remaining sub-trees to different sites and, at each site, combine the sub-trees and the duplicated nodes to form a local data tree. WIN describes how to find a group of duplicated nodes and a group of sub-trees so as to trade off the duplication cost against high parallelism. The problem is formally defined as follows:

Let XT be an XML data tree, Q be the query set of n XML query expressions Q_1, Q_2, ..., Q_n, f_i be the frequency of Q_i, and let N > 1 be the number of sites. How can we partition XT into N sub-trees XT_1 ~ XT_N, with XT = \bigcup_{i=1}^{N} XT_i, so that each Q_i achieves high intra-query parallelism?

We will first present the concept of INode, then give the basic data placement idea based on INode, and finally give our workload estimation formulae.

2.1 Intermediary Node

We classify the nodes in an XML data tree XT into three classes: Duplicate Nodes (DNode), Intermediary Nodes (INode) and Trivial Nodes (TNode). A DNode will be duplicated at all sites. An INode is the essential node that decides which nodes will be partitioned to different sites. A TNode will be partitioned to the site where its ancestor INode is. We use DN to represent the set of DNodes, IN for INodes and TN for TNodes. They satisfy the following properties:

1. DN's ancestor nodes are DN or NULL;
2. IN's ancestor nodes are DN or NULL;
3. IN's descendant nodes are TN or attribute nodes or text nodes or NULL;
4. DN is duplicated at all sites; IN and TN are unique;
5. IN ∩ DN ∩ TN = Ø;
6. IN ∪ DN ∪ TN = XT.

DN forms the duplicated root tree at all sites, and each INode in IN is the root of a sub-tree to be partitioned. IN is the pivot of our data placement strategy: for a given IN, our partition algorithm produces one data placement result, and different IN lead to different data placements, so the key of our novel data placement strategy is how to find the best IN. There are numerous possible IN in a data tree. As shown in Fig. 3, {b, c} is a possible INode set, and so are {b, f, g} and {d, e, c}; an INode can also be a leaf node, e.g., {b, l, m, g}, {b, f, n, o}, {b, l, m, n, o}, etc. The extreme case of IN is that all INodes are leaf nodes, such as {h, i, j, k, l, m, n, o}.
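A small sketch makes the classification concrete. Following the definitions above (and Section 2.2 below, which states that the ancestors of all INodes form DN and the remaining nodes are TN), the code derives DN and TN from a chosen IN. The parent-map representation of the tree is an assumption of this illustration; only element nodes are considered.

    def classify(parent, nodes, IN):
        """Given a tree as a child->parent map and an INode set IN,
        return (DN, TN): DN = proper ancestors of INodes, TN = everything else."""
        IN = set(IN)
        DN = set()
        for node in IN:
            p = parent.get(node)               # walk up from each INode
            while p is not None and p not in DN:
                DN.add(p)
                p = parent.get(p)
        TN = set(nodes) - IN - DN
        return DN, TN

    # The tree of Fig. 3: a is the root, b and c its children, d-g below them, h-o the leaves.
    parent = {'b': 'a', 'c': 'a', 'd': 'b', 'e': 'b', 'f': 'c', 'g': 'c',
              'h': 'd', 'i': 'd', 'j': 'e', 'k': 'e', 'l': 'f', 'm': 'f',
              'n': 'g', 'o': 'g'}
    nodes = set(parent) | {'a'}
    DN, TN = classify(parent, nodes, IN={'b', 'f', 'g'})
    # DN == {'a', 'c'};  TN == {'d', 'e', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o'}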

How to choose the best IN is obviously an NP-complete problem, as discussed in [9], which searches for the best data placement strategy in parallel DBMSs.

Figure 3. Example for INode (root a with children b, c; b has children d, e; c has children f, g; the leaves are h to o)

2.2 Data Placement based on INode

The basic idea of data placement based on INode is described below. Given an XML data tree XT and an INode set IN, the ancestors of all INodes are DN, and the remaining nodes in XT are TN. We first duplicate the root tree formed by DN at all sites; then we group the INodes in IN according to their parent node, and within each group we distribute the sub-trees rooted at the INodes to different sites with the goal of balancing workload. At each site, we combine all its sub-trees and DN to form one local data tree.

Our goal is high intra-query parallelism, and thereby low system response time and high throughput. However, if there are many complex query expressions, no data placement can give every query an equal workload at all sites. Moreover, we cannot analyze all possible data placement results to find the best one, which is obviously impractical. We exploit two heuristic rules to reduce the search space; they are discussed in Section 3.

2.3 Workload Cost Model

In workload-aware data placement strategies, workload estimation is a crucial part, since the outcome of workload estimation directly affects the data placement result and thereby influences the performance of the parallel system. In this section we propose a cost model to estimate the workload of an XML data tree in a shared-nothing parallel database system. These formulae are based not only on the inherent structure of XML data and query expressions but also on the index structures and query methods. Different from the path indices for regular path expressions proposed in [4], we only employ the basic indices implemented in XBase [7]. As for some core problems such as ancestor-descendant queries and twig joins, we implement them with query rewriting and hash join technology, though they have been efficiently solved in [1, 5]. The reason is that we aim at proposing a workload estimation model that meets the demand of common XML query processing techniques rather than certain algorithms or specific systems.

We now give our workload cost model to evaluate the workload of a data tree with root a. Table 1 lists the variables used in this paper.

    Variable          Description
    n_a               the number of element a
    N                 the number of sites
    n_{a⋈b}           the result number of a ⋈ b
    S_a               the size of element a
    Q_i               the i-th query in the statistics
    f_i               query frequency of Q_i
    δ_{a⋈b}           selectivity of operation a ⋈ b
    t_{I/O}           avg. disk access time
    v_{net}           net speed
    W_a               workload of the tree rooted at a
    W_{label(a,b)}    workload of label(a,b)
    f_{label(a,b)}    frequency of label(a,b)

Table 1. Variables used in this paper

We define the workload of a data tree with root a as:

    W_a = \begin{cases} 0, & a \text{ is a leaf} \\ \sum_{b \in child(a)} \bigl( W_b + W_{label(a,b)} \bigr), & a \text{ is not a leaf} \end{cases}    (1)

where W_{label(a,b)} is the workload of the label from a to b. In an XML data tree, the label workload from element a to b is:

    W_{label(a,b)} = \Bigl( \frac{PreSize + TempSize}{S_{page}} \cdot t_{I/O} + \frac{TempSize}{v_{net}} \Bigr) \cdot f_{label(a,b)}    (2)

The first term in the bracket is the cost of disk I/O, including the preprocessing of the hash table and the storage of temporary results. The second is the communication cost of transferring the temporary results. Considering that the CPU cost is orders of magnitude smaller than disk I/O and network communication, we ignore it here; f_{label(a,b)} is the number of label(a,b). The PreSize of constructing and using the hash table is:

    PreSize = (3 n_a + 3 n_b) \cdot S    (3)

It gives the size of constructing and using the hash table. a and b have the same size in our OODBMS, so we use S to represent the object size of a and b when it is clear from the context. The size of the temporary results, TempSize, is defined as:

    TempSize = n_b \cdot δ_{a⋈b} \cdot (S_a + S_b)    (4)

It gives the size of the temporary results of a ⋈ b. Here the operation ⋈ mainly refers to the parent-child join "/". The total number of label(a,b) is defined as:

    f_{label(a,b)} = \sum_{i=1}^{n} \bigl( Q_i / label(a,b) \bigr) \cdot f_i    (5)

where Q_i/label(a,b) represents the number of occurrences of label(a,b) in query Q_i. The selectivity δ_{a⋈b} is defined as:

    δ_{a⋈b} = n_{a⋈b} / n_b    (6)

Selectivity represents the fraction of the b elements that remain in n_{a⋈b} after the operation ⋈.
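The cost model is easy to turn into code. The sketch below is a minimal rendering of equations (1) to (6), assuming the tree structure and the per-label statistics (n_a, n_b, n_{a⋈b}, element sizes, and query occurrence counts) have already been collected; the dictionary layout and the constant values are illustrative assumptions, not part of the paper.

    # Sketch of the workload cost model, eqs. (1)-(6).
    # stats[(a, b)] = {'n_a':..., 'n_b':..., 'n_ab':..., 'S_a':..., 'S_b':..., 'occ': {Qi: count}}
    T_IO, V_NET, S_PAGE, S_OBJ = 0.008, 100e6 / 8, 8192, 32    # illustrative constants

    def f_label(occ, freq):                        # eq. (5)
        return sum(count * freq[q] for q, count in occ.items())

    def selectivity(n_ab, n_b):                    # eq. (6)
        return n_ab / n_b if n_b else 0.0

    def w_label(a, b, stats, freq):                # eq. (2), using eqs. (3)-(4)
        st = stats[(a, b)]
        pre_size  = (3 * st['n_a'] + 3 * st['n_b']) * S_OBJ                 # eq. (3)
        temp_size = st['n_b'] * selectivity(st['n_ab'], st['n_b']) \
                    * (st['S_a'] + st['S_b'])                               # eq. (4)
        io_cost  = (pre_size + temp_size) / S_PAGE * T_IO
        net_cost = temp_size / V_NET
        return (io_cost + net_cost) * f_label(st['occ'], freq)

    def w_tree(a, children, stats, freq):          # eq. (1), recursive; 0 for a leaf
        return sum(w_tree(b, children, stats, freq) + w_label(a, b, stats, freq)
                   for b in children.get(a, []))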

3 WIN Data Placement Strategy

In this section we present the framework of our workload-aware intermediary node (WIN) data placement strategy. We first give the holistic algorithm of WIN, and then discuss each part in detail one by one. The following gives a comprehensive description of how to find the sub-optimal IN and how to obtain the data distribution. We consider only element nodes in the WIN algorithm because attribute nodes and text nodes are partitioned according to their parent element nodes.

The first algorithm outlines the WIN algorithm for fragmenting an XML data tree into partitions. The parameter XT represents the XML data tree to be partitioned, the parameter N represents the number of sites, interList is the list of candidate INode sets, and finalIN is the resulting IN. Briefly, the algorithm creates a data placement as follows:

    Win(XT, N)
        interList ← {{root}}
        finalIN ← nil
        while interList ≠ nil
            do IN ← interList.popFront()
               if Benefit(IN, finalIN)
                   then finalIN ← IN
               Expand(interList, IN)
        return Partition(finalIN)

The next procedure outlines Partition(), which distributes a data tree with a given INode set IN to different sites. With the goal of high intra-query parallelism, and combining the special structure of the XML document with the characteristics of query expressions, we first group the nodes in IN that have the same parent, so that the sub-trees in each group have similar structure, and then partition the sub-trees of each group to different sites with workload balancing to guarantee high intra-query parallelism (a small illustrative sketch follows the pseudocode). After this procedure, each site has a local XML data tree.

    Partition(IN)
        DN ← ∪_{E ∈ IN} Ancestor(E)
        duplicate each node in DN to all sites P_1 to P_N
        for each element E ∈ IN
            do group E according to getParent(E)
        for each group G
            do num ← 1
               workloadAVG ← W(getParent(G)) / N
               workloadADD ← 0
               for each element E ∈ G
                   do if workloadADD < workloadAVG
                          then workloadADD ← workloadADD + W_E
                          else workloadADD ← 0
                               num ← num + 1
                      distribute Subtree(E) to P_num
        return the distribution of IN
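The following sketch mirrors the Partition() pseudocode above under the simplifying assumptions of this illustration: the sub-tree workloads W_E and the parent-tree workloads come from the cost model, parent is a child-to-parent map, ancestors(e) is an assumed helper returning the proper ancestors of e, and "distributing" a sub-tree simply records a site number.

    from collections import defaultdict

    def partition(IN, parent, tree_workload, ancestors, N):
        """Sketch of Partition(IN): duplicate DN everywhere, then assign sub-trees to sites."""
        DN = set().union(*(ancestors(e) for e in IN))     # duplicated at every site
        groups = defaultdict(list)                        # parent node -> INodes under it
        for e in IN:
            groups[parent[e]].append(e)
        placement = {}                                    # INode -> site number (1..N)
        for p, group in groups.items():
            num = 1
            workload_avg = tree_workload[p] / N           # W(getParent(G)) / N
            workload_add = 0.0
            for e in group:
                if workload_add < workload_avg:
                    workload_add += tree_workload[e]      # W_E
                else:
                    workload_add = 0.0
                    num = min(num + 1, N)                 # move to the next site (capped at N here)
                placement[e] = num                        # Subtree(e) goes to site P_num
        return DN, placement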
The following procedure, Expand(), expands interList. Each time we expand only one INode set IN from interList, and the number of expansions is |IN| (|IN| denotes the number of nodes in IN). Each time we pop a node in ∈ IN and add all of its children to IN; the procedure only expands those results that could bring more benefit than IN and adds them to interList. Suppose, for example, that a node has numerous children, e.g., 10000; the expansion of this node is definitely a time-consuming operation, and further expansion is unthinkable. As shown in Fig. 4(a), the parent node p has many children c; if we expand p, interList will grow huge. In this case, we add a virtual node, called a VNode, to solve the problem. A VNode is a node inserted between p and its children c; the gray nodes in Fig. 4(b) are VNodes. With VNodes we can solve the above problem. We do not store VNodes; we merely introduce them to decrease the computation.

Figure 4. An Example of VNode: (a) p and its children; (b) after adding VNodes

    Expand(interList, IN)
        for each node in ∈ IN
            do IN' ← IN − in ∪ in.getChildren()
               if Benefit(IN', IN)
                   then interList.pushBack(IN')
        return interList

Searching all possible choices to find the best result is an NP-complete problem, so we add two heuristic rules that compute whether a new INode set IN_1 brings more benefit than the old INode set IN_2 and decide whether to push it or not, with which we reduce the search space. They are embodied in the procedure Benefit(). From the Expand() procedure, we can see that after each expansion of IN, the new IN has more DN. There is a trend that, with further expansion, the workload distribution becomes more even, but the workload at each site becomes higher with the increasing duplication cost; the extreme case is full duplication. Numerous studies have shown that full duplication leads to reduced performance, and this has led many to believe that partial duplication is the correct solution. Therefore we exploit two heuristic rules to simplify the search space and avoid full duplication. They are shown below (a small sketch of Benefit() and Expand() together follows the pseudocode):

1. If IN_1 gives a worse workload distribution than IN_2 then, similar to the hill-climbing method, we assume that there is no point in continuing to expand, and we do not expand in this case;
2. Even if IN_1 gives a better workload distribution than IN_2, if the duplication cost is higher than the benefit it brings, we also do not expand in this case.

    Benefit(IN_1, IN_2)
        Partition(IN_1)
        w_1, w_2, ..., w_N is IN_1's workload from P_1 to P_N
        Partition(IN_2)
        w'_1, w'_2, ..., w'_N is IN_2's workload from P_1 to P_N
        W_{IN_1} ← Σ_{i=1}^{N} w_i
        W_{IN_2} ← Σ_{i=1}^{N} w'_i
        Avg_1 ← W_{IN_1}/N, Avg_2 ← W_{IN_2}/N
        Max_1 ← max(w_1, w_2, ..., w_N)
        Max_2 ← max(w'_1, w'_2, ..., w'_N)
        Ben_1 ← Max_1 − Avg_1, Ben_2 ← Max_2 − Avg_2
        if Ben_1 ≥ Ben_2
            then return false
            else ∆ben ← Ben_2 − Ben_1
                 ∆dup ← (W_{IN_1} − W_{IN_2}) / N
                 return (∆ben − ∆dup) > 0
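Read together, Expand() and Benefit() implement a greedy, hill-climbing style search over INode sets. The sketch below mirrors the two procedures under the assumptions of this illustration: INode sets are Python sets of node identifiers, children(node) is an assumed helper, and site_workloads(IN) is an assumed helper that returns the per-site workloads [w_1, ..., w_N] after partitioning with IN (it could call the partition() sketch above together with the cost model).

    def benefit(IN1, IN2, site_workloads):
        """Sketch of Benefit(IN1, IN2): is the new set IN1 worth keeping/expanding?"""
        if IN2 is None:                        # assumption: the first candidate is accepted
            return True
        w1, w2 = site_workloads(IN1), site_workloads(IN2)
        W1, W2 = sum(w1), sum(w2)
        N = len(w1)
        ben1 = max(w1) - W1 / N                # imbalance of the new placement
        ben2 = max(w2) - W2 / N                # imbalance of the old placement
        if ben1 >= ben2:
            return False                       # heuristic rule 1: no better balance
        delta_ben = ben2 - ben1                # balance gained
        delta_dup = (W1 - W2) / N              # per-site duplication cost added
        return delta_ben - delta_dup > 0       # heuristic rule 2: gain must exceed cost

    def expand(inter_list, IN, children, site_workloads):
        """Sketch of Expand(interList, IN): replace one INode by its children."""
        for node in list(IN):
            kids = children(node)
            if not kids:                       # leaves cannot be expanded further
                continue
            new_IN = (IN - {node}) | set(kids)
            if benefit(new_IN, IN, site_workloads):
                inter_list.append(new_IN)      # keep only promising candidates
        return inter_list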

4 Performance Evaluation

WIN and PSPIB are both based on the idea of partitioning path instances to form sub-trees; WIN aims at workload balance, while the goal of PSPIB is an even spread of path instances. To compare the two data placement strategies, therefore, we implemented them in a native XML database and carried out a performance test. In this section, we present an experimental evaluation of our novel workload-aware data placement strategy WIN and of PSPIB. We start with a description of our experiment platform and test data set in Section 4.1. Section 4.2 presents the experimental results and the performance evaluation of our algorithms.

4.1 Experiment Platform and Test Data

The experiment platform used is a network of personal computers (NOPC). There are six nodes; each node has a PIII 800MHz CPU, 128MB of memory and a 20GB hard disk. The operating system used is Sun Unix (Solaris 8.0), Intel platform edition, from Sun Microsystems. The nodes form a shared-nothing parallel XML database system that communicates through a 100Mb/s high-speed LAN. One of them acts as the scheduler and the others as servers. We employ a distributed and parallel OODBMS, FISH [12], to store XML data. The testing programs were written in Forte C++ and INADA conforming to the ODMG C++ binding. Because of the limitations of our installation, we can only use six nodes for the tests; however, our algorithm can easily be extended to more nodes, and the current experiments can basically show the performance trend. We will give tests with more nodes in the future.

We adopt the XML Benchmark Project [10] proposed by CWI as our experiment data set. The XMark data set provides a large single scalable document; each document is modelled after the same DTD given in [10]. The test data consists of single XML documents scaling over 20M, 40M, 60M, 80M and 100M. We use a comprehensive query set of 20 queries [10] to test the performance of the two data placement strategies.

4.2 Experimental Results and Performance Evaluation

WIN vs PSPIB: In this section, we evaluate the performance of the WIN strategy and compare it with PSPIB. To ensure an impartial performance evaluation of the two data placements, we use the same query method for both strategies. With the index structures PNameIndex, NameIndex, TextValueIndex and AttributeValueIndex proposed in [8], we adopt query rewriting and hash join operations for all query expressions. To make the experimental results more accurate, we executed each query expression 5 times; the first run gives the cold result and the later runs the hot results, and the average time of the hot results is the final response time. Since there are 5 test documents from 20M to 100M, each tested from 1 site to 5 sites for all queries, the test work and experimental results are very large. We evaluate the two key properties in measuring a parallel database management system: speedup and scaleup. In view of the memory limitation, when handling huge XML documents such as 80M and 100M, one site is always an exception point in the performance evaluation; in the speedup evaluation, therefore, we begin with the case of two sites. Not all performance curves are given here because of the space limitation.
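The curves below plot these two metrics. As a reminder of how such values are obtained from measured response times, here is a tiny sketch assuming the conventional definitions (response-time ratios); the choice of a two-site speedup baseline and the growing-document scaleup setup are assumptions based on the experimental setup described above, since the paper does not spell out the formulas.

    def speedup(resp, base=2):
        """resp: {num_sites: response time} for a fixed document.
        Speedup relative to the base-site configuration (ideal value: n / base)."""
        return {n: resp[base] / t for n, t in resp.items()}

    def scaleup(resp_scaled, base=1):
        """resp_scaled: {num_sites: response time} where the document grows with the
        number of sites (20M on 1 site, 40M on 2 sites, ...). Ideal value: 1.0."""
        return {n: resp_scaled[base] / t for n, t in resp_scaled.items()}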

We first give the performance curves of Q5; this query tests the casting ability. Strings are the generic data type in XML documents [10], and queries that interpret strings will often need to cast strings to another data type that carries more semantics. The query requirement is shown below:

How many sold items cost more than 40?

This query challenges the DBMS in terms of the casting primitives it provides. Especially if there is no additional schema information, or just a DTD at hand, casts are likely to occur frequently.
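For intuition only, here is a tiny stand-alone sketch of what Q5 has to do: select closed auctions and cast the string content of their price elements to a number before comparing. The element names follow the XMark DTD; the file name and the use of Python's ElementTree are assumptions of this illustration, not how the parallel system evaluates the query.

    import xml.etree.ElementTree as ET

    def q5_count(xmark_file, threshold=40.0):
        """How many sold items cost more than `threshold`? (Q5-style evaluation.)"""
        root = ET.parse(xmark_file).getroot()          # XMark document root: <site>
        count = 0
        for auction in root.findall('./closed_auctions/closed_auction'):
            price = auction.findtext('price')
            if price is not None and float(price) > threshold:   # string -> number cast
                count += 1
        return count

    # Usage: print(q5_count('xmark20M.xml'))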

Figure 5. Q5 speedup of 20M document    Figure 6. Q5 speedup of 40M document
Figure 7. Q5 speedup of 60M document    Figure 8. Q5 speedup of 80M document
Figure 9. Q5 speedup of 100M document   Figure 10. Q5 scaleup

The speedup curves of Q5 under the two partition strategies are shown in Fig. 5 to Fig. 9. It is easy to see that in this case both strategies have good speedup performance, which demonstrates that our native XML database has good support for casting primitives. Q5 contains only a single path expression, so there is little cost for combining results, and in most cases the local query results under both strategies are already the final results. Therefore, both strategies have good speedup performance.

The scaleup curves of Q5 for both WIN and PSPIB are shown in Fig. 10. One site tests the 20M document, two sites the 40M document, three sites the 60M document, four sites the 80M document and five sites the 100M document. From the figure we can clearly see that, with the increasing number of sites, the scaleup curve of WIN decreases smoothly and stays close to linear. On the contrary, as the number of sites goes up, the scaleup curve of PSPIB changes irregularly and exhibits a vibrating trend. So we can draw the conclusion that, for this query model, WIN has better scalability than PSPIB.

Next we give the performance analysis of Q17. This query tests how well the query processor deals with the semi-structured aspect of XML data, especially elements that are declared optional in the DTD. The query requirement is shown below:

Which persons don't have a homepage?

Figure 11. Q17 speedup of 20M document    Figure 12. Q17 speedup of 40M document
Figure 13. Q17 speedup of 60M document    Figure 14. Q17 speedup of 80M document
Figure 15. Q17 speedup of 100M document   Figure 16. Q17 scaleup

Homepage is an optional segment in the definition of the DTD. The fraction of people without a homepage is rather high, so this query also presents a challenging path traversal to non-clustering systems.

The speedup curves of Q17 under the two partition strategies are shown in Fig. 11 to Fig. 15. When the documents are small, PSPIB and WIN have similar speedup, but WIN is more stable. With increasing document size, nevertheless, the speedup of WIN is obviously better than that of PSPIB. The reason is that this query contains a twig query. There are efficient twig join methods [3, 5, 6] for the single-processor environment, and this query is decomposed into several sub-queries through query rewriting. When the documents are small, the communication cost is low and there is only a slight difference in speedup, while with increasing document size PSPIB incurs a large communication cost and WIN incurs little. Thus WIN has better speedup performance than PSPIB in this case.

The scaleup curves of Q17 for both WIN and PSPIB are shown in Fig. 16. From the figure we can clearly see that, with the increasing number of sites, the scaleup curve of PSPIB changes irregularly. As to the scaleup of WIN, the case of three processors is an exception point; apart from this point WIN shows the better tendency. Therefore, WIN has better scaleup performance in this case.

Besides the above query expressions, the query set includes many complex query expressions that contain multiple query paths. Finally, we give the performance analysis of Q20. This query is about aggregation: it computes a simple aggregation by assigning each person to a category. Note that the aggregation is truly semi-structured, as it also includes those persons for whom the relevant data is not available.

5 Conclusions and Future Work

In this paper, we proposed the concept of INode and developed a novel workload-aware data placement strategy, WIN, which partitions the XML data tree in parallel XML database systems. Extensive experimental results show that our WIN strategy has much better speedup and scaleup performance than the previous data placement strategy PSPIB, which is based on the inherent structure of XML.

As for future work, we plan to further investigate workload-aware data placement strategies and exploit different strategies for different requirements. There are also many challenges in extending this strategy to the replacement problem that arises when the XML document is updated or the query set is changed.

Acknowledgement

This work is partially supported by the CUHK strategic grant (numbered 4410001).

References

[1] S. Al-Khalifa, H. V. Jagadish, N. Koudas, J. M. Patel, D. Srivastava, and Y. Wu. Structural joins: A primitive for efficient XML query pattern matching. In Proceedings of ICDE'02, 2002.
[2] H. Boral et al. Prototyping Bubba, a highly parallel database system. IEEE Transactions on Knowledge and Data Engineering, 2:4-24, 1990.
[3] N. Bruno, N. Koudas, and D. Srivastava. Holistic twig joins: Optimal XML pattern matching. In Proceedings of SIGMOD'02, 2002.
[4] C.-Y. Chan, M. Garofalakis, and R. Rastogi. RE-Tree: An efficient index structure for regular expressions. In Proceedings of VLDB'02, 2002.
[5] Z. Chen, H. V. Jagadish, F. Korn, N. Koudas, S. Muthukrishnan, R. Ng, and D. Srivastava. Counting twig matches in a tree. In Proceedings of ICDE'01, 2001.
[6] H. Jiang, W. Wang, H. Lu, and J. X. Yu. Holistic twig joins on indexed XML documents. In Proceedings of VLDB'03, 2003.
[7] H. Lu, G. Wang, G. Yu, Y. Bao, J. Lv, and Y. Yu. XBase: Making your gigabyte disk files queriable. In Proceedings of SIGMOD'02, 2002.
[8] J. Lv, G. Wang, J. X. Yu, G. Yu, H. Lu, and B. Sun. Performance evaluation of a DOM-based XML database: Storage, indexing and query optimization. In Proceedings of WAIM'02, 2002.
[9] D. Saccà and G. Wiederhold. Database partitioning in a cluster of processors. In Proceedings of VLDB'83, 1983.
[10] A. Schmidt, F. Waas, M. Kersten, M. J. Carey, I. Manolescu, and R. Busse. XMark: A benchmark for XML data management. In Proceedings of VLDB'02, 2002.
[11] Teradata Corporation. DBC/1012 Data Base Computer System Manual. Teradata Corporation Document, Release 2.0, 1985.
[12] G. Yu, K. Kaneko, G. Bai, and A. Makinouchi. Transaction management for a distributed store system WAKASHI: design, implementation and performance. In Proceedings of ICDE'96, 1996.
[13] Y. Yu, G. Wang, G. Yu, G. Wu, J. Hu, and N. Tang. Data placement and query processing based on RPE parallelisms. In Proceedings of COMPSAC'03, 2003.
