Box-Trees and R-trees with Near-Optimal Query Time

Pankaj K. Agarwal†   Mark de Berg‡   Joachim Gudmundsson§   Mikael Hammar§   Herman J. Haverkort‡

The work by P.A. is supported by Army Research Office MURI grant DAAH[...], by a Sloan fellowship, by NSF grants EIA[...], EIA[...], and CCR[...], and by grant [...] from the U.S.-Israeli Binational Science Foundation. The work by H.H. is supported by the Netherlands Organization for Scientific Research (NWO).

† Center for Geometric Computing, Department of Computer Science, Box [...], Duke University, Durham, NC [...], USA.
‡ Institute of Information and Computing Sciences, Utrecht University, P.O. Box [...], [...] TB Utrecht, the Netherlands.
§ Department of Computer Science, Lund University, Box [...], [...] Lund, Sweden.

Abstract

A box-tree is a bounding-box hierarchy that uses axis-aligned boxes as bounding volumes. The stabbing number of a box-tree with respect to a given type of query is the maximum number of nodes visited when answering such a query. We describe several new algorithms to construct box-trees with small stabbing number with respect to queries with axis-parallel boxes and with points. We also prove lower bounds on the worst-case stabbing number for box-trees, which show that our results are optimal or close to optimal. Finally, we present algorithms to convert box-trees to R-trees, resulting in R-trees with (almost) optimal stabbing number.

Introduction

Motivation and problem statement. Preprocessing a set $S$ of geometric objects in $\mathbb{R}^d$ for answering window queries (report all objects of $S$ that intersect a query $d$-rectangle) is central to many applications and has been widely studied in several areas, including computational geometry, computer graphics, spatial databases, GIS, and robotics. In order to expedite and simplify the data structure, a window query is answered in two steps. In the first step, called the filtering step, each object is replaced by the smallest box containing the object, and the query procedure computes the bounding boxes that intersect the query window. (Instead of boxes, other simple shapes such as spheres, ellipsoids, and cylinders have also been used.) The second step, called the refinement step, extracts the actual objects among these bounding boxes that intersect the query window. A few recent results show that, under certain reasonable assumptions on the input objects, the number of bounding boxes intersecting a query window is not much larger than the number of objects intersecting the window, which makes this approach quite attractive; see the paper by Zhou and Suri and the references therein. There has been much work on the filtering step, and we also focus on this step. More precisely, we wish to preprocess a set $S$ of $n$ $d$-rectangles in $\mathbb{R}^d$ so that all rectangles of $S$ intersecting a query $d$-rectangle can be reported efficiently. We will refer to this query as the rectangle-intersection query. A related query is the rectangle-containment query, in which we want to report all rectangles in $S$ that contain a query point.

Footnote 1. A $d$-rectangle, also called a $d$-dimensional hyperrectangle, is the product of $d$ intervals $[a_1, b_1] \times \cdots \times [a_d, b_d]$.

A number of data structures with good provable bounds have been proposed for answering rectangle-intersection queries. Unfortunately, they are of limited practical use because the amount of storage they use is rather high: $O(n \log n)$ storage, and even $O(n)$ storage with a large hidden constant, are often unacceptable. Therefore, in practice one usually uses simpler data structures. A commonly used structure for answering rectangle-intersection queries, rectangle-

containment queries, and in fact many other types of queries, is the bounding-box hierarchy, sometimes also called AABB-tree: this is a tree $T$ in which each leaf is associated with a rectangle of $S$ and each interior node $\nu$ is associated with the smallest box $B_\nu$ enclosing all the rectangles stored at the leaves of the subtree rooted at $\nu$. All the rectangles of $S$ intersecting a query rectangle $R$ are reported by traversing $T$ in a top-down manner. Suppose the query procedure is visiting a node $\nu$. If $B_\nu \cap R = \emptyset$, there is nothing to do. If $B_\nu \subseteq R$, then it reports all rectangles stored in the subtree rooted at $\nu$. Finally, if $B_\nu \cap R \neq \emptyset$ but $B_\nu \not\subseteq R$, it recursively visits the children of $\nu$. We say that $R$ crosses a node $\nu$ if $B_\nu \cap R \neq \emptyset$ and $B_\nu \not\subseteq R$. If the fanout of $T$ is bounded, then the query time is proportional to the number of nodes of $T$ that $R$ crosses plus the number of rectangles reported. We define the stabbing number of $T$, denoted $S(T)$, to be the maximum number of its nodes crossed by a rectangle. It is therefore desirable to construct a bounding-box hierarchy with small stabbing number.

In many applications, especially database applications, the set $S$ is too large to fit in main memory, and is therefore stored on disk. In that case the main goal is to minimize the number of disk accesses needed to answer a window query, and the performance of an algorithm is analyzed under the standard external-memory model. This model assumes that each disk access transmits a contiguous block of $t$ units of data in a single input/output operation (or I/O). The efficiency of a data structure is measured in terms of the amount of disk space it uses (measured in units of disk blocks), the number of I/Os required to answer a query, and the number of I/Os needed to construct the data structure. In the context of bounding-box hierarchies, several schemes have been proposed that construct a tree as above but in which the fanout of each node depends on $t$. Some notable examples of external-memory bounding-box hierarchies are various variants of R-trees; see the survey paper. We can still define the crossing nodes and the stabbing number as earlier, and one can argue that the number of I/Os needed to answer a query is proportional to the stabbing number plus the output size.

In this paper we study the problem of constructing bounding-box hierarchies, both in main and external memory, that have low stabbing number.

Previous results. As noted above, several efficient data structures have been proposed for answering a rectangle-intersection query. For example, Chazelle showed that a compressed range tree can be used to answer a $d$-dimensional rectangle-intersection query in time $O(\log^{d-1} n + k)$ using $O(n \log^{d-1} n / \log\log n)$ space. This data structure is too complex to be practical even in $\mathbb{R}^2$. Since a $d$-rectangle in $\mathbb{R}^d$ can be mapped to a point in $\mathbb{R}^{2d}$, we can use a kd-tree to construct a data structure of $O(n)$ size that can answer a $d$-dimensional rectangle-intersection query in time $O(n^{1-1/2d} + k)$. A number of heuristics based on kd-trees have also been proposed to answer rectangle-intersection queries. Several papers describe how to construct bounding-box hierarchies or other bounding-volume hierarchies (for example, using spheres or $k$-DOPs as bounding volumes), but they do not obtain bounds on the worst-case query complexity.

Footnote 2. Barequet et al. gave an algorithm to construct a bounding-box hierarchy in $\mathbb{R}^2$, and they claimed that if the rectangles in $S$ are pairwise disjoint then any point lies in $O(\log n)$ bounding boxes. But the argument presented in the paper has a technical problem.

Some of the most widely used external-memory bounding-box hierarchies are the R-tree and its variants. An R-tree, originally introduced by Guttman, is a B-tree each of whose leaves is associated with an input rectangle. All leaves of an R-tree are at the same level, the degree of all internal nodes except the root is between $t$ and $2t-1$ for a given parameter $t$, and the degree of the root varies between $2$ and $2t-1$. We will refer to $t$ as the minimum degree of the tree. Although several methods have been proposed for ordering the input rectangles along the leaves (varying from simple heuristics to space-filling curves) to minimize the stabbing number, none of them guarantees the worst-case performance. In the worst case, a linear number of bounding boxes might intersect a query rectangle even if it intersects only $O(1)$ input rectangles. The only analytical result is by Faloutsos et al., but they prove bounds on the query time only in the 1-dimensional case when the input intervals are

uniformly distributed and have at most two different lengths. Recently, de Berg et al. described an algorithm for constructing an R-tree on rectangles in $\mathbb{R}^2$ so that all $k$ rectangles containing a query point can be reported in $O([\ldots])$ I/Os. Here $\rho$ is the ratio of the maximum and the minimum $x$-lengths of the input rectangles, and $\sigma$ is the point-stabbing number of $S$, that is, $\sigma$ is the maximum number of input rectangles containing any point in the plane. They also described another algorithm for constructing an R-tree that can answer a rectangle-intersection query in $O([\ldots])$ I/Os, where $\rho$ and $\sigma$ are as above and $w$ is the ratio of the $x$-length of the query rectangle to the smallest $x$-length of an input rectangle.

Our results. In this paper we first describe several algorithms for constructing box-trees, and we prove lower bounds on the worst-case stabbing number of box-trees. Our first algorithm constructs a box-tree on a set of $n$ possibly intersecting $d$-rectangles in $\mathbb{R}^d$ so that a rectangle-intersection query can be answered in $O(n^{1-1/d} + k)$ time. Our lower bound shows that this is optimal.

In the plane, we show how to construct a box-tree that, if the input is disjoint (or, more generally, if the point-stabbing number $\sigma$ of the input is small), still has almost optimal query time for rectangle-intersection queries but much better query times for point queries. More precisely, the time for rectangle-intersection queries is $O(\sqrt{n}\,\log n + \sqrt{\sigma}\,\log^2 n + k)$ and the time for point queries is $O(\sqrt{\sigma}\,\log^2 n + k)$. We also develop a box-tree with $O([\ldots] + k)$ query time when the aspect ratio of the query rectangles is small. One would hope that similar improvements are possible in higher dimensions. One of our lower-bound results shows that this is not possible: in dimensions $d > 2$, the $\Omega(n^{1-1/d} + k)$ lower bound on the query complexity holds even for hypercubes as query ranges, and any bounding-box hierarchy that achieves this query time cannot have a better worst-case query time for point queries, even when the input consists of disjoint almost-unit hypercubes.

Finally, we give general methods to convert box-trees with small stabbing number into R-trees with small stabbing number. When we apply these results to our box-trees, we improve the result of de Berg et al.: our query complexity does not depend on the parameter $w$ (which makes their query complexity linear in the worst case), and it is linear in $\sqrt{\sigma}$ instead of $\sigma$. We also introduce the concept of semi-R-trees: these are similar to ordinary R-trees (the degree of each internal node except for the root is between $t$ and $2t-1$ for some given parameter $t$), except that the leaves do not have to be at the same level. We give a general algorithm to convert a box-tree with small stabbing number into a semi-R-tree with small stabbing number; the stabbing number obtained here is better than that for R-trees. This leads to semi-R-trees with almost optimal stabbing number.

Lower Bounds

In this section we give lower bounds on the stabbing number of semi-R-trees of minimum degree $t$ in various settings. Since semi-R-trees are more general than R-trees, the same bounds hold for R-trees. By choosing $t = 2$ we obtain lower bounds for box-trees.

We start with a simple generalization of the 2-dimensional lower bound given by de Berg et al. It is obtained by putting $n$ unit hypercubes in an $n^{1/d} \times \cdots \times n^{1/d}$ grid; the argument showing that this gives the stated bound is given in the full paper.

Theorem. For any $n$ and $d > [\ldots]$, there is a set $S$ of $n$ disjoint unit hypercubes in $\mathbb{R}^d$ with the following property: for any semi-R-tree $T$ of minimum degree $t$, there is a query box not intersecting any box from $S$ such that a query with that box visits $\Omega((n/t)^{1-1/d})$ nodes in $T$.

Next we describe a construction that proves lower bounds on rectangle-containment queries and that will also be useful for a number of other cases. For any $\varepsilon > 0$, we call a $d$-rectangle an $\varepsilon$-hypercube if the length of each edge is between $1$ and $1 + \varepsilon$. We fix a parameter [...] and construct a set $S = \{b_1, \ldots, b_n\}$ of $n$ $\varepsilon$-hypercubes in $\mathbb{R}^d$. We also construct two sets of query points $Q_1$ and $Q_2$, called primary and secondary point sets, that lie in the common exterior of the rectangles in $S$ and have the following crucial property: for any semi-R-tree $T$ on $S$ with minimum degree $t$, either a point of $Q_1$ lies in at least [...] bounding

boxes of $T$, or a point of $Q_2$ lies in $\Omega([\ldots])$ bounding boxes of $T$. We first describe the set $S$ and then construct the point sets.

Let $n_1, \ldots, n_{2d}$ be the outward normals of a $d$-rectangle. We can pair these normals into $d$ pairs $(n_1, n_2), \ldots, (n_{2d-1}, n_{2d})$ so that no pair contains opposite normals, that is, $n_{2i-1} \neq -n_{2i}$ for $1 \le i \le d$. Let $h_i$ be the plane spanned by the vectors $n_{2i-1}$ and $n_{2i}$ and containing the origin. Let $b$ be a $d$-rectangle containing the origin. Since $n_{2i-1} \neq -n_{2i}$, the facets $f_{2i-1}, f_{2i}$ of $b$ normal to $n_{2i-1}$ and $n_{2i}$, respectively, share a $(d-2)$-face $\hat{f}_i$, which is orthogonal to the plane $h_i$. The intersection of $\hat{f}_i$ and $h_i$ is a point $c_i$. Conversely, by specifying a point $c_i$ on each $h_i$, $1 \le i \le d$, we can represent a unique $d$-rectangle in which $c_i$ lies on the facets normal to $n_{2i-1}$ and $n_{2i}$. We will therefore define each rectangle $b_j \in S$ by a $d$-tuple $(c_j^1, \ldots, c_j^d)$, where the facets of $b_j$ whose outward normals are $n_{2i-1}$ and $n_{2i}$ pass through $c_j^i$. We next describe how to choose the points $c_j^i$ for $1 \le i \le d$ and $1 \le j \le n$.

For each plane $h_i$ we choose a line $\ell_i$ of slope [...]; the exact equation of $\ell_i$ will be specified below. We will refer to $h_1$ as the primary plane, and every $h_i$ with $i \ge 2$ will be called a secondary plane. Set [...]. We place $n$ points $p_1, \ldots, p_n$ on $\ell_1$, sorted along $\ell_1$, and set $c_j^1 = p_j$ for every $1 \le j \le n$. For each [...] we place [...] points on $\ell_i$, and assign the $c_j^i$ to these points as follows. Let $\langle j_{[\ldots]} \cdots j_{[\ldots]} \rangle$ be the representation of $j \bmod [\ldots]$ in radix [...], that is, $j \bmod [\ldots] = \sum [\ldots]$. We set $c_j^i = p^i_{[\ldots]}$. Note that [...] points have the same value of $c_j^i$. Finally, we choose [...] and the points on $\ell_i$ so that each $b_j$ is an $\varepsilon$-hypercube. The easy details are omitted from this abstract.

Finally, we choose a set $Q_1$ of $n[\ldots]$ points on the primary plane $h_1$ and a set $Q_2$ of [...] points on the secondary planes, as follows. Suppose $h_1$ is the $x_1x_2$-plane. For each $j$, [...], we choose the point $q_j = (x_{[\ldots]}(p_{[\ldots]}), x_{[\ldots]}(p_{[\ldots]}))$ and add it to $Q_1$. In other words, if we regard the points on $\ell_1$ as a staircase, $Q_1$ is the set of concave corners of the staircase. In order to construct $Q_2$, we repeat the same step for each of the secondary planes, thus obtaining [...] points on each of them. These points will be on the boundary of some of the input boxes, but we can shift them a little to make them disjoint from all input boxes.

Lemma. Let $T$ be any semi-R-tree of minimum degree $t$ on the set $S$ constructed above. Then either there is a primary query point contained in [...] bounding boxes stored in $T$, or one of the secondary query points is contained in $\Omega([\ldots])$ bounding boxes stored in $T$.

Proof. We prove the lemma for box-trees; the generalization to semi-R-trees is described in the full paper. Suppose that all primary query points are contained in less than [...] bounding boxes stored in the interior nodes in $T$. Then the number of incidences between these points and the bounding boxes of interior nodes is at most [...]$n$. Since there are $n[\ldots]$ interior nodes in $T$, they store at least [...]$n$ bounding boxes that contain less than [...] primary query points. Observe that a bounding box for input boxes $b_j, b_{j'} \in S$ contains $|j - j'|$ primary query points, because there are that many concave corners in the staircase between corners $c_j^1$ and $c_{j'}^1$. We conclude that there are at least [...]$n$ bounding boxes that store boxes $b_j, b_{j'}$, and perhaps some more boxes, with $|j - j'| < [\ldots]$. But if $|j - j'| < [\ldots]$, then $j \not\equiv j' \pmod{[\ldots]}$, so [...]. This implies that there is at least one $i$ with $2 \le i \le d$ such that $c_j^i \neq c_{j'}^i$. Hence the bounding box storing $b_j, b_{j'}$ will contain one of the secondary query points. So in total we have at least [...]$n$ incidences between secondary query points and bounding boxes, so one of the [...] secondary query points is contained in $\Omega([\ldots])$ bounding boxes. □

We can use this lemma to prove lower bounds for several settings. We start with a lower bound for point queries.

Theorem. For any $n$, $d > [\ldots]$, and $\varepsilon > 0$, there is a set $S$ of $n$ $\varepsilon$-hypercubes in $\mathbb{R}^d$ with the following property: for any semi-R-tree $T$ of minimum degree $t$, there is a point not contained in any box from $S$ such that a query with that point visits $\Omega((n/t)^{1-1/d})$ nodes in $T$.

Next we modify the above construction so that the same bound can be achieved in $d \ge 3$ even if the

input consists of a set of $n$ disjoint $\varepsilon$-hypercubes and the queries are $\varepsilon$-hypercubes. The idea is to use the construction above in $(d-1)$-dimensional space, and use the remaining dimension to turn the intersecting $(d-1)$-dimensional boxes into disjoint $d$-dimensional boxes; details are in the full paper.

Theorem. For any $n$, $d > 2$, and $\varepsilon > 0$, there is a set $S$ of $n$ disjoint $\varepsilon$-hypercubes in $\mathbb{R}^d$ with the following property: for any semi-R-tree $T$ of minimum degree $t$, there is an $\varepsilon$-hypercube not intersecting any box from $S$ such that a query with that hypercube visits $\Omega((n/t)^{1-1/d})$ nodes in $T$.

Finally, we observe that the proof of the preceding theorem actually shows that in higher dimensions any semi-R-tree with small (say, polylogarithmic) query complexity for points must have large (near-linear) query complexity for ranges. More precisely, it shows the following result.

Theorem. For any $n$, $d > 2$, and $\varepsilon > 0$, there is a set $S$ of $n$ disjoint $\varepsilon$-hypercubes in $\mathbb{R}^d$ with the following property: for any semi-R-tree $T$ of minimum degree $t$, if the number of nodes visited by any point query is [...], then there is an $\varepsilon$-hypercube not intersecting any box from $S$ such that a query with that hypercube visits $\Omega([\ldots])$ nodes in $T$.

From kd-trees to Box-Trees

In this section we describe and analyze several methods to construct box-trees using kd-trees. For convenience we will allow nodes of degree up to [...]; it is easy to convert these trees to binary trees without affecting the asymptotic bounds on the stabbing numbers.

The configuration-space approach

The basic method. Let $S$ be a set of $n$ arbitrary, possibly overlapping, $d$-rectangles in $\mathbb{R}^d$, which we call the workspace. As noted in the Introduction, a $d$-rectangle $b = \prod_{i=1}^{d} [x_i^-(b) : x_i^+(b)]$ can be represented by a point $(x_1^-(b), x_2^-(b), \ldots, x_d^-(b), x_1^+(b), x_2^+(b), \ldots, x_d^+(b))$ in $\mathbb{R}^{2d}$, which we call the configuration space. We build a $2d$-dimensional kd-tree on these tuples. To transform the kd-tree in configuration space into a box-tree in workspace, proceed as follows. Replace the representative tuple in each leaf by the corresponding input box. Then, going bottom-up, store in each internal node the bounding box of its children. We call the resulting box-tree a configuration-space box-tree, or cs-box-tree for short.

For the analysis of the range stabbing number in the resulting cs-box-tree we need the following fact about kd-trees, given here without proof.

Lemma. The number of cells at depth $i$ in a $d$-dimensional kd-tree that intersect an axis-parallel $f$-flat ($0 \le f \le d$) is $O(2^{if/d})$.

A kd-tree, and hence our box-tree, has the following property: the numbers of objects stored in the two subtrees of any given node differ by at most one. We call such trees perfectly balanced. That our box-tree is perfectly balanced will be advantageous when we convert it to an R-tree. We can now analyze the range stabbing number in a cs-box-tree.

Lemma. Let $S$ be a set of $n$ possibly intersecting boxes in $\mathbb{R}^d$. There is a perfectly balanced box-tree for $S$ such that the number of nodes at level $i$ that are visited by a range query with an axis-aligned box is $O(2^{i(1-1/d)} + k)$, where $k$ is the number of boxes in $S$ intersecting the query range.

Proof. Let $Q = \prod_{i=1}^{d} [x_i^-(Q) : x_i^+(Q)]$ be a query range. We can restrict our attention to the interior nodes visited, since the number of visited leaves is at most one more. We distinguish two types of visited interior nodes $\nu$. The first type is where at least one of the input boxes stored in the subtree of $\nu$ intersects $Q$. Obviously there are only $O(k)$ such nodes at a given level $i$. The second type is where all input boxes in the subtree of $\nu$ are disjoint from $Q$. Any input box disjoint from $Q$ must be separated from $Q$ by a hyperplane through a facet of $Q$. Clearly, not all input boxes can be separated from $Q$ by the same hyperplane; otherwise the bounding box of $\nu$ would not intersect $Q$ and $\nu$ would not be visited. Hence there are at least two such hyperplanes separating $Q$ from an input box in the subtree of $\nu$.

Assume w.l.o.g. that $x_i = x_i^-(Q)$ is one of these separating hyperplanes, and let $b$ be the input box

that it separates from $Q$. Then we must have $x_i^+(b) < x_i^-(Q)$. But there must also be a box $b'$ with $x_i^+(b') \ge x_i^-(Q)$; otherwise the bounding box of $\nu$ would not intersect $Q$. We can conclude that the points representing $b$ and $b'$ in the configuration space lie on opposite sides of the hyperplane $x_i^+ = x_i^-(Q)$. This implies that the hyperplane $x_i^+ = x_i^-(Q)$ intersects the cell in configuration space of the node in the kd-tree corresponding to $\nu$.

We can apply the same argument to the second hyperplane separating $Q$ from an input box (the hyperplane $x_j = x_j^+(Q)$, for example) to show that there is a hyperplane in configuration space with points on opposite sides ($x_j^- = x_j^+(Q)$ in the example).

We can conclude the following. Suppose $Q$ visits a node $\nu$ of the second type. Then in configuration space there is a pair of hyperplanes, both of the form $x_i^+ = x_i^-(Q)$ or $x_i^- = x_i^+(Q)$, and both intersecting the cell in configuration space of the node in the kd-tree corresponding to $\nu$. But then the cell must also be intersected by the $(2d-2)$-flat that is the intersection of these two hyperplanes. By the kd-tree lemma above, there are only $O(2^{i(2d-2)/2d}) = O(2^{i(1-1/d)})$ such nodes at level $i$. □

This leads directly to the following theorem.

Theorem. Let $S$ be a set of $n$ possibly intersecting boxes in $\mathbb{R}^d$. There is a perfectly balanced box-tree for $S$ such that the number of nodes visited by a range query with an axis-aligned box is $O(n^{1-1/d} + k \log n)$, where $k$ is the number of boxes in $S$ intersecting the query range.

Improving the query time. We now show how to reduce the $O(k \log n)$ term in the query complexity to $O(k)$. The idea is the same as in a priority search tree: input elements (boxes in our case) that have a high chance of being an answer are pushed to high levels in the tree. In our case this means that we store boxes that extend the farthest in some direction high in the tree. More precisely, the construction of the tree $T$ for a set $S$ of boxes in $\mathbb{R}^d$ is as follows.

If $|S| = 1$, then $T$ consists of a single leaf node storing the input box in $S$. Otherwise, we make a node $\nu$ storing the bounding box $B$ of all boxes in $S$, and we proceed as follows. For each of the $2d$ inner normals of the facets of $B$, take the box from $S$ that extends furthest in the direction of that normal. This results in a set $S^*$ of at most $2d$ boxes. Each box in $S^*$ is put in a so-called priority leaf, which is an immediate child of $\nu$. If the set $S \setminus S^*$ of remaining boxes contains less than two boxes, then this box (if it exists) is put as a leaf child of $\nu$. If two or more boxes remain, we split the set of boxes into two almost equal-sized subsets with a hyperplane in configuration space. This hyperplane is orthogonal to the cutting direction of the level we are working at; as before, the cutting directions of the levels cycle through the $2d$ directions in configuration space. The subset of boxes whose representative points lie to one side of the cutting hyperplane is stored recursively in one subtree of $\nu$. The subset of boxes whose representative points lie to the other side of the cutting hyperplane is stored recursively in another subtree of $\nu$.

Next we analyze the query complexity of the tree resulting from this construction, which we call a cs-priority-box-tree. In our analysis we bound the number of visited nodes of a given weight, where the weight of a node is defined as the number of input boxes stored in its subtree. This will be useful when we convert this box-tree into a semi-R-tree.

Lemma. The number of nodes of weight at least $w$ visited by a query with a query box $Q$ is $O((n/w)^{1-1/d} + k)$.

Proof. Let $Q = \prod_{i=1}^{d} [x_i^-(Q) : x_i^+(Q)]$. We can restrict our attention to the visited nodes of weight at least $2d$, as the total number of visited nodes is at most a constant times larger than this number. Let $\nu$ be such a visited node of weight at least $2d$. There are two cases.

The first case is where one of the priority leaves directly below $\nu$ stores a box intersecting $Q$. Clearly there are at most $k$ such nodes. The second case is when all priority leaves directly below $\nu$ store boxes disjoint from $Q$. Thus each such box is separated from $Q$ by a hyperplane through a facet of $Q$. We claim that not all boxes can be separated by the same hyperplane. Suppose for a contradiction that there is a facet $f$ whose containing hyperplane separates all boxes of the priority leaves from $Q$. Then

in particular it would separate the box that extends furthest in the direction of the inner normal of the facet $f$, contradicting that $Q$ intersects the bounding box stored at $\nu$. So we have two distinct hyperplanes through facets of $Q$ separating a box in the subtree of $\nu$ from $Q$.

The box-tree that we have constructed basically corresponds to a kd-tree in configuration space, as before. Because of the priority leaves, the tree in configuration space is, strictly speaking, not a kd-tree, but it is easy to see that the kd-tree lemma on cells intersecting an $f$-flat still holds. Moreover, there is still a one-to-one correspondence between nodes of the box-tree and nodes of the kd-tree in configuration space. Hence we can use the fact that there are two distinct hyperplanes through facets of $Q$ separating a box in the subtree of $\nu$ from $Q$ in the same way as in the proof of the lemma for cs-box-trees: it implies that there is a $(2d-2)$-flat in configuration space, defined by a pair of facets of $Q$, intersecting the cell in the kd-tree corresponding to $\nu$. It follows that the total number of nodes to which the second case applies at a given level $i$ is $O(2^{i(1-1/d)})$.

To finish the proof, observe that nodes at the lowermost $\lfloor \log w\,[\ldots]\,d \rfloor$ levels have weight less than $w$. Adding the bounds for the second case on the remaining levels, we get $\sum_{i=0}^{\lceil \log n \rceil - \lfloor \log w\,[\ldots]\,d \rfloor} O(2^{i(1-1/d)}) = O((n/w)^{1-1/d})$. □

The following theorem follows directly.

Theorem. Let $S$ be a set of $n$ possibly intersecting boxes in $\mathbb{R}^d$. There is a box-tree for $S$ such that the number of nodes that are visited by a range query with an axis-aligned box is $O(n^{1-1/d} + k)$, where $k$ is the number of boxes in $S$ intersecting the query range.

The kd-interval-tree approach

The cs-box-tree of the previous section has optimal query complexity for point queries, and for range queries if the input consists of arbitrary intersecting boxes. Unfortunately, if the input boxes are disjoint, then the query complexity for point queries does not improve. In this section we develop a different box-tree, the kd-interval tree, whose query complexity is much better if the point-stabbing number $\sigma$ of the input set $S$ is small. The query complexity for range queries increases only slightly. This approach only works in two dimensions; the final theorem of the Lower Bounds section states that a similar result in more than two dimensions cannot be obtained.

The basic idea behind kd-interval trees is again to use a kd-tree, but this time in the workspace, which is now the plane. Since the objects in the workspace are rectangles, not points, many of them may intersect the cutting line. These boxes are taken out and handled separately, like in an interval tree. To make kd-interval trees more efficient we introduce priority leaves, like in the previous section.

The 1-dimensional case. First we describe how a set $S$ of boxes that all intersect a given line $\ell$ is handled. With a slight abuse of terminology, we call a tree for this case a 1-dimensional kd-interval tree. If $|S| = 1$, then $T$ consists of a single leaf node storing the input rectangle in $S$. Otherwise, we make a node $\nu$ storing the bounding box $B$ of all rectangles in $S$, and we proceed as follows. For each of the 4 inner normals of the edges of $B$, take the rectangle from $S$ that extends furthest in the direction of that normal. This results in a set $S^*$ of at most 4 rectangles. Each rectangle in $S^*$ is put in a so-called priority leaf. Consider the set of intersections of the edges of the remaining rectangles with $\ell$. Let $p$ be the median of these intersection points. The rectangles in $S \setminus S^*$ containing $p$ are stored in a subtree of $\nu$ that is a 2-dimensional cs-priority-box-tree, as described in the previous section. The rectangles in $S \setminus S^*$ completely to one side of $p$ are stored recursively in one subtree of $\nu$. The rectangles in $S \setminus S^*$ completely to the other side of $p$ are stored recursively in another subtree of $\nu$.

We call the nodes in the main 1-dimensional kd-interval tree 1D-nodes. Such a node corresponds to an interval on the defining line $\ell$. We call the nodes of the 2-dimensional cs-priority-box-trees cs-nodes.

We start by analyzing the query complexity when we query with a segment on the line $\ell$. The proof of the following lemma is omitted.

Lemma. If we query a 1-dimensional kd-interval tree storing a set $S$ of $n$ rectangles with a line segment on the defining line $\ell$, then we visit at most $O(\log n + k)$ nodes, where $k$ is the number of rectangles to be

Now consider a dimensional cspriorityboxtree

that is visited Supp ose the interval of the Dno de

that is the parent of this subtree is completely con

tained in Q Then we can argue again using the

priority leaves that we can charge all the visited

Q

no des to rectangles intersecting Q If the interval

of the Dno de that is the parent of this subtree

is not completely contained in the pro jection of Q

we argue as follows First we observe that the in

terval must then contain an endp oint of the pro jec

tion of Q so there are only O log n such parent

no des In the dimensional congurationspace b ox

tree b elow such a parent we can use Lemma to

Figure: Querying a 1-dimensional kd-interval tree with a box Q.

Next we analyze the query complexity when we query with a box.

Lemma. (i) If we query a 1-dimensional kd-interval tree storing a set S of n rectangles with a query box Q, then we visit at most $O(\sqrt{\ldots/w}\,\log n + k)$ nodes of weight at least w, where k is the number of rectangles to be reported.

(ii) If the projection of Q onto the line that stabs the rectangles in S contains the intersections of all rectangles with that line, then the query time reduces to $O(k)$.

Proof. (i) See the figure. If Q intersects the stabbing line, then the query is equivalent to querying with the intersection of Q and that line, so the result follows from the previous lemma. Otherwise, assume w.l.o.g. that the stabbing line is vertical and that Q lies to the right of it. Consider a 1D-node that is visited when we query with Q. When the interval corresponding to this node is completely contained in the projection of Q onto the stabbing line, then the rectangle in the subtree extending furthest to the right must be intersected. This rectangle is stored in a priority leaf immediately below the node, to which we can charge the visit of the node. Hence, there cannot be more than k such nodes. When the interval is not completely contained in the projection of Q, then it contains an endpoint of the projection of Q, and there are only $O(\log n)$ such nodes.

… bound the number of visited nodes of weight w by $O(\sqrt{n/w} + k)$, where n is the number of boxes stored in the configuration-space box-tree and k is the number of answers reported in this subtree. Note that $n = O(\ldots)$, since the configuration-space box-trees are used only to store sets of boxes that share a single point. Hence, the overall number of cs-nodes visited is $O(\sqrt{\ldots/w}\,\log n + k)$, finishing the proof of part (i) of the lemma.

(ii) The proof of the second part of the lemma can be found in the full version of this paper. □

The 2-dimensional case. Our kd-interval tree T for a general set S of rectangles in the plane is defined as follows. If $|S| = 1$, then T consists of a single leaf node storing the input box in S. Otherwise, we make a node storing the bounding box B of all boxes in S, construct priority nodes as in the 1-dimensional case, and split the remaining set of boxes by a vertical or horizontal line, depending on the level in the tree. This splitting line is chosen such that the number of rectangles lying completely to one side of it differs by at most one from the number lying completely to the other side. The rectangles lying completely to one side of the line are stored recursively in one subtree of the node. The rectangles to the other side are stored recursively in another subtree of the node. The rectangles intersecting the line are stored in a 1-dimensional kd-interval tree, as explained above. We call the nodes of the main tree, which correspond to 2-dimensional cells, 2D-nodes. Next we analyze the performance of the kd-interval tree.
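The recursive splitting step of the 2-dimensional kd-interval tree can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: priority nodes are omitted, the rectangles crossing the splitting line are kept in a plain list that merely stands in for the 1-dimensional kd-interval tree, and all endpoint coordinates are assumed to be distinct. The names `Node`, `balanced_split`, `build` and `query` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


@dataclass
class Node:
    bbox: Rect
    leaf: Optional[Rect] = None                          # set only for single-box leaves
    crossing: List[Rect] = field(default_factory=list)   # stand-in for the 1-D structure
    children: List["Node"] = field(default_factory=list)


def bounding_box(rects: List[Rect]) -> Rect:
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def balanced_split(rects: List[Rect], axis: int) -> float:
    """Return a coordinate c on the given axis (0 = x, 1 = y) such that the
    boxes lying entirely below c and entirely above c are equally many.
    Assumes distinct endpoints: c separates the n smallest of the 2n endpoints
    from the n largest, so both sides receive the same number of boxes."""
    ends = sorted([r[axis] for r in rects] + [r[axis + 2] for r in rects])
    n = len(rects)
    return (ends[n - 1] + ends[n]) / 2.0


def build(rects: List[Rect], depth: int = 0) -> Node:
    if len(rects) == 1:
        return Node(bbox=rects[0], leaf=rects[0])
    node = Node(bbox=bounding_box(rects))
    axis = depth % 2                      # round-robin: vertical line, then horizontal
    c = balanced_split(rects, axis)
    low = [r for r in rects if r[axis + 2] < c]
    high = [r for r in rects if r[axis] > c]
    node.crossing = [r for r in rects if r[axis] <= c <= r[axis + 2]]
    node.children = [build(part, depth + 1) for part in (low, high) if part]
    return node


def intersects(a: Rect, b: Rect) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def query(node: Node, q: Rect, out: List[Rect]) -> int:
    """Append all stored boxes intersecting q to out; return the number of
    tree nodes whose bounding box intersects q (the visited nodes)."""
    if not intersects(node.bbox, q):
        return 0
    if node.leaf is not None:
        out.append(node.leaf)
        return 1
    out.extend(r for r in node.crossing if intersects(r, q))
    return 1 + sum(query(child, q, out) for child in node.children)
```

Calling `build(rects)` on a non-empty list and then `query(root, q, out)` reports the boxes intersecting `q`. Because the crossing rectangles are scanned linearly here, the sketch does not achieve the query bounds analyzed in the text; it only shows how the balanced splitting line partitions the input.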

Lemma. The number of nodes of weight at least w that are visited by a range query with an axis-aligned box is $O(\sqrt{n/w}\,\log n + \sqrt{\ldots/w}\,\log^2 n + k)$, where k is the number of reported answers. The number of such nodes visited by a point query is $O(\sqrt{\ldots/w}\,\log^2 n + k)$.

Proof. Consider a 2D-node that is visited when we query with an axis-aligned rectangle Q. We distinguish four different types of such nodes. We bound their number and the number of nodes visited in 1-dimensional kd-interval subtrees separately.

Inner nodes. These are 2D-nodes whose bounding boxes lie completely inside Q. The number of inner nodes is easy to bound, since all rectangles in the subtree of such a node intersect Q. Hence, the total number of such nodes, or nodes in their 1-dimensional associated kd-interval trees, is $O(k)$.

Side nodes. These are 2D-nodes whose bounding boxes cut exactly one edge of Q. In this case, the rectangle that extends furthest in the direction of the inner normal of this edge must intersect Q. This rectangle is stored in a priority leaf immediately below the node. The same reasoning applies to their 1-dimensional associated kd-interval trees. Hence, the total number of side nodes, or nodes in their associated kd-interval trees, is $O(k)$.

Piercing nodes. These are 2D-nodes that cut two opposing edges of Q, but do not contain any corners of Q. From Lemma … and the fact that nodes at the lowermost $\lfloor \log w \ldots \rfloor$ levels of the tree must have weight less than w, we conclude that the number of 2D-nodes of weight at least w that intersect any edge of Q is bounded by $\sum_{i}^{\lceil \log n \rceil - \lfloor \log w \ldots \rfloor} O(\ldots^{i}) = O(\sqrt{n/w})$. Now there are two cases: the splitting line used at such a node is orthogonal to the intersected edges, or it is parallel to them. In the former case, we can apply Lemma … to obtain an $O(\log n + k)$ bound on the number of nodes visited in the 1-dimensional kd-interval tree associated with the node, where k is the number of reported answers. In the latter case, we can apply Lemma … (ii) to get a bound of $O(k)$. Hence, we get a grand total of $O(\sqrt{n/w}\,\log n + k)$.

Corner nodes. These are 2D-nodes that contain one or more corners of Q. There are $O(\log n)$ such nodes. To obtain the total number of visited nodes in the associated 1-dimensional kd-interval trees, we have to multiply this by the bound of Lemma …, leading to a total of $O(\sqrt{\ldots/w}\,\log^2 n + k)$.

There are no other types of nodes whose bounding boxes intersect Q. Adding up the number of nodes for all four cases gives the desired bound. □

This leads to the following theorem.

Theorem. Let S be a set of n possibly intersecting boxes in the plane, such that no single point is contained in more than … boxes. There is a box-tree for S such that the number of nodes visited by a range query with an axis-aligned box is $O(\sqrt{n}\,\log n + \sqrt{\ldots}\,\log^2 n + k)$, where k is the number of boxes in S intersecting the query range. The number of nodes visited by a point query is $O(\sqrt{\ldots}\,\log^2 n + k)$.

The longest-side-first approach

Recall that a kd-interval tree is basically a modified kd-tree, where each node is split by a line. The orientations of these lines depend on the level in the tree, in such a way that orientations take turns in a round-robin fashion on any path from the root down into the tree. An interesting variation of the kd-interval tree arises when we replace the round-robin splitting strategy by the longest-side splitting rule, as suggested by Dickerson et al. In such a longest-side-first kd-tree, the number of nodes whose corresponding cell is pierced by a query rectangle is small if the query rectangle is fat. We can use this to prove the following result.

Theorem. Let S be a set of n boxes in the plane with stabbing number …. There is a box-tree for S such that the number of nodes that are visited by a range query with a rectangular range of aspect ratio … is $O(\sqrt{\ldots}\,\log \ldots + k)$, where k is the number of boxes in S intersecting the query range. The number of such nodes visited by a point query is $O(\sqrt{\ldots}\,\log n + k)$.

From box-trees to semi-R-trees

In the previous section we described several algorithms to construct box-trees with good query complexity. In this section we give general theorems to convert them to semi-R-trees.
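As a rough illustration of what such a conversion can look like, the sketch below packs boxes, given in the leaf order of a box-tree, bottom-up into an R-tree of minimum degree t, in the style of B-tree bulk loading. It is a minimal sketch under assumptions: the leaf order is taken as an input list rather than read from an actual box-tree, every non-root node gets between t and 2t - 1 children (an assumed fan-out range), and the names `RNode`, `chunks`, `pack_in_leaf_order` and `leaves_in_order` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


@dataclass
class RNode:
    bbox: Rect
    children: list   # input boxes if height == 0, otherwise RNode objects
    height: int      # 0 for leaf nodes, so all leaves end up at the same depth


def bounding_box(rects: Sequence[Rect]) -> Rect:
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def chunks(items: list, t: int) -> List[list]:
    """Cut items into consecutive groups of size between t and 2t - 1.
    If fewer than t items remain in total, a single smaller group is
    returned (this can only become the root)."""
    m = len(items)
    g = max(1, m // t)                 # number of groups
    base, extra = divmod(m, g)         # first `extra` groups get one more item
    groups, start = [], 0
    for j in range(g):
        size = base + (1 if j < extra else 0)
        groups.append(items[start:start + size])
        start += size
    return groups


def pack_in_leaf_order(boxes: List[Rect], t: int) -> RNode:
    """Build an R-tree whose leaves contain the boxes in the given order."""
    assert t >= 2 and boxes
    level = [RNode(bounding_box(g), list(g), 0) for g in chunks(boxes, t)]
    while len(level) > 1:
        level = [RNode(bounding_box([c.bbox for c in g]), list(g), g[0].height + 1)
                 for g in chunks(level, t)]
    return level[0]


def leaves_in_order(node: RNode) -> List[Rect]:
    """Read the input boxes back from left to right."""
    if node.height == 0:
        return list(node.children)
    out: List[Rect] = []
    for child in node.children:
        out.extend(leaves_in_order(child))
    return out
```

Each node built this way covers a run of consecutive leaves of the original order, which is the property that leaf-order packing relies on. The depth is logarithmic in n with base at least t, since every level shrinks the number of nodes by a factor of at least t (up to rounding).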

We start with a general theorem that converts any box-tree to an R-tree. Recall that the weight of a box-tree node is the number of input boxes stored in its subtree.

Theorem. Let T be a box-tree for a set of n boxes in $\mathbb{R}^d$, such that any query with a range of a given type visits at most $f(w)$ nodes of weight w or more. Then T can be converted to an R-tree of minimum degree t, where every query with a range of the same type visits at most $O(f(t) \log n / \log t)$ nodes.

Proof. We simply read out the leaves from T in order, and then construct an R-tree where the boxes occur in the same order in the leaves, for example using an algorithm to construct B-trees.

Consider a bounding box B stored in the R-tree. It is the bounding box for some input boxes that were stored in consecutive leaves in the box-tree T. Call the lowest common ancestor of these leaves the ancestor node of B. Since the minimum degree in the R-tree is t, the weight of the ancestor node of B is t or more. Furthermore, the ancestor nodes for the bounding boxes B stored at a fixed level in the R-tree must be distinct, because their defining sets form a partition of the leaves in T into consecutive sequences. Hence, we can charge the visited nodes of the R-tree to visited nodes of weight t or more in T, in such a way that a node in T does not get charged more than once from nodes at a fixed level in the R-tree. Since the depth of the R-tree is $O(\log n / \log t)$, the bound follows. □

The construction of the theorem above results in losing a logarithmic factor in the query complexity. In the full paper we show how to improve this result for perfectly balanced box-trees. Recall that a box-tree is called perfectly balanced if, for any node, the weights of its left and right child differ by at most one.

Theorem. Let T be a perfectly balanced box-tree for a set of n boxes in $\mathbb{R}^d$, such that any query with a range of a given type visits at most $f(i)$ nodes at level i in T. Then T can be converted to an R-tree of minimum degree t, where every query with a range of the given type visits at most $O\big(\sum_{i}^{\log n / \log t} f(i \log t)\big)$ nodes.

Finally, we can show that we can also improve the general conversion theorem above if we are willing to settle for semi-R-trees instead of real R-trees. Recall that the difference between a semi-R-tree and an R-tree is that in the former we do not require all leaves to be at the same depth.

Theorem. Let T be a box-tree for a set of n boxes in $\mathbb{R}^d$, such that any query with a range of a given type visits at most $f(w)$ nodes of weight w or more. Then T can be converted to a semi-R-tree of minimum degree t, where every query with a range of the same type visits at most $O(f(t))$ nodes.

By applying the conversion algorithms of the theorems above to the structures from the previous section, we obtain the following results.

Corollary. Let S be a set of n boxes in $\mathbb{R}^d$ with stabbing number ….

(i) There is an R-tree for S of minimum degree t such that the number of nodes visited by any box query is $O\big(((n/t)^{\ldots d} + k) \log n / \log t\big)$, where k is the number of reported answers.

(ii) There is a semi-R-tree for S of minimum degree t such that the number of nodes visited by any box query is $O((n/t)^{\ldots d} + k)$.

(iii) When d = 2, there is a semi-R-tree for S of minimum degree t such that the number of nodes visited by any box query is $O(\sqrt{n/t}\,\log n + \sqrt{\ldots/t}\,\log^2 n + k)$, and the number of nodes visited by any point query is $O(\sqrt{\ldots/t}\,\log^2 n + k)$. In both bounds, k is the number of reported answers.

(iv) When d = 2, there is a semi-R-tree for S of minimum degree t such that the number of nodes visited by any query with a rectangle of aspect ratio … is $O(\sqrt{\ldots/t}\,\log n + k)$, where k is the number of reported answers.

(v) For the cases mentioned under (iii) and (iv), there is also an R-tree of minimum degree t for which the number of visited nodes is $O(\log n / \log t)$ times the number of visited nodes in the semi-R-tree.

References

P. K. Agarwal and J. Erickson. Geometric range searching and its relatives. In B. Chazelle, J. E. Goodman, and R. Pollack, editors, Advances in Discrete and Computational Geometry, Contemporary Mathematics. American Mathematical Society, Providence, RI.

A. Aggarwal and J. S. Vitter. The Input/Output complexity of sorting and related problems. Communications of the ACM.

G. Barequet, B. Chazelle, L. J. Guibas, J. S. B. Mitchell, and A. Tal. BOXTREE: A hierarchical representation for surfaces in 3D. Computer Graphics Forum (CGF).

M. de Berg, J. Gudmundsson, M. Hammar, and M. Overmars. On R-trees with low stabbing number. In Proc. European Symposium on Algorithms, LNCS.

M. de Berg, M. van Kreveld, M. Overmars, and O. Schwarzkopf. Computational Geometry: Algorithms and Applications. Springer-Verlag, Berlin.

M. de Berg, M. Overmars, and O. Schwarzkopf. Computing and verifying depth orders. SIAM J. Comput.

B. Chazelle. A functional approach to data structures and its use in multidimensional searching. SIAM J. Comput.

M. Dickerson, C. Duncan, and M. Goodrich. K-D trees are better when cut on the longest side. In Proc. European Symposium on Algorithms, LNCS.

T. Brinkhoff, H.-P. Kriegel, R. Schneider, and B. Seeger. Multi-step processing of spatial joins. In Proc. ACM-SIGMOD International Conference on Management of Data.

C. Faloutsos and I. Kamel. Packed R-trees using fractals. Report CS-TR, University of Maryland, College Park.

C. Faloutsos, T. Sellis, and N. Roussopoulos. Analysis of object oriented spatial access methods. In Proc. ACM-SIGMOD International Conference on Management of Data.

V. Gaede and O. Günther. Multidimensional access methods. ACM Comput. Surveys.

S. Gottschalk, M. C. Lin, and D. Manocha. OBB-Tree: a hierarchical structure for rapid interference detection. In ACM Computer Graphics Proceedings.

A. Guttman. R-trees: a dynamic indexing structure for spatial searching. In Proc. ACM SIGMOD International Conference on Management of Data.

P. M. Hubbard. Approximating polyhedra with spheres for time-critical collision detection. ACM Transactions on Graphics.

J. T. Klosowski, M. Held, J. S. B. Mitchell, H. Sowizral, and K. Zikan. Efficient collision detection using bounding volume hierarchies of k-DOPs. IEEE Transactions on Visualization and Computer Graphics.

S. Leutenegger, M. A. Lopez, and J. Edgington. STR: A simple and efficient algorithm for R-tree packing. In Proc. IEEE Internat. Conf. on Data Engineering.

J. Nievergelt and P. Widmayer. Spatial data structures: concepts and design choices. In M. van Kreveld, J. Nievergelt, T. Roos, and P. Widmayer, editors, Algorithmic Foundations of Geographic Information Systems, LNCS.

J. Orenstein. A comparison of spatial query processing techniques for native and parameter spaces. In Proc. ACM SIGMOD Conf. on Management of Data.

Y. Zhou and S. Suri. Analysis of a bounding box heuristic for object intersection. In Proc. Annual Symposium on Discrete Algorithms (SODA). To appear in Journal of the ACM.