Fast Algorithms for D Dominance Rep orting and

Counting

Qingmin Shi and Joseph JaJa

Institute for Advanced Computer Studies

Department of Electrical and Computer Engineering

University of Maryland Col lege Park MD

fqshijosephumiacsumdedug

January

Abstract

We present in this pap er fast algorithms for the D dominance rep orting and counting

problems and generalize the results to the ddimensional case Our D dominance rep orting

algorithm achieves O log n log log n f query time using O n log n space where f is the

numb er of p oints satisfying the query and is an arbitrary small constant For the D

dominance counting problem whichisequivalent to the D range counting problem our

 

algorithm runs in O log n log log n time using O nl og n log log n space

Intro duction

d

Let S be a of n points in Thedominancereporting problem is to store S in a data

structure so that the subset Q of p oints in S that dominate an arbitrarily given p oint q

can b e rep orted ecientlyGiven two p oints p p p p andq q q q we

  d   d

say p dominates q if and only if p q for all i d Another related problem is the

i i

dominancecounting problem In this problem the size k of Q only needs to b e rep orted

Many computational geometry problems such as the rectangle intersection problems can

b e reduced to the dominance search problem

In this pap er weprovide faster algorithms than previously known for b oth the D

dominance rep orting and counting problems Our mo del of computation is the RAM mo del

as mo died byFredman and Willard In this mo del it is assumed that eachword is of

w

size w and that the numb er of data elements n never exceeds that is w log nIn



addition arithmetic and bitwise logical op erations take constanttime

Before outlining our results we give a brief overview of related literature The D dom

inance rep orting is a sp ecial case of the D orthogonal range search problem Hence results



Supp orted in part by the National Science Foundation through the National Partnership for Advanced

Computational Infrastructure NPACI DoDMD Pro curement under contract MDAC and NASA

under the ESIP Program NCC

for orthogonal range search apply immediately to our problem In particular a class of

algorithms for orthogonal range rep orting uses constant time for each rep orted p ointand

p olylogarithmic search time see for example The b est known result for



D orthogonal range search is given by Alstrup and Bro dal whichachieves O n log n

space and O log n f query time Several other data structures use less space but

require larger query time In Chazelle gives two data structures one with query time



O log n f log log n f and using space O n log n log log n and the other with query



time O log n f log nf and using O n log n space The b est known dominance rep ort

ing algorithm is due to Makris and Tsakalidis whichfollows the approach of Chazelle

and Edelsbrunner and achieves linear space and O log n f query time

Unlike dominance rep orting which seems to b e inherently simpler than general orthogo

nal range rep orting the dominance counting problem is equivalent to the orthogonal range

counting problem if the dimension d is assumed to b e constant Indeed it is easy to show

that if a of size O sn exists for answering any ddimensional dominance

counting query in O tn time then any ddimensional range counting query can b e an

d d

swered with O sn space and using O tn query time Not many results are known for

the multidimensional dominance counting problem In Chazelle uses a compressed form

of the range to achieve O log n query time and O n space for the D dominance

counting problem This result easily leads to a solution to the D dominance counting with



O n log n space and O log n query time On the other hand we can reduce the counting

query problem to the aggregation range query problem in the same dimension by assign

ingaweight of to eachpoint Willard in shows howtocombine the



and its variant called the q heaps and the fractional cascading technique to achieve





O n log n space and O log n log log n query time to handle D aggregation range

queries where is arbitrarily small constant

In this pap er we establish the following results that provide faster algorithms for D

dominance rep orting and counting

An algorithm for threesided D range rep orting that achieves O n space and

O log n log log n f query time This result is similar to Willards mo dication of

priority tree but the algorithm is much simpler This algorithm plays an imp ortant

role in our D dominance rep orting algorithm

An algorithm for D dominance rep orting that uses O n log n space and

O log n log log n f query time where f is the number of points satisfying the query

An algorithm for D dominance counting problem whichusesO n log n space and

runs in O log n log log n query time This algorithm can b e seen as an improvement

on the query time over Chazelles algorithm at the exp ense of O log n additional

space

An extension of the D dominance counting algorithm and D dominance rep orting

d

algorithm to multidimensional dominance search which leads to an O log n log log n

d

query time and O n log nlog n log log n space algorithm for the counting case

d d

and an O log n log log n f query time and O n log nlog n log log n space

algorithm for the rep orting case where d is the numb er of dimensions and d for

the counting case and d for the rep orting case

In Section we briey discuss several known techniques used heavily in this pap er The

threesided D range rep orting algorithm is describ ed in Section while the description of

our D dominance rep orting algorithm is given in Section Our D dominance count

ing algorithm is describ ed in Section Results in Sections and are extended to the

multidimensional case in Section

Preliminaries

Toavoid tedious details in describing the algorithms we assume for the rest of this pap er

that no two p oints have the same co ordinate in any dimension For simplicitywe call the

p oint with the largest xco ordinate smaller than or equal to a real number r the xpredecessor

of r and the one with the smallest xco ordinate larger than or equal to r the xsuccessor of

r The y and zpredecessorssuccessors are dened similarly

Cartesian Trees

Cartesian trees dened over a nite set of D p oints were rst intro duced byVuillemin

Let p p p be a set of n D p oints sorted by their xco ordinates The corresp onding

  n

Cartesian tree C is dened recursively as follows Let p b e the p oint with the largest y

i

co ordinate Then p is asso ciated with the ro ot r of C The left child of r is the ro ot of the

i

Cartesian tree built on p p and the rightchild of r is the ro ot of the Cartesian tree

 i

built on p p The leftright child do es exist if i n We call such a Cartesian

i n

tree an x y Cartesian tree

An imp ortant prop erty of the Cartesian tree is given by the following observation

Observation Consider a set S of D points and the corresponding x y Cartesian tree

C Let x x be the xcoordinates of two points in S and let and be their respective

 

vertices Then the point with the largest ycoordinate among those whose xcoordinates are

between x and x is stored in the nearest common ancestor nca of and

 

Fractional Cascading

Supp ose wehave a tree T of b ounded degree c and ro oted at no de r suchthateachnode

v contains a list Lv of sorted elements Let n b e the total size of all the lists stored in

T and let F b e an arbitrary forest with p no des consisting of subtrees of T determined by

some of the children of r F may b e sp ecied on line ie not necessarily known during

the prepro cessing step The following lemma is a direct derivation from the one given by

Chazelle and Guibas for identifying all the successors of a value g in the lists stored in F

Lemma There exits a linear size fractional cascading data structure that can beusedto

determine the successors of a given item g in the lists storedin F in O p log c tn time

where tn is the time it takes to identify the successor of g in Lr

A simple way to implement the fractional cascading strategy is to use augmented lists



For eachnode v in T we store in addition to Lv another sorted list L v called the

augmented list which consists of elements in the lists asso ciated with b oth v and its children



We asso ciate with each element e in L v c pointers One p ointer called the intranode

pointerpoints to the successor of e in Lv while the other c internode pointers each p oints

to the successor of e in the augmented lists asso ciated with a dierentchild of v Let v be

an internal no de and u b e its j th child starting from the left If weknow the successor e

v



of g in L v then by following the j th interno de p ointer asso ciated with e we can obtain

v



the successor e of g in L u The successor of g in Lu is then obtained by following the

u

intrano de p ointer asso ciated with e Note that this works correctly since Lu is a subset

u



of L u

Since each element is stored at most twice and there are c pointers asso ciated with

each element the storage cost of this scheme is O cn where n is the total size of all the lists

Wethus haveavariation of the fractional cascading structure which allows constanttime

lo okup in nonro ot no des Wecallitfast fractional cascading and the linearspace fractional

cascading structure compact fractional cascading

Lemma The fast fractional cascading structure al lows the identication of the successors

of a given item g in the lists storedin F in O p tn time where tn is the time it takes

to identify the successor of g in Lr This structurerequires O cn space

We should note that in the case where c is a constant Both the compact and faster

fractional cascading structures have asymptotically the same p erformance

Fusion Trees

Fusion trees develop ed byFredman and Willard achieve sublogarithmic search time

of onedimensional data and can b e exploited to asymptotically sp eed up many algorithms

as shown in In essence this strategy achieves sublogarithmic searchtimeby increasing

the degree of the as a function of n where n is the size of the data Typically

the degree c of the fusion tree satises log c log log n As a result the depth of the tree

is reduced to log n log log n Using compressed key representation the fusion tree allows

the correct child of a no de to b e determined in constant time The following Lemma which

wemake use of in our results is a simplied version of Corollary in

Lemma Assume that in a database of n elements we have available the use of precomputed

tables of size on Then it is possible to construct a data structureofsize O n space

which has a worstcase time O log n log log n for performing member predecessor and rank

operations

Fast Algorithm for Threesided D Range Rep ort

ing

A threesided range rep orting query lo oks for the twodimensional p oints p p p ina

x y

data set suchthatx p x and p y Using McCreights priority search tree

 x  y

this problem can b e solved in O log n time using O n space In Willard improves this

p

algorithm by increasing the degree of the prioritysearch tree to log n and applying the



Q structure A global table is required in order to avoid the access of the tree no des

that do not haveanypoint to rep ort We describ e b elow another algorithm that achieves

the same b ound but is much simpler

Instead of using a balanced tree we construct an xyCartesian tree C Given a query

x x y if we know the xsuccessor of x and the xpredecessor of x and their corresp ond

   

ing no des and in the Cartesian tree C then the nearest common ancestor of and

stores the p oint with the largest yco ordinate among those p oints whose xco ordinates

are b etween x and x By transforming C into the structure D C using the techniques

 

of Harel et al the no de can b e found in O time After the no de is lo cated its

subtrees are traversed recursively as follows When a no de v is accessed wecheck the the

p oint stored there to see if it should b e rep orted If yes the children of v are accessed

Otherwise we stop searching the subtree of v

Lemma Let C beanx y Cartesian for a set of n twodimensional points Given a

threesided twodimensional range query given as x x y with the two pointers to the left

 

and rightmost nodes of C whose xcoordinates fal l within the range x x then the query

 

can be answeredinO f time using D C of size O n

The xsuccessor of x and xpredecessor of x can b e found in O log n log log n time if

 

we build a fusion tree to index the no des of the Cartesian tree according to the increasing

order of the xco ordinates of their asso ciated p oints Thus wehave an algorithm that can

handle threesided twodimensional range queries in O log n log log n f time using linear

space

Fast Algorithm for D Dominance Rep orting

The D dominance rep orting problem involves the determination of all the p oints that

dominate the query p ointq q q In this section we describ e a faster algorithm than the

x y z

one presented by Makris and Tsakalidis at the exp ense of a factor of log n additional

space Our algorithm is inspired by Willards improvement on the priority search tree

We rst give a brief overview of the algorithm of Makris and Tsakalidis

The MakrisTsakalidis Algorithm

This algorithm is based on the linear space data structure of Chazelle and Edelsbrunner



which handles queries in O log n f time The primary data structure used is a binary

search tree T built on the p oints in decreasing order of the zco ordinate Eachinternal no de

v stores the maximal set M v of the p oints stored in the subtree ro oted at v and which are

not stored in one of its ancestors A maximal set of a p ointsetS consists of the p oints in S

whose pro jections onto the xyplane are not dominated byany other pro jection

At eachnode v M v is represented by three data structures The rst two are lists

L v and L v that store the p oints in M v in increasing order of the xco ordinates and

 

decreasing order of the yco ordinates resp ectively The third structure D v D C v is the

derivation of the x z Cartesian tree C v built on M v and ignoring the yco ordinates

Lists L and L asso ciated with all the primary tree no des are chained together separately

 

using fractional cascading to facilitate constant time search of these lists Elements of L v



L v and D v that corresp ond to the same p oint are also linked together



Given a query q q q the algorithm works as follows First weidentify in O log n

x y z

time the path from the ro ot to the leaf no de whose corresp onding p oint has the smallest

zco ordinate that is larger than or equal to q Thenwe p erform a preorder traversal of the

z

tree p erforming the following op erations

For eachnode v visited determine the p oints in M v that need to b e rep orted

If v do es not haveanypoint to b e rep orted and is not on do not visit any of its

children

If v and its left child are b oth on the path do not visit its rightchild

The ab ove restrictions guarantee that the numb er of no des visited is b ounded by O log n f

For a no de v that is visited but not on we nd the xsuccessor of q in L v inO

x 

time except when v is the ro ot which requires O log n time If this p oint do es not satisfy

the querythenwe are done with this no de and its descendants Otherwise we follow L v



until a p oint not in Q is encountered and recursively visit the twochildren v

Ateachnodev on we can easily in O time nd left most and right most leaf no des

and of C v the x and yco ordinates of whose p oints satisfy the query By Lemma

D v allows us to rep ort p oints in M v inO f v time where f v isthenumb er of p oints

rep orted in M v

Our Algorithm

To reduce the query complexitywe increase the degree of the primary tree T used in the

previous section from to c log n where is an arbitrary p ositive constant As a result

the height of the tree as well as the length of the path is reduced to O log n log log n

Eachnodev contains c auxiliary structures The rst three are L v L v and

x y

L v consisting of the p oints in M v sorted according to the order of the x y and z

z

co ordinates resp ectively The L L and L lists of all the no des are chained together

x y z

separately using the fast fractional cascading structure Also wehave the structure D v

which is the same as describ ed in Section asso ciated with the maximal set of p oints

M v stored at v In addition log n x y Cartesian trees C v C v C v are built

  c

C v contains the p oints stored in the rst i children of v starting from the left The leaf

i

no des of C v are in decreasing order of the xco ordinates of their asso ciated p oints and

i

eachinternal no de stores the p oint whose yco ordinate is the largest among the p oints stored

in its subtree Actuallywe are not storing the Cartesian trees themselves in v but rather

the transformations D v D C v so that eachsuch structure can b e used to answer

i i

twosided twodimensional range queries in constanttime And nally one fusion tree is

built for each of the three lists stored at the ro ot and hence a member lookup in eachof

them can b e p erformed in O log n log log ntime

It is obvious that the total size of all the sorted lists and their corresp onding fractional

cascading structures is O n Each p oint can app ear in at most one x z Cartesian tree

and c x y Cartesian trees and hence the overall storage cost of these Cartesian trees is

O n log n

Using the fast fractional cascading structure on L we will b e able to identify the path

z

inO log n log log n time b ecause O log n log log n time is sucient to nd whichchild

of the ro ot is on and we can nd the remaining O log n log log n no des on in constant

time p er no de

As wehave seen b efore we do not need to access the children of a no de v if it contains no

p oints to b e rep orted Also the combination of the three fast fractional cascading structures

and the x z Cartesian trees guarantees that O f v time will b e sp entoneachnode v

on However since the degree of our primary tree is no longer a constant we cannot aord

to access eachchild of v even if v has contributed some p oints Doing so would result in a

f log n term in the overall query time Wehave to b e sure at least one p oint will b e rep orted

during the visit of a no de v b efore it is actually visited This is where the x y Cartesian

trees come into play

Let v b e the no de b eing visited Supp ose v is on Wealways start bysearching the

ro ot whichisalways visited and always on We rst use D v to rep ort M v QIfv is

a leaf no de we are done Otherwise wendinO time the child u of v that is also on

Supp ose it is the ith child from the left If i thenwe recursively search the subtree of u

If not D v is accessed as describ ed in Section to answer the query q p q p

i x x y y

Since all the p oints p in the subtrees of the rst i children already satisfy p q

z z

we can b e condent that the p oints rep orted in the searchof D v using the x and y

i

co ordinates do b elong to Q Note that the rightmost no des of C v that satisfy q p

i x x

can b e identied in constant time and hence Lemma to b e applicable This condition

can b e satised by augmenting the fractional cascading structure for the L lists by using

x

log n additional p ointers to link eachelement in the augmented list asso ciated with no de v

to D v i log nwhichintro duces O n log n additional storage cost

i

To ensure we reach in constant time eachchild of v that has at least one p ointtobe

rep orted we maintain a vector of c bits initialized as zeros and set the j th bit to whenever

a p oint from the subtree of the j th child of v is rep orted After this pro cess the bits

corresp ond to the children of v in addition to the child of v already identied on which

needs to b e accessed Let k be the numb er of suchchildren In order to nd these children in

c

O k time wemaintain a table of size The dth entry of this table stores a list of integers

l I I I where l is the numb er of bits in the binary representation of d and I is

  l i

the p osition of the ith bit in the integer dSincec log n eachoftheseintegers can b e

enco ded in O log log n bits Therefore eachentry uses at most O log nlog log nO log n

bits as long as If eachentry can b e stored in one word and the size of the table

is O n Note that this single table is used for the entire searching pro cess

Now supp ose v is not on We then simply use D v to rep ort M v Q and recursively

access eachofthechildren of v only if M v Q The description of our algorithm is

now complete

It should b e clear from the ab ove discussion that the numb er of primary tree no des visited

is O log n log log and O f v time is sp entateachnode v Moreover each p ointinQ

will b e rep orted at most twice We therefore have the following theorem

Theorem Asetofn threedimensional points can bestored in a data structure of size

O n log n where is an arbitrary positive constant smal ler than such that any three

dimensional dominancereporting query can be answeredinO log n log log n f time

An Improved Algorithm for D Range Counting

In this section we describ e a fast algorithm for handling the twodimensional dominance

range counting problem

Our solution uses ideas similar to the ones used in Instead of using a binary search

tree as the skeleton of our data structure we use a tree of degree c log nPoints are stored

in the leaf no des in order by decreasing xco ordinates Eachinternal no de v corresp onds to

an xrange x x where x and x are the xco ordinates of the p oints stored in its leftmost

   

and rightmost leaf descendants Wecallx the key of no de v denoted as k v Eachinternal



no de v of the tree stores the list l v ofkeys of its log c children and the p ointers to them

In addition v also contains an auxiliary data structure to b e used to gain information ab out

the yco ordinates of those p oints in v s subtree We will describ e this structure so on In

addition to our tree structure wehave a fusion tree that o ccupies O n space and p erforms

rank searches on the n p oints sorted on decreasing yco ordinates in O log n log log n time

The auxiliary data structure of eachinternal no de v consists of two parts First similar

to Chazelles bit vector a vector is used to record for eachpoint in the subtree ro oted at v

in order of decreasing yco ordinates which of the c child subtrees it b elongs to Obviously

weneedlogc bits to enco de this information We call these log c bits a microword and the

vector of microwords a router denoted as r v Since a word can accommo date log n log c

microwords mv dnv log c log ne words are needed to store r v ifv has nv leaf

descendants In addition we count for each i between and mv andj betweenand

cthenumb er of microwords stored in the rst i words of r v whose values are b etween

and j and store them in a mv c twodimensional arraycalledacounter

denoted as cv

Now we analyze the storage cost of the overall data structure The tree itself is clearly

of size O n Since each microword corresp onds to a p oint which is represented once at each

of the log n levels of the tree there are a total number of n log n log c microwords The

c

word cost of all the routers is thus O n Let T denote the set of internal no des The total

P

size of all the counters is mv c O cn The overall size of our data structure

v T

is thus O n log n

A query given as q q is answered by recursively exploring the tree We will show

x y

that only a constanttimeisspent at eachlevel of the tree and hence the complexityof

our query algorithm is O log n log log n First we use the fusion tree to nd the rank i of

the successor of q This implies that the p oints that corresp ond to the rst i microwords in

y

r hhave their yco ordinates no less than q whereh is the ro ot of the tree If the xrange

y

of the ro ot h is included in q then i is the answer Otherwise supp ose the j th child v

x

of h is the rightmost one such that k v q Thenumb er of p oints in the subtrees ro oted

x

the rst j children of v that should b e counted are those whose yco oridnates are no less

than q Thisnumb er can b e computed bycounting howmanypoints that corresp ond to the

y

rst i microwords in r hgotoitsrstj children Let g di log c log ne The number

of p oints that corresp ond to the rst g log n log c microwords can b e immediately found in

cv g j The number of points corresp onding to the remaining i g log n log c microwords

that should b e counted can b e obtained by lo oking up a table which costs only a constant

amount of storage The numb er of remaining p oints to b e counted are stored in the subtree

of the j th child v of h Hence we rep eat the ab ove pro cedure on no de v The only

dierence is that we already know the rank of x in l h By Lemma we can compute the

rank of x in l v inconstant time without asymptotically increasing the storage cost It is

now clear the overall complexity of our algorithm is O log n log log n

Theorem There exist an algorithm that solves twodimensional dominancecounting query

and range counting query O log n log log n time This algorithm uses O n log n space

Fast Algorithms for Multidimensional Dominance

Searching

Using the results obtained for the twodimensional dominance counting case we can achieve

a b etter query complexity for the threedimensional dominance counting problem We



build a search tree of degree c log n on the p oints sorted on decreasing zvalues

Let D S denote the data structure describ ed in the previous section for answering two

dimensional dominance counting queries For eachinternal no de v we construct c data

structures fD S iji cg where S i is the set of p oints stored in the subtrees

ro oted at the rst i children of v Toanswer a query given as q q q we identify

x y z

O log n log log ntwodimensional data structures one at eachlevel each corresp onding

to a no de whose zrange is completely covered byq This takes O log n log log n

z

time We then obtain the answers to the twodimensional dominance query given byx y

by searching each of the twodimensional data structures identied The complexityofthis



algorithm is O log n log log n Let s s s b e the size of the p oint sets stored at the

  l

l twodimensional data structures at a certain level Then the storage cost of that level is

P P

 

s O n log n Thus the overall storage cost is s log s O log n O

i i i

il il



O n log n log log n Generalization of this result to higher dimensions is straightforward

Similar technique can b e used for to obtain fast algorithms for the multidimensional

dominance rep orting case in whichwe use the data structure describ ed in Section to

handle the lowest three dimensions

Theorem There exist data structures such that any ddimensional dominancecounting

d d

query can be answeredinO log n log log n time using O n log n log n log log n

space for d Similarly an arbitrary ddimensional reporting query can be answeredin

d d

O log n log log n f time using O n log nlog n log log n spaceford

References

S Alstrup and G S Bro dal New data structures for orthogonal In

Proceedings of IEEE Symposium on Foundations of Computer Science pages

J L Bentley Multidimensional binary search trees used for asso ciative searching Com

munications of the ACM Sept

B Chazelle Filtering search A new approach to queryanswering SIAM Journal on

Computing Aug

B Chazelle A functional approach to data structures and its use in multidimensional

searching SIAM Journal on Computing June

B Chazelle and H Edelsbrunner Linear space data structures for twotyeps of range

search Discrete Commput Geom

B Chazelle and L J Guibas Fractional Cascading I A data structure technique

Algorithmica

M L Fredman and D E Willard Surpassing the information theoretic b ound with

fusion trees Journal of Computer and System Sciences

M L Fredman and D E Willard Transdichotomous algorithms for minimum spanning

trees and shortest paths Journal of Computer and System Sciences

H N Gab ow J L Bently and R E Tarjan Scaling and related techniques for geometry

problems In Proceedings of the th Annual ACM Symposium on Theory of Computing

pages

D Harel and R E Tarjan Fast algorithms for nding nearest common ancestors SIAM

Journal on Computing

C Makris and A K Tsakalidis Algorithms for threedimensional dominance searching

in linear space Information Processing Letters

E M McCreight Priority search trees SIAM Journal on Computing

May

S Subramanian and S Ramaswamy The Prange tree A new data structure for

range searching in secondary memoryInProceedings of the ACMSIAM Symposium on

Discrete Algorithms pages

J Vuillemin A unifying lo ok at data structures Communications of the ACM

D E Willard Examining computational geometryvan Emde Boas trees and hashing

from the p ersp ective of the fusion three SIAM Journal on Computing