Performance Evaluation of Approximate Priority Queues

y z

Yossi Matias Suleyman Cenk Sahinalp Neal E Young

June

Abstract

We rep ort on implementation and a mo dest exp erimental evaluation of a recently intro duced

priorityqueue The new data structure is designed to takeadvantage of fast

op erations on machine words and as appropriate reduced keyuniverse size andor tolerance

of approximate answers to queries In addition to standard priorityqueue op erations the data

structure also supp orts successor and predecessor queries Our results suggest that the data

structure is practical and can b e faster than traditional priority queues when holding a large

number of keys and that tolerance for approximate answers can lead to signicant increases in

sp eed

Bell Lab oratories Mountain Ave Murray Hill NJ matiasresearchb elll abscom

y

Bell Lab oratories Mountain Ave Murray Hill NJ jenkresearchb ellla bscom The author was

also aliated with Department of Computer Science University of Maryland College Park MD when the

exp eriments were conducted

z

Department of Computer Science Dartmouth College Hanover NH neycsdartmouthedu

speedup n An employers demand for accelerated output without increasedpay

Websters dictionary

Intro duction

A priority queue is an abstract data typ e consisting of a dynamic of data items with the following

op erations

inserting an item

deleting the item whose key is the maximum or minimum among the keys of all items

The variants describ ed in this pap er also supp ort the following op erations deleting an item with

any given key nding the item with the minimum or the maximum key checking for an item with

a given key checking for the item with the largest key smaller than a given key and checking for

the item with the smallest key larger than a given key

Priority queues are widely used in many applications including shortest path algorithms com

putation of minimum spanning trees and heuristics for NPhard problems eg TSP For many

applications priority queue op erations esp ecially deleting the minimum or maximum key are the

dominant factor in the p erformance of the application Priority queues have b een extensively ana

lyzed and numerous implementations and empirical studies exist see eg AHU Meh Jon

SV LL

This note presents an exp erimental evaluation of one of the priority queues recently intro duced

in MVY the wordbased radix This data structure is qualitativ ely dierent than tra

ditional priority queues in that its time p er op eration do es not increase as the number of keys

grows in fact it can decrease Instead as in the van Emde Boas data structure vKZ the

running time dep ends on the size of the universe the set of p ossible keys which is assumed

to b e f Ug for some integer U Most notably the data structure is designed to take

advantage of any tolerance the application has for working with approximate rather than exact

key values The greater the tolerance for approximation the smaller the eective universe size and

so the greater the sp eedup

More sp ecically supp ose that the data structure is implemented on a machine whose basic

word size is b bits When no approximation error is allowed the running time of each op eration

on the data structure is exp ected to b e close to c lg U lg b for some c It is exp ected to

b e signicantly faster when a relative error tolerance is allowed In this case eachkey k is

mapp ed to a smaller approximate key k dlg k e resulting in an eective universe U that is

b

and the relative considerably smaller In particular when the universe size U is smaller than

j

error is for nonnegativeinteger j then the running time of each op eration on the data

structure is exp ected to b e close to c j lg b for some constant c

The data structure is also unusual in that it is designed to takeadvantage of fast bitbased

op erations on machine words Previouslysuch op erations were used in the context of theoretical

priority queues such as those of Fredman and Willard FWa FWb Wil which are often

considered nonpractical In contrast the data structure considered here app ears to b e simple

to implement and to have small constants it is therefore exp ected at least in theory to b e a

candidate for a comp etitive practical implementation as long as the bitbased op erations can b e

supp orted eciently

In this pap er wereportonanimplementation of the wordbased radix tree on a demonstration

of the eectiveness of approximation for improved p erformance and on a mo dest exp erimental

testing whichwas designed to study to what extent the ab ove theoretical exp ectations hold in

practice Sp ecically our exp erimental study considers the following questions

What maybeatypical constant c for the radix tree data structure with no approximation

tolerance in the estimated running time of c lg U lg b

What maybeatypical constant c for the radix tree data structure with approximation error

for which the estimated running time is approximately c lg b and

How do es the p erformance of the radix tree implementations compare with traditional priority

queue implementations and what maybeatypical range of parameters for whichitmay

b ecome comp etitive in practice

We study two implementations of the radix tree one topdown the other bottomup The latter

variant is more exp ensive in terms of memory usage but it has b etter time p erformance The main

b o dy of the tests timed these two implementations on numerous randomly generated sequences

varying the universe size U over the set f g the n umb er of op erations N over the set

f g and the approximation tolerance over the set f U g where

the case corresp onds to no tolerance for approximation The tests were p erformed on a single

no de of an SGI Power Challenge with Gbytes of main memory

The sequences are of three given typ es inserts only insertsfollowed by deleteminimums

and inserts followed by mixed op erations Based on the measured running times we identify

reasonable estimates to the constants c and c for which the actual running time deviates from the

estimated running time byatmostabout

The b ottomup radix tree was generally faster than the topdown one sometimes signicantly

so For sequences with enough insertions the radix trees were faster than the traditional priority

queues sometimes signicantly so

The Data Structure

We describ e the the topdown and the bottomup variants of the wordbased radix tree priority

queue

Both implementations are designed to takeadvantage of any tolerance of approximation in the

application Sp ecically if an approximation error of is tolerated then each given key k is

mapp ed b efore use to the approximate key k dlg k e details are given b elow This ensures

that for instance the findminimum op eration will return an element at most times the true

minimum

Op erating on approximate keys reduces the universe size to the eective universe size U For

completeness wegive the following general expression for U according to our implementation

dlgdlgU eedlge

if

U

U if

Both the topdown and b ottomup implementations of the wordbased radixtree maintain a

complete bary tree with U leaves where b is the numb er of bits p er machine word on the host

computer here we test only b Each leaf holds the items having a particular approximate

key value k The time p er op eration for b oth implementations is upp er b ounded by a constant

times the number of levels in the tree whichis

lg lg U lg if

b b

levels U dlg U e

b

lg U if

b

b

For U and for b apower of two weget

d lg be if

levels U

d lg U lg be if

For all parameter values we test the ab ove inequality is tight with the number of levels ranging

from to and the numb er of leaves ranging from to

As mentioned ab ove each leaf of the tree holds the items having a particular approximate

key value Eachinterior no de holds a single machine word whose ith bit indicates whether the ith

subtree is empty The topdown variant allo cates memory only for no des with nonempty subtrees

Each such no de in addition to the machine word also holds an arrayof b pointers to its children

An item with a giv en key is found by starting at the ro ot and following p ointers to the appropriate

leaf It examines a no des word to navigate quickly through the no de For instance to nd the

index of the leftmost nonempty subtree it determines the most signicant bit of the word If a

leaf is inserted or deleted the bits of the ancestors words are up dated as necessary

The b ottomup variant preallo cates a single machine word for every interior no de emptyor

not The word is stored in a xed p osition in a global arraysonopointers are needed The items

with identical keys are stored in a doubly linked list which is accessible from the corresp onding

leaf Each op eration is executed by up dating the appropriate item of the leaf and then up dating

the ancestors words bymoving up the tree This typically will terminate b efore reaching the ro ot

for instance an insert needs only to up date the ancestors whose subtrees were previously empty

Details on implementations of the data structure and word op erations

Belowwe describ e how the basic op erations are implemented in each of our data structures We

also describ e implementations of the wordbased op erations we require

Computing approximate keys

j

Given a key k this is howwe compute the approximate key k assuming for some

nonnegativeinteger j Leta a a b e the binary representation of a key k U where

B B

B dlg U e Let m maxfi a g b e the index of the most signicant bit of k Letn

i

b e the integer whose j bit binary representation is a a a ie the j bits following the

m m mj

most signicant bit Then

j

k m n

Implicitly this mapping partitions the universe into sets whose elements dier byatmostamul

tiplicativefactoroftwo then partitions each set into subsets whose elements dier byatmostan

appropriate additive amount Under this mapping the universe of approximate keys is a subset

of f U g where U is the largest integer whose binary representation has a numb er of bits

equal to dlg dlg U ee j

Bottomup implementation

The b ottomup implementation of the radix tree is a complete ary tree with L dlg U e levels

level consists of those no des at distance from the b ottom of the tree so that the leaves form

level zero

j

max fj mg and n This mapping can b e improved a bit by using k m j n wherem

a a a

m m  m j

The leaves in the tree corresp ond to the approximate key values For each leaf there is a p ointer

to a linked list of the items if any whose keys equal the corresp onding value These p ointers are

L

maintainedinanarray of size

L

For eachlevel wemaintain an array A of bit words one for eachnodeinlevel

For convenience and lo cality of reference the no des are stored in the arrays so that the binary

representation of the index of each no de is the common prex of the binary representation of the

indices of its descendants That is the index of a leaf with key value k is just k The index of the

ancestor of the leaf at level has binary representation a a a wherea a a is

D D D D

the binary representation of k D dlg U e This allows the index of an ancestor or child to

b e computed quickly from the index of a given no de

As describ ed previouslywe maintain the invariant that the r th bit of a no des word is set i

the subtree of the r th child is nonemptyFor brevitywe call r the rank of the child

Insert First insert the item with key k into the linked list of the corresp onding leaf Then set

the appropriate bits of the ancestors words to Sp ecically starting at the leaf check the r th

bit of the parents word where r is the rank of the leaf note that r is the ve least signicant bits

of k If the bit is already set stop otherwise set it move up to the parent and continue Stop

at the ro ot or up on nding an alreadyset bit

Search To test whether an item with a given key k is present simply check whether the leaf s

linked list is emptyTo nd an item with the given key search the linked list of the leaf

Delete First remove the item from the linked list of its leaf If the list b ecomes empty up date

the appropriate bits of the ancestors words Sp ecically starting at the leaf zero the r th bit of the

parents word where r is the rank of the no de If the entire word is still nonzero stop Otherwise

move up to the parent and continue Stop at the ro ot or up on nding a word that is nonzero after

its bit is zero ed

Minimum Starting from the ro ot compute the index r of the least signicant bit lsbofthe

ro ots word Descend to the child of rank r and continue Stop at a leaf

Maximum This is similar to minimum except it uses the most rather than least signicant bits

Successor Given a key k return an item with smallest key as large as k

If the leaf corresp onding to k has a nonempty list return an item from the list Otherwise

search up the tree from the leaf to the rst ancestor A with a nonempty sibling R of rank at least

A

the rank of ATondR examine the word of the parentofA to nd the most signicant bit if

A

any set among those bits enco ding the emptiness of the subtrees of those siblings that have rank

at least the rank of AFor this use the nearestonetoright op eration nor as describ ed later in

this section

Next nd the leftmost nonemptyleafin Rs subtree pro ceeding as in the Minimum op eration

ab ove Return an item from that leaf s list



If this array is large and only sparsely lled it can instead b e maintained as a



In retrosp ect this query should probably have b een done in a b ottomup fashion similar to the Successor query

b elow

Predecessor Predecessor is similar to Successor but is based on the nearestonetoleft

nol op eration

Topdown implementation

This implementation also maintains a ary tree with L levels The implementation diers from

the b ottomup one in that memory is allo cated on a p erno de basis and op erations start at the

ro ot rather than at the leaves

Sp ecically eachinterior no de ro oting a nonempty subtree consists of a bit word used as in

the b ottomup implementation and an arrayofpointers to its children no des where a p ointer

is null if the childs subtree is empty

Insert Starting from the ro ot followpointers to the appropriate leaf instantiating new no des

and setting no des words bits as needed Sp ecically b efore descending into a rankr child set the

r th bit of the no des word

Search Followpointers from the ro ot as in Insert but stop up on reaching any empty subtree

and return NULL If a leaf is reach return a no de from the leaf s list

Delete Follow p ointers as in Search Up on nding the leaf remove an item from the leaf s list

Retrace the path followed down the tree in reverse zeroing bits as necessary to represent subtrees

that have b ecome empty

Minimum Starting at the ro ot use the no des word to nd the leftmost child with a nonempty

subtree as in the b ottomup implementation of Insert Movetothatchild by following the

corresp onding p ointer Continue in this fashion down the tree Stop up on reaching a leaf and

return an item from the leaf s list

Maximum Similar to Minimum

Successor Starting at the ro ot let r b e the rank of the child whose subtree has the leaf cor

resp onding to the given key k Among the children of rank at least r with nonempty subtrees

nd the one of smallest rank Do this by applying the nearestonetoright wordop eration nor

dened b elow to the no des word Follow the p ointer to that child Continue until reaching a leaf

Predecessor Similar to Successor except use the nol op eration instead of nor

Implementation of binary op erations

Belowwe describ e howwe implemented the binary op erations that are not supp orted by the machine

hardware

msbkey The most signicant bit op eration or msb is dened as follows Given a key k msbk

ve three implementations of is the index of the most signicant bit that is equal to in k Weha

the op eration

th

The rst metho d is starting from the lg U msb of the key and linearly searching the msb

bychecking next msb by shift left op eration We exp ect this metho d to b e fast just a few

shifts will b e needed for most values of the words

A metho d for obtaining a worstcase p erformance of lg lg U is based on binary search We

apply at most lg lg U masks and comparisons with to b e to determine the index of msb

This metho d is not used in any our exp eriments as it was slower than the ab ove metho d in

preliminary exp eriments

The last metho d is dividing the key to blo cks of bit each V V V V by shifts and

masks and then searchthemsb of the most signicant nonzero blo ck This approach could

b e seen as a compromise b etween the two metho ds describ ed ab ove and is used for the

computation of the maximum and minimum keys

lsbkey The least signicant bit op eration or lsb is dened as follows Given a key k lsb k

is the index of the least signicant bit that is equal to in k The implementations for lsb are

similar to those of msb essentially replacing ab ove most with least msb with lsb and

left with right

We note that by using the ab ove implementations for msb and lsb in our exp eriments we

exploited the fact that the input is random For general applicabilitywewould need an implemen

tation that is ecient for arbitrary input This rep ort is concerned mainly with the p erformance

of the radix tree data structure when ecient bitbased op erations are available and therefore

exploiting the nature of input for ecient msb and lsb executions is appropriate Nevertheless

we mention an alternativetechnique whichwehave not tested that enables an msb and lsb

implementation using a few op erations one subtraction one bitwise XOR one MOD and one

lo okup to a table of size b

We describ e the lsb implementation for a given value x First note that x x bitwisexor

lsbx

x maps x to Next note that there will b e a small prime p slightly larger than

i

b such that mo d p is unique for i fp g b ecause generates the multiplicativemodp

group Thus the function x x bitwisexor x modpmaps x to a number in fp g

that is uniquely determined by lsbx By precomputing an array T of p bbbit words such that

T x bitwisexor x mo d p lsb x the lsb function can subsequently b e computed in a few

op erations Examples of go o d p include b b b b

b b and b

nol indexkey The op eration nearest one to leftornol is dened as follows Given a key k

and an index i noli k is the index of the rst bit that is equal to to the left of more signicant

than the given index The nol op eration is implemented by using the op erations shift rightby

index and msb

norindexk ey The op eration nearest one the rightornor is dened as follows Given a key

k and an index i the nori k is the index of the rst bit that is equal to to the right of least

signicant than the given index The nor op eration is implemented by using the op erations shift

left by index and lsb

The Tests

In this section we describ e in detail what kind of tests we p erformed the data sets weusedand

the testing results Weverify from the testing results howwell our theoretical exp ectations were

supp orted and interpret the variations

The test sets

The main b o dy of the test timed two implementations on numerous sequences varying the universe

size U over the set f g the numb er of op erations N over the set f g

and the approximation tolerance over the set f U gWe note that the case

corresp onds to no tolerance for approximation

The sequences were of three kinds N insertions N insertions followed by N deleteminimums

or N insertions followed by N mixed op erations half inserts one quarter delete s and one

quarter deleteminimum s Each insertion was of an item with a random key from the universe

Each deletion was of an item with a random key from the keys in the priority queue at the time of

the deletion

For comparison we also timed more traditional priority queues the binary the Fi

b onacci heap the and the ary heap from the LEDA library LED as well as a

plain binary with no balance condition None of these data structures are designed to

take advantage of U or so for these tests wexed U and and let only N vary

The test results

In summarywe observed that the times p er op eration were roughly linear in the numberoflevels

as exp ected Across the sequences of a given typ e inserts inserts then deleteminimum s or

inserts then mixed op erations the constant of prop ortionalityvaried up or down typically by

as much as The b ottomup radix tree was generally faster than the topdown one some

times signicantly so For sequences with enough insertions the radix trees were faster than the

traditional priority queues sometimes signicantly so

Estimating the constants c and c

The following table summarizes our b est estimates of the running time in microseconds p er

op eration as a function of the numb er of levels in the tree L level U see Equation

for the two radixtree implementations on the three typ es of sequences as N U and vary

These estimates are based on the sequences describ ed ab ove and are chosen to minimize the ratio

between the largest and smallest deviations from the estimate The accuracy of these estimates in

comparison to the actual times p er op eration is presented Figures and

insert insdel insmixed

topdown L L L

L L L

b ottomup L L L

L L L

The variation of the actual running times on the sequences in comparison to the estimate

is typically on the order of lower or higher of the appropriate estimate The b ottomup

implementation is generally faster than the topdown implementation The case is typically

slower per level than the case esp ecially for the b ottomup implementation

The table b elow presents for comparison an estimate of the time for the pairing heap consis

tently the fastest of the traditional heaps we tested as a function of lg N



The use of approximate keys reduces the number of distinct keys By using approximate keys together with

an implementation of a traditional priority queue that buckets equal elements the time p er op eration for delete

and deletemin could b e reduced from prop ortional to lg N down to prop ortional to the logarithm of the number

of distinct keys We dont explore this here -3.1+3.9*L, 36 seq’s -2.9+3.4*L, 30 seq’s -7.9+7.6*L, 30 seq’s 1 1 1

.8 .8 .8

.6 .6 .6

.4 .4 .4

.2 .2 .2

0 0 0 0.5 0.76 1.02 1.28 1.54 1.8 0.7 0.84 0.98 1.12 1.26 1.4 0.8 0.88 0.96 1.04 1.12 1.2 top-down inserts top-down ins/del top-down ins/mixed

1.3+1.3*L, 27 seq’s 0.8+1.3*L, 30 seq’s 1.1+1.6*L, 36 seq’s 1 1 1

.8 .8 .8

.6 .6 .6

.4 .4 .4

.2 .2 .2

0 0 0 0.6 0.78 0.96 1.14 1.32 1.5 0.8 0.9 1. 1.1 1.2 1.3 0.7 0.82 0.94 1.06 1.18 1.3

bottom-up inserts bottom-up ins/del bottom-up ins/mixed

Figure For the plot in the upp er left the topdown radix tree was run on sequences of N

insertions of a universe of size U with approximation tolerance for the various values of N U and

Ab ove the plot is an estimate chosen to t the data of the running time p er op eration in

microseconds as a function of the numberoflevels L for the topdown implementation on sequences

of insertions To the right of the function is the numb er of sequences Each sequence yields an error

ratio the actual time for the sequence divided by the time predicted by the estimate The plot

shows the distribution of the set of error ratios Each histogram bar represents the fraction of error

ratios near the corresp onding xaxes lab el xThecurveabove represents the cumulative fraction

of sequences with error ratios x or less There is one plot for each implementationsequencetyp e

pair Typically the error ratios are b etween and

insert insdel insmixed

b est traditional lg N lg N lg N

Demonstrating the eect of approximation

Our exp eriments suggest that the radix tree mightbevery suitable for exploiting approximation

tolerance to obtain b etter p erformance In the following table weprovide timing results for three

sequences of op erations applied to the b ottom up implementation with approximation tolerance

and the universe size U The rst sequence consist of N

insertions the second one consists of N insertions followed by N deleteminimums

and nally the third one consists of N insertions followed by N mixed op erations -14.9+5.2*L, 15 seq’s -11.4+4.4*L, 12 seq’s -12.4+6.6*L, 12 seq’s 1 1 1

.8 .8 .8

.6 .6 .6

.4 .4 .4

.2 .2 .2

0 0 0 0.4 0.82 1.24 1.66 2.08 2.5 0.6 0.8 1. 1.2 1.4 1.6 0.9 0.94 0.98 1.02 1.06 1.1 top-down inserts top-down ins/del top-down ins/mixed

-4.8+1.8*L, 13 seq’s -5.6+2.4*L, 12 seq’s -4.8+2.2*L, 15 seq’s 1 1 1

.8 .8 .8

.6 .6 .6

.4 .4 .4

.2 .2 .2

0 0 0 0.4 0.78 1.16 1.54 1.92 2.3 0.6 0.8 1. 1.2 1.4 1.6 0.6 0.78 0.96 1.14 1.32 1.5

bottom-up inserts bottom-up ins/del bottom-up ins/mixed

Figure This gure is analogous to Figure but represents the case Note the larger

variation in insert times

Sequence Sequence Sequence

topdown b ottomup topdown b ottomup topdown b ottomup

Some comparisons to traditional heaps

Our exp eriments provide some indication that the radix tree based priority queue mightbeofprac

tical interest Belowwe provide timing results for the b ottom up implementation on three sequences

for comparing the p erformance of the b ottom up implementation and the fastest traditional heap

implementation from the LEDA library which consistently o ccured to b e the pairing heap The

rst sequence consist of N insertions the second one consists of N insertions followed

by N deleteminimums and nally the third one consists of N insertions followed

by N mixed op erations U and for all sequences

These gures suggest that under appropriate conditions the size of the universe the number

of elements and the distribution of the keys over the universe the radix tree can b e comp etitive

with and sometimes b etter than the traditional heaps

Sequence Sequence Sequence

BottomUp

fastest traditional

The pairing heap is quite fast for the insertiononly sequences For the other sequences from

the ab ove tables we can give rough estimates of when say the b ottomup radix tree will b e faster

inserts

inserts and deletes

inserts deletes and deletemins

raditional

Sp eedup vs T

Eective universe size U

Figure Sp eedup factor running time for b est traditional priority queue divided by time for

b ottomup approximate radixtree as a function of for three random test sequences with N

and U

than the pairing heap Recall that in the range of parameters wehave studied

lg if

levels U

lgU if

For instance assume a time p er op eration of level U for the radix tree and lg N

for the pairing heap Even for U as large as and the radix tree is faster once N is larger

than roughly If then the radix tree is faster once N is larger than roughly

For direct comparisons on actual sequences see Figure which plots the sp eedup factor ratio of

running times as a function of on sequences of each of the three typ es

We also provide Figures and to demonstrate the conditions under which the radix tree

implementations b ecome comp etitive with the LEDA implementations

Discussion

The variation in running times makes us cautious ab out extrap olating the results

The eect of big data sets The rst source of variation is that in b oth implementations having

more keys in the tree reduces the average time p er op eration due to the following factors

The topdown variant is slower by a constant factor when inserting if the insert instantiates

ypically only a few many new no des The signicance of this varies if a tree is fairly full t

new no des near the leaves will b e allo cated If the tree is relatively empty no des will also

b e allo cated further up the tree

The b ottomup variant is faster if the tree is relatively full in the region of the op eration

As discussed ab ove b ottomup op erations typically step from a leaf only to a previously or memory-intensive approx. p.q. 14 other approx. p.q. pairing heap 12

10

8

6 Time per operation

4

2

0

5 10 15 20

log(N)

Figure Time p er op eration in seconds varying with lg N for a sequence of N inserts for

U with no approximation tolerance One can observe that the time p er op eration in the

topdown implementation and the pairing heap do es not vary with the numb er of op erations as

exp ected For the b ottom up implementation the time p er op eration decreases with the increasing

numb er of items as explained ab ove

still nonempty ancestor This means op erations can taketimemuch less than the number

of levels in the tree

The signicance of the ab ovetwo eects dep ends on the distribution of keys in the trees For

instance if the keys are uniformly distributed in the original universe as in our exp eriments

then when the approximate keys will not b e uniformly distributed there will b e more

large keys than small keys in the eective universe Thus the rightmost subtrees will b e fuller

and supp ort faster op erations the leftmost subtrees will b e less full and therefore slower

The eect of memory issues On a typical computer the memoryaccess time varies dep ending

on how many memory cells are in use and with the degree of lo cality of reference in the memory

access pattern A typical computer has a small very fast cache a large fast random access memory

RAM and a very larger slower virtual memory implemented on top of disk storage

Our test sequences are designed to allow the data structures to t in RAM but not in cache This

choice reduces but do es not eliminate the memoryaccess time variation variation due to caching

remains It is imp ortant to note that our results probably do not extrap olate to applications where

the data structures do not t in RAM when that happ ens the relative b ehaviors could change

qualitatively as lo cality of reference b ecomes more of an issue On the other hand even with fairly

large universes andor a large number of keys most of these data structures can t in a typical

mo dern RAM

Another memoryrelated issue is the numberofbitspermachine word b In this pap er we

consider only b On computers with larger words the times p er op eration should decrease



Of all the priority queues considered here the b ottomup radix tree is the most memoryintensive Nonetheless

it seems p ossible to implement with reasonable lo cality of reference 20 memory intensive approx. p.q. other approx. p.q. binary tree pairing heap

15

10 Time per operation

5

0

4 6 8 10 12 14 16 18 20

log(N)

Figure Time p er op eration in seconds varying with lg N for a sequence of N inserts

followed by N deletemins for U with no approximation tolerance One can observe that

the time p er op eration in the topdown and b ottomup implementations is decreasing with time

The nonlinear increase in the running time of the pairing heap is p ossibly due to decreasing cache

utilization with increasing numb er of op erations

slightly all other things b eing equal Also the relative sp eed of the various bitwise op erations

on machine words as opp osed to for instance comparisons and p ointer indirection could aect

the relative sp eeds of radix trees in comparison to traditional priority queues

Acknowledgment

We thank Jon Bright for participating in this research at an early stage and for contributing to

the implementations

References

AHU AV Aho JE Hop croft and JD Ullman The Design and Analysis of Computer Algo

rithms AddisonWesley Publishing Company Inc Reading Massachusetts

FWa ML Fredman and DE Willard Blasting through the information theoretic barrier with

fusion trees In Proc nd ACM Symp on Theory of Computing pages

FWb ML Fredman and DE Willard Transdichotomous algorithms for minimum spanning

trees and shortest paths In Proc st IEEE Symp on Foundations of Computer Science

pages

Jon D W Jones An empirical comparison of priority queue and event set implementations

Communications of the ACM April 16 memory-intensive approx. p.q. other approx. p.q. binary tree pairing heap 14

12

10

8

Time per operation 6

4

2

0

5 10 15 20

log(N)

Figure Time p er op eration in seconds varying with lg N for a sequence of N inserts

followed by N mixed op erations for U with no approximation tolerance The same trend

as in Figure can b e observed

LED LEDA system manual version Technical rep ort Max Planck Institute fur Infor

matik Saarbrucken Germany

LL Anthony LaMarca and Richard E Ladner The inuence of caches on the p erformance of

heaps Manuscript

Meh K Mehlhorn Data Structures and Algorithms I Sorting and Searching SpringerVerlag

Berlin Heidelb erg EATCS Monographs on Theoretical Computer Science

MVY Y Matias JS Vitter and NE Young Approximate data structures with applications

In Proc th ACMSIAM Symp on Discrete Algorithms pages January

SV J T Stasko and J S Vitter Pairing heaps Exp eriments and analysis Communications

of the ACM March

vKZ Pvan Emde Boas R Kaas and E Zijlstra Design and implementation of an ecient

priority queue Math Systems Theory

Wil DE Willard Applications of the metho d to computational geometry and

searching In Proc rdACMSIAM Symp on Discrete Algorithms pages