Performance Evaluation of Approximate Priority Queues

Performance Evaluation of Approximate Priority Queues y z Yossi Matias Suleyman Cenk Sahinalp Neal E Young June Abstract We rep ort on implementation and a mo dest exp erimental evaluation of a recently intro duced priorityqueue data structure The new data structure is designed to takeadvantage of fast op erations on machine words and as appropriate reduced keyuniverse size andor tolerance of approximate answers to queries In addition to standard priorityqueue op erations the data structure also supp orts successor and predecessor queries Our results suggest that the data structure is practical and can b e faster than traditional priority queues when holding a large number of keys and that tolerance for approximate answers can lead to signicant increases in sp eed Bell Lab oratories Mountain Ave Murray Hill NJ matiasresearchb elll abscom y Bell Lab oratories Mountain Ave Murray Hill NJ jenkresearchb ellla bscom The author was also aliated with Department of Computer Science University of Maryland College Park MD when the exp eriments were conducted z Department of Computer Science Dartmouth College Hanover NH neycsdartmouthedu speedup n An employers demand for accelerated output without increasedpay Websters dictionary Intro duction A priority queue is an abstract data typ e consisting of a dynamic set of data items with the following op erations inserting an item deleting the item whose key is the maximum or minimum among the keys of all items The variants describ ed in this pap er also supp ort the following op erations deleting an item with any given key nding the item with the minimum or the maximum key checking for an item with a given key checking for the item with the largest key smaller than a given key and checking for the item with the smallest key larger than a given key Priority queues are widely used in many applications including shortest path algorithms com putation of minimum spanning trees and heuristics for NPhard problems eg TSP For many applications priority queue op erations esp ecially deleting the minimum or maximum key are the dominant factor in the p erformance of the application Priority queues have b een extensively ana lyzed and numerous implementations and empirical studies exist see eg AHU Meh Jon SV LL This note presents an exp erimental evaluation of one of the priority queues recently intro duced in MVY the wordbased radix tree This data structure is qualitativ ely dierent than tra ditional priority queues in that its time p er op eration do es not increase as the number of keys grows in fact it can decrease Instead as in the van Emde Boas data structure vKZ the running time dep ends on the size of the universe the set of p ossible keys which is assumed to b e f Ug for some integer U Most notably the data structure is designed to take advantage of any tolerance the application has for working with approximate rather than exact key values The greater the tolerance for approximation the smaller the eective universe size and so the greater the sp eedup More sp ecically supp ose that the data structure is implemented on a machine whose basic word size is b bits When no approximation error is allowed the running time of each op eration on the data structure is exp ected to b e close to c lg U lg b for some c It is exp ected to b e signicantly faster when a relative error tolerance is allowed In this case eachkey k is mapp ed to a smaller approximate key k dlg k e resulting in an eective universe U that is b and the relative considerably smaller In particular when the universe size U is smaller than j error is for nonnegativeinteger j then the running time of each op eration on the data structure is exp ected to b e close to c j lg b for some constant c The data structure is also unusual in that it is designed to takeadvantage of fast bitbased op erations on machine words Previouslysuch op erations were used in the context of theoretical priority queues such as those of Fredman and Willard FWa FWb Wil which are often considered nonpractical In contrast the data structure considered here app ears to b e simple to implement and to have small constants it is therefore exp ected at least in theory to b e a candidate for a comp etitive practical implementation as long as the bitbased op erations can b e supp orted eciently In this pap er wereportonanimplementation of the wordbased radix tree on a demonstration of the eectiveness of approximation for improved p erformance and on a mo dest exp erimental testing whichwas designed to study to what extent the ab ove theoretical exp ectations hold in practice Sp ecically our exp erimental study considers the following questions What maybeatypical constant c for the radix tree data structure with no approximation tolerance in the estimated running time of c lg U lg b What maybeatypical constant c for the radix tree data structure with approximation error for which the estimated running time is approximately c lg b and How do es the p erformance of the radix tree implementations compare with traditional priority queue implementations and what maybeatypical range of parameters for whichitmay b ecome comp etitive in practice We study two implementations of the radix tree one topdown the other bottomup The latter variant is more exp ensive in terms of memory usage but it has b etter time p erformance The main b o dy of the tests timed these two implementations on numerous randomly generated sequences varying the universe size U over the set f g the n umb er of op erations N over the set f g and the approximation tolerance over the set f U g where the case corresp onds to no tolerance for approximation The tests were p erformed on a single no de of an SGI Power Challenge with Gbytes of main memory The sequences are of three given typ es inserts only insertsfollowed by deleteminimums and inserts followed by mixed op erations Based on the measured running times we identify reasonable estimates to the constants c and c for which the actual running time deviates from the estimated running time byatmostabout The b ottomup radix tree was generally faster than the topdown one sometimes signicantly so For sequences with enough insertions the radix trees were faster than the traditional priority queues sometimes signicantly so The Data Structure We describ e the the topdown and the bottomup variants of the wordbased radix tree priority queue Both implementations are designed to takeadvantage of any tolerance of approximation in the application Sp ecically if an approximation error of is tolerated then each given key k is mapp ed b efore use to the approximate key k dlg k e details are given b elow This ensures that for instance the findminimum op eration will return an element at most times the true minimum Op erating on approximate keys reduces the universe size to the eective universe size U For completeness wegive the following general expression for U according to our implementation dlgdlgU eedlge if U U if Both the topdown and b ottomup implementations of the wordbased radixtree maintain a complete bary tree with U leaves where b is the numb er of bits p er machine word on the host computer here we test only b Each leaf holds the items having a particular approximate key value k The time p er op eration for b oth implementations is upp er b ounded by a constant times the number of levels in the tree whichis lg lg U lg if b b levels U dlg U e b lg U if b b For U and for b apower of two weget d lg be if levels U d lg U lg be if For all parameter values we test the ab ove inequality is tight with the number of levels ranging from to and the numb er of leaves ranging from to As mentioned ab ove each leaf of the tree holds the items having a particular approximate key value Eachinterior no de holds a single machine word whose ith bit indicates whether the ith subtree is empty The topdown variant allo cates memory only for no des with nonempty subtrees Each such no de in addition to the machine word also holds an arrayof b pointers to its children An item with a giv en key is found by starting at the ro ot and following p ointers to the appropriate leaf It examines a no des word to navigate quickly through the no de For instance to nd the index of the leftmost nonempty subtree it determines the most signicant bit of the word If a leaf is inserted or deleted the bits of the ancestors words are up dated as necessary The b ottomup variant preallo cates a single machine word for every interior no de emptyor not The word is stored in a xed p osition in a global arraysonopointers are needed The items with identical keys are stored in a doubly linked list which is accessible from the corresp onding leaf Each op eration is executed by up dating the appropriate item of the leaf and then up dating the ancestors words bymoving up the tree This typically will terminate b efore reaching the ro ot for instance an insert needs only to up date the ancestors whose subtrees were previously empty Details on implementations of the data structure and word op erations Belowwe describ e how the basic op erations are implemented in each of our data structures We also describ e implementations of the wordbased op erations we require Computing approximate keys j Given a key k this is howwe compute the approximate key k assuming for some nonnegativeinteger j Leta a a b e the binary representation of a key k U where B B B dlg U e Let m maxfi a g b e

Load more