Multiresolution Priority Queues and Applications

Jordi Ros-Giralt, Alan Commike, Peter Cullen, Richard Lethin
{giralt, commike, cullen, lethin}@reservoir.com
Reservoir Labs
632 Broadway, Suite 803, New York, NY 10012

Abstract: Priority queues are container data structures essential to many high performance computing applications. In this paper, we introduce multiresolution priority queues, a new container data structure that achieves greater performance than the standard heap based implementation by trading off a controllable amount of resolution in the space of priorities. The new data structure can reduce the worst case performance of inserting an element from O(log(n)) to O(log(r)), where n is the number of elements in the queue and r is the number of resolution groups in the priority space. The worst case cost of extracting the top element is O(1). When the number of elements in the table is high, the amortized cost to insert an element becomes O(1).

1. Introduction

Priority queues are a type of container data structure designed to guarantee that the element at the front of the queue is the greatest of all the elements it contains, according to some total ordering defined by their priority. This allows elements of higher priority to be extracted first, independently of the order in which they are inserted into the queue. Fast implementations of priority queues are important to many high performance computing (HPC) applications, and hence optimizing their performance has been the subject of intense research. In graph searching, the algorithmic complexity of essential algorithms such as finding the shortest path between two nodes in a graph relies entirely on the performance of a priority queue (see for instance Dijkstra's algorithm [1], [2]). Other essential applications are found in the field of bandwidth management. For instance, in the Internet, routers often have insufficient capacity to transmit all packets at line rate and so they use priority queues to assign higher preference to those packets that are most critical in preserving the quality of experience (QoE) of the network (e.g., packets carrying delay sensitive data such as voice). Priority queues are also important in other areas of research including Huffman compression codes, operating systems, Bayesian spam filtering, discrete optimization, simulation of colliding particles, or artificial intelligence, to name some examples.

The work we present stems from research on improving the performance of a priority queue by introducing a trade-off to exploit an existing gap between the applications' requirements and the resolution of priorities supported by a priority queue. With this objective, we introduce the concept of the multiresolution priority queue, a new container data structure that can flexibly trade off a controllable amount of resolution in the space of priorities in exchange for achieving greater performance.

A simple example works as follows. Suppose that we use a priority queue to hold jobs {job_1, job_2, ..., job_n} that need to be scheduled using a "shortest job first" policy. Each job job_i takes a time to process equal to p_i, and the removal of an element from the queue returns the job with the smallest processing time min{p_i}. (Hence p_i determines the priority of each job.) A common approach is to use a heap based implementation of a priority queue, which effectively provides support for all possible ranges of priorities p_i in the space of real numbers. Assume now that the set of possible priority values p_i is discretized into l subsets and that the application only requires guaranteeing the ordering between elements belonging to different subsets. A multiresolution priority queue is a data structure that can provide better performance than a standard priority queue by supporting the coarser grained resolution space consisting of the l subsets rather than the complete set of priorities.

Multiresolution priority queues can also be understood as a generalization of standard priority queues in which the total ordering requirement imposed on the set of priorities is relaxed, allowing the elements in the queue to be only partially ordered. Hence an alternative name to refer to this type of container data structure would be partial order priority queues.
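To make this trade-off concrete, the following minimal C sketch (ours, not from the paper) compares jobs only by the resolution group their processing time falls into; jobs inside the same group are treated as equal in priority, while jobs in different groups keep their relative order.

    #include <stdio.h>

    /* Illustration only (not the paper's code): the priority space
     * [P_MIN, P_MAX) is split into L resolution groups, and jobs are
     * compared by group index rather than by exact processing time. */
    #define P_MIN   0.0
    #define P_MAX 100.0
    #define L      10

    /* Map a processing time to its resolution group. */
    static int group_of(double p)
    {
        return (int)((p - P_MIN) * L / (P_MAX - P_MIN));
    }

    int main(void)
    {
        double a = 12.7, b = 14.2, c = 53.0;   /* hypothetical processing times */

        /* a and b land in the same group, so a multiresolution queue may
         * serve them in either order; c lands in a higher group and is
         * always served after them. */
        printf("group(a)=%d group(b)=%d group(c)=%d\n",
               group_of(a), group_of(b), group_of(c));   /* prints 1 1 5 */
        return 0;
    }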

This paper is organized as follows. In Section 2, we summarize some of the main historical results in the research area of high performance priority queues. Section 3 presents the body of our work, introducing a formal definition of the concept of multiresolution priority queues, providing pseudocode for the main routines to insert and extract elements from the queue, and analyzing the algorithm's complexity. This section also introduces two variants of the base algorithm to support sliding priority windows for applications such as event-driven simulators and a binary tree based optimization applicable to general graph algorithms such as finding the shortest path between two nodes. Section 4 describes our implementation of a multiresolution priority queue and demonstrates how it can help a real world HPC application achieve greater performance than a traditional binary heap based solution. We summarize the main conclusions of this work in Section 5.

(This work was funded in part by the US Department of Energy under contract DE-SC0017184. A patent application with Serial No. 62/512,438 has been submitted for this work.)

2. Background Work

Priority queues have been the subject of intense research since J. W. J. Williams introduced the binary heap data structure and the heapsort algorithm in 1964 [3]. Williams' work led to a very efficient implementation with a worst case cost to insert and extract elements from the priority queue of O(log(n)), where n is the number of elements in the queue. This is in many applications a preferred way to implement priority queues due to its simplicity and good performance. Variations exist that can achieve better performance on some operations at the expense of adding more implementation complexity, such as Fibonacci heaps [4], which have an amortized cost of inserting and removing an element of O(1) and O(log(n)), respectively.

Soft heaps by Chazelle [5] relate to our work in that they also achieve a constant amortized cost by introducing a certain amount of error. Specifically, a soft heap can break the information-theoretic barriers of the problem by introducing error in the key space in a way that the problem's entropy is reduced. While soft heaps provide the most asymptotically efficient known algorithm for finding minimum spanning trees [6], their applications are limited in that they cannot guarantee an upper bound on the error as a function of the number of elements in the queue. (Depending on the sequence of insertions and extractions, all keys could be corrupted.) Instead, following on Chazelle's analogy, a multiresolution priority queue breaks the information-theoretic barriers of the problem by exploiting the multiresolution properties of its priority space. Since in many problems the entropy of the priority space is lower than the entropy of the key space, the end result is a container data structure with a lower computational complexity. In particular, a multi-resolutive priority space will have a constant entropy independent of the number of elements in the queue, and this result leads to our O(1) bound.

Our approach is perhaps closest to the data structures known as bucket queues [7] and calendar queues [8]. When the set of priorities are integers and bounded by C, a bucket queue can insert and extract elements in O(1) and O(C), respectively. The resolution of this data structure however cannot be tuned, and it does not support the general case of priorities in the set of real numbers. Calendar queues are O(1) for both insert and remove operations, but to achieve these bounds they need to be balanced, and they also do not support changing the resolution of the priority set to optimize the trade off of performance versus accuracy to match the application requirements.

Our data structure can also be understood as a type of sketching data structure. These are data structures that can answer certain questions about the data they store extremely efficiently, at the price of an occasional error. Examples of sketching data structures are Bloom filters [9], the count-min sketch [10] and the LFN data structure [11].

3. Multi-Resolution Priority Queues

3.1. Definition

A multiresolution priority queue (MR-PQ) is a container data structure that retrieves its elements based on their priority and according to three parameters: a priority resolution pΔ and two priority limits pmin and pmax. The priority limits are chosen such that pmax > pmin and (pmax − pmin) is divisible by pΔ. Unlike traditional priority queues, multiresolution priority queues do not guarantee the total order of elements within a certain priority resolution pΔ. However, by trading off a small amount of resolution, they can achieve higher performance. In particular, because many real world applications do not require priority queues to support infinite resolution spaces, in many of these cases MR-PQs can be used to achieve greater performance at no cost, as we will show.

More formally, an MR-PQ is a queue that at all times maintains the following invariant:

Property 1. Multiresolution Priority Queue (MR-PQ) Invariant. Let e_i and e_j be two arbitrary elements with priorities p_i and p_j, respectively, where pmin ≤ p_i < pmax and pmin ≤ p_j < pmax. Then for all possible states, a multiresolution priority queue ensures that element e_i is dequeued before element e_j if the following condition is true:

    ⌊(p_i − pmin)/pΔ⌋ < ⌊(p_j − pmin)/pΔ⌋                                    (1)

Intuitively, an MR-PQ discretizes the priority space into a sequence of slots or resolution groups [pmin, pmin + pΔ), [pmin + pΔ, pmin + 2pΔ), ..., [pmax − pΔ, pmax) and prioritizes elements according to the slot in which they belong. Those elements belonging to lower slots are given higher priority, while those in the same slot are given the same priority. Within a slot, a multiresolution priority queue does not guarantee the priority ordering of its elements, which enables a mechanism to control the trade off between the resolution of the priority set and the performance of the queue. As we will see, the larger the resolution parameter pΔ, the higher the performance of the queue, at the expense of making the priority space coarser. In the limit where the priority resolution becomes infinitely small, pΔ → 0, an MR-PQ degenerates to a standard priority queue in which the invariant in Property 1 is satisfied for arbitrarily small priority slots, hence achieving infinite resolution.

Notice that in our chosen notation, and without loss of generality, elements with smaller values of priority are treated as higher priority. This is also commonly known in the literature as a min-priority queue [2]. While this choice is arbitrary and purely conventional, it will make the generalization to support sliding priorities in Section 3.4 easier. Well-known examples of applications using min-priority queues include event-driven simulators (which will be the subject of our performance benchmarks in Section 4) or shortest path algorithms such as Dijkstra's algorithm [1], among many others. (See also Section 3.5.)

3.2. The Base Algorithm

A multiresolution priority queue (MR-PQ) consists of two data structures: a queue (queue) and a queue lookup table (qlt). The queue is implemented as a doubly linked list of elements, while the lookup table is implemented as a one dimensional array of pointers to elements. Elements stored in the doubly linked list have a next pointer (next) and a previous pointer (prev). They also include a third field, which is a number representing their priority (prio).

Let pmin, pmax, and pdelta be three parameters initialized to pmin, pmax and pΔ, respectively. Then the following instructions are used to build an empty MR-PQ:

Build(q)
 1  sentinel = alloc_element();
 2  queue = sentinel;
 3  queue->next = queue;
 4  queue->prev = queue;
 5  qltsize = (pmax - pmin)/pdelta + 1;
 6  for i in [1, qltsize):
 7      qlt[i] = NULL;
 8  qlt[0] = queue;
 9  queue->prio = pmin - pdelta;

The queue object is implemented as an element acting as its sentinel (lines 1 and 2). Lines 3 and 4 set the queue to empty. The lookup table qlt is implemented as a vector of pointers to elements in the queue with as many entries as the number of resolution slots, (pmax − pmin)/pΔ, plus one to accommodate the sentinel. In its initial state, all of its elements are set to NULL (since the queue is empty) except for the first element, which is permanently set to the sentinel (lines 6, 7 and 8). In line 9, the sentinel is given a fictitious priority of pmin − pΔ, to ensure that no other element can have a lower priority value than itself. Serving the role of guarding the queue, the sentinel is never removed.

To insert an element e, the base MR-PQ algorithm executes the following instructions:

Insert(e)
10  slot = slot_iter = QltSlot(e);
11  while qlt[slot_iter] == NULL:
12      slot_iter--;
13  if slot_iter == slot:               // Add to the left
14      e->next = qlt[slot];
15      e->prev = qlt[slot]->prev;
16      qlt[slot]->prev->next = e;
17      qlt[slot]->prev = e;
18  else:                               // Add to the right
19      e->next = qlt[slot_iter]->next;
20      e->prev = qlt[slot_iter];
21      qlt[slot_iter]->next->prev = e;
22      qlt[slot_iter]->next = e;
23      qlt[slot] = e;

The invocation of QltSlot(e) in line 10 calculates an index into the qlt table associated with element e as follows:

QltSlot(e)
24  slot = (int)((e->prio - queue->prio)/pdelta);
25  return slot;

After computing the slot of element e in line 10, we perform a search on the qlt table for a non-empty slot slot_iter, starting from index slot and decrementing the index at each iteration (lines 11 and 12). Notice that because qlt[0] is permanently set to the queue's sentinel, this operation is guaranteed to complete. If the slot_iter found is slot, it means there is already one or more elements stored in e's slot. In this case, we doubly-link e to the left of the first element found in the slot, qlt[slot] (lines 14 through 17). If instead e's slot is empty, then we doubly-link e to the right of the element found in the first non-empty slot, qlt[slot_iter] (lines 19 through 22), and set qlt[slot] to point to e (line 23). This ordering strategy is important to ensure that the MR-PQ invariant (Property 1) is preserved at all times, as we will formally demonstrate in Section 3.3. Intuitively, the insertion of an element in the MR-PQ data structure consists in building a list of elements that are doubly linked in a way that maintains a correct priority ordering according to the resolution parameter pΔ.
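The pseudocode above maps almost one-to-one onto C. The following is a compilable sketch under our reading of it (type and function names are ours, and error handling is omitted); note that the lookup table entry is updated only when the element opens a previously empty slot, so qlt[slot] always points to the rightmost element of its resolution group. The extraction routines and a small driver that replays the example of Fig. 1 appear after the remaining pseudocode below.

    #include <stdlib.h>

    /* A C sketch of the MR-PQ data structures and of Build, QltSlot and
     * Insert as described above; names are ours, the paper gives pseudocode. */

    typedef struct elem {
        struct elem *next, *prev;   /* doubly linked, circular via the sentinel */
        double prio;                /* priority (smaller value = higher priority) */
    } elem_t;

    typedef struct {
        elem_t  *queue;             /* sentinel element                  */
        elem_t **qlt;               /* queue lookup table, one per slot  */
        int      qltsize;
        double   pdelta;
    } mrpq_t;

    /* Lines 24-25: slot index of an element (slot 0 is the sentinel's). */
    static int qlt_slot(const mrpq_t *q, const elem_t *e)
    {
        return (int)((e->prio - q->queue->prio) / q->pdelta);
    }

    /* Lines 1-9: build an empty queue whose sentinel guards slot 0. */
    static mrpq_t *mrpq_build(double pmin, double pmax, double pdelta)
    {
        mrpq_t *q = malloc(sizeof *q);
        elem_t *s = malloc(sizeof *s);
        q->pdelta  = pdelta;
        q->qltsize = (int)((pmax - pmin) / pdelta) + 1;
        q->qlt     = calloc((size_t)q->qltsize, sizeof *q->qlt); /* all slots empty */
        s->next = s->prev = s;          /* empty circular list            */
        s->prio = pmin - pdelta;        /* fictitious lowest priority     */
        q->queue  = s;
        q->qlt[0] = s;                  /* permanent: makes the scan terminate */
        return q;
    }

    /* Lines 10-23: link e so that slot numbers never decrease along the
     * next direction; qlt[slot] stays on the rightmost element of its group. */
    static void mrpq_insert(mrpq_t *q, elem_t *e)
    {
        int slot = qlt_slot(q, e), it = slot;
        while (q->qlt[it] == NULL)      /* scan towards the sentinel       */
            it--;
        if (it == slot) {               /* group occupied: add to the left */
            e->next = q->qlt[slot];     /* of its rightmost element        */
            e->prev = q->qlt[slot]->prev;
            q->qlt[slot]->prev->next = e;
            q->qlt[slot]->prev = e;
        } else {                        /* group empty: add to the right of */
            e->next = q->qlt[it]->next; /* the first non-empty lower group  */
            e->prev = q->qlt[it];
            q->qlt[it]->next->prev = e;
            q->qlt[it]->next = e;
            q->qlt[slot] = e;           /* and make e the group's entry     */
        }
    }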

As an example, let a multiresolution priority queue have parameters pΔ = 3, pmin = 7 and pmax = 31, and assume we insert seven elements with priorities 19, 11, 17, 12, 24, 22 and 29 (inserted in this order). Then, Fig. 1 provides the state of the MR-PQ after inserting all the elements. Notice that while elements belonging to different slots (referred to in Fig. 1 as resolution groups RG1 through RG5) are guaranteed to be served according to their priority (e.g., the elements with priorities 12 and 11 are served before the element with priority 17), the queue does not guarantee the ordering of elements that belong to the same slot (e.g., the element with priority 12 is served before the element with priority 11).

Fig. 1. The state of a multiresolution priority queue after inserting elements with priorities 19, 11, 17, 12, 24, 22 and 29 (in this order).

The operation to peek (without removing) the top priority element from the queue consists in retrieving the first element in the queue and checking that it is different than the sentinel:

Peek()
26  e = queue->next;
27  if e == queue:
28      return NULL;                    // Queue is empty
29  return e;

To extract an arbitrary element from the queue, we unlink it (lines 30 and 31) and invoke QltRepair to repair the lookup table (line 32):

Extract(e)
30  e->prev->next = e->next;
31  e->next->prev = e->prev;
32  QltRepair(e);
33  return e;

If element e is not pointed to by any of the lookup table entries, then there is nothing to repair (lines 35 and 36). Otherwise, if e->prev is in the same slot as e (line 37), then we set qlt[slot] to point to e->prev (line 38) to ensure the slot is pointing to the rightmost element of its resolution group. If instead e->prev is not in the same slot as e, then the slot has become empty and we set qlt[slot] to NULL (line 40).

QltRepair(e)
34  slot = QltSlot(e);
35  if qlt[slot] != e:
36      return;                         // Nothing to fix
37  if slot == QltSlot(e->prev):
38      qlt[slot] = e->prev;            // Fix the slot
39  else:
40      qlt[slot] = NULL;               // No elements left in the slot

Now the operation to extract the highest priority element from the queue can be performed with a Peek() followed by an Extract(e):

ExtractMin()
41  e = Peek();
42  Extract(e);
43  return e;
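Continuing the sketch above (it reuses elem_t, mrpq_t, qlt_slot, mrpq_build and mrpq_insert from the previous listing), a possible rendering of the extraction routines together with a small driver that replays the example of Fig. 1; the dequeue order respects resolution groups but not the exact priorities within a group.

    #include <stdio.h>

    /* Lines 26-29: front of the queue, or NULL if only the sentinel remains. */
    static elem_t *mrpq_peek(mrpq_t *q)
    {
        elem_t *e = q->queue->next;
        return (e == q->queue) ? NULL : e;
    }

    /* Lines 34-40: keep qlt[slot] on the rightmost element of e's group. */
    static void qlt_repair(mrpq_t *q, elem_t *e)
    {
        int slot = qlt_slot(q, e);
        if (q->qlt[slot] != e)
            return;                     /* nothing to fix              */
        if (slot == qlt_slot(q, e->prev))
            q->qlt[slot] = e->prev;     /* previous element takes over */
        else
            q->qlt[slot] = NULL;        /* the group is now empty      */
    }

    /* Lines 30-33: unlink an arbitrary element and repair the lookup table. */
    static elem_t *mrpq_extract(mrpq_t *q, elem_t *e)
    {
        e->prev->next = e->next;
        e->next->prev = e->prev;
        qlt_repair(q, e);
        return e;
    }

    /* Lines 41-43, with a NULL guard for the empty-queue case. */
    static elem_t *mrpq_extract_min(mrpq_t *q)
    {
        elem_t *e = mrpq_peek(q);
        return e ? mrpq_extract(q, e) : NULL;
    }

    int main(void)
    {
        static const double prios[] = { 19, 11, 17, 12, 24, 22, 29 };
        mrpq_t *q = mrpq_build(7, 31, 3);  /* pmin = 7, pmax = 31, pdelta = 3 */
        elem_t e[7];
        elem_t *x;

        for (int i = 0; i < 7; i++) {
            e[i].prio = prios[i];
            mrpq_insert(q, &e[i]);
        }
        while ((x = mrpq_extract_min(q)) != NULL)
            printf("%g ", x->prio);        /* prints: 12 11 17 19 22 24 29 */
        printf("\n");
        return 0;
    }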


3.3. Correctness and Computational Cost

In the following two lemmas, we state the correctness and computational cost of the MR-PQ algorithm.

Lemma 1. Correctness of the MR-PQ algorithm. The Insert and Extract routines preserve the MR-PQ invariant (Property 1).

Proof. Notice first that the Peek and ExtractMin routines select elements from the queue by using the sentinel's next pointer (line 26), and therefore to prove the above lemma it suffices to show that equation (1) is preserved for any two consecutive elements e_i and e_j such that (e_i->next) = e_j. In what follows, we will assume that the queue is in a state in which the MR-PQ invariant is true and demonstrate that after the Insert or the Extract routines the queue continues to preserve the invariant.

Assume that we invoke the Insert routine on element e_i. There are two cases: qlt[QltSlot(e_i)] is not NULL or qlt[QltSlot(e_i)] is NULL. Assume the first case; then lines 14 through 17 insert element e_i between two elements e_j and e_k such that (e_j->next) = e_i = (e_k->prev). After the insertion, the following is true:

- QltSlot(e_j) = QltSlot(e_i) or QltSlot(e_j) < QltSlot(e_i), in which case the invariant is preserved for elements e_i and e_j.
- QltSlot(e_k) = QltSlot(e_i), which implies that the invariant is also preserved for elements e_i and e_k.

Assume now that qlt[QltSlot(e_i)] is NULL; then lines 19 through 22 insert element e_i between two elements e_j and e_k such that (e_j->next) = e_i = (e_k->prev). After the insertion, the following is true:

- QltSlot(e_j) < QltSlot(e_i), in which case the invariant is preserved for elements e_i and e_j.
- QltSlot(e_k) > QltSlot(e_i), and hence the invariant is also preserved for elements e_i and e_k.

Assume now that we invoke the Extract routine on element e_i, and let e_j and e_k be the elements before and after e_i in the queue, respectively. There are three cases:

- Case 1: qlt[QltSlot(e_i)] ≠ e_i (lines 35 and 36). In this case we have that QltSlot(e_j) ≤ QltSlot(e_i) and QltSlot(e_k) = QltSlot(e_i). Hence after extracting e_i we have QltSlot(e_j) ≤ QltSlot(e_k) and the invariant is preserved.
- Case 2: qlt[QltSlot(e_i)] = e_i and QltSlot(e_i) = QltSlot(e_j) (lines 37 and 38). In this case we have that QltSlot(e_k) > QltSlot(e_i), hence QltSlot(e_j) < QltSlot(e_k) and the invariant is preserved.
- Case 3: qlt[QltSlot(e_i)] = e_i and QltSlot(e_i) ≠ QltSlot(e_j) (lines 39 and 40). In this case we have that QltSlot(e_j) < QltSlot(e_i) and QltSlot(e_k) > QltSlot(e_i). Hence after extracting e_i we have QltSlot(e_j) < QltSlot(e_k) and the invariant is preserved. ∎

Let us now state the complexity of the MR-PQ algorithm:

Lemma 2. Complexity of the MR-PQ algorithm. The worst case complexity of the MR-PQ algorithm for the Insert routine is O(r), where r = (pmax − pmin)/pΔ is the number of resolution groups supported by the queue. The complexity of the Insert routine becomes O(1) if there is at least one element in each slot. The complexity of the Peek, ExtractMin and Extract routines is O(1).

Proof. The cost of the Insert routine is given by the execution of lines 11 and 12, which in the worst case require a linear scan of the qlt table, leading to a cost of O(r), where r = (pmax − pmin)/pΔ is the size of the table or number of resolution groups. The performance of this method becomes O(1) when the qlt table is fully utilized, since the while loop in line 11 returns immediately after the first iteration. The rest of the methods (Peek, ExtractMin and Extract) have a cost of O(1) since they simply require looking up the top element in a doubly-linked list or removing an element from it. ∎

A key observation from Lemma 2 is that MR-PQ performs best when the number of elements in the table is very large, because then most of the slots in the qlt table will tend to have at least one element and the cost of running the Insert routine will tend to O(1). This characteristic of MR-PQ queues is beneficial in that it is precisely when the number of elements is very large that one desires to achieve the greatest possible performance.

In the next two sections we will introduce two variants of the base MR-PQ algorithm which exploit this property to beat the performance of a standard priority queue. First, we extend the algorithm to support sliding priorities, enabling support for event-driven applications. For this type of application, we will show that the lower bound cost of O(1) to insert an element can be achieved. Second, we will incorporate the idea of using a binary heap to order the resolution groups to achieve a new worst case lower bound performance in inserting an element of O(log(r)). For those applications in which r (the number of resolution groups) is much smaller than n (the number of elements), the proposed algorithm can beat traditional high performance priority queues, as we will show.
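As a back-of-the-envelope comparison (the numbers are ours, not the paper's): with one-second resolution over a ten-minute priority window, r = 600, while a binary heap holding one million elements pays roughly log2(10^6) ≈ 20 comparisons per insert regardless of how densely the priority space is populated.

    #include <math.h>
    #include <stdio.h>

    /* Hypothetical numbers for illustration only: a 10-minute window at
     * 1-second resolution versus a binary heap with one million elements. */
    int main(void)
    {
        double pmin = 0.0, pmax = 600.0, pdelta = 1.0;
        double n = 1e6;
        int r = (int)((pmax - pmin) / pdelta);

        printf("MR-PQ worst case insert scan: r = %d slots\n", r);
        printf("MR-PQ amortized insert with dense slots: O(1)\n");
        printf("Heap-ordered groups: about log2(r) = %.0f steps\n", log2((double)r));
        printf("Binary heap insert: about log2(n) = %.0f comparisons\n", log2(n));
        return 0;
    }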


3.4. Generalization to Sliding Priorities

The base MR-PQ algorithm assumes that the priorities of the elements in the queue belong to the set [pmin, pmax). In this section, we generalize the base algorithm to support a more flexible schema by allowing the priority set to slide following the expression [pmin + δ(t), pmax + δ(t)), where δ(t) is a monotonically increasing function of a parameter t.

The case of sliding priority sets is particularly relevant to applications that run on event-driven simulators. (See for example Section 6.5 of [2].) These applications are traditionally implemented using priority queues in which the elements are events to be processed at a future time and their priority field is the time at which these events need to be triggered. In particular, for these applications we have δ(t) = t, where t is time itself. Because events with a lower time value need to be processed before those with a higher value, event-driven simulators are commonly implemented using min-priority queues.

In order to extend the base MR-PQ algorithm to support sliding priorities, we make the following observation, applicable to event-driven simulators: if the next event to process has a priority field equal to p (corresponding to the time at which the event needs to be processed), then it must be that the application's current time is at least p, and therefore we can conclude that no other event will be inserted into the queue with a priority value p − pΔ or lower. As a result, we can set the sentinel's priority field to p − pΔ and shift the values of the qlt table so that the first slot points to the priority resolution group of p. We implement this shifting feature in a new routine called QltSlide, which is passed a parameter prio corresponding to the priority p of the last element removed from the queue:

QltSlide(prio)
44  shift = (prio - queue->prio)/pdelta - 1;
45  if shift < 1:
46      return;
47  for i in [1, qltsize):
48      if i < qltsize - shift:
49          qlt[i] = qlt[i + shift];
50      else:
51          qlt[i] = NULL;
52  queue->prio = prio - pdelta;

Specifically, in line 44 we compute the shift in the priority space that results from extracting a top element with priority prio. In lines 45 and 46, if the shift is less than one slot, there is no need to slide the priority space and the routine returns. (In event-driven simulators, this corresponds to the case in which time has not advanced enough to shift the qlt table by one or more slots.) Otherwise, we shift the qlt table to the left by shift slots and pad the last shift slots with NULL (lines 47 through 51). In line 52, we update the priority of the sentinel to reflect the shift. (In event-driven simulators, this is equivalent to setting the priority of the sentinel to the current simulator's time.)

We invoke the function QltSlide every time we extract an element from the top of the queue, which requires us to update the routine ExtractMin as follows:

ExtractMin()
53  e = Peek();
54  Extract(e);
55  QltSlide(e->prio);                  // Added to support sliding priorities
56  return e;

Lemma 3. Correctness of the MR-PQ algorithm with sliding priorities. The modified ExtractMin routine preserves the MR-PQ invariant (Property 1).

Proof. The only variant introduced consists in invoking the QltSlide method in line 55. Notice that by construction, this method removes slots from the left of the qlt table that will no longer be populated (since the priority space has shifted by shift slots) and sets to empty the same number of slots to the right of the qlt table (since those slots have not been populated yet). The remaining slots are not altered, which guarantees the preservation of the MR-PQ invariant. ∎

In Section 4 we show how the MR-PQ algorithm with sliding priorities can be used to improve the performance of an HPC application.
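Continuing the C sketch from Section 3.2 (same types and helper routines), a possible rendering of QltSlide and of the sliding ExtractMin; as in the pseudocode, the sentinel's priority tracks the left edge of the sliding window.

    /* Continues the earlier sketch; requires mrpq_t, elem_t, mrpq_peek and
     * mrpq_extract from the previous listings. Names are ours. */

    /* Lines 44-52: slide the priority window after extracting an element
     * with priority prio, discarding slots that can no longer be populated
     * and clearing the same number of slots at the right end of the table. */
    static void qlt_slide(mrpq_t *q, double prio)
    {
        int shift = (int)((prio - q->queue->prio) / q->pdelta) - 1;
        if (shift < 1)
            return;                                 /* window unchanged       */
        for (int i = 1; i < q->qltsize; i++)
            q->qlt[i] = (i < q->qltsize - shift) ? q->qlt[i + shift] : NULL;
        q->queue->prio = prio - q->pdelta;          /* sentinel tracks window */
    }

    /* Lines 53-56: ExtractMin for the sliding variant. */
    static elem_t *mrpq_extract_min_sliding(mrpq_t *q)
    {
        elem_t *e = mrpq_peek(q);
        if (e == NULL)
            return NULL;
        mrpq_extract(q, e);
        qlt_slide(q, e->prio);
        return e;
    }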


3.5. Binary Heap Based Multiresolution Priority Queues

Lemma 2 shows that the worst case performance to insert an element in the MR-PQ is O(r), where r is the number of resolution groups supported by the queue. This linear cost comes from lines 11 and 12 in the Insert routine which, in the worst case, require scanning the complete qlt table to find a non-empty slot. We observed that for HPC applications this cost will be amortized by the fact that the number of elements in the queue will be very high, and so the actual complexity will tend to be O(1) as on average all the slots become filled in with at least one element. (See Section 4 for an actual HPC application operating at this bound.)

For the case in which not all of the slots in the qlt table are filled in, we can easily further improve the worst case cost of MR-PQ by using a binary search tree to implement the qlt table, instead of a vector as is done in the base implementation. Fig. 2 shows the example presented in Fig. 1 using the binary tree optimization. The result of using a binary search tree is that the cost of searching for the next non-empty slot in lines 11 and 12 of the Insert routine can be reduced from O(r) to O(log(r)). We will refer to this variant as a binary tree based multiresolution priority queue, or BT-MR-PQ.

Fig. 2. The example of MR-PQ in Fig. 1, using a binary tree to implement the qlt table.

In the next table we summarize the computational cost of the three approaches: the standard binary heap based priority queue (BH-PQ), the base multiresolution priority queue (MR-PQ), and the binary tree based multiresolution priority queue (BT-MR-PQ).

Table 1. Computational cost.

  Algorithm   Insert         Peek   ExtractMin   Extract
  BH-PQ       O(log(n))      O(1)   O(log(n))    O(log(n))
  MR-PQ       O(r) or O(1)   O(1)   O(1)         O(1)
  BT-MR-PQ    O(log(r))      O(1)   O(1)         O(1)

To illustrate the benefits of using BT-MR-PQ, consider the problem of finding the shortest path between two nodes in a graph using Dijkstra's algorithm. The standard implementation of Dijkstra relies on a min-priority queue implemented using a binary heap. To resolve a graph with v vertices and e edges, where each edge has a weight corresponding to its cost, Dijkstra's algorithm requires v Insert operations, v ExtractMin operations, and e ExtractMin and Insert operations. Since in a priority queue implemented as a binary heap the cost of Insert and ExtractMin is O(log(v)), we have that the cost of executing Dijkstra's algorithm is O((v + e) log(v)). (Notice that this bound can be reduced to O(e + v log(v)) if we use a Fibonacci heap [4], but the same general argument applies.) If instead we use a multiresolution priority queue, then the cost becomes O((v + e) log(r)), where r is the number of resolutions in the space of edge weights. Hence for the problems in which r is much smaller than v, BT-MR-PQ provides a better bound. Further, if v is very large, from Lemma 2, the MR-PQ base implementation can lead to an amortized cost of the Insert operation of O(1), making the amortized cost of Dijkstra's algorithm O(v + e). We leave for a future paper the study of the implications of this result to applications in the field of graph theory.

4. Implementation and Benchmarks

We have implemented and tested multiresolution priority queues to improve the performance of Bro, an event-driven application which enables deep network analytics [12]. Bro uses a standard priority queue implemented as a binary heap to manage the system's timers. While binary heaps are excellent implementations of priority queues, we find that when dealing with very high speed traffic (e.g., at rates of 10 Gbps), they become a system bottleneck. This is illustrated in Fig. 3, which shows a call graph profile of the computational cost (measured in percentage of CPU usage) incurred by some of the main functional parts of the Bro stack. Each box presents a profiled function and includes two percentage values: the first reflects the cost of that function divided by the total aggregated computational cost of Bro; the second reflects the cost of that function and all of its children functions divided by the total aggregated cost. The next list illustrates the top functions in Bro ordered by their computational cost:

  Total: 63724 samples
   4139  6.5%  PriorityQueue::BubbleDown
   2500  3.9%  SLL_Pop
   1899  3.0%  Ref
   1829  2.9%  Unref
   1701  2.7%  PackedCache::KeyMatch
   1537  2.4%  Attributes::FindAttr
   1249  2.0%  Dictionary::Lookup
   1184  1.9%  NameExpr::Eval

As shown, the function PriorityQueue::BubbleDown takes the top spot with a cost of 6.5% of the total processing cost. This function implements the standard bubble down operation (sometimes also known as heapify [2]) that is part of the binary heap based implementation of a priority queue. Specifically, Bro uses this method every time it needs to fetch the next element from the queue of timers. While binary heaps provide an efficient implementation of priority queues (with the BubbleDown function having a computational cost of O(log(n)), where n is the number of elements in the queue), the above profiling results illustrate that they can still become a critical bottleneck in real world HPC applications. This is especially relevant if we consider that Bro's implementation of a priority queue corresponds to 105 lines of C++ code, whereas the project includes a total of 333,930 lines of C/C++ code (without accounting for the many lines of non-C interpreted code run by the engine). In other words, the priority queue implementation corresponds to 0.03% of the total C/C++ code in Bro, but it takes a computational cost of 6.5%.

Fig. 3. Binary heap based priority queues can become a system bottleneck.

This result is explained by the sheer number of timers Bro's priority queue needs to handle. At rates of 10 Gbps, it may need to handle in the order of tens of thousands of connections originated per second. In Bro, each connection sets multiple timers to manage its various protocol states (examples of timers in Bro include connection establishment, connection inactivity or connection linger timeouts, among others), potentially leading to hundreds of thousands of timers generated every second. Under this scenario, we observe that:

- Bro's timeout values are represented as a double type, effectively enabling a priority set with infinite resolution in the space of real numbers.
- However, all timeout values in Bro are at least 1 second.

The above presents a mismatch between the application requirements and the capabilities provided by the implementation. The implementation unnecessarily provides infinite resolution timers, which leads to unnecessary processing overhead. Multiresolution priority queues allow applications to tune the priority resolution of the implementation to match the necessary requirements (no more and no less) to achieve better performance. In our application, choosing an MR-PQ with parameter pΔ = 1 second is enough, since no Bro timer requires a resolution lower than 1 second. Further, notice that in processing network traffic, more than one timer is generated for each 1 second resolution group (indeed, on the order of tens to hundreds of thousands of timers are generated every second at 10 Gbps), which means that the performance of MR-PQ will be optimal at O(1) (see Lemma 2).

We have modified Bro's priority queue with the sliding version of the MR-PQ algorithm described in Sections 3.2 and 3.4. The resulting implementation takes a total of 144 lines of C++ code. We run Bro as a network traffic analysis engine on a host equipped with two Intel Xeon E5-2670 processors clocked at 2.50 GHz, for a total of 20 physical cores with 25.6 MB of L3 cache per processor, and 4 x 10 Gbps SFP optical network interfaces. Traffic is load balanced onto a cluster of Bro nodes. (See [13] for a description of Bro's cluster architecture.) The system is fed with a traffic dataset generated from a mix of real packet traces from our corporate network in New York and synthetically generated traffic using an Ixia traffic generator, resulting in a dataset of human generated traffic (for applications such as HTTP/HTTPS) and machine generated flows (for services such as SNMP). Table 2 broadly summarizes the traffic dataset.

Table 2. Statistics of the traffic dataset.

  TCP      UDP    ICMP    Other   Avg pkt size (B)   Avg conn size (KB)
  92.34%   7.5%   0.02%   0.04%   510.25             7050.16
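To illustrate the kind of change described above (choosing pΔ = 1 second and the sliding variant), the following is a hypothetical timer wrapper in the style of our earlier C sketch, not Bro's actual C++ code: each timer is scheduled with its absolute expiration time as the priority, and timers that expire within the same one-second group may fire in any order.

    /* Hypothetical illustration built on the earlier sketch (elem_t, mrpq_t,
     * mrpq_insert, mrpq_peek, mrpq_extract, qlt_slide); not Bro's code. The
     * queue is assumed to be built with pdelta = 1 s and a [pmin, pmax)
     * window wide enough to cover the largest timeout as it slides. */
    struct timer {
        elem_t link;                    /* embedded queue element; prio = expiry */
        void (*fire)(struct timer *);   /* user callback                         */
    };

    static void timer_schedule(mrpq_t *q, struct timer *t, double expiry)
    {
        t->link.prio = expiry;          /* absolute time, in seconds */
        mrpq_insert(q, &t->link);
    }

    /* Called with the current time: fire every timer that is due. Timers in
     * the same 1-second resolution group may fire in either order. */
    static void timer_dispatch(mrpq_t *q, double now)
    {
        elem_t *e;
        while ((e = mrpq_peek(q)) != NULL && e->prio <= now) {
            mrpq_extract(q, e);
            qlt_slide(q, e->prio);
            ((struct timer *)e)->fire((struct timer *)e);
        }
    }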


Fig. 4 through Fig. 8 present various performance metrics resulting from running Bro with its default binary heap priority queue (BH-PQ) and with our implementation of the multiresolution priority queue (MR-PQ) against our traffic dataset at various rates from 2 Gbps up to 10 Gbps.

Fig. 4 and Fig. 5 focus on finer grained CPU level metrics by providing the performance of the cache and the CPU instruction pipeline. We observe that MR-PQ improves both the cache performance and the instructions per cycle. Improvements in the cache performance can be explained by the reduction of the cost of inserting an element into the priority queue from O(log(n)) for a standard priority queue to O(1) for MR-PQ. For each incoming connection, Bro's BubbleDown function requires scanning the timers' binary heap multiple times (once for each timer type), which leads to excessive cache thrashing. On the other hand, MR-PQ simply requires linking a single element into a doubly linked list.

Fig. 4. Cache miss ratio.

In Fig. 5, the improvement in the instructions per cycle (IPC) can be explained as a consequence of the improvements in the cache performance. Since MR-PQ reduces cache miss ratios, this leads to fewer stalling cycles waiting for data fetches from main memory to reload the cache, which leads to better utilization of the CPU instruction pipeline and hence an improved IPC metric.

Fig. 5. Instructions per cycle (IPC).

Fig. 6, 7 and 8 focus on system wide performance metrics. Fig. 6 provides the total percentage of packets dropped by the network interface card (NIC). These packets are dropped by the NIC due to the Bro application not being able to process packets fast enough. Again, thanks to removing the bottleneck produced by the BubbleDown function, MR-PQ substantially reduces packet drops across the board for all traffic rates.

Fig. 6. Packet drop rate.

In a highly complex system like Bro, which includes many layers of packet processing (network, protocol analyzers, high level analytics, logging of metadata into disk, housekeeping, etc.), it is often difficult to optimize the system for one parameter and obtain improvements for all performance metrics. A common error in performing system wide optimization is to design the system for maximum packet throughput. In such cases, a reduction in packet drops often leads to penalties in the amount of output metadata generated by the system, because by forcing the system to ingest more packets than it can possibly process, non-linear negative effects (such as an increase in cache thrashing) lead to lower system wide performance [14].

To verify that our optimization does not only reduce packet drops but also improves system wide performance, we measure the total number of events generated by Bro. (Events in Bro are responsible for generating the metadata extracted from the traffic and hence can be understood as a system wide performance metric.) As shown in Fig. 7, with MR-PQ the system also generates more events. To reinforce this point, in Fig. 8 we also present the amount of files carried by network flows (from protocols such as SMTP, FTP, etc.) that Bro can process, showing also an improvement when using MR-PQ.

Fig. 7. Total network events generated.

Fig. 8. Total files processed.

In summary, the implementation of Bro's timer manager using a multiresolution priority queue (a change of 144 lines of C++ code) leads to performance improvements of both fine grained CPU metrics and all measured system wide performance metrics, thanks to reducing the main cost of inserting an element into the priority queue from O(log(n)) to amortized O(1).


5. Conclusions

The implementation of high performance priority queues has been the subject of intense research since the binary heap data structure and the heapsort algorithm were invented in 1964. To the best of our knowledge, all existing work has focused on solving the total ordering problem, whereby for any pair of elements e_i and e_j, either e_i ≤p e_j or e_j ≤p e_i, for some priority ordering ≤p. In this paper we propose to relax this requirement to design a priority queue that only guarantees the partial ordering of its elements. The result is a new data structure called the multiresolution priority queue, which allows the application to trade off a small resolution error in the set of priorities to achieve better performance.

The new data structure reduces the worst case performance of inserting an element from O(log(n)) to O(log(r)), where n is the number of elements in the queue and r is the number of resolution groups in the priority space. In many applications r << n, which makes multiresolution priority queues very efficient data structures. Further, when the number of elements in the table is high, the amortized cost to insert an element becomes O(1).

An interesting result is that if the application deals with elements in a priority set that can be organized into r resolution groups where each group has only one possible priority value, then the performance gain of a multiresolution priority queue comes free of any error. Forthcoming work will focus on further exploring how this property can be used to accelerate the performance of essential algorithms, such as finding the shortest path between two nodes in a graph, which are core to many HPC applications in the field of computer science.

References

[1] E. W. Dijkstra, "A note on two problems in connexion with graphs," Numer. Math., vol. 1, no. 1, pp. 269–271, 1959.
[2] T. H. Cormen, Introduction to Algorithms. MIT Press, 2009.
[3] J. W. J. Williams, "Algorithm 232 - Heapsort," Commun. ACM, vol. 7, no. 6, Jun. 1964.
[4] M. L. Fredman and R. E. Tarjan, "Fibonacci heaps and their uses in improved network optimization algorithms," in 25th Annual Symposium on Foundations of Computer Science, 1984.
[5] B. Chazelle, "The soft heap: an approximate priority queue with optimal error rate," J. ACM, vol. 47, no. 6, pp. 1012–1027, 2000.
[6] B. Chazelle, "A minimum spanning tree algorithm with inverse-Ackermann type complexity," J. ACM, vol. 47, no. 6, pp. 1028–1047, 2000.
[7] K. Mehlhorn and P. Sanders, Algorithms and Data Structures: The Basic Toolbox. Springer Science & Business Media, 2008.
[8] R. Brown, "Calendar queues: a fast O(1) priority queue implementation for the simulation event set problem," Commun. ACM, vol. 31, no. 10, pp. 1220–1227, 1988.
[9] B. H. Bloom, "Space/time trade-offs in hash coding with allowable errors," Commun. ACM, vol. 13, no. 7, pp. 422–426, 1970.
[10] G. Cormode and S. Muthukrishnan, "An improved data stream summary: the count-min sketch and its applications," J. Algorithms, vol. 55, no. 1, pp. 58–75, 2005.
[11] J. Ros-Giralt, A. Commike, R. Rotsted, P. Clancy, A. Johnson, and R. Lethin, "Lockless hash tables with low false negatives," in 2014 IEEE High Performance Extreme Computing Conference (HPEC), 2014.
[12] V. Paxson, "Bro: a system for detecting network intruders in real-time," Computer Networks, vol. 31, no. 23–24, pp. 2435–2463, 1999.
[13] "The NIDS Cluster: Scalable, Stateful Network Intrusion Detection on Commodity Hardware," 2007.
[14] J. Ros-Giralt, A. Commike, D. Honey, and R. Lethin, "High-performance many-core networking: design and implementation," in Proceedings of the Second Workshop on Innovating the Network for Data-Intensive Science (INDIS '15), Austin, Texas, 2015, pp. 1–7.
[15] J. Ros-Giralt et al., "Multiresolution Priority Queues," in IEEE High Performance Extreme Computing Conference (HPEC), Boston, Sept. 2017.
