Nearest Neighbor Searching and Priority Queues

Nearest neighbor search
• Given: a set P of n points in R^d.
• Goal: a data structure which, given a query point q, finds the nearest neighbor p of q in P, or the k nearest neighbors.

Variants of nearest neighbor
• Near neighbor (range search): find one/all points in P within distance r from q.
• Spatial join: given two sets P, Q, find all pairs p in P, q in Q such that p is within distance r from q.
• Approximate near neighbor: find one/all points p' in P whose distance to q is at most (1 + ε) times the distance from q to its nearest neighbor.

Solutions
The right solution depends on the value of d:
• low d: graphics, GIS, etc.
• high d:
– similarity search in databases (text, images, etc.)
– finding pairs of similar objects (e.g., copyright violation detection)

Nearest neighbor search in documents
• How could we represent documents so that we can define a reasonable distance between two documents?
• Vector of word frequency occurrences:
– Probably want to get rid of useless words that occur in all documents.
– Probably need to worry about synonyms and other details of language.
– But basically, we get a VERY long vector.
• And maybe we ignore the frequencies and just mark with a "1" the words that occur in a document.

Nearest neighbor search in documents
• One reasonable measure of similarity between two documents is just a count of the words they share – this is the pointwise product of the two vectors when we ignore counts.
• Easy enough to compute for a pair of documents, but suppose our document database contains millions of documents. How can we solve the nearest neighbor problem FAST?

Algorithms
• Main memory:
– linear scan
– tree-based: quadtree, kd-tree
– hashing-based: Locality-Sensitive Hashing
• Secondary storage (databases):
– R-tree (and numerous variants)
– Vector Approximation File (VA-file)

Nearest neighbors in k-d trees
• Make a guess about the nearest neighbor of the query point (the star in the figure).
• ub, the radius of the circle around the query point, is the upper bound on the distance to the nearest neighbor.
• Establishing an upper bound lets us prune parts of the tree which cannot hold the true nearest neighbor.
• In particular, the circle in the figure lies entirely to the right of the splitting line running through the root of the tree. So any point to the left of the root cannot lie in the candidate circle, and therefore cannot be any better than our current guess.
– Once we have a guess about where the nearest neighbor is, we can start eliminating parts of the tree where the actual answer cannot be.
• This general technique of searching a large space and pruning options based on partial results is called branch-and-bound.

Nearest neighbors in k-d trees
• It is easy to tell where the candidate circle lies with respect to the line passing through a k-d tree point.
[Figure: splitting line y = y0; a circle of radius r2 around (x2, y2) with y2 + r2 > y0 crosses the line, while a circle of radius r1 around (x1, y1) with y1 + r1 < y0 does not.]

Nearest neighbors in k-d trees
• Let the query point be (a1, a2).
• Maintain a global best estimate of the nearest neighbor, called 'guess'.
• Maintain a global value of the distance to that neighbor, called 'bestDist'.
• Set 'guess' to NULL and 'bestDist' to infinity.
• Starting at the root, execute the following procedure (i denotes the splitting axis at the current node):

    if curr == NULL
        return
    /* If the current point is closer to the query than the best known point,
       update the best known point. */
    if distance(curr, query) < bestDist
        bestDist = distance(curr, query)
        guess = curr
    /* Recursively search the half of the tree that contains the query point. */
    if a_i < curr_i
        recursively search the left subtree on the next axis
    else
        recursively search the right subtree on the next axis
    /* If the candidate circle crosses this splitting plane, look on the other
       side of the plane by examining the other subtree. */
    if |curr_i - a_i| < bestDist
        recursively search the other subtree on the next axis

• The procedure works by walking down to a leaf of the k-d tree as if searching for the query point.
• As we start unwinding the recursion and walking back up the tree, we check whether each node is better than the best estimate we have so far.
– If so, we update the best estimate to be the current node.
• Finally, we check whether the candidate circle based on the current guess could cross the splitting line of the current node. If not, we eliminate all points on the other side of the splitting line and walk back up to the next node in the tree. Otherwise, we look on that side of the tree to see if there are any closer points.
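As an illustration, here is a minimal Python sketch of this 1-NN search. The KDNode record (with point, axis, left, right fields), the function name, and the use of Euclidean distance are assumptions made for the example; they are not part of the slides.

    import math

    class KDNode:
        """Hypothetical k-d tree node: a point, its splitting axis, and two children."""
        def __init__(self, point, axis, left=None, right=None):
            self.point = point   # tuple of coordinates
            self.axis = axis     # splitting dimension used at this node
            self.left = left
            self.right = right

    def nearest_neighbor(root, query):
        best = {"guess": None, "dist": math.inf}

        def search(curr):
            if curr is None:
                return
            # If the current point is closer to the query than the best known point, update it.
            d = math.dist(curr.point, query)
            if d < best["dist"]:
                best["dist"], best["guess"] = d, curr.point
            # Recursively search the half of the tree that contains the query point.
            axis = curr.axis
            if query[axis] < curr.point[axis]:
                near, far = curr.left, curr.right
            else:
                near, far = curr.right, curr.left
            search(near)
            # If the candidate circle crosses this splitting plane, search the other subtree too.
            if abs(curr.point[axis] - query[axis]) < best["dist"]:
                search(far)

        search(root)
        return best["guess"], best["dist"]

Searching the near side first tends to shrink bestDist quickly, which is what makes the pruning test on the far side effective.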
Suppose we want more than one nearest neighbor?
• Find the k nearest neighbors (kNN) of a query point in the k-d tree (sorry about using k in two different ways!).
• The algorithm uses a data structure called a bounded priority queue (BPQ for short).
• A bounded priority queue stores a fixed number of entries, each of which has a key and a priority (lower is better).
• When you add a new element to a BPQ that is full, you eject the entry with maximum priority (which might be the new entry).
– If we have not reached the bound, we just insert the new element in its appropriate location.

kNN searching
Two changes differentiate this algorithm from the initial 1-NN search algorithm:
1. When determining whether to look on the opposite side of the splitting plane, we use as the radius of the candidate circle the distance from the query point to the maximum-priority point in the BPQ. The rationale is that when finding the k nearest neighbors, the candidate circle needs to encompass all k of those neighbors, not just the closest.
2. When we consider whether to look on the opposite side of the splitting plane, our decision also takes into account whether the BPQ already contains at least k points.
– This is extremely important! If we prune parts of the tree before we have made at least k guesses, we might accidentally throw out one of the closest points.

k-NN search
• Perform a 2-NN lookup for the star.
• Recursively check the left subtree of the splitting plane, and find the blue point as a candidate nearest neighbor. Since we haven't found two nearest neighbors yet, we still need to look on the other side of the splitting plane for more neighbors, even though the candidate circle does not cross the splitting line.

Priority queue
• A priority queue stores a collection of items.
• An item is a pair: (key, element).
• Main methods:
– insert(key, element): inserts an item with the specified key and element.
– removeMin(): removes the item with the smallest key and returns the associated element.

Priority queue implementations
Implementation         add        removeMin
Unsorted array         O(1)       O(n)
Sorted array           O(n)       O(1)
Unsorted linked list   O(1)       O(n)
Sorted linked list     O(n)       O(1)
Hash table             O(1)       O(n)
Heap                   O(log n)   O(log n)
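Before turning to the heap itself, here is a minimal sketch of the bounded priority queue used by the kNN search above, built on Python's heapq module (which implements the binary heap described in the next section). The class and method names are illustrative assumptions, not an interface from the slides.

    import heapq

    class BoundedPriorityQueue:
        """Stores at most `bound` (key, priority) entries; lower priority is better.
        Internally a max-heap on priority (stored negated), so the worst entry is at the root.
        Keys must be comparable to break priority ties."""
        def __init__(self, bound):
            self.bound = bound
            self._heap = []                      # entries are (-priority, key)

        def add(self, key, priority):
            if len(self._heap) < self.bound:
                heapq.heappush(self._heap, (-priority, key))
            elif priority < self.max_priority():
                # Queue is full: eject the maximum-priority entry, keep the new one.
                heapq.heapreplace(self._heap, (-priority, key))
            # Otherwise the ejected entry is the new one, i.e. it is simply dropped.

        def max_priority(self):
            return -self._heap[0][0]             # worst (largest) priority currently stored

        def is_full(self):
            return len(self._heap) >= self.bound

In the kNN search, the far subtree is examined whenever the queue is not yet full, or whenever the splitting plane is closer to the query point than max_priority().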
Binary heap implementation of priority queues
• A binary heap (or simply heap) is a complete binary tree with the following heap-order property:
– for every node X, the key in the parent of X is smaller than the key at X.
• Heaps are stored using the sequential (array) representation of complete binary trees.
• The smallest element is at the root of the heap.
[Figure: example heap with level-order contents 13, 21, 16, 24, 31, 19, 68, 65, 26, 32.]

Insertion of x into a binary heap
• Create a hole in the next available location.
• If x can be placed in the hole, we are finished.
• Otherwise, slide the parent's key down into the hole (percolating the hole up) and repeat one level higher.
• The process terminates at the root at the latest.
[Figure: inserting 14 into the example heap – 14 percolates up past 31 and 21 and becomes the left child of the root 13.]

Code for insertion
• Place a very small element in position 0 of the heap to avoid testing for the root
– this value is known as a sentinel.
• The routine does not use swaps as it percolates up:
– percolating up with swaps would require 3d assignments for d levels of percolation;
– the code shown uses d + 1 assignments.

Code for insertion

    procedure insert(x: element to be inserted; H: priority queue);
    var i: integer;
    begin
      if H.size = Maximum then
        error
      else begin
        H.size := H.size + 1;
        i := H.size;
        while H.element[i div 2] > x do begin
          H.element[i] := H.element[i div 2];   { move that value down }
          i := i div 2                          { this is now the empty heap location }
        end;
        H.element[i] := x
      end
    end;

Delete-min
• The key at the root is always the one deleted.
• Move the last key in the heap, x, into the root.
• Percolate x down until it is smaller than both of its children:
– if x is smaller than both of its children, halt;
– otherwise swap x with its smaller child and repeat.
[Figure: deleting the minimum 13 from the example heap – the last key 32 moves to the root and percolates down past 16 and 19.]

Building a heap
• A heap can be built from n keys in O(n) time.
• Insert the keys in any order, maintaining the structure property (a complete binary tree).
• Then percolate keys down from the "bottom" of the tree to the "top":
– percolating a node down takes time proportional to the height of that node;
– and the total of the node heights in a complete binary tree is O(n).
[Figure: building a heap from the keys 150, 80, 40, 30, 10, 70, 110, 100, 20, 90, 60, 50, 120, 140, 130 by percolating down the non-leaf nodes, from position 7 up to the root.]

Binomial queues
• Consider the problem of merging two priority queues:
– the binary heap solution would require inserting the keys one at a time from H1 into H2.
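The heap operations described above – sentinel-based insertion, delete-min, and the O(n) bottom-up build – collected into a small Python sketch. This is an illustrative reimplementation under those assumptions (1-indexed array with a sentinel at position 0), not the course's code.

    class BinaryHeap:
        """Min-heap stored sequentially: element[1] is the root,
        the children of position i are at positions 2*i and 2*i + 1."""
        def __init__(self, keys=()):
            self.element = [float("-inf")]       # sentinel at position 0
            self.element.extend(keys)            # any order: the complete-tree shape holds
            # Bottom-up build: percolate down every non-leaf node, last one first. O(n) total.
            for i in range(len(self.element) // 2, 0, -1):
                self._percolate_down(i)

        def insert(self, x):
            self.element.append(x)
            i = len(self.element) - 1
            # Percolate up: slide larger parents down; the sentinel stops the loop at the root.
            while self.element[i // 2] > x:
                self.element[i] = self.element[i // 2]
                i //= 2
            self.element[i] = x

        def delete_min(self):
            root = self.element[1]
            last = self.element.pop()
            if len(self.element) > 1:            # heap still has keys besides the sentinel
                self.element[1] = last
                self._percolate_down(1)
            return root

        def _percolate_down(self, i):
            n = len(self.element) - 1
            x = self.element[i]
            while 2 * i <= n:
                child = 2 * i
                if child < n and self.element[child + 1] < self.element[child]:
                    child += 1                   # pick the smaller child
                if self.element[child] < x:
                    self.element[i] = self.element[child]
                    i = child
                else:
                    break
            self.element[i] = x

For example, BinaryHeap([150, 80, 40, 30, 10, 70, 110]) builds a valid heap in linear time, and repeated delete_min() calls return the keys in sorted order, which is exactly heapsort.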