Lecture 8: Priority Queue

What is a priority queue? A list of items where each item is given a priority value.
• priority values are usually numbers
• priority values should have relative order (e.g., <)
• the dequeue operation differs from that of a queue or dictionary: the item dequeued is always the one with the highest priority

Method               Description
enqueue(item)        insert item by its priority
Item& dequeuemax()   remove the highest priority element
Item& findmax()      return a reference to the highest priority element
size()               number of elements in the pqueue
empty()              checks whether the pqueue has no elements
No search()!

Priority Queue Examples

Emergency call center:
• operators receive calls and assign levels of urgency
• lower numbers indicate higher urgency
• calls are dispatched (or not dispatched) by computer to police squads based on urgency

Priority Queue: Other Examples

In general:
• shortest job first print queue
• shortest job first CPU queue
• discrete event simulation (e.g., computer games)

Example (priority value used as an array index, levels 0–2):
1. Level 2 call comes in
2. Level 2 call comes in
3. Level 1 call comes in
4. A call is dispatched
5. Level 0 call comes in
6. A call is dispatched

Priority Queue Implementations

Implementation                        dequeuemax()   enqueue()
Unsorted list                         O(N)           O(1)
Sorted list                           O(1)           O(N)
Array of lists (but only for a        O(1)           O(1)
  small number of priorities)
Heap                                  O(log n)       O(log n)

BST as Priority Queue

Where is the smallest/largest item in a BST?
Complexity of enqueue() and dequeuemax():
• enqueue(): O(log N)
• dequeuemax(): O(log N)
Why not just use a BST for a priority queue?

Heaps

A heap is a complete binary tree.

A non-empty maxHeap T is an ordered tree whereby:
• the key in the root of T is ≥ the keys in every subtree of T
• every subtree of T is a maxHeap
• (the keys of nodes across subtrees have no required relationship)
• a size variable keeps the number of nodes in the heap (not per subtree)
• a minHeap is similarly defined

Heaps Implementation

A binary heap is a complete binary tree:
• it can be efficiently stored as an array
• if the root is at index 0:
  • a node at index i has children at indices 2i+1 and 2i+2
  • a node at index i has its parent at index floor((i−1)/2)
• what happens when the array is full?

maxHeap::dequeuemax()

Save the max item at the root, to be returned.
Move the item in the rightmost leaf node to the root:
• since the heap is a complete binary tree, the rightmost leaf node is always at the last index
• swap(heap[0], heap[--size]);

The tree is no longer a heap at this point.

Trickle down the recently moved item at the root to its proper place to restore the heap property:
• for each subtree, recursively: if the root has a smaller search key than either of its children, swap the item in the root with that of the larger child

Complexity: O(log n)

(figure: dequeuemax on an example heap with keys 10, 9, 6, 5, 3, 2; the saved root 10 is returned, 5 moves from the last leaf to the root and trickles down; return: 10)

Top Down Heapify

void maxHeap::trickleDown(int idx) {
    for (int j = 2*idx+1; j < size; j = 2*j+1) {
        if (j < size-1 && heap[j] < heap[j+1]) j++;
        if (heap[idx] >= heap[j]) break;
        swap(heap[idx], heap[j]);
        idx = j;
    }
}

Pass the index (idx) of the array element that needs to be trickled down.
Swap the key in the given node with the largest key among the node's children, moving down to that child, until either:
• we reach a leaf node, or
• both children have smaller (or equal) keys
The last node is at index size−1.
Complexity: O(log n)

maxHeap::enqueue()

Insert newItem into the bottom of the tree:
• heap[size++] = newItem;
The tree may no longer be a heap at this point.
Percolate newItem up to an appropriate spot in the tree to restore the heap property.
Complexity: O(log n)

(figure: enqueue(15) on an example heap with keys 9, 5, 6, 3, 2; 15 enters as the last leaf and percolates up to the root)

Bottom Up Heapify

void maxHeap::percolateUp(int idx) {
    while (idx >= 1 && heap[(idx-1)/2] < heap[idx]) {
        swap(heap[idx], heap[(idx-1)/2]);
        idx = (idx-1)/2;
    }
}

Pass the index (idx) of the array element that needs to be percolated up.
Swap the key in the given node with the key of its parent, moving up to the parent, until:
• we reach the root, or
• the parent has a larger (or equal) key
The root is at position 0.

(figure: percolating 15 up the example heap; 15 swaps with 6, then with 9, ending at the root)

Heap Implementation

void maxHeap::enqueue(Item newItem) {
    heap[size] = newItem;
    percolateUp(size);
    size++;
}

enqueue():
• put the item at the end of the priority queue
• use percolateUp() to find its proper position

Item maxHeap::dequeuemax() {
    swap(heap[0], heap[--size]);
    trickleDown(0);
    return heap[size];
}

dequeuemax():
• remove the root
• take the item from the end of the array and place it at the root
• use trickleDown() to find its proper position

Trie

trie: from retrieval; originally pronounced to rhyme with retrieval, now commonly pronounced to rhyme with "try", to differentiate it from tree.

A trie is a tree that uses parts of the key, as opposed to the whole key, to perform search.

Whereas a tree associates keys with nodes, a trie associates keys with edges (though an implementation may store the keys in the nodes).

Example: the trie on the right encodes this set of strings: {on, owe, owl, tip, to}

For S, a set of strings from an alphabet (not necessarily the Latin alphabet) where none of the strings is a prefix of another, a trie of S is an ordered tree such that:
• each edge of the tree is labeled with symbols from the alphabet
• the labels can be stored either at the children nodes or at the parent node
• the ordering of edges attached to the children of a node follows the natural ordering of the alphabet

Partial Match

A trie is useful for doing partial match search:
• longest-prefix match: a search for "tin" would return "tip"
  • implementation: continue to search until a mismatch
• approximate match: allowing for one error, a search for "oil" would return "owl" in this example

• the labels of the edges on the path from the root to any node in the tree form a prefix of a string in S

Useful for suggesting alternatives to misspellings.

Trie Deletion

By post-order traversal, remove an internal node only if it is also a leaf node.

String Encoding

How many bits do we need to encode this example string:
If a woodchuck could chuck wood!

(figure: deleting "wade", then "wadi", from an example trie)

• ASCII encoding: 8 bits/character (or 7 bits/character)
  • the example string encodes as 4966206120776f6f64…
• the example string has 32 characters, so we'll need 256 bits to encode it using ASCII
• there are only 13 distinct characters in the example string, so 4 bits/character is enough to encode the string, for a total of 128 bits
• Can we do better (use fewer bits)? How?

Huffman Codes

In the English language, the characters e and t occur much more frequently than q and x. Can we use fewer bits for the former and more bits for the latter, so that the weighted average is less than 5 bits/character?

Huffman encoding main ideas:
1. variable-length encoding: use different numbers of bits (code lengths) to represent different symbols
2. entropy encoding: assign smaller codes to more frequently occurring symbols
(For binary data, treat each byte as a "character".)

Huffman Encoding

The example string:
If a woodchuck could chuck wood!
can be encoded using the following code (for example):

symbol  frequency  code
I       1          11111
f       1          11110
a       1          11101
l       1          11100
!       1          1101
w       2          1100
d       3          101
u       3          100
h       2          0111
k       2          0110
o       5          010
c       5          001
‘ ‘     5          000

The resulting encoding uses only 111 bits:
• 111111111000011101…

Where do the codes come from?

Prefix Codes

How to Construct a Prefix Code?

Since each character is represented by a different number of bits, we need to know when we have reached the end of a character.

There will be no confusion in decoding a string of bits if the bit pattern encoding one character cannot be a prefix of the code for another character. This is known as a prefix-free code, or just a prefix code. We have a prefix code if all codes are always located at the leaf nodes of a proper binary trie.

To construct one, minimize the expected number of bits over the text. If you know the text, you should be able to do this perfectly. But top-down allocations are hard to get right. Huffman's insight is to do this bottom-up, starting with the two least frequent characters.

Huffman Trie

Better known as a Huffman Tree.

Huffman tree construction algorithm:
• count the frequency with which each symbol occurs
• make each symbol a leaf node, with its frequency as its weight
• pair up the two least frequently occurring symbols (break ties arbitrarily)
• create a subtree of the pair, with the root weighing the sum of its children's weights
• repeat until all subtrees have been paired up into a single tree
• encode each symbol as the path from the root, with a left represented as a 0, and a right as a 1

Huffman Trie Example

If a woodchuck could chuck wood!

(figure: the Huffman tree for the example string; root weight 32, leaves 5 ‘ ’, 5 c, 5 o, 2 k, 2 h, 3 u, 3 d, 2 w, 1 !, 1 l, 1 a, 1 f, 1 I; left edges labeled 0 and right edges 1, yielding the code table shown earlier)

Characteristics of Huffman Trees

All symbols are leaf nodes by construction, hence no code is a prefix of another code.

More frequently occurring symbols are at shallower depths and have smaller codes.

To decode, we need the code table, so the code table must be stored with the encoded string. How to store/communicate the code table?

Encoding Time Complexity

Running times, with string length n and alphabet size m:
• frequency count: O(n)
• Huffman tree construction: O(m log m)
• total time: O(n + m log m)

Implementation:
• you are given a list of items with varying frequencies
• you need to repeatedly choose the two that currently have the lowest frequency
• you need to repeatedly place an item with the sum of the above frequencies back into the list
• How would you implement the algorithm?

Code Table Compression

The Huffman code for any particular text is not unique. For example, all three sets of codes in the table below are valid for the example string.

sym   freq   code1    code2    code3
‘ ’   5      000      001      000
c     5      001      010      001
d     3      101      100      010
o     5      010      000      011
u     3      100      101      100
!     1      1101     0110     1010
h     2      0111     1101     1011
k     2      0110     1110     1100
w     2      1100     0111     1101
a     1      11101    11100    11100
f     1      11110    11101    11101
l     1      11100    11111    11110
I     1      11111    11110    11111

The last column can be compressed into: 3‘ ’cdou4!hkw5aflI

Code Table Encoding

The last column was not created from a Huffman tree directly. The Huffman tree is used only to determine the code length of each symbol; then:
1. order the symbols by code length
2. start from all 0s for the shortest length code
3. add 1 to the code for each subsequent symbol
4. when transitioning from a code of length k to a code of length k+1, as determined by the Huffman trie, add 1 to the last length-k code and use it as the prefix for the first length-(k+1) code
5. set the (k+1)st bit to 0 and continue adding 1 for each subsequent code

The resulting code table has one encoding for each symbol and is prefix-free.