Practical Session #10 - Heaps, sort properties, QuickSort, Selection

Heap

A binary heap can be viewed as a complete binary tree: every level is full, except possibly the last, which is filled from the left up to a certain point.

Maximum-Heap: for each node x, x ≥ left(x) and x ≥ right(x). The root of a Maximum-Heap is the maximal element in the heap.

Minimum-Heap: for each node x, x ≤ left(x) and x ≤ right(x). The root of a Minimum-Heap is the minimal element in the heap.

Heap-Array: a heap can be represented using an array:

1. A[1] - the root of the heap (not A[0])
2. length[A] - the number of elements in the array
3. heap-size[A] - the number of elements in the heap represented by the array A
4. For each index i:

Parent(i) = ⌊i/2⌋
Left(i) = 2i
Right(i) = 2i+1

Actions on Heaps:
 Insert ~ O(log(n))
 Max ~ O(1)
 Extract-Max ~ O(log(n))
 Build-Heap ~ O(n)
 Down-Heapify(H, i) – same as MaxHeapify seen in class, but starting at the node at index i ~ O(log(n))
 Up-Heapify(H, i) – fixes the heap going from the node at index i up to the root, as in Insert() ~ O(log(n))
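As a quick reference, here is a minimal Python sketch of the array representation and the two heapify directions for a max-heap. The names down_heapify/up_heapify mirror the operations above; the heap is assumed to occupy H[1..heap_size], leaving H[0] unused.

# 1-indexed max-heap stored in a Python list; slot H[0] is unused.

def parent(i):
    return i // 2

def left(i):
    return 2 * i

def right(i):
    return 2 * i + 1

def down_heapify(H, i, heap_size):
    """Restore the max-heap property in the subtree rooted at index i,
    assuming both of its subtrees are already legal max-heaps."""
    while True:
        largest = i
        if left(i) <= heap_size and H[left(i)] > H[largest]:
            largest = left(i)
        if right(i) <= heap_size and H[right(i)] > H[largest]:
            largest = right(i)
        if largest == i:
            return
        H[i], H[largest] = H[largest], H[i]
        i = largest

def up_heapify(H, i):
    """Float H[i] up toward the root, as in Insert."""
    while i > 1 and H[parent(i)] < H[i]:
        H[i], H[parent(i)] = H[parent(i)], H[i]
        i = parent(i)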

Question 1

Given a maximum heap H, what is the time complexity of finding the 3 largest keys in H? What is the time complexity of finding the C largest keys in H? (C is a constant.) What if C is not constant?

Solution:

It takes O(1) to find the 3 maximal keys in H. The 3 maximal keys in H are:
 The root
 The root's maximal son y
 The largest among y's maximal son and y's sibling

Finding the C largest keys in H is done in a similar way: the i-th largest key is one of at most i candidates, namely the sons of the i-1 largest keys that have not been taken yet, and thus can be found in O(i) time. Finding the C largest keys therefore takes O(C²) = O(1). (Another solution would be ordering all the keys in the first C levels of the heap.)

If C is not constant we have to be more efficient. We will use a new auxiliary heap, in which we store nodes of the original heap; each node inserted into the new heap preserves access to its sons in the original heap. We first insert the root into the new heap. Then, C times, we do Extract-Max() on the new heap, and at each step we Insert the extracted node's original sons (its sons in the original heap) into the new heap. The logic is as before, only this time, since the new heap holds at most O(C) elements, we get O(C·log(C)). (Note that we do not change the original heap at any point.)
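A minimal Python sketch of this idea, assuming the original max-heap is given as a 1-indexed array H. The standard heapq module plays the auxiliary heap; heapq is a min-heap, so keys are negated to simulate a max-heap.

import heapq

def c_largest(H, heap_size, C):
    """Return the C largest keys of the 1-indexed max-heap H[1..heap_size]
    in O(C log C) time, without modifying H."""
    if heap_size < 1:
        return []
    aux = [(-H[1], 1)]                        # auxiliary heap, seeded with the root
    result = []
    while aux and len(result) < C:
        neg_key, i = heapq.heappop(aux)       # Extract-Max on the auxiliary heap
        result.append(-neg_key)
        for son in (2 * i, 2 * i + 1):        # the node's sons in the original heap
            if son <= heap_size:
                heapq.heappush(aux, (-H[son], son))
    return result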

Question 2

Analyze the complexity of finding the i smallest elements in a set of size n using each of the following:
 Sorting
 Priority Queue

Sorting Solution:

 Sort the numbers using an algorithm that takes O(n log n)
 Find the i smallest elements in O(i)
Total = O(i + n log n) = O(n log n)

Priority Queue Solution:

 Build a minimum heap in O(n)
 Execute Extract-Min i times in O(i log n)
Total = O(n + i log n). Using Question 1 this can be reduced to O(n + i log i).
Note: n + i log i = Θ(max(n, i log i)), and clearly n + i log n = O(n log n) (and faster if i ≠ Θ(n)).
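For illustration, a short Python version of the priority-queue solution, assuming the standard heapq module (heapify builds a min-heap in O(n)):

import heapq

def i_smallest(numbers, i):
    """Find the i smallest elements in O(n + i log n)."""
    heap = list(numbers)
    heapq.heapify(heap)                              # build a minimum heap: O(n)
    return [heapq.heappop(heap) for _ in range(i)]   # Extract-Min i times: O(i log n)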

Question 3

Given a heap H represented using an array, suggest an algorithm for extracting and returning the element at index i.

Solution:

Delete(H, i)
    if (heap_size(H) < i)
        error("heap underflow")
    key ← H[i]
    H[i] ← H[heap_size]    /* Delete the i'th element by replacing it with the last element */
    heap_size ← heap_size − 1    /* Effectively shrinking the heap */
    Down-Heapify(H, i)    /* Now the subtree rooted at H[i] is a legal heap */
    Up-Heapify(H, i)    /* And the area above H[i] is a legal heap */
    return key
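A runnable Python version of Delete, reusing the down_heapify/up_heapify helpers sketched in the Heap section above (returning the updated heap size is just a convenience of this sketch):

def delete(H, i, heap_size):
    """Extract and return the element at index i of the 1-indexed max-heap H[1..heap_size]."""
    if heap_size < i:
        raise IndexError("heap underflow")
    key = H[i]
    H[i] = H[heap_size]            # replace the i-th element with the last one
    heap_size -= 1                 # effectively shrink the heap
    down_heapify(H, i, heap_size)  # fixes the subtree rooted at H[i]
    up_heapify(H, i)               # fixes the area above H[i]
    return key, heap_size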

Question 4

You are given two min-heaps H1 and H2 with n1 and n2 keys respectively. Each element of H1 is smaller than each element of H2. How can you merge H1 and H2 into one heap H (with n1 + n2 keys) using only O(n2) time? (Assume both heaps are represented in arrays of size > n1 + n2.)

Solution

Case n1 ≥ n2: Add the keys of H2 to the array of H1, from index n1+1 to index n1+n2.

The resulting array represents a correct heap, by the following reasoning. Let l1 be the number of leaves in H1. If n1 is odd, every internal node has two sons, hence the number of internal nodes is l1 − 1 and there are 2l1 "openings" for new leaves:

l1 + (l1 − 1) = n1  ⟹  2l1 − 1 = n1  ⟹  2l1 > n1 ≥ n2

If n1 is even then there are l1 internal nodes (one of them with a single son) and 2l1 + 1 = n1 + 1 > n2 openings.

In both cases the number of openings is at least n2, so all keys from H2 are added as sons of former nodes of H1, i.e., as leaves. As all keys in H1 are smaller than any key in H2, the heap property still holds in the resulting array.

Case n1 < n2: Build a new heap with all the elements of H1 and H2 in O(n1 + n2) = O(2n2) = O(n2).
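A Python sketch of both cases, under the same assumptions as before (1-indexed arrays with slot 0 unused; min_down_heapify is a min-heap variant of the earlier down_heapify):

def min_down_heapify(H, i, size):
    """Min-heap variant of Down-Heapify on the 1-indexed array H[1..size]."""
    while True:
        smallest = i
        for son in (2 * i, 2 * i + 1):
            if son <= size and H[son] < H[smallest]:
                smallest = son
        if smallest == i:
            return
        H[i], H[smallest] = H[smallest], H[i]
        i = smallest

def merge_heaps(H1, n1, H2, n2):
    """Merge two min-heaps where every key of H1 is smaller than every key of H2."""
    if n1 >= n2:
        # concatenation: every appended key becomes a leaf whose parent is an
        # old node of H1, which holds a smaller key by assumption
        return H1[:n1 + 1] + H2[1:n2 + 1]
    # n1 < n2: rebuild from scratch with Build-Heap in O(n1 + n2) = O(n2)
    H = [None] + H1[1:n1 + 1] + H2[1:n2 + 1]
    for i in range((n1 + n2) // 2, 0, -1):
        min_down_heapify(H, i, n1 + n2)
    return H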

Question 5

Give an O(n log k) time algorithm to merge k sorted arrays A1..Ak into one sorted array. n is the total number of elements (you may assume k ≤ n).

Solution

We will use a min-heap of k triples of the form (d, i, Ad[i]), for 1 ≤ d ≤ k and 1 ≤ i ≤ n, ordered by the key Ad[i]. The heap will use an array B.

1. First, build the heap from the first element of each of the arrays A1..Ak, in O(k) time.
2. In each step, extract the minimal element of the heap and append it to the end of the output array M.
3. If the array that the element from step 2 came from is not exhausted, insert its next (minimal) element into the heap.

for d ← 1 to k
    B[d] ← (d, 1, Ad[1])
Build-Heap(B)    /* by the order of the keys Ad[1] */
for j ← 1 to n
    (d, i, x) ← Extract-Min(B)
    M[j] ← x
    if i < Ad.length
        Heap-Insert(B, (d, i+1, Ad[i+1]))

Worst case time analysis:
Build-Heap: O(k)
Extract-Min: O(log k)
Heap-Insert: O(log k)
Total: O(n log k)
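A Python sketch of the same algorithm, using heapq for the heap B and 0-indexed arrays; the triples are stored as (Ad[i], d, i) so that the heap orders them by key. (The standard library's heapq.merge does essentially the same job.)

import heapq

def merge_k_sorted(arrays):
    """Merge k sorted arrays into one sorted list in O(n log k) time."""
    B = [(A[0], d, 0) for d, A in enumerate(arrays) if A]   # first element of each array
    heapq.heapify(B)                                        # Build-Heap: O(k)
    M = []
    while B:
        x, d, i = heapq.heappop(B)                          # Extract-Min: O(log k)
        M.append(x)
        if i + 1 < len(arrays[d]):                          # refill from the same array
            heapq.heappush(B, (arrays[d][i + 1], d, i + 1)) # Heap-Insert: O(log k)
    return M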

Question 6 (For practice at home)

Suppose that you had to implement a priority queue that only needed to store items whose priorities were integer values between 0 and k − 1. Describe how a priority queue could be implemented so that insert() has complexity O(1) and RemoveMin() has complexity O(k). Be sure to give the pseudo-code for both operations. Hint: This is very similar to a hash table.

Solution

Use an array A of size k of linked lists. The linked list in the i-th slot stores entries with priority i. To insert an item into the priority queue with priority i, append it to the linked list in the i-th slot.

insert(v)
    A[v.priority].add-to-tail(v)

To remove an item from the queue, scan the array of lists, starting at slot 0, until a non-empty list is found. Remove that list's first element and return it; the for-loop induces the O(k) worst case complexity. E.g.,

removeMin()
    for i ← 0 to k−1
        List L ← A[i]
        if (L.empty() = false)
            v ← L.remove-from-head()    /* removes the list's head and returns it = dequeue */
            return v
    return NULL
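A Python sketch of this structure, with collections.deque standing in for the linked list in each slot (the class and method names are illustrative):

from collections import deque

class BoundedPriorityQueue:
    """Priority queue for integer priorities 0..k-1: insert is O(1), remove_min is O(k)."""

    def __init__(self, k):
        self.A = [deque() for _ in range(k)]   # one list per priority value

    def insert(self, item, priority):
        self.A[priority].append(item)          # add-to-tail: O(1)

    def remove_min(self):
        for bucket in self.A:                  # scan at most k slots
            if bucket:
                return bucket.popleft()        # remove-from-head = dequeue
        return None                            # the queue is empty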

Quicksort

quickSort(A, low, high)
    if (high > low)
        pivot ← partition(A, low, high)
        quickSort(A, low, pivot−1)
        quickSort(A, pivot+1, high)

partition(A, low, high)
    pivot_value ← A[low]
    left ← low
    right ← high
    while (left < right)
        // Move left while item ≤ pivot
        while (left < high && A[left] ≤ pivot_value)
            left++
        // Move right while item > pivot
        while (A[right] > pivot_value)
            right--
        if (left < right)    // make sure right has not passed left
            SWAP(A, left, right)
    // right is the final position for the pivot
    A[low] ← A[right]
    A[right] ← pivot_value
    return right

// Initial call: quickSort(A, 0, length(A)−1)

 Stable sorting algorithms maintain the relative order of records with equal keys.
 In-place algorithms need only O(log N) extra memory beyond the items being sorted; they don't need to create auxiliary locations for data to be temporarily stored.
 The QuickSort version above is not stable.

Question 1

Given a multi-set S of n integer elements and an index k (1 ≤ k ≤ n), we define the k-smallest element to be the k-th element when the elements are sorted from the smallest to the largest.

Suggest an algorithm for finding the k-smallest element in O(n) time on average.

Example: For the given multi-set of numbers {6, 3, 2, 4, 1, 1, 2, 6}, the 4-smallest element is 2, since 2 is the 4th element in the sorted sequence {1, 1, 2, 2, 3, 4, 6, 6}.

Solution:

The algorithm is based on the Quick-Sort algorithm.

// Reminder:
quicksort(A, p, r)
    if (p < r)
        q ← partition(A, p, r)
        quicksort(A, p, q−1)
        quicksort(A, q+1, r)

Select(k, S)    // returns the k-th smallest element of S
    pick x in S
    partition S into L, E, G    // a slightly different variant of partition():
                                // max(L) < x, E = {elements equal to x}, x < min(G)
    if k ≤ length(L)    // searching for an item smaller than x
        return Select(k, L)
    else if k ≤ length(L) + length(E)    // found
        return x
    else    // searching for an item greater than x
        return Select(k − length(L) − length(E), G)
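A runnable Python sketch of Select with a random pivot. Unlike an in-place partition, this version builds L, E and G as new lists, trading O(n) extra space for clarity; the expected running time is still O(n).

import random

def select(k, S):
    """Return the k-smallest element (1-indexed rank) of the multi-set S."""
    x = random.choice(S)                   # pick a pivot
    L = [e for e in S if e < x]            # keys smaller than x
    E = [e for e in S if e == x]           # keys equal to x
    G = [e for e in S if e > x]            # keys greater than x
    if k <= len(L):
        return select(k, L)                # searching for an item smaller than x
    if k <= len(L) + len(E):
        return x                           # found
    return select(k - len(L) - len(E), G)  # searching for an item greater than x

For the example above, select(4, [6, 3, 2, 4, 1, 1, 2, 6]) returns 2.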

In the worst case the chosen pivot x is the maximal element in the current array, and there is only one such element: G is empty, length(E) = 1 and length(L) = length(S) − 1.

T(l) = T(l−1) + O(l)  for l > k
T(l) = O(k)  for l = k

The solution of the recurrence: T(n) = O(n²)

In the average case: similar to quick-sort, half of the elements in S are good pivots, for which the sizes of L and G are each less than 3n/4. Therefore, T(n) ≤ T(3n/4) + O(n) = O(n) (Master theorem, case c).

Question 2

Given an array A of n numbers, suggest a Θ(n) expected time algorithm to determine whether there is a number in A that appears more than n/2 times.

Solution:

If x is a number that appears more than n/2 times in A, then x is the (⌊n/2⌋ + 1)-smallest element in the array A.

Frequent(A, n)
    x ← Select(⌊n/2⌋ + 1, A)    // find the middle element
    count ← 0
    for i ← 1 to n do    // count the appearances of the middle element
        if (A[i] = x)
            count++
    if count > n/2
        return TRUE
    else
        return FALSE

Time complexity: in the average case the Select algorithm runs in Θ(n). Computing count takes Θ(n) as well. Total expected running time: Θ(n).
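A sketch of Frequent in Python, reusing the select sketch from Question 1:

def frequent(A):
    """Return True iff some value appears more than n/2 times in A; expected Θ(n)."""
    n = len(A)
    x = select(n // 2 + 1, A)              # the only possible majority candidate
    count = sum(1 for a in A if a == x)    # count its appearances
    return count > n / 2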

Question 3

n records are stored in an array A of size n. Suggest an algorithm to sort the records in O(n) time and with no additional space in each of the following cases:

I. All the keys are 0 or 1
II. All the keys are in the range [1..k], k is a constant

Solution:

I. Use Quicksort's partition method, as we did in Question 1, with pivot 0. After the completion of the partition function the array is sorted (L = {}, E holds all the records with key 0, G holds all the records with key 1). Time complexity is Θ(n), the cost of one partition.

II. First run the partition method on A[1..n] with pivot 1; this way all the records with key 1 end up in the first x1 indices of the array. Then run the partition method on A[x1+1..n] with pivot 2, and so on. After k−1 such steps A is sorted. Time complexity is O(kn) = O(n), the cost of k partitions.
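A Python sketch of case II (case I is the same idea with keys {0, 1}). Instead of the three-way L/E/G partition it uses an equivalent swap-based sweep per key, which keeps the sort in place; the cost is still one linear pass per key, O(kn) = O(n) overall.

def sort_small_keys(A, k):
    """Sort A in place when every key is an integer in [1..k], k constant."""
    start = 0                                     # boundary of the sorted prefix
    for key in range(1, k):                       # the last key falls into place by itself
        for i in range(start, len(A)):
            if A[i] == key:                       # move this record to the front
                A[start], A[i] = A[i], A[start]   # of the unsorted region
                start += 1
    return A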

Question 4

Given the following algorithm to sort an array A of size n:
1. Sort recursively the first 2/3 of A (A[1..2n/3])
2. Sort recursively the last 2/3 of A (A[n/3+1..n])
3. Sort recursively the first 2/3 of A (A[1..2n/3])

* If (2/3*n) is not a natural number, round it up.

Prove that the above algorithm sorts A and find a recurrence T(n) expressing its running time.

Solution: The basic claim is that after the first 2 steps, the n/3 largest numbers are in their final places, sorted, in the last third of A. Indeed, after step 1 each of the n/3 largest elements of A lies in A[n/3+1..n]: any of them that started in the first two thirds is moved by the sort into the top third of that range, and step 2 then sorts A[n/3+1..n], placing all of them at the end. In the last step the algorithm sorts the remaining first two thirds of A.

The recurrence (the non-recursive work is O(1)):

T(n) = 3T(2n/3) = 3(3T(4n/9)) = 3(3(3T(8n/27))) = ... = 3^i · T((2/3)^i · n)

After i = log_{3/2} n steps:

T(n) = 3^(log_{3/2} n) · T(1) = 3^((log_3 n)/(log_3(3/2))) = (3^(log_3 n))^(1/(log_3(3/2))) = n^(1/(log_3(3/2))) = n^((log_3 3)/(log_3(3/2))) = n^(log_{3/2} 3)

T(n) = O(n^(log_{3/2} 3)) ≈ O(n^2.71), (also according to the Master Theorem)
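A runnable sketch of the algorithm, 0-indexed, rounding 2n/3 up with integer arithmetic as the note above specifies:

def two_thirds_sort(A, low=0, high=None):
    """Sort A[low..high] by sorting the first 2/3, the last 2/3, then the first 2/3 again."""
    if high is None:
        high = len(A) - 1
    n = high - low + 1
    if n == 2 and A[low] > A[high]:
        A[low], A[high] = A[high], A[low]         # base case: two elements
    elif n > 2:
        t = (2 * n + 2) // 3                      # ceil(2n/3)
        two_thirds_sort(A, low, low + t - 1)      # step 1: first two thirds
        two_thirds_sort(A, high - t + 1, high)    # step 2: last two thirds
        two_thirds_sort(A, low, low + t - 1)      # step 3: first two thirds again
    return A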

Question 5

Given an array A of M+N elements, where the first N elements in A are sorted and the last M elements in A are unsorted.

1. Evaluate the run-time complexity, in terms of M and N in the worst case, of fully sorting the array using Insertion-Sort on A.

2. For each of the following cases, which sorting method (or combination of methods) would you use, and what would be the run-time complexity in the worst case?

a) M = O(1)
b) M = O(log N)
c) M = O(N)

Solution:

1. O(M(M+N)): the last M elements will be inserted into their right places, and that requires N, N+1, N+2, ..., N+M shifts (in the worst case). Alternatively, O(M² + N) if we apply insertion sort only to the last M elements and then merge the two sorted parts.

2.
a) Insertion-Sort, in O(N).
b) Use any comparison-based sorting algorithm with a worst-case runtime of O(M log M) (such as Merge-Sort or Heap-Sort) on the M unsorted elements, and then merge the two sorted parts of the array in O(M + N). Total runtime: O(M log M + N) = O(N).
c) Use any efficient comparison-based sorting algorithm, for a runtime of O((M+N)log(M+N)) = O(N log N). Quick-Sort is bad for this case, as its worst case analysis is O(n²).
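A Python sketch of case b: sort the M-element tail on its own and merge it with the sorted prefix. Here sorted() stands in for any worst-case O(M log M) comparison sort (Python's built-in Timsort does meet that bound).

def sort_mostly_sorted(A, N):
    """A[:N] is sorted, the remaining M elements are not; total O(M log M + N)."""
    tail = sorted(A[N:])                    # sort the M unsorted elements
    out, i, j = [], 0, 0
    while i < N and j < len(tail):          # standard merge of two sorted runs
        if A[i] <= tail[j]:
            out.append(A[i])
            i += 1
        else:
            out.append(tail[j])
            j += 1
    return out + A[i:N] + tail[j:]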

Question 6

How can we use an unstable (comparison-based) sorting algorithm U (for example, quick-sort or heap-sort) to build a new stable sorting algorithm S with the same time complexity as the algorithm U?

Solution 1:

U is a comparison-based sorting algorithm, thus its runtime is TU(n) = Ω(n log n).

1. Add a new field, index, to each element. This new field holds the original index of that element in the unsorted array.
2. Change the comparison operator so that:
   [key1, index1] < [key2, index2] ⟺ key1 < key2 or (key1 = key2 and index1 < index2)
   [key1, index1] > [key2, index2] ⟺ key1 > key2 or (key1 = key2 and index1 > index2)
3. Execute the sorting algorithm U with the new comparison operation.

Time complexity: adding an index field takes O(n), and the sorting time is the same as that of the unstable algorithm, TU(n); in total T(n) = O(TU(n)) (as TU(n) = Ω(n log n)).
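A Python sketch of Solution 1. Python tuples compare lexicographically, which is exactly the modified comparison operator from step 2; a heapq-based heapsort plays the role of the unstable algorithm U in this sketch.

import heapq

def heapsort(items):
    """An unstable O(n log n) comparison sort, standing in for U."""
    heapq.heapify(items)
    items[:] = [heapq.heappop(items) for _ in range(len(items))]

def stable_sort(keys, unstable_sort=heapsort):
    """Make an unstable comparison sort stable by decorating with original indices."""
    decorated = [(key, i) for i, key in enumerate(keys)]   # step 1: O(n)
    unstable_sort(decorated)      # step 3: equal keys now compare by original index
    return [key for key, _ in decorated]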

Solution 2:

1. Add a new field, index, to each element in the input array A – O(n). This new field holds the original index of that element in the input.
2. Execute U on A to sort the elements by their key – TU(n).
3. Execute U on each set of equal-key elements to sort them by the index field – TU(n).

Time complexity of phase 3: assume we have m different keys in the input array (1 ≤ m ≤ n), and let ni be the number of elements with key ki, where 1 ≤ i ≤ m and n1 + n2 + ... + nm = n. That is, the time complexity of phase 3 is:

T(n) = Σ_{i=1..m} TU(ni) = Σ_{i=1..m} Ω(ni log ni)

In the worst case all the keys in the array are equal (i.e., m = 1) and phase 3 is in fact a sorting of the whole array by index values: T(n) = TU(n) = Ω(n log n).

Time complexity (for the entire algorithm): T(n) = 2TU(n) + O(n) = O(TU(n)).
