Sorting Algorithms Ch. 6
Sorting Algorithms (Ch. 6 - 8)

Slightly modified definition of the sorting problem:

input: A collection of n data items <a1, a2, ..., an> where each data item has a key drawn from a linearly ordered set (e.g., ints, chars)

output: A permutation (reordering) <a'1, a'2, ..., a'n> of the input sequence such that a'1 ≤ a'2 ≤ ... ≤ a'n

• In practice, one usually sorts 'records' according to their key (the non-key data is called satellite data).
• If the records are large, we may sort an array of pointers.

Algorithmics 1

Sorting Algorithms

• A sorting algorithm is in place if only a constant number of elements of the input array are ever stored outside the array.
• A sorting algorithm is comparison based if the only operation we can perform on keys is to compare two keys.

Comparison-Based Sorts

                  Running Time
                  worst-case    average-case   best-case     in place
Insertion Sort    O(n^2)        O(n^2)         O(n)          yes
Merge Sort        O(n lg n)     O(n lg n)      O(n lg n)     no
Heap Sort         O(n lg n)     O(n lg n)      O(n lg n)     yes
Quick Sort        O(n^2)        O(n lg n)      O(n lg n)     yes

(Binary) Heaps

heap concept
• complete binary tree, except that it may be missing some rightmost leaves on the bottom level
• each node contains a key
• values in the nodes satisfy the heap property

binary tree: an array implementation
• root is A[1]
• for element A[i]
  - left child is in position A[2i]
  - right child is in position A[2i + 1]
  - parent is in A[⌊i/2⌋]

heap as an array implementation
• store heap as a binary tree in an array
• heapsize is number of elements in heap
• length is number of elements in array

Max-Heap

In the array representation of a max-heap, the root of the tree is in A[1], and given the index i of a node,

Parent(i)        return (⌊i/2⌋)
LeftChild(i)     return (2i)
RightChild(i)    return (2i + 1)

Max-heap property: A[Parent(i)] ≥ A[i]

index    1   2   3   4   5   6   7   8   9  10  11
keys    20  18  15  11  10   9   6   2   4   5   3

n = 11, height = 3 (# edges on longest leaf-to-root path)

Note that current length may be greater than current heapsize.

Min-Heap

Min-heaps are commonly used for priority queues in event-driven simulators. Parent(i), LeftChild(i), and RightChild(i) are computed exactly as for a max-heap.

Min-heap property: A[Parent(i)] ≤ A[i]

index    1   2   3   4   5   6   7   8   9  10  11
keys     2   3   5   4   6   9  11  10  15  20  18

n = 11, height = 3 (# edges on longest leaf-to-root path)

Heap Sort

Input: An n-element array A (unsorted).
Output: An n-element array A in sorted order, smallest to largest.

HeapSort(A)
1. Build-Max-Heap(A)                     /* put all elements in heap */
2. for i ← length(A) downto 2 do
3.     swap A[1] ↔ A[i]                  /* puts max in ith array position */
4.     heap-size[A] ← heap-size[A] - 1
5.     Max-Heapify(A, 1)                 /* restore heap property */

Running time:
Line 1: c1 · (?)          Need to know running time of Build-Max-Heap
Line 2: c2 · length(A) = c2 · n
Line 3: c3 · (n - 1)
Line 4: c4 · (n - 1)
Line 5: c5 · (n - 1) · (?)   Need to know running time of Max-Heapify

We'll see that
• Build-Max-Heap(A) takes O(|A|) = O(n) time
• Max-Heapify(A, 1) takes O(lg |A|) = O(lg n) time

Running time of HeapSort
• 1 call to Build-Max-Heap() ⇒ O(n) time
• n - 1 calls to Max-Heapify(), each takes O(lg n) time ⇒ O(n lg n) time

Heapify: Maintaining the Heap Property

Max-Heapify(A, i)
• Assumption: subtrees rooted at left and right children of A[i] (i.e., roots of these subtrees are A[2i] and A[2i + 1]) are heaps (i.e., obey the max-heap property)
• ...but subtree rooted at A[i] might not be a heap (that is, A[i] may be smaller than its left or right child)
• Max-Heapify(A, i) will cause the value at A[i] to "float down" in the heap so that subtree rooted at A[i] becomes a heap.

Max-Heapify(A, i)
1. left ← 2i; right ← 2i + 1             /* indices of left & right children of A[i] */
2. largest ← i
3. if left ≤ heapsize(A) and A[left] > A[i] then
4.     largest ← left
5. if right ≤ heapsize(A) and A[right] > A[largest] then
6.     largest ← right
7. if largest ≠ i then
8.     swap(A[i], A[largest])
9.     Max-Heapify(A, largest)

Max-Heapify: Running Time

Running Time of Max-Heapify
• every line is Θ(1) time except the recursive call
• in worst-case, last row of binary tree is half empty, so children's sub-trees have size at most (2/3)n

So we get the recurrence
    T(n) ≤ T(2n/3) + Θ(1)
which, by case 2 of the master theorem, has the solution
    T(n) = O(lg n)
(or, Max-Heapify takes O(h) time when node A[i] has height h in the heap)

Build-Max-Heap(A)

Intuition: use Max-Heapify in a bottom-up manner to convert A into a heap
• Leaves are already heaps. Elements A[(⌊n/2⌋ + 1) ... n] are all leaves.
• Start at parents of leaves...then, grandparents of leaves...etc.

Build-Max-Heap(A)
1. heapsize(A) ← length(A)
2. for i ← ⌊length(A)/2⌋ downto 1 do
3.     Max-Heapify(A, i)

Running Time of Build-Max-Heap
• About n/2 calls to Max-Heapify (O(n) calls)
• Each call takes O(lg n) time
• ⇒ O(n lg n) time total (note: This bound is not tight.)
• The book shows that Build-Max-Heap runs in O(n) time.

Correctness of Build-Max-Heap

Loop invariant: At the start of each iteration of the for loop, each node i + 1, i + 2, ..., n is the root of a max-heap.
• Initialization: i = ⌊n/2⌋. Each node ⌊n/2⌋ + 1, ⌊n/2⌋ + 2, ..., n is a leaf, trivially satisfying the max-heap property.
• Maintenance: In each iteration, the children of node i are numbered higher than i. Therefore, by the loop invariant, the children of i are the roots of max-heaps. This is the condition required for inputs to Max-Heapify to make node i a max-heap root.
• At termination, i = 0. By the loop invariant, nodes 1, 2, ..., n are each the root of a max-heap; in particular, node 1 is.

Inserting Heap Elements

Inserting an element into a heap:
• increment heapsize and "add" new element to the end of array
• walk up tree from new leaf to root, swapping values. Insert input key when a parent key larger than the input key is found

Max-Heap-Insert(A, key)
1. heapsize(A) ← heapsize(A) + 1
2. i ← heapsize(A)
3. while i > 1 and A[parent(i)] < key do
4.     A[i] ← A[parent(i)]
5.     i ← parent(i)
6. A[i] ← key

Running time of Max-Heap-Insert: O(lg n)
• time to traverse leaf to root path (height = O(lg n))

Priority Queues

Definition: A priority queue is a data structure for maintaining a set S of elements, each with an associated key. A max-priority-queue gives priority to keys with larger values and supports the following operations:
1. insert(S, x) inserts the element x into set S.
2. max(S) returns element of S with largest key.
3. extract-max(S) removes and returns element of S with largest key.
4. increase-key(S, x, k) increases the value of element x's key to new value k (assuming k is at least as large as current key's value).

Priority Queues: Application for Heaps

An application of max-priority queues is to schedule jobs on a shared processor. Need to be able to

check current job's priority      Heap-Maximum(A)
remove job from the queue         Heap-Extract-Max(A)
insert new jobs into queue        Max-Heap-Insert(A, key)
increase priority of jobs         Heap-Increase-Key(A, i, key)

Initialize PQ by running Build-Max-Heap on an array A. A[1] holds the maximum value after this step.

Heap-Maximum(A) - returns value of A[1].

Heap-Extract-Max(A) - Saves A[1] and then, like Heap-Sort, puts item in A[heapsize] at A[1], decrements heapsize, and uses Max-Heapify to restore heap property.
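
The four priority-queue operations can be sketched in Python as follows. This is a minimal illustration, not the course's implementation: the class name MaxPriorityQueue and its method names are hypothetical, indices are 0-based (so the parent of i is (i - 1) // 2 rather than ⌊i/2⌋), and the body of increase_key is an assumption, since the slides name Heap-Increase-Key without giving its pseudocode. It reuses the same walk-up loop as Max-Heap-Insert.

```python
class MaxPriorityQueue:
    """Max-priority queue on a 0-based Python list (hypothetical sketch)."""

    def __init__(self, keys=()):
        self._a = list(keys)
        # Build-Max-Heap: sift down every non-leaf node, bottom-up.
        for i in range(len(self._a) // 2 - 1, -1, -1):
            self._sift_down(i)

    def _sift_down(self, i):
        # Iterative form of Max-Heapify: float a[i] down to its place.
        n = len(self._a)
        while True:
            left, right, largest = 2 * i + 1, 2 * i + 2, i
            if left < n and self._a[left] > self._a[largest]:
                largest = left
            if right < n and self._a[right] > self._a[largest]:
                largest = right
            if largest == i:
                return
            self._a[i], self._a[largest] = self._a[largest], self._a[i]
            i = largest

    def maximum(self):
        # Heap-Maximum: the largest key sits at the root.
        return self._a[0]

    def extract_max(self):
        # Heap-Extract-Max: save the root, move the last item to the
        # root, shrink the heap, then restore the heap property.
        top = self._a[0]
        last = self._a.pop()
        if self._a:
            self._a[0] = last
            self._sift_down(0)
        return top

    def increase_key(self, i, key):
        # Assumed behaviour: raise the key at index i, then walk up,
        # shifting smaller parents down (same loop as Max-Heap-Insert).
        if key < self._a[i]:
            raise ValueError("new key is smaller than current key")
        while i > 0 and self._a[(i - 1) // 2] < key:
            self._a[i] = self._a[(i - 1) // 2]
            i = (i - 1) // 2
        self._a[i] = key

    def insert(self, key):
        # Max-Heap-Insert: append a new leaf, then walk it up.
        self._a.append(key)
        self.increase_key(len(self._a) - 1, key)
```

In the job-scheduling setting, keys would be job priorities: insert adds a job, maximum inspects the highest-priority job, and extract_max removes it for execution.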
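
For reference, the Max-Heapify, Build-Max-Heap, and HeapSort pseudocode can be transcribed into Python roughly as below. This is a sketch under stated assumptions: it uses 0-based lists, so the children of i are 2i + 1 and 2i + 2 instead of 2i and 2i + 1, and it passes the heap size as an explicit heap_size parameter instead of storing it with the array. The function names are illustrative.

```python
def max_heapify(a, i, heap_size):
    """Float a[i] down until the subtree rooted at i is a max-heap.

    Assumes the subtrees rooted at both children of i are max-heaps.
    """
    left, right = 2 * i + 1, 2 * i + 2  # 0-based child indices
    largest = i
    if left < heap_size and a[left] > a[largest]:
        largest = left
    if right < heap_size and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)


def build_max_heap(a):
    """Turn the list a into a max-heap, bottom-up from the last parent."""
    for i in range(len(a) // 2 - 1, -1, -1):
        max_heapify(a, i, len(a))


def heap_sort(a):
    """Sort a in place, smallest to largest, and return it."""
    build_max_heap(a)
    for end in range(len(a) - 1, 0, -1):
        a[0], a[end] = a[end], a[0]   # move current max to its final slot
        max_heapify(a, 0, end)        # heap now occupies a[0:end]
    return a
```

For example, heap_sort([5, 2, 9, 1]) returns [1, 2, 5, 9]. Shrinking the heap is expressed by passing end as the heap size, which plays the role of the heap-size decrement in the pseudocode.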
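
The tighter O(n) bound for Build-Max-Heap, which the slides defer to the book, follows from the remark that Max-Heapify costs O(h) at a node of height h. The derivation below is the standard argument as a sketch; it relies on the fact that an n-element heap has height ⌊lg n⌋ and at most ⌈n / 2^(h+1)⌉ nodes of height h.

```latex
\begin{align*}
T(n) &\le \sum_{h=0}^{\lfloor \lg n \rfloor}
        \left\lceil \frac{n}{2^{h+1}} \right\rceil O(h)
      = O\!\left( n \sum_{h=0}^{\lfloor \lg n \rfloor} \frac{h}{2^{h}} \right) \\
     &\le O\!\left( n \sum_{h=0}^{\infty} \frac{h}{2^{h}} \right)
      = O(2n) = O(n),
\end{align*}
\text{since } \sum_{h=0}^{\infty} \frac{h}{2^{h}}
  = \frac{1/2}{(1 - 1/2)^{2}} = 2 .
```

Most nodes sit near the bottom of the tree where h is small, which is why the O(n lg n) estimate is not tight.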
QuickSort (Ch. 7)

Like Merge-Sort, a divide-and-conquer algorithm.