
Sorting Algorithms (Ch. 6 - 8)

Slightly modified definition of the sorting problem:
  input: A collection of n data items ⟨a1, a2, ..., an⟩, where each data item has a key drawn from a linearly ordered set (e.g., ints, chars)
  output: A permutation (reordering) ⟨a'1, a'2, ..., a'n⟩ of the input sequence such that a'1 ≤ a'2 ≤ ... ≤ a'n

• A sort is in place if only a constant number of elements of the input array are ever stored outside the array.
• A sort is comparison based if the only operation we can perform on keys is to compare two keys.
• In practice, one usually sorts "records" according to their key (the non-key data is called satellite data).
• If the records are large, we may sort an array of pointers.

Comparison-Based Sorts: Running Time

                  worst-case   average-case   best-case   in place
  Insertion Sort  O(n^2)       O(n^2)         O(n)        yes
  Merge Sort      O(n lg n)    O(n lg n)      O(n lg n)   no
  Heap Sort       O(n lg n)    O(n lg n)      O(n lg n)   yes
  Quick Sort      O(n^2)       O(n lg n)      O(n lg n)   yes
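As a small illustration of keys vs. satellite data (the record layout here is made up for this sketch, not from the slides), a minimal Python example of sorting records by key and of sorting "pointers" (indices) instead of moving large records:

records = [("carol", 72), ("alice", 91), ("bob", 72)]   # (satellite data, key)

# Sort by the key; the satellite data travels along with it.
by_key = sorted(records, key=lambda rec: rec[1])

# For large records, sort indices instead of moving the records themselves.
order = sorted(range(len(records)), key=lambda i: records[i][1])

print(by_key)   # [('carol', 72), ('bob', 72), ('alice', 91)]
print(order)    # [0, 2, 1]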


(Binary) Heaps

heap concept:
• complete binary tree, except may be missing some rightmost leaves on the bottom level
• each node contains a key
• values in the nodes satisfy the heap property

heap as an array implementation:
• store heap as a binary tree in an array
• root is A[1]
• for element A[i]:
  - left child is in position A[2i]
  - right child is in position A[2i + 1]
  - parent is in A[⌊i/2⌋]
• heapsize is the number of elements in the heap
• length is the number of elements in the array (note that the current length may be greater than the current heapsize)
• height = # edges on the longest root-to-leaf path

Max-Heap

In the array representation of a max-heap, the root of the tree is in A[1], and given the index i of a node:
  Parent(i): return ⌊i/2⌋     LeftChild(i): return 2i     RightChild(i): return 2i + 1

Max-heap property: A[Parent(i)] ≥ A[i]

Example (n = 11, height = 3):
  index  1   2   3   4   5   6   7   8   9   10  11
  keys   20  18  15  11  10  9   6   2   4   5   3
(As a tree: 20 at the root with children 18 and 15; 18 has children 11 and 10; 15 has children 9 and 6; 11 has children 2 and 4; 10 has children 5 and 3.)
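A minimal Python sketch of this array representation. The slides' pseudocode is 1-indexed, so the sketch pads index 0 of a Python list (a convention chosen here, not something the slides prescribe):

def parent(i): return i // 2
def left(i):   return 2 * i
def right(i):  return 2 * i + 1

A = [None, 20, 18, 15, 11, 10, 9, 6, 2, 4, 5, 3]   # pad slot 0 so the root is A[1]
heapsize = 11

# Every non-root node satisfies the max-heap property on the example array.
assert all(A[parent(i)] >= A[i] for i in range(2, heapsize + 1))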

Min-Heap

Min-heaps are commonly used for priority queues in event-driven simulators.

  Parent(i): return ⌊i/2⌋     LeftChild(i): return 2i     RightChild(i): return 2i + 1

Min-heap property: A[Parent(i)] ≤ A[i]

Example (n = 11, height = 3 = # edges on the longest root-to-leaf path):
  index  1   2   3   4   5   6   7   8   9   10  11
  keys   2   3   5   4   6   9   11  10  15  20  18

Heap Sort

Input: An n-element array A (unsorted).
Output: An n-element array A in sorted order, smallest to largest.

HeapSort(A)
1. Build-Max-Heap(A)            /* put all elements in heap */
2. for i ← length(A) downto 2 do
3.    swap A[1] ↔ A[i]          /* puts max in ith array position */
4.    heap-size[A] ← heap-size[A] - 1
5.    Max-Heapify(A, 1)         /* restore heap property */

Running time:
  Line 1: c1 · (?)              need to know running time of Build-Max-Heap
  Line 2: c2 · length(A) = c2·n
  Line 3: c3 · (n - 1)
  Line 4: c4 · (n - 1)
  Line 5: c5 · (n - 1) · (?)    need to know running time of Max-Heapify

Heap Sort

Input: An n-element array A (unsorted).
Output: An n-element array A in sorted order, smallest to largest.

HeapSort(A)
1. Build-Max-Heap(A)            /* put all elements in heap */
2. for i ← length(A) downto 2 do
3.    swap A[1] ↔ A[i]          /* puts max in ith array position */
4.    heap-size[A] ← heap-size[A] - 1
5.    Max-Heapify(A, 1)         /* restore heap property */

Running time of HeapSort: We'll see that
• Build-Max-Heap(A) takes O(|A|) = O(n) time
• Max-Heapify(A, 1) takes O(lg|A|) = O(lg n) time
so
• 1 call to Build-Max-Heap() ⇒ O(n) time
• n - 1 calls to Max-Heapify(), each taking O(lg n) time ⇒ O(n lg n) time

Heapify: Maintaining the Heap Property

• Assumption: the subtrees rooted at the left and right children of A[i] (i.e., the roots of these subtrees are A[2i] and A[2i + 1]) are heaps (i.e., obey the max-heap property)
• ...but the subtree rooted at A[i] might not be a heap (that is, A[i] may be smaller than its left or right child)
• Max-Heapify(A, i) will cause the value at A[i] to "float down" in the heap so that the subtree rooted at A[i] becomes a heap.

Max-Heapify(A, i)
1. left ← 2i; right ← 2i + 1    /* indices of left & right children of A[i] */
2. largest ← i
3. if left ≤ heapsize(A) and A[left] > A[i] then
4.    largest ← left
5. if right ≤ heapsize(A) and A[right] > A[largest] then
6.    largest ← right
7. if largest ≠ i then
8.    swap(A[i], A[largest])
9.    Max-Heapify(A, largest)
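A runnable Python sketch of Max-Heapify, using the same padded 1-indexed layout as the earlier sketch (heapsize passed explicitly, a simplification of the slides' heapsize(A)):

def max_heapify(A, i, heapsize):
    # Float A[i] down until the subtree rooted at i obeys the max-heap property.
    left, right = 2 * i, 2 * i + 1
    largest = i
    if left <= heapsize and A[left] > A[i]:
        largest = left
    if right <= heapsize and A[right] > A[largest]:
        largest = right
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        max_heapify(A, largest, heapsize)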

Max-Heapify: Running Time

• every line is Θ(1) time except the recursive call
• in the worst case, the last row of the binary tree is half empty, so the children's subtrees have size at most 2n/3
So we get the recurrence
  T(n) ≤ T(2n/3) + Θ(1)
which, by case 2 of the master theorem, has the solution
  T(n) = O(lg n)
(or: Max-Heapify takes O(h) time when node A[i] has height h in the heap)

Build-Max-Heap(A)

Intuition: use Max-Heapify in a bottom-up manner to convert A into a heap
• Leaves are already heaps. Elements A[⌊n/2⌋ + 1 .. n] are all leaves.
• Start at the parents of the leaves... then the grandparents of the leaves... etc.

Build-Max-Heap(A)
1. heapsize(A) ← length(A)
2. for i ← ⌊length(A)/2⌋ downto 1 do
3.    Max-Heapify(A, i)

Running Time of Build-Max-Heap
• About n/2 calls to Max-Heapify (O(n) calls)
• Each call takes O(lg n) time
• ⇒ O(n lg n) time total (note: this bound is not tight)
• The book shows that Build-Max-Heap in fact runs in O(n) time.
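Putting the pieces together, a sketch of Build-Max-Heap and HeapSort, reusing max_heapify from the sketch above (the test array is an assumption for illustration):

def build_max_heap(A, n):
    # Leaves A[n//2 + 1 .. n] are already heaps; fix the internal nodes bottom-up.
    for i in range(n // 2, 0, -1):
        max_heapify(A, i, n)

def heapsort(A):
    n = len(A) - 1                # A[0] is padding; the elements are A[1..n]
    build_max_heap(A, n)
    for i in range(n, 1, -1):
        A[1], A[i] = A[i], A[1]       # move the current max into its final slot
        max_heapify(A, 1, i - 1)      # restore the heap on the shrunken prefix

A = [None, 4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
heapsort(A)
print(A[1:])   # [1, 2, 3, 4, 7, 8, 9, 10, 14, 16]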

Correctness of Build-Max-Heap

Loop invariant: At the start of each iteration of the for loop, each node i + 1, i + 2, ..., n is the root of a max-heap.
• Initialization: i = ⌊n/2⌋. Each node ⌊n/2⌋ + 1, ⌊n/2⌋ + 2, ..., n is a leaf, trivially satisfying the max-heap property.
• Maintenance: at each iteration, the children of node i are numbered higher than i. Therefore, by the loop invariant, the children of i are the roots of max-heaps. This is exactly the condition required for Max-Heapify to make node i a max-heap root.
• Termination: i = 0. By the loop invariant, each node 1, 2, ..., n is the root of a max-heap; in particular, node 1 is, so A is a max-heap.

Build-Max-Heap(A)
1. heapsize(A) ← length(A)
2. for i ← ⌊length(A)/2⌋ downto 1 do
3.    Max-Heapify(A, i)

Inserting Heap Elements

Inserting an element into a heap:
• increment heapsize and "add" the new element at the end of the array
• walk up the tree from the new leaf toward the root, shifting parent values down; insert the input key once a parent key at least as large as the input key is found

Max-Heap-Insert(A, key)
1. heapsize(A) ← heapsize(A) + 1
2. i ← heapsize(A)
3. while i > 1 and A[parent(i)] < key do
4.    A[i] ← A[parent(i)]
5.    i ← parent(i)
6. A[i] ← key

Running time of Max-Heap-Insert: O(lg n)
• time to traverse the leaf-to-root path (height = O(lg n))
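A sketch of Max-Heap-Insert in the same padded-list style (here len(A) - 1 doubles as the heapsize, a simplification of this sketch):

def max_heap_insert(A, key):
    A.append(None)            # grow the array; the new leaf is position len(A)-1
    i = len(A) - 1
    while i > 1 and A[i // 2] < key:
        A[i] = A[i // 2]      # shift the smaller parent key down one level
        i //= 2
    A[i] = key                # parent (if any) is >= key, so the heap property holds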

Priority Queues

Definition: A priority queue is a data structure for maintaining a set S of elements, each with an associated key. A max-priority-queue gives priority to keys with larger values and supports the following operations:
1. insert(S, x): inserts the element x into set S.
2. max(S): returns the element of S with the largest key.
3. extract-max(S): removes and returns the element of S with the largest key.
4. increase-key(S, x, k): increases the value of element x's key to the new value k (assuming k is at least as large as the current key's value).

Priority Queues: Application for Heaps

An application of max-priority queues is to schedule jobs on a shared processor. Need to be able to:
• check the current job's priority       Heap-Maximum(A)
• remove a job from the queue            Heap-Extract-Max(A)
• insert new jobs into the queue         Max-Heap-Insert(A, key)
• increase the priority of jobs          Heap-Increase-Key(A, i, key)

Initialize the PQ by running Build-Max-Heap on an array A. A[1] holds the maximum value after this step.
• Heap-Maximum(A): returns the value of A[1].
• Heap-Extract-Max(A): saves A[1] and then, like Heap-Sort, puts the item in A[heapsize] at A[1], decrements heapsize, and uses Max-Heapify to restore the heap property.
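A sketch of these operations on the same padded layout (Heap-Increase-Key is described on the next slide; the explicit heapsize parameter is a convention of this sketch):

def heap_maximum(A):
    return A[1]

def heap_extract_max(A, heapsize):
    # Assumes heapsize >= 1. Returns (max, new heapsize).
    maximum = A[1]
    A[1] = A[heapsize]            # move the last leaf to the root
    heapsize -= 1
    max_heapify(A, 1, heapsize)   # restore the heap property
    return maximum, heapsize

def heap_increase_key(A, i, key):
    # Assumes key >= A[i]; float the increased key up toward the root.
    A[i] = key
    while i > 1 and A[i // 2] < A[i]:
        A[i], A[i // 2] = A[i // 2], A[i]
        i //= 2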


Priority Queues: Application for Heaps

• Heap-Increase-Key(A, i, key): if key is larger than the current key at A[i], floats the increased key up the heap until the heap property is restored.

An application for a min-heap priority queue is an event-driven simulator, where the key is an integer representing the number of seconds (or other discrete time unit) from time zero (the starting point for the simulation).

Quicksort (Ch. 7)

Like Merge-Sort, a divide-and-conquer algorithm.

Divide: Rearrange the array A[p..r] into two (possibly empty) subarrays A[p..q-1] and A[q+1..r] such that each element of A[p..q-1] ≤ A[q] and each element of A[q+1..r] ≥ A[q], after computation of the index q.

Conquer: Sort the two subarrays A[p..q-1] and A[q+1..r] recursively.

Combine: No work is needed to combine the subarrays, since they are sorted in place.


Quicksort

Input: An n-element array A (unsorted).
Output: An n-element array A in sorted order, smallest to largest.

Quicksort(A, p, r)
1. if p < r then
2.    q ← Partition(A, p, r)
3.    Quicksort(A, p, q-1)
4.    Quicksort(A, q+1, r)

Initial call: Quicksort(A, 1, length(A))

Partition(A, p, r)
1. x ← A[r]
2. i ← p - 1
3. for j ← p to r - 1 do
4.    if A[j] ≤ x then
5.       i ← i + 1
6.       swap A[i] ↔ A[j]
7. swap A[i+1] ↔ A[r]
8. return i + 1

What does Partition do? What is the running time of Partition?

Correctness of Quicksort

Claim: Partition satisfies the specification of the Divide step.

Loop invariant: At the beginning of each iteration of the for loop (lines 3-6), for any array index k:
1. If p ≤ k ≤ i, then A[k] ≤ x.
2. If i+1 ≤ k ≤ j-1, then A[k] > x.
3. If k = r, then A[k] = x.
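A direct Python transcription of the two procedures (0-indexed here, unlike the slides' 1-indexed pseudocode):

def partition(A, p, r):
    x = A[r]                          # pivot: last element of A[p..r]
    i = p - 1
    for j in range(p, r):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]   # grow the "<= x" region
    A[i + 1], A[r] = A[r], A[i + 1]   # put the pivot between the two regions
    return i + 1

def quicksort(A, p, r):
    if p < r:
        q = partition(A, p, r)
        quicksort(A, p, q - 1)
        quicksort(A, q + 1, r)

A = [2, 8, 7, 1, 3, 5, 6, 4]
quicksort(A, 0, len(A) - 1)
print(A)   # [1, 2, 3, 4, 5, 6, 7, 8]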


Partition: Correctness

Initialization: i = p - 1 and j = p. No index k lies between p and i (cond. 1), and no index k lies between i+1 and j-1 (cond. 2), so conditions 1 and 2 hold trivially. Line 1 (x ← A[r]) satisfies cond. 3.

Inductive step: Either A[j] > x or A[j] ≤ x. In the first case, j is incremented and cond. 2 holds for A[j-1] with no other changes. In the second case, i is incremented, A[i] and A[j] are swapped, and then j is incremented. Cond. 1 holds for A[i] after the swap. By the inductive hypothesis, the item swapped into A[j] sat at position i+1 (the new i) at the start of the iteration and so was > x then; therefore cond. 2 holds once j is incremented.

Termination: At termination, j = r and A[p..r] has been partitioned into three sets: items ≤ x (A[p..i]), items > x (A[i+1..r-1]), and the pivot A[r] = x. Line 7 then swaps the pivot into position i+1, between the two sets.

Quicksort Running Time

T(n) = T(q - p) + T(r - q) + O(n)

The value of T(n) depends on the location of q in the array A[p..r]. Since we don't know this in advance, we must look at worst-case, best-case, and average-case partitioning.

Worst-case partitioning: Each partition results in a 0 : n-1 split. T(0) = Θ(1) and the partitioning costs Θ(n), so the recurrence is
  T(n) = T(n-1) + T(0) + Θ(n) = T(n-1) + Θ(n)
This is an arithmetic series which evaluates to Θ(n^2) (shown using the substitution method). So the worst case for Quicksort is no better than insertion sort.

What does the input look like in Quicksort's worst case? (An already-sorted array: every call then produces a 0 : n-1 split.)


Quicksort Best-case

T(n) = T(q - p) + T(r - q) + O(n)

Best-case partitioning: Each partition results in a ⌊n/2⌋ : ⌈n/2⌉ - 1 split (i.e., a close-to-balanced split each time), so the recurrence is
  T(n) = 2T(n/2) + Θ(n)
By case 2 of the master theorem, this recurrence evaluates to Θ(n lg n).

Quicksort Average-case

Intuition: Some splits will be close to balanced and others close to unbalanced ⇒ good and bad splits will be randomly distributed in the recursion tree.

The running time will be bad only if there are many bad splits in a row.
• A bad split followed by a good split results in a good partitioning after one extra step.
• This implies a Θ(n lg n) running time (with a larger constant factor).


Randomized Quicksort

How can we modify Quicksort to get good average-case behavior on all inputs? Answer: randomization!

Two techniques:
1. Randomly permute the input prior to running Quicksort. This produces a tree of possible executions, most of which finish fast.
2. Choose the partition element randomly at each iteration, instead of always choosing the element in the highest array position.

Randomized-Partition(A, p, r)
1. i ← Random(p, r)
2. swap A[r] ↔ A[i]
3. return Partition(A, p, r)

In Section 7.4, a probabilistic analysis is presented, showing that the expected running time of Randomized-Quicksort is O(n lg n).

Lower Bounds for Comparison-Based Sorting Algorithms (Ch. 8)

We have seen several sorting algorithms that run in Ω(n lg n) time in the worst case (meaning that for each algorithm there is some input on which it takes Ω(n lg n) time):
• mergesort
• heapsort
• quicksort

In all comparison-based sorting algorithms, the sorted order results only from comparisons between input elements.

Is it possible for any comparison-based sorting algorithm to do better? No.
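A sketch of the second technique in Python, reusing partition from the earlier Quicksort sketch (random.randint plays the role of the slides' Random(p, r)):

import random

def randomized_partition(A, p, r):
    i = random.randint(p, r)      # uniform over p..r inclusive, like Random(p, r)
    A[r], A[i] = A[i], A[r]       # move the randomly chosen pivot to the last slot
    return partition(A, p, r)

def randomized_quicksort(A, p, r):
    if p < r:
        q = randomized_partition(A, p, r)
        randomized_quicksort(A, p, q - 1)
        randomized_quicksort(A, q + 1, r)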


Lower Bounds for Sorting Algorithms (Ch. 8)

Theorem: Any comparison-based sort must make Ω(n lg n) comparisons in the worst case to sort a sequence of n elements.

But how do we prove this? We'll use the decision tree model to represent any sorting algorithm, and then argue that no matter the algorithm, there is some input which will cause it to run in Ω(n lg n) time.

The Decision Tree Model

Given any comparison-based sorting algorithm, we can represent its behavior on an input of size n by a decision tree. Note: we need only consider the comparisons in the algorithm (the other operations only make the algorithm take longer).
• each internal node in the decision tree corresponds to one of the comparisons in the algorithm
• start at the root and do the first comparison: if ≤, take the left branch; if >, take the right branch; etc.
• each leaf represents one possible ordering of the input
  - one leaf for each of the n! possible orderings
⇒ One decision tree for each algorithm and input size


The Decision Tree Model

Example: Insertion sort with n = 3 (3! = 6 leaves)

  A[1] vs A[2]
  ├─ ≤: A[2] vs A[3]
  │     ├─ ≤: ⟨A[1], A[2], A[3]⟩
  │     └─ >: A[1] vs A[3]
  │           ├─ ≤: ⟨A[1], A[3], A[2]⟩
  │           └─ >: ⟨A[3], A[1], A[2]⟩
  └─ >: A[1] vs A[3]
        ├─ ≤: ⟨A[2], A[1], A[3]⟩
        └─ >: A[2] vs A[3]
              ├─ ≤: ⟨A[2], A[3], A[1]⟩
              └─ >: ⟨A[3], A[2], A[1]⟩

Note: The length of the longest root-to-leaf path in this tree
  = worst-case number of comparisons
  ≤ worst-case number of operations of the algorithm

The Ω(n lg n) Lower Bound

Theorem: Any decision tree for sorting n elements has height Ω(n lg n) (therefore, any comparison-based algorithm requires Ω(n lg n) comparisons in the worst case).

Proof: Let h be the height of the tree. Then we know:
• the tree has at least n! leaves
• the tree is binary, so it has at most 2^h leaves
So n! ≤ #leaves ≤ 2^h, which gives
  2^h ≥ n!
  lg(2^h) ≥ lg(n!)
  h ≥ lg(n!) = Ω(n lg n)   (by Eq. 3.18: lg(n!) = Θ(n lg n)) ∎

Beating the Lower Bound → Non-Comparison-Based Sorts

Idea: Algorithms that are NOT comparison based might be faster. There are three such algorithms presented in Chapter 8:
• counting sort
• radix sort
• bucket sort

These algorithms:
• run in O(n) time (under certain conditions)
• either use information about the values to be sorted (counting sort, bucket sort), or
• operate on "pieces" of the input elements (radix sort)

Counting Sort

Requirement: input elements are integers in a known range [0..k] for some constant k.

Idea: for each input element x, find the number of elements less than x (say this number = m) and put x in the (m+1)st spot in the output array.

Counting-Sort(A, k)
// A[1..n] is the input array, C[0..k] is initially all 0's, B[1..n] is the output array
1. for i ← 1 to length(A) do
2.    C[A[i]] ← C[A[i]] + 1
3. for i ← 1 to k do
4.    C[i] ← C[i] + C[i-1]       // make C into a cumulative count array, where
                                 // C[i] contains the number of elements ≤ i
5. for j ← length(A) downto 1 do
6.    B[C[A[j]]] ← A[j]
7.    C[A[j]] ← C[A[j]] - 1
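A runnable Python sketch of Counting-Sort (0-indexed input and output to match Python, so the rank C[a] is shifted by one when indexing B):

def counting_sort(A, k):
    C = [0] * (k + 1)
    B = [0] * len(A)
    for a in A:                   # count occurrences of each key
        C[a] += 1
    for i in range(1, k + 1):     # C[i] = number of elements <= i
        C[i] += C[i - 1]
    for a in reversed(A):         # scan right-to-left so the sort is stable
        B[C[a] - 1] = a           # -1 converts the 1-indexed rank to a 0-indexed slot
        C[a] -= 1
    return B

print(counting_sort([2, 5, 3, 0, 2, 3, 0, 3], 5))   # [0, 0, 2, 2, 3, 3, 3, 5]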


Running Time of Counting Sort

• for loop in lines 1-2 takes Θ(n) time
• for loop in lines 3-4 takes Θ(k) time
• for loop in lines 5-7 takes Θ(n) time
• Overall time is Θ(k + n)

In practice, use counting sort when we have k = Θ(n), so the running time is Θ(n).

Counting sort has the important property of stability: a sorting algorithm is stable when numbers with the same value appear in the output array in the same order as they do in the input array. This is important when satellite data is stored with the elements being sorted, and because counting sort is used as a subroutine for radix sort.

Radix Sort

Let d be the number of digits in each input number.

Radix-Sort(A, d)
1. for i ← 1 to d do
2.    use a stable sort to sort array A on digit i

Note:
• radix sort sorts on the least significant digit first!
• correctness can be shown by induction on the digit being sorted
• often counting sort is used as the stable sort in step 2

Running time of radix sort: O(d·Tss(n)), where Tss is the time for the internal stable sort.
• Counting sort gives Tss(n) = O(k + n), so O(d·Tss(n)) = O(d(k + n)), which is O(n) if d = O(1) and k = O(n).
• If d = O(lg n) and k = 2 (common for computers), then O(d(k + n)) = O(n lg n).
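A sketch of LSD radix sort using a stable counting sort on each digit; working in base 10 and extracting digits with integer division is an implementation choice of this sketch, not something the slides fix:

def radix_sort(A, d):
    # Sort d-digit non-negative integers, least significant digit first.
    for exp in (10 ** i for i in range(d)):
        C = [0] * 10
        B = [0] * len(A)
        for a in A:
            C[(a // exp) % 10] += 1       # count keys by the current digit
        for i in range(1, 10):
            C[i] += C[i - 1]              # cumulative counts
        for a in reversed(A):             # right-to-left keeps each pass stable
            digit = (a // exp) % 10
            B[C[digit] - 1] = a
            C[digit] -= 1
        A = B
    return A

print(radix_sort([329, 457, 657, 839, 436, 720, 355], 3))
# [329, 355, 436, 457, 657, 720, 839]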

Bucket Sort

Assumption: input elements are distributed uniformly over some known range, e.g., [0, 1). (Appendix C.2 has the definition of uniform distribution.)

Bucket-Sort(A, x, y)
1. divide the interval [x, y) into n equal-sized subintervals (buckets)
2. distribute the n input keys into the buckets
3. sort the numbers in each bucket (e.g., with insertion sort)
4. scan the (sorted) buckets in order and produce the output array

Running time of bucket sort: O(n) expected time
• Step 1: O(1) for each interval = O(n) time total
• Step 2: O(n) time
• Step 3: the expected number of elements in each bucket is O(1) (see the book for the formal argument, Section 8.4)
• Step 4: O(n) time to scan the n buckets containing a total of n input elements

Summary: Non-Comparison-Based Sorts

                 worst-case     average-case   best-case      in place
  Counting Sort  O(n + k)       O(n + k)       O(n + k)       no
  Radix Sort     O(d(n + k'))   O(d(n + k'))   O(d(n + k'))   no
  Bucket Sort                   O(n) expected                 no

• Counting sort assumes input elements are in range [1, 2, ..., k] and uses array indexing to count the number of occurrences of each value.
• Radix sort (integer sort only) assumes each integer consists of d digits, where each digit is in range [1, 2, ..., k'].
• Bucket sort requires advance knowledge of the input distribution (sorts n numbers uniformly distributed over a known range in O(n) time).
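Closing out the non-comparison-based sorts, a sketch of bucket sort for keys uniform over [0, 1); Python's built-in sorted() stands in for the per-bucket insertion sort of step 3:

def bucket_sort(A):
    n = len(A)
    buckets = [[] for _ in range(n)]      # n equal-sized subintervals of [0, 1)
    for x in A:
        buckets[int(n * x)].append(x)     # key x lands in bucket floor(n*x)
    out = []
    for b in buckets:
        out.extend(sorted(b))             # each bucket holds O(1) keys on average
    return out

print(bucket_sort([0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68]))
# [0.12, 0.17, 0.21, 0.23, 0.26, 0.39, 0.68, 0.72, 0.78, 0.94]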
