Quicksort, Bucket Sort, Radix Sort

Quick-Sort, Bucket Sort, Radix Sort Sections 11.2, 11.3.2, 11.3.3 7 4 9 6 2 → 2 4 6 7 9 4 2 → 2 4 7 9 → 7 9 2 → 2 9 → 9 © 2004 Goodrich, Tamassia Quick-Sort 1 Quick-Sort Quick-sort is a randomized sorting algorithm based x on the divide-and-conquer paradigm: ◼ Divide: pick a random element x (called pivot) and x partition S into L elements less than x E elements equal x L E G G elements greater than x ◼ Recur: sort L and G ◼ Conquer: join L, E and G x © 2004 Goodrich, Tamassia Quick-Sort 2 Partition We partition an input Algorithm partition(S, p) sequence as follows: Input sequence S, position p of pivot Output subsequences L, E, G of the ◼ We remove, in turn, each elements of S less than, equal to, element y from S and or greater than the pivot, resp. ◼ We insert y into L, E or G, L, E, G empty sequences depending on the result of x S.erase(p) the comparison with the while S.empty() pivot x y S.eraseFront() Each insertion and removal if y < x is at the beginning or at the L.insertBack(y) end of a sequence, and else if y = x hence takes O(1) time E.insertBack(y) Thus, the partition step of else { y > x } quick-sort takes O(n) time G.insertBack(y) return L, E, G © 2004 Goodrich, Tamassia Quick-Sort 3 Quick-Sort Tree An execution of quick-sort is depicted by a binary tree ◼ Each node represents a recursive call of quick-sort and stores Unsorted sequence before the execution and its pivot Sorted sequence at the end of the execution ◼ The root is the initial call ◼ The leaves are calls on subsequences of size 0 or 1 7 4 9 6 2 → 2 4 6 7 9 4 2 → 2 4 7 9 → 7 9 2 → 2 9 → 9 © 2004 Goodrich, Tamassia Quick-Sort 4 Execution Example Pivot selection 7 2 9 4 3 7 6 1 → 1 2 3 4 6 7 8 9 7 2 9 4 → 2 4 7 9 3 8 6 1 → 1 3 8 6 2 → 2 9 4 → 4 9 3 → 3 8 → 8 9 → 9 4 → 4 © 2004 Goodrich, Tamassia Quick-Sort 5 Execution Example (cont.) Partition, recursive call, pivot selection 7 2 9 4 3 7 6 1 → 1 2 3 4 6 7 8 9 2 4 3 1 → 2 4 7 9 3 8 6 1 → 1 3 8 6 2 → 2 9 4 → 4 9 3 → 3 8 → 8 9 → 9 4 → 4 © 2004 Goodrich, Tamassia Quick-Sort 6 Execution Example (cont.) Partition, recursive call, base case 7 2 9 4 3 7 6 1 → → 1 2 3 4 6 7 8 9 2 4 3 1 →→ 2 4 7 3 8 6 1 → 1 3 8 6 1 → 1 9 4 → 4 9 3 → 3 8 → 8 9 → 9 4 → 4 © 2004 Goodrich, Tamassia Quick-Sort 7 Execution Example (cont.) Recursive call, …, base case, join 7 2 9 4 3 7 6 1 → 1 2 3 4 6 7 8 9 2 4 3 1 → 1 2 3 4 3 8 6 1 → 1 3 8 6 1 → 1 4 3 → 3 4 3 → 3 8 → 8 9 → 9 4 → 4 © 2004 Goodrich, Tamassia Quick-Sort 8 Execution Example (cont.) Recursive call, pivot selection 7 2 9 4 3 7 6 1 → 1 2 3 4 6 7 8 9 2 4 3 1 → 1 2 3 4 7 9 7 1 → 1 3 8 6 1 → 1 4 3 → 3 4 8 → 8 9 → 9 9 → 9 4 → 4 © 2004 Goodrich, Tamassia Quick-Sort 9 Execution Example (cont.) Partition, …, recursive call, base case 7 2 9 4 3 7 6 1 → 1 2 3 4 6 7 8 9 2 4 3 1 → 1 2 3 4 7 9 7 1 → 1 3 8 6 1 → 1 4 3 → 3 4 8 → 8 9 → 9 9 → 9 4 → 4 © 2004 Goodrich, Tamassia Quick-Sort 10 Execution Example (cont.) Join, join 7 2 9 4 3 7 6 1 → 1 2 3 4 6 7 7 9 2 4 3 1 → 1 2 3 4 7 9 7 → 17 7 9 1 → 1 4 3 → 3 4 8 → 8 9 → 9 9 → 9 4 → 4 © 2004 Goodrich, Tamassia Quick-Sort 11 Worst-case Running Time The worst case for quick-sort occurs when the pivot is the unique minimum or maximum element One of L and G has size n - 1 and the other has size 0 The running time is proportional to the sum n + (n - 1) + … + 2 + 1 Thus, the worst-case running time of quick-sort is O(n2) depth time 0 n 1 n - 1 … … n - 1 1 © 2004 Goodrich, Tamassia Quick-Sort 12 Expected Running Time Consider a recursive call of quick-sort on a sequence of size s ◼ Good call: the sizes of L and G are each less than 3s/4 ◼ Bad call: one of L and G has size greater than 3s/4 7 2 9 4 3 7 6 1 9 7 2 9 8 4 3 7 6 1 2 4 3 1 7 9 7 1 → 1 1 7 9 8 4 3 7 6 Good call Bad call A call is good with probability 1/2 ◼ 1/2 of the possible pivots cause good calls: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 Bad pivots Good pivots Bad pivots © 2004 Goodrich, Tamassia Quick-Sort 13 Expected Running Time, Part 2 Probabilistic Fact: The expected number of coin tosses required in order to get k heads is 2k For a node of depth i, we expect ◼ i/2 ancestors are good calls i/2 ◼ The size of the input sequence for the current call is at most (3/4) n Therefore, we have expected height time per level s(r) O(n) ◼ For a node of depth 2log4/3n, the expected input size is one ◼ The expected height of the s(a) s(b) O(n) quick-sort tree is O(log n) O(log n) The amount or work done at the s(c) s(d) s(e) s(f) O(n) nodes of the same depth is O(n) Thus, the expected running time of quick-sort is O(n log n) total expected time: O(n log n) © 2004 Goodrich, Tamassia Quick-Sort 14 In-Place Quick-Sort Quick-sort can be implemented to run in-place In the partition step, we use Algorithm inPlaceQuickSort(S, l, r) replace operations to rearrange Input sequence S, ranks l and r the elements of the input Output sequence S with the sequence such that elements of rank between l and r rearranged in increasing order ◼ the elements less than the pivot have rank less than h if l r return ◼ the elements equal to the pivot have rank between h and k i a random integer between l and r ◼ the elements greater than the x S.elemAtRank(i) pivot have rank greater than k (h, k) inPlacePartition(x) The recursive calls consider inPlaceQuickSort(S, l, h - 1) ◼ elements with rank less than h inPlaceQuickSort(S, k + 1, r) ◼ elements with rank greater than k © 2004 Goodrich, Tamassia Quick-Sort 15 In-Place Partitioning Perform the partition using two indices to split S into L and E U G (a similar method can split E U G into E and G). j k 3 2 5 1 0 7 3 5 9 2 7 9 8 9 7 6 9 (pivot = 6) Repeat until j and k cross: ◼ Scan j to the right until finding an element > x. ◼ Scan k to the left until finding an element < x. ◼ Swap elements at indices j and k j k 3 2 5 1 0 7 3 5 9 2 7 9 8 9 7 6 9 © 2004 Goodrich, Tamassia Quick-Sort 16 Summary of Sorting Algorithms Algorithm Time Notes ▪ in-place selection-sort O(n2) ▪ slow (good for small inputs) ▪ in-place insertion-sort O(n2) ▪ slow (good for small inputs) O(n log n) ▪ in-place, randomized quick-sort expected ▪ fastest (good for large inputs) ▪ in-place heap-sort O(n log n) ▪ fast (good for large inputs) ▪ sequential data access merge-sort O(n log n) ▪ fast (good for huge inputs) © 2004 Goodrich, Tamassia Quick-Sort 17 Bucket-Sort Let be S be a sequence of n Algorithm bucketSort(S, N) (key, element) entries with Input sequence S of (key, element) keys in the range [0, N - 1] items with keys in the range Bucket-sort uses the keys as [0, N - 1] indices into an auxiliary array B Output sequence S sorted by of sequences (buckets) increasing keys Phase 1: Empty sequence S by B array of N empty sequences moving each entry (k, o) into while S.empty() its bucket B[k] (k, o) S.front() Phase 2: For i = 0, …, N - 1, move S.eraseFront() the entries of bucket B[i] to the B[k].insertBack((k, o)) end of sequence S for i 0 to N - 1 Analysis: while B[i].empty() ◼ Phase 1 takes O(n) time (k, o) B[i].front() ◼ Phase 2 takes O(n + N) time B[i].eraseFront() Bucket-sort takes O(n + N) time S.insertBack((k, o)) © 2004 Goodrich, Tamassia Bucket-Sort and Radix-Sort 18 Example Key range [0, 9] 7, d 1, c 3, a 7, g 3, b 7, e Phase 1 1, c 3, a 3, b 7, d 7, g 7, e B 0 1 2 3 4 5 6 7 8 9 Phase 2 1, c 3, a 3, b 7, d 7, g 7, e © 2004 Goodrich, Tamassia Bucket-Sort and Radix-Sort 19 Properties and Extensions Key-type Property Extensions ◼ The keys are used as ◼ Integer keys in the range [a, b] indices into an array Put entry (k, o) into bucket and cannot be arbitrary B[k - a] objects ◼ String keys from a set D of possible strings, where D has ◼ No external comparator constant size (e.g., names of Stable Sort Property the 13 provinces and territories) ◼ The relative order of Sort D and compute the rank any two items with the r(k) of each string k of D in same key is preserved the sorted sequence after the execution of Put entry (k, o) into bucket the algorithm B[r(k)] © 2004 Goodrich, Tamassia Bucket-Sort and Radix-Sort 20 Lexicographic Order A d-tuple is a sequence of d keys (k1, k2, …, kd), where key ki is said to be the i-th dimension of the tuple Example: ◼ The Cartesian coordinates of a point in space are a 3-tuple The lexicographic order of two d-tuples is recursively defined as follows (x1, x2, …, xd) < (y1, y2, …, yd) x1 < y1 x1 = y1 (x2, …, xd) < (y2, …, yd) I.e., the tuples are compared by the first dimension, then by the second dimension, etc.

Load more