Mergesort: Two Classic Sorting Algorithms
2.2 Mergesort

‣ mergesort
‣ bottom-up mergesort
‣ sorting complexity
‣ comparators

Algorithms in Java, 4th Edition · Robert Sedgewick and Kevin Wayne · Copyright © 2009 · January 22, 2010 2:31:50 PM

Two classic sorting algorithms

Critical components in the world's computational infrastructure.
• Full scientific understanding of their properties has enabled us to develop them into practical system sorts.
• Quicksort honored as one of top 10 algorithms of 20th century in science and engineering.

Mergesort. [today]
• Java sort for objects.
• Perl, Python stable sort.

Quicksort. [next lecture]
• Java sort for primitive types.
• C qsort, Unix, g++, Visual C++, Python.

Mergesort

Basic plan.
• Divide array into two halves.
• Recursively sort each half.
• Merge two halves.

input            M E R G E S O R T E X A M P L E
sort left half   E E G M O R R S T E X A M P L E
sort right half  E E G M O R R S A E E L M P T X
merge results    A E E E E G L M M O P R R S T X

Mergesort overview

Merging

Q. How to combine two sorted subarrays into a sorted whole?
A. Use an auxiliary array.

[Figure: aux[] = A G L O R H I M S T with pointers lo, i, mid, j, hi; a[] = A G H I L M with pointer k.]

Merging: Java implementation

    private static void merge(Comparable[] a, int lo, int mid, int hi)
    {
        assert isSorted(a, lo, mid);     // precondition: a[lo..mid] sorted
        assert isSorted(a, mid+1, hi);   // precondition: a[mid+1..hi] sorted

        for (int k = lo; k <= hi; k++)   // copy a[] to aux[]
            aux[k] = a[k];

        int i = lo, j = mid+1;
        for (int k = lo; k <= hi; k++)   // merge
        {
            if      (i > mid)               a[k] = aux[j++];
            else if (j > hi)                a[k] = aux[i++];
            else if (less(aux[j], aux[i]))  a[k] = aux[j++];
            else                            a[k] = aux[i++];
        }

        assert isSorted(a, lo, hi);      // postcondition: a[lo..hi] sorted
    }

       a[]                          aux[]
k      0 1 2 3 4 5 6 7 8 9   i  j   0 1 2 3 4 5 6 7 8 9
input  E E G M R A C E R T          - - - - - - - - - -
copy   E E G M R A C E R T          E E G M R A C E R T
                             0  5
0      A                     0  6   E E G M R A C E R T
1      A C                   0  7   E E G M R   C E R T
2      A C E                 1  7   E E G M R     E R T
3      A C E E               2  7     E G M R     E R T
4      A C E E E             2  8       G M R     E R T
5      A C E E E G           3  8       G M R       R T
6      A C E E E G M         4  8         M R       R T
7      A C E E E G M R       5  8           R       R T
8      A C E E E G M R R     5  9                   R T
9      A C E E E G M R R T   6 10                     T
merged result: A C E E E G M R R T

Abstract in-place merge trace

Assertions

Assertion. Statement to test assumptions about your program.
• Helps detect logic bugs.
• Documents code.

Java assert statement. Throws an AssertionError unless the boolean condition is true.

    assert isSorted(a, lo, hi);

Can enable or disable at runtime. ⇒ No cost in production code.

    java -ea MyProgram   // enable assertions
    java -da MyProgram   // disable assertions (default)

Best practices. Use to check internal invariants. Assume assertions will be disabled in production code (e.g., don't use for external argument-checking).

Mergesort: Java implementation

    public class Merge
    {
        private static Comparable[] aux;

        private static void merge(Comparable[] a, int lo, int mid, int hi)
        {  /* as before */  }

        private static void sort(Comparable[] a, int lo, int hi)
        {
            if (hi <= lo) return;
            int mid = lo + (hi - lo) / 2;
            sort(a, lo, mid);
            sort(a, mid+1, hi);
            merge(a, lo, mid, hi);
        }

        public static void sort(Comparable[] a)
        {
            aux = new Comparable[a.length];
            sort(a, 0, a.length - 1);
        }
    }
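The merge and sort fragments above can be assembled into one runnable file. The following is a minimal sketch, not the authors' full class: the name MergeSketch and the bodies of the helpers less() and isSorted() are assumptions (the slides call these helpers but do not show them), and aux[] is passed as a parameter rather than kept in a static field, which is a small deviation chosen to keep the sketch self-contained.

```java
// Sketch only: MergeSketch, less() and isSorted() are illustrative names and
// bodies assumed for this example; the slides do not show the helpers.
@SuppressWarnings({"rawtypes", "unchecked"})
final class MergeSketch {

    // Assumed helper: is v strictly smaller than w?
    private static boolean less(Comparable v, Comparable w) {
        return v.compareTo(w) < 0;
    }

    // Assumed helper: is a[lo..hi] in nondecreasing order?
    private static boolean isSorted(Comparable[] a, int lo, int hi) {
        for (int k = lo + 1; k <= hi; k++)
            if (less(a[k], a[k - 1])) return false;
        return true;
    }

    // Merge sorted a[lo..mid] with sorted a[mid+1..hi], using aux[] as scratch space.
    private static void merge(Comparable[] a, Comparable[] aux, int lo, int mid, int hi) {
        assert isSorted(a, lo, mid);       // precondition
        assert isSorted(a, mid + 1, hi);   // precondition

        for (int k = lo; k <= hi; k++)     // copy a[lo..hi] to aux[lo..hi]
            aux[k] = a[k];

        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if      (i > mid)              a[k] = aux[j++];  // left half exhausted
            else if (j > hi)               a[k] = aux[i++];  // right half exhausted
            else if (less(aux[j], aux[i])) a[k] = aux[j++];  // right item strictly smaller
            else                           a[k] = aux[i++];  // ties go to the left half
        }

        assert isSorted(a, lo, hi);        // postcondition
    }

    // Top-down mergesort of a[lo..hi].
    private static void sort(Comparable[] a, Comparable[] aux, int lo, int hi) {
        if (hi <= lo) return;
        int mid = lo + (hi - lo) / 2;
        sort(a, aux, lo, mid);
        sort(a, aux, mid + 1, hi);
        merge(a, aux, lo, mid, hi);
    }

    public static void sort(Comparable[] a) {
        Comparable[] aux = new Comparable[a.length];  // allocated once, reused by every merge
        sort(a, aux, 0, a.length - 1);
    }

    public static void main(String[] args) {
        String[] a = "M E R G E S O R T E X A M P L E".split(" ");
        sort(a);
        System.out.println(String.join(" ", a));
    }
}
```

Running with java -ea MergeSketch enables the pre- and postcondition checks; without -ea they are skipped. Because the merge takes from the left half on ties, equal keys keep their relative order.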
[Figure: indices lo, mid, hi marked on subarray positions 10 through 19.]

Mergesort trace

                      a[]
                       0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
                       M  E  R  G  E  S  O  R  T  E  X  A  M  P  L  E
merge(a, 0, 0, 1)      E  M  R  G  E  S  O  R  T  E  X  A  M  P  L  E
merge(a, 2, 2, 3)      E  M  G  R  E  S  O  R  T  E  X  A  M  P  L  E
merge(a, 0, 1, 3)      E  G  M  R  E  S  O  R  T  E  X  A  M  P  L  E
merge(a, 4, 4, 5)      E  G  M  R  E  S  O  R  T  E  X  A  M  P  L  E
merge(a, 6, 6, 7)      E  G  M  R  E  S  O  R  T  E  X  A  M  P  L  E
merge(a, 4, 5, 7)      E  G  M  R  E  O  R  S  T  E  X  A  M  P  L  E
merge(a, 0, 3, 7)      E  E  G  M  O  R  R  S  T  E  X  A  M  P  L  E
merge(a, 8, 8, 9)      E  E  G  M  O  R  R  S  E  T  X  A  M  P  L  E
merge(a, 10, 10, 11)   E  E  G  M  O  R  R  S  E  T  A  X  M  P  L  E
merge(a, 8, 9, 11)     E  E  G  M  O  R  R  S  A  E  T  X  M  P  L  E
merge(a, 12, 12, 13)   E  E  G  M  O  R  R  S  A  E  T  X  M  P  L  E
merge(a, 14, 14, 15)   E  E  G  M  O  R  R  S  A  E  T  X  M  P  E  L
merge(a, 12, 13, 15)   E  E  G  M  O  R  R  S  A  E  T  X  E  L  M  P
merge(a, 8, 11, 15)    E  E  G  M  O  R  R  S  A  E  E  L  M  P  T  X
merge(a, 0, 7, 15)     A  E  E  E  E  G  L  M  M  O  P  R  R  S  T  X

Trace of merge results for top-down mergesort (each row shows the result after the recursive call)

Mergesort animation

[Animation: 50 random elements. Legend: algorithm position, in order, current subarray, not in order.]
[Animation: 50 reverse-sorted elements. Same legend.]
http://www.sorting-algorithms.com/merge-sort

Mergesort: empirical analysis

Running time estimates:
• Home pc executes 10^8 comparisons/second.
• Supercomputer executes 10^12 comparisons/second.

            insertion sort (N^2)                mergesort (N log N)
computer    thousand   million     billion      thousand   million    billion
home        instant    2.8 hours   317 years    instant    1 second   18 min
super       instant    1 second    1 week       instant    instant    instant

Bottom line. Good algorithms are better than supercomputers.

Mergesort: mathematical analysis

Proposition. Mergesort uses ~ 2 N lg N data moves to sort any array of size N.

Def. D(N) = number of data moves to mergesort an array of size N
          = D(N / 2)  +  D(N / 2)  +  2 N
            left half    right half   merge

Mergesort recurrence. D(N) = 2 D(N / 2) + 2 N for N > 1, with D(1) = 0.
• Not quite right for odd N.
• Similar recurrence holds for many divide-and-conquer algorithms.

Solution. D(N) ~ 2 N lg N.
• For simplicity, we'll prove when N is a power of 2.
• True for all N. [see COS 340]

Mergesort recurrence: proof 1

Mergesort recurrence. D(N) = 2 D(N / 2) + 2 N for N > 1, with D(1) = 0.

Proposition. If N is a power of 2, then D(N) = 2 N lg N.

Pf. Recursion tree with lg N levels, each contributing 2N data moves:

D(N)                                        2N              = 2N
D(N/2)  D(N/2)                              2 (2N/2)        = 2N
D(N/4)  D(N/4)  D(N/4)  D(N/4)              4 (2N/4)        = 2N
...
D(N / 2^k)                                  2^k (2N/2^k)    = 2N
...
D(2) D(2) D(2) D(2) D(2) D(2) D(2) D(2)     N/2 (4)         = 2N
                                                      total = 2N lg N

Mergesort recurrence: proof 2

Mergesort recurrence. D(N) = 2 D(N / 2) + 2 N for N > 1, with D(1) = 0.

Proposition. If N is a power of 2, then D(N) = 2 N lg N.

Pf.
D(N)     = 2 D(N/2) + 2N                           given
D(N) / N = 2 D(N/2) / N + 2                        divide both sides by N
         = D(N/2) / (N/2) + 2                      algebra
         = D(N/4) / (N/4) + 2 + 2                  apply to first term
         = D(N/8) / (N/8) + 2 + 2 + 2              apply to first term again
         ...
         = D(N/N) / (N/N) + 2 + 2 + ... + 2        stop applying, D(1) = 0
         = 2 lg N

Mergesort recurrence: proof 3

Mergesort recurrence. D(N) = 2 D(N / 2) + 2 N for N > 1, with D(1) = 0.

Proposition. If N is a power of 2, then D(N) = 2 N lg N.

Pf. [by induction on N]
• Base case: N = 1.
• Inductive hypothesis: D(N) = 2N lg N.
• Goal: show that D(2N) = 2(2N) lg (2N).

D(2N) = 2 D(N) + 4N                  given
      = 4 N lg N + 4 N               inductive hypothesis
      = 4 N (lg (2N) - 1) + 4N       algebra
      = 4 N lg (2N)                  QED

Mergesort: number of compares Mergesort analysis: memory Proposition.