Mergesort and Lecture 3: Efficient Sorts Two great . Full scientific understanding of their properties has enabled us to

hammer them into practical system sorts.

Occupies a prominent place in world's computational infrastructure.

Quicksort honored as one of top 10 algorithms for science and engineering of 20th century. Mergesort

Quicksort Mergesort.

Analysis of Algorithms Java Arrays for type Object.

Java Collections sort.

Perl stable, Python stable.

Quicksort.

Java Arrays sort for primitive types.

C , Unix, g++, Visual C++, Perl, Python.

Princeton University • COS 226 • Algorithms and Data Structures • Spring 2004 • Kevin Wayne • http://www.Princeton.EDU/~cos226 2

Sorting Applications Estimating the Running Time

Applications. Total running time is sum of cost P frequency for all of the basic ops.

Sort a list of names. obvious applications Cost depends on machine, compiler. Organize an MP3 library. Frequency depends on , input. Display Google PageRank results.

Find the median. Cost for sorting.

Find the closest pair. A = # function calls. Binary search in a database. problems become easy once B = # exchanges. Identify statistical outliers. items are in sorted order

Find duplicates in a mailing list. C = # comparisons.

Cost on a typical machine = 35A + 11B + 4C. Data compression.

Computer graphics. Frequency of sorting ops. Computational biology. Supply chain management. non-obvious applications N = # elements to sort. Simulate a system of particles. : A = 1, B = N-1, C = N(N-1) / 2. Book recommendations on Amazon. Load balancing on a parallel computer. . . .

3 4 Estimating the Running Time Big Oh Notation

An easier alternative. Big Theta, Oh, and Omega notation. (i) Analyze asymptotic growth as a function of input size N. 2 2 2 2 1.5 (ii) For medium N, run and measure time. (N ) means { N , 17N , N + 17N + 3N, . . . } (iii) For large N, use (i) and (ii) to predict time. – ignore lower order terms and leading coefficients

2 2 2 2 1.5 1.5 Asymptotic growth rates. O(N ) means { N , 17N , N + 17N + 3N, N , 100N, . . . }  2 Estimate as a function of input size N. – (N ) and smaller – N, N log N, N2, N3, 2N, N! – use for upper bounds

Ignore lower order terms and leading coefficients. 3 2 3 2 2 2 2 1.5 3 5 – Ex. 6N + 17N + 56 is asymptotically proportional to N (N ) means { N , 17N , N + 17N + 3N, N , 100N , . . . } – (N2) and larger – use for lower bounds

Never say: makes at least O(N2) comparisons.

5 6

Estimating the Running Time Why It Matters

Insertion sort is quadratic. 2 Run time in N / 4 - N / 4 comparisons on average. 1.3 N3 10 N2 47 N log N 48 N nanoseconds --> 2 2 (N ). 1000 1.3 seconds 10 msec 0.4 msec 0.048 msec On arizona: 1 second for N = 10,000. Time to 10,000 22 minutes 1 second 6 msec 0.48 msec solve a 100,000 15 days 1.7 minutes 78 msec 4.8 msec How long for N = 100,000? 100 seconds (100 times as long) problem

N = 1 million? 2.78 hours (another factor of 100) of size million 41 years 2.8 hours 0.94 seconds 48 msec 6 10 million 41 millennia 1.7 weeks 11 seconds 0.48 seconds N = 1 billion? 317 years (another factor of 10 )

N = 1 trillion? second 920 10,000 1 million 21 million Max size problem minute 3,600 77,000 49 million 1.3 billion solved hour 14,000 600,000 2.4 trillion 76 trillion in one day 41,000 2.9 million 50 trillion 1,800 trillion

N multiplied by 10, 1,000 100 10+ 10 time multiplied by

Reference: More Programming Pearls by Jon Bentley

7 8 Orders of Magnitude Mergesort

Mergesort (divide-and-conquer) Meters Per Imperial Seconds Equivalent Example Second Units Divide array into two halves. 1 1 second 10-10 1.2 in / decade Continental drift Recursively sort each half. 10 10 seconds 10-8 1 ft / year Hair growing 102 1.7 minutes Merge two halves to make sorted whole. 10-6 3.4 in / day Glacier 103 17 minutes 10-4 1.2 ft / hour Gastro-intestinal tract 104 2.8 hours 10-2 2 ft / minute Ant 105 1.1 days 1 2.2 mi / hour Human walk 106 1.6 weeks 102 220 mi / hour Propeller airplane 107 3.8 months 104 370 mi / min Space shuttle 108 3.1 years A L G O R I T H M S 106 620 mi / sec Earth in galactic orbit 109 3.1 decades 108 62,000 mi / sec 1/3 speed of light 1010 3.1 centuries A L G O R I T H M S divide . . . forever 210 thousand Powers age of 220 million sort 1017 of 2 A G L O R H I M S T universe 230 billion A G H I L M O R S T merge Reference: More Programming Pearls by Jon Bentley

9 12

Mergesort Implementation in Java Mergesort Analysis

Stability? Yes, if underlying merge is stable. public static void mergesort(Comparable[] a, int low, int high) { Comparable temp[] = new Comparable[a.length]; for (int i = 0; i < a.length; i++) temp[i]=a[i]; How much memory does array implementation of mergesort require? mergesort(temp, a, low, high); } Original input = N. Auxiliary array for merging = N. private static void mergesort(Comparable[] from, Comparable[] to,

int low, int high) { Local variables: constant. if (high <= low) return; Function call stack: log2 N. int mid =(low + high)/2; Total = 2N + O(log N). mergesort(to, from, low, mid); mergesort(to, from, mid+1, high); How much memory do other sorting algorithms require? int p = low, q = mid+1; for(int i = low; i <= high; i++) { N + O(1) for insertion sort, selection sort, .

if (q > high) to[i]=from[p++]; In-place = N + O(log N). else if (p > mid) to[i]=from[q++]; else if (less(from[q], from[p])) to[i]=from[q++]; else to[i]=from[p++]; } }

13 14 Mergesort Analysis Proof by Picture of Recursion Tree N T How long does mergesort take? $ 0 N if  1 '  N Bottleneck = merging (and copying). T( ) % 2 ( /2)  N otherwise '   – merging two files of size N/2 requires ? N comparisons & sorting both halves merging

T(N) = comparisons to mergesort N elements. – assume N is a power of 2 T(N) N – assume merging requires exactly N comparisons N T $ 0 N if  1 T(N/2) T(N/2) 2(N/2) ' N T( )  % 2 ( /2)  N otherwise '   & sorting both halves merging T(N/4) T(N/4) T(N/4) T(N/4) 4(N/4) log2N . . . including already sorted T(N / 2k) 2k (N / 2k) Claim. T(N) = N log2 N. . . . Note: same number of comparisons for ANY file. T(2) T(2) We'll give several proofs to illustrate standard techniques. T(2) T(2) T(2) T(2) T(2) T(2) N/2 (2)

Nlog2N 15 16

Proof by Telescoping Mathematical Induction

Claim. If T(N) satisfies this recurrence, then T(N) = N log2 N. Mathematical induction. N Powerful and general proof technique in discrete mathematics. T $ 0 N if  1 assumes N is a power of 2 To prove a theorem true for all integers k O 0: '  N T( ) % 2( /2)  N otherwise – base case: prove it to be true for N = 0 ' merging & sorting both halves – induction hypothesis: assuming it is true for arbitrary N T N – induction step: show it is true for N + 1 N T N ( ) 2 ( /2) Proof. For N > 1: N  1 T N Claim: 0 + 1 + 2 + 3 + . . . + N = N(N+1) / 2 for all N O 0. ( /2) N  1 Proof: (by mathematical induction) /2 T N Base case (N = 0).  ( / 4)   N 1 1 – 0 = 0(0+1) / 2. / 4  T N Induction hypothesis: assume 0 + 1 + 2 + . . . + N = N(N+1) / 2  ( N/ )     N 1 1 Induction step: 0 + 1 + . . . + N + N + 1 = (0 + 1 + . . . + N) + N+1 /N   log2 N = N (N+1) /2 + N+1  log2 N = (N+2)(N+1) / 2

17 18 Proof by Induction Proof by Induction

Claim. If T(N) satisfies this recurrence, then T(N) = N log2 N. Q. What if N is not a power of 2? N Q. What if merging takes at most N comparisons instead of exactly N? T $ 0 N if  1 assumes N is a power of 2 N ' N T( )  % 2 ( /2)  N otherwise A. T(N) satisfies followingT recurrence. '   N & sorting both halves merging T $ 0 N if  1 ' N T( ) ? % !1/2  #3/2  N otherwise '    & solve left half solve right half merging Proof. (by induction on N)

Base case: N = 1.

Inductive hypothesis: T(N) = N log2 N. Claim. T(N) ? N !log N1. Goal: show that T(2N) = 2N log (2N). 2 T 2 N Proof. Challenge for the bored. T N N (2 )  N2 ( )  2  N  2N log2 N2  N   2N log2(2 ) 1 2  N 2 log2(2N )

19 20

Mergesort: Practical Improvements Sorting By Different Fields

Eliminate recursion. Bottom-up mergesort. Sedgewick Program 8.5 Design challenge: enable sorting students by email or section.

Stop if already sorted. // sort by email 1 Anand Dharan adharan Is biggest element in first half ? smallest element in second half? Student.setSortKey(Student.EMAIL); 1 Ashley Evans amevans 1 Alicia Myers amyers Helps for nearly ordered lists. ArraySort.mergesort(students, 0, N-1); 1 Arthur Shum ashum 1 Amy Trangsrud atrangsr // then by precept 1 Bryant Chen bryantc Insertion sort small files. Student.setSortKey(Student.SECTION); 1 Charles Alden calden Mergesort has too much overhead for tiny files. ArraySort.mergesort(students, 0, N-1); 1 Cole Deforest cde 1 David Astle dastle Cutoff to insertion sort for < 7 elements. 1 Elinor Keith ekeith 1 Kira Hohensee hohensee

Use sentinels. . . . 5 Tom Brennan tpbrenna Two of four statements in inner loop are bounds checking. 5 Timothy Ruse truse 5 Yiting Jin ycjin "Superoptimization requires mindbending recursive switchery." Mergesort is stable

21 22 Sorting By Different Fields Computational Complexity public class Student implements Comparable { Computational complexity. Framework to study efficiency of algorithms for solving a particular problem X. private String first, last, email; data members private int section; (one for each student) Machine model. Count fundamental operations.

public final static int FIRST = 0; public final static int LAST = 1; Upper bound. Cost guarantee provided by some algorithm for X. classwide variables public final static int EMAIL = 2; (shared by all students) Lower bound. Proven limit on cost guarantee of any algorithm for X. public final static int SECTION = 3; Optimal algorithm. Algorithm with best cost guarantee for X. private static int sortKey = SECTION;

public static void setSortKey(int k) { sortKey = k; } lower bound ~ upper bound Example: sorting. public int compareTo(Object x) { Student a = this; Machine model = # comparisons on random access machine. Student b =(Student) x; Upper bound = N log N from mergesort. if (sortKey == FIRST) return a.first.compareTo(b.first); 2 else if (sortKey == LAST) return a.last.compareTo(b.last); Lower bound = N log2 N - N log2 e applies to any comparison-based else if (sortKey == EMAIL) return a.email.compareTo(b.email); algorithm (see COS 226) Optimal algorithm = mergesort. else return a.section - b.section; } compare using chosen key ... }

23 24

Decision Tree Comparison Based Sorting Lower Bound

Theorem. Any comparison based must use a1 < a2 (N log N) comparisons. YES NO 2

Proof. Worst case dictated by tree height h.

N! different orderings. a2 < a3 a1 < a3 One (or more) leaves corresponding to each ordering. h YES NO YES NO with N! leaves must have height N

print print O N logN 2 ( ! e) a , a , a a , a , a 1 2 3 a1 < a3 2 1 3 a2 < a3 O log ( N/ )N Stirling's formula 2 N   log 2 log 2 e YES NO YES NO

print print print print a1, a3, a2 a3, a1, a2 a2, a3, a1 a3, a2, a1 What if we don't use comparisons? Stay tuned for .

25 26 Sorting Analysis Summary Quicksort

Running time estimates: Quicksort. 8 Home pc executes 10 comparisons/second. Partition array so that: 12 Supercomputer executes 10 comparisons/second. – some pivot element a[m] is in its final position – no larger element to the left of m – no smaller element to the right of m Insertion Sort (N2) Mergesort (N log N) computer thousand million billion thousand million billion

home instant 2.8 hours 317 years instant 1 sec 18 min Sir Charles Antony super instant 1 second 1.6 weeks instant instant instant Richard Hoare, 1960

Q U I C K S O R T I S C O O L Lesson 1: good algorithms are better than supercomputers.

27 28

Quicksort Quicksort

Quicksort. Quicksort.

Partition array so that: Partition array so that: – some pivot element a[m] is in its final position – some pivot element a[m] is in its final position – no larger element to the left of m – no larger element to the left of m – no smaller element to the right of m – no smaller element to the right of m

Sort each "half" recursively.

partitioning element partitioning element

Q U I C K S O R T I S C O O L Q U I C K S O R T I S C O O L

I C K I C L Q U S O R T S O O CI C KI I CK L QO OU OS QO R ST S OT OU

? L O L partitioned array sort each piece

29 30 Quicksort: Java Implementation Quicksort : Implementing Partition

Quicksort. How do we partition in-place efficiently?

Partition array so that: – some pivot element a[m] is in its final position – no larger element to the left of m static int partition(Comparable[] a, int L, int R) { int i = L - 1; – no smaller element to the right of m int j = R; Sort each "half" recursively. while(true) { while (less(a[++i], a[R])) find item on left to swap public static void quicksort(Comparable[] a, int L, int R) { ; if (R <= L) return; while (less(a[R], a[--j])) find item on right to swap int m = partition(a, L, R); if (j == L) break; quicksort(a, L, m-1); quicksort(a, m+1, R); if (i >= j) break; check if pointers cross } exch(a, i, j); swap } exch(a, i, R); swap with partitioning element return i; return index where crossing occurs }

31 32

Quicksort Example Quicksort: Worst Case

Number of comparisons in worst case is quadratic.

N + (N-1) + (N-2) + . . . + 1 = N(N+1)/2

Worst-case inputs.

Already sorted!

Reverse sorted.

What about all equal keys or only two distinct keys?

Many textbook implementations go quadratic.

Sedgewick partitioning algorithm stops on equal keys.

Stay tuned for 3-way quicksort.

Partitioning Quicksort

33 34 Quicksort: Average Case Quicksort: Average Case

Average case running time. Theorem. The average number of comparisons CN to quicksort a random file of N elements is about 2N ln N. Roughly 2 N ln N comparisons. proof on next slide

Assumption: file is randomly shuffled. C N The precise recurrence satisfies C = C = 0 and for N O 2: Equivalent assumption: pivot on random element. N 0 1 N N    1 C N 1 k  k  Remarks. 1 C N k N N 39% more comparisons than mergesort. N C    2 N 1  C k 1 N k 1 Faster than mergesort in practice because of lower cost of other C N high-frequency instructions. N N 2 Worst case still proportional to N but more likely that you are Multiply both sides by N and subtract the same formula for N-1: N struck by lightning and meteor at same time. N        2 ( 1) 1 ( 1) ( 1) 2C N 1 Caveat: many textbook implementations have best case N if duplicates, even if randomized! NC Simplify to: N N C    ( 1) N 1 2N

35 36

Quicksort: Average Case Sorting Analysis Summary

N C Divide both sidesN by N(N+1) to get a telescoping sum: Running time estimates: N C 8 Home pc executes 10 comparisons/second.  1  2 NN 12  1  1 Supercomputer executes 10 comparisons/second. C N 2 2 2 NN    1  1 C 2  N 2 2 2 Insertion Sort (N ) Mergesort (N log N)  3  N   N  2  1  1 computer thousand million billion thousand million billion N   home instant 2.8 hours 317 years instant 1 sec 18 min N super instant 1 second 1.6 weeks instant instant instant N N  C 2   2 k 1 3  C k 3 Quicksort (N log N) N thousand million billion N Approximate the exact answerN by an integral: instant 0.3 sec 6 min C N k k instant instant instant W  2 NW * 2  2 ln N  k 1 1 k 1 N N Lesson 1: good algorithms are better than supercomputers. Finally, what we want: W 2(  1) ln W 1.39 log N . N 2 Lesson 2: great algorithms are better than good ones.

37 38 Quicksort: Practical Improvements Engineering a System Sort

Median of sample. .

Best choice of pivot element = median. Basic algorithm = quicksort.

But how would you compute the median? Sort a relatively large random sample from the array.

Estimate true median by taking median of sample. Use sorted elements as pivots.

Pivots are (probabilistically) good estimates of true medians. Insertion sort small files.

Even quicksort has too much overhead for tiny files. Bentley-McIlroy.

Can delay insertion sort until end. Original motivation: improve qsort function in C.

Basic algorithm = quicksort.

Optimize parameters. Partition on Tukey's ninther: Approximate median-of-9.

Median of 3 elements. – used median-of-3 elements, each of which is median-of-3

Cutoff to insertion sort for < 10 elements. – idea borrowed from statistics, useful in many disciplines

3-way quicksort to deal with equal keys. Non-recursive version. stay tuned

Use explicit stack. guarantees O(log N) stack size Reference: Engineering a Sort Function by Jon L. Bentley and M. Douglas McIlroy. Always sort smaller half first.

39 40

System Sorts Breaking Java's System Sort

Java's Arrays.sort library function for arrays. Is it possible to make system sort go quadratic?

Uses Bentley-McIlroy quicksort implementation for objects. No, for mergesort. so, why are most system sorts deterministic? Uses mergesort for primitive types. Yes, for deterministic quicksort.

starting index is inclusive, Arrays.sort(students, 0, N); ending index is exclusive McIlroy's devious idea. http://java.sun.com/j2se/1.4.2/docs/api/ Construct malicious input WHILE running system quicksort in response to elements compared.

To access library, need following line at beginning of program. If p is partition element, commit to x < p, y < p, but don't commit to any order on x, y until x and y are compared.

import java.util.Arrays; Consequences.

Confirms theoretical possibility.

Algorithmic complexity attack: you enter linear amount of data; server performs quadratic amount of work. Why the difference for objects and primitive types? more disastrous Blows function call stack and crashes program. possibilities in C

Reference: McIlory. A Killer Adversary for Quicksort.

41 42 Lots of Sorting Algorithms

Internal sorts.

Insertion sort, selection sort, bubblesort, , shaker sort.

Quicksort, mergesort, .

Samplesort, .

Solitaire sort, red-black sort, splaysort, psort, . . . .

External sorts. Poly-phase mergesort, cascade-merge, oscillating sort.

Radix sorts.

Distribution, MSD, LSD.

3-way radix quicksort.

Parallel sorts.

Bitonic sort, Batcher even-odd sort.

Smooth sort, cube sort, column sort.

43