Sorting
CS211, Fall 2000

Insertion Sort

■ Corresponds to how most people sort cards
■ Invariant: everything to left is already sorted
■ Works especially well when input is nearly sorted
■ Runtime
  ● Worst-case
    ▲ O(n^2)
    ▲ Consider reverse-sorted input
  ● Best-case
    ▲ O(n)
    ▲ Consider sorted input

    // Code for a[ ] an array of int
    for (int i = 1; i < a.length; i++) {
        int temp = a[i];
        int k = i;
        for (; k > 0 && a[k-1] > temp; k--)
            a[k] = a[k-1];
        a[k] = temp;
    }

Merge Sort

■ Uses recursion (Divide & Conquer)
■ Outline (text has detailed code)
  ● Split array into two halves
  ● Recursively sort each half
  ● Merge the two halves
■ Merge = combine two sorted arrays to make a single sorted array
  ● Rule: Always choose the smallest item
  ● Time: O(n)
■ Runtime recurrence
  ● Let T(n) be the time to sort an array of size n
  ● T(n) = 2T(n/2) + O(n)
  ● T(1) = O(1)
■ Can show by induction that T(n) = O(n log n)
■ Alternately, can show T(n) = O(n log n) by looking at tree of recursive calls

Quick Sort

■ Also uses recursion (Divide & Conquer)
■ Outline
  ● Partition the array
  ● Recursively sort each piece of the partition
■ Partition = divide the array like this:
    [ < p | p | > p ]
■ p is the pivot item
■ Best pivot choices
  ● middle item
  ● random item
  ● median of leftmost, rightmost, and middle items
■ Runtime analysis (worst-case)
  ● Partition can work badly, producing this:
    [ p | > p ]
  ● Runtime recurrence: T(n) = T(n-1) + O(n)
  ● This can be solved by induction to show T(n) = O(n^2)
■ Runtime analysis (expected-case)
  ● More complex recurrence
  ● Can solve by induction to show expected T(n) = O(n log n)
■ Can improve constant factor by avoiding Quick Sort on small sets
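The Merge Sort and Quick Sort outlines above can be sketched in Java roughly as follows. This is a minimal illustration, not the textbook's code: the class name DivideAndConquerSorts and the helper names are made up here, the pivot is chosen at random (one of the pivot choices listed), and the partition is a three-way split into < p, = p, > p regions.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical class name; a sketch of the two outlines, not the text's code.
final class DivideAndConquerSorts {
    private static final Random RNG = new Random();

    // Merge Sort: split into two halves, sort each half, merge the halves.
    static void mergeSort(int[] a) {
        if (a.length < 2) return;                        // T(1) = O(1)
        int[] left = Arrays.copyOfRange(a, 0, a.length / 2);
        int[] right = Arrays.copyOfRange(a, a.length / 2, a.length);
        mergeSort(left);
        mergeSort(right);
        // Merge rule: always choose the smallest remaining item. O(n) work.
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length)
            a[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        while (i < left.length) a[k++] = left[i++];
        while (j < right.length) a[k++] = right[j++];
    }

    // Quick Sort: partition around a pivot p, then sort each piece.
    static void quickSort(int[] a) {
        quickSort(a, 0, a.length - 1);
    }

    private static void quickSort(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int p = a[lo + RNG.nextInt(hi - lo + 1)];        // random pivot item
        // Invariant: a[lo..lt-1] < p, a[lt..i-1] == p, a[gt+1..hi] > p.
        int lt = lo, i = lo, gt = hi;
        while (i <= gt) {
            if (a[i] < p) swap(a, lt++, i++);
            else if (a[i] > p) swap(a, i, gt--);
            else i++;
        }
        quickSort(a, lo, lt - 1);                        // the "< p" piece
        quickSort(a, gt + 1, hi);                        // the "> p" piece
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }
}
```

Taking the left item on ties (left[i] <= right[j]) is what keeps this merge stable. The extra arrays in mergeSort and the recursion in quickSort are the extra space these two sorts need.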

Heap Sort

■ Not recursive
■ Outline
  ● Build heap
  ● Perform removeMax on heap until empty
  ● Note that items are removed from heap in sorted order
■ Heap Sort is the only O(n log n) sort that uses no extra space
  ● Merge Sort uses extra array during merge
  ● Quick Sort uses recursive stack
■ Runtime analysis (worst-case)
  ● O(n) time to build heap (using bottom-up approach)
  ● O(log n) time (worst-case) for each removal
  ● Total time: O(n log n)

Sorting Summary

■ The ones we have discussed
  ● Insertion Sort
  ● Selection Sort
  ● Merge Sort
  ● Quick Sort
  ● Heap Sort
■ Other sorting methods
  ● Shell Sort (in text)
  ● Bin Sort
■ Why so many? Do Computer Scientists have some kind of sorting fetish or what?
  ● Stable sorts: Ins, Mer
  ● Worst-case O(n log n): Mer, Hea
  ● Expected-case O(n log n): Mer, Hea, Qui
  ● Best for nearly-sorted sets: Ins
  ● No extra space needed: Ins, Hea
  ● Fastest in practice: Qui
  ● Least data movement: Sel
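The Heap Sort outline (build the heap bottom-up, then removeMax until empty) can be sketched in place as below. This is one possible rendering under stated assumptions: the class name HeapSortSketch and the helper siftDown are illustrative names, and the array itself holds the max-heap, with children of index i at 2i+1 and 2i+2.

```java
// Hypothetical class name; an in-place sketch of the Heap Sort outline.
final class HeapSortSketch {
    static void heapSort(int[] a) {
        int n = a.length;
        // Build heap bottom-up: sift down every non-leaf node. O(n) total.
        for (int i = n / 2 - 1; i >= 0; i--) siftDown(a, i, n);
        // removeMax until empty: move the max to the end of the heap region,
        // shrink the region by one, and restore the heap. O(log n) each.
        for (int end = n - 1; end > 0; end--) {
            int t = a[0]; a[0] = a[end]; a[end] = t;
            siftDown(a, 0, end);
        }
    }

    // Push a[i] down until neither child within a[0..size-1] is larger.
    private static void siftDown(int[] a, int i, int size) {
        while (true) {
            int child = 2 * i + 1;
            if (child >= size) return;                   // a[i] is a leaf
            if (child + 1 < size && a[child + 1] > a[child]) child++;
            if (a[i] >= a[child]) return;                // heap order holds
            int t = a[i]; a[i] = a[child]; a[child] = t;
            i = child;
        }
    }
}
```

Because each removed maximum is stored in the slot the shrinking heap just vacated, no second array is needed, which is the "no extra space" property noted for Heap Sort.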


Lower Bounds on Sorting: Goals

■ Goal: Determine the minimum time required to sort n items
■ Note: we want worst-case not best-case time
  ● Best-case doesn't tell us much; for example, we know Insertion Sort takes O(n) time on already-sorted input
  ● We want to determine the worst-case time for the best-possible algorithm
■ But how can we prove anything about the best possible algorithm?
  ● We want to find characteristics that are common to all sorting algorithms
  ● Let's try looking at comparisons

Comparison Trees

■ Any algorithm can be "unrolled" to show the comparisons that are (potentially) performed

Example

    for (int i = 0; i < x.length; i++)
        if (x[i] < 0) x[i] = -x[i];

[Figure: comparison tree for the example, with nodes 0 < length, x[0] < 0, 1 < length, x[1] < 0, 2 < length, x[2] < 0, ...]

■ In general, you get a comparison tree
■ If the algorithm fails to terminate for some input then the comparison tree is infinite
■ The height of the comparison tree represents the worst-case number of comparisons for that algorithm
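To make "height = worst-case number of comparisons" concrete, the sketch below counts the item comparisons Insertion Sort performs and takes the maximum over every ordering of n distinct items. The class and method names are illustrative, and only comparisons between items are counted (not loop tests such as k > 0). Brute force over all n! orderings is only practical for small n.

```java
// Hypothetical class name; counts item comparisons made by Insertion Sort.
final class ComparisonCount {
    // Sorts a in place and returns how many item comparisons were made.
    static int insertionSortComparisons(int[] a) {
        int count = 0;
        for (int i = 1; i < a.length; i++) {
            int temp = a[i];
            int k = i;
            while (k > 0) {
                count++;                          // one comparison: a[k-1] vs temp
                if (a[k - 1] <= temp) break;
                a[k] = a[k - 1];
                k--;
            }
            a[k] = temp;
        }
        return count;
    }

    // Maximum comparison count over all orderings of n distinct items,
    // i.e. the height of Insertion Sort's comparison tree for n items.
    static int worstCase(int n) {
        int[] perm = new int[n];
        for (int i = 0; i < n; i++) perm[i] = i;
        return worstCase(perm, 0);
    }

    // Generates every permutation by swapping each candidate into slot pos.
    private static int worstCase(int[] perm, int pos) {
        if (pos == perm.length) return insertionSortComparisons(perm.clone());
        int worst = 0;
        for (int i = pos; i < perm.length; i++) {
            swap(perm, pos, i);
            worst = Math.max(worst, worstCase(perm, pos + 1));
            swap(perm, pos, i);                   // undo before the next choice
        }
        return worst;
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }
}
```

For Insertion Sort the maximum works out to n(n-1)/2, reached on reverse-sorted input, while already-sorted input needs only n-1 comparisons.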

Lower Bounds on Sorting: Notation

■ Suppose we want to sort the items in the array B[ ]
■ Let's name the items
  ● a_1 is the item initially residing in B[1], a_2 is the item initially residing in B[2], etc.
  ● In general, a_i is the item initially stored in B[i]
■ Rule: an item keeps its name forever, but it can change its location
  ● Example: after swap(B,1,5), a_1 is stored in B[5] and a_5 is stored in B[1]

The Answer to a Sorting Problem

■ An answer for a sorting problem tells where each of the a_i resides when the algorithm finishes
■ How many answers are possible?
■ The correct answer depends on the actual values represented by each a_i
■ Since we don't know what the a_i are going to be, it has to be possible to produce each permutation of the a_i
■ For a sorting algorithm to be valid it must be possible for that algorithm to give any of n! potential answers

Comparison Tree for Sorting

■ Every sorting algorithm has a corresponding comparison tree
  ● Note that other stuff happens during the sorting algorithm, we just aren't showing it in the tree
■ The comparison tree must have n! (or more) leaves because a valid sorting algorithm must be able to get any of n! possible answers
■ Comparison tree for sorting n items:

[Figure: comparison tree whose leaves are the orderings abc..., bacd..., cabd..., ...; n! leaves in all]

Time vs. Height

■ The worst-case time for a sorting method must be ≥ the height of its comparison tree
  ● The height corresponds to the worst-case number of comparisons
  ● Each comparison takes Θ(1) time
  ● The algorithm is doing more than just comparisons
■ What is the minimum possible height for a binary tree with n! leaves?
  Height ≥ log(n!) = Θ(n log n)
■ This implies that any comparison-based sorting algorithm must have a worst-case time of Ω(n log n)
  ● Note: this is a lower bound; thus, the use of big-Omega instead of big-O
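The height bound can be computed exactly for small n. A binary tree of height h has at most 2^h leaves, so a tree with n! leaves needs the smallest h with 2^h ≥ n!, that is, ceil(log2(n!)). The sketch below computes that value; the class and method names are illustrative, and BigInteger is used only because n! overflows long quickly.

```java
import java.math.BigInteger;

// Hypothetical class name; computes the comparison lower bound ceil(log2(n!)).
final class SortingLowerBound {
    // Smallest h such that a binary tree of height h can have n! leaves.
    static int minHeight(int n) {
        BigInteger factorial = BigInteger.ONE;
        for (int i = 2; i <= n; i++)
            factorial = factorial.multiply(BigInteger.valueOf(i));
        // For any integer m >= 1, ceil(log2(m)) equals bitLength(m - 1).
        return factorial.subtract(BigInteger.ONE).bitLength();
    }
}
```

For example, minHeight(3) is 3 and minHeight(10) is 22, so no comparison-based sort can always sort 10 items with fewer than 22 comparisons.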

Using the Lower Bound on Sorting

Claim: I have a PQ
  ● Insert time: O(1)
  ● GetMax time: O(1)
■ True or false?
  False (for general sets) because if such a PQ existed, it could be used to sort in time O(n)

Claim: I have a PQ
  ● Insert time: O(loglog n)
  ● GetMax time: O(loglog n)
■ True or false?
  False (for general sets) because it could be used to sort in time O(n loglog n)
  True for items with priorities in range 1..n [van Emde Boas]
  (Note: such a set can be sorted in O(n) time)

Sorting in Linear Time

There are several sorting methods that take linear time
■ Counting Sort
  ● sorts integers from a small range: [0..k] where k = O(n)
■ Radix Sort
  ● the method used by the old card-sorters
  ● sorting time O(dn) where d is the number of "digits"
■ How do these methods get around the Ω(n log n) lower bound?
  ● They don't use comparisons
■ What sorting method works best?
  ● Quick Sort is best general-purpose sort
  ● Counting Sort or Radix Sort can be best for some kinds of data
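A minimal Counting Sort sketch shows how a sort can avoid comparisons entirely: items are used as array indices instead of being compared with one another. The class name and signature are assumptions made for illustration, and this simple form handles bare integers only. Sorting records by key (as Radix Sort needs for each digit) would call for the stable variant based on prefix sums.

```java
// Hypothetical class name; Counting Sort for integers in the range [0..k].
final class CountingSortSketch {
    // Returns a sorted copy of a. Every value must lie in [0..k].
    // Time O(n + k), which is O(n) when k = O(n). No item comparisons.
    static int[] countingSort(int[] a, int k) {
        int[] count = new int[k + 1];
        for (int v : a) count[v]++;               // tally each value
        int[] out = new int[a.length];
        int pos = 0;
        for (int v = 0; v <= k; v++)              // emit values in increasing order
            for (int c = 0; c < count[v]; c++)
                out[pos++] = v;
        return out;
    }
}
```

Since the algorithm never asks whether one item is less than another, it has no comparison tree, and the Ω(n log n) argument does not apply to it.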
