Algorithms ROBERT SEDGEWICK | KEVIN WAYNE

COLLECTIONS, IMPLEMENTATIONS, PRIORITY QUEUES

‣ collections
‣ priority queues, sets, symbol tables
‣ heaps and priority queues
‣ heapsort
‣ event-driven simulation (optional)

Algorithms, FOURTH EDITION
http://algs4.cs.princeton.edu

Collections

“The difference between a bad programmer and a good one is whether [the programmer] considers code or data structures more important. Bad programmers worry about the code. Good programmers worry about data structures and their relationships.”
Linus Torvalds (creator of Linux)

Things we might like to represent
・Sequences of items.
・Sets of items.
・Mappings between items, e.g. jhug’s grade is 88.4

Terminology

Abstract data type (ADT)
・A set of abstract values, and a collection of operations on those values.
・Operations:
 – Queue: enqueue, dequeue
 – Stack: push, pop
 – Union-Find: union, find, connected

Example: queue of integers
・A sequence of integers
 – Mathematical sequence, not any particular data structure!
・create: returns empty sequence.
・enqueue x: puts x on the right side of the sequence.
・dequeue: removes and returns the element on the left-hand side of the sequence.

For more: COS326

Terminology

Abstract Data Type (ADT)
・A set of abstract values, and a collection of operations on those values.

Collection
・An abstract data type that contains a collection of data items.

Data Structure
・A specific way to store and organize data.
・Can be used to implement ADTs.
・Examples: Array, , binary .

Implementation
・Data structures are used to implement ADTs.
・Choice of data structure may involve performance tradeoffs.
 – Worst case vs. average case performance.
 – Space vs. time.
・Restricting ADT capabilities may allow better performance. [stay tuned]

Examples
・Queue
 – Linked list
 – Resizing array
・Randomized Queue
 – Linked list (slow, but you can do it!)
 – Resizing array
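To make the ADT / data structure split concrete, here is a minimal sketch: the interface states only the queue operations, and a linked list is one data structure that implements them. The names IntQueue and LinkedIntQueue are made up for this illustration and are not algs4 classes.

```java
import java.util.NoSuchElementException;

// The ADT: only the operations, no commitment to a representation.
interface IntQueue {
    void enqueue(int x);   // put x on the right side of the sequence
    int dequeue();         // remove and return the leftmost element
    boolean isEmpty();
}

// One data structure that implements the ADT: a singly linked list.
class LinkedIntQueue implements IntQueue {
    private static class Node {
        int item;
        Node next;
        Node(int item) { this.item = item; }
    }

    private Node first, last;   // leftmost and rightmost nodes

    public boolean isEmpty() { return first == null; }

    public void enqueue(int x) {
        Node node = new Node(x);
        if (isEmpty()) first = node;
        else           last.next = node;
        last = node;
    }

    public int dequeue() {
        if (isEmpty()) throw new NoSuchElementException("queue underflow");
        int item = first.item;
        first = first.next;
        if (first == null) last = null;   // queue is now empty
        return item;
    }
}
```

A resizing-array class could implement the same interface without any change to client code, which is the point of separating the two ideas.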

List (Java terminology)

(some) List operations

operation   parameters    returns   effect
contains    Item          boolean   checks if an item is in the list
add         Item                    appends Item at end
add         index, Item             adds Item at position index
set         index, Item             replaces item at position index with Item
remove      index         Item      removes and returns item at index
remove      Item          boolean   removes Item if present in list
get         index         Item      returns item at index

Java implementations
・ArrayList
・LinkedList

Caveat
・Java list does not match standard ADT terminology.
・Abstract lists don’t support random access.

Task

NSA Monitoring
・You receive 1,000,000,000 unencrypted documents every day.
・You’d like to save the 1,000 documents with the highest score for manual review.

In real code, pick a list implementation, e.g. LinkedList

    public void process(List<Document> top1000, Document newDoc) {
        Document lowest = top1000.get(0);
        for (Document d : top1000)
            if (d.score() < lowest.score())
                lowest = d;

        if (newDoc.score() > lowest.score()) {
            top1000.remove(lowest);
            top1000.add(newDoc);
        }
    }

List based solution

Priority queue

Priority queue. Remove the largest (or smallest) item.
・MaxPQ: Supports largest operations.
・MinPQ: Supports smallest operations.

operation    argument   return value   size   contents (unordered)   contents (ordered)
insert       P                         1      P                      P
insert       Q                         2      P Q                    P Q
insert       E                         3      P Q E                  E P Q
remove max              Q              2      P E                    E P
insert       X                         3      P E X                  E P X
insert       A                         4      P E X A                A E P X
insert       M                         5      P E X A M              A E M P X
remove max              X              4      P E M A                A E M P
insert       P                         5      P E M A P              A E M P P
insert       L                         6      P E M A P L            A E L M P P
insert       E                         7      P E M A P L E          A E E L M P P
remove max              P              6      E M A P L E            A E E L M P

A sequence of operations on a priority queue

Min PQ operations.

operation   parameters   returns   effect
insert      Item                   adds an item
min                      Item      returns minimum Item
delMin                   Item      deletes and returns minimum Item

Priority queue

operation   parameters   returns   effect
insert      Item                   adds an item
min                      Item      returns minimum Item
delMin                   Item      deletes and returns minimum Item

Implementation
・We’ll get to that later.

Actual algs4 class

    public void process(MinPQ<Document> top1000, Document newDoc) {
        top1000.insert(newDoc);
        top1000.delMin();
    }

PQ based solution

Advantages
・Much simpler code.
・ADT is problem specific. May be faster.

“Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%”
Donald Knuth, Structured Programming with Go To Statements

Sets

operation   parameters   returns   effect
add         Item                   adds an item, only one copy may exist
contains    Item         boolean   returns true if item is present
delete      Item                   deletes item

Application

Genre identification
・Collect set of all words from a song’s lyrics.
・Compare against large dataset using machine learning techniques.
 – Guess genre.

In real code, pick a set implementation, e.g. TreeSet

    public Set<String> wordsInFile(In file) {
        Set<String> s = new Set<String>();

        while (!file.isEmpty())
            s.add(file.readString());

        return s;
    }

Finding all words in a file
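As a runnable counterpart to the word-collecting method, here is a small sketch that picks java.util.TreeSet as the set implementation. It reads from a String with java.util.Scanner instead of the slide's In class; that substitution and the names WordSet and wordsIn are assumptions made so the example runs without algs4.

```java
import java.util.Scanner;
import java.util.Set;
import java.util.TreeSet;

class WordSet {
    // Collects the distinct whitespace-separated words of text.
    // Scanner stands in for the slide's In class (an assumption).
    static Set<String> wordsIn(String text) {
        Set<String> words = new TreeSet<>();   // TreeSet keeps words in sorted order
        Scanner in = new Scanner(text);
        while (in.hasNext())
            words.add(in.next());              // adding a duplicate has no effect
        return words;
    }

    public static void main(String[] args) {
        System.out.println(wordsIn("la la land of la"));   // prints [la, land, of]
    }
}
```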

Collinear

Collinear revisited (on board / projector)
・Collections make things easier.
・Likely to be slower and use more memory.

Symbol tables

operation   parameters   returns   effect
put         Key, Value             associates Value with Key
contains    Key          boolean   returns true if Key is present
get         Key          Value     returns Value associated with Key (if any)
delete      Key          Value     deletes Key and returns Value

In real code, pick a symbol table implementation, e.g. TreeMap

    public void countChars(SymT<Character, Integer> charCount, String s) {
        for (Character c : s.toCharArray())
            if (charCount.contains(c))
                charCount.put(c, charCount.get(c) + 1);
            else
                charCount.put(c, 1);
    }

Adding letter counts to array of strings
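The same counting idea can be run directly against java.util.TreeMap, the implementation the slide suggests. This is a sketch: CharCounter is a made-up name, and Map.containsKey takes the place of the slide's contains.

```java
import java.util.Map;
import java.util.TreeMap;

class CharCounter {
    // Adds the character counts of s to charCount, mirroring the slide's countChars.
    static void countChars(Map<Character, Integer> charCount, String s) {
        for (char c : s.toCharArray()) {
            if (charCount.containsKey(c)) charCount.put(c, charCount.get(c) + 1);
            else                          charCount.put(c, 1);
        }
    }

    public static void main(String[] args) {
        Map<Character, Integer> counts = new TreeMap<>();
        countChars(counts, "heap");
        countChars(counts, "help");
        System.out.println(counts);   // prints {a=1, e=2, h=2, l=1, p=2}
    }
}
```

Calling the method once per string accumulates counts across an array of strings, because the same table is passed in each time.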

Other names
・map, dictionary

Design Problem

Solo in Groups
・Erweiterten Netzwerk is a new German minimalist social networking site that provides only two operations for its logged-in users.
 – First operation: Enter another user’s username and click the Neu button. This marks the two users as friends.
 – Second operation: Type in another user’s username and determine whether the two users are in the same extended network (i.e. there exists some chain of friends between the two users).

pollEv.com/jhug text to 37607
Identify at least one ADT that Erweiterten Netzwerk should use:
A. Queue [879345]
B. Union-find [879346]
C. Stack [879347]
D. Priority Queue [879348]
E. Symbol Table [879349]
F. Randomized Queue [879350]
Note: There may be more than one ‘good’ answer.

Implementations

Collection      Java Implementations                          algs4 Implementations
List            LinkedList, ArrayList, Stack (oops)           None
MinPQ, MaxPQ    PriorityQueue                                 MinPQ, MaxPQ
Set             TreeSet, HashSet, CopyOnWriteArraySet, ...    SET (note: ordered)
Symbol Table    TreeMap, HashMap, ConcurrentHashMap, ...      RedBlackBST, SeparateChainingHashST, LinearProbingHashST, ...

Implementations
・Use algs4 classes when possible in COS226.
・When performance matters, pick the right implementation!
・Next two weeks: Implementation details of these collections.
・More collections to come.

Priority queue API

Requirement. Generic items are Comparable.

    public class MaxPQ<Key extends Comparable<Key>>     Key must be Comparable (bounded type parameter)

                MaxPQ()              create an empty priority queue
                MaxPQ(Key[] a)       create a priority queue with given keys
           void insert(Key v)        insert a key into the priority queue
            Key delMax()             return and remove the largest key
        boolean isEmpty()            is the priority queue empty?
            Key max()                return the largest key
            int size()               number of entries in the priority queue

Priority queue applications

・Event-driven simulation. [customers in a line, colliding particles] (See optional slides / Coursera lecture.)
・Numerical computation. [reducing roundoff error]
・Data compression. [Huffman codes]
・Graph searching. [Dijkstra's algorithm, Prim's algorithm]
・Number theory. [sum of powers]
・Artificial intelligence. [A* search] (Assignment 4.)
・Statistics. [maintain largest M values in a sequence]
・Operating systems. [load balancing, interrupt handling]
・Discrete optimization. [bin packing]
・Spam filtering. [Bayesian spam filter]

Generalizes: stack, queue, randomized queue.

Priority queue client example

Challenge. Find the largest M items in a stream of N items.

order of growth of finding the largest M in a stream of N items

implementation    time       space
sort              N log N    N
elementary PQ     M N        M
binary heap       N log M    M
best in theory    N          M
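One way to reach the N log M time and M space of the binary heap row is to keep a minimum-oriented priority queue of at most M items and discard the minimum whenever it grows past M. The sketch below uses java.util.PriorityQueue rather than the algs4 MinPQ; the class name TopM and the int stream are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

class TopM {
    // Returns the m largest values of the stream, largest first.
    // The PQ never holds more than m + 1 items, so space is proportional to m.
    static List<Integer> largest(int[] stream, int m) {
        PriorityQueue<Integer> pq = new PriorityQueue<>();   // min-oriented by default
        for (int x : stream) {
            pq.add(x);
            if (pq.size() > m) pq.poll();                    // discard the current minimum
        }
        List<Integer> result = new ArrayList<>(pq);
        result.sort(Collections.reverseOrder());
        return result;
    }

    public static void main(String[] args) {
        System.out.println(largest(new int[] {5, 1, 9, 3, 7, 8}, 3));   // prints [9, 8, 7]
    }
}
```

A minimum-oriented queue is the natural choice here: the item to evict is always the smallest of the current top M.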

Priority queue: unordered and ordered array implementation

A sequence of operations on a priority queue (return value, size, and contents in the unordered and ordered array representations)

Priority queue: unordered array implementation

    public class UnorderedArrayMaxPQ<Key extends Comparable<Key>>
    {
        private Key[] pq;   // pq[i] = ith element on pq
        private int N;      // number of elements on pq

        public UnorderedArrayMaxPQ(int capacity)
        {  pq = (Key[]) new Comparable[capacity];  }   // no generic array creation

        public boolean isEmpty()
        {  return N == 0;  }

        public void insert(Key x)
        {  pq[N++] = x;  }

        public Key delMax()
        {
            int max = 0;
            for (int i = 1; i < N; i++)
                if (less(max, i)) max = i;
            exch(max, N-1);
            return pq[--N];   // should null out entry to prevent loitering
        }
    }

less() and exch() similar to sorting methods (but don't pass pq[])

Priority queue elementary implementations

Challenge. Implement all operations efficiently.

Basic idea on board

order of growth of running time for priority queue with N items

implementation     insert    del max   max
unordered array    1         N         N
ordered array      N         1         1
goal               log N     log N     log N

Binary heap demo

Insert. Add node at end, then swim it up.
Remove the maximum. Exchange root with node at end, then sink it down.

heap ordered

             T
        P         R
      N   H     O   A
     E I G

    T P R N H O A E I G

Complete binary tree

Binary tree. Empty or node with links to left and right binary trees.
Complete tree. Perfectly balanced, except for bottom level.

[Figure: A complete tree with N = 16 nodes (height = 4)]

Property. Height of complete tree with N nodes is ⌊lg N⌋.
Pf. Height only increases when N is a power of 2.

A complete binary tree in nature

Binary heap representations

Binary heap. Array representation of a heap-ordered complete binary tree.

Heap-ordered binary tree.
・Keys in nodes.
・Parent's key no smaller than children's keys.

Array representation.
・Indices start at 1.
・Take nodes in level order.
・No explicit links needed!

    i      0  1  2  3  4  5  6  7  8  9 10 11
    a[i]   -  T  S  R  P  N  O  A  E  I  H  G

                      1 T
              2 S             3 R
          4 P     5 N     6 O     7 A
        8 E 9 I 10 H 11 G

Heap representations
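Because the tree is stored in level order starting at index 1, heap order can be checked with index arithmetic alone. The sketch below tests the example array from the slide; HeapCheck and isMaxHeap are names invented for this illustration.

```java
class HeapCheck {
    // Returns true if a[1..n] is heap-ordered: no key is larger than its parent.
    // a[0] is unused, following the slides' 1-based indexing.
    static boolean isMaxHeap(char[] a, int n) {
        for (int k = 2; k <= n; k++)
            if (a[k / 2] < a[k]) return false;   // parent of node at k is at k/2
        return true;
    }

    public static void main(String[] args) {
        char[] heap = "-TSRPNOAEIHG".toCharArray();
        System.out.println(isMaxHeap(heap, 11));   // prints true
    }
}
```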

Binary heap properties

Proposition. Largest key is a[1], which is root of binary tree.

Proposition. Can use array indices to move through tree.
・Parent of node at k is at k/2.
・Children of node at k are at 2k and 2k+1.

    i      0  1  2  3  4  5  6  7  8  9 10 11
    a[i]   -  T  S  R  P  N  O  A  E  I  H  G

Heap representations

Promotion in a heap

Scenario. Child's key becomes larger key than its parent's key.

To eliminate the violation:
・Exchange key in child with key in parent.
・Repeat until heap order restored.

    private void swim(int k)
    {
       while (k > 1 && less(k/2, k))
       {
          exch(k, k/2);
          k = k/2;      // parent of node at k is at k/2
       }
    }

[Figure: Bottom-up reheapify (swim). The node at position 5 violates heap order (larger key than parent) and is exchanged upward.]

Peter principle. Node promoted to level of incompetence.

Insertion in a heap

Insert. Add node at end, then swim it up.
Cost. At most 1 + lg N compares.

    public void insert(Key x)
    {
       pq[++N] = x;
       swim(N);
    }

[Figure: Heap operations. insert: add key to heap (violates heap order), then swim up. remove the maximum: exchange key to remove with the last node, remove node from heap, then sink down.]

Demotion in a heap

Scenario. Parent's key becomes smaller than one (or both) of its children's.

To eliminate the violation:
・Exchange key in parent with key in larger child. (why not smaller child?)
・Repeat until heap order restored.

    private void sink(int k)
    {
       while (2*k <= N)
       {
          int j = 2*k;                            // children of node at k are 2k and 2k+1
          if (j < N && less(j, j+1)) j++;
          if (!less(k, j)) break;
          exch(k, j);
          k = j;
       }
    }

[Figure: Top-down reheapify (sink). The node at position 2 violates heap order (smaller than a child) and is exchanged downward.]

Power struggle. Better subordinate promoted.

Delete the maximum in a heap

Delete max. Exchange root with node at end, then sink it down.
Cost. At most 2 lg N compares.

    public Key delMax()
    {
       Key max = pq[1];
       exch(1, N--);
       sink(1);
       pq[N+1] = null;     // prevent loitering
       return max;
    }

[Figure: Heap operations (insert and remove the maximum)]

Binary heap: Java implementation

    public class MaxPQ<Key extends Comparable<Key>>
    {
       private Key[] pq;
       private int N;

       public MaxPQ(int capacity)                        // fixed capacity (for simplicity)
       {  pq = (Key[]) new Comparable[capacity+1];  }

       // PQ ops
       public boolean isEmpty()
       {  return N == 0;  }
       public void insert(Key key)
       {  /* see previous code */  }
       public Key delMax()
       {  /* see previous code */  }

       // heap helper functions
       private void swim(int k)
       {  /* see previous code */  }
       private void sink(int k)
       {  /* see previous code */  }

       // array helper functions
       private boolean less(int i, int j)
       {  return pq[i].compareTo(pq[j]) < 0;  }
       private void exch(int i, int j)
       {  Key t = pq[i]; pq[i] = pq[j]; pq[j] = t;  }
    }
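For reference, the pieces above can be assembled into one compilable class. This is a sketch that follows the slide code closely; the class name HeapMaxPQ, the field name n, and the underflow exception are choices made here rather than the algs4 source.

```java
import java.util.NoSuchElementException;

class HeapMaxPQ<Key extends Comparable<Key>> {
    private Key[] pq;   // heap-ordered complete binary tree in pq[1..n]
    private int n;      // number of keys on the PQ

    @SuppressWarnings("unchecked")
    HeapMaxPQ(int capacity) {
        pq = (Key[]) new Comparable[capacity + 1];   // pq[0] is unused
    }

    boolean isEmpty() { return n == 0; }

    void insert(Key key) {
        pq[++n] = key;   // add node at end ...
        swim(n);         // ... then swim it up
    }

    Key delMax() {
        if (isEmpty()) throw new NoSuchElementException("priority queue underflow");
        Key max = pq[1];
        exch(1, n--);       // exchange root with node at end
        sink(1);            // sink the new root down
        pq[n + 1] = null;   // prevent loitering
        return max;
    }

    private void swim(int k) {
        while (k > 1 && less(k / 2, k)) {
            exch(k, k / 2);
            k = k / 2;
        }
    }

    private void sink(int k) {
        while (2 * k <= n) {
            int j = 2 * k;
            if (j < n && less(j, j + 1)) j++;   // pick the larger child
            if (!less(k, j)) break;
            exch(k, j);
            k = j;
        }
    }

    private boolean less(int i, int j) { return pq[i].compareTo(pq[j]) < 0; }

    private void exch(int i, int j) { Key t = pq[i]; pq[i] = pq[j]; pq[j] = t; }

    public static void main(String[] args) {
        HeapMaxPQ<String> pq = new HeapMaxPQ<>(10);
        for (String s : "P Q E".split(" ")) pq.insert(s);
        System.out.println(pq.delMax());   // prints Q
    }
}
```

The capacity is fixed, as on the slide, so inserting more than capacity keys overflows the array; a resizing array would remove that limit.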

Priority queues implementation cost summary

order-of-growth of running time for priority queue with N items

implementation     insert    del max      max
unordered array    1         N            N
ordered array      N         1            1
binary heap        log N     log N        1
d-ary heap         log_d N   d log_d N    1
Fibonacci          1         log N †      1
Brodal queue       1         log N        1
impossible         1         1            1       (why impossible?)

† amortized

Binary heap considerations

Immutability of keys.
・Assumption: client does not change keys while they're on the PQ.
・Best practice: use immutable keys.

Underflow and overflow.
・Underflow: throw exception if deleting from empty PQ.
・Overflow: add no-arg constructor and use resizing array. (leads to log N amortized time per op; how to make worst case?)

Minimum-oriented priority queue.
・Replace less() with greater().
・Implement greater().

Other operations.
・Remove an arbitrary item.
・Change the priority of an item.
(can implement with sink() and swim() [stay tuned])

Immutability: implementing in Java

Data type. Set of values and operations on those values.
Immutable data type. Can't change the data type value once created.

    public final class Vector {                  // can't override instance methods
       private final int N;                      // instance variables private and final
       private final double[] data;

       public Vector(double[] data) {
          this.N = data.length;
          this.data = new double[N];
          for (int i = 0; i < N; i++)            // defensive copy of mutable instance variables
             this.data[i] = data[i];
       }

       …                                         // instance methods don't change instance variables
    }

Immutable. String, Integer, Double, Color, Vector, Transaction, Point2D.
Mutable. StringBuilder, Stack, Counter, Java array.

Immutability: properties

Advantages.
・Simplifies debugging.
・Safer in presence of hostile code.
・Simplifies concurrent programming.
・Safe to use as key in priority queue or symbol table.

Disadvantage. Must create new object for each data type value.

“Classes should be immutable unless there's a very good reason to make them mutable.… If a class cannot be made immutable, you should still limit its mutability as much as possible.”
Joshua Bloch (Java architect)
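The defensive copy is what makes such a class safe as a priority queue key: later changes to the caller's array cannot reach the stored values. The sketch below demonstrates this with a class named ImmutableVector (renamed here to avoid confusion with java.util.Vector) and uses clone() in place of the explicit copy loop.

```java
final class ImmutableVector {
    private final double[] data;

    ImmutableVector(double[] data) {
        this.data = data.clone();   // defensive copy of the caller's array
    }

    double get(int i) { return data[i]; }

    int length() { return data.length; }

    public static void main(String[] args) {
        double[] raw = {1.0, 2.0, 3.0};
        ImmutableVector v = new ImmutableVector(raw);
        raw[0] = 99.0;                   // the client mutates its own array afterwards
        System.out.println(v.get(0));    // prints 1.0
    }
}
```

Without the copy, the assignment to raw[0] would silently change the vector's value.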


Challenge

Design a sorting algorithm
・Given an Iterable.
・Design a sorting algorithm that only uses methods from the Set collection to print the items in order.

    public void HeapSort(Iterable<Comparable> a) {
        MaxPQ<Comparable> mpq = new MaxPQ<Comparable>();
        for (Comparable c : a)
            mpq.insert(c);

        for (Comparable c : a)
            System.out.println(mpq.delMax());
    }

Performance
・Order of growth of running time: N lg N.
・Lots of unnecessary data movement and memory usage.

Heapsort

Observation
・Our heaps are represented with arrays.
 – Any array is just a messed up heap!

Heapsort can be done in place.
・Step 1: Heapify the array.
 – In place.
・Step 2: Delete the max repeatedly.
 – Largest element is swapped to the end.
 – Once completed, array is in order.
・Items take a round trip, but only a logarithmic distance.

[Image: Donald Knuth - The Art of Computer Programming Volume 3]

Heapsort

Basic plan for in-place sort.
・Create max-heap with all N keys.
・Repeatedly remove the maximum key.

[Figure: start with array of keys in arbitrary order (S O R T E X A M P L E); build a max-heap (in place) (X T S P L R A M O E E); sorted result (in place) (A E E L M O P R S T X).]

Heapsort demo

Heap construction. Build max heap using bottom-up method. (we assume array entries are indexed 1 to N)

    array in arbitrary order:   S O R T E X A M P L E

Heapsort demo

Sortdown. Repeatedly delete the largest remaining item.

    array in sorted order:      A E E L M O P R S T X

Heapsort: heap construction

First pass. Heapify using bottom-up method.
・Linear time (see book or optional slides).
・Top-down method is N lg N (see book or optional slides).

    for (int k = N/2; k >= 1; k--)
       sink(a, k, N);

[Figure: Heapsort: constructing (left) and sorting down (right) a heap. Heap construction applies sink(5, 11), sink(4, 11), sink(3, 11), sink(2, 11), sink(1, 11) to the starting point (arbitrary order), giving the heap-ordered result. Sortdown starts from that heap and alternates exch(1, 11), sink(1, 10), exch(1, 10), sink(1, 9), and so on down to exch(1, 2), sink(1, 1), giving the sorted result.]

Heapsort: sortdown

Second pass.
・Remove the maximum, one at a time.
・Leave in array, instead of nulling out.

    while (N > 1)
    {
       exch(a, 1, N--);
       sink(a, 1, N);
    }

[Figure: Heapsort: constructing (left) and sorting down (right) a heap]

Heapsort: Java implementation

    public class Heap
    {
       public static void sort(Comparable[] a)
       {
          int N = a.length;
          for (int k = N/2; k >= 1; k--)
             sink(a, k, N);
          while (N > 1)
          {
             exch(a, 1, N);
             sink(a, 1, --N);
          }
       }

       private static void sink(Comparable[] a, int k, int N)
       {  /* as before */  }        // but make static (and pass arguments)

       private static boolean less(Comparable[] a, int i, int j)
       {  /* as before */  }

       private static void exch(Object[] a, int i, int j)
       {  /* as before */  }        // but convert from 1-based indexing to 0-base indexing
    }
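A complete version of the class, with the "as before" bodies filled in, might look like the sketch below. Heap positions stay 1-based in sort() and sink(); only less() and exch() subtract 1 to reach the 0-based Java array. The class name HeapSortSketch and the generic signature are choices made here for illustration.

```java
class HeapSortSketch {
    // Sorts a[] in place. Heap positions are 1-based (1..n), as on the slides.
    static <T extends Comparable<T>> void sort(T[] a) {
        int n = a.length;
        for (int k = n / 2; k >= 1; k--)   // heap construction (bottom-up)
            sink(a, k, n);
        while (n > 1) {                    // sortdown
            exch(a, 1, n--);               // move current maximum to its final position
            sink(a, 1, n);                 // restore heap order in a[1..n]
        }
    }

    private static <T extends Comparable<T>> void sink(T[] a, int k, int n) {
        while (2 * k <= n) {
            int j = 2 * k;
            if (j < n && less(a, j, j + 1)) j++;   // pick the larger child
            if (!less(a, k, j)) break;
            exch(a, k, j);
            k = j;
        }
    }

    // Both helpers convert from 1-based heap positions to 0-based array indices.
    private static <T extends Comparable<T>> boolean less(T[] a, int i, int j) {
        return a[i - 1].compareTo(a[j - 1]) < 0;
    }

    private static void exch(Object[] a, int i, int j) {
        Object t = a[i - 1]; a[i - 1] = a[j - 1]; a[j - 1] = t;
    }

    public static void main(String[] args) {
        String[] a = "S O R T E X A M P L E".split(" ");
        sort(a);
        System.out.println(String.join(" ", a));   // prints A E E L M O P R S T X
    }
}
```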

Heapsort: trace

                         a[i]
      N    k    1  2  3  4  5  6  7  8  9 10 11
    initial     S  O  R  T  E  X  A  M  P  L  E
     11    5    S  O  R  T  L  X  A  M  P  E  E
     11    4    S  O  R  T  L  X  A  M  P  E  E
     11    3    S  O  X  T  L  R  A  M  P  E  E
     11    2    S  T  X  P  L  R  A  M  O  E  E
     11    1    X  T  S  P  L  R  A  M  O  E  E
    heap-ordered X T  S  P  L  R  A  M  O  E  E
     10    1    T  P  S  O  L  R  A  M  E  E  X
      9    1    S  P  R  O  L  E  A  M  E  T  X
      8    1    R  P  E  O  L  E  A  M  S  T  X
      7    1    P  O  E  M  L  E  A  R  S  T  X
      6    1    O  M  E  A  L  E  P  R  S  T  X
      5    1    M  L  E  A  E  O  P  R  S  T  X
      4    1    L  E  E  A  M  O  P  R  S  T  X
      3    1    E  A  E  L  M  O  P  R  S  T  X
      2    1    E  A  E  L  M  O  P  R  S  T  X
      1    1    A  E  E  L  M  O  P  R  S  T  X
    sorted      A  E  E  L  M  O  P  R  S  T  X

Heapsort trace (array contents just after each sink)

Heapsort animation

[Animation: 50 random items. Legend: algorithm position, in order, not in order.]
http://www.sorting-algorithms.com/heap-sort

Heapsort: mathematical analysis

Proposition. Heap construction uses ≤ 2 N compares and exchanges.
Proposition. Heapsort uses ≤ 2 N lg N compares and exchanges. (algorithm can be improved to ~ 1 N lg N)

Significance. In-place sorting algorithm with N log N worst-case.
・Mergesort: no, linear extra space. (in-place merge possible, not practical)
・Quicksort: no, quadratic time in worst case. (N log N worst-case quicksort possible, not practical)
・Heapsort: yes!

Bottom line. Heapsort is optimal for both time and space, but:
・Inner loop longer than quicksort’s.
・Makes poor use of cache memory.
・Not stable.

Heapsort summary

The good news:
・Heap sort: In place and theoretically fast

[Figure: Mergesort (not in place), Quicksort, Heapsort]

The bad news:
・(Almost) nobody uses Heapsort in the real world. Why?
 – Like Miss Manners, Heapsort is very well-behaved, but is unable to handle the stresses of the real world
 – In particular, performance on real computers is heavily impacted by really messy factors like cache performance

What does your computer look like inside? Play with it!

Levels of caches

We’ll assume there’s just one cache, to keep things simple.

[Image: http://media.soundonsound.com/sos/dec08/images/DigitalVillageSynergyPC_04.jpg]

Key idea behind caching

When fetching one memory address, fetch everything nearby.
・Because memory access patterns of most programs/algorithms are highly localized!

Which of these is faster?

A.  sum = 0
    for (i = 0 to size)
       for (j = 0 to size)
          sum += array[i][j]

B.  sum = 0
    for (i = 0 to size)
       for (j = 0 to size)
          sum += array[j][i]

Answer: A is faster, sometimes by an order of magnitude or more.

Cache and memory latencies: an analogy

Cache        Get up and get something from the kitchen
RAM          Walk down the block to borrow from neighbor
Hard drive   Drive around the world... twice
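Loops A and B can be tried out in Java with the sketch below. It assumes a square int[][]; in Java each row is its own contiguous array, so loop A walks one row at a time while loop B touches a different row on every access. The class name, matrix size, and timing code are assumptions, and measured times vary by machine and JVM, so no timings are quoted here.

```java
class TraversalOrder {
    // Loop A: row by row, so consecutive accesses are adjacent in memory.
    static long sumRowMajor(int[][] a) {
        long sum = 0;
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a[i].length; j++)
                sum += a[i][j];
        return sum;
    }

    // Loop B: column by column over a square matrix, so each access jumps to another row.
    static long sumColumnMajor(int[][] a) {
        long sum = 0;
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a.length; j++)
                sum += a[j][i];
        return sum;
    }

    public static void main(String[] args) {
        int size = 2000;
        int[][] a = new int[size][size];
        for (int[] row : a) java.util.Arrays.fill(row, 1);

        long t0 = System.nanoTime();
        long s1 = sumRowMajor(a);
        long t1 = System.nanoTime();
        long s2 = sumColumnMajor(a);
        long t2 = System.nanoTime();

        System.out.println("A (row-major):    " + (t1 - t0) / 1_000_000 + " ms, sum = " + s1);
        System.out.println("B (column-major): " + (t2 - t1) / 1_000_000 + " ms, sum = " + s2);
    }
}
```

Both methods compute the same sum; only the order of memory accesses differs.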

Sort algorithms and cache performance

Mergesort: sort subarrays first
Quicksort: partition into subarrays
Heapsort: all over the place

Sorting algorithms: summary

              inplace?  stable?  worst       average     best      remarks
selection     x                  ½ N²        ½ N²        ½ N²      N exchanges
insertion     x         x        ½ N²        ¼ N²        N         use for small N or partially ordered
shell         x                  ?           ?           N         tight code, subquadratic
quick         x                  ½ N²        2 N ln N    N lg N    N log N probabilistic guarantee, fastest in practice
3-way quick   x                  ½ N²        2 N ln N    N         improves quicksort in presence of duplicate keys
merge                   x        N lg N      N lg N      N lg N    N log N guarantee, stable
heap          x                  2 N lg N    2 N lg N    N lg N    N log N guarantee, in-place
???           x         x        N lg N      N lg N      N lg N    holy sorting grail

# key comparisons to sort N distinct randomly-ordered keys

Optional slides on heapification running time
・Or see textbook section 2.4

Sink-based (bottom up) heapification

Observation
・Given two heaps of height 1.
・A heap of height 2 results by:
 – Pointing the root of each heap at a new item.
 – Sinking that new item.
・Cost: 4 compares (2 * height of new tree).

pollEv.com/jhug text to 37607
Q: How many compares are needed to sink the O into the correct position in the worst case?

[Figure: new item O above two heaps of height h = 1, with roots T and L and leaves M, P, E, E]

A. 1 [676050]
B. 2 [676051]
C. 3 [676052]
D. 4 [676053]

Sink-based (bottom up) heapification

Observation
・Given two heaps of height h-1.
・A heap of height h results by
 – Pointing the root of each heap at a new item.
 – Sinking that new item.
・Cost to sink: At most 2h compares.
・Total heap construction cost: 4*2 + 2*4 + 6 = 22 compares

[Figure: two 7-node heaps joined under a new root. Worst-case sink cost per node: 6 at the root, 4 at height 2, 2 at height 1, 0 at the leaves.]

pollEv.com/jhug text to 37607
Q: How many worst-case compares are needed to form a height 3 heap by sinking an item into one of two perfectly balanced heaps of height 2?

A. 4 [676057]
B. 6 [676058]
C. 8 [676059]
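The count 4*2 + 2*4 + 6 = 22 generalizes to a recurrence: a perfectly balanced heap of height h is built from two heaps of height h-1 plus at most 2h compares to sink the new root. The sketch below evaluates that recurrence and compares it with the node count N = 2^(h+1) - 1; the class and method names are made up, and the base case of 0 compares for a single node is an assumption consistent with the slide's numbers.

```java
class HeapConstructionCost {
    // Worst-case compares to build a perfectly balanced heap of height h bottom-up:
    // two subheaps of height h-1, plus at most 2h compares to sink the new root.
    static long compares(int h) {
        if (h == 0) return 0;                  // a single node needs no compares
        return 2 * compares(h - 1) + 2L * h;
    }

    // Number of nodes in a perfectly balanced tree of height h.
    static long nodes(int h) {
        return (1L << (h + 1)) - 1;
    }

    public static void main(String[] args) {
        for (int h = 1; h <= 5; h++)
            System.out.println("h = " + h + ": N = " + nodes(h) + ", compares <= " + compares(h));
    }
}
```

For h = 1, 2, 3 this gives 2, 8, 22 compares, and the value stays below 2N for every height, which matches the claim that heap construction is linear in N.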

Sink-based (bottom up) heapification

Total Heap Construction Cost
・For h=1: C_1 = 2
・For h=2: C_2 = 2 C_1 + 2*2
・For h: C_h = 2 C_(h-1) + 2h
・Total cost: Doubles with h (plus a small constant factor): Exponential in h
・Total cost: Linear in N

[Figure: worst-case sink cost per node for two heaps of height 3: 6 at each root, 4 at height 2, 2 at height 1, 0 at the leaves]

Heapsort

Order of growth of running time
・Heap construction: N
・N calls to delete max: N lg N

Total Extra Space
・Constant (in-place)

Molecular dynamics simulation of hard discs

Goal. Simulate the motion of N moving particles that behave according to the laws of elastic collision.

Hard disc model.
・Moving particles interact via elastic collisions with each other and walls.
・Each particle is a disc with known position, velocity, mass, and radius.
・No other forces.

Significance. Relates macroscopic observables (temperature, pressure, diffusion constant) to microscopic dynamics (motion of individual atoms and molecules).
・Maxwell-Boltzmann: distribution of speeds as a function of temperature.
・Einstein: explain Brownian motion of pollen grains.

Warmup: bouncing balls

Time-driven simulation. N bouncing balls in the unit square.

    public class BouncingBalls
    {
       public static void main(String[] args)
       {
          int N = Integer.parseInt(args[0]);
          Ball[] balls = new Ball[N];
          for (int i = 0; i < N; i++)
             balls[i] = new Ball();
          while(true)                        // main simulation loop
          {
             StdDraw.clear();
             for (int i = 0; i < N; i++)
             {
                balls[i].move(0.5);
                balls[i].draw();
             }
             StdDraw.show(50);
          }
       }
    }

    % java BouncingBalls 100

Warmup: bouncing balls

    public class Ball
    {
        private double rx, ry;          // position
        private double vx, vy;          // velocity
        private final double radius;    // radius

        public Ball(...)
        { /* initialize position and velocity */ }

        public void move(double dt)
        {
            // check for collision with walls
            if ((rx + vx*dt < radius) || (rx + vx*dt > 1.0 - radius)) { vx = -vx; }
            if ((ry + vy*dt < radius) || (ry + vy*dt > 1.0 - radius)) { vy = -vy; }
            rx = rx + vx*dt;
            ry = ry + vy*dt;
        }

        public void draw()
        { StdDraw.filledCircle(rx, ry, radius); }
    }

Missing. Check for balls colliding with each other.
・Physics problems: when? what effect?
・CS problems: which object does the check? too many checks?

Time-driven simulation

・Discretize time in quanta of size dt.
・Update the position of each particle after every dt units of time, and check for overlaps.
・If overlap, roll back the clock to the time of the collision, update the velocities of the colliding particles, and continue the simulation.

[Figure: snapshots at t, t + dt, t + 2 dt (collision detected), and t + Δt (roll back clock)]
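The missing ball-ball check can be sketched as a brute-force overlap test run once per time quantum. This is a minimal sketch, not code from the slides: the class `OverlapCheck`, its methods `overlaps` and `countOverlaps`, and the parallel-array representation are all assumptions made for illustration. It tests each unordered pair once, which is where the N(N−1)/2 cost per quantum comes from.

```java
// Hypothetical helper (not part of the original slides): brute-force
// overlap detection for discs, as a time-driven simulation might run
// once per time quantum.
class OverlapCheck {
    // Two discs overlap when the distance between their centers is less
    // than the sum of their radii (compared in squared form to avoid sqrt).
    static boolean overlaps(double x1, double y1, double r1,
                            double x2, double y2, double r2) {
        double dx = x2 - x1, dy = y2 - y1, sigma = r1 + r2;
        return dx*dx + dy*dy < sigma*sigma;
    }

    // Tests every unordered pair once: N(N-1)/2 tests for N discs.
    static int countOverlaps(double[] rx, double[] ry, double[] radius) {
        int n = rx.length, count = 0;
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)
                if (overlaps(rx[i], ry[i], radius[i], rx[j], ry[j], radius[j]))
                    count++;
        return count;
    }

    public static void main(String[] args) {
        double[] rx     = { 0.10, 0.15, 0.80 };
        double[] ry     = { 0.50, 0.50, 0.50 };
        double[] radius = { 0.05, 0.05, 0.05 };
        System.out.println(countOverlaps(rx, ry, radius));
    }
}
```

In the sample data only the first two discs overlap, so exactly one pair is reported. Detecting the overlap is only half the job: the simulation would still have to roll back the clock to the moment of contact.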

Time-driven simulation

Main drawbacks.
・~ N²/2 overlap checks per time quantum.
・Simulation is too slow if dt is very small.
・May miss collisions if dt is too large (if colliding particles fail to overlap when we are looking).

[Figure: Fundamental challenge for time-driven simulation. dt too small: excessive computation. dt too large: may miss collisions.]

Event-driven simulation

Change state only when something happens.
・Between collisions, particles move in straight-line trajectories.
・Focus only on times when collisions occur.
・Maintain PQ of collision events, prioritized by time.
・Remove the min = get next collision.

Collision prediction. Given position, velocity, and radius of a particle, when will it collide next with a wall or another particle?

Collision resolution. If collision occurs, update colliding particle(s) according to laws of elastic collisions.

[Figure: Predicting and resolving a particle-particle collision. Prediction (at time t): particles hit unless one passes intersection point before the other arrives (see Exercise 3.6.X). Resolution (at time t + dt): velocities of both particles change after collision (see Exercise 3.6.X).]
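The idea of keeping events in a priority queue ordered by time can be sketched as follows. This is an illustration under stated assumptions: it uses `java.util.PriorityQueue` from the Java standard library as a stand-in for the course's `MinPQ`, and the names `EventQueueSketch`, `TimedEvent`, and `drain` are hypothetical.

```java
import java.util.PriorityQueue;

// Hypothetical illustration (names are not from the slides): events are
// ordered by time, and the earliest event is always removed first.
class EventQueueSketch {
    static final class TimedEvent implements Comparable<TimedEvent> {
        final double time;
        final String label;

        TimedEvent(double time, String label) {
            this.time = time;
            this.label = label;
        }

        // Earlier events compare as smaller, so they leave the queue first.
        public int compareTo(TimedEvent that) {
            return Double.compare(this.time, that.time);
        }
    }

    // Returns the event labels in the order a simulation would process them.
    static String drain(PriorityQueue<TimedEvent> pq) {
        StringBuilder order = new StringBuilder();
        while (!pq.isEmpty()) {
            TimedEvent e = pq.poll();            // remove the min = next event
            if (order.length() > 0) order.append(' ');
            order.append(e.label);
        }
        return order.toString();
    }

    public static void main(String[] args) {
        PriorityQueue<TimedEvent> pq = new PriorityQueue<>();
        pq.add(new TimedEvent(2.5, "wall"));
        pq.add(new TimedEvent(0.7, "particle-particle"));
        pq.add(new TimedEvent(1.2, "redraw"));
        System.out.println(drain(pq));
    }
}
```

Events come out in time order regardless of insertion order, so the simulation never has to scan all pending collisions to find the next one.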

Particle-wall collision

Collision prediction and resolution.
・Particle of radius s at position (rx, ry).
・Particle moving in unit box with velocity (vx, vy).
・Will it collide with a vertical wall? If so, when?

[Figure: Predicting and resolving a particle-wall collision, wall at x = 1.
Prediction (at time t): Δt = time to hit wall = distance/velocity = (1 − s − rx)/vx.
Resolution (at time t + Δt): velocity after collision = (−vx, vy); position after collision = (1 − s, ry + vy Δt).]

Particle-particle collision prediction

Collision prediction.
・Particle i: radius si, position (rxi, ryi), velocity (vxi, vyi).
・Particle j: radius sj, position (rxj, ryj), velocity (vxj, vyj).
・Will particles i and j collide? If so, when?

[Figure: at time = t, particle i at (rxi, ryi) with velocity (vxi, vyi) and particle j with velocity (vxj, vyj); at time = t + Δt, the particles touch at (rxi', ryi') and particle i leaves with velocity (vxi', vyi').]
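The particle-wall case can be sketched in a few lines. The slides leave the wall methods empty, so this is only one plausible reading of the distance/velocity formula: the class `WallSketch` and its field layout are hypothetical, and handling the left wall at x = 0 is an assumption extending the wall-at-x = 1 case shown in the figure.

```java
// Hypothetical sketch of the particle-wall case for vertical walls at
// x = 0 and x = 1; the class and its fields are assumptions.
class WallSketch {
    double rx, ry;          // position
    double vx, vy;          // velocity
    final double radius;    // radius (s in the slides)

    WallSketch(double rx, double ry, double vx, double vy, double radius) {
        this.rx = rx; this.ry = ry;
        this.vx = vx; this.vy = vy;
        this.radius = radius;
    }

    // Prediction: time until the disc touches a vertical wall,
    // computed as distance / velocity.
    double timeToHitVerticalWall() {
        if (vx > 0) return (1.0 - radius - rx) / vx;   // right wall at x = 1
        if (vx < 0) return (radius - rx) / vx;         // left wall at x = 0
        return Double.POSITIVE_INFINITY;               // moving parallel to the walls
    }

    // Straight-line motion between collisions.
    void move(double dt) {
        rx += vx * dt;
        ry += vy * dt;
    }

    // Resolution: reflect the x-component of the velocity.
    void bounceOffVerticalWall() {
        vx = -vx;
    }

    public static void main(String[] args) {
        WallSketch p = new WallSketch(0.5, 0.5, 0.2, 0.1, 0.1);
        double dt = p.timeToHitVerticalWall();
        p.move(dt);
        p.bounceOffVerticalWall();
        System.out.println(dt + " " + p.rx + " " + p.ry + " " + p.vx);
    }
}
```

For a particle at (0.5, 0.5) with velocity (0.2, 0.1) and radius 0.1, the predicted time is (1 − 0.1 − 0.5)/0.2, which is 2 up to floating-point rounding. After moving and bouncing, the particle sits at roughly (0.9, 0.7) with its x-velocity negated.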

Particle-particle collision prediction

Collision prediction. Will particles i and j collide? If so, when?

\[
\Delta t =
\begin{cases}
\infty & \text{if } \Delta v \cdot \Delta r \ge 0 \\
\infty & \text{if } d < 0 \\
-\dfrac{\Delta v \cdot \Delta r + \sqrt{d}}{\Delta v \cdot \Delta v} & \text{otherwise}
\end{cases}
\]

\[
d = (\Delta v \cdot \Delta r)^2 - (\Delta v \cdot \Delta v)\,(\Delta r \cdot \Delta r - \sigma^2),
\qquad \sigma = \sigma_i + \sigma_j \ \text{(sum of the two radii)}
\]

\[
\begin{aligned}
\Delta v &= (\Delta vx, \Delta vy) = (vx_j - vx_i,\ vy_j - vy_i) \\
\Delta r &= (\Delta rx, \Delta ry) = (rx_j - rx_i,\ ry_j - ry_i) \\
\Delta v \cdot \Delta v &= (\Delta vx)^2 + (\Delta vy)^2 \\
\Delta r \cdot \Delta r &= (\Delta rx)^2 + (\Delta ry)^2 \\
\Delta v \cdot \Delta r &= (\Delta vx)(\Delta rx) + (\Delta vy)(\Delta ry)
\end{aligned}
\]

Particle-particle collision resolution

Collision resolution. When two particles collide, how does velocity change?

Newton's second law (momentum form):

\[
\begin{aligned}
vx_i' &= vx_i + Jx / m_i \\
vy_i' &= vy_i + Jy / m_i \\
vx_j' &= vx_j - Jx / m_j \\
vy_j' &= vy_j - Jy / m_j
\end{aligned}
\]

Impulse due to normal force (conservation of energy, conservation of momentum):

\[
Jx = \frac{J\,\Delta rx}{\sigma}, \qquad
Jy = \frac{J\,\Delta ry}{\sigma}, \qquad
J = \frac{2\, m_i\, m_j\, (\Delta v \cdot \Delta r)}{\sigma\,(m_i + m_j)}
\]

Important note: This is high-school physics, so we won’t be testing you on it!

Particle data type skeleton

    public class Particle
    {
        private double rx, ry;          // position
        private double vx, vy;          // velocity
        private final double radius;    // radius
        private final double mass;      // mass
        private int count;              // number of collisions

        public Particle(...) { }

        public void move(double dt) { }
        public void draw() { }

        // predict collision with particle or wall
        public double timeToHit(Particle that) { }
        public double timeToHitVerticalWall() { }
        public double timeToHitHorizontalWall() { }

        // resolve collision with particle or wall
        public void bounceOff(Particle that) { }
        public void bounceOffVerticalWall() { }
        public void bounceOffHorizontalWall() { }
    }

Particle-particle collision and resolution implementation

    public double timeToHit(Particle that)
    {
        if (this == that) return INFINITY;
        double dx  = that.rx - this.rx, dy  = that.ry - this.ry;
        double dvx = that.vx - this.vx, dvy = that.vy - this.vy;
        double dvdr = dx*dvx + dy*dvy;
        if (dvdr > 0) return INFINITY;          // no collision
        double dvdv = dvx*dvx + dvy*dvy;
        double drdr = dx*dx + dy*dy;
        double sigma = this.radius + that.radius;
        double d = (dvdr*dvdr) - dvdv * (drdr - sigma*sigma);
        if (d < 0) return INFINITY;             // no collision
        return -(dvdr + Math.sqrt(d)) / dvdv;
    }

    public void bounceOff(Particle that)
    {
        double dx  = that.rx - this.rx, dy  = that.ry - this.ry;
        double dvx = that.vx - this.vx, dvy = that.vy - this.vy;
        double dvdr = dx*dvx + dy*dvy;
        double dist = this.radius + that.radius;
        double J = 2 * this.mass * that.mass * dvdr / ((this.mass + that.mass) * dist);
        double Jx = J * dx / dist;
        double Jy = J * dy / dist;
        this.vx += Jx / this.mass;
        this.vy += Jy / this.mass;
        that.vx -= Jx / that.mass;
        that.vy -= Jy / that.mass;
        this.count++;
        that.count++;
    }

Important note: This is high-school physics, so we won’t be testing you on it!
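A worked head-on example may help check the prediction and resolution formulas. The sketch below restates the same arithmetic on plain arrays so that it runs on its own. The class `PairSketch`, the array layout {rx, ry, vx, vy, radius, mass}, and the `move` helper are assumptions for illustration and are not part of the `Particle` class.

```java
// Hypothetical stand-alone restatement of the prediction and resolution
// formulas; each particle is an array {rx, ry, vx, vy, radius, mass}.
class PairSketch {
    static final int RX = 0, RY = 1, VX = 2, VY = 3, RADIUS = 4, MASS = 5;

    // Time until a and b touch, or infinity if they never do.
    static double timeToHit(double[] a, double[] b) {
        double dx  = b[RX] - a[RX], dy  = b[RY] - a[RY];
        double dvx = b[VX] - a[VX], dvy = b[VY] - a[VY];
        double dvdr = dx*dvx + dy*dvy;
        if (dvdr > 0) return Double.POSITIVE_INFINITY;   // moving apart
        double dvdv = dvx*dvx + dvy*dvy;
        double drdr = dx*dx + dy*dy;
        double sigma = a[RADIUS] + b[RADIUS];
        double d = dvdr*dvdr - dvdv * (drdr - sigma*sigma);
        if (d < 0) return Double.POSITIVE_INFINITY;      // trajectories miss
        return -(dvdr + Math.sqrt(d)) / dvdv;
    }

    // Straight-line motion for dt time units.
    static void move(double[] p, double dt) {
        p[RX] += p[VX] * dt;
        p[RY] += p[VY] * dt;
    }

    // Elastic collision; assumes a and b are in contact when called.
    static void bounceOff(double[] a, double[] b) {
        double dx  = b[RX] - a[RX], dy  = b[RY] - a[RY];
        double dvx = b[VX] - a[VX], dvy = b[VY] - a[VY];
        double dvdr = dx*dvx + dy*dvy;
        double dist = a[RADIUS] + b[RADIUS];
        double J  = 2 * a[MASS] * b[MASS] * dvdr / ((a[MASS] + b[MASS]) * dist);
        double Jx = J * dx / dist, Jy = J * dy / dist;
        a[VX] += Jx / a[MASS];  a[VY] += Jy / a[MASS];
        b[VX] -= Jx / b[MASS];  b[VY] -= Jy / b[MASS];
    }

    public static void main(String[] args) {
        double[] a = { 0.2, 0.5,  1.0, 0.0, 0.1, 1.0 };
        double[] b = { 0.8, 0.5, -1.0, 0.0, 0.1, 1.0 };
        double dt = timeToHit(a, b);
        move(a, dt);
        move(b, dt);
        bounceOff(a, b);
        System.out.println(dt + " " + a[VX] + " " + b[VX]);
    }
}
```

The centers start 0.6 apart, contact happens at separation 0.2, and the closing speed is 2, so the predicted time is 0.2 up to rounding. Because the masses are equal and the collision is head-on, the two particles exchange velocities.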

Collision system: event-driven simulation main loop

Initialization.
・Fill PQ with all potential particle-wall collisions.
・Fill PQ with all potential particle-particle collisions.
("potential" since collision may not happen if some other collision intervenes)

[Figure: An invalidated event. Two particles on a collision course; third particle interferes: no collision.]

Main loop.
・Delete the impending event from PQ (min priority = t).
・If the event has been invalidated, ignore it.
・Advance all particles to time t, on a straight-line trajectory.
・Update the velocities of the colliding particle(s).
・Predict future particle-wall and particle-particle collisions involving the colliding particle(s) and insert events onto PQ.

Event data type

Conventions.
・Neither particle null ⇒ particle-particle collision.
・One particle null ⇒ particle-wall collision.
・Both particles null ⇒ redraw event.

    private class Event implements Comparable<Event>
    {
        private double time;            // time of event
        private Particle a, b;          // particles involved in event
        private int countA, countB;     // collision counts for a and b

        // create event
        public Event(double t, Particle a, Particle b) { }

        // ordered by time
        public int compareTo(Event that)
        { return Double.compare(this.time, that.time); }

        // invalid if intervening collision
        public boolean isValid()
        { }
    }
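The slide leaves `isValid()` empty. One plausible way to fill it in, consistent with the `countA` and `countB` fields, is to record each particle's collision count when the event is created and treat any later change as an intervening collision. The sketch below is an assumption along those lines, using hypothetical stand-in classes `P` and `Ev` in place of `Particle` and `Event`.

```java
// Hypothetical sketch: an event remembers each particle's collision count
// at creation time; any later collision changes the count and so
// invalidates the event.
class ValiditySketch {
    // Stand-in for Particle: only the collision counter matters here.
    static final class P {
        int count;
    }

    // Stand-in for Event.
    static final class Ev {
        final double time;
        final P a, b;
        final int countA, countB;

        Ev(double time, P a, P b) {
            this.time = time;
            this.a = a;
            this.b = b;
            // Snapshot the counts; -1 marks a null particle (wall or redraw).
            this.countA = (a != null) ? a.count : -1;
            this.countB = (b != null) ? b.count : -1;
        }

        // Valid only if neither particle has collided since creation.
        boolean isValid() {
            if (a != null && a.count != countA) return false;
            if (b != null && b.count != countB) return false;
            return true;
        }
    }

    public static void main(String[] args) {
        P p = new P(), q = new P();
        Ev e = new Ev(1.0, p, q);
        System.out.println(e.isValid());
        p.count++;                        // p collides with something else first
        System.out.println(e.isValid());
    }
}
```

With this scheme a stale event never has to be searched for and removed from the PQ. It is simply skipped when it reaches the front, which matches the "if the event has been invalidated, ignore it" step of the main loop.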

Collision system implementation: skeleton

    public class CollisionSystem
    {
        private MinPQ<Event> pq;            // the priority queue
        private double t = 0.0;             // simulation clock time
        private Particle[] particles;       // the array of particles

        public CollisionSystem(Particle[] particles) { }

        // add to PQ all particle-wall and particle-particle
        // collisions involving this particle
        private void predict(Particle a)
        {
            if (a == null) return;
            for (int i = 0; i < N; i++)
            {
                double dt = a.timeToHit(particles[i]);
                pq.insert(new Event(t + dt, a, particles[i]));
            }
            pq.insert(new Event(t + a.timeToHitVerticalWall(),   a, null));
            pq.insert(new Event(t + a.timeToHitHorizontalWall(), null, a));
        }

        private void redraw() { }

        public void simulate() { /* see next slide */ }
    }

Collision system implementation: main event-driven simulation loop

    public void simulate()
    {
        // initialize PQ with collision events and redraw event
        pq = new MinPQ<Event>();
        for (int i = 0; i < N; i++) predict(particles[i]);
        pq.insert(new Event(0, null, null));

        while (!pq.isEmpty())
        {
            // get next event
            Event event = pq.delMin();
            if (!event.isValid()) continue;
            Particle a = event.a;
            Particle b = event.b;

            // update positions and time
            for (int i = 0; i < N; i++)
                particles[i].move(event.time - t);
            t = event.time;

            // process event
            if      (a != null && b != null) a.bounceOff(b);
            else if (a != null && b == null) a.bounceOffVerticalWall();
            else if (a == null && b != null) b.bounceOffHorizontalWall();
            else if (a == null && b == null) redraw();

            // predict new events based on changes
            predict(a);
            predict(b);
        }
    }
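The shape of the main loop (get next event, advance the clock, resolve, predict) can be exercised on a deliberately tiny case. The sketch below is a hypothetical one-dimensional reduction, not the `CollisionSystem` code: a single point particle bounces between walls at x = 0 and x = 1. It uses `java.util.PriorityQueue` in place of `MinPQ`, and the time `limit` parameter is an added assumption so that the loop terminates. With only one particle there are no intervening collisions, so no validity check is needed.

```java
import java.util.PriorityQueue;

// Hypothetical 1D illustration of the event-driven loop: one point
// particle bouncing between walls at x = 0 and x = 1.
class OneDSketch {
    // Time until the particle reaches a wall.
    static double timeToHitWall(double rx, double vx) {
        if (vx > 0) return (1.0 - rx) / vx;     // right wall
        if (vx < 0) return (0.0 - rx) / vx;     // left wall
        return Double.POSITIVE_INFINITY;        // at rest: never hits
    }

    // Returns the number of wall collisions processed up to time limit.
    static int simulate(double rx, double vx, double limit) {
        PriorityQueue<Double> pq = new PriorityQueue<>();   // event times
        double t = 0.0;                                     // simulation clock
        int bounces = 0;
        pq.add(t + timeToHitWall(rx, vx));                  // initial prediction
        while (!pq.isEmpty()) {
            double eventTime = pq.poll();                   // get next event
            if (eventTime > limit) break;                   // stop at the time limit
            rx += vx * (eventTime - t);                     // advance to event time
            t = eventTime;
            vx = -vx;                                       // resolve: reflect velocity
            bounces++;
            pq.add(t + timeToHitWall(rx, vx));              // predict the next event
        }
        return bounces;
    }

    public static void main(String[] args) {
        System.out.println(simulate(0.5, 1.0, 3.0));
    }
}
```

Starting at x = 0.5 with velocity 1, the particle reaches a wall at times 0.5, 1.5, and 2.5, so three events are processed before the limit of 3. No work is done between those times, which is the point of the event-driven approach.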

Particle collision simulation example 1

% java CollisionSystem 100

Particle collision simulation example 2

% java CollisionSystem < billiards.txt

Particle collision simulation example 3

% java CollisionSystem < brownian.txt

Particle collision simulation example 4

% java CollisionSystem < diffusion.txt