BucketSort Complexity: O(n+K) BucketSort (aka BinSort) If all values to be sorted are known to be between 1 and K, create an array count of size K, increment • Case 1: K is a constant counts while traversing the input, and finally output the result. – BinSort is linear time • Case 2: K is variable Example K=5. Input = (5,1,3,4,3,2,1,1,5,4,5) – Not simply linear time count array 32 1 • Case 3: K is constant but large (e.g. 2 ) 2 – ??? 3 4 Running time to sort n items? 5 1 2

Fixing impracticality: RadixSort Radix Sort Example (1 st pass)

Bucket sort • Radix = “The base of a number system” by 1’s digit st Input data After 1 pass – We’ll use 10 for convenience, but could be 478 721 anything 537 3 9 0 1 2 3 4 5 6 7 8 9 123 537 721 72 1 3 53 7 47 8 9 3 12 3 67 38 67 38 478 • Idea : BucketSort on each digit , 123 38 67 least significant to most significant 9 (lsd to msd)

This example uses B=10 and base 10 digits for simplicity of demonstration. Larger bucket counts should be used 3 in an actual implementation. 4

Radix Sort Example (2 nd pass) Radix Sort Example (3 rd pass)

Bucket sort After 1 st pass After 2 nd pass After 2 nd pass Bucket sort After 3 rd pass by 10’s by 100’s digit 721 3 3 digit 3 3 9 9 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 123 721 721 38 537 03 721 537 67 478 123 123 003 123 478 537 721 67 67 09 123 38 537 537 009 123 038 478 38 38 478 067 38 67 67 537 9 478 478 721

Invariant : after k passes the low order k digits are sorted.

5 6

1 Radixsort: Complexity Internal versus • How many passes? • Need sorting algorithms that minimize disk/tape access time • How much work per pass? • External sorting – Basic Idea: • Total time? – Load chunk of data into RAM, sort, store this “run” on disk/tape • Conclusion? – Use the Merge routine from Mergesort to merge runs – Repeat until you have only one run (one sorted chunk) • In practice – Text gives some examples – RadixSort only good for large number of elements with relatively small values – Hard on the cache compared to MergeSort/ 7 8

Graph… ADT? • Not quite an ADT… operations not clear Han Luke Graphs • A formalism for representing relationships between objects Leia Graph G = (V,E) Chapter 9 in Weiss – Set of vertices : V = {v 1,v 2,…,vn} V = { Han , Leia , Luke } – Set of edges : E = {( Luke , Leia ), (Han , Leia ), E = {e 1,e 2,…,em} (Leia , Han )} where each ei connects two vertices (v i1 ,v i2 ) 10

More Definitions: Graph Definitions Simple Paths and Cycles In directed graphs, edges have a specific direction: A simple path repeats no vertices (except that the first can be the last): Han Luke p = {Seattle, Salt Lake City, San Francisco, Dallas} p = {Seattle, Salt Lake City, Dallas, San Francisco, Seattle} Leia A cycle is a path that starts and ends at the same node: In undirected graphs, they don’t (edges are two-way): p = {Seattle, Salt Lake City, Dallas, San Francisco, Seattle} Han Luke p = {Seattle, Salt Lake City, Seattle, San Francisco, Seattle}

A simple cycle is a cycle that repeats no vertices except Leia that the first vertex is also the last (in undirected v is adjacent to u if (u,v) ∈∈∈ E graphs, no edge can be repeated)

11 12

2 Trees as Graphs Directed Acyclic Graphs (DAGs)

DAGs are directed main() • Every tree is a graph! A graphs with no • Not all graphs are trees! (directed) cycles. B C mult()

add() A graph is a tree if D E F Aside: If program call- – There are no cycles graph is a DAG, then all G H (directed or undirected) procedure calls can be in- read() access() – There is a path from the lined root to every node

13 14

Graph Representations Representation 1: Adjacency Matrix Han Luke A |V| x |V| array in which an element 0. List of vertices + list of edges Leia (u,v) is true if and only if there is an edge 1. 2-D matrix of vertices (marking edges in the cells) from u to v “adjacency matrix” Han Luke Leia 2. List of vertices each with a list of adjacent vertices Han Han “adjacency list” Luke Luke Vertices and edges Leia Things we might want to do: may be labeled • iterate over vertices Leia • iterate over edges • iterate over vertices adj. to a vertex space requirements: runtime: • check whether an edge exists 15 16

Representation Representation • adjacency matrix : • adjacency list: À weight , if (u, v) ∈ E A[u][v] = Ã Õ 0 , if (u, v) ∉ E 1 2

1 2 3 4

1 3 4 1 2 2

1 3 2 3 4 2 3 4 3 3 4 4 1 2

17 18

3 Some Applications: Representation 2: Adjacency List Moving Around Washington A |V| -ary list (array) in which each entry stores a list (linked list) of all adjacent vertices

Han Han Luke Luke Leia Leia

What’s the shortest way to get from Seattle to Pullman? Edge labels: space requirements: runtime: 19 20

Some Applications: Some Applications: Moving Around Washington Reliability of Communication

What’s the fastest way to get from Seattle to Pullman? If Wenatchee’s phone exchange goes down , Edge labels: 21 can Seattle still talk to Pullman? 22

Some Applications: Application: Topological Sort Bus Routes in Downtown Seattle Given a directed graph, G = (V,E) , output all the vertices in V such that no vertex is output before any other vertex with an edge to it.

CSE 403 CSE 321 CSE 322 CSE 421 CSE 326 CSE 142 CSE 143 CSE 341 CSE 451 CSE 370 CSE 467 CSE 378 If we’re at 3 rd and Pine, how can we get to st 1 and University using Metro? Is the output unique? 23 24

4 Topological Sort: Take One void Graph::topsort(){ Vertex v, w;

labelEachVertexWithItsIn-degree(); 1. Label each vertex with its in-degree (# of inbound edges) for (int counter=0; counter < NUM_VERTICES; 2. While there are vertices remaining: counter++){ a. Choose a vertex v of in-degree zero ; output v v = findNewVertexOfDegreeZero(); b. Reduce the in-degree of all vertices adjacent to v c. Remove v from the list of vertices v.topologicalNum = counter; for each w adjacent to v w.indegree--; } Runtime: }

25 26

5