Directed Graphs

Digraph. Set of objects with oriented pairwise connections.

one-way streets in a map hyperlinks connecting web pages

0 .002 25 1 .017 Directed Graphs 34 0 2 .009 3 .003 4 .006 7 5 .016 6 .066 2 10 7 .021 40 8 .017 9 .040 19 33 29 10 .002 41 49 15 11 .028 8 12 .006 44 13 .045 14 .018 45 15 .026 16 .023 • introduction 28 17 .005 1 14 18 .023 19 .026 22 48 20 .004 • digraph search 39 21 .034 22 .063 18 6 21 23 .043 • transitive closure 24 .011 25 .005 42 13 26 .006 23 27 .008 • topological sort 31 47 28 .037 11 12 29 .003 32 30 .037 • strong components 30 31 .023 26 32 .018 33 .013 5 37 27 34 .024 9 16 35 .019 36 .003 43 37 .031 38 .012 24 39 .023 4 40 .017 38 41 .021 42 .021 3 17 35 43 .016 36 44 .023 References: in Java (Part 5), Chapter 19 46 20 45 .006 46 .023 Intro to Algs and Data Structures, Section 5.2 47 .024 48 .019 49 .016

Copyright © 2007 by Robert Sedgewick and Kevin Wayne. 6 22 3 Page ranks with histogram for a larger example

Directed graphs (digraphs)

Set of objects with oriented pairwise connections.

dependencies in software modules prey-predator relationship among species

introduction digraph search transitive closure topological sort strong components

2 4 Digraph Applications Digraph representation

Vertex names. Digraph Vertex Edge ! This lecture: use integers between 0 and V-1. financial stock, currency transaction ! Real world: convert between names and integers with symbol table. transportation street intersection, airport highway, airway route

scheduling task precedence constraint Orientation of edge is significant.

WordNet synset hypernym

Web web page hyperlink 0 7 8 game board position legal move

telephone person placed call 1 2 6 9 10 food web species predator-prey relation

infectious disease person infection 3 4 11 12 citation journal article citation

object graph object pointer 5

inheritance hierarchy class inherits from

control flow code block jump

5 7

Some Digraph Problems Adjacency Matrix Representation

Transitive closure. Is there a directed path from v to w? Maintain a two-dimensional V ! V boolean array. For each edge v"w in graph: adj[v][w] = true. Strong connectivity. Are all vertices mutually reachable?

Topological sort. Can you draw the graph so that all edges point to

from left to right? 0 0 1 2 3 4 5 6 7 8 9 10 11 12

0 0 1 1 0 0 1 1 0 0 0 0 0 0 PERT/CPM. Given a set of tasks with precedence constraints, 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 2 6 what is the earliest that we can complete each task? 2 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 4 0 0 0 1 0 0 0 0 0 0 0 0 0 Shortest path. Given a weighted digraph, find best route from s to t 3 4 5 0 0 0 1 1 0 0 0 0 0 0 0 0 from 6 0 0 0 0 1 0 0 0 0 0 0 0 0 7 0 0 0 0 0 0 0 0 1 0 0 0 0 5 PageRank. What is the importance of a web page? 9 10 8 0 0 0 0 0 0 0 0 0 0 0 0 0 9 0 0 0 0 0 0 0 0 0 0 1 1 1 10 0 0 0 0 0 0 0 0 0 0 0 0 0 11 0 0 0 0 0 0 0 0 0 0 0 0 1 7 8 11 12 12 0 0 0 0 0 0 0 0 0 0 0 0 0

6 8 Adjacency-list digraph representation Digraph Representations

Maintain vertex-indexed array of lists. Digraphs are abstract mathematical objects. ! ADT implementation requires specific representation. Note: one representation of each directed edge ! Efficiency depends on matching algorithms to representations.

0: 5 2 1 6 Edge from Iterate over Representation Space 0 v to w? edges leaving v? 1:

2: List of edges E + V E E

1 2 6 3: Adjacency matrix V 2 1 V 4: 3 Adjacency list E + V outdegree(v) outdegree(v) 5: 4 3 3 4 6: 4

5 7: 8 9 10 8: In practice: Use adjacency-list representation

! 9: 10 11 12 Bottleneck is iterating over edges leaving v.

! 7 8 11 12 10: Real world digraphs are sparse. 11: 12

12: E is proportional to V

9 11

Adjacency-list digraph representation: Java implementation Typical digraph application: Google's PageRank

Same as Graph, but only insert one copy of each edge. Goal. Determine which web pages on Internet are important. Solution. Ignore keywords and content, focus on hyperlink structure. public class Digraph { private int V; adjacency Random surfer model. private SET[] adj; lists ! Start at random page. public Digraph(int V) ! With probability 0.85, randomly select a hyperlink to visit next; { this.V = V; with probability 0.15, randomly select any page. adj = (SET[]) new SET[V]; create empty ! PageRank = proportion of time random surfer spends on each page. V-vertex graph for (int v = 0; v < V; v++) 0 .002 25 1 .017 0 34 2 .009 adj[v] = new SET(); 3 .003 4 .006 7 5 .016 6 .066 } 2 Solution 1: Simulate random surfer for a long time. 10 7 .021 40 8 .017 9 .040 19 33 29 10 .002 41 49 15 11 .028 Solution 2: Compute ranks directly until they converge 12 .006 8 public void addEdge(int v, int w) 44 13 .045 14 .018 45 15 .026 { 16 .023 Solution 3: Compute eigenvalues of adjacency matrix! 28 17 .005 add edge from v to w 1 14 18 .023 adj[v].add(w); 19 .026 22 48 20 .004 (parallel edges allowed) 39 21 .034 22 .063 } 18 6 21 23 .043 24 .011 25 .005 42 13 26 .006 None feasible without sparse digraph representation 23 27 .008 31 47 28 .037 public Iterable adj(int v) 11 12 29 .003 32 30 .037 30 31 .023 { 26 32 .018 33 .013 iterable SET for 5 37 27 34 .024 return adj[v]; 9 16 35 .019 v’s neighbors Every square matrix is a weighted digraph 36 .003 43 37 .031 } 38 .012 24 39 .023 4 40 .017 } 38 41 .021 42 .021 3 17 35 43 .016 36 44 .023 46 20 45 .006 46 .023 10 12 47 .024 48 .019 49 .016

6 22 Page ranks with histogram for a larger example Digraph application: mark-sweep garbage collector

Every is a digraph (objects connected by references)

Roots. Objects known to be directly accessible by program (e.g., stack).

Reachable objects. introduction Objects indirectly accessible by digraph search program (starting at a root and transitive closure following a chain of pointers).

topological sort easy to identify pointers in type-safe language strong components pagerank Mark-sweep algorithm. [McCarthy, 1960] ! Mark: run DFS from roots to mark reachable objects. ! Sweep: if object is unmarked, it is garbage, so add to free list.

Memory cost: Uses 1 extra mark bit per object, plus DFS stack. 13 15

Digraph application: program control-flow analysis

Every program is a digraph (instructions connected to possible successors) Goal. Find all vertices reachable from s along a directed path.

s

Dead code elimination. Find (and remove) unreachable code

can arise from compiler optimization (or bad code)

Infinite loop detection. Determine whether exit is unreachable

can’t detect all possible infinite loops (halting problem)

14 16 Reachability Digraph-processing challenge 1:

Goal. Find all vertices reachable from s along a directed path. Problem: Mark all vertices reachable from a given vertex.

s

Use DFS How difficult? ! 1) any CS126 student could do it 2) need to be a typical diligent CS226 student

3) hire an expert 0-1 4) intractable 0 0-6 0-2 5) no one knows 3-4 1 2 6 3-2 5-4 5-0 3 4 3-5 2-1 5 6-4 3-1

17 19

Digraph-processing challenge 1: Depth-first search in digraphs

Problem: Mark all vertices reachable from a given vertex. Same method as for undirected graphs

Every undirected graph is a digraph ! happens to have edges in both directions ! DFS is a digraph algorithm (never uses that fact)

How difficult? 1) any CS126 student could do it 2) need to be a typical diligent CS226 student DFS (to visit a vertex s) 3) hire an expert 0-1 4) intractable 0 0-6 Mark s as visited. 0-2 5) no one knows 3-4 Visit all unmarked vertices w adjacent to v. 1 2 6 3-2 5-4 5-0 recursive 3 4 3-5 2-1 5 6-4 3-1

18 20 Depth-first search (single-source reachability) Cycle detection

Identical to undirected version (substitute Digraph for Graph). Goal. Find any cycle in the graph

public class DFSearcher s { private boolean[] marked; true if connected to s

public DFSearcher(Digraph G, int s) { marked = new boolean[G.V()]; constructor marks vertices dfs(G, s); connected to s }

private void dfs(Digraph G, int v) { marked[v] = true; recursive DFS for (int w : G.adj(v)) does the work if (!marked[w]) dfs(G, w); }

public boolean isReachable(int v) { client can ask whether return marked[v]; any vertex is } connected to s }

21 23

Digraph application: dependencies among software modules Cycle detection

Every software system is a digraph (modules dependent on others) Goal. Find any cycle in the graph

s Mozilla Internet explorer

Issue: Any cycles?

Can’t find a cycle? The digraph is a DAG (directed acyclic graph) 22 24 Digraph-processing challenge 2: Cycle detection in a digraph: Java implementation

Problem: Does a digraph contain a cycle ? public class CycleDetector { standard DFS Equivalent: Is a digraph a DAG? private boolean[] marked; with a few private boolean[] done; modifications private boolean dagflag;

public CycleDetector(Digraph G) { marked = new boolean[G.V()]; How difficult? done = new boolean[G.V()]; 1) any CS126 student could do it count = G.V(); for (int v = 0; v < G.V(); v++) 2) need to be a typical diligent CS226 student if (!marked[v]) 3) hire an expert dagflag = search(G, v); 0-1 } 4) intractable 0 0-6 0-2 5) no one knows 3-4 private boolean search(Digraph G, int v) 1 2 6 3-2 { 5-4 marked[v] = true; 5-0 for (int w : G.adj(v)) 3 4 3-5 if (!marked[w]) return search(G, w); 2-1 else if (!done[w]) return false; 5 6-4 done[v] = true; add method dag() 1-3 return true; to return dagflag on } client query } 25 27

Digraph-processing challenge 2: Cycle detection in a digraph: Example

Problem: Does a digraph contain a cycle ? Trace of marks set at beginning and end of search() Equivalent: Is a digraph a DAG? adj lists 0: 6 1 2 marked[] done[] 1: 3 2: 1 visit 0: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 3: 4 2 5 4: visit 6: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 5: 0 4 implementation 6: 4 How difficult? visit 4: 1 0 0 0 1 0 1 0 0 0 0 0 0 0 proof ! 1) any CS126 student could do it leave 4: 1 0 0 0 1 0 1 0 0 0 0 1 0 0 0-1 ! 2) need to be a typical diligent CS226 student leave 6: 1 0 0 0 1 0 1 0 0 0 0 1 0 1 0 0-6 3) hire an expert 0-2 0-1 visit 1: 1 1 0 0 1 0 1 0 0 0 0 1 0 1 3-4 4) intractable 0 0-6 1 2 6 3-2 0-2 visit 3: 1 1 0 1 1 0 1 0 0 0 0 1 0 1 5-4 5) no one knows 3-4 5-0 Use DFS 1 2 6 3-2 check 4: ignore since marked and done 3 4 3-5 5-4 2-1 visit 2: 1 1 1 1 1 0 1 0 0 0 0 1 0 1 but prove that it works 5-0 5 6-4 3 4 3-5 check 1: 1-3 2-1 5 6-4 cycle since marked but not done 1-3

26 28 Cycle detection in a digraph: Correctness proof Breadth-first search in digraphs

marked[v] = true && done[v] = false Same method as for undirected graphs we know a directed path from source to v 0-1 Every undirected graph is a digraph 0 0-6 Case 1: no cycle 0-2 ! happens to have edges in both directions 3-4 ! ! search() will never touch a vertex that 1 2 6 3-2 BFS is a digraph algorithm (never uses that fact) is marked and not done 5-4 5-0 ! therefore will return true 3 4 3-5 2-1 5 6-4 BFS (from source vertex s) 3-1 Put s onto a FIFO queue. Case 2: cycle Repeat until the queue is empty: ! search() must be called for some vertex on cycle ! remove the least recently added vertex ! v that one will be found marked and not done 0-1 ! add each of v's unvisited neighbors to the ! return false 0 0-6 0-2 queue and mark them as visited. 3-4 1 2 6 3-2 5-4 5-0 3 4 3-5 2-1 5 6-4 Finds the shortest directed path from s to t 1-3 29 31

Depth First Search Digraph BFS application: Web Crawler

DFS enables direct solution of simple digraph problems. The internet is a digraph ! ! Reachability. ! ! Cycle detection Goal. Crawl Internet, starting from some root website. ! Topological sort Solution. BFS with implicit graph.

! 0 .002 25 1 .017 Transitive closure. 0 34 2 .009 3 .003 ! 4 .006 7 5 .016 Is there a path from s to t ? BFS. 6 .066 2 10 7 .021 40 8 .017 ! 9 .040 19 33 stay tuned Start at some root website 29 10 .002 41 49 15 11 .028 8 12 .006 44 13 .045 Basis for solving difficult digraph problems. ( say http://www.princeton.edu.). 14 .018 45 15 .026 16 .023 ! ! 28 17 .005 Directed Euler path. Maintain a Queue of websites to explore. 1 14 18 .023 19 .026 22 48 20 .004 ! ! 39 21 .034 22 .063 18 6 21 Strong connected components. Maintain a SET of discovered websites. 23 .043 24 .011 25 .005 ! 42 13 26 .006 Dequeue the next website 23 27 .008 31 47 28 .037 11 12 29 .003 32 30 .037 and enqueue websites to which it links 30 31 .023 26 32 .018 33 .013 5 37 27 34 .024 (provided you haven't done so before). 9 16 35 .019 36 .003 43 37 .031 38 .012 24 39 .023 4 40 .017 38 41 .021 42 .021 3 17 35 43 .016 36 44 .023 46 20 45 .006 46 .023 47 .024 Q. Why not use DFS? 48 .019 49 .016 A. Internet is not fixed (some pages generate new ones when visited) 6 22 Page ranks with histogram for a larger example 30 32 Web crawler: Java implementation Graph-processing challenge (revisited)

Queue q = new Queue(); queue of sites to crawl Problem: Is there a path from s to t ? SET visited = new SET(); set of visited sites Assumptions: linear (V + E) preprocessing time String s = "http://www.princeton.edu"; constant query time q.enqueue(s); start crawling from s 0 visited.add(s); 0-1 while (!q.isEmpty()) 1 2 6 0-6 0-2 { How difficult? 4-3 String v = q.dequeue(); read in raw html for next site in queue 1) any CS126 student could do it 3 4 5-3 System.out.println(v); 5-4 In in = new In(v); 2) need to be a typical diligent CS226 student 5 String input = in.readAll(); http://xxx.yyy.zzz 3) hire an expert String regexp = "http://(\\w+\\.)*(\\w+)"; 4) intractable Pattern pattern = Pattern.compile(regexp); use regular expression 5) no one knows Matcher matcher = pattern.matcher(input); to find all URLs in site while (matcher.find()) 6) impossible { String w = matcher.group(); if (!visited.contains(w)) { if unvisited, mark as visited visited.add(w); and put on queue q.enqueue(w); } } } 33 35

Graph-processing challenge (revisited)

Problem: Is there a path from s to t ? Assumptions: linear (V + E) preprocessing time constant query time 0 0-1 1 2 6 0-6 How difficult? 0-2 introduction 4-3 1) any CS126 student could do it 3 4 5-3 digraph search ! 2) need to be a typical diligent CS226 student 5-4 5 transitive closure 3) hire an expert topological sort 4) intractable 5) no one knows Use DFS strong components 6) impossible mark vertices with connected component ID (see lecture on undirected graphs)

34 36 Digraph-processing challenge 3 Transitive Closure

Problem: Is there a directed path from s to t ? The transitive closure of G has an directed edge from v to w Assumptions: linear (V + E) preprocessing time if there is a directed path from v to w in G constant query time graph is usually sparse

How difficult? 1) any CS126 student could do it G 2) need to be a typical diligent CS226 student 3) hire an expert

4) intractable 0-1 5) no one knows 0 0-6 0-2 6) impossible 3-4 TC is usually dense 1 2 6 3-2 so adjacency matrix 5-4 Transitive closure representation is OK 5-0 of G 3 4 3-5 2-1 5 6-4 1-3

37 39

Digraph-processing challenge 3 Digraph-processing challenge 3 (revised)

Problem: Is there a directed path from s to t ? Problem: Is there a directed path from s to t ? Assumptions: linear (V + E) preprocessing time Assumptions: V2 preprocessing time constant query time constant query time

How difficult? How difficult? 1) any CS126 student could do it 1) any CS126 student could do it 2) need to be a typical diligent CS226 student 2) need to be a typical diligent CS226 student 3) hire an expert 3) hire an expert

4) intractable 0-1 4) intractable 0-1 5) no one knows 0 0-6 5) no one knows 0 0-6 0 1 2 3 4 5 6 0-2 0-2 ! 6) impossible 0 1 1 1 0 1 1 0 3-4 6) impossible 3-4 1 0 1 0 0 0 0 0 1 2 6 3-2 1 2 6 3-2 2 0 1 1 0 0 0 0 5-4 5-4 3 0 1 1 1 1 0 0 5-0 5-0 2 V possibilities 4 0 0 0 0 1 0 0 3 4 3-5 3 4 3-5 5 1 1 1 0 1 1 1 2-1 2-1 6 0 0 0 0 1 0 1 5 6-4 5 6-4 3-1 1-3

38 40 Digraph-processing challenge 3 (revised) Digraph-processing challenge 3 (revised again)

Problem: Is there a directed path from s to t ? Problem: Is there a directed path from s to t ? Assumptions: V2 preprocessing time Assumptions: VE preprocessing time constant query time V2 space constant query time

How difficult? 1) any CS126 student could do it How difficult? 2) need to be a typical diligent CS226 student 1) any CS126 student could do it 3) hire an expert ! 2) need to be a typical diligent CS226 student

4) intractable 0-1 3) hire an expert 0-1 ! 5) no one knows 0 0-6 4) intractable 0 0-6 0-2 Use DFS 0-2 6) impossible 3-4 5) no one knows 3-4 once for each 1 2 6 3-2 6) impossible 1 2 6 3-2 5-4 vertex 5-4 5-0 5-0 3 4 3-5 to compute rows of 3 4 3-5 2-1 2-1 5 6-4 transitive closure 5 6-4 1-3 1-3

41 43

Digraph-processing challenge 3 (revised again) Transitive Closure: Java Implementation

Problem: Is there a directed path from s to t ? Use an array of DFSearcher objects, Assumptions: VE preprocessing time one for each row of transitive closure public class DFSearcher 2 { V space private boolean[] marked; public DFSearcher(Digraph G, int s) constant query time { marked = new boolean[G.V()]; dfs(G, s); } private void dfs(Digraph G, int v) { marked[v] = true; public class TransitiveClosure for (int w : G.adj(v)) How difficult? if (!marked[w]) dfs(G, w); { } 1) any CS126 student could do it public boolean isReachable(int v) { private DFSearcher[] tc; return marked[v]; 2) need to be a typical diligent CS226 student } } 3) hire an expert 0-1 public TransitiveClosure(Digraph G) 4) intractable 0 0-6 { 0-2 tc = new DFSearcher[G.V()]; 5) no one knows 3-4 for (int v = 0; v < G.V(); v++) 6) impossible 1 2 6 3-2 tc[v] = new DFSearcher(G, v); 5-4 } 5-0 3 4 3-5 2-1 public boolean reachable(int v, int w) 5 6-4 { 1-3 return tc[v].isReachable(w); is there a directed path from v to w ? } } 42 44 Topological Sort

DAG. Directed acyclic graph.

introduction digraph search Topological sort. Redraw DAG so all edges point left to right. transitive closure topological sort strong components

Observation. Not possible if graph has a directed cycle.

45 47

Digraph application: Scheduling Digraph-processing challenge 4

Scheduling. Given a set of tasks to be completed with precedence Problem: Check that the digraph is a DAG. constraints, in what order should we schedule the tasks? If it is a DAG, do a topological sort. Assumptions: linear (V + E) preprocessing time Graph model. provide client with vertex iterator for topological order ! Create a vertex v for each task. ! Create an edge v"w if task v must precede task w. ! Schedule tasks in topological order. How difficult? tasks 1) any CS126 student could do it 0-1 2) need to be a typical diligent CS226 student 0-6 0. read programming assignment 0-2 1. download files 3) hire an expert 0-5 precedence 2. write code 4) intractable 2-3 constraints 3. attend precept 4-9 … 6-4 12. sleep 5) no one knows 6-9 6) impossible 7-6 8-7 9-10 9-11 9-12 feasible 11-12 schedule

46 48 Digraph-processing challenge 4 Topological sort in a DAG: Correctness proof

Problem: Check that the digraph is a DAG. To topologically sort a DAG. If it is a DAG, do a topological sort. ! Run DFS to compute reverse postorder numbering in ts[]. Assumptions: linear (V + E) preprocessing time ! Use client iterator to return ts[0], ts[1], ts[2], ... provide client with vertex iterator for topological order

A How difficult? implementation Key observation. When DFS backtracks from a vertex v, proof v ! 1) any CS126 student could do it 0-1 all vertices reachable from have already been explored. D ! 2) need to be a typical diligent CS226 student 0-6 0-2 3) hire an expert 0-5 E 4) intractable 2-3 4-9 5) no one knows 6-4 no back edges in DAG 6-9 F 6) impossible 7-6 Use DFS 8-7 9-10 reverse 9-11 H postorder 9-12 11-12 DFS numbering Running time. linear tree

49 51

Topological sort in a DAG: Java implementation Topological sort applications.

public class TopologicalSorter standard DFS { with 3 private int count; extra lines of code ! Causalities. private boolean[] marked; ! Compilation units. private int[] ts; ! Class inheritance. public TopologicalSorter(Digraph G) ! Course prerequisites. { marked = new boolean[G.V()]; ! Deadlocking detection. ts = new int[G.V()]; ! Temporal dependencies. count = G.V(); ! for (int v = 0; v < G.V(); v++) Pipeline of computing jobs. if (!marked[v]) tsort(G, v); ! Check for symbolic link loop. } ! Evaluate formula in spreadsheet. private void tsort(Digraph G, int v) ! Program Evaluation and Review Technique / Critical Path Method { marked[v] = true; for (int w : G.adj(v)) if (!marked[w]) tsort(G, w); ts[--count] = v; } add iterator that returns } ts[0], ts[1], ts[2]...

50 52 Topological sort application (weighted DAG): PERT/CPM

Program Evaluation and Review Technique / Critical Path Method ! Task v takes time[v] units of time. index task time prereq ! Can work on jobs in parallel. A begin 0 - ! Precedence constraints: must finish B framing 4 A task v before beginning task w. C roofing 2 B ! What's earliest we can finish each task? D siding 6 B introduction E windows 5 D F plumbing 3 D digraph search G electricity 4 C, E transitive closure H 6 C, E paint topological sort F I finish 0 F, H D 3 strong components 6 E 5

A B C G H I time[v] 0 4 2 4 6 0

53 55

Program Evaluation and Review Technique / Critical Path Method Graph-processing challenge (revisited again)

PERT/CPM algorithm. Problem: Is there a path from s to t ? ! Compute topological order of vertices. Equivalent: Are s and t connected? ! Initialize fin[v] = 0 for all vertices v. ! Consider vertices v in topological order. Assumptions: linear (V + E) preprocessing time 0 0-1 – for each edge v"w, set fin[w]= max(fin[w], fin[v] + time[w]) constant query time 1 2 6 0-6 0-2 4-3 3 4 5-3 How difficult? 5-4 5 1) any CS126 student could do it 13 2) need to be a typical diligent CS226 student 10 F 3) hire an expert D 3 4) intractable 6 15 E 5) no one knows critical path 5 6) impossible

fin[v] 0 4 6 19 25 25 A B C G H I time[v] 0 4 2 4 6 0

54 56 Graph-processing challenge (revisited again) Strong components: terminology

Problem: Is there a path from s to t ? Def. Vertices v and w are strongly connected if there is a path Equivalent: Are s and t connected? from v to w and a path from w to v. symmetric, transitive, reflexive.

Assumptions: linear (V + E) preprocessing time 0 Strong component. Maximal subset of strongly connected vertices. 0-1 constant query time 1 2 6 0-6 0-2 Kernel DAG.

4-3 ! 3 4 5-3 vertex: set of vertices in same strong component How difficult? 5-4 ! edge: any edge from original graph connecting two vertices 5 1) any CS126 student could do it 0 2 3 ! 7 8 2) need to be a typical diligent CS226 student 4 5 6 3) hire an expert 4) intractable 5) no one knows Use DFS 9 10 6) impossible 1 11 12 sink mark vertices with connected component ID kernel DAG (see lecture on undirected graphs)

57 59

Digraph-processing challenge 5 Typical strong components application: Ecological food web

Problem: Is there a directed cycle containing s and t ? Strong component is subset of species with common energy flow Equivalent: Are there directed paths from s to t and from t to s? ! source in kernel DAG: heading to extinction? Equivalent: Are s and t strongly connected? ! sink in kernel DAG: heading for growth?

Assumptions: linear (V + E) preprocessing time Digraph changes over time constant query time

How difficult? 1) any CS126 student could do it 2) need to be a typical diligent CS226 student 3) hire an expert 4) intractable 5) no one knows 6) impossible

58 60 Typical strong components application: Packaging software Kosaraju's Algorithm

Strong component is subset of mutually interacting modules Simple (but mysterious) algorithm for computing strong components ! Run DFS on GR and compute postorder. Approach: Package together modules in same strong component ! Run DFS on G, considering vertices in reverse postorder.

Theorem. Trees in second DFS are strong components. (!)

61 63

Strong components algorithms: brief history Digraph-processing challenge 5

1960s: Core OR problem Problem: Is there a directed cycle containing s and t ? ! widely studied Equivalent: Are there directed paths from s to t and from t to s? ! some practical algorithms Equivalent: Are s and t strongly connected? ! complexity not understood Assumptions: linear (V + E) preprocessing time 1972: Linear-time DFS algorithm (Tarjan) constant query time ! classic algorithm

! Use DFS level of difficulty: CS226++ implementation ! demonstrated broad applicability and importance of DFS How difficult? (twice) ! 1) any CS126 student could do it 1980s: Easy two-pass linear-time algorithm (Kosaraju) 2) need to be a typical diligent CS226 student proof ! forgot notes for teaching algorithms class ! 3) hire an expert (well, maybe a CS341 student) ! developed algorithm in order to teach it! 4) intractable ! later found in Russian scientific literature (1972) 5) no one knows 6) impossible 1990s: More easy linear-time algorithms (Gabow, Mehlhorn) ! Gabow: fixed old OR algorithm ! Mehlhorn: needed one-pass algorithm for LEDA 62 64 Digraph-processing summary: Algorithms of the day

Single-source DFS reachability

cycle detection DFS

transitive closure DFS from each vertex

topological sort DFS (DAG)

Kosaraju strong components DFS (twice)

65