Data Structures – LECTURE 14
Strongly connected components
• Definition and motivation
• Algorithm
Chapter 22.5 in the textbook (pp. 552-557).

Connected components
• Find the largest components (sub-graphs) such that there is a path from any vertex to any other vertex.
• Applications: networking, communications.
• Undirected graphs: apply BFS/DFS (inner function) from a vertex, and mark vertices as visited. Upon termination, repeat for every unvisited vertex.
• Directed graphs: strongly connected components, not just connected: a path from u to v AND a path from v to u, which are not necessarily the same!
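The undirected case described above (search from a vertex, mark what is reached, repeat from every unvisited vertex) can be sketched as follows. This is a minimal illustration, not code from the lecture: the function name `connected_components` and the adjacency-dictionary representation are assumptions, and BFS is used where DFS would work equally well.

```python
from collections import deque


def connected_components(adj):
    """Return the connected components of an undirected graph.

    adj maps each vertex to a list of its neighbours (assumed representation);
    every edge {u, v} must appear both in adj[u] and in adj[v].
    """
    visited = set()
    components = []
    for start in adj:                      # repeat for every unvisited vertex
        if start in visited:
            continue
        visited.add(start)
        component = [start]
        queue = deque([start])
        while queue:                       # BFS from start, marking vertices
            u = queue.popleft()
            for v in adj[u]:
                if v not in visited:
                    visited.add(v)
                    component.append(v)
                    queue.append(v)
        components.append(component)
    return components


graph = {1: [2], 2: [1, 3], 3: [2], 4: [5], 5: [4], 6: []}
print(connected_components(graph))  # [[1, 2, 3], [4, 5], [6]]
```

Each vertex and each edge is handled a constant number of times, so the running time is linear in |V|+|E|.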

Data Structures, Spring 2006 © L. Joskowicz

Example: strongly connected components
[Figure: a directed graph on the vertices a, b, c, d, e, f, g, h, shown on two consecutive slides.]


Strongly connected components
• Definition: the strongly connected components (SCC) C1, …, Ck of a directed graph G = (V,E) are the largest disjoint sub-graphs (no common vertices or edges) such that for any two vertices u and v in Ci, there is a path from u to v and from v to u.
• The SCCs are the equivalence classes of mutual reachability, denoted by u ~ v: there is a path from u to v and a path from v to u. One-way reachability alone (a path from u to v) is not symmetric!
• Goal: compute the strongly connected components of G in time linear in the graph size (|V|+|E|).

Strongly connected components graph
• Definition: the SCC graph G~ = (V~,E~) of the graph G = (V,E) is as follows:
  – V~ = {C1, …, Ck}. Each SCC is a vertex.
  – E~ = {(Ci,Cj) | i ≠ j and (x,y) ∈ E, where x ∈ Ci and y ∈ Cj}. A directed edge between components corresponds to a directed edge between any of their vertices.
• G~ is a directed acyclic graph (no directed cycles)!
• Definition: the transpose graph G^T = (V,E^T) of the graph G = (V,E) is G with its edge directions reversed: E^T = {(u,v) | (v,u) ∈ E}.
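The two definitions above, the transpose graph and the edge set of the SCC graph, translate almost directly into code. The sketch below is an illustration under assumptions: graphs are adjacency dictionaries in which every vertex is a key, and `component_of` (a hypothetical name) maps each vertex to the index of its SCC, which is taken as given here.

```python
def transpose(adj):
    """Return G^T: same vertices, every edge (u, v) of G replaced by (v, u)."""
    adj_t = {u: [] for u in adj}
    for u in adj:
        for v in adj[u]:
            adj_t[v].append(u)
    return adj_t


def scc_graph_edges(adj, component_of):
    """Return the edge set of the SCC graph.

    (Ci, Cj) is an edge when i != j and some edge (x, y) of G has
    x in Ci and y in Cj. component_of[x] is the index of x's SCC.
    """
    edges = set()
    for x in adj:
        for y in adj[x]:
            ci, cj = component_of[x], component_of[y]
            if ci != cj:
                edges.add((ci, cj))
    return edges


# Hypothetical example: {a, b} and {c, d} are the two SCCs.
g = {'a': ['b'], 'b': ['a', 'c'], 'c': ['d'], 'd': ['c']}
print(transpose(g))
print(scc_graph_edges(g, {'a': 0, 'b': 0, 'c': 1, 'd': 1}))  # {(0, 1)}
```

Both functions look at every edge once, so each runs in time linear in |V|+|E|.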


Example: SCC graph
[Figure: the example graph G on the vertices a-h and its SCC graph, with components C1, C2, C3, C4.]

Example: transpose graph G^T
[Figure: the example graph G and its transpose G^T.]

SCC algorithm
Idea: compute the SCC graph G~ = (V~,E~) with two DFS, one for G and one for its transpose G^T, visiting the vertices of G^T in order of decreasing finishing time.

SCC(G)
1. DFS(G) to compute finishing times f[v], ∀v ∈ V
2. Compute G^T
3. DFS(G^T), considering the vertices in order of decreasing f[v]
4. Output the vertices of each tree in the DFS forest as a separate SCC.

Example: computing SCC (1)
[Figure: first DFS on the example graph G, each vertex labeled with its discovery/finishing times: 1/6, 2/5, 3/4, 7/16, 8/13, 9/10, 11/12, 14/15.]
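The four steps of SCC(G) can be sketched as below. This is a minimal illustration under assumptions, not the lecture's own code: the graph is an adjacency dictionary with every vertex as a key, the function name is made up, and the searches use explicit stacks instead of recursion. In the second pass only the set of vertices in each tree matters, so a plain stack-based traversal stands in for DFS there.

```python
def strongly_connected_components(adj):
    """Return the SCCs of a directed graph as a list of vertex lists."""
    # Step 1: DFS on G; finish_order lists vertices by increasing f[v].
    visited = set()
    finish_order = []
    for root in adj:
        if root in visited:
            continue
        visited.add(root)
        stack = [(root, iter(adj[root]))]
        while stack:
            u, neighbours = stack[-1]
            for v in neighbours:
                if v not in visited:
                    visited.add(v)
                    stack.append((v, iter(adj[v])))
                    break                    # descend into v first
            else:
                finish_order.append(u)       # all edges of u explored
                stack.pop()

    # Step 2: compute the transpose graph G^T.
    adj_t = {u: [] for u in adj}
    for u in adj:
        for v in adj[u]:
            adj_t[v].append(u)

    # Step 3: search G^T, taking roots in order of decreasing f[v].
    visited.clear()
    components = []
    for root in reversed(finish_order):
        if root in visited:
            continue
        visited.add(root)
        component = []
        stack = [root]
        while stack:
            u = stack.pop()
            component.append(u)
            for v in adj_t[u]:
                if v not in visited:
                    visited.add(v)
                    stack.append(v)
        # Step 4: the vertices of each tree form one SCC.
        components.append(component)
    return components


# Hypothetical example (not the graph from the slides).
g = {'a': ['b'], 'b': ['c'], 'c': ['a', 'd'], 'd': ['e'], 'e': ['d'], 'f': ['e']}
print(sorted(sorted(c) for c in strongly_connected_components(g)))
# [['a', 'b', 'c'], ['d', 'e'], ['f']]
```

Every vertex and edge is processed a constant number of times in each of the three passes, which gives the linear running time stated as the goal.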


Example: computing SCC (2) to (6)
[Figures: the example graph with the discovery/finishing times of the first DFS, then the second DFS run on G^T with each vertex labeled by its finishing time from the first DFS (4, 5, 6, 10, 12, 13, 15, 16). Successive slides show the depth-first trees of G^T as they are found, with per-tree times 1/8, 2/7, 3/6, 4/5; then 1/2; then 1/2; then 1/4, 2/3.]

Labeled transpose graph G^T
[Figure: G^T with the depth-first trees marked as the components C1, C2, C3, C4.]

Proof of correctness: SCC (1)
Lemma 1: Let C and C’ be two distinct SCC of G = (V,E), let u,v ∈ C and u’,v’ ∈ C’. If there is a path from u to u’, then there cannot be a path from v’ to v.
Definition: the discovery and finishing times of a set of vertices U ⊆ V are:
  d[U] = min {d[u] : u ∈ U}
  f[U] = max {f[u] : u ∈ U}

Proof of correctness: SCC (2)
Lemma 2: Let C and C’ be two distinct SCC of G, and let (u,v) ∈ E where u ∈ C and v ∈ C’. Then f[C] > f[C’].
Proof: there are two cases, depending on which strongly connected component, C or C’, is discovered first:
1. C was discovered before C’: d[C] < d[C’]
2. C was discovered after C’: d[C] > d[C’]
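The definitions of d[U] and f[U], and the statement of Lemma 2, can be tried out on a small graph. The sketch below is an illustration only: the function names and the example graph are assumptions, and the recursive DFS is meant for small inputs.

```python
def dfs_times(adj):
    """Run DFS on G and return (d, f): discovery and finishing times."""
    d, f = {}, {}
    time = 0

    def visit(u):
        nonlocal time
        time += 1
        d[u] = time                 # u is discovered
        for v in adj[u]:
            if v not in d:
                visit(v)
        time += 1
        f[u] = time                 # u is finished

    for u in adj:
        if u not in d:
            visit(u)
    return d, f


def set_times(d, f, U):
    """Return (d[U], f[U]) = (min of d[u], max of f[u]) over u in U."""
    return min(d[u] for u in U), max(f[u] for u in U)


# Hypothetical example: C = {a, b}, C' = {c, d}, and (b, c) is an edge from C to C'.
g = {'a': ['b'], 'b': ['a', 'c'], 'c': ['d'], 'd': ['c']}
d, f = dfs_times(g)
print(set_times(d, f, {'a', 'b'}))  # (1, 8)
print(set_times(d, f, {'c', 'd'}))  # (3, 6)
```

Here f[C] = 8 > f[C’] = 6, as Lemma 2 states for an edge from C to C’.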

Example: finishing times
[Figure: the example graph with the discovery/finishing times of the first DFS and its four components.]
f[C1] = 16, f[C2] = 15, f[C3] = 6, f[C4] = 5

Proof of correctness: SCC (3)
1. d[C] < d[C’]: C discovered before C’
• Let x be the first vertex discovered in C.
• At time d[x], there is a path in G from x to each vertex of C consisting only of vertices not yet discovered.
• Because (u,v) ∈ E, for any vertex w ∈ C’, there is also a path at time d[x] from x to w in G consisting only of unvisited vertices: x → u → v → w.
• Thus, all vertices in C and C’ become descendants of x in the depth-first tree.
• Therefore, f[x] = f[C] > f[C’].

Proof of correctness: SCC (4)
2. d[C] > d[C’]: C discovered after C’
• Let y be the first vertex discovered in C’.
• At time d[y], all vertices in C’ are unvisited. There is a path in G from y to each vertex of C’ consisting only of vertices not yet discovered. Thus, all vertices in C’ will become descendants of y in the depth-first tree, and so f[y] = f[C’].
• At time d[y], all vertices in C are unvisited. Since there is an edge (u,v) from C to C’, there cannot, by Lemma 1, be a path from C’ to C. Hence, no vertex in C is reachable from y.

Proof of correctness: SCC (5)
2. d[C] > d[C’] (continued)
• No vertex in C is reachable from y. At time f[y], therefore, all vertices in C are still unvisited.
• Thus, for any vertex w in C: f[w] > f[y], and so f[C] > f[C’].

Proof of correctness: SCC (6)
Corollary: for an edge (u,v) ∈ E^T with u ∈ C and v ∈ C’: f[C] < f[C’].
• This shows what happens during the second DFS.
• The algorithm starts at a vertex x in the SCC C whose finishing time f[C] is maximum. Since there are no edges in G^T from C to any other SCC, the search from x will not visit any other component!
• Once all the vertices of C have been visited, the next SCC is constructed in the same way.

Proof of correctness: SCC (7)
Theorem: The SCC algorithm computes the strongly connected components of a directed graph G.
Proof: by induction on the number of depth-first trees found in the DFS of G^T: the vertices of each tree form a SCC.
Inductive hypothesis: the first k trees produced by the algorithm are SCC.
Basis: for k = 0, this is trivially true.
Inductive step: assume the first k trees produced by the algorithm are SCC. Consider the (k+1)st tree, rooted at u in SCC C. Because roots are chosen in order of decreasing finishing time, f[u] = f[C] > f[C’] for any SCC C’ other than C that has not yet been visited.

Proof of correctness: SCC (8)
• When u is visited, none of the other vertices v in its SCC has been visited. Therefore, all vertices v of C are descendants of u in the depth-first tree.
• By the inductive hypothesis and the corollary, any edges in G^T that leave C must lead to SCC that have already been visited.
• Thus, no vertex in any SCC other than C will be a descendant of u during the depth-first search of G^T.
• Thus, the vertices of the depth-first search tree in G^T that is rooted at u form exactly one strongly connected component.
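The theorem can be sanity-checked on small inputs by computing the components straight from the definition (mutual reachability) and comparing with the output of an SCC algorithm. The sketch below is a hypothetical brute-force reference, with assumed names and an adjacency-dictionary graph; it takes O(|V|·(|V|+|E|)) time, so it is a checking aid rather than a substitute for the linear-time algorithm.

```python
def reachable_from(adj, source):
    """Return the set of vertices reachable from source (including source)."""
    seen = {source}
    stack = [source]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def scc_by_definition(adj):
    """Return the SCCs as a set of frozensets, using only the definition:
    u and v share a component when each is reachable from the other."""
    reach = {u: reachable_from(adj, u) for u in adj}
    components = set()
    for u in adj:
        components.add(frozenset(v for v in reach[u] if u in reach[v]))
    return components


# Hypothetical example (not the graph from the slides).
g = {'a': ['b'], 'b': ['c'], 'c': ['a', 'd'], 'd': ['e'], 'e': ['d'], 'f': ['e']}
print(sorted(sorted(c) for c in scc_by_definition(g)))
# [['a', 'b', 'c'], ['d', 'e'], ['f']]
```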

Uses of the SCC graph
• Articulation: a vertex whose removal disconnects G.
• Bridge: an edge whose removal disconnects G.
• Euler tour: a cycle that traverses all edges of G exactly once (vertices can be visited more than once).
All can be computed in O(|E|) on the SCC graph.

[Figure: the SCC graph of the example, with components C1, C2, C3, C4.]
