CSE 548: (Design and) Analysis of Algorithms
Graphs
R. Sekar

Topics: Intro, DFS, Biconnectivity, DAGs, SCC, BFS, Paths and Matrices

Overview

Graphs provide a concise representation of a range of problems:
  - Map coloring, and more generally, resource contention problems
  - Networks: communication, traffic, social, biological, ...

Figure 3.1 (a) A map, and (b) its graph.

[...] necessary to use edges with directions on them. There can be directed edges e from x to y (written e = (x, y)), or from y to x (written (y, x)), or both. A particularly enormous example of a directed graph is the graph of all links in the World Wide Web. It has a vertex for each site on the Internet, and a directed edge (u, v) whenever site u has a link to site v: in total, billions of nodes and edges! Understanding even the most basic connectivity properties of the Web is of great economic and social interest. Although the size of this problem is daunting, we will soon see that a lot of valuable information about the structure of a graph can, happily, be determined in just linear time.

Definition and Representations

A graph is a pair G = (V, E), where V is a set of vertices and E is a set of edges. An edge e of the form (v1, v2) is said to span vertices v1 and v2. The edges in a directed graph are directed.

A graph G' = (V', E') is called a subgraph of G if V' ⊆ V and E' includes every edge in E between vertices in V'.

3.1.1 How is a graph represented?

We can represent a graph by an adjacency matrix; if there are n = |V| vertices v_1, ..., v_n, this is an n × n array whose (i, j)th entry is

  a_ij = 1 if there is an edge from v_i to v_j,
  a_ij = 0 otherwise.

For undirected graphs, the matrix is symmetric since an edge {u, v} can be taken in either direction.

The biggest convenience of this format is that the presence of a particular edge can be checked in constant time, with just one memory access. On the other hand, the matrix takes up O(n^2) space, which is wasteful if the graph does not have very many edges.

An alternative representation, with size proportional to the number of edges, is the adjacency list. It consists of |V| linked lists, one per vertex. The linked list for vertex u holds the names of vertices to which u has an outgoing edge, that is, vertices v for which (u, v) ∈ E.

Adjacency matrix: A graph (V = {v_1, ..., v_n}, E) can be represented by an n × n matrix a, where a_ij = 1 iff (v_i, v_j) ∈ E.

Adjacency list: Each vertex v is associated with a linked list consisting of all vertices u such that (v, u) ∈ E.

Note that the adjacency matrix uses O(n^2) storage, while the adjacency list uses O(|V| + |E|) storage. Both can represent directed as well as undirected graphs.

Depth-First Search (DFS)

A technique for traversing all vertices in the graph.
Very versatile; it forms the linchpin of many graph algorithms.

dfs(V, E)
  foreach v ∈ V do visited[v] = false
  foreach v ∈ V do
    if not visited[v] then explore(V, E, v)

explore(V, E, v)
  visited[v] = true
  previsit(v)    /* A placeholder for now */
  foreach (v, u) ∈ E do
    if not visited[u] then explore(V, E, u)
  postvisit(v)   /* Another placeholder */

Graphs, Mazes and DFS

Figure 3.2 Exploring a graph is rather like navigating a maze.

If a maze is represented as a graph, then DFS of the graph amounts to an exploration and mapping of the maze.

A graph and its DFS tree

Figure 3.4 The result of explore(A) on the graph of Figure 3.2.

DFS uses O(|V|) space and O(|E| + |V|) time.
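As a rough illustration of the two representations and of the explore procedure, the Python sketch below builds an adjacency matrix and an adjacency list from an edge list, then marks every vertex reachable from a start vertex. The function names, the integer vertex labels 0..n-1, and the sample edges are assumptions made for this example. Python lists stand in for the linked lists.

```python
def build_adjacency_matrix(n, edges, directed=False):
    """Return an n x n 0/1 matrix with a[i][j] == 1 iff (i, j) is an edge."""
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i][j] = 1
        if not directed:
            a[j][i] = 1  # an undirected edge makes the matrix symmetric
    return a


def build_adjacency_list(n, edges, directed=False):
    """Return a list of neighbour lists, one per vertex."""
    adj = [[] for _ in range(n)]
    for i, j in edges:
        adj[i].append(j)
        if not directed:
            adj[j].append(i)
    return adj


def explore(adj, v, visited):
    """Recursively mark every vertex reachable from v."""
    visited[v] = True
    for u in adj[v]:
        if not visited[u]:
            explore(adj, u, visited)


def reachable_from(adj, v):
    """Return the set of vertices that explore reaches from v."""
    visited = [False] * len(adj)
    explore(adj, v, visited)
    return {u for u, seen in enumerate(visited) if seen}


if __name__ == "__main__":
    # Hypothetical undirected graph on 5 vertices.
    edges = [(0, 1), (1, 2), (3, 4)]
    matrix = build_adjacency_matrix(5, edges)
    adj = build_adjacency_list(5, edges)
    print(matrix[0][1], matrix[0][2])  # edge test is one array access
    print(adj[1])                      # neighbours of vertex 1
    print(reachable_from(adj, 0))
```

The matrix answers "is (i, j) an edge?" with a single lookup but always stores n^2 entries, whereas the list stores one entry per edge endpoint and is what explore iterates over.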
[...] and whenever you arrive at any junction (vertex) there are a variety of passages (edges) you can follow. A careless choice of passages might lead you around in circles or might cause you to overlook some accessible part of the maze. Clearly, you need to record some intermediate information during exploration.

This classic challenge has amused people for centuries. Everybody knows that all you need to explore a labyrinth is a ball of string and a piece of chalk. The chalk prevents looping, by marking the junctions you have already visited. The string always takes you back to the starting place, enabling you to return to passages that you previously saw but did not yet investigate.

How can we simulate these two primitives, chalk and string, on a computer? The chalk marks are easy: for each vertex, maintain a Boolean variable indicating whether it has been visited already. As for the ball of string, the correct cyberanalog is a stack. After all, the exact role of the string is to offer two primitive operations: unwind to get to a new junction (the stack equivalent is to push the new vertex) and rewind to return to the previous junction (pop the stack).

Instead of explicitly maintaining a stack, we will do so implicitly via recursion (which is implemented using a stack of activation records). The resulting algorithm is the explore procedure:

Figure 3.3 Finding all nodes reachable from a particular node.

procedure explore(G, v)
Input: G = (V, E) is a graph; v ∈ V
Output: visited(u) is set to true for all nodes u reachable from v

visited(v) = true
previsit(v)
for each edge (v, u) ∈ E:
    if not visited(u): explore(u)
postvisit(v)

Figure 3.5 Depth-first search.

procedure dfs(G)
for all v ∈ V:
    visited(v) = false
for all v ∈ V:
    if not visited(v): explore(v)

[...] This loop takes a different amount of time for each vertex, so let's consider all vertices together. The total work done in step 1 is then O(|V|). In step 2, over the course of the entire DFS, each edge {x, y} ∈ E is examined exactly twice, once during explore(x) and once during explore(y). The overall time for step 2 is therefore O(|E|) and so the depth-first search has a running time of O(|V| + |E|), linear in the size of its input. This is as efficient as we could possibly hope for, since it takes this long even just to read the adjacency list.

Figure 3.6 (a) A 12-node graph. (b) DFS search forest.

Figure 3.6 shows the outcome of depth-first search on a 12-node graph, once again breaking ties alphabetically (ignore the pairs of numbers for the time being). The outer loop of DFS calls explore three times, on A, C, and finally F. As a result, there are three trees, each rooted at one of these starting points. Together they constitute a forest.

3.2.3 Connectivity in undirected graphs

An undirected graph is connected if there is a path between any pair of vertices. The graph of Figure 3.6 is not connected because, for instance, there is no path from A to K. However, it does have three disjoint connected regions.

DFS and Connected Components

A connected component of a graph is a maximal subgraph where there is a path between any two vertices in the subgraph, i.e., it is a maximal connected subgraph.

DFS Numbering

Associate pre and post numbers with each visited node by defining previsit and postvisit:

previsit(v)
  pre[v] = clock
  clock++

postvisit(v)
  post[v] = clock
  clock++

Property: For any two vertices u and v, the intervals [pre[u], post[u]] and [pre[v], post[v]] are either disjoint, or one is contained entirely within the other.
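The previsit/postvisit numbering and the interval property can be sketched in Python as follows. This is only an illustration under stated assumptions: the graph is a dict from vertex name to neighbour list, ties are broken alphabetically, the clock starts at 1, and the names dfs_numbering and intervals_nest_or_are_disjoint are invented for this example.

```python
def dfs_numbering(adj):
    """Run DFS over adj (dict: vertex -> list of neighbours).

    Returns (pre, post), the clock values at first discovery and at
    final departure of each vertex. Assumes the clock starts at 1.
    """
    visited = {v: False for v in adj}
    pre, post = {}, {}
    clock = 1

    def explore(v):
        nonlocal clock
        visited[v] = True
        pre[v] = clock          # previsit(v)
        clock += 1
        for u in sorted(adj[v]):
            if not visited[u]:
                explore(u)
        post[v] = clock         # postvisit(v)
        clock += 1

    for v in sorted(adj):       # break ties alphabetically
        if not visited[v]:
            explore(v)
    return pre, post


def intervals_nest_or_are_disjoint(pre, post):
    """Check the interval property for every pair of vertices."""
    vertices = list(pre)
    for i, u in enumerate(vertices):
        for v in vertices[i + 1:]:
            disjoint = post[u] < pre[v] or post[v] < pre[u]
            u_inside_v = pre[v] < pre[u] and post[u] < post[v]
            v_inside_u = pre[u] < pre[v] and post[v] < post[u]
            if not (disjoint or u_inside_v or v_inside_u):
                return False
    return True


if __name__ == "__main__":
    # Hypothetical graph: A is adjacent to B and C; D is isolated.
    adj = {"A": ["B", "C"], "B": ["A"], "C": ["A"], "D": []}
    pre, post = dfs_numbering(adj)
    print(pre)
    print(post)
    print(intervals_nest_or_are_disjoint(pre, post))
```

Because the sketch relies on recursion, very deep graphs may exceed Python's default recursion limit; an explicit stack, as in the chalk-and-string discussion, would avoid that.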
The sets of vertices {A, B, E, I, J}, {C, D, G, H, K, L}, and {F} form regions called connected components: each of them is a subgraph that is internally connected but has no edges to the remaining vertices. When explore is started at a particular vertex, it identifies precisely the connected component containing that vertex. And each time the DFS outer loop calls explore, a new connected component is picked out.

Thus depth-first search is trivially adapted to check if a graph is connected and, more generally, to assign each node v an integer ccnum[v] identifying the connected component to which it belongs. All it takes is

procedure previsit(v)
ccnum[v] = cc

where cc needs to be initialized to zero and to be incremented each time the DFS procedure calls explore.

3.2.4 Previsit and postvisit orderings

We have seen how depth-first search, a few unassuming lines of code, is able to uncover the connectivity structure of an undirected graph in just linear time. But it is far more versatile than this. In order to stretch it further, we will collect a little more information during the exploration process: for each node, we will note down the times of two important events, the moment of first discovery (corresponding to previsit) and that of final departure (postvisit).
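The ccnum bookkeeping from Section 3.2.3 might look like the Python sketch below. It is a minimal illustration under stated assumptions: the graph is an undirected adjacency dict, cc is incremented just before each top-level call to explore (so components are numbered from 1), membership in ccnum doubles as the visited flag, and the function names are invented for this example.

```python
def connected_components(adj):
    """Assign each vertex of an undirected graph a component number.

    adj maps each vertex to a list of its neighbours. Returns ccnum,
    a dict from vertex to an integer component label starting at 1.
    """
    ccnum = {}
    cc = 0  # initialized to zero, as in the text

    def explore(v):
        ccnum[v] = cc           # previsit(v): record the current component
        for u in adj[v]:
            if u not in ccnum:  # not yet visited
                explore(u)

    for v in sorted(adj):
        if v not in ccnum:
            cc += 1             # a new component is about to be picked out
            explore(v)
    return ccnum


def is_connected(adj):
    """A graph is connected when DFS finds at most one component."""
    return len(set(connected_components(adj).values())) <= 1


if __name__ == "__main__":
    # Hypothetical graph with three components.
    adj = {"A": ["B"], "B": ["A"], "C": ["D"], "D": ["C"], "F": []}
    print(connected_components(adj))
    print(is_connected(adj))
```

Each vertex and each adjacency entry is handled once, so the sketch keeps the O(|V| + |E|) running time of plain DFS (ignoring the sort used here only to fix the visiting order).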