
4 Basic graph theory and algorithms

References: [DPV06, Ros11].

4.1 Basic graph definitions

Definition 4.1. A graph G = (V,E) is a set V of vertices and a set E of edges. Each edge e ∈ E is associated with two vertices u and v from V, and we write e = (u, v). We say that u is adjacent to v, that u and v are incident to e, and that u is a neighbor of v.

Graphs are a common abstraction to represent data. Some examples include: road networks, where the vertices are cities and there is an edge between any two cities that share a highway; protein interaction networks, where the vertices are proteins and the edges represent interactions between proteins; and social networks, where the nodes are people and the edges represent friendships. Sometimes we want to associate a direction with the edges to indicate a one-way relationship. For example, consider a predator-prey network where the vertices are animals and an edge represents that one animal hunts the other. We want to express that the fox hunts the mouse but not the other way around. This naturally leads to directed graphs:

Definition 4.2. A directed graph G = (V,E) is a set V of vertices and a set E of edges. Each edge e ∈ E is an ordered pair of vertices from V. We denote the edge from u to v by (u, v) and say that u points to v. This could be useful, for example, in making a graph representation of a road network where some streets are one-way.

Here are some other common definitions that you should know:

Definition 4.3. A weighted graph G = (V, E, w) is a graph (V,E) with an associated weight function w : E → R. In other words, each edge e has an associated weight w(e). This could be useful for incorporating the lengths of roads in a graph representation of a road network.

Definition 4.4. In an undirected graph, the degree of a node u ∈ V is the number of edges e = (u, v) ∈ E. We will write du to denote the degree of node u.

Lemma 4.5 (Handshake lemma). In an undirected graph, there are an even number of nodes with odd degree.

Proof. Let dv be the degree of node v. Then

2|E| = Σ_{v∈V} dv = Σ_{v∈V : dv even} dv + Σ_{v∈V : dv odd} dv.

The terms 2|E| and Σ_{v∈V : dv even} dv are even, so Σ_{v∈V : dv odd} dv must also be even. A sum of odd numbers is even only if it has an even number of terms, so the number of nodes with odd degree is even.

Definition 4.6. In a directed graph, the in-degree of a node u ∈ V is the number of edges that point to u, i.e., the number of edges (v, u) ∈ E. The out-degree of u is the number of edges that point away from u, i.e., the number of edges (u, v) ∈ E.

Definition 4.7. A path on a graph is a sequence of nodes v1, v2, . . . , vk such that the edge (vi, vi+1) ∈ E for i = 1, . . . , k − 1 (see Figure 2).

Figure 2: An example of a path (left) and a cycle (right).

Definition 4.8. A cycle is a path v1, v2, . . . , vk with k ≥ 3 such that v1 = vk (see Figure 2).

Definition 4.9. A complete graph is a graph where for every u, v ∈ V with u ≠ v, the edge (u, v) ∈ E.

Definition 4.10. The complement of a graph G is the graph G′ such that (u, v) ∈ G′ ⟺ (u, v) ∉ G.

4.2 Storing a graph

Adjacency matrix (rows and columns indexed by nodes 1–5):

     1 2 3 4 5
1  [ 0 0 0 0 0 ]
2  [ 0 0 1 1 0 ]
3  [ 0 1 0 1 0 ]
4  [ 0 1 1 0 1 ]
5  [ 0 0 0 1 0 ]

Adjacency lists:

1: (no neighbors)
2: 3, 4
3: 2, 4
4: 2, 3, 5
5: 4

Figure 3: Example graph with adjacency matrix and adjacency list representations.

How do we actually store a graph on a computer? There are two common methods: an adjacency matrix or an adjacency list. In the adjacency matrix, we store a matrix A of size |V|×|V| with entry Aij = 1 if edge (i, j) is in the graph, and Aij = 0 otherwise. The main advantage of an adjacency matrix is that we can query the existence of an edge in Θ(1) time—we just look up the entry in the matrix. The disadvantage is that storage is Θ(|V|²). In many graphs, the number of edges is far less than |V|², which makes adjacency matrices impractical for many problems. However, adjacency matrices are still a useful tool for understanding graphs. Adjacency lists, on the other hand, only need memory proportional to the size of the graph. They consist of a length-|V| array of linked lists. The ith linked list contains all nodes adjacent to node i (also called node i's neighbor list). We can access the neighbor list of i in Θ(1) time, but querying for an edge takes O(dmax) time, where dmax = maxj dj. You can see examples of these two storage methods in Figure 3.
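As a concrete illustration (not from the original notes), here is a minimal Python sketch that builds both representations for the undirected graph of Figure 3; the variable names n, edges, A, and adj are our own.

# Build adjacency-matrix and adjacency-list representations of the
# undirected graph in Figure 3 (nodes numbered 1..5).
n = 5
edges = [(2, 3), (2, 4), (3, 4), (4, 5)]

# Adjacency matrix: Theta(|V|^2) storage, Theta(1) edge queries.
A = [[0] * (n + 1) for _ in range(n + 1)]   # index 0 unused
for u, v in edges:
    A[u][v] = 1
    A[v][u] = 1

# Adjacency list: Theta(|V| + |E|) storage, O(d_max) edge queries.
adj = {v: [] for v in range(1, n + 1)}
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)

print(A[2][4])   # 1: edge (2,4) exists
print(adj[4])    # [2, 3, 5]: neighbors of node 4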

4.3 Graph exploration

Definition 4.11. We say that node v is reachable from node u if there is a path from u to v.

We now cover two ways of exploring a graph: depth-first search (DFS) and breadth-first search (BFS). The goal of these algorithms is to find all nodes reachable from a given node, or simply to explore all nodes in a graph. We will describe the algorithms for undirected graphs, but they generalize to directed graphs.

4.3.1 Depth-first search

In depth-first search we explore a graph by starting at a vertex and then following a path of unexplored vertices away from this vertex as far as possible (and repeating). This construction of DFS comes from [DPV06]. We will use an auxiliary routine called Explore(v), where v specifies a starting node. We use a vector C to keep track of which nodes have already been explored. The routine is described in Algorithm 7. We have included functions Pre-visit() and Post-visit() before and after exploring a node. We will reserve these functions for book-keeping to help us understand other algorithms. For example, Pre-visit() may record the order in which nodes were explored.

Data: Graph G = (V,E), starting node v, vector of covered nodes C
Result: C[u] = true for all nodes u reachable from v
C[v] ← true
for (v, w) ∈ E do
    if C[w] is not true then
        Pre-visit(w)
        Explore(w)
        Post-visit(w)
    end
end
Algorithm 7: Function Explore(v)

Given the Explore routine, we have a very simple DFS algorithm, presented in Algorithm 8. The cost of DFS is O(|V| + |E|): we call Explore() on a given node at most once, and within that call we consider each of the node's incident edges exactly once.

Data: Graph G = (V,E)
Result: depth-first-search exploration of G
for v ∈ V do
    // Initialize covered nodes
    C[v] ← false
end
for v ∈ V do
    if C[v] is not true then
        Explore(v)
    end
end
Algorithm 8: Function DFS(G)
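Below is a rough Python translation of Algorithms 7 and 8 (ours, not the notes'); it assumes the graph is given as an adjacency-list dictionary adj like the one built above, and it leaves the Pre-visit/Post-visit hooks as no-ops.

def dfs(adj):
    """Depth-first search over all nodes; runs in O(|V| + |E|)."""
    covered = {v: False for v in adj}          # the vector C

    def pre_visit(v):                          # book-keeping hooks,
        pass                                   # left empty here

    def post_visit(v):
        pass

    def explore(v):                            # Algorithm 7
        covered[v] = True
        for w in adj[v]:
            if not covered[w]:
                pre_visit(w)
                explore(w)
                post_visit(w)

    for v in adj:                              # Algorithm 8
        if not covered[v]:
            explore(v)
    return covered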

We now look at an application of DFS: topological ordering of DAGs.

Definition 4.12. A directed acyclic graph (DAG) is a directed graph that has no cycles.

DAGs characterize an important class of graphs. One example of a DAG is a schedule where some tasks can’t start before others have finished. There is an edge (u, v) if task u must finish before task v. If a cycle exists, then the schedule cannot be completed. We now investigate how to order the tasks to create a valid sequential schedule.

Definition 4.13. A topological ordering of the nodes of a directed graph is an ordering such that if order[v] < order[w], then there is no path from w to v.

Example 4.14. Suppose we define the post-visit function to simply record a running counter and then increment it, as in Algorithm 9. Ordering the nodes by decreasing post[v] after a DFS gives a topological ordering of a DAG.

Proof. Suppose not. Then there are nodes v and w such that post[v] > post[w] and there is a path from w to v. If we called the explore function on w first, then post[v] < post[w] as there is a path from w to v. This is a contradiction. Suppose instead that we called the explore function on v first. If w was reachable from v, then there is a cycle in the graph. Hence, w must not be reachable from v. But then post[w] would be larger than post[v] since we explored v first. Again, we have a contradiction.

// Post-visit(v)
// count is initialized to 0 before running the algorithm
post[v] ← count
count ← count + 1
Algorithm 9: Post-visit function to find topological orderings
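The sketch below (our own, not from the notes) combines DFS with the post-visit counter of Algorithm 9 to produce a topological ordering. Note that here every node, including the one each DFS call starts from, is post-visited when its recursive exploration finishes, so every node receives a post number.

def topological_order(adj):
    """Order DAG nodes by decreasing post-visit number (Algorithm 9)."""
    covered = {v: False for v in adj}
    post = {}
    count = 0

    def explore(v):
        nonlocal count
        covered[v] = True
        for w in adj[v]:
            if not covered[w]:
                explore(w)
        post[v] = count          # Post-visit(v): record exit time
        count += 1

    for v in adj:
        if not covered[v]:
            explore(v)
    # Decreasing post number gives a topological ordering.
    return sorted(adj, key=lambda v: post[v], reverse=True)

# Example DAG: 1 -> 2 -> 3 and 1 -> 3
print(topological_order({1: [2, 3], 2: [3], 3: []}))   # [1, 2, 3]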

Figure 4: Left: a tree's nodes numbered by DFS. Right: the same tree's nodes numbered by BFS.

4.3.2 Breadth-first search

With DFS, we went as far along a path away from the starting node as we could. With breadth-first search (BFS), we instead look at all neighbors first, then neighbors of neighbors, and so on. For an example of the difference between these two search methods, see Figure 4. For BFS, we use a queue. A queue is an object that supports the following two operations:

• enqueue(x): puts the data element x at the back of the queue

• dequeue(): returns the oldest element in the queue

Algorithm 10 describes BFS and also keeps track of the distance (number of edges on a shortest path) from a source node s to all nodes reachable from s.

Data: Graph G = (V,E), source node s
Result: For all nodes t reachable from s, dist[t] is set to the length of the shortest path from s to t; dist[t] is set to ∞ for nodes not reachable from s
for v ∈ V do
    dist[v] ← ∞
end
dist[s] ← 0
Initialize queue Q
Q.enqueue(s)
while Q is not empty do
    u ← Q.dequeue()
    for (u, v) ∈ E do
        if dist[v] is ∞ then
            Q.enqueue(v)
            dist[v] ← dist[u] + 1
        end
    end
end
Algorithm 10: Breadth-first search
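A possible Python rendering of Algorithm 10 (ours, not from the notes), again assuming the graph is stored as an adjacency-list dictionary adj:

from collections import deque
import math

def bfs_distances(adj, s):
    """Algorithm 10: BFS from s, returning shortest-path edge counts."""
    dist = {v: math.inf for v in adj}
    dist[s] = 0
    q = deque([s])
    while q:
        u = q.popleft()                 # dequeue the oldest element
        for v in adj[u]:
            if dist[v] == math.inf:     # first time v is reached
                dist[v] = dist[u] + 1
                q.append(v)             # enqueue v
    return dist

# Example: path graph 1 - 2 - 3
print(bfs_distances({1: [2], 2: [1, 3], 3: [2]}, 1))  # {1: 0, 2: 1, 3: 2}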

4.4 Connected components

Definition 4.15. An induced subgraph of G = (V,E) specified by V′ ⊆ V is the graph G′ = (V′, E′), where E′ = {(u, v) ∈ E | u, v ∈ V′}.

Definition 4.16. In an undirected graph G, a connected component is a maximal induced subgraph G′ = (V′, E′) such that for all nodes u, v ∈ V′, there is a path from u to v in G′.

By “maximal”, we mean that there is no V′′ such that V′ ⊂ V′′ and the induced subgraph specified by V′′ is also connected.

Definition 4.17. In a directed graph G, a strongly connected component is a maximal induced subgraph G′ = (V′, E′) such that for all nodes u, v ∈ V′, there are paths from u to v and from v to u in G′.

Definition 4.18. An undirected graph is connected if G is a connected component. A directed graph is strongly connected if G is a strongly connected component.

We now present a simple algorithm for finding connected components in an undirected graph. First, we update the DFS procedure (Algorithm 8) to increment a counter count (initialized to 0) before each call to Explore(). Then the pre-visit function of Algorithm 11 assigns connected component labels.

// Pre-visit(v)
connected component[v] ← count
Algorithm 11: Pre-visit for connected components in undirected graphs
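A short Python sketch of this labeling scheme (our own translation, not the notes'); here the label dictionary doubles as the "covered" marker.

def connected_components(adj):
    """Label each node of an undirected graph with a component number,
    following the DFS + Pre-visit scheme of Algorithm 11."""
    label = {}                     # connected component[v]
    count = 0

    def explore(v):
        label[v] = count           # Pre-visit(v)
        for w in adj[v]:
            if w not in label:
                explore(w)

    for v in adj:
        if v not in label:
            count += 1             # a new component is found
            explore(v)
    return label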

4.5 Trees

Definition 4.19. A tree is a connected, undirected graph with no cycles.

Definition 4.20. A rooted tree is a tree in which one node is called the root, and edges are directed away from the root.

Note that an undirected tree can be transformed into a rooted tree by choosing any vertex as the root, since there are no cycles.

Definition 4.21. In a rooted tree, for each directed edge (u, v), u is the parent of v and v is the child of u.

Note that the parent of a node v must be unique.

Definition 4.22. In a rooted tree, a leaf is a node with no children.

Note that every finite tree must have at least one leaf node. To see this, start at an arbitrary vertex and follow any path that never revisits a vertex; since the tree is finite and has no cycles, the path must stop somewhere, and where it stops we have found a leaf. In fact, every tree with at least two nodes has at least two leaves: start from the leaf found above instead of an arbitrary vertex; following an arbitrary path away from it necessarily leads to a second leaf, which cannot be the same as the first because there are no cycles.

Theorem 4.23. A tree with n vertices has n − 1 edges.

Proof. We go by induction on the number of nodes. With n = 1, there are no edges. Now suppose the claim holds for trees with k − 1 nodes, and consider a tree with k nodes. Let v be a leaf node and consider removing v and its single incident edge (u, v) from the tree. The remaining graph is a tree with k − 1 nodes and one fewer edge, so by the inductive hypothesis it has k − 2 edges. Therefore, the original tree had k − 1 edges.

4.5.1 Minimum spanning trees

Definition 4.24. A spanning tree of an undirected graph G = (V,E) is a tree T = (V, E′), where E′ ⊆ E.

Definition 4.25. A minimum spanning tree T = (V, E′) of a weighted, connected, undirected graph G = (V, E, w) is a spanning tree for which the sum of the edge weights of E′ is minimal.

Two algorithms for finding minimum spanning trees are Prim's algorithm (Algorithm 12) and Kruskal's algorithm (Algorithm 13). Both procedures are straightforward. We will analyze Kruskal's algorithm for correctness to get used to this type of analysis. Note that you will go over Kruskal's algorithm in CME 305, and by the end of that class you will see more generally why it works (because greedy algorithms work for this kind of optimization problem). You can find a proof of correctness for Prim's algorithm at https://en.wikipedia.org/wiki/Prim%27s_algorithm#Proof_of_correctness.

Data: Connected weighted undirected graph G = (V, E, w)
Result: Minimum spanning tree T = (V, E′)
S ← {v} for an arbitrary node v ∈ V
E′ ← ∅
while |S| < |V| do
    (u, v) ← arg min_{u∈S, v∉S} w((u, v))
    S ← S ∪ {v}
    E′ ← E′ ∪ {(u, v)}
end
T ← (V, E′)
Algorithm 12: Prim's algorithm for minimum spanning trees
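The following is one possible Python sketch of Prim's algorithm (ours, not from the notes); instead of the literal arg-min scan of Algorithm 12 it keeps candidate edges in a binary heap, which gives an O(|E| log |V|) running time. The input format adj_w, a dictionary mapping each node to a list of (weight, neighbor) pairs, is our own choice.

import heapq

def prim(adj_w, start):
    """Prim's algorithm with a binary heap of candidate edges."""
    in_tree = {start}                       # the set S
    tree_edges = []                         # the set E'
    heap = [(w, start, v) for (w, v) in adj_w[start]]
    heapq.heapify(heap)
    while heap and len(in_tree) < len(adj_w):
        w, u, v = heapq.heappop(heap)       # lightest edge leaving the tree
        if v in in_tree:
            continue                        # stale entry, both ends in S
        in_tree.add(v)
        tree_edges.append((u, v, w))
        for (w2, x) in adj_w[v]:
            if x not in in_tree:
                heapq.heappush(heap, (w2, v, x))
    return tree_edges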

Data: Connected weighted undirected graph G = (V, E, w)
Result: Minimum spanning tree T = (V, E′)
E′ ← ∅
foreach e = (u, v) ∈ E by increasing w(e) do
    if adding (u, v) to E′ does not create a cycle then
        E′ ← E′ ∪ {(u, v)}
    end
end
T ← (V, E′)
Algorithm 13: Kruskal's algorithm for minimum spanning trees
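One way to implement Kruskal's algorithm in Python (a sketch of ours, not from the notes) is to answer the cycle test with a simple union-find structure; the input format, a list of (weight, u, v) triples over nodes 0, ..., n−1, is our own choice.

def kruskal(n, edges):
    """Kruskal's algorithm (Algorithm 13) with union-find."""
    parent = list(range(n))

    def find(x):                        # representative of x's component
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    tree_edges = []
    for w, u, v in sorted(edges):       # edges by increasing weight
        ru, rv = find(u), find(v)
        if ru != rv:                    # adding (u, v) creates no cycle
            parent[ru] = rv             # merge the two components
            tree_edges.append((u, v, w))
    return tree_edges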

Theorem 4.26. When all of the weights of G are distinct, the output T from Kruskal's algorithm is a minimum weight spanning tree for the connected graph G.

Proof. First we notice that T must be a spanning tree of G: by construction it has no cycles, and it must be connected and spanning, because otherwise T would have multiple connected components, and since G is connected the algorithm would have considered at least one edge connecting these components and added at least one of them (no cycle could have formed). So T is a spanning tree of G.

Now we must show that T is a minimum-weight spanning tree. For contradiction, assume that it is not, and let T′ be a minimum-weight spanning tree of G (so we are assuming w(T′) < w(T)). Let e be the first edge added to T during Kruskal's algorithm that is not in T′. Then T′ ∪ {e} contains a cycle. One of the edges in this cycle must not be in T, because T is a tree and therefore contains no cycles; call this edge f. Then T′′ = (T′ ∪ {e}) \ {f} is also a spanning tree, of weight w(T′) + w(e) − w(f). Because T′ is a minimum spanning tree and the weights are distinct, we must have w(e) > w(f). But we also know w(e) < w(f), because otherwise Kruskal's algorithm would have considered f before e and added it: every edge added to T before e lies in T′, and f ∈ T′, so adding f could not have created a cycle. Therefore we have a contradiction, and T is a minimum spanning tree.

Note that if we use the correct data structures for our graph, the dominant step in this algorithm is sorting the edges by weight. Therefore, the running time of Kruskal's algorithm is O(|E| log |E|).

Exercise 4.27. Adapt the proof of correctness for Kruskal's algorithm to work even when not all of the edge weights of G are distinct.

4.6 Cycles

Definition 4.28. A Hamiltonian cycle of a graph G is a cycle that visits each node of G exactly once.

Definition 4.29. A Hamiltonian path of a graph G is a path that visits each node of G exactly once.

Definition 4.30. An Eulerian cycle of a graph G is a cycle that visits each edge of G exactly once.

Definition 4.31. An Eulerian path of a graph G is a path that visits each edge of G exactly once.

Exercise 4.32. Show the following:

• A connected graph is Eulerian (has an Eulerian cycle) if and only if the degree of every vertex is even.

• A graph contains an Eulerian path if and only if it is connected and has exactly two vertices of odd degree.

Example 4.33. Suppose the minimum degree of a graph G is k. Show that the graph contains a cycle of length at least k + 1.

Proof. Consider the longest path in G, and let u, v ∈ V be its endpoints. All of the neighbors of u must lie on this path; otherwise we could extend it and it would not be the longest path. Since the minimum degree is k, u has at least k neighbors on the path. Let w be the neighbor of u that is farthest from u along the path. Following the path from u to w includes at least k edges, and (u, w) ∈ E because w is a neighbor of u. Therefore traveling along the path from u to w and then back to u along the edge (u, w) gives a cycle of length ≥ k + 1. The same argument shows that v also lies on a cycle of length ≥ k + 1. Drawing a picture can help you understand this example.

Note that this technique of considering the longest path of a graph when working under assumptions about the minimum degree can be very useful. You will see in class, using this technique (along with some other ideas), that if the minimum degree of a graph is greater than or equal to |V|/2, then the graph contains a Hamiltonian cycle, usually a very hard property to determine.

Exercise 4.34. If C is a cycle, and e is an edge connecting two nonadjacent nodes of C, then we call e a chord. Show that if every node of a graph G has degree at least 3, then G contains a cycle with a chord.

4.7 Cuts

Definition 4.35. A cut C = (A, B) of a graph G = (V,E) is a partition of V into two subsets A and B. The cut-set of C is the set of edges that cross the cut, i.e., {(u, v) ∈ E | u ∈ A, v ∈ B}.

We are almost always interested in the number of edges in the cut-set (or their total weight in the case of weighted graphs).

Definition 4.36. An s-t minimum cut is the cut C∗ = (S, T ) with s ∈ S and t ∈ T such that the cut-set is of minimum weight.

Definition 4.37. A global minimum cut is a cut C∗ = (A, B) in the graph G such that both A and B are non-empty and the cut-set has minimum weight.

Definition 4.38. A global maximum cut is a cut C∗ = (A, B) in the graph G such that both A and B are non-empty and the cut-set has maximum weight.

You will see in class that s-t minimum cuts are closely related to network flow problems and are very useful for scheduling problems, among other applications. Both kinds of minimum cuts are easy to compute, while maximum cuts are very hard.

4.8 Other definitions and examples

Definition 4.39. A bipartite graph is a graph whose vertices can be divided into two disjoint sets A and B such that every edge connects a vertex in A to one in B.

Although we said earlier that maximum cuts are typically difficult to compute, they are very easy for bipartite graphs: a global maximum cut of a bipartite graph with nonnegative weights is simply C = (A, B).

Exercise 4.40. Show that every tree is bipartite.

Note that we can further prove that a graph is bipartite if and only if it contains no cycles of odd length.
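One direction of this equivalence is easy to check algorithmically. The BFS two-coloring sketch below (ours, not from the notes) either splits the vertices into the two sides A and B, or finds an edge whose endpoints received the same color, which witnesses an odd cycle.

from collections import deque

def two_color(adj):
    """Try to 2-color an undirected graph; returns (A, B) if bipartite,
    or None if some edge joins two nodes of the same color (odd cycle)."""
    color = {}
    for s in adj:
        if s in color:
            continue
        color[s] = 0
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]
                    q.append(v)
                elif color[v] == color[u]:
                    return None               # odd cycle found
    A = {v for v in color if color[v] == 0}
    B = {v for v in color if color[v] == 1}
    return A, B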

Definition 4.41. An independent set is a set of vertices I ⊆ V in a graph such that no two vertices in the set are adjacent.

Definition 4.42. A vertex cover of a graph is a set of vertices C ⊆ V such that each edge of the graph is incident to at least one vertex in C.

Definition 4.43. An edge cover of a graph is a set of edges C ⊆ E such that every vertex of the graph is incident to at least one edge in C.

Note that we can see immediately from the above definitions that each side of a bipartite graph is necessarily both an independent set and a vertex cover. Using this fact and Exercise 4.40, we can also see that every tree on n nodes has a vertex cover of size at most ⌈(n−1)/2⌉.

Definition 4.44. A matching M in a graph G is a set of edges such that no two edges share a common vertex.

Definition 4.45. A maximal matching is a matching M of a graph G with the property that if any edge of G not already in M is added to M, the set is no longer a valid matching.

Definition 4.46. A maximum matching of a graph G is a matching that contains the greatest number of edges.

We see that all maximum matchings must also be maximal, but the opposite is certainly not true.
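The difference is easy to see in code. The greedy sketch below (ours, not from the notes) always produces a maximal matching, but as the example shows it can miss the maximum one.

def greedy_maximal_matching(edges):
    """Build a maximal (not necessarily maximum) matching greedily:
    scan the edges once and keep any edge whose endpoints are both free."""
    matched = set()
    matching = []
    for u, v in edges:
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching

# On the path 1-2-3-4, scanning (2,3) first gives the maximal matching
# [(2, 3)], while the maximum matching [(1, 2), (3, 4)] has two edges.
print(greedy_maximal_matching([(2, 3), (1, 2), (3, 4)]))   # [(2, 3)]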

Definition 4.47. A perfect matching of a graph G is a matching such that for every vertex v in G there exists an edge in the matching that is incident to v.

Example 4.48. Assume that you are given a graph G = (V,E), a matching M in G and an independent set S in G. Show that |M| + |S| ≤ |V|.

Proof. Let M∗ be a maximum matching of G and S∗ be a maximum independent set of G. For each of the |M∗| matched pairs, at most one of the two endpoints can belong to S∗ (the endpoints are adjacent), and these pairs are vertex-disjoint, so |S∗| ≤ |V| − |M∗|. Hence |S∗| + |M∗| ≤ |V|, and because M∗ and S∗ are the maximum-size matching and independent set respectively, |S| + |M| ≤ |S∗| + |M∗| ≤ |V| for all matchings M and independent sets S.
