Modular Decomposition of Undirected Graphs
Total Page:16
File Type:pdf, Size:1020Kb
THESIS MODULAR DECOMPOSITION OF UNDIRECTED GRAPHS Submitted by Adrian E. Esparza Department of Computer Science In partial fulfillment of the requirements For the Degree of Master of Science Colorado State University Fort Collins, Colorado Summer 2020 Master’s Committee: Advisor: Ross McConnell Sangmi Pallickara Alexander Hulpke Copyright by Adrian E. Esparza 2020 All Rights Reserved ABSTRACT MODULAR DECOMPOSITION OF UNDIRECTED GRAPHS Graphs found in nature tend to have structure and interesting compositions that allow for com- pression of the information and thoughtful analysis of the relationships between vertices. Modular decomposition has been studied since 1967 [1]. Modular decomposition breaks a graph down into smaller components that, from the outside, are inseparable. In doing so, modules provide great po- tential to better study problems from genetics to compression. This paper describes the implemen- tation of a practical algorithm to take a graph and decompose it into it modules in O(n+m log(n)) time. In the implementation of this algorithm, each sub-problem was solved using object ori- ented design principles to allow a reader to extract the individual objects and turn them to other problems of practical interest. The purpose of this paper is to provide the reader with the tools to easily compute: modular decomposition of an undirected graph, partition an undirected graph, depth first search on a directed partially complemented graph, and stack operations with a com- plement stack. The provided implementation of these problems all compute within the time bound discussed above, or even faster for several of the sub problems. ii ACKNOWLEDGEMENTS I would like to thank my family for their ceaseless support. I would also like to thank my advisor Ross McConnell for his guidance and help. The Shindlebowers for treating me like family and caring for my dog. My friends Dave Bessen, Cole Frederick, and Gareth Halladay for ever more unflagging support. In addition a special recognition for Sophia Fantus for brilliant proof reading. Know that this work would not have been possible with out all of you. Thank you and congratulations on work well done. iii DEDICATION I would like to dedicate this thesis to my family, good job getting it done team Esparza. iv TABLE OF CONTENTS ABSTRACT . ii ACKNOWLEDGEMENTS . iii DEDICATION . iv LIST OF FIGURES . vi Chapter 1 Introduction . 1 Chapter 2 Preliminaries . 2 Chapter 3 Modules . 4 Chapter 4 Applications . 10 4.1 Compact representation of a graph . 10 4.2 Max weight independent set . 11 4.3 Cographs . 12 4.4 Transitive Orientation . 14 4.5 Max Clique and Min coloring in comparability graphs. 17 4.6 Related graph classes . 17 Chapter 5 The Algorithm . 19 5.1 Overview . 19 5.2 Finding the maximal modules that do not contain x . 21 5.2.1 Strategy for an efficient implementation . 22 5.2.2 Finding M in practice . 24 5.3 Finding Modules containing x in G′ ..................... 25 5.3.1 Finding the Strongly Connected Components . 27 5.3.2 Building MD(G′) ............................. 30 5.4 Recurse . 31 Bibliography . 32 v LIST OF FIGURES 3.1 X represents a module, Y represents a potential module spoiled by f .......... 4 3.2 The factor is the subgraph induced by X, and the quotient is the original graph with X collapsed to a point. 5 3.3 A graph with several modules within it along with its modular decomposition. 6 4.1 A C5 and steps demonstrating the forcing relation, and how it shows a C5 is not tran- sitively orientable and thus not a comparability graph. 14 vi Chapter 1 Introduction According to Elias Dahlhaus, Jens Gustedt, and Ross M. McConnell [2], modules have been studied since 1967 [1]. Modules represent interesting structures within graphs. A module is some nonempty set X of vertices in a graph G =(V,E), such that ∀u ∈ (V − X), u is either a neighbor of every vertex in X, or a non neighbor of every vertex in X.A spoiler for X is a vertex u ∈ V −X, such that X contains both neighbors and non neighbors of u. X is a module if, and only if, it has no spoilers. The lack of spoilers is what make modules novel structures for study. For example, optimiza- tion problems that are hard in the general case, can be solved in linear time on graphs with a non trivial modular decomposition. The beginning of this thesis will cover related research into mod- ules, problems that have linear time solutions, and graph classes that modules can help identify. The rest of the thesis covers an O(n+m log(n)) algorithm used to find the modular decomposition of a graph as well as describing our implementation of a program that runs said algorithm. In chapter 2, we will discuss general graph terms and definitions that will be used throughout the paper. In chapter 3, we will examine the reasons modules are interesting to other types of graph theory problems. In chapter 4, we will describe how, in using modular decomposition, it is possible to find linear time solutions to max weight independent set, transitive orientation, and detecting interval graphs [3]. In chapter 5, we will consider an algorithm for computing the modular decomposition of a graph in O(n + m log(n)) time [4]. In parallel to describing the algorithm, we will describe an implementation of said algorithm with explanations for design decisions. 1 Chapter 2 Preliminaries A directed graph is represented by a set of vertices V , and a set E of ordered pairs of vertices, known as edges. An undirected graph is a special case of a directed graph, where (x,y) ∈ E ⇐⇒ (y,x) ∈ E. In an undirected graph, the symmetric pair {(x,y), (y,x)} can be denoted as an undirected edge xy. For simplicity, in this thesis, a graph will refer to an undirected graph, unless otherwise indicated. The variables n and m will be reserved for representing the size of the sets V and E, respectively. A subgraph of G is a graph G′ = (V ′,E′), where V ′ ⊆ V and E′ is some subset of E, such that every edge in E′ has both its endpoints in V ′. An induced subgraph is a graph G′ =(V ′,E′), where V ′ ⊆ V and ∀a, b ∈ V ′, (a, b) ∈ E′ iff (a, b) ∈ E. In this thesis, G[X] shall represent the subgraph of G induced by X, where X represents a set of vertices in G. Unless otherwise indicated, all subgraphs referenced in this paper shall be induced subgraphs. When (x,y) is a directed edge, y is adjacent to x. For a given vertex x, the neighborhood of x N(x) will denote {y ∈ V : (x,y) ∈ E}. That is, N(x) will represent the set of vertices in E that x is adjacent to. A complement, G = (V, E) of a graph G = (V,E) retains all the vertices of the original graph, but the edges in G are non edges in G and the non edges in G are edges in G. That is, ∀a, b ∈ V , (a, b) ∈ E iff (a, b) ∈/ E. Two sets X and Y are properly overlapping if their intersection is non empty, but neither is contained within the each other. Written formally, if X,Y are properly overlapping sets then |X ∩ Y | > 0 and |X ∩ Y | < |X| and |X ∩ Y | < |Y |.A partition of a set X is a family {X1,X2,...,Xk} of subsets of X, such that every element in X is a member of exactly one set Xi in the family. A disjoint union of two graphs takes two graphs, G = (V,E) and G′ = (V ′,E′), which share no vertices, and creates a new graph H =({V + V ′}, {E + E′}).A path exists between vertices x and y if there exists a set Z of edges in E, such that starting with x it is possible to traverse the 2 edges of Z and reach y.A cycle exists when there is a set of edge that can be traversed from some vertex x which leads back to x.A DAG is a directed graph with the condition that no cycles exist, also known as a directed acyclic graph. A set of vertices where a path exists between any two vertices of the set is known as a strongly connected component. A clique C within a graph is a maximal set of vertices, such that ∀v,w ∈ C : v =6 w, (v,w) ∈ E. The set of cliques in a graph is denoted CG. The notation, CG(x) , denotes the set of cliques which contain some vertex x, and CG(¯x) denotes the cliques which do not contain x.A valid coloring of a graph is the assignment of colors to vertices of the graph, such that no two vertices of the same color are adjacent to one another. For a graph G =(V,E), |V | is clearly an upper bound on the number of colors needed for a valid coloring, by assigning each vertex a unique color. A min coloring is the minimum number of colors necessary for a valid coloring. It is trivial to show that the size of a max clique is a lower bound on a min coloring. Proof. By contradiction. Consider that a coloring assigns the same color to two vertices in the clique. By the definition of a clique, those two vertices are adjacent; therefore, the min coloring is not valid. Opposite a clique is a kernel, a maximal independent set of vertices in G. For some kernel, K, ∀v,w ∈ K, (v,w) ∈/ E.