Data Reduction and Exact Algorithms for Clique Cover

Data Reduction and Exact Algorithms for Clique Cover JENS GRAMM Eberhard-Karls-Universität Tübingen and JIONG GUO, FALK HUFFNER,¨ and ROLF NIEDERMEIER Friedrich-Schiller-Universität Jena To cover the edges of a graph with a minimum number of cliques is an NP-hard problem with many applications. We develop for this problem efficient and effective polynomial-time data reduction rules that, combined with a search tree algorithm, allow for exact problem solutions in competitive time. This is confirmed by experiments with real-world and synthetic data. Moreover, we prove the fixed-parameter tractability of covering edges by cliques. Categories and Subject Descriptors: F.2.2 [Analysis of Algorithms and Problem Complex- ity]: Nonnumerical Algorithms and Problems; G.2.1 [Discrete Mathematics]: Combinatorics— Combinatorial Algorithms; G.2.2 [Discrete Mathematics]: Graph Theory—Graph algorithms General Terms: Algorithms; Experimentation; Theory Additional Key Words and Phrases: Clique cover; Data reduction; Fixed-parameter tractability 1. INTRODUCTION Data reduction techniques for exactly solving NP-hard combinatorial optimization problems have proven useful in many studies [Guo and Niedermeier 2007]. The point is that by polynomial-time executable reduction rules many input instances of hard combinatorial problems can be significantly shrunk or simplified, without sacrificing the possibility of finding an optimal solution to the given problem. For A preliminary version appeared under the title “Data reduction, exact, and heuristic algorithms for clique cover” in Proceedings of the 8th Workshop on Algorithm Engineering and Experiments (ALENEX ’06), pages 86–94, SIAM 2006. Note that the conference version contains additional material concerning a polynomial-time heuristic which we omit here to improve the focus. Jens Gramm was supported by DFG project OPAL, NI 369/2. Jiong Guo and Falk Hüffner were supported by DFG Emmy Noether research group PIAF, NI 369/4. Authors’ addresses: Jens Gramm, Wilhelm-Schickard-Institut für Informatik, Eberhard-Karls- Universität Tübingen, Sand 13, D-72076 Tübingen, Germany; email: [email protected] tuebingen.de. Jiong Guo, Falk Hüffner, and Rolf Niedermeier, Institut für Informatik, Friedrich-Schiller-Universität Jena, Ernst-Abbe-Platz 2, D-07743 Jena, Germany; email: {guo,hueffner,niedermr}@minet.uni-jena.de. Permission to make digital/hard copy of all or part of this material without fee for personal or classroom use provided that the copies are not made or distributed for profit or commercial advantage, the ACM copyright/server notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists requires prior specific permission and/or a fee. c 20YY ACM 0000-0000/20YY/0000-0001 $5.00 ACM Journal Name, Vol. V, No. N, Month 20YY, Pages 1–0??. 2 · Jens Gramm et al. such reduced instances then often exhaustive search algorithms can be applied to efficiently find optimal solutions. Hence, data reduction techniques are considered as a “must” when trying to cope with computational intractability. Studying the NP-hard problem to cover the edges of a graph with a minimum number of cliques ((Edge) Clique Cover)1, we add a new example to the success story of data reduction, presenting both empirical as well as theoretical findings. Our study problem Clique Cover, also known as Keyword Conflict problem [Kellerman 1973] or Covering by Cliques (GT17) or Intersection Graph Basis (GT59) [Garey and Johnson 1979], has applications in diverse fields such as compiler optimization [Rajagopalan et al. 2000], computational geometry [Agarwal et al. 1994], and applied statistics [Piepho 2004; Gramm et al. 2007]. Thus, it is not surprising that there has been substantial work on (polynomial-time) heuristic algorithms for Clique Cover [Kellerman 1973; Kou et al. 1978; Rajagopalan et al. 2000; Piepho 2004; Gramm et al. 2007; Behrisch and Taraz 2006]. In this paper, we extend and complement this work, in particular introducing new data reduction techniques. A clique in an undirected graph G = (V, E) is a set C of vertices such that for any two vertices in C, there is an edge connecting the two. We will also use “clique” to refer to the complete subgraph of G induced by C and assume that the exact meaning will be made clear from context. Formally, as a (parameterized) decision problem, Clique Cover is defined as follows: Clique Cover Input: An undirected graph G = (V, E) and an integer k ≥ 0. Question: Is there a set of at most k cliques in G such that each edge in E has both its endpoints in at least one of the selected cliques? As first observed by Erd˝os et al. [1966], Clique Cover is equivalent to a problem from intersection graph theory (see McKee and McMorris [1999] for a monograph on intersection graphs). Let F = {S1,...,Sn} be a family of sets. The intersection graph of F, denoted Ω(F), is the graph having F as vertex set with Si adjacent to Sj iff i 6= j and Si∩Sj 6= ∅. It is easy to see that for every feature x ∈U(F) := SS∈F S, the set Cx := {Si ∈ F | x ∈ Si} forms a clique in Ω(F), and {Cx | x ∈U(F)} is a clique cover for Ω(F). Therefore, finding a minimum cardinality clique cover for a graph G is equivalent to finding a set intersection representation F for G that min- imizes |U(F)| (called Intersection Graph Basis by Garey and Johnson [1979]). Guillaume and Latapy [2004] argue that this model is very widely applicable to discover underlying structure in complex real-world networks. Behrisch and Taraz [2006] give simple greedy algorithms for Clique Cover that provide asymptoti- cally optimal solutions for certain random intersection graphs. Clique Cover is NP-hard [Orlin 1977], even when restricted to planar graphs [Chang and Müller 2001] or graphs with maximum degree 6 [Hoover 1992]. It is polynomial-time solvable for chordal graphs [Ma et al. 1989], graphs with maximum 1We remark that covering vertices by cliques (Vertex Clique Cover or Clique Partition) is of less interest to be studied on its own because it is equivalent to the well-investigated Graph Coloring problem: A graph has a vertex clique cover of size k iff its complement graph can be colored with k colors such that adjacent vertices have different colors. ACM Journal Name, Vol. V, No. N, Month 20YY. Data Reduction and Exact Algorithms for Clique Cover · 3 degree 5 [Hoover 1992], line graphs [Orlin 1977], and circular arc graphs [Hsu and Tsai 1991]. By way of contrast, Clique Cover is not approximable within a factor of |V |ǫ for some ǫ > 0 unless P = NP [Lund and Yannakakis 1994], and 2 2 (log log |V |) nothing better than a polynomial approximation factor of O|V | (log |V |)3 is known [Ausiello et al. 1999]. We examine Clique Cover in the context of parameterized complexity (see Flum and Grohe [2006] and Niedermeier [2006] for new monographs on parameterized complexity analysis). An instance of a parameterized problem consists of a problem instance I and a parameter k being a nonnegative integer. A parameterized problem is fixed-parameter tractable if it can be solved in f(k)·|I|O(1) time, where f is a computable function solely depending on the parameter k, not on the input size |I|. First, as our main algorithmic contribution, we introduce and analyze data reduction techniques for Clique Cover. As a side effect, we provide a so-called problem kernel for Clique Cover, for the first time showing—somewhat surprisingly— that the problem is fixed-parameter tractable with respect to the parameter k. We continue with describing an exact algorithm based on a search tree. For our experimental investigations, we combined our data reduction rules with the search tree, clearly outperforming heuristic algorithms [Kellerman 1973; Kou et al. 1978] in several ways. For instance, we can solve real-world instances from a statistical application [Piepho 2004]—so far solved heuristically [Piepho 2004]—optimally without time loss. This indicates that for a significant fraction of real-world instances our exact approach is clearly to be preferred to a heuristic approach which is without guaranteed solution quality. We also experimented with random graphs of different densities, showing that our exact approach works extremely well for sparse graphs. In addition, our empirical results reveal that for dense graphs a data reduction rule that was designed for showing the problem kernel is often useful. In particular, this gives strong empirical support for further theoretical studies in the direction of improved fixed-parameter tractability results for Clique Cover, nicely demon- strating a fruitful interchange between applied and theoretical algorithmic research. 2. DATA REDUCTION A (data) reduction rule replaces, in polynomial time, a given Clique Cover instance (G, k) consisting of a graph G and a nonnegative integer k by a “simpler” instance (G′, k′) such that (G, k) has a solution iff (G′, k′) has a solution. An instance to which none of a given set of reduction rules applies is called reduced with respect to these rules. A parameterized problem such as Clique Cover (the parameter is k) is said to have a problem kernel if, after the application of the reduction rules, the reduced instance has size f(k) for a function f depending only on k. It is a well-known result from parameterized complexity theory that a parameterized problem is fixed-parameter tractable iff it admits a problem kernel [Cai et al. 1997]. Given an n-vertex and m-edge graph G, we use N(v) to denote the neighborhood of vertex v in G, namely, N(v) := {u | {u, v} ∈ E}.

Load more