Approximation Algorithms - Weighted Set Cover Problem Lecturer: Kavitha Telikepalli, Naveen Garg Scribe: Rahul Aggarwal, Tarun Aggarwal
CSL758: Advanced Algorithms, April 16, 2008

1 Introduction

In this lecture we discuss an NP-hard problem and look for a good approximation algorithm for it. The problem we consider is the weighted set cover problem. In the course of analysing it we also discuss the closely related vertex cover problem, both weighted and unweighted.

2 Weighted Set Cover Problem

Given a finite universe $U = \{e_1, e_2, \ldots, e_n\}$ of $n$ elements, a collection $S = \{s_1, s_2, \ldots, s_m\}$ of subsets of $U$ (so $s_i \subseteq U$ for all $i$), and a weight function $w : S \to \mathbb{R}^+$ that assigns a positive real weight to each set in $S$, the goal is to find a minimum weight subcollection of $S$ whose union is $U$, that is, a minimum weight set cover.

3 Proof of NP completeness

We show a reduction from the vertex cover problem to the set cover problem. The decision version of set cover asks whether there exists a set cover containing at most $c$ sets, where $c$ is an integer and every set has unit weight ($w(s) = 1$ for all $s \in S$).

Vertex Cover Problem - Given a graph $G = (V, E)$, a set $S \subseteq V$ is a vertex cover if for every edge $e = (u, v) \in E$ at least one of $u$ or $v$ is in $S$. The problem is to find a minimum cardinality vertex cover. The corresponding decision problem asks whether there exists a vertex cover of size $\le k$, where $k$ is an integer.

Proof - Take vertex cover to be NP complete (this is shown next) and suppose we had a polynomial time algorithm for the set cover problem. Consider an instance of the vertex cover problem, $G = (V, E)$ and $k$. Define $U = E$, $S_v = \{e \in E : e \text{ is incident on } v\}$, $S = \{S_v : v \in V\}$ and $c = k$. A set of vertices touches every edge exactly when the corresponding sets $S_v$ together contain every edge, so there exists a set cover of size $\le k$ iff there exists a vertex cover of size $\le k$. The instance can be built in polynomial time, so this is a valid polynomial time reduction, and the assumed algorithm would solve the vertex cover problem in polynomial time. Hence the set cover problem is at least as hard as vertex cover, i.e. NP-hard. A proposed cover can be checked in polynomial time, so the decision version is also in NP, and the set cover problem is NP complete.
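As a small illustration of the reduction just described, the following Python sketch builds the set cover instance $(U, S)$ from a graph and checks the equivalence on a tiny example. The function names and the brute-force checkers are hypothetical helpers introduced only for this illustration; they take exponential time and are not an algorithm from the lecture.

```python
from itertools import combinations


def vertex_cover_to_set_cover(vertices, edges):
    """Build (U, S) with U = E and S_v = edges incident on v."""
    universe = {frozenset(e) for e in edges}
    sets = {v: {e for e in universe if v in e} for v in vertices}
    return universe, sets


def has_set_cover(universe, sets, c):
    """Brute force: do some at most c of the sets cover the universe?"""
    names = list(sets)
    for r in range(c + 1):
        for chosen in combinations(names, r):
            covered = set().union(*(sets[name] for name in chosen))
            if covered == universe:
                return True
    return False


def has_vertex_cover(vertices, edges, k):
    """Brute force: is there a vertex cover with at most k vertices?"""
    for r in range(k + 1):
        for chosen in combinations(vertices, r):
            if all(u in chosen or v in chosen for u, v in edges):
                return True
    return False


# Assumed toy instance: the path a - b - c - d.
vertices = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d")]
universe, sets = vertex_cover_to_set_cover(vertices, edges)
for k in range(len(vertices) + 1):
    # The two answers agree for every k, as the reduction requires.
    print(k, has_vertex_cover(vertices, edges, k), has_set_cover(universe, sets, k))
```

Representing each edge as a `frozenset` makes the universe independent of the order in which an edge's endpoints are written.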
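Now we prove the NP completeness of the vertex cover problem. An independent set $S$ in $G = (V, E)$ is a subset of vertices such that for every edge $e = (u, v) \in E$ at most one of $u$ or $v$ is in $S$. If $S \subseteq V$ is a vertex cover of $G$ then $V - S$ is an independent set.

Claim - $S$ is an independent set iff $V - S$ is a vertex cover.

Proof - ($\Rightarrow$) Let $S$ be any independent set and consider any edge $(u, v)$. As $S$ is independent, $u$ and $v$ are not both in $S$, so either $u \in V - S$ or $v \in V - S$. Thus $V - S$ covers $(u, v)$, and so it is a vertex cover.
($\Leftarrow$) Let $V - S$ be any vertex cover and consider two vertices $u \in S$ and $v \in S$. If $(u, v)$ were an edge, neither of its endpoints would lie in $V - S$ and the edge would be uncovered; hence $(u, v) \notin E$. Thus no two nodes in $S$ are joined by an edge, and $S$ is an independent set.

So, finding a vertex cover of size $\le k$ is the same as finding an independent set of size $\ge |V| - k$. It therefore suffices to prove that the independent set problem is NP complete.

Proof - We reduce from 3-SAT, a well known NP complete problem. Suppose we had a polynomial time algorithm for the independent set problem and consider an instance $\varphi$ of 3-SAT. We construct an instance $(G, k)$ of the independent set problem as follows:
1. $G$ contains 3 vertices for each clause, one for each literal.
2. Connect the 3 literals in a clause in a triangle.
3. Connect a literal to each of its negations in other clauses.
4. Set $k$ = number of clauses in $\varphi$.

Example: $\varphi = (x_1 \lor x_2 \lor x_3) \land (x_1 \lor x_2 \lor x_3) \land (x_1 \lor x_2 \lor x_4)$. So $k = 3$.

[Figure: the graph $G$ constructed from $\varphi$.]

Claim - $G$ contains an independent set of size $k = |\varphi|$ iff $\varphi$ is satisfiable.

Proof - ($\Rightarrow$) Let $S$ be any independent set of size $k$. $S$ contains at most one vertex from each triangle, and there are $k$ triangles, so $S$ must contain exactly one vertex in each triangle. Set these literals to true; this is consistent because a literal and its negation are joined by an edge and so cannot both be in $S$. Set any remaining variables arbitrarily. Every clause then contains a true literal, so all clauses are satisfied and $\varphi$ is satisfiable.
($\Leftarrow$) Given a satisfying assignment, select one true literal from each triangle. No two selected vertices lie in the same triangle, and no two are negations of each other since both are true, so they form an independent set of size $k$.

So we have a valid polynomial time reduction, and the assumed algorithm would solve 3-SAT in polynomial time. Hence the independent set problem is NP-hard, and by the claim relating independent sets to vertex covers so is the vertex cover problem. Both are in NP, since a proposed solution can be checked in polynomial time, so both are NP complete.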
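The four construction steps above can be sketched in Python as follows. The encoding of literals as nonzero integers (`-i` for the negation of $x_i$), the function names, and the sample formula are all assumptions made for this illustration; in particular the formula is not the example from the lecture. The independent set search is a brute-force check, usable only on tiny inputs.

```python
from itertools import combinations


def sat_to_independent_set(clauses):
    """Build (vertices, edges, k) from a list of 3-literal clauses."""
    # One vertex per literal occurrence: (clause index, position).
    vertices = [(c, p) for c in range(len(clauses)) for p in range(3)]
    edges = set()
    for a, b in combinations(vertices, 2):
        same_clause = a[0] == b[0]                              # triangle edge
        negation = clauses[a[0]][a[1]] == -clauses[b[0]][b[1]]  # conflict edge
        if same_clause or negation:
            edges.add((a, b))
    return vertices, edges, len(clauses)


def find_independent_set(vertices, edges, k):
    """Brute force: return k pairwise non-adjacent vertices, or None."""
    for candidate in combinations(vertices, k):
        chosen = set(candidate)
        if not any(a in chosen and b in chosen for a, b in edges):
            return candidate
    return None


def assignment_from(independent_set, clauses):
    """Set the selected literals to true; other variables stay unset."""
    assignment = {}
    for c, p in independent_set:
        literal = clauses[c][p]
        assignment[abs(literal)] = literal > 0
    return assignment


def satisfies(assignment, clauses):
    """Check the assignment, treating unset variables as False."""
    return all(
        any(assignment.get(abs(lit), False) == (lit > 0) for lit in clause)
        for clause in clauses
    )


# Hypothetical formula: (x1 v x2 v x3) ^ (~x1 v ~x2 v x3) ^ (~x1 v x2 v ~x3)
clauses = [(1, 2, 3), (-1, -2, 3), (-1, 2, -3)]
vertices, edges, k = sat_to_independent_set(clauses)
chosen = find_independent_set(vertices, edges, k)
print(chosen, assignment_from(chosen, clauses))
```

Because conflict edges forbid choosing a literal together with its negation, the assignment read off from an independent set never sets a variable both ways.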
4 f-Approximation algorithm for weighted set cover problem

We formulate the problem as a linear program. Let $S = \{s_1, s_2, \ldots, s_m\}$ be the given collection of subsets of $U$, and associate a variable $x_j$ with each $s_j$ such that $x_j = 1$ if $s_j$ is in the minimum weight set cover and $x_j = 0$ otherwise. Let $w : S \to \mathbb{R}^+$ be the weight function and write $w_j = w(s_j)$. The program is:

Minimize $\sum_j x_j w_j$
Subject to
  $x_{i_1} + x_{i_2} + \cdots + x_{i_k} \ge 1$ for each $e \in U$, where $s_{i_1}, s_{i_2}, \ldots, s_{i_k}$ are the sets containing $e$
  $x_j \in \{0, 1\}$ for all $j$

The constraints can also be written in matrix form as $Ax \ge b$, where $A(i, j) = 1$ if $e_i \in s_j$ and 0 otherwise, $x$ is a column vector of $m$ elements and $b$ is a column vector of $n$ ones.

The program presented here is an integer program (IP), which is hard to solve. We therefore relax the integrality constraints to $0 \le x_j \le 1$. The optimal value of the relaxed program is always less than or equal to the optimal value of the integer program, because every feasible solution of the IP is also feasible for the relaxation. We can solve the relaxed program with any LP solver and get a solution $x$. The task is then to convert this fractional solution into an integral one while making sure that the chosen sets still cover $U$. To do so we round the fractional values of the LP solution using the following strategy.

Suppose each element $e$ is present in at most $f$ subsets; we call $f$ the frequency of occurrence. For every $x_i$ in the LP solution, if $x_i \ge 1/f$ then set $x_i = 1$, otherwise set $x_i = 0$. In this way the fractional solution is converted to an integer solution.

Now we prove that this integer solution gives a set cover. For a solution to be a set cover the above constraints must be satisfied. Consider the constraint $x_{i_1} + x_{i_2} + \cdots + x_{i_k} \ge 1$. Since the frequency is at most $f$, the left hand side contains at most $f$ terms. So some $x_{i_j} \ge 1/f$ in the LP solution, since otherwise the sum would be less than 1 and the constraint would be violated.
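The rounding rule can be sketched in Python as follows. The sketch assumes a feasible fractional solution `x` has already been obtained from some LP solver and only performs the thresholding step; the function names, the dictionary representation, and the small tolerance `eps` guarding against floating-point error are assumptions of this illustration.

```python
def round_lp_solution(universe, sets, x, eps=1e-9):
    """Keep every set whose LP value is at least 1/f.

    sets: dict mapping a set's name to its elements.
    x:    dict mapping a set's name to its fractional LP value.
    """
    # f = largest number of sets that any single element belongs to.
    f = max(sum(1 for s in sets.values() if e in s) for e in universe)
    chosen = [name for name in sets if x[name] >= 1.0 / f - eps]
    return f, chosen


def is_cover(universe, sets, chosen):
    """Check that the chosen sets together contain every element."""
    covered = set()
    for name in chosen:
        covered |= sets[name]
    return covered >= set(universe)


# Assumed toy instance with unit weights; x = 1/2 for every set satisfies
# all three covering constraints.
universe = {1, 2, 3}
sets = {"A": {1, 2}, "B": {2, 3}, "C": {1, 3}}
weights = {"A": 1.0, "B": 1.0, "C": 1.0}
x = {"A": 0.5, "B": 0.5, "C": 0.5}

f, chosen = round_lp_solution(universe, sets, x)
lp_cost = sum(weights[name] * x[name] for name in sets)
rounded_cost = sum(weights[name] for name in chosen)
print(f, chosen, is_cover(universe, sets, chosen), lp_cost, rounded_cost)
```

On this instance every element lies in two sets, so $f = 2$ and all three sets are kept; the rounded cost is exactly twice the fractional cost.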
Each constraint therefore contains some variable $x_{i_j} \ge 1/f$, which is set to 1 by the rounding, so the constraint remains satisfied. This is true for all constraints, so after rounding we get a valid set cover.

Also, the cost of the rounded solution, $\sum_j x_j w_j$, can be at most $f$ times the cost of the LP solution, because rounding raises a variable with $x_j \ge 1/f$ to 1, an increase by a factor of at most $f$, and lowers every other variable to 0. The optimal value of the LP is less than or equal to the optimal value of the IP. So the rounded solution costs at most $f$ times the optimal solution of the integer program.

5 2-Approximation algorithm for Cardinality Vertex Cover

Recall the definition of the cardinality vertex cover problem.

Input: An undirected graph $G = (V, E)$.
Goal: Find a minimum cardinality set of vertices $S \subseteq V$ such that every edge in $E$ has at least one endpoint in $S$.

Definition: A matching in a graph $G = (V, E)$ is a subset of edges $M \subseteq E$ such that no two edges in $M$ share a common endpoint. A matching $M$ is called maximal if it is not strictly contained in any other matching (i.e. no edge in $E - M$ can be added to $M$).

Our vertex cover algorithm simply finds any maximal matching $M$. The cover is defined as all endpoints of edges in $M$.

Algorithm VC2
  $S \leftarrow \emptyset$
  while $E \ne \emptyset$ do
    let $(u, v)$ be any edge in $E$
    $S \leftarrow S \cup \{u, v\}$
    delete vertices $u, v$ and all their incident edges from $G$
  return $S$

Lemma 1: The set $S$ returned by VC2 is a vertex cover.
Proof: Consider any edge $e$ deleted in an iteration of the loop. If $e$ was selected as the edge in the first line of the loop, then both of its endpoints were added to $S$. Otherwise, $e$ must share an endpoint with the edge selected in the first line of the loop, so one of its endpoints is added to $S$. Since all edges are eventually deleted in some iteration, the final set $S$ is a vertex cover.

Lemma 2: Let $M$ be any matching and $S$ be any vertex cover. Then $|M| \le |S|$.
Proof: Each $e \in M$ must have at least one of its endpoints in $S$. Since $M$ is a matching, no two edges in $M$ share an endpoint.
Therefore, each $e \in M$ is covered by a vertex of $S$ that covers no other edge $f \in M$, so $|M| \le |S|$.

Lemma 3: VC2 is a 2-approximation.
Proof: Let $M$ be the set of all edges selected in the first line of the loop. Then $M$ is a matching, since any edge sharing an endpoint with some $e \in M$ was deleted in the third line of the loop. Since $S$ consists exactly of the endpoints of edges in $M$, $|S| = 2|M|$. Applying Lemma 2 to $M$ and a minimum vertex cover $S^*$ gives $|M| \le |S^*|$, and therefore $|S| = 2|M| \le 2|S^*|$.
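A minimal Python sketch of Algorithm VC2 follows, assuming the graph is given as a list of edges with no self-loops. The function name `vc2` and the choice to always pick the first remaining edge are assumptions of this illustration; the algorithm allows any edge to be picked.

```python
def vc2(edges):
    """Return (cover, matching) built by repeatedly picking an edge."""
    remaining = [tuple(e) for e in edges]
    cover = set()
    matching = []
    while remaining:
        u, v = remaining[0]            # "let (u, v) be any edge in E"
        cover |= {u, v}                # S <- S U {u, v}
        matching.append((u, v))
        # Delete u, v and all their incident edges from the graph.
        remaining = [
            (a, b) for (a, b) in remaining
            if a not in (u, v) and b not in (u, v)
        ]
    return cover, matching


# Assumed toy instance: the path a - b - c - d.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
cover, matching = vc2(edges)
print(sorted(cover), matching)
```

On this path the sketch selects the edges (a, b) and (c, d) and returns all four vertices, while {b, c} is a cover of size 2, so the factor of 2 is attained on this instance.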