COMPSCI 532: Design and Analysis of Algorithms October 14, 2015 Lecture 12 Lecturer: Debmalya Panigrahi Scribe: Allen Xiao

1 Overview

In this lecture, we introduce approximation algorithms and their analysis in terms of the approximation ratio. We review a few examples, then introduce an analysis technique for linear programs known as dual fitting.

2 Approximation Algorithms

It is unknown whether polynomial time algorithms exist for NP-hard problems, but in many cases, polynomial time algorithms exist which approximate the optimal solution.

Definition 1. Let P be a minimization problem, with an algorithm A for it. The approximation ratio α of A is:

α = max_{I∈P} ALGO(I)/OPT(I)

Each I is an input/instance of P. ALGO(I) is the value A achieves on I, and OPT(I) is the value of the optimal solution for I. An equivalent form exists for maximization problems:

α = min_{I∈P} ALGO(I)/OPT(I)

In both cases, we say that A is an α-approximation algorithm for P. A natural way to think of α (since we take the worst case over all possible inputs) is as the worst-case performance of A against the optimum. We will often use the abbreviations ALGO and OPT to denote the worst-case values which form α.

3 2-Approximation for Vertex Cover

A vertex cover of a graph G = (V,E) is a set of vertices S ⊆ V such that every edge has at least one endpoint in S. The VERTEX-COVER decision problem asks, given a graph G and parameter k, whether G admits a vertex cover of size at most k. The optimization problem is to find a vertex cover of minimum size. We will provide an approximation algorithm for VERTEX-COVER with an approximation ratio of 2. Consider a very naive algorithm: while an uncovered edge exists, add one of its endpoints to the cover. It turns out this algorithm is rather difficult to analyze in terms of approximation ratio. A small variation gives a very straightforward analysis: instead of adding one vertex of the uncovered edge, add both (Algorithm 1).

Algorithm 1 Vertex Cover 2-Approximation
1: S ← ∅
2: U ← E
3: while U is not empty do
4:   Choose any (v,w) ∈ U
5:   Add both v and w to S
6:   Remove all edges adjoining v or w from U
7: end while
8: return S

Consider the vertices added by this procedure. The vertex pairs added by the algorithm are a set of disjoint edges (v1,w1),...,(vt,wt), since the algorithm removes all edges adjoining these vertices for every vertex it adds. OPT must cover each of these edges, and must therefore pick at least one endpoint from each edge. It follows that OPT is at least half the size of S, so the approximation ratio for this algorithm is at most 2.

[Figure: the disjoint edges (v1,w1),...,(vt,wt) chosen by the algorithm, drawn as a matching.]
Figure 1: The set S must cover all these edges; at least one vertex from each edge must have been used in OPT as well.

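To complement the pseudocode, the following is a minimal Python sketch of Algorithm 1 (not part of the original notes; the edge-list representation of the graph is an assumption for illustration):

def vertex_cover_2approx(edges):
    # Repeatedly pick an uncovered edge and take BOTH endpoints. The chosen
    # edges are disjoint, so the cover is at most twice an optimal cover.
    cover = set()                    # S <- empty
    uncovered = list(edges)          # U <- E
    while uncovered:                 # while U is not empty
        v, w = uncovered[0]          # choose any (v, w) in U
        cover.add(v)                 # add both v and w to S
        cover.add(w)
        # remove all edges adjoining v or w from U
        uncovered = [(x, y) for (x, y) in uncovered
                     if x not in (v, w) and y not in (v, w)]
    return cover

# Example: on the path 1-2-3-4, an optimal cover is {2, 3} (size 2);
# the sketch returns all four vertices, matching the ratio of 2.
print(vertex_cover_2approx([(1, 2), (2, 3), (3, 4)]))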
4 Greedy Approximation for Set Cover

Given a universe of objects X and a family of subsets S = {s1, s2, ..., sm} (i.e. each si ⊆ X), a set cover T is a subfamily T ⊆ S such that every object in X is a member of at least one set in T (i.e. T covers X using sets s ∈ S). Let c(·) be a cost function on the sets, and let the cost of a set cover be c(T) = ∑_{s∈T} c(s). The weighted set cover optimization problem asks for the minimum cost set cover of X.

As with vertex cover, we will use a simplistic algorithm and prove its approximation ratio. Each step, we add the set which pays the least per uncovered (remaining) element it covers. Intuitively, this choice lowers the average cost of covering an element in the final set cover.

Algorithm 2 Greedy Set Cover
1: F ← X
2: T ← ∅
3: while F is not empty do
4:   s ← argmin_{s′∈S} c(s′)/|s′ ∩ F|
5:   T ← T ∪ {s}
6:   F ← F \ s
7: end while
8: return T
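Here is a matching Python sketch of Algorithm 2 (again not from the original notes; the dictionary representation of S and c is an assumption, and ties are broken arbitrarily):

def greedy_set_cover(X, S, c):
    # X: universe; S: dict mapping set name -> set of elements;
    # c: dict mapping set name -> cost. Assumes the sets in S cover X.
    F = set(X)                       # F <- X (uncovered elements)
    T = []                           # T <- empty
    while F:                         # while F is not empty
        # pick the set minimizing cost per newly covered element
        s = min((name for name in S if S[name] & F),
                key=lambda name: c[name] / len(S[name] & F))
        T.append(s)                  # T <- T + {s}
        F -= S[s]                    # F <- F \ s
    return T

# Example: greedy picks b, d, a for total cost 5, while the optimal
# cover {a, c} costs 4.
X = {1, 2, 3, 4, 5}
S = {'a': {1, 2, 3}, 'b': {3, 4}, 'c': {4, 5}, 'd': {1, 5}}
c = {'a': 3.0, 'b': 1.0, 'c': 1.0, 'd': 1.0}
print(greedy_set_cover(X, S, c))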

Correctness follows from the same argument as the vertex cover analysis: elements are only removed from F (initially X) when they are covered by the set we add to T, and we finish with F empty. Therefore all elements of X are covered by some set in T. To prove the approximation ratio, consider the state of the algorithm before adding the ith set. For clarity, let Fi be F on this iteration (the elements not yet covered), but let T denote the final output set cover, and T∗ the optimal set cover. By optimality of T∗:

∑_{s∈T∗} c(s) = c(T∗) = OPT

T∗ covers X, and therefore covers Fi:

∑_{s∈T∗} |s ∩ Fi| ≥ |Fi|

We can consider how the sets in T∗ perform on the cost-per-uncovered-element ratio that is minimized in the algorithm:

min_{s∈T∗} c(s)/|s ∩ Fi| ≤ (∑_{s∈T∗} c(s)) / (∑_{s∈T∗} |s ∩ Fi|) ≤ OPT/|Fi|

The second inequality used "the minimum is at most the average". Now notice that the algorithm takes a minimum over all sets in S. Since S ⊇ T∗, the chosen set must have had at least as low a ratio as the minimum from T∗:

min_{s∈S} c(s)/|s ∩ Fi| ≤ min_{s∈T∗} c(s)/|s ∩ Fi| ≤ OPT/|Fi|

Finally, the cost of T is the sum of the costs of its sets. Using the notation above, we can write this expression as a weighted sum of the minimized ratios, and then apply the above inequality to find an upper bound linear in OPT. Let s^(i) be the ith set selected.

ALGO = c(T) = ∑_{s∈T} c(s) = ∑_{i=1}^{|T|} c(s^(i))
  = ∑_{i=1}^{|T|} [c(s^(i))/|s^(i) ∩ Fi|] · |s^(i) ∩ Fi|
  = ∑_{i=1}^{|T|} [c(s^(i))/|s^(i) ∩ Fi|] · (|Fi| − |Fi+1|)
  ≤ ∑_{i=1}^{|T|} (OPT/|Fi|) · (|Fi| − |Fi+1|)

Analyzing the sum will give us an expression for the approximation ratio. Since each sum term is OPT/|Fi| duplicated (|Fi| − |Fi+1|) times, we can replace the denominator terms to get an upper bound.

(OPT/|Fi|) · (|Fi| − |Fi+1|) = OPT/|Fi| + ··· + OPT/|Fi|    ((|Fi| − |Fi+1|) times)
  ≤ OPT/|Fi| + OPT/(|Fi| − 1) + OPT/(|Fi| − 2) + ··· + OPT/(|Fi+1| + 1)
  = ∑_{j=0}^{|Fi|−|Fi+1|−1} OPT/(|Fi| − j)

Returning to the original sum, we realize this is actually one long descending sum of OPT/(n − j) terms (recall |F1| = |X| = n, since nothing is covered before the first iteration):

∑_{i=1}^{|T|} ∑_{j=0}^{|Fi|−|Fi+1|−1} OPT/(|Fi| − j)
  = (OPT/|F1| + ··· + OPT/(|F2| + 1)) + (OPT/|F2| + ··· + OPT/(|F3| + 1)) + ···
  = OPT/n + OPT/(n − 1) + ··· + OPT/1
  = ∑_{j=0}^{n−1} OPT/(n − j)
  = ∑_{k=1}^{n} OPT/k

In the last step, we applied a change of variables with k = n − j. This familiar sum is the nth harmonic number (times OPT).
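To see the change of variables concretely, here is a small numeric check (not from the notes; the sequence of |Fi| values is invented) that the double sum hits each denominator n, n − 1, ..., 1 exactly once:

from fractions import Fraction

f = [10, 7, 3, 0]                  # made-up sizes |F_1| = n > |F_2| > ... > 0
nested = sum(Fraction(1, f[i] - j)            # sum over i, then over j
             for i in range(len(f) - 1)
             for j in range(f[i] - f[i + 1]))
harmonic = sum(Fraction(1, k) for k in range(1, f[0] + 1))
assert nested == harmonic          # the double sum is exactly H_n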

ALGO ≤ ∑_{k=1}^{n} OPT/k = OPT · Hn = OPT · Θ(log n)

Rearranging, we see that the approximation factor for the greedy algorithm is no more than some constant multiple of log n:

ALGO/OPT = O(log n)

5 Dual Fitting

Dual fitting is an analysis technique for approximation algorithms on problems with linear programming formulations. In short, we maintain a feasible dual solution corresponding to the (still infeasible) primal solution we have built so far, and show that the ratio between their values is bounded by a constant. This gives us a bound on the approximation ratio, thanks to strong duality. To see this, suppose we have a minimization primal, and let:

1. P be the value of the integral primal solution output by the algorithm. This is equivalent to ALGO.

2. Pint be the optimal value for an integral solution to the primal. This is equivalent to OPT.

3. D be the value of the feasible dual we constructed to associate with P.

4. P∗, D∗ be the primal and dual fractional optimal values, respectively.

[Figure: a single value axis from 0 up toward the primal objective minP, with the dual objective maxD increasing the same way: D ≤ P∗ = D∗ ≤ Pint (OPT) ≤ P (ALGO); (a) marks the gap between P and Pint, and (b) the gap between Pint and P∗ = D∗.]
Figure 2: We have an optimization problem expressed as a minimization LP. The approximation algorithm produces some primal solution P which we wish to compare against the integral optimal Pint ((a) above). Dual fitting bounds this gap by constructing a feasible dual D whose gap with P is known ((b) above).

Suppose we manage to construct a feasible D such that P ≤ αD for constant α ≥ 1. First, notice that the approximation factor is defined to be:

ALGO/OPT = P/Pint

Since the integral optimum of a minimization problem is always at least the fractional optimum:

P/Pint ≤ P/P∗

Invoke strong duality:

P/P∗ = P/D∗

D is feasible, therefore D ≤ D∗:

P/D∗ ≤ P/D

Combined, we have that:

ALGO/OPT ≤ P/D ≤ α

See Figure 2 for a pictorial representation of the inequalities we used. We will go over a few examples of problems and construct feasible duals for them.

5.1 Dual fitting for vertex cover

Recall the 2-approximation algorithm for vertex cover from before (see Algorithm 1). To perform a dual fitting analysis, we will first write the relaxed (fractional) linear programs for vertex cover.

min ∑_{v∈V} xv
s.t. xv + xw ≥ 1   ∀(v,w) ∈ E
     xv ≥ 0   ∀v ∈ V

The dual is:

max ∑_{(v,w)∈E} yvw
s.t. ∑_{w:(v,w)∈E} yvw ≤ 1   ∀v ∈ V
     yvw ≥ 0   ∀(v,w) ∈ E

We will interpret each vertex selection as setting xv = 1. Initially, our solutions are xv = 0 for all v in the primal, and yvw = 0 for all (v,w) in the dual. Notice that the primal solution is infeasible, and the dual solution is feasible. When the endpoints of (v,w) are added to S:

1. In the primal, we set xv = 1 and xw = 1. Since the edges selected are disjoint, xv,xw = 0 previously. Therefore, ∆P = 2 for each iteration.

2. For the dual, we will do the natural thing and set yvw = 1 as well. Since the edges selected are disjoint, the edges with yvw = 1 form a matching on the graph. No vertex constraint in the dual has value more than 1, and yvw = 0 before this addition. Thus, D remains feasible and ∆D = 1 for each iteration.

At the end of the algorithm, P becomes primal feasible (no edge of E is left uncovered). At each step, ∆P = 2∆D. Let Pt (resp. Dt) be the primal (resp. dual) value after the tth iteration:

Pt/Dt = (∑_{i=1}^{t} (∆P)i) / (∑_{i=1}^{t} (∆D)i) = 2t/t = 2

P is exactly the size of the output cover (ALGO). Since D is a feasible dual at the end of the algorithm, dual fitting gives us that:

ALGO/OPT ≤ P/D = 2
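The bookkeeping above is easy to run in code. Here is a sketch (an illustration, not from the notes) that executes Algorithm 1, sets yvw = 1 on each chosen edge, and confirms both dual feasibility and P = 2D at the end:

def vertex_cover_dual_fitting(edges):
    cover, y = set(), {}             # primal cover S and dual variables y_vw
    uncovered = list(edges)
    while uncovered:
        v, w = uncovered[0]          # choose any uncovered edge
        cover |= {v, w}              # primal: x_v = x_w = 1, so delta-P = 2
        y[(v, w)] = 1                # dual: y_vw = 1, so delta-D = 1
        uncovered = [(x, z) for (x, z) in uncovered
                     if x not in (v, w) and z not in (v, w)]
    # dual feasibility: total y on edges incident to each vertex is at most 1
    load = {}
    for (v, w), val in y.items():
        load[v] = load.get(v, 0) + val
        load[w] = load.get(w, 0) + val
    assert all(total <= 1 for total in load.values())
    return len(cover), sum(y.values())   # (P, D) with P / D = 2

print(vertex_cover_dual_fitting([(1, 2), (2, 3), (3, 4)]))  # prints (4, 2)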

5.2 Dual fitting in general

It is not strictly necessary that the dual we maintain be feasible. Suppose instead that we maintain an infeasible dual D with P/D = α, where D/β is feasible. Applying the dual fitting argument with D′ = D/β tells us that:

ALGO/OPT ≤ P/D′ = P/(D/β) = αβ

In general, there are two ratios we can control in dual fitting:

1. The primal-dual ratio: P/D = α by construction, though D may not be feasible.

2. The infeasibility ratio: D/β is feasible.

One might ask, "why not just maintain D/β instead?" For certain problems, maintaining the infeasible D is more semantically useful than its feasible scaled counterpart. The next example will demonstrate this.

5.3 Dual fitting for greedy set cover

Recall the greedy algorithm for set cover (repeated below as Algorithm 3), which we proved to be (log n)-approximate by other techniques.

Algorithm 3 Greedy Set Cover
1: F ← X
2: T ← ∅
3: while F is not empty do
4:   s ← argmin_{s′∈S} c(s′)/|s′ ∩ F|
5:   T ← T ∪ {s}
6:   F ← F \ s
7: end while
8: return T

The relaxed linear programming form of weighted set cover is:

min ∑_{s∈S} c(s) · xs
s.t. ∑_{s:e∈s} xs ≥ 1   ∀e ∈ X
     xs ≥ 0   ∀s ∈ S

The dual is:

max ∑_{e∈X} ye
s.t. ∑_{e∈s} ye ≤ c(s)   ∀s ∈ S
     ye ≥ 0   ∀e ∈ X

Initially, all ye, xs = 0. This is infeasible for the primal, but feasible for the dual. On each iteration of the algorithm, we add a set s to T; F is the set of still-uncovered elements at that iteration.

1. In the primal, we set xs = 1. Then, ∆P = c(s) · 1 = c(s).

2. We will set the dual to maintain ∆D = ∆P = c(s). Set ye = c(s)/|s ∩ F| for the new elements e ∈ s covered. Intuitively, this is charging the "purchase" of s to the elements it newly covers (s ∩ F). Since each element is covered for the first time exactly once, ye is unchanged from the first time e is covered until the end of the algorithm. The total charge is ∆D = |s ∩ F| · c(s)/|s ∩ F| = c(s).

Lemma 1. D/log n is dual feasible.

Proof. We will do the analysis on D (without scaling) first. On each iteration, the dual constraint for a fixed set s∗ changes if any of its elements were covered. Without loss of generality, we will only consider iterations which cover some e ∈ s∗. Recall that each iteration picks the set s attaining:

min_{s′∈S} c(s′)/(# uncovered elements left in s′) = c(s)/|s ∩ F|

As the minimizer, this lower bounds the same ratio for s∗:

c(s)/|s ∩ F| ≤ c(s∗)/|s∗ ∩ F|

We will examine the dual constraint for s∗ and use an argument similar to that of the original greedy set cover analysis to extract a log n factor. Let si be the ith selected set, and Fi be the set of uncovered elements before si is added. When si is selected, the dual change in the s∗ constraint is at most:

(|s∗ ∩ Fi| − |s∗ ∩ Fi+1|) · c(s∗)/|s∗ ∩ Fi|

At the end of the algorithm, the dual constraint for s∗ becomes:

∑_{e∈s∗} ye = ∑_i [c(si)/|si ∩ Fi|] · |{elements of s∗ covered by si}|
  = ∑_i [c(si)/|si ∩ Fi|] · (|s∗ ∩ Fi| − |s∗ ∩ Fi+1|)
  ≤ ∑_i [c(s∗)/|s∗ ∩ Fi|] · (|s∗ ∩ Fi| − |s∗ ∩ Fi+1|)
  ≤ (c(s∗)/|s∗ ∩ F1| + ··· + c(s∗)/(1 + |s∗ ∩ F2|)) + (c(s∗)/|s∗ ∩ F2| + ··· + c(s∗)/(1 + |s∗ ∩ F3|)) + ···
  ≤ c(s∗) · (1/|s∗| + 1/(|s∗| − 1) + ··· + 1/1)
  = c(s∗) · H_{|s∗|}
  ≤ c(s∗) · log|s∗| ≤ c(s∗) · log n

This is better than log n, especially when the sets are small. Dividing by log n gives feasibility:

(1/log n) · ∑_{e∈s∗} ye ≤ c(s∗)

It follows that D/logn is dual feasible.

We now have that P/D = 1, and D/log n is dual feasible. Applying the general dual fitting argument with α = 1 and β = log n, we see that the greedy set cover algorithm is (log n)-approximate.
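To make both ratios concrete, here is a Python sketch (not from the notes; the instance is made up) that runs the greedy algorithm while assigning ye = c(s)/|s ∩ F| to the newly covered elements, then checks the two quantities from Section 5.2. We scale by the harmonic number Hn, which is within the c(s∗) · H|s∗| bound from the proof:

from fractions import Fraction

def greedy_with_duals(X, S, c):
    F, T, y = set(X), [], {}
    while F:
        s = min((name for name in S if S[name] & F),
                key=lambda name: c[name] / len(S[name] & F))
        for e in S[s] & F:           # charge c(s) to the newly covered elements
            y[e] = Fraction(c[s], len(S[s] & F))
        T.append(s)
        F -= S[s]
    return T, y

X = {1, 2, 3, 4, 5}
S = {'a': {1, 2, 3}, 'b': {3, 4}, 'c': {4, 5}, 'd': {1, 5}}
c = {'a': 3, 'b': 1, 'c': 1, 'd': 1}
T, y = greedy_with_duals(X, S, c)
H_n = sum(Fraction(1, k) for k in range(1, len(X) + 1))
# infeasibility ratio: every constraint of the scaled dual D / H_n holds
assert all(sum(y[e] for e in S[s]) / H_n <= c[s] for s in S)
# primal-dual ratio: P = D by construction (alpha = 1)
assert sum(c[s] for s in T) == sum(y.values())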
