
K-Medians, Facility Location, and the Chernoff-Wald Bound

Neal E. Young*

Abstract

We study the general (non-metric) facility-location and weighted k-medians problems, as well as the fractional facility-location and k-medians problems. We describe a natural randomized rounding scheme and use it to derive approximation algorithms for all of these problems.

For facility location and weighted k-medians, the respective algorithms are polynomial-time [H_Δ·k + d]- and [(1+ε)d; ln(n + n/ε)k]-approximation algorithms. These performance guarantees improve on the best previous performance guarantees, due respectively to Hochbaum (1982) and Lin and Vitter (1992).

For fractional k-medians, the algorithm is a new, Lagrangian-relaxation, [(1+ε)d, (1+ε)k]-approximation algorithm. It runs in O(k ln(n/ε)/ε²) linear-time iterations.

For fractional facility location (a generalization of fractional weighted set cover), the algorithm is a Lagrangian-relaxation, [(1+ε)(d + k)]-approximation algorithm. It runs in O(n ln(n)/ε²) linear-time iterations and is essentially the same as an unpublished Lagrangian-relaxation algorithm due to Garg (1998). By recasting his analysis probabilistically and abstracting it, we obtain an interesting (and as far as we know new) probabilistic bound that may be of independent interest. We call it the Chernoff-Wald bound.

*Research partially funded by NSF CAREER award CCR-9720664. Department of Computer Science, Dartmouth College, Hanover, NH. ney@cs.dartmouth.edu

1 Problem definitions

The input to the weighted set-cover problem is a collection of sets, where each set s is given a cost cost(s) ∈ ℝ+. The goal is to choose a cover (a collection of sets containing all elements) of minimum total cost.

The (uncapacitated) facility-location problem is a generalization of weighted set cover in which each set f (called a "facility") and element c (called a "customer") are given a distance dist(f,c) ∈ ℝ+ ∪ {∞}. The goal is to choose a set of facilities F minimizing cost(F) + dist(F), where cost(F), the facility cost of F, is ∑_{f∈F} cost(f), and dist(F), the assignment cost of F, is ∑_c min_{f∈F} dist(f,c).

Fig. 1 shows the standard integer programming formulation of the problem -- the facility-location IP [12, p. 8]:

    minimize    d + k
    subject to  cost(x) ≤ k
                dist(x) ≤ d
                ∑_f x(f,c) = 1      (∀c)
                x(f,c) ≤ x(f)       (∀f,c)
                x(f,c) ≥ 0          (∀f,c)
                x(f) ∈ {0,1}        (∀f)

Figure 1: The facility-location IP. Above, d, k, x(f) and x(f,c) are variables; dist(x), the assignment cost of x, is ∑_{f,c} x(f,c) dist(f,c), and cost(x), the facility cost of x, is ∑_f x(f) cost(f).

The facility-location linear program (LP) is the same except without the constraint "x(f) ∈ {0,1}". A fractional solution is a feasible solution to the LP. Fractional facility location is the problem of solving the LP.

The weighted k-medians problem is the same as the facility-location problem except for the following: a positive real number k is given as input, and the goal is to choose a subset F of facilities minimizing dist(F) subject to the constraint cost(F) ≤ k. The standard integer programming formulation is the k-medians IP, which differs from the IP in Fig. 1 only in that k is given, not a variable, and the objective function is d instead of d + k. The (unweighted) k-medians problem is the special case when each cost(f) = 1. For the fractional k-medians problem, the input is the same; the goal is to solve the linear program obtained by removing the constraint "x(f) ∈ {0,1}" from the k-medians IP.

We take the "size" of each of the above problems to be the number of pairs (f,c) such that dist(f,c) < ∞. We assume the size is at least the number of customers and facilities.

By an [α(d); β(k)]-approximation algorithm for k-medians, we mean an algorithm that, given a problem instance for which there exists a fractional solution of assignment cost d and facility cost k, produces a solution of assignment cost at most α(d) and facility cost at most β(k). We use similar non-standard notations for facility location, set cover, and k-set cover. For instance, by a [d + 2k]-approximation algorithm for facility location, we mean an algorithm that, given an instance for which there exists a fractional solution of assignment cost d and facility cost k, produces a solution for which the assignment cost plus the facility cost is at most d + 2k.
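To make the facility-location objective concrete, the following minimal Python sketch evaluates cost(F) + dist(F) for a given facility set F. The dictionary-based representation and the tiny example instance are ours, not the paper's.

    # Minimal sketch: evaluating a facility-location solution F.
    # cost[f] is the facility cost of f; dist[(f, c)] holds the finite distances.
    def facility_cost(F, cost):
        return sum(cost[f] for f in F)

    def assignment_cost(F, dist, customers):
        # each customer is assigned to its nearest chosen facility
        return sum(min(dist[(f, c)] for f in F if (f, c) in dist)
                   for c in customers)

    cost = {'f1': 3.0, 'f2': 1.0}
    dist = {('f1', 'c1'): 1.0, ('f1', 'c2'): 2.0,
            ('f2', 'c1'): 4.0, ('f2', 'c2'): 1.0}
    F = {'f1'}
    print(facility_cost(F, cost) + assignment_cost(F, dist, ['c1', 'c2']))  # prints 6.0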

2 Background

In the mid 1970's Johnson and Lovász gave a greedy [H_Δ·k]-approximation algorithm for unweighted set cover [9, 11]. In 1979 Chvátal generalized it to a [H_Δ·k]-approximation algorithm for weighted set cover [3]. In 1982 Hochbaum gave a greedy [H_Δ(d + k)]-approximation algorithm for the uncapacitated facility-location problem by an implicit reduction to the weighted set-cover problem [8].¹ Above, Δ is at most the maximum, over all facilities f, of the number of customers c such that dist(f,c) < ∞.

In 1992 Lin and Vitter gave a polynomial-time [(1+ε)d; (1 + 1/ε)(ln n + 1)k]-approximation algorithm for the k-medians problem [10]. (Here ε > 0 is an input parameter that tunes the tradeoff between the two criteria and n is the number of customers.) Their algorithm combines a greedy algorithm with a technique they call filtering.

In 1994 Plotkin, Shmoys, and Tardos (PST) gave Lagrangian-relaxation algorithms for general packing and covering problems [13]. As a special case, their algorithms imply a [(1+ε)k]-approximation algorithm for fractional set cover that runs in O(k ln(n)/ε²) linear-time iterations. Here n is the number of elements.

In 1998 Garg generalized and simplified the PST set-cover algorithm to obtain a [(1+ε)k]-approximation algorithm for fractional weighted set cover [6]. Garg's algorithm runs in O(n ln(n)/ε²) linear-time iterations. By Hochbaum's reduction, one can use Garg's algorithm as a [(1+ε)k]-approximation algorithm for fractional facility location. The running time is the same, where n is the number of customers.

Recent work has focused on metric k-medians and facility-location problems. In the metric versions, the distance function is assumed to satisfy the triangle inequality. For example, [O(d + k)]-approximation algorithms have recently been shown for the metric facility-location problem [7, 15]. Charikar, Guha, Tardos and Shmoys [2] recently gave an [O(d); k]-approximation algorithm for metric k-medians. Many of these algorithms first solve the fractional problems and then round the fractional solutions.

3 Results

Fig. 2 shows the simple randomized rounding scheme at the center of all our results. With minor variations (e.g. Fig. 4), this rounding scheme can be used as the basis for approximation algorithms for set cover, weighted set cover, facility location, and k-medians, and as the basis for Lagrangian-relaxation algorithms for the fractional variants of these problems.

Although essentially the same rounding scheme suffices for each of these problems, the respective probabilistic analyses require different (albeit standard) techniques in each case. For set cover, a simple direct analysis suffices [16]. For weighted set cover, facility location, and k-medians, a basic probabilistic lemma called Wald's inequality is necessary. For fractional set cover and k-medians, the analysis rests on the Chernoff bound. For fractional weighted set cover and facility location, the analysis is a simple application of what we call the Chernoff-Wald bound.

For each problem, we apply the method of conditional probabilities to the rounding scheme in order to derive a corresponding approximation algorithm. The structure of each resulting algorithm, being closely tied to the underlying probabilistic analysis, ends up differing substantially from problem to problem.

For facility location, the resulting algorithm (Fig. 2) is a randomized-rounding, polynomial-time [d + H_Δ·k]-approximation algorithm. The performance guarantee improves over Hochbaum's algorithm with respect to the assignment costs.

For weighted k-medians, the resulting algorithm (Fig. 3) is a greedy [(1+ε)d; ln(n + n/ε)k]-approximation algorithm. In comparison to Lin and Vitter's algorithm, the performance ratio with respect to the facility costs is better by a factor of roughly 1/ε.

For fractional k-medians, the algorithm (Fig. 5) is a [(1+ε)d, (1+ε)k]-approximation algorithm. It is a Lagrangian-relaxation algorithm and runs in O(k ln(n/ε)/ε²) linear-time iterations. This is a factor of n faster than the best bound we can show by applying the general algorithm of Plotkin, Shmoys and Tardos.

Finally, for fractional facility location, the algorithm (see Fig. 6) is a [(1+ε)(d + k)]-approximation, Lagrangian-relaxation algorithm. It runs in at most O(n ln(n)/ε²) iterations, where each iteration requires time linear in the input size times ln n. This algorithm is essentially the same as the unpublished fractional weighted set-cover algorithm due to Garg [6]. The main interest of this last result is not that we improve Garg's algorithm (we don't!), but that we recast and abstract Garg's analysis to obtain an apparently new probabilistic bound -- we call it the Chernoff-Wald bound -- that may be of general interest for probabilistic applications.²

¹Hochbaum's reduction is easy to adapt in order to reduce unweighted k-medians to a variant of set cover that we call k-set cover. The reduction also extends naturally to the fractional problems. (See appendix for details.) This shows a loose equivalence between facility location and weighted set cover, and between k-medians and k-set cover. However, Hochbaum's reduction does not preserve the distinction between facility costs and assignment costs. For this reason, we work with the facility-location and k-medians representations. The algorithms and analyses we give for facility location easily imply corresponding results for weighted set cover, and the results for k-medians are straightforward to adapt to k-set cover.

²The Chernoff-Wald bound also plays a central role in the randomized-rounding interpretation of Garg and Konemann's recent multicommodity-flow algorithm [5]. This and the general connection between randomized rounding and greedy/Lagrangian-relaxation algorithms are explored in depth in the journal version of [16], which as of October 1999 is still being written.

A basic contribution of this work is to identify and abstract out (using the probabilistic method) common techniques underlying the design and analysis of Lagrangian-relaxation and greedy approximation algorithms.

4 Wald's inequality and Chernoff-Wald bound

Before we state and prove Wald's inequality and the Chernoff-Wald bound, we give some simple examples. Suppose we perform repeated trials of a random experiment in which we roll a 6-sided die and flip a fair coin. We stop as soon as the total of the numbers rolled exceeds 3494. Let T be the number of trials. Let D ≤ 3500 be the total of the numbers rolled. Let H be the number of flips that come up heads. Since the expectation of the number in each roll is 3.5, Wald's inequality implies E[D] = 3.5 E[T], which implies E[T] ≤ 1000. Since each flip comes up heads with probability 1/2, Wald's inequality also implies E[H] = E[T]/2 ≤ 500; the Chernoff-Wald bound gives E[H] ≤ (1+ε)500 for ε ≈ 0.128 (so (1+ε)500 ≈ 564).

If we were to modify the experiment so that the number of trials T was set at 1000, the Chernoff-Wald bound would give the same conclusion, but in that case the Chernoff bound would also imply (for the same ε) that Pr[H ≥ (1+ε)500] < 1.

LEMMA 4.1. (Wald's inequality) Let T ∈ ℕ+ be a random variable with E[T] < ∞ and let X1, X2, ... be a sequence of random variables. Let μ, c ∈ ℝ. If

    E[Xt | T ≥ t] ≤ μ    (for t ≥ 1)

and Xt ≤ c (for t = 1, ..., T), then

    E(X1 + X2 + ⋯ + XT) ≤ μ E(T).

The claim also holds if each "≤" is replaced by "≥".

The condition "Xt ≤ c" is necessary. Consider choosing each Xt randomly to be ±2^{t-1} and letting T = min{t : Xt > 0}. Then E[Xt | T ≥ t] = 0, so taking μ = 0, all conditions for the theorem except "Xt ≤ c" are met. But the conclusion E[X1 + X2 + ⋯ + XT] ≤ 0 does not hold, because X1 + X2 + ⋯ + XT = 1.

The proof is just an adaptation of the proof of Wald's equation [1, p. 370]. The reader can skip it on first reading.

Proof. W.l.o.g. assume μ = 0; otherwise apply the change of variables X't = Xt − μ before proceeding. Define Yt ≜ X1 + X2 + ⋯ + Xt. If E(YT) = −∞ then the claim clearly holds. Otherwise,

    E(YT) = ∑_{t=1}^∞ Pr(T = t) E(Yt | T = t)
          = ∑_{t=1}^∞ ∑_{s=1}^t Pr(T = t) E(Xs | T = t).

The sum of the positive terms above is at most ∑_t ∑_{s≤t} Pr(T = t)·c = c·E[T] < ∞, so the terms may be reordered to give

    E(YT) = ∑_{s=1}^∞ Pr(T ≥ s) E(Xs | T ≥ s).

This establishes the claim because E(Xs | T ≥ s) ≤ 0. The claim with "≤"s replaced by "≥"s follows via the change of variables X't = −Xt, μ' = −μ, and c' = −c. □

In many applications of Wald's inequality, the random variables X1, X2, ... will be independent, and T will be a stopping time for {Xt} -- a random variable in ℕ+ such that the event "T = t" is independent of {X_{t+1}, X_{t+2}, ...}. In this case the following companion lemma facilitates the application of Wald's inequality:

LEMMA 4.2. Let X1, X2, ... be a sequence of independent random variables and let T be a stopping time for the sequence. Then E[Xt | T ≥ t] = E[Xt].

Proof. Because Xt is independent of X1, X2, ..., X_{t-1}, and the event "T < t" (i.e. T ∈ {1, 2, ..., t−1}) is determined by the values of X1, X2, ..., X_{t-1}, it follows that Xt is independent of the event "T ≥ t". □
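As an illustration of Lemma 4.1 (and the die-and-coin example above), here is a minimal Python simulation. It is an illustrative sketch only, not part of the paper's algorithms; the stopping time T is determined by the rolls alone, so the flips satisfy the hypotheses of Lemma 4.2.

    # Simulate the stopped experiment: roll a die and flip a coin per trial,
    # stopping once the running total of rolls exceeds 3494.
    import random

    def one_run():
        total = heads = trials = 0
        while total <= 3494:
            total += random.randint(1, 6)   # D accumulates the rolls
            heads += random.randint(0, 1)   # H accumulates the coin flips
            trials += 1                     # T
        return total, heads, trials

    runs = [one_run() for _ in range(10000)]
    avg = lambda i: sum(r[i] for r in runs) / len(runs)
    # Wald: E[D] = 3.5*E[T] (so E[T] <= 1000), and E[H] = E[T]/2 <= 500.
    print(avg(0), 3.5 * avg(2))   # these two should be close
    print(avg(1), avg(2) / 2)     # so should these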

Here is a statement of a standard Chernoff bound. For a proof see e.g. [14, 16]. Let ch(ε) ≜ (1+ε)ln(1+ε) − ε ≥ ε ln(1+ε)/2. For ε ≤ 1, ch(ε) ≥ ε²/3 and ch(−ε) ≥ ε²/2.

LEMMA 4.3. (Chernoff bound [14]) Let X1, X2, ..., Xk be a sequence of independent random variables in [0,1] with E(∑_i Xi) = μ > 0. Let ε > 0. Then Pr[∑_i Xi ≥ μ(1+ε)] < exp(−ch(ε)μ). For ε ≤ 1, Pr[∑_i Xi ≤ μ(1−ε)] < exp(−ch(−ε)μ).

THEOREM 4.1. (Chernoff-Wald bound) For each i = 1, 2, ..., m, let X_{i1}, X_{i2}, ... be a sequence of random variables such that 0 ≤ X_{it} ≤ 1. Let T ∈ ℕ+ be a random variable with E[T] < ∞. Let M = max_{i=1}^m X_{i1} + X_{i2} + ⋯ + X_{iT}. Suppose

    E[X_{it} | T ≥ t; {X_{js} : s < t, 1 ≤ j ≤ m}] ≤ μ

for all i, t, for some μ ∈ ℝ. Let ε ≥ 0 satisfy

    exp( −ch(ε) · max{ μ E[T], E[M]/(1+ε) } ) ≤ 1/m.

Then

    E[M] ≤ (1+ε) μ E[T].

The claim also holds with the following replacements: "min_i" for "max_i"; "≥ μ" for "≤ μ"; "E[M] ≥" for "E[M] ≤"; and, for each "ε", "−ε" (except in "ε ≥ 0").

In the case when T is constant, the Chernoff bound implies Pr[M ≥ (1+ε)μT] < 1 for ε as above.

The first-time reader can skip the following proof.

Proof. For t = 0, ..., T, let Y_{it} ≜ X_{i1} + X_{i2} + ⋯ + X_{it} (i = 1, ..., m) and

    Z_t ≜ log_{1+ε} ∑_{i=1}^m (1+ε)^{Y_{it}}.

Note that for each i, Z_T ≥ log_{1+ε} (1+ε)^{Y_{iT}} = Y_{iT}. Thus M ≤ Z_T. We use Wald's inequality to bound E[Z_T]. Fix any t ≥ 1. Let y_{it} = (1+ε)^{Y_{i,t−1}}. Then

    Z_t − Z_{t−1} = log_{1+ε} [ ∑_i y_{it}(1+ε)^{X_{it}} / ∑_i y_{it} ]
                 ≤ log_{1+ε} [ ∑_i y_{it}(1 + ε X_{it}) / ∑_i y_{it} ]
                 = log_{1+ε} [ 1 + ε ∑_i y_{it} X_{it} / ∑_i y_{it} ]
                 ≤ ε ∑_i y_{it} X_{it} / (ln(1+ε) ∑_i y_{it}).

The first inequality follows from (1+ε)^z ≤ 1 + εz for 0 ≤ z ≤ 1. The last inequality follows from log_{1+ε}(1+z) = ln(1+z)/ln(1+ε) ≤ z/ln(1+ε) for z ≥ 0. Thus, conditioned on the event T ≥ t and the values of Y_{i,t−1} (1 ≤ i ≤ m),

    E[Z_t − Z_{t−1}] ≤ ε μ ∑_i y_{it} / (ln(1+ε) ∑_i y_{it}) = με/ln(1+ε).

This implies that E[Z_t − Z_{t−1} | T ≥ t] ≤ με/ln(1+ε). Using Z_0 = log_{1+ε} m and Wald's inequality,

    E[M] ≤ E[Z_T] ≤ log_{1+ε} m + E[T] με/ln(1+ε).

By algebra, the above together with the assumption on ε imply that E[M] ≤ (1+ε)μE[T]. The proof of the claim for the minimum is essentially the same, with each "ε" replaced by "−ε" and reversals of appropriate inequalities. In verifying this, note that log_{1−ε} is a decreasing function. □

In many applications of Chernoff-Wald, each X_{it} will be independent of {X_{js} : s < t, 1 ≤ j ≤ m}, and T will be a stopping time for {X_{it}} -- a random variable in ℕ+ such that for any t the event "T = t" is independent of {X_{is} : s > t, 1 ≤ i ≤ m}. Then by an argument similar to the proof of Lemma 4.2, we have:

LEMMA 4.4. For 1 ≤ i ≤ m, let X_{i1}, X_{i2}, ... be a sequence of random variables such that each X_{it} is independent of {X_{js} : s < t, 1 ≤ j ≤ m}. Let T be a stopping time for {X_{it}}. Then

    E[X_{it} | T ≥ t; {X_{js} : s < t, 1 ≤ j ≤ m}] = E[X_{it}].

5 Randomized rounding for facility location

We use Wald's inequality to analyze a natural randomized rounding scheme for facility location.

GUARANTEE 5.1. Let F be the output of the facility-location rounding scheme in Fig. 2 given input x. Let Δ_f be the number of customers c such that x(f,c) > 0. Then E_F[dist(F) + cost(F)] is at most

    dist(x) + ∑_f cost(f) x(f) H_{Δ_f}.

Proof. Observe the following basic facts about each iteration of the outer loop:

1. The probability that a given customer c is assigned to a particular facility f in this iteration is x(f,c)/|x|.

2. The probability that c is assigned to some facility is ∑_f x(f,c)/|x| ≥ 1/|x|.

3. Given that c is assigned, the probability of it being assigned to a particular f is x(f,c).

input: fractional facility-location solution x.
output: random solution F ⊆ ℱ s.t. E_F[dist(F) + cost(F)] ≤ dist(x) + H_Δ·cost(x).

1. Repeat until all customers are assigned:
2.    Choose a single facility f at random so that Pr(f chosen) = x(f)/|x|.
3.    For each customer c, independently with probability x(f,c)/x(f):
4.        Assign (or, if c was previously assigned, reassign) c to f.
5. Return the set containing those facilities having customers assigned to them.

Figure 2: Facility-location rounding scheme. Note |x| ≜ ∑_f x(f).
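The following is a minimal Python sketch of the rounding scheme in Fig. 2. The data-structure choices (dicts keyed by facility and by (facility, customer) pair) are ours, not the paper's.

    import random

    def round_facility_location(x_f, x_fc, customers):
        # x_f[f] = x(f); x_fc[(f, c)] = x(f, c); |x| = sum_f x(f).
        total = sum(x_f.values())
        facilities = list(x_f)
        weights = [x_f[f] / total for f in facilities]
        assigned = {}                                            # customer -> facility
        while len(assigned) < len(customers):                    # step 1
            f = random.choices(facilities, weights)[0]           # step 2
            for c in customers:                                  # step 3
                p = x_fc.get((f, c), 0.0) / x_f[f] if x_f[f] > 0 else 0.0
                if random.random() < p:
                    assigned[c] = f                              # step 4
        return set(assigned.values())                            # step 5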

To bound E[dist(F)], it suffices to bound the expected cost of the assignment chosen by the algorithm. For a given pair (f,c), what is the probability that c is assigned to f? By the third fact above, this is x(f,c). Thus, E[dist(F)] ≤ ∑_{f,c} Pr(c assigned to f) dist(f,c) = dist(x).

To finish, we show Pr(f ∈ F) ≤ H_{Δ_f} x(f). This suffices because it implies that E[cost(F)] equals ∑_f Pr(f ∈ F) cost(f) ≤ ∑_f H_{Δ_f} x(f) cost(f).

Fix a facility f. Call the customers c such that x(f,c) > 0 the "fractional customers of f". Let random variable T be the number of iterations before all these customers are assigned.

Claim 1: Pr(f ∈ F) ≤ E[T] x(f)/|x|.

Proof: Define X_t to be the indicator variable for the event "f is first chosen in iteration t". As E[X_t | T ≥ t] ≤ x(f)/|x|, by Wald's inequality Pr(f ∈ F) = E[X_1 + X_2 + ⋯ + X_T] ≤ E[T] x(f)/|x|. This proves the claim.

Recall that Δ_f is the number of fractional customers of f. To finish the proof, it suffices to show:

Claim 2: E[T] ≤ |x| H_{Δ_f}.

Proof: Define u_t to be the number of fractional customers of f not yet assigned after iteration t (0 ≤ t ≤ T). (Recall H_0 = 0 and H_i ≜ 1 + 1/2 + ⋯ + 1/i.) Then provided t < T, H_{u_t} − H_{u_{t+1}} is

    1/u_t + 1/(u_t − 1) + ⋯ + 1/(u_{t+1} + 1)  ≥  (u_t − u_{t+1})/u_t.

The expectation of the right-hand side is at least 1/|x|, because each customer is assigned in each iteration with probability at least 1/|x|. Since H_{u_t} is H_{Δ_f} when t = 0, decreases in expectation by at least 1/|x| in each iteration while t ≤ T, and H_{u_T} = 0, Wald's inequality gives H_{Δ_f} ≥ E[T]/|x|. The claim follows. □

The randomized scheme can easily be derandomized.

COROLLARY 5.1. There is a polynomial-time [d + H_Δ·k]-approximation algorithm for uncapacitated facility location.

6 Greedy algorithm for weighted k-medians

The k-medians rounding scheme takes a fractional k-medians solution (x, k, d) and an ε > 0 and outputs a random solution F. The scheme is the same as the facility-location rounding scheme in Fig. 2 except for the termination condition. The modified algorithm terminates after the first iteration in which the facility cost exceeds k·ln(n + n/ε) or all customers are assigned with assignment cost less than d(1+ε). We analyze this rounding scheme using Wald's inequality and then derandomize it to obtain a greedy algorithm (Fig. 3).

GUARANTEE 6.1. Let F be the output of the weighted k-medians rounding scheme given input x. Then cost(F) ≤ k ln(n + n/ε) + max_f cost(f) and, with positive probability, dist(F) ≤ d(1+ε).

Proof. The bound on cost(F) always holds due to the termination condition of the algorithm.

Let random variable T be the number of iterations of the rounding scheme. Let random variable u_t be the number of not-yet-assigned customers at the end of round t (0 ≤ t ≤ T). By fact 2 in the proof of Guarantee 5.1, E[u_t | u_{t−1} ∧ T ≥ t] ≤ (1 − 1/|x|) u_{t−1}.

Define random variable d_t to be the total cost of the current (partial) assignment of customers to facilities at the end of round t. Because each customer is reassigned with probability 1/|x| in each round, it is not too hard to show that E[d_t | d_{t−1} ∧ T ≥ t] = (1 − 1/|x|) d_{t−1} + d/|x|.

Define random variable c_t to be the total cost of the facilities chosen so far at the end of round t. In each iteration, E[cost(f)] = k/|x|, so that E[c_t | c_{t−1} ∧ T ≥ t] = c_{t−1} + k/|x|.

Define φ_t ≜ c_t/k + ln[u_t + (d_t/d − 1)/(1+ε)]. By the facts in the preceding three paragraphs, a calculation shows E[φ_t − φ_{t−1} | d_{t−1}, u_{t−1}, c_{t−1}] ≤ 0. By Wald's inequality, this implies E[φ_T] ≤ φ_0 ≤ ln n. Thus, with positive probability, φ_T ≤ ln n. Assuming this event occurs, we will show that at the end all elements are covered and the assignment cost is not too high.

input: dist(), cost(), ε, d, n.
output: set F of facilities s.t. cost(F) ≤ k ln(n + n/ε) + max_f cost(f) and dist(F) ≤ (1+ε)d.

1. Define ψ(c, f) ≜ dist(c,f)/(d(1+ε)).            ... c's contribution to û_t + d̂_t/(d(1+ε)) if c is assigned to f
2. For each customer c do: f_c ← none.             ... f_c is the facility c is currently assigned to
3. Define ψ(c, none) ≜ 1.                          ... contribution is 1 if c is unassigned
4. Repeat until all customers are assigned with assignment cost ≤ d(1+ε):
5.    For each facility f define C_f ≜ {c : ψ(c, f_c) > ψ(c, f)}.
6.    Choose a facility f to maximize [ ∑_{c∈C_f} ψ(c, f_c) − ψ(c, f) ] / cost(f).
7.    For each c ∈ C_f do: f_c ← f.                ... assign c to f
8. Return the set of chosen facilities.

Figure 3: The greedy k-medians algorithm.
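Here is a minimal Python sketch of the greedy algorithm in Fig. 3. Representing dist as a dict of finite (facility, customer) distances is our choice, and no attempt is made at the per-iteration linear-time implementation discussed in the text.

    def greedy_k_medians(dist, cost, eps, d, customers, facilities):
        # dist[(f, c)]: finite distances only; cost[f]: facility cost.
        INF = float('inf')
        psi = lambda c, f: 1.0 if f is None else dist.get((f, c), INF) / (d * (1 + eps))
        assigned = {c: None for c in customers}              # f_c, steps 2-3
        chosen = set()
        def assignment_cost():
            return sum(dist[(f, c)] for c, f in assigned.items() if f is not None)
        while (any(f is None for f in assigned.values())
               or assignment_cost() > d * (1 + eps)):        # step 4
            best, best_f, best_C = -1.0, None, []
            for f in facilities:                             # steps 5-6
                C = [c for c in customers if psi(c, assigned[c]) > psi(c, f)]
                gain = sum(psi(c, assigned[c]) - psi(c, f) for c in C)
                if gain / cost[f] > best:
                    best, best_f, best_C = gain / cost[f], f, C
            for c in best_C:                                 # step 7
                assigned[c] = best_f
            chosen.add(best_f)
        return chosen                                        # step 8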

If the rounding scheme terminates because all elements are covered and the assignment cost is less than (1+ε)d, then clearly the performance guarantee holds. Otherwise the algorithm terminates because the facility cost c_T exceeds ln(n + n/ε)·k. This lower bound on c_T and the occurrence of the event "φ_T ≤ ln n" imply that u_T < 1 and d_T ≤ (1+ε)d. □

Next we apply the method of conditional probabilities. Let T, d_t, u_t, c_t, and φ_t (0 ≤ t ≤ T) be defined as in the proof of Guarantee 6.1 for the k-medians rounding scheme. That proof showed that E[φ_T] ≤ ln n, and that if φ_T ≤ ln n then F meets the performance guarantee.

To obtain the greedy algorithm, in each iteration we replace the random choices by deterministic choices. Let d̂_t, û_t, and ĉ_t denote, respectively, the assignment cost, number of unassigned elements, and facility cost at the end of the t-th iteration of the greedy algorithm (analogous to d_t, u_t, and c_t for the randomized algorithm). The greedy algorithm will make its choices in a way that maintains the invariant

    E[φ_T | d_t = d̂_t ∧ u_t = û_t ∧ c_t = ĉ_t] ≤ ln n.

Note that the expectation above is with respect to the random experiment. That is, the invariant says that if, starting from the current configuration, the remaining choices were to be made randomly, then (in expectation) φ would end up less than ln n.

Define φ̂_t ≜ ĉ_t/k + ln[û_t + (d̂_t/d − 1)/(1+ε)] (analogous to φ_t for the randomized algorithm). The proof of Guarantee 6.1 easily generalizes to show E[φ_T | d_t = d̂_t ∧ u_t = û_t ∧ c_t = ĉ_t] ≤ φ̂_t. Thus, it suffices to maintain the invariant φ̂_t ≤ ln n. Since φ̂_0 = φ_0 ≤ ln n, the invariant holds initially.

During each iteration t, the algorithm chooses a facility f and assigns it a set of customers C so that φ̂_t ≤ φ̂_{t−1}. A calculation shows φ̂_t − φ̂_{t−1} is less than

    cost(f)/k  −  [ û_{t−1} − û_t + (d̂_{t−1}/d − d̂_t/d)/(1+ε) ] / [ û_{t−1} + (d̂_{t−1}/d − 1)/(1+ε) ].

It suffices to choose f and C so that the above is non-positive. Whatever û_{t−1}, ĉ_{t−1}, and d̂_{t−1} are, if f and C are chosen randomly, then the expectation of the above is at most zero. Thus, there is some choice of f and C which makes it non-positive. Thus it suffices to choose f and C to maximize

    [ û_{t−1} − û_t + (d̂_{t−1}/d − d̂_t/d)/(1+ε) ] / cost(f).

The algorithm considers each facility f; for each f, it determines the best set C of customers to assign. The algorithm is shown in Fig. 3.

The termination condition in the algorithm differs from the one in the rounding scheme, but the modified termination condition suffices because it follows from the analysis that, whatever k is, the algorithm will terminate no later than the first iteration in which the facility cost exceeds k ln(n + n/ε). By the derivation,

GUARANTEE 6.2. Given ε and d such that a fractional solution of cost k and assignment cost at most d exists, the greedy weighted k-medians algorithm (Fig. 3) returns a solution F such that dist(F) ≤ (1+ε)d and cost(F) ≤ k ln(n + n/ε) + max_f cost(f).

Without loss of generality (since we are approximately solving the IP) max_f cost(f) ≤ k. For the unweighted problem, the number of iterations is O(k ln(n/ε)). No facility is chosen twice, so the number of iterations is always at most m, the number of facilities.

COROLLARY 6.1. Let 0 < ε < 1. The weighted k-medians problem has a [(1+ε)d; (1 + ln(n + n/ε))k]-approximation algorithm that runs in O(m) linear-time iterations, or O(k ln(n/ε)) iterations for the unweighted problem.
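For concreteness, here is one way to verify the per-iteration bound on φ̂_t − φ̂_{t−1} claimed above, using ln(1+z) ≤ z. This short derivation is our reconstruction, not quoted from the paper.

    \[
    \hat\phi_t - \hat\phi_{t-1}
      = \frac{\hat c_t - \hat c_{t-1}}{k}
        + \ln\frac{\hat u_t + (\hat d_t/d - 1)/(1+\epsilon)}
                  {\hat u_{t-1} + (\hat d_{t-1}/d - 1)/(1+\epsilon)}
      \;\le\; \frac{\mathrm{cost}(f)}{k}
        - \frac{\hat u_{t-1} - \hat u_t + (\hat d_{t-1}/d - \hat d_t/d)/(1+\epsilon)}
               {\hat u_{t-1} + (\hat d_{t-1}/d - 1)/(1+\epsilon)},
    \]
    since $\hat c_t - \hat c_{t-1} = \mathrm{cost}(f)$ and
    $\ln(A_t/A_{t-1}) \le (A_t - A_{t-1})/A_{t-1}$ for
    $A_t = \hat u_t + (\hat d_t/d - 1)/(1+\epsilon)$.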

input: fractional k-medians solution (x*, k, d), 0 < ε < 1.
output: random fractional solution x̂ s.t. cost(x̂) = (1−ε)⁻¹·k and E[dist(x̂)] ≤ (1−ε)⁻²·d.

1. Choose N ≥ ln(n/ε)/ch(−ε) s.t. Nk is an integer.        ... recall ch(ε) ≥ ε ln(1+ε)/2
2. For each f, c do: x(f) ← x(f,c) ← x(c) ← 0.
3. Repeat Nk times:
4.    Choose a single facility f at random so that Pr(f chosen) = x*(f)/k.
5.    Increment x(f).
6.    For each customer c, with probability x*(f,c)/x*(f) do:
7.        Increment x(f,c) and x(c).
8. Return x̂, where x̂ ≜ x/((1−ε)N).

Figure 4: Fractional k-medians rounding scheme.
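A minimal Python sketch of the rounding scheme in Fig. 4 follows. The use of math.ceil to pick N, and the dict-based representation of x*, are our choices; the sketch also assumes Nk comes out integral.

    import math, random
    from collections import defaultdict

    def ch(e):
        return (1 + e) * math.log(1 + e) - e

    def round_fractional_k_medians(x_star_f, x_star_fc, k, customers, eps):
        # x_star_f[f] = x*(f); x_star_fc[(f, c)] = x*(f, c); k = cost(x*).
        n = len(customers)
        N = math.ceil(math.log(n / eps) / ch(-eps))          # step 1
        x_f = defaultdict(int); x_fc = defaultdict(int); x_c = defaultdict(int)
        facilities = list(x_star_f)
        weights = [x_star_f[f] / k for f in facilities]
        for _ in range(int(round(N * k))):                   # step 3
            f = random.choices(facilities, weights)[0]       # step 4
            x_f[f] += 1                                      # step 5
            for c in customers:                              # steps 6-7
                if random.random() < x_star_fc.get((f, c), 0.0) / x_star_f[f]:
                    x_fc[(f, c)] += 1; x_c[c] += 1
        scale = (1 - eps) * N
        return ({f: v / scale for f, v in x_f.items()},
                {fc: v / scale for fc, v in x_fc.items()})   # step 8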

7 Lagrangian relaxation for k-medians

In this section we derive and analyze a Lagrangian-relaxation, [(1+ε)²d; (1+ε)k]-approximation algorithm for the fractional unweighted k-medians problem. The rounding scheme is shown in Fig. 4. We use the Chernoff and Markov bounds to bound the probability of failure.

GUARANTEE 7.1. Let x̂ be the output of the fractional k-medians rounding scheme. Then with positive probability x̂ has cost(x̂) ≤ (1−ε)⁻¹k and dist(x̂) ≤ (1−ε)⁻²d.

Proof. Recall k = cost(x*) = |x*| and d = dist(x*). The bound on the cost always holds, because each of the kN iterations adds 1 to cost(x). It remains to show that with positive probability, after the final iteration, dist(x) ≤ (1−ε)⁻¹Nd and each x(c) ≥ (1−ε)N.

Since each iteration increases dist(x) by d/k in expectation, finally E[dist(x)] ≤ (Nk)d/k = dN. By the Markov bound, Pr[dist(x) ≥ dN/(1−ε)] ≤ 1−ε.

For any customer c, x(c) is the sum of kN independent 0-1 random variables, each with expectation at least 1/k, so by the Chernoff bound, Pr[x(c) ≤ (1−ε)N] < exp(−ch(−ε)N), which is at most ε/n by the choice of N.

By the naive union bound, Pr[dist(x) ≥ dN/(1−ε) ∨ min_c x(c) ≤ (1−ε)N] < (1−ε) + n(ε/n) = 1. □

Next we sketch how the method of conditional expectations yields the algorithm shown in Fig. 5. The proof of Guarantee 7.1 implicitly bounds the probability of failure by the expectation of

    dist(x)/(dN/(1−ε)) + ∑_c (1−ε)^{x(c)} / (1−ε)^{(1−ε)N},

and it shows that the expectation is less than 1. An upper bound (called a "pessimistic estimator" [14]) of the conditional expectation of the final value of the above, given the current value of x and the number t of remaining iterations, is

    Φ(x, t) ≜ [dist(x) + t·d/k] / (dN/(1−ε)) + ∑_c (1−ε)^{x(c)} e^{−tε/k} / (1−ε)^{(1−ε)N}.

The algorithm chooses f and C in each iteration in order to minimize the increase in the above quantity, which consequently stays less than 1.

GUARANTEE 7.2. The fractional k-medians algorithm in Fig. 5 returns a fractional solution x̂ having cost(x̂) ≤ (1−ε)⁻¹k and dist(x̂) ≤ (1−ε)⁻²d.

Proof. (Sketch.) Let Φ be as defined above. The algorithm maintains the invariant that, with t iterations remaining, Φ(x, t) < 1. It is straightforward to verify that the invariant is initially true, and that if it is true at the end, then the performance guarantee is met. We verify that the invariant is maintained at each step. The increase in Φ in a single iteration is proportional to

    ∑_{c∈C} dist(f,c) − y(c),

where y(c) = α(1−ε)^{x(c)} before the iteration, for a suitably chosen scalar α > 0. If f and C were chosen randomly as in the rounding scheme, the expectation of the above would be at most 0. The choice made by the algorithm minimizes the above quantity, therefore the algorithm maintains the invariant. □

COROLLARY 7.1. Let 0 < ε < 1. The fractional k-medians problem has a [(1+ε)d, (1+ε)k]-approximation algorithm that runs in O(k ln(n/ε)/ε²) linear-time iterations.

input: fractional k-medians instance, 0 < ε < 1, k, d.
output: fractional solution x̂ s.t. cost(x̂) ≤ (1−ε)⁻¹k and dist(x̂) ≤ (1−ε)⁻²d.

1. Choose N ≥ ln(n/ε)/ch(−ε) such that Nk is an integer.        ... recall ch(−ε) ≥ ε²/2
2. For each f, c do: x(f) ← x(f,c) ← x(c) ← 0; y(c) ← ε(1−ε)⁻¹Nd·e^{−ch(−ε)N}.
3. Repeat Nk times:
4.    For each c do: set y(c) ← y(c)·e^{ε/k}.
5.    Choose a single facility f and a set of customers C to maximize ∑_{c∈C} y(c) − dist(f,c).
6.    Increment x(f).
7.    For each c ∈ C do: increment x(f,c) and x(c), and set y(c) ← (1−ε)y(c).
8. Return x̂, where x̂ ≜ x/((1−ε)N).

Figure 5: Lagrangian-relaxation algorithm for fractional k-medians. In step 5, it suffices to choose the best set of the form C = {c : y(c) > dist(f,c)} for some f. This can be done in linear time.
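As the caption notes, the inner maximization only needs to consider, for each facility f, the set C = {c : y(c) > dist(f,c)}. A small Python helper illustrating that selection (our own illustrative code, with dist stored as a dict of finite entries):

    def best_facility_and_customers(y, dist, facilities, customers):
        # For each f, the maximizing C is {c : y(c) > dist(f, c)};
        # return the facility (and its C) with the largest total slack.
        best = (float('-inf'), None, [])
        for f in facilities:
            C = [c for c in customers
                 if (f, c) in dist and y[c] > dist[(f, c)]]
            score = sum(y[c] - dist[(f, c)] for c in C)
            if score > best[0]:
                best = (score, f, C)
        return best[1], best[2]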

8 Lagrangian relaxation for facility location

The rounding scheme for fractional facility location is the same as the rounding scheme for fractional k-medians in Fig. 4, except for the termination condition. The modified algorithm terminates after the first iteration in which each x(c) ≥ (1−ε)N, where N is chosen to be at least ln(n)/ch(−ε) s.t. (1−ε)N is an integer. We use the Chernoff-Wald bound to analyze the scheme.

GUARANTEE 8.1. Let x̂ be the output of the fractional facility-location rounding scheme as described above. Then x̂ is a fractional solution to the facility-location LP and E[cost(x̂) + dist(x̂)] ≤ (1−ε)⁻¹(d + k).

Proof. The termination condition ensures that all customers are adequately covered. It remains to bound E[cost(x̂) + dist(x̂)].

Let r.v. T denote the number of iterations of the rounding scheme. In each iteration, the expected increase in cost(x) + dist(x) is k/|x*| + d/|x*|. By Wald's inequality, at termination, E[cost(x) + dist(x)] ≤ E[T](k + d)/|x*|. It remains to show E[T] ≤ N|x*|.

For any customer c, the probability that x(c) is incremented in a given iteration is at least 1/|x*|, independently of the previous iterations. Let r.v. M = min_c x(c) at the end. Note that, by the choice of N, in fact M = (1−ε)N.

By the Chernoff-Wald bound, (1−ε)N = E[M] ≥ (1−ε)E[T]/|x*| (provided ch(−ε) ≥ ln(n)(1−ε)/E[M], which indeed holds by the choice of N). Rewriting gives³ E[T] ≤ N|x*|. □

Next we sketch how applying the method of conditional expectations gives the Lagrangian-relaxation algorithm shown in Fig. 6. Below we let x_f denote the value of x after the final iteration of the algorithm and x denote the value at the "current" iteration. The analysis of the rounding scheme shows that E[cost(x_f) + dist(x_f)] ≤ N(k + d). The conditional expectation of cost(x_f) + dist(x_f) at the end, given the current x, is cost(x) + dist(x) + E[t | x](k + d)/|x*|, where random variable t is the number of iterations left.

The proof of Chernoff-Wald, in this context, argues that the quantity M̃(x) ≜ log_{1−ε} ∑_c (1−ε)^{x(c)} is log_{1−ε} n initially, at most (1−ε)N finally, and increases in expectation by at least −ε/(|x*| ln(1−ε)) in each iteration. An easy generalization of the argument shows E[t | x] is at most

    [ (1−ε)N − M̃(x) ] / ( −ε/(|x*| ln(1−ε)) ).

This gives us our pessimistic estimator: E[cost(x_f) + dist(x_f) | x] ≤ Ψ(x), where

    Ψ(x) ≜ cost(x) + dist(x) + (k + d) · [ (1−ε)N − M̃(x) ] / ( −ε/ln(1−ε) ).

The algorithm chooses f and C to keep Ψ from increasing (although not necessarily to minimize Ψ) at each round. (Recall that x̂ = x/((1−ε)N).)

GUARANTEE 8.2. Let x̂ be the output of the algorithm shown in Fig. 6. Then x̂ is a fractional solution to the facility-location LP and cost(x̂) + dist(x̂) ≤ (1−ε)⁻¹ min_x [dist(x) + cost(x)], where x ranges over all fractional solutions.

Proof. (Sketch.) Define Ψ as above. The algorithm maintains the invariant Ψ(x) ≤ (k + d)N. A calculation³ shows that the invariant is initially true by the choice of N. Clearly, if the invariant is true at the end, then the performance guarantee holds.

³This proof is an adaptation of part of the Chernoff-Wald proof to this context. For further details on parts marked with this footnote, see that proof.

input: fractional facility-location instance, 0 < ε < 1.
output: fractional solution x̂ s.t. cost(x̂) + dist(x̂) ≤ (1−ε)⁻¹(d + k).

1. Choose N ≥ ln(n)/ch(−ε) such that N(1−ε) is an integer.        ... recall ch(−ε) ≥ ε²/2
2. For each f, c do: x(f) ← x(f,c) ← x(c) ← 0; y(c) ← 1.
3. Repeat until each x(c) ≥ (1−ε)N:
4.    Choose a single facility f and a set of customers C to maximize

          ∑_{c∈C} y(c) / [ cost(f) + ∑_{c∈C} dist(f,c) ].

5.    Increment x(f).
6.    For each c ∈ C do:
7.        Increment x(f,c) and x(c), and set y(c) ← (1−ε)y(c). If x(c) ≥ N, set y(c) ← 0.
8. Return x̂, where x̂ ≜ x/((1−ε)N).

Figure 6: Lagrangian-relaxation algorithm for fractional facility location. In step 4, it suffices to choose the best set of the form C = {c : y(c)/dist(f,c) ≥ λ} for some λ.
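A minimal Python sketch of the algorithm in Fig. 6. For the inner maximization it tries, for each facility, all prefixes of the customers sorted by dist(f,c)/y(c) (which realizes every set of the form in the caption), rather than the linear-time method alluded to there; the data-structure choices are ours, and the integrality requirement on (1−ε)N is ignored for simplicity.

    import math

    def ch(e):
        return (1 + e) * math.log(1 + e) - e

    def fractional_facility_location(dist, cost, facilities, customers, eps):
        # dist[(f, c)] holds the finite distances; cost[f] is the facility cost.
        n = len(customers)
        N = math.ceil(math.log(n) / ch(-eps))                       # step 1
        x_f = {f: 0 for f in facilities}
        x_fc = {}
        x_c = {c: 0 for c in customers}
        y = {c: 1.0 for c in customers}                             # step 2
        while min(x_c.values()) < (1 - eps) * N:                    # step 3
            best = (0.0, None, [])
            for f in facilities:                                    # step 4
                cands = sorted((c for c in customers if (f, c) in dist),
                               key=lambda c: dist[(f, c)] / y[c] if y[c] > 0 else float('inf'))
                num, den, C = 0.0, cost[f], []
                for c in cands:
                    num += y[c]; den += dist[(f, c)]; C.append(c)
                    if num / den > best[0]:
                        best = (num / den, f, list(C))
            _, f, C = best
            x_f[f] += 1                                             # step 5
            for c in C:                                             # steps 6-7
                x_fc[(f, c)] = x_fc.get((f, c), 0) + 1
                x_c[c] += 1
                y[c] *= (1 - eps)
                if x_c[c] >= N:
                    y[c] = 0.0
        scale = (1 - eps) * N
        return ({f: v / scale for f, v in x_f.items()},
                {fc: v / scale for fc, v in x_fc.items()})          # step 8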

In a given iteration, the increase in Ψ is at most³ (k + d)ε times

    [ cost(f) + ∑_{c∈C} dist(f,c) ] / (k + d)  −  ∑_{c∈C} y(c) / ∑_c y(c),

where y(c) = (1−ε)^{x(c)}. If f and C were chosen randomly as in the rounding scheme, the expectation of the above quantity would be non-positive.³ Thus, to keep it non-positive, it suffices to choose f and C to maximize

    ∑_{c∈C} y(c) / [ cost(f) + ∑_{c∈C} dist(f,c) ],

which is what the algorithm does. □

In each iteration, at least one customer c with x(c) < N has x(c) incremented. Thus the number of iterations is O(n ln(n)/ε²). Each iteration can be implemented in time linear in the input size times O(ln n). Thus,

COROLLARY 8.1. Let 0 < ε < 1. Fractional facility location has a [(1+ε)(d + k)]-approximation algorithm that runs in O(n ln(n)/ε²) iterations, each requiring time linear in the input size times ln n.

9 Further directions

Is there a greedy [d + H_Δ·k]-approximation algorithm for facility location? A [(1+ε)d; (1 + ln(n + n/ε))k]-approximation algorithm for weighted k-medians? A [d, (1+ε)k]- or [(1+ε)d, k]-approximation algorithm for fractional k-medians? A Lagrangian-relaxation algorithm for fractional weighted k-medians?

The running times of all of the algorithms here can probably be improved using techniques similar to the one that Fleischer applied to improve Garg and Konemann's multicommodity-flow algorithm [4], or (depending on the application) using standard data structures.

In practice, changing the objective function of the LP relaxation of the IP to better reflect the performance guarantee might be worthwhile. For example, if one is going to randomly round a fractional solution x to the facility-location LP, it might be better to minimize dist(x) + ∑_f cost(f)x(f)H_{Δ_f} rather than dist(x) + cost(x). This gives a performance guarantee that is provably as good, and may allow the LP to compensate for the fact that the difficulty of approximating the various components of the cost varies.

Acknowledgements

Thanks to Sanjeev Arora, Lisa Fleischer, Naveen Garg, and David Shmoys for helpful comments.

References

[1] Allan Borodin and Ran El-Yaniv. Online Computation and Competitive Analysis. Cambridge University Press, 1998.

[2] M. Charikar, S. Guha, E. Tardos, and D.B. Shmoys. A constant-factor approximation algorithm for the k-median problem. In Proceedings of the Thirty-First Annual ACM Symposium on Theory of Computing, 1999.

[3] V. Chvátal. A greedy heuristic for the set-covering problem. Mathematics of Operations Research, 4(3):233-235, 1979.

[4] Lisa Fleischer. Unpublished manuscript. 1999.

[5] Naveen Garg and Jochen Könemann. Faster and simpler algorithms for multicommodity flow and other fractional packing problems. In 39th Annual Symposium on Foundations of Computer Science, 1998.

[6] Naveen Garg. Unpublished manuscript. Distributed at Dagstuhl, 1998.

[7] Sudipto Guha and Samir Khuller. Greedy strikes back: Improved facility location algorithms. In Proceedings of the Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 649-657, 1998.

[8] Dorit S. Hochbaum. Heuristics for the fixed cost median problem. Math. Programming, 22(2):148-162, 1982.

[9] David S. Johnson. Approximation algorithms for combinatorial problems. Journal of Computer and System Sciences, 9:256-278, 1974.

[10] Jyh-Han Lin and Jeffrey Scott Vitter. ε-approximations with minimum packing constraint violation (extended abstract). In Proceedings of the Twenty-Fourth Annual ACM Symposium on Theory of Computing, pages 771-782, 1992.

[11] László Lovász. On the ratio of optimal integral and fractional covers. Discrete Mathematics, 13:383-390, 1975.

[12] George L. Nemhauser and Laurence A. Wolsey. Integer and Combinatorial Optimization. John Wiley and Sons, New York, 1988.

[13] Serge A. Plotkin, David B. Shmoys, and Éva Tardos. Fast approximation algorithms for fractional packing and covering problems. Math. Oper. Res., 20(2):257-301, 1995.

[14] Prabhakar Raghavan. Probabilistic construction of deterministic algorithms approximating packing integer programs. Journal of Computer and System Sciences, 37(2):130-143, October 1988.

[15] David B. Shmoys, Éva Tardos, and Karen Aardal. Approximation algorithms for facility location problems (extended abstract). In Proceedings of the Twenty-Ninth Annual ACM Symposium on Theory of Computing, pages 265-274, 1997.

[16] Neal E. Young. Randomized rounding without solving the linear program. In Proceedings of the Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 170-178, 1995.

A K-medians via k-set cover via PST

For completeness, we discuss a relation between fractional k-medians and the mixed packing/covering framework of Plotkin, Shmoys, and Tardos (PST) [13]. First we consider the k-set cover problem -- a variant of weighted set cover in which each set s is given a "distance" dist(s) ∈ ℝ+, and the goal is to choose a cover (a collection of sets containing all elements) of size at most k, minimizing the total distance.

We formulate the decision problem (given d, is there a set cover of size at most k and distance at most d?) as a mixed packing/covering problem. Let P ≜ {x : ∑_s x_s ≤ k}. For x ∈ P define x_e ≜ ∑_{s∋e} x_s and dist(x) ≜ ∑_s x_s dist(s). Then the fractional k-set cover problem is the packing/covering problem: ∃? x ∈ P : (∀e) x_e ≥ 1; dist(x) ≤ d.

We can solve this using the PST algorithm as follows. The input to that algorithm is (dist(), k, d, ε). We can use it to compute an approximate solution x such that (∀e) x_e ≥ 1 − ε and dist(x) ≤ (1+ε)d (provided the original problem is feasible). We scale x, multiplying it by 1 + O(ε), to get the final output.

With care, we can show that to implement the PST algorithm, it suffices to have a subroutine that, given a vector α, returns x ∈ P minimizing dist(x) − ∑_e α_e x_e. An optimal x can be found by enumerating the sets and choosing the set s that minimizes dist(s) − ∑_{e∈s} α_e.

The running time of the PST algorithm is dominated by the time spent in this subroutine. The subroutine is called O(p ln(m)/ε²) times, where m is the number of elements and p is the width of the problem instance, which in this case is k·max_s{dist(s)/d, 1}. Thus,

COROLLARY A.1. The fractional k-set cover decision problem reduces to a mixed packing/covering problem of width p = k·max_s{dist(s)/d, 1}. If a problem instance is feasible, the algorithm of [13] yields a fractional solution x with |x| ≤ (1+ε)k and dist(x) ≤ (1+ε)d in time linear in the input size times O(p ln(m)/ε²).

In many cases, we can assume without loss of generality that max_s dist(s) ≤ d, in which case the width is k. Except for the fact that this is a decision procedure, this is comparable to Corollary 7.1. (Although that bound requires no assumption about d.)

Next we sketch how weighted k-medians reduces to k-set cover. We adapt Hochbaum's facility-location-to-set-cover reduction. Fix a weighted k-medians instance with n facilities and m customers C. Construct an (exponentially large) family of sets as follows. For each facility f and subset C of customers, define a set S_{fC} ≜ C, with dist(S_{fC}) ≜ ∑_{c∈C} dist(f,c) and cost(S_{fC}) ≜ cost(f). Then each k-medians solution corresponds to a k-set cover, and vice versa. The bijection preserves dist, and extends naturally to the fractional problems as well.

Even though the resulting fractional k-set cover instance is exponentially large, we can still solve it efficiently using PST, provided we have a subroutine that, given a vector α, efficiently finds a facility f and set of customers C minimizing ∑_{c∈C} dist(f,c) − α_c. This C and f can in fact be found by choosing the facility f minimizing ∑_{c∈C_f} dist(f,c) − α_c, where C_f = {c : dist(f,c) < α_c}. Thus, we have

COROLLARY A.2. The fractional weighted k-medians decision problem reduces to a mixed packing/covering problem of width p = k·max_f{∑_c dist(f,c)/d, 1}. If a problem instance is feasible, the algorithm of [13] yields a fractional solution x with |x| ≤ (1+ε)k and dist(x) ≤ (1+ε)d in time linear in the input size times O(p ln(m)/ε²).

If each dist(f,c) ≤ d, then p ≤ km, where m is the number of customers. This bound on the running time is a factor of m worse than the bound in Corollary 7.1 (though a reduction yielding smaller width may be possible).
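The oracle described above is simple to implement directly; here is an illustrative Python sketch (our own, with α given as a dict over customers and dist as a dict of finite (facility, customer) distances):

    def k_medians_oracle(dist, facilities, customers, alpha):
        # Find f and C minimizing sum_{c in C} (dist(f,c) - alpha_c).
        # For each f, the minimizing C is C_f = {c : dist(f,c) < alpha_c}.
        best_val, best_f, best_C = float('inf'), None, []
        for f in facilities:
            C = [c for c in customers
                 if (f, c) in dist and dist[(f, c)] < alpha[c]]
            val = sum(dist[(f, c)] - alpha[c] for c in C)
            if val < best_val:
                best_val, best_f, best_C = val, f, C
        return best_f, best_C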