Online Algorithms for Covering and with Convex Objectives

Yossi Azar∗, Niv Buchbinder†, T-H. Hubert Chan‡, Shahar Chen§, Ilan Reuven Cohen†, Anupam Gupta¶, Zhiyi Huang‡, Ning Kang‡, Viswanath Nagarajank, Joseph (Seffi) Naor§, Debmalya Panigrahi∗∗ ∗Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel. Email: {[email protected], [email protected]} †Department of Statistics and Operations Research, Tel Aviv University, Tel Aviv, Israel. Email: [email protected] ‡Department of Computer Science, The University of Hong Kong, Hong Kong, China. Email: {hubert,zhiyi,nkang}@cs.hku.hk §Department of Computer Science, Technion - Israel Institute of Technology, Haifa, Israel. Email:{[email protected], [email protected]} ¶Department of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA. Email: [email protected] kDepartment of Industrial and Operations Engineering, University of Michigan, Ann Arbor, MI 48109, USA. Email: [email protected] ∗∗Department of Computer Science, Duke University, Durham, NC 27708, USA. Email: [email protected]

Abstract—We present online algorithms for covering and a primal-dual algorithm, and then rounding it, has proved to be packing problems with (non-linear) convex objectives. The convex a very powerful general technique (see [4] for a survey). This covering problem is defined as: minx∈ n f(x) s.t. Ax ≥ 1, where R+ framework is applicable to many central online problems, like f : n → A m × n R+ R+ is a monotone convex function, and is an the classic ski rental problem, online set-cover [5], generalized matrix with non-negative entries. In the online version, a new row of the constraint matrix, representing a new covering constraint, paging [6], [7], k-server and metrical task systems [8], [9], is revealed in each step and the algorithm is required to maintain graph connectivity [10], [11], routing [12], load balancing and a feasible and monotonically non-decreasing assignment x over machine scheduling [13], [14], [15], and budgeted time. We also consider a convex packing problem defined as: Pm T n allocation [16]. Via this approach, not only can we unify maxy∈ m yj − g(A y), where g : → + is a monotone R+ j=1 R+ R previously known results, but we can also resolve important convex function. In the online version, each variable y arrives j open questions in competitive analysis. The use of the online online and the algorithm must decide the value of yj on its arrival. This represents the Fenchel dual of the convex covering program, primal-dual method mirrors (and is inspired by) the use of when g is the convex conjugate of f. We use a primal-dual relaxations for approximation algorithms, approach to give online algorithms for these generic problems, with one major difference. In the online setting, generic and use them to simplify, unify, and improve upon previous linear programming solvers cannot be used for even obtaining results for several applications. Index Terms—online algorithm; convex optimization; primal- a fractional solution, since constraints are presented online. dual algorithm Hence, new algorithms have been developed for solving linear programs online [17], [18]. I.INTRODUCTION The basic setting of the online primal-dual framework is In the area of online algorithms, the method of modeling a grounded on the well-studied covering/packing framework for problem as a linear program, obtaining a fractional solution via combinatorial optimization problems. Covering/packing linear formulations are a special class of linear formulations in which This paper presents results independently obtained by Y. Azar, I. R. Cohen, all coefficients are non-negative. In a covering problem, the and D. Panigrahi [1], N. Buchbinder, S. Chen, A. Gupta, V. Nagarajan, and goal is to minimize a non-negative cost function subject to J. Naor [2], and T.-H. H. Chan, Z. Huang, and N. Kang [3]. Y. Azar and I. R. Cohen are supported in part by the Israel Science non-negative covering constraints. In a packing problem the Foundation (grant No. 1506/16), by the I-CORE program (Center No. 4/11), goal is to maximize a non-negative profit function subject to and by the Blavatnik Fund. N. Buchbinder is supported in part by ISF grant non-negative packing constraints. Packing and covering form 1585/15 and by BSF grant 2014414. S. Chen and J. Naor are supported in part by ISF grant 1585/15 and BSF grant 2014414. A. Gupta is supported in a primal-dual pair – for consistency, we refer to the covering part by NSF awards CCF-1319811, CCF-1536002, CCF-1540541 and CCF- problem as the primal problem and to the packing problem as 1617790. D. Panigrahi is supported in part by NSF Awards CCF-1527084 and the dual problem, although in some cases, it is the dual packing CCF-1535972. T-H. H. Chan is supported in part by the Hong Kong RGC grant 17202715. Z. Huang is supported in part by the Hong Kong RGC grant problem in which we are interested in. The covering/packing 17202115. framework captures a large class of (relaxations) of well- studied combinatorial problems. produces a quantity zj of each item j ∈ [n], at total cost ? Although extremely successful, the online primal-dual ap- f (z). Now, if yj denotes the value obtained from the bundle proach has been mainly studied for linear objective functions. of goods assigned to buyer j, and Ay ≤ z captures the Yet, what about convex optimization problems? In the offline constraints that no item is sold in greater quantities than its setting, the Ellipsoid and interior-point methods can solve production, then we recover precisely the dual problem. This these problems with linear convergence rates. But, online, it social-welfare maximization problem was studied by Blum was not known how to solve convex optimization problems et al. [24] and Huang et al. [14] for the “separable” special in general prior to this work. In contrast, specific online case in which the production cost of each item is a univariate problems with non-linear convex/concave objectives have been convex function, and the total cost is the sum over the items: ? P ? studied in recent years, in areas such as energy-efficient f (z) = i fi (zi). This case is somewhat restrictive in that scheduling [19], [20], [21], paging [22], network routing [23], it does not capture production costs such as energy that are ? P α combinatorial auctions [24], [14], matching [25], and budgeted typically modeled as f (z) = ( zi) for some α > 1. This allocation [26]. In this paper, we develop a general framework is a non-separable function, and so not captured by previous for a broad class of covering/packing programs with convex work, but it falls neatly into our framework. objective functions. The competitive ratios we obtain are optimal. This already improves and substantially generalizes B. Our Results previous results for problem classes such as mixed packing covering LPs considered in [13]. We then show how to round Our main theorem for the general online covering/packing these fractional solutions online for specific optimization prob- framework is for monotone convex functions with monotone † lems, obtaining improved results for non-linear optimization gradients (we denote the gradient ∇f). Our primal-dual problems considered recently in machine scheduling, network approach simultaneously gives us solutions to both the primal routing, and combinatorial auctions. and dual convex programs. A. The Convex Optimization Framework Theorem 1 [Online Convex Covering/Packing Theorem] Let n f : R+ → R+ be a function that is monotone, differentiable, The next convex covering problem is our primal problem. h∇f(x),xi convex, and ∇f is monotone. Let p = supx≥0 f(x) . Then, min C(x) := f(x) subject to Ax ≥ 1. (I.1) (a) There exists an O(p log d)p-competitive algorithm that x∈ n R+ produces a monotone primal solution. n p Here, f : R+ → R is a convex function and Am×n is a (b) There exists an O(p log(ρd)) -competitive algorithm that non-negative matrix. The rows of A, the covering constraints, produces a monotone dual solution. arrive online over time. At any point of time we must maintain Here d is the maximal row sparsity of the matrix and ρ is ai0j a feasible fractional solution x, non-decreasing over time. maxj max 0 { }. i,i |ai,j 6=0 aij This convex program captures, e.g., the mixed covering- packing problem [13], the machine scheduling and online The conditions of the theorem are satisfied, for exam- routing problems from [23], [20], and the nested set cover ple, by convex polynomials of degree p (with non-negative f(x) = 1 kxkp problem [27]. coefficients). One such example is p p, whose ? 1 q 1 1 As in solving LPs online, we use a primal-dual technique Fenchel conjugate function is f = q kxkq where p + q = 1. where we simultaneously obtain a solution to both primal and Intuitively, p bounds the growth of the objective around a given dual programs. In the convex setting, the dual of (I.1) is the point and shows the (necessary) effects of discretization of following packing problem: the feasible region performed by the algorithm. In fact, our competitive ratios are the best possible for all p: see §III-A. P ? T max P (y) := yj−f (z) subject to A y ≤ z. m n j∈[m] Observe that the dual result of Theorem1(b) makes as- y∈R+ ,z∈R (I.2) sumptions on the function f and then solves the dual problem ? Here f ? is the Fenchel dual or conjugate function of f, using f . However, if our goal is to solve the dual problem, which is always a convex function∗, and hence another convex e.g., for social-welfare maximization with production costs, optimization problem. In the online setting, at each step a new we can instead impose conditions on the dual cost function ? dual variable yi arrives along with its coefficients. The dual f , which may yield a better competitive ratio. Indeed, the values must be monotone: we fix a value of yi upon arrival assumptions on the primal and the dual yield incomparable ‡ and cannot change it later, in particular in applications where families of convex functions. Our general result in this setting the dual problem is our focus. is the following: Interestingly, the dual has a natural interpretation of its own, † n 0 0 as the problem of social-welfare maximization with production Function g : R → R is monotone if for all x , x such that xi ≥ xi for 0 costs. Consider a seller who produces and sells items of n all i ∈ [n], g(x ) ≥ g(x). Monotonicity of the objective is clearly necessary, but that of its gradient is a technical assumption that might be avoidable. types in a combinatorial market with m buyers. The seller ‡ For example, monotonicity of f and ∇f does not imply monotonicity ? 2 ? 2 2 of ∇f : e.g., if f(x) = (x1 + x2) , then f (z) = max(z1 , z2 ) which has ∗The Fenchel dual function is formally defined in (II.3); see, e.g., [28] for non-monotone gradients. Hence, one can use Theorem1 to solve the dual background and properties. problem for this f ?, but not Theorem2. Theorem 2 [Online Convex Packing Theorem] Let f ? : K objective functions. As a direct application of our general n R+ → R+ be a convex polynomial function with non-negative covering framework (Theorem1(a)), we obtain an O(p log d)- coefficients, zero constant term and maximum degree q. Then, competitive algorithm for this problem, where d ≤ n is the there exists an O(q)-competitive algorithm for the online row-sparsity of the covering constraint matrix A. This yields convex packing problem. an O(log K log d)-competitive algorithm for minimizing the maximum constraint (the ∞-norm), improving upon the pre- This competitive ratio is also optimal, even for the special vious best bound of O(log K · log(dκγ)) [13], where γ and case where f ? is “separable”, i.e., the sum of single-variate κ are the max-to-min ratio of the entries in the covering and convex functions f ?(z) = P g (z ); see [14]. j j j packing constraints respectively. For p-norms of the objective Prior to this work, analogous results for online cover- functions, no previous bound was known. ing/packing were only known for linear objectives. Com- petitive ratios of O(log n) for covering and O(log(ρd)) for While our framework is designed for obtaining fractional packing were obtained in [17], the covering ratio being sub- solutions, it also yields new results for discrete problems via sequently improved to O(log d) in [18]. For linear objec- online rounding. Several examples follow. tives, Theorem1 obtains ratios of O(log d) for covering and (2) Social Welfare Maximization with Production Costs. As O(log(ρd)) for packing, matching the best bounds. an application of our packing framework, we consider social C. Our Techniques welfare maximization in combinatorial auctions with produc- tion costs. For the problem with arbitrary valuation functions Our algorithm for convex covering and packing (Theorem1) (assuming access via demand oracles), where the production is based on a clean unified approach, that significantly sim- cost function is allowed to be any convex degree-q polynomial plifies even the well-studied linear case [18]. The value of production cost function with non-negative coefficients, we get the primal variables grows exponentially proportional both to an algorithm which is O(q)-competitive, with an additive loss their current value and their coefficient in the constraint, but in the social welfare that depends only on the production cost divided by the gradient of f at the current solution. The dual is functions (but is independent of the number of buyers). [14] increased at a (carefully chosen) linear rate. This update rule considered the separable case and characterized the optimal is enough for obtaining the second part of Theorem1 (which competitive ratio of any given cost function via a differential has a dependence on ρ, a parameter that depends on the matrix equation. For the special case of polynomial cost functions xq, A). But it does not prove the first part, which is independent [14] explicitly solved the differential equation and obtained a of the numbers. To obtain the improved primal result we keep competitive ratio of 4q. Our results in this paper, applying to “shadow duals” that are used to bound the primal cost more the special case of separable degree-q polynomial functions, tightly. These dual variables are non-monotone and may be essentially gives the same algorithm and a competitive ratio of decreased at certain times. Decreasing the dual variables may O(q) §, matching the results of [14] asymptotically. We stress potentially reduce the total dual value being used to bound that the results in this paper applies to the much more general the primal cost. However, doing this carefully and maintaining non-separable cost functions. such “non-monotone” duals is crucial in obtaining the sharp O(p log d)p competitive ratio. (3) Unrelated Machine Scheduling with Startup Costs. In The algorithm for convex packing (Theorem2) achieves this problem, a set of n jobs arriving online have to be a much better competitive ratio when parameterized by the scheduled on m machines, where machine i has startup “smoothness parameter” of the dual cost function f ?. This cost ci and can run job j in processing time pij. The goal parametrization is also natural in applications such as welfare is to assign each job to a machine so as to simultane- maximization. This algorithm increases duals linearly, while ously minimize the p-norm of the overall machine loads maintaining primal variables as the gradient of f ? at a scaled and the total startup cost of machines used. For this prob- lem, we obtain an (O(log m log(mn)),O(p2 log1/p(mn))) dual solution. The primal/dual updates and their analysis are ¶ quite different from those of Theorem1: this is not surprising bicriteria competitive ratio, which is almost tight. For since the two algorithms are intended to perform well for the special case of total load (the 1-norm), we obtain a (roughly) complementary classes of the function f. Moreover, tighter bound of (O(log m log n),O(1)). For the special case as noted earlier, there are functions f and f ? for which the of the ∞-norm (maximum load or makespan), the best guarantees in both Theorem1 and Theorem2 are tight. competitive ratio previously obtained using tools specific to this case was (O(log m log(mn)),O(log m)) [13] while D. Applications our general framework yields a slightly worse bound of The general framework captures many existing and new applications. Due to space limitations, we only give an outline of these applications here and defer a detailed discussion to §We have not optimized the constants for specific polynomials, as our focus the full version of the paper. has been on getting a general positive result for a large family of cost functions (1) Mixed Covering and Packing LPs. Here we have an ¶A bi-criteria competitive ratio of (α, β) implies that the online algorithm produces a schedule of expected startup cost at most αC and expected p- online covering problem, along with K different objective norm of machine loads at most βL, if there exists a feasible solution with functions, and the goal is to minimize the p-norm of these startup cost C and p-norm of loads L. (O(log m log(mn)),O(log2 m(log n)1/ log m)).k For all other recently by Bhawalkar et al. [27]. Using an approach identical p-norms, including the 1-norm, no previous result was known, to that for the CCFL problem, we get an O(log2 m log mnk)- not only in the online setting but also offline [29], [30], [31]. competitive randomized online algorithm that violates each (4) Capacity Constrained Facility Location (CCFL). We capacity by an O(log2 m log mnk) factor. Again this factor is are given m potential facility locations, each with an opening weaker than the best result known by a logarithmic factor, but cost ci and a capacity ui. Now, n clients arrive online, each directly follows from our general framework. client j ∈ [n] having an assignment cost aij and a demand II.PRELIMINARIES bij for each facility i ∈ [m]. The online algorithm must open facilities (paying the opening costs ci) and assign each arriving For a positive integer n, we denote [n] := {1, 2, . . . , n}. client j to an open facility i (paying the assignment cost aij, We use x to denote a column vector. For two vectors a and and incurring a load pij on facility i). The congestion of an b of the same dimension, we write a ≥ b if each coordinate assignment is the maximum load on any facility. The objective of a is at least the corresponding coordinate of b. We use in CCFL is to simultaneously minimize the sum of opening ha, bi = a|b to denote the dot product of a and b. Let 0 and costs and assignment costs, and the maximum congestion 1 denote the all zero’s and all one’s vectors, respectively. A of any facility. We obtain an O(log2 m log mn)-competitive n m function g : R+ → R+ is monotone if a ≤ b implies that randomized online algorithm. This competitive ratio is worse g(a) ≤ g(b). For i ∈ [n], we use ei to denote the unit vector by a logarithmic factor than the best result known [13], but whose ith coordinate is 1. follows easily from our general framework. n Fenchel Conjugate: Given a function f : R+ → R+, its (5) Capacitated Multicast Problem (CMC). This is a com- ? n Fenchel conjugate f : R+ → R+ is defined as mon generalization of CCFL and the online multicast prob- ?  lem [10]. There are m edge-disjoint rooted trees T , ··· ,T f (z) = sup hx, zi − f(x) . (II.3) 1 m n x∈R+ corresponding to multicast trees in some network. Each tree Ti m has a capacity ui, and each edge e ∈ ∪i=1Ti has an opening It is well known (e.g., [28]) that the supremum is achieved at cost ce. A sequence of n clients arrive online, and each must a gradient of f, and thus hx, zi − f(x) is the negative of the be assigned to one of these trees. Each client j has a tree- y-intercept of a hyperplane that passes through point x and has dependent load of pij for tree Ti, and is connected to exactly gradient z. The Fenchel conjugate can also be interpreted as one vertex πij in tree Ti. Thus, if client j is assigned to tree representing f by a function that relates slopes of tangents to T then the load of T increases by p , and all edges on the 1 p i i ij their y-intercept. As an example, if f(x) = p x is a degree-p p path in Ti from πij to its root must be opened. The objective polynomial (p > 1), then f ?(z) = (1 − 1 )z p−1 is a degree- is to minimize the total cost of opening the edges, subject p p polynomial. For the rest of this paper, we will focus on to the capacity constraints that the total load on tree Ti is at p−1 the following type of functions f. most ui. Solving a natural fractional convex relaxation, and n then applying a suitable randomized rounding to it, we get an Assumption 3 [Nice function] Function f : → + is 2 R+ R O(log m log mn)-competitive randomized online algorithm convex, monotone, differentiable, and f(0) = 0. that violates each capacity by an O((d + log2 m) log mn) ? m In this case, the conjugate function f satisfies the following factor; here d is the maximum depth of the trees {Ti}i=1. The capacitated multicast problem with depth d = 2 trees properties: ? generalizes the CCFL problem, in which case we recover the 1) The conjugate f is non-negative, monotone, convex, and ? above result for CCFL. f (0) = 0. ?? (6) Online Set Cover with Set Requests (SCSR). We are 2) If f is lower semi-continuous, f = f. given a universe U of n resources, and a collection of m We will use the following properties of nice functions. facilities, where each facility i ∈ [m] is specified by (i) a Lemma 4 [Bounded Growth] Suppose that function f sat- subset Si ⊆ U of resources (ii) opening cost ci and (iii) isfies Assumption3 and there is some p ≥ 1 such that capacity ui. The resources and facilities are given up-front. n h∇f(x), xi ≤ p · f(x) for all x ∈ R+. Then, the following Now, a sequence of k requests arrive over time. Each request statements hold. j ∈ [k] requires some subset Rj ⊆ U of resources. The request n p (a) For δ ≥ 1, for x ∈ R+, f(δx) ≤ δ f(x). has to be served by assigning it to some collection Fj ⊆ [m] of n ? (b) For all x ∈ R+, f (∇f(x)) = hx, ∇f(x)i − f(x) ≤ facilities whose sets collectively cover Rj, i.e., Rj ⊆ ∪i∈Fj Si. (p − 1) · f(x). Note that these facilities have to be open, and we incur their n ? (c) If p > 1 then for any 0 < γ ≤ 1, z ∈ R+, f (γz) ≤ i j p ? ? p opening cost. Moreover, if facility is used to serve client , γ p−1 · f (z) and f (γ · ∇f(z)) ≤ γ p−1 · (p − 1) · f(z). this contributes to the load of facility i, and this total load n ? (d) If p = 1 then for any 0 < γ ≤ 1, z ∈ R+, f (γ · must be at most the capacity ui. This problem was considered ∇f(z)) ≤ 0.

k 1/ log m Note that the (log n) is a superconstant only if n is super- Proof. For statement (a), define the function g : [1, +∞) → R exponential in m. So, for polynomial instances, the competitive ratio is 0 h∇f(θx),xi p (O(log m log(mn)),O(log2 m)), which only has an additional log m term by g(θ) := ln f(θx). Then, g (θ) = f(θx) ≤ θ , where in the competitive ratio of the makespan compared to [13]. the last inequality follows because hf(θx), θxi ≤ pf(θx). X t T  Integrating over θ ∈ [1, δ] gives g(δ) − g(1) ≤ p ln δ, which = yj − x A y − f(x) p is equivalent to f(δx) ≤ δ f(x). j∈[m] ? For statement (b), observe that f (∇f(x)) = X t T  ≥ yj − sup x A y − f(x) sup n h(w), where h(w) := hw, ∇f(x)i − f(w). x∈ n w∈R+ j∈[m] R+ ∇h(w) = ∇f(x) − ∇f(w) f Hence, . Since is a convex X ? T  function, h is a concave function. Also ∇h(x) = 0. It = yj − f A y . follows that the (global) supremum of h is at w = x, and so j∈[m] f ?(∇f(x)) = hx, ∇f(x)i−f(x). This is at most (p−1)·f(x) The first inequality follows by the feasibility of x and since from the assumption on f. y ≥ 0. The second inequality follows as x ≥ 0. 1 1 For statement (c), for 0 < γ ≤ 1, let δ := ( ) p−1 ≥ 1, γ III.ONLINE COVERING FRAMEWORK f ?(γz) = sup {hx, γzi − f(x)} We give here a primal-dual algorithm for the online convex x∈Rn + packing/covering problem, and prove Theorem1. The algo- p p = γ p−1 sup {hδx, zi − δ f(x)} rithm is presented in Figure III and is very simple; in fact, even x∈Rn + simpler than the optimal algorithm for the linear case [18]. p ≤ γ p−1 sup {hδx, zi − f(δx)} Let x denote the primal solution and y the dual solution. x∈Rn + The algorithm is presented in Figure III. The primal update p ? = γ p−1 f (z) , rule is particularly simple (see the first three steps): effectively, it is a multiplicative increase in the primal variables, where th where the inequality follows from statement (a). This proves the rate of increase of xi is inversely proportional to the i the first part of (c). The second part follows by combining this coordinate of the gradient at x. The dual update in Step4 is with (b). just as simple: it increases the current dual variable at constant For statement (d), we have p = 1 and: rate; here 0 < δ < 1 is a parameter which we optimize later. The monotonicity of the dual is immediate. f ?(γ · ∇f(z)) = sup {hx, γ · ∇f(z)i − f(x)} n Finally, we also maintain “shadow dual” variables denoted x∈R+ by z in Step5 which are used only for the primal analysis. †† ≤ γ · sup {hx, ∇f(z)i − f(x)} n Note that this step both increases some shadow duals and x∈R+ decreases others. This non-monotonicity of the shadow dual = γ · f ?(∇f(z)) ≤ 0 is essential for proving the primal bound in Theorem1(a). f ≥ 0 γ ≤ 1 The first inequality is since and , and the second τ τ inequality is by part (b). For the analysis, we denote by x , y , ··· the values of the variables at time τ. Let x be the final value of x at the Convex Covering/Packing Framework: We consider the end of the execution. We first analyze the algorithm with the following convex covering problem: more complex update step5 and then state the minor changes required to analyze the dual. min C(x) := f(x) subject to Ax ≥ 1. (II.4) x∈ n R+ Observation 6 [Feasible Solutions] The algorithm maintains n a feasible monotonically non-decreasing primal solution x. Above, f : R+ → R is a monotone convex function and ? Also, the duals y and z are both feasible. Am×n is non-negative. Let f be the Fenchel dual function of f. The (Fenchel) dual problem of (II.4) is the following To analyze the competitive ratio we first state the following ∗∗ packing problem : claim whose proof is deferred to the full version of the paper. max P (y) := P y − f ?AT y. (II.5) [Lower-bounding x variables] For a variable x , let m j∈[m] j Claim 7 i y∈R+ Ti = {j | aji > 0} and let Si be any subset of Ti. If the For definitions of the online versions of the pack- current dual variable zk increases at rate r (and other dual ing/covering problems, see the Introduction. The following variables may decrease), then lemma is a standard weak duality result.     1 1 X n xτ ≥ exp a zτ − 1 . Lemma 5 [Weak Duality] For any x ∈ R+ such that Ax ≥ 1 i   ji j   maxj∈S {aji}· d r · ∇if(x) m i j∈S and any y ∈ R+ , we have C(x) ≥ P (y), which holds even if i f is not convex. (III.7) Proof. Let x, y be feasible solutions to problems (II.4) and Proof of Theorem1. First, we prove the competitive ratio of (II.5) respectively. Then, the primal algorithm. For this, we first claim that for any time τ, the variables z satisfy Atz ≤ δ∇f(x), i.e. for each i, t Pk τ f(x) ≥ f(x) + y (1 − Ax) j=1 ajizj ≤ δ ∇if(x). Indeed, while processing constraint

∗∗Variable z appearing in (I.2) is redundant here. ††The shadow variables z are unrelated to the z-variables in (I.2). th Pn Fractional Algorithm: When the k request i=1 akixi ≥ 1 arrives: 1) Let τ be a continuous variable denoting the current time. Pn 2) While the new constraint is unsatisfied, i.e., i=1 akixi < 1, increase τ at rate 1 and: 3) Change primal variables: • For each i with aki > 0, increase each xi at rate 1 ∂x aki xi + i = d . (III.6) ∂τ ∇if(x) th Here d is an upper bound on the row sparsity of the matrix and ∇if(x) is the i -coordinate of the gradient ∇f(x). 4) Change dual variables: δ • Increase yk at rate w = log(1+d·ρ) . Here ρ is an upper bound on the maximum-to-minimum ratio of positive entries in any column of A. 5) Change “shadow” dual variables (only for primal analysis): δ • Increase zk at rate r = log(1+2d2) . Pk • If for variable xi we have j=1 ajizj = δ · ∇if(x), then ? k – Let mi = arg maxj=1{aji | zj > 0}. aki – Increase zm? at rate − · r. (Note: this change occurs only if aki is strictly positive.) i am?i i

Fig. III.1. Algorithm for Convex Covering

Pk τ k, if j=1 ajizj < δ · ∇if(x) for column i we are trivially the algorithm, the rate of increase of the primal objective f satisfied. Suppose that during the processing of constraint k, is: Pk τ we have ajiz = δ ∇if(x) for some dual constraint i τ τ  τ 1  j=1 j df(x ) X τ ∂xi X τ akixi + d and time τ. Now the dual decrease part of the algorithm kicks = ∇if(x ) = ∇if(x ) τ dτ ∂τ ∇if(x ) in, and the rate of increase in the left-hand side of the dual i i|aki>0   constraint is at most: X τ 1 = akixi + ≤ 2. (III.9)  k  d i|aki>0 d X τ aki  ajizj  = aki · r − am?i · · r = 0. dτ i a ? The final inequality uses the fact that the covering constraint mi i j=1 is unsatisfied, and that d is at least the number of non-zeroes Now consider the update when primal constraint k arrives in the vector ak. From (III.8) and (III.9) we can now bound and τ is the current time. Let U(τ) denote the set of indices i the following primal-dual ratio: Pk τ   for which we have aki > 0 and j=1 ajizi = δ ∇if(x). So Pk τ d j=1 zj r δ |U(τ)| ≤ d, the row-sparsity of A. Moreover, define Si := {j | ≥ = . (III.10) a > 0, zτ > 0} for every i ∈ U(τ). Clearly, P a zτ = df(xτ ) 4 4 ln (1 + 2d2) ji j j∈Si ji j Pk τ δ j=1 ajizj = δ · ∇if(x). Plugging in r = ln(1+2d2) into Thus, if z is the final dual solution we get, Claim7, and using the fact that P a xτ < 1, we get for m i ki i X δ every i ∈ U(τ), zi ≥ · f(x). (III.11) 4 ln (1 + 2d2) i=1 1 1 > xτ ≥ exp ln(1 + 2d2) − 1 , To complete the proof of Theorem1, we use Lemma4. We a i max {a }· d ki j∈Si ji have for any 0 ≤ δ ≤ 1: aki aki 1 p and after simplifying we get = ≤ . As a  p−1 am?i maxj∈S {aji} 2d ∗ t ∗ δ · (p − 1) · f(x) if p > 1 i i f (A z) ≤ f (δ·∇f(x)) ≤ result, we can bound the rate of change in the dual expression 0 if p = 1 Pk z at any time τ: j=1 j The first inequality is by monotonicity of f ? and our claim t Pk  that A z ≤ δ∇f(x). The second inequality is by Lemma4 d zj j=1 X aki = r − · r parts (c-d). Combining with (III.11), D = dτ a ? m mi i   i∈U(τ) X ∗ t δ p z −f (A z) ≥ − δ p−1 · (p − 1) ·f(x)   i 4 ln (1 + 2d2) X 1 1 i=1 ≥ r 1 − ≥ r (III.8)  2d 2 δ i∈U(τ) if p > 1, and D ≥ 4 ln(1+2d2) f(x) if p = 1. The optimal 1 choice of δ is 2 p−1 . Plugging in this value we get: where the last inequality follows as |U(τ)| ≤ d. On the other (4p ln(1+2d )) p p hand, when processing constraint k during the execution of f(x) ≤ 4p ln 1 + 2d2 D ≤ 4p ln 1 + 2d2 f(OPT ). Hence the proof of the primal algorithm. with the larger sum of variables (≥ Hd/2) in the algorithmic Next, we prove the primal-dual competitive ratio using the solution. This continues until we reach one of the leaves. Then, monotone dual update in step4. Applying Claim7 with w = there is path from root to leaf where p blocks each have a total δ ln(1+dρ) (instead of r) with the final (feasible) primal and dual variable value of at least Hd/2 in the algorithmic solution. On solutions x, y and with Si := {j | aji > 0} to get the other hand, the optimal solution only sets one variable to 1 in each root to leaf path. Therefore, the competitive ratio of      p p Hd 1 ln (1 + dρ) X 2 p xi ≥ exp  ajiyj − 1 . the algorithm is at least 2p = Ω(p log d) . maxj∈Si {aji}· d δ · ∇if(x) j∈Si

1 Note that xi ≤ (since all constraints that contain B. An Application: `p-norm Set Cover minj∈Si {aji} xi would be satisfied at the upper bound). Simplifying and In this section we consider a generalization of the online set- max {a } j∈Si ji P using that ρ ≥ we get, j∈S ajiyj ≤ δ·∇if(x), cover problem introduced in [5]. In our problem we are given minj∈Si {aji} i t n or equivalently A y ≤ δ · ∇f(x). Using the same arguments n sets {Sj}j=1 over some ground set U. Apart from the set m P δ system, we are also given K cost functions bk :[n] → + for as above we get, i=1 yi ≥ 2 ln(1+dρ) · f(x). Plugging it in R we get, k ∈ [K]. Elements from U arrive online and must be covered m by some set upon arrival; the decision to select a set into the X D = y − f ∗(Aty) solution is irrevocable. The goal is to maintain a set-cover i that minimizes the ` norm of the K cost functions. We use i=1 p the above fractional online algorithm along with a rounding  δ p  ≥ − δ p−1 · (p − 1) · f(x). scheme (similar to [23]) to obtain the following. 2 ln (1 + d · ρ) 1 Theorem 9 [`p-norm Set Cover] There is an Optimizing, we get δ = p−1 yielding a competi-  3  (2p ln(1+d·ρ)) O p log d log r -competitive randomized online algorithm tive ratio O(p log(dρ))p for the dual. log p for set cover minimizing the `p-norm of multiple cost- A. A Lower Bound for Online Convex Covering functions. Here d is the maximum number of sets containing In this section we adapt the example in Azar et al. [13] any element, and r = |U| is the number of elements. for the `∞ norm to the `p norm and showing that our Proof. We use the following convex relaxation. There is a competitive ratio for the convex covering problem is optimal variable xj for each set j ∈ [n] which denotes whether this up to constants. set is chosen. Theorem 8 [Lower bound for `p-norm objective function] K  n p n  K  p X X X X p There exists an instance of the online covering problem with an min f(x) = bkj · xj + bkj · xj p k=1 j=1 j=1 k=1 `p-norm objective function f for which any online algorithm p X is Ω(p log d) -competitive, where d is the maximal sparsity of s.t. xj ≥ 1, ∀e ∈ U the covering matrix. j:e∈Sj Proof. For parameters p and d, the example has 2p+1 − 2 x ≥ 0. pairwise disjoint sets (blocks) of d variables each. We use Since f satisfies h∇f(x), xi ≤ p f(x), we can use our frame- B to refer to the ith block. The objective function f is a i work to solve this fractional convex covering problem online, polynomial of degree p composed of 2p sums of variables with a cost at most O(p log d)p times the optimal fractional each raised to the power of p. To define the sums let us solution. Moreover, if C? denotes the pth power of the optimal consider a complete binary tree of depth p. Each node in this value of the objective of the given set cover instance, then the tree except the root corresponds to a unique block. Each term optimal value of the above fractional relaxation is at most in f corresponds to a path from the root to a leaf node k, ? p 2C , and hence the value of our fractional online solution x P  and is of the form x where Qk is the set of p ? x∈∪i∈Qk Bi is f(x) = O(p log d) · C . blocks encountered on the path (excluding the root, which is To get an integer solution, we use a simple online ran- not a block). domized rounding algorithm. For each set j ∈ [n], define In [13], there is a procedure for revealing covering con- Xj to be a {0, 1}-random variable with Pr[Xj = 1] = straints of sparsity at most 2d of two blocks such that: min{4p log r · xj, 1}. This can easily be implemented online. • the sum of variables in the online algorithm for one of It is easy to see by a Chernoff bound that for each element e, 1 the blocks is at least Hd/2, where Hd refers to the dth it is not covered with probability at most r2p . If an element e harmonic number, and is not covered by this rounding, we choose the set minimizing n PK p • there is a feasible solution that sets a single variable in minj=1{ k=1 bkj : e ∈ Sj}; let e ∈ [n] index this set and the other block to 1. PK p ? Ce = k=1 bke. Observe that Ce ≤ C for all e ∈ U. Pn First, we apply the procedure to the two children of the root. To bound the `p-norm of the cost, let Ck = j=1 bkj · Xj Next, the procedure is applied to the children of the block be the cost of the randomly rounded solution under the kth PK p cost function, and let C := k=1 Ck . Also, for each element Fractional Packing Algorithm: Initialize x := z := 0. When th e ∈ U and k ∈ [K], define: the k constraint vector ak = (ak1, . . . , akn) arrives in round  k: bke If e is not covered Dek = 1) Set yk := 0. 0 Otherwise. Pn 2) While akixi < 1, do:  i=1 Ce If e is not covered • Continuously increase yk. De = 0 Otherwise. • Simultaneously for each i ∈ [n], increase zi at rate dzi = aki. K p th dyk Note that D = P D . The p power of the objective ∗ e k=1 ek • Increase x according to x = ∇f (ρz). function is: ? K !p Fig. IV.2. Algorithm for super-linear f X X C = Ck + Dek k=1 e∈U K K !p general objective functions for which our packing algorithm p X p p X X can work (for instance f ?(z) = Pn z1.5), we think that for ≤ 2 Ck + 2 Dek i=1 i k=1 k=1 e∈U ease of exposition, it is simpler to restrict ourselves to the case K of polynomials (with integral degrees). p p X p X p ≤ 2 · C + 2 r Dek The reader can check that the special class of polynomials k=1 e∈U we describe above satisfy the first two conditions in the p p X = 2 · C + (2r) De (III.12) following Assumption 10. e∈U Assumption 10 [Nice function f ?] ? ? We now bound E[C] using (III.12). Observe that E[Ck] ≤ 1) The function f is convex, monotone, and f (0) = 0. Pn ? ? 4p log r · j=1 bkj · xj. Since each Ck is the sum of inde- 2) The function f is differentiable, and the gradient ∇f p pendent non-negative random variables, we can bound E[Ck ] is monotone. Moreover, there is some q > 0 such that for th n ? ? using a concentration inequality involving p moments [32]: all z ∈ R+, h∇f (z), zi ≤ q · f (z)  n  3) The offline convex packing problem has bounded optimal p p X p p ‡‡ E[Ck ] ≤ Kp · E[Ck] + E[bkj · Xj ] objective. j=1 ? n p We first obtain an algorithm for the special case when f   X  ≤ K · (4p log r)p b · x is “super-linear”, which means, in addition to Assumption 10, p kj j ? j=1 suppose the packing cost function f satisfies the following n  condition. There exists some λ > 1 such that for all ρ ≥ 1, X p for all z ∈ n , ∇f ?(ρz) ≥ ρλ−1 · ∇f ?(z). + 4p log r bkj · xj . R+ j=1 Here, the vector x plays an auxiliary role and is initialized p to 0. Throughout the algorithm, we maintain the invariant Above Kp = O(p/ log p) . By linearity of expectation, z = AT y and x = ∇f ?(ρz) for some parameter ρ > 1 K X to be determined later. In round k ∈ [m], the vector ak = [C] = [Cp] E E k (ak1, ak2, . . . , akn) is given. The variable yk is initialized to k=1 P 0, and is continuously increased while i∈[n] akixi < 1. To K  n p n  T p X X X p maintain z = A y, for each i ∈ [n], zi is increased at rate ≤ Kp(4p log r) bkj · xj + bkj · xj dz = aki. As the coordinates of z are increased, the vector dyk k=1 j=1 j=1 ? p x is increased according to the invariant x = ∇f (ρz). Since = Kp(4p log r) · f(x). ? ∇f is monotone, as yk increases, both z and z increase mono-  3 p tonically. We show in Lemma 11 that unless the offline packing Thus we have [C] = O p · log d · log r · C?. E log p P a x P  P problem is unbounded, eventually i∈[n] ki i reaches 1, at Observe that De = Pr[e uncovered] · E e∈U e∈U which moment yk stops increasing and round k finishes. C ≤ r−2p · P C? = r1−2p · C? e e∈U . Using these bounds Observe that the coordinates of x are increased monotoni- p p P in (III.12), we have E[C] ≤ 2 · E[C] + (2r) e∈U E[De] = cally throughout the algorithm. Below we show that at the end  3 p p ? P O log p · log d · log r · C . of the process, the constraints i∈[n] ajixi ≥ 1 are satisfied

IV. ONLINE PACKING FRAMEWORK ‡‡Note that there are instances for which the offline convex packing In this section, we consider the convex online packing problem whose objective can be arbitrarily large. Consider, for example, a ? ? 1 P problem. We consider the case that f is a convex multi-variate convex packing problem with a linear cost function f (z) = 2n i∈[n] zi polynomial with non-negative coefficients, zero constant term and the first request is a1 = 1. Then, by letting y1 to be arbitrarily large, we can get a feasible packing assignment with arbitrarily large objective. and maximum degree q. We shall state exactly what properties Obviously, such instances are not interesting for practical purposes. Hence, of f ? are needed for our proofs. Although there are more we will assume that the offline optimal is bounded. for all j ∈ [m]. Hence, the vector x is feasible for the covering Proof. Suppose z(m) = AT y is the vector at the end of the problem. algorithm, and x(m) = ∇f ?(ρz(m)). Then, we have (k) In the rest of the section, for k ∈ [m], we let z denote (m) ? (m) ? (m) the vector z at the end of round k, where z(0) := 0. C(x ) = f(∇f (ρz )) ≤ (q − 1) · f (ρz ) , Lemma 11 [Dual Feasibility] Recall our assumption that the where the inequality follows from applying Lemma4(c) with offline optimal packing objective is bounded. Then, in each the roles of f and f ? reversed. P On the other hand, round k ∈ [m], eventually we have i∈[n] akixi ≥ 1, and yk will stop increasing. P ? (m) 1 ? (m) ? (m) P (y) = k∈[m] yk − f (z ) ≥ ρ · f (ρz ) − f (z ) . Proof. During round k ∈ [m], the algorithm increases yk only (IV.13) Pn T when i=1 akixi < 1. Therefore, recalling z = A y, when Hence, it follows that the algorithm increases yk, it also increases each zi at rate ∗ (m) dzi C(x) (p − 1) · f (ρz ) = aki. ≤ (IV.14) dyk P (y) 1 ∗ (m) ∗ (m) Hence, we have ρ · f (ρz ) − f (z ) λ ∗ (m) ∂P (y) ? 1 ? (p − 1) · ρ · f (z ) = 1 − hak, ∇f (z)i ≥ 1 − λ−1 · hak, ∇f (ρz)i ≤ (IV.15) ∂yk ρ 1 λ ∗ (m) ∗ (m) ρ · ρ · f (z ) − f (z ) = 1 − 1 · ha , xi ≥ 1 − 1 , ρλ−1 k ρλ−1 ρλ = (p − 1) · (IV.16) where the first inequality follows from the assumption ρλ−1 − 1 ∇f ?(ρz) ≥ ρλ−1 · ∇f ?(z) , and the last inequality follows where the penultimate inequality follows because the assump- ha , xi < 1 y because k when k is increased. tion ∇f ?(ρz) ≥ ρλ−1 · ∇f ?(z) implies that f ?(ρz(m)) = Therefore, suppose for contrary that ha , xi never reaches (m) (m) k ρ R z h∇f ?(ρz), dzi ≥ ρλ R z h∇f ?(z), dzi = ρλ · P (y) z=0 z=0 1, then the objective function increases at least at some ? (m) t 1 f (z )), and the function t 7→ t −f ?(z(m)) is decreasing. positive rate 1 − ρλ−1 (recalling ρ > 1 and λ ≥ 2) as ρ 1 yk increases, which means the offline packing problem is Choosing ρ := λ λ−1 and observing that x(m) is feasible for unbounded, contradicting our assumption. the covering problem, we have

(k) λ Lemma 12 [Bounding Increase in y] For k ∈ [m], let z P opt Copt C(x(m)) λ λ−1 (0) ≤ ≤ ≤ (q − 1) · . denote the vector z at the end of round k, where z := 0. P (y) P (y) P (y) λ − 1 Then, at the end of round k when yk stops increasing (by 1 1  λ−1 Lemma 11) The result then follows because λ λ−1 = 1 + (λ − 1) ≤ 1 ? (k) ? (k−1)  e. yk ≥ ρ · f (ρz ) − f (z ) . Then we discuss the a general function f ? that satisfies In particular, since f ?(0) = 0, this implies that at the end of Assumption 10. Recall that we consider a polynomial f ? with the algorithm, non-negative coefficients and zero constant term. Hence, we ? ? X 1 ? (m) apply the following transformation f (z) = hc, zi + fb (z), yk ≥ · f (ρz ) . ρ where c = ∇f ?(0). Observe that f ?(z) no longer has linear k∈[m] b terms, thus the second condition of Lemma 13 always holds Proof. Recall again that yk increases only when hak, xi < 1, with λ = 2 (and hence ρ = 2). we have [f ∗ has bounded growth and is superlinear.] We X X dzi Lemma 14 b 1 ≥ akixi = xi · . have the following. dyk i∈[n] i∈[n] n ∗ ∗ • For all z ∈ R+, h∇fb (z), zi ≤ q · fb (z). ? ? • Hence, integrating this with respect to yk throughout round k, For all ρ ≥ 1 and z ≥ 0, ∇fb (ρz) ≥ ρ · ∇fb (z). ? and observing that x = ∇f (ρz), we have Proof. To prove the first statement, we can pick any non-zero ∗ Z z(k) term M in fb . Since the coefficient of M is positive and its ? 1 ? (k) ? (k−1) yk ≥ h∇f (ρz), dzi = ·(f (ρz )−f (z )) , degree is at most q, it follows that h∇M(z), zi ≤ q · M(z). (k−1) ρ z=z Therefore, summing up all the terms, we get h∇fb∗(z), zi ≤ where the last equality comes from the fundamental theorem q · fb∗(z). of calculus for path integrals of vector fields. For the second statement, if fb∗ is the zero function, then the statement trivially holds. Otherwise, observe that Lemma 13 [Super-linear f ?] For a super-linear function f ?, ∗ 1 in fb (x), only terms with degree at least 2 can have non- by setting ρ := λ λ−1 in the Fractional Packing Algorithm, we zero coefficients. Therefore, for z ≥ 0 and ρ ≥ 1, we have qλ  ? ? obtain an O λ−1 -competitive online algorithm for the online ∇fb (ρz) ≥ ρ · ∇fb (z), i.e., the condition in Lemma 13 is convex packing problem. satisfied with λ = 2. In terms of fb∗, the objective function becomes [5] N. Alon, B. Awerbuch, Y. Azar, N. Buchbinder, and J. Naor, “The online ,” SIAM J. Comput., vol. 39, no. 2, pp. 361–370, 2009. P (y) = h1, yi − f ?(AT y) = h1 − Ac, yi − fb?(AT y) . [6] N. Bansal, N. Buchbinder, and J. Naor, “A primal-dual randomized algorithm for weighted paging,” J. ACM, vol. 59, no. 4, p. 19, 2012. Moreover, the corresponding covering problem becomes [7] ——, “Randomized competitive algorithms for generalized caching,” min f(x) Ax ≥ 1 − Ac SIAM J. Comput., vol. 41, no. 2, pp. 391–414, 2012. x≥0 b subject to . In other words, in [8] ——, “Towards the randomized k-server conjecture: A primal-dual round k ∈ [m], as the vector ak arrives, the covering constraint approach,” in SODA, 2010, pp. 40–55. becomes hak, xi ≥ bk, where bk := 1−hak, ci. If bk ≤ 0, then [9] N. Bansal, N. Buchbinder, A. Madry, and J. Naor, “A polylogarithmic- y = 0 competitive algorithm for the k-server problem,” J. ACM, vol. 62, no. 5, the constraint is automatically satisfied, and we set k p. 40, 2015. such that round k finishes immediately. Otherwise, we can [10] N. Alon, B. Awerbuch, Y. Azar, N. Buchbinder, and J. S. Naor, “A run the Fractional Packing Algorithm using the function fb? general approach to online network optimization problems,” ACM Trans. and the constraint vector ak in round k. Algorithms, vol. 2, no. 4, pp. 640–660, 2006. bk [11] A. Ene, D. Chakrabarty, R. Krishnaswamy, and D. Panigrahi, “Online Next, we present a proof of Theorem2 based on the above buy-at-bulk network design,” in FOCS, 2015, pp. 545–562. discussion. [12] N. Buchbinder and J. Naor, “Improved bounds for online routing and c := packing via a primal-dual approach,” in FOCS, 2006, pp. 293–304. Proof of Theorem2: As discussed above, we write [13] Y. Azar, U. Bhaskar, L. Fleischer, and D. Panigrahi, “Online mixed ? ? ? ∇f (0) and fb (x) := f (x) − hc, xi as the convex function. packing and covering,” in SODA, 2013, pp. 85–100. (Observe that if f ? is a polynomial function containing only [14] Z. Huang and A. Kim, “Welfare maximization with production costs: A primal dual approach,” in SODA. SIAM, 2015, pp. 59–72. linear terms, then the problem is trivial because the objective [15] N. R. Devanur, K. Jain, and R. D. Kleinberg, “Randomized primal-dual is unbounded when there is some round k ∈ [m] such that analysis of ranking for online bipartite matching,” in SODA, 2013, pp. bk > 0.) 101–107. k ∈ [m] [16] N. Buchbinder, K. Jain, and J. Naor, “Online primal-dual algorithms for For ease of exposition, we can assume that for each , maximizing ad-auctions revenue,” in ESA, 2007, pp. 253–264. bk := 1 − hak, ci > 0. Otherwise, we can essentially ignore [17] N. Buchbinder and J. S. Naor, “Online primal-dual algorithms for the variable yk by setting it to 0. We denote B as the diagonal covering and packing,” Math. Oper. Res., vol. 34, no. 2, pp. 270–286, (k, k) b w := By 2009. [Online]. Available: http://dx.doi.org/10.1287/moor.1080.0363 matrix whose -th entry is k. By writing , the [18] A. Gupta and V. Nagarajan, “Approximating sparse covering integer objective function can be expressed in terms of w as Pb(w) := programs online,” Math. Oper. Res., vol. 39, no. 4, pp. 998–1011, 2014. h1, wi − fb?((B−1A)T w). [19] N. Bansal, K. Pruhs, and C. Stein, “Speed scaling for weighted flow time,” SIAM J. Comput., vol. 39, no. 4, pp. 1294–1308, 2009. Hence, we can run the Fractional Packing Algorithm using [20] N. R. Devanur and Z. Huang, “Primal dual gives almost optimal energy ? function fb such that in round k ∈ [m], when the vector ak efficient online algorithms.” in SODA. SIAM, 2014, pp. 1123–1140. arrives, we can transform it by dividing each coordinate by bk [21] Y. Azar, N. R. Devanur, Z. Huang, and D. Panigrahi, “Speed scaling in the non-clairvoyant model,” in SPAA, 2015, pp. 133–142. before passing it to the algorithm. [22] I. Menache and M. Singh, “Online caching with convex costs: Extended By Lemmas 13 and 14, using λ = 2, it follows that the abstract,” in SPAA, 2015, pp. 46–54. algorithm has competitive ratio O(q). [23] A. Gupta, R. Krishnaswamy, and K. Pruhs, “Online primal-dual for non- linear optimization with applications to speed scaling,” in WAOA, 2012, As an application of the above fractional solver (with some pp. 173–186. additional ideas to ensure integral allocation), we can obtain [24] A. Blum, A. Gupta, Y. Mansour, and A. Sharma, “Welfare and profit a competitive algorithm for a generalized version of combi- maximization with production costs,” in FOCS, Nov 2011, pp. 77–86. [25] N. R. Devanur and K. Jain, “Online matching with concave returns,” in natorial auction where the seller may produce any number of Proceedings of the forty-fourth annual ACM symposium on Theory of ? copies of each item subject to a production cost function f . computing. ACM, 2012, pp. 137–144. [26] S. Agrawal and N. R. Devanur, “Fast algorithms for online stochastic Theorem 15 [Combinatorial Auction with Production Cost] convex programming,” in SODA, 2015, pp. 1405–1424. Suppose the production cost function f ? satisfies the con- [27] K. Bhawalkar, S. Gollapudi, and D. Panigrahi, “Online set cover with set requests,” in APPROX/RANDOM, 2014, pp. 64–79. ditions in Theorem2. Then, there is an O(q)-competitive [28] J. Borwein and A. Lewis, Convex analysis and and Nonlinear Optimiza- online algorithm for the online combinatorial auction problem tion. Springer, 2006. with production cost f ?, with an additive error that depends [29] S. Khuller, J. Li, and B. Saha, “Energy efficient scheduling via partial ? shutdown,” in SODA, 2010, pp. 1360–1372. on f (but does not depends on the number of buyers). [30] L. Fleischer, “Data center scheduling, generalized flows, and submodu- Specifically, the algorithm returns a solution with value at least larity,” in ANALCO, 2010, pp. 56–65. Ω( OPT )−V (f ?), where V (f ?) := 1 ·(f ?(ρ(q+1)1)−f ?(ρ1)) [31] J. Li and S. Khuller, “Generalized machine activation problems,” in q ρ SODA, 2011, pp. 80–94. and ρ = 2. [32] R. Latała, “Estimation of moments of sums of independent real random variables,” Ann. Probab., vol. 25, no. 3, pp. 1502–1513, 1997. [Online]. REFERENCES Available: http://dx.doi.org/10.1214/aop/1024404522 [1] Y. Azar, I. R. Cohen, and D. Panigrahi, “Online covering with convex objectives and applications,” arXiv preprint arXiv:1412.3507, 2014. [2] N. Buchbinder, S. Chen, A. Gupta, V. Nagarajan, and J. S. Naor, “Online packing and covering framework with convex objectives,” arXiv preprint arXiv:1412.8347, 2014. [3] T. H. Chan, Z. Huang, and N. Kang, “Online convex covering and packing problems,” CoRR, vol. abs/1502.01802, 2015. [4] N. Buchbinder and J. S. Naor, “The design of competitive online algorithms via a primal-dual approach,” Found. Trends Theor. Comput. Sci., vol. 3, no. 2-3, pp. 93–263, 2007. [Online]. Available: http://dx.doi.org/10.1561/0400000024