Applying Subgradient Methods – and Accelerated Gradient Methods – to Solve General Convex, Conic Optimization Problems
Jim Renegar
School of Operations Research and Information Engineering, Cornell University
Workshop on “Optimization and Parsimonious Modeling,” IMA, January 25-29, 2016

The problem:
$$\min\ f(x) \quad \text{s.t. } x \in Q,$$
where $f$ is a convex function (assumed lower semicontinuous) and $Q$ is a closed, convex set.
[Figure: the graph of $f$, together with the graph of the affine function $y \mapsto f(x) + \langle g, y - x \rangle$ touching it at $x$ and lying below it; such a vector $g$ is a subgradient of $f$ at $x$.]
The set of all subgradients at $x$ is denoted $\partial f(x)$ – the “subdifferential” at $x$. For a concave function, supergradients play the analogous role.
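A standard one-dimensional example (added here for concreteness): for $f(x) = |x|$, the subdifferential is $\partial f(x) = \{\mathrm{sign}(x)\}$ when $x \neq 0$, while $\partial f(0) = [-1, 1]$, the set of slopes of all lines through the origin lying below the graph.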
$$\min\ f(x) \quad \text{s.t. } x \in Q$$

Assume $f$ is Lipschitz-continuous on an open neighborhood of $Q$:
$$|f(x) - f(y)| \le M\,\|x - y\|,$$
with Lipschitz constant $M$.
Goal: Compute $x \in Q$ satisfying $f(x) \le f^* + \epsilon$, where $f^*$ is the optimal value.
A typical subgradient method:
Initialize: $x_0 \in Q$.
Iterate: Compute $g_k \in \partial f(x_k)$, and let $x_{k+1} = P_Q\big(x_k - \tfrac{\epsilon}{\|g_k\|^2}\, g_k\big)$, where $P_Q$ is projection onto $Q$.
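Before the convergence guarantee, a minimal runnable sketch of this iteration (not from the talk): it assumes, for illustration, that $Q$ is a Euclidean ball, so $P_Q$ is cheap to evaluate; the objective and subgradient below are hypothetical stand-ins.

```python
import numpy as np

def project_ball(x, center, radius):
    """P_Q for the 'simple' set Q = {x : ||x - center|| <= radius}."""
    d = x - center
    n = np.linalg.norm(d)
    return x if n <= radius else center + (radius / n) * d

def subgradient_method(f, subgrad, project, x0, eps, iters):
    """Projected subgradient method with step size eps / ||g_k||^2."""
    x, best = x0, x0
    for _ in range(iters):
        g = subgrad(x)
        gn = np.linalg.norm(g)
        if gn == 0:                       # 0 is a subgradient: x is optimal
            return x
        x = project(x - (eps / gn**2) * g)
        if f(x) < f(best):                # f need not decrease monotonically,
            best = x                      # so track the best iterate seen
    return best

# Hypothetical instance: minimize ||x - p||_1 over the unit ball.
p = np.array([2.0, -1.0])
x_best = subgradient_method(
    f=lambda x: np.abs(x - p).sum(),
    subgrad=lambda x: np.sign(x - p),     # a subgradient of the l1 objective
    project=lambda x: project_ball(x, np.zeros(2), 1.0),
    x0=np.zeros(2), eps=0.01, iters=500)
```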
A typical theorem: if $x^*$ is an optimal solution and
$$\ell \;\ge\; \left(\frac{M\,\|x_0 - x^*\|}{\epsilon}\right)^{2},$$
then $\min_{k \le \ell} f(x_k) \le f^* + \epsilon$.

In the special case of linear programming this becomes ...
$$\min\ c^T x \quad \text{s.t. } Ax \ge b, \qquad \text{so } Q = \{x : Ax \ge b\}.$$
Then, of course, the objective function is Lipschitz continuous:
$$|c^T x - c^T y| \le \|c\|\,\|x - y\|,$$
with Lipschitz constant $\|c\|$.
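To get a feel for the iteration bound in the theorem above (made-up numbers): with $M = 1$, $\|x_0 - x^*\| = 10$, and $\epsilon = 0.01$, the guarantee requires $\ell \ge (1 \cdot 10 / 0.01)^2 = 10^6$ iterations; this $1/\epsilon^2$ growth is characteristic of subgradient methods.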
Goal: Compute $x$ satisfying $Ax \ge b$ and $c^T x \le z^* + \epsilon$, where $z^*$ is the optimal value.
A typical subgradient method:
Initialize: $x_0 \in Q$.
Iterate: Let $x_{k+1} = P_Q\big(x_k - \tfrac{\epsilon}{\|c\|^2}\, c\big)$, where $P_Q$ is projection onto $Q$.
A typical theorem: if $x^*$ is an optimal solution and
$$\ell \;\ge\; \left(\frac{\|c\|\,\|x_0 - x^*\|}{\epsilon}\right)^{2},$$
then $\min_{k \le \ell} c^T x_k \le z^* + \epsilon$.

But in general, projecting onto $Q = \{x : Ax \ge b\}$ is no easier than solving linear programs!!!
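One way to see the difficulty: $P_Q(y) = \mathrm{argmin}\{\|x - y\|^2 : Ax \ge b\}$, so each projection is itself a convex quadratic program over the very polyhedron that defines the LP.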
There are ways, however, to use a subgradient method to “solve” an LP. For example, here is a way to “approximate” an LP by an unconstrained convex optimization problem:
$$\min\ c^T x \quad \text{s.t. } Ax \ge b \qquad\approx\qquad \min\ c^T x + \lambda \cdot \max\{0,\ b_i - \alpha_i^T x : i = 1, \ldots, m\},$$
where $\lambda$ is a user-chosen positive constant and $\alpha_i^T$ is the $i$-th row of $A$. However, the optimal solution for the problem on the right will not necessarily be feasible for the LP.
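A sketch of a subgradient method for the penalized problem (hypothetical data, not from the talk). When the penalty term is positive, a subgradient of the objective is $c - \lambda\,\alpha_i$ for a most-violated constraint $i$; otherwise it is just $c$:

```python
import numpy as np

def penalized_subgradient(c, A, b, lam, x0, step, iters):
    """Subgradient method on x -> c@x + lam * max(0, max_i (b_i - A[i]@x)).

    Unconstrained, so no projection is needed; but, as noted above,
    iterates (and even the minimizer) may violate Ax >= b.
    """
    obj = lambda x: c @ x + lam * max(0.0, np.max(b - A @ x))
    x, best = x0.copy(), x0.copy()
    for _ in range(iters):
        viol = b - A @ x
        i = int(np.argmax(viol))
        g = c.copy()
        if viol[i] > 0:                # penalty term is active:
            g -= lam * A[i]            # add a subgradient of the max term
        x = x - step * g
        if obj(x) < obj(best):
            best = x.copy()
    return best

# Hypothetical instance: min x1 + x2  s.t.  x1 >= 1, x2 >= 1.
x = penalized_subgradient(c=np.array([1.0, 1.0]), A=np.eye(2),
                          b=np.ones(2), lam=10.0,
                          x0=np.zeros(2), step=0.01, iters=2000)
```

On this toy instance the iterates hover near the optimum $(1,1)$ but need not be feasible at any given iteration, illustrating the caveat above.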
In the literature, in fact, the only subgradient methods producing feasible iterates require the feasible region to be “simple.”
“Why is this?”
$$\min\ c^T x \quad \text{s.t. } Ax = b,\ x \ge 0$$
Think of this 2-dimensional plane as being the slice of $\mathbb{R}^n$ cut out by $\{x : Ax = b\}$.

[Figure: the slice, showing the feasible region of the LP with optimal solution $\pi^*$.]

Assume the objective function $x \mapsto c^T x$ is constant on horizontal slices.

To begin simply, assume the vector of all ones, $\mathbb{1}$, is feasible.
For a value $z$, let
$$\mathrm{Affine}_z := \{x : Ax = b \ \text{and}\ c^T x = z\}.$$

[Figure sequence: a point $x \in \mathrm{Affine}_z$ and its “radial projection” $\pi(x)$ onto the boundary of the feasible region; the point $x_z^* \in \mathrm{Affine}_z$ whose radial projection is the optimal solution, $\pi^* = \pi(x_z^*)$.]
The radial projection has the closed form
$$\pi(x) \;=\; \mathbb{1} + \frac{1}{1 - \min_j x_j}\,(x - \mathbb{1}).$$
Here $z^*$ denotes the optimal value of the LP $\min\ c^T x$ s.t. $Ax = b,\ x \ge 0$, assumed finite, and $z$ is a fixed value satisfying $z < c^T \mathbb{1}$.
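A quick numerical check of the radial-projection formula (made-up data, not from the talk): $\pi$ moves $x$ along the ray from $\mathbb{1}$ through $x$ until the smallest coordinate hits $0$, i.e., onto the boundary of the nonnegative orthant.

```python
import numpy as np

def radial_projection(x):
    """pi(x) = 1 + (x - 1)/(1 - min_j x_j); requires min_j x_j < 1."""
    m = x.min()
    assert m < 1.0, "radial projection undefined when min_j x_j >= 1"
    ones = np.ones_like(x)
    return ones + (x - ones) / (1.0 - m)

x = np.array([2.0, -0.5, 1.5])   # made-up point with min_j x_j < 1
p = radial_projection(x)
print(p)                          # [1.6667, 0.0, 1.3333]
print(p.min())                    # 0.0: pi(x) lies on the boundary x >= 0
```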