Convex Optimization — Boyd & Vandenberghe 4. Convex optimization problems
optimization problem in standard form • convex optimization problems • quasiconvex optimization • linear optimization • quadratic optimization • geometric programming • generalized inequality constraints • semidefinite programming • vector optimization •
4–1 Optimization problem in standard form
minimize f0(x) subject to f (x) 0, i =1,...,m i ≤ hi(x)=0, i =1,...,p
x Rn is the optimization variable • ∈ f : Rn R is the objective or cost function • 0 → f : Rn R, i =1,...,m, are the inequality constraint functions • i → h : Rn R are the equality constraint functions • i → optimal value:
p⋆ = inf f (x) f (x) 0, i =1,...,m, h (x)=0, i =1,...,p { 0 | i ≤ i } p⋆ = if problem is infeasible (no x satisfies the constraints) • ∞ p⋆ = if problem is unbounded below • −∞
Convex optimization problems 4–2 Optimal and locally optimal points x is feasible if x dom f and it satisfies the constraints ∈ 0 ⋆ a feasible x is optimal if f0(x)= p ; Xopt is the set of optimal points x is locally optimal if there is an R> 0 such that x is optimal for
minimize (over z) f0(z) subject to fi(z) 0, i =1,...,m, hi(z)=0, i =1,...,p z x≤ R k − k2 ≤ examples (with n =1, m = p =0) f (x)=1/x, dom f = R : p⋆ =0, no optimal point • 0 0 ++ f (x)= log x, dom f = R : p⋆ = • 0 − 0 ++ −∞ f (x)= x log x, dom f = R : p⋆ = 1/e, x =1/e is optimal • 0 0 ++ − f (x)= x3 3x, p⋆ = , local optimum at x =1 • 0 − −∞
Convex optimization problems 4–3 Implicit constraints the standard form optimization problem has an implicit constraint
m p x = dom f dom h , ∈D i ∩ i i=0 i=1 \ \ we call the domain of the problem • D the constraints f (x) 0, h (x)=0 are the explicit constraints • i ≤ i a problem is unconstrained if it has no explicit constraints (m = p =0) • example: minimize f (x)= k log(b aT x) 0 − i=1 i − i P T is an unconstrained problem with implicit constraints ai x Convex optimization problems 4–4 Feasibility problem find x subject to f (x) 0, i =1,...,m i ≤ hi(x)=0, i =1,...,p can be considered a special case of the general problem with f0(x)=0: minimize 0 subject to f (x) 0, i =1,...,m i ≤ hi(x)=0, i =1,...,p p⋆ =0 if constraints are feasible; any feasible x is optimal • p⋆ = if constraints are infeasible • ∞ Convex optimization problems 4–5 Convex optimization problem standard form convex optimization problem minimize f0(x) subject to fi(x) 0, i =1,...,m T ≤ ai x = bi, i =1,...,p f , f ,..., f are convex; equality constraints are affine • 0 1 m problem is quasiconvex if f is quasiconvex (and f ,..., f convex) • 0 1 m often written as minimize f0(x) subject to fi(x) 0, i =1,...,m Ax =≤b important property: feasible set of a convex optimization problem is convex Convex optimization problems 4–6 example 2 2 minimize f0(x)= x1 + x2 2 subject to f1(x)= x1/(1 + x2) 0 2 ≤ h1(x)=(x1 + x2) =0 f is convex; feasible set (x ,x ) x = x 0 is convex • 0 { 1 2 | 1 − 2 ≤ } not a convex problem (according to our definition): f1 is not convex, h1 • is not affine equivalent (but not identical) to the convex problem • 2 2 minimize x1 + x2 subject to x 0 1 ≤ x1 + x2 =0 Convex optimization problems 4–7 Local and global optima any locally optimal point of a convex problem is (globally) optimal proof: suppose x is locally optimal, but there exists a feasible y with f0(y) z feasible, z x R = f (z) f (x) k − k2 ≤ ⇒ 0 ≥ 0 consider z = θy + (1 θ)x with θ = R/(2 y x ) − k − k2 y x >R, so 0 <θ< 1/2 • k − k2 z is a convex combination of two feasible points, hence also feasible • z x = R/2 and • k − k2 f (z) θf (y)+(1 θ)f (x) Convex optimization problems 4–8 Optimality criterion for differentiable f0 x is optimal if and only if it is feasible and f (x)T (y x) 0 for all feasible y ∇ 0 − ≥ −∇f0(x) x X if nonzero, f (x) defines a supporting hyperplane to feasible set X at x ∇ 0 Convex optimization problems 4–9 unconstrained problem: x is optimal if and only if • x dom f , f (x)=0 ∈ 0 ∇ 0 equality constrained problem • minimize f0(x) subject to Ax = b x is optimal if and only if there exists a ν such that x dom f , Ax = b, f (x)+ AT ν =0 ∈ 0 ∇ 0 minimization over nonnegative orthant • minimize f (x) subject to x 0 0 x is optimal if and only if f0(x)i 0 xi =0 x dom f0, x 0, ∇ ≥ ∈ f0(x)i =0 xi > 0 ∇ Convex optimization problems 4–10 Equivalent convex problems two problems are (informally) equivalent if the solution of one is readily obtained from the solution of the other, and vice-versa some common transformations that preserve convexity: eliminating equality constraints • minimize f0(x) subject to fi(x) 0, i =1,...,m Ax =≤b is equivalent to minimize (over z) f0(Fz + x0) subject to f (Fz + x ) 0, i =1,...,m i 0 ≤ where F and x0 are such that Ax = b x = Fz + x for some z ⇐⇒ 0 Convex optimization problems 4–11 introducing equality constraints • minimize f0(A0x + b0) subject to f (A x + b ) 0, i =1,...,m i i i ≤ is equivalent to minimize (over x, yi) f0(y0) subject to f (y ) 0, i =1,...,m i i ≤ yi = Aix + bi, i =0, 1,...,m introducing slack variables for linear inequalities • minimize f0(x) subject to aT x b , i =1,...,m i ≤ i is equivalent to minimize (over x, s) f0(x) T subject to ai x + si = bi, i =1,...,m s 0, i =1,...m i ≥ Convex optimization problems 4–12 epigraph form: standard form convex problem is equivalent to • minimize (over x, t) t subject to f (x) t 0 0 − ≤ fi(x) 0, i =1,...,m Ax =≤b minimizing over some variables • minimize f0(x1,x2) subject to f (x ) 0, i =1,...,m i 1 ≤ is equivalent to minimize f˜0(x1) subject to f (x ) 0, i =1,...,m i 1 ≤ ˜ where f0(x1) = infx2 f0(x1,x2) Convex optimization problems 4–13 Quasiconvex optimization minimize f0(x) subject to fi(x) 0, i =1,...,m Ax =≤b with f : Rn R quasiconvex, f ,..., f convex 0 → 1 m can have locally optimal points that are not (globally) optimal (x,f0(x)) Convex optimization problems 4–14 convex representation of sublevel sets of f0 if f0 is quasiconvex, there exists a family of functions φt such that: φ (x) is convex in x for fixed t • t t-sublevel set of f is 0-sublevel set of φ , i.e., • 0 t f (x) t φ (x) 0 0 ≤ ⇐⇒ t ≤ example p(x) f (x)= 0 q(x) with p convex, q concave, and p(x) 0, q(x) > 0 on dom f ≥ 0 can take φ (x)= p(x) tq(x): t − for t 0, φ convex in x • ≥ t p(x)/q(x) t if and only if φ (x) 0 • ≤ t ≤ Convex optimization problems 4–15 quasiconvex optimization via convex feasibility problems φ (x) 0, f (x) 0, i =1,...,m, Ax = b (1) t ≤ i ≤ for fixed t, a convex feasibility problem in x • if feasible, we can conclude that t p⋆; if infeasible, t p⋆ • ≥ ≤ Bisection method for quasiconvex optimization given l ≤ p⋆, u ≥ p⋆, tolerance ǫ > 0. repeat 1. t := (l + u)/2. 2. Solve the convex feasibility problem (1). 3. if (1) is feasible, u := t; else l := t. until u − l ≤ ǫ. requires exactly log ((u l)/ǫ) iterations (where u, l are initial values) ⌈ 2 − ⌉ Convex optimization problems 4–16 Linear program (LP) minimize cT x + d subject to Gx h Ax = b convex problem with affine objective and constraint functions • feasible set is a polyhedron • −c ⋆ P x Convex optimization problems 4–17 Examples diet problem: choose quantities x1,..., xn of n foods one unit of food j costs c , contains amount a of nutrient i • j ij healthy diet requires nutrient i in quantity at least b • i to find cheapest healthy diet, minimize cT x subject to Ax b, x 0 piecewise-linear minimization T minimize maxi=1,...,m(ai x + bi) equivalent to an LP minimize t subject to aT x + b t, i =1,...,m i i ≤ Convex optimization problems 4–18 Chebyshev center of a polyhedron Chebyshev center of = x aT x b , i =1,...,m P { | i ≤ i } xcheb is center of largest inscribed ball = x + u u r B { c |k k2 ≤ } aT x b for all x if and only if • i ≤ i ∈B sup aT (x + u) u r = aT x + r a b { i c |k k2 ≤ } i c k ik2 ≤ i hence, x , r can be determined by solving the LP • c maximize r subject to aT x + r a b , i =1,...,m i c k ik2 ≤ i Convex optimization problems 4–19 Linear-fractional program minimize f0(x) subject to Gx h Ax = b linear-fractional program cT x + d f (x)= , dom f (x)= x eT x + f > 0 0 eT x + f 0 { | } a quasiconvex optimization problem; can be solved by bisection • also equivalent to the LP (variables y, z) • minimize cT y + dz subject to Gy hz Ay = bz eT y + fz =1 z 0 ≥ Convex optimization problems 4–20 generalized linear-fractional program T ci x + di T f0(x)= max , dom f0(x)= x ei x+fi > 0, i =1,...,r i=1,...,r T ei x + fi { | } a quasiconvex optimization problem; can be solved by bisection example: Von Neumann model of a growing economy + + maximize (over x, x ) mini=1,...,n xi /xi subject to x+ 0, Bx+ Ax x,x+ Rn: activity levels of n sectors, in current and next period • ∈ (Ax) , (Bx+) : produced, resp. consumed, amounts of good i • i i x+/x : growth rate of sector i • i i allocate activity to maximize growth rate of slowest growing sector Convex optimization problems 4–21 Quadratic program (QP) minimize (1/2)xT Px + qT x + r subject to Gx h Ax = b P Sn , so objective is convex quadratic • ∈ + minimize a convex quadratic function over a polyhedron • ⋆ −∇f0(x ) x⋆ P Convex optimization problems 4–22 Examples least-squares minimize Ax b 2 k − k2 analytical solution x⋆ = A†b (A† is pseudo-inverse) • can add linear constraints, e.g., l x u • linear program with random cost minimize c¯T x + γxT Σx = E cT x + γ var(cT x) subject to Gx h, Ax = b c is random vector with mean c¯ and covariance Σ • hence, cT x is random variable with mean c¯T x and variance xT Σx • γ > 0 is risk aversion parameter; controls the trade-off between • expected cost and variance (risk) Convex optimization problems 4–23 Quadratically constrained quadratic program (QCQP) T T minimize (1/2)x P0x + q0 x + r0 T T subject to (1/2)x Pix + qi x + ri 0, i =1,...,m Ax = b ≤ P Sn ; objective and constraints are convex quadratic • i ∈ + n if P1,...,Pm S++, feasible region is intersection of m ellipsoids and • an affine set ∈ Convex optimization problems 4–24 Second-order cone programming minimize f T x T subject to Aix + bi 2 ci x + di, i =1,...,m Fxk = g k ≤ (A Rni×n, F Rp×n) i ∈ ∈ inequalities are called second-order cone (SOC) constraints: • (A x + b ,cT x + d ) second-order cone in Rni+1 i i i i ∈ for n =0, reduces to an LP; if c =0, reduces to a QCQP • i i more general than QCQP and LP • Convex optimization problems 4–25 Robust linear programming the parameters in optimization problems are often uncertain, e.g., in an LP minimize cT x subject to aT x b , i =1,...,m, i ≤ i there can be uncertainty in c, ai, bi two common approaches to handling uncertainty (in ai, for simplicity) deterministic model: constraints must hold for all a • i ∈Ei minimize cT x subject to aT x b for all a , i =1,...,m, i ≤ i i ∈Ei stochastic model: ai is random variable; constraints must hold with • probability η minimize cT x subject to prob(aT x b ) η, i =1,...,m i ≤ i ≥ Convex optimization problems 4–26 deterministic approach via SOCP choose an ellipsoid as : • Ei = a¯ + P u u 1 (¯a Rn, P Rn×n) Ei { i i |k k2 ≤ } i ∈ i ∈ center is a¯i, semi-axes determined by singular values/vectors of Pi robust LP • minimize cT x subject to aT x b a , i =1,...,m i ≤ i ∀ i ∈Ei is equivalent to the SOCP minimize cT x subject to a¯T x + P T x b , i =1,...,m i k i k2 ≤ i (follows from sup (¯a + P u)T x =¯aT x + P T x ) kuk2≤1 i i i k i k2 Convex optimization problems 4–27 stochastic approach via SOCP assume a is Gaussian with mean a¯ , covariance Σ (a (¯a , Σ )) • i i i i ∼N i i aT x is Gaussian r.v. with mean a¯T x, variance xT Σ x; hence • i i i b a¯T x prob T i i (ai x bi)=Φ −1/2 ≤ Σ x ! k i k2 2 where Φ(x)=(1/√2π) x e−t /2 dt is CDF of (0, 1) −∞ N robust LP R • minimize cT x subject to prob(aT x b ) η, i =1,...,m, i ≤ i ≥ with η 1/2, is equivalent to the SOCP ≥ minimize cT x subject to a¯T x +Φ−1(η) Σ1/2x b , i =1,...,m i k i k2 ≤ i Convex optimization problems 4–28 Geometric programming monomial function f(x)= cxa1xa2 xan, dom f = Rn 1 2 ··· n ++ with c> 0; exponent ai can be any real number posynomial function: sum of monomials K f(x)= c xa1kxa2k xank, dom f = Rn k 1 2 ··· n ++ kX=1 geometric program (GP) minimize f0(x) subject to f (x) 1, i =1,...,m i ≤ hi(x)=1, i =1,...,p with fi posynomial, hi monomial Convex optimization problems 4–29 Geometric program in convex form change variables to yi = log xi, and take logarithm of cost, constraints monomial f(x)= cxa1 xan transforms to • 1 ··· n log f(ey1,...,eyn)= aT y + b (b = log c) K a1k a2k ank posynomial f(x)= c x x xn transforms to • k=1 k 1 2 ··· P K T y1 yn a y+bk log f(e ,...,e ) = log e k (bk = log ck) ! kX=1 geometric program transforms to convex problem • K T minimize log k=1 exp(a0ky + b0k) K T subject to log P exp(a y + bik) 0, i =1,...,m k=1 ik ≤ Gy +Pd =0 Convex optimization problems 4–30 Design of cantilever beam segment 4 segment 3 segment 2 segment 1 F N segments with unit lengths, rectangular cross-sections of size w h • i × i given vertical force F applied at the right end • design problem minimize total weight subject to upper & lower bounds on wi, hi upper bound & lower bounds on aspect ratios hi/wi upper bound on stress in each segment upper bound on vertical deflection at the end of the beam variables: wi, hi for i =1,...,N Convex optimization problems 4–31 objective and constraint functions total weight w h + + w h is posynomial • 1 1 ··· N N aspect ratio h /w and inverse aspect ratio w /h are monomials • i i i i maximum stress in segment i is given by 6iF/(w h2), a monomial • i i the vertical deflection yi and slope vi of central axis at the right end of • segment i are defined recursively as F vi = 12(i 1/2) 3 + vi+1 − Ewihi F yi = 6(i 1/3) 3 + vi+1 + yi+1 − Ewihi for i = N,N 1,..., 1, with v = y =0 (E is Young’s modulus) − N+1 N+1 vi and yi are posynomial functions of w, h Convex optimization problems 4–32 formulation as a GP minimize w h + + w h 1 1 ··· N N subject to w−1 w 1, w w−1 1, i =1,...,N max i ≤ min i ≤ h−1 h 1, h h−1 1, i =1,...,N max i ≤ min i ≤ S−1 w−1h 1, S w h−1 1, i =1,...,N max i i ≤ min i i ≤ 6iFσ−1 w−1h−2 1, i =1,...,N max i i ≤ y−1 y 1 max 1 ≤ note we write w w w and h h h • min ≤ i ≤ max min ≤ i ≤ max w /w 1, w /w 1, h /h 1, h /h 1 min i ≤ i max ≤ min i ≤ i max ≤ we write S h /w S as • min ≤ i i ≤ max S w /h 1, h /(w S ) 1 min i i ≤ i i max ≤ Convex optimization problems 4–33 Minimizing spectral radius of nonnegative matrix Perron-Frobenius eigenvalue λpf(A) exists for (elementwise) positive A Rn×n • ∈ a real, positive eigenvalue of A, equal to spectral radius max λ (A) • i | i | determines asymptotic growth (decay) rate of Ak: Ak λk as k • ∼ pf →∞ alternative characterization: λ (A) = inf λ Av λv for some v 0 • pf { | ≻ } minimizing spectral radius of matrix of posynomials minimize λ (A(x)), where the elements A(x) are posynomials of x • pf ij equivalent geometric program: • minimize λ subject to n A(x) v /(λv ) 1, i =1,...,n j=1 ij j i ≤ variables λ, v, x P Convex optimization problems 4–34 Generalized inequality constraints convex problem with generalized inequality constraints minimize f0(x) subject to fi(x) Ki 0, i =1,...,m Ax =b f : Rn R convex; f : Rn Rki K -convex w.r.t. proper cone K • 0 → i → i i same properties as standard convex problem (convex feasible set, local • optimum is global, etc.) conic form problem: special case with affine objective and constraints minimize cT x subject to Fx + g K 0 Ax = b m extends linear programming (K = R+ ) to nonpolyhedral cones Convex optimization problems 4–35 Semidefinite program (SDP) minimize cT x subject to x1F1 + x2F2 + + xnFn + G 0 Ax = b ··· with F , G Sk i ∈ inequality constraint is called linear matrix inequality (LMI) • includes problems with multiple LMI constraints: for example, • x Fˆ + + x Fˆ + Gˆ 0, x F˜ + + x F˜ + G˜ 0 1 1 ··· n n 1 1 ··· n n is equivalent to single LMI Fˆ 0 Fˆ 0 Fˆ 0 Gˆ 0 x 1 +x 2 + +x n + 0 1 0 F˜ 2 0 F˜ ··· n 0 F˜ 0 G˜ 1 2 n Convex optimization problems 4–36 LP and SOCP as SDP LP and equivalent SDP LP: minimize cT x SDP: minimize cT x subject to Ax b subject to diag(Ax b) 0 − (note different interpretation of generalized inequality ) SOCP and equivalent SDP SOCP: minimize f T x subject to A x + b cT x + d , i =1,...,m k i ik2 ≤ i i SDP: minimize f T x T (ci x + di)I Aix + bi subject to T T 0, i =1,...,m (Aix + bi) c x + di i Convex optimization problems 4–37 Eigenvalue minimization minimize λmax(A(x)) where A(x)= A + x A + + x A (with given A Sk) 0 1 1 ··· n n i ∈ equivalent SDP minimize t subject to A(x) tI variables x Rn, t R • ∈ ∈ follows from • λ (A) t A tI max ≤ ⇐⇒ Convex optimization problems 4–38 Matrix norm minimization 1/2 minimize A(x) = λ (A(x)T A(x)) k k2 max where A(x)= A + x A + + x A (with given A Rp×q) 0 1 1 ··· n n i ∈ equivalent SDP minimize t tI A(x) subject to 0 A(x)T tI variables x Rn, t R • ∈ ∈ constraint follows from • A t AT A t2I, t 0 k k2 ≤ ⇐⇒ ≥ tI A 0 ⇐⇒ AT tI Convex optimization problems 4–39 Vector optimization general vector optimization problem minimize (w.r.t. K) f0(x) subject to f (x) 0, i =1,...,m i ≤ hi(x)=0, i =1,...,p vector objective f : Rn Rq, minimized w.r.t. proper cone K Rq 0 → ∈ convex vector optimization problem minimize (w.r.t. K) f0(x) subject to fi(x) 0, i =1,...,m Ax =≤b with f0 K-convex, f1,..., fm convex Convex optimization problems 4–40 Optimal and Pareto optimal points set of achievable objective values = f (x) x feasible O { 0 | } feasible x is optimal if f (x) is the minimum value of • 0 O feasible x is Pareto optimal if f (x) is a minimal value of • 0 O O O po f0(x ) ⋆ f0(x ) x⋆ is optimal xpo is Pareto optimal Convex optimization problems 4–41 Multicriterion optimization q vector optimization problem with K = R+ f0(x)=(F1(x),...,Fq(x)) q different objectives F ; roughly speaking we want all F ’s to be small • i i feasible x⋆ is optimal if • y feasible = f (x⋆) f (y) ⇒ 0 0 if there exists an optimal point, the objectives are noncompeting feasible xpo is Pareto optimal if • y feasible, f (y) f (xpo) = f (xpo)= f (y) 0 0 ⇒ 0 0 if there are multiple Pareto optimal values, there is a trade-off between the objectives Convex optimization problems 4–42 Regularized least-squares minimize (w.r.t. R2 ) ( Ax b 2, x 2) + k − k2 k k2 25 20 2 2 k O x k 15 )= 10 x ( 2 F 5 0 0 10 20 30 40 50 F (x)= Ax b 2 1 k − k2 example for A R100×10; heavy line is formed by Pareto optimal points ∈ Convex optimization problems 4–43 Risk return trade-off in portfolio optimization 2 T T minimize (w.r.t. R+) ( p¯ x,x Σx) subject to 1−T x =1, x 0 x Rn is investment portfolio; x is fraction invested in asset i • ∈ i p Rn is vector of relative asset price changes; modeled as a random • variable∈ with mean p¯, covariance Σ p¯T x = E r is expected return; xT Σx = var r is return variance • example 15% 1 x(4) x(3) x(2) x 10% 0.5 x(1) 5% allocation mean return 0 0% 0% 10% 20% 0% 10% 20% standard deviation of return standard deviation of return Convex optimization problems 4–44 Scalarization to find Pareto optimal points: choose λ ∗ 0 and solve scalar problem ≻K T minimize λ f0(x) subject to f (x) 0, i =1,...,m i ≤ hi(x)=0, i =1,...,p O if x is optimal for scalar problem, then it is Pareto-optimal for vector f0(x1) optimization problem f0(x3) λ1 λ2 f0(x2) for convex vector optimization problems, can find (almost) all Pareto optimal points by varying λ ∗ 0 ≻K Convex optimization problems 4–45 Scalarization for multicriterion problems to find Pareto optimal points, minimize positive weighted sum λT f (x)= λ F (x)+ + λ F (x) 0 1 1 ··· q q examples regularized least-squares problem of page 4–43 • 20 take λ = (1,γ) with γ > 0 15 2 2 k 10 minimize Ax b 2 + γ x 2 x k − k2 k k2 k 5 for fixed γ, a LS problem γ =1 0 0 5 10 15 20 Ax b 2 k − k2 Convex optimization problems 4–46 risk-return trade-off of page 4–44 • minimize p¯T x + γxT Σx subject to 1−T x =1, x 0 for fixed γ > 0, a quadratic program Convex optimization problems 4–47