Convex Optimization 3

Convex Optimization 3. Convex Functions
Prof. Ying Cui
Department of Electrical Engineering, Shanghai Jiao Tong University, 2018

Outline
◮ Basic properties and examples
◮ Operations that preserve convexity
◮ The conjugate function
◮ Quasiconvex functions
◮ Log-concave and log-convex functions
◮ Convexity with respect to generalized inequalities

Definition
◮ convex: f : R^n → R is convex if dom f is a convex set and

    f(θx + (1−θ)y) ≤ θf(x) + (1−θ)f(y)

  for all x, y ∈ dom f and all θ with 0 ≤ θ ≤ 1
◮ geometric interpretation: the line segment between (x, f(x)) and (y, f(y)) (i.e., the chord from x to y) lies above the graph of f

  Figure 3.1 Graph of a convex function. The chord (i.e., line segment) between any two points on the graph lies above the graph.

◮ concave: f is concave if −f is convex
◮ strictly convex: f : R^n → R is strictly convex if dom f is a convex set and

    f(θx + (1−θ)y) < θf(x) + (1−θ)f(y)

  for all x, y ∈ dom f with x ≠ y, and all θ with 0 < θ < 1
◮ strictly concave: f is strictly concave if −f is strictly convex
◮ affine functions are both convex and concave
◮ any function that is both convex and concave is affine

Examples on R
convex:
◮ affine: ax + b on R, for any a, b ∈ R
◮ exponential: e^{ax} on R, for any a ∈ R
◮ powers: x^α on R_{++}, for α ≥ 1 or α ≤ 0
◮ powers of absolute value: |x|^p on R, for p ≥ 1
◮ negative entropy: x log x on R_{++}
concave:
◮ affine: ax + b on R, for any a, b ∈ R
◮ powers: x^α on R_{++}, for 0 ≤ α ≤ 1
◮ logarithm: log x on R_{++}

Examples on R^n and R^{m×n}
Examples on R^n:
◮ the affine function f(x) = a^T x + b is both convex and concave
◮ every norm is convex
  ◮ due to the triangle inequality and homogeneity
  ◮ ℓ_p-norms: ‖x‖_p = (Σ_{i=1}^n |x_i|^p)^{1/p} for p ≥ 1 (‖x‖_1 = Σ_{i=1}^n |x_i|, ‖x‖_∞ = max_k |x_k|)
◮ the max function f(x) = max{x_1, …, x_n} is convex
◮ log-sum-exp f(x) = log(e^{x_1} + ⋯ + e^{x_n}) is convex
  ◮ a differentiable approximation of the max function:

    log(e^{x_1} + ⋯ + e^{x_n}) − log n ≤ max{x_1, …, x_n} ≤ log(e^{x_1} + ⋯ + e^{x_n})

Examples on R^{m×n}:
◮ the affine function f(X) = tr(A^T X) + b = Σ_{i=1}^m Σ_{j=1}^n A_{ij} X_{ij} + b is both convex and concave
◮ the spectral (maximum singular value) norm f(X) = ‖X‖_2 = σ_max(X) = (λ_max(X^T X))^{1/2} on R^{m×n} is convex

Restriction of a convex function to a line
◮ a function f : R^n → R is convex iff it is convex when restricted to any line that intersects its domain, i.e.,
  ◮ g(t) = f(x + tv) is convex on {t | x + tv ∈ dom f} for all x ∈ dom f and all v ∈ R^n
◮ this lets us check convexity of a function of multiple variables by restricting it to a line and checking convexity of a function of one variable
◮ example: f : S^n → R with f(X) = log det X, dom f = S^n_{++}

  Consider an arbitrary line X = Z + tV ∈ S^n_{++} with Z, V ∈ S^n. W.l.o.g., assume t = 0 is in the interval, i.e., Z ∈ S^n_{++}. Then

    g(t) = log det(Z + tV) = log det(Z^{1/2}(I + tZ^{−1/2}VZ^{−1/2})Z^{1/2})
         = log det Z + log det(I + tZ^{−1/2}VZ^{−1/2})
         = log det Z + Σ_{i=1}^n log(1 + tλ_i),   λ_i: eigenvalues of Z^{−1/2}VZ^{−1/2}

  g is concave in t. Thus, f is concave.
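The restriction-to-a-line argument can be sanity-checked numerically. The sketch below (not from the slides; the random choices of Z, V, the interval, and the tolerance are illustrative) samples a line X = Z + tV through S^n_{++} and verifies midpoint concavity of g(t) = log det(Z + tV) on a grid:

```python
import numpy as np

def logdet(X):
    """log det X for symmetric X; -inf if X is not positive definite."""
    sign, val = np.linalg.slogdet(X)
    return val if sign > 0 else -np.inf

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
Z = A @ A.T + n * np.eye(n)          # a point Z in S^n_++ (eigenvalues >= n)
B = rng.standard_normal((n, n))
V = (B + B.T) / 2                    # a symmetric direction V in S^n

g = lambda t: logdet(Z + t * V)

# Midpoint concavity on a grid of chords: g((s+t)/2) >= (g(s)+g(t))/2.
# The interval is kept small so Z + tV stays positive definite.
ts = np.linspace(-0.5, 0.5, 11)
ok = all(g((s + t) / 2) >= (g(s) + g(t)) / 2 - 1e-9
         for s in ts for t in ts)
print(ok)  # True: g is concave along the line, consistent with f concave
```

The same one-dimensional check applies to any candidate function: restrict to a random line and test the scalar function g.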
Extended-value extension
◮ the extended-value extension f̃ of a convex function f is

    f̃(x) = f(x) if x ∈ dom f,   f̃(x) = ∞ if x ∉ dom f

◮ f̃ is defined on all of R^n and takes values in R ∪ {∞}
◮ the domain of f is recovered from f̃ as dom f = {x | f̃(x) < ∞}
◮ the extension can simplify notation: there is no need to explicitly describe the domain or add the qualifier "for all x ∈ dom f"
◮ the basic defining inequality for convexity can be expressed as: for 0 < θ < 1,

    f̃(θx + (1−θ)y) ≤ θf̃(x) + (1−θ)f̃(y)   for any x and y

  ◮ the inequality always holds for θ = 0, 1
  ◮ there is no need to state the two conditions that dom f is convex (this can be shown by contradiction) and that x, y ∈ dom f (x, y ∈ R^n is used instead, which can be omitted)

First-order conditions
Suppose f is differentiable, i.e., dom f is open and the gradient ∇f(x) = (∂f(x)/∂x_1, …, ∂f(x)/∂x_n) exists at every x ∈ dom f.
◮ f is convex iff dom f is convex and

    f(y) ≥ f(x) + ∇f(x)^T (y − x)   for all x, y ∈ dom f

◮ the first-order Taylor approximation of a convex function is a global underestimator of it; conversely, if the first-order Taylor approximation of a function is always a global underestimator, then the function is convex
◮ local information about a convex function (its value and derivative at a point) implies global information (a global underestimator)
◮ if f is convex and ∇f(x) = 0, then x is a global minimizer of f
◮ f is strictly convex iff dom f is convex and f(y) > f(x) + ∇f(x)^T (y − x) for all x, y ∈ dom f with x ≠ y

  Figure 3.2 If f is convex and differentiable, then f(x) + ∇f(x)^T (y − x) ≤ f(y) for all x, y ∈ dom f.
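The first-order condition can be tested numerically for a concrete convex function. A minimal sketch (not from the slides), using log-sum-exp as the test function, whose gradient is the softmax of x:

```python
import numpy as np

def f(x):
    """Log-sum-exp, a smooth convex function on R^n."""
    return np.log(np.sum(np.exp(x)))

def grad_f(x):
    """Gradient of log-sum-exp: the softmax of x."""
    z = np.exp(x)
    return z / z.sum()

rng = np.random.default_rng(1)
n = 5
# Check f(y) >= f(x) + grad f(x)^T (y - x) on many random pairs (x, y);
# the small slack guards against floating-point round-off.
ok = all(
    f(y) >= f(x) + grad_f(x) @ (y - x) - 1e-12
    for x, y in ((rng.standard_normal(n), rng.standard_normal(n))
                 for _ in range(1000))
)
print(ok)  # True: the tangent plane is a global underestimator
```

A single violating pair would certify non-convexity; passing the check is of course only evidence, not a proof.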
Second-order conditions
Suppose f is twice differentiable, i.e., dom f is open and the Hessian ∇²f(x) ∈ S^n exists at every x ∈ dom f, where ∇²f(x)_{ij} = ∂²f(x)/∂x_i∂x_j, i, j = 1, …, n.
◮ f is convex iff dom f is convex and ∇²f(x) ⪰ 0 for all x ∈ dom f
  ◮ for a function on R, this reduces to: dom f is an interval and f″(x) ≥ 0 for all x in the interval
  ◮ ∇²f(x) ⪰ 0 means the graph of f has positive (upward) curvature at x
◮ if dom f is convex and ∇²f(x) ≻ 0 for all x ∈ dom f, then f is strictly convex
  ◮ the converse is not true, e.g., f(x) = x⁴ is strictly convex but f″(0) = 0

Second-order conditions: examples
◮ quadratic function: f(x) = (1/2)x^T Px + q^T x + r (P ∈ S^n)

    ∇f(x) = Px + q,   ∇²f(x) = P

  convex iff P ∈ S^n_+
◮ least-squares objective: f(x) = ‖Ax − b‖_2² = x^T A^T Ax − 2x^T A^T b + b^T b

    ∇f(x) = 2A^T(Ax − b),   ∇²f(x) = 2A^T A

  convex for all A ∈ R^{m×n} (as A^T A ⪰ 0 for all A ∈ R^{m×n})
◮ quadratic-over-linear function: f(x, y) = x²/y

    ∇²f(x, y) = (2/y³) [y, −x]^T [y, −x] ⪰ 0

  convex for all x ∈ R and y ∈ R_{++} (as zz^T ⪰ 0 for all z ∈ R^n)
◮ log-sum-exp: f(x) = log Σ_{k=1}^n exp x_k is convex
  ◮ proof:

    ∇²f(x) = (1/(1^T z)) diag(z) − (1/(1^T z)²) zz^T   (z_k = exp x_k)

    to show ∇²f(x) ⪰ 0, we must verify that v^T ∇²f(x) v ≥ 0 for all v:

    v^T ∇²f(x) v = [(Σ_k z_k v_k²)(Σ_k z_k) − (Σ_k v_k z_k)²] / (Σ_k z_k)² ≥ 0

    since (Σ_k v_k z_k)² ≤ (Σ_k z_k v_k²)(Σ_k z_k) (from the Cauchy–Schwarz inequality (a^T a)(b^T b) ≥ (a^T b)², taking a_i = v_i √z_i and b_i = √z_i)
◮ geometric mean: f(x) = (Π_{k=1}^n x_k)^{1/n} on R^n_{++} is concave (similar proof as for log-sum-exp)

Sublevel set and superlevel set
Sublevel set
◮ the α-sublevel set of f : R^n → R is {x ∈ dom f | f(x) ≤ α}
◮ sublevel sets of a convex function are convex
  ◮ the converse is false: e.g., f(x) = −exp x is not convex (indeed, it is strictly concave), but all its sublevel sets are convex
Superlevel set
◮ the α-superlevel set of f : R^n → R is {x ∈ dom f | f(x) ≥ α}
◮ superlevel sets of a concave function are convex
To establish convexity of a set, express it as a sublevel set of a convex function, or as a superlevel set of a concave function.

Epigraph and hypograph
◮ graph of f : R^n → R: {(x, f(x)) | x ∈ dom f} ⊆ R^{n+1}
◮ epigraph of f : R^n → R: epi f = {(x, t) ∈ R^{n+1} | x ∈ dom f, f(x) ≤ t} ⊆ R^{n+1}
  ◮ f is convex iff epi f is a convex set
◮ hypograph of f : R^n → R: hypo f = {(x, t) ∈ R^{n+1} | x ∈ dom f, f(x) ≥ t} ⊆ R^{n+1}
  ◮ f is concave iff hypo f is a convex set

  Figure 3.5 Epigraph of a function f, shown shaded. The lower boundary, shown darker, is the graph of f.

Jensen's inequality and extensions
◮ basic inequality: if f is convex, x, y ∈ dom f, and 0 ≤ θ ≤ 1, then

    f(θx + (1−θ)y) ≤ θf(x) + (1−θ)f(y)

◮ extension to convex combinations of more than two points: if f is convex, x_1, …, x_k ∈ dom f, and θ_1, …, θ_k ≥ 0 with θ_1 + ⋯ + θ_k = 1, then

    f(θ_1 x_1 + ⋯ + θ_k x_k) ≤ θ_1 f(x_1) + ⋯ + θ_k f(x_k)

◮ extension to infinite sums and integrals: if p(x) ≥ 0 on S ⊆ dom f and ∫_S p(x) dx = 1, then

    f(∫_S p(x) x dx) ≤ ∫_S f(x) p(x) dx,   provided the integrals exist

◮ extension to expected values: if f is convex and X is a random variable such that X ∈ dom f w.p. 1, then f(E X) ≤ E f(X), provided the expectations exist
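The expected-value form of Jensen's inequality can be illustrated with a quick Monte Carlo check. A minimal sketch (not from the slides; the choices f(x) = e^x and X standard normal are illustrative):

```python
import numpy as np

f = lambda x: np.exp(x)              # a convex function on R

rng = np.random.default_rng(2)
X = rng.standard_normal(100_000)     # samples of X ~ N(0, 1)

# Jensen: f(E X) <= E f(X).  Here E X = 0, so f(E X) = 1,
# while E e^X = e^{1/2} ~ 1.65 for a standard normal (lognormal mean).
lhs = f(X.mean())
rhs = f(X).mean()
print(lhs <= rhs)  # True
```

The gap rhs − lhs is the "Jensen gap"; it vanishes exactly when f is affine on the support of X or X is degenerate.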