3. Convex Functions

Convex Optimization — Boyd & Vandenberghe 3. Convex functions

basic properties and examples operations that preserve convexity the conjugate function quasiconvex functions log-concave and log-convex functions convexity with respect to generalized inequalities

3–1

Denition f : Rn R is convex if dom f is a convex set and → f(x + (1 )y) f(x) + (1 )f(y) for all x, y dom f, 0 1 ∈

PSfrag replacements (y, f(y)) (x, f(x))

f is concave if f is convex f is strictly convex if dom f is convex and f(x + (1 )y) < f(x) + (1 )f(y) for x, y dom f, x = y, 0 < < 1 ∈ 6

Convex functions 3–2 Examples on R convex: ane: ax + b on R, for any a, b R ∈ exponential: eax, for any a R ∈ powers: x on R , for 1 or 0 ++ powers of absolute value: x p on R, for p 1 | | negative entropy: x log x on R ++ concave: ane: ax + b on R, for any a, b R ∈ powers: x on R , for 0 1 ++ logarithm: log x on R ++

Convex functions 3–3

Examples on Rn and Rmn ane functions are convex and concave; all norms are convex examples on Rn ane function f(x) = aT x + b norms: x = ( n x p)1/p for p 1; x = max x k kp i=1 | i| k k∞ k | k| P examples on Rmn (m n matrices) ane function m n T f(X) = tr(A X) + b = AijXij + b Xi=1 Xj=1

spectral (maximum singular value) norm f(X) = X = (X) = ( (XT X))1/2 k k2 max max

Convex functions 3–4 Restriction of a convex function to a line f : Rn R is convex if and only if the function g : R R, → → g(t) = f(x + tv), dom g = t x + tv dom f { | ∈ } is convex (in t) for any x dom f, v Rn ∈ ∈ can check convexity of f by checking convexity of functions of one variable example. f : Sn R with f(X) = log det X, dom X = Sn → ++ g(t) = log det(X + tV ) = log det X + log det(I + tX 1/2V X1/2) n

= log det X + log(1 + ti) Xi=1 1/2 1/2 where i are the eigenvalues of X V X g is concave in t (for any choice of X 0, V ); hence f is concave

Convex functions 3–5

Extended-value extension extended-value extension f˜ of f is

f˜(x) = f(x), x dom f, f˜(x) = , x dom f ∈ ∞ 6∈ often simplies notation; for example, the condition

0 1 = f˜(x + (1 )y) f˜(x) + (1 )f˜(y) ⇒ (as an inequality in R ), means the same as the two conditions ∪ {∞} dom f is convex for x, y dom f, ∈ 0 1 = f(x + (1 )y) f(x) + (1 )f(y) ⇒

Convex functions 3–6 First-order condition

f is dierentiable if dom f is open and the gradient

∂f(x) ∂f(x) ∂f(x) f(x) = , , . . . , ∇ µ ∂x1 ∂x2 ∂xn ¶

exists at each x dom f ∈ 1st-order condition: dierentiable f with convex domain is convex i

f(y) f(x) + f(x)T (y x) for all x, y dom f ∇ ∈

f(y) f(x) + ∇f(x)T (y x) PSfrag replacements

(x, f(x))

rst-order approximation of f is global underestimator

Convex functions 3–7

Second-order conditions

f is twice dierentiable if dom f is open and the Hessian 2f(x) Sn, ∇ ∈

2 2 ∂ f(x) f(x)ij = , i, j = 1, . . . , n, ∇ ∂xi∂xj

exists at each x dom f ∈ 2nd-order conditions: for twice dierentiable f with convex domain

f is convex if and only if 2f(x) 0 for all x dom f ∇ ∈

if 2f(x) 0 for all x dom f, then f is strictly convex ∇ ∈

Convex functions 3–8 Examples quadratic function: f(x) = (1/2)xT P x + qT x + r (with P Sn) ∈ f(x) = P x + q, 2f(x) = P ∇ ∇ convex if P 0 least-squares objective: f(x) = Ax b 2 k k2 f(x) = 2AT (Ax b), 2f(x) = 2AT A ∇ ∇ convex (for any A) PSfrag replacements quadratic-over-linear: f(x, y) = x2/y 2 ) y T 1

2 y y x, 2f(x, y) = 0 ( 3 f ∇ y · x ¸ · x ¸ 0 2 2 1 0 convex for y > 0 y 0 2 x

Convex functions 3–9

n log-sum-exp: f(x) = log k=1 exp xk is convex P 1 1 2f(x) = diag(z) zzT (z = exp x ) ∇ 1T z (1T z)2 k k to show 2f(x) 0, we must verify that vT 2f(x)v 0 for all v: ∇ ∇

2 2 T 2 ( k zkvk)( k zk) ( k vkzk) v f(x)v = 2 0 ∇ P P( k zk) P P since ( v z )2 ( z v2)( z ) (from Cauchy-Schwarz inequality) k k k k k k k k P P P

n 1/n n geometric mean: f(x) = ( k=1 xk) on R++ is concave Q (similar proof as for log-sum-exp)

Convex functions 3–10 Epigraph and sublevel set

-sublevel set of f : Rn R: → C = x dom f f(x) { ∈ | } sublevel sets of convex functions are convex (converse is false) epigraph of f : Rn R: → epi f = (x, t) Rn+1 x dom f, f(x) t { ∈ | ∈ } epi f

PSfrag replacements f

f is convex if and only if epi f is a convex set

Convex functions 3–11

Jensen’s inequality basic inequality: if f is convex, then for 0 1, f(x + (1 )y) f(x) + (1 )f(y) extension: if f is convex, then

f(E z) E f(z) for any random variable z basic inequality is special case with discrete distribution

prob(z = x) = , prob(z = y) = 1

Convex functions 3–12 Operations that preserve convexity practical methods for establishing convexity of a function

1. verify denition (often simplied by restricting to a line)

2. for twice dierentiable functions, show 2f(x) 0 ∇ 3. show that f is obtained from simple convex functions by operations that preserve convexity nonnegative weighted sum composition with ane function pointwise maximum and supremum composition minimization perspective

Convex functions 3–13

Positive weighted sum & composition with ane function nonnegative multiple: f is convex if f is convex, 0 sum: f1 + f2 convex if f1, f2 convex (extends to innite sums, integrals) composition with ane function: f(Ax + b) is convex if f is convex examples

log barrier for linear inequalities m f(x) = log(b aT x), dom f = x aT x < b , i = 1, . . . , m i i { | i i } Xi=1

(any) norm of ane function: f(x) = Ax + b k k

Convex functions 3–14 Pointwise maximum if f , . . . , f are convex, then f(x) = max f (x), . . . , f (x) is convex 1 m { 1 m } examples

piecewise-linear function: f(x) = max (aT x + b ) is convex i=1,...,m i i sum of r largest components of x Rn: ∈ f(x) = x + x + + x [1] [2] [r]

is convex (x[i] is ith largest component of x) proof: f(x) = max x + x + + x 1 i < i < < i n { i1 i2 ir | 1 2 r }

Convex functions 3–15

Pointwise supremum if f(x, y) is convex in x for each y , then ∈ A g(x) = sup f(x, y) y∈A is convex examples support function of a set C: S (x) = sup yT x is convex C y∈C distance to farthest point in a set C: f(x) = sup x y y∈C k k

maximum eigenvalue of symmetric matrix: for X Sn, ∈ T max(X) = sup y Xy kyk2=1

Convex functions 3–16 Composition with scalar functions composition of g : Rn R and h : R R: → → f(x) = h(g(x))

g convex, h convex, h˜ nondecreasing f is convex if g concave, h convex, h˜ nonincreasing

proof (for n = 1, dierentiable g, h) f 00(x) = h00(g(x))g0(x)2 + h0(g(x))g00(x)

note: monotonicity must hold for extended-value extension h˜ examples exp g(x) is convex if g is convex 1/g(x) is convex if g is concave and positive

Convex functions 3–17

Vector composition composition of g : Rn Rk and h : Rk R: → →

f(x) = h(g(x)) = h(g1(x), g2(x), . . . , gk(x))

g convex, h convex, h˜ nondecreasing in each argument f is convex if i gi concave, h convex, h˜ nonincreasing in each argument proof (for n = 1, dierentiable g, h)

f 00(x) = g0(x)T 2h(g(x))g0(x) + h(g(x))T g00(x) ∇ ∇ examples m log g (x) is concave if g are concave and positive i=1 i i P m log exp g (x) is convex if g are convex i=1 i i P

Convex functions 3–18 Minimization if f(x, y) is convex in (x, y) and C is a convex set, then

g(x) = inf f(x, y) y∈C is convex examples f(x, y) = xT Ax + 2xT By + yT Cy with A B 0, C 0 · BT C ¸

minimizing over y gives g(x) = inf f(x, y) = xT (A BC1BT )x y g is convex, hence Schur complement A BC1BT 0 distance to a set: dist(x, S) = inf x y is convex if S is convex y∈S k k

Convex functions 3–19

Perspective the perspective of a function f : Rn R is the function g : Rn R R, → → g(x, t) = tf(x/t), dom g = (x, t) x/t dom f, t > 0 { | ∈ } g is convex if f is convex examples f(x) = xT x is convex; hence g(x, t) = xT x/t is convex for t > 0 negative logarithm f(x) = log x is convex; hence relative entropy g(x, t) = t log t t log x is convex on R2 ++ if f is convex, then g(x) = (cT x + d)f (Ax + b)/(cT x + d) ¡ ¢ is convex on x cT x + d > 0, (Ax + b)/(cT x + d) dom f { | ∈ }

Convex functions 3–20 The conjugate function the conjugate of a function f is

f (y) = sup (yT x f(x)) x∈dom f

f(x) xy

PSfrag replacements x

(0, f (y))

f is convex (even if f is not) will be useful in chapter 5

Convex functions 3–21

examples

negative logarithm f(x) = log x f (y) = sup(xy + log x) x>0 1 log( y) y < 0 = ½ otherwise ∞

strictly convex quadratic f(x) = (1/2)xT Qx with Q Sn ∈ ++

f (y) = sup(yT x (1/2)xT Qx) x 1 = yT Q1y 2

Convex functions 3–22 Quasiconvex functions f : Rn R is quasiconvex if dom f is convex and the sublevel sets → S = x dom f f(x) { ∈ | } are convex for all

PSfrag replacements

a b c

f is quasiconcave if f is quasiconvex f is quasilinear if it is quasiconvex and quasiconcave

Convex functions 3–23

Examples

x is quasiconvex on R | | ceilp (x) = inf z Z z x is quasilinear { ∈ | } log x is quasilinear on R ++ f(x , x ) = x x is quasiconcave on R2 1 2 1 2 ++ linear-fractional function aT x + b f(x) = , dom f = x cT x + d > 0 cT x + d { | } is quasilinear distance ratio x a f(x) = k k2, dom f = x x a x b x b { | k k2 k k2} k k2 is quasiconvex

Convex functions 3–24 internal rate of return

cash ow x = (x , . . . , x ); x is payment in period i (to us if x > 0) 0 n i i we assume x < 0 and x + x + + x > 0 0 0 1 n present value of cash ow x, for interest rate r: n i PV(x, r) = (1 + r) xi Xi=0

internal rate of return is smallest interest rate for which PV(x, r) = 0: IRR(x) = inf r 0 PV(x, r) = 0 { | }

IRR is quasiconcave: superlevel set is intersection of halfspaces

n IRR(x) R (1 + r)ix 0 for 0 r R ⇐⇒ i Xi=0

Convex functions 3–25

Properties modied Jensen inequality: for quasiconvex f

0 1 = f(x + (1 )y) max f(x), f(y) ⇒ { }

rst-order condition: dierentiable f with cvx domain is quasiconvex i

f(y) f(x) = f(x)T (y x) 0 ⇒ ∇

∇f(x) x

PSfrag replacements

sums of quasiconvex functions are not necessarily quasiconvex

Convex functions 3–26 Log-concave and log-convex functions a positive function f is log-concave if log f is concave:

f(x + (1 )y) f(x)f(y)1 for 0 1 f is log-convex if log f is convex

powers: xa on R is log-convex for a 0, log-concave for a 0 ++ many common probability densities are log-concave, e.g., normal: 1 1 T 1 f(x) = e2(xx) (xx) (2)n det p cumulative Gaussian distribution function is log-concave x 1 2 (x) = eu /2 du √2 Z∞

Convex functions 3–27

Properties of log-concave functions

twice dierentiable f with convex domain is log-concave if and only if f(x) 2f(x) f(x) f(x)T ∇ ∇ ∇ for all x dom f ∈ product of log-concave functions is log-concave sum of log-concave functions is not always log-concave integration: if f : Rn Rm R is log-concave, then →

g(x) = f(x, y) dy Z

is log-concave (not easy to show)

Convex functions 3–28 consequences of integration property

convolution f g of log-concave functions f, g is log-concave

(f g)(x) = f(x y)g(y)dy Z

if C Rn convex and y is a random variable with log-concave pdf then f(x) = prob(x + y C) ∈ is log-concave proof: write f(x) as integral of product of log-concave functions

1 u C f(x) = g(x + y)p(y) dy, g(u) = ∈ Z ½ 0 u C, 6∈ p is pdf of y

Convex functions 3–29

example: yield function

Y (x) = prob(x + w S) ∈

x Rn: nominal parameter values for product ∈ w Rn: random variations of parameters in manufactured product ∈ S: set of acceptable values if S is convex and w has a log-concave pdf, then

Y is log-concave yield regions x Y (x) are convex { | }

Convex functions 3–30 Convexity with respect to generalized inequalities f : Rn Rm is K-convex if dom f is convex and → f(x + (1 )y) f(x) + (1 )f(y) K for x, y dom f, 0 1 ∈ example f : Sm Sm, f(X) = X2 is Sm-convex → + proof: for xed z Rm, zT X2z = Xz 2 is convex in X, i.e., ∈ k k2 zT (X + (1 )Y )2z zT X2z + (1 )zT Y 2z for X, Y Sm, 0 1 ∈ therefore (X + (1 )Y )2 X2 + (1 )Y 2

Convex functions 3–31