
Convexity Theory and Gradient Methods

Angelia Nedić ([email protected])
ISE Department and Coordinated Science Laboratory
University of Illinois at Urbana-Champaign
IMA, Minnesota, June 1–6, 2014

Outline

• Convex Functions

• Optimality Principle

• Projection Theorem

• Gradient Methods

Convex Sets

Line segment [x1, x2] ⊆ Rn is the set of all points xα = αx1 + (1 − α)x2 for α ∈ [0, 1].

A set C is convex when for all x1, x2 ∈ C, the segment [x1, x2] is contained in the set C.

Convex Function

Let f be a function from Rn to R, f : Rn → R.

Informally: f is convex when, as xα = αx1 + (1 − α)x2 varies over the line segment [x1, x2], the points (xα, f(xα)) lie on or below the segment connecting (x1, f(x1)) and (x2, f(x2)).

The domain of f is a set in Rn defined by

dom(f) = {x ∈ Rn | f(x) is well defined (finite)}

Def. A function f is convex if
(1) its domain dom(f) is a convex set in Rn, and
(2) for all x1, x2 ∈ dom(f) and α ∈ [0, 1],

f(αx1 + (1 − α)x2) ≤ αf(x1) + (1 − α)f(x2)

The function is strictly convex if the inequality is strict whenever x1 ≠ x2 and α ∈ (0, 1)
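The defining inequality is easy to spot-check numerically. Below is a minimal sketch; the test function f(x) = ‖x‖², the sample count, and the tolerance are illustrative choices, not from the slides:

```python
# Sketch: spot check of the defining inequality on random segments; the
# test function f(x) = ||x||^2 and the tolerance are illustrative choices.
import numpy as np

f = lambda x: np.dot(x, x)          # ||x||^2, convex on R^n
rng = np.random.default_rng(0)
for _ in range(1000):
    x1, x2 = rng.normal(size=3), rng.normal(size=3)
    a = rng.uniform()
    assert f(a*x1 + (1-a)*x2) <= a*f(x1) + (1-a)*f(x2) + 1e-12
print("convexity inequality holds on all sampled segments")
```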

Examples on R

Convex:

• Affine: ax + b over R, for any a, b ∈ R
• Exponential: e^{ax} over R, for any a ∈ R
• Power: x^p over (0, +∞), for p ≥ 1 or p ≤ 0
• Powers of absolute value: |x|^p over R, for p ≥ 1
• Negative entropy: x ln x over (0, +∞)

Concave:

• Affine: ax + b over R, for any a, b ∈ R
• Powers: x^p over (0, +∞), for 0 ≤ p ≤ 1
• Logarithm: ln x over (0, +∞)

Examples: Affine Functions and Norms

• Affine functions are both convex and concave

• Norms are convex

Examples on Rn

• Affine function f(x) = a'x + b with a ∈ Rn and b ∈ R

• Euclidean, l1, and l∞ norms

• General lp norms

‖x‖p = (∑_{i=1}^{n} |xi|^p)^{1/p} for p ≥ 1
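As a quick illustration, the formula can be evaluated directly and compared against a library routine; a small sketch (the vector and values of p are arbitrary):

```python
# Sketch: the general l_p norm from the formula above, compared with
# numpy's built-in routine; the vector and values of p are arbitrary.
import numpy as np

def lp_norm(x, p):
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

x = np.array([3.0, -4.0, 1.0])
for p in (1, 2, 10):
    print(p, lp_norm(x, p), np.linalg.norm(x, p))   # the two agree
```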


Examples on Rm×n

The space Rm×n is the space of m × n matrices

• Affine function

f(X) = tr(A'X) + b = ∑_{i=1}^{m} ∑_{j=1}^{n} aij xij + b

• Spectral norm (maximum singular value)

f(X) = ‖X‖2 = σmax(X) = (λmax(X'X))^{1/2}

where λmax(A) denotes the maximum eigenvalue of a matrix A

Verifying Convexity of a Function

We can verify that a given function f is convex by

• Using the definition

• Applying some special criteria:
  • Second-order conditions
  • First-order conditions
  • Reduction to a scalar function

• Showing that f is obtained through operations preserving convexity

Second-Order Conditions

Let f be twice differentiable and let dom(f) = Rn [in general, it is required that dom(f) is open]. The Hessian ∇2f(x) is a symmetric n × n matrix whose entries are the second-order partial derivatives of f at x:

[∇2f(x)]ij = ∂²f(x)/∂xi∂xj for i, j = 1, . . . , n

2nd-order conditions: For a twice differentiable f with convex domain,
• f is convex if and only if ∇2f(x) ⪰ 0 for all x ∈ dom(f)
• f is strictly convex if ∇2f(x) ≻ 0 for all x ∈ dom(f)
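When f is available only as a black box, the 2nd-order condition can be checked approximately: build a finite-difference Hessian and inspect its smallest eigenvalue. A sketch under that setup; the log-sum-exp test function and the step size h are our choices (for log-sum-exp the exact minimum eigenvalue is 0, so expect a tiny value up to rounding):

```python
# Sketch: check the 2nd-order condition numerically via a central
# finite-difference Hessian and its smallest eigenvalue.
import numpy as np

def f(x):
    # log-sum-exp: a standard convex test function (our choice of example)
    return np.log(np.sum(np.exp(x)))

def numerical_hessian(f, x, h=1e-4):
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei, ej = np.eye(n)[i], np.eye(n)[j]
            H[i, j] = (f(x + h*ei + h*ej) - f(x + h*ei - h*ej)
                       - f(x - h*ei + h*ej) + f(x - h*ei - h*ej)) / (4 * h * h)
    return H

x = np.array([0.3, -1.2, 2.0])
H = numerical_hessian(f, x)
# For a convex f this should be >= 0 up to discretization error.
print(np.linalg.eigvalsh(H).min())
```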

Examples

Quadratic function: f(x) = (1/2)x'Px + q'x + r with a symmetric n × n matrix P

∇f(x) = Px + q,  ∇2f(x) = P

Convex for P ⪰ 0

Least-squares objective: f(x) = ‖Ax − b‖² with an m × n matrix A

∇f(x) = 2A'(Ax − b),  ∇2f(x) = 2A'A

Convex for any A

Quadratic-over-linear: f(x, y) = x²/y

T " y #" y # ∇2f(x, y) = 2  0 y3 −x −x Convex for y > 0


First-Order Condition

f is differentiable if dom(f) is open and the gradient

∇f(x) = (∂f(x)/∂x1, ∂f(x)/∂x2, . . . , ∂f(x)/∂xn)

exists at each x ∈ dom(f)

1st-order condition: a differentiable f is convex if and only if its domain is convex and

f(x) + ⟨∇f(x), z − x⟩ ≤ f(z) for all x, z ∈ dom(f)

The first-order approximation is a global underestimate of f. This is a very important property used in algorithm design and performance analysis.
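The global-underestimate property can likewise be spot-checked; here is a sketch for f(x) = ‖x‖² with ∇f(x) = 2x (our choice of example):

```python
# Sketch: the tangent plane at x globally underestimates f; checked for
# f(x) = ||x||^2 with grad f(x) = 2x (our choice of example).
import numpy as np

f = lambda x: np.dot(x, x)
grad = lambda x: 2 * x
rng = np.random.default_rng(1)
for _ in range(1000):
    x, z = rng.normal(size=4), rng.normal(size=4)
    assert f(x) + grad(x) @ (z - x) <= f(z) + 1e-12
print("first-order underestimate holds on all samples")
```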

Restriction of a Convex Function to a Line

f is convex if and only if dom(f) is convex and the function

g : R → R,  g(t) = f(x + tv),  dom(g) = {t | x + tv ∈ dom(f)}

is convex (in t) for any x ∈ dom(f) and v ∈ Rn

Checking convexity of multivariable functions can thus be done by checking convexity of functions of one variable.

Example: f : Sn → R with f(X) = − ln det X, dom(f) = Sn++

g(t) = − ln det(X + tV) = − ln det X − ln det(I + tX^{−1/2}VX^{−1/2}) = − ln det X − ∑_{i=1}^{n} ln(1 + tλi)

where λi are the eigenvalues of X^{−1/2}VX^{−1/2}

g is convex in t (for any choice of V and any X ≻ 0); hence f is convex
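The eigenvalue identity behind this example is easy to verify numerically; a sketch (the random X ≻ 0, the symmetric V, and the value of t are made up for illustration):

```python
# Sketch: verify det(X + tV) = det(X) * prod_i (1 + t*lam_i), i.e. the
# eigenvalue identity used above, on random data.
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.normal(size=(n, n))
X = A @ A.T + n * np.eye(n)          # a positive definite X
B = rng.normal(size=(n, n))
V = (B + B.T) / 2                    # a symmetric direction V

w, Q = np.linalg.eigh(X)             # X^{-1/2} via eigendecomposition
X_mhalf = Q @ np.diag(w ** -0.5) @ Q.T
lam = np.linalg.eigvalsh(X_mhalf @ V @ X_mhalf)

t = 0.1                              # small enough that X + tV stays PD here
lhs = -np.log(np.linalg.det(X + t * V))
rhs = -np.log(np.linalg.det(X)) - np.sum(np.log(1 + t * lam))
print(lhs, rhs)                      # the two values agree
```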

Operations Preserving Convexity

• Positive Scaling

• Sum

• Composition with Affine Mapping

• Special Compositions

• Point-wise Maximum

• Point-wise Supremum

• Partial Minimization


Scaling, Sum, & Composition with Affine Function

Positive multiple: For a convex f and λ > 0, the function λf is convex

Sum: For convex f1 and f2, the sum f1 + f2 is convex (extends to infinite sums, integrals)

Composition with affine function: For a convex f and affine g [i.e., g(x) = Ax + b], the composition f ◦ g is convex, where (f ◦ g)(x) = f(Ax + b)

Examples

• Log-barrier for linear inequalities:

f(x) = − ∑_{i=1}^{m} ln(bi − ai'x),  dom(f) = {x | ai'x < bi, i = 1, . . . , m}

• (Any) norm of an affine function: f(x) = ‖Ax + b‖
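As a small illustration of the log-barrier example above, here is a sketch computing its value and gradient; the constraint data A, b and the helper names are made up:

```python
# Sketch: value and gradient of the log-barrier for a_i'x < b_i.
import numpy as np

def barrier(A, b, x):
    s = b - A @ x                    # slacks b_i - a_i'x, positive in dom(f)
    if np.any(s <= 0):
        return np.inf                # outside the domain
    return -np.sum(np.log(s))

def barrier_grad(A, b, x):
    s = b - A @ x
    return A.T @ (1.0 / s)           # sum_i a_i / (b_i - a_i'x)

A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.array([1.0, 1.0, 1.0])
x = np.zeros(2)                      # a strictly feasible point
print(barrier(A, b, x), barrier_grad(A, b, x))   # gradient ~ 0: x is central
```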

Composition with Scalar Functions

Composition of g : Rn → R and h : R → R with dom(g) = Rn and dom(h) = R:

f(x) = h(g(x))

f is convex if

(1) g is convex, h is nondecreasing and convex

(2) g is concave, h is nonincreasing and convex

Examples

• e^{g(x)} is convex if g is convex

• 1/g(x) is convex if g is concave and positive

Composition with Vector Functions

Composition of g : Rn → Rp and h : Rp → R with dom(g) = Rn and dom(h) = Rp:

f(x) = h(g(x)) = h(g1(x), g2(x), . . . , gp(x))

f is convex if

(1) each gi is convex, h is convex and nondecreasing in each argument

(2) each gi is concave, h is convex and nonincreasing in each argument

Example

• ∑_{i=1}^{m} e^{gi(x)} is convex if the gi are convex

Pointwise Maximum

For convex functions f1, . . . , fm, the pointwise-max function

F(x) = max {f1(x), . . . , fm(x)}

is convex (What is the domain of F?)

Examples

• Piecewise-linear function: f(x) = max_{i=1,...,m} (ai'x + bi) is convex

• Sum of r largest components of a vector x ∈ Rn:

f(x) = x[1] + x[2] + ··· + x[r]

is convex (x[i] is the i-th largest component of x)

f(x) = max_{(i1, . . . , ir) ∈ Ir} {xi1 + xi2 + ··· + xir}

Ir = {(i1, . . . , ir) | i1 < · · · < ir, ij ∈ {1, . . . , n}, j = 1, . . . , r}

Pointwise supremum - later
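The sum-of-r-largest example can be checked by computing it both ways: directly by sorting, and as the pointwise max over index subsets. A sketch (data are random for illustration):

```python
# Sketch: the sum of the r largest components computed directly (sorting)
# and as the pointwise max over all r-element index subsets; random data.
from itertools import combinations
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=6)
r = 3
direct = np.sort(x)[::-1][:r].sum()                    # x[1] + ... + x[r]
as_max = max(x[list(idx)].sum() for idx in combinations(range(x.size), r))
print(direct, as_max)   # equal: f is a pointwise max of linear functions
```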

Extended-Value Functions

A function f is an extended-value function if f : Rn → R ∪ {−∞, +∞}

Example: consider f(x) = inf_{y≥0} xy for x ∈ R

Def. The epigraph of a function f over Rn is the following set in Rn+1:

epi f = {(x, w) ∈ Rn+1 | x ∈ Rn, f(x) ≤ w}

Theorem [Convex Function - Convex Epigraph] A function f is convex if and only if its epigraph epi f is a convex set in Rn+1

This allows us to use the convexity of the epigraph as the definition of convexity (often done); the two definitions are equivalent in view of the theorem.

For an f with domain dom(f), we associate an extended-value function f̃ defined by

f̃(x) = f(x) if x ∈ dom(f), and f̃(x) = +∞ otherwise

dom(f) is the projection of epi f on Rn; the convexity inequality for f is recovered from the convexity of epi f by letting w = f(x)

Pointwise Supremum

Let A ⊆ Rp and f : Rn × Rp → R. Let f(x, z) be convex in x for each z ∈ A. Then, the supremum function over the set A is convex:

g(x) = sup_{z∈A} f(x, z)

Examples

• The support function of a set C ⊂ Rn is convex:
  SC : Rn → R,  SC(x) = sup_{z∈C} z'x

• The farthest-distance function of a set C ⊂ Rn is convex:
  f : Rn → R,  f(x) = sup_{z∈C} ‖x − z‖

• The maximum-eigenvalue function of a symmetric matrix is convex:
  λmax : Sn → R,  λmax(X) = sup_{‖z‖=1} z'Xz
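Since λmax is a pointwise supremum of linear functions of X, it satisfies the convexity inequality on Sn; a numerical spot check (the random symmetric matrices are our construction):

```python
# Sketch: spot check of the convexity inequality for lambda_max on
# random symmetric matrices.
import numpy as np

rng = np.random.default_rng(4)
lam_max = lambda X: np.linalg.eigvalsh(X).max()
for _ in range(200):
    X = rng.normal(size=(5, 5)); X = (X + X.T) / 2
    Y = rng.normal(size=(5, 5)); Y = (Y + Y.T) / 2
    a = rng.uniform()
    assert lam_max(a*X + (1-a)*Y) <= a*lam_max(X) + (1-a)*lam_max(Y) + 1e-10
print("lambda_max satisfies the convexity inequality on all samples")
```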

Minimization

Let C ⊆ Rp be a nonempty convex set and let f : Rn × Rp → R be a convex function [in (x, z) ∈ Rn × Rp]. Then

g(x) = inf_{z∈C} f(x, z) is convex

Example
• Distance to a set: for a nonempty convex C ⊂ Rn, dist(x, C) = min_{z∈C} ‖x − z‖ is convex

Proof for the case when g is finite: Let x1, x2 ∈ Rn and α ∈ (0, 1) be arbitrary, and let ε > 0 be arbitrarily small. Then there exist z1, z2 ∈ C such that f(x1, z1) ≤ g(x1) + ε and f(x2, z2) ≤ g(x2) + ε. Consider f(αx1 + (1 − α)x2, αz1 + (1 − α)z2) and use the convexity of f and C.

Level Sets and Convex Functions

Def. Given a scalar c ∈ R and a function f, a (lower) level set of f associated with c is given by

Lc(f) = {x ∈ Rn | f(x) ≤ c}

Examples: f(x) = ‖x‖² for x ∈ Rn;  f(x1, x2) = e^{x1}

• Every level set of a convex function is convex

• The converse is false: consider f(x) = −e^x for x ∈ R

Recall the definition of a concave function: g is concave when −g is convex

• Every (upper) level set of a concave function is convex

Optimality Principle for Differentiable f

In the following, unless otherwise stated, the function f is assumed to be differentiable and convex, with dom(f) = Rn.

Let f be a differentiable convex function and let C be a nonempty closed convex set

Theorem A vector x∗ is optimal (i.e., minimizes f over C) if and only if x∗ ∈ C and

⟨∇f(x∗), z − x∗⟩ ≥ 0 for all z ∈ C

Unconstrained Optimization

minimize f(x) subject to x ∈ Rn

• A vector x∗ is optimal if and only if ∇f(x∗) = 0

• Follows from

⟨∇f(x∗), z − x∗⟩ ≥ 0 for all z ∈ Rn

Linear Equality Constrained Problem

minimize f(x) subject to Ax = b, with A ∈ Rm×n and b ∈ Rm

• When does an optimal solution exist?
• A vector x∗ is optimal if and only if

⟨∇f(x∗), y⟩ = 0 for all y ∈ NA (the null space of A)

Using NA⊥ = Im A', we have that x∗ is optimal if and only if there exists λ∗ ∈ Rm such that

∇f(x∗) + A'λ∗ = 0

This is the primal-dual (Lagrangian) optimality condition.
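For a strongly convex quadratic f(x) = (1/2)x'Px + q'x, this condition together with Ax = b is one linear system in (x∗, λ∗). A sketch solving it (the problem data are made up):

```python
# Sketch: for f(x) = (1/2)x'Px + q'x, the primal-dual condition plus
# feasibility is the linear (KKT) system  [P A'; A 0][x; lam] = [-q; b].
import numpy as np

rng = np.random.default_rng(5)
n, m = 4, 2
M = rng.normal(size=(n, n))
P = M @ M.T + np.eye(n)                 # P > 0, so f is strictly convex
q = rng.normal(size=n)
A = rng.normal(size=(m, n))
b = rng.normal(size=m)

KKT = np.block([[P, A.T], [A, np.zeros((m, m))]])
sol = np.linalg.solve(KKT, np.concatenate([-q, b]))
x_star, lam_star = sol[:n], sol[n:]

print(np.linalg.norm(A @ x_star - b))                    # feasibility ~ 0
print(np.linalg.norm(P @ x_star + q + A.T @ lam_star))   # grad f + A'lam ~ 0
```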

Minimization over the Nonnegative Orthant

minimize f(x) subject to x ≥ 0, x ∈ Rn

• When does an optimal solution exist?
• A vector x∗ ≥ 0 is optimal if and only if

∇f(x∗) ≥ 0 and ⟨∇f(x∗), x∗⟩ = 0

The second relation is known as the "Complementarity Condition" in Lagrangian duality.

• Again, it follows from the optimality principle:

⟨∇f(x∗), z − x∗⟩ ≥ 0 for all z ≥ 0

Projection Theorem

Let C ⊆ Rn be a nonempty closed convex set and let x̂ ∈ Rn be arbitrary.

(a) There is a unique solution to the following problem:

minimize ‖z − x̂‖² subject to z ∈ C

(b) A vector z∗ ∈ C is the solution if and only if

⟨z∗ − x̂, z − z∗⟩ ≥ 0 for all z ∈ C

• The solution is said to be the projection of x̂ on C in the Euclidean norm, denoted by ΠC[x̂]
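Two sets with closed-form projections are a box and a Euclidean ball; the sketch below implements both and spot-checks characterization (b) at the projection (the set choices and tolerances are ours):

```python
# Sketch: closed-form projections on a box and on a Euclidean ball, with a
# spot check of characterization (b) at the projection on the ball.
import numpy as np

def proj_box(x, lo, hi):
    # C = {z : lo <= z <= hi}, projection clips coordinate-wise
    return np.clip(x, lo, hi)

def proj_ball(x, r=1.0):
    # C = {z : ||z|| <= r}, projection rescales points outside the ball
    nrm = np.linalg.norm(x)
    return x if nrm <= r else (r / nrm) * x

rng = np.random.default_rng(6)
x_hat = 3.0 * rng.normal(size=3)
z_star = proj_ball(x_hat)
for _ in range(1000):
    z = proj_ball(rng.normal(size=3))          # a random point of C
    assert (z_star - x_hat) @ (z - z_star) >= -1e-10
print("characterization (b) holds at the projection")
print(proj_box(x_hat, -1.0, 1.0))              # box projection, for comparison
```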

Proof of the Projection Theorem

(a) The objective function is strongly∗ convex since its Hessian is equal to 2I. Therefore, the optimal solution exists and it is unique.

(b) By the first-order optimality condition, z∗ ∈ C is the solution if and only if

⟨∇f(z∗), z − z∗⟩ ≥ 0 for all z ∈ C

Since ∇f(z) = 2(z − x̂), the result follows.

∗Function f has a positive definite Hessian ∇2f(x) everywhere

Projection Properties

Th. Let C ⊆ Rn be a nonempty closed convex set

(a) The projection mapping ΠC : Rn → C is non-expansive, i.e.,

‖ΠC[x] − ΠC[y]‖ ≤ ‖x − y‖ for all x, y ∈ Rn

(b) The set distance function dist : Rn → R given by

dist(x, C) = ‖ΠC[x] − x‖ is convex
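Property (a) is easy to probe numerically with the box projection from before; a minimal sketch:

```python
# Sketch: numerical probe of non-expansiveness for the box projection.
import numpy as np

proj = lambda x: np.clip(x, -1.0, 1.0)   # projection on C = [-1, 1]^n
rng = np.random.default_rng(7)
for _ in range(1000):
    x, y = 3.0 * rng.normal(size=5), 3.0 * rng.normal(size=5)
    assert np.linalg.norm(proj(x) - proj(y)) <= np.linalg.norm(x - y) + 1e-12
print("projection is non-expansive on all sampled pairs")
```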

Proof of Projection Property (a)

(a) The relation evidently holds for any x and y with ΠC[x] = ΠC[y]. Consider now arbitrary x, y ∈ Rn with ΠC[x] ≠ ΠC[y]. By Projection Theorem (b), we have

⟨ΠC[x] − x, z − ΠC[x]⟩ ≥ 0 for all z ∈ C (1)

⟨ΠC[y] − y, z − ΠC[y]⟩ ≥ 0 for all z ∈ C (2)

Using z = ΠC[y] in Eq. (1) and z = ΠC[x] in Eq. (2), and adding the resulting inequalities, we obtain

⟨ΠC[y] − y + x − ΠC[x], ΠC[x] − ΠC[y]⟩ ≥ 0

implying that

⟨x − y, ΠC[x] − ΠC[y]⟩ ≥ ‖ΠC[x] − ΠC[y]‖²

Since ΠC[x] ≠ ΠC[y], it follows (by the Cauchy-Schwarz inequality) that

‖y − x‖ ≥ ‖ΠC[x] − ΠC[y]‖

Proof of Projection Property (b)

(b) Note that the distance function is equivalently given by

dist(x, C) = min_{z∈C} ‖x − z‖ for all x ∈ Rn

The function h(x, z) = ‖x − z‖ is convex in (x, z) over Rn × Rn. The set C is convex; hence dist(x, C) is convex (see the lecture on operations preserving convexity of functions).

Fejér Para-contraction Property

Th. Let C ⊆ Rn be a nonempty closed convex set. The projection mapping is a para-contraction with respect to the set C, i.e.,

‖ΠC[x] − y‖² ≤ ‖y − x‖² − ‖ΠC[x] − x‖² for all x ∈ Rn, y ∈ C

Optimality Property: Fixed-Point Interpretation

Let f be a differentiable convex function and let C be a nonempty closed convex set

Theorem A vector x∗ is optimal if and only if

x∗ = ΠC[x∗ − α∇f(x∗)] for all α > 0

Proof By the optimality principle,

⟨∇f(x∗), z − x∗⟩ ≥ 0 for all z ∈ C

α⟨∇f(x∗), z − x∗⟩ ≥ 0 for all z ∈ C and any α > 0

⟨x∗ − (x∗ − α∇f(x∗)), z − x∗⟩ ≥ 0 for all z ∈ C and any α > 0

By the Projection Theorem, a vector z∗ = ΠC[x̂] if and only if

⟨z∗ − x̂, z − z∗⟩ ≥ 0 for all z ∈ C

Hence, with x̂ = x∗ − α∇f(x∗), the relation ⟨x∗ − (x∗ − α∇f(x∗)), z − x∗⟩ ≥ 0 for all z ∈ C and any α > 0 is equivalent to

x∗ = ΠC[x∗ − α∇f(x∗)] for all α > 0

Gradient Methods: Solve Fixed-Point Formulation

x∗ = ΠC[x∗ − α∇f(x∗)] for all α > 0

Recursion

x(t + 1) = ΠC[x(t) − αt∇f(x(t))]

If T : Rn → Rn is a map that also has x∗ ∈ C as a fixed point, we can write another fixed-point relation (whose fixed points are optimal for f over C):

x∗ = βT(x∗) + (1 − β)ΠC[x∗ − α∇f(x∗)] for all α > 0, β ∈ [0, 1]

Since the solutions to min_{x∈C} f(x) are typically not known in advance, take T = I.

Method

x(t + 1) = βt x(t) + (1 − βt) ΠC[x(t) − αt∇f(x(t))]
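A minimal sketch of the basic recursion x(t + 1) = ΠC[x(t) − αt∇f(x(t))] (with βt = 0) on a least-squares objective over a box; the data, the constant stepsize αt = 1/L, and the iteration count are illustrative choices:

```python
# Sketch of the projected gradient recursion (beta_t = 0) for
# min ||Ax - b||^2 over the box C = [-1, 1]^n; all data are made up.
import numpy as np

rng = np.random.default_rng(8)
m, n = 20, 5
A = rng.normal(size=(m, n))
b = rng.normal(size=m)

grad = lambda x: 2 * A.T @ (A @ x - b)       # gradient of ||Ax - b||^2
proj_C = lambda x: np.clip(x, -1.0, 1.0)     # projection Pi_C on the box

L = 2 * np.linalg.norm(A.T @ A, 2)           # Lipschitz constant of grad f
alpha = 1.0 / L                              # a standard constant stepsize
x = np.zeros(n)
for t in range(500):
    x = proj_C(x - alpha * grad(x))

# At an optimal point, x is a fixed point of the projected-gradient map
print(np.linalg.norm(x - proj_C(x - alpha * grad(x))))   # ~ 0
```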

Gradient Method: Basic Relation

Recursion/Algorithm

x(t + 1) = ΠC[x(t) − αt∇f(x(t))]

Using the para-contraction property of the projection,

‖ΠC[x] − y‖² ≤ ‖y − x‖² − ‖ΠC[x] − x‖² for all x ∈ Rn, y ∈ C

we obtain for all y ∈ C and all t:

‖x(t + 1) − y‖² ≤ ‖x(t) − αt∇f(x(t)) − y‖² − ‖x(t + 1) − x(t)‖²

Expanding the quadratic term,

‖x(t + 1) − y‖² ≤ ‖x(t) − y‖² − 2αt⟨∇f(x(t)), x(t) − y⟩ + αt²‖∇f(x(t))‖² − ‖x(t + 1) − x(t)‖²

By convexity of the function,

⟨∇f(x(t)), x(t) − y⟩ ≥ f(x(t)) − f(y)

so we have

‖x(t + 1) − y‖² ≤ ‖x(t) − y‖² − 2αt (f(x(t)) − f(y)) + αt²‖∇f(x(t))‖² − ‖x(t + 1) − x(t)‖²

Gradient Methods - Another Way

x(t + 1) = ΠC[x(t) − αt∇f(x(t))]

This is equivalent to

x(t + 1) = argmin_{y∈C} { αt⟨∇f(x(t)), y − x(t)⟩ + (1/2)‖y − x(t)‖² }

which is equivalent to

x(t + 1) = argmin_{y∈C} { ⟨∇f(x(t)), y − x(t)⟩ + (1/(2αt))‖y − x(t)‖² }

This view is suitable when norms other than the Euclidean norm are used! Even semi-norms can be used:

x(t + 1) = argmin_{y∈C} { ⟨∇f(x(t)), y − x(t)⟩ + (1/(2αt)) D(y, x(t)) }

with a Bregman-distance function D(y, x(t)). Analysis of the method's behavior starts by establishing a basic relation through the use of the optimality principle.
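One concrete instance: take C to be the probability simplex and D the Kullback-Leibler divergence (the Bregman distance generated by negative entropy); the argmin step then has a closed multiplicative "exponentiated gradient" form. A sketch under those assumptions, where the stepsize η absorbs the 1/(2αt) scaling and the linear test objective is made up:

```python
# Sketch: mirror descent on the probability simplex with D = KL divergence;
# the argmin step reduces to the multiplicative "exponentiated gradient"
# update x_i <- x_i * exp(-eta * grad_i), renormalized.
import numpy as np

rng = np.random.default_rng(9)
n = 10
c = rng.normal(size=n)
grad = lambda x: c                  # gradient of f(x) = c'x

x = np.ones(n) / n                  # start at the center of the simplex
eta = 0.5
for t in range(200):
    x = x * np.exp(-eta * grad(x))  # closed-form Bregman argmin step
    x /= x.sum()                    # renormalize back onto the simplex

# For a linear objective the minimizer over the simplex is a vertex:
print(x.argmax() == c.argmin())     # mass concentrates on the best coordinate
```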
