Convexity Theory and Gradient Methods

Angelia Nedić
ISE Department and Coordinated Science Laboratory
University of Illinois at Urbana-Champaign

IMA, Minnesota, June 1-6, 2014

Outline

• Convex Functions
• Optimality Principle
• Projection Theorem
• Gradient Methods

Convex Set

The line segment $[x_1, x_2] \subseteq \mathbb{R}^n$ is the set of all points $x_\alpha = \alpha x_1 + (1-\alpha) x_2$ for $\alpha \in [0, 1]$.

A set $C$ is convex when, for all $x_1, x_2 \in C$, the segment $[x_1, x_2]$ is contained in $C$.

Convex Function

Let $f$ be a function from $\mathbb{R}^n$ to $\mathbb{R}$, $f : \mathbb{R}^n \to \mathbb{R}$.

Informally: $f$ is convex when, for every segment $[x_1, x_2]$, as $x_\alpha = \alpha x_1 + (1-\alpha) x_2$ varies over the segment, the points $(x_\alpha, f(x_\alpha))$ lie on or below the segment connecting $(x_1, f(x_1))$ and $(x_2, f(x_2))$.

The domain of $f$ is the set in $\mathbb{R}^n$ defined by

  $\mathrm{dom}(f) = \{x \in \mathbb{R}^n \mid f(x) \text{ is well defined (finite)}\}$

Def. A function $f$ is convex if
(1) its domain $\mathrm{dom}(f)$ is a convex set in $\mathbb{R}^n$, and
(2) for all $x_1, x_2 \in \mathrm{dom}(f)$ and $\alpha \in [0, 1]$,

  $f(\alpha x_1 + (1-\alpha) x_2) \le \alpha f(x_1) + (1-\alpha) f(x_2)$

The function is strictly convex if the inequality is strict whenever $x_1 \ne x_2$ and $\alpha \in (0, 1)$.

Examples on R

Convex:
• Affine: $ax + b$ over $\mathbb{R}$ for any $a, b \in \mathbb{R}$
• Exponential: $e^{ax}$ over $\mathbb{R}$ for any $a \in \mathbb{R}$
• Power: $x^p$ over $(0, +\infty)$ for $p \ge 1$ or $p \le 0$
• Powers of absolute value: $|x|^p$ over $\mathbb{R}$ for $p \ge 1$
• Negative entropy: $x \ln x$ over $(0, +\infty)$

Concave:
• Affine: $ax + b$ over $\mathbb{R}$ for any $a, b \in \mathbb{R}$
• Powers: $x^p$ over $(0, +\infty)$ for $0 \le p \le 1$
• Logarithm: $\ln x$ over $(0, +\infty)$

Examples: Affine Functions and Norms

• Affine functions are both convex and concave
• Norms are convex

Examples on $\mathbb{R}^n$
• Affine function $f(x) = a'x + b$ with $a \in \mathbb{R}^n$ and $b \in \mathbb{R}$ (a prime denotes the transpose)
• Euclidean, $\ell_1$, and $\ell_\infty$ norms
• General $\ell_p$ norms, for $p \ge 1$:

  $\|x\|_p = \left( \sum_{i=1}^n |x_i|^p \right)^{1/p}$

Examples on $\mathbb{R}^{m \times n}$

The space $\mathbb{R}^{m \times n}$ is the space of $m \times n$ matrices.

• Affine function

  $f(X) = \mathrm{tr}(A'X) + b = \sum_{i=1}^m \sum_{j=1}^n a_{ij} x_{ij} + b$

• Spectral (maximum singular value) norm

  $f(X) = \|X\|_2 = \sigma_{\max}(X) = \sqrt{\lambda_{\max}(X'X)}$

  where $\lambda_{\max}(A)$ denotes the maximum eigenvalue of a matrix $A$

Verifying Convexity of a Function

We can verify that a given function $f$ is convex by
• Using the definition
• Applying some special criteria
  • Second-order conditions
  • First-order conditions
  • Reduction to a scalar function
• Showing that $f$ is obtained through operations preserving convexity

Second-Order Conditions

Let $f$ be twice differentiable and let $\mathrm{dom}(f) = \mathbb{R}^n$ [in general, it is required that $\mathrm{dom}(f)$ is open].

The Hessian $\nabla^2 f(x)$ is a symmetric $n \times n$ matrix whose entries are the second-order partial derivatives of $f$ at $x$:

  $[\nabla^2 f(x)]_{ij} = \frac{\partial^2 f(x)}{\partial x_i \partial x_j}$ for $i, j = 1, \ldots, n$

2nd-order conditions: For a twice differentiable $f$ with convex domain
• $f$ is convex if and only if $\nabla^2 f(x) \succeq 0$ for all $x \in \mathrm{dom}(f)$
• $f$ is strictly convex if $\nabla^2 f(x) \succ 0$ for all $x \in \mathrm{dom}(f)$

Examples

Quadratic function: $f(x) = (1/2) x'Px + q'x + r$ with a symmetric $n \times n$ matrix $P$

  $\nabla f(x) = Px + q$, $\nabla^2 f(x) = P$

Convex for $P \succeq 0$.

Least-squares objective: $f(x) = \|Ax - b\|^2$ with an $m \times n$ matrix $A$

  $\nabla f(x) = 2A'(Ax - b)$, $\nabla^2 f(x) = 2A'A$

Convex for any $A$.

Quadratic-over-linear: $f(x, y) = x^2 / y$

  $\nabla^2 f(x, y) = \frac{2}{y^3} \begin{bmatrix} y \\ -x \end{bmatrix} \begin{bmatrix} y \\ -x \end{bmatrix}' \succeq 0$

Convex for $y > 0$.

First-Order Condition

$f$ is differentiable if $\mathrm{dom}(f)$ is open and the gradient

  $\nabla f(x) = \left( \frac{\partial f(x)}{\partial x_1}, \frac{\partial f(x)}{\partial x_2}, \ldots, \frac{\partial f(x)}{\partial x_n} \right)$

exists at each $x \in \mathrm{dom}(f)$.

1st-order condition: a differentiable $f$ is convex if and only if its domain is convex and

  $f(x) + \langle \nabla f(x), z - x \rangle \le f(z)$ for all $x, z \in \mathrm{dom}(f)$

A first-order approximation is a global underestimate of $f$. This is a very important property, used in algorithm design and performance analysis.

Restriction of a Convex Function to a Line

$f$ is convex if and only if $\mathrm{dom}(f)$ is convex and the function $g : \mathbb{R} \to \mathbb{R}$,

  $g(t) = f(x + tv)$, $\mathrm{dom}(g) = \{t \mid x + tv \in \mathrm{dom}(f)\}$

is convex (in $t$) for any $x \in \mathrm{dom}(f)$, $v \in \mathbb{R}^n$.

Checking convexity of multivariable functions can be done by checking convexity of functions of one variable.

Example: $f : S^n \to \mathbb{R}$ with $f(X) = -\ln\det X$, $\mathrm{dom}(f) = S^n_{++}$

  $g(t) = -\ln\det(X + tV) = -\ln\det X - \ln\det(I + t X^{-1/2} V X^{-1/2}) = -\ln\det X - \sum_{i=1}^n \ln(1 + t\lambda_i)$

where $\lambda_i$ are the eigenvalues of $X^{-1/2} V X^{-1/2}$. Each term $-\ln(1 + t\lambda_i)$ is convex in $t$, so $g$ is convex in $t$ (for any choice of symmetric $V$ and any $X \succ 0$); hence $f$ is convex.

Operations Preserving Convexity

• Positive Scaling
• Sum
• Composition with Affine Mapping
• Special Compositions
• Point-wise Maximum
• Point-wise Supremum
• Partial Minimization

Scaling, Sum, & Composition with Affine Function

Positive multiple: For a convex $f$ and $\lambda > 0$, the function $\lambda f$ is convex.

Sum: For convex $f_1$ and $f_2$, the sum $f_1 + f_2$ is convex (extends to infinite sums, integrals).

Composition with affine function: For a convex $f$ and affine $g$ [i.e., $g(x) = Ax + b$], the composition $f \circ g$ is convex, where $(f \circ g)(x) = f(Ax + b)$.

Examples
• Log-barrier for linear inequalities

  $f(x) = -\sum_{i=1}^m \ln(b_i - a_i'x)$, $\mathrm{dom}(f) = \{x \mid a_i'x < b_i,\ i = 1, \ldots, m\}$

• (Any) Norm of affine function: $f(x) = \|Ax + b\|$

Composition with Scalar Functions

Composition of $g : \mathbb{R}^n \to \mathbb{R}$ and $h : \mathbb{R} \to \mathbb{R}$ with $\mathrm{dom}(g) = \mathbb{R}^n$ and $\mathrm{dom}(h) = \mathbb{R}$:

  $f(x) = h(g(x))$

$f$ is convex if
(1) $g$ is convex, $h$ is nondecreasing and convex, or
(2) $g$ is concave, $h$ is nonincreasing and convex

Examples
• $e^{g(x)}$ is convex if $g$ is convex
• $1/g(x)$ is convex if $g$ is concave and positive

Composition with Vector Functions

Composition of $g : \mathbb{R}^n \to \mathbb{R}^p$ and $h : \mathbb{R}^p \to \mathbb{R}$ with $\mathrm{dom}(g) = \mathbb{R}^n$ and $\mathrm{dom}(h) = \mathbb{R}^p$:

  $f(x) = h(g(x)) = h(g_1(x), g_2(x), \ldots, g_p(x))$

$f$ is convex if
(1) each $g_i$ is convex, $h$ is convex and nondecreasing in each argument, or
(2) each $g_i$ is concave, $h$ is convex and nonincreasing in each argument

Example
• $\sum_{i=1}^m e^{g_i(x)}$ is convex if the $g_i$ are convex

Pointwise Maximum

For convex functions $f_1, \ldots, f_m$, the pointwise-max function

  $F(x) = \max\{f_1(x), \ldots, f_m(x)\}$

is convex. (What is the domain of $F$?)

Examples
• Piecewise-linear function: $f(x) = \max_{i=1,\ldots,m} (a_i'x + b_i)$ is convex
• Sum of the $r$ largest components of a vector $x \in \mathbb{R}^n$: $f(x) = x_{[1]} + x_{[2]} + \cdots + x_{[r]}$ is convex ($x_{[i]}$ is the $i$-th largest component of $x$), since

  $f(x) = \max_{(i_1, \ldots, i_r) \in I_r} \{x_{i_1} + x_{i_2} + \cdots + x_{i_r}\}$

  $I_r = \{(i_1, \ldots, i_r) \mid i_1 < \cdots < i_r,\ i_j \in \{1, \ldots, n\},\ j = 1, \ldots, r\}$

Pointwise supremum: later.

Extended-Value Functions: Epigraph

A function $f$ is an extended-value function if $f : \mathbb{R}^n \to \mathbb{R} \cup \{-\infty, +\infty\}$.

Example: consider $f(x) = \inf_{y \ge 0} xy$ for $x \in \mathbb{R}$.

Def. The epigraph of a function $f$ over $\mathbb{R}^n$ is the following set in $\mathbb{R}^{n+1}$:

  $\mathrm{epi} f = \{(x, w) \in \mathbb{R}^{n+1} \mid x \in \mathbb{R}^n,\ f(x) \le w\}$

Theorem [Convex Function - Convex Epigraph]: A function $f$ is convex if and only if its epigraph $\mathrm{epi} f$ is a convex set in $\mathbb{R}^{n+1}$.

This allows us to use the convexity of the epigraph as the definition of convexity (often done). The two definitions are equivalent in view of the theorem.
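As a rough numerical illustration of the theorem (a sanity check, not a proof), the sketch below samples pairs of points in the epigraph of a piecewise-linear function $f(x) = \max_i (a_i'x + b_i)$ and counts how many convex combinations fall outside the epigraph. The helper names, the random data, the `max_slack` parameter, and the floating-point tolerance are all assumptions made for this example; none of them come from the slides.

```python
import numpy as np

def piecewise_linear(A, b, x):
    """f(x) = max_i (a_i' x + b_i): a pointwise maximum of affine functions."""
    return np.max(A @ x + b)

def in_epigraph(f, x, w, tol=1e-9):
    """(x, w) lies in epi f when f(x) <= w; tol absorbs rounding error (assumed value)."""
    return f(x) <= w + tol

def epigraph_violations(f, n, max_slack=2.0, trials=1000, seed=0):
    """Count sampled convex combinations of epigraph points that leave epi f."""
    rng = np.random.default_rng(seed)
    violations = 0
    for _ in range(trials):
        x1, x2 = rng.normal(size=n), rng.normal(size=n)
        # Choose w >= f(x) so that (x1, w1) and (x2, w2) both lie in epi f.
        w1 = f(x1) + max_slack * rng.uniform(0.0, 1.0)
        w2 = f(x2) + max_slack * rng.uniform(0.0, 1.0)
        alpha = rng.uniform(0.0, 1.0)
        x = alpha * x1 + (1 - alpha) * x2
        w = alpha * w1 + (1 - alpha) * w2
        if not in_epigraph(f, x, w):
            violations += 1
    return violations

# Hypothetical random data defining the piecewise-linear function.
data_rng = np.random.default_rng(1)
A, b = data_rng.normal(size=(5, 3)), data_rng.normal(size=5)

convex_f = lambda x: piecewise_linear(A, b, x)
concave_f = lambda x: -piecewise_linear(A, b, x)

# Convex f: no combination should leave the epigraph.
convex_violations = epigraph_violations(convex_f, n=3)
# Concave (non-affine) -f, sampled on its graph (max_slack=0): violations expected.
concave_violations = epigraph_violations(concave_f, n=3, max_slack=0.0)
print(convex_violations, concave_violations)
```

Passing this test on samples does not establish convexity, but a single violation does disprove it, which is why the concave function serves as a contrast.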
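For an $f$ with domain $\mathrm{dom}(f)$, we associate an extended-value function $\tilde f$ defined by

  $\tilde f(x) = \begin{cases} f(x) & \text{if } x \in \mathrm{dom}(f) \\ +\infty & \text{otherwise} \end{cases}$

$\mathrm{dom}(f)$ is the projection of $\mathrm{epi} f$ on $\mathbb{R}^n$; convexity of $f$ follows from convexity of $\mathrm{epi} f$ by letting $w = f(x)$.

Pointwise Supremum

Let $A \subseteq \mathbb{R}^p$ and $f : \mathbb{R}^n \times \mathbb{R}^p \to \mathbb{R}$. Let $f(x, z)$ be convex in $x$ for each $z \in A$. Then the supremum function over the set $A$ is convex:

  $g(x) = \sup_{z \in A} f(x, z)$

Examples
• The support function of a set $C \subset \mathbb{R}^n$ is convex:

  $S_C : \mathbb{R}^n \to \mathbb{R}$, $S_C(x) = \sup_{z \in C} z'x$

• The farthest-distance function of a set $C \subset \mathbb{R}^n$ is convex:

  $f : \mathbb{R}^n \to \mathbb{R}$, $f(x) = \sup_{z \in C} \|x - z\|$

• The maximum eigenvalue function of a symmetric matrix is convex:

  $\lambda_{\max} : S^n \to \mathbb{R}$, $\lambda_{\max}(X) = \sup_{\|z\| = 1} z'Xz$

Minimization

Let $C \subseteq \mathbb{R}^p$ be a nonempty convex set.
Let $f : \mathbb{R}^n \times \mathbb{R}^p \to \mathbb{R}$ be a convex function [in $(x, z) \in \mathbb{R}^n \times \mathbb{R}^p$].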
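Returning to the maximum-eigenvalue example under Pointwise Supremum, the following sketch checks two things numerically on random data: that $z'Xz$ never exceeds $\lambda_{\max}(X)$ over sampled unit vectors $z$, and that $\lambda_{\max}$ satisfies the convexity inequality along the segment between two symmetric matrices. The matrix size, sample counts, seed, and helper names are assumptions chosen for illustration only.

```python
import numpy as np

def lambda_max(X):
    """Largest eigenvalue of a symmetric matrix X."""
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    return np.linalg.eigvalsh(X)[-1]

def random_symmetric(rng, n):
    """Random symmetric n x n matrix (hypothetical test data)."""
    M = rng.normal(size=(n, n))
    return (M + M.T) / 2

rng = np.random.default_rng(0)
n = 4
X, Y = random_symmetric(rng, n), random_symmetric(rng, n)

# Supremum representation: z' X z <= lambda_max(X) for every unit vector z.
Z = rng.normal(size=(2000, n))
Z /= np.linalg.norm(Z, axis=1, keepdims=True)      # normalize each row to unit length
quad_forms = np.einsum("ij,jk,ik->i", Z, X, Z)     # z_i' X z_i for each sampled z_i
sup_estimate = quad_forms.max()                    # lower estimate of the supremum

# Convexity along the segment between X and Y:
# lambda_max(a X + (1 - a) Y) <= a lambda_max(X) + (1 - a) lambda_max(Y).
gaps = []
for a in np.linspace(0.0, 1.0, 11):
    lhs = lambda_max(a * X + (1 - a) * Y)
    rhs = a * lambda_max(X) + (1 - a) * lambda_max(Y)
    gaps.append(rhs - lhs)
min_gap = min(gaps)

print(sup_estimate, lambda_max(X), min_gap)
```

Each sampled $z$ contributes one function $X \mapsto z'Xz$ that is linear in $X$, so the check mirrors the argument on the slide: a supremum of linear (hence convex) functions is convex.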
