Convex Optimization Scai Lab Study Group

Convex Optimization ScAi Lab Study Group Zhiping (Patricia) Xiao University of California, Los Angeles Winter 2020 Outline 2 Introduction: Convex Optimization Convexity Convex Functions' Properties Definition of Convex Optimization Convex Optimization General Strategy Learning Algorithms Convergence Analysis Examples Source of Materials 3 Textbook: I Convex Optimization and Intro to Linear Algebra by Prof. Boyd and Prof. Vandenberghe Course Materials: I ECE236B, ECE236C offered by Prof. Vandenberghe I CS260 Lecture 12 offered by Prof. Quanquan Gu Notes: I My previous ECE236B notes and ECE236C final report. I My previous CS260 Cheat Sheet. Related Papers: I Accelerated methods for nonconvex optimization I Lipschitz regularity of deep neural networks: analysis and efficient estimation Introduction: Convex Optimization q Notations 5 I iff: if and only if I R+ = fx 2 R j x ≥ 0g I R++ = fx 2 R j x > 0g I int K: interior of set K, not its boundary. I Generalized inequalities (textbook 2.4), based on a proper cone K (convex, closed, solid, pointed | if x 2 K and −x 2 K then x = 0): I x K y () y − x 2 K I x ≺K y () y − x 2 int K n n T I Positive semidefinite matrix X 2 S+, 8y 2 R , y Xy ≥ 0 () X 0. Convexity of Set 6 Set C is convex iff the line segment between any two points in C lies in C, i.e. 8x1; x2 2 C and 8θ 2 [0; 1], we have: θx1 + (1 − θ)x2 2 C Both convex and nonconvex sets have convex hull, which is defined as: k k X X conv C = f θixi j xi 2 C; θi ≥ 0; i = 1; 2; : : : ; k; θi = 1g i=1 i=1 Convexity of Set (Examples from Textbook) 7 Figure: Left: convex, middle & right: nonconvex. Figure: Left: convex hull of the points, right: convex hull of the kidney-shaped set above. Operations that Preserve Convexity of SetsI 8 The most common operations that preserve convexity of convex sets include: I Intersection I Image / inverse image under affine function I Cartesian Product, Minkowski sum, Projection I Perspective function I Linear-fractional functions Operations that Preserve Convexity of SetsII 9 Convexity is preserved under intersection: I S1;S2 are convex sets then S1 \ S2 is also convex set. I If Sα is convex for 8α 2 A, then \α2ASα is convex. Proof: Intersection of a collection of convex sets is convex set. If the intersection is empty, or consists of only a single point, then proved by definition. Otherwise, for any two points A, B in the intersection, line AB must lie wholly within each set in the collection, hence must lie wholly within their intersection. Operations that Preserve Convexity of Sets III 10 n m An affine function f : R ! R is a sum of a linear function and a constant, i.e., if it has the form f(x) = Ax + b, where m×n m A 2 R , b 2 R , thus f represents a hyperplane. n Suppose that S ⊆ R is convex and then the image of S under f is convex: f(S) = ff(x) j x 2 Sg m n Also, if f : R ! R is an affine function, the inverse image of S under f is convex: f −1(S) = fx j f(x) 2 Sg Examples include scaling αS = ff(x) j αx; x 2 Sg (α 2 R) and n translation S + a = ff(x) j x + a; x 2 Sg (a 2 R ); they are both convex sets when S is convex. Operations that Preserve Convexity of SetsIV 11 Proof: the image of convex set S under affine function f(x) = Ax + b is also convex. If S is empty or contains only one point, then f(S) is obviously convex. Otherwise, take xS; yS 2 f(S). xS = f(x) = Ax + b, xS = f(y) = Ay + b. Then 8θ 2 [0; 1], we have: θxS + (1 − θ)yS = A(θx + (1 − θ)y) + b = f(θx + (1 − θ)y) Since x; y 2 S, and S is convex set, then θx + (1 − θ)y 2 S, and thus f(θx + (1 − θ)y) 2 f(S). Operations that Preserve Convexity of SetsV 12 n m The Cartesian Product of convex sets S1 ⊆ R , S2 ⊆ R is obviously convex: S1 × S2 = f(x1; x2) j x1 2 S1; x2 2 S2g The Minkowski sum of the two sets is defined as: S1 + S2 = fx1 + x2 j x1 2 S1; x2 2 S2g and it is also obviously convex. The projection of a convex set onto some of its coordinates is also obviously convex. (consider the definition of convexity reflected on each coordinate) m n T = fx1 2 R j (x1; x2) 2 S for some x2 2 R g Operations that Preserve Convexity of SetsVI 13 n+1 n We define the perspective function P : R ! R , with domain n dom P = R × R++, as P (z; t) = z=t. The perspective function scales or normalizes vectors so the last component is one, and then drops the last component. We can interpret the perspective function as the action of a pin-hole camera. (x1; x2; x3) through a hold at (0; 0; 0) on plane x3 = 0 forms an image at −(x1=x3; x2=x3; 1) at x3 = −1. The last component could be dropped, since the image point is fixed. Proof: That this operation preserves convexity is already proved by affine function + projection preserve convexity. Operations that Preserve Convexity of Sets VII 14 n m A linear-fractional function f : R ! R is formed by composing the perspective function with an affine function. n m+1 Consider the following affine function g : R ! R : A b g(x) = x + cT d m×n m n where A 2 R , b 2 R , c 2 R , d 2 R. m+1 m Followed by a perspective function P : R ! R we have: f(x) = (Ax + b)=(cT x + d); dom f = fx j cT x + d > 0g And it naturally preserves convexity because both affine function and perspective function preserve convexity. Convexity of Function 15 Convex Functions Strict Convex Functions Strong Convex Functions Figure: The three commonly-seen types of convex functions and their relations. In brief, strong convex functions ) strict convex functions ) convex functions. Convexity of Function 16 n f : R ! R is convex iff it satisfies: I dom f is a convex set. I 8x; y 2 dom f, θ 2 [0; 1], we have the Jensen's inequality: f(θx + (1 − θ)y) ≤ θf(x) + (1 − θ)f(y) f is strictly convex iff when x 6= y and θ 2 (0; 1), strict inequality of the above inequation holds. f is concave when −f is convex, strictly concave when −f strictly convex, and vice versa. f is strong convex iff 9α > 0 such that f(x) − αkxk2 is convex. k·k is any norm. Convexity of Function 17 Proof: strong convex functions ) strict convex functions ) convex functions. That all strict convex functions are convex functions, and that convex functions are not necessarily strict convex. Strong convexity implies, 8x; y 2 dom f; θ 2 [0; 1], x 6= y, 9α > 0: f(θx + (1 − θ)y) − αkθx + (1 − θ)yk2 (1.1) ≤ θf(x) + (1 − θ)f(y) − θαkxk2 − (1 − θ)αkyk2 Something we didn't prove yet but is true: k·k2 is strictly convex. We need it for this proof. kθx + (1 − θ)yk2 < θkxk2 + (1 − θ)kyk2 Convexity of Function 18 (proof continues) αkθx + (1 − θ)yk2 < θαkxk2 + (1 − θ)αkyk2 t = −αkθx + (1 − θ)yk2 + θαkxk2 + (1 − θ)αkyk2 > 0 (1.1) is equivalent with: f(θx + (1 − θ)y) + t ≤ θf(x) + (1 − θ)f(y) where t > 0, thus: f(θx + (1 − θ)y) < θf(x) + (1 − θ)f(y) Convexity of Function 19 Figure: Convex function illustration from Prof. Gu's Slides. This figure shows a typical convex function f, and instead of our expression of x and y he used u & v instead. Convexity of Function 20 Commonly-seen uni-variate convex functions include: I Constant: C I Exponential function: eax I Power function: xa (a 2 (−∞; 0] [ [1; 1), otherwise it is concave) I Powers of absolute value: jxjp (p ≥ 1) I Logarithm: − log(x)(x 2 R++) I x log(x)(x 2 R++) I All norm functions kxk I \The inequality follows from the triangle inequality, and the equality follows from homogeneity of a norm." Affine functions are Convex & Concave 21 n m An affine function f : R ! R , f(x) = Ax + b, where m×n m A 2 R , b 2 R , is convex & concave (neither strict convex nor strict concave). Conversely, all functions that are both convex and concave are affine functions. Proof: 8θ 2 [0; 1]; x; y 2 dom f, we have: f(θx + (1 − θ)y) = A(θx + (1 − θ)y) + b = θ(Ax + b) + (1 − θ)(Ay + b) = θf(x) + (1 − θ)f(y) th 0 -Order Characterization 22 f is convex iff it is convex when restricted to any line that intersects its domain. n In other words, f is convex iff 8x 2 dom f and 8v 2 R , the function: g(t) = f(x + tv) is convex. dom g = ft j x + tv 2 dom fg This property allows us to check convexity of a function by restricting it to a line. st 1 -Order Characterization 23 Suppose f is differentiable (its gradient rf exists at each point in dom f, which is open). Then f is convex iff: I dom f is a convex set I 8x; y 2 dom f: f(y) ≥ f(x) + rf(x)T (y − x) It states that, for a convex function, the first-order Taylor approximation (f(x) + rf(x)T (y − x) is the first-order Taylor approximation of f near x) is in fact a global underestimator of the function.

Convex Optimization Scai Lab Study Group

Details

Download

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

Support