Convex Optimization (EE227BT: UC Berkeley)

Convex Optimization (EE227BT: UC Berkeley) Lecture 1 (Convex sets and functions) August 28, 2014 ◦ Laurent El Ghaoui (slides by Suvrit Sra) Course organization I Course material and interaction: use bCourse I Relevant texts / references: Convex optimization { Boyd & Vandenberghe (BV) ~ Introductory lectures on convex optimisation { Nesterov ~ Nonlinear programming { Bertsekas ~ Convex Analysis { Rockafellar ~ Numerical optimization { Nocedal & Wright ~ Lectures on modern convex optimization { Nemirovski ~ Optimization for Machine Learning { Sra, Nowozin, Wright ~ Optimization Models { Calafiore, El Ghaoui (to appear in ~ November) I Instructor: Laurent El Ghaoui ([email protected]) I HW + Quizzes (40%); Midterm (30%); Project (30%) I TA: Vu Pham ([email protected]) I Office hours: Thu 9-10am (El Ghaoui), TBA (Vu Pham) I If you email me, please put EE227BT in Subject: Linear algebra recap Eigenvalues and Eigenvectors n×n n Def. If A C and x C . Consider the equation 2 2 Ax = λx; x = 0; λ C: 6 2 If scalar λ and vector x satisfy this equation, then λ is called an eigenvalue and x and eigenvector of A. Above equation may be rewritten equivalently as (λI A)x = 0; x = 0: − 6 Thus, λ is an eigenvalue, if and only if det(λI A) = 0: − Def. pA(t) := det(tI A) is called characteristic polynomial. − Eigenvalues are roots of characteristic polynomial. Eigenvalues and Eigenvectors n×n Theorem Let λ1; : : : ; λn be eigenvalues of A C . Then, 2 X X Y Tr(A) = aii = λi ; det(A) = λi : i i i n×n ∗ ∗ Def. Matrix U C unitary if U U = I ([U ]ij = [¯uji ]) 2 n×n Theorem (Schur factorization). If A C with eigenvalues 2 n×n ∗ λ1; : : : ; λn, then there is a unitary matrix U C (i.e., U U = I ), such that 2 ∗ U AU = T = [tij ] is upper triangular with diagonal entries tii = λi . Corollary. If A∗A = AA∗, then there exists a unitary U such that A = UΛU∗. We will call this the Eigenvector Decomposition. Proof.A = VTV ∗, A∗ = VT ∗V ∗, so AA∗ = TT ∗ = T ∗T = A∗A. But T is upper triangular, so only way for TT ∗ = T ∗T , some easy but tedious induction shows that T must be diagonal. Hence, T = Λ. Singular value decomposition m×n Theorem (SVD) Let A C . There are unitaries s.t. U and V 2 ∗ U AV = Diag(σ1; : : : ; σp); p = min(m; n); where σ1 σ2 σp 0. Usually written as ≥ ≥ · · · ≥ A = UΣV ∗: left singular vectors U are eigenvectors of AA∗ right singular vectors V are eigenvectors of A∗A p ∗ p ∗ nonzero singular values σi = λi (AA ) = λi (A A) Positive definite matrices n×n Def. Let A R be symmetric, i.e., aij = aji . Then, A is called positive definite2 if T X x Ax = xi aij xj > 0; x = 0: ij 8 6 If > replaced by , we call A positive semidefinite. ≥ Theorem A symmetric real matrix is positive semidefinite (positive definite) iff all its eigenvalues are nonnegative (positive). Theorem Every semidefinite matrix can be written as BT B Exercise: Prove this claim. Also prove converse. Notation: A 0 (posdef) or A 0 (semidef) Amongst most important objects in convex optimization! Matrix and vector calculus f (x) f (x) T P r x a = i xi ai a T P T x Ax = ij xi aij xj (A + A )x log det(X ) X −1 P T Tr(XA) = ij xij aji A T P Tr(X A) = ij xij aij A Tr(X T AX ) (A + AT )X Easily derived using \brute-force" rules Wikipedia | Suvrit's ancient notes | Matrix cookbook | Convex Sets Convex sets Convex sets n Def. A set C R is called convex, if for any x; y C, the line-segment θx⊂+ (1 θ)y (here θ 0) also lies in C. 2 − ≥ Combinations Convex: θ1x + θ2y C, where θ1; θ2 0 and θ1 + θ2 = 1. I 2 ≥ I Linear: if restrictions on θ1; θ2 are dropped I Conic: if restriction θ1 + θ2 = 1 is dropped Convex sets Theorem (Intersection). Let C1, C2 be convex sets. Then, C1 C2 is also convex. \ Proof. If C1 C2 = , then true vacuously. \ ; Let x; y C1 C2. Then, x; y C1 and x; y C2. 2 \ 2 2 But C1, C2 are convex, hence θx + (1 θ)y C1, and also in C2. − 2 Thus, θx + (1 θ)y C1 C2. − 2 m\ Inductively follows that Ci is also convex. \i=1 Convex sets { more examples (psdcone image from convexoptimization.com, Dattorro) n Let x1; x2;:::; xm R . Their convex hull is ~ 2 nX X o co(x1;:::; xm) := θi xi θi 0; θi = 1 : i j ≥ i m×n m Let A R , and b R . The set x Ax = b is convex (it ~ is an affine2 space over2 subspace of solutionsf j of Axg= 0). halfspace x aT x b . ~ polyhedron xj Ax ≤ b; Cx = d . ~ f j ≤T g ellipsoid x (x x0) A(x x0) 1 ,(A: semidefinite) ~ j − − P ≤ probability simplex x x 0; xi = 1 ~ f j ≥ i g ◦ Quiz: Prove that these sets are convex. Convex functions Convex functions Def. Function f : I R on interval I called midpoint convex if ! f x+y f (x)+f (y) ; whenever x; y I : 2 ≤ 2 2 Read: f of AM is less than or equal to AM of f . n Def. A function f : R R is called convex if its domain dom(f ) ! is a convex set and for any x; y dom(f ) and θ 0 2 ≥ f (θx + (1 θ)y) θf (x) + (1 θ)f (y): − ≤ − Theorem (J.L.W.V. Jensen). Let f : I R be continuous. Then, f is convex if and only if it is midpoint convex.! n I Theorem extends to functions f : R R. Very useful to checking convexity of a given function.X ⊆ ! Convex functions ) f(y) λ)f(y λf(x) + (1 − f(x) x y f (λx + (1 λ)y) λf (x) + (1 λ)f (y) − ≤ − Convex functions f(x) y i ), x − f(y ) + h∇ f(y f(y) y x f (x) f (y) + f (y); x y ≥ hr − i Convex functions P R Q x z = λx + (1 λ)y y − slope PQ slope PR slope QR ≤ ≤ Recognizing convex functions If f is continuous and midpoint convex, then it is convex. ♠ If f is differentiable, then f is convex if and only if dom f is ♠ convex and f (x) f (y) + f (y); x y for all x; y dom f . ≥ hr − i 2 If f is twice differentiable, then f is convex if and only if dom f ♠ is convex and 2f (x) 0 at every x dom f . r 2 Convex functions I Linear: f (θ1x + θ2y) = θ1f (x) + θ2f (y); θ1; θ2 unrestricted I Concave: f (θx + (1 θ)y) θf (x) + (1 θ)f (y) − ≥ − I Strictly convex: If inequality is strict for x = y 6 Convex functions Example The pointwise maximum of a family of convex functions is convex. That is, if f (x; y) is a convex function of x for every y in some \index set" , then Y f (x) := max f (x; y) y2Y is a convex function of x (set is arbitrary). Y n m×n m Example Let f : R R be convex. Let A R , and b R . Prove that g(x) = f!(Ax + b) is convex. 2 2 Exercise: Verify truth of above examples. Convex functions Theorem Let be a nonempty convex set. Suppose L(x; y) is convex in (x; y), then,Y f (x) := inf L(x; y) y2Y is a convex function of x, provided f (x) > . −∞ Proof. Let u; v dom f . Since f (u) = infy L(u; y), for each > 0, there 2 is a y1 , s.t. f (u) + 2 is not the infimum. Thus, L(u; y1) f (u) + 2 . 2 Y ≤ Similarly, there is y2 , such that L(v; y2) f (v) + 2 . Now we prove that f2(λ Yu + (1 λ)v) λf (u≤) + (1 λ)f (v) directly. − ≤ − f (λu + (1 λ)v) = inf L(λu + (1 λ)v; y) − y2Y − L(λu + (1 λ)v; λy1 + (1 λ)y2) ≤ − − λL(u; y1) + (1 λ)L(v; y2) ≤ − λf (u) + (1 λ)f (v) + . ≤ − Since > 0 is arbitrary, claim follows. Example: Schur complement Let A; B; C be matrices such that C 0, and let AB Z := 0; BT C then the Schur complement A BC −1BT 0. Proof.L (x; y) = [x; y]T Z[x; y] is− convex in (x; y) since Z 0 T −1 T Observe that f (x) = infy L(x; y) = x (A BC B )x is convex. − (We skipped ahead and solved y L(x; y) = 0 to minimize). r Recognizing convex functions If f is continuous and midpoint convex, then it is convex. ♠ If f is differentiable, then f is convex if and only if dom f is ♠ convex and f (x) f (y) + f (y); x y for all x; y dom f . ≥ hr − i 2 If f is twice differentiable, then f is convex if and only if dom f ♠ is convex and 2f (x) 0 at every x dom f . r 2 By showing f to be a pointwise max of convex functions ♠ By showing f : dom(f ) R is convex if and only if its ♠ restriction to any line that! intersects dom(f ) is convex. That is, for any x dom(f ) and any v, the function g(t) = f (x + tv) is convex (on2 its domain t x + tv dom(f ) ). f j 2 g See exercises (Ch.

Convex Optimization (EE227BT: UC Berkeley)

Details

Download

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

Support