Convex Analysis
Taught by Professor Adrian Lewis
Notes by Mateo Díaz ([email protected]), Cornell University, Spring 2018.
Last updated May 18, 2018. The latest version is online here. Notes template by David Mehrle.

Contents

1 Convex Sets
2 Convex functions
  2.1 Convexity and calculus
    2.1.1 Gradients
    2.1.2 Recognizing smooth convexity
  2.2 Continuity of convex functions
  2.3 Subgradients
3 Duality
  3.1 Fenchel Duality
  3.2 Conic programs
  3.3 Convex calculus
  3.4 Lagrangian duality
4 Algorithms
  4.1 Augmented Lagrangian Method
  4.2 Proximal Point method and fixed points
  4.3 Smooth Minimization
  4.4 Splitting Problems and Alternating projections
  4.5 Proximal Gradient
  4.6 Douglas-Rachford
    4.6.1 Consensus optimization
  4.7 ADMM: Alternating Directions Method of Multipliers
  4.8 Splitting
  4.9 Projected subgradient method
  4.10 Accelerated Proximal Gradient
5 Variational analysis
  5.1 Nonconvex calculus
  5.2 Subgradients and nonconvexity
  5.3 Inverse problems
  5.4 Linearizing sets
  5.5 Optimality conditions

Contents by Lecture

Lecture 01 on 30 January 2018
Lecture 02 on February 1 2018
Lecture 03 on February 6 2018
Lecture 04 on February 8 2018
Lecture 05 on February 13 2018
Lecture 06 on February 15 2018
Lecture 07 on February 22 2018
Lecture 08 on February 27 2018
Lecture 09 on March 1 2018
Lecture 10 on March 6 2018
Lecture 11 on March 8 2018
Lecture 12 on March 13 2018
Lecture 13 on March 15 2018
Lecture 14 on March 20 2018
Lecture 15 on March 22 2018
Lecture 16 on April 10 2018
Lecture 17 on April 12 2018
Lecture 18 on April 17 2018
Lecture 19 on April 19 2018
Lecture 20 on April 24 2018
Lecture 21 on April 26 2018
Lecture 22 on May 1 2018
Lecture 23 on May 3 2018
Lecture 24 on May 8 2018

Disclaimer

These notes are not official. I do not accept any responsibility or liability for the accuracy, content, completeness, correctness, or reliability of the information contained in these notes. Please feel free to email me if you find any mistakes or typos.

1 Convex Sets

We will mainly consider a real Euclidean space $\mathbf{E}$ (a finite-dimensional Hilbert space). We denote the inner product by $\langle \cdot, \cdot \rangle$.

Example 1. A few examples:
• The standard space: $\mathbf{R}^n$ with $\langle x, y \rangle = x^\top y$.
• Symmetric matrices: $\mathbf{S}^n := \{\text{symmetric } n \times n \text{ matrices}\}$ with $\langle X, Y \rangle = \operatorname{tr}(X^\top Y)$.

Definition 1. A set $C \subseteq \mathbf{E}$ is convex if for all $x, y \in C$ and $\lambda \in [0, 1]$ we have $\lambda x + (1 - \lambda)y \in C$.

[Figure 1: (a) a nonconvex set; (b) a convex set.]

Example 2. Here are two simple examples:
• Closed halfspaces: $H := \{x \mid \langle a, x \rangle \le \beta\}$ for some fixed $\beta \in \mathbf{R}$ and a nonzero $a \in \mathbf{E}^*$.
• The unit ball: defining $\|x\|^2 = \langle x, x \rangle$, take $B = \{x \mid \|x\| \le 1\}$.

Proposition 1. Arbitrary intersections of convex sets are convex.

Proof. Exercise.

Example 3. Consider:
• Polyhedra: intersections of finitely many halfspaces.
• PSD matrices: $\mathbf{S}^n_+ = \{X \in \mathbf{S}^n \mid y^\top X y \ge 0 \text{ for all } y \in \mathbf{R}^n\}$. This set is convex because
$$0 \le y^\top X y = \operatorname{tr}(y^\top X y) = \operatorname{tr}(X y y^\top) = \langle X, y y^\top \rangle,$$
so each constraint defines a halfspace, and thus $\mathbf{S}^n_+$ is an (infinite) intersection of halfspaces.

Proposition 2. Closures of convex sets are convex.

Proof. Exercise.

But what about interiors?

Lemma 1 (Accessibility). For a convex set $S$, if $x \in \operatorname{int} S$ and $y \in \operatorname{cl} S$, then $\lambda x + (1 - \lambda)y \in \operatorname{int} S$ for all $\lambda \in (0, 1]$.

Proof. Assume first that $y \in S$. Then there exists $\delta > 0$ such that $x + \delta B \subseteq S$. By convexity, for all $\lambda \in (0, 1]$,
$$\lambda x + (1 - \lambda)y + \lambda \delta B = \lambda(x + \delta B) + (1 - \lambda)y \subseteq S,$$
which implies that $\lambda x + (1 - \lambda)y \in \operatorname{int} S$.
Now suppose that $y \in \operatorname{cl} S$. Then there exists a sequence $y_k \in S$ with $y_k \to y$. Write
$$\lambda x + (1 - \lambda)y = \lambda x + (1 - \lambda)y_k + (1 - \lambda)(y - y_k) = \lambda \underbrace{\left(x + \tfrac{1 - \lambda}{\lambda}(y - y_k)\right)}_{z_k} + (1 - \lambda)y_k.$$
Notice that when $k$ is big enough, $z_k$ lies in the interior of $S$. Thus by the first argument this convex combination lies in the interior of $S$.

Corollary 1. The interior of a convex set is convex, and assuming $\operatorname{int} S \ne \emptyset$, we have $\operatorname{cl} \operatorname{int} S = \operatorname{cl} S$.

Question 1. For convex $S$, is $\operatorname{int} \operatorname{cl} S = \operatorname{int} S$?

Theorem 1 (Best approximation). Any nonempty closed convex set $S$ has a unique shortest vector $\bar{x}$, i.e. $\|\bar{x}\| \le \|x\|$ for all $x \in S$. Moreover, $\bar{x}$ is characterized by the property
$$\langle \bar{x}, x - \bar{x} \rangle \ge 0 \quad \text{for all } x \in S. \qquad (1.1)$$

Proof. Existence. Choose any $\hat{x} \in S$; the set $S_1 = S \cap \|\hat{x}\| B$ is nonempty and compact. Thus the continuous function $\|\cdot\|$ has a minimizer on this set; call it $\bar{x}$. For any $x \in S$, either $\|x\| > \|\hat{x}\| \ge \|\bar{x}\|$, or $\|x\| \le \|\hat{x}\|$, in which case $x \in S_1$ and thus $\|x\| \ge \|\bar{x}\|$.
Characterization. For $x \in S$ and $t \in [0, 1]$ we have $\bar{x} + t(x - \bar{x}) \in S$, so $\|\bar{x}\|^2 \le \|\bar{x} + t(x - \bar{x})\|^2$. Expanding and dividing by $t > 0$,
$$0 \le t\|x - \bar{x}\|^2 + 2\langle \bar{x}, x - \bar{x} \rangle.$$
Letting $t \downarrow 0$ gives (1.1).
Uniqueness. Suppose there exists another point $x' \in S$ which also satisfies (1.1). Then
$$\langle \bar{x}, x' - \bar{x} \rangle \ge 0 \quad \text{and} \quad \langle x', \bar{x} - x' \rangle \ge 0.$$
Adding the two inequalities gives $\|\bar{x} - x'\|^2 \le 0$, so $\bar{x} = x'$.
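For simple sets the shortest vector has a closed form, which makes Theorem 1 easy to test numerically. The following is a minimal sketch, not from the notes; the particular halfspace, sample count, and tolerances are hypothetical choices. It computes the shortest vector of a closed halfspace and spot-checks the characterization (1.1) on random feasible points.

```python
import numpy as np

# Shortest vector of the closed convex halfspace S = {x : <a, x> <= beta}.
# With beta < 0 the origin is infeasible, and the shortest vector is the
# projection of 0 onto S, namely xbar = (beta / ||a||^2) * a.
rng = np.random.default_rng(0)
a = np.array([3.0, -1.0, 2.0])   # hypothetical halfspace normal
beta = -2.0                      # hypothetical offset (negative, so 0 is not in S)

xbar = (beta / a.dot(a)) * a     # closed-form shortest vector
assert a.dot(xbar) <= beta + 1e-12   # xbar is feasible

# Spot-check (1.1): <xbar, x - xbar> >= 0 for sampled points x in S.
checked = 0
while checked < 1000:
    x = 10 * rng.normal(size=3)
    if a.dot(x) <= beta:             # keep only points of S
        assert xbar.dot(x - xbar) >= -1e-9
        checked += 1
print("characterization (1.1) verified on", checked, "random points of S")
```

The same check applies to any set whose projection is computable in closed form, e.g. balls and boxes.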
Theorem 2 (Basic separation). If $S$ is closed and convex and $y \notin S$, then there exists a halfspace $H$ such that $S \subseteq H$ and $y \notin H$.

Proof. Without loss of generality take $y = 0$ and use the previous result: the shortest vector $\bar{x}$ of $S$ is nonzero, and (1.1) gives $\langle \bar{x}, x \rangle \ge \|\bar{x}\|^2 > 0$ for all $x \in S$, so the halfspace $H = \{x \mid \langle \bar{x}, x \rangle \ge \|\bar{x}\|^2\}$ contains $S$ but not $0$.

Theorem 3 (Supporting hyperplane (Hahn-Banach)). Suppose that $S$ is convex with $\operatorname{int} S \ne \emptyset$, and let $y \notin \operatorname{int} S$. Then there exists a halfspace $H$ containing $S$ with $y \in \operatorname{bd} H$.

Proof. Without loss of generality assume that $0 \in \operatorname{int} S$. For each $k \in \mathbf{N}_+$, define the vector
$$z_k = \left(1 + \frac{1}{k}\right) y.$$
By the Accessibility Lemma, $z_k \notin \operatorname{cl} S$, so by basic separation there is a separating hyperplane given by some $a_k \ne 0$ with
$$\langle a_k, z_k \rangle \ge \langle a_k, x \rangle \quad \text{for all } x \in S.$$
Without loss of generality assume $\|a_k\| = 1$; then by compactness some subsequence converges to a vector $a$. Since $z_k \to y$,
$$\langle a, y \rangle \ge \langle a, x \rangle \quad \text{for all } x \in S.$$

2 Convex functions

There is a very natural way to extend the results of the previous class from sets to functions. The trick is to allow functions to take infinite values, namely $\pm\infty$. We will think about functions of the form $f : \mathbf{E} \to \overline{\mathbf{R}} := [-\infty, \infty]$.

Definition 2. Let $f : \mathbf{E} \to \overline{\mathbf{R}}$ be a function. Its epigraph is defined as
$$\operatorname{epi} f := \{(x, r) \in \mathbf{E} \times \mathbf{R} \mid f(x) \le r\}.$$

Definition 3. We say that a function $f$ is convex if $\operatorname{epi} f$ is convex.

[Figure 2: Epigraph of the function $f(x) = (x^2 - 1)^2$.]

In the other direction, we can define:

Definition 4. Given a set $C \subseteq \mathbf{E}$, define the indicator function as
$$\delta_C(x) := \begin{cases} 0 & \text{if } x \in C, \\ \infty & \text{otherwise.} \end{cases}$$
Then it is possible to show that $C \subseteq \mathbf{E}$ is convex if, and only if, $\delta_C$ is convex.

Definition 5. We say that a function $f : S \to \mathbf{R}$ is convex if $f + \delta_S$ is convex. Equivalently, $S$ is convex and
$$f(\lambda x + (1 - \lambda)y) \le \lambda f(x) + (1 - \lambda)f(y) \quad \text{for all } x, y \in S,\ \lambda \in [0, 1]. \qquad (2.1)$$

Definition 6. We say that a function $f : S \to \mathbf{R}$ is strictly convex if inequality (2.1) holds strictly whenever $x \ne y$ and $\lambda \in (0, 1)$.

Example 4. Norms $\|\cdot\|_1, \|\cdot\|_2, \|\cdot\|_\infty, \dots$ are convex (by the triangle inequality and homogeneity), though not strictly convex: every norm is linear along rays through the origin.

Definition 7. We say that $\bar{x} \in \mathbf{E}$ is a minimizer of $f : \mathbf{E} \to \overline{\mathbf{R}}$ if $f(x) \ge f(\bar{x})$ for all $x \in \mathbf{E}$.

Proposition 3. Minimizers of strictly convex functions are unique.

Proof. Easy to check.

Definition 8. We say that $\bar{x} \in \mathbf{E}$ is a local minimizer of $f : \mathbf{E} \to \overline{\mathbf{R}}$ if there exists a neighborhood $U \subseteq \mathbf{E}$ of $\bar{x}$ such that $f(x) \ge f(\bar{x})$ for all $x \in U$.

Proposition 4. For convex $f$, local minimizers are minimizers.

Proof. First, we need an easy one-dimensional result for convex functions.

Claim 1. Suppose that $g : \mathbf{R}_+ \to \mathbf{R}$ is convex with $g(0) = 0$. Then $t \mapsto \frac{g(t)}{t}$ is nondecreasing.

Proof. Exercise.

Suppose that $\bar{x}$ is a local minimizer. Given any $x \in \mathbf{E}$, we would like to show that $f(x) \ge f(\bar{x})$. Define $g(t) = f(\bar{x} + t(x - \bar{x})) - f(\bar{x})$; this function is convex because $f$ is, and $g(0) = 0$. Since $\bar{x}$ is a local minimizer, for small enough $t > 0$,
$$0 \le \frac{g(t)}{t} \le \frac{g(1)}{1} = f(x) - f(\bar{x}),$$
where the second inequality follows from the claim.

2.1 Convexity and calculus

Convexity is a one-dimensional idea, so if we understand how it behaves on the reals then we are in good shape for everything else.
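As a concrete illustration of this reduction to one dimension, the sketch below (a hypothetical check, not part of the notes) samples inequality (2.1) for the log-sum-exp function $f(x) = \log \sum_i e^{x_i}$, a standard convex function; checking (2.1) on the segment between two points is exactly checking convexity of the one-dimensional restriction $t \mapsto f(x + t(y - x))$.

```python
import numpy as np

def f(x):
    """Log-sum-exp, a standard convex function on R^n."""
    return np.log(np.sum(np.exp(x)))

# Sample inequality (2.1) along random segments; lam*x + (1-lam)*y is the
# one-dimensional restriction t -> f(x + t*(y - x)) evaluated at t = 1 - lam.
rng = np.random.default_rng(1)
for _ in range(1000):
    x, y = rng.normal(size=4), rng.normal(size=4)
    lam = rng.uniform()
    lhs = f(lam * x + (1 - lam) * y)
    rhs = lam * f(x) + (1 - lam) * f(y)
    assert lhs <= rhs + 1e-9   # small tolerance for floating-point error
print("inequality (2.1) holds on all sampled segments")
```

A failed assertion would certify nonconvexity; passing samples merely fail to falsify (2.1), which is one reason the calculus criteria developed in this section are useful.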