Introduction to Convexity

Amitabh Basu

Compiled on Monday 9th December, 2019 at 16:35

Contents

1 Definitions and Preliminaries
    Basic real analysis and topology
    Basic facts about matrices

2 Convex Sets
  2.1 Definitions and basic properties
  2.2 Convex cones, affine sets and dimension
  2.3 Representations of convex sets
    2.3.1 Extrinsic description: separating hyperplanes
      How to represent general convex sets: Separation oracles
      Farkas' lemma: A glimpse into polyhedral theory
      Duality/Polarity
    2.3.2 Intrinsic description: faces, extreme points, recession cone, lineality space
    2.3.3 A remark about extrinsic and intrinsic descriptions
  2.4 Combinatorial theorems: Helly-Radon-Carathéodory
    An application to learning theory: VC-dimension of halfspaces
    Application to centerpoints
  2.5 Polyhedra
    2.5.1 The Minkowski-Weyl Theorem
    2.5.2 Valid inequalities and feasibility
    2.5.3 Faces of polyhedra
    2.5.4 Implicit equalities, dimension of polyhedra and facets

3 Convex Functions
  3.1 General properties, epigraphs, subgradients
  3.2 Continuity properties
  3.3 First-order derivative properties
  3.4 Second-order derivative properties
  3.5 Sublinear functions, support functions and gauges
    Gauges
    Support functions
    Generalized Cauchy-Schwarz/Hölder's inequality
    One-to-one correspondence between closed, convex sets and closed, sublinear functions
  3.6 Directional derivatives, subgradients and subdifferential calculus

4 Optimization
    Algorithmic setup: First-order oracles
  4.1 Subgradient algorithm
    Subgradient Algorithm
  4.2 Generalized inequalities and convex mappings
  4.3 Convex optimization with generalized inequalities
    4.3.1 Lagrangian duality for convex optimization with generalized constraints
    4.3.2 Solving the Lagrangian dual problem
    4.3.3 Explicit examples of the Lagrangian dual
      Conic optimization
      Convex optimization with explicit constraints and objective
      A closer look at linear programming duality
    4.3.4 Strong duality: sufficient conditions and complementary slackness
      Slater's condition for strong duality
      Closed cone condition for strong duality in conic optimization

      Complementary slackness
    4.3.5 Saddle point interpretation of the Lagrangian dual
  4.4 Cutting plane schemes
    General cutting plane scheme
    Center of Gravity Method
    Ellipsoid method

1 Definitions and Preliminaries

We will focus on $\mathbb{R}^d$ for arbitrary $d \in \mathbb{N}$: $x = (x_1, \ldots, x_d) \in \mathbb{R}^d$. We will use the notation $\mathbb{R}^d_+$ to denote the set of all vectors with nonnegative coordinates. We will also use $e^i$, $i = 1, \ldots, d$ to denote the $i$-th unit vector, i.e., the vector which has 1 in the $i$-th coordinate and 0 in every other coordinate.

Definition 1.1. A norm on $\mathbb{R}^d$ is a function $N : \mathbb{R}^d \to \mathbb{R}_+$ satisfying:

1. $N(x) = 0$ if and only if $x = 0$,

2. $N(\alpha x) = |\alpha| N(x)$ for all $\alpha \in \mathbb{R}$ and $x \in \mathbb{R}^d$,

3. $N(x + y) \le N(x) + N(y)$ for all $x, y \in \mathbb{R}^d$. (Triangle inequality)

Example 1.2. For any $p \ge 1$, define the $\ell^p$ norm on $\mathbb{R}^d$: $\|x\|_p = (|x_1|^p + |x_2|^p + \ldots + |x_d|^p)^{1/p}$. $p = 2$ is also called the standard Euclidean norm; we will drop the subscript 2 to denote the standard norm: $\|x\| = \sqrt{x_1^2 + x_2^2 + \ldots + x_d^2}$. The $\ell^\infty$ norm is defined as $\|x\|_\infty = \max_{i=1}^{d} |x_i|$.

Definition 1.3. Any norm on $\mathbb{R}^d$ defines a distance between points $x, y \in \mathbb{R}^d$ as $d_N(x, y) := N(x - y)$. This is called the metric or distance induced by the norm. Such a metric satisfies three important properties:

1. $d_N(x, y) = 0$ if and only if $x = y$,

2. $d_N(x, y) = d_N(y, x)$ for all $x, y \in \mathbb{R}^d$,

3. $d_N(x, z) \le d_N(x, y) + d_N(y, z)$ for all $x, y, z \in \mathbb{R}^d$. (Triangle inequality)

Definition 1.4. We also utilize the (standard) inner product of $x, y \in \mathbb{R}^d$: $\langle x, y \rangle = x_1 y_1 + x_2 y_2 + \ldots + x_d y_d$. (Note that $\|x\|_2^2 = \langle x, x \rangle$.) We say $x$ and $y$ are orthogonal if $\langle x, y \rangle = 0$.

Definition 1.5. For any norm $N$ and $x \in \mathbb{R}^d$, $r \in \mathbb{R}_+$, we will call the set $B_N(x, r) := \{y \in \mathbb{R}^d : N(y - x) \le r\}$ the ball around $x$ of radius $r$. $B_N(0, 1)$ will be called the unit ball for the norm $N$. We will drop the subscript $N$ when we speak of the standard Euclidean norm and there is no chance of confusion in the context.

A set $X \subseteq \mathbb{R}^d$ is said to be bounded if there exists $R \in \mathbb{R}$ such that $X \subseteq B_N(0, R)$.

Definition 1.6. Given any set $X \subseteq \mathbb{R}^d$ and a scalar $\alpha \in \mathbb{R}$,
$$\alpha X := \{\alpha x : x \in X\}.$$
Given any two sets $X, Y \subseteq \mathbb{R}^d$, we define the Minkowski sum of $X, Y$ as
$$X + Y := \{x + y : x \in X, y \in Y\}.$$
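The norms and set operations above are easy to experiment with numerically. The following short sketch (our illustration, not part of the notes) uses NumPy to evaluate a few $\ell^p$ norms, check the triangle inequality on one instance, and form the Minkowski sum of two finite sets.

```python
# Illustrative sketch (not from the notes): l_p norms and a Minkowski sum.
import numpy as np

x = np.array([3.0, -4.0])
y = np.array([1.0, 2.0])

for p in [1, 2, np.inf]:
    # np.linalg.norm implements the l_p norms of Example 1.2.
    print("l_%s norm of x:" % p, np.linalg.norm(x, ord=p))

# Triangle inequality N(x + y) <= N(x) + N(y) for the Euclidean norm:
assert np.linalg.norm(x + y) <= np.linalg.norm(x) + np.linalg.norm(y)

# Minkowski sum of two finite sets X + Y = {x + y : x in X, y in Y}:
X = [np.array([0.0, 0.0]), np.array([1.0, 0.0])]
Y = [np.array([0.0, 0.0]), np.array([0.0, 1.0])]
mink_sum = [u + v for u in X for v in Y]
print(mink_sum)  # the four vertices of the unit square
```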

Basic real analysis and topology. For any subset of real numbers $S \subseteq \mathbb{R}$, we denote the infimum of $S$ by $\inf S$ and the supremum by $\sup S$.

Fix a norm $N$ on $\mathbb{R}^d$. A set $X \subseteq \mathbb{R}^d$ is called open if for every $x \in X$, there exists $r \in \mathbb{R}_+$ such that $B_N(x, r) \subseteq X$. A set $X$ is closed if its complement $\mathbb{R}^d \setminus X$ is open.

Theorem 1.7. 1. $\emptyset$, $\mathbb{R}^d$ are both open and closed.

2. An arbitrary union of open sets is open. An arbitrary intersection of closed sets is closed.

3. A finite intersection of open sets is open. A finite union of closed sets is closed.

A sequence in $\mathbb{R}^d$ is a countable ordered set of points $x^1, x^2, x^3, \ldots$ and will often be denoted by $\{x^i\}_{i \in \mathbb{N}}$. We say that the sequence converges, or that the limit of the sequence exists, if there exists a point $x$ such that for every $\epsilon > 0$, there exists $M \in \mathbb{N}$ such that $N(x - x^n) \le \epsilon$ for all $n \ge M$. $x$ is called the limit point, or simply the limit, of the sequence and will also sometimes be denoted by $\lim_{n \to \infty} x^n$. Although the definition of the limit is made here with respect to a particular norm, it is a well-known fact that the concept actually does not depend on the choice of the norm.

Theorem 1.8. A set $X$ is closed if and only if for every convergent sequence in $X$, the limit of the sequence is also in $X$.

We introduce three important notions:

1. For any set $X \subseteq \mathbb{R}^d$, the closure of $X$ is the smallest closed set containing $X$ and will be denoted by $\mathrm{cl}(X)$.

2. For any set $X \subseteq \mathbb{R}^d$, the interior of $X$ is the largest open set contained inside $X$ and will be denoted by $\mathrm{int}(X)$.

3. For any set $X \subseteq \mathbb{R}^d$, the boundary of $X$ is defined as $\mathrm{bd}(X) := \mathrm{cl}(X) \setminus \mathrm{int}(X)$.

Definition 1.9. A set in $\mathbb{R}^d$ that is closed and bounded is called compact.

Theorem 1.10. Let $C \subseteq \mathbb{R}^d$ be a compact set. Then every sequence $\{x^i\}_{i \in \mathbb{N}}$ contained in $C$ (not necessarily convergent) has a convergent subsequence.

A function $f : \mathbb{R}^d \to \mathbb{R}^n$ is continuous if for every convergent sequence $\{x^i\}_{i \in \mathbb{N}} \subseteq \mathbb{R}^d$, the following holds: $\lim_{i \to \infty} f(x^i) = f(\lim_{i \to \infty} x^i)$.

Theorem 1.11 (Weierstrass' Theorem). Let $f : \mathbb{R}^d \to \mathbb{R}$ be a continuous function. Let $X \subseteq \mathbb{R}^d$ be a nonempty, compact subset. Then $\inf\{f(x) : x \in X\}$ is attained, i.e., there exists $x^{\min} \in X$ such that $f(x^{\min}) = \inf\{f(x) : x \in X\}$. Similarly, there exists $x^{\max} \in X$ such that $f(x^{\max}) = \sup\{f(x) : x \in X\}$.

A generalization of the above theorem is the following.

Theorem 1.12. Let $f : \mathbb{R}^d \to \mathbb{R}^n$ be a continuous function, and $C$ be a compact set. Then $f(C)$ is compact.

We will also need to speak of differentiability of functions $f : \mathbb{R}^d \to \mathbb{R}^n$.

Definition 1.13. We say that $f : \mathbb{R}^d \to \mathbb{R}^n$ is differentiable at $x \in \mathbb{R}^d$ if there exists a linear transformation $A : \mathbb{R}^d \to \mathbb{R}^n$ such that
$$\lim_{h \to 0} \frac{\|f(x + h) - f(x) - Ah\|}{\|h\|} = 0.$$
If $f$ is differentiable at $x$, then the linear transformation $A$ is unique. It is commonly called the differential or total derivative of $f$ and is denoted by $f'(x)$. When $n = 1$, it is commonly called the gradient of $f$ and is denoted by $\nabla f(x)$.

Definition 1.14. The partial derivative of $f : \mathbb{R}^d \to \mathbb{R}$ at $x$ in the $i$-th direction is defined as
$$f_i'(x) := \lim_{h \to 0} \frac{f(x + h e^i) - f(x)}{h},$$

if the limit exists.
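Definition 1.14 suggests a direct numerical approximation: fix a small $h$ and evaluate the difference quotient. The sketch below is our illustration, not part of the notes; the choice $h = 10^{-6}$ is an arbitrary assumption.

```python
# Illustrative sketch: approximating the partial derivative of Definition 1.14
# by a finite difference with a small step h.
import numpy as np

def partial_derivative(f, x, i, h=1e-6):
    """Approximate (f(x + h*e_i) - f(x)) / h for small h."""
    e = np.zeros_like(x)
    e[i] = 1.0
    return (f(x + h * e) - f(x)) / h

f = lambda x: x[0] ** 2 + 3.0 * x[0] * x[1]   # f(x) = x1^2 + 3 x1 x2
x = np.array([1.0, 2.0])
print(partial_derivative(f, x, 0))  # close to 2*x1 + 3*x2 = 8
print(partial_derivative(f, x, 1))  # close to 3*x1 = 3
```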

Basic facts about matrices. The set of $m \times n$ matrices will be denoted by $\mathbb{R}^{m \times n}$. The rank of a matrix $A$ will be denoted by $\mathrm{rk}(A)$; it is the maximum number of linearly independent rows of $A$, which is equal to the maximum number of linearly independent columns of $A$. When $m = n$, we say that the matrix is square.

Definition 1.15. A square matrix $A \in \mathbb{R}^{n \times n}$ is called symmetric if $A_{ij} = A_{ji}$ for all $i, j \in \{1, \ldots, n\}$.

Definition 1.16. Let $A \in \mathbb{R}^{n \times n}$. A vector $v \in \mathbb{R}^n$ is called an eigenvector of $A$ if there exists $\lambda \in \mathbb{R}$ such that $Av = \lambda v$; $\lambda$ is called the eigenvalue of $A$ associated with $v$.

Theorem 1.17. If $A \in \mathbb{R}^{n \times n}$ is symmetric, then it has $n$ orthogonal eigenvectors $v^1, \ldots, v^n$, all of unit Euclidean norm, with associated eigenvalues $\lambda_1, \ldots, \lambda_n \in \mathbb{R}$. Moreover, if $S$ is the matrix whose columns are $v^1, \ldots, v^n$ and $\Lambda$ is the diagonal matrix with $\lambda_1, \ldots, \lambda_n$ as the diagonal entries, then $A = S \Lambda S^T$. Moreover, $\mathrm{rk}(A)$ equals the number of nonzero eigenvalues.

Theorem 1.18. Let $A \in \mathbb{R}^{n \times n}$ be a symmetric matrix of rank $r$. The following are equivalent.

1. All eigenvalues of $A$ are nonnegative.

2. There exists a matrix $B \in \mathbb{R}^{r \times n}$ with linearly independent rows such that $A = B^T B$.

3. $u^T A u \ge 0$ for all $u \in \mathbb{R}^n$.

Definition 1.19. A symmetric matrix $A \in \mathbb{R}^{n \times n}$ satisfying any of the three conditions in Theorem 1.18 is called a positive semidefinite (PSD) matrix. If $\mathrm{rk}(A) = n$, i.e., all its eigenvalues are strictly positive, then $A$ is called positive definite.
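The three conditions of Theorem 1.18 are easy to check numerically. The following sketch (our illustration, not from the notes) builds a PSD matrix as $B^T B$ and verifies the eigenvalue and quadratic-form conditions with NumPy.

```python
# Illustrative sketch: checking the equivalent PSD conditions of Theorem 1.18.
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((2, 3))    # 2 x 3 with (generically) independent rows
A = B.T @ B                        # condition 2: A = B^T B, PSD of rank 2

# Condition 1: all eigenvalues nonnegative (eigvalsh is for symmetric matrices).
eigvals = np.linalg.eigvalsh(A)
assert np.all(eigvals >= -1e-10)

# Condition 3: u^T A u >= 0 for (sampled) u.
for _ in range(1000):
    u = rng.standard_normal(3)
    assert u @ A @ u >= -1e-10

print("rank:", np.linalg.matrix_rank(A), "eigenvalues:", eigvals)
```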

Exercise 1. Show that any positive definite matrix $A \in \mathbb{R}^{d \times d}$ defines a norm on $\mathbb{R}^d$ via $N_A(x) = \sqrt{x^T A x}$. This norm is called the norm induced by $A$.

2 Convex Sets

2.1 Definitions and basic properties

A set $X \subseteq \mathbb{R}^d$ is called a convex set if for all $x, y \in X$, the line segment $[x, y]$ lies entirely in $X$. More precisely, for all $x, y \in X$ and every $\lambda \in [0, 1]$, $\lambda x + (1 - \lambda) y \in X$.

Example 2.1. Some examples of convex sets:

1. In $\mathbb{R}$, the only examples of convex sets are intervals (closed, open, half open): $(a, b)$, $(a, b]$, $[a, b]$, $(-\infty, b]$, etc.

2. Let $a \in \mathbb{R}^d$ and $\delta \in \mathbb{R}$. The sets $H(a, \delta) = \{x \in \mathbb{R}^d : \langle a, x \rangle = \delta\}$, $H^+(a, \delta) = \{x \in \mathbb{R}^d : \langle a, x \rangle \ge \delta\}$ and $H^-(a, \delta) = \{x \in \mathbb{R}^d : \langle a, x \rangle \le \delta\}$ are all convex sets. Sets of the form $H(a, \delta)$ are called hyperplanes and sets of the form $H^+(a, \delta), H^-(a, \delta)$ are called halfspaces.

3. $\{x \in \mathbb{R}^d : \|x\|_\infty \le 1\}$ is a convex set.

4. $\{x = (x_1, \ldots, x_d) \in \mathbb{R}^d : x_1 + x_2 t + x_3 t^2 + \ldots + x_d t^{d-1} \ge 0 \text{ for all } t \ge 0\}$ is a convex set.

5. $\{(x, y) \in \mathbb{R}^2 : x^2 + y^2 \le 5\}$ is convex. More generally, the ball $\{x \in \mathbb{R}^d : \|x\| \le C\}$ for any $C \ge 0$ is convex.

Exercise 2. Show that if $N : \mathbb{R}^d \to \mathbb{R}$ is a norm, then every ball $B_N(x, R)$ with respect to $N$ is convex.

Definition 2.2. Let $A \in \mathbb{R}^{d \times d}$ be a positive definite matrix. The set $\{x \in \mathbb{R}^d : x^T A x \le 1\}$ is called an ellipsoid. In other words, an ellipsoid is the unit ball associated with the norm induced by $A$; see Exercise 1. Exercise 2 shows that ellipsoids are convex.

Theorem 2.3 (Operations that preserve convexity). The following are all true.

1. Let $X_i$, $i \in I$ be an arbitrary family of convex sets. Then $\bigcap_{i \in I} X_i$ is a convex set.

2. Let $X$ be a convex set and $\alpha \in \mathbb{R}$; then $\alpha X$ is a convex set.

3. Let $X, Y$ be convex sets; then $X + Y$ is convex.

4. Let $T : \mathbb{R}^d \to \mathbb{R}^m$ be any affine transformation, i.e., $T(x) = Ax + b$ for some matrix $A \in \mathbb{R}^{m \times d}$ and vector $b \in \mathbb{R}^m$. If $X \subseteq \mathbb{R}^d$ is convex, then $T(X)$ is a convex set. If $Y \subseteq \mathbb{R}^m$ is convex, then $T^{-1}(Y)$ is convex.

Proof. 1. Let $x, y \in \bigcap_{i \in I} X_i$. This implies that $x, y \in X_i$ for every $i \in I$. Since each $X_i$ is convex, for every $\lambda \in [0, 1]$, $\lambda x + (1 - \lambda) y \in X_i$ for all $i \in I$. Therefore, $\lambda x + (1 - \lambda) y \in \bigcap_{i \in I} X_i$.

The proofs of 2., 3. and 4. are very similar and are left for the reader.

Remark 2.4. Observe that item 4. in Example 2.1 can be interpreted as an (uncountable) intersection of halfspaces, one for each $t \ge 0$. Thus, item 2. from that example and Theorem 2.3 together give another proof that item 4. describes a convex set.

Definition 2.5. Let $Y = \{y^1, \ldots, y^n\} \subset \mathbb{R}^d$ be a finite set of points. The set of all convex combinations of $Y$ is defined as
$$\{\lambda_1 y^1 + \lambda_2 y^2 + \ldots + \lambda_n y^n : \lambda_i \ge 0,\ \lambda_1 + \lambda_2 + \ldots + \lambda_n = 1\}.$$

Proposition 2.6. If $X$ is convex and $y^1, \ldots, y^n \in X$, then every convex combination of $y^1, \ldots, y^n$ is in $X$.

Proof. We prove it by induction on $n$. If $n = 1$, then the conclusion is trivial. Else consider any $\lambda_1, \ldots, \lambda_n \ge 0$ such that $\lambda_1 + \ldots + \lambda_n = 1$; we may assume $\lambda_n < 1$, since otherwise the combination is just $y^n$. Then

$$\lambda_1 y^1 + \lambda_2 y^2 + \ldots + \lambda_n y^n = (\lambda_1 + \ldots + \lambda_{n-1})\Big(\tfrac{\lambda_1}{\lambda_1 + \ldots + \lambda_{n-1}} y^1 + \tfrac{\lambda_2}{\lambda_1 + \ldots + \lambda_{n-1}} y^2 + \ldots + \tfrac{\lambda_{n-1}}{\lambda_1 + \ldots + \lambda_{n-1}} y^{n-1}\Big) + \lambda_n y^n = (1 - \lambda_n)\tilde{y} + \lambda_n y^n,$$

where $\tilde{y} := \tfrac{\lambda_1}{\lambda_1 + \ldots + \lambda_{n-1}} y^1 + \tfrac{\lambda_2}{\lambda_1 + \ldots + \lambda_{n-1}} y^2 + \ldots + \tfrac{\lambda_{n-1}}{\lambda_1 + \ldots + \lambda_{n-1}} y^{n-1}$ belongs to $X$ by the induction hypothesis. The rest follows from the definition of convexity.

Definition 2.7. Given any set $X \subseteq \mathbb{R}^d$ (not necessarily convex), the convex hull of $X$, denoted by $\mathrm{conv}(X)$, is a convex set $C$ such that $X \subseteq C$ and for any other convex set $C'$, $X \subseteq C' \Rightarrow C \subseteq C'$, i.e., the convex hull of $X$ is the smallest (with respect to set inclusion) convex set containing $X$.

Theorem 2.8. For any set $X \subseteq \mathbb{R}^d$ (not necessarily convex),
$$X \subseteq \mathrm{conv}(X) = \bigcap\,(C : X \subseteq C,\ C \text{ convex}) = \Big\{\lambda_1 x_1 + \ldots + \lambda_t x_t : x_1, \ldots, x_t \in X,\ \lambda_1, \ldots, \lambda_t \ge 0,\ \sum_{i=1}^{t} \lambda_i = 1\Big\}.$$

In other words, the convex hull of $X$ is the union of the set of convex combinations of all possible finite subsets of $X$.

Proof. Let $\hat{C} = \bigcap\,(C : X \subseteq C,\ C \text{ convex})$, which is a convex set by Theorem 2.3, and by definition $X \subseteq \hat{C}$. Consider any other convex set $C'$ such that $X \subseteq C'$. Then $C'$ appears in the intersection, and thus $\hat{C} \subseteq C'$. Thus, $\hat{C} = \mathrm{conv}(X)$.

Next, let $\tilde{C} = \{\lambda_1 x_1 + \ldots + \lambda_t x_t : x_1, \ldots, x_t \in X,\ \lambda_1, \ldots, \lambda_t \ge 0,\ \sum_{i=1}^{t} \lambda_i = 1\}$. Then,

1. $\tilde{C}$ is convex. Consider two points $z_1, z_2 \in \tilde{C}$. Thus there exist two finite index sets $I_1, I_2$, two finite subsets of $X$ given by $X_1 = \{x_i^1 \in X : i \in I_1\}$ and $X_2 = \{x_i^2 \in X : i \in I_2\}$, and two subsets of nonnegative real numbers $\{\lambda_i^1 \ge 0, i \in I_1\}$, $\{\lambda_i^2 \ge 0, i \in I_2\}$ such that $\sum_{i \in I_j} \lambda_i^j = 1$ for $j = 1, 2$, with the following property: $z_j = \sum_{i \in I_j} \lambda_i^j x_i^j$ for $j = 1, 2$. Then for any $\lambda \in [0, 1]$, $\lambda z_1 + (1 - \lambda) z_2 = \lambda\big(\sum_{i \in I_1} \lambda_i^1 x_i^1\big) + (1 - \lambda)\big(\sum_{i \in I_2} \lambda_i^2 x_i^2\big)$. Consider the finite set $\tilde{X} = X_1 \cup X_2$, and for each $x \in \tilde{X}$, if $x = x_i^1 \in X_1$ with $i \in I_1$ let $\mu_x = \lambda \cdot \lambda_i^1$, and if $x = x_i^2 \in X_2$ with $i \in I_2$, let $\mu_x = (1 - \lambda) \cdot \lambda_i^2$. It is easy to check that $\sum_{x \in \tilde{X}} \mu_x = 1$, and $\lambda z_1 + (1 - \lambda) z_2 = \sum_{x \in \tilde{X}} \mu_x x$. Thus, $\lambda z_1 + (1 - \lambda) z_2 \in \tilde{C}$.

2. $X \subseteq \tilde{C}$. We simply use $\lambda = 1$ as the multiplier for a point from $X$.

3. Let $C'$ be any convex set such that $X \subseteq C'$. Since $C'$ is convex, every point of the form $\lambda_1 x_1 + \ldots + \lambda_t x_t$ where $x_1, \ldots, x_t \in X$, $\lambda_i \ge 0$, $\sum_{i=1}^{t} \lambda_i = 1$ belongs to $C'$ by Proposition 2.6. Thus, $\tilde{C} \subseteq C'$.

From 1., 2. and 3., we get that $\tilde{C} = \mathrm{conv}(X)$.
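Theorem 2.8 also gives a computational handle on convex hulls of finite sets: deciding whether $p \in \mathrm{conv}(X)$ amounts to the linear feasibility problem of finding multipliers $\lambda \ge 0$ with $\sum_i \lambda_i = 1$ and $\sum_i \lambda_i x_i = p$. Here is a minimal sketch of this test (our illustration; it assumes SciPy's LP solver).

```python
# Illustrative sketch: membership in conv(X) for finite X as an LP feasibility
# problem, directly from the description in Theorem 2.8.
import numpy as np
from scipy.optimize import linprog

def in_convex_hull(points, p):
    points = np.asarray(points, dtype=float)            # shape (n, d)
    n, d = points.shape
    A_eq = np.vstack([points.T, np.ones((1, n))])       # d coordinate rows
    b_eq = np.append(np.asarray(p, dtype=float), 1.0)   # plus sum(lambda) = 1
    res = linprog(c=np.zeros(n), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * n)               # lambda >= 0
    return res.success

square = [(0, 0), (1, 0), (0, 1), (1, 1)]
print(in_convex_hull(square, (0.5, 0.5)))   # True
print(in_convex_hull(square, (1.5, 0.5)))   # False
```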

2.2 Convex cones, affine sets and dimension

We say $X$ is convex if for all $x, y \in X$ and $\lambda, \gamma \ge 0$ such that $\lambda + \gamma = 1$, $\lambda x + \gamma y \in X$. What happens if we relax the conditions on $\lambda, \gamma$?

Definition 2.9. Let $X \subseteq \mathbb{R}^d$ be a nonempty set. We have three possibilities:

1. We say that $X \subseteq \mathbb{R}^d$ is a convex cone if for all $x, y \in X$ and $\lambda, \gamma \ge 0$, $\lambda x + \gamma y \in X$.

2. We say that $X \subseteq \mathbb{R}^d$ is an affine set or an affine subspace if for all $x, y \in X$ and $\lambda, \gamma \in \mathbb{R}$ such that $\lambda + \gamma = 1$, $\lambda x + \gamma y \in X$.

3. We say $X \subseteq \mathbb{R}^d$ is a linear set or a linear subspace if for all $x, y \in X$ and $\lambda, \gamma \in \mathbb{R}$, $\lambda x + \gamma y \in X$.

Remark 2.10. Since we relaxed the conditions on $\lambda, \gamma$, convex cones, affine sets and linear sets are all special cases of convex sets.

Similar to the definition of the convex hull of an arbitrary subset $X$, one can define the conical hull of $X$ as the set-inclusion-wise smallest convex cone containing $X$, denoted by $\mathrm{cone}(X)$. Similarly, the affine (linear) hull of $X$ is the set-inclusion-wise smallest affine (linear) set containing $X$. The affine hull will be denoted by $\mathrm{aff}(X)$, and the linear hull will be denoted by $\mathrm{span}(X)$. One can verify the following analog of Theorem 2.8.

Theorem 2.11. Let $X \subseteq \mathbb{R}^d$. The following are all true.

1. $\mathrm{cone}(X) = \bigcap\,(C : X \subseteq C,\ C \text{ is a convex cone}) = \{\lambda_1 x_1 + \ldots + \lambda_t x_t : x_1, \ldots, x_t \in X,\ \lambda_1, \ldots, \lambda_t \ge 0\}$.

2. $\mathrm{aff}(X) = \bigcap\,(C : X \subseteq C,\ C \text{ is an affine set}) = \{\lambda_1 x_1 + \ldots + \lambda_t x_t : x_1, \ldots, x_t \in X,\ \sum_{i=1}^{t} \lambda_i = 1\}$.

3. $\mathrm{span}(X) = \bigcap\,(C : X \subseteq C,\ C \text{ is a linear subspace}) = \{\lambda_1 x_1 + \ldots + \lambda_t x_t : x_1, \ldots, x_t \in X,\ \lambda_1, \ldots, \lambda_t \in \mathbb{R}\}$.

The following is a fundamental theorem of linear algebra.

Theorem 2.12. Let $X \subseteq \mathbb{R}^d$. The following are equivalent.

1. $X$ is a linear subspace.

2. There exist $0 \le m \le d$ and linearly independent vectors $v^1, \ldots, v^m \in X$ such that every $x \in X$ can be written as $x = \lambda_1 v^1 + \ldots + \lambda_m v^m$ for some reals $\lambda_i$, $i = 1, \ldots, m$, i.e., $X = \mathrm{span}(\{v^1, \ldots, v^m\})$.

3. There exists a matrix $A \in \mathbb{R}^{(d-m) \times d}$ with full row rank such that $X = \{x \in \mathbb{R}^d : Ax = 0\}$.

8 d 216 Proof sketch. We take for granted the fact that we can have at most d linearly independent vectors in R . 217 This is something one can show using Gaussian elimination. 218 It is easy to verify that 2. 1. (because linear combinations of linear combinations are linear combi- ⇒ 1 m 219 nations). To see that 1. 2., starting with a linear subspace X, we construct a finite set v ,..., v X ⇒ 1 ∈ 220 satisfying the conditions of 2. We do this in an iterative fashion. Start by picking any arbitrary v X. If 1 2 1 1 2∈ 221 X = span( v ), then we are done. Else, choose v X span( v ). Again, if X = span( v , v ) then { } 3 1 2 ∈ \ { } { } 222 we are done, else choose v X span( v , v ). This process has to end after at most d steps, because we ∈ \ { } d 223 cannot have more than d linearly independent vectors in R . ⊥ d 224 It is easy to verify 3. 1. To see that 1. 3., define the set X := y R : y, x = 0 x X (this ⇒ ⇒ ⊥{ ∈ h i ∀ ∈ } 225 is known as the orthogonal complement of X). It can be verified that X is a linear subspace. Moreover, by ⊥ 1 k 226 the equivalence 1. 2., we know that 2. holds for X . So there exist linearly independent vectors a ,..., a ⇔ ⊥ 1 k 1 k 227 for some 0 k d such that X = span( a ,..., a ). Let A be the k d matrix which has a ,..., a as ≤ ≤ { d } × 228 rows. One can now verify that X = x R : Ax = 0 . The fact that one can take k = d m where m is { ∈ } − 229 the number from condition 2. needs additional work, which we skip here.

Definition 2.13. The number $m$ showing up in item 2. of the above theorem is called the dimension of $X$. The set of vectors $\{v^1, \ldots, v^m\}$ is called a basis for the linear subspace.

There is an analogous theorem for affine sets. For this, we need the concept of affine independence, which is analogous to the concept of linear independence.

Definition 2.14. We say a set $X$ is affinely independent if there does not exist $x \in X$ such that $x \in \mathrm{aff}(X \setminus \{x\})$.

We now give several characterizations of affine independence.

Proposition 2.15. Let $X \subseteq \mathbb{R}^d$. The following are equivalent.

1. $X$ is an affinely independent set.

2. For every $x \in X$, the set $\{v - x : v \in X \setminus \{x\}\}$ is linearly independent.

3. There exists $x \in X$ such that the set $\{v - x : v \in X \setminus \{x\}\}$ is linearly independent.

4. The set of vectors $\{(x, 1) \in \mathbb{R}^{d+1} : x \in X\}$ is linearly independent.

5. $X$ is a finite set with vectors $x^1, \ldots, x^m$ such that $\lambda_1 x^1 + \ldots + \lambda_m x^m = 0$, $\lambda_1 + \ldots + \lambda_m = 0$ implies $\lambda_1 = \lambda_2 = \ldots = \lambda_m = 0$.

Proof. 1. $\Rightarrow$ 2. Consider an arbitrary $x \in X$. Suppose to the contrary that $\{v - x : v \in X \setminus \{x\}\}$ is not linearly independent, i.e., there exist multipliers $\lambda_v$, not all zero, such that $\sum_{v \in X \setminus \{x\}} \lambda_v (v - x) = 0$. Rearranging terms, we get $\sum_{v \in X \setminus \{x\}} \lambda_v v = \big(\sum_{v \in X \setminus \{x\}} \lambda_v\big) x$. We now consider two cases:

Case 1: $\sum_{v \in X \setminus \{x\}} \lambda_v = 0$. In this case, since not all the $\lambda_v$ are zero, let $\bar{v} \in X \setminus \{x\}$ be such that $\lambda_{\bar{v}} \neq 0$. Since $\sum_{v \in X \setminus \{x\}} \lambda_v v = \big(\sum_{v \in X \setminus \{x\}} \lambda_v\big) x = 0$, we obtain that $\bar{v} = \sum_{v \in X \setminus \{x, \bar{v}\}} \frac{-\lambda_v}{\lambda_{\bar{v}}} v$. Since $\sum_{v \in X \setminus \{x\}} \lambda_v = 0$, we obtain that $\sum_{v \in X \setminus \{x, \bar{v}\}} \frac{-\lambda_v}{\lambda_{\bar{v}}} = 1$ and thus $\bar{v} \in \mathrm{aff}(X \setminus \{x, \bar{v}\})$, contradicting the assumption that $X$ is affinely independent.

Case 2: $\sum_{v \in X \setminus \{x\}} \lambda_v \neq 0$. We can write $x = \sum_{v \in X \setminus \{x\}} \frac{\lambda_v}{\sum_{v \in X \setminus \{x\}} \lambda_v} v$. This implies that $x \in \mathrm{aff}(X \setminus \{x\})$, contradicting the assumption that $X$ is affinely independent.

2. $\Rightarrow$ 3. Obvious.

3. $\Rightarrow$ 4. Let $\bar{x}$ be such that $\{v - \bar{x} : v \in X \setminus \{\bar{x}\}\}$ is linearly independent. This means that the vectors $\{(v - \bar{x}, 0) : v \in X \setminus \{\bar{x}\}\} \cup \{(\bar{x}, 1)\}$ are also linearly independent. Thus the matrix with these vectors as columns has full column rank. Now if we add the column $(\bar{x}, 1)$ to each of the other columns, this does

not change the column rank, and thus the columns remain linearly independent. But the new matrix has precisely $\{(x, 1) \in \mathbb{R}^{d+1} : x \in X\}$ as its columns.

4. $\Rightarrow$ 5. If $\{(x, 1) \in \mathbb{R}^{d+1} : x \in X\}$ is linearly independent, then the set $X$ must be finite with elements $x^1, \ldots, x^m$. Moreover, for any $\lambda_1, \ldots, \lambda_m$ such that $\lambda_1 x^1 + \ldots + \lambda_m x^m = 0$, $\lambda_1 + \ldots + \lambda_m = 0$, we have $\sum_{x \in X} \lambda_x (x, 1) = 0$. By linear independence of the set $\{(x, 1) \in \mathbb{R}^{d+1} : x \in X\}$, $\lambda_1 = \ldots = \lambda_m = 0$.

5. $\Rightarrow$ 1. Consider any $x^i \in X$. If $x^i \in \mathrm{aff}(X \setminus \{x^i\})$, then there exist multipliers $\lambda_j \in \mathbb{R}$, $j \neq i$, such that $x^i = \sum_{j \neq i} \lambda_j x^j$ and $\sum_{j \neq i} \lambda_j = 1$. This implies that $\sum_{j=1}^{m} \lambda_j x^j = 0$ where $\lambda_i = -1$, and therefore $\lambda_1 + \ldots + \lambda_m = 0$, contradicting the hypothesis of 5.
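Condition 4. of Proposition 2.15 is the most convenient characterization computationally: affine independence of $x^1, \ldots, x^m$ is just linear independence of the lifted vectors $(x^i, 1)$, i.e., a rank computation. A minimal sketch (our illustration, not from the notes):

```python
# Illustrative sketch: testing affine independence via condition 4 of
# Proposition 2.15 -- the lifted vectors (x, 1) must be linearly independent.
import numpy as np

def affinely_independent(points):
    P = np.asarray(points, dtype=float)                 # shape (m, d)
    lifted = np.hstack([P, np.ones((P.shape[0], 1))])   # rows (x, 1) in R^{d+1}
    return np.linalg.matrix_rank(lifted) == P.shape[0]

print(affinely_independent([(0, 0), (1, 0), (0, 1)]))   # True: a triangle
print(affinely_independent([(0, 0), (1, 1), (2, 2)]))   # False: collinear
```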

We are now ready to state the affine version of Theorem 2.12.

Theorem 2.16. Let $X \subseteq \mathbb{R}^d$. The following are equivalent.

1. $X$ is an affine subspace.

2. There exists a linear subspace $L$ of dimension $0 \le m \le d$ such that $X - x = L$ for every $x \in X$.

3. There exist affinely independent vectors $v^1, \ldots, v^{m+1} \in X$ for some $0 \le m \le d$ such that every $x \in X$ can be written as $x = \lambda_1 v^1 + \ldots + \lambda_{m+1} v^{m+1}$ for some reals $\lambda_i$, $i = 1, \ldots, m+1$, such that $\lambda_1 + \ldots + \lambda_{m+1} = 1$, i.e., $X = \mathrm{aff}(\{v^1, \ldots, v^{m+1}\})$.

4. There exist a matrix $A \in \mathbb{R}^{(d-m) \times d}$ with full row rank and a vector $b \in \mathbb{R}^{d-m}$ for some $0 \le m \le d$ such that $X = \{x \in \mathbb{R}^d : Ax = b\}$.

Proof. 1. $\Rightarrow$ 2. Fix an arbitrary $x^* \in X$. Define $L = X - x^*$. We first show that $L$ is a linear subspace: for any $y^1, y^2 \in X$, $\lambda(y^1 - x^*) + \gamma(y^2 - x^*) \in X - x^*$ for any $\lambda, \gamma \in \mathbb{R}$. Since $\lambda(y^1 - x^*) + \gamma(y^2 - x^*) + x^* = \lambda y^1 + \gamma y^2 + (1 - \lambda - \gamma)x^*$ and $X$ is an affine subset, we have $\lambda(y^1 - x^*) + \gamma(y^2 - x^*) + x^* \in X$. So, $\lambda(y^1 - x^*) + \gamma(y^2 - x^*) \in X - x^* = L$. Now, for any other $\bar{x} \in X$, we need to show that $L = X - \bar{x}$. Consider any $y \in L$, i.e., $y = x - x^*$ for some $x \in X$. Observe that $y = (x + \bar{x} - x^*) - \bar{x}$ and $x + \bar{x} - x^* \in X$ (because the coefficients all sum to 1). Therefore, $y \in X - \bar{x}$, showing that $L = X - x^* \subseteq X - \bar{x}$. Switching the roles of $x^*$ and $\bar{x}$, one can similarly show that $X - \bar{x} \subseteq X - x^* = L$.

2. $\Rightarrow$ 1. Consider any $y^1, y^2 \in X$ and let $\lambda, \gamma \in \mathbb{R}$ such that $\lambda + \gamma = 1$. We need to show that $\lambda y^1 + \gamma y^2 \in X$. Since $X - y^1$ is a linear subspace, $\gamma(y^2 - y^1) \in X - y^1$. Thus, $\gamma(y^2 - y^1) + y^1 = \lambda y^1 + \gamma y^2 \in X$.

The equivalence of 2., 3. and 4. follows from Theorem 2.12.

Definition 2.17 (Dimension of convex sets). If $X$ is an affine subspace and $x \in X$, the linear subspace $X - x$ is called the linear subspace parallel to $X$, and the dimension of $X$ is the dimension of the linear subspace $X - x$. For any nonempty convex set $X$, the dimension of $X$ is the dimension of $\mathrm{aff}(X)$ and will be denoted by $\dim(X)$. As a matter of convention, we take the dimension of the empty set to be $-1$.

Lemma 2.18. If $X$ is a set of affinely independent points, then $\dim(\mathrm{aff}(X)) = |X| - 1$.

Proof. Fix any $x \in X$. By Theorem 2.16, $L = \mathrm{aff}(X) - x$ is a linear subspace. We claim that $(X \setminus \{x\}) - x$ is a basis for $L$. The verification of this claim is left to the reader.

Proposition 2.19. Let $X$ be a convex set. $\dim(X)$ equals one less than the maximum number of affinely independent points in $X$.

Proof. Let $X_0 \subseteq X$ be a maximum sized set of affinely independent points in $X$. By Problem 5 in “HW for Week I”, $\mathrm{aff}(X_0) \subseteq \mathrm{aff}(X)$. Since $X_0$ is a maximum sized set of affinely independent points in $X$, any $x \in X$ must lie in $\mathrm{aff}(X_0)$. Therefore, $X \subseteq \mathrm{aff}(X_0)$. Since $\mathrm{aff}(X_0)$ is an affine set, by definition of the affine hull of $X$, we have $\mathrm{aff}(X) \subseteq \mathrm{aff}(X_0)$. Therefore, $\mathrm{aff}(X) = \mathrm{aff}(X_0)$, implying that $\dim(\mathrm{aff}(X_0)) = \dim(\mathrm{aff}(X))$. By Lemma 2.18, we thus obtain $|X_0| - 1 = \dim(\mathrm{aff}(X))$.

2.3 Representations of convex sets

A large part of modern convex geometry is concerned with algorithms for computing with or optimizing over convex sets. For algorithmic purposes, we need ways to describe a convex set so that it can be stored in a computer compactly and computations can be performed with it.

2.3.1 Extrinsic description: separating hyperplanes

Perhaps the most primitive convex set in $\mathbb{R}^d$ is the halfspace; see item 2. in Example 2.1. Moreover, a halfspace is a closed convex set. By Theorem 2.3, the intersection of an arbitrary family of halfspaces is a closed convex set. Perhaps the most fundamental theorem of convexity is that the converse is true.

Theorem 2.20 (Separating Hyperplane Theorem). Let $C \subseteq \mathbb{R}^d$ be a closed convex set and let $x \notin C$. There exists a halfspace that contains $C$ and does not contain $x$. More precisely, there exist $a \in \mathbb{R}^d \setminus \{0\}$, $\delta \in \mathbb{R}$ such that $\langle a, y \rangle \le \delta$ for all $y \in C$ and $\langle a, x \rangle > \delta$. The hyperplane $\{y \in \mathbb{R}^d : \langle a, y \rangle = \delta\}$ is called a separating hyperplane for $C$ and $x$.

Proof. If $C$ is empty, then any halfspace that does not contain $x$ suffices. Otherwise, consider any $\bar{x} \in C$ and let $r = \|x - \bar{x}\|$. Let $\bar{C} = C \cap B(x, r)$. Since $C$ is closed and $B(x, r)$ is compact, $\bar{C}$ is compact. One can also verify that the function $f(y) = \|y - x\|$ is a continuous function on $\mathbb{R}^d$. Therefore, by Weierstrass' Theorem (Theorem 1.11), there exists $x^* \in \bar{C}$ such that $\|x - x^*\| \le \|x - y\|$ for all $y \in \bar{C}$, and therefore in fact $\|x - x^*\| \le \|x - y\|$ for all $y \in C$.

Let $a = x - x^*$ and let $\delta = \langle a, x^* \rangle$. Note that $a \neq 0$ because $x \notin C$ and $x^* \in C$. Also note that $\langle a, x \rangle = \langle a, a + x^* \rangle = \|a\|^2 + \delta > \delta$. Thus, it remains to check that $\langle a, y \rangle \le \delta$ for all $y \in C$. For any $y \in C$, all the points $\alpha y + (1 - \alpha)x^*$, $\alpha \in (0, 1)$ are in $C$ by convexity. Therefore, by the extremal property of $x^*$, we have
$$\begin{array}{rl} \|x - x^*\|^2 \le \|x - (\alpha y + (1 - \alpha)x^*)\|^2 & \forall \alpha \in (0, 1) \\ \Rightarrow\ 0 \le \alpha^2 \|y - x^*\|^2 - 2\alpha \langle x - x^*, y - x^* \rangle & \forall \alpha \in (0, 1) \\ \Rightarrow\ 2\langle x - x^*, y - x^* \rangle \le \alpha \|y - x^*\|^2 & \forall \alpha \in (0, 1) \end{array}$$

327 Another related, and very useful, result is the following.

d 328 Theorem 2.23 (Supporting Hyperplane Theorem). Let C R be a convex set and let x bd(C). Then, d ⊆ ∈ 329 there exists a R 0 , δ R such that a, y δ for all y C and a, x = δ. The hyperplane d ∈ \{ } ∈ h i ≤ ∈ h i 330 y R : a, y = δ is called a supporting hyperplane for C at x. { ∈ h i } d d d 331 Proof. Since bd(C) = bd(R cl(C)), x bd(R cl(C)). Since R cl(C) is an open set, there exists a i i\ ∈ i \ \ i i 332 sequence x i∈N such that x x and each x cl(C). By Theorem 2.20, for each x , there exists a such i { } i i → 6∈ i i 333 that a , y < a , x for all y C. By scaling the vectors a , we can assume that a = 1 for all i N. h i h i ∈ k k ∈

11 334 Since the set of unit norm vectors is a compact set, by Theorem 1.10, one can pick a convergent sub- ik ik ik ik 335 sequence a a such that a , y < a , x for all y C. Taking the limit on both sides, we obtain → h i h i ∈ i 336 a, y a, x for all y C. We simply set δ = a, x . Note also that since a = 1 for all i N, we must h i ≤ h i ∈ h i k k ∈ 337 have a = 1, and so a = 0. k k 6

338 How to represent general convex sets: Separation oracles. We have seen that polyhedra can be 339 represented by a matrix A and a right hand side b. Norm balls can be represented by the center x and 340 the radius R. Ellipsoids can be represented by positive definite matrices A. What about general convex 341 sets? This problem is gotten around by assuming that one has “black-box” access to the convex set via a d 342 separation oracle. More formally, we say that a convex set C R is equipped with a separation oracle O d ⊆ 343 that takes as input any vector x R and gives the following output: If x C, the output is “YES”, and if ∈ d d ∈ 344 x C, then the output is a tuple (a, δ) R R such that y R : a, y = δ is a separating hyperplane 6∈ ∈ × { ∈ h i } 345 for x and C.

346 Farkas’ lemma: A glimpse into polyhedral theory. A nice characterization of solutions to systems 347 of linear equations is given in linear algebra, which can be viewed as the most basic type of “theorem of the 348 alternative”.

d×n d 349 Theorem 2.24. Let A R and b R . Exactly one of the following is true. ∈ ∈ 350 1. Ax = b has a solution.

d T T 351 2. There exists u R such that u A = 0 and u b = 0. ∈ 6 352 What if we are interested in nonnegative solutions to linear equations? Farkas’ lemma is a characterization 353 of such solutions.

d×n d 354 Theorem 2.25. [Farkas’ Lemma] Let A R and b R . Exactly one of the following is true. ∈ ∈ 355 1. Ax = b, x 0 has a solution. ≥ d T T 356 2. There exists u R such that u A 0 and u b > 0. ∈ ≤ 357 Before we dive into the proof of Farkas’ Lemma, we need a technical result.

1 n d 1 n 358 Lemma 2.26. Let a ,..., a R . Then cone( a ,..., a ) is closed. ∈ { } 359 Proof. We will complete the proof of this lemma when we do Caratheodory’s theorem (see the end of 360 Section 2.4).

1 n d 361 Proof of Theorem 2.25. Let a ,..., a R be the columns of the matrix A. By Lemma 2.26, the cone ∈ 362 C = Ax : x 0 is closed. We now have two cases, either b C or b C. In the first case, we end up in { ≥ } ∈ 6∈ d 363 Case 1 of the statement of the theorem. In the second case, by Theorem 2.20, there exists u R and δ R ∈ ∈ 364 such that u, y δ for all y C and u, b > δ. Since 0 C, we must have δ u, 0 = 0. This already h i ≤ ∈ h i ∈ ≥ h i 365 shows that u, b > 0. h i i i i 366 Now suppose to the contrary that for some a , u, a > 0. Thus, there exists λ¯ 0 such that λ¯ u, a > δ ¯ |δ|+1 ¯ i h i ≥ h i 367 (for example, take λ = i ). Since y := λa C, this implies that u, y > δ, contradicting that u, y δ hu,a i ∈ h i h i ≤ 368 for all y C. ∈

369 Duality/Polarity. With every linear space, one can associate a “dual” linear space which is its orthogonal 370 complement.

d ⊥ d 371 Definition 2.27. Let X R be a linear subspace. We define X := y R : y, x = 0 x X as the ⊆ { ∈ h i ∀ ∈ } 372 orthogonal complement of X.

373 The following is well-known from linear algebra.

12 ⊥ ⊥ ⊥ 374 Proposition 2.28. X is a linear subspace. Moreover, (X ) = X.

375 There is a way to generalize this idea of associating a dual object to convex sets.

Definition 2.29. Let X Rd be any set. The set defined as ⊆ ◦ d X := y R : y, x 1 x X { ∈ h i ≤ ∀ ∈ }

376 is called the polar of X.

377 Proposition 2.30. The following are all true.

◦ d 378 1. X is a closed, convex set for any X R (not necessarily convex). ⊆ ◦ ◦ 379 2.( X ) = cl(conv(X 0 )). In particular, if X is a closed convex set containing the origin, then ◦ ◦ ∪ { } 380 (X ) = X.

◦ d 381 3. If X is a convex cone, then X = y R : y, x 0 x X . { ∈ h i ≤ ∀ ∈ } ◦ ⊥ 382 4. If X is a linear subspace, then X = X .

Proof. 1. Follows from the fact that X◦ can be written as the intersection of closed halfspaces:

◦ \ d X = y R : y, x 1 . { ∈ h i ≤ } x∈X

◦ ◦ ◦ ◦ ◦ ◦ 383 2. Observe that X (X ) . Also, 0 (X ) , because 0 is always in the polar of any set. Since (X ) is ⊆ ∈ ◦ ◦ 384 a closed convex set by 1., we must have cl(conv(X 0 )) (X ) . ∪ { } ⊆ ◦ ◦ 385 To show the reverse inclusion, we show that if y cl(conv(X 0 )) then y (X ) . Thus, we need ◦ 6∈ ∪ { } 6∈ 386 to show that there exists z X such that y, z > 1. Since y cl(conv(X 0 )), by Theorem 2.20, d ∈ h i 6∈ ∪ { } 387 there exists a R , δ R such that a, y > δ and a, x δ for all x cl(conv(X 0 )). Since ∈ ∈ h i h i ≤ ∈ ∪ { } 388 0 cl(conv(X 0 )), we obtain that 0 δ. We now consider two cases: ∈ ∪ { } ≤ a 389 Case 1: δ > 0. Set z = δ . Now, z, x 1 for all x X because a, x δ for all x cl(conv(X ◦ h i ≤ ∈ h i ≤ ∈ ∪ 390 0 )) X. Therefore, z X . Moreover, z, y > 1 because a, y > δ. So we are done. { } ⊇ ∈ h i h i 2a 391 Case 2: δ = 0. Define  := a, y > δ = 0. Set z =  . Then, z, y = 2 > 1. Also, for every h i 2 2 h i ◦ 392 x X cl(conv(X 0 )), we obtain that z, x = a, x δ = 0 1. Thus, z X . Thus, we ∈ ⊆ ∪ { } h i  h i ≤  ≤ ∈ 393 are done. 394 3. and 4. are left to the reader.

395

13 1 1 ◦ Example 2.31. If p, q 1 such that p + q = 1 (allowing for p or q to be ), then (B`p (0, 1)) = B`q (0, 1). This example illustrates≥ the use of the fundamental H¨older’sinequality.∞

1 1 Proposition 2.32 (H¨older’sinequality). If p, q 1 such that p + q = 1 (allowing for p or q to be ), then ≥ ∞ x, y x p y q, |h i| ≤ k k k k q d p for every x, y R . Moreover, if p, q > 1 then equality holds if and only if xi = yi . ∈ | | | | The special case with p = q = 2 is known as the Cauchy-Schwarz inequality. We won’t prove H¨older’s inequality here, but we will use it to derive the polarity relation between `p unit balls. We only show that ◦ 1 1 B`q (0, 1) = (B`p (0, 1)) for any p, q > 1 such that + = 1. The case p = 1, q = is considered in p q ∞ 396 Problem6 from “HW for Week III”. ◦ First, we show that B`q (0, 1) (B`p (0, 1)) . Consider any y B`q (0, 1) and consider any x B`p . By ⊆ ∈ ◦ ∈ H¨older’sinequality, we obtain that x, y x p y q 1. Thus, B`q (0, 1) (B`p (0, 1)) . To show the ◦ h i ≤ k k k k ≤ ◦ ⊆ reverse inclusion (B`p (0, 1)) B`q (0, 1), consider any y (B`p (0, 1)) . We would like to show that ⊆ ∈ y B`q (0, 1), i.e., y q 1. Suppose to the contrary that y q > 1. Consider x defined as follows: for ∈ k k ≤ qk k p x each i = 1, . . . , d, xi has the same sign as yi, and xi = yi . Set x˜ = . Now, | | | | kxkp 1 1 y, x˜ = x, y = ( x p y q) = y q > 1, h i x p h i x p k k k k k k k k k k ◦ contradicting the fact that y (B`p (0, 1)) , because x˜ p = 1. The second equality follows from Proposi- tion 2.32 because of the special∈ choice of x. k k

2.3.2 Intrinsic description: faces, extreme points, recession cone, lineality space

We have seen that given any set $X$ of points in $\mathbb{R}^d$, the convex hull of $X$ – the smallest convex set containing $X$ – can be expressed as the set of all convex combinations of finite subsets of $X$ (Theorem 2.8). One possibility to represent a convex set $C$ intrinsically is to give a minimal subset $X \subseteq C$ such that all points in $C$ can be expressed as convex combinations of points in $X$, i.e., $C = \mathrm{conv}(X)$. In particular, if $X$ is a finite set, then we can use $X$ to represent $C$ in a computer: implicitly, $C$ is the convex hull of the set $X$. We are going to get to such a “minimal” intrinsic description.

Definition 2.33 (Faces and extreme points). Let $C$ be a convex set. A convex subset $F \subseteq C$ is called an extreme subset or a face of $C$ if for any $x \in F$ the following holds: $x^1, x^2 \in C$, $\frac{x^1 + x^2}{2} = x$ implies that $x^1, x^2 \in F$. This is equivalent to saying that there is no point in $F$ that can be expressed as a convex combination of points in $C \setminus F$ – see Problem 10 from “HW for Week III”.

A face of dimension 0 is called an extreme point. In other words, $x$ is an extreme point of $C$ if the following holds: $x^1, x^2 \in C$, $\frac{x^1 + x^2}{2} = x$ implies that $x^1 = x^2 = x$. We denote the set of extreme points of $C$ by $\mathrm{ext}(C)$.

The one-dimensional faces of a convex set are called its edges. If $k = \dim(C)$, then the $(k-1)$-dimensional faces are called facets. We will see below that the only $k$-dimensional face of $C$ is $C$ itself. Any face of $C$ that is not $C$ or $\emptyset$ is called a proper face of $C$.

Definition 2.34. Let $C$ be a convex set. We define the relative interior of $C$ as the set of all $x \in C$ for which there exists $\epsilon > 0$ such that for all $y \in \mathrm{aff}(C)$, $x + \epsilon \frac{y - x}{\|y - x\|} \in C$. We denote it by $\mathrm{relint}(C)$.¹

We define the relative boundary of $C$ to be $\mathrm{relbd}(C) := \mathrm{cl}(C) \setminus \mathrm{relint}(C)$.

¹For the reader familiar with the concept of a relative topology: the relative interior of $C$ is the interior of $C$ with respect to the relative topology of $\mathrm{aff}(C)$.

Exercise 3. Let $C$ be convex and $x \in C$. Suppose that for all $y \in \mathrm{aff}(C)$, there exists $\epsilon_y > 0$ such that $x + \epsilon_y (y - x) \in C$. Show that $x \in \mathrm{relint}(C)$.

This exercise shows that it suffices to have a different $\epsilon$ for every direction; this implies a universal $\epsilon$ for every direction.

Exercise 4. Show that $\mathrm{relint}(C)$ is nonempty for any nonempty convex set $C$.

Lemma 2.35. Let $C$ be a convex set of dimension $k$. The only $k$-dimensional face of $C$ is $C$ itself.

Proof. Let $F \subsetneq C$ be a proper face of $C$. Let $x \in C \setminus F$. Let $X \subseteq F$ be a maximum set of affinely independent points in $F$. We claim that $X \cup \{x\}$ is affinely independent. This immediately implies that $\dim(C) > \dim(F)$ and we will be done.

Suppose to the contrary that $x \in \mathrm{aff}(X)$. Then consider $x^* \in \mathrm{relint}(F)$ (which is nonempty by Exercise 4). By definition, there exists $\epsilon > 0$ such that $y = x^* + \epsilon(x^* - x) \in F$. But this means that $x^* = \frac{1}{1 + \epsilon} y + \frac{\epsilon}{1 + \epsilon} x$. Since $y \in F$ and $x \notin F$, this contradicts that $F$ is a face.

Lemma 2.36. Let $C$ be a convex set and let $F \subseteq C$ be a face of $C$. If $x$ is an extreme point of $F$, then $x$ is an extreme point of $C$.

Proof. Left to the reader.

Lemma 2.37. Let $C \subseteq \mathbb{R}^d$ be convex. Let $a \in \mathbb{R}^d$ and $\delta \in \mathbb{R}$ be such that $C \subseteq \{x \in \mathbb{R}^d : \langle a, x \rangle \le \delta\}$. Then the set $F = C \cap \{x \in \mathbb{R}^d : \langle a, x \rangle = \delta\}$ is a face of $C$.

Proof. Let $\bar{x} \in F$ and $x^1, x^2 \in C$ such that $\frac{x^1 + x^2}{2} = \bar{x}$. By the hypothesis, $\langle a, x^i \rangle \le \delta$ for $i = 1, 2$. If $\langle a, x^i \rangle < \delta$ for either $i = 1, 2$, then
$$\langle a, \bar{x} \rangle = \Big\langle a, \frac{x^1 + x^2}{2} \Big\rangle = \frac{\langle a, x^1 \rangle + \langle a, x^2 \rangle}{2} < \delta,$$

contradicting that $\bar{x} \in F$. Therefore, we must have $\langle a, x^i \rangle = \delta$ for $i = 1, 2$, and thus $x^1, x^2 \in F$.

Definition 2.38. A face $F$ of a convex set $C$ is called an exposed face if there exist $a \in \mathbb{R}^d$ and $\delta \in \mathbb{R}$ such that $C \subseteq \{x \in \mathbb{R}^d : \langle a, x \rangle \le \delta\}$ and $F = C \cap \{x \in \mathbb{R}^d : \langle a, x \rangle = \delta\}$. We will sometimes make it explicit and say that $F$ is an exposed face induced by $(a, \delta)$.
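For a concrete illustration (ours, not from the notes), take $C = [0, 1]^2$, $a = (1, 0)$ and $\delta = 1$. Then $C \subseteq \{x : \langle a, x \rangle \le 1\}$, and Lemma 2.37 yields the face $F = C \cap \{x : x_1 = 1\}$, the right edge of the square. $F$ is an exposed face induced by $(a, \delta)$ in the sense of Definition 2.38, and its endpoints $(1, 0)$ and $(1, 1)$ are extreme points of $F$, hence of $C$ by Lemma 2.36.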

By working with the affine hull and the relative interior, and using Problem 3 from “HW for Week II”, a stronger version of the supporting hyperplane theorem can be shown to be true.

Theorem 2.39 (Supporting Hyperplane Theorem - II). Let $C \subseteq \mathbb{R}^d$ be convex and $x \in \mathrm{relbd}(C)$. There exist $a \in \mathbb{R}^d$ and $\delta \in \mathbb{R}$ such that all of the following hold:

(i) $\langle a, y \rangle \le \delta$ for all $y \in C$,

(ii) $\langle a, x \rangle = \delta$, and

(iii) there exists $\bar{y} \in C$ such that $\langle a, \bar{y} \rangle < \delta$. This third condition says that $C$ is not completely contained in the hyperplane $\{y \in \mathbb{R}^d : \langle a, y \rangle = \delta\}$.

An important consequence of the above discussion is the following theorem about the relative boundary of a closed, convex set $C$.

Theorem 2.40. Let $C \subseteq \mathbb{R}^d$ be a closed, convex set and $x \in C$. $x$ is contained in a proper face of $C$ if and only if $x \in \mathrm{relbd}(C)$.

15 d 450 Proof. If x relbd(C), then by Theorem 2.39 there exists a R and δ R such that the three conditions ∈ d ∈ ∈ 451 in Theorem 2.39 hold. By Lemma 2.37, F = C x R : a, x = δ is a face of C, and it is a proper face ∩ { ∈ h i } 452 because of condition (iii) in Theorem 2.39. Now let x F where F is a proper face of C. Since C is closed, it suffices to show that x relint(C). Suppose to the∈ contrary that x relint(C). Let x¯ C F . Observe that 2x x¯ aff(C). Since x6∈is assumed to be in the relative interior of∈C, there exists  >∈ 0 such\ that y = ((2x −x¯)∈ x) + x C. Rearranging terms, we obtain that − − ∈  1 x = x¯ + y.  + 1  + 1

Since $x \in F$ and $\bar{x} \notin F$, this contradicts the fact that $F$ is a face. Thus, $x \notin \mathrm{relint}(C)$ and so $x \in \mathrm{relbd}(C)$.

In our search for a subset $X \subseteq C$ such that $C = \mathrm{conv}(X)$, it is clear that $X$ must contain all extreme points. But is it sufficient to include all extreme points? In other words, is it true that $C = \mathrm{conv}(\mathrm{ext}(C))$? No! A simple counterexample is $\mathbb{R}^d_+$: its only extreme point is $0$. Another weird example is the set $\{x \in \mathbb{R}^d : \|x\| < 1\}$ – this set has NO extreme points! As you might suspect, the problem is that these sets are not compact, i.e., closed and bounded.

Theorem 2.41 (Krein-Milman Theorem). If $C$ is a compact convex set, then $C = \mathrm{conv}(\mathrm{ext}(C))$.

Proof. The proof is going to use induction on the dimension of $C$. First, if $C$ is the empty set, then the statement is a triviality. So we assume $C$ is nonempty.

For the base case with $\dim(C) = 0$, i.e., $C = \{x\}$ is a single point, the statement follows because $x$ is an extreme point of $C$, and $C = \mathrm{conv}(\{x\})$. For the induction step, consider any point $x \in C$. We consider two cases:

Case 1: $x \in \mathrm{relbd}(C)$. By Theorem 2.40, $x$ is contained in a proper face $F$ of $C$. By Lemma 2.35, $\dim(F) < \dim(C)$. By the induction hypothesis applied to $F$ (note that $F$ is also compact using Problem 14 from “HW for Week III”), we can express $x$ as a convex combination of extreme points of $F$, which by Lemma 2.36 shows that $x$ is a convex combination of extreme points of $C$.

Case 2: $x \in \mathrm{relint}(C)$. Let $\ell \subseteq \mathrm{aff}(C)$ be any affine set of dimension one (i.e., a line) going through $x$. Since $C$ is compact, $\ell \cap C$ is a line segment. The end points $x^1, x^2$ of $\ell \cap C$ must be in the relative boundary of $C$. By the previous case, $x^1, x^2$ can be expressed as convex combinations of extreme points in $C$. Since $x$ is a convex combination of $x^1$ and $x^2$, and a convex combination of convex combinations is a convex combination, we can express $x$ as a convex combination of extreme points of $C$.

What about non-compact sets? Let us relax the condition of being bounded, so we want to describe closed, convex sets. It turns out that there is a nice way to deal with unboundedness. We introduce the necessary concepts next.

Proposition 2.42. Let $C$ be a nonempty, closed, convex set, and $r \in \mathbb{R}^d$. The following are equivalent:

1. There exists $x \in C$ such that $x + \lambda r \in C$ for all $\lambda \ge 0$.

2. For every $x \in C$, $x + \lambda r \in C$ for all $\lambda \ge 0$.

Proof. Since $C$ is nonempty, we only need to show 1. $\Rightarrow$ 2.; the reverse implication is trivial. Let $\bar{x} \in C$ be such that $\bar{x} + \lambda r \in C$ for all $\lambda \ge 0$. Consider any arbitrary $x^* \in C$. Suppose to the contrary that there exists $\lambda' \ge 0$ such that $y = x^* + \lambda' r \notin C$. By Theorem 2.20, there exist $a \in \mathbb{R}^d$, $\delta \in \mathbb{R}$ such that $\langle a, y \rangle > \delta$ and $\langle a, x \rangle \le \delta$ for all $x \in C$. This means that $\langle a, r \rangle > 0$ because otherwise $\langle a, y \rangle = \langle a, x^* \rangle + \lambda' \langle a, r \rangle \le \delta + \lambda' \langle a, r \rangle \le \delta$, causing a contradiction. But then, if we choose $\bar{\lambda} = \frac{|\delta - \langle a, \bar{x} \rangle| + 1}{\langle a, r \rangle}$, we would obtain that

$$\langle a, \bar{x} + \bar{\lambda} r \rangle = \langle a, \bar{x} \rangle + \bar{\lambda} \langle a, r \rangle = \langle a, \bar{x} \rangle + |\delta - \langle a, \bar{x} \rangle| + 1 \ge \langle a, \bar{x} \rangle + \delta - \langle a, \bar{x} \rangle + 1 = \delta + 1 > \delta,$$

contradicting the assumption that $\bar{x} + \bar{\lambda} r \in C$.

16 d 482 Definition 2.43. Any r R that satisfies the conditions in Proposition 2.42 is called a recession direction ∈ 483 for C.

Proposition 2.44. The set of all recession directions of a nonempty, closed, convex set is a closed, convex cone.

Proof. Fix any point $x$ in the closed convex set $C$. Using condition 1. of Proposition 2.42, we see $r \in \mathbb{R}^d$ is a recession direction if and only if for every $\lambda > 0$, $r \in \frac{1}{\lambda}(C - x)$. Therefore,
$$\mathrm{rec}(C) = \bigcap_{\lambda > 0} \frac{1}{\lambda}(C - x).$$

Each term in the intersection is a closed, convex set. Therefore, $\mathrm{rec}(C)$ is a closed, convex set. It is easy to see that for any $r \in \mathrm{rec}(C)$, $\lambda r \in \mathrm{rec}(C)$ also for every $\lambda \ge 0$. Thus, $\mathrm{rec}(C)$ is a closed, convex cone.

Definition 2.45. Let $C$ be any nonempty, closed, convex set. We call the cone of recession directions the recession cone of $C$, and it is denoted by $\mathrm{rec}(C)$. The set $\mathrm{rec}(C) \cap -\mathrm{rec}(C)$ is a linear subspace and is called the lineality space of $C$. It will be denoted by $\mathrm{lin}(C)$. As a matter of convention, we say that $\mathrm{rec}(C) = \mathrm{lin}(C) = \{0\}$ when $C$ is empty.

Exercise 5. Show that Proposition 2.42 remains true if $\lambda \ge 0$ is replaced by $\lambda \in \mathbb{R}$ in both conditions. Show that $\mathrm{lin}(C)$ is exactly the set of all $r \in \mathbb{R}^d$ that satisfy these modified conditions.

Proposition 2.42 immediately gives the following corollary.

Corollary 2.46. Let $C$ be a closed convex set and let $F \subseteq C$ be a closed, convex subset. Then $\mathrm{rec}(F) \subseteq \mathrm{rec}(C)$.

Proof. Left as an exercise.

Here is a characterization of compact convex sets.

Theorem 2.47. A closed convex set $C$ is compact if and only if $\mathrm{rec}(C) = \{0\}$.

Proof. We leave it to the reader to check that if $C$ is compact, then $\mathrm{rec}(C) = \{0\}$. For the other direction, assume that $\mathrm{rec}(C) = \{0\}$. Suppose to the contrary that $C$ is not bounded, i.e., there exists a sequence of points $y^i \in C$ such that $\|y^i\| \to \infty$. Let $x \in C$ be any point and consider the set of unit norm vectors $r^i = \frac{y^i - x}{\|y^i - x\|}$. Since this is a sequence of unit norm vectors, by Theorem 1.10, there is a convergent subsequence $\{r^{i_k}\}_{k=1}^{\infty}$ converging to $r$, also with unit norm. We claim that $r$ is a recession direction, giving a contradiction to $\mathrm{rec}(C) = \{0\}$. To see this, for any $\lambda \ge 0$, let $N \in \mathbb{N}$ be such that $\|y^{i_k} - x\| > \lambda$ for all $k \ge N$. We now observe that

$$x + \lambda r^{i_k} = \frac{\|y^{i_k} - x\| - \lambda}{\|y^{i_k} - x\|}\, x + \frac{\lambda}{\|y^{i_k} - x\|}\big(x + r^{i_k} \|y^{i_k} - x\|\big) = \frac{\|y^{i_k} - x\| - \lambda}{\|y^{i_k} - x\|}\, x + \frac{\lambda}{\|y^{i_k} - x\|}\, y^{i_k} \in C$$

for all $k \ge N$. Letting $k \to \infty$, since $C$ is closed, we obtain that $x + \lambda r = \lim_{k \to \infty} x + \lambda r^{i_k} \in C$.

We next consider closed convex sets whose lineality space is $\{0\}$.

Definition 2.48. If $\mathrm{lin}(C) = \{0\}$ then $C$ is called pointed.

The main result about pointed closed convex sets says that you can decompose them into convex combinations of extreme points and recession directions.

Theorem 2.49. If $C$ is a closed, convex set that is pointed, then $C = \mathrm{conv}(\mathrm{ext}(C)) + \mathrm{rec}(C)$.


◦ ◦ 526 2. D is full-dimensional, i.e., dim(D ) = d.

527 3. 0 is an exposed face of D.

528 4. There exists a compact, convex subset B D 0 such that every d D 0 can be uniquely ⊂ \{ } ∈ \{ } 529 written in the form d = λb, where b B and λ > 0. In particular, D = cone(B). ∈ ◦ ◦ 530 Proof. 1. 2. If D is not full-dimensional, then aff(D ) is a linear space of dimension strictly less than d, ⇒ ◦ ⊥ ◦ ◦ 531 and so aff(D ) = 0 . Since D aff(D ), using Problem3 from “HW for Week III”, and property 2. 6 { } ⊆ ◦ ⊥ ◦ ◦ ◦ ◦ ◦ ⊥ 532 and 4. in Proposition 2.30, we obtain that aff(D ) = aff(D ) (D ) = D. Since aff(D ) is a linear ◦ ⊥ ⊆ 533 space, this implies that aff(D ) lin(D), contradicting the assumption that D is pointed. ⊆ ◦ ◦ ◦ 534 2. 3. By Problem5 from “HW for Week II”, int( D ) = . Choose any y int(D ). Since D = y d ⇒ 6 ∅ ∈ { ∈ 535 R : x, y 0 x D , using Problem3 from “HW for Week II”, we obtain that y, x < 0 for every h i ≤ ∀ ∈ } h i 536 x D 0 . This shows that the exposed face induced by (y, 0) is exactly 0 . ∈ \{ } { } d 537 3. 4. Let 0 be an exposed face induced by (y, 0). Define B := D x R : y, x = 1 . It is clear ⇒ ∩ { ∈ h i − } 538 from the definition that 0 B. Since it is the intersection of a convex cone and an affine set, S is also convex. 6∈ 539 We now show that B is compact. It is the intersection of closed sets, so it is closed. By Theorem 2.47, it 540 suffices to show that rec(B) = 0 . Suppose to the contrary that there exists r rec(B) 0 . Consider any { } ∈ \{ } 541 point x¯ B. Since y, x¯ = 1 and y, x¯ + r = 1, we obtain that y, r = 0. Now, by Proposition 2.42, ∈ h i − h i − h i 542 we obtain that 0 + r D, i.e., r D. But then y, r = 0 contradicts the fact that 0 is an exposed face of ∈ ∈ h i 543 D induced by (y, 0). We next consider any d D 0 . By our assumption, y, d < 0. Thus, setting b = d , we obtain ∈ \{ } h i |hy,di| that y, b = 1 and thus, b B. To show uniqueness, consider b1, b2 B both satisfying the condition. Thish means,i b−2 = λb1 for some∈ λ > 0. Therefore, ∈ λ y, b1 = y, b2 = 1 = y, b1 h i h i − h i 544 showing that λ = 1. This shows uniqueness of b. 545 4. 1. If D is not pointed, then there exists x D 0 such that x D. Moreover, there exists λ1 > 0 ⇒ 1 2 ∈ \{ } − ∈ λ2 1 λ1 2 546 such that x = λ1x B and λ2 > 0 such x = λ2( x) B. Since B is convex, x + x = 0 is ∈ − ∈ λ1+λ2 λ1+λ2 547 in B, contradicting the assumption.

18 548 Definition 2.51. For any closed convex cone D, any subset B D satisfying condition 4. of Proposition 2.50 ⊆ 549 is called a base of D.

550 The proof of Proposition 2.50 also shows the following.

551 Corollary 2.52. Let D be a closed, convex cone. D is pointed if and only if there exists a hyperplane H 552 such that H D is a base of D. ∩ 553 Remark 2.53. In fact, it can be shown that any base of a pointed cone D must be of the form H D for ∩ 554 some hyperplane H. We skip the proof of this fact from these notes.

555 Definition 2.54. Let D be a closed, convex cone. An edge of D is called an extreme ray of D. We say that 556 r D spans an extreme ray if λr : λ 0 is an extreme ray. The set of extreme rays of D will be denoted ∈ { ≥ } 557 by extr(D).

558 Proposition 2.55. Let D be a closed, convex cone and r D 0 . r spans an extreme ray of D if and 1 2 r1+r2 ∈ \{ } 1 2 559 only if for all r , r D such that r = , there exist λ1, λ2 0 such that r = λ1r and r = λ2r. ∈ 2 ≥ 560 Proof. Left as an exercise.

561 Here is an analogue of the Krein-Milman Theorem (Theorem 2.41) for closed convex cones.

562 Theorem 2.56. If D is a pointed, closed, convex cone, then D = cone(extr(D)).

563 Proof. By Proposition 2.50, there exists a base B for D. Since B is compact, B = conv(ext(B)) by Theo- 564 rem 2.41. It is easy to verify that the ray spanned by each r ext(B) is an extreme ray for D, and vice versa, ∈ 565 any extreme ray of D is spanned by some r ext(B). Moreover, using the fact that B = conv(ext(B)), it ∈ 566 immediately follows that D = cone(extr(D)).

567 Slight abuse of notation. For a closed convex set C, we will also use extr(C) to denote extr(rec(C)). We 568 will also say these are the extreme rays of C. 569 Now we can write a sharper version of Theorem 2.49:

570 Corollary 2.57. If C is a closed, convex set that is pointed, then C = conv(ext(C)) + cone(extr(C)).

571 Thus, to describe a pointed closed convex set, we just need to specify its extreme points and its extreme 572 rays. We finally deal with general closed convex sets that are not necessarily pointed. The idea is that the 573 lineality space can be “factored out”. ⊥ 574 Lemma 2.58. If C is a closed convex set, then C lin(C) is pointed. ∩ ⊥ 575 Proof. Define Cˆ = C lin(C) . Cˆ is closed because it is the intersection of two closed sets. By Corollary 2.46, ∩ 576 rec(Cˆ) rec(C). Therefore, lin(Cˆ) = rec(Cˆ) rec(Cˆ) rec(C) rec(C) = lin(C). By the same reasoning, ⊆ ⊥ ⊥ ∩− ⊥⊆ ∩− 577 lin(Cˆ) lin(lin(C) ) = lin(C) . Since lin(C) lin(C) = 0 , we obtain that lin(Cˆ) = 0 . ⊆ ∩ { } { } Theorem 2.59. Let C be a closed convex set and let Cˆ = C lin(C)⊥. Then ∩ C = conv(ext(Cˆ)) + cone(extr(Cˆ)) + lin(C).

0 578 Proof. We first observe that C = Cˆ + lin(C). Indeed, for any x C, we can express x = x + r where 0 ⊥ ⊥ n ∈ 0 579 x lin(C) and r lin(C) (since lin(C) + lin(C) = R ). We also know that x = x r C because ∈ 0∈ − ∈ 580 r lin(C). Thus, x Cˆ and we are done. Cˆ is pointed by Lemma 2.58 and applying Corollary 2.57 gives ∈ ∈ 581 the desired result.

582 Thus, a general closed convex set C can be specified by giving a set of generators for its lineality space ⊥ 583 lin(C), and the extreme points and vectors spanning the extreme rays of the set C lin(C) . In Section 2.5, ∩ 584 we will see that polyhedra are precisely those convex sets C that have a finite number of extreme points and ⊥ 585 extreme rays for C lin(C) . So we see that polyhedra are especially easy to describe intrinsically: simply ∩ 586 specify the finite list of extreme points, vectors spanning the extreme rays and a finite list of generators of 587 lin(C).

19 588 2.3.3 A remark about extrinsic and intrinsic descriptions

589 You may have already observed that although a closed convex set can be represented as the intersection of 2 590 halfspaces, such a representation is not unique. For example, consider the circle in R . You can represent 591 it by intersecting all its tangent halfspaces. On the other hand, if you throw away any finite subset of 592 these halfspaces, you still get the same set. In fact, there is a representation which uses only countably 593 many halfspaces. Thus, the same convex set can have many different representations as the intersection of 594 halfspaces. Moreover, there is usually no way to choose a “canonical” representation, i.e., there is no set of 595 representing halfspaces such that any representation will always include this “canonical” set of halfspaces 596 (this situation will get a little better with polyhedra). On the other hand, the intrinsic representation for a closed convex set is more “canonical”. To begin with, consider the compact case. We express a compact C as conv(ext(C)). We cannot remove any extreme point, because it cannot be represented as the convex combination of other points. Thus, this representation is unique/minimal/canonical in the sense that for any X such that C = conv(X), we must have ext(C) X. With closed, convex sets that have a nontrivial recession cone, the situation is a bit more subtle. First,⊆ there is more flexibility in choosing the representation because one can choose a different set of vectors to span the extreme rays. One might think that this is just a scaling issue and the following result holds: if C is a pointed, closed, convex set, and we consider any “intrinsic” representation

C = conv(E) + cone(R),

d 597 for some sets E,R R , then we must have ⊆ 598 (i) ext(C) E and ⊆ 599 (ii) for every r that spans an extreme ray of rec(C), there must be some nonnegative scaling of r present 600 in R.

601 While the above holds for polyhedra and many other closed, convex sets, it is not true in general. We 602 leave it as an exercise to find a closed, convex set that violates the above claim.

2.4 Combinatorial theorems: Helly-Radon-Carathéodory

We will discuss three foundational results that expose combinatorial aspects of convexity. We begin with Radon's Theorem.

Theorem 2.60 (Radon's Theorem). Let X ⊆ R^d be a set of size at least d + 2. Then X can be partitioned as X = X_1 ⊎ X_2 into sets X_1, X_2 such that conv(X_1) ∩ conv(X_2) ≠ ∅.

Proof. Since we can have at most d + 1 affinely independent points in R^d (see condition 2. in Proposition 2.15), and X has at least d + 2 points, there exists a subset {x^1, . . . , x^k} ⊆ X such that {x^1, . . . , x^k} is affinely dependent. By characterization 5. in Proposition 2.15, there exist multipliers λ_1, . . . , λ_k ∈ R, not all zero, such that λ_1 + . . . + λ_k = 0 and λ_1 x^1 + . . . + λ_k x^k = 0. Define P := {i : λ_i ≥ 0} and N := {i : λ_i < 0}. Since the λ_i's are not all zero and λ_1 + . . . + λ_k = 0, P and N both contain indices whose corresponding multiplier is nonzero. Moreover, Σ_{i∈P} λ_i = Σ_{i∈N} (−λ_i) since λ_1 + . . . + λ_k = 0, and Σ_{j∈P} λ_j x^j = Σ_{j∈N} (−λ_j) x^j since λ_1 x^1 + . . . + λ_k x^k = 0. Thus, we obtain that

    y := Σ_{j∈P} ( λ_j / Σ_{i∈P} λ_i ) x^j = Σ_{j∈N} ( (−λ_j) / Σ_{i∈N} (−λ_i) ) x^j,

showing that y ∈ conv(X_P) ∩ conv(X_N), where X_P = {x^i : i ∈ P} and X_N = {x^i : i ∈ N}. One can now simply define X_1 = X_P and X_2 = X \ X_P.
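The construction in this proof is directly computable. Below is a small numerical sketch (not part of the notes; the helper name radon_partition is ours) that finds a Radon partition by extracting an affine dependence from a null vector via numpy's SVD.

```python
import numpy as np

def radon_partition(X):
    """Given X of shape (k, d) with k >= d + 2 points, return index sets
    (P, N) and a point y in conv(X[P]) and conv(X[N]), following the
    proof of Radon's theorem."""
    k, d = X.shape
    # Affine dependence: lambda != 0 with sum(lambda) = 0 and sum_i lambda_i x^i = 0.
    # Stacking a row of ones turns both conditions into one homogeneous system.
    M = np.vstack([X.T, np.ones(k)])          # shape (d + 1, k), and k > d + 1
    _, _, Vt = np.linalg.svd(M)
    lam = Vt[-1]                              # a null vector of M
    P = [i for i in range(k) if lam[i] >= 0]
    N = [i for i in range(k) if lam[i] < 0]
    y = sum(lam[i] * X[i] for i in P) / sum(lam[i] for i in P)
    return P, N, y

# Any 4 points in R^2 admit a Radon partition; for the unit square the
# two diagonals cross at (0.5, 0.5).
print(radon_partition(np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])))
```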


An application to learning theory: VC-dimension of halfspaces. An important concept in learning theory is the Vapnik–Červonenkis (VC) dimension of a family of subsets [5]. Let 𝓕 be a family of subsets of R^d (possibly infinite).

Definition 2.61. A set X ⊆ R^d is said to be shattered by 𝓕 if for every subset X′ ⊆ X, there exists a set F ∈ 𝓕 such that X′ = F ∩ X. The VC-dimension of 𝓕 is defined as

    sup{m ∈ N : there exists a set X ⊆ R^d of size m that can be shattered by 𝓕}.

Proposition 2.62. Let 𝓕 be the family of halfspaces in R^d. The VC-dimension of 𝓕 is d + 1.

Proof. For any m ≤ d + 1, let X be a set of m affinely independent points. Now, for any subset X′ ⊆ X, we claim that conv(X′) ∩ conv(X \ X′) = ∅ (Verify!!). When we study polyhedra in Section 2.5, we will see that conv(X′) and conv(X \ X′) are compact convex sets. By Problem 7 from "HW for Week II", there exists a separating hyperplane for these two sets, giving a halfspace H such that X′ = H ∩ X.

Now let m ≥ d + 2 and consider any set X with m points. By Theorem 2.60, one can partition X = X_1 ⊎ X_2 such that there exists y ∈ conv(X_1) ∩ conv(X_2). Let X′ = X_1. Consider any halfspace H such that X′ ⊆ H. Since H is convex, y ∈ H. By Problem 11 in "HW for Week IV", we obtain that H ∩ X_2 ≠ ∅. Thus, X cannot be shattered by the family of halfspaces in R^d.

See Chapters 12 and 13 of [2] for more on VC dimension.

An extremely important corollary of Radon's Theorem is known as Helly's theorem, concerning the intersection of a family of convex sets.

Theorem 2.63 (Helly's Theorem). Let X_1, . . . , X_k ⊆ R^d be a family of convex sets. If X_1 ∩ . . . ∩ X_k = ∅, then there is a subfamily X_{i_1}, . . . , X_{i_m} for some m ≤ d + 1, with i_h ∈ {1, . . . , k} for each h = 1, . . . , m, such that X_{i_1} ∩ . . . ∩ X_{i_m} = ∅. Thus, there is a subfamily of size at most d + 1 that already certifies the empty intersection.

Proof. We prove by induction on k. In the base case k ≤ d + 1, we are done. Assume we know the statement to be true for all families of convex sets with k̄ elements, for some k̄ ≥ d + 1. Consider a family of k̄ + 1 convex sets X_1, X_2, . . . , X_{k̄+1}. Define a new family C_1, . . . , C_{k̄}, where C_i = X_i if i ≤ k̄ − 1 and C_{k̄} = X_{k̄} ∩ X_{k̄+1}. Since ∅ = X_1 ∩ . . . ∩ X_{k̄+1} = C_1 ∩ . . . ∩ C_{k̄}, we can use the induction hypothesis on this new family and obtain a subfamily C_{i_1}, . . . , C_{i_m} such that C_{i_1} ∩ . . . ∩ C_{i_m} = ∅ and m ≤ d + 1. If m ≤ d or none of the C_{i_h}, h = 1, . . . , m, equals C_{k̄}, then we are done. So we assume that m = d + 1 and C_{i_m} = C_{k̄} = X_{k̄} ∩ X_{k̄+1}.

To simplify notation, let us relabel everything and define D_h := C_{i_h} = X_{i_h} for h = 1, . . . , d, and D_{d+1} = X_{k̄}, D_{d+2} = X_{k̄+1}. We thus know that D_1 ∩ . . . ∩ D_{d+2} = ∅. We may assume that each subfamily of d + 1 sets from D_1, . . . , D_{d+2} has a nonempty intersection, because otherwise we are done. Let these common intersection points be

    x^i ∈ ∩_{h≠i} D_h,   i = 1, . . . , d + 2.

By Theorem 2.60, there exists a partition {1, . . . , d + 2} = L ⊎ R such that there exists y ∈ conv({x^i}_{i∈L}) ∩ conv({x^i}_{i∈R}). Now, we claim that y ∈ D_h for each h ∈ {1, . . . , d + 2}, arriving at a contradiction to D_1 ∩ . . . ∩ D_{d+2} = ∅. Indeed, consider any h* ∈ {1, . . . , d + 2}. Either L or R does not contain it. Suppose L does not contain it. Then for each i ∈ L, x^i ∈ ∩_{h≠i} D_h ⊆ D_{h*} because i ≠ h*. Since D_{h*} is convex, this shows that y ∈ conv({x^i}_{i∈L}) ⊆ D_{h*}.

A corollary for infinite families is often useful, as long as we assume compactness for the elements in the family.
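As a quick illustration in dimension d = 1: the intervals [0, 2], [1, 3] and [2.5, 4] have empty intersection, and the subfamily {[0, 2], [2.5, 4]} of size 2 = d + 1 already certifies this. Equivalently, if every pair in a finite family of intervals intersects, then the whole family has a common point.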

Corollary 2.64. Let 𝒳 be a (possibly infinite) family of compact, convex sets. If ∩_{X∈𝒳} X = ∅, then there is a subfamily X_{i_1}, . . . , X_{i_m} ∈ 𝒳 for some m ≤ d + 1 such that X_{i_1} ∩ . . . ∩ X_{i_m} = ∅. Thus, there is a subfamily of size at most d + 1 that already certifies the empty intersection.

Proof. By a standard result in topology, if the intersection of an infinite family of compact sets is empty, then there is a finite subfamily whose intersection is also empty. One can now apply Theorem 2.63 to this finite subfamily and obtain a subfamily of size at most d + 1.

Application to centerpoints. Helly's theorem can be used to extend the notion of a median to distributions on R^d with d ≥ 2. Let μ be any probability distribution on R^d. For any point x ∈ R^d, define

    f_μ(x) := inf{μ(H) : H halfspace such that x ∈ H}.

Define the centerpoint or median with respect to μ as any x in the set C_μ := argmax_{x∈R^d} f_μ(x). It can be shown that this set is nonempty for all probability distributions μ. For d = 1, this gives the standard notion of a median, and one can show that for any probability distribution μ on R, f_μ(x) = 1/2 for any centerpoint/median x. In higher dimensions, unfortunately, one cannot guarantee a value of 1/2. In fact, given the uniform distribution on a triangle in R^2, one can show that the centroid x of the triangle is the unique centerpoint, and has value f_μ(x) = 4/9 < 1/2. So can one guarantee any lower bound? Or can we find distributions whose centerpoint values are arbitrarily low? Grünbaum [4] proved a lower bound for the value of a centerpoint, irrespective of the distribution. The only assumption is a mild regularity condition on the distribution: for any halfspace H and any δ > 0, there exists a closed halfspace H′ ⊆ R^d \ H such that μ(H′) ≥ μ(R^d \ H) − δ.

Theorem 2.65. Let μ be any probability distribution on R^d satisfying the above assumption. There exists a point x ∈ R^d such that f_μ(x) ≥ 1/(d+1).

Proof. Given any α ∈ R, let 𝓗_α be the set of all halfspaces H such that μ(H) ≥ α. It is not hard to check that if α < 1, then D_α := ∩_{H∈𝓗_α} H is a compact, convex set. Indeed, for any coordinate indexed by i = 1, . . . , d, there must exist some δ_1^i, δ_2^i such that the halfspaces H_1^i := {x ∈ R^d : x_i ≤ δ_1^i} and H_2^i := {x ∈ R^d : x_i ≥ δ_2^i} satisfy μ(H_1^i) ≥ α and μ(H_2^i) ≥ α. Thus, D_α is contained in the box {x ∈ R^d : δ_2^i ≤ x_i ≤ δ_1^i, i = 1, . . . , d}.

We now claim that for any x ∈ D_α, we have f_μ(x) ≥ 1 − α. To see this, consider any halfspace H = {y ∈ R^d : ⟨a, y⟩ ≤ δ} that contains x ∈ D_α. We will show that μ(R^d \ H) ≤ α, which gives μ(H) ≥ 1 − α and hence the claim. Indeed, if μ(R^d \ H) > α, then by the regularity assumption some closed halfspace H′ contained in R^d \ H also has mass at least α. This would imply that H′ contains all of D_α and, therefore, x ∈ H′. But since H′ ⊆ R^d \ H, this contradicts the fact that x ∈ H.

Therefore, it suffices to show that D_{d/(d+1)+ε} is nonempty for every ε > 0, because using compactness and the fact that D_α ⊆ D_β when α ≤ β, we would have that ∩_{ε>0} D_{d/(d+1)+ε} is nonempty, and any point x in this set will satisfy f_μ(x) ≥ 1/(d+1).

Now let us fix an ε > 0. We want to show that D_{d/(d+1)+ε} is nonempty. By standard measure-theoretic arguments, there exists a ball B centered at the origin such that μ(B) ≥ 1 − ε/2 and D_{d/(d+1)+ε} ⊆ B, because D_α is compact, as observed earlier.

Define 𝒞 = {B ∩ H : H is a closed halfspace with μ(H) ≥ d/(d+1) + ε}. Thus, 𝒞 is a family of compact sets such that D_{d/(d+1)+ε} = ∩{C : C ∈ 𝒞}. For any subset {C_1, . . . , C_{d+1}} ⊆ 𝒞 of size d + 1, we claim

    μ(C_1^c ∪ . . . ∪ C_{d+1}^c) ≤ 1 − (d + 1)ε/2.

This is because each C_i^c = B^c ∪ H_i^c for some halfspace H_i satisfying μ(H_i^c) ≤ 1/(d+1) − ε. Since μ(B^c) ≤ ε/2, we obtain that μ(C_i^c) ≤ 1/(d+1) − ε/2. Therefore,

    μ(C_1 ∩ . . . ∩ C_{d+1}) = 1 − μ(C_1^c ∪ . . . ∪ C_{d+1}^c) ≥ 1 − (1 − (d + 1)ε/2) = (d + 1)ε/2 > 0.

This implies that C_1 ∩ . . . ∩ C_{d+1} ≠ ∅. By Corollary 2.64, ∩{C : C ∈ 𝒞} is nonempty and so D_{d/(d+1)+ε} is nonempty.

Another useful theorem is Carathéodory's theorem, which says that if a point x can be expressed as a convex combination of points from some set X ⊆ R^d, then there is a subset X′ ⊆ X of size at most d + 1 such that x ∈ conv(X′). We state the conical version first, and then the convex version.

Theorem 2.66 (Carathéodory's Theorem – cone version). Let X ⊆ R^d (not necessarily convex) and let x ∈ cone(X). There exists a subset X′ ⊆ X such that X′ is linearly independent (and thus, |X′| ≤ d), and x ∈ cone(X′).

Proof. Since x ∈ cone(X), by Theorem 2.11, we can find a finite set {x^1, . . . , x^k} ⊆ X such that x ∈ cone({x^1, . . . , x^k}). Choose a minimal such set, i.e., there is no strict subset of {x^1, . . . , x^k} whose conical hull contains x. This implies that x = λ_1 x^1 + . . . + λ_k x^k for some λ_i > 0 for each i = 1, . . . , k. We claim that x^1, . . . , x^k are linearly independent. Suppose to the contrary that there exist multipliers γ_1, . . . , γ_k ∈ R, not all zero, such that γ_1 x^1 + . . . + γ_k x^k = 0. By changing the signs of the γ_i's if necessary, we may assume that there exists j ∈ {1, . . . , k} such that γ_j > 0. Define

    θ = min_{j : γ_j > 0} λ_j/γ_j,   λ_i′ = λ_i − θγ_i  for all i = 1, . . . , k.

Observe that λ_i′ ≥ 0 for all i = 1, . . . , k and

    λ_1′ x^1 + . . . + λ_k′ x^k = λ_1 x^1 + . . . + λ_k x^k − θ(γ_1 x^1 + . . . + γ_k x^k) = λ_1 x^1 + . . . + λ_k x^k = x.

However, at least one of the λ_i′'s is zero (corresponding to an index in argmin_{j : γ_j > 0} λ_j/γ_j), contradicting the minimal choice of {x^1, . . . , x^k}.

Theorem 2.67 (Carathéodory's Theorem – convex version). Let X ⊆ R^d (not necessarily convex) and let x ∈ conv(X). There exists a subset X′ ⊆ X such that X′ is affinely independent (and thus, |X′| ≤ d + 1), and x ∈ conv(X′).

Proof. Consider the set Y ⊆ R^{d+1} defined by Y := {(y, 1) : y ∈ X}. Now, x ∈ conv(X) is equivalent to saying that (x, 1) ∈ cone(Y). We get the desired result by applying Theorem 2.66 and condition 4. of Proposition 2.15.
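The reduction step in the proof of Theorem 2.66 is itself an algorithm: repeatedly find a linear dependence among the active points and drive one weight to zero. Here is a hedged numpy sketch (the helper is ours, with a tolerance standing in for exact arithmetic):

```python
import numpy as np

def caratheodory_cone(X, lam, tol=1e-9):
    """Given points X of shape (k, d) and weights lam >= 0 representing
    x = lam @ X, shrink the support of lam until the active points are
    linearly independent, exactly as in the proof of Theorem 2.66."""
    lam = lam.astype(float).copy()
    while True:
        active = np.where(lam > tol)[0]
        A = X[active].T                      # columns are the active points
        if np.linalg.matrix_rank(A) == len(active):
            return lam                       # linearly independent: done
        _, _, Vt = np.linalg.svd(A)          # find gamma with A @ gamma = 0
        gamma = Vt[-1]
        if gamma.max() <= 0:
            gamma = -gamma                   # ensure some gamma_j > 0
        pos = gamma > tol
        theta = np.min(lam[active][pos] / gamma[pos])
        lam[active] -= theta * gamma         # one active weight drops to 0

X = np.array([[1., 0.], [0., 1.], [1., 1.]])
lam = caratheodory_cone(X, np.array([1., 1., 1.]))  # x = (2, 2)
print(lam, lam @ X)                                 # support size <= 2
```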

We can finally furnish the proof of Lemma 2.26.

Proof of Lemma 2.26. Consider a convergent sequence {x^i}_{i∈N} ⊆ cone({a^1, . . . , a^n}) converging to x ∈ R^d. By Theorem 2.66, every x^i is in the conical hull of some linearly independent subset of {a^1, . . . , a^n}. Since there are only finitely many linearly independent subsets of {a^1, . . . , a^n}, the conical hull of one of these subsets contains infinitely many elements of the sequence {x^i}_{i∈N}. Thus, after passing to that subsequence, we may assume that {x^i}_{i∈N} ⊆ cone({ā^1, . . . , ā^k}) where ā^1, . . . , ā^k are linearly independent. For each x^i, there exists λ^i ∈ R_+^k such that x^i = λ_1^i ā^1 + . . . + λ_k^i ā^k. If we denote by A ∈ R^{d×k} the matrix whose columns are ā^1, . . . , ā^k, then x^i = Aλ^i and λ^i = (AᵀA)^{-1}Aᵀx^i for every i ∈ N (A has linearly independent columns, so AᵀA is invertible and (AᵀA)^{-1}Aᵀ is a left inverse of A). Since {x^i}_{i∈N} is a convergent sequence, it is also a bounded set. This implies that {λ^i}_{i∈N} is a bounded set in R_+^k, because it is the image of a bounded set under the linear (and therefore continuous) map (AᵀA)^{-1}Aᵀ. Thus, by Theorem 1.10 there is a convergent subsequence λ^{i_k} → λ ∈ R_+^k. Taking limits,

    x = lim_{k→∞} x^{i_k} = lim_{k→∞} Aλ^{i_k} = Aλ.

Since λ ∈ R_+^k, we find that x ∈ cone({ā^1, . . . , ā^k}) ⊆ cone({a^1, . . . , a^n}).

Here is another result that proves handy in many situations.

Theorem 2.68. Let X ⊆ R^d be a compact set (not necessarily convex). Then conv(X) is compact.

Proof. By Theorem 2.67, every x ∈ conv(X) is the convex combination of some d + 1 points in X. Define the function f : R^d × ⋯ × R^d (d + 1 times) × R^{d+1} → R^d as follows:

    f(y^1, . . . , y^{d+1}, λ) = λ_1 y^1 + . . . + λ_{d+1} y^{d+1}.

It is easily verified that f is a continuous function (each coordinate of f(·) is a bilinear quadratic function of the input). We now observe that conv(X) is the image of X × ⋯ × X (d + 1 times) × Δ^{d+1} under f, where

    Δ^{d+1} := {λ ∈ R_+^{d+1} : λ_1 + . . . + λ_{d+1} = 1}.

Since X and Δ^{d+1} are compact sets, we obtain the result by applying Theorem 1.12.

2.5 Polyhedra

Recall that a polyhedron is any convex set that can be obtained by intersecting a finite number of halfspaces (Definition 2.22). Polyhedra, in a sense, are the nicest convex sets to work with because of this finiteness property. For example, our first result will be that a polyhedron can have only finitely many extreme points.

Even so, one thing to keep in mind is that the same polyhedron can be described as the intersection of two completely different finite families of halfspaces. This brings into sharp focus the non-uniqueness of extrinsic descriptions discussed in Section 2.3.3. Consider the following two systems of halfspaces/inequalities.

    −x_1 ≤ 0                  2x_1 + x_2 ≤ 0
    x_1 + x_2 ≤ 0             −x_1 + x_2 ≤ 0
    x_1 − x_2 ≤ 0             x_1 − 2x_2 ≤ 0
    −x_1 − x_2 − x_3 ≤ 0      x_1 − 2x_3 ≤ 0
    x_2 + x_3 ≤ 5             2x_1 + x_2 + 2x_3 ≤ 10

Both these systems describe the same polyhedron P = conv{(0, 0, 0), (0, 0, 5)} in R^3. However, if a polyhedron is given by its list of extreme points and extreme rays, this ambiguity disappears. Moreover, having these two alternate extrinsic/intrinsic descriptions is very useful, as many properties become easier to see in one description compared to the other. Let us, therefore, start by making some important observations about extreme points and extreme rays of a polyhedron.
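A quick numerical sanity check (not part of the notes; agreement on sampled points is evidence, not a proof of set equality) that the two systems above describe the same segment:

```python
import numpy as np

A1 = np.array([[-1, 0, 0], [1, 1, 0], [1, -1, 0], [-1, -1, -1], [0, 1, 1]], float)
b1 = np.array([0, 0, 0, 0, 5], float)
A2 = np.array([[2, 1, 0], [-1, 1, 0], [1, -2, 0], [1, 0, -2], [2, 1, 2]], float)
b2 = np.array([0, 0, 0, 0, 10], float)

def member(A, b, x, tol=1e-9):
    return bool(np.all(A @ x <= b + tol))

# Both endpoints of the segment P satisfy both systems...
for v in [np.zeros(3), np.array([0., 0., 5.])]:
    assert member(A1, b1, v) and member(A2, b2, v)

# ...and the two systems agree on membership for random points.
rng = np.random.default_rng(0)
assert all(member(A1, b1, x) == member(A2, b2, x)
           for x in rng.uniform(-3, 8, size=(10000, 3)))
```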

Definition 2.69. Let P be a polyhedron. Let A ∈ R^{m×d} with rows a^1, . . . , a^m and b ∈ R^m be such that P = {x ∈ R^d : Ax ≤ b}. Given any x ∈ P, define tight(x, A, b) := {i : ⟨a^i, x⟩ = b_i}. For brevity, when A and b are clear from the context, we will shorten this to tight(x). We also use the notation A_{tight(x)} to denote the submatrix formed by taking the rows of A indexed by tight(x). Similarly, b_{tight(x)} will denote the subvector of b indexed by tight(x).

Theorem 2.70. Let P = {x ∈ R^d : Ax ≤ b} be a polyhedron given by A ∈ R^{m×d} and b ∈ R^m. Let x ∈ P. Then x is an extreme point of P if and only if A_{tight(x)} has rank equal to d, i.e., the rows of A indexed by tight(x) span R^d.

Proof. (⇐) Suppose A_{tight(x)} has rank equal to d; we want to establish that x is an extreme point. Consider any x^1, x^2 ∈ P such that x = (x^1 + x^2)/2. For each i ∈ tight(x), ⟨a^i, x^1⟩ ≤ b_i and similarly ⟨a^i, x^2⟩ ≤ b_i. Now, we observe that

    b_i = ⟨a^i, x⟩ = ⟨a^i, x^1⟩/2 + ⟨a^i, x^2⟩/2 ≤ b_i.

Thus, the inequality must be an equality. Therefore, for each i ∈ tight(x), ⟨a^i, x^1⟩ = b_i and similarly ⟨a^i, x^2⟩ = b_i. In other words, we have that A_{tight(x)}x = b_{tight(x)} and A_{tight(x)}x^j = b_{tight(x)} for j = 1, 2. Since the rank of A_{tight(x)} is d, this system of equations has a unique solution. This means x = x^1 = x^2. This shows that x is extreme.

(⇒) Suppose to the contrary that x is extreme and A_{tight(x)} has rank strictly less than d (note that its rank is at most d because A has d columns). Thus, there exists a non-zero r ∈ R^d such that A_{tight(x)}r = 0. Define

    ε := min{ min_{j : ⟨a^j, r⟩ > 0} (b_j − ⟨a^j, x⟩)/⟨a^j, r⟩ ,  min_{j : ⟨a^j, r⟩ < 0} (b_j − ⟨a^j, x⟩)/(−⟨a^j, r⟩) }.

Note that ε > 0 because whenever ⟨a^j, r⟩ ≠ 0 we have that j ∉ tight(x), and thus all the numerators are strictly positive. We now claim that x^1 := x + εr ∈ P and x^2 := x − εr ∈ P. This would show that x = (x^1 + x^2)/2 with x^1 ≠ x^2 (because r ≠ 0 and ε > 0), contradicting extremality.

To finish the proof, we need to check that Ax^1 ≤ b and Ax^2 ≤ b. We will do the calculations for x^1 – the calculations for x^2 are similar. Consider any j ∈ {1, . . . , m}. If j ∈ tight(x), then since A_{tight(x)}r = 0, we obtain that ⟨a^j, x^1⟩ = ⟨a^j, x⟩ + ε⟨a^j, r⟩ = ⟨a^j, x⟩ = b_j. If j ∉ tight(x), then we consider two cases (when ⟨a^j, r⟩ = 0, the inequality is unaffected):

Case 1: ⟨a^j, r⟩ > 0. Since ε ≤ (b_j − ⟨a^j, x⟩)/⟨a^j, r⟩, we obtain that ⟨a^j, x^1⟩ = ⟨a^j, x⟩ + ε⟨a^j, r⟩ ≤ b_j.

Case 2: ⟨a^j, r⟩ < 0. In this case, ⟨a^j, x^1⟩ = ⟨a^j, x⟩ + ε⟨a^j, r⟩ < b_j, simply because ε > 0 and ⟨a^j, r⟩ < 0.

This immediately gives the following.

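Theorem 2.70 is directly algorithmic: to test whether a point of P is extreme, collect the tight rows and compute their rank. A minimal numpy sketch (our own helper; a tolerance stands in for exact arithmetic):

```python
import numpy as np

def is_extreme_point(A, b, x, tol=1e-9):
    """Test extremality of x in P = {x : A x <= b} via Theorem 2.70."""
    assert np.all(A @ x <= b + tol), "x must lie in P"
    tight = np.abs(A @ x - b) <= tol            # rows with <a^i, x> = b_i
    return np.linalg.matrix_rank(A[tight]) == A.shape[1]

# For the segment P = conv{(0,0,0), (0,0,5)} above: the vertex is extreme,
# the midpoint is not.
A = np.array([[-1., 0, 0], [1, 1, 0], [1, -1, 0], [-1, -1, -1], [0, 1, 1]])
b = np.array([0., 0, 0, 0, 5])
print(is_extreme_point(A, b, np.zeros(3)))              # True
print(is_extreme_point(A, b, np.array([0., 0., 2.5])))  # False
```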
Corollary 2.71. Any polyhedron P ⊆ R^d has a finite number of extreme points.

Proof. Let A ∈ R^{m×d} and b ∈ R^m be such that P = {x ∈ R^d : Ax ≤ b}. From Theorem 2.70, for any extreme point x, A_{tight(x)} has rank d. There are only finitely many subsets I ⊆ {1, . . . , m} such that the submatrix A_I is of rank d. Moreover, for any I ⊆ {1, . . . , m} such that A_I has rank d and A_I x = b_I has a solution, that solution is unique. This shows that there are only finitely many extreme points.

What about the extreme rays? First we define polyhedral cones.

Definition 2.72. A convex cone that is also a polyhedron is called a polyhedral cone.

Proposition 2.73. Let D ⊆ R^d be a convex cone. D is a polyhedral cone if and only if there exists a matrix A ∈ R^{m×d} for some m ∈ N such that D = {x ∈ R^d : Ax ≤ 0}.

Proof. We simply have to show the forward direction; the reverse is easy. Assume D is a polyhedral cone. Thus, it is a polyhedron and so there exist a matrix A ∈ R^{m×d} and b ∈ R^m for some m ∈ N such that D = {x : Ax ≤ b}. Since D is a closed, convex cone (closed because all polyhedra are closed), rec(D) = D. By Problem 1 in "HW for Week IV", we obtain that D = rec(D) = {x : Ax ≤ 0}.

Problem 1 in "HW for Week IV" also immediately implies the following.

Proposition 2.74. If P is a polyhedron, then rec(P) is a polyhedral cone.

Theorem 2.75. Let D = {x : Ax ≤ 0} be a polyhedral cone and let r ∈ D \ {0}. Then r spans an extreme ray if and only if A_{tight(r)} has rank d − 1.

Proof. (⇐) Let A_{tight(r)} have rows ā^1, . . . , ā^k. Each F_i := D ∩ {x : ⟨ā^i, x⟩ = 0}, for i = 1, . . . , k, is an exposed face of D. By Problem 13 in "HW for Week III", F := ∩_{i=1}^k F_i is a face of D. Since A_{tight(r)} has rank d − 1, the set {x : A_{tight(r)}x = 0} is a 1-dimensional linear subspace. Since F ⊆ {x : A_{tight(r)}x = 0}, F is a 1-dimensional face of D (it cannot be 0-dimensional because it contains 0 and r ≠ 0) and hence an extreme ray. Since r ∈ F, we have that r spans F.

(⇒) Suppose r spans the 1-dimensional face F. Recall that this means that any x ∈ F is a scaling of r. The rank of A_{tight(r)} cannot be d, since then r would be an extreme point of D, and hence r = 0 by Problem 3 in "HW for Week IV". This would contradict that r spans an extreme ray of D. Thus, the rank of A_{tight(r)} is at most d − 1.

If it is strictly less, then consider any r′ ∈ {x : A_{tight(r)}x = 0} that is linearly independent of r – such an r′ exists if the rank of A_{tight(r)} is at most d − 2. Define

    ε := min{ min_{j : ⟨a^j, r′⟩ > 0} (−⟨a^j, r⟩)/⟨a^j, r′⟩ ,  min_{j : ⟨a^j, r′⟩ < 0} (−⟨a^j, r⟩)/(−⟨a^j, r′⟩) }.

Note that ε > 0. We now claim that r^1 := r + εr′ ∈ D and r^2 := r − εr′ ∈ D. This would show that r = (r^1 + r^2)/2. Moreover, since r and r′ are linearly independent, r^1, r^2 are not scalings of r. This contradicts Proposition 2.55.

To finish the proof, we need to check that Ar^1 ≤ 0 and Ar^2 ≤ 0. This is the same set of calculations as in the proof of Theorem 2.70.
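The same rank criterion gives a test for extreme rays; a short sketch in the same hedged style as before:

```python
import numpy as np

def spans_extreme_ray(A, r, tol=1e-9):
    """Test whether r in D \\ {0} spans an extreme ray of D = {x : A x <= 0},
    via the rank-(d - 1) criterion of Theorem 2.75."""
    assert np.all(A @ r <= tol) and np.linalg.norm(r) > tol
    tight = np.abs(A @ r) <= tol
    return np.linalg.matrix_rank(A[tight]) == A.shape[1] - 1

# For D = R^2_+ = {x : -x_1 <= 0, -x_2 <= 0}: e^1 spans an extreme ray,
# the interior direction (1, 1) does not.
A = np.array([[-1., 0.], [0., -1.]])
print(spans_extreme_ray(A, np.array([1., 0.])))  # True
print(spans_extreme_ray(A, np.array([1., 1.])))  # False
```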

Analogous to Corollary 2.71, we have:

Corollary 2.76. Any polyhedral cone D has finitely many extreme rays.

2.5.1 The Minkowski-Weyl Theorem

We can now state the first part of the famous Minkowski-Weyl theorem.

Theorem 2.77 (Minkowski-Weyl Theorem – Part I). Let P ⊆ R^d be a polyhedron. Then there exist finite sets V, R ⊆ R^d such that P = conv(V) + cone(R).

Proof. Let L be a finite set of vectors spanning lin(P) (L is taken as the empty set if lin(P) = {0}). Note that lin(P) = cone(L ∪ −L). Define P̂ = P ∩ lin(P)⊥. By Problem 1 (iii) in "HW for Week VI", P̂ is also a polyhedron. By Corollary 2.71, we obtain that V := ext(P̂) is a finite set. Moreover, by Proposition 2.74, rec(P̂) is a polyhedral cone. By Corollary 2.76, extr(rec(P̂)) is a finite set. Define R = extr(rec(P̂)) ∪ L ∪ −L. By Theorem 2.59, P = conv(ext(P̂)) + cone(rec(P̂)) + lin(P) = conv(V) + cone(R).

We now make an observation about polars.

Lemma 2.78. Let V, R ⊆ R^d be finite sets and let X = conv(V) + cone(R). Then X is a closed, convex set.

Proof. conv(V) is compact by Theorem 2.68, and cone(R) is closed by Lemma 2.26. By Problem 6 in "HW for Week II", we obtain that X = conv(V) + cone(R) is closed. Since the Minkowski sum of convex sets is convex (property 3. in Theorem 2.3), X is also convex.

Theorem 2.79. Let V = {v^1, . . . , v^k} ⊆ R^d and R = {r^1, . . . , r^n} ⊆ R^d with k ≥ 1 and n ≥ 0. Let X = conv(V) + cone(R). Then

    X° = { y ∈ R^d : ⟨v^i, y⟩ ≤ 1 for i = 1, . . . , k,  ⟨r^j, y⟩ ≤ 0 for j = 1, . . . , n }.

Proof. Define X̃ := {y ∈ R^d : ⟨v^i, y⟩ ≤ 1 for i = 1, . . . , k, and ⟨r^j, y⟩ ≤ 0 for j = 1, . . . , n}. We first verify that X̃ ⊆ X°, i.e., ⟨x, y⟩ ≤ 1 for all y ∈ X̃ and x ∈ X. By definition of X, we can write x = Σ_{i=1}^k λ_i v^i + Σ_{j=1}^n μ_j r^j for some λ_i, μ_j ≥ 0 such that Σ_{i=1}^k λ_i = 1. Thus,

    ⟨x, y⟩ = Σ_{i=1}^k λ_i ⟨v^i, y⟩ + Σ_{j=1}^n μ_j ⟨r^j, y⟩ ≤ 1,

since ⟨v^i, y⟩ ≤ 1 for i = 1, . . . , k, and ⟨r^j, y⟩ ≤ 0 for j = 1, . . . , n.

To see that X° ⊆ X̃, consider any y ∈ X°. Since ⟨x, y⟩ ≤ 1 for all x ∈ X, we must have ⟨v^i, y⟩ ≤ 1 for i = 1, . . . , k, since v^i ∈ X. Suppose to the contrary that ⟨r^j, y⟩ > 0 for some j ∈ {1, . . . , n}. Then there exists λ > 0 such that ⟨v^1 + λr^j, y⟩ > 1. But this contradicts the fact that ⟨x, y⟩ ≤ 1 for all x ∈ X, because v^1 + λr^j ∈ X by definition of X. Therefore, ⟨r^j, y⟩ ≤ 0 for j = 1, . . . , n and thus, y ∈ X̃.

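For example, for the square X = conv{(1, 1), (1, −1), (−1, 1), (−1, −1)} ⊆ R^2 (so R = ∅), Theorem 2.79 gives X° = {y : ±y_1 ± y_2 ≤ 1} = {y : |y_1| + |y_2| ≤ 1}. With a recession direction, say X = conv{0} + cone{e^1}, the theorem gives X° = {y : y_1 ≤ 0}.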
This has the following corollary.

Corollary 2.80. Let P be a polyhedron. Then P° is a polyhedron.

Proof. If P = ∅, then P° = R^d, which is a polyhedron. Else, by Theorem 2.77, there exist finite sets V, R ⊆ R^d such that P = conv(V) + cone(R), with V ≠ ∅. By Theorem 2.79, P° is the intersection of finitely many halfspaces, and is thus a polyhedron.

We now prove the converse of Theorem 2.77.

Theorem 2.81 (Minkowski-Weyl Theorem – Part II). Let V, R ⊆ R^d be finite sets and let X = conv(V) + cone(R). Then X ⊆ R^d is a polyhedron.

Proof. The case when X is empty is trivial, so we consider nonempty X. Take any t ∈ X and define X′ = X − t. Now, it is easy to see that X is a polyhedron if and only if X′ is a polyhedron (Verify!!). So it suffices to show that X′ is a polyhedron. Note that X′ = conv(V′) + cone(R) where V′ = V − t, which is a nonempty set because V is nonempty (since X is assumed to be nonempty). By Theorem 2.79, (X′)° is a polyhedron. By Lemma 2.78, X′ is a closed, convex set, and also 0 ∈ X′. Therefore, X′ = ((X′)°)° by condition 2. in Theorem 2.30. Applying Corollary 2.80 with P = (X′)°, we obtain that ((X′)°)° = X′ is a polyhedron.

Collecting Theorems 2.77 and 2.81 together, we have the full-blown Minkowski-Weyl Theorem.

Theorem 2.82 (Minkowski-Weyl Theorem – full version). Let X ⊆ R^d. Then the following are equivalent.

(i) (H-description) There exist m ∈ N, a matrix A ∈ R^{m×d} and a vector b ∈ R^m such that X = {x ∈ R^d : Ax ≤ b}.

(ii) (V-description) There exist finite sets V, R ⊆ R^d such that X = conv(V) + cone(R).

A compact version is often useful.

Theorem 2.83 (Minkowski-Weyl Theorem – compact version). Let X ⊆ R^d. Then X is a bounded polyhedron if and only if X is the convex hull of a finite set of points.

Proof. Left as an exercise.

2.5.2 Valid inequalities and feasibility

Definition 2.84. Let X ⊆ R^d (not necessarily convex) and let a ∈ R^d, δ ∈ R. We say that ⟨a, x⟩ ≤ δ is a valid inequality/halfspace for X if X ⊆ H^−(a, δ).

Consider a polyhedron P = {x ∈ R^d : Ax ≤ b} with A ∈ R^{m×d}, b ∈ R^m. For any vector y ∈ R_+^m, the inequality ⟨yᵀA, x⟩ ≤ yᵀb is clearly a valid inequality for P. The next theorem says that all valid inequalities are of this form, up to a translation.

Theorem 2.85. Let P = {x ∈ R^d : Ax ≤ b} with A ∈ R^{m×d}, b ∈ R^m be a nonempty polyhedron. Let c ∈ R^d, δ ∈ R. Then ⟨c, x⟩ ≤ δ is a valid inequality for P if and only if there exists y ∈ R_+^m such that cᵀ = yᵀA and yᵀb ≤ δ.

Proof. (⇐) Suppose there exists y ∈ R_+^m such that cᵀ = yᵀA and yᵀb ≤ δ. The validity of ⟨c, x⟩ ≤ δ is clear from the following relations for any x ∈ P:

    ⟨c, x⟩ = ⟨yᵀA, x⟩ = yᵀ(Ax) ≤ yᵀb ≤ δ,

where the first inequality follows from the fact that x ∈ P implies Ax ≤ b and y is nonnegative.

(⇒) Let ⟨c, x⟩ ≤ δ be a valid inequality for P. Suppose to the contrary that there is no nonnegative solution to cᵀ = yᵀA and yᵀb ≤ δ. This is equivalent to saying that the following system has no solution in y, λ:

    Aᵀy = c,  bᵀy + λ = δ,  y ≥ 0,  λ ≥ 0.

Setting this up in matrix notation, we have no nonnegative solutions to

    [ Aᵀ  0 ] [ y ]   [ c ]
    [ bᵀ  1 ] [ λ ] = [ δ ].

By Farkas' Lemma (Theorem 2.25), there exists u = (ū, u_{d+1}) ∈ R^{d+1} such that

    ūᵀAᵀ + u_{d+1}bᵀ ≤ 0,  u_{d+1} ≤ 0,  and  ūᵀc + u_{d+1}δ > 0.   (2.1)

We now consider two cases:

Case 1: u_{d+1} = 0. Plugging into (2.1), we obtain ūᵀAᵀ ≤ 0, i.e., Aū ≤ 0, and ⟨c, ū⟩ > 0. By Problem 1 in "HW for Week IV", ū ∈ rec(P). Consider any x ∈ P (we assume P is nonempty). Let μ = (1 + δ − ⟨c, x⟩)/⟨c, ū⟩ > 0. Now x + μū ∈ P since ū ∈ rec(P). However, ⟨c, x + μū⟩ = δ + 1 > δ, contradicting that ⟨c, x⟩ ≤ δ is a valid inequality for P.

Case 2: u_{d+1} < 0. By rearranging (2.1), we have Aū ≤ (−u_{d+1})b and ⟨c, ū⟩ > (−u_{d+1})δ. By setting x = ū/(−u_{d+1}), we obtain that Ax ≤ b and ⟨c, x⟩ > δ, contradicting that ⟨c, x⟩ ≤ δ is a valid inequality for P.
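Finding the multiplier y of Theorem 2.85 is a linear-programming feasibility problem: minimize bᵀy subject to Aᵀy = c, y ≥ 0, and compare the optimum with δ. A hedged sketch using scipy's linprog (the helper and the instance are ours):

```python
import numpy as np
from scipy.optimize import linprog

def validity_certificate(A, b, c, delta, tol=1e-9):
    """Return y >= 0 with y^T A = c^T and y^T b <= delta if one exists
    (Theorem 2.85), else None."""
    res = linprog(c=b, A_eq=A.T, b_eq=c, bounds=[(0, None)] * A.shape[0])
    if res.status == 0 and res.fun <= delta + tol:
        return res.x
    return None

# For P = {x in R^2 : x_1 <= 1, x_2 <= 1}, the valid inequality
# x_1 + x_2 <= 2 has certificate y = (1, 1).
A = np.array([[1., 0.], [0., 1.]]); b = np.array([1., 1.])
print(validity_certificate(A, b, np.array([1., 1.]), 2.0))  # approx. [1. 1.]
```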

Definition 2.86. Let c ∈ R^d and δ_1, δ_2 ∈ R. If δ_1 ≤ δ_2, then the inequality/halfspace ⟨c, x⟩ ≤ δ_1 is said to dominate the inequality/halfspace ⟨c, x⟩ ≤ δ_2.

Remark 2.87. Let P = {x ∈ R^d : Ax ≤ b} with A ∈ R^{m×d}, b ∈ R^m be a polyhedron. Then ⟨c, x⟩ ≤ δ is called a consequence of Ax ≤ b if there exists y ∈ R_+^m such that cᵀ = yᵀA and δ = yᵀb. Another way to think of Theorem 2.85 is that it says the geometric property of being a valid inequality is the same as the algebraic property of being a consequence:

[Alternate version of Theorem 2.85] Let P = {x ∈ R^d : Ax ≤ b} be a nonempty polyhedron. Then ⟨c, x⟩ ≤ δ is a valid inequality for P if and only if ⟨c, x⟩ ≤ δ is dominated by a consequence of Ax ≤ b.

A version of Theorem 2.85 for empty polyhedra is also useful. It can be interpreted as the existence of a short certificate of infeasibility for polyhedra.

Theorem 2.88. Let P = {x ∈ R^d : Ax ≤ b} with A ∈ R^{m×d}, b ∈ R^m be a polyhedron. Then P = ∅ if and only if ⟨0, x⟩ ≤ −1 is a consequence of Ax ≤ b.

Proof. It is easy to see that if ⟨0, x⟩ ≤ −1 is a consequence of Ax ≤ b then P = ∅, because any point that satisfies Ax ≤ b must satisfy every consequence of it, and no point satisfies ⟨0, x⟩ ≤ −1.

So now assume P = ∅. This means that there is no solution to Ax ≤ b. This is equivalent to saying that there is no solution to Ax^1 − Ax^2 + s = b with x^1, x^2, s ≥ 0.² In matrix notation, this means there are no nonnegative solutions to

    [ A  −A  I ] (x^1, x^2, s) = b.

By Farkas' Lemma (Theorem 2.25), there exists u ∈ R^m such that

    uᵀA ≤ 0,  uᵀ(−A) ≤ 0,  u ≤ 0,  and  uᵀb > 0.

Define y = −u/(uᵀb) ≥ 0. Then yᵀA = 0 and yᵀb = −1, showing that ⟨0, x⟩ ≤ −1 is a consequence of Ax ≤ b.

²This is easily seen by the transformation x = x^1 − x^2.

2.5.3 Faces of polyhedra

Faces of polyhedra are very structured. Firstly, every face is an exposed face – something that is not true for general closed, convex sets. Secondly, there is an algebraic characterization of faces in terms of the describing inequalities of a polyhedron. This is the content of Theorem 2.90 below. First, we make some simpler observations.

Theorem 2.89. The following are both true.

1. Every face of a polyhedron is a polyhedron.

2. Every polyhedron has finitely many faces.

Proof. We prove the theorem for the case of bounded polyhedra, i.e., polytopes. The extension of this proof idea to the unbounded case is left as an exercise.

Let P be any polytope and let F ⊆ P be a face. If x ∈ F is an extreme point of F, then by Problem 15 in "HW for Week IV", {x} is a face of P and therefore x is an extreme point of P. Since P has finitely many extreme points by Corollary 2.71, F has only finitely many extreme points. By Problem 14 in "HW for Week III", F is closed, and since P is compact, so is F. By the Krein-Milman theorem (Theorem 2.41), F is the convex hull of its finitely many extreme points. By the Minkowski-Weyl theorem (Theorem 2.81), F is a polyhedron. Moreover, since we showed that any face is the convex hull of some subset of extreme points of P, there can only be finitely many faces, since P has finitely many extreme points.

Theorem 2.90. Let P = {x ∈ R^d : Ax ≤ b} with A ∈ R^{m×d}, b ∈ R^m. Let F ⊆ P be such that F ≠ ∅, P. The following are equivalent.

(i) F is a face of P.

(ii) F is an exposed face of P.

(iii) There exists a subset I ⊆ {1, . . . , m} such that F = {x ∈ P : A_I x = b_I}.

Proof. (i) ⇒ (ii). Consider x̄ ∈ relint(F) (which exists by Exercise 4). Since F is a proper face, by Theorem 2.40, x̄ ∈ relbd(P). By Theorem 2.39, there exists a supporting hyperplane at x̄ given by ⟨a, x⟩ ≤ δ. Let {y ∈ P : ⟨a, y⟩ = δ} be the corresponding exposed face. Since x̄ ∈ relint(F), one can show that F ⊆ {y ∈ P : ⟨a, y⟩ = δ} (Verify!!). Thus, there exists an exposed face containing F. Let F′ be the minimal (with respect to set inclusion) exposed face of P that contains F, i.e., for any other exposed face F″ ⊇ F, we have F′ ⊆ F″. Note that such a minimal exposed face exists because we have only finitely many faces, by Theorem 2.89. Let this exposed face F′ be defined by the valid inequality ⟨c^1, x⟩ ≤ δ_1 for P.

If F = F′, then we are done because F′ is an exposed face. Otherwise, F ⊊ F′, and so F is a face of F′. Therefore, x̄ ∈ relbd(F′). Applying Theorem 2.39 to F′ and x̄, we obtain c^2 ∈ R^d, δ_2 ∈ R such that F ⊆ F′ ∩ {y ∈ R^d : ⟨c^2, y⟩ = δ_2}, and there exists ȳ ∈ F′ such that ⟨c^2, ȳ⟩ < δ_2. Using Theorem 2.77, we find finite sets V, R such that P = conv(V) + cone(R). Notice that since P ⊆ H^−(c^1, δ_1), we must have ⟨c^1, v⟩ ≤ δ_1 for all v ∈ V and ⟨c^1, r⟩ ≤ 0 for all r ∈ R.

Claim 1. One can always choose λ ≥ 0 such that λc^1 + c^2 and λδ_1 + δ_2 satisfy

    ⟨λc^1 + c^2, v⟩ ≤ λδ_1 + δ_2 for all v ∈ V,   ⟨λc^1 + c^2, r⟩ ≤ 0 for all r ∈ R.

Proof of Claim. The relations can be rearranged to say

    ⟨c^2, v⟩ − δ_2 ≤ λ(δ_1 − ⟨c^1, v⟩) for all v ∈ V,   ⟨c^2, r⟩ ≤ λ(−⟨c^1, r⟩) for all r ∈ R.   (2.2)

First, recall that 0 ≤ δ_1 − ⟨c^1, v⟩ for all v ∈ V and 0 ≤ −⟨c^1, r⟩ for all r ∈ R. Notice that since F′ ⊆ H^−(c^2, δ_2), if ⟨c^1, v⟩ = δ_1 for some v ∈ V, this means that v ∈ F′ and therefore ⟨c^2, v⟩ ≤ δ_2. Similarly, if ⟨c^1, r⟩ = 0 for some r ∈ R, this means that r ∈ rec(F′) and therefore ⟨c^2, r⟩ ≤ 0. Thus, the following choice of λ,

    λ := max{ 0,  max_{v∈V : δ_1−⟨c^1,v⟩>0} (⟨c^2, v⟩ − δ_2)/(δ_1 − ⟨c^1, v⟩),  max_{r∈R : −⟨c^1,r⟩>0} ⟨c^2, r⟩/(−⟨c^1, r⟩) },

satisfies (2.2).

Using the λ from the above claim, X = P ∩ {y ∈ R^d : ⟨λc^1 + c^2, y⟩ = λδ_1 + δ_2} is an exposed face of P containing F. Moreover, ⟨λc^1 + c^2, y⟩ ≤ λδ_1 + δ_2 is valid for F′, because the inequality is a nonnegative combination of the two inequalities ⟨c^1, y⟩ ≤ δ_1 and ⟨c^2, y⟩ ≤ δ_2, both valid for F′. Therefore, X ⊆ F′. But ȳ satisfies this inequality strictly, because it satisfies ⟨c^2, ȳ⟩ < δ_2, so X ⊊ F′. This contradicts the minimality of F′.

(ii) ⇒ (iii). Let c ∈ R^d, δ ∈ R be such that F = P ∩ {x : ⟨c, x⟩ = δ}. By Theorem 2.85, there exists y ∈ R_+^m such that cᵀ = yᵀA and δ ≥ yᵀb. Consider any x ∈ F (recall that F is assumed to be nonempty). Then

    δ = ⟨c, x⟩ = ⟨yᵀA, x⟩ = yᵀAx ≤ yᵀb ≤ δ.   (2.3)

Thus, equality must hold everywhere and yᵀb = δ. Moreover, yᵀAx = yᵀb for all x ∈ F, which implies that yᵀ(Ax − b) = 0 for all x ∈ F. This last relation says that for any i ∈ {1, . . . , m}, if y_i > 0 then ⟨a^i, x⟩ = b_i for every x ∈ F. Thus, setting I = {i : y_i > 0}, we immediately obtain that A_I x = b_I for all x ∈ F. Conversely, consider any x̄ ∈ P satisfying A_I x̄ = b_I. Then yᵀAx̄ = yᵀb, since y_i = 0 for i ∉ I. Therefore, ⟨c, x̄⟩ = yᵀAx̄ = yᵀb = δ, and thus x̄ ∈ P ∩ {x : ⟨c, x⟩ = δ} = F.

(iii) ⇒ (i). By definition, F = ∩_{i∈I} F_i, where F_i = {x ∈ P : ⟨a^i, x⟩ = b_i}. By definition, each F_i is an exposed face, and thus a face. By Problem 13 in "HW for Week III", the intersection of faces is a face and thus, F is a face.

2.5.4 Implicit equalities, dimension of polyhedra and facets

Given a polyhedron P = {x : Ax ≤ b}, how can we decide the dimension of P? The concept of implicit equalities is important for this.

Definition 2.91. Let A ∈ R^{m×d} and b ∈ R^m. We say that the inequality ⟨a^i, x⟩ ≤ b_i for some i ∈ {1, . . . , m} is an implicit equality for the polyhedron P = {x : Ax ≤ b} if P ⊆ {x : ⟨a^i, x⟩ = b_i}, i.e., P ⊆ H(a^i, b_i). We denote the subsystem of implicit equalities of Ax ≤ b by A⁼x ≤ b⁼. We will also use A⁺x ≤ b⁺ to denote the inequalities in Ax ≤ b that are NOT implicit equalities.

Note that for each i such that ⟨a^i, x⟩ ≤ b_i is not an implicit equality, there exists x ∈ P such that ⟨a^i, x⟩ < b_i.

Exercise 6. Let P = {x : Ax ≤ b}. Show that there exists x̄ ∈ P such that A⁼x̄ = b⁼ and A⁺x̄ < b⁺. Show the stronger statement that relint(P) = {x ∈ R^d : A⁼x = b⁼, A⁺x < b⁺}.

We can completely characterize the affine hull of a polyhedron, and consequently its dimension, in terms of the implicit equalities.

Proposition 2.92. Let A ∈ R^{m×d}, b ∈ R^m and P = {x : Ax ≤ b}. Then

    aff(P) = {x ∈ R^d : A⁼x = b⁼} = {x ∈ R^d : A⁼x ≤ b⁼}.

Proof. It is easy to verify that aff(P) ⊆ {x ∈ R^d : A⁼x = b⁼} ⊆ {x ∈ R^d : A⁼x ≤ b⁼}. We show that {x ∈ R^d : A⁼x ≤ b⁼} ⊆ aff(P). Consider any y satisfying A⁼y ≤ b⁼. Using Exercise 6, choose any x̄ ∈ P such that A⁼x̄ = b⁼ and A⁺x̄ < b⁺. If A⁺y ≤ b⁺, then y ∈ P ⊆ aff(P) and we are done. Otherwise, set

    μ := min_{i : ⟨a^i, y⟩ > b_i} (b_i − ⟨a^i, x̄⟩)/(⟨a^i, y⟩ − ⟨a^i, x̄⟩).

Observe that since ⟨a^i, y⟩ > b_i > ⟨a^i, x̄⟩ for each i considered in the minimum, we have 0 < μ < 1. One can check that (1 − μ)x̄ + μy ∈ P. This shows that y ∈ aff(P), because y is on the line joining two points in P, namely x̄ and (1 − μ)x̄ + μy.

Combined with part 4. of Theorem 2.16, this gives the following corollary.

Corollary 2.93. Let A ∈ R^{m×d}, b ∈ R^m and P = {x : Ax ≤ b}. Then

    dim(P) = d − rank(A⁼).
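Corollary 2.93 suggests a computational recipe: detect the implicit equalities (one small LP per row) and subtract the rank of that subsystem from d. A hedged sketch with scipy (the helper name and the instance are ours):

```python
import numpy as np
from scipy.optimize import linprog

def dimension(A, b, tol=1e-9):
    """dim(P) for a nonempty P = {x : A x <= b}, via Corollary 2.93.
    Row i is an implicit equality iff min <a^i, x> over P equals b_i;
    an unbounded LP means the row is certainly not implicit."""
    implicit = []
    for i in range(A.shape[0]):
        res = linprog(c=A[i], A_ub=A, b_ub=b,
                      bounds=[(None, None)] * A.shape[1])
        if res.status == 0 and res.fun >= b[i] - tol:
            implicit.append(i)
    rank_eq = np.linalg.matrix_rank(A[implicit]) if implicit else 0
    return A.shape[1] - rank_eq

# The segment P = conv{(0,0,0), (0,0,5)} from above has dimension 1.
A = np.array([[-1., 0, 0], [1, 1, 0], [1, -1, 0], [-1, -1, -1], [0, 1, 1]])
b = np.array([0., 0, 0, 0, 5])
print(dimension(A, b))  # 1
```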

As we have seen before, a given description P = {x : Ax ≤ b} of a polyhedron may be redundant, in the sense that we can remove some of the inequalities and still have the same set P. This motivates the following definition.

Definition 2.94. Let A ∈ R^{m×d} and b ∈ R^m. We say that the inequality ⟨a^i, x⟩ ≤ b_i for some i ∈ {1, . . . , m} is redundant for the polyhedron P = {x : Ax ≤ b} if P = {x : A_{−i}x ≤ b_{−i}}, where A_{−i} denotes the matrix A without row i and b_{−i} is the vector b with the i-th coordinate removed. Otherwise, if P ⊊ {x : A_{−i}x ≤ b_{−i}}, then ⟨a^i, x⟩ ≤ b_i is said to be irredundant for P. The system Ax ≤ b is said to be an irredundant system if every inequality is irredundant for P = {x : Ax ≤ b}.

The following characterization of facets of a polyhedron is quite useful, especially in combinatorial optimization and polyhedral combinatorics.

Theorem 2.95. Let P = {x ∈ R^d : Ax ≤ b} be nonempty, with A ∈ R^{m×d}, b ∈ R^m giving an irredundant system. Let F ⊆ P. The following are equivalent.

(i) F is a facet of P, i.e., F is a face with dim(F) = dim(P) − 1.

(ii) F is a maximal, proper face of P, i.e., for any proper face F′ ⊇ F, we must have F′ = F.

(iii) There exists a unique i ∈ {1, . . . , m} such that F = {x ∈ P : ⟨a^i, x⟩ = b_i} and ⟨a^i, x⟩ ≤ b_i is not an implicit equality.

Proof. (i) ⇒ (ii). Suppose to the contrary that there exists a proper face F′ ⊋ F. Observe that F is a face of F′ by Problem 15 in "HW for Week IV", and so F is a proper face of F′. By Lemma 2.35, dim(F′) > dim(F) = dim(P) − 1. So dim(F′) = dim(P). This contradicts the fact that F′ is a proper face, by Lemma 2.35.

(ii) ⇒ (iii). By Theorem 2.90, there exists a subset of indices I ⊆ {1, . . . , m} such that F = {x ∈ R^d : Ax ≤ b, A_I x = b_I}. If all the inequalities indexed by I are implicit equalities for P, then F = P, contradicting the assumption that F is a proper face. So there exists i ∈ I such that ⟨a^i, x⟩ ≤ b_i is not an implicit equality. Let F′ = {x ∈ P : ⟨a^i, x⟩ = b_i} be the face defined by this inequality; since ⟨a^i, x⟩ ≤ b_i is not an implicit equality, F′ is a proper face of P. Also observe that F ⊆ F′. Hence F = F′ = {x ∈ P : ⟨a^i, x⟩ = b_i}. To show uniqueness of i, we exhibit x′ ∈ F with the following property: for any j ≠ i such that ⟨a^j, x⟩ ≤ b_j is not an implicit equality, we have ⟨a^j, x′⟩ < b_j. To see this, let x^1 ∈ P be such that A⁼x^1 = b⁼ and A⁺x^1 < b⁺ (such an x^1 exists by Exercise 6). Since Ax ≤ b is an irredundant system, if we remove the inequality indexed by i, then we get some new points that satisfy the rest of the inequalities but violate ⟨a^i, x⟩ ≤ b_i. More precisely, there exists x^2 ∈ R^d such that A⁼x^2 ≤ b⁼, A⁺_{−i}x^2 ≤ b⁺_{−i} and ⟨a^i, x^2⟩ > b_i, where A⁺_{−i}x ≤ b⁺_{−i} denotes the system A⁺x ≤ b⁺ without the inequality indexed by i. Since ⟨a^i, x^1⟩ < b_i and ⟨a^i, x^2⟩ > b_i, there exists a convex combination x′ of x^1, x^2 such that ⟨a^i, x′⟩ = b_i. Since A⁼x^1 = b⁼ and A⁼x^2 ≤ b⁼, we have A⁼x′ ≤ b⁼. Moreover, since A⁺x^1 < b⁺ and A⁺_{−i}x^2 ≤ b⁺_{−i}, we must have that for any j ≠ i indexing an inequality in A⁺x ≤ b⁺, x′ satisfies ⟨a^j, x′⟩ < b_j. In particular, x′ ∈ P, and hence A⁼x′ = b⁼, since these are implicit equalities. Thus, we are done: x′ witnesses that no j ≠ i with a non-implicit inequality can define F.

(iii) ⇒ (i). By Theorem 2.90, F is a face. We now establish that dim(F) = dim(P) − 1. Let 𝒥 denote the set of indices that index inequalities in Ax ≤ b that are not implicit equalities. Since there exists a unique i such that F = {x ∈ P : ⟨a^i, x⟩ = b_i}, this means that for any j ∈ 𝒥 \ {i}, there exists x^j ∈ F such that ⟨a^j, x^j⟩ < b_j. Now let x′ = (1/(|𝒥| − 1)) Σ_{j∈𝒥\{i}} x^j, and observe that x′ ∈ F (since F is convex) and for any j ∈ 𝒥 \ {i}, we have ⟨a^j, x′⟩ < b_j. Let us describe the polyhedron F by the system Ãx ≤ b̃ that appends the inequality ⟨−a^i, x⟩ ≤ −b_i to the system Ax ≤ b.

Claim 2. rank(Ã⁼) = rank(A⁼) + 1.

Proof. The properties of x′ show that the matrix Ã⁼ is simply the matrix A⁼ appended with a^i (and its negative −a^i, which does not affect the rank). So it suffices to show that a^i is not a linear combination of the rows of A⁼. Suppose to the contrary that (a^i)ᵀ = yᵀA⁼ for some y ∈ R^k, where k is the number of rows of A⁼. If b_i < yᵀb⁼, then P is empty, because any x ∈ P satisfies A⁼x = b⁼ and therefore must satisfy yᵀA⁼x = yᵀb⁼, and this contradicts yᵀA⁼x = ⟨a^i, x⟩ ≤ b_i. If b_i ≥ yᵀb⁼, then ⟨a^i, x⟩ ≤ b_i is redundant for P, as every x satisfying A⁼x = b⁼ satisfies ⟨a^i, x⟩ ≤ b_i.

Using Corollary 2.93, we obtain that dim(F) = d − rank(Ã⁼) = d − rank(A⁼) − 1 = dim(P) − 1.

A consequence of this characterization of facets is that full-dimensional polyhedra have a unique system describing them, up to scaling.

Definition 2.96. We say that the inequality ⟨a, x⟩ ≤ δ is equivalent to the inequality ⟨a′, x⟩ ≤ δ′ if there exists λ > 0 such that a′ = λa and δ′ = λδ. Equivalent inequalities define the same halfspace, i.e., H^−(a, δ) = H^−(a′, δ′).

Theorem 2.97. Let P be a full-dimensional polyhedron. Let A ∈ R^{m×d}, A′ ∈ R^{p×d}, b ∈ R^m and b′ ∈ R^p be such that Ax ≤ b and A′x ≤ b′ are both irredundant systems describing P, i.e.,

    {x ∈ R^d : Ax ≤ b} = {x ∈ R^d : A′x ≤ b′} = P.

Then both systems are the same up to permutation and scaling. More precisely, the following holds:

1. m = p.

2. There exists a permutation σ : {1, . . . , m} → {1, . . . , m} such that for each i ∈ {1, . . . , m}, ⟨a^i, x⟩ ≤ b_i is equivalent to ⟨a′^{σ(i)}, x⟩ ≤ b′_{σ(i)}.

Proof. Left as an exercise.

930 We now turn our attention to convex functions, as a step towards optimization. In this context, we will need 931 to sometimes talk about the extended real numbers R , + . One reason is that in optimization ∪ {−∞ ∞} 932 problems, many times a supremum may be + or an infimum may be , and using them on the same ∞ −∞ 933 footing as the reals makes certain statements nicer, without having to exclude annoying special cases. For 934 this, one needs to set up some convenient rules for arithmetic over R , + : ∪ {−∞ ∞}

935 x + = for any x R + . • ∞ ∞ ∈ ∪ { ∞} 936 x(+ ) = + for all x > 0. We will avoid situations where we need to consider 0 (+ ). • ∞ ∞ · ∞ 937 x < for all x R. • ∞ ∈

33 938 3.1 General properties, epigraphs, subgradients Definition 3.1. A function f : Rd R + is called convex if → ∪ { ∞} f(λx + (1 λ)y) λf(x) + (1 λ)f(y), − ≤ − for all x, y Rd and λ (0, 1). If the inequality is strict for all x = y, then the function is called strictly convex. The∈domain (sometimes∈ also called effective domain) of f is6 defined as d dom(f) := x R : f(x) < + . { ∈ ∞} 939 A function g is said to be (strictly) concave if g is (strictly) convex. − 940 The domain of a convex function is easily seen to be convex. d 941 Proposition 3.2. Let f : R R + be a convex function. Then dom(f) is a convex set. → ∪ { ∞} 942 Proof. Left as an exercise.

943 The following subfamily of convex functions is nicer to deal with from an algorithmic perspective. Definition 3.3. A function f : Rd R + is called strongly convex with modulus of strong convexity c > 0 if → ∪ { ∞} 1 f(λx + (1 λ)y)+ λf(x) + (1 λ)f(y) cλ(1 λ) x y 2, − ≤ − − 2 − k − k d 944 for all x, y R and λ (0, 1). ∈ ∈ 945 The following proposition sheds some light on strongly convex functions. d 946 Proposition 3.4. A function f : R R + is strongly convex with modulus of strong convexity → ∪ { 1 ∞} 2 947 c > 0 if and only if the function g(x) := f(x) c x is convex. − 2 k k

948 Convex functions have a natural convex set associated with them, called the epigraph. Many properties of 949 convex functions can be obtained by just analyzing the corresponding epigraph and using all the technology 950 built in Section2. We give the formal definition for general functions below; very informally, it is “the region 951 above the graph of a function”. Definition 3.5. Let f : Rd R + be any function (not necessarily convex). The epigraph of f is defined as → ∪ { ∞} n epi(f) := (x, t) R R : f(x) t . { ∈ × ≤ } d 952 Note that epi(f) R R, so it lives in a space whose dimension is one more than the space over which ⊆ × 953 the function is defined, just like the graph of the function. Note also that the epigraph is nonempty 954 if and only if the function is not identically equal to + . Convex functions are precisely those ∞ 955 functions whose epigraphs are convex. d 956 Proposition 3.6. Let f : R R + be any function. f is convex if and only if epi(f) is a convex set. → ∪ { ∞} 1 2 957 Proof. ( ) Consider any (x , t1), (x , t2) epi(f), and any λ (0, 1). ⇒ ∈ ∈ 958 The result is a consequence of the following sequence of implications:

1 2 (x , t1) epi(f), (x , t2) epi(f), f is convex 1 ∈ 2 ∈ 1 2 1 2 f(x ) t1, f(x ) t2, f(λx + (1 λ)x ) λf(x ) + (1 λ)f(x ) ⇒ 1 ≤ 2 ≤ − ≤ − f(λx + (1 λ)x ) λt1 + (1 λ)t2 ⇒ 1 − 2 ≤ − (λx + (1 λ)x , λt1 + (1 λ)t2) epi(f) ⇒ − − ∈ 1 2 d 1 2 1 959 ( ) Consider the any x , x R and λ (0, 1). We wish to show that f(λx +(1 λ)x ) λf(x )+(1 ⇐2 1 ∈2 ∈ − 1 ≤ 2 − 960 λ)f(x ). If f(x ) = + or f(x ) = + , then relation holds trivially. So we assume f(x ), f(x ) < + . 1 1 ∞ 2 2 ∞ 1 ∞ 961 The points (x , f(x )), (x , f(x )) both lie in epi(f). By convexity of epi(f), we have that (λx + (1 2 1 2 1 2 1 2− 962 λ)x , λf(x ) + (1 λ)f(x )) epi(f). This implies that f(λx + (1 λ)x ) λf(x ) + (1 λ)f(x ), − ∈ − ≤ − 963 showing that f is convex.

34 964 Just like the class of closed, convex sets are nicer to deal with compared to sets that are simply convex 965 but not closed (mainly because of the separating/supporting hyperplane theorem), it will be convenient to 966 isolate a similar class of “nicer” convex functions.

967 Definition 3.7. A function is said to be a closed, convex function if its epigraph is a closed, convex set.

968 One can associate another family of convex sets with a convex function.

Definition 3.8. Let f : Rd R + be any function. Given α R, the α-sublevel set of f is the set → ∪ { ∞} ∈ d fα := x R : f(x) α . { ∈ ≤ }

969 The following can be verified by the reader.

970 Proposition 3.9. All sublevel sets of a convex function are convex sets.

971 The converse of Proposition 3.9 is not true. Functions whose sublevel sets are all convex are called 972 quasi-convex.

Example 3.10. 1. Indicator function. For any subset X Rd, define ⊆  0 if x X I (x) := X + if x ∈ X ∞ 6∈ 973 Then IX is convex if and only if X is convex.

d 974 2. Linear/Affine function. Let a R and δ R. Then the function x a, x + δ is called an affine ∈ ∈ 7→ h i 975 function (if δ = 0, this is a linear function). It is easily verified that affine functions are convex.

3. Norms and Distances. Let N : Rd R be a norm (see Definition 1.1). Then N is convex (Verify !!). Let C be a nonempty convex set. Then→ the distance function associated with the norm N, defined as

N dC (x) := inf N(y x) y∈C −

976 is a convex function.

4. Maximum of affine functions/Piecewise linear function/Polyhedral function. Let a1,..., am Rd and ∈ δ1, . . . , δm R. The function ∈ i f(x) := max ( a , x + δi) i=1,...,m h i is a convex function. Let us verify this. Consider any x1, x2 Rd and λ (0, 1). Then, ∈ ∈ 1 2 i 1 2 f(λx + (1 λ)x ) = maxi=1,...,m( a , λx + (1 λ)x + δi) − h i 1 − i i 2  = maxi=1,...,m λ( a , x + δi) + (1 λ)( a , x + δi) h i 1i  − h i i 2  maxi=1,...,m λ( a , x + δi) + maxi=1,...,m (1 λ)( a , x + δi) ≤ h i 1 i − i h 2 i = λ maxi=1,...,m( a , x + δi) + (1 λ) maxi=1,...,m( a , x + δi) = λf(x1) + (1 hλ)f(x2i) − h i − 977 The inequality follows from the fact that if `1, . . . , `m and u1, . . . , um are two sets of m real numbers 978 for some m N, then maxi=1,...,m(`i + ui) maxi=1,...,m `i + maxi=1,...,m ui. ∈ ≤ 979 An important consequence of the definition of convexity for functions is Jensen’s inequality which sees 980 its uses in diverse areas of science and engineering.

Theorem 3.11. [Jensen’s Inequality] Let f : Rd R + be any function. Then f is convex if and only 1 n d → ∪{ ∞} if for any finite set of points x ,..., x R and λ1, . . . , λn > 0 such that λ1 + ... + λn = 1, the following holds: ∈ 1 n 1 n f(λ1x + ... + λnx ) λ1f(x ) + ... + λnf(x ). ≤

35 981 Proof. ( ) Just use the hypothesis with n = 2. ⇐ i i 982 ( ) If any f(x ) is + , then the inequality holds trivially. So we assume that each f(x ) < + . By ⇒ ∞ i i ∞ 983 Proposition 3.6, epi(f) is a convex set. For each i = 1, . . . , m, the point (x , f(x )) epi(f) by definition of Pm i i 1 n ∈ 1 n 984 epi(f). Since epi(f) is convex, i=1 λi(x , f(x )) epi(f), i.e., (λ1x +...+λnx , λ1f(x )+...+λnf(x )) 1 n ∈1 n ∈ 985 epi(f). Therefore, f(λ1x + ... + λnx ) λ1f(x ) + ... + λnf(x ). ≤ 986 Recall Theorem 2.3 that showed convexity of a set is preserved under certain operations. We would like 987 to develop a similar result for convex functions. d 988 Theorem 3.12. [Operations that preserve the property of being a (closed) convex function] Let fi : R → 989 R + , i I be a family of (closed) convex functions where the index set I is potentially infinite. The ∪ { ∞} ∈ 990 following are all true.

991 1. (Nonnegative combinations). If I is a finite set, and αi 0, i I is a corresponding set of nonnegative P ≥ ∈ 992 reals, then i∈I αifi is a (closed) convex function.

993 2. (Taking supremums). The function defined as g(x) := supi∈I fi(x) is a (closed) convex function (even 994 when I is uncountable infinite).

m×d m m 995 3. (Pre-Composition with an affine function). Let A R and b R and let f : R R be any m ∈ ∈ d → 996 (closed) convex function on R . Then g(x) := f(Ax + b) as a function from R R is a (closed) → 997 convex function.

998 4. (Post-Composition with an increasing convex function). Let h : R R + be a (closed) convex → ∪d { ∞} 999 function that is also increasing, i.e., h(x) h(y) when x y. Let f : R R + be a (closed) ≥ ≥ → ∪ { ∞} 1000 convex function. We adopt the convention that h(+ ) = + . Then h(f(x)) as a function from d ∞ ∞ 1001 R R is a (closed) convex function. → P d Proof. 1. Let F = αifi. Consider any x, y R and λ (0, 1). Then i∈I ∈ ∈ P F (λx + (1 λ)y) = αifi(λx + (1 λ)y) − Pi∈I − αi(λfi(x) + (1 λ)fi(y)) ≤ Pi∈I − P = λ i∈I αifi(x) + (1 λ) i∈I αifi(y) = λF (x) + (1 λ)F (y−) − 1002 We use the nonnegativity of αi in the inequality on the second displayed line above. We omit the proof 1003 of closedness of the function.

1004 2. The main observation is that epi(g) = i∈I epi(fi) because g(x) t if and only if fi(x) t for all ∩ ≤ ≤ 1005 i I. Since the intersection of (closed) convex sets is a (closed) convex set (part 1. of Theorem 2.3), ∈ 1006 we have the result.

d 1007 3. The main observation is that for any x R and t R,(x, t) epi(g) if and only if (Ax+b, t) epi(f). d m ∈ ∈ ∈ −1∈ 1008 Define the affine map T : R R R R as follows: T (x, t) = (Ax+b, t). Then epi(g) = T (epi(f)). × → × 1009 Since the pre-image of a (closed) convex set with respect to an affine transformation is (closed) convex 1010 (part 4. of Theorem 2.3), we obtain that epi(g) is (closed) convex.

1011 4. Left as an exercise.

1012

1013 We can now see some more interesting examples of convex functions. i d Example 3.13. 1. Let a R and δi R, i I for some index set I. Then the function ∈ ∈ ∈ i f(x) := sup( a , x + δi) i∈I h i

1014 is closed convex. This is an alternate proof of the convexity of the maximum of finitely many affine 1015 functions – part 4. of Example 3.10.

36 n(n+1) 2. Consider the vector space V of symmetric n n matrices. One can view V as R 2 . Let k n. × ≤ Consider the function fk : V R which takes a matrix X and maps it to f(X) which is the sum of → the k largest eigenvalues of X. Then fk is a convex function. This is seen by the following argument. P Given any Y V define the linear function AY on V as follows: AY (X) = XijYij. Then ∈ i,j

fk(X) = sup AYY T (X), Y ∈Ω

n 1016 where Ω is the set of n k matrices with k orthonormal columns in R . This shows that fk is the × 1017 supremum of linear functions, and by Theorem 3.12, it is closed convex.

1018 We see in part 1. of Example 3.13 that the supremum of affine functions is convex. We will show below 1019 that, in fact, every convex function is the supremum of some family of affine functions. This is analogous 1020 to the fact that all closed convex sets are the intersection of some family of halfspaces. We build up to this 1021 with an important definition.

d d 1022 Definition 3.14. Let f : R R + be any function. Let x dom(f). Then a R is said to define → ∪ { ∞} ∈d ∈ 1023 an affine support of f at x if f(y) f(x) + a, y x for all y R . ≥ h − i ∈ d 1024 A useful picture to keep in mind is the following fact: a R is an affine support of f at x if and only if ∈ 1025 the hyperplane a, y t a, x f(x) is a supporting hyperplane for the epigraph of f at (x, f(x)). h i − ≤ h i − d 1026 Theorem 3.15. Let f : R R be any function. Then f is closed convex if and only if there exists an → d 1027 affine support of f at every x R . ∈ d 1028 Proof. ( ) Consider any x R . By definition of closed convex, epi(f) is a closed convex set. Moreover, ⇒ ∈ d 1029 (x, f(x)) bd(epi(f)). By Theorem 2.23, there exists (a¯, r) R R and δ R such that a¯ and r are not ∈ ∈ × ∈ 1030 both zero, and a¯, y + rt δ for all (y, t) epi(f), and a¯, x + rf(x) = δ. h i ≤ ∈ h i 1031 We claim that r < 0. Suppose to the contrary that r 0. First consider the case that a¯ = 0, then ≥ 1032 r > 0. (x, t) epi(f) for all t f(x). But this contradicts that rt = a¯, y + rt δ for all t f(x) ∈ ≥ h i ≤ d ≥ 1033 and rf(x) = a¯, x + rf(x) = δ. Next consider the case that a¯ = 0. Consider any y R satisfying h i 6 ∈ 1034 a¯, y > δ. Since f is real valued, there exists (y, t) epi(f) for some t 0. Since r 0, this contradicts h i ∈ ≥ ≥ 1035 that a¯, y + rt δ. h i ≤ a¯ d 1036 Now set a = −r . a¯, x + rf(x) = δ and a¯, y + rf(y) δ for all y R together imply that h i h i ≤ ∈ d 1037 a¯, y ( r)f(y) + a¯, x + rf(x). Rearranging, we obtain that f(y) f(x) + a, y x for all y R . h i ≤ − h i d ≥ dh − i ∈ ( ) By definition of affine support, for every x R , there exists ax R such that f(y) f(x) + ⇐ d ∈ ∈ ≥ ax, y x for all y R . This implies that, in fact, h − i ∈

f(y) = sup (f(x) + ax, y x ), d x∈R h − i

1038 because setting x = y on the right hand side gives f(y). Thus, f is the supremum of a family of affine 1039 functions, which by Example 3.13, shows that f is closed convex.

1040 Remark 3.16. 1. Any convex function that is finite valued everywhere is closed convex. This follows 1041 from a continuity result we will prove later. We skip the details in these notes. Thus, in the forward 1042 direction of Theorem 3.15, one may weaken the hypothesis to just convex, as opposed to closed convex.

1043 2. In the reverse direction of Theorem 3.15, one may weaken the hypothesis to having local affine support d 1044 everywhere. A function f : R R is said to have local affine support at x if there exists  > 0 → 1045 (depending on x) such that f(y) f(x) + a, y x for all y B(x, ). We will omit the proof of this ≥ h − i ∈ 1046 extension of Theorem 3.15 here. See Chapter on “Convex Functions” in [3].

d 1047 3. The proof of Theorem 3.15 also shows that if f : R R + and x dom(f), then there exists → ∪ { ∞} ∈ 1048 an affine support of f at x.

1049 Affine supports for convex functions have been given a special name.

37 d 1050 Definition 3.17. Let f : R R + be a convex function. For any x dom(f), an affine support at → ∪ { ∞} ∈ 1051 x is called a subgradient of f at x. The set of all subgradients at x is denoted by ∂f(x) and is called the 1052 subdifferential of f at x.

d 1053 Theorem 3.18. Let f : R R + be a convex function. For any x dom(f), the subdifferential → ∪ { ∞} ∈ 1054 ∂f(x) at x is a closed, convex set. Proof. Note that d d ∂f(x) := a R : y x, a f(y) f(x) y R . { ∈ h − i ≤ − ∀ ∈ } 1055 Since the above set is the intersection of a family of halfspaces, this shows that ∂f(x) is a closed, convex 1056 set.

3.2 Continuity properties

Convex functions enjoy strong continuity properties in the relative interior of their domains³. This fact is very useful in many contexts, especially in optimization, because it helps in showing that minimizers and maximizers exist when optimizing convex functions that show up in practice, via Weierstrass' theorem (Theorem 1.11).

Proposition 3.19. Let f : R^d → R ∪ {+∞} be a convex function. Take x* ∈ R^d and suppose that for some ε > 0 and m, M ∈ R, the inequalities

    m ≤ f(x) ≤ M

hold for all x in the ball B(x*, 2ε). Then for all x, y ∈ B(x*, ε), it holds that

    |f(x) − f(y)| ≤ ((M − m)/ε) ‖x − y‖.    (3.1)

In particular, f is locally Lipschitz continuous about x*.

Proof. Take x, y ∈ B(x*, ε) with x ≠ y. Define z = y + ε (y − x)/‖y − x‖. Note that

    ‖z − x*‖ = ‖ y + ε (y − x)/‖y − x‖ − x* ‖ ≤ ‖y − x*‖ + ε ≤ ε + ε = 2ε.

Thus z ∈ B(x*, 2ε). Also,

    y = (‖y − x‖/(ε + ‖y − x‖)) z + (1 − ‖y − x‖/(ε + ‖y − x‖)) x,

showing that y is a convex combination of x and z. Therefore we may apply the convexity of f to see

    f(y) ≤ (‖y − x‖/(ε + ‖y − x‖)) f(z) + (1 − ‖y − x‖/(ε + ‖y − x‖)) f(x)
         = f(x) + (‖y − x‖/(ε + ‖y − x‖)) (f(z) − f(x))
         ≤ f(x) + (‖y − x‖/ε) (M − m),    using the bounds on f in B(x*, 2ε).

Hence f(y) − f(x) ≤ (‖y − x‖/ε)(M − m). Repeating this argument with the roles of x and y swapped, we get f(x) − f(y) ≤ (‖y − x‖/ε)(M − m). Therefore (3.1) holds.

³This section was written by Joseph Paat.

Proposition 3.20. Let f : R^d → R ∪ {+∞} be a convex function. Consider any compact, convex subset S ⊆ dom(f) and let x* ∈ relint(S). Then there is an ε_{x*} > 0 and values m_{x*}, M_{x*} ∈ R so that

    m_{x*} ≤ f(x) ≤ M_{x*}    (3.2)

for all x ∈ B(x*, 2ε_{x*}) ∩ S.

Proof. Let v¹, ..., v^ℓ be vectors that span the linear space parallel to aff(S) (see Theorem 2.16). By definition of relative interior, since x* ∈ aff(S), there exists ε > 0 such that x* + εv^j and x* − εv^j are both in S for j = 1, ..., ℓ. Denote the set of points x* ± εv^j as x₁, ..., x_k ∈ S (k = 2ℓ), and define S' := conv{x₁, ..., x_k}. Observe that x* ∈ relint(S') and aff(S') = aff(S). Set M_{x*} = max{f(x_i) : i = 1, ..., k}. Using Problem 3 from "HW for Week VIII", it follows that f(x) ≤ M_{x*} for all x ∈ S'.

Now, since f is convex, by Theorem 3.15 (see Remark 3.16, part 3.) there is some affine support function L(x) = ⟨a, x − x*⟩ + f(x*) for f at x*. Define m_{x*} = min{L(x_i) : i = 1, ..., k}. Consider any point x = Σ_{i=1}^k λ_i x_i ∈ S', where λ₁, ..., λ_k are convex coefficients, and observe that

    L(x) = ⟨a, Σ_{i=1}^k λ_i x_i − x*⟩ + f(x*) = Σ_{i=1}^k λ_i (⟨a, x_i − x*⟩ + f(x*)) = Σ_{i=1}^k λ_i L(x_i) ≥ m_{x*}.

Since L is an affine support, it follows that f(x) ≥ L(x) ≥ m_{x*} for all x ∈ S'. Finally, as x* ∈ relint(S') and aff(S') = aff(S), there is some ε_{x*} > 0 so that B(x*, 2ε_{x*}) ∩ S ⊆ S'.

Theorem 3.21. Let f : R^d → R ∪ {+∞} be a convex function. Let D ⊆ relint(dom(f)) be a convex, compact subset. Then there is a constant L = L(D) ≥ 0 so that

    |f(x) − f(y)| ≤ L ‖x − y‖    (3.3)

for all x, y ∈ D. In particular, f is locally Lipschitz continuous over the relative interior of its domain.

Proof. Let S be a compact set such that D ⊆ relint(S) ⊆ relint(dom(f)). From Proposition 3.20, for every x ∈ relint(S) there is a tuple (ε_x, m_x, M_x) so that m_x ≤ f(y) ≤ M_x for all y ∈ B(x, 2ε_x) ∩ S. Proposition 3.19 then implies that there is some L_x ≥ 0 so that |f(y) − f(z)| ≤ L_x ‖z − y‖ for all z, y ∈ B(x, ε_x). Note that the collection {B(x, ε_x) ∩ S : x ∈ D} forms an open cover of S (in the relative topology of aff(S)). Therefore, as S is compact, there exists a finite set {x₁, ..., x_k} ⊂ S so that S ⊆ ∪_{i=1}^k B(x_i, ε_{x_i}). Set L = max{L_{x_i} : i ∈ {1, ..., k}}.

Now take y, z ∈ S. The line segment [y, z] can be divided into finitely many segments [y, z] = [y₁, y₂] ∪ [y₂, y₃] ∪ ... ∪ [y_{q−1}, y_q], where y₁ = y, y_q = z, and each [y_i, y_{i+1}] is contained in some ball B(x_j, ε_{x_j}) for j ∈ {1, ..., k}. Without loss of generality, we may assume that q − 1 ≤ k and [y_i, y_{i+1}] ⊆ B(x_i, ε_{x_i}) for each i ∈ {1, ..., q − 1}. It follows that

    |f(y) − f(z)| = | Σ_{i=1}^{q−1} (f(y_i) − f(y_{i+1})) |
                 ≤ Σ_{i=1}^{q−1} |f(y_i) − f(y_{i+1})|
                 ≤ Σ_{i=1}^{q−1} L_{x_i} ‖y_i − y_{i+1}‖
                 ≤ L Σ_{i=1}^{q−1} ‖y_i − y_{i+1}‖
                 = L ‖y₁ − y_q‖ = L ‖y − z‖,

where the last equality uses that the y_i lie in order on the segment [y, z].

Hence f is Lipschitz continuous over S with constant L.

3.3 First-order derivative properties

A convex function enjoys very strong differentiability properties. We will first state some useful results without proof. See the chapter on "Convex Functions" in Gruber [3] for full proofs.

Theorem 3.22. Let f : R^d → R ∪ {+∞} be a convex function and let x ∈ int(dom(f)). Then f is differentiable at x if and only if the partial derivative f'_i(x) exists for all i = 1, ..., d.

Theorem 3.23 (Reidemeister's Theorem). Let f : R^d → R ∪ {+∞} be a convex function. Then f is differentiable almost everywhere in int(dom(f)), i.e., the subset of int(dom(f)) where f is not differentiable has Lebesgue measure 0.

We now prove the central relationships between the gradient ∇f and convexity. We first observe some facts about convex functions on the real line.

Proposition 3.24. Let f : R → R be a convex function. Then for any real numbers x < y < z, we must have

    (f(y) − f(x))/(y − x) ≤ (f(z) − f(x))/(z − x) ≤ (f(z) − f(y))/(z − y).

Moreover, if f is strictly convex, then these inequalities are strict.

Proof. Since y ∈ (x, z), there exists α ∈ (0, 1) such that y = αx + (1 − α)z. Now we follow the inequalities:

    (f(y) − f(x))/(y − x) = (f(αx + (1−α)z) − f(x))/(αx + (1−α)z − x)
                         ≤ (αf(x) + (1−α)f(z) − f(x))/(αx + (1−α)z − x)
                         = (f(z) − f(x))/(z − x).

Similarly,

    (f(z) − f(y))/(z − y) = (f(z) − f(αx + (1−α)z))/(z − αx − (1−α)z)
                         ≥ (f(z) − αf(x) − (1−α)f(z))/(z − αx − (1−α)z)
                         = (f(z) − f(x))/(z − x).

The strict convexity implication is clear from the above.

An immediate corollary is the following relationship between the derivative of a function on the real line and convexity.

Proposition 3.25. Let f : R → R be a differentiable function. Then f is convex if and only if f' is an increasing function, i.e., f'(x) ≥ f'(y) for all x ≥ y ∈ R. Moreover, f is strictly convex if and only if f' is strictly increasing. f is strongly convex with strong convexity modulus c > 0 if and only if f'(x) ≥ f'(y) + c(x − y) for all x ≥ y ∈ R.

Proof. (⇒) Recall that f'(x) = lim_{t→0⁺} (f(x + t) − f(x))/t. But for every 0 < t < y − x, we have (f(x + t) − f(x))/t ≤ (f(y) − f(x))/(y − x) by Proposition 3.24. Thus, f'(x) ≤ (f(y) − f(x))/(y − x). By a similar argument, we obtain f'(y) ≥ (f(y) − f(x))/(y − x). This gives the relation.

(⇐) Consider any x, z ∈ R and α ∈ (0, 1). Let y = αx + (1 − α)z. By the mean value theorem, there exists t₁ ∈ [x, y] such that (f(y) − f(x))/(y − x) = f'(t₁) and t₂ ∈ [y, z] such that (f(z) − f(y))/(z − y) = f'(t₂). Since t₂ ≥ t₁ and we assume f' is increasing, f'(t₂) ≥ f'(t₁). This implies that

    (f(z) − f(y))/(z − y) ≥ (f(y) − f(x))/(y − x).

Substituting y = αx + (1 − α)z and rearranging, we obtain that f(αx + (1 − α)z) ≤ αf(x) + (1 − α)f(z).

We can now prove the main result of this subsection. A key idea behind the results below is that one can reduce testing convexity of a function on R^d to testing convexity of any one-dimensional "slice" of it. More precisely,

Proposition 3.26. Let f : R^d → R ∪ {+∞} be a function. Then f is convex if and only if for every x, r ∈ R^d, the function φ : R → R ∪ {+∞} defined by φ(t) = f(x + tr) is convex.

Proof. Left as an exercise.

Theorem 3.27. Let f : R^d → R be differentiable everywhere. Then the following are all equivalent.

1. f is convex.

2. f(y) ≥ f(x) + ⟨∇f(x), y − x⟩ for all x, y ∈ R^d.

3. ⟨∇f(y) − ∇f(x), y − x⟩ ≥ 0 for all x, y ∈ R^d.

A characterization of strict convexity is obtained if all the above inequalities are made strict for all x ≠ y ∈ R^d. A characterization of strong convexity with modulus c > 0 is obtained if 2. is replaced with f(y) ≥ f(x) + ⟨∇f(x), y − x⟩ + (c/2)‖y − x‖² for all x, y ∈ R^d, and 3. is replaced with ⟨∇f(y) − ∇f(x), y − x⟩ ≥ c‖y − x‖² for all x, y ∈ R^d.

Proof. 1. ⇒ 2. Consider any x, y ∈ R^d. For every α > 0, convexity of f implies that f((1 − α)x + αy) ≤ (1 − α)f(x) + αf(y). Rearranging, we obtain

    (f((1−α)x + αy) − f(x))/α ≤ f(y) − f(x)
    ⇒ (f(x + α(y − x)) − f(x))/α ≤ f(y) − f(x).

Letting α ↓ 0 on the left hand side, we obtain the directional derivative ⟨∇f(x), y − x⟩, and 2. is established.

2. ⇒ 3. By switching the roles of x, y ∈ R^d, we obtain the following:

    f(y) ≥ f(x) + ⟨∇f(x), y − x⟩,
    f(x) ≥ f(y) + ⟨∇f(y), x − y⟩.

Adding these inequalities together, we obtain 3.

3. ⇒ 1. Consider any x̄, ȳ ∈ R^d and define the function φ(t) := f(x̄ + t(ȳ − x̄)). Observe that φ'(t) = ⟨∇f(x̄ + t(ȳ − x̄)), ȳ − x̄⟩ for any t ∈ R. For t₂ > t₁, we have that

    φ'(t₂) − φ'(t₁) = ⟨∇f(x̄ + t₂(ȳ − x̄)) − ∇f(x̄ + t₁(ȳ − x̄)), ȳ − x̄⟩
                   = (1/(t₂ − t₁)) ⟨∇f(x̄ + t₂(ȳ − x̄)) − ∇f(x̄ + t₁(ȳ − x̄)), (x̄ + t₂(ȳ − x̄)) − (x̄ + t₁(ȳ − x̄))⟩
                   ≥ 0,

where the last inequality follows from the fact that ⟨∇f(y) − ∇f(x), y − x⟩ ≥ 0 for all x, y ∈ R^d, and t₂ > t₁. Therefore, by Proposition 3.25, we obtain that φ(t) is a convex function in t. By Proposition 3.26, f is convex.

3.4 Second-order derivative properties

A simple consequence of Proposition 3.25 for twice differentiable functions on the real line is the following.

Corollary 3.28. Let f : R → R be a twice differentiable function. Then f is convex if and only if f''(x) ≥ 0 for all x ∈ R. If f''(x) > 0 for all x ∈ R, then f is strictly convex.

Remark 3.29. From Proposition 3.25, we know strict convexity of f is equivalent to the condition that f' is strictly increasing. However, this is not equivalent to f''(x) > 0; the implication only goes in one direction. This is why we lose the other direction when discussing strict convexity in Corollary 3.28. As a concrete example, consider f(x) = x⁴, which is strictly convex, but whose second derivative is 0 at x = 0.

This enables one to characterize convexity of f : R^d → R in terms of its Hessian, which will be denoted by ∇²f.

Theorem 3.30. Let f : R^d → R be a twice differentiable function. Then the following are all true.

1. f is convex if and only if ∇²f(x) is positive semidefinite (PSD) for all x ∈ R^d.

2. If ∇²f(x) is positive definite (PD) for all x ∈ R^d, then f is strictly convex.

3. f is strongly convex with modulus c > 0 if and only if ∇²f(x) − cI is positive semidefinite (PSD) for all x ∈ R^d.

Proof. 1. (⇒) Let x ∈ R^d; we would like to show that ∇²f(x) is positive semidefinite. Consider any r ∈ R^d. Define the function φ(t) = f(x + tr). By Proposition 3.26, φ is convex. By Corollary 3.28, 0 ≤ φ''(0) = ⟨∇²f(x)r, r⟩. Since the choice of r was arbitrary, this shows that ∇²f(x) is positive semidefinite.

(⇐) Assume ∇²f(x) is positive semidefinite for all x ∈ R^d, and consider x̄, r ∈ R^d. Define the function φ(t) = f(x̄ + tr). Now φ''(t) = ⟨∇²f(x̄ + tr)r, r⟩ ≥ 0, since ∇²f(x̄ + tr) is positive semidefinite. By Corollary 3.28, φ is convex. By Proposition 3.26, f is convex.

2. This follows from the same construction as in 1. above, and the sufficient condition that if the second derivative of a one-dimensional function is strictly positive, then the function is strictly convex.

3. We omit the proof of the characterization of strong convexity.
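To make part 1. concrete, here is a small numerical sketch (ours, not from the notes; the names are illustrative) that tests the PSD condition of Theorem 3.30 for a quadratic f(x) = (1/2)⟨x, Qx⟩ with symmetric Q, whose Hessian is the constant matrix Q:

    import numpy as np

    def is_psd(H, tol=1e-10):
        # A symmetric matrix is PSD iff all its eigenvalues are >= 0.
        return bool(np.all(np.linalg.eigvalsh(H) >= -tol))

    Q = np.array([[2.0, -1.0],
                  [-1.0, 2.0]])        # eigenvalues 1 and 3
    print(is_psd(Q))                   # True: f(x) = (1/2)<x, Qx> is convex (part 1.)
    print(is_psd(Q - 0.5 * np.eye(2))) # True: f is strongly convex with modulus c = 0.5 (part 3.)

Since the smallest eigenvalue of Q is 1, the test with c = 0.5 succeeds, while any c > 1 would fail.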


3.5 Sublinear functions, support functions and gauges

We will now introduce a more structured subfamily of convex functions which is easier to deal with analytically, and yet has very important uses in diverse areas.

Definition 3.31. A function f : R^d → R ∪ {+∞} is called sublinear if it satisfies the following two properties:

(i) f is positively homogeneous, i.e., f(λr) = λf(r) for all r ∈ R^d and λ > 0.

(ii) f is subadditive, i.e., f(x + y) ≤ f(x) + f(y) for all x, y ∈ R^d.

Here is the connection with convexity.

Proposition 3.32. Let f : R^d → R ∪ {+∞}. Then the following are equivalent:

1. f is sublinear.

2. f is convex and positively homogeneous.

3. f(λ₁x¹ + λ₂x²) ≤ λ₁f(x¹) + λ₂f(x²) for all x¹, x² ∈ R^d and λ₁, λ₂ > 0.

Proof. Left as an exercise.

A useful property of sublinear functions is the following.

Proposition 3.33. Let f : R^d → R ∪ {+∞} be a sublinear function. Then either f(0) = 0 or f(0) = +∞.

A characterization of sublinear functions via epigraphs is also possible.

Proposition 3.34. Let f : R^d → R ∪ {+∞} be such that f(0) = 0. Then f is sublinear if and only if epi(f) is a convex cone in R^d × R.

Proof. (⇒) From Proposition 3.32, we know that f is convex and positively homogeneous. From Proposition 3.6, this implies that epi(f) is convex. So we only need to verify that if (x, t) ∈ epi(f) then λ(x, t) = (λx, λt) ∈ epi(f) for all λ ≥ 0. If λ = 0, then the result follows from the assumption that f(0) = 0. Now consider λ > 0. Since (x, t) ∈ epi(f), we have f(x) ≤ t and, by positive homogeneity of f, f(λx) = λf(x) ≤ λt, and so (λx, λt) ∈ epi(f).

(⇐) From Proposition 3.6 and the assumption that epi(f) is a convex cone, we get that f is convex. We now verify that f is positively homogeneous; by Proposition 3.32, we will then be done. We first verify that for all λ > 0 and x ∈ R^d, f(λx) ≤ λf(x). Since epi(f) is a convex cone and (x, f(x)) ∈ epi(f), we have that λ(x, f(x)) = (λx, λf(x)) ∈ epi(f). This implies that f(λx) ≤ λf(x).

Now, for any particular λ̄ > 0 and x̄ ∈ R^d, we have that f(λ̄x̄) ≤ λ̄f(x̄). But using the above observation with λ = 1/λ̄ and x = λ̄x̄, we obtain that f((1/λ̄)λ̄x̄) ≤ (1/λ̄)f(λ̄x̄), i.e., λ̄f(x̄) ≤ f(λ̄x̄). Hence, we must have f(λ̄x̄) = λ̄f(x̄).

Gauges. One easily observes that any norm N : R^d → R is a sublinear function – recall Definition 1.1. In fact, a norm has the additional "symmetry" property that N(x) = N(−x). Since a sublinear function is convex (Proposition 3.32), and sublevel sets of convex functions are convex, we immediately know that the unit norm balls B_N(0, 1) = {x ∈ R^d : N(x) ≤ 1} are convex sets. Because of the "symmetry property" of norms, these unit norm balls are also "symmetric" about the origin. This merits a definition.

Definition 3.35. A convex set C ⊆ R^d is said to be centrally symmetric about the origin if x ∈ C implies that −x ∈ C. Sometimes we will abbreviate this to say C is centrally symmetric.

We now summarize the above discussion in the following observation.

Proposition 3.36. Let N : R^d → R be a norm. Then the unit norm ball B_N(0, 1) = {x ∈ R^d : N(x) ≤ 1} is a centrally symmetric, closed convex set.

One can actually prove a converse to the above statement, which will establish a nice one-to-one correspondence between norms and centrally symmetric convex sets. We first generalize the notion of a norm to a family of sublinear functions called "gauge functions".

Definition 3.37. Let C ⊆ R^d be a closed, convex set such that 0 ∈ C. Define the function γ_C : R^d → R ∪ {+∞} as

    γ_C(r) = inf{λ > 0 : r ∈ λC}.

γ_C is called the gauge or the Minkowski functional of C.

Exercise 7. Show that γ_C is finite valued everywhere if and only if 0 ∈ int(C).

The following is a useful observation for the analysis of gauge functions.

Lemma 3.38. Let C ⊆ R^d be a closed convex set such that 0 ∈ C, and let r ∈ R^d be any vector. Then the set {λ > 0 : r ∈ λC} is either empty or a convex interval of the real line of the form (a, +∞) or [a, +∞).

Proof. Define I := {λ > 0 : r ∈ λC} and suppose it is nonempty. It suffices to show that if λ̄ ∈ I then λ ∈ I for all λ ≥ λ̄. This follows from the fact that λ̄ ∈ I implies (1/λ̄)r ∈ C. For any λ ≥ λ̄, we have (1/λ)r = (λ̄/λ)((1/λ̄)r) + ((λ − λ̄)/λ)·0, which is in C because C is convex and 0 ∈ C.

A useful intuition to keep in mind is that, for any r, the gauge function value γ_C(r) gives you a factor to scale r with so that you end up on the boundary of C. More precisely,

Proposition 3.39. Let C ⊆ R^d be a closed, convex set such that 0 ∈ C. Suppose r ∈ R^d is such that 0 < γ_C(r) < ∞. Then (1/γ_C(r)) r ∈ relbd(C).

Proof. From Lemma 3.38, we have that for all λ > γ_C(r), r ∈ λC, i.e., (1/λ)r ∈ C. Taking the limit λ ↓ γ_C(r) and using the fact that C is closed, we obtain that (1/γ_C(r))r ∈ C. If (1/γ_C(r))r ∈ relint(C), then we could scale (1/γ_C(r))r by some α > 1 and obtain that (α/γ_C(r))r ∈ C, which would imply that r ∈ (γ_C(r)/α)C, contradicting the fact that γ_C(r) = inf{λ > 0 : r ∈ λC}, since γ_C(r)/α < γ_C(r).

The following theorem relates geometric properties of C with analytical properties of the gauge function. These relations are extremely handy to keep in mind.

Theorem 3.40. Let C ⊆ R^d be a closed, convex set such that 0 ∈ C. Then the following are all true.

1. γ_C is a nonnegative, sublinear function.

2. C = {x ∈ R^d : γ_C(x) ≤ 1}.

3. rec(C) = {r ∈ R^d : γ_C(r) = 0}.

4. If 0 ∈ relint(C), then relint(C) = {x ∈ R^d : γ_C(x) < 1}.

Proof. 1. Although 1. can be proved directly from the definition of the gauge, we postpone its proof until we speak of support functions below.

2. We first show that C ⊆ {x ∈ R^d : γ_C(x) ≤ 1}. This is because x ∈ C implies that 1 ∈ {λ > 0 : x ∈ λC} and therefore inf{λ > 0 : x ∈ λC} ≤ 1.

Now we verify that {x ∈ R^d : γ_C(x) ≤ 1} ⊆ C. γ_C(x) ≤ 1 implies that inf{λ > 0 : x ∈ λC} ≤ 1 and, since {λ > 0 : x ∈ λC} is an unbounded interval by Lemma 3.38, this means that either 1 ∈ {λ > 0 : x ∈ λC}, and thus x ∈ C, or 1 = inf{λ > 0 : x ∈ λC} = γ_C(x), in which case Proposition 3.39 gives 1·x ∈ C.

3. Since {λ > 0 : r ∈ λC} is convex by Lemma 3.38, we observe that γ_C(r) = 0 if and only if (1/λ)r ∈ C for all λ > 0. Since 0 ∈ C, this is equivalent to saying that tr ∈ C for all t ≥ 0; more explicitly, 0 + tr ∈ C for all t ≥ 0. This is equivalent to saying that r satisfies Definition 2.43 of rec(C).

4. Consider any x ∈ relint(C). By definition of relative interior, there exists λ > 1 such that λx ∈ C. By part 2. above, γ_C(λx) ≤ 1 and, by part 1. above, γ_C is positively homogeneous; thus γ_C(x) ≤ 1/λ < 1.

Now suppose x ∈ R^d is such that γ_C(x) < 1. If γ_C(x) = 0, then x ∈ rec(C) by part 3. above. Since 0 ∈ relint(C), we then have x = 0 + x ∈ relint(C). Now suppose 0 < γ_C(x) < 1. By part 2. above, x ∈ C. Suppose to the contrary that x ∉ relint(C). By Theorem 2.40, x is contained in a proper face F of C. Since 0 ∈ relint(C), 0 is not contained in F. Also, γ_C(x/γ_C(x)) = 1 by positive homogeneity of γ_C, from part 1. above. Therefore, x/γ_C(x) ∈ C. However, x = (1 − γ_C(x))·0 + γ_C(x)·(x/γ_C(x)). Since γ_C(x) < 1 and 0 ∉ F, this would contradict the fact that F is a face.

We derive some immediate consequences.

Corollary 3.41. Let C ⊆ R^d be a closed, convex set containing the origin. Then C is compact if and only if γ_C(r) > 0 for all r ∈ R^d \ {0}.

Corollary 3.42 (Uniqueness of the gauge). Let C be a compact convex set containing the origin in its interior, i.e., 0 ∈ int(C). Let f : R^d → R be any sublinear function. Then C = {x ∈ R^d : f(x) ≤ 1} if and only if f = γ_C.

Proof. The sufficiency follows from Theorem 3.40, part 2. For the necessity, suppose to the contrary that f(x) ≠ γ_C(x) for some x ∈ R^d. We first observe that x ≠ 0, because f(0) = 0 = γ_C(0) by Proposition 3.33.

First suppose f(x) > γ_C(x). Since C is compact, we know that γ_C(x) > 0 by Corollary 3.41. Consider the point (1/γ_C(x))x. By Proposition 3.39, (1/γ_C(x))x ∈ relbd(C). However, since f is positively homogeneous, f((1/γ_C(x))x) = (1/γ_C(x))f(x) > 1, because f(x) > γ_C(x). This contradicts that C = {x ∈ R^d : f(x) ≤ 1}.

Next suppose f(x) < γ_C(x). If f(x) ≤ 0, then by positive homogeneity, f(λx) ≤ 0 for all λ ≥ 0. Thus, λx ∈ C for all λ ≥ 0 by the assumption that C = {x ∈ R^d : f(x) ≤ 1}. This means that x ∈ rec(C), which contradicts the fact that C is compact (see Theorem 2.47). Thus, we may assume that f(x) > 0.

Now let y = (1/f(x))x. By positive homogeneity of γ_C, we obtain that γ_C(y) = γ_C((1/f(x))x) = γ_C(x)/f(x) > 1. Therefore, y ∉ C by Theorem 3.40, part 2. However, f(y) = 1, which contradicts the assumption that C = {x ∈ R^d : f(x) ≤ 1}.

The proof of Corollary 3.42 can be massaged to obtain the following.

Corollary 3.43 (Uniqueness of the gauge II). Let C be a closed, convex set (not necessarily compact) containing the origin in its interior, i.e., 0 ∈ int(C). Let f : R^d → R be any nonnegative, sublinear function. Then C = {x ∈ R^d : f(x) ≤ 1} if and only if f = γ_C.

Consequently, for every nonnegative, sublinear function f, there exists a closed, convex set C such that f = γ_C.

We also make the following observation on when the gauge function can take +∞ as a value.

Lemma 3.44. Let C be a closed, convex set with 0 ∈ C. Then the gauge γ_C is finite valued everywhere (i.e., γ_C(x) < ∞ for all x ∈ R^d) if and only if 0 ∈ int(C).

Proof. (⇒) Suppose 0 is not in the interior, i.e., 0 is on the boundary of C. By the Supporting Hyperplane Theorem 2.23, there exist a ∈ R^d \ {0} and δ ∈ R such that C ⊆ H⁻(a, δ) and ⟨a, 0⟩ = δ. Thus, δ = 0. Now consider any r ∈ R^d such that ⟨a, r⟩ > 0. Since C ⊆ H⁻(a, 0), it follows that λC ⊆ H⁻(a, 0) for all λ > 0. Therefore, the set {λ > 0 : r ∈ λC} is empty, and we conclude that γ_C(r) = ∞. In fact, this shows that γ_C takes value ∞ on the entire "open" halfspace {r ∈ R^d : ⟨a, r⟩ > 0}.

(⇐) Assume 0 ∈ int(C) and consider any x ∈ R^d. Since 0 ∈ int(C), there exists ε > 0 such that εx ∈ C. Thus, 1/ε is in the set {λ > 0 : x ∈ λC}, and so the infimum over this set is finite. Thus, γ_C(x) < ∞ for all x ∈ R^d.

We can now finally settle the correspondence between norms and centrally symmetric, compact convex sets.

Theorem 3.45. Let N : R^d → R be a norm. Then B_N(0, 1) = {x ∈ R^d : N(x) ≤ 1} is a centrally symmetric, compact convex set with 0 in its interior. Moreover, γ_{B_N(0,1)} = N.

Conversely, let B be a centrally symmetric, compact convex set containing 0 in its interior. Then γ_B is a norm on R^d and B = B_{γ_B}(0, 1).

Proof. For the first part, since N is sublinear, it is convex (by Proposition 3.32). By definition, B_N(0, 1) = {x ∈ R^d : N(x) ≤ 1} is a sublevel set of N, and is thus a convex set. It is closed, since N is continuous by Theorem 3.21. Since N(x) = N(−x), B_N(0, 1) is also centrally symmetric. We now show that rec(B_N(0, 1)) = {0}; this will imply that it is compact by Theorem 2.47. Consider any nonzero vector r, and let N(r) = M > 0. Then (2/M)r = 0 + (2/M)r, but N((2/M)r) = 2. Thus, (2/M)r ∉ B_N(0, 1), and so r cannot be a recession direction for B_N(0, 1).

We verify that 0 ∈ int(B_N(0, 1)). If not, then by the Supporting Hyperplane Theorem 2.23, there exist a ∈ R^d \ {0} and δ ∈ R such that B_N(0, 1) ⊆ H⁻(a, δ) and ⟨a, 0⟩ = δ. Thus, δ = 0. Now, since a ≠ 0, N(a) > 0. Thus, N(a/N(a)) = 1 and, by definition, a/N(a) ∈ B_N(0, 1). However, ⟨a, a/N(a)⟩ = ‖a‖²/N(a) > 0, which contradicts the fact that B_N(0, 1) ⊆ H⁻(a, 0). Therefore, from Corollary 3.42, we obtain that N = γ_{B_N(0,1)}.

For the second part, we know that γ_B is sublinear, and since B is compact, γ_B(r) > 0 for all r ≠ 0 by Corollary 3.41. Since 0 ∈ int(B), Lemma 3.44 implies that γ_B is finite valued everywhere. To confirm that γ_B is a norm, all that remains to be checked is that γ_B(x) = γ_B(−x) for all x ≠ 0. Suppose to the contrary that γ_B(x) > γ_B(−x) (note that this is without loss of generality). This implies that γ_B((1/γ_B(−x))x) > 1. Therefore, (1/γ_B(−x))x ∉ B by Theorem 3.40, part 2. However, γ_B(−(1/γ_B(−x))x) = (1/γ_B(−x))γ_B(−x) = 1, showing that −(1/γ_B(−x))x ∈ B by Theorem 3.40, part 2. This contradicts the fact that B is centrally symmetric. Thus, γ_B is a norm on R^d. Moreover, by Theorem 3.40, part 2., B = {x ∈ R^d : γ_B(x) ≤ 1} = B_{γ_B}(0, 1).

Let us build towards a more computational approach to the gauge. First, let's give an explicit formula for the gauge of a halfspace containing the origin.

Example 3.46. Let H := H⁻(a, δ) be a halfspace defined by some a ∈ R^d and δ ∈ R such that 0 ∈ H⁻(a, δ). We assume that we have normalized δ to be 0 or 1. If δ = 0, then

    γ_H(r) = 0 if ⟨a, r⟩ ≤ 0,    γ_H(r) = +∞ if ⟨a, r⟩ > 0.

If δ = 1, then γ_H(r) = max{0, ⟨a, r⟩}.

The above calculation, along with the next theorem, gives powerful computational tools for gauge functions.

Theorem 3.47. Let C_i, i ∈ I be a (not necessarily finite) family of closed, convex sets containing the origin, and let C = ∩_{i∈I} C_i. Then

    γ_C = sup_{i∈I} γ_{C_i}.

Proof. Consider any r ∈ R^d. Define A_i = {λ > 0 : r ∈ λC_i} for each i ∈ I, and define A = {λ > 0 : r ∈ λC}. Observe that A = ∩_{i∈I} A_i. If some A_i is empty, then γ_{C_i}(r) = ∞, and A is empty, so γ_C(r) = ∞ as well, and the equality holds. Now suppose all the A_i are nonempty, so by Lemma 3.38 each A_i is of the form (a_i, ∞) or [a_i, ∞). If A = ∅, then it must mean that sup_{i∈I} a_i = ∞. Since γ_{C_i}(r) = inf A_i = a_i, this shows that sup_{i∈I} γ_{C_i}(r) = ∞; moreover, A = ∅ implies that γ_C(r) = inf A = +∞. Finally, consider the case that A is nonempty. Then, since A = ∩_{i∈I} A_i, A must be of the form (a, +∞) or [a, +∞), where a := sup_{i∈I} a_i. Then γ_C(r) = a = sup_{i∈I} a_i = sup_{i∈I} γ_{C_i}(r).

This shows that gauge functions for polyhedra can be computed very easily.

Corollary 3.48. Let P be a polyhedron containing the origin in its interior. Thus, there exist a¹, ..., a^m ∈ R^d such that

    P = {x ∈ R^d : ⟨a^i, x⟩ ≤ 1, i = 1, ..., m}.

Then

    γ_P(r) = max{0, ⟨a¹, r⟩, ..., ⟨a^m, r⟩}.

Proof. Use the formula from Example 3.46 and Theorem 3.47.
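Corollary 3.48 translates directly into code. The following sketch (ours, with illustrative names) assumes P is stored as the rows of a matrix A, so that P = {x : ⟨aⁱ, x⟩ ≤ 1 for each row aⁱ}:

    import numpy as np

    def gauge_polyhedron(A, r):
        # gamma_P(r) = max{0, <a^1, r>, ..., <a^m, r>}  (Corollary 3.48)
        return max(0.0, float(np.max(A @ r)))

    # The unit l-infinity ball is {x : x_i <= 1, -x_i <= 1}; its gauge is the l-infinity norm.
    A = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
    r = np.array([0.5, -2.0])
    print(gauge_polyhedron(A, r))   # 2.0, which equals np.linalg.norm(r, np.inf)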

Support functions. While gauges are good in the sense that they are a nice generalization of norms from centrally symmetric convex bodies to asymmetric convex bodies, there is a drawback. Gauges are a strict subset of sublinear functions because they are always nonnegative, while there are many sublinear functions that take negative values. We would like to establish a one-to-one correspondence between sublinear functions and all closed, convex sets. Note that the correspondence via the epigraph only establishes a correspondence with closed, convex cones, and even then not all closed, convex cones are covered. The right definition, it turns out, is inspired by optimization of linear functions over closed, convex sets.

Definition 3.49. Let S ⊆ R^d be any set. The support function of S is the function on R^d defined as

    σ_S(r) = sup_{x∈S} ⟨r, x⟩.

The following is easy to verify, and aspects of it were already explored in the midterm and HWs.

Proposition 3.50. Let S ⊆ R^d. Then

    σ_S = σ_{cl(S)} = σ_{conv(S)} = σ_{cl(conv(S))}.
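For a finite set S = {x¹, ..., x^k}, the supremum in Definition 3.49 is a maximum over the points, and by Proposition 3.50 the same computation gives the support function of conv(S). A small sketch (ours):

    import numpy as np

    def support_function(points, r):
        # points: k x d array of the x^i; sigma_S(r) = max_i <r, x^i>,
        # which equals sigma_conv(S)(r) by Proposition 3.50.
        return float(np.max(points @ r))

    # Vertices of the square [-1, 1]^2: its support function is the l_1 norm.
    S = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
    print(support_function(S, np.array([2.0, -3.0])))   # 5.0 = |2| + |-3|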

Proposition 3.51. Let S ⊆ R^d. Then σ_S is a closed, sublinear function, i.e., its epigraph is a closed, convex cone.

Proof. We first check that σ_S is sublinear. For positive homogeneity: for any r ∈ R^d and λ > 0,

    σ_S(λr) = sup_{x∈S} ⟨λr, x⟩ = sup_{x∈S} λ⟨r, x⟩ = λ sup_{x∈S} ⟨r, x⟩ = λσ_S(r).

For subadditivity: let r¹, r² ∈ R^d. Then

    σ_S(r¹ + r²) = sup_{x∈S} ⟨r¹ + r², x⟩
                = sup_{x∈S} (⟨r¹, x⟩ + ⟨r², x⟩)
                ≤ sup_{x∈S} ⟨r¹, x⟩ + sup_{x∈S} ⟨r², x⟩
                = σ_S(r¹) + σ_S(r²).

Since σ_S is the supremum of the linear functions ⟨·, x⟩, x ∈ S, its epigraph epi(σ_S) is the intersection of closed halfspaces, which shows that it is closed. The fact that it is a convex cone follows from Proposition 3.34.

We now establish a fundamental correspondence between gauges and support functions via polarity.

Theorem 3.52. Let C be a closed convex set containing the origin. Then

    γ_C = σ_{C°}.

Proof. Recall that C = (C°)° by Proposition 2.30, part 2. Unwrapping the definitions, this says that

    C = {x ∈ R^d : ⟨a, x⟩ ≤ 1 for all a ∈ C°} = ∩_{a∈C°} H⁻(a, 1).

By Theorem 3.47 and Example 3.46, we obtain that

    γ_C(r) = sup_{a∈C°} γ_{H⁻(a,1)}(r) = sup_{a∈C°} max{0, ⟨a, r⟩}.

Since 0 ∈ C°, the last term above can be written as sup_{a∈C°} ⟨a, r⟩ = σ_{C°}(r).

Example 3.53. Consider the polyhedron

    P = {x ∈ R² : −x₁ − x₂ ≤ 1, (1/2)x₁ − x₂ ≤ 1, −x₁ + (1/2)x₂ ≤ 1}.

From Corollary 3.48, we obtain that

    γ_P(r) = max{0, −r₁ − r₂, (1/2)r₁ − r₂, −r₁ + (1/2)r₂},

and by Theorem 3.40, part 2., we obtain that P = {x ∈ R² : γ_P(x) ≤ 1}. Now consider the function

    f(r) = max{−r₁ − r₂, (1/2)r₁ − r₂, −r₁ + (1/2)r₂}.

It turns out that P = {x ∈ R² : f(x) ≤ 1} as well, because

    x ∈ P ⇔ −x₁ − x₂ ≤ 1, (1/2)x₁ − x₂ ≤ 1, −x₁ + (1/2)x₂ ≤ 1
          ⇔ max{−x₁ − x₂, (1/2)x₁ − x₂, −x₁ + (1/2)x₂} ≤ 1
          ⇔ f(x) ≤ 1.

Notice that f((1, 1)) = −1/2 ≠ 0 = γ_P((1, 1)). Also, f is sublinear because f is the support function of the set S = {(−1, −1), (1/2, −1), (−1, 1/2)}. This shows that Corollary 3.42 really breaks down if the assumption of compactness is removed. Even so, given a closed, convex set C, any sublinear function that has the set C as its 1-sublevel set must match the gauge on R^d \ int(rec(C)) (see Problem 7 from "HW for Week IX"). If you are interested in learning more about representing closed, convex sets as sublevel sets of sublinear functions, please see [1] for exciting new results.

Generalized Cauchy-Schwarz/Holder's inequality. Using our relationship between norms, gauges and support functions, we can write an inequality which vastly generalizes Holder's inequality (and consequently, the Cauchy-Schwarz inequality) – see Proposition 2.32.

Theorem 3.54. Let C ⊆ R^d be a compact, convex set containing the origin in its interior. Then

    ⟨x, y⟩ ≤ γ_C(x) σ_C(y) for all x, y ∈ R^d.

Proof. Consider any x, y ∈ R^d. Since C is compact, γ_C(x) > 0 by Corollary 3.41, and σ_C(y) < +∞. By Proposition 3.39, x/γ_C(x) ∈ C, and therefore

    ⟨x/γ_C(x), y⟩ ≤ sup_{z∈C} ⟨z, y⟩ = σ_C(y).

This immediately implies ⟨x, y⟩ ≤ γ_C(x)σ_C(y).

Corollary 3.55. Let C ⊆ R^d be a compact, convex set containing the origin in its interior. Then

    ⟨x, y⟩ ≤ γ_C(x) γ_{C°}(y) for all x, y ∈ R^d.

Proof. Follows from Theorems 3.54 and 3.52.

The above corollary generalizes Holder's inequality: recall that when 1/p + 1/q = 1, the ℓ^p and ℓ^q unit balls are polars of each other. Note that Theorem 3.54 and Corollary 3.55 have no assumption of central symmetry, so they strictly generalize the norm inequalities of Holder and Cauchy-Schwarz.
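As a quick numerical sanity check (ours), the case of the ℓ¹ and ℓ^∞ unit balls, which are polar to each other, recovers the Holder inequality ⟨x, y⟩ ≤ ‖x‖₁ ‖y‖_∞, since the gauge of a unit norm ball is the norm itself (Theorem 3.45):

    import numpy as np

    rng = np.random.default_rng(0)
    for _ in range(1000):
        x, y = rng.normal(size=3), rng.normal(size=3)
        # <x, y> <= gamma_C(x) * gamma_{C polar}(y), with C the l_1 unit ball
        assert x @ y <= np.linalg.norm(x, 1) * np.linalg.norm(y, np.inf) + 1e-12
    print("Corollary 3.55 verified on random samples")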

One-to-one correspondence between closed, convex sets and closed, sublinear functions. Proposition 3.51 shows that support functions are closed, sublinear functions. Proposition 3.50 shows that two different sets, e.g., S and conv(S), may give rise to the same sublinear function σ_S = σ_{conv(S)} via the support function construction. In other words, if we consider the mapping S ↦ σ_S as a mapping from the family of subsets of R^d to the family of closed, sublinear functions, this mapping is not injective. But if we restrict to closed, convex sets, it can be shown that this mapping is injective.

Exercise 8. Let C₁, C₂ be closed, convex sets. Then σ_{C₁} = σ_{C₂} if and only if C₁ = C₂.

A natural question now is whether the mapping C ↦ σ_C from the family of closed, convex sets to the family of closed, sublinear functions is onto. The answer is yes! Thus, all closed, sublinear functions are support functions and vice versa.

Theorem 3.56. Let f : R^d → R ∪ {+∞} be a sublinear function that is also closed. Then the set

    C_f := {x ∈ R^d : ⟨r, x⟩ ≤ f(r) for all r ∈ R^d} = ∩_{r∈R^d} H⁻(r, f(r))    (3.4)

is a closed, convex set. Moreover, σ_{C_f} = f.

Conversely, if C is a closed, convex set, then C_{σ_C} = C.

Proof. We will prove the assertion when f is finite valued everywhere; the proof for general f is more tedious and does not provide any additional insight, in our opinion, and will be skipped here.

Since C_f is defined as the intersection of a family of halfspaces (indexed by R^d), C_f is a closed, convex set. We now establish that σ_{C_f} = f. For any r ∈ R^d, since C_f ⊆ H⁻(r, f(r)), we must have σ_{C_f}(r) = sup_{x∈C_f} ⟨r, x⟩ ≤ f(r). To show that σ_{C_f}(r) ≥ f(r), it suffices to exhibit y ∈ C_f such that ⟨r, y⟩ = f(r). Consider epi(f), which by Proposition 3.34 is a closed convex cone (since f is assumed to be closed). By Theorem 2.23, there exists a supporting hyperplane for epi(f) at (r, f(r)). Let this hyperplane be defined by (y, η) ∈ R^d × R and α ∈ R such that epi(f) ⊆ H⁻((y, η), α). Using Problems 8 and 9 from "HW for Week IX", one can assume that α = 0 and η < 0. After normalizing, this means that epi(f) ⊆ H⁻((y/(−η), −1), 0). This implies that for every r' ∈ R^d, (r', f(r')) ∈ H⁻((y/(−η), −1), 0), which implies that ⟨r', y/(−η)⟩ ≤ f(r') for all r' ∈ R^d. So y/(−η) ∈ C_f. Moreover, since H((y/(−η), −1), 0) is a supporting hyperplane at (r, f(r)), we must have ⟨r, y/(−η)⟩ − f(r) = 0. So we are done.

We now show that C_{σ_C} = C for any closed, convex set C. Consider any x ∈ C. Then ⟨r, x⟩ ≤ sup_{y∈C} ⟨r, y⟩ = σ_C(r). Therefore, x ∈ H⁻(r, σ_C(r)) for all r ∈ R^d. This shows that x ∈ C_{σ_C}, and therefore C ⊆ C_{σ_C}. To show the reverse inclusion, consider any y ∉ C. Since C is a closed, convex set, there exists a separating hyperplane H(a, δ) such that C ⊆ H⁻(a, δ) and ⟨a, y⟩ > δ. C ⊆ H⁻(a, δ) implies that σ_C(a) = sup_{x∈C} ⟨a, x⟩ ≤ δ. Since C_{σ_C} has ⟨a, x⟩ ≤ σ_C(a) as a defining halfspace, and ⟨a, y⟩ > δ ≥ σ_C(a), we observe that y ∉ C_{σ_C}.

d 1374 Proposition 3.57. Let f : R R be a sublinear function, and let Cf be defined as in Theorem 3.56. → ◦ d ◦ 1375 Then y Cf if and only if (y, 1) epi(f) . In other words, Cf = y R :(y, 1) epi(f) . ∈ − ∈ { ∈ − ∈ } Proof. We simply observe the following equivalences.

d y Cf r, y f(r) r R ∈ ⇔ h i ≤ ∀ ∈ r, y t r Rd, t R such that f(r) t ⇔ h i ≤ ∀ ∈ ∈ ≤ r, y t 0 r Rd, t R such that f(r) t ⇔ h i − ≤ ∀ ∈ ∈ ≤ (r, t), (y, 1) 0 r Rd, t R such that f(r) t ⇔ h(y, 1), (r−, t)i ≤ 0 ∀(r∈, t) epi(∈ f) ≤ ⇔( hy, −1) epi(fi) ≤◦ ∀ ∈ ⇔ − ∈


[Figure 1: Illustration of Propositions 3.57 and 3.58. (a) A general sublinear function: epi(f), its polar epi(f)°, and C_f recovered as the slice epi(f)° ∩ {t = −1}. (b) A nonnegative sublinear function: additionally, (C_f)° appears as the slice epi(f) ∩ {t = 1}.]

When f is a nonnegative sublinear function, even more can be said.

Proposition 3.58. Let f : R^d → R be a sublinear function that is nonnegative everywhere, and let C_f be defined as in Theorem 3.56. Then f = γ_{(C_f)°}, i.e., f is the gauge function of (C_f)°. Consequently, (C_f)° = {y ∈ R^d : (y, 1) ∈ epi(f)} = {y ∈ R^d : f(y) ≤ 1}.

Proof. Since f ≥ 0, epi(f) ⊆ {(r, t) : t ≥ 0}. Therefore, (0, −1) ∈ epi(f)°. By Proposition 3.57, 0 ∈ C_f. Moreover, by Theorems 3.56 and 3.52, f = σ_{C_f} = γ_{(C_f)°}. By Theorem 3.40, part 2., this shows that (C_f)° = {y ∈ R^d : f(y) ≤ 1}. By Problem 10 from the "HW for Week XI", we have that (C_f)° = {y ∈ R^d : (y, 1) ∈ epi(f)} = {y ∈ R^d : f(y) ≤ 1}.

3.6 Directional derivatives, subgradients and subdifferential calculus

Let us look at directional derivatives of convex functions more closely. Let f : R^d → R be any function and let x ∈ R^d and r ∈ R^d. We define the directional derivative of f at x in the direction r as

    f'(x; r) := lim_{t↓0} (f(x + tr) − f(x))/t,    (3.5)

if that limit exists. We will be speaking of f'(x; ·) as a function from R^d → R. When the function f is convex, this function has very nice properties.
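Numerically, (3.5) can be approximated with a small t > 0; for convex f, Lemma 3.59 below guarantees that the difference quotient decreases monotonically to f'(x; r) as t ↓ 0, so the approximation is always an upper bound. A tiny sketch (ours) for f = ‖·‖₁:

    import numpy as np

    def dir_deriv(f, x, r, t=1e-8):
        # One-sided difference quotient; for convex f this upper bounds
        # f'(x; r) and converges to it as t decreases to 0 (Lemma 3.59).
        return (f(x + t * r) - f(x)) / t

    f = lambda x: np.linalg.norm(x, 1)
    x, r = np.array([1.0, 0.0]), np.array([-1.0, 1.0])
    print(dir_deriv(f, x, r))   # ~0.0: the decrease in |x_1| exactly cancels the increase in |x_2|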

Lemma 3.59. If f : R^d → R is convex, the expression (f(x + tr) − f(x))/t is a non-decreasing function of t, for t ≠ 0.

Proof. By Proposition 3.26, the function φ(t) = f(x + tr) is a convex function. By Proposition 3.24, we observe that (φ(t) − φ(0))/t is a non-decreasing function of t.

Proposition 3.60. Let f : R^d → R be a convex function, and let x ∈ R^d. Then the limit in (3.5) exists and is finite for all r ∈ R^d, and the function f'(x; ·) : R^d → R is sublinear.

Proof. By Proposition 3.26, the function φ(t) = f(x + tr) is a convex function, and f'(x; r) = lim_{t↓0} (φ(t) − φ(0))/t. By Lemma 3.59, (φ(t) − φ(0))/t is a non-decreasing function of t for t ≠ 0 and, restricting to t > 0, it is lower bounded by its value at t = −1, i.e., by (φ(−1) − φ(0))/(−1). Therefore, lim_{t↓0} (φ(t) − φ(0))/t exists and is in fact equal to inf_{t>0} (φ(t) − φ(0))/t.

We now prove positive homogeneity of f'(x; ·). For any r ∈ R^d and λ > 0, we obtain that

    f'(x; λr) = lim_{t↓0} (f(x + tλr) − f(x))/t
              = λ lim_{t↓0} (f(x + tλr) − f(x))/(λt)
              = λ lim_{t'↓0} (f(x + t'r) − f(x))/t'
              = λ f'(x; r).

We next establish that f'(x; ·) is convex. Consider any r¹, r² ∈ R^d and λ ∈ (0, 1).

    f'(x; λr¹ + (1−λ)r²) = lim_{t↓0} (f(x + t(λr¹ + (1−λ)r²)) − f(x))/t
        = lim_{t↓0} (f(λ(x + tr¹) + (1−λ)(x + tr²)) − λf(x) − (1−λ)f(x))/t
        ≤ lim_{t↓0} (λf(x + tr¹) + (1−λ)f(x + tr²) − λf(x) − (1−λ)f(x))/t
        = λ lim_{t↓0} (f(x + tr¹) − f(x))/t + (1−λ) lim_{t↓0} (f(x + tr²) − f(x))/t
        = λf'(x; r¹) + (1−λ)f'(x; r²),

where the inequality follows from convexity of f. By Proposition 3.32, the function f'(x; ·) is sublinear.

Theorem 3.61. Let f : Rd R be a convex function, and let x Rd. Then → ∈

∂f(x) = Cf 0(x;·).

0 1403 In other words, f (x; ) is the support function for the subdifferential ∂f(x). · Proof. Recall from Definitions 3.14 and 3.17 that

∂f(x) = s Rd : s, y x f(y) f(x) y Rd { ∈ h − i ≤ − ∀ ∈ } = s Rd : s, r f(x + r) f(x) r Rd . { ∈ h i ≤ − ∀ ∈ } Thus, we have the following equivalences.

s ∂f(x) s, r f(x + r) f(x) r Rd ∈ ⇔ h i ≤ − ∀ ∈ s, tr f(x + tr) f(x) r Rd, t > 0 ⇔ h i ≤f(x+tr)−f(x−) ∀ ∈ s, r r Rd, t > 0 ⇔ h i ≤ t ∀ ∈ s, r f 0(x; r) r Rd ⇔ h i ≤ ∀ ∈ d s Cf 0(x;r) r R , ⇔ ∈ ∀ ∈ f(x+tr)−f(x) 1404 where the second-to-last equivalence follows the fact that t is a decreasing function of t by 1405 Lemma 3.59, and the last equivalence follows from the definition of Cf 0(x;r) in (3.4).

A characterization of differentiability for convex functions can be obtained using these concepts.

Theorem 3.62. Let f : R^d → R be a convex function, and let x ∈ R^d. Then the following are equivalent.

(i) f is differentiable at x.

(ii) f'(x; ·) is a linear function, given by f'(x; r) = ⟨a_x, r⟩ for some a_x ∈ R^d.

[Figure 2: A picture illustrating the relationship between the sublinear function f'(x; ·), the set C_{f'(x;·)} = ∂f(x), an affine support hyperplane f(x) + ⟨s, y − x⟩ ≤ f(y) given by an element s ∈ ∂f(x), and the polar epi(f'(x; ·))° with its slice at t = −1. Recall the relationships from Figure 1.]

(iii) ∂f(x) is a singleton, i.e., there is a unique subgradient for f at x.

Moreover, if any of the above conditions hold, then ∇f(x) = a_x = s, where s is the unique subgradient in ∂f(x).

Proof. (i) ⇒ (ii). If f is differentiable, then it is well known from calculus that f'(x; r) = ⟨∇f(x), r⟩; thus, setting a_x = ∇f(x) suffices.

(ii) ⇒ (iii). By Theorem 3.61 and (3.4), we obtain that

    ∂f(x) = C_{f'(x;·)} = {s ∈ R^d : ⟨s, r⟩ ≤ f'(x; r) for all r ∈ R^d}
                       = {s ∈ R^d : ⟨s, r⟩ ≤ ⟨a_x, r⟩ for all r ∈ R^d}.

We now observe that if ⟨s, r⟩ ≤ ⟨a_x, r⟩ for all r ∈ R^d, then we must have s = a_x. Therefore, ∂f(x) = {a_x}.

(iii) ⇒ (i). Let s be the unique subgradient at x. We will establish that

    lim_{h→0} |f(x + h) − f(x) − ⟨s, h⟩| / ‖h‖ = 0,

thus showing that f is differentiable at x with gradient s. In other words, given any δ > 0, we must find ε > 0 such that h ∈ B(0, ε) implies |f(x + h) − f(x) − ⟨s, h⟩|/‖h‖ < δ.

Suppose to the contrary that for some δ > 0 and every k ≥ 1 there exists h_k such that ‖h_k‖ =: t_k ≤ 1/k and |f(x + h_k) − f(x) − ⟨s, h_k⟩|/t_k ≥ δ. Since h_k/t_k is a sequence of unit norm vectors, by Theorem 1.10 there is a convergent subsequence which converges to some r with unit norm. To keep the notation easy, we relabel indices so that {h_k/t_k}_{k=1}^∞ is the convergent sequence. Using Theorem 3.21, there exists a constant L := L(B(0, 1)) such that |f(y) − f(z)| ≤ L‖y − z‖ for all y, z ∈ B(0, 1). Noting that h_k and t_k r are in the unit ball B(0, 1) for all k ≥ 1 (since t_k ≤ 1/k),

    δ ≤ |f(x + h_k) − f(x) − ⟨s, h_k⟩| / t_k
      ≤ ( |f(x + h_k) − f(x + t_k r)| + |f(x + t_k r) − f(x) − ⟨s, t_k r⟩| + |⟨s, t_k r⟩ − ⟨s, h_k⟩| ) / t_k
      ≤ L‖t_k r − h_k‖/t_k + |f(x + t_k r) − f(x) − ⟨s, t_k r⟩|/t_k + |⟨s, t_k r − h_k⟩|/t_k
      ≤ L‖r − h_k/t_k‖ + |(f(x + t_k r) − f(x))/t_k − ⟨s, r⟩| + ‖s‖ ‖r − h_k/t_k‖
      = (L + ‖s‖) ‖r − h_k/t_k‖ + |(f(x + t_k r) − f(x))/t_k − ⟨s, r⟩|,

where the Cauchy-Schwarz inequality is used in the last inequality. We now let k → ∞. The first term in the last expression goes to 0, since h_k/t_k converges to r. In the second term, (f(x + t_k r) − f(x))/t_k goes to its limit, which is the directional derivative f'(x; r). By Theorem 3.61, f'(x; r) = sup_{y∈∂f(x)} ⟨y, r⟩ = ⟨s, r⟩, because by assumption ∂f(x) = {s}. Thus, the second term also goes to 0. This contradicts δ > 0.

1425 Theorem 3.63. Subdifferential calculus. The following are all true.

d 1. Let f1, f2 : R R be convex functions and let t1, t2 0. Then → ≥ d ∂(t1f1 + t2f2)(x) = t1∂f1(x) + t2∂f2(x) for all x R . ∈ 2. Let A Rm×d and b Rm and let T (x) = Ax + b be the corresponding affine map from Rd Rm ∈ ∈ → and let g : Rm R be a convex function. Then → T d ∂(g T )(x) = A ∂g(Ax + b) for all x R . ◦ ∈

53 d 3. Let fj : R R, j J be convex functions for some (possibly infinite) index set J, and let f = → ∈ supj∈J fj. Then cl(conv( ∂fj(x))) ∂f(x), ∪j∈J(x) ⊆ 1426 where J(x) is the set of indices j such that fj(x) = f(x). Moreover, equality holds in the above 1427 relation, if one can impose a structure on J such that J(x) is a compact set.
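Rule 3. is the workhorse in practice: for a pointwise maximum of finitely many affine functions, f(x) = max_j (⟨a_j, x⟩ + b_j), the set J(x) of active indices is finite and each active a_j is a subgradient. A small sketch (ours, with illustrative names):

    import numpy as np

    def max_affine_oracle(A, b, x):
        # f(x) = max_j <a_j, x> + b_j, with the a_j as rows of A.
        # Any row attaining the max is a subgradient (Theorem 3.63, rule 3.).
        vals = A @ x + b
        j = int(np.argmax(vals))
        return float(vals[j]), A[j]

    A = np.array([[1.0, 0.0], [-1.0, 0.0]])
    b = np.array([-1.0, 2.0])
    print(max_affine_oracle(A, b, np.array([0.0, 0.0])))   # (2.0, array([-1., 0.]))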

4 Optimization

We now begin our study of the general convex optimization problem

    inf_{x∈C} f(x),    (4.1)

where f : R^d → R is a convex function and C is a closed, convex set. We first observe that local minimizers are global minimizers for convex optimization problems.

Definition 4.1. Let g : R^d → R be any function (not necessarily convex) and let X ⊆ R^d be any set (not necessarily convex). Then x* ∈ X is said to be a local minimizer for the problem inf_{x∈X} g(x) if there exists ε > 0 such that g(y) ≥ g(x*) for all y ∈ B(x*, ε) ∩ X.

x* ∈ X is said to be a global minimizer if g(y) ≥ g(x*) for all y ∈ X.

1439 Theorem 4.2. Any local minimizer for (4.1) is a global minimizer.

? ? ? 1440 Proof. Let x be a local minimizer, i.e., there exists  > 0 such that f(y) f(x ) for all y B(x , ) C. ? ≥ ? ∈ ∩ 1441 Suppose to the contrary that there exists y¯ C such that f(y¯) < f(x ). Then y¯ B(x , ); otherwise, it ? ∈ ? 6∈ ? 1442 would contradict f(y) f(x ) for all y B(x , ) C. Consider the line segment [x , y¯]. It must intersect ? ≥ ? ∈ ∩ ? 1443 B(x , ) in a point other than x . Therefore, there exists 1 > λ > 0 such that x¯ = λx + (1 λ)y¯ is in ? ? ? − 1444 B(x , ). By convexity of f, f(x¯) λf(x ) + (1 λ)f(y¯). Since λ (0, 1) and f(y¯) < f(x ), this implies ? ≤ − ∈ ? 1445 that f(x¯) < f(x ). Moreover, since C is convex, x¯ C, and so x¯ B(x , ) C. This contradicts that ? ? ∈ ∈ ∩ 1446 f(y) f(x ) for all y B(x , ) C. ≥ ∈ ∩ 1447 We now give a characterization of global minimizers of (4.1) in terms of the local geometry of C and 1448 the first order properties of f, i.e., its subdifferential ∂f. We first need some concepts related to the local 1449 geometry of a convex set.

Definition 4.3. Let C Rd be a convex set, and let x C. Define the cone of feasible directions as ⊆ ∈ d FC (x) = r R :  > 0 such that x + r C . { ∈ ∃ ∈ } 2 1450 FC (x) may not be a closed cone – consider C as the unit circle in R and x = ( 1, 0); then FC (x) = 2 − 1451 r R : r1 > 0 0 . It is much nicer to work with its closure. { ∈ } ∪ { } d 1452 Definition 4.4. Let C R be a convex set, and let x C. The tangent cone of C at x is TC (x) := ⊆ ∈ 1453 cl(FC (x)).

1454 The final concept related to the local geometry of closed, convex sets will be the normal cone.

d 1455 Definition 4.5. Let C R be a convex set, and let x C. The normal cone of C at x is NC (x) := r d ⊆ ∈ { ∈ 1456 R : r, x r, y y C . h i ≥ h i ∀ ∈ }

54 d 1457 The normal cone NC (x) is the set of vectors r R such that x is the maximizer over C for the ∈ d 1458 corresponding linear functional r, , i.e., r, x = supy∈C r, y . Moreover, since NC (x) = r R : h ·i h i h i { ∈ 1459 r, y x 0 y C which is an intersection of halfspaces with the origin on the boundary, it is h − i ≤ ∀ ∈ } 1460 immediate that NC is a closed, convex cone. Note that any nonzero vector r NC (x) defines a supporting ∈ 1461 hyperplane H(r, r, x ) at x. h i d 1462 Proposition 4.6. Let C R be a convex set, and let x C. Then FC (x),TC (x) and NC (x) are all convex ⊆ ∈ ◦ 1463 cones, with TC (x),NC (x) being closed, convex cones. Moreover, NC (x) = TC (x) , i.e., the tangent cone and 1464 the normal cone are polars of each other.

1465 Proof. See Problem4 in “HW for Week X”.

1466 We are now ready to state the characterization of a global minimizer of (4.1), in terms of the local 1467 geometry of C and the first-order information of f. d 1468 Theorem 4.7. Let f : R R be a convex function, and C be a closed, convex set. Then the following are → 1469 all equivalent. ? 1470 1. x is a global minimizer of (4.1).

0 ? ? 1471 2. f (x ; y x ) 0 for all y C. − ≥ ∈ 0 ? ? 1472 3. f (x ; r) 0 for all r TC (x ). ≥ ∈ ? ? 1473 4. 0 ∂f(x ) + NC (x ). ∈ ? ? ? 1474 Proof. 1. = 2. Since f(z) f(x ) for all z C, in particular this holds for z = x + t(y x ) for all ⇒ f(x?+t(y≥−x?))−f(x?) ∈ − 1475 0 t 1. Therefore, t 0 for all t (0, 1). Taking the limit as t 0, we obtain that 0≤ ? ≤ ? ≥ ∈ → 1476 f (x ; y x ) 0. − ≥ 0 ? ? 1477 2. = 3. We first show that f (x ; r) 0 for all r FC (x). Let  > 0 such that y = x + r C. ⇒ 0 ? ? 0 ? ≥ 0 ∈ 0 ? ∈ 1478 By assumption, 0 f (x ; y x ) = f (x ; r) = f (x; r), using the positive homogeneity of f (x ; ), since 0 ? ≤ − 0 ? · ? 1479 f (x ; ) is sublinear by Proposition 3.60. Diving by , we obtain that f (x ; r) 0 for all r FC (x ). · 0 ? ≥ ∈ 1480 Since f (x ; ) is sublinear, it is convex by Proposition 3.32, and thus, it is continuous by Theorem 3.21. · 1481 Consequently, it must be nonnegative on TC (x) = cl(FC (x)), because it is nonnegative on FC (x). ? ? 1482 3. = 4. Suppose to the contrary that 0 ∂f(x ) + NC (x ). Since f is assumed to be finite-valued ⇒ d 6∈ ? 1483 everywhere, dom(f) = R . Thus, by Problem 15 in “HW for Week IX”, ∂f(x ) is a compact, convex set. ? 1484 Moreover, NC (x ) is a closed, convex cone by Proposition 4.6. Therefore, by Problem6 in “HW for Week ? ? 1485 II”, ∂f(x ) + NC (x ) is a closed, convex set. By the separating hyperplane theorem (Theorem 2.20), there d ? ? 1486 exist a R , δ R such that 0 = a, 0 > δ a, v for all v ∂f(x ) + NC (x ). ∈ ∈ h i ≥ h ?i ∈ ? 1487 First, we claim that a, n 0 for all n NC (x ). Otherwise, consider n¯ NC (x ) such that a, n¯ > 0. ? h i ≤ ∈ ? ∈ h i 1488 Since NC (x ) is a convex cone, λn¯ NC (x ) for all λ 0. But then consider any s ∂f(x) (which is ∈ ≥ ∈ 1489 nonempty by Problem 15 in “HW for Week IX”) and the set of points s + λn¯. Since a, n¯ > 0, we can find h i ? ? 1490 λ 0 large enough such that a, s + λn¯ > δ, contradicting that δ a, v for all v ∂f(x ) + NC (x ). ≥ h ? i ≥? h◦ i ? ∈ 1491 Since a, n 0 for all n NC (x ), we obtain that a NC (x ) = TC (x ), by Proposition 4.6. Now h i ≤ ? ∈ ? ? ∈ ? 1492 we use the fact that ∂f(x ) ∂f(x ) + NC (x ), since 0 NC (x ). This implies that a, s δ < 0 ? ⊆? ∈ h i ≤ 1493 for all s ∂f(x ). Since ∂f(x ) is a compact, convex set, this implies that sups∈∂f(x?) a, s < 0. From ∈ 0 ? h i 1494 Theorem 3.61, f (x ; a) = σ∂f(x?)(a) = sups∈∂f(x?) a, s < 0. This contradicts the assumption of 3., because ? h i 1495 we showed above that a TC (x ). ∈ ? ? ? ? 4. = 1. Consider any y C. Since 0 ∂f(x ) + NC (x ), there exist s ∂f(x ) and n NC (x ) such ⇒ ? ∈ ? ∈ ? ∈ ∈ that 0 = s + n. Now, y x TC (x ) and so y x , n 0 by Proposition 4.6. Since we have − ∈ h − i ≤ 0 = y x?, 0 = y x?, s + y x?, n , h − i h − i h − i ? ? ? ? 1496 this implies that y x , s 0. By definition of subgradient, f(y) f(x ) + s, y x f(x ). Since h − i ≥ ? ≥ h − i ≥ 1497 the choice of y C was arbitrary, this shows that x is a global minimizer. ∈

55 ? ? ? 1498 Corollary 4.8. Let x be a minimizer for (4.1). If x int(C), then 0 ∂f(x ). In particular, if f is real- d ∈ ∈ 1499 valued everywhere and C = R , then a minimizer of f must contain 0 in its subdifferential. Consequently, 1500 if f is differentiable everywhere, the gradient at the minimizer must be 0.

1501 Proof. This follows from the fact that for any convex set C and any y int(C), NC (y) = 0 (Why?). ∈ { }

1502 Algorithmic setup: First-order oracles. To tackle the problem (4.1) computationally, we have to set 1503 up a precise way to access the values/subgradients of the function f and test if given points belong to the 1504 set C or not. To make this algorithmically clean, we define first-order oracles.

d 1505 Definition 4.9. A first order oracle for a convex function f : R R is an oracle/algorithm/black-box that d → 1506 takes as input any x R and returns f(x) and some s ∂f(x). A first order oracle for a closed, convex set d ∈ ∈ d 1507 C R is an oracle/algorithm/black-box that takes as input any x R and either correctly reports that ⊆ ∈ d 1508 x C or correctly reports a separating hyperplane separating x from C, i.e., it returns a R , δ R such ∈ − ∈ ∈ 1509 that C H (a, δ) and a, x > δ. Such an oracle is also known as a separation oracle. ⊆ h i

1510 4.1 Subgradient algorithm

1511 To build up towards an algorithm that assumes only first-order oracles for f and C, we will first look at the d 1512 situation where we have a first order oracle for f, and a stronger oracle for C which, given any x R , can ∈ 1513 report the closest point in C to x (assuming C is nonempty). Recall that in the proof of Theorem 2.20, we 1514 had shown that such a closest point always exists as long as C is a nonempty, closed, convex set. In fact, 1515 the proof holds even for a closed set; convexity was not sued to show the existence of a closest point. We 1516 now strengthen the observation by showing that under the additional assumption of convexity, the closest 1517 point is unique.

d d 1518 Proposition 4.10. Let C R be a nonempty, closed, convex set and let x R . Then there is a unique ? ⊆ ? ∈ 1519 point x C such that x x x y for all y C. ∈ k − k ≤ k − k ∈ Proof. If x C, then the conclusion is true by setting x? = x. So we assume x C. Following the proof of Theorem ∈2.20, there exists a closest point x? C and a = x x? satisfies a, y6∈ x? 0 for all y C. Thus, ∈ − h − i ≤ ∈ x y 2 = a + (x? y) 2 = a 2 + a, x? y + x? y 2 > x? y 2, k − k k − k k k h − i k − k k − k ? 1520 where the last inequality follows form the fact that a = 0 and a, x y 0. 6 h − i ≥

1521 Definition 4.11. ProjC (x) will denote the unique closest point (under the standard Euclidean norm) in C 1522 to x.

d 1523 Note that an oracle that reports ProjC (x) for any x R is stronger than a separation oracle for C, ∈ 1524 because Proj (x) = x if and only if x C, and when Proj (x) = x, then one can use a = x Proj (x) C ∈ C 6 − C 1525 and δ = a, ProjC (x) as a separating hyperplane; see the proof of Theorem 2.20. Even so, for “simple” h i d 1526 sets C, computing ProjC (x) is not a difficult task. For example, when C = R+, then ProjC (x) = y, where 1527 y = max 0, xi for all i = 1, . . . , d. i { } 1528 We now give a simple and elegant algorithm to solve the problem (4.1) when one has access to an oracle d d 1529 that can output ProjC (x) for any x R , and a first-order oracle for f : R R. The algorithm does not ∈ → 1530 assume any properties beyond convexity for the function f (e.g., differentiability). Note that, in particular, n n 1531 when we have no constraints, i.e., C = R , then ProjC (x) = x for all x R . Therefore, this algorithm can ∈ 1532 be used for unconstrained optimization of general convex functions with only a first-order oracle for f.

56 1533 Subgradient Algorithm. 0 1534 1. Choose any sequence h0, h1,..., of strictly positive numbers. Let x C (which can be found by d ∈ 1535 taking an arbitrary point in R and projecting to C).

1536 2. For i = 0, 1, 2,..., do

i i i i 1537 (a) Use the first-order oracle for f to get some s ∂f(x ). If s = 0, then stop and report x as the ∈ 1538 optimal point. i+1 i si  1539 (b) Set x = Proj x hi i . C − ks k 0 1 1540 The points x , x ,... will be called the iterates of the Subgradient Algorithm. We now do a simple 1541 convergence analysis for the algorithm. First, a simple observation about the point ProjC (x). Lemma 4.12. Let C Rd be a closed, convex set, let x? C and x Rd (not necessarily in C). Then ⊆ ∈ ∈ Proj (x) x? x x? . k C − k ≤ k − k Proof. The interesting case is when x C. The proof of Theorem 2.20 shows that if we set a = x ProjC (x), then a, Proj (x) y 0 for all y 6∈C; in particular, a, Proj (x) x? 0. We now observe− that h C − i ≥ ∈ h C − i ≥ ? 2 ? 2 x x = x ProjC (x) + ProjC (x) x k − k k − ? 2 − k = a + ProjC (x) x k 2 − k ? 2 ? = a + ProjC (x) x + 2 a, ProjC (x) x kProjk (kx) x? 2,− k h − i ≥ k C − k ? 1542 since a, Proj (x) x 0. h C − i ≥ d ? Theorem 4.13. Let f : R R be a convex function, and let x arg minx∈C f(x) (i.e, we assume a minimizer exists for the problem).→ Suppose x0 B(x?,R) for some∈ real number R 0. Let M := M(B(x?,R)) be a Lipschitz constant for f, guaranteed∈ to exist by Theorem 3.21, i.e., ≥f(x) f(y) M x y for all x, y B(x?,R). Let x0, x1,... be the sequence of iterates obtained by| the Subgradient− | ≤ Algorithmk − k above and assume∈ that a zero subgradient was not reported in any iteration. Then, for every k 0, ≥ k R2 + P h2  min f(xi) f(x?) + M i=0 i . i=0,...,k ≤ Pk 2 i=0 hi

i i ? i ? hs ,x −x i i Proof. Define ri = x x and vi = i for i = 0, 1, 2,.... Note that vi 0 for all i 0 since s is k − k ks k ≥ ≥ a subgradient at xi and x? is the minimizer (Verify!!). We next observe that

2 i si  ? 2 ri+1 = ProjC x hi ksik x k i− − k i s ? 2 x hi ksik x by Lemma 4.12 ≤ k i − ? 2 − 2 k = x x + hi 2hivi k2 − 2 k − = r + h 2hivi i i − 1543 Adding these inequalities for i = 0, 1, . . . , k, we obtain that

k k 2 2 X 2 X r r + h 2 hivi. (4.2) k+1 ≤ 0 i − i=0 i=0

min 2 0 ? 2 2 Let vmin = mini=0,...,k vi and let i be such that vmin = vimin . Using the fact that r0 = x x R , and that r2 0, we obtain from (4.2) that k − k ≤ k+1 ≥ k k k X X 2 X 2 vmin(2 hi) 2 hivi R + h . ≤ ≤ i i=0 i=0 i=0

57 1544 Consequently, 2 Pk 2 R + i=0 hi vmin . (4.3) ≤ Pk 2 i=0 hi

min min min min si si , y = si , xi h i h i

min xi

vmin

x?

imin Figure 3: Using vmin to bound the function value. The line through x¯ and x represents the hyperplane min min min H := H(si , si , xi ). h i

min min min min min Consider the hyperplane H := H(si , si , xi ) passing through xi , orthogonal to si . Let x¯ ? h i ? be the point on H closest to x ; see Figure3. By Problem 12 in “HW for Week IX”, vmin = x¯ x . 0 ? ? k − k Moreover, vmin v0 x x R. Therefore, x¯ B(x ,R). Using the Lipschitz constant M, we obtain ≤? ≤ k − k ≤ imin ∈ imin imin imin that f(x¯) f(x ) + Mvmin. Finally, since s ∂f(x ), we must have that f(x¯) f(x ) + s , x¯ min ≤ min min ∈ ≥ h − xi = f(xi ), since x¯, xi H. Therefore, we obtain i ∈  2 Pk 2  i imin ? ? R + i=0 hi min f(x ) f(x ) f(x¯) f(x ) + Mvmin f(x ) + M , i=0,...,k ≤ ≤ ≤ ≤ Pk 2 i=0 hi

where the last inequality follows from (4.3).

If we fix the number of steps of the algorithm to be $N \in \mathbb{N}$, then the choice of $h_0, \ldots, h_N$ that minimizes $\frac{R^2 + \sum_{i=0}^N h_i^2}{2\sum_{i=0}^N h_i}$ is $h_i = \frac{R}{\sqrt{N+1}}$ for all $i = 0, \ldots, N$, which yields the following corollary.

Corollary 4.14. Let $f : \mathbb{R}^d \to \mathbb{R}$ be a convex function, and let $x^\star \in \arg\min_{x\in C} f(x)$. Suppose $x^0 \in B(x^\star, R)$ for some real number $R \ge 0$. Let $M := M(B(x^\star, R))$ be a Lipschitz constant for $f$. Let $N \in \mathbb{N}$ be any natural number, and set $h_i = \frac{R}{\sqrt{N+1}}$ for all $i = 0, \ldots, N$. Then the iterates of the Subgradient Algorithm, with this choice of $h_i$, satisfy
\[
\min_{i=0,\ldots,N} f(x^i) \le f(x^\star) + \frac{MR}{\sqrt{N+1}}.
\]

Turning this around, if we want to be within $\epsilon$ of the optimal value $f(x^\star)$ for some $\epsilon > 0$, we should run the Subgradient Algorithm for $\frac{M^2R^2}{\epsilon^2}$ iterates, with $h_i = \frac{\epsilon}{M}$.
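To make the algorithm concrete, here is a minimal NumPy sketch of the Subgradient Algorithm, assuming the caller supplies the two oracles (a subgradient oracle for $f$ and the projection map for $C$); the function names and the test problem are ours, and we track the best iterate seen since $f(x^i)$ need not decrease monotonically:

```python
import numpy as np

def subgradient_method(f, subgrad, proj, x0, steps):
    """Projected subgradient method for min f(x) over a closed convex set C.

    f:       the convex objective (used only to track the best iterate)
    subgrad: returns some s in the subdifferential of f at x (first-order oracle)
    proj:    returns Proj_C(x)
    steps:   iterable of strictly positive step sizes h_0, h_1, ...
    """
    x = proj(np.asarray(x0, dtype=float))
    best_x, best_val = x, f(x)
    for h in steps:
        s = subgrad(x)
        norm = np.linalg.norm(s)
        if norm == 0.0:                 # zero subgradient: x is optimal
            return x
        x = proj(x - h * s / norm)      # step 2(b) of the algorithm
        if f(x) < best_val:
            best_x, best_val = x, f(x)
    return best_x

# Example: minimize f(x) = ||x - c||_1 over C = R^3_+; the minimizer is max(c, 0).
c = np.array([1.0, -2.0, 3.0])
f = lambda x: np.sum(np.abs(x - c))
subgrad = lambda x: np.sign(x - c)      # a valid subgradient of the l1 distance
proj = lambda x: np.maximum(x, 0.0)
N = 1000
steps = np.full(N + 1, 1.0 / np.sqrt(N + 1))  # h_i = R/sqrt(N+1), taking R = 1 for illustration
print(subgradient_method(f, subgrad, proj, np.zeros(3), steps))  # approx. (1, 0, 3)
```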

If we theoretically let the algorithm run for infinitely many steps, we would hope to make the difference between $\min_i f(x^i)$ and $f(x^\star)$ go to $0$ in the limit. This, of course, depends on choosing the sequence $h_0, h_1, \ldots$ so that the expression $\frac{R^2 + \sum_{i=0}^k h_i^2}{2\sum_{i=0}^k h_i} \to 0$ as $k \to \infty$. There is a general sufficient condition that guarantees this.

Proposition 4.15. Let $\{h_i\}_{i=0}^\infty$ be a sequence of strictly positive real numbers such that $\lim_{i\to\infty} h_i = 0$ and $\sum_{i=1}^\infty h_i = \infty$ (e.g., $h_i = \frac{1}{i}$). Then, for any real number $R$,
\[
\lim_{k\to\infty} \frac{R^2 + \sum_{i=0}^k h_i^2}{2\sum_{i=0}^k h_i} = 0.
\]

Remark 4.16. Corollary 4.14 shows that the subgradient algorithm has a convergence rate that is independent of the dimension! No matter how large $d$ is, as long as one can access subgradients for $f$ and project to $C$, the number of iterations needed to converge to within $\epsilon$ is $O(\frac{1}{\epsilon^2})$. This is important to keep in mind for applications where the dimension is extremely large.

4.2 Generalized inequalities and convex mappings

We first review the notion of a partial order.

Definition 4.17. Let $X$ be any set. A partial order on $X$ is a binary relation $\mathcal{R}$ on $X$, i.e., a subset $\mathcal{R} \subseteq X \times X$, that satisfies certain conditions. We will denote $x \preceq y$ for $x, y \in X$ if $(x, y) \in \mathcal{R}$. The conditions are as follows:

1. $x \preceq x$ for all $x \in X$.
2. $x \preceq y$ and $y \preceq z$ implies $x \preceq z$.

3. $x \preceq y$ and $y \preceq x$ if and only if $x = y$.

We would like to be able to define partial orders on $\mathbb{R}^m$ for any $m \ge 1$. In doing so, we want to be mindful of the vector space structure of $\mathbb{R}^m$.

Definition 4.18. We will say that a binary relation $\preceq$ on $\mathbb{R}^m$ is a generalized inequality if it satisfies the following conditions.

1. $x \preceq x$ for all $x \in \mathbb{R}^m$.
2. $x \preceq y$ and $y \preceq z$ implies $x \preceq z$.

3. $x \preceq y$ and $y \preceq x$ if and only if $x = y$.
4. $x \preceq y$ implies $x + z \preceq y + z$ for all $z \in \mathbb{R}^m$.
5. $x \preceq y$ implies $\lambda x \preceq \lambda y$ for all $\lambda \ge 0$.

Generalized inequalities have an elegant geometric characterization.

Proposition 4.19. Let $K \subseteq \mathbb{R}^m$ be a closed, convex, pointed cone. Then the relation $\preceq_K$ on $\mathbb{R}^m$ defined by $x \preceq_K y$ if and only if $y - x \in K$ is a generalized inequality. In this case, we say that $\preceq_K$ is the generalized inequality induced by $K$.

Conversely, any generalized inequality $\preceq$ is induced by a unique convex cone, given by $K_\preceq = \{x \in \mathbb{R}^m : 0 \preceq x\}$. In other words, $\preceq$ is the same relation as $\preceq_{K_\preceq}$.

Proof. Left as an exercise.

Example 4.20. Here are some examples of generalized inequalities (a small numerical sketch follows the list).

1. $K = \mathbb{R}^m_+$ induces the generalized inequality $x \preceq_K y$ if and only if $x_i \le y_i$ for all $i = 1, \ldots, m$. This is often abbreviated to $x \le y$, and is sometimes called the "canonical" generalized inequality on $\mathbb{R}^m$.

2. $K = \{x \in \mathbb{R}^d : \sqrt{x_1^2 + \ldots + x_{d-1}^2} \le x_d\}$. This cone is called the Lorentz cone, and the corresponding generalized inequality is called a second order cone constraint (SOCC).

3. Let $m = n^2$ for some $n \in \mathbb{N}$, i.e., consider the space $\mathbb{R}^{n^2}$. Identifying $\mathbb{R}^{n^2} = \mathbb{R}^{n\times n}$ with some ordering of the coordinates, we think of $\mathbb{R}^{n^2}$ as the space of all $n \times n$ matrices. Let $K$ be the cone of all symmetric matrices that are positive semidefinite; see Definition 1.19. The corresponding generalized inequality on $\mathbb{R}^{n^2}$ is called the positive semidefinite cone constraint.
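As a quick illustration of the first two examples, $x \preceq_K y$ can be tested by checking membership of $y - x$ in $K$ (a small sketch; the function names are ours):

```python
import numpy as np

def leq_orthant(x, y):
    """x <=_K y for K = R^m_+, i.e., x_i <= y_i componentwise."""
    return bool(np.all(y - x >= 0))

def leq_lorentz(x, y):
    """x <=_K y for the Lorentz cone K = {z : sqrt(z_1^2+...+z_{d-1}^2) <= z_d}."""
    z = y - x
    return bool(np.linalg.norm(z[:-1]) <= z[-1])

print(leq_orthant(np.array([0.0, 1.0]), np.array([1.0, 1.0])))   # True
print(leq_lorentz(np.zeros(3), np.array([1.0, 1.0, 2.0])))       # True: sqrt(2) <= 2
print(leq_lorentz(np.zeros(3), np.array([2.0, 2.0, 1.0])))       # False
```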

We would like to extend the notion of convex functions to vector-valued maps, for which we will use the notion of generalized inequalities.

Definition 4.21. Let $\preceq_K$ be a generalized inequality on $\mathbb{R}^m$ induced by the cone $K$. We say that $G : \mathbb{R}^d \to \mathbb{R}^m$ is a $K$-convex mapping if

\[
G(\lambda x + (1-\lambda)y) \preceq_K \lambda G(x) + (1-\lambda)G(y)
\]
for all $x, y \in \mathbb{R}^d$ and $\lambda \in (0, 1)$.

Example 4.22. Here are some examples of $K$-convex mappings.

1. Let $K \subseteq \mathbb{R}^m$ be any closed, convex, pointed cone. If $G : \mathbb{R}^d \to \mathbb{R}^m$ is an affine map, i.e., there exist a matrix $A \in \mathbb{R}^{m\times d}$ and a vector $b \in \mathbb{R}^m$ such that $G(x) = Ax + b$, then $G$ is a $K$-convex mapping.

2. Let $m = n^2$ for some $n \in \mathbb{N}$, i.e., consider the space $\mathbb{R}^{n^2}$, and let $\preceq$ be the positive semidefinite cone constraint from part 3. of Example 4.20, i.e., induced by the cone $K$ of positive semidefinite matrices. Let $A_0, A_1, \ldots, A_d$ be fixed $p \times n$ matrices, for some $p \in \mathbb{N}$ (not necessarily equal to $n$). Define $G : \mathbb{R}^d \times \mathbb{R} \to \mathbb{R}^{n^2}$ to be the mapping
\[
G(x, s) = (A_0 + x_1A_1 + \ldots + x_dA_d)^T(A_0 + x_1A_1 + \ldots + x_dA_d) - B,
\]

where $B$ is an arbitrary $n \times n$ matrix. Then $G$ is a $K$-convex mapping.

3. Let $K = \mathbb{R}^m_+$, and let $g_1, \ldots, g_m : \mathbb{R}^d \to \mathbb{R}$ be convex functions. Let $G : \mathbb{R}^d \to \mathbb{R}^m$ be defined as $G(x) = (g_1(x), \ldots, g_m(x))$; then $G$ is a $K$-convex mapping.

4.3 Convex optimization with generalized inequalities

We can now define a very general framework for convex optimization problems, which is more concrete than the abstraction level of black-box first-order oracles, but is still flexible enough to incorporate the majority of convex optimization problems that show up in practice.

Definition 4.23. Let $f : \mathbb{R}^d \to \mathbb{R}$ be a convex function, let $K \subseteq \mathbb{R}^m$ be a closed, convex, pointed cone, and let $G : \mathbb{R}^d \to \mathbb{R}^m$ be a $K$-convex mapping. Then $f, K, G$ define a convex optimization problem with generalized constraints, given as follows:

\[
\inf\{f(x) : G(x) \preceq_K 0\}. \tag{4.4}
\]

Problem 3 in "HW for Week XI" shows that the set $C = \{x \in \mathbb{R}^d : G(x) \preceq_K 0\}$ is a convex set when $G$ is a $K$-convex mapping. Thus, (4.4) is a special case of (4.1).

Example 4.24. Let us look at some concrete examples of (4.4).

1. Linear/Quadratic Programming. Let $f(x) = \langle c, x\rangle$ for some $c \in \mathbb{R}^d$, let $K = \mathbb{R}^m_+$ and let $G : \mathbb{R}^d \to \mathbb{R}^m$ be an affine map, i.e., $G(x) = Ax - b$ for some matrix $A \in \mathbb{R}^{m\times d}$ and a vector $b \in \mathbb{R}^m$. Then (4.4) becomes
\[
\inf\{\langle c, x\rangle : Ax \le b\},
\]
which is the problem of minimizing a linear function over a polyhedron. This is more commonly known as a linear program, in accordance with the fact that the objective and the constraints are all linear. If $f(x) = x^TQx + \langle c, x\rangle$, where $Q$ is a given $d \times d$ positive semidefinite matrix and $c \in \mathbb{R}^d$, then $f$ is a convex function (see Problem 14 from "HW for Week IX"). With $K$ and $G$ as above, (4.4) is called a convex quadratic program.

2. Semidefinite Programming. Let $m = n^2$ for some $n \in \mathbb{N}$ and consider the space $\mathbb{R}^{n^2}$. Let $f(x) = \langle c, x\rangle$ for some $c \in \mathbb{R}^d$, let $K \subseteq \mathbb{R}^{n^2}$ be the positive semidefinite cone, inducing the positive semidefinite cone constraint, and let $G : \mathbb{R}^d \to \mathbb{R}^{n^2}$ be an affine map, i.e., there exist $n \times n$ matrices $F_0, F_1, \ldots, F_d$ such that $G(x) = F_0 + x_1F_1 + \ldots + x_dF_d$. Then (4.4) becomes

\[
\inf\{\langle c, x\rangle : -F_0 - x_1F_1 - \ldots - x_dF_d \text{ is a PSD matrix}\}.
\]

This is known as a semidefinite program.

3. Convex optimization with explicit constraints. Let $f, g_1, \ldots, g_m : \mathbb{R}^d \to \mathbb{R}$ be convex functions. Define $K = \mathbb{R}^m_+$ and define $G : \mathbb{R}^d \to \mathbb{R}^m$ as $G(x) = (g_1(x), \ldots, g_m(x))$, which is the $K$-convex mapping from Example 4.22. Then (4.4) becomes

\[
\inf\{f(x) : g_1(x) \le 0, \ldots, g_m(x) \le 0\}.
\]

4.3.1 Lagrangian duality for convex optimization with generalized constraints

Given that the Subgradient Algorithm is a simple and elegant method for solving unconstrained problems, or problems with "simple" constraint sets $C$ (i.e., when one can compute $\mathrm{Proj}_C(x)$ efficiently), we will try to transform convex optimization problems with more complicated constraints into ones with simple constraints. This is the motivation for what is known as Lagrangian duality.

Note that problem (4.4) is equivalent to the problem

\[
\inf_{x\in\mathbb{R}^d} f(x) + I_{-K}(G(x)), \tag{4.5}
\]

where $I_{-K}$ is the indicator function for the cone $-K$. It can be shown that the function $I_{-K} \circ G$ is a convex function; see Problem 4 from "HW for Week XI". Thus, problem (4.5) is an unconstrained convex optimization problem. However, indicator functions are nasty to deal with because they are not finite valued, and thus, obtaining subgradients at all points becomes impossible. Thus, we try to replace $I_{-K}$ with a "nicer" penalty function $p : \mathbb{R}^m \to \mathbb{R}$, which is not that wildly discontinuous, and is finite-valued everywhere. So we would be looking at the problem

\[
\inf_{x\in\mathbb{R}^d} f(x) + p(G(x)). \tag{4.6}
\]

What properties should we require from our penalty function? First, we would like problem (4.6) to be a convex problem; thus, we impose that

\[
p \circ G : \mathbb{R}^d \to \mathbb{R} \text{ is a convex function.} \tag{4.7}
\]

Next, from an optimization perspective, we would like to have a guaranteed relationship between the function $f(x) + I_{-K}(G(x))$ and the function $f(x) + p(G(x))$. It turns out that a nice property to have is the guarantee that $f(x) + p(G(x)) \le f(x) + I_{-K}(G(x))$ for all $x \in \mathbb{R}^d$. This can be achieved by imposing that
\[
p \text{ is an underestimator of } I_{-K}, \text{ i.e., } p \le I_{-K}. \tag{4.8}
\]

Lagrangian duality theory is the study of penalty functions $p$ that are linear on $\mathbb{R}^m$ and satisfy the two conditions highlighted above. Now, a function $p : \mathbb{R}^m \to \mathbb{R}$ is linear if and only if there exists $c \in \mathbb{R}^m$ such that $p(z) = \langle c, z\rangle$. The following proposition characterizes linear functions that satisfy the two conditions above.

Proposition 4.25. Let $p : \mathbb{R}^m \to \mathbb{R}$ be a linear function given by $p(z) = \langle c, z\rangle$ for some $c \in \mathbb{R}^m$. Then the following are equivalent:

1. $p$ satisfies condition (4.8).

2. $c \in -K^\circ$, i.e., $-c$ is in the polar of $K$.

3. $p$ satisfies conditions (4.7) and (4.8).

Proof. (1. $\Rightarrow$ 2.) Condition (4.8) is equivalent to saying that $p(z) \le 0$ for all $z \in -K$, i.e.,
\[
\begin{array}{rl}
& \langle c, z\rangle \le 0 \text{ for all } z \in -K \\
\Leftrightarrow & \langle c, -z\rangle \le 0 \text{ for all } z \in K \\
\Leftrightarrow & \langle -c, z\rangle \le 0 \text{ for all } z \in K \\
\Leftrightarrow & -c \in K^\circ \\
\Leftrightarrow & c \in -K^\circ.
\end{array}
\]

(2. $\Rightarrow$ 3.) We showed above that $c \in -K^\circ$ is equivalent to condition (4.8). We now check that $c \in -K^\circ$ implies (4.7). Since $G$ is a $K$-convex mapping, we have $\lambda G(x) + (1-\lambda)G(y) - G(\lambda x + (1-\lambda)y) \in K$ for all $x, y \in \mathbb{R}^d$ and $\lambda \in (0, 1)$; since $-c \in K^\circ$, we have $\langle c, w\rangle \ge 0$ for every $w \in K$, and therefore
\[
\begin{array}{rl}
& \langle c, \lambda G(x) + (1-\lambda)G(y) - G(\lambda x + (1-\lambda)y)\rangle \ge 0 \\
\Rightarrow & \langle c, \lambda G(x)\rangle + \langle c, (1-\lambda)G(y)\rangle \ge \langle c, G(\lambda x + (1-\lambda)y)\rangle \\
\Rightarrow & \lambda\langle c, G(x)\rangle + (1-\lambda)\langle c, G(y)\rangle \ge \langle c, G(\lambda x + (1-\lambda)y)\rangle \\
\Rightarrow & \lambda p(G(x)) + (1-\lambda)p(G(y)) \ge p(G(\lambda x + (1-\lambda)y)).
\end{array}
\]

Hence, condition (4.7) is satisfied.

(3. $\Rightarrow$ 1.) Trivial.

Definition 4.26. The set $-K^\circ$ is important in Lagrangian duality, and a separate notation and name has been invented for it: $-K^\circ$ is called the dual cone of $K$ and is denoted by $K^\star$.

The above discussion shows that for any $y \in K^\star$, the optimal value of (4.6), with $p$ given by $p(z) = \langle y, z\rangle$, is a lower bound on the optimal value of (4.4). This motivates the definition of the so-called dual function $\mathcal{L} : \mathbb{R}^m \to \mathbb{R}$ associated with (4.4) as follows:
\[
\mathcal{L}(y) := \inf_{x\in\mathbb{R}^d} f(x) + \langle y, G(x)\rangle \tag{4.9}
\]

We state the lower bound property formally.

Proposition 4.27 (Weak Duality). Let $f : \mathbb{R}^d \to \mathbb{R}$ be convex, let $K \subseteq \mathbb{R}^m$ be a closed, convex, pointed cone, and let $G : \mathbb{R}^d \to \mathbb{R}^m$ be a $K$-convex mapping. Let $\mathcal{L} : \mathbb{R}^m \to \mathbb{R}$ be as defined in (4.9). Then, for all $\bar{x} \in \mathbb{R}^d$ such that $G(\bar{x}) \preceq_K 0$ and all $\bar{y} \in K^\star$, we must have $\mathcal{L}(\bar{y}) \le f(\bar{x})$. Consequently, $\mathcal{L}(\bar{y}) \le \inf\{f(x) : G(x) \preceq_K 0\}$.

Proof. We simply follow the inequalities

\[
\mathcal{L}(\bar{y}) = \inf_{x\in\mathbb{R}^d} f(x) + \langle \bar{y}, G(x)\rangle \le f(\bar{x}) + \langle \bar{y}, G(\bar{x})\rangle \le f(\bar{x}),
\]
where the last inequality holds because $G(\bar{x}) \preceq_K 0$ and $\bar{y} \in K^\star$, and so $\langle \bar{y}, G(\bar{x})\rangle \le 0$.

Proposition 4.27 shows that any $y \in K^\star$ provides the lower bound $\mathcal{L}(y)$ on the optimal value of the optimization problem (4.4). The Lagrangian dual optimization problem is the problem of finding the $y \in K^\star$ that provides the best/largest lower bound. In other words, the Lagrangian dual problem is defined as

\[
\sup_{y\in K^\star} \mathcal{L}(y), \tag{4.10}
\]

and Proposition 4.27 can be restated as

\[
\sup\{\mathcal{L}(y) : y \in K^\star\} \le \inf\{f(x) : G(x) \preceq_K 0\}. \tag{4.11}
\]

If we have equality in (4.11), then to solve (4.4), one can instead solve (4.10). This merits a definition.

Definition 4.28 (Strong Duality). We say that we have a zero duality gap if equality holds in (4.11). In addition, if the supremum in (4.10) is attained for some $y \in K^\star$, then we say that strong duality holds.

4.3.2 Solving the Lagrangian dual problem

Before we investigate conditions under which we have zero duality gap or strong duality, let us try to see how one could use the subgradient algorithm to solve (4.10).

Proposition 4.29. $\mathcal{L}(y)$ is a concave function of $y$.

Proof. We have to show that $-\mathcal{L}(y)$ is a convex function of $y$. This follows from the fact that

\[
-\mathcal{L}(y) = -\inf_{x\in\mathbb{R}^d} f(x) + \langle y, G(x)\rangle = \sup_{x\in\mathbb{R}^d} -f(x) + \langle y, -G(x)\rangle,
\]

i.e., $-\mathcal{L}(y)$ is the supremum of affine functions of $y$ of the form $-f(x) + \langle y, -G(x)\rangle$. By part 2. of Theorem 3.12, $-\mathcal{L}(y)$ is convex in $y$.

We could now use the subgradient algorithm to solve (4.10), if we had a first-order oracle for $-\mathcal{L}(y)$ and an algorithm to project to $K^\star$. We show that a subgradient for $-\mathcal{L}(y)$ can be found by solving an unconstrained convex optimization problem.

Proposition 4.30. Let $\bar{y} \in \mathbb{R}^m$ and let $\bar{x} \in \arg\inf_{x\in\mathbb{R}^d} f(x) + \langle \bar{y}, G(x)\rangle$. Then $-G(\bar{x}) \in \partial(-\mathcal{L})(\bar{y})$.

Proof. We express $-\mathcal{L}(y) = \sup_{x\in\mathbb{R}^d} -f(x) + \langle y, -G(x)\rangle$ as the supremum of affine functions, and use part 3. of Theorem 3.63, and the fact that the subdifferential of the affine function $-f(\bar{x}) + \langle y, -G(\bar{x})\rangle$, at $\bar{y}$, is simply $\{-G(\bar{x})\}$.

Now, if we have an algorithm that can compute $\mathrm{Proj}_{K^\star}(y)$ for all $y \in \mathbb{R}^m$, then using Propositions 4.29 and 4.30, one can solve the Lagrangian dual problem (4.10), where in each iteration of the algorithm, one solves the unconstrained problem $\inf_{x\in\mathbb{R}^d} f(x) + \langle \bar{y}, G(x)\rangle$ for a given $\bar{y} \in K^\star$. This inner problem can, in turn, be solved by the subgradient algorithm if one has the appropriate first-order oracles for $f(x)$ and $\langle \bar{y}, G(x)\rangle$.
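A sketch of this outer loop, i.e., projected supergradient ascent on $\mathcal{L}$ with a user-supplied inner minimization and projection onto $K^\star$ (all names here are illustrative; the inner solver could itself be the subgradient algorithm):

```python
import numpy as np

def dual_ascent(inner_min, G, proj_dual_cone, y0, steps):
    """Maximize the concave dual function L(y) = inf_x f(x) + <y, G(x)> over K*.

    inner_min(y):      returns some x_bar in arg inf_x f(x) + <y, G(x)>
    G(x):              the K-convex constraint mapping
    proj_dual_cone(y): projection onto the dual cone K*
    By Proposition 4.30, G(x_bar) is a supergradient of L at y, so we take
    normalized ascent steps along it, projecting back onto K* each time.
    """
    y = proj_dual_cone(np.asarray(y0, dtype=float))
    for h in steps:
        g = G(inner_min(y))          # supergradient of L at y
        norm = np.linalg.norm(g)
        if norm == 0.0:
            return y
        y = proj_dual_cone(y + h * g / norm)
    return y
```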

4.3.3 Explicit examples of the Lagrangian dual

We will now explore some special settings of convex optimization problems with generalized inequalities, and see that the Lagrangian dual has a particularly nice form.

Conic optimization. Let $K \subseteq \mathbb{R}^m$ be a closed, convex, pointed cone. Let $G : \mathbb{R}^d \to \mathbb{R}^m$ be an affine map given by $G(x) = Ax - b$, where $A \in \mathbb{R}^{m\times d}$ and $b \in \mathbb{R}^m$. Let $f : \mathbb{R}^d \to \mathbb{R}$ be a linear function given by $f(x) = \langle c, x\rangle$ for some $c \in \mathbb{R}^d$. Then Problem (4.4) becomes
\[
\inf\{\langle c, x\rangle : Ax \preceq_K b\}. \tag{4.12}
\]
For a fixed cone $K$, problems of the form (4.12) are called conic optimization problems over the cone $K$. As we pick different data $A, b, c$, we get different instances of a conic optimization problem over the cone $K$. A special case is when $K = \mathbb{R}^m_+$, which is known as linear programming or linear optimization (see Example 4.24), i.e., the problem of optimizing a linear function over a polyhedron.

Let us investigate the dual function of (4.12). Recall that $\mathcal{L}(y) = \inf_{x\in\mathbb{R}^d} f(x) + \langle y, G(x)\rangle$, which in this case becomes
\[
\inf_{x\in\mathbb{R}^d} \langle c, x\rangle + \langle y, Ax - b\rangle = \inf_{x\in\mathbb{R}^d} \langle c, x\rangle + \langle y, Ax\rangle - \langle y, b\rangle = \inf_{x\in\mathbb{R}^d} \langle c, x\rangle + \langle A^Ty, x\rangle - \langle y, b\rangle = \inf_{x\in\mathbb{R}^d} \langle c + A^Ty, x\rangle - \langle y, b\rangle.
\]
Now, if $c + A^Ty \ne 0$, then the infimum above is clearly $-\infty$. And if $c + A^Ty = 0$, then the infimum is $-\langle b, y\rangle$. Therefore, for (4.12), the dual function is given by
\[
\mathcal{L}(y) = \begin{cases} -\infty & \text{if } c + A^Ty \ne 0 \\ -\langle b, y\rangle & \text{if } c + A^Ty = 0 \end{cases} \tag{4.13}
\]
Therefore,
\[
\sup_{y\in K^\star} \mathcal{L}(y) = \sup\{-\langle b, y\rangle : A^Ty = -c,\ y \in K^\star\} = -\inf\{\langle b, y\rangle : A^Ty = -c,\ y \in K^\star\}.
\]

To remove the slightly annoying minus sign in front of $c$ above, it is more standard to write (4.12) as $-\sup\{\langle -c, x\rangle : Ax \preceq_K b\}$, and then replace $-c$ with $c$ throughout the above derivation. Thus, the standard primal-dual pairs for conic optimization problems are
\[
\sup\{\langle c, x\rangle : Ax \preceq_K b\} \le \inf\{\langle b, y\rangle : A^Ty = c,\ y \in K^\star\}. \tag{4.14}
\]

Linear Programming/Optimization. Specializing to the linear programming case with $K = \mathbb{R}^m_+$ and observing that $K^\star = K = \mathbb{R}^m_+$ (see Problem 2 from "HW for Week III"), we obtain the primal-dual pair
\[
\sup\{\langle c, x\rangle : Ax \le b\} \le \inf\{\langle b, y\rangle : A^Ty = c,\ y \ge 0\}. \tag{4.15}
\]

Semidefinite Programming/Optimization. Another special case is that of semidefinite optimization. This is the situation when $m = n^2$ and $K$ is the cone of positive semidefinite matrices. $G : \mathbb{R}^d \to \mathbb{R}^{n^2}$ is an affine map from $\mathbb{R}^d$ to the space of $n \times n$ matrices. To avoid dealing with asymmetric matrices, $G$ is always assumed to be of the form $G(x) = x_1A_1 + \ldots + x_dA_d - A_0$, where $A_0, A_1, \ldots, A_d$ are $n \times n$ symmetric matrices.⁴ If one works through the algebra in this case and uses the fact that the positive semidefinite cone is self-dual, i.e., $K = K^\star$, (4.14) becomes

\[
\sup\{\langle c, x\rangle : x_1A_1 + \ldots + x_dA_d - A_0 \text{ is a PSD matrix}\} \le \inf\{\langle A_0, Y\rangle : \langle A_i, Y\rangle = c_i,\ Y \text{ is a PSD matrix}\},
\]
where $\langle X, Z\rangle = \sum_{i,j} X_{ij}Z_{ij}$ for any pair $X, Z$ of $n \times n$ symmetric matrices.

Convex optimization with explicit constraints and objective. Recall part 3. of Example 4.24, where $K = \mathbb{R}^m_+$, $f, g_1, \ldots, g_m : \mathbb{R}^d \to \mathbb{R}$ are convex functions, and $G : \mathbb{R}^d \to \mathbb{R}^m$ was defined as $G(x) = (g_1(x), \ldots, g_m(x))$, giving the explicit problem

\[
\inf\{f(x) : g_1(x) \le 0, \ldots, g_m(x) \le 0\}.
\]
In this case, since $K^\star = K = \mathbb{R}^m_+$ (see Problem 2 from "HW for Week III"), the dual problem is

\[
\sup_{y\in K^\star} \mathcal{L}(y) = \sup_{y\ge 0} \inf_{x\in\mathbb{R}^d} \{f(x) + y_1g_1(x) + \ldots + y_mg_m(x)\}.
\]

⁴Dealing with asymmetric matrices is not hard, but involves little details that can be overlooked for this exposition, and don't provide any great insight.

A closer look at linear programming duality. Consider the following linear program:

\[
\begin{array}{rl}
\sup & 2x_1 - 1.5x_2 \\
\text{s.t.} & x_1 + x_2 \le 1 \\
& x_1 - x_2 \le 1 \\
& -x_1 + x_2 \le 1 \\
& -x_1 - x_2 \le 1
\end{array} \tag{4.16}
\]

To solve this problem, let us make some simple observations. If we multiply the first inequality by 0.5, the second inequality by 3.5, the third by 1.75 and the fourth by 0.25 and add all these scaled inequalities, then we obtain the inequality $2x_1 - 1.5x_2 \le 6$. Now any $x \in \mathbb{R}^2$ satisfying the constraints of the above linear program must also satisfy this new inequality. This shows that our supremum is at most 6. If we instead choose the multipliers 0.25, 1.75, 0, 0 (in order), then we obtain the inequality $2x_1 - 1.5x_2 \le 2$, which gives a better bound of $2 \le 6$ on the optimal solution value. Now, consider the point $x_1 = 1, x_2 = 0$: it has value $2 \cdot 1 - 1.5 \cdot 0 = 2$. Since we have an upper bound of 2 from the above arguments, we know that $x_1 = 1, x_2 = 0$ is actually the optimal solution to the above linear program! Thus, we have provided the optimal solution, and a quick certificate of its optimality. If you think about how we were deriving the upper bounds of 6 and 2, we were looking for nonnegative multipliers $y_1, y_2, y_3, y_4$ such that the corresponding combination of the inequalities gives us $2x_1 - 1.5x_2$ on the left hand side, and the upper bound was simply the right hand side of the combined inequality, which is $y_1 + y_2 + y_3 + y_4$. If the left hand side is to end up as $2x_1 - 1.5x_2$, then we must have $y_1 + y_2 - y_3 - y_4 = 2$ and $y_1 - y_2 + y_3 - y_4 = -1.5$. To get the best upper bound, we want to find the minimum value of $y_1 + y_2 + y_3 + y_4$ such that $y_1 + y_2 - y_3 - y_4 = 2$ and $y_1 - y_2 + y_3 - y_4 = -1.5$, and all $y_i$'s are nonnegative. But this is exactly the dual problem in (4.15). We hope this gives the reader a more "hands-on" perspective on the Lagrangian dual of a linear program.
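One can check all of this numerically; the following sketch solves both sides of the pair (4.15) for the data in (4.16) using scipy.optimize.linprog (which minimizes, so the primal objective is negated):

```python
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
b = np.ones(4)
c = np.array([2.0, -1.5])

# Primal: sup <c, x> s.t. Ax <= b  (x is free, so the default bounds are relaxed)
primal = linprog(-c, A_ub=A, b_ub=b, bounds=[(None, None)] * 2)

# Dual: inf <b, y> s.t. A^T y = c, y >= 0, as in (4.15)
dual = linprog(b, A_eq=A.T, b_eq=c, bounds=[(0, None)] * 4)

print(-primal.fun, dual.fun)   # both 2.0: zero duality gap
print(primal.x, dual.x)        # x = (1, 0), y = (0.25, 1.75, 0, 0)
```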

4.3.4 Strong duality: sufficient conditions and complementary slackness

In the above example of the linear program in (4.16), it turned out that we could find a primal feasible solution and a dual feasible solution that have the same value, which shows that we have strong duality, and certifies the optimality of the two solutions. We will see below that this always happens for linear programs. For general conic optimization problems, or a convex optimization problem with generalized inequalities, this does not always hold and one may not even have zero duality gap. We now supply two conditions under which strong duality is obtained. Linear programming strong duality will be a special case of the second condition.

Slater's condition for strong duality. The following is perhaps the most well-known sufficient condition in convex optimization that guarantees strong duality.

Theorem 4.31 (Slater's condition). Let $f : \mathbb{R}^d \to \mathbb{R}$ be convex, let $K \subseteq \mathbb{R}^m$ be a closed, convex, pointed cone, and let $G : \mathbb{R}^d \to \mathbb{R}^m$ be a $K$-convex mapping. Let $\mathcal{L} : \mathbb{R}^m \to \mathbb{R}$ be as defined in (4.9). If there exists $\bar{x}$ such that $G(\bar{x}) \in -\mathrm{int}(K)$ and $\inf\{f(x) : G(x) \preceq_K 0\}$ is a finite value, then there exists $y^\star \in K^\star$ such that $\sup_{y\in K^\star} \mathcal{L}(y) = \mathcal{L}(y^\star) = \inf\{f(x) : G(x) \preceq_K 0\}$, i.e., strong duality holds.

Before we begin the proof, we need to establish a slight variant of the separating hyperplane theorem that does not make any closedness or compactness assumptions.

Proposition 4.32. Let $A, B \subseteq \mathbb{R}^d$ be convex sets (not necessarily closed) such that $A \cap B = \emptyset$. Then there exists $a \in \mathbb{R}^d \setminus \{0\}$ such that $\langle a, x\rangle \ge \langle a, y\rangle$ for all $x \in A, y \in B$.

Proof. Left as an exercise.

Proof of Theorem 4.31. Let $\mu_0 := \inf\{f(x) : G(x) \preceq_K 0\} \in \mathbb{R}$. Define the sets
\[
\begin{array}{rl}
A := & \{(z, r) \in \mathbb{R}^m \times \mathbb{R} : \exists x \in \mathbb{R}^d \text{ such that } f(x) \le r,\ G(x) \preceq_K z\}, \\
B := & \{(z, r) \in \mathbb{R}^m \times \mathbb{R} : r < \mu_0,\ z \preceq_K 0\}.
\end{array}
\]

It is not hard to verify that $A, B$ are convex. Moreover, it is also not hard to verify that $A \cap B = \emptyset$. By Proposition 4.32, there exist $a \in \mathbb{R}^m$, $\gamma \in \mathbb{R}$ such that

\[
\langle a, z_1\rangle + \gamma r_1 \ge \langle a, z_2\rangle + \gamma r_2 \tag{4.17}
\]

for all $(z_1, r_1) \in A$ and $(z_2, r_2) \in B$.

Claim 3. $a \in K^\star$ and $\gamma \ge 0$.

We now show that, in fact, γ > 0 because of the existence of x¯ assumed in the hypothesis of the theorem. Substitute z1 = G(x¯), r1 = f(x¯), r2 = µ0 1 and z2 = 0 in (4.17). If γ = 0, then this relation becomes − a,G(x¯) 0. h i ≥

1751 However, G(x¯) int(K) and a = 0 and therefore, a,G(x¯) < 0 (see Problem3 from “HW for Week II”). − ∈ 6 h i 1752 By Claim3, γ > 0. ? a ? ? ? Let y := γ ; by Claim3, y K . We will now show that for every  > 0, (y ) µ0 . This will ∈ ? L ? ≥ − establish the result because this means (y ) µ0 and since (y) µ0 for all y K by Proposition 4.27, ? L ≥ L d ≤ ∈ we must have supy∈K? (y) = (y ) = µ0. Consider any x R . z1 = G(x) and r1 = f(x) gives a point L L ∈ in A. Substituting into (4.17) with z2 = 0 and r2 = µ0 , we obtain that a,G(x) + γf(x) γ(µ0 ). Dividing through by γ, we obtain − h i ≥ − ? y ,G(x) + f(x) µ0 . h i ≥ − ? ? 1753 This implies that (y ) = inf d y ,G(x) + f(x) µ0 . L x∈R h i ≥ −

1754 Closed cone condition for strong duality in conic optimization. Slater’s condition applied to conic 1755 optimization problems translates into requiring that there is some x¯ such that b Ax¯ int(K). Another −∗ ∈ 1756 very useful strong duality condition uses topological properties of the dual cone K .

1757 Theorem 4.33. [Closed cone condition] Consider the conic optimization primal dual pair (4.14). Suppose T d ∗ ? 1758 the set (A y, b, y ) R R : y K is closed and the dual is feasible, i.e., there exists y K such T{ h i ∈ × ∈ } ∈ 1759 that A y = c. Then we have zero duality gap. If the optimal dual value is finite, then strong duality holds 1760 in (4.14).

1761 Proof. Since the dual is feasible, its optimal value is either or finite. By weak duality (Proposition 4.27), −∞ 1762 in the first case we must have zero duality gap and the primal is infeasible. So we consider the case when the T ∗ d 1763 optimal value of the dual is finite, say µ0 R. Let us label the set S := (A y, b, y ): y K R R. ∈ { h i ∈ } ⊆ × 1764 Notice that the optimal value of the dual is µ0 = inf r R :(c, r) S . Since S is closed, the set { ∈ ∈ } 1765 r R :(c, r) S is closed because it is topologically the same as S (c R). Therefore the infimum in { ∈ ∈ } ∩ × ? ? 1766 inf r R :(c, r) S is over a closed subset of the real line. Hence, (c, µ0) S and so there exists y K { ∈ T ? ∈ } ? ∈ ∈ 1767 such that A y = c and b, y = µ0. h i 1768 Since µ0 = inf r R :(c, r) S , for every  > 0, (c, µ0 ) S. Therefore, there exists a separating { ∈d ∈ } T − 6∈ ? 1769 hyperplane (a, γ) R R and δ R such that a,A y +γ b, y δ for all y K , and a, c +γ(µ0 ) > ∈ × ∈ h i ·h i ≤ ∈ h i − 1770 δ. By Problem8 from “HW for Week IX”, we may assume δ = 0. Therefore, we have

a,AT y + γ b, y 0 for all y K?, (4.18) h i · h i ≤ ∈ a, c + γ(µ0 ) > 0 (4.19) h i −

66 ? 1771 Substituting y in (4.18), we obtain that a, c + γµ0 0, and (4.19) tells us that a, c + γµ0 > γ. This h i ≤ h i ? 1772 implies that γ < 0 since  > 0. Now (4.18) can be rewritten as Aa+γb, y 0 for all y K and (4.19) can h i ≤ ∈ a 1773 be rewritten as a, c > γ(µ0 ). Dividing through both these relations by γ > 0, and setting x = −γ , h i − − ? − 1774 we obtain that Ax b, y 0 for all y K implying that Ax K b, and x, c > µ0 . Thus, we have h − i ≤ ∈ 4 h i − 1775 a feasible solution x for the primal with value at least µ0 . Since  > 0 was chosen arbitrarily, this shows − 1776 that for every  > 0, the primal has optimal value better than µ0 . Therefore, the primal value must be ? − 1777 µ0 and we have zero duality gap. The existence of y shows that we have strong duality.

1778 Linear Programming strong duality. The closed cone condition for strong duality implies that linear programs 1779 always enjoy strong duality when either the primal or the dual (or both) are feasible. This is because the m ? m 1780 cone K = R+ is a polyhedral cone and also self-dual, i.e., K = K = R+ . Since linear transformations of 1781 polyhedral cones are polyhedral (see part 5. of Problem1 in “HW for Week VI”), and hence closed, linear 1782 programs always satisfy the condition in Theorem 4.33. One therefore has the following table for the possible 1783 outcomes in the primal dual linear programming pair.

XX XXX Dual XX Infeasible Finite Unbounded Primal XXX 1784 Infeasible Possible Impossible Possible Finite Impossible Possible, Zero duality gap Impossible Unbounded Possible Impossible Impossible

1785 An alternate proof of zero duality gap for linear programming follows from our results on polyhedral 1786 theory. We outline it here to illustrate that linear programming duality can be approached in different 1787 ways (although ultimately both proofs go back to the separating hyperplane theorem – Theorem 2.20). We 1788 consider two cases: 1789 Primal is infeasible. In this case, we will show that if the dual is feasible, then the dual must be 1790 unbounded. Since the primal is infeasible, the polyhedron Ax b is empty. By Theorem 2.88, there exists T ≤ T 1791 yˆ 0 such that A yˆ = 0 and b, yˆ = 1. Since the dual is feasible, consider any y¯ 0 such that A y¯ = c. ≥ h i − ≥ 1792 Now, all points of the form y¯ + λyˆ are also feasible to the dual, and the corresponding value b, y¯ + λyˆ can h i 1793 be made to go to because b, yˆ = 1. −∞ h i − 1794 Primal is feasible. If the primal is unbounded, then by weak duality, the dual must be infeasible. So let 1795 us consider the case that the primal has a finite value µ0. This means that the inequality c, x µ0 is a h Ti ≤ 1796 valid inequality for the polyhedron Ax b. By Theorem 2.85, there exists yˆ 0 such that A yˆ = c and ≤ ≥ 1797 b, yˆ µ0. Therefore the dual has a solution yˆ whose objective value is equal to the primal value µ0. This h i ≤ 1798 guarantees strong duality.

1799 Complementary slackness. Complementary slackness is a useful necessary condition when we have 1800 primal and dual optimal solutions with zero duality gap.

d m 1801 Theorem 4.34. Let f : R R be convex, let K R be a closed, convex, pointed cone, and let d m → m ⊆ ? 1802 G : R R be a K-convex mapping. Let : R R be as defined in (4.9). Let x be such that ? → ? ? ? L? → ? ? 1803 G(x ) K 0 and y K such that f(x ) = (y ). Then y ,G(x ) = 0. 4 ∈ L h i ? ? ? ? ? Proof. We simply observe that since G(x ) K 0 and y K , we must have y ,G(x ) 0. Therefore, 4 ∈ h i ≤ f(x?) f(x?) + y?,G(x?) inf f(x) + y?,G(x) = (y?). d ≥ h i ≥ x∈R h i L

? ? ? ? 1804 Since f(x ) = (y ) by assumption, equality must hold throughout above giving us y ,G(x ) = 0. L h i

67 1805 4.3.5 Saddle point interpretation of the Lagrangian dual

1806 Let us go back to the original problem (4.4) and revisit the dual function (y). Define the function L ˆ(x, y) := f(x) + y,G(x) (4.20) L h i

1807 which is often called the Lagrangian function associated with (4.4). A characterization of a pair of optimal 1808 solutions to (4.4) and (4.10) can be obtained using saddle points of the Lagrangian function.

d m 1809 Theorem 4.35. Let f : R R be convex, let K R be a closed, convex, pointed cone, and let d m → m ⊆ ˆ d m 1810 G : R R be a K-convex mapping. Let : R R be as defined in (4.9) and : R R R be as → ? ? L → ? ? L × → 1811 defined in (4.20). Let x be such that G(x ) K 0 and y K . Then the following are equivalent. 4 ∈ ? ? 1812 1. (y ) = f(x ). L ˆ ? ˆ ? ? ˆ ? d ? 1813 2. (x , yˆ) (x , y ) (xˆ, y ), for all xˆ R and yˆ K . L ≤ L ≤ L ∈ ∈ Proof. 1. = 2. Consider any xˆ Rd and yˆ K?. We now derive the following chain of inequalities: ⇒ ∈ ∈ ˆ(x?, yˆ) = f(x?) + yˆ,G(x?) L ? h i ? ? ? f(x ) since yˆ,G(x ) 0 because yˆ K ,G(x ) K 0 ≤ h i ≤ ∈ 4 = f(x?) + y?,G(x?) = ˆ(x?, y?) since y?,G(x?) = 0 by Theorem 4.34 = (y?)h i L since h (y?) = f(ix?) L ? L d f x y ,G x = infx∈R ( ) + ( ) f(xˆ) + y?,G(xˆh) i ≤ h i = ˆ(xˆ, y?) L 2. = 1. Since ˆ(x?, yˆ) ˆ(x?, y?) for all yˆ K?, we have that ⇒ L ≤ L ∈ ˆ(x?, y?) = sup ˆ(x?, yˆ) = sup f(x?) + y,G(x?) = f(x?), L y∈K? L y∈K? h i

where the last equality follows from the fact that y,G(x?) 0 for all y K?. So the supremum is achieved h i ≤ ∈ for y = 0. On the other hand, since ˆ(x?, y?) ˆ(xˆ, y?) for all xˆ Rd, we have that L ≤ L ∈ ˆ(x?, y?) = inf ˆ(x, y?) = inf f(x) + y?,G(x) = (y?). d d L x∈R L x∈R h i L

? ? ? ? 1814 Thus, we obtain that f(x ) = ˆ(x , y ) = (y ). L L ? ? 1815 Theorem 4.35 says that x and y are solutions for the primal problem (4.4) and dual problem (4.10) ? ? ? 1816 respectively, if and only if (x , y ) is a saddle point for the function ˆ(x, y) of the type that x is the ? ? L ? 1817 minimizer when y is fixed at y and y is the maximizer when x is fixed at x . This can be used to 1818 directly solve (4.4) and (4.10) simultaneously by searching for such saddle-points of the function ˆ(x, y). L 1819 This approach can be useful, if one has analytical forms for f and G (with sufficient differentiable properties) 1820 so that finding saddle-points is a reasonable option.

1821 4.4 Cutting plane schemes

1822 We now go back to the most general convex optimization problem (4.1). As before, we make no assumptions d 1823 on f and C except that we have access to first-order oracles for f and C, i.e., for any x R , the oracle ∈ 1824 returns an element from the subdifferential ∂f(x), and if x C then it returns a separating hyperplane. 6∈ 1825 The subgradient algorithm from Section 4.1 can be used to solve (4.1) if one has access to the projection 1826 operator ProjC (x), which is stronger than a separation oracle. Cutting plane schemes are a class of algo- 1827 rithms that work with just a separation oracle. Moreover, the number of oracle calls is quite different from 1828 the number of oracle calls made by the subgradient algorithm: on the one hand, they typically exhibit a

68 MR 1829 logarithmic dependence of ln(  ) on the initial data M,R and error guarantee  as opposed to the quadratic M 2R2 1830 dependence 2 of the subgradient algorithm; on the other other, cutting plane schemes have a polynomial 2 1831 dependence on the dimension d of the problem (typically of the order of d ), and such a dependence does 1832 not exist for the subgradient algorithm – see Remark 4.16. 1833 We will present the algorithm and the analysis for the situation when C is compact and full-dimensional. ? 1834 Hence the minimizer x exists for (4.1) since f is convex, and therefore, continuous by Theorem 3.21. There 1835 are ways to get around this assumption, but we will ignore this complication in this write-up.

General cutting plane scheme

1. Choose any $E_0 \supseteq C$.

2. For $i = 0, 1, 2, \ldots$, do

(a) Choose $x^i \in E_i$.

(b) Call the separation oracle for $C$ with $x^i$ as input.
Case 1: $x^i \in C$. Call the first-order oracle for $f$ to get some $s^i \in \partial f(x^i)$.
Case 2: $x^i \notin C$. Set $s^i$ to be the normal vector of some separating hyperplane for $x^i$ from $C$.

(c) Set $E_{i+1} \supseteq E_i \cap \{x \in \mathbb{R}^d : \langle s^i, x\rangle \le \langle s^i, x^i\rangle\}$.

The points $x^0, x^1, \ldots$ will be called the iterates of the Cutting Plane scheme.

Remark 4.36. The above general scheme actually defines a family of algorithms. We have two choices to make to get a particular algorithm out of this scheme. First, there must be a strategy/procedure to choose $x^i \in E_i$ in step 2(a) in every iteration. Second, there should be a strategy to define $E_{i+1}$ as a superset of $E_i \cap \{x \in \mathbb{R}^d : \langle s^i, x\rangle \le \langle s^i, x^i\rangle\}$ in step 2(c) of the scheme. Depending on what these two strategies are, we get different variants of the general cutting plane scheme. We will look at two variants below: the center of gravity method and the ellipsoid method.

Technically, we also have to make a choice for $E_0$ in Step 1, but this is usually given as part of the input to the problem: $E_0$ is usually a large ball or polytope containing $C$ that is provided or known at the start.
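In code, the scheme is a short loop with the two strategies of Remark 4.36 left as callbacks (a sketch; all names are ours):

```python
def cutting_plane(choose_point, separation_oracle, subgrad_f, update_set, E0, num_iters):
    """Skeleton of the general cutting plane scheme.

    choose_point(E):      pick a point x in the current set E          (Step 2(a))
    separation_oracle(x): None if x is in C, else the normal vector of
                          a hyperplane separating x from C             (Step 2(b))
    subgrad_f(x):         some s in the subdifferential of f at x
    update_set(E, s, x):  any set containing E ∩ {z : <s,z> <= <s,x>}  (Step 2(c))
    """
    E, iterates = E0, []
    for _ in range(num_iters):
        x = choose_point(E)
        s = separation_oracle(x)
        if s is None:                 # x feasible: cut using a subgradient of f
            s = subgrad_f(x)
        E = update_set(E, s, x)
        iterates.append(x)
    return iterates
```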

We now start our analysis of cutting plane schemes. We introduce a useful notation to denote the polyhedron defined by the halfspaces obtained during the iterations of the cutting plane scheme.

Definition 4.37. Let $z^1, \ldots, z^k \in \mathbb{R}^d$ and let $s^1, \ldots, s^k$ be the corresponding outputs of the first-order oracle, i.e., $s^i \in \partial f(z^i)$ if $z^i \in C$, and $s^i$ is the normal vector of a separating hyperplane if $z^i \notin C$. Define
\[
G(z^1, \ldots, z^k) := \{x \in \mathbb{R}^d : \langle s^i, x\rangle \le \langle s^i, z^i\rangle \ \ i = 1, \ldots, k\}.
\]
This polyhedron will be referred to as the gradient polyhedron of $z^1, \ldots, z^k$. The name is a bit of a misnomer, because we are considering general $f$, so we may have no gradients, and also some of the halfspaces could correspond to separating hyperplanes, which have nothing to do with gradients. Even so, we stick with this terminology.

Definition 4.38. Let $x^0, x^1, \ldots$ be the iterates of a cutting plane scheme. For any iteration $t \ge 0$, we define $h(t) := |C \cap \{x^0, \ldots, x^t\}|$, i.e., $h(t)$ is the number of feasible iterates until iteration $t$. We also define
\[
S_t = C \cap G(x^0, \ldots, x^t).
\]

As we shall see below, the volume of $S_t$ will be central in measuring our progress towards the optimal solution. We first observe in the next lemma that $S_t$ can be described as the intersection of $C$ and the gradient polyhedron of only the feasible iterates.

Lemma 4.39. Let $x^0, x^1, \ldots$ be the iterates of a cutting plane scheme. Let $t \ge 0$ be any natural number and let the feasible iterates be denoted by $\{x^{i_1}, \ldots, x^{i_{h(t)}}\} = C \cap \{x^0, \ldots, x^t\}$, with $0 \le i_1 \le i_2 \le \ldots \le i_{h(t)}$. Then $S_t = C \cap G(x^{i_1}, \ldots, x^{i_{h(t)}})$.

Proof. Let $X_t = \{x^0, \ldots, x^t\}$. We derive the following relations:
\[
S_t = C \cap G(x^0, \ldots, x^t) = C \cap G(X_t \setminus \{x^{i_1}, \ldots, x^{i_{h(t)}}\}) \cap G(x^{i_1}, \ldots, x^{i_{h(t)}}) = C \cap G(x^{i_1}, \ldots, x^{i_{h(t)}}),
\]

where the last equality follows since $C \subseteq G(X_t \setminus \{x^{i_1}, \ldots, x^{i_{h(t)}}\})$: each $z \in X_t \setminus \{x^{i_1}, \ldots, x^{i_{h(t)}}\}$ is infeasible, i.e., $z \notin C$, and therefore the corresponding vector $s$ is the normal of a separating hyperplane for $z$ and $C$, i.e., $C \subseteq \{x \in \mathbb{R}^d : \langle s, x\rangle \le \langle s, z\rangle\}$.

Since our analysis will involve the volume of $S_t$, while our algorithm only works with the sets $E_t$, we need to establish a definite relationship between these two sets.

Lemma 4.40. Let $x^0, x^1, \ldots$ be the iterates of a cutting plane scheme. Then $E_{t+1} \supseteq S_t$ for all $t \ge 0$.

Proof. By definition, $E_{i+1} \supseteq E_i \cap \{x \in \mathbb{R}^d : \langle s^i, x\rangle \le \langle s^i, x^i\rangle\}$ for all $i = 0, \ldots, t$. By putting all these relationships together, we obtain that

\[
E_{t+1} \supseteq E_0 \cap G(x^0, \ldots, x^t) \supseteq C \cap G(x^0, \ldots, x^t) = S_t, \tag{4.21}
\]

where the second containment follows from the assumption that $E_0 \supseteq C$.

We now state our main structural result for the analysis of cutting plane schemes. We use $\mathrm{dist}(x, X)$ to denote the distance of $x \in \mathbb{R}^d$ from any subset $X \subseteq \mathbb{R}^d$, i.e., $\mathrm{dist}(x, X) := \inf_{y\in X} \|x - y\|$.

Theorem 4.41. Let $f : \mathbb{R}^d \to \mathbb{R}$ be a convex function and let $C$ be a compact, convex set. Let $x^\star$ be a minimizer for (4.1). Let $x^0, x^1, \ldots$ be the iterates of any cutting plane scheme. For any $t \ge 0$, let the feasible iterates be denoted by $\{x^{i_1}, \ldots, x^{i_{h(t)}}\} = C \cap \{x^0, \ldots, x^t\}$, with $0 \le i_1 \le i_2 \le \ldots \le i_{h(t)}$. Define
\[
v_{\min}(t) := \min_{j=i_1,\ldots,i_{h(t)}} \mathrm{dist}(x^\star, H(s^j, \langle s^j, x^j\rangle)),
\]

i.e., $v_{\min}(t)$ is the minimum distance of $x^\star$ from the hyperplanes $\{x : \langle s^j, x\rangle = \langle s^j, x^j\rangle\}$, $j = i_1, \ldots, i_{h(t)}$. Let $D$ be the diameter of $C$, i.e., $D = \max_{x,y\in C} \|x - y\|$. Then the following are all true.

1. For any $t \ge 0$, if $\mathrm{vol}(E_{t+1}) < \mathrm{vol}(C)$ then $h(t) > 0$, i.e., there is at least one feasible iterate.

2. For any $t \ge 0$ such that $h(t) > 0$, $v_{\min}(t) \le D\left(\frac{\mathrm{vol}(S_t)}{\mathrm{vol}(C)}\right)^{1/d} \le D\left(\frac{\mathrm{vol}(E_{t+1})}{\mathrm{vol}(C)}\right)^{1/d}$.

3. For any $t \ge 0$ such that $h(t) > 0$, $\min_{j=i_1,\ldots,i_{h(t)}} f(x^j) \le f(x^\star) + Mv_{\min}(t) \le f(x^\star) + MD\left(\frac{\mathrm{vol}(E_{t+1})}{\mathrm{vol}(C)}\right)^{1/d}$, where $M = L(B(x^\star, v_{\min}))$ is a Lipschitz constant for $f$ over $B(x^\star, v_{\min})$ (see Theorem 3.21). This provides a bound on the value of the best feasible point seen up to iteration $t$, in comparison to the optimal value $f(x^\star)$.

Theorem 4.41 shows that if we can ensure $\mathrm{vol}(E_t) \to 0$ as $t \to \infty$, then we have a convergent algorithm.

Proof of Theorem 4.41. 1. We prove the contrapositive. If $h(t) = 0$, then all iterates up to iteration $t$ are infeasible, i.e., $x^i \notin C$ for all $i = 0, \ldots, t$. This implies that all the vectors $s^i$ are normal vectors of separating hyperplanes. So $C \subseteq G(x^0, \ldots, x^t)$. Since $C \subseteq E_0$, this implies that $C = E_0 \cap C \subseteq E_0 \cap G(x^0, \ldots, x^t) \subseteq E_{t+1}$, where the last containment follows from the first containment in (4.21). Therefore, $\mathrm{vol}(C) \le \mathrm{vol}(E_{t+1})$.

2. Let $\alpha = \frac{v_{\min}(t)}{D}$. Since $D$ is the diameter of $C$, we must have $C \subseteq B(x^\star, D)$. Thus,
\[
\alpha(C - x^\star) + x^\star \subseteq B(x^\star, \alpha D) = B(x^\star, v_{\min}(t)) \subseteq G(x^{i_1}, \ldots, x^{i_{h(t)}}),
\]
where the first equality follows from the definition of $\alpha$ and the final containment follows from the definition of $v_{\min}(t)$. Since $x^\star \in C$ and $C$ is convex, we know that $\alpha(C - x^\star) + x^\star = \alpha C + (1 - \alpha)x^\star \subseteq C$. Therefore, $\alpha(C - x^\star) + x^\star = C \cap (\alpha(C - x^\star) + x^\star) \subseteq C \cap G(x^{i_1}, \ldots, x^{i_{h(t)}}) = S_t$, where the last equality follows from Lemma 4.39. This implies that $\alpha^d\,\mathrm{vol}(C) = \mathrm{vol}(\alpha(C - x^\star)) \le \mathrm{vol}(S_t)$. Rearranging and using the definition of $\alpha$, we obtain that $v_{\min}(t) \le D\left(\frac{\mathrm{vol}(S_t)}{\mathrm{vol}(C)}\right)^{1/d}$. By Lemma 4.40, $D\left(\frac{\mathrm{vol}(S_t)}{\mathrm{vol}(C)}\right)^{1/d} \le D\left(\frac{\mathrm{vol}(E_{t+1})}{\mathrm{vol}(C)}\right)^{1/d}$.

3. It suffices to prove the first inequality; the second inequality follows from part 2. above. Let $i^{\min} \in \{i_1, i_2, \ldots, i_{h(t)}\}$ be such that $v_{\min}(t) = \mathrm{dist}(x^\star, H(s^{i^{\min}}, \langle s^{i^{\min}}, x^{i^{\min}}\rangle))$. Denote by $H := H(s^{i^{\min}}, \langle s^{i^{\min}}, x^{i^{\min}}\rangle)$ the hyperplane passing through $x^{i^{\min}}$ orthogonal to $s^{i^{\min}}$. Let $\bar{x}$ be the point on $H$ closest to $x^\star$. Using the Lipschitz constant $M$, we obtain that $f(\bar{x}) \le f(x^\star) + Mv_{\min}(t)$; see Figure 3 in Section 4.1. Finally, since $s^{i^{\min}} \in \partial f(x^{i^{\min}})$, we must have that $f(\bar{x}) \ge f(x^{i^{\min}}) + \langle s^{i^{\min}}, \bar{x} - x^{i^{\min}}\rangle = f(x^{i^{\min}})$, since $\bar{x}, x^{i^{\min}} \in H$ implies $\langle s^{i^{\min}}, \bar{x} - x^{i^{\min}}\rangle = 0$. Therefore, we obtain
\[
\min_{j=i_1,\ldots,i_{h(t)}} f(x^j) \le f(x^{i^{\min}}) \le f(\bar{x}) \le f(x^\star) + Mv_{\min}(t).
\]


We now analyze two instantiations of the cutting plane scheme with concrete strategies to choose $x^i$ and $E_{i+1}$ in each iteration $i$.

Center of Gravity Method. The first one is called the center of gravity method.

Definition 4.42. The center of gravity of any compact set $X \subseteq \mathbb{R}^d$ with non-zero volume is defined as
\[
\frac{\int_X x\, dx}{\mathrm{vol}(X)}.
\]

An important property of the center of gravity of compact, convex sets was established by Grünbaum [4].

Theorem 4.43. Let $C \subseteq \mathbb{R}^d$ be a compact, convex set with center of gravity $\bar{x}$. Then for every hyperplane $H$ such that $\bar{x} \in H$,
\[
\frac{1}{e} \le \left(\frac{d}{d+1}\right)^d \le \frac{\mathrm{vol}(H^+ \cap C)}{\mathrm{vol}(C)} \le 1 - \left(\frac{d}{d+1}\right)^d \le 1 - \frac{1}{e},
\]
where $H^+$ is a halfspace with boundary $H$.

Theorem 4.43 follows from the proof of Theorem 2 in [4] and will not be repeated here.

In the center of gravity method, $x^i$ is chosen as the center of gravity of $E_i$ in Step 2(a) of the General cutting plane scheme, and $E_{i+1}$ is set to be equal to $E_i \cap \{x \in \mathbb{R}^d : \langle s^i, x\rangle \le \langle s^i, x^i\rangle\}$ in Step 2(c). Theorem 4.43 then implies the following. Sometimes, the center of gravity method assumes that $E_0 = C$; the central assumption is that one can compute the center of gravity of $C$ and any subset of it.

Theorem 4.44. In the center of gravity method, if $h(t) > 0$ for some iteration $t \ge 0$, then
\[
\min_{j=i_1,\ldots,i_{h(t)}} f(x^j) \le f(x^\star) + MD\left(1 - \frac{1}{e}\right)^{t/d}\left(\frac{\mathrm{vol}(E_0)}{\mathrm{vol}(C)}\right)^{1/d},
\]
where $D$ is the diameter of $C$ and $M$ is a Lipschitz constant for $f$ over $B(x^\star, D)$.

In particular, if $E_0 = C$, then $\min_{j=i_1,\ldots,i_{h(t)}} f(x^j) \le f(x^\star) + MD\left(1 - \frac{1}{e}\right)^{t/d}$.

Proof. Follows from Theorem 4.41 part 3., the fact that $B(x^\star, v_{\min}) \subseteq B(x^\star, D)$ (implying that $M$ is a Lipschitz constant for $f$ over $B(x^\star, v_{\min})$), and $\mathrm{vol}(E_{t+1}) \le \left(1 - \frac{1}{e}\right)^t\mathrm{vol}(E_0)$ by Theorem 4.43.

By setting the error term $MD\left(1 - \frac{1}{e}\right)^{t/d}\left(\frac{\mathrm{vol}(E_0)}{\mathrm{vol}(C)}\right)^{1/d}$ less than or equal to $\epsilon$ in Theorem 4.44, the following is an immediate consequence.

Corollary 4.45. For any $\epsilon > 0$, after $O\left(d\ln\left(\frac{MD}{\epsilon}\right) + \ln\left(\frac{\mathrm{vol}(E_0)}{\mathrm{vol}(C)}\right)\right)$ iterations of the center of gravity method,

\[
\min_{j=i_1,\ldots,i_{h(t)}} f(x^j) \le f(x^\star) + \epsilon.
\]

In particular, if $E_0 = C$, then one needs $O\left(d\ln\left(\frac{MD}{\epsilon}\right)\right)$ iterations.
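Exact centers of gravity are hard to compute in general, but a crude Monte Carlo version of the method conveys the idea. The following sketch assumes $E_0 = C$ is a box, stores $E_i$ as the list of cuts made so far, and estimates centroids by rejection sampling inside the box; all names and the test problem are ours:

```python
import numpy as np

rng = np.random.default_rng(0)

def approx_centroid(cuts, lo, hi, n_samples=100_000):
    """Estimate the centroid of E = box ∩ {x : <s, x> <= delta for (s, delta) in cuts}
    by averaging uniform box samples that satisfy every cut."""
    pts = rng.uniform(lo, hi, size=(n_samples, len(lo)))
    mask = np.ones(n_samples, dtype=bool)
    for s, delta in cuts:
        mask &= pts @ s <= delta
    return pts[mask].mean(axis=0)

# Center of gravity method for min f(x) = ||x - c||^2 over C = [-1, 1]^2.
c = np.array([2.0, 0.5])                # unconstrained minimizer lies outside C
f = lambda x: float(np.sum((x - c) ** 2))
grad = lambda x: 2.0 * (x - c)          # f is differentiable, so grad = subgradient
lo, hi = np.array([-1.0, -1.0]), np.array([1.0, 1.0])
cuts, x_best = [], None
for _ in range(8):
    x = approx_centroid(cuts, lo, hi)   # Step 2(a): x^i = centroid of E_i
    s = grad(x)
    cuts.append((s, float(s @ x)))      # Step 2(c): keep E_i ∩ {z : <s,z> <= <s,x>}
    if x_best is None or f(x) < f(x_best):
        x_best = x
print(x_best)                           # approaches (1, 0.5), the minimizer over C
```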

Ellipsoid method. The ellipsoid method is a cutting plane scheme where $E_0$ is assumed to be a large ball with radius $R$ around a known point $x^0$ (typically $x^0 = 0$) that is guaranteed to contain $C$. At every iteration $i$, $E_i$ is maintained to be an ellipsoid, and in Step 2(a), $x^i$ is chosen to be the center of $E_i$. In Step 2(c), $E_{i+1}$ is set to be an ellipsoid that contains $E_i \cap \{x \in \mathbb{R}^d : \langle s^i, x\rangle \le \langle s^i, x^i\rangle\}$, such that $\mathrm{vol}(E_{i+1}) \le \left(1 - \frac{1}{(d+1)^2}\right)^{d/2}\mathrm{vol}(E_i)$. The technical bulk of the analysis goes into showing that such an ellipsoid $E_{i+1}$ always exists.

Definition 4.46. Recall from Definition 2.2 that an ellipsoid is the unit ball associated with the norm induced by a positive definite matrix, i.e., $E = \{x \in \mathbb{R}^d : x^TAx \le 1\}$ for some positive definite matrix $A$. First, we need to also consider translated ellipsoids, so that the center is not $0$ anymore. Secondly, for computational reasons involving inverses of matrices, we will actually define the following family of objects, which are just translated ellipsoids written in a different way. Given a positive definite matrix $Q \in \mathbb{R}^{d\times d}$ and a point $y \in \mathbb{R}^d$, we define
\[
E(Q, y) := \{x \in \mathbb{R}^d : (x - y)^TQ^{-1}(x - y) \le 1\}.
\]

The next proposition follows from unwrapping the definition. It shows that ellipsoids are simply the image of the Euclidean unit norm ball under an invertible linear transformation.

Proposition 4.47. Let $Q \in \mathbb{R}^{d\times d}$ be a positive definite matrix and let $Q^{-1} = X^TX$ for some invertible matrix $X \in \mathbb{R}^{d\times d}$. Then $E(Q, y) = y + X^{-1}(B(0, 1))$. Thus, $\mathrm{vol}(E(Q, y)) = \det(X^{-1})\,\mathrm{vol}(B(0, 1)) = \sqrt{\det(Q)}\,\mathrm{vol}(B(0, 1))$.

In the following, we will utilize the following relation for any $w, z \in \mathbb{R}^d$ and $A \in \mathbb{R}^{d\times d}$:
\[
(w + z)^TA(w + z) = w^TAw + 2w^TAz + z^TAz. \tag{4.22}
\]

Theorem 4.48. Let $Q \in \mathbb{R}^{d\times d}$ be a positive definite matrix and $y \in \mathbb{R}^d$. Let $s \in \mathbb{R}^d$ and let $E_+ = E(Q, y) \cap H^-(s, \langle s, y\rangle)$. Define
\[
y_+ = y - \frac{1}{d+1}\cdot\frac{Qs}{\sqrt{s^TQs}},
\]

\[
Q_+ = \frac{d^2}{d^2-1}\left(Q - \frac{2}{d+1}\cdot\frac{Qss^TQ}{s^TQs}\right).
\]
Then $E_+ \subseteq E(Q_+, y_+)$ and $\mathrm{vol}(E(Q_+, y_+)) \le \left(1 - \frac{1}{(d+1)^2}\right)^{d/2}\mathrm{vol}(E(Q, y))$.

Proof. We first prove $E_+ \subseteq E(Q_+, y_+)$. Consider any $x \in E_+ = E(Q, y) \cap H^-(s, \langle s, y\rangle)$. To ease notational burden, we denote $G = Q^{-1}$ and $G_+ = Q_+^{-1}$. A direct calculation shows that $G_+ = \frac{d^2-1}{d^2}\left(G + \frac{2}{d-1}\cdot\frac{ss^T}{s^TQs}\right)$. Note that $x$ satisfies

\[
(x - y)^TG(x - y) \le 1 \tag{4.23}
\]
\[
\langle s, x - y\rangle \le 0 \tag{4.24}
\]

72 We now verify that

\[
(x - y_+)^TG_+(x - y_+) = \left(x - y + \frac{1}{d+1}\cdot\frac{Qs}{\sqrt{s^TQs}}\right)^TG_+\left(x - y + \frac{1}{d+1}\cdot\frac{Qs}{\sqrt{s^TQs}}\right) = (x-y)^TG_+(x-y) + \frac{2}{d+1}(x-y)^TG_+\frac{Qs}{\sqrt{s^TQs}} + \frac{1}{(d+1)^2}\cdot\frac{s^TQ\,G_+\,Qs}{s^TQs},
\]

where we use (4.22). Let us analyze the three terms separately. The first term can be written in terms of $G$, $s$, and $y$:

\[
(x-y)^TG_+(x-y) = (x-y)^T\left(\frac{d^2-1}{d^2}\left(G + \frac{2}{d-1}\cdot\frac{ss^T}{s^TQs}\right)\right)(x-y) = \frac{d^2-1}{d^2}\left((x-y)^TG(x-y) + \frac{2}{d-1}\cdot\frac{(s^T(x-y))^2}{s^TQs}\right).
\]

The second term simplifies to

\[
\begin{array}{rl}
\frac{2}{d+1}(x-y)^TG_+\left(\frac{Qs}{\sqrt{s^TQs}}\right) & = \frac{2}{d+1}(x-y)^T\left(\frac{d^2-1}{d^2}\left(G + \frac{2}{d-1}\cdot\frac{ss^T}{s^TQs}\right)\right)\left(\frac{Qs}{\sqrt{s^TQs}}\right) \\
& = \frac{d^2-1}{d^2}\cdot\frac{2}{d+1}\left(\frac{s^T(x-y)}{\sqrt{s^TQs}} + \frac{2}{d-1}\cdot\frac{(x-y)^Tss^TQs}{s^TQs\cdot\sqrt{s^TQs}}\right) \\
& = \frac{d^2-1}{d^2}\cdot\frac{2}{d+1}\left(\frac{s^T(x-y)}{\sqrt{s^TQs}} + \frac{2}{d-1}\cdot\frac{(x-y)^Ts}{\sqrt{s^TQs}}\right) \\
& = \frac{d^2-1}{d^2}\cdot\frac{2}{d-1}\cdot\frac{s^T(x-y)}{\sqrt{s^TQs}}.
\end{array}
\]

The third term simplifies to

\[
\frac{1}{(d+1)^2}\cdot\frac{s^TQ\,G_+\,Qs}{s^TQs} = \frac{1}{(d+1)^2}\cdot\frac{s^TQ\left(\frac{d^2-1}{d^2}\left(G + \frac{2}{d-1}\cdot\frac{ss^T}{s^TQs}\right)\right)Qs}{s^TQs} = \frac{1}{(d+1)^2}\cdot\frac{d^2-1}{d^2}\cdot\frac{s^TQs + \frac{2}{d-1}\,s^TQs}{s^TQs} = \frac{d^2-1}{d^2}\left(\frac{1}{d^2-1}\right),
\]

Putting all of it together, we obtain that

\[
(x - y_+)^TG_+(x - y_+) = \frac{d^2-1}{d^2}\left((x-y)^TG(x-y) + \frac{2}{d-1}\cdot\frac{(s^T(x-y))^2}{s^TQs} + \frac{2}{d-1}\cdot\frac{s^T(x-y)}{\sqrt{s^TQs}} + \frac{1}{d^2-1}\right). \tag{4.25}
\]

We now argue that $\frac{(s^T(x-y))^2}{s^TQs} + \frac{s^T(x-y)}{\sqrt{s^TQs}} = \frac{s^T(x-y)}{s^TQs}\left(\sqrt{s^TQs} + s^T(x-y)\right) \le 0$. Since $s^T(x-y) \le 0$ by (4.24), it suffices to show that $\sqrt{s^TQs} + s^T(x-y) \ge 0$; we will in fact show that $|s^T(x-y)| \le \sqrt{s^TQs}$.

Claim 4. $|s^T(x-y)| \le \sqrt{s^TQs}$.

Proof of Claim. Let the eigendecomposition of $Q$ be given as $Q = S\Lambda S^T$, where $S$ is the orthonormal matrix which has the eigenvectors of $Q$ as columns, and $\Lambda$ is a diagonal matrix with the corresponding eigenvalues. Then $Q^{-1} = S\Lambda^{-1}S^T = G$. Now,

\[
\begin{array}{rl}
|s^T(x-y)| & = |s^TS\Lambda^{\frac{1}{2}}\Lambda^{-\frac{1}{2}}S^T(x-y)| \\
& = |\langle \Lambda^{\frac{1}{2}}S^Ts,\ \Lambda^{-\frac{1}{2}}S^T(x-y)\rangle| \\
& \le \|\Lambda^{\frac{1}{2}}S^Ts\|_2\,\|\Lambda^{-\frac{1}{2}}S^T(x-y)\|_2 \\
& = \sqrt{(\Lambda^{\frac{1}{2}}S^Ts)^T(\Lambda^{\frac{1}{2}}S^Ts)}\sqrt{(\Lambda^{-\frac{1}{2}}S^T(x-y))^T(\Lambda^{-\frac{1}{2}}S^T(x-y))} \\
& = \sqrt{s^TS\Lambda^{\frac{1}{2}}\Lambda^{\frac{1}{2}}S^Ts}\sqrt{(x-y)^TS\Lambda^{-\frac{1}{2}}\Lambda^{-\frac{1}{2}}S^T(x-y)} \\
& = \sqrt{s^TQs}\sqrt{(x-y)^TG(x-y)} \\
& \le \sqrt{s^TQs},
\end{array}
\]

where the first inequality is the Cauchy-Schwarz inequality, and the last inequality follows from (4.23).

This claim, together with (4.25), implies that

\[
(x - y_+)^TG_+(x - y_+) \le \frac{d^2-1}{d^2}\left((x-y)^TG(x-y) + \frac{1}{d^2-1}\right) \le \frac{d^2-1}{d^2}\left(1 + \frac{1}{d^2-1}\right) = 1,
\]

where the second inequality follows from (4.23). This establishes that $x \in E(Q_+, y_+)$.

We now prove the volume claim. Let $Q = B^TB$ for some invertible matrix $B$. We use $I_d$ to denote the $d \times d$ identity matrix. By Proposition 4.47,

r 2 QssT Q 2 det(Q− · ) d d+1 sT Qs d 2 = ( d2−1 ) det(Q) r T T T 2 T 2 B Bss B B d det(B B− d+1 · T T ) d 2 s B Bs = ( d2−1 ) det(BT B) r T T 2 T 2 Bss B d det(B (Id− d+1 · T T )B) d 2 s B Bs = ( d2−1 ) det(BT ) det(B) r T T 2 T 2 Bss B d det(B ) det(Id− d+1 · T T ) det(B) d 2 s B Bs = ( d2−1 ) det(BT ) det(B) 2 d q T T d 2 2 Bss B = ( d2−1 ) det(Id d+1 sT BT Bs ) 2 d − 1 · d 2 2 2 = ( 2 ) (1 ) , d −1 · − d+1 BssT BT aaT where the last equality follows from the fact that the matrix sT BT Bs = kak2 with a = Bs, is a rank one positive semidefinite matrix with eigenvalue 1 with multiplicity 1, and eigenvalue 0 with multiplicity d 1. Now finally we observe that −

\[
\left(\frac{d^2}{d^2-1}\right)^{\frac{d}{2}}\left(1 - \frac{2}{d+1}\right)^{\frac{1}{2}} = \left(\frac{d^2}{d^2-1}\left(1 - \frac{2}{d+1}\right)^{\frac{1}{d}}\right)^{\frac{d}{2}} \le \left(\frac{d^2}{d^2-1}\left(1 - \frac{2}{d(d+1)}\right)\right)^{\frac{d}{2}} = \left(\frac{d^2(d^2+d-2)}{d(d+1)(d^2-1)}\right)^{\frac{d}{2}} = \left(1 - \frac{1}{(d+1)^2}\right)^{d/2}.
\]

This completes the proof.
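In code, the update of Theorem 4.48 is just a few lines; the following sketch (names ours) also checks the volume-ratio guarantee numerically via Proposition 4.47, under which $\mathrm{vol}(E(Q, y)) \propto \sqrt{\det Q}$:

```python
import numpy as np

def ellipsoid_update(Q, y, s):
    """Given E(Q, y) and the halfspace <s, x> <= <s, y> through its center,
    return (Q_plus, y_plus) describing the covering ellipsoid of Theorem 4.48."""
    d = len(y)
    Qs = Q @ s
    sQs = float(s @ Qs)
    y_plus = y - Qs / ((d + 1) * np.sqrt(sQs))
    Q_plus = (d**2 / (d**2 - 1.0)) * (Q - (2.0 / (d + 1)) * np.outer(Qs, Qs) / sQs)
    return Q_plus, y_plus

# Volume-ratio check: vol(E(Q, y)) is proportional to sqrt(det Q).
d = 5
Q, y, s = np.eye(d), np.zeros(d), np.ones(d)
Q_plus, y_plus = ellipsoid_update(Q, y, s)
ratio = float(np.sqrt(np.linalg.det(Q_plus) / np.linalg.det(Q)))
bound = (1.0 - 1.0 / (d + 1) ** 2) ** (d / 2)
print(ratio <= bound, ratio, bound)   # True 0.904... 0.931...
```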

Theorem 4.48 can be used to give the guarantee of the ellipsoid method as follows.

Theorem 4.49. Using the ellipsoid method with $E_0 = B(x^0, R)$, if $h(t) > 0$ for some iteration $t \ge 0$, then
\[
\min_{j=i_1,\ldots,i_{h(t)}} f(x^j) \le f(x^\star) + MR\left(1 - \frac{1}{(d+1)^2}\right)^{t/2}\left(\frac{\mathrm{vol}(E_0)}{\mathrm{vol}(C)}\right)^{1/d} \le f(x^\star) + MRe^{-\frac{t}{2(d+1)^2}}\left(\frac{\mathrm{vol}(E_0)}{\mathrm{vol}(C)}\right)^{1/d},
\]

where $M$ is a Lipschitz constant for $f$ over $B(x^0, 2R)$.

Proof. The first inequality follows from Theorem 4.41 part 3., the fact that $B(x^\star, v_{\min}) \subseteq B(x^0, 2R)$ (implying that $M$ is a Lipschitz constant for $f$ over $B(x^\star, v_{\min})$), and $\mathrm{vol}(E_{t+1}) \le \left(1 - \frac{1}{(d+1)^2}\right)^{td/2}\mathrm{vol}(E_0)$ by Theorem 4.48. The second inequality follows from the general inequality $(1 + x) \le e^x$ for all $x \in \mathbb{R}$.

By setting the error term $MRe^{-\frac{t}{2(d+1)^2}}\left(\frac{\mathrm{vol}(E_0)}{\mathrm{vol}(C)}\right)^{1/d}$ less than or equal to $\epsilon$ in Theorem 4.49, the following is an immediate consequence.

Corollary 4.50. For any $\epsilon > 0$, after $2(d+1)^2\left(\ln\left(\frac{MR}{\epsilon}\right) + \frac{1}{d}\ln\left(\frac{\mathrm{vol}(E_0)}{\mathrm{vol}(C)}\right)\right)$ iterations of the ellipsoid method,

\[
\min_{j=i_1,\ldots,i_{h(t)}} f(x^j) \le f(x^\star) + \epsilon.
\]

In particular, if there exists $\rho > 0$ such that $B(z, \rho) \subseteq C$ for some $z \in C$, then after $2(d+1)^2\ln\left(\frac{MR^2}{\epsilon\rho}\right)$ iterations of the ellipsoid method, $\min_{j=i_1,\ldots,i_{h(t)}} f(x^j) \le f(x^\star) + \epsilon$.

Proof. We simply use the fact that $\mathrm{vol}(B(z, \lambda)) = \lambda^d\,\mathrm{vol}(B(0, 1))$ for any $z \in \mathbb{R}^d$ and $\lambda \ge 0$.

Because of the logarithmic dependence on the data ($M$, $R$, $\rho$) and the error guarantee $\epsilon$, and the quadratic dependence on the dimension $d$, the ellipsoid method is said to have polynomial running time for convex optimization.

References

[1] Michele Conforti, Gérard Cornuéjols, Aris Daniilidis, Claude Lemaréchal, and Jérôme Malick. Cut-generating functions and S-free sets. Mathematics of Operations Research, 40(2):276–391, 2014.

[2] Luc Devroye, László Györfi, and Gábor Lugosi. A probabilistic theory of pattern recognition, volume 31. Springer Science & Business Media, 2013.

[3] P.M. Gruber. Convex and Discrete Geometry, volume 336 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, Berlin, 2007.

[4] B. Grünbaum. Partitions of mass-distributions and of convex bodies by hyperplanes. Pacific J. Math., 10:1257–1261, 1960.

[5] Vladimir N. Vapnik and A. Ya. Chervonenkis. On the uniform convergence of relative frequencies of events to their probabilities. Theory of Probability & Its Applications, 16(2):264–280, 1971.
