Information-Based Complexity of Convex Programming

1 TECHNION - THE ISRAEL INSTITUTE OF TECHNOLOGY FACULTY OF INDUSTRIAL ENGINEERING & MANAGEMENT INFORMATION-BASED COMPLEXITY OF CONVEX PROGRAMMING A. Nemirovski Fall Semester 1994/95 2 Information-Based Complexity of Convex Programming Goals: given a class of Convex Optimization problems, one may look for an efficient algorithm for the class, i.e., an algorithm with a "good" (best possible, polynomial time,...) theo- retical worst-case efficiency estimate on the class. The goal of the course is to present a number of efficient algorithms for several standard classes of Convex Optimization problems. The course deals with the black-box setting of an optimization problem (all known in advance is that the problem belongs to a given "wide" class, say, is convex, convex of a given degree of smoothness, etc.; besides this a priory qualitative information, we have the possibility to ask an "oracle" for quantitive local information on the objective and the constraints, like their values and derivatives at a point). We present results on the associated with this setting complexity of standard problem classes (i.e. the best possible worst-case # of oracle calls which allows to solve any problem from the class to a given accuracy) and focus on the corresponding optimal algorithms. Duration: one semester Prerequisites: knowledge of elementary Calculus, Linear Algebra and of the basic concepts of Convex Analysis (like convexity of functions/sets and the notion of subgradient of a convex function) is welcomed, although is not absolutely necessary. Contents: Introduction: problem complexity and method efficiency in optimization Methods with linear dimension-dependent convergence from bisection to the cutting plane scheme how to divide a n-dimensional pie: the Center-of-Gravity method the Outer Ellipsoid method polynomial solvability of Linear Programming the Inner Ellipsoid method convex-concave games and variational inequalities with monotone operators Large-scale problems and methods with dimension-independent convergence subradient and mirror descent methods for nonsmooth convex optimization optimal methods for smooth convex minimization strongly convex unconstrained problems How to solve a linear system: optimal iterative methods for unconstrained convex quadratic minimization 3 About Exercises The majority of Lectures are accompanied by the "Exercise" sections. In several cases, the exercises are devoted to the lecture where they are placed; sometimes they prepare the reader to the next lecture. The mark ∗ at the word "Exercise" or at an item of an exercise means that you may use hints given in Appendix "Hints". A hint, in turn, may refer you to the solution of the exercise given in the Appendix "Solutions"; this is denoted by the mark +. Some exercises are marked by + rather than by ∗; this refers you directly to the solution of an exercise. Exercises marked by # are closely related to the lecture where they are placed; it would be a good thing to solve such an exercise or at least to become acquainted with its solution (if any is given). Exercises which I find difficult are marked with >. The exercises, usually, are not that simple. They in no sense are obligatory, and the reader is not expected to solve all or even the majority of the exercises. Those who would like to work on the solutions should take into account that the order of exercises is important: a problem which could cause serious difficulties as it is becomes much simpler in the context (at least I hope so). 4 Contents 1 Introduction: what the course is about 9 1.1 Example: one-dimensional convex problems . .9 1.2 Conclusion . 17 1.3 Exercises: Brunn, Minkowski and convex pie . 17 1.3.1 Prerequisites . 17 1.3.2 Brunn, Minkowski and Convex Pie . 18 2 Methods with linear convergence, I 29 2.1 Class of general convex problems: description and complexity . 29 2.2 Cutting Plane scheme and Center of Gravity Method . 31 2.2.1 Case of problems without functional constraints . 31 2.3 The general case: problems with functional constraints . 36 2.4 Exercises: Extremal Ellipsoids . 40 2.4.1 Tschebyshev-type results for sums of random vectors . 46 3 Methods with linear convergence, II 51 3.1 Lower complexity bound . 51 3.2 The Ellipsoid method . 56 3.2.1 Ellipsoids . 57 3.2.2 The Ellipsoid method . 59 3.3 Exercises: The Center of Gravity and the Ellipsoid methods . 61 3.3.1 Is it actually difficult to find the center of gravity? . 61 3.3.2 Some extensions of the Cutting Plane scheme . 65 4 Polynomial solvability of Linear Programming 75 4.1 Classes P and NP . 75 4.2 Linear Programming . 78 4.2.1 Polynomial solvability of FLP . 80 4.2.2 From detecting feasibility to solving linear programs . 82 4.3 Exercises: Around the Simplex method and other Simplices . 84 4.3.1 Example of Klee and Minty . 84 4.3.2 The method of outer simplex . 85 5 6 CONTENTS 5 Linearly converging methods for games 87 5.1 Convex-concave games . 87 5.2 Cutting plane scheme for games: updating localizers . 89 5.3 Cutting plane scheme for games: generating solutions . 90 5.4 Concluding remarks . 94 5.5 Exercises: Maximal Inscribed Ellipsoid . 95 6 Variational inequalities with monotone operators 101 6.1 Variational inequalities with monotone operators . 101 6.2 Cutting plane scheme for variational inequalities . 108 6.3 Exercises: Around monotone operators . 111 7 Large-scale optimization problems 115 7.1 Goals and motivations . 115 7.2 The main result . 116 7.3 Upper complexity bound: the Gradient Descent . 117 7.4 The lower bound . 120 7.5 Exercises: Around Subgradient Descent . 122 8 Subgradient Descent and Bundle methods 127 8.1 Subgradient Descent method . 127 8.2 Bundle methods . 132 8.2.1 The Level method . 133 8.2.2 Concluding remarks . 137 8.3 Exercises: Mirror Descent . 138 9 Large-scale games and variational inequalities 143 9.1 Subrgadient Descent method for variational inequalities . 144 9.2 Level method for variational inequalities and games . 147 9.2.1 Level method for games . 147 9.2.2 Level method for variational inequalities . 150 9.3 Exercises: Around Level . 152 9.3.1 "Prox-Level" . 152 9.3.2 Level for constrained optimization . 154 10 Smooth convex minimization problems 159 10.1 Traditional methods . 160 10.2 Complexity of classes Sn(L; R)............................ 161 10.2.1 Upper complexity bound: Nesterov's method . 162 10.2.2 Lower bound . 166 10.2.3 Appendix: proof of Proposition 10.2.1 . 169 11 Constrained smooth and strongly convex problems 171 11.1 Composite problem . 171 11.2 Gradient mapping . 172 11.3 Nesterov's method for composite problems . 175 CONTENTS 7 11.4 Smooth strongly convex problems . 178 12 Unconstrained quadratic optimization 181 12.1 Complexity of quadratic problems: motivation . 181 12.2 Families of source-representable quadratic problems . 183 12.3 Lower complexity bounds . 184 12.4 Complexity of linear operator equations . 187 12.5 Ill-posed problems . 192 12.6 Exercises: Around quadratic forms . 192 13 Optimality of the Conjugate Gradient method 195 13.1 The Conjugate Gradient method . 196 13.2 Main result . 197 13.3 Proof of the main result . 198 13.3.1 CGM and orthogonal polynomials . 198 13.3.2 Expression for inaccuracy . 201 13.3.3 Momentum inequality . 201 13.3.4 Proof of (13.3.20) . 202 13.3.5 Concluding the proof of Theorem 13.2.1 . 204 13.4 Exercises: Around Conjugate Gradient Method . 206 14 Convex Stochastic Programming 211 14.1 Stochastic Approximation: simple case . 214 14.1.1 Assumptions . 214 14.1.2 The Stochastic Approximation method . 214 14.1.3 Comments . 216 14.2 MinMax Stochastic Programming problems . 218 Hints to exercises 221 Solutions to exercises 235 8 CONTENTS Lecture 1 Introduction: what the course is about What we are interested in the course are theoretically efficient methods for convex optimization problems. Almost each word in the previous sentence should be explained, and this explanation, that is, formulation of our goals, is the main thing I am going to speak about today. I believe that the best way to explain what we are about to do is to start with a simple example - one-dimensional convex minimization - where everything is seen. 1.1 Example: one-dimensional convex problems Consider one-dimensional convex problems minimize f(x) s.t. x 2 G = [a; b]; where [a; b] is a given finite segment on the axis. It is also known that our objective f is a continuous convex function on G; for the sake of simplicity, assume that we know bounds, let them be 0 and V , for the values of the objective on G. Thus, all we know about the objective is that it belongs to the family P = ff :[a; b] ! R j f is convex and continuous; 0 ≤ f(x) ≤ V; x 2 [a; b]g: And what we are asked to do is to find, for a given positive ", an "-solution to the problem, i.e., a pointx ¯ 2 G such that f(¯x) − f ∗ ≡ f(¯x) − min f ≤ ": G Of course, our a priori knowledge on the objective given by the inclusion f 2 P, is, for small ", far from being sufficient for finding an "-solution, and we need some source of quantitative information on the objective. The standard assumption here which comes from the optimization practice is that we can compute the value and a subgradient of the objective at a point, i.e., we have access to a subroutine, an oracle O, which gets, as an input, a point x from our segment and returns the value f(x) and a subgradient f 0(x) of the objective at the point.

Load more