Université Joseph Fourier
Master de Mathématiques Appliquées, 2ème année

EFFICIENT METHODS IN OPTIMIZATION
Theory of Nonlinear Programming

Anatoli Iouditski

Lecture Notes
Optimization problems arise naturally in many application fields. Whatever people do, at some point they get a craving for organizing things in the best possible way. This intention, once converted into mathematical form, turns out to be an optimization problem of a certain type (think, say, of the Optimal Diet Problem). Unfortunately, the next step, finding a solution to the mathematical model, is less trivial. At first glance everything looks very simple: many commercial optimization packages are readily available, and any user can get a "solution" to his model just by clicking an icon on the desktop of his PC. The question, however, is how much that solution can be trusted. One of the goals of this course is to show that, despite their attractiveness, general optimization problems very often defeat the expectations of a naive user. In order to apply these formulations successfully, it is necessary to know some theory, which tells us what we can and what we cannot do with optimization problems. Elements of this theory can be found in each lecture of the course. The course itself is based on lectures given by Arkadi Nemirovski at Technion in the late 1990s. On the other hand, all the errors and inanities you may find here should be put on the account of the name on the title page.
http://www-ljk.imag.fr/membres/Anatoli.Iouditski/cours/optimisation-convexe.htm

Contents
1 Introduction 5
  1.1 General formulation of the problem 5
    1.1.1 Problem formulation and terminology 5
    1.1.2 Performance of Numerical Methods 8
  1.2 Complexity bounds for Global Optimization 10
  1.3 Identity cards of the fields 15
  1.4 Rules of the game 17
  1.5 Suggested reading 17
2 When everything is simple: 1-dimensional Convex Optimization 19
  2.1 Example: one-dimensional convex problems 19
    2.1.1 Upper complexity bound 20
    2.1.2 Lower complexity bound 24
  2.2 Conclusion 27
  2.3 Exercises 27
    2.3.1 Can we use 1-dimensional optimization? 27
    2.3.2 Extremal Ellipsoids 30
3 Methods with linear convergence 37
  3.1 Class of general convex problems: description and complexity 37
  3.2 Cutting Plane scheme and Center of Gravity Method 39
    3.2.1 Case of problems without functional constraints 39
  3.3 The general case: problems with functional constraints 44
  3.4 Lower complexity bound 48
  3.5 The Ellipsoid method 54
    3.5.1 Ellipsoids 55
    3.5.2 The Ellipsoid method 57
  3.6 Exercises 60
    3.6.1 Some extensions of the Cutting Plane scheme 60
    3.6.2 The method of outer simplex 71
4 Large-scale optimization problems 73
  4.1 Goals and motivations 73
  4.2 The main result 74
  4.3 Upper complexity bound: Subgradient Descent 76
  4.4 The lower bound 79
  4.5 Subgradient Descent for Lipschitz-continuous convex problems 80
  4.6 Bundle methods 84
    4.6.1 The Level method 86
    4.6.2 Concluding remarks 90
  4.7 Exercises 92
    4.7.1 Around Subgradient Descent 92
    4.7.2 Mirror Descent 96
    4.7.3 Stochastic Approximation 101
    4.7.4 The Stochastic Approximation method 105
5 Nonlinear programming: Unconstrained Minimization 109
  5.1 Relaxation 109
  5.2 Gradient method 110
  5.3 Newton method 117
    5.3.1 Gradient Method and Newton Method: What is different? 121
  5.4 Newton Method and Self-Concordant Functions 125
    5.4.1 Preliminaries 125
    5.4.2 Self-concordance 126
    5.4.3 Self-concordant functions and the Newton method 127
    5.4.4 Self-concordant functions: applications 130
  5.5 Conjugate gradients method 133
  5.6 Exercises 137
    5.6.1 Implementing Gradient method 137
    5.6.2 Regular functions 139
    5.6.3 Properties of the gradient 139
    5.6.4 Classes of differentiable functions 142
6 Constrained Minimization 145
  6.1 Penalty and Barrier function methods 145
  6.2 Self-concordant barriers and path-following scheme 150
    6.2.1 Path-following scheme 151
    6.2.2 Applications 156
    6.2.3 Concluding remarks 158

Lecture 1
Introduction
(General formulation of the problem; Important examples; Black Box and Iterative Methods; Analytical and Arithmetical Complexity; Uniform Grid Method; Lower complexity bounds; Lower bounds for Global Optimization; Rules of the Game.)
1.1 General formulation of the problem
Let us start by fixing the mathematical form of our main problem and the standard terminology.
1.1.1 Problem formulation and terminology

Let x be an n-dimensional real vector: x = (x(1), ..., x(n)) ∈ R^n, let G be a subset of R^n, and let f(x), g1(x), ..., gm(x) be real-valued functions of x. In the entire course we deal with variants of the following minimization problem:

    min f(x),
    s.t.: gj(x) & 0, j = 1, ..., m,     (1.1.1)
    x ∈ G,

where & could be ≤, ≥ or =. We call f(x) the objective function; the vector function g(x) = (g1(x), ..., gm(x)) is called the functional constraint; the set G is called the basic feasible set; and the set

    Q = {x ∈ G | gj(x) ≤ 0, j = 1, ..., m}

is called the feasible set of the problem (1.1.1).(1) There is a natural classification of the types of minimization problems:
(1) It is just a convention to consider minimization problems. Instead, we could consider a maximization problem with the objective function −f(x).
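To make the notation of (1.1.1) concrete, here is a minimal sketch in Python of a toy instance with m = 1 and & taken to be ≤. The particular functions f, g1 and the box G are illustrative assumptions, not taken from the notes; the point is only to show how G, the functional constraints, and the feasible set Q fit together.

```python
# Toy instance of problem (1.1.1), with illustrative (assumed) data:
#   minimize f(x) = x1^2 + x2^2
#   s.t.     g1(x) = 1 - x1 - x2 <= 0   (functional constraint)
#            x in G = [0, 2]^2          (basic feasible set)

def f(x):
    return x[0] ** 2 + x[1] ** 2

def g(x):
    # vector functional constraint g(x) = (g1(x), ..., gm(x)), in "<= 0" form
    return [1.0 - x[0] - x[1]]

def in_G(x):
    # membership in the basic feasible set G
    return all(0.0 <= xi <= 2.0 for xi in x)

def is_feasible(x):
    # x belongs to Q iff x is in G and every functional constraint holds
    return in_G(x) and all(gj <= 0.0 for gj in g(x))

print(is_feasible([0.5, 0.5]))  # g1 = 0: on the constraint boundary, feasible
print(is_feasible([0.2, 0.2]))  # g1 = 0.6 > 0: infeasible
```

Note that Q is described implicitly, through a membership test: for a general problem of type (1.1.1), testing feasibility of a point is easy, while finding the minimizer of f over Q is the hard part.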
• Constrained problems: Q ⊂ R^n.

• Unconstrained problems: Q ≡ R^n.

• Smooth problems: all gj(x) are differentiable.

• Nonsmooth problems: there is a nondifferentiable component gk(x).

• Linearly constrained problems: all functional constraints are linear: