Lecture 10: descent methods

• Generic descent
• Generalization to multiple dimensions
• Problems of descent methods, possible improvements
• Fixes
• Local minima

Gradient descent (reminder)

The minimum of a function f is found by following its slope (the derivative f') downhill.

[Figure: the curve f(x), an initial guess, and the minimum value f(m) reached at x = m.]

Gradient descent (illustration)

[Figure sequence: starting from the guess, each step moves downhill along f; at the new point a new gradient is computed and the next step is taken, until the iterate stops close to the minimum f(m) at x = m.]

Gradient descent: algorithm

Start with a point (guess)
Repeat
    Determine a descent direction (downhill)
    Choose a step
    Update (now you are here)
Until stopping criterion is satisfied (stop when “close” to the minimum)

In one dimension, this becomes:

    guess: x
    direction: -f'(x)
    step: h > 0
    update: x := x - h f'(x)
    stopping criterion: f'(x) ~ 0
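To make the one-dimensional recipe concrete, here is a minimal sketch in Python; the test function, step size h, and tolerance eps are illustrative choices, not values from the lecture:

```python
def gradient_descent_1d(df, x, h=0.1, eps=1e-6, max_iter=10_000):
    """Minimize a 1-D function given its derivative df.

    Implements the slide's loop: x := x - h * f'(x),
    stopping when f'(x) ~ 0.
    """
    for _ in range(max_iter):
        g = df(x)
        if abs(g) < eps:   # stopping criterion: f'(x) ~ 0
            return x
        x = x - h * g      # update: one step downhill
    return x               # give up after max_iter steps

# Example: f(x) = (x - 2)^2, so f'(x) = 2 * (x - 2); minimum at x = 2.
print(gradient_descent_1d(lambda x: 2 * (x - 2), x=0.0))
```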

Example of 2D gradient (MATLAB demo)

[Figure: illustration of the gradient in 2D.]

Definition of the gradient in 2D

This is just the generalization of the derivative to two dimensions, and it can be generalized to any dimension.
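In symbols, this is the standard definition (the slide's own formula did not survive extraction):

∇f(x, y) = ( ∂f/∂x , ∂f/∂y )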

Example of 2D gradient (MATLAB demo)

[Figure: gradient descent works in 2D.]

Generalization to multiple dimensions

Start with a point (guess)
Repeat
    Determine a descent direction (downhill)
    Choose a step
    Update (now you are here)
Until stopping criterion is satisfied (stop when “close” to the minimum)

The only change is that the derivative is replaced by the gradient:

    guess: x
    direction: -∇f(x)
    step: h > 0
    update: x := x - h ∇f(x)
    stopping criterion: ∇f(x) ~ 0

Multiple dimensions

Everything that you have seen with the derivative can be generalized with the gradient. For the descent method, f'(x) can be replaced by

∇f(x) = ( ∂f/∂x1 , ∂f/∂x2 )

in two dimensions, and by

∇f(x) = ( ∂f/∂x1 , … , ∂f/∂xN )

in N dimensions.
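The same loop can be written for N dimensions with NumPy; as before, the test function and parameter values below are illustrative choices:

```python
import numpy as np

def gradient_descent(grad_f, x0, h=0.1, eps=1e-6, max_iter=10_000):
    """Minimize f over R^N given its gradient grad_f.

    Same iteration as in 1-D, with f'(x) replaced by the gradient:
    x := x - h * grad_f(x), stopping when ||grad_f(x)|| ~ 0.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) < eps:   # stopping criterion in N dimensions
            break
        x = x - h * g
    return x

# Example: f(x) = ||x - c||^2 with c = (1, 2), so grad f(x) = 2 * (x - c).
c = np.array([1.0, 2.0])
print(gradient_descent(lambda x: 2 * (x - c), x0=np.zeros(2)))  # ~ [1. 2.]
```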

Example of 2D gradient (MATLAB demo)

The cost of buying a portfolio depends on the quantities of Stock 1, Stock 2, …, Stock i, …, Stock N that it contains. If you want to minimize the price of your portfolio, you need to compute the gradient of that price with respect to those quantities.
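For instance, in the simplest model (an assumption for illustration; the slide's formula did not survive extraction) where the portfolio holds n_i shares of Stock i at unit price p_i, the cost is

cost(n_1, …, n_N) = n_1 p_1 + n_2 p_2 + … + n_N p_N

and its gradient with respect to the quantities is simply ∇cost = (p_1, p_2, …, p_N).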

Problem 1: choice of the step

When updating the current iterate:
- small steps: inefficient
- large steps: potentially bad results

[Figure: with steps that are too small, the algorithm takes too many steps and is too slow to converge to f(m).]

[Figure: with a step that is too large, the next point jumps past the minimum (it went too far from the current point).]

Problem 2: « ping pong effect »

[Figure: the « ping pong » behaviour of gradient descent iterates. From S. Boyd and L. Vandenberghe, Convex Optimization lecture notes, Stanford University, 2004.]

Problem 2: other norm-dependent issues

[Figure: further norm-dependent behaviour of the descent direction. Same source.]
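The effect is easy to reproduce numerically. The sketch below is an illustrative Python experiment on an ill-conditioned quadratic (my own choice of function and step, not the figure from the Boyd and Vandenberghe notes): the second coordinate overshoots and flips sign at every iteration.

```python
import numpy as np

# f(x) = 0.5 * (x1^2 + gamma * x2^2): an elongated bowl, gamma >> 1.
gamma = 10.0
grad_f = lambda x: np.array([x[0], gamma * x[1]])

x = np.array([10.0, 1.0])
h = 0.18  # close to the stability limit 2/gamma = 0.2 for the x2 axis
for k in range(8):
    x = x - h * grad_f(x)
    print(k, x)  # x2 changes sign each step: the "ping pong" zig-zag
```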

Problem 3: stopping criterion

Intuitive criterion: stop when f'(x) ~ 0, i.e. when |f'(x)| ≤ ε for a small tolerance ε.

In multiple dimensions: ∇f(x) ~ 0, or equivalently ‖∇f(x)‖ ≤ ε.

Rarely used in practice. More about this in EE227A (Convex Optimization, Prof. L. El Ghaoui).

Fixes

Several methods exist to address these problems:

- Line search methods, in particular:
  - Backtracking line search (see the sketch below)
  - Exact line search
- Normalized steepest descent
- Newton steps
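Here is a minimal sketch of backtracking line search in Python, assuming the standard sufficient-decrease (Armijo) test; the parameter names alpha and beta follow common convention and the values are illustrative:

```python
import numpy as np

def backtracking_step(f, grad_f, x, alpha=0.3, beta=0.8):
    """Take one descent step whose size is chosen by backtracking.

    Shrink h until the sufficient-decrease condition holds:
    f(x - h * g) <= f(x) - alpha * h * ||g||^2.
    """
    g = grad_f(x)
    h = 1.0
    while f(x - h * g) > f(x) - alpha * h * np.dot(g, g):
        h *= beta          # step too large: shrink and try again
    return x - h * g

# One step on the ill-conditioned bowl f(x) = 0.5 * x^T A x, A = diag(1, 10).
A = np.diag([1.0, 10.0])
f = lambda x: 0.5 * x @ A @ x
grad_f = lambda x: A @ x
print(backtracking_step(f, grad_f, np.array([10.0, 1.0])))
```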

Fundamental problem of the method: local minima

Local minima (MATLAB demo)

[Figure: gradient descent on a function with several minima.]

The iterations of the algorithm converge to a local minimum
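A small sketch of this behaviour (the double-well function below is an illustrative choice, not the MATLAB demo from the lecture): the same algorithm reaches a different minimum depending on the initial guess.

```python
# f(x) = x^4 - 2x^2 + 0.3x has two minima; only one of them is global.
df = lambda x: 4 * x**3 - 4 * x + 0.3   # derivative of f

def descend(x, h=0.01, eps=1e-8):
    while abs(df(x)) > eps:
        x -= h * df(x)
    return x

print(descend(+1.5))  # ends in the local minimum near x ~ +0.96
print(descend(-1.5))  # ends in the lower, global minimum near x ~ -1.03
```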

The algorithm is « myopic »: it only sees the local slope of f around the current point, not the global shape of the function, so it cannot tell a local minimum from the global one.
18