Lecture 10: descent methods
• Generic descent algorithm
• Generalization to multiple dimensions
• Problems of descent methods, possible improvements
• Fixes
• Local minima
Gradient descent (reminder)
The minimum of a function f is found by following the slope of the function.

[Figure: curve f(x) with an initial guess and the minimum f(m) at x = m]
Gradient descent (illustration)

[Figure: successive iterations on the curve f(x). From the initial guess, each step follows the local gradient downhill (guess → next step → new gradient → …) until the iterates stop near the minimum f(m) at x = m.]
Gradient descent: algorithm

Start with a point (guess)                 guess = x
Repeat
  Determine a descent direction            direction = -f'(x)   (downhill)
  Choose a step                            step = h > 0
  Update                                   x := x - h f'(x)     (now you are here)
Until stopping criterion is satisfied      f'(x) ≈ 0   (stop when "close" to the minimum)
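The pseudocode above can be sketched in Python; this is a minimal illustration where the test function, the fixed step h and the tolerance are arbitrary choices, not values from the lecture:

```python
def gradient_descent_1d(df, x, h=0.1, tol=1e-6, max_iter=10_000):
    """Minimize a 1-D function given its derivative df.

    x is the initial guess, h the (fixed) step size.
    Stops when |f'(x)| is close to 0.
    """
    for _ in range(max_iter):
        slope = df(x)
        if abs(slope) < tol:      # stopping criterion: f'(x) ~ 0
            break
        x = x - h * slope         # update: x := x - h f'(x)
    return x

# Example: f(x) = (x - 3)^2, so f'(x) = 2(x - 3); minimum at x = 3.
x_min = gradient_descent_1d(lambda x: 2 * (x - 3), x=0.0)
```

With this quadratic, each update shrinks the distance to the minimum by a constant factor (1 - 2h), so the iterates converge to x = 3.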
Example of 2D gradient (MATLAB demo)

[Figure: surface and contour plots of a 2D function with its gradient field.]

Definition of the gradient in 2D: the gradient is just the generalization of the derivative to two dimensions, and it can be generalized to any dimension. Gradient descent works in 2D just as it does in 1D.
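The 2D gradient can be approximated numerically with central finite differences; a small sketch, where the test function f(x, y) = x² + 3y² is an arbitrary choice:

```python
def gradient_2d(f, x, y, eps=1e-6):
    """Approximate the gradient (df/dx, df/dy) of f at (x, y)
    by central differences with spacing eps."""
    dfdx = (f(x + eps, y) - f(x - eps, y)) / (2 * eps)
    dfdy = (f(x, y + eps) - f(x, y - eps)) / (2 * eps)
    return dfdx, dfdy

# Example: f(x, y) = x^2 + 3y^2 has exact gradient (2x, 6y).
gx, gy = gradient_2d(lambda x, y: x**2 + 3 * y**2, 1.0, 2.0)
```

At (1, 2) the exact gradient is (2, 12), and the central-difference estimate matches it to high accuracy on a quadratic.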
Generalization to multiple dimensions

Start with a point (guess)                 guess = x
Repeat
  Determine a descent direction            direction = -∇f(x)   (downhill)
  Choose a step                            step = h > 0
  Update                                   x := x - h ∇f(x)     (now you are here)
Until stopping criterion is satisfied      ∇f(x) ≈ 0   (stop when "close" to the minimum)
Multiple dimensions

Everything that you have seen with derivatives can be generalized with the gradient. For the descent method, f'(x) can be replaced by

∇f(x) = (∂f/∂x, ∂f/∂y)

in two dimensions, and by

∇f(x) = (∂f/∂x₁, …, ∂f/∂x_N)

in N dimensions.
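In N dimensions the update x := x - h ∇f(x) applies componentwise. A minimal sketch in plain Python, where the separable quadratic test function is an assumption for illustration:

```python
def gradient_descent(grad, x, h=0.1, tol=1e-6, max_iter=10_000):
    """Minimize f in N dimensions, given grad: list -> list (the gradient)."""
    for _ in range(max_iter):
        g = grad(x)
        # stopping criterion: ||grad f(x)|| ~ 0
        if sum(gi * gi for gi in g) ** 0.5 < tol:
            break
        # update each coordinate: x_i := x_i - h * (grad f)_i
        x = [xi - h * gi for xi, gi in zip(x, g)]
    return x

# Example: f(x) = sum_i (x_i - i)^2, gradient component 2*(x_i - i);
# the minimum is at x = (0, 1, 2).
x_min = gradient_descent(lambda x: [2 * (xi - i) for i, xi in enumerate(x)],
                         [5.0, 5.0, 5.0])
```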
Example (MATLAB demo): portfolio cost

The cost to buy a portfolio is a function of the amounts held of each of the stocks 1, 2, …, i, …, N. If you want to minimize the price to buy your portfolio, you need to compute the gradient of its price with respect to those amounts.
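For instance, under a hypothetical linear cost model (an assumption for illustration, not the model from the lecture) where the price is P(x) = Σᵢ pᵢ xᵢ, the gradient is simply the vector of unit prices:

```python
# Hypothetical linear cost model: price(x) = sum_i p[i] * x[i],
# where x[i] is the amount of stock i held and p[i] its unit price.
def portfolio_price(p, x):
    return sum(pi * xi for pi, xi in zip(p, x))

def portfolio_gradient(p, x):
    # d price / d x_i = p_i for a linear model:
    # the gradient is just the price vector.
    return list(p)

p = [10.0, 25.0, 4.0]      # made-up unit prices of stocks 1..3
grad = portfolio_gradient(p, [1.0, 2.0, 3.0])
```

Note that a purely linear price has no interior minimum; real portfolio objectives add constraints or nonlinear terms, which is where descent methods become useful.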
Problem 1: choice of the step

When updating the current computation:
- small steps: inefficient (too many steps; convergence takes too long)
- large steps: potentially bad results (the next point can overshoot the minimum and land past it)

[Figure: left, many tiny steps crawling along f(x) toward f(m); right, one large step jumping from the current point past the minimum.]
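Both failure modes are easy to see on the toy quadratic f(x) = x², whose derivative is f'(x) = 2x; the step values below are arbitrary choices for illustration:

```python
def step(x, h):
    # one gradient descent update on f(x) = x^2, where f'(x) = 2x
    return x - h * 2 * x

x_small, x_large = 1.0, 1.0
for _ in range(10):
    x_small = step(x_small, 0.01)  # small step: barely moves toward 0
    x_large = step(x_large, 1.1)   # large step: overshoots 0 and diverges
```

After ten iterations x_small has only reached about 0.82 (still far from the minimum at 0), while |x_large| has grown by a factor of 1.2 at every iteration.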
Problem 2: « ping pong effect »

On poorly conditioned problems, the iterates bounce from one side of the valley to the other instead of moving straight toward the minimum. Related, norm-dependent issues arise for other choices of the descent norm.

[S. Boyd, L. Vandenberghe, Convex Optimization lecture notes, Stanford University, 2004]
Problem 3: stopping criterion

Intuitive criterion: f'(x) ≈ 0.

In multiple dimensions: ∇f(x) ≈ 0, or equivalently ‖∇f(x)‖ ≈ 0.

Rarely used in practice. More about this in EE227A (convex optimization, Prof. L. El Ghaoui).
Fixes

Several methods exist to address these problems:
- Line search methods, in particular
  - Backtracking line search
  - Exact line search
- Normalized steepest descent
- Newton steps
Fundamental problem of the method: local minima

Local minima (MATLAB demo)

[Figure: a function with several valleys; the iterates settle into the nearest one.]

The iterations of the algorithm converge to a local minimum, which is not necessarily the global one: the view of the algorithm is « myopic », since it only uses the local slope.
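This myopia is easy to demonstrate: on a function with two valleys, which minimum you reach depends only on where you start. The double-well f(x) = x⁴ - 2x², with minima at x = ±1, is a toy choice for illustration:

```python
def descend(df, x, h=0.05, n=1000):
    # plain gradient descent with a fixed step
    for _ in range(n):
        x -= h * df(x)
    return x

# f(x) = x^4 - 2x^2 has f'(x) = 4x^3 - 4x: local minima at x = -1 and x = +1.
df = lambda x: 4 * x**3 - 4 * x
left = descend(df, -0.5)    # started on the left slope:  converges to -1
right = descend(df, +0.5)   # started on the right slope: converges to +1
```

Both runs use the identical algorithm; only the initial guess differs, yet they end in different minima.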