E8.5 Consider the following function of two variables:

$F(x) = (x_1 + x_2)^4 - 12x_1 x_2 + x_1 + x_2 + 1$

i. Verify that the function has three stationary points at

 0.6504 0.085 0.5655 x1 =   x2 =   x3 =    0.6504 0.085 0.5655

ii. Test the stationary points to find any minima, maxima or saddle points.
iii. Find the second-order Taylor series approximations of the function at each of the stationary points.
iv. Plot the function and the approximations using MATLAB.

Answer:

i. We can find the stationary points of this function by setting its gradient to zero:

$\nabla F(x) = \begin{bmatrix} \partial F / \partial x_1 \\ \partial F / \partial x_2 \end{bmatrix} = \begin{bmatrix} 4(x_1 + x_2)^3 - 12x_2 + 1 \\ 4(x_1 + x_2)^3 - 12x_1 + 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$

Subtracting the second equation from the first gives $12(x_1 - x_2) = 0$, so $x_1 = x_2$; substituting into either equation then gives $32x_1^3 - 12x_1 + 1 = 0$. Solving this cubic in MATLAB:

P = [32 0 -12 1];
x0 = roots(P)

x0 =
   -0.6504
    0.5655
    0.0850

So the three stationary points of the function are

 0.6504 0.085 0.5655 x1 =   x2 =   x3 =    0.6504 0.085 0.5655

ii. The Hessian matrix of $F(x)$ is

$\nabla^2 F(x) = \begin{bmatrix} 12(x_1 + x_2)^2 & 12(x_1 + x_2)^2 - 12 \\ 12(x_1 + x_2)^2 - 12 & 12(x_1 + x_2)^2 \end{bmatrix}$

At $x^1$: $\nabla^2 F(x^1) = \begin{bmatrix} 20.305 & 8.305 \\ 8.305 & 20.305 \end{bmatrix}$, with eigenvalues $\lambda_1 = 12.0$, $\lambda_2 = 28.61$. Both eigenvalues are positive, so $x^1$ is a strong minimum.

At $x^2$: $\nabla^2 F(x^2) = \begin{bmatrix} 0.3468 & -11.6532 \\ -11.6532 & 0.3468 \end{bmatrix}$, with eigenvalues $\lambda_1 = 12.0$, $\lambda_2 = -11.3064$. The eigenvalues have mixed signs, so $x^2$ must be a saddle point.

At $x^3$: $\nabla^2 F(x^3) = \begin{bmatrix} 15.3499 & 3.3499 \\ 3.3499 & 15.3499 \end{bmatrix}$, with eigenvalues $\lambda_1 = 12.0$, $\lambda_2 = 18.6998$. Both eigenvalues are positive, so $x^3$ must be a strong minimum.
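These eigenvalue tests can be reproduced numerically (an added check, not part of the original answer):

hess = @(x) [12*(x(1)+x(2))^2, 12*(x(1)+x(2))^2 - 12; 12*(x(1)+x(2))^2 - 12, 12*(x(1)+x(2))^2];
eig(hess([-0.6504; -0.6504]))   %12.0 and 28.61    -> strong minimum
eig(hess([ 0.0850;  0.0850]))   %-11.3064 and 12.0 -> saddle point
eig(hess([ 0.5655;  0.5655]))   %12.0 and 18.6998  -> strong minimum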

iii. The second-order Taylor series expansion of $F(X)$ about the point $X^1$ is

$F_1(X) \approx F(X^1) + \nabla F(X)^T \big|_{X = X^1} (X - X^1) + \frac{1}{2} (X - X^1)^T \, \nabla^2 F(X) \big|_{X = X^1} \, (X - X^1)$

Since $X^1$ is a stationary point, the gradient term vanishes, leaving

$F_1(X) = -2.5139 + \frac{1}{2} \left( X - \begin{bmatrix} -0.6504 \\ -0.6504 \end{bmatrix} \right)^T \begin{bmatrix} 20.305 & 8.305 \\ 8.305 & 20.305 \end{bmatrix} \left( X - \begin{bmatrix} -0.6504 \\ -0.6504 \end{bmatrix} \right)$

$= 9.5887 + \begin{bmatrix} 18.6079 & 18.6079 \end{bmatrix} X + \frac{1}{2} X^T \begin{bmatrix} 20.305 & 8.305 \\ 8.305 & 20.305 \end{bmatrix} X$

Similarly, at points $X^2$ and $X^3$:

$F_2(X) \approx F(X^2) + \frac{1}{2} (X - X^2)^T \, \nabla^2 F(X) \big|_{X = X^2} \, (X - X^2)$

$= 1.0841 + \frac{1}{2} \left( X - \begin{bmatrix} 0.085 \\ 0.085 \end{bmatrix} \right)^T \begin{bmatrix} 0.3468 & -11.6532 \\ -11.6532 & 0.3468 \end{bmatrix} \left( X - \begin{bmatrix} 0.085 \\ 0.085 \end{bmatrix} \right)$

$= 1.0024 + \begin{bmatrix} 0.9610 & 0.9610 \end{bmatrix} X + \frac{1}{2} X^T \begin{bmatrix} 0.3468 & -11.6532 \\ -11.6532 & 0.3468 \end{bmatrix} X$

$F_3(X) \approx F(X^3) + \frac{1}{2} (X - X^3)^T \, \nabla^2 F(X) \big|_{X = X^3} \, (X - X^3)$

$= -0.0702 + \frac{1}{2} \left( X - \begin{bmatrix} 0.5655 \\ 0.5655 \end{bmatrix} \right)^T \begin{bmatrix} 15.3499 & 3.3499 \\ 3.3499 & 15.3499 \end{bmatrix} \left( X - \begin{bmatrix} 0.5655 \\ 0.5655 \end{bmatrix} \right)$

$= 5.9098 - \begin{bmatrix} 10.5747 & 10.5747 \end{bmatrix} X + \frac{1}{2} X^T \begin{bmatrix} 15.3499 & 3.3499 \\ 3.3499 & 15.3499 \end{bmatrix} X$

iv. The following MATLAB code plots F(X) and its three approximations:

clear
[X,Y] = meshgrid(-1 : .1 : 1);
Z = (X+Y).^4 - 12*X.*Y + X + Y + 1;
Z1 = 10.1525*(X.^2+Y.^2) + 18.6079*(X+Y) + 8.305*X.*Y + 9.5887;
Z2 = 0.1734*(X.^2+Y.^2) + 0.961*(X+Y) - 11.6532*X.*Y + 1.0024;
Z3 = 7.6749*(X.^2+Y.^2) - 10.5747*(X+Y) + 3.3499*X.*Y + 5.9098;
N = 10;
figure;
subplot(2,2,1)
surf(X,Y,Z), title('F(X)');
%contour(X, Y, Z, N)
subplot(2,2,2)
surf(X,Y,Z1), title('F_1(X)');
subplot(2,2,3)
surf(X,Y,Z2), title('F_2(X)');
subplot(2,2,4)
surf(X,Y,Z3), title('F_3(X)');

E9.2 We want to find the minimum of the following function:

$F(X) = \frac{1}{2} X^T \begin{bmatrix} 6 & -2 \\ -2 & 6 \end{bmatrix} X + \begin{bmatrix} -1 & -1 \end{bmatrix} X$

i. Sketch a contour plot of this function.
ii. Sketch the trajectory of the steepest descent algorithm on the contour plot of part (i), if the initial guess is $X_0 = \begin{bmatrix} 0 & 0 \end{bmatrix}^T$. Assume a very small learning rate is used.
iii. Perform two iterations of steepest descent with learning rate $\alpha = 0.1$.
iv. What is the maximum stable learning rate?
v. What is the maximum stable learning rate for the initial guess given in part (ii)?
vi. Write a MATLAB M-file to implement the steepest descent algorithm for this problem, and use it to check your answers to parts (i) through (v).

Answer:

i. The Hessian matrix is $\nabla^2 F(X) = A = \begin{bmatrix} 6 & -2 \\ -2 & 6 \end{bmatrix}$, and its eigenvalues and eigenvectors are

$\lambda_1 = 8, \ z_1 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}; \quad \lambda_2 = 4, \ z_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$

The stationary point of this function can be found by setting the gradient to zero:

$\nabla F(X) = AX + d = \begin{bmatrix} 6 & -2 \\ -2 & 6 \end{bmatrix} X + \begin{bmatrix} -1 \\ -1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$

$X^* = \begin{bmatrix} 6 & -2 \\ -2 & 6 \end{bmatrix}^{-1} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0.25 \\ 0.25 \end{bmatrix}$
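The stationary point can be confirmed in one line of MATLAB (an added check):

xstar = [6 -2; -2 6] \ [1; 1]   %xstar = [0.25; 0.25]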

Now we have enough information about the function contours: the contours are elliptical and centered at $X^*$. The maximum curvature is in the direction of $z_1$, while the minimum curvature is in the direction of $z_2$ (the long axis of the ellipses). See Fig. 2 for the function contours.

ii. If the learning rate is small enough, the steepest descent trajectory will follow a path that is orthogonal to each contour line it crosses. The contour plot of F(X) and the trajectory are given in Fig. 2 in part (vi).

iii. The first two iterations of the steepest descent method are performed as follows:

$\nabla F(X) = AX + d = \begin{bmatrix} 6 & -2 \\ -2 & 6 \end{bmatrix} X + \begin{bmatrix} -1 \\ -1 \end{bmatrix}$

$g_0 = \nabla F(X_0) = \begin{bmatrix} 6 & -2 \\ -2 & 6 \end{bmatrix} \begin{bmatrix} 0 \\ 0 \end{bmatrix} + \begin{bmatrix} -1 \\ -1 \end{bmatrix} = \begin{bmatrix} -1 \\ -1 \end{bmatrix}, \quad p_0 = -g_0 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$

$X_1 = X_0 - \alpha g_0 = \begin{bmatrix} 0 \\ 0 \end{bmatrix} - 0.1 \begin{bmatrix} -1 \\ -1 \end{bmatrix} = \begin{bmatrix} 0.1 \\ 0.1 \end{bmatrix}$

$g_1 = \nabla F(X_1) = \begin{bmatrix} 6 & -2 \\ -2 & 6 \end{bmatrix} \begin{bmatrix} 0.1 \\ 0.1 \end{bmatrix} + \begin{bmatrix} -1 \\ -1 \end{bmatrix} = \begin{bmatrix} -0.6 \\ -0.6 \end{bmatrix}$

$X_2 = X_1 - \alpha g_1 = \begin{bmatrix} 0.1 \\ 0.1 \end{bmatrix} - 0.1 \begin{bmatrix} -0.6 \\ -0.6 \end{bmatrix} = \begin{bmatrix} 0.16 \\ 0.16 \end{bmatrix}$

iv. Since the maximum eigenvalue of the Hessian matrix is $\lambda_1 = 8$, the stable learning rate must satisfy $\alpha < 2/\lambda_1 = 2/8 = 0.25$.

v. For the initial guess $X_0 = \begin{bmatrix} 0 & 0 \end{bmatrix}^T$, the error $X_0 - X^*$ lies along the direction of the second eigenvector $z_2$ and stays in that direction throughout the iteration, so the maximum stable learning rate is $\alpha < 2/\lambda_2 = 2/4 = 0.5$. Fig. 3 shows that for $\alpha = 0.51$ the solution becomes very unstable and cannot converge, while for $\alpha = 0.26$ we still get a stable solution.
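The two learning rates mentioned above can be compared with a short loop (an added sketch, not part of the original answer); because the error stays along $z_2$, $\alpha = 0.26$ converges even though it exceeds the general bound of 0.25:

A = [6 -2; -2 6]; d = [-1; -1];
for alfa = [0.26 0.51]
    x = [0; 0];
    for k = 1:200
        x = x - alfa*(A*x + d);   %steepest descent step
    end
    fprintf('alfa = %.2f: x = [%g %g]\n', alfa, x(1), x(2));
end
%alfa = 0.26 converges toward [0.25; 0.25]; alfa = 0.51 grows without bound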

vi. The following MATLAB code gives the stationary point x*, the x value after two iterations (x2), the eigenvalues of the Hessian matrix, and the contour plot of F(x) with the trajectory.

clear
%--- Part (i)
[X,Y] = meshgrid(-0.2 : .01 : .5);
F = 3*(X.^2+Y.^2) - 2*X.*Y - X - Y;
N = 10;
x0 = [.25 .25]';
figure;
plot(x0(1), x0(2),'r*')
hold on;
contour(X,Y,F,N)
title('Fig.2 Trajectory for steepest descent for E9.2');
xlabel('X1 axis'), ylabel('X2 axis');
%---

A = [6 -2;-2 6]; d = [-1;-1]; alfa = 0.1;

%Find the eigenvalues and eigenvectors of the Hessian matrix
[V,D] = eig(A)
iter = 0;
x = [0 0]'; %Initialize x

%Find the stationary point
G = A*x + d;
small = [1.0e-4 1.0e-4]';
while (abs(G(1)) >= small(1) | abs(G(2)) >= small(2))
    plot(x(1), x(2),'k.')
    x = x - alfa * G;
    G = A*x + d;
    iter = iter + 1;
    if (iter == 2)
        x2 = x %Output the x value after two iterations
    end
end
x %Output the stationary point
hold off;

The output is:

V =
   -0.7071   -0.7071
    0.7071   -0.7071
D =
     8     0
     0     4
x2 =
    0.1600
    0.1600
x =
    0.2500
    0.2500

[Fig. 2: Trajectory for steepest descent for E9.2 (contour plot with X1 axis and X2 axis; the trajectory runs from X0 = [0 0]' to X* = [0.25 0.25]').]

If we use $\alpha = 0.51$, the solution becomes very unstable:

x =
   Inf
   Inf

E9.6 Recall the function presented in E8.5. Write MATLAB M-files to implement the steepest descent algorithm and Newton's method for that function. Test the performance of the algorithms for various initial guesses.

Answer:

Steepest descent algorithm

For initial guess x0 = [0 0]':

clear
%plot the function contour lines
[X,Y] = meshgrid(-1 : .1 : 1);
Z = (X+Y).^4 - 12*X.*Y + X + Y + 1;
N = 10;
figure;
contour(X, Y, Z, N), title('Steepest Descent Trajectory for X_0 = [0 0]');
hold on;
x = [0 0]'; %Initialize x
G = zeros(2,1);
G(1) = 4*(x(1)+x(2))^3 - 12*x(2) + 1;
G(2) = 4*(x(1)+x(2))^3 - 12*x(1) + 1;
alfa = 0.01;
dx = [1e4 1e4]';
small = [1.0e-5,1.0e-5]';

%Find the stationary point
while (abs(dx(1)) >= small(1) | abs(dx(2)) >= small(2))
    plot(x(1), x(2),'k.')
    old = x;
    x = x - alfa * G;
    dx = x - old;
    G(1) = 4*(x(1)+x(2))^3 - 12*x(2) + 1;
    G(2) = 4*(x(1)+x(2))^3 - 12*x(1) + 1;
end
x
plot(x(1), x(2),'ro')

The output is:

x =
   -0.6504
   -0.6504

[Figure: Steepest Descent Trajectory for X_0 = [0 0], contour plot with the trajectory converging to (-0.6504, -0.6504).]

For initial guess x0 = [0.5 0.5]':

x =
    0.5654
    0.5654

[Figure: Steepest Descent Trajectory for X_0 = [0.5 0.5], contour plot with the trajectory converging to (0.5654, 0.5654).]

For initial guess x0 = [-0.5 -0.5]':

x =
   -0.6504
   -0.6504

[Figure: Steepest Descent Trajectory for X_0 = [-0.5 -0.5], contour plot with the trajectory converging to (-0.6504, -0.6504).]

Newton's Method:

For initial guess x0 = [0 0]':

clear
%plot the function contour lines
[X,Y] = meshgrid(-1 : .1 : 1);
Z = (X+Y).^4 - 12*X.*Y + X + Y + 1;
N = 10;
figure;
contour(X, Y, Z, N), title('Newton''s Method Trajectory for X_0 = [0 0]');
hold on;

G = zeros(2,1);
A = zeros(2);
x = [0 0]'; %Initialize x
G = grad(x);    %grad and hessian are user-defined functions (listed after the figure below)
A = hessian(x);
dx = [1e4 1e4]';
small = [1.0e-5,1.0e-5]';

%Find the stationary point
while (abs(dx(1)) >= small(1) | abs(dx(2)) >= small(2))
    plot(x(1), x(2),'k.')
    old = x;
    x = x - inv(A)*G;
    dx = x - old;
    G = grad(x);
    A = hessian(x);
end
x
plot(x(1), x(2),'ro')

The output is:

x =
    0.0850
    0.0850

[Figure: Newton's Method Trajectory for X_0 = [0 0], contour plot with the trajectory jumping to the saddle point at (0.085, 0.085).]
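The helper functions grad and hessian are not listed in the original answer; a minimal sketch consistent with the gradient and Hessian derived in E8.5 is given here (the file names grad.m and hessian.m are assumed):

function G = grad(x)
%GRAD Gradient of F(x) = (x1+x2)^4 - 12*x1*x2 + x1 + x2 + 1
G = [4*(x(1)+x(2))^3 - 12*x(2) + 1;
     4*(x(1)+x(2))^3 - 12*x(1) + 1];

function A = hessian(x)
%HESSIAN Hessian of F(x) = (x1+x2)^4 - 12*x1*x2 + x1 + x2 + 1
s = 12*(x(1)+x(2))^2;
A = [s, s - 12; s - 12, s];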

For initial guess x0 = [0.5 0.5]':

x =
    0.5655
    0.5655

[Figure: Newton's Method Trajectory for X_0 = [0.5 0.5], contour plot with the trajectory converging to (0.5655, 0.5655).]

For initial guess x0 = [-0.5 -0.5]':

x =
   -0.6504
   -0.6504

[Figure: Newton's Method Trajectory for X_0 = [-0.5 -0.5], contour plot with the trajectory converging to (-0.6504, -0.6504).]

From the above results we can see that Newton's method locates a stationary point quickly, but it does not distinguish between minima, maxima, and saddle points: from $X_0 = [0\ 0]^T$ it converges to the saddle point at $[0.085\ 0.085]^T$. Steepest descent, by contrast, converges to the local minimum points rather than the saddle point, but which minimum it finds depends on the initial guess.