CLASS NOTES for CHAPTER 4, Nonlinear Programming

1 Space Curves and Tangent Lines

Recall that space curves are defined by a set of parametric equations,
$$r(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \\ \vdots \\ x_n(t) \end{bmatrix}$$
In Calc III, we might have written this a little differently,
$$\vec{r}(t) = \langle x(t),\, y(t),\, z(t) \rangle$$
but here we want to use $n$ dimensions rather than two or three dimensions.

The derivative and antiderivative of $r$ with respect to $t$ are taken componentwise,
$$r'(t) = \begin{bmatrix} x_1'(t) \\ x_2'(t) \\ \vdots \\ x_n'(t) \end{bmatrix}, \qquad
R(t) = \begin{bmatrix} \int x_1(t)\,dt \\ \int x_2(t)\,dt \\ \vdots \\ \int x_n(t)\,dt \end{bmatrix}$$
And the local linear approximation to $r(t)$ is also done componentwise. The tangent line (in $n$ dimensions) can be written easily: the derivative at $t = a$ gives the direction of the curve, so the tangent line is
$$L(t) = \begin{bmatrix} x_1(a) \\ x_2(a) \\ \vdots \\ x_n(a) \end{bmatrix} + t \begin{bmatrix} x_1'(a) \\ x_2'(a) \\ \vdots \\ x_n'(a) \end{bmatrix}$$

In Class Exercise: Use Maple to plot the curve $r(t) = [\cos(t), \sin(t), t]^T$ and its tangent line at $t = \pi/2$.

2 Gradients and Tangent Planes

Let $f : \mathbb{R}^n \to \mathbb{R}$. In this case, we can write:
$$y = f(x_1, x_2, x_3, \ldots, x_n)$$
Note that any function we wish to optimize must be of this form: it would not make sense to find the maximum of a function like a space curve, since $n$-dimensional coordinates are not well ordered like the real line, so a statement such as $(3, 5) > (1, 2)$ would be meaningless. This type of function is sometimes called a surface, even though we can't graph it if the domain has dimension higher than 2.

The gradient of $f$, $\nabla f$, is defined by its partial derivatives. We will denote $\frac{\partial f}{\partial x_i}$ by $f_{x_i}$, and so:
$$\nabla f(x_1, x_2, \ldots, x_n) = \begin{bmatrix} f_{x_1}(x_1, x_2, \ldots, x_n) \\ f_{x_2}(x_1, x_2, \ldots, x_n) \\ \vdots \\ f_{x_n}(x_1, x_2, \ldots, x_n) \end{bmatrix}$$
NOTE: When writing the gradient, we will not distinguish between it being a row vector or a column vector. For some ideas it may be clearer notationally to write it as a column, but for computations it is a row vector. Normally it will be clear from the context which is meant.

The rate of change of $f$ at $x = a$ in the direction of the unit vector $u$ is defined as the directional derivative of $f$:
$$D_u f(a) = \nabla f(a) \cdot u$$
If we recall the useful relationship between any two vectors $u, v \in \mathbb{R}^n$,
$$u \cdot v = \|u\|_2 \, \|v\|_2 \cos(\theta)$$
then we see that the direction in which the rate of change of $f$ is a maximum is the direction of $\nabla f(a)$.

In Class Example: Find the maximum rate of change of $f$ at the given point and the direction in which it occurs: $f(x, y) = x + y/z$ at $(4, 3)$.

The linearization of $f$ at $x = a$ is the tangent plane,
$$L(x) = f(a) + \nabla f(a) \cdot (x - a)$$
which is also the first order Taylor series approximation to $f$ at $a$.

2.1 Surfaces and Contours

In maximizing or minimizing a function, we'll consider equations like $F(x) = k$, where $k$ is a real number. From our earlier discussion of the Implicit Function Theorem, we know that this equation could implicitly define one of the variables in terms of the others. In Calculus III, for example, we saw that:

- $F(x, y) = k$ defines a level curve in the plane.
- $F(x, y, z) = k$ defines a level surface in three dimensions.

Now let $r(t)$ be a curve on the surface defined by $F(x) = k$. Then we can regard $F$ as depending on $t$ and differentiate with respect to $t$:
$$\nabla F(r(t)) \cdot r'(t) = 0$$
which implies that the gradient is orthogonal to any curve on the surface.
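As a quick numerical check of this orthogonality (a sketch in Python rather than Maple; the unit sphere $F(x, y, z) = x^2 + y^2 + z^2 = 1$ and the particular curve on it are choices made for illustration, not taken from the notes):

```python
import numpy as np

# Level surface F(x, y, z) = x^2 + y^2 + z^2 = 1 (the unit sphere),
# whose gradient is grad F = (2x, 2y, 2z).
def grad_F(p):
    return 2.0 * p

# A curve that stays on the sphere:
# r(t) = (cos(t)cos(2t), cos(t)sin(2t), sin(t)).
def r(t):
    return np.array([np.cos(t) * np.cos(2 * t),
                     np.cos(t) * np.sin(2 * t),
                     np.sin(t)])

def r_prime(t, h=1e-6):
    # Central-difference approximation to r'(t).
    return (r(t + h) - r(t - h)) / (2 * h)

# grad F(r(t)) . r'(t) should be (numerically) zero for every t.
for t in np.linspace(0.0, 2 * np.pi, 7):
    print(f"t = {t:5.2f},  grad F . r' = {np.dot(grad_F(r(t)), r_prime(t)): .2e}")
```

Every printed dot product is zero up to roundoff, as the identity predicts.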
In particular, we can compute the tangent line or tangent plane at the point $(a, b)$ or $(a, b, c)$:

- $F_x(a, b)(x - a) + F_y(a, b)(y - b) = 0$
- $F_x(a, b, c)(x - a) + F_y(a, b, c)(y - b) + F_z(a, b, c)(z - c) = 0$

If $x \in \mathbb{R}^n$, this can be written compactly as $\nabla F(a) \cdot (x - a) = 0$.

In Class Example: Compute the tangent line and show that the gradient is orthogonal to it, if $x^2 + y^2 = 1$ at $\left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)$. In this case, the gradient of $F(x, y) = x^2 + y^2$ is $[2x, 2y]$, and at the given point the gradient is $[\sqrt{2}, \sqrt{2}]$. Notice that this vector points directly away from the center of the circle. On the other hand, the equation of the tangent line is:
$$\sqrt{2}\left(x - \frac{1}{\sqrt{2}}\right) + \sqrt{2}\left(y - \frac{1}{\sqrt{2}}\right) = 0$$
or, in the more familiar form, $y = -x + \sqrt{2}$. The "slope" of the gradient is $1$, while the slope of the tangent line is $-1$. In two dimensions such as this, a line in the direction of the gradient has slope $\frac{F_y}{F_x}$, while the slope of the tangent line is $-\frac{F_x}{F_y}$.

The important point in this section: given a level surface of the form $F(x) = k$, the gradient $\nabla F$ at $x = a$ is orthogonal to the surface. Furthermore, the gradient points in the direction in which $F$ increases the fastest (for $y = F(x)$).

3 Derivatives we don't see in Calculus III

3.1 The Hessian Matrix

Let $f : \mathbb{R}^n \to \mathbb{R}$, so that we have something of the form $y = f(x_1, x_2, \ldots, x_n)$. We have already seen that the first derivative of $f$ takes the form of the gradient. We now examine the second derivative of $f$, and we'll see that it forms what is called the Hessian matrix.

The first derivative is the gradient, a vector of functions. Let $f_{x_i}(x_1, \ldots, x_n) = g_i(x_1, \ldots, x_n)$. Each function $g_i$ is itself a function from $\mathbb{R}^n$ to $\mathbb{R}$, and so the derivative of $g_i$ is also a gradient. We can therefore write the second derivative of $f$ in matrix form to obtain what is called the Hessian of $f$:
$$Hf = \begin{bmatrix} \nabla g_1 \\ \nabla g_2 \\ \vdots \\ \nabla g_n \end{bmatrix}
    = \begin{bmatrix} f_{x_1 x_1} & f_{x_1 x_2} & \cdots & f_{x_1 x_n} \\
                      f_{x_2 x_1} & f_{x_2 x_2} & \cdots & f_{x_2 x_n} \\
                      \vdots & & & \vdots \\
                      f_{x_n x_1} & f_{x_n x_2} & \cdots & f_{x_n x_n} \end{bmatrix}$$

In Class Example: Let $f(x, y) = x^3 + 2xy - y^2$. Compute the gradient and Hessian of $f$.
$$\nabla f = \begin{bmatrix} 3x^2 + 2y \\ 2x - 2y \end{bmatrix}, \qquad
Hf = \begin{bmatrix} 6x & 2 \\ 2 & -2 \end{bmatrix}$$
You might note that, if the second partials of $f$ are continuous, then $Hf$ will always be a symmetric matrix (this is a consequence of Clairaut's Theorem).

In Class Example: Let $f(x, y) = 2 + x^2 + x + xy + y + y^2$. Show that $f$ can be written as:
$$f(a) + \nabla f(a)(x - a) + \frac{1}{2}(x - a)^T Hf(a)(x - a)$$
where $a = (0, 0)$. First, we compute $\nabla f$ and $Hf$:
$$\nabla f = \begin{bmatrix} 2x + 1 + y \\ x + 1 + 2y \end{bmatrix}, \qquad
Hf = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$$
Substituting $(x, y) = (0, 0)$, the gradient is $[1, 1]$; now compute:
$$2 + [1, 1] \cdot \begin{bmatrix} x - 0 \\ y - 0 \end{bmatrix}
  + \frac{1}{2}\,[x - 0,\; y - 0] \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x - 0 \\ y - 0 \end{bmatrix}
  = 2 + x + y + \frac{1}{2}\,[x, y] \begin{bmatrix} 2x + y \\ x + 2y \end{bmatrix}
  = 2 + x + y + x^2 + xy + y^2$$

The Second Order Taylor Approximation: If $f : \mathbb{R}^n \to \mathbb{R}$, then the Taylor approximation to $f$ at $a$ to second order is:
$$f(x) \approx f(a) + \nabla f(a)(x - a) + \frac{1}{2}(x - a)^T Hf(a)(x - a)$$

3.2 Connecting the Hessian to Calculus III

In Calc III, when we were finding local extrema, we had the "Second Derivatives Test". To generalize this result requires a little linear algebra, but the end result is this: if $x = a$ is a critical point of $f$ and $Hf(a)$ has all positive eigenvalues, then $f$ has a local minimum at $a$. If $Hf(a)$ has all negative eigenvalues, then $f$ has a local maximum at $a$. If the eigenvalues are mixed in sign, there is a saddle point.
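Here is a minimal sketch of this eigenvalue test in Python, applied to the earlier example $f(x, y) = x^3 + 2xy - y^2$; the critical points $(0, 0)$ and $(-2/3, -2/3)$ are found by hand from $\nabla f = (3x^2 + 2y,\, 2x - 2y) = 0$:

```python
import numpy as np

# Second Derivatives Test via eigenvalues, for f(x, y) = x^3 + 2xy - y^2,
# whose Hessian (computed above) is [[6x, 2], [2, -2]].
def hessian(x, y):
    return np.array([[6.0 * x, 2.0],
                     [2.0, -2.0]])

for a in [(0.0, 0.0), (-2/3, -2/3)]:
    w = np.linalg.eigvalsh(hessian(*a))  # Hf is symmetric, so eigvalsh applies
    if np.all(w > 0):
        kind = "local minimum"
    elif np.all(w < 0):
        kind = "local maximum"
    else:
        kind = "saddle point"
    print(a, w, kind)
```

At $(0, 0)$ the eigenvalues are $-1 \pm \sqrt{5}$ (mixed signs: a saddle point), while at $(-2/3, -2/3)$ they are $-3 \pm \sqrt{5}$ (both negative: a local maximum).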
3.3 The Jacobian Matrix

Let $f : \mathbb{R}^n \to \mathbb{R}^m$. Then we can write $f$ using $m$ coordinate functions, each depending on $n$ variables,
$$f(x_1, \ldots, x_n) = \begin{bmatrix} f_1(x_1, \ldots, x_n) \\ f_2(x_1, \ldots, x_n) \\ \vdots \\ f_m(x_1, \ldots, x_n) \end{bmatrix}$$
Definition: The derivative of $f$ takes the form of a matrix called the Jacobian of $f$, constructed by taking the partial derivatives:
$$Jf = \begin{bmatrix} (f_1)_{x_1} & (f_1)_{x_2} & \cdots & (f_1)_{x_n} \\
                      (f_2)_{x_1} & (f_2)_{x_2} & \cdots & (f_2)_{x_n} \\
                      \vdots & & & \vdots \\
                      (f_m)_{x_1} & (f_m)_{x_2} & \cdots & (f_m)_{x_n} \end{bmatrix}$$

In Class Example: Compute the Jacobian of:
$$f(x, y, z) = \begin{bmatrix} x^2 + yz + z \\ y^2 + xy + 3 \\ \sin(xyz) \end{bmatrix}$$
$$Jf = \begin{bmatrix} 2x & z & y + 1 \\ y & 2y + x & 0 \\ yz\cos(xyz) & xz\cos(xyz) & xy\cos(xyz) \end{bmatrix}$$

If $f : \mathbb{R}^n \to \mathbb{R}^m$, then the linearization of $f$ at $x = a$ is given by:
$$L(x) = f(a) + Jf(a)(x - a)$$

4 The Extreme Value Theorem

Let $f : \mathbb{R}^n \to \mathbb{R}$ be continuous on a closed and bounded domain $D$. Then $f$ attains both an absolute maximum and an absolute minimum on $D$.
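As an illustration (a sketch only; the choice of domain and the grid search are not from the notes), we can watch the theorem in action by sampling the earlier quadratic $f(x, y) = 2 + x^2 + x + xy + y + y^2$ over the closed square $[-2, 2] \times [-2, 2]$:

```python
import numpy as np

# The earlier quadratic example, which is continuous everywhere.
def f(x, y):
    return 2 + x**2 + x + x * y + y + y**2

# Closed, bounded square D = [-2, 2] x [-2, 2], sampled on a fine grid.
xs = np.linspace(-2, 2, 401)
X, Y = np.meshgrid(xs, xs)
Z = f(X, Y)

imin = np.unravel_index(np.argmin(Z), Z.shape)
imax = np.unravel_index(np.argmax(Z), Z.shape)
print("approx min:", Z[imin], "at", (X[imin], Y[imin]))
print("approx max:", Z[imax], "at", (X[imax], Y[imax]))
```

The minimum ($\approx 5/3$, near the interior critical point $(-1/3, -1/3)$) and the maximum ($18$, at the boundary corner $(2, 2)$) are both attained on $D$, as the theorem guarantees.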