
February 18, 2019

LECTURE 7: DIRECTIONAL DERIVATIVES.

110.211 HONORS MULTIVARIABLE CALCULUS PROFESSOR RICHARD BROWN

Synopsis. Today, we move into directional derivatives, a generalization of a partial derivative where we look at how a function is changing at a point in any single direction in the domain. This gives a powerful tool, both conceptually and technically, to discuss the role the derivative of a function plays in exposing the properties of both functions on and sets within Euclidean space. We define the gradient of a real-valued function (finally), along with its interpretations and usefulness, and move toward one of the most powerful theorems of multivariable calculus, the Implicit Function Theorem.

The directional derivative.

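7.0.1. Vector form of a partial derivative. Recall the definition of a partial derivative evaluated at a point: Let $f : X \subset \mathbb{R}^2 \to \mathbb{R}$, $X$ open, and $(a, b) \in X$. Then the partial derivative of $f$ with respect to the first coordinate $x$, evaluated at $(a, b)$, is
\[ \frac{\partial f}{\partial x}(a, b) = \lim_{h \to 0} \frac{f(a + h, b) - f(a, b)}{h}. \]
Here, we vary only the first coordinate, leaving the $y$-coordinate value $b$ fixed, and write $(a + h, b) = (a, b) + (h, 0)$. In vector notation, this is like taking the vector $\mathbf{a} = \begin{bmatrix} a \\ b \end{bmatrix}$ and adding to it a small amount $h$, but only in the $x$-direction. Indeed, this means adding to $\mathbf{a}$ the vector $h \begin{bmatrix} 1 \\ 0 \end{bmatrix} = h\mathbf{i}$, where here we use the standard convention for unit vectors in $\mathbb{R}^2$ and $\mathbb{R}^3$, namely $\mathbf{i} = \mathbf{e}_1$, $\mathbf{j} = \mathbf{e}_2$, etc. We get
\[ \mathbf{a} + h\mathbf{i} = \begin{bmatrix} a \\ b \end{bmatrix} + h \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix} + \begin{bmatrix} h \\ 0 \end{bmatrix} = \begin{bmatrix} a + h \\ b \end{bmatrix}. \]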

Then the definition of a partial derivative becomes
\[ \frac{\partial f}{\partial x}(\mathbf{a}) = \lim_{h \to 0} \frac{f(\mathbf{a} + h\mathbf{i}) - f(\mathbf{a})}{h}. \]

Figure 7.1. A directional derivative in the x-direction is the partial.

However, one can take a derivative of $f$ at a point $(a, b)$, or the point $\mathbf{a} = \begin{bmatrix} a \\ b \end{bmatrix}$, in any direction in the domain: Let $\mathbf{v} \in \mathbb{R}^2$. Then
\[ \lim_{h \to 0} \frac{f(\mathbf{a} + h\mathbf{v}) - f(\mathbf{a})}{h} \]
is perfectly well defined as long as the quantity $\mathbf{a} + h\mathbf{v}$ remains in $X$, which, since $X$ is open, will be the case for small enough $h$. This is the derivative of $f$ at $(a, b)$ in the direction of $\mathbf{v}$, also known as the directional derivative of $f$ at $(a, b)$ with respect to $\mathbf{v}$:
\[ D_{\mathbf{v}} f(\mathbf{a}) = \lim_{h \to 0} \frac{f(\mathbf{a} + h\mathbf{v}) - f(\mathbf{a})}{h}. \]

Figure 7.2. A directional derivative in the direction of $\mathbf{v} \in \mathbb{R}^2$.

How does this work? For $f$ differentiable at $\mathbf{a}$, compose $f$ with the affine function $g : \mathbb{R} \to \mathbb{R}^2$, where
\[ g(t) = \mathbf{a} + t\mathbf{v} = \begin{bmatrix} a \\ b \end{bmatrix} + t \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}. \]
Here, $g$ parameterizes a line in $\mathbb{R}^2$ where at $t = 0$, $g(0) = \mathbf{a}$, and at $t = 1$, $g(1) = \mathbf{a} + \mathbf{v}$. $g$ is also $C^1$, and $g'(t) = \mathbf{v}$. In particular, $g'(0) = \mathbf{v}$. Now let $F(t) = f(g(t)) = (f \circ g)(t) = f(\mathbf{a} + t\mathbf{v})$, like in our definition of directional derivative. Here, $F$, as the composition of two differentiable functions, will also be differentiable, and
\[ F'(0) = \lim_{t \to 0} \frac{F(t) - F(0)}{t - 0} = \lim_{t \to 0} \frac{f(\mathbf{a} + t\mathbf{v}) - f(\mathbf{a})}{t}. \]
But, using the Chain Rule, we can write

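\[ F'(0) = D_{\mathbf{v}} f(\mathbf{a}) = \frac{d}{dt} f(\mathbf{a} + t\mathbf{v}) \Big|_{t=0} = Df(g(0))\, g'(0) = Df(\mathbf{a})\,\mathbf{v}. \]

Definition 7.1. Let $f : X \subset \mathbb{R}^n \to \mathbb{R}$ be $C^1$. Then the gradient function of $f$ is the function
\[ \nabla f : X \subset \mathbb{R}^n \to \mathbb{R}^n, \qquad \nabla f(\mathbf{x}) = \begin{bmatrix} f_{x_1}(\mathbf{x}) \\ f_{x_2}(\mathbf{x}) \\ \vdots \\ f_{x_n}(\mathbf{x}) \end{bmatrix}. \]
The gradient vector of $f$ at $\mathbf{a} \in X$ is a vector in $\mathbb{R}^n$ based at $\mathbf{a}$:
\[ \nabla f(\mathbf{a}) = \begin{bmatrix} f_{x_1}(\mathbf{a}) \\ f_{x_2}(\mathbf{a}) \\ \vdots \\ f_{x_n}(\mathbf{a}) \end{bmatrix}. \]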
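As a quick sanity check on Definition 7.1, the sketch below compares a hand-computed gradient with a finite-difference approximation of each partial derivative. The sample function f(x, y) = x^2 y + sin y, the point (1, 2), and the helper name numerical_gradient are illustrative assumptions, not part of the lecture.

```python
import math

def f(x, y):
    # Sample C^1 function, chosen only for illustration.
    return x**2 * y + math.sin(y)

def grad_f(x, y):
    # Gradient computed by hand: (f_x, f_y) = (2xy, x^2 + cos y).
    return (2 * x * y, x**2 + math.cos(y))

def numerical_gradient(func, point, h=1e-6):
    # Approximate each partial derivative with a central difference,
    # varying one coordinate at a time and holding the others fixed.
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        grad.append((func(*forward) - func(*backward)) / (2 * h))
    return tuple(grad)

a = (1.0, 2.0)
exact = grad_f(*a)
approx = numerical_gradient(f, a)
print("gradient by hand:      ", exact)
print("gradient by difference:", approx)
```

The two printed vectors should agree to several decimal places; the small gap is the truncation and rounding error of the difference quotient.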

Notes:
• The gradient function carries the same information as the derivative matrix of $f$, but is a vector of functions, so that $Df(\mathbf{x}) = (\nabla f(\mathbf{x}))^T$, where $T$ denotes the transpose.
• The gradient is only defined for scalar-valued functions.

Using this gradient function, we can write
\[ D_{\mathbf{v}} f(\mathbf{a}) = \underbrace{Df(\mathbf{a})\,\mathbf{v}}_{\text{matrix mult.}} = \underbrace{\nabla f(\mathbf{a}) \cdot \mathbf{v}}_{\text{dot product}}. \]
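The identity above can be checked numerically. The following sketch evaluates the difference quotient from the limit definition for shrinking h and compares it with the dot product of the gradient and v. The sample function, the point (1, 2), and the unit vector (3/5, 4/5) are assumptions made for illustration.

```python
import math

def f(x, y):
    # Sample function, chosen only for illustration.
    return x**2 * y + math.sin(y)

def grad_f(x, y):
    # Gradient computed by hand: (2xy, x^2 + cos y).
    return (2 * x * y, x**2 + math.cos(y))

def directional_quotient(func, a, v, h):
    # The difference quotient (f(a + h v) - f(a)) / h from the definition.
    shifted = [ai + h * vi for ai, vi in zip(a, v)]
    return (func(*shifted) - func(*a)) / h

a = (1.0, 2.0)
v = (3 / 5, 4 / 5)  # a unit vector

# Right-hand side of the identity: grad f(a) . v
dot_value = sum(g * vi for g, vi in zip(grad_f(*a), v))

for h in (1e-1, 1e-3, 1e-5):
    print(f"h = {h:g}: quotient = {directional_quotient(f, a, v, h):.8f}")
print(f"grad f(a) . v = {dot_value:.8f}")

quotient = directional_quotient(f, a, v, 1e-6)
```

As h shrinks, the printed quotients should approach the dot product, which is what the Chain Rule computation predicts.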

Warning! The choice of v is really a choice of direction only! Thus, it is vitally important that ||v|| = 1 for this choice.
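To see why the warning matters, the sketch below evaluates the dot product of the gradient with a non-unit vector and with its normalization. The function f(x, y) = x^2 y, the point (1, 2), the vector w = (3, 4), and the helper name unit are illustrative assumptions.

```python
import math

def grad_f(x, y):
    # Gradient of the sample function f(x, y) = x^2 * y, computed by hand.
    return (2 * x * y, x**2)

def unit(w):
    # Rescale w to length 1 so that it encodes a direction only.
    norm = math.hypot(*w)
    return tuple(wi / norm for wi in w)

a = (1.0, 2.0)
w = (3.0, 4.0)   # length 5, so not a unit vector
v = unit(w)      # (0.6, 0.8)

raw = sum(g * wi for g, wi in zip(grad_f(*a), w))
normalized = sum(g * vi for g, vi in zip(grad_f(*a), v))

print("grad f(a) . w =", raw)         # inflated by the length of w
print("grad f(a) . v =", normalized)  # rate of change per unit distance
```

The two values differ by exactly the length of w, so only the normalized version measures the rate of change of f per unit distance in that direction.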

Exercise 1. Show that for $k \in \mathbb{R}$ and $\mathbf{w} = k\mathbf{v}$, $D_{\mathbf{w}} f(\mathbf{a}) = k\, D_{\mathbf{v}} f(\mathbf{a})$.

The directional derivative specifies how $f$ is changing in the direction of $\mathbf{v}$. But what does this mean? Imagine standing in $X \subset \mathbb{R}^n$ at a point $\mathbf{a}$ where a real-valued $f$ is defined and differentiable. How is $f$ changing in the particular direction that you are facing at the moment? For any $\mathbf{v} \in \mathbb{R}^n$ with $\|\mathbf{v}\| = 1$, $D_{\mathbf{v}} f(\mathbf{a}) = \nabla f(\mathbf{a}) \cdot \mathbf{v}$. Now recall that $\mathbf{x} \cdot \mathbf{y} = \|\mathbf{x}\| \, \|\mathbf{y}\| \cos\theta$, where $\theta$ is the angle between $\mathbf{x}$ and $\mathbf{y}$. Remember that, for any $n > 1$, any two non-collinear (what does this mean?) vectors in $\mathbb{R}^n$ span a plane. Within that plane, there is a well-defined angle between them. So
\[ D_{\mathbf{v}} f(\mathbf{a}) = \nabla f(\mathbf{a}) \cdot \mathbf{v} = \|\nabla f(\mathbf{a})\| \cos\theta, \]
since $\|\mathbf{v}\| = 1$. But notice then that
\[ -\|\nabla f(\mathbf{a})\| \le D_{\mathbf{v}} f(\mathbf{a}) \le \|\nabla f(\mathbf{a})\|. \]
Thus the directional derivative of $f$ at $\mathbf{a}$ will achieve its maximum when $\theta = 0$, and its minimum when $\theta = \pi$. And, of course, provided $\nabla f(\mathbf{a}) \ne \mathbf{0}$, the directional derivative will be $0$ precisely when $\theta = \pm\frac{\pi}{2}$. All of this comes from the dot product of the gradient vector and the chosen unit-length direction vector $\mathbf{v}$. Geometrically, what does this mean? Here is a beautiful and important interpretation:

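Theorem 7.2. Let $X \subset \mathbb{R}^n$ be open and $f : X \to \mathbb{R}$ a $C^1$-function. For $\mathbf{x}_0 \in X$, let
\[ S_{\mathbf{x}_0} = \{ \mathbf{x} \in X \mid f(\mathbf{x}) = f(\mathbf{x}_0) = c \}. \]
Then $\nabla f(\mathbf{x}_0) \perp S_{\mathbf{x}_0}$.

Another way to say this is that any vector $\mathbf{v}$ tangent to $S_{\mathbf{x}_0}$ at $\mathbf{x}_0$ will be perpendicular to $\nabla f(\mathbf{x}_0)$ (see Figure 7.3). The proof of this is constructive and very informative.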

Figure 7.3. Geometrically, the gradient vector is always perpendicular to the level sets of a function.
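Theorem 7.2 can be illustrated numerically before reading the proof. The sketch below takes the ellipse x^2/4 + y^2 = 1 as a level set, walks along a parameterization of it, and checks that the gradient is perpendicular to the velocity of the curve. It then compares the directional derivative along the unit tangent with the one along the unit gradient direction. The function, the parameterization, and all helper names are assumptions chosen for illustration.

```python
import math

def f(x, y):
    # Sample function whose level set f = 1 is an ellipse.
    return x**2 / 4 + y**2

def grad_f(x, y):
    # Gradient computed by hand: (x/2, 2y).
    return (x / 2, 2 * y)

def curve(t):
    # A C^1 parameterization of the level set f = 1.
    return (2 * math.cos(t), math.sin(t))

def curve_velocity(t):
    # Derivative of the parameterization, a tangent vector to the level set.
    return (-2 * math.sin(t), math.cos(t))

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def unit(v):
    norm = math.hypot(*v)
    return tuple(vi / norm for vi in v)

# Check grad f(c(t)) . c'(t) at several points of the level set.
dots = []
for k in range(8):
    t = k * math.pi / 4
    point = curve(t)
    dots.append(dot(grad_f(*point), curve_velocity(t)))
    print(f"t = {t:.3f}  f(c(t)) = {f(*point):.6f}  grad . c'(t) = {dots[-1]:+.2e}")

# At one point, compare the tangent direction with the gradient direction.
t0 = math.pi / 4
x0 = curve(t0)
gradient = grad_f(*x0)
grad_norm = math.hypot(*gradient)
d_tangent = dot(gradient, unit(curve_velocity(t0)))   # along the level set
d_gradient = dot(gradient, unit(gradient))            # straight "uphill"
print("along the tangent: ", d_tangent)
print("along the gradient:", d_gradient, "  norm of gradient:", grad_norm)
```

Up to rounding, every dot product along the curve is zero, the directional derivative along the tangent vanishes, and the directional derivative along the gradient equals the norm of the gradient.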

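Proof. Let $I = (a, b)$, an open interval in $\mathbb{R}$, and $\mathbf{c} : I \to \mathbb{R}^n$ be a $C^1$-parameterized curve, with
\[ \mathbf{c}(t) = \begin{bmatrix} x_1(t) \\ \vdots \\ x_n(t) \end{bmatrix} \]
such that

(1) $\mathbf{c}(t_0) = \mathbf{x}_0$, for some $t_0 \in I$, and

(2) $\mathbf{c}(I) \subset S_{\mathbf{x}_0}$.

Then the composition $(f \circ \mathbf{c}) : I \to \mathbb{R}$ is $C^1$ on $I$ since both $f$ and the curve are, and
\[ f(\mathbf{c}(t)) = f(x_1(t), \ldots, x_n(t)) = c. \]
Differentiate this last equation implicitly with respect to $t$. We get
\[ \frac{d}{dt} \left[ f(x_1(t), \ldots, x_n(t)) \right] = \frac{d}{dt} [c] = 0, \]
\[ Df(\mathbf{c}(t))\, \mathbf{c}'(t) = 0. \]
Now, at $t = t_0$, $\mathbf{c}(t_0) = \mathbf{x}_0$, and $Df(\mathbf{x}_0)\, \mathbf{v} = 0$, where $\mathbf{v} = \mathbf{c}'(t_0)$, a vector tangent to the curve and hence tangent to $S_{\mathbf{x}_0}$. Since $Df(\mathbf{x}_0)\, \mathbf{v} = \nabla f(\mathbf{x}_0) \cdot \mathbf{v}$ and the curve through $\mathbf{x}_0$ was arbitrary, the result follows. $\square$

Definition 7.3. For any $(n - 1)$-dimensional hypersurface in $\mathbb{R}^n$ defined as the $c$-level set of a $C^1$ function $f : X \subset \mathbb{R}^n \to \mathbb{R}$,
\[ S = \{ \mathbf{x} \in X \mid f(\mathbf{x}) = c \}, \]
the tangent space to $S$ at $\mathbf{a} \in S$ is the space of all vectors perpendicular to $\nabla f(\mathbf{a})$; it is defined by
\[ h(\mathbf{x}) = \nabla f(\mathbf{a}) \cdot (\mathbf{x} - \mathbf{a}) = 0, \]
or
\[ h(\mathbf{x}) = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(\mathbf{a})\, (x_i - a_i) = 0. \]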

Note: Compare this to the tangent space of graph(f) ⊂ R3, where f : R2 → R and graph(f) is defined by the equation in R3 given by z = f(x, y).
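Definition 7.3 can be made concrete with a small example. The sketch below builds the function h for the sphere x^2 + y^2 + z^2 = 14 at the point (1, 2, 3) and tests a few points against the equation h(x) = 0. The choice of surface, base point, and test vectors is an assumption made for illustration.

```python
def grad_f(x, y, z):
    # Gradient of the sample function f(x, y, z) = x^2 + y^2 + z^2.
    return (2 * x, 2 * y, 2 * z)

a = (1.0, 2.0, 3.0)  # lies on the level set f = 14

def h(x):
    # h(x) = grad f(a) . (x - a); the tangent space at a is the set h(x) = 0.
    g = grad_f(*a)
    return sum(gi * (xi - ai) for gi, xi, ai in zip(g, x, a))

def shift(point, vector):
    return tuple(pi + vi for pi, vi in zip(point, vector))

# Two vectors perpendicular to grad f(a) = (2, 4, 6), found by inspection.
tangent_vectors = [(2.0, -1.0, 0.0), (3.0, 0.0, -1.0)]
tangent_values = [h(shift(a, v)) for v in tangent_vectors]

# Moving along the gradient itself leaves the tangent space.
normal_value = h(shift(a, grad_f(*a)))

print("h(a) =", h(a))
print("h at a + tangent vectors:", tangent_values)
print("h at a + grad f(a):", normal_value)
```

Points of the form a + v with v perpendicular to the gradient satisfy h = 0, while stepping along the gradient gives a nonzero value, here the squared length of the gradient.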