The Chain Rule, Part 1 Math 131 Multivariate Calculus

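D Joyce, Spring 2014

The chain rule. First the one you know. We'll start with the chain rule that you already know from ordinary functions of one variable. It tells you how to find the derivative of the composition $\mathbf{R} \xrightarrow{f\circ g} \mathbf{R}$ of two functions, $\mathbf{R} \xrightarrow{f} \mathbf{R}$ and $\mathbf{R} \xrightarrow{g} \mathbf{R}$. This composition is defined by $(f\circ g)(x) = f(g(x))$ and illustrated by the commutative diagram

\[
\begin{array}{ccc}
\mathbf{R} & \xrightarrow{\;g\;} & \mathbf{R} \\
 & {\scriptstyle f\circ g}\searrow & \downarrow{\scriptstyle f} \\
 & & \mathbf{R}
\end{array}
\]

When we generalize this, we'll use a slightly different notation. We'll let the independent variable be $t$ instead of $x$, but we'll use $x$ instead of $g$, so $x$ is a function of $t$. Thus, $f\circ x$ is defined by $(f\circ x)(t) = f(x(t))$.

\[
\begin{array}{ccc}
\mathbf{R} & \xrightarrow{\;x\;} & \mathbf{R} \\
 & {\scriptstyle f\circ x}\searrow & \downarrow{\scriptstyle f} \\
 & & \mathbf{R}
\end{array}
\]

Using these symbols for the functions and using the prime notation for derivatives, the chain rule says

\[ (f\circ x)'(t) = f'(x(t))\, x'(t). \]

We'll have partial derivatives, so the chain rule in Leibniz' notation is a better place to start. In that notation, it says

\[ \frac{df}{dt} = \frac{df}{dx}\,\frac{dx}{dt}. \]

How we're going to generalize the chain rule to several variables. We'll do it in stages, starting today and finishing next time. First, we're going to turn $x$ into a vector $\mathbf{x}$. Then, $f$ is a scalar-valued function $\mathbf{R}^m \xrightarrow{f} \mathbf{R}$, and $\mathbf{x}$ is actually a function $\mathbf{R} \xrightarrow{\mathbf{x}} \mathbf{R}^m$. Then, $f\circ\mathbf{x} : \mathbf{R} \to \mathbf{R}$ is defined by $(f\circ\mathbf{x})(t) = f(\mathbf{x}(t))$.

\[
\begin{array}{ccc}
\mathbf{R} & \xrightarrow{\;\mathbf{x}\;} & \mathbf{R}^m \\
 & {\scriptstyle f\circ\mathbf{x}}\searrow & \downarrow{\scriptstyle f} \\
 & & \mathbf{R}
\end{array}
\]

The question is, what's the derivative $\dfrac{df}{dt}$ in this generalization?

After we've answered that question, we'll make $t$ into a vector $\mathbf{t}$ in $\mathbf{R}^n$. Then $\mathbf{x}$ won't be just a function $\mathbf{R} \xrightarrow{\mathbf{x}} \mathbf{R}^m$, but a function $\mathbf{R}^n \xrightarrow{\mathbf{x}} \mathbf{R}^m$, and $f\circ\mathbf{x} : \mathbf{R}^n \to \mathbf{R}$ is defined as $(f\circ\mathbf{x})(\mathbf{t}) = f(\mathbf{x}(\mathbf{t}))$.

\[
\begin{array}{ccc}
\mathbf{R}^n & \xrightarrow{\;\mathbf{x}\;} & \mathbf{R}^m \\
 & {\scriptstyle f\circ\mathbf{x}}\searrow & \downarrow{\scriptstyle f} \\
 & & \mathbf{R}
\end{array}
\]

Finally, we'll make $f$ into a vector-valued function $\mathbf{R}^m \xrightarrow{\mathbf{f}} \mathbf{R}^p$ to get full generality. Then $\mathbf{f}\circ\mathbf{x} : \mathbf{R}^n \to \mathbf{R}^p$ is defined by $(\mathbf{f}\circ\mathbf{x})(\mathbf{t}) = \mathbf{f}(\mathbf{x}(\mathbf{t}))$.

\[
\begin{array}{ccc}
\mathbf{R}^n & \xrightarrow{\;\mathbf{x}\;} & \mathbf{R}^m \\
 & {\scriptstyle \mathbf{f}\circ\mathbf{x}}\searrow & \downarrow{\scriptstyle \mathbf{f}} \\
 & & \mathbf{R}^p
\end{array}
\]

The first step in the case m = 2. To get a better start on this, we'll let $m = 2$. Then we've got a function $\mathbf{x} : \mathbf{R} \to \mathbf{R}^2$. It's made out of two coordinate functions

\[ \mathbf{x}(t) = (x, y)(t) = (x(t), y(t)), \]

where $x$ and $y$ are real-valued functions of $t$. Then $f : \mathbf{R}^2 \to \mathbf{R}$ is evaluated at $(x(t), y(t))$ to get the composition $f(x(t), y(t))$. And we want to compute the derivative $\dfrac{df}{dt}$ in terms of the derivatives of $f$ (which are the two partial derivatives $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$) and the derivatives of $x$ and $y$, namely, $\dfrac{dx}{dt}$ and $\dfrac{dy}{dt}$.

Example 1. Let $f(x, y) = x^3 + xy + y^4$, and let $x(t) = \cos t$ and $y(t) = \sin t$. Then the composition is

\[ f(x(t), y(t)) = \cos^3 t + \cos t \sin t + \sin^4 t. \]

Now, to compute $\dfrac{df}{dt}$, just differentiate the equation $f(x, y) = x^3 + xy + y^4$, but keep in mind that $x$ and $y$ are both functions of $t$. You get

\[ \frac{df}{dt} = 3x^2\frac{dx}{dt} + y\frac{dx}{dt} + x\frac{dy}{dt} + 4y^3\frac{dy}{dt}. \]

Now, since

\[ \frac{\partial f}{\partial x} = 3x^2 + y \quad\text{while}\quad \frac{\partial f}{\partial y} = x + 4y^3, \]

we can see that $\dfrac{df}{dt}$ can be rewritten as

\[ \frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}. \]

That last equation is the chain rule in this generalization. The formal proof depends on the ordinary definition of derivative and the usual properties of limits, but as this is a form of the chain rule, the proof has a lot of details.

The first step for general m. Of course, this generalizes to

\[ \frac{df}{dt} = \frac{\partial f}{\partial x_1}\frac{dx_1}{dt} + \frac{\partial f}{\partial x_2}\frac{dx_2}{dt} + \cdots + \frac{\partial f}{\partial x_m}\frac{dx_m}{dt} \]

for general $m$, not just for $m = 2$. But let's see how that can be written more concisely using vector notation, dot products, and matrix multiplication. Note that the gradient of $f$ is

\[ \nabla f = \left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_m} \right), \]

while the derivative of $\mathbf{x}$ is

\[ \left( \frac{dx_1}{dt}, \frac{dx_2}{dt}, \ldots, \frac{dx_m}{dt} \right), \]

which we can denote either $D\mathbf{x}$ or $\mathbf{x}'$. Then the chain rule is the dot product of these two vectors

\[ \frac{df}{dt} = (\nabla f) \cdot \mathbf{x}'. \]

When treated as matrices, we can write

\[ Df = \begin{bmatrix} \dfrac{\partial f}{\partial x_1} & \dfrac{\partial f}{\partial x_2} & \cdots & \dfrac{\partial f}{\partial x_m} \end{bmatrix} \]

and

\[ D\mathbf{x} = \begin{bmatrix} \dfrac{dx_1}{dt} \\[1ex] \dfrac{dx_2}{dt} \\[1ex] \vdots \\[1ex] \dfrac{dx_m}{dt} \end{bmatrix}, \]

so we can also write the chain rule as a matrix multiplication of these last two matrices

\[ \frac{df}{dt} = Df\, D\mathbf{x}. \]

Continued in part 2.

Math 131 Home Page at http://math.clarku.edu/~djoyce/ma131/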
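A numerical check of Example 1 may help make the dot-product form concrete. The sketch below is only an illustration: the function names (grad_f, path, path_derivative, and so on) and the step size h are our own choices, not part of the notes. It evaluates the chain rule as the dot product of the gradient of f with the derivative of the path, and compares that with a central-difference estimate of the derivative of the composition f(x(t), y(t)).

```python
import math


def f(x, y):
    """The function from Example 1: f(x, y) = x^3 + x*y + y^4."""
    return x**3 + x * y + y**4


def grad_f(x, y):
    """Partial derivatives (df/dx, df/dy) of f, as computed in Example 1."""
    return (3 * x**2 + y, x + 4 * y**3)


def path(t):
    """The path from Example 1: x(t) = cos t, y(t) = sin t."""
    return (math.cos(t), math.sin(t))


def path_derivative(t):
    """Componentwise derivative (dx/dt, dy/dt) of the path."""
    return (-math.sin(t), math.cos(t))


def chain_rule_derivative(t):
    """df/dt as the dot product of grad f (at the path point) with x'(t)."""
    x, y = path(t)
    return sum(g * v for g, v in zip(grad_f(x, y), path_derivative(t)))


def finite_difference_derivative(t, h=1e-6):
    """Central-difference estimate of d/dt f(x(t), y(t)); h is an assumed step."""
    return (f(*path(t + h)) - f(*path(t - h))) / (2 * h)


if __name__ == "__main__":
    # The two columns should agree up to finite-difference error.
    for t in (0.0, 0.7, 2.0):
        print(t, chain_rule_derivative(t), finite_difference_derivative(t))
```

Because chain_rule_derivative sums over however many components grad_f and path_derivative return, the same few lines would serve for general m, provided those two helper functions are replaced by m-component versions.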
