<<

THE CHAIN RULE

LEO GOLDMAKHER

d d 2 After building up intuition with examples like dx f(5x) and dx f(x ), we’re ready to explore one of the power tools of differential . Theorem 1 (Chain Rule). Given a ∈ R and functions f and g such that g is differentiable at a and f is differentiable at g(a). Then   (f ◦ g)0(a) = f 0 g(a) g0(a).

We start with a proof which is not entirely correct, but contains in it the heart of the argument.

“Proof.” By definition,     f g(a + h) − f g(a) (f ◦ g)0(a) = lim . h→0 h We rewrite this in the following way:     f g(a) + ∆h − f g(a) (f ◦ g)0(a) = lim (1) h→0 h where ∆h = g(a + h) − g(a). Note that ∆h → 0 as h → 0 (why?). We’re now in a pretty good situation: (1) looks extremely similar to the definition of f 0g(a), since it is of the form fg(a) + tiny − fg(a) . tiny The difficulty is that the two tiny quantities are tending to 0 at different rates. Fortunately, we can get around this by using the same trick we’ve been using in our earlier examples: we multiply by a clever form of 1. More precisely, we rewrite (1) in the form     f g(a) + ∆h − f g(a) ∆ (f ◦ g)0(a) = lim · h h→0 ∆h h      f g(a) + ∆h − f g(a)   ∆h = lim  lim h→0 ∆h h→0 h      f g(a) + ∆h − f g(a)  g(a + h) − g(a) =  lim  lim ∆h →0 ∆h h→0 h

0  0 = f g(a) g (a).  At first glance, the above argument seems sound. Upon closer inspection, however, there are two potential issues:

∆h (I) We multiplied by . This is only possible if ∆h 6= 0 for all h 6= 0 in some ∆h neighbourhood of 0.

(II) We replaced lim by lim . Is this justified? h→0 ∆h →0

We now write down a proof of the chain rule which resolves both of these issues. As you will see, it is very similar to the false argument given above. (Note that this is the proof given in Spivak.)

0 Proof. As before, we begin by rewriting the definition of (f ◦ g) (a) in terms of ∆h:     f g(a) + ∆h − f g(a) (f ◦ g)0(a) = lim . (2) h→0 h Next, we write     f g(a) + ∆h − f g(a) = Φ(h) · ∆h (3) where    f g(a) + ∆h − f g(a)  if ∆h 6= 0 Φ(h) = ∆h (4) 0  f g(a) if ∆h = 0.

(Observe that if ∆h 6= 0, equation (3) is exactly what we had in the false proof. Further note that if ∆h = 0, equation (3) holds no matter what the value of Φ(h) is!) Plugging (3) into (2), we find ∆ (f ◦ g)0(a) = lim Φ(h) · h h→0 h    ∆  = lim Φ(h) · lim h h→0 h→0 h   = lim Φ(h) · g0(a). h→0 To conclude the proof of the Chain Rule, it therefore remains only to show that   lim Φ(h) = f 0 g(a) . h→0 Intuitively, this is obvious (once you stare long enough at the definition of Φ). However, the rigorous proof is slightly technical, so we isolate it as a separate lemma (see below).  Lemma. Let Φ be the defined in (4). Then   lim Φ(h) = f 0 g(a) . h→0 Proof. Given  > 0. From the definition of f 0g(a), we see that ∃δ0 > 0 such that

fg(a) + k − fg(a) 0 0  0 < |k| < δ =⇒ − f g(a) < . k It follows that 0 0  |∆h| < δ =⇒ Φ(h) − f g(a) < . (5) Recall that ∆h = g(a + h) − g(a). Since g is continuous at a, we have lim ∆h = 0. Therefore, h→0 ∃δ > 0 such that 0 0 < |h| < δ =⇒ |∆h| < δ . (6) Combining (5) and (6) yields

0  0 < |h| < δ =⇒ Φ(h) − f g(a) <  whence lim Φ(h) = f 0g(a) as claimed. h→0 

HTTP://WEB.WILLIAMS.EDU/MATHEMATICS/LG5/

E-mail address: [email protected]