
The Mean Value Theorem

The mean value theorem is an extremely useful result, although unfortunately its power does not shine through in an introductory course. This is because the main application of the mean value theorem is proving further results, but our focus is not on proving the theorems of calculus. Motivated students may consider studying real analysis, in which the focus of study is proving and understanding the propositions of calculus. Without further ado, we will consider the mean value theorem.

Theorem: The Mean Value Theorem. If f is continuous on the closed interval [a, b], and differentiable on the interval’s interior (a, b), then there exists at least one c ∈ (a, b) at which

$$\frac{f(b) - f(a)}{b - a} = f'(c).$$

Verbally, the mean value theorem states: for a continuous and differentiable f on an interval, we can find a point inside the interval at which the instantaneous rate of change of f is equal to the average rate of change of f over that interval. In other words, if we draw the secant line between the points (a, f(a)) and (b, f(b)), we can find a point c ∈ (a, b) so that the slope of the tangent line at c, f′(c), is the same as the slope of the secant line through the endpoints of the interval. If we think of the average rate of change over an interval as the mean rate of change, then the name of this theorem makes sense.
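To make the statement concrete, the following is a minimal sketch that locates such a point c numerically. The example function f(x) = x³ on [0, 2] and the helper name `find_mvt_point` are illustrative assumptions, not part of the text above. The sketch uses bisection on f′(x) minus the secant slope, which only works when that difference changes sign between a and b.

```python
import math


def f(x):
    # Example function (an assumption for illustration): f(x) = x^3
    return x ** 3


def f_prime(x):
    # Its derivative: f'(x) = 3x^2
    return 3 * x ** 2


def find_mvt_point(f, f_prime, a, b, tol=1e-12):
    """Find c in (a, b) with f'(c) equal to the secant slope, by bisection.

    Assumes g(x) = f'(x) - slope changes sign between a and b.
    """
    slope = (f(b) - f(a)) / (b - a)

    def g(x):
        return f_prime(x) - slope

    lo, hi = a, b
    if g(lo) * g(hi) > 0:
        raise ValueError("f'(x) - slope does not change sign on [a, b]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


# Secant slope on [0, 2] is (8 - 0) / 2 = 4, so 3c^2 = 4 and c = 2 / sqrt(3).
c = find_mvt_point(f, f_prime, 0.0, 2.0)
print(c, 2 / math.sqrt(3))
```

The theorem itself only guarantees that c exists. Bisection is one convenient way to approximate it when the derivative is known in closed form.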

It is important to emphasize that the mean value theorem does not apply for functions which are not continuous on the closed interval [a, b], or not differentiable on the open interval (a, b). Consider the function

$$f(x) = \begin{cases} 0 & x = 0 \\ 0.5 & 0 < x < 1 \\ 1 & x = 1 \end{cases}$$

If we look at the average rate of change on [0, 1] we find

$$\frac{f(1) - f(0)}{1 - 0} = 1,$$

and we find the secant line through the points (0, f(0)) and (1, f(1)) is just the line of slope one that passes through the origin. What about the derivative of this function? It turns out f′(x) does not exist at x = 0 and x = 1, because the function is not continuous at those points. However, for 0 < x < 1 the function is continuous, and just a constant, so the derivative is f′(x) = 0. There are no points in this interval at which the slope of the tangent line is 1, because for all points in (0, 1) the tangent line is horizontal, having slope 0. Here the mean value theorem did not apply because we had a function that was not continuous on [0, 1], even though it was differentiable on (0, 1). It is noteworthy that we could still apply the mean value theorem on any closed interval contained in (0, 1), and it would tell us that somewhere in the interval the slope of the tangent line is 0 (which is not particularly useful, as we already know that the slope of the tangent line is 0 everywhere in the interval).

Now let us consider a function that is continuous on a closed interval, but not differentiable within the open interval. Let us consider f(x) = |x| on the interval [−1, 1]. If we look at the average rate of change over the interval we find

$$\frac{f(1) - f(-1)}{1 - (-1)} = \frac{0}{2} = 0.$$

Yet there are no points c ∈ (−1, 1) such that f′(c) = 0. For x < 0 we have f′(x) = −1 and for x > 0 we have f′(x) = 1 (of course f′(0) does not exist). If the function |x| were differentiable at x = 0, then in a sense, we would have to have f′(0) = 0, as this would be the point where the derivative would change from negative to positive. However, in this case we don’t have any points with a zero derivative, and because the derivative is discontinuous at x = 0, it is possible for it to jump from negative to positive without ever being zero.
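The failure at x = 0 can also be seen numerically. The sketch below compares the one-sided difference quotients of |x| at 0 with the secant slope over [−1, 1]. The helper name `one_sided_quotients` and the step size h are illustrative choices.

```python
def one_sided_quotients(f, x0, h=1e-6):
    """Return the left and right difference quotients of f at x0."""
    left = (f(x0) - f(x0 - h)) / h
    right = (f(x0 + h) - f(x0)) / h
    return left, right


# Average rate of change of |x| over [-1, 1]
secant_slope = (abs(1.0) - abs(-1.0)) / (1.0 - (-1.0))

# One-sided slopes at the corner x = 0
left, right = one_sided_quotients(abs, 0.0)

print("secant slope:", secant_slope)
print("left quotient:", left, "right quotient:", right)
```

The two quotients disagree, so no single tangent slope exists at 0, and neither one-sided slope matches the secant slope of 0.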

We gain further information from the mean value theorem in the following two corollaries (a corollary is a minor result that follows almost immediately from a major theorem):

1. If f′(x) = 0 for all x in an open interval (a, b), then f(x) = C for all x ∈ (a, b), where C is a constant.
2. If f′(x) = g′(x) for all x in an open interval (a, b), then there exists some constant C so that f(x) = g(x) + C for all x ∈ (a, b).

Both of these results are things that we already knew intuitively, but they follow rigorously from the mean value theorem. To move from the first of these results to the second, we consider the function (f − g)(x). For this function, (f − g)′(x) = f′(x) − g′(x) = 0 on the interval (a, b), so (f − g)(x) = C for some constant C, and thus f(x) = g(x) + C.
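As a small illustration of the second corollary, consider the pair f(x) = sin²(x) and g(x) = −cos²(x). This pair is chosen here as an example and does not come from the text. Both have derivative 2 sin(x) cos(x), so the corollary predicts that f − g is constant. The sketch below samples the difference on a grid over an arbitrarily chosen interval.

```python
import math


def f(x):
    # f(x) = sin^2(x), with f'(x) = 2 sin(x) cos(x)
    return math.sin(x) ** 2


def g(x):
    # g(x) = -cos^2(x), with g'(x) = 2 cos(x) sin(x)
    return -(math.cos(x) ** 2)


# Sample f - g at evenly spaced points in the interval (0, 3).
a, b, samples = 0.0, 3.0, 300
differences = [
    f(a + (b - a) * i / samples) - g(a + (b - a) * i / samples)
    for i in range(1, samples)
]

print(min(differences), max(differences))
```

Up to rounding, every sampled difference equals 1, which is the constant C for this pair (the identity sin² + cos² = 1).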

If we apply the mean value theorem on an interval for which f(a) = f(b), it follows that there exists a point c ∈ (a, b) so that

$$f'(c) = \frac{f(b) - f(a)}{b - a} = 0.$$

This means that if we have any two points a and b at which our function of interest crosses the x-axis, we can find a point in the interval (a, b) which is a critical point (at which we could look for a local extremum). This result is called Rolle’s theorem, and is used in the proof of the mean value theorem, but really ought to be called Rolle’s lemma, because a lemma is a minor result that is used in the proof of a major result, but is not thought of as an extremely important result of mathematics in its own right. If we think about the function

$$f(x) = \frac{x^3}{3} - 3x,$$

which is 0 at −3, 0, and 3, then we know it has a critical point somewhere in each of the intervals (−3, 0) and (0, 3). Thus, we know it has at least two critical points, but it could have more.

We can also use the mean value theorem in the classification of critical points. Going back to the above function f(x) = x³/3 − 3x, defined on the interval [−3, 3], we find that

$$f'(x) = x^2 - 3,$$

which is 0 when x = ±√3, meaning that ±√3 are critical points. The next step is for us to evaluate the function at each of the critical points and endpoints, yielding f(−3) = 0, f(−√3) = 2√3, f(√3) = −2√3, and f(3) = 0. Normally we think of these points as partitioning our larger interval into subintervals, over which we want to consider the sign of the derivative. Since the sign of the derivative cannot change over these intervals, knowing the sign of the derivative at some point in the interval tells us the sign of the derivative over the entire subinterval. If we apply the mean value theorem to the endpoints of one of these subintervals, say [√3, 3], we find that there exists some c ∈ (√3, 3) so that

$$f'(c) = \frac{f(3) - f(\sqrt{3})}{3 - \sqrt{3}} = \frac{0 - (-2\sqrt{3})}{3 - \sqrt{3}} > 0,$$

which means that the derivative must be positive, so the function is increasing, over the interior of the entire subinterval. Note that the denominator will always be positive when we apply the mean value theorem, and since we only care about the sign of the derivative, we simply need to compare the function values at the endpoints to see if the derivative will be positive or negative. Doing so we can see that f(√3) = −2√3 < 2√3 = f(−√3), which implies the derivative is negative on the subinterval (−√3, √3), and finally f(−√3) > f(−3) implies that the derivative is positive over the subinterval (−3, −√3). It follows that f(−√3) is a local maximum, and f(√3) is a local minimum.
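The comparison of endpoint values described above can be sketched in a few lines. The helper name `classify_subintervals` is hypothetical. The sketch relies on the same assumption as the text: the derivative does not change sign inside any subinterval, so the sign of the secant slope decides the sign of the derivative throughout.

```python
import math


def f(x):
    # The function from the text: f(x) = x^3 / 3 - 3x
    return x ** 3 / 3 - 3 * x


# Endpoints and critical points of f on [-3, 3], in increasing order
points = [-3.0, -math.sqrt(3), math.sqrt(3), 3.0]


def classify_subintervals(f, points):
    """Label each subinterval by the sign of its secant slope.

    Assumes f' keeps a single sign on each subinterval's interior.
    """
    result = []
    for left, right in zip(points, points[1:]):
        slope = (f(right) - f(left)) / (right - left)
        label = "increasing" if slope > 0 else "decreasing"
        result.append(((left, right), label))
    return result


labels = [label for _, label in classify_subintervals(f, points)]
print(labels)
```

The pattern increasing, decreasing, increasing matches the conclusion that f(−√3) is a local maximum and f(√3) is a local minimum.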

Finally, there is another application of the mean value theorem that proves to be extremely useful in many cases. The mean value theorem allows us to transform a bound on the derivative into a bound on the function itself over an interval. Let’s suppose that we have a function f which is continuous on the interval [a, b], differentiable on (a, b), and that |f′(x)| < B for all x ∈ (a, b) (this means that the derivative of f is bounded). This bound on the derivative tells us intuitively that the function can only change so fast: its rate of change can never exceed B in magnitude. If the function can only change so fast, then it should not be able to become too much larger or smaller than f(a) (or f(b) for that matter) if we don’t move too far from x = a. Let us consider a point x ∈ (a, b). Applying the mean value theorem on the interval [a, x] we find

$$\frac{f(x) - f(a)}{x - a} = f'(c)$$

for some c ∈ (a, x). Rearranging terms it follows

$$f(x) = f'(c)(x - a) + f(a).$$

Although this looks like the equation for a line, remember that x is a fixed value, and for any value of x, we might actually find a different value for c. Normally we would find f′(c) using information about f(x) and f(a), but here we want to gain some information about f(x). This means that we don’t know exactly what f′(c) is, but we do know that −B < f′(c) < B. Since x − a > 0, multiplying by x − a and adding f(a) preserves these inequalities. It then follows that

−B(x − a) + f(a) ≤ f(x) ≤ B(x − a) + f(a)

which is an inequality that holds for all x ∈ (a, b). In fact, if we consider the cases of equality, we get the equations of two lines. When x = a, the value of both bounds is f(a). From there, the lines increase (or decrease) with slope B (or −B), and together they describe bounds on the function f(x). At any given point x, the function f(x) must lie between the two lines, which implies the function is bounded. The only way the function could be equal to one of the two lines is if the derivative were constantly B or −B up to that point, meaning that the magnitude of the derivative was equal to the bound, which is the largest possible magnitude of the derivative (and may not even be achievable, if the bound is larger than any value the derivative ever attains). Because the rate of change of the function is bounded in this region, the function can only change by so much, which forces it to stay within these bounds. This is an important concept for visualizing functions, and is useful in many proofs.
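The two bounding lines can be checked numerically for a concrete case. The sketch below uses f(x) = sin(x) on [0, 2] with B = 1, an example chosen here because |cos(x)| < 1 on the open interval (0, 2). The helper name `within_linear_bounds` is hypothetical. A grid check of this kind is only a sanity test, not a proof.

```python
import math


def within_linear_bounds(f, a, b, B, samples=1000):
    """Check -B(x - a) + f(a) <= f(x) <= B(x - a) + f(a) at interior grid points."""
    for i in range(1, samples):
        x = a + (b - a) * i / samples
        lower = -B * (x - a) + f(a)
        upper = B * (x - a) + f(a)
        if not (lower <= f(x) <= upper):
            return False
    return True


# |cos(x)| < 1 on (0, 2), so B = 1 is a valid bound on the derivative of sin.
ok = within_linear_bounds(math.sin, 0.0, 2.0, 1.0)

# B = 0.5 is not a valid derivative bound, and the lines fail to contain sin.
too_tight = within_linear_bounds(math.sin, 0.0, 2.0, 0.5)

print(ok, too_tight)
```

With the valid bound, the graph stays between the lines y = x and y = −x. With the smaller value of B, the check fails because sin(x) rises faster than the line of slope 0.5 near x = 0.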