Lecture 1: Introduction to integration theory and

What is this course about? Integration theory. The first question you might have is why there is anything you need to learn about integration.

You took a class called Math 1a and learned something like this: Given a bounded real-valued function f on an interval [a, b], we want to define

$$\int_a^b f(x)\,dx.$$

To do so, we partition the interval [a, b]. A partition P of [a, b] is a finite set $P = \{x_0, x_1, \ldots, x_n\}$ with $a = x_0 < x_1 < \cdots < x_n = b$.

For every partition P, we define upper and lower Riemann sums:

$$U_P(f) = \sum_{j=1}^{n} (x_j - x_{j-1}) \sup_{x \in [x_{j-1}, x_j]} f(x),$$

and

$$L_P(f) = \sum_{j=1}^{n} (x_j - x_{j-1}) \inf_{x \in [x_{j-1}, x_j]} f(x).$$

We say that a partition Q refines a partition P if P ⊂ Q. If Q refines P, we have the inequalities $U_Q(f) \le U_P(f)$ and $L_P(f) \le L_Q(f)$. (Why?)

Then we define upper and lower integrals by

$$U_a^b(f) = \inf_P U_P(f),$$

and

$$L_a^b(f) = \sup_P L_P(f).$$

If the two numbers $U_a^b(f)$ and $L_a^b(f)$ match, then we say that f is integrable and denote this number by

$$\int_a^b f(x)\,dx.$$

Otherwise, we say that f is not integrable.
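The upper and lower sums can be computed exactly in a simple case. The following is a minimal sketch, assuming an increasing function (here the illustrative choice f(x) = x² on [0, 1]) so that the sup and inf on each part sit at its endpoints; the helper names `upper_lower_sums` and `uniform_partition` are hypothetical and not part of the lecture.

```python
from fractions import Fraction

def upper_lower_sums(f, partition):
    """Return (U_P, L_P) for a function f that is increasing on [a, b].

    For increasing f, the sup on [x_{j-1}, x_j] is f(x_j) and the inf is
    f(x_{j-1}), so both sums are computed exactly (assumption: f increasing).
    """
    parts = list(zip(partition, partition[1:]))
    upper = sum((x1 - x0) * f(x1) for x0, x1 in parts)
    lower = sum((x1 - x0) * f(x0) for x0, x1 in parts)
    return upper, lower

def uniform_partition(a, b, n):
    """Partition of [a, b] into n equal parts, with exact rational points."""
    return [a + (b - a) * Fraction(j, n) for j in range(n + 1)]

def square(x):
    return x * x

P = uniform_partition(Fraction(0), Fraction(1), 4)
Q = uniform_partition(Fraction(0), Fraction(1), 8)  # Q refines P: every point of P is in Q

U_P, L_P = upper_lower_sums(square, P)
U_Q, L_Q = upper_lower_sums(square, Q)

print(L_P, L_Q, U_Q, U_P)
```

Refining from P to Q raises the lower sum and lowers the upper sum, and the value 1/3 of the integral stays trapped between them.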

The construction above is called the Riemann integral. Are there any drawbacks to this set-up? One thing we could worry about is that we haven’t entirely worked out which functions are Riemann integrable and which are not. [We could do that this quarter.] But we know something about it.

For instance:

Proposition 1 A function f which is continuous on [a, b] is Riemann integrable.

Proof: Since [a, b] is compact, we know that f is uniformly continuous on [a, b]. Hence for every ε > 0, there is δ > 0 so that whenever x, y ∈ [a, b] and |x − y| < δ, it must be that |f(x) − f(y)| < ε.

Then if P is a partition so that |xj − xj−1| < δ for every j, we must have

$$U_P(f) - L_P(f) < \varepsilon (b - a).$$

(Why?) Thus for every ε > 0, we have

$$U_a^b(f) - L_a^b(f) < \varepsilon (b - a).$$

We conclude the difference must be zero, hence f is Riemann integrable.
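The ε and δ bookkeeping in this proof can be traced on a concrete function. This is a sketch under stated assumptions: f(x) = x² on [0, 1], for which |x² − y²| = |x + y||x − y| ≤ 2|x − y|, so δ = ε/2 is one valid choice. The helpers `mesh_partition` and `gap` are hypothetical names introduced only for this illustration.

```python
from fractions import Fraction
import math

def mesh_partition(a, b, delta):
    """Uniform partition of [a, b] whose parts are all shorter than delta."""
    n = math.floor((b - a) / delta) + 1  # n > (b - a) / delta, so (b - a) / n < delta
    return [a + (b - a) * Fraction(j, n) for j in range(n + 1)]

def gap(f, partition):
    """U_P(f) - L_P(f) for an increasing f (sup and inf sit at the endpoints)."""
    return sum((x1 - x0) * (f(x1) - f(x0))
               for x0, x1 in zip(partition, partition[1:]))

a, b = Fraction(0), Fraction(1)
eps = Fraction(1, 100)
delta = eps / 2  # assumption specific to f(x) = x^2 on [0, 1]

P = mesh_partition(a, b, delta)
difference = gap(lambda x: x * x, P)

print(len(P) - 1, difference, eps * (b - a))
```

Shrinking `eps` forces a finer partition and a smaller gap, which is the mechanism the proof relies on.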

But continuity is not really a necessary condition. For instance, a bounded function f with just finitely many discontinuities is still Riemann integrable. (Why?)

A slightly worse f is given by

$$f(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q}; \\ 0 & \text{if } x \notin \mathbb{Q}. \end{cases}$$

This is a disaster. Since the rationals, Q, are dense in the reals (and so are the irrationals), every point is a point of discontinuity of f, and in any subinterval of [a, b], we find a point where f is 1 and a point where f is 0. We get $U_a^b(f) = b - a$, while $L_a^b(f) = 0$.

Thus the function f is not integrable.

What’s really wrong with that f? The rationals Q are really a fairly small set. They’re countable whereas the reals are not. Everywhere but on the rationals, the function f is zero. Why should the rationals have so much weight in determining the integral? (Answer: Because it’s a Riemann integral: location, location, location.)

A more practical-minded person might view such a question as nonsense. A function like f can’t really appear in nature. How can you even tell whether a number is rational or irrational? These practical people might think that we should be interested only in much “nicer” functions and what identities we can discover about their integrals.

For instance, if we take f(x) instead to be a continuous function on [a, b] and let F(x) be an antiderivative of it, then we have

$$\int_a^b f(x)\,dx = F(b) - F(a),$$

the fundamental theorem of calculus. (Secretly, this is some identity about telescoping sums.) We can get still fancier identities out of this. To wit:

Proposition 2 (Integration by parts) Let f and g be once continuously differentiable functions on [a, b]. Then

$$\int_a^b f(x) g'(x)\,dx = -\int_a^b f'(x) g(x)\,dx + f(b)g(b) - f(a)g(a).$$

Proof: We use the fundamental theorem of calculus and the product rule for derivatives to get

$$\int_a^b [f(x) g'(x) + f'(x) g(x)]\,dx = f(b)g(b) - f(a)g(a).$$

Integration by parts is definitely cool. It is a somewhat deeper statement about telescoping sums than the fundamental theorem and lets us derive lots of identities and estimates.
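The remark about telescoping sums can be made concrete with the discrete analogues of both identities. The sketch below is an illustration rather than the argument of the lecture: it checks that a sum of consecutive differences collapses to the endpoint values, and that the discrete product rule sums to a summation by parts formula. The function names and the sample values (f = x², g = x³ + 1 at the points j/10) are assumptions chosen for the example.

```python
from fractions import Fraction

def telescoped_sum(values):
    """Sum of consecutive differences values[j] - values[j-1], j = 1..n."""
    return sum(v1 - v0 for v0, v1 in zip(values, values[1:]))

def summation_by_parts_sides(f_vals, g_vals):
    """Both sides of the discrete integration by parts identity.

    Left:  sum_j f_j (g_j - g_{j-1})
    Right: -sum_j (f_j - f_{j-1}) g_{j-1} + f_n g_n - f_0 g_0
    """
    pairs_f = list(zip(f_vals, f_vals[1:]))
    pairs_g = list(zip(g_vals, g_vals[1:]))
    left = sum(f1 * (g1 - g0) for (_, f1), (g0, g1) in zip(pairs_f, pairs_g))
    right = (-sum((f1 - f0) * g0 for (f0, f1), (g0, _) in zip(pairs_f, pairs_g))
             + f_vals[-1] * g_vals[-1] - f_vals[0] * g_vals[0])
    return left, right

xs = [Fraction(j, 10) for j in range(11)]
f_vals = [x * x for x in xs]
g_vals = [x ** 3 + 1 for x in xs]

total = telescoped_sum(f_vals)                       # collapses to f_n - f_0
left, right = summation_by_parts_sides(f_vals, g_vals)

print(total, left, right)
```

Exact rational arithmetic is used so that the two sides agree exactly. The identity holds for any two finite sequences, which is why no smoothness is needed at the discrete level.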

Is continuity of the derivatives really necessary for integration by parts to work? This is in fact what we’ll be focusing on for the first two weeks of the course.

But why do we care? Is it of any practical importance to know which functions integration by parts works for? What this course is largely about is the question of what estimates (inequalities) depend on. When we prove the integration by parts identity, we are actually proving inequalities because both sides of the equation are limits. When we have a theorem, we know the limits are equal, but we might care how fast they’re converging to one another. What does that depend on? Continuity is a complicated thing to quantify. Even accepting that the continuity of derivatives has to be uniform on [a, b], the input into any one of our proofs is really how δ depends on ε. We might like everything to be based on a simple quantity instead.

Part of what we’re going to see over the course of this term (and in studying analysis in general) is that there’s no perfect universal definition of the integral that is best for every situation. Rather, we should introduce different integrals for different purposes, and usually there’s a nice class of functions each one works for. We should try to see how everything depends on the definition of the class. But the simpler and more general the class of functions is, the better. We begin the course by introducing a class of functions for which we’ll see that integration by parts works very well. These are the functions of bounded variation. We proceed to define them:

Given $f : [a, b] \to \mathbb{R}$ and $P = \{x_0, x_1, \ldots, x_n\}$ a partition of [a, b] as before, we define the variation of f with respect to the partition P by

$$V(f, P) = \sum_{j=1}^{n} |f(x_j) - f(x_{j-1})|.$$

It is easy to see that if Q refines P then

$$V(f, Q) \ge V(f, P).$$

We define the total variation of f on [a, b] by

$$V_a^b(f) = \sup_P V(f, P).$$

If $V_a^b(f) < \infty$, we say that f is of bounded variation on [a, b], or sometimes write

$$f \in BV([a, b]).$$

To work in the class of functions of bounded variation really means to use these definitions, that is to make all estimates depend only on total variations.
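The definitions above translate directly into a few lines of code. This is a minimal sketch, assuming the illustrative choice f = sin on [0, 2π] and uniform partitions; `variation` and `uniform_partition` are hypothetical helper names, and floating point arithmetic means the results are approximate.

```python
import math

def variation(f, partition):
    """V(f, P): sum of |f(x_j) - f(x_{j-1})| over consecutive partition points."""
    return sum(abs(f(x1) - f(x0)) for x0, x1 in zip(partition, partition[1:]))

def uniform_partition(a, b, n):
    """Partition of [a, b] into n equal parts."""
    return [a + (b - a) * j / n for j in range(n + 1)]

a, b = 0.0, 2 * math.pi
coarse = uniform_partition(a, b, 3)
fine = uniform_partition(a, b, 12)  # refines coarse (in exact arithmetic)

v_coarse = variation(math.sin, coarse)
v_fine = variation(math.sin, fine)

print(v_coarse, v_fine)
```

The coarse partition misses the peaks of the sine at π/2 and 3π/2 and so undercounts. The finer one contains them and picks up the full rise and fall, which is 4 over one period.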

I’ll close this lecture with some examples.

Example 1: Let f be a monotone increasing function on [a, b]. That is, suppose that if x < y, we have f(x) ≤ f(y). Then the sum defining V (f, P ) telescopes so that we have

$$V(f, P) = f(b) - f(a),$$

regardless of P. Then this quantity also gives the total variation $V_a^b(f)$.

Example 2: You’re probably dying to know what happens with “nice” functions. Let f be once continuously differentiable on [a, b]. Then

$$V_a^b(f) = \int_a^b |f'(x)|\,dx.$$

How do we know? Pick a partition P. Then by the mean value theorem, for each part $[x_{j-1}, x_j]$ there is a point $c_j$ in the interval so that

$$f'(c_j)(x_j - x_{j-1}) = f(x_j) - f(x_{j-1}).$$

Thus the variation V(f, P) is a Riemann sum of $|f'|$ on the partition P. As we take the sup of variations we arrive at the integral. Thus using the theory of bounded variation on just nice functions means tying one hand behind our backs and making everything just depend on total variations, that is on integrals of absolute values of derivatives.
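Example 2 can be checked numerically. The sketch below assumes the illustrative function f(x) = x³ − x on [−1, 1], whose derivative 3x² − 1 changes sign at ±1/√3. The value 8/(3√3) is computed by hand from the monotone pieces of f. The helper names are hypothetical, and the integral is only approximated by a midpoint Riemann sum.

```python
import math

def variation(f, partition):
    """V(f, P): sum of |f(x_j) - f(x_{j-1})| over consecutive partition points."""
    return sum(abs(f(x1) - f(x0)) for x0, x1 in zip(partition, partition[1:]))

def midpoint_sum(g, partition):
    """Midpoint Riemann sum of g on the partition (approximates the integral)."""
    return sum(g((x0 + x1) / 2) * (x1 - x0)
               for x0, x1 in zip(partition, partition[1:]))

def f(x):
    return x ** 3 - x

def abs_f_prime(x):
    return abs(3 * x ** 2 - 1)

a, b, n = -1.0, 1.0, 1000
P = [a + (b - a) * j / n for j in range(n + 1)]

v = variation(f, P)                      # variation on a fine partition
integral = midpoint_sum(abs_f_prime, P)  # approximate integral of |f'|
exact = 8 / (3 * math.sqrt(3))           # hand-computed total variation

print(v, integral, exact)
```

The variation on the partition can only undershoot the total variation, while the Riemann sum of |f'| may land on either side of it. Both approach the same number as the partition is refined.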

Example 3: Let’s recall our old enemy

$$f(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q}; \\ 0 & \text{if } x \notin \mathbb{Q}. \end{cases}$$

We can pick a partition P where the $x_j$’s alternate between being rational and irrational. Every term of V(f, P) is then equal to 1, so V(f, P) = n, and taking n large makes the variation arbitrarily large. We see that this function is not of bounded variation, which is just the tip of the iceberg of what is wrong with it.
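Floating point numbers cannot distinguish rationals from irrationals, so this example can only be sketched symbolically. In the sketch below each partition point is a hypothetical tagged pair (label, is_rational). The irrational points are assumed to exist strictly between their rational neighbours, which density guarantees, and only the tags enter the computation.

```python
def dirichlet(point):
    """Value of f at a tagged point (label, is_rational): 1 on Q, 0 off Q."""
    _, is_rational = point
    return 1 if is_rational else 0

def alternating_partition(n):
    """Tags for a partition x_0 < x_1 < ... < x_{2n} of [0, 1].

    Even-indexed points stand for the rationals j / (2n); odd-indexed points
    stand for irrationals chosen between their neighbours (assumed, not built).
    """
    return [(f"x_{j}", j % 2 == 0) for j in range(2 * n + 1)]

def variation(f, partition):
    """V(f, P): sum of |f(x_j) - f(x_{j-1})| over consecutive partition points."""
    return sum(abs(f(q) - f(p)) for p, q in zip(partition, partition[1:]))

variations = [variation(dirichlet, alternating_partition(n)) for n in (1, 10, 100)]
print(variations)
```

Each of the 2n parts contributes exactly 1, so the variation grows linearly with the number of parts and has no finite supremum.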
