
Chapter 7. Induction and

Part 1. Mathematical Induction

The principle of mathematical induction is this: to establish an infinite sequence of propositions

$P_1, P_2, P_3, \dots, P_n, \dots$

(or, simply put, $P_n$ ($n \ge 1$)), it is enough to verify the following two things:

(1) $P_1$, and

(2) $P_k \Rightarrow P_{k+1}$ (that is, assuming $P_k$ valid, we can prove the validity of $P_{k+1}$).

These two things will give a “domino effect” for the validity of all $P_n$ ($n \ge 1$). Indeed, step (1) tells us that $P_1$ is true. Applying (2) to the case $k = 1$, we know that $P_2$ is true. Knowing that $P_2$ is true and applying (2) to the case $k = 2$, we know that $P_3$ is true. Continuing in this manner, we see that $P_4$ is true, $P_5$ is true, etc. This can go on and on. Now you should be convinced that $P_n$ is true, no matter how big $n$ is.

We give some examples to show how this induction principle works.

Example 1. Use mathematical induction to show $1 + 3 + 5 + \cdots + (2n - 1) = n^2$. (Remember: in mathematics, “show” means “prove”.)

Answer: For $n = 1$, the identity becomes $1 = 1^2$, which is obviously true. Now assume the validity of the identity for $n = k$:

$$1 + 3 + 5 + \cdots + (2k - 1) = k^2.$$

For $n = k + 1$, the left hand side of the identity is

$$1 + 3 + 5 + \cdots + (2k - 1) + (2(k + 1) - 1) = k^2 + (2k + 1) = (k + 1)^2,$$

which is the right hand side for $n = k + 1$. The proof is complete.
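A numerical check is not a proof, but it can catch a misremembered formula before the induction is attempted. The sketch below (function names are illustrative and not from the text) sums the odd numbers directly and also checks the algebra used in the inductive step.

```python
def sum_of_odds(n: int) -> int:
    """Return 1 + 3 + 5 + ... + (2n - 1) by direct summation."""
    return sum(2 * j - 1 for j in range(1, n + 1))


def inductive_step_holds(k: int) -> bool:
    """Check the algebra of the inductive step: k^2 + (2k + 1) == (k + 1)^2."""
    return k * k + (2 * k + 1) == (k + 1) ** 2


# Compare the direct sum with n^2 for the first few values of n.
for n in range(1, 11):
    assert sum_of_odds(n) == n * n
    assert inductive_step_holds(n)
```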

Example 2. Use mathematical induction to show the following formula for a geometric series:

$$1 + x + x^2 + \cdots + x^n = \frac{1 - x^{n+1}}{1 - x} \qquad (x \ne 1). \tag{7.1}$$

Answer: For $n = 0$, the formula becomes $1 = \frac{1 - x}{1 - x}$, which holds trivially. Now we assume the validity of the identity for $n = k$: $1 + x + x^2 + \cdots + x^k = (1 - x^{k+1})/(1 - x)$. For $n = k + 1$, the left hand side of the formula is

$$\begin{aligned}
1 + x + x^2 + \cdots + x^k + x^{k+1} &= \frac{1 - x^{k+1}}{1 - x} + x^{k+1} \\
&= \frac{1 - x^{k+1}}{1 - x} + \frac{x^{k+1}(1 - x)}{1 - x} = \frac{1 - x^{k+1} + x^{k+1} - x^{k+2}}{1 - x} = \frac{1 - x^{k+2}}{1 - x},
\end{aligned}$$

which is the right hand side for $n = k + 1$. So the formula is valid for general $n$.
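As a sanity check of (7.1), one can compare both sides in exact rational arithmetic. This is only a sketch: the function names are made up for illustration, and `Fraction` is used so that no rounding error enters the comparison.

```python
from fractions import Fraction


def geometric_sum(x: Fraction, n: int) -> Fraction:
    """Left hand side of (7.1): 1 + x + x^2 + ... + x^n, term by term."""
    return sum((x ** j for j in range(n + 1)), Fraction(0))


def closed_form(x: Fraction, n: int) -> Fraction:
    """Right hand side of (7.1); requires x != 1 so the denominator is nonzero."""
    return (1 - x ** (n + 1)) / (1 - x)


# Compare both sides for a few rational x and small n.
for x in (Fraction(1, 2), Fraction(-3, 4), Fraction(5)):
    for n in range(8):
        assert geometric_sum(x, n) == closed_form(x, n)
```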

Exercise 1. Prove the following identity by induction:

$$1 + 2 + 3 + \cdots + n = \frac{n(n + 1)}{2}. \tag{7.2}$$

Exercise 2. Prove the following identity by induction:

$$1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n + 1)(2n + 1)}{6}. \tag{7.3}$$

Exercise 3. Prove the following identity by induction:

$$1^3 + 2^3 + 3^3 + \cdots + n^3 = (1 + 2 + 3 + \cdots + n)^2.$$

(According to Exercise 1, you only need to prove $1^3 + 2^3 + 3^3 + \cdots + n^3 = \left(\frac{n(n+1)}{2}\right)^2$.)

Not every identity depending on $n$ can be handled by induction, as indicated in the following exercise.

Exercise 4. Try to prove the following identity by induction:

$$a^{n+1} - b^{n+1} = (a - b)(a^n + a^{n-1}b + a^{n-2}b^2 + \cdots + ab^{n-1} + b^n). \tag{7.4}$$

After failing to do so, try to prove this by multiplying out the right hand side.

Sometimes we can apply induction to statements, other than identities, that depend on a natural-number parameter $n$, as shown in the following example:

Example 3. The celebrated Cauchy-Schwarz inequality says

$$(a_1b_1 + a_2b_2 + \cdots + a_nb_n)^2 \le (a_1^2 + a_2^2 + \cdots + a_n^2)(b_1^2 + b_2^2 + \cdots + b_n^2). \tag{7.5}$$

Check this for $n = 1, 2$. Then use induction to prove this inequality for general $n$.

Answer. When $n = 1$, the inequality reads $(a_1b_1)^2 \le a_1^2b_1^2$. This is clear: in fact, $(a_1b_1)^2 = a_1^2b_1^2$. Next we verify this inequality for $n = 2$. Here we rewrite this inequality in different symbols because we need it in the inductive step:

$$(A_1B_1 + A_2B_2)^2 \le (A_1^2 + A_2^2)(B_1^2 + B_2^2), \tag{$*$}$$

that is, $A_1^2B_1^2 + 2A_1B_1A_2B_2 + A_2^2B_2^2 \le A_1^2B_1^2 + A_1^2B_2^2 + A_2^2B_1^2 + A_2^2B_2^2$. So it is enough to check

$$2A_1B_1A_2B_2 \le A_1^2B_2^2 + A_2^2B_1^2,$$

which can be recast as $A_1^2B_2^2 - 2A_1B_1A_2B_2 + A_2^2B_1^2 \ge 0$. But

$$A_1^2B_2^2 - 2A_1B_1A_2B_2 + A_2^2B_1^2 = (A_1B_2 - A_2B_1)^2,$$

which is clearly $\ge 0$. Thus ($*$) is valid. Now we assume the validity of

$$(a_1b_1 + a_2b_2 + \cdots + a_nb_n)^2 \le (a_1^2 + a_2^2 + \cdots + a_n^2)(b_1^2 + b_2^2 + \cdots + b_n^2) \tag{$**$}$$

for $n = k$. Notice that the left hand side of ($**$) increases or remains the same if all numbers in it are replaced by their absolute values, while the right hand side remains the same. Hence we may (and we do) assume that $a_j \ge 0$ and $b_j \ge 0$ for all $j$. By the induction hypothesis, $(a_1b_1 + \cdots + a_kb_k)^2 \le (a_1^2 + \cdots + a_k^2)(b_1^2 + \cdots + b_k^2)$, or

$$a_1b_1 + a_2b_2 + \cdots + a_kb_k \le A_1B_1,$$

where $A_1 = \sqrt{a_1^2 + a_2^2 + \cdots + a_k^2}$ and $B_1 = \sqrt{b_1^2 + b_2^2 + \cdots + b_k^2}$. Now let $A_2 = a_{k+1}$ and $B_2 = b_{k+1}$ and use ($*$) to obtain

$$\begin{aligned}
(a_1b_1 + \cdots + a_kb_k + a_{k+1}b_{k+1})^2 &\le (A_1B_1 + a_{k+1}b_{k+1})^2 \\
&\equiv (A_1B_1 + A_2B_2)^2 \le (A_1^2 + A_2^2)(B_1^2 + B_2^2) \\
&= (a_1^2 + a_2^2 + \cdots + a_k^2 + a_{k+1}^2)(b_1^2 + b_2^2 + \cdots + b_k^2 + b_{k+1}^2),
\end{aligned}$$

and hence ($**$) is also valid for $n = k + 1$.

Question 5. What is wrong with the following “proof” of “n = n + 1”?

“Assume that n = n + 1 holds for n = k, that is, k = k + 1. Adding 1 to both sides, we get k + 1 = k + 1 + 1, which shows that the identity n = n + 1 also holds for n = k + 1. By the principle of induction, n = n + 1 is valid for all n.”

Question 6. What is wrong with the following “proof” of the statement “every man is bald” by induction?

“All we need to do is to establish the statement ‘a man with n hairs or less is bald’ for all n. When n = 0, 1 or 2, the statement is clearly valid. Now assume that the statement is true for n = k, that is, assume that ‘a man with k hairs or less is bald’ is valid. Then the statement for n = k + 1 is also valid because an additional hair cannot change the fact of being bald. So by the principle of induction, the statement ‘a man with n hairs or less is bald’ is valid for all n. This proves all men are bald.”

An important role played by induction is to give proper definitions for many mathematical expressions involving natural numbers. The usual pattern (or its variant) of this definition by induction or by recursion is as follows: to define a sequence $\{f(n)\}_{n \ge 1}$, we take the following two steps:

(I) Specify what f(1) is,

(II) Specify how f(n + 1) can be decided from a subset of f(1), f(2), . . . , f(n).

Example 4. You know the meanings of the expressions $a^n$ and $n!$. Strictly speaking, they should be defined by induction. We write down the two steps (I), (II) described above for defining each of them:

$a^1 = a$; $a^{n+1} = a^n a$. (With $f(n) = a^n$, $f(1) = a$ and $f(n + 1) = f(n)f(1)$.)

1! = 1; (n+1)! = n!(n+1). (With f(n) = n!, f(1) = 1 and f(n+1) = f(n)(n+1).)

Certainly, na is also defined by induction: 1a = a and (n + 1)a = na + a.
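The two-step pattern (I), (II) translates directly into recursive functions. The following is a minimal sketch, with illustrative names `power`, `factorial` and `multiple` that are not from the text. Each function has a base case for n = 1 and a rule that reduces n + 1 to n.

```python
def power(a, n: int):
    """a^n for n >= 1, using a^1 = a and a^(n+1) = a^n * a."""
    if n == 1:
        return a                    # step (I)
    return power(a, n - 1) * a      # step (II)


def factorial(n: int) -> int:
    """n! for n >= 1, using 1! = 1 and (n+1)! = n! * (n+1)."""
    if n == 1:
        return 1
    return factorial(n - 1) * n


def multiple(n: int, a):
    """na for n >= 1, using 1a = a and (n+1)a = na + a."""
    if n == 1:
        return a
    return multiple(n - 1, a) + a


assert power(2, 10) == 1024
assert factorial(5) == 120
assert multiple(4, 7) == 28
```

Python limits recursion depth, so for large n an iterative loop would be the practical choice; the recursive form is kept here only because it mirrors the definition.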

Example 5. This is an example of great importance about an expression involving the summation symbol that can be defined recursively, given as follows:

$$\sum_{k=1}^{n} a_k.$$

We read the expression as “the sum of all terms $a_k$, where $k$ runs from 1 to $n$”. (Earlier we wrote such an expression simply as $a_1 + a_2 + \cdots + a_n$.) Many identities and exercises above can be recast nicely in the format using the summation symbol, for example,

$$\sum_{k=1}^{n} (2k - 1) = n^2, \qquad \sum_{k=1}^{n} k^3 = \Big(\sum_{k=1}^{n} k\Big)^2,$$

etc. (see Example 1 and Exercise 3 above). The recursive definition of $\sum_{k=1}^{n} a_k$ is

$$\sum_{k=1}^{1} a_k = a_1; \qquad \sum_{k=1}^{n+1} a_k = \sum_{k=1}^{n} a_k + a_{n+1}$$

(with $f(n) = \sum_{k=1}^{n} a_k$, $f(1) = a_1$ and $f(n + 1) = f(n) + a_{n+1}$).

Example 6. The famous Fibonacci numbers $F_n$ ($n = 0, 1, 2, 3, \dots$) are defined by induction (or recursion):

$$F_n = F_{n-1} + F_{n-2}, \qquad n \ge 2, \tag{7.6}$$

which says, starting from the third term, every term in the sequence of Fibonacci numbers is the sum of the previous two terms. In order to determine this sequence, we have to specify the first two terms. Here they are: $F_0 = 1$ and $F_1 = 1$. It is easy to produce a few terms at the beginning of the sequence:

1, 1, 2, 3, 5, 8, 13, 21, 34,...
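The recursion (7.6), together with the two starting values, is already an algorithm. A minimal sketch follows. The function name is illustrative, and the convention $F_0 = F_1 = 1$ is the one used in this chapter, which differs from the more common $F_0 = 0$.

```python
def fibonacci(n: int) -> int:
    """Return F_n with the convention F_0 = F_1 = 1."""
    a, b = 1, 1                  # the pair (F_0, F_1)
    for _ in range(n):
        a, b = b, a + b          # shift (F_j, F_{j+1}) to (F_{j+1}, F_{j+2})
    return a


# The first nine terms agree with the list in the text.
assert [fibonacci(n) for n in range(9)] == [1, 1, 2, 3, 5, 8, 13, 21, 34]
```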

(This sequence was originally described by its inventor in terms of how rabbits multiply: $F_n$ is the number of pairs of rabbits on the $n$th day. There is an interesting story behind this but it does not concern us here.) The mathematical question here is, how can we find a closed expression for $F_n$ for general $n$? There are many ways to answer this question, such as: treat it as a difference equation, or rewrite the recursion relation as a matrix identity

$$\begin{pmatrix} F_n \\ F_{n-1} \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} F_{n-1} \\ F_{n-2} \end{pmatrix}$$

and use the method of diagonalization in linear algebra, or use generating functions.
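The matrix identity can also be used computationally. Iterating it gives $(F_n, F_{n-1})^T = M^{n-1}(F_1, F_0)^T$ with $M$ the 2×2 matrix above. The sketch below does not diagonalize $M$; it substitutes repeated squaring to compute the matrix power, which is a different technique from the one named in the text. All function names are illustrative.

```python
def mat_mul(X, Y):
    """Product of two 2x2 integer matrices given as nested lists."""
    return [[sum(X[i][t] * Y[t][j] for t in range(2)) for j in range(2)]
            for i in range(2)]


def mat_pow(M, e: int):
    """M raised to the power e >= 0 by repeated squaring."""
    result = [[1, 0], [0, 1]]            # identity matrix
    while e > 0:
        if e & 1:
            result = mat_mul(result, M)
        M = mat_mul(M, M)
        e >>= 1
    return result


def fibonacci_matrix(n: int) -> int:
    """F_n with F_0 = F_1 = 1, read off from M^(n-1) applied to (F_1, F_0)."""
    if n == 0:
        return 1
    P = mat_pow([[1, 1], [1, 0]], n - 1)
    return P[0][0] * 1 + P[0][1] * 1     # first row times (F_1, F_0) = (1, 1)


assert [fibonacci_matrix(n) for n in range(9)] == [1, 1, 2, 3, 5, 8, 13, 21, 34]
```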

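** Appendix (may be omitted). We briefly describe the last method here. Form the power series $f(x) = \sum_{n=0}^{\infty} F_n x^n$. It is called the generating function for the Fibonacci sequence. It converges when $|x|$ is small enough. Now

$$\begin{aligned}
f(x) &= F_0 + F_1x + \sum_{n \ge 2} F_n x^n = F_0 + F_1x + \sum_{n \ge 2} (F_{n-1} + F_{n-2})\, x^n \\
&= 1 + x + \sum_{n \ge 2} F_{n-1}\, x^n + \sum_{n \ge 2} F_{n-2}\, x^n \\
&= 1 + x + \sum_{k \ge 1} F_k\, x^{k+1} + \sum_{\ell \ge 0} F_\ell\, x^{\ell+2} \qquad (k = n - 1,\ \ell = n - 2) \\
&= 1 + x + x \sum_{k \ge 1} F_k\, x^k + x^2 \sum_{\ell \ge 0} F_\ell\, x^\ell \\
&= 1 + x + x(f(x) - 1) + x^2 f(x) = 1 + x f(x) + x^2 f(x).
\end{aligned}$$

So we have $(1 - x - x^2) f(x) = 1$. We can factorize the polynomial $1 - x - x^2$ as $1 - x - x^2 = (1 - r_+x)(1 - r_-x)$, where $r_\pm = (1 \pm \sqrt{5})/2$ (with $r_+ + r_- = 1$ and $r_+r_- = -1$).

Now

$$\begin{aligned}
f(x) &= \frac{1}{1 - x - x^2} = \frac{1}{(1 - r_+x)(1 - r_-x)} = \left( \frac{r_+}{1 - r_+x} - \frac{r_-}{1 - r_-x} \right) \frac{1}{r_+ - r_-} \\
&= \frac{1}{\sqrt{5}}\, r_+ \sum_{n \ge 0} r_+^n x^n - \frac{1}{\sqrt{5}}\, r_- \sum_{n \ge 0} r_-^n x^n = \frac{1}{\sqrt{5}} \sum_{n \ge 0} \left( r_+^{n+1} - r_-^{n+1} \right) x^n.
\end{aligned}$$

Comparing this with $f(x) = \sum_{n \ge 0} F_n x^n$, we obtain

$$F_n = \frac{1}{\sqrt{5}} \left( r_+^{n+1} - r_-^{n+1} \right) \equiv \frac{1}{\sqrt{5}} \left( \frac{1 + \sqrt{5}}{2} \right)^{n+1} - \frac{1}{\sqrt{5}} \left( \frac{1 - \sqrt{5}}{2} \right)^{n+1}.$$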

The amazing thing here is the appearance of the irrational number $\sqrt{5}$ in this answer for $F_n$, which is always an integer. (End of the appendix **)
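The closed expression can be compared numerically with the recursion. The sketch below uses floating-point arithmetic and rounds to the nearest integer, so it is only trustworthy for moderate n (an assumption of this example, not a statement from the text). The names are illustrative.

```python
from math import sqrt


def fibonacci_closed_form(n: int) -> int:
    """F_n = (r_+^(n+1) - r_-^(n+1)) / sqrt(5), with F_0 = F_1 = 1."""
    s5 = sqrt(5.0)
    r_plus = (1 + s5) / 2
    r_minus = (1 - s5) / 2
    # Rounding removes the small floating-point error for moderate n.
    return round((r_plus ** (n + 1) - r_minus ** (n + 1)) / s5)


# Compare with the recursion F_n = F_{n-1} + F_{n-2}, F_0 = F_1 = 1.
a, b = 1, 1
for n in range(30):
    assert fibonacci_closed_form(n) == a
    a, b = b, a + b
```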

Exercise 7. Use the recursion relation for Fibonacci numbers to prove the following identity

$$F_0 + F_1 + F_2 + \cdots + F_{n-1} = F_{n+1} - 1 \qquad (n \ge 1).$$

Exercise 8. Prove the amazing identity $F_{m+n} = F_mF_n + F_{m-1}F_{n-1}$ by induction.

Recursion is an important notion for constructing so-called recursive functions, which are a key notion in logic and theoretical computer science.

Part 2. Polynomials: Taylor’s formula, binomial formula, etc.

By a polynomial we mean a function of the form

$$p(x) = a_nx^n + a_{n-1}x^{n-1} + \cdots + a_2x^2 + a_1x + a_0 \equiv \sum_{k=0}^{n} a_kx^k,$$

where $a_0, a_1, a_2, \dots, a_n$ are some constants, called coefficients of $p(x)$. When $a_n \ne 0$, we say that the degree of $p(x)$ is $n$. Now let us take an arbitrary constant $a$. Then $p(x + a)$ as a function of $x$ is a polynomial of the same degree as that of $p(x)$. However, for $a \ne 0$ the degree of $p(x + a) - p(x)$ is lower by 1, since the highest power terms in $p(x + a)$ and $p(x)$ cancel out.

Example 7. Consider the polynomial $p(x) = 2x^2 + 3x + 1$ of degree 2. Then, for any constant $a$,

$$p(x + a) = 2(x + a)^2 + 3(x + a) + 1 = 2(x^2 + 2ax + a^2) + 3(x + a) + 1 = 2x^2 + (4a + 3)x + (2a^2 + 3a + 1),$$

which is also a polynomial of degree 2. So $p(x + a) - p(x) = 4ax + (2a^2 + 3a)$, which is a polynomial of degree 1.

The derivative of a polynomial $p(x)$, denoted by $p'(x)$ or $\frac{d}{dx}p(x)$, is defined to be the limit of the expression $(p(x + h) - p(x))/h$, called a “difference quotient”, as $h$ tends to zero. Thus we write

$$\frac{d}{dx}p(x) \equiv p'(x) = \lim_{h \to 0} \frac{p(x + h) - p(x)}{h}.$$

Suppose that the degree of $p(x)$ is $n$. Then the degree of the difference $p(x + h) - p(x)$ drops by 1 and hence we expect that the degree of $p'(x)$ is $n - 1$. When $p(x)$ is of degree zero, in other words, when $p(x)$ is a constant function, say $p(x) \equiv C$, we have

$$\frac{d}{dx}C = 0.$$

Simply put, the derivative of a constant function is zero.

Example 8. Consider the polynomial $p(x) = 2x^2 + 3x + 1$. From the previous example we know that $p(x + a) - p(x) = 4ax + (2a^2 + 3a)$. Replace $a$ by $h$ to get $p(x + h) - p(x) = 4hx + (2h^2 + 3h) = h(4x + 2h + 3)$. So

$$p'(x) = \lim_{h \to 0} \frac{p(x + h) - p(x)}{h} = \lim_{h \to 0} (4x + 2h + 3) = 4x + 3.$$

We will develop some general rules to obtain this answer within seconds.

The following are two basic rules for computing derivatives. For simplicity, we write u and v for two polynomials (instead of u(x) and v(x)) and a, b as two constants:

Linearity: $(au + bv)' = au' + bv'$. Product rule: $(uv)' = u'v + uv'$.

They can be checked as follows. For linearity:

$$\begin{aligned}
(au + bv)'(x) &= \lim_{h \to 0} \frac{[au(x + h) + bv(x + h)] - [au(x) + bv(x)]}{h} \\
&= \lim_{h \to 0} \left( a\,\frac{u(x + h) - u(x)}{h} + b\,\frac{v(x + h) - v(x)}{h} \right) = au'(x) + bv'(x).
\end{aligned}$$

For the product rule, we need a clever trick of inserting appropriate terms, first used by Leibniz:

$$\begin{aligned}
(uv)'(x) &= \lim_{h \to 0} \frac{u(x + h)v(x + h) - u(x)v(x)}{h} \\
&= \lim_{h \to 0} \left( \frac{u(x + h)v(x + h) - u(x)v(x + h)}{h} + \frac{u(x)v(x + h) - u(x)v(x)}{h} \right) \\
&= \lim_{h \to 0} \left( \frac{u(x + h) - u(x)}{h}\, v(x + h) + u(x)\, \frac{v(x + h) - v(x)}{h} \right) \\
&= u'(x)v(x) + u(x)v'(x).
\end{aligned}$$

(Note: In the mathematical literature this rule is commonly known as the Leibniz rule. “Product rule” is used in textbooks for the convenience of beginners.)

The following identity is a basic formula for differentiating polynomials:

$$\frac{d}{dx}x^n = nx^{n-1} \qquad (n \ge 0). \tag{7.7}$$

Together with linearity, we can find the derivative of a given polynomial instantly. For example, the derivative of $p(x) = 2x^2 + 3x + 1$ (Example 8 above) is $p'(x) = 2(2x) + 3 \cdot 1 + 0 = 4x + 3$.
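Linearity and (7.7) together give a mechanical rule on coefficients: the term $a_kx^k$ becomes $k\,a_kx^{k-1}$. Here is a minimal sketch, assuming a polynomial is stored as a list `coeffs` with `coeffs[k]` the coefficient of $x^k$ (a representation chosen for this example, not prescribed by the text).

```python
def derivative(coeffs):
    """Coefficients of p'(x), where coeffs[k] is the coefficient of x^k in p(x)."""
    # Term a_k x^k contributes k * a_k to the coefficient of x^(k-1);
    # the k = 0 term (a constant) contributes nothing and is dropped.
    shifted = [k * c for k, c in enumerate(coeffs)][1:]
    return shifted or [0]        # the derivative of a constant is the zero polynomial


# p(x) = 2x^2 + 3x + 1 is stored as [1, 3, 2]; p'(x) = 4x + 3 is [3, 4].
assert derivative([1, 3, 2]) == [3, 4]
assert derivative([5]) == [0]
```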

Example 9. Prove (7.7) by induction.

Solution. For $n = 0$, $x^n$ is the constant function 1. In this case the identity becomes $\frac{d}{dx}1 = 0$, which is valid, as we know that the derivative of a constant function vanishes. For $n = 1$, we have

$$\frac{dx}{dx} = \lim_{h \to 0} \frac{(x + h) - x}{h} = \lim_{h \to 0} \frac{h}{h} = \lim_{h \to 0} 1 = 1 \equiv 1 \cdot x^0.$$

Hence the identity is valid for $n = 1$ as well. Now assume the validity of the identity for $n = k$, that is, $\frac{d}{dx}x^k = kx^{k-1}$. By means of the product rule, we have

$$\frac{d}{dx}x^{k+1} = \frac{d}{dx}\left( x^k \cdot x \right) = \left( \frac{d}{dx}x^k \right) x + x^k\, \frac{d}{dx}x = kx^{k-1} \cdot x + x^k = (k + 1)x^k,$$

showing the validity of the identity for $n = k + 1$. The induction principle tells us that the identity holds for all $n \ge 0$.

The derivative of the derivative $p'(x)$ of a polynomial $p(x)$ is called the second derivative of $p(x)$, and is denoted by $p''(x)$ or $p^{(2)}(x)$. The derivative of $p^{(2)}(x)$ is called the third derivative of $p(x)$ and is denoted by $p^{(3)}(x)$. In general, we can define the $n$th derivative $p^{(n)}(x)$ of a polynomial $p(x)$ recursively as follows:

$$p^{(0)}(x) = p(x), \qquad p^{(k+1)}(x) = \frac{d}{dx}p^{(k)}(x).$$

For example, with $p(x) = 2x^2 + 3x + 1$ (Example 8 above) we have $p^{(1)}(x) = 4x + 3$, $p^{(2)}(x) = 4$, $p^{(3)}(x) = 0$.

Exercise 9. In each of the following cases, find all derivatives of a given polynomial $p(x)$: (a) $p(x) = 3 + 4x + 2x^2$, (b) $p(x) = 7 + 3x + 2x^2 + x^3$, (c) $p(x) = \frac{1}{15}x^5 + \frac{1}{12}x^4$.

Let $f(x)$ be an arbitrary polynomial and let $a$ be any number. Then $f(x + a)$ is a polynomial in $x$ of the same degree and hence we can put it in the form $f(x + a) = \sum_{k=0}^{n} c_kx^k$. Replacing $x$ by $x - a$ throughout the last identity, we get

$$f(x) = \sum_{k=0}^{n} c_k(x - a)^k \equiv c_0 + c_1(x - a) + c_2(x - a)^2 + \cdots + c_n(x - a)^n. \tag{7.8}$$

To determine the constant term $c_0$, we let $x = a$ in (7.8) to get $f(a) = c_0$. To determine $c_1$, we take the derivatives of both sides of (7.8) first:

$$f'(x) = c_1 + c_2 \cdot 2(x - a) + \cdots + c_n \cdot n(x - a)^{n-1}$$

and then we substitute $x = a$ to get $f'(a) = c_1$. To get an expression for $c_j$ for general $j$ ($j \le n$), we need to differentiate both sides of (7.8) $j$ times. This leads us to the question of finding the $j$th derivative of $(x - a)^k$. The answer to this question is

$$\frac{d^j}{dx^j}(x - a)^k = \begin{cases} \dfrac{k!}{(k - j)!}\,(x - a)^{k-j} & \text{if } j < k; \\ k! & \text{if } j = k; \\ 0 & \text{if } j > k. \end{cases} \tag{7.9}$$

You have to convince yourself that (7.9) is correct.

Exercise 10. Verify (7.9) for $k = 4$ and $j = 1, 2, 3, 4$.

When $j \ne k$, the right hand side of (7.9) either has a factor of $x - a$ or is equal to zero, and therefore its value is zero at $x = a$. When $j = k$, it becomes $k!$, which is the same as $j!$. Thus we have

$$\left. \frac{d^j}{dx^j}(x - a)^k \right|_{x = a} = \begin{cases} j! & \text{if } j = k; \\ 0 & \text{if } j \ne k. \end{cases} \tag{7.10}$$

Now take the $j$th derivatives of both sides of (7.8) and then evaluate at $x = a$. The left hand side becomes $f^{(j)}(a)$. It follows from (7.10) that the right hand side becomes $j!\,c_j$. Thus we have $f^{(j)}(a) = j!\,c_j$, or $c_j = f^{(j)}(a)/j!$. Replace $j$ by $k$ and substitute the resulting identity $c_k = f^{(k)}(a)/k!$ back into (7.8). We finally get

$$f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}\,(x - a)^k. \tag{7.11}$$

The last identity is called the Taylor expansion at x = a for the polynomial f.

Example 10. Use the Taylor expansion to find the partial fractions of $\dfrac{x^2 + x + 2}{(x - 1)^3}$.

Solution. Let $p(x) = x^2 + x + 2$. Then $p^{(1)}(x) = 2x + 1$, $p^{(2)}(x) = 2$, $p^{(3)}(x) = 0$. So $p(1) = 4$, $p^{(1)}(1) = 3$, $p^{(2)}(1) = 2$. The Taylor expansion of $p(x)$ at $x = 1$ is $p(x) = 4 + 3(x - 1) + (x - 1)^2$. Hence

$$\frac{x^2 + x + 2}{(x - 1)^3} = \frac{4 + 3(x - 1) + (x - 1)^2}{(x - 1)^3} = \frac{1}{x - 1} + \frac{3}{(x - 1)^2} + \frac{4}{(x - 1)^3},$$

which is the required partial fraction decomposition.
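Formula (7.11) says that the coefficient of $(x - a)^k$ is $f^{(k)}(a)/k!$, which is easy to compute by differentiating repeatedly. The sketch below assumes, for illustration only, that a polynomial is stored as a list with entry `k` holding the coefficient of $x^k$; the helper names are not from the text. Exact fractions avoid rounding in the division by $k!$.

```python
from fractions import Fraction
from math import factorial


def derivative(coeffs):
    """Coefficients of p'(x); coeffs[k] is the coefficient of x^k."""
    return [k * c for k, c in enumerate(coeffs)][1:] or [0]


def evaluate(coeffs, x):
    """Value of the polynomial at x, by Horner's scheme."""
    value = 0
    for c in reversed(coeffs):
        value = value * x + c
    return value


def taylor_coefficients(coeffs, a):
    """Return [c_0, ..., c_n] with c_k = f^(k)(a) / k!, as in (7.11)."""
    result = []
    current = list(coeffs)                 # coefficients of f^(k), starting at k = 0
    for k in range(len(coeffs)):
        result.append(Fraction(evaluate(current, a), factorial(k)))
        current = derivative(current)
    return result


# p(x) = x^2 + x + 2 at a = 1 gives 4 + 3(x - 1) + (x - 1)^2.
assert taylor_coefficients([2, 1, 1], 1) == [4, 3, 1]
```

Read in reverse, the list `[4, 3, 1]` supplies the numerators 1, 3, 4 of the partial fractions found in the worked example.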

Now we apply (7.11) to the monomial $f(x) = x^n$. We have to compute the $k$th derivative of $f(x) = x^n$ for $k \le n$. The answer to this is supplied by (7.9). In using (7.9), we switch $k$ to $n$ and $j$ to $k$, and set $a = 0$. We obtain

$$f^{(k)}(x) \equiv \frac{d^k}{dx^k}x^n = \frac{n!}{(n - k)!}\,x^{n-k},$$

from which we get $f^{(k)}(a) = \frac{n!}{(n - k)!}\,a^{n-k}$. So (7.11) gives

$$x^n = \sum_{k=0}^{n} \frac{n!}{k!\,(n - k)!}\,a^{n-k}(x - a)^k.$$

Substituting $x = a + b$ into the last identity and noticing that

$$\frac{n!}{k!\,(n - k)!} = \binom{n}{k}, \tag{7.12}$$

we obtain

$$(a + b)^n = \sum_{k=0}^{n} \binom{n}{k}\,a^{n-k}b^k,$$

which is the celebrated binomial theorem.
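The expansion can be spot-checked with integer arithmetic. This sketch uses `math.comb` from the Python standard library (available from Python 3.8) for $\binom{n}{k}$; the function name `binomial_expansion` is illustrative.

```python
from math import comb, factorial


def binomial_expansion(a: int, b: int, n: int) -> int:
    """Right hand side of the expansion: sum over k of C(n, k) a^(n-k) b^k."""
    return sum(comb(n, k) * a ** (n - k) * b ** k for k in range(n + 1))


# (7.12): C(n, k) agrees with n! / (k! (n - k)!).
assert all(comb(6, k) == factorial(6) // (factorial(k) * factorial(6 - k))
           for k in range(7))

# Compare with (a + b)^n for a few integer choices.
for a, b, n in [(2, 3, 5), (1, -4, 6), (7, 0, 3)]:
    assert binomial_expansion(a, b, n) == (a + b) ** n
```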

Exercise 11. Use the binomial theorem to verify the identities

$$\sum_{k=0}^{n} \binom{n}{k} \equiv \binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{n} = 2^n$$

and

$$\sum_{k=0}^{n} (-1)^k \binom{n}{k} \equiv \binom{n}{0} - \binom{n}{1} + \cdots + (-1)^n \binom{n}{n} = 0 \qquad (n \ge 1).$$

The number $\binom{n}{k}$, read as “n choose k”, is interpreted as the number of ways to choose $k$ objects from $n$ objects. For example, $\binom{4}{2} = \frac{4!}{2!\,2!} = 6$ tells us that there are 6 ways to take two objects from a set of 4. Indeed, suppose that the four objects are {♥, ♣, ♦, ♠}; then the six ways of picking two are {♥, ♣}, {♥, ♦}, {♥, ♠}, {♣, ♦}, {♣, ♠}, {♦, ♠}. The following identities concerning “n choose k” are basic:

$$\binom{n}{k} = \binom{n}{n - k}, \qquad \binom{n + 1}{k} = \binom{n}{k - 1} + \binom{n}{k}. \tag{7.13}$$

Both can be understood as follows. The first identity says, in dividing $n$ objects between you and me, $k$ for me and $n - k$ for you, the two methods below give the same number of ways. One method is that I pick $k$ objects, leaving the rest to you, and there are $\binom{n}{k}$ ways to do this. The other method is that you pick $n - k$ objects, leaving the rest to me, and there are $\binom{n}{n-k}$ ways to do this. The second identity concerns the number of ways to pick $k$ objects from $n + 1$ objects. Let us label one of these $n + 1$ objects and call it ♠. There are $\binom{n}{k}$ ways of taking $k$ objects not including ♠ and there are $\binom{n}{k-1}$ ways of taking $k$ objects including ♠. Together they give $\binom{n}{k} + \binom{n}{k-1}$ ways.

Exercise 12. Use (7.12) to verify (7.13).

*The rest of the material in this chapter is optional.

We use the binomial theorem to prove the following assertion inductively:

Proposition. If $p(x)$ is a polynomial of degree $n$, then there is a polynomial $f(x)$ of degree $n + 1$ such that $p(x) = f(x + 1) - f(x)$.

This proposition suggests how to find a recipe for the sum

$$S_n = p(0) + p(1) + p(2) + p(3) + \cdots + p(n),$$

where $p(x)$ is a given polynomial. Indeed, if we know $f(x)$ with $p(x) = f(x + 1) - f(x) \equiv -f(x) + f(x + 1)$, $S_n$ becomes a “telescoping sum”:

$$\begin{aligned}
S_n &= (-f(0) + f(1)) + (-f(1) + f(2)) + (-f(2) + f(3)) + \cdots + (-f(n) + f(n + 1)) \\
&= -f(0) + f(n + 1) = f(n + 1) - f(0).
\end{aligned}$$

Since the relation $f(x + 1) - f(x) = p(x)$ is unchanged when we add a constant to $f(x)$, we may arrange $f(x)$ in such a way that $f(0) = 0$ so that the above identity becomes

$$S_n = f(n + 1),$$

which is the required recipe for the sum. This proposition tells us that such an $f(x)$ exists. All we need is to look for it, which may involve some hard work!

Now we use this method to find the recipe for the sum $1^2 + 2^2 + 3^2 + \cdots + n^2$. Certainly the answer is given by (7.3) in Exercise 2, which asks to verify this recipe by induction. But how one can conjure up such a recipe is a complete mystery! Our task here is to use a systematic method to demystify this recipe. Here $p(x) = x^2$. The above proposition tells us that there is a polynomial $f(x)$ of degree 3 such that $f(x + 1) - f(x) = p(x)$. Let us write $f(x) = ax^3 + bx^2 + cx + d$. Let us set $f(0) = 0$ so that $d = 0$. Now

$$\begin{aligned}
f(1) &= f(1) - f(0) = p(0) = 0, \\
f(2) &= f(2) - f(1) = p(1) = 1^2 = 1, \\
f(3) &= f(2) + p(2) = 1 + 2^2 = 5.
\end{aligned}$$

So we have

$$a + b + c = 0, \qquad 8a + 4b + 2c = 1, \qquad 27a + 9b + 3c = 5.$$

The solution to this system of linear equations is $a = 1/3$, $b = -1/2$, $c = 1/6$. So

$$f(x) = \frac{1}{3}x^3 - \frac{1}{2}x^2 + \frac{1}{6}x = \frac{(2x^2 - 3x + 1)x}{6} = \frac{(x - 1)(2x - 1)x}{6}$$

and hence $S_n = f(n + 1) = n(2n + 1)(n + 1)/6$.
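The values $a = 1/3$, $b = -1/2$, $c = 1/6$ can be checked against the three linear equations, the relation $f(x + 1) - f(x) = x^2$, and the sum itself. This is a small sketch in exact rational arithmetic; the names `f` and `sum_of_squares` are chosen here for illustration.

```python
from fractions import Fraction

a, b, c = Fraction(1, 3), Fraction(-1, 2), Fraction(1, 6)

# The three linear equations f(1) = 0, f(2) = 1, f(3) = 5.
assert a + b + c == 0
assert 8 * a + 4 * b + 2 * c == 1
assert 27 * a + 9 * b + 3 * c == 5


def f(x: int) -> Fraction:
    """f(x) = x^3/3 - x^2/2 + x/6, with f(0) = 0."""
    return a * x ** 3 + b * x ** 2 + c * x


def sum_of_squares(n: int) -> int:
    """S_n = 0^2 + 1^2 + ... + n^2 by direct summation."""
    return sum(j * j for j in range(n + 1))


for n in range(20):
    assert f(n + 1) - f(n) == n * n            # the difference relation
    assert f(n + 1) == sum_of_squares(n)       # the telescoped sum S_n = f(n + 1)
    assert sum_of_squares(n) == n * (n + 1) * (2 * n + 1) // 6   # formula (7.3)
```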

Now we return to the proof of the proposition by induction on the degree $n$ of $p(x)$. When $n = 0$, $p(x)$ is a constant, say $p(x) = c$. In this case we just let $f(x) = cx$, which is a polynomial of degree $1 \equiv 0 + 1$. Indeed,

$$f(x + 1) - f(x) = c(x + 1) - cx = c = p(x).$$

Now we assume that the proposition is valid for all $n \le k - 1$ and let $p(x)$ be a polynomial of degree $k$. We can write $p(x) = cx^k + s(x)$, where $s(x)$ is of degree $\le k - 1$. By the induction hypothesis, there is a polynomial $g(x)$ of degree $\le k$ such that $s(x) = g(x + 1) - g(x)$. On the other hand, the binomial theorem tells us that $(x + 1)^{k+1} = x^{k+1} + (k + 1)x^k + r(x)$, where $r(x)$ is a polynomial of degree $k - 1$ and hence $r(x) = h(x + 1) - h(x)$ for some polynomial $h(x)$ of degree $k$. Thus we have

$$x^k = \frac{1}{k + 1}\left[ (x + 1)^{k+1} - x^{k+1} - (h(x + 1) - h(x)) \right] = q(x + 1) - q(x),$$

where $q(x) = (x^{k+1} - h(x))/(k + 1)$ is a polynomial of degree $k + 1$. Hence

$$p(x) = c(q(x + 1) - q(x)) + (g(x + 1) - g(x)) = f(x + 1) - f(x),$$

where f(x) = cq(x) + g(x) is a polynomial of degree k + 1. The proof is complete.
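The proof is constructive and can be turned into a procedure. The sketch below is an iterative variant of the argument, not a literal transcription of it: it fixes the top coefficient of $f$ so that the leading term of $p$ is matched, subtracts the difference of that term using the binomial theorem, and repeats on the lower-degree remainder. Polynomials are stored as coefficient lists (entry `j` is the coefficient of $x^j$), and all names are illustrative.

```python
from fractions import Fraction
from math import comb


def shift_difference(f):
    """Coefficients of f(x + 1) - f(x), using (x+1)^m - x^m = sum_{j<m} C(m, j) x^j."""
    out = [Fraction(0)] * max(len(f) - 1, 1)
    for m, c in enumerate(f):
        for j in range(m):
            out[j] += c * comb(m, j)
    return out


def antidifference(p):
    """Return f with f(0) = 0 and f(x + 1) - f(x) = p(x)."""
    p = [Fraction(c) for c in p]
    f = [Fraction(0)] * (len(p) + 1)
    for k in range(len(p) - 1, -1, -1):
        c = p[k] / (k + 1)               # coefficient of x^(k+1) in f
        f[k + 1] = c
        # Subtract c * ((x+1)^(k+1) - x^(k+1)); this cancels the x^k term of p.
        for j in range(k + 1):
            p[j] -= c * comb(k + 1, j)
    return f


# p(x) = x^2 gives f(x) = x^3/3 - x^2/2 + x/6, matching the worked example.
assert antidifference([0, 0, 1]) == [0, Fraction(1, 6), Fraction(-1, 2), Fraction(1, 3)]
```

Applying `shift_difference` to the result of `antidifference(p)` should return the coefficients of `p`, which gives a quick consistency check for other inputs.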
