Texas A&M University - San Antonio
Concrete Mathematics
Concrete Mathematics: A Portfolio of Problems
Author: Sean Zachary Roberson
Supervisor: Prof. Donald Myers
July 1, 2014

1 Chapter 1: Recurrent Problems
Many problems in mathematics are recurrent problems, meaning that they are defined in terms of themselves. For example, one can find the greatest common divisor of two integers a and b by the following relation:
gcd(a, 0) = a;
gcd(a, b) = gcd(b, a mod b)
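This recurrence translates directly into a short recursive function; a minimal Python sketch (the function name and sample values are illustrative):

```python
def gcd(a, b):
    """Greatest common divisor via the Euclidean recurrence."""
    if b == 0:
        return a          # base case: gcd(a, 0) = a
    return gcd(b, a % b)  # reduce to a smaller instance

print(gcd(48, 36))  # -> 12
```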
Each successive step reduces the problem to a simpler case. Another recursive function is the factorial:
a_0 = 1;
a_1 = 1;
a_n = n a_{n−1}.
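The factorial recurrence unfolds the same way; a companion sketch:

```python
def factorial(n):
    """n! from the recurrence a_0 = 1, a_n = n * a_{n-1}."""
    if n == 0:
        return 1                 # base case
    return n * factorial(n - 1)  # unfold one step of the recurrence

print(factorial(5))  # -> 120
```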
Solving these recurrences can be done in many ways. One may be able to deduce a closed form after finding a pattern in the first few terms. Another method is to unfold the recurrence by successively plugging in previous terms until the base case is reached. Other methods include using the characteristic polynomial for linear recurrences, or using generating functions for general recurrences. Generating functions will be examined in depth in Chapter 7 (see the section titled Generating Functions).

In this section, we examine one well-known recurrent problem, known as the Josephus Problem. According to legend, Flavius Josephus and a group of Jewish rebels were captured by Romans. The rebels formed a circle and killed every third person in the circle until none remained. The main question of the Josephus problem is as follows: suppose n people, numbered from 1 to n, stand in a circle. If every other person is killed, which person is the last to survive?

To investigate this problem, suppose there are six people in the circle. The first person to die is numbered 2, the next is 4, followed by 6, 3, and 1. The person numbered 5 is the last one surviving. Let J(n) be the person left standing after the other n − 1 persons have been eliminated. We can then say J(6) = 5.

Now, let us suppose that we have 2n people in this circle at the start. The first person to die is number 2, then 4, and so on through every even-numbered person until the sword returns to person number 1. Now, only odd-numbered persons are left in the circle. This situation is similar to starting with n people, only this time their labels have been multiplied by 2 and had 1 subtracted; for example, person 5 in the smaller circle corresponds to person 9 in the original. So, a form for the survivor amongst an even-numbered group of people is J(2n) = 2J(n) − 1.
Using the fact that J(6) = 5, we can deduce that J(12) = 2J(6) − 1 = 9. We can then conclude that J(24) = 17, and J(48) = 33.
But what if there are an odd number of people from the start, say, 2n + 1? Execution begins as normal, with persons 2, 4, 6, 8, ..., 2n dying in the first trip around the circle, but now the next to die after 2n is person 1. The people left in the circle are 3, 5, 7, ..., 2n + 1. This is similar to the case with n people, as before, but now numbers are increased by 1 after they are doubled, not decreased by 1. Again, we have a form for determining the survivor in a group of 2n + 1 people; that is,
J(2n + 1) = 2J(n) + 1.
For example, J(15) = 2J(7) + 1 = 2(2J(3) + 1) + 1 = 15. Combining the previous two general forms with the base case J(1) = 1, we have the following recursion:
J(1) = 1;
J(2n) = 2J(n) − 1;
J(2n + 1) = 2J(n) + 1.
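The three-part recursion translates directly into code; a minimal Python sketch reproducing the values computed above:

```python
def J(n):
    """Survivor's number in the Josephus problem with step 2."""
    if n == 1:
        return 1
    if n % 2 == 0:
        return 2 * J(n // 2) - 1   # J(2n) = 2 J(n) - 1
    return 2 * J(n // 2) + 1       # J(2n+1) = 2 J(n) + 1

print(J(6), J(12), J(24), J(48))  # -> 5 9 17 33
```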
For this recursion, it is sufficient to divide the number of people in the circle by 2 and round down to the nearest integer. Repeated application will give the survivor’s number. What if we desire a closed-form solution? Such a solution is possible to construct. First, let us generate a short table of values of J(n).
n    | 1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16
J(n) | 1  1  3  1  3  5  7  1  3  5   7   9   11  13  15  1
Notice that the table is blocked in certain areas. This sparks some curious thought.
First, let’s see what happens at powers of two. Suppose k is an integer that is at least zero. Then what is J(2^k)? For k = 0, we have J(1) = 1, and for k = 1, we have J(2) = 1 (from the table). Now, assume that J(2^k) = 1 for all integers k up to j. Then, for k = j + 1, we have:
J(2^{j+1}) = 2J(2^j) − 1
           = 2(1) − 1
           = 1
as needed. Hence J(2^k) = 1 for every integer k greater than or equal to zero. What’s next to show? Let’s try to see what happens between powers of two, say, from 8 to 16. We see that J(8) = 1, J(9) = 3, J(10) = 5, and J(11) = 7. What’s happening? The value of the function increases by 2 as n increases by 1. But we also see that J(15) = 15 and J(16) = 1, so something happens in the transition from n = 2^j − 1 to n = 2^j. Observe that 15 = 2^3 + 7, while 16 = 2^4 + 0: at n = 16, a new power of two takes over as the largest power of two not exceeding n, and from the table this is exactly where a new block begins. We almost see a pattern. Let l = n − 2^q, where 2^q is the largest power of two not exceeding n, so that 0 ≤ l < 2^q. We conjecture that J(n) = 2l + 1. For n = 1, that is, 1 = 2^0 + 0, we have J(1) = 1, as needed. The case n = 2 is trivial, by the previous derivation, so we move to n = 3. Here, we write 3 = 2^1 + 1 and so J(3) = 2(1) + 1 = 3. Now assume the conjecture holds for every n less than 2^{q+1}; we induct on q. For n = 2^{q+1} + l, we must first consider even l. So, suppose l is even. Then
J(2^{q+1} + l) = 2J(2^q + l/2) − 1
               = 2(2(l/2) + 1) − 1
               = 2l + 1
as needed. Otherwise, when l is odd (say, l = 2p + 1), the induction step is as follows:
J(2^{q+1} + 2p + 1) = 2J(2^q + p) + 1
                    = 2(2p + 1) + 1
                    = 2(l − 1) + 3
                    = 2l + 1
From the first line to the second, we applied the inductive hypothesis to 2^q + p: since p = (l − 1)/2 < 2^q, the hypothesis gives J(2^q + p) = 2p + 1, regardless of the parity of p. So, for odd l, J(2^{q+1} + l) = 2l + 1.
We combine this with the previous result to give a suitable closed form solution for the Josephus problem:
J(n) = 2l + 1
where n = 2^q + l and 0 ≤ l < 2^q.
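The closed form can be cross-checked against the recursion; a small Python sketch, using `int.bit_length` to locate the largest power of two not exceeding n:

```python
def J_closed(n):
    """Closed form: n = 2**q + l with 0 <= l < 2**q, then J(n) = 2l + 1."""
    q = n.bit_length() - 1   # exponent of the largest power of two <= n
    l = n - (1 << q)
    return 2 * l + 1

def J(n):
    """Recursive definition, for comparison."""
    if n == 1:
        return 1
    return 2 * J(n // 2) - 1 if n % 2 == 0 else 2 * J(n // 2) + 1

assert all(J_closed(n) == J(n) for n in range(1, 1025))
print(J_closed(100))  # -> 73, since 100 = 64 + 36
```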
8. Solve the recurrence
Q_0 = α;
Q_1 = β;
Q_n = (1 + Q_{n−1}) / Q_{n−2}.
Solution.
The recurrence doesn’t seem to have any recognizable pattern at first, so the best way to proceed is to start finding terms using the initial conditions.
Q_2 = (1 + Q_1)/Q_0 = (1 + β)/α

Q_3 = (1 + Q_2)/Q_1 = (1 + (1 + β)/α)/β = (1 + α + β)/(αβ)

Q_4 = (1 + Q_3)/Q_2 = ((1 + α)(1 + β)/(αβ)) · (α/(1 + β)) = (1 + α)/β

Q_5 = (1 + Q_4)/Q_3 = ((1 + α + β)/β) · (αβ/(1 + α + β)) = α
But, note that Q_5 = Q_0. One more step gives Q_6 = (1 + Q_5)/Q_4 = (1 + α) · β/(1 + α) = β = Q_1. Since each term is determined by the two terms before it, the whole sequence now repeats: Q_{n+5} = Q_n for all n ≥ 0. Hence the sequence is periodic with period 5, and the solution is read off from the residue of n modulo 5:

Q_n = α,                if n mod 5 = 0;
Q_n = β,                if n mod 5 = 1;
Q_n = (1 + β)/α,        if n mod 5 = 2;
Q_n = (1 + α + β)/(αβ), if n mod 5 = 3;
Q_n = (1 + α)/β,        if n mod 5 = 4.
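This periodicity can be confirmed in exact arithmetic; a quick Python sketch using `Fraction`, with arbitrary nonzero sample values standing in for α and β:

```python
from fractions import Fraction

alpha, beta = Fraction(3), Fraction(5)   # arbitrary nonzero sample values
Q = [alpha, beta]
for n in range(2, 12):
    Q.append((1 + Q[n - 1]) / Q[n - 2])  # Q_n = (1 + Q_{n-1}) / Q_{n-2}

assert Q[5] == Q[0] and Q[6] == Q[1]
assert Q[0:5] == Q[5:10]                 # the sequence repeats with period 5
print(Q[0:5])
```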
2 Chapter 2: Sums
In mathematics, sums appear in almost every branch. For example, they appear in linear algebra when computing the (i, j) entry of a matrix product:

c_{ij} = Σ_{k=1}^{n} a_{ik} b_{kj}

where the sum is across k. Also, they appear in calculus and analysis when approximating a function with a polynomial:
T(x) = Σ_{j=0}^{∞} (f^{(j)}(a)/j!) (x − a)^j

But how do sums really work? How can one manipulate a sum and get a closed form? There are many ways to view a sum. First, one can view a sum as a recurrence. Suppose, for the sake of example, that S_n = Σ_{j=0}^{n} a_j, where the terms a_j come from a sequence. Observe that the sum can also be written as the following recurrence:
S_0 = a_0;
S_n = S_{n−1} + a_n.
This recurrence says that each successive value of the sum depends on the previous terms. There are many well-known sums to keep in mind. Suppose one wishes to sum the first n integers in succession; that is, what is 1+2+3+...+n? There are many ways to solve this problem. Perhaps one of the most well-known ways to find this sum is by adding a copy of the sum to itself. This technique is attributed to Gauss, and is told through a story. Following Gauss’ method, we see that, after first calling our sum S,
2S = (1 + 2 + 3 + ... + n) + (n + (n − 1) + ... + 2 + 1)
   = (n + 1) + (n + 1) + ... + (n + 1)    [n terms, pairing the two copies term by term]
   = n(n + 1)
And from this, we deduce that S = n(n + 1)/2. These numbers are called the triangular numbers. Using the notation above, we can also write the sum as S = Σ_{j=0}^{n} j = n(n + 1)/2.

Working with sums may require trickery sometimes. For example, it may be advantageous to change the index of summation to force a form already known. In the sum
Σ_{j=3}^{7} j

we can make the substitution j = n + 2 so that the sum transforms to

Σ_{n=1}^{5} (n + 2)
and this sum is easier to evaluate. Because sums arise so often, particular sums have been given names, like the triangular numbers. One other special sum gives the square pyramidal numbers, which come from the sum of the first n integer squares:
P_n = Σ_{j=1}^{n} j^2 = n(n + 1)(2n + 1)/6
There is also a name for the sum of the first n unit fractions, the harmonic numbers.
H_n = Σ_{j=1}^{n} 1/j
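These named sums are easy to compute and sanity-check; a brief Python sketch (exact rationals keep the harmonic numbers exact):

```python
from fractions import Fraction

def triangular(n):
    return n * (n + 1) // 2

def pyramidal(n):
    return n * (n + 1) * (2 * n + 1) // 6

def harmonic(n):
    return sum(Fraction(1, j) for j in range(1, n + 1))

n = 10
assert triangular(n) == sum(range(1, n + 1))
assert pyramidal(n) == sum(j * j for j in range(1, n + 1))
print(harmonic(4))  # -> 25/12
```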
So, sums do appear often in mathematics, but sometimes more tools are needed, other than expanding the sum, adding it to itself, and re-indexing. What else can be used to evaluate a sum? Fortunately, there is a powerful tool available, involving a discrete analogue to calculus. In calculus, one can differentiate functions by the following definition:
f′(x) = lim_{h→0} (f(x + h) − f(x))/h
The finite calculus has a similar operator, the difference operator. Just like the derivative of infinite calculus, the difference operator ∆ shows change. It is defined as:
∆f(x) = f(x + 1) − f(x)
The difference operator can be thought of as the differential operator in which h = 1. So, how can we use the difference operator? It works just like differentiation. For example, what is ∆(x^2)? By the definition, it is (x + 1)^2 − x^2 = 2x + 1. In infinite calculus, polynomials differentiate well by the power rule. Finite calculus doesn’t do the same sort of thing with ordinary polynomials, as seen above. A new type of power can be defined that transforms well under the difference operator. This new power is the falling power, defined as

x^{\underline{m}} = x(x − 1)(x − 2) ... (x − m + 1)    (m factors)
It is left to the reader to verify that ∆(x^{\underline{m}}) = m x^{\underline{m−1}}. Just as the differential operator D of infinite calculus has an “inverse” operator ∫, the difference operator ∆ has an “inverse,” the sum (or anti-difference) operator Σ. So, just as we say

g(x) = D f(x) ⇐⇒ ∫ g(x) dx = f(x) + C
we can say

g(x) = ∆f(x) ⇐⇒ Σ g(x) δx = f(x) + C
where Σ g(x) δx describes all functions whose difference is g(x). Also, the C that comes as a result of indefinite summation need not be a constant; it can be any function φ satisfying φ(x + 1) = φ(x).
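The falling-power rule ∆(x^{\underline{m}}) = m x^{\underline{m−1}} can be checked numerically; a small Python sketch (function names are illustrative):

```python
def falling(x, m):
    """Falling power: x(x-1)...(x-m+1), with m factors."""
    result = 1
    for i in range(m):
        result *= x - i
    return result

def delta(f):
    """Difference operator: (delta f)(x) = f(x+1) - f(x)."""
    return lambda x: f(x + 1) - f(x)

# ordinary powers do not difference neatly: delta(x^2) at x gives 2x + 1
assert delta(lambda x: x * x)(3) == 2 * 3 + 1

# falling powers obey the discrete power rule
m = 4
df = delta(lambda x: falling(x, m))
assert all(df(x) == m * falling(x, m - 1) for x in range(-5, 10))
```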
Definite summation can be defined like definite integration. We have

Σ_{a}^{b} g(x) δx = f(x) |_{a}^{b} = f(b) − f(a).

Many of the usual differentiation rules carry over to the finite calculus. The difference operator is linear, so it preserves addition and scalar multiplication. That is,
∆(αf + βg) = α∆(f) + β∆(g).
A product rule also exists:
∆(u(x)v(x)) = u(x + 1)v(x + 1) − u(x)v(x)
= u(x + 1)v(x + 1) − u(x)v(x + 1) + u(x)v(x + 1) − u(x)v(x)
= v(x + 1)(u(x + 1) − u(x)) + u(x)(v(x + 1) − v(x))
= v(x + 1)∆u(x) + u(x)∆v(x)
= Ev(x)∆u(x) + u(x)∆v(x)
where Ef(x) = f(x + 1). Now, just as integration by parts can be derived by the product rule of infinite calculus, a discrete analogue of this method exists in the finite calculus. The following steps are true:
∆(u(x)v(x)) = Ev(x)∆u(x) + u(x)∆v(x)
⇐⇒ Σ ∆(u(x)v(x)) δx = Σ (Ev(x)∆u(x) + u(x)∆v(x)) δx
⇐⇒ u(x)v(x) = Σ u(x)∆v(x) δx + Σ Ev(x)∆u(x) δx
⇐⇒ Σ u(x)∆v(x) δx = u(x)v(x) − Σ Ev(x)∆u(x) δx
The last line gives the summation by parts formula:
Σ u(x)∆v(x) δx = u(x)v(x) − Σ Ev(x)∆u(x) δx
A proof of this formula using rules of summation is given at the end of this section. What follows are example sums that can be evaluated with summation by parts. First, examine the sum
Σ H_x δx.
This is an indefinite sum of harmonic numbers. The harmonic numbers are the discrete analogue of the natural logarithm. To see this, consider the integral

∫ ln x dx.
With the choices u = ln x, dv = dx, integration by parts gives x ln x − x + C. The same reasoning is used for this sum: let u = H_x and ∆v = 1, so that v = x, Ev = x + 1, and ∆u = H_{x+1} − H_x = 1/(x + 1). We then have:
Σ H_x δx = xH_x − Σ (x + 1) · 1/(x + 1) δx
         = xH_x − Σ 1 δx
         = xH_x − x + C
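As a sanity check on this antidifference, the definite sum of H_x for x from 0 to n − 1 should equal nH_n − n, taking f(x) = xH_x − x. A small sketch in exact arithmetic:

```python
from fractions import Fraction

def H(n):
    """n-th harmonic number (H_0 = 0)."""
    return sum(Fraction(1, j) for j in range(1, n + 1))

n = 8
definite = sum(H(x) for x in range(n))   # H_x summed for x = 0 .. n-1
assert definite == n * H(n) - n          # f(n) - f(0) with f(x) = x H_x - x
```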
Another example involves exponential functions. What is
Σ x c^x δx,
where c is an integer other than 1? Again, consider this the discrete analogue of the integral

∫ x e^x dx,
whose result is x e^x − e^x + C. Again, proceed by summation by parts: let u = x and ∆v = c^x, so that v = c^x/(c − 1), Ev = c^{x+1}/(c − 1), and ∆u = 1. Then the sum is now:
Σ x c^x δx = x c^x/(c − 1) − Σ c^{x+1}/(c − 1) δx
           = x c^x/(c − 1) − c^{x+1}/(c − 1)^2 + C
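Again a definite-sum check confirms the antidifference: with f(x) = x c^x/(c − 1) − c^{x+1}/(c − 1)^2, summing x c^x for x from 0 to n − 1 should give f(n) − f(0). A sketch with the sample value c = 3:

```python
from fractions import Fraction

def f(x, c):
    """Candidate antidifference of x * c**x."""
    c = Fraction(c)
    return x * c**x / (c - 1) - c**(x + 1) / (c - 1)**2

c, n = 3, 7
definite = sum(x * c**x for x in range(n))  # x c^x summed for x = 0 .. n-1
assert definite == f(n, c) - f(0, c)
```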
What follows is an alternative proof of summation by parts.
11. The general rule (2.56) for summation by parts is equivalent to
Σ_{0≤k<n} (a_{k+1} − a_k) b_k = a_n b_n − a_0 b_0 − Σ_{0≤k<n} a_{k+1} (b_{k+1} − b_k),

for n ≥ 0. Prove this formula directly by using the distributive, associative, and commutative laws.

Solution.

Before proceeding, note that the procedure is similar to finding an expression for the derivative of a product. That is, a cross-term (in this case, a_{k+1}b_{k+1}) must be added and subtracted. Now, the summand can be written as:

(a_{k+1} − a_k) b_k = a_{k+1} b_k − a_k b_k
                    = a_{k+1} b_k − a_k b_k + a_{k+1} b_{k+1} − a_{k+1} b_{k+1}
                    = −a_{k+1} (b_{k+1} − b_k) + a_{k+1} b_{k+1} − a_k b_k

Summing from k = 0 to n − 1, observe that the last two terms form a telescoping sum, so what remains of them is a_n b_n − a_0 b_0. Combining this with the new sum formed gives the result.

3 Chapter 3: Integer Functions

There are two functions that are used in computer science for estimates. These functions, known as the floor and ceiling, allow a person to force any real number to an integer. Both these functions are related by inequalities; this relationship will be explained below. The floor function maps a real number x to the largest integer that does not exceed x. In short, it rounds the number down. The floor of x is denoted ⌊x⌋. For example, ⌊3.5⌋ = 3 and ⌊12.9999⌋ = 12. The ceiling function behaves similarly. It maps a real number x to the smallest integer that is not less than x. The ceiling of x is denoted by ⌈x⌉. In short, it rounds a number up. For example, ⌈3.14⌉ = 4 and ⌈2.00001⌉ = 3. The floor and ceiling functions are related by a series of inequalities. The first should be easy to see:

⌊x⌋ ≤ x ≤ ⌈x⌉

This inequality can actually be extended in both directions by adding x − 1 on the lower end and x + 1 on the upper end:

x − 1 < ⌊x⌋ ≤ x ≤ ⌈x⌉ < x + 1

Observe that there is strict inequality at the ends, while equality can occur in the middle terms. This equality is met exactly when x is an integer.
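Python’s `math.floor` and `math.ceil` implement these two functions, so the examples and the chain of inequalities can be checked directly; a brief sketch:

```python
import math

print(math.floor(3.5), math.floor(12.9999))  # -> 3 12
print(math.ceil(3.14), math.ceil(2.00001))   # -> 4 3

# x - 1 < floor(x) <= x <= ceil(x) < x + 1, with equality only at integers
for x in (2.7, -1.3, 5.0):
    assert x - 1 < math.floor(x) <= x <= math.ceil(x) < x + 1
assert math.floor(5.0) == math.ceil(5.0) == 5
```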
There are a series of equivalences that involve floors and ceilings; these can be used to simplify expressions in computations:

⌊x⌋ = m ⇐⇒ m ≤ x < m + 1
⌈x⌉ = n ⇐⇒ n − 1 < x ≤ n
⌊x⌋ = m ⇐⇒ x − 1 < m ≤ x
⌈x⌉ = n ⇐⇒ x ≤ n < x + 1

Manipulating sums with floors and ceilings takes extra work. For example, consider the sum

Σ_{k=0}^{m} ⌊√k⌋

If a table of values of ⌊√k⌋ were to be created, one would find that ⌊√k⌋ takes the constant value n on each interval [n^2, (n + 1)^2 − 1]. In order to find the desired sum, we must first count the number of times each value of ⌊√k⌋ appears. Each full interval contains 2n + 1 integers, so each interval, except possibly the last, contributes n(2n + 1) to the sum. The last interval may be cut short at m, so its contribution is instead ⌊√m⌋(m − ⌊√m⌋^2 + 1). The remaining sum is as follows:

Σ_{k=0}^{⌊√m⌋−1} k(2k + 1) = n(n + 1)(2n + 1)/3 + n(n + 1)/2

where n = ⌊√m⌋ − 1. The final sum is then

Σ_{k=0}^{m} ⌊√k⌋ = n(n + 1)(2n + 1)/3 + n(n + 1)/2 + (n + 1)(m − (n + 1)^2 + 1).

4 Chapter 4: Number Theory

The study of number theory is centered around the integers. There are many functions whose domain is the set of integers. In number theory, some important functions include the Euler phi function φ(n) and the Möbius mu function µ(n). This section will focus on these two functions and their relationship to one another. The Euler phi function counts the number of positive integers less than n that are relatively prime to n. So, if S = {m ∈ N | gcd(m, n) = 1 and m < n}, then φ(n) = |S|. For small values of n, one can find φ(n) by counting the integers that satisfy the condition. For example, φ(6) = 2 and φ(11) = 10. For larger numbers, it may be more difficult to count the integers relatively prime to n and less than n (unless a computer is programmed to do so). Before attempting to find a way to compute the number of integers less than and relatively prime to n, it would be wise to first observe what happens when n is prime, or a prime power. So, suppose p is a prime.
The integers less than p are {1, 2, 3, . . . , p − 1}. But which of these are relatively prime to p? Recall that a prime has only two divisors, 1 and itself (these are called the trivial divisors). Since p is prime, none of the integers from 1 to p − 1 share a nontrivial factor with p, so φ(p) = p − 1. What about φ(p^2)? A similar argument can be used. Consider the set

S = {1, 2, . . . , p − 1, p, . . . , 2p, . . . , p(p − 1), . . . , p^2}

Note that p divides p^2, so the only members of S not relatively prime to p^2 are the multiples of p, namely p, 2p, . . . , p(p − 1), p^2; there are p of these. The phi function counts the integers relatively prime to p^2, so we must have φ(p^2) = p^2 − p. In general, we have, for prime p, φ(p^n) = p^n − p^{n−1}. There are two ways to use the phi function without directly counting the integers that meet the requirement. One way is to use a product formula:

φ(n) = n Π_{p|n} (p − 1)/p

where the product is taken over all primes p that divide n. A derivation of this formula is omitted. A second way is to use the fact that the phi function is multiplicative. That is, when m and n are relatively prime, then

φ(mn) = φ(m)φ(n).

This helps in breaking the argument into smaller factors for which the values of the function are easily known. For example:

φ(78) = φ(2)φ(39) = φ(2)φ(3)φ(13) = 1 · 2 · 12 = 24.

The Möbius function is a function that depends on the prime factorization of an integer n. It is defined as follows:

µ(n) = 1,        if n = 1;
µ(n) = (−1)^r,   if n is a product of r distinct primes;
µ(n) = 0,        if n is not squarefree.

For example, µ(6) = 1 and µ(36) = 0. It is important to note that if one sums the values of µ(d) over the divisors d of n, the sum will be zero except when n = 1. That is,

Σ_{d|n} µ(d) = [n = 1].

For example,

Σ_{d|16} µ(d) = µ(1) + µ(2) + µ(4) + µ(8) + µ(16) = 1 − 1 + 0 + 0 + 0 = 0.

Just like the phi function, the mu function is multiplicative. Again, when m and n are relatively prime, µ(mn) = µ(m)µ(n). Both the phi and mu functions are examples of arithmetic functions.
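Both functions are small enough to compute by brute force for modest n; a Python sketch (the trial-division loop for µ is just one simple approach):

```python
from math import gcd

def phi(n):
    """Euler phi: count of m in [1, n] with gcd(m, n) = 1."""
    return sum(1 for m in range(1, n + 1) if gcd(m, n) == 1)

def mu(n):
    """Moebius mu, via trial-division factorization."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0        # a squared prime factor: not squarefree
            result = -result    # one more distinct prime factor
        d += 1
    return -result if n > 1 else result

assert phi(6) == 2 and phi(11) == 10
assert phi(78) == phi(2) * phi(3) * phi(13) == 24
assert mu(6) == 1 and mu(36) == 0
assert sum(mu(d) for d in (1, 2, 4, 8, 16)) == 0   # divisors of 16
```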
These functions respect multiplication on relatively prime arguments, much like a homomorphism. A special product can be defined with arithmetic functions, using a sum. This product is called the Dirichlet convolution, written as (f ∗ g)(n), where f and g are arithmetic functions. The convolution is defined as follows:

(f ∗ g)(n) = Σ_{d|n} f(d) g(n/d)

Two equivalent convolutions are of great importance. The Möbius inversion formula, due to Richard Dedekind and Joseph Liouville, states that

g(n) = Σ_{d|n} f(d) ⇐⇒ f(n) = Σ_{d|n} µ(d) g(n/d).

In the notation of the Dirichlet convolution, the inversion formula can be written as g = f ∗ 1 ⇐⇒ f = µ ∗ g, where 1, as a function, is the constant function in which every integer is mapped to the number 1. Now, the phi function has a special property: when φ(d) is summed over all divisors d of the argument n, the value of the sum is n. That is,

Σ_{d|n} φ(d) = n

Now, take g(n) = n in the Möbius inversion formula (or, equivalently, in the Dirichlet convolution) to see that

φ(n) = Σ_{d|n} µ(d) · n/d.

This gives a relationship between both functions under the convolution.

5 Chapter 5: Binomial Coefficients

The binomial coefficients come from combinatorics, a branch of mathematics involving counting. They arise in the problem of taking a certain number of objects from a larger set. We define them as follows:

C(n, k) = n!/(k!(n − k)!)

where n! is the factorial of n, and C(n, k) is read “n choose k”. This formula applies when 0 ≤ k ≤ n; by convention, if k is negative or exceeds n, the value of the coefficient is zero. There are many identities involving the binomial coefficients. What if k were replaced with n − k? Then we have:

C(n, n − k) = n!/((n − k)!(n − (n − k))!) = n!/((n − k)! k!) = C(n, k)

so the binomial coefficients admit symmetry.
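The factorial definition and the symmetry identity are easy to check numerically; a short sketch, using Python's `math.comb` as an independent reference:

```python
from math import comb, factorial

def binom(n, k):
    """n choose k, taken to be zero outside 0 <= k <= n."""
    if k < 0 or k > n:
        return 0
    return factorial(n) // (factorial(k) * factorial(n - k))

n = 10
assert all(binom(n, k) == comb(n, k) for k in range(n + 1))
assert all(binom(n, k) == binom(n, n - k) for k in range(n + 1))  # symmetry
assert binom(n, n + 1) == 0 and binom(n, -1) == 0
```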
There is also an identity involving the sum of two binomial coefficients, but it requires a bit of thought:

C(n, k − 1) + C(n, k) = C(n + 1, k)

This is Pascal’s identity. The identity says the number of ways to choose k − 1 objects from n, plus the number of ways to choose k from those same n, is equal to the number of ways to select k objects from n + 1. This identity can be expressed in fewer words by referring to Pascal’s triangle, a visual representation of the binomial coefficients. When written out, Pascal’s identity says that an entry is equal to the sum of the two entries above it. There are many more identities involving binomial coefficients; too many to be explained in this paper. They will be referenced in problems, and it is left to the reader to prove the truth of these identities. Why are these numbers called binomial coefficients? They arise in the binomial theorem, the expansion of binomials to integer powers. The binomial theorem is as follows:

(x + y)^n = Σ_{j=0}^{n} C(n, j) x^j y^{n−j}

For example,

(x + y)^2 = x^2 + 2xy + y^2
(x + y)^3 = x^3 + 3x^2 y + 3xy^2 + y^3
(x + y)^4 = x^4 + 4x^3 y + 6x^2 y^2 + 4xy^3 + y^4

The coefficients on each term represent a binomial coefficient, or, from a counting perspective, the number of ways to arrange the x’s and y’s, where the exponent on each determines how many of each variable there are. Here’s one example of a sum involving binomial coefficients.

Example.

Find the sum:

Σ_{k=0}^{n} C(m, k)/C(n, k)

Solution.

The summand is a quotient of binomial coefficients. To simplify the summand, we use the “trinomial revision” identity:

C(n, m) C(m, k) = C(n, k) C(n − k, m − k)

This almost matches the form of the summand. Dividing both sides of the identity by C(n, k) C(n, m) gives the desired form. So, the sum can now be written as

Σ_{k=0}^{n} C(n − k, m − k)/C(n, m).

In fact, the denominator can be taken outside the sum, since it does not depend on the counter k. The indices of summation can be changed to one simple condition, k ≥ 0, since when k exceeds m, the binomial coefficient C(n − k, m − k) will be zero.
We now evaluate the simpler sum

Σ_{k≥0} C(n − k, m − k).

This summand still needs to be simplified. The next step is to replace k with m − k, and the sum is now

Σ_{m−k≥0} C(n − (m − k), m − (m − k))

or, equivalently,

Σ_{k≤m} C(n − m + k, k).

The sum is almost evaluated fully. The last trick uses the parallel summation rule:

Σ_{k≤n} C(r + k, k) = C(r + n + 1, n)

So, letting r = n − m, we obtain

Σ_{k≤m} C(n − m + k, k) = C(n + 1, m).

Restoring the factor of C(n, m) taken outside, the original sum is C(n + 1, m)/C(n, m) = (n + 1)/(n + 1 − m).
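Both the parallel-summation step and the value of the original sum can be verified numerically; a sketch with the sample values n = 12, m = 5, using exact rationals for the quotients:

```python
from fractions import Fraction
from math import comb

n, m = 12, 5

# parallel summation with r = n - m
assert sum(comb(n - m + k, k) for k in range(m + 1)) == comb(n + 1, m)

# the original sum, evaluated directly, matches C(n+1, m) / C(n, m)
direct = sum(Fraction(comb(m, k), comb(n, k)) for k in range(n + 1))
assert direct == Fraction(comb(n + 1, m), comb(n, m))
assert direct == Fraction(n + 1, n + 1 - m)
```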