Chapter 4

Some Counting Problems; Multinomial Coefficients, The Inclusion-Exclusion Principle, Sylvester’s Formula, The Sieve Formula

4.1 Counting Permutations and Functions

In this short section, we consider some simple counting problems.

Let us begin with permutations. Recall that a permutation of a set, $A$, is any bijection between $A$ and itself.

If $A$ is a finite set with $n$ elements, we mentioned earlier (without proof) that $A$ has $n!$ permutations, where the factorial function, $n \mapsto n!$ ($n \in \mathbb{N}$), is given recursively by:
\[
0! = 1, \qquad (n+1)! = (n+1)\, n!.
\]

The reader should check that the existence of the function, $n \mapsto n!$, can be justified using the Recursion Theorem (Theorem 2.5.1).

Proposition 4.1.1 The number of permutations of a set of n elements is n!.
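The recursive definition of the factorial and the count in Proposition 4.1.1 can be checked on small cases. The following is a minimal sketch in Python; the function names `factorial` and `count_permutations` are illustrative choices, not notation from the text, and the brute-force count is only practical for small n.

```python
from itertools import permutations


def factorial(n: int) -> int:
    """Return n!, following 0! = 1 and (n + 1)! = (n + 1) * n!."""
    result = 1  # 0! = 1
    for i in range(n):
        result *= i + 1  # (i + 1)! = (i + 1) * i!
    return result


def count_permutations(n: int) -> int:
    """Count the bijections of an n-element set by listing them all."""
    return sum(1 for _ in permutations(range(n)))


for n in range(6):
    print(n, factorial(n), count_permutations(n))
```

The two columns printed for each n should agree, which is exactly the statement of the proposition for these small sets.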

Let us also count the number of functions between two finite sets.

Proposition 4.1.2 If $A$ and $B$ are finite sets with $|A| = m$ and $|B| = n$, then the set of functions, $B^A$, from $A$ to $B$ has $n^m$ elements.

As a corollary, we determine the cardinality of a finite power set.

Corollary 4.1.3 For any finite set, $A$, if $|A| = n$, then $|2^A| = 2^n$.
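Both counts can be confirmed by enumeration. This sketch assumes the sets are represented as `range(m)` and `range(n)`; the helper names are hypothetical and chosen only for this illustration.

```python
from itertools import combinations, product


def count_functions(m: int, n: int) -> int:
    """Count functions from an m-element set to an n-element set."""
    # A function is one choice among n images for each of the m arguments.
    return sum(1 for _ in product(range(n), repeat=m))


def count_subsets(n: int) -> int:
    """Count all subsets of an n-element set, grouped by size k."""
    return sum(1 for k in range(n + 1) for _ in combinations(range(n), k))


print(count_functions(3, 4), 4 ** 3)
print(count_subsets(5), 2 ** 5)
```

Each printed pair should consist of two equal numbers, matching $n^m$ and $2^n$ respectively.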

Computing the value of the factorial function for a few inputs, say $n = 1, 2, \ldots, 10$, shows that it grows very fast. For example, $10! = 3{,}628{,}800$.

Is it possible to quantify how fast factorial grows compared to other functions, say $n^n$ or $e^n$?

Remarkably, the answer is yes. A beautiful formula due to James Stirling (1692-1770) tells us that
\[
n! \sim \sqrt{2\pi n}\left(\frac{n}{e}\right)^{n},
\]
which means that
\[
\lim_{n \to \infty} \frac{n!}{\sqrt{2\pi n}\left(\frac{n}{e}\right)^{n}} = 1.
\]

Figure 4.1: Jacques Binet, 1786-1856

Here, of course,
\[
e = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots + \frac{1}{n!} + \cdots
\]
is the base of the natural logarithm.

It is even possible to estimate the error. It turns out that
\[
n! = \sqrt{2\pi n}\left(\frac{n}{e}\right)^{n} e^{\lambda_n},
\]
where
\[
\frac{1}{12n+1} < \lambda_n < \frac{1}{12n},
\]
a formula due to Jacques Binet (1786-1856).
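Stirling's formula and the error bounds can be checked numerically. The sketch below is an illustration only: it computes the exponent in the error factor (called `lam` here) as the difference of logarithms, using `math.lgamma(n + 1)` for ln(n!), and the helper names are assumptions of this example rather than notation from the text.

```python
import math


def stirling(n: int) -> float:
    """Stirling's approximation sqrt(2 pi n) * (n / e)^n."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n


def binet_exponent(n: int) -> float:
    """Return ln(n!) - ln(stirling(n)), computed with logarithms."""
    log_stirling = 0.5 * math.log(2 * math.pi * n) + n * (math.log(n) - 1)
    return math.lgamma(n + 1) - log_stirling


for n in (1, 5, 10, 20):
    ratio = math.factorial(n) / stirling(n)
    lam = binet_exponent(n)
    # The last column checks the bounds 1/(12n+1) < lam < 1/(12n).
    print(n, ratio, 1 / (12 * n + 1) < lam < 1 / (12 * n))
```

The ratio column should approach 1 from above as n grows, and the last column should read `True` for each n tried.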

Let us introduce some notation used for comparing the rate of growth of functions. We begin with the “Big oh” notation.

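Given any two functions, $f\colon \mathbb{N} \to \mathbb{R}$ and $g\colon \mathbb{N} \to \mathbb{R}$, we say that $f$ is $O(g)$ (or $f(n)$ is $O(g(n))$) iff there is some $N > 0$ and a constant $c > 0$ such that
\[
|f(n)| \le c\,|g(n)|, \quad \text{for all } n \ge N.
\]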

In other words, for $n$ large enough, $|f(n)|$ is bounded by $c\,|g(n)|$. We sometimes write $n \gg 0$ to indicate that $n$ is “large.”

For example, $\lambda_n$ is $O\!\left(\frac{1}{12n}\right)$. By abuse of notation, we often write $f(n) = O(g(n))$ even though, strictly speaking, this equation does not make sense.

The “Big omega” notation means the following: $f$ is $\Omega(g)$ (or $f(n)$ is $\Omega(g(n))$) iff there is some $N > 0$ and a constant $c > 0$ such that
\[
|f(n)| \ge c\,|g(n)|, \quad \text{for all } n \ge N.
\]
The reader should check that $f(n)$ is $O(g(n))$ iff $g(n)$ is $\Omega(f(n))$.

We can combine $O$ and $\Omega$ to get the “Big theta” notation: $f$ is $\Theta(g)$ (or $f(n)$ is $\Theta(g(n))$) iff there is some $N > 0$ and some constants $c_1 > 0$ and $c_2 > 0$ such that
\[
c_1\,|g(n)| \le |f(n)| \le c_2\,|g(n)|, \quad \text{for all } n \ge N.
\]

Finally, the “Little oh” notation expresses the fact that a function, $f$, has much slower growth than a function $g$.

We say that $f$ is $o(g)$ (or $f(n)$ is $o(g(n))$) iff
\[
\lim_{n \to \infty} \frac{f(n)}{g(n)} = 0.
\]
For example, $\sqrt{n}$ is $o(n)$.

4.2 Counting Subsets of Size k; Binomial and Multinomial Coefficients

Let us now count the number of subsets of cardinality $k$ of a set of cardinality $n$, with $0 \le k \le n$. Denote this number by $\binom{n}{k}$ (say “$n$ choose $k$”). Actually, in the proposition below, it will be more convenient to assume that $k \in \mathbb{Z}$.

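Proposition 4.2.1 For all $n \in \mathbb{N}$ and all $k \in \mathbb{Z}$, if $\binom{n}{k}$ denotes the number of subsets of cardinality $k$ of a set of cardinality $n$, then
\[
\begin{aligned}
\binom{0}{0} &= 1 \\
\binom{n}{k} &= 0 \quad \text{if } k \notin \{0, 1, \ldots, n\} \\
\binom{n}{k} &= \binom{n-1}{k} + \binom{n-1}{k-1} \quad (n \ge 1,\ 0 \le k \le n).
\end{aligned}
\]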

The numbers $\binom{n}{k}$ are also called binomial coefficients, because they arise in the expansion of the binomial expression $(a + b)^n$, as we will see shortly.

The binomial coefficients can be computed inductively using the formula
\[
\binom{n}{k} = \binom{n-1}{k} + \binom{n-1}{k-1}
\]
(sometimes known as Pascal’s recurrence formula) by forming what is usually called Pascal’s triangle, which is based on the recurrence for $\binom{n}{k}$:

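\[
\begin{array}{c|ccccccccccc}
n & \binom{n}{0} & \binom{n}{1} & \binom{n}{2} & \binom{n}{3} & \binom{n}{4} & \binom{n}{5} & \binom{n}{6} & \binom{n}{7} & \binom{n}{8} & \binom{n}{9} & \binom{n}{10} \\
\hline
0 & 1 \\
1 & 1 & 1 \\
2 & 1 & 2 & 1 \\
3 & 1 & 3 & 3 & 1 \\
4 & 1 & 4 & 6 & 4 & 1 \\
5 & 1 & 5 & 10 & 10 & 5 & 1 \\
6 & 1 & 6 & 15 & 20 & 15 & 6 & 1 \\
7 & 1 & 7 & 21 & 35 & 35 & 21 & 7 & 1 \\
8 & 1 & 8 & 28 & 56 & 70 & 56 & 28 & 8 & 1 \\
9 & 1 & 9 & 36 & 84 & 126 & 126 & 84 & 36 & 9 & 1 \\
10 & 1 & 10 & 45 & 120 & 210 & 252 & 210 & 120 & 45 & 10 & 1 \\
\vdots & \vdots
\end{array}
\]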

Figure 4.2: Blaise Pascal, 1623-1662
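Pascal's recurrence translates directly into a procedure that builds the triangle row by row. The following is a small sketch, assuming Python 3.9 or later for the `list[list[int]]` annotation; `pascal_rows` is a name introduced here for illustration.

```python
def pascal_rows(n_max: int) -> list[list[int]]:
    """Return rows 0..n_max of Pascal's triangle using Pascal's recurrence."""
    rows = [[1]]  # row 0: the single entry C(0, 0) = 1
    for n in range(1, n_max + 1):
        prev = rows[-1]
        # Interior entries: C(n, k) = C(n-1, k-1) + C(n-1, k) for 0 < k < n.
        row = [1] + [prev[k - 1] + prev[k] for k in range(1, n)] + [1]
        rows.append(row)
    return rows


for row in pascal_rows(10):
    print(row)
```

Only additions are used, so the entries are obtained without computing any factorial.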

We can also give the following explicit formula for $\binom{n}{k}$ in terms of the factorial function:

Proposition 4.2.2 For all $n, k \in \mathbb{N}$, with $0 \le k \le n$, we have
\[
\binom{n}{k} = \frac{n!}{k!\,(n-k)!}.
\]

Then, it is very easy to see that
\[
\binom{n}{k} = \binom{n}{n-k}.
\]

Remarks:

(1) The binomial coefficients were already known in the twelfth century by the Indian scholar Bhaskara. Pascal’s triangle was taught back in 1265 by the Persian philosopher, Nasir-Ad-Din.

(2) The formula given in Proposition 4.2.2 suggests generalizing the definition of the binomial coefficients to upper indices taking real values.

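Indeed, for all $r \in \mathbb{R}$ and all $k \in \mathbb{Z}$, we can set
\[
\binom{r}{k} =
\begin{cases}
\dfrac{r^{\underline{k}}}{k!} = \dfrac{r(r-1)\cdots(r-k+1)}{k(k-1)\cdots 2 \cdot 1} & \text{if } k \ge 0 \\[2ex]
0 & \text{if } k < 0.
\end{cases}
\]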

Note that the expression in the numerator, $r^{\underline{k}}$, stands for the product of the $k$ terms
\[
\underbrace{r(r-1)\cdots(r-k+1)}_{k \text{ terms}}.
\]
By convention, the value of this expression is $1$ when $k = 0$, so that $\binom{r}{0} = 1$.

The expression $\binom{r}{k}$ can be viewed as a polynomial of degree $k$ in $r$. The generalized binomial coefficients allow for a useful extension of the binomial formula (see next) to real exponents.

However, beware that the symmetry identity fails when $r$ is not a natural number and that the formula in Proposition 4.2.2 (in terms of the factorial function) only makes sense for natural numbers.
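The generalized definition is easy to evaluate exactly for rational upper indices. Here is a minimal sketch using Python's `fractions.Fraction`; the name `gen_binom` is hypothetical, and restricting to rational r (rather than arbitrary reals) is an assumption made so that the arithmetic stays exact.

```python
from fractions import Fraction


def gen_binom(r, k: int) -> Fraction:
    """Generalized binomial coefficient r(r-1)...(r-k+1) / k! for integer k."""
    if k < 0:
        return Fraction(0)
    result = Fraction(1)  # empty product when k = 0
    for j in range(k):
        result *= Fraction(r) - j  # next factor of the falling product
        result /= j + 1            # next factor of k!
    return result


print(gen_binom(5, 2))                # ordinary binomial coefficient
print(gen_binom(Fraction(1, 2), 4))   # rational upper index
print(gen_binom(-1, 1), gen_binom(-1, -2))  # symmetry fails for r = -1
```

The last line illustrates the warning above: with r = -1 and k = 1, the value at k differs from the value at r - k = -2.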

We now prove the “binomial formula” (also called “binomial theorem”).

Proposition 4.2.3 (Binomial Formula) For all $n \in \mathbb{N}$ and for all reals, $a, b \in \mathbb{R}$, (or more generally, any two commuting variables $a, b$, i.e., satisfying $ab = ba$), we have the formula:
\[
(a+b)^n = a^n + \binom{n}{1} a^{n-1} b + \binom{n}{2} a^{n-2} b^2 + \cdots + \binom{n}{k} a^{n-k} b^k + \cdots + \binom{n}{n-1} a b^{n-1} + b^n.
\]
The above can be written concisely as
\[
(a+b)^n = \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k.
\]

Remark: The binomial formula can be generalized to the case where the exponent, $r$, is a real number (even negative). This result is usually known as the binomial theorem or Newton’s generalized binomial theorem.

Formally, the binomial theorem states that

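\[
(a+b)^r = \sum_{k=0}^{\infty} \binom{r}{k} a^{r-k} b^k, \quad r \in \mathbb{N} \ \text{or}\ |b/a| < 1.
\]
Observe that when $r$ is not a natural number, the right-hand side is an infinite sum and the condition $|b/a| < 1$ ensures that the series converges.

For example, when $a = 1$ and $r = 1/2$, if we rename $b$ as $x$, we get
\[
\begin{aligned}
(1+x)^{\frac{1}{2}} &= \sum_{k=0}^{\infty} \binom{\frac{1}{2}}{k} x^k \\
&= 1 + \sum_{k=1}^{\infty} \frac{1}{k!}\,\frac{1}{2}\left(\frac{1}{2} - 1\right)\cdots\left(\frac{1}{2} - k + 1\right) x^k \\
&= 1 + \sum_{k=1}^{\infty} (-1)^{k-1}\, \frac{1 \cdot 3 \cdot 5 \cdots (2k-3)}{2 \cdot 4 \cdot 6 \cdots 2k}\, x^k \\
&= 1 + \sum_{k=1}^{\infty} \frac{(-1)^{k-1} (2k)!}{2^{2k} (2k-1) (k!)^2}\, x^k \\
&= 1 + \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{2^{2k} (2k-1)} \binom{2k}{k} x^k \\
&= 1 + \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{2^{2k-1}}\, \frac{1}{k} \binom{2k-2}{k-1} x^k,
\end{aligned}
\]
which converges if $|x| < 1$.

The first few terms of this series are
\[
(1+x)^{\frac{1}{2}} = 1 + \frac{1}{2}x - \frac{1}{8}x^2 + \frac{1}{16}x^3 - \frac{5}{128}x^4 + \cdots
\]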

For $r = -1$, we get the familiar geometric series
\[
\frac{1}{1+x} = 1 - x + x^2 - x^3 + \cdots + (-1)^k x^k + \cdots,
\]
which converges if $|x| < 1$.

Remark: The numbers,
\[
C_n = \frac{1}{n+1} \binom{2n}{n},
\]
are the Catalan numbers. They are the solution of many counting problems in combinatorics.

Proposition 4.2.4 The number of injections between a set, $A$, with $m$ elements and a set, $B$, with $n$ elements, where $m \le n$, is given by
\[
\frac{n!}{(n-m)!} = n(n-1)\cdots(n-m+1).
\]

Counting the number of surjections between a set with $n$ elements and a set with $p$ elements, where $n \ge p$, is harder.

We state the following formula without giving a proof right now. Finding a proof of this formula is an interesting exercise.

We will give a quick proof using the Inclusion-Exclusion Principle in Section 4.4.

Proposition 4.2.5 The number of surjections, $S_{n\,p}$, between a set, $A$, with $n$ elements and a set, $B$, with $p$ elements, where $n \ge p$, is given by
\[
S_{n\,p} = p^n - \binom{p}{1}(p-1)^n + \binom{p}{2}(p-2)^n + \cdots + (-1)^{p-1}\binom{p}{p-1}.
\]

Remarks:

1. It can be shown that $S_{n\,p}$ satisfies the following peculiar version of Pascal’s recurrence formula:
\[
S_{n\,p} = p\,(S_{n-1\,p} + S_{n-1\,p-1}), \quad p \ge 2,
\]
and, of course, $S_{n\,1} = 1$ and $S_{n\,p} = 0$ if $p > n$.

Using this recurrence formula and the fact that $S_{n\,n} = n!$, simple expressions can be obtained for $S_{n+1\,n}$ and $S_{n+2\,n}$.

2. The numbers, $S_{n\,p}$, are intimately related to the so-called Stirling numbers of the second kind, denoted $\genfrac{\{}{\}}{0pt}{}{n}{p}$, $S(n, p)$, or $S_n^{(p)}$, which count the number of partitions of a set of $n$ elements into $p$ nonempty pairwise disjoint blocks (see Section 5.5). In fact,
\[
S_{n\,p} = p!\, \genfrac{\{}{\}}{0pt}{}{n}{p}.
\]

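The Stirling numbers, $\genfrac{\{}{\}}{0pt}{}{n}{p}$, satisfy a recurrence equation which is another variant of Pascal’s recurrence formula:
\[
\begin{aligned}
\genfrac{\{}{\}}{0pt}{}{n}{1} &= 1 \\
\genfrac{\{}{\}}{0pt}{}{n}{n} &= 1 \\
\genfrac{\{}{\}}{0pt}{}{n}{p} &= \genfrac{\{}{\}}{0pt}{}{n-1}{p-1} + p\, \genfrac{\{}{\}}{0pt}{}{n-1}{p} \quad (1 \le p < n).
\end{aligned}
\]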

There is a recurrence formula for the Bell numbers, $b_n$ (which count all the partitions of a set of $n$ elements), but it is complicated and not very useful because the formula for $b_{n+1}$ involves all the previous Bell numbers.

A good reference for all these special numbers is Graham, Knuth and Patashnik [9], Chapter 6.
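The surjection formula, its relation to the Stirling numbers of the second kind, and the two recurrences in the remarks above can be cross-checked on small cases. The sketch below is only an illustration: the function names are invented for this example, sets are modeled as `range(n)` and `range(p)`, and the brute-force count is feasible only for small n and p.

```python
from functools import lru_cache
from itertools import product
from math import comb, factorial


def surjections_formula(n: int, p: int) -> int:
    """Alternating sum p^n - C(p,1)(p-1)^n + ... + (-1)^(p-1) C(p,p-1)."""
    return sum((-1) ** k * comb(p, k) * (p - k) ** n for k in range(p))


def surjections_brute(n: int, p: int) -> int:
    """Count maps from range(n) to range(p) whose image is all of range(p)."""
    return sum(1 for f in product(range(p), repeat=n) if len(set(f)) == p)


@lru_cache(maxsize=None)
def stirling2(n: int, p: int) -> int:
    """Stirling numbers of the second kind via the Pascal-like recurrence."""
    if p < 1 or p > n:
        return 0
    if p == 1 or p == n:
        return 1
    return stirling2(n - 1, p - 1) + p * stirling2(n - 1, p)


for n in range(1, 6):
    print([surjections_formula(n, p) for p in range(1, n + 1)])
```

In each printed row the last entry should be n!, in line with the fact that a surjection between two sets of the same finite size is a bijection.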

Figure 4.3: Eric Temple Bell, 1883-1960 (left) and Donald Knuth, 1938- (right)

The binomial coefficients can be generalized as follows. For all $n, m, k_1, \ldots, k_m \in \mathbb{N}$, with $k_1 + \cdots + k_m = n$ and $m \ge 2$, we have the multinomial coefficient,
\[
\binom{n}{k_1 \cdots k_m},
\]
which counts the number of ways of splitting a set of $n$ elements into an ordered sequence of $m$ disjoint subsets, the $i$th subset having $k_i \ge 0$ elements.

Such sequences of disjoint subsets whose union is $\{1, \ldots, n\}$ itself are sometimes called ordered partitions.

Beware that some of the subsets in an ordered partition may be empty, so we feel that the terminology “partition” is confusing since, as we will see in Section 5.5, the subsets that form a partition are never empty.

Note that when $m = 2$, the number of ways of splitting a set of $n$ elements into two disjoint subsets where the first subset has $k_1$ elements and the second subset has $k_2 = n - k_1$ elements is precisely the number of subsets of size $k_1$ of a set of $n$ elements, that is
\[
\binom{n}{k_1\ k_2} = \binom{n}{k_1}.
\]

Observe that the order of the m subsets matters.

For example, for $n = 5$, $m = 4$, $k_1 = 2$ and $k_2 = k_3 = k_4 = 1$, the sequences of subsets
\[
\begin{gathered}
(\{1,2\}, \{3\}, \{4\}, \{5\}),\ (\{1,2\}, \{3\}, \{5\}, \{4\}),\ (\{1,2\}, \{5\}, \{3\}, \{4\}), \\
(\{1,2\}, \{4\}, \{3\}, \{5\}),\ (\{1,2\}, \{4\}, \{5\}, \{3\}),\ (\{1,2\}, \{5\}, \{4\}, \{3\})
\end{gathered}
\]
are all different and they correspond to the same partition, $\{\{1,2\}, \{3\}, \{4\}, \{5\}\}$.

Proposition 4.2.6 For all $n, m, k_1, \ldots, k_m \in \mathbb{N}$, with $k_1 + \cdots + k_m = n$ and $m \ge 2$, we have
\[
\binom{n}{k_1 \cdots k_m} = \frac{n!}{k_1! \cdots k_m!}.
\]
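The factorial formula can be compared against a direct count of ordered partitions. In this sketch (names are illustrative assumptions), an ordered partition of `range(n)` into m blocks is encoded as a tuple of block labels, one label per element, and the enumeration is only meant for small n.

```python
from itertools import product
from math import factorial


def multinomial(ks) -> int:
    """Multinomial coefficient n! / (k_1! ... k_m!) with n = sum(ks)."""
    result = factorial(sum(ks))
    for k in ks:
        result //= factorial(k)  # each division is exact
    return result


def count_ordered_partitions(ks) -> int:
    """Count labelings of n elements in which block i receives ks[i] elements."""
    n, m = sum(ks), len(ks)
    return sum(
        1
        for labels in product(range(m), repeat=n)
        if all(labels.count(i) == ks[i] for i in range(m))
    )


print(multinomial((2, 1, 1, 1)), count_ordered_partitions((2, 1, 1, 1)))
```

For the block sizes (2, 1, 1, 1) both numbers come out as 60, that is 5!/2!.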

As in the binomial case, it is convenient to set
\[
\binom{n}{k_1 \cdots k_m} = 0
\]
if $k_i < 0$ or $k_i > n$, for any $i$, with $1 \le i \le m$. Then, Proposition 4.2.1 is generalized as follows:

Proposition 4.2.7 For all $n, m, k_1, \ldots, k_m \in \mathbb{N}$, with $k_1 + \cdots + k_m = n$, $n \ge 1$ and $m \ge 2$, we have
\[
\binom{n}{k_1 \cdots k_m} = \sum_{i=1}^{m} \binom{n-1}{k_1 \cdots (k_i - 1) \cdots k_m}.
\]

Remark: Proposition 4.2.7 shows that Pascal’s triangle generalizes to “higher dimensions”, that is, to $m \ge 3$. Indeed, it is possible to give a geometric interpretation of Proposition 4.2.7 in which the multinomial coefficients corresponding to those $k_1, \ldots, k_m$ with $k_1 + \cdots + k_m = n$ lie on the hyperplane of equation $x_1 + \cdots + x_m = n$ in $\mathbb{R}^m$, and all the multinomial coefficients for which $n \le N$, for any fixed $N$, lie in a generalized tetrahedron called a simplex.

When $m = 3$, the multinomial coefficients for which $n \le N$ lie in a tetrahedron whose faces are the planes of equations, $x = 0$; $y = 0$; $z = 0$; and $x + y + z = N$.

We have also the following generalization of Proposition 4.2.3:

Proposition 4.2.8 (Multinomial Formula) For all $n, m \in \mathbb{N}$ with $m \ge 2$, for all pairwise commuting variables $a_1, \ldots, a_m$, we have
\[
(a_1 + \cdots + a_m)^n = \sum_{\substack{k_1, \ldots, k_m \ge 0 \\ k_1 + \cdots + k_m = n}} \binom{n}{k_1 \cdots k_m}\, a_1^{k_1} \cdots a_m^{k_m}.
\]

How many terms occur on the right-hand side of the multinomial formula?

After a moment of reflexion, we see that this is the number of finite multisets of size $n$ whose elements are drawn from a set of $m$ elements, which is also equal to the number of $m$-tuples, $k_1, \ldots, k_m$, with $k_i \in \mathbb{N}$ and $k_1 + \cdots + k_m = n$.

Proposition 4.2.9 The number of finite multisets of size $n \ge 0$ whose elements come from a set of size $m \ge 1$ is
\[
\binom{m+n-1}{n}.
\]

4.3 Some Properties of the Binomial Coefficients

The binomial coefficients satisfy many remarkable identities.

If one looks at the Pascal triangle, it is easy to figure out what the sums of the elements in any given row are.

It is also easy to figure out what the sums of $n - m + 1$ consecutive elements in any given column are (starting from the top and with $0 \le m \le n$).

What about the sums of elements on the diagonals? Again, it is easy to determine what these sums are.

Here are the answers, beginning with sums of the elements in a column.

(a) Sum of the first $n - m + 1$ elements in column $m$ ($0 \le m \le n$).

For example, if we consider the sum of the first 5 (nonzero) elements in column $m = 3$ (so, $n = 7$), we find that
\[
1 + 4 + 10 + 20 + 35 = 70,
\]
where 70 is the entry on the next row and the next column.

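\[
\begin{array}{c|ccccccccc}
n & \binom{n}{0} & \binom{n}{1} & \binom{n}{2} & \binom{n}{3} & \binom{n}{4} & \binom{n}{5} & \binom{n}{6} & \binom{n}{7} & \binom{n}{8} \\
\hline
0 & 1 \\
1 & 1 & 1 \\
2 & 1 & 2 & 1 \\
3 & 1 & 3 & 3 & 1 \\
4 & 1 & 4 & 6 & 4 & 1 \\
5 & 1 & 5 & 10 & 10 & 5 & 1 \\
6 & 1 & 6 & 15 & 20 & 15 & 6 & 1 \\
7 & 1 & 7 & 21 & 35 & 35 & 21 & 7 & 1 \\
8 & 1 & 8 & 28 & 56 & 70 & 56 & 28 & 8 & 1 \\
\vdots & \vdots
\end{array}
\]

Thus, we conjecture that
\[
\binom{m}{m} + \binom{m+1}{m} + \cdots + \binom{n-1}{m} + \binom{n}{m} = \binom{n+1}{m+1},
\]
which is easily proved by induction.

The above formula can be written concisely as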

\[
\sum_{k=m}^{n} \binom{k}{m} = \binom{n+1}{m+1},
\]
or even as
\[
\sum_{k=0}^{n} \binom{k}{m} = \binom{n+1}{m+1},
\]
since $\binom{k}{m} = 0$ when $k < m$.

(b) Sum of the elements in row $n$.

For example, if we consider the sum of the elements in row $n = 6$, we find that
\[
1 + 6 + 15 + 20 + 15 + 6 + 1 = 64 = 2^6.
\]

(Pascal's triangle, rows 0 to 8, repeated from above.)

Thus, we conjecture that
\[
\binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{n-1} + \binom{n}{n} = 2^n.
\]

This is easily proved by induction or by setting $a = b = 1$ in the binomial formula for $(a + b)^n$.

Unlike the columns for which there is a formula for the partial sums, there is no closed form formula for the partial sums of the rows.

However, there is a closed form formula for partial alternating sums of rows. Indeed, it is easily shown by induction that
\[
\sum_{k=0}^{m} (-1)^k \binom{n}{k} = (-1)^m \binom{n-1}{m},
\]
if $0 \le m \le n$. For example,
\[
1 - 7 + 21 - 35 = -20.
\]
Also, for $m = n$ (with $n \ge 1$), we get
\[
\sum_{k=0}^{n} (-1)^k \binom{n}{k} = 0.
\]

(c) Sum of the first $n + 1$ elements on the descending diagonal starting from row $m$.

For example, if we consider the sum of the first 5 elements starting from row $m = 3$ (so, $n = 4$), we find that
\[
1 + 4 + 10 + 20 + 35 = 70,
\]
the element on the next row below the last element, 35.

(Pascal's triangle, rows 0 to 8, repeated from above.)

Thus, we conjecture that
\[
\binom{m}{0} + \binom{m+1}{1} + \cdots + \binom{m+n}{n} = \binom{m+n+1}{n},
\]
which is easily shown by induction.

The above formula can be written concisely as
\[
\sum_{k=0}^{n} \binom{m+k}{k} = \binom{m+n+1}{n}.
\]

It is often called the parallel summation formula since it involves a sum over an index, $k$, appearing both in the upper and in the lower position of the binomial coefficient, $\binom{m+k}{k}$.

(d) Sum of the elements on the ascending diagonal starting from row $n$.

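\[
\begin{array}{c|c|ccccccccc}
n & F_{n+1} & \binom{n}{0} & \binom{n}{1} & \binom{n}{2} & \binom{n}{3} & \binom{n}{4} & \binom{n}{5} & \binom{n}{6} & \binom{n}{7} & \binom{n}{8} \\
\hline
0 & 1 & 1 \\
1 & 1 & 1 & 1 \\
2 & 2 & 1 & 2 & 1 \\
3 & 3 & 1 & 3 & 3 & 1 \\
4 & 5 & 1 & 4 & 6 & 4 & 1 \\
5 & 8 & 1 & 5 & 10 & 10 & 5 & 1 \\
6 & 13 & 1 & 6 & 15 & 20 & 15 & 6 & 1 \\
7 & 21 & 1 & 7 & 21 & 35 & 35 & 21 & 7 & 1 \\
8 & 34 & 1 & 8 & 28 & 56 & 70 & 56 & 28 & 8 & 1 \\
\vdots & \vdots & \vdots
\end{array}
\]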

For example, the sums of the numbers on the diagonals starting on row 6, row 7 and row 8 are:
\[
\begin{aligned}
1 + 6 + 5 + 1 &= 13 \\
4 + 10 + 6 + 1 &= 21 \\
1 + 10 + 15 + 7 + 1 &= 34.
\end{aligned}
\]

We recognize the Fibonacci numbers, $F_7$, $F_8$ and $F_9$. What a nice surprise!

Recall that $F_0 = 0$, $F_1 = 1$ and
\[
F_{n+2} = F_{n+1} + F_n.
\]
Thus, we conjecture that
\[
F_{n+1} = \binom{n}{0} + \binom{n-1}{1} + \binom{n-2}{2} + \cdots + \binom{0}{n}.
\]

The above formula can indeed be proved by induction, but we have to distinguish the two cases where $n$ is even or odd.
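Before attempting the induction, the conjecture can be tested numerically. This is a small sketch relying on Python's `math.comb`, which returns 0 when the lower index exceeds the upper one; `fib` and `ascending_diagonal_sum` are names made up for this example.

```python
from math import comb


def fib(n: int) -> int:
    """Fibonacci numbers with F_0 = 0, F_1 = 1, F_{n+2} = F_{n+1} + F_n."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def ascending_diagonal_sum(n: int) -> int:
    """Sum C(n,0) + C(n-1,1) + ... + C(0,n) along the ascending diagonal."""
    # comb(n - k, k) is 0 as soon as k > n - k, so the extra terms are harmless.
    return sum(comb(n - k, k) for k in range(n + 1))


print([ascending_diagonal_sum(n) for n in range(9)])
print([fib(n + 1) for n in range(9)])
```

The two printed lists should coincide, and for n = 6, 7, 8 they end with the values 13, 21, 34 computed above.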

We now list a few more formulae which are often used in the manipulations of binomial coefficients.

They are among the “top ten binomial coefficient identities” listed in Graham, Knuth and Patashnik [9], see Chapter 5.

(e) The equation
\[
\binom{n}{i}\binom{n-i}{k-i} = \binom{k}{i}\binom{n}{k}
\]
holds for all $n, i, k$, with $0 \le i \le k \le n$.

This is because, after a few calculations, we find that
\[
\binom{n}{i}\binom{n-i}{k-i} = \frac{n!}{i!\,(k-i)!\,(n-k)!} = \binom{k}{i}\binom{n}{k}.
\]

Observe that the expression in the middle is really the trinomial coefficient
\[
\binom{n}{i\ \ k-i\ \ n-k}.
\]

For this reason, the equation (e) is often called trinomial revision.

For $i = 1$, we get
\[
n \binom{n-1}{k-1} = k \binom{n}{k}.
\]
So, if $k \ne 0$, we get the equation
\[
\binom{n}{k} = \frac{n}{k} \binom{n-1}{k-1}, \quad k \ne 0.
\]

This equation is often called the absorption identity.

(f) The equation
\[
\binom{m+p}{n} = \sum_{k=0}^{m} \binom{m}{k}\binom{p}{n-k}
\]
holds for $m, n, p \ge 0$ such that $m + p \ge n$.

This equation is usually known as Vandermonde convolution.

An interesting special case of Vandermonde convolution arises when $m = p = n$. In this case, we get the equation
\[
\binom{2n}{n} = \sum_{k=0}^{n} \binom{n}{k}\binom{n}{n-k}.
\]

However, $\binom{n}{k} = \binom{n}{n-k}$, so we get
\[
\sum_{k=0}^{n} \binom{n}{k}^2 = \binom{2n}{n},
\]
that is, the sum of the squares of the entries on row $n$ of the Pascal triangle is the middle element on row $2n$.
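Vandermonde convolution and its special case are easy to test exhaustively for small parameters. One caveat in this sketch: Python's `math.comb` rejects negative arguments, so the sum stops at k = min(m, n); the omitted terms have a negative lower index and are zero under the convention of Proposition 4.2.1. The function name is hypothetical.

```python
from math import comb


def vandermonde_rhs(m: int, p: int, n: int) -> int:
    """Right-hand side sum over k of C(m, k) * C(p, n - k)."""
    # Terms with k > n are zero by convention, so they are skipped.
    return sum(comb(m, k) * comb(p, n - k) for k in range(min(m, n) + 1))


print(comb(7, 4), vandermonde_rhs(3, 4, 4))
print(comb(10, 5), sum(comb(5, k) ** 2 for k in range(6)))
```

The first line compares both sides for m = 3, p = 4, n = 4; the second checks the sum of squares on row 5 against the middle entry of row 10.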

A summary of the top nine binomial coefficient identities is given in Figure 4.4.

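\[
\begin{array}{lll}
\binom{n}{k} = \dfrac{n!}{k!\,(n-k)!}, & 0 \le k \le n & \text{factorial expansion} \\[2ex]
\binom{n}{k} = \binom{n}{n-k}, & 0 \le k \le n & \text{symmetry} \\[2ex]
\binom{n}{k} = \dfrac{n}{k}\binom{n-1}{k-1}, & k \ne 0 & \text{absorption} \\[2ex]
\binom{n}{k} = \binom{n-1}{k} + \binom{n-1}{k-1}, & 0 \le k \le n & \text{addition/induction} \\[2ex]
\binom{n}{i}\binom{n-i}{k-i} = \binom{k}{i}\binom{n}{k}, & 0 \le i \le k \le n & \text{trinomial revision} \\[2ex]
(a+b)^n = \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k, & n \ge 0 & \text{binomial formula} \\[2ex]
\sum_{k=0}^{n} \binom{m+k}{k} = \binom{m+n+1}{n}, & m, n \ge 0 & \text{parallel summation} \\[2ex]
\sum_{k=0}^{n} \binom{k}{m} = \binom{n+1}{m+1}, & 0 \le m \le n & \text{upper summation} \\[2ex]
\binom{m+p}{n} = \sum_{k=0}^{m} \binom{m}{k}\binom{p}{n-k}, & m + p \ge n,\ m, n, p \ge 0 & \text{Vandermonde convolution}
\end{array}
\]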

Figure 4.4: Summary of Binomial Coefficient Identities

Remark: Going back to the generalized binomial coefficients, $\binom{r}{k}$, where $r$ is a real number, possibly negative, the following formula is easily shown:
\[
\binom{r}{k} = (-1)^k \binom{k-r-1}{k},
\]
where $r \in \mathbb{R}$ and $k \in \mathbb{Z}$.

If $r < 0$ and $k \ge 1$ then $k - r - 1 > 0$, so the formula shows how a binomial coefficient with negative upper index can be expressed as a binomial coefficient with positive index.

For this reason, this formula is known as negating the upper index.

Next, we would like to better understand the growth pattern of the binomial coefficients.

Looking at the Pascal triangle, it is clear that when $n = 2m$ is even, the central element, $\binom{2m}{m}$, is the largest element on row $2m$ and when $n = 2m + 1$ is odd, the two central elements, $\binom{2m+1}{m} = \binom{2m+1}{m+1}$, are the largest elements on row $2m + 1$.

Furthermore, $\binom{n}{k}$ is strictly increasing until it reaches its maximal value and then it is strictly decreasing (with two equal maximum values when $n$ is odd).

The above facts are easy to prove by considering the ratio
\[
\binom{n}{k} \bigg/ \binom{n}{k+1} = \frac{k+1}{n-k},
\]
where $0 \le k \le n - 1$.

It would be nice to have an estimate of how large the largest binomial coefficient, $\binom{n}{\lfloor n/2 \rfloor}$, is.

Since the sum of the elements on row $n$ is $2^n$ and since there are $n + 1$ elements on row $n$, some rough bounds are
\[
\frac{2^n}{n+1} \le \binom{n}{\lfloor n/2 \rfloor} < 2^n
\]
for all $n \ge 1$. Thus, we see that the middle element on row $n$ grows very fast (exponentially).

We can get a sharper estimate using Stirling’s formula (see Section 4.1). We give such an estimate when $n = 2m$ is even, the case where $n$ is odd being similar.

We have
\[
\binom{2m}{m} \sim \frac{2^{2m}}{\sqrt{\pi m}}.
\]
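The quality of this estimate can be observed numerically. The sketch below compares the central binomial coefficient with the estimate through logarithms, which avoids floating-point overflow for large m; the helper name `central_ratio` is an assumption of this example.

```python
import math


def central_ratio(m: int) -> float:
    """Return C(2m, m) * sqrt(pi * m) / 2^(2m), evaluated via logarithms."""
    log_ratio = (
        math.log(math.comb(2 * m, m))
        + 0.5 * math.log(math.pi * m)
        - 2 * m * math.log(2)
    )
    return math.exp(log_ratio)


for m in (1, 10, 100, 1000):
    print(m, central_ratio(m))
```

The printed ratios should increase towards 1 as m grows, which is what the asymptotic equivalence asserts.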

The next question is to figure out how quickly $\binom{n}{k}$ drops from its maximum value, $\binom{n}{\lfloor n/2 \rfloor}$.

Let us consider the case where $n = 2m$ is even, the case when $n$ is odd being similar and left as an exercise.

We would like to estimate the ratio
\[
\binom{2m}{m-t} \bigg/ \binom{2m}{m},
\]
where $0 \le t \le m$.

Actually, it will be more convenient to deal with the inverse ratio,
\[
r(t) = \binom{2m}{m} \bigg/ \binom{2m}{m-t} = \frac{(m-t)!\,(m+t)!}{(m!)^2}.
\]
Observe that
\[
r(t) = \frac{(m+t)(m+t-1)\cdots(m+1)}{m(m-1)\cdots(m-t+1)}.
\]

The above expression is not easy to handle but if we take its (natural) logarithm, we can use basic inequalities about logarithms to get some bounds.

We will make use of the following proposition:

Proposition 4.3.1 We have the inequalities
\[
1 - \frac{1}{x} \le \ln x \le x - 1,
\]
for all $x \in \mathbb{R}$ with $x > 0$.

We are now ready to prove the following inequalities:

Proposition 4.3.2 For every $m \ge 0$ and every $t$, with $0 \le t \le m$, we have the inequalities
\[
e^{-t^2/(m-t+1)} \le \binom{2m}{m-t} \bigg/ \binom{2m}{m} \le e^{-t^2/(m+t)}.
\]
This implies that
\[
\binom{2m}{m-t} \bigg/ \binom{2m}{m} \sim e^{-t^2/m},
\]
for $m$ large and $0 \le t \le m$.

What is remarkable about Proposition 4.3.2 is that it shows that $\binom{2m}{m-t}$ varies according to the Gaussian curve (also known as bell curve), $t \mapsto e^{-t^2/m}$, which is, up to scaling, the probability density function of the normal distribution (or Gaussian distribution).

If we make the change of variable, $k = m - t$, we see that if $0 \le k \le 2m$, then
\[
\binom{2m}{k} \sim \binom{2m}{m}\, e^{-(m-k)^2/m}.
\]
If we plot this curve, we observe that it reaches its maximum for $k = m$ and that it decays very quickly as $k$ varies away from $m$.

It is an interesting exercise to plot a bar chart of the binomial coefficients and the above curve together, say for $m = 50$. One will find that the bell curve is an excellent fit.
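Short of plotting, the fit can be inspected by tabulating the exact ratio next to the Gaussian value and by checking the two bounds of Proposition 4.3.2. This is a minimal sketch for m ≥ 1 using only the standard library; the function names are illustrative, and a plotting library could be substituted for the `print` loop.

```python
import math


def ratio(m: int, t: int) -> float:
    """Exact ratio C(2m, m - t) / C(2m, m) as a float."""
    return math.comb(2 * m, m - t) / math.comb(2 * m, m)


def within_bounds(m: int, t: int) -> bool:
    """Check e^{-t^2/(m-t+1)} <= ratio <= e^{-t^2/(m+t)} for m >= 1."""
    lower = math.exp(-t * t / (m - t + 1))
    upper = math.exp(-t * t / (m + t))
    return lower <= ratio(m, t) <= upper


m = 50
for t in (0, 5, 10, 20):
    # exact ratio versus the Gaussian value e^{-t^2/m}
    print(t, ratio(m, t), math.exp(-t * t / m))
print(all(within_bounds(m, t) for t in range(m + 1)))
```

For moderate t the two printed columns should be close, and the final line reports whether both bounds hold for every t from 0 to m.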

Given some number, $c > 1$, it is sometimes desirable to find for which values of $t$ the inequality
\[
\binom{2m}{m} \bigg/ \binom{2m}{m-t} > c
\]
holds. This question can be answered using Proposition 4.3.2.

Proposition 4.3.3 For every constant, $c > 1$, and every natural number, $m \ge 0$, if $\sqrt{m \ln c} + \ln c \le t \le m$, then
\[
\binom{2m}{m} \bigg/ \binom{2m}{m-t} > c
\]
and if $0 \le t \le \sqrt{m \ln c} - \ln c \le m$, then
\[
\binom{2m}{m} \bigg/ \binom{2m}{m-t} \le c.
\]

As an example, if $2m = 1000$ (that is, $m = 500$) and $c = 100$, we will have
\[
\binom{1000}{500} \bigg/ \binom{1000}{500 - (500 - k)} > 100
\]
or equivalently
\[
\binom{1000}{k} \bigg/ \binom{1000}{500} < \frac{1}{100}
\]
when $500 - k \ge \sqrt{500 \ln 100} + \ln 100$, that is, when $k \le 447.4$.

It is also possible to give an upper bound on the partial sum
\[
\binom{2m}{0} + \binom{2m}{1} + \cdots + \binom{2m}{k-1},
\]
with $0 \le k \le m$, in terms of the ratio, $c = \binom{2m}{k} \big/ \binom{2m}{m}$.

The following proposition is taken from Lovász, Pelikán and Vesztergombi [12] (Lemma 3.8.2, Chapter 3):

Proposition 4.3.4 For any natural numbers $m$ and $k$ with $0 \le k \le m$, if we let $c = \binom{2m}{k} \big/ \binom{2m}{m}$, then we have
\[
\binom{2m}{0} + \binom{2m}{1} + \cdots + \binom{2m}{k-1} < c\, 2^{2m-1}.
\]

This proposition implies an important result in (discrete) probability theory as explained in [12] (see Chapter 5).

Observe that $2^{2m}$ is the sum of all the entries on row $2m$.

As an application, if $k \le 447$, the sum of the first 447 numbers on row 1000 of the Pascal triangle makes up less than 0.5% of the total sum and similarly for the last 447 entries.

Thus, the middle 107 entries account for more than 99% of the total sum.

4.4 The Inclusion-Exclusion Principle, Sylvester’s Formula, The Sieve Formula

We close this chapter with the proof of a powerful formula for determining the cardinality of the union of a finite number of (finite) sets in terms of the cardinalities of the various intersections of these sets.

This identity, variously attributed to Nicholas Bernoulli, de Moivre, Sylvester and Poincaré, has many applications to counting problems and to probability theory.

Figure 4.5: Abraham de Moivre, 1667-1754 (left) and Henri Poincaré, 1854-1912 (right)

We begin with the “baby case” of two finite sets.

Proposition 4.4.1 Given any two finite sets, $A$ and $B$, we have
\[
|A \cup B| = |A| + |B| - |A \cap B|.
\]

We would like to generalize the formula of Proposition 4.4.1 to any finite collection of finite sets, A1,...,An.

A moment of reflexion shows that when $n = 3$, we have
\[
|A \cup B \cup C| = |A| + |B| + |C| - |A \cap B| - |A \cap C| - |B \cap C| + |A \cap B \cap C|.
\]

One of the obstacles in generalizing the above formula to $n$ sets is purely notational: We need a way of denoting arbitrary intersections of sets belonging to a family of sets indexed by $\{1, \ldots, n\}$.

We can do this by using indices ranging over subsets of $\{1, \ldots, n\}$, as opposed to indices ranging over integers.

So, for example, for any nonempty subset, $I \subseteq \{1, \ldots, n\}$, the expression $\bigcap_{i \in I} A_i$ denotes the intersection of all the subsets whose index, $i$, belongs to $I$.

Theorem 4.4.2 (Inclusion-Exclusion Principle) For any finite sequence, $A_1, \ldots, A_n$, of $n \ge 2$ subsets of a finite set, $X$, we have
\[
\left| \bigcup_{k=1}^{n} A_k \right| = \sum_{\substack{I \subseteq \{1, \ldots, n\} \\ I \ne \emptyset}} (-1)^{(|I| - 1)} \left| \bigcap_{i \in I} A_i \right|.
\]

As an application of the Inclusion-Exclusion Principle, let us prove the formula for counting the number of surjections from $\{1, \ldots, n\}$ to $\{1, \ldots, p\}$, with $p \le n$, given in Proposition 4.2.5.

Recall that the total number of functions from $\{1, \ldots, n\}$ to $\{1, \ldots, p\}$ is $p^n$.

The trick is to count the number of functions that are not surjective.

Any such function has the property that its image misses at least one element from $\{1, \ldots, p\}$.

So, if we let
\[
A_i = \{ f \colon \{1, \ldots, n\} \to \{1, \ldots, p\} \mid i \notin \mathrm{Im}(f) \},
\]
we need to count $|A_1 \cup \cdots \cup A_p|$.

But, we can easily do this using the Inclusion-Exclusion Principle.

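We find that, for any nonempty subset $I \subseteq \{1, \ldots, p\}$ with $|I| = k$,
\[
\left| \bigcap_{i \in I} A_i \right| = (p - k)^n,
\]
since a function belongs to every $A_i$ with $i \in I$ iff it maps $\{1, \ldots, n\}$ into the $p - k$ elements outside $I$.

From this, since there are $\binom{p}{k}$ subsets of size $k$ and the term for $k = p$ vanishes, the Inclusion-Exclusion Principle yields
\[
|A_1 \cup \cdots \cup A_p| = \sum_{k=1}^{p-1} (-1)^{k-1} \binom{p}{k} (p-k)^n,
\]
and so, the number of surjections, $S_{n\,p}$, is
\[
\begin{aligned}
S_{n\,p} &= p^n - |A_1 \cup \cdots \cup A_p| \\
&= p^n - \sum_{k=1}^{p-1} (-1)^{k-1} \binom{p}{k} (p-k)^n \\
&= \sum_{k=0}^{p-1} (-1)^{k} \binom{p}{k} (p-k)^n \\
&= p^n - \binom{p}{1}(p-1)^n + \binom{p}{2}(p-2)^n + \cdots + (-1)^{p-1}\binom{p}{p-1},
\end{aligned}
\]
which is indeed the formula of Proposition 4.2.5.

Another amusing application of the Inclusion-Exclusion Principle is the formula giving the number, $p_n$, of permutations of $\{1, \ldots, n\}$ that leave no element fixed (i.e., $f(i) \ne i$, for all $i \in \{1, \ldots, n\}$). Such permutations are often called derangements.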

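We get
\[
\begin{aligned}
p_n &= n! \left( 1 - \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{(-1)^k}{k!} + \cdots + \frac{(-1)^n}{n!} \right) \\
&= n! - \binom{n}{1}(n-1)! + \binom{n}{2}(n-2)! + \cdots + (-1)^n.
\end{aligned}
\]

Remark: We know (using the series expansion for $e^x$ in which we set $x = -1$) that
\[
\frac{1}{e} = 1 - \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{(-1)^k}{k!} + \cdots.
\]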

Consequently, the factor of $n!$ in the above formula for $p_n$ is the sum of the first $n + 1$ terms of the series for $\frac{1}{e}$ and so,
\[
\lim_{n \to \infty} \frac{p_n}{n!} = \frac{1}{e}.
\]

It turns out that the series for $\frac{1}{e}$ converges very rapidly, so $p_n \approx \frac{1}{e}\, n!$.
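The derangement formula and the approximation by n!/e can be verified for small n. This sketch uses exact integer arithmetic for the formula and a brute-force enumeration as an independent check; the function names are assumptions of this example, and the brute-force version is only usable for small n.

```python
from itertools import permutations
from math import e, factorial


def derangements_formula(n: int) -> int:
    """n! * (1 - 1/1! + 1/2! - ... + (-1)^n / n!) in exact integer arithmetic."""
    return sum((-1) ** k * (factorial(n) // factorial(k)) for k in range(n + 1))


def derangements_brute(n: int) -> int:
    """Count permutations of range(n) with no fixed point."""
    return sum(
        1 for perm in permutations(range(n)) if all(perm[i] != i for i in range(n))
    )


for n in range(1, 9):
    print(n, derangements_formula(n), derangements_brute(n), factorial(n) / e)
```

In the printed table, the last column (n!/e) should stay within 1/2 of the exact count, which reflects how quickly the series converges.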

The ratio $p_n/n!$ has an interesting interpretation in terms of probabilities.

Assume $n$ persons go to a restaurant (or to the theatre, etc.) and that they all check their coats. Unfortunately, the clerk loses all the coat tags.

Then, $p_n/n!$ is the probability that nobody will get her or his own coat back!

As we just explained, this probability is roughly $\frac{1}{e} \approx \frac{1}{3}$, a surprisingly large number.

The Inclusion-Exclusion Principle can be easily generalized in a useful way as follows:

Given a finite set, $X$, let $m$ be any given function, $m \colon X \to \mathbb{R}_+$, and for any nonempty subset, $A \subseteq X$, set
\[
m(A) = \sum_{a \in A} m(a),
\]
with the convention that $m(\emptyset) = 0$ (recall that $\mathbb{R}_+ = \{ x \in \mathbb{R} \mid x \ge 0 \}$).

For any $x \in X$, the number $m(x)$ is called the weight (or measure) of $x$ and the quantity $m(A)$ is often called the measure of the set $A$.

For example, if $m(x) = 1$ for all $x \in A$, then $m(A) = |A|$, the cardinality of $A$, which is the special case that we have been considering.

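For any two subsets, $A, B \subseteq X$, it is obvious that
\[
\begin{aligned}
m(A \cup B) &= m(A) + m(B) - m(A \cap B) \\
m(X - A) &= m(X) - m(A) \\
m(\overline{A \cup B}) &= m(\overline{A} \cap \overline{B}) \\
m(\overline{A \cap B}) &= m(\overline{A} \cup \overline{B}),
\end{aligned}
\]
where $\overline{A} = X - A$.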

Figure 4.6: James Joseph Sylvester, 1814-1897

Then, we have the following version of Theorem 4.4.2:

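Theorem 4.4.3 (Inclusion-Exclusion Principle, Version 2) Given any measure function, $m \colon X \to \mathbb{R}_+$, for any finite sequence, $A_1, \ldots, A_n$, of $n \ge 2$ subsets of a finite set, $X$, we have
\[
m\!\left( \bigcup_{k=1}^{n} A_k \right) = \sum_{\substack{I \subseteq \{1, \ldots, n\} \\ I \ne \emptyset}} (-1)^{(|I| - 1)}\, m\!\left( \bigcap_{i \in I} A_i \right).
\]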

A useful corollary of Theorem 4.4.3 often known as Sylvester’s formula is:

Theorem 4.4.4 (Sylvester’s Formula) Given any measure, $m \colon X \to \mathbb{R}_+$, for any finite sequence, $A_1, \ldots, A_n$, of $n \ge 2$ subsets of a finite set, $X$, the measure of the set of elements of $X$ that do not belong to any of the sets $A_i$ is given by
\[
m\!\left( \bigcap_{k=1}^{n} \overline{A_k} \right) = m(X) + \sum_{\substack{I \subseteq \{1, \ldots, n\} \\ I \ne \emptyset}} (-1)^{|I|}\, m\!\left( \bigcap_{i \in I} A_i \right).
\]

Note that if we use the convention that when the index set, $I$, is empty then
\[
\bigcap_{i \in \emptyset} A_i = X,
\]
then the term $m(X)$ can be included in the above sum by removing the condition that $I \ne \emptyset$ and this version of Sylvester’s formula is written:
\[
m\!\left( \bigcap_{k=1}^{n} \overline{A_k} \right) = \sum_{I \subseteq \{1, \ldots, n\}} (-1)^{|I|}\, m\!\left( \bigcap_{i \in I} A_i \right).
\]

Sometimes, it is also convenient to regroup terms involving subsets, $I$, having the same cardinality and another way to state Sylvester’s formula is as follows:
\[
m\!\left( \bigcap_{k=1}^{n} \overline{A_k} \right) = \sum_{k=0}^{n} (-1)^k \sum_{\substack{I \subseteq \{1, \ldots, n\} \\ |I| = k}} m\!\left( \bigcap_{i \in I} A_i \right). \qquad \text{(Sylvester’s Formula)}
\]

Finally, Sylvester’s formula can be generalized to a formula usually known as the “Sieve Formula”:

Theorem 4.4.5 (Sieve Formula) Given any measure, $m \colon X \to \mathbb{R}_+$, for any finite sequence, $A_1, \ldots, A_n$, of $n \ge 2$ subsets of a finite set, $X$, the measure of the set of elements of $X$ that belong to exactly $p$ of the sets $A_i$ ($0 \le p \le n$) is given by
\[
T_n^p = \sum_{k=p}^{n} (-1)^{k-p} \binom{k}{p} \sum_{\substack{I \subseteq \{1, \ldots, n\} \\ |I| = k}} m\!\left( \bigcap_{i \in I} A_i \right).
\]

Observe that Sylvester’s Formula is the special case of the Sieve Formula for which $p = 0$.
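The Sieve Formula can be tried out with the counting measure (every element has weight 1). The following sketch is one possible illustration, not a definitive implementation: the function names are invented here, the empty intersection is taken to be the whole universe as in the convention above, and the example sets (multiples of 2, 3 and 5 among 1, ..., 30) are chosen only for the demonstration.

```python
from itertools import combinations
from math import comb


def sieve_count(universe, sets, p: int) -> int:
    """Number of elements lying in exactly p of the sets, via the Sieve Formula."""
    universe = set(universe)
    n = len(sets)
    total = 0
    for k in range(p, n + 1):
        for index_set in combinations(range(n), k):
            common = set(universe)  # the empty intersection is X by convention
            for i in index_set:
                common &= sets[i]
            total += (-1) ** (k - p) * comb(k, p) * len(common)
    return total


def exact_count(universe, sets, p: int) -> int:
    """Direct count of the elements that belong to exactly p of the sets."""
    return sum(1 for x in universe if sum(x in s for s in sets) == p)


X = range(1, 31)
A = [{x for x in X if x % d == 0} for d in (2, 3, 5)]
print([sieve_count(X, A, p) for p in range(4)])  # [8, 14, 7, 1]
```

The entry for p = 0 is Sylvester's formula: 8 of the numbers from 1 to 30 are divisible by none of 2, 3, 5.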

The Inclusion-Exclusion Principle (and its relatives) plays an important role in combinatorics and probability theory as the reader will verify by consulting any text on combinatorics.

A classical reference on combinatorics is Berge [1]; a more recent one is Cameron [3].

More advanced references are van Lint and Wilson [19], and Stanley [17].

Another great (but deceptively tough) reference covering discrete mathematics and including a lot of combinatorics is Graham, Knuth and Patashnik [9].

Conway and Guy [4] is another beautiful book that presents many fascinating and intriguing geometric and combinatorial properties of numbers in a very entertaining manner.

For readers interested in geometry with a combinatorial flavor, Matousek [13] is a delightful (but more advanced) reference.

We are now ready to study special kinds of relations: Partial orders and equivalence relations.