<<

Catalan Numbers, Their Generalization, and Their Uses Peter Hilton and Jean Pedersen

Probably the most prominent among the special in- stepping one unit east or one unit north of P,. We say tegers that arise in combinatorial contexts are the bino- that this is a path from P to Q if Po = P, Pm= Q. It is mial coefficients (~). These have many uses and, often, now easy to count the number of paths. fascinating interpretations [9]. We would like to stress one particular interpretation in terms of paths on the THEOREM 0.1. The number of paths from (0,0) to (a,b) is integral lattice in the coordinate plane, and discuss the the if+b). celebrated ballot problem using this interpretation. A path is a of points Po P1 9 9 Pro, 9 m >I O, Proof: We may denote a path from (0,0) to (a,b) as a where each P, is a lattice point (that is, a point with sequence consisting of a Es (E for East) and b Ns (N for coordinates) and Pz+l, i 1> 0, is obtained by North) in some order. Thus the number of such paths

64 THE MATHEMATICALINTELUGENCER VOL 13, NO 2 @ 1991 Spnnger-Verlag New York is the number of such . A sequence consists of (a + b) symbols and is determined when we have decided which a of the (a + b) symbols should be Es. p=2 p=3 Thus we have (o+b) choices of symbol. k=3 k=2

3 4 5

2 3

Figure 1. Left: a binary (or 2-ary) tree with 3 source-nodes (Q) and 4 end-nodes (o) and, at right, a ternary (or 3-ary) tree with 2 source-nodes (o) and 5 end-nodes (o).

p=2 p=3 k=3 k=2

(Sl((S2 S3)S4)) (sl s2(ss

Figure 2. Left: an expression involving 3 applications of a applied to 4 symbols and, at right, an ex- pression involving 2 applications of a ternary operation ap- plied to 5 symbols. We will present another family of special , called the Catalan numbers (after the nineteenth-cen- tury Belgian 1 mathematician Eugene Charles Catalan; see the historical note at the end of this ar~cle), which p=2 p=3 are closely related, both algebraically and concep- k=3 k=2 tually, to the binomial coefficients. These Catalan numbers have many combinatorial interpretations, of which we will emphasize three, in addition to their interpretation in terms of 2-good paths on the integral lattice. Because all these interpretations remain valid if one makes a conceptually obvious generalization of the original definition of Catalan numbers, we will present these interpretations in this generalized form. Figure 3. Left: a S-gun subdivided into three 3-gons by 2 Thus the definitions we are about to give depend on a diagonals and, at right, a 6-gon subdivided into two 4-gons parameter p, which is an integer greater than one. The by I diagonal. original Catalan numbers--or, rather, characteriza- tions of these numbers--are obtained by taking p = 2. ak: rao = 1, ~a k = the number of p-ary trees with k Let us then present three very natural combinatorial source-nodes, k ~> 1 (see Fi- concepts, which arise in rather different contexts, but gure 1). which turn out to be equivalent. bk: pbo = 1, ~bk = the number of ways of associating Let p be a fixed integer greater than 1. We define k applications of a given p-ary three sequences of positive integers, depending on p, operation, k I> 1 (see Figure 2). as follows: ck: fo = 1, t,ck = the number of ways of subdi- viding a convex polygon into k disjoint (p + 1)-gons by means of t He lived his life m Belgium; but, being born in 1814, he was not non-intersecting diagonals, k I> 1 born a Belgian! (see Figure 3).

THE MATHEMATICALINTELLIGENCER VOL. 13, NO. 2, 1991 65 However, relating the trees of Figure 1 (or the paren- p=2 p=3 thetical expressions of Figure 2) to the corresponding k=3 k=2 (1 ((23) 4)) (~ 2 s 4 5)) dissected polygons of Figure 3 is more subtle (and the hint given in [8], in the case p = 2, is somewhat cryptic), so we will indicate the proof that bk = Ck. Thus, suppose that we are given a rule for associating k applications of a given p-ary operation to a string of (p - 1)k + 1 symbols sl, s2..... s0,_l~,+ v Label the successive sides (in the anticlockwise direction) of the convex ((p - 1)k + 2)-gon with the integers from 1 to 3 (p - 1)k + 1, leaving the top, horizontal side unla- Figure 4. The dissected polygons of Figure 3, with diag- beled. Pick the first place along the expression onals and last side labeled. (counting from the left) where a succession of p symbols is enclosed in parentheses. If the symbols en- closed run from st+ 1 to sj+p, draw a diagonal from the (1(23(4(567) 8))9) initial vertex of the (j + 1)st side to the final vertex p=3 of the (j + p)th side, and label it (j + 1..... j + p) k=4 Also imagine the part (sj+a .... sj+p) replaced by a 3(4(567) 8)) /~ single symbol. We now have effectively reduced our word to a set of (k - 1) applications and our polygon to a set of 2 polygons, one a (p + 1)-gon, the other a [(p - 1)(k - 1) + 2]-gon, so we proceed inductively to complete the rule for introducing diagonals. More- over, the eventual label for the last side will corre- spond precisely to the string of (p - 1)k + 1 symbols 'Y with the given rule for associating them, that is, to the original expression. Figure 4 illustrates the labeling of the dissections of the pentagon and hexagon that are determined by the corresponding expressions of Figure 2. The converse procedure, involving the same initial labeling of the original sides of a dissected polygon, and the successive labeling of the diagonals 5 introduced, leads to an expression that acts as label for the last (horizontal) side. Figure 5. The polygonal dissection associated with We now illustrate a slightly more complicated situa- (s~(s2s3(s,(sss~ )ss) ) ~. tion: Example O. 1. Let p = 3, k = 4 and consider the expres- (As indicated, we suppress the 'p' from the symbol if sion no ambiguity need be feared.) sl(s2s3(s,(sss6sT)sg))sg). Note that, if k t 1, (i) a p-ary tree with k source-nodes has The dissection of the convex 10-gon associated with (p - 1)k + 1 end-nodes and pk + 1 nodes in this expression, with the appropriate labeling, is all (Figure 1); shown in Figure 5. The corresponding ternary tree is (ii) the k applications of a given p-ary operation are shown in Figure 6, which also demonstrates how the applied to a sequence of (p - 1)k + 1 symbols tree may be associated directly with the dissected (Figure 2); polygon. The rule is "Having entered a (p + 1)-gon (iii) the polygon has (p - 1)k + 2 sides and is sub- through one of its sides, we may exit by any of the divided into k disjoint (p + 1)-gons by (k - 1) other p sides." diagonals (Figure 3). In the case p = 2, any of a t, bk, Ck may be taken as the A well-known and easily proved result (for the case definition of the kth Catalan number. Thus we may re- p = 2 see Sloane [8]) is the following: gard any of pat, pbk, pck as defining the generalized k th Catalan number. Moreover, the following result is THEOREM 0.2. pak = pbk = pCk" known (see [6]).

The reader will probably have no trouble seeing THEOREM 0.3. how the trees of Figure I are converted into the corre- Pak= k 1 (p-1)k + l ' sponding expressions of Figure 2--and vice versa. k~>l.

66 THE MATHEMATICAL INTELLIGENCER VOL 13, NO 2, 1991 The "classical" proof of this result is rather sophisti- particular stress on the flexibility of this fourth inter- cated. One of our principal aims is to give an elemen- pretation, which we now describe. tary proof of Theorem 0.3, based on yet a fourth inter- We say that a path from P to Q is p-good if it lies pretation of the generalized Catalan numbers which, entirely below the line y = (p - 1)x. Let dk = pdk be like that of the binomial coefficients given earlier, is the number of p-good paths from (0, - 1) to (k, (p - 1)k also in terms of paths on the integral lattice. We lay - 1). (Thus, by our convention, do = 1.) We extend Theorem 0.2 to assert the following.

THEOREM 0.4. pak = pbk = pCk =pd k.

Proof: First we restate (with some precision) the defini- tions of b k and dk, which numbers we will prove equal. p=3 bk = the number of expressions involving k appli- k=4 cations of a p-ary operation (on a word of 1 9 k(p - 1) + 1 symbols). Let E be the set of all O•j• such expressions. dk = the number of good paths from (0,-1) to (k, (p - 1)k - 1); by following such a path by a 2 single vertical segment, we may identify such paths with certain paths (which we will also call 'good') from (0,- 1) to (k, (p - 1)k). Let P be the set of all such paths. We set up functions ~: E---* P, ~F: P--, E, which are obviously mutual inverses. Thus is obtained by re- moving all final parentheses 2 and (reading the re- sulting expression from left to right) interpreting a pa- renthesis as a horizontal move and a symbol as a ver- 5 6 7 tical move. The inverse rule is ~. It remains to show (i) that ImCI> consists of elements of P (and not merely of arbitrary paths from (0,- 1) to (k, (p - 1)k); and (ii) Figure 6 (a). The tree associated with (sl(s2ss(s,(ssssS~)s.))sg). that Im~F consists of elements of E (that is, of well- formed, meaningful expressions). (i) Let E be an expression. We argue by induction on Ente[ here k that ~E is a good path. Plainly ~E is a path from I (0, - 1) to (k, (p - 1)k), so we have only to prove it good. This obviously holds if k = 1. Let k I> 2 and let E involve k applications. There must occur somewhere in E the section (st+is1+2.. 9st+ p) Call this the key section. We wish to show that if (u,v) is a point on the path E that is not the end- point, then v < (p - 1)u. Let E' be the expression obtained from E by replacing the key section by s. Then E' involves (k - 1) applications. Now if A is the point on ~E corresponding to the parenthesis of the key section, then the inductive hypothesis tells us that v < (p - 1)u if (u,v) occurs prior to A. Assume now that (u,v) is not prior to A. There are then 2 possibilities:

2 This converts our expression into a meaningful reverse Polish ex- pression (MRPE); see [2]. Note that the closing parentheses are, strictly speaking, superfluous, prowded that we know that we are Figure 6 (b). Associating a tree with a dissected polygon (in dealing with a p-ary operation. It is not even necessary to know this this case, the tree shown in (a) and the polygon of Figure 5). if we are sure that the expression is well-formed, since the 'arity' is Note that 9= interior node = source-node; o = exterior ~-1K + 1, where ~ is the number of symbols and Kis the number of node = end node. opening parentheses.

THE MATHEMATICAL INTELL1GENCER VOL 13, NO 2, 1991 67 y=2x / p=3 k=4 ~2 /(.,C

X

(0, -1) t ~=~bE E =~b~

Figure 7 (a). The path associated with the expression E = Figure 7 (b). The inductive step, in either direction. (s~(s2s3(s,(sss6sT)ss))Sg).

If E ends with S/+p, so that E' ends with s, then by AB, followed by ~2'" It is trivial that ~' is a 3(u',v'), the last point of ~E', such that u' = good path to (k - 1, (p - 1)(k - 1)). Thus, by the u - 1, v' > v - p + 1, v' = (p - 1)u'. inductive hypothesis, ~' is a well-formed ex- Hencev - p + 1< (p - 1)(u - 1), soy< pression involving (k - 1) applications. Let s be (p - 1)u. the symbol in ~@' corresponding to AB. Replace s If E does not end with st+ ~, so that E' does not end by (sj+ls1+ 2 . . . sj+p, where these symbols do not with s, then 3(u',v'), not the last point of ~E', occur in ~@'. Then the new expression is well- such that u' = u - 1, v' I> v - p + 1, v' < formed and is obviously ~@. This completes the (p - 1)u' (by the inductive hypothesis). Hence induction in the ~-direction so that Theorem 0.4 is v-p + l<(p- 1)(u- 1),sov<(p- 1)u. established. This completes the induction in the ~-direction. With Theorem 0.4 established, we may proceed to Figure 7, which gives a special, but not particular, calculate d k. This is achieved in Section 2; a more gen- case, may be helpful in following the argument. eral algorithm for counting p-good paths is described (ii) Let ~ be a good path to (k, (p - 1)k). We argue by in Section 3. induction on k that ~@ is a well-formed expres- By considering the last application of the given p-ary sion. This holds if k = 1, because then there is operation, it is clear that pbk satisfies the recurrence re- only one good path @ and ~@ is (sis2. 9 9 Sp). Let k lation (generalizing the familiar formula in the case p ~> 2. Then because the path climbs altogether = 2) k(p - 1) + 1 places in k jumps, there must be a Pbk = E pbtl pbi2 . . . pbip, k I> 1. (0.1) jump somewhere (not at the initial point) of at tl+t2+ . +tp=k-1 least p places. Let A be a point on ~ where the In Section 2 we also enunciate a similar recurrence re- path takes one horizontal step followed by p ver- lation for pdk, thus providing an alternative proof that tical steps, bringing it to C. Let B be the point one ~bk = pdk. We will also show in Section 2, in outline, step above A (so that B is not on ~). Then B is on y ow (0.1) leads to a proof of Theorem 0.3 via gener- = (p - 1)x if and only if C is on y = (p - 1)x, that ating functions and the theory of B/~rmann-Lagrange is, if and only if C is the endpoint of ~; otherwise B series (see [5]). is below y = (p - 1)x. Let ~1, ~2 be the parts of The case p = 2, dealing with the Catalan numbers ending in A and beginning in C, respectively. Let as originally defined, is discussed in Section 1. In that ~2' be the translate of ~2, given by special case we have available an elegant proof that 2dk u' = u - 1, v' = v- (p- 1); = ~(k~l), k I 1, using Andre's Reflection Method [1]; however, we have not discovered in the literature a and let @' be the path consisting of ~a, followed means of generalizing this method effectively.

68 Tim MATHEMATICAL INTELLIGENCER VOL 13, NO 2, 1991 m The authors wish to thank Richard Guy, Richard composition, @ = ~1~2; and let @1 be the path ob- Johnsonbaugh, Hudson Kronk, and Tom Zaslavsky tained from @1 by reflecting_in the line y = x. Then, if for very helpful conversations and communications = ~2, @ is a path from P(d,c) to Q(a,b); and the rule during the preparation of this article. ---* @ sets up a one-one correspondence between the set of bad paths from P to Q and the set of paths from 1. Catalan Numbers and the Ballot Problem to Q. It follows that there are ((a+b)2(~+a)) bad paths from P to Q, and hence The ballot problem (see box) was first solved by Ber- trand, but a very elegant solution was given in 1887 by ((a + b) - c(c + d)) ((a + b) - d(c + (1.1) the French mathematician D4sir6 Andr6 (see [1], [4]). In fact, his method allows one easily to count the number of 2-good paths from (c,d) to (a,b), where (c,d), good paths from P to Q. The argument, illustrated in (a,b) are any two lattice points below the line y = x. Figure 8, is called Andre's Reflection Method. Because p = 2 throughout this discussion, we will Notice, in particular, that if c = 1, d = 0, this gives speak simply of good paths; any path from (c,d) to (a,b) the number of good paths from (1,0) to (a,b) as in the complementary set will be called a bad path. We will assume from the outset that both good paths and (o+b l) (a+b l) bad paths exist; it is easy to see that this is equivalent a-1 a = a to assuming Thus the probability that a path from (0,0) to (a,b) dl. (1.2) subpa~hs PF, FQ, so that, using juxtaposition for path- co = 1, ck =-~ k- 1 "

Notice that an alternative expression is

Ck=k+ 1 (1.3)

Notice, too, that, by translation, ck may also be inter- preted as the number of good paths from (1,0) to (k + 1,k). Andr4's Reflection Method does not seem to be readily applicable to obtaining a formula corre- sponding to (1.2) in the case of a general p I> 2. In- deed, in the next section, where we calculate pck, p/> 2, we do not obtain a general closed formula for the number of p-good paths from (c,d) to (a,b). We do in- troduce there and calculate explicitly the number dqk of p-good paths from (1,q - 1) to Pk(k,(p - 1)k - 1) for q ~< p - 1. This enables us, by translation, to calculate the number of p-good paths from any lattice point C below the line y = (p - 1)x to Pk. For if C = (c,d), then the number of p-good paths from C to Pk is the number of p-good paths from (1,q - 1) to Pk-c+l, where q = Figure 8. Andr4's reflection method. d + 1 - (p - 1)(c - 1). On the other hand, although

THE MATHEMATICAL INTELLIGENCER VOL 13, NO 2, 1991 69 a convenient explicit formula when the terminus of for an easy translation argument shows that the the path is an arbitrary lattice point below the line y = number of p-good paths from (1, p - 2) to Pk is the (p - 1)x does not seem available, we will nevertheless same as the number of p-good paths from (0,- 1) to derive in Section 3 an algorithmic formula in the form of Pk-l" Thus dp-l,1 = 1 = d00 if k = 1, and, if k I-- 2, the a finite sum, depending on the quantities dqk. This for- number of p-good paths from (0,- 1) to Pk-1 is, as ar- mula, while readily applicable, has a mysterious con- gued above, the number of p-good paths from (1, - 1) ceptual feature to which we draw attention at the end to Pk- 1, that is, do,k- 1. of Section 3. We may write ~dk for pd0k, so that, if k I> 0, The standard procedure for calculating Ck (we here 2dk = Ck, the kth Catalan number, and (2.3) revert to the original case p = 2) is based on the evi- dent =/k =/0,k =/p-l,k§ the generalized kth Catalan number. (2.4) Ck = ~ c,cl, k >~ l; co = 1. (1.4) z+/=k-1 We will also suppress the 'p' from these symbols and This relation is most easily seen by considering bk and from the term 'p-good' if no ambiguity need be feared. concentrating on the last application of a given binary We now enunciate two fundamental properties of operation within a certain expression. If we form the the numbers dqk, for a fixed p 1 2. A special case of the power series first property was pointed out to us by David Jonah oo during a Mathematical Association of America Short S(x) = (1.5) Course, conducted by us in the summer of 1987, k=O under the auspices of the Michigan Section. His orig- inal formula concerned the case p = 2 and was re- then (1.4) shows that S satisfes stricted to q = 0 (and the parameter n appearing in the xS 2 - S -t- 1 -~ 0, S(0) = 1, formula was restricted to being an integer not less than 2k instead of an arbitrary real number). so that Recall that, for any real number n, the binomial coef- 1 - X/1 -4x S = (1.6) ficient (8) may be interpreted as 1, and the binomial 2x coefficient (~) may be interpreted as the expression n(n-1) .... (n-r+1), provided r is a positive integer (see, It is now straightforward to expand (1.6) and thus to r! e.g., [3]). Using this interpretation, we prove the fol- obtain confirmation of the value of ck already obtained lowing. in (1.2). As we will demonstrate, this analytical method for calculating ck does generalize, but the gen- THEOREM 2.2 (Generalized Jonah Formula). Let n be eralization is highly sophisticated and we prefer to any real number. If k >I 1, and q ~ p - 1, then emphasize a much more elementary, combinatorial ar- k gument to calculate pck. -

2. Generalized Catalan Numbers Proof: We first assume (see Figure 9) that n is an in- We will base our calculation of ~ck on that of a more teger not less than pk, so that (k, n - k) is a lattice general quantity. Let p be a fixed integer greater than point not below the line y = (p - 1)x. Partition the 1, and let Pk be the point (k,(p - 1)k - 1), k ~ 0. We paths from (1, q - 1) to (k, n - k) according to where define the numbers ~/qk = dqk, as follows. they first meet the line y = (p - 1)x. The number of these paths that first meet this line where x = i is ~i ~i, Definition 2.1 Let q ~< p - 1. Then dq0 = I and dqk is the where number of p-good paths from (1, q - 1) to Pk, if k >1 1. ~/z = the number of paths from (1, q - 1) to (i, (p - 1)i) The quantities dqk will play a crucial role in counting which stay below y = (p - 1)x except at the p-good paths and turn out to be no more difficult to endpoint calculate than the quantities d k. Of course, we have = the number of good paths from (1, q - 1) to P, ~Ck = ~/k = pd0k, k ~> 0, (2.1) t/q ; where pck is the kth (generalized) Catalan number. For, if k ~> 1, every p-good path from (0, - 1) to Pk must first and proceed to (1,- 1). We further note that we have the the number of paths from (i, (p - 1)i) to relation (k, n - k)

6_1, k = d0,k_l, k ~ 1; (2.2)

70 THE MATHEMATICAL INTELLIGENCER VOL 13, NO 2, 1991 (Recall that k t> 2.) (k,n - k) Now we have the universal identity (r"+l) = 7~-i(r)"-rn Hence (k, (p - :) k) - pi- 1) pk- pi- k + i _ 9 ( pkk-1 = k -~i (P~ PZ---11)

kk i-1]' so that (p-- 1)z) ' Pz

= dqk (o 1) = dqk.

Finally,

1 (P 1)( pk k - 2

(1,q- 1) pk- q - Figure 9. Partitioning paths from (1, q - 1) to (k, n - k) according to where they first meet the line y = (p - 1)x. = P- q(pk- q~ pk- q\k- 1]" Because i ranges from I to k and there are (~--0 paths in all from (1, q - 1) to (k, n - k), formula (2.5) is and (2.6) is proved. proved in this case. Note that nothing prevents us from taking q nega- We now observe that each side of (2.5) is a polyno- tive in Theorems 2.2 and 2.3. Taking q = 0, we obtain mial in n over the rationals Q of degree (k - 1), and the values of the generalized Catalan numbers: we have proved that these two polynomials agree for infinitely many values of n. They therefore agree for Corollary 2.4 ~lo = 1 and ~dk = ~_~),1 k >I 1. all real values of n. We may use Theorem 2.2 to calculate the value of We come now to the second fundamental property dqk; recall that q ~< p - 1. of the numbers dqk, for a fixed p ~> 2.

THEOREM 2.3 If k >>- 1, then THEOREM 2.5 (Recurrence relation for dqk). Choose a fixed r >>- 1. If k >I 1, and q < p - r, then

Proof: Because, from its definition, dqa = 1, formula dqk = Z dp-r,'dq+r,J' where i >~ 1, j >I 1, i + j = k + 1. (2.6) holds if k = 1. We therefore assume k/> 2. We z,! (2.7) first substitute 3 n = pk - 1 into (2.5), obtaining The proof proceeds by partitioning the good paths 1 2 d~ (Pk k _ 1 from (1, q - 1) to Pk according to where they first meet the line y = (p - 1)x - r (see Figure 10). We suppress We next substitute n = pk - 1 in (2.5), but now re- the details of the argument, which resemble those for place k by (k - 1), obtaining Theorem 2.2. However, notice that formula (2.7) gen- eralizes the familiar recurrence relation (1.4), 2 ~ d, (P~- /P'_--11). Ck= ~ c,q, k>'l, i+y=k-1 3 It is amusing to note that the combinatorial argument in the proof of Theorem 2.2 required n >t pk, and we are here substituting a value of for the Catalan numbers, in the light of (2.3) and (2.4) n less than pk. --we have merely to substitute p = 2, q = 0, r = 1. In

THE MATHEMATICALINTELLIGENCER VOL 13, NO 2, 1991 71 Corollary 2.7 If k >I 1, then

= E A1 A2 . . . ,l+Z2+...+Zp=k-1

+ Because the numbers pbk are easily seen to satisfy the same relation, Corollary 2.7 provides another proof of the equality of pd k with the k th generalized Catalan number. It also shows us that, if S(x) is the

power series oo S(x) = (p --1)1 --r) k=O then -"-X xSp = S - 1. (2.9) Klamer [6] attributes to P61ya-Szeg6 [7] the observa- tion that we may invert (2.9) to obtain

(1,p -1 -r) S(x) = 1 + ~,-s k- 1 (o, --r) (1, q--l) indeed, this is the solution given to Problem 211 on p. 125 of [71, based on the theory of Bfirmann-Lagrange series. Thus, Corollary 2.7 leads to a different, but far more sophisticated, evaluation of the generalized Ca- Figure 10. Proof of Theorem 2.5. talan numbers pdk, though it does not, in any obvious way, yield Theorem 2.3. fact, we may use Theorem 2.5 to express dqk in terms of Note that Corollary 2.4 could, of course, have been the generalized Catalan numbers ~di, at the same time obtained without introducing the quantities pdqk for generalizing (2.4): q # 0. However, these quantities have an obvious combinatorial significance and satisfy a recurrence re- THEOREM 2.6. If q ~ p - 1, and if k >! 1, then lation (Theorem 2.5) much simpler than that of Corol- lary 2.7. In the light of (2.1) the quantities pdqk them- selves deserve to be regarded as generalizations of the dqk = E rdq ~d,2 . . . ~dlp_q. (2.8) Catalan numbers--though certainly not the ultimate q+i2 + . +ip_q=k-1 generalization! We will see in the next section that Proof: The case q = p - 1 is just (2.4), so we assume they are useful in certain important counting algo- q < p - 1. Then, by Theorem 2.5, with r = 1, rithms.

3. Counting p-good Paths dqk = ~ dp-l,, dq+l a, ',] We will give in this section an algorithm for counting wherei/1, j1>l,i+j = k + 1 the number of p-good paths from (c,d) to (a,b), each of those lattice points being assumed below the line y = (p - 1)x. As explained in Section 1, it suffices (via a = ~ ~d,ldq+ 1,12, where j2 ~ 1, il + j2 = k, translation) to replace (c,d) by the point (1, q - 1), '1,12 with q ~ p - 1. Further, in order to blend our notation using (1.4). with that of Section 2, we write (k, n - k) for (a,b), so that n < pk. We assume there are paths from (1, q - 1) Iterating this formula, we obtain to(k,n - k),i.e.,thatk1>l,n- k/>q- 1. dqk Of course, there are (~-~) paths from (1, q - 1) to (k, n - k). It is, moreover, easy to see that there exist = E pdH pdz2.., pdzp_q_l pdp_l,lp_q , p-bad paths from (1, q - 1) to (k, n - k) if and only if n Zl+Z2+ . +,p-q- l +lp-q=k - k I> p - 1, so we assume this--otherwise, there are where jp_q >i 1. only p-good paths and we have our formula, namely, (~--0. A second application of (2.4) yields the formula (2.8). To sum up, we are counting the p-good paths from Setting q = 0 in (2.8), and again using (2.4), yields (1, q - 1) to (k, n - k) under the (nontrivializing) as- our next result. sumption that 1 ~ k ~ n - p + 1 ~ p(k - 1); notice

72 THE MATHEMATICAL INTELLIGENCERVOL 13, NO 2, 1991 that this composite inequality implies that, in fact, k/> 2, so that we assume y = (p-l)/ 2~k~n- p + l <-p(k- 1). (3.1)

9(k,. -k) There are h~-~) paths in all, so we count the p-bad paths. Let (~ = [~-1~], where [x] is the integral part of x. Then (3.1) ensures that (~/> 1, and a p-bad path from g = J (1, q - 1) to (k, n - k) will meet the line y = (p - 1)x first at some point Q(j,(p - 1)j), j = 1, 2 ..... e (see Figure 11). Because there are dqj paths from (1, q - 1) to Qi that stay below the line y = (p - 1)x except at QI' and be- cause there are (~-ro paths from Qj to (k, n - k), it follows that there are dq1(~-rO bad paths from (1, q - 1) to (k, n - k) that first meet the line y = (p - 1)x at Qj. Thus the number of p-bad paths from (1, q - 1) to (k, n - k) is (1, q-1

Figure 11. A bad path from (1, q - 1) to (k, n - k), n < pk. Notice that Qe is the last possible initial crossing point of the line y = (p - 1)x for such a path. We have proved the following result.

THEOREM 3.1. The number of p-good paths from (1, q - 1) to (k, n - k), under the conditions (3.1), is Yl , [n,] y = 2x We may express this differently, however, and in a way that will be more convenient if ~ is relatively xxxx [ (5, large. If we invoke Theorem 2.2, we obtain the fol- b x lowing corollary.

Corollary 3.2 The number of p-good paths from (1, q - 1) to (k, n - k), under the conditions (3.1), is (3, 6)~\ N N y=4 ' -,,J r ,-k 1 \"~ (5, 4) l=e+l J ' LP-lJ It is interesting to observe that the binomial coeffi- 1, 2) dents entering into the formula of Corollary 3.2 are certainly 'generalized,' because k - j > n - pj if j > 2. Thus if e + 1 ~ j ~< k, then (~--ro = o, so long as n - pj / >/0, i.e., j ~ In~p]. Because n/p > (n - k)/(p - 1), it / 9(1, -1) follows that In~p] <~ [(n - k)/(p - 1)], so that we may improve Corollary 3.2, at least formally, by replacing (~ by m = In~p]. Thus we obtain our final corollary.

Figure 12. Corollary 3.3 with q = 0, k = 5, p = 3, n = 9; Corollary 3.3. The number of p-good paths from (1, q - 1) hencel=2, m= 3. to (k, n - k), under the conditions (3.1), is all paths bad paths good paths

-- + r -3 6 1=m+1 J ' = 1(15) + 3(1) + 12(0) + 55(-3) + 273(1) i.e., 126 = 18 + 108 We give an example (see Figure 12).

THE MATHEMATICALINTELLIGENCER VOL 13, NO 2, 1991 73 Example 3.1. Let q = 0, k -- 5, p = 3, n = 9. The in- him in connection with his solution of the problem of equalities (3.1) arc certainly satisfied and m = 3. Thus dissecting a polygon by means of non-intersecting di- the number of 3-good paths from (1, - 1) to (5,4) is agonals into ; thus his definition of the kth Ca- talan number is our Ck. In fact, the problem of enumer- ating such dissections had already been solved by Segner in the eighteenth century, but not in conve- Now dj is the jth (generalized) Catalan number for - nient or perspicuous form. Euler (see the bibliog- 3; thus, by Corollary 2.4, d4 = 88 = 55 and ds = ~(i~) raphy) had immediately obtained a simpler solution = 273. Thus the number of 3-good paths is 108. Notice involving generating functions--we have sketched that the total number of paths is (9) = 126. Notice also this approach in Section 1. Catalan's own solution, that, in this case, e = [(n - k)/(p - 1)] = 2, so that achieved almost simultaneously with that of Binet there is an advantage in replacing e by m. around 1838, was even simpler, not requiring the Circumstances will determine whether it is more theory of generating functions, which was regarded at convenient to use Theorem 3.1 or Corollary 3.3. In the the time as rather 'delicate' in view of its (apparent) example above there is little to choose. However, dependence on the notion of convergence of series. whereas in Theorem 3.1 each term has an obvious A host of other interpretations of the Catalan combinatorial meaning, it is difficult to assign a com- numbers have been given, largely inspired by work in binatorial meaning to the terms in Corollary 3.3; the and graph theory, and have been col- binomial coefficient (~2r0, j I> m + 1, does not count lected by Gould (see the bibliography). The interpreta- the paths from Q1 to (k, n - k), because there are none tions have led to generalizations, of which the most --might it be said to count 'phantom paths'? obvious and important, treated in this article, have Example 3.1 is really "generic" and thus merits been studied by probabilists (e.g., Feller) and others. closer study. Theorem 2.2 tells us, in this case, that The ballot problem itself leads to an evident general- ization, considered by Feller, in which the line y = x is 5 (9- replaced by the line y = (p - 1)x. An interesting generalization of a somewhat dif- ferent kind is treated by Wenchang Chu (see the bibli- The expression of the right of (3.2) breaks up into ography).

33/ References 1. D. Andr6, Solution directe du probl~me r6solu par M. Bertrand, Comptes Rendus Acad. Sc~. Paros 105 (1887), Now the binomial coefficient (9) of the left of (3.2) 436-7. counts the number of paths from (1,- 1) to (5,4); the 2. A. P. Hillman, G. L. Alexanderson and R. M. Grassl, first block in (3.3) counts the bad paths from (1, - 1) to Discrete and Combinatorial Mathematzcs, San Francisco: (5,4), partitioning them into those that first meet the Dellen (1987). danger line y = 2x at (1,2) and those that first meet the 3. Peter Hilton and Jean Pedersen, Extending the binomial coefficients to preserve symmetry and pattern, Com- danger line at (2,4); the second block in (3.3) makes a puters Math. Apphc. 17 (1-3) (1989), 89-102. zero contribution; and the third block in (3.3) thus 4. Peter Hilton and Jean Pedersen, The ballot problem and counts the good paths from (1,- 1) to (5,4). In some Catalan numbers, Nzeuw Archief voor Wiskunde, 1990 (to mysterious sense the third block--which intrigues us appear). enormously--is also partitioned by the points (4,8) 5. A. Hurwitz and R. Courant, Allgemeine Funktzonentheorze und elliptische Funktionen. Geometrische Funktionentheorie, and (5,10) on the line y = 2x. But this is only a nota- Berlin: Springer (1922), 128. tional partitioning, because we have 108 represented 6. David A. Klarner, Correspondences between plane trees as (55)(-3) + (273)(1). Here the coefficients d,(i = 4,5) and binary sequences, J. of Comb. Theory 9 (1970), have an obvious combinatorial significance, but the bi- 401-411. nomial coefficients, (]3) and (56), do not. They result 7. G. P61ya and G. Szego, Aufgaben und Lehrs~tze aus der Analyszs, Vol. I, Berlin, Gottingen, Heidelberg: Springer from applying the formula (~_-p') for the number of (1954), 125. paths from (i, (p - 1)i) to (k, n - k) outside its domain 8. N. J. A. Sloane, A Handbook of Integer Sequences, New of validity 1 ~ i ~ 2. There is here much food for York and London: Academic Press (1973). thought! 9. Marta Sved, Counting and recounting, Mathematzcal In- telligencer 5, no. 4 (1983), 21-26.

Historical Note Bibliography The versatile Belgian mathematician Eugene Charles [Items [2], [6] and [8] among the References may also be re- Catalan (1814-1894) defined the numbers named after garded as general sources on Catalan numbers.]

74 THE MATHEMATICAL INTELLIGENCER VOL 13, NO 2, 1991 R. Alter, Some remarks and results on Catalan numbers, series inversion, Trans. Amer. Math. Soc. 94 (1960), Proc. 2nd Louisiana Conf. on Comb., Graph Theory and Comp. 441-451. (1971), 109-132. D. G. Rogers, Pascal triangles, Catalan numbers and re- D. Andre, Solution directe du probl~me r~solu par M. Ber- newal arrays, Discrete Math. 22 (1978), 301-310. t-rand, Comptes Rendus Acad. Sc~. Parrs 105 (1887), 436-7. A.D. Sands, On generalized Catalan numbers, Dzscrete W. G. Brown, Historical note on a recurrent combinatorial Math. 21 (1978), 218-221. problem, Amer. Math Monthly 72 (1965), 973-977. Wenchang Chu, A new combinatorial interpretation for gen- Memoirs by Eugene-Charles Catalan relevant to the theme eralized Catalan numbers, Discrete Math. 65 (1987), 91-94. of Catalan numbers may be found in Journal de Math&na- K. L. Chung and W. Feller, On fluctuations in coin-tossing, tlques pures et appliqu~es de Liouville, (1), III, 508-816; W, Proc. Nat. Acad. Sc~., U.S.A. 35 (1949), 605-608. 91-94, 95-99; VI, 74, 1838-41. John H. Conway and Richard K. Guy, The Book of Numbers, For details of Catalan's life and work see "Les travaux math- Sci. Amer. Library, W. H. Freeman (to appear 1990). ~matiques de Eugene-Charles Catalan," Annuaire de L'Aca- I. Z. Chomeyko and S. G. Mohanty, On the of ddmie Royale des Sczences, des Lettres et des Beaux-Arts de Bel- certain sets of planted trees, J. Combm. Theory Ser. B18 gique, Brussels (1896), 115-172. (1975), 209-21. T. T. Cong and M. Sato, One-dimensional random walk Peter H~lton Jean Pedersen with unequal step lengths restricted by an absorbing bar- Department of Mathematical Department of Mathematzcs rier, Dzscrete Math. 40 (1982), 153-162. Sciences Santa Clara University R. Donoghey and L. W. Shapiro, Motzkin numbers, J. State University of New York Santa Clara, CA 95053 USA Combin. Theory Ser. A23 (1977), 291-301. Binghamton, NY 13901 USA N. Dershowitz and S. Zaks, Enumeration of ordered trees, and Discrete Math. 31 (1980), 9-28. Abteilung fi~r Mathematik L. Euler, Novz Commentarii Academiae Sczentarzum Imperialis E.T.H., 8092 Ziirich, Switzerland Petropohtanque 7 (1758/9), 13-14. Roger B. Eggleton and Richard K. Guy, Catalan strikes again! How likely is a function to be convex?, Math. Mag. 61 (1988), 211-219. A. Erd~lyi and I. M. H. Etherington, Some problems of non- associative combinatorics, Edinburgh Math. Notes. 32 (1941), 7-12. I. J. Good, Generalizattons to several variables of Lagrange's expansion, Proc. Camb. Phil. Soc. 56 (1960), 367-380. Henry W. Gould, Bell and Catalan Numbers, Research bibliog- raphy of two special number sequences, available from the au- thor (Department of Mathematics, West Virginia Univer- sity, Morgantown, WV 26506). The 1979 edition sells for $3.00 and contains over 500 references pertaining to Ca- talan numbers. Henry W. Gould, Combinatorial Identities, available from the author (Department of Mathematics, West Virginia Uni- versity, Morgantown, WV 26506). Richard K. Guy, Dissecting a polygon into triangles, Bulletin of the Malayan Mathematical Society 5 (45) (1958), 57-60. Richard K. Guy, A medley of Malayan mathematical memo- ries and mantissae, Math. Medley (Singapore), 12, no. 1 (1984), 9-17. V. E. Hoggatt, Jr., and Marjorie Bicknell, Pascal, Catalan, and general sequence convolution arrays in a matrix, Fi- bonaccz Quarterly 14, no. 2, 135-143. V. E. Hoggatt, Jr., and Marjorie Bicknell, Sequences of ma- trix inverses from Pascal, Catalan, and related convolution arrays, Fibonaccz Quarterly 14, no. 3, 224-232. V. E. Hoggatt, Jr., and Marjorie Blcknell, Catalan and re- lated sequences arising from inverses of Pascal's matrices, Fibonacci Quarterly (December 1976), 395-405. M. S. Klamkin, Problem 4983, Amer. Math. Monthly 69 (1962), 930-931. Mike Kuchinski, Catalan structures and Correspondences, M.Sc. thesis, Department of Mathematics, West Virginia Univer- sity, Morgantown, WV 26506. Th. Motzkin, Relations between hypersurface cross ratios, and a combinatorial formula for partitions of a polygon, and for non-associative products, Bull. Amer. Math. Soc. 54 (1948), 352-360. G. P61ya, On picture-writing, Amer. Math. Monthly 63 (1956),

689 - 697. G. N. Raney, Functional composition patterns and power

THE MATHEMATICAL INTELLIGENCER VOL 13, NO 2, 1991 75